CHAPTER 8
Tools and extensions

    public interface DocumentHandler {
        Document getDocument(File file)
            throws DocumentHandlerException;
    }
Implementing ConfigurableDocumentHandler allows the <index> task to pass additional information as a java.util.Properties object:

    public interface ConfigurableDocumentHandler extends DocumentHandler {
        void configure(Properties props);
    }
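To make the configure contract more concrete, here is a minimal, self-contained sketch of how a handler might read one setting from the Properties it receives and then derive a category with the same substring-and-replace step used in listing 8.4. The class name CategorySketch, the "basedir" property key, and the sample path are assumptions made for illustration. The class deliberately avoids the Lucene and Ant types so that it runs on its own, so it is not a real ConfigurableDocumentHandler.

```java
import java.io.File;
import java.util.Properties;

// Hypothetical stand-in for a configurable handler; it is not part of
// Lucene or of the Ant <index> task.
class CategorySketch {
    private String basedir;

    // Mirrors the shape of ConfigurableDocumentHandler.configure(Properties).
    // The "basedir" key is an assumed name for this sketch.
    void configure(Properties props) {
        basedir = props.getProperty("basedir");
    }

    // Strips the base directory prefix from the file's parent path and
    // normalizes platform-specific separators to '/'.
    String categoryFor(File file) {
        String category = file.getParent().substring(basedir.length());
        return category.replace(File.separatorChar, '/');
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        props.setProperty("basedir", "data");

        CategorySketch handler = new CategorySketch();
        handler.configure(props);

        // Hypothetical sample file below the base directory.
        File file = new File("data/technology/computers/book.properties");
        System.out.println(handler.categoryFor(file));
    }
}
```

The sketch relies on the configured base directory being a prefix of each file's parent path, in the same form the file paths use. If the two differ (for example, one absolute and one relative), the substring call would cut at the wrong position or throw.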
Configuration options are passed using a single <config> subelement with arbitrarily named attributes. The <config> attribute names become the keys to the properties. Our complete TestDataDocumentHandler class is shown in listing 8.4.

Listing 8.4 TestDataDocumentHandler: how we built our sample index

    public class TestDataDocumentHandler
            implements ConfigurableDocumentHandler {
        private String basedir;

        public Document getDocument(File file)
                throws DocumentHandlerException {
            Properties props = new Properties();
            try {
                props.load(new FileInputStream(file));
            } catch (IOException e) {
                throw new DocumentHandlerException(e);
            }

            Document doc = new Document();

            // category comes from relative path below the base directory
            // (b) Get category
            String category = file.getParent().substring(basedir.length());
            category = category.replace(File.separatorChar,'/');

            // (c) Pull fields
            String isbn = props.getProperty("isbn");
            String title = props.getProperty("title");
            String author = props.getProperty("author");
            String url = props.getProperty("url");
            String subject = props.getProperty("subject");
            String pubmonth = props.getProperty("pubmonth");

            // (d) Add fields to Document instance
            doc.add(Field.Keyword("isbn", isbn));
            doc.add(Field.Keyword("category", category));
            doc.add(Field.Text("title", title));