Wednesday, July 4, 2007

Lucene issue

I feel sorry for Shay: he's written a beautiful object indexing layer, Compass, on top of Lucene (the equivalent of writing the first good Object Relational mapping tool but for searching), yet is let down by the underlying interface to Lucene. Don't get me wrong, Lucene is great, but it does have some quirks.

Take this one, for example: setting a custom SegmentReader. Compass uses a custom SegmentReader (unsurprisingly called CompassSegmentReader), which in and of itself isn't a huge modification of Lucene. The quirk comes into play when we deploy multiple applications into the same Servlet container. It comes down to how the custom SegmentReader is set in Lucene: not by changing a property on a factory or some other configuration parameter... no, Lucene needs a system-wide property to be set. Yuck. This means that when Compass sets its custom SegmentReader, it sets it for everything in that JVM. So App1 using Compass is okay, but App2 using pure Lucene then falls over unless the Compass jar file is included. There's a thread on the topic over at the Compass forums.
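To make the failure mode concrete, here is a minimal sketch of two "applications" sharing one JVM. Only the property key comes from Lucene; the class names, the method names, and the use of java.lang.Object as a stand-in for the default reader are hypothetical, and a real Servlet container adds per-webapp class loaders on top of this.

```java
// Hypothetical illustration, not Compass or Lucene code.
class SharedPropertyDemo {
    // The key Lucene 2.2.0 reads; everything else here is made up for the example.
    static final String KEY = "org.apache.lucene.SegmentReader.class";

    // App1: installs its custom reader class name. System properties are JVM-wide.
    static void app1Configure() {
        System.setProperty(KEY, "com.example.app1.CustomSegmentReader");
    }

    // App2: resolves whatever class name the property currently holds,
    // falling back to a default (java.lang.Object stands in for SegmentReader).
    static String app2Resolve() {
        String name = System.getProperty(KEY, "java.lang.Object");
        try {
            return Class.forName(name).getName();
        } catch (ClassNotFoundException e) {
            // App2 does not have App1's jar on its classpath.
            return "ClassNotFoundException: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(app2Resolve()); // default class resolves fine
        app1Configure();
        System.out.println(app2Resolve()); // now fails: App1's class is not visible
    }
}
```

App2 never touched the property, yet its class lookup breaks as soon as App1 has run.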

In my case I found this out when deploying DSpace on the same application server as AONS. I'm not sure how I'll get around it: I can probably ask the DSpace developers to include Compass in their distribution, or I'll probably just put in a README note about it. Either solution is less than ideal but probably the path of least resistance by far. Maybe the Lucene developers would be willing to change this... but the original post about this problem in the Compass forums was in 2005, so I have my doubts.

Well, here's the offending code from org.apache.lucene.index.SegmentReader (valid as of Lucene 2.2.0):

static {...
    try {
        String name = System.getProperty("org.apache.lucene.SegmentReader.class", SegmentReader.class.getName());
        IMPL = Class.forName(name);
    } catch (ClassNotFoundException e) {
        throw new RuntimeException("cannot load SegmentReader class: " + e, e);
    } catch (SecurityException se) {
        try {
            IMPL = Class.forName(SegmentReader.class.getName());
        } catch (ClassNotFoundException e) {
            throw new RuntimeException("cannot load default SegmentReader class: " + e, e);
        }
    }
}

This kind of code is exactly why Inversion of Control (aka Dependency Injection) was so good when it came along: it got rid of static calls and global system properties like this and made property injection obvious and intuitive. I'm sure when this segment of code was first written, it was completely acceptable to do things like this... but nowadays configuration and anticipated customisation should be a little bit more elegant.
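For contrast, here is a rough sketch of what an injected alternative could look like. None of these types exist in Lucene or Compass; the interface, the two reader classes, and the Index class are all hypothetical names chosen for illustration. The point is only that each application hands its own factory to the object that needs it, so nothing is shared JVM-wide.

```java
import java.util.function.Supplier;

// Hypothetical stand-in for a segment reader abstraction.
interface SegmentReaderApi {
    String describe();
}

class DefaultReader implements SegmentReaderApi {
    public String describe() { return "default"; }
}

class CustomReader implements SegmentReaderApi {
    public String describe() { return "custom"; }
}

// The index receives its reader factory through the constructor
// instead of looking up a global system property in a static block.
class Index {
    private final Supplier<SegmentReaderApi> readerFactory;

    Index(Supplier<SegmentReaderApi> readerFactory) {
        this.readerFactory = readerFactory;
    }

    SegmentReaderApi openReader() {
        return readerFactory.get();
    }
}

class InjectionDemo {
    public static void main(String[] args) {
        Index app1 = new Index(CustomReader::new);  // App1 opts in to its custom reader
        Index app2 = new Index(DefaultReader::new); // App2 is unaffected by App1's choice
        System.out.println(app1.openReader().describe()); // prints "custom"
        System.out.println(app2.openReader().describe()); // prints "default"
    }
}
```

A container such as Spring could wire the factory in just as well; the constructor argument is simply the smallest way to show the idea.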
