I Love Lucene – interesting article on TSS about the Java-based Jakarta text search engine
I found a very interesting article on how to get started with Lucene, and what to expect from it. Lucene is the Apache Jakarta text search engine library, written entirely in Java. The article, “I Love Lucene”, was written by Dion Almaer and describes how TheServerSide (TSS) used Lucene to build its search facility. As such, it is a clear introduction to what Lucene can do and how you can get going. Their conclusion:
Having said that, we don’t see any reason to move away from Lucene. It has been a pleasure to work with, and is one of the best pieces of open source software that I have personally ever worked with.
TheServerSide search used to be a weak link on the site. Now it is a powerhouse. I am constantly using it as Editor, and now manage to find exactly what I want.
Indexing our data is so fast, that we don’t even need to run the incremental build plan that we developed. At one point we mistakenly had an IndexWriter.optimize() call every time we added a document. When we relaxed that to run less frequently we brought down the index time to a matter of seconds. It used to take a LOT longer, even as long as 45 minutes.
So to recap: We have gained relevance, speed, and power with this approach. We can tweak the way we index and search our content with little effort.
Thanks SO much to the entire Lucene team.
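To get a feel for why the IndexWriter.optimize() mistake in the quote above was so expensive, here is a minimal sketch in plain Java. It does not use Lucene at all: ToyIndexWriter, indexAll and the documentsRewritten counter are hypothetical names I made up for illustration, and the assumption that optimizing merges all segments into one (rewriting every document indexed so far) is a deliberate simplification, not a description of Lucene's internals.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for an index writer. This is NOT the Lucene API.
class OptimizeBatchingSketch {

    /** Toy index: each added document becomes a small segment; optimize() merges all segments into one. */
    static class ToyIndexWriter {
        private final List<List<String>> segments = new ArrayList<>();
        long documentsRewritten = 0; // rough cost measure: documents copied by all merges so far
        int optimizeCalls = 0;

        void addDocument(String text) {
            List<String> segment = new ArrayList<>();
            segment.add(text);
            segments.add(segment);
        }

        // Assumption: a merge has to copy every document that is already in the index.
        void optimize() {
            optimizeCalls++;
            List<String> merged = new ArrayList<>();
            for (List<String> segment : segments) {
                merged.addAll(segment);
            }
            documentsRewritten += merged.size();
            segments.clear();
            segments.add(merged);
        }

        int segmentCount() {
            return segments.size();
        }
    }

    /** Indexes all documents, optimizing after every optimizeEvery additions and once more at the end if needed. */
    static ToyIndexWriter indexAll(List<String> docs, int optimizeEvery) {
        ToyIndexWriter writer = new ToyIndexWriter();
        int addedSinceOptimize = 0;
        for (String doc : docs) {
            writer.addDocument(doc);
            addedSinceOptimize++;
            if (addedSinceOptimize == optimizeEvery) {
                writer.optimize();
                addedSinceOptimize = 0;
            }
        }
        if (addedSinceOptimize > 0) {
            writer.optimize(); // final merge for any leftover documents
        }
        return writer;
    }

    public static void main(String[] args) {
        List<String> docs = new ArrayList<>();
        for (int i = 0; i < 1000; i++) {
            docs.add("document " + i);
        }
        ToyIndexWriter eager = indexAll(docs, 1);             // optimize after every document
        ToyIndexWriter batched = indexAll(docs, docs.size()); // optimize once, at the end
        System.out.println("eager:   " + eager.optimizeCalls + " optimize calls, "
                + eager.documentsRewritten + " documents rewritten");
        System.out.println("batched: " + batched.optimizeCalls + " optimize calls, "
                + batched.documentsRewritten + " documents rewritten");
    }
}
```

In this toy model, optimizing after each of 1,000 documents rewrites 500,500 documents in total, while optimizing once at the end rewrites only 1,000. Both runs end up with a single merged segment, so the extra work buys nothing. The real numbers for Lucene will of course differ, but the quadratic-versus-linear shape is a plausible explanation for the drop from 45 minutes to a matter of seconds.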
Note that Lucene is not limited to web applications: it can be used in any Java application that needs to search through text, as long as that text can be handed to Lucene in a form it understands. Jakarta Lucene is a high-performance, full-featured text search engine library written entirely in Java. It is suitable for nearly any application that requires full-text search, especially a cross-platform one. Jakarta Lucene is an open source project, available as a free download from Apache Jakarta.
I hope we will have an opportunity to try out Lucene one of these days; it seems fun to work with. At the same time, most of our data lives in the database, and wherever an Oracle database is available, Oracle Text probably offers even better functionality.