Lucene Digest, February 2010

Publishing a February Lucene Digest in March?  Nonsense, ha?  Blame it on Sematext keeping us busy and on the short month of February.  At least we got the Solr Digest and HBase Digest out in time!

  • Well, the first thing to know about Lucene now is that Lucene 2.9.2 and 3.0.1 have been released.  They contain fixes you’ll want to have, so go grab a new release.  While you are at it, you may also be interested in seeing a discussion about Lucene upgrades that emerged after the release announcement was made a few days ago.
  • Guess what?  Lucene in Action 2nd ed is in production.  What this means is that LIA2 authors are done working on the manuscript and that Manning Publications people are preparing it for print and distribution.  You got your MEAP already, right?  LIA2 covers Lucene 3.* API.
  • If Lucene in Action 2nd ed is not enough for you, note that another Lucene book is in the works: Lucene in Practice.  Hey, John Wang, Jake & Co., does LIP have a URL?  For now, the only URL I have is to the LIP source code: http://code.google.com/p/lucene-book/.
  • If you missed the popular Lucandra post, have a look at it now.  Lucandra from @tjake looks interesting.  Note that we’ll have a talk about Lucandra soon – keep an eye on the NY Search & Discovery Meetup.
  • Lucene developers are a super disciplined bunch.  Look how well unit-tested Lucene is in the Lucene Clover Report.
  • LinkedIn is one of the bigger Lucene users out there, and they’ve been publishing a lot about that.  Zoie and Bobo Browse are two projects you’ll see covered in Lucene in Action 2, but here is a LinkedIn Search presentation.
  • Search, search, more and more search frameworks are getting built on top of Lucene.  Solr is the biggest and the most well known, of course, but certainly not the only one:
  • And talking about Katta (Lucene, Hadoop MapFiles, or any other content that can be split into shards, served in the cloud), the new release is out:

The key changes of the 0.6 release among dozens of bug fixes:
– upgrade lucene to 3.0
– upgrade zookeeper to 3.2.2
– upgrade hadoop to 0.20.1
– generalize katta for serving shard-able content (lucene is one implementation, hadoop mapfiles another one)
– basic lucene field sort capability
– more robust zookeeper session expiration handling
– throttling of shard deployment (kb/sec configurable) to have a stable search while deploying
– load test facility
– monitoring facility
– alpha version of web-gui

See full list of changes at
http://oss.101tec.com/jira/secure/ReleaseNote.jspa?projectId=10000&styleName=Html&version=10010

  • Full-text Search and Spatial Search go hand in hand.  Both Lucene and Solr have seen work in the spatial search area and now a new Apache project called Spatial Information Systems (SIS) has been proposed and approved.  SIS will enter the Apache Software Foundation via the Incubator.
