HBase Digest, May 2010

Big news first:

  • HBase 0.20.4 is out! This release includes critical fixes as well as general and performance improvements. HBase 0.20.4 EC2 AMIs are now available in all regions; the latest launch scripts can be found here.
  • HBase has become an Apache Top Level Project. Congratulations!

Good-to-know things shared by the community:

  • HBase now has a code review board. Feel free to join!
  • The ACID guarantees for each HBase operation are stated here.
  • Writing a filter that compares values in different columns is explained in this thread.
  • It is OK to mix transactional IndexTable and regular HTables in the same cluster. One can access tables without the transactional semantics and overhead as normal, even when running a TransactionalRegionServer. More in this thread.
  • Gets and scans now never return partially updated rows (as of the 0.20.4 release).
  • Try to avoid building code on top of lockRow/unlockRow, because this can lead to serious delays in the system and even deadlocks. Thread…
  • Read about how HBase performs load-balancing in this thread.
  • Thinking about using HBase with a file system other than HDFS? Then this thread is a must-read for you.

Notable efforts:

  • HBase Indexing Library aids in building and querying indexes on top of HBase, in the style of the Google App Engine datastore. The library is complementary to the tableindexed contrib module of HBase.
  • HBasene is a scalable information retrieval engine, compatible with the Lucene library while using HBase as the store for the underlying TF-IDF representation. This is much like Lucandra, which uses Lucene on top of Cassandra. We will be covering HBasene in the near future here on the Sematext Blog.

FAQ:

  1. Is there an easy way to remove/unset/clean a few columns in a column family for an HBase table?
    You can either delete an entire column family or delete all versions of a single family/qualifier pair. There is no 'wild card' deletion or other pattern matching; deleting a whole column family is the closest option.
  2. How do I unsubscribe from the user mailing list?
    Send mail to user-unsubscribe@hbase.apache.org.
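
To make the answer to the first question more concrete, here is a minimal sketch of what "no wild card deletion" means in practice: the client has to list the qualifiers itself and delete each matching column one by one. This is a toy in-memory model in Python, not the HBase client API; the row layout and the function names (delete_family, delete_column, delete_by_prefix) are assumptions made up for illustration.

```python
# Toy in-memory model of a single row:
#   {family: {qualifier: [versions, newest first]}}
# This is NOT the HBase client API; every name here is hypothetical.


def delete_family(row, family):
    """Drop an entire column family from the row."""
    row.pop(family, None)


def delete_column(row, family, qualifier):
    """Drop all versions of one family/qualifier pair."""
    row.get(family, {}).pop(qualifier, None)


def delete_by_prefix(row, family, prefix):
    """Emulate a 'wildcard' delete on the client side:
    collect the matching qualifiers first, then delete each one."""
    matching = [q for q in row.get(family, {}) if q.startswith(prefix)]
    for qualifier in matching:
        delete_column(row, family, qualifier)
    return matching


row = {
    "stats": {"clicks_mon": [3, 1], "clicks_tue": [7], "views_mon": [40]},
    "meta": {"owner": ["alice"]},
}

# Remove every "clicks_*" column; the other columns are left untouched.
removed = delete_by_prefix(row, "stats", "clicks_")
```

With a real table the same pattern would presumably apply: read the row (or scan) to discover the qualifiers, then issue one per-column delete for each match, or delete the whole family if every column in it should go.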
