Monitoring Kafka, Storm and Cassandra Together with SPM

Kafka, Storm and Cassandra — Big Data’s Three Amigos?  Not quite.  And while much less humorous than the movie, this often-used-together trio of tools works closely together to make in-stream processing as smooth, immediate and efficient as possible.  Needless to say, it makes a lot of sense to monitor them together.  And since you’re reading the Sematext blog you shouldn’t be surprised to hear that we offer an all-in-one solution that can be used for Kafka, Storm and Cassandra performance monitoring, alerts and anomaly detection: SPM Performance Monitoring.  Moreover, if you ship logs from any of these three systems to Logsene, you can correlate not only metrics from these three systems, but also their logs and really any other type of event!

Enough with all the words — here is an illustration of how these three tools typically work together:

Kafka-Storm-Cassandra

Of course, you could also be storing data into some other type of data store, like Elasticsearch or HBase or MySQL or Solr or Redis.  SPM monitors all of them, too.
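
To make that flow concrete, here is a minimal sketch of a Storm topology that consumes messages from Kafka and writes each one into Cassandra.  This is only an illustration, not something SPM ships: it assumes the storm-kafka spout and the DataStax Java driver 2.x, and the ZooKeeper address, topic name, keyspace and raw_events table are made-up placeholders.

import java.util.Map;

import backtype.storm.Config;
import backtype.storm.LocalCluster;
import backtype.storm.spout.SchemeAsMultiScheme;
import backtype.storm.task.OutputCollector;
import backtype.storm.task.TopologyContext;
import backtype.storm.topology.OutputFieldsDeclarer;
import backtype.storm.topology.TopologyBuilder;
import backtype.storm.topology.base.BaseRichBolt;
import backtype.storm.tuple.Tuple;
import storm.kafka.KafkaSpout;
import storm.kafka.SpoutConfig;
import storm.kafka.StringScheme;
import storm.kafka.ZkHosts;
import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.Session;

public class KafkaToCassandraTopology {

    // Terminal bolt: writes every Kafka message into a (made-up) events.raw_events table.
    public static class CassandraWriterBolt extends BaseRichBolt {
        private transient Cluster cassandra;
        private transient Session session;
        private transient OutputCollector collector;

        public void prepare(Map conf, TopologyContext context, OutputCollector collector) {
            this.collector = collector;
            // Driver objects are not serializable, so they are created on the worker, not in main().
            cassandra = Cluster.builder().addContactPoint("127.0.0.1").build();
            session = cassandra.connect("events");
        }

        public void execute(Tuple tuple) {
            String payload = tuple.getString(0); // StringScheme hands us the raw Kafka message as a string
            session.execute("INSERT INTO raw_events (id, payload) VALUES (now(), ?)", payload);
            collector.ack(tuple);
        }

        public void declareOutputFields(OutputFieldsDeclarer declarer) {
            // nothing to emit downstream
        }
    }

    public static void main(String[] args) {
        // Read the (made-up) "events" topic via the storm-kafka spout.
        SpoutConfig spoutConfig = new SpoutConfig(new ZkHosts("zookeeper:2181"), "events", "/kafka-storm", "event-reader");
        spoutConfig.scheme = new SchemeAsMultiScheme(new StringScheme());

        TopologyBuilder builder = new TopologyBuilder();
        builder.setSpout("kafka-spout", new KafkaSpout(spoutConfig), 1);
        builder.setBolt("cassandra-writer", new CassandraWriterBolt(), 2).shuffleGrouping("kafka-spout");

        new LocalCluster().submitTopology("kafka-to-cassandra", new Config(), builder.createTopology());
    }
}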

So what do you get if you monitor Kafka, Storm, and Cassandra with SPM?

You get a single pane of glass: a single access point through which you can see visualizations of well over 100 metrics that you can slice and dice by a number of application-specific dimensions.  For example, various Kafka performance metrics can be filtered by one or more topics, while some Storm performance metrics can be filtered by topology or worker.  Of course, you can set alerts on any of the metrics and you can even make use of SPM’s anomaly detection capabilities.

Kafka, Storm + Cassandra screenshot

 

Consolidate Your App Monitoring, Alerting, and Log Centralization — It’s Easy!

Many organizations tackle performance monitoring with a mish-mash of different monitoring and alerting tools cobbled together in an uneasy coexistence that is often far from seamless.  SPM takes all that hassle away and makes it easy and comprehensive in one step.

Live Demo — See SPM for Yourself

Check out SPM’s live demo to see this monitoring for yourself.  You won’t find any demo apps showing Cassandra metrics because we don’t use it at Sematext yet, but you’ll be able to poke around and see Kafka, HBase, Elasticsearch, MySQL, and other types of apps being monitored.

Love the Idea of Monitoring Kafka, Storm & Cassandra Together?
Take a Test Drive — It’s Easy to Get Started.

Try SPM Performance Monitoring for Free for 30 days by registering here.  There’s no commitment and no credit card required.

We’re Hiring!

If you enjoy performance monitoring, log analytics, or search analytics, working with projects like Elasticsearch, Solr, HBase, Hadoop, Kafka, and Storm, then drop us a line.  We’re hiring planet-wide!  Front end and JavaScript Developers, Developer Evangelists, Full-stack Engineers, Mobile App Developers…get in touch!

 

Announcement: What’s New in SPM Performance Monitoring

A new SPM Performance Monitoring release was just pushed to production and it’s chock full of great new stuff to complement its proactive performance monitoring, alerting, anomaly detection, etc., available in the Cloud or On Premise.  Here is a run-down of the juicier additions. The slightly longer version can be found in the SPM Changelog.

Integration with Logsene Log Management and Analytics

SPM performance monitoring now gives users access to even more metrics by seamlessly integrating with event data and logs via Logsene Log Management and Analytics.  This enables correlation across performance metrics, alerts, anomalies, events, and logs, and provides a single pane of glass across any organization’s IT infrastructure.

Monitoring Support for More Applications

We’ve added native monitoring support for Cassandra, MySQL, Memcached, Apache, and AWS CloudWatch (shown in the screenshots below) to complement monitoring for Solr, Elasticsearch, Hadoop, HBase, Storm, Redis, Kafka, ZooKeeper and many others.

Screenshots

Eager to see pictures instead of reading content?  Then jump below to see screenshots of these apps being monitored.

UI/UX Improvements

UI/UX improvements include: zooming and panning, client-side caching, wider and simpler metric charts, new filter slide-out panels with search capabilities, quick access to all dashboards and easier dashboard creation, and more.

Event Graphs

Events and event graphs are now integrated into SPM Performance Monitoring.  You can now correlate various types of events, such as alerts, anomalies, application deployments, restarts, releases, server reboots, etc., with performance metrics graphs, as well as with logs and log graphs.  Many of you will also be happy to hear that SPM can now turn Alerts into Events, and graph them as well.  Check out Event Integration if you want to publish your own Events.

More Powerful Storm Monitoring

SPM Storm monitoring now serves up more metrics, more graphs and provides more granular details.  This includes the addition of metric filters and the ability to monitor not just Spouts and Bolts, but also monitor Storm Workers.

Dashboard Enhancements

Creating and working with dashboards just got a lot more intuitive and flexible.  This includes:

  • creating new dashboards via an intuitive “build your own dashboard” tool
  • easier navigation via Miller Columns (think column-oriented view in OSX Finder)
  • adding whatever graphs you want to an existing or brand new dashboard from within that dashboard
  • a pull-down menu for much quicker access to specific dashboards

 

Screenshot – SPM Dashboard (one of many possible views; click to enlarge)

Dashboard_test

 

Screenshot – Cassandra Overview  (click to enlarge)

cassandra_overview

 

Screenshot – MySQL Overview  (click to enlarge)

MySQL Overview

Screenshot – Memcached Overview  (click to enlarge)

memcached-overview

Screenshot – Apache Monitoring Overview  (click to enlarge)

Apache Overview

 

Screenshot – AWS CloudWatch EBS Read/Write Bandwidth  (click to enlarge)

AWS_EBS Read:Write Bandwidth

 

Live Demo

Check out SPM’s live demo to see it for yourself.  You won’t find any demo apps showing Cassandra or Memcached metrics because we don’t use them at Sematext yet, but you’ll be able to poke around and see other types of apps being monitored — like Solr, Kafka, Hadoop and HBase, for example — along with MySQL, AWS, and Apache.

Consolidate Your App Monitoring — It’s Easy!

Many organizations tackle performance monitoring with a mish-mash of different monitoring and alerting tools cobbled together in an uneasy coexistence that is often far from seamless. SPM takes all that hassle away and makes it easy and comprehensive in one step.

Try SPM for Free for 30 Days

Try SPM Performance Monitoring for Free for 30 days by registering here.  There’s no commitment and no credit card required.

We’re Hiring!

If you enjoy performance monitoring, log analytics, or search analytics, working with projects like Elasticsearch, Solr, HBase, Hadoop, Kafka, and Storm, then drop us a line.  We’re hiring planet-wide!  Front end and JavaScript Developers, Developer Evangelists, Full-stack Engineers, Mobile App Developers…get in touch!

Announcement: Cassandra Performance Monitoring in SPM

Cassandra is a distributed database management system built to handle massive data sets while providing high availability without compromising performance.  That’s why many organizations use it for their mission-critical data. That being said, if you are running Cassandra then you’ll want to keep close tabs on it.  And now SPM Performance Monitoring can help you do just that as the latest release supports Cassandra performance monitoring.  Not to mention that all the usual SPM alerting, anomaly detection, etc., can be used with any of the Cassandra metrics.

Why Not Monitor More Than Just Cassandra?

Unlike some of the tools that only monitor Cassandra and nothing else, SPM Performance Monitoring covers Cassandra, the competing database HBase, and a lot more: Solr, Elasticsearch, Hadoop, MySQL, AWS CloudWatch, Memcached, Apache, and just about any other app you want to monitor.  SPM also monitors Storm and Kafka, which are often used together with Cassandra (arguably their most popular data store).  Now all three can be monitored together!

Have a look at a few of the screenshots to see how we graph Cassandra metrics in SPM (the Cassandra metrics we monitor are listed further below).  You can also check out SPM’s live demo. You won’t find any demo apps showing Cassandra metrics because we don’t use Cassandra at Sematext yet, but you’ll be able to poke around and see other types of apps being monitored, like Solr, Kafka, Hadoop and HBase, for example.

Overview  (click to enlarge)

cassandra_overview

 

Compactions  (click to enlarge)

compactions

 

Pending Writes  (click to enlarge)

pending-writes

 

Write Requests  (click to enlarge)

write-requests

 

SSTable  (click to enlarge)

sstable

 

Bloom Filter  (click to enlarge)

bloom-filter

SPM now monitors Cassandra metrics like the ones listed below (a sketch of reading such metrics over JMX follows the list):

  • Write Requests (rate, count, latency)
  • Read Requests (rate, count, latency)
  • Pending Write Operations (flushes, post flushes, write requests, replication of write)
  • Pending Read Operations (read requests, read repair tasks, compactions)
  • Pending Cluster Operations (manual repair tasks, gossip tasks, hinted handoff, internal responses, migrations, misc tasks, request responses)
  • Compactions
  • Row/Key Cache (hit ratio, requests count)
  • Local Writes (rate, count, latency)
  • Local Reads (rate, count, latency)
  • SSTable (size, count)
  • Bloom Filter (space used, false positives ratio)
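
Under the hood, Cassandra exposes these metrics over JMX (port 7199 by default), which is how monitoring agents typically collect them.  Here is a minimal sketch, using plain javax.management calls, of reading the client write-request latency; the MBean name shown follows Cassandra 2.x’s org.apache.cassandra.metrics naming and may differ in other Cassandra versions, so treat it as an illustration rather than a recipe.

import javax.management.MBeanServerConnection;
import javax.management.ObjectName;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

public class CassandraWriteLatencyProbe {
    public static void main(String[] args) throws Exception {
        // Connect to Cassandra's JMX endpoint (default port 7199, no authentication assumed)
        JMXServiceURL url = new JMXServiceURL("service:jmx:rmi:///jndi/rmi://localhost:7199/jmxrmi");
        JMXConnector connector = JMXConnectorFactory.connect(url);
        MBeanServerConnection mbeans = connector.getMBeanServerConnection();

        // Client-level write request latency timer (Cassandra 2.x metric name; may vary by version)
        ObjectName writeLatency =
            new ObjectName("org.apache.cassandra.metrics:type=ClientRequest,scope=Write,name=Latency");

        // "Count" and "Mean" are standard attributes on the exposed timer MBean
        Object count = mbeans.getAttribute(writeLatency, "Count");
        Object mean = mbeans.getAttribute(writeLatency, "Mean");
        System.out.println("write requests: " + count + ", mean latency: " + mean);

        connector.close();
    }
}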

Please tell us what you think – @sematext is always listening!  Is there something SPM Performance Monitoring doesn’t monitor that you would really like to monitor?

Consolidate Your App Monitoring — It’s Easy!

Many organizations tackle performance monitoring with a mish-mash of different monitoring and alerting tools cobbled together in an uneasy coexistence that is often far from seamless.  Think Graphite+Nagios, for example.  SPM takes all that hassle away and makes it easy and comprehensive in one step.

Try SPM for Free for 30 Days

Try SPM Performance Monitoring for Free for 30 days by registering here.  There’s no commitment and no credit card required.

We’re Hiring

If you enjoy performance monitoring, log analytics, or search analytics, working with projects like Elasticsearch, Solr, HBase, Hadoop, Kafka, and Storm, then drop us a line.  We’re hiring planet-wide!  Front end and JavaScript Developers, Developer Evangelists, Full-stack Engineers, Mobile App Developers…get in touch!

The State of Solandra – Summer 2011

A little over 18 months ago we talked to Jake Luciani about Lucandra – a Cassandra-based Lucene backend.  Since then Jake has moved away from raw Lucene and married Cassandra with Solr, which is why Lucandra now goes by Solandra.  Let’s see what Jake and Solandra are up to these days.

What is the current status of Solandra in terms of features and stability?

Solandra has gone through a few iterations.  First as Lucandra, which partitioned data by terms and used Thrift to communicate with Cassandra.  This worked for a few big use cases, mainly managing an index per user, and garnered a number of adopters.  But it performed poorly when you had very large indexes with many dense terms, due to the number and size of remote calls needed to fulfill a query.

Last summer I started off on a new approach based on Solr that would address Lucandra’s shortcomings: Solandra.  The core idea of Solandra is to use Cassandra as a foundation for scaling Solr.  It achieves this by embedding Solr in the Cassandra runtime and using the Cassandra routing layer to auto-shard an index across the ring (by document).  This means good random distribution of data for writes (using Cassandra’s RandomPartitioner) and good search performance, since individual shards can be searched in parallel across nodes (using Solr distributed search).  Cassandra is responsible for sharding, replication, failover and compaction.  The end user now gets a single scalable component for search, without changing APIs, that will scale in the background for them.  Since search functionality is performed by Solr, it will support anything Solr does.

I gave a talk recently on Solandra and how it works: http://blip.tv/datastax/scaling-solr-with-cassandra-5491642

Are you still the sole developer of Solandra?  How much time do you spend on Solandra?
Have there been any external contributions to Solandra?

I am still responsible for the majority of the code; however, the Solandra community is quite large, with over 500 GitHub followers and 60 forks.  I receive many useful bug reports and patches through the community.  Late last year I started working at DataStax (formerly Riptano) to focus on Apache Cassandra.  DataStax is building a suite of products and services to help customers use Cassandra in production and incorporate Cassandra into existing enterprise infrastructure.  Solandra is a great example of this. We currently have a number of customers using Solandra and we encourage people interested in using Solandra to reach out to us for support.

What are the most notable differences between Solandra and Solr?

The primary difference is the ability to grow your Solr infrastructure seamlessly using Cassandra. I purposely want to avoid altering the Solr functionality, since the primary goal here is to make it easy for users to migrate to and from Solandra and Solr.  That being said, Solandra does offer some unique features for managing millions of indexes.  One is that different Solr schemas can be injected at runtime using a RESTful interface; another is that Solandra supports the concept of virtual Solr cores, which share the same core but are treated as different indexes. For example, if you have a core called “inbox” you can create an index per user, like “inbox.otis” or “inbox.jake”, just by changing the endpoint URL.

Finally, Solandra has a bulk loading interface that makes it easy to index large amounts of data at a time (one known cluster indexes at ~4-5MB of text per second).

What are the most notable differences between Solandra and ElasticSearch?

ElasticSearch is more mature and offers a similar architecture for scaling search though not based on Cassandra or Solr.  I think ElasticSearch’s main weakness is it requires users to scrap their existing code and tools to use it.  On the other hand, Solandra provides a scalable platform built on Solr and lets you grow with it.

Solandra doesn’t use the Lucene index file format, so it can grow to support millions of indexes. Systems like Solr and ElasticSearch create a directory per index, which makes managing millions of indexes very hard. The flip side is that a lot of performance tweaks are lost by not using the native file format; most of the current work on Solandra relates to improving single-node performance.

Solandra is a single component that gives you search AND a NoSQL database, and is therefore much easier to manage from the operations perspective, IMO.

What do you plan on adding to Solandra that will make it clearly stand out from Solr or Elastic Search?

Solandra will continue to grow with Solr (4.0 will be out in the future), as well as with Cassandra. Right now Solandra’s real-time search is limited by the performance of Solr’s field cache implementation. By incorporating Cassandra triggers I think we can remove this bottleneck and get really impressive real-time performance at scale, due to how Solandra pre-allocates shards.

Also, since the Solr index is stored in the Cassandra datamodel, you can now apply some really interesting features of Cassandra to Solr, such as expiring indexes and triggered searches.

When should one use Solandra?

If you say yes to any of the following you should use Solandra:

  • I have more data than can fit on a single box
  • I have potentially millions of indexes
  • I need improved indexing with multi-master writes
  • I need multi-datacenter search
  • I am already using Cassandra and Solr
  • I am having trouble managing my Solr cluster

When should one not use Solandra?

If you are happy with your Solr system today and you have enough capacity to scale the size and number of indexes comfortably, then there is no need to use Solandra.  Also, Solandra is under active development, so you should be prepared to help diagnose unknown issues.  Note, too, that if you require search features that are not currently supported by Solr distributed search, you should not use Solandra.

Are there known problems with Solandra that users should be aware of?

Yes. Currently the index sizes can be much larger in Solandra than in Solr (in some cases 10x); this is due to how Solandra indexes data, as well as to Cassandra’s file format. Cassandra 1.0 includes compression, so that will help quite a bit.  Also, since consistency in Solandra is tunable, your application has to consider the implications of writing data at lower consistency levels.  Finally, one thing that keeps coming up quite often is users assuming that, since Solandra builds on Cassandra, it auto-indexes the data you already have in Cassandra.  This is not the case.  Data must be indexed and searched through the traditional Solr APIs.

Is anyone using Solandra in production? What is the biggest production deployment in terms of # docs, data size on filesystem, query rate?

Solandra is now in production with a handful of users I know of.  Some others are in the testing/pre-production stage, but it’s still a small number AFAIK.  The largest Solandra cluster I know of is on the order of ~5 nodes and ~10TB of data, with ~100k indexes and ~2B documents.

If you had to do it all over, what would you do differently?

I’m really excited with the way Lucandra/Solandra has evolved over the past year. It’s been a great learning experience and has allowed me to work with technologies and people I’m really, really excited about. I don’t think I’d change a thing, great software takes time.

When is Solandra 1.0 coming out and what is the functionality/issues that remain to be implemented before 1.0?

I don’t really use the 1.0 moniker as people tend to assume too much when they read that. I think once Solandra is fully documented, supports things like Cassandra based triggers for indexing and search, and has an improved on disk format, I’d be comfortable calling Solandra 0.9 :)

Thank you Jake.  We are looking forward to Solandra 0.9 then.

Lucandra / Solandra: A Cassandra-based Lucene backend

In this guest post Jake Luciani (@tjake) introduces Lucandra (Update: now known as Solandra – see our State of Solandra post), a Cassandra-based backend for Lucene (Update: now integrated with Solr instead).
Update: Jake will be giving a talk about Lucandra in the April 2010 NY Search & Discovery Meetup.  Sign up!
Update 2: Slides from the Lucandra meetup talk are on line: http://www.jroller.com/otis/entry/lucandra_presentation
For most users, the trickiest part of deploying a Lucene based solution is managing and scaling storage, reads, writes and index optimization. Solr and Katta (among others) offer ways to address these, but still require quite a lot of administration and maintenance. This problem is not specific to Lucene. In fact most data management applications require a significant amount of administration.
In response to this problem of managing and scaling large amounts of data, the “nosql” movement has started to become more popular. One of the most popular and widely used “nosql” systems is Cassandra, an Apache Software Foundation project originally developed at Facebook.

What is Cassandra?

Cassandra is a scalable and easy-to-administer column-oriented data store, modeled after Google’s BigTable but built by the designers of Amazon’s Dynamo. One of the big differentiators of Cassandra is that it does not rely on a global file system as HBase and BigTable do. Rather, Cassandra uses decentralized peer-to-peer “Gossip”, which means two things:
  1. It has no single point of failure, and
  2. Adding nodes to the cluster is as simple as pointing it to any one live node

Cassandra also has built-in multi-master writes, replication, rack awareness, and can handle downed nodes gracefully. Cassandra has a thriving community and is currently being used at companies like Facebook, Digg and Twitter to name a few.

Enter Lucandra

Lucandra is a Cassandra backend for Lucene. Since Cassandra’s original use within Facebook was for search, integrating Lucene with Cassandra seemed like a “no brainer”. Lucene’s core design makes it fairly simple to strip away and plug in custom Analyzer, Writer, Reader, etc. implementations. Rather than implementing Lucene’s Directory interface on top of Cassandra, as some backends do (DbDirectory, for example), our approach was to implement an IndexReader and IndexWriter directly on top of Cassandra.

Here’s how Terms and Documents are stored in Cassandra. A Term is a composite key made up of the index, field and term, with the document id as the column name and the position vector as the column value.
      Term Key                    ColumnName   Value
      "indexName/field/term" => { documentId , positionVector }
      Document Key
      "indexName/documentId" => { fieldName , value }
Cassandra allows us to pull ranges of keys and groups of columns so we can really tune the performance of reads as well as minimize network IO for each query. Also, since writes are indexed and replicated by Cassandra we don’t need to worry about optimizing the indexes or reopening the index to see new writes. This means we get a soft real-time distributed search engine.
There is an impact on Lucandra searches when compared to native Lucene searches. In our testing we see that Lucandra’s IndexReader is ~10% slower than the default IndexReader. However, this is still quite acceptable to us given what you get in return.
For writes, Lucandra is comparatively slow next to regular Lucene, since every term is effectively written under its own key. Luckily, this will be fixed in the next version of Cassandra, which will allow batched writes for keys.
One other major caveat is that there is no term scoring in the current code. This simply hasn’t been needed yet. Adding it is relatively trivial, via another column.
To see Lucandra in action you can try out the Twitter search app http://sparse.ly that is built on Lucandra. This service uses the Lucandra store exclusively and does not use any sort of relational or other type of database.

Lucandra in Action

Using Lucandra is extremely simple and switching a regular Lucene search application to use Lucandra is a matter of just several lines of code. Let’s have a look.

First, we need to create the connection to Cassandra:


import lucandra.CassandraUtils;
import lucandra.IndexReader;
import lucandra.IndexWriter;
...
Cassandra.Client client = CassandraUtils.createConnection();

Next, we create Lucandra’s IndexWriter and IndexReader, and Lucene’s own IndexSearcher.


IndexWriter indexWriter = new IndexWriter("bookmarks", client);
IndexReader indexReader = new IndexReader("bookmarks", client);
IndexSearcher indexSearcher = new IndexSearcher(indexReader);

From here on, you work with IndexWriter and IndexSearcher just like you would in vanilla Lucene. Look at the BookmarksDemo for the complete class.
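
To round out the example, here is a rough sketch of indexing a bookmark and searching for it, using the Lucene 3.0-era API. The field names are made up, and the exact signature of Lucandra’s addDocument may differ from what is shown here, so consult the BookmarksDemo for the real calls.

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.queryParser.QueryParser;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.TopDocs;
import org.apache.lucene.util.Version;
...
// index a bookmark document
Document doc = new Document();
doc.add(new Field("url", "http://sematext.com", Field.Store.YES, Field.Index.NOT_ANALYZED));
doc.add(new Field("title", "Sematext Blog", Field.Store.YES, Field.Index.ANALYZED));
indexWriter.addDocument(doc, new StandardAnalyzer(Version.LUCENE_30)); // signature assumed; see BookmarksDemo

// search for it through the standard Lucene IndexSearcher
Query query = new QueryParser(Version.LUCENE_30, "title", new StandardAnalyzer(Version.LUCENE_30)).parse("sematext");
TopDocs hits = indexSearcher.search(query, 10);
System.out.println("hits: " + hits.totalHits);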

What’s next? Solandra!

Now that we have Lucandra, we can use it with anything built on Lucene. For example, we can integrate Lucandra with Solr and simplify our Solr administration. In fact, this has already been attempted and we plan to support it in our code soon.