Event Stream Processor Matrix
September 26, 2011
We published our first ever UI-focused post on Top JavaScript Dynamic Table Libraries the other day and got some valuable feedback – thanks!
We are back to talking about the backend again. Our Search Analytics and Scalable Performance Monitoring services/products accept, process, and store huge amounts of data. One thing both of these services do is process a stream of events in real-time (and in batch, of course). So what solutions are there that help one process data in real-time and perform operations on a rolling window of data, such as the last 5 or 30 minutes of the incoming event stream? We know of several solutions that fit that bill, so we decided to put together a matrix with the essential attributes of those tools in order to compare them and make our pick. Below is the matrix we came up with. If you are viewing this on our site, the table is likely going to be too wide, but it should look fine in a proper feed reader.
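To make the "rolling window" idea concrete, here is a minimal sketch of a count over the last 5 minutes of events. It is not taken from any of the tools compared below; the class name `RollingWindowCounter` and its methods are hypothetical, and the sketch assumes events arrive in timestamp order with timestamps given in seconds.

```python
from collections import deque


class RollingWindowCounter:
    """Counts events seen in the last `window_seconds` seconds (hypothetical helper)."""

    def __init__(self, window_seconds):
        self.window_seconds = window_seconds
        self._timestamps = deque()  # event times in seconds, oldest first

    def add(self, timestamp):
        """Record one event; timestamps are assumed to arrive in order."""
        self._timestamps.append(timestamp)
        self._evict(timestamp)

    def count(self, now):
        """Return the number of events in the window ending at `now`."""
        self._evict(now)
        return len(self._timestamps)

    def _evict(self, now):
        # Drop events at or before the cutoff, so the window is (now - size, now].
        cutoff = now - self.window_seconds
        while self._timestamps and self._timestamps[0] <= cutoff:
            self._timestamps.popleft()


window = RollingWindowCounter(window_seconds=5 * 60)
for t in (0, 60, 120, 400):
    window.add(t)
result = window.count(now=400)  # only the events at 120 and 400 are still in the window
```

A declarative, query-based engine lets you state this kind of window in a query, while with the lower-level tools you would write and maintain logic like the above yourself.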
If you like working on systems that handle large volumes of data, like our Search Analytics and Scalable Performance Monitoring services, we are hiring world-wide.
Matrix part 1:
| | License | Language | Scaling | Add or change rules on the fly | Other infra needed | Rule types |
|---|---|---|---|---|---|---|
| Esper | GPL2, commercial | Java | Scale up | yes | none | Declarative, query-based |
| Drools Fusion | ASL 2.0 | Java | Scale up | yes | none | Declarative, mostly rule-based, but supports queries too |
| FlumeBase | ASL 2.0 | Java | Horizontal: natural sharding on top of Flume | yes | Flume | Declarative, query-based |
| Storm | EPL 1.0 | Clojure | Horizontal | Can be implemented on top of Zookeeper | ZeroMQ, Zookeeper | Provides only low-level primitives (like grouping). A rule engine has to be implemented manually on top. |
| S4 | ASL 2.0 | Java | Horizontal | Can be implemented on top of Zookeeper | Zookeeper | Provides a set of low-level primitives. Some correlation support via joins. The documentation has a “windowing” section, but it is empty. |
| Activeinsight | CPAL 1.0, commercial | Java | Horizontal | yes | | Declarative, query-like |
| Kafka | ASL 2.0 | Java | Horizontal | | Zookeeper | Set of low-level primitives |
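As an illustration of the "grouping" primitive mentioned in the table, the sketch below routes events to workers by key, so that all events with the same key are handled by the same worker. This is a simplified, single-process stand-in and not the API of Storm, S4, or any other tool listed; the function names `pick_worker` and `run_grouped_count` and the sample events are hypothetical.

```python
import zlib
from collections import defaultdict


def pick_worker(key, num_workers):
    """Map a key to a worker index with a stable hash (hypothetical helper)."""
    # zlib.crc32 is deterministic across runs, unlike the built-in hash() on str.
    return zlib.crc32(key.encode("utf-8")) % num_workers


def run_grouped_count(events, num_workers):
    """Route (key, value) events to workers by key; each worker counts its own keys."""
    workers = [defaultdict(int) for _ in range(num_workers)]
    for key, _value in events:
        workers[pick_worker(key, num_workers)][key] += 1
    return workers


# Made-up sample events, for illustration only.
events = [("solr", 1), ("lucene", 1), ("solr", 1), ("hbase", 1), ("solr", 1)]
workers = run_grouped_count(events, num_workers=3)
```

Because each key always lands on one worker, per-key state such as counts or windows can live on that worker without coordination, which is what makes this style of horizontal scaling possible. Any rule logic still has to be written on top.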
Matrix part 2:
| | Docs / examples | Maturity | Community | URL | Notes |
|---|---|---|---|---|---|
| Esper | very good | mature, stable | medium | esper.codehaus.org | |
| Drools Fusion | good | 3 years, stable | small | jboss.org/drools/drools-fusion.html | |
| FlumeBase | good | alpha | small | flumebase.org | |
| Storm | exists | used in production | growing very fast | tech.backtype.com | good deployment features |
| S4 | average | alpha, but used in production | medium (will grow under ASF) | s4.io | |
| Activeinsight | poor | unknown | unknown | activeinsight.org | |
| Kafka | good | used in production | small (will grow under ASF) | incubator.apache.org/kafka | |
So there you have it – we hope you find this useful. If you have any comments or questions, tweet us (@sematext) or leave a comment here.





This table doesn’t render properly. Only the first 7 columns are readable, the rest are behind other page elements.
Yeah
“If you are viewing this on our site, the table is likely going to be too wide, but it should look find in a proper feed reader.”
Out of curiosity, are you using a feed reader to read this blog or a browser?
Jacob – we’ve reformatted the matrix (split in 2) and eliminated the problem.
Here’s an addition for your grid. Streamcruncher (www.streamcruncher.com) is inactive at this time (so far as I can tell) but it is LGPL so who knows who may have forked it. It works pretty well and I have been poking it and extending some areas, trying to reduce the amount of code it takes to get a stream up and running. (Esper is so tight that it pretty much disgraces everyone else.)
I propose these grid entries:
Grid 1:
=====
License: LGPL
Language: Java
Scaling: Scale Up
Add or change rules on the fly: Yes
Other infra needed: Uses a database. Several supported, including H2 In-Mem.
Rule types: Declarative, query-based
Grid 2:
=====
Docs / examples: exists/decent
Maturity: Not
Community: None/Small
URL: http://www.streamcruncher.com
Notes:
Thanks Nicholas, good find.