Berlin Buzzwords 2013 – Two Talks from Sematext

Last year at Berlin Buzzwords we were proud to give three talks. Alex talked about “Real-time Analytics with HBase” (slides, video), Otis talked about large scale monitoring in his talk titled “Large Scale ElasticSearch, Solr & HBase Performance Monitoring” (slides, video), and Rafał talked about how we scale ElasticSearch clusters in “Scaling Massive ElasticSearch Clusters” (slides, video). We were also very happy to be one of the sponsors of this great conference :) Because we really enjoyed the conference, we decided to submit a few proposals this year, and they got accepted. In this year’s schedule we will be giving the following talks:

Radu: JSON Logging with ElasticSearch

This talk is about aggregating loooots of logs – searching of seriously big data. We’ll go through everything we can possibly go through in 20 minutes. We’ll look at how, where, when, why, and what to log. We’ll show how to use Elasticsearch as a data store for logs and what the benefits of doing so are. We’ll discuss advantages and disadvantages of logging in JSON, which is easily processed by machines, over traditional logging, which is easily processed by humans. Finally, we’ll explore how you can get your logs – JSON or not – into Elasticsearch, run searches and statistics on them, and create pretty graphs you can’t stop staring at.
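To make the JSON-versus-traditional contrast concrete, here is a minimal sketch using only Python's standard `logging` and `json` modules. It is an illustration, not the approach from the talk: the `JsonFormatter` class, the `fields` key, the `checkout` logger name, and the sample values are all made up for this example, and shipping the lines to Elasticsearch is deliberately left out.

```python
import io
import json
import logging


class JsonFormatter(logging.Formatter):
    """Hypothetical formatter: renders each log record as one JSON object per line."""

    def format(self, record):
        doc = {
            "timestamp": self.formatTime(record),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        # Assumption of this sketch: structured values arrive via
        # extra={"fields": {...}} and are merged in as top-level keys.
        doc.update(getattr(record, "fields", {}))
        return json.dumps(doc)


def build_logger(stream, formatter):
    """Create a standalone logger that writes to the given stream."""
    logger = logging.Logger("checkout")  # hypothetical logger name
    handler = logging.StreamHandler(stream)
    handler.setFormatter(formatter)
    logger.addHandler(handler)
    return logger


plain_out, json_out = io.StringIO(), io.StringIO()
plain = build_logger(plain_out, logging.Formatter("%(levelname)s %(message)s"))
structured = build_logger(json_out, JsonFormatter())

# Traditional line: easy for a human to read, needs parsing to query.
plain.warning("payment of 42 EUR failed for user 1337")
# JSON line: the same facts as named fields a machine can filter and aggregate.
structured.warning("payment failed", extra={"fields": {"user_id": 1337, "amount_eur": 42}})

print(plain_out.getvalue(), end="")
print(json_out.getvalue(), end="")
```

The plain line has to be parsed with a pattern before the user or amount can be searched on, while the JSON line already carries them as separate fields, which is the trade-off the abstract describes.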

Rafał: Battle of the Giants, Round 2

Learn how Apache Solr and ElasticSearch, two great enterprise search servers, are evolving and adding new features. We will compare the latest and greatest versions of Solr and ES, both of which use Lucene 4.x but take different approaches to handling codecs, per-field similarities, and more. We’ll look not only at the technical aspects of both Apache Solr and ElasticSearch, but also dig into the makeup of their contributors and compare the code and, of course, the user communities. By the end of the talk you’ll know the main differences between these two search servers: how they handle shard and replica distribution, automatic data replication, and different query types. In addition, you’ll learn what the admin APIs for both Solr and ElasticSearch look like and how to use them to control and alter your cluster state. Last, but not least, you’ll learn what to avoid when using ElasticSearch or Apache Solr.

We hope to see some of you in Berlin. If these topics are of interest to you, but you won’t be coming to Berlin, feel free to get in touch, leave comments, or ping @sematext. As usual, we’ll be posting slides after the talks, and the organizers will probably record the talks and publish them after the conference. And if you love working with the things our talks are about, we are hiring world-wide!

Poll: Using SolrCloud or Not?

We know that as of February 2013, about 75% of the Solr users who follow the Sematext Blog use some version of Solr 4.x. But today we are trying to get to another interesting stat:

What portion of Solr 4.x users use SolrCloud?

Let’s find out!  Please tweet this to help us get more votes and better stats.

Please vote only if you are using Solr 4.x. Please do NOT vote if you are using a 1.x or 3.x version of Solr.

Poll: Which Solr version are you using?

With Solr 4.1 recently released, let’s see which version(s) of Solr people are using.  Please tweet it to help us get more votes and better stats.

Solr vs. ElasticSearch: Part 6 – User & Dev Communities

One of the questions after my talk during the recent ApacheCon EU was what I thought about the communities of the two search engines I was comparing. Not surprisingly, this is also a question we often address in our consulting engagements. As a part of our Apache Solr vs ElasticSearch post series we decided to step away from the technical aspects of SolrCloud vs. ElasticSearch and look at the communities gathered around these two projects. If you haven’t read the previous posts about Apache Solr vs. ElasticSearch, here are pointers to all of them:

Read more of this post

Solr vs ElasticSearch: Part 5 – Management API Capabilities

In previous posts, all listed below, we’ve discussed general architecture, full text search capabilities, and facet aggregation possibilities. However, until now we have not discussed any of the administration and management options, or the things you can do on a live cluster without a restart. So let’s get into it and see what Apache Solr and ElasticSearch have to offer.

Read more of this post

Solr vs ElasticSearch: Part 4 – Faceting

Solr 4 (aka SolrCloud) has just been released, so it’s the perfect time to continue our ElasticSearch vs. Solr series. In the last three parts of the series we gave a general overview of the two search engines and covered their data handling and full text search capabilities. In this part we look at how these two engines handle faceting.

Read more of this post

Solr vs ElasticSearch: Part 3 – Searching

In the last two parts of the series we looked at the general architecture and how data can be handled in both Apache Solr 4 (aka SolrCloud) and ElasticSearch and what the language handling capabilities of both enterprise search engines are like. In today’s post we will discuss one of the key parts of any search engine – the ability to match queries to documents and retrieve them.

Read more of this post

Battle of the Giants: Apache Solr 4.0 vs ElasticSearch

The Apache Solr 4.0 release is imminent and we have a heavily anticipated Solr vs. ElasticSearch blog post series going on. What better time to share that our Rafał Kuć will be giving a talk titled Battle of the giants: Apache Solr 4.0 vs ElasticSearch at the upcoming ApacheCon/Lucene EuroCon in Germany this November.

Abstract:

In this talk, the audience will hear how the long-awaited Apache Solr 4.0 (aka SolrCloud) compares to the second search engine built on top of Apache Lucene – ElasticSearch. The comparison ranges from architectural differences and behavior in situations like split-brain to cluster recovery, and from distributed indexing and document distribution control to handling multiple shards and replicas in a single cluster. During the talk, we will also compare the most used and anticipated features, such as faceting, document grouping, and so on. At the end we will talk about performance differences, cluster monitoring, and troubleshooting.

Solr vs. ElasticSearch: Part 2 – Data Handling

In the previous part of the Solr vs. ElasticSearch series we talked about the general architecture of these two great search engines based on Apache Lucene. Today, we will look at their ability to handle your data and perform indexing and language analysis.

  1. Solr vs. ElasticSearch: Part 1 – Overview
  2. Solr vs. ElasticSearch: Part 2 – Indexing and Language Handling
  3. Solr vs. ElasticSearch: Part 3 – Searching
  4. Solr vs. ElasticSearch: Part 4 – Faceting
  5. Solr vs. ElasticSearch: Part 5 – Management API Capabilities
  6. Solr vs. ElasticSearch: Part 6 – User & Dev Communities Compared

Read more of this post

Solr vs. ElasticSearch: Part 1 – Overview

Good Solr vs. ElasticSearch coverage is long overdue. We make good use of our own Search Analytics and pay attention to what people search for. Not surprisingly, lots of people are wondering when to choose Solr and when ElasticSearch, and this SolrCloud vs. ElasticSearch question is something we regularly address in our search consulting engagements.

As the Apache Lucene 4.0 release approaches, and with it the Solr 4.0 release, we thought it would be beneficial to take a deeper look and compare the two leading open source search engines built on top of Lucene – Apache Solr and ElasticSearch. Because the topic is very wide and can go deep, we are publishing our research as a series of blog posts starting with this post, which provides a general overview of the functionality provided by both search engines.

  1. Solr vs. ElasticSearch: Part 1 – Overview
  2. Solr vs. ElasticSearch: Part 2 – Indexing and Language Handling
  3. Solr vs. ElasticSearch: Part 3 – Searching
  4. Solr vs. ElasticSearch: Part 4 – Faceting
  5. Solr vs. ElasticSearch: Part 5 – Management API Capabilities
  6. Solr vs. ElasticSearch: Part 6 – User & Dev Communities Compared

Read more of this post
