Side by Side with Elasticsearch and Solr: Performance and Scalability

Back by popular demand!  Sematext engineers Radu Gheorghe and Rafal Kuc are returning to Berlin Buzzwords next Tuesday, June 2, with the second installment of their “Side by Side with Elasticsearch and Solr” talk.  (You can check out Part 1 here.)

Elasticsearch and Solr Performance and Scalability

This brand-new talk, which will include a live demo, a video demo and slides, will dive deeper into how Elasticsearch and Solr scale and perform. And of course, it will take into account all the goodies that have come with these search platforms since last year.

Radu and Rafal will show attendees how to tune Elasticsearch and Solr for two common use cases: logging and product search.  Then they will show the numbers they got after tuning. They will also share best practices for scaling out massive Elasticsearch and Solr clusters: for example, how to divide data into shards and indices/collections in a way that accounts for growth, when to use routing, and how to make sure that coordinating nodes don’t become unresponsive.
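
To make the routing idea a bit more concrete, here is a minimal Python sketch (ours, not taken from the talk) of how a routing key can keep related documents on the same shard. The names `shard_for`, `group_by_shard` and `NUM_SHARDS` are hypothetical, and `zlib.crc32` is only a stand-in: Elasticsearch and Solr use their own hash functions internally.

```python
import zlib

NUM_SHARDS = 4  # hypothetical shard count, chosen only for illustration


def shard_for(routing_key: str, num_shards: int = NUM_SHARDS) -> int:
    """Map a routing key to a shard number with a stable hash.

    zlib.crc32 is a stand-in here; real search engines use their own hashes.
    """
    return zlib.crc32(routing_key.encode("utf-8")) % num_shards


def group_by_shard(docs, num_shards: int = NUM_SHARDS):
    """Group documents by the shard that their 'customer' field routes to."""
    shards = {}
    for doc in docs:
        shard = shard_for(doc["customer"], num_shards)
        shards.setdefault(shard, []).append(doc)
    return shards


# Illustrative documents; the field names are assumptions, not a real schema.
docs = [
    {"customer": "acme", "msg": "login ok"},
    {"customer": "acme", "msg": "login failed"},
    {"customer": "globex", "msg": "disk full"},
]
placement = group_by_shard(docs)
```

Because every document with the same key lands on the same shard, a query that supplies that key only needs to visit one shard instead of all of them. The flip side is that a few very large keys can make shards uneven, which is one reason the "when to use routing" question is worth asking.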

Not going to Berlin Buzzwords?  No problem — watch this space for the video and slides soon…


Elasticsearch Intro Training Workshop on June 3 in Berlin

Attending BBuzz or going to be in the Berlin area on June 3 (the day after the conference ends)? We’re holding an Elasticsearch Intro training workshop in Berlin; here are the details.

Feedback & Questions — Bring It On

If you’ve got feedback or questions about topics like Elasticsearch vs. Solr (here’s a detailed comparison) and what’s new and exciting with both applications, just drop us a line.  We live and breathe this stuff, so we’re always happy to hear from like-minded people.

Presentation: Tuning Elasticsearch Indexing Pipeline for Logs

Fresh from GeeCON in Krakow…we have another Elasticsearch and Logging manifesto from Sematext engineers — and book authors — Rafal Kuc and Radu Gheorghe.  As with many of their previous presentations, Radu and Rafal go into detail on Elasticsearch, Logstash and Rsyslog topics like:

  • How Elasticsearch, Logstash and Rsyslog work
  • Tuning Elasticsearch
  • Using, scaling, and tuning Logstash
  • Using and tuning Rsyslog
  • Rsyslog with JSON parsing
  • Hardware and data tests
  • …and lots more along these lines

[Note: Video of the talk coming soon to this post!]

If you find this stuff interesting and have similar challenges, then drop us a line to chat about our Elasticsearch and Logging consulting services and Elasticsearch (and Solr, too) production support.  Oh yeah, and we’re hiring worldwide if you are into Logging, Monitoring, Search, or Big Data Analytics as much as Radu and Rafal!

Videos: Tuning Solr for Logs and Solr Anti-Patterns

If you’re an avid Solr user you’ll want to check out these Lucene / Solr Revolution videos from two of Sematext’s Solr experts: Rafal Kuc and Radu Gheorghe.

Tuning Solr for Logs

Radu talked about Solr performance tuning, which is always nice for keeping your applications snappy and your costs down. This is especially true for logs, social media and other stream-like data that can easily grow into terabyte territory.

(Note: there’s no audio between 3:30 and 4:30. We hope to have this fixed soon, and it doesn’t materially affect the talk.)

Solr Anti-Patterns

Rafal points out common mistakes and roads that should be avoided at all costs when dealing with Solr.

Slides and Summaries

You can find slides of the Solr presentations in this blog post and summaries in this blog post.

Enjoy!

Solr Presentations from Lucene/Solr Revolution 2014

Thanks to everyone who stopped by the Sematext booth at last week’s Lucene/Solr Revolution event in Washington, DC and attended our two talks:

  • Tuning Solr for Logs, by Radu Gheorghe
  • Solr Anti-Patterns, by Rafal Kuc

The attendance, questions and interest are very much appreciated.  As a company that prides itself on its Solr expertise (and Elasticsearch expertise too, for that matter), it was nice to spend a couple days talking about search and Big Data challenges, performance monitoring and logging with fellow experts from around the world. Here are the slides for the two talks we gave (summaries of the talks can be found here):

 

Videos of the talks will be posted here soon.  Hope to see everyone again next year!

Video and Slides: Centralized Logging with Logstash and Elasticsearch

Sematext engineer and Elasticsearch / Logstash expert Rafal Kuc gave a well-received talk at the recent DevOps Days Warsaw event.  The talk was titled “From Zero to Hero – Centralized Logging with Logstash & Elasticsearch” and you can watch the video here:

And check out the slides here:

Brief Summary

Rafal talked about the common problem of digging through logs to find one particular event — or group of them.  And going even further into this pain point — what if you have lots of servers and you don’t have a single place to look for logs?  Do you really want to ssh to one or more servers and grep log files?  Of course not!  It’s 2014 and there are tools and services that help you spend less time hunting around for problems and more time actually fixing them.

To help solve this problem Rafal guided the audience through the basics of using Logstash and Elasticsearch together as the perfect combination for handling logs from multiple applications.  Attendees also learned how to set up Logstash, how to configure it to parse logs and, finally, how to send them to an Elasticsearch cluster.

Rafal also discussed tuning Elasticsearch for log management and centralized logging, and showed how to switch easily between shipping logs to a self-hosted solution like Elasticsearch / Logstash / Kibana (aka ELK) and shipping them to Logsene Log Management and Analytics by changing a single line in the Logstash configuration.

Enjoy!  And thanks to everyone who attended Rafal’s talk in person and stopped by the Sematext booth.

Community Voting for Sematext Talks at Lucene/Solr Revolution 2014

The biggest open source conference dedicated to Apache Lucene/Solr takes place in November in Washington, DC.  If you are planning to attend (and even if you are not), you can help improve the conference’s content by voting for your favorite talk topics.  The top vote-getters for each track will be added to the Lucene/Solr Revolution 2014 agenda.

Not surprisingly for one of the leading Lucene/Solr products and services organizations, Sematext has two contenders in the Tutorial track:

We’d love your support to help us contribute our expertise to this year’s conference.  To vote, simply click on the above talk links and you’ll see a “Vote” button in the upper left corner.  That’s it!

To give you a better sense of what Radu and Rafal would like to present, here are their talk summaries:

Tuning Solr for Logs – by Radu Gheorghe

Performance tuning is always nice for keeping your applications snappy and your costs down. This is especially the case for logs, social media and other stream-like data that can easily grow into terabyte territory.

While you can always use SolrCloud to scale out of performance issues, this talk is about optimizing. First, we’ll talk about Solr settings by answering the following questions:

  • How often should you commit and merge?
  • How can you have one collection per day/month/year/etc?
  • What are the performance trade-offs for these options?
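
As a rough illustration of the second question, here is a small Python sketch (ours, not part of the talk abstract) of how a log shipper might pick a time-based collection name for each event. The `logs_` prefix, the `PATTERNS` table and the `collection_for` helper are all hypothetical names chosen for the example.

```python
from datetime import datetime, timezone

# Hypothetical naming patterns; the "logs_" prefix is purely illustrative.
PATTERNS = {
    "day": "logs_%Y-%m-%d",
    "month": "logs_%Y-%m",
    "year": "logs_%Y",
}


def collection_for(timestamp: datetime, granularity: str = "day") -> str:
    """Return the name of the time-based collection a log event belongs to."""
    return timestamp.strftime(PATTERNS[granularity])


event_time = datetime(2014, 11, 11, 9, 30, tzinfo=timezone.utc)
print(collection_for(event_time, "day"))    # logs_2014-11-11
print(collection_for(event_time, "month"))  # logs_2014-11
```

The granularity is where the trade-off shows up: finer-grained collections are cheap to drop once they age out, but they also mean more collections to manage and to search across.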

Then, we’ll turn to hardware. We know SSDs are fast, especially on cold-cache searches, but are they worth the price? We’ll give you some numbers and let you decide what’s best for your use case.

The last part is about optimizing the infrastructure pushing logs to Solr. We’ll talk about tuning Apache Flume for handling large flows of logs and about overall design options that also apply to other shippers, like Logstash. As always, there are trade-offs, and we’ll discuss the pros and cons of each option.

Solr Anti-Patterns – by Rafal Kuc

Working as consultants and software engineers, and helping people in various ways, we see many patterns in how Solr is used and how it should be used. We usually say what should be done, but we rarely point out which paths should not be taken, and why. That’s why I would like to point out common mistakes and roads that should be avoided at all costs.  During the talk I would like not only to show the bad patterns, but also to show the difference before and after.

The talk is divided into three major sections:

  1. We will start with general configuration pitfalls that people commonly fall into. We will discuss different use cases, showing the proper path that one should take.
  2. Next, we will focus on data modeling and what to avoid when making your data indexable. Again, we will see real-life use cases, followed by a description of how to handle them properly.
  3. Finally, we will talk about queries and all the juicy mistakes people make when searching indexed data.

Each use case will be illustrated by a before-and-after analysis showing how the metrics change, so the talk will bring not only pure facts but, hopefully, know-how worth remembering.

Thank you for your support!

Presentation and Video: Side by Side with Solr and Elasticsearch

Fresh from Berlin Buzzwords, where Sematext’s own Radu Gheorghe and Rafal Kuc presented “Side by Side with Solr and Elasticsearch” on the same stage, at the same time…but in different colors.  The talk included live demos, graphing, stats, and hints at juicy things to come.  Needless to say, if you deal with Solr and Elasticsearch then there are great insights to be found here!

Here is the presentation:

 

And here is the video:

 

Want to Be on Stage Somewhere Like Radu and Rafal Talking About Solr and Elasticsearch?

Or maybe you don’t want the spotlight — that’s cool too.  But…if you do enjoy performance monitoring, log analytics, or search analytics, working with projects like Elasticsearch, Solr, HBase, Hadoop, Kafka, and Storm, then drop us a line.  We’re hiring planet-wide!  Front end and JavaScript Developers, Developer Evangelists, Full-stack Engineers, Mobile App Developers…get in touch!

[Note: for those of you who don’t have the time or inclination to go through all the technical details, here’s a high-level, up-to-date (2015) Solr vs. Elasticsearch overview]

Enjoy!
