Presentation: Large Scale Log Analytics with Solr

In this presentation from Lucene/Solr Revolution 2015, Sematext engineers — and Solr and centralized logging experts — Radu Gheorghe and Rafal Kuć talk about searching and analyzing time-based data at scale.

Documents ranging from blog posts and social media to application logs and metrics generated by smartwatches and other “smart” things share a similar pattern: they have timestamps among their fields, they rarely change once written, and they are deleted when they become obsolete. Because this kind of data is so large, it often causes scaling and performance challenges.

In this talk, Radu and Rafal focus on these challenges, including: designing the collection architecture properly, indexing data fast without documents waiting in queues for processing, running queries that include time-based sorting and faceting on enormous amounts of indexed data (without killing Solr!), and many more.

Here is the video:

…and here are the slides:


Here’s a Taste of What You’ll See

How do Logstash, rsyslog, Redis, and fast-food-hating zombies (?!) relate? You’ll have to check out the presentation to find out…


Solr “One-stop Shop”

Sematext is your “one-stop shop” for all things Solr: Expert Consulting, Production Support, Solr Training, and Solr Monitoring with SPM.

Log Analytics – We Can Help

If your log analysis and management leave something to be desired, then we’ve got you covered there as well.  There’s our centralized logging solution, Logsene.  And we also offer Logging Consulting should you require more in-depth support.

Questions or Feedback?

If you have any questions or feedback for us, please contact us by email or hit us up on Twitter. We love talking Solr (and logs)!


Presentation: Log Analysis with Elasticsearch

Fresh from the Velocity NYC conference is the latest presentation from Sematext engineers Rafal Kuć and Radu Gheorghe: “From zero to production hero: Log Analysis with Elasticsearch.”

The talk goes through the basics of centralizing logs in Elasticsearch and all the strategies that make it scale with billions of documents in production. They cover:

  • Time-based indices and index templates to efficiently slice your data
  • Different node tiers to de-couple reading from writing, heavy traffic from low traffic
  • Tuning various Elasticsearch and OS settings to maximize throughput and search performance
  • Configuring tools such as logstash and rsyslog to maximize throughput and minimize overhead
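
To make the first bullet concrete, here is a minimal sketch of time-based index naming and retention. It is not taken from the talk: the "logs-" prefix and both function names are hypothetical, and the date suffix simply mirrors the daily naming convention that Logstash uses by default. The point is that when each day's logs live in their own index, dropping obsolete data means deleting whole indices rather than individual documents.

```python
from datetime import date, datetime, timedelta, timezone

INDEX_PREFIX = "logs-"  # hypothetical prefix, chosen for illustration only


def index_for(ts: datetime) -> str:
    """Return the daily index name for a timezone-aware event timestamp."""
    # Convert to UTC first so that every producer agrees on the day boundary.
    return INDEX_PREFIX + ts.astimezone(timezone.utc).strftime("%Y.%m.%d")


def expired_indices(names, today: date, keep_days: int):
    """Return the index names whose day falls before the retention window."""
    cutoff = today - timedelta(days=keep_days)
    expired = []
    for name in names:
        if not name.startswith(INDEX_PREFIX):
            continue  # skip indices that do not follow the naming scheme
        day = datetime.strptime(name[len(INDEX_PREFIX):], "%Y.%m.%d").date()
        if day < cutoff:
            expired.append(name)
    return expired
```

A periodic job could call expired_indices with the list of existing index names and delete whatever comes back; the deletion call itself is left out here because it depends on the client in use.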

Here is part 1 of the Video:

Here is part 2 of the Video:

Here are the slides:


And here are the Commands and Demo used in the presentation:


Processing Metrics, Logs and Traces at Scale – DevOps Talk

If topics like performance monitoring and processing metrics, log management, and distributed transaction tracing — at scale, no less! — interest you, then you’ll want to check out what Sematext founder Otis Gospodnetić had to say at this week’s DevOps Summit in New York City.

Talk Summary

Application metrics, logs, and business KPIs are a goldmine. It’s easy to get started with the ELK stack (Elasticsearch, Logstash and Kibana): you can see lots of people coming up with impressive dashboards, in less than a day, with no previous experience. Going from proof-of-concept to production tends to be a bit more difficult, unfortunately, and it tends to gobble up our attention, time, and money. In this talk Otis shared the architecture and decisions behind our services for handling large volumes of performance metrics, traces, logs, anomaly detection, alerts, etc. He followed data from its sources through collection, aggregation, storage, and visualization. The talk also gave an overview of some of the relevant technologies, such as HBase, Elasticsearch and Kafka, and their strengths and weaknesses.


Feedback and Solutions for Monitoring and Logging

Just drop us an email or DM us if you have questions or comments about the presentation; we love feedback! And if you’d like to chat about the solutions mentioned in it, such as SPM performance monitoring and Logsene log management and analytics, we’re happy to talk about those as well.

Side by Side with Elasticsearch and Solr: Performance and Scalability

[Note: this post has been updated to include video and slides from the June 2 presentation]

Back by popular demand!  Sematext engineers Radu Gheorghe and Rafal Kuc returned to Berlin Buzzwords on Tuesday, June 2, with the second installment of their “Side by Side with Elasticsearch and Solr” talk.  (You can check out Part 1 here.)

Elasticsearch and Solr Performance and Scalability

This brand new talk (which included a live demo, a video demo and slides) dove deeper into how Elasticsearch and Solr scale and perform. And, of course, they took into account all the goodies that came with these search platforms since last year.

Radu and Rafal showed attendees how to tune Elasticsearch and Solr for two common use-cases: logging and product search. Then they showed what numbers they got after tuning. There was also some sharing of best practices for scaling out massive Elasticsearch and Solr clusters; for example, how to divide data into shards and indices/collections that account for growth, when to use routing, and how to make sure that coordinating nodes don’t become unresponsive.
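
As a rough illustration of why routing matters, the sketch below maps a routing key to a shard with a stable hash. This is an assumption-laden toy, not how either engine is implemented: both function names are hypothetical, and zlib.crc32 stands in for the hash functions that Elasticsearch and Solr actually use. The general idea carries over, though: documents that share a routing key land on the same shard, so a query that supplies the key can skip the other shards.

```python
import zlib


def shard_for(routing_key: str, num_shards: int) -> int:
    """Map a routing key to a shard number using a stable hash."""
    # crc32 is only a stand-in; the real engines use their own hash functions.
    return zlib.crc32(routing_key.encode("utf-8")) % num_shards


def shards_to_query(routing_key, num_shards: int):
    """With a routing key, search one shard; without it, search all of them."""
    if routing_key is None:
        return list(range(num_shards))
    return [shard_for(routing_key, num_shards)]
```

In this simplified modulo scheme, changing num_shards changes where existing keys map, which is one reason the shard count should account for growth up front.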

Here is the video:


…and here are the slides:


Feedback & Questions — Bring It On

If you’ve got feedback or questions about topics like Elasticsearch vs. Solr (here’s a detailed comparison) and what’s new and exciting with both applications, just drop us a line.  We live and breathe this stuff, so we’re always happy to hear from like-minded people.

Presentation: Tuning Elasticsearch Indexing Pipeline for Logs

Fresh from GeeCON in Krakow…we have another Elasticsearch and Logging manifesto from Sematext engineers — and book authors — Rafal Kuc and Radu Gheorghe.  As with many of their previous presentations, Radu and Rafal go into detail on Elasticsearch, Logstash and Rsyslog topics like:

  • How Elasticsearch, Logstash and Rsyslog work
  • Tuning Elasticsearch
  • Using, scaling, and tuning Logstash
  • Using and tuning Rsyslog
  • Rsyslog with JSON parsing
  • Hardware and data tests
  • …and lots more along these lines

[Note: Video of the talk coming soon to this post!]

If you find this stuff interesting and have similar challenges, then drop us a line to chat about our Elasticsearch and Logging consulting services and Elasticsearch (and Solr, too) production support.  Oh yeah, and we’re hiring worldwide if you are into Logging, Monitoring, Search, or Big Data Analytics as much as Radu and Rafal!

Videos: Tuning Solr for Logs and Solr Anti-Patterns

If you’re an avid Solr user you’ll want to check out these Lucene / Solr Revolution videos from two of Sematext’s Solr experts: Rafal Kuc and Radu Gheorghe.

Tuning Solr for Logs

Radu talked about Solr performance tuning, which is always nice for keeping your applications snappy and your costs down. This is especially true for logs, social media and other stream-like data that can easily grow into terabyte territory.

(Note: there’s no audio between 3:30 and 4:30. We hope to have this fixed soon, and it doesn’t materially affect the talk.)

Solr Anti-Patterns

Rafal points out common mistakes and roads that should be avoided at all costs when dealing with Solr.

Slides and Summaries

You can find slides of the Solr presentations in this blog post and summaries in this blog post.


Solr Presentations from Lucene/Solr Revolution 2014

Thanks to everyone who stopped by the Sematext booth at last week’s Lucene/Solr Revolution event in Washington, DC and attended our two talks:

The attendance, questions and interest are very much appreciated.  As a company that prides itself on its Solr expertise (and Elasticsearch expertise too, for that matter), it was nice to spend a couple days talking about search and Big Data challenges, performance monitoring and logging with fellow experts from around the world. Here are the slides for the two talks we gave (summaries of the talks can be found here):


Videos of the talks will be posted here soon. Hope to see everyone again next year!

