Elasticsearch Training in New York City — October 19-20

For those of you interested in comprehensive Elasticsearch and ELK Stack (Elasticsearch / Logstash / Kibana) training taught by Sematext experts who know these tools inside and out, we’re running a super hands-on training workshop in New York City on October 19-20.

This two-day, hands-on workshop will be taught by Rafal Kuc and Radu Gheorghe, experienced Sematext engineers and authors of Elasticsearch books.

Target audience:

Developers and DevOps who want to configure, tune and manage Elasticsearch and ELK Stack at scale.

What you’ll get out of it:

Over two days, with two trainers running the sessions, we’ll:

  • bring Elasticsearch novices to the level where they will be comfortable taking Elasticsearch to production
  • give experienced Elasticsearch users proven and practical advice based on years of experience designing, tuning, and operating numerous Elasticsearch clusters to help with their most advanced and pressing issues

When & Where:

  • Dates:        October 19 & 20 (Monday & Tuesday)
  • Time:         9:00 a.m. — 5:00 p.m.
  • Location:     New Horizons Computer Learning Center in Midtown Manhattan (map)
  • Cost:         $1,200 “early bird rate” (valid through September 1) and $1,500 afterward.  And…we’re also offering a 50% discount for the purchase of a 2nd seat!
  • Food/Drinks: Light breakfast and lunch will be provided


Attendees will go through several sequences of short lectures followed by interactive, group, hands-on exercises. There will be a Q&A session after each such lecture-practicum block.

Course outline:

  1. Basic flow of data in Elasticsearch
    1. what is Elasticsearch and typical use-cases
    2. installation
    3. index
    4. get
    5. search
    6. update
    7. delete
  2. Controlling how data is indexed and stored
    1. mappings and mapping types
    2. strings, integers and other core types
    3. _source, _all and other predefined fields
    4. analyzers
    5. char filters
    6. tokenizers
    7. token filters
  3. Searching through your data
    1. selecting fields, sorting and pagination
    2. search basics: term, range and bool queries
    3. performance: filters and the filtered query
    4. match, query string and other general queries
    5. tweaking the score with the function score query
  4. Aggregations
    1. relationships between queries, filters, facets and aggregations
    2. metrics aggregations
    3. multi-bucket aggregations
    4. single-bucket aggregations and nesting
  5. Working with relational data
    1. arrays and objects
    2. nested documents
    3. parent-child relations
    4. denormalizing and application-side joins
  6. Performance tuning
    1. bulk and multiget APIs
    2. memory management: field/filter cache, OS cache and heap sizes
    3. how often to commit: translog, index buffer and refresh interval
    4. how data is stored: merge policies; store settings
    5. how data and queries are distributed: routing, async replication, search type and shard preference
    6. doc values
    7. thread pools
    8. warmers
  7. Scaling out
    1. multicast vs unicast
    2. number of shards and replicas
    3. node roles
    4. time-based indices and aliases
    5. shard allocation
    6. tribe node
  8. Monitor and administer your cluster
    1. mapping and search templates
    2. snapshot and restore
    3. health and stats APIs
    4. cat APIs
    5. monitoring products
    6. hot threads API
  9. Beyond keyword search
    1. percolator
    2. suggesters
    3. geo-spatial search
    4. highlighting
  10. Ecosystem
    1. indexing tools: Logstash, rsyslog, Apache Flume
    2. data visualization: Kibana
    3. cluster visualization: Head, Kopf, BigDesk

Got any questions or suggestions for the course? Just drop us a line or hit us @sematext!

Lastly, if you can’t make it…watch this space or follow @sematext — we’ll be adding more Elasticsearch / ELK stack training workshops in the US, Europe and possibly other locations in the coming months.  We are also known worldwide for our Elasticsearch Consulting Services and Elasticsearch/ELK Production Support, as well as ELK Consulting.

Hope to see you in the Big Apple in October!

Monitoring CoreOS Clusters

In this post you’ll learn how to get operational insights (i.e. performance metrics, container events, etc.) from CoreOS and make that super simple with etcd, fleet, and SPM.

We’ll use:

  • SPM for Docker to run the monitoring agent as a Docker container and collect all Docker metrics and events for all other containers on the same host + metrics for hosts
  • fleet to seamlessly distribute this container to all hosts in the CoreOS cluster by simply providing it with a fleet unit file shown below
  • etcd to set a property to hold the SPM App token for the whole cluster
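Putting those three pieces together, a fleet unit file for the agent can look roughly like the sketch below. The etcd key (/spm/token) and container name are illustrative; the general pattern of reading a cluster-wide value from etcd inside a shell wrapper is a common CoreOS idiom, so adjust the details to your setup:

```ini
# spm-agent.service -- a sketch of a fleet unit file (etcd key is illustrative)
[Unit]
Description=SPM Docker Agent
After=docker.service
Requires=docker.service

[Service]
TimeoutStartSec=0
# Clean up any stale container, then pull the agent image
ExecStartPre=-/usr/bin/docker kill spm-agent
ExecStartPre=-/usr/bin/docker rm spm-agent
ExecStartPre=/usr/bin/docker pull sematext/spm-agent-docker
# Read the cluster-wide SPM App token from etcd and pass it to the container
ExecStart=/bin/sh -c "/usr/bin/docker run --name spm-agent \
  -v /var/run/docker.sock:/var/run/docker.sock \
  -e SPM_TOKEN=$(etcdctl get /spm/token) \
  sematext/spm-agent-docker"
ExecStop=/usr/bin/docker stop spm-agent

[X-Fleet]
# Schedule one instance of this unit on every machine in the cluster
Global=true
```

With `Global=true` in the `[X-Fleet]` section, a single `fleetctl start spm-agent.service` launches the agent container on every host in the cluster.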

The Big Picture

Before we get started, let’s take a step back and look at our end goal.  What do we want?  We want charts with Performance Metrics, we want Event Collection, we’d love integrated Anomaly Detection and Alerting, and we want that not only for containers, but also for hosts running containers.  CoreOS has no package manager and deploys services in containers, so we want to run the SPM agent in a Docker container, as shown in the following figure:


By the end of this post each of your Docker hosts could look like the above figure, with one or more of your own containers running your own apps, and a single SPM Docker Agent container that monitors all your containers and the underlying hosts.


Docker Events and Docker Metrics Monitoring

Docker deployments can be very dynamic, with containers being started and stopped, moved around YARN- or Mesos-managed clusters, having very short life spans (the so-called cattle) or long uptimes (aka pets).  Getting insight into the current and historical state of such clusters goes beyond collecting container performance metrics and sending alert notifications.  If a container dies or gets paused, for example, you may want to know about it, right?  Or maybe you’d want to be able to see, in retrospect, that a container went belly up when troubleshooting, wouldn’t you?

Just two weeks ago we added Docker Monitoring (docker image is right here for your pulling pleasure) to SPM.  We didn’t stop there — we’ve now expanded SPM’s Docker support by adding Docker Event collection, charting, and correlation.  Every time a container is created or destroyed, started, stopped, or when it dies, spm-agent-docker captures the appropriate event so you can later see what happened where and when, correlate it with metrics, alerts, anomalies — all of which are captured in SPM — or with any other information you have at your disposal.  The functionality and the value this brings should be pretty obvious from the annotated screenshot below.


Here’s the list of Docker events the SPM Docker monitoring agent currently captures:

  • Version Information on Startup:
    • server-info – created by spm-agent framework with node.js and OS version info on startup
    • docker-info – Docker Version, API Version, Kernel Version on startup
  • Docker Status Events:
    • Container Lifecycle Events like
      • create, exec_create, destroy, export
    • Container Runtime Events like
      • die, exec_start, kill, oom, pause, restart, start, stop, unpause

Every time a Docker container emits one of these events, spm-agent-docker captures it in real time, ships it over to SPM, and you’ll be able to see it as shown in the above screenshot.

Oh, and if you’re running CoreOS, you may also want to see how to index CoreOS logs into ELK/Logsene. Why? Because then you can have not only metrics and container events in one place, but also all container and application logs, too!

If you’re using Docker, we hope you find this useful!  Anything else you’d like us to add to SPM (for Docker or any other integration)?  Leave a comment, ping @sematext, or send us email – tell us what you’d like to get for early Christmas!

Replaying Elasticsearch Slowlogs with Logstash and JMeter

[Note: We’re holding a 2-day, hands-on Elasticsearch / ELK Stack training workshop in New York from October 19-20, 2015. Click here for details!]


Sometimes we just need to replay production queries – whether it’s because we want a realistic load test for the new version of a product or because we want to reproduce, in a test environment, a bug that only occurs in production (isn’t it lovely when that happens? Everything is fine in tests but when you deploy, tons of exceptions in your logs, tons of alerts from the monitoring system…).

With Elasticsearch, you can enable slowlogs to make it log queries taking longer (per shard) than a certain threshold. You can change settings on demand. For example, the following request will record all queries for test-index:

curl -XPUT localhost:9200/test-index/_settings -d '{
  "index.search.slowlog.threshold.query.warn" : "1ms"
}'

You can run those queries from the slowlog in a test environment via a tool like JMeter. In this post, we’ll cover how to parse slowlogs with Logstash to write only the queries to a file, and how to configure JMeter to run queries from that file on an Elasticsearch cluster.
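As a rough sketch of the Logstash side, a grok filter can pull the query body out of each slowlog line, and a file output can write just the queries, one per line. The config below assumes Logstash 1.x-style options and the default slowlog line format, where the query appears as source[{...}]; the file paths are hypothetical, so adjust everything to your versions and environment:

```
input {
  file {
    # hypothetical path to your slowlog
    path => "/var/log/elasticsearch/elasticsearch_index_search_slowlog.log"
  }
}
filter {
  grok {
    # capture everything between "source[" and "], extra_source[" as the query
    match => [ "message", "source\[%{GREEDYDATA:query}\], extra_source\[" ]
  }
}
output {
  file {
    path => "/tmp/elasticsearch-queries.txt"
    # write only the extracted query body, one query per line
    message_format => "%{query}"
  }
}
```

One common JMeter setup is then to read that file with a CSV Data Set Config element and feed each line as the request body of an HTTP Request sampler pointed at your test cluster’s _search endpoint.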


Log Alerting, Anomaly Detection and Scheduled Reports

Tired of tail -F /your/log/file | egrep -i 'error|exception|warn'?
It’s common for devops to keep an eye out for errors by running tail -F on log files, or to hunt for unusual application behavior by manually scanning logs in a terminal. The problem is that this gets tiring, boring, and eventually impossible as the infrastructure grows.  Think about it from the business perspective: it gets expensive.  Or maybe you automate things a bit via cron jobs that cat, grep, and mail errors, or SSH to N remote servers to do the same?  You can only do this for so long.  It doesn’t scale well.  It’s fragile.  It’s not the way to manage non-trivial infrastructure.
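To make the pain concrete, here is a sketch of the kind of ad-hoc check described above. The log content is generated inline so the snippet is self-contained; in a real cron job you would point it at an actual log file and mail the result instead of printing it:

```shell
#!/bin/sh
# Simulate a small application log (in real life this would live under /var/log).
LOG=$(mktemp)
printf '%s\n' 'INFO started' 'ERROR db timeout' 'WARN disk almost full' > "$LOG"

# The classic fragile check: count suspicious lines, report if any were found.
MATCHES=$(grep -icE 'error|exception|warn' "$LOG")
if [ "$MATCHES" -gt 0 ]; then
  echo "$MATCHES suspicious log lines found"
fi
rm -f "$LOG"
```

Multiply this by every host and every log file in your infrastructure and the scaling problem becomes obvious.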

So what do you do?

First, consider using a centralized log management solution like Logsene instead of leaving log files on your file system. Alternatively, you can choose to run & maintain your own ELK stack, but then you won’t get what we are about to show you out of the box.

Saved, Alert & Scheduled Queries
We’ve created a 3-part blog series to detail the different types of Queries that Logsene lets you create:

  1. Saved Queries: queries that you’ve saved, so that you can later just execute them instead of writing them again
  2. Alert Queries: saved queries that are continuously running and that you configured to alert you when certain conditions are matched
  3. Scheduled Queries: queries that are executed periodically and that send you their output in the form of a log chart image

Put another way, using these queries means you can have Logsene’s servers do all the tedious work we mentioned above. That’s why we created computers in the first place, isn’t it?

Setting this up takes just a few minutes, and think how much time it saves you every day!

So, how about that tail -F /my/log/file.log | egrep -i 'error|exception|warn' mentioned earlier? If you’re getting tired of tailing and grepping log files, SSHing to multiple servers and chasing errors in them, try Logsene by registering here. If you are a young startup, a small or non-profit organization, or an educational institution, ask us for a discount (see special pricing)!

Saved Log Searches in Logsene

When digging through logs you might find yourself running the same searches again and again.  To solve this annoyance, Logsene lets you save queries so you can re-execute them quickly without having to retype them:

1) Enter your query and press the “disk” icon next to the search box. Give your query a friendly Query Name and press the “save” button.


2) To run a Saved Query just click on it in the Search Queries pop-out window (see screenshot below). Existing Saved Queries can be edited or deleted, too:


Logsene tracks the history of recently used queries, so it’s easy to try several queries and finally save the one that worked best for your use case. That’s why you’ll find three tabs in the saved queries popup:

  1. Recent Queries – queries that you’ve recently used; you can save them using the save button
  2. Saved Queries – queries that you’ve saved, so that you can later just execute them instead of writing them again
  3. Alert Queries – saved queries that are continuously running and that you configured to alert you when certain conditions are matched

3-Part Blog Series about Log Queries

Speaking of log queries…this post is part of our 3-part blog series to detail the different types of Queries that Logsene lets you create.  Check out the other posts about Alert Queries and Scheduled Queries.

Does this sound like something you could use?

If so, simply sign up here – there’s no commitment and no credit card required.  Small startups, startups with no or very little outside investment money, non-profit and educational institutions get special pricing – just get in touch with us.  If you’d like to help us make SPM and Logsene even better, we are hiring!

5-Minute Recipe: Log Alerting and Anomaly Detection

Until software becomes sophisticated enough to be truly self-healing without human intervention, it will remain important that we humans be notified of any problems with the computing systems we run. This is especially true for large or distributed systems, where it quickly becomes impossible to watch logs manually. A common practice is to watch performance metrics instead, centralize logs, and dig into logs only when performance problems are detected. If you already use SPM Performance Monitoring, you are used to defining alerts on critical metrics, and if you are a Logsene user you can now use alerting on logs, too! Here is how:

  1. Run your query in Logsene to search for relevant logs and press the “Save” button (see screenshot below)
  2. Mark the checkbox “Create Alert Query” and pick whether you want threshold-based or anomaly detection-based alerting:

Threshold-based alert in Logsene


Anomaly Detection using “Algolerts” in Logsene


Manage Alert Queries in Logsene

While the alert creation dialog currently shows only email as a possible destination for alert notifications, you can actually have notifications sent to one or more other destinations.  To configure that, go to “App Settings” as shown below:


Once there, under “Notification Transport” you will see all available alert destinations:


In addition to email, PagerDuty, and Nagios, you can have alert notifications go to any WebHook you configure, including Slack and Hipchat.

How does one decide between Threshold-based and Anomaly Detection-based Alerts (aka Algolerts)?

The quick answers:

  • If you have a clear idea of how many logs should match a given Alert Query, then simply use threshold-based Alerts.
  • If you don’t have a sense of how many hits a given Alert Query gets on a regular basis, but you want to watch out for sudden changes in volume, whether dips or spikes, use Algolerts (Anomaly Detection-based Alerts).

For more detailed explanations of Logsene alerts, see the FAQ on our Wiki.

3-Part Blog Series about Log Queries

Speaking of log queries…this post is part of our 3-part blog series to detail the different types of Queries that Logsene lets you create.  Check out the other posts about Saved Queries and Scheduled Queries.

Keep an eye on anomalies or other patterns in your logs

…by checking out Logsene. Simply sign up here – there’s no commitment and no credit card required.  Small startups, startups with no or very little outside investment money, non-profit and educational institutions get special pricing – just get in touch with us.  If you’d like to help us make SPM and Logsene even better, we are hiring!

