Elasticsearch Refresh Interval vs Indexing Performance

Elasticsearch is near-realtime, in the sense that when you index a document, you need to wait for the next refresh for that document to appear in a search. Refreshing is an expensive operation and that is why by default it’s made at a regular interval, instead of after each indexing operation. This interval is defined by the index.refresh_interval setting, which can go either in Elasticsearch’s configuration, or in each index’s settings. If you use both, index settings override the configuration. The default is 1s, so newly indexed documents will appear in searches after 1 second at most.
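As a rough sketch of the per-index route, the snippet below builds (but does not send) an update-settings request using only Python's standard library. The host `http://localhost:9200` and the index name `logs` are assumptions made for illustration; the setting name is the one described above.

```python
import json
import urllib.request


def build_refresh_interval_request(base_url, index, interval):
    """Build a PUT <base_url>/<index>/_settings request that sets refresh_interval."""
    body = json.dumps({"index": {"refresh_interval": interval}}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/{index}/_settings",
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


# Hypothetical host and index name, for illustration only.
request = build_refresh_interval_request("http://localhost:9200", "logs", "5s")
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
# To actually apply the setting: urllib.request.urlopen(request)
```

Building the request separately from sending it makes it easy to inspect what would go over the wire before touching a live cluster.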

Because refreshing is expensive, one way to improve indexing throughput is by increasing refresh_interval. Less refreshing means less load, and more resources can go to the indexing threads. How does all this translate into performance? Below is what our benchmarks revealed when we looked at it through the SPM lens.

Test conditions

For this benchmark, we indexed Apache logs in bulk requests of 3000 documents each, on 2 threads. Those logs went into one index with 3 shards and 1 replica, hosted by 3 m1.small Amazon EC2 instances. The Elasticsearch version used was 0.90.0.
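To make the batching concrete, here is a minimal sketch of how documents could be split into bulks of 3000 and serialized for the bulk endpoint. It is not the benchmark's actual indexing code: the index name `logs`, the type name `apache` and the sample documents are all made up. Older releases such as 0.90 expected a `_type` in the action line, which is why it is an optional parameter here.

```python
import json


def chunked(items, size):
    """Yield consecutive lists of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def build_bulk_body(docs, index, doc_type=None):
    """Serialize docs as newline-delimited JSON: one action line, then one source line."""
    lines = []
    for doc in docs:
        meta = {"_index": index}
        if doc_type is not None:
            meta["_type"] = doc_type
        lines.append(json.dumps({"index": meta}))
        lines.append(json.dumps(doc))
    # The bulk format requires a trailing newline after the last line.
    return "\n".join(lines) + "\n"


# Made-up documents standing in for parsed Apache log lines.
docs = [{"message": f"GET /page/{i} 200"} for i in range(7000)]
batches = list(chunked(docs, 3000))
print([len(batch) for batch in batches])

body = build_bulk_body(batches[0], "logs", doc_type="apache")
print(body.count("\n"))  # two lines per document
```

Each batch body would then be sent as one request to the `_bulk` endpoint, with one such loop running per indexing thread.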

As indexing throughput is a priority for this test, we also made some configuration changes in this direction:

  • index.store.type: mmapfs. Because memory-mapped files make better use of OS caches
  • indices.memory.index_buffer_size: 30%. Increased from the default 10%. Usually, the more buffers, the better, but we don’t want to overdo it
  • index.translog.flush_threshold_ops: 50000. This makes commits from the translog to the actual Lucene indices happen less often than with the default setting of 5000

Results

First, we indexed documents with the default refresh_interval of 1s. Within 30 minutes, 3.6M new documents were indexed, at an average of 2K documents per second. Here’s what indexing throughput looks like in SPM for Elasticsearch:

refresh_interval: 1s

Then, refresh_interval was set to 5s. Within the same 30 minutes, 4.5M new documents were indexed at an average of 2.5K documents per second. Indexing throughput was increased by 25%, compared to the initial setup:

refresh_interval: 5s

The last test was with refresh_interval set to 30s. This time 6.1M new documents were indexed, at an average of 3.4K documents per second. Indexing throughput was increased by 70% compared to the default setting, and by about 36% compared to the previous scenario:

refresh_interval: 30s

Other metrics from SPM, such as load, CPU, disk I/O usage, memory, JVM or Garbage Collection didn’t change significantly between runs. So configuring the refresh_interval really is a trade-off between indexing performance and how “fresh” your searches are. What setting works best for you? It depends on the requirements of your use-case. Here’s a summary of the indexing throughput values we got:

  • refresh_interval: 1s   – 2.0K docs/s
  • refresh_interval: 5s   – 2.5K docs/s
  • refresh_interval: 30s  – 3.4K docs/s
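For readers who want to check the arithmetic, the sketch below recomputes the averages and relative gains from the document counts quoted in the results (30-minute runs). Because those counts are themselves rounded, the percentages can differ by a point or so from the rounded figures in the text.

```python
RUN_SECONDS = 30 * 60  # each run lasted 30 minutes

# Document counts as quoted in the results, keyed by refresh_interval.
docs_indexed = {"1s": 3_600_000, "5s": 4_500_000, "30s": 6_100_000}

rates = {interval: count / RUN_SECONDS for interval, count in docs_indexed.items()}


def gain_percent(new, old):
    """Relative increase of `new` over `old`, in percent."""
    return (new / old - 1) * 100


for interval, rate in rates.items():
    print(f"{interval}: {rate:.0f} docs/s, "
          f"{gain_percent(rate, rates['1s']):.0f}% over the 1s run")

print(f"30s over 5s: {gain_percent(rates['30s'], rates['5s']):.0f}%")
```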

12 Responses to Elasticsearch Refresh Interval vs Indexing Performance

  3. TD says:

    When the indexed documents are large, this did not seem like a good idea. The suggestion to increase refresh_interval to 30s blew up my system. So be aware.

  4. sematext says:

    TD: which part specifically? When your documents are huge then you can’t overdo indices.memory.index_buffer_size, yes, as I think it’s implied here. Were you referring to something else?

  6. pranav says:

    Thanks for sharing the information. What was your average document size in your tests?

    We are running our test with a 144 KB document size, with SSDs and good hardware, but aren’t getting anything above 200 documents per second. We are keeping the refresh interval at 1 second, since we want the data to be readable as soon as it is written.

    Any ideas to improve performance are most welcome.

    Thanks
    Pranav.

  7. Tamas Szasz says:

    Hi,

    Thanks for the post. I would like to see a test with the same data and refresh_interval: -1. Probably the difference between 30s and -1 is not as significant as the one between 1s and 30s, but still …

    • Hi Tamas,

      Right, that should help (also check my reply below). Here I was assuming some sort of near-real-time search is needed. If you do batch indexing (for example, update stocks every night in a store or similar use-cases), then it makes sense to disable automatic refreshes altogether (set refresh_interval to -1) and enable it back after you’re done.
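A minimal sketch of that disable-then-restore pattern, written as a Python context manager. The `put_settings` callable is hypothetical: it stands for whatever function sends an update-settings request to your cluster. Here it is replaced by a stub that only records calls, and the index name `stocks` is made up.

```python
from contextlib import contextmanager


@contextmanager
def refresh_disabled(put_settings, index, restore_to="1s"):
    """Disable automatic refreshes for `index`, restoring them on exit."""
    put_settings(index, {"index": {"refresh_interval": "-1"}})
    try:
        yield
    finally:
        # Runs even if the batch job raises, so refreshes are always re-enabled.
        put_settings(index, {"index": {"refresh_interval": restore_to}})


# Stand-in for a real HTTP call; it only records what would be sent.
calls = []


def fake_put_settings(index, settings):
    calls.append((index, settings["index"]["refresh_interval"]))


with refresh_disabled(fake_put_settings, "stocks"):
    pass  # the nightly bulk indexing would run here

print(calls)
```

Restoring the interval in a `finally` block is a design choice: a failed batch job should not leave the index without automatic refreshes.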

  8. Great post. I wonder why we don’t see as much improvement when changing refresh_interval from 5s to 30s as we do when changing it from 1s to 5s.

    Is it that 30% of index buffer is not enough? Or is it disk IO? Did you guys care to check?

  9. Yes, the difference in indexing throughput is there because if you up the refresh_interval, you basically lower the time ES spends on refresh operations – freeing up CPU for other operations (such as bulk indexing). As the refresh_interval increases, the refresh time takes a less significant part of the overall computation, so increasing it further may not matter (depends on what your data and hardware look like).

    The 30% indexing buffer has nothing to do with refreshes, but with flushes. An automatic flush is triggered when the index buffer gets full or when one of the transaction log thresholds is passed:

    http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/index-modules-translog.html

    Increasing the buffer and translog sizes will make ES flush less often and, again depending on your hardware, might decrease the overall time ES spends flushing.

    Elasticsearch exposes refresh and flush times and counts through stats APIs. You can monitor these times with SPM: http://sematext.com/spm/index.html

    Whether you use SPM or not, monitoring can tell you what the bottleneck is for your use-case (it really depends here, though I hate to use this expression). For example, if you see tons of CPU I/O wait, you’ll know it’s disk I/O. The bottleneck for indexing performance is usually CPU, assuming the disks can handle the sustained writes, and in most cases they can, even with spinning disks.
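As a small illustration of using those stats, the sketch below derives the average time per refresh and per flush from an already-parsed stats response. It assumes the response nests `refresh` and `flush` sections, each with `total` and `total_time_in_millis`, under `_all.primaries`, as recent versions do. Check the response of your own version, since the shape may differ. The sample numbers are invented, not measurements.

```python
def average_millis(stats, operation):
    """Average time per refresh or flush, from a parsed index stats response."""
    section = stats["_all"]["primaries"][operation]
    total = section["total"]
    if total == 0:
        return 0.0  # nothing happened yet, avoid dividing by zero
    return section["total_time_in_millis"] / total


# Made-up numbers in the assumed response shape, not real measurements.
sample_stats = {
    "_all": {
        "primaries": {
            "refresh": {"total": 1800, "total_time_in_millis": 90000},
            "flush": {"total": 12, "total_time_in_millis": 6000},
        }
    }
}

print(average_millis(sample_stats, "refresh"))
print(average_millis(sample_stats, "flush"))
```

Sampling these counters before and after a test run, and diffing them, gives the time spent on refreshes and flushes during that run only.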
