Introducing Top Database Operations

If you run Elasticsearch, Solr, or any datastore you connect to via JDBC, you’ll like what we’ve just added to SPM.  We call it Database Operations, and you can find it in SPM’s new Database report:

Here’s what Database Operations gives you:

  • Top 5 operation types across all your data stores or filtered to a specific data store type
  • Top 5 operation types by speed, throughput, or simply their volume
  • Time-series reports for volume, throughput, and latency broken down by operation type
  • Ability to view all collected operations (not just the slowest ones), filter them by database type or operation type, and sort them by average duration, total duration, or throughput
  • Sparklines that show values and trends over the last 5 minutes
  • Top 10 slowest individual operations and drill-in details
  • Integration with Transaction Tracing, so you can correlate slow data store operations with the actual transaction/request that triggered them


  • To get this information, add the SPM agent to the application that talks to a data store (e.g., Solr or Elasticsearch). The SPM agent captures operations at the client layer, not in the server itself.
  • To start capturing this information, enable Transaction Tracing in your SPM agents, as sketched below.
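For a Java application this boils down to starting the JVM with the SPM monitor attached in embedded (javaagent) mode. A minimal sketch, with assumed paths for illustration only (your SPM client installation shows the exact jar and config locations):

java -javaagent:/opt/spm/spm-monitor/lib/spm-monitor-generic.jar=/opt/spm/spm-monitor/conf/your-monitor-config.xml -jar your-app.jar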

This, including Distributed Transaction Tracing, works for all Java applications.




Don’t forget – when you enable Database Operations you will also automatically get Transaction Tracing, as well as the cool AppMaps – enjoy! :)

Got ideas for how we could make Database Operations better and more useful to you?  Let us know via comments, email, or @sematext.

Grab a free 30-day SPM trial by registering here (ping us if you’re a startup, a non-profit, or an educational institution – we’ve got special pricing for you!).  There’s no commitment and no credit card required.

Docker + Elasticsearch: How to Monitor the Official Elasticsearch Image on Docker

The official Elasticsearch Image on Docker Hub has already generated more than 1.6 million pulls. It is probably the easiest way to add Elasticsearch to a development application stack. The reason for this crazy number? A rapidly growing number of organizations are using Elasticsearch and Docker in production. Needless to say, monitoring Elasticsearch is essential in production, and you can find a detailed analysis of this topic (including the “top 10 Elasticsearch metrics to watch”) in the free eBook: Elasticsearch Monitoring Essentials. Docker is disruptive in many ways, and there are many things that are slightly different and worth mentioning.  These include:

  1. Deployment has changed: Elasticsearch and its monitoring tools are now deployed using a Dockerfile, Docker Compose, or various orchestration tools
  2. There is a new layer to monitor: container metrics and events (see Docker Events and Metrics Monitoring and SPM for Docker)
  3. Logging has changed: containers log to the console, and logs need to be retrieved from the Docker daemon instead of from the Elasticsearch log file.  Check out our post on the subject: Innovative Docker Log Management
  4. Official Images may not provide options for monitoring (such as JMX).  However, the official Image for Elasticsearch provides an option to pass parameters to the Java Runtime Environment, and we will use this option for Elasticsearch monitoring in this post. You should also be aware that the official Elasticsearch Image does not include any plugins, and commercial monitoring from Elastic can’t be distributed in this Image for licensing reasons.  Our monitoring tool of choice is SPM.  If you are not familiar with SPM (but have heard of it), or if you use Marvel, have a look at Marvel vs. SPM.

Next, I’m going to demonstrate a setup to monitor multiple Elasticsearch nodes on a single Docker Host. The final setup will provide the full Monitoring and Logging package:

  • Detailed Application Metrics for Elasticsearch, deployed on Docker
  • Detailed Container Metrics and Docker Events  
  • Centralized Logs for all Containers by SPM for Docker

So let’s first decide on one of the following options to monitor Elasticsearch on Docker.  You can:

  1. Build your own Elasticsearch container with the included monitoring components. I’m not going to go into details about this option today; rather, I’m going to focus on the official / trusted build.
  2. Use a standalone agent that queries metrics from the Elasticsearch container. This requires JMX setup and Docker networking configuration for both the monitor and Elasticsearch. The metrics gathered by remote agents are limited, and, in the Docker context, running an external monitoring process alongside the Elasticsearch process consumes extra resources.  Which brings us to the next option…
  3. Inject an SPM in-process monitoring agent into Elasticsearch. This option has the lowest resource usage and supports advanced monitoring functions like Transaction Tracing and AppMap.

I chose to implement Option #3 in this blog post because it provides the best insights into Elasticsearch. This means the Elasticsearch container needs file-system access to the SPM monitoring agent. Sematext provides the SPM Client (which includes the monitoring agent and metrics sender) pre-installed in a Docker Image, referred to as the “SPM Client Image/Container” in the following instructions and published on Docker Hub as “sematext/spm-client”.  The main trick here is to mount a volume from the SPM Client Container into the Elasticsearch Containers in order to load the monitoring library, as sketched below.
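Concretely, the trick looks roughly like this (a sketch: the agent path inside the volume and the use of an environment variable to pass JVM parameters are assumptions for illustration; the exact mechanism follows below):

# 1) run the SPM Client container, which carries the monitoring agent on a volume
docker run -d --name spm-client sematext/spm-client

# 2) run Elasticsearch with that volume mounted, injecting the agent via JVM parameters
docker run -d --name elasticsearch --volumes-from spm-client -e ES_JAVA_OPTS="-javaagent:/opt/spm/spm-monitor/lib/spm-monitor-es.jar" elasticsearch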

Let’s have a look at the desired setup and how to get there:


Monitoring Setup for Elasticsearch on Docker


Introducing Akka Monitoring

Akka is a toolkit and runtime for building highly concurrent, distributed, and resilient message-driven applications on the JVM. It’s part of Scala’s standard distribution, where it provides the implementation of the “actor model”.

How Akka Works

Messages between Actors are exchanged via Mailbox queues; Dispatchers provide various concurrency models, while Routers manage the message flow between Actors. That’s quite a lot Akka does for developers!
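To make those moving parts concrete, here is a minimal (hypothetical) Akka actor: the message sent on the last line lands in the Greeter’s Mailbox, and its Dispatcher schedules the actual processing:

import akka.actor.{Actor, ActorSystem, Props}

class Greeter extends Actor {
  def receive = {
    case name: String => println(s"Hello, $name")  // runs when the Dispatcher executes this Actor
  }
}

val system = ActorSystem("demo")
val greeter = system.actorOf(Props[Greeter], "greeter")
greeter ! "Akka"  // enqueued in the greeter's Mailbox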

But how does one find bottlenecks in distributed Akka applications? Well, many Akka users already use the great Kamon open-source monitoring tool, which makes it easy to collect Akka metrics.  However — and this is important! — predefined visualizations, dashboards, anomaly detection, alerts, and role-based access controls for the DevOps team are out of scope for Kamon, which is focused on metrics collection only.  To overcome this, Kamon’s design makes it possible to integrate it with other monitoring tools.

Needless to say, Sematext has embraced this philosophy and contributed the Kamon backend to SPM.  This gives Akka users the option to use detailed Metrics from Kamon along with the visualization, alerting, anomaly detection, and team collaboration functionalities offered by SPM.

The latest release of Kamon, 0.5.x, includes the kamon-spm module and was announced on August 17th, 2015 on the Kamon blog.  Here’s an excerpt:

Pavel Zalunin from Sematext contributed the new kamon-spm module, which as you might guess allows you to push metrics data to the Sematext Performance Monitor platform. This contribution is particularly special to us, given the fact that this is the first time that a commercial entity in the performance monitoring sector takes the first step to integrate with Kamon, and they did it so cleanly that we didn’t even have to ask any changes to the PR, it was just perfect. We sincerely hope that more companies follow the steps of Sematext in this matter.
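In practice, hooking an Akka app up to SPM via Kamon comes down to adding the kamon-spm dependency and pointing it at your SPM App token. A rough sketch (the version number and configuration key are assumptions; see the Kamon documentation for the exact values):

// build.sbt
libraryDependencies += "io.kamon" %% "kamon-spm" % "0.5.x"

# application.conf
kamon.spm.token = "YOUR_SPM_APP_TOKEN"  # placeholder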

Now let’s take a look at the result of this integration work:

  • Metrics pushed to SPM are displayed in predefined reports, including:
    • An overview of all key Akka metrics
    • Metrics for Actors, Dispatchers and Routers
    • Common metrics for CPU, Memory, Network, I/O,  JVM and Garbage Collection
  • Each chart has the “Action” menu to:
    • Define criteria for anomaly detection and alerts
    • Create scheduled email reports
    • Securely share charts with read-only links
    • Embed charts into custom dashboards
  • A single SPM App can take metrics from multiple hosts to monitor a whole cluster; filters by Host, Actor, Dispatcher, and Router make it easy to drill down to the relevant piece of information.
  • All other SPM features are available for Akka users, too.  For example:


Akka Metrics Overview

Actor Metrics

Actors send and receive messages, therefore the key metrics for Actors are:

  • Time in Mailbox
    Messages are waiting to be processed in the Mailbox – high Time in Mailbox values indicate potential delays in processing.
  • Processing Time
    This is the time Actors need to process the received messages – use this to discover slow Actors.
  • Mailbox Size
    A large Mailbox Size could indicate a backlog of pending operations, especially when it is constantly growing.

Each of the above metrics is presented in aggregate for all Actors, but one can also use SPM’s filtering feature to view all Actors’ metrics separately, or select one or more specific Actors and visualize only their metrics.  Filtering by Host is also possible, as shown below.


Akka Actors

Dispatcher Metrics

In Akka, a Dispatcher is what makes Actors ‘tick’. Each Actor is associated with a particular Dispatcher (the default one is used if no explicit Dispatcher is set), and each Dispatcher is associated with a particular Executor – a Thread Pool or a Fork Join Pool. The SPM Dispatcher report shows information about Executors:

  • Fork Join Pool
  • Thread Pool Executor

All metrics can be filtered by Host and Dispatcher.
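For reference, defining a custom Dispatcher and assigning it to an Actor uses standard Akka configuration, along these lines (the dispatcher name, pool sizes, and Worker actor are hypothetical):

# application.conf
my-dispatcher {
  type = Dispatcher
  executor = "fork-join-executor"
  fork-join-executor {
    parallelism-min = 2
    parallelism-factor = 2.0
    parallelism-max = 10
  }
}

// in code: this Actor's messages are now scheduled by my-dispatcher
val worker = system.actorOf(Props[Worker].withDispatcher("my-dispatcher"), "worker")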


Akka Dispatchers

Router Metrics

Routers can be used to efficiently route messages to destination Actors, called Routees.

  • Routing Time – time to route a message to the selected destination
  • Time in Mailbox – time a routed message spends in the routee’s Mailbox
  • Processing Time – time the routee Actor spends processing routed messages
  • Error Count – number of errors encountered while routees process messages

For all these metrics, lower values are better, of course.
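Setting up a Router in front of a pool of Routees is a one-liner with standard Akka (the pool size and Worker actor are hypothetical):

import akka.routing.RoundRobinPool

// 5 Worker routees; each message sent to the router is routed to one of them
val router = system.actorOf(RoundRobinPool(5).props(Props[Worker]), "worker-router")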


Akka Routers

You can set Alerts and enable Anomaly Detection for any Akka or OS metrics you see in SPM and you can create custom Dashboards with any combination of charts, whether from your Akka apps or other apps monitored by SPM.

We hope you like this new addition to SPM.  Got ideas for how we could make it more useful for you?  Let us know via comments, email, or @sematext.

Not using SPM yet? Check out the free 30-day SPM trial by registering here (ping us if you’re a startup, a non-profit, or an educational institution – we’ve got special pricing for you!).  There’s no commitment and no credit card required.  SPM monitors a ton of applications, like Elasticsearch, Solr, Cassandra, Hadoop, Spark, Node.js (open-source), Docker (get open-source Docker image), CoreOS, RancherOS and more.

Introducing AppMap

[Note: This post is part of a series on Transaction Tracing — links to the other posts are at the bottom of this post]

As mentioned in the Transaction Tracing for Performance Bottleneck Detection and Transaction Tracing Reports and Traces posts, when you enable Transaction Tracing in SPM you will also automatically get:

  • Request Throughput
  • Request Latency
  • Error & Exception Rates
  • AppMap

Today we’re happy to officially introduce AppMaps. What’s AppMap? As you can see below, AppMap is a map-like visual representation of your complete application architecture. AppMaps show which components are communicating with which components, at what throughput and latency, at what network speed, whether there are any errors between them, etc.  Connections to external services and databases are also captured and visualized.

As such, AppMaps help you:

  • Instantly see your whole architecture and its operational state and health
  • Bring new team members up to speed by showing them the current architecture instead of outdated architecture diagrams
  • Keep the whole team up to date about the latest architecture


Things to note:

  • Errors and exceptions are shown in red when they are detected
  • Components are color-coded:
    • Orange components represent external HTTP services
    • Green components are databases (e.g., SQL server has its own shade of green; other databases have their own shades)
    • Blue components are other SPM Apps (e.g., Elasticsearch has its own shade, etc., etc.)
  • Arrows between components have variable thickness – thicker arrows mean bigger throughput (rpm).
  • Greater opacity means smaller latency.

Clicking on any of the components on the AppMap shows more details about that component, such as:

  • Overall Throughput, Latency, Error and Exception rates (also shown as sparklines)
  • Incoming and Outgoing connections and Throughput and Latency between them
  • List of Hosts/Nodes when an SPM App is selected, with Throughput, Latency, and Error and Exception rates for each of them


If you’d like to see AppMap for your applications, do the following: grab the latest version of the SPM client, run the SPM monitor in embedded (javaagent) mode, and enable Transaction Tracing in your SPM agents.

That’s it!

Not using SPM yet, but would like to trace your apps? Easy: register here — there is no commitment and you can leave your credit card in your wallet.  You get 30 days Free for new SPM Apps so even if you don’t end up falling in love with SPM for monitoring, alerting and anomaly detection, or Logsene for your logs, you can use the Distributed Transaction Tracing to quickly speed up your apps!  Oh, and if you are a young startup, a small or non-profit organization, or an educational institution, ask us for a discount (see special pricing)!


Here are the other posts in our Transaction Tracing series:

Transaction Tracing Reports and Traces

[Note: This post is part of a series on Transaction Tracing — links to the other posts are at the bottom of this post]

If you missed the Distributed Transaction Tracing Intro, here’s the key bit you should know:

Distributed Transaction Tracing is great for:

  • Pinpointing root causes of poor application performance
  • Finding the slowest parts of your application
  • Tracing requests across networks and apps (hence “Distributed”!)

And it works for both Java and Scala apps.

It’s also worth repeating that enabling Transaction Tracing provides more than just transaction traces, such as:

  • Your app’s Request Throughput, Response Latency, plus Error & Exception Rates
  • AppMap, which shows how various components in your infrastructure communicate with each other

Now let’s run through a few reports Transaction Tracing provides in SPM.

Top 10 Slowest / Fastest Controllers

Under the new Transactions tab you will first see an overview like this:


On the left side we see the 10 slowest Controllers (actually methods inside them).  You can also see Top 10 Controllers by throughput or time consumed.

On the right side you can see request latency and throughput.

Not shown in this screenshot are a few more charts that show counts and rates of errors, exceptions, and requests that resulted in a 4XX or 5XX response code.

Top 10 Slowest Transactions

Clicking on one of the controllers shows the slowest transactions for that controller, as seen below:


Failed transactions are those that resulted in an error, exception, 4XX or 5XX error code.

As you can tell from this screenshot, these transactions are clickable.  Clicking on them shows details about a transaction, including all request parameters, the response code, the exact URL, start and stop times, etc., and, of course, the actual call trace itself, shown below:


Component Counting and Timing

Transaction tracing distinguishes between various components, such as JSPs, SQL, JPA, HTTP, etc.  It counts calls in those components and keeps track of how much time was spent in each of them.  This means that if your database calls are slow, for example, this report will show that and you’ll know what you need to optimize.


The little green “Logs” button in the top-right is not associated with transaction tracing, but it’s worth describing.  If you ship your logs to Logsene, this button will pull your log chart, as well as the actual application logs, into the SPM UI, thus allowing you to troubleshoot performance issues much, much faster!

Transaction Component Breakdown

Similar to the above Components chart, SPM shows component call count and execution duration breakdown in a tabular view.


Here are the key points about SPM’s transaction tracing:

  • Transaction Tracing does not require you to modify any source code – the instrumentation is done automatically, at the JVM bytecode level
  • Transaction Tracing is currently available for Java and Scala applications running inside the JVM
  • We support deep insight into the specific technologies listed in the SPM Transaction Tracing documentation
  • You’ll want to grab the latest version of the SPM client (it has some optimizations, too!)
  • You’ll need to use the SPM monitor in the embedded (aka javaagent) mode, not standalone
  • To add Transaction Tracing to your own custom apps you can easily create custom pointcuts

Not using SPM yet, but would like to trace your apps? Easy: register here — there is no commitment and you can leave your credit card in your wallet.  You get 30 days Free for new SPM Apps so even if you don’t end up falling in love with SPM for monitoring, alerting and anomaly detection, or Logsene for your logs, you can use the Distributed Transaction Tracing to quickly speed up your apps!  Oh, and if you are a young startup, a small or non-profit organization, or an educational institution, ask us for a discount (see special pricing)!


Here are the other posts in our Transaction Tracing series:

Transaction Tracing for Performance Bottleneck Detection

[Note: This post is part of a series on Transaction Tracing — links to the other posts are at the bottom of this post]

When you’re building a monitoring solution or evaluating existing ones, what do you look for?  Probably these four core aspects of functionality:

  1. Collection and display of metrics
  2. Alerting based on metric values and anomalies
  3. Collection and display of server and application logs and other types of events
  4. Alerting based on log patterns and metrics extracted from logs

But there is really one more juicy piece of functionality one should look for:

  • Distributed Transaction Tracing

This can be especially useful in Microservices architectures where complex applications are composed of multiple components and services talking to each other over the network while servicing user requests.  As a matter of fact, Dennis Callaghan, senior analyst of infrastructure software at 451 Research, points out:

Microservices solve a lot of challenges, and that’s why they are becoming the standard architecture both within and between applications. We anticipate accelerated adoption of microservices in enterprises this year. But those enterprises need two things in order to effectively monitor microservices architectures. One is the ability to see application and transaction behavior and trace transactions across these increasingly complex and distributed environments. The other is an APM economic model that makes sense and reflects the need to monitor many more smaller instances.

SPM has always had the “APM economic model that makes sense and reflects the need to monitor many more smaller instances”, which is basically the metered model where you pay only for what you use.  This post is about the other key part highlighted in Dennis Callaghan’s statement: “ability to see application and transaction behavior and trace transactions across these increasingly complex and distributed environments”.


How to Add Performance Monitoring to Node.js and io.js Applications

We have been using Node.js here at Sematext and, since eating one’s own dogfood is healthy, we wanted to be able to monitor our Node.js apps with SPM (we are in the performance monitoring and big data biz). So the first thing to do in such a case is to add monitoring capabilities for the technology we use in-house (like we did for Java, Solr, Elasticsearch, Kafka, HBase, NGINX, and others).  For example, we monitor Kibana4 servers (based on Node.js), which we have in production for our “1-click ELK stack”.

You may have seen our post about SPM for Node.js, but I thought I’d share a bit about how we monitor Node.js to help others facing the same DevOps challenges when introducing new Node.js apps, or the additional challenge of operating large deployments with a mix of technologies in the application stack:

1) npm i spm-agent-nodejs

It’s open-sourced on GitHub: sematext/spm-agent-nodejs

2) Add a new SPM App for Node.js; each monitored App is identified by its App Token (and yes, there is an API to automate this step)

3) Set the environment variable holding the application token
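A minimal sketch (assuming the agent reads the token from SPM_TOKEN, as the open-source agent’s docs describe; the value is a placeholder):

export SPM_TOKEN=YOUR_SPM_APP_TOKEN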


4) Add one line to the beginning of your source code when using Node.js (io.js has a better option; see below):

var spmAgent = require('spm-agent-nodejs')

5) Run your app, and after about 60 seconds you should start seeing metrics in SPM
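Putting steps 3)–5) together, a minimal (hypothetical) app instrumented this way looks like:

var spmAgent = require('spm-agent-nodejs')  // must come before other requires
var http = require('http')

http.createServer(function (req, res) {
  res.end('Hello from a monitored app')
}).listen(3000)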

At this point what do I get? I can see pre-defined metric charts like these, with about 5 minutes of work :)


I saved time already: there’s no need to define metric queries, widgets, or dashboards.

Now I can set up alerts on Latency or Garbage Collection, or I can have anomaly detection tell me when the number of Workers in a dynamic queue changes drastically. I typically set ‘Algolerts’ (basically machine-learning-based anomaly detection) to get notified (e.g., via PagerDuty) when a service suddenly slows down, because they produce less noise than regular threshold alerts. In addition, I recommend adding Heartbeat alerts for each monitored service to be notified of any server outages or network problems. In our case, where a Node.js app runs tasks on Elasticsearch, it makes sense to create a custom dashboard to see Elasticsearch and Node.js metrics together (see the 2nd screenshot above). Of course, this applies to other applications in the stack, like NGINX, Redis, or HAProxy, and can be combined with Docker container metrics.


In fact, you can use the same application token on multiple servers to see how your whole cluster behaves using the “bird’s-eye view” (a kind of top + df showing the health of all your servers).

Now, let’s have a look at how the procedure differs when using io.js …

io.js Supports Preloading Modules

When we use the io.js preload command-line option (-r), we can add instrumentation without adding the require statement for ‘spm-agent-nodejs’ to the source code. That’s why Step 4) can be done even better with io.js (>1.6):

iojs -r "./spm-agent-nodejs" yourApp.js

This is just a little feature, but it shows how the io.js community listens to the needs of its users and is able to release such things quickly.

If you want to try io.js, here is how to install it:

npm i n -g

n io 2.4

The ‘node’ executable is now linked to ‘iojs’; to switch back to node 0.12, simply use

n 0.12

I hope this helps.  If you’d like to see some Node.js / io.js metrics that are currently not being captured by SPM, please hit me up on Twitter (@seti321) or drop me an email. Or, even better, simply open an issue on GitHub at sematext/spm-agent-nodejs.  Enjoy!

