Introducing Akka Monitoring

Akka is a toolkit and runtime for building highly concurrent, distributed and resilient message-driven applications on the JVM. It’s a part of Scala’s standard distribution for the implementation of the “actor model”.

How Akka Works

Messages between Actors are exchanged in Mailbox queues and Dispatcher provides various concurrency models, while Routers manage the message flow between Actors. That’s quite a lot Akka is doing for developers!

But how does one find bottlenecks in distributed Akka applications? Well, many Akka users already use the great Kamon Open-Source Monitoring Tool, which makes it easy to collect Akka metrics.  However — and this is important! — predefined visualizations, dashboards, anomaly detection, alerts and role-based access controls for the DevOps team are out of scope for Kamon, which is focused on metrics collection only.  To overcome this challenge, Kamon’s design makes it possible to integrate Kamon with other monitoring tools.

Needless to say, Sematext has embraced this philosophy and contributed the Kamon backend to SPM.  This gives Akka users the option to use detailed Metrics from Kamon along with the visualization, alerting, anomaly detection, and team collaboration functionalities offered by SPM.

The latest release of Kamon 0.5.x includes kamon-spm module and was announced on August 17th, 2015 on the Kamon blog.  Here’s an excerpt:

Pavel Zalunin from Sematext contributed the new kamon-spm module, which as you might guess allows you to push metrics data to the Sematext Performance Monitor platform. This contribution is particularly special to us, given the fact that this is the first time that a commercial entity in the performance monitoring sector takes the first step to integrate with Kamon, and they did it so cleanly that we didn’t even have to ask any changes to the PR, it was just perfect. We sincerely hope that more companies follow the steps of Sematext in this matter.

Now let’s take a look at the result of this integration work:

  • Metrics pushed to SPM are displayed in predefined reports, including:
    • An overview of all key Akka metrics
    • Metrics for Actors, Dispatchers and Routers
    • Common metrics for CPU, Memory, Network, I/O,  JVM and Garbage Collection
  • Each chart has the “Action” menu to:
    • Define criteria for anomaly detection and alerts
    • Create scheduled email reports
    • Securely share charts with read-only links
    • Embed charts into custom dashboards
  • A single SPM App can take metrics from multiple hosts to monitor a whole cluster; filters by Host, Actor, Dispatcher, and Router make it easy to drill down to the relevant piece of information.
  • All other SPM features, are available for Akka users, too.  For example:

Akka_overview

Akka Metrics Overview

Actor Metrics

Actors send and receive messages, therefore the key metrics for Actors are:

  • Time in Mailbox
    Messages are waiting to be processed in the Mailbox – high Time in Mailbox values indicate potential delays in processing.
  • Processing Time
    This is the time Actors need to process the received messages – use this to discover slow Actors
  • Mailbox Size
    Large Mailbox Size could indicate pending operations, e.g. when it is constantly growing.

Each of the above metrics is presented in aggregate for all Actors, but one can also use SPM filtering feature to view all Actors’ metrics separately or select one or more specific Actors and visualize only their metrics.  Filtering by Host is also possible, as show below.

Akka_actors

Akka Actors

Dispatcher Metrics

In Akka a Dispatcher is what makes Actors ‘tick’. Each Actor is associated with a particular Dispatcher (default one is used if no explicit Dispatcher is set). Each Dispatcher is associated with a particular Executor – Thread Pool or Fork Join Pool. The SPM Dispatcher report shows information about Executors:

  • Fork Join Pool
  • Thread Pool Executor

All metrics can be filtered by Host and Dispatcher.

Akka_dispatchers

Akka Dispatchers

Router Metrics

Routers can be used to efficiently route messages to destination Actors, called Routees.

  • Routing Time – Time to route message to selected destination
  • Time In Mailbox – Time spent in routees mailbox.
  • Processing Time – Time spent by routee actor to process routed messages
  • Errors Count – Errors count during processing messages by routee

For all these metrics, lower values are better, of course.

Akka_routers

Akka Routers

You can set Alerts and enable Anomaly Detection for any Akka or OS metrics you see in SPM and you can create custom Dashboards with any combination of charts, whether from your Akka apps or other apps monitored by SPM.

We hope you like this new addition to SPM.  Got ideas how we could make it more useful for you?  Let us know via comments, email, or @sematext.

Not using SPM yet? Check out the free 30-day SPM trial by registering here (ping us if you’re a startup, a non-profit, or education institution – we’ve got special pricing for you!).  There’s no commitment and no credit card required.  SPM monitors a ton of applications, like Elasticsearch, Solr, Cassandra, Hadoop, Spark, Node.js (open-source), Docker (get open-source Docker image), CoreOS, RancherOS and more.

Tomcat Monitoring SPM Integration

This old cat, Apache Tomcat, has been around for ages, but it’s still very much alive!  It’s at version 8, with version 7.x still being maintained, while the new development is happening on version 9.0.  We’ve just added support for Tomcat monitoring to the growing list of SPM integrations the other day, so if you run Tomcat and want to take a peek at all its juicy metrics, give SPM for Tomcat a go!  Note that SPM supports both Tomcat 7.x and 8.x.

Before you jump to the screenshot below, read this: you may reeeeeally want to enable Transaction Tracing in the SPM agent running on your Tomcat boxes.  Why?  Because that will help you find bottlenecks in your web application by tracing transactions (think HTTP requests + method call + network calls + DB calls + ….).  It will also build a whole map of all your applications talking to each other with information about latency, request rate, error and exception rate between all component!  Check this out (and just click it to enlarge):

AppMap

Everyone loves maps.  It’s human. How about charts? Some of us have a thing for charts. Here are some charts with various Tomcat metrics, courtesy of SPM:

Overview  (click to enlarge)

Tomcat_overview_2

Session Counters  (click to enlarge)

Tomcat_Session_Counters

Cache Usage  (click to enlarge)

Tomcat_Sessions_4

Threads (Threads/Connections)  (click to enlarge)

Tomcat_threads_4

Requests  (click to enlarge)

Tomcat_Requests

Hope you like this new addition to SPM.  Got ideas how we could make it more useful for you?  Let us know via comments, email, or @sematext.

Not using SPM yet? Check out the free 30-day SPM trial by registering here (ping us if you’re a startup, a non-profit, or education institution – we’ve got special pricing for you!).  There’s no commitment and no credit card required.  SPM monitors a ton of applications, like Elasticsearch, Solr, Hadoop, Spark, Node.js & io.js (open-source), Docker (get open-source Docker image), CoreOS, and more.

Monitoring CoreOS Clusters

UPDATE: Related to monitoring CoreOS clusters, we have recently optimized the SPM setup on CoreOS and integrated a logging gateway to Logsene into the SPM Agent for Docker.  You can read about it in Centralized Log Management and Monitoring for CoreOS Clusters

——-

[Note: We’re holding Docker Monitoring and Docker Logging webinars in September — sign up today!]

In this post you’ll learn how to get operational insights (i.e. performance metrics, container events, etc.) from CoreOS and make that super simple with etcd, fleet, and SPM.

We’ll use:

  • SPM for Docker to run the monitoring agent as a Docker container and collect all Docker metrics and events for all other containers on the same host + metrics for hosts
  • fleet to seamlessly distribute this container to all hosts in the CoreOS cluster by simply providing it with a fleet unit file shown below
  • etcd to set a property to hold the SPM App token for the whole cluster

The Big Picture

Before we get started, let’s take a step back and look at our end goal.  What do we want?  We want charts with Performance Metrics, we want Event Collection, we’d love integrated Anomaly Detection and Alerting, and we want that not only for containers, but also for hosts running containers.  CoreOS has no package manager and deploys services in containers, so we want to run the SPM agent in a Docker container, as shown in the following figure:

SPM_for_Docker

By the end of this post each of your Docker hosts could look like the above figure, with one or more of your own containers running your own apps, and a single SPM Docker Agent container that monitors all your containers and the underlying hosts.

Read more of this post

Get CoreOS Logs into ELK in 5 Minutes

Update: We have recently optimized the SPM setup on CoreOS and integrated a logging gateway to Logsene into the SPM Agent for Docker.  Please follow the setup instructions in Centralized Log Management and Monitoring for CoreOS Clusters


CoreOS Linux is the operating system for “Super Massive Deployments”.  We wanted to see how easily we can get CoreOS logs into Elasticsearch / ELK-powered centralized logging service. Here’s how to get your CoreOS logs into ELK in about 5 minutes, give or take.  If you’re familiar with CoreOS and Logsene, you can grab CoreOS/Logsene config files from Github. Here’s an example Kibana Dashboard you can get in the end:

CoreOS Kibana Dashboard

CoreOS Kibana Dashboard

CoreOS is based on the following:

  • Docker and rkt for containers
  • systemd for startup scripts, and restarting services automatically
  • etcd as centralized configuration key/value store
  • fleetd to distribute services over all machines in the cluster. Yum.
  • journald to manage logs. Another yum.

Amazingly, with CoreOS managing a cluster feels a lot like managing a single machine!  We’ve come a long way since ENIAC!

There’s one thing people notice when working with CoreOS – the repetitive inspection of local or remote logs using “journalctl -M machine-N -f | grep something“.  It’s great to have easy access to logs from all machines in the cluster, but … grep? Really? Could this be done better?  Of course, it’s 2015!

Here is a quick example that shows how to centralize logging with CoreOS with just a few commands. The idea is to forward the output of “journalctl -o short” to Logsene‘s Syslog Receiver and take advantage of all its functionality – log searching, alerting, anomaly detection, integrated Kibana, even correlation of logs with Docker performance metrics — hey, why not, it’s all available right there, so we may as well make use of it all!  Let’s get started!

Preparation:

1) Get a list of IP addresses of your CoreOS machines

fleetctl list-machines

2) Create a new Logsene App (here)
3) Change the Logsene App Settings, and authorize the CoreOS host IP Addresses from step 1) (here’s how/where)

Congratulations – you just made it possible for your CoreOS machines to ship their logs to your new Logsene app!
Test it by running the following on any of your CoreOS machines:

journalctl -o short -f | ncat --ssl logsene-receiver-syslog.sematext.com 10514

…and check if the logs arrive in Logsene (here).  If they don’t, yell at us @sematext – there’s nothing better than public shaming on Twitter to get us to fix things. :)

Create a fleet unit file called logsene.service

[Unit]
Description=Logsene Log Forwarder

[Service]
Restart=always
RestartSec=10s
ExecStartPre=/bin/sh -c "if [ -n \"$(etcdctl get /sematext.com/logsene/`hostname`/lastlog)\" ]; then  echo \"Value Exists: /sematext.com/logsene/`hostname`/lastlog $(etcdctl get /sematext.com/logsene/`hostname`/lastlog)\"; else etcdctl set /sematext.com/logsene/`hostname`/lastlog\"`date +\"%Y-%%m-%d %%H:%M:%S\"`\"; true; fi"
ExecStart=/bin/sh -c "journalctl --since \"$(etcdctl get /sematext.com/logsene/`hostname`/lastlog)\" -o short -f | ncat --ssl logsene-receiver-syslog.sematext.com  10514"
ExecStopPost=/bin/sh -c "export D=\"`date +\"%Y-%%m-%%d %%H:%M:%S\"`\"; /bin/etcdctl set /sematext.com/logsene/$(hostname)/lastlog \"$D\""

[Install]
WantedBy=multi-user.target

[X-Fleet]
Global=true

Activate cluster-wide logging to Logsene with fleet

To start logging to Logsene from all machines activate logsene.service:

fleetctl load logsene.service
fleetctl start logsene.service

There.  That’s all there is to it!  Hope this worked for you!

At this point all your CoreOS logs should be going to Logsene.  Now you have a central place to see all your CoreOS logs.  If you want to send your app logs to Logsene, you can do that, too — anything that can send logs via Syslog or to Elasticsearch can also ship logs to Logsene. If you want some Docker containers & host monitoring to go with your CoreOS logs, just pull spm-agent-docker from Docker Registry.  Enjoy!

Monitoring Kibana 4’s Node.js App

The release of Kibana 4.x has had an impact on monitoring and other related activities.  In this post we’re going to get specific and show you how to add Node.js monitoring to the Kibana 4 server app.  Why Node.js?  Because Kibana 4 now comes with a little Node.js server app that sits between the Kibana UI and the Elasticsearch backend.  Conveniently, you can monitor Node.js apps with SPM, which means SPM can monitor Kibana in addition to monitoring Elasticsearch.  Futhermore, Logstash can also be monitored with SPM, which means you can use SPM to monitor your whole ELK Stack!  But, I digress…

A few important things to note first:

  • the Kibana 4 project moved from Ruby to pure browser app to Node.js on the server side, as mentioned above
  • it now uses the popular Express Web Framework
  • the server component has a built-in proxy to Elasticsearch, just like it did with the Ruby app
  • when monitoring Kibana 4, the proxy requests to Elasticsearch are monitored at the same time

OK, here’s how to add Node.js monitoring to the Kibana 4 server-side app.

1) Preparation

Get an App Token for SPM by creating a new Node.js SPM App in SPM.

Kibana 4 currently ships with Node.js version 0.10.35 in a subdirectory – so please make sure your Node.js is on 0.10 while installing SPM Agent for Node.js (it compiles native modules, which need to fit to Kibana’s 0.10 runtime).

  npm-install ny -g
  ny 0.10.35

After finishing the described installation below you can easily switch back to 0.12 or io.js 2.0 by using “ny 0.12” or “ny 2.0” – because Kibana will use its own node.js sub-folder.

2) Install SPM Agent for Node.js

Switch over to your Kibana 4 installation directory.  It has a “src” folder where the Node.js modules are installed.

  cd src
  npm install spm-agent-nodejs

Add the following line to ./src/app.js

  var spmAgent = require ('spm-agent-nodejs')

Add the following line to bin/kibana shell script at the beginning

export spmagent_tokens__spm=YOUR-SPM-APP-TOKEN

3) Run Kibana

bin/kibana

4) Check results in SPM

After a minute you should see the performance metrics such as EventLoop Latencies, Memory Usage, Garbage Collection details and HTTP statistics of your Kibana 4 Server app in SPM.

Kibana 4 - monitored with SPM for Node.js

Kibana 4 – monitored with SPM for Node.js

SPM for Node.js Monitoring – Details, Screenshots and more

For more specific details about SPM’s Node.js monitoring integration, check out this blog post.

That’s all there is to it!  If you’ve got questions or feedback to this post, please let us know!

Custom Metrics from Node.js Apps

We recently added support for Node.js and io.js monitoring to SPM and have received great feedback.  While SPM for Node.js monitors all key Node.js metrics, most applications have additional metrics one often wants to track — things like: the number of concurrent users, the number of items placed in a shopping cart, or any other kind of IT metric, business transaction or KPI.  SPM already provides a Custom Metrics API and libraries that make shipping custom metrics from Java and from Ruby applications a snap.  But why leave Node.js behind?  Meet spm-metrics-js (it’s on Github) – the npm module for sending custom metrics from Node.js apps to SPM.  

This JavaScript module supports measurements using counters, meters, timers, and histograms. These helpers calculate values of metrics objects and ship them to SPM, where they are then turned into charts and inputs to alert rules and anomaly detection algorithms.

Here’s an example for counting users on login and logout:

Sending custom metrics is really that easy!

Now, let’s have a look at the options used when creating a custom metric object:

  • name – the name of the metric you can find in SPM’s user interface
  • aggregation – the aggregation type: ‘avg’, ‘sum’, ‘min’ or ‘max’ used in SPM’s aggregations server
  • filter1 – the SPM user interface provides two filter criteria; the value will be available in the UI as the first filter
  • filter2 – the filter value for the second filter field in SPM’s UI
  • interval – time in ms to call save() periodically. Defaults to no automatic call to save(). The save() function captures the metric and resets meters, histograms, counters or timers.
  • valueFilter – array of property names for calculated values. Only specified fields are sent to SPM (e.g. [‘count’, ‘min’, ‘max’].

Additional measurement functions are available to extend the custom metric object automatically with additional calculated properties:

  • Meter – measure rates and provide the following calculated properties:
    • mean: the average rate since the meter was started
    • count: the total of all values added to the meter
    • currentRate: the rate of the meter since the meter was started
    • 1MinuteRate: the rate of the meter biased toward the last 1 minute
    • 5MinuteRate: the rate of the meter biased toward the last 5 minutes
    • 15MinuteRate: the rate of the meter biased toward the last 15 minutes
  • Histogram – build percentile, min, max, & sum aggregations over time
    • min: the lowest observed value
    • max: the highest observed value
    • sum: the sum of all observed values
    • variance: the variance of all observed values
    • mean: the average of all observed values
    • stddev: the stddev of all observed values
    • count: the number of observed values
    • median: 50% of all values in the reservoir are at or below this value.
    • p75: see median, 75% percentile
    • p95: see median, 95% percentile
    • p99: see median, 99% percentile
    • p999: see median, 99.9% percentile
  • Timer – measures time and captures rates in an internal meter and histogram

If this is more than you actually need, we recommend selecting only the relevant properties (using the ‘valueFilter’ option). Please note that Custom Metrics are aggregated by the specified aggregation type (‘avg’, ‘sum’, ‘min’, ‘max’).  Moreover, the aggregation type for each property can be defined – for further details please check the package documentation.

Adding instrumentation always raises the question of performance; in spm-metrics-js all metrics are buffered and efficiently ship metrics to SPM in bulk using asynchronous functions. We recommend using a transmit time of 60 seconds.

Once you send custom metrics to SPM you can create alerts on them, have SPM detect and alert you about anomalies, put charts with those metrics on dashboards, share charts with those metrics publicly or just with your team or organization, etc.

custom-metric-alert

Actions for Metrics – e.g. define alerts using anomaly detection

custom-metric-dashboard

Dashboard with Custom Metric and other Metrics

Please note the free plan has no limits on the number of monitored Applications, Processes, Dashboards or Users and you can share Accounts with your whole DevOps team and integrate SPM with Slack, HipChat, PagerDuty, Webhooks, etc. If you don’t use SPM yet, grab a free account to start monitoring your Node.js and io.js applications and benefit from all standard SPM features such as alerting, anomaly detection, event and log correlation, unlimited dashboards, secure information sharing, etc. Check out spm-metrics-js (or on Github) and drop us a line (or tweet 140 characters to @sematext) — we’d love to hear from you!

Node.js and io.js Monitoring Support

Node.js and io.js are increasingly being used to run JavaScript on the server side for many types of applications, such as websites, real-time messaging and controllers for small devices with limited resources. For DevOps it is crucial to monitor the whole application stack and Node.js is rapidly becoming an important part of the stack in many organizations. Sematext has historically had a strong support for monitoring big data applications such as Elastic (aka Elasticsearch), Cassandra, Solr, Spark, Hadoop, and HBase, as well as more traditional databases, web servers like Nginx, Nginx Plus and Apache, Java applications, cache servers like Redis and Memcached, messaging middleware like everyone’s darling Kafka, etc.  With such rapid adoption of Node.js and now io.js, we’d be remiss not to add performance monitoring, alerting, and anomaly detection for them in SPM!

spm-node-io

SPM for Node.js

We’re happy to announce we’ve just added Node.js monitoring to this growing list of SPM integrations.  SPM for Node.js covers key Node.js metrics such as Event Loop, Garbage Collection, CPU, Memory and web services metrics.  All metrics are organized in out-of-the-box charts, which can be put on additional dashboards and placed next to performance charts for other parts of the application stack.

Overview for top node.js and io.js metrics

Overview for top node.js and io.js metrics

Of course, you can view your Node.js metrics in a larger context.  For example, here is a dashboard that shows Node.js metrics together with Elasticsearch metrics, making it easier to correlate performance across multiple tiers of the application stack.  You could also get your event and log charts on the same dashboard for an even more thorough correlation.

nodejs-elasticsearch-dashboard

Dashboard with node.js HTTP response time and Elasticsearch query latency

Needless to say, we made sure everything works for the latest versions of Node.js (0.12) and io.js (1.6). Installation is as easy as integration of any other module using npm.  If you are not using SPM yet, you can sign up with no commitment or credit card.  You have 30-days free on any new app you create.  If you are already using SPM, you can simply add a new SPM App for Node.js and see all your Node.js metrics in just a few minutes.  Don’t see something in SPM for Node.js?  Please let us know (@sematext) or comment below, we are looking for feedback!

Follow

Get every new post delivered to your Inbox.

Join 173 other followers