What’s New in SPM 1.11.0

We’ve been doing quite a bit of work behind the scenes in SPM.  Here are a few new things in the most recent release – 1.11.0 from April 16, 2013:

  • We’ve added a Standalone Monitor.  So far the only way to monitor Solr, ElasticSearch, HBase, Sensei, or JVM with SPM was by running our SPM Monitor in-process, as a Javaagent.  Starting with this version you have an additional option of running the monitor in a separate process.
  • SPM URLs are now sharable. Just copy the URL from your browser while using SPM and give it to anyone who has access to the same SPM App and they’ll see the exact same view as you – this means seeing the same report, same graph, same filter selection(s), and the same time range!  Because we use SPM with a lot of our Solr and ElasticSearch consulting clients, this is huge for us (and them!), as it helps us all see the exact same view.
  • We have simplified the SPM client installation a lot and have simplified the Collectd config a bit, too.
  • Both SPM Sender and SPM Monitor have been reworked. Monitor has the ability to register new applications and Sender has the ability to pick that up.  Sender should also be running with ionice if you have ionice available and a bit of unnecessary work was removed from Monitor, so it should consume even fewer resources than before.
  • We’ve added more info about SPM to the new SPM Wiki Space.

In addition to that, we’re working on:

  • Hadoop monitoring. This includes performance reports for both HDFS and MapReduce – NameNode, JobTracker, TaskTracker, and DataNode.
  • SPM client packaging.  This means you’ll soon be able to install SPM client as a Deb package or RPM, and then automate with Puppet or Chef.

There are a few more interesting things in the works, but we’ve got to leave something for later.  If you have not tried SPM yet, you should!  User feedback has been awesome and there are a number of good things on the 2013 roadmap!

Sneak Peek: Hadoop Monitoring comes to SPM

When it comes to Hadoop, they say you’ve got to monitor it and then monitor it some more.  Since our own Performance Monitoring and Search Analytics services run on top of Hadoop, we figured it was time to add Hadoop performance monitoring to SPM.  So here is a sneak peek at SPM for Hadoop.  If you’d like to try it on your Hadoop cluster, we’ll be sending invitations soon and you can get on the private beta list starting today!

In the mean time, here is a small sample of pretty self-explanatory reports from SPM for Hadoop, so you can get a sense of what’s available.  There are, of course, a number of other Hadoop-specific reports included, as well as server reports, filtering, alerting, multi-user support, report sharing, etc. etc.

Please don’t forget to tell us what else would you like us to monitor - select your candidates - and if you like what you see and want a good monitoring tool for your Hadoop cluster, please sign up for private beta now.

Click on any graph to see it in its full size and high quality.

Hadoop NameNode Files

Hadoop NameNode Files

.

Hadoop DataNode Read-Write

Hadoop DataNode Read-Write

.

Hadoop JobTracker MapReduce Runtime

Hadoop JobTracker MapReduce Runtime

.

Hadoop TaskTracker Tasks

Hadoop TaskTracker Tasks

.

What else would you like us to monitor with SPM?  Please select your candidates!

For announcements, promotions, discounts, service status, milk, cookies, and other goodies follow @sematext.

EC2 Neighbour Caught Stealing CPU

We run all our services on top of AWS.  We like the flexibility and the speed of provisioning and decommissioning instances.  Unfortunately, this “new age” computing comes at a price.  Once in a while we hit an EC2 instance that has a loud, noisy neighbour.  Kind of like this:

Noisy Jack Nicholson

Unlike in real life, you can’t really hear your noisy neighours in virtualized worlds.  This is kind of good – if you don’t hear them, they won’t bother you, right? Wrong! Oh yes, they will bother you, it’s just that without proper tools you won’t really realize when they’ve become load, how loud they got, and how much their noise is hurting you! So while it’s true you can’t hear these neighbours, you can see them!  Have a look at this graph from SPM:

Noisy neighbour(s) stealing your CPU

Noisy neighbour(s) stealing your CPU. Click for a larger and sharper image.

What we see here is a graph for CPU “steal time” for one of our HBase servers.  Luckily, this happens to be one of our HBase masters, which doesn’t do a ton of CPU intensive work.  What we see is that somebody, some other VM(s) sharing the same underlying host, is stealing about 30% of the CPU that really belongs to us.  Sounds bad, doesn’t it?  What exactly does that mean?  It means that about 30% of the time, applications on this instance (i.e., in our VM) try to use the CPU and the CPU is not available.  Bummer. Of course, this happens at a very, very low level, so from the outside, without this sort of insight, everything looks OK — it’s impossible to tell whether applications are not getting the CPU cycles when they need them by just looking at applications themselves.

So, do you know how noisy your virtual neighbours are?  Do you know how much they steal from you?

If you want to see what your neighbour situation is, whether on AWS or in some other virtualized environment, this is what you can do:

  1. Get SPM (pick “Java” for your SPM Application type once you get in even if you don’t need to monitor any Java apps, yes)
  2. Run the installer, but don’t bother with the “monitor” (aka SPM Monitor) piece – all you need to know are your CPU metrics and for that we don’t need the monitor piece to be running at all actually.
  3. Go to http://apps.sematext.com/ and look at the “CPU & Mem” tab
  4. Unselect all metrics other than “steal”, as show in the image above.  Select each server you want to check in the filter right of that graph (not shown in the image) to check one server at a time.
  5. Make use of SPM alerts and set them up so you get notified when the CPU steal percentage goes over a certain threshold that you don’t want to tolerate. This way you’ll know when it’s time to consider moving to a new VM/instance.

What do you do if you find out you do have noisy neighbours?

There are a couple of options:

  • Be patient and hope they go to sleep or move out
  • Pack your belongings, launch a new EC2 instance, and move there after ensuring it doesn’t suffer from the same problem
  • Create more noise than your neighbour and drive him/her out instead. Yes, I just made this up.

In this particular case, we’ll try the patient option first and move out only when the noise starts noticeably hurting us or we run out of patience.  Happy monitoring!

Marketing Intern Position Available

We’ve been very busy at Sematext working on our flagship products – SPM, Search Analytics, and …. more (we’ll be announcing something new soon).  We’ve received great positive feedback from users and customers.  We’re now looking for a Marketing Intern to join our highly distributed team and help us further generate and drive the demand for our products.  This is flexible role that can be part-time or full-time, remote or local (NYC).  For more information about this opportunity and other open positions see our jobs page.

Poll: Using SolrCloud or Not?

We know that as of February 2013, of those Solr users who follow Sematext Blog about 75% use one some version of Solr 4.x.  But today we are trying to get to another interesting stat:

What portion of Solr 4.x users use SolrCloud?

Let’s find out!  Please tweet this to help us get more votes and better stats.

Please vote only if you are using Solr 4.x.  Please do NOT vote if you are using 1.x or 3.x version of Solr.

Poll: Which Solr version are you using?

With Solr 4.1 recently released, let’s see which version(s) of Solr people are using.  Please tweet it to help us get more votes and better stats.

Solr vs. ElasticSearch: Part 6 – User & Dev Communities

One of the questions after my talk during the recent ApacheCon EU was what I thought about the communities of the two search engines I was comparing. Not surprisingly, this is also a question we often address in our consulting engagements.  As a part of our Apache Solr vs ElasticSearch post series we decided to step away from the technical aspects of SolrCloud vs. ElasticSearch and look at the communities gathered around thesee two projects. If you haven’t read the previous posts about Apache Solr vs. ElasticSearch here are pointers to all of them:

Read more of this post

Solr vs ElasticSearch: Part 5 – Management API Capabilities

In previous posts, all listed below, we’ve discussed general architecture, full text search capabilities and facet aggregations possibilities. However, till now we have not discussed any of the administration and management options and things you can do on a live cluster without any restart. So let’s get into it and see what Apache Solr and ElasticSearch have to offer.

Read more of this post

HBaseWD and HBaseHUT: Handy HBase Libraries Available in Public Maven Repo

HBaseWD is aimed to help distribute writes of records with sequential row keys in HBase (and avoid RegionServer hotspotting). Good introduction can be found here.

We recently published 0.1.0 version of the library to Sonatype public maven repository. Thus, integration in your project became much easier:

  <repositories>
    <repository>
      <id>sonatype release</id>
      <url>https://oss.sonatype.org/content/repositories/releases/</url>
    </repository>
  </repositories>
  <dependency>
    <groupId>com.sematext.hbasewd</groupId>
    <artifactId>hbasewd</artifactId>
    <version>0.1.0</version>
  </dependency>

HBaseHUT is aimed to help in situations when you need to update a lot of records in HBase in read-modify-write style. Good introduction can be found here.

We recently published 0.1.0 version of this library to Sonatype public maven repository too. Integration info:

  <repositories>
    <repository>
      <id>sonatype release</id>
      <url>https://oss.sonatype.org/content/repositories/releases/</url>
    </repository>
  </repositories>
  <dependency>
    <groupId>com.sematext.hbasehut</groupId>
    <artifactId>hbasehut</artifactId>
    <version>0.1.0</version>
  </dependency>

For running (MR jobs) on hadoop-2.0+ (which is a part of CDH4.1+) use 0.1.0-hadoop-2.0 version:

  <dependency>
    <groupId>com.sematext.hbasehut</groupId>
    <artifactId>hbasehut</artifactId>
    <version>0.1.0-hadoop-2.0</version>
  </dependency>

Thank you to all contributors and users of the libraries!

SPM Discountorama Announcement

We are happy to announce the General Availability of SPM, our performance monitoring solution for Apache Solr, ElasticSearch, HBase, SenseiDB, and Java applications, and of course all system metrics. You can also vote for what else you want SPM to monitor.  Over the last N months that we’ve been running SPM we’ve received a lot of good feedback (thanks!), a lot of words of encouragement (thanks!), and even a few nice quotes (another thanks!). Here is one from Jerry Yang, a Software Engineer at Walmart Labs: “I have been using SPM for couple of days and it has been amazing. I learned a lot about my Solr services and was able to optimize based on the results on SPM. Great work.”

Discount Codes

Since holiday season is coming up, we thought we’d offer some discounts every week between now until the end of the year.  Each of the following discounts can be used only during “its week” specified below.  There is a limit to the number of people who can use each discount, so if you want it, don’t waste too much time.  Each discount will reduce the price of SPM SaaS for 365 days after you’ve used it, which effectively means you will get discount until the end of 2013.  Note that when you register for SPM you do not need to enter your credit card information.  You also don’t need to provide it when you create the SPM application for the system you want to monitor.  And it is when you create your SPM application that you can enter the discount code.

  • 20% for the remainder of this week until the end of this Sunday, December 9: NY201320
  • 15% for the week of December 10, 2012: NY201315
  • 10% for the week of December 17, 2012: NY201310
  • 5% for the week of December 24, 2012: NY201305

Note that each discount code expires on Sunday at 00:00 UTC.

SPM Flavours

The above discounts are good for our SPM SaaS.  However, if you’d rather run SPM on your own servers, we do offer SPM on Premises – please get in touch if you are interested in the on premises version.  You can also vote for SPM SaaS vs. On Premise and that way tell us which version you prefer or want.

SPM Plans

There are a few different subscription plans available in SPM SaaS:

  • Basic plan that is free and shows the last 30 minutes of performance data
  • Standard plan that shows the last 30 days of data and costs $0.035/server/hour
  • Pro plan that shows the last 60 days of performance data and costs $0.070/server/hour

If you have not used SPM before, here is what you can expect to see – click on the image to see a large, non-fuzzy version:

We hope you will find SPM useful and fun to use.  We are always looking for feedback – just email spm-support@sematext.com or ping @sematext and tell us what you like or don’t like about SPM.

Follow

Get every new post delivered to your Inbox.

Join 1,121 other followers