JOB: Professional Services Lead – Solr and Elasticsearch

We have a great opportunity at Sematext for a person who wants to take the Professional Services Lead role, grow in it, and grow the whole Professional Services side of the house.  The person in this role will get to learn all aspects of the business, from engineering, to speaking with numerous clients and customers, to working with remote team members, even touching on sales and marketing.  This position offers a truly multifaceted view into Sematext and the space that Sematext is in, which is a rich blend of search, big data, analytics, open source, products, services, engineering, support, etc.  The ideal candidate would already be in New York, where Sematext HQ is located, but we are open to people from other locations as well.

• Experience working with Solr or Elasticsearch
• Plan and coordinate customer engagements from both business and technical perspectives
• Identify customer pain points, needs, and success criteria at the onset of each engagement
• Provide expert-level consulting and support services and strive to be a trustworthy advisor to a wide range of customers
• Resolve complex search issues involving Solr or Elasticsearch
• Identify opportunities to provide customers with additional value through our products or services
• Communicate high-value use cases and customer feedback to our Product teams
• Participate in the open source community by contributing bug fixes and improvements, answering questions, etc.

• BS or higher in Engineering or Computer Science preferred
• 2 or more years of IT Consulting and/or Professional Services experience required
• Exposure to other related open source projects (Hadoop, Nutch, Kafka, Storm, Mahout, etc.) a plus
• Experience with other commercial and open source search technologies a plus
• Enterprise Search, eCommerce, and/or Business Intelligence experience a plus
• Experience working in a startup a plus

Interested? Please send your resume to

For other job openings please see Jobs @ Sematext or even our previous job listings.

Meetup: Indexing and Searching Logs with Elasticsearch and Solr

If you are into logging and search like we are, and if you are in New York, like some of us are, come to Indexing and Searching Logs with Elasticsearch and Solr on Wednesday at the Pivotal Labs office in Manhattan.

Announcement: Logsene 0.3

SPM was not the only one being released this week.  Logsene, our machine/application log and data analytics/exploration solution, saw a release as well!  Let’s see what’s new in Logsene:

  • Like SPM, Logsene got a new “native” Logsene UI to complement its existing Kibana UI.  Those who are looking for something simpler than Kibana or are not Kibana fans (such people do exist, apparently!) may prefer this new, simpler UI, which is reminiscent of older versions of Kibana.
  • We’ve put a lot of new info up on the Logsene Wiki, including how to send logs to Logsene with Logstash, how to send logs to Logsene via Syslog (syslogd/syslog-ng/rsyslog), and of course directly via Logsene’s Elasticsearch API.
  • We’ve also published info about searching Logsene via Elasticsearch API, as well as searching with Kibana.
  • You know how when you are troubleshooting application issues and are asking for help on public mailing lists people often ask you to share your logs so they can help you more?  You can now do that from Logsene!  You can select any number of your log events by clicking on them in Logsene’s new UI and publish them anonymously to GitHub Gist (see a short video)!  Once you do that you can share the Gist URL with anyone you want, such as your team or people offering their help on some mailing list.  In the upcoming release(s) we’ll let you specify your username if you want to share non-anonymously.  Do you want us to support sharing logs via any service other than GitHub Gist?  Pastie?  Pastebin?  Something else?  Leave a comment!
  • Just like you can select logs and “gist them”, you can export logs from Logsene in CSV format.  If you’ve always wanted to import your logs in Excel, now is your chance!
  • You know how you can search Google using syntax like +requiredTerm -excludedTerm “phrase query” and such?  You can use this flexible search syntax with Logsene now.  As a matter of fact, you can use the complete Lucene search syntax in Logsene now.
  • If you are like a lot of people out there who repeatedly run the same set of queries against their logs, you’ll appreciate the new Saved Queries functionality.  As the name implies, with Saved Queries you type in a query, save it, and re-run it later on without having to remember or retype it.
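
The search syntax mentioned above (+requiredTerm -excludedTerm "phrase query") can be put together programmatically.  Here is a minimal sketch in Python, assuming you just want to build the query string before pasting it into the search box or saving it as a Saved Query.  The build_query helper and the example terms are hypothetical and not part of Logsene; the sketch also does not escape Lucene special characters inside terms.

```python
def build_query(required=(), excluded=(), phrases=()):
    """Compose a Lucene-style query string (hypothetical helper, not a Logsene API)."""
    # Terms that must be present get a leading "+".
    parts = [f"+{term}" for term in required]
    # Terms that must be absent get a leading "-".
    parts += [f"-{term}" for term in excluded]
    # Phrases are wrapped in straight double quotes; embedded quotes are escaped.
    parts += ['"{}"'.format(phrase.replace('"', '\\"')) for phrase in phrases]
    return " ".join(parts)


query = build_query(
    required=["error"],
    excluded=["debug"],
    phrases=["connection refused"],
)
print(query)  # +error -debug "connection refused"
```

Note that the phrase is left without a + or - prefix, as in the example syntax above, so it is optional rather than required.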

If you enjoy performance monitoring, log analytics, or search analytics, working with projects like Elasticsearch, Solr, HBase, Hadoop, Kafka, Storm, we’re hiring planet-wide!

Announcement: New goodness in SPM

We don’t typically announce new SPM, Logsene, or Search Analytics releases, but yesterday’s release calls for an exception.  The Logsene release deserves its own post, so we’ll post that separately.  For a quick rundown you can jump over to the SPM Changelog. This post has a bit more descriptive info.

The most visible change in SPM is the whole new, much more modern UI based on Bootstrap. Yes, we have designer(s) on our team now!  You can now much more seamlessly switch between your SPM, Logsene, and Search Analytics apps and the whole experience should feel a lot smoother.  Dashboards were previously fairly hidden, but should now gain visibility.  The “Common” part of SPM, Logsene, and Search Analytics, what we internally call “SUA”, has been radically changed to make navigation much simpler.  While we’ve made lots of UI/UX changes in this release, you’ll see us improving the UI/UX going forward, too.  Please tell us (e.g. leave a comment here) what you think about the new UI, good and bad stuff, and tell us what sort of user experience you’d like to get from SPM!  While the new UI is impossible to miss, there is more in this release:

  • We’ve expanded SPM integration to Redis and Apache Storm.  SPM can now monitor both Redis and Storm and alert you on any of their metrics. This is in addition to monitoring Solr and SolrCloud, Elasticsearch, Hadoop, HBase, Kafka, ZooKeeper, Sensei, JVM, System, and Custom metrics.  Don’t forget to tell us what you want to monitor!
  • More security-sensitive SPM users asked if they could hide their hostnames, which led to the new hostname aliasing/obfuscation feature.  See Can hostnames in SPM be obfuscated or customized? in SPM FAQ.  This is really handy not only because it avoids sending hostnames over the network, but because it lets you specify nice, user-friendly nicknames/aliases for them, so you know which host is which in SPM.
  • When we announced Algolerts a couple of months ago we pointed out a few known kinks.  We’ve taken care of a couple of them in this release.  This boils down to being smart about recognizing regular metric variations and not confusing them with actual anomalies, as well as not missing anomalies that were until now masked by preceding anomalous patterns.
  • We’ve improved the SPM Client, which now loads in a separate classloader from the application it monitors when launched in embedded mode.  This avoids any potential conflicts between libraries included in the SPM Client and those loaded in the monitored application’s process.

SPM, Logsene, and Search Analytics Maintenance – 2013-12-16

SPM, Logsene, and Search Analytics will be down for maintenance between 04:00 and 05:00 EST on Monday, 2013-12-16.

Poll: Using SolrCloud or Not?

It’s been 9 months since we conducted a poll on SolrCloud usage.  A lot of things can change in 9 months.  SolrCloud itself went through a ton of development and bug fixing since our last poll.  It’s time to see how many of us are using SolrCloud now, at the end of 2013.

Please tweet this poll and help us spread the word, so we can get good, statistically significant results.

ZooKeeper Poll Results

We’ve collected 50 votes in our ZooKeeper Usage Poll over the last few days.  Here are the results so far:

  • 66% of people use ZooKeeper directly
  • Another 16% use ZooKeeper indirectly
  • 18% do not use ZooKeeper at all

This puts total ZooKeeper usage at over 80%.  BUT:

Direct ZooKeeper usage at 66% seems a little high, and indirect usage at 16% doesn’t feel quite right.  ZooKeeper is used by Hadoop, HBase, SolrCloud, Kafka, Storm, and so many other popular distributed systems that one would think indirect usage would be much higher than direct usage.
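
To put these percentages in perspective, here is a small sketch in Python that converts them back into vote counts and estimates how much they could move with only 50 votes.  The margin_of_error helper is our own illustration and uses the textbook normal approximation for a simple random sample, which a self-selected online poll is not, so treat the result as a rough lower bound on the uncertainty.

```python
import math

TOTAL_VOTES = 50
# Poll results in percent, as listed above.
results = {"direct": 66, "indirect": 16, "none": 18}

# Convert percentages back into whole vote counts.
counts = {answer: round(TOTAL_VOTES * pct / 100) for answer, pct in results.items()}
print(counts)  # {'direct': 33, 'indirect': 8, 'none': 9}

# Total usage is direct plus indirect.
total_usage_pct = results["direct"] + results["indirect"]
print(total_usage_pct)  # 82


def margin_of_error(pct, n, z=1.96):
    """Approximate 95% margin of error, in percentage points (illustrative helper)."""
    p = pct / 100
    return z * math.sqrt(p * (1 - p) / n) * 100


for answer, pct in results.items():
    print(answer, round(margin_of_error(pct, TOTAL_VOTES), 1))
```

Under these assumptions the 66% figure carries a margin of roughly 13 percentage points either way, so a sample of 50 votes leaves plenty of room for the direct and indirect numbers to shift as more votes come in.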

What’s your take on these numbers?

Announcement: ZooKeeper Performance Monitoring in SPM

You don’t see him, but he is present.  He is all around us.  He keeps things running.  No, we are not talking about Him, nor about The Force.  We are talking about Apache ZooKeeper, the under-appreciated, often not talked-about, yet super-critical component of almost all distributed systems we’ve come to rely on – Hadoop, HBase, Solr, Kafka, Storm, and so on.  Our SPM, Search Analytics, and Logsene, all use ZooKeeper, and we are not alone – check our ZooKeeper poll.

We’re happy to announce that SPM can now monitor Apache ZooKeeper!  This means everyone using SPM to monitor Hadoop, HBase, Solr, Kafka, Sensei, and other applications that rely on ZooKeeper can now use the same monitoring and alerting tool – SPM – to monitor their ZooKeeper instances.

Please tweet about Performance Monitoring for ZooKeeper

Here’s a glimpse into what SPM for ZooKeeper provides – click on the image to see the full view or look at the actual SPM live demo:

SPM for ZooKeeper Overview

Please tell us what you think – @sematext is always listening!  Is there something SPM doesn’t monitor that you would really like to monitor?  Please vote for tech to monitor!

Want to build highly distributed big data apps with us?  We’re hiring good engineers (not just for positions listed on our jobs page), and we’re sitting on a heap of some pretty juicy big data!

Poll: Are You Using ZooKeeper?

In the last decade the world of distributed computing has exploded and Apache ZooKeeper is often at the center of it, which is why we just added ZooKeeper monitoring in SPM.  Let’s see what percentage of us use ZooKeeper.

Please tweet so we can collect a large number of votes and get a statistically representative sample.

Presentation: Scaling Solr with SolrCloud

Squeezing the maximum possible performance out of Solr / SolrCloud and Elasticsearch, and making them scale well, is what we do on a daily basis for our clients.  We make sure their servers are optimally configured and maximally utilized.  Rafal Kuć gave a long, 75-minute talk on the topic of Scaling Solr with SolrCloud at the Lucene Revolution 2013 conference in Dublin. Enjoy!

If you are interested in working with Solr and/or Elasticsearch, we are looking for good people to join our team.

