JOB: Summer Marketing Internship

We are looking for a high-energy intern with diverse marketing skills to help generate demand for our products.  The internship will be demanding as we move at a fast pace and are extremely agile.  This person will work closely with a globally distributed team — US, Canada, Eastern Europe and Asia.  Our headquarters is located in Brooklyn, but we are open to applicants from anywhere.  Depending on the person, this role could be full- or part-time.

Experience and skill set we are looking for:

  • Communicates well in person, in writing and over the phone
  • Uses social media platforms like Twitter, Facebook and LinkedIn
  • Familiar with email campaign tools (e.g., MailChimp, Campaign Monitor, Constant Contact)
  • Willingness to do a wide range of tasks and see them through to completion
  • Experience with graphic design tools
  • Willingness to learn in a highly technical environment
  • Background with software and/or IT consulting organizations is especially desirable
  • CRM experience a plus

Approximate internship dates are June 2 – August 29 though we are flexible.  Sound like you?  Then send your resume to

Going to Be in Austin on April 2nd? Then Check Out BV:IO

Live or work in Austin?  Like small conferences filled with smart, interesting technical people, a roster of great speakers, and innovation everywhere you look?  Great — you’ll fit right in at Bazaarvoice’s first-ever public technical conference and hackathon to drive innovation in the social commerce space.  Get all the BV:IO event details here.

And since you are reading this blog, there’s a good chance you know about our founder and CEO, Otis Gospodnetic, and his expertise with all things Search and Big Data.  Otis has been invited to speak, and he goes on at 1:30 pm on Wednesday, April 2nd.  Otis will speak about “Open Source Search Evolution” and he’ll be available before and after the talk at the Sematext sponsor table to say hello and talk about SPM, Logsene, Site Search Analytics, Solr, Elasticsearch, Hadoop, NYC vs. Austin tech scenes, Brooklyn Lager vs. Lone Star…and whatever else you bring to our table.

If you’re thinking of attending BV:IO, drop us a line at  Hope to see you there!

Encrypting Logs on Their Way to Elasticsearch Part 2: TLS Syslog

In part 1 of the “encrypted logs” series we discussed sending logs to Elasticsearch over HTTPS. This second part is about TLS syslog.

If you’re wondering what this has to do with Elasticsearch: TLS syslog is a standard (RFC-5425), so any decent version of rsyslog, syslog-ng or nxlog works with it. That means you can forward logs over TLS to a recent, “intermediary” rsyslog. From there, you can either use omelasticsearch with HTTPS to ship your logs to Elasticsearch, or you can install rsyslog on an Elasticsearch node (and index logs over HTTP to localhost).

Such a setup will give you the following benefits:

  • it will work with most syslog daemons, because TLS syslog is so widely supported
  • the “intermediate” rsyslog can act as a buffer, taking that pressure off your application servers
  • the “intermediate” rsyslog can be used for processing, like parsing CEE-formatted JSON over syslog, again taking load off your application servers
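
For the CEE-parsing case, the “intermediate” rsyslog can use the mmjsonparse module; a minimal sketch (the surrounding ruleset wiring is up to you):

```
module(load="mmjsonparse")   # parses messages carrying the "@cee:" JSON cookie

action(type="mmjsonparse")   # on success, JSON fields become accessible as $!fieldname
```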

Our log analytics SaaS, Logsene, gives you all the benefits listed above through the syslog endpoint:

TLS syslog flow in Logsene

Client Setup

Before you start, you’ll need a Certificate Authority’s public key, which will be used to validate the encryption certificate from the syslog destination (more about the server side later).

If you’re using Logsene, you can download the CA certificates directly. If you’re on a local setup, or you just want to consolidate your logs before shipping them to Logsene, you can use your own certificates or generate self-signed ones. Here’s a guide to generating certificates that will work with TLS syslog.
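As a hedged sketch, self-signed certificates suitable for testing can be generated with OpenSSL along these lines (file names and subjects here are arbitrary placeholders):

```shell
# Create a CA key and a self-signed CA certificate
openssl genrsa -out ca-key.pem 2048
openssl req -x509 -new -key ca-key.pem -days 365 \
  -subj "/CN=MyLoggingCA" -out ca.pem

# Create the syslog server's key and certificate signing request
openssl genrsa -out server-key.pem 2048
openssl req -new -key server-key.pem \
  -subj "/CN=syslog.example.com" -out server.csr

# Sign the server certificate with the CA
openssl x509 -req -in server.csr -CA ca.pem -CAkey ca-key.pem \
  -CAcreateserial -days 365 -out server-cert.pem
```

The resulting ca.pem is what clients would point their CA file setting at; server-cert.pem and server-key.pem go on the receiving side.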

With the CA certificate(s) in hand, you can start configuring your syslog daemon. For example, the rsyslog configuration can look like this:

module(load="imuxsock")  # listens for local logs on /dev/log

global(  # global settings
  defaultNetstreamDriver="gtls"  # use the TLS driver when transporting over TCP
  defaultNetstreamDriverCAFile="/opt/rsyslog/ca_bundle.pem"  # CA certificate. Concatenate if you have more
)

action(  # how to send logs
  type="omfwd"                                    # forward them
  target=""                                       # to Logsene's syslog endpoint
  port="10514"                                    # on port X
  protocol="tcp"                                  # over TCP
  template="RSYSLOG_SyslogProtocol23Format"       # using the RFC-5424 syslog format
  StreamDriverMode="1"                            # via the TLS mode of the driver defined above
  StreamDriverAuthMode="x509/name"                # request the machine certificate of the server
  StreamDriverPermittedPeers="*"                  # and, based on it, just allow Sematext hosts
)

This is the new-style rsyslog configuration format, which works with version 6 or above. For the pre-v6 (BSD-style) format, check out the Logsene documentation. You can also find the syslog-ng equivalent there.

Server Setup

If you’re using Logsene, you might as well stop here, because it handles everything from buffering and indexing to parsing JSON-formatted syslog.

If you’re consolidating logs before sending them to Logsene, or you’re running your local setup, here’s an excellent end-to-end guide to setting up TLS with rsyslog. The basic steps for the server are:

  • use the same CA certificates as the client, so both sides trust the same authority
  • generate the machine public-private key pair. You’ll have to provide both in the rsyslog configuration
  • set up the TLS rsyslog configuration
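
The steps above can be sketched as a server-side rsyslog configuration (the paths, port, and auth mode are assumptions to adapt to your setup):

```
global(  # certificates generated earlier
  defaultNetstreamDriver="gtls"
  defaultNetstreamDriverCAFile="/path/to/ca.pem"             # same CA as the clients
  defaultNetstreamDriverCertFile="/path/to/server-cert.pem"  # machine public key
  defaultNetstreamDriverKeyFile="/path/to/server-key.pem"    # machine private key
)

module(load="imtcp"                  # TCP listener
       StreamDriver.Name="gtls"     # with the TLS driver
       StreamDriver.Mode="1"        # TLS-only mode
       StreamDriver.AuthMode="anon")  # accept any client; use "x509/name" plus PermittedPeer to restrict

input(type="imtcp" port="10514")     # listen on the same port the clients send to
```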


Once you start logging, the end result should be just like in part 1. You can use Logsene’s hosted Kibana, your own Kibana or the Logsene UI to explore your logs:

Logsene Screenshot

As always, feel free to contact us if you need any help:

Encrypting Logs on Their Way to Elasticsearch

Let’s assume you want to send your logs to Elasticsearch, so you can search or analyze them in real time. If your Elasticsearch cluster is in a remote location (EC2?) or is our log analytics service, Logsene (which exposes the Elasticsearch API), you might need to forward your data over an encrypted channel.

There’s more than one way to forward over SSL, and this post is part 1 of a series explaining how.

update: part 2 is now available!

Today’s method is about sending data over HTTPS to Elasticsearch (or Logsene), instead of plain HTTP. You’ll need two pieces to achieve this:

  1. a tool that can send logs over HTTPS
  2. the Elasticsearch REST API exposed over HTTPS

You can build your own tool or use existing ones. In this post we’ll show you how to use rsyslog’s Elasticsearch output to do that. For the API, you can use Nginx or Apache as a reverse proxy for HTTPS in front of your Elasticsearch, or you can use Logsene’s HTTPS endpoint:
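
For the reverse-proxy option, a minimal Nginx sketch (the server name, ports, and certificate paths are placeholder assumptions):

```
server {
    listen 443 ssl;
    server_name es.example.com;                  # hypothetical host name

    ssl_certificate     /etc/nginx/ssl/es.crt;   # your certificate
    ssl_certificate_key /etc/nginx/ssl/es.key;   # and its private key

    location / {
        proxy_pass http://localhost:9200;        # plain HTTP to the local Elasticsearch
        proxy_set_header Host $host;
    }
}
```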

Rsyslog Configuration

To get rsyslog’s omelasticsearch plugin, you need at least version 6.6. HTTPS support was just added to master, and it’s expected to land in version 8.2.0. Once that is up, you’ll be able to use the Ubuntu, Debian or RHEL/CentOS packages to install both the base rsyslog and the rsyslog-elasticsearch packages you need. Otherwise, you can always install from sources:
- clone from the rsyslog github repository
- run ` --enable-elasticsearch && make && make install` (depending on your system, it might ask for some dependencies)

With omelasticsearch in place (the om part comes from output module, if you’re wondering about the weird name), you can try the configuration below to take all your logs from your local /dev/log and forward them to Elasticsearch/Logsene:

# load needed input and output modules
module(load="imuxsock")         # listen to /dev/log
module(load="omelasticsearch")  # provides Elasticsearch output capability

# template that will build a JSON out of syslog
# properties. Resulting JSON will be in Logstash format
# so it plays nicely with Logsene and Kibana
template(name="plain-syslog"  # the name is arbitrary; it is referenced in the action below
         type="list") {
           constant(value="{")
           constant(value="\"@timestamp\":\"")
                 property(name="timereported" dateFormat="rfc3339")
           constant(value="\",\"syslogtag\":\"")
                 property(name="syslogtag" format="json")
           constant(value="\",\"message\":\"")
                 property(name="msg" format="json")
           constant(value="\"}")
         }

# send resulting JSON documents to Elasticsearch
action(type="omelasticsearch"
       template="plain-syslog"       # use the JSON template defined above
       searchIndex=""                # Elasticsearch index (or Logsene token)
       bulkmode="on"                 # bulk requests
       action.resumeRetryCount="-1"  # buffer and retry indefinitely if Elasticsearch is unreachable
       server=""                     # Elasticsearch/Logsene endpoint
       usehttps="on")                # send over HTTPS

Exploring Your Data

After restarting rsyslog, you should be able to see your logs flowing in the Logsene UI, where you can search and graph them:

Logsene Screenshot

If you prefer Logsene’s Kibana UI, or you run your own Elasticsearch cluster, you can make your own Kibana connect to the HTTPS endpoint just like rsyslog or Logsene’s native UI do.

Wrapping Up

If you’re using Logsene, all you need to do is to make sure you add your Logsene application token as the Elasticsearch index name in rsyslog’s configuration.

If you’re running your own Elasticsearch cluster, there are some nice tutorials about setting up reverse HTTPS proxies with Nginx and Apache respectively. You can also try Elasticsearch plugins that support HTTPS, such as the jetty and security plugins.

Feel free to contact us if you need any help. We’d be happy to answer any Logsene questions you may have, as well as help you with your local setup through professional services and production support. If you just find this stuff exciting, you may want to join us, wherever you are.

Stay tuned for part 2, which will show you how to use RFC-5425 TLS syslog to encrypt your messages from one syslog daemon to the other.

Video and Presentation: Indexing and Searching Logs with Elasticsearch or Solr

Interested in log indexing using Elasticsearch or Solr?  Also interested in searching and analyzing logs in real time?

This topic really hits home for us since we released our log analytics tool, Logsene and we also offer consulting services for logging infrastructure.  If you are reading this and looking for a new opportunity then you might be interested to hear that we are hiring worldwide.

If you are into logging like we are, then you will want to check out this presentation delivered by Sematext’s own Radu Gheorghe to the NYC Search, Discovery and Analytics Meetup held recently at Pivotal Labs.  For the purposes of this presentation the term “logs” ranges from server logs and application events to metrics and even social media information.

The presentation has three parts:

  1. Overview of logging tools that play nicely with Elasticsearch and Solr (like Logstash, Apache Flume or rsyslog)
  2. Performance tuning and scaling Elasticsearch and Solr
  3. Demo of an end-to-end solution

Here you go – enjoy!

Announcement: Coming Up in Site Search Analytics

Have you checked out Site Search Analytics yet?  If not, and if you think that gaining insight into user search behavior and experience is valuable information, then we’ve got something for you that’s battle-tested and ready to go.

This year we are adding some killer new features that will make SSA even more useful.  So, if you want to be enjoying benefits like:

  • Viewing real-time graphs showing search and click-through rates
  • Awareness of your top queries, top zero-hit queries, most seen and most clicked hits, etc.
  • Having a mechanism to perform search relevance A/B tests and a relevance feedback mechanism
  • Not having to develop, set up, manage or scale all the infrastructure needed for query and click log analysis
  • And many others — here is a full list of features and benefits

…then you will love the new functionality we have on the way.  After all, how can you improve search quality if you don’t measure it first and keep track of it?

Site Search Analytics

Sound interesting?  Then check out a live demo.  SSA is 100% focused on helping you to improve the search experience of your customers and prospects.  And a better search experience translates into more traffic to your web site and greater awareness of your business.

JOB: Professional Services Lead – Solr and Elasticsearch

We have a great opportunity at Sematext for a person who wants to take on the Professional Services Lead role and grow both themselves in this role and the whole Professional Services side of the house.  The person in this role will get to learn all aspects of the business, from engineering, to speaking with numerous clients and customers, to working with remote team members, even touching on sales and marketing.  This position offers a truly multifaceted view into Sematext and the space Sematext is in: a rich blend of search, big data, analytics, open source, products, services, engineering, and support.  The ideal candidate would already be in New York, where Sematext HQ is located, but we are open to people from other locations as well.

Responsibilities:

• Plan and coordinate customer engagements from business and technical perspectives
• Identify customer pain points, needs, and success criteria at the onset of each engagement
• Provide expert-level consulting and support services and strive to be a trusted advisor to a wide range of customers
• Resolve complex search issues involving Solr or Elasticsearch
• Identify opportunities to provide customers with additional value through our products or services
• Communicate high-value use cases and customer feedback to our Product teams
• Participate in the open source community by contributing bug fixes and improvements, answering questions, etc.

Qualifications:

• Experience working with Solr or Elasticsearch
• BS or higher in Engineering or Computer Science preferred
• 2 or more years of IT Consulting and/or Professional Services experience required
• Exposure to other related open source projects (Hadoop, Nutch, Kafka, Storm, Mahout, etc.) a plus
• Experience with other commercial and open source search technologies a plus
• Enterprise Search, eCommerce, and/or Business Intelligence experience a plus
• Experience working in a startup a plus

Interested? Please send your resume to

For other job openings please see Jobs @ Sematext or even our previous job listings.

Announcement: Percentiles added to SPM

In the spirit of continuous improvement, we are happy to announce that percentiles have recently been added to SPM’s arsenal of measurement tools.  Percentiles paint a more accurate picture than averages, which can mask outliers and spikes.  Users can see the 50th, 95th and 99th percentiles for specific metrics and set both regular threshold-based alerts and anomaly detection alerts on them.  We will go into the details of how the percentiles are computed in another post, but for now we want to put the word out and show some of the related graphs — click on them to enlarge them.  Enjoy!

Elasticsearch – Request Rate and Latency


Garbage Collectors Time


Kafka – Flush Time


Kafka – Fetch/Produce Latency 1


Kafka – Fetch/Produce Latency 2


Solr Req. Rate and Latency 1


Solr – Req. Rate and Latency 2


If you enjoy performance monitoring, log analytics, or search analytics, working with projects like Elasticsearch, Solr, HBase, Hadoop, Kafka, Storm, we’re hiring planet-wide!

Meetup: Indexing and Searching Logs with Elasticsearch and Solr

If you are into logging and search like we are, and if you are in New York, like some of us are, come to Indexing and Searching Logs with Elasticsearch and Solr on Wednesday at Pivotal Labs office in Manhattan.

Announcement: Redis Monitoring in SPM

Don’t worry, we didn’t just stop at Storm monitoring and metrics while improving SPM.  We’re also happy to announce support for Redis.

Specifically, here are some of the key Redis metrics SPM monitors:

  • Used Memory
  • Used Memory Peak
  • Used Memory RSS
  • Connected Clients
  • Connected Slaves
  • Master Last IO Seconds Ago
  • Keyspace Hits
  • Keyspace Misses
  • Evicted Keys
  • Expired Keys
  • Commands Processed
  • Keys count per db
  • To-be-expired keys count per db

Also, for all application types, users can add alerting rules, heartbeat alerts, and Algolerts, as well as receive emails with performance reports for a given time period.

Enough with the words; here is what the graphs look like (click them to enlarge):

Used memory/Used memory peak/Used memory RSS chart



Keyspace Hits chart



Expiring Keys chart



Evicted Keys chart

And we’re not done.  Watch this space for more SPM updates coming soon…

Give SPM a spin – it’s free to get going and you’ll have it up and running, graphing all your Redis metrics in 5 minutes!


