Solr Presentations from Lucene/Solr Revolution 2014

Thanks to everyone who stopped by the Sematext booth at last week’s Lucene/Solr Revolution event in Washington, DC and attended our two talks, “Tuning Solr for Logs” by Radu Gheorghe and “Solr Anti-Patterns” by Rafał Kuć.

The attendance, questions and interest are very much appreciated. As a company that prides itself on its Solr expertise (and Elasticsearch expertise too, for that matter), it was nice to spend a couple of days talking about search and Big Data challenges, performance monitoring and logging with fellow experts from around the world. Here are the slides for the two talks we gave (summaries of the talks can be found here):

 

Videos of the talks will be posted here soon. Hope to see everyone again next year!

Sematext at Lucene/Solr Revolution 2014

Going to Lucene/Solr Revolution next week (November 11-14) in Washington, DC? If so, Sematext will be there exhibiting AND giving two talks! If you are going, stop by our table to say hello. We can show you the latest versions of SPM Performance Monitoring, Logsene Log Management and Analytics, and Site Search Analytics, and, of course, talk about metrics, centralized log management, Lucene, Solr, Elasticsearch, and just about any other search-related topic you might be interested in. After all, not only have we blogged, given talks and spread the word in all sorts of ways, we’ve also written books on these subjects!

Both talks by Sematext engineers take place on Friday, November 14. They are:

Radu Gheorghe will talk about “Tuning Solr for Logs” at 10:15 am

Summary: Performance tuning is always nice for keeping your applications snappy and your costs down. This is especially the case for logs, social media and other stream-like data that can easily grow into terabyte territory. While you can always use SolrCloud to scale out of performance issues, this talk is about optimizing. It will answer the following questions about Solr settings: How often should you commit and merge? How can you have one collection per day/month/year/etc.? What are the performance trade-offs of these options? There will also be a discussion of choosing the appropriate hardware. Finally, Radu will talk about optimizing the infrastructure that pushes logs to Solr. This includes tuning Apache Flume to handle large flows of logs, as well as overall design options that also apply to other shippers, like Logstash.

Rafał Kuć will talk about “Solr Anti-Patterns” at 10:55 am

Summary: Working as a consultant and software engineer, and helping people in various other ways, Rafał has seen many patterns in how Solr is used and how it should be used. Consulting on best practices is common, but talking about what NOT to do is not. This talk will point out common mistakes and roads that should be avoided at all costs. It covers use cases and guidelines around general configuration pitfalls, data modeling and what to avoid when making your data indexable, and mistakes made when querying and searching indexed data. Each use case will be illustrated with a before-and-after analysis showing the changes in metrics, so the lessons are worth remembering.

20% Discount Code

If you currently use a Sematext product or have been a client in the past and want to go, drop us a line for more info.

Hope to see you in DC!

Job: Sematext is hiring – Elasticsearch Engineer

The Sematext team is more distributed than your average Elasticsearch cluster and, trust me, we’ve seen a good portion of the world’s Elasticsearch clusters. The thing with Elasticsearch clusters is that they often get new nodes added and keep expanding to handle more data and more queries. Similarly, we are looking to add a new node to the Sematext team so we can reshard our work a bit, distribute it more evenly, and scale further. In plain English, we are looking for an Engineer who loves working with Elasticsearch and with large volumes of data, who enjoys a wide variety of projects and challenges involving large-scale data processing, high-volume indexing and high query rates, who likes working with our clients, and who wants to make Logsene and SPM the killer log management and monitoring platforms. Advanced knowledge of Elasticsearch is less important than a passion to learn and build, a positive attitude, and the ability to make decisions, work both independently and with the rest of the team, communicate well, and simply be a good person. We can teach you everything about Elasticsearch and turn you into a bonsai-tree-loving Elasticsearch samurai, but we need you to be all these other things.

As a member of our team you will get to:

  • Work with world-class search experts
  • Design and implement systems (both our own and our clients’) that process tens of thousands of queries per second and handle billions of documents, logs, data points, etc.
  • Interact with clients and customers world-wide
  • Provide guidance, architecture design, implementation, and production support around Elasticsearch
  • Participate in and contribute to open-source (we’ve contributed to Solr, Lucene, HBase, Flume, rsyslog, Logstash, etc.)
  • Share your knowledge with clients, at conferences and unconferences, in the online community, etc.

This position:

  • Offers a lot of independence, learning, and growth
  • Is open to applicants “west of New York City” (this could be South, Central, or North America, of course), though we’ll happily make an exception if you persuade us that we should!

Our search team members have written several books about search, regularly give talks at conferences, blog, and participate in open-source projects.  For more info, see 19 things you may like about Sematext.

Interested? Please send your resume to jobs@sematext.com.

For other job openings please see Jobs @ Sematext or even our previous job listings.

Sematext in GooglePlus

Quick shout-out to all G+ fans — you can find us in G+, too, and follow us there if you prefer that over the more traditional blog subscription: https://plus.google.com/+SematextGroup

Of course, @sematext is an option, too!

Two Lucene/Solr Revolution 2014 Talks Accepted!

We recently got word from Lucene/Solr Revolution 2014 (in Washington, DC from Nov. 11-14) that talks submitted by two Sematext engineers were accepted as part of the Tutorial track!  They are:

In “Tuning Solr for Logs” Radu will discuss Solr settings, hardware options and optimizing the infrastructure that pushes logs to Solr.

In “Solr Anti-Patterns” Rafal will point out common Solr mistakes and roads that should be avoided at all costs.  Each of the talk’s use cases will be illustrated with a before and after analysis — including changes in metrics.

You can see more details about both talks in this recent blog post.

The full agenda, including dates and times for the talks, will be available soon on the Lucene/Solr Revolution 2014 web site.

If you do attend one of these talks please stop by and say hello to Radu and Rafal.  Not only do they know Solr inside and out, but they are good guys as well!

Love Solr Enough to Even Want to Attend One of These Talks?

If you enjoy Solr enough to even think of attending these talks — and you’re looking for a new opportunity — then Sematext might be the place for you.  We’re hiring planet-wide and currently looking for Solr and Elasticsearch Engineers, Front end and JavaScript Developers, Developer Evangelists, Full-stack Engineers, and Mobile App Developers.

JOB: Elasticsearch / Lucene Engineer (starts in the Netherlands)

In addition to looking for an Elasticsearch / Solr Engineer to join the Sematext team, we are also looking for a Lucene / Elasticsearch Engineer in the EU for a specific project. This project calls for 6 months of on-site work with our client in the Netherlands. After 6 months, the collaboration with our client would continue remotely if there is more work to be done for the client. If the client project(s) are over, this person would instead join our global team of Engineers and Search Consultants and work remotely (we are all very distributed over several countries and continents). This is a position focused on search: it involves working with Elasticsearch, but also requires enough understanding of Lucene to write custom Elasticsearch/Lucene components, such as tokenizers. Here are some of the skills one should have for this job:

  •  knowledge of different types of Lucene queries/filters (boolean, spans, etc.) and their capabilities
  •  experience in extending out-of-the-box Lucene functionality via developing custom queries, scorers, collectors
  •  understanding of Lucene document analysis in the process of indexing, experience in writing custom analyzers
  •  experience in mapping advanced hierarchical data structures to Lucene fields
  •  experience in scalable distributed open-source search technologies such as Elasticsearch or Solr

The above is not much information to go by, but if this piqued your interest and if you think you are a good match, please fix up your resume and send it to jobs@sematext.com quickly.

Fall Internship at Sematext

This coming Fall 2014 we will have 2 open positions for enthusiastic students world-wide interested in spending a semester working with Sematext.  The positions will involve work on Performance Monitoring, Logging, Alerting, and Anomaly Detection, all of which are part of SPM and Logsene.  SPM and Logsene involve work with HBase and/or Elasticsearch, custom data processing components, metric aggregation, log aggregation, Kafka, Machine Learning and Data Mining algorithms, JavaScript, visualizations, reporting, and so on.

Sematext HQ is in Brooklyn, NY, USA, but we are a very geographically distributed organization whose members are spread over several countries and continents.  As such, we welcome students from all across the globe to do their internship with us from wherever they are.  Internship positions are available year-round, but are subject to student demand and our capacity.

You can check out our products, our services, our clients, our team, and if you’d like to be a part of it, please contact us and tell us about yourself.