JOB: Elasticsearch / Solr Engineer

We’ve grown nicely this year.  Our team has a new UI Developer, a new Solr/Elasticsearch Engineer, a new Marketing person, a new Automation Engineer, and this summer we have our first-ever intern.

Like all healthy organizations, we keep growing, and we are now looking for good Search Engineers who know Elasticsearch and/or Solr to join our geographically distributed search consulting team.  You will work remotely, from wherever you are, with smart people spread out across the planet and with an amazing array of companies world-wide on projects that range from just a week or two to several months.

At Sematext, we’ve built several exciting products – from smaller, search-focused products that work with Solr and Elasticsearch, to larger ones like SPM, Search Analytics, and most recently Logsene.  While not building products and running services, we help organizations world-wide with their search and big data needs – from fixing issues and providing production support to building complex search systems from scratch.  Our client list is long with a number of household names on it – from Instagram (Facebook) and Tumblr (Yahoo), Etsy and Shutterstock, to The BBC, Elsevier, Lockheed Martin, Reuters, Library of Congress, etc.  We did this without raising any money.  The demand for our products and services is growing and we are looking for good engineers and good people to join our adventure!

More formally:

Sematext is looking for a responsible, professional individual to join our team of search engineers.

Sematext is a New York-based startup with people spread over multiple continents and several hundred customers from Instagram and Tumblr, Etsy and Shutterstock, to The BBC, Elsevier, Lockheed Martin, Reuters, Library of Congress, etc. We’ve built systems handling over 10,000 QPS and have worked with multi-billion document indices. Our core products are:

In addition to the above products we offer consulting services around open source search and big data.

We are looking for a person who is:

  • Enthusiastic and positive
  • Driven, independent, and professional
  • A good communicator, both written and oral
  • Good with Solr and/or Elasticsearch and hungry to learn more
  • Happy to help organizations make the most of search

As a member of our search team you will get to:

  • Interact with clients world-wide
  • Provide guidance, architecture design, implementation, and support
  • Participate in Solr, Lucene, and Elasticsearch user and development communities
  • Work on Sematext’s search and data analytics products and participate in open-source search projects

This position:

  • Offers a lot of independence, learning, and growth
  • May require a bit of travel here and there, typically in the US and Europe
  • Is open world-wide

Our search team members have written several books about search, regularly give talks at conferences, blog, and participate in open-source projects.
For more info, see 19 things you may like about Sematext.

Interested? Please send your resume to jobs@sematext.com.

For other job openings please see Jobs @ Sematext or even our previous job listings.

Community Voting for Sematext Talks at Lucene/Solr Revolution 2014

The biggest open source conference dedicated to Apache Lucene/Solr takes place in November in Washington, DC.  If you are planning to attend, and even if you are not, you can help improve the conference’s content by voting for your favorite talk topics.  The top vote-getters for each track will be added to the Lucene/Solr Revolution 2014 agenda.

Not surprisingly for one of the leading Lucene/Solr products and services organizations, Sematext has two contenders in the Tutorial track:

We’d love your support to help us contribute our expertise to this year’s conference.  To vote, simply click on the above talk links and you’ll see a “Vote” button in the upper left corner.  That’s it!

To give you a better sense of what Radu and Rafal would like to present, here are their talk summaries:

Tuning Solr for Logs – by Radu Gheorghe

Performance tuning is always nice for keeping your applications snappy and your costs down. This is especially the case for logs, social media and other stream-like data that can easily grow into terabyte territory.

While you can always use SolrCloud to scale out of performance issues, this talk is about optimizing. First, we’ll talk about Solr settings by answering the following questions:

  • How often should you commit and merge?
  • How can you have one collection per day/month/year/etc?
  • What are the performance trade-offs for these options?
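The "one collection per day/month/year" question above comes down to routing each log event to a collection chosen by its timestamp. Here is a minimal sketch of that routing step, assuming a made-up naming scheme of `<prefix>_<period>`; the function name `collection_for`, the `logs` prefix, and the date formats are illustrative assumptions, not something taken from the talk or from Solr itself.

```python
from datetime import datetime, timezone

# Hypothetical naming scheme: "<prefix>_<period>", e.g. logs_2014-05-27.
# Coarser granularity means fewer, larger collections.
FORMATS = {"day": "%Y-%m-%d", "month": "%Y-%m", "year": "%Y"}

def collection_for(timestamp, granularity="day", prefix="logs"):
    """Return the name of the time-based collection a log event belongs to."""
    return f"{prefix}_{timestamp.strftime(FORMATS[granularity])}"

event_time = datetime(2014, 5, 27, 11, 30, tzinfo=timezone.utc)
print(collection_for(event_time))           # logs_2014-05-27
print(collection_for(event_time, "month"))  # logs_2014-05
print(collection_for(event_time, "year"))   # logs_2014
```

With a scheme like this, old data can be dropped by deleting whole collections, and searches over recent logs only need to touch the newest ones; the trade-off is that finer granularity means more collections to manage.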

Then, we’ll turn to hardware. We know SSDs are fast, especially on cold-cache searches, but are they worth the price? We’ll give you some numbers and let you decide what’s best for your use case.

The last part is about optimizing the infrastructure pushing logs to Solr. We’ll talk about tuning Apache Flume for handling large flows of logs and about overall design options that also apply to other shippers, like Logstash. As always, there are trade-offs, and we’ll discuss the pros and cons of each option.

Solr Anti-Patterns – by Rafal Kuc

Working as consultants and software engineers, and helping people in various ways, we see many patterns in how Solr is used and how it should be used. We usually say what should be done, but we rarely point out which paths should not be taken, and why. That’s why I would like to point out common mistakes and roads that should be avoided at all costs. During the talk I would like not only to show the bad patterns, but also to show the difference before and after.

The talk is divided into three major sections:

  1. We will start with general configuration pitfalls that people commonly fall into. We will discuss different use cases, showing the proper path that one should take
  2. Next we will focus on data modeling and what to avoid when making your data indexable. Again we will see real-life use cases, followed by a description of how to handle them properly
  3. Finally we will talk about queries and all the juicy mistakes when it comes to searching for indexed data

Each use case shown will be illustrated by a before-and-after analysis – we will see how the metrics change, so the talk will bring not only pure facts, but hopefully also know-how worth remembering.

Thank you for your support!

Presentation and Video: Side by Side with Solr and Elasticsearch

We are fresh from Berlin Buzzwords, where Sematext’s own Radu Gheorghe and Rafal Kuc presented “Side by Side with Solr and Elasticsearch” on the same stage, at the same time…but in different colors.  The talk included live demos, graphing, stats, and hints at juicy things to come.  Needless to say, if you deal with Solr and Elasticsearch then there are great insights to be found here!

Here is the presentation:

 

And here is the video:

 

Want to Be on Stage Somewhere Like Radu and Rafal Talking About Solr and Elasticsearch?

Or maybe you don’t want the spotlight — that’s cool too.  But…if you do enjoy performance monitoring, log analytics, or search analytics, working with projects like Elasticsearch, Solr, HBase, Hadoop, Kafka, and Storm, then drop us a line.  We’re hiring planet-wide!  Front end and JavaScript Developers, Developer Evangelists, Full-stack Engineers, Mobile App Developers…get in touch!

Enjoy!

Podcast: Tools to Monitor Solr, Manage Logs & Analyze Search Trends

Sematext Founder & President Otis Gospodnetic recently spoke with LucidWorks Chief of Product, Will Hayes, as part of their SolrCluster podcast series.  Otis and Will discussed tools that Sematext has built to help monitor Solr and other stacks, manage and analyze logs, and analyze search trends.  They also discussed Solr/SolrCloud and Elasticsearch, their APIs and developer friendliness, as well as the general direction in which search and big data industry leaders are moving on data acquisition and discovery as data volumes keep growing.

Go here to listen to the podcast.  It runs about 36 minutes.  Enjoy!

Berlin Buzzwords 2014 – Side by Side with Elasticsearch and Solr

Last year at Berlin Buzzwords two Sematext engineers had the opportunity to give two talks. Radu talked about “JSON Logging with Elasticsearch” (video, slides) and Rafał did the second round of Solr vs Elasticsearch in his talk “Battle of the Giants, round 2” (video, slides). We were also happy to be sponsoring Berlin Buzzwords 2013. This year, we decided to go for a talk where two of us can talk on the same stage, at the same time. On Tuesday, 27th of May, at 11:30, in the Frannz Club, Radu and Rafał will be giving a talk called “Side by side with Solr and Elasticsearch”.


Solr – an established, mature, well-known and commonly used open-source search server. Elasticsearch – still young, but quickly gaining popularity, with over 200k downloads per month. Both search servers are based on Lucene – the open-source full-text search Java library – but each has its own extensions, and its own pros and cons.

We all know that Solr and Elasticsearch are different, but what those differences are and which solution is the best fit for a particular use case is a frequent question. We will try to make those differences clear, not by showing slides and comparing them, but by showing an online demo of both Elasticsearch and Solr:

  • Set up and start both search servers. See what you need to prepare and launch Solr and Elasticsearch.
  • Index data right after the server is started, using the “schemaless” mode
  • Create index structure and modify it using the provided API
  • Explore different query use cases
  • Scale by adding and removing nodes from the cluster, creating indices and managing shards. See how that affects data indexing and querying.
  • Monitor and administer clusters.  See what metrics can be seen out of the box, how to get them and what tools can provide you with the graphical view of all the goodies that each search server can provide.

If you want to hear about both Solr and Elasticsearch from @sematext, see how to achieve similar things in each and how they behave, and not sit through too many slides, come join us :)

Parameterizing Queries in Solr and Elasticsearch

We all know how good it is to have abstraction layers in the software we create. We abstract implementation from method contracts using interfaces, and we use n-tier architectures so that we can separate different system layers from each other. This is very good – when we change one piece, we don’t need to touch the other parts that only know about method contracts, APIs, etc. Why not do the same with search queries? Can we even do that in Elasticsearch and Solr? We can, and I’ll show you how to do that.
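To make the idea concrete, here is a minimal sketch of separating a query's structure from its parameters. It is only an illustration of the abstraction in plain Python, not the mechanism that Solr or Elasticsearch provide; the `TEMPLATE` layout, the `{{name}}` placeholder convention as used here, and the `render` helper are all assumptions made for this example.

```python
import json

# Hypothetical query template: the structure lives in one place,
# and callers only supply named parameters.
TEMPLATE = {
    "query": {"match": {"{{field}}": "{{text}}"}},
    "size": "{{size}}",
}

def render(node, params):
    """Recursively replace whole-string "{{name}}" placeholders with params[name]."""
    if isinstance(node, dict):
        return {render(k, params): render(v, params) for k, v in node.items()}
    if isinstance(node, list):
        return [render(item, params) for item in node]
    if isinstance(node, str) and node.startswith("{{") and node.endswith("}}"):
        return params[node[2:-2]]
    return node

body = render(TEMPLATE, {"field": "title", "text": "solr", "size": 10})
print(json.dumps(body, sort_keys=True))
```

The application only knows the parameter names (`field`, `text`, `size`); the query structure can change, for example to a different query type, without touching the calling code.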


Video and Presentation: Indexing and Searching Logs with Elasticsearch or Solr

Interested in log indexing using Elasticsearch or Solr?  Also interested in searching and analyzing logs in real time?

This topic really hits home for us since we released our log analytics tool, Logsene, and we also offer consulting services for logging infrastructure.  If you are reading this and looking for a new opportunity, you might be interested to hear that we are hiring worldwide.

If you are into logging like we are, then you will want to check out this presentation delivered by Sematext’s own Radu Gheorghe to the NYC Search, Discovery and Analytics Meetup held recently at Pivotal Labs.  For the purposes of this presentation the term “logs” ranges from server logs and application events to metrics and even social media information.

The presentation has three parts:

  1. Overview of logging tools that play nicely with Elasticsearch and Solr (like Logstash, Apache Flume or rsyslog)
  2. Performance tuning and scaling Elasticsearch and Solr
  3. Demo of an end-to-end solution

Here you go – enjoy!

JOB: Professional Services Lead – Solr and Elasticsearch

We have a great opportunity at Sematext for a person who wants to take on the Professional Services Lead role, grow in that role, and grow the whole Professional Services side of the house.  The person in this role will get to learn all aspects of the business from engineering, to speaking with numerous clients and customers, to working with remote team members, even touching on sales and marketing.  This position offers a truly multifaceted view into Sematext and the space that Sematext is in, which is a rich blend of search, big data, analytics, open source, products, services, engineering, support, etc.  The ideal candidate would already be in New York, where Sematext HQ is located, but we are open to people from other locations as well.

REQUIREMENTS
• Experience working with Solr or Elasticsearch
• Plan and coordinate customer engagements from business and technical perspective
• Identify customer pain points, needs, and success criteria at the onset of each engagement
• Provide expert-level consulting and support services and strive to be a trustworthy advisor to a wide range of customers
• Resolve complex search issues involving Solr or Elasticsearch
• Identify opportunities to provide customers with additional value through our products or services
• Communicate high-value use cases and customer feedback to our Product teams
• Participate in open source community by contributing bug fixes, improvements, answering questions, etc.

EXPERIENCE
• BS or higher in Engineering or Computer Science preferred
• 2 or more years of IT Consulting and/or Professional Services experience required
• Exposure to other related open source projects (Hadoop, Nutch, Kafka, Storm, Mahout, etc.) a plus
• Experience with other commercial and open source search technologies a plus
• Enterprise Search, eCommerce, and/or Business Intelligence experience a plus
• Experience working in a startup a plus

Interested? Please send your resume to jobs@sematext.com.

For other job openings please see Jobs @ Sematext or even our previous job listings.

Meetup: Indexing and Searching Logs with Elasticsearch and Solr

If you are into logging and search like we are, and if you are in New York, like some of us are, come to Indexing and Searching Logs with Elasticsearch and Solr on Wednesday at the Pivotal Labs office in Manhattan.

Video: Scaling Solr with SolrCloud

During last year’s Lucene Revolution conference in Dublin we had the opportunity to give four talks, one of which was Scaling Solr with SolrCloud. In it we wanted to share our experience with scaling Solr, both from running Solr internally and from working as a team of search consultants.  Enjoy the video and/or the slides!

Note: we are looking for engineers passionate about search to join our professional services team.  We’re hiring planet-wide!
