Parameterizing Queries in Solr and Elasticsearch

We all know how useful abstraction layers are in the software we create. We separate implementation from method contracts using interfaces, and we use n-tier architectures to divide the system layers from each other. This is very good: when we change one piece, we don't need to touch the other parts that only know about method contracts, APIs, etc. Why not do the same with search queries? Can we even do that in Elasticsearch and Solr? We can, and I'll show you how.

The problem

Imagine that we have a complicated query, with boosts, sorts, facets and so on. In most cases, however, the structure of the query is pretty static, and the only things that change are the value of one of the filters and the query entered by the user. I guess such a situation will ring a bell for anyone who has developed a search application. Of course, we can include the whole query in the application itself and reuse it. But in that case a change to the boosts, for example, requires us to redeploy the application or a configuration file. And if more than one application uses the same query, then we need to change them all.

What if we could make the change on the search server side only and let the application pass only the necessary data? That would be nice, but it requires us to do some work on the search server side.

For the purpose of the blog post, let’s assume that we want to have a query that:

  • searches for documents with terms entered by the user,
  • limits the searches to a given category,
  • displays facet results for the price ranges

This is a simple example, so that the queries are easy to understand. So, in a perfect world we would only need to provide the user query and the category identifier to the search engine.

Making it happen with Solr

When it comes to Apache Solr, we will use local params and the possibility of dereferencing parameters. Our initial query could look like this:

/solr/select?q=blue+jeans&qf=name^100+description^10&defType=edismax&fq=categoryId:12&facet=true&facet.query=price:[*+TO+30]&facet.query=price:[30+TO+100]&facet.query=price:[100+TO+*]

As you can see, the query is as simple as it can be. Now, let's change it so that we only need to provide the query and the category identifier. The first thing we need to do is create a new handler that will be able to handle our queries. Such a handler will include our query configuration and could look like this:

<requestHandler name="/search" class="solr.StandardRequestHandler">
 <lst name="defaults">
  <str name="q">_query_:"{!edismax qf=$queryFields v=$userQuery}"</str>
  <str name="queryFields">name^100 description^10</str>
  <str name="df">name</str>
  <str name="q.op">OR</str>
  <bool name="facet">true</bool>
  <str name="facet.query">price:[* TO 30]</str>
  <str name="facet.query">price:[30 TO 100]</str>
  <str name="facet.query">price:[100 TO *]</str>
 </lst>
</requestHandler>

And our changed query would look like this:

/solr/search?userQuery=blue+jeans&fq=categoryId:12

As you can see our query was simplified. What’s worth looking at is that we are now using the /search handler that we’ve created. In addition to that, instead of passing the usual query, we now only send the userQuery which holds the query entered by the user and the filter for category identifier. We use the $userQuery in request handler configuration to get the query value and pass it to Solr. Simple as that :)
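On the application side, the simplified request can be built in a few lines. The following is only a minimal sketch in Python: the host, port and the build_search_url helper are assumptions made for illustration, and only the /search path and the userQuery and fq parameters come from the example above.

```python
from urllib.parse import urlencode

# Assumed base URL; adjust host, port and core name to your setup.
SOLR_BASE = "http://localhost:8983/solr"


def build_search_url(user_query: str, category_id: int) -> str:
    """Build a request URL for the /search handler (hypothetical helper)."""
    params = [
        ("userQuery", user_query),            # referenced as $userQuery in the handler
        ("fq", f"categoryId:{category_id}"),  # the only filter the client supplies
    ]
    # urlencode escapes spaces as '+' and ':' as '%3A'.
    return f"{SOLR_BASE}/search?{urlencode(params)}"


print(build_search_url("blue jeans", 12))
# http://localhost:8983/solr/search?userQuery=blue+jeans&fq=categoryId%3A12
```

Note that the boosts, the query parser and the facets no longer appear in the application at all. They live only in the handler configuration.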

Making it happen with Elasticsearch

To achieve the same with Elasticsearch, we will use the 1.1.0 version and the query templates functionality. Let’s assume that our query looks as follows:

{
 "query": {
  "filtered": {
   "query": {
    "multi_match" : {
     "query": "blue jeans",
     "fields": [ "name^100", "description^10" ]
    }
   },
   "filter": {
    "term" : {
     "categoryId": 12
    }
   }
  }
 },
 "facets": {
  "price": {
   "range": {
    "field": "price",
    "ranges": [
     {
      "to": 30
     },
     {
      "from": 30,
      "to": 100
     },
     {
      "from": 100
     }
    ]
   }
  }
 }
}

Now, let's change it and use the newly introduced template query. We will create a new file called shopQuery.mustache in the $ES_HOME/config/scripts folder. The template file name must end with the .mustache extension. The name of the template that we will use is the file name without the extension, so in our case it will be shopQuery. The contents of the template file will be as follows:

{
 "query": {
  "filtered": {
   "query": {
    "multi_match" : {
     "query": "{{userQuery}}",
     "fields": [ "name^100", "description^10" ]
    }
   },
   "filter": {
    "term" : {
     "categoryId": {{userCategory}}
    }
   }
  }
 },
 "facets": {
  "price": {
   "range": {
    "field": "price",
    "ranges": [
     {
      "to": 30
     },
     {
      "from": 30,
      "to": 100
     },
     {
      "from": 100
     }
    ]
   }
  }
 }
}

As you can see, we’ve changed the query to {{userQuery}} and category identifier value to {{userCategory}}. After restarting Elasticsearch we can use the following query to use the template:
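To get a feeling for what the substitution does, here is a toy sketch in Python. It is not how Elasticsearch renders templates internally and it is not a full Mustache implementation: the render function is a hypothetical helper that only replaces {{name}} placeholders and does no escaping. The template is a shortened copy of the one above, with the facets left out for brevity.

```python
import json
import re

# Shortened version of the shopQuery template (facets omitted for brevity).
TEMPLATE = """
{
 "query": {
  "filtered": {
   "query": {
    "multi_match": {
     "query": "{{userQuery}}",
     "fields": [ "name^100", "description^10" ]
    }
   },
   "filter": {
    "term": { "categoryId": {{userCategory}} }
   }
  }
 }
}
"""


def render(template: str, params: dict) -> str:
    """Replace every {{name}} with str(params[name]); KeyError if a name is missing."""
    return re.sub(r"\{\{(\w+)\}\}", lambda m: str(params[m.group(1)]), template)


# After substitution the text is plain JSON, so we can parse it to check the result.
rendered = json.loads(render(TEMPLATE, {"userQuery": "blue jeans", "userCategory": 12}))
print(rendered["query"]["filtered"]["query"]["multi_match"]["query"])  # blue jeans
print(rendered["query"]["filtered"]["filter"]["term"]["categoryId"])   # 12
```

The template itself is not valid JSON until the placeholders are filled in, which is why {{userCategory}} can appear without quotes.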

curl -XGET "http://localhost:9200/_search/template" -d '{
 "template": "shopQuery",
 "params": {
  "userQuery": "blue jeans",
  "userCategory": 12
 }
}'

We’ve used the /_search/template end-point and we’ve specified the template name by using the template property in the query. In addition to that we’ve specified two parameters in the params section. Seems way better, right?
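The same request can be prepared from application code. The sketch below uses only the Python standard library and mirrors the curl call above. The build_template_request helper and the Content-Type header are assumptions added for illustration, and the request is only built here, not sent.

```python
import json
import urllib.request

# Same endpoint as in the curl example; host and port are assumptions.
ES_URL = "http://localhost:9200/_search/template"


def build_template_request(user_query: str, category_id: int) -> urllib.request.Request:
    """Prepare a search template request (hypothetical helper)."""
    body = {
        "template": "shopQuery",  # file name of the template without .mustache
        "params": {"userQuery": user_query, "userCategory": category_id},
    }
    return urllib.request.Request(
        ES_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="GET",  # matches the -XGET used with curl
    )


request = build_template_request("blue jeans", 12)
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
# To actually send it (needs a running node): urllib.request.urlopen(request)
```

The application only knows the template name and the two parameters, so the query structure can change on the server without touching this code.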

Summary

As we can see, we can simplify querying for both Solr and Elasticsearch. However, not everything is perfect here. In the case of Solr, we need to reload the configuration of the core or collection for the changes to be visible. So each time we would like to modify the query or add a new one, we need to alter the configuration and reload it. The same goes for Elasticsearch, at least in the described 1.1.0 version: we need to restart a node for the template to be visible. On the other hand, search templates in Elasticsearch seem to be more flexible, because we can put a parameter in place of any value (and not only for values). I suppose the restart requirement will change in the future and we will be able to use the API to push the templates, just like we can with warming queries. In that case, the templates will be easier to use and maintain.
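For Solr, the reload mentioned above can be triggered over HTTP with the RELOAD action of the Core Admin API or, in SolrCloud, of the Collections API. Here is a minimal sketch in Python that only builds the URLs. The host, port and the name "shop" are assumptions made for illustration.

```python
from urllib.parse import urlencode

# Assumed base URL; adjust host and port to your setup.
SOLR_BASE = "http://localhost:8983/solr"


def core_reload_url(core: str) -> str:
    """URL that reloads a single core through the Core Admin API."""
    return f"{SOLR_BASE}/admin/cores?{urlencode({'action': 'RELOAD', 'core': core})}"


def collection_reload_url(collection: str) -> str:
    """URL that reloads a SolrCloud collection through the Collections API."""
    return f"{SOLR_BASE}/admin/collections?{urlencode({'action': 'RELOAD', 'name': collection})}"


# "shop" is a hypothetical core/collection name.
print(core_reload_url("shop"))
print(collection_reload_url("shop"))
```

Calling one of these URLs after editing the handler configuration makes the new query definition visible without restarting Solr.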

About Rafał Kuć
Sematext engineer, books author, speaker.

3 Responses to Parameterizing Queries in Solr and Elasticsearch

  1. mehielgr says:

    I’ve used query templates in ES 1.1.0 and it makes your querying at least more elegant.

    But in the end we end up facing the same problem as with stored procedures in databases in the past. We have business logic in the ES part and we need extra steps for the search server configuration.

    IMHO this feature should be used when things go complex and should be followed by a good deployment strategy (probably for ES, pushing those templates will be a solution to this).

    • Rafał Kuć says:

      I agree that queries are part of the business logic, and ES will be another point to take care of. However, I wouldn't compare templates to stored procedures: we can't send mails from templates, and I've seen such deployments :)

      • mehielgr says:

        Sad but true. I’ve seen some horror cases like that too. That’s why I’m concerned even though I totally agree that they’re not the same beast as stored procs.
