Understanding the default ElasticPress/Elasticsearch search query

For both the Post Search and Autosuggest features, ElasticPress performs the same default Elasticsearch query. You can customize this query by changing the settings under ElasticPress -> Weighting Engine in the WordPress Dashboard, or by using the hooks and filters available while the query is built.

Breaking Down the Default Query

On a generic WordPress site with Posts and Pages available for search, and with no changes made via filters or the Weighting Engine, the default query generated by ElasticPress looks like this:

JSON
{
    "from": 0,
    "size": 10,
    "sort": [
        {
            "_score": {
                "order": "desc"
            }
        }
    ],
    "query": {
        "function_score": {
            "query": {
                "bool": {
                    "should": [
                        {
                            "bool": {
                                "must": [
                                    {
                                        "bool": {
                                            "should": [
                                                {
                                                    "multi_match": {
                                                        "query": "elasticpress for wordpress search",
                                                        "type": "phrase",
                                                        "fields": [
                                                            "post_title^1",
                                                            "post_excerpt^1",
                                                            "post_content^1"
                                                        ],
                                                        "boost": 3
                                                    }
                                                },
                                                {
                                                    "multi_match": {
                                                        "query": "elasticpress for wordpress search",
                                                        "fields": [
                                                            "post_title^1",
                                                            "post_excerpt^1",
                                                            "post_content^1",
                                                            "post_title.suggest^1"
                                                        ],
                                                        "type": "phrase",
                                                        "slop": 5
                                                    }
                                                }
                                            ]
                                        }
                                    }
                                ],
                                "filter": [
                                    {
                                        "match": {
                                            "post_type.raw": "post"
                                        }
                                    }
                                ]
                            }
                        },
                        {
                            "bool": {
                                "must": [
                                    {
                                        "bool": {
                                            "should": [
                                                {
                                                    "multi_match": {
                                                        "query": "elasticpress for wordpress search",
                                                        "type": "phrase",
                                                        "fields": [
                                                            "post_title^1",
                                                            "post_excerpt^1",
                                                            "post_content^1"
                                                        ],
                                                        "boost": 3
                                                    }
                                                },
                                                {
                                                    "multi_match": {
                                                        "query": "elasticpress for wordpress search",
                                                        "fields": [
                                                            "post_title^1",
                                                            "post_excerpt^1",
                                                            "post_content^1",
                                                            "post_title.suggest^1"
                                                        ],
                                                        "type": "phrase",
                                                        "slop": 5
                                                    }
                                                }
                                            ]
                                        }
                                    }
                                ],
                                "filter": [
                                    {
                                        "match": {
                                            "post_type.raw": "page"
                                        }
                                    }
                                ]
                            }
                        }
                    ]
                }
            },
            "functions": [
                {
                    "exp": {
                        "post_date_gmt": {
                            "scale": "14d",
                            "decay": 0.25,
                            "offset": "7d"
                        }
                    }
                }
            ],
            "score_mode": "avg",
            "boost_mode": "sum"
        }
    },
    "post_filter": {
        "bool": {
            "must": [
                {
                    "terms": {
                        "post_type.raw": [
                            "post",
                            "page",
                            "product"
                        ]
                    }
                },
                {
                    "terms": {
                        "post_status": [
                            "publish",
                            "closed"
                        ]
                    }
                }
            ]
        }
    }
}

Query Paging and Sort Order

To begin with, we have some parameters that specify the number of records to return along with some basic info like offset and sort order:

JSON
{
    ...
    "from": 0, // Offset of the first result, used for paging
    "size": 10, // Maximum number of results to return per page
    "sort": [ // Sort by _score in descending order, so the best matches come first
        {
            "_score": {
                "order": "desc"
            }
        }
    ],
    ...
}
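
As a rough illustration of how from and size relate to result pages, the following Python sketch computes both values for a given page. The function name paging_params and its arguments are hypothetical and are not part of ElasticPress. The sketch only assumes that pages are numbered starting at 1.

```python
def paging_params(page: int, per_page: int = 10) -> dict:
    """Return the "from" and "size" values for a 1-based results page."""
    if page < 1 or per_page < 1:
        raise ValueError("page and per_page must be positive integers")
    # "from" is the number of results to skip before the requested page starts.
    return {"from": (page - 1) * per_page, "size": per_page}


if __name__ == "__main__":
    print(paging_params(1))  # {'from': 0, 'size': 10}
    print(paging_params(3))  # {'from': 20, 'size': 10}
```

The first page matches the from and size values in the default query above.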

After this section comes the main query portion, which, as you might suspect, is where the magic happens! You’ll see a series of nested bool statements that can seem quite confusing, but there’s a good reason they’re there. If you look closely, you’ll see that the query section actually contains two almost identical queries within the should clause.

Per-Post Type Queries

The only difference between these two query clauses is the post type filter: the first query handles Posts, and the second handles Pages. Having two separate queries might seem a bit redundant, but it becomes really useful once you apply separate weighting to each post type. Here’s the filter from the first sub-query (for Posts):

JSON
"filter": [
    {
        "match": {
            "post_type.raw": "post"
        }
    }
]
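
To see why one clause per post type is useful, here is a minimal Python sketch that builds the should array with a different title weight for each post type. The helper names (post_type_clause, title_query) and the weights are made up for illustration. ElasticPress builds its query in PHP, so this only mirrors the shape of the JSON.

```python
def post_type_clause(post_type: str, must_query: dict) -> dict:
    """Wrap a query so that it only applies to a single post type."""
    return {
        "bool": {
            "must": [must_query],
            "filter": [{"match": {"post_type.raw": post_type}}],
        }
    }


def title_query(search: str, title_weight: int) -> dict:
    """Hypothetical per-post-type query: a phrase match with its own title weight."""
    return {
        "multi_match": {
            "query": search,
            "type": "phrase",
            "fields": [f"post_title^{title_weight}", "post_content^1"],
        }
    }


# Assumed weights for illustration only: titles count more for Pages than for Posts.
weights = {"post": 1, "page": 5}
should = [
    post_type_clause(post_type, title_query("elasticpress", weight))
    for post_type, weight in weights.items()
]
```

Each entry in should carries its own filter, so the weights of one post type never leak into the other.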

Optional Date Decay Function (Weight by Date)

Within the Posts Feature, ElasticPress offers the option to weight results by date. If this option is selected, the following function is applied to all result scores:

JSON
"functions": [
    {
        "exp": {
            "post_date_gmt": {
                "scale": "14d",
                "decay": 0.25,
                "offset": "7d"
            }
        }
    }
],

This function leaves posts published within the last seven days (the offset) unaffected and then decays the date factor exponentially for older posts: it falls to 0.25 (the decay) once a post is 21 days old, which is the offset plus the 14-day scale, and it keeps shrinking after that. Older posts are not excluded, but their scores drop significantly, which makes them less relevant and therefore likely to appear lower in search results. If your site is news-based or otherwise favors new content, you will likely want to leave this feature enabled. If you have more evergreen content, however, you should disable the feature to remove date-based weighting.
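
To make the numbers concrete, the following Python sketch evaluates the exponential decay formula that Elasticsearch documents for its exp function, using the scale, decay, and offset values from the query. It assumes the origin is the current time (the Elasticsearch default for date fields) and measures age in days. The function name exp_decay is just for illustration.

```python
import math


def exp_decay(age_days: float, scale: float = 14.0, decay: float = 0.25, offset: float = 7.0) -> float:
    """Value of the Elasticsearch exp decay function for a post of the given age in days."""
    # Distance from the origin ("now"), ignoring everything inside the offset window.
    distance = max(0.0, abs(age_days) - offset)
    # Lambda is chosen so that the result equals `decay` at offset + scale.
    lam = math.log(decay) / scale
    return math.exp(lam * distance)


if __name__ == "__main__":
    for age in (0, 7, 14, 21, 35, 90):
        print(f"{age:>3} days old -> {exp_decay(age):.4f}")
```

The result stays at 1.0 for the first 7 days, drops to 0.5 at 14 days and 0.25 at 21 days, and keeps falling for older posts. Because the query sets boost_mode to sum, this value is added to the relevance score rather than multiplied with it.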

Finding Our Results: The multi_match Query

The post type filter shown earlier ensures that all the weighting and configuration in the must clause (nested in the top-level should clause) applies only to the Post post type. Under the default configuration this clause is identical to the one for the Page post type, but the two can differ once weightings are applied to each post type. Here is the query inside that must clause:

JSON
{
    "bool": {
        "should": [
            {
                "multi_match": {
                    "query": "elasticpress for wordpress search",
                    "type": "phrase",
                    "fields": [
                        "post_title^1",
                        "post_excerpt^1",
                        "post_content^1"
                    ],
                    "boost": 3
                }
            },
            {
                "multi_match": {
                    "query": "elasticpress for wordpress search",
                    "fields": [
                        "post_title^1",
                        "post_excerpt^1",
                        "post_content^1",
                        "post_title.suggest^1"
                    ],
                    "type": "phrase",
                    "slop": 5
                }
            }
        ]
    }
}

In this part of the query, you can see we perform two separate multi_match queries. The multi_match query is an Elasticsearch query type that runs the same search across a set of fields, which by default in ElasticPress means the Post’s title, excerpt, and content.

The search type is phrase, which means Elasticsearch will look for the entire phrase as typed out in the search query, in this case “elasticpress for wordpress search.”

Finally, we see a boost value of 3, which means that any exact match of the phrase results in a score that is then multiplied by 3. Remember from the beginning of our search request when we reviewed the sort order? That _score parameter is built from these queries, so an exact phrase match of “elasticpress for wordpress search” within a single Post’s title, excerpt, or content will get a nice high score and appear near the top of our results.

The second multi_match query is similar to the first, except we allow a slop of 5 and also search the post_title.suggest field. Slop is a fancy way of referring to how far the words of the phrase may be moved apart or transposed and still count as a match. For example, “elasticpress search for wordpress” has slop, because it contains the correct words from the matched phrase, but with some words transposed. You’ll notice this query is not boosted, meaning the score from a match with slop will be much lower than the boosted score from the previous query clause. This puts results that contain all the right words, but in a different order or with a few other words in between, below exact matches.
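
The two clauses can be sketched as a small Python builder that returns the same structure as the JSON above. This is only an illustration of how the boosted exact phrase and the unboosted sloppy phrase fit together. The function phrase_clauses and its parameters are hypothetical and not part of ElasticPress.

```python
def phrase_clauses(search: str, fields: list, boost: int = 3, slop: int = 5) -> dict:
    """Build the exact-phrase and sloppy-phrase clauses for one post type."""
    exact = {
        "multi_match": {
            "query": search,
            "type": "phrase",
            "fields": list(fields),
            "boost": boost,  # exact phrase matches score higher
        }
    }
    sloppy = {
        "multi_match": {
            "query": search,
            # The sloppy clause also searches the title's suggest sub-field.
            "fields": list(fields) + ["post_title.suggest^1"],
            "type": "phrase",
            "slop": slop,  # words may be moved apart or reordered
        }
    }
    return {"bool": {"should": [exact, sloppy]}}


default_fields = ["post_title^1", "post_excerpt^1", "post_content^1"]
clauses = phrase_clauses("elasticpress for wordpress search", default_fields)
```

A document that matches the exact phrase matches both clauses, while a looser match only satisfies the second one, which is how exact matches end up ranked first.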

Search in ElasticPress 3.5 and above

As of ElasticPress 3.5, the search algorithm no longer enables fuzzy matching by default. This is because the new phrase match does not support the fuzziness parameter. Previously, the search algorithm in ElasticPress 3.4.3 and below did include a fuzzy match, which some site owners may want to restore. To use the previous search algorithm or add a fuzzy match query clause into the new search algorithm, please refer to this article.

Depending on how your users search, you may or may not want fuzzy matching enabled. While fuzzy matching seems like a great feature (and it can be), it can create a situation where there are many, many items at the end of your search results that are loosely, if at all, related to the search query. Consider, for example, this (partial) list of matches for the search term dogs with a fuzziness value of 1: dog, does, cogs, hogs, logs, pogs, digs. A lot of those words aren’t even remotely related to the term dogs, but they’ll appear (later) in your search results, or even right at the top if your site doesn’t actually have anything related to the word(s) searched for.
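
The list above can be reproduced with a short Python sketch that computes the edit distance between dogs and a handful of candidate words. It uses plain Levenshtein distance as a simplification. Elasticsearch also counts a swap of two adjacent characters as a single edit by default, which makes no difference for these particular words. The extra candidates cats and puppy are added here purely for contrast.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: the minimum number of single-character edits from a to b."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete a character from a
                current[j - 1] + 1,      # insert a character into a
                previous[j - 1] + cost,  # substitute (or keep) a character
            ))
        previous = current
    return previous[-1]


candidates = ["dog", "does", "cogs", "hogs", "logs", "pogs", "digs", "cats", "puppy"]
matches = [word for word in candidates if edit_distance("dogs", word) <= 1]

if __name__ == "__main__":
    print(matches)  # ['dog', 'does', 'cogs', 'hogs', 'logs', 'pogs', 'digs']
```

Every word from the list is within one edit of dogs, while the related words cats and puppy are not, which shows that fuzziness measures spelling similarity rather than meaning.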