⚡  ElasticsearchBook.com is crafted by Jozef Sorocin.

From Docs through Counts to Aggregations

We typically write search queries with three goals in mind:

  1. retrieval of a (paginated) subset of documents
  2. getting the total count
  3. aggregating (grouping) based on a category, the usual suspect being e-shop facets

In this chapter we'll focus specifically on aggregations, but before we do that, let's quickly talk about counting.

Approximate vs Exact Counts

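Prior to 7.0, a _search request yielded the exact total as a plain integer in hits.total. From 7.0 onwards, hits.total is an object with a value and a relation, and you'll need to specify "track_total_hits": true in your _search payload to get an exact count. If this param is left out, ES will count accurately only up to 10,000 docs; when more docs match, it reports a value of 10000 and sets hits.total.relation to gte, meaning the value is a lower bound. Stopping at 10K is a good trade-off to speed up searches.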

Before 7.0:

{
  "hits" : {
    "total" : 2228423,
    "hits": [
      ...
    ],
    ...

From 7.0 onwards, with "track_total_hits": true:

{
  "hits" : {
    "total" : {
      "value" : 2228423,
      "relation" : "eq"
    },
    "hits": [...],
    ...
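
Client code that has to cope with both response shapes can normalise hits.total before using it. The following is a minimal sketch in plain Python that works on already-parsed response dictionaries; the helper name `read_total` and the three sample responses are illustrative assumptions, not part of any Elasticsearch client library.

```python
from typing import Any, Dict, Tuple


def read_total(response: Dict[str, Any]) -> Tuple[int, bool]:
    """Return (count, is_exact) from a parsed _search response.

    Handles the pre-7.0 integer form and the 7.0+ object form.
    """
    total = response["hits"]["total"]
    if isinstance(total, int):
        # Pre-7.0: a bare integer that is always exact.
        return total, True
    # 7.0+: {"value": <int>, "relation": "eq" | "gte"}
    return total["value"], total["relation"] == "eq"


# Request body asking ES 7.0+ for an exact total
# (match_all is only a placeholder query).
exact_count_body = {"track_total_hits": True, "query": {"match_all": {}}}

# Hand-written response excerpts for illustration only.
legacy = {"hits": {"total": 2228423, "hits": []}}
modern = {"hits": {"total": {"value": 2228423, "relation": "eq"}, "hits": []}}
capped = {"hits": {"total": {"value": 10000, "relation": "gte"}, "hits": []}}
```

With these samples, `read_total(capped)` reports the count as inexact, which is the cue to display something like "10,000+" rather than a precise number.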

<aside> 💡 If you only care about the total count and not the actual hits, you can send the query from your _search payload to the _count endpoint instead. This does not require the track_total_hits parameter and is usually faster than _search.

Note that a _count request body only supports query: no from, size, or aggs.

</aside>
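
Since a _count body accepts nothing but query, an existing _search body has to be trimmed before it can be reused. A small sketch of that step, assuming the bodies are plain Python dictionaries; `to_count_body` and the `category` field are hypothetical names chosen for illustration.

```python
from typing import Any, Dict


def to_count_body(search_body: Dict[str, Any]) -> Dict[str, Any]:
    """Keep only the `query` part of a _search body for use with _count."""
    if "query" not in search_body:
        # No query given: count every document explicitly.
        return {"query": {"match_all": {}}}
    # Drop from, size, aggs, track_total_hits and anything else.
    return {"query": search_body["query"]}


# Illustrative _search body; the field and aggregation names are made up.
search_body = {
    "from": 20,
    "size": 10,
    "track_total_hits": True,
    "query": {"term": {"category": "shoes"}},
    "aggs": {"by_category": {"terms": {"field": "category"}}},
}

count_body = to_count_body(search_body)
```

The resulting `count_body` contains only the term query and can be sent as the body of a _count request.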

Page Size & Offset

The pagination basics are nicely explained in the official docs:

By default, searches return the top 10 matching hits. The from parameter defines the number of hits to skip, defaulting to 0. The size parameter is the maximum number of hits to return. Together, these two parameters define a page of results.
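
In application code, a page number usually has to be translated into these two parameters. A minimal sketch, assuming 1-based page numbers (an assumption of this example, not something ES prescribes); `page_params` is a hypothetical helper name.

```python
from typing import Dict


def page_params(page: int, page_size: int = 10) -> Dict[str, int]:
    """Translate a 1-based page number into `from` and `size`."""
    if page < 1:
        raise ValueError("page numbers start at 1")
    if page_size < 0:
        raise ValueError("page_size must not be negative")
    # Skip all hits that belong to the preceding pages.
    return {"from": (page - 1) * page_size, "size": page_size}


# Merge the paging parameters into a search body
# (match_all is only a placeholder query).
search_body = {"query": {"match_all": {}}, **page_params(page=3, page_size=25)}
```

Here the third page of 25 hits skips the first 50 hits, so `search_body` carries "from": 50 and "size": 25.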

<aside> 💡 Setting "size": 0 is useful when you won't be processing any hits directly and are only interested in the aggregations part of the response. This saves bandwidth and, because the shard request cache by default only caches requests with "size": 0, lets repeated requests be served from that cache → faster responses.

</aside>
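
An aggregation-only request and the handling of its response might look roughly like the sketch below. The aggregation name `by_category`, the `category` field, and the sample response are assumptions made up for illustration; only the general shape (buckets with `key` and `doc_count` under `aggregations`) follows the terms aggregation response format.

```python
from typing import Any, Dict

# Aggregation-only request: no hits are returned, only the buckets.
# "by_category" and the "category" field are hypothetical names.
facet_body: Dict[str, Any] = {
    "size": 0,
    "aggs": {"by_category": {"terms": {"field": "category"}}},
}


def facet_counts(response: Dict[str, Any], agg_name: str) -> Dict[str, int]:
    """Map each bucket key to its doc_count for one terms aggregation."""
    buckets = response["aggregations"][agg_name]["buckets"]
    return {bucket["key"]: bucket["doc_count"] for bucket in buckets}


# Hand-written response excerpt for illustration only (not real data).
sample_response = {
    "hits": {"total": {"value": 15, "relation": "eq"}, "hits": []},
    "aggregations": {
        "by_category": {
            "buckets": [
                {"key": "shoes", "doc_count": 12},
                {"key": "hats", "doc_count": 3},
            ]
        }
    },
}

print(facet_counts(sample_response, "by_category"))
```

Note that hits.hits is empty in such a response while hits.total is still reported, so the count and the facets arrive in a single round trip.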

In this chapter →

Aggregation Data Tables

Table Cell Metadata
