What is Metric Aggregation in Elasticsearch?

Elasticsearch is a powerful tool not just for search but also for performing complex data analytics. Metric aggregations are a crucial aspect of this capability, allowing users to compute metrics like averages, sums, and more on numeric fields within their data.

This guide explains what metric aggregations are and how they work, with an example query and output for each type.

What are Metric Aggregations?

Metric aggregations in Elasticsearch compute a metric from the field values of the documents that match a query. Unlike bucket aggregations, which group documents into buckets, metric aggregations work directly on the values and return one or more numbers. Most of them operate on numeric fields, although a few, such as cardinality and geo bounds, also accept other field types. They are essential for summarizing large datasets and deriving insights such as averages, minimums, maximums, and sums.

Types of Metric Aggregations

Elasticsearch offers several types of metric aggregations, each serving a different purpose:

Average Aggregation: Calculates the average of numeric values.
Sum Aggregation: Computes the sum of numeric values.
Min Aggregation: Finds the minimum value.
Max Aggregation: Finds the maximum value.
Stats Aggregation: Provides a summary of statistics (count, min, max, sum, and average).
Extended Stats Aggregation: Includes additional statistics such as variance, standard deviation, and sum of squares.
Value Count Aggregation: Counts the number of values.
Percentiles Aggregation: Calculates percentiles over numeric values.
Percentile Ranks Aggregation: Computes the percentile rank of specific values.
Cardinality Aggregation: Estimates the count of distinct values.
Geo Bounds Aggregation: Computes the bounding box containing all geo-points in the field.

Example Dataset

To make the explanations concrete, let’s assume we have an Elasticsearch index called products with documents that look like this:

{
  "product_id": 1,
  "name": "Laptop",
  "category": "electronics",
  "price": 1000,
  "quantity_sold": 5,
  "rating": 4.5
}

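Average Aggregation

The average aggregation (avg) computes the average value of a numeric field. Let’s calculate the average price of products in our index. Setting "size": 0 omits the matching documents from the response so that only the aggregation result comes back. The outputs in this guide show only the aggregations part of each response.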

Query:

GET /products/_search
{
  "size": 0,
  "aggs": {
    "avg_price": {
      "avg": {
        "field": "price"
      }
    }
  }
}

Output:

{
  "aggregations": {
    "avg_price": {
      "value": 550.0
    }
  }
}

In this example, the average price of products is $550.0.
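As a rough client-side cross-check of what avg returns, the same number can be reproduced in a few lines of Python. The price list below is an assumption: ten products priced from 100 to 1000 in steps of 100, chosen because it is consistent with the outputs shown in this guide. It is not data read from a real index.

```python
# Assumed prices for the ten products (hypothetical data, chosen to be
# consistent with the outputs shown in this guide).
prices = [100, 200, 300, 400, 500, 600, 700, 800, 900, 1000]

# avg is the sum of the field values divided by the number of values.
avg_price = sum(prices) / len(prices)
print(avg_price)  # 550.0
```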

Sum Aggregation

The sum aggregation calculates the total sum of a numeric field. Let’s calculate the total quantity sold for all products.

Query:

GET /products/_search
{
  "size": 0,
  "aggs": {
    "total_quantity_sold": {
      "sum": {
        "field": "quantity_sold"
      }
    }
  }
}

Output:

{
  "aggregations": {
    "total_quantity_sold": {
      "value": 25
    }
  }
}

In this example, the total quantity sold for all products is 25.
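By default, documents that have no value for the aggregated field are simply skipped by the sum aggregation. The sketch below imitates that behavior in plain Python; the documents and the helper name sum_field are hypothetical and exist only for illustration.

```python
# Hypothetical documents; the third one has no quantity_sold field.
docs = [
    {"name": "Laptop", "quantity_sold": 5},
    {"name": "Phone", "quantity_sold": 12},
    {"name": "Tablet"},
    {"name": "Monitor", "quantity_sold": 8},
]


def sum_field(documents, field):
    """Add up a field, skipping documents that do not contain it."""
    return sum(doc[field] for doc in documents if field in doc)


total_quantity_sold = sum_field(docs, "quantity_sold")
print(total_quantity_sold)  # 25
```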

Min Aggregation

The min aggregation finds the minimum value of a numeric field. Let’s find the minimum price of products.

Query:

GET /products/_search
{
  "size": 0,
  "aggs": {
    "min_price": {
      "min": {
        "field": "price"
      }
    }
  }
}

Output:

{
  "aggregations": {
    "min_price": {
      "value": 100.0
    }
  }
}

In this example, the minimum price of products is $100.0.

Max Aggregation

The max aggregation finds the maximum value of a numeric field. Let’s find the maximum price of products.

Query:

GET /products/_search
{
  "size": 0,
  "aggs": {
    "max_price": {
      "max": {
        "field": "price"
      }
    }
  }
}

Output:

{
  "aggregations": {
    "max_price": {
      "value": 1000.0
    }
  }
}

In this example, the maximum price of products is $1000.0.
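The min and max results can be mirrored client-side in the same way. This is only a sketch: the price list is an assumption chosen to match this guide's outputs, and the values are converted to floats because the aggregation reports its results as doubles.

```python
# Assumed prices (hypothetical data consistent with the outputs above).
prices = [100, 200, 300, 400, 500, 600, 700, 800, 900, 1000]

# Convert to float to mirror the double values in the response.
min_price = float(min(prices))
max_price = float(max(prices))
print(min_price, max_price)  # 100.0 1000.0
```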

Stats Aggregation

The stats aggregation provides a summary of statistics, including count, sum, min, max, and average. Let’s get the stats for the price field.

Query:

GET /products/_search
{
  "size": 0,
  "aggs": {
    "price_stats": {
      "stats": {
        "field": "price"
      }
    }
  }
}

Output:

{
  "aggregations": {
    "price_stats": {
      "count": 10,
      "min": 100.0,
      "max": 1000.0,
      "avg": 550.0,
      "sum": 5500.0
    }
  }
}

In this example, we get a summary of statistics for the price field.
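The five stats values can all be gathered in a single pass over the data, which is one reason to prefer stats over five separate aggregations. The following is a minimal Python sketch of that idea, not Elasticsearch's implementation; the function name stats and the price list are assumptions made for illustration.

```python
def stats(values):
    """Collect count, min, max, avg and sum in one pass over the values."""
    count = 0
    total = 0.0
    lowest = float("inf")
    highest = float("-inf")
    for value in values:
        value = float(value)
        count += 1
        total += value
        lowest = min(lowest, value)
        highest = max(highest, value)
    if count == 0:
        # No values at all: there is nothing meaningful to report.
        return {"count": 0, "min": None, "max": None, "avg": None, "sum": 0.0}
    return {
        "count": count,
        "min": lowest,
        "max": highest,
        "avg": total / count,
        "sum": total,
    }


# Assumed prices (hypothetical data consistent with the outputs above).
prices = [100, 200, 300, 400, 500, 600, 700, 800, 900, 1000]
price_stats = stats(prices)
print(price_stats)
```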

Extended Stats Aggregation

The extended stats aggregation provides additional statistics such as variance, standard deviation, and sum of squares. Let’s get the extended stats for the price field.

Query:

GET /products/_search
{
  "size": 0,
  "aggs": {
    "extended_price_stats": {
      "extended_stats": {
        "field": "price"
      }
    }
  }
}

Output:

{
  "aggregations": {
    "extended_price_stats": {
      "count": 10,
      "min": 100.0,
      "max": 1000.0,
      "avg": 550.0,
      "sum": 5500.0,
      "sum_of_squares": 3850000.0,
      "variance": 82500.0,
      "std_deviation": 287.2281323269014
    }
  }
}

In this example, we get extended statistics for the price field, including variance and standard deviation.
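The extra fields follow directly from count, sum, and sum of squares. The variance field is the population variance, which equals sum_of_squares / count minus the square of the mean. The sketch below works through that arithmetic in Python, assuming ten prices from 100 to 1000 in steps of 100 (a hypothetical dataset that is consistent with the count, min, max, sum, and sum_of_squares shown above).

```python
import math

# Assumed prices (hypothetical data consistent with the stats above).
prices = [100, 200, 300, 400, 500, 600, 700, 800, 900, 1000]

count = len(prices)
total = sum(prices)
sum_of_squares = sum(p * p for p in prices)

# Population variance: mean of the squares minus the square of the mean.
mean = total / count
variance = sum_of_squares / count - mean ** 2
std_deviation = math.sqrt(variance)

print(sum_of_squares)            # 3850000
print(variance)                  # 82500.0
print(round(std_deviation, 2))   # 287.23
```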

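Value Count Aggregation

The value count aggregation (value_count) counts the values of a field across the matching documents. Because it counts values rather than documents, a multi-valued field contributes one count per value. Let’s count the number of products by counting their product_id values.

Query: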

GET /products/_search
{
  "size": 0,
  "aggs": {
    "product_count": {
      "value_count": {
        "field": "product_id"
      }
    }
  }
}

Output:

{
  "aggregations": {
    "product_count": {
      "value": 10
    }
  }
}

In this example, the number of products is 10.
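The difference between counting values and counting documents only shows up with multi-valued or missing fields. The Python sketch below illustrates it; the tags field, the sample documents, and the helper name value_count are all hypothetical and are not part of the products index described above.

```python
# Hypothetical documents: tags is multi-valued and missing on one document.
docs = [
    {"product_id": 1, "tags": ["laptop", "work", "travel"]},
    {"product_id": 2, "tags": ["phone"]},
    {"product_id": 3},
]


def value_count(documents, field):
    """Count field values: one per list element, none for a missing field."""
    count = 0
    for doc in documents:
        if field not in doc:
            continue
        value = doc[field]
        count += len(value) if isinstance(value, list) else 1
    return count


print(value_count(docs, "product_id"))  # 3
print(value_count(docs, "tags"))        # 4
```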

Percentiles Aggregation

The percentiles aggregation calculates the percentiles over numeric values. Let’s calculate the 25th, 50th, and 75th percentiles for the price field.

Query:

GET /products/_search
{
  "size": 0,
  "aggs": {
    "price_percentiles": {
      "percentiles": {
        "field": "price",
        "percents": [25, 50, 75]
      }
    }
  }
}

Output:

{
  "aggregations": {
    "price_percentiles": {
      "values": {
        "25.0": 275.0,
        "50.0": 550.0,
        "75.0": 825.0
      }
    }
  }
}

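In this example, we get the 25th, 50th, and 75th percentiles for the price field. Elasticsearch computes percentiles approximately (using the TDigest algorithm by default), so on large datasets the results are estimates rather than exact values.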
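To get a feel for where such numbers come from, quartiles can be computed exactly on a small list with Python's standard library. This is only an illustration: Elasticsearch uses its own approximate algorithm, and different interpolation rules give different answers on small datasets. With the assumed ten prices below (hypothetical data), the default "exclusive" method of statistics.quantiles happens to yield the values shown in the output above.

```python
import statistics

# Assumed prices (hypothetical data consistent with the outputs above).
prices = [100, 200, 300, 400, 500, 600, 700, 800, 900, 1000]

# n=4 splits the data into quartiles and returns the three cut points.
q25, q50, q75 = statistics.quantiles(prices, n=4)
print(q25, q50, q75)  # 275.0 550.0 825.0
```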

Percentile Ranks Aggregation

The percentile ranks aggregation (percentile_ranks) computes, for each value you supply, the percentage of observed values that fall at or below it. Let’s calculate the percentile ranks for prices 300 and 600.

Query:

GET /products/_search
{
  "size": 0,
  "aggs": {
    "price_percentile_ranks": {
      "percentile_ranks": {
        "field": "price",
        "values": [300, 600]
      }
    }
  }
}

Output:

{
  "aggregations": {
    "price_percentile_ranks": {
      "values": {
        "300.0": 30.0,
        "600.0": 60.0
      }
    }
  }
}

In this example, 30% of prices are at or below 300 and 60% are at or below 600.
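A percentile rank can be sketched in Python as the share of values at or below a threshold. The helper name percentile_rank and the price list are assumptions for illustration. Elasticsearch itself computes these ranks approximately, so results on real data may differ slightly.

```python
def percentile_rank(values, threshold):
    """Return the percentage of values that are at or below threshold."""
    at_or_below = sum(1 for value in values if value <= threshold)
    return 100.0 * at_or_below / len(values)


# Assumed prices (hypothetical data consistent with the outputs above).
prices = [100, 200, 300, 400, 500, 600, 700, 800, 900, 1000]

print(percentile_rank(prices, 300))  # 30.0
print(percentile_rank(prices, 600))  # 60.0
```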

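Cardinality Aggregation

The cardinality aggregation estimates the count of distinct values. Let’s estimate the number of distinct categories. The query targets category.keyword, which assumes category is a text field with a keyword sub-field, as the default dynamic mapping creates for strings.

Query: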

GET /products/_search
{
  "size": 0,
  "aggs": {
    "distinct_categories": {
      "cardinality": {
        "field": "category.keyword"
      }
    }
  }
}

Output:

{
  "aggregations": {
    "distinct_categories": {
      "value": 3
    }
  }
}

In this example, there are 3 distinct categories.
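On a small dataset the exact answer is just the size of the set of values, as in the Python sketch below. The category list is hypothetical. Elasticsearch does not keep such a set; it uses a probabilistic algorithm (HyperLogLog++) that trades a small error for bounded memory, which is why the result is described as an estimate.

```python
# Hypothetical category values for ten products.
categories = [
    "electronics", "electronics", "books", "clothing", "books",
    "electronics", "clothing", "books", "electronics", "clothing",
]

# Exact distinct count; feasible here because the data is tiny.
distinct_categories = len(set(categories))
print(distinct_categories)  # 3
```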

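Geo Bounds Aggregation

The geo-bounds aggregation computes the bounding box containing all geo-points in the field. The products documents have no geo field, so this example uses a separate locations index whose location field is assumed to be mapped as geo_point.

Query: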

GET /locations/_search
{
  "size": 0,
  "aggs": {
    "geo_bounds": {
      "geo_bounds": {
        "field": "location"
      }
    }
  }
}

Output:

{
  "aggregations": {
    "geo_bounds": {
      "bounds": {
        "top_left": {
          "lat": 40.73,
          "lon": -74.1
        },
        "bottom_right": {
          "lat": 40.01,
          "lon": -71.12
        }
      }
    }
  }
}

In this example, the geo-bounds aggregation calculates the bounding box for the geo-points.
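The box is defined by two corners: top_left takes the largest latitude and smallest longitude, and bottom_right takes the smallest latitude and largest longitude. The Python sketch below shows that rule on three hypothetical points chosen to reproduce the output above. It is a simplification that ignores boxes crossing the international date line.

```python
# Hypothetical geo-points as (lat, lon) pairs.
points = [
    (40.73, -73.50),
    (40.01, -74.10),
    (40.40, -71.12),
]


def geo_bounds(coords):
    """Compute a bounding box; does not handle date-line wrapping."""
    lats = [lat for lat, _ in coords]
    lons = [lon for _, lon in coords]
    return {
        "top_left": {"lat": max(lats), "lon": min(lons)},
        "bottom_right": {"lat": min(lats), "lon": max(lons)},
    }


bounds = geo_bounds(points)
print(bounds)
```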

Conclusion

Metric aggregations in Elasticsearch are a powerful way to perform statistical analysis on your data. They allow you to calculate averages, sums, minimums, maximums, and more, providing valuable insights into your data. By understanding and utilizing these aggregations, you can unlock the full potential of Elasticsearch for your data analytics needs. Whether you’re summarizing sales data, analyzing user behavior, or exploring any other type of numeric data, metric aggregations are an essential tool in your Elasticsearch toolkit.
