Advanced Histogram Aggregation
Minimum Document Count
You can use the min_doc_count parameter to exclude buckets with fewer than a specified number of documents. For example, to exclude buckets with fewer than 2 sales:
Query:
GET /sales/_search
{
  "size": 0,
  "aggs": {
    "price_histogram": {
      "histogram": {
        "field": "price",
        "interval": 100,
        "min_doc_count": 2
      }
    }
  }
}
Output:
{
  "took": 10,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 3,
      "relation": "eq"
    },
    "max_score": null,
    "hits": []
  },
  "aggregations": {
    "price_histogram": {
      "buckets": [
        { "key": 0, "doc_count": 2 }
      ]
    }
  }
}
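In this case, only the bucket with key 0 (prices from 0 up to, but not including, 100) is returned, because it is the only bucket holding at least 2 documents. The third matching document falls in a bucket of its own, which has a single document and is therefore dropped.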
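To make the filtering step concrete, the bucketing can be approximated outside Elasticsearch. The following is a minimal Python sketch, not Elasticsearch's implementation: the function name `histogram` and the three sample prices are assumptions chosen to match the counts in the output above. Each value is assigned to a bucket whose key is the value rounded down to a multiple of the interval, and buckets below `min_doc_count` are discarded.

```python
import math
from collections import Counter


def histogram(values, interval, min_doc_count):
    """Group values into fixed-width buckets and drop sparse buckets."""
    # Bucket key: the value rounded down to the nearest multiple of interval.
    counts = Counter(math.floor(v / interval) * interval for v in values)
    return [
        {"key": key, "doc_count": n}
        for key, n in sorted(counts.items())
        if n >= min_doc_count
    ]


# Hypothetical prices: two sales under 100 and one just above 1000.
prices = [20, 80, 1050]
buckets = histogram(prices, interval=100, min_doc_count=2)
print(buckets)  # [{'key': 0, 'doc_count': 2}]
```

Lowering `min_doc_count` to 1 in this sketch would also return the bucket with key 1000. The sketch only counts buckets that contain at least one value, so it does not reproduce the empty buckets Elasticsearch can return.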
Extended Bounds
You can use the extended_bounds parameter to ensure that specific buckets are included in the response, even if they have no documents. This is useful for maintaining a consistent range in your histogram.
Query:
GET /sales/_search
{
  "size": 0,
  "aggs": {
    "price_histogram": {
      "histogram": {
        "field": "price",
        "interval": 100,
        "extended_bounds": {
          "min": 0,
          "max": 1200
        }
      }
    }
  }
}
Output:
{
  "took": 12,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 3,
      "relation": "eq"
    },
    "max_score": null,
    "hits": []
  },
  "aggregations": {
    "price_histogram": {
      "buckets": [
        { "key": 0, "doc_count": 2 },
        { "key": 100, "doc_count": 0 },
        { "key": 200, "doc_count": 0 },
        { "key": 300, "doc_count": 0 },
        { "key": 400, "doc_count": 0 },
        { "key": 500, "doc_count": 0 },
        { "key": 600, "doc_count": 0 },
        { "key": 700, "doc_count": 0 },
        { "key": 800, "doc_count": 0 },
        { "key": 900, "doc_count": 0 },
        { "key": 1000, "doc_count": 1 },
        { "key": 1100, "doc_count": 0 },
        { "key": 1200, "doc_count": 0 }
      ]
    }
  }
}
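In this example, every bucket between the configured bounds is included in the response, even those with no documents. Note that extended_bounds only widens the set of returned buckets; it does not filter out documents whose prices fall outside the bounds.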
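The gap-filling behaviour can be illustrated with a short Python sketch. This is an approximation under stated assumptions rather than Elasticsearch's own code: the function name `histogram_with_bounds`, the sample prices, and the bounds of 0 to 300 are all hypothetical, and the sketch assumes both bounds are inclusive, so the bucket containing the maximum bound is emitted as well.

```python
import math
from collections import Counter


def histogram_with_bounds(values, interval, bound_min, bound_max):
    """Count values per bucket, then add empty buckets out to the bounds."""

    def key_of(v):
        # Round down to the nearest multiple of interval.
        return math.floor(v / interval) * interval

    counts = Counter(key_of(v) for v in values)
    # The bounds can only widen the range; data outside them is still kept.
    first = min([key_of(bound_min), *counts])
    last = max([key_of(bound_max), *counts])
    return [
        {"key": key, "doc_count": counts[key]}
        for key in range(first, last + interval, interval)
    ]


# Hypothetical prices, with one value beyond the upper bound.
prices = [120, 130, 410]
result = histogram_with_bounds(prices, interval=100, bound_min=0, bound_max=300)
for bucket in result:
    print(bucket)
```

Here the lower bound adds an empty bucket with key 0 below the data, the gaps at 200 and 300 are filled with zero counts, and the value 410 still appears in its own bucket even though it lies above the upper bound.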
Data Histogram Aggregation in Elasticsearch
Elasticsearch is a search and analytics engine whose rich aggregation framework supports efficient data analysis. Among its aggregation types, the histogram aggregation groups numeric values into fixed-width intervals, which makes it well suited to understanding the distribution and trends within your data.
In this article, we cover data histogram aggregation in Elasticsearch, explain its use cases, and work through detailed examples to help you apply this feature.