Practical Application: Indexing and Searching
Now that we’ve covered the basics of analyzers and tokenizers, let’s see how to apply them in a practical indexing and searching scenario.
Creating an Index with Custom Analyzer
PUT /articles
{
  "settings": {
    "analysis": {
      "analyzer": {
        "article_analyzer": {
          "type": "custom",
          "tokenizer": "standard",
          "filter": ["lowercase", "stop", "snowball"]
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "title": {
        "type": "text",
        "analyzer": "article_analyzer"
      },
      "content": {
        "type": "text",
        "analyzer": "article_analyzer"
      }
    }
  }
}
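Before indexing anything, it can help to check what the analyzer actually emits. The request below is a minimal sketch that calls the _analyze API on the articles index with the article_analyzer defined above; the sample text is only an illustration. In the response, stop words such as "is" and "a" should be absent, and the remaining tokens should be lowercased and stemmed.

```
GET /articles/_analyze
{
  "analyzer": "article_analyzer",
  "text": "Elasticsearch is a powerful search engine"
}
```

Each entry in the returned tokens array carries the term as it would be stored in the index, along with its position and character offsets, which makes it easy to see why a given query does or does not match.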
Indexing Documents
POST /articles/_doc/1
{
  "title": "Introduction to Elasticsearch",
  "content": "Elasticsearch is a powerful search engine based on Apache Lucene."
}
POST /articles/_doc/2
{
  "title": "Full-Text Search Techniques",
  "content": "Learn how to use full-text search with Elasticsearch to build powerful search applications."
}
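Elasticsearch is near real-time: a document becomes visible to search after the next index refresh, which by default usually happens within about a second. If a search issued right after indexing returns fewer hits than expected, an explicit refresh is one way to rule that out. This is a sketch for experimentation, and not something to call after every write in production.

```
POST /articles/_refresh
```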
Searching with Custom Analyzer
GET /articles/_search
{
  "query": {
    "match": {
      "content": "search powerful"
    }
  }
}
Output (abbreviated; the _score values shown are illustrative and will differ on your cluster):
{
  "hits": {
    "total": {
      "value": 2,
      "relation": "eq"
    },
    "hits": [
      {
        "_index": "articles",
        "_id": "1",
        "_score": 0.75,
        "_source": {
          "title": "Introduction to Elasticsearch",
          "content": "Elasticsearch is a powerful search engine based on Apache Lucene."
        }
      },
      {
        "_index": "articles",
        "_id": "2",
        "_score": 0.65,
        "_source": {
          "title": "Full-Text Search Techniques",
          "content": "Learn how to use full-text search with Elasticsearch to build powerful search applications."
        }
      }
    ]
  }
}
In this example:
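- We search the content field for “search” and “powerful”. A match query combines its terms with OR by default, so a document matches if it contains either term, and documents containing both generally score higher.
- The custom analyzer is applied to the content field at index time and to the query text at search time: terms are lowercased, English stop words are removed, and the remaining words are stemmed, so a query for “searching” also matches documents that contain “search”.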
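A match query joins the analyzed terms with OR unless told otherwise. If both terms should be required, the operator parameter can be set to and, as in this sketch against the same articles index. With the two sample documents above, both still match, since each contains both words.

```
GET /articles/_search
{
  "query": {
    "match": {
      "content": {
        "query": "search powerful",
        "operator": "and"
      }
    }
  }
}
```

The trade-off is the usual one between recall and precision: OR returns more documents and leaves ordering to the relevance score, while and drops any document that is missing one of the terms.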
Full Text Search with Analyzer and Tokenizer
Elasticsearch is renowned for its powerful full-text search capabilities. At the heart of this functionality are analyzers and tokenizers, which play a crucial role in how text is processed and indexed. This guide will help you understand how analyzers and tokenizers work in Elasticsearch, with detailed examples and outputs to make these concepts easy to grasp.