Elasticsearch - Full-text in Practice
The course is designed for those who want to learn how to build full-text search for their project and explore the topic in depth. In addition to using Elasticsearch, we will cover other concepts of user-facing search.
During the course we will build a search over real data together, from design and architecture to a fully functional solution that we will gradually extend with advanced language handling, synonyms, typo tolerance, autocomplete and other features. We will discuss how to influence relevance based on user behavior (ratings, purchases, etc.) and other factors. Finally, we'll show you how to run the entire solution in a production environment.
Audience:
The course is designed for all developers who work on projects that need user-facing search (e.g. product catalogs, articles, etc.).
Prerequisites:
Basic knowledge of Elasticsearch, the HTTP protocol and the JSON format; general knowledge of database systems.
Course content:
Search built for users
- How do users search on your site?
- How to design search; architecture
- How a user query is processed
- Search context
- Entity identification (what is the user actually looking for?)
Cluster for full-text search
- Design a cluster architecture for full-text search
- Lab
Indexing data into Elasticsearch
- Synchronize data from relational databases and other repositories
- Continuous and one-time data indexing
- Track and improve document indexing performance
- Lab
Building full-text search
- Fundamentals of full-text search, the inverted index
- Setting up a suitable mapping
- Relations between objects, practical tips
- Basic text analysis
- Lab
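To give a feel for the inverted index mentioned above, here is a minimal sketch in plain Python. It is an illustration only, not how Elasticsearch is implemented: the `tokenize`, `build_inverted_index` and `search` helpers and the sample documents are hypothetical, and the "analysis" is just lowercasing and splitting on whitespace.

```python
from collections import defaultdict


def tokenize(text):
    """Very naive analysis: lowercase and split on whitespace."""
    return text.lower().split()


def build_inverted_index(docs):
    """Map each term to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in tokenize(text):
            index[term].add(doc_id)
    return index


def search(index, query):
    """Return ids of documents containing every query term (AND semantics)."""
    terms = tokenize(query)
    if not terms:
        return set()
    postings = [index.get(term, set()) for term in terms]
    return set.intersection(*postings)


# Hypothetical sample data for illustration.
docs = {
    1: "Red running shoes",
    2: "Blue running jacket",
    3: "Red wool jacket",
}
index = build_inverted_index(docs)
```

With this data, `search(index, "red jacket")` looks up the posting sets for `red` and `jacket` and intersects them, so only documents containing both terms are returned.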
Searching with the Query DSL
- Creating a search query
- Search in one or more fields
- Multi-field search strategies and how to choose between them
- Matching whole phrases
- Practical tips
- Lab
Adding more fields
- Benefits, boosting
- Code search
- Searching for parameters, categories, tags
- How to set the weights of each field
- Signals
- Lab
Synonyms
- How and why to include synonyms in the search process?
- Dictionary of available synonyms for Czech and other languages
- Creating and plugging in your own dictionaries
- Practical tips
- Lab
Relevancy
- What is relevance?
- How to measure relevancy?
- How do you measure change in relevancy when editing your search?
- Score: how Elasticsearch computes relevance
- TF-IDF, BM25 and the theoretical minimum
- Lab
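As a rough illustration of the BM25 topic above, the following sketch computes the score contribution of a single term using the classic textbook form of BM25. The function name and arguments are hypothetical; `k1=1.2` and `b=0.75` are the commonly used defaults, and the exact constant factors in a given Elasticsearch/Lucene version may differ from this form.

```python
import math


def bm25_term_score(tf, doc_len, avg_doc_len, n_docs, doc_freq, k1=1.2, b=0.75):
    """Classic BM25 contribution of one query term to one document.

    tf          -- how many times the term occurs in the document
    doc_len     -- length of the document field (in terms)
    avg_doc_len -- average field length across the index
    n_docs      -- total number of documents
    doc_freq    -- number of documents containing the term
    """
    # Rare terms get a higher inverse document frequency.
    idf = math.log(1 + (n_docs - doc_freq + 0.5) / (doc_freq + 0.5))
    # Term frequency saturates (k1) and is normalized by field length (b).
    length_norm = 1 - b + b * doc_len / avg_doc_len
    tf_part = tf * (k1 + 1) / (tf + k1 * length_norm)
    return idf * tf_part
```

Two properties are worth noticing: each additional occurrence of a term adds less than the previous one, and the same term frequency scores lower in a longer-than-average field.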
Influencing relevance
- Influencing results based on user actions (purchases, ratings, etc.)
- Influence based on document properties
- Rescoring documents
- Decay function
- Lab
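The decay function topic above can be sketched as follows. This is a plain-Python rendering of a Gaussian decay, assuming the usual parameters `origin`, `scale`, `offset` and `decay`; the function name `gauss_decay` is hypothetical and the snippet is an illustration rather than the Elasticsearch implementation.

```python
import math


def gauss_decay(value, origin, scale, offset=0.0, decay=0.5):
    """Gaussian decay: 1.0 near the origin, `decay` at distance offset + scale."""
    # Values within `offset` of the origin are not penalized at all.
    distance = max(0.0, abs(value - origin) - offset)
    # Choose sigma^2 so that the result equals `decay` when distance == scale.
    sigma_sq = -(scale ** 2) / (2.0 * math.log(decay))
    return math.exp(-(distance ** 2) / (2.0 * sigma_sq))
```

For example, with `origin=0` and `scale=10`, a document at distance 10 keeps half of its score, and the multiplier keeps shrinking smoothly as the distance grows. In practice such a multiplier is typically applied to numeric, date or geo fields such as price, publication date or distance.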
Autocomplete
- Autocomplete in general
- Different implementation options, with practical demonstrations
- How to get the same results in autocomplete and in search results?
- dis_max and other queries in depth
- Lab
Typos
- Basic solution for typos
- Building "Did you mean?"
- Suggesters
- Lab
User filters, faceted navigation
- How do user filters work?
- Aggregations
- Lab
Going to production
- Configuring the cluster for a production environment
- Choosing the number of nodes, shards and replicas
- Monitoring settings
- Scaling for data and traffic
- Lab
About the instructor: Petr Novotny
Petr's expertise ranges from solution architecture and development (JavaScript, PHP) through Elasticsearch, Oracle and PL/SQL to agile methodology and Scrum. Petr has been working with Elasticsearch for several years and has become one of our main instructors.