diff --git a/docs/reference/search/search-your-data/retrievers-overview.asciidoc b/docs/reference/search/search-your-data/retrievers-overview.asciidoc new file mode 100644 index 0000000000000..fdd984819558b --- /dev/null +++ b/docs/reference/search/search-your-data/retrievers-overview.asciidoc @@ -0,0 +1,207 @@ +[[retrievers-overview]] +== Retrievers + +// Will move to a top level "Retrievers and reranking" section once reranking is live + +preview::[] + +A retriever is an abstraction that was added to the Search API in *8.14.0*. +This abstraction enables the configuration of multi-stage retrieval +pipelines within a single `_search` call. This simplifies your search +application logic, because you no longer need to configure complex searches via +multiple {es} calls or implement additional client-side logic to +combine results from different queries. + +This document provides a general overview of the retriever abstraction. +For implementation details, including notable restrictions, check out the +<> in the `_search` API docs. + +[discrete] +[[retrievers-overview-types]] +=== Retriever types + +Retrievers come in various types, each tailored for different search operations. +The following retrievers are currently available: + +* <>. Returns top documents from a +traditional https://www.elastic.co/guide/en/elasticsearch/reference/master/query-dsl.html[query]. +Mimics a traditional query but in the context of a retriever framework. This +ensures backward compatibility as existing `_search` requests remain supported. +That way you can transition to the new abstraction at your own pace without +mixing syntaxes. +* <>. Returns top documents from a <>, +in the context of a retriever framework. +* <>. Combines and ranks multiple first-stage retrievers using +the reciprocal rank fusion (RRF) algorithm. Allows you to combine multiple result sets +with different relevance indicators into a single result set. 
+An RRF retriever is a *compound retriever*, whose `filter` element is +propagated to its sub retrievers. ++ +Sub retrievers may not use elements that +are restricted by having a compound retriever as part of the retriever tree. +See the <> for detailed +examples and information on how to use the RRF retriever. + +[NOTE] +==== +Stay tuned for more retriever types in future releases! +==== + +[discrete] +=== What makes retrievers useful? + +Here's an overview of what makes retrievers useful and how they differ from +regular queries. + +. *Simplified user experience*. Retrievers simplify the user experience by +allowing entire retrieval pipelines to be configured in a single API call. They +maintain backward compatibility with traditional query elements by +automatically translating them to the appropriate retriever. +. *Structured retrieval*. Retrievers provide a more structured way to define search +operations. They allow searches to be described using a "retriever tree", a +hierarchical structure that clarifies the sequence and logic of operations, +making complex searches more understandable and manageable. +. *Composability and flexibility*. Retrievers enable flexible composability, +allowing you to build pipelines and seamlessly integrate different retrieval +strategies into these pipelines. Retrievers make it easy to test out different +retrieval strategy combinations. +. *Compound operations*. A retriever can have sub retrievers. This +allows complex nested searches where the results of one retriever feed into +another, supporting sophisticated querying strategies that might involve +multiple stages or criteria. +. *Retrieval as a first-class concept*. Unlike +traditional queries, where the query is a part of a larger search API call, +retrievers are designed as standalone entities that can be combined or used in +isolation. This enables a more modular and flexible approach to constructing +searches. +. 
*Enhanced control over document scoring and ranking*. Retrievers +allow for more explicit control over how documents are scored and filtered. For +instance, you can specify minimum score thresholds, apply complex filters +without affecting scoring, and use parameters like `terminate_after` for +performance optimizations. +. *Integration with existing {es} functionalities*. Even though +retrievers can be used instead of existing `_search` API syntax (like +`query` and `knn`), they are designed to integrate seamlessly with things like +pagination (`search_after`) and sorting. They also maintain compatibility with +aggregation operations by treating the combination of all leaf retrievers as +`should` clauses in a boolean query. +. *Cleaner separation of concerns*. When using compound retrievers, only the +query element is allowed, which enforces a cleaner separation of concerns +and prevents the complexity that might arise from overly nested or +interdependent configurations. + +[discrete] +[[retrievers-overview-example]] +=== Example + +The following example demonstrates how retrievers +simplify composing queries for RRF ranking. + +[source,js] +---- +GET example-index/_search +{ + "retriever": { + "rrf": { + "retrievers": [ + { + "standard": { + "query": { + "text_expansion": { + "vector.tokens": { + "model_id": ".elser_model_2", + "model_text": "What blue shoes are on sale?" + } + } + } + } + }, + { + "standard": { + "query": { + "match": { + "text": "blue shoes sale" + } + } + } + } + ] + } + } +} +---- +//NOTCONSOLE + +This example demonstrates how you can combine different +retrieval strategies into a single `retriever` pipeline. 
+ +Compare this to the RRF approach that uses `sub_searches`: + +.*Expand* for example +[%collapsible] +============== + +[source,js] +---- +GET example-index/_search +{ + "sub_searches":[ + { + "query":{ + "match":{ + "text":"blue shoes sale" + } + } + }, + { + "query":{ + "text_expansion":{ + "vector.tokens":{ + "model_id":".elser_model_2", + "model_text":"What blue shoes are on sale?" + } + } + } + } + ], + "rank":{ + "rrf":{ + "window_size":50, + "rank_constant":20 + } + } +} +---- +//NOTCONSOLE +============== + +[discrete] +[[retrievers-overview-glossary]] +=== Glossary + +Here are some important terms: + +* *Retrieval Pipeline*. Defines the entire retrieval and ranking logic to +produce top hits. +* *Retriever Tree*. A hierarchical structure that defines how retrievers interact. +* *First-stage Retriever*. Returns an initial set of candidate documents. +* *Compound Retriever*. Builds on one or more retrievers, +enhancing document retrieval and ranking logic. +* *Combiners*. Compound retrievers that merge top hits +from multiple sub-retrievers. +//* NOT YET *Rerankers*. Special compound retrievers that reorder hits and may adjust the number of hits, with distinctions between first-stage and second-stage rerankers. + +[discrete] +[[retrievers-overview-play-in-search]] +=== Retrievers in action + +The Search Playground builds Elasticsearch queries using the retriever abstraction. +It automatically detects the fields and types in your index and builds a retriever tree based on your selections. + +You can use the Playground to experiment with different retriever configurations and see how they affect search results. + +Refer to the {kibana-ref}/playground.html[Playground documentation] for more information. 
+// Content coming in https://github.com/elastic/kibana/pull/182692 + + + diff --git a/docs/reference/search/search-your-data/search-your-data.asciidoc b/docs/reference/search/search-your-data/search-your-data.asciidoc index bed204985296c..e1c1618410f2f 100644 --- a/docs/reference/search/search-your-data/search-your-data.asciidoc +++ b/docs/reference/search/search-your-data/search-your-data.asciidoc @@ -43,10 +43,11 @@ DSL, with a simplified user experience. Create search applications based on your results directly in the Kibana Search UI. include::search-api.asciidoc[] -include::search-application-overview.asciidoc[] include::knn-search.asciidoc[] include::semantic-search.asciidoc[] +include::retrievers-overview.asciidoc[] include::learning-to-rank.asciidoc[] include::search-across-clusters.asciidoc[] include::search-with-synonyms.asciidoc[] +include::search-application-overview.asciidoc[] include::behavioral-analytics/behavioral-analytics-overview.asciidoc[]