Advanced RAG Techniques
Gen AI is a fast-moving field, and new techniques are being developed all the time.
This section describes some of the advanced RAG techniques that can be implemented with RAGStack.
RAG fusion
RAG fusion generates similar queries to the user’s query and retrieves relevant context for both the original query as well as the generated similar queries. RAG fusion increases the likelihood that the query process has selected the most useful context for generating accurate results.
FLARE
Forward-looking active retrieval, or FLARE, is an example of a multi-query RAG technique that involves iteratively calling the LLM with custom instructions in your prompt asking the LLM to provide additional questions about key phrases that would help it generate a better answer. Once the LLM has context with no gaps, it terminates with the final response.
FLARE adds a loop between the LLM and the AI agent to facilitate these iterations, and uses logprobs returned from the LLM to identify uncertain tokens that need additional information.