Retrieval Augmented Generation (RAG) Overview
The Unstructured Query Pipeline (RAG 2.0) uses a large language model (LLM) to answer questions based only on relevant unstructured content. It includes modular components and a hybrid search strategy to improve both the quality and relevance of results.
To read through the implementation of RAG 2.0, refer to RAG Unified Tool.
This pipeline includes the following components:
- Query Rewriter (optional)
- Retriever
- Reranker (optional)
- Message Builder
- Query Answering
Query Rewriter
The Query Rewriter transforms ambiguous or incomplete queries into formats that improve retrieval quality. It expands acronyms, replaces unclear terms, adds context, and restructures the query as needed.
For example, the query "MTBF alarm 47" is rewritten as "mean time between failures for alarm code 47 in HVAC systems" by resolving acronyms and adding missing context. To resolve acronyms reliably, configure the rewriter with few-shot examples and a glossary.
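The glossary-driven part of this step can be sketched in a few lines. This is a minimal illustration only; a full rewriter would pass the query, glossary, and few-shot examples to an LLM. The `GLOSSARY` entries and the `rewrite_query` helper are hypothetical names, not part of the pipeline's API.

```python
# Minimal sketch of glossary-based acronym expansion (illustrative only;
# a production rewriter would call an LLM with few-shot examples).
GLOSSARY = {
    "MTBF": "mean time between failures",
    "SOP": "standard operating procedure",
}

def rewrite_query(query: str, glossary: dict) -> str:
    """Expand known acronyms so retrieval matches full-text documents."""
    words = []
    for word in query.split():
        # Strip trailing punctuation before the glossary lookup.
        key = word.strip(".,?!")
        words.append(glossary.get(key, word))
    return " ".join(words)

print(rewrite_query("MTBF alarm 47", GLOSSARY))
# mean time between failures alarm 47
```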
Retriever
The Retriever uses a hybrid search strategy that runs three methods in parallel, overcoming the limitations of any single search technique.
- Semantic retrieval: finds content with similar meaning.
- Metadata-based retrieval: filters content using structured tags such as process or department.
- Keyword-based retrieval: matches explicit terms, such as alarm codes, that appear in the user query.
For example, if you search “why did the packaging machine stop,” semantic retrieval finds related phrases like “machine halt,” metadata filters for the packaging department or machine ID, and keyword search ensures terms like “error code 302” are included.
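One common way to merge ranked lists from parallel retrievers is reciprocal rank fusion (RRF). The sketch below assumes each retriever returns an ordered list of document IDs; the function name and sample IDs are illustrative, and the actual pipeline may use a different fusion method.

```python
from collections import defaultdict

def reciprocal_rank_fusion(result_lists, k=60):
    """Merge ranked document-ID lists from parallel retrievers.

    Each document scores 1 / (k + rank) per list it appears in,
    so documents found by several retrievers rise to the top.
    """
    scores = defaultdict(float)
    for results in result_lists:
        for rank, doc_id in enumerate(results, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical results for "why did the packaging machine stop".
semantic = ["doc_halt", "doc_jam", "doc_sensor"]   # similar meaning
keyword  = ["doc_err302", "doc_halt"]              # "error code 302"
metadata = ["doc_halt", "doc_err302"]              # packaging department
print(reciprocal_rank_fusion([semantic, keyword, metadata]))
# ['doc_halt', 'doc_err302', 'doc_jam', 'doc_sensor']
```

The document found by all three retrievers (`doc_halt`) outranks any document found by only one.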
Reranker
The Reranker prioritizes the most relevant content so that useful results appear first. It ranks results using cross-encoder models, LLM scoring, or custom rules, and can include nearby context to improve understanding.
For example, if you search “latest temperature warnings on reactor 4,” the reranker promotes recent warnings tagged with reactor 4, prioritizes entries with terms like “overheat,” and includes surrounding sensor readings or operator notes for context.
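The custom-rule variant of reranking can be sketched with a simple scoring function. The rules here (tag match, boost terms, recency) and the result schema are assumptions for illustration; a cross-encoder or LLM scorer would replace the `score` function.

```python
def rerank(results, boost_terms, tag):
    """Reorder results with simple custom rules: tag match, boost terms, recency."""
    def score(r):
        s = 0.0
        if tag in r.get("tags", []):
            s += 2.0                      # exact asset/tag match
        s += sum(1.0 for t in boost_terms if t in r["text"].lower())
        s += r.get("recency", 0.0)        # 0..1, newer entries score higher
        return s
    return sorted(results, key=score, reverse=True)

hits = [
    {"text": "Routine inspection log", "tags": ["reactor-2"], "recency": 0.5},
    {"text": "Overheat warning on reactor 4", "tags": ["reactor-4"], "recency": 0.9},
]
top = rerank(hits, boost_terms={"overheat"}, tag="reactor-4")
print(top[0]["text"])
# Overheat warning on reactor 4
```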
Message Builder
The Message Builder formats retrieved content for LLM processing while preserving traceability. It converts content into structured inputs, adds citations, and includes few-shot examples if configured. You can control formatting with configuration flags or custom logic.
For example, if a procedure contains a diagram and a parts table, the Message Builder combines them into one structured prompt with section labels and citations to help the LLM understand and trace the content.
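The structured prompt this component produces might look like the sketch below. The chunk schema (`source`, `section`, `text`) and the `[n]` citation style are assumptions chosen for illustration, not the pipeline's actual format.

```python
def build_message(chunks):
    """Format retrieved chunks into one labeled, citable prompt block."""
    sections = []
    for i, chunk in enumerate(chunks, start=1):
        sections.append(
            f"[{i}] ({chunk['source']}, section: {chunk['section']})\n{chunk['text']}"
        )
    header = "Answer using only the sources below. Cite them as [n].\n\n"
    return header + "\n\n".join(sections)

chunks = [
    {"source": "maintenance_manual.pdf", "section": "Diagram 3.1",
     "text": "Exploded view of the valve assembly."},
    {"source": "maintenance_manual.pdf", "section": "Parts table",
     "text": "Part 47: gasket, 12 mm."},
]
print(build_message(chunks))
```

Keeping the source and section label next to each chunk is what preserves traceability: the LLM can cite `[2]` and a reader can follow it back to the parts table.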
Query Answering
The Query Answering component generates answers based only on retrieved content. It avoids adding external knowledge and enforces content grounding to maintain accuracy. It also manages token limits and supports custom logic for formatting responses.
For example, a maintenance operator querying about "restart steps for compressor 12" receives an LLM-generated answer that references only the retrieved standard operating procedure, avoiding generic or incorrect instructions.
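Grounding and token management are typically enforced in the prompt itself. The sketch below uses a rough one-token-per-word budget and a hypothetical refusal instruction; the real component's prompt wording, tokenizer, and limits will differ.

```python
def build_grounded_prompt(question, context, max_tokens=512):
    """Build a prompt that restricts the LLM to retrieved content and
    truncates the context to a rough token budget (~1 token per word here)."""
    words = context.split()
    if len(words) > max_tokens:
        context = " ".join(words[:max_tokens])
    return (
        "Answer ONLY from the context below. If the answer is not in the "
        "context, say you don't know.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"
    )

prompt = build_grounded_prompt(
    "What are the restart steps for compressor 12?",
    "SOP-12: 1) Close inlet valve. 2) Reset breaker. 3) Press restart.",
)
print(prompt)
```

The explicit "say you don't know" instruction is one common way to keep the model from falling back on generic, ungrounded instructions.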
This tool is available out of the box as part of the Multimodal Retrieval Toolkit for the Dynamic Agent.
Summary
The Unstructured Query Pipeline 2.0 combines modular components with hybrid retrieval to deliver accurate, grounded responses. Each stage, from query rewriting to LLM response generation, addresses specific operational challenges and can be configured independently. The pipeline supports a wide range of use cases across roles and domains, including maintenance, operations, education, and planning.