Building an End-to-End Azure RAG Strategy Agent with MS Foundry
This architecture implements a complete Retrieval-Augmented Generation (RAG) pipeline. It reads raw documents from Azure Blob Storage, extracts their content with Document Intelligence, embeds the resulting chunks with Azure OpenAI, and indexes them in Azure AI Search for hybrid retrieval. A Foundry/MAF-based agent handles each query by combining it with the relevant search results and generating a grounded reply, exposed through a FastAPI endpoint or a CLI.
Azure-based RAG Pipeline with Agent-Orchestration
This solution features two primary layers:
The first layer, data ingestion, converts raw enterprise documents into searchable knowledge.
- Raw documents stored in Azure Blob Storage
  - Accepted formats include: PDF, DOCX, PPTX, images, etc.
- Document Intelligence extraction
  - This step extracts:
    - Text content
    - Tables
    - Key-value pairs
    - Document structure
  - The output is written as structured JSON back to Blob (processed/)
- Chunking + Embedding
  - Documents are divided into manageable chunks
  - Each chunk is embedded using Azure OpenAI (text-embedding-*)
- Indexing into Azure AI Search
  - This creates a hybrid index:
    - Keyword search capabilities
    - Semantic ranking features
    - Vector search functionality
  - This enables versatile retrieval strategies
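The chunking step above can be sketched as follows. This is a minimal illustration, not the repository's implementation: the function name `chunk_text`, the word-based splitting, and the `chunk_size` and `overlap` defaults are all assumptions chosen for clarity. Overlapping chunks are a common way to keep a sentence that straddles a boundary retrievable from either side.

```python
def chunk_text(text: str, chunk_size: int = 200, overlap: int = 40) -> list[str]:
    """Split text into word-based chunks; consecutive chunks share `overlap` words.

    Hypothetical helper for illustration: sizes are counted in words, not tokens.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be >= 0 and smaller than chunk_size")

    words = text.split()
    step = chunk_size - overlap  # how far the window advances each iteration
    chunks: list[str] = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        # Stop once the current window has reached the end of the text,
        # so no trailing chunk made only of overlap words is emitted.
        if start + chunk_size >= len(words):
            break
    return chunks
```

In a real pipeline each returned chunk would then be sent to the embedding model and stored alongside its vector; token-based sizing or splitting on the document structure returned by Document Intelligence may work better than raw word counts.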
The second layer, agent orchestration, answers user queries using the indexed knowledge.
- Users submit queries using:
  - A FastAPI endpoint
  - A CLI interface
- The query processing is managed by:
  - A Microsoft Agent Framework (MAF) agent
  - Running on Azure AI Foundry
- The agent:
  - Queries Azure AI Search
  - Retrieves the most relevant chunks
  - Incorporates them into the LLM prompt
- The LLM then generates a grounded response
- This follows the typical RAG sequence:
  - Retrieval → Augmentation → Generation
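The augmentation step in that sequence can be sketched in a few lines. This is an illustrative assumption rather than the agent's actual code: `RetrievedChunk`, `build_grounded_prompt`, the `top_k` default, and the prompt wording are hypothetical, and in the real system the chunks would come from Azure AI Search and the prompt would be passed to the LLM by the agent.

```python
from dataclasses import dataclass


@dataclass
class RetrievedChunk:
    """Hypothetical container for one search hit."""
    source: str   # e.g. the originating document name
    text: str     # chunk content
    score: float  # relevance score from the retriever (higher is better)


def build_grounded_prompt(question: str, chunks: list[RetrievedChunk], top_k: int = 3) -> str:
    """Combine the user question with the top-k retrieved chunks into one prompt."""
    # Keep only the highest-scoring chunks to stay within the context budget.
    ranked = sorted(chunks, key=lambda c: c.score, reverse=True)[:top_k]
    # Number each source so the model can cite it in the answer.
    context = "\n\n".join(
        f"[{i}] ({c.source}) {c.text}" for i, c in enumerate(ranked, start=1)
    )
    return (
        "Answer the question using only the sources below. "
        "If the sources do not contain the answer, say so.\n\n"
        f"Sources:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )
```

Instructing the model to rely only on the numbered sources is what makes the response "grounded": the answer can be traced back to specific retrieved chunks.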
| Service | Purpose |
| --- | --- |
| Azure Blob Storage | Stores raw and processed documents |
| Azure AI Document Intelligence | Extracts structured content from documents |
| Azure OpenAI | Handles embeddings and LLM generation |
| Azure AI Search | Functions as a hybrid retrieval engine |
| Azure AI Foundry | Manages agent orchestration |
| Microsoft Agent Framework | Operates as the execution layer for agents |
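To make "hybrid retrieval" more concrete: Azure AI Search merges keyword and vector result lists using Reciprocal Rank Fusion (RRF). The sketch below shows the idea in plain Python. It is not the service's implementation; the function name and the constant `k = 60` (a value commonly used for RRF) are assumptions for illustration, and semantic ranking is a separate reranking step that is not shown here.

```python
def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[tuple[str, float]]:
    """Fuse several ranked lists of document IDs into one list of (doc_id, score).

    Each document receives 1 / (k + rank) from every list it appears in,
    so documents ranked highly by several retrievers rise to the top.
    """
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Sort by descending fused score; ties are broken by document ID for determinism.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical result lists from a keyword query and a vector query.
keyword_hits = ["a", "b", "c"]
vector_hits = ["b", "d", "a"]
fused = reciprocal_rank_fusion([keyword_hits, vector_hits])
```

Because RRF works on ranks rather than raw scores, it can combine BM25-style keyword scores and vector similarity scores without having to normalise them onto a common scale.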
This solution extends beyond basic RAG by offering:
- A combination of keyword, semantic, and vector search
- Enhanced recall and accuracy
- Capability to process complex enterprise documents
- Extraction of tables and metadata
- Ability to reason over retrieval results
- Scalability for multi-agent workflows
- Support for continuous data ingestion
- Compatibility with large document collections
For security and governance, the following practices are recommended:
- Use Managed Identity for secure access
- Implement RBAC on Cosmos DB / Search / Storage
- Enable Private Endpoints for improved network isolation
- Employ Guardrails + Evaluations in Foundry
This repository presents a production-ready Azure RAG architecture:
- Ingest → Extract → Chunk → Embed → Index
- Retrieve → Reason → Generate
- All running on Azure AI Foundry with the Microsoft Agent Framework
By merging data engineering and AI orchestration, this setup allows for enterprise-grade AI systems that are:
- Accurate
- Grounded
- Extensible
Repo: https://github.com/snd94/azure-rag-strategy-agent
For more detail, see the Microsoft Learn documentation.