- unwind ai
- Posts
- Build and Deploy RAG-as-a-service
Build and Deploy RAG-as-a-service
RAG-as-a-service with Claude 3.5 Sonnet in less than 50 lines of Python Code (step-by-step instructions)
RAG is becoming a game-changer for applications that need accurate information from large datasets. As developers, we know the value of building tools that can search documents and provide relevant answers quickly. Today, we’ll take that one step further.
In this tutorial, we’ll walk you through building a production-ready RAG service using Claude 3.5 Sonnet and Ragie.ai, integrated into a clean, user-friendly Streamlit interface. With less than 50 lines of Python code, you’ll create a system that retrieves and queries documents—ready for real-world use.
What makes RAG-as-a-Service unique?
Unlike a typical RAG app, RAG-as-a-Service abstracts complex data ingestion, chunking, and vector retrieval through managed APIs, making it scalable and easy to integrate across products. This means fewer headaches managing infrastructure and more focus on building features.
What is Ragie.ai?
Ragie.ai is a fully managed RAG-as-a-Service for developers. It offers connectors for services like Google Drive, Notion, and Confluence, along with APIs for document upload and retrieval. It handles the entire pipeline—from chunking to hybrid keyword and semantic searches—so you can start with minimal setup.
🎁 $50 worth AI Bonus Content at the end!
What We’re Building
This implementation allows you to create a document querying system with a user-friendly Streamlit interface in less than 50 lines of Python code.
Features
Production-ready RAG pipeline
Integration with Claude 3.5 Sonnet for response generation
Document upload from URLs
Real-time document querying
Support for both fast and accurate document processing modes
Prerequisites
Before we begin, make sure you have:
Python installed on your machine (version 3.7 or higher is recommended)
Your Anthropic API Key and Ragie API Key
Basic familiarity with Python programming
A code editor of your choice (we recommend VS Code or PyCharm for their excellent Python support)
Step-by-Step Instructions
Setting Up the Environment
First, let's get our development environment ready:
Clone the GitHub repository:
git clone https://github.com/Shubhamsaboo/awesome-llm-apps.git
Go to the rag-as-a-service folder:
cd rag-as-a-service
Install the required dependencies:
pip install -r requirements.txt
Get your API Keys: Sign up for an Anthropic account and Ragie account to obtain your API key.
Creating the Streamlit App
Let’s create our app. Create a new file rag_app.py
and add the following code:
Import Required Libraries:
• Streamlit for the web interface
• Requests for API calls• Anthropic for Claude integration
• Time for handling delays
• Typing for type hints
import streamlit as st
import requests
from anthropic import Anthropic
import time
from typing import List, Dict, Optional
from urllib.parse import urlparse
Let's create our RAGPipeline class:
• Initializes API clients
• Sets up API endpoints
• Manages authentication
class RAGPipeline:
def __init__(self, ragie_api_key: str, anthropic_api_key: str):
"""
Initialize the RAG pipeline with API keys.
"""
self.ragie_api_key = ragie_api_key
self.anthropic_api_key = anthropic_api_key
self.anthropic_client = Anthropic(api_key=anthropic_api_key)
# API endpoints
self.RAGIE_UPLOAD_URL = "https://api.ragie.ai/documents/url"
self.RAGIE_RETRIEVAL_URL = "https://api.ragie.ai/retrievals"
Add the Document Upload functionality:
• Handles document uploads via URL
• Supports fast/accurate modes
• Auto-generates document names
def upload_document(self, url: str, name: Optional[str] = None, mode: str = "fast") -> Dict:
return response.json()
Add the Retrieval functionality:
• Retrieves relevant text chunks
• Supports scoped searches
• Returns scored chunks
def retrieve_chunks(self, query: str, scope: str = "tutorial") -> List[str]:
return [chunk["text"] for chunk in data["scored_chunks"]]
def create_system_prompt(self, chunk_texts: List[str]) -> str:
return f"""These are very important to follow: You are "Ragie AI", a professional but friendly AI chatbot working as an assistant to the user. Your current task is to help the user based on all of the information available to you shown below. Answer informally, directly, and concisely without a heading or greeting but include everything relevant. Use richtext Markdown when appropriate including bold, italic, paragraphs, and lists when helpful. If using LaTeX, use double $$ as delimiter instead of single $. Use $$...$$ instead of parentheses. Organize information into multiple sections or points when appropriate. Don't include raw item IDs or other raw fields from the source. Don't use XML or other markup unless requested by the user. Here is all of the information available to answer the user: === {chunk_texts} === If the user asked for a search and there are no results, make sure to let the user know that you couldn't find anything, and what they might be able to do to find the information they need. END SYSTEM INSTRUCTIONS"""
Response generation with Claude Sonnet 3.5:
• Uses Claude
• Includes system instructions
• Manages token limits
def generate_response(self, system_prompt: str, query: str) -> str:
return message.content[0].text
def process_query(self, query: str, scope: str = "tutorial") -> str:
return self.generate_response(system_prompt, query)
def initialize_session_state():
st.session_state.api_keys_submitted = False
Setup the Streamlit app:
• Sets up wide layout
• Initializes session state
• Creates clean UI
def main():
st.set_page_config(page_title="RAG-as-a-Service", layout="wide")
initialize_session_state()
st.title("🖇️ RAG-as-a-Service")
API key configuration and setup:
• Secure API key input
• Column layout
• Expandable section
# API Keys Section
with st.expander("🔑 API Keys Configuration", expanded=not st.session_state.api_keys_submitted):
col1, col2 = st.columns(2)
with col1:
ragie_key = st.text_input("Ragie API Key", type="password", key="ragie_key")
with col2:
anthropic_key = st.text_input("Anthropic API Key", type="password", key="anthropic_key")
if st.button("Submit API Keys"):
if ragie_key and anthropic_key:
try:
st.session_state.pipeline = RAGPipeline(ragie_key, anthropic_key)
st.session_state.api_keys_submitted = True
st.success("API keys configured successfully!")
except Exception as e:
st.error(f"Error configuring API keys: {str(e)}")
else:
st.error("Please provide both API keys.")
Create the document upload interface:
• URL-based uploads
• Optional naming
• Mode selection
# Document Upload Section
if st.session_state.api_keys_submitted:
st.markdown("### 📄 Document Upload")
doc_url = st.text_input("Enter document URL")
doc_name = st.text_input("Document name (optional)")
col1, col2 = st.columns([1, 3])
with col1:
upload_mode = st.selectbox("Upload mode", ["fast", "accurate"])
if st.button("Upload Document"):
if doc_url:
try:
with st.spinner("Uploading document..."):
st.session_state.pipeline.upload_document(
url=doc_url,
name=doc_name if doc_name else None,
mode=upload_mode
)
time.sleep(5) # Wait for indexing
st.session_state.document_uploaded = True
st.success("Document uploaded and indexed successfully!")
except Exception as e:
st.error(f"Error uploading document: {str(e)}")
else:
st.error("Please provide a document URL.")
Create the Query Interface:
• Simple query input
• Loading indicators
• Markdown response display
# Query Section
if st.session_state.document_uploaded:
st.markdown("### 🔍 Query Document")
query = st.text_input("Enter your query")
if st.button("Generate Response"):
if query:
try:
with st.spinner("Generating response..."):
response = st.session_state.pipeline.process_query(query)
st.markdown("### Response:")
st.markdown(response)
except Exception as e:
st.error(f"Error generating response: {str(e)}")
else:
st.error("Please enter a query.")
if __name__ == "__main__":
main()
How the Code Works
The code above defines a RAG pipeline that integrates with Ragie.ai and Claude 3.5 Sonnet. The process is straightforward:
Upload a Document: The document is stored on Ragie and made searchable.
Retrieve Chunks: Based on the query, relevant sections from the uploaded document are fetched.
Generate a Response: Using Claude 3.5 Sonnet, the system generates a reply by synthesizing the information retrieved.
Running the App
With our code in place, it's time to launch the app.
In your terminal, navigate to the project folder, and run the following command
streamlit run rag_app.py
Streamlit will provide a local URL (typically http://localhost:8501). Open this in your web browser, put in your API keys, give it a URL, and have fun!
Working Application Demo
Conclusion
And you’ve just built a RAG-as-a-Service using Claude 3.5 Sonnet and Ragie.ai, all with minimal effort.
This setup can now be expanded further:
Adding voice input for queries.
Extending retrieval capabilities with more document types.
Deploying the app to the cloud for wider access.
Keep experimenting and refining to build even smarter AI solutions!
We share hands-on tutorials like this 2-3 times a week, to help you stay ahead in the world of AI. If you're serious about levelling up your AI skills and staying ahead of the curve, subscribe now and be the first to access our latest tutorials.
Reply