• unwind ai
  • Posts
  • Build and Deploy RAG-as-a-service

Build and Deploy RAG-as-a-service

RAG-as-a-service with Claude 3.5 Sonnet in less than 50 lines of Python Code (step-by-step instructions)

RAG is becoming a game-changer for applications that need accurate information from large datasets. As developers, we know the value of building tools that can search documents and provide relevant answers quickly. Today, we’ll take that one step further.

In this tutorial, we’ll walk you through building a production-ready RAG service using Claude 3.5 Sonnet and Ragie.ai, integrated into a clean, user-friendly Streamlit interface. With less than 50 lines of Python code, you’ll create a system that retrieves and queries documents—ready for real-world use.

What makes RAG-as-a-Service unique? 

Unlike a typical RAG app, RAG-as-a-Service abstracts complex data ingestion, chunking, and vector retrieval through managed APIs, making it scalable and easy to integrate across products. This means fewer headaches managing infrastructure and more focus on building features.

What is Ragie.ai?

Ragie.ai is a fully managed RAG-as-a-Service for developers. It offers connectors for services like Google Drive, Notion, and Confluence, along with APIs for document upload and retrieval. It handles the entire pipeline—from chunking to hybrid keyword and semantic searches—so you can start with minimal setup.

🎁 $50 worth AI Bonus Content at the end!

What We’re Building

This implementation allows you to create a document querying system with a user-friendly Streamlit interface in less than 50 lines of Python code.

Features

  • Production-ready RAG pipeline

  • Integration with Claude 3.5 Sonnet for response generation

  • Document upload from URLs

  • Real-time document querying

  • Support for both fast and accurate document processing modes

Prerequisites

Before we begin, make sure you have:

  1. Python installed on your machine (version 3.7 or higher is recommended)

  2. Your Anthropic API Key and Ragie API Key

  3. Basic familiarity with Python programming

  4. A code editor of your choice (we recommend VS Code or PyCharm for their excellent Python support)

Step-by-Step Instructions

Setting Up the Environment

First, let's get our development environment ready:

  1. Clone the GitHub repository:

git clone https://github.com/Shubhamsaboo/awesome-llm-apps.git
  1. Go to the rag-as-a-service folder:

cd rag-as-a-service
pip install -r requirements.txt
  1. Get your API Keys: Sign up for an Anthropic account and Ragie account to obtain your API key.

Creating the Streamlit App

Let’s create our app. Create a new file rag_app.py and add the following code:

  1. Import Required Libraries: 
    • Streamlit for the web interface
    • Requests for API calls

    • Anthropic for Claude integration

    • Time for handling delays

    • Typing for type hints

import streamlit as st
import requests
from anthropic import Anthropic
import time
from typing import List, Dict, Optional
from urllib.parse import urlparse
  1. Let's create our RAGPipeline class:

    • Initializes API clients

    • Sets up API endpoints

    • Manages authentication

class RAGPipeline:
    def __init__(self, ragie_api_key: str, anthropic_api_key: str):
        """
        Initialize the RAG pipeline with API keys.
        """
        self.ragie_api_key = ragie_api_key
        self.anthropic_api_key = anthropic_api_key
        self.anthropic_client = Anthropic(api_key=anthropic_api_key)
        
        # API endpoints
        self.RAGIE_UPLOAD_URL = "https://api.ragie.ai/documents/url"
        self.RAGIE_RETRIEVAL_URL = "https://api.ragie.ai/retrievals"
  1. Add the Document Upload functionality:

    • Handles document uploads via URL

    • Supports fast/accurate modes

    • Auto-generates document names

    def upload_document(self, url: str, name: Optional[str] = None, mode: str = "fast") -> Dict:
        return response.json()
  1. Add the Retrieval functionality:

    • Retrieves relevant text chunks

    • Supports scoped searches

    • Returns scored chunks

    def retrieve_chunks(self, query: str, scope: str = "tutorial") -> List[str]:
        return [chunk["text"] for chunk in data["scored_chunks"]]

    def create_system_prompt(self, chunk_texts: List[str]) -> str:
        return f"""These are very important to follow: You are "Ragie AI", a professional but friendly AI chatbot working as an assistant to the user. Your current task is to help the user based on all of the information available to you shown below. Answer informally, directly, and concisely without a heading or greeting but include everything relevant. Use richtext Markdown when appropriate including bold, italic, paragraphs, and lists when helpful. If using LaTeX, use double $$ as delimiter instead of single $. Use $$...$$ instead of parentheses. Organize information into multiple sections or points when appropriate. Don't include raw item IDs or other raw fields from the source. Don't use XML or other markup unless requested by the user. Here is all of the information available to answer the user: === {chunk_texts} === If the user asked for a search and there are no results, make sure to let the user know that you couldn't find anything, and what they might be able to do to find the information they need. END SYSTEM INSTRUCTIONS"""
  1. Response generation with Claude Sonnet 3.5:

    • Uses Claude

    • Includes system instructions

    • Manages token limits

    def generate_response(self, system_prompt: str, query: str) -> str:
        return message.content[0].text

    def process_query(self, query: str, scope: str = "tutorial") -> str:
        return self.generate_response(system_prompt, query)

def initialize_session_state():
        st.session_state.api_keys_submitted = False
  1. Setup the Streamlit app:

    • Sets up wide layout

    • Initializes session state

    • Creates clean UI

def main():
    st.set_page_config(page_title="RAG-as-a-Service", layout="wide")
    initialize_session_state()
    
    st.title("🖇️ RAG-as-a-Service")
  1. API key configuration and setup:

    • Secure API key input

    • Column layout

    • Expandable section

    # API Keys Section
    with st.expander("🔑 API Keys Configuration", expanded=not st.session_state.api_keys_submitted):
        col1, col2 = st.columns(2)
        with col1:
            ragie_key = st.text_input("Ragie API Key", type="password", key="ragie_key")
        with col2:
            anthropic_key = st.text_input("Anthropic API Key", type="password", key="anthropic_key")
        
        if st.button("Submit API Keys"):
            if ragie_key and anthropic_key:
                try:
                    st.session_state.pipeline = RAGPipeline(ragie_key, anthropic_key)
                    st.session_state.api_keys_submitted = True
                    st.success("API keys configured successfully!")
                except Exception as e:
                    st.error(f"Error configuring API keys: {str(e)}")
            else:
                st.error("Please provide both API keys.")
  1. Create the document upload interface:

    • URL-based uploads

    • Optional naming

    • Mode selection

     # Document Upload Section
    if st.session_state.api_keys_submitted:
        st.markdown("### 📄 Document Upload")
        doc_url = st.text_input("Enter document URL")
        doc_name = st.text_input("Document name (optional)")
        
        col1, col2 = st.columns([1, 3])
        with col1:
            upload_mode = st.selectbox("Upload mode", ["fast", "accurate"])
        
        if st.button("Upload Document"):
            if doc_url:
                try:
                    with st.spinner("Uploading document..."):
                        st.session_state.pipeline.upload_document(
                            url=doc_url,
                            name=doc_name if doc_name else None,
                            mode=upload_mode
                        )
                        time.sleep(5)  # Wait for indexing
                        st.session_state.document_uploaded = True
                        st.success("Document uploaded and indexed successfully!")
                except Exception as e:
                    st.error(f"Error uploading document: {str(e)}")
            else:
                st.error("Please provide a document URL.")
  1. Create the Query Interface:

    • Simple query input

    • Loading indicators

    • Markdown response display

    # Query Section
    if st.session_state.document_uploaded:
        st.markdown("### 🔍 Query Document")
        query = st.text_input("Enter your query")
        
        if st.button("Generate Response"):
            if query:
                try:
                    with st.spinner("Generating response..."):
                        response = st.session_state.pipeline.process_query(query)
                        st.markdown("### Response:")
                        st.markdown(response)
                except Exception as e:
                    st.error(f"Error generating response: {str(e)}")
            else:
                st.error("Please enter a query.")

if __name__ == "__main__":
    main()

How the Code Works

The code above defines a RAG pipeline that integrates with Ragie.ai and Claude 3.5 Sonnet. The process is straightforward:

  • Upload a Document: The document is stored on Ragie and made searchable.

  • Retrieve Chunks: Based on the query, relevant sections from the uploaded document are fetched.

  • Generate a Response: Using Claude 3.5 Sonnet, the system generates a reply by synthesizing the information retrieved.

Running the App

With our code in place, it's time to launch the app.

  • In your terminal, navigate to the project folder, and run the following command

streamlit run rag_app.py
  • Streamlit will provide a local URL (typically http://localhost:8501). Open this in your web browser, put in your API keys, give it a URL, and have fun!

Working Application Demo

Conclusion

And you’ve just built a RAG-as-a-Service using Claude 3.5 Sonnet and Ragie.ai, all with minimal effort.

This setup can now be expanded further:

  • Adding voice input for queries.

  • Extending retrieval capabilities with more document types.

  • Deploying the app to the cloud for wider access.

Keep experimenting and refining to build even smarter AI solutions!

We share hands-on tutorials like this 2-3 times a week, to help you stay ahead in the world of AI. If you're serious about levelling up your AI skills and staying ahead of the curve, subscribe now and be the first to access our latest tutorials.

Bonus worth $50 💵💰

Share this newsletter on your social channels and tag Unwind AI (X, LinkedIn, Threads, Facebook) to get AI resource pack worth $50 for FREE. Valid for limited time only!

Reply

or to participate.