
Build an LLM app with RAG using Llama 3.2 Running Locally

Fully Functional LLM App that is 100% free and without internet (step-by-step instructions)

Meta’s new open-source models, Llama 3.2, are all the rage. But have you started building with them yet? If not, now’s the perfect time to dive in. With smaller sizes, faster token generation, and high accuracy, Llama 3.2 opens up new possibilities for building AI apps with RAG.

In this tutorial, we’ll show you how to build a simple yet powerful PDF Chat Assistant using Llama 3.2 and RAG. By the end, you’ll be able to upload PDFs, ask questions, and get highly accurate answers while the app runs entirely locally, free and without internet.

🎁 $50 worth of AI Bonus Content at the end!

What We’re Building

Our PDF Chat Assistant uses Llama 3.2 with RAG to analyze the content of a PDF document and answer questions based on it. This assistant will:

  • Use Streamlit for an easy-to-use interface

  • Combine RAG with the power of Llama 3.2

  • Use the Embedchain framework for RAG functionality

  • Use ChromaDB for vector storage of PDF content

Prerequisites

Before we begin, make sure you have:

  1. Python installed on your machine (version 3.7 or higher is recommended)

  2. Basic familiarity with Python programming

  3. A code editor of your choice (we recommend VSCode or PyCharm for their excellent Python support)

  4. Ollama installed and running locally, with the Llama 3.2 model pulled via 'ollama pull llama3.2' (the app talks to Ollama at http://localhost:11434; see the quick check after this list)
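
Before diving in, you may want to confirm that Ollama is actually reachable. Here’s a minimal sketch of such a check (assuming Ollama’s default port 11434; '/api/tags' is the Ollama endpoint that lists locally available models):

import urllib.request

# Quick sanity check: is a local Ollama server responding on its default port?
try:
    with urllib.request.urlopen("http://localhost:11434/api/tags", timeout=3) as resp:
        print("Ollama is up, status:", resp.status)
except OSError:
    print("Ollama doesn't appear to be running on localhost:11434")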

Step-by-Step Instructions

Step 1: Setting Up the Environment

First, let's get our development environment ready:

  1. Clone the GitHub repository:

git clone https://github.com/Shubhamsaboo/awesome-llm-apps.git
  2. Go to the chat_with_pdf folder and install the dependencies:

cd chat_with_pdf
pip install -r requirements.txt

Step 2: Creating the Streamlit App

Now that the environment is set, let’s create our Streamlit app. Create a new file chat_pdf.py and add the following code:

  • Import Required Libraries: At the top of your file, add:

import os
import tempfile  # temp files for uploads and the vector DB directory
import streamlit as st
from embedchain import App  # RAG pipeline: chunking, embedding, retrieval
import base64  # encode the PDF for inline preview
from streamlit_chat import message  # chat-style message bubbles
  • Configure the App: For this application, we’ll use Llama 3.2 running locally via Ollama. You can swap in OpenAI, Anthropic, or any other LLM provider that Embedchain supports (see the sketch after this code block).

    For the vector database, we use the open-source ChromaDB (you’re free to choose any other vector database).

def embedchain_bot(db_path):
    # LLM and embedder both run locally through Ollama; vectors persist in ChromaDB
    return App.from_config(
        config={
            "llm": {
                "provider": "ollama",
                "config": {"model": "llama3.2:latest", "max_tokens": 250, "temperature": 0.5, "stream": True, "base_url": "http://localhost:11434"},
            },
            "vectordb": {"provider": "chroma", "config": {"dir": db_path}},
            "embedder": {"provider": "ollama", "config": {"model": "llama3.2:latest", "base_url": "http://localhost:11434"}},
        }
    )
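
Prefer a hosted model instead? Here’s a minimal sketch of that swap (the model name is just an example, and it assumes an OpenAI key in the OPENAI_API_KEY environment variable; config keys follow Embedchain’s provider conventions, so check the docs for your version):

def embedchain_bot_openai(db_path):
    # Hypothetical variant: same ChromaDB storage, OpenAI as the LLM provider
    return App.from_config(
        config={
            "llm": {"provider": "openai", "config": {"model": "gpt-4o-mini", "temperature": 0.5}},
            "vectordb": {"provider": "chroma", "config": {"dir": db_path}},
        }
    )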
  • Handling PDF Upload and Display: Add a function to display PDFs in the Streamlit app. This lets users preview an uploaded PDF before adding it to the knowledge base:

def display_pdf(file):
    # Base64-encode the PDF and render it inline in an iframe
    base64_pdf = base64.b64encode(file.read()).decode('utf-8')
    pdf_display = f'<iframe src="data:application/pdf;base64,{base64_pdf}" width="100%" height="400" type="application/pdf"></iframe>'
    st.markdown(pdf_display, unsafe_allow_html=True)
  • Set Up the Streamlit App: Streamlit lets you build a user interface with just Python code. For this app we will:
    • Add a title to the app using 'st.title()'
    • Add a description for the app using 'st.caption()'

st.title("Chat with PDF using Llama 3.2")
st.caption("This app allows you to chat with a PDF using Llama 3.2 running locally with Ollama!")

# Temporary directory for ChromaDB; st.session_state persists the app and
# chat history across Streamlit reruns
db_path = tempfile.mkdtemp()

if 'app' not in st.session_state:
    st.session_state.app = embedchain_bot(db_path)
if 'messages' not in st.session_state:
    st.session_state.messages = []
  • Create a Sidebar for PDF Upload and Preview: Users can upload a PDF and preview it here:

with st.sidebar:
    st.header("PDF Upload")
    pdf_file = st.file_uploader("Upload a PDF file", type="pdf")

    if pdf_file:
        st.subheader("PDF Preview")
        display_pdf(pdf_file)
  • Adding the PDF to the Knowledge Base: When a PDF is uploaded, the content is processed and added to ChromaDB for retrieval

        # Still inside the sidebar's `if pdf_file:` block: write the upload to a
        # temporary file so Embedchain can ingest it, then clean up
        if st.button("Add to Knowledge Base"):
            with st.spinner("Adding PDF to knowledge base..."):
                with tempfile.NamedTemporaryFile(delete=False, suffix=".pdf") as f:
                    f.write(pdf_file.getvalue())
                    st.session_state.app.add(f.name, data_type="pdf_file")
                os.remove(f.name)
            st.success(f"Added {pdf_file.name} to knowledge base!")
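
To see what this step does outside Streamlit, here’s a minimal standalone sketch of the same ingest-then-ask flow (assumes Ollama is running; '/tmp/demo_db', 'sample.pdf', and the question are all placeholders):

# Hypothetical quick test, reusing embedchain_bot from above
bot = embedchain_bot("/tmp/demo_db")
bot.add("sample.pdf", data_type="pdf_file")      # chunk, embed, and store the PDF
print(bot.chat("What is this document about?"))  # retrieve relevant chunks and answer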
  • Chat Interface: Create a chat interface to allow users to ask questions about the PDF

# Replay the chat history, then capture a new question from the user
for i, msg in enumerate(st.session_state.messages):
    message(msg["content"], is_user=msg["role"] == "user", key=str(i))

if prompt := st.chat_input("Ask a question about the PDF"):
    st.session_state.messages.append({"role": "user", "content": prompt})
    message(prompt, is_user=True)
  • Process User Questions and Display Responses: Generate an answer with RAG, and add a button below to clear the chat history:

    # Still inside the `if prompt := ...` block: answer with RAG and render it
    with st.spinner("Thinking..."):
        response = st.session_state.app.chat(prompt)
        st.session_state.messages.append({"role": "assistant", "content": response})
        message(response)

if st.button("Clear Chat History"):
    st.session_state.messages = []
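
One design note: Embedchain’s chat() keeps conversational memory for the session, which is what makes follow-up questions work. If you don’t need that, query() is its stateless counterpart; a one-line sketch of the swap (verify against the Embedchain docs for your version):

# Stateless alternative: each question is answered independently, without chat memory
response = st.session_state.app.query(prompt)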

Step 3: Running the App

With our code in place, it's time to launch the app and start chatting with your PDFs.

  • Start the Streamlit App: In your terminal, navigate to the project folder and run the following command:

streamlit run chat_pdf.py
  • Access Your AI Assistant: Streamlit will provide a local URL (typically http://localhost:8501). Open it in your web browser, upload a PDF, and start asking questions about it.

Conclusion

You’ve successfully built a PDF Chat Assistant powered by Meta’s Llama 3.2 and RAG running locally.

The Llama 3.2 model’s speed and improved comprehension, combined with the versatility of ChromaDB, give you a solid foundation to build more advanced applications.

Whether you’re extending this project to support more document types, adding multi-document querying, or scaling with cloud integrations, you now have the tools to push the boundaries of what’s possible with the Llama 3.2 ecosystem.

We share hands-on tutorials like this 2-3 times a week to help you stay ahead in the world of AI. If you're serious about levelling up your AI skills, subscribe now and be the first to access our latest tutorials.

Bonus worth $50 💵💰

Share this newsletter on your social channels and tag Unwind AI (X, LinkedIn, Threads, Facebook) to get an AI resource pack worth $50 for FREE. Valid for a limited time only!
