Build AI Research Agent with Memory to search Academic Papers
AI Agent to search research papers in just 40 lines of Python code (step-by-step instructions)
Google search is great, but how about building your own AI research assistant that not only searches academic papers but also remembers your preferences based on your past queries? Sounds complex, right?
This tutorial breaks it down step-by-step, guiding you through building an AI agent that queries arXiv, processes results intelligently, and retains user context over time using memory storage.
The app combines several components: GPT-4o-mini for parsing search results, MultiOn for web browsing, and Mem0 with Qdrant to manage user-specific memory. With just a few lines of code, you’ll have a personalized research assistant that gets smarter with every interaction.
What We’re Building
This Streamlit app implements an AI-powered research assistant that helps users search for academic papers on arXiv while maintaining a memory of user interests and past interactions. It utilizes OpenAI's GPT-4o-mini model for processing search results, MultiOn for web browsing, Mem0 as the intelligent memory layer, and Qdrant as the vector database for maintaining user context.
Features
Search interface for querying arXiv papers
AI-powered processing of search results for improved readability
Persistent memory of user interests and past searches
Utilizes OpenAI's GPT-4o-mini model for intelligent processing
Implements memory storage and retrieval using Mem0 and Qdrant
Prerequisites
Before we begin, make sure you have:
Python installed on your machine (version 3.7 or higher is recommended)
Your OpenAI API Key and MultiOn API Key
Basic familiarity with Python programming
A code editor of your choice (we recommend VS Code or PyCharm for their excellent Python support)
Step-by-Step Instructions
Setting Up the Environment
First, let's get our development environment ready:
Clone the GitHub repository:
git clone https://github.com/Shubhamsaboo/awesome-llm-apps.git
Go to the ai_arxiv_agent_memory folder:
cd ai_arxiv_agent_memory
Ensure Qdrant is running: The app expects Qdrant to be running on localhost:6333. Adjust the configuration in the code if your setup is different.
docker pull qdrant/qdrant
docker run -p 6333:6333 -p 6334:6334 \
-v $(pwd)/qdrant_storage:/qdrant/storage:z \
qdrant/qdrant
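Before moving on, it's worth confirming that Qdrant is actually reachable on port 6333. Here's a minimal Python check, assuming the requests package is available in your environment (it isn't part of the tutorial's stated requirements); adjust the host and port if your setup differs:
# Optional sanity check: confirm Qdrant answers on localhost:6333 before running the app.
# Assumes the "requests" package is installed; adjust host/port if your setup differs.
import requests

resp = requests.get("http://localhost:6333/collections", timeout=5)
resp.raise_for_status()
print("Qdrant is up. Collections:", resp.json())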
Install the required dependencies:
pip install -r requirements.txt
Get your API Keys:
Sign up for an OpenAI account (or the LLM provider of your choice) and a MultiOn account, then obtain your API keys.
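The app will ask for both keys in its input fields, but if you want to verify your OpenAI key works before wiring everything together, a quick standalone check like this can help. It's a minimal sketch that assumes the key is exported as the OPENAI_API_KEY environment variable:
# Optional: verify the OpenAI key before using it in the app.
# Assumes OPENAI_API_KEY is set in your shell environment.
import os
from openai import OpenAI

client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])
reply = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Reply with the single word: ok"}],
)
print(reply.choices[0].message.content)  # should print something like "ok"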
Creating the Streamlit App
Let’s create our Streamlit app. Create a new file ai_arxiv_agent_memory.py
and add the following code:
Import Required Libraries:
• Streamlit for building the web app
• OpenAI for using GPT-4o-mini
• MultiOn for browsing arXiv and retrieving the data (Internet of Agents)
• Mem0 for personalized memory layer
import streamlit as st
import os
from mem0 import Memory
from multion.client import MultiOn
from openai import OpenAI
Set up the Streamlit App:
• Add a title to the app using 'st.title()'
• Add password-type text input boxes for the OpenAI and MultiOn API keys using 'st.text_input()'
st.title("AI Research Agent with Memory 📚")
api_keys = {k: st.text_input(f"{k.capitalize()} API Key", type="password") for k in ['openai', 'multion']}
Initialize services if API keys are provided:
• Configures Mem0 with Qdrant as the vector store and GPT-4o-mini as its LLM
• Initializes MultiOn and OpenAI clients
if all(api_keys.values()):
    os.environ['OPENAI_API_KEY'] = api_keys['openai']
    # Initialize Mem0 with Qdrant as the vector store and GPT-4o-mini as the LLM
    config = {
        "llm": {
            "provider": "openai",
            "config": {"model": "gpt-4o-mini"},
        },
        "vector_store": {
            "provider": "qdrant",
            "config": {"host": "localhost", "port": 6333},
        },
    }
    memory, multion, openai_client = Memory.from_config(config), MultiOn(api_key=api_keys['multion']), OpenAI(api_key=api_keys['openai'])
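If you'd like to confirm the Mem0 + Qdrant wiring before touching the Streamlit UI, you can run a short standalone script. This is a minimal sketch using Mem0's add/search calls; it assumes Qdrant is running on localhost:6333 and OPENAI_API_KEY is set in your environment, and the stored sentence and "test_user" ID are made up for illustration:
# Optional standalone check of the memory layer (run outside the Streamlit app).
# Assumes Qdrant is on localhost:6333 and OPENAI_API_KEY is set in the environment.
from mem0 import Memory

config = {
    "llm": {"provider": "openai", "config": {"model": "gpt-4o-mini"}},
    "vector_store": {"provider": "qdrant", "config": {"host": "localhost", "port": 6333}},
}
memory = Memory.from_config(config)

# "test_user" and the stored sentence are illustrative placeholders
memory.add("I'm mostly interested in retrieval-augmented generation papers.", user_id="test_user")
print(memory.search("What is this user interested in?", user_id="test_user", limit=3))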
Create user input and search query fields:
• Adds a sidebar for user identification
• Provides an input field for the research paper search query
    user_id = st.sidebar.text_input("Enter your Username")
    # user_interests = st.text_area("Research interests and background")
    search_query = st.text_input("Research paper search query")
Define a function to process search results with GPT-4o-mini:
• Creates a structured prompt for GPT-4o-mini
• Processes arXiv search results into a readable format
• Returns a markdown-formatted table of research papers
    def process_with_gpt4(result):
        prompt = f"""
        Based on the following arXiv search result, provide a proper structured output in markdown that is readable by the users.
        Each paper should have a title, authors, abstract, and link.
        Search Result: {result}
        Output Format: Table with the following columns: [{{"title": "Paper Title", "authors": "Author Names", "abstract": "Brief abstract", "link": "arXiv link"}}, ...]
        """
        response = openai_client.chat.completions.create(model="gpt-4o-mini", messages=[{"role": "user", "content": prompt}], temperature=0.2)
        return response.choices[0].message.content
Implement the paper search functionality:
• Retrieves relevant user memories
• Constructs a search prompt with user context
• Uses MultiOn to browse arXiv and GPT-4o-mini to process results
• Displays formatted results in the Streamlit interface
    if st.button('Search for Papers'):
        with st.spinner('Searching and Processing...'):
            relevant_memories = memory.search(search_query, user_id=user_id, limit=3)
            prompt = f"Search for arXiv papers: {search_query}\nUser background: {' '.join(mem['text'] for mem in relevant_memories)}"
            result = process_with_gpt4(multion.browse(cmd=prompt, url="https://arxiv.org/"))
            st.markdown(result)
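One thing to note: as written, the button only reads existing memories; nothing new is stored after a search. If you want the assistant's memory to actually grow with use, one option is to save each query inside the same spinner block using Mem0's add call. The lines below are a hedged sketch, not part of the original app, and the metadata field is illustrative:
            # Optional (not in the original app): store the query so this user's memory grows over time.
            memory.add(f"Searched arXiv for: {search_query}", user_id=user_id,
                       metadata={"type": "search_query"})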
Add a memory viewing feature and handle missing API keys:
• Adds a sidebar button to view stored memories
• Displays all memories associated with the current user
• Shows a warning message if the API keys have not been entered
    if st.sidebar.button("View Memory"):
        st.sidebar.write("\n".join([f"- {mem['text']}" for mem in memory.get_all(user_id=user_id)]))
else:
    st.warning("Please enter your API keys to use this app.")
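Optionally, you may also want a way to reset what the app remembers for a user. Mem0 exposes a delete_all call keyed by user ID; here's a small sketch (not part of the original app) that you could place next to the "View Memory" button, inside the same if-block:
    if st.sidebar.button("Clear Memory"):
        # Optional extra: wipe everything stored for this user via Mem0's delete_all.
        memory.delete_all(user_id=user_id)
        st.sidebar.info("Memory cleared for this user.")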
Running the App
With our code in place, it's time to launch the app.
Start the Streamlit App: In your terminal, navigate to the project folder, and run the following command:
streamlit run ai_arxiv_agent_memory.py
Access Your AI Assistant: Streamlit will provide a local URL (typically http://localhost:8501). Open it in your web browser, enter your API keys and a username, type in a research paper query, and have fun!
Working Application Demo
Conclusion
You’ve built an AI-powered research assistant that searches for academic papers on arXiv while maintaining a memory of your interests and past searches. By combining the power of GPT-4o-mini, MultiOn, Mem0, and Qdrant, you’ve created a smart, personalized research tool that gets better with every use.
For your next steps, try enhancing the app by integrating additional academic sources, such as Google Scholar, or improving the memory system to store more detailed user preferences. You could also explore building alerts for new research papers matching stored interests to keep users updated effortlessly.
Keep experimenting and refining to build even smarter AI solutions!
We share hands-on tutorials like this 2-3 times a week to help you stay ahead in the world of AI. If you're serious about levelling up your AI skills and staying ahead of the curve, subscribe now and be the first to access our latest tutorials.