Build a Multi LLM Playground with GPT-4o, Claude 3.5 Sonnet and Cohere Command R

Fully functional multi-LLM app in a short Python script (step-by-step instructions)

Working with multiple LLMs simultaneously can be incredibly useful for comparing their strengths, weaknesses, and response styles. Setting up an app that allows direct comparison between top models would be great for understanding LLM behaviors and selecting the right model for specific tasks.

Does building your own chat playground with multiple LLMs sound complex? It really isn’t. A few dozen lines of Python code and it’s done!

Let’s build a Multi-LLM Chat Playground that lets you interact with three popular models (GPT-4o, Claude 3.5 Sonnet, and Cohere Command R Plus) within a single app. You can swap these for any other LLMs of your choice. With a few clicks, you can view the responses from each model side by side for easy comparison.

For this app, we’re using LiteLLM, a Python SDK and proxy server that lets you interact with 100+ LLM APIs using a unified interface. It translates inputs to various providers' endpoints, ensuring consistent outputs and managing retry and fallback logic across multiple deployments.
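
To make the "unified interface" idea concrete, here is a minimal sketch of one calling convention shared by all three models. It is an illustration, not the app's code: `ask`, `MODELS`, and `fake_complete` are hypothetical names, and `fake_complete` is an offline stand-in that only mimics the OpenAI-style response shape (`response.choices[0].message.content`) that the tutorial's code reads from LiteLLM. The model identifiers are the ones used later in this tutorial.

```python
from types import SimpleNamespace
from typing import Callable

# Display label -> model identifier, as used in this tutorial's code.
MODELS = {
    "GPT-4o": "gpt-4o",
    "Claude 3.5 Sonnet": "claude-3-5-sonnet-20240620",
    "Cohere Command R Plus": "command-r-plus",
}


def ask(complete: Callable, model: str, prompt: str, api_key: str) -> str:
    """Send one user prompt through `complete` and return the reply text."""
    messages = [{"role": "user", "content": prompt}]
    response = complete(model=model, messages=messages, api_key=api_key)
    return response.choices[0].message.content


def fake_complete(model, messages, api_key):
    """Offline stand-in (assumption): echoes the prompt in an OpenAI-style response."""
    text = f"[{model}] echo: {messages[-1]['content']}"
    message = SimpleNamespace(content=text)
    return SimpleNamespace(choices=[SimpleNamespace(message=message)])


if __name__ == "__main__":
    # The same call works for every entry; only the model string changes.
    for label, model in MODELS.items():
        print(label, "->", ask(fake_complete, model, "Hello", api_key="not-a-real-key"))
```

In the real app you would pass LiteLLM's `completion` function in place of `fake_complete`; only the model string and the API key change from one provider to the next.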

What We’re Building

This application lets you interact with multiple language models (GPT-4o, Claude 3.5 Sonnet, and Cohere Command R Plus) simultaneously and compare their responses side by side.

Features

  • Simultaneous interaction with three leading LLMs: GPT-4o, Claude 3.5 Sonnet, and Cohere Command R Plus

  • Side-by-side response comparison

  • Secure API key handling

  • User-friendly Streamlit interface

  • Real-time error handling and feedback

Prerequisites

Before we begin, make sure you have:

  1. Python installed on your machine (version 3.7 or higher is recommended)

  2. Your OpenAI API Key, Anthropic API Key, and Cohere API Key

  3. Basic familiarity with Python programming

  4. A code editor of your choice (we recommend VS Code or PyCharm for their excellent Python support)

Step-by-Step Instructions

Setting Up the Environment

First, let's get our development environment ready:

  1. Clone the GitHub repository:

git clone https://github.com/Shubhamsaboo/awesome-llm-apps.git

  2. Go to the multillm_chat_playground folder inside the cloned repository:

cd awesome-llm-apps/advanced_tools_frameworks/multillm_chat_playground

  3. Install the required dependencies:

pip install -r requirements.txt

  4. Get your API keys: Sign up or log in to your OpenAI, Anthropic, and Cohere accounts to get your API keys.
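
If you'd rather not paste your keys into the app on every run, one option is to read them from environment variables. The sketch below assumes the variable names `OPENAI_API_KEY`, `ANTHROPIC_API_KEY`, and `COHERE_API_KEY`, which LiteLLM also recognizes; `load_api_keys` and `ENV_VARS` are hypothetical names and not part of the repository.

```python
import os
from typing import Dict, Mapping

# Provider label -> environment variable name (assumed to be set by you).
ENV_VARS = {
    "OpenAI": "OPENAI_API_KEY",
    "Anthropic": "ANTHROPIC_API_KEY",
    "Cohere": "COHERE_API_KEY",
}


def load_api_keys(environ: Mapping[str, str] = os.environ) -> Dict[str, str]:
    """Return one key per provider; an unset variable yields an empty string."""
    return {
        provider: environ.get(var_name, "").strip()
        for provider, var_name in ENV_VARS.items()
    }


if __name__ == "__main__":
    # Print only whether each key is present, never the key itself.
    for provider, key in load_api_keys().items():
        print(provider, "key found" if key else "key missing")
```

The returned values could be used to pre-fill the password fields, for example through the `value` argument of `st.text_input`.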

Code Walkthrough

Let’s create our app. Create a new file multillm_playground.py and add the following code:

  1. Import necessary libraries:
    • Streamlit for the web interface
    • LiteLLM for unified access to multiple Language Models

import streamlit as st
from litellm import completion

  2. Set up the Streamlit app and API key inputs:
    • Creates a title for the app
    • Adds secure input fields for API keys

st.title("Multi-LLM Chat Playground")

openai_api_key = st.text_input("Enter your OpenAI API Key:", type="password")
anthropic_api_key = st.text_input("Enter your Anthropic API Key:", type="password")
cohere_api_key = st.text_input("Enter your Cohere API Key:", type="password")

  3. Create user input and send button:
    • Checks if all API keys are provided
    • Creates an input field for user messages
    • Prepares the message format for LLMs

# Check if all API keys are provided
if openai_api_key and anthropic_api_key and cohere_api_key:

    # Create a text input for user messages
    user_input = st.text_input("Enter your message:")

    if st.button("Send to All LLMs"):
        if user_input:
            messages = [{"role": "user", "content": user_input}]
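
Note that the fragment above shows nothing when a key or the message is missing. A small helper along these lines could collect feedback for the user. This is a sketch, not the repository's code: `input_warnings` is a hypothetical name, and the wording of the messages is an assumption.

```python
from typing import Dict, List


def input_warnings(api_keys: Dict[str, str], user_input: str) -> List[str]:
    """Return human-readable warnings; an empty list means the inputs look usable."""
    warnings = []
    # A key made only of whitespace is treated as missing.
    missing = [name for name, key in api_keys.items() if not key.strip()]
    if missing:
        warnings.append("Missing API key for: " + ", ".join(missing))
    if not user_input.strip():
        warnings.append("Please enter a message.")
    return warnings


if __name__ == "__main__":
    example_keys = {"OpenAI": "sk-example", "Anthropic": "", "Cohere": ""}
    for warning in input_warnings(example_keys, ""):
        print(warning)
```

Each returned string could be shown with `st.warning` before any model is called.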

  4. Set up side-by-side display for LLM responses:
    • Creates three columns for displaying responses
    • Allows for easy comparison between different LLMs

            col1, col2, col3 = st.columns(3)

  5. Generate and display GPT-4o response:
    • Uses LiteLLM to interact with GPT-4o
    • Displays the response or an error message
    • Handles exceptions gracefully

            with col1:
                st.subheader("GPT-4o")
                try:
                    gpt_response = completion(model="gpt-4o", messages=messages, api_key=openai_api_key)
                    st.write(gpt_response.choices[0].message.content)
                except Exception as e:
                    st.error(f"Error with GPT-4o: {str(e)}")

  6. Generate and display Claude 3.5 Sonnet response:
    • Uses LiteLLM to interact with Claude 3.5 Sonnet
    • Displays the response or an error message
    • Maintains consistent error handling across LLMs

            with col2:
                st.subheader("Claude 3.5 Sonnet")
                try:
                    claude_response = completion(model="claude-3-5-sonnet-20240620", messages=messages, api_key=anthropic_api_key)
                    st.write(claude_response.choices[0].message.content)
                except Exception as e:
                    st.error(f"Error with Claude 3.5 Sonnet: {str(e)}")

  7. Generate and display Cohere response:
    • Uses LiteLLM to interact with Cohere's Command R Plus
    • Displays the response or an error message
    • Completes the side-by-side comparison of LLMs

            with col3:
                st.subheader("Cohere")
                try:
                    cohere_response = completion(model="command-r-plus", messages=messages, api_key=cohere_api_key)
                    st.write(cohere_response.choices[0].message.content)
                except Exception as e:
                    st.error(f"Error with Cohere: {str(e)}")
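
The three blocks above run one after another, so the total wait is roughly the sum of the three response times. If you want the requests to overlap, a thread pool is one option. The following is a sketch under that assumption, not the repository's code: `ask_all` is a hypothetical helper that takes zero-argument callables, so it can be tried without any API key.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Tuple


def ask_all(tasks: Dict[str, Callable[[], str]]) -> Dict[str, Tuple[bool, str]]:
    """Run every task in its own thread and return {label: (ok, text)}.

    A failure in one task is captured as (False, message) and does not
    affect the other tasks.
    """
    def safe_call(task: Callable[[], str]) -> Tuple[bool, str]:
        try:
            return True, task()
        except Exception as exc:
            return False, f"{type(exc).__name__}: {exc}"

    # max_workers must be positive, even when no tasks are given.
    with ThreadPoolExecutor(max_workers=max(1, len(tasks))) as pool:
        futures = {label: pool.submit(safe_call, task) for label, task in tasks.items()}
        return {label: future.result() for label, future in futures.items()}


if __name__ == "__main__":
    def failing_task() -> str:
        raise ValueError("bad key")

    # Stand-in tasks; real ones would each wrap a model call.
    results = ask_all({"Model A": lambda: "Hello!", "Model B": failing_task})
    for label, (ok, text) in results.items():
        print(label, "OK" if ok else "ERROR", text)
```

In the app, each callable could wrap one `completion(...)` call and return `response.choices[0].message.content`. It is safer to keep the `st.write` and `st.error` calls on the main thread and render the results only after `ask_all` returns.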

  8. Sidebar information section:
    • Adds a sidebar title and description of the app's purpose.
    • Outlines major app functionalities

st.sidebar.title("About this app")

st.sidebar.write(
    "This app demonstrates the use of multiple Language Models (LLMs) "
    "in a single application using the LiteLLM library."
)

st.sidebar.subheader("Key features:")
st.sidebar.markdown(
    """
    - Utilizes three different LLMs:
        - OpenAI's GPT-4o
        - Anthropic's Claude 3.5 Sonnet
        - Cohere's Command R Plus
    - Sends the same user input to all models
    - Displays responses side-by-side for easy comparison
    - Showcases the ability to use multiple LLMs in one application
    """
)

st.sidebar.write(
    "Try it out to see how different AI models respond to the same prompt!"
)

Running the App

With our code in place, it's time to launch the app.

  • In your terminal, navigate to the project folder and run the following command:

streamlit run multillm_playground.py

  • Streamlit will print a local URL (typically http://localhost:8501, or a nearby port such as 8503 if that one is already in use). Open it in your web browser, enter your API keys, type your prompt, and click the Send to All LLMs button to see the three responses side by side.

Conclusion

You’ve successfully built a Multi-LLM Chat Playground in a few dozen lines of Python code! This tool lets you interact with and compare responses from top LLMs side by side, showing how different models respond to the same query.

For further enhancements, consider:

  1. Adding a Response Evaluation Feature: Allow users to rate responses to gather insights on model performance.

  2. Implementing Preset Prompts: Include a dropdown menu with common prompts or scenarios for testing.

  3. Storing Response History: Enable users to save and review previous interactions for better analysis.

  4. Customizable Model Settings: Allow users to adjust parameters like temperature and max tokens.
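
As a starting point for the model settings idea, here is a hedged sketch of a helper that assembles the keyword arguments for one call. `build_request` is a hypothetical name; `temperature` and `max_tokens` are OpenAI-style parameters that LiteLLM's `completion` accepts. The default values and the 0 to 1 temperature range are conservative assumptions chosen for this example, and individual providers may allow other values.

```python
from typing import Any, Dict, List


def build_request(
    model: str,
    messages: List[Dict[str, str]],
    api_key: str,
    temperature: float = 0.7,  # assumed default, not taken from the tutorial
    max_tokens: int = 512,     # assumed default, not taken from the tutorial
) -> Dict[str, Any]:
    """Validate the settings and return keyword arguments for a completion call."""
    if not 0.0 <= temperature <= 1.0:
        raise ValueError("temperature must be between 0 and 1")
    if max_tokens < 1:
        raise ValueError("max_tokens must be at least 1")
    return {
        "model": model,
        "messages": messages,
        "api_key": api_key,
        "temperature": temperature,
        "max_tokens": max_tokens,
    }


if __name__ == "__main__":
    request = build_request(
        "gpt-4o",
        [{"role": "user", "content": "Hello"}],
        api_key="not-a-real-key",
        temperature=0.2,
        max_tokens=100,
    )
    # Show the settings without printing the key.
    print({name: value for name, value in request.items() if name != "api_key"})
```

The result can be unpacked into the call as `completion(**build_request(...))`, with the values coming from widgets such as `st.slider`.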

Keep experimenting and refining to build even smarter AI solutions!
