unwind ai
Posts
One API to Use 100+ LLMs

One API to Use 100+ LLMs

PLUS: OpenAI's new MLE bench, AI Playground to build Gradio apps

Shubham Saboo & Gargi Gupta
October 11, 2024

Today’s top AI Highlights:

Build full-stack Gradio apps with a simple prompt and preview in Artifacts style
Call 100+ LLMs using the OpenAI input/output format
OpenAI’s o1-agent bags bronze in the Kaggle ML competition
Opensource NotebookLM with more features and customizing options

& so much more!

Read time: 3 mins

AI Tutorials

Building AI tools that can handle customer interactions while retaining context is becoming increasingly important for modern applications.

In this tutorial, we’ll show you how to create a powerful AI customer support agent using GPT-4o, with memory capabilities to recall previous interactions.

The AI assistant’s memory will be managed using Mem0 with Qdrant as the vector store. The assistant will handle customer queries while maintaining a persistent memory of interactions, making the experience seamless and more intelligent.

We share hands-on tutorials like this 2-3 times a week, designed to help you stay ahead in the world of AI. If you're serious about levelling up your AI skills and staying ahead of the curve, subscribe now and be the first to access our latest tutorials.

Build a Customer Support AI Agent with Memory

LLM App using GPT-4o and vector database in less than 100 lines of Python code (step-by-step instructions)

🎁 Bonus worth $50 💵

Share this newsletter on your social channels and tag Unwind AI (X, LinkedIn, Threads, Facebook) to get an AI resource pack worth $50 for FREE. Valid for a limited time only!

Latest Developments

Stop Juggling SDKs: LiteLLM Simplifies Access to Any LLM 🎯

LiteLLM, a Python SDK, lets you call 100+ LLM APIs using the OpenAI format. It handles the messy details of different provider APIs (like Bedrock, HuggingFace, or VertexAI) so you can write cleaner code. Plus, there's a proxy server to manage costs and control access, making it super easy to switch between providers, track spending, and even set up rate limits. With built-in retry logic, your applications become more resilient.

Key Highlights:

One API Call, Many LLMs - Write your code once and call any provider's models—HuggingFace, Bedrock, VertexAI, etc.—using the familiar OpenAI format. LiteLLM translates your requests behind the scenes, so no more juggling different SDKs.
Smart Routing & Retries - LiteLLM automatically retries failed requests and can even route traffic across different model deployments. This built-in redundancy makes your LLM applications more robust and reliable.
Cost Control & Management - The LiteLLM proxy server helps manage costs by tracking usage and setting budgets. You can even generate API keys with specific rate limits to prevent runaway spending and control access for different teams or projects.
Built-in Logging & Observability - Monitor LLM performance and track usage with integrations for tools like Lunary, Langfuse, and Helicone. You can also define custom callbacks to capture specific metrics or integrate with your own logging systems, even during streaming responses.

Build Beautiful ML Apps with Just a Few Lines of Code 🪄

Gradio 5 is here. Build production-ready, performant, and visually appealing ML web apps quickly. One of the most exciting new features is the experimental AI playground which lets you create Gradio apps using simple English prompts which you can instantly view just like Artifacts and further edit them.

Upgrade now using pip install --upgrade gradio and explore the new features.

Key Highlights:

Instantaneous Loading with SSR - No more loading spinners! Gradio 5 implements server-side rendering (SSR), drastically reducing load times and providing a smooth user experience from the get-go.
Revamped UI/UX and Theming - Tired of outdated-looking apps? Gradio 5 refreshes core components like buttons, sliders, and tabs, plus the chatbot interface, with a modern design language. You can also customize the look and feel of your apps to match your branding or preferences.
Unlock Real-time Streaming - Build dynamic, interactive experiences with Gradio 5's low-latency streaming capabilities. Leverage automatic base64 encoding and websockets, or integrate WebRTC via custom components for applications like live video processing and real-time transcription.
AI Playground with Artifacts - Build full-stack apps without any coding in this new experimental AI Playground. Use AI to generate Gradio app code or modify existing ones, and instantly preview the results right in your browser. Deploy the app in a single click. Explore the playground here.

Quick Bites

OpenAI Chairman Bret Taylor’s AI company, Sierra, is in talks to raise hundreds of millions of dollars at a valuation exceeding $4 billion. Co-founded with former Google executive Clay Bavor, the funding round, led by Greenoaks Capital, would more than triple Sierra’s valuation from earlier this year.

Writer, the full-stack generative AI platform, has released Palmyra X 004, a new LLM that boasts excellent function-calling and workflow execution capabilities, crucial for agentic AI apps. It outperforms models like OpenAI and Google at a fraction of the cost.

OpenAI has released MLE-bench, a new benchmark that evaluates how well AI agents perform machine learning engineering tasks. It uses 75 real-world Kaggle competitions to measure skills like model training and dataset preparation. OpenAI’s o1-preview achieved the level of a Kaggle bronze medal in 16.9% of competitions.

Tools of the Trade

Podcastfy: Opensource Python package that transforms web content, PDFs, and text into engaging, multi-lingual audio conversations. You can even customize the output as you like, for eg., style, structure, audio length, etc.
Firebender: AI assistant for Android Studio that offers real-time, context-aware code help without storing your data. It updates with the latest Android SDKs and integrates directly into your development environment.
Beehive: Opensource framework that helps build AI agents capable of working together to solve tasks. It simplifies creating multi-agent workflows, such as sequential chats or debates, using models like GPT-4-Turbo for decision-making instead of hardcoded logic.
Awesome LLM Apps: Build awesome LLM apps using RAG to interact with data sources like GitHub, Gmail, PDFs, and YouTube videos through simple text. These apps will let you retrieve information, engage in chat, and extract insights directly from content on these platforms.

Hot Takes

>i want 3.5 opus
>tell oai gc “omg i heard opus next week”
>they get scared
>ready o1 to mog opus
>deepmind hears leaks
>ready 1.5 ultra to mog oai and anthropic
>anthropic wasn’t gonna release opus
>but everyone else is releasing
>dario says fine sigh and drops opus
>win ~
Aidan McLau
You're 1000 job applications away from getting a job and 1 GPT wrapper away from financial freedom ~
Dennis

Meme of the Day

SF tech bro was a little too deep in Founder Mode
— Jason (@mytechceoo)
10:13 PM • Sep 13, 2024

That’s all for today! See you tomorrow with more such AI-filled content.

🎁 Bonus worth $50 💵

Share this newsletter on your social channels and tag Unwind AI (X, LinkedIn, Threads, Facebook) to get AI resource pack worth $50 for FREE. Valid for limited time only!

Unwind AI - X | LinkedIn | Threads | Facebook

Awesome LLM Apps | Sponsor Us

PS: We curate this AI newsletter every day for FREE, your support is what keeps us going. If you find value in what you read, share it wtith at least one, two (or 20) of your friends 😉

Reply

or to participate.