FastAPI for AI agents
PLUS: LazyGraphRAG by Microsoft, LLMOps Database with 300+ real LLM implementations
Today’s top AI Highlights:
Build robust production-ready AI agent apps with Pydantic’s AI agent framework
Microsoft's LazyGraphRAG makes RAG systems better without high indexing costs
First open source AI video model that generates high-quality videos in real time
Learn from 300+ real LLM implementations in this LLMOps database
Self-hosted AI server for LLM APIs, Ollama, ComfyUI, and FFmpeg servers
& so much more!
Read time: 3 mins
AI Tutorials
In this tutorial, we'll build a Personal Health & Fitness AI Agent that demonstrates how to create task-specific AI agents that collaborate effectively. Using Google Gemini and Phidata, we'll create a system where two specialized agents - one for diet and one for fitness - work together to generate personalized recommendations.
This app generates tailored dietary and fitness plans based on user inputs such as age, weight, height, activity level, dietary preferences, and fitness goals.
Phidata makes this multi-agent approach straightforward by providing a framework designed for building and coordinating AI agents. It handles the complexity of agent communication, memory management, and response generation, letting us focus on defining our agents' roles and behaviors.
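As a taste of what the tutorial builds, here is a minimal sketch of two specialized Phidata agents coordinated by a team agent. It assumes phidata is installed and a GOOGLE_API_KEY is set; the agent names, instructions, and prompt below are our own illustrative choices, not taken from the tutorial code.

```python
# Sketch: two specialized Phidata agents coordinated via a team agent.
# Assumes `pip install phidata google-generativeai` and GOOGLE_API_KEY set.
from phi.agent import Agent
from phi.model.google import Gemini

diet_agent = Agent(
    name="Diet Expert",
    model=Gemini(id="gemini-1.5-flash"),
    instructions=["Create meal plans matching the user's goals and preferences."],
)

fitness_agent = Agent(
    name="Fitness Expert",
    model=Gemini(id="gemini-1.5-flash"),
    instructions=["Design workout routines for the user's level and goals."],
)

# The team agent delegates to the specialists and merges their answers.
coach = Agent(
    team=[diet_agent, fitness_agent],
    model=Gemini(id="gemini-1.5-flash"),
    instructions=["Combine diet and fitness advice into one coherent plan."],
)

coach.print_response(
    "I'm 30, 75 kg, 180 cm, moderately active, vegetarian, aiming to build muscle."
)
```

Phidata handles routing between the team members, so each specialist only needs its own role defined in its instructions.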
We share hands-on tutorials like this 2-3 times a week, designed to help you stay ahead in the world of AI. If you're serious about leveling up your AI skills and staying ahead of the curve, subscribe now and be the first to access our latest tutorials.
Latest Developments
The Pydantic team has released PydanticAI, a new agent framework designed specifically for production environments. Unlike other frameworks, PydanticAI integrates deeply with Pydantic's type system and validation capabilities.
It lets you write clean, maintainable code with vanilla Python patterns while giving you the tools to monitor and debug your AI applications in production through Pydantic Logfire.
Key Highlights:
Function Tools - Create and manage tools for your AI agents using familiar Python decorators. Whether you're querying databases, calling APIs, or processing data, tools get automatic schema generation from type hints and docstrings, making integration with external services natural and type-safe.
Real Production Monitoring - Track your agents' behavior with Pydantic Logfire integration. Watch message flows, debug issues, and monitor performance metrics right where they matter - in your production environment.
Model Agnostic - Work with OpenAI, Gemini, or Groq models today (Anthropic coming soon), with an easy path to add support for other providers. Switch between models without changing your application code.
Testing That Makes Sense - Write proper unit tests for your application code with TestModel, and evaluate your agents' performance separately. No more mixing model evaluation with implementation testing.
Quick Setup - Start with pip install pydantic-ai, and use the slim install options if you only need specific providers. The framework comes with comprehensive examples and a clear upgrade path from local development to production deployment.
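To make the function-tool idea concrete, here is a short sketch following PydanticAI's documented patterns: a tool registered with a decorator, its schema derived from type hints and the docstring. The weather tool and its data are invented for illustration, and running it requires an OpenAI API key.

```python
# Sketch of a PydanticAI function tool; the tool and data are illustrative.
from pydantic_ai import Agent, RunContext

agent = Agent(
    "openai:gpt-4o",
    deps_type=dict,  # dependencies are passed to tools via RunContext
    system_prompt="Answer using the provided tools.",
)

@agent.tool
def get_temperature(ctx: RunContext[dict], city: str) -> str:
    """Return the current temperature for a city."""
    # The schema for `city` comes from the type hint and this docstring.
    return f"{ctx.deps.get(city, 'unknown')} °C"

result = agent.run_sync(
    "What's the temperature in Paris?",
    deps={"Paris": 18},
)
print(result.data)
```

For unit tests, the same agent can be exercised against PydanticAI's TestModel instead of a real provider, which is what separates implementation testing from model evaluation.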
Writer RAG tool: build production-ready RAG apps in minutes
RAG in just a few lines of code? We’ve launched a predefined RAG tool on our developer platform, making it easy to bring your data into a Knowledge Graph and interact with it using AI. With a single API call, Writer LLMs will intelligently call the RAG tool to chat with your data.
Integrated into Writer’s full-stack platform, it eliminates the need for complex vendor RAG setups, making it quick to build scalable, highly accurate AI workflows just by passing a graph ID of your data as a parameter to your RAG tool.
Microsoft has introduced LazyGraphRAG, a novel approach to RAG. It combines vector and graph RAG methods for efficiency and cost savings, deferring heavy LLM processing until query time.
It significantly reduces initial indexing expenses, scaling well for diverse use cases—ideal for those seeking performant and economical solutions. Interestingly, LazyGraphRAG uses simple noun phrase extraction for initial indexing instead of resource-intensive LLMs, making it a lightweight yet powerful alternative for handling large datasets.
Key Highlights:
Cost-Efficient Indexing - LazyGraphRAG defers all LLM usage until query time and uses lightweight NLP noun phrase extraction for concept identification. This reduces indexing costs to match vector RAG while delivering better performance for both local and global queries. It can process queries at as little as 4% of the cost of GraphRAG's global search.
Flexible Query Processing - It combines best-first and breadth-first search dynamics, automatically breaking down queries into 3-5 subqueries. It uses text chunk embeddings for initial ranking, followed by LLM-based relevance assessment. This results in efficient handling of both specific and broad questions about your dataset.
Scalable Performance Control - You can adjust the relevance test budget parameter to optimize the cost-quality balance. Lower budgets (around 100 tests) maintain competitive performance with standard RAG systems, while higher budgets (500-1500 tests) significantly outperform existing methods across all metrics.
Production-Ready Implementation - The system handles dynamic community selection, parallel processing of subqueries, and automatic concept subgraph building. It includes built-in sentence-level relevance assessment and claim extraction capabilities, making it ready for production deployment with minimal setup requirements.
Up and coming - Microsoft will soon open source LazyGraphRAG through the GraphRAG GitHub repository. Keep an eye on the repo (or we can do it for you)!
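The two core ideas above, cheap LLM-free indexing and a query-time relevance budget, can be illustrated with a deliberately tiny sketch. This is not Microsoft's implementation: a real system would use a proper POS tagger for noun phrases, embeddings for ranking, and an LLM for the relevance tests; here those are replaced by capitalized-phrase matching and term overlap just to show the control flow.

```python
# Toy illustration of LazyGraphRAG's pattern: no LLM at indexing time,
# and a capped number of expensive relevance tests at query time.
import re
from collections import Counter

def extract_phrases(text):
    """Stand-in for noun phrase extraction: runs of capitalized words."""
    return re.findall(r"(?:[A-Z][a-z]+\s?)+", text)

def build_index(chunks):
    """The index is just a phrase -> chunk-id map; building it is cheap."""
    index = {}
    for i, chunk in enumerate(chunks):
        for phrase in extract_phrases(chunk):
            index.setdefault(phrase.strip(), set()).add(i)
    return index

def query(chunks, index, question, budget=2):
    """Rank candidate chunks by matched phrases, then spend at most
    `budget` relevance tests (the costly LLM step in the real system)."""
    hits = Counter()
    for phrase, chunk_ids in index.items():
        if phrase.lower() in question.lower():
            for cid in chunk_ids:
                hits[cid] += 1
    # Only the best-ranked candidates consume the budget.
    return [chunks[cid] for cid, _ in hits.most_common(budget)]

chunks = [
    "LazyGraphRAG was announced by Microsoft Research.",
    "GraphRAG builds a full knowledge graph up front.",
    "Vector RAG relies on embeddings alone.",
]
index = build_index(chunks)
print(query(chunks, index, "What did Microsoft Research announce?"))
# → ['LazyGraphRAG was announced by Microsoft Research.']
```

Raising the budget widens the search at higher cost, which mirrors the relevance test budget knob described above.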
Quick Bites
Cohere has released the latest version of its Rerank model, Rerank 3.5, to boost RAG accuracy for enterprises. Rerank 3.5 finds the most relevant business data to answer a user question using a method called “cross-encoding”, in which the model computes a relevance score for each document in relation to the user’s question.
Developers can integrate it easily to improve relevance in RAG apps and use it across 100+ languages. Users of older Rerank models will need to migrate to Rerank 3.5 by March 31, 2025.
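The cross-encoding pattern is easy to picture in miniature: the model sees the query and a candidate document together and emits one relevance score per pair, and the reranker sorts by that score. In this toy sketch the neural model is replaced by a trivial term-overlap score; the documents and scoring function are invented for illustration, and a production app would call Cohere's rerank endpoint instead.

```python
# Toy illustration of cross-encoding reranking: score each
# (query, document) pair jointly, then sort by the score.
def cross_encode(query, document):
    """Stand-in for a cross-encoder: fraction of query terms in the doc."""
    q_terms = set(query.lower().split())
    d_terms = set(document.lower().split())
    return len(q_terms & d_terms) / len(q_terms)

def rerank(query, documents, top_n=2):
    """Score every pair, return the top_n most relevant documents."""
    ranked = sorted(documents, key=lambda d: cross_encode(query, d), reverse=True)
    return ranked[:top_n]

docs = [
    "quarterly revenue report for the sales team",
    "office parking policy update",
    "sales team hiring plan for next quarter",
]
print(rerank("sales team revenue", docs, top_n=1))
# → ['quarterly revenue report for the sales team']
```

Because each pair is scored jointly rather than via precomputed embeddings, cross-encoders are slower than bi-encoders but notably more accurate, which is why they sit at the reranking stage rather than first-pass retrieval.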
AI startup Lightricks has released LTXV, the first open source video model that generates videos in real-time. The model generates 5-second, 768x512 resolution videos at 24 FPS in approximately 4 seconds on H100 GPUs, with near real-time performance on consumer-grade hardware like the RTX 4090. It is available open source on GitHub and Hugging Face, where you can explore both text-to-video and image-to-video functionalities.
Anthropic just launched a Fellows program, offering funding ($2,100 weekly stipend and $10k/month for research costs) and mentorship from their researchers to help engineers transition into full-time AI safety work. Apply by January 20, 2025, to join the first cohort, and collaborate on projects like adversarial robustness and interpretability. You might even land a full-time role at Anthropic.
ZenML has launched an LLMOps Database, a collection of 300+ real-world LLM implementations with detailed summaries and technical notes. Skip the shiny demos and dive straight into practical solutions, production-grade architectures, and detailed implementations that actually work. You can filter by tags or search for specific companies and topics to quickly find relevant examples.
Tools of the Trade
Steel.dev: Open source API and infrastructure layer for AI apps and agents that interact with the web. It handles session management, proxy support, anti-detection, and offers tools for converting web content to markdown, readability, screenshots, or PDFs.
Automated-AI-Web-Researcher: Uses a locally-run LLM (through Ollama) to independently investigate topics by breaking them into focus areas, performing web searches, and scraping relevant content. It saves all findings with source links in a text document and provides a final summary of the research, after which users can ask questions about what it found.
Open Source AI Server: A self-hosted AI server providing unified APIs for accessing various AI services including LLMs (OpenAI, Anthropic, etc.), Ollama, ComfyUI, and FFmpeg. It offers typed client libraries for multiple programming languages and includes a monitoring dashboard for tracking AI usage and performance.
Awesome LLM Apps: Build awesome LLM apps with RAG, AI agents, and more to interact with data sources like GitHub, Gmail, PDFs, and YouTube videos, and automate complex work.
Hot Takes
Calling an LLM an agent doesn’t suddenly make it more intelligent. ~ Pedro Domingos
i heard from an English prof that he encourages his students to run assignments through chatgpt to learn what the median essay/story/response to the assignment will look like so they can avoid and transcend all that ~ roon
That’s all for today! See you tomorrow with more such AI-filled content.
Don’t forget to share this newsletter on your social channels and tag Unwind AI to support us!
PS: We curate this AI newsletter every day for FREE, your support is what keeps us going. If you find value in what you read, share it with at least one, two (or 20) of your friends 😉