Step-by-Step Reasoning in RAG
PLUS: Context API for AI agents, New Gemini 2.0 models
Today’s top AI Highlights:
Connect your AI agents to user data without building RAG
DeepRAG helps LLMs think through information retrieval step-by-step
Google expands Gemini 2.0 family for developers
Open-source reproduction of an o1-level model - a minimal recipe for test-time scaling
Generate the entire front end of your app - navigation, state management, and UI screens - from a single description
& so much more!
Read time: 3 mins
AI Tutorials
For businesses looking to stay competitive, understanding the competition is crucial. But manually gathering and analyzing competitor data is time-consuming and often yields incomplete insights. What if we could automate this process using AI agents that work together to deliver comprehensive competitive intelligence?
In this tutorial, we'll build a multi-agent competitor analysis team that automatically discovers competitors, extracts structured data from their websites, and generates actionable insights. You'll create a team of specialized AI agents that work together to deliver detailed competitor analysis reports with market opportunities and strategic recommendations.
This system combines web crawling, data extraction, and AI analysis to transform raw competitor website data into structured insights, using a team of coordinated AI agents that each specialize in a different aspect of competitive analysis. A rough skeleton of the team structure is sketched below.
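To make the pipeline concrete, here is a hypothetical outline in plain Python. It is not any particular agent framework's API; the class names, roles, and instructions are placeholders for the tutorial's actual components.

```python
# Hypothetical sketch of a competitor-analysis agent team; names and roles
# are placeholders, not a specific framework's API.
from dataclasses import dataclass


@dataclass
class Agent:
    role: str
    instructions: str

    def run(self, task: str) -> str:
        # In a real build, this would call an LLM with self.instructions.
        return f"[{self.role}] output for: {task}"


def analyze_competitors(company: str) -> str:
    pipeline = [
        Agent("discovery", "Find the company's direct competitors."),
        Agent("extraction", "Crawl competitor sites and pull structured data."),
        Agent("analysis", "Turn the data into opportunities and recommendations."),
    ]
    result = company
    for agent in pipeline:  # each agent's output feeds the next
        result = agent.run(result)
    return result


print(analyze_competitors("acme.ai"))
```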
We share hands-on tutorials like this 2-3 times a week, designed to help you stay ahead in the world of AI. If you're serious about leveling up your AI skills and staying ahead of the curve, subscribe now and be the first to access our latest tutorials.
Latest Developments

LLMs are powerful, but they often lack the context to truly shine in real-world applications. Sundry aims to solve this by offering an intelligent context provider API designed specifically for LLMs that need access to precise user data. This API connects directly to services your users already use, like GitHub, Jira, Slack, and Office 365, allowing LLMs to query these sources using natural language.
Unlike traditional RAG systems that rely on approximate matching, Sundry focuses on retrieving precise, actionable data when the LLM needs it. The API accepts simple natural language questions and returns structured responses optimized for LLM consumption, making it easier to build AI applications that can proactively gather context.
Key Highlights:
Precision Context - Sundry delivers exact results from user data sources (GitHub, Jira, Slack, Office 365, etc.), avoiding the approximations common in RAG-based solutions. This is crucial for applications requiring verifiable accuracy, like automated code review or compliance tracking in finance.
Direct Results - Query connected data sources using plain language like "What was my last GitHub issue?" and receive exact matches instead of approximate results. The API handles the complexity of data access while maintaining precision, letting you focus on building features rather than managing multiple integrations.
Structured Responses - Sundry returns not just the data, but also metadata about how the query was interpreted and a confidence score. This allows your LLM to provide more informed responses to users and handle uncertainty gracefully.
Getting Started - Generate an API key and have users connect their data sources. The service automatically indexes connected platforms, making data immediately available through a single endpoint. Detailed documentation covers best practices for incorporating Sundry queries into your AI applications; a hypothetical call is sketched below.
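Since Sundry's API reference isn't quoted here, the endpoint URL, payload shape, and response fields below are illustrative assumptions based on the description above, not the documented interface. The sketch just shows the natural-language-in, structured-data-out flow:

```python
import requests

# Hypothetical values: get a real key from the Sundry dashboard, and check
# the docs for the actual endpoint; this URL is a placeholder.
SUNDRY_API_KEY = "your-api-key"
SUNDRY_ENDPOINT = "https://api.sundry.example/v1/query"


def fetch_context(question: str) -> dict:
    """Send a natural-language question, get a structured response back."""
    resp = requests.post(
        SUNDRY_ENDPOINT,
        headers={"Authorization": f"Bearer {SUNDRY_API_KEY}"},
        json={"question": question},
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json()


result = fetch_context("What was my last GitHub issue?")
# Assumed response fields, mirroring the highlights above: the exact data,
# how the query was interpreted, and a confidence score.
print(result["data"])
print(result["interpretation"])
print(result["confidence"])
```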

RAG is great at providing the relevant context an LLM needs to respond well. But ineffective retrieval and the introduction of unnecessary noise hurt response quality. LLMs also tend to hallucinate and make up facts, a problem made worse by their inability to recognize when they need to consult external knowledge.
DeepRAG is a new framework that tackles these challenges by dynamically deciding when to pull information from external sources during the reasoning process, rather than just at the beginning or end. The model learns to recognize its limitations and fetch exactly the information it needs, when it needs it. Think of it as on-demand knowledge injection, modeled as a Markov Decision Process (MDP).
Key Highlights:
Precise Knowledge Retrieval - DeepRAG dynamically evaluates each sub-query to determine whether external knowledge is required, reducing reliance on potentially inaccurate internal knowledge while cutting retrieval calls to only what is actually needed.
Step-by-Step Retrieval Reasoning - The model breaks down complex queries into smaller, manageable sub-queries, enabling more accurate information retrieval at each step. Developers can adapt this decomposition strategy to their own application's question-answering pipelines, leading to greater context awareness.
Easy Fine-Tuning - DeepRAG uses a self-calibration method with synthetic data to teach the model its own knowledge boundaries, letting you fine-tune models that hallucinate less while maintaining performance.
Scalability - DeepRAG stays robust and generalizes well in time-sensitive and out-of-distribution settings, adapting its retrieval strategy to the specific requirements of the task to balance accuracy and efficiency.
Code Availability - While the research paper is available, the codebase for DeepRAG has not yet been open-sourced. Developers interested in implementing DeepRAG will need to keep an eye out for a release; in the meantime, the sketch below illustrates the core decision loop.
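Since the official code isn't out, here is a minimal sketch of the loop the paper describes: decompose the question into sub-queries and, at each step, make a binary choice between answering from parametric knowledge and retrieving. Every helper here (decompose, should_retrieve, answer, search) is a hypothetical stand-in, not the authors' implementation.

```python
# Sketch of DeepRAG-style atomic decisions: retrieve only when the model
# judges its own knowledge insufficient for the current sub-query.
def deeprag_answer(question: str, llm, retriever, max_steps: int = 8) -> str:
    context: list[str] = []
    for _ in range(max_steps):
        subquery = llm.decompose(question, context)   # next atomic sub-query
        if subquery is None:                          # model is ready to answer
            break
        if llm.should_retrieve(subquery, context):    # binary retrieve/skip decision
            context.extend(retriever.search(subquery, k=3))
        intermediate = llm.answer(subquery, context)  # resolve the sub-query
        context.append(f"{subquery} -> {intermediate}")
    return llm.answer(question, context)              # final answer over gathered context
```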
Quick Bites
Google has released its Gemini 2.0 series in Google AI Studio and via API for more developers and production use. Here are all the details, with a minimal API call sketched after the list:
Gemini 2.0 Flash is now generally available, with higher rate limits, stronger performance, and simplified pricing.
Gemini 2.0 Flash-Lite, a new variant that is Google’s most cost-efficient model yet, is now available in public preview.
Gemini 2.0 Pro, an experimental update to the Pro series for coding and complex prompts, is now available.
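For reference, here's a minimal call using the google-genai Python SDK (pip install google-genai). The Flash id below is the generally available one; Flash-Lite and Pro carried preview/experimental ids at launch, so confirm the current names in AI Studio:

```python
from google import genai

client = genai.Client(api_key="YOUR_API_KEY")  # key from Google AI Studio

response = client.models.generate_content(
    model="gemini-2.0-flash",  # GA model; swap in the Flash-Lite/Pro ids to compare
    contents="In two sentences, what changed in the Gemini 2.0 lineup?",
)
print(response.text)
```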

A team of researchers has demonstrated they can match the inference-time scaling capabilities of OpenAI's o1 model - which let AI models "think longer" before responding - at a fraction of the cost. While OpenAI reportedly spent millions developing o1, the new open-source s1-32B model achieved similar results with minimal training cost, using 1,000 carefully selected examples and a simple "Wait" token technique to control thinking time.
Running on consumer hardware, s1-32B even outperformed o1-preview by 27% on competition math problems. The model, data, and code are all open-source.
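The core trick, called budget forcing, is simple enough to sketch: when the model emits its end-of-thinking delimiter too early, suppress it and append "Wait" so it keeps reasoning. The helper below is a hypothetical wrapper around your own inference stack; the exact delimiter and decoding setup come from the released code, not this sketch.

```python
END_OF_THINKING = "</think>"  # delimiter assumed for this sketch


def think_with_budget(prompt: str, generate_until, extensions: int = 2) -> str:
    """Force extra rounds of reasoning by replacing early stops with 'Wait'.

    `generate_until(text, stop)` is assumed to return the continuation of
    `text`, halting when the stop string would be emitted.
    """
    trace = generate_until(prompt, stop=END_OF_THINKING)
    for _ in range(extensions):
        trace += "Wait"  # overwrite the attempted stop with a continuation cue
        trace += generate_until(prompt + trace, stop=END_OF_THINKING)
    return trace  # longer reasoning trace; decode the final answer afterwards
```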
Tools of the Trade
Surf.new: Open-source alternative to OpenAI's Operator agent. It's a playground to test out different web agents that can surf the web and interact with webpages much like a human would. Built by steel.dev.
Simba: Open-source knowledge management system that helps organize and prepare content for RAG apps by providing vector store integration, embedding models, and document chunking with a modern UI. It acts as a middleware layer that handles all the knowledge-processing complexities.
a0.dev: Generate complete React Native mobile apps and components with a simple description, with instant previews. It offers two main tools: a Full App Generator that handles navigation, state management and UI screens, and a Component Generator for creating individual React Native UI screens quickly.
Awesome LLM Apps: Build awesome LLM apps with RAG, AI agents, and more to interact with data sources like GitHub, Gmail, PDFs, and YouTube videos, and automate complex work.

Hot Takes
Anthropic would really rather openly claim to be sitting on a model vastly superior to everyone's in order to signal their allegiance to AI safetyism and farm holier-than-thou points rather than actually ship it and accelerate their growth.
Unfathomable waste of opportunity. ~ Beff – e/acc

o3-full is basically ready. o3-pro is also.
Then you have a gap - models available internally, but not being released to the general public. I continue to think that this gap is larger than most realize. (I think many people assume it's zero.)
Then you have some even more powerful model in training. Marketing can call it "o4" or whatever, but that doesn't mean that it's the next model in line after o3.
Remember: OpenAI already had an RL model that freaked everybody out internally in October 2023. ~ Prinz Eugen, der edle Ritter
That’s all for today! See you tomorrow with more such AI-filled content.
Don’t forget to share this newsletter on your social channels and tag Unwind AI to support us!
PS: We curate this AI newsletter every day for FREE, your support is what keeps us going. If you find value in what you read, share it with at least one, two (or 20) of your friends 😉