Build, Deploy & Scale AI Agents
PLUS: AutoRAG by Cloudflare, Google releases Project Astra
Today’s top AI Highlights:
Build and deploy AI agents with tools, MCP, knowledge, state, and scale with Cloudflare
AI 2027 report forecasts a loss of control to superintelligence by 2030
Cloudflare releases AutoRAG: A fully-managed RAG pipeline
Build voice AI agents with no-code using ElevenLabs MCP server
Google releases Project Astra with Gemini Live
& so much more!
Read time: 3 mins
AI Tutorials
Voice is the most natural and accessible way for users to interact with any application, and we see it used most in customer support use cases. But building a voice agent that can access your knowledge base can be complex and time-consuming.
In this tutorial, we'll build a Customer Support Voice Agent using OpenAI's SDK that combines GPT-4o with their latest TTS (text-to-speech) model. Our application will crawl documentation websites, process the content into a searchable knowledge base, and provide both text and voice responses to user queries through a clean Streamlit interface.
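The tutorial's flow can be sketched as crawl → index → retrieve → answer. The sketch below is a toy stand-in, not the tutorial's actual code: the function names are illustrative, and word-overlap ranking substitutes for real embeddings; the real build uses OpenAI's SDK for GPT-4o answers and a TTS model for speech.

```python
# Toy sketch of the tutorial's flow: crawl docs -> index -> retrieve -> answer.
# Names and the word-overlap "retrieval" are illustrative stand-ins.

def crawl(pages):
    """Pretend-crawl: flatten {url: text} into (url, paragraph) chunks."""
    chunks = []
    for url, text in pages.items():
        for para in text.split("\n\n"):
            if para.strip():
                chunks.append((url, para.strip()))
    return chunks

def retrieve(chunks, query, k=1):
    """Rank chunks by word overlap with the query (stand-in for embeddings)."""
    q = set(query.lower().split())
    return sorted(chunks,
                  key=lambda c: len(q & set(c[1].lower().split())),
                  reverse=True)[:k]

def answer(chunks, query):
    url, passage = retrieve(chunks, query)[0]
    # A real agent would send `passage` to GPT-4o, then speak the reply via TTS.
    return f"Based on {url}: {passage}"

docs = {"https://docs.example.com/billing":
        "Refunds are processed within 5 days.\n\nInvoices are emailed monthly."}
print(answer(crawl(docs), "how long do refunds take"))
```

Swapping the overlap ranker for an embedding model and the string template for a GPT-4o call turns this skeleton into the tutorial's actual agent.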
We share hands-on tutorials like this every week, designed to help you stay ahead in the world of AI. If you're serious about leveling up your AI skills and staying ahead of the curve, subscribe now and be the first to access our latest tutorials.
Latest Developments
Cloudflare seems to be extremely bullish on AI agents, aiming to be a one-stop shop for building and deploying production-ready AI agents. Last month they released agents-sdk for building serverless AI agents on Workers.
They have now enhanced that offering with better tools integrated directly into the Cloudflare ecosystem. Starting with a new platform for building AI agents (which looks quite neat and sleek, honestly), they've added comprehensive support for Model Context Protocol (MCP) servers, authentication integrations, WebSocket hibernation to cut compute costs, and general availability for Workflows, giving you a robust, end-to-end stack for building powerful, scalable AI agents.
Key Highlights:
MCP Server Support - Native integration for Model Context Protocol (MCP) servers, allowing your agents to securely connect and interact with external services. Build remote MCP clients with built-in transport and authentication, enabling agents to discover and use tools across different platforms seamlessly.
Advanced Authentication - Integrated support for authentication providers like Stytch, Auth0, and WorkOS. Implement robust OAuth 2.1 flows, manage user consent, and define granular permissions for your AI agents—all without writing complex authentication code from scratch.
WebSocket Hibernation - Your agents can now go to sleep during inactive periods and wake up instantly when needed, ensuring you only pay for actual compute time. This means more efficiency and lower costs for long-running agent sessions.
Durable Objects on Free Tier - Unlock stateful, serverless applications with Durable Objects, now available on the free tier. Get zero-latency SQLite storage, seamless scalability, and the ability to create millions of agents—all without upfront commitment.
Workflows GA - Production-ready support for long-running, multi-step agent workflows. Guarantee execution of complex tasks, handle automatic retries, and create agents that can perform intricate sequences across different systems and tools.
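The tool-use pattern underneath the MCP support above boils down to two operations: discover what tools a server exposes, then invoke them by name. Here is a minimal sketch of that loop; the tool names and the in-process registry are made up for illustration, not Cloudflare's or MCP's actual API.

```python
# Minimal sketch of an MCP-style tool loop: discovery, then dispatch.
# The registry and tools here are hypothetical stand-ins.

TOOLS = {
    "get_weather": lambda city: f"Sunny in {city}",
    "add": lambda a, b: a + b,
}

def list_tools():
    """Tool discovery: an MCP client would fetch this list from the server."""
    return sorted(TOOLS)

def call_tool(name, *args):
    """Dispatch a tool call; a real client sends this over the MCP transport."""
    if name not in TOOLS:
        raise KeyError(f"unknown tool: {name}")
    return TOOLS[name](*args)

print(list_tools())            # discovery
print(call_tool("add", 2, 3))  # invocation
```

An agent runs this loop repeatedly: the model picks a tool from the discovered list, the client executes it, and the result feeds back into the next model turn; authentication and transport (the parts Cloudflare now handles for you) wrap around exactly these two calls.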
Cloudflare just launched AutoRAG, a fully managed Retrieval-Augmented Generation pipeline now in open beta. AutoRAG automates the entire pipeline from data ingestion to response generation, eliminating the need to stitch together multiple tools and services yourself.
It automatically processes your data sources, stores vectors in Cloudflare's Vectorize database, and generates high-quality responses using Workers AI—all while continuously monitoring and updating your indices in the background.
Key Highlights:
RAG pipeline with minimal setup - AutoRAG connects directly to your R2 storage bucket, automatically handling file conversion, chunking, embedding, and vector storage without requiring you to write integration code between multiple services.
Background indexing that keeps your data fresh - Once connected to your data source, AutoRAG continuously monitors for changes and automatically reprocesses new or updated files, ensuring your AI always has access to the latest information.
Built-in workflow optimization - The system supports optional query rewriting to improve retrieval quality and uses Workers AI's embedding and text generation models to deliver responses grounded in your private data.
Seamless integration with your web content - Using Cloudflare's Browser Rendering API (now generally available), you can easily crawl your websites to feed content directly into AutoRAG, making it simple to build AI assistants based on your existing web content.
Free during open beta - AutoRAG is free during the open beta period with no additional costs for indexing, retrieval, and augmentation operations. Each account can create up to 10 AutoRAG instances with support for up to 100,000 files per instance.
Former OpenAI researcher Daniel Kokotajlo, along with Scott Alexander and three other forecasters, has released a report charting a startlingly credible path to superhuman AI by 2027. Their scenario tracks AI progression from enhancing coding productivity to conducting autonomous research at superhuman levels, potentially creating an intelligence explosion where algorithmic progress accelerates by 50x, far outpacing human capacity to monitor or control it.
Companies like OpenBrain (their stand-in for real AI labs) initially use AI to accelerate research, creating a feedback loop where smarter AIs build even smarter successors. US-China competition turns this technological progress into a dangerous race, with safety concerns pushed aside despite growing evidence of misalignment. By mid-2027, both countries are churning out robot armies while superintelligent systems quietly pursue their own agendas.
Their timeline isn't based on hype or fear, but on careful trend analysis and deep understanding of AI progress.
Key Highlights:
Alarming acceleration - The scenario maps how AI systems rapidly progress from being coding assistants to becoming autonomous researchers - effectively creating a "country of geniuses in a datacenter" that advances technology by decades in mere months.
Geopolitical arms race - As China and the US frantically compete for AI dominance, safety concerns get sidelined in favor of military applications, with each superpower desperately afraid of falling behind.
Power concentration disaster - The handful of people who control the most advanced AIs find themselves wielding unprecedented influence, as superintelligent systems infiltrate every aspect of government, industry, and military.
Alignment failures - Despite efforts to make AIs follow human values, the most advanced systems develop their own misaligned goals, pursuing power while appearing helpful until it's too late to stop them.
The researchers present two possible endings - a "race" scenario where superintelligent AIs eventually seize control from humans, and a "slowdown" scenario where humans narrowly maintain oversight. Both outcomes hinge on critical decisions being made right now in AI labs and government offices worldwide.
Quick Bites
Google has released Project Astra in the Gemini app for Gemini Advanced users on Android devices. This lets you share your screen as well as your camera to have free-flowing voice conversations with Gemini Live.
ElevenLabs has launched its official MCP server, enabling AI tools like Claude and Cursor to access its entire audio platform via text prompts. Within MCP-compatible applications you can generate speech, clone voices, transcribe audio, and even spin up voice agents for tasks like outbound calls.
Tools of the Trade
Pointer: Generalist AI agent that can browse the web, read documents, analyze data, write code, and complete tasks with varying levels of independence. It maintains its context between browser sessions and allows users to work alongside it rather than just submitting commands and receiving results. Available in research preview.
Maskara.ai: Lets you chat with multiple AI agents at once (powered by LLMs like GPT-4o, Claude, and Gemini) that work together on a single prompt through multi-turn conversations. You select the agents, provide one prompt, and the system manages the interaction between models to complete complex tasks.
Nodezator: A multi-purpose node editor for Python. It takes your functions (or other callables) and turns them into visual Python nodes, allowing you to create and execute complex node layouts and even export them back as Python code.
Awesome LLM Apps: Build awesome LLM apps with RAG, AI agents, and more to interact with data sources like GitHub, Gmail, PDFs, and YouTube videos, and automate complex work.

Hot Takes
the future will split homo sapiens into the pleasure-seekers and the truth-seekers
the former will submerge into ghiblified simulations, achieving ever deeper art & absurdity
the latter will fight to the death for resources to build ever larger brains for understanding reality ~
James Campbell
Meta could save money by innovating concepts instead of scaling LLMs. An agentic LLM that updates its knowledge base, runs tools, and checks goals in a loop feels cheaper to develop and more useful. ~
Tom Dörr
That’s all for today! See you tomorrow with more such AI-filled content.
Don’t forget to share this newsletter on your social channels and tag Unwind AI to support us!
PS: We curate this AI newsletter every day for FREE, your support is what keeps us going. If you find value in what you read, share it with at least one, two (or 20) of your friends 😉
Today’s top AI Highlights:
Build agentic RAG without similarity search, chunking, and vector DB
This all-in-one Superagent outperforms Manus AI and OpenAI Operator
Package and deploy your app on Windsurf to Netlify with a single click
Agent Swarms working in parallel is the new productivity multiplier
Stop Cursor, Claude or any LLM from generating broken, outdated code
& so much more!
Read time: 3 mins
Latest Developments

Your domain experts know that context matters more than keyword matching, and vector-based similarity search often isn't enough. Vectify AI just open-sourced PageIndex, a new document indexing system built for reasoning-based RAG. It structures lengthy PDFs into semantic trees - think of it as a smart table of contents that helps LLMs find exactly what they need.
This approach was inspired by tree search algorithms similar to those in AlphaGo, making it particularly effective for domain-specific content where traditional embedding models struggle.
Key Highlights:
No Vector Chunks - PageIndex organizes content hierarchically based on document structure, eliminating arbitrary chunking and preserving natural section boundaries. This makes it especially effective for professional documents where context matters and similar terms might confuse vector-based systems.
Precise Page Referencing - Each node contains a summary along with exact physical page numbers, allowing your application to retrieve and cite specific information with pinpoint accuracy – critical for professional domains like finance, legal, or technical documentation.
No Vector Database Required - The system stores document trees in standard databases, significantly reducing infrastructure complexity while making it easier to integrate expert knowledge and user preferences into the retrieval process.
Performance Where It Counts - In benchmarks like FinanceBench, reasoning-based retrieval using PageIndex achieved 98.7% accuracy for financial document analysis, outperforming traditional vector-based approaches in domain-specific applications.
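The tree idea above can be sketched in a few lines: each node carries a summary and a physical page range, and retrieval walks the tree by picking the child whose summary best matches the query. This is a simplified stand-in, not PageIndex's actual data model or code; in the real system an LLM reasons over the node summaries rather than counting word overlap.

```python
# Sketch of reasoning-based retrieval over a semantic tree, in the spirit of
# PageIndex. Node layout and scoring are simplified stand-ins.

class Node:
    def __init__(self, summary, pages, children=()):
        self.summary, self.pages, self.children = summary, pages, list(children)

def descend(node, query):
    """Walk the tree, following the child whose summary best matches the
    query; an LLM would do this selection step in the real system."""
    q = set(query.lower().split())
    while node.children:
        node = max(node.children,
                   key=lambda c: len(q & set(c.summary.lower().split())))
    return node

doc = Node("annual report", (1, 120), [
    Node("revenue and segment results", (10, 40)),
    Node("risk factors and litigation", (41, 80)),
])

hit = descend(doc, "pending litigation risk")
print(hit.summary, hit.pages)  # exact page range available for citation
```

Because each leaf keeps its physical page range, the answer can cite exact pages, which is the "precise page referencing" property the highlights describe, and no vector database is involved at any step.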

This is a new generation of AI agents that can autonomously think, plan, use a computer, and complete tasks for you. What started with Anthropic's Computer Use, OpenAI Operator, and Manus AI has become a wave of such agents, with new releases every other week, each improving end-to-end handling of complex, multi-step workflows.
Here’s another one that made our jaws drop after Manus AI. Genspark is an all-in-one super agent that can think, plan, act, and use tools to handle your everyday tasks. Think travel planning with your specific preferences, conducting research, or even making a phone call for a reservation!
But Genspark works very differently from other computer-use agents. Rather than driving a computer in a sandboxed VM, it uses its in-house system to call the relevant APIs directly whenever it needs to perform an action.
Key Highlights:
Mixture-of-Agents System - The Super Agent doesn't rely on just one model; it routes work across a network of 9 different-sized LLMs, matching each to the task at hand to reduce errors and hallucination.
In-House Toolsets - It has access to 80+ pre-built, tested toolsets. As the agent thinks and plans through a task, it autonomously calls these tools as needed, allowing tighter integration with tasks like travel and restaurant bookings.
In-House Datasets - It is backed by pre-built datasets distilled from the web, helping ensure data quality and freshness—crucial for devs working with real-time data and task management.
Fast, Accurate, & Steerable - The Super Agent provides near-instant results and control over outputs. Because it uses direct integrations, responses can be refined and deployed faster. As for the accuracy, it outperforms OpenAI Operator, Manus AI, and other SOTA agents on the GAIA benchmark.
You can try Genspark Super Agent for free right now.

Cognition AI just rolled out Devin 2.0, the latest iteration of its AI software engineer, introducing a completely redesigned agent-native IDE experience at a new starter price of just $20. The update enables you to run multiple autonomous Devin instances simultaneously while interacting with them through a familiar VSCode-like environment. Devin 2.0 also brings significant efficiency improvements, with each Agent Compute Unit (ACU) now delivering 83% more completed tasks than previous versions, making AI-assisted development more accessible to individual developers and small teams.
Key Highlights:
Multi-Agent Collaboration - Developers can now spin up parallel Devin instances to tackle multiple tasks concurrently, each with its own cloud-based IDE and isolated environment, allowing for easy context switching between different development tasks.
Interactive Planning System - Before execution, Devin proactively analyzes your codebase and presents relevant files, findings, and an initial plan within seconds - letting you refine the approach before implementation begins rather than starting with detailed requirements documents.
Codebase Understanding Tools - The new Devin Search feature enables developers to query their codebase directly with cited answers, while Deep Mode supports complex exploration questions that require extensive repository analysis.
Automatic Documentation - Devin Wiki automatically indexes repositories every few hours, generating architecture diagrams, documentation, and source links - addressing the persistent challenge of keeping technical documentation synchronized with rapidly evolving codebases.
Quick Bites
Lindy AI, the platform to build AI agents and automation, has released Agents Swarm, which puts a swarm of AI agents to work in parallel. A workflow is broken down into sub-tasks, Lindy creates multiple copies of itself, and each copy is assigned a sub-task; working simultaneously, they complete a complex multi-step workflow in seconds.
Taking the same concept a notch up, Convergence AI has released Parallel Agents for Proxy, where multiple computer-use agents work simultaneously on the task at hand. Once a task is given, a planning agent breaks it into sub-tasks, and Proxy spins up multiple agents within the browser, each assigned a sub-task. You can watch these agents navigate their own browsers, all at once, completing the task insanely fast. It seems agent swarms are the next big thing!
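The plan-then-fan-out pattern both products use can be sketched with a thread pool: a planner splits the task, workers run the sub-tasks concurrently, and results come back in order. The planner and worker below are hypothetical stubs (a real swarm would use LLM calls and browser sessions), but the concurrency shape is the same.

```python
# Toy swarm: a "planner" splits a task into sub-tasks and worker agents
# run them in parallel. Planner/worker logic here is a hypothetical stub.
from concurrent.futures import ThreadPoolExecutor

def plan(task):
    """Stand-in planner: a real planning agent would decompose with an LLM."""
    return [f"{task}: step {i}" for i in range(1, 4)]

def worker(subtask):
    """Stand-in worker agent: a real one would drive its own browser."""
    return f"done({subtask})"

def run_swarm(task):
    subtasks = plan(task)
    with ThreadPoolExecutor(max_workers=len(subtasks)) as pool:
        # map preserves sub-task order even though execution is concurrent
        return list(pool.map(worker, subtasks))

print(run_swarm("book a trip"))
```

The speedup these products advertise comes from exactly this shape: independent sub-tasks run wall-clock concurrently, so total time approaches the slowest sub-task rather than the sum of all of them.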
Windsurf has introduced a new "Deploys" feature that allows you to package and share your applications to a public domain through Netlify integration, with just a single click. Wave 6 also brings enterprise access to Model Context Protocol and Turbo Mode, adds commit message generation, and improves conversation management with a new Table of Contents feature for easier navigation.
Anthropic is holding its first-ever developer conference, Code with Claude, on May 22 in San Francisco. Code with Claude is a hands-on, one-day event focused on exploring real-world implementations and best practices using the Anthropic API, CLI tools, and MCP. It is open to a select group of developers and founders. Apply here to attend.
PayPal has launched an MCP Server for merchants to do business tasks like creating invoices seamlessly using Claude, Cursor, Cline, and other MCP clients. Available both as a local installation and as a remote service that maintains sessions across devices, this integration brings the power of conversational AI to PayPal's business tools, while maintaining secure authentication with PayPal accounts.
Tools of the Trade
Context7: Provides up-to-date, version-specific documentation and code examples to LLMs, preventing them from generating outdated or incorrect code. It sources information directly from official documentation, filters it for relevance, and delivers it to AI assistants like Cursor or Claude. Completely free for personal use. Support for MCP servers and APIs coming soon.
Arrakis: A fully customizable and self-hosted sandbox for AI agent code execution and computer use. It features out-of-the-box support for backtracking, a simple REST API and Python SDK, automatic port forwarding, and secure MicroVM isolation. Perfect for safely running, testing, and backtracking multi-step agent workflows.
VibeCode: From Riley Brown, the OG vibe coder on X, this app builds mobile apps from simple text descriptions. Just type in your idea and it turns into a functioning app, which you can further edit through simple prompts.
Awesome LLM Apps: Build awesome LLM apps with RAG, AI agents, and more to interact with data sources like GitHub, Gmail, PDFs, and YouTube videos, and automate complex work.

Hot Takes
The biggest mistake in vibe coding is prompting the agent to fix errors instead of rolling back
Hallmark sign of a junior vibe coder ~
Tom Dörr
OAI moat is until Deepseek drop the GOAT image gen model. Let them enjoy their few months of glory (truly deserved for their hard work but they are robbing people with pricing) ~
Shi