Manus AI Agent but Opensource

PLUS: Microsoft’s open-source RAG, Run LLMs via React Native on your phone

Shubham Saboo & Gargi Gupta
March 10, 2025

Today’s top AI Highlights:

Microsoft’s open-source RAG for domain-specific knowledge
AI agent framework to orchestrate computer use, agents, and LLM calls
Run LLMs via React Native on your phone
Gemini 2.0 has a Python sandbox to run code, analyze data, create visualizations
OpenManus, an open-source replica of Manus AI with 20K+ stars

& so much more!

Read time: 3 mins

AI Tutorials

AI Agent Tutorial

Air quality has become a crucial health factor, especially in urban areas where pollution levels can significantly impact our daily lives. While many air quality monitoring tools exist, there's a gap when it comes to personalized health recommendations based on real-time air quality data.

In this tutorial, we'll walk you through building a multi-agent AQI Analysis App that gives personalized health recommendations based on real-time air quality data. This system will analyze current air conditions and provide tailored advice based on your health conditions and planned activities.

Tech stack:

Firecrawl for web scraping
Agno (formerly Phidata) to create and coordinate AI agents
OpenAI GPT-4o as LLM
Streamlit for interface
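
To give a feel for the kind of logic the app's agents automate, here is a rough Python sketch. It is not the tutorial's code: the `recommend` rule and its parameters are hypothetical stand-ins for the advice the LLM would write, and the only outside fact used is the standard US EPA AQI category scale.

```python
# Upper bounds of the US EPA AQI categories; anything above 300 is "Hazardous".
AQI_BANDS = [
    (50, "Good"),
    (100, "Moderate"),
    (150, "Unhealthy for Sensitive Groups"),
    (200, "Unhealthy"),
    (300, "Very Unhealthy"),
]


def aqi_category(aqi: int) -> str:
    """Map a numeric AQI reading to its EPA category label."""
    for upper, label in AQI_BANDS:
        if aqi <= upper:
            return label
    return "Hazardous"


def recommend(aqi: int, has_respiratory_condition: bool, activity: str) -> str:
    """Hypothetical rule standing in for the LLM-written advice."""
    category = aqi_category(aqi)
    # Assumption: sensitive users are told to cut back one band earlier.
    limit = 100 if has_respiratory_condition else 150
    if aqi > limit:
        return f"{category}: consider moving your {activity} indoors."
    return f"{category}: your {activity} looks fine outdoors."
```

In the real app, the AQI value would come from scraped live data and the recommendation from the LLM; the sketch only shows how a reading, a health condition, and a planned activity combine into one piece of advice.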

Build an AQI Analysis Agent

Fully functional AI agent app with step-by-step instructions (100% opensource)

AI Workflow

This workflow combines Grok-3's image generation capabilities with Pika AI's video animation features to create stunning transformation videos that show the evolution from vintage to modern aesthetics. Perfect for photo restorations, concept visualizations, or creative storytelling.

AI Video Transformation with Grok-3 and Pika AI

Step-by-step AI Workflow Recipe to Transform

We share hands-on tutorials like this 2-3 times a week, designed to help you stay ahead in the world of AI. If you're serious about leveling up your AI skills and staying ahead of the curve, subscribe now and be the first to access our latest tutorials.

Don’t forget to share this newsletter on your social channels and tag Unwind AI (X, LinkedIn, Threads, Facebook) to support us!

Latest Developments

sPecIalized KnowledgE and RAG 🔎 ✨🗃️

Microsoft has released PIKE-RAG, a new approach to Retrieval-Augmented Generation (RAG) that goes beyond basic document retrieval. Unlike standard RAG systems, PIKE-RAG focuses on deeply extracting and utilizing specialized, domain-specific knowledge and building a clear line of reasoning.

This framework is specifically designed to tackle the complex challenges found in industrial applications, where data is often messy and expertise is critical. The system's modular design and phased approach to development should get your attention.

Key Highlights:

Modular Framework - PIKE-RAG isn't a one-size-fits-all solution. Its core modules (document parsing, knowledge extraction, storage, retrieval, organization, reasoning, and task decomposition) can be mixed and matched to build RAG pipelines tailored to specific problem types and complexity levels.
Handles Complex Data - PIKE-RAG is built to deal with the reality of industrial data – scanned images, PDFs, web data, and specialized databases. It uses techniques like context-aware segmentation and multi-granularity knowledge extraction to pull meaningful information from these varied sources.
Task Decomposition - For complex, multi-step reasoning tasks, PIKE-RAG pairs knowledge atomizing, which breaks data chunks down to extract their intrinsic knowledge, with knowledge-aware task decomposition, which splits complex queries into smaller, manageable sub-queries. This improves accuracy and lets the system handle multi-hop questions.
Staged Implementation - PIKE-RAG introduces task classification (factual, linkable-reasoning, predictive, creative) that allows you to start with simpler tasks and progressively add more complex reasoning capabilities, aligning development with real-world needs. This provides a clear roadmap for implementation.
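
To make the task-decomposition idea concrete, here is a minimal Python sketch. It is not PIKE-RAG's actual API: the toy `ATOMIC_FACTS` store, the hard-coded `decompose` rule, and the `answer_multi_hop` helper are all hypothetical names with made-up data. A real pipeline would likely use an LLM to propose sub-queries and a retriever over the atomized knowledge.

```python
from typing import List, Optional

# Illustrative only: a toy multi-hop pipeline, not PIKE-RAG's real interface.
# ATOMIC_FACTS stands in for "atomized" knowledge: small question-shaped facts
# extracted from larger chunks. The facts themselves are invented.
ATOMIC_FACTS = {
    "who makes drug x": "Acme Pharma",
    "where is acme pharma headquartered": "Basel",
}


def decompose(query: str) -> List[str]:
    """Split a two-hop question into ordered sub-queries.

    This hard-coded rule only shows the shape of the output; it is an
    assumption standing in for a model-driven decomposition step.
    """
    if query == "where is the maker of drug x headquartered":
        # "{prev}" marks where the previous hop's answer is substituted.
        return ["who makes drug x", "where is {prev} headquartered"]
    return [query]


def answer_multi_hop(query: str) -> Optional[str]:
    """Answer each sub-query in turn, feeding each answer into the next hop."""
    prev = ""
    for sub_query in decompose(query):
        resolved = sub_query.format(prev=prev).lower()
        answer = ATOMIC_FACTS.get(resolved)
        if answer is None:
            return None  # a hop could not be answered from the store
        prev = answer
    return prev
```

The point of the sketch is the chaining: neither stored fact answers the full question alone, but resolving the first hop produces the entity the second hop needs.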

Turn AI into Your Income Engine

Ready to transform artificial intelligence from a buzzword into your personal revenue generator?

HubSpot’s groundbreaking guide "200+ AI-Powered Income Ideas" is your gateway to financial innovation in the digital age.

Inside you'll discover:

A curated collection of 200+ profitable opportunities spanning content creation, e-commerce, gaming, and emerging digital markets—each vetted for real-world potential
Step-by-step implementation guides designed for beginners, making AI accessible regardless of your technical background
Cutting-edge strategies aligned with current market trends, ensuring your ventures stay ahead of the curve

Download your guide today and unlock a future where artificial intelligence powers your success. Your next income stream is waiting.

Get Your Guide

Build Reliable Agents: New Framework Supports MCP and Computer Use 🧑‍💻🌐🛠️

Upsonic, a new open-source AI agent framework for building reliable and scalable agentic applications, has launched. It addresses common production hurdles with built-in reliability features, whereas many existing frameworks require extensive custom coding for stability.

Upsonic natively supports the Model Context Protocol (MCP), opening up a vast ecosystem of pre-built tools and data sources. The framework also offers a unique task-centric approach that simplifies agent development and deployment, contrasting with role-based or graph-based architectures.

Key Highlights:

Reliability Mechanisms - Upsonic features a multi-layered reliability system, including Verifier and Editor agents, iterative verification rounds, and feedback loops. This drastically improves output accuracy, especially crucial for tasks involving numerical operations or critical actions.
High Reliability Scores - Upsonic's own testing reports a 98.2% reliability score on tasks like JSON key transformations, which it presents as a substantial improvement over frameworks such as Crew AI and LangGraph.
Native MCP Support - You can directly integrate with hundreds of tools developed using the MCP. This gives immediate access to a broad range of tools, eliminating custom integrations from scratch.
Task Centric - Upsonic uses Tasks as its core component. Tasks can be customized with various parameters, tools, and context. It automatically generates the necessary steps within tasks. This allows agents to handle multiple tasks and simplifies the process of defining task dependencies.
Direct LLM Calls - For simple tasks, developers can bypass the agent system and directly interact with the underlying LLM. This reduces overhead, leading to faster execution and lower costs, without sacrificing structured outputs.
Computer Use Capability - The framework supports "Computer Use" and "Browser Use" integrations, allowing agents to interact with software and websites just like a human would – clicking, scrolling, and typing. This is critical for automating tasks that lack APIs or require navigating complex user interfaces.
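
The verifier/editor loop described above can be sketched in plain Python. This is not Upsonic's API: every function name here is hypothetical, and `draft_transform` is a deliberately flawed stand-in for a first LLM attempt at a JSON key transformation (snake_case to camelCase).

```python
from typing import List

# Illustrative only: these functions mimic the idea of a verifier/editor
# loop. They are hypothetical and are not part of Upsonic.


def to_camel(key: str) -> str:
    """Convert a snake_case key to camelCase."""
    head, *rest = key.split("_")
    return head + "".join(part.capitalize() for part in rest)


def draft_transform(record: dict) -> dict:
    """Stand-in for a first attempt that only renames the 'full_name' key."""
    return {("fullName" if k == "full_name" else k): v for k, v in record.items()}


def verify(source: dict, output: dict) -> List[str]:
    """Verifier: return the source keys whose camelCase form is missing."""
    return [k for k in source if to_camel(k) not in output]


def edit(source: dict, output: dict, problems: List[str]) -> dict:
    """Editor: repair only the keys the verifier flagged."""
    fixed = {k: v for k, v in output.items() if k not in problems}
    for key in problems:
        fixed[to_camel(key)] = source[key]
    return fixed


def run_with_verification(record: dict, max_rounds: int = 3) -> dict:
    """Draft once, then alternate verify and edit for up to max_rounds."""
    output = draft_transform(record)
    for _ in range(max_rounds):
        problems = verify(record, output)
        if not problems:
            return output
        output = edit(record, output, problems)
    if verify(record, output):
        raise RuntimeError("output still failed verification")
    return output
```

Here the verifier is a deterministic check; in an agent framework, the verifier and editor would themselves be model calls, which is why a cap on rounds matters.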

Quick Bites

Google has released a new experimental Gemini Embedding text model (gemini-embedding-exp-03-07) available in the Gemini API. It claims the top spot on the MTEB Multilingual leaderboard with a score of 68.32, outperforming competitors by a significant margin. The model, trained on Gemini itself, comes with features like an 8K token input limit, 3K output dimensions, and support for over 100 languages.

Google has made Code Execution generally available for Gemini 2.0 models in both Google AI Studio and the Gemini API. This feature gives the models access to a Python sandbox where they can run calculations, analyze complex datasets, and create visualizations on the fly—with support for file input, Matplotlib graphs, and even Multimodal Live API and Thinking Mode. The sandbox supports libraries like NumPy and Pandas, with code execution lasting up to 30 seconds at a time and up to 5 executions without re-prompting.
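
The two limits quoted above (30 seconds per execution, up to 5 executions per prompt) can be illustrated with a local analogy in Python. This is only a sketch using the standard library's `subprocess` module. It says nothing about how Google's sandbox is built, and `run_snippets` is a hypothetical helper.

```python
import subprocess
import sys

# Limits quoted in the item above. The enforcement below is a local analogy,
# not a description of Google's sandbox.
MAX_SECONDS_PER_RUN = 30
MAX_RUNS_PER_PROMPT = 5


def run_snippets(snippets, timeout=MAX_SECONDS_PER_RUN, max_runs=MAX_RUNS_PER_PROMPT):
    """Run at most `max_runs` Python snippets, each in a fresh subprocess."""
    results = []
    for code in snippets[:max_runs]:
        try:
            proc = subprocess.run(
                [sys.executable, "-c", code],
                capture_output=True,
                text=True,
                timeout=timeout,
            )
            # Keep stdout on success; collapse any failure to a marker string.
            results.append(proc.stdout.strip() if proc.returncode == 0 else "error")
        except subprocess.TimeoutExpired:
            results.append("timeout")
    return results
```

Snippets beyond the cap are simply dropped, and a run that exceeds the time limit is reported as "timeout" rather than blocking the rest.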

Anthropic has redesigned its Console to be a unified workspace for building and testing AI applications with Claude, now featuring team collaboration through shareable prompts. The upgrade supports the latest Claude 3.7 Sonnet model and includes tools to write, generate, evaluate, and refine prompts, with new controls for the extended thinking feature that allows step-by-step reasoning.

Hugging Face just released a fun and easy guide to running LLMs directly on your phone! Using React Native and llama.cpp, you can now build mobile apps that download and run GGUF-quantized models like DeepSeek R1 Distill Qwen 2.5 and Llama-3.2, keeping everything private and offline. The step-by-step tutorial and full codebase are available now, so start building your on-device AI chatbot.

Tools of the Trade

OpenManus: China’s new AGI agent Manus AI is incredible, but you can use it only if you have an invite code. OpenManus is an open-source alternative to Manus AI that you can use today! It employs specialized agents (Project Manager, Planning Agent, Tool Call Agent) that collaborate to solve complex tasks. You can also extend or customize the system by adding new agents, tools, or features.
Comet API: A unified platform that provides access to over 500 AI models through a single API. It allows you to define workflows, connect to multiple data sources, and automate API calls without writing complex backend code.
Rio: An easy-to-use framework for creating websites and apps, based entirely on Python. You won't need a single line of HTML, CSS, or JavaScript to create beautiful, modern apps.
Connected Apps by Stytch: Turn your applications into OAuth 2.0/OIDC Identity Providers for secure cross-app integrations and AI agent workflows. It handles token management, permissions control, and user consent flows so applications can safely share data with third-party services and AI agents.
Awesome LLM Apps: Build awesome LLM apps with RAG, AI agents, and more to interact with data sources like GitHub, Gmail, PDFs, and YouTube videos, and automate complex work.

Hot Takes

The ultimate test: give AI access to a bank account with $1,000 and a fixed amount of time. AI has to grow the balance faster than humans would.
That’s superintelligence in my book. Everything else is just a glorified benchmark. ~ Santiago
Vibe coding is basically code for "vibe now, stress later" when you're confronted with angry customers, impossible bugs and code you don't understand even one bit. ~ Erwin

That’s all for today! See you tomorrow with more such AI-filled content.

Unwind AI - X | LinkedIn | Threads | Facebook

Awesome LLM Apps | Sponsor Us

PS: We curate this AI newsletter every day for FREE, your support is what keeps us going. If you find value in what you read, share it with at least one, two (or 20) of your friends 😉
