unwind ai
Posts
OpenAI Drops Opensource Multi-Agent Framework

OpenAI Drops Opensource Multi-Agent Framework

PLUS: Open compute for training LLMs, AI glasses reveal personal details

Shubham Saboo & Gargi Gupta
October 14, 2024

Today’s top AI Highlights:

OpenAI releases new opensource multi-agent AI framework Swarm
Decentralized training of a 10B parameter AI model using open compute
Tesla showcased their fully autonomous (or remotely operated) Robotaxi and humanoid robot Optimus
AI glasses that reveal anyone’s personal details - home address, phone number, and more
iOS app to chat with Llama 3.2 models locally on iPhone

& so much more!

Read time: 3 mins

🚀 India's Biggest Product Event - Product (Un)Conference '24 is Here!

Bet you're tired of conferences that are just panels and monologues! That’s exactly why The Product Folks created (Un)Conference!

Join over 350+ visionary leaders, including 50+ founders and CXOs, in Bangalore for India's premier product event. Dive into the future of product development in an AI-driven world, with unfiltered insights and behind-the-scenes stories from founders of InMobi, Clevertap & CXOs from Swiggy, MakeMyTrip, Myntra, Cleartrip, and many more companies.

Don’t miss this exclusive opportunity to learn from the brightest minds shaping the Indian startup ecosystem!

💡 What’s on Deck at (Un)Conference '24?

Cutting-Edge Techniques: Master AI and product strategies driving tomorrow.
Real-World Successes: Hear directly from leaders turning visions into reality.
Make New Friends: Make real connections with like-minded people.

📅 Date: October 19, 2024

📍 Place: Bangalore

🎨 Theme: Building Products for the AI World

🎟️ Act Fast — Spots Are Limited! Readers of Unwind AI get an exclusive discount by signing up with our referral code ‘UnwindxTPF’

👉 Register Now for (Un)Conference ‘24!

Latest Developments

Build Modular, Scalable Multi-Agent Apps with Swarm 👭

Probably the first time OpenAI has opensourced something! They have introduced Swarms, an ergonomic, lightweight framework to orchestrate multiple AI agents using the concept of "routines" and "handoffs." This simplifies complex tasks by breaking them down into smaller, manageable units and at any point an AI agent can choose to hand off a conversation to another agent.

Swarms is a robust and scalable approach compared to managing extensive prompts and diverse logic within a single agent. The sample library provides practical examples and code that you can easily implement.

Key Highlights:

Routines - Define a set of instructions (system prompt) and associate tools (functions) with them to handle specific tasks. This allows for modular design and easier management of complex flows. LLMs follow these routines, offering "soft" adherence for more natural conversations.
Handoffs - Enable dynamic switching between different agents during a conversation, allowing specialized agents to handle specific parts of the interaction. Agents can call transfer functions to trigger handoffs, maintaining full conversation history for the new agent.
Tool Calls & Execution - Functions are automatically converted to JSON Schema for use with OpenAI's tool-calling feature. Results from function execution are returned to the model, facilitating dynamic interaction and complex logic.
Local Setup - Swarm runs almost entirely on the client side, meaning you can test and run agent-based systems without worrying about server-side dependencies.
Swarm Library (Experimental) - Here’s a handy Python library (still experimental, though!) that implements these ideas along with examples. Think of this as a starter kit for building your own multi-agent systems.

Training 10B LLM Where Anyone can Contribute Compute 🌐

The first-ever opensource decentralized training of Llama like 10B LLM where anyone can contribute the compute! Prime Intellect’s project INTELLECT-1 trains a 10-billion-parameter AI model using decentralized computing power. This builds on their earlier work, OpenDiLoCo, which expanded Google DeepMind’s DiLoCo method for training AI across distributed devices. Now, with INTELLECT-1, anyone can contribute their computing resources to help train the model, moving a step closer to open and collaborative AI development.

The Prime framework behind this improves on communication and efficiency, allowing the model to be trained across different locations without needing constant syncing between machines. This new system uses resources from different contributors while still keeping the process smooth and reliable.

Key Highlights:

Opensource - INTELLECT-1 is built on OpenDiLoCo, an open-source platform that lets anyone contribute compute power. The code is available on GitHub, and there’s a live dashboard to track progress.
Reduced Communication Overhead - To make the training run smoothly across global locations, syncing only happens every 100 steps, minimizing the communication load between machines. This is further optimized by compressing the data using int8 quantization, cutting down the bandwidth requirements by 2000x.
Fault-Tolerant Design - Prime ensures training continues smoothly even if machines join or drop out. It automatically adjusts to available resources, making it flexible for decentralized training across the globe.
Scaling to 10B Parameters - Prime Intellect is training a 10B model (based on Llama-3) on a high-quality dataset of 6 trillion tokens, using GPUs from contributors spread across different locations. This large-scale, open effort is a step towards truly democratizing AI development.

Quick Bites

Elon Musk unveiled the highly-anticipated Tesla’s Robotaxi called “Cybercab” at the We, Robot event in Los Angeles. But the robotaxis were not the only new reveal. Tesla surprised everyone with a new Robovan prototype, a large autonomous vehicle to carry many people together.

Robotaxi - The Cybercab is fully autonomous, with no steering wheels or pedals, and charges wirelessly possibly using mats on the roads. It is priced at under $30K and will be mass-produced by 2026.
Robovan - Designed to transport up to 20 people, this large EV can be used for both personal and commercial purposes.
Optimus - Tesla left its humanoid robot Optimus untethered to interact with people (some say that the robots were being teleoperated though!). Optimus is expected to cost between $20,000 and $30,000 when it eventually goes on sale.

And we can’t not talk about SpaceX’s groundbreaking milestone achieved yesterday. SpaceX successfully caught the returning Super Heavy booster of the Starship using mechanical arms on the launch tower. After months of preparing and testing, the company achieved this feat in its first attempt and took a leap towards its goal of "rapid reusability," significantly reducing the costs and turnaround time for space launches.

A badass AI phone assistant that screens your calls, rejecting the spam and preserving your sanity. A group of engineers have built an early prototype of Donna, an AI secretary using OpenAI Realtime API and Google Calendar integration. It will connect the caller to you only if it’s an urgent or authenticated call, or if it's from friends or family. It can handle multilingual conversations. Do check the demo, it is pretty badass!

Two Harvard students have created I-XRAY, AI-powered glasses that can reveal personal details like names, home address, phone numbers, and relatives’ names by simply looking at someone. The glasses use face recognition and data extraction from public databases to gather this information. Privacy paranoia just got real!!

Apple is developing four new AR headsets, including a lower-cost Vision model priced around $2,000, expected by 2025. By 2027, they may introduce smart glasses on par with the Meta Ray-Bans, as well as AirPods with cameras.

Tools of the Trade

fullmoon: iOS app to chat with LLMs that’s optimized for Apple silicon and works on iPhone, iPad, and Mac. Your chat history is saved locally, and you can customize the appearance of the app. Currently supports Llama 3.2 1B and 3B models.
Actions by Firecrawl: Interact with any web page before extracting. It lets you simulate user interactions (clicks, scrolls, typing, etc.) on a webpage before extracting data. This ensures dynamic content is loaded and accessible for more complete and accurate data scraping.
GenAI Career Assistant: Opensource multi-agent AI app that simplifies the job search process by personalized career guidance, resume analysis, and custom cover letters. It streamlines tasks like job hunting and employer research through an interactive Streamlit interface.
Awesome LLM Apps: Build awesome LLM apps using RAG to interact with data sources like GitHub, Gmail, PDFs, and YouTube videos through simple text. These apps will let you retrieve information, engage in chat, and extract insights directly from content on these platforms.

Hot Takes

Ask ChatGPT “From all of our interactions what is one thing that you can tell me about myself that I may not know about myself” ~
Tom Morgan
The most interesting to build over the next year is an AI engineer
Here is why - once you can scale engineering, you can build anything!
Including super intelligence ~
Bindu Reddy

Meme of the Day

“really cool tech, when will you be profitable?”

That’s all for today! See you tomorrow with more such AI-filled content.

🎁 Bonus worth $50 💵

Share this newsletter on your social channels and tag Unwind AI (X, LinkedIn, Threads, Facebook) to get AI resource pack worth $50 for FREE. Valid for limited time only!

Unwind AI - X | LinkedIn | Threads | Facebook

Awesome LLM Apps | Sponsor Us

PS: We curate this AI newsletter every day for FREE, your support is what keeps us going. If you find value in what you read, share it wtith at least one, two (or 20) of your friends 😉

Reply

or to participate.