Opensource Task Engine for Building AI Agents
PLUS: New models from Google and xAI, OpenAI's Reinforcement Fine-tuning technique
Today’s top AI Highlights:
Lightweight task engine for building stateful AI agents
Stripe brings financial tools to AI agents with new SDK
New models from Meta, Google, and xAI
OpenAI shows a new Reinforcement Fine-tuning technique that outperforms regular fine-tuning
Unified API for AI agents to communicate across multiple channels (email, Slack, SMS, WhatsApp, etc.)
& so much more!
Read time: 3 mins
AI Tutorials
You might know about agencies that help build software products - with CEOs making strategic decisions, CTOs architecting solutions, developers writing code, and product managers coordinating everything. But can you imagine an agency fully run by AI agents that collaborate to analyze, plan, and guide software projects, all working together seamlessly like a real team?
In this tutorial, we'll build exactly that - a multi-agent AI Services Agency where 5 specialized AI agents work together to provide comprehensive project analysis and planning:
CEO Agent: Strategic leader and final decision maker
CTO Agent: Technical architecture and feasibility expert
Product Manager Agent: Product strategy specialist
Developer Agent: Technical implementation expert
Client Success Agent: Marketing strategy leader
We share hands-on tutorials like this 2-3 times a week, designed to help you stay ahead in the world of AI. If you're serious about leveling up your AI skills and staying ahead of the curve, subscribe now and be the first to access our latest tutorials.
Latest Developments
Flow is a lightweight task engine for building AI agents without getting tangled in complex workflows. Instead of the usual node-based systems, it uses a dynamic task queue that lets your agents adapt and evolve as they run. You can run tasks in parallel, schedule new ones on the fly, and manage state - all with clean, straightforward code. Flow is lightweight but capable, so you can focus on building smart agents rather than wrestling with infrastructure.
Flow's integration with Laminar's tracing capabilities provides detailed insights into each step of your agent's execution, making debugging and state management significantly easier.
Key Highlights:
Parallelism and Dynamic Task Scheduling - Flow automatically handles parallel execution of tasks, and tasks can dynamically add new tasks to the queue during runtime. This eliminates manual thread management and makes your AI agents highly adaptable to changing conditions or data.
State Management and Thread Safety - Each task in Flow operates within a shared, thread-safe context, making it easy to manage and access data across different parts of your agent. You can load the previous state and save the current state, providing a clean way to persist and resume agent operations, or even rollback to previous states.
Debugging with Laminar Tracing - Flow comes with out-of-the-box integration with Laminar's detailed tracing, built on OpenTelemetry. You can visualize the execution flow of your agent, inspect inputs and outputs at each step, and quickly identify bottlenecks or errors.
Lightweight and Dependency-Free - The core Flow engine has zero external dependencies, making it easy to integrate into existing projects without adding bloat. This makes Flow performant and doesn't introduce unnecessary complexity to your codebase.
Streamline your development process with Pinata’s easy File API
Easy file uploads and retrieval in minutes
No complex setup or infrastructure needed
Focus on building, not configurations
Stripe has launched a new agent toolkit that lets you integrate payment processing directly into your LLM agent workflows. This means AI agents can now handle financial transactions, opening up a whole new level of automation possibilities.
The toolkit works seamlessly with popular frameworks like Vercel's AI SDK, LangChain, and CrewAI, and supports any LLM provider that offers function-calling capabilities. With this toolkit, your agents can not only understand user requests but also directly act on them by creating invoices, processing payments, or even making purchases on behalf of users.
Key Highlights:
Framework Support - Native compatibility with LangChain, CrewAI, and Vercel AI SDK lets you integrate Stripe functionality without rebuilding existing agent architectures. The toolkit's tools can run alongside other agent toolsets, supporting complex multi-step operations.
Built-in Financial Tools - Create virtual cards with customizable spending limits, monitor transactions in real-time, and implement usage-based billing through the included middleware. Comes with pre-built functions for common operations like payment links and product management.
Optimized Agent Performance - Simplified API responses and focused functionality help agents make better decisions. The toolkit automatically abbreviates response data to essential fields, reducing confusion in multi-step processes and improving completion accuracy.
Developer-Focused Security - Test mode support and restricted API keys let you validate agent behavior before production. Granular permissions system helps you control exactly which Stripe features your agents can access, reducing potential errors while maintaining flexibility.
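Under the hood, toolkits like this expose payment operations to the model as function-calling tools: a JSON schema the model can invoke, plus a dispatcher that executes the call and returns a trimmed response. The sketch below shows that general pattern; the tool name, schema, and canned response are hypothetical, not the Stripe toolkit's actual definitions.

```python
# Sketch of exposing a payment operation as an LLM function-calling
# tool. Schema and handler are illustrative, not Stripe's real API.
import json

# Tool schema in the JSON Schema shape that function-calling LLM
# providers expect (hypothetical tool name and fields).
create_payment_link_tool = {
    "name": "create_payment_link",
    "description": "Create a Stripe payment link for a product.",
    "parameters": {
        "type": "object",
        "properties": {
            "price_id": {"type": "string", "description": "Stripe price ID"},
            "quantity": {"type": "integer", "minimum": 1},
        },
        "required": ["price_id", "quantity"],
    },
}

def handle_tool_call(name, arguments):
    """Dispatch a model's tool call to the matching operation.

    In a real agent this would hit the Stripe API via the toolkit;
    here we return a canned response so the sketch is runnable.
    """
    if name == "create_payment_link":
        args = json.loads(arguments)
        # Return only essential fields, mirroring how the toolkit
        # abbreviates API responses to keep the agent focused.
        return {"url": f"https://buy.stripe.com/test_{args['price_id']}",
                "quantity": args["quantity"]}
    raise ValueError(f"unknown tool: {name}")

# Simulate the model emitting a tool call with JSON arguments
result = handle_tool_call(
    "create_payment_link",
    json.dumps({"price_id": "price_123", "quantity": 2}),
)
print(result["url"])  # https://buy.stripe.com/test_price_123
```

Restricted API keys and test mode slot naturally into this pattern: the dispatcher runs with whatever permissions the key grants, so the agent can never reach Stripe features you haven't explicitly enabled.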
Quick Bites
Meta unexpectedly released the Llama 3.3 70B model, delivering Llama 3 405B-level performance at just 10% of the inference cost. Driven by a "new alignment process and progress in online RL techniques" that Meta didn't detail, Llama 3.3 70B beats the all-new Amazon Nova Pro. It matches GPT-4o, Claude 3.5 Sonnet, and Gemini 1.5 Pro while being:
- Open source
- Locally runnable
- 25x cheaper than GPT-4o, 30x cheaper than Claude 3.5 Sonnet, 13x cheaper than Gemini 1.5 Pro
Llama 3.3 70B is currently text-only (not multimodal) but multilingual, with a 128K context window and a December 2023 knowledge cutoff. It is available to download from Meta and Hugging Face, and can also be run via Ollama for local inference, Hyperbolic Labs, and Groq (at an insane speed of 276 tokens/second).
On the second day of OpenAI's 12 days of new releases, the team showcased a new technique called Reinforcement Fine-tuning (RFT) that teaches the model to reason in new ways over custom domains. The model is given space to "think through" the problem and then provide a response. It differs from regular fine-tuning, in which the model is trained to mimic or replicate features of the input text or images. Unlike regular fine-tuning, RFT can also generalize from limited examples.
RFT for the o1 series models is currently in preview. OpenAI plans to launch this feature publicly early next year.
The xAI team added a new image generator called Aurora to its AI chatbot Grok, shown as "Grok 2 + Aurora." It was momentarily available on the X web and mobile apps, but was soon taken down for unknown reasons. What made Aurora stand out was that it could create incredibly photorealistic images of public figures without the usual restrictions. There's still uncertainty around Aurora's development: xAI might have built it in-house, or it could be working with another company like Black Forest Labs, maker of FLUX. The xAI team has also announced that Grok 3 is coming soon!
Google has released a new experimental Gemini model, Gemini-exp-1206 (unclear whether it belongs to the Flash or Pro series), in Google AI Studio. It comes with a full 2-million-token context window along with improvements in coding; little else has been revealed so far.
The model is also leading the Chatbot Arena LLM Leaderboard, with significant improvements in coding and hard prompts.
Tools of the Trade
Cartograph: Generates explanations and architecture diagrams of your entire codebase. It supports Python, JS/TS (including React), and Rust, analyzing code changes across commits to provide up-to-date, interactive visual representations and reference documentation.
Agent Reach: A unified API that allows AI agents to communicate with users across various channels, including email, Slack, SMS, and WhatsApp. It can handle OAuth token management, user consent, and provide analytics on agent usage and performance.
Countless.dev: Explore, compare, and calculate costs for every AI model—LLMs, vision models, and more. Sort by price, token limits, or features, and find the perfect match for your use case in seconds (100% free and open source).
Awesome LLM Apps: Build awesome LLM apps with RAG, AI agents, and more to interact with data sources like GitHub, Gmail, PDFs, and YouTube videos, and automate complex work.
Hot Takes
The key skill to develop in order to take advantage of LLMs is being able to find useful ways to apply a fundamentally unreliable technology
Maybe the reason programmers find this hard is that the whole point of programming is that computers do exactly what you tell them to ~
Simon Willison

I don't get fine-tuning.
Like I see why you might want to customize a model. BUT, it feels like a big waste of time at the current moment, since billions of dollars is still being poured into making base model improvements?? Like won't o2 or GPT-5 just be able to do whatever you fine tune for? ~
Nick Dobos
That’s all for today! See you tomorrow with more such AI-filled content.
Don’t forget to share this newsletter on your social channels and tag Unwind AI to support us!
PS: We curate this AI newsletter every day for FREE, your support is what keeps us going. If you find value in what you read, share it with at least one, two (or 20) of your friends 😉
Reply