AWS Cuts LLM Cost in Half
PLUS: Open-source AgentOS, AI agent to build React Native apps
Today’s top AI Highlights:
Amazon's Nova models beat Gemini and Claude on price while matching their performance
Open-source programming framework to build multi-agent systems
Google DeepMind unveils a world model, expect embodied AI agents with spatial awareness soon
Build a real-world AI hedge fund team with AI agents absolutely free
AI agent for building React Native apps
& so much more!
Read time: 3 mins
AI Tutorials
In this tutorial, we'll build a Personal Health & Fitness AI Agent that demonstrates how to create task-specific AI agents that collaborate effectively. Using Google Gemini and Phidata, we'll create a system where two specialized agents - one for diet and one for fitness - work together to generate personalized recommendations.
This app generates tailored dietary and fitness plans based on user inputs such as age, weight, height, activity level, dietary preferences, and fitness goals.
Phidata makes this multi-agent approach straightforward by providing a framework designed for building and coordinating AI agents. It handles the complexity of agent communication, memory management, and response generation, letting us focus on defining our agents' roles and behaviors.
We share hands-on tutorials like this 2-3 times a week, designed to help you stay ahead in the world of AI. If you're serious about leveling up your AI skills and staying ahead of the curve, subscribe now and be the first to access our latest tutorials.
Latest Developments
AG2 (formerly AutoGen) is an open-source framework to build and deploy multi-agent systems with production-grade features. You can create different types of agents (assistants, executors, critics) that work together to solve complex tasks through automated conversations. The framework handles the heavy lifting of message routing, state management, and conversation flows, while giving you full control over LLM configurations and human input.
Instead of spending months building agent coordination from scratch, AG2 lets you focus on defining agent behaviors and business logic through clean, intuitive code.
Key Highlights:
Multi-Agent Communication - AG2 automates message routing and state management, freeing you to concentrate on agent logic, not communication setups. You can easily configure various conversation patterns, such as group chats or sequential interactions, using straightforward parameters.
Direct Tool Interaction via Code Execution - Agents can run code, enabling them to utilize external tools and systems directly. This extends their functionality to perform actions in your environment, manipulate data, or interface with software tools.
Flexible Human Participation - AG2 allows for seamless integration of human intervention points. Configure human input modes and specify intervention points to ensure users can guide or correct the AI when needed.
Docker Support - For improved security and reproducibility, AG2 recommends running code execution within Docker containers to ensure consistent environments across different machines and deployments. You can use AG2's pre-built Docker images or customize them to fit your project needs.
Developer-Friendly Setup - Get started quickly with pip installation and minimal dependencies. Supports Python 3.8-3.13 with clear documentation for LLM configurations and provider integrations. Add extra features as needed through optional installations.
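To make the conversation-pattern and human-intervention ideas above concrete, here is a minimal, library-free Python sketch of a round-robin group chat. It does not use AG2's API: `ToyAgent`, `run_round_robin`, `human_hook`, and the `TERMINATE` stop word are hypothetical names chosen for illustration, and the lambdas stand in for real LLM calls.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple

Message = Tuple[str, str]  # (sender name, message text)


@dataclass
class ToyAgent:
    """Hypothetical stand-in for an LLM-backed agent (not an AG2 class)."""
    name: str
    reply_fn: Callable[[List[Message]], str]  # replaces the model call

    def reply(self, history: List[Message]) -> str:
        return self.reply_fn(history)


def run_round_robin(
    agents: List[ToyAgent],
    task: str,
    max_turns: int,
    human_hook: Optional[Callable[[str, str], Optional[str]]] = None,
) -> List[Message]:
    """Route messages between agents in a fixed order over a shared history."""
    history: List[Message] = [("user", task)]
    for turn in range(max_turns):
        speaker = agents[turn % len(agents)]
        text = speaker.reply(history)
        if human_hook is not None:
            # Intervention point: the hook returns replacement text,
            # or None to accept the agent's message unchanged.
            override = human_hook(speaker.name, text)
            if override is not None:
                text = override
        history.append((speaker.name, text))
        if "TERMINATE" in text:  # assumed stop convention for this sketch
            break
    return history


writer = ToyAgent("writer", lambda h: f"draft based on: {h[-1][1]}")
critic = ToyAgent(
    "critic",
    lambda h: "looks good, TERMINATE" if "draft" in h[-1][1] else "please revise",
)

history = run_round_robin([writer, critic], "Summarize Nova pricing", max_turns=6)
for sender, text in history:
    print(f"{sender}: {text}")
```

In this toy version the "framework" is just the loop that picks the next speaker and appends to the shared history; a real framework adds LLM calls, tool execution, and richer speaker selection on top of the same idea.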
If you've written off Amazon's AI models after their underwhelming Titan models, it's time to pay attention again. They just released a family of multimodal models, Amazon Nova - Micro, Lite, Pro, and Premier (coming soon). These models are notably different - they're lightning fast, handle multiple input types including video, and most importantly, they're surprisingly cost-effective.
While everyone focused on the models' capabilities, they missed discussing the incredible pricing - Nova Micro costs just $0.035 per million input tokens, undercutting even Gemini 1.5 Flash 8B.
Nova Pro, their most capable model, is priced at $0.80 per million input tokens - significantly cheaper than Claude 3.5 Sonnet ($3.00) and GPT-4o ($2.50) while giving competitive performance. For developers working on production apps where costs matter, this is a major advantage that Amazon should have marketed more prominently.
Key Highlights:
Performance Metrics - Nova Micro competes directly with Gemini 1.5 Flash-8B and Claude 3 Haiku while being significantly cheaper. Nova Lite offers multimodal capabilities similar to Gemini 1.5 Flash at a lower price point. At the high end, Nova Pro's performance competes with GPT-4o and Claude 3.5 Sonnet. It particularly shines in RAG applications and function-calling.
Cost Efficiency - Nova Micro costs 3.5¢/million tokens (input) and 14¢/million (output), while Nova Pro costs 80¢/million (input) and $3.20/million (output). That puts both among the most cost-effective options for high-volume applications compared to other major models. For perspective, processing a library of 67,000 images would cost just $9.21 with Nova Lite.
Processing Capabilities - All Nova models (except Micro) can handle text, images, PDFs, and videos up to 30 minutes long in a single request. They support 200+ languages and can process documents up to 300K tokens (128K for Micro), with a 2M token version coming in early 2025.
Development Integration - Direct integration with AWS services like Bedrock Knowledge Bases for RAG applications and Bedrock Agents for workflow automation. The models support real-time streaming for interactive applications and include comprehensive safety features.
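To see what these per-million-token prices mean for a bill, here is a small Python sketch that uses only the prices quoted above. The workload size (500 million input tokens and 100 million output tokens) is a made-up example, and the function names are ours. Output prices are included only for the Nova models because this issue does not quote them for the others.

```python
# USD per million input tokens, as quoted in this issue.
INPUT_PRICE = {
    "Nova Micro": 0.035,
    "Nova Pro": 0.80,
    "GPT-4o": 2.50,
    "Claude 3.5 Sonnet": 3.00,
}

# USD per million output tokens (only the Nova prices are quoted above).
OUTPUT_PRICE = {
    "Nova Micro": 0.14,
    "Nova Pro": 3.20,
}


def input_cost(model: str, input_tokens: int) -> float:
    """Cost in USD of sending `input_tokens` tokens to `model`."""
    return input_tokens / 1_000_000 * INPUT_PRICE[model]


def total_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Input plus output cost in USD; works only for models with a quoted output price."""
    output = output_tokens / 1_000_000 * OUTPUT_PRICE[model]
    return input_cost(model, input_tokens) + output


# Hypothetical workload, chosen for illustration only.
INPUT_TOKENS = 500_000_000
OUTPUT_TOKENS = 100_000_000

for model in INPUT_PRICE:
    print(f"{model}: ${input_cost(model, INPUT_TOKENS):,.2f} for input")

for model in OUTPUT_PRICE:
    print(f"{model}: ${total_cost(model, INPUT_TOKENS, OUTPUT_TOKENS):,.2f} input + output")
```

On input price alone, Nova Pro comes to roughly 27% of Claude 3.5 Sonnet's rate and 32% of GPT-4o's, so the gap grows linearly with volume.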
Quick Bites
Google DeepMind has unveiled Genie 2, a world model that creates dynamic 3D environments from just a single image. Genie 2 can simulate virtual worlds, including the consequences of taking any action (e.g. jump, swim, etc.). Because it was trained on a large-scale video dataset, it captures object interactions, complex character animation, physics, and the behavior of other agents. It's a big step for training embodied AI agents; Google DeepMind plans to make Genie 2 available for research soon.
Someone just built a real-world AI Hedge Fund Team with AI agents. It uses 4 AI agents: a Market Data Agent to gather data, a Quant Agent to generate signals, a Risk Manager to assess risk, and a Portfolio Manager to make final trading decisions. The entire code is here, for free. You can set it up and run it yourself, regardless of your coding background!
Exa.ai introduces Websets, a web-scale embeddings-based search engine that lets you create comprehensive sets of data (of literally anything!) by simply describing what you want. The prompts can be “all AI startups building new LLM chips” or “all PhDs who worked on developer products and have a blog.” This could save hours of traditional web searching and quickly give you a custom dataset with the required parameters.
Hugging Face just released Text-to-SQL functionality across its 250,000+ public datasets using the Qwen 2.5 Coder 32B model. This lets you generate SQL queries from natural language prompts and run them directly in the browser via DuckDB WASM.
Tools of the Trade
Cali: AI agent that helps you build React Native apps. It takes all the utilities and functions of a React Native CLI and exposes them as tools to an LLM.
Pathways AI Pipelines: Ready-to-use pipelines for building apps with RAG, search, etc. using just YAML templates instead of Python code. It has built-in support for real-time data syncing and in-memory processing. It handles automatic indexing from various data sources and provides flexible deployment options.
UbiAI LLM Fine-tuning: An end-to-end platform that combines data labeling and LLM fine-tuning capabilities, allowing you to create datasets and fine-tune models like Llama 3.1 and Mistral 7B in one environment. You can try it out for free.
Awesome LLM Apps: Build awesome LLM apps with RAG, AI agents, and more to interact with data sources like GitHub, Gmail, PDFs, and YouTube videos, and automate complex work.
Hot Takes
Stop telling me what AI can't do. I'm literally watching it take ideas and turn them into working solutions every day. The 'flaws' you're pointing out are just implementation details - the core ability to understand, plan, and create is already there. While you're debating definitions, I'm building real products with an AI partner that can execute any vision I give it. The proof is in the output. ~ Ray Fernando
Governments in third-world countries should open internet cafes that pay people for completing programming tasks with AI editors. As long as people can read and ask AI enough questions, their AI-augmented programming skills should improve quickly ~ Tom Dörr
That’s all for today! See you tomorrow with more such AI-filled content.
Don’t forget to share this newsletter on your social channels and tag Unwind AI to support us!
PS: We curate this AI newsletter every day for FREE, your support is what keeps us going. If you find value in what you read, share it with at least one, two (or 20) of your friends 😉