Opensource Self-Building AI Agents
PLUS: ChatGPT-4o with Canvas, Faster & Better Image Generation Models
Streamline your development process with Pinata’s easy File API
Easy file uploads and retrieval in minutes
No complex setup or infrastructure needed
Focus on building, not configurations
Today’s top AI Highlights:
Create self-building AI Agents with this opensource Python framework
A new interface in ChatGPT for coding and writing
Black Forest Labs releases new version of FLUX image generation model
Google Lens now lets you ask questions using videos and your voice
Cursor, Claude Sonnet 3.5 with Artifacts, v0 with the ability to install packages and run code in one AI platform, for free
& so much more!
Read time: 3 mins
AI Tutorials
Cutting-edge AI apps don’t always require the cloud. Today we are combining local Llama 3.1 with tool-use, making it possible to interact with real-world data sources while the AI itself runs locally.
In this tutorial, we’ll guide you through building a local Llama 3.1-powered assistant that integrates tools like Yahoo Finance for stock data and SerpAPI for web searches. While the assistant itself operates locally, it will access real-time data through these external APIs.
Not just that! You can even select which tools (YFinance and/or SerpAPI) you want the assistant to use via checkboxes in the sidebar.
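The checkbox-driven tool selection described above can be sketched roughly as follows. This is only an illustration, not the tutorial's actual code: `get_stock_price`, `web_search`, `TOOLS`, and `run_tool` are hypothetical names, and both tools are stubs standing in for real Yahoo Finance and SerpAPI calls.

```python
from typing import Callable, Dict, Set

# Hypothetical stub tools; a real assistant would call Yahoo Finance / SerpAPI here.
def get_stock_price(symbol: str) -> str:
    return f"[stub] latest price for {symbol}"

def web_search(query: str) -> str:
    return f"[stub] top results for '{query}'"

# Registry of every tool the assistant could use, keyed by display name.
TOOLS: Dict[str, Callable[[str], str]] = {
    "YFinance": get_stock_price,
    "SerpAPI": web_search,
}

def run_tool(name: str, argument: str, enabled: Set[str]) -> str:
    """Run a tool only if the user has enabled it (e.g. via a sidebar checkbox)."""
    if name not in TOOLS:
        raise KeyError(f"unknown tool: {name}")
    if name not in enabled:
        return f"{name} is disabled"
    return TOOLS[name](argument)

# Simulates the user ticking only the YFinance checkbox.
enabled_tools = {"YFinance"}
print(run_tool("YFinance", "AAPL", enabled_tools))
print(run_tool("SerpAPI", "AI news", enabled_tools))
```

Keeping the full registry separate from the enabled set means the UI only has to update one set of names, while the model-facing dispatch logic stays unchanged.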
We share hands-on tutorials like this 2-3 times a week, designed to help you stay ahead in the world of AI. If you're serious about levelling up your AI skills and staying ahead of the curve, subscribe now and be the first to access our latest tutorials.
🎁 Bonus worth $50 💵
Latest Developments
The future of autonomous AI agents might just be self-building! Here’s a cool opensource project: BabyAGI 2, an experimental Python framework for creating a self-building autonomous agent. First things first: this is a personal project, not production-ready, and should be used with caution.
The project focuses on creating an AI agent capable of generating and managing its own functions, stored in a database as a graph structure. The framework includes a dashboard for managing functions, visualizing their relationships, and tracking execution logs. BabyAGI 2 also offers experimental self-building capabilities, allowing the agent to create new functions based on user input.
Key Highlights:
Dynamic Function Execution - BabyAGI 2 retrieves and executes functions from a database, dynamically loading dependencies, imports, and API keys into the execution environment. You can leverage this for flexible and context-aware agent behavior.
Automated Logging and Visualization - The framework automatically logs function executions, including parent-child relationships for nested calls, and provides graph visualizations (Cytoscape, Mermaid, 3D) for understanding function dependencies. This aids in debugging and optimizing agent workflows.
Self-Building Capabilities - The experimental process_user_input and self_build functions allow the agent to generate new functions based on user descriptions, breaking down tasks and creating reusable components.
No-Code Dashboard and Chat Playground - A user-friendly dashboard simplifies function management, key management, and log viewing. Finally, a chat playground enables interaction with the agent and dynamic loading of functions.
Build now (but with caution) - The entire project is opensource and a great starting point for those interested in exploring self-building AI agents. The purpose of this repo is to share ideas and spark discussion and for experienced devs to play with.
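To make the idea of stored functions, dependency tracking, and parent-child execution logs more concrete, here is a minimal sketch. It is not BabyAGI 2's actual API: `REGISTRY`, `LOGS`, `register`, and `execute` are hypothetical names, and plain in-memory structures stand in for the database.

```python
from typing import Callable, Dict, List, Optional

# In-memory stand-ins for the database: function records and execution logs.
REGISTRY: Dict[str, dict] = {}
LOGS: List[dict] = []
_call_stack: List[str] = []  # tracks the currently running function for parent links

def register(name: str, dependencies: Optional[List[str]] = None):
    """Decorator that stores a function plus the names of functions it depends on."""
    def decorator(func: Callable):
        REGISTRY[name] = {"func": func, "dependencies": dependencies or []}
        return func
    return decorator

def execute(name: str, *args):
    """Look up a function by name, check its dependencies, run it, and log the call."""
    record = REGISTRY[name]
    missing = [d for d in record["dependencies"] if d not in REGISTRY]
    if missing:
        raise LookupError(f"{name} is missing dependencies: {missing}")
    parent = _call_stack[-1] if _call_stack else None
    _call_stack.append(name)
    try:
        result = record["func"](*args)
    finally:
        _call_stack.pop()
    LOGS.append({"function": name, "parent": parent, "result": result})
    return result

@register("double")
def double(x: int) -> int:
    return 2 * x

@register("double_then_add_one", dependencies=["double"])
def double_then_add_one(x: int) -> int:
    # The nested call goes through execute() so the parent-child link is logged.
    return execute("double", x) + 1

print(execute("double_then_add_one", 5))
for entry in LOGS:
    print(entry)
```

The dependency lists form the edges of a function graph, and the `parent` field in each log entry is the kind of information a dashboard could use to draw nested calls.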
OpenAI has introduced Canvas, a new interface for working with ChatGPT, now available in beta. Unlike the standard chat interface, Canvas is a separate window within ChatGPT that allows you to interact with your draft projects through an intuitive visual interface. You can highlight specific sections to give ChatGPT targeted feedback, and control edits in real time.
This tool is especially helpful for writing and coding tasks where edits have to be done in a specific part of the draft, rather than re-working the entire thing. Canvas is being rolled out to ChatGPT Plus and Team users starting today.
Key Highlights:
How to get started - Canvas is built with GPT-4o and can be manually selected in the model picker as “ChatGPT 4o with Canvas”. It opens automatically when ChatGPT detects a scenario where it could be helpful. You can also include “use canvas” in your prompt to open Canvas and use it to work on an existing project.
Writing Tools in Canvas - For writing, here are some quick tools in Canvas:
Suggest edits: Inline suggestions and feedback.
Adjust the length: Edits the document to be shorter or longer.
Change reading level: Adjusts the reading level, from Kindergarten to Graduate School.
Add final polish: Checks for grammar, clarity, and consistency.
Add emojis: Adds relevant emojis for emphasis and color.
Code Enhancement Features - Canvas makes coding easier by tracking revisions and offering targeted improvements.
Review code: Provides inline suggestions to improve your code.
Add logs: Inserts print statements to help you debug and understand your code.
Add comments: Adds comments for better code understanding.
Fix bugs: Detects and rewrites problematic code.
Port to a language: Translates your code into JavaScript, TypeScript, Python, or more.
Model Training with o1 - Canvas is powered by GPT-4o, which has been post-trained using OpenAI’s o1 model. The model was fine-tuned to trigger Canvas at the right moments for writing or coding tasks, using synthetic data for better updates.
Availability - Canvas is currently being rolled out to ChatGPT Plus and Team users. Enterprise and Education users will get access starting next week.
Quick Bites
Google has released a smaller and faster production-ready variant of Flash, Gemini 1.5 Flash-8B. Optimized for speed and efficiency, the model is excellent at tasks like chat, transcription, and long context language translation. It is available in the Google AI Studio for free and via API. The API is also very low-priced, at:
Prompt size | Input Tokens | Output Tokens |
---|---|---|
For <=128k tokens | $0.0375 per 1 million | $0.15 per 1 million |
For > 128k tokens | $0.075 per 1 million | $0.3 per 1 million |
Cached prompts | $0.01 per 1 million | $0.02 per 1 million |
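As a rough worked example of what these rates mean in practice, the sketch below estimates the cost of a single request from the input and output rows of the table above. The function name `estimate_cost` is hypothetical, and it assumes the pricing tier is chosen by the size of the input prompt; cached-prompt pricing is left out.

```python
# Rates copied from the table above, in USD per 1 million tokens.
RATES = {
    "short": {"input": 0.0375, "output": 0.15},  # prompts <= 128k tokens
    "long": {"input": 0.075, "output": 0.30},    # prompts > 128k tokens
}
TIER_LIMIT = 128_000

def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of one request (assumes the tier depends on prompt size)."""
    tier = "short" if input_tokens <= TIER_LIMIT else "long"
    rate = RATES[tier]
    return (input_tokens * rate["input"] + output_tokens * rate["output"]) / 1_000_000

# A 100k-token prompt with a 2k-token reply, then a 200k-token prompt with the same reply.
print(f"${estimate_cost(100_000, 2_000):.5f}")
print(f"${estimate_cost(200_000, 2_000):.5f}")
```

Under these assumptions, the first request costs about $0.00405 and the second about $0.0156, so even long-context calls stay under two cents each.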
This is probably for over-smart people like us who thought OpenAI might go bankrupt😂 OpenAI says that in addition to securing $6.6 billion in new funding, they have established a new $4 billion credit facility with a consortium of banks. They now have access to over $10 billion in liquidity, to invest in new initiatives and scale further. (GPT-4 will be available for free for another year)
Google Lens now lets you ask questions using videos and your voice. Open Lens in your Google app, record a video, or ask a question with your voice after taking a photo, and Lens will use AI to provide answers based on what it sees and hears. The video capability is generally available for Search Labs users enrolled in the “AI Overviews and more” experiment, with support for English queries.
Black Forest Labs, an AI startup and major Midjourney contender, has released the FLUX 1.1 [pro] text-to-image model. It is 6x faster than its predecessor, FLUX.1 [pro], with improved image quality, prompt adherence, and diversity. The company has also released its API for the previous series of FLUX models at a competitive price. The FLUX 1.1 [pro] API will also be available soon. You can try it here for free.
Groq has set a record for the highest inference speed for Llama 3.2 1B at more than 3,000 tokens per second. For this speed, the pricing is very appealing at $0.04/1M input/output tokens. To put this in perspective, it is ~25x faster than GPT-4o's API and ~110x cheaper.
Tools of the Trade
Vectorize: Build AI apps with RAG faster and with less hassle. It automates creating optimized vector search indexes for RAG pipelines and keeps your data updated for real-time AI use. It handles data extraction, embedding evaluation, and integration with vector databases.
Bolt.new by StackBlitz: Create, edit, run, and deploy full-stack applications quickly with a single English prompt. It provides a full development environment, solves errors, and can deploy production-ready apps with just one click.
Tackle AI: Automates time tracking and calendar audits, helping founders and executives align their daily actions with strategic priorities. It integrates with Google and Outlook calendars to deliver real-time insights.
Awesome LLM Apps: Build awesome LLM apps using RAG to interact with data sources like GitHub, Gmail, PDFs, and YouTube videos through simple text. These apps will let you retrieve information, engage in chat, and extract insights directly from content on these platforms.
Hot Takes
I gave a talk about a year ago at a company that had just laid off most of their developers because of AI. Today, I learned they have already hired the team back (different people.)
"I think we were too early," I was told. "Things didn't work out as we were hoping." ~ Santiago
In the age of AI, the worst thing you can do is tie up your ego with the skills you developed, whether technical or not. ~ Guillermo Rauch
Meme of the Day
That’s all for today! See you tomorrow with more such AI-filled content.
🎁 Bonus worth $50 💵
Share this newsletter on your social channels and tag Unwind AI (X, LinkedIn, Threads, Facebook) to get AI resource pack worth $50 for FREE. Valid for limited time only!
PS: We curate this AI newsletter every day for FREE, and your support is what keeps us going. If you find value in what you read, share it with at least one, two (or 20) of your friends 😉