Mistral's Mini LLM beats Llama 3.2
PLUS: Run AI-generated code in a secure sandbox, Perplexity for finance
Today’s top AI Highlights:
Mistral AI’s new small models bring powerful AI on device
Create secure environments for AI agents to run code and analyze data
Llama 3.1 turns your Lenovo PC into an AI agent with local AI processing
Run any GGUF on the Hugging Face Hub directly with Ollama
Compare GPT-4o and Llama code completions side-by-side right inside VS Code
& so much more!
Read time: 3 mins
AI Tutorials
Building a RAG app that interacts with YouTube videos might sound complicated—especially since most LLMs can’t natively process videos. But with the right tools, it’s a cakewalk.
In this tutorial, we’ll walk you through building an LLM app with RAG to interact with YouTube videos using the Embedchain framework and GPT-4o. And the best part? You can get this up and running in just 30 lines of Python code!
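To give you a taste, here is a minimal sketch of the core idea; the config shape for selecting GPT-4o and the placeholder video URL are illustrative assumptions, and the full tutorial walks through the real setup.

```python
# Minimal sketch: RAG over a YouTube video with Embedchain + GPT-4o.
# Assumes the `embedchain` package and an OPENAI_API_KEY in the environment;
# the config shape and the video URL below are illustrative placeholders.
from embedchain import App

app = App.from_config(config={
    "llm": {"provider": "openai", "config": {"model": "gpt-4o"}},
})

# Index the video's transcript into the app's vector store.
app.add("https://www.youtube.com/watch?v=<video_id>", data_type="youtube_video")

# Ask questions grounded in the video's content.
print(app.query("What are the key points covered in this video?"))
```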
We share hands-on tutorials like this 2-3 times a week, designed to help you stay ahead in the world of AI. If you're serious about levelling up your AI skills and staying ahead of the curve, subscribe now and be the first to access our latest tutorials.
🎁 Bonus worth $50 💵
Latest Developments
With growing concerns over data privacy, the focus is shifting toward small language models that can run locally on devices. In line with this, Mistral AI has released two new small models, Ministral 3B and Ministral 8B, built for on-device computing and at-the-edge use cases.
The models achieve state-of-the-art performance in knowledge, commonsense, reasoning, and efficiency in their size category, natively support function-calling, and can be tuned to a variety of uses, from orchestrating agentic workflows to creating specialist task workers.
Key Highlights:
Model features - Both models support up to 128k context length (currently 32k on vLLM) and Ministral 8B has a special interleaved sliding-window attention pattern for faster and memory-efficient inference.
Use cases - Besides simple on-device tasks like translation and summarization, these models can work alongside larger LLMs as intermediaries for function-calling in multi-step agentic workflows. They can be tuned to handle input parsing, task routing, and API calls at extremely low latency and cost (see the API sketch after these highlights).
Benchmark performance - The base and instruct models outperform Gemma 2 and the latest Llama models in their size categories across benchmarks covering common sense, math, coding, and function-calling.
Availability - Both models are available via API. Ministral 8B Instruct’s weights are also available to download via Hugging Face.
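For a concrete feel of the function-calling use case mentioned above, here is a minimal sketch using Mistral's Python client. The model ID ministral-8b-latest and the get_weather tool are assumptions for illustration; check Mistral's documentation for the exact model names and SDK version.

```python
# Minimal sketch: Ministral 8B deciding which tool to call in an agentic workflow.
# Assumes the `mistralai` Python SDK (v1.x), a MISTRAL_API_KEY in the environment,
# and that the model ID is "ministral-8b-latest"; verify both against Mistral's docs.
import os
from mistralai import Mistral

client = Mistral(api_key=os.environ["MISTRAL_API_KEY"])

# A hypothetical tool the small model can route to as part of a larger workflow.
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

response = client.chat.complete(
    model="ministral-8b-latest",
    messages=[{"role": "user", "content": "What's the weather in Paris right now?"}],
    tools=tools,
    tool_choice="auto",
)

# If the model chose to call the tool, the call (name + JSON arguments) is in tool_calls.
print(response.choices[0].message.tool_calls)
```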
E2B has released version 1.0 of its SDK to run AI-generated code inside secure, isolated sandboxes in the cloud. The platform allows you to integrate these sandboxes using JavaScript or Python SDKs. With a quick startup time of ~150ms, the E2B sandboxes are ideal for data analysis, code generation evaluations, and autonomous agent environments. The sandboxes are fully customizable and can run multiple AI frameworks, making them adaptable for a variety of use cases.
Key Highlights:
Multi-Language Support - E2B sandboxes can execute code in Python, JavaScript, R, and more. The platform is LLM-agnostic and works with any LLM or AI framework.
Resource Management - E2B supports running multiple sandboxes simultaneously, allowing you to manage isolated sessions for different users or AI agents without resource conflicts.
Quick Setup and Execution - Each sandbox starts within ~150ms, ensuring smooth AI applications without cold starts. You can dynamically change sandbox timeouts or shut them down as needed.
Streaming Output and I/O Management - E2B SDK supports streaming of results, logs, and visual outputs like charts directly from the sandbox to the client, enhancing real-time interaction for coding agents and AI-generated applications.
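To get a feel for the workflow, here is a minimal sketch using E2B's Python code-interpreter SDK. The package name, the run_code and kill method names, and the snippet being executed are assumptions for illustration; confirm the exact API against E2B's v1.0 docs.

```python
# Minimal sketch: running LLM-generated Python inside an E2B cloud sandbox.
# Assumes the e2b-code-interpreter package (v1.x) and an E2B_API_KEY in the environment;
# method names (run_code, kill) should be verified against E2B's current docs.
from e2b_code_interpreter import Sandbox

llm_generated_code = """
import math
print(math.sqrt(1764))
"""

sandbox = Sandbox()                   # spins up an isolated cloud sandbox (~150ms)
try:
    execution = sandbox.run_code(llm_generated_code)
    print(execution.logs.stdout)      # stdout captured inside the sandbox
finally:
    sandbox.kill()                    # always shut the sandbox down when finished
```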
Quick Bites
You can now run any GGUF model from Hugging Face on your laptop using Ollama, including VLMs supported by llama.cpp. With a simple ollama run command, you can access over 45K public GGUF checkpoints, customize quantization, and configure system prompts directly.
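If you prefer to script it, a minimal sketch with the ollama Python package looks like this; the hf.co/{username}/{repository} model path follows Ollama's Hugging Face integration, and the specific GGUF repo below is only an illustrative placeholder.

```python
# Minimal sketch: chatting with a Hugging Face GGUF checkpoint through a local Ollama server.
# Assumes the `ollama` Python package is installed and the Ollama daemon is running;
# the repository below is an illustrative placeholder, swap in any public GGUF repo.
import ollama

response = ollama.chat(
    model="hf.co/bartowski/Llama-3.2-1B-Instruct-GGUF",
    messages=[{"role": "user", "content": "Explain in one line what a GGUF file is."}],
)
print(response["message"]["content"])
```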
Amazon has unveiled a new Kindle lineup, including an upgraded Kindle Scribe with Active Canvas and AI-powered notetaking. With Active Canvas, you can write your thoughts directly in the book when inspiration strikes. Your note becomes part of the page, and the book text dynamically flows around it. The new AI-powered notebook will quickly summarize pages and pages of notes into concise bullets that you can easily share.
Lenovo introduced Lenovo AI Now at Tech World 2024, a local AI agent powered by Meta’s Llama 3.1 that turns PCs into personalized assistants with on-device AI processing. The AI agent’s capabilities range from document management and summarization to device control and content generation.
Perplexity is becoming the YFinance for stock research. When you search for a company’s stock, Perplexity gives you real-time stock quotes, historical earnings reports, industry peer comparisons, and detailed analysis of company financials - all with a sleek and appealing UI.
Tools of the Trade
Copilot Arena: A free, open-source coding assistant that shows paired code completions from multiple LLMs, like GPT-4o and Llama 3.1, together in real time. It gives side-by-side suggestions and tracks your personal model preferences via a leaderboard.
EngineLabs.ai: A web-based, AI-powered IDE for full-stack development that lets you build and deploy complete applications using natural language and traditional coding. It is open-source, model-agnostic, and extensible through 'strategies' and 'adapters'.
VectorShift: A no-code platform for teams to build AI automations, chatbots, and search tools. It integrates with platforms like Google Drive and automates workflows using drag-and-drop components.
Awesome LLM Apps: Build awesome LLM apps using RAG to interact with data sources like GitHub, Gmail, PDFs, and YouTube videos through simple text. These apps will let you retrieve information, engage in chat, and extract insights directly from content on these platforms.
Hot Takes
google is basically now a reddit search wrapper. ~
signüll
The term "AGI" might have done more damage to the AI field than anything else lol.
Human (or any animal intelligence) is NOT general; it's very specifically constrained. Generality would require too many observations to fit to; constrained intelligence learns the right stuff to be useful in a time and energy budget that works in our perceptual time scales ~
Naveen Rao
Meme of the Day
Average SF founder surfing the wave of the .AI boom 🌊🏄♂️🌉
— Beff – e/acc (@BasedBeffJezos)
7:16 AM • Oct 16, 2024
That’s all for today! See you tomorrow with more such AI-filled content.
🎁 Bonus worth $50 💵
Share this newsletter on your social channels and tag Unwind AI (X, LinkedIn, Threads, Facebook) to get AI resource pack worth $50 for FREE. Valid for a limited time only!
PS: We curate this AI newsletter every day for FREE, your support is what keeps us going. If you find value in what you read, share it with at least one, two (or 20) of your friends 😉