Run LLMs in Browser
PLUS: AI Sales Agent, Llama 3.1 fine-tuned for RAG
ELEKS' intelligent automation: Unmatched efficiency for market leaders
ELEKS' intelligent automation transforms your business with custom data-driven tools. We streamline processes, boost productivity, and cut costs by automating complex tasks. Our tailored approach unlocks growth opportunities, freeing your team to focus on high-value tasks.
Today’s top AI Highlights:
Build AI apps using on-device AI with Google's MediaPipe
Cerebras releases open-source RAG models trained in a few hours
Salesforce releases two autonomous AI sales agents to scale sales teams
LangGraph v0.2 brings custom checkpointers to your workflow
Build, train and deploy AI on a single AI development platform
& so much more!
Read time: 3 mins
Latest Developments
MediaPipe, Google's open-source framework for building cross-platform, customizable ML solutions, has reached a significant milestone: you can now run LLMs directly in web browsers. This eliminates the need for server-side processing for certain AI tasks, opening the door to faster, more private, and more cost-effective applications. The team has successfully run the Gemma 1.1 7B model entirely within the browser.
Key Highlights:
On-Device Inference - MediaPipe allows LLMs to run entirely on your device. This translates to lower latency, reduced server costs, and enhanced privacy.
Memory Optimization for WebAssembly - The team overcame significant memory limitations using techniques like asynchronous loading and loading individual weight buffers on demand. This allows running large models efficiently within the browser's memory constraints.
Cross-Platform Capabilities - MediaPipe's cross-platform nature allows you to build apps that work seamlessly across Android, iOS, and web browsers, using the same underlying technology.
Open Source - Because MediaPipe is open source, you can customize and extend the framework, whether that means integrating new models and features or tailoring the pipeline to specific use cases.
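The on-demand weight loading mentioned above is the key trick for fitting multi-billion-parameter models within browser memory limits. Here is a conceptual, stdlib-only Python sketch of the idea (this is not MediaPipe's actual API; `write_shards` and `load_shard` are hypothetical helpers): instead of reading one monolithic weight file into memory, each layer's weights are stored as a length-prefixed shard and read individually only when needed.

```python
# Conceptual sketch of on-demand weight loading (not MediaPipe's real API).
# Each layer's weights are written as a length-prefixed binary shard, so a
# runtime can seek to and load a single layer instead of the whole model.
import io
import struct

def write_shards(buf, layers):
    """Write each layer's float32 weights as a length-prefixed shard.

    Returns a dict mapping layer name -> byte offset of its shard.
    """
    offsets = {}
    for name, values in layers.items():
        offsets[name] = buf.tell()
        data = struct.pack(f"{len(values)}f", *values)
        buf.write(struct.pack("I", len(data)))  # 4-byte shard length
        buf.write(data)
    return offsets

def load_shard(buf, offset):
    """Read one layer's weights on demand; nothing else is loaded."""
    buf.seek(offset)
    (size,) = struct.unpack("I", buf.read(4))
    return list(struct.unpack(f"{size // 4}f", buf.read(size)))

buf = io.BytesIO()
offsets = write_shards(buf, {"layer0": [1.0, 2.0], "layer1": [3.0]})
# Only layer1's weights are materialized here:
print(load_shard(buf, offsets["layer1"]))  # prints [3.0]
```

In the browser, the same pattern applies with asynchronous fetches against WebAssembly memory: shards are fetched and freed as inference proceeds, keeping peak memory far below the model's total size.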
Cerebras has released DocChat, a new series of models for document-based conversational question answering. These models, Cerebras Llama3-DocChat and Cerebras Dragon-DocChat, are built on top of Llama 3 and Dragon+ respectively, offering significant performance improvements. Best of all, they were trained with incredible speed on a single Cerebras System: Llama3-DocChat in a few hours and Dragon-DocChat in mere minutes!
Key Highlights:
Top-tier Performance - DocChat models achieve top-of-the-line performance on a variety of benchmarks, including ChatRAG, Eleuther Eval Harness, and those specific to document retrieval. They show particularly strong improvements in recall for multi-turn retrieval tasks.
Focus on Practical Challenges - The training recipe specifically addresses issues like handling unanswerable questions, improving arithmetic performance, and enhancing entity extraction, leading to significant gains in accuracy.
Open Source - Cerebras is providing full transparency by releasing the model weights, training recipes, and datasets.
Quick Bites
Salesforce has introduced two AI Sales Agents to scale sales teams - the Einstein SDR Agent and the Einstein Sales Coach Agent. The SDR Agent autonomously engages with leads to book meetings, while the Sales Coach Agent simulates buyer interactions to help sellers practice and improve their sales skills.
LangChain has released LangGraph v0.2 with new checkpointer libraries, including SQLite for local workflows and an optimized Postgres checkpointer for production applications. It simplifies building stateful LLM apps with enhanced session memory, error recovery, and human-in-the-loop features.
OpenAI has hired former Meta executive Irina Kofman to lead strategic initiatives, focusing on safety and preparedness. This is part of OpenAI’s trend of recruiting experienced leaders from major tech companies to strengthen its team.
Tools of the Trade
Lightning AI Studio: A cloud-based development platform to build, train, and deploy AI models easily. It provides a persistent environment with automatic scaling and supports a range of AI tasks, from basic coding to complex, multi-node production workflows.
Ori: An AI-native GPU cloud platform offering cost-effective and customizable GPU infrastructure, ranging from virtual machines to private clouds.
UiSpark: An AI-powered design tool that helps you generate app icons, logos, and UI designs with just a few clicks. It analyzes your input and iterates on concepts to match your vision.
Awesome LLM Apps: Build awesome LLM apps using RAG to interact with data sources like GitHub, Gmail, PDFs, and YouTube videos through simple text. These apps will let you retrieve information, engage in chat, and extract insights directly from content on these platforms.
Hot Takes
LLMs are useless for coding. Anyone would write better code by hand. LLMs cannot replace a coder. I've been hearing that coders will soon be replaced by AI for 30 years. I tried, it failed. Nobody will want to debug someone else's code, especially AI's. You cannot put AI-generated code in production. Wake me up when a client will be able to describe their needs clearly. ~
Andriy Burkov

The evolution of AI
- AI assistants and chatbots
- autonomous agents
- AI organizations with human supervisors / CEOs
- merged human-AI super beings
Timeline: 15-20 years ~
Bindu Reddy
Meme of the Day
That’s all for today! See you tomorrow with more such AI-filled content.
Real-time AI Updates 🚨
⚡️ Follow me on Twitter @Saboo_Shubham for lightning-fast AI updates and never miss what’s trending!
PS: We curate this AI newsletter every day for FREE, your support is what keeps us going. If you find value in what you read, share it with at least one (or 20) of your friends!