unwind ai
Posts
Last Week in AI - A Weekly Unwind

Last Week in AI - A Weekly Unwind

From 25-Aug-2024 to 31-Aug-2024

Shubham Saboo & Gargi Gupta
September 01, 2024

It was yet another thrilling week in the AI field with advancements that further extend the limits of what can be achieved with AI.

Here are 10 AI breakthroughs that you can’t afford to miss 🧵👇

Made with Whimsical AI

Runs 7B Models Directly in Your Browser 💻

Google's MediaPipe, a powerful opensource framework developed by Google for building cross-platform, customizable ML solutions, has achieved a significant milestone. You can now run LLMs directly in web browsers. This eliminates the need for server-side processing for certain AI tasks, opening doors to faster, more private, and cost-effective applications. The team successfully ran the Gemma 1.1 7B model within the browser.

Autonomous AI Sales Agents to Scale Sales Teams 🤖

Salesforce introduced two AI Sales Agents to scale sales teams - the Einstein SDR Agent and the Einstein Sales Coach Agent. The SDR Agent autonomously engages with leads to book meetings, while the Sales Coach Agent simulates buyer interactions to help sellers practice and improve their sales skills.

Llama 3.1 Just Got Ears 👂

AI research lab Homebrew released Llama3-s v0.2 multimodal checkpoint for enhanced speech understanding. Built upon Llama 3.1 8B, this opensource model leverages semantic tokens, drawing inspiration from WhisperVQ, to directly process audio data, bypassing the traditional transcription step. This results in faster processing and allows the model to understand and respond to speech almost real time.

Fastest AI Inference for AI Agents 💨

Cerebras launched its new AI inference solution, boasting speeds 20x faster than those found in NVIDIA GPU-based hyperscale clouds. Using their third-generation Wafer Scale Engine, they achieve 1,800 tokens per second with the Llama3.1 8B model and an impressive 450 tokens per second with the Llama3.1 70B model. Cerebras is charging 1/5th the cost of comparable solutions. You can access it today through their API, which has generous rate limits and uses 16-bit weights for maximum accuracy.

OpenAI’s New Flagship AI Model and Strawberry 🍓

OpenAI is reportedly developing two new AI models: Strawberry and Orion. Strawberry focuses on solving complex math and programming problems, while Orion aims to surpass GPT-4. Strawberry will help create advanced AI agents that can plan ahead and take actions on behalf of the user.

Orion will be OpenAI’s new flagship model designed to outperform GPT-4 trained on data generated by Strawberry. It's uncertain whether Strawberry will launch this year.

Google's AI Engine Builds Games On-theFly 🎮

Researchers at Google developed a groundbreaking AI-powered game engine that simulates the classic video game DOOM in real-time with interactive gameplay. This technology, called GameNGen, generates the game world and its evolution in real-time, as a player interacts with it, achieving high-quality visuals and gameplay. It's a major step forward in game development and offers a new way to create and play games.

AI Agents Sound More Human Than Your Employees 🤖

Bland AI, an innovative San Francisco-based startup specializing in AI-powered phone call automation, raised $16 million in Series A funding. They are developing hyper-realistic AI agents to handle a wide array of phone-based tasks for enterprises, including customer support, sales, and internal operations. Bland AI provides a platform for developers to build, train, and deploy reliable AI agents that sound just like humans and can respond without hallucinating.

Google’s Gemini version of Custom GPTs 👨‍👨‍👦‍👦

Google rolled out Gems, a new feature that lets you customize Gemini to create your own personal AI experts, for Gemini Advanced subscribers. Google’s latest text-to-image model Imagen 3 will also be available on all Gemini apps in the coming days. It will let you generate images of real people with careful considerations for safety guidelines.

Opensource AI Model for Video Understanding 🎥

Alibaba Cloud has released Qwen2-VL, a new suite of vision-language models built upon their existing Qwen2 language models. Qwen2-VL can understand images and videos, has multilingual support, and even agent-based capabilities for device control. The 72B parameter model surpasses even closed-source models like GPT-4o and Claude 3.5 Sonnet in many visual understanding tasks. This release includes opensource versions of the 2B and 7B parameter models, along with API access to the powerful 72B parameter model.

Super Fast Local LLM Inference on your Computer 🖥️

KTransformers is a new framework to speed up LLM inference in hardware-constrained environments. By implementing an optimized module with a single line of code, you get a Transformers-compatible interface, RESTful APIs compliant with OpenAI and Ollama, and even a simplified ChatGPT-like web UI. The latest updates to KTransformers enable the InternLM2.5-7B model to achieve a 5.65x speedup in token generation, and manage 1 million tokens using just 24GB of VRAM and 150GB of DRAM.

Which of the above AI development you are most excited about and why?

Tell us in the comments below ⬇️

That’s all for today 👋

Stay tuned for another week of innovation and discovery as AI continues to evolve at a staggering pace. Don’t miss out on the developments – join us next week for more insights into the AI revolution!

Click on the subscribe button and be part of the future, today!

📣 Spread the Word: Think your friends and colleagues should be in the know? Share Unwind AI and let them join this exciting adventure into the world of AI. Sharing knowledge is the first step towards innovation!

🔗 Stay Connected: Follow us for updates, sneak peeks, and more. Your journey into the future of AI starts here!

Shubham Saboo - Twitter | LinkedIn ⎸

Unwind AI - Twitter | LinkedIn | Instagram | Facebook

Awesome LLM Apps | Sponsor Us

Reply

or to participate.