- unwind ai
- Posts
- Google's Mixture of Million AI Models
Google's Mixture of Million AI Models
PLUS: Apple Intelligence beta rollout, Meta AI Studio
Today’s top AI Highlights:
Create your own AI Character with Meta’s New AI Studio
Google DeepMind proposes mixture of a million tiny AI models for efficiently scaling LLMs
Apple starts rolling out Apple Intelligence in developer beta
AI predicts breast cancer risk 5 years in advance
Create modern interactive web apps in pure Python
& so much more!
Read time: 3 mins
Latest Developments
Meta has launched AI Studio which lets anyone create, share and discover AI characters. Creators on Instagram can build these AI characters as an extension of themselves to reach more fans. Built on Llama 3.1 models, you don’t need to have any technical skills to build these AI characters. You can pick a template to customize and make your own, or start entirely from scratch and build your AI from the ground up.
Key Highlights:
How Creators can use it - These AI characters are an extension of you that can quickly answer common DM questions and story replies. Whether it’s sharing facts about yourselves or linking to your favorite brands and past videos, your AIs can help you reach more people and fans get responses faster.
Finding Your Inspiration - Before you start building, consider what makes your AI character unique. Are you an expert in a specific field? Do you have a passion you want to share?
How to get started - You can choose a customizable template or start from scratch. You have complete control over your AI's visual appearance (the avatar), its name and tagline, and, most importantly, its personality as expressed through its communication style.
Shaping Knowledge and Behavior - Craft a detailed description beyond the initial overview, provide specific instructions on language style, and, crucially, supply example responses that demonstrate how your AI should interact in conversations.
From Private to Public - You decide how you want to share your AI creation. Keep it private for personal use, share it with a select group, or unleash it on the world through Instagram, Messenger, WhatsApp, and the web.
In transformer architectures, feedforward (FFW) layers are crucial for storing factual knowledge and making predictions. However, these layers cause a significant increase in computational costs and memory usage as their size grows. This linear increase in resource consumption limits the scalability of transformer models, making it challenging to maintain performance without excessive computational overhead.
Google DeepMind has introduced the Parameter Efficient Expert Retrieval (PEER) architecture to tackle this issue. PEER utilizes a sparse mixture-of-experts (MoE) approach to decouple model size from computational cost. By leveraging a vast pool of tiny experts and a novel routing mechanism, PEER enhances the performance and scalability of transformer models.
Key Highlights:
PEER Architecture - It involves a new layer design that selects a small number of the most relevant tiny experts from a large pool using a method called the product key technique. This helps reduce the computational costs and memory usage associated with traditional dense FFW layers.
Enhanced Performance - Experiments show that by efficiently utilizing a massive number of tiny experts, PEER achieves a superior performance-compute trade-off, maintaining high performance with lower computational costs in comparison to FFW and MoE.
Learned Index Structure for Routing - PEER architecture uses a smart routing system to select and manage over a million tiny experts. This system learns which experts are most useful for each task, allowing the model to use the right experts without increasing computational costs.
Why it matters - PEER architecture addresses the limitations of existing ones, which are often constrained by computational and optimization challenges when scaling. PEER utilizes a massive pool of tiny experts that allows for more scalable and computationally efficient transformer models, paving the way for improved performance in various language modeling tasks.
Quick Bites 🤌
Apple has started rolling out Apple Intelligence via the new iOS 18.1 and macOS 15.1 in developer betas. These new OS will only be available for Macs and iPads with M1 chip or later, and iPhone 15 Pro and Max. Older devices will remain on iOS 18.0 beta or macOS 15.0 beta for now.
MIT researchers have developed an AI model called Mirai that can predict breast cancer risk using mammograms. It can detect high-risk patients up to 5 years in advance! The model analyzes traditional mammogram views, incorporating risk factors such as age and hormonal factors, aiding radiologists in accurate diagnosis.
Tesla has started rolling out FSD v12.5.1 to vehicles with HW4, including Tesla’s Model Y. Elon Musk has also announced that the technology is being optimized for HW3 vehicles and will be rolled out in 10 days.
Runway has released image-to-video feature in GEN-3 Alpha. You can use any image as the first frame of your video generation, either on its own or with a text prompt for additional guidance.
US-based AI company Zyphra has released a state-of-the-art small language model Zamba2-2.7B for on-device applications. It outperforms other models in its category, such as Gemma 2 and Phi 2, and is competitive with larger models with 4B parameters. It achieves 2x the speed in time-to-first-token, 27% lower memory overhead, and 1.29x faster generation latency compared to Phi3-3.8B.
😍 Enjoying so far, share it with your friends!
Tools of the Trade
FastHTML: Create modern interactive web applications in pure Python, with built-in support for authentication, databases, caching, and styling, which are all replaceable and extensible. You can also easily deploy on Railway, Vercel, and Hugging Face.
Ollama 0.3 with Tool Support: Ollama now supports tool calling with popular models like Llama 3.1, for popular tools like functions and APIs, web browsing, code interpreter, and more.
Phoenix: Opensource AI observability platform for experimenting, evaluation, and troubleshooting, providing tracing, evaluation, dataset management, and inference analysis. It supports popular frameworks and LLM providers.
Awesome LLM Apps: Build awesome LLM apps using RAG for interacting with data sources like GitHub, Gmail, PDFs, and YouTube videos through simple texts. These apps will let you retrieve information, engage in chat, and extract insights directly from content on these platforms.
Hot Takes
AI will be 100X more capable than GPT-4 by 2027, while compute costs are falling by 75-85% per year. This will TRANSFORM the world. ~
Peter H. DiamandisI still think OpenAI will release something as open source this year. 4o-mini would have been a perfect candidate and there's not much risk in releasing it for download, but let's wait and see. ~
Flowers
Meme of the Day
That’s all for today! See you tomorrow with more such AI-filled content.
Real-time AI Updates 🚨
⚡️ Follow me on Twitter @Saboo_Shubham for lightning-fast AI updates and never miss what’s trending!
PS: We curate this AI newsletter every day for FREE, your support is what keeps me going. If you find value in what you read, share it with your friends by clicking the share button below!
Reply