AI Video Generation That's Ready to Use 🤯

PLUS: Stable Diffusion 3, Databricks' AI analyst, Fast Inference on Phones

Shubham Saboo & Gargi Gupta
June 13, 2024

Today’s top AI Highlights:

Luma Labs releases AI video generation model - Dream Machine
Stability AI delivers a more powerful text-to-image model - Stable Diffusion 3 Medium
Databricks launches AI/BI - A new way to understand business data
New framework PowerInfer-2 optimizes AI for mobile devices
Midjourney’s new Personalization feature generates the kind of images you like

& so much more!

Read time: 3 mins

Latest Developments 🌍

OpenAI’s Sora Competitor Available to Use 🌅

Luma Labs has just released Dream Machine, a new AI video model that lets you create realistic videos from text descriptions and images. Unlike OpenAI’s Sora, Dream Machine is available to use right now! Dream Machine is trained directly on video, so it can generate videos that are physically accurate, consistent, and full of action. The model is also fast, generating 120 frames in 120 seconds. This lets you iterate on your ideas quickly and experiment with different styles.

Key Highlights:

High-quality video - Dream Machine generates realistic 5-second videos with smooth motion, compelling cinematography, and dramatic effects.
Consistent characters - Dream Machine understands how objects interact with the world, so you can create videos with consistent characters and physics.
Cinematic camera moves - Dream Machine lets you experiment with an array of styles and camera movements, adding depth, character, and emotion to your videos.

Stability AI’s Most Sophisticated Image Generation Model 🪄

Stability AI has released Stable Diffusion 3 Medium, the latest version of their popular text-to-image AI model. It boasts a significant improvement in image quality, particularly in areas like hands and faces, while offering greater accuracy in understanding complex prompts. A key feature of the model is its smaller size, making it ideal for running on consumer-grade hardware without sacrificing performance.

Key Highlights:

Improved image quality - Stable Diffusion 3 Medium produces highly realistic images with fewer artifacts, especially in challenging areas like hands and faces.
Training data - Stability AI used synthetic data and filtered publicly available data to train the model. It was pre-trained on 1 billion images. The fine-tuning data includes 30M high-quality aesthetic images focused on specific visual content and style, as well as 3M preference data images.
Open licensing - Stability AI offers both non-commercial and commercial licenses for Stable Diffusion 3 Medium.

Databricks Makes Data Analysis Accessible to All 📊

Businesses are constantly seeking answers from their data, but often find themselves relying on overworked data professionals and a cycle of creating dashboards that still leave many questions unanswered. Databricks has introduced AI/BI, a new business intelligence product built from the ground up to deeply understand your data's semantics and enable anyone to analyze data for themselves. It is built on a compound AI system that learns from the full data lifecycle across the Databricks platform.

Key Highlights:

AI/BI Dashboards offer a user-friendly, low-code experience for creating interactive data visualizations. They come equipped with standard BI capabilities like cross-filtering and periodic snapshots via email, and also provide a smooth transition to Genie for exploring deeper insights.
Genie, a conversational interface, empowers users to ask questions in natural language and receive accurate answers based on the data's semantics. Genie continuously learns and improves its performance through human feedback, enabling it to answer a wide range of business questions.
AI/BI is tightly integrated with the Databricks Data Intelligence Platform, ensuring unified governance, effortless sharing, and industry-leading performance. Importantly, it eliminates the need for data extraction, which leads to improved data freshness and simpler governance.
AI/BI leverages a compound AI system, which consists of an ensemble of AI agents each specializing in a specific task like planning, SQL generation, explanation, visualization, and result certification. This approach allows for more accurate and reliable answers compared to traditional, monolithic AI models.

Record Speed for LLMs on Phones 📱

Deploying LLMs on smartphones has been a challenge due to their significant memory consumption. Apple is deploying a 3B model to power generative AI features in iOS 18, but a model that small limits how complex the tasks handled on the device can be.

Researchers have introduced PowerInfer-2, an optimized inference framework designed for smartphones. This framework supports up to Mixtral 47B MoE models, achieving a speed of 11.68 tokens per second, which is up to 22x faster than current state-of-the-art frameworks.
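To see why memory is the bottleneck, a back-of-the-envelope estimate helps. The sketch below counts weight storage only and ignores activations and the KV cache. The 16-bit and 4-bit precisions are assumptions chosen for illustration, not figures from the PowerInfer-2 team, and the parameter counts are the rounded ones quoted above.

```python
def weight_memory_gb(num_params, bits_per_weight):
    """Approximate storage for model weights alone, in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bits_per_weight / 8 / 1e9

# Rounded parameter counts from the text; bit widths are illustrative assumptions.
models = {"3B": 3e9, "7B": 7e9, "47B": 47e9}

for name, params in models.items():
    for bits in (16, 4):
        gb = weight_memory_gb(params, bits)
        print(f"{name} model at {bits}-bit weights: about {gb:.1f} GB")
```

Under these assumptions, a 47B model needs roughly 94 GB at 16 bits and 23.5 GB at 4 bits, which is more than the RAM most phones ship with. That gap is why frameworks of this kind have to stream weights from storage instead of holding the whole model in memory.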

Key Highlights:

Framework - PowerInfer-2 utilizes innovative techniques like heterogeneous computing – where it dynamically adjusts the size of computational units based on hardware characteristics – and I/O-Compute Pipeline to maximize overlap between data loading and computation. This ensures that the framework is both fast and memory-efficient.
Memory Efficiency - PowerInfer-2’s techniques save nearly 40% of memory usage for 7B LLMs while maintaining faster inference speeds, outperforming frameworks like llama.cpp and MLC-LLM.
New model releases - The team has released TurboSparse-Mistral-7B and TurboSparse-Mixtral-47B models, trained with 150B tokens within a budget of $0.1 million, ensuring enhanced performance and higher predictable sparsity.
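The idea of overlapping data loading with computation, mentioned in the first highlight, can be illustrated with a generic producer-consumer pattern. This is a minimal sketch and not PowerInfer-2's actual implementation: `load_chunk`, `compute`, and `pipelined_inference` are hypothetical names, and the "weights" are toy lists standing in for blocks read from flash storage.

```python
import queue
import threading

def load_chunk(i):
    # Hypothetical stand-in for reading one block of weights from storage.
    return [float(i)] * 4

def compute(chunk, x):
    # Hypothetical stand-in for the per-chunk math: a dot product with the input.
    return sum(w * xi for w, xi in zip(chunk, x))

def pipelined_inference(num_chunks, x, prefetch=2):
    # Bounded buffer: at most `prefetch` chunks sit in memory at once.
    buf = queue.Queue(maxsize=prefetch)

    def loader():
        for i in range(num_chunks):
            buf.put(load_chunk(i))  # blocks while the buffer is full
        buf.put(None)  # sentinel marking the end of the stream

    # The loader thread fetches upcoming chunks while the main thread computes.
    threading.Thread(target=loader, daemon=True).start()

    outputs = []
    while True:
        chunk = buf.get()
        if chunk is None:
            break
        outputs.append(compute(chunk, x))
    return outputs

print(pipelined_inference(3, [1.0, 1.0, 1.0, 1.0]))
```

The bounded queue is the key design choice here: it lets loading run ahead of computation without ever holding more than a few chunks in memory, which is the general trade-off between speed and memory that the highlight describes.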

😍 Enjoying it so far? Share it with your friends!

Tools of the Trade ⚒️

Personalization in Midjourney: The new personalization feature lets you create images tailored to your taste. You rank image pairs on Midjourney, and it takes note of the kinds of images you prefer and generates images based on those preferences. To use this feature, add the --p parameter to your prompt on Discord or the website.

nCompass: nCompass’ Realtime Audio Denoising API removes background voices from audio streams in real-time, enhancing sound quality for clearer communication. You can use it for applications requiring high-quality audio, such as transcription services or voice input tools.
Floom: Integrate AI functionalities into your applications, allowing tasks such as data ingestion, image creation, PDF querying, classification, summarization, and transcription. It offers security features, cost control, caching, and more, ensuring complete data privacy and seamless integration with CI/CD platforms.

Awesome LLM Apps: Build awesome LLM apps using RAG for interacting with data sources like GitHub, Gmail, PDFs, and YouTube videos through simple texts. These apps will let you retrieve information, engage in chat, and extract insights directly from content on these platforms.

Hot Takes 🔥

The utility of most AI apps is very much rate limited by the reality that the creators have an incentive to minimize AI compute spend, despite being good for end users to have more. This changes as AI gets cheaper, goes on device, etc. The magic is in the “etc”. ~Logan Kilpatrick
I am on an airplane overhearing a row of people talking about how they are using AI for stuff and I decided that one thing that I will not forgive OpenAI for is making the word "ChatGPT" generic for AI. Leaving aside any other issues, it is too many syllables and a terrible verb. ChatGPTed is so painful. We need slang for this soon, the AI companies can't name anything to save their lives (Gemini'ed? Clauded? Llama'ed? Microsoft Copiloted powered by OpenAIed?) Mistraled is the only one that is close. ~Ethan Mollick

Meme of the Day 🤡

Every Apple presenter

That’s all for today! See you tomorrow with more such AI-filled content.

Real-time AI Updates 🚨

⚡️ Follow me on Twitter @Saboo_Shubham for lightning-fast AI updates and never miss what’s trending!

PS: I curate this AI newsletter every day for FREE, and your support is what keeps me going. If you find value in what you read, share it with your friends by clicking the share button below!
