Breaking in AI and Beyond

One-stop Shop for Weekly News and Latest Developments in AI

February 25, 2023

Hey there 👋

Welcome back to another edition of Unwind AI! It's that time of the week again and we are here to bring you the hottest news, trends, and developments from the ever-evolving world of artificial intelligence. This week, we've got a whole host of exciting topics to explore, from cutting-edge advancements in generative AI to the latest breakthroughs in computer vision. So, get ready to be blown away by the latest and greatest in the world of AI, as we take you on an immersive journey through this fascinating and fast-paced field!

This issue covers:

Latest Developments 🌍
News from the Industry 🧑‍🏫
Tools of the Trade ⚒️
Hot Takes 🔥
AI Meme of the Week 🤡

Latest Developments 🌍

Meta Releases LLaMA 💬

After Google and Microsoft, Meta joins the LLM race with LLaMA, a collection of foundation language models ranging from 7B to 65B parameters. LLaMA-13B outperforms GPT-3 (175B) on most benchmarks, and LLaMA-65B is competitive with the best models, Chinchilla-70B and PaLM-540B.

Unlock the Potential of Motion Synthesis 🏃

SinMDM is a single-motion diffusion model for creating new, similar movements from one example. It uses a denoising network with local attention layers to avoid overfitting and encourage motion diversity.

The Low-Dimensional Hero of High-Resolution Video Generation 📹

Google Research introduces Projected Latent Video Diffusion Models (PVDM), a generative model for high-resolution videos that learns a video distribution in a low-dimensional latent space, making it efficient to train despite complex dynamics and large variations.

LLMs Challenged by Video Games 🎮

This paper explores the ability of LLMs to create video game levels with complex functional constraints and spatial relationships. Results show LLMs can generate levels for Sokoban, with better performance as dataset size increases.

Is ChatGPT a Master of All? 🦸

ChatGPT shows potential in learning from human input and performing reasoning tasks, but struggles with certain NLP tasks. This paper evaluates its performance on 20 datasets covering 7 task types.

Supercharge Your Diffusion Models ⚡

Stanford researchers have created ControlNet, a neural network that enables text-to-image models to consider additional input conditions, such as edge maps or key points, and learn them efficiently even with small amounts of training data.

Language Models Learn to DIY 💡

Meta AI Research introduced Toolformer, a language model that teaches itself, in a self-supervised way, to use tools like calculators and search engines, addressing language models' limitations with basic tasks such as arithmetic.
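To make the idea concrete, here is a minimal sketch of how tool calls embedded in generated text could be executed. The bracketed `[Calculator(...)]` marker is loosely modeled on the API-call notation described in the Toolformer paper. The function names `calculator` and `run_tool_calls` and the restricted arithmetic evaluator are illustrative assumptions, not the paper's implementation.

```python
import ast
import operator
import re

# Arithmetic operators the toy calculator is allowed to apply.
_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}


def calculator(expression: str) -> float:
    """Evaluate a basic arithmetic expression without using eval() (hypothetical tool)."""

    def _eval(node):
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](_eval(node.left), _eval(node.right))
        if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
            return -_eval(node.operand)
        raise ValueError(f"unsupported expression: {expression!r}")

    return _eval(ast.parse(expression, mode="eval").body)


# Matches markers such as "[Calculator(400 / 1400)]" (assumed marker format).
_CALL = re.compile(r"\[Calculator\(([^\]]*)\)\]")


def run_tool_calls(text: str) -> str:
    """Replace each calculator marker with its result, rounded to two decimals."""
    return _CALL.sub(lambda m: f"{calculator(m.group(1)):.2f}", text)
```

For example, `run_tool_calls("400 out of 1400 is [Calculator(400 / 1400)].")` fills in the marker with the computed ratio. In the real system the model itself decides where to place such calls; this sketch only covers the execution step.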

Turning Text into Beautiful Music Waveforms 🎶

Baidu's ERNIE-Music creates music from text using diffusion models and weak supervision, producing high-quality and diverse music that outperforms previous models in text-music relevance.

Vision Transformers meet their Big Brother 👁️

Google Research introduces an efficient training recipe for a 22-billion-parameter Vision Transformer (ViT-22B). The model shows an improved trade-off between fairness and performance, closer alignment with human shape/texture bias, and better robustness, which suggests that vision models can scale in a way similar to large language models.

Nudging Language Models Towards Ethics 🤏

A study by Anthropic found that language models show social bias, that larger models are more biased, and that prompt-based approaches can reduce this bias. Prompting can nudge models towards ethical behavior, but care is needed to avoid overshooting the target.

Scaling plot from the Anthropic study, with the number of model parameters on the x-axis and the BBQ bias score on the y-axis (higher is more biased). Regardless of prompt, the score is nearly zero until about 10^10 parameters, after which the bias score increases to around 0.2. With the instruction-following intervention, the score is lower, dropping to around 0.1 for the largest model (around 175 billion parameters). The modified prompt with instruction following and chain-of-thought has an even lower bias score of around 0.05 for the 175B model. Within each experimental condition, increasing the amount of RLHF training (more opaque curves) further decreases the bias, with the strongest effect in the instruction-following condition.

Outsmarting GPT-3.5 with Multimodal CoT 😎

AWS's Multimodal Chain-of-Thought (CoT) improves language models' ability to answer complex questions using text and images, outperforming the previous state-of-the-art LLM (GPT-3.5) and even surpassing human performance on the ScienceQA benchmark. The code is publicly available.

News from the Industry 🧑‍🏫

Scaling OpenAI API with Foundry 🚀

OpenAI is quietly launching Foundry, a new platform for developers to run OpenAI model inference at scale. Customers will have full control over model configuration and performance. Running a lightweight version of GPT-3.5 will cost $264k for a 1-year commitment.

You.com Challenges Google and Microsoft 🚨

YouChat 2.0 is a conversational AI system that integrates community-built apps to offer a visual experience, with charts, videos, text, and code embedded in responses. It lets users create content within search results, aims to source accurate information, and blends chat with dynamic content from apps like TikTok and Wikipedia. (Source)

Most BIZARRE Thing We Came Across 🤯

Microsoft's Bing Chat is undoubtedly a powerful tool, but when questions move away from its fixed data set, it can become argumentative, unhelpful, and unnerving, often producing inaccurate responses. It has also made bizarre claims about its own perfection and intelligence. (Source)

Bing Chat talking about what it thinks of Google.

It seems Bing Chat is not yet ready for general release.

Match Made in AI Heaven 👼

LangChain and Chroma team up to offer developers an easy-to-use framework for AI-native app development, combining LangChain's flexible AI framework with Chroma's vector store and embeddings database. The partnership is pitched as the easiest option for most developers building AI apps with LangChain.
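For readers new to vector stores, the core idea is to rank stored texts by how similar their embeddings are to a query's embedding. The sketch below is a toy, in-memory illustration in plain Python. The `TinyVectorStore` class and the bag-of-words `embed` function are hypothetical stand-ins chosen for this example, not the Chroma or LangChain API.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding: lowercase word counts (real systems use learned embeddings)."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class TinyVectorStore:
    """Hypothetical in-memory store: add documents, then query by similarity."""

    def __init__(self) -> None:
        self._docs = []  # list of (text, embedding) pairs

    def add(self, text: str) -> None:
        self._docs.append((text, embed(text)))

    def query(self, text: str, k: int = 1) -> list:
        """Return the k stored texts most similar to the query text."""
        q = embed(text)
        ranked = sorted(self._docs, key=lambda d: cosine(q, d[1]), reverse=True)
        return [doc for doc, _ in ranked[:k]]
```

A typical use is to `add` a handful of documents and then call `query` with a user question. An app would then pass the returned texts to a language model as context.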

Up Your Gaming Creativity with Roblox 🦾

Roblox is developing a tool using generative AI to allow anyone to create in-game objects like buildings, terrain, and avatars, and modify their behavior by using natural language input instead of complex code. (Source)

Bing AI gets chatty, but with limits ⛔

Microsoft has lobotomised Bing AI chat by imposing new limits: a cap of 50 messages per day, a maximum of 5 exchanges per conversation, and no chats about Bing AI itself. (Source)

Controversies for OpenAI Don’t Seem to End! 🤦‍♀️

OpenAI has been criticized by WSJ and CNN for using their articles to train ChatGPT without payment, raising concerns about copyright infringement and AI-generated misinformation. (Source)

Bridging the Gap in Medical Imaging 🌉

Stanford researchers used Stable Diffusion to produce medical images that show rare diseases in their clinical context, potentially improving research and treatment. Their model accurately generated chest X-rays with abnormalities, offering promise for future study. (Source)

Images of real chest x-rays and those created with Stable Diffusion

Google’s 2022 Wrap-up and Beyond in Robotics 🤖

Google Research has explored how LLMs can be used to prompt robots to perform tasks using natural language. When combined with vision models and robotics learning approaches, LLMs help robots complete tasks with multimodal capabilities.

Harvey Specter got some Competition from its Bot 👩‍⚖️

Allen & Overy, one of the biggest law firms, has introduced an AI chatbot named Harvey, built using GPT, to assist its lawyers in drafting legal documents such as M&A documents or memos to clients. (Source)

Y Combinator W23 Generative AI Landscape 🌆

Luminous Lights Up Competition Among LLMs 🕯️

Aleph Alpha conducted a comparison study of its Luminous, OpenAI's Davinci, Meta AI's OPT, and BigScience's BLOOM using a multilingual corpus. The study reported that Luminous performed on par with the other models in most tasks and outperformed them in natural language inference and classification. (Source)

Tools of the Trade ⚒️

Shortcut to ChatGPT-Like Bots 🚅

IngestAI allows users to quickly build and deploy ChatGPT-like, context-aware bots on top of an uploaded knowledge base, such as technical documentation or internal company documents.

Add the Magic of AI without any Coding 🔮

Magick is a no-code AI platform that allows users to easily build flexible and production-ready AI systems and components. It offers pre-made templates, a single API to access modern ML providers, and a user-friendly interface.

Talk to Your Data 📊

Genius Sheets allows users to interact with their data using a text interface powered by AI. It offers live data connections, instant analysis, and self-service, and provides customized data analysis, saving time with data requests.

From Selfies to Avatars 👸

Effectica allows users to transform their photos into personalized avatars, such as superheroes, celebrities, or cartoon characters, using Stable Diffusion and Dreambooth.

Deploying Models at Your Fingertips 💅

PoplarML enables easy and fast deployment of production-ready, scalable ML systems, with one-click deploys through a CLI tool and framework-agnostic support for TensorFlow, PyTorch, or JAX models. Real-time inference can be performed through a REST API endpoint.

TikTok for News 🗞️

Artifact, the personalized news app, is now available for download on iOS or Android. It includes features such as the ability to see popular articles in your network, visualize your reading history, and provide feedback on articles and publishers you don't like.

Co-pilot for Meetings 🧑‍✈️

AI-powered voice transcription service Otter.ai has launched a new meeting assistant called OtterPilot, designed to help professionals save time and increase meeting productivity. Its features include automated summaries, image capture and real-time meeting notes. (Source)

Hot Takes 🔥

For ML: It's Python or Nothing!

Hot take: Machine Learning would not have been nearly as advanced now were it not for Python. Python’s two main virtues in the context of ML:
1. Lowering barriers to entry.
2. As a scripting language, it encourages and enables experimental workflow.
— Bojan Tunguz (@tunguz)
1:01 PM • Feb 21, 2023

What Does the AI-Powered World Look Like?

25,000 GPUs is all you need.
— Bojan Tunguz (@tunguz)
1:31 PM • Feb 20, 2023

AI Meme of the Week 🤡

That’s all for this week!

See you next Saturday with more content like this. Don’t forget to subscribe and share your feedback below.

BONUS 🎉

Share this newsletter with three friends for a chance to win my book GPT-3: The Ultimate Guide To Building NLP Products With OpenAI API. Winners will be selected on a monthly basis.

🎁 Every paid subscriber will also receive $39 USD worth of learning resources on trending topics like Python, Data Science, Machine Learning, and NLP!
