When AI Starts Playing GOD 🤯

Plus: $500B AI image, AI regulations in US & UK, OpenAI's plan for future of AI and more!

Hey there 👋

We’re here with another exciting edition jam-packed with mind-blowing developments that will leave you itching to know more. This week, GPT-4 took center stage: Voyager, an autonomous agent powered by GPT-4, went exploring in Minecraft, and OpenAI shared its plans to empower developers and accelerate the development of AGI. But that’s not all! We have a bunch of other exciting updates, including breakthroughs in mathematical reasoning by LLMs, fMRI-to-image reconstruction, and even transformers tracking humans in 4D.

Oh, and let's not forget the dramas and controversies: a lawyer got in trouble for relying on ChatGPT for research, debates about AI regulation heated up, and one AI-generated image wiped $500 billion off the market within minutes. The AI world is never short on excitement!

So get ready to have your mind blown and your curiosity satisfied. Enjoy the ride!

This issue covers:

  • Latest Developments 🌍

  • News from the Industry 🧑‍🏫

  • Tools of the Trade ⚒️

  • Hot Takes 🔥

  • AI Meme of the Week 🤡

Latest Developments 🌍

Our Pick 👌

Voyager: A Minecraft agent powered by GPT-4, autonomously learning through exploration and acquiring skills, achieving remarkable proficiency through an automatic curriculum, a skill library, and an iterative prompting mechanism.

  • Photoswap: Allows for personalized subject swapping in images while preserving the original composition and charm of the image.

  • Limits of Transformers on Compositionality: While Transformer LLMs excel in complex reasoning, they struggle with trivial problems and lack systematic problem-solving skills.

  • SwiftSage: Combines behavior cloning and LLMs to excel in complex interactive tasks, integrating fast and intuitive thinking with deliberate thought processes.

  • Break-A-Scene: Extracts multiple concepts from a single image using masks and textual embeddings for fine-grained control over generated scenes.

  • Improving mathematical reasoning with process supervision: Training a model using process supervision, which rewards each correct step of reasoning, improves mathematical problem-solving performance.

  • Reconstructing the Mind's Eye: fMRI-to-image approach that retrieves and reconstructs viewed images from brain activity using contrastive learning and diffusion priors.

  • PaLI-X: Multilingual vision and language model, surpassing existing benchmarks and capabilities in complex counting and multilingual object detection.

  • Impossible Distillation: Creates high-quality summarization and paraphrasing models and datasets from a low-quality language model, surpassing GPT-3.

  • Humans in 4D: Using transformers to reconstruct and track humans in 4D from videos, achieving superior results in tracking and action recognition.

  • Think Before You Act: An internal working memory module boosts training efficiency and generalization in decision-making agents, reducing the forgetting phenomenon.

  • OlaGPT: Emulates human problem-solving abilities in LLMs by incorporating cognitive modules and active learning mechanisms.

  • A PhD Student's Perspective on Research in NLP in the Era of Very Large LMs: PhD students highlight unexplored research areas in NLP, countering the misconception that LLMs have solved “all” problems.

  • Generating Images with Multimodal Language Models: Combining LLMs with image encoder and decoder models to generate images and text outputs, exhibiting a wider range of capabilities.

  • Control-GPT: Improves controllability in text-to-image generation by enhancing instruction following, utilizing GPT-4 to generate programmatic sketches.

  • LLMs as Tool Makers: A framework enabling LLMs to create their own tools for problem-solving, reducing dependency and achieving cost-effective solutions.

  • SPRING: A framework that uses GPT-4 and academic papers to outperform reinforcement learning algorithms in open-world survival games.

  • Sophia: A scalable second-order optimizer for language model pre-training with 2x speed-up compared to Adam.

  • Training Socially Aligned Language Models in Simulated Human Society: Improves LMs' generalization and alignment with societal values through simulated social interactions.

  • The Curse of Recursion: Model collapse occurs when models are trained on model-generated data, highlighting the need for genuine human-generated data in training.

  • Lexinvariant Language Models: Perform well without fixed token embeddings, relying on context-based token patterns, improving deciphering and in-context reasoning tasks.

  • Barkour: Benchmark for quadruped robots that measures their agility within a set time frame, with a performance metric tied to real animal capabilities.

  • Gorilla: A finetuned LM that outperforms GPT-4 in generating accurate API calls, uses document retrieval to adapt to changes, and mitigates hallucination.
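
For the curious, here is a tiny sketch of the idea behind the process supervision item above: outcome supervision gives one reward for the final answer, while process supervision gives a reward for every reasoning step. This is only an illustration, not the method from the paper. The function names and the per-step labels are made up for the example, and the last step's label stands in for "the final answer is correct".

```python
from typing import List


def outcome_reward(step_labels: List[bool]) -> float:
    """Outcome supervision (simplified): a single reward for the whole solution.

    Assumption for this sketch: the last step's label stands in for
    whether the final answer is correct.
    """
    return 1.0 if step_labels and step_labels[-1] else 0.0


def process_rewards(step_labels: List[bool]) -> List[float]:
    """Process supervision (simplified): one reward per reasoning step."""
    return [1.0 if ok else 0.0 for ok in step_labels]


# Hypothetical labels for a four-step solution: the second step is wrong,
# but the final answer happens to come out right.
labels = [True, False, True, True]

print(outcome_reward(labels))   # 1.0
print(process_rewards(labels))  # [1.0, 0.0, 1.0, 1.0]
```

The single outcome reward cannot tell that the second step was wrong, while the per-step rewards point right at it.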

News from the Industry 🧑‍🏫

Our Pick 👌

OpenAI’s plans to advance AGI development include improving GPT models, expanding API capabilities, avoiding competition with customers and advocating for regulation.

Tools of the Trade ⚒️

Our Pick 👌

Paragraphica: Camera that uses location data and AI to create unique photos reflecting the mood and emotion of a place, though they may not resemble it exactly.

  • Be My Eyes: Connects blind and low-vision people with volunteers through live video for accessing visual information, and now offers a Virtual Volunteer tool powered by GPT-4.

  • EffluenceAI: Revolutionize influencer marketing with affordable and authentic connections between brands and lifelike AI-generated influencers.

  • NewsNotFound: AI-powered news website delivering unbiased and neutral news articles through automation, eliminating human bias.

  • SuperAGI: Open-source framework for developing and deploying autonomous agents with multiple features and efficient management.

  • Candydate: AI-powered recruitment platform that analyzes speech, body language, and personal traits to match candidates with the right job.

  • HeyGen: AI-powered text-to-video generation for business, featuring customizable avatars and lip-syncing with 300+ voices in 40+ languages.

  • ConceptMap: Create interactive concept maps using AI.

  • Waibsites: Effortless website deployment with GPT-4-generated landing pages, customizable styles, advanced analytics, and seamless payment integration.

  • Scrivvy: Provides concise summaries of YouTube videos and breaks down long videos into short segments for easier understanding.

  • CreditHQ: Credit score simulator that predicts the impact of various scenarios on your credit score and provides recommendations to address them.

  • Rask: AI-powered video localization along with translation, dubbing, voice cloning, and multispeaker capabilities.

  • Coach by Wonderway: AI-powered sales coaching tool that analyzes performance and provides real-time feedback on sales calls, instant insights and trend analysis.

  • BoltAI: Provides instant access to ChatGPT and Stable Diffusion from any Mac app, so users can reach them directly in their favorite apps.

  • Teleport Assist: DevOps assistant powered by GPT-4 for infrastructure tasks, troubleshooting, and executing commands on target nodes.

Hot Takes 🔥

Meme of the Week 🤡

Image

That’s all for this week!

See you next Saturday with more. Don’t forget to subscribe and share your feedback below.

BONUS 🎉

Share this newsletter with three friends for a chance to win my book, GPT-3: The Ultimate Guide to build NLP Products with OpenAI API. Winners are selected monthly.
