AI Creates 3D Worlds from Text Prompts 🪄

PLUS: Depth Up-Scaling in AI Models, Notux 8x7B leads the OpenLLM Leaderboard

Today’s top AI Highlights:

  1. Upstage's SOLAR 10.7B, scaled up with the Depth Up-Scaling method, surpasses much larger models on the Hugging Face Open LLM Leaderboard.

  2. Argilla's Notux 8x7B, a preference-tuned MoE model, leads the Hugging Face Open LLM Leaderboard.

  3. Align Your Gaussians (AYG) introduces Text-to-4D synthesis for vibrant, dynamic 3D scenes.

  4. New York Times Sues OpenAI and Microsoft for Copyright Infringement

  5. AI tools to convert text or photos into video, and transform concepts into 3D CAD models.

& so much more!

Read time: 3 mins

Latest Developments 🌍

SOLAR 10.7B: Pioneering Depth Up-Scaling in AI Models

Korean AI startup Upstage released the SOLAR 10.7B model a few days ago. It topped the Hugging Face Open LLM Leaderboard, outperforming much larger models including Qwen, Mixtral 8x7B, Yi-34B, Llama 2, and Falcon. The team has now introduced depth up-scaling (DUS), a technique for scaling up language models simply and effectively. Unlike the mixture-of-experts (MoE) method, DUS does not require complex changes to training and inference.

Key Highlights:

  1. DUS is a method that effectively scales up a base language model by increasing its depth, i.e., the number of layers. It involves copying the base model, then strategically removing and concatenating layers from these copies to create a larger model. This process simplifies scaling up by avoiding the need for major changes to the model's architecture or training processes.

  2. The MoE method requires complex changes in the training and inference framework, including additional modules like gating networks. In contrast, DUS allows for scaling up models without such complexities. It is compatible with existing training and inference frameworks, making it more accessible and user-friendly.

  3. SOLAR 10.7B excels in various natural language tasks, surpassing models like Llama 2 and Mistral 7B. The SOLAR 10.7B-Instruct variant, specifically fine-tuned for instruction-following tasks, demonstrates superior performance, outperforming larger models like Mixtral 8x7B.
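To make the copy, remove, and concatenate idea behind DUS concrete, here is a minimal sketch that treats a model as a plain list of layers. The function name `depth_up_scale`, the 32-layer base, and the choice to drop 8 layers from each copy are illustrative assumptions, not details given above. A real up-scaled model would also need continued pretraining afterwards.

```python
from copy import deepcopy


def depth_up_scale(layers, m):
    """Build a deeper layer stack from two copies of `layers`.

    Copy A loses its last `m` layers, copy B loses its first `m` layers,
    and the two trimmed copies are concatenated (hypothetical helper).
    """
    n = len(layers)
    if not 0 < m < n:
        raise ValueError("m must be between 1 and len(layers) - 1")
    top = deepcopy(layers[: n - m])   # copy A without its final m layers
    bottom = deepcopy(layers[m:])     # copy B without its first m layers
    return top + bottom               # depth becomes 2 * (n - m)


# Stand-in for a 32-layer transformer; strings replace real layer objects.
base = [f"layer_{i}" for i in range(32)]
scaled = depth_up_scale(base, m=8)
print(len(scaled))  # 48
```

Because the result is still an ordinary stack of the same layer type, it fits existing training and inference code without extra modules such as a gating network.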

Transform Text into Dynamic 3D Worlds

NVIDIA has made a significant leap in AI-driven animation with its innovative technology, Align Your Gaussians (AYG). This method transforms text inputs into vibrant, dynamic 3D scenes, utilizing a unique text-to-4D synthesis approach. It marks a notable advancement in the field of digital content creation, opening new possibilities in animation and simulation.

Key Highlights:

  1. Novel 4D Synthesis Technique: AYG employs a unique method combining text-to-image, text-to-video, and 3D-aware multiview diffusion models. This allows the generation of dynamic, animated 3D objects with an added temporal dimension, ensuring temporal consistency, high-quality visual appearance, and realistic geometry.

  2. Two-Stage Optimization Process: The technology involves a two-stage process, beginning with the optimization of 3D Gaussians for static scene creation, followed by the addition of dynamics through deformation field optimization.

  3. Innovative Motion Amplification and Autoregressive Synthesis: AYG introduces new techniques like motion amplification and an autoregressive synthesis scheme. These allow for the generation of longer 4D sequences and the capability to change text guidance during the process, enhancing the versatility and richness of the dynamic scene.

Notux 8x7B Model Surpasses Expectations

The Notux 8x7B model by Argilla is a preference-tuned Mixture of Experts (MoE) model that sets a new benchmark on the Hugging Face Open LLM Leaderboard. It is part of the Notus family of models and experiments, where the Argilla team investigates data-first and preference-tuning methods like dDPO (distilled DPO).

Key Highlights:

  1. Innovative Preference Tuning: Notux 8x7B, a variant of Mixtral 8x7B-Instruct-v0.1, is tuned with Direct Preference Optimization (DPO), which trains the model directly on pairs of preferred and rejected responses.

  2. Leading the Pack: As of December 26, 2023, Notux 8x7B stands atop the Hugging Face Open LLM Leaderboard, surpassing its predecessor and other contenders in the MoE category, including Mixtral 8x7B.
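For readers unfamiliar with DPO, the sketch below computes the standard DPO loss for a single preference pair from summed log-probabilities. It is a generic illustration, not Argilla's training code. The function name, argument names, and the `beta` value of 0.1 are assumptions chosen for the example.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """DPO loss for one (chosen, rejected) response pair.

    Each argument is the total log-probability a model assigns to a
    response; `ref_*` values come from the frozen reference model.
    """
    # Implicit rewards: how far the policy has moved from the reference.
    chosen_reward = policy_chosen_logp - ref_chosen_logp
    rejected_reward = policy_rejected_logp - ref_rejected_logp
    margin = beta * (chosen_reward - rejected_reward)
    # -log(sigmoid(margin)), written in a numerically stable form.
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


# The loss shrinks as the policy favors the chosen response more strongly.
neutral = dpo_loss(-10.0, -10.0, -10.0, -10.0)
improved = dpo_loss(-8.0, -12.0, -10.0, -10.0)
print(improved < neutral)  # True
```

No separate reward model appears anywhere in the computation, which is the main practical appeal of DPO over reward-model-based preference tuning.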

OpenAI and Microsoft Amid Another Copyright Infringement Suit ⚖️

While the legal question of whether training AI models on publicly available data infringes copyright remains unresolved, The New York Times has filed a lawsuit against OpenAI and Microsoft, alleging copyright infringement. The suit claims the companies used millions of the publication's articles to train the LLMs behind ChatGPT and Copilot, and that these products hurt the Times' subscription and advertising revenue by replicating and summarizing its content.

The publication is seeking billions in damages and wants the court to prevent OpenAI and Microsoft from using its content to train their AI models. It also demands the removal of its content from their datasets. The NYT, along with other major news outlets like the BBC, CNN, and Reuters, has blocked OpenAI’s web crawler, preventing further scraping of content for AI training.
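The newsletter does not say how the outlets block the crawler. A common mechanism is a robots.txt rule aimed at OpenAI's `GPTBot` user agent. The sketch below uses Python's standard `urllib.robotparser` to check such a rule. The robots.txt content and the `example.com` URL are made up for illustration and are not taken from any outlet's actual file.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: block GPTBot everywhere, allow everyone else.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""


def crawler_allowed(robots_txt, user_agent, url):
    """Return True if `user_agent` may fetch `url` under `robots_txt`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


url = "https://example.com/2023/12/27/some-article.html"
print(crawler_allowed(ROBOTS_TXT, "GPTBot", url))        # False
print(crawler_allowed(ROBOTS_TXT, "SomeOtherBot", url))  # True
```

Note that robots.txt is advisory: it only stops crawlers that choose to honor it, and it does nothing about content collected before the rule was added.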

Tools of the Trade ⚒️

  1. Assistive Video: Turn your ideas into videos simply by typing what you want to see. Or, input a photo and watch it come to life.

  2. TokenCost: A client-side tool for calculating the cost of using major LLM APIs. TokenCost estimates the USD cost of prompts and completions for LLM applications and AI agents, aiding in cost-effective AI development.

  3. Leo: AI tool to transform imaginative concepts into tangible 3D CAD models that can be edited anywhere, catering to a wide range of design needs, from individual parts to fully assembled products.

  4. ResearchPlot: A Python project utilizing LangChain and the OpenAI API for research on various topics, generating intelligent questions, and producing flowcharts (using Mermaid.js) to visually represent findings. It focuses on topic analysis, natural language processing, and clear, concise data visualization.
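The arithmetic behind a cost estimator like TokenCost is simple enough to sketch. The example below does not use TokenCost's own API. The model name, the price table, and the `estimate_cost` function are hypothetical placeholders, and the prices are not real vendor pricing.

```python
from decimal import Decimal

# Hypothetical prices in USD per 1,000 tokens (placeholders, not real rates).
PRICES_PER_1K = {
    "example-model": {
        "prompt": Decimal("0.001"),
        "completion": Decimal("0.002"),
    },
}


def estimate_cost(model, prompt_tokens, completion_tokens):
    """Estimate the USD cost of one request from its token counts."""
    price = PRICES_PER_1K[model]
    total = (Decimal(prompt_tokens) * price["prompt"]
             + Decimal(completion_tokens) * price["completion"])
    return total / 1000  # prices are quoted per 1,000 tokens


cost = estimate_cost("example-model", prompt_tokens=1500, completion_tokens=500)
print(f"Estimated cost: ${cost}")
```

`Decimal` is used instead of floats so that small per-request amounts add up without rounding drift across many calls.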

😍 Enjoying so far, TWEET NOW to share with your friends!

Hot Takes 🔥

  1. The LLMs coming out of china performing really well on benchmarks but I don’t hear much on how they actually perform on real tasks or if anyone outside china is actually using them ~ anton

  2. I think part of the reason AGI hype takes off so much is that so many developers and computational scientists (including me) don't build enough things with their hands. Embodiment is very hard. Each extra 9 of precision may require a breakthrough ~ Bharath Ramsundar

Meme of the Day 🤡

Robots can't reach my home

That’s all for today!

See you tomorrow with more such AI-filled content. Don’t forget to subscribe and give your feedback below 👇

Real-time AI Updates 🚨

⚡️ Follow me on Twitter @Saboo_Shubham for lightning-fast AI updates and never miss what’s trending!!

PS: I curate this AI newsletter every day for FREE, your support is what keeps me going. If you find value in what you read, share it with your friends by clicking the share button below!
