13B Model Beats GPT-4 Performance🎖️

PLUS: Empowering Robots to Understand Our World, Link LLMs to Live Internet Info, ChatGPT for Research

AI Spotlight 🔦

Join the webinar “Build GenAI Text-to-Speech Apps with LangChain” by SingleStore to learn how to combine GenAI and text-to-speech technologies with LangChain to build voice-enabled applications. The session blends live demonstrations with code sharing, and it is aimed at seasoned developers, data enthusiasts, and anyone just starting out.

What You’ll Learn:

Date and Time: Thursday, November 16th at 10:00 am PST

Duration: 60 minutes

It's a unique chance to engage with an industry expert and elevate your skills. Don’t miss out, Register Now!

Today’s top AI Highlights:

  1. GOAT: GO to Any Thing

  2. Going Beyond RAG to Expand LLMs Context

  3. “Decontaminating” Data Makes 13B Model Beat GPT-4

  4. YOU.com Releases its API

  5. ChatGPT for Research, no Hallucinations

& so much more!

Read time: 3 mins

Latest Developments 🌍

The Robot That Can Find Anything, Anywhere 🤖

As robotic assistance in everyday life becomes a reality, meet GO to Any Thing (GOAT), a navigation system that can identify and navigate to any object in a completely unknown environment. That is as hard as asking a robot to find a specific item in a house it has never explored. GOAT gives robots something like the sense of direction and memory a human uses in an unfamiliar place.

Key Highlights:

  1. GOAT can locate objects using different inputs like images, language, or categories. It creates a semantic map from visual and depth data, coupled with an Object Instance Memory, allowing it to navigate to any previously seen object with remarkable accuracy.

  2. The system continuously updates its memory with new object information, enhancing its ability to efficiently navigate to new goals. This improvement over time in a specific environment showcases its advanced learning capabilities.

  3. GOAT was tested in various home settings, where it achieved an 83% success rate in navigating to over 200 different object instances. This points to practical applications in environments like homes and warehouses, where autonomous navigation is key.
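
To make the Object Instance Memory idea concrete, here is a toy sketch in Python. It is not GOAT's implementation: the class names, the 2D positions, and the three-number "appearance" vectors are all assumptions made for illustration. It only shows how one store of seen objects can serve both a category goal and an image-style goal.

```python
from dataclasses import dataclass
from math import sqrt


@dataclass
class ObjectInstance:
    category: str    # e.g. "chair"
    position: tuple  # (x, y) map cell; hypothetical 2D layout
    feature: list    # toy appearance embedding, not a real image feature


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class InstanceMemory:
    """Toy store of objects seen so far (illustrative only)."""

    def __init__(self):
        self.instances = []

    def observe(self, category, position, feature):
        # Called whenever the robot sees an object; memory grows over time.
        self.instances.append(ObjectInstance(category, position, feature))

    def find_by_category(self, category):
        # Category goal: every stored instance with a matching label.
        return [o for o in self.instances if o.category == category]

    def find_by_feature(self, query):
        # Image-style goal: the stored instance that looks most similar.
        if not self.instances:
            return None
        return max(self.instances, key=lambda o: cosine(o.feature, query))


memory = InstanceMemory()
memory.observe("chair", (2, 5), [0.9, 0.1, 0.0])
memory.observe("chair", (7, 1), [0.1, 0.9, 0.0])
memory.observe("plant", (4, 4), [0.0, 0.2, 0.9])

chairs = memory.find_by_category("chair")
goal = memory.find_by_feature([0.2, 0.8, 0.1])
```

Here `goal` is the second chair, so a planner would head for its stored position. A real system would build these entries from camera and depth data rather than hand-written vectors.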

Going Beyond RAG with Extended Minds for LLMs 🧠

Researchers have introduced "Extended Mind Transformers," a method that extends the memory and retrieval abilities of LLMs, much like giving a person access to a well-organized library. It requires no fine-tuning, and it outperforms traditional retrieval-augmented generation (RAG) methods in several ways.

Key Highlights:

  • Extended Mind Transformers use a self-attention mechanism that lets each token access a predetermined number of external memories. This turns external, easily accessible information into an integral part of the model's memory and gets around the limits of its built-in context window.

  • In experiments using Mosaic’s MPT-7B, the system outperformed existing methods like RAG at handling and synthesizing comprehensive document sets, and it offered more detailed causal citations.

  • This method enhances LLMs without needing fine-tuning, making it a resource-efficient solution. It can be easily integrated into existing models, improving capabilities and interpretability by showing which external memories were used in the generation process.
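
The core idea in the highlights above can be sketched in a few lines of plain Python. This is a simplified illustration under stated assumptions, not the authors' code: it handles a single query token, uses dot-product similarity to pick memories, and the function name `attend_with_memories` and its `top_k` parameter are made up for this example.

```python
from math import exp, sqrt


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def softmax(scores):
    m = max(scores)
    exps = [exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend_with_memories(query, local_kv, memory_kv, top_k=2):
    """Attention for one query token over local context plus top_k memories.

    local_kv and memory_kv are lists of (key, value) vector pairs.
    Returns the output vector and the indices of the memories used.
    """
    # Pick the top_k external memories whose keys best match the query.
    ranked = sorted(range(len(memory_kv)),
                    key=lambda i: dot(query, memory_kv[i][0]),
                    reverse=True)
    chosen = ranked[:top_k]

    # Treat the retrieved pairs as extra keys/values next to the local ones.
    pairs = [memory_kv[i] for i in chosen] + list(local_kv)
    scale = sqrt(len(query))
    weights = softmax([dot(query, k) / scale for k, _ in pairs])

    dim = len(pairs[0][1])
    output = [sum(w * v[d] for w, (_, v) in zip(weights, pairs))
              for d in range(dim)]
    return output, chosen


query = [1.0, 0.0]
local_kv = [([0.0, 1.0], [0.0, 1.0])]
memory_kv = [
    ([1.0, 0.0], [1.0, 0.0]),    # closely matches the query
    ([-1.0, 0.0], [0.0, 0.0]),   # points the opposite way
    ([0.5, 0.5], [0.5, 0.5]),    # partial match
]
output, used = attend_with_memories(query, local_kv, memory_kv, top_k=2)
```

Returning `used` alongside the output mirrors the interpretability point: you can see which external memories fed into a given token.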

13B Model Beats GPT-4 at Various Benchmarks 💪

Researchers at LMSYS have introduced Llama-rephraser, a 13-billion-parameter model that matches the performance of GPT-4 on several major benchmarks. It gets there by training on rephrased or translated test-set samples, which reveals how poorly current methods detect data contamination in LMs.

Key Highlights:

  • Contamination occurs when test-set data leaks into the training set, leading to inflated performance metrics. Traditional detection methods like n-gram overlap and embedding similarity search miss subtler forms of contamination, such as rephrased samples.

  • The “LLM decontaminator” method proposed in this project first uses embedding similarity search to shortlist the training items most similar to each test item, then asks an advanced LLM like GPT-4 to judge whether each pair is a rephrasing. This method outperforms existing techniques in detecting rephrased samples.

  • The decontaminator was applied to datasets like The Stack and RedPajama, uncovering a substantial number of rephrased samples. The team recommends wider adoption of this method for public benchmarks and has made it publicly available.
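
The two-stage check described above can be sketched as follows. This is an illustrative outline, not the released tool: `embed` is a bag-of-words stand-in for a real embedding model, and `toy_judge` is a hypothetical placeholder for the GPT-4 call, so the function names and thresholds here are assumptions.

```python
from collections import Counter
from math import sqrt


def embed(text):
    # Stand-in for a real sentence-embedding model: bag-of-words counts.
    return Counter(text.lower().split())


def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    norm = (sqrt(sum(v * v for v in a.values()))
            * sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def find_contaminated(train_items, test_items, judge, top_k=1):
    """Return (train_item, test_item) pairs the judge flags as rephrasings."""
    flagged = []
    for test in test_items:
        test_vec = embed(test)
        # Stage 1: shortlist the top_k most similar training items.
        shortlist = sorted(train_items,
                           key=lambda tr: cosine(embed(tr), test_vec),
                           reverse=True)[:top_k]
        # Stage 2: ask the judge (an LLM in the real method) about each pair.
        for train in shortlist:
            if judge(train, test):
                flagged.append((train, test))
    return flagged


def toy_judge(train, test):
    # Hypothetical placeholder for an LLM call: a crude word-overlap rule.
    a, b = set(train.lower().split()), set(test.lower().split())
    return len(a & b) / len(a | b) > 0.5


train = ["what is the sum of two and three",
         "name the capital city of france"]
test = ["what is the sum of three and two"]
pairs = find_contaminated(train, test, toy_judge)
```

The shortlist keeps the expensive judge calls to `top_k` per test item instead of one per training item, which is what makes scanning large datasets practical.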

APIs to Connect LLMs with Real-Time Web 🌐

You.com has introduced APIs to give LLMs like Meta's Llama 2 access to real-time internet information. These APIs, starting at $100 per month, enable LLMs to provide up-to-date responses by integrating current web data.

Companies such as LlamaIndex, Anthropic, and Cohere are already utilizing these APIs to enhance the accuracy of their LLMs.

Tools of the Trade ⚒️

  • ResearchGPT by Consensus: No more manual sifting through research papers. Your AI research assistant lets you query a database of over 200 million academic papers, get science-based answers, and draft content with precise citations, without any LLM hallucination. Try it here!

  • Floutwork: Replace multiple browser tabs with this all-in-one desktop app that streamlines workflows. Integrated AI tools like ChatGPT help you manage tasks, minimize distractions, and boost productivity.

  • Hyper: Integrates with platforms such as Google Drive, Slack, and Salesforce to provide AI-powered chat and search for holistic, up-to-date, context-aware information and data management.

  • Scenery: Collaborative video editing made easier than ever. AI-assisted workflows meet professional editing, so your team can share and manage assets, gather real-time feedback, and create storyboards together, all in one platform.

😍 Enjoying it so far? TWEET NOW to share with your friends!

Hot Takes 🔥

  1. Both MSFT and OpenAI have won massively from capital injections, but Microsoft Research (MSR) has been majorly screwed over. World-class researchers trained and hired for innovative fundamental research — now relegated to “ChatGPT for X” papers because leadership literally does not allow them to do fundamental research or train large models. Such a damn tragedy and IMO a strategic mistake. ~ Ted Xiao

  2. GPT 5 will announce itself when it’s ready. ~ Alfie Whattam

That’s all for today!

See you tomorrow with more such AI-filled content. Don’t forget to subscribe and give your feedback below 👇

Real-time AI Updates 🚨

⚡️ Follow me on Twitter @Saboo_Shubham for lightning-fast AI updates and never miss what’s trending!!

PS: I curate this AI newsletter every day for FREE, and your support is what keeps me going. If you find value in what you read, share it with your friends by clicking the share button below!
