
AI Weekly Recap: Top 10 AI Breakthroughs

From Sept 8 - Sept 14, 2024

Discover the top 10 AI breakthroughs from Sept 8-14, 2024. From Apple’s AI-powered iPhone 16 to OpenAI’s new o1 models and Phind’s lightning-fast AI search engine, this week was packed with innovation!

AI search engine Phind has launched its new flagship model, Phind-405B, built on Meta Llama 3.1 405B and excelling at programming and technical tasks. Phind-405B supports a 128K-token context and scores 92% on HumanEval (0-shot), on par with Claude 3.5 Sonnet.
Alongside it, Phind also released a new Phind Instant model, based on Meta Llama 3.1 8B, for significantly faster AI-powered search. It runs at speeds of up to 350 tokens per second, powered by a customized NVIDIA TensorRT-LLM inference server.
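To get a feel for what 350 tokens per second means in practice, here’s a quick back-of-the-envelope calculation (a rough sketch only; real latency also depends on prompt processing and network overhead):

```python
# Rough estimate of streaming time at Phind Instant's quoted speed.
TOKENS_PER_SECOND = 350  # Phind's stated throughput for Phind Instant

def generation_seconds(num_tokens: int, tps: float = TOKENS_PER_SECOND) -> float:
    """Seconds to stream num_tokens at a steady tps rate."""
    return num_tokens / tps

# A ~1,000-token answer streams in under 3 seconds at this rate.
print(round(generation_seconds(1000), 2))  # → 2.86
```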

Apple unveiled its new series of Apple Watches, AirPods, and iPhones. The new iPhone 16 lineup is built from the ground up for Apple Intelligence. iPhone 16 and 16 Plus are powered by the all-new A18 chip, while the iPhone 16 Pro and Pro Max run the new A18 Pro. Both chips feature an enhanced CPU, GPU, and Neural Engine to power generative AI features. Apple Intelligence will roll out in beta as an iOS update next month.

The latest VS Code release brings powerful GitHub Copilot updates for enhancing developer workflows with AI assistance. This includes improvements to inline chat, test generation, and chat history. There are also a bunch of experimental features to check out, like starting debugging sessions directly from the chat and setting custom instructions for Copilot's code generation.

SambaNova launched SambaNova Cloud, an AI inference service built for speed and scale. You can tap into the service through a readily available API and build with Llama 3.1 models, including the massive 405B-parameter version. The big deal? SambaNova Cloud is optimized to run these models remarkably fast, topping current inference benchmarks, and it delivers full-precision inference, with no compromises on accuracy even for the largest models.

Mistral AI has released its first-ever multimodal model, Pixtral 12B, which can process both text and images. It is built on their existing Mistral Nemo 12B language model. The 24GB model is freely available for download and fine-tuning under the permissive Apache 2.0 license, and it boasts a 128K-token context window for taking in substantial text and visual information.

The first AI agent that conducts entire scientific literature reviews on its own! Non-profit AI research organization FutureHouse introduced PaperQA2, an AI agent that can conduct scientific literature reviews autonomously. It is the first AI agent that surpasses PhD and postdoc-level biology researchers in various literature research tasks, as measured by objective benchmarks and assessments by human experts. The code for PaperQA2 is open-sourced, along with a detailed research paper outlining its development.

Adobe is introducing the Firefly Video Model, bringing generative AI to video editing, which will be available in beta later this year. The model offers tools to help editors generate b-roll, fill gaps, and create visual elements with simple text prompts, speeding up the creative process.

The rumors were true! OpenAI has finally released its highly anticipated Strawberry models, officially the o1 model series. These models, o1-preview and o1-mini, are designed to spend more time thinking before they respond, letting them reason through complex tasks and solve harder problems than previous models in science, coding, and math.

o1-mini is a cost-efficient reasoning model that nearly matches o1’s performance in math and coding, but is much faster and cheaper. It is a good fit for applications that require reasoning without broad world knowledge.
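As a minimal sketch of how the new models are reached through OpenAI’s Chat Completions API, the helper below just builds a request body (the prompt text is a made-up example; at launch the o1 series reportedly accepts a plain user message without sampling parameters such as temperature):

```python
# Sketch: building a Chat Completions request body for the o1 models.
# The o1 series reportedly takes a single user message at launch,
# with no system prompt or temperature setting.

def build_o1_request(prompt: str, model: str = "o1-mini") -> dict:
    """Return a request body for OpenAI's Chat Completions endpoint."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }

# Usage with the official SDK (requires a valid OPENAI_API_KEY):
#   from openai import OpenAI
#   client = OpenAI()
#   resp = client.chat.completions.create(**build_o1_request("Plan a proof sketch."))
#   print(resp.choices[0].message.content)
print(build_o1_request("How many primes are below 100?")["model"])  # → o1-mini
```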

Hallucinations in LLMs are a major roadblock to their adoption. While RAG helps ground an AI’s responses, its effectiveness hinges on the quality of the retrieval process. Google has released DataGemma, the first set of open models built to tackle this hallucination issue. These models leverage Google's Data Commons, a massive repository of over 240 billion real-world data points, using both Retrieval-Augmented Generation (RAG) and Retrieval-Interleaved Generation (RIG) to ground LLM responses in verifiable information.
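To illustrate the idea behind RIG, here is a toy sketch: instead of letting the model state a statistic from memory, its draft answer is interleaved with lookups against a trusted store. The `FACTS` table and the bracketed `[LOOKUP: ...]` syntax are invented for illustration and are not DataGemma’s or Data Commons’ actual interface:

```python
import re

# Toy stand-in for a verified data store like Data Commons.
FACTS = {
    "population of France 2023": "68 million",
    "CO2 emissions of Germany 2022": "666 million tonnes",
}

def resolve_rig_draft(draft: str, store: dict = FACTS) -> str:
    """Replace [LOOKUP: query] placeholders in a model's draft with
    values from a trusted store, flagging queries it cannot verify."""
    def lookup(match: re.Match) -> str:
        query = match.group(1).strip()
        return store.get(query, f"[UNVERIFIED: {query}]")
    return re.sub(r"\[LOOKUP:([^\]]+)\]", lookup, draft)

draft = "France had a population of [LOOKUP: population of France 2023]."
print(resolve_rig_draft(draft))
# → France had a population of 68 million.
```

The key design point is that numbers come from the store, not the model: anything the store cannot confirm stays visibly flagged rather than silently guessed.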

vLLM is an open-source library for accelerating LLM inference, especially for models in the Llama family. The latest release, vLLM v0.6.0, delivers significant performance gains for Llama 3.1 inference. It addresses key bottlenecks in the previous version, yielding up to 2.7x higher throughput and 5x faster token generation times for Llama 3.1 8B, with comparable gains for Llama 3.1 70B. vLLM 0.6.0 is now a top performance contender among LLM inference engines.

Which of the above AI developments are you most excited about, and why?

Tell us in the comments below ⬇️

That’s all for today 👋

Stay tuned for another week of innovation and discovery as AI continues to evolve at a staggering pace. Don’t miss out on the developments – join us next week for more insights into the AI revolution!

Click on the subscribe button and be part of the future, today!

📣 Spread the Word: Think your friends and colleagues should be in the know? Share Unwind AI and let them join this exciting adventure into the world of AI. Sharing knowledge is the first step towards innovation!

🔗 Stay Connected: Follow us for updates, sneak peeks, and more. Your journey into the future of AI starts here!

Shubham Saboo - Twitter | LinkedIn

Unwind AI - Twitter | LinkedIn | Instagram | Facebook
