• unwind ai
  • Posts
  • Last Week in AI - A Weekly Unwind

Last Week in AI - A Weekly Unwind

From 11-Aug-2024 to 17-Aug-2024

It was yet another thrilling week in the AI field with advancements that further extend the limits of what can be achieved with AI.

Here are 10 AI breakthroughs that you can’t afford to miss 🧵👇

LLMs have proven their prowess in tackling individual coding tasks, but they often stumble when faced with the complexities of entire code repositories. Introducing CodexGraph: it transforms code repositories into graph databases. This allows LLMs to query and navigate the codebase more effectively. By representing code elements and their relationships as nodes and edges, CodexGraph provides LLMs with a structured understanding of the code.

OpenAI released their GPT-4o model’s system card, including the Advanced Voice Mode, on Thursday, detailing the safety measures they have undertaken before releasing the models. The report says the Advanced Voice Mode can imitate a user’s voice in rare instances, and they are implementing safeguards to prevent this.

Listen to this audio clip where the model outbursts “No!” then begins continuing the sentence in a similar sounding voice to the red teamer’s voice.

AI startup Cosine, focusing on understanding human reasoning, has released Genie, a state-of-the-art, fully autonomous AI software engineering colleague. Genie boasts the highest score of 30.08% on SWE-Bench, surpassing Amazon’s Q Developer and Congnition’s Devin. Cosine captured the cognitive processes of human software engineers in Genie's training data. This involves the intricate steps of reasoning, problem breakdown, and decision-making that human engineers follow in real-world development tasks.

Get the power of Postgres and an AI assistant right in your browser with postgres.new. This tool lets you spin up unlimited, free Postgres databases instantly, directly in your browser, powered by the magic of WebAssembly. It's perfect when you want to experiment with data, prototype quickly, or just want to learn SQL in a fun and engaging environment. And the best part? It's supercharged with AI for effortless data manipulation and visualization.

Google released what we’ve been waiting for OpenAI to do for ages. At the Made by Google 2024 event, Google introduced Gemini Live, a conversational mode that allows for natural back-and-forth voice interactions. Leveraging Gemini 1.5 Flash model, Gemini Live’s responses are almost real-time and the experience is very smooth. Plus, you can now choose from 10 distinct voices for Gemini. It is being rolled out to Gemini Advanced subscribers on Android.

An AI agent that can learn from its mistakes and plan ahead like a human, that's Agent Q! Developed by researchers at MultiOn and Stanford University, Agent Q is a significant leap in autonomous web navigation. This AI agent uses a unique combination of search, self-critique, and learning from experience to complete complex tasks on websites.

Agent Q explores a website like a human would, trying different actions to see what works best. It then uses AI to critique its own actions and learn from its mistakes. This allows Agent Q to improve its performance over time, even without explicit instructions.

Anthropic has released prompt caching in public beta for their Claude 3.5 Sonnet and Claude 3 Haiku models. This exciting new feature lets you store frequently used context, like long instructions or code, between API calls. This means you can give Claude more background and examples, all while drastically cutting API costs by up to 90% and latency by up to 85% for lengthy prompts.

xAI finally released beta versions of Grok-2 and Grok-2 mini. It outperforms GPT-4o, Claude 3.5 Sonnet, and Llama 3 405B in multi-turn conversation capabilities, and closely competes in MMLU, math, coding, and other standard benchmarks. The models are available for Premium subscribers on X. You can look forward to accessing these models through xAI's enterprise API later this month.

AI's ultimate promise lies in its ability to accelerate scientific research and Sakana AI’s new system is a step towards it. Sakana AI's AI Scientist automates the entire scientific research process using LLMs. This system generates research ideas, writes code, conducts experiments, analyzes results, and even writes the resulting scientific papers. Developed in collaboration with the University of Oxford and the University of British Columbia, the AI Scientist even includes an AI-powered peer review process for iterative improvement.

Mixture of AI Agents 👨‍👨‍👦‍👦

No more one-size-fits-all for AI. Arcee AI has released Arcee Swarm, a novel architecture that leverages a network of specialized AI agents instead of relying on a single LLM. This "Mixture of Agents" approach provides more accurate and efficient results, especially for complex tasks requiring domain-specific knowledge. Arcee Swarm intelligently routes your requests to the right expert AI agent, and even offers a collaborative "Ultra Mode" for tackling critical problems. It will be available for use in the coming weeks.

Which of the above AI development you are most excited about and why?

Tell us in the comments below ⬇️

That’s all for today 👋

Stay tuned for another week of innovation and discovery as AI continues to evolve at a staggering pace. Don’t miss out on the developments – join us next week for more insights into the AI revolution!

Click on the subscribe button and be part of the future, today!

📣 Spread the Word: Think your friends and colleagues should be in the know? Share Unwind AI and let them join this exciting adventure into the world of AI. Sharing knowledge is the first step towards innovation!

🔗 Stay Connected: Follow us for updates, sneak peeks, and more. Your journey into the future of AI starts here!

Shubham Saboo - Twitter | LinkedIn

Unwind AI - Twitter | LinkedIn | Instagram | Facebook

Reply

or to participate.