Google to Challenge ChatGPT's Dominance

PLUS: Multimodal Capabilities in GPT-4, Inside Microsoft's GitHub Security Lapse

Today’s top AI Highlights:

  1. Google’s Multimodal LLM Gemini Releases Soon

  2. OpenAI's Multimodal Gobi Takes on Gemini

  3. Microsoft's GitHub Breach Exposes 38 TB of Sensitive Data

  4. Enhancing LLMs for Document Q&A Beyond their Context Window

& so much more!

Read time: 3 mins

Latest Developments 🌍

Google’s Gemini to Rival ChatGPT 💪

Google is reportedly nearing the release of Gemini, its flagship AI product, and has given early access to a small group of testers.

Key Highlights:

  • Gemini is designed for multimodal processing: it can handle both images and text and generate context-sensitive responses. It is claimed to have 5x the computational power of GPT-4 and to have been trained on Google's advanced TPUv5 chips.

  • Google's access to proprietary training data from its wide-ranging portfolio, including YouTube, Google Search, Google Books, and Google Scholar, could give Gemini an edge over competitors.

  • Gemini will have applications ranging from chatbots to text summarization, content generation, coding, and more! Google plans to offer Gemini to businesses through its Google Cloud Vertex AI platform.

Google nearing Gemini AI release, reports claim

OpenAI’s Multimodal LLM Coming Soon… 👁️

OpenAI is racing to launch a multimodal LLM, codenamed Gobi, to rival Google's Gemini, which is reported to debut in the fall.

Key Highlights:

  • OpenAI aims to incorporate multimodal capabilities similar to Gemini's into GPT-4, with plans to roll out these features as GPT-Vision to a wider audience.

  • OpenAI ran preliminary tests of its multimodal functionality with Be My Eyes, a company dedicated to assisting people with visual impairments. However, these features were not made publicly available.

  • The company’s venture into multimodal models highlights the growing significance of integrating text and image comprehension within AI models.

Microsoft's AI Misstep Led to a 38 TB Data Breach ⚠️

Microsoft accidentally exposed 38 TB of data while publishing open-source AI training material on GitHub, as discovered by cloud data security startup Wiz. The exposed data included sensitive information such as backups of employees' workstations, corporate secrets, private keys, passwords, and over 30k internal Microsoft Teams messages.

Key Highlights:

  • Microsoft shared the data through an Azure SAS token, but the token was configured to share the entire storage account rather than specific files, which exposed additional private data.

  • The misconfigured token also granted "full control" permissions instead of read-only access, so an attacker could have injected malicious code into the AI models and infected users who trusted Microsoft's GitHub repository.

  • Microsoft invalidated the SAS token within two days of being notified and replaced it on GitHub about two weeks later. The company said that no customer data was exposed and that no other internal services were at risk.
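
To make the misconfiguration concrete, here is a minimal sketch of how a SAS URL could be checked before it is shared. It only inspects the standard SAS query parameters (`srt` for account-level resource types, `sp` for permissions, `se` for expiry). The function name `audit_sas_url`, the 7-day limit, and the example URLs are assumptions made for illustration; they are not Microsoft's or Wiz's tooling, and the URLs are not the real token.

```python
from datetime import datetime, timedelta, timezone
from urllib.parse import urlparse, parse_qs


def audit_sas_url(url, now=None, max_lifetime=timedelta(days=7)):
    """Return a list of warnings for a shared access signature (SAS) URL.

    Hypothetical helper: the checks and the 7-day default are illustrative.
    """
    now = now or datetime.now(timezone.utc)
    params = {k: v[0] for k, v in parse_qs(urlparse(url).query).items()}
    warnings = []

    # Account SAS tokens carry `srt` (resource types); they are not
    # limited to a single container or blob.
    if "srt" in params:
        warnings.append("account-level token: not scoped to one container or blob")

    # `sp` lists permissions as letters, e.g. r=read, l=list, w=write, d=delete.
    risky = sorted(set(params.get("sp", "")) - {"r", "l"})
    if risky:
        warnings.append("grants more than read/list: " + "".join(risky))

    # `se` is the expiry time in ISO 8601 format (UTC).
    expiry = params.get("se")
    if expiry is None:
        warnings.append("no expiry found")
    else:
        expires_at = datetime.fromisoformat(expiry.replace("Z", "+00:00"))
        if expires_at.tzinfo is None:
            expires_at = expires_at.replace(tzinfo=timezone.utc)
        if expires_at - now > max_lifetime:
            warnings.append("expiry is more than %d days away" % max_lifetime.days)
    return warnings


# Made-up example URLs (placeholder account, dates, and signature).
broad_url = ("https://example.blob.core.windows.net/models"
             "?sv=2020-08-04&ss=b&srt=sco&sp=racwdl"
             "&se=2050-01-01T00:00:00Z&sig=REDACTED")
narrow_url = ("https://example.blob.core.windows.net/models/weights.bin"
              "?sv=2020-08-04&sr=b&sp=r&se=2023-09-19T00:00:00Z&sig=REDACTED")

check_time = datetime(2023, 9, 18, tzinfo=timezone.utc)
for warning in audit_sas_url(broad_url, now=check_time):
    print("WARNING:", warning)
```

With these made-up inputs, the account-wide, read-write, long-lived URL triggers all three warnings, while the single-blob, read-only, one-day URL triggers none.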

Steering LLMs towards Structured Document Question Answering 📜

LLMs struggle with Q&A over documents that exceed their context window. Current solutions represent structured documents such as PDFs, web pages, and presentations as plain text, which discards their inherent structure and makes it hard to answer questions that depend on it. Researchers at Adobe introduce PDFTriage to address the issue.

Key Highlights:

  • PDFTriage gives models both structural metadata about the document and retrieval functions, so they can pull in context based on either structure or content.

  • PDFTriage-augmented models outperform existing retrieval-augmented LLMs in answering various classes of questions related to structured documents.

  • The authors have also created a benchmark dataset of over 900 human-generated questions about 80 structured documents, spanning 10 question categories, to aid research on document Q&A.
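
The idea of pairing structural metadata with retrieval functions can be sketched roughly as follows. This is a toy illustration, not PDFTriage's actual interface: the class and method names (`StructuredDoc`, `outline`, `fetch_section`, `fetch_pages`, `search`) and the sample document are all assumptions, and the keyword match stands in for whatever content retrieval a real system would use.

```python
from dataclasses import dataclass, field


@dataclass
class Section:
    title: str
    pages: list  # page numbers the section spans
    text: str


@dataclass
class StructuredDoc:
    sections: list = field(default_factory=list)

    def outline(self):
        """Structural metadata shown to the model up front (no body text)."""
        return [{"title": s.title, "pages": s.pages} for s in self.sections]

    def fetch_section(self, title):
        """Structure-based retrieval: text of the section with this title."""
        return [s.text for s in self.sections if s.title.lower() == title.lower()]

    def fetch_pages(self, pages):
        """Structure-based retrieval: text of sections touching these pages."""
        wanted = set(pages)
        return [s.text for s in self.sections if wanted & set(s.pages)]

    def search(self, query):
        """Content-based retrieval: naive keyword overlap with section text."""
        words = set(query.lower().split())
        return [s.text for s in self.sections if words & set(s.text.lower().split())]


# Made-up document for illustration.
doc = StructuredDoc([
    Section("Introduction", [1], "This report covers quarterly revenue."),
    Section("Results", [2, 3], "Revenue grew in every region."),
    Section("Appendix", [4], "Table of regional figures."),
])

print(doc.outline())         # the model sees the structure first
print(doc.fetch_pages([3]))  # then asks for only the context it needs
```

In this sketch, a question like "what does page 3 say?" would be served by `fetch_pages`, while an open-ended question would fall back to `search`; the model never needs the whole document in its context window.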

Tools of the Trade ⚒️

  • Dualite: A Figma plugin to convert dynamic designs into responsive code, saving time and accelerating product development 5-10x while maintaining code quality.

  • Doctopus: Simplifies Node.js documentation by aggregating dependencies' docs into one dashboard and enhancing search and interaction through AI.

  • Klu: AI productivity tool that seamlessly searches, understands, and engages with data across various apps like Gmail, Notion, Drive, Trello, Slack, and more.

  • Epsilon: AI search engine designed for academic research providing instant answers with up-to-date citations from academic literature.

  • Pentest Copilot: AI-powered ethical hacking assistant that streamlines penetration testing by providing contextual analysis and automating tasks.

Hot Takes 🔥

  • There is always an exponential decay of interest/understanding of any new technology from its epicenters of development. But I feel nowhere is that more acutely visible than with AI. ~ Bojan Tunguz

  • It’s 2023, the golden age of AI, and yet the only way to search for more than one word in VS Code is using a regex ~ Logan

That’s all for today!

See you tomorrow with more such AI-filled content. Don’t forget to subscribe and give your feedback below 👇

Real-time AI Updates 🚨

⚡️ Follow me on Twitter @Saboo_Shubham for lightning-fast AI updates and never miss what’s trending!!

PS: I curate this AI newsletter every day for FREE, your support is what keeps me going. If you find value in what you read, share it with your friends by clicking the share button below!
