• unwind ai
  • Posts
  • Mistral AI's Open Source Models for Math and Coding

Mistral AI's Open Source Models for Math and Coding

Mistral AI's Open Source Models for Math and Coding

Today’s top AI Highlights:

  1. Microsoft’s SpreadsheetLLM helps LLMs to process spreadsheets efficiently

  2. Mistral AI releases two models specializing in code generation and math

  3. Andrej Karpathy launches an AI + education company called Eureka Labs

  4. AI software engineer using Claude 3.5 Sonnet with agentic capabilities

& so much more!

Read time: 5 mins

Latest Developments 🌍

Spreadsheets are the fodder for data science but pose unique challenges in LLM processing. Their expansive grids usually exceed the token limits of LLMs, and the 2D layouts are not suitable for LLM inputs. Furthermore, spreadsheet-specific features such as cell addresses and formats complicate it more LLMs.

Microsoft has introduced SpreadsheetLLM to make spreadsheets more manageable for LLMs. This includes SheetCompressor, which compresses spreadsheets to reduce token usage by a staggering 96%. This tool helps LLMs understand spreadsheet layouts better and improves their performance in various tasks.

Key Highlights:

  1. Efficient Encoding - SheetCompressor compresses spreadsheet data to fit within LLMs’ token limits. It uses structural anchors to identify key rows and columns, removing redundant data to create a condensed spreadsheet.

  2. Improved Performance: It significantly improves performance in spreadsheet table detection tasks, outperforming the vanilla approach by 25.6% in GPT4’s in-context learning setting.

  3. Reduced Token Usage: SheetCompressor optimizes token usage by employing an inverted-index translation method that indexes non-empty cell texts and merges identical values, reducing tokens needed by 96%.

  4. Enhanced Accuracy: Fine-tuned LLMs with SheetCompressor achieve a compression ratio of 25x and a state-of-the-art 78.9% F1 score, surpassing previous best models by 12.3%.

Why it matters: Professions like data science and finance rely heavily on spreadsheet applications like Sheets and Excel. However, AI assistance in these applications is limited to generating formulas and charts, mostly for spreadsheets with a few cells. This compression method paves way for mainstream use of LLMs in data analysis tasks to handle larger and more complex spreadsheets. 

AI Platform for Surfacing Real Customer Pains to Prevent Churn

To create a successful business, understanding customer feedback is the most crucial yet challenging part. Traditional metrics like NPS and CSAT don’t provide the full picture as they are too simplistic and cannot capture the nuances of customer sentiment.

Here is Syncly (YC W23), the AI-powered customer feedback analysis tool to help you get deep insights into your CX. It uses AI to analyze everyday customer interactions and give a complete view of customer sentiment. Syncly seamlessly integrates with multiple communication platforms, automatically categorizing feedback and providing actionable insights.

  1. Seamless integration - Syncly can integrate seamlessly with your favorite tools (Zendesk, Gorgias, Intercom, Gong, Front, etc.)

  2. Holistic Visibility - With Smart Insight, identify the most critical issues that hinder customer experience without having to read all customer interactions across the user journey.

  3. Actionable Insights - You can Ask Syncly to pinpoint the positive and negative sentiment drivers and get instant action items to elevate customer experience.

  4. Sentiment Monitoring at Every Stage - With Customer AI, you can monitor sentiment trends by account and by individual user, to proactively engage with customers with underlying issues before they become at-risk.

  5. Cross-Functional Collaboration - The platform provides an easy way to create and share relevant analyses, charts, graphs, and action items across the organization, breaking down silos and promoting data-driven decision-making.

New Specialized Models by Mistral AI 🤖

Mistral AI has released two new specialized models: Codestral Mamba which specializes in code generation, and Mathstral designed for math reasoning and scientific discovery.

A Mamba2 language model specialized in code generation. Mamba models offer linear time inference so you can engage with the model extensively with quick responses, irrespective of the input length.

  1. It outperforms other 7B models on various coding benchmarks and even competes with larger models including Mistral’s Codestral 22B and Meta’s CodeLlama 34B.

  2. You can deploy Codestral Mamba using the mistral-inference SDK, or through TensorRT-LLM. For local inference, it should soon be available on llama.cpp. Raw weights are available on HuggingFace.

This is Mistral’s first Mathstral model with 7B parameters, designed for advanced mathematical problems requiring complex, multi-step logical reasoning. The model has a 32k context window published under the Apache 2.0 license.

  1. Based on Mistral 7B model, it specializes in STEM subjects. It achieves SOTA reasoning capacities in its size category across various benchmarks. It scored 56.6% on MATH and 63.47% on MMLU.

  2. Mathstral scored 68.37% on MATH with majority voting and 74.59% using a strong reward model among 64 candidates, demonstrating significant improvement with more inference-time computation.

  3. Mathstral can be fine-tuned using mistral-finetune, with weights available on HuggingFace.

Quick Bites 🤌

  1. Andrej Karpathy is starting an AI + Education company called Eureka Labs with the goal of creating an AI-native school. Partnering with subject matter experts, the first product will be LLM101n, an undergraduate-level course guiding students to train their own AI. (Source)

  2. Tech giants Apple, Nvidia, and Anthropic have used over 173,000 YouTube videos to train AI models without creators’ consent, according to an investigation by Proof News. These include popular educational and creators channels like Marques Brownlee and Khan Academy. (Source)

  3. YouTube is rolling out a new feature in YT Music called “sound search” which is an advanced version of Shazam. It lets you hum, sing, or play audio to search and recognize the music.

    YouTube is also testing an “AI-generated conversational radio” in the US for Premium users. It lets you create a custom radio by “describing exactly what you want to hear.” (Source)

  4. Anthropic has doubled the max output token limit for Claude 3.5 Sonnet from 4096 to 8192 in the Anthropic API. Just add the header “anthropic-beta”: “max-tokens-3-5-sonnet-2024-07-15” to your API calls. (Source)

  5. The UK’s Competition and Markets Authority (CMA) is launching an antitrust probe into Microsoft’s investment in Inflection AI, focused mostly on hiring Inflection’s staff. (Source)

😍 Enjoying so far, share it with your friends!

Tools of the Trade ⚒️

  1. Claude Engineer 2.0: Claude Engineer with AI agents is a CLI tool that combines the capabilities of Claude 3.5 Sonnet with practical file system operations, web search, vision support and intelligent code execution to build apps. It can do the following and so much more:

    Create a new Python project structure for a web application

    Explain the code in file.py and suggest improvements

    Search for the latest best practices in React development

    Help me debug this error: [paste your error message]

    Analyze this image and describe its contents

  1. PeachML: AI marketplace where you can create a model, deploy it for free, and earn money whenever others use it. Train in your preferred environment, deploy with a click, and get paid per inference used.

  2. Financial Datasets: This stock market API provides 30+ years of financial data for all S&P 500 companies, including income statements, balance sheets, and cash flow statements. Connect your AI financial agents for analysis, portfolio management, or trading with no API limits during the open beta.

  3. Awesome LLM Apps: Build awesome LLM apps using RAG for interacting with data sources like GitHub, Gmail, PDFs, and YouTube videos through simple texts. These apps will let you retrieve information, engage in chat, and extract insights directly from content on these platforms.

Hot Takes 🔥

  1. ….I wished for tech to look like "a thriving coral reef" ecosystem but sometimes it feels more like mostly plankton, a few clown fish, two tunas, and 5 killer whales circling above. ~
    Andrej Karpathy

  2. LLMs attracted way too people who don't understand the basics of building systems with non-deterministic components to machine learning. ~
    Jaana Dogan

Meme of the Day 🤡

That’s all for today! See you tomorrow with more such AI-filled content.

Real-time AI Updates 🚨

⚡️ Follow me on Twitter @Saboo_Shubham for lightning-fast AI updates and never miss what’s trending!

PS: We curate this AI newsletter every day for FREE, your support is what keeps me going. If you find value in what you read, share it with your friends by clicking the share button below!

Reply

or to participate.