ChatGPT Pro for $200/month

PLUS: OpenAI o1 comes out of preview, All-in-one open-source RAG solution

In partnership with

Today’s top AI Highlights:

  1. All-in-one solution for RAG with production-ready features

  2. OpenAI launches stable o1 with image support and a $200 Pro subscription

  3. 4-bit quantization to shrink LLMs by 75% without accuracy loss (100% free)

  4. Build and deploy voice AI agents with Eleven Labs’ new platform

  5. All-in-one LLMOps platform with a visual interface powered by DSPy

& so much more!

Read time: 3 mins

AI Tutorials

Traditional chatbots can either draw on general knowledge or search through specific documents - but they rarely do both well. Modern applications need to intelligently combine document search with language model capabilities. Enter Hybrid Search RAG, a powerful approach that combines the best of both worlds.

In this tutorial, we'll build a sophisticated document Q&A system that seamlessly combines document-specific knowledge with Claude's general intelligence to deliver accurate and contextual responses. It:

  • Allows users to upload PDF files

  • Automatically creates text chunks and embeddings

  • Uses Hybrid Search to find relevant information in documents

  • Uses Claude for high-quality responses

  • Falls back to Claude's general knowledge when needed

  • Provides an intuitive chat interface
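The hybrid search step above can be sketched in a few lines of plain Python. This is an illustrative toy, not the tutorial's actual implementation: the helper names (`keyword_score`, `vector_score`, `hybrid_search`) and the tiny example corpus are our own, and a real system would use BM25 plus a proper embedding model.

```python
import math

def keyword_score(query, doc):
    # Fraction of query terms that appear in the document (toy BM25 stand-in).
    q_terms = set(query.lower().split())
    d_terms = set(doc.lower().split())
    return len(q_terms & d_terms) / max(len(q_terms), 1)

def vector_score(q_vec, d_vec):
    # Cosine similarity between (pre-computed) embedding vectors.
    dot = sum(a * b for a, b in zip(q_vec, d_vec))
    norm = math.sqrt(sum(a * a for a in q_vec)) * math.sqrt(sum(b * b for b in d_vec))
    return dot / norm if norm else 0.0

def hybrid_search(query, q_vec, docs, alpha=0.5):
    # Blend lexical and semantic scores; alpha weights the vector side.
    scored = []
    for text, emb in docs:
        score = alpha * vector_score(q_vec, emb) + (1 - alpha) * keyword_score(query, text)
        scored.append((score, text))
    return [text for _, text in sorted(scored, reverse=True)]
```

The fallback behavior from the bullet list maps naturally onto this: if the top hybrid score is below a threshold, the app skips document context and lets Claude answer from general knowledge instead.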

We share hands-on tutorials like this 2-3 times a week, designed to help you stay ahead in the world of AI. If you're serious about leveling up your AI skills and staying ahead of the curve, subscribe now and be the first to access our latest tutorials.

Don’t forget to share this newsletter on your social channels and tag Unwind AI (X, LinkedIn, Threads, Facebook) to support us!

Latest Developments

SciPhi’s new R2R ("RAG to Riches") open-source RAG system takes the complexity out of building RAG applications by packaging everything you need into a single, well-structured system. It gives you powerful production-ready features like multimodal content ingestion, hybrid search, configurable GraphRAG, and user and document management through a clean RESTful API.

Whether you're prototyping locally or deploying to production, R2R's Docker-based setup lets you focus on building features instead of wrestling with infrastructure.

Key Highlights:

  1. Document Processing - Ingest multiple file formats (.txt, .pdf, .json, .png, .mp3) and let R2R handle the rest - from chunking and embeddings to entity extraction and knowledge graph creation. No need to manually wire together different processing components.

  2. Production-ready - Essential features like user authentication, role management, and document versioning are built right in. Comprehensive logging and monitoring help you track system performance and troubleshoot issues effectively.

  3. Deploy Your Way - Choose between lightweight and full installations based on your needs. Start locally for development, move to containers for scaling, or deploy to the cloud - your code works the same way everywhere.

  4. Tune Your RAG Pipeline - Fine-tune retrieval and generation by adjusting embedding models, search strategies, and LLM settings. The built-in hybrid search and knowledge graph capabilities help you build more accurate and context-aware applications.

  5. Get Started - Get up and running with just a few commands using Docker, with clear documentation to guide you through the setup. Integration with your existing codebase is straightforward through Python and JavaScript SDKs.

Save 1 hour every day with Fyxer AI

Fyxer AI automates daily email and meeting tasks through:

  • Email Organization: Fyxer puts your email into folders so you read the important ones first.

  • Automated Email Drafting: Drafts replies as if they were written by you; convincing, concise and with perfect spelling in every language.

  • Meeting Notes: Stay focused in meetings while Fyxer takes notes, writes summaries and drafts follow-up emails.

Fyxer AI is even adaptable to teams!

Setting up Fyxer AI takes just 30 seconds with Gmail or Outlook.

OpenAI's latest model o1 is now out of preview, bringing noticeably faster processing and better reasoning abilities than the preview model. The model is also now multimodal, handling images natively.

OpenAI has also launched a new tier, ChatGPT Pro, at $200/month, giving users unlimited access to all models, including a Pro-only version of o1 that further improves o1's reliability on the hardest problems.

Key Highlights:

  1. o1 Stable Release - The production version delivers 34% fewer errors, 2x faster responses, more reliability, and enhanced reasoning capabilities in coding, math, and science questions. o1 can also now natively handle both text and images.

  2. ChatGPT Pro Launch - The new $200/month subscription tier offers unlimited access to OpenAI’s full line-up including o1, o1-mini, GPT-4o, and Advanced Voice. It also includes o1 pro mode, a version of o1 that uses more compute to think harder and provide even better answers to the hardest problems.

  3. o1 Pro Mode - Exclusive to Pro subscribers, this version is optimized for extended reasoning chains with around 80% reliability in math, coding, and PhD-level science tasks requiring complex reasoning. Since answers will take longer to generate, ChatGPT will display a progress bar and send an in-app notification if you switch away to another conversation.

  4. Availability - o1 is now available to Pro, Plus, and Team users in place of the o1-preview model. Enterprise and Edu users will get access in one week. The Pro tier with o1 Pro mode is also available starting today.

  5. Upcoming releases - OpenAI is working on adding support for tools like web browsing and file upload into OpenAI o1 in ChatGPT. They are also working on making o1 available in the API with support for function calling, developer messages, Structured Outputs, and vision.

What do you think about the Pro tier pricing? Is $200/month justified on the path toward AGI, or is it just insanely expensive, setting a precedent for other AI companies to charge similarly high fees? Let us know in the comments below!

Quick Bites

Unsloth introduced a dynamic 4-bit quantization technique that shrinks LLMs to just 25% of their original size while preserving accuracy. Building on Bitsandbytes 4-bit, it selectively avoids quantizing certain error-sensitive parameters, boosting accuracy over standard 4-bit methods while using only ~10% more VRAM.

  • The proof: Llama 3.2 11B Vision now runs on just 7.23GB instead of 20GB with full accuracy intact.

  • The pre-quantized models are available on Hugging Face under the 'unsloth-bnb-4bit' tag.

  • Accompanying Colab notebooks for fine-tuning are also available, completely free.
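To make the idea concrete, here is a minimal sketch of blockwise absmax 4-bit quantization with the "dynamic" twist of leaving sensitive tensors in full precision. This is our own toy illustration, not Unsloth's or Bitsandbytes' actual code (which uses NF4 codebooks and fused GPU kernels); all function names here are hypothetical.

```python
def quantize_4bit(weights, block_size=4):
    # Blockwise absmax quantization: each block stores one float scale
    # plus integer codes in [-7, 7] (15 of the 16 possible 4-bit levels).
    blocks = []
    for i in range(0, len(weights), block_size):
        block = weights[i:i + block_size]
        scale = max(abs(w) for w in block) / 7 or 1e-12
        codes = [round(w / scale) for w in block]
        blocks.append((scale, codes))
    return blocks

def dequantize(blocks):
    # Reconstruct approximate weights: code * per-block scale.
    return [c * scale for scale, codes in blocks for c in codes]

def quantize_model(named_tensors, skip):
    # The "dynamic" part: tensors flagged as error-sensitive are kept in
    # full precision instead of being quantized, trading a little extra
    # memory for accuracy.
    return {name: (w if name in skip else quantize_4bit(w))
            for name, w in named_tensors.items()}
```

Skipping even a handful of outlier-heavy tensors is what costs the extra ~10% VRAM over plain 4-bit while recovering most of the lost accuracy.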

OpenAI has launched a Usage API that lets you programmatically monitor your API consumption, token usage, and costs across different time intervals. The new API offers granular filtering by API key, project ID, user ID, and model, and also provides a Costs endpoint for tracking daily spend, making it easier for teams to manage their API resources and budgets effectively.
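A usage query might look like the sketch below. Treat the endpoint paths and parameter names (`start_time`, `bucket_width`, `models`) as assumptions based on the launch announcement; check OpenAI's API reference before relying on them, and note these endpoints require an organization admin key.

```python
import time
import urllib.parse
import urllib.request

USAGE_URL = "https://api.openai.com/v1/organization/usage/completions"
COSTS_URL = "https://api.openai.com/v1/organization/costs"

def build_usage_request(api_key, days=7, bucket_width="1d", models=None):
    # Build a GET request for completion usage over the last `days`,
    # bucketed per day; optionally filter to specific model names.
    params = {"start_time": int(time.time()) - days * 86400,
              "bucket_width": bucket_width}
    if models:
        params["models"] = models
    url = USAGE_URL + "?" + urllib.parse.urlencode(params, doseq=True)
    return urllib.request.Request(url, headers={"Authorization": f"Bearer {api_key}"})

# Usage (needs a real admin key and network access):
# req = build_usage_request("sk-admin-...", days=30, models=["gpt-4o"])
# with urllib.request.urlopen(req) as resp:
#     print(resp.read())
```

Swapping `USAGE_URL` for `COSTS_URL` with the same pattern would query daily spend instead of token usage.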

Eleven Labs has launched a Conversational AI platform to build and deploy voice AI agents that can speak realistically across web, mobile, and phone systems in real-time. You create these agents by selecting or creating custom voices, integrating preferred LLMs (GPT, Claude, Gemini, or custom servers), and defining the agent's persona and knowledge base.

Integration is straightforward using ElevenLabs' WebSocket API, React, JavaScript, Python, and Swift/iOS SDKs, with pricing starting at $0.015/minute at scale, plus LLM costs.

Tools of the Trade

  1. LangWatch: All-in-one open-source LLMOps platform built on the DSPy framework that provides a visual interface for monitoring and optimizing LLM pipelines. It offers features like drag-and-drop pipeline construction, automated prompt generation, performance monitoring, and built-in evaluation metrics.

  2. Waveloom: A visual platform to build and deploy AI workflows, connecting various AI services (like LLMs, image models, and data storage) without infrastructure code. It provides a unified SDK and dashboard for orchestrating, monitoring, and managing these workflows.

  3. Marker: Converts PDFs to markdown, JSON, and HTML using a pipeline of deep learning models. It's fast, accurate, and can handle various document types, languages, and formatting elements like tables and code blocks.

  4. Awesome LLM Apps: Build awesome LLM apps with RAG, AI agents, and more to interact with data sources like GitHub, Gmail, PDFs, and YouTube videos, and automate complex work.

Hot Takes

  1. I think Amazon is making a strategic error making it so hard for non-technical people to access their models.
    IT departments are not where high value AI use cases are discovered, it is where experts play with a chatbot or tool & see what it does. Experimenting with Nova is hard. ~
    Ethan Mollick


  2. I think I'm speaking for most AI nerds when I say: if OpenAI doesn't release GPT-4.5 or demo GPT-5, this 12-day day event will be a huge waste of time and could really screw OpenAIs public perception.
    We already know that Sora, o1, price cuts and AVM are coming, because you promised us months ago.
    Give us a christmas surprise or pack your shit. If you make all this hype and don't deliver, I think as an investor I would rather bet my chips on Claude, Gemini, Grok, DeepSeek and Qwen. ~
    Lisan al Gaib

That’s all for today! See you tomorrow with more such AI-filled content.

Don’t forget to share this newsletter on your social channels and tag Unwind AI to support us!

Unwind AI - X | LinkedIn | Threads | Facebook

PS: We curate this AI newsletter every day for FREE, your support is what keeps us going. If you find value in what you read, share it with at least one, two (or 20) of your friends 😉 
