
Run 1000+ Vision Models with a Single API

PLUS: Google's text-to-music AI DJ, Serve AI models at lightning speed

Today’s top AI Highlights:

  1. Serve any AI model fast with this high-performance AI serving engine

  2. Run inference on 1000+ vision models with a single Python library

  3. Google’s text-to-music tool will turn you into a DJ without any setup

  4. iOS 18.1 with Apple Intelligence is now generally available

  5. Manage AI/ML projects with versioned Artifacts; built for DevOps

& so much more!

Read time: 3 mins

AI Tutorials

Retrieval-Augmented Generation (RAG) is becoming a game-changer for applications that need accurate information from large datasets. As developers, we know the value of building tools that can search documents and provide relevant answers quickly. Today, we’ll take that one step further.

In this tutorial, we’ll walk you through building a production-ready RAG service using Claude 3.5 Sonnet and Ragie.ai, integrated into a clean, user-friendly Streamlit interface. With less than 50 lines of Python code, you’ll create a system that retrieves and queries documents—ready for real-world use.

What is Ragie.ai?

Ragie.ai is a fully managed RAG-as-a-Service for developers. It offers connectors for services like Google Drive, Notion, and Confluence, along with APIs for document upload and retrieval. It handles the entire pipeline—from chunking to hybrid keyword and semantic searches—so you can start with minimal setup.
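The retrieve-then-generate flow behind a service like this can be sketched in a few lines. Note the retriever and generator below are stubs for illustration only: the stub keyword scorer stands in for Ragie's hybrid keyword and semantic retrieval, and the prompt would be sent to Claude 3.5 Sonnet in the real tutorial.

```python
# Minimal retrieve-then-generate sketch. The retriever here is a stub for
# Ragie's hybrid retrieval; the assembled prompt would go to Claude.

def retrieve_chunks(query: str, index: dict[str, str], top_k: int = 2) -> list[str]:
    """Score documents by naive keyword overlap with the query
    (stand-in for hybrid keyword + semantic search)."""
    terms = set(query.lower().split())
    scored = [
        (len(terms & set(text.lower().split())), text)
        for text in index.values()
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [text for score, text in scored[:top_k] if score > 0]

def build_prompt(query: str, chunks: list[str]) -> str:
    """Assemble the grounded prompt the LLM call would receive."""
    context = "\n".join(f"- {c}" for c in chunks)
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

index = {
    "doc1": "Ragie handles chunking and hybrid search",
    "doc2": "Streamlit builds the user interface",
}
query = "what handles chunking"
prompt = build_prompt(query, retrieve_chunks(query, index))
```

The full tutorial replaces the stub with Ragie's retrieval API and passes the prompt to Claude, but the shape of the pipeline is the same.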

We share hands-on tutorials like this 2-3 times a week, designed to help you stay ahead in the world of AI. If you're serious about levelling up your AI skills and staying ahead of the curve, subscribe now and be the first to access our latest tutorials.

🎁 Bonus worth $50 💵

Share this newsletter on your social channels and tag Unwind AI (X, LinkedIn, Threads, Facebook) to get an AI resource pack worth $50 for FREE. Valid for a limited time only!

Latest Developments

LitServe is an easy-to-use, flexible serving engine for AI models built on FastAPI. It is optimized for AI-specific workloads and benchmarks at least twice as fast as a plain FastAPI implementation.

Unlike traditional servers, LitServe reduces the overhead of creating a separate server for every model by letting developers manage multiple models seamlessly. It offers flexible support for various AI models, from LLMs to traditional machine learning, across diverse frameworks like PyTorch, JAX, and TensorFlow.

Key Highlights:

  1. Simplified API - Define model serving logic with a clean, Pythonic API (LitAPI) that handles setup, request decoding, prediction, and response encoding. LitServer takes care of the underlying optimizations like batching and streaming.

  2. Performance Boost - Leverage dynamic batching, GPU autoscaling, and multi-worker support for significantly improved throughput. Fine-tune performance by adjusting batch size, worker count, and precision settings.

  3. Flexible Deployment - Choose between self-hosting on your own infrastructure or a fully managed deployment on Lightning Studios. Studios offers autoscaling, monitoring, and other enterprise-grade features.

  4. Extensible Architecture - Customize the serving pipeline with middleware and callbacks, integrating custom logic, database connections, and other services seamlessly.

  5. Get Started - Install LitServe with pip install litserve and explore the documentation to start deploying your models efficiently.

Working with multiple computer vision models can be a hassle, constantly requiring you to learn new frameworks and APIs. Imagine seamlessly switching between different models without rewriting your core inference logic. That's precisely what x.infer, a new Python library, enables.

It provides a unified interface for running inference with a wide range of computer vision models from frameworks like Transformers, vLLM, and Ultralytics, removing the need to learn the intricacies of each one. x.infer supports tasks such as image classification, object detection, and image-to-text, and gives access to over 1,000 pre-trained models.

Key Highlights:

  1. Unified Interface - Write your inference code once and use it across various frameworks, including Transformers, TIMM, Ultralytics, vLLM, and Ollama, drastically simplifying model prototyping and comparison.

  2. Easy Model Integration - Extend x.infer with new models by implementing a straightforward interface, fostering code reusability and minimizing the time spent adapting to new models.

  3. Gradio Integration - Visualize results and build interactive demos effortlessly using x.infer's integrated Gradio support. This facilitates quick model testing and effective demonstration of your work.

  4. Batch Inference - Leverage the infer_batch function for efficient processing of multiple images and prompts simultaneously, a critical capability for real-world applications and comprehensive performance evaluation.
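The unified-interface idea can be sketched as a registry of backend adapters that all expose the same `infer` signature, so calling code never changes when the framework does. The names below are illustrative stand-ins, not x.infer's actual internals:

```python
# Sketch of the adapter-registry pattern behind a unified inference API.
# Each backend registers a class with a common infer() signature; the
# classes here are fakes standing in for real framework wrappers.
from typing import Callable

_REGISTRY: dict[str, Callable[[], "BaseModel"]] = {}

def register(name: str):
    def wrap(factory):
        _REGISTRY[name] = factory
        return factory
    return wrap

class BaseModel:
    def infer(self, image, prompt=None):
        raise NotImplementedError

@register("timm/fake-classifier")
class FakeClassifier(BaseModel):
    def infer(self, image, prompt=None):
        return f"class for {image}"

@register("vllm/fake-captioner")
class FakeCaptioner(BaseModel):
    def infer(self, image, prompt=None):
        return f"caption for {image}: {prompt}"

def create_model(name: str) -> BaseModel:
    # One entry point regardless of which framework backs the model.
    return _REGISTRY[name]()

def infer_batch(model: BaseModel, images, prompts):
    # Real libraries batch at the framework level for throughput;
    # a simple loop shows the interface shape.
    return [model.infer(i, p) for i, p in zip(images, prompts)]
```

Swapping `"timm/fake-classifier"` for `"vllm/fake-captioner"` changes the backend without touching any inference code, which is the portability the library is after.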

Quick Bites

Google has been releasing some really interesting generative AI tools for content creation and learning. Here’s another one we found very interesting: Google’s MusicFX DJ, a text-to-music tool that generates live music from text prompts describing genres, instruments, and vibes. Now available to try in Google Labs, the tool has a very intuitive UI for steering the soundscape.


Apple’s iOS 18.1 with new Apple Intelligence features is out of developer beta and now generally available. The first batch of Apple Intelligence features includes writing tools, image cleanup, and an enhanced Siri. Apple Intelligence is currently available only on iPhone 15 Pro models, iPads with A17 Pro or M1 chips, and Macs with M1 or later. You can enable it by going to Settings > Apple Intelligence & Siri and toggling it on.

xAI’s Grok 2 can now understand images, meaning X Premium users can share images with it and ask questions about them. Elon Musk said in a post that the feature is in its early stages and will improve rapidly. Grok will soon gain document-understanding capabilities too.

Nvidia is set to ship around a billion RISC-V cores in 2024, replacing proprietary microcontrollers across its GPUs, CPUs, and SoCs. These RISC-V cores, which manage key functions in Nvidia hardware, are now standard in virtually all of its products.

Tools of the Trade

  1. KitOps: An open-source DevOps tool that packages and versions your AI/ML model, datasets, code, and configuration into a reproducible artifact called a ModelKit. It ensures compatibility with existing tools and streamlines collaboration across teams.

  2. AutoGluon: An open-source tool that automates machine learning tasks so you can build high-accuracy models for image, text, time series, and tabular data with minimal coding.

  3. Grunty: A self-hosted desktop app that lets Claude AI control your computer to perform tasks via mouse and keyboard automation. It runs on macOS, Windows, and Linux, and requires monitoring since it’s experimental.

  4. Awesome LLM Apps: Build awesome LLM apps using RAG to interact with data sources like GitHub, Gmail, PDFs, and YouTube videos through simple text. These apps will let you retrieve information, engage in chat, and extract insights directly from content on these platforms.

Hot Takes

  1. Yes, LLMs are going to hit a wall in a year
    But it’s also true that they are already smarter than most humans
    The last mile in AI automation is not intelligence, it’s plumbing!
    ~ Bindu Reddy

  2. AGI will be able to turn all communists into good communists. ~ Bojan Tunguz

Meme of the Day

That’s all for today! See you tomorrow with more such AI-filled content.


Unwind AI - X | LinkedIn | Threads | Facebook

PS: We curate this AI newsletter every day for FREE, your support is what keeps us going. If you find value in what you read, share it with at least one, two (or 20) of your friends 😉 
