
One API for RAG, Search & Recommendations

PLUS: No-code AI automations and agents, NVIDIA's agent avatar with vision

Today’s top AI Highlights:

  1. All-in-one infrastructure for search, recommendations, RAG, and analytics

  2. Open-source no-code platform to build AI agents and automation

  3. The first multi-agent AI coder to build and deploy full-stack apps

  4. NVIDIA gives AI agents a face and your desktop’s access

  5. Lightweight Python library for face recognition and facial attribute analysis

& so much more!

Read time: 3 mins

AI Tutorials

Data analysis often requires complex SQL queries and deep technical knowledge, creating a barrier for many who need quick insights from their data. What if we could make data analysis as simple as having a conversation?

In this tutorial, we'll build an AI Data Analysis Agent that lets users analyze CSV and Excel files using natural language queries. Powered by GPT-4o and DuckDB, this tool translates plain English questions into SQL queries, making data analysis accessible to everyone – no SQL expertise required.

We're using Phidata, a framework specifically designed for building and orchestrating AI agents. It provides the infrastructure for agent communication, memory management, and tool integration.
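The core loop the tutorial builds (question in, schema-aware SQL out, results back) can be sketched as follows. This is a self-contained illustration, not Phidata's actual API: sqlite3 stands in for DuckDB so the example runs anywhere, and `generate_sql` is a stub for the GPT-4o call the real agent would make.

```python
import sqlite3

# Minimal sketch of the question -> SQL -> result loop. sqlite3 stands in
# for DuckDB; `generate_sql` is a stub for the LLM call.

def generate_sql(question: str, schema: str) -> str:
    # In the real agent, an LLM receives the table schema and the user's
    # question and returns a SQL string. Stubbed here for illustration.
    return "SELECT category, SUM(amount) AS total FROM sales GROUP BY category"

def answer(question: str, conn: sqlite3.Connection) -> list:
    # Collect the schema so the LLM knows what tables and columns exist
    schema = "\n".join(
        row[0] for row in conn.execute(
            "SELECT sql FROM sqlite_master WHERE type='table'"
        )
    )
    query = generate_sql(question, schema)
    return conn.execute(query).fetchall()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (category TEXT, amount REAL)")
conn.executemany("INSERT INTO sales VALUES (?, ?)",
                 [("books", 10.0), ("books", 5.0), ("toys", 7.5)])

print(sorted(answer("What are total sales per category?", conn)))
# [('books', 15.0), ('toys', 7.5)]
```

The real agent adds the pieces that matter in practice: prompt templates that include the schema, retries when the generated SQL fails, and conversation memory.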

We share hands-on tutorials like this 2-3 times a week, designed to help you stay ahead in the world of AI. If you're serious about leveling up your AI skills and staying ahead of the curve, subscribe now and be the first to access our latest tutorials.

Don’t forget to share this newsletter on your social channels and tag Unwind AI (X, LinkedIn, Threads, Facebook) to support us!

Latest Developments

Trieve brings powerful search, recommendations, and RAG capabilities to your applications through a unified API. Designed for those who need more than basic search functionality, Trieve combines vector search, BM25, and neural sparse search with built-in analytics and relevance tuning. The platform handles everything from chunking and ingestion to search and recommendations, allowing you to focus on building features rather than managing infrastructure.

It also offers self-hosting options - you can run it in your VPC or on-premises while maintaining control over your data and models. You can also bring your own models for text embeddings, reranking, and LLMs for more control over the tech stack.

Key Highlights:

  1. Flexible Search - Trieve allows you to customize search behavior by combining semantic and full-text search with cross-encoder reranking, meaning you can improve relevance for any type of query.

  2. Modular Architecture - You can either use Trieve's default embedding models (OpenAI or Jina) or plug in your own text embedding models, SPLADE, rerankers, and LLMs. This lets you use custom-trained models and maintain end-to-end control over the search and RAG pipeline.

  3. Control Over Data - Data can be ingested through API calls for individual chunks, files, or bulk uploads. Trieve also lets you group related chunks, enabling search and recommendations at different granularities. Additionally, with its API keys feature, you can expose the API directly to your client through a fine-grained permission system scoped per dataset, and even per chunk based on tags or routes.

  4. Built-In Analytics - Trieve automatically tracks search queries, low-confidence searches, and popular filters, and surfaces them through its dashboard. You also get a request ID for each API call, which can be used to track clicks, ratings, and any other event you'd like to measure.

  5. Deployment - Offers multiple deployment paths with pre-built configurations for AWS, GCP, Azure, and Kubernetes. Includes components for authentication, rate limiting, and monitoring. Your data is secure - no external dependencies or data leakage issues.
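As a rough sketch of what calling a unified search endpoint looks like, the snippet below builds (but deliberately does not send) a hybrid-search request using only the standard library. The endpoint path, header names, and body fields are assumptions based on Trieve's public docs and may differ from the current API; check their reference before relying on them.

```python
import json
import urllib.request

# Illustrative only: constructs a hybrid-search request against Trieve's
# API without sending it. Endpoint path, headers, and body fields are
# assumptions from Trieve's docs and may have changed.

API_KEY = "tr-********"          # placeholder credential
DATASET_ID = "your-dataset-id"   # placeholder dataset

body = json.dumps({
    "query": "how do I tune relevance?",
    "search_type": "hybrid",     # semantic + full-text, reranked
    "page_size": 10,
}).encode()

req = urllib.request.Request(
    "https://api.trieve.ai/api/chunk/search",
    data=body,
    method="POST",
    headers={
        "Authorization": API_KEY,
        "TR-Dataset": DATASET_ID,
        "Content-Type": "application/json",
    },
)

# urllib.request.urlopen(req) would execute the search; omitted so the
# sketch stays runnable without credentials.
print(req.get_method(), req.full_url)
```

The same request shape, with a different path and body, would cover ingestion and recommendations - which is the point of a unified API.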

Lecca.io puts sophisticated AI agent and workflow automation capabilities in a no-code interface, letting you skip boilerplate setup and focus on building solutions. The platform provides a visual point-and-click, drag-and-drop interface to configure LLMs, create automation workflows, and equip AI agents with tools - all without writing integration code.

While offering no-code simplicity, Lecca.io maintains the flexibility of API access, custom tool creation, and self-hosting options that you’d need for production deployments. Additionally, support for human-in-the-loop workflows makes it a practical choice for applications that require oversight and governance.

Key Highlights:

  1. No-Code Agent Configuration - Configure LLM behavior, tools, and triggers through an intuitive UI without managing authentication flows or writing integration code. The platform handles OAuth2 connections, API keys, and token management for services like Google Workspace, Slack, and HubSpot, while still letting you bring your own API keys and providers when needed.

  2. Visual Workflow Builder - Design complex automations using a drag-and-drop interface that rivals n8n and Zapier. Build conditional paths, schedule tasks, add human oversight steps, and handle errors - all through a visual editor. Each workflow can be triggered manually, on schedule, or via webhook endpoints that the platform generates automatically.

  3. Modular Architecture - The underlying modular design lets you build bespoke components and integrate them seamlessly. Developers can add custom-coded functionality for specific use cases, while end users continue to work entirely through the no-code interface.

  4. Knowledge Integration - Upload PDFs, docs, and text files through the UI to create queryable knowledge bases for your agents. The platform manages chunking, embeddings, and vector storage behind the scenes using S3 and Pinecone, with options to organize information into separate notebooks for different use cases.

  5. Full Control Over Infrastructure - You can host Lecca.io on your own infrastructure and choose whether to use local models via Ollama or cloud-based options. The platform is open-source, so you have full access to the code, allowing modifications and enhancements.
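Lecca.io handles chunking and embedding behind the scenes; purely as an illustration of the chunking step, a minimal fixed-size chunker with overlap might look like this (the sizes are arbitrary, not Lecca.io's actual defaults):

```python
# Minimal fixed-size text chunker with overlap, illustrating the kind of
# preprocessing done before embedding documents into a vector store.
# Chunk sizes are arbitrary; real platforms tune them per model.

def chunk_text(text: str, size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into windows of `size` chars, overlapping by `overlap`."""
    if overlap >= size:
        raise ValueError("overlap must be smaller than chunk size")
    step = size - overlap
    return [text[i:i + size] for i in range(0, len(text), step)]

doc = ("word " * 100).strip()   # stand-in for an uploaded document
chunks = chunk_text(doc, size=120, overlap=30)
print(len(chunks), len(chunks[0]))  # 6 120
```

The overlap ensures that a sentence straddling a chunk boundary still appears whole in at least one chunk, which improves retrieval recall.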

Quick Bites

LangChain is bringing together the AI agent community this May in San Francisco for Interrupt, its first AI agent conference. It’ll feature technical talks and hands-on workshops from leaders like Replit's Michele Catasta and Quora's Adam D'Angelo. You can sign up for the ticket drop and also apply to speak at the conference.

Dolphin 3.0 is a new series of open-source, local-first, steerable AI models based on the Llama 3.1, Llama 3.2, and Qwen 2.5 architectures. These models range from 0.5B to 8B parameters and give you complete control over AI alignment, system prompts, and data. Deployment options include Ollama, LM Studio, and the Hugging Face Transformers library.

A new startup, KoderAI, just emerged from stealth and released a multi-agent AI coding platform that builds full-stack apps and websites from natural language descriptions using a team of specialized AI agents. These agents can conceptualize the project, design the UI, generate front-end and back-end code, test, and deploy your app.

You won’t even have to interact with an IDE if you don’t know how to use one. You can also customize agents and even swap out the default LLM. Technical preview and early access are available for sign-up at koder.com.

NVIDIA has introduced Project R2X, a vision-enabled PC avatar that can see your screen’s content and assist with tasks like using apps, joining video conference calls, reading and summarizing documents, and more. You can customize this AI agent’s components like LLMs, vector databases, and search through a graph-based visual editor to define its functionality and access, and then talk to it like a human assistant to get work done. Do watch the demo, it’s very cool!

Tools of the Trade

  1. DeepFace: A lightweight Python framework for face recognition and facial attribute analysis (age, gender, emotion, and race) that wraps multiple SOTA models like VGG-Face, FaceNet, OpenFace, and others. It offers simple functions for face verification, recognition, and analysis that handle all the complex pipeline stages (detection, alignment, normalization, representation, and verification) behind the scenes.

  2. Autochat: Python library to build AI agents using LLMs. It can transform Python functions into tools that the AI can use, supports conversations as generators, and includes features like caching, templates, and image handling.

  3. Aicmt: A command-line tool that uses AI to help with Git commits. It analyzes your code changes, automatically splits them into logical commits, and generates descriptive commit messages.

  4. Awesome LLM Apps: Build awesome LLM apps with RAG, AI agents, and more to interact with data sources like GitHub, Gmail, PDFs, and YouTube videos, and automate complex work.
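To give a flavor of the first tool above, here is a hedged sketch of DeepFace's two main entry points, `verify` and `analyze`. The image paths are placeholders, and the imports are kept inside the functions so the file loads even without the deepface package installed.

```python
# Sketch of DeepFace's two core entry points. Requires `pip install
# deepface` and real image files to actually run; imports are deferred
# so this module loads without the dependency.

def same_person(img1: str, img2: str) -> bool:
    from deepface import DeepFace
    # verify() runs the full pipeline (detect, align, embed, compare)
    # on both images and returns a dict including a "verified" flag.
    result = DeepFace.verify(img1_path=img1, img2_path=img2)
    return result["verified"]

def describe_face(img: str) -> dict:
    from deepface import DeepFace
    # analyze() returns one dict per detected face, with keys such as
    # "age" and "dominant_emotion" for the requested actions.
    return DeepFace.analyze(
        img_path=img, actions=["age", "gender", "emotion", "race"]
    )[0]

# Usage (placeholder paths, requires deepface installed):
# same_person("alice1.jpg", "alice2.jpg")
# describe_face("alice1.jpg")["dominant_emotion"]
```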

Hot Takes

  1. How OpenAI, Google, Anthropic, and xAI might announce AGI:
    - OpenAI: "We know how to create AGI, we’re working on AGI, we’re very close to AGI... AGI achieved."
    - Google/Anthropic: "Here is your AGI."
    - xAI: Elon will post a meme that we have AGI. ~ AshutoshShrivastava

  2. > be nvidia
    > only talk about fp4 perf for new GPUs
    > release models
    > released models only tested with bf16 ~ Xeophon

That’s all for today! See you tomorrow with more such AI-filled content.

Don’t forget to share this newsletter on your social channels and tag Unwind AI to support us!

Unwind AI - X | LinkedIn | Threads | Facebook

PS: We curate this AI newsletter every day for FREE, your support is what keeps us going. If you find value in what you read, share it with at least one, two (or 20) of your friends 😉 
