LocalAI

Run low-latency voice and text conversations on your own stack

LocalAI’s Realtime API lets you build voice and text experiences over WebSocket or WebRTC using an OpenAI-compatible protocol. It is aimed at developers who want a self-hosted, configurable realtime layer with their own VAD, STT, LLM, and TTS components.

API

Voice

B2B

Self-Hosted

Model Agnostic

Supports Local Models

Visit LocalAI

Is this your tool? Claim this listing to manage your content and analytics.

Recent activity

What's happened with LocalAI lately

May 22, 2026
Score change
Rubric upgrade v3_0 → v3.1: score 4/32 → 12/364 → 12/36(+8)
Rubric upgrade: agenticness v3.0 (8 dims, /32) → v3.1 (9 dims, /36). Adds Dim 9 (Operator Sovereignty), splits Dim 6 into 6a/6b lenses, tightens Dim 4 autonomous-retry distinction. Not a product change — score shift reflects new dimension + recalibrated rubric, not a change in the tool. Fanout suppressed.
See the news that prompted this

News mentions sourced from our news feed; score changes from periodic re-evaluations.

Ask about LocalAI

Get answers based on LocalAI's actual documentation

Try asking:

About

What It Is

LocalAI Realtime API is a self-hosted, OpenAI-compatible realtime interface for low-latency voice and text conversations. It is built for developers who want to serve multimodal chat locally or on their own infrastructure rather than relying on a hosted API.

What To Know

This is infrastructure, not a turnkey assistant. The realtime experience depends on the models and backends you install and configure, so quality and latency will vary based on your stack. WebRTC also requires the Opus backend to be installed separately.

Key Features

Supports the OpenAI Realtime API protocol

Streams low-latency voice and text conversations over WebSocket

Supports browser-based realtime voice via WebRTC

Uses configurable pipeline components for VAD, STT, LLM, and TTS

Accepts model configuration files such as YAML pipeline definitions

Use Cases

Build a self-hosted voice assistant with speech-to-speech interaction

Add realtime conversational voice to a browser app using WebRTC

Expose an OpenAI-compatible realtime endpoint for internal tools

Agenticness: Guided Assistant

Executes tasks you assign, one step at a time, within narrow domains.

High evidence

Last evaluated: May 22, 2026

Dimension Breakdown

Action Capability

Autonomy

Adaptation

State & Memory

Safety

How agenticness works →

Related Tools

View all in Agent Infrastructure

BuyWhere Product Catalog API

Agenticness6/36·Guided Assistant

Agent Infrastructure

BuyWhere gives developers a normalized product catalog API for Singapore and Southeast Asia. It helps AI agents search, compare, and route commerce queries without scraping storefronts.

Free Tier

API

Chrome Extension

Runloop AI

Agenticness15/36·Adaptive Collaborator

Agent Infrastructure

Runloop AI provides sandboxed devboxes for agent workflows, including turn-based interaction through GitHub pull requests. It’s aimed at developers building coding agents that need to execute commands, keep state across turns, and respond to reviewer comments.

API

Integrations

B2B

Fireworks AI

Agenticness4/36·Reactive Tool

Agent Infrastructure

Fireworks AI is a model hosting and inference platform for teams building with open and proprietary models. It covers serverless inference, fine-tuning, embeddings, speech-to-text, and on-demand GPU deployments.

Paid

Enterprise

API

Groq

Agenticness5/36·Reactive Tool

Agent Infrastructure

GroqCloud is an AI inference platform for developers that focuses on low latency and predictable spend. It provides API access to text, audio, vision, and image-to-text models, with free, developer, and enterprise plans.

API

For Developers

Usage-Based

Stay in the loop

Get the weekly agentic AI briefing

New tools, top picks, and trends — delivered every Thursday.

Ask about LocalAI

About

Dimension Breakdown

Categories

Ask about LocalAI

Related Tools

Get the weekly agentic AI briefing