llamafile

Run local LLMs as a single executable file

llamafile packages an LLM and runtime into one file you can download and run locally. It is aimed at developers and end users who want offline, no-install model execution across common operating systems.

Desktop

B2B

CLI

Self-Hosted

On-Device / Edge

Model Agnostic

Supports Local Models

Visit llamafile

Is this your tool? Claim this listing to manage your content and analytics.

Recent activity

What's happened with llamafile lately

May 23, 2026
Score change
Rubric upgrade v3_0 → v3.1: score 1/32 → 2/361 → 2/36(+1)
Rubric upgrade: agenticness v3.0 (8 dims, /32) → v3.1 (9 dims, /36). Adds Dim 9 (Operator Sovereignty), splits Dim 6 into 6a/6b lenses, tightens Dim 4 autonomous-retry distinction. Not a product change — score shift reflects new dimension + recalibrated rubric, not a change in the tool. Fanout suppressed.
See the news that prompted this

News mentions sourced from our news feed; score changes from periodic re-evaluations.

Ask about llamafile

Get answers based on llamafile's actual documentation

Try asking:

About

What It Is

llamafile is an open-source tool for distributing and running large language models as a single executable file. It combines llama.cpp with Cosmopolitan Libc so you can run supported models locally on most operating systems and CPU architectures without a traditional install step.

What to Know

This is not an autonomous agent platform. It is primarily a local model runtime and packaging format, so its value is in portability and ease of execution rather than multi-step task automation. The project says newer versions support more recent...

Key Features

Packages an LLM into a single executable file

Runs locally on macOS, Linux, BSD, Windows, and other supported systems

Works without a separate installation process

Uses llama.cpp as the model runtime foundation

Includes whisperfile for local speech-to-text transcription and translation

Use Cases

Running a local chat model on your laptop without setting up a full environment

Sharing a model as a single downloadable file for teammates or users

Testing LLM behavior offline across different operating systems

Agenticness: Reactive Tool

Responds to prompts but takes no autonomous action.

High evidence

Last evaluated: May 23, 2026

Dimension Breakdown

Action Capability

Autonomy

Adaptation

State & Memory

Safety

How agenticness works →

Related Tools

View all in Agent Infrastructure

BuyWhere Product Catalog API

Agenticness6/36·Guided Assistant

Agent Infrastructure

BuyWhere gives developers a normalized product catalog API for Singapore and Southeast Asia. It helps AI agents search, compare, and route commerce queries without scraping storefronts.

Free Tier

API

Chrome Extension

Runloop AI

Agenticness15/36·Adaptive Collaborator

Agent Infrastructure

Runloop AI provides sandboxed devboxes for agent workflows, including turn-based interaction through GitHub pull requests. It’s aimed at developers building coding agents that need to execute commands, keep state across turns, and respond to reviewer comments.

API

Integrations

B2B

Fireworks AI

Agenticness4/36·Reactive Tool

Agent Infrastructure

Fireworks AI is a model hosting and inference platform for teams building with open and proprietary models. It covers serverless inference, fine-tuning, embeddings, speech-to-text, and on-demand GPU deployments.

Paid

Enterprise

API

Groq

Agenticness5/36·Reactive Tool

Agent Infrastructure

GroqCloud is an AI inference platform for developers that focuses on low latency and predictable spend. It provides API access to text, audio, vision, and image-to-text models, with free, developer, and enterprise plans.

API

For Developers

Usage-Based

Stay in the loop

Get the weekly agentic AI briefing

New tools, top picks, and trends — delivered every Thursday.

Ask about llamafile

About

Dimension Breakdown

Categories

Ask about llamafile

Related Tools

Get the weekly agentic AI briefing