
llamafile

Run local LLMs as a single executable file

llamafile packages an LLM and runtime into one file you can download and run locally. It is aimed at developers and end users who want offline, no-install model execution across common operating systems.

Desktop
B2B
CLI
Self-Hosted
On-Device / Edge
Model Agnostic
Supports Local Models


About

What It Is

llamafile is an open-source tool for distributing and running large language models as a single executable file. It combines llama.cpp with Cosmopolitan Libc so you can run supported models locally on most operating systems and CPU architectures without a traditional install step.

What to Know

This is not an autonomous agent platform. It is primarily a local model runtime and packaging format, so its value lies in portability and ease of execution rather than in multi-step task automation. The project says newer versions support more recent...

Key Features
Packages an LLM into a single executable file
Runs locally on macOS, Linux, BSD, Windows, and other supported systems
Works without a separate installation process
Uses llama.cpp as the model runtime foundation
Includes whisperfile for local speech-to-text transcription and translation
Use Cases
Running a local chat model on your laptop without setting up a full environment
Sharing a model as a single downloadable file for teammates or users
Testing LLM behavior offline across different operating systems
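The no-install flow described above amounts to a short terminal session. A minimal sketch, assuming macOS, Linux, or BSD, with a placeholder filename standing in for whichever llamafile you actually downloaded:

```shell
# Placeholder name: substitute the llamafile you downloaded.
chmod +x model.llamafile   # mark the downloaded file executable
./model.llamafile          # run it; the built-in server typically serves
                           # a chat UI at http://localhost:8080
```

On Windows, the project's documentation describes renaming the file so it ends in `.exe` before running it.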
Agenticness: Reactive Tool

Responds to prompts but takes no autonomous action.

High evidence
Last evaluated: Mar 31, 2026

Dimension Breakdown

Action Capability
Autonomy
Adaptation
State & Memory
Safety

Categories

Pricing
  • Free: The project is open source under Apache 2.0; no paid pricing is published.
  • Pro: Not listed.
  • Enterprise: Not listed.
Details
Added: March 31, 2026
Refreshed: March 31, 2026
Quick Facts
Deployment: On-device / local
Autonomy: Copilot (human-in-the-loop)
Model support: Supports local models
Open source: Yes
Team support: Individual only
Pricing model: Free / open source
Interface: CLI, API
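Because the interface listing includes an API, the built-in server can also be queried over HTTP. A minimal sketch, assuming a llamafile server is already running on its default port 8080 and exposing an OpenAI-compatible chat endpoint; the prompt and the "local" model name are placeholders:

```shell
# Assumes a llamafile server is already running on localhost:8080.
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
        "model": "local",
        "messages": [{"role": "user", "content": "Say hello in one word."}]
      }'
```

Since the endpoint mirrors the OpenAI chat-completions shape, existing client libraries can usually be pointed at a local llamafile by overriding their base URL.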

Related Tools

BuyWhere gives developers a normalized product catalog API for Singapore and Southeast Asia. It helps AI agents search, compare, and route commerce queries without scraping storefronts.

Free Tier
API
Chrome Extension

Runloop AI provides sandboxed devboxes for agent workflows, including turn-based interaction through GitHub pull requests. It’s aimed at developers building coding agents that need to execute commands, keep state across turns, and respond to reviewer comments.

API
Integrations
B2B

Fireworks AI is a model hosting and inference platform for teams building with open and proprietary models. It covers serverless inference, fine-tuning, embeddings, speech-to-text, and on-demand GPU deployments.

Paid
Enterprise
API

GroqCloud is an AI inference platform for developers that focuses on low latency and predictable spend. It provides API access to text, audio, vision, and image-to-text models, with free, developer, and enterprise plans.

API
For Developers
Usage-Based