Open-Source Agent Platform

Build, test & deploy AI agents — from YAML to production.

HoloDeck is an open-source experimentation platform for AI agents. Define agents in YAML, run hypothesis-driven tests, and deploy production-ready APIs with a simple CLI — no custom infra required.

MIT Licensed
Self-Hosted
CLI-First
Python 3.10+
terminal • holodeck
pip install holodeck-ai holodeck init support --template customer-support holodeck test agent.yaml holodeck deploy agent.yaml --port 8000

What is HoloDeck?

HoloDeck is a no-code agent development platform for engineering teams. Describe agents in YAML, run tests and evaluations from the CLI, and ship them as FastAPI services — locally or in your own cloud.

No-Code Agents

Models, prompts, tools, memory, and vector stores defined in YAML.

Hypothesis Testing

Turn real user journeys into structured test cases and evals.

Integrated Metrics

Groundedness, relevance, F1, BLEU, ROUGE, and more in one CLI.

Deploy Anywhere

holodeck deploy converts configs to FastAPI services.

name: "customer-support-agent" description: "Handles customer inquiries with empathy and accuracy" model: provider: openai name: gpt-4o-mini temperature: 0.7 instructions: file: instructions/system-prompt.md tools: - name: search_knowledge_base type: vectorstore source: data/faqs.md evaluations: - metric: groundedness threshold: 4.0 - metric: relevance threshold: 4.0

Designed for how engineers build agents

HoloDeck covers the full lifecycle — definition, testing, orchestration, deployment, and observability — with a CLI-first, YAML-native approach.

No-Code Agent Definition
Describe models, prompts, tools, memory, and data sources in YAML. Standardize how agents are built across your org.
Hypothesis-Driven Testing
Define structured test cases with expected tools and ground truth responses. Know when an agent is ready to ship.
Integrated Evaluations
Run AI metrics like groundedness and relevance, plus NLP metrics like F1, BLEU, ROUGE, and METEOR directly from the CLI.
Multi-Agent Orchestration
Model complex systems using sequential, concurrent, handoff, and group chat orchestration patterns.
Built for CI/CD
holodeck test, holodeck experiment, and holodeck deploy are CLI-native and CI-friendly.
Production Deployment
Deploy agents as FastAPI services locally or to AWS, Azure, or GCP, with health checks and observability hooks.

From idea to production in three steps

Go from a hypothesis about an agent to a monitored, production-ready API in minutes — not weeks.

1
Describe your agent
Use agent YAML to define the model, instructions, tools, context, evaluations, and test cases side-by-side.
holodeck init customer-support --template conversational
2
Test & evaluate
Run your agent against structured scenarios and capture evaluation metrics before rolling out changes.
holodeck test agent.yaml holodeck chat agent.yaml
3
Deploy anywhere
Turn your agent into a FastAPI endpoint, then ship it to your preferred cloud or run it on-prem.
holodeck deploy agent.yaml --cloud aws --region us-east-1

Multi-agent systems without bespoke glue code

Real systems use more than one agent. HoloDeck gives you declarative orchestration patterns so you can model routers, specialists, and evaluators in YAML.

Sequential workflows
Concurrent analysis
Handoff routing
Group chat collaboration

Use orchestration patterns inspired by the Microsoft Agent Framework to chain agents for parsing, extraction, summarization, compliance checks, and more — without inventing another internal framework.

name: "customer-service-system" orchestration: pattern: handoff router: name: "service-router" path: agents/router/agent.yaml specialists: - name: "billing" path: agents/billing/agent.yaml - name: "technical-support" path: agents/technical-support/agent.yaml

Built for engineering teams — not click-through demos

Platform & ML Engineers

Standardize how agents are defined, tested, and deployed across teams. Give developers a shared YAML contract instead of bespoke scripts.

Product & Application Engineers

Ship AI features quickly without owning prompt infrastructure. Focus on user experience, not wiring models and tools.

Data & Research Teams

Run structured experiments, compare agent variants, and evaluate responses using metrics tuned to your domain.

Modern AI Products

Customer support agents, research assistants, code assistants, sales agents, and more — all defined in YAML and deployed with a CLI.

Observe, measure, and improve — together

OpenTelemetry & cost tracking

HoloDeck emits traces, metrics, and logs following the semantic conventions for generative AI. Track latency, token usage, and cost per agent, per environment.

Open source & MIT licensed

HoloDeck is built in the open. Join the community, open issues, propose features, and help shape the agent platform you want to use.

View the repo →