Overview
Kimchi Coding is an autonomous AI coding agent. You describe a task in plain English — "add error handling to the API routes", "refactor this module to async/await", "write tests for the auth service" — and the agent executes it: reading files, writing code, running commands, and checking its own output.
Unlike AI code completion (Copilot-style inline suggestions), Kimchi Coding handles multi-step tasks with minimal hand-holding: you hand over a job and the agent does it.
How it works
A coding harness is the runtime that turns an LLM into an autonomous agent. Kimchi Coding's harness manages the full loop: it reads your codebase, plans the work, writes code, runs commands, and verifies the results — all in a terminal session.
%%{init: {'theme': 'dark'}}%%
flowchart LR
A["You describe a task"] --> B["Kimchi Coding harness"]
B --> C["Reads, writes, runs, verifies"]
C --> D["Completed work"]
style B fill:#2d2d2d,stroke:#888,color:#e0e0e0
Under the hood, the harness runs an orchestrator that classifies your task, breaks it into phases, and assigns each subtask to the right model. A planning step gets a reasoning-capable model; a bulk code-generation step gets a fast, high-throughput executor. You don't pick the model — the harness does.
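As a rough illustration of that routing idea, the sketch below breaks a task into a planning step and a build step and picks a model class for each. It is not Kimchi Coding's implementation: the `Subtask` type, the `route`, `break_down`, and `run` functions, and the model labels are all hypothetical names invented for this example, and no model is actually called.

```python
from dataclasses import dataclass

# Hypothetical model classes; the harness's real model names are not shown here.
REASONING = "reasoning-model"
FAST_EXECUTOR = "fast-executor-model"


@dataclass
class Subtask:
    phase: str        # e.g. "plan" or "build"
    description: str


def route(subtask: Subtask) -> str:
    """Pick a model class: planning gets a reasoning model, other steps a fast executor."""
    return REASONING if subtask.phase == "plan" else FAST_EXECUTOR


def break_down(task: str) -> list[Subtask]:
    """Stand-in for the orchestrator's task breakdown (a fixed two-step plan)."""
    return [
        Subtask("plan", f"Design an approach for: {task}"),
        Subtask("build", f"Implement: {task}"),
    ]


def run(task: str) -> list[tuple[str, str]]:
    """Return (phase, model) pairs in execution order, without calling any model."""
    return [(s.phase, route(s)) for s in break_down(task)]


print(run("add error handling to the API routes"))
# [('plan', 'reasoning-model'), ('build', 'fast-executor-model')]
```

The point of the sketch is only that the model choice is a function of the subtask, not of the user's request.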
Why Kimchi Coding?
| | Claude Code / Cursor | Other API providers | Kimchi Coding |
|---|---|---|---|
| Rate limits | Yes — sessions cut off mid-task | Varies | No rate limits |
| Cost | $100–200/month subscriptions | Pay-per-token, no routing intelligence | Pay-per-token, smart routing cuts costs |
| Model lock-in | Anthropic-only | Single model per request | Automatically picks the right model per task |
| Data residency | Anthropic's / provider's infra | Provider's infra | Your cluster (self-hosted) or Kimchi's GPUs |
| Claude models | Direct from Anthropic | Not available | Via Kimchi proxy — same models, no lock-in |
Key capabilities
Multi-model orchestration
By default, kimchi runs in multi-model mode. The orchestrator classifies each task and delegates subtasks to specialised models across five roles:
| Role | Responsibility |
|---|---|
| Orchestrator | Runs the main loop, classifies tasks, delegates work |
| Planner | Designs the approach, writes specs |
| Builder | Code implementation — picks heavier models for complex tasks |
| Reviewer | Code review — picks the strongest model by tier |
| Explorer | Codebase exploration and research — light models for scans, heavy for analysis |
You describe the task; the orchestrator picks a model for each subtask based on the subtask's complexity and each model's capabilities. To change which models serve each role, use the /multi-model command or edit ~/.config/kimchi/harness/settings.json.
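The selection rules in the role table can be sketched as a small function. This is a simplified illustration under stated assumptions, not the harness's logic: the `Model` type, the three-entry `CATALOGUE`, and `pick_model` are hypothetical, and only the roles whose rule the table spells out (Builder, Reviewer, Explorer) are covered. It also does not reflect the format of settings.json.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    name: str
    tier: int  # higher tier = more capable


# Hypothetical catalogue; real model names and tiers are not given in this document.
CATALOGUE = [Model("light", 1), Model("medium", 2), Model("heavy", 3)]


def pick_model(role: str, complex_task: bool = False) -> Model:
    """Choose a model for a role, following the rules stated in the role table."""
    lightest = min(CATALOGUE, key=lambda m: m.tier)
    strongest = max(CATALOGUE, key=lambda m: m.tier)
    if role == "reviewer":
        # Reviewer: always the strongest model by tier.
        return strongest
    if role in ("builder", "explorer"):
        # Builder/Explorer: heavy for complex work, light otherwise (a simplification).
        return strongest if complex_task else lightest
    raise ValueError(f"no selection rule sketched for role: {role!r}")


print(pick_model("explorer").name)                     # light
print(pick_model("builder", complex_task=True).name)   # heavy
print(pick_model("reviewer").name)                     # heavy
```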
Phase tracking
Every LLM request is tagged with a work phase for analytics and cost attribution:
| Phase | Description |
|---|---|
| explore | Reading files, tracing imports, understanding code structure |
| plan | Designing, breaking down tasks, writing specs |
| build | Writing, modifying, or refactoring code |
| review | Verifying correctness, analysing output |
| research | External sources: web docs, library APIs, version changelogs |
Phases appear in the footer (e.g. ↳ build) and are included in every request tag for cost reporting.
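To show why per-request phase tags are useful for attribution, here is a minimal sketch that totals token usage by phase. The record shape (a dict with `phase` and `tokens` keys) and the `tokens_by_phase` helper are assumptions made for the example; the actual tag and report formats are not described here.

```python
from collections import Counter

# The five phases listed in the table above.
PHASES = ("explore", "plan", "build", "review", "research")


def tokens_by_phase(requests: list[dict]) -> dict[str, int]:
    """Sum token counts per phase tag, rejecting tags outside the known phases."""
    totals = Counter()
    for request in requests:
        phase = request["phase"]
        if phase not in PHASES:
            raise ValueError(f"unknown phase: {phase!r}")
        totals[phase] += request["tokens"]
    return dict(totals)


# Hypothetical request log, invented for illustration.
log = [
    {"phase": "explore", "tokens": 1200},
    {"phase": "build", "tokens": 5400},
    {"phase": "build", "tokens": 2100},
    {"phase": "review", "tokens": 800},
]
print(tokens_by_phase(log))
# {'explore': 1200, 'build': 7500, 'review': 800}
```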
Ferment — autonomous project mode
For multi-step projects that span sessions, Ferment provides structured planning, execution tracking, and persistent context. Describe a goal — "Build Tetris", "Add Google OAuth login" — and the harness breaks it into phases, executes each one using specialised subagent workers, grades the results, and picks up exactly where it left off if a session ends.
Start a Ferment session at any time with the /ferment command. See the Ferment overview for the full mental model.
Session persistence
Sessions (including agent runs) are saved to disk and fully recoverable. Prompt history is loaded into up/down arrow navigation so you can reuse past prompts without retyping.
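The general pattern of saving prompts to disk and reloading them into up/down navigation might look like the sketch below. The JSON layout, the file path, and the `save_session`, `load_prompts`, and `History` names are hypothetical; the harness's actual on-disk session format is not documented in this section.

```python
import json
from pathlib import Path


def save_session(path: Path, prompts: list[str]) -> None:
    """Write the prompt list to disk as JSON (hypothetical layout)."""
    path.write_text(json.dumps({"prompts": prompts}), encoding="utf-8")


def load_prompts(path: Path) -> list[str]:
    """Read prompts back; a missing file means an empty history."""
    if not path.exists():
        return []
    return json.loads(path.read_text(encoding="utf-8"))["prompts"]


class History:
    """Up/down navigation over previously loaded prompts."""

    def __init__(self, prompts: list[str]) -> None:
        self.prompts = list(prompts)
        self.index = len(self.prompts)  # one past the newest entry

    def up(self) -> str:
        """Move toward older prompts, stopping at the oldest."""
        if self.index > 0:
            self.index -= 1
        return self.prompts[self.index] if self.prompts else ""

    def down(self) -> str:
        """Move toward newer prompts; past the newest, return an empty line."""
        if self.index < len(self.prompts):
            self.index += 1
        return self.prompts[self.index] if self.index < len(self.prompts) else ""
```

For example, after `save_session(path, ["write tests", "fix lint"])`, a new `History(load_prompts(path))` returns "fix lint" on the first `up()` and "write tests" on the second.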
Kimchi Coding vs. Kimchi Inference
Kimchi offers two products:
- Kimchi Coding (this section) — an autonomous AI coding agent with multi-model orchestration, phase tracking, and cost attribution. You describe a task and the agent handles it.
- Kimchi Inference — a serverless API for sending LLM requests from any OpenAI-compatible client. You control the model, prompt, and integration.
Kimchi Coding uses Kimchi Inference under the hood for model access.
