Overview

Kimchi Coding is an autonomous AI coding agent. You describe a task in plain English — "add error handling to the API routes", "refactor this module to async/await", "write tests for the auth service" — and the agent executes it: reading files, writing code, running commands, and checking its own output.

This is different from AI code completion (Copilot-style inline suggestions). Kimchi Coding handles multi-step tasks with minimal hand-holding — you hand over a job and the agent does it.

How it works

A coding harness is the runtime that turns an LLM into an autonomous agent. Kimchi Coding's harness manages the full loop: it reads your codebase, plans the work, writes code, runs commands, and verifies the results — all in a terminal session.

%%{init: {'theme': 'dark'}}%%
flowchart LR
    A["You describe a task"] --> B["Kimchi Coding harness"]
    B --> C["Reads, writes, runs, verifies"]
    C --> D["Completed work"]
    style B fill:#2d2d2d,stroke:#888,color:#e0e0e0

Under the hood, the harness runs an orchestrator that classifies your task, breaks it into phases, and assigns each subtask to the right model. A planning step gets a reasoning-capable model; a bulk code-generation step gets a fast, high-throughput executor. You don't pick the model — the harness does.

Why Kimchi Coding?

| | Claude Code / Cursor | Other API providers | Kimchi Coding |
|---|---|---|---|
| Rate limits | Yes; sessions cut off mid-task | Varies | No rate limits |
| Cost | $100–200/month subscriptions | Pay-per-token, no routing intelligence | Pay-per-token, smart routing cuts costs |
| Model lock-in | Anthropic-only | Single model per request | Automatically picks the right model per task |
| Data residency | Anthropic's / provider's infra | Provider's infra | Your cluster (self-hosted) or Kimchi's GPUs |
| Claude models | Direct from Anthropic | Not available | Via Kimchi proxy: same models, no lock-in |

Key capabilities

Multi-model orchestration

By default, kimchi runs in multi-model mode. The orchestrator classifies each task and delegates subtasks to specialised models across five roles:

| Role | Responsibility |
|---|---|
| Orchestrator | Runs the main loop, classifies tasks, delegates work |
| Planner | Designs the approach, writes specs |
| Builder | Code implementation; picks heavier models for complex tasks |
| Reviewer | Code review; picks the strongest model by tier |
| Explorer | Codebase exploration and research; light models for scans, heavy for analysis |

The orchestrator picks a model for each subtask based on the subtask's complexity and each model's capabilities. To change role assignments, use the /multi-model command or edit ~/.config/kimchi/harness/settings.json.
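The role table above can be read as a small routing function. The following is a minimal sketch of that idea, not Kimchi's implementation: the model names, the numeric tiers, and the `pick_model` helper are all hypothetical. The choices for the Orchestrator and Planner roles, and for simple Builder tasks, are assumptions, since the table does not say how those models are chosen.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    name: str  # hypothetical identifier, not a real Kimchi model name
    tier: int  # assumed scale: a higher tier means a stronger, costlier model


# Hypothetical catalogue, for illustration only.
CATALOGUE = [
    Model("light-model", tier=1),
    Model("mid-model", tier=2),
    Model("heavy-model", tier=3),
]


def pick_model(role: str, complex_task: bool = False) -> Model:
    """Pick a model for a role, loosely following the role table."""
    lightest = min(CATALOGUE, key=lambda m: m.tier)
    strongest = max(CATALOGUE, key=lambda m: m.tier)

    if role == "reviewer":
        # Reviewer: strongest model by tier.
        return strongest
    if role in ("builder", "explorer"):
        # Builder and Explorer: heavy for complex work, light otherwise.
        # (Using the lightest model for simple builds is an assumption.)
        return strongest if complex_task else lightest
    if role in ("orchestrator", "planner"):
        # Assumption: planning and orchestration get the strongest model.
        return strongest
    raise ValueError(f"unknown role: {role!r}")


print(pick_model("explorer").name)                    # light-model
print(pick_model("builder", complex_task=True).name)  # heavy-model
```

A real router would likely weigh cost, context length, and availability as well; a single tier number keeps the sketch short.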

Phase tracking

Every LLM request is tagged with a work phase for analytics and cost attribution:

| Phase | Description |
|---|---|
| explore | Reading files, tracing imports, understanding code structure |
| plan | Designing, breaking down tasks, writing specs |
| build | Writing, modifying, or refactoring code |
| review | Verifying correctness, analysing output |
| research | External sources: web docs, library APIs, version changelogs |

Phases appear in the footer (e.g. ↳ build) and are included in every request tag for cost reporting.
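To illustrate why phase tags are useful for cost attribution, here is a minimal sketch that totals spend per phase. The record shape (`phase` and `cost_usd` keys) and the `cost_by_phase` helper are hypothetical; the actual tag format and reporting pipeline are not described here.

```python
from collections import defaultdict

# The five phases listed in the table above.
PHASES = {"explore", "plan", "build", "review", "research"}


def cost_by_phase(requests):
    """Sum the cost of tagged requests, grouped by work phase."""
    totals = defaultdict(float)
    for req in requests:
        phase = req["phase"]
        if phase not in PHASES:
            raise ValueError(f"unknown phase: {phase!r}")
        totals[phase] += req["cost_usd"]
    return dict(totals)


# Hypothetical request log: each entry carries its phase tag and cost.
requests = [
    {"phase": "explore", "cost_usd": 0.25},
    {"phase": "build", "cost_usd": 1.5},
    {"phase": "build", "cost_usd": 0.5},
]

print(cost_by_phase(requests))  # {'explore': 0.25, 'build': 2.0}
```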

Ferment — autonomous project mode

For multi-step projects that span sessions, Ferment provides structured planning, execution tracking, and persistent context. Describe a goal — "Build Tetris", "Add Google OAuth login" — and the harness breaks it into phases, executes each one using specialised subagent workers, grades the results, and picks up exactly where it left off if a session ends.

Start a Ferment session at any time with the /ferment command. See the Ferment overview for the full mental model.
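The "picks up where it left off" behaviour can be pictured as checkpointing progress after every phase. The sketch below is an assumption about the general technique, not Ferment's actual storage format: the state file name, its JSON layout, and the `run_phases` helper are hypothetical, and the grading step is omitted.

```python
import json
import tempfile
from pathlib import Path
from typing import Callable, Optional


def run_phases(state_file: Path, phases: list, execute: Callable[[str], None],
               max_phases: Optional[int] = None) -> list:
    """Run phases that are not yet done, saving progress after each one.

    Returns the phases executed during this call. `max_phases` simulates a
    session that ends before the project is finished.
    """
    if state_file.exists():
        state = json.loads(state_file.read_text())
    else:
        state = {"phases": phases, "done": []}

    ran = []
    for phase in state["phases"]:
        if phase in state["done"]:
            continue  # completed in an earlier session
        if max_phases is not None and len(ran) >= max_phases:
            break  # session ends here
        execute(phase)
        state["done"].append(phase)
        state_file.write_text(json.dumps(state))  # checkpoint to disk
        ran.append(phase)
    return ran


with tempfile.TemporaryDirectory() as tmp:
    state_file = Path(tmp) / "ferment-state.json"  # hypothetical file name
    phases = ["scaffold", "game loop", "scoring"]
    first = run_phases(state_file, phases, execute=print, max_phases=2)
    second = run_phases(state_file, phases, execute=print)  # resumes

print(first, second)
```

Because the state is written after each phase rather than at the end of a session, an interrupted run loses at most the phase that was in progress.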

Session persistence

Sessions, including agent runs, are saved to disk and fully recoverable. Prompt history is restored as well, so you can step through past prompts with the up and down arrow keys instead of retyping them.

Kimchi Coding vs. Kimchi Inference

Kimchi offers two products:

  • Kimchi Coding (this section) — an autonomous AI coding agent with multi-model orchestration, phase tracking, and cost attribution. You describe a task and the agent handles it.
  • Kimchi Inference — a serverless API for sending LLM requests from any OpenAI-compatible client. You control the model, prompt, and integration.

Kimchi Coding uses Kimchi Inference under the hood for model access.

Next steps