Overview
Kimchi gives you instant access to production-ready open-source LLMs — Kimi K2.6, Kimi K2.5, MiniMax M2.7, and Nemotron — via an OpenAI-compatible API. No GPUs to provision, no clusters to manage.
Quickstart
IDE setup recipes
The CLI handles setup automatically, but if you prefer manual configuration or need tool-specific tweaks, follow these step-by-step guides:
Configure OpenCode with multi-model orchestration.
Configure Cline CLI and VS Code extension.
Route Claude Code through Kimchi's Anthropic-compatible endpoint.
Configure Cursor IDE with the OpenAI-compatible base URL.
Configure Windsurf through the Roo Code extension.
Configure Continue CLI and IDE extension.
Available models
| Model ID | Best for | Context | Output |
|---|---|---|---|
| kimi-k2.6 | Agentic coding, image analysis (latest) | 260K | 32K |
| kimi-k2.5 | Agentic coding, image analysis | 260K | 32K |
| minimax-m2.7 | Code execution, debugging | 200K | 32K |
| nemotron-3-super-fp4 | Fast inference, cost-efficient tasks | 128K | 32K |
A common pattern is to pair Kimi K2.6 for planning with MiniMax M2.7 for code execution.
Optional — GSD multi-agent setup
Get Shit Done (GSD) orchestrates multiple AI agents — planner, researcher, executor, and verifier — each using the model best suited to the task.
If you skipped GSD during CLI setup, install it manually:
For OpenCode (GSD 1.x):
npx gsd-opencodeFor GSD 2.0 (standalone TUI):
gsd configSee the OpenCode recipe for full GSD configuration examples with model assignments per agent role.
How it works
Your IDE / CLI ──► Kimchi config ──► https://llm.kimchi.dev/openai/v1 ──► Open-Source Model
│
OpenAI-compatible API- Serverless Model APIs: Kimchi runs models on its own GPU infrastructure. You pay per token.
- Self-hosted deployments: Run the same models on your own Kubernetes cluster when per-token costs exceed compute costs or compliance requires it. Learn more
Pricing
Pay-per-token. Separate rates for input and output tokens. The free tier has generous limits and no credit card requirement.
When you need more throughput, upgrade to a paid plan. See rate limits for details.
Next steps
Updated 14 days ago
