Common Questions

Why not just fill out a README?

READMEs are written for humans: badges, screenshots, setup walkthroughs. That's fine, but agents can't reliably extract structured facts from prose. There's no standard place for architectural decisions, no field for constraints, no way to signal severity on a known risk.

CONTEXT.yaml is structured and scored. You can validate it in CI and know whether it actually contains useful information, not just words.
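As an illustration, a fragment might look like this. The field names below are assumptions for the sketch, not the canonical schema; only constraints, decisions, and severity are mentioned in the text above.

```yaml
# Illustrative fragment only; field names are assumed, not canonical.
constraints:
  - rule: All database access goes through the repository layer
    severity: high
decisions:
  - what: Postgres over DynamoDB
    why: Relational integrity matters more than scale here
```

Each entry is a discrete, checkable fact rather than a sentence buried in prose, which is what makes CI validation and scoring possible.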

Why not just use CLAUDE.md / agents.md / Copilot instructions?

Those files are agent-specific prompt fragments. They're not portable, not validated, and not scored. When you switch tools or add a second agent, you start over.

CONTEXT.yaml is tool-agnostic. Write it once and export it to whatever format a given tool prefers: ktext export xml for compact injection into a system prompt, ktext export json for scripting. The source of truth is the YAML, not a prompt fragment in a dotfile.

Why not just let the agent explore the codebase?

Agents can read files, but exploration is slow and incomplete. An agent reading your directory tree to infer your stack is burning tokens reconstructing knowledge you already have. It won't find the constraint that lives in someone's head, the decision that predates the current codebase, or the risk you know about but haven't written down.

CONTEXT.yaml captures the things exploration can't find: the why behind the code, not just the what.

Is this just another documentation format?

The goal is context, not documentation. Documentation explains how things work for human readers. Context tells an agent what it needs to act correctly in this codebase: what rules to follow, what decisions have already been made, what to watch out for.

A well-written CONTEXT.yaml is closer to a briefing document than a wiki page.

Does this replace existing documentation?

No. Keep your README, your ADRs, your runbooks. CONTEXT.yaml is a distillation: the highest-signal facts about your project, structured so agents can load it cheaply and act on it reliably. The detail lives in your existing docs. CONTEXT.yaml surfaces what matters most.

Why YAML and not JSON or TOML?

YAML is the most natural format for structured data that humans edit by hand. JSON has no comments and gets noisy with quoting. TOML works well for flat config but gets awkward with nested lists of objects.

The schema is written in JSON Schema draft 2020-12, and files are validated against it at parse time. Export formats (XML, JSON) are generated from the parsed model, not from the source file directly.
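To make "validated at parse time" concrete, here is a stdlib-only sketch of the kind of structural check involved. The field names ("name", "constraints", "rule") are illustrative assumptions, not the canonical ktext schema, and a real implementation would use a JSON Schema validator rather than hand-rolled checks.

```python
import json

# Sketch of a parse-time structural check. Field names are assumed
# for illustration; the real schema is JSON Schema draft 2020-12.
REQUIRED = {"name", "constraints", "decisions"}

def validate(doc: dict) -> list[str]:
    """Return a list of problems; an empty list means the doc passes."""
    problems = sorted(f"missing field: {f}" for f in REQUIRED - doc.keys())
    for i, c in enumerate(doc.get("constraints", [])):
        if not isinstance(c, dict) or "rule" not in c:
            problems.append(f"constraint {i} lacks a rule")
    return problems

doc = json.loads('{"name": "demo", "constraints": [{"rule": "no sync IO"}], "decisions": []}')
print(validate(doc))  # → []
```

Because exports are generated from the parsed model, anything that fails this stage never reaches the XML or JSON output.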

Why XML for agent injection?

The XML export is compact: identity as attributes on the root <context> element, short tag names (<c> for constraints, <d> for decisions), no boilerplate. Most frontier models handle XML well in system prompts, and the token count is significantly lower than equivalent JSON or Markdown prose.
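A hedged sketch of what that output might look like: the <context> root, <c>, and <d> tags come from the description above, while the attribute names and content are invented for illustration.

```xml
<context name="payments-api" lang="go">
  <c sev="high">Never write to the ledger outside a transaction</c>
  <d>Chose gRPC over REST for internal services</d>
</context>
```
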

If you prefer JSON, ktext export json is available. If you're injecting into a tool that takes free-form text, the YAML source is often readable enough to paste directly.

Does ktext require an account or API key?

No. ktext is a local CLI binary. No account, no telemetry, no network calls. ktext init reads your repo, ktext validate scores the file, ktext export renders it. Everything runs locally.
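The typical loop, using only the commands named above (flags and output vary by version, so none are shown):

```shell
ktext init          # scan the repo and scaffold a CONTEXT.yaml
ktext validate      # parse, check against the schema, and score the file
ktext export xml    # render the compact XML for system-prompt injection
```
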

Should I use an LLM to flesh out my CONTEXT.yaml?

Yes, with one condition: keep a human in the loop and be ruthless about signal.

LLMs are good at surfacing context from existing material. Point one at your ADRs, your CONTRIBUTING.md, your incident postmortems, your Slack threads, and ask it to draft constraints and decisions. It will find things you forgot to write down. That's genuinely useful.

The risk is noise. An LLM filling in gaps will produce plausible-sounding entries that are vague, generic, or just wrong. A constraint that says "always follow best practices" is worse than no constraint at all. Review everything it generates and ask: is this specific? Is this something only someone who worked on this project would know? If the answer is no, cut it or rewrite it.

ktext validate will catch the most obvious noise. Vague constraints and thin rationale score low. Use the score as a signal that something needs more thought, not as a target to game.

Who maintains the schema?

The canonical schema is published at github.com/arithmetike/ktext. Changes go through an RFD process. The schema is designed to be stable and backwards-compatible. The x: extension field lets you add project-specific metadata without waiting for schema changes.
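For instance, a hypothetical x: block might look like the following; keys under x: are project-defined, and these names are invented for illustration.

```yaml
# Keys under x: are project-specific and ignored by the canonical schema.
x:
  deploy_channel: "#deploys"
  oncall_rotation: payments-team
```
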