How context selection works
CodeLedger selects files deterministically — no LLMs, no embeddings, no sampling. Every selection decision is traceable to a score.
The selection pipeline
When you run codeledger activate --task "...", CodeLedger runs four stages in sequence:
1. Score: Every file in the repo is scored against the task using a weighted sum of six factors (below).
2. Select: Files are ranked by score. The top-N files that fit within the token budget (default: 8,000 tokens, max 25 files) are selected.
3. Shadow expand: Up to 3 additional shadow files are added from the co-commit graph. These are files that historically change alongside the selected set.
4. Excerpt: Files under 200 lines are included in full. Larger files receive tiered excerpts: the highest-scoring spans are expanded first.
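The excerpt stage can be pictured with a small sketch. Only the 200-line threshold comes from the text above; the function name `choose_excerpt`, the span tuple layout, and the `excerpt_line_cap` parameter are hypothetical, and the single-pass loop is a simplification of whatever tiering CodeLedger actually applies.

```python
FULL_FILE_THRESHOLD = 200  # from the text: files under 200 lines are included in full


def choose_excerpt(total_lines, scored_spans, excerpt_line_cap=120):
    """Return the (start, end) line ranges to include, 1-based and inclusive.

    scored_spans is a list of (start, end, score) tuples.
    excerpt_line_cap is a hypothetical per-file limit, not a documented setting.
    """
    if total_lines < FULL_FILE_THRESHOLD:
        return [(1, total_lines)]

    chosen = []
    used = 0
    # Highest-scoring spans are considered first.
    for start, end, _score in sorted(scored_spans, key=lambda s: s[2], reverse=True):
        length = end - start + 1
        if used + length > excerpt_line_cap:
            continue  # this span does not fit; try the next-best one
        chosen.append((start, end))
        used += length
    return sorted(chosen)  # present the excerpts in file order
```

In this sketch a span that does not fit is skipped rather than truncated, which keeps each excerpt intact at the cost of some unused room.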
Scoring factors
The composite score is a weighted sum of six signals. Weights are configurable in .codeledger/config.json under selector.weights.
Keyword match
Weight: 0.30. Compound phrases extracted from your task prompt are matched against file content. Exact phrases score higher than partial matches. The phrase index is built during scan and updated incrementally.
Structural centrality
Weight: 0.25. Files that import, or are imported by, many other files score higher. High-centrality files are more likely to be relevant to cross-cutting changes. Centrality is derived from the dependency graph.
Churn history
Weight: 0.20. Files that change frequently tend to be relevant to active work. Churn is weighted by recency using a 60-day half-life, so recent commits count more than old ones.
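As a rough illustration of the 60-day half-life, the sketch below assumes plain exponential decay, where a commit's contribution halves every 60 days. The function names and the summing of per-commit weights are assumptions; only the half-life value comes from the text.

```python
def recency_weight(age_days, half_life_days=60.0):
    """Weight of a single commit; halves every half_life_days (assumed exponential decay)."""
    return 0.5 ** (age_days / half_life_days)


def churn_score(commit_ages_days):
    """Hypothetical raw churn: the sum of recency weights over a file's commits."""
    return sum(recency_weight(age) for age in commit_ages_days)
```

Under this assumption a commit made today counts as 1.0, one from 60 days ago as 0.5, and one from 120 days ago as 0.25.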
Recent touch
Weight: 0.15. Files you have modified recently in this session, or files touched in the last few commits on the current branch, receive a boost. This keeps the bundle anchored to your current working context.
Test proximity
Weight: 0.07. Files that have discovered test coverage, or that are co-located with test files, receive a small boost. The test map is built during scan from naming conventions and import analysis.
Shadow affinity
Weight: 0.03. Files that historically co-commit with selected files are eligible for shadow inclusion. Co-commit affinity decays over time and is penalised for large commits to reduce noise.
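With the six default weights listed above, the composite score can be sketched as a plain weighted sum. The dictionary keys below are illustrative names rather than the actual keys under selector.weights, and the sketch assumes each signal has already been normalised to the range 0 to 1.

```python
# Default weights from the factor list above; the key names are hypothetical.
WEIGHTS = {
    "keyword_match": 0.30,
    "structural_centrality": 0.25,
    "churn_history": 0.20,
    "recent_touch": 0.15,
    "test_proximity": 0.07,
    "shadow_affinity": 0.03,
}


def composite_score(signals):
    """Weighted sum of per-file signals; a missing signal counts as 0.0."""
    return sum(weight * signals.get(name, 0.0) for name, weight in WEIGHTS.items())
```

Because the default weights sum to 1.0, a file that maxes out every signal scores 1.0 under this assumption, and a file that matches only on keywords scores at most 0.30.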
Token budget
The default budget is 8,000 tokens and 25 files. CodeLedger stops adding files when either limit would be exceeded by the next candidate.
Override either limit in your config:
// .codeledger/config.json
{
  "selector": {
    "default_budget": { "tokens": 12000, "max_files": 30 }
  }
}
Or per activation with flags: --budget-tokens 12000 --budget-files 30.
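The stopping rule described in this section can be sketched as a greedy loop over the ranked candidates. The `Candidate` type and the function name are hypothetical. The sketch reads "stops adding files" literally: selection halts at the first candidate that would break either limit, and does not go on to look for smaller files further down the ranking.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    path: str
    score: float
    tokens: int  # estimated token cost of including this file


def select_within_budget(candidates, max_tokens=8000, max_files=25):
    """Take candidates in descending score order until either limit would be exceeded."""
    selected = []
    used_tokens = 0
    for cand in sorted(candidates, key=lambda c: c.score, reverse=True):
        if len(selected) + 1 > max_files or used_tokens + cand.tokens > max_tokens:
            break  # the next candidate would exceed a limit, so stop here
        selected.append(cand)
        used_tokens += cand.tokens
    return selected
```

For example, with the default 8,000-token budget, a 5,000-token file followed by a 4,000-token file yields a bundle containing only the first file. Raising the budget to 12,000 tokens admits both.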
Shadow files
Shadow files are additions beyond the scored top-N. They come from the temporal co-commit graph — pairs of files that have historically been committed together. If file A is in the bundle and file B has changed alongside A at least 3 times, B is a shadow candidate.
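A minimal sketch of shadow candidate discovery follows, assuming co-commit history is available as a mapping from file pairs to the number of commits they share. The function name, the data layout, and the ranking by raw count are all assumptions. The sketch also leaves out the time decay and large-commit penalty mentioned under Shadow affinity.

```python
def shadow_candidates(selected, co_commit_counts, min_co_commits=3, max_shadows=3):
    """Return up to max_shadows files outside the bundle that co-commit with it.

    co_commit_counts maps (path_a, path_b) pairs to a shared-commit count.
    """
    selected_set = set(selected)
    best = {}  # candidate path -> strongest co-commit count with any selected file
    for (a, b), count in co_commit_counts.items():
        if count < min_co_commits:
            continue
        # A pair qualifies when exactly one side is already in the bundle.
        for inside, outside in ((a, b), (b, a)):
            if inside in selected_set and outside not in selected_set:
                best[outside] = max(best.get(outside, 0), count)
    # Strongest affinity first; ties broken by path for deterministic output.
    ranked = sorted(best, key=lambda path: (-best[path], path))
    return ranked[:max_shadows]
```

The tie-break on path keeps the output stable between runs, in line with the deterministic selection described at the top of this page.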
Intent sufficiency check
Before writing the bundle, CodeLedger runs an Intent Sufficiency Check (ISC) on your task prompt. If the task is vague or contradictory, you'll see a prompt health warning before activation completes — so you can refine the task rather than getting a low-quality bundle.
⚠ Prompt health: contradiction in task detected. Consider clarifying scope before activating.