Release · v0.10.20

The Context Control Plane Grows Up

Eighteen capabilities shipped across one release cycle. Each one answers a question that engineering leaders, compliance teams, and security organizations have been asking since AI coding tools became mainstream: how do you govern something you cannot inspect?

The big picture

AI coding tools crossed a threshold in the past eighteen months. They went from curiosity to productivity infrastructure — tools that your engineers use daily, that touch production codebases, and that produce output that ships. That shift created a new category of organizational risk that the tools themselves do not address: context risk.

Context risk is what happens when an agent acts on incomplete, stale, or mis-scoped information. The agent does not know what it doesn't know. It cannot tell you what it skipped, what governance rules it bypassed, or whether the files it read were the right ones for the task. The output looks like code. It may even pass review. But the evidence trail, the record of what informed the change, does not exist.

CodeLedger was built to close that gap. Version 0.10.20 is the release where the platform moves from context selection into full context governance: an Evidence Layer that structures what the agent knows before it acts, an Auditability Layer that records what it did and why, a Governance Layer that enforces doctrine at the point of tool use, and an Intelligence Layer that tells managers and teams whether it's working.

These are not features you install and forget. They are a compounding system. Each layer feeds the next. Evidence quality improves audit value. Audit records strengthen governance enforcement. Governance signals sharpen intelligence outputs. The value realized in sprint one is a fraction of the value realized by sprint twelve.


Who this release affects

Context governance is not a developer-productivity story alone. It touches every part of the organization that cares about what AI is doing inside the codebase.

🏗️ CTO / VP Engineering

→ Know exactly what context every agent used before it touched your codebase.
→ Catch architecture doctrine violations before they merge, deterministically.
→ Measure whether your team's AI context quality is trending up or down.

💰 CFO / FinOps

→ Stop paying for agent cycles that started with wrong or stale context.
→ Attach a dollar figure to AI-driven work avoidance: verified, not estimated.
→ Token savings compound: each sprint, the agent reads fewer irrelevant files.

📋 CIO / Head of Compliance

→ Every AI-assisted change now has a signed, inspectable audit trail.
→ Compliance reports are machine-generated from real events, not reconstructed.
→ Intent Lock prevents scope drift between what was approved and what was shipped.

🔒 CISO / Head of Security

→ MCP enforcement gates high-risk tool calls behind verified activation state.
→ Doctrine signals block changes that touch security surfaces without full context.
→ Domain security packs flag prompts that approach auth or secrets without proper framing.

📊 Engineering Manager

→ Context Debt Score surfaces which teams are working with stale or thin context.
→ Trace Receipts show exactly which files the agent read, skipped, and changed.
→ Skills corpus gives every agent on your team the same institutional knowledge.

One-time value vs. compounding value

Most software tools deliver their full value on the day you install them. Context governance is different. The value of a ledger is not the first entry — it is the thousandth. The value of a doctrine signal is not the first violation it catches — it is the pattern it reveals across six months of PRs.

| Capability | Day-one value | Long-term compounding value | For |
| --- | --- | --- | --- |
| Evidence-grounded activation | Agent starts with causally-ranked context on the first run | Historical failure patterns accumulate; future activations get smarter | CTO / EM |
| Signed audit trail per task | First compliance question answered without a manual investigation | Every subsequent PR has a traceable record; audit cost approaches zero | CIO / Legal |
| Intent Lock | Prevents the first out-of-scope change from an ambiguous prompt | Drift detection becomes the norm; re-work cycles fall across the team | CTO / Compliance |
| Context Debt Score | First A–F grade surfaces which squads have the worst context hygiene | Weekly trend line shows whether interventions are working | EM / VP Eng |
| Domain Security Packs | First security-sensitive prompt gets flagged before a risky change is written | Pack weight tunes as the team accepts and rejects signals; noise falls | CISO / Security |
| Skills Registry | Junior engineers get the same step-by-step guidance as seniors on day one | New skills are promoted automatically from observed patterns | EM / DevEx |
| MCP Enforcement | First high-risk tool call is gated at the API boundary, not caught in review | Enterprise tier unlocks richer sources; enforcement tightens as trust data grows | CISO / CTO |
| PR Context Fingerprint | Reviewers see exactly what context state was active when the PR was written | Fingerprint history reveals which context states correlate with clean PRs | EM / CIO |

The Zero-Friction Layer

Governance and intelligence are only valuable if every developer can access them in seconds — on day one, without reading documentation. The Zero-Friction Layer removes every onboarding step and makes CodeLedger self-starting for any user, any agent type, any surface.

codeledger ready

New

One command that works for everyone. New user? CodeLedger silently initializes in the background and brings you to a ready state. Returning user? It checks index freshness and shows a live snapshot of your repo — languages, file count, and the hottest files by churn. In both cases, the next two commands you need are printed at the bottom: activate and panel serve. Zero cognitive load, zero documentation required.

Why EMs and DevEx teams care

New team member onboarding time for CodeLedger drops to one command. There is no "first, run init; then, run scan; then, run activate" flow to document or remember. The ready command is the flow.

Long-term compound value

As team size grows, the cost of a complex onboarding process scales linearly. A single-command entry point pays dividends on every new hire, every contractor, and every agent runtime that needs initialization.

MCP V3 Read-Back Tools

New

Four new read-only MCP tools give agents and IDE extensions direct access to CodeLedger runtime state without shell access: get_trace_receipt returns the last signed reasoning trace; get_context_debt_score returns the A–F composite for any path; get_intent_lock_state returns the active scope lock; get_activation_payload returns the causally-ordered evidence sequence from the last activation. All four are gated by the existing activation enforcement policy.
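For readers wiring these tools into a client, the sketch below builds the JSON-RPC 2.0 `tools/call` request that the Model Context Protocol uses to invoke a named tool. The four tool names come from this release. The `path` argument is an assumption for illustration only, since the release notes do not publish each tool's input schema.

```python
import json
from itertools import count
from typing import Optional

# Tool names as listed in this release. Their argument schemas are not
# documented here, so any arguments passed below are hypothetical.
READ_BACK_TOOLS = {
    "get_trace_receipt",
    "get_context_debt_score",
    "get_intent_lock_state",
    "get_activation_payload",
}

_request_ids = count(1)  # monotonically increasing JSON-RPC request ids


def build_tool_call(name: str, arguments: Optional[dict] = None) -> str:
    """Serialize an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    if name not in READ_BACK_TOOLS:
        raise ValueError(f"unknown read-back tool: {name}")
    request = {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments or {}},
    }
    return json.dumps(request)


if __name__ == "__main__":
    # `path` is an assumed argument name, not a documented one.
    print(build_tool_call("get_context_debt_score", {"path": "src/auth"}))
```

The string returned by `build_tool_call` would be sent over whatever transport the MCP client already uses; the sketch deliberately stops short of transport and response handling.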

Why platform and tooling teams care

IDE extensions and sub-agents cannot run shell commands. MCP read-back tools give them the same runtime visibility that a terminal user has — without compromising the enforcement boundary. The agent knows what it was told, and can prove it.

Long-term compound value

Read-back tools are the foundation for agent-to-agent context propagation. As multi-agent workflows become common, sub-agents need to inherit context state from orchestrators. These four tools are that inheritance mechanism.

Trace Reasoning Command

New

codeledger trace reasoning reads the last reasoning trace and prints a structured summary: which files were selected, considered but excluded, and suppressed by governance — with the suppression reason for each. The raw trace is stored as a versioned artifact; the compact signed receipt is available separately for audit workflows.

Why engineering leads and compliance teams care

"What did the agent look at before it made this change?" is now answerable in a single command. The trace reasoning output is the human-readable form of the same artifact that compliance workflows can ingest and file.

Long-term compound value

Trace history builds a causal record across the team. Patterns in what gets excluded or suppressed surface architecture signals that no code review process would catch.

100-Test Release Validation Suite

New

Every CodeLedger release is now validated through a three-phase Docker-isolated test gate: 34 core feature tests, 20 chaos and adversarial tests, and 46 onboarding and agent surface tests. The suite covers all 6 agent types, all 5 surface types, new vs. existing user paths, upgrade flows, hook simulation, and 10 extreme edge cases — including corrupt config, concurrent session-inits, mid-session agent type switching, and activate in a non-git directory.

Why CTOs and enterprise buyers care

Enterprise software that governs your AI coding agents must itself be provably reliable. The release validation suite is the evidence: 100 tests across 3 Docker-isolated phases, one of them dedicated to chaos and adversarial cases, with zero false positives permitted before a release ships.

Long-term compound value

The suite grows with each release. New capabilities add new test coverage. The adversarial suite in particular accumulates edge cases from real production issues — making each release provably more robust than the last.

The Evidence Layer

Before an agent writes a single line, it needs to understand what it's working with. The Evidence Layer answers that question with structured, causally-ordered data — not a flat list of files.

Activation Director V3

New

Every task activation now produces a structured payload — a causally-ordered sequence of evidence items that tells the agent why each file matters, what historical failures are relevant, and which files are explicitly out of scope. The agent no longer receives a flat context dump; it receives a ranked argument.

Why CTOs and EMs care

Context quality is the primary determinant of agent output quality. An agent that starts with a ranked evidence chain — intent, risk zones, prior failure patterns — produces fewer off-target changes and requires less re-work. This is measurable: recall rates on the files actually changed track directly against the quality of the initial payload.

Long-term compound value

The payload improves over time. As your team accepts and rejects context bundles, the evidence sources learn which signals matter for your codebase. Month three looks meaningfully different from week one.

Code Domain Graph

New

A formal graph artifact is now written to the CodeLedger runtime directory on every activation. It classifies every file in the repo by surface type — auth, migrations, public contracts, generated, shared core — and tracks code ownership. This graph is the shared substrate for doctrine enforcement, trace receipts, and the PR fingerprint.

Why Security and Compliance care

Knowing which files are "auth surface" or "public contract" deterministically — without relying on naming conventions or manual tagging — is the prerequisite for any meaningful governance signal. A file that should never be touched without a security review can now be identified by graph position, not by hoping someone named it correctly.

Long-term compound value

As the repo grows, the graph stays current. New files are classified automatically. No manual annotation required.

The Auditability Layer

When compliance asks what the AI touched, the answer is now a signed artifact — not a reconstruction from memory. The Auditability Layer produces inspectable records for every task, every PR, and every drift event.

Engineering Reasoning Trace

New

After any task session, CodeLedger assembles a structured trace classifying every file into one of six buckets: selected for context, considered but excluded, excluded by governance, read but not bundled, edited, and test outcomes. This trace is stored as a versioned artifact alongside the task.
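One way to picture the six buckets is as an enumeration attached to each trace entry. The sketch below is illustrative only: `Bucket`, `TraceEntry`, and `summarize` are hypothetical names, and the actual on-disk format of the trace artifact is not described in this release.

```python
from collections import Counter
from dataclasses import dataclass
from enum import Enum


class Bucket(Enum):
    """The six classifications named in the release notes."""
    SELECTED = "selected for context"
    CONSIDERED_EXCLUDED = "considered but excluded"
    GOVERNANCE_EXCLUDED = "excluded by governance"
    READ_NOT_BUNDLED = "read but not bundled"
    EDITED = "edited"
    TEST_OUTCOME = "test outcomes"


@dataclass(frozen=True)
class TraceEntry:
    path: str
    bucket: Bucket
    reason: str = ""  # hypothetical field, e.g. why a file was suppressed


def summarize(entries):
    """Count entries per bucket, reporting zero for buckets that are empty."""
    counts = Counter(entry.bucket for entry in entries)
    return {bucket.value: counts.get(bucket, 0) for bucket in Bucket}


if __name__ == "__main__":
    trace = [
        TraceEntry("src/auth/login.py", Bucket.SELECTED),
        TraceEntry("src/auth/keys.py", Bucket.GOVERNANCE_EXCLUDED, "secrets surface"),
        TraceEntry("src/auth/login.py", Bucket.EDITED),
    ]
    for label, n in summarize(trace).items():
        print(f"{label}: {n}")
```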

Why CIOs and Legal care

The trace answers the question every acquirer, auditor, and compliance review eventually asks: "What did the AI actually look at before it made this change?" The answer is now a structured document, not an anecdote.

Long-term compound value

Trace history builds a causal record across your entire engineering team. Patterns in what gets excluded or suppressed surface architecture signals that no code review process would catch.

Trace Receipt

New

The Trace Receipt is a compact, signed summary of the Reasoning Trace — written to the CodeLedger runtime directory at task completion. It is the artifact you share with auditors, compliance teams, or CI gates. Machine-readable, human-legible, and verifiable against the full trace.

Why Compliance and Security care

A receipt answers a different question than a trace. The trace is the full record; the receipt is the verifiable summary. Compliance workflows need receipts — something they can file, index, and query. The receipt is that artifact.

Long-term compound value

Receipt history is queryable. "Show me every task where a governance suppression occurred in the last 90 days" is a tractable query against receipts.
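As a rough illustration of that 90-day query, assume each receipt has been loaded into a dictionary with a `completed_at` ISO 8601 timestamp and a `suppressed_files` list. Both field names are hypothetical; the receipt schema is not published in this release.

```python
from datetime import datetime, timedelta, timezone


def receipts_with_suppressions(receipts, now, days=90):
    """Return receipts from the last `days` days that record a suppression.

    `completed_at` and `suppressed_files` are assumed field names.
    """
    cutoff = now - timedelta(days=days)
    return [
        r for r in receipts
        if r["suppressed_files"]
        and datetime.fromisoformat(r["completed_at"]) >= cutoff
    ]


if __name__ == "__main__":
    now = datetime(2026, 1, 31, tzinfo=timezone.utc)
    sample = [
        {"task": "t1", "completed_at": "2026-01-10T09:00:00+00:00",
         "suppressed_files": ["src/auth/keys.py"]},
        {"task": "t2", "completed_at": "2025-09-01T09:00:00+00:00",
         "suppressed_files": ["src/auth/keys.py"]},  # older than 90 days
        {"task": "t3", "completed_at": "2026-01-20T09:00:00+00:00",
         "suppressed_files": []},  # recent, but nothing suppressed
    ]
    print([r["task"] for r in receipts_with_suppressions(sample, now)])
```

Passing `now` explicitly keeps the filter deterministic, which matters if the same query is rerun later as audit evidence.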

Intent Lock

New

When you activate on a task, CodeLedger writes an intent lock — a record of what you said you were doing and what level of scope drift is permitted. If a subsequent operation drifts beyond the declared scope (classified as none / minor / major / critical), the system warns before the change is written.
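To make the four drift levels concrete, here is a minimal sketch that classifies drift by the share of changed files falling outside the declared scope. The path-prefix scope model and the 20% and 50% thresholds are assumptions for illustration; the release does not state how CodeLedger computes the level.

```python
def in_scope(path, declared_scope):
    """True if `path` equals a declared entry or sits beneath it."""
    return any(
        path == root or path.startswith(root.rstrip("/") + "/")
        for root in declared_scope
    )


def classify_drift(declared_scope, changed_files):
    """Map the out-of-scope share of a change to none/minor/major/critical.

    Thresholds are hypothetical, chosen only to show the shape of the check.
    """
    if not changed_files:
        return "none"
    outside = [p for p in changed_files if not in_scope(p, declared_scope)]
    ratio = len(outside) / len(changed_files)
    if ratio == 0:
        return "none"
    if ratio <= 0.2:
        return "minor"
    if ratio <= 0.5:
        return "major"
    return "critical"


if __name__ == "__main__":
    scope = ["src/module_a"]
    print(classify_drift(scope, ["src/module_a/fix.py"]))
    print(classify_drift(scope, ["src/module_a/fix.py", "src/module_b/util.py"]))
```

Matching on `root + "/"` rather than a bare prefix keeps a sibling such as `src/module_ab` from being treated as part of `src/module_a`.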

Why CTOs and Compliance care

Scope creep in AI-assisted development is not theoretical. An agent asked to fix a bug in module A will, without governance, touch module B if it seems related. Intent Lock makes the declared scope enforceable — not advisory.

Long-term compound value

Over time, Intent Lock data reveals which task types are most prone to scope drift. That signal feeds directly into risk scoring and onboarding guidance.

PR Context Fingerprint

New

Every pull request now carries a context fingerprint — a record of the confidence score, transaction ID, and doctrine signals that were active when the PR was written. Reviewers can see, at a glance, whether the agent was working with high-quality context or whether the PR was written without a valid activation.

Why EMs and CTOs care

A PR written with a confidence score of 0.45 deserves more scrutiny than one written with 0.91. The fingerprint surfaces that signal in the review UI, not in a separate tool.

Long-term compound value

Fingerprint history correlates context quality with PR outcomes. Teams that maintain high fingerprint scores have fewer post-merge incidents.

The Governance Layer

Governance is only meaningful if it runs automatically, before code merges. The Governance Layer enforces doctrine signals, context contracts, and compliance policy at the boundary where agents operate — not after the fact.

V3 Doctrine Enforcement

New

Four new deterministic signals run automatically as part of every pre-PR check. They detect: high-risk file changes without sufficient context, changes crossing too many domain boundaries, public contract modifications without test coverage, and governance-suppressed files that appear in the diff anyway. No LLMs. No heuristics. All signals derive from the Code Domain Graph and Trace Receipt.
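The four checks can be sketched as pure functions over three inputs: the PR diff, a path-to-surface mapping taken from the Code Domain Graph, and the bundled and suppressed file sets taken from the Trace Receipt. Everything named below (`ChangeFacts`, the signal identifiers, the `tests` surface label, and both thresholds) is a hypothetical stand-in, not CodeLedger's implementation.

```python
from dataclasses import dataclass, field

# Assumed values for illustration; the release does not publish thresholds
# or the exact surface labels used by the Code Domain Graph.
HIGH_RISK_SURFACES = {"auth", "migrations"}
MAX_SURFACES_PER_CHANGE = 2


@dataclass
class ChangeFacts:
    diff: list                                    # paths changed in the PR
    surfaces: dict                                # path -> surface label
    bundled: set = field(default_factory=set)     # paths in the context bundle
    suppressed: set = field(default_factory=set)  # paths suppressed by governance


def doctrine_signals(facts):
    """Return the names of the signals that fire, in a fixed order."""
    fired = []
    touched = {facts.surfaces.get(p, "unclassified") for p in facts.diff}

    # 1. A high-risk file changed without being part of the bundled context.
    if any(facts.surfaces.get(p) in HIGH_RISK_SURFACES and p not in facts.bundled
           for p in facts.diff):
        fired.append("high_risk_without_context")

    # 2. The change spans more surfaces than the assumed limit.
    if len(touched) > MAX_SURFACES_PER_CHANGE:
        fired.append("too_many_domain_boundaries")

    # 3. A public contract changed with no test file in the same diff.
    if "public contracts" in touched and "tests" not in touched:
        fired.append("public_contract_without_tests")

    # 4. A governance-suppressed file shows up in the diff anyway.
    if facts.suppressed & set(facts.diff):
        fired.append("suppressed_file_in_diff")

    return fired


if __name__ == "__main__":
    facts = ChangeFacts(
        diff=["src/auth/login.py"],
        surfaces={"src/auth/login.py": "auth"},
    )
    print(doctrine_signals(facts))
```

Because each check is a set or membership test over fixed inputs, the same diff and receipt always yield the same signals, which is the property the "deterministic" claim rests on.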

Why Security and Architecture teams care

Architectural doctrine is easy to write down and hard to enforce. CodeLedger makes it enforced-by-default. A signal that fires in CI is cheaper than a signal that fires in production.

Long-term compound value

Doctrine signals are promotable. When the team observes a new violation pattern, it can be formalized as a signal and added to the pre-PR check without a platform change.

Context Contracts

New

A Context Contract formalizes which files must be in the active bundle for a given class of change. Auth changes require the auth surface to be bundled. Migration changes require the migration history to be bundled. Violations surface in `verify` and `pre-pr`, not in code review.
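A contract of this kind can be pictured as a mapping from change class to the path patterns that must appear in the bundle. The sketch below assumes glob-style patterns and invented paths; the real contract syntax is not shown in this release.

```python
from fnmatch import fnmatch

# Hypothetical contracts: change class -> patterns that must each be matched
# by at least one file in the active bundle.
CONTRACTS = {
    "auth": ["src/auth/*"],
    "migration": ["db/migrations/*"],
}


def contract_violations(change_class, bundle):
    """Return the required patterns that no bundled file satisfies."""
    required = CONTRACTS.get(change_class, [])
    return [
        pattern for pattern in required
        if not any(fnmatch(path, pattern) for path in bundle)
    ]


if __name__ == "__main__":
    bundle = ["README.md", "src/api/routes.py"]
    missing = contract_violations("auth", bundle)
    if missing:
        print("auth change is missing required context:", missing)
```

An empty result means the contract is satisfied. A change class with no contract also returns an empty list, which is where a "flag for contract definition" step would hook in.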

Why CISOs and CTOs care

Context contracts are governance-as-code. They express, in a machine-readable form, the requirement that the agent was informed before it acted. This is the audit-trail counterpart to code review.

Long-term compound value

Contracts accumulate. A mature team has contracts for every high-risk change class. New change classes are detected automatically and flagged for contract definition.

Compliance Report

New

A new command produces a machine-readable governance report for any time window: which policies were active, which were violated, which evidence gates fired, and a pass/fail verdict. Designed to drop directly into enterprise audit workflows.

Why CIOs and Legal care

Compliance reports reconstructed manually are expensive, error-prone, and slow. A compliance report generated from real-time ledger data is none of those things. It is also queryable, archivable, and diff-able across releases.

Long-term compound value

As the ledger accumulates history, compliance reports gain historical comparison. "Did our policy posture improve between Q1 and Q2?" is answerable in seconds.

MCP Runtime Enforcement

New

The MCP server — the API surface through which agents call CodeLedger tools — now enforces that high-risk tool calls require a valid, verified activation payload. Enterprise-tier evidence sources are unlocked only when the license tier permits. Local tier runs fully offline, with no external calls.

Why CISOs care

An agent that can call any tool without a verified context state is an agent operating without governance. MCP enforcement moves the control point from code review to the API boundary — where it can be enforced automatically.

Long-term compound value

Enterprise-tier enforcement gates on real trust data. The more history the system accumulates, the more precisely it can distinguish trusted from untrusted call patterns.

The Intelligence Layer

Raw data is not intelligence. The Intelligence Layer synthesizes context signals, team patterns, and historical outcomes into answers that engineering managers and developers can act on today.

Context Debt Score

New

A composite A–F score that answers: is your team's AI context quality getting better or worse? Five signals — average confidence score, prompt insufficiency rate, destabilization trend, unverified completion rate, and stale activation frequency — combine into a single metric with a trend direction and a prior comparison.
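A composite like this can be sketched as an average of the five signals, each normalized to the range 0 to 1, mapped onto letter grades. The equal weighting, the normalization, the grade floors, and the trend tolerance below are all assumptions; the release names the five inputs but not the formula.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DebtSignals:
    # All values assumed to be normalized to 0..1.
    avg_confidence: float               # higher is better
    prompt_insufficiency_rate: float    # lower is better
    destabilization_trend: float        # lower is better
    unverified_completion_rate: float   # lower is better
    stale_activation_frequency: float   # lower is better


# Hypothetical grade floors, checked from best to worst.
GRADE_FLOORS = [(0.9, "A"), (0.8, "B"), (0.7, "C"), (0.6, "D")]


def health(signals):
    """Equal-weight average where 1.0 means no context debt."""
    parts = [
        signals.avg_confidence,
        1 - signals.prompt_insufficiency_rate,
        1 - signals.destabilization_trend,
        1 - signals.unverified_completion_rate,
        1 - signals.stale_activation_frequency,
    ]
    return sum(parts) / len(parts)


def grade(score):
    for floor, letter in GRADE_FLOORS:
        if score >= floor:
            return letter
    return "F"


def trend(current, prior, tolerance=0.01):
    """Compare against the prior period; small moves count as flat."""
    delta = current - prior
    if delta > tolerance:
        return "improving"
    if delta < -tolerance:
        return "worsening"
    return "flat"


if __name__ == "__main__":
    this_week = health(DebtSignals(0.75, 0.25, 0.25, 0.25, 0.25))
    last_week = health(DebtSignals(0.60, 0.40, 0.40, 0.40, 0.40))
    print(grade(this_week), trend(this_week, last_week))
```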

Why EMs and VPs care

You cannot manage what you cannot measure. Context Debt Score is the first aggregate metric for AI context hygiene that derives entirely from real usage data, not surveys or self-reporting.

Long-term compound value

Context Debt is a leading indicator. Teams with rising debt scores have incidents before teams with stable or falling scores. Tracking it weekly is tracking your risk posture.

Skills Registry

New

A formal registry of infrastructure skills — structured checklists that agents receive automatically when a task matches a trigger. Skills cover: adding a new CLI command, wiring a new engine subsystem, adding an MCP tool, writing an insight pack, hardening an auth surface, and more. Skills are injected into the active context bundle when relevant.

Why EMs and DevEx teams care

The gap between a senior engineer's output and a junior engineer's output, when both are using AI coding tools, is context. The senior knows the patterns, the gotchas, the required steps. Skills encode that knowledge and make it available to every agent on every task.

Long-term compound value

Skills are promotable. As agents complete tasks, patterns that appear repeatedly can be formalized as new skills. The corpus grows without manual curation.

Domain Security Packs

New

The prompt coach now ships two built-in signal packs. The security pack is domain-specific: it detects prompts that approach authentication, secret handling, or cryptographic operations without appropriate framing, and escalates the coaching response accordingly. A default cross-domain pack provides baseline signals for all other task types.

Why Security teams care

Prompt coaching without domain awareness produces generic warnings. Domain-aware coaching produces specific, actionable interventions: "This prompt touches the auth surface without scoping the session boundary. Add intent clarity before activating."

Long-term compound value

Domain pack weights tune over time. As the team accepts and rejects interventions, signal weights adjust. False positive rates fall. High-signal interventions become more precise.

What this is worth to your organization

The ROI case for context governance is not theoretical. It derives from four real cost centers that every engineering organization carries, whether or not they have measured them.

Inference waste

Every token an agent processes is billed. Tokens spent on wrong files, irrelevant history, and dead code are pure waste. Benchmark-proven reduction: 28.7% fewer tokens consumed per task with a valid CodeLedger bundle.

Benchmark-proven across 8 public repos, 40 tasks

AI-driven rework cycles

Changes written without sufficient context require rework. Industry estimates place AI-driven rework at 15–25% of output without context control. Intent Lock and evidence-grounded activation directly reduce this rate.

Industry estimate · CodeLedger reduces via recall improvement

Compliance investigation cost

When an auditor or regulator asks what the AI touched, the current answer — without tooling — is a manual reconstruction. Trace Receipts and Compliance Reports replace reconstruction with a query.

One investigation avoided per quarter is material at enterprise scale

Architecture incident cost

Doctrine violations, boundary breaches, and context contract failures caught in CI cost a fraction of what they cost in production. V3 Doctrine Enforcement moves the detection point left of merge.

Each blocked pre-merge incident avoids downstream incident cost

Get started

v0.10.20 is available now. Every capability described above ships in the standard CodeLedger CLI. No additional setup is required for the Evidence and Auditability layers; they activate automatically on your first task scan. Governance and Intelligence features activate with your existing configuration.

npm install -g codeledger && codeledger init


© 2026 ContextECF, Inc. All rights reserved. CodeLedger and Context Control Plane are trademarks of ContextECF, Inc. Patent pending. All described capabilities and architectural methods are proprietary and confidential. Unauthorized reproduction or distribution is prohibited.