Enterprise AI Governance · 20,000+ employee organizations

The Control Plane for Enterprise AI Coding

AI coding agents create velocity.

CodeLedger creates the verification, memory, and governance layer that makes that velocity enterprise-safe.

Logs are history. Ledger is intelligence.

Without the right context,
your AI agents improvise.

Give AI coding agents the context, memory, and governance they need to work safely inside real-world codebases.

Any coding task comes down to 7–10 files that actually matter. Without a control plane, your agents do their best against your entire codebase: introducing shadow changes your security team cannot see, breaking dependencies nobody catches, and making confident claims that are simply wrong. Every session. Every engineer. Compounding.

Built on a deep patent-pending portfolio in enterprise context engineering, ContextECF CodeLedger brings deterministic context selection, repo-native memory, policy-aware guardrails, and audit-ready intelligence to AI-assisted software development.

At 20,000 employees, this is not a developer quality problem. It is $1.38M in provable annual waste, growing audit exposure, and operational risk that has no owner.

No credit card required · 8 Design Partner slots · Founding-team attention

One control plane across every AI coding tool

Claude Code · Cursor · Codex · Kiro · Windsurf · Copilot · Antigravity · Grok

The compounding problem

Five pains that get worse every quarter you wait

Each one is costly alone. Together they compound — because every new engineer, every new AI tool subscription, and every new quarter of unaudited AI output makes the next one harder to contain.

$1.38M

est. annual inference waste

CFO

You are paying for every wrong file your agent reads

28.7% of the tokens your agents process are context noise: wrong files, irrelevant history, dead code your codebase forgot about. For 2,000 engineers at $200/month in API costs, that is $1.38M a year in provable waste. The inference bill arrives monthly. The audit trail explaining it never does.

Benchmark-proven · 8 public repos · 40 tasks

0

signed records of what the agent changed

CIO

When compliance asks what the AI touched, you have no answer

Your security review, your board, your acquirer — each of them will ask what the AI agent changed last quarter. What files did it read? What did it skip? What evidence was checked? Without a signed audit trail, the current answer is: we do not know. That answer does not survive a procurement cycle.

Truth Audit Certificates address this directly

8–12 mo

avg new developer ramp time — unchanged by AI

CTO

AI tools do not shorten onboarding without shared context

AI coding agents start every session with a blank slate. They do not know your validated patterns, your naming conventions, or why you made that architectural decision three years ago. New hires get the same blank slate. The knowledge your senior engineers built over years lives nowhere the agent can use it.

Team Context Ledger → institutional memory that persists

3–4×

AI tools per engineer · zero shared governance

CTO / CFO

Three AI subscriptions. Three isolated contexts. No shared memory.

Your engineers are running Claude Code, Cursor, and Copilot in parallel. None of them share what your codebase has validated. Every tool maintains its own context in isolation. You are paying for three separate attempts at the same problem, with no way to govern, audit, or unify the output.

One control plane across all agents — agent-agnostic by design

47 days

avg enterprise detection time for shadow AI changes

CIO / CTO

Shadow changes bypass your entire review process

Without context control, agents touch files they should never see — infrastructure configs, security boundaries, shared utilities. No signal reaches your review process. The change merges, deploys, and surfaces as an incident weeks later. By the time it is caught, the blast radius is already wide.

Architecture guardrails in CI · 1,398 boundary violations caught in microsoft/vscode alone

Compounding

the real cost

None of these problems stay contained

Shadow changes become technical debt. Bad context becomes bad patterns baked into your codebase. Knowledge loss accelerates with every senior departure. And every new AI agent subscription you add without governance multiplies the surface area for all of the above.

CodeLedger is the control plane that stops the compounding.

Four levers · one lifecycle

CodeLedger operates across every stage of how software gets built

Most tools optimize the agent. CodeLedger governs the system — at the moment of context selection, the moment of review, across every session, and for every engineer who comes after.

Develop

Immediate value

The agent starts informed. Not cold.

  • Scores every file in the repo for the current task
  • Selects the minimal, highest-signal context bundle
  • Validates the task prompt against 101 quality signals
  • Surfaces ⚠ Prompt health before the first line is written

28.7% token reduction · 100% top-5 stability
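To make "scores every file, then selects the minimal bundle" concrete, here is a minimal sketch of deterministic selection under a token budget. It is an illustration, not CodeLedger's implementation: `FileScore`, `select_bundle`, the token counts, the budget, and the `min_score` cutoff are all assumptions made up for this example. The scores are taken as given.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FileScore:
    path: str
    score: float  # relevance in [0, 1]; how it is computed is out of scope here
    tokens: int   # hypothetical token cost of including the file


def select_bundle(files: list[FileScore], token_budget: int,
                  min_score: float = 0.5) -> list[FileScore]:
    """Greedy selection: highest score first, path as a deterministic tie-breaker."""
    ranked = sorted(files, key=lambda f: (-f.score, f.path))
    bundle: list[FileScore] = []
    used = 0
    for f in ranked:
        if f.score < min_score:
            break  # everything after this point is lower-signal
        if used + f.tokens <= token_budget:
            bundle.append(f)
            used += f.tokens
    return bundle


# Hypothetical candidates; token counts are invented for the example.
candidates = [
    FileScore("src/api/payments/router.ts", 0.91, 1200),
    FileScore("src/middleware/auth.ts", 0.88, 900),
    FileScore("src/api/payments/service.ts", 0.85, 1500),
    FileScore("tests/payments/router.test.ts", 0.79, 800),
    FileScore("docs/changelog.md", 0.12, 4000),
]
bundle = select_bundle(candidates, token_budget=4000)
for f in bundle:
    print(f"{f.score:.2f}  {f.path}")
```

Because ranking depends only on the scores and paths, the same inputs always yield the same bundle, which is the property that "top-5 stability" relies on.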

Review

Protective value

Risk signal before the merge.

  • Risk · Drift · Evidence Gaps on every PR
  • Deterministic additive model — no AI in the scoring path
  • Exact file and line where conventions were bypassed
  • Appears in the PR comment where engineers already look

45% catch rate · 0 hallucinations · 11 PRs verified
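A deterministic additive model can be as simple as fixed weights summed over the drivers that fired. The sketch below shows the general shape only. The driver names are borrowed from the field-test examples on this page, but the weights, the thresholds, and the function names are assumptions invented for illustration.

```python
# Hypothetical weights; the real values are not published on this page.
WEIGHTS = {
    "dependency_manifest_changed": 30,
    "cross_package_boundary": 25,
    "production_change_without_tests": 30,
    "uncovered_failure_branch": 20,
}


def risk_score(drivers: list[str]) -> int:
    """Sum fixed weights for each distinct driver; no model call, no randomness."""
    unique = set(drivers)
    unknown = unique - set(WEIGHTS)
    if unknown:
        raise ValueError(f"unknown drivers: {sorted(unknown)}")
    return sum(WEIGHTS[d] for d in unique)


def risk_level(score: int) -> str:
    """Map a score to a level using assumed thresholds."""
    if score >= 70:
        return "High"
    if score >= 40:
        return "Medium"
    return "Low"


fired = ["dependency_manifest_changed", "cross_package_boundary"]
score = risk_score(fired)
print(score, risk_level(score))  # 55 Medium
```

Since every point in the score traces back to a named driver, the PR comment can list exactly why a change was flagged.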

Improve

Compounding value

Every session makes the next one better.

  • Successful patterns promoted to institutional memory
  • Feedback flywheel: deploy outcomes flow back to signal weights
  • First-pass success rate climbs 62% → 80% over 8 weeks
  • No cloud, no retraining — improves inside your environment

+18 pts first-pass rate · −13 pts rework · Week 1 → Week 8
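One way outcomes could flow back into signal weights without any retraining is a small, bounded local update. The sketch below uses an exponential moving average. That technique is an assumption chosen for illustration, as are the function name, the learning rate, and the starting weight; this page does not describe the actual update rule.

```python
def update_weight(weight: float, succeeded: bool, rate: float = 0.1) -> float:
    """Nudge a signal weight toward 1.0 after a good outcome, toward 0.0 after a bad one."""
    target = 1.0 if succeeded else 0.0
    updated = weight + rate * (target - weight)
    return min(1.0, max(0.0, updated))  # keep the weight inside [0, 1]


# Hypothetical history for one signal: two clean deploys, then one rollback.
weight = 0.5
for outcome in [True, True, False]:
    weight = update_weight(weight, outcome)
print(round(weight, 4))  # 0.5355
```

The update is plain arithmetic on local state, so it can run inside your environment with no external service involved.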

Retain

Institutional value

When engineers leave, the knowledge stays.

  • 5 persistent ledgers: truth, validation, ontology, structure, evidence
  • New hires inherit proven patterns, not a blank slate
  • Architectural invariants enforced across agents and sessions
  • Evidence gates prevent low-confidence findings from becoming noise

8–12 mo ramp time · 25% improvement with shared context
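An evidence gate can be pictured as a filter that a finding must pass before anyone sees it. This is a minimal sketch under stated assumptions: the `Finding` fields, the thresholds, and the example findings are hypothetical and are not taken from the product's schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    message: str
    confidence: float    # assumed to be in [0, 1]
    evidence_count: int  # number of independent supporting observations


def passes_gate(finding: Finding, min_confidence: float = 0.7,
                min_evidence: int = 2) -> bool:
    """A finding surfaces only if it is both confident and corroborated."""
    return (finding.confidence >= min_confidence
            and finding.evidence_count >= min_evidence)


findings = [
    Finding("naming convention bypassed in payments service", 0.92, 3),
    Finding("possible unused export", 0.40, 1),
    Finding("layer boundary crossed", 0.85, 1),
]
surfaced = [f.message for f in findings if passes_gate(f)]
print(surfaced)
```

Requiring both conditions means a single high-confidence guess with no corroboration stays out of the review queue.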

CodeLedger is the only system that operates across all four stages of the engineering lifecycle — not just code generation.

The numbers for a 20,000-person company

What uncontrolled AI coding costs, line by line

Assumptions: 2,000 engineers actively using AI coding tools · $200/month average inference spend · 28.7% token reduction benchmark-proven across 8 public repos and 40 tasks.

Cost category | Basis | Annual impact
Inference waste | 2,000 engineers × $200/mo × 28.7% token reduction (Benchmark-proven) | $1,377,600 / yr
AI-driven rework cycles | 15–25% of AI output requires rework without context control; 2,000 devs × conservative 2 hrs/mo (Industry estimate) | $3.6M / yr
New developer ramp gap | 200 new hires/yr · 25% ramp improvement with shared context · $75k avg onboarding cost (Directional) | $3.75M / yr
Compliance exposure | One AI-related security incident requiring external audit (Directional) | $500k – $2M

Only the inference savings figure is benchmark-proven. All other figures are conservative directional estimates and should be validated against your organization's actual data.
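The arithmetic behind the first three rows can be checked in a few lines. The inputs are the stated assumptions above. The one derived value is the hourly rate for rework, which the table does not state: it is back-calculated here from the $3.6M total and should be read as implied, not quoted.

```python
ENGINEERS = 2_000
MONTHLY_SPEND_USD = 200
TOKEN_REDUCTION_PER_MILLE = 287  # 28.7%, kept as an integer to avoid float drift

# Inference waste: annual spend times the benchmarked token reduction.
annual_spend = ENGINEERS * MONTHLY_SPEND_USD * 12
inference_waste = annual_spend * TOKEN_REDUCTION_PER_MILLE // 1000

# Rework: 2 hours per developer per month; the hourly rate is implied, not stated.
rework_hours = ENGINEERS * 2 * 12
implied_hourly_rate = 3_600_000 / rework_hours

# Ramp gap: 200 new hires, 25% improvement on a $75k onboarding cost.
ramp_gap = 200 * 75_000 * 25 // 100

print(f"Inference waste: ${inference_waste:,} / yr")        # $1,377,600 / yr
print(f"Rework hours:    {rework_hours:,} / yr")            # 48,000 / yr
print(f"Implied rate:    ${implied_hourly_rate:.0f} / hr")  # $75 / hr
print(f"Ramp gap:        ${ramp_gap:,} / yr")               # $3,750,000 / yr
```

Swap in your own headcount, spend, and rates to see how each line moves for your organization.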

How CodeLedger addresses each pain

Five capabilities. Five business outcomes.

Pain 1

Inference spend with no ROI line item

Context-Compiler

28.7% token reduction across 40 tasks on 8 public TypeScript repos. Every optimization is auditable — trim, hoist, retain, or skip — with a full trace. Zero omission incidents. 100% top-5 file stability.

See the benchmark

Pain 2

No audit trail for AI-generated code

Truth Audit Certificates

Signed, tamper-evident certificates that record what the agent read, what changed, and what was skipped. Ready for procurement, security review, board reporting, or acquisition diligence. No source code leaves your environment.
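For readers who want to see what "signed, tamper-evident" can mean in practice, here is a minimal sketch using an HMAC over a canonical JSON payload, with each record chained to the previous signature. It is a stand-in built on the Python standard library, not the certificate format itself: the field names, the key handling, and the choice of HMAC-SHA256 are assumptions, and a production system would more likely use asymmetric signatures.

```python
import hashlib
import hmac
import json


def _payload(record: dict, prev_signature: str) -> bytes:
    """Canonical bytes: sorted keys and fixed separators so the same data always hashes the same."""
    body = {"prev": prev_signature, "record": record}
    return json.dumps(body, sort_keys=True, separators=(",", ":")).encode("utf-8")


def sign_record(record: dict, key: bytes, prev_signature: str = "") -> dict:
    """Return a certificate that binds the record to the previous certificate."""
    signature = hmac.new(key, _payload(record, prev_signature), hashlib.sha256).hexdigest()
    return {"prev": prev_signature, "record": record, "signature": signature}


def verify(cert: dict, key: bytes) -> bool:
    """Recompute the signature and compare in constant time."""
    expected = hmac.new(key, _payload(cert["record"], cert["prev"]), hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, cert["signature"])


KEY = b"demo-key-not-for-production"  # hypothetical key for the example only
cert = sign_record(
    {
        "files_read": ["src/api/payments/router.ts"],
        "files_changed": ["src/api/payments/router.ts"],
        "files_skipped": ["infra/terraform/main.tf"],
    },
    KEY,
)
tampered = {**cert, "record": {**cert["record"], "files_changed": []}}
print(verify(cert, KEY), verify(tampered, KEY))  # True False
```

Note that the record holds file paths only, which is consistent with evidence that can be exported while source code stays in your environment.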

See certificate tiers

Pain 3

Institutional knowledge evaporating

Team Context Ledger

Validated patterns, accepted changes, naming conventions, and architectural decisions are persisted across sessions. Every agent — and every new developer — starts with the context your senior engineers built, not a blank slate.

Enterprise overview

Pain 4

Three AI tools with zero shared governance

Agent-agnostic control plane

One governance layer across Claude Code, Cursor, Codex, Kiro, Windsurf, and Copilot. Context selection, guardrails, and audit evidence work identically regardless of which tool any engineer is using.

How it works

Pain 5

Shadow changes and architecture drift

CI Guardrails + PR Intelligence

Risk, drift, and evidence-gap signals on every PR. Conditional gates run heavy checks only when risk is High — keeping CI fast by default. VS Code: 1,398 boundary violations caught in a single pass.
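The conditional-gate idea is easy to picture as a planning step that decides which checks a PR needs. The sketch below is illustrative only: the check names and the `plan_checks` function are hypothetical, and only the rule "heavy checks run when risk is High" comes from the description above.

```python
FAST_CHECKS = ["lint", "unit_tests"]                   # hypothetical: always run
HEAVY_CHECKS = ["integration_tests", "boundary_scan"]  # hypothetical: High risk only

VALID_LEVELS = {"Low", "Medium", "High"}


def plan_checks(risk_level: str) -> list[str]:
    """Return the checks to run for a PR at the given risk level."""
    if risk_level not in VALID_LEVELS:
        raise ValueError(f"unknown risk level: {risk_level!r}")
    if risk_level == "High":
        return FAST_CHECKS + HEAVY_CHECKS
    return list(FAST_CHECKS)


for level in ("Low", "Medium", "High"):
    print(level, plan_checks(level))
```

Most PRs take the fast path, so CI time grows only for the changes that warrant a closer look.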

Governance planes

Where it lives

Local intelligence. GitHub governance. Executive visibility.

Zero cloud dependency by default. The intelligence lives on the developer machine. Governance runs in GitHub. Evidence surfaces in a self-hosted dashboard. Three zones. No external SaaS.

Zone 1

Developer Machine

.codeledger/
├── active-bundle.md     ← current task context
├── memory/
│   ├── recent-truth.json
│   ├── ontology.json
│   ├── structural-trust.json
│   ├── evidence-gates.json
│   └── validation-ledger.json
├── patterns/            ← golden patterns
└── bin/
    └── codeledger-standalone.cjs

Local-first. Air-gap capable. Zero cloud dependency. Works offline. Works in regulated environments.

Zone 2

GitHub Enterprise

.github/workflows/
├── codeledger-verify.yml  ← CI on every PR
├── codeledger-pr.yml      ← Risk/Drift/Evidence
└── codeledger-guard.yml   ← release gate

.codeledger/team-ledger/
├── patterns/              ← shared golden patterns
└── merge-memory-records.jsonl

# Two-line integration:
- uses: codeledgerECF/codeledger@v0.10.19
  with:
    github-token: ${{ secrets.GITHUB_TOKEN }}

Governance runs here. Teams share patterns here. The PR signal appears where engineers already look.

Zone 3

Engineering Dashboard

[your-org].github.io/engineering
├── /overview    ← Architecture Health Score
├── /value       ← hours saved, dollar impact
├── /integrity   ← Architecture / Impl / Release
├── /engineering ← agent scorecards
├── /fleet       ← cross-repo (Enterprise)
└── /evidence    ← drill-down event log

Self-hosted on GitHub Pages. Board-ready metrics with evidence behind every number. SIEM-ready audit export.

Local is the source of truth. GitHub is the governance layer. The dashboard is the evidence surface. One system, three zones, no cloud.

For enterprise rollout

One platform. Five governance planes. Every stakeholder covered.

The enterprise dashboard organizes CodeLedger into five outcome planes, each with access-controlled depth and exportable evidence for the teams that need it — engineering, security, compliance, finance, and leadership.

1

Context

Engineering

The right files, every time

2

Verify

Security

Risk caught before it merges

3

Govern

Compliance

Signed evidence on demand

4

Fleet

Leadership

Visibility across every agent

5

Learn

Platform

Improves with your codebase

Evidence, not marketing copy

By the numbers

Every metric below maps to a real test, a real repo, or a real session record. No projections dressed as benchmarks.

What we measure | Typical result | How we measure it
Context-Compiler token reduction | 28.7% weighted avg | 40-task benchmark · 8 public TS repos
Top-5 file stability | 100% | Top files retained after optimization on every task
First-pass success rate | >70% | Tasks completed without agent retry
Token reduction vs. full repo | >90% | Bundle tokens vs. estimated full-repo context
Bundle recall | >60% (growing) | % of touched files that were in the bundle
Rework reduction | −13% over 8 wks | CIC failure rate trend, controlled environment
Release surfaces verified | 7/7 (168/168) | release-verify propagation check
Cross-repo PR scoring | Nightly | Live PRs across vercel/next.js · facebook/react · postgres/postgres
PR-Review-Intel records captured | Every run | Per-PR ledger with stable schema, audit-grade
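Two of these metrics follow directly from their definitions in the table. The sketch below computes bundle recall (share of touched files that were in the bundle) and a top-5 stability check (top-ranked files still present after optimization). The function names and the sample file lists are assumptions for illustration, not the benchmark harness.

```python
def bundle_recall(bundle: list[str], touched: list[str]) -> float:
    """Fraction of files the change actually touched that were in the bundle."""
    touched_set = set(touched)
    if not touched_set:
        return 1.0  # nothing was touched, so nothing was missed
    return len(touched_set & set(bundle)) / len(touched_set)


def top_k_stable(ranked_before: list[str], bundle_after: list[str], k: int = 5) -> bool:
    """True if the k highest-ranked files all survive optimization."""
    return set(ranked_before[:k]) <= set(bundle_after)


# Hypothetical task: files are named a.ts .. g.ts for brevity.
ranked = ["a.ts", "b.ts", "c.ts", "d.ts", "e.ts", "f.ts", "g.ts"]
optimized = ["a.ts", "b.ts", "c.ts", "d.ts", "e.ts", "f.ts"]  # g.ts trimmed
touched = ["a.ts", "c.ts", "f.ts", "x.ts", "y.ts"]

print(bundle_recall(optimized, touched))  # 0.6
print(top_k_stable(ranked, optimized))    # True
```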

62% → 80%

First-pass success rate

Week 1 → Week 8

24% → 11%

Rework rate

Week 1 → Week 8

18% → 64%

Pattern reuse rate

Week 1 → Week 8

ECF

Part of the ContextECF platform

Enterprise Context Fabric — the intelligence layer for every agent

CodeLedger is the engineering module of a broader platform: the ContextECF Enterprise Context Fabric. Where CodeLedger governs AI coding agents and repos, the Enterprise Context Fabric extends governed context across CRM, support, sales, ops, and every system your teams work in — powered by the same deterministic, audit-ready primitives.

Design Partner Program · 8 founding slots

Save over $1M in year one.
Shape the governance standard before it's set.

Design partners get founding-team attention, pricing locked at $45/seat for 12 months, early access to OVPI signal, and direct input into the roadmap while decisions still bend. At 2,000 engineers and $200/month in inference spend, CodeLedger's 28.7% token reduction alone saves $1.38M annually — before we account for rework, onboarding, or compliance overhead.

8

founding slots total

$45

per seat/month, locked 12 mo

$1.38M

yr-1 inference savings (2k engineers)

6

structured calls with founding team

Independent evidence

Run blind against public repos you can inspect yourself

Selection criteria were written before scoring. Results include catches, correct silences, and the one domain where we do not yet perform — because hiding misses is not how trust is built.

nestjs/nest

2 controlled tasks

Consistent 25–27% reduction on a large TS framework

Two tasks across a large TypeScript framework monorepo — rate limiting middleware and DI container refactor — both scored PASS with 25–27% token reduction and top-5 file stability. No crashes, no omission incidents.

Full study →

vercel/next.js

5 PRs blind-scored

Caught the revert before it happened

PR #93071 was flagged Medium WARN with dependency_manifest_changed and cross_package_boundary drivers. It was reverted 3 days later by #93226: "breaks lerna — cannot resolve catalog: references."

See the revert →

facebook/react

5 PRs blind-scored

Flagged the gap before the post-merge move

PR #36253 was flagged production_change_without_tests + uncovered_failure_branch. The entire implementation was relocated to the React Native repo days later "due to discovered bugs and iteration challenges."

See PR #36253 →

postgres/postgres

7 controlled tasks
honest assessment

Honest: 1.1% recall on cold-start C

Our model is tuned for JS/TS monorepos. On a 5,000-file C codebase with no prior ledger, average bundle recall is 1.1%. We publish the full study because calibration transparency is the product.

Full study →

Latest run · 3 repos · Docker verified

Full report →
Repo | Task | Token reduction | Quality
nestjs/nest | Add rate limiting middleware | 8,777 → 6,372 (-27.4%) | PASS
nestjs/nest | Refactor DI container async providers | 8,770 → 6,540 (-25.4%) | PASS
vercel/next.js | Fix hydration mismatch in app router | 8,247 → 6,021 (-27.0%) | PASS
prisma/prisma | Add field type to schema parser | 2,492 → 2,492 (+0.0%) | WATCH
prisma/prisma | Add PostgreSQL JSON query operators | 8,521 → 7,268 (-14.7%) | WATCH

WATCH = no trim opportunities found or dense cross-package types — not a failure. Nest & Next.js: consistent 25–27% reduction, top-5 stable.
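The percentages in the latest-run table can be reproduced from the before and after token counts. This short check uses only the numbers published in the table; the variable and function names are ours.

```python
# (repo, task, baseline tokens, optimized tokens) as published in the table
rows = [
    ("nestjs/nest", "Add rate limiting middleware", 8777, 6372),
    ("nestjs/nest", "Refactor DI container async providers", 8770, 6540),
    ("vercel/next.js", "Fix hydration mismatch in app router", 8247, 6021),
    ("prisma/prisma", "Add field type to schema parser", 2492, 2492),
    ("prisma/prisma", "Add PostgreSQL JSON query operators", 8521, 7268),
]


def reduction_pct(baseline: int, optimized: int) -> float:
    """Signed percentage change in tokens, rounded to one decimal place."""
    return round((optimized - baseline) / baseline * 100, 1)


changes = [reduction_pct(before, after) for _, _, before, after in rows]
for (repo, task, before, after), pct in zip(rows, changes):
    print(f"{repo:15} {before:>6,} -> {after:>6,}  ({pct:+.1f}%)  {task}")
```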

11 PRs · 45% catch rate · 0 hallucinations · pre-registered selection criteria

→ Full field-tests report

Context-Compiler benchmark · 2026-05-05

28.7% weighted token reduction · 100% top-5 file stability · 0 omission incidents · 8 public TS repos · 40 tasks · merged CLI

→ Full benchmark report

What it looks like in practice

$ codeledger activate --task "add rate limiting to the payments API"

✔  Scanned 1,847 files in 2.3 s
✔  Context bundle ready  (18 files · ~8,100 tokens · 28.3% below baseline)

Ranked files:
  0.91  src/api/payments/router.ts      keyword_match, centrality
  0.88  src/middleware/auth.ts           dependency_depth, churn
  0.85  src/api/payments/service.ts     keyword_match, recent_touch
  0.79  tests/payments/router.test.ts   test_relevant
  ...

Confidence: HIGH (0.94)  ·  Top-5 stable  ·  0 omission flags

Ready to govern your AI coding program?

Start with the slot that controls the most.

Design partners get locked pricing, founding-team attention, and measurable inference savings from day one. Enterprise trial gives you 30 days to run the benchmark against your own codebase.

No credit card required · 8 Design Partner slots · Local-first, no source code uploaded