Codex Analytics shows you what AI costs.
Git AI shows you what AI delivered.
What Codex's native dashboard tracks
OpenAI ships a Codex workspace analytics dashboard, plus an Analytics API and a Compliance API for governance. The dashboard tracks daily users by product (CLI, IDE, cloud, Code Review), session and message counts, code-review feedback sentiment and severity, and thumbs-up/down ratings on suggestions.
OpenAI is unusually direct about what isn't tracked. Their docs explicitly call out that the analytics surface does not report:
- Lines of code generated
- Acceptance rate of suggestions
- Code quality or performance KPIs
Compliance API audit logs are kept for 30 days, and exports apply only to ChatGPT-authenticated usage — API-key traffic falls outside the report.
Their analytics measure usage and costs, not value. You can't tie token spend on a Codex session to a specific PR or to the value it delivered. You can't see how AI code holds up over time, how much of it gets rewritten, or how much rework it generates downstream. The cost is precise. The outcomes are not.
Track AI code through the entire SDLC
How much of what Codex generates gets thrown away before it's even committed? How many of its edits get rejected during code review? How much AI code actually makes it through merge? Once it ships, does it cause incidents? Get rewritten weeks later? Pile up rework that never gets accounted for?
The only way to answer those questions is to track AI code through the entire SDLC. Git AI extends Git's native git blame with line-level AI attribution, so every line of AI code can be followed from the moment it enters your codebase to the moment it churns out.
Because attribution is recorded by Codex itself — via agent hooks — every line keeps its provenance through rebases, squashes, cherry-picks, and merges. Lines generated, accepted, committed, merged, and durable in production — the metrics OpenAI explicitly doesn't ship.
Token spend that maps to outcomes
OpenAI's Codex usage page shows daily totals for threads, turns, and credits. That's enough to see who's burning the most, but not enough to see what those credits bought you.
Git AI breaks Codex spend down by commit, PR, repository, team, and individual — so you can see which work cost what, which repos are token sinks, and where the inefficient sessions are concentrated.
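The breakdown above is, at its core, a roll-up of per-session spend along different dimensions. A minimal sketch, with made-up session records (Git AI's real schema may differ):

```python
from collections import defaultdict

# Hypothetical session records: repo, PR, and credits burned per session
sessions = [
    {"repo": "api", "pr": 101, "credits": 12.5},
    {"repo": "api", "pr": 102, "credits": 3.0},
    {"repo": "web", "pr": 207, "credits": 40.0},
]

def spend_by(key: str) -> dict:
    """Roll raw session spend up to one dimension, e.g. repo or PR."""
    totals: dict = defaultdict(float)
    for s in sessions:
        totals[s[key]] += s["credits"]
    return dict(totals)

print(spend_by("repo"))  # {'api': 15.5, 'web': 40.0}
print(spend_by("pr"))    # {101: 12.5, 102: 3.0, 207: 40.0}
```

The same grouping over team or individual is what turns "who is burning credits" into "which work those credits bought."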
Measure agent autonomy
Codex's analytics don't even surface acceptance rate. Two sessions can both ship 100% Codex-authored code — and one of them is a straight line from intent to production while the other is a loop of corrections, abandoned branches, and rewrites. Nothing in OpenAI's workspace dashboard can tell them apart.
A straight line from intent to production
- Pulls context from the issue tracker
- Writes failing tests, then the fix
- Opens a PR with a clear cause-and-fix
- Reviewer approves on the first read
Steering, rewrites, and regressions
- Agent struggles to reproduce — repo docs are thin
- Engineer steers it toward the right files
- Reviewer spots a missed edge case; agent re-prompted
- Customer reports a regression; manual hotfix
Only one of these sessions is actually autonomous. Git AI measures the gap so you can find the prompts, skills, and codebase context that close it.
Measure token efficiency, keep costs in check
Codex pricing is usage-based by design. It's not enough to know how many credits you're burning. You need to know what those credits are buying you, and whether the outcomes justify the costs.
For every 100 lines Codex generates, how many reach production? In well-prepared codebases — strong tests, a clear AGENTS.md, good architectural docs — we see ratios near 4:1. In sparse codebases, the same agent can run 50:1 or worse, with most of what it generates getting regenerated, abandoned, or rewritten before it reaches a commit. Git AI shows you where agents get stuck so you can make your codebases AI-ready and cut token costs.
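The ratios above are simple arithmetic over two counts per session. A minimal sketch (the session numbers are invented for illustration):

```python
def generation_ratio(lines_generated: int, lines_in_production: int) -> float:
    """Lines the agent generated per line that survived to production."""
    if lines_in_production == 0:
        return float("inf")  # everything was thrown away
    return lines_generated / lines_in_production

# Hypothetical sessions: a well-prepared repo vs. a sparse one
print(generation_ratio(400, 100))   # 4.0  -> the ~4:1 case
print(generation_ratio(5000, 100))  # 50.0 -> the 50:1 case
```

At usage-based pricing, the gap between those two ratios is the token bill for code that never shipped.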
Code durability and incidents traced back to the prompt
OpenAI's Compliance API audit logs disappear after 30 days. Git AI keeps tracking. We measure how much AI code is rewritten, reverted, or refactored in the 30 / 60 / 90 days after it ships — the durability of agent output. Across our fleet we see 30-day durability range from ~30% to ~85% depending on the team.
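Durability as described is the fraction of AI-authored lines still present N days after they ship. A sketch with hypothetical inputs:

```python
def durability(ai_lines_shipped: int, ai_lines_surviving: int) -> float:
    """Fraction of AI-authored lines still present N days after merge."""
    if ai_lines_shipped == 0:
        return 0.0  # no AI lines in the window; nothing to measure
    return ai_lines_surviving / ai_lines_shipped

# Hypothetical team: 1,200 AI lines merged, 900 still unchanged at day 30
print(f"{durability(1200, 900):.0%}")  # 75%, within the range cited above
```

Computed at 30, 60, and 90 days, the same ratio becomes a decay curve for each team's agent output.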
When a production incident fires, every line involved can be walked back to the exact Codex session, model version, and prompt that produced it. The session transcript lives in the Prompt + Context Store, so post-mortems can answer not just "who wrote this" but "what was the agent told, and what context did it have when it wrote it."
One dashboard for every agent
Most teams don't only run Codex. Cursor, Claude Code, Copilot, Gemini — each ships its own dashboard, with its own assumptions and its own attribution heuristics. Git AI is built on an open standard that unifies them all.
Cursor, Claude Code, OpenAI Codex, GitHub Copilot, Gemini CLI, OpenCode, Continue, Droid, Junie, Rovo Dev, Amp, Windsurf
Getting the data into one place is the easy part. Once every line of AI code is attributed and tracked through the SDLC, you can:
- Accelerate adoption. Spot the teams, repos, and prompting patterns that get the most leverage from agents — and roll what's working out everywhere else.
- Make AI work for your codebase. Find where agents get stuck, where context is thin, where tests and skills need to be tightened. The data tells you exactly where to invest.
- Justify the spend. Tie tokens to merged PRs, durable production code, and incidents avoided. Show finance and leadership exactly what the AI budget delivered.
Git AI: The open-source standard for tracking AI code from prompt to production.
curl -sSL https://usegitai.com/install.sh | bash