Codex Analytics shows you what AI costs.
Git AI shows you what AI delivered.

What Codex's native dashboard tracks

OpenAI ships a Codex workspace analytics dashboard, plus an Analytics API and a Compliance API for governance. The dashboard tracks daily users by product (CLI, IDE, cloud, Code Review), session and message counts, code review feedback sentiment and severity, plus thumbs up/downs on suggestions.

OpenAI is unusually direct about what isn't tracked. Their docs explicitly call out the limits of the analytics surface.

Compliance API audit logs are kept for 30 days, and exports apply only to ChatGPT-authenticated usage — API-key traffic falls outside the report.

Their analytics measure usage and costs, not value. You can't tie token spend on a Codex session to a specific PR or to the value it delivered. You can't see how AI code holds up over time, how much of it gets rewritten, or how much rework it generates downstream. The cost is precise. The outcomes are not.

Track AI code through the entire SDLC

How much of what Codex generates gets thrown away before it's even committed? How many of its edits get rejected during code review? How much AI code actually makes it through merge? Once it ships, does it cause incidents? Get rewritten weeks later? Pile up rework that never gets accounted for?

The only way to answer those questions is to track AI code through the entire SDLC. Git AI extends Git's native git blame with line-level AI attribution, so every line of AI code can be followed from the moment it enters your codebase to the moment it churns out of it.
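
What "extends git blame" means in practice: provenance rides with the commit, so any line can be walked back to the agent and session that produced it. Here is a minimal sketch of the idea using plain git, assuming the metadata lives in git notes under an ai namespace; the note payload, namespace, and file path below are illustrative, not Git AI's actual schema.

# Attach AI-attribution metadata to the current commit (illustrative payload).
git notes --ref=ai add -m '{"agent":"codex","session":"sess_abc123","files":{"src/auth/refresh.ts":[[12,48]]}}' HEAD

# Later: find the commit behind a suspect line, then pull its AI provenance.
commit=$(git blame -L 30,30 --porcelain src/auth/refresh.ts | head -1 | cut -d' ' -f1)
git notes --ref=ai show "$commit"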

AI Code Journey

12k GENERATED → 4.2k COMMITTED → 3.1k REVIEWED → 2.4k MERGED → 1.0k DURABLE 30d
Drop-offs at each stage: 7.8k discarded by agent, 1.1k rewritten in review, 0.7k PR closed without merge, 1.4k churned within 30 days
12 lines generated for every 1 that reaches production
65% of AI code is discarded before commit
58% of merged code is churned within 30 days

Because attribution is recorded by Codex itself — via agent hooks — every line keeps its provenance through rebases, squashes, cherry-picks, and merges. Lines generated, accepted, committed, merged, and durable in production — the metrics OpenAI explicitly doesn't ship.
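
The timing is the point: because attribution is captured at write time rather than inferred from diffs after the fact, it survives later history edits. A sketch of what a write-time hook handler could look like; the hook wiring, environment variable, and arguments are placeholders for illustration, not Codex's or Git AI's real interface.

#!/usr/bin/env bash
# Illustrative post-edit hook handler (names and payload assumed, not real).
set -euo pipefail

SESSION_ID="${CODEX_SESSION_ID:-unknown}"   # hypothetical env var from the agent
EDITED_FILE="$1"                            # file the agent just wrote
EDITED_RANGE="$2"                           # e.g. "12-48"

# Stage the attribution; a commit-time hook would fold this into a git note.
mkdir -p .git/ai-attribution
printf '%s\t%s\t%s\t%s\n' "$(date -u +%FT%TZ)" "$SESSION_ID" "$EDITED_FILE" "$EDITED_RANGE" \
  >> .git/ai-attribution/pending.tsv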

Token spend that maps to outcomes

OpenAI's Codex usage page shows daily totals for threads, turns, and credits. That's enough to see who's burning the most, but not enough to see what those credits bought you.

Cost per PR

#1247  Fix auth token refresh    codex  $15.31
#1244  $▓.▓▓
#1241  $▓.▓▓
#1238  $▓.▓▓

Git AI breaks Codex spend down by commit, PR, repository, team, and individual — so you can see which work cost what, which repos are token sinks, and where the inefficient sessions are concentrated.
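
Once each session is linked to the PR it worked on, the roll-up is a one-pass aggregation. The export file name and columns below are assumptions for this sketch, not a Git AI export format.

# sessions.tsv (assumed): session_id <TAB> pr_number <TAB> credits_spent
awk -F'\t' '{ cost[$2] += $3 } END { for (pr in cost) printf "PR #%s\t%.2f credits\n", pr, cost[pr] }' sessions.tsv \
  | sort -t$'\t' -k2 -rn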

Measure agent autonomy

Codex's analytics don't even surface accept rate. Two sessions can both ship 100% Codex-authored code — and one of them is a straight line from intent to production while the other is a loop of corrections, abandoned branches, and rewrites. Nothing in OpenAI's workspace dashboard can tell them apart.

Session 1: Autonomous

A straight line from intent to production

  • Pulls context from the issue tracker
  • Writes failing tests, then the fix
  • Opens a PR with a clear cause-and-fix
  • Reviewer approves on the first read
Session 2: Heavy steering

Steering, rewrites, and regressions

  • Agent struggles to reproduce — repo docs are thin
  • Engineer steers it toward the right files
  • Reviewer spots a missed edge case; agent re-prompted
  • Customer reports a regression; manual hotfix

Only one of these sessions is actually autonomous. Git AI measures the gap so you can find the prompts, skills, and codebase context that close it.
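
One way to put a number on that gap, with a definition and figures that are ours for illustration rather than a published Git AI formula: score each session by how much of its merged output survived review untouched.

# autonomy = AI lines merged without human edits / AI lines merged (illustrative)
# Session 1: 240 merged, 228 untouched by a human; Session 2: 240 merged, 96 untouched.
awk 'BEGIN { printf "session 1: %.2f\nsession 2: %.2f\n", 228/240, 96/240 }'
# session 1: 0.95
# session 2: 0.40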

Measure token efficiency, keep costs in check

Codex pricing is usage-based by design. It's not enough to know how many credits you're burning. You need to know what those credits are buying you, and whether the outcomes justify the costs.

For every 100 lines Codex generates, how many reach production? In well-prepared codebases (strong tests, a clear AGENTS.md, good architectural docs) we see ratios near 4:1. In sparse codebases, the same agent can run at 50:1 or worse, with most of what it generates regenerated, abandoned, or rewritten before it ever reaches a commit. Git AI shows you where agents get stuck so you can make your codebases AI-ready and stop paying for tokens that never ship.
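
The back-of-envelope math is blunt: at the same credit spend, the generated-to-durable ratio is the multiplier on the cost of every line that ships. The spend and volume below are assumed round numbers, not benchmarks.

awk 'BEGIN {
  spend = 10000; generated = 1000000            # assumed monthly credits and generated lines
  printf "4:1  -> %d durable lines, %.4f credits each\n", generated/4,  spend/(generated/4)
  printf "50:1 -> %d durable lines, %.4f credits each\n", generated/50, spend/(generated/50)
}'
# 4:1  -> 250000 durable lines, 0.0400 credits each
# 50:1 -> 20000 durable lines, 0.5000 credits each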

Code durability and incidents traced back to the prompt

OpenAI's Compliance API audit logs disappear after 30 days. Git AI keeps tracking. We measure how much AI code is rewritten, reverted, or refactored in the 30 / 60 / 90 days after it ships — the durability of agent output. Across our fleet we see 30-day durability range from ~30% to ~85% depending on the team.
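
The primitive behind a durability number is something git can already express: how many of the lines a commit introduced are still blamed to that commit weeks later. A sketch with plain git; the commit hash and path are placeholders, and the real metric aggregates this across every AI-attributed commit.

commit=abc1234                      # placeholder: an AI-attributed commit
file=src/auth/refresh.ts            # placeholder path

introduced=$(git show --numstat --format= "$commit" -- "$file" | awk '{ print $1 }')
surviving=$(git blame --porcelain "$file" | grep -c "^$commit")

echo "introduced: $introduced   still present after 30 days: $surviving"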

When a production incident fires, every line involved can be walked back to the exact Codex session, model version, and prompt that produced it. The session transcript lives in the Prompt + Context Store, so post-mortems can answer not just "who wrote this" but "what was the agent told, and what context did it have when it wrote it."

One dashboard for every agent

Most teams don't only run Codex. Cursor, Claude Code, Copilot, Gemini — each ships its own dashboard, with its own assumptions and its own attribution heuristics. Git AI is built on an open standard that unifies them all.


Cursor, Claude Code, OpenAI Codex, GitHub Copilot, Gemini CLI, OpenCode, Continue, Droid, Junie, Rovo Dev, Amp, Windsurf

Getting the data into one place is the easy part. Once every line of AI code is attributed and tracked through the SDLC, you can compare every agent on the same outcome metrics: cost per merged line, durability, and downstream rework, instead of a dozen dashboards that each define success their own way.

Git AI: The open-source standard for tracking
AI-code from prompt to production.

curl -sSL https://usegitai.com/install.sh | bash