Architecture & Data

Where the data comes from, how the CLI and platform authenticate, and the trust points between them.

The Git AI Platform is a managed service. You connect your source control and install the CLI on developer machines — Git AI runs the ingestion, processing, and storage.

Teams that need to keep everything inside their own perimeter can run the exact same architecture themselves — see Self-Hosting. The data flows, authentication, and trust points below are identical in both the hosted and self-hosted deployments.

The moving parts

Three things sit on your side of the line: the git ai CLI on developer laptops and CI, your local git repositories, and your source control provider. The platform sits on the other side and is made of a small set of services you never touch directly:

| Component | Role |
| Telemetry ingestion | Internet-exposed, write-only endpoint that accepts agent telemetry from developer machines |
| API & UI | The dashboard, the webhook receiver, and authentication |
| Workers | Background jobs: PR sync, ingestion, joining attribution to SCM metadata |
| Datastores | Usage analytics, organization and account records, and (in hosted notes mode) the notes of record |

Where the data comes from

Three independent sources feed the platform. They have different producers, trust levels, and storage. For the shape of the datasets these produce — agent sessions, attributions, PR metrics — see Data Schema.

| Data class | Produced by | Reaches the platform via | Sensitivity |
| Client telemetry | git ai CLI on laptops / CI | Write-only telemetry upload endpoint | Token usage, agent sessions, and tool calls; written with a least-privilege, write-only key |
| Git notes (authorship) | git ai CLI, per commit | git_notes mode: pushed into your SCM as refs/notes/ai. hosted mode: notes upload endpoint | Authorship attached to commits, linking lines to agent sessions; in git_notes mode it never leaves your SCM |
| SCM metadata | Your SCM provider | Signed webhooks + worker REST pulls | PRs, commits, contributors; links agent activity to the SDLC |

Authentication

Developer machines → platform

Telemetry is sent with a Client Telemetry Write key. These keys are write-only: they can push telemetry but cannot read notes, read organization data, or reach any admin API. They are also easy to rotate. Each integration holds the narrowest credential for its job, so a laptop pushing telemetry carries only the write-only key.
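As a minimal sketch of what "write-only" means in practice, the check below models a key as a set of scopes and refuses any request whose scope the key does not carry. The ApiKey type, the is_allowed helper, and the telemetry.write scope name are illustrative assumptions, not the platform's actual identifiers.

```python
from dataclasses import dataclass

# Hypothetical scope names, used only for illustration.
TELEMETRY_WRITE = "telemetry.write"
NOTES_READ = "notes.read"


@dataclass(frozen=True)
class ApiKey:
    """A credential bound to one organization with a fixed set of scopes."""
    org_id: str
    scopes: frozenset


def is_allowed(key: ApiKey, required_scope: str) -> bool:
    """Allow a request only if the key explicitly carries the required scope."""
    return required_scope in key.scopes


# A laptop's key carries only the telemetry write scope.
laptop_key = ApiKey(org_id="acme", scopes=frozenset({TELEMETRY_WRITE}))
```

With this model, a leaked laptop key can push telemetry but fails every read or admin check, because those scopes were never attached to it.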

Admin access

You sign in to the platform with OAuth from your SCM organization, and members are invited and join there. Membership in your SCM org is the source of truth for who can access the dashboard, so there is no separate user directory to manage.

Each provider follows the same pattern. Long-lived secrets are configured once at connection time and used to mint short-lived runtime tokens that the platform refreshes before use.

| Provider | Sign-in | Repo / API calls |
| GitHub | OAuth | Short-lived App installation tokens (~1h), re-minted per use |
| Azure DevOps | Entra ID OAuth | OAuth access + refresh tokens |
| GitLab | OAuth | OAuth access + refresh tokens (or a PAT) |
| Bitbucket | OAuth | OAuth access + refresh tokens |
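The "mint short-lived tokens and refresh before use" pattern can be sketched as a small cache. This is an illustration under stated assumptions: TokenCache, the injected mint callable, and the 60-second safety margin are hypothetical, and mint stands in for whatever provider-specific exchange turns the long-lived secret into a runtime token.

```python
import time
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


@dataclass
class TokenCache:
    """Caches a short-lived token and re-mints it shortly before expiry."""
    # mint() is a hypothetical stand-in for the provider exchange;
    # it returns (token, lifetime_in_seconds).
    mint: Callable[[], Tuple[str, float]]
    clock: Callable[[], float] = time.monotonic
    skew_seconds: float = 60.0  # refresh this long before the token expires
    _token: Optional[str] = None
    _expires_at: float = 0.0

    def get(self) -> str:
        """Return a token that is valid for at least skew_seconds more."""
        now = self.clock()
        if self._token is None or now >= self._expires_at - self.skew_seconds:
            token, lifetime = self.mint()
            self._token = token
            self._expires_at = now + lifetime
        return self._token
```

Callers always go through get(), so the long-lived secret is touched only inside mint and an expired token is never handed to a worker.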

Identity & authorization

Authorization is governed by organization membership. Every credential — UI session or telemetry key — is bound to an organization, and every route enforces that the caller belongs to the org it's acting on. Data is isolated per org. The developer's git email is used only for attribution — mapping activity to a person — and is never trusted for authorization, so a spoofed identity can at most misattribute within the same org.
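A minimal sketch of that rule, with hypothetical Credential and TelemetryEvent types: the decision reads only the organization bound to the credential, and the git email carried in the payload is deliberately ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Credential:
    """A UI session or telemetry key; either one is bound to exactly one org."""
    org_id: str


@dataclass(frozen=True)
class TelemetryEvent:
    target_org_id: str
    git_email: str  # attribution only; never consulted for authorization


def authorize(credential: Credential, event: TelemetryEvent) -> bool:
    """Decide access from the credential's org alone."""
    return credential.org_id == event.target_org_id
```

Because git_email never enters the comparison, changing it cannot widen access. The worst case is a record attributed to the wrong person inside the caller's own org.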

SCM permissions (least privilege)

Git AI requests the narrowest permission for each capability, using each provider's native model. On GitHub, permissions are granted once at App installation; on the other providers, they are granted through the OAuth scopes the user consents to. Step-by-step setup lives in Connect Source Control.

| Capability | GitHub App permission | Azure DevOps | GitLab | Bitbucket |
| Read repo + push refs/notes/ai | Contents: Read & write | vso.code_write | api | repository |
| Commit status / checks | Commit statuses: Read & write | vso.code_status | api | repository |
| PR comments / footers | Pull requests: Read & write | vso.code_write | api | repository |
| Repo / project metadata | Metadata: Read | vso.project, vso.graph | api | repository |
| Identity / org membership | Members: Read; Administration: Read | vso.identity | read_user | account |
| User profile / email | Email addresses: Read | vso.profile | read_user | account |
| Sign-in | OAuth app | openid, profile, email, offline_access | read_user | account |
| Webhooks | Event subscriptions | provisioned via API | webhook | webhook |

Git notes — two storage modes

Notes storage is configurable per organization. hosted mode is preferred for large monorepos or repositories with many contributors — notes are stored centrally rather than pushed as refs into the repo, avoiding notes-ref contention and large fetches. See How Git AI Works for the underlying notes mechanism.

|  | git_notes (default) | hosted |
| Where notes live | Your SCM repo (refs/notes/ai) | The platform, keyed by (org, commit) |
| Pushed to SCM? | Yes | No |
| Write path | git push notes ref (SCM's own auth) | Notes upload endpoint (notes.write key) |
| Read path | git ai fetch from SCM | Notes read endpoint (notes.read key) |
| Authorship of record | Stays in your SCM | Stays in the platform |
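To illustrate what "keyed by (org, commit)" implies in hosted mode, here is an in-memory stand-in for such a store. HostedNotesStore and its methods are hypothetical; the real platform's storage layer is not described on this page.

```python
from typing import Dict, Optional, Tuple

NoteKey = Tuple[str, str]  # (org_id, commit_sha)


class HostedNotesStore:
    """In-memory stand-in for a notes store keyed by (org, commit)."""

    def __init__(self) -> None:
        self._notes: Dict[NoteKey, str] = {}

    def put(self, org_id: str, commit_sha: str, note: str) -> None:
        """Store or overwrite the note for one commit within one org."""
        self._notes[(org_id, commit_sha)] = note

    def get(self, org_id: str, commit_sha: str) -> Optional[str]:
        # A lookup can only match entries written under the same org.
        return self._notes.get((org_id, commit_sha))
```

Putting the org in the key means the same commit SHA appearing in two organizations (a fork, for example) resolves to two independent records.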

Data flow

Authorship → notes write

When an agent works, the CLI writes authorship to refs/notes/ai in the local repo. On the way to the platform it takes one of two paths:

  • git_notes mode (default) — the CLI pushes refs/notes/ai into your SCM with the developer or CI's own git credentials. The notes of record live in your SCM repo.
  • hosted mode — the CLI uploads notes directly to the platform with a notes.write key, where they're validated and stored keyed by org and commit.
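The two paths can be sketched as a small dispatch on the configured mode. Everything here is an assumption made for illustration: plan_notes_write, the GitPush and HostedUpload types, and the "origin" remote name are hypothetical, and the git CLI may push the ref differently from the plain git push shown.

```python
from dataclasses import dataclass
from typing import List, Union

NOTES_REF = "refs/notes/ai"


@dataclass
class GitPush:
    """git_notes mode: push the notes ref with the caller's own git credentials."""
    argv: List[str]


@dataclass
class HostedUpload:
    """hosted mode: upload to the platform, authenticated with a notes.write key."""
    required_scope: str


def plan_notes_write(mode: str, remote: str = "origin") -> Union[GitPush, HostedUpload]:
    """Describe how notes leave the machine, without performing any I/O."""
    if mode == "git_notes":
        # Plain git: push the local notes ref to the same ref on the remote.
        return GitPush(argv=["git", "push", remote, NOTES_REF])
    if mode == "hosted":
        return HostedUpload(required_scope="notes.write")
    raise ValueError(f"unknown notes mode: {mode!r}")
```

Returning a description rather than executing it keeps the sketch side-effect free, and it makes the credential difference visible: git_notes mode needs no platform key at all.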

PR sync

PR metrics are computed when source control tells the platform something changed:

  1. Your SCM sends a webhook on a PR or push event.
  2. The platform verifies the HMAC signature, dedupes the delivery, and enqueues a sync job.
  3. A worker loads the org and its SCM token (refreshing if expired) and pulls the PR, commits, and iterations over REST.
  4. It reads the authorship notes — from your SCM in git_notes mode, or from the platform in hosted mode.
  5. It posts a PR comment and commit status, and persists the PR, session, and contributor records.
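Step 2 of the flow above (verify, dedupe, enqueue) can be sketched with the standard library. The assumptions are loud ones: HMAC-SHA256 with a hex-encoded signature, a per-delivery ID supplied by the SCM, and an in-memory set and queue standing in for durable storage. Providers differ in header names and signing details, and WebhookReceiver is a hypothetical name.

```python
import hashlib
import hmac
from collections import deque
from typing import Deque, Set


def signature_is_valid(secret: bytes, body: bytes, signature_hex: str) -> bool:
    """Recompute the HMAC-SHA256 of the raw body and compare in constant time."""
    expected = hmac.new(secret, body, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected.encode(), signature_hex.encode())


class WebhookReceiver:
    """Verifies, dedupes, and enqueues webhook deliveries (illustrative only)."""

    def __init__(self, secret: bytes) -> None:
        self._secret = secret
        self._seen: Set[str] = set()        # delivery IDs already accepted
        self.queue: Deque[bytes] = deque()  # stand-in for a real job queue

    def handle(self, delivery_id: str, body: bytes, signature_hex: str) -> str:
        # Reject before recording anything, so forged deliveries leave no state.
        if not signature_is_valid(self._secret, body, signature_hex):
            return "rejected"
        if delivery_id in self._seen:
            return "duplicate"
        self._seen.add(delivery_id)
        self.queue.append(body)
        return "enqueued"
```

The signature is checked over the raw request bytes before any parsing, and the delivery ID is recorded only after verification, so an unauthenticated caller cannot poison the dedupe set.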

Security & isolation

  • In transit — TLS everywhere: ingestion, the dashboard, and every call out to your SCM and identity providers.
  • At rest — encrypted by the platform's datastore and object-storage layer.
  • Isolation — data is partitioned per organization, and the internal services and datastores are never internet-exposed. The only surfaces a developer machine or your SCM touches directly are the telemetry endpoint, the webhook receiver, and the UI.
  • No vendor lock-in for attribution — in the default git_notes mode, authorship of record stays in your own SCM repo.

Running the platform inside your own perimeter — including secret management, network policy, and the full egress allowlist — is covered in Self-Hosting. Client-side controls for developer machines live in Enterprise Configuration.