Karta is two services that are versioned together but deployed and operated independently, talking over HTTP authenticated by a shared service token.
| Plane | App | Stack | Owns |
| --- | --- | --- | --- |
| Data plane | apps/karta-python | Python · FastAPI · harness SDKs | The request path: sessions, harness execution, sandboxes/releases, request-time budget enforcement, usage emission. Published to PyPI as karta-python. |
| Control plane | apps/karta-web | Ruby · Rails 8 · Tailwind | The system of record: identity, API keys, usage metering, budget caps, Stripe billing, BYOK storage, outbound webhooks, audit log, marketing site. |

                          ┌──────────────────────────────────────┐
   end users / SDK ──────▶│  DATA PLANE  (karta-python, FastAPI)  │
   (kt_live_… API key)    │                                       │
                          │  validates key ─────────┐             │
                          │  emits usage  ──────────┤             │
                          │  fetches BYOK ──────────┤             │
                          └──────────┬──────────────┼─────────────┘
                                     │              │ HTTP + service token
                          delegates  │              ▼
                          agent turn │   ┌──────────────────────────────┐
                                     ▼   │  CONTROL PLANE (karta-web)    │
                          ┌─────────────┐│                               │
                          │  HARNESS    ││  identity · keys · budgets    │
                          │ Claude Code ││  Stripe · webhooks-out · BYOK │
                          │  / OpenCode ││  audit log · dashboard        │
                          └─────────────┘└──────────────────────────────┘

The trust boundary

The split is the central architectural decision: the plane that runs customer agent code (the data plane) is kept separate from, and minimally trusted by, the plane that holds money-and-identity state (the control plane). The internal surface between them is defended in depth by three checks that must all pass: a service token, an IP allowlist (KARTA_INTERNAL_ALLOWED_IPS), and a Host allowlist (KARTA_INTERNAL_HOSTS). A leaked token replayed off-network, or against the public dashboard host, gets a 404 and never reaches the controller. A compromised data plane cannot, by itself, read or rewrite the control plane’s sensitive state.
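
The layered check described above can be sketched as a single pure function. This is an illustration in Python, not the control plane's actual code (which lives in the Rails app); the function name, its parameters, and the idea that the allowlists arrive as already-parsed collections are all assumptions made for the example.

```python
import hmac
import ipaddress


def internal_request_status(token, remote_ip, host, *,
                            expected_token, allowed_networks, allowed_hosts):
    """Return 200 only if token, source IP, and Host header all pass; else 404.

    Hypothetical helper: every failure maps to the same 404 so a caller
    cannot tell which of the three checks rejected the request.
    """
    # Constant-time comparison avoids leaking token contents via timing.
    token_ok = hmac.compare_digest(token.encode(), expected_token.encode())

    try:
        addr = ipaddress.ip_address(remote_ip)
    except ValueError:
        return 404  # unparseable source address is treated as off-network
    ip_ok = any(addr in ipaddress.ip_network(net) for net in allowed_networks)

    # Assumes the Host header carries no port suffix.
    host_ok = host.lower() in allowed_hosts

    return 200 if (token_ok and ip_ok and host_ok) else 404
```

Returning the same status for every failure mode matches the behavior the text describes: a replayed token from the wrong network or against the wrong host looks like a route that does not exist.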

The data plane: thin core, fuller platform

The data plane stacks two tiers:

A thin core

Almost entirely passthrough. Owns session handles, agent discovery, participant attribution, policies, and lifecycle hooks — but deliberately not conversation history. The harness owns that.

A platform layer

Absorbs production complexity: the HTTP server, tenant isolation, BYOK, usage emission, releases, and observability. This is the load-bearing, security-sensitive machinery.

This split keeps the core simple and bug-resistant while the platform layer carries the weight.

The harness abstraction

Each harness sits behind a HarnessAdapter interface, with concrete ClaudeAdapter and OpenCodeAdapter implementations auto-detected from the project layout (.claude/ → Claude, .opencode/ → OpenCode). The adapter’s only job is translation: discover agents from the harness’s native format, then stream a single agent turn, mapping the harness’s native output into Karta’s uniform typed event model. It does not manage sessions or store messages.
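
A minimal sketch of what such an interface and its layout-based detection might look like. Only the names HarnessAdapter, ClaudeAdapter, OpenCodeAdapter and the two directory markers come from the text; the method names, the Event shape, and the choice to prefer .claude/ when both directories exist are assumptions for illustration.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Iterator, Protocol


@dataclass(frozen=True)
class Event:
    """Hypothetical uniform event; the real typed event model is richer."""
    type: str
    data: str


class HarnessAdapter(Protocol):
    """Translation only: no session management, no message storage."""

    def discover_agents(self, project: Path) -> list[str]: ...

    def stream_turn(self, agent: str, message: str) -> Iterator[Event]: ...


def detect_harness(project: Path) -> str:
    """Pick a harness from the project layout.

    Assumption: .claude/ wins if both marker directories are present.
    """
    if (project / ".claude").is_dir():
        return "claude"
    if (project / ".opencode").is_dir():
        return "opencode"
    raise LookupError(f"no supported harness layout found in {project}")
```

Because the protocol exposes only discovery and a single streamed turn, anything stateful (sessions, history) has to live outside the adapter, which is the constraint the paragraph above describes.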

The control plane: the system of record

A Rails 8 app whose strength is separation of concerns — identity ↔ authorization ↔ metering ↔ budget enforcement ↔ billing, each layer independently auditable and cacheable. It owns:
  • Identity & tenancy — Devise auth (confirmable, lockable, trackable) with TOTP and WebAuthn passkeys; Organization (the tenant) and OrganizationMembership with five roles: owner, admin, developer, billing, viewer.
  • API keys — Stripe-style kt_live_… bearer tokens; only a bcrypt digest plus a short prefix is stored. See API keys.
  • Metering & budgets — idempotent usage ingestion, exact-integer money, per-org caps. See Usage & budgets.
  • Billing, webhooks, BYOK, audit — Stripe subscriptions, signed outbound webhooks, encrypted provider keys, and an immutable audit log.

How the planes talk

All cross-plane traffic is HTTP authenticated by a shared service token. Three flows dominate:
  • Key validation. On each customer request the data plane calls /internal/keys/validate and gets back a Principal (org id, scopes, budget_ok). It caches this for a short TTL (default 5s) to amortize the bcrypt cost; the control plane push-invalidates the cache on revocation, plan, or budget changes. The result is eventual consistency with a bounded, configurable staleness window.
  • Usage emission. The data plane buffers usage and flushes idempotent batches to /internal/usage/events. Metering never blocks a customer response: if the control plane is down, the bounded queue drops the oldest events rather than stalling the request path.
  • BYOK and model settings. The data plane fetches decrypted provider keys and model overrides on demand (short-TTL cached), so rotations propagate quickly.
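
Two of the flows above rest on small, well-known data structures: a TTL cache with explicit invalidation, and a bounded queue that discards its oldest entries. The sketch below shows both under stated assumptions: the class names, the per-organization invalidation hook, and the dictionary shape of a principal are invented for the example and are not taken from karta-python.

```python
import hashlib
import time
from collections import deque


class PrincipalCache:
    """Short-TTL cache in front of a caller-supplied validate() function."""

    def __init__(self, ttl_seconds=5.0, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}  # sha256(api key) -> (expires_at, principal)

    @staticmethod
    def _slot(api_key):
        # Avoid keeping raw bearer tokens as dictionary keys.
        return hashlib.sha256(api_key.encode()).hexdigest()

    def get(self, api_key, validate):
        slot, now = self._slot(api_key), self._clock()
        hit = self._entries.get(slot)
        if hit is not None and hit[0] > now:
            return hit[1]  # fresh enough: skip the round trip
        principal = validate(api_key)
        self._entries[slot] = (now + self._ttl, principal)
        return principal

    def invalidate_org(self, org_id):
        # Assumed push-invalidation hook: drop every entry for one org.
        self._entries = {s: e for s, e in self._entries.items()
                         if e[1]["org_id"] != org_id}


class UsageBuffer:
    """Bounded buffer; once full, each new event evicts the oldest one."""

    def __init__(self, max_events=10_000):
        self._queue = deque(maxlen=max_events)

    def record(self, event):
        self._queue.append(event)  # never blocks, never raises when full

    def drain(self, batch_size):
        batch = []
        while self._queue and len(batch) < batch_size:
            batch.append(self._queue.popleft())
        return batch
```

The TTL bounds staleness even if an invalidation message is lost, while the invalidation hook shortens it in the common case. The buffer trades completeness for availability: under a long control-plane outage some usage is lost, but requests keep flowing.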

Request lifecycle

  1. Authenticate. A client calls POST /v1/sessions/{id}/messages (or /stream) with Authorization: Bearer kt_live_….
  2. Validate & gate on budget. The data plane validates the key against the control plane (or cache) and gets a Principal with budget_ok. If false, it returns 402 Payment Required before running anything.
  3. Resolve session & policy. It resolves the session (creating one if needed), checks tenant ownership and policies, and fetches BYOK / model settings if configured.
  4. Run the harness. It hands the turn to the harness, which runs the agentic loop and streams typed events back. The data plane relays them as SSE and may pause for input_required approval prompts.
  5. Meter. On completion, usage is queued for idempotent batch emission to the control plane, which aggregates it, re-checks budgets, fires threshold alerts, and push-invalidates the cache if a cap was crossed.
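
The ordering of the lifecycle above (validate, gate on budget, run, then meter) can be sketched as one framework-free function. Everything here is a simplification under assumptions: the function and parameter names are hypothetical, session and policy resolution is omitted, the 401 for an unknown key is an assumed behavior the text does not specify, and events are collected into a list rather than relayed as SSE.

```python
import uuid
from dataclasses import dataclass


@dataclass(frozen=True)
class Principal:
    org_id: str
    scopes: tuple[str, ...]
    budget_ok: bool


def handle_turn(api_key, message, *, validate, run_turn, enqueue_usage):
    """Return (http_status, events) for one agent turn.

    validate, run_turn and enqueue_usage are injected callables standing in
    for the control-plane lookup, the harness, and the usage buffer.
    """
    principal = validate(api_key)
    if principal is None:
        return 401, []  # assumption: unknown or revoked key
    if not principal.budget_ok:
        return 402, []  # Payment Required, before any harness work starts

    events = list(run_turn(principal, message))

    # Queue usage after the turn; the id lets the receiver dedupe retries.
    enqueue_usage({
        "client_event_id": str(uuid.uuid4()),
        "org_id": principal.org_id,
        "event_count": len(events),
    })
    return 200, events
```

Placing the budget check ahead of run_turn is the point of the sketch: an over-budget organization never causes harness execution, so no further cost accrues.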

Cross-cutting decisions

These choices shape the system most; understand them and the rest follows.
  • Delegate, don’t duplicate. The harness owns all conversation state.
  • Harness-native definitions. Agents live in the harness’s own file formats — portable, convention over configuration.
  • Streaming as the primitive. Every entry point is event-streamed; request/response is just an accumulated stream.
  • Cache + push-invalidate on the hot path. Fast common case, bounded staleness.
  • Idempotency everywhere money moves. Usage events (client_event_id) and Stripe webhooks (stripe_event_id) both dedupe retries.
  • Exact-integer money with typed units. Micro-cents with distinct Micros/Cents types; always round up; never under-bill.
  • Immutable audit by construction. Audit rows are written in the same transaction as their mutation and locked at both the Ruby and Postgres-trigger level.
  • Fail loud on unsafe config. A missing host allowlist raises at boot rather than shipping silently unsafe.
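
Two of these decisions, typed exact-integer money and idempotent ingestion, are easy to show in miniature. The sketch below is illustrative only: it assumes one cent equals 1,000,000 micro-cents, uses an in-memory set where a real system would more likely rely on a database uniqueness constraint, and the UsageLedger class and function names are invented for the example. Only Micros, Cents, and client_event_id come from the text.

```python
from typing import NewType

# Distinct types so a type checker flags mixing the two units.
Micros = NewType("Micros", int)
Cents = NewType("Cents", int)

MICROS_PER_CENT = 1_000_000  # assumption: a micro-cent is 1e-6 of a cent


def micros_to_cents_ceil(amount: Micros) -> Cents:
    """Convert to whole cents, always rounding up (never under-bill)."""
    # Integer ceiling division: no floats, so no rounding drift.
    return Cents(-(-amount // MICROS_PER_CENT))


class UsageLedger:
    """Toy ledger that ignores events whose client_event_id was seen before."""

    def __init__(self):
        self._seen = set()
        self.total = Micros(0)

    def ingest(self, client_event_id: str, cost: Micros) -> bool:
        if client_event_id in self._seen:
            return False  # retry of an already-recorded event: no double count
        self._seen.add(client_event_id)
        self.total = Micros(self.total + cost)
        return True
```

Because deduplication happens at ingestion, the sender is free to retry a batch whenever it is unsure whether the previous attempt landed.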

Observability

A single trace can span Rails → karta-python → harness when both planes point their OTLP exporter at the same collector. The control plane ships structured Lograge JSON (every line carries request_id, user_id, org_id, remote_ip) and Sentry error reporting with PII off and a secret-scrubbing before_send — bearer tokens and provider keys are never logged or attached to spans.

Honest seams

A few parts are intentionally not finished yet:
  • Kaniko image builds are stubbed. Buildpack/Dockerfile detection and image_ref computation are real; the actual build + registry push is not. Image releases run via file-copy until enabled.
  • Data-plane BYOK wiring is incomplete. The control plane stores and serves BYOK keys; injecting per-request customer credentials into the harness is the next focused pass.
  • Per-project URLs and session-time release resolution are the remaining linchpins of the hosted deploy loop, which is still being closed.