Hosted API

The hosted API is the first tubebrain.ai web-service slice around the local FOSS core. It is intentionally narrow today, but it now covers both the deterministic timestamped-VOD demo path and the first live/radio session path:

  • GET /v1/health
  • POST /v1/transcript/section
  • GET /v1/admin/usage
  • GET /v1/stream
  • POST /v1/stream/start
  • GET /v1/stream/{session_id}/poll
  • GET /v1/stream/{session_id}/events
  • POST /v1/stream/{session_id}/stop

POST /v1/transcript/section mirrors the local MCP get_transcript_section tool. It accepts a timestamped YouTube URL or an explicit at_s value and returns the same TranscriptSection packet shape inside a hosted response envelope with a request_id.

The stream endpoints mirror the local MCP start_stream, poll_stream, stop_stream, and list_streams tools. They are the GTM MVP path for radio/HTTP audio and YouTube Live sources: source URL in, StreamSession and StreamChunk payloads out for agent polling or SSE-style delivery.

For hosted preview deployments, live STT is selected independently from the HTTP API process. tubebrain-hosted can run with a no-op backend for contract testing, an in-process Whisper backend for local development, or a remote internal tubebrain-stt service for Blahaj/Tinyland deployment. The remote STT service is an internal boundary, not a public API surface.

The broader API and auth contract lives in Hosted HTTP/SSE API Contract. The pilot-facing metering, retention, and proof-of-origin boundary lives in Hosted Pilot Policy.

Run Locally

From a checkout:

just hosted-dev

That starts tubebrain-hosted on 127.0.0.1:8787 with the local development API key tb_sk_test_local unless hosted API key configuration is already set.

Equivalent explicit command:

TUBEBRAIN_HOSTED_BIND=127.0.0.1:8787 \
cargo run --locked --bin tubebrain-hosted

Set TUBEBRAIN_API_KEY in the environment before running the server to use a non-default local token. For paid pilots, prefer TUBEBRAIN_API_KEYS_JSON so each key carries account identity, key identity, scopes, status, and an optional pilot label. The server stores SHA-256 key hashes in memory and never emits raw API keys in usage events or error bodies.

Single-key environment:

| Variable | Default | Notes |
| --- | --- | --- |
| TUBEBRAIN_API_KEY | tb_sk_test_local in local dev | Raw bearer token loaded from deployment secrets. |
| TUBEBRAIN_USAGE_ACCOUNT_ID | pilot | Account id attached to usage events for the single key. |
| TUBEBRAIN_API_KEY_ID | derived key_<sha256-prefix> | Stable key id for usage events. |
| TUBEBRAIN_API_KEY_SCOPES | all current scopes | Comma-separated scopes, such as transcript:read,stream:read. |
| TUBEBRAIN_API_KEY_STATUS | active | Set revoked to reject the key without removing the record. |
| TUBEBRAIN_API_KEY_LABEL | unset | Optional operator label for the key. |

Multi-key environment:

[
  {
    "key": "tb_sk_live_example",
    "account_id": "acct_design_partner",
    "key_id": "key_design_partner_primary",
    "scopes": ["transcript:read", "stream:read", "stream:write", "admin:read"],
    "status": "active",
    "label": "design-partner-primary"
  }
]

TUBEBRAIN_API_KEYS_JSON records may use key_sha256 instead of key when a deployment secret pipeline pre-hashes the raw key. Supported scopes are transcript:read, stream:read, stream:write, and admin:read; all or * expands to the current full set.
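
A secret pipeline that pre-hashes keys could build such a record along the following lines. This is a minimal sketch, not the project's tooling: the helper name prehashed_record is hypothetical, and it assumes key_sha256 is the lowercase hex SHA-256 digest of the UTF-8 key bytes, which the text above does not confirm.

```python
import hashlib
import json


def prehashed_record(raw_key, account_id, key_id, scopes, label=None):
    """Build a TUBEBRAIN_API_KEYS_JSON record carrying key_sha256 instead of key."""
    # Assumption: lowercase hex SHA-256 over the UTF-8 bytes of the raw key.
    digest = hashlib.sha256(raw_key.encode("utf-8")).hexdigest()
    record = {
        "key_sha256": digest,
        "account_id": account_id,
        "key_id": key_id,
        "scopes": list(scopes),
        "status": "active",
    }
    if label is not None:
        record["label"] = label
    return record


record = prehashed_record(
    "tb_sk_live_example",
    "acct_design_partner",
    "key_design_partner_primary",
    ["transcript:read", "stream:read"],
    label="design-partner-primary",
)
# The JSON array is what would be placed in TUBEBRAIN_API_KEYS_JSON.
keys_json = json.dumps([record])
```

The raw key never appears in the resulting JSON, so only the digest has to pass through the deployment secret store.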

The first controlled hosted deployment is documented in Hosted Preview Runbook. It runs tubebrain-hosted behind tailnet-only ingress and keeps bearer-token auth on every endpoint except GET /v1/health. Blahaj/OpenTofu owns the normal protected-preview Kubernetes state; this repo owns the binaries, images, API contract, docs, smoke scripts, and legacy/manual recovery manifests.

The pre-call paid design-partner checklist is Paid Pilot Operator Runbook. It is the operator go/no-go surface for DNS, CI/deploy, API keys, quota, STT posture, managed fallback posture, redaction, demo artifacts, and acceptance criteria.

The first quotable paid-pilot package is Paid Pilot Package. It defines the current price bands, source counts, hosted-preview hours, managed fallback caps, support limits, manual invoice path, and billing evidence boundaries.

The preview also exposes authenticated operator usage inspection at GET /v1/admin/usage. This endpoint returns the authenticated account's pilot usage snapshot: configured limits, rolling-window counters, and a bounded recent-event ring. When TUBEBRAIN_USAGE_EVENT_LOG is configured, the hosted process also appends JSONL usage events and rebuilds the current quota window from that file on restart. This is enough for protected-preview control and auditability, but it is not a multi-replica billing database.

Builds with --features po-token use the same BotGuard support as the local MCP binary. Builds with --features whisper can use the local Whisper fallback when captions are unavailable and can transcribe live stream chunks. Default hosted builds can start stream sessions, but report live STT as degraded through stream diagnostics unless a real STT backend is configured.

Stream STT Backends

tubebrain-hosted reads TUBEBRAIN_STT_BACKEND:

| Value | Behavior |
| --- | --- |
| noop | Starts stream sessions but reports live STT as degraded. Useful for API contract tests. |
| local | Uses in-process Whisper. Requires tubebrain-hosted built with --features whisper. |
| remote | Sends audio chunks to the internal tubebrain-stt service over HTTP with bearer auth. |

If TUBEBRAIN_STT_BACKEND is unset, builds with --features whisper default to local and all other builds default to noop.
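
The selection rule can be modelled in a few lines. The sketch below is illustrative only: resolve_stt_backend is a hypothetical function, and treating an empty value as unset, rejecting unknown values, and rejecting local on a non-Whisper build are assumptions about behavior the text does not spell out.

```python
VALID_BACKENDS = ("noop", "local", "remote")


def resolve_stt_backend(env, whisper_enabled):
    """Pick the STT backend from an environment mapping and the build's feature set."""
    value = env.get("TUBEBRAIN_STT_BACKEND")
    if not value:
        # Unset (or, by assumption, empty): the default depends on the build.
        return "local" if whisper_enabled else "noop"
    if value not in VALID_BACKENDS:
        # Assumption: unknown values are a configuration error.
        raise ValueError(f"unsupported TUBEBRAIN_STT_BACKEND: {value!r}")
    if value == "local" and not whisper_enabled:
        # Assumption: local without the whisper feature is rejected.
        raise ValueError("local backend requires a build with --features whisper")
    return value
```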

Remote STT configuration:

| Variable | Purpose |
| --- | --- |
| TUBEBRAIN_REMOTE_STT_URL | Base URL for the internal STT service, for example http://tubebrain-stt:8788. |
| TUBEBRAIN_REMOTE_STT_TOKEN | Bearer token shared with the internal STT service. |
| TUBEBRAIN_REMOTE_STT_TIMEOUT_MS | Optional request timeout. Defaults to 30000. |
| TUBEBRAIN_REMOTE_STT_MAX_BYTES | Optional per-chunk byte cap. Defaults to 4194304. |

Run the internal STT service locally:

just stt-dev

That starts tubebrain-stt on 127.0.0.1:8788 with the local development STT key tb_stt_test_local unless TUBEBRAIN_STT_API_KEY is set. The service exposes:

  • GET /v1/health
  • POST /v1/stt/chunk

The STT chunk endpoint accepts base64-encoded audio chunks and returns the same structured Segment shape used by local MCP and hosted stream responses. The service enforces bearer-token auth, a per-chunk byte limit, and redacted error responses. It is intended for private cluster traffic only.
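
A caller preparing a chunk for this endpoint might encode it as sketched below. The helper encode_chunk is hypothetical, the request body field names are deliberately left out because they are not given here, and applying the cap to the raw bytes before encoding (rather than to the base64 text) is an assumption. The 4194304 figure is the documented default of TUBEBRAIN_REMOTE_STT_MAX_BYTES.

```python
import base64

# Documented default of TUBEBRAIN_REMOTE_STT_MAX_BYTES (4 MiB).
DEFAULT_MAX_CHUNK_BYTES = 4194304


def encode_chunk(audio, max_bytes=DEFAULT_MAX_CHUNK_BYTES):
    """Return the base64 text for one audio chunk, refusing oversized chunks."""
    # Assumption: the cap is measured on raw bytes, before base64 expansion.
    if len(audio) > max_bytes:
        raise ValueError(f"chunk of {len(audio)} bytes exceeds cap of {max_bytes}")
    return base64.b64encode(audio).decode("ascii")
```

Checking the size on the client side avoids sending a request that the service would reject anyway.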

Managed STT Fallback Boundary

Managed third-party STT is separate from the internal remote backend. The managed boundary is disabled by default, follows the hosted pilot policy, and is only for buyer-approved fallback or forced managed pilots. It does not change the public stream response shapes.

Configure the boundary on tubebrain-hosted:

| Variable | Purpose |
| --- | --- |
| TUBEBRAIN_MANAGED_STT_MODE | disabled, opt-in, or forced. Defaults to disabled. |
| TUBEBRAIN_MANAGED_STT_PROVIDER | Provider adapter. Currently http. |
| TUBEBRAIN_MANAGED_STT_URL | Base URL for the managed provider proxy. Required unless disabled. |
| TUBEBRAIN_MANAGED_STT_TOKEN | Bearer token for the managed provider proxy. Required unless disabled. |
| TUBEBRAIN_MANAGED_STT_FALLBACK_HOURS | Explicit fallback-hour cap for the hosted process/account. Defaults to 0. |
| TUBEBRAIN_MANAGED_STT_TIMEOUT_MS | Optional provider request timeout. Defaults to 30000. |
| TUBEBRAIN_MANAGED_STT_MAX_RETRIES | Optional retry count after the first provider attempt. Defaults to 0. |
| TUBEBRAIN_MANAGED_STT_MAX_BYTES | Optional per-chunk byte cap. Defaults to 4194304. |
| TUBEBRAIN_MANAGED_STT_MIN_BILLABLE_MS | Optional minimum billable duration per non-empty chunk. Defaults to 1. |
| TUBEBRAIN_MANAGED_STT_COST_PER_HOUR_MICRO_USD | Optional cost estimate used in usage events. Defaults to 0. |

In opt-in mode, TubeBrain tries the selected primary backend first and calls the managed provider only after the primary transcriber fails. In forced mode, TubeBrain sends chunks directly through the managed boundary after enforcing the configured fallback-hour cap.
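
The difference between the modes can be summarized as a small decision function. This is a rough model rather than the server's logic: choose_transcriber and its return values are hypothetical, and applying the fallback-hour cap in opt-in mode as well as forced mode is an assumption, since the text only states it for forced mode.

```python
def choose_transcriber(mode, primary_failed, used_hours, cap_hours):
    """Return 'primary', 'managed', or 'none' for one chunk."""
    # With the default cap of 0, no managed capacity is ever available.
    cap_left = used_hours < cap_hours
    if mode == "disabled":
        return "primary"
    if mode == "opt-in":
        if not primary_failed:
            return "primary"
        # Managed provider is consulted only after the primary backend fails.
        return "managed" if cap_left else "none"
    if mode == "forced":
        # Chunks bypass the primary backend once the cap check passes.
        return "managed" if cap_left else "none"
    raise ValueError(f"unknown TUBEBRAIN_MANAGED_STT_MODE: {mode!r}")
```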

The managed provider request carries only the short audio chunk and chunk timing/format hints. It does not send TubeBrain API keys, cookies, signed media URLs, PoToken values, BotGuard details, session IDs, or account metadata. Usage events are emitted to the tubebrain_usage tracing target with provider, outcome, processed duration, retry count, and estimated cost fields. See the TIN-1212 cost model in docs/spikes/2026-05-16-managed-stt-fallback-cost-model.md.

Health

curl -fsS http://127.0.0.1:8787/v1/health | jq

Example response:

{
  "status": "ok",
  "service": "tubebrain-hosted",
  "version": "0.1.10",
  "core_version": "0.1.10"
}

Transcript Section

curl -fsS http://127.0.0.1:8787/v1/transcript/section \
  -H 'authorization: Bearer tb_sk_test_local' \
  -H 'content-type: application/json' \
  -d '{
    "url": "https://www.youtube.com/watch?v=Rzi7oFTzjac&t=2449s",
    "lang": "en",
    "before_s": 120,
    "after_s": 600
  }' | jq

The successful response contains:

  • request_id
  • section, matching the local TranscriptSection type
  • agent_contract.suggested_task, currently summarize_section_and_extract_links
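
For agents that call the API from code rather than curl, the same request can be assembled with the Python standard library. The helper build_request is a hypothetical convenience, not part of any TubeBrain client; the URL, token, and body fields are the ones from the curl example.

```python
import json
import urllib.request


def build_request(base_url, path, token, body=None):
    """Build an authenticated request; a JSON body makes it a POST, otherwise GET."""
    data = None if body is None else json.dumps(body).encode("utf-8")
    req = urllib.request.Request(
        base_url.rstrip("/") + path,
        data=data,
        method="POST" if body is not None else "GET",
    )
    req.add_header("Authorization", f"Bearer {token}")
    if data is not None:
        req.add_header("Content-Type", "application/json")
    return req


req = build_request(
    "http://127.0.0.1:8787",
    "/v1/transcript/section",
    "tb_sk_test_local",
    {
        "url": "https://www.youtube.com/watch?v=Rzi7oFTzjac&t=2449s",
        "lang": "en",
        "before_s": 120,
        "after_s": 600,
    },
)
# To send it against a running server:
#   with urllib.request.urlopen(req) as resp:
#       payload = json.load(resp)
```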

Error responses use the hosted contract's stable envelope:

{
  "error": {
    "code": "invalid_request",
    "message": "transcript section requires at_s or a YouTube timestamp such as t=2449s",
    "request_id": "req_0000000000000001"
  }
}
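
A client can turn this envelope into a typed error so that callers log the request_id alongside the code. The sketch below relies only on the three fields shown above; the class and function names are hypothetical.

```python
import json


class HostedApiError(Exception):
    """Error raised from a hosted error envelope."""

    def __init__(self, code, message, request_id):
        super().__init__(f"{code}: {message} (request_id={request_id})")
        self.code = code
        self.message = message
        self.request_id = request_id


def parse_error(body_text):
    """Return a HostedApiError if the body is an error envelope, else None."""
    try:
        doc = json.loads(body_text)
    except ValueError:
        return None
    err = doc.get("error") if isinstance(doc, dict) else None
    if not isinstance(err, dict):
        return None
    return HostedApiError(
        err.get("code", "unknown"),
        err.get("message", ""),
        err.get("request_id"),
    )
```

Returning None for non-envelope bodies lets the same function be applied to every response before the success path is parsed.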

Stream Sessions

Start a stream session:

curl -fsS http://127.0.0.1:8787/v1/stream/start \
  -H 'authorization: Bearer tb_sk_test_local' \
  -H 'content-type: application/json' \
  -d '{
    "url": "https://radio.example/live.mp3",
    "lang": "en"
  }' | jq

The successful response contains:

  • request_id
  • session, matching the local StreamSession type

Poll for new transcript chunks:

curl -fsS 'http://127.0.0.1:8787/v1/stream/sess-1/poll?cursor=0' \
  -H 'authorization: Bearer tb_sk_test_local' | jq

The successful response contains:

  • request_id
  • chunk, matching the local StreamChunk type

The SSE endpoint emits the same chunk envelope as a text/event-stream response:

curl -fsS 'http://127.0.0.1:8787/v1/stream/sess-1/events?cursor=0' \
  -H 'authorization: Bearer tb_sk_test_local' \
  -H 'accept: text/event-stream'
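
On the consuming side, a text/event-stream body can be decoded with a small line-based parser. The sketch below follows the standard SSE framing (data: lines, events ended by a blank line, comment lines starting with a colon) and assumes each event's data is one JSON document holding the chunk envelope; event names and any fields inside chunk are not assumed.

```python
import json


def iter_sse_data(lines):
    """Yield the decoded JSON payload of each SSE event from an iterable of text lines."""
    buf = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line ends the current event.
            if buf:
                yield json.loads("\n".join(buf))
                buf = []
            continue
        if line.startswith(":"):
            continue  # comment line, often used as a keep-alive
        field, _, value = line.partition(":")
        if field == "data":
            # Per SSE framing, a single leading space after the colon is dropped.
            buf.append(value[1:] if value.startswith(" ") else value)
```

The function accepts any iterable of decoded lines, so it can be fed from a streamed HTTP response or from a recorded fixture in tests.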

Stop a session:

curl -fsS -X POST http://127.0.0.1:8787/v1/stream/sess-1/stop \
  -H 'authorization: Bearer tb_sk_test_local' | jq

List active sessions:

curl -fsS http://127.0.0.1:8787/v1/stream \
  -H 'authorization: Bearer tb_sk_test_local' | jq

Pilot Usage Snapshot

Inspect the authenticated account's usage counters:

curl -fsS http://127.0.0.1:8787/v1/admin/usage \
  -H 'authorization: Bearer tb_sk_test_local' | jq

The response contains:

  • account_id, from the authenticated API key record
  • api_key_id, a stable key identifier, not the raw key
  • key_label, if the authenticated API key record has one
  • limits, from static environment configuration
  • counters, including account-scoped rolling-window endpoint counters and rate_limited
  • recent_events, a bounded event ring for that account, rebuilt from the durable JSONL sink when configured

Static quota configuration:

| Variable | Default | Notes |
| --- | --- | --- |
| TUBEBRAIN_USAGE_EVENT_LOG | unset | Optional JSONL file path for durable usage events. |
| TUBEBRAIN_USAGE_WINDOW_SECS | 86400 | Rolling quota window used for counters and rate-limit reset hints. |
| TUBEBRAIN_QUOTA_TRANSCRIPT_SECTION | 300 | Per-account window cap. |
| TUBEBRAIN_QUOTA_STREAM_START | 60 | Per-account window cap. |
| TUBEBRAIN_QUOTA_STREAM_POLL | 1800 | Per-account window cap. |
| TUBEBRAIN_QUOTA_STREAM_EVENTS | 300 | Per-account window cap. |
| TUBEBRAIN_QUOTA_STREAM_STOP | 300 | Per-account window cap. |
| TUBEBRAIN_QUOTA_STREAM_LIST | 300 | Per-account window cap. |
| TUBEBRAIN_USAGE_EVENT_CAPACITY | 512 | Bounded recent-event ring returned by GET /v1/admin/usage. |

The JSONL sink stores usage event metadata only: event/request/account/key IDs, endpoint, source kind, outcome, status, duration, stream/STT timing, retry count, optional estimated cost, and a public error code. It does not store raw API keys, bearer tokens, cookies, signed media URLs, PoTokens, BotGuard internals, raw audio, or original source URLs.
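
Rebuilding a rolling-window counter from such a log amounts to counting recent events per account and endpoint. The sketch below illustrates the idea only: the JSON field names account_id, endpoint, and ts_s, and the endpoint label transcript_section, are hypothetical stand-ins, because the on-disk event schema is not specified here. The 86400-second window is the documented default.

```python
import json


def count_in_window(jsonl_lines, account_id, endpoint, now_s, window_secs=86400):
    """Count events for one account and endpoint inside the rolling window."""
    cutoff = now_s - window_secs
    count = 0
    for line in jsonl_lines:
        line = line.strip()
        if not line:
            continue
        event = json.loads(line)
        # Hypothetical field names; adjust to the real event schema.
        if (
            event.get("account_id") == account_id
            and event.get("endpoint") == endpoint
            and event.get("ts_s", 0) > cutoff
        ):
            count += 1
    return count


def is_rate_limited(count, cap):
    """A request is refused once the window already holds cap events."""
    return count >= cap
```

Because the counter is derived from a single local file, it reflects one process only, consistent with the note that this is not a multi-replica billing database.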

Current Limits

This MVP has account-scoped in-memory API key records, account-scoped stream session ownership, and optional file-backed usage-event durability, but it does not yet include self-serve account management, billing, durable multi-process hosted sessions, database-backed metering, or public managed PoToken endpoints.

Stream sessions are in-memory and single-process. The protected-preview routing model is documented in Hosted Stream Session Routing: route active session traffic to the same worker, return 404 not_found after a worker restart or wrong-account access, and keep raw audio out of durable storage. Before broad paid traffic, the hosted layer still needs Redis-backed cursors/buffers, cross-replica usage aggregation, and automated billing-grade operations.

Public responses and diagnostics must not expose cookies, signed media URLs, PoTokens, BotGuard internals, or raw audio. The hosted error envelope redacts known sensitive URL, token, cookie, PoToken, and BotGuard-shaped strings before returning messages to clients. Raw audio is not persisted by default; live segment buffers are operational state for the active session only.