Hosted API
The hosted API is the first tubebrain.ai web-service slice around the local
FOSS core. It is intentionally narrow today, but it now covers both the
deterministic timestamped-VOD demo path and the first live/radio session path:
- GET /v1/health
- POST /v1/transcript/section
- GET /v1/admin/usage
- GET /v1/stream
- POST /v1/stream/start
- GET /v1/stream/{session_id}/poll
- GET /v1/stream/{session_id}/events
- POST /v1/stream/{session_id}/stop
POST /v1/transcript/section mirrors the local MCP get_transcript_section
tool. It accepts a timestamped YouTube URL or an explicit at_s value and
returns the same TranscriptSection packet shape inside a hosted response
envelope with a request_id.
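A client can resolve the anchor time before calling the endpoint. The following is a minimal sketch, not the server's parser: the helper name resolve_at_s is hypothetical, it assumes an explicit at_s takes precedence over the URL, and it only handles plain-second forms such as t=2449 and t=2449s.

```python
from urllib.parse import parse_qs, urlparse


def resolve_at_s(url, at_s=None):
    """Return the section anchor in seconds (hypothetical client-side helper).

    ASSUMPTION: an explicit at_s wins over the URL's t= parameter, and only
    plain-second timestamps (t=2449 or t=2449s) are recognized here.
    """
    if at_s is not None:
        return at_s
    values = parse_qs(urlparse(url).query).get("t", [])
    if values:
        raw = values[0]
        if raw.endswith("s"):
            raw = raw[:-1]
        if raw.isascii() and raw.isdigit():
            return int(raw)
    raise ValueError(
        "transcript section requires at_s or a YouTube timestamp such as t=2449s"
    )
```

Failing early on the client mirrors the invalid_request error the hosted endpoint returns when neither value is present.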
The stream endpoints mirror the local MCP start_stream, poll_stream,
stop_stream, and list_streams tools. They are the GTM MVP path for
radio/HTTP audio and YouTube Live sources: source URL in, StreamSession and
StreamChunk payloads out for agent polling or SSE-style delivery.
For hosted preview deployments, live STT is selected independently from the
HTTP API process. tubebrain-hosted can run with a no-op backend for contract
testing, an in-process Whisper backend for local development, or a remote
internal tubebrain-stt service for Blahaj/Tinyland deployment. The remote STT
service is an internal boundary, not a public API surface.
The broader API and auth contract lives in Hosted HTTP/SSE API Contract. The pilot-facing metering, retention, and proof-of-origin boundary lives in Hosted Pilot Policy.
Run Locally
From a checkout:
just hosted-dev
That starts tubebrain-hosted on 127.0.0.1:8787 with the local development
API key tb_sk_test_local unless hosted API key configuration is already set.
Equivalent explicit command:
TUBEBRAIN_HOSTED_BIND=127.0.0.1:8787 \
cargo run --locked --bin tubebrain-hosted
Set TUBEBRAIN_API_KEY in the environment before running the server to use a
non-default local token. For paid pilots, prefer TUBEBRAIN_API_KEYS_JSON so
each key carries account identity, key identity, scopes, status, and an optional
pilot label. The server stores SHA-256 key hashes in memory and never emits raw
API keys in usage events or error bodies.
Single-key environment:
| Variable | Default | Notes |
|---|---|---|
| TUBEBRAIN_API_KEY | tb_sk_test_local in local dev | Raw bearer token loaded from deployment secrets. |
| TUBEBRAIN_USAGE_ACCOUNT_ID | pilot | Account id attached to usage events for the single key. |
| TUBEBRAIN_API_KEY_ID | derived key_<sha256-prefix> | Stable key id for usage events. |
| TUBEBRAIN_API_KEY_SCOPES | all current scopes | Comma-separated scopes, such as transcript:read,stream:read. |
| TUBEBRAIN_API_KEY_STATUS | active | Set revoked to reject the key without removing the record. |
| TUBEBRAIN_API_KEY_LABEL | unset | Optional operator label for the key. |
Multi-key environment:
[
  {
    "key": "tb_sk_live_example",
    "account_id": "acct_design_partner",
    "key_id": "key_design_partner_primary",
    "scopes": ["transcript:read", "stream:read", "stream:write", "admin:read"],
    "status": "active",
    "label": "design-partner-primary"
  }
]
TUBEBRAIN_API_KEYS_JSON records may use key_sha256 instead of key when a
deployment secret pipeline pre-hashes the raw key. Supported scopes are
transcript:read, stream:read, stream:write, and admin:read; all or
* expands to the current full set.
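A secret pipeline that pre-hashes keys could generate records along these lines. This is a sketch under stated assumptions: the helper names are hypothetical, and the digest encoding (lowercase hex SHA-256 over the key's UTF-8 bytes) is an assumption that should be checked against the server before use.

```python
import hashlib
import json

# Scope names as listed in this document.
ALL_SCOPES = ["transcript:read", "stream:read", "stream:write", "admin:read"]


def expand_scopes(scopes):
    """Expand the `all` / `*` shorthand into the current full scope set."""
    if "all" in scopes or "*" in scopes:
        return list(ALL_SCOPES)
    unknown = [s for s in scopes if s not in ALL_SCOPES]
    if unknown:
        raise ValueError(f"unsupported scopes: {unknown}")
    return list(scopes)


def prehashed_record(raw_key, account_id, key_id, scopes, label=None):
    """Build one key record that carries key_sha256 instead of the raw key.

    ASSUMPTION: lowercase hex SHA-256 of the UTF-8 key bytes.
    """
    record = {
        "key_sha256": hashlib.sha256(raw_key.encode("utf-8")).hexdigest(),
        "account_id": account_id,
        "key_id": key_id,
        "scopes": expand_scopes(scopes),
        "status": "active",
    }
    if label is not None:
        record["label"] = label
    return record


if __name__ == "__main__":
    records = [
        prehashed_record(
            "tb_sk_live_example",
            "acct_design_partner",
            "key_design_partner_primary",
            ["all"],
            label="design-partner-primary",
        )
    ]
    print(json.dumps(records, indent=2))
```

The printed array is shaped for TUBEBRAIN_API_KEYS_JSON and never contains the raw key.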
The first controlled hosted deployment is documented in
Hosted Preview Runbook. It runs tubebrain-hosted behind
tailnet-only ingress and keeps bearer-token auth on every endpoint except
GET /v1/health. Blahaj/OpenTofu owns the normal protected-preview Kubernetes
state; this repo owns the binaries, images, API contract, docs, smoke scripts,
and legacy/manual recovery manifests.
The pre-call paid design-partner checklist is Paid Pilot Operator Runbook. It is the operator go/no-go surface for DNS, CI/deploy, API keys, quota, STT posture, managed fallback posture, redaction, demo artifacts, and acceptance criteria.
The first quotable paid-pilot package is Paid Pilot Package. It defines the current price bands, source counts, hosted-preview hours, managed fallback caps, support limits, manual invoice path, and billing evidence boundaries.
The preview also exposes authenticated operator usage inspection at
GET /v1/admin/usage. This endpoint returns the authenticated account's pilot
usage snapshot: configured limits, rolling-window counters, and a bounded
recent-event ring. When TUBEBRAIN_USAGE_EVENT_LOG is configured, the hosted
process also appends JSONL usage events and rebuilds the current quota window
from that file on restart. This is enough for protected-preview control and
auditability, but it is not a multi-replica billing database.
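The restart behavior can be pictured as replaying the JSONL file and counting only events that still fall inside the rolling window. The sketch below is illustrative and not the hosted implementation: the field names ts_unix, account_id, and endpoint are hypothetical, and the real event schema may differ.

```python
import json
from collections import Counter


def rebuild_window(jsonl_lines, now_unix, window_secs=86400):
    """Recount events per (account_id, endpoint) inside the rolling window.

    ASSUMPTION: each line is a JSON object with hypothetical fields
    `ts_unix`, `account_id`, and `endpoint`.
    """
    cutoff = now_unix - window_secs
    counters = Counter()
    for line in jsonl_lines:
        line = line.strip()
        if not line:
            continue
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip a torn or corrupt line instead of failing startup
        if not isinstance(event, dict) or event.get("ts_unix", 0) <= cutoff:
            continue  # outside the window, so it no longer counts
        counters[(event.get("account_id"), event.get("endpoint"))] += 1
    return counters
```

Because the counters live in one process and are rebuilt from one file, this approach does not aggregate usage across replicas.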
Builds with --features po-token use the same BotGuard support as the local
MCP binary. Builds with --features whisper can use the local Whisper fallback
when captions are unavailable and can transcribe live stream chunks. Default
hosted builds can start stream sessions, but report live STT as degraded through
stream diagnostics unless a real STT backend is configured.
Stream STT Backends
tubebrain-hosted reads TUBEBRAIN_STT_BACKEND:
| Value | Behavior |
|---|---|
| noop | Starts stream sessions but reports live STT as degraded. Useful for API contract tests. |
| local | Uses in-process Whisper. Requires tubebrain-hosted built with --features whisper. |
| remote | Sends audio chunks to the internal tubebrain-stt service over HTTP with bearer auth. |
If TUBEBRAIN_STT_BACKEND is unset, Whisper-enabled builds default to local
and default builds use noop.
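The selection rule can be written as a small function. This is a sketch of the documented defaults, not the server's code: the function name is hypothetical, and two behaviors are assumptions, namely that an empty value counts as unset and that requesting local on a build without Whisper is rejected.

```python
VALID_BACKENDS = ("noop", "local", "remote")


def select_stt_backend(env, whisper_enabled):
    """Resolve the STT backend from an environment mapping.

    ASSUMPTIONS: an empty value is treated as unset, and `local` without a
    Whisper-enabled build is an error rather than a silent downgrade.
    """
    value = env.get("TUBEBRAIN_STT_BACKEND", "")
    if value == "":
        return "local" if whisper_enabled else "noop"
    if value not in VALID_BACKENDS:
        raise ValueError(f"unknown TUBEBRAIN_STT_BACKEND: {value!r}")
    if value == "local" and not whisper_enabled:
        raise ValueError("local backend requires a build with --features whisper")
    return value
```

Passing the environment as a mapping, for example os.environ, keeps the rule easy to exercise in contract tests.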
Remote STT configuration:
| Variable | Purpose |
|---|---|
| TUBEBRAIN_REMOTE_STT_URL | Base URL for the internal STT service, for example http://tubebrain-stt:8788. |
| TUBEBRAIN_REMOTE_STT_TOKEN | Bearer token shared with the internal STT service. |
| TUBEBRAIN_REMOTE_STT_TIMEOUT_MS | Optional request timeout. Defaults to 30000. |
| TUBEBRAIN_REMOTE_STT_MAX_BYTES | Optional per-chunk byte cap. Defaults to 4194304. |
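A deployment check script could load these variables with the documented defaults. The sketch below is hypothetical tooling, not part of tubebrain-hosted; it assumes the URL and token are both required whenever the remote backend is selected.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RemoteSttConfig:
    url: str
    token: str
    timeout_ms: int = 30000    # documented default
    max_bytes: int = 4194304   # documented default (4 MiB)


def load_remote_stt_config(env):
    """Read remote STT settings from an environment mapping.

    ASSUMPTION: URL and token are mandatory for the remote backend.
    """
    required = ("TUBEBRAIN_REMOTE_STT_URL", "TUBEBRAIN_REMOTE_STT_TOKEN")
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise ValueError(f"missing remote STT settings: {', '.join(missing)}")
    return RemoteSttConfig(
        url=env["TUBEBRAIN_REMOTE_STT_URL"].rstrip("/"),
        token=env["TUBEBRAIN_REMOTE_STT_TOKEN"],
        timeout_ms=int(env.get("TUBEBRAIN_REMOTE_STT_TIMEOUT_MS", 30000)),
        max_bytes=int(env.get("TUBEBRAIN_REMOTE_STT_MAX_BYTES", 4194304)),
    )
```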
Run the internal STT service locally:
just stt-dev
That starts tubebrain-stt on 127.0.0.1:8788 with the local development STT
key tb_stt_test_local unless TUBEBRAIN_STT_API_KEY is set. The service
exposes:
- GET /v1/health
- POST /v1/stt/chunk
The STT chunk endpoint accepts base64-encoded audio chunks and returns the same
structured Segment shape used by local MCP and hosted stream responses. The
service enforces bearer-token auth, a per-chunk byte limit, and redacted error
responses. It is intended for private cluster traffic only.
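A caller has to base64-encode each chunk and stay under the byte limit. The sketch below shows only that encoding step. The JSON field name audio_b64 is hypothetical, the cap is assumed to apply to the raw bytes before encoding, and the 4194304 value is borrowed from the hosted-side TUBEBRAIN_REMOTE_STT_MAX_BYTES default because the service's own limit is not stated here.

```python
import base64

# Hosted-side default cap; the STT service's own limit may differ (ASSUMPTION).
DEFAULT_MAX_BYTES = 4194304


def build_chunk_body(audio, max_bytes=DEFAULT_MAX_BYTES):
    """Return a JSON-serializable body for one audio chunk.

    ASSUMPTIONS: `audio_b64` is a hypothetical field name, and the byte cap
    is checked against the raw audio, not the base64 text.
    """
    if len(audio) > max_bytes:
        raise ValueError(f"chunk is {len(audio)} bytes, cap is {max_bytes}")
    return {"audio_b64": base64.b64encode(audio).decode("ascii")}
```

Checking the size before encoding avoids sending a request the service would reject.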
Managed STT Fallback Boundary
Managed third-party STT is separate from the internal remote backend. The
managed boundary is disabled by default, follows the hosted pilot policy, and is
only for buyer-approved fallback or forced managed pilots. It does not change
the public stream response shapes.
Configure the boundary on tubebrain-hosted:
| Variable | Purpose |
|---|---|
| TUBEBRAIN_MANAGED_STT_MODE | disabled, opt-in, or forced. Defaults to disabled. |
| TUBEBRAIN_MANAGED_STT_PROVIDER | Provider adapter. Currently http. |
| TUBEBRAIN_MANAGED_STT_URL | Base URL for the managed provider proxy. Required unless disabled. |
| TUBEBRAIN_MANAGED_STT_TOKEN | Bearer token for the managed provider proxy. Required unless disabled. |
| TUBEBRAIN_MANAGED_STT_FALLBACK_HOURS | Explicit fallback-hour cap for the hosted process/account. Defaults to 0. |
| TUBEBRAIN_MANAGED_STT_TIMEOUT_MS | Optional provider request timeout. Defaults to 30000. |
| TUBEBRAIN_MANAGED_STT_MAX_RETRIES | Optional retry count after the first provider attempt. Defaults to 0. |
| TUBEBRAIN_MANAGED_STT_MAX_BYTES | Optional per-chunk byte cap. Defaults to 4194304. |
| TUBEBRAIN_MANAGED_STT_MIN_BILLABLE_MS | Optional minimum billable duration per non-empty chunk. Defaults to 1. |
| TUBEBRAIN_MANAGED_STT_COST_PER_HOUR_MICRO_USD | Optional cost estimate used in usage events. Defaults to 0. |
In opt-in mode, TubeBrain tries the selected primary backend first and calls
the managed provider only after the primary transcriber fails. In forced mode,
TubeBrain sends chunks directly through the managed boundary after enforcing the
configured fallback-hour cap.
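The three modes can be summarized as a routing function. This is a sketch of the described behavior rather than the actual implementation: the function, exception, and budget dictionary are hypothetical, and it assumes the fallback-hour cap is checked before every managed call, including opt-in fallbacks.

```python
class FallbackCapExceeded(Exception):
    """Raised when a managed call would exceed the fallback-hour cap."""


def transcribe_chunk(chunk, duration_ms, mode, primary, managed, budget):
    """Route one chunk according to TUBEBRAIN_MANAGED_STT_MODE.

    `primary` and `managed` are callables taking the chunk; `budget` is a
    hypothetical dict with `used_ms` and `cap_ms` keys.
    """
    def call_managed():
        # ASSUMPTION: the cap applies to every managed call, not only forced mode.
        if budget["used_ms"] + duration_ms > budget["cap_ms"]:
            raise FallbackCapExceeded("managed STT fallback-hour cap reached")
        result = managed(chunk)
        budget["used_ms"] += duration_ms
        return result

    if mode == "forced":
        return call_managed()  # primary backend is bypassed entirely
    try:
        return primary(chunk)
    except Exception:
        if mode != "opt-in":
            raise  # disabled: surface the primary failure unchanged
        return call_managed()
```

With the default cap of 0 hours, every managed call in this sketch is refused, which matches the boundary being off unless an operator sets an explicit budget.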
The managed provider request carries only the short audio chunk and chunk
timing/format hints. It does not send TubeBrain API keys, cookies, signed media
URLs, PoToken values, BotGuard details, session IDs, or account metadata. Usage
events are emitted to the tubebrain_usage tracing target with provider,
outcome, processed duration, retry count, and estimated cost fields. See the
TIN-1212 cost model in
docs/spikes/2026-05-16-managed-stt-fallback-cost-model.md.
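For orientation, an estimated-cost field could be derived from the processed duration, TUBEBRAIN_MANAGED_STT_MIN_BILLABLE_MS, and TUBEBRAIN_MANAGED_STT_COST_PER_HOUR_MICRO_USD roughly as follows. The formula is an assumption for illustration, including the floor rounding and the use of a non-positive duration as the test for an empty chunk; the referenced cost model remains the authority.

```python
MS_PER_HOUR = 3_600_000


def billable_ms(duration_ms, min_billable_ms=1):
    """Apply the minimum billable duration to a non-empty chunk.

    ASSUMPTION: a chunk with non-positive duration counts as empty.
    """
    if duration_ms <= 0:
        return 0
    return max(duration_ms, min_billable_ms)


def estimated_cost_micro_usd(duration_ms, cost_per_hour_micro_usd=0, min_billable_ms=1):
    """Estimate cost in micro-USD, rounding down (ASSUMPTION)."""
    return billable_ms(duration_ms, min_billable_ms) * cost_per_hour_micro_usd // MS_PER_HOUR
```

For example, a 30000 ms chunk at 360000 micro-USD per hour comes to 3000 micro-USD, and with the default rate of 0 the estimate is always 0.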
Health
curl -fsS http://127.0.0.1:8787/v1/health | jq
Example response:
{
  "status": "ok",
  "service": "tubebrain-hosted",
  "version": "0.1.10",
  "core_version": "0.1.10"
}
Transcript Section
curl -fsS http://127.0.0.1:8787/v1/transcript/section \
-H 'authorization: Bearer tb_sk_test_local' \
-H 'content-type: application/json' \
-d '{
"url": "https://www.youtube.com/watch?v=Rzi7oFTzjac&t=2449s",
"lang": "en",
"before_s": 120,
"after_s": 600
}' | jq
The successful response contains:
- request_id
- section, matching the local TranscriptSection type
- agent_contract.suggested_task, currently summarize_section_and_extract_links
Error responses use the hosted contract's stable envelope:
{
  "error": {
    "code": "invalid_request",
    "message": "transcript section requires at_s or a YouTube timestamp such as t=2449s",
    "request_id": "req_0000000000000001"
  }
}
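Because the envelope is stable, a client can turn it into a typed exception in one place. The sketch below relies only on the three fields shown above; the class and function names are hypothetical.

```python
import json


class HostedApiError(Exception):
    """Client-side view of the hosted error envelope (hypothetical class)."""

    def __init__(self, code, message, request_id):
        super().__init__(f"{code}: {message} (request_id={request_id})")
        self.code = code
        self.message = message
        self.request_id = request_id


def parse_envelope(body):
    """Return the decoded payload, or raise HostedApiError for an error envelope."""
    payload = json.loads(body)
    error = payload.get("error") if isinstance(payload, dict) else None
    if not isinstance(error, dict):
        return payload
    raise HostedApiError(
        error.get("code", "unknown"),
        error.get("message", ""),
        error.get("request_id"),
    )
```

Keeping request_id on the exception makes it straightforward to quote in a support request.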
Stream Sessions
Start a stream session:
curl -fsS http://127.0.0.1:8787/v1/stream/start \
-H 'authorization: Bearer tb_sk_test_local' \
-H 'content-type: application/json' \
-d '{
"url": "https://radio.example/live.mp3",
"lang": "en"
}' | jq
The successful response contains:
- request_id
- session, matching the local StreamSession type
Poll for new transcript chunks:
curl -fsS 'http://127.0.0.1:8787/v1/stream/sess-1/poll?cursor=0' \
-H 'authorization: Bearer tb_sk_test_local' | jq
The successful response contains:
- request_id
- chunk, matching the local StreamChunk type
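An agent typically wraps the poll call in a cursor loop. The sketch below keeps the HTTP call outside the loop by accepting a fetch callable. The chunk field names segments, next_cursor, and done are hypothetical, since the StreamChunk schema is not reproduced here.

```python
import time


def poll_stream(fetch, cursor=0, interval_s=1.0, max_polls=100, sleep=time.sleep):
    """Poll until the stream reports completion or max_polls is reached.

    `fetch(cursor)` must return the decoded poll response. ASSUMPTION: the
    chunk carries hypothetical `segments`, `next_cursor`, and `done` fields.
    """
    segments = []
    for _ in range(max_polls):
        chunk = fetch(cursor)["chunk"]
        segments.extend(chunk.get("segments", []))
        cursor = chunk.get("next_cursor", cursor)
        if chunk.get("done"):
            break
        sleep(interval_s)
    return cursor, segments
```

The default TUBEBRAIN_QUOTA_STREAM_POLL of 1800 per 86400-second window averages one poll every 48 seconds, so a short interval can exhaust the quota quickly on long sessions.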
The SSE endpoint emits the same chunk envelope as a text/event-stream
response:
curl -fsS 'http://127.0.0.1:8787/v1/stream/sess-1/events?cursor=0' \
-H 'authorization: Bearer tb_sk_test_local' \
-H 'accept: text/event-stream'
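On the client side, the text/event-stream body has to be split into events before the JSON envelope can be decoded. The following is a minimal parser for the standard server-sent events framing (comment lines, event and data fields, a blank line as terminator). It is a sketch: it ignores the id and retry fields and makes no assumption about which event names the hosted API uses.

```python
def iter_sse_events(lines):
    """Yield (event_name, data) pairs from an iterable of text lines."""
    event, data = "message", []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line dispatches the buffered event, if it has data.
            if data:
                yield event, "\n".join(data)
            event, data = "message", []
            continue
        if line.startswith(":"):
            continue  # comment line, often used as a keep-alive
        field, _, value = line.partition(":")
        if value.startswith(" "):
            value = value[1:]  # one optional leading space is stripped
        if field == "event":
            event = value
        elif field == "data":
            data.append(value)
```

Each yielded data string can then be passed to json.loads to obtain the same envelope the poll endpoint returns.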
Stop a session:
curl -fsS -X POST http://127.0.0.1:8787/v1/stream/sess-1/stop \
-H 'authorization: Bearer tb_sk_test_local' | jq
List active sessions:
curl -fsS http://127.0.0.1:8787/v1/stream \
-H 'authorization: Bearer tb_sk_test_local' | jq
Pilot Usage Snapshot
Inspect the authenticated account's usage counters:
curl -fsS http://127.0.0.1:8787/v1/admin/usage \
-H 'authorization: Bearer tb_sk_test_local' | jq
The response contains:
- account_id, from the authenticated API key record
- api_key_id, a stable key identifier, not the raw key
- key_label, if the authenticated API key record has one
- limits, from static environment configuration
- counters, including account-scoped rolling-window endpoint counters and rate_limited
- recent_events, a bounded event ring for that account, rebuilt from the durable JSONL sink when configured
Static quota configuration:
| Variable | Default | Notes |
|---|---|---|
| TUBEBRAIN_USAGE_EVENT_LOG | unset | Optional JSONL file path for durable usage events. |
| TUBEBRAIN_USAGE_WINDOW_SECS | 86400 | Rolling quota window used for counters and rate-limit reset hints. |
| TUBEBRAIN_QUOTA_TRANSCRIPT_SECTION | 300 | Per-account window cap. |
| TUBEBRAIN_QUOTA_STREAM_START | 60 | Per-account window cap. |
| TUBEBRAIN_QUOTA_STREAM_POLL | 1800 | Per-account window cap. |
| TUBEBRAIN_QUOTA_STREAM_EVENTS | 300 | Per-account window cap. |
| TUBEBRAIN_QUOTA_STREAM_STOP | 300 | Per-account window cap. |
| TUBEBRAIN_QUOTA_STREAM_LIST | 300 | Per-account window cap. |
| TUBEBRAIN_USAGE_EVENT_CAPACITY | 512 | Bounded recent-event ring returned by GET /v1/admin/usage. |
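To make the window cap and reset hint concrete, here is a generic rolling-window limiter for one account and one endpoint. It is an illustrative sketch, not the hosted implementation: the class and method names are hypothetical, and the reset hint is assumed to be the time until the oldest counted event leaves the window.

```python
from collections import deque


class RollingWindowQuota:
    """Sliding-window cap for one (account, endpoint) pair (hypothetical)."""

    def __init__(self, limit, window_secs=86400):
        self.limit = limit
        self.window_secs = window_secs
        self._events = deque()  # timestamps of admitted requests, oldest first

    def try_acquire(self, now):
        """Return (allowed, retry_after_secs) for a request at time `now`."""
        if self.limit <= 0:
            return False, self.window_secs
        cutoff = now - self.window_secs
        while self._events and self._events[0] <= cutoff:
            self._events.popleft()  # expired events stop counting
        if len(self._events) >= self.limit:
            # ASSUMPTION: the hint is when the oldest event ages out.
            return False, self._events[0] + self.window_secs - now
        self._events.append(now)
        return True, 0
```

With the default TUBEBRAIN_QUOTA_STREAM_START of 60, the 61st start inside one window would be refused until the earliest start is more than 86400 seconds old.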
The JSONL sink stores usage event metadata only: event/request/account/key IDs, endpoint, source kind, outcome, status, duration, stream/STT timing, retry count, optional estimated cost, and a public error code. It does not store raw API keys, bearer tokens, cookies, signed media URLs, PoTokens, BotGuard internals, raw audio, or original source URLs.
Current Limits
This MVP has account-scoped in-memory API key records, account-scoped stream session ownership, and optional file-backed usage-event durability, but it does not yet include self-serve account management, billing, durable multi-process hosted sessions, database-backed metering, or public managed PoToken endpoints.
Stream sessions are in-memory and single-process. The protected-preview routing
model is documented in
Hosted Stream Session Routing:
route active session traffic to the same worker, return 404 not_found after a
worker restart or wrong-account access, and keep raw audio out of durable
storage. Before broad paid traffic, the hosted layer still needs Redis-backed
cursors/buffers, cross-replica usage aggregation, and automated billing-grade
operations.
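The 404 rule can be illustrated with an in-memory ownership check. This is a sketch with hypothetical names: a session that is unknown to the worker, for example after a restart, and a session owned by another account produce the same not_found result, so callers cannot probe for other accounts' session ids.

```python
class NotFound(Exception):
    """Maps to HTTP 404 with the public error code not_found."""

    status = 404
    code = "not_found"


class SessionRegistry:
    """Single-process session ownership table (hypothetical sketch)."""

    def __init__(self):
        self._owners = {}  # session_id -> account_id, lost on restart

    def start(self, session_id, account_id):
        self._owners[session_id] = account_id

    def require(self, session_id, account_id):
        """Return the session id if the account owns it, else raise NotFound."""
        owner = self._owners.get(session_id)
        if owner is None or owner != account_id:
            raise NotFound(f"stream session {session_id} not found")
        return session_id

    def stop(self, session_id, account_id):
        self.require(session_id, account_id)
        del self._owners[session_id]
```

Because the table lives in one process, this only works when traffic for an active session keeps reaching the worker that started it.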
Public responses and diagnostics must not expose cookies, signed media URLs, PoTokens, BotGuard internals, or raw audio. The hosted error envelope redacts known sensitive URL, token, cookie, PoToken, and BotGuard-shaped strings before returning messages to clients. Raw audio is not persisted by default; live segment buffers are operational state for the active session only.