Hosted Pilot Policy
This page is the pilot-facing policy for the protected TubeBrain hosted preview. It explains what TubeBrain meters, what it keeps, and what it deliberately does not expose.
The current hosted preview is a controlled design-partner service. It keeps
tubebrain.ai as the static marketing surface and exposes the backend only
through the Tinyland tailnet unless a separate public API launch is approved.
What We Meter
TubeBrain now has a protected-preview usage sink for the first paid-pilot
shape. The current implementation is deliberately small: `tubebrain-hosted`
keeps account-scoped rolling-window counters and a bounded recent-event ring,
emits structured `tubebrain_usage` logs with account/key identity, optionally
persists usage events to a JSONL file, and exposes an authenticated operator
snapshot at `GET /v1/admin/usage`.
This is operational and billing-oriented, not surveillance-oriented. It is also not durable account storage yet: file-backed usage history can survive a restart for one hosted process, but it does not aggregate across replicas or replace a billing database.
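The shape described above (rolling-window counters, a bounded recent-event ring, and optional JSONL persistence) can be sketched as follows. This is a minimal illustration, not the `tubebrain-hosted` implementation: the class name `UsageSink`, its methods, and the `transcript_section` endpoint label are all hypothetical.

```python
import json
import time
from collections import deque
from typing import Optional


class UsageSink:
    """Illustrative in-memory usage sink; every name here is hypothetical."""

    def __init__(self, window_secs: int = 86400, capacity: int = 512,
                 log_path: Optional[str] = None) -> None:
        self.window_secs = window_secs
        self.recent = deque(maxlen=capacity)  # bounded recent-event ring
        self.timestamps = {}  # (account_id, endpoint) -> deque of event times
        self.log_path = log_path

    def record(self, event: dict, now: Optional[float] = None) -> None:
        now = time.time() if now is None else now
        key = (event["account_id"], event["endpoint"])
        self.timestamps.setdefault(key, deque()).append(now)
        self.recent.append(event)  # the oldest event falls off when full
        if self.log_path is not None:
            # Optional JSONL persistence: one JSON object per line.
            with open(self.log_path, "a", encoding="utf-8") as fh:
                fh.write(json.dumps(event, sort_keys=True) + "\n")

    def count(self, account_id: str, endpoint: str,
              now: Optional[float] = None) -> int:
        now = time.time() if now is None else now
        window = self.timestamps.get((account_id, endpoint), deque())
        # Drop timestamps that have aged out of the rolling window.
        while window and window[0] <= now - self.window_secs:
            window.popleft()
        return len(window)
```

Because the counters live in one process, a sketch like this has the same limitation the policy names: it cannot aggregate across replicas.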
The first quotable paid-pilot package is described in Paid Pilot Package, which maps pilot price bands and manual invoice evidence to the usage-event dimensions below.
Minimum usage events:
- transcript section request
- stream session start
- stream poll
- stream SSE connection
- stream stop
- source failure
- transcription failure
- rate-limit decision
Follow-on billing work will add upstream media fetch attempts, deeper audio decode/STT attempt detail, and durable per-account aggregation once the pilot path needs multi-process accounting.
Minimum dimensions:
| Field | Purpose |
|---|---|
| `event_id` | Stable unique usage-event identifier |
| `request_id` | Correlates the API response with operator diagnostics |
| `account_id` | Billing and support grouping |
| `api_key_id` | Key-level quota and abuse control, never the raw key |
| `endpoint` | API route or MCP-equivalent operation |
| `source_kind` | `youtube_vod`, `youtube_live`, `http_audio`, or future adapter |
| `session_id` | Present for stream events |
| `outcome` | `ok`, `client_error`, `source_error`, `transcription_error`, `rate_limited`, or `internal_error` |
| `status_code` | HTTP status for hosted calls |
| `duration_ms` | Server-side wall-clock duration |
| `stream_active_ms` | Active stream-session time charged or analyzed |
| `audio_decoded_ms` | Audio duration successfully decoded |
| `stt_processed_ms` | Audio duration submitted to STT |
| `stt_backend` | Primary STT backend when available |
| `stt_fallback_mode` | Managed fallback mode when available |
| `stt_provider` | Managed provider name when available |
| `estimated_cost_micro_usd` | Optional cost estimate in micro-USD |
| `egress_bytes` | Response and SSE egress estimate |
| `retry_count` | Retries caused by source, network, or resolver behavior |
| `error_code` | Stable public error code, not raw upstream error text |
| `created_at_unix_s` | Event timestamp |
Usage events must not contain raw API keys, bearer tokens, cookies, signed media URLs, PoTokens, BotGuard internals, or raw audio bytes.
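As a rough illustration of how a subset of these dimensions could be carried in code, the sketch below models an event as a frozen dataclass and serializes it to one JSONL line. The field names and outcome values come from the table above; the class `UsageEvent`, the helper `to_jsonl_line`, and the choice of which fields are optional are assumptions made for the example.

```python
import json
from dataclasses import asdict, dataclass
from typing import Optional

# Outcome values listed in the dimensions table.
OUTCOMES = {"ok", "client_error", "source_error", "transcription_error",
            "rate_limited", "internal_error"}


@dataclass(frozen=True)
class UsageEvent:
    """Hypothetical subset of the usage-event dimensions."""
    event_id: str
    request_id: str
    account_id: str
    api_key_id: str  # key identifier only, never the raw key
    endpoint: str
    outcome: str
    status_code: int
    duration_ms: int
    created_at_unix_s: int
    source_kind: Optional[str] = None
    session_id: Optional[str] = None  # present for stream events
    error_code: Optional[str] = None

    def __post_init__(self) -> None:
        if self.outcome not in OUTCOMES:
            raise ValueError(f"unknown outcome: {self.outcome!r}")


def to_jsonl_line(event: UsageEvent) -> str:
    # Omit unset optional fields so each line stays compact.
    record = {k: v for k, v in asdict(event).items() if v is not None}
    return json.dumps(record, sort_keys=True)
```

A fixed schema of this kind has no slot for a raw key, token, cookie, or signed URL, which is one simple way to keep such material out of usage events.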
Quota Window
The preview enforces static per-account quotas before each metered endpoint
handles a request. Counters are scoped to a rolling window and can be rebuilt
from `TUBEBRAIN_USAGE_EVENT_LOG` when that file-backed sink is configured.
Defaults are intentionally conservative and can be overridden with environment
variables:
| Variable | Default |
|---|---|
| `TUBEBRAIN_API_KEY` | `tb_sk_test_local` in local dev |
| `TUBEBRAIN_API_KEYS_JSON` | unset |
| `TUBEBRAIN_USAGE_ACCOUNT_ID` | `pilot` |
| `TUBEBRAIN_API_KEY_SCOPES` | all current scopes |
| `TUBEBRAIN_API_KEY_STATUS` | `active` |
| `TUBEBRAIN_USAGE_EVENT_LOG` | unset |
| `TUBEBRAIN_USAGE_WINDOW_SECS` | `86400` |
| `TUBEBRAIN_QUOTA_TRANSCRIPT_SECTION` | `300` |
| `TUBEBRAIN_QUOTA_STREAM_START` | `60` |
| `TUBEBRAIN_QUOTA_STREAM_POLL` | `1800` |
| `TUBEBRAIN_QUOTA_STREAM_EVENTS` | `300` |
| `TUBEBRAIN_QUOTA_STREAM_STOP` | `300` |
| `TUBEBRAIN_QUOTA_STREAM_LIST` | `300` |
| `TUBEBRAIN_USAGE_EVENT_CAPACITY` | `512` |
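One way to read the numeric settings in the table is sketched below. The variable names and defaults are taken from the table; the function `load_limits` and the rule that overrides must be positive integers are assumptions for illustration, not documented behavior of the service.

```python
import os
from typing import Mapping, Optional

# Numeric defaults copied from the table above.
DEFAULTS = {
    "TUBEBRAIN_USAGE_WINDOW_SECS": 86400,
    "TUBEBRAIN_QUOTA_TRANSCRIPT_SECTION": 300,
    "TUBEBRAIN_QUOTA_STREAM_START": 60,
    "TUBEBRAIN_QUOTA_STREAM_POLL": 1800,
    "TUBEBRAIN_QUOTA_STREAM_EVENTS": 300,
    "TUBEBRAIN_QUOTA_STREAM_STOP": 300,
    "TUBEBRAIN_QUOTA_STREAM_LIST": 300,
    "TUBEBRAIN_USAGE_EVENT_CAPACITY": 512,
}


def load_limits(env: Optional[Mapping[str, str]] = None) -> dict:
    """Hypothetical loader: fall back to the default when a variable is unset."""
    env = os.environ if env is None else env
    limits = {}
    for name, default in DEFAULTS.items():
        raw = env.get(name)
        if raw is None or raw.strip() == "":
            limits[name] = default
            continue
        value = int(raw)  # raises ValueError on non-integer input
        if value <= 0:
            # Assumed validation rule; the service may behave differently.
            raise ValueError(f"{name} must be a positive integer, got {raw!r}")
        limits[name] = value
    return limits
```

Passing a plain dictionary instead of `os.environ` keeps the loader easy to test without touching the process environment.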
Quota responses use HTTP 429 with the stable error code `rate_limited`.
Metered responses include `x-ratelimit-limit`, `x-ratelimit-remaining`, and
`x-ratelimit-reset` headers. The reset value is the number of seconds until the
oldest counted event or in-flight reservation leaves the rolling window.
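The header semantics above can be illustrated with a small rolling-window limiter. This is a sketch under stated assumptions: the class `RollingWindowQuota`, its return shape, and the rounding of the reset value up to whole seconds are hypothetical, and in-flight reservations are not modeled.

```python
import math
from collections import deque


class RollingWindowQuota:
    """Hypothetical per-account, per-endpoint rolling-window quota."""

    def __init__(self, limit: int, window_secs: int) -> None:
        self.limit = limit
        self.window_secs = window_secs
        self.events = deque()  # timestamps of counted events, oldest first

    def check(self, now: float) -> dict:
        # Forget events that have left the rolling window.
        while self.events and self.events[0] <= now - self.window_secs:
            self.events.popleft()
        allowed = len(self.events) < self.limit
        if allowed:
            self.events.append(now)
        # Seconds until the oldest counted event leaves the window.
        reset = 0
        if self.events:
            reset = math.ceil(self.events[0] + self.window_secs - now)
        return {
            "allowed": allowed,
            "status": None if allowed else 429,
            "error_code": None if allowed else "rate_limited",
            "headers": {
                "x-ratelimit-limit": str(self.limit),
                "x-ratelimit-remaining": str(self.limit - len(self.events)),
                "x-ratelimit-reset": str(reset),
            },
        }
```

With a limit of 2 and a 60-second window, calls at t=0 and t=10 succeed, a call at t=20 is rejected with a reset of 40, and a call at t=61 succeeds again because the event at t=0 has aged out.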
What We Keep
Default retention for the hosted preview and paid-pilot design:
| Data | Default |
|---|---|
| API request metadata | 30 days |
| Usage events | Billing and audit retention |
| Transcript cache | Short TTL, initially 1 hour |
| Stream session metadata | Session lifetime plus operational cleanup window |
| Live transcript buffers | Session lifetime plus operational cleanup window |
| Raw audio | Not persisted by default |
| PoToken material | Not exposed and not stored beyond operational need |
| Cookies and signed media URLs | Not stored as customer-visible records |
The service may hold short-lived live buffers in memory while a stream session is active. Those buffers exist to support polling, SSE delivery, and transcription; they are not a product archive.
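The transcript-cache row in the table above (short TTL, initially 1 hour) corresponds to a familiar pattern, sketched here as an in-memory cache with lazy expiry. The class `TranscriptCache` and its keying are hypothetical; the policy does not say how the hosted cache is implemented.

```python
import time
from typing import Dict, Optional, Tuple


class TranscriptCache:
    """Hypothetical TTL cache; entries expire one hour after being stored."""

    def __init__(self, ttl_secs: float = 3600.0) -> None:
        self.ttl_secs = ttl_secs
        self._entries: Dict[str, Tuple[float, str]] = {}

    def put(self, key: str, transcript: str,
            now: Optional[float] = None) -> None:
        now = time.monotonic() if now is None else now
        self._entries[key] = (now + self.ttl_secs, transcript)

    def get(self, key: str, now: Optional[float] = None) -> Optional[str]:
        now = time.monotonic() if now is None else now
        entry = self._entries.get(key)
        if entry is None:
            return None
        expires_at, transcript = entry
        if now >= expires_at:
            # Expired entries are deleted on access rather than kept around.
            del self._entries[key]
            return None
        return transcript
```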
PoToken Boundary
TubeBrain can use proof-of-origin token support internally when a source requires it and the deployed build enables that feature. Managed token minting is not a public API endpoint in the hosted MVP.
Pilot users should see:
- ordinary transcript or stream responses
- stable error codes when a source cannot be resolved
- high-level diagnostics such as degraded STT or source unavailable
Pilot users should not see:
- PoToken values
- BotGuard worker details
- signed Googlevideo URLs
- cookies or authorization headers
- raw audio payloads
STT Boundary
The hosted preview uses self-hosted STT first. The intended cluster shape is:
1. `tubebrain-hosted` resolves the source, fetches live/radio audio, and emits the public `StreamSession` and `StreamChunk` contracts.
2. `tubebrain-hosted` sends short base64-encoded audio chunks to the internal `tubebrain-stt` service when `TUBEBRAIN_STT_BACKEND=remote`.
3. `tubebrain-stt` runs the Whisper-backed chunk transcriber and returns structured transcript segments.
The internal STT service is bearer-authenticated and cluster-local. It is not a customer-facing endpoint, does not expose managed PoToken behavior, and does not persist raw audio by default.
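The hand-off between the two services can be pictured as a bearer-authenticated POST carrying one base64-encoded chunk. The sketch below only builds the request and does not send it. The `/transcribe` path, the JSON field names, and the function `build_stt_request` are assumptions for illustration; the internal contract is not documented on this page.

```python
import base64
import json
import urllib.request


def build_stt_request(base_url: str, token: str, audio: bytes,
                      sample_rate_hz: int, start_ms: int) -> urllib.request.Request:
    """Build (but do not send) a hypothetical chunk-transcription request."""
    body = {
        # Short audio chunk, base64-encoded for JSON transport.
        "audio_b64": base64.b64encode(audio).decode("ascii"),
        "sample_rate_hz": sample_rate_hz,
        "start_ms": start_ms,
    }
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/transcribe",  # hypothetical path
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",  # cluster-local bearer auth
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

The payload carries only the audio and timing hints, with no account or session identifiers.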
Managed third-party STT is an optional hosted fallback boundary, disabled by
default and configured only for buyer-approved pilots. It supports opt-in
fallback after the primary STT backend fails and forced managed mode for a
specific pilot proof. The boundary enforces an explicit fallback-hour cap before
provider calls, emits usage events for attempts, provider, outcome, processed
duration, retry count, and estimated cost, and sends only short audio chunks plus
timing/format hints to the provider. It must not send API keys, cookies, signed
media URLs, PoTokens, BotGuard details, session IDs, or account metadata to a
managed provider. The cost and policy model is tracked in
docs/spikes/2026-05-16-managed-stt-fallback-cost-model.md (TIN-1212).
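Two of the rules above lend themselves to a short sketch: checking the fallback-hour cap before any provider call, and building the provider payload from an allow-list so that nothing else can leak through. The names `FallbackBudget`, `provider_payload`, and the payload field names are hypothetical.

```python
# Hypothetical allow-list: audio plus timing/format hints only.
ALLOWED_PROVIDER_FIELDS = frozenset(
    {"audio_b64", "format", "sample_rate_hz", "start_ms", "duration_ms"}
)


class FallbackBudget:
    """Tracks managed-STT usage against an explicit fallback-hour cap."""

    def __init__(self, cap_hours: float) -> None:
        self.cap_ms = int(cap_hours * 3_600_000)
        self.used_ms = 0

    def try_reserve(self, chunk_ms: int) -> bool:
        # Refuse before the provider call if the chunk would exceed the cap.
        if self.used_ms + chunk_ms > self.cap_ms:
            return False
        self.used_ms += chunk_ms
        return True


def provider_payload(chunk: dict) -> dict:
    # Copy only allow-listed fields; session IDs, account metadata,
    # and anything else in the chunk record are dropped.
    return {k: v for k, v in chunk.items() if k in ALLOWED_PROVIDER_FIELDS}
```

An allow-list is used here instead of a deny-list so that a newly added internal field is excluded by default.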
Redaction Boundary
Hosted API responses and diagnostics must be useful enough to debug a failed source, but not detailed enough to leak credentials or replayable media URLs.
Public responses may include:
- `request_id`
- high-level source kind
- stable error code
- stream health
- session cursor and buffer metadata
- sanitized diagnostic messages
Public responses must not include:
- raw bearer token values
- cookie values
- signed media URL path values or query values
- PoToken values
- BotGuard implementation details
- raw audio bytes
The codebase includes hosted error-redaction tests for known sensitive Googlevideo, cookie, bearer-token, PoToken, and BotGuard-shaped strings.
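A sanitizer for diagnostic messages might look like the sketch below. It is not the project's redaction code: the patterns, the placeholder strings, and the function `sanitize` are assumptions, and a production version would need broader coverage (PoToken and BotGuard shapes, for example, are not handled here).

```python
import re

# Hypothetical patterns for a few of the sensitive shapes listed above.
_REDACTIONS = [
    # Signed media URLs on googlevideo hosts, including path and query.
    (re.compile(r"https?://[^\s\"']*googlevideo\.com[^\s\"']*", re.IGNORECASE),
     "[redacted-media-url]"),
    # Bearer token values.
    (re.compile(r"\bbearer\s+[A-Za-z0-9._~+/=-]+", re.IGNORECASE),
     "Bearer [redacted]"),
    # Cookie and Set-Cookie header values, up to the end of the line.
    (re.compile(r"\b(set-cookie|cookie):\s*[^\r\n]+", re.IGNORECASE),
     r"\1: [redacted]"),
    # API keys with the tb_sk_ prefix used by the local-dev default.
    (re.compile(r"\btb_sk_[A-Za-z0-9_]+"), "[redacted-api-key]"),
]


def sanitize(message: str) -> str:
    """Apply each redaction in order and return the cleaned message."""
    for pattern, replacement in _REDACTIONS:
        message = pattern.sub(replacement, message)
    return message
```

Tests in the spirit of the ones mentioned above would feed known sensitive strings through `sanitize` and assert that none of the secret substrings survive.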
Current Limits
The hosted preview now supports account-scoped API key records with scopes and
revocation status, plus optional file-backed usage-event durability for one
hosted process. It also scopes hosted stream sessions to the account that
created them and returns 404 not_found for wrong-account or post-restart
session access. It still uses in-memory stream sessions and sticky
single-worker routing, and it has no cross-replica quota enforcement. Before broad paid
traffic, TubeBrain needs a database-backed usage store, Redis-backed session
buffers/cursors, self-serve key management, and a retention deletion path.
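The account scoping described above can be sketched as a lookup that answers unknown sessions and wrong-account access in the same way. The class `SessionStore` and the response body shape are hypothetical; only the 404 status and the `not_found` code come from this page.

```python
from typing import Dict, Tuple


class SessionStore:
    """Hypothetical in-memory store of stream sessions keyed by session ID."""

    def __init__(self) -> None:
        self._sessions: Dict[str, dict] = {}

    def create(self, session_id: str, account_id: str) -> dict:
        session = {"session_id": session_id, "account_id": account_id}
        self._sessions[session_id] = session
        return session

    def lookup(self, session_id: str, account_id: str) -> Tuple[int, dict]:
        session = self._sessions.get(session_id)
        # A missing session (for example after a restart) and a session owned
        # by another account produce the same response.
        if session is None or session["account_id"] != account_id:
            return 404, {"error": {"code": "not_found"}}
        return 200, session
```

Returning an identical response in both cases means a caller cannot use the endpoint to learn whether another account's session ID exists.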
The intended paid-pilot posture is:
- narrow access for design partners
- explicit source and STT limits
- no public managed PoToken minting surface
- no raw-audio persistence by default
- short-lived stream buffers
- metering events that support billing without storing sensitive source secrets
The operator go/no-go checklist for applying this policy before a pilot call is the Paid Pilot Operator Runbook.