Hosted Pilot Policy

This page is the pilot-facing policy for the protected TubeBrain hosted preview. It explains what TubeBrain meters, what it keeps, and what it deliberately does not expose.

The current hosted preview is a controlled design-partner service. It keeps tubebrain.ai as the static marketing surface and exposes the backend only through the Tinyland tailnet unless a separate public API launch is approved.

What We Meter

TubeBrain now has a protected-preview usage sink for the first paid-pilot shape. The current implementation is deliberately small: tubebrain-hosted keeps account-scoped rolling-window counters and a bounded recent-event ring, emits structured tubebrain_usage logs with account/key identity, optionally persists usage events to a JSONL file, and exposes an authenticated operator snapshot at GET /v1/admin/usage.

This is operational and billing-oriented, not surveillance-oriented. It is also not durable account storage yet: file-backed usage history can survive a restart for one hosted process, but it does not aggregate across replicas or replace a billing database.
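As a rough illustration of the "bounded ring plus optional JSONL file" shape described above, the following Python sketch keeps the most recent events in memory and appends each one to a file when a path is configured. The `UsageSink` class and its method names are hypothetical and are not the tubebrain-hosted implementation; the default capacity of 512 is assumed to correspond to `TUBEBRAIN_USAGE_EVENT_CAPACITY`.

```python
import json
from collections import deque
from pathlib import Path
from typing import Optional


class UsageSink:
    """Hypothetical sink: bounded in-memory ring plus optional JSONL append."""

    def __init__(self, capacity: int = 512, log_path: Optional[Path] = None):
        # deque(maxlen=...) silently drops the oldest event once full.
        self.recent = deque(maxlen=capacity)
        self.log_path = log_path

    def record(self, event: dict) -> None:
        self.recent.append(event)
        if self.log_path is not None:
            # One JSON object per line so the file can be replayed later.
            with self.log_path.open("a", encoding="utf-8") as fh:
                fh.write(json.dumps(event, sort_keys=True) + "\n")

    def replay(self) -> int:
        """Reload events from the JSONL file; return how many were read."""
        if self.log_path is None or not self.log_path.exists():
            return 0
        count = 0
        with self.log_path.open(encoding="utf-8") as fh:
            for line in fh:
                if line.strip():
                    self.recent.append(json.loads(line))
                    count += 1
        return count
```

A sink like this survives a restart of one process because `replay()` reads the same file back, but two replicas writing separate files would each see only their own events, which is the limitation noted above.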

The first quotable paid-pilot package is defined in Paid Pilot Package, which maps pilot price bands and manual invoice evidence to the usage-event dimensions below.

Minimum usage events:

transcript section request
stream session start
stream poll
stream SSE connection
stream stop
source failure
transcription failure
rate-limit decision

Follow-on billing work will add upstream media fetch attempts, deeper audio decode/STT attempt detail, and durable per-account aggregation once the pilot path needs multi-process accounting.

Minimum dimensions:

Field	Purpose
`event_id`	Stable unique usage-event identifier
`request_id`	Correlates the API response with operator diagnostics
`account_id`	Billing and support grouping
`api_key_id`	Key-level quota and abuse control, never the raw key
`endpoint`	API route or MCP-equivalent operation
`source_kind`	`youtube_vod`, `youtube_live`, `http_audio`, or future adapter
`session_id`	Present for stream events
`outcome`	`ok`, `client_error`, `source_error`, `transcription_error`, `rate_limited`, or `internal_error`
`status_code`	HTTP status for hosted calls
`duration_ms`	Server-side wall-clock duration
`stream_active_ms`	Active stream-session time charged or analyzed
`audio_decoded_ms`	Audio duration successfully decoded
`stt_processed_ms`	Audio duration submitted to STT
`stt_backend`	Primary STT backend when available
`stt_fallback_mode`	Managed fallback mode when available
`stt_provider`	Managed provider name when available
`estimated_cost_micro_usd`	Optional cost estimate in micro-USD
`egress_bytes`	Response and SSE egress estimate
`retry_count`	Retries caused by source, network, or resolver behavior
`error_code`	Stable public error code, not raw upstream error text
`created_at_unix_s`	Event timestamp

Usage events must not contain raw API keys, bearer tokens, cookies, signed media URLs, PoTokens, BotGuard internals, or raw audio bytes.
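One way to picture a usage event is as a fixed-schema record with no free-form fields where a secret could land. The sketch below models a subset of the dimensions listed above as a Python dataclass; the `UsageEvent` type and `to_record` helper are illustrative assumptions, not the service's actual types, and the remaining fields would follow the same pattern.

```python
from dataclasses import asdict, dataclass
from typing import Optional

# Outcome values taken from the dimensions table.
OUTCOMES = {
    "ok", "client_error", "source_error",
    "transcription_error", "rate_limited", "internal_error",
}


@dataclass
class UsageEvent:
    """Hypothetical subset of the usage-event dimensions."""
    event_id: str
    request_id: str
    account_id: str
    api_key_id: str          # key identifier only, never the raw key
    endpoint: str
    outcome: str
    status_code: int
    duration_ms: int
    created_at_unix_s: int
    source_kind: Optional[str] = None
    session_id: Optional[str] = None   # present for stream events
    error_code: Optional[str] = None   # stable public code, not upstream text

    def __post_init__(self) -> None:
        if self.outcome not in OUTCOMES:
            raise ValueError(f"unknown outcome: {self.outcome}")


def to_record(event: UsageEvent) -> dict:
    """Serialize an event, dropping optional fields that are unset."""
    return {k: v for k, v in asdict(event).items() if v is not None}
```

Because the schema is closed, a caller cannot attach a bearer token or signed URL to an event without changing the type itself.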

Quota Window

The preview checks static per-account quotas before it handles each metered request. Counters are scoped to a rolling window and can be rebuilt from TUBEBRAIN_USAGE_EVENT_LOG when that file-backed sink is configured. Defaults are intentionally conservative and can be overridden with environment variables:

Variable	Default
`TUBEBRAIN_API_KEY`	`tb_sk_test_local` in local dev
`TUBEBRAIN_API_KEYS_JSON`	unset
`TUBEBRAIN_USAGE_ACCOUNT_ID`	`pilot`
`TUBEBRAIN_API_KEY_SCOPES`	all current scopes
`TUBEBRAIN_API_KEY_STATUS`	`active`
`TUBEBRAIN_USAGE_EVENT_LOG`	unset
`TUBEBRAIN_USAGE_WINDOW_SECS`	`86400`
`TUBEBRAIN_QUOTA_TRANSCRIPT_SECTION`	`300`
`TUBEBRAIN_QUOTA_STREAM_START`	`60`
`TUBEBRAIN_QUOTA_STREAM_POLL`	`1800`
`TUBEBRAIN_QUOTA_STREAM_EVENTS`	`300`
`TUBEBRAIN_QUOTA_STREAM_STOP`	`300`
`TUBEBRAIN_QUOTA_STREAM_LIST`	`300`
`TUBEBRAIN_USAGE_EVENT_CAPACITY`	`512`
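
For readers who want to see how the numeric defaults in the table above could be resolved, here is a minimal Python sketch. The `load_quota_config` function is hypothetical, and rejecting non-positive values is an assumption of this sketch rather than documented service behavior.

```python
import os
from typing import Mapping, Optional

# Numeric defaults copied from the table; the mapping name is illustrative.
QUOTA_DEFAULTS = {
    "TUBEBRAIN_USAGE_WINDOW_SECS": 86400,
    "TUBEBRAIN_QUOTA_TRANSCRIPT_SECTION": 300,
    "TUBEBRAIN_QUOTA_STREAM_START": 60,
    "TUBEBRAIN_QUOTA_STREAM_POLL": 1800,
    "TUBEBRAIN_QUOTA_STREAM_EVENTS": 300,
    "TUBEBRAIN_QUOTA_STREAM_STOP": 300,
    "TUBEBRAIN_QUOTA_STREAM_LIST": 300,
    "TUBEBRAIN_USAGE_EVENT_CAPACITY": 512,
}


def load_quota_config(env: Optional[Mapping[str, str]] = None) -> dict:
    """Resolve each variable from the environment, falling back to defaults."""
    env = os.environ if env is None else env
    config = {}
    for name, default in QUOTA_DEFAULTS.items():
        raw = env.get(name)
        value = default if raw in (None, "") else int(raw)
        if value <= 0:
            # Assumption of this sketch: a zero or negative quota is a mistake.
            raise ValueError(f"{name} must be positive, got {value}")
        config[name] = value
    return config
```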

Quota responses use HTTP 429 with the stable error code rate_limited. Metered responses include x-ratelimit-limit, x-ratelimit-remaining, and x-ratelimit-reset headers. The reset value is the number of seconds until the oldest counted event or in-flight reservation leaves the rolling window.
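
The reset semantics above can be sketched with a simple rolling-window counter. This Python example is a simplified model under stated assumptions: the `RollingQuota` class is hypothetical, it tracks only counted events (not in-flight reservations), and it rounds the reset value up to whole seconds.

```python
import math
from collections import deque


class RollingQuota:
    """Hypothetical per-account, per-endpoint rolling-window counter."""

    def __init__(self, limit: int, window_secs: int):
        self.limit = limit
        self.window_secs = window_secs
        self._stamps = deque()  # timestamps of counted events, oldest first

    def _evict(self, now: float) -> None:
        # Drop events that have aged out of the window.
        while self._stamps and self._stamps[0] <= now - self.window_secs:
            self._stamps.popleft()

    def check(self, now: float):
        """Return (allowed, headers); count the request only when allowed."""
        self._evict(now)
        allowed = len(self._stamps) < self.limit
        if allowed:
            self._stamps.append(now)
        reset = 0
        if self._stamps:
            # Seconds until the oldest counted event leaves the window.
            reset = max(0, math.ceil(self._stamps[0] + self.window_secs - now))
        headers = {
            "x-ratelimit-limit": str(self.limit),
            "x-ratelimit-remaining": str(self.limit - len(self._stamps)),
            "x-ratelimit-reset": str(reset),
        }
        return allowed, headers
```

In this model a caller would answer with HTTP 429 and the `rate_limited` error code whenever `allowed` is false, and attach the same three headers to successful responses.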

What We Keep

Default retention for the hosted preview and paid-pilot design:

Data	Default
API request metadata	30 days
Usage events	Billing and audit retention
Transcript cache	Short TTL, initially 1 hour
Stream session metadata	Session lifetime plus operational cleanup window
Live transcript buffers	Session lifetime plus operational cleanup window
Raw audio	Not persisted by default
PoToken material	Not exposed and not stored beyond operational need
Cookies and signed media URLs	Not stored as customer-visible records

The service may hold short-lived live buffers in memory while a stream session is active. Those buffers exist to support polling, SSE delivery, and transcription; they are not a product archive.

PoToken Boundary

TubeBrain can use proof-of-origin token support internally when a source requires it and the deployed build enables that feature. Managed token minting is not a public API endpoint in the hosted MVP.

Pilot users should see:

ordinary transcript or stream responses
stable error codes when a source cannot be resolved
high-level diagnostics such as degraded STT or source unavailable

Pilot users should not see:

PoToken values
BotGuard worker details
signed Googlevideo URLs
cookies or authorization headers
raw audio payloads

STT Boundary

The hosted preview uses self-hosted STT first. The intended cluster shape is:

tubebrain-hosted resolves the source, fetches live/radio audio, and emits the public StreamSession and StreamChunk contracts.
tubebrain-hosted sends short base64-encoded audio chunks to the internal tubebrain-stt service when TUBEBRAIN_STT_BACKEND=remote.
tubebrain-stt runs the Whisper-backed chunk transcriber and returns structured transcript segments.

The internal STT service is bearer-authenticated and cluster-local. It is not a customer-facing endpoint, does not expose managed PoToken behavior, and does not persist raw audio by default.
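
To make the chunk hand-off concrete, the sketch below shows one plausible way a short audio chunk could be base64-encoded into a bearer-authenticated JSON request. Everything about the wire format here is an assumption for illustration: the `build_stt_request` helper and the `audio_b64`, `start_ms`, and `sample_rate_hz` field names are hypothetical and do not describe the real tubebrain-stt contract.

```python
import base64
import json
from typing import Dict, Tuple


def build_stt_request(
    audio: bytes, start_ms: int, sample_rate_hz: int, token: str
) -> Tuple[Dict[str, str], str]:
    """Build headers and a JSON body for one short audio chunk (hypothetical shape)."""
    body = json.dumps(
        {
            # Raw bytes are base64-encoded so they can travel inside JSON.
            "audio_b64": base64.b64encode(audio).decode("ascii"),
            "start_ms": start_ms,
            "sample_rate_hz": sample_rate_hz,
        }
    )
    headers = {
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",
    }
    return headers, body
```

Note that the body carries only the audio and timing hints; the bearer token stays in the header and is never part of the payload that gets logged or forwarded.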

Managed third-party STT is an optional hosted fallback boundary. It is disabled by default and configured only for buyer-approved pilots. It supports two modes: opt-in fallback after the primary STT backend fails, and forced managed mode for a specific pilot proof. The boundary enforces an explicit fallback-hour cap before any provider call. It emits usage events covering attempts, provider, outcome, processed duration, retry count, and estimated cost, and it sends only short audio chunks plus timing/format hints to the provider. It must not send API keys, cookies, signed media URLs, PoTokens, BotGuard details, session IDs, or account metadata to a managed provider. The cost and policy model is tracked in docs/spikes/2026-05-16-managed-stt-fallback-cost-model.md (TIN-1212).
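
The two guards in this paragraph, a cap checked before the provider call and a restricted outbound payload, can be sketched as follows. The `FallbackBudget` class, the `provider_payload` helper, and the allowlisted field names are hypothetical; the actual cap accounting and payload shape are not specified on this page.

```python
class FallbackBudget:
    """Hypothetical fallback-hour cap, checked before each provider call."""

    def __init__(self, cap_hours: float):
        self.cap_ms = round(cap_hours * 3_600_000)
        self.used_ms = 0

    def try_reserve(self, chunk_ms: int) -> bool:
        # Refuse the call up front if this chunk would exceed the cap.
        if self.used_ms + chunk_ms > self.cap_ms:
            return False
        self.used_ms += chunk_ms
        return True


# Assumed allowlist: audio plus timing/format hints only.
ALLOWED_PROVIDER_FIELDS = {
    "audio_b64", "start_ms", "duration_ms", "sample_rate_hz", "format",
}


def provider_payload(chunk: dict) -> dict:
    """Keep only allowlisted fields so identifiers never leave the boundary."""
    return {k: v for k, v in chunk.items() if k in ALLOWED_PROVIDER_FIELDS}
```

An allowlist is used here instead of a blocklist so that a newly added internal field, such as a session or account identifier, is excluded by default.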

Redaction Boundary

Hosted API responses and diagnostics must be useful enough to debug a failed source, but not detailed enough to leak credentials or replayable media URLs.

Public responses may include:

request_id
high-level source kind
stable error code
stream health
session cursor and buffer metadata
sanitized diagnostic messages

Public responses must not include:

raw bearer token values
cookie values
signed media URL path values or query values
PoToken values
BotGuard implementation details
raw audio bytes

The codebase includes hosted error-redaction tests for known sensitive Googlevideo, cookie, bearer-token, PoToken, and BotGuard-shaped strings.
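
As an illustration of what such redaction might look like, the following Python sketch scrubs three of the string shapes named above from a diagnostic message. The patterns and the `redact` function are assumptions written for this example; they are not the project's actual redaction rules and do not cover PoToken or BotGuard shapes.

```python
import re

# Illustrative patterns only; real rules would be broader and tested.
_PATTERNS = [
    (re.compile(r"bearer\s+[A-Za-z0-9._~+/=-]+", re.IGNORECASE),
     "Bearer [REDACTED]"),
    (re.compile(r"(cookie:\s*)[^\r\n]+", re.IGNORECASE),
     r"\1[REDACTED]"),
    (re.compile(r"https?://[^\s\"']*googlevideo\.com[^\s\"']*"),
     "[REDACTED_MEDIA_URL]"),
]


def redact(message: str) -> str:
    """Replace sensitive-looking substrings before a message leaves the service."""
    for pattern, replacement in _PATTERNS:
        message = pattern.sub(replacement, message)
    return message
```

The media URL is replaced whole, path and query included, because a signed URL is replayable even when only part of it leaks.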

Current Limits

The hosted preview now supports account-scoped API key records with scopes and revocation status, plus optional file-backed usage-event durability for one hosted process. It also scopes hosted stream sessions to the account that created them and returns 404 not_found for wrong-account or post-restart session access. It still uses in-memory stream sessions and sticky single-worker routing, and it has no cross-replica quota enforcement. Before broad paid traffic, TubeBrain needs a database-backed usage store, Redis-backed session buffers/cursors, self-serve key management, and a retention deletion path.
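
The session-scoping behavior can be pictured with a small in-memory store. This is a sketch under assumptions: the `SessionStore` class and the response body shape are hypothetical. It shows why a wrong-account lookup and a post-restart lookup are indistinguishable to the caller.

```python
class SessionStore:
    """Hypothetical in-memory stream-session store scoped by account."""

    def __init__(self):
        # session_id -> (owning account_id, session data)
        self._sessions = {}

    def create(self, session_id: str, account_id: str, data: dict) -> None:
        self._sessions[session_id] = (account_id, data)

    def get(self, session_id: str, account_id: str):
        """Return (status_code, body) for a session lookup."""
        entry = self._sessions.get(session_id)
        # Missing and wrong-account cases share one response, so a caller
        # cannot learn whether another account's session exists.
        if entry is None or entry[0] != account_id:
            return 404, {"error_code": "not_found"}
        return 200, entry[1]
```

Because the store lives in process memory, constructing a fresh `SessionStore` after a restart makes every earlier session ID resolve to the same 404 response.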

The intended paid-pilot posture is:

narrow access for design partners
explicit source and STT limits
no public managed PoToken minting surface
no raw-audio persistence by default
short-lived stream buffers
metering events that support billing without storing sensitive source secrets

The operator go/no-go checklist for applying this policy before a pilot call is the Paid Pilot Operator Runbook.