Hosted Pilot Policy

This page is the pilot-facing policy for the protected TubeBrain hosted preview. It explains what TubeBrain meters, what it keeps, and what it deliberately does not expose.

The current hosted preview is a controlled design-partner service. It keeps tubebrain.ai as the static marketing surface and exposes the backend only through the Tinyland tailnet unless a separate public API launch is approved.

What We Meter

TubeBrain now has a protected-preview usage sink for the first paid-pilot shape. In the current, deliberately small implementation, tubebrain-hosted:

  • keeps account-scoped rolling-window counters and a bounded recent-event ring
  • emits structured tubebrain_usage logs with account/key identity
  • optionally persists usage events to a JSONL file
  • exposes an authenticated operator snapshot at GET /v1/admin/usage

Metering is operational and billing-oriented, not surveillance-oriented. The usage sink is also not durable account storage yet: file-backed usage history can survive a restart for one hosted process, but it does not aggregate across replicas or replace a billing database.
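As a rough illustration of the bounded ring plus optional JSONL persistence described above, here is a minimal Python sketch. The class name UsageSink and its methods are hypothetical and are not taken from the TubeBrain codebase; the default capacity of 512 mirrors the TUBEBRAIN_USAGE_EVENT_CAPACITY default listed later on this page.

```python
import json
from collections import deque
from pathlib import Path
from typing import Optional


class UsageSink:
    """Hypothetical in-process sink: bounded ring plus optional JSONL file."""

    def __init__(self, capacity: int = 512, log_path: Optional[Path] = None) -> None:
        # deque(maxlen=...) drops the oldest event once the ring is full.
        self.recent: deque = deque(maxlen=capacity)
        self.log_path = log_path

    def record(self, event: dict) -> None:
        self.recent.append(event)
        if self.log_path is not None:
            # One JSON object per line, appended so history survives a restart.
            with self.log_path.open("a", encoding="utf-8") as fh:
                fh.write(json.dumps(event, sort_keys=True) + "\n")

    def replay(self) -> list:
        """Read persisted events back, for example to rebuild counters."""
        if self.log_path is None or not self.log_path.exists():
            return []
        with self.log_path.open(encoding="utf-8") as fh:
            return [json.loads(line) for line in fh if line.strip()]
```

The ring only bounds memory for the operator snapshot; the file, when configured, holds the full history for that one process, which is why it cannot stand in for cross-replica aggregation.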

The first quotable paid-pilot package is described in Paid Pilot Package. It maps pilot price bands and manual invoice evidence to the usage-event dimensions below.

Minimum usage events:

  • transcript section request
  • stream session start
  • stream poll
  • stream SSE connection
  • stream stop
  • source failure
  • transcription failure
  • rate-limit decision

Follow-on billing work will add upstream media fetch attempts, deeper audio decode/STT attempt detail, and durable per-account aggregation once the pilot path needs multi-process accounting.

Minimum dimensions:

Field                     Purpose
event_id                  Stable unique usage-event identifier
request_id                Correlates the API response with operator diagnostics
account_id                Billing and support grouping
api_key_id                Key-level quota and abuse control, never the raw key
endpoint                  API route or MCP-equivalent operation
source_kind               youtube_vod, youtube_live, http_audio, or future adapter
session_id                Present for stream events
outcome                   ok, client_error, source_error, transcription_error, rate_limited, or internal_error
status_code               HTTP status for hosted calls
duration_ms               Server-side wall-clock duration
stream_active_ms          Active stream-session time charged or analyzed
audio_decoded_ms          Audio duration successfully decoded
stt_processed_ms          Audio duration submitted to STT
stt_backend               Primary STT backend when available
stt_fallback_mode         Managed fallback mode when available
stt_provider              Managed provider name when available
estimated_cost_micro_usd  Optional cost estimate in micro-USD
egress_bytes              Response and SSE egress estimate
retry_count               Retries caused by source, network, or resolver behavior
error_code                Stable public error code, not raw upstream error text
created_at_unix_s         Event timestamp

Usage events must not contain raw API keys, bearer tokens, cookies, signed media URLs, PoTokens, BotGuard internals, or raw audio bytes.
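To make the event shape concrete, the sketch below models a subset of the dimensions as a Python dataclass and rejects values that look like secrets. It is an assumption-laden illustration, not the hosted implementation: the UsageEvent class, the FORBIDDEN_MARKERS deny-list, and the sample endpoint name are all hypothetical, and a substring check like this is far weaker than real redaction.

```python
from dataclasses import asdict, dataclass
from typing import Optional

# Outcome values as listed in the dimensions above.
OUTCOMES = {
    "ok", "client_error", "source_error",
    "transcription_error", "rate_limited", "internal_error",
}

# Hypothetical deny-list of secret-shaped substrings (illustration only).
FORBIDDEN_MARKERS = ("bearer ", "cookie:", "googlevideo.com", "tb_sk_")


@dataclass(frozen=True)
class UsageEvent:
    event_id: str
    request_id: str
    account_id: str
    api_key_id: str  # key identifier, never the raw key
    endpoint: str
    source_kind: str
    outcome: str
    status_code: int
    duration_ms: int
    created_at_unix_s: int
    session_id: Optional[str] = None  # present for stream events
    error_code: Optional[str] = None  # stable public code only

    def __post_init__(self) -> None:
        if self.outcome not in OUTCOMES:
            raise ValueError(f"unknown outcome: {self.outcome!r}")
        for name, value in asdict(self).items():
            if isinstance(value, str) and any(
                marker in value.lower() for marker in FORBIDDEN_MARKERS
            ):
                raise ValueError(f"field {name} contains secret-shaped text")
```

Validating at construction time means a secret-shaped value fails before it can reach a log line or the JSONL file.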

Quota Window

The preview enforces static per-account quotas before it serves each metered request. Counters are scoped to a rolling window and can be rebuilt from TUBEBRAIN_USAGE_EVENT_LOG when that file-backed sink is configured. Defaults are intentionally conservative and can be overridden with environment variables:

Variable                            Default
TUBEBRAIN_API_KEY                   tb_sk_test_local in local dev
TUBEBRAIN_API_KEYS_JSON             unset
TUBEBRAIN_USAGE_ACCOUNT_ID          pilot
TUBEBRAIN_API_KEY_SCOPES            all current scopes
TUBEBRAIN_API_KEY_STATUS            active
TUBEBRAIN_USAGE_EVENT_LOG           unset
TUBEBRAIN_USAGE_WINDOW_SECS         86400
TUBEBRAIN_QUOTA_TRANSCRIPT_SECTION  300
TUBEBRAIN_QUOTA_STREAM_START        60
TUBEBRAIN_QUOTA_STREAM_POLL         1800
TUBEBRAIN_QUOTA_STREAM_EVENTS       300
TUBEBRAIN_QUOTA_STREAM_STOP         300
TUBEBRAIN_QUOTA_STREAM_LIST         300
TUBEBRAIN_USAGE_EVENT_CAPACITY      512
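
The numeric defaults above could be resolved along the following lines. This is only a sketch: the load_quota_config helper is hypothetical, and treating empty values as unset and rejecting negative values are assumptions, not documented behavior of tubebrain-hosted.

```python
import os
from typing import Mapping

# Numeric defaults copied from the variable list above.
QUOTA_DEFAULTS = {
    "TUBEBRAIN_USAGE_WINDOW_SECS": 86400,
    "TUBEBRAIN_QUOTA_TRANSCRIPT_SECTION": 300,
    "TUBEBRAIN_QUOTA_STREAM_START": 60,
    "TUBEBRAIN_QUOTA_STREAM_POLL": 1800,
    "TUBEBRAIN_QUOTA_STREAM_EVENTS": 300,
    "TUBEBRAIN_QUOTA_STREAM_STOP": 300,
    "TUBEBRAIN_QUOTA_STREAM_LIST": 300,
    "TUBEBRAIN_USAGE_EVENT_CAPACITY": 512,
}


def load_quota_config(env: Mapping[str, str] = os.environ) -> dict:
    """Resolve each variable from env, falling back to the default."""
    config = {}
    for name, default in QUOTA_DEFAULTS.items():
        raw = env.get(name)
        # Assumption: an empty value is treated the same as unset.
        value = default if raw is None or raw.strip() == "" else int(raw)
        if value < 0:
            # Assumption: negative quotas are a configuration error.
            raise ValueError(f"{name} must not be negative")
        config[name] = value
    return config
```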

Quota responses use HTTP 429 with the stable error code rate_limited. Metered responses include x-ratelimit-limit, x-ratelimit-remaining, and x-ratelimit-reset headers. The reset value is the number of seconds until the oldest counted event or in-flight reservation leaves the rolling window.
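
The rolling-window behavior and the reset header can be sketched as follows. This is a simplified single-process model under stated assumptions: the RollingQuota class is hypothetical, it tracks only counted events (in-flight reservations are not modeled), and it rounds the reset value up to whole seconds.

```python
import math
from collections import deque
from typing import Dict, Tuple


class RollingQuota:
    """Hypothetical per-account, per-endpoint rolling-window counter."""

    def __init__(self, limit: int, window_secs: int) -> None:
        self.limit = limit
        self.window_secs = window_secs
        self._stamps: deque = deque()  # event times, oldest first

    def check(self, now: float) -> Tuple[int, Dict[str, str]]:
        # Drop events that have already left the window.
        while self._stamps and self._stamps[0] <= now - self.window_secs:
            self._stamps.popleft()
        allowed = len(self._stamps) < self.limit
        if allowed:
            self._stamps.append(now)
        remaining = self.limit - len(self._stamps)
        # Seconds until the oldest counted event leaves the window.
        reset = 0
        if self._stamps:
            reset = math.ceil(self._stamps[0] + self.window_secs - now)
        headers = {
            "x-ratelimit-limit": str(self.limit),
            "x-ratelimit-remaining": str(remaining),
            "x-ratelimit-reset": str(reset),
        }
        return (200 if allowed else 429), headers
```

With a limit of 2 and a 60-second window, calls at t=0 and t=10 succeed, a call at t=20 is rejected with 429 and a reset of 40, and a call at t=60 succeeds again because the t=0 event has left the window.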

What We Keep

Default retention for the hosted preview and paid-pilot design:

Data                           Default
API request metadata           30 days
Usage events                   Billing and audit retention
Transcript cache               Short TTL, initially 1 hour
Stream session metadata        Session lifetime plus operational cleanup window
Live transcript buffers        Session lifetime plus operational cleanup window
Raw audio                      Not persisted by default
PoToken material               Not exposed and not stored beyond operational need
Cookies and signed media URLs  Not stored as customer-visible records

The service may hold short-lived live buffers in memory while a stream session is active. Those buffers exist to support polling, SSE delivery, and transcription; they are not a product archive.
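
For the transcript cache, a short TTL can be modeled as below. This is a minimal sketch assuming a simple in-memory map with lazy expiry; the TranscriptCache class and its injectable clock are hypothetical, and only the one-hour default comes from the retention list above.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class TranscriptCache:
    """Hypothetical in-memory cache whose entries expire after ttl_secs."""

    def __init__(
        self,
        ttl_secs: float = 3600.0,  # initial 1 hour TTL from the retention list
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._ttl = ttl_secs
        self._clock = clock
        self._entries: Dict[str, Tuple[float, str]] = {}

    def put(self, key: str, transcript: str) -> None:
        self._entries[key] = (self._clock() + self._ttl, transcript)

    def get(self, key: str) -> Optional[str]:
        entry = self._entries.get(key)
        if entry is None:
            return None
        expires_at, transcript = entry
        if self._clock() >= expires_at:
            # Expired entries are removed on read rather than by a sweeper.
            del self._entries[key]
            return None
        return transcript
```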

PoToken Boundary

TubeBrain can use proof-of-origin token support internally when a source requires it and the deployed build enables that feature. Managed token minting is not a public API endpoint in the hosted MVP.

Pilot users should see:

  • ordinary transcript or stream responses
  • stable error codes when a source cannot be resolved
  • high-level diagnostics such as degraded STT or source unavailable

Pilot users should not see:

  • PoToken values
  • BotGuard worker details
  • signed Googlevideo URLs
  • cookies or authorization headers
  • raw audio payloads

STT Boundary

The hosted preview uses self-hosted STT first. The intended cluster shape is:

  1. tubebrain-hosted resolves the source, fetches live/radio audio, and emits the public StreamSession and StreamChunk contracts.
  2. tubebrain-hosted sends short base64-encoded audio chunks to the internal tubebrain-stt service when TUBEBRAIN_STT_BACKEND=remote.
  3. tubebrain-stt runs the Whisper-backed chunk transcriber and returns structured transcript segments.

The internal STT service is bearer-authenticated and cluster-local. It is not a customer-facing endpoint, does not expose managed PoToken behavior, and does not persist raw audio by default.
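
Step 2 of the cluster shape, sending a base64-encoded chunk with bearer authentication, might look like the sketch below. The request path /v1/transcribe, the JSON field names, and the build_stt_request helper are assumptions for illustration; the actual tubebrain-stt wire contract is not specified on this page. The function only builds the request and does not send it.

```python
import base64
import json
from urllib.request import Request


def build_stt_request(
    base_url: str,
    token: str,
    audio: bytes,
    start_ms: int,
    sample_rate_hz: int,
) -> Request:
    """Build (but do not send) a chunk request for the internal STT service."""
    body = json.dumps(
        {
            # Hypothetical field names; only audio plus timing/format hints.
            "audio_b64": base64.b64encode(audio).decode("ascii"),
            "start_ms": start_ms,
            "sample_rate_hz": sample_rate_hz,
        }
    ).encode("utf-8")
    return Request(
        url=f"{base_url.rstrip('/')}/v1/transcribe",  # hypothetical path
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )
```

Keeping the payload to audio plus timing and format hints means no account, session, or source URL detail has to cross the service boundary.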

Managed third-party STT is an optional hosted fallback boundary, disabled by default and configured only for buyer-approved pilots. It supports opt-in fallback after the primary STT backend fails and forced managed mode for a specific pilot proof. The boundary enforces an explicit fallback-hour cap before provider calls, emits usage events for attempts, provider, outcome, processed duration, retry count, and estimated cost, and sends only short audio chunks plus timing/format hints to the provider. It must not send API keys, cookies, signed media URLs, PoTokens, BotGuard details, session IDs, or account metadata to a managed provider. The cost and policy model is tracked in docs/spikes/2026-05-16-managed-stt-fallback-cost-model.md (TIN-1212).
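
The rule that the fallback-hour cap is checked before any provider call can be sketched like this. The mode names "off", "fallback", and "forced", the FallbackBudget class, and the may_call_provider function are hypothetical stand-ins; the real configuration values and accounting granularity may differ.

```python
from dataclasses import dataclass

MS_PER_HOUR = 3_600_000


@dataclass
class FallbackBudget:
    """Hypothetical tracker for managed-STT audio time against an hour cap."""

    cap_hours: float
    used_ms: int = 0

    def try_reserve(self, chunk_ms: int) -> bool:
        if chunk_ms <= 0:
            raise ValueError("chunk_ms must be positive")
        if self.used_ms + chunk_ms > self.cap_hours * MS_PER_HOUR:
            return False  # cap would be exceeded, so no provider call
        self.used_ms += chunk_ms
        return True


def may_call_provider(
    mode: str, primary_failed: bool, budget: FallbackBudget, chunk_ms: int
) -> bool:
    """Decide whether a chunk may go to the managed provider."""
    if mode == "forced":
        wanted = True
    elif mode == "fallback":
        wanted = primary_failed  # opt-in fallback only after a primary failure
    else:
        wanted = False  # disabled by default
    # Budget is reserved only when a provider call is actually wanted.
    return wanted and budget.try_reserve(chunk_ms)
```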

Redaction Boundary

Hosted API responses and diagnostics must be useful enough to debug a failed source, but not detailed enough to leak credentials or replayable media URLs.

Public responses may include:

  • request_id
  • high-level source kind
  • stable error code
  • stream health
  • session cursor and buffer metadata
  • sanitized diagnostic messages

Public responses must not include:

  • raw bearer token values
  • cookie values
  • signed media URL path values or query values
  • PoToken values
  • BotGuard implementation details
  • raw audio bytes

The codebase includes hosted error-redaction tests for known sensitive Googlevideo, cookie, bearer-token, PoToken, and BotGuard-shaped strings.
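
A redaction pass of the kind those tests exercise could be sketched as follows. The patterns and the redact function are illustrative assumptions, not the patterns used in the TubeBrain codebase, and they cover only bearer tokens, cookie headers, and Googlevideo URLs; PoToken and BotGuard shapes are left out because their format is not described here.

```python
import re

# Each entry pairs a pattern for a sensitive shape with its replacement.
_PATTERNS = [
    (re.compile(r"(?i)\bbearer\s+[A-Za-z0-9._~+/=-]+"), "Bearer [REDACTED]"),
    (re.compile(r"(?i)\b(cookie|set-cookie):\s*[^\r\n]+"), r"\1: [REDACTED]"),
    (
        re.compile(r"https?://[^\s\"']*googlevideo\.com[^\s\"']*"),
        "[REDACTED_MEDIA_URL]",
    ),
]


def redact(message: str) -> str:
    """Replace known sensitive shapes before a message leaves the service."""
    for pattern, replacement in _PATTERNS:
        message = pattern.sub(replacement, message)
    return message
```

Replacing the whole signed URL, rather than only its query string, matches the rule that neither path values nor query values may appear in public responses.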

Current Limits

The hosted preview now supports account-scoped API key records with scopes and revocation status, plus optional file-backed usage-event durability for one hosted process. It also scopes hosted stream sessions to the account that created them and returns 404 not_found for wrong-account or post-restart session access. It still uses in-memory stream sessions and sticky single-worker routing, and it does not have cross-replica quota enforcement. Before broad paid traffic, TubeBrain needs a database-backed usage store, Redis-backed session buffers/cursors, self-serve key management, and a retention deletion path.
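
The account-scoped session check can be pictured with the sketch below. The SessionStore class is hypothetical. It returns the same 404 not_found for a wrong-account lookup and for a session that no longer exists (for example after a restart clears in-memory state), which plausibly avoids revealing whether another account's session exists.

```python
from typing import Dict, Tuple


class SessionStore:
    """Hypothetical in-memory map of stream session to owning account."""

    def __init__(self) -> None:
        self._owner_by_session: Dict[str, str] = {}

    def create(self, session_id: str, account_id: str) -> None:
        self._owner_by_session[session_id] = account_id

    def lookup(self, session_id: str, account_id: str) -> Tuple[int, str]:
        owner = self._owner_by_session.get(session_id)
        # Unknown session and wrong account are deliberately indistinguishable.
        if owner is None or owner != account_id:
            return 404, "not_found"
        return 200, "ok"
```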

The intended paid-pilot posture is:

  • narrow access for design partners
  • explicit source and STT limits
  • no public managed PoToken minting surface
  • no raw-audio persistence by default
  • short-lived stream buffers
  • metering events that support billing without storing sensitive source secrets

The operator go/no-go checklist for applying this policy before a pilot call is the Paid Pilot Operator Runbook.