RFC: Hosted Stream Session Routing
Date: 2026-05-18
Status: accepted for protected preview
Owner: Jess Sullivan
Linear: TIN-1390
Summary
The first paid-pilot hosted stream model is a sticky single-worker model with
account-scoped in-memory session ownership. This keeps local MCP
SessionManager semantics intact while making hosted behavior deterministic for
the protected preview.
This is not Redis-backed stream durability. It is the smallest productionization step that prevents cross-account session access and gives operators clear restart/reconnect behavior before paid design-partner traffic.
Selected Model
For the protected preview:
- `tubebrain-hosted` creates stream sessions on one active worker.
- Ingress must route a session's `poll`, `events`, `stop`, and `list` traffic back to the same worker while the session is active.
- The hosted process keeps an in-memory session-owner registry: `session_id -> account_id, api_key_id, created_at_unix_s`.
- `GET /v1/stream` returns only sessions owned by the authenticated account.
- `GET /v1/stream/{session_id}/poll`, `GET /v1/stream/{session_id}/events`, and `POST /v1/stream/{session_id}/stop` return `404 not_found` when the session is missing or belongs to another account.
- Raw audio is not persisted. Segment buffers remain active-session operational state only.
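The ownership rules above can be sketched as a small in-memory registry. This is a minimal illustration, not the hosted implementation: the class name `SessionOwnerRegistry`, its method names, and the `status_for` helper are hypothetical. Only the mapping fields (`account_id`, `api_key_id`, `created_at_unix_s`) and the 404 behavior come from this RFC.

```python
import time
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class SessionOwner:
    account_id: str
    api_key_id: str
    created_at_unix_s: int


class SessionOwnerRegistry:
    """Hypothetical in-memory map of session_id -> owner.

    State lives only in this process, so it is lost on restart.
    """

    def __init__(self) -> None:
        self._owners: Dict[str, SessionOwner] = {}

    def register(self, session_id: str, account_id: str, api_key_id: str) -> None:
        self._owners[session_id] = SessionOwner(
            account_id, api_key_id, int(time.time())
        )

    def owned_by(self, session_id: str, account_id: str) -> bool:
        owner = self._owners.get(session_id)
        return owner is not None and owner.account_id == account_id

    def list_for_account(self, account_id: str) -> List[str]:
        # Only sessions owned by the authenticated account are visible.
        return sorted(
            sid for sid, owner in self._owners.items()
            if owner.account_id == account_id
        )

    def remove(self, session_id: str) -> None:
        # Called when the owner stops a session.
        self._owners.pop(session_id, None)


def status_for(registry: SessionOwnerRegistry, session_id: str, account_id: str) -> int:
    """Return 200 for the owner, 404 for a missing or foreign session alike."""
    return 200 if registry.owned_by(session_id, account_id) else 404
```

Collapsing "missing" and "belongs to another account" into the same 404 keeps a caller from learning whether a session ID exists.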
Failure Behavior
| Scenario | Protected-preview behavior |
|---|---|
| Client reconnects to the same worker | Use the cursor query parameter; semantics match local poll_stream. |
| SSE reconnects to the same worker | Use the cursor query parameter; the response is one chunk event for the current poll result. |
| Client routes to the wrong worker | The worker has no owner registry entry and returns 404 not_found. |
| Worker/pod restarts | The owner registry and stream buffers are lost; poll, events, and stop return 404 not_found. |
| Wrong account uses a known session ID | Return 404 not_found to avoid leaking whether the session exists. |
| Owner stops a session | The stream transcriber stops the session and the owner registry removes the mapping. |
| Owner lists sessions | Only that account's active sessions are returned. |
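From the client's side, the table implies that a `404 not_found` on poll is terminal for that session ID, whatever the cause (wrong worker, restart, foreign session, or a stopped session). The following is one possible client loop under stated assumptions: the `PollResponse` shape, the integer cursor, and the `drain` function are all hypothetical and stand in for whatever the real poll response carries.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class PollResponse:
    status: int                    # 200 or 404
    next_cursor: int = 0           # assumed: cursor to send on the next poll
    segments: List[str] = field(default_factory=list)


def drain(
    poll: Callable[[str, int], PollResponse],
    session_id: str,
    cursor: int = 0,
    max_polls: int = 10,
) -> Tuple[List[str], int, bool]:
    """Poll with a cursor until the session is gone or max_polls is reached.

    Returns (collected_segments, last_cursor, session_lost).
    """
    collected: List[str] = []
    for _ in range(max_polls):
        resp = poll(session_id, cursor)
        if resp.status == 404:
            # The client cannot tell a restart from a wrong worker or a
            # foreign session, so it treats the session as lost.
            return collected, cursor, True
        collected.extend(resp.segments)
        cursor = resp.next_cursor
    return collected, cursor, False
```

Because the transport is passed in as `poll`, the same loop can be exercised against a fake in tests and against a real HTTP call in a client.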
Tradeoffs
This model is intentionally conservative:
- It avoids storing raw audio or transcript buffers in durable infrastructure.
- It preserves the current in-process ingestion and `SessionManager` behavior.
- It is cheap to operate for one protected preview and a small number of paid pilot sessions.
- It makes restart behavior explicit rather than pretending sessions survive a pod restart.
The cost is that it requires sticky routing and does not survive worker restart. That is acceptable for the first paid-pilot proof, but not for broad public multi-tenant traffic.
Future Redis Model
The next durability step is a Redis-backed session store with:
- `session:{account_id}:{session_id}:metadata`
- `session:{account_id}:{session_id}:segments`
- `session:{account_id}:{session_id}:diagnostics`
- worker ownership and lease metadata
- explicit TTL and cleanup jobs
That follow-up should happen only when active pilot traffic proves that restart-surviving stream sessions are worth the operational complexity.
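As an illustration of the key layout listed above, a helper could derive all three keys from one account-scoped prefix. The function name `session_keys` is hypothetical, and the sketch only builds strings; it does not talk to Redis or model leases, TTLs, or cleanup.

```python
from typing import Dict


def session_keys(account_id: str, session_id: str) -> Dict[str, str]:
    """Build the account-scoped key names for one stream session."""
    prefix = f"session:{account_id}:{session_id}"
    return {
        part: f"{prefix}:{part}"
        for part in ("metadata", "segments", "diagnostics")
    }
```

Putting `account_id` ahead of `session_id` in the key means every lookup is scoped by account, which mirrors the ownership check in the in-memory model.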
Validation
The codebase tests this protected-preview model with hosted endpoint tests:
- wrong-account `list` returns no foreign sessions
- wrong-account `poll`, `events`, and `stop` return `404 not_found`
- the creating account can still poll and stop after wrong-account attempts
- a restarted hosted process reports an old session as `404 not_found`