yt-text

yt-text is a Rust MCP stdio server for structured YouTube transcript extraction, metadata lookup, and caption-language discovery.

It is designed for AI agents that need typed transcript data instead of raw page text or one-off scraping output.

It exposes three MCP tools:

  • get_transcript
  • list_languages
  • get_metadata
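
As a rough illustration of how a client might invoke one of these tools, the sketch below builds a JSON-RPC 2.0 `tools/call` message of the kind MCP clients send over stdio, one JSON object per line. The helper name `build_tool_call` and the `video_id` argument key used in its doc comment are assumptions made for illustration; the actual argument schema of each tool is not stated here and should be read from the server's tool listing.

```rust
/// The three tool names listed above.
const TOOLS: [&str; 3] = ["get_transcript", "list_languages", "get_metadata"];

/// Builds one newline-terminated JSON-RPC 2.0 `tools/call` message.
///
/// `arguments_json` must already be a valid JSON object, for example
/// `{"video_id":"..."}` (a hypothetical key, not confirmed by this README).
/// Returns `None` when `tool` is not one of the three listed tool names.
fn build_tool_call(id: u64, tool: &str, arguments_json: &str) -> Option<String> {
    if !TOOLS.contains(&tool) {
        return None;
    }
    // Doubled braces are literal braces inside `format!`.
    Some(format!(
        "{{\"jsonrpc\":\"2.0\",\"id\":{id},\"method\":\"tools/call\",\"params\":{{\"name\":\"{tool}\",\"arguments\":{arguments_json}}}}}\n"
    ))
}
```

A real client would first complete the MCP `initialize` handshake and would normally build the message with a JSON library rather than string formatting; the string form is used here only to keep the sketch dependency-free.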

Why this project exists:

  • structured, typed outputs instead of raw text dumps
  • optional PoToken support for YouTube BotGuard rollout cases
  • optional local Whisper fallback for captionless videos
  • single compiled binary with published release tarballs

Repository links: