podscript

command module

v0.0.0-...-4108302 | Published: Mar 5, 2025 | License: MIT | Imports: 36 | Imported by: 0

README

podscript

podscript is a tool to generate transcripts for podcasts (and other similar audio files), using LLMs and Speech-to-Text (STT) APIs.

Install

> go install github.com/deepakjois/podscript@latest

> ~/go/bin/podscript --help

Web UI

podscript has a web-based UI for convenience:

> podscript web
Starting server on port 8080

This runs a web server at http://localhost:8080.

For more advanced usage, see the CLI section below.

CLI Getting started

# Configure keys for supported services (OpenAI, Anthropic, Deepgram etc)
# and write them to $HOME/.podscript.toml
podscript configure

# Transcribe a YouTube Video by formatting and cleaning up autogenerated captions
podscript ytt https://www.youtube.com/watch?v=aO1-6X_f74M

# Transcribe audio from a URL using deepgram speech-to-text API
#
# Deepgram and AssemblyAI subcommands support `--from-url` for
# passing audio URLs, and `--from-file` to pass audio files.
podscript deepgram --from-url https://audio.listennotes.com/e/p/d6cc86364eb540c1a30a1cac2b77b82c/

# Transcribe audio from a file using Groq's whisper model
# Groq only supports audio files.
podscript groq --file huberman.mp3

More Info

Models for ytt subcommand

The ytt subcommand uses the gpt-4o model by default. Use the --model flag to set a different model. The following models are supported:

OpenAI
- gpt-4o
- gpt-4o-mini
Google Gemini
- gemini-2.0-flash
Llama (via Groq)
- llama-3.3-70b-versatile
- llama-3.1-8b-instant
Anthropic
- claude-3-5-sonnet-20241022
- claude-3-5-haiku-20241022
Anthropic via Amazon Bedrock
- anthropic.claude-3-5-sonnet-20241022-v2:0 (via AWS)
- anthropic.claude-3-5-haiku-20241022-v1:0 (via AWS)
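The list above can be read as a lookup from model name to provider, with gpt-4o as the fallback. The following is a minimal sketch of that idea, not podscript's actual implementation: the `providerByModel` map and the `resolveModel` function are hypothetical names introduced for illustration, and only the model names and the default come from this README.

```go
package main

import "fmt"

// defaultModel is the default the README names for the ytt subcommand.
const defaultModel = "gpt-4o"

// providerByModel mirrors the list in the README. The map itself is a
// hypothetical illustration, not podscript's internal data structure.
var providerByModel = map[string]string{
	"gpt-4o":                  "OpenAI",
	"gpt-4o-mini":             "OpenAI",
	"gemini-2.0-flash":        "Google Gemini",
	"llama-3.3-70b-versatile": "Llama (via Groq)",
	"llama-3.1-8b-instant":    "Llama (via Groq)",
	"claude-3-5-sonnet-20241022": "Anthropic",
	"claude-3-5-haiku-20241022":  "Anthropic",
	"anthropic.claude-3-5-sonnet-20241022-v2:0": "Anthropic via Amazon Bedrock",
	"anthropic.claude-3-5-haiku-20241022-v1:0":  "Anthropic via Amazon Bedrock",
}

// resolveModel returns the model to use and its provider. An empty name
// falls back to the default; an unknown name is reported as an error.
func resolveModel(name string) (model, provider string, err error) {
	if name == "" {
		name = defaultModel
	}
	p, ok := providerByModel[name]
	if !ok {
		return "", "", fmt.Errorf("unsupported model %q", name)
	}
	return name, p, nil
}

func main() {
	for _, m := range []string{"", "gemini-2.0-flash", "not-a-model"} {
		model, provider, err := resolveModel(m)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%s (%s)\n", model, provider)
	}
}
```

Validating the name up front means a typo in --model would fail before any API call is made; whether podscript validates this way is not stated here.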

Transcript from audio URLs and files

Tip: You can find the audio download link for a podcast on ListenNotes under the More menu.

podscript supports the following Speech-To-Text (STT) APIs:

- Deepgram (which as of Jan 2025 provides a $200 free signup credit)
- AssemblyAI (which as of Oct 2024 is free to use within your credit limits and provides $50 in free credits on signup)
- Groq (which as of Jul 2024 is in beta and free to use within your rate limits)
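Since Deepgram and AssemblyAI accept both audio URLs and local files while Groq accepts only files, it can help to check the input before choosing a service. The sketch below illustrates that check under stated assumptions: `isURL`, `acceptsURL` and `checkInput` are hypothetical helpers, and the service keys `deepgram`, `assemblyai` and `groq` are illustrative labels rather than confirmed subcommand names.

```go
package main

import (
	"fmt"
	"net/url"
)

// isURL reports whether s looks like an absolute http(s) URL.
func isURL(s string) bool {
	u, err := url.Parse(s)
	return err == nil && (u.Scheme == "http" || u.Scheme == "https") && u.Host != ""
}

// acceptsURL records which services take audio URLs according to the
// README. The keys are illustrative labels, not confirmed subcommand names.
var acceptsURL = map[string]bool{
	"deepgram":   true,
	"assemblyai": true,
	"groq":       false, // the README says Groq only supports audio files
}

// checkInput returns an error when the input kind does not match what the
// service accepts, or when the service is not one of the three listed.
func checkInput(service, input string) error {
	urlOK, known := acceptsURL[service]
	if !known {
		return fmt.Errorf("unknown service %q", service)
	}
	if isURL(input) && !urlOK {
		return fmt.Errorf("%s only accepts local audio files; download %s first", service, input)
	}
	return nil
}

func main() {
	fmt.Println(checkInput("deepgram", "https://example.com/episode.mp3"))
	fmt.Println(checkInput("groq", "huberman.mp3"))
	fmt.Println(checkInput("groq", "https://example.com/episode.mp3"))
}
```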

Development

Want to contribute? Here's how to build and run the project locally:

Prerequisites

Install npm: https://docs.npmjs.com/downloading-and-installing-node-js-and-npm?ref=meilisearch-blog
Install caddy: https://caddyserver.com/docs/install

Build and run the frontend:

cd web/frontend
npm run dev

Build the backend server and run it in dev mode:

go build -o podscript
./podscript web --dev

This starts the backend server and exposes only the API endpoints, without bundling the frontend assets.

To connect the two:

cd web
caddy run

This should set everything up so that you can visit http://localhost:8080 and have the frontend connected to the backend via the Caddy reverse proxy.

Feedback

Feel free to drop me a note on X or email me.

License

MIT

Documentation

Overview

This file provides a Kong resolver that loads configuration from a TOML file. It is a lightly modified version of the kongtoml package.

It checks whether the ytt subcommand is in use and, if so, constructs the configuration key from the parent path. This makes the configuration file more readable and ergonomic.
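The key construction described above might look roughly like the following. This is a sketch under assumptions, not the resolver's actual code: `configKey` is a hypothetical function, the dotted key separator is a guess, and the flag names `openai-api-key` and `api-key` are placeholders. The only behavior taken from the description is that flags under ytt are looked up using the parent path.

```go
package main

import (
	"fmt"
	"strings"
)

// configKey builds the TOML key for a flag. commandPath is the chain of
// subcommands leading to the flag.
// Assumption: when the last path element is "ytt", that element is dropped,
// so the flag is looked up under the parent path instead of under "ytt".
func configKey(commandPath []string, flag string) string {
	path := commandPath
	if n := len(path); n > 0 && path[n-1] == "ytt" {
		path = path[:n-1]
	}
	// Copy before appending so the caller's slice is never modified.
	parts := append(append([]string{}, path...), flag)
	return strings.Join(parts, ".")
}

func main() {
	fmt.Println(configKey([]string{"ytt"}, "openai-api-key")) // openai-api-key
	fmt.Println(configKey([]string{"deepgram"}, "api-key"))   // deepgram.api-key
}
```

Under these assumptions, keys used by ytt sit at the top level of the file rather than inside a ytt table, which is one way the file could end up more readable.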
