Today I used #Emacs Lisp to parse Deepgram's #speech recognition JSON output with utterances, punctuation, and smart format turned on and the #Whisper Large model selected. I turned the words array into a VTT subtitle file with speaker identification (handy for EmacsConf Q&A) and captions limited to roughly 45 characters with punctuation preferred for splitting. It's way faster than waiting for a CPU-only computer to run Whisper Large on the files. Looking forward to experimenting with this for my personal braindumping too.
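The caption-splitting step might look something like this minimal Emacs Lisp sketch (not the actual code, just an illustration of the idea). It assumes each entry of the words array has been parsed into an alist with a `punctuated_word` key, as Deepgram returns when punctuation is enabled and the JSON is read with something like `json-parse-buffer` using `:object-type 'alist`; the function name and the exact splitting heuristic are mine:

```elisp
;; Sketch: group Deepgram word entries into captions of roughly
;; MAX-LENGTH characters, preferring to break after punctuation.
;; Assumes each WORD is an alist containing `punctuated_word'.
(defun my/group-words-into-captions (words &optional max-length)
  "Group WORDS into caption-sized lists of at most MAX-LENGTH characters.
Prefer ending a caption at a word that ends with punctuation."
  (let ((max-length (or max-length 45))
        captions current (len 0))
    (dolist (word words)
      (let* ((text (alist-get 'punctuated_word word))
             ;; Account for the space between words within a caption.
             (new-len (+ len (length text) (if current 1 0))))
        ;; Start a new caption when adding this word would overflow.
        (when (and current (> new-len max-length))
          (push (nreverse current) captions)
          (setq current nil len 0 new-len (length text)))
        (push word current)
        (setq len new-len)
        ;; Prefer splitting right after sentence-ending punctuation.
        (when (string-match-p "[.?!]\\'" text)
          (push (nreverse current) captions)
          (setq current nil len 0))))
    (when current (push (nreverse current) captions))
    (nreverse captions)))
```

Each resulting group can then be turned into one VTT cue, taking the `start` time of its first word and the `end` time of its last, plus the speaker number for identification.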