Use When
Use this skill when you have a video, podcast, screen recording, interview, course clip, or meeting recording and need a transcript that keeps the conversation structure intact.
Inputs
- A local audio or video file.
- Speaker names, if known.
- The desired output format: clean transcript, chaptered notes, captions, or content brief.
Workflow
- Confirm the user owns the media or has permission to process it.
- Use ffmpeg to extract a clean audio file from the source.
- Run transcription with speaker diarization enabled.
- Review the output for obvious speaker-label mistakes and timestamp drift.
- Remove filler words only when doing so does not change the meaning.
- Return the transcript with speaker names, timestamps, and a short quality note.
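The audio-extraction step can be sketched as follows. This is a minimal example, not a prescribed pipeline: the helper names `build_ffmpeg_command` and `extract_audio` are hypothetical, and the choice of 16 kHz mono 16-bit PCM WAV is an assumption (a common input format for speech models), so adjust it to whatever your transcription tool expects. It requires ffmpeg to be installed and on the PATH.

```python
import subprocess
from pathlib import Path


def build_ffmpeg_command(source, target):
    """Build an ffmpeg argument list that extracts audio only.

    Assumption: 16 kHz mono 16-bit PCM is a reasonable target; the
    source document does not specify a format.
    """
    return [
        "ffmpeg",
        "-y",                 # overwrite the target if it already exists
        "-i", source,         # input audio or video file
        "-vn",                # drop any video stream
        "-ac", "1",           # downmix to one channel (mono)
        "-ar", "16000",       # resample to 16 kHz
        "-c:a", "pcm_s16le",  # 16-bit PCM, i.e. uncompressed WAV audio
        target,
    ]


def extract_audio(source, target=None):
    """Run ffmpeg and return the path of the extracted audio file."""
    source_path = Path(source)
    # Hypothetical naming scheme: "<name>_audio.wav" next to the source,
    # chosen so a .wav input is never overwritten by its own output.
    if target is None:
        target_path = source_path.with_name(source_path.stem + "_audio.wav")
    else:
        target_path = Path(target)
    command = build_ffmpeg_command(str(source_path), str(target_path))
    subprocess.run(command, check=True)  # raises CalledProcessError on failure
    return target_path
```

Passing the arguments as a list rather than a shell string avoids quoting problems with file names that contain spaces.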
Output
The final transcript should be easy to quote, skim, and hand to another AI workflow. Prefer this shape:
[00:00:12] Speaker 1: Welcome back. Today we are looking at...
[00:00:26] Speaker 2: The tricky part is...
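One way to produce this shape from diarized output is sketched below. The segment layout (start time in seconds, speaker label, text) and the function names are assumptions for illustration; real transcription tools return their own structures, which you would map into this form first.

```python
def format_timestamp(seconds):
    """Convert a time offset in seconds to HH:MM:SS, truncating fractions."""
    total = int(seconds)
    hours, remainder = divmod(total, 3600)
    minutes, secs = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"


def format_transcript(segments, speaker_names=None):
    """Render (start_seconds, speaker_label, text) tuples as transcript lines.

    speaker_names optionally maps diarization labels to real names;
    labels without a mapping are kept unchanged.
    """
    names = speaker_names or {}
    lines = []
    for start, speaker, text in segments:
        label = names.get(speaker, speaker)
        lines.append(f"[{format_timestamp(start)}] {label}: {text.strip()}")
    return "\n".join(lines)


if __name__ == "__main__":
    demo = [
        (12.4, "Speaker 1", "Welcome back."),
        (26.0, "Speaker 2", "The tricky part is..."),
    ]
    print(format_transcript(demo))
```

Supplying a mapping such as `{"Speaker 1": "Host"}` as `speaker_names` swaps in known speaker names while leaving timestamps untouched.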
Prompt
Use the related prompt when you want an AI pass over a raw transcript to normalize speaker names, clean formatting, and preserve timestamps.