Transcribe Content with Speakers and Timestamps

Use When

Use this skill when you have a video, podcast, screen recording, interview, course clip, or meeting recording and need a transcript that keeps the conversation structure intact.

Inputs

A local audio or video file.
Speaker names, if known.
The desired output format: clean transcript, chaptered notes, captions, or content brief.

Workflow

Confirm the user owns the media or has permission to process it.
Use ffmpeg to extract the audio track from the source into a standalone audio file.
Run transcription with speaker diarization enabled.
Review the result for obvious speaker-label mistakes and timestamp drift, and correct them.
Remove filler words only when doing so does not change the meaning.
Return the transcript with speaker names, timestamps, and a short quality note.
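
The extraction step could be sketched as follows. This is a minimal sketch, not a required setup: it assumes ffmpeg is on the PATH, and the mono 16 kHz WAV target is an assumption (a common input for speech models), as are the function names and the file names in the usage line.

```python
import subprocess
from pathlib import Path


def build_extract_command(source: Path, target: Path, sample_rate: int = 16000) -> list[str]:
    """Return an ffmpeg command that writes a mono PCM WAV from `source`."""
    return [
        "ffmpeg",
        "-y",                      # overwrite the target if it already exists
        "-i", str(source),         # input audio or video file
        "-vn",                     # drop any video stream
        "-ac", "1",                # downmix to mono (assumption)
        "-ar", str(sample_rate),   # resample, 16 kHz by default (assumption)
        "-c:a", "pcm_s16le",       # 16-bit PCM audio
        str(target),
    ]


def extract_audio(source: Path, target: Path) -> None:
    """Run ffmpeg; raises subprocess.CalledProcessError if ffmpeg fails."""
    subprocess.run(build_extract_command(source, target), check=True)


if __name__ == "__main__":
    # Hypothetical file names, printed rather than executed.
    print(" ".join(build_extract_command(Path("interview.mp4"), Path("interview.wav"))))
```

Building the command as a list and passing it to subprocess.run without a shell avoids quoting problems with file names that contain spaces.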

Output

The final transcript should be easy to quote, skim, and hand to another AI workflow. Prefer this shape:

[00:00:12] Speaker 1: Welcome back. Today we are looking at...
[00:00:26] Speaker 2: The tricky part is...
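
One way to produce lines in this shape from segment data is sketched below. It assumes each segment carries a start time in seconds, a speaker label, and the spoken text. The function names are illustrative and not tied to any particular transcription tool.

```python
def format_timestamp(seconds: float) -> str:
    """Format a start time in seconds as [HH:MM:SS], dropping fractions."""
    total = int(seconds)
    hours, remainder = divmod(total, 3600)
    minutes, secs = divmod(remainder, 60)
    return f"[{hours:02d}:{minutes:02d}:{secs:02d}]"


def render_line(start_seconds: float, speaker: str, text: str) -> str:
    """Render one transcript line: timestamp, speaker label, then text."""
    return f"{format_timestamp(start_seconds)} {speaker}: {text.strip()}"


if __name__ == "__main__":
    print(render_line(12.4, "Speaker 1", "Welcome back."))
```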

Prompt

Use the related prompt when you want an AI pass over a raw transcript to normalize speaker names, clean formatting, and preserve timestamps.
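
After such a pass, it may be worth checking that the timestamps survived unchanged. The sketch below is one possible check, assuming every transcript line starts with a [HH:MM:SS] stamp as in the Output section. The function names and sample labels are hypothetical.

```python
import re

# Matches a [HH:MM:SS] stamp at the start of any line.
TIMESTAMP = re.compile(r"^\[(\d{2}:\d{2}:\d{2})\]", re.MULTILINE)


def extract_timestamps(transcript: str) -> list[str]:
    """Return the leading timestamps of all lines, in order."""
    return TIMESTAMP.findall(transcript)


def timestamps_preserved(raw: str, cleaned: str) -> bool:
    """True when both transcripts have the same timestamps in the same order."""
    return extract_timestamps(raw) == extract_timestamps(cleaned)


if __name__ == "__main__":
    raw = "[00:00:12] SPEAKER_00: welcome back\n[00:00:26] SPEAKER_01: the tricky part is"
    cleaned = "[00:00:12] Ana: Welcome back.\n[00:00:26] Ben: The tricky part is."
    print(timestamps_preserved(raw, cleaned))
```

The check only compares timestamps. It does not judge whether the speaker names or the cleaned wording are correct.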