AI Skills Library

Transcribe Content with Speakers and Timestamps

Turn audio or video into a cleaned transcript with speaker labels, timestamps, and a usable handoff format.

How to use this skill

  1. Download it. Click “Download skill” above to save the zip file to your computer.
  2. Unzip and add it. Unzipping gives you a folder with a SKILL.md file inside. Drop that folder into the skills folder of an AI tool that runs on your computer (Claude Code, Claude Desktop, or Codex).
  3. Ask your AI to use it. Your AI reads the skill and follows the steps below for you.

Use When

Use this skill when you have a video, podcast, screen recording, interview, course clip, or meeting recording and need a transcript that keeps the conversation structure intact.

Inputs

  • A local audio or video file.
  • Speaker names, if known.
  • The desired output format: clean transcript, chaptered notes, captions, or content brief.

Workflow

  1. Confirm the user owns the media or has permission to process it.
  2. Use ffmpeg to extract a clean audio file from the source.
  3. Run transcription with speaker diarization enabled.
  4. Review the result for obvious speaker-label mistakes and timestamp drift.
  5. Remove filler words only when doing so does not change the meaning.
  6. Return the transcript with speaker names, timestamps, and a short quality note.
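
The audio-extraction step (step 2) could look roughly like the sketch below. It assumes ffmpeg is installed and on the PATH. The function names, the default output file name, and the choice of 16 kHz mono 16-bit PCM WAV are illustrative assumptions, not something the skill prescribes; adjust them to whatever your transcription tool expects.

```python
import subprocess
from pathlib import Path


def build_extract_command(source: str, target: str, sample_rate: int = 16000) -> list[str]:
    """Build an ffmpeg command that extracts mono PCM audio from a media file."""
    return [
        "ffmpeg",
        "-y",                      # overwrite the target if it already exists
        "-i", source,              # input audio or video file
        "-vn",                     # drop any video stream
        "-ac", "1",                # downmix to a single channel
        "-ar", str(sample_rate),   # resample (16 kHz is an assumed default)
        "-c:a", "pcm_s16le",       # encode as 16-bit PCM
        target,
    ]


def extract_audio(source: str, target: str = "audio.wav") -> Path:
    """Run ffmpeg and return the path of the extracted audio file."""
    # check=True raises CalledProcessError if ffmpeg exits with an error.
    subprocess.run(build_extract_command(source, target), check=True)
    return Path(target)
```

Keeping the command builder separate from the call that runs it makes it easy to inspect or log the exact command before any file is touched.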

Output

The final transcript should be easy to quote, skim, and hand to another AI workflow. Prefer this shape:

[00:00:12] Speaker 1: Welcome back. Today we are looking at...
[00:00:26] Speaker 2: The tricky part is...
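
A minimal sketch of producing that shape from diarized segments follows. The `Segment` type and its field names are hypothetical; real transcription tools return their own structures, which would need mapping to something like this first.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """One spoken turn (hypothetical structure for illustration)."""
    start_seconds: float
    speaker: str
    text: str


def format_timestamp(seconds: float) -> str:
    """Render a start time in seconds as [HH:MM:SS]."""
    total = int(seconds)  # fractional seconds are truncated, not rounded
    hours, remainder = divmod(total, 3600)
    minutes, secs = divmod(remainder, 60)
    return f"[{hours:02d}:{minutes:02d}:{secs:02d}]"


def format_transcript(segments: list[Segment]) -> str:
    """Render each segment as one '[HH:MM:SS] Speaker: text' line."""
    return "\n".join(
        f"{format_timestamp(s.start_seconds)} {s.speaker}: {s.text.strip()}"
        for s in segments
    )
```

One line per speaker turn keeps the transcript easy to quote and simple for a later workflow to parse.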

Prompt

Use the related prompt when you want an AI pass over a raw transcript to normalize speaker names, clean formatting, and preserve timestamps.
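
When the mapping from generic labels to real names is already known, the renaming part of that pass can also be done deterministically. The sketch below assumes the transcript uses the line shape shown under Output; the function name and the regular expression are illustrative assumptions, not part of the skill.

```python
import re

# Matches lines such as "[00:00:12] Speaker 1: Welcome back."
LINE_PATTERN = re.compile(r"^(\[\d{2}:\d{2}:\d{2}\]) ([^:]+): (.*)$")


def rename_speakers(transcript: str, names: dict[str, str]) -> str:
    """Replace generic speaker labels with known names, keeping timestamps as-is."""
    renamed = []
    for line in transcript.splitlines():
        match = LINE_PATTERN.match(line)
        if match is None:
            renamed.append(line)  # leave unrecognised lines untouched
            continue
        timestamp, speaker, text = match.groups()
        label = speaker.strip()
        # Labels missing from the mapping are kept unchanged.
        renamed.append(f"{timestamp} {names.get(label, label)}: {text}")
    return "\n".join(renamed)
```

Only the speaker label is rewritten; the timestamp and the spoken text pass through unchanged.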
