Punctuation-Aware Splits
Prioritizes sentence boundaries and optional comma breaks so cues read naturally on screen.
Generate clean subtitles from word-level timestamps with professional segmentation controls, then export polished SRT or VTT instantly.
[
{"text":"Hello","start":0.12,"end":0.44},
{"text":"everyone,","start":0.44,"end":0.93},
{"text":"welcome.","start":0.93,"end":1.40}
]
// Also supported:
// {"words":[{"word":"Hello","start":120,"end":440}]}
// {"results":{"channels":[{"alternatives":[{"words":[...]}]}]}}
Paste timestamp JSON to validate it before generation. The validator flags missing start/end fields by word index.
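As a rough illustration of the kind of check described above, here is a minimal Python sketch that reports missing start or end fields by word index. The function name find_missing_timings and the message wording are hypothetical; the tool's own validator runs in the browser and may behave differently.

```python
from typing import Any


def find_missing_timings(words: list[dict[str, Any]]) -> list[str]:
    """Return one message per missing 'start' or 'end' field, keyed by word index."""
    problems = []
    for index, word in enumerate(words):
        for field in ("start", "end"):
            # Treat an absent key and an explicit null the same way.
            if word.get(field) is None:
                problems.append(f"word {index}: missing '{field}'")
    return problems


sample_words = [
    {"text": "Hello", "start": 0.12, "end": 0.44},
    {"text": "everyone,", "start": 0.44},   # end is missing
    {"text": "welcome.", "end": 1.40},      # start is missing
]

for message in find_missing_timings(sample_words):
    print(message)
```

Reporting the index rather than the word text keeps the message unambiguous when the same word appears more than once.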
Supports common word arrays from AssemblyAI, Whisper-style output, and nested words objects.
Tune line length, cue duration, reading speed, punctuation splits, and cue gap to match delivery style.
Generation runs in-browser. Your uploaded timestamp file is not sent to external conversion APIs.
Applies minimum and maximum cue duration with configurable cue gaps to prevent collisions and flashes.
Generate clean SRT or VTT from the same timestamp source without reformatting manually.
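The segmentation rules above can be sketched in Python as follows. This is an illustrative approximation, not the tool's actual algorithm: the names Cue, split_into_cues, and apply_timing_rules, the default values, and the comma_min_chars threshold are all assumptions. Words are expected to be already normalized to seconds.

```python
from dataclasses import dataclass


@dataclass
class Cue:
    start: float
    end: float
    text: str


def split_into_cues(words, max_chars=42, split_on_commas=True, comma_min_chars=10):
    """Group words into cues, breaking at sentence ends and (optionally) commas."""
    groups, current = [], []
    for word in words:
        pending = " ".join(w["text"] for w in current + [word])
        # Close the current cue first if adding this word would exceed the limit.
        if current and len(pending) > max_chars:
            groups.append(current)
            current = []
        current.append(word)
        text = " ".join(w["text"] for w in current)
        token = word["text"]
        sentence_end = token.endswith((".", "!", "?"))
        # Assumed rule: only break on a comma once the cue has some length.
        comma_break = split_on_commas and token.endswith(",") and len(text) >= comma_min_chars
        if sentence_end or comma_break:
            groups.append(current)
            current = []
    if current:
        groups.append(current)
    return [Cue(g[0]["start"], g[-1]["end"], " ".join(w["text"] for w in g)) for g in groups]


def apply_timing_rules(cues, min_duration=1.0, max_duration=6.0, gap=0.08):
    """Stretch short cues, cap long ones, and keep a gap before the next cue."""
    adjusted = []
    for i, cue in enumerate(cues):
        end = max(cue.end, cue.start + min_duration)   # avoid flashes
        end = min(end, cue.start + max_duration)       # avoid overlong cues
        if i + 1 < len(cues):
            end = min(end, cues[i + 1].start - gap)    # avoid collisions
        end = max(end, cue.start)                      # never end before start
        adjusted.append(Cue(cue.start, end, cue.text))
    return adjusted


words = [
    {"text": "Hello", "start": 0.12, "end": 0.44},
    {"text": "everyone,", "start": 0.44, "end": 0.93},
    {"text": "welcome.", "start": 0.93, "end": 1.40},
]
cues = apply_timing_rules(split_into_cues(words))
for cue in cues:
    print(f"{cue.start:.2f} -> {cue.end:.2f}  {cue.text}")
```

In this sketch the gap wins when it conflicts with the minimum duration, so a cue may end up shorter than min_duration if the next cue starts soon after. A real implementation could just as well shift the next cue instead.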
Upload a JSON file in which each word includes start and end timing. The tool auto-detects whether the values are in seconds or milliseconds.
Flat word arrays are accepted directly, for example [{"text":"Hello","start":0.12,"end":0.44}] or [{"word":"Hi","start":120,"end":360}].
Finds nested words arrays inside channels, alternatives, segments, or result objects.
Understands numeric seconds, numeric milliseconds, and strings like 00:00:12.340 or 120ms.
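To make the input handling above concrete, here is a small Python sketch of two normalization steps: collecting words from nested objects and converting time values to seconds. The helper names collect_words and to_seconds are hypothetical, and the numeric unit is passed explicitly here rather than auto-detected, since the tool's detection heuristic is not documented on this page.

```python
import re

_SUFFIXED = re.compile(r"^(\d+(?:\.\d+)?)\s*(ms|s)$")
_CLOCK = re.compile(r"^(?:(\d+):)?(\d{1,2}):(\d{1,2}(?:[.,]\d+)?)$")


def collect_words(node):
    """Walk nested JSON and return every word found, in document order."""
    if isinstance(node, dict):
        words = node.get("words")
        if isinstance(words, list):
            return list(words)
        found = []
        for value in node.values():
            found.extend(collect_words(value))
        return found
    if isinstance(node, list):
        # A flat list of word-like dicts (text or word key, no nested words).
        if node and all(
            isinstance(x, dict) and ("text" in x or "word" in x) and "words" not in x
            for x in node
        ):
            return list(node)
        found = []
        for item in node:
            found.extend(collect_words(item))
        return found
    return []


def to_seconds(value, numeric_unit="s"):
    """Convert a number, '120ms', '1.5s', or 'HH:MM:SS.mmm' to float seconds."""
    if isinstance(value, bool):
        raise ValueError(f"unrecognized time value: {value!r}")
    if isinstance(value, (int, float)):
        return value / 1000 if numeric_unit == "ms" else float(value)
    text = str(value).strip().lower()
    match = _SUFFIXED.match(text)
    if match:
        number = float(match.group(1))
        return number / 1000 if match.group(2) == "ms" else number
    match = _CLOCK.match(text)
    if match:
        hours = int(match.group(1) or 0)
        minutes = int(match.group(2))
        seconds = float(match.group(3).replace(",", "."))
        return hours * 3600 + minutes * 60 + seconds
    raise ValueError(f"unrecognized time value: {value!r}")


payload = {"results": {"channels": [{"alternatives": [{"words": [
    {"word": "Hello", "start": 120, "end": 440},
]}]}]}}
for word in collect_words(payload):
    print(word["word"], to_seconds(word["start"], "ms"), to_seconds(word["end"], "ms"))
```

Note that this sketch concatenates every words array it finds, so a payload carrying several alternatives would need an extra selection step.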
A word-level timestamp links each token to a start and end time. This is common in Whisper and modern STT pipelines.
Each subtitle cue is built from accurate token timing, not guessed phrase timing.
Supports Whisper-style words arrays and nested ASR outputs from common speech-to-text providers.
Use the built-in validator to catch missing timing values before generating SRT/VTT files.
Recommended minimum schema for robust subtitle generation:
[{"text":"Hello","start":0.12,"end":0.44}] where start and end are either seconds or milliseconds.
Also accepted: alternate keys such as word, start_time, end_time, and duration, plus nested words arrays.
Generate subtitle files ready for editors, social clips, and long-form video in SRT or VTT.
Each word should have text plus timing fields. Common keys are text or word, with start and end.
Yes. The tool supports direct word arrays and nested outputs that include per-word timing from Whisper and common STT providers.
Yes. Set “Lines per cue” to 1 and the generator will keep each cue on a single line.
Yes. Cue start and end are derived from word timings, then refined with your min/max duration and gap settings.
You can export the generated subtitles as SRT or VTT.
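For reference, the two export formats differ mainly in the header and the decimal mark of the timestamps. The Python sketch below shows one way to serialize the same cues to both. The function names are hypothetical, and cues are assumed to be (start, end, text) tuples in seconds.

```python
def format_timestamp(seconds, decimal_mark):
    """Render seconds as HH:MM:SS<mark>mmm (',' for SRT, '.' for VTT)."""
    total_ms = round(seconds * 1000)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, ms = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}{decimal_mark}{ms:03d}"


def to_srt(cues):
    """SRT: numbered blocks, comma before milliseconds, blank line between blocks."""
    blocks = []
    for number, (start, end, text) in enumerate(cues, start=1):
        timing = f"{format_timestamp(start, ',')} --> {format_timestamp(end, ',')}"
        blocks.append(f"{number}\n{timing}\n{text}\n")
    return "\n".join(blocks)


def to_vtt(cues):
    """WebVTT: 'WEBVTT' header, period before milliseconds, no cue numbers needed."""
    blocks = ["WEBVTT\n"]
    for start, end, text in cues:
        timing = f"{format_timestamp(start, '.')} --> {format_timestamp(end, '.')}"
        blocks.append(f"{timing}\n{text}\n")
    return "\n".join(blocks)


example_cues = [(0.12, 0.85, "Hello everyone,"), (0.93, 1.93, "welcome.")]
print(to_srt(example_cues))
print(to_vtt(example_cues))
```

Working in integer milliseconds before splitting into hours, minutes, and seconds avoids rounding artifacts such as a milliseconds field of 1000.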