What input format is supported?

Upload JSON in which each word has start and end timestamps. Accepted shapes include objects keyed text/start/end or word/start/end, and nested words arrays from ASR systems.

Can I choose subtitle segmentation logic?

Yes. You can configure max characters per line, lines per cue, CPS target, cue duration limits, cue gap, and punctuation break behavior.

Which export formats are available?

You can export generated subtitles as SRT or VTT.

Word Timestamp to Subtitles

Generate clean subtitles from word-level timestamps with professional segmentation controls, then export polished SRT or VTT instantly.


Subtitle Settings

Output Format

SRT SubRip

VTT WebVTT

Formatting Rules

Max chars / line

Professional range: 32-42

Lines per cue

1 for social, 2 for long-form

Reading speed (cps)

Characters per second

Min cue duration

Cues shorter than this are extended

Max cue duration

Cues longer than this are split

Gap between cues

Keeps transitions clean

Options

Prefer sentence boundaries

Break after . ? ! whenever duration allows

Allow comma-level splits

Use comma and semicolon breaks when a cue gets too long

Accepted JSON sample

[
  {"text":"Hello","start":0.12,"end":0.44},
  {"text":"everyone,","start":0.44,"end":0.93},
  {"text":"welcome.","start":0.93,"end":1.40}
]

// Also supported:
// {"words":[{"word":"Hello","start":120,"end":440}]}
// {"results":{"channels":[{"alternatives":[{"words":[...]}]}]}}

JSON Validation and Sample

Paste timestamp JSON to validate it before generation. The validator flags missing start/end fields by word index.
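
As an illustration of that kind of check, here is a minimal Python sketch that reports missing start or end fields by word index. The function name find_missing_timing and the message wording are hypothetical. The tool's own validator runs in the browser and may apply further checks.

```python
from typing import Any


def find_missing_timing(words: list[dict[str, Any]]) -> list[str]:
    """Return one message per missing 'start' or 'end' field, keyed by word index."""
    problems = []
    for index, word in enumerate(words):
        for field in ("start", "end"):
            # A key that is absent or set to null counts as missing.
            if word.get(field) is None:
                problems.append(f"word {index}: missing '{field}'")
    return problems


sample = [
    {"text": "Hello", "start": 0.12, "end": 0.44},
    {"text": "everyone,", "start": 0.44},  # no "end" value
]
print(find_missing_timing(sample))
```

Running the same check on your own file before upload shows which word entries need repair.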

ASR JSON Ready

Supports common word arrays from AssemblyAI, Whisper-style output, and nested words objects.

Professional Logic Controls

Tune line length, cue duration, reading speed, punctuation splits, and cue gap to match delivery style.
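
To make two of these controls concrete, the Python sketch below wraps cue text to a maximum line length and computes reading speed in characters per second. The names wrap_cue and chars_per_second are hypothetical. Counting spaces toward the character total is an assumption, since conventions differ between style guides.

```python
def wrap_cue(text: str, max_chars: int = 42) -> list[str]:
    """Greedy wrap: start a new line when the next word would exceed max_chars."""
    lines: list[str] = []
    current = ""
    for word in text.split():
        candidate = f"{current} {word}".strip()
        if current and len(candidate) > max_chars:
            lines.append(current)
            current = word
        else:
            current = candidate
    if current:
        lines.append(current)
    return lines


def chars_per_second(text: str, start: float, end: float) -> float:
    """Reading speed of a cue; spaces are counted (assumption)."""
    duration = end - start
    return len(text) / duration if duration > 0 else float("inf")


print(wrap_cue("Hello everyone, welcome to the show", max_chars=16))
print(chars_per_second("Hello everyone,", 0.0, 1.5))
```

A cue whose wrapped output has more lines than the "Lines per cue" setting, or whose reading speed is above the CPS target, is a candidate for splitting.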

Private by Design

Generation runs in-browser. Your uploaded timestamp file is not sent to external conversion APIs.

Built for Professional Subtitle Drafting

Punctuation-Aware Splits

Prioritizes sentence boundaries and optional comma breaks so cues read naturally on screen.
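
One way to picture this behavior is the simplified Python sketch below. It closes a cue after sentence-ending punctuation, and after a comma or semicolon only once the cue text has grown past a length limit. The function name split_words, the parameters, and the length test are assumptions for illustration, not the tool's actual algorithm.

```python
SENTENCE_END = (".", "?", "!")
SOFT_BREAK = (",", ";")


def split_words(words, max_chars=42, allow_comma_splits=True):
    """Group word dicts (each with a 'text' key) into cues at punctuation."""
    cues, current = [], []
    for word in words:
        current.append(word)
        token = word["text"]
        cue_text = " ".join(w["text"] for w in current)
        at_sentence_end = token.endswith(SENTENCE_END)
        # Soft breaks apply only when the cue is already too long.
        at_soft_break = (
            allow_comma_splits
            and token.endswith(SOFT_BREAK)
            and len(cue_text) > max_chars
        )
        if at_sentence_end or at_soft_break:
            cues.append(current)
            current = []
    if current:
        cues.append(current)  # trailing words without closing punctuation
    return cues


tokens = ["Hello", "everyone,", "welcome.", "Thanks", "for", "coming!"]
words = [{"text": t} for t in tokens]
for cue in split_words(words, max_chars=10):
    print(" ".join(w["text"] for w in cue))
```

With max_chars=10 the comma after "everyone," triggers a split. With the default limit the first sentence stays in one cue.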

Timing Constraints

Applies minimum and maximum cue duration with configurable cue gaps to prevent collisions and flashes.
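
The Python sketch below shows how such constraints might interact on a list of cue times. The function name apply_timing and the default values are hypothetical. For brevity it clamps an over-long cue to the maximum duration, whereas the tool splits long cues into several.

```python
def apply_timing(cues, min_duration=1.0, max_duration=6.0, gap=0.1):
    """Adjust (start, end) pairs in seconds; cues must be sorted by start time."""
    adjusted = []
    for i, (start, end) in enumerate(cues):
        end = max(end, start + min_duration)  # extend cues that would flash
        end = min(end, start + max_duration)  # clamp cues that run too long
        if i + 1 < len(cues):
            next_start = cues[i + 1][0]
            end = min(end, next_start - gap)  # keep a gap before the next cue
        end = max(end, start)  # never let a cue end before it starts
        adjusted.append((start, end))
    return adjusted


cues = [(0.0, 0.5), (1.0, 9.0), (9.0, 9.5)]
print(apply_timing(cues, min_duration=1.0, max_duration=6.0, gap=0.25))
```

Note that the gap rule takes priority over the minimum duration here: the first cue is extended toward one second, then trimmed so it ends 0.25 s before the second cue starts.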

Two Output Targets

Generate clean SRT or VTT from the same timestamp source without reformatting manually.
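
The two formats differ only slightly. SRT numbers each cue and uses a comma before the milliseconds, while WebVTT starts with a WEBVTT header line and uses a period. The Python sketch below renders both from the same cue list. The names format_timestamp and render are hypothetical, and cues are assumed to be (start, end, text) tuples in seconds.

```python
def format_timestamp(seconds: float, separator: str) -> str:
    """Format seconds as HH:MM:SS<separator>mmm."""
    total_ms = round(seconds * 1000)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, ms = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}{separator}{ms:03d}"


def render(cues, fmt="srt"):
    """Render (start, end, text) tuples as an SRT or WebVTT document."""
    sep = "," if fmt == "srt" else "."
    blocks = []
    for number, (start, end, text) in enumerate(cues, start=1):
        timing = f"{format_timestamp(start, sep)} --> {format_timestamp(end, sep)}"
        header = f"{number}\n" if fmt == "srt" else ""  # SRT cues are numbered
        blocks.append(f"{header}{timing}\n{text}")
    body = "\n\n".join(blocks) + "\n"
    return body if fmt == "srt" else "WEBVTT\n\n" + body


cues = [(0.12, 1.4, "Hello everyone, welcome.")]
print(render(cues, "srt"))
print(render(cues, "vtt"))
```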

Accepted Timestamp Inputs

Upload JSON where each word includes start and end timing. The tool auto-detects seconds or milliseconds.

ARRAY

Direct Word List

[{"text":"Hello","start":0.12,"end":0.44}] or [{"word":"Hi","start":120,"end":360}].

NESTED

Nested ASR Output

Finds nested words arrays inside channels, alternatives, segments, or result objects.
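
A recursive search is one plausible way to do this. The Python sketch below walks parsed JSON and collects every list stored under a "words" key. The function name find_word_lists is hypothetical, and the tool's own detection rules may differ.

```python
from typing import Any


def find_word_lists(node: Any) -> list[list]:
    """Collect every list stored under a 'words' key, at any nesting depth."""
    found = []
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "words" and isinstance(value, list):
                found.append(value)
            else:
                found.extend(find_word_lists(value))
    elif isinstance(node, list):
        for item in node:
            found.extend(find_word_lists(item))
    return found


payload = {
    "results": {
        "channels": [
            {"alternatives": [{"words": [{"word": "Hello", "start": 120, "end": 440}]}]}
        ]
    }
}
print(find_word_lists(payload))
```

A direct word list at the top level has no "words" key, so this helper returns an empty list for it. In that case the input can be used as is.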

UNITS

Seconds or Milliseconds

Understands numeric seconds, numeric milliseconds, and strings like 00:00:12.340 or 120ms.
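
As a rough sketch of how these value shapes could be reduced to seconds, consider the Python function below. The name parse_time and its unit parameter are hypothetical. The sketch does not guess the unit of bare numbers, so the caller states it, whereas the tool detects it automatically by a method not described here.

```python
def parse_time(value, unit: str = "s") -> float:
    """Convert a timestamp value to seconds.

    `unit` ("s" or "ms") applies only to bare numbers and numeric strings
    without a suffix; clock strings and "ms"-suffixed strings carry their own unit.
    """
    if isinstance(value, (int, float)):
        return value / 1000 if unit == "ms" else float(value)
    text = value.strip()
    if text.endswith("ms"):
        return float(text[:-2]) / 1000
    if ":" in text:
        # Clock string such as HH:MM:SS.mmm or MM:SS.mmm
        seconds = 0.0
        for part in text.split(":"):
            seconds = seconds * 60 + float(part)
        return seconds
    number = float(text)
    return number / 1000 if unit == "ms" else number


print(parse_time("00:00:12.340"))
print(parse_time("120ms"))
print(parse_time(440, unit="ms"))
```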

What Is a Word-Level Timestamp?

A word-level timestamp links each word (token) to its start and end time. This is common in Whisper output and modern STT pipelines.

SYNC

Per-Word Timing

Each subtitle cue is built from the per-word timing in your file, not from estimated phrase timing.

WHISPER

Whisper and STT Ready

Supports Whisper-style words arrays and nested ASR outputs from common speech-to-text providers.

Validation Before Export

Use the built-in validator to catch missing timing values before generating SRT/VTT files.

JSON Schema Example

Recommended minimum schema for robust subtitle generation:

SCHEMA

Required Fields

[{"text":"Hello","start":0.12,"end":0.44}] where start and end are either seconds or milliseconds.

ALT KEYS

Also Accepted

word, start_time, end_time, duration, and nested words arrays.
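
The Python sketch below shows one way to map these alternative keys onto the text/start/end shape. The function name normalize_word is hypothetical. Treating duration as the length of the word, so that end equals start plus duration, is an assumption about how that key is interpreted.

```python
def normalize_word(raw: dict) -> dict:
    """Map alternative key names onto a text/start/end record."""
    text = raw.get("text", raw.get("word"))
    start = raw.get("start", raw.get("start_time"))
    end = raw.get("end", raw.get("end_time"))
    if end is None and start is not None and "duration" in raw:
        # Assumption: duration is the word length in the same unit as start.
        end = start + raw["duration"]
    return {"text": text, "start": start, "end": end}


print(normalize_word({"word": "Hi", "start_time": 0.5, "duration": 0.25}))
print(normalize_word({"text": "Hello", "start": 0.12, "end": 0.44}))
```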

EXPORT

Output

Generate subtitle files ready for editors, social clips, and long-form video in SRT or VTT.

Frequently Asked Questions

What should my JSON look like?

Each word should have text plus timing fields. Common keys are text or word, with start and end.

Does this work with Whisper and STT word timestamps?

Yes. The tool supports direct word arrays and nested outputs that include per-word timing from Whisper and common STT providers.

Can I generate one-line subtitles only?

Yes. Set “Lines per cue” to 1 and the generator will keep each cue on a single line.

Does it preserve word timings from my file?

Yes. Cue start and end are derived from word timings, then refined with your min/max duration and gap settings.
