The caption editing window lets you create, edit, import and export caption and subtitle information in a number of popular formats:
The main section of the editor lists all captions and subtitles available for the current language:
To edit a caption’s text, click, select and type as you would in a normal text editor. The user interface allows for up to 4 lines of text to be previewed. We recommend that no caption use more than 2 lines of text. This is both a strict requirement for exporting to iTunes Timed Text and good practice to follow. The readability of your captions on any device and display suffers when more than 2 lines are presented at once.
All timecodes are displayed in the SMPTE notation, using colons as delimiters for non-drop frame mode (HH:MM:SS:FF). When drop frame mode is enabled for NTSC frame rates, the last component in the timecode is separated by a semicolon (HH:MM:SS;FF).
When entering timecodes, remember that components on the right take precedence. For example, the timecode 2:30:02 translates to 2 minutes, 30 seconds and 2 frames. When you enter an invalid frame number, it is automatically adjusted to fall within the range allowed by the current frame rate. For example, timecode 1:10:28 is automatically adjusted to 1:10:23 if the current frame rate is 24fps, since this only allows frame numbers from 0 to 23.
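The entry rules above can be sketched in a few lines of Python. This is only an illustration of the described behavior, not the editor's actual implementation; the function name `parse_timecode` and the choice to accept both colons and semicolons are assumptions.

```python
def parse_timecode(text: str, fps: int) -> tuple[int, int, int, int]:
    """Return (hours, minutes, seconds, frames) for a typed timecode.

    Hypothetical helper: components on the right take precedence, so
    missing components are filled in with zeros on the left.
    """
    parts = [int(p) for p in text.replace(";", ":").split(":")]
    if not 1 <= len(parts) <= 4:
        raise ValueError(f"expected 1 to 4 components, got {len(parts)}")
    hours, minutes, seconds, frames = [0] * (4 - len(parts)) + parts
    # Clamp the frame number to the range allowed by the frame rate.
    frames = min(max(frames, 0), fps - 1)
    return hours, minutes, seconds, frames


print(parse_timecode("2:30:02", fps=24))  # (0, 2, 30, 2)
print(parse_timecode("1:10:28", fps=24))  # (0, 1, 10, 23)
```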
Options available under the Timing section affect all captions and all languages in the file:
The Offset timecode defines a time shift applied to all captions. For example, an offset of 2:00:00 means that all captions are considered relative to the 2 minute mark on the timeline. In the same scenario, a caption that is set to start at 1:12 (1 second, 12 frames) will in fact start at 2:01:12.
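As a rough sketch of how such a shift can be computed, the snippet below converts timecodes to frame counts, adds them, and converts back. It assumes non-drop frame timecodes at a whole-number frame rate; the helper names are hypothetical and not part of the software.

```python
def to_frames(h: int, m: int, s: int, f: int, fps: int) -> int:
    """Convert a non-drop frame timecode to a total frame count."""
    return ((h * 60 + m) * 60 + s) * fps + f


def from_frames(total: int, fps: int) -> tuple[int, int, int, int]:
    """Convert a total frame count back to (hours, minutes, seconds, frames)."""
    seconds, f = divmod(total, fps)
    minutes, s = divmod(seconds, 60)
    h, m = divmod(minutes, 60)
    return h, m, s, f


def apply_offset(caption, offset, fps: int):
    """Shift a caption timecode by the global offset (both are 4-tuples)."""
    return from_frames(to_frames(*caption, fps) + to_frames(*offset, fps), fps)


# A caption at 5 seconds and 3 frames, with a 10 minute offset, at 24 fps:
print(apply_offset((0, 0, 5, 3), (0, 10, 0, 0), fps=24))  # (0, 10, 5, 3)
```

Working in whole frames avoids rounding errors and handles carries between components automatically.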
This value is imported from iTunes Timed Text files, and few other file formats have a native concept of a global time offset. Its usefulness should be clear during export operations, since it allows you to shift all captions forward or backward in time without requiring you to manually adjust every timecode. A huge time saver!
Options under the Appearance section affect all selected captions:
A preview canvas provides a quick glance at how the caption might appear within the frame, according to the options selected below. Text formatting options will apply to entire captions, or only to the selected text within a caption.
Notably absent from this section are any options to choose a font and size. Most caption file formats do not support font information. Instead, they leave the choice of font and size to the software or device used to display the captions.
Below and to the left of your captions is a search box that supports text and timecode-based searches:
When you enter one or more keywords in the search box, all captions that contain any of those words are displayed.
You can also search for captions by entering a timecode:
All captions that appear near that timecode are displayed. By default, captions that appear 5 seconds before or after the given time are matched. You can set a different range through the Settings window.
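The two search modes can be approximated as follows. This is a minimal sketch: the `Caption` class, the case-insensitive substring match, and the interpretation of "near" as "visible at any point within the window" are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Caption:
    start: float  # seconds
    end: float    # seconds
    text: str


def search_keywords(captions, query: str):
    """Return captions that contain any of the words in the query."""
    words = [w.lower() for w in query.split()]
    return [c for c in captions if any(w in c.text.lower() for w in words)]


def search_timecode(captions, target: float, window: float = 5.0):
    """Return captions visible at some point within target +/- window seconds."""
    return [c for c in captions
            if c.start <= target + window and c.end >= target - window]


captions = [Caption(0, 2, "Hello there"), Caption(10, 12, "Good morning"),
            Caption(20, 22, "Goodbye")]
print([c.text for c in search_keywords(captions, "hello morning")])
# ['Hello there', 'Good morning']
print([c.text for c in search_timecode(captions, 15.0)])
# ['Good morning', 'Goodbye']
```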
The editor automatically detects when two or more captions are set to appear simultaneously during playback. These timing conflicts are highlighted in the user interface:
In the example above, a value of 8 was entered in the out timecode of the first caption when a value of 7 was intended. The result of this mistake is that the first caption overlaps the next caption. Changing the out timecode of the first caption to 2:57:17 would fix the problem.
When the imported data contains multiple conflicts, enable the Conflicts Only option to temporarily display only captions that have timing problems to be resolved.
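One way to detect such conflicts is to sort captions by their in time and compare each one against the latest out time seen so far. The sketch below is an illustration of that idea rather than the editor's own algorithm; it assumes captions are given as (in, out) pairs in frames, and that a caption starting exactly where another ends is not a conflict.

```python
def find_conflicts(captions):
    """Return the indices of captions that overlap at least one other caption.

    `captions` is a list of (start, end) pairs with end > start.
    """
    order = sorted(range(len(captions)), key=lambda i: captions[i][0])
    conflicts = set()
    latest_end, latest_idx = None, None
    for i in order:
        start, end = captions[i]
        # Overlap: this caption starts before an earlier one has ended.
        if latest_end is not None and start < latest_end:
            conflicts.update({latest_idx, i})
        if latest_end is None or end > latest_end:
            latest_end, latest_idx = end, i
    return conflicts


# The first caption runs 8 frames too long and overlaps the second one.
print(find_conflicts([(0, 48), (40, 90), (100, 120)]))  # {0, 1}
```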
Click the previous and next buttons to jump to the previous or next caption with problems. For quick navigation, the keyboard shortcut for these buttons is the key followed by the up or down arrow on your keyboard.
Multiple languages can be edited and saved through a single editor. This makes it easier to manage different translations for the same media.
When creating a new set of captions, the current system locale is used to create the initial language, e.g. "English (United States)". When importing an existing file, add the language that matches the captions being imported. For example, when importing a SubRip (.srt) file containing German language subtitles, store those captions as "German".
To add a new language to the editor, click the add button to the right of the Language menu. Similarly, click the delete button to remove the current language and all its associated captions from the editor.
When creating a new language, you are given a chance to duplicate existing captions to the new language. This is a common technique for anyone starting a translation.
The editor goes a step further to make translation easier. Click the button to reveal a second, side-by-side view where you can load a reference language (your source for the translation):
Load your source language on the right side (the reference view) and work on the translation on the left (the primary view).
Click on a caption in one language to find its nearest equivalent in the other language. This allows you to identify the caption(s) that match a given timecode, and helps you verify the correctness of the translation.
Languages are not required to have the same number of captions or the same timecodes. Translators have complete freedom to use more or fewer captions, at different timecodes, to translate the underlying material.
Only captions and timecodes in the left view can be changed. Text and timecodes in the right view are locked. They can only be selected and copied.
The import process begins when you open an existing file, or when you click the button in the editing window.
Certain file formats embed enough information to be readable directly, with no user intervention. One such example is the iTunes Timed Text format. All other formats may require you to fine-tune the import process through this window:
The most likely settings are automatically applied. In most cases, your only job when importing is to match the language to the contents of the file.
Modern files typically use a Unicode text encoding (UTF-8, UTF-16) to ensure that characters and glyphs in any language can be represented. Unfortunately, a large number of captions are still available in files that use text encodings specific to a family of languages (e.g. Western European languages) or to a single language (e.g. Japanese). Our software tries to automatically detect the text encoding from the input file. A Preview is available to check the contents of the file as interpreted through the given encoding. If the preview looks incorrect, switch to a different encoding until the contents of the file are read correctly.
When importing WebVTT files, the following UI is visible under the Format menu:
The extra options give you control over any voice tags found in the source file:
For example, when using the default delimiter of ": " with the bold option enabled and the color set to cyan, the WebVTT cue:
<v John>Hello!</v>
...is imported as the caption:
John: Hello!
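The text side of this conversion can be sketched with a regular expression. The snippet is an assumption-laden illustration, not the importer's real code: it handles only the plain-text result (the bold and color styling of the speaker name is not reproduced), and the name `inline_voices` is hypothetical. It accepts voice spans closed with `</v>` as well as spans left open until the end of the cue, which WebVTT permits.

```python
import re

# <v Name> or <v.class Name>, followed by text up to </v> or the end of the cue.
VOICE_TAG = re.compile(r"<v(?:\.[^\s>]*)?\s+([^>]+)>(.*?)(?:</v>|$)", re.DOTALL)


def inline_voices(cue_text: str, delimiter: str = ": ") -> str:
    """Replace each voice span with 'Name<delimiter>text'."""
    return VOICE_TAG.sub(
        lambda m: f"{m.group(1).strip()}{delimiter}{m.group(2)}", cue_text)


print(inline_voices("<v John>Hello!</v>"))  # John: Hello!
```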
When importing Premiere Pro Markers files, the following UI is visible under the Format menu:
When importing LRC files, the following UI is visible under the Format menu:
The Apply custom style to active lyrics option creates multiple captions for each set of word time tags found in the source file, and applies a different style to the active lyrics. When this option is disabled, any word time tags are ignored, and only one caption is created for each line of lyrics.
Not all LRC files contain word time tags. It is a feature described as Enhanced LRC, designed to help karaoke machines highlight active lyrics on screen.
The export process begins when you click the button in the editing window. Export options are available inside the standard Save panel:
The options you choose at this stage only affect the export process. The original captions and options are left untouched during an export. This allows you to quickly perform a series of export operations at various output settings. For example, you might want to output a series of iTunes Timed Text files at various frame rates, or to create drop frame and non-drop frame versions of the same captions.
Demo.en_US.srt. This option is not available when exporting to formats other than SubRip.
When exporting to a Unicode encoding (UTF-8 or UTF-16), the Add Byte Order Mark (BOM) option lets you decide if you want the output file to contain a special, invisible character sequence that instructs other software on how to interpret the contents of the file correctly.
iTunes Timed Text (iTT) is a subset of the Timed Text Markup Language by the World Wide Web Consortium (W3C). All iTT documents are TTML documents that use a restricted subset of TTML. iTunes Timed Text is natively supported by Final Cut Pro 10.4 (or later). iTT files store all timecodes in the SMPTE format, with a distinction between drop frame (HH:MM:SS;FF) and non-drop frame (HH:MM:SS:FF) timecodes.
The majority of information is provided by the iTT specification, and the language, style and timing stored in the file will be faithfully imported.
Most of the information you can manipulate through the editor is faithfully exported to the iTT file, with a few important exceptions. The iTT specification does not allow for simultaneous captions (i.e. captions whose time periods overlap). iTT does not support text box sizing and placement. iTT does not allow you to customize text alignment, since all captions are centered within the frame.
WebVTT is an evolving standard by the World Wide Web Consortium called The Web Video Text Tracks Format. The export process supports a limited but growing subset of the specification that deals with static captions:
<v John>Hello!</v> can be imported as: John: Hello!
Most other information is skipped during an import. WebVTT uses its own format for timecodes (00:00:00.000) where the last component represents milliseconds.
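For illustration, a WebVTT timestamp can be converted to milliseconds as shown below. This is a sketch with a hypothetical function name; it also accepts the short `MM:SS.mmm` form, since WebVTT allows the hours component to be omitted.

```python
import re

# Optional hours, then minutes, seconds and exactly three millisecond digits.
_WEBVTT_TS = re.compile(r"^(?:(\d{2,}):)?([0-5]\d):([0-5]\d)\.(\d{3})$")


def webvtt_to_ms(timestamp: str) -> int:
    """Convert 'HH:MM:SS.mmm' (or 'MM:SS.mmm') to milliseconds."""
    match = _WEBVTT_TS.match(timestamp.strip())
    if not match:
        raise ValueError(f"not a WebVTT timestamp: {timestamp!r}")
    hours, minutes, seconds, millis = (int(g) if g else 0 for g in match.groups())
    return ((hours * 60 + minutes) * 60 + seconds) * 1000 + millis


print(webvtt_to_ms("00:01:02.500"))  # 62500
print(webvtt_to_ms("01:02.500"))     # 62500
```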
The import process recognizes text styles defined inline via HTML-style syntax, and text colors as defined via CSS-like statements in STYLE sections. Voices can be inlined into captions or skipped, based on the options provided for the import process. All other information is skipped.
Most of the caption information you can manipulate in the editor is exported to WebVTT. Bold, italic and underline text styles are exported via inline HTML-style syntax. Text alignment, text box size and relative positioning within the frame are also exported via Cue attributes. Text colors will be exported once the STYLE section is widely supported.
SubRip (SRT) remains the most popular text-based caption file format, despite having no formal specification and limited support for text formatting via HTML-like syntax. SubRip files use their own format for timecodes (00:00:00,000) where the last component represents milliseconds.
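As a small illustration of the SubRip timecode layout, the following sketch formats a millisecond count using a comma before the last component. The function name is hypothetical.

```python
def ms_to_srt(total_ms: int) -> str:
    """Format a non-negative millisecond count as 'HH:MM:SS,mmm'."""
    if total_ms < 0:
        raise ValueError("SubRip timecodes cannot be negative")
    seconds, millis = divmod(total_ms, 1000)
    minutes, secs = divmod(seconds, 60)
    hours, mins = divmod(minutes, 60)
    return f"{hours:02d}:{mins:02d}:{secs:02d},{millis:03d}"


print(ms_to_srt(62500))    # 00:01:02,500
print(ms_to_srt(3723004))  # 01:02:03,004
```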
Text styles and colors are imported with appropriate HTML-like syntax.
Text styles and colors are exported via HTML-like syntax. Simultaneous captions, text alignment options, text box size and positioning are not supported by the format. Make sure that you export your file as UTF-8 when targeting YouTube, Facebook or other popular social media platforms.
In some cases it also helps to export files without markup (text formatting) to guarantee the best results. You can export captions without markup by turning off the Export Styles (markup) option in the Export panel.
SubViewer is a text-based file format that does not support any appearance options or simultaneous captions. It is still popular with a number of software packages and web video platforms, such as YouTube. SubViewer files use their own format for timecodes (00:00:00.00) where the last component represents hundredths of a second.
Only caption text is imported.
Only caption text is exported.
Adobe Encore supports both text and image-based subtitles. Caption Converter allows you to import and export Text Script files. The file format is extremely simple. Timecodes use SMPTE-like components (HH;MM;SS;FF) but no distinction is made between drop and non-drop frame timecodes.
Only caption text is imported.
Only caption text is exported. Make sure that you export the file in one of the Unicode formats (UTF-8 and UTF-16) to ensure that text in all languages is correctly preserved. The use of a Byte Order Mark is optional.
Adobe Premiere Pro allows you to export markers in XML and CSV files. Caption Converter currently allows you to import CSV files only. Premiere Pro uses SMPTE-like notation for its timecodes, distinguishing between non-drop frame (HH:MM:SS:FF) and drop-frame mode (HH;MM;SS;FF). The .csv file extension suggests that file contents should always be comma-separated values. In practice, recent versions of Premiere Pro seem to export tab-separated values instead. Either variant is detected and handled automatically by the import process.
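A simple way to handle both variants is to inspect the header line and pick whichever delimiter occurs more often. The sketch below shows that heuristic only; it is not necessarily how the import process detects the delimiter, and the column names in the example are made up.

```python
import csv
import io


def read_marker_rows(text: str) -> list[list[str]]:
    """Parse marker data that may be comma- or tab-separated."""
    lines = text.splitlines()
    header = lines[0] if lines else ""
    # Heuristic: use tabs when the header contains more tabs than commas.
    delimiter = "\t" if header.count("\t") > header.count(",") else ","
    return list(csv.reader(io.StringIO(text), delimiter=delimiter))


tabbed = "Name\tIn\nIntro\t00:00:01:00\n"
print(read_marker_rows(tabbed))
# [['Name', 'In'], ['Intro', '00:00:01:00']]
```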
Only caption text is imported. Make sure to match the Drop frame setting to the value expected in the file. The import process will fail if your selection does not match the data in the file. While markers do not have any associated text formatting, Premiere Pro allows users to enter both a name and arbitrary comments for each marker. You can import marker name, comments, or both, via a setting in the import window. When importing name and comments, the name is imported as the first line in the caption and the comments are imported in subsequent lines.
Exporting to this format is not possible. Premiere Pro allows the importing of markers saved in the Final Cut Pro XML Interchange Format, which predates Final Cut Pro X. Let us know if you would like to see it as an option.
LRC is a text-based file format that does not support any appearance options or simultaneous captions. It is popular for storing and displaying song lyrics. Its timecodes use the format MM:SS.XX, where MM is minutes, SS is seconds, and XX is hundredths of a second.
Text and any global offset stored in the file are imported. Since lyrics do not often include the out timecode, each line of lyrics ends where the next line begins. When the file does not specify the out timecode of the last lyric, the duration of the song is used to assign the correct timecode. Should the overall song duration also be unavailable or incorrect, the last lyric is assigned a default duration of one second.
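The rule for assigning out timecodes can be sketched as follows. The function name is hypothetical, times are in seconds, and "incorrect" song duration is interpreted here as a duration that does not fall after the start of the last lyric, which is an assumption.

```python
def assign_end_times(starts, song_duration=None, default_last=1.0):
    """Pair each lyric start time with an end time.

    `starts` must be sorted. Each line ends where the next one begins; the
    last line ends at the song duration, or after `default_last` seconds
    when the duration is missing or unusable.
    """
    spans = []
    for i, start in enumerate(starts):
        if i + 1 < len(starts):
            end = starts[i + 1]
        elif song_duration is not None and song_duration > start:
            end = song_duration
        else:
            end = start + default_last
        spans.append((start, end))
    return spans


print(assign_end_times([1.0, 4.5, 9.0], song_duration=12.0))
# [(1.0, 4.5), (4.5, 9.0), (9.0, 12.0)]
print(assign_end_times([1.0, 4.5]))
# [(1.0, 4.5), (4.5, 5.5)]
```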
When the LRC file contains enhanced word time tags, you have the option to translate a single line of lyrics into multiple captions, where the current words are displayed through a different style. You can choose a style for the active lyrics in the Import window. If you choose not to apply a custom style to active lyrics, all enhanced word tags are ignored, and one caption is created for each line.
Only caption text and the overall offset are exported. Note that the LRC file format interprets the [offset:...] field with the opposite meaning as our software. A positive offset value indicates that lyrics are delayed by the specified amount. In our own software, a positive offset indicates that time should be fast-forwarded by the desired amount, thus causing captions to appear sooner on screen.