NOTE: These tools are no longer being developed.
NOTE: Tim Singer has developed a tool called Captioneer to replace the buggy cc_mux tool, as well as to provide an actual GUI for creating and adjusting closed caption timings against a video file. You can find this tool here. One thing the instructions on that site don't make clear is that if you have the M2V file open in the tool, you won't be able to mux it. Start a new project, load just the SCC file, and then multiplex.
NOTE: To download the fixed CCExtract.bdl file for General Parser (to fix ATSC extraction), click here. Place this in the C:\Program Files\General Parser\source\MPEG\ directory. This fix was by Ken Schultz.
NOTE: To download the fixed version of CCASDI (3.6), click here.
NOTE: Carlos Fernandez has written a tool to extract raw captions from an MPEG file faster than SCC_RIP. In addition, it can also handle HDTV files, which my tool cannot. You can download this CCExtractor tool (with source code) here.
The Scenarist Closed Caption Tools package is available for download here.
The latest version is 3.5 (May 5, 2005), which affects three tools:
CCADJ can now scale timecodes with the new -m argument; CCASDI fixed a lot of bugs, including a mistake in filler byte handling pointed out to me by correspondent Ji-Liang Song; and RAW2SCC will now handle multi-channel raw files better. For details, see the complete version history.
The SCC Tools package consists of ten command-line tools (and one General Parser module) designed to assist in the task of extracting, manipulating, and inserting the additional data included in Line 21 of NTSC video: closed captions, MSNTV links, V-Chip ratings, and a variety of lesser-used types of information.
The following files should have been included in this distribution:
For an explanation of Line 21 Closed Captions and how they are encoded, click here. For high-end DVD authoring (Sonic Scenarist, Apple DVDStudio Pro, and the discontinued Spruce DVDMaestro), closed captions are entered as files in the SCC format (for Scenarist Closed Caption). Since this format is not readable by humans, I have included tools to convert SCC format to CCD (Closed Caption Disassembly) format and back. The CCD Format is documented here (including additional pages on characters, Closed Caption codes, eXtended Data Service codes and Interactive TeleVision codes in Closed Caption Disassembly format).
Extracting Line 21 Data
Manipulating Line 21 Data
Inserting Line 21 Data
Capturing Raw Closed Caption Data from an Analog Source
To obtain closed captions, you can either write them from scratch in Closed Caption Disassembly Format, convert them from subtitles or other closed caption formats, rip them from a DVD or MPEG or DVR-MS file, grab them from a VCD, or capture them with a TV capture card from an analog source. The last of these methods is the most involved, so it will be covered first.
To capture closed captions from a VCR, a laserdisc player, TV, or a set-top DVD player, you will need a TV capture card. For the purposes of closed caption capture, the cheaper the card, the better, since the high-end cards usually throw away the Line 21 signal to improve performance. Any card with a WDM (Windows Driver Model) driver should work (see this site for a generic WDM driver for cards that use the Brooktree chip--if the box makes a big deal about "watching TV on your PC", then it probably uses a Brooktree chip). Unfortunately, the ATI-TV Wonder USB edition drivers are not WDM (and the main chip is not a Brooktree), so you cannot use this potentially-useful device. The most common card that does work is the ATI-TV Wonder PCI, available for less than $100. However, this card will not work on a Windows 2000 or Windows XP machine if your primary video card was not manufactured by ATI (NVidia cards in particular refuse to work with the ATI-TV Wonder in these two operating systems). Personally, I performed my research using an old computer with Windows 98 and the ATI-TV Wonder PCI to capture closed captions, then transferred the caption files to my main computer via floppy disk.
To capture the Line 21 data, you will be using GraphEdit, a tool that comes with the DirectX Software Development Kit, to put together a program for dumping closed captions to a file. You can download GraphEdit by itself from Doom 9, or from Microsoft as part of the DirectX SDK. The SDK (at 215+ MB a rather-large download) is probably worth loading for only two reasons: it already includes DirectX (in case you need to upgrade), and its help file is useful for figuring out what the different DirectShow filters do (in the Contents tab, look under DirectX - DirectShow - DirectShow Reference - DirectShow Filters). On the other hand, the Doom 9 version includes the Dump filter, which you will need in the below procedure.
The simplest way to set up your graph is to open GraphEdit and your TV watching program at the same time, get the second program up and working, then switch to GraphEdit and select "Connect to Remote Graph" from the File menu. This should give you all the interconnected parts needed to watch TV, and from here you should be able to find the VBI pin on one of the filters (pins are represented in GraphEdit as bumps on the boxes, which are the filters).
Assuming this procedure worked and you found a VBI pin, select Insert Filters from the Graph menu. Under WDM Streaming Tee/Splitter Devices, insert the "Tee/Sink-to-Sink Converter" filter. Under WDM Streaming VBI Codecs, insert the "CC Decoder" filter. Under DirectShow Filters, insert the "Dump" filter (this will prompt you for a filename to dump to, so use "cc.bin").
You will see all of these filters on your graph, probably on top of each other. Spread them out, then connect them in sequence by dragging from an output pin (bump on right side) of one filter to an input pin (bump on left side) of another. Your goal is something like this (just pay attention to the VBI pin and the filters I've named--the rest doesn't apply as well; also, the blue box is the Dump filter, although that's not very obvious). One tweak you should be able to apply is selecting which line and field to capture (the default of Line 21 Field 1 for closed captions, Line 21 Field 2 for XDS, or other lines and fields to get something completely different)--right-clicking on the "CC" output pin of the CC Decoder filter should give you this option (I no longer have my Windows 98 computer with the ATI-TV Wonder PCI, so I have no way of confirming this anymore).
To start capturing captions, press the play button in GraphEdit. A preview window may or may not appear, and the captions probably won't be visible (unless you follow the steps below in the detailed procedure), so you'll have to trust that everything is working. Click the stop button in GraphEdit when you're done (don't close the preview window before you do this). Check to make sure the file created has something in it (it will be in binary format, so you won't be able to read it). If everything has worked up to this point, proceed to convert the file to SCC format.
Assuming that "Connect to Remote Graph" didn't work, the following procedure describes what to do in more detail (although if "Connect to Remote Graph" didn't work, that might mean that the capture card doesn't use the DirectShow architecture, which would mean there is no way to hijack it to capture closed captions--maybe you should return the card while you can).
... set path=%path%;." at the top, then run it again.
Getting Raw Closed Caption Data from a VCD
Closed Captions for VCDs are not completely documented anywhere on the Internet, largely because the format has been abandoned for commercial purposes in North America (I presume they are documented in the White Book specifications, but not only is this $200, but any potential buyers are required to sign a non-disclosure agreement that would defeat the purpose of buying them in the first place). The only VCDs that are being made today lack closed captions.
The vast majority of the often poorly-formatted specifications for VCDs and SVCDs found on the 'net derive from the Philips PDF Super Video Compact Disc, a Technical Explanation. This document states two ways that closed captions are stored on a VCD: as the file EXT\CAPTnn.DAT, and in the form of user data embedded in the MPEG file.
There should be one CAPTnn.DAT file for each track of the VCD (so the closed captions for the AVSEQ02.DAT file are stored in CAPT02.DAT). The format of this file is spelled out in a document on the CD-I website. This page reveals that CAPTnn.DAT format is not the same as the EIA-608 format used for broadcast, videotapes and DVDs. It also shows that the file is specifically designed for CD-I players, and therefore will not be used by VCD or DVD players.
I know much less about the user data method. Looking at the source code for the VCDImager tool (starting with version 0.7.4), I can see that the user data header (0x000001b2) is followed by 0x11, but no details are given of the further structure. This rules out the DVD closed caption structure, which uses 0x43, and also the SCTE 20 2001 structure (0x03).
I have been looking for old commercial VCDs on eBay, in order to reverse-engineer the format. So far I have a Swedish version of Ghost that verifies the CAPTnn.DAT format but lacks user data captions. If anyone can help me in this area, I'd really appreciate it (go to the bottom of this document to get my e-mail address). On the other hand, it's a good bet that no DVD player in existence (either software or hardware) would be capable of playing VCD user data closed captions, as it is likely their developers had the same incomplete documentation as I do.
Converting Raw Closed Caption Data to SCC Format: RAW2SCC
Once you have captured raw closed caption data, you can either convert it to subtitles (using the CCParser program available at Doom9), or you can use the RAW2SCC tool to convert it into the SCC format used by many high-end DVD authoring programs to store closed captions. (CCParser, by the way, requires the raw captions to be in DVD-format, so you will have to use RAW2SCC followed by SCC2RAW -d to convert broadcast-format into DVD-format.) To use RAW2SCC, run it from the command prompt with the name of the raw file to convert. Optionally, add a second argument with the name of the SCC file to create if you don't want it to be the name of the input file with the extension changed to .scc.
Here is a sample SCC file to show how the format works:
Scenarist_SCC V1.0

01:02:53:14	94ae 94ae 9420 9420 947a 947a 97a2 97a2 a820 68ef f26e 2068 ef6e 6be9 6e67 2029 942c 942c 942f 942f

01:02:55:14	942c 942c

01:03:27:29	94ae 94ae 9420 9420 94f2 94f2 c845 d92c 2054 c845 5245 ae80 942c 942c 8080 8080 942f 942f
A file represents the data in either Field 1 or Field 2 of Line 21, but not both (the Field 1 file always has an extension of .SCC; the Field 2 file is supposed to have the extension .SC2, but it is usually .SCC as well). It is possible to determine the field by examining the contents (for example, if you see 94ae or 1cae, then it's a Field 1 file, but if you see 15ae, 9dae or anything starting with 01, then it's a Field 2 file), but as the example shows, it isn't easy.
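The field heuristic just described can be sketched in a few lines of Python. This is only an illustration of the rule of thumb given above, not code from the SCC Tools package; the function name guess_field is made up for this example, and a real file may contain none of the marker codes, in which case the sketch simply gives up.

```python
FIELD1_MARKERS = {"94ae", "1cae"}   # codes named above as Field 1 hints
FIELD2_MARKERS = {"15ae", "9dae"}   # codes named above as Field 2 hints


def guess_field(scc_text):
    """Return 1 or 2 if a marker code is found, or None if undetermined."""
    for line in scc_text.splitlines():
        parts = line.split()
        # Skip the header and blank lines; data lines start with a timecode.
        if len(parts) < 2 or ":" not in parts[0]:
            continue
        for word in parts[1:]:
            word = word.lower()
            if word in FIELD1_MARKERS:
                return 1
            if word in FIELD2_MARKERS or word.startswith("01"):
                return 2
    return None


sample = "Scenarist_SCC V1.0\n\n01:02:55:14\t94ae 94ae 9420 9420\n"
print(guess_field(sample))
```

The first marker found wins, so a file that mixes hints from both fields would be classified by whichever appears first.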
The file is double-spaced, with blank lines between each line of data. The first line consists of the header "Scenarist_SCC V1.0" (version 1.0 is as high as this format ever got).
The third and following alternate lines consist of a timecode (which can use either the non-dropframe or dropframe formats, see below), followed by a tab, followed by space-delimited data. Usually, each line is a separate caption.
The timecode is in SMPTE format, or hours:minutes:seconds:frames. Since Line 21 Closed Captions is an NTSC format, there are 29.97 frames per second. NTSC timecodes can be displayed in one of two slightly different formats. In non-dropframe time base, frame counts are translated straight into SMPTE. This is the usual format for NTSC content that has no contact with a broadcast environment. For a broadcast setting, timecodes are easier to work with if you start with 30 frames per second and then subtract 0.1% to get 29.97. This is called dropframe time base, and is accomplished by skipping the first two frame numbers at the beginning of every minute for nine out of every ten minutes. Dropframe timecodes are distinguished from non-dropframe timecodes by changing the last colon into a semicolon (00:01:00;04 instead of 00:01:00:04). Note that the difference between non-dropframe and dropframe is purely in how timecodes are displayed; underneath, 29.97 frames are still passing every second. RAW2SCC uses non-dropframe time base by default. [Thanks to Dan Wilson for making this all clear to me.]
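To make the two display formats concrete, here is a small Python sketch that labels the same frame count both ways. It follows the dropframe rule described above (two frame numbers skipped at the start of every minute except every tenth); the function names are made up for this example and are not part of the SCC Tools.

```python
def frames_to_non_dropframe(frame_count):
    """Label a frame count as HH:MM:SS:FF, counting 30 labels per second."""
    ff = frame_count % 30
    ss = frame_count // 30 % 60
    mm = frame_count // 1800 % 60
    hh = frame_count // 108000
    return f"{hh:02d}:{mm:02d}:{ss:02d}:{ff:02d}"


def frames_to_dropframe(frame_count):
    """Label a frame count as HH:MM:SS;FF, skipping two labels per minute
    in nine out of every ten minutes."""
    frames_per_10_min = 17982   # 10 * 1800 labels minus 9 * 2 skipped ones
    frames_per_min = 1798       # 1800 labels minus the 2 skipped ones
    tens, rest = divmod(frame_count, frames_per_10_min)
    skipped = 18 * tens
    if rest >= 2:
        skipped += 2 * ((rest - 2) // frames_per_min)
    label = frame_count + skipped
    ff = label % 30
    ss = label // 30 % 60
    mm = label // 1800 % 60
    hh = label // 108000
    return f"{hh:02d}:{mm:02d}:{ss:02d};{ff:02d}"


# The same frame, one real-time hour into an NTSC programme:
print(frames_to_non_dropframe(107892))  # 00:59:56:12
print(frames_to_dropframe(107892))      # 01:00:00;00
```

The non-dropframe label falls behind the wall clock by about 3.6 seconds per hour, which is the drift that dropframe labelling removes.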
The data is made up of pairs of hexadecimal numbers. Each pair can either represent a command (for positioning and other special effects), or a pair of characters. One pair is transmitted with each frame of video.
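As an illustration of how these pairs break down, the following Python sketch decodes the data portion of an SCC line. It assumes the usual Line 21 convention that the high bit of every byte is an odd-parity bit, treats any pair whose first byte (after removing parity) is 0x20 or higher as two characters, and treats everything else as a command or padding. The names are made up for this example, and the sketch maps characters straight to ASCII, which ignores the handful of positions where the closed caption character set differs.

```python
def has_odd_parity(byte):
    """True if the byte contains an odd number of 1 bits."""
    return bin(byte).count("1") % 2 == 1


def decode_scc_data(data):
    """Return (text, commands) for the space-delimited data of one SCC line."""
    text = []
    commands = []
    for word in data.split():
        first = int(word[:2], 16) & 0x7F    # drop the parity bit
        second = int(word[2:], 16) & 0x7F
        if first >= 0x20:
            text.append(chr(first))
            if second >= 0x20:              # a trailing 0x00 is padding
                text.append(chr(second))
        elif first == 0 and second == 0:
            continue                        # null pair (8080), timing filler
        else:
            commands.append(f"{first:02x}{second:02x}")
    return "".join(text), commands


line = ("94ae 94ae 9420 9420 947a 947a 97a2 97a2 a820 68ef f26e 2068 "
        "ef6e 6be9 6e67 2029 942c 942c 942f 942f")
text, commands = decode_scc_data(line)
print(text)      # ( horn honking )
print(commands)
```

Commands are returned with the parity bit already removed, so 94ae appears as 142e.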
Getting back to the RAW2SCC tool, you can use any of the following optional arguments with it (place them between the command and the name of the input file):
-1, -2, or -12: Which field to extract (-12 means to extract both of them and output to two SCC files, with "_1" added to the name of the Field 1 file and "_2" added to the name of the Field 2 file). -1 is the default. Note that none of these options apply if the input file is in broadcast format, as that format only supports a single field's data.
-oHH:MM:SS:FF: Timecode offset (in dropframe time base, written as HH:MM:SS;FF).
-fNFPS: Number of frames per second.
-td: Time base, n for non-dropframe or d for dropframe. The default is n. This flag will automatically set NFPS to 29.97, so don't mix the -f and -t arguments in the same command line.
-lN: Null codes (8080) are transmitted sporadically in order to get the timing right. RAW2SCC will split a line if it detects 2 or more null codes in a row. If you're trying to get your SCC files to have one entire caption per line, use -l to increase the number of null codes allowed in the middle of a line before the line is split, for example -l8 to increase the limit to 8.
Ripping SCC Captions from a DVD: VobSub and VOBSUB2SCC
Until recently, the only way you could get closed caption data out of a DVD was to build a filter graph similar to the one described for analog capture, and then play the whole DVD out in real time. For those DVDs where the new procedure doesn't work, you can try the old method by following the instructions here. With the tool CCExtract, you can extract the captions from the ripped MPEG-2 files (you can also choose which field to extract, unlike VobSub, which only extracts Field 1). The following process is an alternative to try in case that doesn't work.
Starting with version 2.19, Gabest's VobSub tool (which is designed for the two purposes of burning subtitles into video files or displaying them on top of video by use of an included filter) is capable of ripping Closed Captions from a DVD and converting them to the popular Subrip subtitle format. As a byproduct of this process, a temporary file is created that can be converted straight to an SCC file by using the tool VOBSUB2SCC. Gabest has recently developed a VobSub Ripper Wizard tool that makes this process even easier:
VTS_01_0.idx | Not needed (used by VobSub). |
VTS_01_0.sub | Not needed (used by VobSub). |
VTS_01_0.cc.raw | This is the file that we will convert into SCC format. |
VTS_01_0.cc.srt | This is the Subrip file created from the captions. If you were after subtitles and had no interest in SCC files, you'd stop here. |
VTS_01_0.cc.unicode.srt | The Unicode-encoded version of VTS_01_0.cc.srt. The Closed Caption character set contains several unusual characters not usually encountered in subtitles, with the musical note character the most popular. These special characters cannot be stored in an ordinary ANSI text file and must use Unicode formatting. Unfortunately, the Subrip program and virtually every other program that uses subtitles can't open Unicode-formatted files. Therefore this file, while much more accurate than VTS_01_0.cc.srt, is virtually useless. |
Since VOBSUB2SCC is creating two files, it does not accept output file names as parameters. This means that the name of the file to be converted must end with ".cc.raw" (or ".sub.cc.raw", the output of the previous VobSub program). The following optional arguments can also be used with VOBSUB2SCC:
-oHH:MM:SS:FF: Timecode offset (in dropframe time base, written as HH:MM:SS;FF).
-fNFPS: Number of frames per second.
-td: Time base, n for non-dropframe or d for dropframe. The default is n. This flag will automatically set NFPS to 29.97, so don't mix the -f and -t arguments in the same command line.
-lN
For details of the formatting of the SCC file created by this process, see the RAW2SCC section above.
Extracting Closed Captions from MPEG Files: General Parser and ccExtract
Captions are stored within MPEG-2 video and system files. The specific manner in which they are stored varies based on the source of the file. MPEG-2 files ripped from DVDs use one format, and each of the different manufacturers of DVB (Digital Video Broadcast) recorders uses a different format.
The General Parser tool (written by Takaaki Oka) is designed to quickly search through an MPEG file to output any customized data you might be interested in. The download for SCC_TOOLS includes two files to add to General Parser to allow it to extract closed captions in the raw format from MPEG files with a variety of different sources. So far, here is the list of sources supported:
To install General Parser, download it from here and unzip to a directory (such as C:\Program Files), then create a shortcut to General Parser\bin\gp.exe. Open SCC_TOOLS.ZIP, put CCExtract.gp and CCExtract_VES.gp in General Parser\projects\MPEG\video, and put CCExtract.bdl in General Parser\source\MPEG.
Here is the procedure to extract closed captions from an MPEG (or .VRO) file:
If you are having problems trying to process large files, try increasing the numbers in the Pipeline and Virtual Machine tabs of the Options Dialog.
If you have a file from a source other than those listed at the top of this section, try to use CCExtract on it. If it works, let me know which source it was so I can add it to the list. If it doesn't work, you can help me to add the new source to this tool by the following:
... .txt file.
... .txt file to me with an explanation of the source you are using (see the bottom of this page for the address).
... -d1000 between dvbdump and the input file name to produce a 1000-row output.
Before Jeff Davies alerted me to the existence of General Parser, I had my own tool, SCC_RIP, which had to be one of the slowest programs on the planet. However, I imagine the day might come where nothing else works, so here's how to use this tool:
Run SCC_RIP with the name of the MPEG-2 file as the argument. The program will (eventually) output two files with the same base name as the MPEG-2 file: a raw file (extension .bin) and an .scc file. The following optional arguments can also be used with SCC_RIP:
-d
-1, -2, or -12: Which field to extract. -1 is the default. Option -12 will extract both fields; how many output files this produces depends on the raw format: broadcast format can only hold one field, so there will be four output files (input_1.bin, input_2.bin, input_1.scc and input_2.scc), while DVD format can hold both fields, so there will only be three output files (input.bin, input_1.scc and input_2.scc).
-oHH:MM:SS:FF: Timecode offset (in dropframe time base, written as HH:MM:SS;FF).
-fNFPS: Number of frames per second.
-td: Time base, n for non-dropframe or d for dropframe. The default is n. This flag will automatically set NFPS to 29.97, so don't mix the -f and -t arguments in the same command line.
Extracting Closed Captions from DVR-MS Files: DVR2SCC
Windows XP Media Center PCs record video in a proprietary format with the extension .dvr-ms. Tools have been developed to convert this format into MPEG, but in the process the closed captions which are included in the format are lost. The DVR2SCC tool will extract the captions from a Media Center file, both in raw format (.bin extension) and in SCC format. To use the tool, run it from a command prompt with the .dvr-ms file as the argument. Since there are two output files, DVR2SCC will not accept output filenames as parameters. The following optional arguments can also be used:
-d
-oHH:MM:SS:FF: Timecode offset (in dropframe time base, written as HH:MM:SS;FF).
-fNFPS: Number of frames per second.
-td: Time base, n for non-dropframe or d for dropframe. The default is n. This flag will automatically set NFPS to 29.97, so don't mix the -f and -t arguments in the same command line.
For details of the formatting of the SCC file created by this process, see the RAW2SCC section above.
Converting Subtitles into Closed Captions: SUBRIP2SCC
Occasionally it happens that you have your hands on a set of subtitles but no closed captions. If you still wish to use captions, you can use the SUBRIP2SCC tool to convert subtitles in Subrip format to SCC format (Subrip format is the native subtitle format produced by Zuggy and Brain's SubRip DVD subtitle ripper). Realize that the output from SUBRIP2SCC still needs to be massaged before it can be used (by using CCASDI to convert to CCD format and then editing). In particular:
... -a argument to adjust the CCD times to match the Subrip times. The only rows that should be off by more than half a second will be the ones SUBRIP2SCC told you it had to adjust.
... "*" character to represent the Eighth Note, and SUBRIP2SCC will automatically convert all asterisks to musical notes (as the asterisk is a mostly-unsupported character for captions).
... <b> and </b> tags) will be dropped, because closed captions are only in bold text. However, italics (<i></i>) and underline (<u></u>) are converted correctly.
... <font color="#rrggbb"></font>), with over sixteen million different colors possible. Closed Captions only support eight different colors (white, black, red, green, blue, cyan, magenta and yellow), so SubRip colors are converted to the nearest Closed Caption match.
SUBRIP2SCC accepts the name of a Subrip file (suffix .srt) as its required argument and optionally accepts another argument of the output file (assumed to be the input file with an .scc suffix if left out). The input file must end with a blank line, or SUBRIP2SCC will drop the last subtitle. The following additional arguments (placed between "SUBRIP2SCC" and the input file name) are all optional:
-2
-u
-k: ... "*" characters to eighth note characters. Since the asterisk is an extended character in closed captions (and therefore unreliably supported by software DVD players and other display devices), and because Subrip files cannot include Unicode characters like the eighth note, I'd recommend using asterisks only to represent musical notes.
-oHH:MM:SS,MIL: Timecode offset. A negative value like -o-00:01:00,000 is allowed.
-fNFPS: Number of frames per second.
-td: Time base, n for non-dropframe or d for dropframe. The default is n. This flag will automatically set NFPS to 29.97, so don't mix the -f and -t arguments in the same command line.
NOTE: There are dozens of subtitle formats out there that can be converted to Subrip format for use with this tool. If you are looking for a good converter for the common Sub-Station Alpha (SSA) format, I'd recommend Radek Strugalski's SubCreator. This is also a fine free tool for creating subtitles from scratch and timing against a video file.
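To illustrate the color conversion mentioned earlier in this section (SubRip colors mapped to the nearest of the eight closed caption colors), here is a minimal Python sketch. How SUBRIP2SCC actually measures "nearest" is not documented here, so the squared RGB distance and the pure-primary values assigned to the eight colors are assumptions made for this example, as is the function name.

```python
# Assumed RGB values for the eight closed caption colors (pure primaries).
CC_COLORS = {
    "white":   (255, 255, 255),
    "black":   (0, 0, 0),
    "red":     (255, 0, 0),
    "green":   (0, 255, 0),
    "blue":    (0, 0, 255),
    "cyan":    (0, 255, 255),
    "magenta": (255, 0, 255),
    "yellow":  (255, 255, 0),
}


def nearest_cc_color(hex_color):
    """Map a SubRip "#rrggbb" value to the closest closed caption color name."""
    digits = hex_color.lstrip("#")
    rgb = tuple(int(digits[i:i + 2], 16) for i in (0, 2, 4))

    def distance(name):
        # Squared Euclidean distance in RGB space (an assumed metric).
        return sum((a - b) ** 2 for a, b in zip(rgb, CC_COLORS[name]))

    return min(CC_COLORS, key=distance)


print(nearest_cc_color("#fefe10"))  # yellow
```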
Converting Between Closed Caption Formats: PAS2SCC
Although SCC is the accepted standard for storing closed captions, it is not the only such format in existence--every commercial captioning product out there uses its own format, although most of these tools will include the ability to export to SCC format. As I am presented with examples of these formats, I will develop tools to perform these conversions, for the use of those with odd-format captions but lacking the tools that created them.
The first such format I have received was created by the DOS version of CCWriter, with the extension .pas. The corresponding conversion tool is PAS2SCC. To use it, run from the command line with the name of the .pas file as the argument. The output will be in .scc format.
Here is an example of what CCWriter (DOS) format looks like:
;THE SINGING RIVER: RHYTHMS OF NATURE
;File: PART1.PAS
;created: 10/31/03
;transc. by: EVC
;captioned by: SRC
;air date: 11/13/03
;length: 58:55
,mbc \THE SINGING RIVER\ CLOSED CAPTIONED
+ 00:01:15F00 00:01:23F08
(\soft music and \birds singing\)
+ 00:01:40F18 00:01:43F27
,mbc THE RIVER IS LIKE AN ARTERY OF LIFE
+ 00:01:44F27 00:01:50F29
PAS2SCC will also accept the following optional arguments:
-2
-ohh:mm:ssFff: Timecode offset (with an F separating seconds and frames). A negative value like -o-00:01:00F00 is allowed.
-fNFPS: Number of frames per second.
-td: Time base, n for non-dropframe or d for dropframe. The default is n. This flag will automatically set NFPS to 29.97, so don't mix the -f and -t arguments in the same command line.
Converting SCC to a Readable Format: CCASDI
Closed captions are frequently different from the actual dialog spoken. Part of this is due to the relatively low bandwidth (less than ten words per second, closer to ten characters per second when you consider all of the commands that are sent before and after each caption), but in addition, a number of typographical errors tend to sneak in. There are a number of tools for editing closed captions and outputting SCC files, but none of these are free (the least expensive is Jorge Morones' Stream SubText). What is needed is a free alternative, a way to turn SCC files into a human-readable format and back, so SCC files can be created and edited. The tool for this job is CCASDI (the Closed Caption ASsembler and DIsassembler), and the human-readable format it uses is CCD, for Closed Caption Disassembly (this format is described in the next section). To use CCASDI, run it from the command prompt with the name of the file to convert (and optionally, the name of the file to create). The program will automatically detect the type of the file (SCC or CCD) and will output a file of the opposite type.
The following optional arguments can be used with CCASDI:
-a
-s: ... -a argument above). Note that if the captions are in the roll-up format (i.e. if they came from a news broadcast or any other live source), then each SCC line must have an entire caption, so you may need to adjust the -l parameter of RAW2SCC to get this conversion to work. The default output if you don't provide the output file argument is SubRip format, changing the input file extension of .scc to .srt. Underline, italics, and color formatting are converted, but not flash (which no subtitle format supports). Note that all captions should be in boldface, but I figured that most users would not want <b></b> tags around every single subtitle. By providing an output file argument with one of the following extensions, you can change the format of the subtitles created:
.srt: SubRip format (see above)
.sub: MicroDVD format (does not support colors)
.smi: SAMI format (supports all SubRip formatting)
.psb: PowerDivX format (does not support any formatting)
.ssa: Sub-Station Alpha format (supports all SubRip formatting)
.ass: Advanced Sub-Station format (supports all SubRip formatting)
.txt: Adobe Encore format (does not support any formatting)
-cCC1: ... -s option). The default is CC1, the first closed caption stream, which is the one used for DVDs, videotapes and almost all broadcast captions. Other choices are CC2, CC3, CC4, T1, T2, T3 and T4 ("T" refers to Text streams, which are very rarely used nowadays).
-oHH:MM:SS:FF: ... -td argument is used (in which case the timecode should be in HH:MM:SS;FF format).
-fNFPS: ... -o argument. NFPS can be between 15 and 60.
-td: Time base, non-dropframe (n) or dropframe (d). The default is n. Using this argument automatically sets Frames Per Second to 29.97.
The Closed Caption Disassembly Format
Here is the CCD (Closed Caption Disassembly) version of the sample SCC file given in the RAW2SCC section above:
SCC_disassembly V1.2
CHANNEL 1

01:02:53:14	{ENM}{ENM}{RCL}{RCL}{1520}{1520}{TO2}{TO2}( horn honking ){EDM}{EDM}{EOC}{EOC}
01:02:55:14	{EDM}{EDM}
01:03:27:29	{ENM}{ENM}{RCL}{RCL}{1504}{1504}HEY, THERE_{EDM}{EDM}{}{}{EOC}{EOC}
The first line consists of the header "SCC_disassembly V1.2". Versions 1.0 and 1.1 were produced by earlier versions of CCASDI (1.0 and 2.0), and there is a limited degree of backward-compatibility if you have an earlier-version CCD file to convert to SCC (CCD is an editing format, though, so you should use SCC as the more-stable storage format).
The second line gives the channel to display the captions in. Line 21 Closed Captions support Channels 1, 2, 3 and 4, allowing two data streams to be transmitted from each of two fields of Line 21 of the television signal (with a maximum bitrate of 60 characters per second allocated to each field). Nobody ever uses Channels 2 - 4 for DVD, and most PC DVD players are incapable of displaying captions from these channels, but in the broadcast world, Channel 2 is occasionally used to give the same captions in another language (Channels 3 and 4 are rarely used for alternate captions). Note that for backward-compatibility reasons, FIELD is also accepted as a synonym of CHANNEL.
The third line is blank.
The fourth and subsequent lines consist of a timecode, in SMPTE format (non-dropframe [HH:MM:SS:FF] or dropframe [HH:MM:SS;FF] format), followed by a tab, followed by the data. Each line usually consists of a single caption.
The data consists of codes (in curly braces) and characters. One code or two characters can be transmitted each frame. Codes are always transmitted in pairs, to handle problems with bad TV transmissions. Even for DVD, the codes are usually doubled. Closed caption decoders are designed to ignore the second code in an identical pair.
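The doubling rule can be illustrated with a short Python sketch that splits a CCD data string into codes and characters and drops the second code of each identical adjacent pair, much as a decoder would. The function name is made up for this example, and treating the empty code {} as padding that is never collapsed is an assumption based on comparing the SCC and CCD samples in this document.

```python
import re

# A token is either a code in curly braces or a run of plain characters.
TOKEN = re.compile(r"\{([^}]*)\}|([^{]+)")


def parse_ccd_data(data):
    """Return ("code", name) and ("text", chars) tokens with doubled codes collapsed."""
    tokens = []
    previous_code = None
    for match in TOKEN.finditer(data):
        code, text = match.group(1), match.group(2)
        if text is not None:
            tokens.append(("text", text))
            previous_code = None
        elif code and code == previous_code:
            previous_code = None       # second of an identical pair: ignored
        else:
            tokens.append(("code", code))
            previous_code = code       # {} (empty) is kept every time
    return tokens


line = "{ENM}{ENM}{RCL}{RCL}{1520}{1520}{TO2}{TO2}( horn honking ){EDM}{EDM}{EOC}{EOC}"
for kind, value in parse_ccd_data(line):
    print(kind, value)
```

A code sent three times in a row comes out twice, because only the immediate repeat of a code is ignored.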
Line 21 Closed Captions can include screens of Text, eXtended Data Service codes, Interactive TeleVision links, or Closed Captions. Closed Captions in turn can be displayed on four different channels and in one of three different formats. Luckily, only one specific combination is ever used in DVD's, videotapes and most programs broadcast on TV: Channel 1 Closed Captions in the Pop-On format. Pop-On Captions have the following format:
{ENM}{ENM}
{RCL}{RCL}
{####}{####}
Wh: This is the same as 00 (white text, no extra formatting).
WhU: White underlined text.
WhI: White italicized text.
WhIU: White italicized underlined text.
Gr: Green text.
GrU: Green underlined text.
Bl: Blue text.
BlU: Blue underlined text.
Cy: Cyan text.
CyU: Cyan underlined text.
R: Red text.
RU: Red underlined text.
Y: Yellow text.
YU: Yellow underlined text.
Ma: Magenta text.
MaU: Magenta underlined text.
{TO#}{TO#}
ABCDEFG
{MRC}{MRC}
{Wh}: Change to white text with no extra formatting.
{WhU}: Change to white underlined text.
{Gr}: Change to green text with no extra formatting.
{GrU}: Change to green underlined text.
{Bl}: Change to blue text with no extra formatting.
{BlU}: Change to blue underlined text.
{Cy}: Change to cyan text with no extra formatting.
{CyU}: Change to cyan underlined text.
{R}: Change to red text with no extra formatting.
{RU}: Change to red underlined text.
{Y}: Change to yellow text with no extra formatting.
{YU}: Change to yellow underlined text.
{Ma}: Change to magenta text with no extra formatting.
{MaU}: Change to magenta underlined text.
{Bk}: Change to black text with no extra formatting.
{BkU}: Change to black underlined text.
{I}: Change font style to italics (removing underline and flash).
{IU}: Change font style to italicized underline (removing flash).
{FON}: Change font style to flashing (without affecting italics or underline, if applied).
{}{}
{EDM}{EDM}
{EOC}{EOC}
... {EDM}{EOC} combination automatically clears the buffer.
Here is a page describing characters in CCD files in detail, and here is a page describing CCD codes in detail (documenting the text mode and the three different closed caption formats for channels T1 - T4, CC1 - CC4).
For those interested in eXtended Data Service and Interactive TeleVision, you can find descriptions of how they are implemented in the CCD format here and here. Note that XDS was not correctly implemented when it was first added to CCD format version 1.2, and will only work with versions 1.3 (and above).
There are a variety of reasons why you might want to adjust all of the timecodes in an SCC or CCD file by the same amount. One of the most common is the fact that the SCC files created by RAW2SCC arbitrarily assign a timecode of 00:00:00:00 to the frame when the raw capture started, and this is usually not the correct zero-point of the finished video. The tool for timecode adjustment is CCADJ.
To use CCADJ, run it from the command prompt with three arguments: "-o" followed by the timecode adjustment (e.g. -o01:00:00:00 to add an hour or -o-00:01:00:00 to subtract a minute), the name of the file to adjust, and the name of the adjusted file to create (this is the only tool where the output file name is not optional). If the files are not using the NTSC non-drop timebase and framerate, include either the -f or -t argument (-f followed by the number of frames per second for a different framerate, or -td to change the NTSC timebase from non-dropframe to dropframe). If you're using dropframe timecode, the timecode in the -o argument should be in the proper form (HH:MM:SS;FF).
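To make the arithmetic behind an offset concrete, here is a minimal Python sketch that shifts a non-dropframe timecode by a signed offset. It is not CCADJ's actual code: the function names are hypothetical, and it assumes the non-drop timebase counts 30 frames per timecode second (pass a different `fps` for other framerates).

```python
def timecode_to_frames(tc, fps=30):
    """Convert '[-]HH:MM:SS:FF' (non-dropframe) to a signed frame count."""
    sign = -1 if tc.startswith("-") else 1
    hh, mm, ss, ff = (int(part) for part in tc.lstrip("-").split(":"))
    return sign * (((hh * 60 + mm) * 60 + ss) * fps + ff)


def frames_to_timecode(frames, fps=30):
    """Convert a non-negative frame count back to 'HH:MM:SS:FF'."""
    if frames < 0:
        raise ValueError("adjusted timecode would be before 00:00:00:00")
    ss, ff = divmod(frames, fps)
    mm, ss = divmod(ss, 60)
    hh, mm = divmod(mm, 60)
    return f"{hh:02d}:{mm:02d}:{ss:02d}:{ff:02d}"


def shift_timecode(tc, offset, fps=30):
    """Apply an offset written the same way as the value after -o."""
    total = timecode_to_frames(tc, fps) + timecode_to_frames(offset, fps)
    return frames_to_timecode(total, fps)


print(shift_timecode("00:59:30:15", "01:00:00:00"))   # add an hour
print(shift_timecode("00:01:00:10", "-00:01:00:00"))  # subtract a minute
```

Working in whole frames avoids rounding problems; the sketch refuses to produce a negative result rather than guessing what should happen to captions that fall before the new zero-point.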
You can also use CCADJ to speed up or slow down captions: use the -m argument with a factor to multiply against all timecodes. For example, if you timed your captions against a PAL version of the movie, use -m1.199 to multiply all timecodes by 119.9%. If for some reason you wanted to perform the reverse procedure (say to convert NTSC into PAL subtitles using CCASDI -s), use -m0.834 to multiply all timecodes by 83.4%.
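The two factors above match the ratio between 29.97 and 25 frames per second, rounded to three decimals; that reading is my assumption about where the numbers come from, not something the tool documents. The Python sketch below checks the arithmetic and shows what multiplying a timecode by such a factor looks like. The function name `scale_timecode` is hypothetical, and rounding to the nearest frame is a choice made for this example, not necessarily what CCADJ does.

```python
def scale_timecode(tc, factor, fps=30):
    """Multiply a non-dropframe 'HH:MM:SS:FF' timecode by a factor."""
    hh, mm, ss, ff = (int(part) for part in tc.split(":"))
    frames = ((hh * 60 + mm) * 60 + ss) * fps + ff
    scaled = round(frames * factor)  # nearest whole frame
    ss, ff = divmod(scaled, fps)
    mm, ss = divmod(ss, 60)
    hh, mm = divmod(mm, 60)
    return f"{hh:02d}:{mm:02d}:{ss:02d}:{ff:02d}"


# The ratios of the two framerates, rounded to three decimals.
print(round(29.97 / 25, 3))  # 1.199
print(round(25 / 29.97, 3))  # 0.834

# A caption at the ten-minute mark, stretched and then compressed.
print(scale_timecode("00:10:00:00", 1.199))  # 00:11:59:12
print(scale_timecode("00:10:00:00", 0.834))  # 00:08:20:12
```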
One possible use of CCADJ is to combine multiple closed caption files into a single file. Use CCADJ for each of the files after the first one to move the timecodes into the correct ranges, then cut and paste in a text editor to combine the contents of the resulting files.
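If you would rather script the cut-and-paste step, the following Python sketch shows one way it could be done for two SCC files held in memory. It assumes the usual SCC layout (a "Scenarist_SCC V1.0" header line, then caption lines made of a timecode, a tab, and the hex data, separated by blank lines) and non-dropframe timecodes at 30 frames per second. The function names are hypothetical, and this sketch does its own shifting instead of calling CCADJ.

```python
def shift_scc_line(line, offset_frames, fps=30):
    """Shift one 'HH:MM:SS:FF<TAB>data' line; other lines pass through."""
    timecode, tab, data = line.partition("\t")
    parts = timecode.split(":")
    if not tab or len(parts) != 4 or not all(p.isdigit() for p in parts):
        return line
    hh, mm, ss, ff = map(int, parts)
    frames = ((hh * 60 + mm) * 60 + ss) * fps + ff + offset_frames
    if frames < 0:
        raise ValueError("offset moves a caption before 00:00:00:00")
    ss, ff = divmod(frames, fps)
    mm, ss = divmod(ss, 60)
    hh, mm = divmod(mm, 60)
    return f"{hh:02d}:{mm:02d}:{ss:02d}:{ff:02d}\t{data}"


def combine_scc(first, second, offset_frames):
    """Append the captions of `second`, shifted, to the text of `first`."""
    combined = first.rstrip("\n").split("\n")
    for line in second.split("\n"):
        # Skip blank lines and the second file's header.
        if not line.strip() or line.startswith("Scenarist_SCC"):
            continue
        combined += ["", shift_scc_line(line, offset_frames)]
    return "\n".join(combined) + "\n"
```

For example, `combine_scc(part1_text, part2_text, 30 * 60 * 45)` would place the second file's captions 45 minutes after the start of the first, where `part1_text` and `part2_text` are the file contents read as strings.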
Inserting Captions into MPEG Files: CC_MUX
Once you have extracted and manipulated your captions (or created them from scratch), you need some way to display them. If your target format is DVD and you own one of the three high-end DVD authoring tools (Sonic Scenarist, the discontinued Spruce DVDMaestro or Apple DVDStudioPro), you can supply the captions in SCC format and they will automatically be inserted in the final product. If you have some other DVD authoring tool, or if your target is a DVB machine, then you need to mux the captions into the MPEG file you're working with. The tool for this job is CC_MUX.
First, you need to have an MPEG file with any captions removed. The tool ReStream by shh (download from Doom 9) can show you if the MPEG file still has user data in it and can strip all user data from an MPEG file ("user data" is what closed captions are classified as). Chances are that if you did any re-encoding of the file (and that includes cutting on a non-GOP boundary), the user data has already been stripped.
CC_MUX is the most complicated tool in the SCC_TOOLS package (and next to SCC_RIP, one of the slowest). Here are the arguments, which you can also see by running CC_MUX by itself in a command prompt:
-cDVD
: Caption format for the output. The default is DVD, but RTV4 and RTV5 are also possible, for ReplayTV 4000 and 5000. Dish Network format is not yet supported because it is so complex (I can just figure out how to read it, but it has odd counters and checksums in it that I do not know how to reproduce).
-1caption.scc
: Field 1 captions, in SCC or raw format (.bin)--if it's SCC, then temporary raw-format captions will be created ("temporary" in this case meaning that they will be left to clutter your hard drive).
-2xds.scc
: Field 2 captions. See -1 for more details on possible formats.
-12dvd.bin
-oHH:MM:SS:FF
-fNFPS
-td
: Timebase, either non-dropframe (n) or dropframe (d). If not provided, it will be taken from the MPEG file, and if not found there, it will default to n. Using this argument automatically sets Frames Per Second to 29.97.
infile.m2v
: Input MPEG file, which should be elementary video (.m2v). Program files will probably not work as they depend on constant packet sizes, and adding captions will throw that off. Demux to elementary video and audio files, use CC_MUX on the video file, and re-mux the resulting video file together with the audio file to get a working program file.
outfile.m2v
: Output file name (by default, "infile.m2v" will output "infile_out.m2v").
Turning back to DVD authoring with something other than a high-end tool: after you use CC_MUX to add the captions to the MPEG file, bring the output file into the authoring tool, and use the tool to create the DVD files, you need to manipulate the VTS_XX_0.IFO file to turn on a flag that tells the DVD player that captions are present. To do this in IfoEdit (a tool you can download from Doom 9), open the file, double-click the Video line in the VTS overview - Title Set (Movie) attributes and check one or both boxes under "CC for Line 21".
Converting SCC Format to Raw Format: SCC2RAW
The main purpose of this tool right now is to convert SCC files into the DVD raw format that CCParser requires. The tool will also convert SCC into broadcast raw format.
To use SCC2RAW, run it from the command prompt with the name of the SCC file to convert, optionally followed by the name of the binary file to create (by default, the output file will have the same name as the SCC file with the suffix changed to .bin). To output in DVD format, include the additional argument "-d". DVD format can include both fields, so you can include the argument "-2" to use Field 2 instead of the default Field 1, or "-12" to use both fields. If you want the output to start at some point other than 00:00:00:00 relative to the timestamps in the input files, include the argument "-o" followed by the offset (this can be negative)--this is one way to deal with multi-part VCDs after the first part (assuming that raw closed caption format works with VCDs, which it apparently doesn't). If the files are not using the NTSC non-drop timebase and framerate, include either the -f or -t argument (-f followed by the number of frames per second for a different framerate, or -td to change the NTSC timebase from non-dropframe to dropframe).
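For anyone unsure what the difference between the two NTSC timebases amounts to, here is a small Python sketch that labels the same frame number both ways. It follows the standard dropframe rule (frame numbers 00 and 01 are skipped at the start of every minute except each tenth minute) and is only an illustration with hypothetical function names, not code taken from SCC2RAW.

```python
def frames_to_nondrop(n):
    """Label frame number n as non-dropframe 'HH:MM:SS:FF' (30 per second)."""
    ss, ff = divmod(n, 30)
    mm, ss = divmod(ss, 60)
    hh, mm = divmod(mm, 60)
    return f"{hh:02d}:{mm:02d}:{ss:02d}:{ff:02d}"


def frames_to_dropframe(n):
    """Label frame number n as dropframe 'HH:MM:SS;FF'."""
    # 17982 real frames fit in ten minutes of dropframe timecode.
    tens, rest = divmod(n, 17982)
    n += 18 * tens  # 9 minutes out of every 10 skip two labels each
    if rest > 1:
        n += 2 * ((rest - 2) // 1798)  # skipped labels within this block
    ss, ff = divmod(n, 30)
    mm, ss = divmod(ss, 60)
    hh, mm = divmod(mm, 60)
    return f"{hh:02d}:{mm:02d}:{ss:02d};{ff:02d}"


# One hour of real time at 29.97 frames per second is 107892 frames.
print(frames_to_nondrop(107892))    # 00:59:56:12
print(frames_to_dropframe(107892))  # 01:00:00;00
```

The example shows why the choice matters over a long program: after an hour, the non-dropframe label is about three and a half seconds behind the clock, while the dropframe label stays in step with it.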