This page covers what you need to know to add closed captions during the DVD authoring process.
An Introduction to Closed Captions
Line 21 Closed Captions is the system used by North American television stations to encode information useful to the deaf and the hard of hearing in a format that can be turned on or off by the viewer (a page on the Teletext Then and Now site shows what this actually looks like, for those of you from PAL or SECAM-broadcasting countries). There are a handful of alternate formats for this purpose used by TV broadcasters in other parts of the world, but only Line 21 Closed Captions are supported for DVD's, so all non-Region 1 discs claiming to include "Captions for the Deaf and Hard of Hearing" actually use subtitles instead (the short difference between subtitles and closed captions: you turn subtitles on and off with your DVD remote, and you turn closed captions on and off with your TV remote). The following explanation is derived from the Closed Caption FAQ, maintained by Paul Robson, which does an excellent job of explaining what Line 21 Closed Captions are and how they work in a broadcast setting.
The mechanism used for Line 21 Closed Captions allows the viewer to choose between a maximum of four different "channels" of simultaneous captions, plus four more "channels" of non-program related text. In the years since the introduction of this system, it was discovered that channels CC1, CC2, and T1 (the first and second closed-caption channels and the first text channel) were the only ones broadcasters ever used, so alternate uses were found for two of the remaining channels. Channel T2 is now used to transmit Interactive TV (ITV) signals, which are used by MSN-TV to transmit the internet links for their service. Channel CC3 is now used to transmit the eXtended Data Service (XDS). XDS contains a wide variety of information, but the two portions most commonly used are the time of day signal which newer VCR's use to program their clocks, and the rating signal which is used to control what content children are allowed to watch via the "V-Chip"s in newer TV's.
Line 21 Closed Captions are transmitted on the last odd and even lines in the Vertical Blanking Interval (VBI), the non-visible part of the TV signal used mostly for calibration purposes. If you adjust the vertical hold on a North American television set, you should be able to see one or two lines above the normal "top" of the screen, each made up of sixteen rapidly-blinking segments. These are Fields 1 and 2 of Scanline 21. Each segment of each line is used as a bit to build up a total of four eight-bit bytes, two bytes in the odd field and two bytes in the even field. Field 1 is used to transmit channels CC1, CC2, T1 and T2 (ITV), while Field 2 is used to transmit channels CC3 (XDS), CC4, T3 and T4.
Closed Captions on Videotapes and DVD's
One of the major benefits of the Line 21 Closed Caption system is that it is automatically recorded with the program when taped by a VCR and can then be displayed on playback. Since Digital Versatile Discs only store the visible portion of the video signal, an alternate method had to be found in order to transmit Closed Captions and their related services, especially since there is a legal requirement in the United States to provide Closed Captions on every movie sold in the country. For DVD's, this data is muxed into the MPEG elementary video files in the form of a special user data packet inside each GOP. As far as I know, every DVD authoring program that supports Closed Captions (including Scenarist and Maestro) imports them as one or two text files (one for Field 1, the other for Field 2) containing the raw hexadecimal data rather than expecting them to already be muxed into the video source files. I have never heard of a DVD that stored anything but closed captions in the user data packets (the DVD specification includes a superior alternative to the XDS ratings packet, PCFriendly is superior to ITV, and of course XDS time of day is useless on a DVD), so the rest of this discussion will focus on the Field 1 data and channels CC1 and CC2.
Closed Caption Requirements
The following are not required, but are followed by all Closed Captions I've seen either broadcast or on DVD's:
SCC Format
Both Sonic Scenarist and Spruce Maestro use the Scenarist Closed Caption format (extension .SCC) to import closed caption data. Here is an example:
Scenarist_SCC V1.0

01:02:53:14	94ae 94ae 9420 9420 947a 947a 97a2 97a2 a820 68ef f26e 2068 ef6e 6be9 6e67 2029 942c 942c 8080 8080 942f 942f

01:02:55:14	942c 942c

01:03:27:29	94ae 94ae 9420 9420 94f2 94f2 c845 d92c 2054 c845 5245 ae80 942c 942c 8080 8080 942f 942f
The file is double-spaced, with data lines alternating with blank lines. The first line identifies the format and version--it needs to be exactly like this. The third and subsequent alternating lines start with the timecode and are followed by the data.
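The file layout described above can be read with a few lines of Python. This is only a minimal sketch: the function name `parse_scc` is made up for illustration, and it assumes that any run of whitespace separates the timecode from the data words.

```python
def parse_scc(text):
    """Split SCC text into (timecode, [hex words]) pairs.

    parse_scc is a hypothetical helper name, not part of any tool.
    """
    lines = text.splitlines()
    # The first line must identify the format and version exactly.
    if not lines or lines[0].strip() != "Scenarist_SCC V1.0":
        raise ValueError("missing 'Scenarist_SCC V1.0' header line")
    captions = []
    for line in lines[1:]:
        parts = line.split()
        if not parts:
            continue  # blank line between data lines
        captions.append((parts[0], parts[1:]))
    return captions


sample = "Scenarist_SCC V1.0\n\n01:02:55:14\t942c 942c\n\n"
print(parse_scc(sample))
```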
The timecode is in SMPTE format, which is either
hours:minutes:seconds:frames
for non-dropframe timebase or
hours:minutes:seconds;frames
for dropframe timebase. Both describe video running at 29.97 frames per second, and both count 30 frame numbers per timecode second. Non-dropframe timebase never skips a number, so its clock gradually falls behind real time. Dropframe timebase stays in step with real time by skipping the first two frame numbers of each minute for nine out of every ten minutes (only the numbers are skipped; no video frames are discarded). Use the same format you encoded your video with. Here's a hint: if it came from a broadcast source, it's probably dropframe, while if you created it from scratch, it's probably non-dropframe.
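To make the difference between the two timebases concrete, here is a rough sketch that converts either kind of timecode into a count of video frames since 00:00:00:00. The function name `timecode_to_frames` is an assumption for illustration, and the dropframe branch uses the usual rule of two skipped frame numbers per minute except in every tenth minute.

```python
def timecode_to_frames(timecode):
    """Return the number of frames elapsed at the given SMPTE timecode.

    A semicolon before the frame number marks dropframe timebase.
    timecode_to_frames is a hypothetical helper name.
    """
    dropframe = ";" in timecode
    hours, minutes, seconds, frames = (
        int(part) for part in timecode.replace(";", ":").split(":")
    )
    # Both timebases count 30 frame numbers per timecode second.
    total = (hours * 3600 + minutes * 60 + seconds) * 30 + frames
    if dropframe:
        # Two frame numbers are skipped at the start of every minute
        # except minutes 0, 10, 20, ... so subtract the skipped numbers.
        total_minutes = hours * 60 + minutes
        total -= 2 * (total_minutes - total_minutes // 10)
    return total


print(timecode_to_frames("00:01:00:00"))  # non-dropframe
print(timecode_to_frames("00:01:00;02"))  # dropframe label for the same frame
```

Both calls refer to frame 1800: in dropframe timebase the labels 00:01:00;00 and 00:01:00;01 do not exist, so the frame after 00:00:59;29 is 00:01:00;02.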
The data is made up of two-byte hexadecimal words, separated from each other by spaces and from the timecode by a tab character. The data uses only seven out of every eight bits of each byte, with the high bit used to satisfy odd parity--adding up all the bits has to result in an odd number, or the closed caption decoder will reject the byte as corrupt data. The major exception is ITV, which not only doesn't enforce odd parity, it also uses a slightly different character set than captions, text or XDS.
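The odd-parity rule is easy to check in code. The sketch below (helper names are invented for illustration) counts the set bits in each byte of a word and reports whether a decoder following the rule above would accept it.

```python
def has_odd_parity(byte):
    """True if the 8-bit value contains an odd number of 1 bits."""
    return bin(byte).count("1") % 2 == 1


def word_is_valid(word):
    """Check both bytes of a four-digit hex word such as '94ae'."""
    value = int(word, 16)
    return has_odd_parity(value >> 8) and has_odd_parity(value & 0xFF)


print(word_is_valid("94ae"))  # both bytes carry an odd number of 1 bits
print(word_is_valid("1420"))  # same 7-bit values, but parity bit not set
```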
Deciphering the bytes
The full requirements for Closed Captions are contained in EIA/CEA standard 608-B (there is also a 708-B standard for high-definition TV captions, but that is beyond the scope of this document). CEA 608 can be purchased from IHS Global for $170, but luckily, the requirements are available for free in the Code of Federal Regulations, which can be obtained in PDF format from the Government Printing Office (just click "Browse" on the screen that comes up). Specifically, the requirements are contained in 47 CFR 15.119: Title 47 covers the Federal Communications Commission, Part 15 covers broadcasting in radio frequencies (including television), and 119 is the specific section for analog closed caption decoder requirements. The main adjustment you need to make to these requirements is for the odd parity: 00h (binary 00000000) is translated to 80h (10000000), but 07h (00000111) is left alone.
Here is a translation matrix to turn a 7-bit hexadecimal number into the equivalent odd-parity 8-bit number (sixteen values per line, starting from 00h):
80, 01, 02, 83, 04, 85, 86, 07, 08, 89, 8a, 0b, 8c, 0d, 0e, 8f,
10, 91, 92, 13, 94, 15, 16, 97, 98, 19, 1a, 9b, 1c, 9d, 9e, 1f,
20, a1, a2, 23, a4, 25, 26, a7, a8, 29, 2a, ab, 2c, ad, ae, 2f,
b0, 31, 32, b3, 34, b5, b6, 37, 38, b9, ba, 3b, bc, 3d, 3e, bf,
40, c1, c2, 43, c4, 45, 46, c7, c8, 49, 4a, cb, 4c, cd, ce, 4f,
d0, 51, 52, d3, 54, d5, d6, 57, 58, d9, da, 5b, dc, 5d, 5e, df,
e0, 61, 62, e3, 64, e5, e6, 67, 68, e9, ea, 6b, ec, 6d, 6e, ef,
70, f1, f2, 73, f4, 75, 76, f7, f8, 79, 7a, fb, 7c, fd, fe, 7f
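The matrix does not have to be typed in by hand. As a small sketch (the name `add_odd_parity` is invented here), the same 128 values can be generated by setting the high bit whenever the 7-bit value has an even number of 1 bits:

```python
def add_odd_parity(value):
    """Return the 8-bit odd-parity form of a 7-bit value."""
    if not 0 <= value <= 0x7F:
        raise ValueError("expected a 7-bit value")
    ones = bin(value).count("1")
    # An even count of 1 bits needs the high bit set to become odd.
    return value if ones % 2 == 1 else value | 0x80


parity_table = [format(add_odd_parity(v), "02x") for v in range(128)]
print(", ".join(parity_table[:16]))
```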
As explained in the Closed Caption FAQ, there are three different types of closed captions: roll-up, paint-on, and pop-on. The only one of these used in DVD's is pop-on. The requirements also cover using CC1 and CC2 to put two different closed caption channels on the DVD, but none of the software DVD players can support CC2, so I'll only explain how to create pop-on captions for channel CC1.
Format of Pop-on Captions
Pop-on captions have a set format, as described below, made up of commands (always 2-byte words) and characters (usually single bytes). If the caption is to be broadcast, each command is doubled up for redundancy in case the signal is garbled in transmission (garbled data is usually displayed as character 7f, the solid block). The decoder is programmed to ignore a second command when it is the same as the first. When writing captions for a DVD, you can choose whether you wish to double or not (if you look at the sample towards the top of the page, you will see a lot of doubling).
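The split between commands and characters, and the way a doubled command is ignored, can be sketched roughly as follows. Several things here are assumptions for illustration: the function name `decode_words`, the test that a word is a command when its first byte (with the parity bit removed) falls between 10h and 1fh, and the shortcut of treating character bytes as plain ASCII, which is only an approximation of the caption character set.

```python
def decode_words(words):
    """Separate SCC hex words into a command list and a text string."""
    commands, chars = [], []
    last_command = None
    for word in words:
        word = word.lower()
        value = int(word, 16)
        # Drop the parity bit from each byte before interpreting it.
        high, low = (value >> 8) & 0x7F, value & 0x7F
        if 0x10 <= high <= 0x1F:
            if word == last_command:
                last_command = None  # redundant second copy: ignore it
                continue
            commands.append(word)
            last_command = word
        else:
            last_command = None
            # Bytes below 20h (such as the 80 filler) are not printable.
            chars.extend(chr(b) for b in (high, low) if b >= 0x20)
    return commands, "".join(chars)


line = ("94ae 94ae 9420 9420 947a 947a 97a2 97a2 a820 68ef f26e 2068 "
        "ef6e 6be9 6e67 2029 942c 942c 8080 8080 942f 942f")
print(decode_words(line.split()))
```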
Column 0 (can set color and underline):
Row: | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
High Byte: | 91 | 91 | 92 | 92 | 15 | 15 | 16 | 16 | 97 | 97 | 10 | 13 | 13 | 94 | 94 |
Low Byte by Column: | |||||||||||||||
0 (white) | d0 | 70 | d0 | 70 | d0 | 70 | d0 | 70 | d0 | 70 | d0 | d0 | 70 | d0 | 70 |
0 (white) underline | 51 | f1 | 51 | f1 | 51 | f1 | 51 | f1 | 51 | f1 | 51 | 51 | f1 | 51 | f1 |
0 green | c2 | 62 | c2 | 62 | c2 | 62 | c2 | 62 | c2 | 62 | c2 | c2 | 62 | c2 | 62 |
0 green underline | 43 | e3 | 43 | e3 | 43 | e3 | 43 | e3 | 43 | e3 | 43 | 43 | e3 | 43 | e3 |
0 blue | c4 | 64 | c4 | 64 | c4 | 64 | c4 | 64 | c4 | 64 | c4 | c4 | 64 | c4 | 64 |
0 blue underline | 45 | e5 | 45 | e5 | 45 | e5 | 45 | e5 | 45 | e5 | 45 | 45 | e5 | 45 | e5 |
0 cyan | 46 | e6 | 46 | e6 | 46 | e6 | 46 | e6 | 46 | e6 | 46 | 46 | e6 | 46 | e6 |
0 cyan underline | c7 | 67 | c7 | 67 | c7 | 67 | c7 | 67 | c7 | 67 | c7 | c7 | 67 | c7 | 67 |
0 red | c8 | 68 | c8 | 68 | c8 | 68 | c8 | 68 | c8 | 68 | c8 | c8 | 68 | c8 | 68 |
0 red underline | 49 | e9 | 49 | e9 | 49 | e9 | 49 | e9 | 49 | e9 | 49 | 49 | e9 | 49 | e9 |
0 yellow | 4a | ea | 4a | ea | 4a | ea | 4a | ea | 4a | ea | 4a | 4a | ea | 4a | ea |
0 yellow underline | cb | 6b | cb | 6b | cb | 6b | cb | 6b | cb | 6b | cb | cb | 6b | cb | 6b |
0 magenta | 4c | ec | 4c | ec | 4c | ec | 4c | ec | 4c | ec | 4c | 4c | ec | 4c | ec |
0 magenta underline | cd | 6d | cd | 6d | cd | 6d | cd | 6d | cd | 6d | cd | cd | 6d | cd | 6d |
Columns 4 - 28 (color white, can set underline)
Row: | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
High Byte: | 91 | 91 | 92 | 92 | 15 | 15 | 16 | 16 | 97 | 97 | 10 | 13 | 13 | 94 | 94 |
Low Byte by Column: | |||||||||||||||
4 | 52 | f2 | 52 | f2 | 52 | f2 | 52 | f2 | 52 | f2 | 52 | 52 | f2 | 52 | f2 |
4 underline | d3 | 73 | d3 | 73 | d3 | 73 | d3 | 73 | d3 | 73 | d3 | d3 | 73 | d3 | 73 |
8 | 54 | f4 | 54 | f4 | 54 | f4 | 54 | f4 | 54 | f4 | 54 | 54 | f4 | 54 | f4 |
8 underline | d5 | 75 | d5 | 75 | d5 | 75 | d5 | 75 | d5 | 75 | d5 | d5 | 75 | d5 | 75 |
12 | d6 | 76 | d6 | 76 | d6 | 76 | d6 | 76 | d6 | 76 | d6 | d6 | 76 | d6 | 76 |
12 underline | 57 | f7 | 57 | f7 | 57 | f7 | 57 | f7 | 57 | f7 | 57 | 57 | f7 | 57 | f7 |
16 | 58 | f8 | 58 | f8 | 58 | f8 | 58 | f8 | 58 | f8 | 58 | 58 | f8 | 58 | f8 |
16 underline | d9 | 79 | d9 | 79 | d9 | 79 | d9 | 79 | d9 | 79 | d9 | d9 | 79 | d9 | 79 |
20 | da | 7a | da | 7a | da | 7a | da | 7a | da | 7a | da | da | 7a | da | 7a |
20 underline | 5b | fb | 5b | fb | 5b | fb | 5b | fb | 5b | fb | 5b | 5b | fb | 5b | fb |
24 | dc | 7c | dc | 7c | dc | 7c | dc | 7c | dc | 7c | dc | dc | 7c | dc | 7c |
24 underline | 5d | fd | 5d | fd | 5d | fd | 5d | fd | 5d | fd | 5d | 5d | fd | 5d | fd |
28 | 5e | fe | 5e | fe | 5e | fe | 5e | fe | 5e | fe | 5e | 5e | fe | 5e | fe |
28 underline | df | 7f | df | 7f | df | 7f | df | 7f | df | 7f | df | df | 7f | df | 7f |
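The white-text entries of the two positioning tables above follow a regular pattern, so they can be computed instead of looked up. The sketch below is derived only from those table values: the names `ROW_HIGH`, `SECOND_OF_PAIR` and `position_code` are invented for illustration, and colors other than white are not handled.

```python
# High byte for rows 1-15, with the parity bit removed (from the tables).
ROW_HIGH = [0x11, 0x11, 0x12, 0x12, 0x15, 0x15, 0x16, 0x16,
            0x17, 0x17, 0x10, 0x13, 0x13, 0x14, 0x14]
# Rows whose low byte comes from the second table column (70, f2, ...).
SECOND_OF_PAIR = {2, 4, 6, 8, 10, 13, 15}


def odd_parity(value):
    """Set the high bit when the 7-bit value has an even number of 1 bits."""
    return value if bin(value).count("1") % 2 == 1 else value | 0x80


def position_code(row, column, underline=False):
    """Return the hex word that moves the cursor to row/column in white."""
    if not 1 <= row <= 15:
        raise ValueError("row must be 1-15")
    if column not in range(0, 29, 4):
        raise ValueError("column must be a multiple of 4 from 0 to 28")
    low = 0x50 + (column // 4) * 2 + (1 if underline else 0)
    if row in SECOND_OF_PAIR:
        low += 0x20
    return "%02x%02x" % (odd_parity(ROW_HIGH[row - 1]), odd_parity(low))


print(position_code(15, 20))  # matches the row 15, column 20 table entry
```

Columns that are not a multiple of four are reached by following one of these codes with a separate command that moves the cursor one to three columns to the right.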
Code | Meaning |
---|---|
9120 | change to white, no formatting |
91a1 | change to white underline |
91a2 | change to green, no formatting |
9123 | change to green underline |
91a4 | change to blue, no formatting |
9125 | change to blue underline |
9126 | change to cyan, no formatting |
91a7 | change to cyan underline |
91a8 | change to red, no formatting |
9129 | change to red underline |
912a | change to yellow, no formatting |
91ab | change to yellow underline |
912c | change to magenta, no formatting |
91ad | change to magenta underline |
91ae | turn on italics |
912f | turn on italics and underline |
94a8 | turn flash on |
As an example, here is the sample .SCC file from above, followed by its meaning:
Scenarist_SCC V1.0

01:02:53:14	94ae 94ae 9420 9420 947a 947a 97a2 97a2 a820 68ef f26e 2068 ef6e 6be9 6e67 2029 942c 942c 8080 8080 942f 942f

01:02:55:14	942c 942c

01:03:27:29	94ae 94ae 9420 9420 94f2 94f2 c845 d92c 2054 c845 5245 ae80 942c 942c 8080 8080 942f 942f
Clear the off-screen caption buffer (94ae 94ae); start pop-on caption (9420 9420); move cursor to row 15, column 20 (947a 947a); move over 2 more columns to column 22 (97a2 97a2); display "( horn honking )" (a820 68ef f26e 2068 ef6e 6be9 6e67 2029); clear screen (942c 942c); wait 2 frames (8080 8080); and display caption (942f 942f).
Clear screen (942c 942c).
Clear the off-screen caption buffer (94ae 94ae); start pop-on caption (9420 9420); move cursor to row 15, column 4 (94f2 94f2); display "HEY, THERE." (c845 d92c 2054 c845 5245 ae80--note the 80 used as a spacer); clear screen (942c 942c); wait 2 frames (8080 8080); and display caption (942f 942f).
A Technical Explanation of Placement and Format of DVD Closed Caption User Data Packets
Data in MPEG files is organized in terms of packets. DVD closed captions are stored on a per-GOP basis, and are located within the video MPEG-2 file between the GOP Header packet and the (I-frame) Picture Header packet.
Structure of the DVD Closed Caption User Data Packet (all values are in hexadecimal):
Bytes | Sample Contents | Description |
---|---|---|
HEADER (9 bytes) | | |
0 - 3 | 00 00 01 b2 | User Data Packet header (never changes). |
4 - 7 | 43 43 01 f8 | DVD Closed Caption header (never changes). |
8 | 9b | Attributes: |
CAPTION SEGMENT (6 bytes)--repeat for each frame of GOP | | |
n | ff | Field (ff = Field 1, fe = Field 2) |
n+1 - n+2 | 94 a3 | Caption: Two bytes that are transmitted this field. Use 80 80 if there's nothing to transmit. |
n+3 | fe | Field (always opposite value from above) |
n+4 - n+5 | 01 83 | Caption (see above) |
EXTRA FIELD (3 bytes)--only if Extra Field Flag is set | | |
m | ff | Field (ff = Field 1, fe = Field 2) |
m+1 - m+2 | 94 a3 | Caption: Two bytes that are transmitted this field. Use 80 80 if there's nothing to transmit. |
FOOTER | | |
- x | 00 00 00 00 00 00 | Padding (repeat 00 byte until packet is evenly divisible by 4) |
Note that some DVD's create a fixed 96-byte closed caption packet size as described above (by using padding for GOP's below 15 frames and the Truncate Flag for 15-frame GOP's), but many DVD's do not do this, and the DVD's created by Sonic Scenarist and Spruce DVDMaestro never do this. In these cases, the Extra Field Flag is always 0, the Pattern Flag is always 1 (Field 1 followed by Field 2), and no padding is used at the end of the data packet.
Another item to note is the variation of this format used by a number of MPEG-capturing devices, including Hauppauge's WinTV-250 card and Panasonic's DMR-H50S tabletop DVD recorder (in DVR mode). These devices use ff as the flag for both fields' caption data, relying on the Pattern Flag to tell the fields apart.
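As a rough illustration of the packet layout in the table above, the following sketch walks a user data packet and lists the caption bytes it carries. The function name `parse_cc_packet` is invented, the attributes byte is returned without being decoded because its bit layout is not spelled out here, and a packet from one of the capture devices that use ff for both fields would have every entry reported as Field 1.

```python
# Fixed first eight bytes: user data header plus DVD closed caption header.
CC_HEADER = bytes.fromhex("000001b2434301f8")


def parse_cc_packet(packet):
    """Return (attributes byte, [(field number, hex caption bytes), ...])."""
    if packet[:8] != CC_HEADER:
        raise ValueError("not a DVD closed caption user data packet")
    attributes = packet[8]  # left undecoded in this sketch
    fields = []
    position = 9
    # Each entry is a field marker (ff or fe) followed by two caption bytes;
    # anything else, such as 00 padding, ends the list.
    while position + 3 <= len(packet) and packet[position] in (0xFF, 0xFE):
        field = 1 if packet[position] == 0xFF else 2
        fields.append((field, packet[position + 1:position + 3].hex()))
        position += 3
    return attributes, fields


sample = bytes.fromhex("000001b2434301f8" "9b" "ff94a3" "fe0183" "0000")
print(parse_cc_packet(sample))
```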