
Subtitle insertion

GEP100 - HEP100

Inserting subtitles and Teletext with the HSI11 module (3Gb/s, HD, SD embedded domain VBI/VANC data inserter)

A product application note

Quad speed · Upgradable to 3Gb/s · Embedded Metadata S2020

COPYRIGHT © 2011-2012 AXON DIGITAL DESIGN B.V. ALL RIGHTS RESERVED. NO PART OF THIS DOCUMENT MAY BE REPRODUCED IN ANY FORM WITHOUT THE PERMISSION OF AXON DIGITAL DESIGN B.V.


INSERTING SUBTITLES

Introduction

Sub·ti·tle [súb tit’l]: noun, verb, -tled, -tling 1. captions for a foreign-language film: a printed translation of the dialogue in a foreign-language film, usually appearing at the bottom of the screen. 2. printed words for the hearing-impaired: the printed text of what is being said in a television program, provided for the hearing-impaired and usually shown at the bottom of the screen.

Subtitling of program content is increasingly expected by viewers: those with a hearing impairment want to be able to appreciate the audio content, and media sourced from a different country may not be produced in the language of the audience and therefore requires translation. Not all viewers want to see the subtitles, so these are commonly sent as data and are converted into text and inserted into the picture by the viewer’s receiving equipment.

HD and SD video streams are formed of a series of pictures, transmitted one after another. Each picture is formed of a series of lines, some of which are not used to carry picture information. In SD signals this region is called the Vertical Blanking Interval (VBI) and was originally used to provide a period of black corresponding to the time taken for the electron beam to be deflected back to the top of the CRT screen after the previous picture had been displayed. HD signals reserve a similar period in the video stream to carry data, called VANC (Vertical Ancillary Data). Both can be used to convey subtitle data from the broadcaster to the viewer.
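To make the VANC data-carriage idea concrete, the sketch below builds a generic ancillary data packet along the lines of SMPTE 291: an ancillary data flag, DID/SDID identifiers, a data count, the payload words and a checksum, each protected with parity bits. The DID/SDID pair 0x43/0x02 used in the example is the one commonly associated with OP-47 subtitling packets, but the values and the layout here are illustrative rather than a conformant implementation.

```python
def add_parity(value):
    # 10-bit ANC word: b8 = even parity of b0..b7, b9 = NOT b8
    p = bin(value & 0xFF).count("1") & 1
    return (value & 0xFF) | (p << 8) | ((1 - p) << 9)

def anc_packet(did, sdid, payload):
    """Build a SMPTE 291-style ancillary data packet as 10-bit words.
    Illustrative only; consult the standard for a conformant packetizer."""
    words = [0x000, 0x3FF, 0x3FF]                  # ancillary data flag
    body = [did, sdid, len(payload)] + list(payload)
    words += [add_parity(w) for w in body]
    checksum = sum(w & 0x1FF for w in words[3:]) & 0x1FF
    checksum |= ((~checksum >> 8) & 1) << 9        # b9 = NOT b8 of the sum
    words.append(checksum)
    return words

# Example: a (hypothetical) one-byte payload in an OP-47-style packet
pkt = anc_packet(0x43, 0x02, [0x01])
```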

Subtitle Preparation

TV programming falls into two basic categories: pre-recorded (taped) and live. The creation of subtitles for pre-recorded shows tends to occur before the time of transmission, whereas live programs require the creation of subtitles in real time as the show is transmitted. Live programs often have an element of scripted content, for instance the anchored sections of newscasts, and these scripts can be used to generate some of the subtitles.

Pre-Recorded Preparation

All subtitle preparation involves transcribing the spoken word to text, possibly in a different language to that spoken, and adding a description of other sounds, such as music and door slams, at the appropriate moment. For pre-recorded programming this process can be undertaken in an off-line environment: the Subtitler is provided with a frame-accurate copy of the program, often as a low bit-rate proxy file made at the time of the final edit or when the material is ingested at the playout operation. They watch and listen to this file, translate the speech if required, and either type the words directly into a subtitle workstation, e.g. a Softel Swift Create, or use a speech recognition program and re-speak the words of the actors as the program is played. Because this process is not live, any errors can be corrected by stopping playback and editing the captured text. Many subtitle workstations have built-in checks for reading speed, spelling, text formatting etc. The file produced by this process is forwarded to the playout operation to be transmitted alongside the associated program material.

Live Preparation

Live subtitle preparation requires a different approach. It might be possible to obtain scripts for sections of the program; however, the words spoken by the presenters may vary, and there will be sections, if not all, of the program for which there is no prior knowledge of the spoken word. In these cases the Subtitler needs to listen to the program audio and then type the words and descriptions, or translation, into a subtitling workstation, e.g. a Softel Swift Create Live; this requires a high level of keyboard skill. An alternative approach is to re-speak the words and descriptions and have a speech recognition program convert the speech to text for use by the subtitling application. The use of speech-to-text applications means that a Subtitler is not required to have fast and accurate typing skills; however, the application will need to be “trained” to understand their speech patterns.



This approach is very useful for simultaneous translation of live programming, allowing skilled translators to focus on providing an accurate translation without being limited by their typing speed. Live preparation workstation systems often allow multiple Subtitlers to work together on a single program and thereby share the workload. Although speech-to-text applications are constantly improving, TV programs tend to have background noise, music and often more than one voice speaking simultaneously, making fully automatic transcription impossible with today’s technology. The data from the subtitle workstation is sent directly to the playout operation for inclusion in the channel’s output video stream. This approach to subtitle creation does not automatically produce a file, although the subtitle data can be captured in a file which can be used if the program is later rebroadcast.

Subtitle Transmission

Transmission of Pre-prepared Subtitles

Pre-prepared subtitle files have to be transmitted at the correct time relative to the content of the program they relate to, so that subtitles are displayed at the same time as the words are spoken. Subtitles do not tend to be a continuous stream of text; instead they are discrete packets of words, normally with breaks between phrases or sentences. Within the overall subtitle file each subtitle has a time-stamp which is referenced to the timecode of the original program. At the time the program is transmitted, the playout automation system instructs the subtitling system, such as the Softel Swift TX, to retrieve the file from its store and load it into the playout application. The automation also loads and cues the program on a video server; when the program is scheduled to be transmitted the automation plays the clip from the server, with the clip’s timecode sent to the subtitle playout system. As the program is transmitted, the subtitle control system matches program timecodes to the timestamps in the subtitle file; when a match is made, subtitle data is sent to be embedded into the video stream by the HSI11 module. A similar process can take place at the time an original program is ingested into the video server: the ingest system controls a subtitle playout system to insert the subtitle data into the VBI (SD) or VANC (HD), and this is stored along with the program’s video and audio in the file created at the time of ingest.

When the video clip is played out, the subtitles are already striped, or embedded, into the video and are therefore transmitted automatically.
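The timestamp-matching step described above can be sketched as follows. The timecode format, the 25 fps frame rate and the data structures are assumptions for illustration; they are not the actual interfaces of the automation or subtitle playout systems.

```python
from dataclasses import dataclass

FPS = 25  # assuming a 25 fps (PAL-rate) plant for this sketch

def tc_to_frames(tc):
    """Convert an 'HH:MM:SS:FF' timecode string to a frame count."""
    h, m, s, f = (int(x) for x in tc.split(":"))
    return ((h * 60 + m) * 60 + s) * FPS + f

@dataclass
class Subtitle:
    start: str   # in-point timestamp from the subtitle file
    end: str     # out-point timestamp
    text: str

def cues_for(subtitles, program_tc):
    """Return the text of every subtitle whose window covers the current
    program timecode, as a playout system might check on every frame."""
    now = tc_to_frames(program_tc)
    return [s.text for s in subtitles
            if tc_to_frames(s.start) <= now < tc_to_frames(s.end)]
```

When a cue matches, the playout system would forward that subtitle's data to the inserter; outside any window, nothing is sent.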

Transmission of Live Subtitles

Live subtitles, for simultaneous translation or for the hearing-impaired, are added to the video stream as they are created. The automation’s playlist, or manual intervention by an operator, signals to the subtitle control system that live subtitles are required for the next event. The subtitle control system then routes the output of a subtitle workstation directly to the HSI11 insertion device. As subtitles are created, either by typing or by automatic conversion of speech, the workstation forms them into subtitle data packets, with each group being sent when the operator presses Return or a dedicated Send key. A variation of this process is for the system to send each word as it is completed, resulting in subtitles being sent (and displayed) more quickly than if the system waited for the speaker to complete a sentence. With the introduction of speech recognition technology the operator may not be using a keyboard; in this scenario the system waits for words to be recognized before sending the subtitle. The example below shows Softel Swift Create workstations used to produce subtitle files and live feeds, and a Softel Swift TX handling these files and live feeds under automation control. In both cases the subtitle data is forwarded to a Synapse HSI11 module for inclusion in the video stream as either Teletext (SD) or OP47 (HD).
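The two live sending modes (send-on-trigger versus word-by-word) can be sketched with a small buffer. This is a simplified illustration of the behaviour described above, not Softel's actual workstation logic; the class and method names are invented for the example.

```python
class LiveSubtitleBuffer:
    """Collect words and emit subtitle packets either word-by-word or
    when the operator triggers a send (Return / dedicated Send key)."""

    def __init__(self, word_by_word=False):
        self.word_by_word = word_by_word
        self.words = []   # words awaiting transmission
        self.sent = []    # subtitle packets already emitted

    def add_word(self, word):
        self.words.append(word)
        if self.word_by_word:
            # word-by-word mode: each word goes out as soon as it is
            # typed or recognized, without waiting for the sentence
            self.flush()

    def flush(self):
        """Operator pressed Send/Return: emit the buffered words."""
        if self.words:
            self.sent.append(" ".join(self.words))
            self.words = []
```

In trigger mode a whole phrase is emitted at once; in word-by-word mode each word becomes its own packet, trading tidy phrasing for lower latency.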




Example 1: subtitle preparation and insertion — overall block diagram. [Block diagram showing: Automation; Live Programs and Recorded Programs (Video Server) feeding a Presentation System; the HSI11 with its Softel vFlex VBI/VANC inserter and embedded audio/ANC bypass; a CVBS encoder for preview; the Transmission Output; Subtitle Control (Softel Swift TX) with timecode; Off-line Subtitle Preparation (Softel Swift Create) and Live Subtitle Preparation (Softel Swift Create Live) supplying subtitle files and live feeds.]

The HSI11 module has an option to allow the VBI or VANC to bypass the inserter if there is no subtitle data being presented to the module from the subtitle control system. Therefore, if the program’s video file already contains subtitle information, which may have been added as part of the ingest process, this will be transmitted by default if no other subtitles are sent from the control system. Although this may seem desirable, careful consideration should be given to setting the module to bypass: if the originating source of program material is not closely controlled, programs may be created with unsuitable subtitles, and if new subtitles are not produced these could be transmitted automatically.

Subtitles only tend to occur when the program contains speech or when there is a requirement to add a description of another sound. This results in a non-continuous data flow; however, the module has a configuration parameter to set how long subtitle data must be missing before the inserter is bypassed. The HSI11 also supports the insertion of filler packets when no subtitles are present, creating a continuous stream of subtitle data which may aid the decoding of subtitles, especially in older equipment.
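The timeout-driven choice between inserting, sending filler and bypassing can be sketched like this. The class, the parameter names and the default timeout are illustrative only; they are not the HSI11's actual configuration interface.

```python
class InserterGate:
    """Decide per output moment whether to insert subtitle data, send
    filler packets, or bypass the incoming VBI/VANC unchanged."""

    def __init__(self, timeout_s=5.0, filler=False):
        self.timeout_s = timeout_s    # how long data may be absent
        self.filler = filler          # insert filler instead of bypassing
        self.last_data = None         # time of the last subtitle packet

    def on_subtitle(self, now):
        """Called whenever subtitle data arrives from the control system."""
        self.last_data = now

    def output(self, now):
        """Return the module's behaviour at time `now` (seconds)."""
        active = (self.last_data is not None
                  and now - self.last_data < self.timeout_s)
        if active:
            return "insert"
        # no recent data: either keep the stream continuous with filler
        # packets, or let the incoming VBI/VANC pass through untouched
        return "filler" if self.filler else "bypass"
```

With filler enabled the stream stays continuous even during silence; with it disabled, any subtitles already embedded upstream pass through after the timeout, which is exactly the case the previous paragraph warns about.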




Synapse HSI11 module. [Block diagram: HD/SD-SDI input 1 with EQ; active video passes through the Softel vFlex-powered VBI/VANC data inserter while embedded audio and ANC data bypass it; outputs comprise two HD/SD-SDI outputs and a preview CVBS output via a CVBS encoder; control via Ethernet, GPI in, and the rack controller on the internal Synapse bus.]

The HSI11 provides an ideal interface between Softel’s subtitle preparation and playout applications and live video streams. The HSI11 takes the proven Softel vFlex VBI and VANC data insertion capabilities of the standalone units and incorporates them into a Synapse module. The module follows the standard form factor, fitting into any of the frames in the Synapse range alongside any of the other 300+ processing modules, and utilizing dual power supplies in 2RU and 4RU frames.

External subtitles from a subtitle playout system or live subtitling workstation are sent to the module in the industry-standard Newfor protocol via a dedicated Ethernet connection. The module encodes and inserts these subtitles as WST in SD, or as OP47 or SMPTE 2031 in HD. The user has full control of VBI/VANC line assignment, insertion of filler packets and other transmission settings. The vFlex core embedded on the HSI11 module forms part of the Swift end-to-end subtitling and captioning solution from Softel.
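Conceptually, the workstation frames each subtitle and delivers it to the module's dedicated Ethernet port over TCP. The sketch below shows that delivery path only; the length-prefixed framing is a deliberate placeholder, NOT the real Newfor wire format, which should be taken from Softel's protocol specification.

```python
import socket

def build_frame(page, lines):
    """Frame one subtitle for transmission.
    PLACEHOLDER framing for illustration: a 2-byte big-endian length,
    a hex page number, then 0x1E-separated text rows. Not Newfor."""
    payload = ("%03x" % page).encode("ascii") + b"\x1e"
    payload += b"\x1e".join(line.encode("latin-1") for line in lines)
    return len(payload).to_bytes(2, "big") + payload

def send_subtitle(host, port, page, lines):
    """Deliver one framed subtitle to the inserter over TCP."""
    with socket.create_connection((host, port), timeout=2.0) as sock:
        sock.sendall(build_frame(page, lines))
```

A live workstation would call `send_subtitle` for every packet it forms, so the inserter sees subtitles arrive in real time on its Ethernet connection.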

Example 2: Cortex control screen for the HSI11, showing the HD and SD inserter configuration elements and controls for video standard, WSS insertion and bypass settings (HD inserter settings, automatic inserter bypass settings, and WST SD inserter settings).



AN2012-10 Subtitle insertion.pdf