The Parliamentarian 2020: Issue Four - Social Media and Democracy in the Commonwealth

HANSARD TECHNOLOGY: ALL CHANGE FOR THE OFFICIAL REPORT

New technology in the UK Parliament brings automated speech software trials to the Hansard Team.

Jack Homer is the Deputy Editor of the Official Report (Hansard) in the United Kingdom Parliament’s House of Commons.

The term ‘Hansard’ has long been well known around the Parliaments of the Commonwealth as the verbatim record of what is said in a national, state or provincial Parliament. Thomas Curson Hansard added his name to the UK publication in 1829, and in the UK House of Commons a full report was defined in 1907 as one “which, though not strictly verbatim, is substantially the verbatim report, with repetitions and redundancies omitted and with obvious mistakes corrected, but which on the other hand leaves out nothing that adds to the meaning of the speech or illustrates the argument.”

When I joined Hansard in the UK House of Commons in 1997, many aspects of the Official Report had changed significantly since it came into being, although the 1907 terms of reference remained the same. One example of fundamental change is that the verbatim record is now available for free and from a single authoritative source; originally there were multiple competing accounts, each of which cost money to buy.

But in the past 25 years the technology used to produce and publish the written record has transformed almost beyond recognition. The advent of resilient, reliable audio recording has meant our 90-strong staff no longer rely on shorthand and stenography. Hundreds of cassette tapes, which were carried around between offices and Committee Rooms at the Palace of Westminster, have been replaced by a digital audio recording and playback system. Since the 1990s, the number of daily Hansard editions produced in printed form has declined dramatically, falling in the House of Commons from over 10,000 to fewer than 500. The entire Hansard archive has been digitised using optical character recognition, increasing many times over the volume of content accessible online, and there is an ongoing project to improve its searchability and accuracy.

Hansard is now predominantly read online by a much larger readership, with millions of unique page views on debates such as those prompted by public e-petitions, and it is published within three hours of the words having been spoken. Proceedings in the UK Parliament are also watched online, with a huge increase in the amount of live video being streamed since fibre connectivity was installed in Parliamentary Committee Rooms.

It was against this background of change that Hansard in the UK House of Commons embarked on a trial of using customised automated speech recognition (ASR) technology to assist with producing and publishing the official record. We wanted to explore two things. First, at a very simple level, could the dramatic improvements in accuracy reported in this technology be translated into an effective tool to assist with the production of the record? Previous trials in the 1990s and early 2000s using off-the-shelf products had foundered at an early stage. Secondly, could we use a customised audio-to-text alignment system to enable Hansard reports to be displayed as subtitles on video-on-demand and bring to our video archive the same searchability that the Hansard text enjoys? In both cases, we didn’t have the capacity to rebuild our existing systems from the ground up, and therefore wanted the technology to be capable of easy integration.

We had a good starting point. Thousands of hours of recordings with accompanying transcripts form a good basis for ‘training’ a speech recognition


