
Poster Paper Proc. of Int. Colloquiums on Computer Electronics Electrical Mechanical and Civil 2011

Speech Compression Using JPEG

Munmun Baisantry¹, Himanshu Arora², Jyotsna Singh²

¹ Image Analysis Centre, Defense Electronic Applications Laboratory, Dehradun-248001, India. Email: munmunbaisantry@gmail.com
² Division of Electronics and Communications, Netaji Subhas Institute of Technology, Faculty of Technology (University of Delhi), New Delhi - 110078

Abstract— When a video is streamed over the internet, different types of data (moving images, audio, control signals, etc.) are multiplexed into individual sub-channels and sent over the channel. Because these streams have different bit rates, the channel capacity is poorly utilized. This paper proposes a novel technique that converts WAVE speech signals into BITMAP images and compresses them using a JPEG module. Using the same channel for image and speech signals improves channel utilization, and it also reduces computational resources and time because the same processing tools can be used for both types of signal. In addition, because JPEG achieved a better compression ratio than MP3 in our experiments, the technique facilitates faster transfer of large volumes of data. The multiplexers used to segregate the audio and image parts of a video can also be omitted in such a communication system. Results show that the compression ratio of the proposed technique is better than that of speech signals compressed by conventional techniques. PESQ values indicate that the converted and compressed speech signals are reproduced with good quality.

Fig.1 Overview of the proposed module

In this paper, we propose a novel technique to convert WAVE files into BITMAP files, using the Microsoft .NET framework as the platform and Visual C# for programming [1], [2]. For compression of the converted signals, the JPEG module of Paint.NET [3] was extracted and used. The rest of this paper is organized as follows. Section II briefly discusses the WAVE and BITMAP formats as well as JPEG compression. Section III gives a detailed description of the conversion of WAVE files into BMP images and back. Section IV presents results to validate the proposed algorithm. Concluding remarks are given in Section V.

Index Terms— channel, compression, Bitmap, JPEG, MP3, PESQ, Q-factor, speech, Wave.

I. INTRODUCTION

II. WAVE AND BITMAP FILE FORMATS

When a video is streamed over the internet, the moving image, audio, data (if any) and control signals are sent simultaneously over sub-channels. Because the bit rates are unequal, some sub-channels are still transmitting while others sit idle, and no new data can be assigned to the idle sub-channels until the previously sent data has been streamed completely. If a single channel is used instead of multiplexing, its capacity can be fully utilized. As different data types cannot be transferred simultaneously over the same channel, it is preferable to convert the audio data into an image and then transmit it. Along with better utilization of the channel, the proposed idea has the following advantages: (1) Converting audio data into image format allows the same software and hardware to be developed for both speech and image signals. (2) It removes the need for a multiplexer to segregate the audio and image parts. (3) Experimental results show that an audio file converted into an image and then compressed with JPEG has a better compression ratio than MP3, saving both bandwidth and time. A diagrammatic overview of the proposed technique is shown in Fig. 1.

© 2011 ACEEE DOI: 02.CEMC.2011.01. 504

The commonly used format for multimedia files is discussed in A. The BITMAP format, used for storing digital images in an uncompressed form, is discussed in B.

A. WAVE File Format

A WAVE file is often just a RIFF file [4] with a single “WAVE” chunk, which consists of two sub-chunks: a “fmt” chunk specifying the data format and a “data” chunk containing the actual sample data. The elements of the RIFF chunk are shown in Table 1.

TABLE I. RIFF CHUNK FIELDS AND THEIR DESCRIPTION

The “WAVE” format consists of two subchunks, “fmt” and “data”. The “fmt” subchunk describes the format of the sound data, as shown in Table 2.



TABLE II. FORMAT SUBCHUNK FIELDS AND THEIR DESCRIPTION

TABLE V. BMP INFORMATION HEADER FIELDS AND THEIR DESCRIPTION

C. JPEG Compression Format

JPEG provides a compression method that is capable of compressing continuous-tone image data with a pixel depth of 6 to 24 bits with reasonable speed and efficiency [5]-[7].

The “data” subchunk contains the size of the data and the actual sound, as shown in Table 3.

TABLE III. DATA SUBCHUNK FIELDS AND THEIR DESCRIPTION
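The three WAVE chunks described above can be illustrated with a short Python sketch. It is only an illustration, not the authors' C# code: it assumes the canonical 44-byte layout (a 16-byte PCM “fmt” subchunk immediately followed by the “data” subchunk), and the function names `build_wave_header` and `parse_wave_header` are hypothetical. Real files may carry additional chunks that this sketch does not handle.

```python
import struct


def build_wave_header(channels: int, sample_rate: int,
                      bits_per_sample: int, data_size: int) -> bytes:
    """Build a canonical 44-byte PCM WAVE header (RIFF + "fmt " + "data")."""
    block_align = channels * bits_per_sample // 8   # bytes per sample frame
    byte_rate = sample_rate * block_align           # average bytes per second
    # RIFF chunk: ChunkID, ChunkSize (file size minus 8), Format
    riff = struct.pack("<4sI4s", b"RIFF", 36 + data_size, b"WAVE")
    # "fmt " subchunk: 16-byte body, audio format 1 = uncompressed PCM
    fmt = struct.pack("<4sIHHIIHH", b"fmt ", 16, 1, channels, sample_rate,
                      byte_rate, block_align, bits_per_sample)
    # "data" subchunk header: the sample bytes follow it in the file
    data = struct.pack("<4sI", b"data", data_size)
    return riff + fmt + data


def parse_wave_header(header: bytes) -> dict:
    """Read the fields of a canonical 44-byte PCM WAVE header."""
    chunk_id, chunk_size, wave_id = struct.unpack_from("<4sI4s", header, 0)
    if chunk_id != b"RIFF" or wave_id != b"WAVE":
        raise ValueError("not a RIFF/WAVE file")
    (fmt_id, _fmt_size, audio_format, channels, sample_rate, byte_rate,
     block_align, bits_per_sample) = struct.unpack_from("<4sIHHIIHH", header, 12)
    if fmt_id != b"fmt " or audio_format != 1:
        raise ValueError("only uncompressed PCM is handled in this sketch")
    data_id, data_size = struct.unpack_from("<4sI", header, 36)
    if data_id != b"data":
        raise ValueError("expected the data subchunk at offset 36")
    return {
        "chunk_size": chunk_size,
        "channels": channels,
        "sample_rate": sample_rate,
        "byte_rate": byte_rate,
        "block_align": block_align,
        "bits_per_sample": bits_per_sample,
        "data_size": data_size,
    }
```

Building a header and parsing it again is a quick way to check that the field offsets match the layout described in Tables 1 to 3.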

III. ALGORITHM TO IMPLEMENT SPEECH CONVERSION AND COMPRESSION

A. Selecting Suitable Software Platform

Various JPEG compression programs were tested on the basis of the amount of compression, the quality of the compressed pictures, and the compatibility and user-friendliness of the software [8]. On the basis of these factors, the choice was narrowed down to:
• Paint.NET
• GIMP (GNU Image Manipulation Program)
• JPEG-6b-4
An image of size 980 KB was compressed using each of these programs. Table 6 shows the resulting size at particular quality factors for each program.

B. BITMAP File Format

The BMP file format, also called the Bitmap or DIB (device-independent bitmap) file format, is an image file format used to store digital images in an uncompressed form. A BMP file consists of the following data structures:
• Bitmap File Header
• Bitmap Information Header
• Color Table
• Bitmap Data Bytes
The bitmap file header contains information about the type, size, and layout of a device-independent bitmap file.

TABLE VI. SIZE OF THE COMPRESSED JPEG IMAGE PRODUCED AT VARIED QUALITY LEVELS FOR DIFFERENT COMPRESSORS

TABLE IV. BITMAP HEADER CHUNK

The bitmap information header immediately follows the bitmap file header structure and specifies the format-based properties of a BMP file, such as width, height, color depth and compression (if any). This header contains the following data elements:
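As an illustration of the two BMP headers described in this section, the following Python sketch packs the 14-byte bitmap file header and the 40-byte bitmap information header into the 54-byte block that precedes the pixel data of an uncompressed 24-bit image. The function name `build_bmp_headers` and its parameters are hypothetical, and the sketch assumes no color table, which is the usual case for 24-bit images.

```python
import struct

FILE_HEADER_SIZE = 14   # bitmap file header
INFO_HEADER_SIZE = 40   # bitmap information header


def build_bmp_headers(width: int, height: int,
                      bits_per_pixel: int = 24, colours_used: int = 0) -> bytes:
    """Return the 54 header bytes of an uncompressed BMP without a color table."""
    # Each pixel row is padded to a multiple of 4 bytes.
    row_bytes = (width * bits_per_pixel + 31) // 32 * 4
    image_size = row_bytes * abs(height)
    data_offset = FILE_HEADER_SIZE + INFO_HEADER_SIZE
    # File header: type "BM", file size, two reserved words, offset to pixel data
    file_header = struct.pack("<2sIHHI", b"BM", data_offset + image_size,
                              0, 0, data_offset)
    # Information header: header size, width, height, planes, bit count,
    # compression (0 = none), image size, x/y resolution, colours used/important
    info_header = struct.pack("<IiiHHIIiiII", INFO_HEADER_SIZE, width, height,
                              1, bits_per_pixel, 0, image_size,
                              0, 0, colours_used, 0)
    return file_header + info_header
```

The fixed total of 54 bytes is the header size that the conversion steps in Section III add to the image size when computing the BMP file size.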



3. Declare an object of WaveFileType and initialize it using the constructor.
4. Set the properties as follows (all divisions are integer divisions):
• Bits per sample = ((((width of image / WIDTH_CONS) - 1) / 10) + 1) * 8
• No. of channels = (((((width of image / WIDTH_CONS) - 1) / 5) % 2) + 1) * 2
• Sampling rate: x = (width of image / WIDTH_CONS) % 5; if (x == 0) sampling rate = 44100; else if (x < 3) sampling rate = x * 8000; else sampling rate = (x - 2) * 11025
• Block align = No. of channels * bits per sample / 8
• Average bytes per second = sampling rate * block align
• Size of data chunk = size of bitmap image
• Size of WAVE = size of WAVE + size of data chunk
5. A WAVE file is created and saved on the hard disk, a stream is declared to link to this file, and these properties (i.e., the header fields) are written into the file. The location of the end of the file is returned.
6. Lastly, two streams are declared. One links the WAVE file on the hard disk to the program and the other links the BMP file.
7. Data chunks are read from the BMP file via the stream and written to the WAVE file until all the data chunks have been processed.
8. The streams are closed and the file is saved.
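The recovery of the WAVE properties from the image width can be sketched in Python as follows. This is a minimal sketch rather than the authors' implementation: the value `WIDTH_CONS = 4` is an assumed placeholder (the paper does not state the constant), the function name is hypothetical, and all divisions are taken to be integer divisions. The byte rate is computed as sampling rate times block align, the standard RIFF definition. As printed, the formulas decode even channel counts only, so the sketch should not be expected to handle mono files.

```python
WIDTH_CONS = 4  # assumed placeholder; the actual constant is not given in the paper


def wave_properties_from_width(width: int) -> dict:
    """Decode the WAVE format fields encoded in a BMP image width."""
    m = width // WIDTH_CONS                       # width multiplier
    bits_per_sample = ((m - 1) // 10 + 1) * 8
    channels = (((m - 1) // 5) % 2 + 1) * 2
    x = m % 5                                     # sampling-rate code
    if x == 0:
        sample_rate = 44100
    elif x < 3:
        sample_rate = x * 8000                    # 8000 or 16000 Hz
    else:
        sample_rate = (x - 2) * 11025             # 11025 or 22050 Hz
    block_align = channels * bits_per_sample // 8
    return {
        "bits_per_sample": bits_per_sample,
        "channels": channels,
        "sample_rate": sample_rate,
        "block_align": block_align,
        "byte_rate": sample_rate * block_align,
    }
```

For example, a width of 3 * WIDTH_CONS decodes to a two-channel, 8-bit, 11025 Hz file, which is the configuration shown in Fig. 2.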

Thus, we can conclude that:
• To maintain a trade-off between quality and compression, the quality factor can be kept around 85-90.
• Although GIMP performs better than the other two at higher quality factors, it is more complicated to use.
• Although Paint.NET and JPEG-6b-4 show almost equal performance, Paint.NET is more user-friendly, being Windows-oriented, while the latter is Linux-based.
• We are more familiar with Windows, and Windows provides the .NET framework, which is a highly powerful tool for software design.
Thus, it was decided to select Paint.NET over the other two programs.

B. Conversion of WAVE file to BMP and vice versa

1. Declare a stream for reading the WAVE file and attach the desired file to the stream.
2. Declare an object of WaveFileType, initialize it using the constructor and transfer the data from the WAVE file on the hard disk via the declared stream.
3. Declare an object of Bmp, initialize it using the constructor and assign the default values shown in Tables 4 and 5.
4. Set the width of the BMP file (Table 7) as follows (all divisions are integer divisions):
Width of image = (((bits per sample / 8) - 1) * 10 + ((No. of channels / 2) - 1) * 5) * WIDTH_CONS
if (samples per sec % 8000 == 0) Width of image = Width of image + (samples per sec / 8000) * WIDTH_CONS;
else if (samples per sec % 11025 == 0) Width of image = Width of image + ((samples per sec / 22050) + 3) * WIDTH_CONS
5. The other properties of the BMP file are set as follows:
• Size of file = size of header (= 54) + size of image.
• Height of image = size of image / (width of image * 3).
• No. of colours used = bits per sample in the WAVE file.
6. A BMP file is created and saved on the hard disk, a stream is declared to link to this file, and these properties (i.e., the file header and information header) are written into the file. The location of the end of the file is returned.
7. Lastly, two streams are declared. One links the WAVE file on the hard disk to the program and the other links the BMP file.
8. Data chunks are read from the WAVE file via the stream and written to the BMP file until all the data chunks have been processed.
9. The streams are closed and the file is saved.

Similarly, to convert a bitmap file to WAVE, the following steps were applied:
1. Declare a stream for reading the BMP file and attach the desired file to the stream.
2. Declare an object of Bmp, initialize it using the constructor and transfer the data from the bitmap file on the hard disk via the declared stream.
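The width, height and file-size rules of the WAVE-to-BMP procedure can be sketched in Python as shown below. This is a sketch under stated assumptions, not the authors' C# code: `WIDTH_CONS = 4` is an assumed placeholder value, the function names are hypothetical, the input is assumed to have an even channel count (the formula as printed does not cover mono), and only the five sampling rates that the scheme can encode unambiguously are accepted.

```python
WIDTH_CONS = 4          # assumed placeholder; the actual constant is not given
BMP_HEADER_SIZE = 54    # file header + information header
SUPPORTED_RATES = (8000, 16000, 11025, 22050, 44100)


def rate_code(sample_rate: int) -> int:
    """Map a sampling rate to the multiplier term used in the width formula."""
    if sample_rate not in SUPPORTED_RATES:
        raise ValueError("sampling rate not covered by this sketch")
    if sample_rate % 8000 == 0:
        return sample_rate // 8000            # 8000 -> 1, 16000 -> 2
    return sample_rate // 22050 + 3           # 11025 -> 3, 22050 -> 4, 44100 -> 5


def bmp_width(bits_per_sample: int, channels: int, sample_rate: int) -> int:
    """Image width that encodes the WAVE format fields (integer divisions)."""
    base = (bits_per_sample // 8 - 1) * 10 + (channels // 2 - 1) * 5
    return (base + rate_code(sample_rate)) * WIDTH_CONS


def bmp_geometry(bits_per_sample: int, channels: int,
                 sample_rate: int, data_size: int) -> dict:
    """Width, height and file size of the 24-bit BMP holding the sample bytes."""
    width = bmp_width(bits_per_sample, channels, sample_rate)
    # Height follows the rule in step 5; any bytes left over after the integer
    # division would need padding, which this sketch does not attempt.
    height = data_size // (width * 3)
    return {"width": width, "height": height,
            "file_size": BMP_HEADER_SIZE + data_size}
```

For a two-channel, 8-bit, 11025 Hz file the width multiplier is 3, so the image is 3 * WIDTH_CONS pixels wide and each row carries three times that many sample bytes.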

TABLE VII. VALUE OF WIDTH MULTIPLIER FOR SOME WAVE FILE PROPERTIES

Fig. 2 shows how a WAVE file with two channels, 8 bits per sample and a sampling rate of 11025 Hz looks after being converted into a bitmap file.

Fig.2 A WAVE file after conversion into a Bitmap file



C. Compression using JPEG Module

The JPEG compression module was decoupled from Paint.NET and redesigned so that it could be included in the application.

The PESQ (Perceptual Evaluation of Speech Quality) value of the stereo WAVE samples after conversion into bitmap and compression to JPEG was found to be around 3.9. This is considered good, since audio samples with PESQ values greater than 3.5 are considered to be of good quality.

IV. RESULTS AND DISCUSSION

TABLE VIII. COMPARISON OF MP3 AND NEWLY PROPOSED TECHNIQUE

Fig.3 shows snapshots of the developed software tool for WAVE conversion and compression and conversion of the JPEG file back to .BMP and then to WAVE.

V. CONCLUSION

Audio and video data can be transferred simultaneously through the same channel by converting the one-dimensional audio signal into a two-dimensional image file and vice versa. This avoids the circuit complexity of including a multiplexer to separate the audio and image parts of a video before streaming it over the channel. The same software and hardware developed for image processing can also be applied to the audio signals. Such a conversion helps save bandwidth and time when transferring audio files. The proposed algorithm accepts an audio file in stereo WAVE format, converts it into BMP and then compresses it into JPEG format. JPEG allows the user to tune the quality of compression. Using the same module at the receiving end, the JPEG files can be extracted back into image files and converted into audio.

Fig 3: Snapshot showing the Save As dialog during BMP to JPEG conversion

REFERENCES
[1] H. Mössenböck, “Introduction to C#, Part 1-2,” University of Linz, Austria.
[2] The CSharp Website. [Online]. Available: http://www.functionx.com/csharp/
[3] The Paint.net website. [Online]. Available: http://paint.net/
[4] RIFF Format. [Online]. Available: http://netghost.narod.ru/gff/graphics/summary/micriff.htm
[5] John Miano, Compressed Image File Formats: JPEG, PNG, GIF, XBM, BMP, first edition. ACM Press, 1999.
[6] William B. Pennebaker, Joan L. Mitchell, JPEG Still Image Data Compression Standard, Van Nostrand Reinhold, New York, NY, 1993.
[7] Gregory K. Wallace, “The JPEG Still Picture Compression Standard,” Communications of the ACM, vol. 34(4), pp. 30-44, April 1991.
[8] The Gimp website. [Online]. Available: http://www.gimp.org/

Fig.4 Graph representing Compression ratios vs. Quality of JPEG file

From the graph shown in Fig.4, we notice that as the quality factor for compression is increased, the size of the compressed image also increases, albeit non-linearly. High speech quality is obtained for a quality factor of 85 and above. From Table 8, one can observe that greater compression ratios are achieved using our technique as compared to the commonly used MP3 compression.
