Proc. of Int. Conf. on Recent Trends in Information, Telecommunication and Computing, ITC

A Novel Image Compression Technique using Simple Arithmetic Addition

Nadeem Akhtar, Gufran Siddiqui and Salman Khan
Department of Computer Engineering, Zakir Husain College of Engineering and Technology
Aligarh Muslim University, Aligarh, India
Email: {nadeemalakhtar, m.gufran.sid}@gmail.com, salmanblues_23@yahoo.com

Abstract: A novel lossless image-compression scheme is proposed in this paper. We show how a set of pixels can be compressed using simple arithmetic addition. We perform a series of addition operations on a set of pixels to get an array of sums. The additions are performed in such a way that the whole process can be reversed and the original set of pixels recovered from that array. Experimental results presented in this paper indicate that this new method of image compression gives promising results compared with the original LZW dictionary algorithm, the Deflate algorithm, PNG and GIF.

Index Terms: Image Compression, LZW, Deflate, PNG, GIF

I. INTRODUCTION
Data compression [1-6] has always been important, and it is becoming even more so. In many cases compression is necessary because of the large storage and time requirements involved, especially in information transmission. Images are a very important form of data, and many applications need to compress them, to a greater or lesser degree depending on the purpose of the application. Some algorithms perform this compression in a lossless way: no information is lost during compression, and the decompressed image is exactly the same as the original. Other algorithms perform the compression in a lossy way, such that some information is lost. Some compression methods are designed for specific kinds of images and do not perform as well on other kinds. Moreover, some algorithms let the user change their parameters to adjust the amount of compression for an image.

In this paper we deal with lossless image compression [7-10]. This kind of compression is very important in many fields, such as biomedical image analysis, medical images, art images, security and defense, and remote sensing. During the past few years, several schemes have been developed for lossless image compression. Usually, a two-stage coding technique is embedded in these schemes. In the first stage, a linear predictor such as differential pulse code modulation (DPCM) [11, 12] or some other linear predicting function is used to decorrelate the raw image data. In the second stage, a standard coding technique, such as Huffman coding [13, 14], arithmetic coding [15] or Lempel-Ziv coding, is used to encode the residual magnitudes. Such a two-stage scheme is useful because the high correlation between neighboring pixels in most images can be removed, which results in a significant entropy reduction.
However, the algorithm introduced in this paper is entirely different. The idea is that adding two numbers is like merging them into one. For example, adding 10 and 5 gives 15, so we can regard 15 as a representation of two numbers. The difficulty is that 10 and 5 cannot be recovered from 15 alone: 15 could equally stand for {10, 5}, {9, 6}, {8, 7} and so on. Let us assume, however, that there is a way of recovering 10 and 5 from 15. Storing 10 takes 4 bits and storing 5 takes 3 bits, a total of 7 bits, whereas 15 takes only 4 bits. Hence storing 15 saves 3 bits. In this paper we exploit this idea to compress images. Comparison is done using the standard images Lena, Barbara and F16 and some other high-resolution images.

DOI: 02.ITC.2014.5.61
© Association of Computer Electronics and Electrical Engineers, 2014

II. RELATED WORK
The most popular lossless image compression formats are GIF and PNG.

GIF (Graphics Interchange Format) [16] is a bitmap image format. The format supports up to 8 bits per pixel, allowing a single image to reference a palette of up to 256 distinct colors chosen from the 24-bit RGB color space. It also supports animations and allows a separate palette of 256 colors for each frame. The color limitation makes the GIF format unsuitable for reproducing color photographs and other images with continuous color, but it is well suited to simpler images such as graphics or logos with solid areas of color. GIF images are compressed using the Lempel-Ziv-Welch (LZW) [17] lossless data compression technique to reduce the file size without degrading the visual quality.

PNG (Portable Network Graphics) [18], like GIF, is a bitmap image format that employs lossless image compression. PNG was created to improve upon and replace the GIF format with an image file format that does not require a patent license to use. It uses the DEFLATE [17] compression algorithm, which combines the LZ77 [17] algorithm and Huffman coding. PNG supports palette-based images (with a palette defined in terms of 24-bit RGB colors), greyscale images and RGB images.
PNG was designed for distribution of images on the internet, not for professional graphics, and as such does not support other color spaces.

III. PROPOSED METHOD
The method of image compression presented in this paper is based on simple addition. It is independent of all the currently available compression algorithms, and it is simple yet effective. First we split the image into 4*4 blocks, and then the same algorithm is applied to every block of the image. Compression is divided into three phases, while decompression is divided into two phases.

Compression:
i. Initialization Phase
ii. Iteration Phase
iii. Storing

Decompression:
i. Reverse Iteration
ii. Mapping

A. Compression
Consider the following 4*4 block of an image, listed in the order in which the iteration phase processes it:

181 182 180 180
180 184 181 181
180 184 182 187
181 187 187 180

i. Initialization Phase
The block contains 5 distinct numbers: 181, 182, 180, 184 and 187. Let us arrange them in ascending order:

180 181 182 184 187

Treat these numbers as array indexes. We initialize each cell with the number of integers lying between its index and the previous index. The number 180 has no previous number to be compared with, so we initialize it with 0. Between 181 and 180 there are no integers, so we initialize cell number 181 with 0 as well. Similarly, there are no integers between 182 and 181, so cell number 182 is also initialized with 0. Next is cell number 184. There is 1 integer between 184 and 182 (i.e. 183), so we initialize cell number 184 with 1. Between 187 and 184 there are 2 integers (i.e. 185 and 186), so we initialize cell number 187 with 2. This completes the initialization phase.

ii. Iteration Phase
In this phase we iterate through the block sequentially from start to finish. At each step our aim is to make the value stored in the cell corresponding to the current number the highest value in the array. We do this by adding [current highest value (excluding the value stored in the current cell) + 1] to the value stored in the current cell. We proceed as follows.

Step 1: First number = 181
The current highest value is 2. We add 1 to it and add the result to the value initially stored in cell number 181. The value of cell number 181 now becomes 3 (0 + [2 + 1]), which is the highest number in the updated array.

Cell:     180  181  182  184  187
Old:        0    0    0    1    2
Updated:    0    3    0    1    2

While finding the highest number we do not consider the value already stored in the current cell (i.e. cell number 181). In the above case this makes no difference because cell number 181 contains 0, but even if the number stored in cell number 181 had been greater than or equal to 3, the highest number would still have been 2.

Step 2: Next number = 182
The highest value in the updated array is 3 (excluding the value already present in cell number 182). We add 1 to it and add the result to the value stored in cell number 182. The value of cell number 182 now becomes 4 (0 + [3 + 1]).

Cell:     180  181  182  184  187
Old:        0    3    0    1    2
Updated:    0    3    4    1    2

Step 3: Next number = 180
The highest value in the updated array is 4. We add 1 to it and add the result to the value stored in cell number 180. The value of cell number 180 now becomes 5 (0 + [4 + 1]).

Cell:     180  181  182  184  187
Old:        0    3    4    1    2
Updated:    5    3    4    1    2

Step 4: Next number = 180
The highest value in the array is 4. Note that we have excluded the value stored in the current cell number 180 (i.e. 5) when finding the highest value. We add 1 to 4 and add the result to the value stored in cell number 180 (i.e. 5). The value of cell number 180 now becomes 10 (5 + [4 + 1]).

Cell:     180  181  182  184  187
Old:        5    3    4    1    2
Updated:   10    3    4    1    2

Similarly, for all the remaining numbers in the block (i.e. 180, 184, 181, 181, 180, 184, 182, 187, 181, 187, 187 and 180), we proceed as above to get the following final array.

Final Array
Cell:     180  181  182  184  187
Value:    381  121   78   73  325
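The initialization and iteration phases can be sketched in a few lines of Python. This is a minimal sketch rather than the authors' implementation: the function names `initial_gaps` and `compress_block` are hypothetical, the block is assumed to be a flat list of pixel values in processing order, and the "highest value excluding the current cell" is assumed to be 0 when the block has only one distinct value.

```python
def initial_gaps(distinct):
    """Count the integers lying strictly between each value and its predecessor."""
    gaps = [0]  # the first value has no predecessor
    for prev, cur in zip(distinct, distinct[1:]):
        gaps.append(cur - prev - 1)
    return gaps


def compress_block(pixels):
    """Return (start, sums) for a block given as a flat list in processing order."""
    distinct = sorted(set(pixels))
    cell_of = {value: i for i, value in enumerate(distinct)}
    sums = initial_gaps(distinct)  # initialization phase
    for p in pixels:               # iteration phase
        i = cell_of[p]
        # Highest value among the other cells; 0 if there are none (assumption).
        highest_other = max((v for j, v in enumerate(sums) if j != i), default=0)
        sums[i] += highest_other + 1
    return distinct[0], sums


block = [181, 182, 180, 180,
         180, 184, 181, 181,
         180, 184, 182, 187,
         181, 187, 187, 180]
start, sums = compress_block(block)
print(start, sums)  # 180 [381, 121, 78, 73, 325]
```

Because each update makes the current cell strictly larger than every other cell, the most recently visited cell can always be identified later as the unique maximum, which is what makes the process reversible.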

iii. Storing
We now store this final array in the following format:

Start | Dn | Bm | Dn * Bm

Here Start is the starting number (8 bits), Dn is the number of distinct numbers (4 bits), Bm is the number of bits allocated to each array value (4 bits), and the last field holds the Dn array values at Bm bits each. The starting number (index) in the above array is 180 and there are 5 distinct numbers. The maximum value is 381, which takes 9 bits to represent in binary, so we allocate 9 bits to every value in the array.

Total bits occupied: 8 + 4 + 4 + (9 * 5) = 61

Total bits occupied by raw data: 8 * 16 = 128 (i.e. 8 bits per pixel, with 16 pixels in one 4*4 block).
Total bits saved: 128 - 61 = 67

B. Decompression
Decompression is simple: we do the reverse of what we did in the compression phase. Consider the previously stored binary data, laid out as Start, Dn, Bm, followed by the Dn * Bm bits of array values. From this data we can retrieve the following information:

Start = 180
Array:
Cell:   1    2    3   4   5
Value:  381  121  78  73  325
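The stored layout and its parsing can be sketched as follows. This is an illustrative sketch, not the authors' code: `pack_block` and `unpack_block` are hypothetical names, the bits are modelled as a Python string of '0' and '1' characters, and Dn and Bm are assumed to be written directly into their 4-bit fields, which only works while both are below 16. The array values used are those of the worked example.

```python
def pack_block(start, sums):
    """Serialize (start, sums) as Start (8 bits), Dn (4), Bm (4), then Dn * Bm bits."""
    dn = len(sums)
    bm = max(sums).bit_length()  # bits needed for the maximum value
    if dn >= 16 or bm >= 16:
        # Assumption: the fields hold Dn and Bm as plain 4-bit numbers.
        raise ValueError("Dn or Bm does not fit in a 4-bit field")
    header = format(start, "08b") + format(dn, "04b") + format(bm, "04b")
    payload = "".join(format(v, "0{}b".format(bm)) for v in sums)
    return header + payload


def unpack_block(bits):
    """Recover (start, sums) from a bit string produced by pack_block."""
    start = int(bits[0:8], 2)
    dn = int(bits[8:12], 2)
    bm = int(bits[12:16], 2)
    sums = [int(bits[16 + k * bm:16 + (k + 1) * bm], 2) for k in range(dn)]
    return start, sums


bits = pack_block(180, [381, 121, 78, 73, 325])
print(len(bits))           # 61
print(unpack_block(bits))  # (180, [381, 121, 78, 73, 325])
```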

Once we have this information, we can proceed with the reverse iteration phase and the mapping phase.

i. Reverse Iteration
In this phase we proceed as follows at each step:
1. Find the highest value and write its associated index in the 4*4 block, filling the block in reverse order.
2. Subtract [second highest value + 1] from the highest value and replace the highest value with the result.

Let us now proceed as above. The current highest value is 381, which is stored in cell number 1, so we write 1 in the last position of the 4*4 block.

Next we subtract [second highest value + 1] (i.e. 325 + 1) from the highest value (i.e. 381), which gives 55, and we replace 381 with 55. The updated array looks as follows:

Cell:   1   2    3   4   5
Value:  55  121  78  73  325

Now the highest value in the updated array is 325, which is stored in cell number 5, so we write 5 in the second-to-last position of the 4*4 block.

We again subtract [second highest value + 1] (i.e. 121 + 1) from the highest value (i.e. 325), which gives 203, and we replace 325 with 203. The updated array looks as follows:

Cell:   1   2    3   4   5
Value:  55  121  78  73  203

We repeat the above procedure until we have filled the 4*4 block. The final block looks as follows:

2 3 1 1
1 4 2 2
1 4 3 5
2 5 5 1

The final array looks as follows:

Cell:   1  2  3  4  5
Value:  0  0  0  1  2
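The two reverse-iteration steps can be sketched as a short loop. This is a sketch under stated assumptions, not the authors' implementation: `reverse_iteration` is a hypothetical name, the block size is assumed to be fixed at 16 pixels, cells are numbered from 1 as in the text, and the second highest value is taken to be 0 when the array has a single cell.

```python
def reverse_iteration(sums, n_pixels=16):
    """Undo the iteration phase; return (cell numbers in block order, gap array)."""
    sums = list(sums)  # work on a copy
    cells = []
    for _ in range(n_pixels):
        # Step 1: the highest value marks the cell that was updated last.
        i = max(range(len(sums)), key=sums.__getitem__)
        cells.append(i + 1)  # 1-based cell number
        # Step 2: subtract (second highest value + 1) from the highest value.
        second = max((v for j, v in enumerate(sums) if j != i), default=0)
        sums[i] -= second + 1
    cells.reverse()  # the block was filled from the last position backwards
    return cells, sums


cells, gaps = reverse_iteration([381, 121, 78, 73, 325])
print(cells)  # [2, 3, 1, 1, 1, 4, 2, 2, 1, 4, 3, 5, 2, 5, 5, 1]
print(gaps)   # [0, 0, 0, 1, 2]
```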

ii. Mapping
The recovered final array gives us the gaps between adjacent distinct numbers. We also know that the starting number is 180, as mentioned earlier, so we can now retrieve the original distinct pixel values:

Cell:   1    2    3    4    5
Pixel:  180  181  182  184  187

We replace the numbers (1, 2, 3, 4 and 5) in the 4*4 block with the recovered pixel values to get the following block:

181 182 180 180
180 184 181 181
180 184 182 187
181 187 187 180
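The mapping phase amounts to a running sum over the gaps followed by a lookup. The sketch below assumes the outputs of the reverse iteration phase are available as plain lists; `map_to_pixels` is a hypothetical name, and cell numbers are 1-based as in the text.

```python
def map_to_pixels(start, gaps, cells):
    """Rebuild the distinct pixel values from the gaps, then fill in the block."""
    values = [start]
    for gap in gaps[1:]:
        # A gap of g means g integers lie between this value and the previous one.
        values.append(values[-1] + gap + 1)
    return [values[c - 1] for c in cells]  # cells are 1-based


cells = [2, 3, 1, 1, 1, 4, 2, 2, 1, 4, 3, 5, 2, 5, 5, 1]
pixels = map_to_pixels(180, [0, 0, 0, 1, 2], cells)
print(pixels[:4])  # [181, 182, 180, 180]
```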


This block is the same as the block we initially compressed. Hence we have recovered our original data without any loss of information.

IV. RESULTS AND DISCUSSION
Tables I and II compare the proposed method with other lossless compression algorithms. Table I shows the size in bytes of various images (24 bit RGB) after compression, and Table II shows the respective compression ratios, where the compression ratio is defined as:

Compression Ratio (CR) = (Compressed Image Size) / (Original Image Size)

The compression ratio is an important criterion in choosing a scheme for lossless image compression. From these results, we can see that the proposed method is better than GIF, LZW and Deflate in the average case. However, it is not better than PNG, which is the most commonly used scheme for lossless image compression.

We now analyze the performance of our method. The proposed addition-based algorithm gives very good results if the pixel values are close to each other, no matter how randomly they are arranged. The performance of the algorithm increases as the number of distinct pixel values in the 4*4 block decreases. Let us consider a 4*4 block in which all 16 pixel values are equal.

On applying the addition-based algorithm to such a block we obtain a final array with a single cell, whose value is 16.

Total bits = 16 (header) + 1 * 5 (bits occupied by the maximum value, 16)
Total bits = 21
Bits saved = 107

Now let us consider a 4*4 block in which all 16 pixel values are different but closely related. On applying the addition-based algorithm to such a block we obtain a final array of 16 cells whose maximum value is 16.

Total bits = 16 (header) + 16 * 5 (bits occupied by the maximum value, 16)
Total bits = 96
Bits saved = 32

V. CONCLUSION
In this work, we have proposed a novel image compression algorithm that works in the spatial domain. The proposed method performs considerably better than GIF and TIFF (LZW). It is comparable to TIFF (DEFLATE) and, for most of the images, gives slightly better results. The proposed method exploits the similarity of adjacent pixels in an image. As the number of repeated pixels in the image increases, the amount of compression also increases. Hence, for images with a smaller color space, the performance of the algorithm will improve, because the range of pixel values decreases. The pixel values will be closer to each other and there will be even more redundancy in the image, so the compression will be better.

TABLE I. SIZE IN BYTES

Image     PGM       GIF       TIFF (LZW)  TIFF (DEFLATE)  PNG       PROPOSED METHOD
LENA      262144    264618    263532      225174          15127     200153
BARBARA   262144    291701    290052      236416          177872    219774
F16       262144    213463    212762      188434          139403    188300
KID       4010354   3157267   3146968     2737830         1803523   2456901
TOWN      4010354   3119224   3110212     2678488         1816606   2671161
BRANCH    4010354   4019161   4004606     3309338         2559340   3380607
FLOWER    4010354   3957380   1873694     1609068         1089147   1648044

TABLE II. COMPRESSION RATIO

Image     PGM     GIF     TIFF (LZW)  TIFF (DEFLATE)  PNG     PROPOSED METHOD
LENA      1.000   1.009   1.005       0.858           0.576   0.763
BARBARA   1.000   1.112   1.106       0.901           0.675   0.835
F16       1.000   0.813   0.809       0.717           0.530   0.717
KID       1.000   0.787   0.784       0.681           0.449   0.612
TOWN      1.000   0.777   0.775       0.667           0.452   0.666
BRANCH    1.000   1.002   0.998       0.825           0.638   0.842
FLOWER    1.000   0.986   0.467       0.401           0.271   0.411
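The compression ratio defined in Section IV can be computed directly from the byte sizes in Table I. The sketch below is only an illustration: `compression_ratio` is a hypothetical helper, the two size pairs are the PROPOSED METHOD and PGM entries for LENA and KID, and the last digit may differ from Table II depending on how the published values were rounded.

```python
def compression_ratio(compressed_size, original_size):
    """CR = compressed image size / original image size (lower is better)."""
    return compressed_size / original_size


# (compressed size, original size) in bytes, taken from Table I.
table_one = {
    "LENA": (200153, 262144),
    "KID": (2456901, 4010354),
}

for name, (compressed, original) in table_one.items():
    print(name, format(compression_ratio(compressed, original), ".3f"))
```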

REFERENCES
[1] D. Hankerson, G.A. Harris, P.D. Johnson Jr., Introduction to Information Theory and Data Compression, CRC Press, Boca Raton, FL, 1997.
[2] O. Egger, P. Fleury, T. Ebrahimi, M. Kunt, High performance compression of visual information: a tutorial review, part I: still pictures, Proc. IEEE 87 (1999).
[3] V.P. Baligar, L.M. Patnaik, G.R. Nagabhushana, High compression and low order linear predictor for lossless coding of grayscale images, Image Vis. Comput. 21 (6) (2003).
[4] D. Salomon, Data Compression: The Complete Reference, third ed., Springer, New York, 2004.
[5] D. Salomon, A Guide to Data Compression Methods, Springer, New York, 2002.
[6] K. Sayood, Introduction to Data Compression, second ed., Academic Press, San Diego, CA, 2000.
[7] M. Rabbani, P.W. Jones, Digital image compression techniques, Tutorial Texts in Optical Engineering, Vol. TT7, SPIE Optical Eng. Press.
[8] K. Sayood, K. Anderson, A differential lossless image compression scheme, IEEE Trans. Signal Process. 40 (1) 236-241 (1992).
[9] S.D. Stearns, L. Tan, N. Magotra, Lossless compression of waveform data for efficient storage and transmission, IEEE Trans. Geosci. Remote Sensing 31 (3) 645-654 (1993).
[10] G.K. Wallace, The JPEG still picture compression standard, Comm. ACM 34 (4) 30-40 (1991).
[11] C.C. Cutler, Differential quantization of communication signals, U.S. Patent 2,605,361, 1952.
[12] J.B. O'Neil, Entropy coding in speech and television differential PCM systems, IEEE Trans. Inform. Theory IT-17 758-761 (1971).
[13] R.G. Gallager, Variations on a theme by Huffman, IEEE Trans. Inform. Theory 24 668-674 (1978).
[14] D.A. Huffman, A method for the construction of minimum redundancy codes, Proc. IRE 40 1098-1101 (1952).
[15] G.G. Langdon, An introduction to arithmetic coding, IBM J. Res. Develop. 28 135-149 (1984).
[16] Information Providers Guide, The EU Internet Handbook, http://ec.europa.eu/ipg/standards/image/gif/index_en.htm
[17] D. Salomon, Data Compression: The Complete Reference, fourth ed., Springer, Dec. 2006, ISBN 1-84628-602-5.
[18] Portable Network Graphics (PNG) Specification (Second Edition), http://www.w3.org/TR/PNG/#11IDAT
