ISSN (ONLINE): 2045-8711
ISSN (PRINT): 2045-869X
INTERNATIONAL JOURNAL OF INNOVATIVE TECHNOLOGY & CREATIVE ENGINEERING
AUGUST 2015, VOL. 5, NO. 8
UK: Managing Editor
International Journal of Innovative Technology and Creative Engineering
1a Park Lane, Cranford, London TW59WA, UK
E-Mail: email@example.com
Phone: +44-773-043-0249

USA: Editor
International Journal of Innovative Technology and Creative Engineering
Dr. Arumugam, Department of Chemistry, University of Georgia, GA-30602, USA
Phone: 001-706-206-0812
Fax: 001-706-542-2626

India: Editor
International Journal of Innovative Technology & Creative Engineering
Dr. Arthanariee. A. M, Finance Tracking Center India
66/2 East Mada St, Thiruvanmiyur, Chennai - 600041
Mobile: 91-7598208700
From Editor's Desk

Dear Researcher,

Greetings! The research article in this issue discusses motivational factor analysis. Let us review research around the world this month.

New Network Design Exploits Power-Efficient Flash Memory

Engineers have developed a new system that, for several common big-data applications, should make servers using flash memory as efficient as those using conventional RAM while cutting cost and power consumption. Random-access memory, or RAM, is where computers like to store the data they're working on. A processor can retrieve data from RAM tens of thousands of times more rapidly than it can from the computer's disk drive. But in the age of big data, data sets are often much too large to fit in a single computer's RAM. The data describing a single human genome would take up the RAM of somewhere between 40 and 100 typical computers. Flash memory is about a tenth as expensive as RAM, and it consumes about a tenth as much power. The problem is that it is also a tenth as fast. But at the International Symposium on Computer Architecture, MIT researchers presented a new system that, for several common big-data applications, should make servers using flash memory as efficient as those using conventional RAM, while preserving their power and cost savings.

Text Compression

Text is a very big part of most files that digital technology users create. These files may be Word or PDF documents, emails, cell phone texts (SMS format) or web pages. Being able to compress text for storage or transmission is therefore extremely important. Fortunately, files containing mainly text can be compressed significantly. As with image compression, many algorithms and methods have been devised. One important point to note about text compression is that it needs to use a lossless method. This means the method must not discard any data when it compresses the data; otherwise, the data would be incomplete when it is uncompressed.
It has been an absolute pleasure to present you with articles that you wish to read. We look forward to many more research articles on new technologies from you and your friends. We are anxiously awaiting the rich and thorough research papers that have been prepared by our authors for the next issue.
Thanks, Editorial Team IJITCE
Editorial Members

Dr. Chee Kyun Ng, Ph.D., Department of Computer and Communication Systems, Faculty of Engineering, Universiti Putra Malaysia, UPM Serdang, 43400 Selangor, Malaysia.
Dr. Simon SEE, Ph.D., Chief Technologist and Technical Director at Oracle Corporation, Associate Professor (Adjunct) at Nanyang Technological University, Professor (Adjunct) at Shanghai Jiaotong University, 27 West Coast Rise #08-12, Singapore 127470.
Dr. sc. agr. Horst Juergen SCHWARTZ, Ph.D., Humboldt-University of Berlin, Faculty of Agriculture and Horticulture, Asternplatz 2a, D-12203 Berlin, Germany.
Dr. Marco L. Bianchini, Ph.D., Italian National Research Council; IBAF-CNR, Via Salaria km 29.300, 00015 Monterotondo Scalo (RM), Italy.
Dr. Nijad Kabbara, Ph.D., Marine Research Centre / Remote Sensing Centre / National Council for Scientific Research, P. O. Box: 189 Jounieh, Lebanon.
Dr. Aaron Solomon, Ph.D., Department of Computer Science, National Chi Nan University, No. 303, University Road, Puli Town, Nantou County 54561, Taiwan.
Dr. Arthanariee. A. M, M.Sc., M.Phil., M.S., Ph.D., Director - Bharathidasan School of Computer Applications, Ellispettai, Erode, Tamil Nadu, India.
Dr. Takaharu KAMEOKA, Ph.D., Professor, Laboratory of Food, Environmental & Cultural Informatics, Division of Sustainable Resource Sciences, Graduate School of Bioresources, Mie University, 1577 Kurimamachiya-cho, Tsu, Mie, 514-8507, Japan.
Dr. M. Sivakumar, M.C.A., ITIL., PRINCE2., ISTQB., OCP., ICP., Ph.D., Project Manager - Software, Applied Materials, 1a Park Lane, Cranford, UK.
Dr. Bulent Acma, Ph.D., Anadolu University, Department of Economics, Unit of Southeastern Anatolia Project (GAP), 26470 Eskisehir, Turkey.
Dr. Selvanathan Arumugam, Ph.D., Research Scientist, Department of Chemistry, University of Georgia, GA-30602, USA.
Review Board Members

Dr. Paul Koltun, Senior Research Scientist, LCA and Industrial Ecology Group, Metallic & Ceramic Materials, CSIRO Process Science & Engineering, Private Bag 33, Clayton South MDC 3169, Gate 5 Normanby Rd., Clayton Vic. 3168, Australia.
Dr. Zhiming Yang, MD., Ph.D., Department of Radiation Oncology and Molecular Radiation Science, 1550 Orleans Street Rm 441, Baltimore MD, 21231, USA.
Dr. Jifeng Wang, Department of Mechanical Science and Engineering, University of Illinois at Urbana-Champaign, Urbana, Illinois, 61801, USA.
Dr. Giuseppe Baldacchini, ENEA - Frascati Research Center, Via Enrico Fermi 45 - P.O. Box 65, 00044 Frascati, Roma, Italy.
Dr. Mutamed Turki Nayef Khatib, Assistant Professor of Telecommunication Engineering, Head of Telecommunication Engineering Department, Palestine Technical University (Kadoorie), Tul Karm, Palestine.
Dr. P. Uma Maheswari, Prof. & Head, Department of CSE/IT, INFO Institute of Engineering, Coimbatore.
Dr. T. Christopher, Ph.D., Assistant Professor & Head, Department of Computer Science, Government Arts College (Autonomous), Udumalpet, India.
Dr. T. DEVI, Ph.D. Engg. (Warwick, UK), Head, Department of Computer Applications, Bharathiar University, Coimbatore - 641 046, India.
Dr. Renato J. Orsato, Professor at FGV-EAESP, Getulio Vargas Foundation, São Paulo Business School, Rua Itapeva, 474 (8° andar), 01332-000, São Paulo (SP), Brazil; Visiting Scholar at INSEAD, INSEAD Social Innovation Centre, Boulevard de Constance, 77305 Fontainebleau, France.
Y. Benal Yurtlu, Assist. Prof., Ondokuz Mayis University.
Dr. Sumeer Gul, Assistant Professor, Department of Library and Information Science, University of Kashmir, India.
Dr. Chutima Boonthum-Denecke, Ph.D., Department of Computer Science, Science & Technology Bldg., Rm 120, Hampton University, Hampton, VA 23688.
Dr. Renato J. Orsato, Professor at FGV-EAESP, Getulio Vargas Foundation, São Paulo Business School, Rua Itapeva, 474 (8° andar), 01332-000, São Paulo (SP), Brazil.
Dr. Lucy M. Brown, Ph.D., Texas State University, 601 University Drive, School of Journalism and Mass Communication, OM330B, San Marcos, TX 78666.
Javad Robati, Crop Production Department, University of Maragheh, Golshahr, Maragheh, Iran.
Vinesh Sukumar (PhD, MBA), Product Engineering Segment Manager, Imaging Products, Aptina Imaging Inc.
Dr. Binod Kumar, PhD(CS), M.Phil.(CS), MIAENG, MIEEE, HOD & Associate Professor, IT Dept, Medi-Caps Inst. of Science & Tech. (MIST), Indore, India.
Dr. S. B. Warkad, Associate Professor, Department of Electrical Engineering, Priyadarshini College of Engineering, Nagpur, India.
Dr. doc. Ing. Rostislav Chotěborský, Ph.D., Katedra materiálu a strojírenské technologie, Technická fakulta, Česká zemědělská univerzita v Praze, Kamýcká 129, Praha 6, 165 21.
Dr. Paul Koltun, Senior Research Scientist, LCA and Industrial Ecology Group, Metallic & Ceramic Materials, CSIRO Process Science & Engineering, Private Bag 33, Clayton South MDC 3169, Gate 5 Normanby Rd., Clayton Vic. 3168.
Dr. Chutima Boonthum-Denecke, Ph.D., Department of Computer Science, Science & Technology Bldg., Hampton University, Hampton, VA 23688.
Mr. Abhishek Taneja, B.Sc. (Electronics), M.B.E, M.C.A., M.Phil., Assistant Professor in the Department of Computer Science & Applications at Dronacharya Institute of Management and Technology, Kurukshetra, India.
Dr. Ing. Rostislav Chotěborský, Ph.D., Katedra materiálu a strojírenské technologie, Technická fakulta, Česká zemědělská univerzita v Praze, Kamýcká 129, Praha 6, 165 21.
Dr. Amala Vijaya Selvi Rajan, B.Sc., Ph.D., Faculty - Information Technology, Dubai Women's College - Higher Colleges of Technology, P.O. Box - 16062, Dubai, UAE.
Naik Nitin Ashokrao, B.Sc., M.Sc., Lecturer in Yeshwant Mahavidyalaya, Nanded University.
Dr. A. Kathirvell, B.E, M.E, Ph.D., MISTE, MIACSIT, MENGG, Professor - Department of Computer Science and Engineering, Tagore Engineering College, Chennai.
Dr. H. S. Fadewar, B.Sc., M.Sc., M.Phil., Ph.D., PGDBM, B.Ed., Associate Professor - Sinhgad Institute of Management & Computer Application, Mumbai-Bangalore Westernly Express Way, Narhe, Pune - 41.
Dr. David Batten, Leader, Algal Pre-Feasibility Study, Transport Technologies and Sustainable Fuels, CSIRO Energy Transformed Flagship, Private Bag 1, Aspendale, Vic. 3195, Australia.
Dr. R. C. Panda (M.Tech & PhD (IITM); Ex-Faculty (Curtin Univ. Tech, Perth, Australia)), Scientist, CLRI (CSIR), Adyar, Chennai - 600 020, India.
Miss Jing He, Ph.D. Candidate of Georgia State University, 1450 Willow Lake Dr. NE, Atlanta, GA, 30329.
Jeremiah Neubert, Assistant Professor, Mechanical Engineering, University of North Dakota.
Hui Shen, Mechanical Engineering Dept, Ohio Northern Univ.
Dr. Xiangfa Wu, Ph.D., Assistant Professor / Mechanical Engineering, North Dakota State University.
Seraphin Chally Abou, Professor, Mechanical & Industrial Engineering Depart, MEHS Program, 235 Voss-Kovach Hall, 1305 Ordean Court, Duluth, Minnesota 55812-3042.
Dr. Qiang Cheng, Ph.D., Assistant Professor, Computer Science Department, Southern Illinois University Carbondale, Faner Hall, Room 2140 - Mail Code 4511, 1000 Faner Drive, Carbondale, IL 62901.
Dr. Carlos Barrios, PhD, Assistant Professor of Architecture, School of Architecture and Planning, The Catholic University of America.
Y. Benal Yurtlu, Assist. Prof., Ondokuz Mayis University.
Dr. Lucy M. Brown, Ph.D., Texas State University, 601 University Drive, School of Journalism and Mass Communication, OM330B, San Marcos, TX 78666.
Dr. Paul Koltun, Senior Research Scientist, LCA and Industrial Ecology Group, Metallic & Ceramic Materials, CSIRO Process Science & Engineering.
Dr. Sumeer Gul, Assistant Professor, Department of Library and Information Science, University of Kashmir, India.
Dr. Chutima Boonthum-Denecke, Ph.D., Department of Computer Science, Science & Technology Bldg., Rm 120, Hampton University, Hampton, VA 23688.
Dr. Renato J. Orsato, Professor at FGV-EAESP, Getulio Vargas Foundation, São Paulo Business School, Rua Itapeva, 474 (8° andar), 01332-000, São Paulo (SP), Brazil.
Dr. Wael M. G. Ibrahim, Department Head - Electronics Engineering Technology Dept., School of Engineering Technology, ECPI College of Technology, 5501 Greenwich Road Suite 100, Virginia Beach, VA 23462.
Dr. Messaoud Jake Bahoura, Associate Professor - Engineering Department and Center for Materials Research, Norfolk State University, 700 Park Avenue, Norfolk, VA 23504.
Dr. V. P. Eswaramurthy, M.C.A., M.Phil., Ph.D., Assistant Professor of Computer Science, Government Arts College (Autonomous), Salem - 636 007, India.
Dr. P. Kamakkannan, M.C.A., Ph.D., Assistant Professor of Computer Science, Government Arts College (Autonomous), Salem - 636 007, India.
Dr. V. Karthikeyani, Ph.D., Assistant Professor of Computer Science, Government Arts College (Autonomous), Salem - 636 008, India.
Dr. K. Thangadurai, Ph.D., Assistant Professor, Department of Computer Science, Government Arts College (Autonomous), Karur - 639 005, India.
Dr. N. Maheswari, Ph.D., Assistant Professor, Department of MCA, Faculty of Engineering and Technology, SRM University, Kattangulathur, Kanchipiram Dt - 603 203, India.
Mr. Md. Musfique Anwar, B.Sc. (Engg.), Lecturer, Computer Science & Engineering Department, Jahangirnagar University, Savar, Dhaka, Bangladesh.
Mrs. Smitha Ramachandran, M.Sc. (CS), SAP Analyst, Akzonobel, Slough, United Kingdom.
Dr. V. Vallimayil, Ph.D., Director, Department of MCA, Vivekanandha Business School For Women, Elayampalayam, Tiruchengode - 637 205, India.
Mr. M. Moorthi, M.C.A., M.Phil., Assistant Professor, Department of Computer Applications, Kongu Arts and Science College, India.
Prema Selvaraj, B.Sc., M.C.A, M.Phil., Assistant Professor, Department of Computer Science, KSR College of Arts and Science, Tiruchengode.
Mr. G. Rajendran, M.C.A., M.Phil., N.E.T., PGDBM., PGDBF., Assistant Professor, Department of Computer Science, Government Arts College, Salem, India.
Dr. Pradeep H Pendse, B.E., M.M.S., Ph.D., Dean - IT, Welingkar Institute of Management Development and Research, Mumbai, India.
Muhammad Javed, Centre for Next Generation Localisation, School of Computing, Dublin City University, Dublin 9, Ireland.
Dr. G. GOBI, Assistant Professor - Department of Physics, Government Arts College, Salem - 636 007.
Dr. S. Senthilkumar, Post Doctoral Research Fellow (Mathematics and Computer Science & Applications), Universiti Sains Malaysia, School of Mathematical Sciences, Pulau Pinang - 11800 [Penang], Malaysia.
Manoj Sharma, Associate Professor, Deptt. of ECE, Prannath Parnami Institute of Management & Technology, Hissar, Haryana, India.
RAMKUMAR JAGANATHAN, Asst. Professor, Dept of Computer Science, V.L.B Janakiammal College of Arts & Science, Coimbatore, Tamilnadu, India.
Dr. S. B. Warkad, Assoc. Professor, Priyadarshini College of Engineering, Nagpur, Maharashtra State, India.
Dr. Saurabh Pal, Associate Professor, UNS Institute of Engg. & Tech., VBS Purvanchal University, Jaunpur, India.
Manimala, Assistant Professor, Department of Applied Electronics and Instrumentation, St Joseph's College of Engineering & Technology, Choondacherry Post, Kottayam Dt., Kerala - 686579.
Dr. Qazi S. M. Zia-ul-Haque, Control Engineer, Synchrotron-light for Experimental Sciences and Applications in the Middle East (SESAME), P. O. Box 7, Allan 19252, Jordan.
Dr. A. Subramani, M.C.A., M.Phil., Ph.D., Professor, Department of Computer Applications, K.S.R. College of Engineering, Tiruchengode - 637215.
Dr. Seraphin Chally Abou, Professor, Mechanical & Industrial Engineering Depart., MEHS Program, 235 Voss-Kovach Hall, 1305 Ordean Court, Duluth, Minnesota 55812-3042.
Dr. K. Kousalya, Professor, Department of CSE, Kongu Engineering College, Perundurai - 638 052.
Dr. (Mrs.) R. Uma Rani, Asso. Prof., Department of Computer Science, Sri Sarada College For Women, Salem-16, Tamil Nadu, India.
MOHAMMAD YAZDANI-ASRAMI, Electrical and Computer Engineering Department, Babol "Noshirvani" University of Technology, Iran.
Dr. Kulasekharan, N, Ph.D., Technical Lead - CFD, GE Appliances and Lighting, GE India, John F Welch Technology Center, Plot # 122, EPIP, Phase 2, Whitefield Road, Bangalore - 560066, India.
Dr. Manjeet Bansal, Dean (Post Graduate), Department of Civil Engineering, Punjab Technical University, Giani Zail Singh Campus, Bathinda - 151001 (Punjab), India.
Dr. Oliver Jukić, Vice Dean for Education, Virovitica College, Matije Gupca 78, 33000 Virovitica, Croatia.
Dr. Lori A. Wolff, Ph.D., J.D., Professor of Leadership and Counselor Education, The University of Mississippi, Department of Leadership and Counselor Education, 139 Guyton, University, MS 38677.
Contents

A COMPARATIVE ANALYSIS ON DIFFERENT TECHNIQUES IN TEXT COMPRESSION
Dr. S. Pannirselvam & D. Selvanayagi
A COMPARATIVE ANALYSIS ON DIFFERENT TECHNIQUES IN TEXT COMPRESSION

Dr. S. Pannirselvam
Research Supervisor & Head, Department of Computer Science, Erode Arts & Science College (Autonomous), Erode, Tamil Nadu, India.
Email: firstname.lastname@example.org

D. Selvanayagi
Ph.D. Research Scholar, Department of Commerce with Computer Application, Erode Arts & Science College (Autonomous), Erode, Tamil Nadu, India.
Email: email@example.com

Abstract: In digital image processing, the term image refers to a digital image and its manipulation by means of a processor. Data compression is the computing process used to cope with constraints on memory. It is one of the important procedures for reducing the space occupied by a file, which normally also reduces the time taken to access the text file. Compression techniques are of two types, lossy and lossless. This research work deals with lossless compression techniques and compares Run Length Encoding, Huffman Coding, Lempel Ziv Welch and Arithmetic Coding. The parameters considered in analyzing the text files are Compression Time, Compression Speed and Compression Ratio.

Keywords: Data Compression, Lossless compression, RLE, Huffman, LZW, Arithmetic.
1. INTRODUCTION
Digital image processing is a subset of the electronic domain in which an image is converted to an array of small integers called pixels, each representing a physical quantity such as scene radiance, stored in digital memory and processed by a computer or other digital hardware. Digital image compression is the reduction of the image data rate to save storage space and reduce transmission rate requirements. Compression is a means of reducing the quantity of data needed for the storage or transmission of information such as text, images, video and sound. Data compression algorithms are classified into lossy and lossless data compression algorithms. Lossy compression techniques are used to compress images and data files for communication or archival purposes. Lossy compression methods are categorized according to the type of data they are designed to compress. Methods that can handle all binary input can be applied to any form of data; however, most are unable to achieve significant compression on data that is not of the type they were designed for. Lossless compression methods are likewise categorized according to the type of data they are designed to compress. An algorithm that can handle all binary input can be used on any type of data; however, most are unable to achieve significant compression on data that is not of the particular type they target.
2. LITERATURE REVIEW
Apoorv et al. presented a comparative analysis of Huffman coding and Arithmetic coding. Arithmetic coding accommodates adaptive models easily and provides a separation between model and coding. In arithmetic coding there is no need to translate each symbol into an integral number of bits, but it involves heavy computation on the data, such as multiplication and division. The disadvantages of arithmetic coding are that it runs slowly, is complicated to implement and does not produce a prefix code. Arithmetic compression is more suitable for small texts when compared with Huffman compression, while Huffman compression is more suitable for large texts.

Tanvi et al. analyzed lossless data compression techniques. The major focus is on text compression methods, both dictionary based and entropy based. Among the entropy-based techniques, Run Length Encoding is not used as much as Shannon Fano and Huffman coding; these two methods are much better than RLE. Shannon Fano and Huffman compression are almost the same, with Huffman better than Shannon Fano by a very small margin. Among the dictionary-based methods, three are discussed, of which LZW works best in comparison to LZ77 and LZ78.

Kashfia et al. suggested that, after computing and comparing the compression ratio, average code length and standard deviation for Shannon Fano Coding, Huffman Coding, Repeated Huffman Coding, Run-Length Coding and Modified Run-Length Coding, an idea is obtained of how much compression each technique can achieve. The most effective algorithm can then be chosen based on the input text file size, content type, available memory and execution time to get the best result. Future work can be carried out on an efficient and optimal coding technique that mixes two or more coding techniques for image files, exe files and so on, to improve the compression ratio and reduce the average code length.
Altarawneh et al. presented various data compression methods, such as LZW and Huffman coding, as used on text files. The authors evaluated and tested the algorithms on text files of various sizes and compared their performance on parameters such as compressed size, compression ratio and compression time.

Khurana et al. presented a new compression technique that uses referencing through two-byte numbers for encoding. The technique is efficient in providing a high compression ratio and faster search through the text. It needs an extensive study of general sentence formats and of the scope for maximum compression. Another area of research would be to modify the compression scheme so that searching is even faster.

Singh et al. described a two-phase encoding technique that compresses sorted data more efficiently. The authors provide a new way to enhance compression by merging RLE and incremental compression algorithms. In the first phase, the data is compressed by applying the RLE algorithm, which replaces frequently occurring data bits with shorter bit sequences. In the second phase, the incremental compression algorithm stores the prefix that the current symbol shares with the previous symbol and replaces it with an integer value. Using the two-phase encoding technique, the size of sorted data can be reduced by 50%.

Mann et al. discussed and compared a selected set of lossless data compression algorithms: RLE, Huffman and Arithmetic coding. The author compares the performance of these algorithms on the basis of parameters such as compression ratio and compression speed, and concludes that the compression speed of Huffman coding is better than that of Arithmetic coding.

Even so, many issues remain open in lossless text compression techniques, so there is a need to develop a new, efficient model for text compression.

3. METHODOLOGY

3.1 Run Length Encoding
Data often contains sequences of identical bytes. By replacing repeated byte sequences with the number of occurrences, a substantial reduction of data can be achieved; this is called Run Length Encoding. Run Length Encoding (RLE) is one of the simplest data compression algorithms. The main aim of the RLE algorithm is to pick out the runs in the source file and to report the symbol and the length of each run. The technique applies when the same character is repeated consecutively in the text file.

Algorithm
Step 1: Read a character from the input string.
Step 2: Store the repeated character.
Step 3: If the character is not repeated, write out the current character.
Step 4: If the current character does not match the previous character, set the previous character to the current character and repeat from Step 1.
Step 5: The repeated characters are encoded as 2 bytes.
Step 6: If the character is null, exit.

For example, consider the string "ABABBBBBBC" as the source to compress. The first 3 letters are taken as a non-run of length 3, and the next 6 letters are taken as a run of length 6, since the symbol B is repeated consecutively. In this manner Run Length Encoding can compress a file or any type of document, but it is of limited use on big files, which may not contain many repeated words or symbols.

3.2 Huffman Coding
Huffman coding deals with data compression of ASCII characters. It follows a bottom-up approach: the binary tree is built from the leaves up to the root to achieve a better result. In Huffman coding, the characters in a data file are converted to binary codes. The most common characters in the file have the shortest binary codes, and the least common have the longest. A Huffman code can be determined by successively constructing a binary tree whose leaves represent the characters to be encoded. Each node carries the relative probability of occurrence of the characters belonging to the subtree beneath the node. The edges are labeled with the bits 0 and 1.

Algorithm
Step 1: Parse the input and count the occurrences of each symbol.
Step 2: Determine the probability of occurrence of each symbol using the symbol count.
Step 3: Sort the symbols according to their probability of occurrence, most probable first.
Step 4: Generate a leaf node for each symbol and add them to a queue.
Step 5: Take the two least frequent characters and group them together to obtain their combined frequency; repeating this leads to the construction of a binary tree. Then label the edge from each parent to its left child with the digit 0 and the edge to its right child with the digit 1. Tracing down the tree yields the Huffman codes, in which the shortest codes are assigned to the characters with the greatest frequency.

3.3 LZW
In LZW, only the index into the dictionary is sent. Dictionary-based compression algorithms rely on a dictionary instead of a statistical model. A dictionary is a set of possible words of a language, stored in a table-like structure, and the indexes of its entries are used to represent larger, repeating dictionary words. The Lempel-Ziv-Welch algorithm, or simply LZW, is one such algorithm. In this method, a dictionary is used to store and index previously seen string patterns. In the compression process, those index values are used instead of the repeating string patterns.
The dictionary is created dynamically during compression, and there is no need to transfer it with the encoded message for decompression. In the decompression process, the same dictionary is recreated dynamically. This algorithm is therefore an adaptive compression algorithm. LZW is a general compression algorithm capable of working on almost any type of data. LZW compression creates a table of strings commonly occurring in the data being compressed and replaces the actual data with references into the table. The table is formed during compression at the same time as the data is encoded, and during decompression at the same time as the data is decoded. The algorithm is surprisingly simple. LZW compression replaces strings of characters with single codes. It does not do any analysis of the incoming text. Instead, it just adds every new string of characters it sees to a table of strings. Compression occurs when a single code is output instead of a string of characters. It starts with a "dictionary" of all the single characters, with indexes 0...255. It then expands the dictionary as information is sent through. Soon, redundant strings are coded as a single code, and compression has occurred. This means codes 0-255 refer to individual bytes, while codes 256-4095 refer to substrings.

Algorithm
Step 1: Read the first byte of the input and store it as the string.
Step 2: Check whether the input string is available in the dictionary.
Step 3: If the string is not available in the dictionary, add it to the dictionary.
Step 4: If it is available, append the next character to the string, until the last character.
Step 5: Read the next character and output the string.
Step 6: Repeat the process until the character is null.
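The LZW procedure described above can be illustrated with a minimal Python sketch. It follows the textbook formulation of LZW rather than the authors' implementation, which is not given in the paper: the function names `lzw_compress` and `lzw_decompress` and the `max_code` parameter are hypothetical, and the 4095 limit is taken from the code range 256-4095 mentioned in the text.

```python
def lzw_compress(data: bytes, max_code: int = 4095) -> list[int]:
    """Return the list of dictionary codes for `data` (textbook LZW)."""
    # Codes 0-255 are preassigned to the single-byte strings.
    dictionary = {bytes([i]): i for i in range(256)}
    next_code = 256
    current = b""
    codes = []
    for value in data:
        candidate = current + bytes([value])
        if candidate in dictionary:
            # Keep extending the match while it is still a known string.
            current = candidate
        else:
            # Emit the code of the longest known string ...
            codes.append(dictionary[current])
            # ... and register the new string while codes remain.
            if next_code <= max_code:
                dictionary[candidate] = next_code
                next_code += 1
            current = bytes([value])
    if current:
        codes.append(dictionary[current])
    return codes


def lzw_decompress(codes: list[int], max_code: int = 4095) -> bytes:
    """Rebuild the original bytes; the dictionary is recreated on the fly."""
    if not codes:
        return b""
    dictionary = {i: bytes([i]) for i in range(256)}
    next_code = 256
    previous = dictionary[codes[0]]
    output = [previous]
    for code in codes[1:]:
        if code in dictionary:
            entry = dictionary[code]
        elif code == next_code:
            # Special case: the code refers to the string being defined.
            entry = previous + previous[:1]
        else:
            raise ValueError(f"invalid LZW code: {code}")
        output.append(entry)
        if next_code <= max_code:
            dictionary[next_code] = previous + entry[:1]
            next_code += 1
        previous = entry
    return b"".join(output)
```

For the input `b"ABABABA"` the compressor emits the codes `[65, 66, 256, 258]`: two single-byte codes followed by two dictionary references, and the decompressor rebuilds the input without ever receiving the dictionary. The sketch returns integer codes and leaves out packing them into 12-bit fields.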
Table 1: Run Length Encoding
[Table 1 columns: Original Input File Size (KB), Compression Time (M/s), Compression Speed, Compression Ratio]
Table 1 shows the Compression Time, Compression Speed and Compression Ratio for input files of different sizes when Run Length Encoding is applied.
3.4 Arithmetic coding
The basic idea of arithmetic coding is to assign short code words to more probable events and longer code words to less probable events. Arithmetic coding provides an effective mechanism for removing redundancy in the encoding of data. In arithmetic coding, an interval is assigned to each symbol. Starting with the interval [0, 1), each interval is divided into several subintervals whose sizes are proportional to the current probabilities of the corresponding symbols of the alphabet. The subinterval of the coded symbol is then taken as the interval for the next symbol. The output is the interval of the last symbol.

Algorithm
Step 1: Read the input text file.
Step 2: Calculate the number of unique symbols in the input text file.
Step 3: Assign values to each unique symbol in the order they appear.
Step 4: Replace the symbols in the input with their codes.
Step 5: Convert the code to a long fixed-point binary number to preserve precision, and store the length of the input string for use in decoding.

Arithmetic coding departs from the method of replacing each input symbol with a code word. Instead, it replaces a string of input data with a single floating-point number as output.

4. EXPERIMENTATION & RESULTS
This work focuses on the performance of the various algorithms on the given inputs. The compression algorithms are evaluated on the basis of Compression Time, Compression Ratio and Compression Speed.
a) Data Compression Ratio (CR) = Uncompressed Size / Compressed Size
b) Compression Time is the time taken to compress a file; it depends on the size of the file to be compressed.
c) Compression Speed = Uncompressed Bits / Seconds to Compress
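The three measures above can be computed as in the following minimal Python sketch. It is an illustration only, not the authors' measurement harness: the function name `measure_compression` and the keys of the returned dictionary are hypothetical, and the standard-library `zlib.compress` (DEFLATE) is used as a stand-in compressor because the paper does not list code for its four algorithms.

```python
import time
import zlib


def measure_compression(data: bytes, compress=zlib.compress) -> dict:
    """Time one call to `compress` and derive the three measures."""
    start = time.perf_counter()
    compressed = compress(data)
    elapsed = time.perf_counter() - start  # compression time in seconds

    return {
        "original_bytes": len(data),
        "compressed_bytes": len(compressed),
        "compression_time_s": elapsed,
        # a) Compression Ratio = uncompressed size / compressed size
        "compression_ratio": len(data) / len(compressed),
        # c) Compression Speed = uncompressed bits / seconds to compress
        "compression_speed_bps": (
            len(data) * 8 / elapsed if elapsed > 0 else float("inf")
        ),
    }


if __name__ == "__main__":
    text = b"ABABBBBBBC" * 10_000  # repetitive sample input
    for name, value in measure_compression(text).items():
        print(f"{name}: {value}")
```

Any function that maps `bytes` to a non-empty `bytes` result can be passed as `compress`, so the same harness could wrap implementations of RLE, Huffman, LZW or arithmetic coding. A ratio above 1 means the output is smaller than the input. Timings of a single call are noisy, so averaging over several runs is advisable when files are small.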
Table 2: Huffman coding
[Table 2 columns: Original Input File Size (KB), Compression Time (M/s), Compression Speed, Compression Ratio]
Table 2 shows the Compression Time, Compression Speed and Compression Ratio for input files of different sizes when Huffman coding is applied.

Table 3: LZW
[Table 3 columns: Original Input File Size (KB), Compression Time (M/s), Compression Speed, Compression Ratio]
Table 3 shows the Compression Time, Compression Speed and Compression Ratio for the different input files when LZW is applied.

Table 4: Arithmetic Encoding
[Table 4 columns: Original Input File Size (KB), Compression Time (M/s), Compression Speed, Compression Ratio]
Table 4 shows the Compression Time, Compression Speed and Compression Ratio for the different input files when Arithmetic Encoding is applied.

Table 5: Overall Performance of Compression Ratio
[Table 5: compression ratio per input file; surviving column header: Run Length Coding]
Table 5 shows the Compression Ratio for the different input files.
Fig. 1: Overall Compression Ratio
Fig. 1 shows the Compression Ratio for input files of different sizes. Huffman Coding gives a better compression ratio than the other techniques.

5. CONCLUSION
This work focuses on four different lossless compression techniques. Text data are compressed and decompressed using each of these techniques. From the overall performance, we observed that the Huffman technique gives a better compression ratio than the other techniques. The Huffman technique also gives better results for compression time and is best suited in terms of compression speed. Even so, there will be a need in future to develop a lossless text compression algorithm that compresses text data efficiently and that can also be used in the various real-time applications where compression of text data is required.
6. REFERENCES
Sashikala, Melwin Y., Arunodhayan Sam Solomon, M. N. Nachappa, "A survey of compression techniques", International Journal of Recent Technology and Engineering, Vol. 2, Issue 1, March 2013.
Apoorv Vikram Singh and Garima Singh, "A survey on Different text Data Compression Techniques", International Journal of Science and Research, Vol. 3, July 2014.
S. Shanmugasundaram and R. Lourdusamy, "A Comparative Study of Text Compression Algorithms", International Journal of Wisdom Based Computing, Vol. 1, December 2011.
Tanvi Patel, Judith Angela, Kurti Dangarwala, "Survey Of Text Compression Algorithms", International Journal of Engineering Research and Technology, Issue 3, March 2015.
S. Senthil and L. Robert, "Text compression Algorithms A Comparative Study", Journal of Communication Technology, Vol. 1, December 2011.
Ming-Bo Lin, Jang-Feng Lee, Gene Eu Jan, "A Lossless Data Compression and Decompression Algorithm and its Hardware Architecture", IEEE Transactions on Very Large Scale Integration Systems, Vol. 14, Issue 9, pp. 925-936, October 2006.
U. Khurana and A. Koul, "Text compression and superfast searching", Thapar Institute of Engineering Technology, Patiala, Punjab, India.
Ming-Bo Lin, Jang-Feng Lee, Gene Eu Jan, "A Lossless Data Compression and Decompression Algorithm and its Hardware Architecture", IEEE Transactions on Very Large Scale Integration Systems, Vol. 14, Issue 9, pp. 925-936, October 2006.
S. Porwal, Y. Chaudhary, J. Joshi and Jain, "Data Compression Methodologies for Lossless Data and Comparison between Algorithms", International Journal of Engineering Science and Innovative Technology, Vol. 2, March 2013.
R. Kaur and M. Goyal, "An Algorithm for Lossless Text Data Compression", International Journal of Engineering Research & Technology, Vol. 2, July 2013.
Amarjit Kaur, Navdeep Singh Sethi, Harinderpal Singh, "A Review on Data Compression Techniques", International Journal of Advanced Research in Computer Science and Software Engineering, Vol. 5, January 2015.
Kashfia Sailunaz, Mohammed Rokibul Alam Kotwal, Mohammad Nurul Huda, "Data Compression Considering Text Files", International Journal of Computer Applications (0975-8887), Volume 90, No. 11, March 2014.
H. Altarawneh and M. Altarawneh, "Data compression Techniques on text files: A Comparative study", International Journal of Computer Applications, Volume 26, No. 5, July 2011.
A. J. Mann, "Analysis and comparison of Algorithms for Lossless data Compression", International Journal of Information and Computation Technology, ISSN: 0974-2239, pp. 139-146.
Nathanael Jacob, Priyanka Somvanshi, Rupali Tornekar, "Comparative Analysis of Lossless Text Compression Techniques", International Journal of Computer Applications, Volume 56, No. 3, October 2012.
A. Singh and Y. Bhatnagar, "Enhancement of data Compression using Incremental Encoding", International Journal of Scientific & Engineering Research, Volume 3, Issue 5, May 2012.
Amit Jain, Kamaljit, I. Lakhtaria, "Comparative Study of Dictionary Based Compression Algorithms on Text Data", International Journal of Computer Engineering and Applications, Volume 5, Issue 2, May 14.