TXT COMPRESSION PROJECT.ppt

Transcript
TEXT COMPRESSION

SUBMITTED BY
MUKESH KUMAR TANWAR 11104EN067
VIKASH KUMAR 11104EN068
PARAS CHANDOLIA 11104EN070

ABSTRACT
Text compression is the science and art of representing information in a compact form. Many text compression algorithms are available for compressing files of different formats. In this paper, we use the Huffman algorithm for text compression.

INTRODUCTION
Text compression refers to reducing the amount of space needed to store text. There are two types of compression:

LOSSLESS- Compression is lossless only if the original data can be reconstructed exactly from the compressed version. Lossless techniques are used when the source data are so important that no detail can be lost. Examples of such data are medical images, text, and some computer executable files.

LOSSY- Lossy algorithms irreversibly remove some parts of the data, so only an approximation of the original data can be reconstructed. Multimedia data such as images, video and audio are more easily compressed by lossy techniques.

HUFFMAN CODING
Huffman coding replaces the original code used for the characters of the text (ASCII code, stored on 8 bits per character) with a new code. Coding the text simply means replacing each symbol by its new code-word.

The Huffman algorithm uses the notion of a prefix code. A prefix code is a set of words in which no word is a prefix of another word of the set. For example, {0, 10, 11} is a prefix code, while {0, 01} is not, because 0 is a prefix of 01. The advantage of such a code is that decoding is immediate: the decoder recognizes each code-word as soon as its last bit has been read, without looking ahead.

A Huffman code assigns shorter code-words to symbols that occur more often, so the symbols with the highest frequency get the shortest bit lengths. The key to decompressing a Huffman code is the Huffman tree.

The steps required to compress the text and to recover it are as follows:
- Count the character frequencies.
- Build the Huffman coding tree.
- Build the character code-words from the coding tree.
- Store the coding tree in the compressed file.
- Encode the characters in the compressed file.
- Complete function for Huffman encoding.
- Rebuild the tree from the header of the compressed file.
- Recover the original text.
- Complete function for Huffman decoding.
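
The steps above can be sketched in Python as follows. This is a minimal illustration, not the project's own code: all function names (build_tree, build_codes, serialize, deserialize, huffman_encode, huffman_decode) are hypothetical, the tree header uses an assumed pre-order text format, and the encoded bits are kept as a string of "0" and "1" characters instead of being packed into bytes.

```python
import heapq
from collections import Counter
from itertools import count


def build_tree(text):
    """Count frequencies and build the tree. A leaf is a character,
    an internal node is a (left, right) tuple."""
    if not text:
        raise ValueError("cannot build a Huffman tree for empty text")
    tie = count()  # unique tie-breaker so the heap never compares nodes
    heap = [(f, next(tie), ch) for ch, f in Counter(text).items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        # Merge the two least frequent subtrees into one.
        f1, _, left = heapq.heappop(heap)
        f2, _, right = heapq.heappop(heap)
        heapq.heappush(heap, (f1 + f2, next(tie), (left, right)))
    return heap[0][2]


def build_codes(node, prefix="", codes=None):
    """Walk the tree: a left edge adds '0', a right edge adds '1'."""
    if codes is None:
        codes = {}
    if isinstance(node, tuple):
        build_codes(node[0], prefix + "0", codes)
        build_codes(node[1], prefix + "1", codes)
    else:
        codes[node] = prefix or "0"  # a one-symbol text still needs one bit
    return codes


def serialize(node):
    """Assumed header format (pre-order): '0' marks an internal node,
    '1' marks a leaf and is followed by the leaf's character."""
    if isinstance(node, tuple):
        return "0" + serialize(node[0]) + serialize(node[1])
    return "1" + node


def deserialize(header, pos=0):
    """Rebuild the tree from the header; returns (node, next position)."""
    if header[pos] == "1":
        return header[pos + 1], pos + 2
    left, pos = deserialize(header, pos + 1)
    right, pos = deserialize(header, pos)
    return (left, right), pos


def huffman_encode(text):
    """Complete encoding: returns (tree header, bit string)."""
    tree = build_tree(text)
    codes = build_codes(tree)
    return serialize(tree), "".join(codes[ch] for ch in text)


def huffman_decode(header, bits):
    """Complete decoding: rebuild the tree, then follow the bits."""
    tree, _ = deserialize(header)
    if not isinstance(tree, tuple):  # text with a single distinct symbol
        return tree * len(bits)
    out, node = [], tree
    for bit in bits:
        node = node[0] if bit == "0" else node[1]
        if not isinstance(node, tuple):  # reached a leaf: emit and restart
            out.append(node)
            node = tree
    return "".join(out)


header, bits = huffman_encode("abracadabra")
print(len(bits), "bits instead of", 8 * len("abracadabra"))
print(huffman_decode(header, bits))
```

Because the code is a prefix code, huffman_decode can emit a character the moment it reaches a leaf and then restart from the root. A real compressor would also pack the bit string into bytes and record the number of valid bits; that step is left out here.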