Syllabus - Data Compression (CD504)


CSE-Data Science/Data Science

Data Compression (CD504)

V semester

Unit I

Compression Techniques

Lossless compression, Lossy compression, Measures of performance, Modeling and coding, Mathematical Preliminaries for Lossless compression: A brief introduction to information theory, Models: Physical models, Probability models, Markov models, Composite source model, Coding: Uniquely decodable codes, Prefix codes.
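For illustration, two of the Unit I ideas can be sketched in a few lines of Python: a first-order entropy estimate under a simple probability model, and a test for the prefix property of a code. The function names are hypothetical and chosen for this sketch only; the entropy estimate assumes symbols are independent and identically distributed, which real sources rarely are.

```python
import math
from collections import Counter


def entropy_bits_per_symbol(message):
    """First-order entropy estimate of a sequence, in bits per symbol.

    Assumes an i.i.d. probability model whose probabilities are the
    relative frequencies observed in `message`.
    """
    counts = Counter(message)
    total = len(message)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def is_prefix_code(codewords):
    """Return True if no codeword is a prefix of another codeword.

    After sorting, any codeword that has `a` as a prefix sits directly
    after `a`, so checking adjacent pairs is sufficient.
    """
    words = sorted(codewords)
    return all(not b.startswith(a) for a, b in zip(words, words[1:]))


if __name__ == "__main__":
    print(entropy_bits_per_symbol("aabb"))            # two equiprobable symbols
    print(is_prefix_code(["0", "10", "110", "111"]))  # a prefix code
    print(is_prefix_code(["0", "01", "11"]))          # "0" is a prefix of "01"
```

Every prefix code is uniquely decodable, but the converse does not hold, so this check is sufficient for unique decodability rather than necessary.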

Unit II

The Huffman coding algorithm

Minimum variance Huffman codes, Adaptive Huffman coding: Update procedure, Encoding procedure, Decoding procedure. Golomb codes, Rice codes, Tunstall codes, Applications of Huffman coding: Lossless image compression, Text compression, Audio compression.
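As an illustration of the static Huffman procedure named in this unit, the following is a minimal Python sketch that builds a code table from symbol counts using a priority queue. The function name `huffman_code` is hypothetical, and ties are broken by insertion order, so this is one valid Huffman code rather than the minimum variance variant.

```python
import heapq
from collections import Counter


def huffman_code(message):
    """Build a Huffman code table {symbol: bitstring} from a sequence."""
    counts = Counter(message)
    if not counts:
        return {}
    if len(counts) == 1:
        # A single-symbol source still needs one bit per symbol.
        return {next(iter(counts)): "0"}

    # Heap entries are (weight, tie_breaker, {symbol: partial_code}).
    # The unique integer tie_breaker keeps Python from comparing dicts.
    heap = [(w, i, {s: ""}) for i, (s, w) in enumerate(counts.items())]
    heapq.heapify(heap)
    next_id = len(heap)

    while len(heap) > 1:
        # Merge the two least probable subtrees, prefixing their codes.
        w1, _, c1 = heapq.heappop(heap)
        w2, _, c2 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in c1.items()}
        merged.update({s: "1" + c for s, c in c2.items()})
        heapq.heappush(heap, (w1 + w2, next_id, merged))
        next_id += 1

    return heap[0][2]


if __name__ == "__main__":
    text = "aaaabbc"
    table = huffman_code(text)
    encoded = "".join(table[s] for s in text)
    print(table, len(encoded))
```

For the input `"aaaabbc"` the most frequent symbol receives a 1-bit codeword and the other two receive 2-bit codewords, giving 10 bits in total.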

Arithmetic Coding

Coding a sequence, Generating a binary code, Comparison of Binary and Huffman coding, Applications: Bi-level image compression (the JBIG standard, JBIG2), Image compression. Dictionary Techniques: Introduction, Static Dictionary: Digram Coding, Adaptive Dictionary: The LZ77 Approach, The LZ78 Approach, Applications: File Compression (UNIX compress), Image Compression: The Graphics Interchange Format (GIF), Compression over Modems: V.42 bis. Predictive Coding: Prediction with Partial Match (PPM): The basic algorithm, The escape symbol, Length of context, The Exclusion Principle, The Burrows-Wheeler Transform: Move-to-front coding, CALIC, JPEG-LS, Multi-resolution Approaches, Facsimile Encoding, Dynamic Markov Compression.
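Of the dictionary techniques listed above, the LZ78 approach is compact enough to sketch in Python. The version below is a simplified illustration, not the exact format used by UNIX compress or GIF (those build on the LZW variant): it emits (index, character) pairs, uses an unbounded dictionary, and flushes any leftover phrase with an empty character. The function names are hypothetical.

```python
def lz78_encode(text):
    """Encode text as a list of (dictionary_index, next_char) pairs.

    Index 0 stands for the empty phrase; new phrases get indices 1, 2, ...
    """
    dictionary = {}
    output = []
    phrase = ""
    for ch in text:
        candidate = phrase + ch
        if candidate in dictionary:
            # Keep extending the match while it is still in the dictionary.
            phrase = candidate
        else:
            output.append((dictionary.get(phrase, 0), ch))
            dictionary[candidate] = len(dictionary) + 1
            phrase = ""
    if phrase:
        # Input ended in the middle of a known phrase: emit it alone.
        output.append((dictionary[phrase], ""))
    return output


def lz78_decode(pairs):
    """Rebuild the text by replaying the dictionary construction."""
    phrases = [""]  # phrases[0] is the empty phrase
    pieces = []
    for index, ch in pairs:
        entry = phrases[index] + ch
        pieces.append(entry)
        phrases.append(entry)
    return "".join(pieces)


if __name__ == "__main__":
    pairs = lz78_encode("abababab")
    print(pairs)
    print(lz78_decode(pairs))
```

The decoder needs no transmitted dictionary because it rebuilds the same phrase list, in the same order, from the pairs themselves.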

Unit III

Scalar Quantization

Distortion criteria, Models, Scalar Quantization: The Quantization problem, Uniform Quantizer, Adaptive Quantization, Nonuniform Quantization.
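As an illustration of the uniform quantizer and of mean squared error as a distortion criterion, here is a minimal Python sketch. It assumes a midrise quantizer with an unbounded number of levels (no overload region), which is a simplification; the function names and the step size parameter `step` are hypothetical.

```python
import math


def uniform_quantize(x, step):
    """Midrise uniform quantizer.

    Returns (index, reconstruction): the index of the interval of width
    `step` containing x, and the midpoint of that interval.
    """
    index = math.floor(x / step)
    return index, (index + 0.5) * step


def mean_squared_error(original, reconstructed):
    """Average squared difference between two equal-length sequences."""
    return sum((a - b) ** 2 for a, b in zip(original, reconstructed)) / len(original)


if __name__ == "__main__":
    samples = [0.3, -0.3, 2.7]
    rebuilt = [uniform_quantize(x, 1.0)[1] for x in samples]
    print(rebuilt, mean_squared_error(samples, rebuilt))
```

Because each value is reconstructed at the midpoint of its interval, the quantization error never exceeds half the step size in this sketch.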

Unit IV

Vector Quantization

Advantages of Vector Quantization over Scalar Quantization, The Linde-Buzo-Gray Algorithm.
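The core of the Linde-Buzo-Gray algorithm is an iteration that alternates between assigning training vectors to their nearest codevector and moving each codevector to the centroid of its cell. The Python sketch below illustrates only that refinement loop with squared-error distortion; it assumes the initial codebook is supplied by the caller and omits the splitting initialization and the distortion-threshold stopping rule usually described with the full algorithm. All names are hypothetical.

```python
def squared_distance(u, v):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def nearest(vector, codebook):
    """Index of the codevector closest to `vector`."""
    return min(range(len(codebook)),
               key=lambda i: squared_distance(vector, codebook[i]))


def lbg_iterations(training, codebook, rounds=20):
    """Refine `codebook` on `training` vectors for at most `rounds` passes."""
    codebook = [tuple(c) for c in codebook]
    for _ in range(rounds):
        # Partition step: group training vectors by nearest codevector.
        cells = [[] for _ in codebook]
        for v in training:
            cells[nearest(v, codebook)].append(v)

        # Centroid step: move each codevector to the mean of its cell.
        new_codebook = []
        for old, cell in zip(codebook, cells):
            if cell:
                dim = len(old)
                new_codebook.append(
                    tuple(sum(v[d] for v in cell) / len(cell) for d in range(dim))
                )
            else:
                new_codebook.append(old)  # keep codevectors with empty cells

        if new_codebook == codebook:
            break  # no codevector moved, so the iteration has converged
        codebook = new_codebook
    return codebook


if __name__ == "__main__":
    data = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
    print(lbg_iterations(data, [(0, 0), (10, 10)]))
```

Quantizing a new vector then amounts to calling `nearest` and transmitting the returned index instead of the vector itself.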

Course Objective

The objective of this course is to give students a fundamental understanding of data compression methods for text, images, and video, and of related issues in the storage, access, and use of large data sets. Students learn to select the most appropriate compression techniques for text, audio, image, and video information, giving reasons that are sensitive to the specific application and circumstances, and to illustrate the concepts behind the various algorithms for compressing such information.

Course Outcome

On completion of this course, the students will be able to:

  • Program and analyze Huffman coding and its applications: lossless image compression, text compression, and audio compression.

  • Program and analyze various image compression and dictionary-based techniques, such as static dictionaries, digram coding, and adaptive dictionaries.

  • Understand the statistical basis and performance metrics for lossless compression.

  • Understand the conceptual basis for commonly used lossless compression techniques, and how to use and evaluate several readily available implementations of those techniques.

  • Understand the structural and conceptual basis of, and the performance metrics for, commonly used lossy compression techniques.

Practicals

Reference Books

  • The Data Compression Book – Mark Nelson.

  • Data Compression: The Complete Reference – David Salomon.