Huffman coding
  • Huffman coding is the  application of greedy algorithm.
  • Huffman coding is also known as Huffman encoding
  • It is a famous greedy algorithm which is used for lossless compression of data
  • It uses a variable-length encoding  where variable-length codes  are assigned to all characters depending on how frequently they occur in the given text

There are two major steps in Huffman coding-

  1. Building a Huffman tree from the input characters
  2. Assigning codes to the characters by traversing the Huffman tree


Steps to construct Huffman coding:

  1. Create a leaf node for all the given characters containing the occurring frequency of characters.
  2. Arrange all the nodes in the increasing order of frequency value contained in the nodes.
  3. Considering the first two nodes having a minimum frequency, create a new internal node having a frequency equal to the sum of the two nodes frequencies and make the first node as a left child and the other node as a right child of the newly created node.

  4. Keep repeating Step-02 and Step-03 until all the nodes form a single tree. After following all the above steps, our desired Huffman tree will be constructed.

Important Formula-



2.Total number of bits in Huffman encoded message

= Total number of characters in the message x Average code length per character

= ∑ ( frequencyi x Code lengthi )


Time Complexity-

ExtractMin( ) is called 2 x (n-1) times if there are n nodes.As extractMin( ) calls minHeapify( ), it takes O(logn) time.

Thus, Overall complexity becomes O(nlogn).


Time Complexity of Huffman Coding = O (nlogn)

where n is the number of unique characters in the text