How does arithmetic coding compress data?

Though I knew that arithmetic coding is a compression algorithm and "better than Huffman coding", I did not know how it can make data smaller.

Now, I finally understand it!

The basic idea is same as Huffman coding. Use fewer bits for frequently used symbols, and use more bits for rarely used symbols.

Let’s say we have symbols A and B. Each probability is 99.5% and 0.5%. We want to use fewer bits for A than B. Arithmetic coding uses one dimensional line:

 

                      A                                      B
|--------------------------------------------------------|---------|
0 99.5 100
 

When ‘A’ is entered to compressor, the code can be any value between 0 to 99.5. It’s fairly easy. We could just say 0, 1, or 9. One digit is enough.

When ‘B’ is entered to compressor, however, the code must be some values between 99.5 to 100. It may be 99.5, 99.7, or 99.9. Anyway we need at least THREE digits.

That’s it. Arithmetic coding assigns fewer bits to frequently used symbols by assigning larger ranges for the symbols. Since we need fewer precision to describe larger range, it uses fewer bits for frequently used symbols.

Advertisements

About Moto

Engineer who likes coding
This entry was posted in Algorithm. Bookmark the permalink.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s