LO: #informationtheory

Learning outcome:

In order to be prepared for class, you need to understand

(1) unique decodability – why this code property is needed for lossless compression, and how we check it (Kraft inequality; a quick check is sketched after this list).

(2) expected and optimal code length, and how to use them to evaluate the asymptotic performance of compression algorithms.

(3) addressing a changing source distribution with adaptive methods.
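As referenced in (1), here is a minimal sketch of the checks behind (1) and (2), assuming a binary code alphabet; the helper names (kraft_sum, expected_length, entropy) are illustrative, not from the course materials:

```python
import math

def kraft_sum(lengths, D=2):
    # Kraft inequality: sum of D^-l over codeword lengths must be <= 1
    # for a uniquely decodable (or prefix) code with these lengths to exist.
    return sum(D ** -l for l in lengths)

def expected_length(probs, lengths):
    # Expected code length L(C) = sum_i p_i * l_i, in bits per symbol.
    return sum(p * l for p, l in zip(probs, lengths))

def entropy(probs):
    # H(X) = -sum_i p_i log2 p_i: the lower bound on expected length.
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Example with the tiny source used later in the breakout, p = (0.2, 0.4, 0.4),
# and (hypothetical) codeword lengths 2, 2, 1:
probs, lengths = [0.2, 0.4, 0.4], [2, 2, 1]
print(kraft_sum(lengths))               # 1.0 <= 1, so such a prefix code exists
print(expected_length(probs, lengths))  # 1.6 bits/symbol
print(entropy(probs))                   # ~1.522 bits/symbol
```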

In addition, you need to be able to implement data compression algorithms such as Huffman coding, arithmetic coding, and Lempel-Ziv coding at a high level; a Huffman sketch follows, and an arithmetic coding sketch appears at the end of the breakout notes.
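For the Huffman part, a minimal high-level sketch (one possible implementation, not the only one), which merges the two least probable subtrees bottom-up using a heap; huffman_code is an illustrative name:

```python
import heapq
import itertools

def huffman_code(probs):
    # Build a map symbol -> codeword from a dict of symbol probabilities.
    # Repeatedly merge the two least probable subtrees, prefixing their
    # codewords with 0 and 1; the counter breaks ties without comparing dicts.
    counter = itertools.count()
    heap = [(p, next(counter), {sym: ""}) for sym, p in probs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        p1, _, code1 = heapq.heappop(heap)
        p2, _, code2 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in code1.items()}
        merged.update({s: "1" + c for s, c in code2.items()})
        heapq.heappush(heap, (p1 + p2, next(counter), merged))
    return heap[0][2]

print(huffman_code({"a": 0.2, "t": 0.4, "p": 0.4}))
# {'p': '0', 'a': '10', 't': '11'}: expected length 0.2*2 + 0.4*2 + 0.4*1 = 1.6 bits
```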


Polls:

90 minutes total

15-minute burn-in: 5-minute intro slides, 5-minute poll, 5-minute discussion

45 minutes: confusion / PCW breakout, My Solutions to PCW

1-minute eye rest

10-minute breakout

10-minute breakout walkthrough

5-minute reflection poll / 5-minute share-out

Breakout notes:

You are given an arithmetic code based on a source. Now the source has changed. What do you do?


  1. Implement the arithmetic coding algorithm for individual characters from a tiny source (or hand out a frequency table up front); see the sketch at the end of these notes:

    1. source alphabet: X = {a, t, p}; probability mass function: p = (p_0, p_1, p_2) = (0.2, 0.4, 0.4); sequence to encode: x_1, …, x_3 = a, t, p

    2. There are 4 symbols available to build messages – a, b, c, and d. The frequencies of these symbols are:

    3. Based on the frequency table, here are the probabilities of the symbols:

    4. The next figure shows the cumulative probabilities of the symbols, where the width of each symbol's interval equals its probability.

    5. Now it’s time to encode the message, which is bdab.

    6. In the first stage, the first symbol of the message, b, is processed. The interval is restricted to b’s interval, which spans from 0.2 to 0.5.

    7. The decoding process starts given the following 3 inputs:

      1. The value that encodes the message, 0.3884.
      2. The frequency table.
      3. The length of the message, 4.
    8. frequency table obtained from previous messages (aat, tpa, tata)

    9. probability table obtained from the frequency table

    10. encoding with cumulative probabilities

      1. Given any symbol and the current interval [low, high), its sub-interval starts at the value v and ends at the value u, calculated using the following equations: v = low + (high − low) · C_lo, u = low + (high − low) · C_hi, where C_lo and C_hi are the symbol’s cumulative probability bounds.
    11. encode the message as a tree of nested intervals (interval → sub-interval → sub-interval)

    12. convert the message to a single value inside the final interval

  2. How do you think this arithmetic code’s per-symbol expected length compares to a Huffman code’s and to the entropy?

    1. if we were to calculate: both are near-optimal. For the source in 1.1, the entropy is H ≈ 1.522 bits/symbol; Huffman coding achieves an expected length of 1.6 bits/symbol (optimal among symbol-by-symbol prefix codes), while arithmetic coding’s per-symbol length approaches H as the message grows. See the comparison sketch below.
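As referenced in 1. above, a minimal float-based arithmetic coding sketch for the tiny source in 1.1. Real coders use integer ranges with renormalization to avoid precision loss; the function names here are illustrative. Note that decode takes exactly the three inputs listed in step 7 (the value, the probability table, and the message length):

```python
def cumulative(probs):
    # symbol -> (C_lo, C_hi): each symbol's cumulative sub-interval of [0, 1).
    intervals, c = {}, 0.0
    for sym, p in probs.items():
        intervals[sym] = (c, c + p)
        c += p
    return intervals

def encode(message, probs):
    # Narrow [low, high) once per symbol, then return a single value inside it.
    intervals = cumulative(probs)
    low, high = 0.0, 1.0
    for sym in message:
        c_lo, c_hi = intervals[sym]
        width = high - low
        low, high = low + width * c_lo, low + width * c_hi
    return (low + high) / 2  # any value in [low, high) identifies the message

def decode(value, probs, length):
    # Invert encode: find which symbol's interval contains the value,
    # emit that symbol, rescale the value back to [0, 1), and repeat.
    intervals = cumulative(probs)
    out = []
    for _ in range(length):
        for sym, (c_lo, c_hi) in intervals.items():
            if c_lo <= value < c_hi:
                out.append(sym)
                value = (value - c_lo) / (c_hi - c_lo)
                break
    return "".join(out)

probs = {"a": 0.2, "t": 0.4, "p": 0.4}
v = encode("atp", probs)
print(v, decode(v, probs, 3))  # recovers 'atp'
```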
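And a quick back-of-envelope calculation for 2., using the source from 1.1; the "+2 bits" term is the standard whole-message overhead bound for arithmetic coding:

```python
import math

probs = {"a": 0.2, "t": 0.4, "p": 0.4}

H = -sum(p * math.log2(p) for p in probs.values())  # ~1.522 bits/symbol (entropy)
L_huffman = 0.2 * 2 + 0.4 * 2 + 0.4 * 1             # 1.6 bits/symbol (codes from the sketch above)

# Arithmetic coding spends about -log2 P(message) + 2 bits on the whole
# message, so its per-symbol cost approaches H as the message grows.
msg = "atp"
P_msg = 0.2 * 0.4 * 0.4
L_arith = (-math.log2(P_msg) + 2) / len(msg)        # ~2.32 bits/symbol for n = 3
print(H, L_huffman, L_arith)
```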