Determining the worst-case and typical-case
As we know, fibonacci sequence are the sum of the previous two, knowing at first that Fo = 0 and F1 = 1, this sequence produces a tree that are the most unbalanced, because if we know that huffman are base in the last two frequencies which produce another node that are the sum of both, so, if we have a fibonacci sequence as a frequency of the letters that we would like to test, we can have this type of tree.
For example, let say that we have these case, when we have only 7 letter, just for know how works.
1, 1, 2, 3
a, b, cc, ddd
Alphabet and probability:
a : 0.142857 = 1/7
b : 0.142857 = 1/7
c : 0.285714 = 2/7
d : 0.428571 = 3/7
Tree:
[1] d
[] cabd
[1] b
[1] ab
[0] a
[0] cab
[0] c
The same:
cabd
/ \
d cab
/ \
c ab
/ \
a b
We can see that the tree have only new nodes just for one side, because we made the tree having the sum of the probabilities of frequency, and have a pattern that builds the tree in that way, also we can say that if the have in all the probabilities the same number, would have the same pattern.
A typical case is when we have a frequencies that are not the same for all the text or that don't have the pattern of fibonacci, so I made an experiment for the type that I think is interesting, I made a different graphics that can shows that these type of cases are bad compare to the average case.
In this graphic, we can see the compress ratio between the average-case and the worst-case, we can see that clearly are much difference between them, having a difference about 62% more compress than the strings that have a fibonacci sequence.
Program
Está chido lo de Fibonacci. Para lo de caso típico, textos de verdad tendrían sentido. 7+7 pts.
ResponderEliminar