+HCU papers
Little essay about the various methods and viewpoints of crunching
~ version June 1999 ~
by Joa ----- Part. VIII (Burrows - Wheeler - Transformation (BWT)
Courtesy of fravia's page of reverse engineering
Well, Joa continues his fundamental paper on crunching, this is part VIII
enjoy!
12 December 98 |
Joa
| ~ |
crunchi1.htm
| Little essay about the various methods and viewpoints of crunching
| papers
| ~ | fra_0126 |
10 June 98 |
Joa
| ~ |
crunchi2.htm
| Little essay about the various methods and viewpoints of crunching II
| papers
| ~ | fra_0129 |
17 June 98 |
Joa
| ~ |
crunchi3.htm
| Little essay about the various methods and viewpoints of crunching III
| papers
| ~ | fra_012E |
17 June 98 |
Joa
| ~ |
crunchi4.htm
| Little essay about the various methods and viewpoints of crunching IV
| papers
| ~ | fra_012F |
17 June 98 |
Joa
| ~ |
crunchi5.htm
| Little essay about the various methods and viewpoints of crunching V
| papers
| ~ | fra_012F |
17 June 98 |
Joa
| ~ |
crunchi6.htm
| Little essay about the various methods and viewpoints of crunching VI
| papers
| ~ | fra_012F |
17 June 98 |
Joa
| ~ |
crunchi7.htm
| Little essay about the various methods and viewpoints of crunching VII
| papers
| ~ | fra_xxxx |
Little essay about the various methods and viewpoints of crunching.
Part VIII: Burrows - Wheeler - Transformation (BWT)
By Joa Koester (JoKo2000@HotMail.Com)
This essay is free to be copied and read and spread as long as
you are decent enough to leave my name in it.
If you have corrections, comments, better examples than the ones used
in this essay etc. - drop me a line.
hello, hello,
well, shave my legs and call me Grandpa if that wasn't a long delay, but i still
have some serious difficulties accessing the Internet recently. I hope that i
can access you more frequently in the near future. And to sweet up the time till
then a little bit here my next essay about crunching methods:
THE BURROWS - WHEELER - TRANSFORMATION (tataaa :)
As we discussed last time the mechanism of Arithmetic Crunching some of you
asked me about the source (the input of the cruncher) and that sometimes it
would take the cruncher a long time to adjust to the source data. Yes, there is
a problem with this kind of crunching: the source itself should be transformed
to be better crunchable. But how do we transform a data to be better crunchable
with a fast algorithm?
Well, the arithmetic cruncher would love to crunch data like (all hex):
00, 01, 05, 02, 08, c2, 03, 01, c0, 0a, 12, 01, 02....
but how do we transform a sequence like {0a, f2, f2, 3c, 0a, 77, 44, 0a} into
such cruncherfriendly data?
Well, luckily for us, there is an algorithm available for us that does exactly
what we need. This alg is called Move-To-Front (MTF).
The MTF-Algorithm
MTF is - once you had the idea - a pretty simple algorithm. It transforms the
input data into some other data, it is reversible and it is pretty fast. The
basic idea is that we don't output the data itself but the index of the char
in some table (in our case the table has the usual possible 256 entries).
After the output we save the byte, shift the rest below one position up and
MOVE the saved byte TO the FRONT of the table. A simple char-Table will show
you what i mean:
Let's presume the input is {0a, f2, f2, 3c, 0a, 77, 44, 0a}
That's 8 entries. But we have only 5 distinct values. As we want a reference
table, we only need the distinct values, so for the sake of our example
we take a reference table with only 5 entries (which are - for the example -
filled with the distinct input values as starting values. In a real case the
table would have 256 entries filled with the 256 possible values for start):
Table Value
[ ]
0 0a
1 f2
2 3c
3 77
4 44
Now let's transform the first byte: 0a
We search our table for the value 0a and we find it at offset 00. And that's our
output: 00. Now we shift the rest one position upwards and move the 0a at offset
00 to offset 00. Luckily for us the loop terminates immediately and our table
looks now exactly like before:
Table Value
[ ]
0 0a
1 f2
2 3c
3 77
4 44
And now the second byte: f2
f2 is found at position 01 and this is our output: 01. Now we save the f2 at
position 01, move all other entries below the position 01 one up and move the
saved f2 to position 00. Let's watch the manipulation of the table in slow
motion:
Table Value Table Value Table Value
[ ] [ ] [ ]
0 0a f2 is saved. Copy 0 0a insert the -> 0 f2
1 f2