CRC and how to Reverse it
A CRC Tutorial & The c00l way to Reverse CRC
+HCU Papers
Release: 29 april 1999
Modified: 30 april 1999
by
anarchriz
Courtesy of Fravia's page of
reverse engineering
slightly edited
by fravia+
fra_00xx 990504 anarchriz 1100 PA PC
A beautiful study, that rightly belongs to the +HCU Papers
There is a crack, a crack in everything
That's how the light gets in
Rating
(X)Beginner (X)Intermediate (X)Advanced ( )Expert
You always wanted to know what CRC exactly is? Always wanted to know how to
compute it yourself? Ever thought about ways to reverse CRC, but didnt succede?
Ever tried to patch a piece of code without altering its CRC? Ever wanted to
write an anti-antivirus trick to render the CRC32 check useless?
Well then... you may have landed at the right place!
CRC and how to Reverse it
A CRC Tutorial & The c00l way to Reverse CRC
Written by
anarchriz
Introduction
This essay consists of a CRC tutorial and a way of how to reverse it. Many
Coders/Reversers don't know exactly how CRC works and almost no one knows how to
reverse it, while this knowledge could be very usefull. First the tutorial will
learn you how to calculate CRC in general, you can use it as data/code
protection. Second, the reverse part will learn you (mainly) how to reverse
CRC-32, you can use this to break certain CRC protections in programs or over
programs (like anti-virus). There seem to be utilities who can 'correct' CRCs
for you, but I doubt they also explain what they're doing.
I'd like to warn you, since there is quite some math used in this essay. This
wont harm anyone, and will be well understood by the avarage Reverser or Coder.
Why? Well. If you dont know why math is used in CRC, I suggest that you click
that button with a X at the top-right of this screen. So I assume the reader has
knowledge of binair arithmetic.
Essay
Part 1: CRC Tutorial, what it is and how to calculate it
Cyclic Redundancy Code or CRC
We all know CRC. Even if you don't recall, you will when you think of those
annoying messages RAR, ZIP and other compressors give you when the file is
corrupted due to bad connections or those !@#$% floppies. The CRC is a
value computed over a piece of data, for example for each file at the
time of compression. When the archiver is unpacking that file, it will read the
CRC and check it with the newly computed CRC of the uncompressed file. When
they match, there is a good chance that the files are identical. With CRC-32,
there is a chance of 1/2^32 of the check failing to recognize a change in data.
A lot of people think CRC is short for Cyclic Redundancy Check. If indeed CRC
is short for Cyclic Redundancy Check then a lot of people use the term incorrect.
If it was you could not say 'the CRC of the program is 12345678'. People are also
always saying a certain program has a CRC check, not a Cyclic Redundancy Check
check. Conclusion: CRC stands for Cyclic Redundancy Code and NOT for Cyclic
Redundancy Check.
How is the calculation done? Well, the main idea is to see the file as one
large string of bits divided by some number, which will leave you with a
remainder, the CRC! You always have a remainder (can also be zero) which is at
most one bit less then the divisor (else it still has a divisor in it).
(9/3=3 remainder=0 ; (9+2)/3=3 remainder=2)
Only here dividing with bits is done a little different. Dividing is repeatedly
substracting (x times) a number (divisor) from a number you want to divide, which
will leave you with the remainder. If you want the original number back you
multiply with the divisor or (idem) add x times the divisor with itself and
afterwards adding the remainder.
CRC computation uses a special way of substracting and adding, i.e. a
new 'arithmetic'. While computing the carry for each bit calculation is
'forgotten'.
Lets look at 2 examples, number 1 is a normal substraction, 2&3 are special.
-+
(1) 1101 (2) 1010 1010 (3) 0+0=0 0-0=0
1010- 1111+ 1111- 0+1=1 *0-1=1
---- ---- ---- 1+0=1 1-0=1
0011 0101 0101 *1+1=0 1-1=0
In (1), the second column from the right would evaluate to 0-1=-1, therefore
a bit is 'borrowed' from the bit next to it, which will give you this
substraction (10+0)-1=1. (this is like normal 'by-paper' decimal substraction)
The special case (2&3) 1+1 would normally have as answer 10, where the '1' is
the carry which 'transports' the value to the next bit computation. This value
is forgotten. The special case 0-1 would normally have as answer '-1', which
would have impact on the bit next to it (see example 1). This value is also
forgotten. If you know something about programming this looks like, or better,
it IS the XOR operation.
Now look at an example of a divide:
In normal arithmetic:
1001/1111000\1101 13 9/120\13
1001 - 09 -|
---- -- |
1100 30 |
1001 - 27 -
---- --
0110 3 -> the remainder
0000 -
----
1100
1001 -
----
011 -> 3, the remainder
In CRC arithmetic:
1001/1111000\1110 9/120\14 remainder 6
1001 -
----
1100
1001 -
----
1010
1001 -
----
0110
0000 -
----
110 -> the remainder
(example 3)
The quotient of a division is not important, and not efficient to remember,
because that would be only a couple of bits less than the bitstring where you
wanted to calculate the CRC from. What IS important is the remainder! That's
the thing that says something important over about the original file. That's
basicly the CRC!
Going over to the real CRC computation
To perform a CRC calculation we need to choose a divisor, we call it the
'poly' from now on. The width W of a poly is the position of the highest bit,
so the width of poly 1001 is 3, and not 4. Note that the highest bit is always
one, when you have chosen the width of the poly you only have to choose a value
for the lower W bits.
If we want to calculate the CRC over a bitstring, we want to make sure all
the bits are processed. Therefore we need to add W zero bits to the end of the
bitstring. In the case of example 3, we could say the bitstring was 1111.
Look at a little bigger example:
Poly = 10011, width W=4
Bitstring + W zeros = 110101101 + 0000
10011/1101011010000\110000101 (we don't care about the quotient)
10011|||||||| -
-----||||||||
10011|||||||
10011||||||| -
-----|||||||
00001||||||
00000|||||| -
-----||||||
00010|||||
00000||||| -
-----|||||
00101||||
00000|||| -
-----||||
01010|||
00000||| -
-----|||
10100||
10011|| -
-----||
01110|
00000| -
-----|
11100
10011 -
-----
1111 -> the remainder -> the CRC!
(example 4)
There are 2 important things to state here:
1.Only when the highest bit is one in the bitstring we XOR it with the poly,
otherwise we only 'shift' the bitstring one bit to the left.
2.The effect of XORring is, that it's XORed with the lower W bits, because the
highest bit always gives zero.
Going over to a Table-Driven Algorithm
You all should understand that an algorithm based on bitwise calculation will
be very slow and inefficient. It would be far more efficient if you could
calculate it on a per-byte basis. But then we can only accept poly's with a
width of a multiple of 8 bits (that's a byte ;). Lets visualize it in a example
poly with a width of 32 (W=32):
3 2 1 0 byte
+---+---+---+---+
Pop! this is the poly, 4*8 bits
(figure 1)
This is a register you use to store the temporary result of the CRC, I call
it the CRC register or just register from now on. You are shifting bits from
the bitstring in at the right side, and bits out at the left side. When the bit
just shifted out at the left side is one, the whole register is XORred by the
lower W bits of the poly (in this case 32). In fact, we are doing exactly the
same thing as the divisions above.
What if (as I said) we would shift in & out a whole group of bits at once.
Look at an example of 8 bit CRC with 4 bits at once shifted in & out:
The register just before the shift : 10110100
Then 4 bits (at the top) are shifted out at the left side while shifting 4 new
bits in at the right side. In this example 1011 is shifted out and 1101 (new)
is shifted in.
Then the situation is this:
8 bits currently CRC/Register : 01001101
4 top bits just shifted out : 1011
We use this poly : 101011100, width W=8
Now we calculate just as usual the new value of the register.
Top Register
---- --------
1011 01001101 the topbits and the register
1010 11100 + (*1) Poly is XORred on position 3 of top bits (coz there is a one)
-------------
0001 10101101 result of XORring
Now we still have a one on bit position 0 of topbits:
0001 10101101 previous result
1 01011100+ (*2) Poly is XORred on position 0 of top bits (coz there is a one)
-------------
0000 11110001 result of second XORring
^^^^
Now there are all zero's in the topbits, so we dont have to XOR with the poly
anymore for this sequence of topbits.
The same value in the register you get if you first XOR (*1) with (*2) and the
result with the register. This is because of the standard XOR property:
(a XOR b) XOR c = a XOR (b XOR c)
1010 11100 poly on position 3 of top bits
1 01011100+ poly XORred on position 0 of top bits
-------------
1011 10111100 (*3) result of XORring
The result (*3) is XORred with the register
1011 10111100
1011 01001101+ the top bits and the register
-------------
0000 11110001
You see? The same result! Now (*3) is important, because with the top bits 1010
is always the value (*3)=10111100 (only the lower W=8 bits) bound (under the
stated conditions, of course) This means you can precompute the XOR values for
each combination of top bits. Note that top bits always become zero after one
iteration, this must be because the combination of XORring leads to it.
Now we come back to figure 1. For each value of the top byte (8 bits) just
shifted out, we can precompute a value. In this case it would be a table
consisting of 256 (2^8) entries of double words (32bit). (the CRC-32 table is
in the appendix)
In pseudo-language our algoritm now is this:
While (byte string is not exhausted)
Begin
Top = top_byte of register ;
Register = Register shifted 8 bits left ORred with a new byte from string ;
Register = Register XORred by value from precomputedTable at position Top ;
End
The direct Table Algorithm
The algorithm proposed above can be optimized. The bytes from the byte string
don't need to travel through the whole register before they are used. With
this new algorithm we can directly XOR a byte from a byte string with the byte
shifted out of the register. The result points to a value in the precomputed
table which will be XORred with the register.
I don't know exactly why this gives the same result (it has to do with a XOR
property), but it has the Big advantage you don't have to append zero
bytes/bits to your byte string. (if you know why, pleaz tell me :)
Lets visuallize this algorithm:
+-----: : : : :
+---+---+---+---+
| | | | |
+---+---+---+---+
(figure 2)
The 'reflected' direct Table Algorithm
To make things more complicated there is a 'reflected' version of this
algorithm. A Reflected value/register is that it's bits are swapped around
it's centre. For example 0111011001 is the reflection of 1001101110.
They came up with this because of the UART (chip that performs serial IO),
which sends each byte with the least significant bit (bit 0) first and the most
significant bit (bit 7) last, this is the reverse of the normal situation.
Instead then of reflecting each byte before processing, every else is
reflected. An advantage is that it gives more compact code in the
implementation. So, in calculating the table, bits are shifted to the right and
the poly is reflected. In calculating the CRC the register is shifted to the
right and (of course) the reflected table is used.
byte string (or file) -->---+
| 1. In the table each entry is reflected
byte 3 2 1 0 V 2. The initial register is reflected
+---+---+---+---+ | 3. The bytes from the byte string aren't
| | | | |>---XOR reflected, because all the rest is.
+---+---+---+---+ |
| |
XOR V
^ |
+---+---|---+---+ |
| | | | | | Precomputed table
+---+---+---+---+ |
: : : : :
You are deep inside fravia's page of reverse engineering,
choose your way out: