Inside the VB3 .EXE
by _Duke_
The following essay is not intended to be a "How to crack VB programs"
essay, but I will show you exactly HOW a VB program is protected from de-compilers.
It is important that you have a working knowledge of programming in
Visual Basic in order to understand the essay and, more importantly, to
follow the source code of the programs you de-compile. Although this essay
covers an older version of VB, there are many programs out there which
have yet to be cracked. It also serves as a starting point for understanding
the later versions.
1) Visual Basic 3 - For compiling our own test programs.
2) A Good Hex Editor - Use one that will let you Binary Compare two files and display the differences.
3) SoftIce - Of Course!
4) MAKE_MAK.EXE or DoDi's VBOPT - To "Protect" our programs.
5) A VB DeCompiler. (Is there anything but DoDi's??!!)
As many people have discovered, a VB program isn't really a 'Program' in the traditional sense of the word. Visual Basic is an 'Interpreted' language. What this means is that the program is stored in a 'higher level' language than the machine's native code. It is the job of the interpreter to read back and execute this higher language AT RUNTIME. Programs written in most other languages (such as C) are stored as native code and need nothing to translate them.
In case you didn't know, the VB interpreter is VBRUN300.DLL (no wonder all the VB programs need it to run!!). This is the REAL program that is running. Any SoftIce breakpoints you set on the 'standard' Windows routines will ALWAYS return you to VBRUN, not to the EXE! The interpreter reads the contents of the EXE, translates the TOKENS, and executes various subroutines to perform the desired task.
A VB program, therefore, cannot be disassembled by the standard tools. SoftIce is pretty much useless here unless you like to follow the spaghetti inside VBRUN. The program can, however, be de-compiled back into VB source code thanks to DoDi's VBDIS. It is available on the net as shareware, but I STRONGLY recommend you get (and pay for, it's worth it) the full version if you are serious about R-E'ing VB programs. This decompilation is possible because Micro$oft includes information in the executable that is not needed for the program to run. Now why would they do that?
The VB executable is made up of the same basic parts as other Windows programs:
DOS HEADER: This is provided for backward compatibility of the EXE file format.
STUB PROGRAM: Checks if Windows is running. Provides an error message if the program is being run from DOS.
WINDOWS HEADER: This section provides important information about the EXE to the operating system. Some of the more important locations are:
OFFSET (hex)   FUNCTION
---------------------------------------------------------
14             Initial value of CS:IP
1C             Number of Segments
22             Relative offset to Segment Table (typ. 40)
24             Relative offset to Resource Table
3E             The expected Windows version
***Note About Hex Editors: There seems to be a difference of opinion as to the 'START' of a program. Some editors call the start byte 00, while others consider it byte 01. If the addresses you are looking at just don't seem right, try shifting 1 byte to the right or left.
For a good reference on the Windows Header, open the WIN SDK help file WIN31WH.HLP and look under "Executable-File Header Format".
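To make the offset table above concrete, here is a minimal Python sketch that pulls those fields out of an EXE image. The function name and the dictionary keys are made up for illustration. It assumes the standard layout in which the DOS header holds the file offset of the Windows header as a 32-bit value at 3C and the Windows header begins with the signature 'NE'; neither detail is spelled out in this essay, so check them against the SDK reference.

```python
import struct

def read_ne_header(exe: bytes) -> dict:
    """Read the Windows (NE) header fields from the offset table above.

    Hypothetical helper: the name and the returned keys are illustrative only.
    """
    # Assumption: the DOS header stores the Windows header's file offset at 0x3C.
    (ne_off,) = struct.unpack_from("<I", exe, 0x3C)
    if exe[ne_off:ne_off + 2] != b"NE":
        raise ValueError("no 'NE' signature at offset %#x" % ne_off)

    # 14: initial CS:IP, stored as IP first, then the segment number.
    ip, cs = struct.unpack_from("<HH", exe, ne_off + 0x14)
    # 1C: number of segments.
    (seg_count,) = struct.unpack_from("<H", exe, ne_off + 0x1C)
    # 22 and 24: offsets of the segment and resource tables,
    # both relative to the start of the Windows header.
    seg_tab, res_tab = struct.unpack_from("<HH", exe, ne_off + 0x22)
    # 3E: expected Windows version (left as a raw word here).
    (win_ver,) = struct.unpack_from("<H", exe, ne_off + 0x3E)

    return {
        "ne_offset": ne_off,
        "cs": cs,
        "ip": ip,
        "segment_count": seg_count,
        "segment_table": ne_off + seg_tab,   # absolute file offset
        "resource_table": ne_off + res_tab,  # absolute file offset
        "windows_version": win_ver,
    }
```

Feeding it the whole file, e.g. `read_ne_header(open("CALC.EXE", "rb").read())`, gives numbers you can compare against what your hex editor shows at the same offsets.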
A short VB program (1 form/module) will typically contain 4 entries
in its segment table referencing 3 segments (one can be ignored). One
of the segments, usually located just after the Windows Header, contains a single
CALL instruction which transfers
control to the interpreter. THIS IS THE ONLY CODE IN THE VB PROGRAM
THAT RUNS!!!!!! The other segments hold the Tokens themselves
and a section which specifies how the tokens are structured into the various
Subs and Functions.
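A segment table entry is four little-endian words: where the segment sits in the file, its length, its flags, and how much memory it needs. The sketch below decodes a raw segment table under that reading. The function name, the dictionary keys, and the default `align_shift` of 8 (256-byte pages, which is what makes a location word of 0008 land at 0800 in the CALC.EXE walk-through later on) are assumptions for illustration, not something taken from the SDK text.

```python
import struct

def parse_segment_table(table: bytes, count: int, align_shift: int = 8) -> list:
    """Decode `count` segment table entries of 4 words each.

    Hypothetical helper. `align_shift` = 8 is an assumption that matches
    the CALC.EXE example (location word 0008 -> file offset 0800).
    """
    segments = []
    for i in range(count):
        location, length, flags, min_alloc = struct.unpack_from("<4H", table, i * 8)
        segments.append({
            "number": i + 1,                         # segments are numbered from 1
            "file_offset": location << align_shift,  # page number -> byte offset
            "length": length,                        # bytes stored in the file
            "flags": flags,
            "min_alloc": min_alloc,                  # bytes of memory required
        })
    return segments
```

In practice you would slice the table out of the file using the segment count and segment table offset found in the Windows header, then pass that slice in.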
Resources are 'packages' of data in a pre-defined format which a program
will access. Examples of resources are Icons, Fonts, and Menus. In a VB
program, they are also used to reference Forms and other 'Data' sections
of the program.
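The resource table can be walked the same way as the segment table. This is only a sketch with a made-up function name: it assumes the table opens with the rscAlignShift word, followed by type blocks (type ID, count, two reserved words), each followed by 12-byte resource entries (start page, length in pages, flags, ID, two reserved words), and that a type ID of zero ends the list. The zero terminator comes from the documented header format rather than from this essay.

```python
import struct

def parse_resource_table(table: bytes) -> list:
    """Walk an NE resource table and list every resource it defines.

    Hypothetical helper; the layout assumptions are described in the lead-in.
    """
    (align_shift,) = struct.unpack_from("<H", table, 0)
    pos = 2
    resources = []
    while pos + 4 <= len(table):
        type_id, count = struct.unpack_from("<HH", table, pos)
        if type_id == 0:          # assumed end-of-types marker
            break
        pos += 8                  # type ID, count, 2 reserved words
        for _ in range(count):
            page, pages, flags, res_id = struct.unpack_from("<4H", table, pos)
            resources.append({
                "type": type_id,
                "id": res_id,
                "flags": flags,
                "file_offset": page << align_shift,  # start page -> byte offset
                "length": pages << align_shift,      # length in pages -> bytes
            })
            pos += 12             # 4 words above plus 2 reserved words
    return resources
```

Because the function of a VB resource is identified by its header bytes rather than its ID, a natural next step is to look at the first few bytes at each `file_offset` that comes back.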
Some Hands On
** For this section of the lesson, you will need CALC.EXE compiled from the samples that come with VB3; it should compile to 9020 bytes. Or download it here within +Fravia's page.
Start your favorite Hex Editor and load CALC.EXE. Examine the following sections as I describe them. I have found it easiest to print the whole file in hex starting from the windows header and use colored markers to see what the sections 'look' like.
0000-003F DOS HEADER- Note the 06 @ 003D; this is the start page of the Windows Header.
0200-049F Stub Program- This code only runs from DOS.
0600-07FF Windows Header- Let's look more closely:
    0614- Initial CS:IP = 10 00 01 00
        This translates to 10 (hex) bytes past the start of segment 1
    061C- # Segments = 04 00
    0622- Offset to Start of Segment Table = 40 00
        Segment table starts @ 0640, each entry is 4 words long
        Segment 1 @ 0640 - 08 00 19 00 50 1D 19 00
            This means:
            The segment is located @ 0800
            The segment is 0019 bytes long
            1D50 - Flags (more later)
            The segment needs 0019 bytes of memory
        Segment 2 @ 0648 - 00 00 00 00 11 0C 02 00
            Ignore this segment definition
        Segment 3 @ 0650 - 0F 00 50 02 10 1D 50 02
            This segment @ 0F00 is the 'Sub Structure Table'
        Segment 4 @ 0658 - 09 00 D0 50 10 1C D0 50
            This segment @ 0900 is the Tokens (the 'Code')
    0624- Offset to Resource Table = 68 00
        Table starts @ 0668:
        Word @ 0668 = 08 00 - This is rscAlignShift, ignore it for now
        First a resource's Type is defined, then all of the resources of that type follow:
        First Type definition @ 066A - 0E 80 01 00 00 00 00 00
            This means:
            The TypeID is 800E (A Group Icon)
            There is 1 resource defined
            *The last 2 words are reserved
        Then the resc. is defined @ 0672 - 12 00 01 00 30 1C 01 80 00 00 00 00
            This means:
            The resource starts on Page 0012
            It is 0001 Pages long
            1C30 is more Flags
            The resource's ID is 8001
            *Again, the last two words are reserved
        The next type definition is @ 067E - 03 80 01 00 00 00 00 00
            'There is 0001 resource of type 8003 (Icon)'
        Then the resource definition @ 0686 - 13 00 03 00 30 1C 01 80 00 00 00 00
            The resource starts at page 0013 and is 3 pages long
        The next type definition is @ 0692 - 0A 80 05 00 00 00 00 00
            'There are 0005 resources of type 800A (Data)'
            * There are actually 4 resources, the 3rd is skipped
        Then the 4 definitions starting @ 069A:
            069A - 16 00 02 00 30 1C 01 80 00 00 00 00
            06A6 - 18 00 02 00 30 1C 02 80 00 00 00 00
            06B2 - 1A 00 09 00 30 1C 04 80 00 00 00 00
            06BE - 23 00 01 00 30 1C 05 80 00 00 00 00
        These resources are respectively:
            Forms Definitions
            Internal Definitions
            A Form
            Form and Control Names
        It should be noted that the resource ID is not related to what the resource is used for. The function of the resource is identified by its header bytes.
        The FLAGS sections of the segments and resources are used for information like whether they are MOVABLE, SHAREABLE, PRELOADED, EXECUTEONLY, etc.
    06D8-07FF Various name tables used by Windows
0800-0819 This is the first segment. If you remember, the initial value of CS:IP was 10 (hex) bytes past the start of this segment. The byte there is a long CALL (9A) into the interpreter. The address is computed at runtime since there is no way to tell where VBRUN will load into memory. The bytes which follow the segment are loading information for other segments.
0900-0EFF These are the actual tokens. The source code is translated to this at compile time. Strings are stored literally; this helps us to find our place while comparing tokens to source. More on this section and the ones that follow in the next lesson.
0F00-114F This section defines how the tokens are arranged into their various subs.
1200-12FF This is the GROUP_ICON definition (Don't bother!)
1300-15FF This is the ICON definition. For information on this and the previous section, look in the WIN SDK help file under 'Graphics File Formats'.
1600-17FF This is the Forms Definitions section. Here, information on forms, imported VBX's, and controls is stored.
1800-19FF This section's format is quite mysterious, but it is used to hold object definitions like forms, controls, variables, and constants.
1A00-22FF This is the actual form used in the program. Its format is very similar to a VB .FRM file. Notice the 'in line' icon @ 1A61. Pictures are also stored this way. The form's controls are defined in the second half of the form.
2300-END These are the control names. ***This section is unnecessary for program operation and is removed when the program is PROTECTED.***
What does this mean for CRACKERS???
Crackers can modify the information in these various sections to:
0900: 35 49 21 2D 1A 00 9A 38 0A 00 0C 00 04 00 64 75
0910: 6B 65 00 00 C3 11 7A 44 B4 34 1E 00 35 0E 4B 49
.. ..
The source code which compiled into this was:
IF password <> "duke" THEN END
Let's break this down. The first two bytes, '35 49', are the token encoding the number of leading spaces in the original source code. Some of the token words for spacing are as follows:
'21 2D 1A 00' References the variable password.
The remaining tokens are the END instruction and the next line spacing token. The easiest way to learn about the different tokens is to write short VB programs, make .EXE's, and compare the tokens with the source code which generated them. When you compare the differences between simple code changes, you will begin to see the patterns. You could also look at the routines for the various tokens but these are very difficult to follow. If you would like to look at the routines, try the following:
1600: 03 20 81 80 FF FF 43 41 4C 43 00 00 00 00 00 05
1610: 00 01 00 43 41 4C 43 00 00 46 09 04 80 46 00 FF
1620: 01 A4 48 00 43 41 4C 43 2E 46 52 4D 00 00 00 58
.....
This is the start of the 'Forms Definitions' section. It contains the names of the form (.FRM) files used in the compile, names of any .VBX files needed by the program, and references for both common and custom controls.
Next byte:
43 - VBX Name - This defines a VBX needed by the program. The byte after the '43' is the length of the VBX filename. The next word is '00 00' since there are no resources associated with these entries (see next definition). The null-terminated filename starts 7 bytes later (I don't know what the 7 bytes are for).
46 - Form Name - This is the name of the .FRM file used during the compile. Again, the byte after the '46' is the length of the filename. The word which follows is the ID of the resource which contains the actual form. Then come 7 unknown bytes, then the null-terminated filename. The word after this is the number of DWORDS which follow before the next definition. Most form name entries have this set to '00 00', which of course means that the next definition starts at the next byte. I don't know what these bytes mean either.
58 - Control Name(s)? - These seem to be the definitions for the control names. ALL of the standard controls are listed first without names. The word(?) after the 58 is the control type. A list of the most common controls will be provided later. Four words follow the control type, the last of which always seems to be 0000. If the control is a Custom control, the length of the name and the null-terminated name follow; otherwise this part is left out. The next word is again the number of mystery DWORDS which follow.
Now for the Form: CALC.EXE only has one form, which starts at 1A00. The form can be thought of as two sections:
00001A00 FF CC 2C 00 07 A3 08 00 00 8D 03 00 00 00 00 0D ..,.............
00001A10 22 01 26 00 32 00 FF 00 0A 43 61 6C 63 75 6C 61 ".&.2....Calcula
00001A20 74 6F 72 05 30 0C 00 00 98 07 00 00 C0 0C 00 00 tor.0...........
00001A30 00 0C 00 00 0C 06 53 79 73 74 65 6D 9A 99 19 41 ......System...A
00001A40 01 19 01 00 42 00 23 FE 02 00 00 00 00 01 00 01 ....B.#.........
00001A50 00 20 20 10 00 00 00 00 00 E8 02 00 00 16 00 00 .  .............
.......
--MORE ICON DATA --
.......
00001D40 1F F8 00 00 1F F8 00 00 3F 24 05 46 6F 72 6D 31 ........?$.Form1
00001D50 25 01 35 30 0C 00 00 36 98 07 00 00 37 C0 0C 00 %.50...6....7...
00001D60 00 38 00 0C 00 00 FF 17 00 00 00 00 00 00 00 00 .8..............
00001D70 00 00 00 00 71 01 00 00 00 00 00 00 00 00 00 00 ....q...........
00001D80 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
00001D90 00 00 00 00 00 00 01 41 00 00 80 01 07 00 00 04 .......A........
00001DA0 FF 00 01 37 02 07 00 04 78 00 58 02 E0 01 E0 01 ...7....x.X.....
00001DB0 0B 06 53 79 73 74 65 6D 9A 99 19 41 E1 11 07 00 ..System...A....
The form(s) will have the header 'FF CC 2C 00'. (The header is actually just CCFF, but all of the forms I have seen follow this with 002C; it doesn't seem to ever be checked. VBRUN300 will also accept CC23 as a valid form section header, although I have not yet seen it in an .EXE.)
The important thing to remember from this point is that VB starts with
a default form and makes changes to it from there.
If our form were a default VB form, the 'FF 00' @ 1A16 would be located
at 1A10; this FF designates the end of the basic window properties. Instead
we have three entries:
00001D90 00 00 00 00 00 00 01 41 00 00 80 01 07 00 00 04 .......A........
00001DA0 FF 00 01 37 02 07 00 04 78 00 58 02 E0 01 E0 01 ...7....x.X.....
00001DB0 0B 06 53 79 73 74 65 6D 9A 99 19 41 E1 11 07 00 ..System...A....
00001DC0 FF 0B A9 01 00 00 00 00 00 00 00 00 00 00 00 00 ................
00001DD0 00 00 00 00 00 00 00 00 03 41 00 00 80 01 08 00 .........A......
00001DE0 00 04 FF 00 01 38 02 08 00 04 D0 02 58 02 E0 01 .....8......X...
00001DF0 E0 01 0B 06 53 79 73 74 65 6D 9A 99 19 41 E1 11 ....System...A..
00001E00 08 00 FF 0B A9 01 00 00 00 00 00 00 00 00 00 00 ................
00001E10 00 00 00 00 00 00 00 00 00 00 03 41 00 00 80 01 ...........A....
00001E20 09 00 00 04 FF 00 01 39 02 09 00 04 28 05 58 02 .......9....(.X.
00001E30 E0 01 E0 01 0B 06 53 79 73 74 65 6D 9A 99 19 41 ......System...A
00001E40 E1 11 09 00 FF 0B A9 01 00 00 00 00 00 00 00 00 ................
00001E50 00 00 00 00 00 00 00 00 00 00 00 00 03 3C 00 00 .............<..
00001E60 00 02 00 04 FF 00 01 43 04 F8 07 58 02 E0 01 E0 .......C...X....
Let's look at the first control in detail:
Unfortunately, due to the large number of control properties, I cannot give you a list of them. It is, however, fairly easy to find the code of a property you are looking for: just compile a test program with whatever control you are trying to find the property for, make an .EXE out of it, then change the property and make another .EXE. When you binary compare the Form sections of the two programs, you will see what bytes have been added to change the property. This is the best way to find out most of what is in the .EXE.
* A note on VB3: VB3 has a strange habit of compiling the exact same source code into slightly different .EXE's between the first and second compiles. When making your reference file, compile your source code TWICE without changing anything. It will ask you if you want to overwrite the existing .EXE; answer YES. NOW rename the .EXE and compile it a THIRD time. A binary compare should show these two files to be identical; if not, repeat this until you get two files which are identical. THEN make your changes and compile again. This is necessary on the VB3 that I have; you may want to test it on yours.
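If your hex editor has no binary compare, a few lines of Python will do for files this small. This is only a sketch: the function name is made up, and it leans on the standard library's `difflib.SequenceMatcher`, which copes with inserted bytes (a plain byte-by-byte comparison would flag everything after the first added byte as different).

```python
from difflib import SequenceMatcher

def byte_changes(old: bytes, new: bytes) -> list:
    """List the differences between two byte strings.

    Each item is (tag, old_start, old_end, new_start, new_end), where tag is
    'insert', 'delete' or 'replace' and the end offsets are exclusive.
    Hypothetical helper built on difflib; fine for sections of a few KB.
    """
    matcher = SequenceMatcher(None, old, new, autojunk=False)
    return [op for op in matcher.get_opcodes() if op[0] != "equal"]

def show_changes(old: bytes, new: bytes, base: int = 0) -> None:
    """Print each difference with hex offsets, adding `base` to every offset."""
    for tag, i1, i2, j1, j2 in byte_changes(old, new):
        print("%-7s old %04X-%04X [%s]  new %04X-%04X [%s]" % (
            tag,
            base + i1, base + i2, old[i1:i2].hex(" "),
            base + j1, base + j2, new[j1:j2].hex(" "),
        ))
```

Slice out just the Form section of each file before comparing (for the CALC.EXE described here that would be `data[0x1A00:0x2300]`, with `base=0x1A00`) so that differences elsewhere in the file do not clutter the output. Comparing two files that should be identical is also a quick way to run the VB3 recompile check: an empty list means they match.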
Here is a list of some of the Form Properties you may want to change:
03 - Background color
09 - Enabled
0B - Mouse pointer
0C - Font change
10 - Window State
1D - Fill style
1E - Fill color
23 - Inline Icon definition
24 - Link topic
25 - Link mode
26 - Max Button
27 - Min Button
28 - Close Button
2E - Visible
31 - Key Preview
and standard control types:
00 - Picture
01 - Label
02 - Text Box
03 - Frame
04 - Command Button
05 - Check Box
06 - Option Box
07 - Combo Box
08 - List Box
09 - Horz. Scroll Bar
0A - Vert. Scroll Bar
0B - Timer
10 - Drive Box
11 - Directory Box
12 - File Box
13 - A Menu Item
16 - Shape
17 - Line
18 - Image
25 - Data
FF - Custom Control
The last resource in our CALC program is the control names resource
@ 2300. Not too much to talk about here, the first entry is the name of
the form, subsequent entries are the names of the controls on the form.
With a control array, only the first item is listed. This section is not
needed at all for the program to run and it can be removed (and
is!) without effect. Each control defined in the control section has a
reference to the position in this list of the control's name. Unfortunately,
the program's variable and sub/function names are not stored anywhere in
the program, and hence can never be recovered. If our program had more
than one form, the additional form(s) would follow alternating with their
control names section(s).
"Protection" from De-Compilers.
First let me start by saying that NO PROGRAM CAN BE PROTECTED FROM A GOOD DE-COMPILER!!!! This is not to say that DoDi's De-Compiler is not good, but he deliberately wrote it so that it can be prevented from working. As long as the Program Tokens are in the .EXE (and they must be for the program to function), those tokens can be de-compiled back to the original source code. So when I talk about Un-Protecting a file, what we are really talking about is making it acceptable to DoDi's de-compiler.
Programs are protected by removing the sections which are not needed for the program to run, but ARE needed for the de-compiler. These sections are the .FRM names in the Forms definitions section and the Control Names resource(s). Get MAKE_MAK.EXE from the net if you don't already have it. It is a VB Protector. Make a copy of our CALC.EXE with the name CALC.OLD and using MAKE_MAK, 'Protect' CALC.EXE.
When you start to HEX examine the file, look at the following things:
Un-Protecting a file:
2) Rebuild the Control Names resource. It must start on the page after its related form ends. If you are guessing at control names, use the control type to make a useful name (e.g. Command1), and use the 'Number in List' parameter in the control data to place it in the proper order. Pad the rest of the page to bring it to the next boundary. Repair the pointer to this resource in the header and adjust the pointers of any resources after this one which have been moved.
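The padding and page arithmetic in this step are easy to get wrong by hand, so here is a small sketch of them. The names are made up, the 256-byte page size assumes an rscAlignShift of 8 as in CALC.EXE, and padding with zero bytes is a guess; the essay does not say what the fill bytes should be.

```python
PAGE_SIZE = 0x100  # assumption: rscAlignShift of 8, i.e. 256-byte pages

def pad_to_page(data: bytes, page_size: int = PAGE_SIZE, fill: int = 0) -> bytes:
    """Pad a rebuilt resource out to the next page boundary.

    Hypothetical helper; the zero fill byte is an assumption.
    """
    remainder = len(data) % page_size
    if remainder == 0:
        return data  # already ends on a boundary
    return data + bytes([fill]) * (page_size - remainder)

def pages_needed(length: int, page_size: int = PAGE_SIZE) -> int:
    """Number of whole pages a resource of `length` bytes occupies."""
    return -(-length // page_size)  # ceiling division

def next_page_after(end_offset: int, page_size: int = PAGE_SIZE) -> int:
    """Page number of the first page boundary at or after `end_offset`."""
    return pages_needed(end_offset, page_size)
```

The page count from `pages_needed` is what goes into the length word of the resource's entry in the header, and the start page of every resource that follows has to grow by the same number of pages if the rebuilt section got longer.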
* If the program has been protected with DoDi's VBOPT, there will be one additional step needed in order to un-protect it. I won't tell you this step out of respect for the writer of the only VB de-compiler I know of, but it isn't hard to figure out. I have faith in all of you!!
If there is something that I have not made clear and you are still stuck
after spending plenty of time trying to figure it out for yourself, or if you know of parts of
my essay that are just wrong, please e-mail me at vbman@nassau.cv.net and
I will do my best to help.
Please don't email duke@nassau.cv.net; it's not me. Someone got the
address before I did :(