Polymorph engine (was:RE: PWL's how ?)
![](https://secure.gravatar.com/avatar/ac4b5842a352c815610574324ccabf65.jpg?s=120&d=mm&r=g)
Where did you get the following text on polymorph engines? Is it available somewhere? Where can I get other information on similar topics (Please resist the urge to mention web search engines; I'm looking for specific recommendations :-) Thanks! - Patrick _______________________________________________________________________________ From: Neil Tunnicliffe on Tue, Jan 7, 1997 6:08 Read this for an understanding of a polymorhphic virus: E#234##234##234##234##234##234##234##234##234##234##234##234##234##234##234##234##234##234##234##234##234##234##234##234##234##234##234##234##234##234##234##234##234##234##234##234##234##234##234##234##234##234##234##234##234##234##234##234##234##234##234##234##234##234##234##234##234##234##234##234##234##234##234##234#>> #188# A GENERAL DESCRIPTION OF THE METHODS BEHIND A POLYMORPH ENGINE #188# E#234##234##234##234##234##234##234##234##234##234##234##234##234##234##234##234##234##234##234##234##234##234##234##234##234##234##234##234##234##234##234##234##234##234##234##234##234##234##234##234##234##234##234##234##234##234##234##234##234##234##234##234##234##234##234##234##234##234##234##234##234##234##234##234#* A small glossary of terms: AAAAAAAAAAAAAAAAAAAAAAAAAA ENCRYPT = Transform from it's original form to an altered form. DECRYPT = Transform from it's altered form to it's original form. KEY = The register or value used to encrypt/decrypt with. SLIDING KEY = A KEY value that is INCREASED or DECREASED on each loop. COUNT = The number of bytes in the encrypted code or data. INDEX = A pointer to the encrypted code or data. SIGNATURE = A unique group of bytes that can be used to check against a programs content in the hope of detecting a particular program. HEURISTIC = A set of well defined rules to apply to a problem in the hope of achieving a known result. Question: What is a Polymorph? AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA Answer: Well, the Longman English Dictionary defines it as: "POLYMORPHOUS also POLYMORPHIC adj fml or tech. EXISTING IN VARIOUS DIFFERENT FORMS." In other words, something that has the ability to change it's shape. Other ways to describe such a thing might be; Mutable, Metamorphic, Etc... Question: What is a Polymorph Engine? AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA Answer: A program with the abilities to encrypt (or jumble up) another program or data and provide a unique decryptor for it, it must do this in such a way that no two encryptions of the same program or data will look alike. Example: Take the following ultra-simple decryptor: MOV SI,jumbled_data ;Point to the jumbled data MOV CX,10 ;Ten bytes to decrypt main_loop: XOR BYTE PTR [SI],55 ;XOR (un_scramble!) a byte INC SI ;Next byte LOOP main_loop ;Loop for the 9 remaining bytes This small program will XOR the ten bytes at the location pointed to by SI with the value 55. Providing the ten bytes were XORed with 55 prior to running this decryptor the ten bytes will be restored to their original state. If you are unsure as to why this is, brush up on your XOR logic!! Ok, so you might say that if you change the KEY value on each generation it will become Polymorphic? Well, yes and no! If you did that, the encrypted portion would be Polymorphic, but the decryptor would still remain mostly the same, the only change begin the KEY value! So, a signature scanner that allows WILDCARDS (and most do!) would still be able to find your decryptor! One way you could fool some signature scanners is to swap around some of the instructions. So, with this in mind, the above decryptor might look like: MOV CX,10 MOV SI,jumbled_data main_loop: XOR BYTE PTR [SI],55 INC SI LOOP main_loop As you can see, still not much of a change, not really enough to fool some of the better signature scanners. "GET TO THE POINT! WHAT IS A TRUE POLYMORPH?", I hear you cry! AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA Well, a "true" Polymorph would be a decryptor that looks completely different on each generation! Take the following decryptor: MOV CX,10 NOP NOP MOV SI,jumbled_data NOP main_loop: NOP NOP XOR BYTE PTR [SI],55 NOP INC SI NOP NOP NOP NOP LOOP main_loop This decryptor is the same as the one before it, but it has has a few random NOP instructions peppered throughout itself. On each generation you would vary the amount of NOPs after each instruction. This is a Polymorph in it's simplest form. Still, most of the good signature scanners would have no problem with such a simple Polymorph. They would simply skip the NOPs, thus having a clear view of the decryptor, to which they could apply a signature! No, a "true" Polymorph has to be far far more complex then this! Instead of peppering NOPs throughout the decryptor it would pepper totally random amounts of totally random 8086 instructions, including JUMPS and CALLS. It would also use a different main decryptor (possibly from a selection of pre-coded ones) and would alter all the registers that the decryptor uses on each generation, making sure that the JUNK code that it generates doesn't destroy any of the registers used by the real decryptor! So, with these rules in mind, here is our simple decryptor again: MOV DX,10 ;Real part of the decryptor! MOV SI,1234 ;junk AND AX,[SI+1234] ;junk CLD ;junk MOV DI,jumbled_data ;Real part of the decryptor! TEST [SI+1234],BL ;junk OR AL,CL ;junk main_loop: ADD SI,SI ;junk instruction, real loop! XOR AX,1234 ;junk XOR BYTE PTR [DI],55 ;Real part of the decryptor! SUB SI,123 ;junk INC DI ;Real part of the decryptor! TEST DX,1234 ;junk AND AL,[BP+1234] ;junk DEC DX ;Real part of the decryptor! NOP ;junk XOR AX,DX ;junk SBB AX,[SI+1234] ;junk AND DX,DX ;Real part of the decryptor! JNZ main_loop ;Real part of the decryptor! As you should be able to see, quite a mess!! But, still executable code. It is essential that any junk code generated by the Polymorph Engine is executable, as it is going to be peppered throughout the decryptor. Note, in this example, that some of the junk instructions use registers that we are using in the decryptor! This is fine, providing the values in these registers aren't destroyed. Also note, that now we have random registers and random instructions on each generation it makes signature scanning (even for the clever signature scanners) impossible! Instead, an HEURISTIC method must be used, which can lead to false alarms. So, a Polymorph Engine can be summed up into three major parts: AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA 1 .. The random number generator. 2 .. The junk code generator. 3 .. The decryptor generator. There are other discrete parts but these three are the ones where most of the work goes on! How does it all work? Well, SMEG goes about generating random decryptors in the following way: 1 .. Chooses a random selection of registers to use for the decryptor. Leaving the remaining registers as "junk" registers for the junk code generator. 2 .. Chooses one of the compressed pre-coded decryptors. 3 .. Goes into a loop generating the real decryptor, peppered with junk code. To understand how the selected registers are slotted into the decryptors and the junk code you must look at the 8086 instructions from a binary level: XOR AX,AX = 00110001 11000000 XOR AX,CX = 00110001 11001000 XOR AX,DX = 00110001 11010000 XOR AX,BX = 00110001 11011000 You should be able to see a pattern in the binary code for these four 8086 instructions? Well, all 8086 instructions follow logical patterns, and it is these patterns that tell the 8086 processor which registers/addressing mode to use for a particular instruction. The total amount of instruction formats and the precise logic regarding the patterns is too complex to go into here. However, all good 8086 tutorials/reference guides will explain in full. SMEG exploits this pattern logic to generate junk code and decryptors with random registers, as the patterns directly relate to the registers Etc. SMEG generates junk code in the following way: AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA Inside SMEG there is a table of the basic binary patterns for all of the 8086 instruction set, but with one important difference, all the register/address mode bits are zero. This is called the SKELETON INSTRUCTION TABLE. The table also contains various other bytes used by SMEG to determine the relevant bit positions to "plug in" the register bit patterns. These patterns are plugged in via the logic processes OR and AND. Using this method, SMEG can generate endless amounts of random 8086 instructions without destroying any of the registers used by the decryptor proper. SMEG also contains some discrete logic for producing false CALLS to dummy subroutines and also false conditional JMPS around the junk code. SMEG generates the decryptor proper in the following way: AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA Inside SMEG there is a table containing a selection of common 8086 instructions used in decryptors, such as XOR [index],reg Etc. These are, again, stored in SKELETON FORM with some control bytes used by the decryptor generator. Also, inside SMEG, there are several pre-coded decryptors stored in a compressed form. On average, a complete decryptor can be described to the decryptor generator in as few as 11 bytes and adding to the list of pre-coded decryptors is both painless and economical with space! SMEG generates the Polymorphed decryptor in the following way: AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA First it chooses, at random, one of the pre-coded compressed decryptors. Next it goes into a loop uncompressing each decryptor instruction, plugging in the required registers, storing it and then generating (for each real instruction) a random amount of random instructions. This loop repeats until the complete decryptor has been constructed. The final result is a random size, random register, random patterned decryptor! It should also be noted that whenever SMEG generates an INDEXed instruction it uses either SI, DI or BX at random, also it sometimes uses a random offset. For example, say the encrypted code started at address 10h, the following could be used to index this address: MOV SI,10h ;Start address MOV AL,[SI] ;Index from initial address But sometimes SMEG will generate something like the following, again based on the encrypted code starting at address 10h: MOV DI,0BFAAh ;Indirect start address MOV AL,[DI+4066h) ;4066h + 0BFAAh = 10010h (and FFFF = 10h)!! These indexed and initial values are picked at complete random, and the examples of 0BFAAh and 4066h are valid, but next time they will be completely different! The following are two decryptors that were generated with my SMEG Polymorph Engine. It should be noted that I generated 4000 examples with no two alike! Unfortunately I ran out of hard drive space! But it is fairly safe to say that the total number of decryptor combinations would run into the BILLIONS! All the lines marked with ";junk" in the following listings indicate random junk instructions that were inserted throughout the actual decryptor, note that SMEG has the ability to generate junk CALLS to false SUBROUTINES, as well as general junk conditional jumps! All lines marked with a * indicate an actual part of the decryptor proper. I chose the two generations shown because their sizes were similar, 386 and 480 bytes. SMEG produces decryptors ranging in size from as little as 288 to as much as 1536 bytes. Even if two decryptors are generated that are the same size the chances of them being the same are, literally, billions to one!
participants (1)
-
Mullen Patrick