
Entropy is a measure of randomness in a set of data: the more random the data, the higher its entropy. When we talk about entropy we generally mean Shannon entropy, an algorithm that produces a value between 0 and 8 (bits per byte), where 8 means the data has no discernible pattern (it is very random) and 0 means the data follows a completely predictable pattern. Once you are used to evaluating digital data, it is easy to spot these patterns without doing any calculation. It is the analog of the movie The Matrix, where the operators can see what is going on in the Matrix code without a translator. Random, high entropy digital data looks very chaotic, while low entropy data looks more organized; you may even see symmetric patterns formed by the array of bytes, though the data is still not readable in any known language, and again, no calculation is generally needed.

It turns out that one of the techniques almost universally used to hide malicious code is packing or encoding it to change its byte-pattern signature and fool security products. The problem with this approach is also universal: as you pack or obfuscate the malicious code, its entropy increases. In fact, studies have demonstrated that entropy can be used very effectively to separate benign code from malicious code. Malicious samples will typically have an entropy above 7.2, while legitimate software ranges from 4.8 to 7.2. Entropy close to 8 is seen in about 30% of malicious samples but in only 1% of innocent code; more than half of malicious samples have an entropy above 7.2, while only about one in ten legitimate applications does. In conclusion, not all malicious samples have high entropy (but a great majority do), and not all legitimate applications have low entropy (but the majority do).
Legitimate samples can have high entropy because packing is a valid technique used to decrease the size of executables and protect resources, and many applications take advantage of it. So even though entropy alone is not enough to differentiate malicious code from normal, innocent code, malware analysis tools have used high entropy as one of the main elements in their malicious scoring systems to flag samples. If you have been in the crypter world, you can corroborate this by scanning your stub alone and then your stub with the encoded payload stored in the data section or resources section: detection immediately increases. Anti-malware solutions look for high entropy not only as a global value over the entire executable but also section by section, and there is little we can do to counter it. Even though this has been a problem for many years, the proposed techniques are not definitive. If you read on the topic, the first solution malware authors propose is to increase the amount of normal code relative to obfuscated code. The stub then has to be bigger and full of unused code, so although possible, I don't see this as a good solution. The second solution is even more inconvenient because it requires reducing the amount of obfuscated code, meaning the malicious code would have to be small compared to the stub. Unfortunately, in most situations it is the opposite: the stub is far smaller than the payload, so this approach limits you to the very few small payloads. Another possibility is to obfuscate the code with an algorithm that does not increase entropy (like XORing or Base64 encoding). This last one is, I think, the most convenient, which does not mean it is perfect: XORed as well as Base64-encoded data can be easily decoded to unmask the real purpose of the code, and signatures can be created directly against both the XORed and the Base64-encoded data.
Finally, some anti-malware solutions can even decode these simple schemes during the emulation phase of the analysis. Thinking about better alternatives, I came up with an interesting idea that I have seen applied to a few malware samples in the past. If the problem is randomness, why not disguise the obfuscated malicious code by inserting patterns that reduce its randomness, and therefore its global entropy? This way you are not limited to obfuscating the code with simple algorithms and yet stay under the radar of anti-malware solutions; the obfuscated code can also be of any size. With this in mind I created a POC to reduce the Shannon entropy of an obfuscated malicious array of bytes. The idea is to break the array into chunks and insert a low entropy pattern of bytes between one chunk and the next. When the sample is executed, we rebuild the original payload in memory, and at that point we have bypassed static detection of the high entropy code. I want to pause on the pattern used for the low entropy bytes, because you can create different patterns in order to avoid signature detection. Using various mathematical equations you could come up with many different low entropy patterns, but for the purposes of this article we will use a simple one. The second task is the one that interleaves the high entropy chunks of bytes with the low entropy ones; not much to say about it. The third task restores the original array of bytes by eliminating the low entropy patterns, because, after all, we need to restore the obfuscated code to what it was originally in order to proceed to the de-obfuscation phase.

So let's present and explain the code that performs the first task.

#include <cstdio>
#include <Windows.h>
#include "Entropy.h"

using namespace std;

// Simulated high entropy code
BYTE payload[] = {
    0xFF, 0x44, 0x01, 0xE2, 0x86, 0xD1, 0xC6, 0x88, 0x39, 0xA2, 0xAF,
    0x59, 0x3D, 0x8C, 0x0F, 0x90, 0x7E, 0xB2, 0xAF, 0xBD, 0xAA, 0xE0,
    0xBB, 0x87, 0x10, 0xF0, 0xD1, 0x37, 0x00, 0x55, 0x07
};

constexpr int number_of_chunks = 5;                                // Number of chunks. You can randomize this too.
constexpr int chunk_size = sizeof payload / number_of_chunks;      // Size of each chunk
constexpr int remaining_bytes = sizeof payload % number_of_chunks; // Bytes left over after the last full chunk is processed

// Array sized to hold the original high entropy code plus the low entropy inserts
BYTE low_entropy_payload[sizeof payload * 2 - remaining_bytes] = {0};

// Total size of the reduced entropy payload (the leftover tail bytes get no insert after them)
constexpr int payload_size_after_entropy_reduction = sizeof payload * 2 - remaining_bytes;

Notice that all of these calculations are stored in global variables, and the high entropy code is also located in the global area of the code to ensure it is stored in the data section of the executable; it could just as well be located in the resources section and loaded at runtime. You could even store the high entropy byte pattern inside the main function, but then the pattern would be placed in the .text section and copied onto the stack at runtime, instead of residing in the data or resources section. This is important because the stack can't handle very big arrays of bytes, and some compilers complain when a local array is too big.
The next task is to divide the high entropy code into chunks and add the low entropy patterns:

PBYTE reduce_entropy(PBYTE orig_payload)
{
    // Range of the random byte used to build the low entropy filler
    constexpr int max_n = 0xEF;
    constexpr int min_n = 0x01;

    // Low entropy filler of the same size as one high entropy chunk
    char random_hex[chunk_size];

    // Offsets into the output and input arrays
    int offset_of_low_entropy_payload = 0;
    int offset_of_original_payload = 0;

    // Pick one random byte and repeat it to fill the low entropy chunk
    const BYTE new_n = static_cast<BYTE>(rand() % (max_n + 1 - min_n) + min_n);
    for (char& i : random_hex)
    {
        i = static_cast<char>(new_n);
    }

    // Build the low entropy array: a high entropy chunk, then a low entropy one
    for (size_t i = 0; i < number_of_chunks; i++)
    {
        for (size_t j = 0; j < chunk_size; j++)
        {
            low_entropy_payload[offset_of_low_entropy_payload++] = orig_payload[offset_of_original_payload++];
        }
        // Append the filler; a single repeated value keeps its entropy at zero
        for (const char k : random_hex)
        {
            low_entropy_payload[offset_of_low_entropy_payload++] = k;
        }
    }

    // If there are remaining bytes, add them to the tail of the array
    if (remaining_bytes)
    {
        for (int i = 0; i < remaining_bytes; i++)
        {
            low_entropy_payload[offset_of_low_entropy_payload++] = orig_payload[offset_of_original_payload++];
        }
    }

    // Return the newly created low entropy array
    return low_entropy_payload;
}
participants (1)
-
zeynepaydogan