Re: [spam][crazy] bomb malware
fork situation resolved

i've loaded the mirai binary into the ghidra analyser. Here's how ghidra displays the mirai entrypoint. Comments from me are preceded by "//" inline.

    **************************************************************
    *                          FUNCTION                          *
    **************************************************************
    undefined __regparm3 entry(undefined4 param_1, undefined...
    undefined     AL:1           <RETURN>
    undefined4    EAX:4          param_1
    undefined4    EDX:4          param_2
    undefined4    Stack[-0x8]:4  local_8       XREF[1]: 0804816d(*)
    entry                                      XREF[2]: Entry Point(*), 08048018(*)
    08048164 31 ed       XOR   EBP,EBP
    08048166 5e          POP   ESI
    08048167 89 e1       MOV   ECX,ESP
    08048169 83 e4 f0    AND   ESP,0xfffffff0
    0804816c 50          PUSH  param_1
    0804816d 54          PUSH  ESP=>local_8
    0804816e 52          PUSH  param_2
    0804816f 68 d6 db    PUSH  LAB_0804dbd6
             04 08
    08048174 68 94 80    PUSH  LAB_08048094
             04 08
    08048179 51          PUSH  ECX
    0804817a 56          PUSH  ESI
    0804817b 68 40 a5    PUSH  FUN_0804a540
             04 08
    08048180 e8 ba 50    CALL  FUN_0804d23f      int FUN_0804d23f(undefined * par
             00 00
    08048185 f4          HLT
    08048186 90 90 90    align align(10)
             90 90 90 90 90 90 90
I didn't end up including comments. the LAB_* references pushed onto the stack (to be passed to FUN_0804d23f) are function pointers. I click on them, or hit enter while the cursor is over them, and then hit 'F' to reanalyse them as functions. I can tell they are functions because the instructions at their start and end are the ones always used for functions.

in ida pro you hit 'esc' to return to where you just were; in ghidra it's alt-shift-left
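One way to read the entrypoint, offered as a guess and not something the listing proves: the pushes look like the argument list that a conventional C runtime start stub hands to its libc start routine. Read right to left (cdecl), that would make FUN_0804a540 the program's main, ESI and ECX argc and argv, and LAB_08048094 / LAB_0804dbd6 the init and fini hooks. The sketch below only illustrates that calling shape. Every name in it (start_main, fake_init, fake_main, fake_fini, call_log) is hypothetical, and it is not the code of FUN_0804d23f.

```c
#include <assert.h>
#include <stddef.h>

typedef int  (*main_fn)(int, char **);
typedef void (*hook_fn)(void);

/* Hypothetical stand-in for FUN_0804d23f. Parameter order mirrors the
 * pushes in the listing read right to left: the last value pushed is the
 * first argument. A real start routine usually arranges for fini to run
 * at exit; calling it inline here just keeps the sketch short. */
static int start_main(main_fn app_main, int argc, char **argv,
                      hook_fn init, hook_fn fini)
{
    int status;

    assert(app_main != NULL);
    if (init != NULL) {
        init();                     /* would be LAB_08048094 */
    }
    status = app_main(argc, argv);  /* would be FUN_0804a540 */
    if (fini != NULL) {
        fini();                     /* would be LAB_0804dbd6 */
    }
    return status;
}

/* Demo hooks that record the order in which they were called. */
static char   call_log[8];
static size_t call_count;

static void log_call(char tag)
{
    if (call_count < sizeof call_log - 1) {
        call_log[call_count++] = tag;
    }
}

static void fake_init(void) { log_call('i'); }
static void fake_fini(void) { log_call('f'); }

static int fake_main(int argc, char **argv)
{
    (void)argv;
    log_call('m');
    return argc;
}
```

Calling start_main(fake_main, 1, argv, fake_init, fake_fini) runs the three demo hooks in the order init, main, fini, which is the order the pushed pointers would be used in if this reading is right.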
i wrote a lot more and my system froze quite thoroughly and i rebooted it
I'm looking at this autogenerated ghidra decompilation. I labeled the flag as a bool. PTR_DAT_0804e024 contains the address of DAT_0804e00c which contains void at start. The logic here is a little confusing. I'm trying to put comments inline below.

    void FUN_080480c0(void)
    {
      code *pcVar1;

      // code runs only once, sets a flag
      if (BOOL_0804e080 == false) {
        // loop dereferences the ptr, and continues only if it is nonzero
        while (pcVar1 = *(code **)PTR_DAT_0804e024, pcVar1 != (code *)0x0) {
          // ptr is incremented to _next_ value (since this is 32 bit code)
          PTR_DAT_0804e024 = PTR_DAT_0804e024 + 4;
          // _old_ value is dereferenced and called?
          (*pcVar1)();
        }
        BOOL_0804e080 = true;
      }
      return;
    }

it looks like it needs to be called at the right time, and calls a hidden function when that is done? and may also increment a pointer? i'd like to review it again. here's the disassembly:

    **************************************************************
    *                          FUNCTION                          *
    **************************************************************
    undefined __cdecl FUN_080480c0(void)
    undefined     AL:1   <RETURN>
    FUN_080480c0                     XREF[1]: FUN_0804dbd6:0804dbe5(c)
    080480c0 55          PUSH  EBP
    080480c1 89 e5       MOV   EBP,ESP
    080480c3 83 ec 08    SUB   ESP,0x8
    080480c6 80 3d 80    CMP   byte ptr [BOOL_0804e080],0x0     = ??
             e0 04 08 00
    080480cd 74 0c       JZ    LAB_080480db
    080480cf eb 35       JMP   LAB_08048106
    LAB_080480d1                     XREF[1]: 080480e4(j)
    080480d1 83 c0 04    ADD   EAX,0x4
    080480d4 a3 24 e0    MOV   [PTR_DAT_0804e024],EAX           = 0804e00c
             04 08
    080480d9 ff d2       CALL  EDX
    LAB_080480db                     XREF[1]: 080480cd(j)
    080480db a1 24 e0    MOV   EAX,[PTR_DAT_0804e024]           = 0804e00c
             04 08
    080480e0 8b 10       MOV   EDX,dword ptr [EAX]=>DAT_0804e00c
    080480e2 85 d2       TEST  EDX,EDX
    080480e4 75 eb       JNZ   LAB_080480d1
    080480e6 b8 00 00    MOV   EAX,0x0
             00 00
    080480eb 85 c0       TEST  EAX,EAX
    080480ed 74 10       JZ    LAB_080480ff
    080480ef 83 ec 0c    SUB   ESP,0xc
    080480f2 68 08 df    PUSH  0x804df08
             04 08
    080480f7 e8 04 7f    CALL  SUB_00000000
             fb f7
    080480fc 83 c4 10    ADD   ESP,0x10
    LAB_080480ff                     XREF[1]: 080480ed(j)
    080480ff c6 05 80    MOV   byte ptr [BOOL_0804e080],0x1     = ??
             e0 04 08 01
    LAB_08048106                     XREF[1]: 080480cf(j)
    08048106 c9          LEAVE
    08048107 c3          RET
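To check the reading of the decompilation, here is a small compilable C sketch of the same control flow: a run-once flag guarding a walk over a NULL-terminated table of function pointers, where the cursor is advanced before each call. The names (handler_table, cursor, completed, handler_a, handler_b) are made up for illustration and stand in for DAT_0804e00c, PTR_DAT_0804e024 and BOOL_0804e080. The shape resembles the destructor-walking helper that compiler startup files commonly emit, but that is an inference from the pattern and not something the listing states.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef void (*handler_fn)(void);

/* Demo handlers; the counter only exists so the effect is observable. */
static int calls_made;
static void handler_a(void) { calls_made += 1; }
static void handler_b(void) { calls_made += 10; }

/* Stand-in for the table at DAT_0804e00c: function pointers, NULL-terminated. */
static handler_fn handler_table[] = { handler_a, handler_b, NULL };

/* Stand-in for PTR_DAT_0804e024: a cursor into the table. */
static handler_fn *cursor = handler_table;

/* Stand-in for BOOL_0804e080. */
static bool completed = false;

static void run_handlers_once(void)
{
    handler_fn current;

    assert(cursor != NULL);
    if (completed) {
        return;                 /* second and later calls do nothing */
    }
    while ((current = *cursor) != NULL) {
        cursor = cursor + 1;    /* one slot; 4 bytes in the 32-bit binary */
        current();              /* call the entry that was just read */
    }
    completed = true;
}
```

Because the cursor is stored back before each call, a handler that somehow re-entered the function would continue from the next entry and not repeat itself. If the table's first entry is already zero, the loop body never runs and the function only sets the flag.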
so let's go through that chunk by chunk

    // function prolog, set up a C-style function
    080480c0 55          PUSH  EBP
    080480c1 89 e5       MOV   EBP,ESP
    080480c3 83 ec 08    SUB   ESP,0x8
    // compare the flag with 0 (false)
    080480c6 80 3d 80    CMP   byte ptr [BOOL_0804e080],0x0     = ??
             e0 04 08 00
    // goto 080480db if it is false
    080480cd 74 0c       JZ    LAB_080480db
    // goto 08048106 if it is true
    080480cf eb 35       JMP   LAB_08048106

    // this next line is 080480d1 . this line is jumped to (referenced XREF (j)) from 080480e4
    LAB_080480d1                     XREF[1]: 080480e4(j)
    // add 4 to the first active value (EAX is the first 32-bit register, the working memory of a cpu)
    080480d1 83 c0 04    ADD   EAX,0x4
    // set PTR_DAT_0804e024 to EAX
    080480d4 a3 24 e0    MOV   [PTR_DAT_0804e024],EAX           = 0804e00c
    // these are further instruction bytes. the assembly statement is 5 bytes long (a3 24 e0 04 08)
             04 08

    // call EDX as a function. EDX is another of the 32-bit general-purpose registers, i.e. cpu working-memory.
    080480d9 ff d2       CALL  EDX
    // this is where the jump statement from 080480cd ends up. So, this is the start of the while loop, and the code immediately above isn't executed until this is.
    LAB_080480db                     XREF[1]: 080480cd(j)
    // copy PTR_DAT_0804e024 into EAX.
    080480db a1 24 e0    MOV   EAX,[PTR_DAT_0804e024]           = 0804e00c
             04 08

I'm a little confused on whether [ADDR] dereferences the data pointed to by the address or not, in this notation. I think I'll look it up.
There are a handful of different ways to notate assembly code. Luckily, I stumbled on what appears to be the same one. https://www.cs.virginia.edu/~evans/cs216/guides/x86.html#memory

Some examples of mov instructions using address computations are:

    mov eax, [ebx]        ; Move the 4 bytes in memory at the address contained in EBX into EAX
    mov [var], ebx        ; Move the contents of EBX into the 4 bytes at memory address var. (Note, var is a 32-bit constant).
    mov eax, [esi-4]      ; Move 4 bytes at memory address ESI + (-4) into EAX
    mov [esi+eax], cl     ; Move the contents of CL into the byte at address ESI+EAX
    mov edx, [esi+4*ebx]  ; Move the 4 bytes of data at address ESI+4*EBX into EDX

So, [var] treats var as the memory address to read from or write to. The MOV statements above that use [PTR_DAT_0804e024] read and write the pointer itself (the value stored at 0804e024), so they are not dereferencing the pointer, but rather adjusting where it is pointing.
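The same distinction in C terms, as a rough sketch: reading or writing the pointer variable corresponds to the MOVs through [PTR_DAT_0804e024], and following the pointer corresponds to a second MOV through [EAX]. The names table, ptr_var and the three helper functions are invented for this example, and the assembly in the comments is only the approximate shape a 32-bit compiler might produce.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

static uint32_t  table[2] = { 0x11111111u, 0x22222222u };
static uint32_t *ptr_var  = &table[0];   /* plays the role of PTR_DAT_0804e024 */

/* mov eax, [ptr_var]
 * Reads the pointer variable itself; nothing in the table is touched. */
static uint32_t *load_pointer(void)
{
    return ptr_var;
}

/* mov eax, [ptr_var]
 * mov edx, [eax]
 * The second load is the actual dereference. */
static uint32_t load_pointee(void)
{
    uint32_t *eax = ptr_var;

    assert(eax != NULL);
    return *eax;
}

/* add eax, 4
 * mov [ptr_var], eax
 * Moves the pointer one 4-byte slot forward; the table contents stay the same. */
static void advance_pointer(void)
{
    ptr_var = ptr_var + 1;
}
```

After advance_pointer(), load_pointer() returns the address of the second slot and load_pointee() returns that slot's value, while the first slot is unchanged.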
    // dereference the pointer and move the discovered value into EDX.
    // ghidra here is reminding us that PTR_DAT_0804e024 in EAX points to DAT_0804e00c
    // and if one of those values is renamed in the interface, it will update the name everywhere
    080480e0 8b 10       MOV   EDX,dword ptr [EAX]=>DAT_0804e00c
    // comparing a value with itself tests whether it is zero or not
    080480e2 85 d2       TEST  EDX,EDX
    // if the pointed to value is nonzero, goto 080480d1
    080480e4 75 eb       JNZ   LAB_080480d1
    // reset the active pointer (eax) to zero
    080480e6 b8 00 00    MOV   EAX,0x0
             00 00
    // this looks like a compilation quirk, and i'll include the two lines together
    // since eax is always zero, this is always a jump to 080480ff, right?
    // it's the cryptic termination of the while loop, right?
    080480eb 85 c0       TEST  EAX,EAX
    080480ed 74 10       JZ    LAB_080480ff
    // i'm not experienced with disassembling malware. it's possible some code jumps to 080480eb that ghidra hasn't detected, or maybe this is some compilation norm.
    // this code looks like the end of a function, but presently it would never be executed, due to the jump above
    080480ef 83 ec 0c    SUB   ESP,0xc
    080480f2 68 08 df    PUSH  0x804df08
             04 08
    // this instruction makes no sense to me: calling 0x00000000 as a function. this would immediately segfault an application. maybe it is there for that purpose? maybe it is there to be replaced later?
    080480f7 e8 04 7f    CALL  SUB_00000000
             fb f7
    080480fc 83 c4 10    ADD   ESP,0x10
    // here's where normal execution would resume, the end of the while loop. it sets the flag to 1 (true).
    LAB_080480ff                     XREF[1]: 080480ed(j)
    080480ff c6 05 80    MOV   byte ptr [BOOL_0804e080],0x1     = ??
             e0 04 08 01
    // and returns.
    LAB_08048106                     XREF[1]: 080480cf(j)
    08048106 c9          LEAVE
    08048107 c3          RET
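The jump and call targets in the listing can be checked by hand from the raw bytes, which also shows where SUB_00000000 comes from. This sketch assumes the standard x86 encodings: e8 is CALL with a 32-bit little-endian displacement, and 74 / 75 / eb are short jumps with a signed 8-bit displacement, both measured from the end of the instruction. The function names are made up for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* CALL rel32 (opcode e8): 1 opcode byte followed by 4 displacement bytes,
 * little-endian. The target is the address of the next instruction plus
 * the displacement, wrapping modulo 2^32. */
static uint32_t call_rel32_target(uint32_t insn_addr, const uint8_t disp[4])
{
    uint32_t rel;

    assert(disp != NULL);
    rel = (uint32_t)disp[0]
        | ((uint32_t)disp[1] << 8)
        | ((uint32_t)disp[2] << 16)
        | ((uint32_t)disp[3] << 24);
    return insn_addr + 5u + rel;
}

/* Short jumps (74 JZ, 75 JNZ, eb JMP): 1 opcode byte followed by a signed
 * 8-bit displacement, again relative to the next instruction. */
static uint32_t jump_rel8_target(uint32_t insn_addr, uint8_t disp)
{
    int32_t rel = (disp < 0x80) ? (int32_t)disp : (int32_t)disp - 0x100;

    return insn_addr + 2u + (uint32_t)rel;
}
```

With the bytes from the listing, the call at 080480f7 (e8 04 7f fb f7) works out to 080480fc + f7fb7f04, which wraps around to exactly 0, and the call at 08048180 (e8 ba 50 00 00) works out to 0804d23f. A call whose displacement lands precisely on address 0, guarded by a test of a constant 0, is the kind of thing you would see if the target were an optional symbol that was never linked in. That is a guess, not something ghidra reports here.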
i'm seeing that pattern, with the skipped code calling a null pointer, elsewhere in the code. for something confusing like that, it's clearest to watch the system execute to see what is important. so it would make sense to move to code that i can run. this function is passed as a pointer in the entrypoint, and isn't immediately executed, so to me, right now, it's of less interest than things that can be tested.
The function called from the entrypoint is FUN_0804d23f . It's bigger.
i'm destabilising here. sounds like you want a quick summary of these binaries. a researcher for an antivirus group would likely have that. i'm not one, so i'm a lot slower. i really enjoy this work, it's very rare for me to be able to do something like this.
we could skip all the details and try to profile more attributes of the binaries.
i'm just gonna run the binary. i bet that idea is part of some of my fears.
people always say you should push your edge, challenge your fears! i'll be running it with a debugger so that it doesn't go too far. if you aren't a crazed homeless software developer, you'll want to have a vm or a dedicated offline system for something like this.

    $ gdb 776c341504769aa67af7efc5acc66c338dab5684a8579134d3f23165c7abcc00
    (gdb)
I found this command from the web:

    (gdb) info file
    Symbols from "/media/3/pkg/ghidra-projects/Log4J Malware/Mirai/776c341504769aa67af7efc5acc66c338dab5684a8579134d3f23165c7abcc00".
    Local exec file:
        `/media/3/pkg/ghidra-projects/Log4J Malware/Mirai/776c341504769aa67af7efc5acc66c338dab5684a8579134d3f23165c7abcc00', file type elf32-i386.
        Entry point: 0x8048164
        0x08048094 - 0x080480b0 is .init
        0x080480b0 - 0x0804dbd6 is .text
        0x0804dbd6 - 0x0804dbed is .fini
        0x0804dbf0 - 0x0804df06 is .rodata
        0x0804e000 - 0x0804e008 is .ctors
        0x0804e008 - 0x0804e010 is .dtors
        0x0804e020 - 0x0804e080 is .data
        0x0804e080 - 0x0804e500 is .bss

quick way to find the entrypoint
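The section list is also handy for placing the addresses seen earlier in ghidra. The sketch below copies the ranges from the gdb output into a table and looks up which section an address falls in. It assumes each range is start-inclusive and end-exclusive (adjacent sections share a boundary, which suggests that), and the names struct section, section_of and address_in are invented for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct section {
    const char *name;
    uint32_t    start;   /* inclusive */
    uint32_t    end;     /* exclusive (assumed) */
};

/* Ranges copied from the "info file" output above. */
static const struct section sections[] = {
    { ".init",   0x08048094u, 0x080480b0u },
    { ".text",   0x080480b0u, 0x0804dbd6u },
    { ".fini",   0x0804dbd6u, 0x0804dbedu },
    { ".rodata", 0x0804dbf0u, 0x0804df06u },
    { ".ctors",  0x0804e000u, 0x0804e008u },
    { ".dtors",  0x0804e008u, 0x0804e010u },
    { ".data",   0x0804e020u, 0x0804e080u },
    { ".bss",    0x0804e080u, 0x0804e500u },
};

/* Returns the name of the section containing addr, or NULL if none does. */
static const char *section_of(uint32_t addr)
{
    size_t i;

    for (i = 0; i < sizeof sections / sizeof sections[0]; i++) {
        if (addr >= sections[i].start && addr < sections[i].end) {
            return sections[i].name;
        }
    }
    return NULL;
}

/* Convenience check: is addr inside the section called name? */
static int address_in(uint32_t addr, const char *name)
{
    const char *found = section_of(addr);

    assert(name != NULL);
    return found != NULL && strcmp(found, name) == 0;
}
```

Looked up this way, LAB_08048094 is the start of .init, LAB_0804dbd6 is the start of .fini, DAT_0804e00c sits inside .dtors, PTR_DAT_0804e024 is in .data and BOOL_0804e080 is the first byte of .bss. That fits FUN_080480c0 being a helper that .fini calls to walk the destructor list, though the section names alone don't prove it.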
participants (1)
-
Karl