Thanks drome for sharing his knowledge and skills! He completed all 10 challenges and this series of writeups is done by him :)
| Details | Links |
|---|---|
| Official Challenge Site | https://flare-on.com/ |
| Official Challenge Announcement | https://www.fireeye.com/blog/threat-research/2021/08/announcing-the-eighth-annual-flare-on-challenge.html |
| Official Solutions | https://www.mandiant.com/resources/flare-on-8-challenge-solutions |
| Official Challenge Binaries | http://flare-on.com/files/Flare-On8_Challenges.zip |
09_evil
Mandiant's unofficial motto is "find evil and solve crime". Well here is evil but forget crime, solve challenge. Listen kid, RFCs are for fools, but for you we'll make an exception :)
The challenge has 3 false flags:
- !t_$uRe_W0u1d_B3_n1ce_huh!@flare-on.com
- 1s_tHi$_mY_f1aG@flare-on.com
- N3ver_G0nNa_g1ve_y0u_Up@flare-on.com

7zip password: flare
There is a single file called evil.exe.
arch x86
baddr 0x400000
binsz 2964480
bintype pe
bits 32
canary false
retguard false
class PE32
cmp.csum 0x002d7f23
compiled Mon Jul 19 12:22:26 2021
crypto false
endian little
havecode true
hdr.csum 0x00000000
laddr 0x0
lang c
linenum false
lsyms false
machine i386
nx true
os windows
overlay false
cc cdecl
pic true
relocs false
signed false
sanitize false
static false
stripped false
subsys Windows CUI
va true
Dynamic Analysis
Disassembling it, we see that IDA can't even define the main function because of bad opcodes. We run it dynamically and see that it calls 0x623D0, which performs a null pointer access at 0x62460. This causes an exception in our debugger, and if we pass it to the process, we get a divide-by-zero exception at 0x624B0.
SEH handlers
To understand where we can find the exception handlers, we refer to Practical Malware Analysis (PMA)'s Chapter 15 - Obscuring Flow Control - Missing Structured Exception Handlers (page 344).
To find the SEH chain, the OS examines the FS segment register. This register contains a segment selector that is used to gain access to the Thread Environment Block (TEB). The first structure within the TEB is the Thread Information Block (TIB). The first element of the TIB (and consequently the first bytes of the TEB) is a pointer to the SEH chain. The SEH chain is a simple linked list of 8-byte data structures called EXCEPTION_REGISTRATION records
This explains the following assembly code at the start of main, which adds an additional SEH entry at the top of the SEH chain pointing to sub_A38E4.
.text:00066450 _main:
.text:00066450 push ebp
.text:00066451 mov ebp, esp
.text:00066453 push 0FFFFFFFFh
.text:00066455 push offset sub_A38E4
.text:0006645A mov eax, large fs:0
.text:00066460 push eax
.text:00066461 mov large fs:0, esp
In the function called by main, at 0x623D0, we see that it does something very similar, just with the handler sub_A3740.
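The linked-list layout quoted from PMA can be sketched in a few lines. This is only a toy model under stated assumptions: `memory` is a fake address space, `fs0` stands in for the value stored at fs:[0], and the older handler address 0x4B2000 is hypothetical. Each 8-byte record is a Next pointer followed by a Handler pointer, and 0xFFFFFFFF ends the chain.

```python
import struct

END_OF_CHAIN = 0xFFFFFFFF  # Next value of the last EXCEPTION_REGISTRATION record

def walk_seh_chain(memory, head):
    """Follow Next pointers starting at `head`; return each record's handler."""
    handlers = []
    addr = head
    while addr != END_OF_CHAIN:
        nxt, handler = struct.unpack_from("<II", memory, addr)
        handlers.append(handler)
        addr = nxt
    return handlers

# Toy address space: an older record at 0x20, and the record pushed by main at 0x10.
memory = bytearray(0x40)
struct.pack_into("<II", memory, 0x20, END_OF_CHAIN, 0x4B2000)  # hypothetical older handler
struct.pack_into("<II", memory, 0x10, 0x20, 0xA38E4)           # main's record -> sub_A38E4
fs0 = 0x10  # stand-in for the pointer stored at fs:[0]

print([hex(h) for h in walk_seh_chain(memory, fs0)])
```

The newest record comes first, which is why pushing a record and writing esp to fs:0 puts sub_A38E4 at the top of the chain.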
However, when we put breakpoints on those SEH exception handlers and run, we never hit them. The only time control gets passed back to our debugger is when the program runs into another exception.
We put a breakpoint at ntdll_KiUserExceptionDispatcher and hit it once we pass control to the application after the first exception.
We can't step through into user mode until the program throws exceptions, as shown in the following debugger output:
624B0: Integer divide by zero (exc.code c0000094, tid 6032)
624E6: Integer divide by zero (exc.code c0000094, tid 6032)
6255F: Integer divide by zero (exc.code c0000094, tid 6032)
625AC: Integer divide by zero (exc.code c0000094, tid 6032)
62F70: Priveleged instruction (exc.code c0000096, tid 6032)
__scrt_common_main_seh
__scrt_common_main_seh is the second function called in our entry point function.
By luck, we noticed that sub_654B0 gets called a lot during exceptions. We put a breakpoint there and observe that it gets called at the very start of program execution, even before our main breakpoint is hit.
sub_4820F0 and sub_482130 call sub_654B0 in __scrt_common_main_seh. We observe its behavior and find that it always returns some API functions. Analyzing it statically, we see that sub_654B0 is some API hash resolver using this hash function
We can repurpose the hash checker we created in Challenge 7 to resolve these API names statically, as such
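A minimal sketch of what such a static resolver can look like. The hash here is a stand-in (a common ROR-13 additive hash), explicitly not the function used by evil.exe, and the export list is a hypothetical sample. The real hash function would be passed in as `hash_fn`.

```python
def ror32(value, count):
    """Rotate a 32-bit value right by `count` bits."""
    return ((value >> count) | (value << (32 - count))) & 0xFFFFFFFF

def placeholder_hash(name):
    """Stand-in ROR-13 additive hash; NOT the hash used by evil.exe."""
    h = 0
    for ch in name.encode("ascii"):
        h = (ror32(h, 13) + ch) & 0xFFFFFFFF
    return h

def resolve(target_hash, export_names, hash_fn=placeholder_hash):
    """Return the first export whose hash equals `target_hash`, or None."""
    for name in export_names:
        if hash_fn(name) == target_hash:
            return name
    return None

# Hypothetical export list; in practice it would come from the DLL export tables.
exports = ["GetSystemTime", "VirtualProtect", "AddVectoredExceptionHandler"]
wanted = placeholder_hash("AddVectoredExceptionHandler")
print(resolve(wanted, exports))
```

Keeping the hash as a parameter means the same lookup loop can be reused across challenges once the real hash is reimplemented.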
AddVectoredExceptionHandler is resolved by hash in sub_482130, then called in sub_482150, to register the function Handler (at 0x486AD0) as a VEH handler. Both functions are called in succession by 0x4B1ED6 in __scrt_common_main_seh.
This explains why our SEH handlers were not being called: __scrt_common_main_seh registers the VEH handler, which gets called and transfers control back to the program before the SEH handlers are ever reached.
VEH handler
Handler uses VirtualProtect to make the page RWX, overwrites the two bytes at EIP + 3 with FF D0 (where EIP is the address of the faulting instruction), sets EIP in the exception context to EIP + 3, and returns EXCEPTION_CONTINUE_EXECUTION, so the program resumes execution at the patched location.
FF D0 disassembles to call eax. The value of eax is produced by the API hash resolver, using the value of ecx at the time of the exception as the hash. For example, before the first exception, we have the following code:
.text:0048243E C7 45 E8+ mov dword ptr [ebp-18h], 66FFF672h
...
.text:0048245B 8B 4D E8 mov ecx, [ebp-18h]
This corresponds to GetSystemTime.
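The effect of the handler on the code bytes can be modelled on a plain bytearray. This is a simulation only, assuming the fault is reported at the 2-byte faulting instruction (offset 2 in the sample below); `resolved_api` is a made-up address standing in for whatever the hash resolver returns.

```python
CALL_EAX = b"\xFF\xD0"  # opcode bytes for `call eax`

def simulate_handler(code, fault_eip, resolved_api):
    """Mimic the described VEH handler; return the new (eip, eax) pair."""
    patch_at = fault_eip + 3
    code[patch_at:patch_at + 2] = CALL_EAX  # overwrite junk bytes with call eax
    return patch_at, resolved_api           # execution resumes at the patch

# xor eax, eax; mov eax, [eax] followed by the junk bytes 74 03 75.
code = bytearray.fromhex("33C08B00740375")
new_eip, eax = simulate_handler(code, fault_eip=2, resolved_api=0x77001234)

print(code.hex(), new_eip, hex(eax))
```

The call eax lands on the last two bytes of the 7-byte sequence, so execution continues cleanly with the instruction that follows it.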
Control flow obfuscation
Knowing this, we can patch the program so that IDA can define it as a function properly. After every exception-triggering instruction there are usually 3 bytes of junk code that throw IDA off.
List of exception-triggering instructions and the 3 bytes that follow each:
- 33 C0 8B 00 / 74 03 75: xor eax, eax; mov eax, [eax]
- 33 C0 8B 00 / EB FF E8: xor eax, eax; mov eax, [eax]
- 33 C0 F7 F0 / EB 00 EB: xor eax, eax; div eax
- 33 C0 F7 F0 / E8 FF D2: xor eax, eax; div eax
- 33 C0 F7 F0 / 5B 5D C3: xor eax, eax; div eax
- 33 FF F7 F7 / 33 C0 74: xor edi, edi; div edi
- 33 F6 F7 F6 / E8 FF D2: xor esi, esi; div esi
Initially we tried to just patch the last 3 bytes to 90 FF D0 (nop; call eax), but because of the unrelated call and the exception-causing instructions, the output in IDA looked quite bad, decompilation didn't even work, and we had to do a lot of manual analysis of the assembly in graph view and even in plain text view.
Then, we realized that each error-producing sequence is 7 bytes, which were just enough for 5 bytes for a call to any function we wanted and then 2 bytes for call eax. Better yet, the API hash resolving function takes ecx and edx as arguments the way the function sets it up. If we patched the opcodes accordingly, we could even get IDA to analyze the function stack frame properly.
We write an IDAPython script to patch all these exception-triggering instructions, referring extensively to IDAPython's gruesome documentation at https://hex-rays.com/products/ida/support/idapython_docs/ and https://hex-rays.com/products/ida/support/ida74_idapython_no_bc695_porting_guide.shtml. The base address of our program is now 0x480000.
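The patching idea can be sketched outside of IDA on a raw byte buffer. This is not the IDAPython script itself: the addresses are hypothetical, and a plain scan for the 4-byte triggers could also match unrelated code or data, so hits should be reviewed. Each 7-byte sequence becomes a relative call to the resolver (E8 rel32) followed by call eax (FF D0).

```python
import struct

# The four distinct trigger prefixes from the list above.
TRIGGERS = [bytes.fromhex(h) for h in ("33C08B00", "33C0F7F0", "33FFF7F7", "33F6F7F6")]

def build_patch(site_va, resolver_va):
    """Return 7 bytes: call resolver_va (E8 rel32), then call eax (FF D0)."""
    rel32 = (resolver_va - (site_va + 5)) & 0xFFFFFFFF  # relative to end of the E8 call
    return b"\xE8" + struct.pack("<I", rel32) + b"\xFF\xD0"

def patch_image(image, base_va, resolver_va):
    """Patch every trigger sequence in place; return the patched addresses."""
    patched = []
    i = 0
    while i <= len(image) - 7:
        if bytes(image[i:i + 4]) in TRIGGERS:
            image[i:i + 7] = build_patch(base_va + i, resolver_va)
            patched.append(base_va + i)
            i += 7
        else:
            i += 1
    return patched

# Toy image: one div-by-zero trigger surrounded by nops (all addresses hypothetical).
image = bytearray(b"\x90" * 3 + bytes.fromhex("33C0F7F0E8FFD2") + b"\x90" * 2)
print([hex(a) for a in patch_image(image, base_va=0x1000, resolver_va=0x2000)])
print(image.hex())
```

Because the hash is already in ecx when the exception would have fired, the direct call to the resolver needs no extra setup bytes.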
We tried to do some manual fixing to correct the function frames, but the graph view and decompiler gave issues, especially with the continuity of the main function after the check for argc. To fix this and get a nice looking IDB flow, we created a copy of evil.exe, applied the patches to it, then reanalyzed it using IDA on the freshly patched copy. This gave us a much nicer looking IDB that allowed us to do static analysis nearly entirely in the decompiler view.
The patched binary also runs somewhat, but running from the start gives us errors because the API hash resolving function fails to resolve for some reason, so most of the subsequent analysis was done statically.
Static Analysis
With the obfuscation method understood, we can move on to analyzing the sample functionality.
main
main calls new a bunch of times to create a bunch of structures, then calls sub_4823D0 which does a bunch of anti-debugging things.
sub_4823D0 changes the first byte of DbgBreakPoint to 0xC3 (opcode for ret), and the first 14 bytes of DbgUiRemoteBreakin to 6A 00 68 FF FF FF FF B8 <terminate process address> FF D0, which essentially calls TerminateProcess. It also does other things, but we didn't analyze them deeply.
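Decoding the 14 patch bytes makes the intent clearer. The sketch below rebuilds them from their instruction parts; the TerminateProcess address used in the example is hypothetical.

```python
import struct

def remote_breakin_patch(terminate_process_va):
    """Rebuild the 14-byte stub written over DbgUiRemoteBreakin."""
    return (
        b"\x6A\x00"                                          # push 0          (exit code)
        + b"\x68\xFF\xFF\xFF\xFF"                            # push 0xFFFFFFFF (current-process pseudo-handle)
        + b"\xB8" + struct.pack("<I", terminate_process_va)  # mov eax, TerminateProcess
        + b"\xFF\xD0"                                        # call eax
    )

patch = remote_breakin_patch(0x76543210)  # hypothetical address
print(len(patch), patch.hex())
```

In other words, a debugger attaching through DbgUiRemoteBreakin ends up running TerminateProcess(-1, 0) inside the target.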
main then calls CreateMutexA and puts some values into the 0x1CC byte object, then calls CreateThread to the functions sub_482E70, sub_482D50, sub_484310, and sub_484680, then calls WaitForMultipleObjects.
Afterwards main calls closesocket and WSACleanup which strongly suggests that there is network functionality in the executable.
main also references 0x751698, which contains fake flag 1, 1s_tHi$_mY_f1aG@flare-on.com, initialized in sub_481000 which was called as part of the initialization routines in __scrt_common_main_seh.
Jumping around
We know that network functions like socket and connect and send and recv are likely called, and probably resolved using the hash API resolver, so we can do binary searches for their function name hashes in the binary to see which code calls them.
For example, the hash for socket is 0xd5af7bf3, and searching for it gives us 2 hits in sub_483A70. connect, send, and recv have no hits. They might be using UDP so we try recvfrom which gets us a hit in sub_484310, and sendto which gets us a hit in sub_483D40, confirming our suspicions.
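The binary search itself is simple to reproduce in a script. A small sketch, assuming the hash is stored as a little-endian immediate as in the mov dword ptr [ebp-18h], imm32 pattern seen earlier; the sample buffer is a toy stand-in for the contents of evil.exe.

```python
import struct

def find_dword(data, value):
    """Return every offset where `value` appears as a little-endian dword."""
    needle = struct.pack("<I", value)
    hits = []
    pos = data.find(needle)
    while pos != -1:
        hits.append(pos)
        pos = data.find(needle, pos + 1)
    return hits

SOCKET_HASH = 0xD5AF7BF3  # hash for socket, from the text above

# Toy buffer: C7 45 E8 imm32 encodes mov dword ptr [ebp-18h], imm32.
sample = b"\x90" * 4 + b"\xC7\x45\xE8" + struct.pack("<I", SOCKET_HASH) + b"\x90" * 4
print(find_dword(sample, SOCKET_HASH))
```

Mapping each hit offset back to its containing function then shows which code resolves that API.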
Doing some xrefs and combining this with our knowledge of the threads that main is creating, we have the following information about threads that main creates
- sub_482E70, uses the 0x18 byte object used by the anti-debug function, creates more threads and is a pain to reverse, probably not important
- sub_482D50, also uses 0x18 byte anti-debug object, also probably unimportant

After that main calls sub_483A70 which takes argv[1] as the IP address, and does network startup function calls like WSAStartup, socket, and bind, and puts it into the 0x1CC byte object (not a real object because it doesn't contain methods, probably just a struct), so we will keep track of that object, calling it network_struct, and take note of the 2 threads that use that struct:
- sub_484310, which uses the network_struct and calls recvfrom, so this function is probably important and should be more deeply analyzed. We will call this the recv function
- sub_484680, which is huge, uses the mutex and semaphore in the network_struct, and then calls the function sub_483D40 which calls sendto, and sub_4867A0 which calls CryptDecrypt. This function looks very important. We will call this the crypt_and_sendto function
Traffic format
The recv function calls recvfrom into a 1500-byte buffer, then checks the following:
- buf[9] == 0x11
- There is some header at &buf[4 * (buf[0] & 0xF)]
  - *(word*)(header+4) != 0
  - ntohs(*(word*)(header+4)) is the size of the 8-byte header and the buffer afterwards.
  - (header+2) == '\x11\x04'
- buf[6] < 0
- After the headers there is some buf where
  - *(dword*)&buf[0] is the "mode", explained in greater detail below in Modes
  - *(dword*)&buf[4] is the length
  - &buf[8] is the stuff after that gets checked in crypt_and_sendto.
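Read as an IPv4 header followed by a UDP header, these checks can be sketched as a small parser. Several points are our assumptions rather than confirmed facts: that buf starts at the IPv4 header, that mode and length are little-endian dwords, and that length counts the bytes following the 8-byte mode/length prefix. Under that reading, buf[9] is the protocol field (0x11 is UDP), the bytes 11 04 are destination port 0x1104, and a negative signed buf[6] means the top bit of the flags byte is set (the reserved flag that RFC 3514 calls the evil bit).

```python
import struct

def parse_packet(buf):
    """Return (mode, length, data) if `buf` passes the described checks, else None."""
    if len(buf) < 20 or buf[9] != 0x11:       # IPv4 protocol field must be UDP
        return None
    if not buf[6] & 0x80:                     # signed buf[6] < 0: top flag bit set
        return None
    udp_off = 4 * (buf[0] & 0xF)              # IHL is counted in 32-bit words
    if len(buf) < udp_off + 8:
        return None
    dst_port, udp_len = struct.unpack_from("!HH", buf, udp_off + 2)
    if udp_len == 0 or dst_port != 0x1104:    # header+2 must be the bytes 11 04
        return None
    payload = buf[udp_off + 8:udp_off + udp_len]
    if len(payload) < 8:
        return None
    mode, length = struct.unpack_from("<II", payload, 0)  # assumed little-endian
    return mode, length, payload[8:8 + length]

def build_packet(mode, data, evil=True):
    """Hypothetical helper: build a minimal packet that satisfies the checks."""
    ip = bytearray(20)
    ip[0] = 0x45                              # version 4, IHL 5
    ip[6] = 0x80 if evil else 0x00            # reserved flag bit
    ip[9] = 0x11                              # UDP
    body = struct.pack("<II", mode, len(data)) + data
    udp = struct.pack("!HHHH", 1234, 0x1104, 8 + len(body), 0)
    return bytes(ip) + udp + body

print(parse_packet(build_packet(2, b"L0ve")))
```

A packet built without the reserved bit is rejected by the same parser, which mirrors the buf[6] < 0 check.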
Struct fields
After analyzing both the recv and the crypt_and_sendto functions, we managed to identify the following struct fields
Somehow, dword1B0, dword1B4, and dword1B8 are used to share data between the two threads, but it is very hard to tell how exactly because it does stuff like this
*(void **)(*(_DWORD *)(a_network_struct->dword1B0 + 4 * ((dword1B8 / 4) & (dword1B4 - 1))) + 4 * (dword1B8 % 4))
and this
((_BYTE)dword1B8 + (_BYTE)some_sync_number) % 4 == 0
&& dword1B4 <= (unsigned int)(some_sync_number / 4 + 1)
dword1B8 is updated to dword1B8 & (4 * dword1B4 - 1)
v30 = dword1B8 & (4 * dword1B4 - 1) + some_sync_number
4 * ((v30 >> 2) & (dword1B4 - 1)) + dword1B0 is some address
We never figured out exactly how it works, but we know it is a deque of some kind because the recv function calls sub_485010, which calls sub_4851B0, which contains the message deque<T> too long, so this struct is probably part of some C++ deque implementation.
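One reading that fits the first expression is a block-mapped deque: dword1B0 points to an array of block pointers, dword1B4 is the (power-of-two) number of blocks, and dword1B8 is a logical element offset, with 4 elements per block. This is our interpretation of the arithmetic, not something confirmed against the binary, and the field roles are assumptions. A toy model:

```python
BLOCK_ELEMS = 4  # elements per block, from the `/ 4` and `% 4` in the expression

def locate(offset, map_size):
    """Return (block index in the map, slot inside the block) for an offset."""
    block = (offset // BLOCK_ELEMS) & (map_size - 1)  # wraps; map_size is a power of two
    slot = offset % BLOCK_ELEMS
    return block, slot

def read_element(block_map, offset):
    """Model of *(map[(off / 4) & (size - 1)] + 4 * (off % 4))."""
    block, slot = locate(offset, len(block_map))
    return block_map[block][slot]

# Toy map: 4 blocks of 4 elements each.
block_map = [[f"pkt{b * 4 + s}" for s in range(4)] for b in range(4)]
print(read_element(block_map, 6), read_element(block_map, 17))
```

The masking with size - 1 makes the block index wrap around, which would explain why dword1B8 is also reduced with & (4 * dword1B4 - 1).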
Nevertheless, we can roughly guess which variables are pointers to the received buffer by how they are used, so we don't have to fully understand this struct.
Modes
After we analyzed and found everything we could about the traffic format and the object struct in the recv function, we can move on to analyze the crypt_and_sendto function.
Mode 1
There is some checking of some โmodeโ at (code location) 0x4847A1, then if the mode is 1 it goes into sub_483FC0 which contains API calls like GetConsoleWindow, and it makes a call to the crypt function, decrypting 0x118836 bytes starting at 0x637330, using a 16 byte key at 0x74FBC8.
Although the output is useless, this very importantly confirms to us that the crypt function uses RC4.
Afterwards it calls the crypt function again on another 37-byte block at 0x74FBA0 with the same key, which gives N3ver_G0nNa_g1ve_y0u_Up@flare-on.com, which we know to be a fake flag.
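For reference, RC4 is short enough to reimplement for offline decryption of such blocks. The sketch below is a plain textbook RC4; the key and ciphertext bytes from evil.exe are not reproduced here, so the example uses the well-known "Key" / "Plaintext" test vector instead.

```python
def rc4(key, data):
    """Textbook RC4: key scheduling, then XOR `data` with the keystream."""
    s = list(range(256))
    j = 0
    for i in range(256):                      # key-scheduling algorithm
        j = (j + s[i] + key[i % len(key)]) & 0xFF
        s[i], s[j] = s[j], s[i]
    out = bytearray()
    i = j = 0
    for byte in data:                         # pseudo-random generation
        i = (i + 1) & 0xFF
        j = (j + s[i]) & 0xFF
        s[i], s[j] = s[j], s[i]
        out.append(byte ^ s[(s[i] + s[j]) & 0xFF])
    return bytes(out)

ciphertext = rc4(b"Key", b"Plaintext")
print(ciphertext.hex())
print(rc4(b"Key", ciphertext))
```

Since RC4 is symmetric, the same function both encrypts and decrypts, so a dumped key and block can be checked directly.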
Mode 2
Mode 2 uses an ugly deobfuscation technique, the same one used to get fake flag 1.
Instead of analyzing and reimplementing it, we can just debug and set our EIP to that location, and run it, then see what comes out from the other side. This should decrypt 4 strings, L0ve, s3cret, 5Ex, and g0d.
Note: you have to run all the way to the end to make sure it is fully decrypted. Our initial try was partial and we got c0d instead of g0d in the final string, which prevented us from getting the flag even though by then we had understood the entire decryption logic.
Afterwards it does some checks to make sure the received buffer matches one of the strings, then if so calls sub_4869F0 and puts the result (a DWORD) in one of the offsets of the 16-byte buffer at 0x751680 according to which string it matches, in big endian format.
There is a different hash function in sub_4869F0, which has the following pseudocode
We ran it in the debugger by setting the IP to the start of this function and running the first for loop, and realized that the generated array matches the CRC32 lookup table. Checking Wikipedia, we found that the algorithm matches too.
This means that it takes the 4 strings, computes the CRC32 of each, then concatenates the CRC32 values in big-endian format.
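The table comparison and the key construction can both be reproduced in a few lines. A sketch assuming the standard reflected CRC32 (polynomial 0xEDB88320), checked against zlib; the order in which the four values are placed in the 16-byte buffer is an assumption here, since it depends on which offset each string maps to.

```python
import struct
import zlib

def make_crc32_table():
    """Generate the standard 256-entry CRC32 table (polynomial 0xEDB88320)."""
    table = []
    for n in range(256):
        c = n
        for _ in range(8):
            c = (c >> 1) ^ 0xEDB88320 if c & 1 else c >> 1
        table.append(c)
    return table

def crc32(data, table=make_crc32_table()):
    """Table-driven CRC32, equivalent to zlib.crc32 for a single buffer."""
    c = 0xFFFFFFFF
    for byte in data:
        c = table[(c ^ byte) & 0xFF] ^ (c >> 8)
    return c ^ 0xFFFFFFFF

def build_key(strings):
    """Concatenate the CRC32 of each string as big-endian dwords."""
    return b"".join(struct.pack(">I", crc32(s)) for s in strings)

# Assumed order; the real slot for each string is chosen by the Mode 2 code.
key = build_key([b"L0ve", b"s3cret", b"5Ex", b"g0d"])
print(key.hex())
```

If the first attempt at decryption produces garbage, the slot order of the four strings is the first thing to permute.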
Mode 3
Mode 3 takes the CRC32 buffer generated by Mode 2 and uses it as the key for the crypt function, which we know does RC4 decryption, to decrypt the 39-byte string at 0xD0FB68. It then calls the sendto function, but we didn't analyze that.
Final Solver Script
Flag
n0_mOr3_eXcEpti0n$_p1ea$e@flare-on.com