# Nofix - Reverse Engineering Blog

Reverse engineering malware for fun in my spare time

I am taking a break from the binary I am currently analysing in Random Malware Analysis unpacking - Stage 1/3, as stage 2 is taking me quite a lot of time to figure out. Today I will be focusing on writing up my solution for Exceptions, a reversing challenge given at the ECW 2019 prequals.

This binary features :

• Exceptions
• The use of Thread Local Storage
• Anti-dbg
• Some sort of polymorphic code
• …and that’s all folks !

Some good anti-dbg documentation and my head was enough to solve it. Let’s dive into the binary.

Loading it into IDA, we quickly notice that the binary is stripped. CFF Explorer informs us that it has been compiled using Microsoft Visual C++ 8:

I loaded FLIRT “universal” signatures for Microsoft Visual C++ into IDA to get some of the function names resolved. IDA did not find the WinMain() function, but it cleared up the entry point start a bit:

Everything looks quite normal, and if we are used to reversing PE files we can quickly notice the call to sub_401520, which appears approximately where a WinMain function would be called. Notice that it takes 2 arguments : probably the standard argc and argv.

Stepping into it confirms that it is indeed the main function of our PE. We could also have taken a bottom-up approach to find the main, as the binary is stripped : in the strings tab, we can notice a Usage: %s [FLAG]. Starting from there and following xrefs, we can quickly work our way up to the main function.

## The WinMain

The WinMain function is quite straightforward:

It :

• Checks if argc == 2
• Initializes some memory
• Checks if the given argument is the right flag

Let’s look at the check_flag function :

• Creates SEH chain and pushes exception handlers to it

• Divides some high value by 3

• Checks length of the input with a loop

v3 = 66;
v4 = 0;
while ( a1[v4] )
{
  v3 -= 2;
  ++v4;
}
[...]
if ( v3 != 0x51320262 )
  return result;


Note : calculated length looks strange.

• Checks if our input is good doing some operations on it
v2 = __ROR4__(*(a1 + 1) ^ 0x47730DB1, 5);
if ( __ROL4__(*a1 ^ 0x6F56E06F, 5) == 0x79CB0379 && !sub_1A2590(1, 8, &v13, a1, &v5) && v2 == 0xE56C6E23 )
  result = 1;


The value v3 has to reach looks weird, and we can suspect that this is not the code that actually gets executed at run time. To reach 0x51320262, our input would have to be long enough for v3 to go negative and wrap around to the expected number. That looks unrealistic. This gives us two options :

• The code modifies itself at run time
• There might be a way to trigger defined exceptions for this frame. The associated handlers might take us somewhere useful
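To put a number on "unrealistic", here is a small sketch that solves 66 - 2·n ≡ 0x51320262 (mod 2^32) for n. It assumes v3 is a 32-bit integer that wraps around, as the decompiled loop suggests; the function name is mine and does not come from the binary.

```python
MASK32 = 0xFFFFFFFF

def required_length(target, start=66, step=2):
    """Smallest n such that (start - step * n) mod 2**32 == target.

    Written for step == 2: a solution only exists when
    (start - target) is even.
    """
    diff = (start - target) & MASK32
    if diff % step:
        raise ValueError("target is unreachable for this step")
    return diff // step

n = required_length(0x51320262)
print(hex(n), n)

# Sanity check: replay the loop arithmetic for that length
assert (66 - 2 * n) & MASK32 == 0x51320262
```

The result is well over a billion characters, far more than a command-line argument can carry, which supports the idea that this comparison is not the one that really runs.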

Assuming the first option is valid, we can try to set a hardware breakpoint on the value we have to reach :

I also set one on the division operand :

We now run the binary in WinDbg, and… nothing. This could indicate that there is some kind of anti-debugging trick going on here.

## Looking for anti-debugging tricks

I start by checking the classics : the use of IsDebuggerPresent.

We can notice an unusual use of IsDebuggerPresent in sub_1A1700+20.

Note : this function is called at an early stage of the binary, way before the WinMain function.

Let’s check this out :

void *__thiscall sub_1A1700(void *this)
{
  void *v2; // [esp+0h] [ebp-10h]
  DWORD flOldProtect; // [esp+8h] [ebp-8h]

  v2 = this;
  lpAddress = (char *)sub_1A1500 - 193;
  if ( IsDebuggerPresent() )
    return v2;
  // 0x40 = PAGE_EXECUTE_READWRITE
  if ( !VirtualProtect(lpAddress, 1u, 0x40u, &flOldProtect) )
    return v2;
  return v2;
}

• If no debugger is present, it registers a vectored exception handler (kept in a process-wide list, separate from the frame-based SEH chain).

According to MSDN:

Vectored exception handlers are an extension to structured exception handling. An application can register a function to watch or handle all exceptions for the application. Vectored handlers are not frame-based, therefore, you can add a handler that will be called regardless of where you are in a call frame.

That means that this handler will be used for all exceptions in the application.

The binary then patches mov ecx, 3 into mov ecx, 0 in the check_flag function. This instruction sits right before div ecx, so the patch triggers a division by zero and runs exception handlers in which code is hidden.

We continue the execution of the application and stumble upon a division-by-zero exception, as expected. The exception is caught by the first defined handler, which is the frame exception handler defined in the SEH table of the function check_flag (SEH is explained quite well in Practical Malware Analysis, a must-have book !).

This handler brings us to the function sub_1A12D0(). This function starts by pointing at the first xor in check_flag : .text:001A1495 xor eax, 47730DB1h.

This handler :

• Uses VirtualProtect to change the permissions of this section to RWX
• XORs each byte of the right operand at the pointed address (0x47730DB1) with 0xBD, which gives 0xFACEB00C
• Uses VirtualProtect to set the permissions back to RX
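As a cross-check of this patch, the sketch below recovers the one-byte key from the two constants that appear in the listings (0x47730DB1 before the handler runs, 0xFACEB00C after) instead of trusting a value read off the disassembly. The helper names are mine.

```python
import struct

def bytewise_xor_key(before, after):
    """Return the single byte k such that XORing every byte of `before`
    with k yields `after`, or None if no such byte exists."""
    b = struct.pack("<I", before)
    a = struct.pack("<I", after)
    keys = {x ^ y for x, y in zip(b, a)}
    return keys.pop() if len(keys) == 1 else None

def xor_each_byte(value, key):
    """Apply a one-byte XOR key to all four bytes of a 32-bit value."""
    patched = bytes(x ^ key for x in struct.pack("<I", value))
    return struct.unpack("<I", patched)[0]

key = bytewise_xor_key(0x47730DB1, 0xFACEB00C)
print(hex(key))
assert xor_each_byte(0x47730DB1, key) == 0xFACEB00C
```

If the four bytes did not share a common key, `bytewise_xor_key` would return None, which is a cheap way to spot a transcription mistake in either constant.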

## The second handler

As the application executes every handler that matches the exception, the vectored handler, which is global and therefore matches any exception, will be executed as well.

Let’s take a look at this handler:

The process is very similar to the previous handler, it first points to .text:001A14A3 sub ecx, 79CB0379h :

• Uses VirtualProtect to change the permissions of this section to RWX
• ADD 0xC0FFE to the right operand of pointed address (0x79CB0379), which gives 0x79D71377
• Uses VirtualProtect to set the permissions back to RX
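The handler's patch boils down to rewriting a 32-bit immediate in place. Here is a minimal sketch on a plain bytearray standing in for the .text bytes. The opcode bytes are an assumption made for illustration, the helper name is mine, and the VirtualProtect calls are left out.

```python
import struct

# Assumed encoding of `sub ecx, 79CB0379h`: opcode 81 E9 followed by the
# little-endian imm32. The actual bytes in the binary may differ.
code = bytearray(b"\x81\xE9" + struct.pack("<I", 0x79CB0379))

def add_to_imm32(buf, offset, delta):
    """Add `delta` to the 32-bit little-endian value at `offset`,
    wrapping at 2**32."""
    (imm,) = struct.unpack_from("<I", buf, offset)
    struct.pack_into("<I", buf, offset, (imm + delta) & 0xFFFFFFFF)

# Skip the two opcode bytes and patch only the immediate
add_to_imm32(code, 2, 0xC0FFE)
patched = struct.unpack_from("<I", code, 2)[0]
print(hex(patched))
```

Only the four operand bytes change, so the instruction keeps its length and the surrounding code is left untouched.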

We set a breakpoint on the call to IsDebuggerPresent seen earlier and change its result, which allows the vectored exception handler to be registered. We let the program continue and get back to the check_flag function.

## Back to check_flag

We now have this piece of code, which first checks the length of our input :

v3 = 66;
v4 = 0;
while ( a1[v4] ) {
  v3 -= 2;
  ++v4;
}
[...]
if ( v3 != 50 )
  return result;


It is now clear that our input needs to be 8 characters long, since 66 - 2 × 8 = 50.
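The loop is easy to replay in Python. This is a sketch of the decompiled logic only, with a function name of my own and a placeholder input rather than the flag.

```python
def length_counter(data, start=66, step=2):
    """Replay the decompiled loop: subtract `step` for every byte
    found before the terminating NUL."""
    v3 = start
    for byte in data:
        if byte == 0:
            break
        v3 -= step
    return v3 & 0xFFFFFFFF

# Hypothetical 8-character placeholder input
print(length_counter(b"AAAAAAAA"))  # 50
```

Any other length misses the value 50, so the check acts as a plain strlen(input) == 8 test.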

The next part checks our 8 characters :

v2 = __ROR4__(*(a1 + 1) ^ 0xFACEB00C, 5);
if ( __ROL4__(*a1 ^ 0x6F56E06F, 5) == 0x79D71377 && !sub_1A2590(1, 8, &v13, a1, &v5) && v2 == 0xE56C6E23 )
  result = 1;


a1 + 1 points to the byte at offset 4 of our input, because a1 is typed as a pointer to 32-bit integers. Pointer arithmetic in C advances by the size of the pointed-to type, which is 4 bytes here, not by the size of the pointer itself.
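In Python terms, dereferencing a1 and a1 + 1 corresponds to reading two little-endian 32-bit values at byte offsets 0 and 4. This assumes a1 is a pointer to 32-bit integers, as the decompiler output suggests; the input below is a placeholder.

```python
import struct

data = b"ABCDEFGH"  # placeholder input, 8 bytes

# *a1 reads bytes 0..3, *(a1 + 1) reads bytes 4..7 (little-endian dwords)
first = struct.unpack_from("<I", data, 0)[0]
second = struct.unpack_from("<I", data, 4)[0]

print(hex(first), hex(second))
```

Because x86 is little-endian, the first character of each half ends up in the lowest byte of the integer, which is why the recovered values are packed with "<I" when turning them back into text.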

• The second part of our input, XORed with 0xFACEB00C and rotated right by 5 bits, has to be equal to 0xE56C6E23.
• The first part, XORed with 0x6F56E06F and rotated left by 5 bits, has to be equal to 0x79D71377.

The first condition is fairly easy to calculate :

>>> import struct
>>> __ROL4__ = lambda v, n: ((v << n) | (v >> (32 - n))) & 0xFFFFFFFF  # 32-bit rotate left
>>> struct.pack("<I", __ROL4__(0xE56C6E23, 5) ^ 0xFACEB00C)
b'ptCW'


The second gives wrong results. This is because a second anti-dbg trick was hidden deeply in the code.

## The TLSCallback anti-dbg trick

What is a TLS?

This has nothing to do with encryption. Here TLS stands for Thread Local Storage, and abusing it is an old anti-dbg trick.

It provides callbacks that are executed when a thread is created or destroyed, including the main thread. Those callbacks run before the program’s entry point, so they have already executed by the time a debugger stops at its initial entry-point breakpoint. There are many ways to add callbacks to the TLS callback array, but in our case the callback is already present statically. Let’s take a look at the TLS callbacks for this binary. Fortunately, FLIRT did resolve the TLSCallback() name.

PE-studio warns us about the binary containing a TLS callback :

We quickly find the TLSexception handler :

This handler uses the same technique as shown above:

• Sets an instruction in the .text section to RWX
• Modifies it
• Sets the permissions back

We just need to set a breakpoint after the program gets the BeingDebugged flag, and modify its result. We can then let the program continue. Back in check_flag, the modifications have been made, and we now need the first part of our input to match: __ROL4__(*a1 ^ 0xDEADC0DE, 5) == 0x79D71377.

>>> import struct
>>> __ROR4__ = lambda v, n: ((v >> n) | (v << (32 - n))) & 0xFFFFFFFF  # 32-bit rotate right
>>> struct.pack("<I", __ROR4__(0x79D71377, 5) ^ 0xDEADC0DE)
b'Exce'


Which gives us the input : ExceptCW.
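To double-check the result, here is a self-contained sketch that inverts both comparisons and then replays them forward. It only models the two rotate/XOR checks with the final patched constants; the call to sub_1A2590 is not modelled, and the helper names are mine.

```python
import struct

MASK32 = 0xFFFFFFFF

def rol32(value, count):
    """Rotate a 32-bit value left by `count` bits."""
    count %= 32
    return ((value << count) | (value >> (32 - count))) & MASK32

def ror32(value, count):
    """Rotate a 32-bit value right by `count` bits."""
    return rol32(value, (32 - count) % 32)

def check(flag):
    """Replay the two rotate/XOR comparisons on an 8-byte input."""
    first, second = struct.unpack("<II", flag)
    return (rol32(first ^ 0xDEADC0DE, 5) == 0x79D71377
            and ror32(second ^ 0xFACEB00C, 5) == 0xE56C6E23)

# Invert each comparison: undo the rotation, then undo the XOR
first = ror32(0x79D71377, 5) ^ 0xDEADC0DE
second = rol32(0xE56C6E23, 5) ^ 0xFACEB00C
flag = struct.pack("<II", first, second)

print(flag)
assert check(flag)
```

Both operations are bijections on 32-bit values, so each half of the input has exactly one solution.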
