Introduction

2020-03-07_13-00

This reverse engineering challenge is taken from the NorzhCTF by ENSIBS at FIC 2020.

This challenge features a static binary (~10 MB) that uses the Unicorn engine to emulate instructions, along with some anti-disassembly techniques I explained in a talk for Hack2G2 the year before. The static compilation limits our options for solving it.

Main procedure

After clearing the anti-disassembly trick in the main function, we get this decompiled output:

[Screenshot: decompiled output of the main function]

This main routine loads shellcode into memory and starts an emulation using the Unicorn library, after adding a hook on interrupt instructions. The hook is called each time the shellcode executes an interrupt instruction (int X, or syscall).

The shellcode

The shellcode used multiple anti-disassembly techniques. Even though I knew those techniques, reconstructing the real payload in IDA was not easy. I was used to seeing int 3 and int 0x80, but this shellcode used int 0xAA, int 0x37 and other int X instructions with interrupt numbers that Intel does not document, probably because they have no standard meaning at all. As a result, IDA had trouble understanding the shellcode.

These instructions will prove useful later.

I extracted the shellcode and wrote it to a bin file in order to open it with radare2:

In [18]: import struct

In [19]: inst = [0xC03105EBBA66D231,
    ...: 0x00000000E8E8FA74,
    ...: 0x22C1B9C305240483,
    ...: 0x90E8017503740000,
    ...: 0x05E801750374C031,
    ...: 0x34CD13CD00000152,
    ...: 0x05E801750374C031,
    ...: 0x20CD21CD00000152]

In [20]:
    ...: with open("inst.bin", 'wb+') as f:
    ...:     for i in inst:
    ...:         f.write(struct.pack("<Q", i))

Radare2 was able to disassemble those “unknown” instructions more cleanly than IDA did:

[Screenshot: radare2 disassembly of the shellcode]

Leaving aside the anti-disassembly junk, the important instructions here are the int X instructions. The anti-disassembly techniques used in this binary are reviewed in the annex.
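As a quick cross-check of the disassembly, the int X instructions can also be located by scanning the packed bytes for the 0xCD opcode, which is followed by a one-byte interrupt number. The sketch below is only a naive byte scan, not a disassembler: it would report false positives if 0xCD appeared inside another instruction's operand, which does not seem to happen in this dump. The helper name find_int_numbers is made up for illustration.

```python
import struct

# The eight little-endian qwords extracted from the binary (same values as above).
inst = [0xC03105EBBA66D231,
        0x00000000E8E8FA74,
        0x22C1B9C305240483,
        0x90E8017503740000,
        0x05E801750374C031,
        0x34CD13CD00000152,
        0x05E801750374C031,
        0x20CD21CD00000152]

shellcode = b"".join(struct.pack("<Q", q) for q in inst)

def find_int_numbers(code):
    """Return the operand of every CD xx byte pair, in order of appearance."""
    found = []
    i = 0
    while i < len(code) - 1:
        if code[i] == 0xCD:
            found.append(code[i + 1])
            i += 2  # skip the operand byte
        else:
            i += 1
    return found

int_numbers = find_int_numbers(shellcode)
print([hex(n) for n in int_numbers])  # ['0x13', '0x34', '0x21', '0x20']
```

The order of the results matches the order in which the interrupts are handled in the rest of this write-up.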

uc_hook_add

The binary writes the shellcode at 0x1000000 and starts the emulation. The emulation is given a hook, registered with the uc_hook_add function (main, L27).

The hook is defined with: uc_hook_add(uc, &hh, UC_HOOK_INTR, hook_intr);, where UC_HOOK_INTR requests a hook on interrupts. This means that the callback is called on each int instruction (among other instructions, such as syscall).

Let's take a look at it.

[Screenshots: decompiled interrupt hook callback]

We can see that specific code is executed for specific interrupt numbers.

int 0x13

This is the first interrupt of the shellcode. At this point, the hook asks for user input and stores it in a 20-character buffer.

int 0x34

This is the second interrupt executed in the shellcode. It checks that the flag starts with ENSIBS{!.

int 0x21

This is the third one executed. It gets a value from RAX and does some calculation on characters 11 to 15 of the flag (zero-based indices 10 to 14). Working through it gives us: r3S0L.

int 0x20

The fourth one. There is no such case in the switch statement, so the default case is executed. It checks whether the interrupt is 0xAA. If not, it writes this interrupt at the start of the shellcode and restarts the emulation from the beginning:

[Screenshot: default case of the interrupt hook]

int 0xAA

Now it calculates the last part of the flag in the same way as it did for the int 0x21 interrupt, which gives: 3et!}.

We are now missing 2 characters to have a complete flag: ENSIBS{!__r3S0L3et!}.
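The layout recovered so far can be written down as a small sketch. The slice boundaries are inferred from the lengths of the recovered pieces and the 20-character buffer, not read from the binary, and the variable names are illustrative.

```python
FLAG_LEN = 20  # size of the input buffer filled by int 0x13

prefix = "ENSIBS{!"   # checked by int 0x34
middle = "r3S0L"      # recovered from int 0x21
suffix = "3et!}"      # recovered from int 0xAA

# Whatever is not covered by the three known pieces is still unknown.
missing = FLAG_LEN - len(prefix) - len(middle) - len(suffix)
template = prefix + "_" * missing + middle + suffix

print(missing)                 # 2
print(template)                # ENSIBS{!__r3S0L3et!}
print(template.index(middle))  # 10
```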

[Screenshot: hook code handling the two missing characters]

Those two characters are XORed with 0xB8 and 0x03, written to the start of the shellcode, and executed.

Our goal is to create an int 0x37 interrupt (which spawns /bin/bash), encoded as the bytes CD 37.

0xCD ^ 0xB8 = 0x75 = 'u'

0x37 ^ 0x03 = 0x34 = '4'
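The same arithmetic as a short Python check. It assumes, as described above, that the first missing character is XORed with 0xB8 and the second with 0x03, and that the result must be the two bytes CD 37 of int 0x37. The variable names are illustrative.

```python
target = bytes([0xCD, 0x37])   # encoding of int 0x37
key = bytes([0xB8, 0x03])      # XOR constants applied by the hook

# XOR is its own inverse, so target ^ key gives the characters to type.
missing = bytes(t ^ k for t, k in zip(target, key)).decode("ascii")
print(missing)  # u4

flag = "ENSIBS{!" + missing + "r3S0L" + "3et!}"
print(flag)  # ENSIBS{!u4r3S0L3et!}
```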

We now have our full flag: ENSIBS{!u4r3S0L3et!}.

Great challenge: it was my first time dealing with Unicorn, and I enjoyed having to deal with anti-disassembly in a CTF challenge.

Annex

This annex details the anti-disassembly tricks used in this binary.

Main

The main procedure starts like this:

[Screenshot: start of the main procedure in the disassembler]

You can see strange instructions at the end of the disassembled procedure:

call $+5
add  [rsp+98h+var_98], 5
retn

This trick prevents disassemblers such as IDA from disassembling a whole function by placing a retn instruction early in it. Such disassemblers use flow-oriented disassembly and stop disassembling a function when they encounter a retn instruction.

Here is how this trick works :
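In short: call $+5 pushes the address of the following instruction (the add) and jumps to it, the add then moves that saved return address forward by 5 bytes, and retn pops it, so execution resumes right after the retn instead of leaving the function. The sketch below replays this on a toy stack. It assumes the add is encoded in 4 bytes (83 04 24 05, the encoding that appears in the shellcode dump) and uses an arbitrary start address; simulate_trick is a made-up helper.

```python
CALL_LEN = 5  # E8 00 00 00 00 -> call $+5
ADD_LEN = 4   # 83 04 24 05    -> add dword [rsp], 5 (assumed encoding)
RET_LEN = 1   # C3             -> retn

def simulate_trick(call_addr):
    """Return (address of retn, address where execution resumes)."""
    stack = []
    add_addr = call_addr + CALL_LEN
    stack.append(add_addr)      # call pushes the address of the next instruction
    # call $+5 targets that same next instruction, so execution continues at add.
    stack[-1] += 5              # add [rsp], 5 patches the saved return address
    retn_addr = add_addr + ADD_LEN
    resume_addr = stack.pop()   # retn pops the patched address and jumps to it
    return retn_addr, resume_addr

retn_addr, resume_addr = simulate_trick(0x401000)  # arbitrary example address
print(hex(retn_addr), hex(resume_addr))   # 0x401009 0x40100a
print(resume_addr == retn_addr + RET_LEN)  # True
```

Under that assumption the retn simply falls through to the byte that follows it, which is why the whole sequence behaves like a no-op at run time.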

Those instructions have no other effect than messing with common disassemblers. They can be nopped out without any problem, and that is what I did in this challenge.

My slides as reference - 30th slide

Shellcode

The shellcode used overlapping instructions in order to mess with my disassembler. A good example of this technique can be found in the book Practical Malware Analysis:

[Figure: overlapping instructions example from Practical Malware Analysis]

Feel free to take a minute to read the image above; it is not that easy to understand if this is the first time you encounter it.

The main purpose of overlapping instructions is to create an error in the disassembly. Once bytes have been disassembled, common disassemblers do not disassemble them again. They cannot show you that two instructions overlap (in this case mov ax, 05ebh and jmp 5), simply because, even though they overlap, they are not executed at the same time. This is difficult to represent in a linear listing (although a graph view could solve it). radare2 works around the problem by showing all the instructions, even if they do not execute in the order in which they are displayed:

[Screenshot: radare2 output showing the overlapping instructions]

Here we can see that the mov occurs, but then the code jumps directly to a much later point in the code, skipping the xor eax, eax instruction.
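The same overlap can be followed by hand on the first 11 bytes of this challenge's shellcode (taken from the dump shown earlier, where the register is dx rather than ax). This is a minimal sketch that only computes short-jump targets; it is not a disassembler, and the helper short_jump_target and the offsets in the comments come from reading the bytes by hand.

```python
# First 11 bytes of the shellcode, decoded linearly:
# 31 D2        xor edx, edx
# 66 BA EB 05  mov dx, 0x05eb   <- its operand hides EB 05 (jmp +5)
# 31 C0        xor eax, eax
# 74 FA        jz  -6           <- jumps back into the mov operand
# E8           junk byte that looks like the start of a call
code = bytes.fromhex("31d266baeb0531c074fae8")

def short_jump_target(code, offset):
    """Target offset of a 2-byte short jump (opcode + signed rel8) at offset."""
    rel8 = int.from_bytes(code[offset + 1:offset + 2], "little", signed=True)
    return offset + 2 + rel8

jz_target = short_jump_target(code, 8)            # 74 FA at offset 8
jmp_target = short_jump_target(code, jz_target)   # EB 05 hidden in the mov

print(jz_target, hex(code[jz_target]))  # 4 0xeb
print(jmp_target)                       # 11
```

The jz lands on offset 4, in the middle of the mov, where the bytes EB 05 decode as jmp +5. That jump then lands on offset 11, just past the E8 byte, so a linear listing that already consumed EB 05 as a mov operand never shows the real control flow.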

Nofix -