This reverse-engineering challenge is taken from the NorzhCTF, run by ENSIBS at FIC 2020.
This challenge features a static binary (~10 MB) that uses the Unicorn engine to emulate instructions, along with some anti-disassembly techniques I had explained in a talk for Hack2G2 the year before. The static compilation narrows our options for solving it.
After clearing the anti-disassembly trick in the main function, we get this decompiled output:
This main routine clearly loads shellcode into memory and starts an emulation using the
Unicorn library, after adding a hook on interrupt instructions. The hook will be called each time the shellcode uses an interrupt instruction (
int X, or
The shellcode used multiple anti-disassembly techniques. Even though I knew those techniques, reconstructing the real payload in IDA was not easy, because it relied on interrupt vectors that have no conventional meaning (such as
int 0xAA). The encoding itself is valid, since int takes any 8-bit vector, but these values are not ones you normally meet in real code, and IDA had trouble understanding the shellcode. Indeed, the shellcode was full of
int X instructions. I was used to seeing
int 0x80, but this shellcode used
int 0x37 and other int vectors with no standard meaning.
Those instructions will prove useful later.
I extracted the shellcode, wrote it into a bin file in order to open it with
    import struct

    # Shellcode extracted from the binary, as little-endian 64-bit words
    inst = [0xC03105EBBA66D231,
            0x00000000E8E8FA74,
            0x22C1B9C305240483,
            0x90E8017503740000,
            0x05E801750374C031,
            0x34CD13CD00000152,
            0x05E801750374C031,
            0x20CD21CD00000152]

    # Write the words back out as raw bytes so a disassembler can open them
    with open("inst.bin", 'wb+') as f:
        for i in inst:
            f.write(struct.pack("<Q", i))
Radare2 was able to disassemble those “unknown” instructions more cleanly than IDA did:
Leaving aside the anti-disassembly junk, the important instructions here are the
int X instructions. The anti-disassembly techniques used in this binary are reviewed in the annex at the end of this write-up.
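As a quick cross-check of the disassembly, the interrupt vectors can also be listed with a naive byte scan. This is only a sketch, not what radare2 does: it rebuilds the shellcode from the words extracted earlier and reports every 0xCD byte together with the byte that follows it. The helper name `find_int_candidates` is made up for illustration, and on other payloads such a scan could report false positives, since 0xCD may also appear inside another instruction's operand.

```python
import struct

# Shellcode words as extracted earlier (little-endian 64-bit values).
inst = [0xC03105EBBA66D231,
        0x00000000E8E8FA74,
        0x22C1B9C305240483,
        0x90E8017503740000,
        0x05E801750374C031,
        0x34CD13CD00000152,
        0x05E801750374C031,
        0x20CD21CD00000152]

shellcode = b"".join(struct.pack("<Q", word) for word in inst)


def find_int_candidates(code):
    """Return (offset, vector) for every 0xCD byte followed by another byte.

    Hypothetical helper: a plain byte scan, not a real disassembler.
    """
    return [(off, code[off + 1])
            for off in range(len(code) - 1)
            if code[off] == 0xCD]


for off, vector in find_int_candidates(shellcode):
    print(f"offset {off:#04x}: int {vector:#04x}")
```

On the words listed above, the scan finds four candidates, with vectors 0x13, 0x34, 0x21 and 0x20, all located in the last bytes of the sixth and eighth words.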
The binary writes the shellcode at
0x1000000 and starts the emulation. The emulation is given a hook, registered using the
uc_hook_add function (main, L27).
The hook is defined with :
uc_hook_add(uc, &hh, UC_HOOK_INTR, hook_intr);,
UC_HOOK_INTR representing a hook on every interrupt. This means that the callback will be called on each
int instruction (among other instructions).
Let’s give it a look.
We can see that specific code is executed for specific interrupt numbers.
This is the first interrupt of the shellcode. At this point, the hook asks for user input and stores it in a 20-character buffer.
This is the second interrupt executed in the shellcode. It checks that the flag starts with
This is the third one executed. It reads a value from RAX and does some calculation on the 10th to 15th characters of the flag. Working it out gives us:
The fourth one. There is no such case in the switch statement, so it executes the default one. The default case checks whether the interrupt is
0xAA. If not, it writes this interrupt at the start of the shellcode and restarts the emulation from the beginning:
Now it will calculate the last part of the flag in the same way as it did for the
int 0x21 interrupt, which gives:
We are now missing 2 characters to have a complete flag:
Those two characters will be XORed with 0xB8 and 0x03, written to the start of the shellcode, and executed.
Our goal is to create an
int 0x37 interrupt (which triggers /bin/bash), encoded as
0xCD37 in machine code.
0x03 ^ 0x37 = '4'
0xCD ^ 0xB8 = 'u'
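The two XORs above can be checked in a few lines. This is only a sketch of the arithmetic: the key bytes 0xB8 and 0x03 and the target bytes CD 37 come from the analysis above, the variable names are mine, and I am assuming the first missing character is the one paired with 0xB8.

```python
# Bytes we want at the start of the shellcode: CD 37 encodes `int 0x37`.
target = bytes([0xCD, 0x37])

# Values the two flag characters are XORed with (taken from the analysis above).
key = bytes([0xB8, 0x03])

# XOR is its own inverse, so each character is target ^ key.
missing = bytes(t ^ k for t, k in zip(target, key)).decode("ascii")
print(missing)  # u4

# Sanity check: XORing the characters with the key gives the opcode back.
assert bytes(ord(c) ^ k for c, k in zip(missing, key)) == target
```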
We now have our full flag :
Great challenge: it was my first time dealing with Unicorn, and I enjoyed having to deal with anti-disassembly in a CTF challenge.
This annex details the anti-disassembly tricks used in this binary.
The main procedure starts like this :
You can see strange instructions at the end of the disassembled procedure :
call $+5
add [rsp+98h+var_98], 5
retn
This is a trick to prevent disassemblers such as IDA from disassembling a whole function, by putting a
retn instruction early in the function. Disassemblers such as IDA use flow-oriented disassembly, which stops disassembling a function when it encounters a retn instruction.
Here is how this trick works :
The call $+5 instruction calls the very next instruction (add [rsp+98h+var_98], 5 here), jumping to it and putting its address on the stack (that is what a call instruction does: it pushes the address of the next instruction onto the stack and jumps to its target).
The add instruction then adds 5 to the value on top of the stack, enough to skip over the add and retn instructions. In other words, the previously pushed address of the add instruction now points to the first instruction after the retn.
Finally comes the retn, which pops the address from the stack and jumps to it, which in our case means that it jumps to the next instruction.
These instructions have no other effect than messing with common disassemblers. They can be NOPed out without any problem, and that is what I did for this challenge.
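The arithmetic behind this trick can be replayed on raw bytes. The sketch below does not use the bytes of the main function, which are not reproduced in this write-up. Instead it assumes the same three-instruction pattern as it appears at offset 11 of the shellcode words extracted earlier (E8 00 00 00 00, 83 04 24 05, C3). It models only the return address, not a real CPU.

```python
# Bytes 11..20 of the extracted shellcode: call $+5 ; add [rsp], 5 ; ret
pattern = bytes.fromhex("e800000000" "83042405" "c3")
base = 11  # offset of the call inside the shellcode (from the extracted words)

CALL_LEN, ADD_LEN, RET_LEN = 5, 4, 1
assert CALL_LEN + ADD_LEN + RET_LEN == len(pattern)

stack = []

# call $+5: push the address of the next instruction, then jump to it.
stack.append(base + CALL_LEN)

# add [rsp], 5: patch the return address sitting on top of the stack.
stack[-1] += pattern[8]  # the immediate operand of the add, 0x05

# ret: pop the patched address and jump to it.
resume_at = stack.pop()

print(resume_at)            # 21
print(base + len(pattern))  # 21, the first byte after the ret
```

Both values match: execution simply continues right after the ret, which is why the three instructions can be replaced with NOPs.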
The shellcode was using overlapping instructions in order to mess with my disassembler. One good example of this technique can be found in a book named
Practical Malware Analysis :
Feel free to take a minute to read the image above, it is not that easy to understand if this is the first time you are encountering this.
The main purpose of overlapping instructions is to create an error in the disassembly. Once bytes have been disassembled, common disassemblers do not disassemble them again. They can’t show you that two instructions overlap (in this case
mov ax, 05ebh and
jmp 5), simply because even though they overlap, they won’t be executed at the same time. This is difficult to represent in a linear listing (though a graph view could solve this).
radare2 solves the problem by showing all the instructions, even if they do not execute in the order they are displayed:
Here we can see that the mov occurs, but then the code jumps directly to a point much further on, missing the
xor eax, eax instruction.
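To make the overlap concrete, the jump targets can be computed by hand from the first bytes of the shellcode words extracted earlier. This is a sketch with a made-up helper (`short_jump_target`) that only decodes two-byte short jumps. The instruction boundaries in the comments are my own reading of those bytes. Note that this shellcode uses mov dx where the book example uses mov ax.

```python
# First 11 bytes of the extracted shellcode.
code = bytes.fromhex("31d2" "66baeb05" "31c0" "74fa" "e8")
# offset 0: xor edx, edx | 2: mov dx, 0x05eb | 6: xor eax, eax | 8: jz | 10: junk byte


def short_jump_target(code, off):
    """Target of a 2-byte short jump (opcode, signed rel8) located at `off`."""
    rel8 = int.from_bytes(code[off + 1:off + 2], "little", signed=True)
    return off + 2 + rel8


jz_target = short_jump_target(code, 8)           # jz lands inside the mov
jmp_target = short_jump_target(code, jz_target)  # EB 05 decoded as a jmp

print(jz_target, hex(code[jz_target]))  # 4 0xeb
print(jmp_target)                       # 11
```

The jz lands at offset 4, in the middle of the mov that spans offsets 2 to 5, where the bytes EB 05 decode as a jump to offset 11. The E8 byte at offset 10 is never executed; it is only there to mislead the disassembler.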