Barbhack 2023 - Encrypted Box
Encrypted Box
is a challenge from Barbhack 2023. It requires automating the decryption of code through 1000 stages and reversing the aesenc/aesdec CPU instructions.
Each stage is nested within another. To understand the challenge, we only need to examine a few ASM instructions. Here is what a single stage looks like:
.text:0000000000400078 start:
[...syscall setup]
.text:0000000000400088 syscall ; LINUX - sys_read
[...Setup some data in xmm0, and key in xmm1]
.text:4001B8 pinsrq xmm2, qword ptr [rsp+8], 1 ; User input it written to xmm2
.text:4001C1 pinsrq xmm2, qword ptr [rsp], 0
.text:4001C9 aesdec xmm2, xmm1
.text:4001CE comiss xmm2, xmm0
.text:4001D1 jnz short loc_40020E
.text:4001D3 mov rcx, 66B80h
.text:4001DA pinsrq xmm2, qword ptr [rsp+8], 1
.text:4001E3 pinsrq xmm2, qword ptr [rsp], 0
.text:4001EB
.text:4001EB dec_loop: ; CODE XREF: .text:000000000040020A↓j
.text:4001EB sub rcx, 10h
.text:4001EF lea rax, next_stage
.text:4001F6 add rax, rcx
.text:4001F9 movdqu xmm0, xmmword ptr [rax] ; User input it written to xmm2
.text:4001FD aesdec xmm0, xmm2
.text:400202 movdqu xmmword ptr [rax], xmm0
.text:400206 cmp rcx, 0
.text:40020A jnz short dec_loop
.text:40020C jmp short next_stage
.text:40020E ; ---------------------------------------------------------------------------
.text:40020E
.text:40020E loc_40020E: ; CODE XREF: .text:00000000004001D1↑j
.text:40020E ; .text:0000000000400255↓j
.text:40020E mov rax, 3Ch ; '<'
.text:400215 syscall ; LINUX - sys_exit
.text:400217
.text:400217 next_stage: ; CODE XREF: .text:000000000040020C↑j
[...junk code]
Upon analyzing this assembly, we quickly notice that the only impact we have on the executable is setting up xmm2 at addresses 0x4001DA and 0x4001E3.
This register will be :
- used in
aesdec
instruction as thestate
; - compared after decryption to data stored in
xmm0
.
The encryption key is stored in xmm1, and some data is stored in xmm0. Both values are fixed.
Our goal is to find an input that, when used in aesdec, will result in the same value as in xmm0
.
We need to encrypt xmm0
in a way that: decrypt(data=xmm2, key=xmm1) == xmm0
<=> encrypt(data=xmm0, key=xmm1) == xmm2
.
This implies that we need to “reverse” the following instruction: aesenc xmm0, xmm1.
aesenc
performs a single round of AES encryption, whileaesdec
performs a single round of AES decryption.
It is important to note that aesenc
is not a direct symmetric operation of aesdec. In other words, A != aesdec(aesenc(A))
.
This distinction becomes clear when reading Intel’s Manual, which states that aesenc
performs the following operations:
STATE := SRC1;
RoundKey := SRC2;
STATE := ShiftRows( STATE );
STATE := SubBytes( STATE );
STATE := MixColumns( STATE );
DEST[127:0] := STATE XOR RoundKey;
DEST[MAXVL-1:128] (Unmodified)
While aesdec
performs the following operations:
STATE := SRC1;
RoundKey := SRC2;
STATE := InvShiftRows( STATE );
STATE := InvSubBytes( STATE );
STATE := InvMixColumns( STATE );
DEST[127:0] := STATE XOR RoundKey;
DEST[MAXVL-1:128] (Unmodified)
To “revert” an aesenc
operation, one would need to reverse the order or operations, as follows:
DEST[127:0] := STATE XOR RoundKey;
STATE := InvMixColumns( STATE );
STATE := InvSubBytes( STATE );
STATE := ShiftRows( STATE );
which are essentially aesdec
operations in reverse order.
The same principle applies to “reverting” an aesdec
instruction:
DEST[127:0] := STATE XOR RoundKey;
STATE := MixColumns( STATE );
STATE := SubBytes( STATE );
STATE := ShiftRows( STATE );
Reversing a single aesenc
/ aesdec
instruction is something I encountered in 2020 at FCSC (French Cybersecurity Challenge) with the keykoolol
challenge.
Two write-ups are available online (1, 2), altough they do not explain how to or provide code to revert both aesdec
and aesenc
(I noticed after the CTF that the first writeup has a Rust routine that inverts aesenc
).
Mathis Hammel provided me with the following Python script, which implements AESDECINV
and AESENCINV
, to respectively invert AESDEC
and AESENC
instructions:
def xor(a, b):
return ''.join(chr(ord(ac) ^ ord(bc)) for ac, bc in zip(a, b))
Sbox = (
0x63, 0x7C, 0x77, 0x7B, 0xF2, 0x6B, 0x6F, 0xC5, 0x30, 0x01, 0x67, 0x2B, 0xFE, 0xD7, 0xAB, 0x76,
0xCA, 0x82, 0xC9, 0x7D, 0xFA, 0x59, 0x47, 0xF0, 0xAD, 0xD4, 0xA2, 0xAF, 0x9C, 0xA4, 0x72, 0xC0,
0xB7, 0xFD, 0x93, 0x26, 0x36, 0x3F, 0xF7, 0xCC, 0x34, 0xA5, 0xE5, 0xF1, 0x71, 0xD8, 0x31, 0x15,
0x04, 0xC7, 0x23, 0xC3, 0x18, 0x96, 0x05, 0x9A, 0x07, 0x12, 0x80, 0xE2, 0xEB, 0x27, 0xB2, 0x75,
0x09, 0x83, 0x2C, 0x1A, 0x1B, 0x6E, 0x5A, 0xA0, 0x52, 0x3B, 0xD6, 0xB3, 0x29, 0xE3, 0x2F, 0x84,
0x53, 0xD1, 0x00, 0xED, 0x20, 0xFC, 0xB1, 0x5B, 0x6A, 0xCB, 0xBE, 0x39, 0x4A, 0x4C, 0x58, 0xCF,
0xD0, 0xEF, 0xAA, 0xFB, 0x43, 0x4D, 0x33, 0x85, 0x45, 0xF9, 0x02, 0x7F, 0x50, 0x3C, 0x9F, 0xA8,
0x51, 0xA3, 0x40, 0x8F, 0x92, 0x9D, 0x38, 0xF5, 0xBC, 0xB6, 0xDA, 0x21, 0x10, 0xFF, 0xF3, 0xD2,
0xCD, 0x0C, 0x13, 0xEC, 0x5F, 0x97, 0x44, 0x17, 0xC4, 0xA7, 0x7E, 0x3D, 0x64, 0x5D, 0x19, 0x73,
0x60, 0x81, 0x4F, 0xDC, 0x22, 0x2A, 0x90, 0x88, 0x46, 0xEE, 0xB8, 0x14, 0xDE, 0x5E, 0x0B, 0xDB,
0xE0, 0x32, 0x3A, 0x0A, 0x49, 0x06, 0x24, 0x5C, 0xC2, 0xD3, 0xAC, 0x62, 0x91, 0x95, 0xE4, 0x79,
0xE7, 0xC8, 0x37, 0x6D, 0x8D, 0xD5, 0x4E, 0xA9, 0x6C, 0x56, 0xF4, 0xEA, 0x65, 0x7A, 0xAE, 0x08,
0xBA, 0x78, 0x25, 0x2E, 0x1C, 0xA6, 0xB4, 0xC6, 0xE8, 0xDD, 0x74, 0x1F, 0x4B, 0xBD, 0x8B, 0x8A,
0x70, 0x3E, 0xB5, 0x66, 0x48, 0x03, 0xF6, 0x0E, 0x61, 0x35, 0x57, 0xB9, 0x86, 0xC1, 0x1D, 0x9E,
0xE1, 0xF8, 0x98, 0x11, 0x69, 0xD9, 0x8E, 0x94, 0x9B, 0x1E, 0x87, 0xE9, 0xCE, 0x55, 0x28, 0xDF,
0x8C, 0xA1, 0x89, 0x0D, 0xBF, 0xE6, 0x42, 0x68, 0x41, 0x99, 0x2D, 0x0F, 0xB0, 0x54, 0xBB, 0x16,
)
InvSbox = (
0x52, 0x09, 0x6A, 0xD5, 0x30, 0x36, 0xA5, 0x38, 0xBF, 0x40, 0xA3, 0x9E, 0x81, 0xF3, 0xD7, 0xFB,
0x7C, 0xE3, 0x39, 0x82, 0x9B, 0x2F, 0xFF, 0x87, 0x34, 0x8E, 0x43, 0x44, 0xC4, 0xDE, 0xE9, 0xCB,
0x54, 0x7B, 0x94, 0x32, 0xA6, 0xC2, 0x23, 0x3D, 0xEE, 0x4C, 0x95, 0x0B, 0x42, 0xFA, 0xC3, 0x4E,
0x08, 0x2E, 0xA1, 0x66, 0x28, 0xD9, 0x24, 0xB2, 0x76, 0x5B, 0xA2, 0x49, 0x6D, 0x8B, 0xD1, 0x25,
0x72, 0xF8, 0xF6, 0x64, 0x86, 0x68, 0x98, 0x16, 0xD4, 0xA4, 0x5C, 0xCC, 0x5D, 0x65, 0xB6, 0x92,
0x6C, 0x70, 0x48, 0x50, 0xFD, 0xED, 0xB9, 0xDA, 0x5E, 0x15, 0x46, 0x57, 0xA7, 0x8D, 0x9D, 0x84,
0x90, 0xD8, 0xAB, 0x00, 0x8C, 0xBC, 0xD3, 0x0A, 0xF7, 0xE4, 0x58, 0x05, 0xB8, 0xB3, 0x45, 0x06,
0xD0, 0x2C, 0x1E, 0x8F, 0xCA, 0x3F, 0x0F, 0x02, 0xC1, 0xAF, 0xBD, 0x03, 0x01, 0x13, 0x8A, 0x6B,
0x3A, 0x91, 0x11, 0x41, 0x4F, 0x67, 0xDC, 0xEA, 0x97, 0xF2, 0xCF, 0xCE, 0xF0, 0xB4, 0xE6, 0x73,
0x96, 0xAC, 0x74, 0x22, 0xE7, 0xAD, 0x35, 0x85, 0xE2, 0xF9, 0x37, 0xE8, 0x1C, 0x75, 0xDF, 0x6E,
0x47, 0xF1, 0x1A, 0x71, 0x1D, 0x29, 0xC5, 0x89, 0x6F, 0xB7, 0x62, 0x0E, 0xAA, 0x18, 0xBE, 0x1B,
0xFC, 0x56, 0x3E, 0x4B, 0xC6, 0xD2, 0x79, 0x20, 0x9A, 0xDB, 0xC0, 0xFE, 0x78, 0xCD, 0x5A, 0xF4,
0x1F, 0xDD, 0xA8, 0x33, 0x88, 0x07, 0xC7, 0x31, 0xB1, 0x12, 0x10, 0x59, 0x27, 0x80, 0xEC, 0x5F,
0x60, 0x51, 0x7F, 0xA9, 0x19, 0xB5, 0x4A, 0x0D, 0x2D, 0xE5, 0x7A, 0x9F, 0x93, 0xC9, 0x9C, 0xEF,
0xA0, 0xE0, 0x3B, 0x4D, 0xAE, 0x2A, 0xF5, 0xB0, 0xC8, 0xEB, 0xBB, 0x3C, 0x83, 0x53, 0x99, 0x61,
0x17, 0x2B, 0x04, 0x7E, 0xBA, 0x77, 0xD6, 0x26, 0xE1, 0x69, 0x14, 0x63, 0x55, 0x21, 0x0C, 0x7D,
)
xtime = lambda a: (((a << 1) ^ 0x1B) & 0xFF) if (a & 0x80) else (a << 1)
Rcon = (
0x00, 0x01, 0x02, 0x04, 0x08, 0x10, 0x20, 0x40,
0x80, 0x1B, 0x36, 0x6C, 0xD8, 0xAB, 0x4D, 0x9A,
0x2F, 0x5E, 0xBC, 0x63, 0xC6, 0x97, 0x35, 0x6A,
0xD4, 0xB3, 0x7D, 0xFA, 0xEF, 0xC5, 0x91, 0x39,
)
def text2matrix(text):
matrix = []
for i in range(16):
byte = text[i]
if i % 4 == 0:
matrix.append([byte])
else:
matrix[i // 4].append(byte)
return matrix
def matrix2text(matrix):
text = []
for i in range(4):
for j in range(4):
text.append(matrix[i][j])
return bytes(text)
class AES:
def init(self, master_key=None):
if master_key:
self.change_key(master_key)
def change_key(self, master_key):
self.round_keys = text2matrix(master_key)
for i in range(4, 4 * 11):
self.round_keys.append([])
if i % 4 == 0:
byte = self.round_keys[i - 4][0] \
^ Sbox[self.round_keys[i - 1][1]] \
^ Rcon[i // 4]
self.round_keys[i].append(byte)
for j in range(1, 4):
byte = self.round_keys[i - 4][j] \
^ Sbox[self.round_keys[i - 1][(j + 1) % 4]]
self.round_keys[i].append(byte)
else:
for j in range(4):
byte = self.round_keys[i - 4][j] \
^ self.round_keys[i - 1][j]
self.round_keys[i].append(byte)
def encrypt(self, plaintext):
self.plain_state = text2matrix(plaintext)
self.add_round_key(self.plain_state, self.round_keys[:4])
for i in range(1, 10):
self.round_encrypt(self.plain_state, self.round_keys[4 * i: 4 * (i + 1)])
self.sub_bytes(self.plain_state)
self.shift_rows(self.plain_state)
self.add_round_key(self.plain_state, self.round_keys[40:])
return matrix2text(self.plain_state)
def AESENCLAST(self, plaintext, round_key):
state = text2matrix(plaintext)
rkey = text2matrix(round_key)
self.sub_bytes(state)
self.shift_rows(state)
self.add_round_key(state, rkey)
return matrix2text(state)
def AESENC(self, plaintext, round_key):
state = text2matrix(plaintext)
rkey = text2matrix(round_key)
self.shift_rows(state)
self.sub_bytes(state)
self.mix_columns(state)
self.add_round_key(state, rkey)
return matrix2text(state)
def AESDECINV(self, plaintext, round_key):
state = text2matrix(plaintext)
rkey = text2matrix(round_key)
self.add_round_key(state, rkey)
self.mix_columns(state)
self.sub_bytes(state)
self.shift_rows(state)
return matrix2text(state)
def AESDECLAST(self, ciphertext, round_key):
state = text2matrix(ciphertext)
rkey = text2matrix(round_key)
self.add_round_key(state, rkey)
self.inv_sub_bytes(state)
self.inv_shift_rows(state)
return matrix2text(state)
def AESDEC(self, ciphertext, round_key):
state = text2matrix(ciphertext)
rkey = text2matrix(round_key)
self.inv_shift_rows(state)
self.inv_sub_bytes(state)
self.inv_mix_columns(state)
self.add_round_key(state, rkey)
return matrix2text(state)
def AESENCINV(self, ciphertext, round_key):
state = text2matrix(ciphertext)
rkey = text2matrix(round_key)
self.add_round_key(state, rkey)
self.inv_mix_columns(state)
self.inv_sub_bytes(state)
self.inv_shift_rows(state)
return matrix2text(state)
def decrypt(self, ciphertext):
self.cipher_state = text2matrix(ciphertext)
self.add_round_key(self.cipher_state, self.round_keys[40:])
self.inv_shift_rows(self.cipher_state)
self.inv_sub_bytes(self.cipher_state)
for i in range(9, 0, -1):
self.round_decrypt(self.cipher_state, self.round_keys[4 * i: 4 * (i + 1)])
self.add_round_key(self.cipher_state, self.round_keys[:4])
return matrix2text(self.cipher_state)
def add_round_key(self, s, k):
for i in range(4):
for j in range(4):
s[i][j] ^= k[i][j]
def sr_encrypt(self, plaintext, key):
state_matrix = text2matrix(plaintext)
key_matrix = text2matrix(key)
self.round_encrypt(state_matrix, key_matrix)
return matrix2text(state_matrix)
def round_encrypt(self, state_matrix, key_matrix):
self.sub_bytes(state_matrix)
self.shift_rows(state_matrix)
self.mix_columns(state_matrix)
self.add_round_key(state_matrix, key_matrix)
def sr_decryptlast(self, plaintext, key):
state_matrix = text2matrix(plaintext)
key_matrix = text2matrix(key)
self.add_round_key(state_matrix, key_matrix)
self.inv_shift_rows(state_matrix)
self.inv_sub_bytes(state_matrix)
return matrix2text(state_matrix)
def sr_decrypt(self, plaintext, key):
state_matrix = text2matrix(plaintext)
key_matrix = text2matrix(key)
self.round_decrypt(state_matrix, key_matrix)
return matrix2text(state_matrix)
def round_decrypt(self, state_matrix, key_matrix):
self.add_round_key(state_matrix, key_matrix)
self.inv_shift_rows(state_matrix)
self.inv_sub_bytes(state_matrix)
def sub_bytes(self, s):
for i in range(4):
for j in range(4):
s[i][j] = Sbox[s[i][j]]
def x_sub_bytes(self, s):
s = text2matrix(s)
self.sub_bytes(s)
return matrix2text(s)
def x_mix_columns(self, s):
s = text2matrix(s)
self.mix_columns(s)
return matrix2text(s)
def x_inv_sub_bytes(self, s):
s = text2matrix(s)
self.inv_sub_bytes(s)
return matrix2text(s)
def x_inv_mix_columns(self, s):
s = text2matrix(s)
self.inv_mix_columns(s)
return matrix2text(s)
def x_inv_shift_rows(self, s):
s = text2matrix(s)
self.inv_shift_rows(s)
return matrix2text(s)
def x_shift_rows(self, s):
s = text2matrix(s)
self.shift_rows(s)
return matrix2text(s)
def inv_sub_bytes(self, s):
for i in range(4):
for j in range(4):
s[i][j] = InvSbox[s[i][j]]
def shift_rows(self, s):
s[0][1], s[1][1], s[2][1], s[3][1] = s[1][1], s[2][1], s[3][1], s[0][1]
s[0][2], s[1][2], s[2][2], s[3][2] = s[2][2], s[3][2], s[0][2], s[1][2]
s[0][3], s[1][3], s[2][3], s[3][3] = s[3][3], s[0][3], s[1][3], s[2][3]
def inv_shift_rows(self, s):
s[0][1], s[1][1], s[2][1], s[3][1] = s[3][1], s[0][1], s[1][1], s[2][1]
s[0][2], s[1][2], s[2][2], s[3][2] = s[2][2], s[3][2], s[0][2], s[1][2]
s[0][3], s[1][3], s[2][3], s[3][3] = s[1][3], s[2][3], s[3][3], s[0][3]
def mix_single_column(self, a):
# please see Sec 4.1.2 in The Design of Rijndael
t = a[0] ^ a[1] ^ a[2] ^ a[3]
u = a[0]
a[0] ^= t ^ xtime(a[0] ^ a[1])
a[1] ^= t ^ xtime(a[1] ^ a[2])
a[2] ^= t ^ xtime(a[2] ^ a[3])
a[3] ^= t ^ xtime(a[3] ^ u)
def mix_columns(self, s):
for i in range(4):
self.mix_single_column(s[i])
def inv_mix_columns(self, s):
# see Sec 4.1.3 in The Design of Rijndael
for i in range(4):
u = xtime(xtime(s[i][0] ^ s[i][2]))
v = xtime(xtime(s[i][1] ^ s[i][3]))
s[i][0] ^= u
s[i][1] ^= v
s[i][2] ^= u
s[i][3] ^= v
self.mix_columns(s)
if __name__ == "__main__":
A = AES()
txt = b'azertyuiopqsdfgh'
key = b'0123456789abcdef'
enc = A.AESENC(txt, key)
txt2 = A.AESENCINV(enc, key)
print('Original: ', txt)
print('Recovered:', txt2)
dec = A.AESENC(txt, key)
txt2 = A.AESENCINV(dec, key)
print('Original: ', txt)
print('Recovered:', txt2)
The second step of this challenge involves automating the task, since this challenge consists of 1000 stages.
There are several ways to automate this :
- static analysis ;
- emulation ;
- instrumentation ;
- debugging.
I attempted to emulate the code using Unicorn, but I encountered an issue where Unicorn would not properly handle XMM
registers. Plus, it also turned out that a valid unicorn script would take up too long to complete, since decryption process goes through quite a lot of instructions.
I gave up unicorn and tried miasm, but miasm cannot handle aesenc
/aesdec
instructions.
I ended up using gdb/peda to automate the execution. Here is the resulting script:
def parse_instruction(line) -> Tuple[str, str, str]:
line = line.split()
addr, instr, args = line[0][:-1], line[1], line[2] if len(line) > 2 else None
return addr, instr, args
def get_current_ins() -> str:
return peda.current_inst(peda.getreg("rip"))[1].split()[0]
def get_xmm(reg) -> int:
out = peda.execute_redirect(f'p ${reg}.uint128')
val = int(out.split()[2], 16)
if val == 0:
return 0
hexval = hex(val)[2:].zfill(32)
return int.from_bytes(bytes.fromhex(hexval), 'little')
def set_xmm(reg, bytes):
peda.execute(f'set ${reg}.v={hex(int.from_bytes(bytes,"big"))}')
def setup_bp_from_disassembly(relevant_instructions: list, ret_on_first_match = False):
rip = peda.getreg("rip")
out = peda.disassemble(hex(rip), hex(rip+0x200))
lines = out.splitlines()
for i, line in enumerate(lines):
addr, instr, args = parse_instruction(line)
if instr in relevant_instructions:
peda.execute(f'break *{addr}')
if ret_on_first_match:
return
def setup():
peda.execute('peda set option context ""')
peda.execute('peda set option clearscr off')
try: # I also had pwndbg installed
pwndbg.gdb.execute('set context-output /dev/null')
pwndbg.gdb.execute('set context-sections regs')
except:
pass
setup()
peda.execute('file /home/nofix/encrypted_box2')
peda.execute('starti < /dev/urandom')
setup_bp_from_disassembly(['syscall', 'aesenc', 'aesdec'])
i = 0
while True:
print(f'{i}/1001 {hex(peda.getreg("rip"))}')
peda.execute('continue')
curr_ins = get_current_ins()
if curr_ins == 'syscall':
peda.execute('set $rip=$rip+2') # ignore syscalls
elif curr_ins in ['aesdec', 'aesenc']:
xmm0 = get_xmm("xmm0")
xmm1 = get_xmm("xmm1")
# print(f'DATA : {hex(xmm0)}')
# print(f'KEY : {hex(xmm1)}')
aes = AES()
# We "reverse" the value from XMM0, using XMM1 as key
if curr_ins == 'aesdec':
value = aes.AESDECINV(bytes.fromhex(hex(xmm0)[2:].zfill(32)), bytes.fromhex(hex(xmm1)[2:].zfill(32)))
elif curr_ins == 'aesenc':
value = aes.AESENCINV(bytes.fromhex(hex(xmm0)[2:].zfill(32)), bytes.fromhex(hex(xmm1)[2:].zfill(32)))
else:
assert False
upper, lower = struct.pack('<Q', int(value.hex()[0:16], 16)), struct.pack('<Q', int(value.hex()[16:32], 16))
# It seems gdb wont allow one to use xmmX.int128 directly, so I have to split the value in two parts
peda.execute(f'set $xmm2.v2_int64 = {{0x{upper.hex()},0x{lower.hex()}}}')
# Value was supposed to come from the stack, and it is also used in the decryption algorithm to re-setup XMM2,
# so I set the proper value @RSP, see :
# 4001DA pinsrq xmm2, qword ptr [rsp+8], 1
# 4001E3 pinsrq xmm2, qword ptr [rsp], 0
peda.writemem(peda.getreg("rsp"), value)
# For speed purposes, we remove breakpoints, since we don't need them anymore
peda.execute('delete breakpoints')
# We still need to break before next stage
setup_bp_from_disassembly(['jmp'], ret_on_first_match=True)
elif curr_ins == 'jmp': # We are sitting right before next stage
peda.execute(f'si')
i += 1
peda.execute('delete breakpoints')
# Set a bp to valid instruction not far from aesenc/aesdec
setup_bp_from_disassembly(['pinsrq'], ret_on_first_match=True)
elif curr_ins == 'pinsrq':
# We should be near aesenc/aesdec
setup_bp_from_disassembly(['syscall', 'aesenc', 'aesdec'])
For each of the 1000 stages, the script:
- sets a breakpoint on the first
aesdec
/aesenc
instruction; - skips the
read
fromstdin
by reading garbage from/dev/urandom
(achieved automatically withstarti < /dev/urandom
); - reaches the
aesdec
/aesenc
breakpoint, calculate the expected value ofxmm2
by encrypting or decryptingxmm0
usingxmm1
as a key; - writes the result in
xmm2
and the stack, where ourread
input should have been located after theread
syscall; - removes breakpoints and allows the encryption/decryption routine to proceed.
BRB{as_deep_as_OceanGate}
Bonus solution
The challenge creator, s4r
, used DynamoRIO to instrument and automate the decryption process. This is a clever approach I did not think of during the CTF. While I tend to think that instrumentation requires coding effort, the APIs are actually rather simple, and you can achieve a lot with minimal code. The main drawback is that you need to recompile your code after each modification.
Here is the code provided by s4r
:
/// ~/Softs/DynamoRIO-Linux-10.0.0/bin64/drrun -c trace_encryped_box.so -- ~/barbhack-ctf/2023/reverse/encrypted_box/encrypted_box < /dev/urandom
// This sample logs the basic blocks executed in the main module
#include "dr_api.h"
#include "dr_ir_instr.h"
#include "dr_ir_utils.h"
#include "drmgr.h"
#include "drwrap.h"
#include <string.h>
static app_pc exe_start;
static int bb;
static void set_input(int is_dec);
static void event_exit(void);
static dr_emit_flags_t event_app_instruction_blocks(void *drcontext, void *tag, instrlist_t *bb, instr_t *instr, bool for_trace, bool translating, void *user_data);
static dr_emit_flags_t event_app_instruction_calls(void *drcontext, void *tag, instrlist_t *bb, instr_t *instr, bool for_trace, bool translating, void *user_data);
void sub_bytes(unsigned char* x0)
{
unsigned char sbox[] = {
0x63, 0x7C, 0x77, 0x7B, 0xF2, 0x6B, 0x6F, 0xC5, 0x30, 0x01, 0x67, 0x2B, 0xFE, 0xD7, 0xAB, 0x76,
0xCA, 0x82, 0xC9, 0x7D, 0xFA, 0x59, 0x47, 0xF0, 0xAD, 0xD4, 0xA2, 0xAF, 0x9C, 0xA4, 0x72, 0xC0,
0xB7, 0xFD, 0x93, 0x26, 0x36, 0x3F, 0xF7, 0xCC, 0x34, 0xA5, 0xE5, 0xF1, 0x71, 0xD8, 0x31, 0x15,
0x04, 0xC7, 0x23, 0xC3, 0x18, 0x96, 0x05, 0x9A, 0x07, 0x12, 0x80, 0xE2, 0xEB, 0x27, 0xB2, 0x75,
0x09, 0x83, 0x2C, 0x1A, 0x1B, 0x6E, 0x5A, 0xA0, 0x52, 0x3B, 0xD6, 0xB3, 0x29, 0xE3, 0x2F, 0x84,
0x53, 0xD1, 0x00, 0xED, 0x20, 0xFC, 0xB1, 0x5B, 0x6A, 0xCB, 0xBE, 0x39, 0x4A, 0x4C, 0x58, 0xCF,
0xD0, 0xEF, 0xAA, 0xFB, 0x43, 0x4D, 0x33, 0x85, 0x45, 0xF9, 0x02, 0x7F, 0x50, 0x3C, 0x9F, 0xA8,
0x51, 0xA3, 0x40, 0x8F, 0x92, 0x9D, 0x38, 0xF5, 0xBC, 0xB6, 0xDA, 0x21, 0x10, 0xFF, 0xF3, 0xD2,
0xCD, 0x0C, 0x13, 0xEC, 0x5F, 0x97, 0x44, 0x17, 0xC4, 0xA7, 0x7E, 0x3D, 0x64, 0x5D, 0x19, 0x73,
0x60, 0x81, 0x4F, 0xDC, 0x22, 0x2A, 0x90, 0x88, 0x46, 0xEE, 0xB8, 0x14, 0xDE, 0x5E, 0x0B, 0xDB,
0xE0, 0x32, 0x3A, 0x0A, 0x49, 0x06, 0x24, 0x5C, 0xC2, 0xD3, 0xAC, 0x62, 0x91, 0x95, 0xE4, 0x79,
0xE7, 0xC8, 0x37, 0x6D, 0x8D, 0xD5, 0x4E, 0xA9, 0x6C, 0x56, 0xF4, 0xEA, 0x65, 0x7A, 0xAE, 0x08,
0xBA, 0x78, 0x25, 0x2E, 0x1C, 0xA6, 0xB4, 0xC6, 0xE8, 0xDD, 0x74, 0x1F, 0x4B, 0xBD, 0x8B, 0x8A,
0x70, 0x3E, 0xB5, 0x66, 0x48, 0x03, 0xF6, 0x0E, 0x61, 0x35, 0x57, 0xB9, 0x86, 0xC1, 0x1D, 0x9E,
0xE1, 0xF8, 0x98, 0x11, 0x69, 0xD9, 0x8E, 0x94, 0x9B, 0x1E, 0x87, 0xE9, 0xCE, 0x55, 0x28, 0xDF,
0x8C, 0xA1, 0x89, 0x0D, 0xBF, 0xE6, 0x42, 0x68, 0x41, 0x99, 0x2D, 0x0F, 0xB0, 0x54, 0xBB, 0x16
};
for (int i = 0; i < 16; i++)
x0[i] = sbox[x0[i]];
}
void inv_sub_bytes(unsigned char* x0)
{
unsigned char inv_sbox[] = {
0x52, 0x09, 0x6A, 0xD5, 0x30, 0x36, 0xA5, 0x38, 0xBF, 0x40, 0xA3, 0x9E, 0x81, 0xF3, 0xD7, 0xFB,
0x7C, 0xE3, 0x39, 0x82, 0x9B, 0x2F, 0xFF, 0x87, 0x34, 0x8E, 0x43, 0x44, 0xC4, 0xDE, 0xE9, 0xCB,
0x54, 0x7B, 0x94, 0x32, 0xA6, 0xC2, 0x23, 0x3D, 0xEE, 0x4C, 0x95, 0x0B, 0x42, 0xFA, 0xC3, 0x4E,
0x08, 0x2E, 0xA1, 0x66, 0x28, 0xD9, 0x24, 0xB2, 0x76, 0x5B, 0xA2, 0x49, 0x6D, 0x8B, 0xD1, 0x25,
0x72, 0xF8, 0xF6, 0x64, 0x86, 0x68, 0x98, 0x16, 0xD4, 0xA4, 0x5C, 0xCC, 0x5D, 0x65, 0xB6, 0x92,
0x6C, 0x70, 0x48, 0x50, 0xFD, 0xED, 0xB9, 0xDA, 0x5E, 0x15, 0x46, 0x57, 0xA7, 0x8D, 0x9D, 0x84,
0x90, 0xD8, 0xAB, 0x00, 0x8C, 0xBC, 0xD3, 0x0A, 0xF7, 0xE4, 0x58, 0x05, 0xB8, 0xB3, 0x45, 0x06,
0xD0, 0x2C, 0x1E, 0x8F, 0xCA, 0x3F, 0x0F, 0x02, 0xC1, 0xAF, 0xBD, 0x03, 0x01, 0x13, 0x8A, 0x6B,
0x3A, 0x91, 0x11, 0x41, 0x4F, 0x67, 0xDC, 0xEA, 0x97, 0xF2, 0xCF, 0xCE, 0xF0, 0xB4, 0xE6, 0x73,
0x96, 0xAC, 0x74, 0x22, 0xE7, 0xAD, 0x35, 0x85, 0xE2, 0xF9, 0x37, 0xE8, 0x1C, 0x75, 0xDF, 0x6E,
0x47, 0xF1, 0x1A, 0x71, 0x1D, 0x29, 0xC5, 0x89, 0x6F, 0xB7, 0x62, 0x0E, 0xAA, 0x18, 0xBE, 0x1B,
0xFC, 0x56, 0x3E, 0x4B, 0xC6, 0xD2, 0x79, 0x20, 0x9A, 0xDB, 0xC0, 0xFE, 0x78, 0xCD, 0x5A, 0xF4,
0x1F, 0xDD, 0xA8, 0x33, 0x88, 0x07, 0xC7, 0x31, 0xB1, 0x12, 0x10, 0x59, 0x27, 0x80, 0xEC, 0x5F,
0x60, 0x51, 0x7F, 0xA9, 0x19, 0xB5, 0x4A, 0x0D, 0x2D, 0xE5, 0x7A, 0x9F, 0x93, 0xC9, 0x9C, 0xEF,
0xA0, 0xE0, 0x3B, 0x4D, 0xAE, 0x2A, 0xF5, 0xB0, 0xC8, 0xEB, 0xBB, 0x3C, 0x83, 0x53, 0x99, 0x61,
0x17, 0x2B, 0x04, 0x7E, 0xBA, 0x77, 0xD6, 0x26, 0xE1, 0x69, 0x14, 0x63, 0x55, 0x21, 0x0C, 0x7D
};
for (int i = 0; i < 16; i++)
x0[i] = inv_sbox[x0[i]];
}
void shiftrow(unsigned char* x0)
{
unsigned char tmp[16];
tmp[0] = x0[0];
tmp[1] = x0[5];
tmp[2] = x0[10];
tmp[3] = x0[15];
tmp[4] = x0[4];
tmp[5] = x0[9];
tmp[6] = x0[14];
tmp[7] = x0[3];
tmp[8] = x0[8];
tmp[9] = x0[13];
tmp[10] = x0[2];
tmp[11] = x0[7];
tmp[12] = x0[12];
tmp[13] = x0[1];
tmp[14] = x0[6];
tmp[15] = x0[11];
for (int i = 0; i < 16 ; i++)
{
x0[i] = tmp[i];
}
}
void inv_shiftrow(unsigned char* v0)
{
unsigned char tmp[16];
tmp[0] = v0[0];
tmp[1] = v0[13];
tmp[2] = v0[10];
tmp[3] = v0[7];
tmp[4] = v0[4];
tmp[5] = v0[1];
tmp[6] = v0[14];
tmp[7] = v0[11];
tmp[8] = v0[8];
tmp[9] = v0[5];
tmp[10] = v0[2];
tmp[11] = v0[15];
tmp[12] = v0[12];
tmp[13] = v0[9];
tmp[14] = v0[6];
tmp[15] = v0[3];
for (int i = 0; i < 16 ; i++)
{
v0[i] = tmp[i];
}
}
unsigned char xtime(unsigned char x)
{
return ((x<<1) ^ (((x>>7) & 1) * 0x1b)) & 0xff;
}
unsigned char multiply(unsigned char x, unsigned char y)
{
return (((y & 1) * x) ^
((y>>1 & 1) * xtime(x)) ^
((y>>2 & 1) * xtime(xtime(x))) ^
((y>>3 & 1) * xtime(xtime(xtime(x)))) ^
((y>>4 & 1) * xtime(xtime(xtime(xtime(x)))))) & 0xff;
}
void mix_columns(unsigned char* v0)
{
for (int i = 0 ; i < 4 ; i++)
{
unsigned char t = v0[4*i + 0];
unsigned char Tmp = v0[4*i + 0] ^ v0[4*i + 1] ^ v0[4*i + 2] ^ v0[4*i + 3];
unsigned char Tm = v0[4*i + 0] ^ v0[4*i + 1];
Tm = xtime(Tm);
v0[4*i + 0] ^= Tm ^ Tmp;
Tm = v0[4*i + 1] ^ v0[4*i + 2];
Tm = xtime(Tm);
v0[4*i + 1] ^= Tm ^ Tmp;
Tm = v0[4*i + 2] ^ v0[4*i + 3];
Tm = xtime(Tm);
v0[4*i + 2] ^= Tm ^ Tmp;
Tm = v0[4*i + 3] ^ t;
Tm = xtime(Tm);
v0[4*i + 3] ^= Tm ^ Tmp;
}
}
void inv_mixcolumns(unsigned char* v0)
{
for (int i = 0 ; i < 4 ; i++)
{
unsigned char a = v0[4*i + 0];
unsigned char b = v0[4*i + 1];
unsigned char c = v0[4*i + 2];
unsigned char d = v0[4*i + 3];
v0[4*i + 0] = multiply(a, 0x0e) ^ multiply(b, 0x0b) ^ multiply(c, 0x0d) ^ multiply(d, 0x09);
v0[4*i + 1] = multiply(a, 0x09) ^ multiply(b, 0x0e) ^ multiply(c, 0x0b) ^ multiply(d, 0x0d);
v0[4*i + 2] = multiply(a, 0x0d) ^ multiply(b, 0x09) ^ multiply(c, 0x0e) ^ multiply(d, 0x0b);
v0[4*i + 3] = multiply(a, 0x0b) ^ multiply(b, 0x0d) ^ multiply(c, 0x09) ^ multiply(d, 0x0e);
}
}
void xorv(unsigned char* x1, unsigned char* x2)
{
for (int i = 0; i < 16 ; i++)
{
x1[i] ^= x2[i];
}
}
void aesenc(unsigned char* x1, unsigned char* x2)
{
shiftrow(x1);
sub_bytes(x1);
mix_columns(x1);
return xorv(x1, x2);
}
void inv_aesenc(unsigned char* x1, unsigned char* x2)
{
xorv(x1, x2);
inv_mixcolumns(x1);
inv_sub_bytes(x1);
inv_shiftrow(x1);
}
void inv_aesdec(unsigned char* x1, unsigned char* x2)
{
xorv(x1, x2);
mix_columns(x1);
sub_bytes(x1);
shiftrow(x1);
}
void aesdec(unsigned char* x1, unsigned char* x2)
{
inv_shiftrow(x1);
inv_sub_bytes(x1);
inv_mixcolumns(x1);
xorv(x1, x2);
}
static void set_input(int is_dec){
dr_mcontext_t mc = {sizeof(mc),DR_MC_ALL,};
if (!dr_mcontext_xmm_fields_valid())
{
printf("Wrong values of XMM\n");
exit(0);
}
dr_get_mcontext(dr_get_current_drcontext(), &mc);
unsigned char xmm1[16];
unsigned char xmm0[16];
reg_get_value_ex(DR_REG_XMM1, &mc, xmm1);
reg_get_value_ex(DR_REG_XMM0, &mc, xmm0);
if (is_dec)
{
inv_aesdec(xmm1, xmm0);
}
else
{
inv_aesenc(xmm1, xmm0);
}
// To pass the comiss
reg_set_value_ex(DR_REG_XMM2, &mc, xmm1);
// To pass the decryption because XMM2 is fetch again from stack
for (int i = 0; i < 16 ; i++)
{
*((unsigned char*)mc.rsp + i) = xmm1[i];
}
dr_set_mcontext(dr_get_current_drcontext(), &mc);
return;
}
DR_EXPORT void dr_client_main(client_id_t id, int argc, const char** argv){
module_data_t* exe = dr_get_main_module();
if (!drmgr_init()){
DR_ASSERT(false);
}
if (!drwrap_init()){
DR_ASSERT(false);
}
dr_set_client_name("Stan tracer", "crackmes.one");
if (exe != NULL) {
exe_start = exe->start;
}
drmgr_register_bb_instrumentation_event(NULL, event_app_instruction_blocks, NULL);
dr_free_module_data(exe);
dr_register_exit_event(event_exit);
}
static dr_emit_flags_t
event_app_instruction_blocks(void *drcontext, void *tag, instrlist_t *bb, instr_t *instr, bool for_trace, bool translating, void *user_data)
{
app_pc addr_instr = instr_get_app_pc(instr);
instr_t* first_instr_block = instrlist_first_app(bb);
app_pc first_addr_block = instr_get_app_pc(first_instr_block);
module_data_t *mod = dr_lookup_module(dr_fragment_app_pc(tag));
if (mod != NULL) {
bool from_exe = (mod->start == exe_start);
dr_free_module_data(mod);
if (!from_exe) {
return DR_EMIT_DEFAULT;
}
}
// find aesdec; comiss
if ((*(unsigned char*)(addr_instr) == 0x66) &&
(*(unsigned char*)(addr_instr + 1) == 0x0f) &&
(*(unsigned char*)(addr_instr + 2) == 0x38) &&
(*(unsigned char*)(addr_instr + 3) == 0xde) &&
(*(unsigned char*)(addr_instr + 4) == 0xd1) &&
(*(unsigned char*)(addr_instr + 5) == 0x0f))
{
int dec = 0x1;
dr_insert_clean_call(drcontext, bb, instr, set_input, false, 1, OPND_CREATE_INTPTR(dec));
}
// find aesenc; comiss
else if ((*(unsigned char*)(addr_instr) == 0x66) &&
(*(unsigned char*)(addr_instr + 1) == 0x0f) &&
(*(unsigned char*)(addr_instr + 2) == 0x38) &&
(*(unsigned char*)(addr_instr + 3) == 0xdc) &&
(*(unsigned char*)(addr_instr + 4) == 0xd1) &&
(*(unsigned char*)(addr_instr + 5) == 0x0f))
{
int dec = 0x0;
dr_insert_clean_call(drcontext, bb, instr, set_input, false, 1, OPND_CREATE_INTPTR(dec));
}
return DR_EMIT_DEFAULT;
}
static void event_exit(void) {
drmgr_exit();
}
And the associated makefile:
DR_PATH=/home/user/Softs/DynamoRIO-Linux-10.0.0
CC=gcc
CFLAGS=-fPIC -shared -lgcc -DLINUX -DX86_64 -I$(DR_PATH)/include -I$(DR_PATH)/ext/include
LFLAGS=-L$(DR_PATH)/lib64/debug -L$(DR_PATH)/ext/lib64/debug
LLIBS=-ldrmgr -ldrwrap
trace_encryped_box.so: trace_encryped_box.c
$(CC) $(CFLAGS) $^ -o $@ $(LFLAGS) $(LLIBS)