White Maria CTF / Nov 18 2019

[Reverse] Cursed App

This is a puzzle from the ASIS CTF Final 2019.

cursed_app.elf

This puzzle comes with a single ELF file. Executing it produces the following line of output:

$ chmod +x cursed_app.elf$ ./cursed_app.elfplease locate license file, run ./app license_key

We open this in radare2 and quickly realise that the program consists of a huge main function containing approximately 1000 x86-64 instructions long, calling only C standard library functions (hence the puzzle title). Near the top of main we see several instructions that reads a file given in argv[1].

After a successful call to read, the code starts to execute the main instructions that are of interest to us. The program structure is quite clear for this puzzle but it was even easier to see with radare2's graph view. Essentially, the program consisted of similar blocks of instructions that always ended in a conditional branch to the address 0x1f1f.

0x00001177      4889ef         mov rdi, rbp                 ; 1st block0x0000117a      e8d1feffff     call sym.imp.fclose0x0000117f      0fbe03         movsx eax, byte [rbx]0x00001182      be00010000     mov esi, 0x1000x00001187      69c0af000000   imul eax, eax, 0xaf0x0000118d      0591000000     add eax, 0x910x00001192      99             cdq0x00001193      f7fe           idiv esi0x00001195      89d0           mov eax, edx0x00001197      69caeb000000   imul ecx, edx, 0xeb0x0000119d      0fafc2         imul eax, edx0x000011a0      6bd267         imul edx, edx, 0x670x000011a3      69c064010000   imul eax, eax, 0x1640x000011a9      8d8408020200.  lea eax, [rax + rcx + 0x202]0x000011b0      8d4a02         lea ecx, [rdx + 2]0x000011b3      99             cdq0x000011b4      f7f9           idiv ecx0x000011b6      85d2           test edx, edx0x000011b8      0f85910d0000   jne 0x1f4f0x000011be      0fbe4301       movsx eax, byte [rbx + 1]    ; 2nd block0x000011c2      6bc05d         imul eax, eax, 0x5d0x000011c5      05da000000     add eax, 0xda0x000011ca      99             cdq0x000011cb      f7fe           idiv esi0x000011cd      89d0           mov eax, edx0x000011cf      6bca3c         imul ecx, edx, 0x3c0x000011d2      0fafc2         imul eax, edx0x000011d5      6bd256         imul edx, edx, 0x560x000011d8      69c0da030000   imul eax, eax, 0x3da0x000011de      8d8408540300.  lea eax, [rax + rcx + 0x354]0x000011e5      8d8a5f030000   lea ecx, [rdx + 0x35f]0x000011eb      99             cdq0x000011ec      f7f9           idiv ecx0x000011ee      85d2           test edx, edx0x000011f0      0f85590d0000   jne 0x1f4f

(Excerpt of the output by radare2 for cursed_app.elf, adapted)

We see that at the beginning of each block, just after the previous block's conditional jump instruction, the assembly reads from memory a byte at increasing offsets from the value stored in the rbx register.

Around the address 0x1f4f, we discover that at the end of all the branches is a message congratulating us on getting the correct flag. On the other hand, taking any of the conditional branches leads to another message stating that our flag is wrong. This leads us to believe that the goal is to find the string that yields the former when passed in by a file given in argv[1] as stated above.

0x00001f3b      488d3dfe1000.  lea rdi, str.Congratz__You_got_the_correct_flag0x00001f42      e8e9f0ffff     call sym.imp.puts0x00001f47      c74424040100.  mov dword [var_4h], 10x00001f4f      837c240400     cmp dword [var_4h], 00x00001f54      7421           je 0x1f770x00001f56      4881c4e00000.  add rsp, 0xe00x00001f5d      31c0           xor eax, eax0x00001f5f      5b             pop rbx0x00001f60      5d             pop rbp0x00001f61      415c           pop r120x00001f63      c3             ret...0x00001f77      488d3dea1000.  lea rdi, str.Bummer__You_got_the_wrong_flag0x00001f7e      e8adf0ffff     call sym.imp.puts0x00001f83      ebd1           jmp 0x1f56

(Excerpt of the output by radare2 for cursed_app.elf, adapted)

One way to achieve this is to utilise a fuzzer. Inspection of the blocks indicate that the test for each character of the input is independent of the others, so that instead of the whole search space of $256^N$ characters, one can find the correct string in at most $256N$ trials. However, we actually used angr, a symbolic execution tool written in Python.

pip install angr

31.3s

Cursed_App (Bash in Python)

The gist is that we can pass in a symbolic argument to a simulator that will operate on it as an unknown rather than a concrete value. At each branch, the process will fork into two separate ones, each adding a constraint opposite to the other corresponding to whether the branch was taken or not. When the execution is complete, a solver can deduce what values will cause the execution to take a certain path. This suffers from exponential growth in the number of non-nested branches, so symbolic execution engines may also use a constraint solver to determine if a path is feasible, discarding it if not.

angr has a pretty helpful feature that allows us to begin simulation in the middle of a program. This enables us to skip past the cumbersome calls to reading a file and directly inject our symbolic variable into memory. Furthermore, we can also prune paths from execution, which in this case would be those that pass through address 0x1f4f. Naturally, we can also choose to terminate the search before execution normally finishes. We choose this to be the address 0x1f3b, which is after all the conditional checks have passed.

# Constants discovered during static analysis using r2 (or any other decompiler)# Our .text (code) section will be loaded at offset 0x400000start = 0x40117f    # Skip past the file reading stufftarget = 0x401f3b   # Earliest point(s) where we know we've succeededavoid = 0x401f4f    # Place(s) where we know we've failed

0.2s

Cursed_App (Python)

(angr has the capability to prune and terminate execution based on any information obtainable from the state rather than just addresses. However, this suffices for our purpose.)

One downside to beginning execution in the middle is that we have to set up the state manually to what the program expects or it will fail to work correctly. In this case it was easy to inspect the assembly and determine that the only register used which does not have its value initialised after the start address is rbx. It is assigned in the instruction mov rbx, rax at address 0x115c, following the call to malloc. A few instructions down, this is used as the first argument to fread, so it means that our input string should be loaded into the address at rbx. It does not actually matter what the value of rbx is as long as it matches with our input—indeed, it might even differ between different executions.

We now have all the information needed to reverse engineer the flag.

import angr# We require no library functionsproj=angr.Project(cursed_app.elf
,auto_load_libs=False)myaddr = 0x600000   # Arbitrary address for our string# Setting up the starting statestate = proj.factory.blank_state(addr=start)chars = []  # The symbolic stringfor i in range(0x3b):   # The last reference is [rbx+0x3a]  bvs = state.solver.BVS('c{}'.format(i), 8)  # 8-bits, name arbitrary  state.memory.store(myaddr + i, bvs)  state.add_constraints(bvs >= 32)            # We only accept printable chars  state.add_constraints(bvs < 127)  chars.append(bvs)state.regs.rbx = myaddr   # Initially rbx is set to the address of our string# Begin symbolic executionimport logginglogging.getLogger('angr').setLevel('INFO')simgr = proj.factory.simulation_manager(state)simgr.explore(find=target, avoid=avoid)# Final states are in simgr.found

118.5s

Cursed_App (Python)

<SimulationMa...und, 59 avoid>

We are almost done. The final step is to solve for the given input. angr's backend for constraint solving is claripy, which we can invoke directly from the final state object through the solver property. At first, we did the simplest thing, which was to call the eval function on each character, as in the following listing.

# There is only one execution path to finish, so there should only be one state# in simgr.found (unless it failed completely), which is the one we wantif len(simgr.found) == 0:  print("Flag not found!")  exit()final = simgr.found[0]flag = b''for c in chars:  flag += final.solver.eval(c, cast_to=bytes)print(flag.decode())   # Yay?

0.6s

Cursed_App (Python)

Unfortunately, this did not give us the correct flag. It was of the correct format, but on further inspection the l33+ string we generated appeared to have spelling errors. The problem was that some characters did not have unique solutions. The sha256 hash of the correct flag was given, so we generate all possible solutions instead and print the one with the correct hash. We can do this with the eval_upto function, since we know that there are only 95 possible characters within the range (fewer if we take into account the flag format).

import hashlibimport itertoolstarget_hash = 'dd791346264fc02ddcd2466b9e8b2b2bba8057d61fbbc824d25c7957c127250d'c_possibilities = []for c in chars:  c_possibilities.append(final.solver.eval_upto(c, 95, cast_to=bytes))flag_possibilities = itertools.product(*c_possibilities)for flag in flag_possibilities:  flag = b''.join(list(flag))   # flag is originally a tuple of bytes  # In case the flag is in the middle of the string  flag = flag[flag.find(b'ASIS{'):flag.find(b'}')+1]  if hashlib.sha256(flag).hexdigest() == target_hash:    print(flag.decode())  # Yay!

83.1s

Cursed_App (Python)

And we are done!

solve.py

N.B. Initially we made the mistake of creating one large bit-vector for the entire string. It resulted in a huge drop in the execution speed. In particular, the execution of each basic block took longer over time. That was possibly due to the complexity in checking that the constraints were satisfiable at each branch.

[Reverse] Cursed App

Runtimes (1)