Disclaimer: This article is for education and authorized security testing only. Reverse engineering may be restricted by license agreements (EULAs) or local law. Only analyze binaries you own, that you are explicitly authorized to assess, or that are provided in legitimate training environments such as CTFs and licensed labs.
Introduction / Overview
Almost every reverse engineering or exploitation task on Linux eventually drops you into a window full of mov, call, and lea. If those mnemonics look like noise, the rest of the tooling — GDB, Ghidra, IDA — stays opaque. This primer gives you the minimum viable mental model of x86-64 assembly: the registers, the System V calling convention, the stack, and how to read it all live with GDB disassembly.
The goal is not to make you write assembly by hand, but to let you read a function and immediately answer: where are the arguments, where is the return value, and what is on the stack?
How it works / Background
Registers
x86-64 has 16 general-purpose 64-bit registers. Each has narrower aliases: rax (64-bit), eax (32-bit), ax (16-bit), al (8-bit). Writing to a 32-bit alias (e.g. eax) zero-extends into the full 64-bit register — a frequent source of confusion.
| Register | Conventional role |
|---|---|
| rax | Return value / accumulator |
| rbx | Callee-saved general purpose |
| rcx, rdx | Args 4 and 3, scratch |
| rsi, rdi | Args 2 and 1, scratch |
| rbp | Frame (base) pointer |
| rsp | Stack pointer |
| r8–r11 | Scratch (r8/r9 are args 5/6) |
| r12–r15 | Callee-saved |
| rip | Instruction pointer |
| rflags | Status flags (ZF, CF, SF, OF) |
Calling convention (System V AMD64 ABI)
On Linux/macOS, integer and pointer arguments to a function go in registers in this exact order:
rdi, rsi, rdx, rcx, r8, r9
Additional arguments are pushed onto the stack (right to left). The return value comes back in rax (and rdx for 128-bit returns). Floating-point arguments use xmm0–xmm7. Crucially, the caller must preserve rax, rcx, rdx, rsi, rdi, r8–r11 if it needs them, while the callee must preserve rbx, rbp, r12–r15.
A useful detail: when calling a variadic function (like printf), the caller sets al to an upper bound on the number of vector (xmm) registers used for the arguments.
The stack
The stack grows downward (toward lower addresses). rsp always points at the top. A standard function prologue/epilogue looks like:
push rbp ; save caller's frame pointer
mov rbp, rsp ; establish new frame
sub rsp, 0x20 ; reserve 32 bytes for locals
; ... function body ...
leave ; mov rsp, rbp ; pop rbp
ret ; pop return address into rip
The call instruction pushes the return address onto the stack before jumping; ret pops it back into rip. This return address on the stack is exactly what a classic stack buffer overflow overwrites.
Prerequisites / Lab setup
You need a Linux box (a VM is fine), GCC, and GDB. The pwndbg extension makes GDB dramatically more readable for RE work.
sudo apt update
sudo apt install -y gcc gdb gdb-multiarch
git clone https://github.com/pwndbg/pwndbg
cd pwndbg && ./setup.sh
Create a tiny target so we can watch the calling convention in action:
// target.c
#include <stdio.h>

long add3(long a, long b, long c) {
    long sum = a + b + c;
    return sum;
}

int main(void) {
    long r = add3(0x10, 0x20, 0x30);
    printf("result = %ld\n", r);
    return 0;
}
Compile without optimization so the prologue and stack frame stay visible:
gcc -O0 -fno-stack-protector -no-pie -g target.c -o target
-no-pie gives fixed addresses (easier to read), and -fno-stack-protector removes the canary so the frame is uncluttered for learning. In real analysis you keep these protections; we disable them here only to expose the raw mechanics.
Walkthrough / PoC
Disassemble add3 statically first:
objdump -d -M intel target | grep -A 15 '<add3>:'
You'll see something close to:
<add3>:
push rbp
mov rbp,rsp
mov QWORD PTR [rbp-0x18],rdi ; store arg a
mov QWORD PTR [rbp-0x20],rsi ; store arg b
mov QWORD PTR [rbp-0x28],rdx ; store arg c
mov rax,QWORD PTR [rbp-0x18]
mov rdx,QWORD PTR [rbp-0x20]
add rax,rdx
add rax,QWORD PTR [rbp-0x28]
mov QWORD PTR [rbp-0x8],rax ; sum
mov rax,QWORD PTR [rbp-0x8]
pop rbp
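ret
Notice that the arguments arrive in rdi, rsi, rdx exactly as the ABI promises, and the result leaves in rax. Unlike the generic prologue shown earlier, there is no sub rsp here: add3 calls nothing, so the compiler keeps its locals in the 128-byte red zone below rsp that the ABI reserves for leaf functions, and a plain pop rbp replaces leave.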
Now go dynamic with GDB. Launch it from the shell:
gdb -q ./target
Then, at the (gdb) prompt, set Intel syntax and break on add3:
set disassembly-flavor intel
break add3
run
When the breakpoint hits, inspect the argument registers and the stack:
info registers rdi rsi rdx
x/4gx $rsp
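disassemble
rdi should read 0x10, rsi 0x20, and rdx 0x30: the three integer arguments. Because the binary carries debug info, break add3 stops after the prologue, so push rbp has already run. In the x/4gx $rsp output (four giant words, 8 bytes each, in hex), the first value is main's saved rbp and the second is the saved return address pointing back into main.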
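Step through to watch rax get built and confirm the return value. The count below fits the listing above and stops short of the ret, so the function is still live for the next step; if your compiler emitted a different sequence, repeat nexti until disassemble shows the current-instruction arrow on pop rbp:
nexti 6
print/x $rax
rax should read 0x60 (0x10 + 0x20 + 0x30).
To follow the call/return mechanics, set a breakpoint on the ret and examine rip vs. the value on the stack. The offset of ret depends on your compiler and flags, so read it from the <+N> column of the disassemble output and substitute it for 0x2e below: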
break *(add3+0x2e)
continue
x/gx $rsp
stepi
print/x $rip
After the ret, rip equals the value that was sitting at $rsp: that is the return-address pop in action. Understanding this single fact is the foundation of ROP and stack overflow exploitation.
[Mermaid diagram: one full call cycle]
One full call cycle runs as follows: arguments are loaded into registers, call pushes the return address, the callee sets up its frame, the computation accumulates into rax, and ret restores rip to the caller.
Detection & Defense (Blue Team)
Reading assembly is offensive groundwork, but the same primitives are what defenders harden. Mitigations should be weighted at least as heavily as the offensive technique.
- Stack canaries (-fstack-protector-strong): GCC/Clang insert a random guard value between locals and the saved return address. An overflow that reaches the return address corrupts the canary first, and __stack_chk_fail aborts the process. Build production binaries with -fstack-protector-strong or -fstack-protector-all.
- NX / DEP: Mark the stack non-executable so injected shellcode on the stack cannot run. Verify with readelf -l ./target | grep -A1 GNU_STACK (the flags print on the line after the segment name); they should be RW (no E).
- ASLR + PIE: Compile with -pie (default on modern distros) and keep kernel.randomize_va_space=2 so leaked addresses are not stable across runs. Check with cat /proc/sys/kernel/randomize_va_space.
- CFI / Intel CET: Control-flow Enforcement Technology (shadow stack + IBT) detects tampered return addresses in hardware. Build with -fcf-protection=full on supported toolchains.
- Detection: Monitor for repeated crashes (SIGSEGV / SIGABRT) on a service, a hallmark of exploit brute-forcing against ASLR, via the kernel audit log or coredumpctl. Map this activity to MITRE ATT&CK T1203 (Exploitation for Client Execution) and T1055 (Process Injection). Defensive RE workflows often pair GDB with static tools like Ghidra to triage suspicious binaries before they reach production.
- Compiler hardening audits: Run checksec --file=./target (from pwntools or checksec.sh) in CI to fail builds that ship without canaries, NX, RELRO, or PIE.
checksec --file=./target
readelf -l ./target | grep -A1 GNU_STACK
Conclusion
You now have the core reverse engineering vocabulary: the register set and its aliases, the System V argument order (rdi, rsi, rdx, rcx, r8, r9 in, rax out), how stack frames are laid out, how the call/ret pair moves rip, and how to confirm all of it live with GDB disassembly. With this model, decompiler output and exploit write-ups stop being magic. Next, practice on real CTF binaries and step into printf to see variadic conventions, then move on to control-flow hijacking once the return-address mechanic is second nature.
References
- System V AMD64 ABI (official PSABI): https://gitlab.com/x86-psABIs/x86-64-ABI
- Intel 64 and IA-32 Architectures Software Developer Manuals: https://www.intel.com/sdm
- GDB Documentation: https://sourceware.org/gdb/current/onlinedocs/gdb/
- pwndbg: https://github.com/pwndbg/pwndbg
- HackTricks – Reversing & Binary Exploitation: https://book.hacktricks.xyz/
- MITRE ATT&CK T1203 – Exploitation for Client Execution: https://attack.mitre.org/techniques/T1203/
- MITRE ATT&CK T1055 – Process Injection: https://attack.mitre.org/techniques/T1055/


