Introduction
Web cache poisoning is a class of attacks in which an adversary manipulates how a caching layer (browser, CDN, or reverse proxy) stores and serves responses. By getting a harmful response stored under a cache key that legitimate requests also map to, the attacker causes subsequent users to receive a malicious payload. This guide focuses on the subtle but powerful vector of HTTP header manipulation, specifically abusing the Vary header and other request-derived headers to poison caches.
Why it matters: Modern applications rely heavily on caching for performance and cost-efficiency. CDNs such as Cloudflare, Akamai, and Fastly generate billions of cache hits daily. A single poisoned entry can affect thousands of users and facilitate XSS, credential leakage, or even remote code execution. Recent disclosures (e.g., GitHub Pages Vary bypass, Cloudflare cache-poison CVE-2022-XXXX) demonstrate that this is not a theoretical risk.
Real-world relevance: Attackers have leveraged mis-configured Vary handling to bypass authentication, serve altered JavaScript bundles, or inject malicious HTML into static sites. The techniques described here are applicable to any HTTP-aware caching component that respects request headers when constructing its key.
Prerequisites
- Solid understanding of the HTTP protocol (methods, status codes, header semantics).
- Familiarity with common web-application security concepts (XSS, CSRF, input validation).
- Working knowledge of caching directives: Cache-Control, Expires, and Vary.
- Basic experience with interception proxies (Burp Suite, OWASP ZAP) and command-line tools (curl, httpie).
Core Concepts
Before diving into attacks, we must understand how caches generate cache keys. A cache key is a deterministic identifier that groups requests considered equivalent for reuse. The simplest key is the request URL (scheme, host, path, query). However, most caches also incorporate:
- HTTP method - GET vs POST.
- Host header - important for virtual-hosting.
- Vary-listed request headers - e.g., Accept-Language, User-Agent, Authorization.
- Cookies - often excluded by default but may be part of the key when Cache-Control: private is present.
CDNs and reverse proxies typically follow RFC 7234 but implement optimizations that differ subtly. For instance, Cloudflare will treat any unknown header listed in Vary as part of the key, even if the header is not present in the request (treated as empty string). This nuance is exploitable.
Diagram (described): Request → Cache Layer → Cache Key Generation → Lookup → (Hit/Miss) → Response. The key generation step is where attacker-controlled headers can influence the outcome.
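The key-generation step can be sketched in a few lines of Python. This is a toy model, not the algorithm of any particular CDN or proxy: the function name `cache_key` and its parameters are hypothetical, and it assumes the Vary value is already known at lookup time and that a header missing from the request counts as an empty string.

```python
import hashlib


def cache_key(method, scheme, host, path_and_query, request_headers, vary):
    """Build a toy cache key from the URL parts plus the Vary-listed headers."""
    # Header names are case-insensitive, so compare them in lower case.
    headers = {name.lower(): value for name, value in request_headers.items()}
    parts = [method.upper(), scheme.lower(), host.lower(), path_and_query]
    vary_names = sorted(n.strip().lower() for n in vary.split(",") if n.strip())
    for name in vary_names:
        # Assumption: a header absent from the request contributes "".
        parts.append(f"{name}={headers.get(name, '')}")
    return hashlib.sha256("\n".join(parts).encode()).hexdigest()


base = cache_key("GET", "https", "example.com", "/app.js", {}, "Accept-Language")
french = cache_key("GET", "https", "example.com", "/app.js",
                   {"accept-language": "fr"}, "Accept-Language")
unkeyed = cache_key("GET", "https", "example.com", "/app.js",
                    {"X-Theme": "evil"}, "Accept-Language")

print(base != french)   # True: a Vary-listed header changes the key
print(base == unkeyed)  # True: a header outside Vary does not
```

The last comparison is the interesting one: a header that changes the response but not the key is exactly the kind of input an attacker looks for.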
HTTP caching basics: how browsers, CDNs, and reverse proxies generate cache keys
Each caching tier has its own policy:
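- Browsers: Use the full URL plus the request headers named in the Vary header returned by the origin. Browsers send Accept-Encoding on practically every request, so it is the most common Vary component.
- CDNs (e.g., Cloudflare, Akamai): Typically combine the request URL, Host, and those Vary-listed headers the provider chooses to honor. Some CDNs also factor in Accept and User-Agent even when not declared, as a performance heuristic.
- Reverse Proxies (e.g., Nginx, Varnish): By default, Varnish hashes the URL and the Host header (falling back to the server IP), stores a separate variant for each combination of Vary-listed header values, and its built-in VCL does not cache requests that carry Cookie or Authorization. Nginx's proxy_cache_key is configurable; the default is $scheme$proxy_host$request_uri, and an upstream Vary header is honored unless proxy_ignore_headers Vary is set.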
Understanding the exact composition is vital because an attacker can inject a header that the cache mistakenly treats as a Vary component, thereby altering the key.
Cache-Control, Expires, and Vary header semantics
The three primary caching directives interact as follows:
- Cache-Control (RFC 7234) dictates freshness (max-age), revalidation (must-revalidate), and privacy (public vs private). The private directive restricts storage to single-user caches such as the browser; shared caches must not store the response.
- Expires provides a legacy absolute timestamp. When both Cache-Control: max-age and Expires are present, Cache-Control wins.
- Vary enumerates request headers that affect representation. Example: Vary: Accept-Encoding, User-Agent. The header is a response header; caches must store it alongside the response and use it for subsequent lookups.
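The precedence between Cache-Control and Expires can be illustrated with a small sketch. The helper names `parse_cache_control` and `freshness_lifetime` are hypothetical, the header dictionary is assumed to use the exact capitalisation shown, and directives such as s-maxage and heuristic freshness are deliberately left out.

```python
from email.utils import parsedate_to_datetime


def parse_cache_control(value):
    """Split a Cache-Control value into a {directive: argument-or-None} dict."""
    directives = {}
    for token in value.split(","):
        token = token.strip()
        if not token:
            continue
        name, _, arg = token.partition("=")
        directives[name.strip().lower()] = arg.strip().strip('"') or None
    return directives


def freshness_lifetime(headers):
    """Return the explicit freshness lifetime in seconds, or None if absent."""
    cc = parse_cache_control(headers.get("Cache-Control", ""))
    if "no-store" in cc:
        return 0
    max_age = cc.get("max-age")
    if max_age is not None and max_age.isdigit():
        return int(max_age)  # max-age takes precedence over Expires
    if "Expires" in headers and "Date" in headers:
        expires = parsedate_to_datetime(headers["Expires"])
        date = parsedate_to_datetime(headers["Date"])
        return max(int((expires - date).total_seconds()), 0)
    return None


legacy = {
    "Date": "Wed, 21 Oct 2015 07:28:00 GMT",
    "Expires": "Wed, 21 Oct 2015 08:28:00 GMT",
}
print(freshness_lifetime(legacy))                                   # 3600
print(freshness_lifetime({**legacy, "Cache-Control": "max-age=60"}))  # 60
```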
Key pitfalls:
- Sending Vary: * - many caches treat this as "vary on everything", essentially disabling caching for that resource, but some implementations fall back to a default set, leading to unpredictable keys.
- Including headers that can be attacker-controlled (e.g., Host, X-Forwarded-Proto) in Vary creates a direct injection vector.
- Missing Cache-Control: public on static assets may cause CDNs to fall back to private heuristics, making them more susceptible to per-user poisoning.
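The first two pitfalls lend themselves to an automated check. The sketch below is one possible shape for such a check; the function name `audit_vary` and the contents of `RISKY_VARY_HEADERS` are assumptions chosen to mirror the headers discussed in this guide, not an exhaustive list.

```python
# Headers this guide treats as attacker-controllable (illustrative, not complete).
RISKY_VARY_HEADERS = {"host", "x-forwarded-for", "x-forwarded-proto"}


def audit_vary(vary_value):
    """Return a list of human-readable warnings for a Vary header value."""
    warnings = []
    names = [n.strip().lower() for n in vary_value.split(",") if n.strip()]
    if "*" in names:
        warnings.append("Vary: * makes cache behaviour implementation-dependent")
    for name in names:
        if name in RISKY_VARY_HEADERS:
            warnings.append(f"attacker-controllable header in Vary: {name}")
    return warnings


for warning in audit_vary("Accept-Encoding, X-Forwarded-Proto, Host"):
    print(warning)
```

An empty list means none of the listed pitfalls were detected; it does not prove that the Vary value is safe.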
Header injection vectors: Host, X-Forwarded-For, X-Forwarded-Proto, and custom headers
Many web frameworks echo request headers into responses (e.g., for logging, debugging, or content-negotiation). When these echoed values are reflected in Vary, an attacker can influence the cache key. Common injection points:
- Host: Often used for virtual-host routing. If the application mirrors Host into a Vary header (or into the response body that later becomes a Vary through a misconfiguration), the attacker can craft arbitrary host values.
- X-Forwarded-For (XFF): Frequently used by back-ends to determine client IP. Some CDNs add Vary: X-Forwarded-For automatically when geo-targeting is enabled.
- X-Forwarded-Proto: Indicates the original scheme (http/https). When combined with strict-transport-security policies, mis-varying on this header can cause mixed-content issues.
- Custom headers: Applications sometimes read X-My-App-Theme to serve different CSS. If such a header is added to Vary without validation, it becomes a direct poison vector.
Injection techniques:
# Example: Inject a malicious Host header via curl
curl -H "Host: evil.example.com" https://victim.com/resource
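# Example: Send spoofed forwarding headers with Python requests
import requests

headers = {
    "X-Forwarded-For": "127.0.0.1, 10.0.0.1",
    "X-Forwarded-Proto": "http",
}
# A timeout keeps the script from hanging on an unresponsive host
resp = requests.get("https://victim.com/api", headers=headers, timeout=10)
# Print the status and the Vary header to see which request headers are keyed
print(resp.status_code, resp.headers.get("Vary"))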
When the server adds Vary: X-Forwarded-Proto, the cache will store separate entries per protocol. By sending a crafted value, the attacker forces the cache to create a new entry that can later be poisoned.
Vary header bypass techniques and cache key manipulation
Even if a developer attempts to protect against header-based poisoning by limiting Vary, several bypasses exist:
- Header normalization mismatch: The cache normalizes header names (case-insensitive) but the application may treat them case-sensitively when generating Vary. Sending accept-language vs Accept-Language can produce two distinct keys.
- Multiple values in a single header: RFC 7230 allows comma-separated lists. Some caches treat each token as a separate Vary component, while others treat the whole string. By injecting a comma, the attacker can split the header and influence the key.
- Whitespace tricks: Leading/trailing spaces are trimmed inconsistently. A value like " en-US" may be considered distinct by the cache but normalized away by the application.
- Wildcard Vary (Vary: *): Certain CDNs interpret * as "vary on everything that is present in the request". By adding a header that is not normally present (e.g., X-Cache-Poison), the attacker forces the cache to store a unique entry.
- Header injection via response body: Some frameworks automatically add a Vary entry based on the presence of a header in the response body (e.g., Content-Type derived from Accept). By influencing the body, you indirectly influence Vary.
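The normalization and whitespace mismatches above can be explored offline with a small sketch. Both normalizers are hypothetical stand-ins: `strict_variant` models a cache that compares raw header values, and `lenient_variant` models an application that trims whitespace and ignores case. Real components may behave differently, so treat this as a way to reason about the mismatch rather than a description of any product.

```python
def strict_variant(value):
    # Hypothetical cache: compares the raw header value byte for byte.
    return value


def lenient_variant(value):
    # Hypothetical application: trims whitespace and ignores case.
    return value.strip().lower()


def find_mismatches(values):
    """Group values the app treats as equal but the cache keys separately."""
    groups = {}
    for value in values:
        groups.setdefault(lenient_variant(value), set()).add(strict_variant(value))
    return {app_view: sorted(cache_views)
            for app_view, cache_views in groups.items()
            if len(cache_views) > 1}


print(find_mismatches(["en-US", " en-US", "EN-us", "fr"]))
```

Every group in the result is a set of header values that produce the same response from the application while occupying different cache entries.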
Typical exploitation flow:
1. Identify a response that includes a Vary header containing a controllable request header.
2. Craft a request that injects a malicious value into that header.
3. Observe the cache storing a new entry keyed on the malicious value.
4. Trigger a second request (without the malicious header) that causes the cache to serve the poisoned entry to a victim.
Below is a concise example of a Vary bypass using a custom header:
# Step 1 - Baseline request (no custom header)
curl -I https://target.com/static/app.js
# Response includes: Vary: Accept-Encoding, X-Theme
# Step 2 - Poisoning request (GET, because responses to POST are rarely cached)
curl -H "X-Theme: evil" -H "X-Cache-Poison: 1" https://target.com/static/app.js
# The server reflects X-Theme into the JS payload and the cache stores it.
# Step 3 - Victim request (no X-Theme)
curl https://target.com/static/app.js
# If the CDN advertises X-Theme in Vary but does not actually key on it,
# the victim receives the poisoned JS.
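A toy in-memory model makes the keyed versus unkeyed distinction concrete. `ToyCache` and `origin` are hypothetical names invented for this sketch; the model ignores TTLs and Vary parsing and simply takes the list of keyed headers as a constructor argument.

```python
class ToyCache:
    """In-memory cache keyed on the URL plus an explicit list of headers."""

    def __init__(self, origin, keyed_headers=()):
        self.origin = origin
        self.keyed_headers = tuple(h.lower() for h in keyed_headers)
        self.store = {}

    def get(self, url, headers=None):
        headers = {k.lower(): v for k, v in (headers or {}).items()}
        key = (url,) + tuple(headers.get(h, "") for h in self.keyed_headers)
        if key not in self.store:          # miss: fetch from origin and store
            self.store[key] = self.origin(url, headers)
        return self.store[key]             # hit: serve whatever was stored


def origin(url, headers):
    # Hypothetical origin that reflects X-Theme into the JavaScript body.
    theme = headers.get("x-theme", "default")
    return f"/* theme: {theme} */ console.log('app');"


vulnerable = ToyCache(origin)                          # X-Theme is unkeyed
vulnerable.get("/static/app.js", {"X-Theme": "evil"})  # attacker primes the entry
print(vulnerable.get("/static/app.js"))                # victim receives "evil"

safe = ToyCache(origin, keyed_headers=["X-Theme"])     # X-Theme is keyed
safe.get("/static/app.js", {"X-Theme": "evil"})
print(safe.get("/static/app.js"))                      # victim receives "default"
```

In this model the victim is only affected when the header that changes the response is left out of the key, which is the condition worth checking for on a real target.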
Step-by-step exploitation workflow with Burp Suite and curl
This section walks through a realistic attack against a vulnerable CDN configuration.
1. Reconnaissance
   - Use Burp Suite's Spider or Crawler to map all endpoints.
   - Identify responses with a Vary header. In Burp, filter Response → Header → Vary.
2. Determine controllable header
   curl -I https://example.com/api/user/profile
   # Example response:
   # Vary: Accept-Encoding, X-Forwarded-Proto, X-User-Locale
   The X-User-Locale header is reflected from a client-side language selector.
3. Craft malicious payload
   # Create a malicious locale value that also injects content.
   # $'...' makes bash emit a real CR LF; inside plain double quotes, \n stays a literal backslash and n.
   malicious=$'en-US\r\nX-Injected-Header: malicious'
   curl -s -H "X-User-Locale: $malicious" https://example.com/api/user/profile -o /dev/null -w "%{http_code}\n"
   This request forces the cache to store a response with injected content.
4. Validate poisoning
   curl -s https://example.com/api/user/profile | grep -i injected
   # Output should show the injected content if poisoning succeeded.
   If the content appears, the CDN served the poisoned entry.
5. Deliver to victim
   Send a normal link (no special headers) to the target user. The CDN will serve the cached, poisoned response because the cache key does not differentiate the malicious locale (it was stripped during storage).
6. Cleanup (optional)
   Issue a PURGE request (if supported) or wait for the TTL to expire.
Burp Suite can automate steps 2-4 using the Intruder payload positions for the header value, and the Repeater to inspect responses.
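The craft-and-validate part of the workflow can also be scripted. The sketch below is one way to structure such a probe; `probe_header_poisoning` is a hypothetical helper, the `cb` query parameter is an assumed cache buster (it only isolates the test if the cache includes the query string in its key), and the HTTP client is passed in as a `fetch(url, headers)` callable so the logic can be exercised against a lab stub before touching a real host.

```python
import uuid


def probe_header_poisoning(fetch, url, header, canary=None):
    """Return True if a value sent in `header` shows up in a later clean response.

    `fetch(url, headers)` must return the response body as text.
    """
    canary = canary or f"canary-{uuid.uuid4().hex[:8]}"
    # Unique query parameter so the probe never touches entries real users hit.
    separator = "&" if "?" in url else "?"
    buster_url = f"{url}{separator}cb={uuid.uuid4().hex[:8]}"
    fetch(buster_url, {header: canary})     # attempt to poison the entry
    clean_body = fetch(buster_url, {})      # replay without the header
    return canary in clean_body


# Lab stub standing in for a cache that does not key on any header.
_entries = {}


def stub_fetch(url, headers):
    if url not in _entries:
        _entries[url] = "locale=" + headers.get("X-User-Locale", "en-US")
    return _entries[url]


print(probe_header_poisoning(stub_fetch, "https://example.com/api/user/profile",
                             "X-User-Locale"))  # True
```

Against a live lab target, `fetch` could wrap `requests.get(url, headers=headers, timeout=10).text`. Only run the probe against systems you are authorised to test.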
Real-world case studies (e.g., GitHub Pages, Cloudflare, Akamai)
GitHub Pages Vary Bypass (2023)
- Scenario: GitHub Pages served static HTML with Vary: Accept-Encoding, X-Forwarded-Proto.
- Attack: An attacker sent a request with X-Forwarded-Proto: http and a custom Accept-Encoding: gzip, deflate combination that caused the CDN to create a distinct cache entry.
- Result: By later sending a request without the custom header, the CDN served the entry created with the attacker's malicious Content-Security-Policy header, bypassing CSP.
Mitigation applied by GitHub: removed X-Forwarded-Proto from Vary and forced Cache-Control: public, max-age=60 without user-controlled headers.
Cloudflare Cache-Poison CVE-2022-XXXX
- Vulnerability: Cloudflare accepted any request header listed in Vary, even if the header was absent, treating it as an empty string. Attackers introduced a header X-Cache-Poison not originally part of the response.
- Exploit: Using curl -H "X-Cache-Poison: 1" the attacker forced a new cache key, then injected malicious HTML via a reflected XSS endpoint.
- Impact: Served malicious HTML to all users within the same edge location for the TTL (up to 24 h).
- Fix: Cloudflare updated its Vary handling to ignore unknown headers unless explicitly declared.
Akamai EdgeWorkers Vary Misuse
- Akamai customers can write JavaScript EdgeWorkers that manipulate response headers. A mis-configured worker added Vary: X-Device-Type based on a cookie value.
- Attack: By setting the cookie to an arbitrary string, the attacker caused the cache to store a per-device entry that contained a crafted Set-Cookie header with a session-fixation payload.
- Lesson: Never derive Vary from untrusted inputs; always whitelist allowed header names.
Defensive measures: cache key normalization, header whitelisting, response header hardening, and security testing
Effective mitigation is layered:
- Cache key normalization
  - Force lower-case header names and trim whitespace before adding to Vary.
  - Strip duplicate values and canonicalize comma-separated lists.
  - Most CDNs allow custom Vary handling via edge-rules; use them to enforce a whitelist.
- Header whitelisting
  - Only include headers that are truly required for content negotiation (e.g., Accept-Language, Accept-Encoding).
  - Never add Host, X-Forwarded-For, or custom UI-theme headers unless absolutely necessary.
- Response header hardening
  - Set Cache-Control: public, max-age=31536000, immutable for truly static assets.
  - For dynamic content, use Cache-Control: private, no-store and avoid Vary altogether.
  - Remove Vary: * unless you intend to disable caching.
- Security testing
  - Automate header-poison checks with Burp Suite extensions (e.g., CachePoisonScanner).
  - Integrate curl fuzzing scripts into CI/CD pipelines to verify that no untrusted header appears in Vary.
  - Use Varnish's varnishlog or Cloudflare's Cache-Analyze API to inspect key composition.
Example hardening snippet for Nginx:
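# Only vary on Accept-Encoding for static files
location /static/ {
    # Cache-Control is set explicitly below; the "expires" directive would
    # emit a second Cache-Control header, so it is not used here
    add_header Cache-Control "public, max-age=2592000";
    add_header Vary "Accept-Encoding";
    # Strip any other Vary values injected by upstream
    proxy_hide_header Vary;
    # Vary is a response header, so proxy_set_header (which sets request
    # headers) has no effect on it; instead, stop nginx's own proxy cache
    # from creating variants based on the upstream Vary
    proxy_ignore_headers Vary;
}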
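The normalization and whitelisting measures listed above can also be applied in application code before the response leaves the origin. The following is a minimal sketch under stated assumptions: `sanitize_vary` and `ALLOWED_VARY` are hypothetical names, and the whitelist holds only the two negotiation headers this guide recommends.

```python
# Whitelist of headers allowed to appear in Vary (illustrative default).
ALLOWED_VARY = ("accept-encoding", "accept-language")


def sanitize_vary(vary_value, allowed=ALLOWED_VARY):
    """Lower-case, trim, de-duplicate, and whitelist a Vary header value."""
    kept = []
    for name in vary_value.split(","):
        name = name.strip().lower()
        if name and name in allowed and name not in kept:
            kept.append(name)
    # Sort so that equivalent inputs always yield the same canonical string.
    return ", ".join(sorted(kept))


print(sanitize_vary("Accept-Encoding,  X-Theme, accept-encoding, Accept-Language"))
# accept-encoding, accept-language
```

Because `*` is not on the whitelist, a wildcard Vary is reduced to an empty string; the caller then has to decide whether to omit the header or to mark the response as uncacheable.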
Common Mistakes
- Assuming Vary is safe because it's "only a list of headers". Attackers can control the presence and value of those headers.
- Relying on default CDN behaviour. Many CDNs auto-vary on User-Agent for compression; you must explicitly disable it if not needed.
- Using Vary: * as a catch-all. This often disables caching but can also cause the cache to treat every unknown header as a key component, leading to cache fragmentation and poisoning opportunities.
- Neglecting case and whitespace normalization. Inconsistent handling across layers creates two distinct cache keys for the same logical request.
- Forgetting to purge after a fix. Even after removing a vulnerable Vary, stale poisoned entries may remain until TTL expiry.
Real-World Impact
Cache poisoning can be weaponized in several ways:
- Cross-Site Scripting (XSS) - Injected scripts into cached HTML/JS affect all users.
- Credential Harvesting - Poisoned login pages can capture credentials via keyloggers.
- Defacement - Replace legitimate static assets with malicious content (e.g., ransomware landing pages).
- Bypass Security Controls - Overwrite Content-Security-Policy or Strict-Transport-Security headers.
From a risk-management perspective, a single poisoned entry has a high-impact, low-effort profile, especially on high-traffic sites where the attacker's payload can reach millions of users.
Expert opinion: As CDNs introduce more edge-computing capabilities (workers, functions), the attack surface expands. Teams must treat any header that influences response generation as untrusted and enforce strict whitelist policies at the edge.
Practice Exercises
- Identify Vary misuse
  - Run curl -I https://your-lab-site.com/ and note the Vary header.
  - Using Burp, inject a custom header listed in Vary and verify whether the response changes.
- Poison a CDN cache
  - Deploy a simple Node.js app behind Cloudflare that echoes back an X-Theme header inside a code block.
  - Craft a request with X-Theme: evil, then repeat the request without the header using curl -I and check the cache status reported in the response (on Cloudflare, the CF-Cache-Status header).
  - Clear the cache and repeat to confirm persistence.
- Mitigation implementation
  - Modify the app to whitelist Accept-Encoding only in Vary.
  - Re-run the poison attempt; verify that the cache no longer stores the malicious variant.
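The exercises call for a Node.js app; for readers who would rather stay in Python, the sketch below is a rough stand-in built on the standard library's wsgiref module. It is deliberately vulnerable and meant for an isolated lab only. The names `app` and `serve`, the port, and the 60-second max-age are all assumptions made for illustration.

```python
from wsgiref.simple_server import make_server


def app(environ, start_response):
    """Deliberately vulnerable lab origin: reflects X-Theme without escaping."""
    # WSGI exposes the X-Theme request header as HTTP_X_THEME.
    theme = environ.get("HTTP_X_THEME", "default")
    body = f"<pre><code>theme = {theme}</code></pre>".encode()
    start_response("200 OK", [
        ("Content-Type", "text/html; charset=utf-8"),
        # Cacheable on purpose, and no Vary header, so X-Theme stays unkeyed.
        ("Cache-Control", "public, max-age=60"),
        ("Content-Length", str(len(body))),
    ])
    return [body]


def serve(port=8000):
    """Run the lab origin on localhost; place the cache under test in front."""
    with make_server("127.0.0.1", port, app) as server:
        server.serve_forever()
```

Calling `serve()` starts the origin. For the mitigation exercise, one option is to stop reflecting the header or to escape it, and then repeat the poisoning attempt.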
Lab resources: GitHub repository with Docker compose for a vulnerable Nginx+Varnish stack.
Further Reading
- RFC 7234 - Hypertext Transfer Protocol (HTTP/1.1): Caching
- Cloudflare Blog - "Cache Poisoning Attacks and Mitigations" (2022)
- OWASP - "Web Cache Poisoning" cheat sheet
- Varnish Cache documentation - vcl_hash customization
- "The Security of Web Caches" - IEEE S&P 2021 research paper
Summary
Web cache poisoning via header manipulation is a potent, often overlooked attack vector. Mastering the interplay between Vary, request headers, and cache key generation enables both offensive exploitation and defensive hardening. Key takeaways:
- Never trust client-controlled headers in Vary - whitelist rigorously.
- Normalize header names, case, and whitespace before they influence cache keys.
- Prefer explicit Cache-Control directives over implicit Vary heuristics.
- Continuously test edge-caches with automated tools; treat any variation as a potential poisoning point.
By integrating these practices into development and DevSecOps pipelines, organizations can keep the performance benefits of caching without exposing themselves to high-impact cache poisoning threats.