Introduction
Amazon Simple Storage Service (S3) is the backbone of countless data pipelines, static-website hosting, and backup solutions. While S3 is designed for durability and scalability, misconfigured buckets are a favorite foothold for attackers seeking to exfiltrate data or host malicious payloads. This guide walks security professionals through the most reliable enumeration methods, explains the signals you’ll see, and offers mitigation tactics.
Why does enumeration matter? An adversary who can list or probe bucket names can quickly discover assets that are unintentionally public, learn an organization’s naming conventions, and use that knowledge for lateral movement or targeted follow-on attacks. Real-world incidents such as the Capital One breach (2019) and the Accenture S3 exposures (2017) demonstrate how exposed cloud storage can cascade into massive data loss.
Prerequisites
- Solid understanding of AWS Identity and Access Management (IAM) principals, policies, and the principle of least privilege.
- Installed and configured AWS CLI, with `aws configure` completed (access key, secret key, region).
- Fundamental grasp of HTTP request/response cycles, status codes, and DNS resolution.
- Linux/macOS or Windows Subsystem for Linux (WSL) environment for command-line work.
Core Concepts
Before diving into tools, internalize three core ideas:
- Bucket Namespace Isolation: S3 bucket names are globally unique. Once a bucket is created, no other AWS account can claim the same name, which means an attacker can confirm that a name is already taken simply by probing for it.
- Permission Granularity: S3 permissions are evaluated in a layered fashion—bucket policies, ACLs, IAM policies, and public access block settings. Enumeration can surface any of these configurations.
- Signal vs. Noise: Not every 403 or 404 response indicates a vulnerability. Understanding the semantics of each HTTP status (e.g., 403 Forbidden vs. 404 Not Found) is essential for accurate inference.
Think of enumeration as a reconnaissance radar: the more accurate the returns, the better you can prioritize remediation.
Using AWS CLI for bucket listing (list-buckets, head-bucket)
The AWS CLI offers two primary commands for bucket discovery when you possess valid credentials:
1. `aws s3api list-buckets`
This call returns every bucket owned by the authenticated AWS account. It’s the fastest way to get an inventory if you have IAM permissions for `s3:ListAllMyBuckets`.

```bash
aws s3api list-buckets --query "Buckets[].Name" --output text
```
Expected output (example):

```
my-company-logs
static-site-assets
backup-2023-09
```
If you receive an `AccessDenied` error, the account lacks the required permission; you’ll need to pivot to other techniques.
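If you prefer scripting over raw CLI calls, the inventory step can be driven from Python. This is a minimal sketch, assuming the `aws` binary is installed and configured; `names_from_json` and `list_account_buckets` are illustrative helper names, not AWS APIs:

```python
import json
import subprocess

def names_from_json(payload):
    # Extract just the bucket names from a ListBuckets JSON response.
    return [b["Name"] for b in json.loads(payload).get("Buckets", [])]

def list_account_buckets():
    # Shell out to the AWS CLI; check=True raises CalledProcessError
    # on AccessDenied, which a caller can catch to pivot.
    out = subprocess.run(
        ["aws", "s3api", "list-buckets", "--output", "json"],
        capture_output=True, text=True, check=True,
    )
    return names_from_json(out.stdout)
```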
2. `aws s3api head-bucket`
`head-bucket` performs a lightweight HEAD request against a specific bucket name. It’s useful for confirming a bucket’s existence, and your access to it, without pulling object listings.

```bash
aws s3api head-bucket --bucket static-site-assets
```
Successful execution returns no output (HTTP 200). Errors map to HTTP status codes:
- 403 Forbidden – bucket exists but you lack permission.
- 404 Not Found – bucket name does not exist.
When scripting, capture the exit code to build a rapid validation loop.
```bash
while read -r b; do
  if aws s3api head-bucket --bucket "$b" >/dev/null 2>&1; then
    echo "[+] $b exists and you have access"
  else
    echo "[-] $b not accessible"
  fi
done < wordlist.txt
```
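The same validation loop can be sketched in Python by capturing the CLI’s exit code; `interpret_exit` and `probe` are illustrative names, and the sketch assumes a configured `aws` binary on PATH:

```python
import subprocess

def interpret_exit(code):
    # head-bucket exits 0 when the bucket exists and you can reach it;
    # any non-zero exit covers both the 403 and 404 cases.
    return "accessible" if code == 0 else "not accessible"

def probe(bucket):
    # Run `aws s3api head-bucket` for one candidate name.
    result = subprocess.run(
        ["aws", "s3api", "head-bucket", "--bucket", bucket],
        capture_output=True,
    )
    return interpret_exit(result.returncode)
```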
Enumerating buckets via DNS (bucketname.s3.amazonaws.com)
Because bucket names are part of a public DNS namespace, you can query them directly without any AWS credentials. The pattern {bucket}.s3.amazonaws.com resolves to a CNAME pointing to an Amazon endpoint when the bucket exists.
Manual DNS lookup
dig +short static-site-assets.s3.amazonaws.com
A successful lookup returns an endpoint IP (or a CNAME). An NXDOMAIN response suggests the bucket name is not in use, but S3’s wildcard DNS can cause many names to resolve regardless of whether a bucket exists, so treat DNS results as hints rather than proof.
Automated enumeration script
```python
import subprocess

def check_bucket(name):
    cmd = ["dig", "+short", f"{name}.s3.amazonaws.com"]
    result = subprocess.run(cmd, capture_output=True, text=True)
    return bool(result.stdout.strip())

if __name__ == "__main__":
    with open("wordlist.txt") as wordlist:
        for line in wordlist:
            bucket = line.strip()
            if check_bucket(bucket):
                print(f"[+] {bucket} exists")
            else:
                print(f"[-] {bucket} not found")
```
Note that DNS caching can cause false positives; always verify with an HTTP HEAD request (see next section).
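Putting that advice into code, here is a stdlib-only sketch that uses DNS as a cheap pre-filter and then confirms over HTTPS; `infer_state` and `bucket_state` are illustrative names:

```python
import socket
import urllib.error
import urllib.request

def infer_state(status):
    # Map the HTTP status of an unauthenticated HEAD to a best guess.
    return {200: "public", 403: "exists (private)", 404: "not found"}.get(
        status, f"unexpected ({status})")

def bucket_state(name, timeout=5):
    host = f"{name}.s3.amazonaws.com"
    try:
        socket.gethostbyname(host)          # cheap DNS pre-filter
    except socket.gaierror:
        return "no DNS record"
    req = urllib.request.Request(f"https://{host}/", method="HEAD")
    try:
        status = urllib.request.urlopen(req, timeout=timeout).status
    except urllib.error.HTTPError as exc:   # 403/404 arrive as exceptions
        status = exc.code
    return infer_state(status)
```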
Why DNS matters
Even if a bucket has BlockPublicAccess enabled, the DNS entry still exists. Therefore, DNS enumeration is a low-cost first pass to generate candidate names for deeper probing.
Leveraging open-source tools (S3Scanner, Bucket Finder, CloudEnum)
Several community-maintained scanners automate the heavy lifting of wordlist generation, concurrent probing, and result parsing.
S3Scanner
```bash
git clone https://github.com/mazen160/s3scanner.git && cd s3scanner
pip install -r requirements.txt
python3 s3scanner.py -b wordlist.txt -t 50
```
Key flags:
- `-b`: path to bucket name wordlist.
- `-t`: number of concurrent threads (tune for your bandwidth).
- `-r`: enable “requester pays” detection.
S3Scanner categorises results as PUBLIC, PRIVATE, FORBIDDEN, or NOT_FOUND, saving them to CSV for later analysis.
Bucket Finder
```bash
pip install bucket-finder
bucket-finder -w wordlist.txt -o results.txt
```
Bucket Finder integrates with the Subfinder engine to discover bucket names leaked in public sources (GitHub, Pastebin, etc.) before performing network checks.
CloudEnum
```bash
git clone https://github.com/mazen160/cloud-enum.git && cd cloud-enum
python3 cloudenum.py -s s3 -w wordlist.txt
```
CloudEnum supports multi-cloud enumeration (AWS, Azure, GCP). For S3, it performs both DNS checks and HTTP HEAD requests, providing a concise JSON report.
When using these tools, always respect rate-limits and consider the legal implications of scanning external domains.
Wordlist-based bucket name brute-forcing
Many organizations adopt predictable naming schemes (e.g., `env-project-service-YYYYMMDD`). Crafting a targeted wordlist dramatically improves success rates.
Creating a custom wordlist
```bash
# Example: generate names for a monthly backup bucket
for env in dev prod staging; do
  for proj in analytics payments; do
    for year in {2022..2024}; do
      for month in {01..12}; do
        echo "${env}-${proj}-${year}${month}"
      done
    done
  done
done > custom_wordlist.txt
```
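The same naming scheme can be generated in Python instead of a shell loop; a minimal sketch, assuming the hypothetical env-proj-YYYYMM convention above:

```python
from itertools import product

def generate_candidates(envs, projects, years, months):
    # Yield bucket-name candidates in the env-proj-YYYYMM pattern.
    for env, proj, year, month in product(envs, projects, years, months):
        yield f"{env}-{proj}-{year}{month:02d}"

if __name__ == "__main__":
    with open("custom_wordlist.txt", "w") as out:
        for name in generate_candidates(
            ["dev", "prod", "staging"], ["analytics", "payments"],
            range(2022, 2025), range(1, 13),
        ):
            out.write(name + "\n")
```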
Combine this with public data sources (GitHub code, CloudFormation templates) using gitrob or trufflehog to harvest additional candidates.
Brute-force with curl
```bash
while read -r bucket; do
  status=$(curl -s -o /dev/null -w "%{http_code}" "https://$bucket.s3.amazonaws.com")
  case $status in
    200) echo "[PUBLIC] $bucket" ;;
    403) echo "[FORBIDDEN] $bucket" ;;
    404) echo "[NOT_FOUND] $bucket" ;;
    *)   echo "[UNKNOWN $status] $bucket" ;;
  esac
done < custom_wordlist.txt
```
Parallelise with `xargs -P 20` for speed, but back off if S3 starts returning 503 Slow Down responses, and honour any `Retry-After` header.
Interpreting HTTP status codes for permission inference
When you issue a GET/HEAD request against a bucket, AWS returns one of several status codes. Understanding the nuance is crucial for accurate risk scoring.
| Status | Meaning | Typical Implication |
|---|---|---|
| 200 OK | Object list or object reachable | Bucket is publicly readable (or you have valid credentials). |
| 403 Forbidden | Bucket exists but you lack permission | Potentially private, but may be mis‑configured (e.g., public‑read disabled but public‑write enabled). |
| 404 Not Found | Bucket name does not exist | No further action needed. |
| 301 Moved Permanently | Bucket redirected (e.g., to a region‑specific endpoint) | Follow the Location header to get the true status. |
| 400 Bad Request | Malformed request (often due to virtual‑host style vs path‑style) | Adjust request format; not a permission issue. |
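For scripting, the table above can be collapsed into a small helper; the labels mirror the table and are otherwise illustrative:

```python
def classify_response(status):
    # Interpret the HTTP status of an unauthenticated bucket probe.
    meanings = {
        200: "PUBLIC: bucket readable without credentials",
        403: "FORBIDDEN: bucket exists, access denied",
        404: "NOT_FOUND: no such bucket",
        301: "REDIRECT: retry against the region-specific endpoint",
        400: "BAD_REQUEST: adjust virtual-host vs path-style format",
    }
    return meanings.get(status, f"UNKNOWN: status {status}")
```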
Advanced tip: combine HEAD with the Range header to test for partial object reads (useful for detecting unintentionally exposed large files).
```bash
curl -I -H "Range: bytes=0-0" "https://example-bucket.s3.amazonaws.com/sample.txt"
```

If the response includes `Content-Range: bytes 0-0/123456`, the object is readable.
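Recovering the object’s full size from that header is a small parsing step; `total_size` is an illustrative helper:

```python
def total_size(content_range):
    # Parse an HTTP Content-Range header value:
    #   "bytes 0-0/123456" -> 123456
    # A "*" in the size position means the total length is unknown.
    size = content_range.rsplit("/", 1)[1]
    return None if size == "*" else int(size)
```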
Practical Examples
Scenario 1 – Auditing an AWS account you own:
- Run `aws s3api list-buckets` to capture the inventory.
- Pipe each bucket name into `aws s3api get-bucket-acl` to verify whether the `AllUsers` or `AuthenticatedUsers` groups are granted permissions.
```bash
# `--output text` tab-separates the names on one line, so split before looping
aws s3api list-buckets --query "Buckets[].Name" --output text | tr '\t' '\n' |
while read -r b; do
  echo "Bucket: $b"
  aws s3api get-bucket-acl --bucket "$b" \
    --query "Grants[?Grantee.URI=='http://acs.amazonaws.com/groups/global/AllUsers']"
  echo "---"
done
```
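The same public-grant check can be applied to the parsed JSON response in code; this sketch assumes the documented GetBucketAcl response shape (a `Grants` list whose grantees may carry a group `URI`):

```python
PUBLIC_GROUP_URIS = {
    "http://acs.amazonaws.com/groups/global/AllUsers",
    "http://acs.amazonaws.com/groups/global/AuthenticatedUsers",
}

def public_grants(acl):
    # Return any grants in a GetBucketAcl response that expose the bucket
    # to everyone (AllUsers) or to any AWS account (AuthenticatedUsers).
    return [
        g for g in acl.get("Grants", [])
        if g.get("Grantee", {}).get("URI") in PUBLIC_GROUP_URIS
    ]
```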
Scenario 2 – Pen‑test on a target domain:
- Gather candidate names from subdomains (e.g., `app.example.com` → `app-example-com`).
- Run S3Scanner with `-t 100` to quickly surface public buckets.
- Validate any `PUBLIC` buckets with `aws s3 sync s3://bucket-name ./dump` (if you have credentials) or `curl` for unauthenticated download.
Tools & Commands
- AWS CLI: `aws s3api list-buckets`, `aws s3api head-bucket`, `aws s3 sync`.
- curl / wget: Direct HTTP HEAD/GET checks.
- dig / nslookup: DNS existence probing.
- S3Scanner: Automated multi‑threaded scanner.
- Bucket Finder: Public‑source leakage aggregation.
- CloudEnum: Multi‑cloud enumeration framework.
- Python/Bash scripts: Custom wordlist generation and result parsing.
Sample combined command:
```bash
cat wordlist.txt | parallel -j 50 'status=$(curl -s -o /dev/null -w "%{http_code}" "https://{}.s3.amazonaws.com"); echo "{} $status"'
```
Defense & Mitigation
Protecting against unwanted enumeration is a mix of policy hygiene and technical safeguards.
- Enable Block Public Access at the account and bucket level. This prevents accidental public reads/writes.
- Apply Least‑Privilege IAM Policies: avoid wildcard actions like `s3:*` on `*` resources.
- Implement Bucket Naming Conventions that are hard to guess (include random UUID segments).
- Use Amazon Macie or GuardDuty to detect publicly accessible buckets.
- Monitor S3 Access Logs and enable CloudTrail data events for bucket‑level visibility.
- Rate‑limit external scans via WAF rules that block repetitive HEAD requests from the same IP (practical when the bucket is served through CloudFront).
Periodic automated audits (e.g., using `aws s3api get-bucket-policy-status`) help ensure compliance over time.
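That audit is easy to script as well; this helper assumes the documented GetBucketPolicyStatus response shape, where the API returns a `PolicyStatus` object with an `IsPublic` boolean:

```python
def policy_is_public(response):
    # GetBucketPolicyStatus responses look like:
    #   {"PolicyStatus": {"IsPublic": true}}
    # Default to False when the key is absent.
    return bool(response.get("PolicyStatus", {}).get("IsPublic", False))
```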
Common Mistakes
- Assuming 404 means safe: a 404 can also stem from probing the wrong endpoint or request style rather than a truly nonexistent bucket. Verify with DNS, or use `HeadBucket` when you have credentials.
- Neglecting region‑specific endpoints: buckets created outside us‑east‑1 may require region‑specific URLs (e.g., `bucket.s3.eu-west-1.amazonaws.com`).
- Over‑loading the target: running hundreds of concurrent requests can trigger AWS throttling and may be considered a denial‑of‑service.
- Missing ACL checks: even when the bucket policy grants no public access, a legacy ACL granting `AllUsers` read permission can still expose data unless Block Public Access is enabled.
- Storing wordlists in plain text: wordlists often contain sensitive naming patterns; keep them secured.
Real‑World Impact
Publicly exposed S3 buckets have been the root cause of data leaks affecting millions. In 2021, a misconfigured backup bucket exposed over 200 GB of proprietary source code, leading to a ransomware extortion attempt. Organizations that routinely enumerate their own buckets discover such gaps before attackers do.
Trends to watch:
- Increasing use of “infrastructure‑as‑code” (IaC) templates that embed bucket names, providing attackers with a richer wordlist.
- Adoption of “S3 Object Lambda” which can mask underlying objects, but misconfigurations still surface via enumeration.
- Growth of third‑party SaaS platforms that automatically provision buckets on behalf of customers—these often inherit default permissive policies.
My experience: The most effective remediation strategy is to combine automated scans with a governance process that forces a “bucket review” before any new bucket is provisioned.
Practice Exercises
- Exercise 1 – Internal Audit
  - Using an IAM user with read‑only S3 permissions, list all buckets in your test account.
  - Write a Bash script that checks each bucket for public ACLs and outputs a CSV.
- Exercise 2 – External Recon
  - Choose a public domain (e.g., `example.com`) and generate a wordlist based on its subdomains.
  - Run S3Scanner against the wordlist and document any `PUBLIC` buckets found.
- Exercise 3 – Status‑Code Interpretation
  - Pick three bucket names that return 200, 403, and 404 respectively.
  - For each, capture the full HTTP response headers with `curl -I` and explain the security implication.
Further Reading
- AWS Documentation – Managing S3 Bucket Permissions
- “The 2023 AWS Security Best Practices” whitepaper
- Project Discovery’s Subfinder for passive subdomain enumeration (useful for bucket name clues)
- “S3 Security – A Complete Guide” by Troy Hunt (covers misconfigurations and remediation)
- OWASP Cloud‑Native Top 10 – Section “C10: Improper Cloud Storage Configuration”
Summary
Enumerating Amazon S3 buckets blends credentialed API calls, DNS probing, and HTTP status analysis. Mastery of the AWS CLI, open‑source scanners, and custom wordlist techniques enables security teams to discover unintentionally exposed storage assets quickly. By interpreting status codes correctly, applying strict IAM and bucket policies, and institutionalising regular audits, organizations can dramatically reduce the attack surface posed by misconfigured S3 buckets.
Remember: enumeration is a reconnaissance step—not an endpoint. Pair it with robust remediation workflows and continuous monitoring to keep your cloud data safe.