
Intermediate Guide to Enumerating Amazon S3 Buckets

Learn practical techniques for discovering S3 buckets using AWS CLI, DNS tricks, open-source scanners, wordlist brute-forcing, and HTTP status analysis. Gain actionable insights to assess exposure and harden defenses.

Introduction

Amazon Simple Storage Service (S3) is the backbone of countless data pipelines, static-website hosting, and backup solutions. While S3 is designed for durability and scalability, misconfigured buckets are a favorite foothold for attackers seeking to exfiltrate data or host malicious payloads. This guide walks security professionals through the most reliable enumeration methods, explains the signals you’ll see, and offers mitigation tactics.

Why does enumeration matter? An adversary who can list or probe bucket names can quickly discover assets that are unintentionally public, gain clues about an organization's naming conventions, and use that knowledge for lateral moves or credential stuffing. Real-world incidents such as the Capital One breach (2019), in which an attacker used stolen credentials to enumerate and copy data from S3 buckets, demonstrate how a simple bucket-discovery step can cascade into massive data exposure.

Prerequisites

  • Solid understanding of AWS Identity and Access Management (IAM) principals, policies, and the principle of least privilege.
  • Installed and configured AWS CLI with at least aws configure completed (access key, secret key, region).
  • Fundamental grasp of HTTP request/response cycles, status codes, and DNS resolution.
  • Linux/macOS or Windows Subsystem for Linux (WSL) environment for command-line work.

Core Concepts

Before diving into tools, internalize three core ideas:

  1. Bucket Namespace Isolation: S3 bucket names are globally unique across all AWS accounts. Once a bucket is created, no other account can claim the same name, which means an attacker can confirm that a name is taken simply by probing for it over DNS or HTTP.
  2. Permission Granularity: S3 permissions are evaluated in a layered fashion—bucket policies, ACLs, IAM policies, and public access block settings. Enumeration can surface any of these configurations.
  3. Signal vs. Noise: Not every 403 or 404 response indicates a vulnerability. Understanding the semantics of each HTTP status (e.g., 403 Forbidden vs. 404 Not Found) is essential for accurate inference.

Think of enumeration as a reconnaissance radar: the more accurate the returns, the better you can prioritize remediation.

Using AWS CLI for bucket listing (list-buckets, head-bucket)

The AWS CLI offers two primary commands for bucket discovery when you possess valid credentials:

1. aws s3api list-buckets

This call returns every bucket owned by the authenticated AWS account. It’s the fastest way to get an inventory if you have IAM permissions for s3:ListAllMyBuckets.

aws s3api list-buckets --query "Buckets[].Name" --output text

Expected output (example):

my-company-logs
static-site-assets
backup-2023-09

If you receive an AccessDenied error, the account lacks the required permission; you’ll need to pivot to other techniques.

2. aws s3api head-bucket

head-bucket performs a lightweight HEAD request against a specific bucket name. It’s useful for confirming ownership without pulling object listings.

aws s3api head-bucket --bucket static-site-assets

Successful execution returns no output (HTTP 200). Errors map to HTTP status codes:

  • 403 Forbidden – bucket exists but you lack permission.
  • 404 Not Found – bucket name does not exist.

When scripting, capture the exit code to build a rapid validation loop.

while read -r b; do
  if aws s3api head-bucket --bucket "$b" >/dev/null 2>&1; then
    echo "[+] $b exists and you have access"
  else
    echo "[-] $b not accessible"
  fi
done < wordlist.txt

Enumerating buckets via DNS (bucketname.s3.amazonaws.com)

Because bucket names are part of a public DNS namespace, you can query them directly without any AWS credentials. The pattern {bucket}.s3.amazonaws.com resolves to a CNAME pointing to an Amazon endpoint when the bucket exists.

Manual DNS lookup

dig +short static-site-assets.s3.amazonaws.com

A successful lookup returns an endpoint IP (or a CNAME chain); an NXDOMAIN response indicates the name does not resolve. Be aware, though, that S3 serves a wildcard DNS zone, so many candidate names resolve even when no bucket exists; treat resolution as a hint, not proof.

Automated enumeration script

import subprocess

def check_bucket(name):
    cmd = ["dig", "+short", f"{name}.s3.amazonaws.com"]
    result = subprocess.run(cmd, capture_output=True, text=True)
    return bool(result.stdout.strip())

if __name__ == "__main__":
    for line in open("wordlist.txt"):
        bucket = line.strip()
        if check_bucket(bucket):
            print(f"[+] {bucket} exists")
        else:
            print(f"[-] {bucket} not found")

Note that wildcard DNS and resolver caching can cause false positives; always verify with an HTTP HEAD request (see next section).
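If you would rather not shell out to dig, the same probe can be done with Python's standard socket module. A minimal sketch (the hostname pattern follows the virtual-hosted style described above; remember that wildcard DNS makes resolution a weak signal):

```python
import socket

def bucket_hostname(name: str) -> str:
    """Virtual-hosted-style hostname for a candidate bucket name."""
    return f"{name}.s3.amazonaws.com"

def resolves(hostname: str) -> bool:
    """True if the name resolves in DNS; False on NXDOMAIN or lookup failure."""
    try:
        socket.gethostbyname(hostname)
        return True
    except socket.gaierror:
        return False
```

Because of the wildcard zone, treat a True result only as permission to spend an HTTP request on the name, not as confirmation of a bucket.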

Why DNS matters

Even if a bucket has Block Public Access enabled, its DNS entry still resolves. Therefore, DNS enumeration is a low-cost first pass to generate candidate names for deeper probing.

Leveraging open-source tools (S3Scanner, Bucket Finder, CloudEnum)

Several community-maintained scanners automate the heavy lifting of wordlist generation, concurrent probing, and result parsing.

S3Scanner

git clone https://github.com/sa7mon/S3Scanner.git && cd S3Scanner
pip install -r requirements.txt
python3 s3scanner.py -b wordlist.txt -t 50

Key flags:

  • -b: path to bucket name wordlist.
  • -t: number of concurrent threads (tune for your bandwidth).
  • -r: enable “requester pays” detection.

S3Scanner categorises results as PUBLIC, PRIVATE, FORBIDDEN, or NOT_FOUND, saving them to CSV for later analysis. Flag names and output formats have changed between releases (recent versions are rewritten in Go), so check the README for the release you install.

Bucket Finder

ruby bucket_finder.rb --log-file results.txt wordlist.txt

Bucket Finder is a small Ruby script from digi.ninja that checks each wordlist entry against S3, reports whether each bucket is absent, private, or publicly listable, and can optionally download any public files it finds.

CloudEnum

git clone https://github.com/initstring/cloud_enum.git && cd cloud_enum
python3 cloud_enum.py -k example --disable-azure --disable-gcp

CloudEnum supports multi-cloud enumeration (AWS, Azure, GCP); the flags above restrict it to AWS/S3. Rather than consuming a raw wordlist, it mutates the keyword supplied with -k against a built-in mutations list (customisable with -m), performing both DNS checks and HTTP requests for each candidate.

When using these tools, always respect rate-limits and consider the legal implications of scanning external domains.

Wordlist-based bucket name brute-forcing

Many organizations adopt predictable naming schemes (e.g., env-project-service-YYYYMMDD). Crafting a targeted wordlist dramatically improves success rates.

Creating a custom wordlist

# Example: generate names for a monthly backup bucket
for env in dev prod staging; do
  for proj in analytics payments; do
    for year in {2022..2024}; do
      for month in {01..12}; do
        echo "${env}-${proj}-${year}${month}"
      done
    done
  done
done > custom_wordlist.txt
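The same candidate list can be built in Python with itertools.product, which is easier to extend with extra separators or suffixes. A sketch using the same values as the bash example (swap in your target's environments and project names):

```python
from itertools import product

envs = ["dev", "prod", "staging"]
projects = ["analytics", "payments"]
months = [f"{y}{m:02d}" for y in range(2022, 2025) for m in range(1, 13)]

# Cartesian product mirrors the nested bash loops above.
wordlist = [f"{e}-{p}-{ym}" for e, p, ym in product(envs, projects, months)]

with open("custom_wordlist.txt", "w") as fh:
    fh.write("\n".join(wordlist))
```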

Combine this with public data sources (GitHub code, CloudFormation templates) using gitrob or trufflehog to harvest additional candidates.

Brute-force with curl

while read -r bucket; do
  status=$(curl -s -o /dev/null -w "%{http_code}" "https://$bucket.s3.amazonaws.com")
  case $status in
    200) echo "[PUBLIC] $bucket";;
    403) echo "[FORBIDDEN] $bucket";;
    404) echo "[NOT_FOUND] $bucket";;
    *)   echo "[UNKNOWN $status] $bucket";;
  esac
done < custom_wordlist.txt

Parallelise with xargs -P 20 for speed, but back off if you start seeing HTTP 503 Slow Down responses, which signal that S3 is throttling you.
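In Python, a bounded thread pool gives the same parallelism with explicit control over concurrency. The sketch below accepts any per-name check function (for example, one wrapping the HTTP probe above); the lambda in the usage line is a stand-in stub, not a real probe:

```python
from concurrent.futures import ThreadPoolExecutor

def scan(names, check, workers=20):
    """Run `check` over candidate names concurrently; returns {name: verdict}."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return dict(zip(names, pool.map(check, names)))

# Usage with a stub check (replace with a real HTTP probe):
results = scan(["bucket-a", "bucket-b"], lambda name: "NOT_FOUND")
```

Keeping workers explicit makes it easy to throttle down when the target starts returning 503 Slow Down.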

Interpreting HTTP status codes for permission inference

When you issue a GET/HEAD request against a bucket, AWS returns one of several status codes. Understanding the nuance is crucial for accurate risk scoring.

Status                | Meaning                                           | Typical Implication
200 OK                | Object list or object reachable                   | Bucket is publicly readable (or you have valid credentials).
403 Forbidden         | Bucket exists but you lack permission             | Potentially private, but may be misconfigured (e.g., public-read disabled but public-write enabled).
404 Not Found         | Bucket name does not exist                        | No further action needed.
301 Moved Permanently | Bucket lives in another region                    | Read the x-amz-bucket-region response header to build the regional URL.
400 Bad Request       | Malformed request (virtual-hosted vs. path-style) | Adjust request format; not a permission issue.
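Encoded as a helper, the table above becomes easy to reuse across scripts. A sketch (the verdict labels are my own, not an AWS convention):

```python
def classify(status: int) -> str:
    """Map an S3 HTTP status code to an enumeration verdict."""
    verdicts = {
        200: "PUBLIC",                 # listing or object readable
        301: "EXISTS (wrong region)",  # follow x-amz-bucket-region
        400: "BAD_REQUEST",            # fix URL style, then retry
        403: "EXISTS (access denied)",
        404: "NOT_FOUND",
    }
    return verdicts.get(status, f"UNKNOWN ({status})")
```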

Advanced tip: combine HEAD with the Range header to test for partial object reads (useful for detecting unintentionally exposed large files).

curl -I -H "Range: bytes=0-0" "https://example-bucket.s3.amazonaws.com/sample.txt"

If the response includes Content-Range: bytes 0-0/123456, the object is readable.
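A small parser turns that header into the object's total size, which is handy when triaging many hits at once. A sketch (header format per the HTTP Content-Range grammar):

```python
import re

def total_size(content_range: str):
    """Extract the full object size from a Content-Range header, or None."""
    m = re.match(r"bytes (\d+)-(\d+)/(\d+)", content_range)
    return int(m.group(3)) if m else None
```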

Practical Examples

Scenario 1 – Auditing an AWS account you own:

  1. Run aws s3api list-buckets to capture the inventory.
  2. Pipe each bucket name into aws s3api get-bucket-acl to verify if AllUsers or AuthenticatedUsers groups are granted permissions.
aws s3api list-buckets --query "Buckets[].Name" --output text | tr '\t' '\n' | while read -r b; do
  echo "Bucket: $b"
  aws s3api get-bucket-acl --bucket "$b" --query "Grants[?Grantee.URI=='http://acs.amazonaws.com/groups/global/AllUsers']"
  echo "---"
done
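If you capture the raw get-bucket-acl JSON instead, the public-grant check is a few lines of stdlib Python. A sketch (the sample document below mirrors the ACL shape returned by the API, using the same AllUsers URI as the query above):

```python
import json

ALL_USERS = "http://acs.amazonaws.com/groups/global/AllUsers"

def public_permissions(acl_json: str):
    """Return the permissions granted to the AllUsers group, if any."""
    acl = json.loads(acl_json)
    return [g["Permission"] for g in acl.get("Grants", [])
            if g.get("Grantee", {}).get("URI") == ALL_USERS]

# Sample ACL document with a public READ grant:
sample = '''{"Grants": [{"Grantee": {"Type": "Group",
  "URI": "http://acs.amazonaws.com/groups/global/AllUsers"},
  "Permission": "READ"}]}'''
```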

Scenario 2 – Pen‑test on a target domain:

  • Gather candidate names from subdomains (e.g., app.example.com → app-example-com).
  • Run S3Scanner with -t 100 to quickly surface public buckets.
  • Validate any PUBLIC buckets with aws s3 sync s3://bucket-name ./dump (if you have credentials) or curl for unauthenticated download.

Tools & Commands

  • AWS CLI: aws s3api list-buckets, aws s3api head-bucket, aws s3 sync.
  • curl / wget: Direct HTTP HEAD/GET checks.
  • dig / nslookup: DNS existence probing.
  • S3Scanner: Automated multi‑threaded scanner.
  • Bucket Finder: Public‑source leakage aggregation.
  • CloudEnum: Multi‑cloud enumeration framework.
  • Python/Bash scripts: Custom wordlist generation and result parsing.

Sample combined command:

cat wordlist.txt | parallel -j 50 'status=$(curl -s -o /dev/null -w "%{http_code}" "https://{}.s3.amazonaws.com"); echo "{} $status"'

Defense & Mitigation

Protecting against unwanted enumeration is a mix of policy hygiene and technical safeguards.

  1. Enable Block Public Access at the account and bucket level. This prevents accidental public reads/writes.
  2. Apply Least‑Privilege IAM Policies – avoid wildcard actions like s3:* on * resources.
  3. Implement Bucket Naming Conventions that are hard to guess (include random UUID segments).
  4. Use Amazon Macie or GuardDuty to detect publicly accessible buckets.
  5. Monitor S3 Access Logs and enable CloudTrail data events for bucket‑level visibility.
  6. Rate‑limit external scans via WAF rules that block repetitive HEAD requests from the same IP.

Periodic automated audits (e.g., using aws s3api get-bucket-policy-status) help ensure compliance over time.

Common Mistakes

  • Assuming 404 means safe: a 404 only tells you the name is currently unclaimed. That is still useful intelligence: if your code, DNS records, or documentation reference the name, an attacker can register it themselves (bucket squatting). Use HeadBucket with credentials to confirm what you actually own.
  • Neglecting region‑specific endpoints: Buckets created in non‑us‑east‑1 regions require region‑specific URLs (e.g., bucket.s3.eu-west-1.amazonaws.com).
  • Over‑loading the target: Running hundreds of concurrent requests can trigger AWS throttling and may be considered a denial‑of‑service.
  • Missing ACL checks: Even if the bucket policy denies public access, an ACL granting AllUsers read permission can still expose data.
  • Storing wordlists in plain text: Wordlists often contain sensitive naming patterns; keep them secured.
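To sidestep the region-endpoint pitfall above, generate both URL styles programmatically. A sketch (endpoint patterns follow AWS's virtual-hosted and path-style conventions):

```python
def bucket_url(bucket: str, region: str = "us-east-1", path_style: bool = False) -> str:
    """Build the bucket endpoint URL for a given region and addressing style."""
    # us-east-1 uses the global endpoint; other regions are region-qualified.
    host = "s3.amazonaws.com" if region == "us-east-1" else f"s3.{region}.amazonaws.com"
    return f"https://{host}/{bucket}" if path_style else f"https://{bucket}.{host}"
```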

Real‑World Impact

Publicly exposed S3 buckets have been the root cause of data leaks affecting millions. In 2021, a misconfigured backup bucket exposed over 200 GB of proprietary source code, leading to a ransomware extortion attempt. Organizations that routinely enumerate their own buckets discover such gaps before attackers do.

Trends to watch:

  • Increasing use of “infrastructure‑as‑code” (IaC) templates that embed bucket names, providing attackers with a richer wordlist.
  • Adoption of “S3 Object Lambda” which can mask underlying objects, but misconfigurations still surface via enumeration.
  • Growth of third‑party SaaS platforms that automatically provision buckets on behalf of customers—these often inherit default permissive policies.

My experience: The most effective remediation strategy is to combine automated scans with a governance process that forces a “bucket review” before any new bucket is provisioned.

Practice Exercises

  1. Exercise 1 – Internal Audit
    • Using an IAM user with read‑only S3 permissions, list all buckets in your test account.
    • Write a Bash script that checks each bucket for public ACLs and outputs a CSV.
  2. Exercise 2 – External Recon
    • Choose a public domain (e.g., example.com) and generate a wordlist based on its subdomains.
    • Run S3Scanner against the wordlist and document any PUBLIC buckets found.
  3. Exercise 3 – Status‑Code Interpretation
    • Pick three bucket names that return 200, 403, and 404 respectively.
    • For each, capture the full HTTP response headers with curl -I and explain the security implication.

Further Reading

  • AWS Documentation – Managing S3 Bucket Permissions
  • “The 2023 AWS Security Best Practices” whitepaper
  • Project Discovery’s Subfinder for passive subdomain enumeration (useful for bucket name clues)
  • “S3 Security – A Complete Guide” by Troy Hunt (covers misconfigurations and remediation)
  • OWASP Cloud‑Native Top 10 – Section “C10: Improper Cloud Storage Configuration”

Summary

Enumerating Amazon S3 buckets blends credentialed API calls, DNS probing, and HTTP status analysis. Mastery of the AWS CLI, open‑source scanners, and custom wordlist techniques enables security teams to discover unintentionally exposed storage assets quickly. By interpreting status codes correctly, applying strict IAM and bucket policies, and institutionalising regular audits, organizations can dramatically reduce the attack surface posed by misconfigured S3 buckets.

Remember: enumeration is a reconnaissance step—not an endpoint. Pair it with robust remediation workflows and continuous monitoring to keep your cloud data safe.