# Protection Against Malicious Queries
GraphQL allows the client to bundle and nest many queries into a single request. While this is quite convenient, it also makes GraphQL endpoints susceptible to Denial of Service attacks.

To mitigate this, graphql-dotnet provides a few options that set an upper bound on the nesting and complexity of incoming queries. The endpoint then only resolves queries that meet the configured criteria and rejects any overly complex, and possibly malicious, query that you don't expect your clients to make, protecting your server resources from being depleted by a Denial of Service attack.
The `GraphQL.Validation.Complexity.ComplexityConfiguration` class represents these options, which are used by `ComplexityValidationRule`. The available options are the following:
```csharp
public class ComplexityConfiguration
{
    public int? MaxDepth { get; set; }
    public int? MaxComplexity { get; set; }
    public double? FieldImpact { get; set; }
    public int MaxRecursionCount { get; set; }
}
```
The easiest way to configure complexity checks for your schema is the following:
```csharp
IServiceCollection services = ...;

services.AddGraphQL(builder => builder
    .AddSchema<ComplexitySchema>()
    .AddComplexityAnalyzer(opt => opt.MaxComplexity = 200));
```
`MaxDepth` will enforce the total maximum nesting across all queries in a request. For example, the following query has a query depth of 2. Note that fragments are taken into consideration when making these calculations.
```graphql
{
  Product { # This query has a depth of 2 which loosely translates to two distinct queries
            # to the datasource, first one to return the list of products and second
            # one (which will be executed once for each returned product) to grab
            # the product's first 3 locations.
    Title
    ...X    # The depth of this fragment is calculated first and added to the total.
  }
}

fragment X on Product { # This fragment has a depth of only 1.
  Location(first: 3) {
    lat
    long
  }
}
```
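To enforce this limit you can set `MaxDepth` through the same builder-based registration shown earlier. The sketch below uses a limit of 10 purely as an illustrative value, not a recommended default:

```csharp
IServiceCollection services = ...;

services.AddGraphQL(builder => builder
    .AddSchema<ComplexitySchema>()
    // Reject any request whose selection sets nest deeper than 10 levels
    // (illustrative value; tune it to the deepest query your clients legitimately need).
    .AddComplexityAnalyzer(opt => opt.MaxDepth = 10));
```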
The query depth setting is a good estimation of complexity for most use cases, and it loosely translates to the number of unique queries sent to the datastore (it does not, however, account for how many times each query might be executed). Keep in mind that the complexity calculation needs to be *fast*, otherwise it can impose a significant overhead.
One step further is specifying `MaxComplexity` and `FieldImpact` to look at the estimated number of entities (or cells in a database) that each query is expected to return. Obviously this depends on the size of your database (i.e. the number of records per entity), so you will need to find the average number of records per database entity and use that as your `FieldImpact`.
For example, if I have 3 tables with 100, 120 and 98 rows, and I know the first table will be queried twice as often as the others, then a good estimation for `avgImpact` is the weighted average (2 × 100 + 120 + 98) / 4 ≈ 105.
Note: I highly recommend setting an upper bound on the number of entities returned by each resolve function in your code. If you already take this approach, you can use that upper bound (the maximum possible items returned per entity) as your `avgImpact`. It is also possible to use a theoretical value (for example 2.0) to assess the query's impact on a theoretical database, thereby decoupling this calculation from your actual database.
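As a sketch, assuming you measured roughly 105 rows per entity as in the example above, the registration could look like the following. The `MaxComplexity` ceiling shown is purely illustrative; derive the real value from the most expensive query your clients legitimately need to run:

```csharp
IServiceCollection services = ...;

services.AddGraphQL(builder => builder
    .AddSchema<ComplexitySchema>()
    .AddComplexityAnalyzer(opt =>
    {
        // Average rows returned per entity, e.g. the weighted average
        // (2 * 100 + 120 + 98) / 4, which is roughly 105 in the example above.
        opt.FieldImpact = 105;
        // Illustrative ceiling only; pick a value that matches the heaviest
        // query you actually expect clients to send.
        opt.MaxComplexity = 50_000;
    }));
```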
Imagine we had a simple test database for the query in the previous example, and assume an average impact of 2.0 (each entity returns ~2 results). Then we can calculate the complexity as follows:

2 Products returned + 2 * (1 Title per Product) + 2 * [ (3 Locations) + (3 lat entries) + (3 long entries) ] = **22**
Or simply put: on average we will have 2 Products, each with 1 Title, for a total of 2 Titles. In addition, each Product entry will have 3 Locations, limited by the `first` argument (we follow Relay's spec for the `first`, `last` and `id` arguments), and each of these Locations has a lat and a long, totalling 6 Locations with 6 lats and 6 longs.
Now if we set `avgImpact` to 2.0 and set `MaxComplexity` to 23 (or higher), the query will execute correctly. If we lower `MaxComplexity` to something like 20, the `DocumentExecutor` will fail right after parsing the AST and will not attempt to resolve any of the fields (or talk to the database).
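Tying this back to configuration, a minimal sketch for the worked example, using the builder-based registration shown earlier, would set a `FieldImpact` of 2.0 and a `MaxComplexity` just above the query's score of 22:

```csharp
IServiceCollection services = ...;

services.AddGraphQL(builder => builder
    .AddSchema<ComplexitySchema>()
    .AddComplexityAnalyzer(opt =>
    {
        // Each entity is assumed to return ~2 results, as in the worked example.
        opt.FieldImpact = 2.0;
        // The sample query scores 22, so 23 lets it through; lowering this
        // to 20 would reject the request before any field resolvers run.
        opt.MaxComplexity = 23;
    }));
```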