DataLoader
GraphQL.NET includes an implementation of Facebook's DataLoader within the
GraphQL.DataLoader
NuGet package.
Consider a GraphQL query like this:
{
orders(date: "2017-01-01") {
orderId
date
user {
userId
firstName
lastName
}
}
}
When the query is executed, first a list of orders is fetched. Then for each order, the associated user must also be fetched. If each user is fetched one-by-one, this would get more inefficient as the number of orders (N) grows. This is known as the N+1 problem. If there are 50 orders (N = 50), 51 separate requests would be made to load this data.
A DataLoader helps in two ways:
- Similar operations are batched together. This can make fetching data over a network much more efficient.
- Fetched values are cached so if they are requested again, the cached value is returned.
In the example above, a using a DataLoader will allow us to batch together all of the requests for the users. So there would be 1 request to retrieve the list of orders and 1 request to load all users associated with those orders. This would always be a total of 2 requests rather than N+1.
Setup
- Register
IDataLoaderContextAccessor
in your IoC container. - Register
DataLoaderDocumentListener
in your IoC container.
services.AddSingleton<IDataLoaderContextAccessor, DataLoaderContextAccessor>();
services.AddSingleton<DataLoaderDocumentListener>();
- Hook up your GraphQL schema to your IoC container.
public class MySchema : Schema
{
public MySchema(IServiceProvider services) : base(services)
{
}
}
services.AddSingleton<MySchema>();
- Add the
DataLoaderDocumentListener
to theDocumentExecuter
.
var listener = Services.GetRequiredService<DataLoaderDocumentListener>();
var executer = new DocumentExecuter();
var result = executer.ExecuteAsync(opts => {
...
opts.Listeners.Add(listener);
});
Usage
First, inject the IDataLoaderContextAccessor
into your GraphQL type class.
Then use the Context
property on the accessor to get the current DataLoaderContext
. The DataLoaderDocumentListener
configured above ensures that each request will have its own context instance.
Use one of the GetOrAdd*Loader
methods on the DataLoaderContext
. These methods all require a string key to uniquely identify each loader. They also require a delegate for fetching the data. Each method will get an existing loader or add a new one, identified by the string key. Each method has various overloads to support different ways to load and map data with the keys.
Call LoadAsync()
on the data loader. This will queue the request and return a IDataLoaderResult<T>
. If the result has already been cached, the returned value will be pulled from the cache.
The ExecutionStrategy
will dispatch queued data loaders after all other pending fields have been resolved.
If your code requires an asynchronous call prior to queuing the data loader, use the ResolveAsync
field builder method to return a
Task<IDataLoaderResult<T>>
. The execution strategy will start executing the asynchronous code as soon as the field resolver executes.
Once the IDataLoaderResult<T>
is retrieved from the asynchronous task, the data loader will be queued to be dispatched once all
other pending fields have been resolved.
To execute code within the resolver after the data loader has retrieved the data, pass a delegate to the Then
extension
method of the returned IDataLoaderResult<T>
. You can use a synchronous or asynchronous delegate, and it can return another
IDataLoaderResult<T>
if you wish to chain dataloaders together. This may result in the field builder's Resolve delegate
signature looking like IDataLoaderResult<IDataLoaderResult<T>>
, which is correct and will be handled properly by the execution strategy.
GetOrAddBatchLoader vs. GetOrAddCollectionBatchLoader
The decision about whether to use the GetOrAddBatchLoader
or GetOrAddCollectionBatchLoader
method should be based on whether each key can link to only one or to multiple values:
- use
GetOrAddBatchLoader
for one-to-one or many-to-one relationships, i.e. when each key links to a single value; think for example of resolving the user who made an order: each order maps to one (and only one) user; - use
GetOrAddCollectionBatchLoader
for one-to-many or many-to-many relationships, i.e. when each key links to multiple values; think for example of resolving orders for a user: each user may map to (zero or) more orders.
Note that what matters here is the cardinality of the relation as seen from the resolver's perspective. Think for example of orders and products. In theory, a product can appear in multiple orders and an order can contain multiple products, so this is a many-to-many relationship. However, a resolver for Product.Orders
(or alternatively, Order.Products
) will only need to care for the one-to-many side of the relation, and thus use the GetOrAddBatchCollectionLoader
.
The same applies to one-to-many (or many-to-one) relationships. A resolver on the "one" side of the relation will need to use GetOrAddBatchCollectionLoader
, while a resolver on the "many" side will need to use GetOrAddBatchLoader
.
Examples
Below are some example implementations of a DataLoader for different use cases given the following schema:
type Order {
User: User! # many-to-one
}
type User {
Orders: [Order!]! # one-to-many
}
One-to-one or many-to-one relationships (Order.User
)
This is an example of using a DataLoader to batch requests for loading a single item by a key. Since each order belongs to exactly one user, from the perspective of an order this is a one-to-one relationship, so we should use GetOrAddBatchLoader
.
Below, LoadAsync()
is called by the field resolver for each Order. IUsersStore.GetUsersByIdAsync()
will be called with the batch of userIds that were requested.
public class OrderType : ObjectGraphType<Order>
{
// Inject the IDataLoaderContextAccessor to access the current DataLoaderContext
public OrderType(IDataLoaderContextAccessor accessor, IUsersStore users)
{
...
Field<UserType, User>()
.Name("User")
.ResolveAsync(context =>
{
// Get or add a batch loader with the key "GetUsersById"
// The loader will call GetUsersByIdAsync for each batch of keys
var loader = accessor.Context.GetOrAddBatchLoader<int, User>("GetUsersById", users.GetUsersByIdAsync);
// Add this UserId to the pending keys to fetch
// The execution strategy will trigger the data loader to fetch the data via GetUsersByIdAsync() at the
// appropriate time, and the field will be resolved with an instance of User once GetUsersByIdAsync()
// returns with the batched results
return loader.LoadAsync(context.Source.UserId);
});
}
}
public class UserStore : IUsersStore
{
// This will be called by the loader for all pending keys
// Note that fetch delegates can accept a CancellationToken parameter or not
Task<IDictionary<int, User>> GetUsersByIdAsync(IEnumerable<int> userIds, CancellationToken cancellationToken)
{
var users = await ... // load data from database
return users
.ToDictionary(x => x.UserId);
}
}
One-to-many or many-to-many relationships (User.Orders
)
This is an example of using a DataLoader to batch requests for loading a collection of items by a key. This is used when a key may be associated with more than one item. Since each user can have (zero or) more orders, from the perspective of a user this is a one-to-many relationship, so we should use GetOrAddCollectionBatchLoader
.
LoadAsync()
is called by the field resolver for each User. IOrdersStore.GetOrdersByUserIdAsync
will be called with a batch of userIds that have been requested.
public class UserType : ObjectGraphType<User>
{
// Inject the IDataLoaderContextAccessor to access the current DataLoaderContext
public UserType(IDataLoaderContextAccessor accessor, IOrdersStore orders)
{
...
Field<ListGraphType<OrderType>, IEnumerable<Order>>()
.Name("Orders")
.ResolveAsync(ctx =>
{
// Get or add a collection batch loader with the key "GetOrdersByUserId"
// The loader will call GetOrdersByUserIdAsync with a batch of keys
var ordersLoader = accessor.Context.GetOrAddCollectionBatchLoader<int, Order>("GetOrdersByUserId",
orders.GetOrdersByUserIdAsync);
// Add this UserId to the pending keys to fetch data for
// The execution strategy will trigger the data loader to fetch the data via GetOrdersByUserId() at the
// appropriate time, and the field will be resolved with an instance of IEnumerable<Order> once
// GetOrdersByUserId() returns with the batched results
return ordersLoader.LoadAsync(ctx.Source.UserId);
});
}
}
public class OrdersStore : IOrdersStore
{
public async Task<ILookup<int, Order>> GetOrdersByUserIdAsync(IEnumerable<int> userIds)
{
var orders = await ... // load data from database
return orders
.ToLookup(x => x.UserId);
}
}
No batching
This is an example of using a DataLoader without batching. This could be useful if the data may be requested multiple times. The result will be cached the first time. Subsequent calls to LoadAsync()
will return the cached result.
public class QueryType : ObjectGraphType
{
// Inject the IDataLoaderContextAccessor to access the current DataLoaderContext
public QueryType(IDataLoaderContextAccessor accessor, IUsersStore users)
{
Field<ListGraphType<UserType>, IEnumerable<User>>()
.Name("Users")
.Description("Get all Users")
.ResolveAsync(ctx =>
{
// Get or add a loader with the key "GetAllUsers"
var loader = accessor.Context.GetOrAddLoader("GetAllUsers",
() => users.GetAllUsersAsync());
// Prepare the load operation
// If the result is cached, a completed Task<IEnumerable<User>> will be returned
return loader.LoadAsync();
});
}
}
public interface IUsersStore
{
Task<IEnumerable<User>> GetAllUsersAsync();
}
Chained data loaders
This is an example of using two chained DataLoaders to batch requests together, with asynchronous code before the data loaders execute, and post-processing afterwards.
public class UserType : ObjectGraphType<User>
{
// Inject the IDataLoaderContextAccessor to access the current DataLoaderContext
public UserType(IDataLoaderContextAccessor accessor, IUsersStore users, IOrdersStore orders, IItemsStore items)
{
...
Field<ListGraphType<ItemType>, IEnumerable<Item>>()
.Name("OrderedItems")
.ResolveAsync(async context =>
{
// Asynchronously authenticate
var valid = await users.CanViewOrders(context.Source.UserId);
if (!valid) return null;
// Get or add a collection batch loader with the key "GetOrdersByUserId"
// The loader will call GetOrdersByUserIdAsync with a batch of keys
var ordersLoader = accessor.Context.GetOrAddCollectionBatchLoader<int, Order>("GetOrdersByUserId",
orders.GetOrdersByUserIdAsync);
var ordersResult = ordersLoader.LoadAsync(context.Source.UserId);
// Once the orders have been retrieved by the first data loader, feed the order IDs into
// the second data loader
return ordersResult.Then((orders, cancellationToken) =>
{
// Collect all of the order IDs
var orderIds = orders.Select(o => o.Id);
// Get or add a collection batch loader with the key "GetItemsByOrderId"
// The loader will call GetItemsByOrderId with a batch of keys
var itemsLoader = accessor.Context.GetOrAddCollectionBatchLoader<int, Item>("GetItemsByOrderId",
items.GetItemsByOrderIdAsync);
var itemsResults = itemsLoader.LoadAsync(orderIds);
// itemsResults is of type IDataLoaderResult<IEnumerable<Item>[]> so the array needs to be flattened
// before returning it back to the query
return itemsResults.Then(itemResultSet =>
{
// Flatten the results after the second dataloader has finished
return flattenedResults = itemResultSet.SelectMany(x => x);
});
});
});
}
}
public interface IUsersStore
{
// This will be called for each call to OrderedItems, prior to any data loader execution
Task<bool> CanViewOrders(int userId);
}
public interface IOrdersStore
{
// This will be called by the "order" loader for all pending keys
// Note that fetch delegates can accept a CancellationToken parameter or not
Task<ILookup<int, Order>> GetOrdersByUserIdAsync(IEnumerable<int> userIds, CancellationToken cancellationToken);
}
public interface IItemsStore
{
// This will be called by the "item" loader for all pending keys
// Note that fetch delegates can accept a CancellationToken parameter or not
Task<ILookup<int, Item>> GetItemsByOrderIdAsync(IEnumerable<int> orderIds, CancellationToken cancellationToken);
}
See this blog series for an in depth example using Entity Framework.
Exceptions
Exceptions within data loaders' fetch delegates are passed back to the execution strategy for all associated fields.
If you have a need to capture exceptions raised by the fetch delegate, create a new SimpleDataLoader<T>
within
your field resolver (do not use the IDataLoaderContextAccessor
for this) and have its fetch delegate await the
IDataLoaderResult<T>.GetResultAsync
method of the result obtained from the first data loader within a try/catch
block. Return the result of the simple data loader's LoadAsync()
function to the field resolver. The data loader
will still load at the appropriate time, and you can handle exceptions as desired.
DI-based data loaders
The above instructions describe how to use the data loader context and accessor classes to create data loaders scoped to the current request. You can also use dependency injection to register a data loader instance. This can eliminate duplicated code if you call the same data loader from different field resolvers. It can also help to prevent unforseen bugs due to a data loader fetch delegate capturing variables from a field resolver's scope.
To create a custom and register a custom data loader instance, first create a class and inherit DataLoaderBase<TKey, T>
.
Override the FetchAsync
method with the code to retrieve the data based on the provided keys. Call SetResult
on
each provided DataLoaderPair
to set the result. Feel free to use dependency injection to rely on any scoped services
necessary to facilitate execution of the fetch method. See below sample:
public class Order
{
public int Id { get; set; }
public string ShipToName { get; set; }
}
public class OrderItem
{
public int Id { get; set; }
public int OrderId { get; set; }
public string ItemName { get; set; }
}
// similar to BatchDataLoader
public class MyOrderDataLoader : DataLoaderBase<int, Order>
{
private readonly MyDbContext _dbContext;
public MyOrderDataLoader(MyDbContext dataContext)
{
_dbContext = dataContext;
}
protected override async Task FetchAsync(IEnumerable<DataLoaderPair<int, Order>> list, CancellationToken cancellationToken)
{
IEnumerable<int> ids = list.Select(pair => pair.Key);
IDictionary<int, Order> data = await _dbContext.Orders.Where(order => ids.Contains(order.Id)).ToDictionaryAsync(x => x.Id, cancellationToken);
foreach (DataLoaderPair<int, Order> entry in list)
{
entry.SetResult(data.TryGetValue(entry.Key, out var order) ? order : null);
}
}
}
// similar to CollectionBatchDataLoader
public class MyOrderItemsDataLoader : DataLoaderBase<int, IEnumerable<OrderItem>>
{
private readonly MyDbContext _dbContext;
public MyOrderItemsDataLoader(MyDbContext dataContext)
{
_dbContext = dataContext;
}
protected override async Task FetchAsync(IEnumerable<DataLoaderPair<int, IEnumerable<OrderItem>>> list, CancellationToken cancellationToken)
{
IEnumerable<int> ids = list.Select(pair => pair.Key);
IEnumerable<OrderItem> data = await _dbContext.OrderItems.Where(orderItem => ids.Contains(orderItem.OrderId)).ToListAsync(cancellationToken);
ILookup<int, OrderItem> dataLookup = data.ToLookup(x => x.OrderId);
foreach (DataLoaderPair<int, IEnumerable<OrderItem>> entry in list)
{
entry.SetResult(dataLookup[entry.Key]);
}
}
}
You will need to register the data loader as a scoped service within your DI framework.
services.AddScoped<MyOrderDataLoader>();
services.AddScoped<MyOrderItemsDataLoader>();
Then within your field resolvers, access the data loader via the RequestServices
property and call LoadAsync
as before:
public class MyQuery : ObjectGraphType
{
public MyQuery()
{
Field<OrderType, Order>("Order")
.Argument<IdGraphType>("id")
.ResolveAsync(context =>
{
// Get the custom data loader
var loader = context.RequestServices.GetRequiredService<MyOrderDataLoader>();
// Add this UserId to the pending keys to fetch.
// The execution strategy will trigger the data loader to fetch the data via MyOrderDataLoader.FetchAsync() at the
// appropriate time, and the field will be resolved with an instance of Order once FetchAsync()
// returns with the batched results
return loader.LoadAsync(context.GetArgument<int>("id"));
});
}
}
public class OrderType : ObjectGraphType<Order>
{
public OrderType()
{
Field(x => x.Id, type: typeof(IdGraphType));
Field(x => x.ShipToName);
Field<ListGraphType<OrderItemType>, IEnumerable<OrderItem>>("Items")
.ResolveAsync(context =>
{
var loader = context.RequestServices.GetRequiredService<MyOrderItemsDataLoader>();
return loader.LoadAsync(context.Source.Id);
});
}
}
You do not need to use IDataLoaderContextAccessor
or DataLoaderDocumentListener
and may remove those references
from your code.
You may also use the resolver builder feature of the GraphQL.MicrosoftDI
package,
as shown in the below example:
public class MyQuery : ObjectGraphType
{
public MyQuery()
{
Field<OrderType, Order>("Order")
.Argument<NonNullGraphType<IdGraphType>>("id")
.Resolve()
.WithService<MyOrderDataLoader>()
.ResolveAsync((context, loader) =>
{
return loader.LoadAsync(context.GetArgument<int>("id"));
});
}
}
Note that if you attempt to create a service scope via WithScope()
for a scoped
data loader, each data loaded entry will exist in its own service scope, and none
of the entries will be batch loaded. However, you can use a singleton data loader,
creating a service scope for the load operation, as shown below.
Singleton DI-based data loader instances
If you wish to register the data loader as a singleton, be sure to disable caching by calling : base(false)
in
the constructor, as the cache entries never expire. You will also need to be sure your code does not rely on any
scoped services, or create a dedicated service scope within the fetch method as shown below.
public class MyOrderDataLoader : DataLoaderBase<int, Order>
{
private readonly IServiceProvider _rootServiceProvider;
public MyOrderDataLoader(IServiceProvider serviceProvider) : base(false)
{
_rootServiceProvider = serviceProvider;
}
protected override async Task FetchAsync(IEnumerable<DataLoaderPair<int, Order>> list, CancellationToken cancellationToken)
{
using (var scope = _rootServiceProvider.CreateScope())
{
MyDbContext dbContext = scope.ServiceProvider.GetRequiredService<MyDbContext>();
IEnumerable<int> ids = list.Select(pair => pair.Key);
IDictionary<int, Order> data = await dbContext.Orders.Where(order => ids.Contains(order.Id)).ToDictionaryAsync(x => x.Id, cancellationToken);
foreach (DataLoaderPair<int, Order> entry in list)
{
entry.SetResult(data.TryGetValue(entry.Key, out var order) ? order : null);
}
}
}
}
As a singleton, you can pull the singleton instance into your graphtype class in its constructor.
public class MyQuery : ObjectGraphType
{
public MyQuery(MyOrderDataLoader loader)
{
Field<OrderType, Order>("Order")
.Argument<IdGraphType>("id")
.ResolveAsync(context =>
{
return loader.LoadAsync(context.GetArgument<int>("id"));
});
}
}
Adding a global cache
Data loaders will, by default, cache values returned for a given key for the lifetime of a request. You can change
the fetch method of your data loader to use a global cache. The below sample demonstrates changes required to a singleton
DI-based data loader as shown immediately above, using the Microsoft.Extensions.Caching.Memory
NuGet package.
public class MyOrderDataLoader : DataLoaderBase<int, Order>
{
private readonly IServiceProvider _rootServiceProvider;
private readonly IMemoryCache _memoryCache;
private readonly MemoryCacheEntryOptions _memoryCacheEntryOptions;
private const string CACHE_PREFIX = "ORDER_";
public MyOrderDataLoader(IServiceProvider serviceProvider, IMemoryCache memoryCache) : base(false)
{
_rootServiceProvider = serviceProvider;
_memoryCache = memoryCache;
_memoryCacheEntryOptions = new MemoryCacheEntryOptions
{
// specify a maximum lifetime of 5 minutes
AbsoluteExpirationRelativeToNow = TimeSpan.FromMinutes(5),
// set so that the size of the cache can be limited
Size = 1,
};
}
protected override async Task FetchAsync(IEnumerable<DataLoaderPair<int, Order>> list, CancellationToken cancellationToken)
{
// create a list of keys that are not in the cache
var unMatched = new List<DataLoaderPair<int, Order>>(list.Count());
// attempt to match any keys possible from the global cache
foreach (var entry in list)
{
if (_memoryCache.TryGetValue(CACHE_PREFIX + entry.Key, out var value))
{
entry.SetResult((Order)value);
}
else
{
unMatched.Add(entry);
}
}
// process the unmatched keys as usual
list = unMatched;
using (var scope = _rootServiceProvider.CreateScope())
{
var dbContext = scope.ServiceProvider.GetRequiredService<MyDbContext>();
IEnumerable<int> ids = list.Select(pair => pair.Key);
IDictionary<int, Order> data = await dbContext.Orders.Where(order => ids.Contains(order.Id)).ToDictionaryAsync(x => x.Id, cancellationToken);
foreach (DataLoaderPair<int, Order> entry in list)
{
if (data.TryGetValue(entry.Key, out var order))
{
// only save the entry in the cache if it was found in the database
_memoryCache.Set(CACHE_PREFIX + entry.Key, order, _memoryCacheEntryOptions);
entry.SetResult(order);
}
else
{
entry.SetResult(null);
}
}
}
}
}
// also, register the memory cache in your DI configuration
// limit cache to 10,000 entries
services.AddSingleton<IMemoryCache>(_ => new MemoryCache(new MemoryCacheOptions { SizeLimit = 10000 }));