Dataloader Generator
Automatically generates dataloaders based on sqlc queries
Requirements
sqlc.yaml must be set up to use sqlc's sqlc-gen-json example plugin to generate a JSON manifest file with information about generated queries.
Quickstart
From the go-gallery root directory, run:
make sqlc-generate
Overview
This tool reads the manifest created by sqlc-gen-json and uses the go/types package to figure out which SQL statements can be turned into dataloaders.
- By default, all :batchone and :batchmany statements will create dataloaders
- Dataloaders can also be generated for SQL queries that don't use sqlc's :batch syntax. See Custom Batching.
A dataloader can receive and cache results from other dataloaders. This happens automatically for dataloaders that appear to look up objects by their IDs, and can be set up for other dataloaders with minimal effort. See Caching Results.
Configuration options for individual dataloaders can be set with a -- dataloader-config: comment in the sqlc queries file. For example:
-- name: GetUserByID :batchone
-- dataloader-config: maxBatchSize=10 batchTimeout=2ms publishResults=false
See Configuring Dataloaders for a full list of available options.
Generated dataloaders are aware of sqlc.embed syntax, which can be used to return multiple generated types from a single query (e.g. a coredb.Token and a coredb.Contract). Each embedded type will be sent to dataloaders that can cache objects of that type (e.g. the coredb.Token in the example above will be sent to dataloaders that can cache coredb.Token results).
It's possible for sqlc to generate parameter types that Go doesn't consider comparable. For example, a query might accept a list of Chains as a parameter, but a Go struct with a slice field (e.g. chains []Chain) is not comparable. Generated dataloaders support these non-comparable keys by converting them to JSON internally, and using their JSON strings as comparable cache keys.
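The JSON-key approach described above can be illustrated with a minimal sketch. The types and the cacheKey helper below are hypothetical stand-ins, not the generator's actual code; they only show why a JSON string works as a cache key where the struct itself cannot be a map key.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Chain and getTokensParams are hypothetical stand-ins for sqlc-generated types.
type Chain int

// getTokensParams has a slice field, so Go does not consider it comparable
// and it cannot be used directly as a map key.
type getTokensParams struct {
	OwnerID string
	Chains  []Chain
}

// cacheKey (an assumed helper name) converts a non-comparable key into a
// comparable string by encoding it as JSON.
func cacheKey(params getTokensParams) (string, error) {
	b, err := json.Marshal(params)
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	cache := map[string][]string{}

	k1, _ := cacheKey(getTokensParams{OwnerID: "user1", Chains: []Chain{0, 2}})
	cache[k1] = []string{"tokenA", "tokenB"}

	// A key with equal field values encodes to the same JSON string,
	// so the lookup below finds the cached entry.
	k2, _ := cacheKey(getTokensParams{OwnerID: "user1", Chains: []Chain{0, 2}})
	tokens, ok := cache[k2]
	fmt.Println(k1 == k2, ok, tokens)
}
```

Struct fields are encoded in declaration order, so two keys with equal field values always produce the same string.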
Running make sqlc-generate creates three files: manifest.json, dataloaders_gen.go, and api_gen.go
- manifest.json is the JSON manifest generated by the sqlc-gen-json plugin
- dataloaders_gen.go contains definitions for all the generated dataloaders
- api_gen.go contains a Loaders struct with fields for all the generated dataloaders, and sets up connections between them to cache results from one dataloader in another
"Not Found" Errors
Some sqlc queries (e.g. a :batchone or a custom dataloader) can return the pgx.ErrNoRows error, which is an implementation detail that should not be returned to callers. Instead, this error should be remapped to something more domain-specific (ErrUserNotFound, etc). This can be done by implementing the notFoundErrorProvider interface on a dataloader type, returning the appropriate error for a given key. The dataloader generator will tell you when this is required. You'll get an error like this:
type GetUserByIdBatch must implement getNotFoundError. Add this signature to notfound.go and have it return an appropriate error:
func (*GetUserByIdBatch) getNotFoundError(key persist.DBID) error {
// TODO: Return a specific error type, not pgx.ErrNoRows
}
Don't return pgx.ErrNoRows! Some existing dataloaders do this, but going forward, all new dataloaders should implement error types that don't depend on the underlying database implementation. A caller to GetUserByUsername should expect something like ErrUserNotFound, not pgx.ErrNoRows.
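As a rough illustration of the remapping described above, here is a self-contained sketch. DBID stands in for persist.DBID, GetUserByIdBatch is an empty placeholder for the generated dataloader type, and the shape of ErrUserNotFound is an assumption; the real error types in the codebase may look different.

```go
package main

import (
	"errors"
	"fmt"
)

// DBID is a simplified stand-in for persist.DBID.
type DBID string

// GetUserByIdBatch is a placeholder for the generated dataloader type.
type GetUserByIdBatch struct{}

// ErrUserNotFound is a hypothetical domain-specific error that carries the
// key that failed to resolve.
type ErrUserNotFound struct {
	UserID DBID
}

func (e ErrUserNotFound) Error() string {
	return fmt.Sprintf("user not found: id=%s", e.UserID)
}

// getNotFoundError matches the signature requested by the generator's error
// message and returns a domain error instead of pgx.ErrNoRows.
func (*GetUserByIdBatch) getNotFoundError(key DBID) error {
	return ErrUserNotFound{UserID: key}
}

func main() {
	loader := &GetUserByIdBatch{}
	err := loader.getNotFoundError("abc123")

	// Callers can match on the domain error without knowing about pgx.
	var notFound ErrUserNotFound
	fmt.Println(errors.As(err, &notFound), err)
}
```

Including the key in the error keeps log messages useful without exposing the database driver to callers.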
Caching Results
Dataloaders will attempt to publish their results for other dataloaders to cache. A dataloader can opt in to caching by implementing one of these interfaces (where TKey and TResult are the key and result types of the dataloader itself):
// Given a TResult to cache, return the TKey value to use as its cache key
type autoCacheWithKey[TKey any, TResult any] interface {
getKeyForResult(TResult) TKey
}
// Given a TResult to cache, return multiple TKey values to use as cache keys.
// The TResult value will be cached once for each provided cache key.
// Useful for things like GetGalleryByCollectionID, where the same Gallery result
// should be cached with each of its child collection IDs as keys.
type autoCacheWithKeys[TKey any, TResult any] interface {
getKeysForResult(TResult) []TKey
}
If a sqlc query appears to look up an object by its ID, the generated dataloader will automatically implement autoCacheWithKey for that object type. This happens if the dataloader has:
- a persist.DBID key type, and
- a sqlc-generated result type (e.g. a coredb.Xyz) with a persist.DBID field named ID
Because ID-based lookups are the most common caching need, it's rare to need to implement one of the autoCache interfaces manually. If the need arises, add an entry to autocache.go.
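For the rare manual case, an entry might look roughly like the sketch below. It reuses the autoCacheWithKeys interface shown earlier; DBID, Gallery, its Collections field, and the GetGalleryByCollectionIdBatch type name are simplified assumptions rather than the real generated types.

```go
package main

import "fmt"

// DBID is a simplified stand-in for persist.DBID.
type DBID string

// Gallery is a simplified stand-in for a sqlc-generated gallery type.
type Gallery struct {
	ID          DBID
	Collections []DBID
}

// autoCacheWithKeys is the interface described in this section.
type autoCacheWithKeys[TKey any, TResult any] interface {
	getKeysForResult(TResult) []TKey
}

// GetGalleryByCollectionIdBatch is a placeholder for a generated dataloader
// type; the name is assumed for illustration.
type GetGalleryByCollectionIdBatch struct{}

// getKeysForResult returns every child collection ID, so the same Gallery
// can be cached once per collection.
func (*GetGalleryByCollectionIdBatch) getKeysForResult(g Gallery) []DBID {
	return g.Collections
}

// Compile-time check that the dataloader satisfies the interface.
var _ autoCacheWithKeys[DBID, Gallery] = (*GetGalleryByCollectionIdBatch)(nil)

func main() {
	loader := &GetGalleryByCollectionIdBatch{}
	g := Gallery{ID: "gallery1", Collections: []DBID{"col1", "col2"}}

	// Simulate what a cache would do with the returned keys.
	cache := map[DBID]Gallery{}
	for _, key := range loader.getKeysForResult(g) {
		cache[key] = g
	}
	fmt.Println(len(cache), cache["col2"].ID)
}
```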
Configuring Dataloaders
Configuration options for individual dataloaders can be set with a -- dataloader-config: comment in the sqlc queries file. For example:
-- name: GetUserByID :batchone
-- dataloader-config: maxBatchSize=10 batchTimeout=2ms publishResults=false
Available options:
- maxBatchSize: the maximum number of keys to fetch in a single batched query. Defaults to 100.
- batchTimeout: the duration to wait before sending a batch (unless it reaches maxBatchSize first, at which point it will be sent immediately). Defaults to 2ms.
- publishResults: whether to publish results for other dataloaders to cache. Defaults to true.
- skip: whether to skip generating a dataloader for this query. Defaults to false.
Custom Batching
The easiest and most common way to generate dataloaders is to use sqlc's :batch syntax, which uses the Postgres batching API to send many queries to the database in a single round trip. The batching API reduces round trip overhead, but it still executes one SQL query for each provided key. In some performance-critical circumstances (e.g. routinely looking up thousands of objects by their IDs), it's better to perform a single query that returns an entire batch of results.
A dataloader will be generated for SQL statements that don't use sqlc's :batch syntax, if:
- the query uses the sqlc :many keyword
- the query returns an int column named batch_key_index
batch_key_index should be a 1-based index that maps keys to results, and is typically created via the generate_subscripts function. For example, to create a dataloader that looks up contracts by their IDs:
with keys as (
select unnest (@contract_ids::varchar[]) as id
, generate_subscripts(@contract_ids::varchar[], 1) as batch_key_index
)
select k.batch_key_index, sqlc.embed(c) from keys k
join contracts c on c.id = k.id
where not c.deleted;
This example is a good template for looking up objects by IDs via custom batching, and can be reused for other types.
Note: because the SQL query above does not have a persist.DBID key type (it uses a varchar[] array), the generated dataloader will not automatically implement autoCacheWithKey for the result type. autoCacheWithKey will need to be implemented manually.
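The role of batch_key_index in this section can be sketched as follows: each returned row carries the 1-based position of the key it belongs to, and keys with no matching row (for example, deleted contracts filtered out by the query) end up with a "not found" error. The row and contract types, errNotFound, and mapRowsToKeys are hypothetical; the generated code may handle this differently.

```go
package main

import (
	"errors"
	"fmt"
)

// contract is a simplified stand-in for coredb.Contract.
type contract struct {
	ID   string
	Name string
}

// row is a stand-in for the sqlc-generated row type of the query above.
type row struct {
	BatchKeyIndex int32
	Contract      contract
}

// errNotFound is a placeholder; real code would use a domain-specific error.
var errNotFound = errors.New("contract not found")

// mapRowsToKeys places each row's result at the position of the key it
// belongs to. Keys that received no row keep errNotFound.
func mapRowsToKeys(keys []string, rows []row) ([]contract, []error) {
	results := make([]contract, len(keys))
	errs := make([]error, len(keys))
	for i := range errs {
		errs[i] = errNotFound
	}

	for _, r := range rows {
		i := int(r.BatchKeyIndex) - 1 // batch_key_index is 1-based
		if i < 0 || i >= len(keys) {
			continue // ignore rows that point outside the key list
		}
		results[i] = r.Contract
		errs[i] = nil
	}
	return results, errs
}

func main() {
	keys := []string{"c1", "c2", "c3"}

	// The query returned rows for the first and third keys only.
	rows := []row{
		{BatchKeyIndex: 3, Contract: contract{ID: "c3", Name: "Third"}},
		{BatchKeyIndex: 1, Contract: contract{ID: "c1", Name: "First"}},
	}

	results, errs := mapRowsToKeys(keys, rows)
	for i, key := range keys {
		fmt.Println(key, results[i].Name, errs[i])
	}
}
```

Because results are placed by index rather than by arrival order, the query does not need an ORDER BY clause for the mapping to be correct.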