Affected by GO-2023-2166 and 4 other vulnerabilities

GO-2023-2166: SpiceDB leaks information in log files when URI cannot be parsed in github.com/authzed/spicedb

GO-2024-2597: Integer overflow in chunking helper causes dispatching to miss elements or panic in github.com/authzed/spicedb

GO-2024-2716: SpiceDB: LookupSubjects may return partial results if a specific kind of relation is used in github.com/authzed/spicedb

GO-2024-2939: SpiceDB exclusions can result in no permission returned when permission expected in github.com/authzed/spicedb

GO-2024-3131: SpiceDB having multiple caveats on resources of the same type may improperly result in no permission in github.com/authzed/spicedb

crdb

package

Version: v1.21.0-rc1 (latest) | Published: May 8, 2023 | License: Apache-2.0 | Imports: 33 | Imported by: 0

Repository

github.com/authzed/spicedb

README ¶

CockroachDB Datastore

CockroachDB is a Spanner-like datastore supporting global, immediate consistency, with the mantra "no stale reads." The CockroachDB implementation should be used when your SpiceDB service runs in multiple geographic regions and Google's Cloud Spanner is unavailable (e.g. on AWS, Azure, or bare metal).

Implementation Caveats

To prevent the New Enemy Problem, related transactions must overlap. We do this by choosing a common database key and writing to that key with all relationships that may overlap. This tradeoff is cataloged in our blog post The One Crucial Difference Between Spanner and CockroachDB.

Overlap Strategies

There are three transaction overlap strategies:

insecure, which does not protect against the new enemy problem
static, which protects all writes from the new enemy problem
prefix, which protects all writes with the same object prefix from the new enemy problem

Depending on your application, insecure may be acceptable, and it avoids the performance cost associated with the static and prefix options.
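As a rough illustration of how the three strategies differ, the sketch below derives the key or keys a write would touch. It is not SpiceDB's implementation: the helper name overlapKeys, the rule that a prefix is the text before the first "/" in an object type, and the example object types are all assumptions made for illustration.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// overlapKeys is a hypothetical helper that returns the keys a write would
// touch in order to force overlap with other writes.
func overlapKeys(strategy, staticKey string, objectTypes []string) []string {
	switch strategy {
	case "static":
		// Every write touches the same key, so all writes overlap.
		return []string{staticKey}
	case "prefix":
		// Writes overlap only with writes that share an object-type prefix.
		// Assumption: the prefix is the text before the first "/".
		seen := map[string]struct{}{}
		for _, t := range objectTypes {
			if prefix, _, found := strings.Cut(t, "/"); found {
				seen[prefix] = struct{}{}
			}
		}
		keys := make([]string, 0, len(seen))
		for k := range seen {
			keys = append(keys, k)
		}
		sort.Strings(keys)
		return keys
	default:
		// "insecure": no key is touched, so writes never overlap.
		return nil
	}
}

func main() {
	types := []string{"tenant1/document", "tenant1/folder", "tenant2/document"}
	fmt.Println(overlapKeys("insecure", "key", types)) // []
	fmt.Println(overlapKeys("static", "key", types))   // [key]
	fmt.Println(overlapKeys("prefix", "key", types))   // [tenant1 tenant2]
}
```

The fewer keys that writes share, the less contention they create, which is why insecure is the cheapest option and static the most expensive.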

When is `insecure` overlap a problem?

Using the insecure overlap strategy for SpiceDB with CockroachDB means that the timestamps for two consecutive writes can be out of order. When this happens, the New Enemy Problem can occur.

Let's look at how likely this is, and what the impact might actually be for your workload.

When can timestamps be reversed?

Before we look at how this can impact an application, let's first understand when and how timestamps can be reversed in the first place.

When two writes are made in short succession against CockroachDB
And those two writes hit two different gateway nodes
And the CRDB gateway node clocks have a delta D
And the writes touch disjoint sets of relationships
And those two writes are sent within the time delta D between the gateway nodes
And the writes land in ranges whose followers are disjoint sets of nodes
And other independent cockroach processes (heartbeats, etc) haven't coincidentally synced the gateway node clocks during the writes.

Then it's possible that the second write will be assigned a timestamp earlier than the first write. In the next section we'll look at whether that matters for your application, but for now let's look at what makes the above conditions more or less likely:

Clock skew. A larger clock skew gives a bigger window in which timestamps can be reversed. But note that CRDB enforces a max offset between clocks, and getting within some fraction of that max offset will kick the node from the cluster.
Network congestion, or anything that interferes with node heartbeating. This increases the length of time that clocks can be desynchronized before Cockroach notices and syncs them back up.
Cluster size. When there are many nodes, it is more likely that a write to one range will not have follower nodes that overlap with the followers of a write to another range. It also makes it more likely that the two writes will have different gateway nodes. On the other side, a 3 node cluster with replicas: 3 means that all writes will sync clocks on all nodes.
Write rate. If the write rate is high, it's more likely that two writes will hit the conditions to have reversed timestamps. If writes only happen once every max offset period for the cluster, it's impossible for their timestamps to be reversed.

The likelihood of a timestamp reversal depends on the CockroachDB cluster and the application's usage patterns.

When does a timestamp reversal matter?

Now we know when timestamps could be reversed. But when does that matter to your application?

The TL;DR is: only when you care about the New Enemy Problem.

Let's take a look at a couple of examples of how reversed timestamps may be an issue for an application storing permissions in SpiceDB.

Neglecting ACL Update Order

Two separate WriteRelationships calls come in:

A: Alice removes Bob from the shared folder
B: Alice adds a new document not-for-bob.txt to the shared folder

The normal case is that the timestamp for A < the timestamp for B.

But if those two writes hit the conditions for a timestamp reversal, then B < A.

From Alice's perspective, there should be no time at which Bob can ever see not-for-bob.txt. She performed the first write, got a response, and then performed the second write.

But this isn't true when using MinimizeLatency or AtLeastAsFresh consistency. If Bob later performs a Check request for the not-for-bob.txt document, it's possible that SpiceDB will pick an evaluation timestamp T such that B < T < A, so that the document is in the folder and Bob is still allowed to see the contents of the folder.

Note that this is only possible if A - T < quantization window: the check has to happen soon enough after the write for A that it's possible that SpiceDB picks a timestamp in between them. The default quantization window is 5s.

Application Mitigations for ACL Update Order

This could be mitigated in your application by:

Not caring about the problem
Not allowing the write from B within the max_offset time of the CRDB cluster (or the quantization window).
Not allowing a Check on a resource within max_offset of its ACL modification (or the quantization window).

Mis-apply Old ACLs to New Content

Two separate API calls come in:

A: Alice removes Bob as a viewer of document secret
B: Alice does a FullyConsistent Check request to get a ZedToken
C: Alice stores that ZedToken (timestamp B) with the document secret when she updates it to say Bob is a fool.

Same as before, the normal case is that the timestamp for A < the timestamp for B, but if the two writes hit the conditions for a timestamp reversal, then B < A.

Bob later tries to read the document. The application performs an AtLeastAsFresh Check for Bob to access the document secret using the stored ZedToken (which is timestamp B).

It's possible that SpiceDB will pick an evaluation timestamp T such that B < T < A, so that Bob is allowed to read the newest contents of the document, and discover that Alice thinks he is a fool.

Same as before, this is only possible if A - T < quantization window: Bob's check has to happen soon enough after the write for A that it's possible that SpiceDB picks a timestamp in between A and B, and the default quantization window is 5s.

Application Mitigations for Misapplying Old ACLs

This could be mitigated in your application by:

Not caring about the problem
Waiting for max_offset (or the quantization window) before doing the fully-consistent check.

When does a timestamp reversal not matter?

There are also some cases when there is no New Enemy Problem even if there are reversed timestamps.

Non-sensitive domain

Not all authorization problems have a version of the New Enemy Problem, which relies on there being some meaningful consequence of hitting an incorrect ACL during the small window of time where it's possible.

If the worst thing that happens from out-of-order ACL updates is that some users briefly see some non-sensitive data, or that a user retains access to something that they already had access to for a few extra seconds, then even though there could still effectively be a "New Enemy Problem," it's not a meaningful problem to worry about.

Disjoint SpiceDB Graphs

The examples of the New Enemy Problem above rely on the out-of-order ACLs being part of the same permission graph. But not all ACLs are part of the same graph, for example:

definition user {}

definition blog {
    relation author: user
    permission edit = author
}

definition video {
    relation editor: user
    permission change_tags = editor
}

A: Alice is added as an author of the Blog entry new-enemy
B: Bob is removed from the editors of the spicedb.mp4 video

If these writes are given reversed timestamps, it is possible that the ACLs will be applied out-of-order, and this would normally be a New Enemy Problem. But the ACLs themselves aren't shared between any permission computations, and so there is no actual consequence to reversed timestamps.

Documentation ¶

Index ¶

Constants
func NewCRDBDatastore(url string, options ...Option) (datastore.Datastore, error)
type Option

Constants ¶

const (
	Engine = "cockroachdb"
)

Variables ¶

This section is empty.

Functions ¶

func NewCRDBDatastore ¶

func NewCRDBDatastore(url string, options ...Option) (datastore.Datastore, error)

NewCRDBDatastore initializes a SpiceDB datastore that uses a CockroachDB database while leveraging its AOST functionality.

Types ¶

type Option ¶ added in v1.0.0

type Option func(*crdbOptions)

Option provides the facility to configure how clients within the CRDB datastore interact with the running CockroachDB database.

func DisableStats ¶ added in v1.11.0

func DisableStats(disable bool) Option

DisableStats disables recording counts to the stats table

func FollowerReadDelay ¶ added in v1.2.0

func FollowerReadDelay(delay time.Duration) Option

FollowerReadDelay is the time delay to apply to enable historical reads.

This value defaults to 0 seconds.

func GCWindow ¶

func GCWindow(window time.Duration) Option

GCWindow is the maximum age of a passed revision that will be considered valid.

This value defaults to 24 hours.

func MaxRetries ¶ added in v1.0.0

func MaxRetries(maxRetries uint8) Option

MaxRetries is the maximum number of times a retriable transaction will be client-side retried. Default: 5
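A bounded client-side retry loop of this kind might look like the sketch below. The sentinel error errRetriable and the helper withRetries are hypothetical; the sketch also assumes that the limit counts retries after the initial attempt, which the documentation above does not spell out.

```go
package main

import (
	"errors"
	"fmt"
)

// errRetriable is a hypothetical sentinel marking a transaction error that
// is safe to retry on the client side.
var errRetriable = errors.New("retriable transaction error")

// withRetries runs fn once, then retries it up to maxRetries more times for
// as long as it keeps returning errRetriable.
func withRetries(maxRetries uint8, fn func() error) error {
	var err error
	for attempt := 0; attempt <= int(maxRetries); attempt++ {
		err = fn()
		if !errors.Is(err, errRetriable) {
			return err // success, or a failure that must not be retried
		}
	}
	return fmt.Errorf("max retries exceeded: %w", err)
}

func main() {
	calls := 0
	err := withRetries(5, func() error {
		calls++
		if calls < 3 {
			return errRetriable // fail the first two attempts
		}
		return nil
	})
	fmt.Println(calls, err) // 3 <nil>
}
```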

func MaxRevisionStalenessPercent ¶ added in v1.0.0

func MaxRevisionStalenessPercent(stalenessPercent float64) Option

MaxRevisionStalenessPercent is the amount of time, expressed as a percentage of the revision quantization window, that a previously computed rounded revision can still be advertised after the next rounded revision would otherwise be ready.

This value defaults to 0.1 (10%).

func OverlapKey ¶ added in v1.0.0

func OverlapKey(key string) Option

OverlapKey is a key touched on every write if OverlapStrategy is "static". Default: 'key'

func OverlapStrategy ¶ added in v1.0.0

func OverlapStrategy(strategy string) Option

OverlapStrategy is the strategy used to generate overlap keys on write: one of "insecure", "static", or "prefix". Default: 'static'

func ReadConnHealthCheckInterval ¶ added in v1.18.0

func ReadConnHealthCheckInterval(interval time.Duration) Option

ReadConnHealthCheckInterval is the frequency at which both idle and max lifetime connections are checked, and also the frequency at which the minimum number of connections is checked.

This happens asynchronously.

This is not the only approach to evaluate these counts; "connection idle/max lifetime" is also checked when connections are released to the pool.

There is no guarantee connections won't last longer than their specified idle/max lifetime. It's largely dependent on the health-check goroutine being able to pull them from the connection pool.

The health-check may not be able to clean up those connections if they are held by the application very frequently.

This value defaults to 30s.

func ReadConnMaxIdleTime ¶ added in v1.18.0

func ReadConnMaxIdleTime(idle time.Duration) Option

ReadConnMaxIdleTime is the duration after which an idle read connection will be automatically closed by the health check.

This value defaults to having no maximum.

func ReadConnMaxLifetime ¶ added in v1.18.0

func ReadConnMaxLifetime(lifetime time.Duration) Option

ReadConnMaxLifetime is the duration since creation after which a read connection will be automatically closed.

This value defaults to having no maximum.

func ReadConnMaxLifetimeJitter ¶ added in v1.19.0

func ReadConnMaxLifetimeJitter(jitter time.Duration) Option

ReadConnMaxLifetimeJitter is an interval to wait up to after the max lifetime to close the connection.

This value defaults to 20% of the max lifetime.

func ReadConnsMaxOpen ¶ added in v1.18.0

func ReadConnsMaxOpen(conns int) Option

ReadConnsMaxOpen is the maximum size of the connection pool used for reads.

This value defaults to having no maximum.

func ReadConnsMinOpen ¶ added in v1.18.0

func ReadConnsMinOpen(conns int) Option

ReadConnsMinOpen is the minimum size of the connection pool used for reads.

The health check will increase the number of connections to this amount if it had dropped below.

This value defaults to the maximum open connections.

func RevisionQuantization ¶

func RevisionQuantization(bucketSize time.Duration) Option

RevisionQuantization is the time bucket size to which advertised revisions will be rounded.

This value defaults to 5 seconds.

func SplitAtUsersetCount ¶ added in v1.5.0

func SplitAtUsersetCount(splitAtUsersetCount uint16) Option

SplitAtUsersetCount is the batch size for which userset queries will be split into smaller queries.

This defaults to 1024.
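Splitting a large set of usersets into batches of at most this size can be sketched as below. The helper chunk and its use of plain strings are assumptions for illustration and do not reflect how the datastore builds its queries.

```go
package main

import "fmt"

// chunk splits items into consecutive batches of at most size elements.
// A size of 0 is treated as "do not split".
func chunk(items []string, size uint16) [][]string {
	if size == 0 {
		return [][]string{items}
	}
	var batches [][]string
	for start := 0; start < len(items); start += int(size) {
		end := start + int(size)
		if end > len(items) {
			end = len(items) // the last batch may be smaller
		}
		batches = append(batches, items[start:end])
	}
	return batches
}

func main() {
	usersets := []string{"a", "b", "c", "d", "e"}
	fmt.Println(chunk(usersets, 2)) // [[a b] [c d] [e]]
}
```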

func WatchBufferLength ¶

func WatchBufferLength(watchBufferLength uint16) Option

WatchBufferLength is the number of entries that can be stored in the watch buffer while awaiting read by the client.

This value defaults to 128.

func WithEnablePrometheusStats ¶ added in v1.12.0

func WithEnablePrometheusStats(enablePrometheusStats bool) Option

WithEnablePrometheusStats marks whether Prometheus metrics provided by the Postgres clients being used by the datastore are enabled.

Prometheus metrics are disabled by default.

func WriteConnHealthCheckInterval ¶ added in v1.18.0

func WriteConnHealthCheckInterval(interval time.Duration) Option

WriteConnHealthCheckInterval is the frequency at which both idle and max lifetime connections are checked, and also the frequency at which the minimum number of connections is checked.

This happens asynchronously.

This is not the only approach to evaluate these counts; "connection idle/max lifetime" is also checked when connections are released to the pool.

There is no guarantee connections won't last longer than their specified idle/max lifetime. It's largely dependent on the health-check goroutine being able to pull them from the connection pool.

The health-check may not be able to clean up those connections if they are held by the application very frequently.

This value defaults to 30s.

func WriteConnMaxIdleTime ¶ added in v1.18.0

func WriteConnMaxIdleTime(idle time.Duration) Option

WriteConnMaxIdleTime is the duration after which an idle write connection will be automatically closed by the health check.

This value defaults to having no maximum.

func WriteConnMaxLifetime ¶ added in v1.18.0

func WriteConnMaxLifetime(lifetime time.Duration) Option

WriteConnMaxLifetime is the duration since creation after which a write connection will be automatically closed.

This value defaults to having no maximum.

func WriteConnMaxLifetimeJitter ¶ added in v1.19.0

func WriteConnMaxLifetimeJitter(jitter time.Duration) Option

WriteConnMaxLifetimeJitter is an interval to wait up to after the max lifetime to close the connection.

This value defaults to 20% of the max lifetime.

func WriteConnsMaxOpen ¶ added in v1.18.0

func WriteConnsMaxOpen(conns int) Option

WriteConnsMaxOpen is the maximum size of the connection pool used for writes.

This value defaults to having no maximum.

func WriteConnsMinOpen ¶ added in v1.18.0

func WriteConnsMinOpen(conns int) Option

WriteConnsMinOpen is the minimum size of the connection pool used for writes.

The health check will increase the number of connections to this amount if it had dropped below.

This value defaults to the maximum open connections.

Directories ¶

Path	Synopsis
migrations
