replds (v2)
Minimalistic filesystem-like replicated data distribution service,
with last-write-wins semantics, authentication and ACLs. It offers a
simple client to replicate locally (on the filesystem) parts of the
database, and trigger the execution of shell scripts when they change,
which makes it particularly suited for the distribution of secrets
that need to be modified by online automation.
The last-write-wins model makes it suitable for scenarios where there
is either a single writer, or the key space is completely
partitioned. The absence of transactions or other database-like
features is why it is best to think of replds as a data distribution
solution, with a familiar filesystem-like data model.
The current limitations of the internal replication protocol make it
suitable only for small-ish datasets (i.e., fewer than 100k objects, a
few GBs of total size). See the Replication semantics section below
for further details.
Building
From source
All the functionality (server, clients) is contained within the single
replds binary. It can be built from these sources, with a
sufficiently recent Go environment (>= 1.14):
go build ./cmd/replds
Running
All the functionality is available in the replds binary, which has
subcommands to run servers, clients, etc. You can run
replds help
to get a quick summary of the available commands and options.
Quick tutorial
This tutorial will set up a local replds service cluster with
multiple instances, without authentication or ACLs, and will show how
to perform some simple operations.
Let's start by running the server. We'll use the test.sh script to
start 3 instances of "replds server", on different ports, that talk to
each other. Start the script in its own terminal, then press Enter
when you're done with the tutorial to terminate the processes.
./test.sh
With this test script, the servers will listen for GRPC traffic on
ports 12100, 12200 and 12300. It makes no difference which one you
pick for your tests.
The test script does little more than start "replds server" with the
right options for the daemons to find each other.
Next, let's upload some data. To do so, we'll first create a simple
directory hierarchy locally:
mkdir data data/sub
echo foo > data/foo
echo one > data/sub/one
echo two > data/sub/two
and then let's use the "replds store" command to upload it:
replds store --server=localhost:12100 ./data /data
This will create /data/foo, /data/sub/one and /data/sub/two in
the database. You should see messages on the test.sh console telling
you that data has been written on one node, and then, slightly later,
on the other two as well.
Finally let's start a "watcher" to synchronize a subtree of this data
back to the local filesystem. Suppose we want to have a copy of the
/data/sub tree in a local directory called sub:
replds pull --store=./sub --server=localhost:12100 /data/sub
The "replds pull" program will keep running to receive incremental
updates, but the local sub directory should now have the expected
contents (files one and two).
Replication semantics
The service implements a last-write-wins conflict resolution
approach, which favors availability over consistency, to the extent
that there is no consistency guarantee whatsoever: there are no
transactions, and there is no serialization. The API contract is the
following:
- Each object in the data store (node, in internal terminology) has
a version associated with it.
- Clients are guaranteed that the version of objects they see will
always increase monotonically: that is, they will never revert to
seeing an "older" version of an object.
- The above applies to each object separately: if you store objects A
and B in a single API call, there is no guarantee that clients will
see A and B updated at the same time (though the current
implementation tries really hard to make it so).
Replds instances synchronize updates between each other using a
gossip-like protocol, with periodic data exchanges. The current
implementation of this protocol is very simple and highly
inefficient: the periodic data exchange includes a representation of
the entire contents of the data store, so bandwidth usage scales with
the number of objects in the database! Also, due to the randomness
intrinsic to the gossip protocol, update propagation delays can only
be expressed statistically, so it is best to limit the number of
instances to a small value.
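The last-write-wins merge at the heart of these data exchanges can be
sketched as follows. This is an illustrative model of the semantics
described above, not the actual replds implementation; the type and
function names are assumptions:

```go
package main

import "fmt"

// Node models a replicated object: a payload plus a version used for
// last-write-wins conflict resolution.
type Node struct {
	Value   string
	Version uint64
}

// merge applies a remote snapshot to a local store, keeping the copy
// with the higher version for each path. Versions only ever increase,
// so clients never observe an older version after a newer one.
func merge(local, remote map[string]Node) {
	for path, rn := range remote {
		if ln, ok := local[path]; !ok || rn.Version > ln.Version {
			local[path] = rn
		}
	}
}

func main() {
	local := map[string]Node{
		"/data/foo":     {Value: "old", Version: 1},
		"/data/sub/one": {Value: "one", Version: 3},
	}
	remote := map[string]Node{
		"/data/foo":     {Value: "new", Version: 2},   // newer: wins
		"/data/sub/one": {Value: "stale", Version: 2}, // older: ignored
	}
	merge(local, remote)
	fmt.Println(local["/data/foo"].Value, local["/data/sub/one"].Value)
	// prints "new one"
}
```

Note that exchanging full snapshots like this is exactly why bandwidth
scales with the number of objects in the database.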
There is no concept of consensus, so writes will always succeed: the
service relies on internal synchronization to propagate the update to
all instances. In the presence of a network partition, clients may not
see these updates until the partition is resolved. This makes it
pretty much impossible for clients to implement read / modify / write
cycles, even with a single writer, unless they maintain local state
(possibly using a Watcher).
The instances store node data on disk, but keep all of the nodes'
metadata in memory for speed, using a highly efficient radix tree
(compressed trie) data structure.
The maximum data size is limited by the inefficiency of the
synchronization protocol, and the requirement to keep all the node
metadata in memory at all times. The current implementation is also
not very smart in that it needs to hold the whole dataset (including
node data!) in memory when doing an initial synchronization pass, to
serialize it into protobufs. All this places limits on dataset size
both in terms of number of entities (say, less than 10k) and total
size of the data (a few GBs).
API
The public replds API is a GRPC API, defined in proto/replds.proto,
and consists of two RPC methods:
- Store, to upload data to the service. The "replds store" command
implements a simple command-line client for this method.
- Watch, for incremental synchronization of parts of the data
store. This is a long-lived streaming RPC that implements an
asynchronous replication protocol: the client sends a summary of
the initial state of its database, then listens for streaming
incremental updates. The "replds pull" command implements this
method.
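In protobuf IDL terms, the two methods could be sketched roughly as
follows; the service and message names here are assumptions, see
proto/replds.proto for the real definitions:

```
service ReplDS {
  // Upload a set of nodes to the service.
  rpc Store(StoreRequest) returns (StoreResponse);

  // Long-lived server-streaming RPC for incremental synchronization.
  rpc Watch(WatchRequest) returns (stream WatchResponse);
}
```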
Deployment
In order to provide high availability, you should run multiple
instances of replds server. The cluster model is static, i.e. each
instance needs to be told explicitly about the others using the
--peer command-line option.
Clients can talk to any instance, both for read and write API calls.
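As a sketch, a three-instance cluster could be started with each
instance listing the other two via --peer; the elided options (such
as the listen address) depend on your setup, and the host names here
are hypothetical:

```
# On host1 (host2 and host3 run the symmetric commands):
replds server --peer=host2:12100 --peer=host3:12100 ...
```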
Authentication
Replds supports authentication, both of clients and server instances,
using mTLS (mutually-authenticated TLS). Clients, including server
instances when they talk to other server instances, need a valid TLS
client certificate, signed by a specific CA.
Furthermore, the identity associated with the client certificate, in
the form of the Common Name (CN) part of the X509 subject, can be used
in access control lists (ACLs), to limit which clients can access
parts of the data store. In replds, an ACL consists of three parts:
- identity
- path
- operation (read / write)
When ACLs are defined, access is denied by default, and ACL rules
are used to allow specific accesses. Paths in ACL rules are prefixes:
an ACL for /foo will also allow access to /foo/bar etc.
ACLs are configured in a text file, one rule per line, with the
following format:

identity path operation

where operation can be either read or write (possibly abbreviated to
r or w), or:

identity operation

to be used with the peer operation, a special operation (which takes
no path component) that covers the internal synchronization protocol.
Pass the file containing ACL rules to "replds server" using the
--acls option. Lines that start with a # are considered comments and
ignored.
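For example, an ACL file might look like this, where the identities
and paths are hypothetical:

```
# Automation may write secrets below /secrets.
automation /secrets write
# The backup client may read the whole tree (abbreviated form).
backup / r
# Server instances use the special peer operation for synchronization.
replica peer
```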
Triggers
The "replds pull" command can execute user-defined triggers (shell
commands) when certain parts of the data store tree are updated. This
can be useful, for instance, to restart services with new credentials.
To load triggers, specify a directory with --triggers-dir. Replds
will load every file in this directory whose name does not contain a
dot (so file names should have no extension) as a JSON-encoded
trigger definition. Each file should contain a dictionary with two
attributes:
- path, specifying the path prefix that will activate this trigger;
- command, the shell command to execute.
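For example, a trigger file might contain the following (the path and
command are hypothetical):

```
{
  "path": "/data/sub",
  "command": "systemctl reload myservice"
}
```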