README ¶
Draupnir
Odin laid upon the pyre the gold ring called Draupnir; this quality attended it: that every ninth night there fell from it eight gold rings of equal weight.
Draupnir is a tool that provides on-demand Postgres databases with preloaded data.
Looking to use Draupnir? Please check the developer handbook
Development
Prerequisites:
- Go
- Postgresql
Install dep if you want to add/remove dependencies
brew install dep
Create the database
createdb draupnir
Migrate the database
make migrate
Development (Vagrant VM)
It will often be desirable to run a full virtual machine, with btrfs, in order to test the complete Draupnir flow. This can be achieved via the included Vagrant configuration.
Install prerequisites:
brew cask install virtualbox vagrant
Build the Linux binary:
make build-linux
Boot Vagrant VM:
vagrant up
Login and use Draupnir:
vagrant ssh
$ sudo su -
# eval $(draupnir new)
# psql -d myapp
After making changes to the code, to restart the server:
make build-linux && vagrant up --provision
Tests
To run the unit tests:
make test
To run the integration tests:
make deb && test-integration
Usage
Draupnir provides an API to create, use and manage instances of your database. There are two API resources: Images and Instances. An Image is a database backup that you upload to Draupnir. Instances are lightweight copies of a particular Image that you can create, use and destroy with ease. We'll walk through the basics of using Draupnir. The full API reference is at the bottom of this document.
Creating an Image
Create a new Image by POST
ing to /images
, providing a timestamp for the
backup and an anonymisation script that will be run against the backup. You can
use this to remove any sensitive data from your backup before serving it to
users.
POST /images HTTP/1.1
Content-Type: application/json
Draupnir-Version: 1.0.0
Authorization: Bearer 123
{
"data": {
"type": "images",
"attributes": {
"backed_up_at": "2017-05-01T12:00:00Z",
"anonymisation_script": "\c my_db\nDELETE FROM secret_tokens;"
}
}
}
201 Created
{
"data": {
"type": "images",
"id": 1,
"attributes": {
"backed_up_at": "2017-05-01T12:00:00Z",
"created_at": "2017-05-01T15:00:00Z",
"updated_at": "2017-05-01T15:00:00Z",
"ready": false
}
}
}
Uploading an Image
Once you've created an Image, you can upload it. This is done by scp
ing a
tarball of the database data directory to Draupnir. The upload is authenticated
with an ssh key which you'll create when setting up Draupnir.
scp -i key.pem db_backup.tar.gz upload@my-draupnir.tld:/draupnir/image_uploads/1
Once you've uploaded the backup, inform Draupnir that you're ready to finalise the image. This may take some time, as Draupnir will spin up Postgres and run the anonymisation script.
POST /images/1/done HTTP/1.1
Content-Type: application/json
Draupnir-Version: 1.0.0
Authorization: Bearer 123
200 OK
{
...
}
Creating Instances
Now you've got an image, you can create instances of it. The process for this is very simple.
POST /instances HTTP/1.1
Content-Type: application/json
Draupnir-Version: 1.0.0
Authorization: Bearer 123
{
"data": {
"type": "instances",
"attributes": {
"image_id": 1
}
}
}
201 Created
{
"data": {
"type": "instances",
"id": 1,
"attributes": {
"created_at": "2017-05-01T16:00:00Z",
"updated_at": "2017-05-01T16:00:00Z",
"image_id": 1,
"port": "5678"
}
}
}
You now have a Postgres server up and running, containing a copy of your database. You can connect to it like you would any other database.
PGHOST=my-draupnir.tld PGPORT=5678 psql my-db
You can make any modifications to this database and they won't affect the original backup. When you're done, just destroy the instance.
DELETE /instances/1
Authorization: Bearer 123
Draupnir-Version: 1.0.0
204 No Content
You can create as many instances of a particular image as you want, without worrying about disk space. Draupnir will only consume disk space for new data that you write to your instances.
Configuration
When draupnir boots it looks for a config file at /etc/draupnir/config.toml
.
This file must specify all required configuration variables in order for
Draupnir to boot. The variables are as follows:
Field | Required | Description |
---|---|---|
database_url |
True | A postgresql connection URI for draupnir's internal database. |
data_path |
True | The path to draupnir's data directory, where all images and instances will be stored. |
environment |
True | The environment. This can be any value, but if it is set to "test", draupnir will use a stubbed authentication client which allows all requests specifying an access token of the-integration-access-token . This is intended for integration tests - don't use it in production. The environment will be included in all log messages. |
shared_secret |
True | A hardcoded access token that can be used by automated scripts which can't authenticate via OAuth. At GoCardless we use this to automatically create new images. |
trusted_user_email_domain |
True | The domain under which users are considered "trusted". This is draupnir's rudimentary form of authentication: if a user athenticates via OAuth and their email address is under this domain, they will be allowed to use the service. This domain must start with a @ , e.g. @gocardless.com . |
public_hostname |
True | The hostname that will be set as PGHOST. This is configurable as it may be different to the hostname of the API address that clients communicate with. |
sentry_dsn |
False | The DSN for your Sentry project, if you're using Sentry. |
clean_interval |
True | The interval at which Draupnir checks and removes any instance associated with a user that no longer has a valid refresh token. Valid values are a sequence of digits followed by a unit, such as "30m", "6h". See time.ParseDuration. |
min_instance_port |
True | The minimum port number (inclusive) that may be used when creating a Draupnir instance. |
max_instance_port |
True | The maximum port number (exclusive) that may be used when creating a Draupnir instance. |
enable_ip_whitelisting |
False | Whether to enable the IP whitelisting module. |
whitelist_reconcile_interval |
False | If IP whitelisting is enabled, this is the interval at which Draupnir reconciles the IP address whitelist with what's in iptables, in order to clean up incorrect state. Uses the same format as clean_interval . |
use_x_forwarded_for |
False | Whether to use the X-Forwarded-For header when determining the real user IP address. See documentation. |
trusted_proxy_cidrs |
False | A list of CIDRs that will match your load balancer IP addresses. Example: ["10.32.0.0/16"] . See documentation. |
http.listen_address |
False | The address and port that the HTTPS server will bind to. |
http.insecure_listen_address |
False | The address and port that the HTTP server will bind to. |
http.tls_certificate |
False | The path to the TLS certificate file that the HTTPS server will use. |
http.tls_private_key |
False | The path to the TLS private key that the HTTPS server will use. |
oauth.redirect_url |
True | The redirect URL for the OAuth flow. |
oauth.client_id |
True | The OAuth client ID. |
oauth.client_secret |
True | The OAuth client secret. |
For a complete example of this file, see spec/fixtures/config.toml
.
CLI
Draupnir ships as a single binary which can be used to run the server or use as a client to manage your instances.
The CLI has built-in help (draupnir help
). For help on sub-commands, use an invocation
like draupnir images help
instead of draupnir help images
.
Authenticate
draupnir authenticate
List Images
draupnir images list
Create an instance of Image 3
draupnir instances create 3
Connect to instance 4
eval $(draupnir env 4)
psql
Destroy instance 4
draupnir instances destroy 4
API
The Draupnir API roughly follows the JSON API spec, with a few deviations.
The only supported Content-Type
is application/json
. Authentication is
required for most API endpoints and is provided in the form of an access token
in the Authorization
header.
The API also requires a Draupnir-Version
header to be set. This version must
be exactly equal to the version of Draupnir serving the API. The CLI and server
are distributed as one, and share a version number. We enforce equality here as
a conservative measure to ensure that the CLI and API can interoperate
seamlessly. In the future we might relax this constraint.
Images
List Images
GET /images HTTP/1.1
Content-Type: application/json
Draupnir-Version: 1.0.0
Authorization: Bearer 123
200 OK
{
"data": [
{
"type": "images",
"attributes": {
"backed_up_at": "2017-05-01T12:00:00Z",
"anonymisation_script": "\c my_db\nDELETE FROM secret_tokens;"
}
}
]
}
Get Image
GET /images/1 HTTP/1.1
Content-Type: application/json
Draupnir-Version: 1.0.0
Authorization: Bearer 123
200 OK
{
"data": {
"type": "images",
"attributes": {
"backed_up_at": "2017-05-01T12:00:00Z",
"anonymisation_script": "\c my_db\nDELETE FROM secret_tokens;"
}
}
}
Create Image
POST /images HTTP/1.1
Content-Type: application/json
Draupnir-Version: 1.0.0
Authorization: Bearer 123
{
"data": {
"type": "images",
"attributes": {
"backed_up_at": "2017-05-01T12:00:00Z",
"anonymisation_script": "\c my_db\nDELETE FROM secret_tokens;"
}
}
}
201 Created
{
"data": {
"type": "images",
"id": 1,
"attributes": {
"backed_up_at": "2017-05-01T12:00:00Z",
"created_at": "2017-05-01T15:00:00Z",
"updated_at": "2017-05-01T15:00:00Z",
"ready": false
}
}
}
Finalise Image
POST /images/1/done HTTP/1.1
Content-Type: application/json
Draupnir-Version: 1.0.0
Authorization: Bearer 123
200 OK
{
"data": {
"type": "images",
"id": 1,
"attributes": {
"backed_up_at": "2017-05-01T12:00:00Z",
"created_at": "2017-05-01T15:00:00Z",
"updated_at": "2017-05-01T15:01:00Z",
"ready": true
}
}
}
Destroy Image
DELETE /images/1
Authorization: Bearer 123
204 No Content
Instances
List Instances
GET /instances HTTP/1.1
Content-Type: application/json
Draupnir-Version: 1.0.0
Authorization: Bearer 123
200 Ok
{
"data": [
{
"type": "instances",
"id": 1,
"attributes": {
"created_at": "2017-05-01T16:00:00Z",
"updated_at": "2017-05-01T16:00:00Z",
"image_id": 1,
"port": "5678"
}
}
]
}
Get Instance
GET /instances HTTP/1.1
Content-Type: application/json
Draupnir-Version: 1.0.0
Authorization: Bearer 123
200 Ok
{
"data": {
"type": "instances",
"id": 1,
"attributes": {
"created_at": "2017-05-01T16:00:00Z",
"updated_at": "2017-05-01T16:00:00Z",
"image_id": 1,
"port": "5678"
}
}
}
Create Instance
POST /instances HTTP/1.1
Content-Type: application/json
Draupnir-Version: 1.0.0
Authorization: Bearer 123
{
"data": {
"type": "instances",
"attributes": {
"image_id": 1
}
}
}
201 Created
{
"data": {
"type": "instances",
"id": 1,
"attributes": {
"created_at": "2017-05-01T16:00:00Z",
"updated_at": "2017-05-01T16:00:00Z",
"image_id": 1,
"port": "5678"
}
}
}
Destroy Instance
DELETE /instances/1 HTTP/1.1
Draupnir-Version: 1.0.0
Authorization: Bearer 123
204 No Content
Internal Architecture
Draupnir is basically two things: a manager for BTRFS
volumes and a supervisor of PostgreSQL processes.
Each image is stored in its own BTRFS subvolume, and instances are created by
creating a snapshot of the image's subvolume, and booting a Postgres instance in
it. In order to do this, Draupnir requires read-write access to a disk formatted
with BTRFS. The path to this disk is specified at runtime by the DRAUPNIR_DATA_PATH
environment variable.
The whole process looks like this (assuming DRAUPNIR_DATA_PATH=/draupnir
):
- An image is created via the API (
POST /images
). This creates a record in Draupnir's internal database and an empty subvolume is created at/draupnir/image_uploads/1
(where1
is the image ID). The user may specify an anonymisation script to be run on the data before it is made available. At this point, the image is marked as "not ready", meaning it cannot be used to create instances. - A PostgreSQL backup, in the form of a tar file, is pushed into the server
over SCP. The ssh credentials for this operation are set when the machine is
provisioned, via the chef cookbook.
The backup is pushed directly into
/draupnir/image_uploads/1
. - The image is finalised via the API (
POST /images/1/done
). This indicates to Draupnir that the backup has completed and no more data needs to be pushed. Draupnir prepares the directory so Postgres will boot from it, and runs the anonymisation script. For more detail on this step seecmd/draupnir-finalise-image
. Finally, Draupnir will create a BTRFS snapshot of the subvolume at/draupnir/image_snapshots/1
. This snapshot is read-only and ensures that the image will not change from now on. At this point Draupnir marks the image as "ready", meaning that instances can be created from it. - A user creates an instance from this image via the API (
POST /instances
). First, draupnir creates a corresponding record in its database. Then it will take a further snapshot of the image:/draupnir/image_snapshots/1 -> /draupnir/instances/1
(where1
is the instance ID). It will start a Postgres process, setting the data directory to/draupnir/instances/1
and binding it to a random port (which we persist in the database as part of the instance). - The instance is now running and can accept external connections (the port
range used for instances is exposed via an iptables rule in the cookbook).
The user can connect to the instance as if it were any other database, simply
by specifying the host (whatever server Draupnir is running on), the port
(serialised in the API) and valid user credentials. We expect that the user
already knows the credentials for a user in their database, or alternatively
they can use the
postgres
user which we create (with no password) as part of step 3. - The user destroys the instance via the API (
DELETE /instances/1
). Draupnir stops the Postgres process for that instance and deletes the snapshot/draupnir/instances/1
. - The image is destroyed via an API call (
DELETE /images
). All instances of this image are destroyed as per step 6, and then the image is destroyed by removing the directories/draupnir/image_snapshots/1
and/draupnir/image_uploads/1
.
All interaction with BTFS and Postgres is done via a collection of small shell
scripts in the cmd
directory - read them if you want to know more.
Right now modifications to images (creation, finalisation, deletion) are restricted to a single "upload" user, who authenticates with the API via a shared secret.
Security model
Draupnir has been designed to be deployed on a publicly-accessible instance, but restrict access to the potentially sensitive data in the Draupnir images to authorised users only.
API access
Access to the API is secured via Google OAuth. A user must have a valid token in order to create, retrieve or destroy a Draupnir instance.
Connecting to Draupnir Postgres instances
Access to a Draupnir Postgres instance is secured via a client-authenticated TLS
connection.
The client certificate and key are served via the API and then
stored in a secure temporary location on the client machine. The paths to these
files are then used to set PGSSLCERT
and PGSSLKEY
.
Additionally, PGSSLMODE
is set to verify-ca
, meaning that the Postgres
client will only attempt to connect via TLS, and will also only successfully
connect to the instance if it provides the expected CA certificate.
On the server side, when an image is finalised the pg_hba.conf
file is setup
so that the only method of access is client-authenticated TLS, and this
therefore propagates to every instance created from the image.
This property ensures that even if a user was to login as a Postgres superuser
on their instance and set a blank password for a given database user, then still
nobody would be able to connect without a valid client certificate and key.
Each instance has a unique CA, server and client certificate, all generated at creation time, meaning that certificates and keys cannot be reused across instances and that once the instance is created the locally-stored credentials are useless. Given that an instance's details (and therefore credentials) can only be retrieved by the user that created that instance, it also means that only the owning user has access to connect to the instance.
IP address whitelisting
Draupnir provides the ability to dynamically whitelist user IP addresses to further secure the Postgres instances that it creates, protecting them from automated scans and attacks.
This is achieved via iptables rules. If this component is enabled
(enable_ip_whitelisting = true
in the server config) then the Draupnir
daemon will maintain an iptables chain named DRAUPNIR-WHITELIST
. It is
therefore the administrator's responsibility to provision other iptables
rules that reference this chain.
When a user creates an instance, or retrieves the details of one of their own instances, this chain will be populated with a rule that allows new connections to the Postgres instance port, from their IP address only. The rule will be removed as soon as the instance is destroyed.
An example configuration is provided below. This will work in configurations
where the default policy of the INPUT
chain is ACCEPT
. For those with
defaults of DROP
, the third rule can be omitted.
# In this example, our Draupnir port range is 6432-7432.
# Setup the DRAUPNIR-WHITELIST chain. The server will create this itself if it's
# missing, but we won't be able to reference the chain unless we do this.
iptables -N DRAUPNIR-WHITELIST
# Allow all connections on the loopback interface
iptables -A INPUT -i lo -p tcp -m tcp --dport 6432:7432 -j ACCEPT
# For any connections which have been successfully opened, allow further
# communication.
iptables -A INPUT -p tcp -m tcp --dport 6432:7432 -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT
# For any new connections, pass them through to the DRAUPNIR-WHITELIST chain
iptables -A INPUT -p tcp -m tcp --dport 6432:7432 -m conntrack --ctstate NEW -j DRAUPNIR-WHITELIST
# For any connections that have not been accepted by the whitelist, drop the
# packet
iptables -A INPUT -p tcp -m tcp --dport 6432:7432 -j DROP
The iptables wrapper library used in this project requires root access, and does
not support sudo. Because it
is strongly recommended to not run the Draupnir server as the root user, this
can be worked around by using the provided wrapper script
which is installed into the /usr/lib/draupnir/bin
directory by the Debian
package.
The Draupnir server process must be executed with a PATH
variable that places
this directory at the beginning, in order to ensure it is used instead of the
real iptables
binary.
Identification of user IP addresses
The Draupnir server creates whitelist rules based on the IP address of the user, which it determines by inspecting the HTTP request that was made to its API.
If your Draupnir API server is fronted by a load balancer, then the HTTP
connection that the Draupnir server receives will originate from that, rather
than the user directly. In this instance a separate mechanism of determining the
user's IP address must be employed; the X-Forwarded-For
header.
If this scenario applies to you then the following steps must be taken:
- Ensure that your load balancer places the 'real' user IP address in the
X-Forwarded-For
header. - Enable the use of the
X-Forwarded-For
header for IP address identification by setting theuse_x_forwarded_for
variable totrue
. - Define a list of trusted proxies, via the
trusted_proxy_cidrs
setting. Any IP addresses in theX-Forwarded-For
header that match any of these CIDRs will be ignored. The real user IP address is then determined by taking the resulting list of elements of theX-Forwarded-For
header and using the last one (under the assumption that this is the one that your load balancer has added).
If you are not using a load balancer then it is imperative that the
use_x_forwarded_for
setting remains disabled. If it is enabled without a load
balancer present, rewriting the contents of the header, then it's possible for
an authenticated user to send API requests with a fabricated X-Forwarded-For
header and therefore open up their instance(s) to unauthorized IP addresses.
Cleanup of revoked user instances
When a user creates an instance Draupnir stores the user's refresh token so that
it can, at the clean_interval
, check that the refresh token is still valid.
In the event that the token isn't valid, the instance is deleted. This ensures
that instances don't remain available longer than the users have access to
Draupnir.
Common causes for an invalid refresh token are:
- The user has revoked the application's third-party access in the Google account dashboard.
- The user is suspended via G Suite.
- The user has been deleted.