shield

v0.2.1
Published: Jan 8, 2016 License: MIT

S.H.I.E.L.D. Backup Solution

Project Goal

The goal of this project is to build a standalone system that can perform backup and restore functions for a wide variety of pluggable data systems (like Redis, PostgreSQL, MySQL, RabbitMQ, etc.), storing backup data in pluggable storage solutions (such as local files or an S3 blobstore).

The system should enable self-service for end users to perform ad hoc backup / restore operations, review backup schedules, retention policies and backup job runs, etc.

Engineers should be able to integrate support for new data systems and storage solutions without having to modify core code.

Architecture

Architecture Image

Task Lifecycle

Task Lifecycle Image

Target Plugins

The system interfaces with data systems that hold the data to back up via Target Plugins. These plugins are bits of code that are compiled and linked into the Core Daemon, and implement a standard interface for the following operations:

backup

Retrieves data from the data system (via native means like pg_dump or the Redis SAVE command) and sends it to a Storage Plugin.

restore

Retrieves the data from a Storage Plugin and overwrites the data in the data system accordingly, using native means like pg_restore.

For data systems that permit full backups across a network (as most RDBMS do), nothing more is needed. Some data systems, however, make assumptions about the environment in which they operate. Redis, for example, always dumps its backups to local disk. To support these data systems, we can implement the Agent Target Plugin, and a corresponding Agent Daemon that will run on the target system. The Agent Daemon will be responsible for implementing the backup / restore operations, and the Agent Target Plugin will forward the requests to it and relay responses back to the caller.

Storage Plugins

The system interfaces with storage systems for uploading and retrieving backed up data files. These plugins are bits of code that are compiled and linked into the Core Daemon, and implement a standard interface for the following operations:

store

Store a single data blob (usually a file) in the remote storage system. Returns a key that can be used for later retrieval.

retrieve

Given a key returned from the store operation, retrieve the data blob.

purge

Given a key returned from the store operation, delete the stored data.

Core Daemon

The Core Daemon is the coordinating component that handles:

Metadata Management

What targets and stores exist, what schedules and retention policies are defined, what jobs are specified, what backups have taken place, and what tasks are in-flight.

Scheduling Backups

Kicks off backup tasks (owned by SYSTEM) for all jobs per their configured schedule.

Expiring Backups

Finds all expired entries in the archives and purges them from the remote storage system.

Ad hoc Backups

Kicks off backup tasks (owned by users) per end-user or operator request (via the HTTP API, detailed later).

Restores

Handles retrieval of stored backup data and replay / restoration of that data to a given target.

Monitoring

Exposes metrics and statistics about backup jobs, allows searching of archives to ensure that backups are completing successfully, etc.

HTTP API

The HTTP API is a component of the Core Daemon that exposes management interfaces via REST endpoints. It underlies the Web UI and CLI components (described later).

Catalog Database

A dedicated data store that keeps track of schedules, retention policies, backup configurations, targets and stores, and running tasks. This database is private to the Core Daemon; there should be no need to query it directly, outside of maintenance tasks.

Web UI and the CLI

The Web UI provides a rich user interface for operators and end-users to view configuration (schedules, policies, jobs, etc.), review archives, and monitor tasks in progress. It also provides self-service functionality by allowing users to request ad hoc backup and restore operations.

The Web UI relies exclusively on the HTTP API.

The CLI provides similar functionality, in a scriptable, command-line interface. It also relies exclusively on the HTTP API.

Catalog Database Schema Definition

TARGETS stores the information about the remote data systems that should be backed up. Each record identifies the method by which the target is backed up ('plugin') and the specific connection information required ('endpoint').

CREATE TABLE targets (
  uuid      UUID PRIMARY KEY,
  name      TEXT,  -- a human-friendly name for this target
  summary   TEXT,  -- annotation for operator use, to describe the target
                   --   i.e.: "Production PostgreSQL database"
  plugin    TEXT NOT NULL,  -- short name of the target plugin, like 'postgres'
  endpoint  TEXT NOT NULL,  -- opaque blob used by target plugin to connect to
                            --   the remote data system.  Could be JSON, YAML, etc.
  agent     TEXT NOT NULL   -- IP address and port (in ip:port format) of the
                            -- Shield agent that can backup/restore this target
);

STORES stores the destination of backup data, e.g. an S3 bucket or a local file system directory. Each record identifies a destination, the method by which to store and retrieve backup data to/from it ('plugin') and the specific connection information required ('endpoint').

CREATE TABLE stores (
  uuid      UUID PRIMARY KEY,
  name      TEXT,  -- a human-friendly name for this store
  summary   TEXT,  -- annotation for operator use, to describe the store
  plugin    TEXT NOT NULL,  -- short name of the storage plugin, like 's3' or 'fs'
  endpoint  TEXT NOT NULL   -- opaque blob used by storage plugin to connect to
                            -- the storage backend.  Could be JSON, YAML, etc.
);

SCHEDULES contains the timing information that informs the core daemon when it should run which backup jobs (or JOBS, see later).

CREATE TABLE schedules (
  uuid      UUID PRIMARY KEY,
  name      TEXT, -- a human-friendly name for this schedule
  summary   TEXT, -- annotation for operator use, to describe schedule
  timespec  TEXT NOT NULL  -- code in a DSL for specifying when to run backups,
                           --   i.e. 'sundays 8am' or 'daily 1am'
                           --   (note: may want to eval use of cron here)
);

RETENTION policies govern how long data is kept. For now, this is just a simple expiration time, with 'name' and 'summary' fields for annotation.

All backups taken MUST have a retention policy; no backups are kept indefinitely.

CREATE TABLE retention (
  uuid     UUID PRIMARY KEY,
  name     TEXT, -- a human-friendly name for this retention policy
  summary  TEXT, -- annotation for operator use, to describe policy
  expiry   INTEGER NOT NULL  -- how long (in seconds) before a given backup expires
);

JOBS keeps track of desired backup behavior, by marrying a target (the data to back up) with a store (where to send that data), according to a schedule (when to do the backups) and a retention policy (how long to keep the data for).

JOBS can be annotated by operators to provide context and justification for each job. For example, tickets can be called out in the notes field to direct people to more information about when the backup job was requested, and why.

CREATE TABLE jobs (
  uuid            UUID PRIMARY KEY,
  target_uuid     UUID NOT NULL, -- the target
  store_uuid      UUID NOT NULL, -- the store
  schedule_uuid   UUID NOT NULL, -- what schedule to use
  retention_uuid  UUID NOT NULL, -- what retention policy to use
  priority        INTEGER DEFAULT 50, -- priority, scale from 0 to 100 (0 = highest)
  paused          BOOLEAN, -- if true, this job is not run when scheduled.
  name            TEXT,    -- a human-friendly name for this job
  summary         TEXT     -- annotation for operator use, to describe
                           --   the purpose of the job ('weekly orders db')
);

ARCHIVES records all archives as they are created, and keeps track of where the data came from, where it went, when the backed-up data expires, etc.

ARCHIVES can be annotated by operators, so that they can keep track of particularly important backups, like dumps of databases taken before potentially risky changes are attempted.

CREATE TABLE archives (
  uuid         UUID PRIMARY KEY,
  target_uuid  UUID NOT NULL, -- the target (from jobs)
  store_uuid   UUID NOT NULL, -- the store (from jobs)
  store_key    TEXT NOT NULL, -- opaque data returned from the storage plugin,
                              --   for use in restore ops / download / etc.
  taken_at     INTEGER NOT NULL,
  expires_at   INTEGER NOT NULL, -- based on retention policy
  notes        TEXT DEFAULT ''  -- annotation for operator use, to describe this
                                --   specific backup, i.e. 'before change #422 backup'
                                --   (mostly, this will be empty)
);
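
The relationship between `taken_at`, the retention policy's `expiry`, and `expires_at` can be sketched as follows. This is a minimal illustration, assuming that the INTEGER timestamps are seconds since the Unix epoch (the schema does not say); the `Archive` struct and both helper functions are hypothetical names.

```go
package main

import "fmt"

// Archive holds the subset of the archives columns needed to reason about expiry.
type Archive struct {
	UUID      string
	TakenAt   int64 // assumed: seconds since the Unix epoch
	ExpiresAt int64 // assumed: seconds since the Unix epoch
}

// expiresAt derives expires_at from taken_at and the policy's expiry (in seconds).
func expiresAt(takenAt, expiry int64) int64 {
	return takenAt + expiry
}

// expired returns the archives whose expires_at is at or before now,
// i.e. the candidates for purging from the storage system.
func expired(archives []Archive, now int64) []Archive {
	var out []Archive
	for _, a := range archives {
		if a.ExpiresAt <= now {
			out = append(out, a)
		}
	}
	return out
}

func main() {
	taken := int64(1445772720)
	a := Archive{UUID: "example", TakenAt: taken, ExpiresAt: expiresAt(taken, 86400)}

	fmt.Println("expired after 1h: ", len(expired([]Archive{a}, taken+3600)))
	fmt.Println("expired after 24h:", len(expired([]Archive{a}, taken+86400)))
}
```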

TASKS keep track of non-custodial jobs being performed by the system. This includes scheduled backups, ad-hoc backups, data restoration and downloads, etc.

The core daemon interprets the 'op' field, and calls on the appropriate plugins, based on the associated JOB or ARCHIVE / TARGET entry.

Each TASK should be associated with either a JOB or an ARCHIVE.

Here are the defined operations:

| Operation | Description |
| :-------- | :---------- |
| backup | Perform a backup of the associated JOB. The target and store are pulled directly from the JOB entry. Note: the backup operation is used for both ad hoc and scheduled backups. |
| restore | Perform a restore of the associated ARCHIVE. The storage channel is pulled directly from the ARCHIVE. The target can be specified explicitly. If it is not, the values from the ARCHIVE will be used. This allows restores to go to a different host (for migration / scale-out purposes). |

CREATE TYPE status AS ENUM ('pending', 'running', 'canceled', 'failed', 'done');
CREATE TABLE tasks (
  uuid      UUID PRIMARY KEY,
  owner     TEXT, -- who owns / started this task?
  op        TEXT NOT NULL, -- name of the operation to run, i.e. 'backup' or 'restore'

  job_uuid      UUID,
  archive_uuid  UUID,
  target_uuid   UUID,

  status       status, -- current status of the task
  requested_at INTEGER NOT NULL, -- when the task was _created_
  started_at   INTEGER, -- when the task actually started
  stopped_at   INTEGER, -- when the task completed (or was cancelled)

  log       TEXT -- log of task activity
);
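
The rule that each TASK is tied to a JOB (for backup) or an ARCHIVE (for restore) could be checked before a task is queued. The sketch below is an assumption about how such a check might look; the `Task` struct and `validate` function are hypothetical and not taken from the SHIELD code base.

```go
package main

import (
	"errors"
	"fmt"
)

// Task carries the fields of the tasks table that matter for validation.
type Task struct {
	Op          string // "backup" or "restore"
	JobUUID     string
	ArchiveUUID string
	TargetUUID  string // optional override for restores
}

// validate enforces the association described in the operations table:
// backup tasks need a job, restore tasks need an archive.
func validate(t Task) error {
	switch t.Op {
	case "backup":
		if t.JobUUID == "" {
			return errors.New("backup task requires job_uuid")
		}
	case "restore":
		if t.ArchiveUUID == "" {
			return errors.New("restore task requires archive_uuid")
		}
	default:
		return fmt.Errorf("unknown operation %q", t.Op)
	}
	return nil
}

func main() {
	fmt.Println(validate(Task{Op: "backup", JobUUID: "274ddd91-6c17-4e5a-b5cd-6d53925d48b4"}))
	fmt.Println(validate(Task{Op: "restore"}))
	fmt.Println(validate(Task{Op: "download"}))
}
```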

HTTP API

Schedules API

Purpose: allows the Web UI and CLI to find out what schedules are defined, and provides CRUD operations for schedule management. Allowing queries to filter to unused=t or unused=f enables the frontends to show schedules that can be deleted safely.

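| Method | Path | Arguments | Request Body |
| :----- | :--- | :-------- | :----------- |
| GET | /v1/schedules | ?unused=[tf] | - |
| POST | /v1/schedules | - | see below |
| DELETE | /v1/schedule/:uuid | - | - |
| GET | /v1/schedule/:uuid | - | - |
| PUT | /v1/schedule/:uuid | - | see below |

GET /v1/schedules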

Response Body:

[
  {
    "uuid"    : "36f50f26-b007-433a-a67a-bdffbd0746c8",
    "name"    : "Schedule Name",
    "summary" : "a short description",
    "when"    : "daily at 4am"
  },

  "..."
]
POST /v1/schedules

Request Body:

{
  "name"    : "Schedule Name",
  "summary" : "a short description",
  "when"    : "daily at 4am"
}
| Field | Required? | Meaning |
| :---- | :-------: | :------ |
| name | Y | The name of the new schedule |
| summary | N | A short summary of what the schedule is for, when it should be used |
| when | Y | The schedule, in the Timespec Language |

Response Body:

{
  "ok"   : "created",
  "uuid" : "6b8398be-fdc0-424a-8532-e812e5dfc116"
}
| Field | Meaning |
| :---- | :------ |
| ok | The new schedule was created |
| uuid | The UUID of the newly-created schedule |

PUT /v1/schedule/:uuid

Request Body:

{
  "name"    : "Schedule Name",
  "summary" : "a short description",
  "when"    : "daily at 4am"
}
| Field | Required? | Meaning |
| :---- | :-------: | :------ |
| name | Y | The name of the new schedule |
| summary | Y | A short summary of what the schedule is for, when it should be used |
| when | Y | The schedule, in the Timespec Language |

NOTE: summary is required for update requests, whereas it is optional on creation.
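
The difference between the create and update rules can be expressed as a small request check. This is a sketch only: `ScheduleRequest` and `parseSchedule` are hypothetical names, and treating an empty string as "missing" is an assumption of this example rather than documented server behavior.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// ScheduleRequest matches the request body shown for POST and PUT.
type ScheduleRequest struct {
	Name    string `json:"name"`
	Summary string `json:"summary"`
	When    string `json:"when"`
}

// parseSchedule decodes a request body and applies the required-field rules:
// name and when are always required; summary is required only on update.
func parseSchedule(body []byte, update bool) (ScheduleRequest, error) {
	var req ScheduleRequest
	if err := json.Unmarshal(body, &req); err != nil {
		return req, err
	}
	if req.Name == "" {
		return req, errors.New("name is required")
	}
	if req.When == "" {
		return req, errors.New("when is required")
	}
	if update && req.Summary == "" {
		return req, errors.New("summary is required on update")
	}
	return req, nil
}

func main() {
	body := []byte(`{"name": "Schedule Name", "when": "daily at 4am"}`)

	_, err := parseSchedule(body, false)
	fmt.Println("create:", err)

	_, err = parseSchedule(body, true)
	fmt.Println("update:", err)
}
```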

Response Body:

{
  "ok" : "updated"
}
| Field | Meaning |
| :---- | :------ |
| ok | The schedule was updated |

Retention Policies API

Purpose: allows the Web UI and CLI to find out what retention policies are defined, and provides CRUD operations for policy management. Allowing queries to filter to unused=t or unused=f enables the frontends to show retention policies that can be deleted safely.

| Method | Path | Arguments | Request Body |
| :----- | :--- | :-------- | :----------- |
| GET | /v1/retention | ?unused=[tf] | - |
| POST | /v1/retention | - | see below |
| DELETE | /v1/retention/:uuid | - | - |
| GET | /v1/retention/:uuid | - | - |
| PUT | /v1/retention/:uuid | - | see below |

GET /v1/retention
[
  {
    "uuid"    : "c5aed303-a6fc-4b68-b0e9-81431cc07a4e",
    "name"    : "Retention Policy Name",
    "summary" : "a short description",
    "expires" : 86400
  },

  "..."
]
POST /v1/retention

Request Body:

{
  "name"    : "Policy Name",
  "summary" : "a short description",
  "expires" : 86400
}
| Field | Required? | Meaning |
| :---- | :-------: | :------ |
| name | Y | The name of the new retention policy |
| summary | N | A short summary of the new retention policy |
| expires | Y | How long, in seconds, to keep archives made against this policy. This value must be at least 3600 (1h) |

Response Body:

{
  "ok"   : "created",
  "uuid" : "6b8398be-fdc0-424a-8532-e812e5dfc116"
}
| Field | Meaning |
| :---- | :------ |
| ok | The new retention policy was created |
| uuid | The UUID of the newly-created retention policy |

PUT /v1/retention/:uuid

Request Body:

{
  "name"    : "Policy Name",
  "summary" : "a short description",
  "expires" : 86400
}
| Field | Required? | Meaning |
| :---- | :-------: | :------ |
| name | Y | The name of the new retention policy |
| summary | Y | A short summary of the new retention policy |
| expires | Y | How long, in seconds, to keep archives made against this policy. This value must be at least 3600 (1h) |

NOTE: summary is required for update requests, whereas it is optional on creation.

Response Body:

{
  "ok" : "updated"
}
| Field | Meaning |
| :---- | :------ |
| ok | The retention policy was updated |

Targets API

Purpose: allows the Web UI and CLI to review what targets have been defined, and allows updates to existing targets (to change endpoints or plugins, for example) and remove unused targets (i.e. retired / decommissioned services).

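| Method | Path | Arguments | Request Body |
| :----- | :--- | :-------- | :----------- |
| GET | /v1/targets | ?plugin=:name, ?unused=[tf] | - |
| POST | /v1/targets | - | see below |
| DELETE | /v1/target/:uuid | - | - |
| GET | /v1/target/:uuid | - | - |
| PUT | /v1/target/:uuid | - | see below |

GET /v1/targets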
[
  {
    "uuid"     : "2f42d0b3-449a-4d0e-8576-a40cc552d7e5",
    "name"     : "Target Name",
    "summary"  : "a short description",
    "plugin"   : "plugin-name",
    "endpoint" : "{\"encoded\":\"json\"}",
    "agent"    : "10.17.66.54:5544"
  },

  "..."
]
POST /v1/targets

Request Body:

{
  "name"     : "Target Name",
  "summary"  : "a short description",
  "plugin"   : "plugin-name",
  "endpoint" : "{\"encoded\":\"json\"}",
  "agent"    : "10.17.66.54:5544"
}
| Field | Required? | Meaning |
| :---- | :-------: | :------ |
| name | Y | The name of the new target |
| summary | N | A short description of the target |
| plugin | Y | The name of the plugin to use when backing up this target |
| endpoint | Y | The endpoint configuration required to access this target's data |
| agent | Y | The host:port of a Shield agent that can back up / restore this target |

Response Body:

{
  "ok"   : "created",
  "uuid" : "6b8398be-fdc0-424a-8532-e812e5dfc116"
}
| Field | Meaning |
| :---- | :------ |
| ok | The new target was created |
| uuid | The UUID of the newly-created target |

PUT /v1/target/:uuid

Request Body:

{
  "name"     : "Target Name",
  "summary"  : "a short description",
  "plugin"   : "plugin-name",
  "endpoint" : "{\"encoded\":\"json\"}",
  "agent"    : "10.17.66.54:5544"
}
| Field | Required? | Meaning |
| :---- | :-------: | :------ |
| name | Y | The name of the new target |
| summary | Y | A short description of the target |
| plugin | Y | The name of the plugin to use when backing up this target |
| endpoint | Y | The endpoint configuration required to access this target's data |
| agent | Y | The host:port of a Shield agent that can back up / restore this target |

NOTE: summary is required for update requests, whereas it is optional on creation.

Response Body:

{
  "ok" : "updated"
}
| Field | Meaning |
| :---- | :------ |
| ok | The target was updated |

Stores API

Purpose: allows operators (via the Web UI and CLI components) to view what storage systems are available for configuring backups, provision new ones, update existing ones and delete unused ones.

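| Method | Path | Arguments | Request Body |
| :----- | :--- | :-------- | :----------- |
| GET | /v1/stores | ?plugin=:name, ?unused=[tf] | - |
| POST | /v1/stores | - | see below |
| DELETE | /v1/store/:uuid | - | - |
| GET | /v1/store/:uuid | - | - |
| PUT | /v1/store/:uuid | - | see below |

GET /v1/stores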
[
  {
    "uuid"     : "5bcde12a-8b3f-4663-bbe3-9fe0fd6a093d",
    "name"     : "Store Name",
    "summary"  : "a short description",
    "plugin"   : "plugin-name",
    "endpoint" : "{\"encoded\":\"json\"}"
  },

  "..."
]
POST /v1/stores

Request Body:

{
  "name"     : "Store Name",
  "summary"  : "a short description",
  "plugin"   : "plugin-name",
  "endpoint" : "{\"encoded\":\"json\"}"
}
| Field | Required? | Meaning |
| :---- | :-------: | :------ |
| name | Y | The name of the new store |
| summary | N | A short description of the store |
| plugin | Y | The name of the plugin to use when storing data in this store |
| endpoint | Y | The endpoint configuration required to access this store's data |

Response Body:

{
  "ok"   : "created",
  "uuid" : "6b8398be-fdc0-424a-8532-e812e5dfc116"
}
| Field | Meaning |
| :---- | :------ |
| ok | The new store was created |
| uuid | The UUID of the newly-created store |

PUT /v1/store/:uuid

Request Body:

{
  "name"     : "Store Name",
  "summary"  : "a short description",
  "plugin"   : "plugin-name",
  "endpoint" : "{\"encoded\":\"json\"}"
}
| Field | Required? | Meaning |
| :---- | :-------: | :------ |
| name | Y | The name of the new store |
| summary | Y | A short description of the store |
| plugin | Y | The name of the plugin to use when storing data in this store |
| endpoint | Y | The endpoint configuration required to access this store's data |

NOTE: summary is required for update requests, whereas it is optional on creation.

Response Body:

{
  "ok" : "updated"
}
| Field | Meaning |
| :---- | :------ |
| ok | The store was updated |

Jobs API

Purpose: allows end-users and operators to see what jobs have been configured, and the details of those configurations. The filtering on the main listing / search endpoint (/v1/jobs) allows the frontends to show only jobs for specific schedules (what weekly backups are we running?), retention policies (what backups are we keeping for 90d or more?), and specific targets / stores.

| Method | Path | Arguments | Request Body |
| :----- | :--- | :-------- | :----------- |
| GET | /v1/jobs | ?target=:uuid, ?store=:uuid, ?schedule=:uuid, ?retention=:uuid, ?paused=[tf] | - |
| POST | /v1/jobs | - | see below |
| DELETE | /v1/job/:uuid | - | - |
| GET | /v1/job/:uuid | - | - |
| PUT | /v1/job/:uuid | - | see below |
| POST | /v1/job/:uuid/pause | - | - |
| POST | /v1/job/:uuid/unpause | - | - |
| POST | /v1/job/:uuid/run | - | see below |

GET /v1/jobs
[
  {
    "uuid"            : "af0b40b2-8f7b-46e4-b425-9730c677e625",
    "name"            : "A Backup Job",
    "summary"         : "a short description",

    "retention_name"  : "100d Retention Policy",
    "retention_uuid"  : "7eb2131c-c2ad-40b1-916f-7e162be89465",
    "expiry"          : 8640000,

    "schedule_name"   : "Daily Backups Schedule",
    "schedule_uuid"   : "e390934b-fc43-4343-a51b-22bd69a8894f",
    "schedule"        : "daily at 4am",

    "paused"          : false,

    "store_uuid"      : "994e991f-112d-496d-a1df-bbdc67c79332",
    "store_plugin"    : "store-plugin",
    "store_endpoint"  : "{\"encoded\":\"json\"}",

    "target_uuid"     : "443e2ce1-de2e-4369-a497-add3dd970d4d",
    "target_plugin"   : "target-plugin",
    "target_endpoint" : "{\"encoded\":\"json\"}"
  },

  "..."
]
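
A client consuming this listing might decode it into a typed structure. The following is a sketch under assumptions: the `Job` struct and helper names are hypothetical, the field tags simply mirror the sample response above, and `endpointConfig` assumes the plugin chose JSON for its (otherwise opaque) endpoint string.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Job mirrors the fields of one entry in the GET /v1/jobs response.
type Job struct {
	UUID           string `json:"uuid"`
	Name           string `json:"name"`
	Summary        string `json:"summary"`
	RetentionName  string `json:"retention_name"`
	RetentionUUID  string `json:"retention_uuid"`
	Expiry         int64  `json:"expiry"`
	ScheduleName   string `json:"schedule_name"`
	ScheduleUUID   string `json:"schedule_uuid"`
	Schedule       string `json:"schedule"`
	Paused         bool   `json:"paused"`
	StoreUUID      string `json:"store_uuid"`
	StorePlugin    string `json:"store_plugin"`
	StoreEndpoint  string `json:"store_endpoint"`
	TargetUUID     string `json:"target_uuid"`
	TargetPlugin   string `json:"target_plugin"`
	TargetEndpoint string `json:"target_endpoint"`
}

// decodeJobs parses a response body containing a JSON array of jobs.
func decodeJobs(body []byte) ([]Job, error) {
	var jobs []Job
	err := json.Unmarshal(body, &jobs)
	return jobs, err
}

// endpointConfig decodes an endpoint string, assuming it holds a JSON object.
func endpointConfig(endpoint string) (map[string]interface{}, error) {
	cfg := map[string]interface{}{}
	err := json.Unmarshal([]byte(endpoint), &cfg)
	return cfg, err
}

func main() {
	body := []byte(`[{
		"uuid": "af0b40b2-8f7b-46e4-b425-9730c677e625",
		"name": "A Backup Job",
		"expiry": 8640000,
		"schedule": "daily at 4am",
		"paused": false,
		"target_plugin": "target-plugin",
		"target_endpoint": "{\"encoded\":\"json\"}"
	}]`)

	jobs, err := decodeJobs(body)
	if err != nil {
		fmt.Println("decode failed:", err)
		return
	}
	for _, j := range jobs {
		cfg, err := endpointConfig(j.TargetEndpoint)
		fmt.Println(j.Name, j.Schedule, j.Expiry, cfg, err)
	}
}
```

Note that the endpoint arrives as a JSON-encoded string inside the JSON response, so it needs a second decoding step.
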
POST /v1/jobs

Request Body:

{
  "name"      : "Job Name",
  "summary"   : "a short description",

  "store"     : "uuid-of-store-to-use",
  "target"    : "uuid-of-target-to-use",
  "retention" : "uuid-of-retention-policy-to-use",
  "schedule"  : "uuid-of-schedule-to-use",

  "paused"    : false
}
| Field | Required? | Meaning |
| :---- | :-------: | :------ |
| name | Y | The name of the new job |
| summary | N | A short description of the job |
| store | Y | The UUID of the store to back data up to |
| target | Y | The UUID of the target to back up |
| retention | Y | The UUID of the retention policy to apply to backup archives |
| schedule | Y | The UUID of the backup schedule to use when determining when this job should run |
| paused | Y | Whether or not this job should be paused, initially |

Response Body:

{
  "ok"   : "created",
  "uuid" : "6b8398be-fdc0-424a-8532-e812e5dfc116"
}
| Field | Meaning |
| :---- | :------ |
| ok | The new job was created |
| uuid | The UUID of the newly-created job |

GET /v1/job/:uuid
{
  "uuid"            : "af0b40b2-8f7b-46e4-b425-9730c677e625",
  "name"            : "A Backup Job",
  "summary"         : "a short description",

  "retention_name"  : "100d Retention Policy",
  "retention_uuid"  : "7eb2131c-c2ad-40b1-916f-7e162be89465",
  "expiry"          : 8640000,

  "schedule_name"   : "Daily Backups Schedule",
  "schedule_uuid"   : "e390934b-fc43-4343-a51b-22bd69a8894f",
  "schedule"        : "daily at 4am",

  "paused"          : false,

  "store_uuid"      : "994e991f-112d-496d-a1df-bbdc67c79332",
  "store_plugin"    : "store-plugin",
  "store_endpoint"  : "{\"encoded\":\"json\"}",

  "target_uuid"     : "443e2ce1-de2e-4369-a497-add3dd970d4d",
  "target_plugin"   : "target-plugin",
  "target_endpoint" : "{\"encoded\":\"json\"}"
}

PUT /v1/job/:uuid

Request Body:

{
  "name"      : "Job Name",
  "summary"   : "a short description",

  "store"     : "uuid-of-store-to-use",
  "target"    : "uuid-of-target-to-use",
  "retention" : "uuid-of-retention-policy-to-use",
  "schedule"  : "uuid-of-schedule-to-use"
}
| Field | Required? | Meaning |
| :---- | :-------: | :------ |
| name | Y | The name of the new job |
| summary | Y | A short description of the job |
| store | Y | The UUID of the store to back data up to |
| target | Y | The UUID of the target to back up |
| retention | Y | The UUID of the retention policy to apply to backup archives |
| schedule | Y | The UUID of the backup schedule to use when determining when this job should run |

NOTE: summary is required for update requests, whereas it is optional on creation.

ALSO NOTE: The paused boolean parameter available on creation is not available for jobs that already exist. Use the pause / unpause POST endpoints for existing jobs.

Response Body:

{
  "ok" : "updated"
}
| Field | Meaning |
| :---- | :------ |
| ok | The job was updated |

POST /v1/job/:uuid/run

Request Body:

{
  "owner" : "Username"
}
| Field | Required? | Meaning |
| :---- | :-------: | :------ |
| owner | N | Name of the user requesting the job re-run; defaults to "anon" |

Response Body:

{
  "ok" : "scheduled"
}
| Field | Meaning |
| :---- | :------ |
| ok | The task was scheduled |

Archive API

Purpose: allows end-users and operators to see what backups have been performed, optionally filtering them to specific targets (just the Cloud Foundry postgres database please), stores (what’s in S3?) and time windows (only show me backups before that data corruption incident). It also facilitates restoration of data, and purging of backups ahead of schedule.

Note: the PUT /v1/archive/:uuid endpoint is only able to update the annotation (the notes field) for an archive.

| Method | Path | Arguments | Request Body |
| :----- | :--- | :-------- | :----------- |
| GET | /v1/archives | ?target=:uuid, ?store=:uuid, ?after=YYYYMMDD, ?before=YYYYMMDD | - |
| POST | /v1/archive/:uuid/restore | { target: $target_uuid } | see below |
| DELETE | /v1/archive/:uuid | - | - |
| GET | /v1/archive/:uuid | - | - |
| PUT | /v1/archive/:uuid | - | see below |

GET /v1/archives
[
  {
    "uuid"            : "9ee4b579-19ba-4fa5-94e1-e5b2a4d8e85a",
    "store_key"       : "BKP-1234-56789",

    "taken_at"        : "2015-10-25 11:32:00",
    "expires_at"      : "2015-12-25 11:32:00",
    "notes"           : "a few notes about this archive",

    "store_uuid"      : "b7b5743f-adfa-4ceb-abde-2c2085149b12",
    "store_plugin"    : "store-plugin",
    "store_endpoint"  : "{\"encoded\":\"json\"}",

    "target_uuid"     : "5c7b8b50-ff11-4d67-9624-fd8214bc8629",
    "target_plugin"   : "target-plugin",
    "target_endpoint" : "{\"encoded\":\"json\"}"
  },

  "..."
]
GET /v1/archive/:uuid

not yet implemented, apparently

POST /v1/archive/:uuid/restore

Request Body:

{
  "target" : "dd322f14-763d-4659-bc49-c2f1f2352341",
  "owner"  : "Username"
}

| Field | Required? | Meaning |
| :---- | :-------: | :------ |
| target | N | UUID of the target to restore this archive to.  Defaults to the target from the original backup job |
| owner | N | Username of the user requesting the restoration.  Defaults to "anon" |

Response Body:

```json
{
  "ok" : "scheduled"
}
```

| Field | Meaning |
| :---- | :------ |
| ok | The restore task was scheduled |

PUT /v1/archive/:uuid

Request Body:

{
  "notes" : "Some notes about this archive"
}
| Field | Required? | Meaning |
| :---- | :-------: | :------ |
| notes | Y | Notes about the archive |

Response Body:

{
  "ok" : "updated"
}
| Field | Meaning |
| :---- | :------ |
| ok | The archive was updated |

Tasks API

Purpose: allows the Web UI and the CLI to show running tasks, query a specific task, submit new tasks, cancel tasks, etc.

| Method | Path | Arguments | Request Body |
| :----- | :--- | :-------- | :----------- |
| GET | /v1/tasks | ?status=:status, ?debug | - |
| GET | /v1/task/:uuid | - | - |
| DELETE | /v1/task/:uuid | - | - |

GET /v1/tasks
[
  {
    "uuid"         : "5e2c416d-36f7-484a-8a2a-3d3d567d55d6",
    "owner"        : "system",
    "type"         : "backup",

    "job_uuid"     : "274ddd91-6c17-4e5a-b5cd-6d53925d48b4",
    "archive_uuid" : "286102fe-c0fd-4e45-a357-743436a19602",
    "status"       : "done",
    "started_at"   : "2015-11-25 11:30:00",
    "stopped_at"   : "2015-11-25 11:32:00",
    "log"          : "this is the log of the job"
  },

  "..."
]
Meta API

Purpose: provides public (non-sensitive) information about the Shield daemon.

| Method | Path | Arguments | Request Body |
| :----- | :--- | :-------- | :----------- |
| GET | /v1/meta/pubkey | - | - |

GET /v1/meta/pubkey
ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQC5X75B52xHxfDeujUiKNk9t2jZTR6FIb02t9pUcE6yfwItKGEM8wEad5TVtAqrqdiOaZoosYzcXzzcM2JXsGaCqhVyf2oNaQHiPuyLufPdPW3ZE6omKfHlwL32PkdK4XtZQIwwLEK4NScp1Gvi8GMF90JSaPOQuKgpXCiDXQWFuQkPUzu6yIQIkhPCthtLRn31Td/zF92vBdr5VXyjQ1j8lFTO0jrw9nqwnrW3SA6b1FToSaLvXJJvV8De1Vlkl030tzVdYA4KPIZFX7IPPueVBJcqCaXxEMSzceknGTXP7r64oJDJw4vE39pYqCYtllhzOKKYVaDTHoUUBsZQu+e5 core@shield

This can be used by agents to auto-authorize the core daemon for remote operations, rather than having to specify the key out-of-band. There are security risks involved in using this feature, so consider the potential for MitM attacks and act accordingly.

Plugin Calling Protocol

Store and Target Plugins are implemented as external programs, either scripts or compiled binaries, that follow the Plugin Calling Protocol, which stipulates how file descriptors are to be used, and what arguments are going to be passed to the external program to perform what functions.

$ redis-plugin info
{
  "name": "My Redis Plugin",
  "author": "Joe Random Hacker",
  "version": "1.0.0",
  "features": {
    "target": "yes",
    "store": "no"
  }
}

$ s3-plugin info
{
  "name": "My S3 Storage Plugin",
  "author": "Joe Random Hacker",
  "version": "2.1.4",
  "features": {
    "target": "no",
    "store": "yes"
  }
}

$ redis-plugin backup --endpoint '{"username":"redis","password":"secret"}' | s3-plugin store --endpoint '{"bucket":"test","key":"AKI123098123091"}'
{
  "key": "BA670360-DE9D-46D0-AEAB-55E72BD416C4"
}

$ s3-plugin retrieve --key decaf-bad --endpoint '{"bucket":"test","key":"AKI123098123091"}' | redis-plugin restore --endpoint '{"username":"redis","password":"secret"}'

Each plugin program must implement the following actions, which will be passed as the first argument:

  • info - Dump a JSON-encoded map containing the following keys, to standard output:

    1. name - The name of the plugin (human-readable)
    2. author - The name of the person or team who maintains the plugin. May include email, at author discretion.
    3. version - The version of the plugin
    4. features - A map of the features of this plugin. Currently supports two boolean keys ("yes" for true, "no" for false, both lower case) named "target" and "store", that indicate whether or not the plugin can support target and/or store operations.

    Other keys are allowed, but ignored, and all keys are reserved for future expansion. Keys starting with an underscore ('_') will never be used by shield, and are free for your own use.

    Exits 0 to signify success. Exits non-zero to signify an error, and prints diagnostic information to standard error.

  • backup - Stream a backup blob of arbitrary binary data (per plugin semantics) to standard output, based on the endpoint given via the --endpoint command line argument. For example, a database target plugin may require the DSN and username/password in a JSON structure, and will run a platform-specific backup tool (like pg_dump or mysqldump), hooking its output to standard output.

    Error messages and diagnostics should be printed to standard error.

    Exits 0 on success, or non-zero on failure.

  • restore - Read a backup blob of arbitrary binary data (per plugin semantics) from standard input, and perform a restore based on the endpoint given via the --endpoint command line argument.

    Error messages and diagnostics should be printed to standard error.

    Exits 0 on success, or non-zero on failure.

  • store - Read a backup blob of arbitrary binary data from standard input, and store it in the remote storage system, based on the endpoint given via the --endpoint command line argument. For example, an S3 plugin might require keys and a bucket name to perform storage operations.

    Error messages and diagnostics should be printed to standard error.

    Exits 0 on success, or non-zero on failure.

    On success, write the JSON representation of a map containing a summary of the stored object, including the following keys:

    1. key - An opaque identifier that means something to the plugin for purposes of restore. This will be logged in the database by shield.

    Other keys are allowed, but ignored, and all keys are reserved for future expansion. Keys starting with an underscore ('_') will never be used by shield, and are free for your own use.

  • retrieve - Stream a backup blob of arbitrary binary data to standard output, based on the endpoint configuration given in the --endpoint command line argument, and a key, as given by the --key command line argument. (This will be the key that was returned from the store operation.)

    Error messages and diagnostics should be printed to standard error.

    Exits 0 on success, or non-zero on failure.

  • purge - Remove a backup blob of arbitrary data from the remote storage system, based on the endpoint configuration given in the --endpoint command line argument. The blob to be removed is identified via the --key command line argument.

    Error messages and diagnostics should be printed to standard error.

    Exits 0 on success, or non-zero on failure.

Notes on Development

Setting the environment variable SHIELD_MODE to the value DEV will cause all scheduling information to revert to "every minute" regardless of the actual schedule. This is to assist developers.

The Makefile

The Makefile is used to assist with development. The available targets are:

  • test | tests : runs all the tests with no additional parameters
  • coverage : runs tests with coverage information
  • report : makes report in (temporary) HTML page for a particular package, e.g. db. See examples.
  • race : runs ginkgo -race * to test for race conditions
  • plugin | plugins : builds all the plugin binaries
  • shield : builds the shieldd, shield-schema, shield-agent, and shield (CLI) binaries
  • all : runs all the tests (except the race test) and builds all the binaries.
  • fixme | fixmes : finds all FIXMEs in the project

all is also the default behavior, so running make with no targets is the same as make all.

Examples:

$ make shield
go build ./cmd/shieldd
go build ./cmd/shield-agent
go build ./cmd/shield-schema
go build ./cmd/shield

$ make tests
ginkgo *
[1450032890] Agent Test Suite - 39/39 specs •••••••••••••••••••••••••••••••••••••• SUCCESS! 387.609253ms PASS
[1450032890] API Client Library Test Suite - 3/3 specs ••• SUCCESS! 185.602µs PASS
[1450032890] Database Layer Test Suite - 21/21 specs ••••••••••••••••••••• SUCCESS! 15.888175ms PASS
[1450032890] Plugin Framework Test Suite - 45/45 specs ••••••••••••••••••••••••••••••••••••••••••••• SUCCESS! 20.695859ms PASS
[1450032890] Supervisor Test Suite - 139/139 specs ••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••• SUCCESS! 155.843391ms PASS
[1450032890] Timespec Test Suite - 37/37 specs ••••••••••••••••••••••••••••••••••••• SUCCESS! 26.84143ms PASS

Ginkgo ran 6 suites in 4.001600857s
Test Suite Passed
go vet ./...

$ make report FOR=db
go tool cover -html=coverage/db.cov

CLI Usage Examples

This section is exploratory.

# targets
$ shield list targets [--[un]used] [--plugin $NAME]
$ shield show target $UUID
$ shield create target
$ shield edit target $UUID
$ shield delete target $UUID

# schedule management
$ shield list schedules [--[un]used]
$ shield show schedule $UUID
$ shield create schedule
$ shield update schedule $UUID
$ shield delete schedule $UUID

# retention policies
$ shield list retention policies [--[un]used]
$ shield show retention policy $UUID
$ shield create retention policy
$ shield update retention policy $UUID
$ shield delete retention policy $UUID

# "managing" plugins
$ shield list plugins
$ shield show plugin $NAME

# stores
$ shield list stores [--[un]used] [--plugin $NAME]
$ shield show store $UUID
$ shield create store
$ shield edit store $UUID
$ shield delete store $UUID

# jobs
$ shield list jobs [--[un]paused] [--target $UUID] [--store $UUID]
                [--schedule $UUID] [--retention-policy $UUID]
$ shield show job $UUID
$ shield create job
$ shield edit job $UUID
$ shield delete job $UUID
$ shield pause job $UUID
$ shield unpause job $UUID
$ shield paused job $UUID
$ shield run job $UUID

# archives
$ shield list archives [--target $UUID] [--store $UUID]
                    [--after YYYYMMDD] [--before YYYYMMDD]
$ shield show archive $UUID
$ shield delete archive $UUID
$ shield restore archive $UUID [--to $TARGET_UUID]

# task management
$ shield list tasks [--all]
$ shield show task $UUID
$ shield cancel task $UUID
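
Since this section is exploratory, the following Go sketch is only one possible reading of the [--[un]used] notation: a pair of boolean flags that together form a three-way filter (used, unused, or no filtering). The type and function names are hypothetical, and treating --used and --unused as mutually exclusive is an assumption, not something the listing above specifies.

```go
package main

import (
	"errors"
	"flag"
	"fmt"
	"io"
)

// usageFilter is the three-way result of the --used / --unused flag pair.
type usageFilter int

const (
	anyUsage   usageFilter = iota // neither flag given: no filtering
	onlyUsed                      // --used
	onlyUnused                    // --unused
)

// listOptions holds the parsed options of a hypothetical `list targets` call.
type listOptions struct {
	usage  usageFilter
	plugin string
}

// parseListTargets parses the arguments that follow `shield list targets`.
func parseListTargets(args []string) (listOptions, error) {
	fs := flag.NewFlagSet("list targets", flag.ContinueOnError)
	fs.SetOutput(io.Discard) // errors are returned to the caller instead
	used := fs.Bool("used", false, "only show targets that are in use")
	unused := fs.Bool("unused", false, "only show targets that are not in use")
	plugin := fs.String("plugin", "", "only show targets for this plugin")
	if err := fs.Parse(args); err != nil {
		return listOptions{}, err
	}
	if *used && *unused {
		return listOptions{}, errors.New("--used and --unused are mutually exclusive")
	}
	opts := listOptions{plugin: *plugin}
	switch {
	case *used:
		opts.usage = onlyUsed
	case *unused:
		opts.usage = onlyUnused
	}
	return opts, nil
}

func main() {
	opts, err := parseListTargets([]string{"--unused", "--plugin", "postgres"})
	fmt.Println(opts.usage == onlyUnused, opts.plugin, err) // prints "true postgres <nil>"
}
```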

Proof of Concept (Where Do We Go From Here?)

Research

We need to identify all of the data systems we wish to support with this system. For each one, we need to determine whether it fits into one of the two collection / restore models designed so far, and flag any that do not:

  • Direct over-the-network backup/restore a la pg_dump / pg_restore
  • Instrumentation of local backup/restore + file shipping via Agent Daemon / Plugin

Stage 1 Proof-of-Concept

To get this project off the ground, I think we need to do some research and experimental implementation into the following areas:

  • Implement the postgres target plugin using pg_dump / pg_restore tools
  • Implement the fs storage plugin to store blobs in the local file system
  • Implement the Core Daemon with limited functionality:
    • Task execution
    • backup operation
    • restore operation
  • Implement the HTTP API with limited functionality:
    • /v1/jobs/*
    • /v1/archive/*
  • Implement the CLI with limited functionality:
    • shield * job
    • shield * backup
    • shield * task

This will let us flush out any inconsistencies in the architecture, and find any problematic aspects of the problem domain not presently considered.

Stage 2 Proof-of-Concept

Next, we extend the proof-of-concept implementation to test out the Agent Target Plugin design, using Redis as the data system. This entails the following:

  • Implement the Agent Daemon (in general)
  • Extend the Agent Daemon to handle Redis’ BGSAVE command
  • Implement the Agent Target Plugin

Directories

Path Synopsis
Godeps
_workspace/src/github.com/fsouza/go-dockerclient
Package docker provides a client for the Docker remote API.
_workspace/src/github.com/fsouza/go-dockerclient/external/github.com/Sirupsen/logrus
Package logrus is a structured logger for Go, completely API compatible with the standard library logger.
_workspace/src/github.com/fsouza/go-dockerclient/external/github.com/docker/docker/pkg/parsers
Package parsers provides helper functions to parse and validate different type of string.
_workspace/src/github.com/fsouza/go-dockerclient/external/github.com/docker/docker/pkg/pools
Package pools provides a collection of pools which provide various data types with buffers.
_workspace/src/github.com/fsouza/go-dockerclient/external/github.com/docker/docker/pkg/ulimit
Package ulimit provides structure and helper function to parse and represent resource limits (Rlimit and Ulimit, its human friendly version).
_workspace/src/github.com/fsouza/go-dockerclient/external/github.com/docker/docker/pkg/units
Package units provides helper function to parse and print size and time units in human-readable format.
_workspace/src/github.com/fsouza/go-dockerclient/external/github.com/gorilla/context
Package context stores values shared during a request lifetime.
_workspace/src/github.com/fsouza/go-dockerclient/external/github.com/gorilla/mux
Package gorilla/mux implements a request router and dispatcher.
_workspace/src/github.com/fsouza/go-dockerclient/testing
Package testing provides a fake implementation of the Docker API, useful for testing purpose.
_workspace/src/github.com/kr/pretty
Package pretty provides pretty-printing for Go values.
_workspace/src/github.com/kr/text
Package text provides rudimentary functions for manipulating text in paragraphs.
_workspace/src/github.com/kr/text/colwriter
Package colwriter provides a write filter that formats input lines in multiple columns.
_workspace/src/github.com/kr/text/mc
Command mc prints in multiple columns.
_workspace/src/github.com/lib/pq
Package pq is a pure Go Postgres driver for the database/sql package.
_workspace/src/github.com/lib/pq/listen_example
Below you will find a self-contained Go program which uses the LISTEN / NOTIFY mechanism to avoid polling the database while waiting for more work to arrive.
_workspace/src/github.com/lib/pq/oid
Package oid contains OID constants as defined by the Postgres server.
_workspace/src/github.com/magiconair/properties
Package properties provides functions for reading and writing ISO-8859-1 and UTF-8 encoded .properties files and has support for recursive property expansion.
_workspace/src/github.com/mattn/go-sqlite3
Package sqlite3 provides interface to SQLite3 databases.
_workspace/src/github.com/mitchellh/mapstructure
The mapstructure package exposes functionality to convert an arbitrary map[string]interface{} into a native Go structure.
_workspace/src/github.com/onsi/ginkgo
Ginkgo is a BDD-style testing framework for Golang. The godoc documentation describes Ginkgo's API.
_workspace/src/github.com/onsi/ginkgo/config
Ginkgo accepts a number of configuration options.
_workspace/src/github.com/onsi/ginkgo/ginkgo
The Ginkgo CLI is fully documented at http://onsi.github.io/ginkgo/#the_ginkgo_cli. You can also learn more by running: ginkgo help. Commonly used commands: install with go install github.com/onsi/ginkgo/ginkgo; run tests with ginkgo; run tests in all subdirectories with ginkgo -r; run tests in particular packages with ginkgo <flags> /path/to/package /path/to/another/package; pass arguments/flags to your tests with ginkgo <flags> <packages> -- <pass-throughs>; run tests in parallel with ginkgo -p, which will automatically detect the optimal number of nodes to use.
_workspace/src/github.com/onsi/ginkgo/internal/remote
Aggregator is a reporter used by the Ginkgo CLI to aggregate and present parallel test output coherently as tests complete.
_workspace/src/github.com/onsi/ginkgo/reporters
Ginkgo's Default Reporter A number of command line flags are available to tweak Ginkgo's default output.
_workspace/src/github.com/onsi/gomega
Gomega is the Ginkgo BDD-style testing framework's preferred matcher library.
_workspace/src/github.com/onsi/gomega/format
Gomega's format package pretty-prints objects.
_workspace/src/github.com/onsi/gomega/gbytes
Package gbytes provides a buffer that supports incrementally detecting input.
_workspace/src/github.com/onsi/gomega/gexec
Package gexec provides support for testing external processes.
_workspace/src/github.com/onsi/gomega/ghttp
Package ghttp supports testing HTTP clients by providing a test server (simply a thin wrapper around httptest's server) that supports registering multiple handlers.
_workspace/src/github.com/onsi/gomega/matchers
Gomega matchers This package implements the Gomega matchers and does not typically need to be imported.
_workspace/src/github.com/pborman/getopt
Package getopt provides traditional getopt processing for implementing commands that use traditional command lines.
_workspace/src/github.com/pborman/uuid
The uuid package generates and inspects UUIDs.
_workspace/src/github.com/rlmcpherson/s3gof3r
Package s3gof3r provides fast, parallelized, streaming access to Amazon S3.
_workspace/src/github.com/rlmcpherson/s3gof3r/gof3r
gof3r is a command-line interface for s3gof3r: fast, concurrent, streaming access to Amazon S3.
_workspace/src/github.com/russross/blackfriday
Blackfriday markdown processor.
_workspace/src/github.com/shurcooL/sanitized_anchor_name
Package sanitized_anchor_name provides a func to create sanitized anchor names.
_workspace/src/github.com/spf13/cobra
Package cobra is a commander providing a simple interface to create powerful modern CLI interfaces.
_workspace/src/github.com/spf13/pflag
Package pflag is a drop-in replacement for Go's flag package, implementing POSIX/GNU-style --flags.
_workspace/src/github.com/spf13/viper/remote
Package remote integrates the remote features of Viper.
Daemons need to log.
_workspace/src/github.com/voxelbrain/goptions
package goptions implements a flexible parser for command line options.
_workspace/src/golang.org/x/crypto/curve25519
Package curve25519 provides an implementation of scalar multiplication on the elliptic curve known as curve25519.
_workspace/src/golang.org/x/crypto/ssh
Package ssh implements an SSH client and server.
_workspace/src/golang.org/x/crypto/ssh/agent
Package agent implements a client to an ssh-agent daemon.
_workspace/src/golang.org/x/crypto/ssh/terminal
Package terminal provides support functions for dealing with terminals, as commonly found on UNIX systems.
_workspace/src/golang.org/x/crypto/ssh/test
This package contains integration tests for the golang.org/x/crypto/ssh package.
_workspace/src/gopkg.in/fsnotify.v1
Package fsnotify provides a platform-independent interface for file system notifications.
_workspace/src/gopkg.in/yaml.v2
Package yaml implements YAML support for the Go language.
cmd
fs
s3
