scylla-octopus: backup and maintenance utility for scylladb
Scylla-octopus attempts to reproduce some functionality of Scylla Manager (which is not free) and Medusa for Apache Cassandra (which is not compatible with Scylla).
Features:
- Back up a single node or a database cluster
  - database schema export
  - snapshots of all or selected keyspaces
  - optional backup compression with pigz
- Upload a backup to s3-compatible storage with awscli
- Backups in remote storage can be expired and removed automatically
- Database maintenance with nodetool repair
- Webhook support for notifications about backup completion and/or errors
Future plans:
- Backup restoration
- Configure github actions to build a docker image and run tests
Usage
scylla-octopus healthcheck
- performs a sanity check of the environment and configuration (scylladb status, the presence of required executables, etc)
scylla-octopus backup run
- runs a backup (exports database schema and snapshot, uploads to remote storage, cleans up)
scylla-octopus backup list
- prints a list of existing backups in remote storage
scylla-octopus backup list-expired
- prints a list of expired backups in remote storage that can be removed
scylla-octopus backup cleanup-expired
- removes expired backups from remote storage
scylla-octopus db list-snapshots
- prints a list of existing snapshots on database nodes
scylla-octopus db repair
- executes nodetool repair -pr on database nodes
Command-line flags:
--config=...
- path to configuration file (defaults to config/remote.yml)
--verbose, -v
- forces debug output (equivalent to log.level=debug and commands.debug=true in the configuration file)
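The backup list-expired and backup cleanup-expired commands above depend on deciding which backups are past their retention period. The following is a minimal sketch of such a filter, not the tool's actual implementation: the backup type, the expired function, the day constant, and the backup names are all hypothetical and exist only for illustration.

```go
package main

import (
	"fmt"
	"time"
)

// day is a convenience constant for expressing retention periods.
const day = 24 * time.Hour

// exampleNow is a fixed "current time" that keeps the example deterministic.
var exampleNow = time.Date(2021, time.December, 1, 0, 0, 0, 0, time.UTC)

// backup is a hypothetical record describing one backup in remote storage.
// It is not the data structure used by scylla-octopus.
type backup struct {
	Name      string
	CreatedAt time.Time
}

// expired returns the backups that were created more than `retention`
// before `now`. The order of the input is preserved.
func expired(backups []backup, retention time.Duration, now time.Time) []backup {
	cutoff := now.Add(-retention)
	var out []backup
	for _, b := range backups {
		if b.CreatedAt.Before(cutoff) {
			out = append(out, b)
		}
	}
	return out
}

func main() {
	backups := []backup{
		{Name: "backup-a", CreatedAt: exampleNow.AddDate(0, 0, -10)},
		{Name: "backup-b", CreatedAt: exampleNow.AddDate(0, 0, -2)},
	}
	// With a 7-day retention, only the 10-day-old backup is selected.
	for _, b := range expired(backups, 7*day, exampleNow) {
		fmt.Println("expired:", b.Name)
	}
}
```

Passing the current time as a parameter, rather than calling time.Now() inside the function, makes the filter easy to test.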
Configuration
See the config directory for configuration examples.
config/remote.yml is an example for running the tool on multiple database nodes over SSH.
You will probably need to add a public SSH key to every machine beforehand.
In this mode, it doesn't matter where scylla-octopus is executed, as long as it can SSH to the nodes.
config/local.yml is an example for running the tool on a database node itself.
The options are mostly the same, except that there is no cluster.hosts section.
Requirements
scylla-octopus makes a few assumptions about its environment:
- An awscli executable should be available on every database node for uploading backups.
  - It can be used with any s3-compatible storage.
  - If it is unavailable, or you only want to keep local backups, set backup.disableUploading to true.
  - An alternative storage implementation (such as rsync) would be welcome.
- If backup compression is enabled with archive.method: pigz, then pigz must be available on every database node.
  - So far pigz is the only supported compression method, but we're open to suggestions.
- Database nodes are running linux with an sh shell.
- The tool is tested with recent (4.x) scylladb versions, but will probably work with older ones too.
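The requirements above amount to checking that certain executables can be found on a node. Here is a minimal sketch of such a check using the standard library's exec.LookPath. It is an illustration only: it inspects the local PATH rather than a remote node over SSH, the missingExecutables helper is hypothetical, and the list of names is an assumption (aws is the executable that awscli installs).

```go
package main

import (
	"fmt"
	"os/exec"
)

// missingExecutables returns the names from `required` that cannot be
// found in the local PATH. This helper is hypothetical and is not part
// of scylla-octopus.
func missingExecutables(required []string) []string {
	var missing []string
	for _, name := range required {
		if _, err := exec.LookPath(name); err != nil {
			missing = append(missing, name)
		}
	}
	return missing
}

func main() {
	// Assumed list of tools: a shell, nodetool, the awscli executable, pigz.
	required := []string{"sh", "nodetool", "aws", "pigz"}
	missing := missingExecutables(required)
	if len(missing) == 0 {
		fmt.Println("all required executables found")
		return
	}
	fmt.Println("missing executables:", missing)
}
```

If uploading is disabled with backup.disableUploading, or compression is not enabled, the corresponding entries could be dropped from the list before checking.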
Error handling
A healthcheck is performed before backup and repair. If any node is unreachable, or has a status other than "UN" (up and running), the program stops.
Backups are executed in parallel on each node. If there is any error, then the misbehaving node is skipped, but the program doesn't stop.
Repairs are executed consecutively. If there is any error, the program stops and the remaining nodes will not be repaired.
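The two strategies described above (parallel with failing nodes skipped, and sequential with a stop on the first error) can be sketched as follows. This is a simplified illustration, not the tool's actual code: nodeAction, runParallel, runSequential, and errSimulated are hypothetical names, and the action here only simulates a failure on one node.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// errSimulated is a placeholder error used by the example action.
var errSimulated = errors.New("simulated failure")

// nodeAction is a hypothetical stand-in for the work done on one node
// (a backup or a repair). It is not part of scylla-octopus.
type nodeAction func(host string) error

// runParallel runs action on every host concurrently and returns the
// errors keyed by host. A failing host does not stop the others.
func runParallel(hosts []string, action nodeAction) map[string]error {
	var (
		mu     sync.Mutex
		wg     sync.WaitGroup
		failed = make(map[string]error)
	)
	for _, host := range hosts {
		wg.Add(1)
		go func(h string) {
			defer wg.Done()
			if err := action(h); err != nil {
				mu.Lock()
				failed[h] = err
				mu.Unlock()
			}
		}(host)
	}
	wg.Wait()
	return failed
}

// runSequential runs action on one host at a time and stops at the first
// error. It returns the hosts that completed and the error, if any.
func runSequential(hosts []string, action nodeAction) ([]string, error) {
	var done []string
	for _, host := range hosts {
		if err := action(host); err != nil {
			return done, fmt.Errorf("%s: %w", host, err)
		}
		done = append(done, host)
	}
	return done, nil
}

func main() {
	hosts := []string{"10.5.0.2", "10.5.0.3", "10.5.0.4"}
	// Simulated action: the second node always fails.
	action := func(host string) error {
		if host == "10.5.0.3" {
			return errSimulated
		}
		return nil
	}

	// Backup-style run: the failing node is recorded, the rest complete.
	failed := runParallel(hosts, action)
	fmt.Println("backup: nodes skipped:", len(failed))

	// Repair-style run: processing stops at the failing node.
	done, err := runSequential(hosts, action)
	fmt.Println("repair: nodes completed:", done, "error:", err)
}
```

In the repair-style run the third node is never reached, which matches the behavior described above: the remaining nodes are not repaired after an error.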
Building
If go 1.17+ is installed locally, then make build will create an executable in output/scylla-octopus.
Building docker image:
make docker-image
docker run --rm kolesa-team/scylla-octopus
Development
For local development and testing we spin up 3 scylladb instances in docker-compose: docker-compose up.
Then, execute the following commands to configure the nodes (this will set up SSH keys on the nodes and install awscli):
make prepare-test-node node=scylla-node1
make prepare-test-node node=scylla-node2
make prepare-test-node node=scylla-node3
(this should really be automated with a single command)
You can also create a database (keyspace) with some testing data: make init-db node=scylla-node1 (this will be replicated to every node).
Now try running the tool:
# healthcheck
go run main.go healthcheck
expected output (besides the debug logs):
{
"10.5.0.2": "OK",
"10.5.0.3": "OK",
"10.5.0.4": "OK"
}
# backup
go run main.go backup run
See the test directory for details about the scylladb configuration in docker-compose.
Running on a database node directly
By default, the config/remote.yml configuration file is used to connect to database nodes over SSH.
Another execution mode is to run on a database host directly.
This can also be tested with docker-compose. The /scylla-octopus directory, which contains the program executable (compiled with make build), is mounted into each database container.
Let's run scylla-octopus on node 1:
make build
docker-compose exec scylla-node1 sh
/scylla-octopus/scylla-octopus --config=/scylla/local.yml healthcheck
expected output:
{
"10.5.0.2": "OK"
}
Inspecting backups
The easiest way to inspect backup contents is to set backup.cleanupLocal to false, run a backup, then SSH to a database host and navigate to /var/lib/scylla/backup:
# first make sure backup.cleanupLocal is false,
# then run a backup
go run main.go backup run
# SSH to database node
docker-compose exec scylla-node1 sh
# show the last backup metadata
cat /var/lib/scylla/backup/metadata.yml
© 2021 Kolesa Group. Licensed under MIT