wr

command module
v0.33.0
Published: Dec 4, 2024 License: GPL-3.0 Imports: 3 Imported by: 0

README

wr - workflow runner

wr is a workflow runner. You use it to run the commands in your workflow easily, automatically, reliably, with repeatability, and while making optimal use of your available computing resources.

wr is implemented as a polling-free in-memory job queue with an on-disk ACID transactional embedded database, written in Go.

Its main benefits over other software workflow management systems are its very low latency and overhead, its high performance at scale, its real-time status updates with a view on all your workflows on one screen, its permanent searchable history of all the commands you have ever run, and its "live" dependencies enabling easy automation of on-going projects.

Furthermore, wr has best-in-class support for OpenStack, providing incredibly easy deployment and auto-scaling without you having to know anything about OpenStack. For use in clouds such as AWS, GCP and others, wr also has the built-in ability to self-deploy to any Kubernetes cluster. And it has built-in support for mounting S3-like object stores, providing an easy way of running commands against remote files whilst enjoying high performance.

Download

Download a release, or build it yourself (see go.mod for the minimum version of Go required):

    git clone https://github.com/VertebrateResequencing/wr.git
    cd wr
    make

The wr executable should now be in $HOME/go/bin.

Documentation

Complete usage information is available using the -h option to wr and its sub-commands.

Guided usage, tips, notes and tutorials are available here: https://workflow-runner.readthedocs.io/

Documentation

Overview

Package main is a stub for wr's command line interface, with the actual implementation in the cmd package.

Basics

Start up the manager daemon, which gives you a URL where you can view the web interface:

    wr manager start -s local

In addition to the "local" scheduler, which will run your commands on all available cores of the local machine, you can also have it run your commands on your LSF cluster or in your OpenStack environment (where it will scale the number of servers needed up and down automatically).

Now, stick the commands you want to run in a text file and:

    wr add -f myCommands.txt

Arbitrarily complex workflows can be formed by specifying command dependencies. Use the --help option of `wr add` for details.

Package Overview

wr's core is implemented in the queue package. This is the in-memory job queue that holds commands that still need to be run. Its multiple sub-queues enable certain guarantees: a given command will only get run by a single client at any one time; if a client dies, the command will get run by another client instead; if a command cannot be run, it is buried until the user takes action; if a command has a dependency, it won't run until its dependencies are complete.
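
The guarantees above can be pictured as a small state machine in which every item sits in exactly one sub-queue. The following is a minimal sketch under stated assumptions, not the queue package's actual API: the type `Queue`, the state names and the methods `Reserve`, `Release`, `Bury` and `Complete` are all hypothetical names chosen for illustration. It also assumes that something else (for example a timeout) detects a dead client and calls `Release`.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// State names the sub-queue an item sits in (illustrative names only).
type State int

const (
	Dependent State = iota // waiting for dependencies to complete
	Ready                  // may be reserved by a client
	Running                // reserved by exactly one client
	Buried                 // could not be run; waits for user action
	Done                   // completed
)

// ErrWrongState is returned when an item is not in the sub-queue a
// transition requires.
var ErrWrongState = errors.New("item is not in the required sub-queue")

// Queue is a hypothetical, much simplified stand-in for a job queue.
type Queue struct {
	mu    sync.Mutex
	state map[string]State
	deps  map[string][]string
}

func NewQueue() *Queue {
	return &Queue{state: map[string]State{}, deps: map[string][]string{}}
}

// depsDone reports whether every dependency of key is Done.
// The caller must hold q.mu.
func (q *Queue) depsDone(key string) bool {
	for _, d := range q.deps[key] {
		if s, ok := q.state[d]; !ok || s != Done {
			return false
		}
	}
	return true
}

// Add puts an item in Ready, or in Dependent if a dependency is incomplete.
func (q *Queue) Add(key string, deps ...string) {
	q.mu.Lock()
	defer q.mu.Unlock()
	q.deps[key] = deps
	q.state[key] = Dependent
	if q.depsDone(key) {
		q.state[key] = Ready
	}
}

// move transitions key from one sub-queue to another, failing if the item
// is not currently in from. Completing an item promotes any dependents
// whose dependencies are now all Done.
func (q *Queue) move(key string, from, to State) error {
	q.mu.Lock()
	defer q.mu.Unlock()
	if s, ok := q.state[key]; !ok || s != from {
		return ErrWrongState
	}
	q.state[key] = to
	if to == Done {
		for k, s := range q.state {
			if s == Dependent && q.depsDone(k) {
				q.state[k] = Ready
			}
		}
	}
	return nil
}

func (q *Queue) Reserve(key string) error  { return q.move(key, Ready, Running) }
func (q *Queue) Release(key string) error  { return q.move(key, Running, Ready) }
func (q *Queue) Bury(key string) error     { return q.move(key, Running, Buried) }
func (q *Queue) Complete(key string) error { return q.move(key, Running, Done) }

// StateOf returns the sub-queue of key (Dependent for unknown keys).
func (q *Queue) StateOf(key string) State {
	q.mu.Lock()
	defer q.mu.Unlock()
	return q.state[key]
}

func main() {
	q := NewQueue()
	q.Add("align")
	q.Add("summarise", "align")
	fmt.Println(q.Reserve("summarise")) // fails: dependency not complete
	fmt.Println(q.Reserve("align"))     // the first client gets the item
	fmt.Println(q.Reserve("align"))     // fails: already reserved
	fmt.Println(q.Release("align"))     // client died; item is Ready again
}
```

Because every transition checks the current sub-queue under a lock, an item can only be handed to one client at a time, and a released or buried item cannot be picked up by accident.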

The jobqueue package provides client+server code for interacting with the in-memory queue from the queue package, and by storing all new commands in an on-disk database, provides an additional guarantee: that (dynamic) workflows won't break because a job that was added got "lost" before it got run. It also retains all completed jobs, enabling searching of past workflows and allowing for "live" dependencies, which trigger the rerunning of previously completed commands if their dependencies change.
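
One way to picture "live" dependencies is as a lookup over the retained history of completed jobs whenever a new job arrives. The sketch below is an illustration under stated assumptions, not jobqueue's actual implementation: the `Job` and `Store` types, the idea of naming dependencies by group, and the `Add` method are all hypothetical, and an in-memory map stands in for the on-disk database.

```go
package main

import (
	"fmt"
	"sort"
)

// Job is a hypothetical record of a command kept in the history.
type Job struct {
	Cmd       string
	Groups    []string // groups this job belongs to
	DependsOn []string // groups this job depends on
	Complete  bool
}

// Store stands in for the on-disk database of all jobs ever added.
type Store struct {
	jobs map[string]*Job
}

func NewStore() *Store { return &Store{jobs: map[string]*Job{}} }

// overlaps reports whether a and b share at least one element.
func overlaps(a, b []string) bool {
	for _, x := range a {
		for _, y := range b {
			if x == y {
				return true
			}
		}
	}
	return false
}

// Add records j and returns, in sorted order, the commands of previously
// completed jobs that depend on a group j belongs to. Those jobs are
// marked incomplete so that they would be run again.
func (s *Store) Add(j *Job) []string {
	s.jobs[j.Cmd] = j
	var rerun []string
	for _, other := range s.jobs {
		if other != j && other.Complete && overlaps(other.DependsOn, j.Groups) {
			other.Complete = false
			rerun = append(rerun, other.Cmd)
		}
	}
	sort.Strings(rerun)
	return rerun
}

func main() {
	s := NewStore()
	s.Add(&Job{Cmd: "align sample1", Groups: []string{"aligned"}, Complete: true})
	s.Add(&Job{Cmd: "summarise", DependsOn: []string{"aligned"}, Complete: true})
	// A new member of the "aligned" group changes what "summarise" depends on.
	fmt.Println(s.Add(&Job{Cmd: "align sample2", Groups: []string{"aligned"}})) // prints [summarise]
}
```

Keeping completed jobs around is what makes this possible: without the history there would be nothing to mark for rerunning when a dependency changes.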

The jobqueue package is also what actually does the main "work" of the system: the server component knows how many commands need to be run and what their resource requirements (memory, time, CPUs etc.) are, and submits the appropriate number of jobqueue runner clients to the job scheduler.

The jobqueue/scheduler package has the scheduler-specific code that ensures that these runner clients get run on the configured system in the most efficient way possible. For example, with LSF, if we have 10 commands that each need 2GB of memory to run, we will submit a job array of size 10 with a 2GB memory reservation to LSF. The most limited (and therefore potentially least contended) queue capable of running the commands will be chosen. For OpenStack, the cheapest server (in terms of cores and memory) that can run the commands will be spawned, and once there is no more work to do on those servers, they get terminated to free up resources.
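
The two decisions described above, grouping commands with identical requirements into one submission and picking the smallest server that fits, can be sketched as follows. This is a simplified illustration, not the scheduler package's code: the `Req` and `Flavor` types, the functions `groupByReq` and `cheapest`, and the flavor names are hypothetical, and "cheapest" is assumed to mean fewest cores, then least memory.

```go
package main

import "fmt"

// Req is a hypothetical description of what one command needs.
type Req struct {
	MemoryMB int
	Cores    int
}

// Flavor is a hypothetical server type that could be spawned.
type Flavor struct {
	Name     string
	MemoryMB int
	Cores    int
}

// groupByReq counts commands with identical requirements. Each group could
// then be submitted as a single job array of that size.
func groupByReq(reqs []Req) map[Req]int {
	groups := make(map[Req]int)
	for _, r := range reqs {
		groups[r]++
	}
	return groups
}

// cheapest returns the flavor with the fewest cores (ties broken by least
// memory) that still satisfies req, or false if none is big enough.
func cheapest(flavors []Flavor, req Req) (Flavor, bool) {
	var best Flavor
	found := false
	for _, f := range flavors {
		if f.MemoryMB < req.MemoryMB || f.Cores < req.Cores {
			continue // too small to run the command
		}
		if !found || f.Cores < best.Cores ||
			(f.Cores == best.Cores && f.MemoryMB < best.MemoryMB) {
			best, found = f, true
		}
	}
	return best, found
}

func main() {
	// Ten commands that each need 2GB of memory and one core.
	reqs := make([]Req, 10)
	for i := range reqs {
		reqs[i] = Req{MemoryMB: 2048, Cores: 1}
	}
	for req, n := range groupByReq(reqs) {
		fmt.Printf("job array of size %d with %dMB reserved\n", n, req.MemoryMB)
	}

	flavors := []Flavor{
		{Name: "small", MemoryMB: 1024, Cores: 1},
		{Name: "medium", MemoryMB: 4096, Cores: 2},
		{Name: "large", MemoryMB: 16384, Cores: 8},
	}
	if f, ok := cheapest(flavors, Req{MemoryMB: 2048, Cores: 1}); ok {
		fmt.Println("spawn", f.Name)
	}
}
```

Using the requirement struct itself as the map key keeps the grouping step to a single pass over the commands.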

The cloud package implements methods for interacting with cloud environments such as OpenStack. The corresponding jobqueue/scheduler package uses these methods to do its work.

The static subdirectory contains the HTML, CSS and JavaScript needed for the web interface. See jobqueue/serverWebI.go for how the web interface backend is implemented.

The internal package contains general utility functions, and most notably config.go holds the code for how the command line interface deals with config options.

Directories

Path                  Synopsis
cloud                 Package cloud provides functions to interact with cloud providers, used to create cloud resources so that you can spawn servers, then delete those resources when you're done.
cmd                   Package cmd implements wr's command line interface.
internal              Package internal houses code for wr's general utility functions.
jobqueue              Package jobqueue provides server/client functions to interact with the queue structure provided by the queue package over a network.
jobqueue/scheduler    Package scheduler lets the jobqueue server interact with the configured job scheduler (if any) to submit jobqueue runner clients and have them run on a compute cluster (or local machine).
limiter               Package limiter provides a way of limiting the number of something that belongs to one or more limit groups.
queue                 Package queue provides an in-memory queue structure suitable for the safe and low latency implementation of a real job queue.
rp                    Package rp ("resource protector") provides functions that help control access to some limited resource.
