go-coordinate

module
v0.0.0-...-6f34b33
Published: Feb 28, 2023 License: MIT

Go Coordinate Daemon

This package provides a reimplementation of the Diffeo Coordinate (https://github.com/diffeo/coordinate) daemon. It is fully compatible with existing Python Coordinate code, and provides a REST interface for Go and other languages.

Overview

Coordinate is a job queue system. It is designed for repetitive tasks with large numbers of inputs, where the inputs and outputs will be stored externally and do not need to be passed directly through the system, and where no particular action needs to be taken when a job finishes.

Coordinate-based applications can define work specs, JSON or YAML dictionary objects that define specific work to do. A typical work spec would name a specific Python function to call with a YAML configuration dictionary, and the Python Coordinate package contains a worker process that can run these work specs. Each work spec has a list of work units, individual tasks to perform, where each work unit has a name or key and an additional data dictionary. In typical operation a work unit key is a filename or database key and the data is used only to record outputs. Read more about the data model.
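The relationship between work specs and work units can be sketched in Go as follows. This is only an illustration of the data model described above, not the types the coordinate package exports: the WorkSpec and WorkUnit structs, their fields, and the "function" configuration key are all hypothetical names chosen for this example.

```go
package main

import "fmt"

// WorkUnit is one task: a key (often a filename or database key) plus a
// data dictionary that is typically used only to record outputs.
// Hypothetical type for illustration only.
type WorkUnit struct {
	Key  string
	Data map[string]interface{}
}

// WorkSpec is a configuration dictionary describing the work to do,
// together with the work units that belong to it.
// Hypothetical type for illustration only.
type WorkSpec struct {
	Name      string
	Config    map[string]interface{}
	WorkUnits map[string]*WorkUnit
}

// NewWorkSpec creates a work spec with no work units.
func NewWorkSpec(name string, config map[string]interface{}) *WorkSpec {
	return &WorkSpec{
		Name:      name,
		Config:    config,
		WorkUnits: make(map[string]*WorkUnit),
	}
}

// AddWorkUnit adds (or replaces) the work unit with the given key.
func (s *WorkSpec) AddWorkUnit(key string, data map[string]interface{}) *WorkUnit {
	if data == nil {
		data = make(map[string]interface{})
	}
	u := &WorkUnit{Key: key, Data: data}
	s.WorkUnits[key] = u
	return u
}

func main() {
	// The "function" key is a placeholder, not a documented setting.
	spec := NewWorkSpec("resize", map[string]interface{}{
		"function": "myapp.images.resize",
	})
	spec.AddWorkUnit("photos/0001.jpg", nil)
	unit := spec.AddWorkUnit("photos/0002.jpg", nil)

	// The data dictionary records where the output went.
	unit.Data["output"] = "thumbs/0002.jpg"

	fmt.Printf("%s has %d work units\n", spec.Name, len(spec.WorkUnits))
}
```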

The general expectation is that there will be at most dozens of work specs, but each work spec could have millions of work units. Many worker processes are expected to connect to a single Coordinate daemon; past data loads have involved 800 or more workers talking to one server.

Installation

From source:

go get github.com/diffeo/go-coordinate/cmd/coordinated

Usage

Run the coordinated binary. With default options, it will use in-memory storage and start a network server listening on ports 5932 and 5980. Port 5980 provides the REST interface.

Go code should use the backend package to provide a command-line flag to get a backend object, which will implement the generic interface in the coordinate package. Test code can directly create a memory backend. Most applications will expect to use the restclient backend to communicate with a central Coordinate daemon.
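The idea of selecting a backend through a command-line flag can be sketched with the standard library's flag.Value interface. This is a minimal sketch under stated assumptions, not the backend package's real API: BackendSpec, parseBackend, and the accepted scheme names ("memory", "postgres", "http") are hypothetical, and a real program would go on to construct a Coordinate object from the parsed value.

```go
package main

import (
	"flag"
	"fmt"
	"strings"
)

// BackendSpec is a hypothetical flag value: an implementation name plus
// an implementation-specific address.
type BackendSpec struct {
	Implementation string
	Address        string
}

// String implements flag.Value.
func (b *BackendSpec) String() string {
	if b.Address == "" {
		return b.Implementation
	}
	return b.Implementation + ":" + b.Address
}

// Set implements flag.Value.  It accepts either a bare implementation
// name ("memory") or "implementation:address" ("postgres://host").
func (b *BackendSpec) Set(value string) error {
	parts := strings.SplitN(value, ":", 2)
	switch parts[0] {
	case "memory", "postgres", "http": // assumed set of backends
	default:
		return fmt.Errorf("unknown backend %q", parts[0])
	}
	b.Implementation = parts[0]
	b.Address = ""
	if len(parts) == 2 {
		b.Address = parts[1]
	}
	return nil
}

// parseBackend parses args, defaulting to the in-memory backend.
func parseBackend(args []string) (*BackendSpec, error) {
	spec := &BackendSpec{Implementation: "memory"}
	fs := flag.NewFlagSet("app", flag.ContinueOnError)
	fs.Var(spec, "backend", "backend to use, e.g. memory or postgres://host")
	if err := fs.Parse(args); err != nil {
		return nil, err
	}
	return spec, nil
}

func main() {
	spec, err := parseBackend([]string{"-backend", "postgres://172.17.0.1"})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("implementation:", spec.Implementation)
	fmt.Println("address:", spec.Address)
}
```

Keeping the flag value as a small struct means test code can skip flag parsing entirely and build the in-memory case directly.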

5932 is the default TCP port for the Python Coordinate daemon, and application configurations that normally point at that daemon should work against this one as well. Read more about Python compatibility.

pip install coordinate
go get github.com/diffeo/go-coordinate/cmd/coordinated
$GOPATH/bin/coordinated &
cat >config.yaml <<EOF
coordinate:
  addresses: ['localhost:5932']
EOF
coordinate -c config.yaml summary

Docker

A Coordinate daemon server image is on Docker Hub:

docker run -p 5932:5932 -p 5980:5980 diffeo/coordinated

This is a single-binary image that only runs the daemon. If you need any additional command-line options, such as a persistent backend, specify them directly after the image name.

docker run -d -p 5432:5432 postgres:9.5
docker run -d -p 5932:5932 -p 5980:5980 diffeo/coordinated \
    -backend postgres://172.17.0.1 -log-requests

The current CI setup has the Docker latest tag pointing at a master commit from this repository. You may want to pin a specific version tag instead. The earliest version tag in Docker Hub is diffeo/coordinated:0.4.2.

Packages

cmd/coordinated is the main process, providing the network service. jobserver provides the RPC calls compatible with the Python Coordinate system. cborrpc provides the underlying wire transport.

coordinate describes an abstract API to Coordinate. This API is slightly different from the Python Coordinate API; in particular, an Attempt object records a single worker working on a single work unit, allowing the history of workers and individual work units to be tracked. memory is the in-memory implementation of this API, postgres uses PostgreSQL, and restclient talks to a REST server. backend provides a command-line option to choose a backend.
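To illustrate why a separate Attempt record is useful, the sketch below keeps one record per (worker, work unit) pairing and answers history queries from either side. The Attempt, AttemptStatus, and History types here are hypothetical simplifications written for this example; they are not the interfaces defined in the coordinate package.

```go
package main

import "fmt"

// AttemptStatus is a simplified, hypothetical status for an attempt.
type AttemptStatus int

const (
	Pending AttemptStatus = iota
	Finished
	Failed
)

// Attempt records a single worker working on a single work unit.
type Attempt struct {
	Worker   string
	WorkUnit string
	Status   AttemptStatus
}

// History stores every attempt, so that both per-worker and
// per-work-unit histories can be reconstructed.
type History struct {
	attempts []*Attempt
}

// Start records that a worker has begun working on a work unit.
func (h *History) Start(worker, workUnit string) *Attempt {
	a := &Attempt{Worker: worker, WorkUnit: workUnit, Status: Pending}
	h.attempts = append(h.attempts, a)
	return a
}

// ForWorkUnit returns every attempt made on one work unit, oldest first.
func (h *History) ForWorkUnit(workUnit string) []*Attempt {
	var out []*Attempt
	for _, a := range h.attempts {
		if a.WorkUnit == workUnit {
			out = append(out, a)
		}
	}
	return out
}

// ForWorker returns every attempt made by one worker, oldest first.
func (h *History) ForWorker(worker string) []*Attempt {
	var out []*Attempt
	for _, a := range h.attempts {
		if a.Worker == worker {
			out = append(out, a)
		}
	}
	return out
}

func main() {
	var h History

	// The first worker fails, so a second worker retries the same unit.
	first := h.Start("worker-1", "photos/0001.jpg")
	first.Status = Failed
	second := h.Start("worker-2", "photos/0001.jpg")
	second.Status = Finished

	for _, a := range h.ForWorkUnit("photos/0001.jpg") {
		fmt.Printf("%s tried %s: status %d\n", a.Worker, a.WorkUnit, a.Status)
	}
}
```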

Future

The Namespace.Workers() call simply iterates over all known workers, but the Python Coordinate worker implementation generates an extremely large number of these. This call is subject to unspecified future change.

Go Coordinate version 0.3.0 adds a generic WorkUnitMeta structure, which defines both the work unit priority and the earliest time it can execute. This replaces the priority field in a couple of contexts. A future version of Go Coordinate, no earlier than 0.4.0, will remove WorkUnit.Priority() and WorkUnit.SetPriority().
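How a priority and an earliest execution time might combine when choosing work can be sketched as below. The field names Priority and NotBefore, the unit type, and the rule that a larger priority value runs first are assumptions made for this example; consult the coordinate package documentation for the actual WorkUnitMeta definition and scheduling behavior.

```go
package main

import (
	"fmt"
	"time"
)

// WorkUnitMeta mirrors the two properties described in the text.
// The field names are assumptions for this sketch.
type WorkUnitMeta struct {
	Priority  float64   // assumed: larger values run first
	NotBefore time.Time // zero value means "eligible immediately"
}

// unit is a hypothetical pairing of a work unit key with its metadata.
type unit struct {
	Key  string
	Meta WorkUnitMeta
}

// nextUnit returns the highest-priority unit whose NotBefore time is
// not in the future; ok is false if no unit is eligible yet.
func nextUnit(units []unit, now time.Time) (best unit, ok bool) {
	for _, u := range units {
		if u.Meta.NotBefore.After(now) {
			continue // not allowed to run yet
		}
		if !ok || u.Meta.Priority > best.Meta.Priority {
			best, ok = u, true
		}
	}
	return best, ok
}

// demoNow is a fixed reference time so the example is reproducible.
var demoNow = time.Date(2023, 2, 28, 12, 0, 0, 0, time.UTC)

// demoUnits builds three units; "b" has the highest priority but is
// delayed until one hour after now.
func demoUnits(now time.Time) []unit {
	return []unit{
		{Key: "a", Meta: WorkUnitMeta{Priority: 1}},
		{Key: "b", Meta: WorkUnitMeta{Priority: 10, NotBefore: now.Add(time.Hour)}},
		{Key: "c", Meta: WorkUnitMeta{Priority: 5}},
	}
}

func main() {
	units := demoUnits(demoNow)

	if u, ok := nextUnit(units, demoNow); ok {
		fmt.Println("now:", u.Key)
	}
	if u, ok := nextUnit(units, demoNow.Add(2*time.Hour)); ok {
		fmt.Println("two hours later:", u.Key)
	}
}
```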

Directories

Path: Synopsis
backend: Package backend provides a standard way to construct a coordinate interface based on command-line flags.
cache: Package cache provides name-based caching of Coordinate objects.
cborrpc: Package cborrpc defines the CBOR-RPC format used by the Python Coordinate daemon.
cmd/coordbench: Package coordbench provides a load-generation tool for Coordinate.
cmd/coordinated: Package coordinated provides a wire-compatible reimplementation of the Diffeo Coordinate daemon.
cmd/cptest: Cptest copies test functions from a package into a new file.
cmd/demoworker: Package demoworker provides a complete demonstration Coordinate application.
coordinate: Package coordinate defines an abstract API to Coordinate.
coordinate/coordinatetest: Package coordinatetest provides generic functional tests for the Coordinate interface.
jobserver: Package jobserver provides a CBOR-RPC interface compatible with the Python coordinate module.
memory: Package memory provides an in-process, in-memory implementation of Coordinate.
restclient: Package restclient provides a Coordinate-compatible HTTP REST client that talks to the matching server in the "restserver" package.
restdata: Package restdata defines common data structures shared between the restserver and restclient packages.
restserver: Package restserver publishes a Coordinate interface as a REST service.
worker: Package worker provides a library framework for processes that execute Coordinate work units.
