appengine/

directory
v0.0.0-...-678bb0e
Published: Aug 8, 2017 License: Apache-2.0

README

The LUCI Scheduler Service

The LUCI Scheduler Service periodically makes URL fetches and runs Swarming tasks or DM quests. It uses luci-config to fetch per-project lists of cron jobs. It tries to prevent concurrent execution of a job's invocations (i.e. an invocation will not start if the previous one is still running).

It is built on top of the App Engine Task Queue service.

Terminology

To reduce confusion:

  • Cron job or just job is a definition of a periodic activity that the scheduler service consumes.
  • Invocation is an actual attempt to execute the activity specified by a job. An invocation has a duration: it starts, runs, and completes. One example is a Swarming task.
  • Task queue task or just task is a GAE Task Queue Service task. Enqueuing a task with a non-default ETA is referred to as "scheduling".
  • GAE cron task is a GAE Cron Service task (defined via cron.yaml).

Design overview

The state of a job is stored in datastore in a separate entity group. It is updated by a chain of Task Queue tasks. The lifecycle of a job:

  1. A job is registered with the service. It is set to the SCHEDULED state, and the first TickLater task is scheduled to run at some time in the future (based on the job's schedule).
  2. TickLater runs. It transactionally updates the job's state to QUEUED, schedules the next TickLater task, and enqueues a StartInvocation task.
  3. The StartInvocation task launches the invocation (starts a URL fetch, a Swarming task, etc.) and moves the job to the RUNNING state. Once the invocation is finished, the job moves back to the SCHEDULED state.

See statemachine.go for a complete description of all the states.
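
The three-step lifecycle above can be pictured as a small transition table. The following is a minimal sketch only, not the contents of statemachine.go: the Event type, the event names, and the Next function are hypothetical, and the real state machine has more states than the three named here. In particular, rejecting a tick while the job is RUNNING is an assumption made to illustrate how overlapping invocations could be avoided.

```go
package main

import "fmt"

// JobState is a simplified stand-in for the three states named in the README.
type JobState string

const (
	StateScheduled JobState = "SCHEDULED"
	StateQueued    JobState = "QUEUED"
	StateRunning   JobState = "RUNNING"
)

// Event is a hypothetical label for whatever drives a transition.
type Event string

const (
	EventTick               Event = "TickLater"
	EventStartInvocation    Event = "StartInvocation"
	EventInvocationFinished Event = "InvocationFinished"
)

// transitions lists the only moves this sketch allows.
var transitions = map[JobState]map[Event]JobState{
	StateScheduled: {EventTick: StateQueued},
	StateQueued:    {EventStartInvocation: StateRunning},
	StateRunning:   {EventInvocationFinished: StateScheduled},
}

// Next returns the state that follows cur when ev happens. Any pair that is
// not in the table (for example a tick while RUNNING) is reported as an
// error and leaves the state unchanged.
func Next(cur JobState, ev Event) (JobState, error) {
	if next, ok := transitions[cur][ev]; ok {
		return next, nil
	}
	return cur, fmt.Errorf("event %s is not allowed in state %s", ev, cur)
}
```

Calling Next(StateScheduled, EventTick) yields StateQueued, while Next(StateRunning, EventTick) returns an error and keeps the job in StateRunning.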

Handling internal failures

The LUCI Scheduler Service relies on two GAE subsystems: Datastore Service and Task Queue Service. There are some associated concerns:

  • Datastore can effectively become read-only for extended periods of time if the underlying GAE infrastructure is under stress (e.g. the app is being migrated to another datacenter).
  • The Task Queue service's storage should be considered ephemeral. Task queues can be purged via the admin console or otherwise "drained" (imagine a bug in the code that causes the service to consume tasks but do the wrong thing with them). Since TickLater tasks are chained, a single skipped task may stop processing of a job forever.

The datastore partial availability problem is tricky because a naive implementation may retry StartInvocation tasks after failed datastore writes and accidentally launch many invocations instead of one. Imagine the service scheduling a storm of Swarming tasks or DM quests and overloading the entire infrastructure just because its datastore is having a bad day.

To work around datastore partial availability, the service always writes something to datastore before sending external requests. That way, if datastore is having issues, they are detected before an external service is hit.
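
The write-first ordering can be sketched as follows. This is an illustration under stated assumptions, not the service's code: the Store and Launcher interfaces, the MarkStarting and Launch methods, and the two fake implementations are hypothetical names introduced here. The only point is that the external call is never reached when the preceding write fails.

```go
package main

import (
	"errors"
	"fmt"
)

// Store is a hypothetical stand-in for the datastore.
type Store interface {
	MarkStarting(jobID string) error
}

// Launcher is a hypothetical stand-in for an external service
// (URL fetch, Swarming, and so on).
type Launcher interface {
	Launch(jobID string) error
}

// ErrStoreUnavailable simulates a datastore that rejects writes.
var ErrStoreUnavailable = errors.New("datastore write failed")

// StartInvocation writes to the store first and contacts the external
// service only if that write succeeded, so a retry caused by a failed
// write cannot launch a duplicate invocation.
func StartInvocation(s Store, l Launcher, jobID string) error {
	if err := s.MarkStarting(jobID); err != nil {
		return fmt.Errorf("not launching %q: %w", jobID, err)
	}
	return l.Launch(jobID)
}

// fakeStore fails every write when fail is true.
type fakeStore struct{ fail bool }

func (f *fakeStore) MarkStarting(string) error {
	if f.fail {
		return ErrStoreUnavailable
	}
	return nil
}

// countingLauncher records how many times the external service was hit.
type countingLauncher struct{ Calls int }

func (c *countingLauncher) Launch(string) error {
	c.Calls++
	return nil
}
```

With a failing fakeStore, StartInvocation returns an error and countingLauncher.Calls stays at zero no matter how many times the task is retried.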

To work around the task queue issue, the service uses a "watchdog" GAE cron task:

  1. When a TickLater or StartInvocation task is scheduled, the cron job entity is updated with WatchdogTimerTs: a timestamp in the future by which the task should already have finished.
  2. When TickLater or StartInvocation tasks run when expected, they move that timestamp further into the future.
  3. TODO: A separate "watchdog" GAE cron task, run once per minute, fetches from datastore all jobs that have WatchdogTimerTs less than Now() and repairs their state (by launching another TickLater task) or at least reports them.
  4. Since the watchdog triggering is considered an exceptional situation, the query above should return a small number of entities, and thus a single GAE cron task should be able to handle them all within the request deadline.
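
The watchdog check described above amounts to a filter on WatchdogTimerTs. The sketch below works on an in-memory slice instead of a datastore query, and the Job struct, Overdue, Bump, and the at helper are hypothetical names used only for illustration. Since the README marks the watchdog cron itself as a TODO, treat this as one possible shape of the idea.

```go
package main

import (
	"sort"
	"time"
)

// Job is a simplified stand-in for the cron job entity.
type Job struct {
	ID              string
	WatchdogTimerTs time.Time // when the pending task should have finished
}

// at is a small helper for building timestamps in examples.
func at(sec int64) time.Time { return time.Unix(sec, 0) }

// Overdue returns the sorted IDs of jobs whose WatchdogTimerTs is strictly
// before now. These are the jobs a watchdog would repair or report.
func Overdue(jobs []Job, now time.Time) []string {
	var ids []string
	for _, j := range jobs {
		if j.WatchdogTimerTs.Before(now) {
			ids = append(ids, j.ID)
		}
	}
	sort.Strings(ids)
	return ids
}

// Bump moves the timer forward, as a task that ran on time would do.
// The grace duration is an assumption; the README does not specify one.
func Bump(j *Job, now time.Time, grace time.Duration) {
	j.WatchdogTimerTs = now.Add(grace)
}
```

A job whose task ran on time calls Bump and drops out of the Overdue result; a job whose task was lost keeps its old timestamp and shows up on the next sweep.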

Directories

Path: Synopsis

  • acl: Package acl implements ACLs for enforcement in API and UI.
  • apiservers: Package apiservers implements gRPC APIs exposed by Scheduler service.
  • catalog: Package catalog implements a part that talks to luci-config service to fetch and parse job definitions.
  • engine: Package engine implements the core logic of the scheduler service.
      • cron/demo: Package demo shows how cron.Machines can be hosted with Datastore and TQ.
      • internal: Package internal contains internal structs used by the engine.
  • frontend: Package frontend implements GAE web server for luci-scheduler service.
  • messages: Package messages is a generated protocol buffer package.
  • presentation: Package presentation implements methods common to API and UI serving.
  • task: Package task defines interface between Scheduler engine and implementations of particular tasks (such as URL fetch tasks, Swarming tasks, DM tasks, etc).
      • buildbucket: Package buildbucket implements tasks that run Buildbucket jobs.
      • noop: Package noop implements tasks that do nothing at all.
      • swarming: Package swarming implements tasks that run Swarming jobs.
      • urlfetch: Package urlfetch implements tasks that just make HTTP calls.
      • utils: Package utils contains a bunch of small functions used by task/ subpackages.
  • ui: Package ui implements request handlers that serve user facing HTML pages.
