luciexe

package
v0.0.0-...-0b2320d Latest Latest
Warning

This package is not in the latest version of its module.

Go to latest
Published: Nov 26, 2024 License: Apache-2.0 Imports: 12 Imported by: 13

Documentation

Overview

Package luciexe documents the "LUCI Executable" protocol, and contains constants which are part of this protocol.

Summary

A LUCI Executable ("luciexe") is a binary which implements a protocol to:

  • Pass the initial state of the 'build' from a parent process to the luciexe.
  • Understand the build's local system contracts (like the location of cached data).
  • Asynchronously update the state of the build as it runs.

This protocol is recursive; A luciexe can run another luciexe such that the child's updates are reflected on the parent's output.

The protocol has 3 parts:

  • Host Application - This sits at the top of the luciexe process hierarchy and sets up singleton environmental requirements for the whole tree.
  • Invocation of a luciexe binary - This invocation process occurs both for the topmost luciexe, as well as all internal invocations of other luciexe's within the process hierarchy.
  • The luciexe binary - The binary has a couple of responsibilities to be compatible with this protocol. Once the binary has fulfilled it's responsibilities it's free to do what it wants (i.e. actually do its task).

In general, we strive where possible to minimize the complexity of the luciexe binary. This is because we expect to have a small number of 'host application' implementations, and a relatively large number of 'luciexe' implementations.

The Host Application

At the root of every tree of luciexe invocations there is a 'host' application which sets up an manages all environmental singletons (like the Logdog 'butler' service, the LUCI ambient authentication service, etc.). This Host Application is responsible for intercepting and merging all 'build.proto' streams emitted within this tree of luciexes (see "Recursive Invocation"). The Host Application may choose what happens to these intercepted build.proto messages.

The Host Application MUST:

  • Run a logdog butler service and expose all relevant LOGDOG_* environment variables such that the following client libraries can stream log data:
  • Golang: go.chromium.org/luci/logdog/client/butlerlib/bootstrap
  • Python: infra_libs.logdog.bootstrap
  • Hook the butler to intercept and merge build.proto streams into a single build.proto (zlib-compressed) stream.
  • Set up a local LUCI ambient authentication service which luciexe's can use to mint auth tokens.
  • Prepare an empty directory which will house tempdirs and workdirs for all luciexe invocations. The Host Application MAY clean this directory up, but it may be useful to leak it for debugging. It's permissible for the Host Application to defer this cleanup to an external process (e.g. buildbucket's agent may defer this to swarming).

The Host Application MAY hook additional streams for debugging/logging; it is frequently convenient to hook the stderr/stdout streams from the top level luciexe and tee them to the Host Application's stdout/stderr.

For example: the `go.chromium.org/luci/buildbucket/cmd/agent` binary forwards these merged build.proto messages to the Buildbucket service, and also uploads all streams to them to the Logdog cloud service. Other host implementations may instead choose to write all streams to disk, send them to /dev/null or render them as html. However, from the point of view of the luciexe that they run, this is transparent.

Host Applications MAY implement 'backpressure' on the luciexe binaries by throttling the rate at which the Logdog butler accepts data on its various streams. However, doing this could introduce timing issues in the luciexe binaries as they try to run, so this should be done thoughtfully.

If a Host Application detects a protocol violation from a luciexe within its purview, it SHOULD report the violation (in a manner of its choosing) and MUST consider the entire Build status to be INFRA_FAILURE. In addition the Host Application SHOULD attempt to kill (via process group SIGTERM/SIGKILL on *nix, and CTRL+BREAK/Terminate on windows) the luciexe hierarchy. The host application MAY provide a window of time between the initial "please stop" signal and the "you die now" signal, but this comes with the usual caveats of cleanups and deadlines (notably: best-effort clean up is just that: best-effort. It cannot be relied on to run to completion (or to run completely)).

Invocation

When invoking a luciexe, the parent process has a couple responsibilities. It must:

  • Set $TEMPDIR, $TMPDIR, $TEMP, $TMP and $MAC_CHROMIUM_TMPDIR to all point to the same, empty directory. This directory MUST be located on the same file system as CWD. This directory MUST NOT be the same as CWD.

  • Set $LUCI_CONTEXT["luciexe"]["cache_dir"] to a cache dir which makes sense for the luciexe. The cache dir MAY persist/be shared between luciexe invocations. The cache dir MAY NOT be on the same filesystem as CWD.

  • Set the $LOGDOG_NAMESPACE to a prefix which namespaces all logdog streams generated from the luciexe.

  • Set the Status of initial buildbucket.v2.Build message to `STARTED`.

  • Set the CreateTime and StartTime of initial buildbucket.v2.Build message.

  • Clear following fields in the initial buildbucket.v2.Build message.

  • EndTime

  • Output

  • StatusDetails

  • Steps

  • SummaryMarkdown

  • UpdateTime

The CWD is up to your application. Some contexts (like Buildbucket) will guarantee an empty CWD, but others (like recursive invocation) may explicitly share CWD between multiple luciexe's.

The tempdir and workdir paths SHOULD NOT be cleaned up by the invoking process. Instead, the invoking process should defer to the Host Application to provide this cleanup, since the Host Application may be configured to leak these for debugging purposes.

The invoker MUST attach the stdout/stderr to the logdog butler as text streams. These MUST be located at `$LOGDOG_NAMESPACE/std{out,err}`. Typical luciexe implementations will use these for debug logging and output, but are not required to do so.

The invoker MUST write a binary-encoded buildbucket.v2.Build to the stdin of the luciexe which contains all the input parameters that the luciexe needs to know to run successfully.

The luciexe binary

Once running, the luciexe MUST read a binary-encoded buildbucket.v2.Build message from stdin until EOF.

As per the invoker's responsibility, the luciexe binary MAY expect the status of the initial Build message to be `STARTED` and CreateTime and StartTime are populated. It MAY also expect following fields to be empty. It MUST NOT assume other fields in the Build message are set. However, the Host Application or invoker MAY fill in other fields they think are useful.

EndTime
Output
StatusDetails
Steps
SummaryMarkdown
Tags
UpdateTime

As per the Host Application's responsibilities, the luciexe binary MAY expect the "luciexe" and "local_auth" sections of LUCI_CONTEXT to be filled. Other sections of LUCI_CONTEXT MAY also be filled. See the LUCI_CONTEXT docs: https://chromium.googlesource.com/infra/luci/luci-py/+/HEAD/client/LUCI_CONTEXT.md

!!NOTE!! The paths supplied to the luciexe MUST NOT be considered stable
across invocations. Do not hard-code these, and try not to rely on their
consistency (e.g. for build reproducibility).

The luciexe binary - Updating the Build state

A luciexe MAY update the Build state by writing to a "build.proto" Logdog stream named "$LOGDOG_NAMESPACE/build.proto". A "build.proto" Logdog stream is defined as:

Content-Type: "application/luci+proto; message=buildbucket.v2.Build"
Type: Datagram

Additionally, a build.proto stream MAY append "; encoding=zlib" to the Content-Type (and compress each message accordingly). This is useful for when you potentially have very large builds.

Each datagram MUST be a valid binary-encoded buildbucket.v2.Build message. The state of the build is defined as the last Build message sent on this stream. There's no implicit accumulation between sent Build messages.

The Step.Log.Url field in the emitted Build messages can be either absolute (has a valid url scheme) or relative. If an absolute `Url` is provided, the luciexe is also responsible for supplying corresponding `ViewUrl` and the host application won't validate or make any adjustments to those urls. This allows luciexe to include logs from other sources into this build. If a relative `Url` is provided, the host application will namespace the `Url` with $LOGDOG_NAMESPACE of the build.proto stream. For example, if the host application is parsing a Build.proto in a stream named "logdog://host/project/prefix/+/something/build.proto", then a Log with a Url of "hello/world/stdout" will be transformed into:

Url:     logdog://host/project/prefix/+/something/hello/world/stdout
ViewUrl: <implementation defined>

The `ViewUrl` field in this case SHOULD be left empty, and will be filled in by the host application running the luciexe (if supplied it will be overwritten).

The following Build fields will be read from the luciexe-controlled build.proto stream:

EndTime
Output
Status
StatusDetails
Steps
SummaryMarkdown
Tags
UpdateTime

The luciexe binary - Reporting final status

A luciexe MUST report its success/failure by sending a Build message with a terminal `status` value before exiting. If the luciexe exits before sending a Build message with a terminal Status, the invoking application MUST interpret this as an INFRA_FAILURE status. The exit code of the luciexe SHOULD be ignored, except for advisory (logging/reporting) purposes. The host application MUST detect this case and fill in a final status of INFRA_FAILURE, but MUST NOT terminate the process hierarchy in this case.

Recursive Invocation

To support recursive invocation, a luciexe MUST accept the flag:

--output=path/to/file.{pb,json,textpb}

The value of this flag MUST be an absolute path to a non-existent file in an existing directory. The extension of the file dictates the data format (binary, json or text protobuf). The luciexe MUST write it's final Build message to this file in the correct format. If `--output` is specified, but no Build message (or an invalid/improperly formatted Build message) is written, the caller MUST interpret this as an INFRA_FAILURE status.

NOTE: JSON outputs SHOULD be written with the original proto field names,
not the lowerCamelCase names; downstream users may not be using jsonpb
unmarshallers to interpret the JSON data.

This may need to be revised in a subsequent version of this API
specification.

LUCI Executables MAY invoke other LUCI Executables as sub-steps and have the Steps from the child luciexe show in the parent's Build updates. This is one of the responsibilities of the Host Application.

The parent can achieve this by recording a Step S (with no children), and a Step.MergeBuild.from_logdog_stream which points to a "build.proto" stream (see "Updating the Build State"). This is called a "Merge Step", and is a directive for the host to merge Build messages from in the from_logdog_stream log here. Note that this stream name should follow the same provisions specified for "Step.Log.Url". It's the invoker's responsibility to populate $LOGDOG_NAMESPACE with the new, full, namespace. Failure to do this will result in missing logs/step data.

(SIDENOTE: There's an internal proposal to improve the Logdog "butler" API so that application code only needs to handle the relative namespaces as well, which would make this much less confusing).

The Host Application MUST append all steps from the child build.proto stream to the parent build as substeps of step S and copy the following fields of the child Build to the equivalent fields of step S *only if* step S has *non-final* status. It is the caller's responsibility to populate rest of the fields of step S if the caller explicitly marks the step status as final.

SummaryMarkdown
Status
EndTime
Output.Logs (appended)

This rule applies recursively, i.e. the child build MAY have Merge Step(s).

Each luciexe's step names should be emitted as relative names. e.g. say a build runs a sub-luciexe with the name "a|b". This sub-luciexe then runs a step "x|y|z". The top level build.proto stream will show the step "a|b|x|y|z".

The graph of datagram streams MUST be a tree. This follows from the namespacing rules of the Log.Url fields; Since Log.Url fields are relative to their build's namespace, it's only possible to have a merge step point further 'down' the tree, making it impossible to create a cycle.

Recursive Invocation - Legacy ChromeOS style

Note that there's an option in MergeBuild called 'legacy_global_namespace'. This mode is NOT RECOMMENDED, and was only added to support legacy ChromeOS builders which relied on this functionality prior to luciexe.

When this option is turned on, output properties on the build stream will be merged into the top-level build containing the Merge Step, and it will also de-namespace the step names (i.e. a MergeStep called "Parent" whose build.proto stream contains a name "Step" would normally show on the build as "Parent|Step", will instead simply show as "Step" in this mode. This also means that these merge step processes must make sure not to emit a step which duplicates the name of one emitted in the parent build (otherwise the overall build will be invalid)).

Using this functionality has the downside that properties emitted by the parent build WILL BE OVERWRITTEN by the merged build stream, if written to the same top-level property keys. In ChromeOS's case this is "OK" because the legacy recipes do a short bootstrap followed by the main payload which does all the actual stuff.

There's a version of this feature which would be more-supportable, which would be to introduce namespaced outputs so that the top-level build could reflect the outputs from a given step (i.e. allow Step to actually have its own Output message). If you feel like you need this functionality, please contact ChOps Foundation team to discuss, rather than trying to use the legacy_global_namespace option.

For implementation-level details, please refer to the following:

  • "go.chromium.org/luci/luciexe" - low-level protocol details (this module).
  • "go.chromium.org/luci/luciexe/host" - the Host Application library.
  • "go.chromium.org/luci/luciexe/invoke" - luciexe invocation.
  • "go.chromium.org/luci/luciexe/exe" - luciexe binary helper library.

Other Client Implementations

Python Recipes (https://chromium.googlesource.com/infra/luci/recipes-py) implement the LUCI Executable protocol using the "luciexe" subcommand.

TODO(iannucci): Implement a luciexe binary helper in `infra_libs` analogous
to go.chromium.org/luci/luciexe/exe and implement Recipes' support in terms
of this.

LUCI Executables on Buildbucket

Buildbucket accepts LUCI Executables as CIPD packages containing the luciexe to run with the fixed name of "luciexe".

On Windows, this may be named "luciexe.exe" or "luciexe.bat". Buildbucket will prefer the first of these which it finds.

Note that the `recipes.py bundle` command generates a "luciexe" wrapper script for compatibility with Buildbucket.

Index

Constants

View Source
const (
	// BuildProtoContentType is the ContentType of the build.proto LogDog datagram
	// stream.
	BuildProtoContentType = protoutil.BuildMediaType

	// BuildProtoZlibContentType is the ContentType of the compressed
	// build.proto LogDog datagram stream. It's the same as BuildProtoContentType
	// except it's compressed with zlib.
	BuildProtoZlibContentType = BuildProtoContentType + "; encoding=zlib"

	// BuildProtoStreamSuffix is the logdog stream name suffix for sub-lucictx
	// programs to output their build.proto stream to.
	//
	// TODO(iannucci): Maybe change protocol so that build.proto stream can be
	// opened by the invoking process instead of the invoked process.
	BuildProtoStreamSuffix = "build.proto"
)
View Source
const LogdogNamespaceEnv = bootstrap.EnvNamespace

LogdogNamespaceEnv is the environment variable name for the current Logdog namespace. This must hold a fwd-slash-separated logdog StreamName fragment.

It's permissible for this to be empty.

View Source
const (
	// OutputCLIArg is the CLI argument to luciexe binaries to instruct them to
	// dump their final Build message. The value of this flag must be an absolute
	// path to a file which doesn't exist in a directory which does (and which the
	// luciexe binary has access to write in). See Output*FileExt for valid
	// extensions.
	OutputCLIArg = "--output"
)

Variables

View Source
var TempDirEnvVars = []string{

	"MAC_CHROMIUM_TMPDIR",
	"TEMP",
	"TEMPDIR",
	"TMP",
	"TMPDIR",
}

TempDirEnvVars is the list of environment variable names which should be set to point at the temporary directory for a luciexe.

Functions

func ReadBuildFile

func ReadBuildFile(buildFilePath string) (ret *bbpb.Build, err error)

ReadBuildFile parses a Build message from a file.

This uses the file BuildFileExtension of buildFilePath to look up the appropriate codec.

If buildFilePath is "", does nothing.

func WriteBuildFile

func WriteBuildFile(buildFilePath string, build *bbpb.Build) (err error)

WriteBuildFile writes a Build message to a file.

This uses the file BuildFileExtension of buildFilePath to look up the appropriate codec.

If buildFilePath is "", does nothing.

Types

type BuildFileCodec

type BuildFileCodec interface {
	FileExtension() string
	IsNoop() bool
	Enc(build *bbpb.Build, w io.Writer) error
	Dec(build *bbpb.Build, r io.Reader) error
}

BuildFileCodec represents the set of functions to properly encode and decode a build.proto message.

var (
	// This has 'IsNoop' set to true; the Enc/Dec functions do nothing.
	BuildFileCodecNoop BuildFileCodec = buildFileCodecNoop{}

	BuildFileCodecBinary BuildFileCodec = buildFileCodecBinary{}
	BuildFileCodecJSON   BuildFileCodec = buildFileCodecJSON{}
	BuildFileCodecText   BuildFileCodec = buildFileCodecText{}
)

These are the known BuildFileCodec implementations.

func BuildFileCodecForPath

func BuildFileCodecForPath(buildFilePath string) (codec BuildFileCodec, err error)

BuildFileCodecForPath returns the file BuildFileCodec for the given buildFilePath.

If buildFilePath is empty, returns BuildFileNone.

Returns an error if buildFilePath does not have a valid BuildFileExtension.

type OutputFlag

type OutputFlag struct {
	Path  string
	Codec BuildFileCodec
}

OutputFlag can be used as a flag.Value to parse the OutputCLIArg as part of your FlagSet.

func AddOutputFlagToSet

func AddOutputFlagToSet(fs *flag.FlagSet) *OutputFlag

AddOutputFlagToSet adds a OutputFlag to the FlagSet with OutputCLIArg and a useful helpstring.

Returns the added *OutputFlag. The default value has a Codec of BuildFileNone.

func (*OutputFlag) Set

func (o *OutputFlag) Set(value string) error

Set implements flag.Value.

func (*OutputFlag) String

func (o *OutputFlag) String() string

func (*OutputFlag) Write

func (o *OutputFlag) Write(build *bbpb.Build) error

Write writes the build message to this output file with the appropriate encoding.

Directories

Path Synopsis
Package build implements a minimal, safe, easy-to-use, hard-to-use-wrong, 'luciexe binary' implementation in Go.
Package build implements a minimal, safe, easy-to-use, hard-to-use-wrong, 'luciexe binary' implementation in Go.
cv
Package cv exposes CV properties to luciexe binaries for builds.
Package cv exposes CV properties to luciexe binaries for builds.
properties
Package properties encapsulates all logic and data structures for parsing the input properties and manipulating the output properties of a LUCI Build.
Package properties encapsulates all logic and data structures for parsing the input properties and manipulating the output properties of a LUCI Build.
Package exe implements a client for the LUCI Executable ("luciexe") protocol.
Package exe implements a client for the LUCI Executable ("luciexe") protocol.
Package host implements the 'Host Application' portion of the luciexe protocol.
Package host implements the 'Host Application' portion of the luciexe protocol.
buildmerge
Package buildmerge implements the build.proto tracking and merging logic for luciexe host applications.
Package buildmerge implements the build.proto tracking and merging logic for luciexe host applications.
Package invoke implements the process of invoking a 'luciexe' compatible subprocess, but without setting up any of the 'host' requirements (like a Logdog Butler or LUCI Auth).
Package invoke implements the process of invoking a 'luciexe' compatible subprocess, but without setting up any of the 'host' requirements (like a Logdog Butler or LUCI Auth).
Package legacy holds libraries and tools for @@@annotation@@@ protocol which is the predecessor of luciexe protocol.
Package legacy holds libraries and tools for @@@annotation@@@ protocol which is the predecessor of luciexe protocol.
annotee/annotation
Package annotation implements a state machine that constructs Milo annotation protobufs from a series of annotation commands.
Package annotation implements a state machine that constructs Milo annotation protobufs from a series of annotation commands.
annotee/executor
Package executor contains an implementation of the Annotee Executor.
Package executor contains an implementation of the Annotee Executor.
annotee/proto
Package annopb contains protobuf definitions for legacy annotation.
Package annopb contains protobuf definitions for legacy annotation.
cmd/run_annotations
Command run_annotations is a LUCI executable that wraps a command that produces @@@ANNOTATIONS@@@, and converts annotations to build message.
Command run_annotations is a LUCI executable that wraps a command that produces @@@ANNOTATIONS@@@, and converts annotations to build message.

Jump to

Keyboard shortcuts

? : This menu
/ : Search site
f or F : Jump to
y or Y : Canonical URL