vpython

package
v0.0.0-...-a0a3655 Latest
Warning

This package is not in the latest version of its module.

Published: Jun 28, 2019 License: Apache-2.0 Imports: 13 Imported by: 0

README

vpython - simple and easy VirtualEnv Python

vpython is a tool, written in Go, which enables the simple and easy invocation of Python code in VirtualEnv environments.

vpython is a simple Python bootstrap which (almost) transparently wraps a Python interpreter invocation to run in a tailored VirtualEnv environment. The environment is expressed by a script-specific configuration file. This allows each Python script to trivially express its own package-level dependencies and run in a hermetic world consisting of just those dependencies.

When invoking such a script via vpython, the tool downloads its dependencies and prepares an immutable VirtualEnv containing them. It then invokes the script, now running in that VirtualEnv, through the preferred Python interpreter.

vpython does its best not to use hacky mechanisms to achieve this. It uses an unmodified VirtualEnv package, standard setup methods, and local system resources. The result is transparent canonical VirtualEnv environment bootstrapping that meets the expectations of standard Python packages. vpython is also safe for concurrent invocation, using safe filesystem-level locking to perform any environment setup and management.

vpython itself is very fast. The wheel downloads and VirtualEnvs may also be cached and re-used, optimally limiting the runtime overhead of vpython to just one initial setup per unique environment.

Setup and Invocation

For the standard case, employing vpython is as simple as:

  1. Create a vpython VirtualEnv specification (or don't, if no additional packages are needed).
  2. Invoke your script through vpython instead of python.

If additional Python libraries are needed, you may create new packages for those libraries. This is done in an implementation-specific way (e.g., upload wheels as packages to CIPD).

Once the packages are available:

  • Add vpython to PATH.
  • Write an environment specification naming packages.
  • Change tool invocation from python to vpython.

Using vpython offers several benefits over direct Python invocation, especially when vendoring packages. Notably, with vpython:

  • It trivially enables hermetic Python everywhere, greatly increasing control and removing per-system differences in Python packages and environment.
  • It handles situations that system-level packages cannot accommodate, such as different scripts with different versions of packages running in them.
  • No sys.path manipulation is needed to load vendored or imported packages.
  • Any tool can define which package(s) it needs without requiring coordination or cooperation from other tools. (Note that the package must be made available for download first.)
  • Adding new Python dependencies to a project is non-invasive and immediate.
  • Package downloading and deployment are baked into vpython and built on fast and secure Google Cloud Platform technologies.
  • No more custom bootstraps. Several projects and tools, including multiple places within Chrome's infra code base, have bootstrap scripts that vendor packages or mimic a VirtualEnv. These are at best repetitive and, at worst, buggy and insecure.
  • Dependencies are explicitly stated, not assumed, and consistent between deployments.

Why VirtualEnv?

VirtualEnv offers several benefits over system Python. Primarily, it is the de facto encapsulated environment method used by the Python community and is generally used as the standard for a functional deployable package.

By using the same environment everywhere, Python invocations become reproducible. A tool run on a developer's system will load the same versions of the same libraries as it will on a production system. A production system will no longer fail because it is missing a package, because it has the wrong version of that package, or because a package is incompatible with another installed package.

A direct mechanism for vendoring, sys.path manipulation, is nuanced, buggy, and unsupported by the Python community. It is difficult to do correctly on all platforms in all environments for all packages. A notorious example of this is protobuf and other domain-bound packages, which actively fight sys.path inclusion and require special non-intuitive hacks to work. Using VirtualEnv means that any compliant Python package can trivially be included into a project.

Why CIPD?

CIPD is a cross-platform service and associated tooling and packages used to securely fetch and deploy immutable "packages" (~= zip files) into the local file system. Unlike package managers, it avoids platform-specific assumptions, executable hooks, or the complexities of dependency resolution. vpython uses this as a mechanism for housing and deploying wheels.

Unlike a pip requirement, a CIPD package is defined by its content, enabling precise package matching instead of fuzzy version matching (e.g., in pip, both numpy >= 1.2 and numpy == 1.2 can match multiple numpy packages).

CIPD also supports ACLs, enabling privileged Python projects to easily vendor sensitive packages.

Why wheels?

A Python wheel is a simple binary distribution of Python code. A wheel can be generic (pure Python) or system- and architecture-bound (e.g., 64-bit Mac OSX).

Wheels are preferred over Python eggs because they come packaged with compiled binaries. This makes their deployment fast and simple: unpack via pip. It also reduces system requirements and variation, since local compilation, headers, and build tools are not enlisted during installation.

The increased management burden of maintaining separate wheels for the same package, one for each architecture, is handled naturally by CIPD, removing the only real pain point.

Wheel Guidance

This section contains recommendations for building or uploading wheel CIPD packages, including platform-specific guidance.

CIPD wheel packages are CIPD packages that contain Python wheels. A given CIPD package can contain multiple wheels for multiple platforms, but should only contain one version of any given package for any given architecture/platform.

For example, you can bundle a Windows, Linux, and Mac OSX version of numpy and coverage in the same CIPD package, but you should not bundle numpy==1.11 and numpy==1.12 in the same package.

The reason for this is that vpython identifies which wheels to install by scanning the contents of the CIPD package, and if multiple versions appear, there is no clear guidance about which should be used.
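The one-version-per-platform rule can be checked before upload. The following is a minimal sketch, not vpython's own scanning logic: it assumes wheel filenames follow the PEP 427 layout (distribution, version, optional build tag, Python tag, ABI tag, platform tag), and the names parseWheel and conflictingVersions, as well as the numpy filenames, are hypothetical and used only for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// wheelInfo holds the components of a wheel filename, assuming the PEP 427
// layout: {distribution}-{version}(-{build})?-{python}-{abi}-{platform}.whl
type wheelInfo struct {
	Name, Version, Python, ABI, Platform string
}

// parseWheel splits a wheel filename into its components.
// Hypothetical helper; it is not part of vpython.
func parseWheel(filename string) (wheelInfo, error) {
	if !strings.HasSuffix(filename, ".whl") {
		return wheelInfo{}, fmt.Errorf("%q is not a wheel filename", filename)
	}
	parts := strings.Split(strings.TrimSuffix(filename, ".whl"), "-")
	n := len(parts)
	if n != 5 && n != 6 {
		return wheelInfo{}, fmt.Errorf("%q has %d components, want 5 or 6", filename, n)
	}
	return wheelInfo{
		Name:     parts[0],
		Version:  parts[1],
		Python:   parts[n-3],
		ABI:      parts[n-2],
		Platform: parts[n-1],
	}, nil
}

// conflictingVersions returns one "name/platform" entry for every
// distribution that appears with more than one version on the same platform.
func conflictingVersions(filenames []string) ([]string, error) {
	first := map[string]string{} // "name/platform" -> first version seen
	reported := map[string]bool{}
	var conflicts []string
	for _, f := range filenames {
		w, err := parseWheel(f)
		if err != nil {
			return nil, err
		}
		key := w.Name + "/" + w.Platform
		prev, ok := first[key]
		switch {
		case !ok:
			first[key] = w.Version
		case prev != w.Version && !reported[key]:
			reported[key] = true
			conflicts = append(conflicts, key)
		}
	}
	return conflicts, nil
}

func main() {
	conflicts, err := conflictingVersions([]string{
		"coverage-4.3.4-cp27-cp27mu-manylinux1_x86_64.whl",
		"coverage-4.3.4-cp27-cp27m-win_amd64.whl",
		// Hypothetical filenames: two numpy versions for one platform.
		"numpy-1.11.0-cp27-cp27mu-manylinux1_x86_64.whl",
		"numpy-1.12.0-cp27-cp27mu-manylinux1_x86_64.whl",
	})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("conflicts:", conflicts)
}
```

A package that bundles the same version for several platforms yields no conflicts, while the two numpy wheels above are reported because they share a platform tag.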

Mac OSX

Use wheels with the m ABI suffix and the macosx_... platform. vpython installs wheels with the --force flag, so slight binary incompatibilities (e.g., specific OSX versions) can be glossed over. For example:

coverage-4.3.4-cp27-cp27m-macosx_10_10_x86_64.whl

Linux

Use wheels with the mu ABI suffix and the manylinux1 platform. For example:

coverage-4.3.4-cp27-cp27mu-manylinux1_x86_64.whl

Windows

Use wheels with the cp27m or none ABI tag. For example:

coverage-4.3.4-cp27-cp27m-win_amd64.whl

Setup and Invocation

vpython can be invoked by replacing python in the command-line with vpython.

vpython works with a default Python environment out of the box. To add vendored packages, you need to define an environment specification file that describes which wheels to install.

An environment specification file is a text protobuf whose schema is the Spec message (see the api/vpython package under Directories). An example is:

# Any 2.7 interpreter will do.
python_version: "2.7"

# Include "numpy" for the current architecture.
wheel {
  name: "infra/python/wheels/numpy/${platform}-${arch}"
  version: "version:1.11.0"
}

# Include "coverage" for the current architecture.
wheel {
  name: "infra/python/wheels/coverage/${platform}-${arch}"
  version: "version:4.1"
}
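The ${platform} and ${arch} placeholders in the wheel names are filled in for the current system when the specification is resolved. As a rough illustration of that substitution only, the sketch below expands a name with Go's os.Expand. The function expandPackageName and the values "linux" and "amd64" are assumptions made for the example; the actual values and the expansion itself belong to the package backend, not to this code.

```go
package main

import (
	"fmt"
	"os"
)

// expandPackageName substitutes ${name} placeholders in a package name
// template. Placeholders missing from vars expand to the empty string.
// Hypothetical helper; it is not part of vpython.
func expandPackageName(template string, vars map[string]string) string {
	return os.Expand(template, func(key string) string { return vars[key] })
}

func main() {
	// Assumed values for illustration; the real ones depend on the host.
	vars := map[string]string{"platform": "linux", "arch": "amd64"}
	fmt.Println(expandPackageName("infra/python/wheels/numpy/${platform}-${arch}", vars))
}
```

With the assumed values, the numpy entry from the example resolves to infra/python/wheels/numpy/linux-amd64.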

This specification can be supplied in one of four ways:

  • Explicitly, as a command-line option to vpython (-vpython-spec).
  • Implicitly, as a file alongside your entry point. For example, if you are running test_runner.py, vpython will look for test_runner.py.vpython next to it and load the environment from there.
  • Implicitly, inlined in your main file. vpython will scan the main entry point for sentinel text and, if present, load the specification from that.
  • Implicitly, through the VPYTHON_DEFAULT_SPEC environment variable.
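The file-alongside-entry-point rule is simple enough to sketch. The code below is illustrative and is not vpython's spec loader: sidecarSpecPath and findSidecarSpec are hypothetical names, and the existence check is passed in as a function so the example stays free of filesystem state. It shows only the naming convention from the list above (append .vpython to the script path) and says nothing about which of the four sources takes precedence.

```go
package main

import (
	"fmt"
	"os"
)

// sidecarSpecPath returns the path where a specification file is expected
// next to the given script, e.g. "test_runner.py" -> "test_runner.py.vpython".
func sidecarSpecPath(script string) string {
	return script + ".vpython"
}

// findSidecarSpec reports the sidecar specification for script, if exists
// says the candidate path is present.
func findSidecarSpec(script string, exists func(path string) bool) (string, bool) {
	candidate := sidecarSpecPath(script)
	if exists(candidate) {
		return candidate, true
	}
	return "", false
}

// fileExists is one possible existence check, backed by the real filesystem.
func fileExists(path string) bool {
	info, err := os.Stat(path)
	return err == nil && !info.IsDir()
}

func main() {
	if spec, ok := findSidecarSpec("test_runner.py", fileExists); ok {
		fmt.Println("using specification:", spec)
	} else {
		fmt.Println("no sidecar specification found")
	}
}
```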

Optimization and Caching

vpython has several levels of caching that it employs to optimize setup and invocation overhead.

VirtualEnv

Once a VirtualEnv specification has been resolved, its resulting pinned specification is hashed and used as a key to that VirtualEnv. Other vpython invocations expressing the same environment will naturally re-use that VirtualEnv instead of creating their own.
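The idea of keying an environment by a hash of its pinned specification can be sketched as follows. This is an assumption-laden illustration, not vpython's actual scheme: the source does not say which hash function or serialization is used, so SHA-256 over a sorted textual form is chosen here, and pinnedWheel and envKey are hypothetical names.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"sort"
	"strings"
)

// pinnedWheel is a simplified stand-in for one resolved wheel entry.
type pinnedWheel struct {
	Name, Version string
}

// envKey derives a stable key from a pinned specification. Wheels are sorted
// on a copy first, so two specifications that list the same wheels in a
// different order map to the same key.
func envKey(pythonVersion string, wheels []pinnedWheel) string {
	sorted := append([]pinnedWheel(nil), wheels...)
	sort.Slice(sorted, func(i, j int) bool {
		if sorted[i].Name != sorted[j].Name {
			return sorted[i].Name < sorted[j].Name
		}
		return sorted[i].Version < sorted[j].Version
	})
	var b strings.Builder
	fmt.Fprintf(&b, "python=%s\n", pythonVersion)
	for _, w := range sorted {
		fmt.Fprintf(&b, "wheel=%s@%s\n", w.Name, w.Version)
	}
	sum := sha256.Sum256([]byte(b.String()))
	return hex.EncodeToString(sum[:])
}

func main() {
	key := envKey("2.7", []pinnedWheel{
		{Name: "infra/python/wheels/numpy/linux-amd64", Version: "version:1.11.0"},
		{Name: "infra/python/wheels/coverage/linux-amd64", Version: "version:4.1"},
	})
	fmt.Println("environment key:", key)
}
```

Because the key depends only on the pinned content, any invocation that resolves to the same interpreter version and wheels lands on the same directory name and can reuse the existing environment.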

Download Caching

Download mechanisms (e.g., CIPD) can optionally include a package cache to avoid the overhead of downloading and/or resolving a package multiple times.

Migration

Command-line

vpython is a natural replacement for python in the command line:

python ./foo/bar/baz.py -d --flag value arg arg whatever

Becomes:

vpython ./foo/bar/baz.py -d --flag value arg arg whatever

The vpython tool accepts its own command-line arguments. In this case, use a -- separator to differentiate between vpython options and python options:

vpython -vpython-spec /path/to/spec.vpython -- ./foo/bar/baz.py
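The effect of the -- separator can be pictured with a small sketch. It is a simplification and does not reproduce vpython's real flag parsing: splitArgs is a hypothetical function that treats everything before the first -- as tool options and everything after it as the Python command line, and passes all arguments to Python when no separator is present.

```go
package main

import "fmt"

// splitArgs separates tool options from the Python command line at the first
// "--". Without a separator, every argument is treated as a Python argument.
// Hypothetical helper; it is not part of vpython.
func splitArgs(args []string) (toolArgs, pythonArgs []string) {
	for i, a := range args {
		if a == "--" {
			return args[:i], args[i+1:]
		}
	}
	return nil, args
}

func main() {
	tool, py := splitArgs([]string{
		"-vpython-spec", "/path/to/spec.vpython", "--", "./foo/bar/baz.py",
	})
	fmt.Println("tool options:  ", tool)
	fmt.Println("python command:", py)
}
```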

Shebang (POSIX)

If your script uses implicit specification (file or inline), replacing python with vpython in your shebang line will automatically work.

#!/usr/bin/env vpython

Documentation

Overview

Package vpython implements the vpython tool and associated libraries.

Index

Constants

This section is empty.

Variables

var IsUserError = errors.BoolTag{
	Key: errors.NewTagKey("this error occurred due to a user input."),
}

IsUserError is tagged into errors caused by bad user inputs (e.g. modules or scripts which don't exist).

Functions

func Exec

func Exec(c context.Context, interp *python.Interpreter, cl *python.CommandLine, env environ.Env, dir string, setupFn func() error) error

Exec runs the specified Python command.

Once the process launches, Context cancellation will not have an impact.

interp is the Python interpreter to run.

cl is the populated CommandLine to run.

env is the environment to install.

dir, if not empty, is the working directory of the command.

setupFn, if not nil, is a function that will be run immediately before execution, after all operations that are permitted to fail have completed. Any error returned here will result in a panic.

If an error occurs during execution, it will be returned here. Otherwise, Exec will not return, and this process will exit with the return code of the executed process.

The implementation of Exec is platform-specific.

func Run

func Run(c context.Context, opts Options) error

Run sets up a Python VirtualEnv and executes the supplied Options.

If the Python interpreter was successfully launched, Run will never return, and the process will exit with the return code of the Python interpreter.

If the Python environment could not be set up, or if the interpreter could not be invoked, Run will return a non-nil error.

Run consists of:

  • Identifying the target Python script to run (if there is one).
  • Identifying the Python interpreter to use.
  • Composing the environment specification.
  • Constructing the virtual environment (download, install).
  • Executing the Python process with the supplied arguments.

The Python subprocess is bound to the lifetime of c, and will be terminated if c is cancelled.

Types

type Options

type Options struct {
	// The Python command-line to execute. Must not be nil.
	CommandLine *python.CommandLine

	// EnvConfig is the VirtualEnv configuration to run from.
	EnvConfig venv.Config

	// DefaultSpec is the default specification to use, if no specification was
	// supplied or probed.
	DefaultSpec vpython.Spec

	// BaseWheels is the set of wheels to include in the spec. These will always
	// be merged into the runtime spec and normalized, such that any duplicate
	// wheels will be deduplicated.
	BaseWheels []*vpython.Spec_Package

	// SpecLoader is the spec.Loader to use to load a specification file for a
	// given script.
	//
	// The empty value is a valid default spec.Loader.
	SpecLoader spec.Loader

	// WaitForEnv, if true, means that if another agent holds a lock on the target
	// environment, we will wait until it is available. If false, we will
	// immediately exit Setup with an error.
	WaitForEnv bool

	// WorkDir is the Python working directory. If empty, the current working
	// directory will be used.
	//
	// If EnvRoot is empty, WorkDir will be used as the base environment root.
	WorkDir string

	// Environ is environment to pass to subprocesses.
	Environ environ.Env

	// ClearPythonPath, if true, instructs vpython to clear the PYTHONPATH
	// environment variable prior to launch.
	//
	// TODO(iannucci): Delete this once we're satisfied that PYTHONPATH exports
	// are under control.
	ClearPythonPath bool
}

Options is the set of options to use to construct and execute a VirtualEnv Python application.

func (*Options) ResolveSpec

func (o *Options) ResolveSpec(c context.Context) error

ResolveSpec resolves the configured environment specification. The resulting spec is installed into o's EnvConfig.Spec field.

Directories

Path         Synopsis
api/vpython  Package vpython contains `vpython` environment definition protobufs.
assets       Package assets is generated by go.chromium.org/luci/tools/cmd/assets.
