README
Python Gazelle plugin
Gazelle is a build file generator for Bazel projects. It can create new BUILD.bazel files for a project that follows language conventions, and it can update existing build files to include new sources, dependencies, and options.
Gazelle may be run by Bazel using the gazelle rule, or it may be installed and run as a command line tool.
This directory contains a plugin for Gazelle that generates BUILD file content for Python code. When Gazelle is run as a command line tool with this plugin, it embeds a Python interpreter resolved during the plugin build.
The behavior of the plugin differs slightly between interpreter versions, because the Python stdlib changes with every minor version release.
Distributors of Gazelle binaries should, therefore, build a Gazelle binary for each OS+CPU architecture+Minor Python version combination they are targeting.
The following instructions are for when you use bzlmod. Please refer to older documentation that includes instructions on how to use Gazelle without using bzlmod as your dependency manager.
Example
We have an example of using Gazelle with Python located here.
A fully-working example without using bzlmod is in examples/build_file_generation.
The following documentation covers using bzlmod.
Adding Gazelle to your project
First, you'll need to add Gazelle to your MODULE.bazel file.
Get the current version of Gazelle from its releases page: https://github.com/bazelbuild/bazel-gazelle/releases/.
To configure rules_python, see the installation MODULE.bazel snippet on the Releases page: https://github.com/bazelbuild/rules_python/releases.
You will also need to add a bazel_dep for rules_python_gazelle_plugin.
Here is a snippet of a MODULE.bazel file.
# The following stanza defines the dependency rules_python.
bazel_dep(name = "rules_python", version = "0.22.0")
# The following stanza defines the dependency rules_python_gazelle_plugin.
# For typical setups you set the version.
bazel_dep(name = "rules_python_gazelle_plugin", version = "0.22.0")
# The following stanza defines the dependency gazelle.
bazel_dep(name = "gazelle", version = "0.31.0", repo_name = "bazel_gazelle")
# Import the python repositories generated by the given module extension into the scope of the current module.
# NOTE: `python` must already be bound to the rules_python module extension via
# use_extension(...); see the rules_python release notes for the exact stanza.
use_repo(python, "python3_9")
use_repo(python, "python3_9_toolchains")
# Register an already-defined toolchain so that Bazel can use it during toolchain resolution.
register_toolchains(
    "@python3_9_toolchains//:all",
)
# Use the pip extension
pip = use_extension("@rules_python//python:extensions.bzl", "pip")
# Use the extension to call the `pip_repository` rule that invokes `pip`, with `incremental` set.
# Accepts a locked/compiled requirements file and installs the dependencies listed within.
# Those dependencies become available in a generated `requirements.bzl` file.
# You can instead check this `requirements.bzl` file into your repo.
# Because this project has different requirements for windows vs other
# operating systems, we have requirements for each.
pip.parse(
    name = "pip",
    requirements_lock = "//:requirements_lock.txt",
    requirements_windows = "//:requirements_windows.txt",
)
# Imports the pip toolchain generated by the given module extension into the scope of the current module.
use_repo(pip, "pip")
Next, we'll fetch metadata about your Python dependencies, so that gazelle can determine which package a given import statement comes from. This is provided by the modules_mapping rule. We'll make a target for consuming this modules_mapping, and writing it as a manifest file for Gazelle to read. This is checked into the repo for speed, as it takes some time to calculate in a large monorepo.
Gazelle will walk up the filesystem from a Python file to find this metadata, looking for a file called gazelle_python.yaml in an ancestor folder of the Python code. Create an empty file with this name. It might be next to your requirements.txt file. (You can just use touch at this point, it just needs to exist.)
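The upward search described above can be pictured with a short Python sketch. This is only an illustration of the lookup idea, not the plugin's own code (the plugin is written in Go); the function name find_manifest is hypothetical.

```python
from pathlib import Path
from typing import Optional


def find_manifest(python_file: Path, name: str = "gazelle_python.yaml") -> Optional[Path]:
    """Return the nearest manifest in the file's folder or any ancestor folder."""
    start = python_file.resolve().parent
    # Check the containing folder first, then each ancestor up to the filesystem root.
    for folder in (start, *start.parents):
        candidate = folder / name
        if candidate.is_file():
            return candidate
    return None
```

With a manifest at the repository root, a call such as find_manifest(Path("a/b/mod.py")) would return the root-level gazelle_python.yaml unless a closer one exists.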
To keep the metadata updated, put this in your BUILD.bazel file next to gazelle_python.yaml:
load("@pip//:requirements.bzl", "all_whl_requirements")
load("@rules_python_gazelle_plugin//manifest:defs.bzl", "gazelle_python_manifest")
load("@rules_python_gazelle_plugin//modules_mapping:def.bzl", "modules_mapping")
# This rule fetches the metadata for python packages we depend on. That data is
# required for the gazelle_python_manifest rule to update our manifest file.
modules_mapping(
    name = "modules_map",
    wheels = all_whl_requirements,
)
# Gazelle python extension needs a manifest file mapping from
# an import to the installed package that provides it.
# This macro produces two targets:
# - //:gazelle_python_manifest.update can be used with `bazel run`
# to recalculate the manifest
# - //:gazelle_python_manifest.test is a test target ensuring that
# the manifest doesn't need to be updated
gazelle_python_manifest(
    name = "gazelle_python_manifest",
    modules_mapping = ":modules_map",
    # This is what we called our `pip_parse` rule, where third-party
    # python libraries are loaded in BUILD files.
    pip_repository_name = "pip",
    # This should point to wherever we declare our python dependencies
    # (the same as what we passed to the modules_mapping rule in WORKSPACE)
    # This argument is optional. If provided, the `.test` target is very
    # fast because it just has to check an integrity field. If not provided,
    # the integrity field is not added to the manifest which can help avoid
    # merge conflicts in large repos.
    requirements = "//:requirements_lock.txt",
    # include_stub_packages: bool (default: False)
    # If set to True, this flag automatically includes any corresponding type stub packages
    # for the third-party libraries that are present and used. For example, if you have
    # `boto3` as a dependency, and this flag is enabled, the corresponding `boto3-stubs`
    # package will be automatically included in the BUILD file.
    #
    # Enabling this feature helps ensure that type hints and stubs are readily available
    # for tools like type checkers and IDEs, improving the development experience and
    # reducing manual overhead in managing separate stub packages.
    include_stub_packages = True,
)
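To give a feel for what the manifest is used for, here is a minimal Python sketch of mapping an import statement to a third-party label. The dictionary shape, the prefix-matching strategy, and the function name resolve_import are assumptions made for illustration; they are not taken from the plugin's implementation.

```python
from typing import Dict, Optional


def resolve_import(
    import_name: str,
    modules_mapping: Dict[str, str],  # hypothetical shape: module name -> distribution name
    pip_repo: str = "pip",
) -> Optional[str]:
    """Map a dotted import to a label such as @pip//numpy, or None if unknown."""
    parts = import_name.split(".")
    # Try the full dotted name first, then progressively shorter prefixes.
    for end in range(len(parts), 0, -1):
        dist = modules_mapping.get(".".join(parts[:end]))
        if dist is not None:
            return f"@{pip_repo}//{dist}"
    return None
```

Under these assumptions, an import of numpy.linalg with a mapping entry for numpy would resolve to @pip//numpy.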
Finally, you create a target that you'll invoke to run the Gazelle tool with the rules_python extension included. This typically goes in your root /BUILD.bazel file:
load("@bazel_gazelle//:def.bzl", "gazelle")
# Our gazelle target points to the python gazelle binary.
# This is the simple case where we only need one language supported.
# If you also had proto, go, or other gazelle-supported languages,
# you would also need a gazelle_binary rule.
# See https://github.com/bazelbuild/bazel-gazelle/blob/master/extend.rst#example
gazelle(
    name = "gazelle",
    gazelle = "@rules_python_gazelle_plugin//python:gazelle_binary",
)
That's it. Now you can run bazel run //:gazelle anytime you edit Python code, and it should update your BUILD files correctly.
Usage
Gazelle is non-destructive.
It will try to leave your edits to BUILD files alone, only making updates to py_* targets.
However, it will remove dependencies that appear to be unused, so it's a good idea to check in your work before running Gazelle so you can easily revert any changes it made.
The rules_python extension assumes some conventions about your Python code. These are noted below, and might require changes to your existing code.
Note that the gazelle program has multiple commands. At present, only the update command (the default) does anything for Python code.
Directives
You can configure the extension using directives, just like for other languages. These are just comments in the BUILD.bazel file which govern behavior of the extension when processing files under that folder.
See https://github.com/bazelbuild/bazel-gazelle#directives for some general directives that may be useful. In particular, the resolve directive is language-specific and can be used with Python.
Examples of these directives in use can be found in the /gazelle/testdata folder in the rules_python repo.
Python-specific directives are as follows:
Directive | Default value | Description |
---|---|---|
# gazelle:python_extension | enabled | Controls whether the Python extension is enabled or not. Sub-packages inherit this value. Can be either "enabled" or "disabled". |
# gazelle:python_root | n/a | Sets a Bazel package as a Python root. This is used on monorepos with multiple Python projects that don't share the top-level of the workspace as the root. See Directive: python_root below. |
# gazelle:python_manifest_file_name | gazelle_python.yaml | Overrides the default manifest file name. |
# gazelle:python_ignore_files | n/a | Controls the files which are ignored from the generated targets. |
# gazelle:python_ignore_dependencies | n/a | Controls the ignored dependencies from the generated targets. |
# gazelle:python_validate_import_statements | true | Controls whether the Python import statements should be validated. Can be "true" or "false". |
# gazelle:python_generation_mode | package | Controls the target generation mode. Can be "file", "package", or "project". |
# gazelle:python_generation_mode_per_file_include_init | false | Controls whether __init__.py files are included as srcs in each generated target when target generation mode is "file". Can be "true" or "false". |
# gazelle:python_generation_mode_per_package_require_test_entry_point | true | Controls whether a file called __test__.py or a target called __test__ is required to generate one test target per package in package mode. |
# gazelle:python_library_naming_convention | $package_name$ | Controls the py_library naming convention. It interpolates $package_name$ with the Bazel package name. E.g. if the Bazel package name is foo, setting this to $package_name$_my_lib would result in a generated target named foo_my_lib. |
# gazelle:python_binary_naming_convention | $package_name$_bin | Controls the py_binary naming convention. Follows the same interpolation rules as python_library_naming_convention. |
# gazelle:python_test_naming_convention | $package_name$_test | Controls the py_test naming convention. Follows the same interpolation rules as python_library_naming_convention. |
# gazelle:resolve py ... | n/a | Instructs the plugin what target to add as a dependency to satisfy a given import statement. The syntax is # gazelle:resolve py import-string label where import-string is the symbol in the python import statement, and label is the Bazel label that Gazelle should write in deps. |
# gazelle:python_default_visibility labels | //$python_root$:__subpackages__ | Instructs gazelle to use these visibility labels on all python targets. labels is a comma-separated list of labels (without spaces). |
# gazelle:python_visibility label | | Appends additional visibility labels to each generated target. This directive can be set multiple times. |
# gazelle:python_test_file_pattern | *_test.py,test_*.py | Filenames matching these comma-separated globs will be mapped to py_test targets. |
# gazelle:python_label_convention | $distribution_name$ | Defines the format of the distribution name in labels to third-party deps. Useful for using Gazelle plugin with other rules with different repository conventions (e.g. rules_pycross). Full label is always prepended with (pip) repository name, e.g. @pip//numpy. |
# gazelle:python_label_normalization | snake_case | Controls how distribution names in labels to third-party deps are normalized. Useful for using Gazelle plugin with other rules with different label conventions (e.g. rules_pycross uses PEP-503). Can be "snake_case", "none", or "pep503". |
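The $package_name$ interpolation used by the three naming-convention directives in the table above amounts to a plain string substitution. A minimal Python sketch, assuming nothing beyond what the table states (the function name interpolate_name is hypothetical):

```python
def interpolate_name(convention: str, package_name: str) -> str:
    """Replace every $package_name$ placeholder with the Bazel package name."""
    return convention.replace("$package_name$", package_name)


# Example from the table: package "foo" with convention "$package_name$_my_lib".
print(interpolate_name("$package_name$_my_lib", "foo"))  # foo_my_lib
```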
Directive: python_root
Set this directive within the Bazel package that you want to use as the Python root. For example, if using a src dir (as recommended by the Python Packaging User Guide), then set this directive in src/BUILD.bazel:
# ./src/BUILD.bazel
# Tell gazelle that our python root is the same dir as this Bazel package.
# gazelle:python_root
Note that the directive does not have any arguments.
Gazelle will then add the necessary imports attribute to all targets that it generates:
# in ./src/foo/BUILD.bazel
py_library(
    ...
    imports = [".."], # Gazelle adds this
    ...
)
# in ./src/foo/bar/BUILD.bazel
py_library(
    ...
    imports = ["../.."], # Gazelle adds this
    ...
)
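The imports values shown above are the relative path from each Bazel package back to the Python root. As a rough sketch of that relationship (an illustration, not the plugin's code; the function name imports_for is hypothetical):

```python
import posixpath


def imports_for(package: str, python_root: str) -> str:
    """Relative path from a Bazel package directory back to the Python root."""
    return posixpath.relpath(python_root, start=package)


print(imports_for("src/foo", "src"))      # ..
print(imports_for("src/foo/bar", "src"))  # ../..
```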
Directive: python_default_visibility
Instructs gazelle to use these visibility labels on all python targets (typically py_*, but can be modified via the map_kind directive). The arg to this directive is a comma-separated list (without spaces) of labels.
For example:
# gazelle:python_default_visibility //:__subpackages__,//tests:__subpackages__
produces the following visibility attribute:
py_library(
    ...,
    visibility = [
        "//:__subpackages__",
        "//tests:__subpackages__",
    ],
    ...,
)
You can also inject the python_root value by using the exact string $python_root$. All instances of this string will be replaced by the python_root value.
# gazelle:python_default_visibility //$python_root$:__pkg__,//foo/$python_root$/tests:__subpackages__
# Assuming the "# gazelle:python_root" directive is set in ./py/src/BUILD.bazel,
# the results will be:
py_library(
    ...,
    visibility = [
        "//foo/py/src/tests:__subpackages__", # sorted alphabetically
        "//py/src:__pkg__",
    ],
    ...,
)
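The substitution and ordering in the example above can be reproduced with a few lines of Python. This is a sketch of the described behavior only; the function name expand_visibility is hypothetical.

```python
from typing import List


def expand_visibility(directive_value: str, python_root: str) -> List[str]:
    """Split the comma-separated labels, fill in $python_root$, and sort them."""
    labels = directive_value.split(",")
    return sorted(label.replace("$python_root$", python_root) for label in labels)


value = "//$python_root$:__pkg__,//foo/$python_root$/tests:__subpackages__"
print(expand_visibility(value, "py/src"))
```

With the python_root set in ./py/src/BUILD.bazel, this yields the same two labels, in the same order, as the visibility attribute shown above.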
Two special values are also accepted as an argument to the directive:
- NONE: This removes all default visibility. Labels added by the python_visibility directive are still included.
- DEFAULT: This resets the default visibility.
For example:
# gazelle:python_default_visibility NONE
py_library(
    name = "...",
    srcs = [...],
)
# gazelle:python_default_visibility //foo:bar
# gazelle:python_default_visibility DEFAULT
py_library(
    ...,
    visibility = ["//:__subpackages__"],
    ...,
)
These special values can be useful for sub-packages.
Directive: python_visibility
Appends additional visibility labels to each generated target.
This directive can be set multiple times. The generated visibility attribute will include the default visibility and all labels defined by this directive. All labels will be ordered alphabetically.
# ./BUILD.bazel
# gazelle:python_visibility //tests:__pkg__
# gazelle:python_visibility //bar:baz
py_library(
    ...
    visibility = [
        "//:__subpackages__", # default visibility
        "//bar:baz",
        "//tests:__pkg__",
    ],
    ...
)
Child Bazel packages inherit values from parents:
# ./bar/BUILD.bazel
# gazelle:python_visibility //tests:__subpackages__
py_library(
    ...
    visibility = [
        "//:__subpackages__", # default visibility
        "//bar:baz", # defined in ../BUILD.bazel
        "//tests:__pkg__", # defined in ../BUILD.bazel
        "//tests:__subpackages__", # defined in this ./BUILD.bazel
    ],
    ...
)
This directive also supports the $python_root$ placeholder that # gazelle:python_default_visibility supports.
# gazelle:python_visibility //$python_root$/foo:bar
py_library(
    ...
    visibility = ["//this_is_my_python_root/foo:bar"],
    ...
)
Directive: python_test_file_pattern
This directive adjusts which python files will be mapped to the py_test rule.
- The default is *_test.py,test_*.py: both test_*.py and *_test.py files will generate py_test targets.
- This directive must have a value. If no value is given, an error will be raised.
- It is recommended, though not necessary, to include the .py extension in the globs: foo*.py,?at.py.
- Like most directives, it applies to the current Bazel package and all subpackages until the directive is set again.
- This directive accepts multiple glob patterns, separated by commas without spaces:
# gazelle:python_test_file_pattern foo*.py,?at
py_library(
    name = "mylib",
    srcs = ["mylib.py"],
)
py_test(
    name = "foo_bar",
    srcs = ["foo_bar.py"],
)
py_test(
    name = "cat",
    srcs = ["cat.py"],
)
py_test(
    name = "hat",
    srcs = ["hat.py"],
)
Resetting to the default value (such as in a subpackage) is manual. Set:
# gazelle:python_test_file_pattern *_test.py,test_*.py
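To experiment with how a comma-separated pattern list classifies file names, the following Python sketch may help. It uses the standard-library fnmatch module as a stand-in for the plugin's glob matching, which is an assumption; edge cases may differ from what Gazelle does. The function name is_test_file is hypothetical.

```python
from fnmatch import fnmatchcase

DEFAULT_PATTERNS = "*_test.py,test_*.py"


def is_test_file(filename: str, directive_value: str = DEFAULT_PATTERNS) -> bool:
    """Return True if the file name matches any of the comma-separated globs."""
    if not directive_value.strip():
        # Mirrors the rule that the directive must have a value.
        raise ValueError("python_test_file_pattern requires a value")
    return any(fnmatchcase(filename, pattern) for pattern in directive_value.split(","))
```

For example, is_test_file("foo_test.py") is true under the default patterns, while is_test_file("mylib.py") is not.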
There currently is no way to tell gazelle that no files in a package should be mapped to py_test targets (see Issue #1826). The workaround is to set this directive to a pattern that will never match a .py file, such as foo.bar:
# No files in this package should be mapped to py_test targets.
# gazelle:python_test_file_pattern foo.bar
py_library(
    name = "my_test",
    srcs = ["my_test.py"],
)
Directive: python_generation_mode_per_package_require_test_entry_point
When # gazelle:python_generation_mode package is set, this directive controls whether a file called __test__.py or a target called __test__ (the entry point) is required to generate one test target per package. If this is set to true but no entry point is found, Gazelle will fall back to file mode and generate one test target per file. Setting this directive to false forces Gazelle to generate one test target per package even without an entry point. However, this means the main attribute of the py_test will not be set and the target will not be runnable unless either:
- there happens to be a file in the srcs with the same name as the py_test target, or
- a macro populating the main attribute of py_test is configured with gazelle:map_kind to replace py_test when Gazelle is generating Python test targets. For example, a user can provide such a macro to Gazelle:
load("@rules_python//python:defs.bzl", _py_test="py_test")
load("@aspect_rules_py//py:defs.bzl", "py_pytest_main")
def py_test(name, main=None, **kwargs):
    deps = kwargs.pop("deps", [])
    if not main:
        py_pytest_main(
            name = "__test__",
            deps = ["@pip_pytest//:pkg"], # change this to the pytest target in your repo.
        )
        deps.append(":__test__")
        main = ":__test__.py"
    _py_test(
        name = name,
        main = main,
        deps = deps,
        **kwargs,
    )
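The decision described in this section, between one test target per package and one per file, can be summarized as a small Python sketch. It only restates the rules given above; the function name plan_test_targets and its return strings are hypothetical.

```python
def plan_test_targets(require_entry_point: bool, has_entry_point: bool) -> str:
    """Describe how test targets are laid out in package generation mode."""
    if require_entry_point and not has_entry_point:
        # No __test__.py file or __test__ target: fall back to file mode.
        return "one target per file"
    # Either an entry point exists, or the directive is false and forces package mode.
    return "one target per package"
```

Note that with the directive set to false and no entry point, the single per-package py_test still needs a main, as explained above.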
Annotations
Annotations refer to comments found within Python files that configure how Gazelle acts for that particular file.
Annotations have the form:
# gazelle:annotation_name value
and can reside anywhere within a Python file where comments are valid. For example:
import foo
# gazelle:annotation_name value
def bar(): # gazelle:annotation_name value
    pass
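Because annotations are ordinary comments, they can be located by scanning a file's comment tokens. The sketch below uses Python's tokenize module to do that; it illustrates the annotation form only and is not how the plugin itself parses files. The regular expression and the function name find_annotations are assumptions.

```python
import io
import re
import tokenize
from typing import List, Tuple

# Matches comments of the form "# gazelle:annotation_name value".
ANNOTATION_RE = re.compile(r"#\s*gazelle:(\w+)\s+(.*)")


def find_annotations(source: str) -> List[Tuple[str, str]]:
    """Return (annotation_name, value) pairs found in the comments of a Python source."""
    found = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type == tokenize.COMMENT:
            match = ANNOTATION_RE.match(tok.string)
            if match:
                found.append((match.group(1), match.group(2).strip()))
    return found
```

Scanning tokens rather than raw lines means text inside string literals that merely looks like an annotation is not picked up.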
The annotations are:
Annotation | Default value | Description |
---|---|---|
# gazelle:ignore imports | N/A | Tells Gazelle to ignore import statements. imports is a comma-separated list of imports to ignore. |
# gazelle:include_dep targets | N/A | Tells Gazelle to include a set of dependencies, even if they are not imported in a Python module. targets is a comma-separated list of target names to include as dependencies. |
Annotation: ignore
This annotation accepts a comma-separated string of values. Values are names of Python imports that Gazelle should not include in target dependencies.
The annotation can be added multiple times, and all values are combined and de-duplicated.
For python_generation_mode = "package", the ignore annotations found across all files included in the generated target are removed from deps.
Example:
import numpy # a pypi package
# gazelle:ignore bar.baz.hello,foo
import bar.baz.hello
import foo
# Ignore this import because _reasons_
import baz # gazelle:ignore baz
will cause Gazelle to generate:
deps = ["@pypi//numpy"],
Annotation: include_dep
This annotation accepts a comma-separated string of values. Values must be Python targets, but no validation is done. If a value is not a Python target, building will result in an error saying:
<target> does not have mandatory providers: 'PyInfo' or 'CcInfo' or 'PyInfo'.
Adding non-Python targets to the generated target is a feature request being tracked in Issue #1865.
The annotation can be added multiple times, and all values are combined and de-duplicated.
For python_generation_mode = "package", the include_dep annotations found across all files included in the generated target are included in deps.
Example:
# gazelle:include_dep //foo:bar,:hello_world,//:abc
# gazelle:include_dep //:def,//foo:bar
import numpy # a pypi package
will cause Gazelle to generate:
deps = [
    ":hello_world",
    "//:abc",
    "//:def",
    "//foo:bar",
    "@pypi//numpy",
]
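The combining and de-duplication shown in this example can be sketched in Python as follows. The ordering key (package-local labels, then labels starting with //, then external @ labels) is an assumption chosen to reproduce the order shown above; the function names are hypothetical.

```python
from typing import Iterable, List


def _label_rank(label: str) -> int:
    """Order local labels first, then // labels, then external (@repo) labels."""
    if label.startswith(":"):
        return 0
    if label.startswith("//"):
        return 1
    return 2


def merge_deps(annotation_values: Iterable[str], resolved_deps: Iterable[str]) -> List[str]:
    """Combine include_dep values with resolved deps, de-duplicate, and sort."""
    labels = set(resolved_deps)
    for value in annotation_values:
        labels.update(part for part in value.split(",") if part)
    return sorted(labels, key=lambda label: (_label_rank(label), label))
```

Calling merge_deps with the two annotation values from the example and the resolved "@pypi//numpy" dependency gives the deps list shown above.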
Libraries
Python source files are those ending in .py but not ending in _test.py.
First, we look for the nearest ancestor BUILD file starting from the folder containing the Python source file.
In package generation mode, if there is no py_library in this BUILD file, one is created using the package name as the target's name. This makes it the default target in the package. Next, all source files are collected into the srcs of the py_library.
In project generation mode, all source files in subdirectories (that don't have BUILD files) are also collected.
In file generation mode, each file is given its own target.
Finally, the import statements in the source files are parsed, and dependencies are added to the deps attribute.
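The final step, parsing import statements, can be illustrated with Python's ast module. This is a sketch of the idea rather than the plugin's parser; skipping relative imports is a simplification assumed here, and the function name collect_imports is hypothetical.

```python
import ast
from typing import List


def collect_imports(source: str) -> List[str]:
    """Return the sorted, de-duplicated module names imported by a Python source."""
    modules = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            modules.update(alias.name for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            # Absolute "from x import y" only; relative imports are skipped in this sketch.
            modules.add(node.module)
    return sorted(modules)
```

Each collected module name would then be resolved to a label and added to deps.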
Unit Tests
A py_test target is added to the BUILD file when gazelle encounters a file named __test__.py.
Often, Python unit test files are named with the suffix _test. For example, if we had a folder that is a package named "foo" we could have a Python file named foo_test.py and gazelle would create a py_test block for the file.
The following is an example of a py_test target that gazelle would add when it encounters a file named __test__.py.
py_test(
    name = "build_file_generation_test",
    srcs = ["__test__.py"],
    main = "__test__.py",
    deps = [":build_file_generation"],
)
You can control the naming convention for test targets by adding a gazelle directive named # gazelle:python_test_naming_convention. See the instructions in the section above that covers directives.
Binaries
When a __main__.py file is encountered, this indicates the entry point of a Python program. A py_binary target will be created, named [package]_bin.
When no such entry point exists, Gazelle will look for a line like this in the top level in every module:
if __name__ == "__main__":
Gazelle will create a py_binary target for every module with such a line, with the target name the same as the module name.
If python_generation_mode is set to file, then instead of one py_binary target per module, Gazelle will create one py_binary target for each file with such a line, and the name of the target will match the name of the script.
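One way to picture the check for a top-level main guard is the Python sketch below, which inspects only the top-level statements of a module. It illustrates the rule as described here and is not the plugin's detection code; the function name has_main_guard is hypothetical.

```python
import ast


def has_main_guard(source: str) -> bool:
    """Return True if the module has a top-level `if __name__ == "__main__":` line."""
    for node in ast.parse(source).body:  # top-level statements only
        if not isinstance(node, ast.If):
            continue
        test = node.test
        if (
            isinstance(test, ast.Compare)
            and isinstance(test.left, ast.Name)
            and test.left.id == "__name__"
            and len(test.ops) == 1
            and isinstance(test.ops[0], ast.Eq)
            and len(test.comparators) == 1
            and isinstance(test.comparators[0], ast.Constant)
            and test.comparators[0].value == "__main__"
        ):
            return True
    return False
```

A guard nested inside a function does not count in this sketch, since only the module's top level is examined.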
Note that it's possible for another script to depend on a py_binary target and import from the py_binary's scripts. This can have possible negative effects on Bazel analysis time and runfiles size compared to depending on a py_library target. The simplest way to avoid these negative effects is to extract library code into a separate script without a main line. Gazelle will then create a py_library target for that library code, and other scripts can depend on that py_library target.
Developer Notes
Gazelle extensions are written in Go. This gazelle plugin is a hybrid, as it uses Go to execute a Python interpreter as a subprocess to parse Python source files. See the gazelle documentation https://github.com/bazelbuild/bazel-gazelle/blob/master/extend.md for more information on extending Gazelle.
If you add new Go dependencies to the plugin source code, you need to "tidy" the go.mod file. After changing that file, run go mod tidy or bazel run @go_sdk//:bin/go -- mod tidy to update the go.mod and go.sum files. Then run bazel run //:gazelle_update_repos to have gazelle add the new dependencies to the deps.bzl file. The deps.bzl file is used as defined in our /WORKSPACE to include the external repos Bazel loads Go dependencies from.
Then after editing Go code, run bazel run //:gazelle to generate/update the rules in the BUILD.bazel files in our repo.
Directories
Path | Synopsis |
---|---|
generate | generate.go is a program that generates the Gazelle YAML manifest. |
test | test.go is a unit test that asserts the Gazelle YAML manifest is up-to-date with regard to the requirements.txt. |