[⬇️ Download]
[📖 Quick start]
[❓ FAQs & Troubleshooting]
repro-get: reproducible apt, dnf, apk, and pacman, with content-addressing
✅ HTTP and HTTPS
✅ Filesystems
✅ OCI (Open Container Initiative) registries
✅ IPFS
repro-get installs a specific snapshot of packages using SHA256SUMS, for the sake of reproducible builds:
$ cat SHA256SUMS-amd64
35b1508eeee9c1dfba798c4c04304ef0f266990f936a51f165571edf53325cbc pool/main/h/hello/hello_2.10-2_amd64.deb
$ repro-get install SHA256SUMS-amd64
(001/001) hello_2.10-2_amd64.deb Downloading from http://debian.notset.fr/snapshot/by-hash/SHA256/35b1508eeee9c1dfba798c4c04304ef0f266990f936a51f165571edf53325cbc
...
Preparing to unpack .../35b1508eeee9c1dfba798c4c04304ef0f266990f936a51f165571edf53325cbc ...
Unpacking hello (2.10-2) ...
Setting up hello (2.10-2) ...
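The check underneath this workflow is an ordinary SHA256 comparison. The following is a minimal sketch using coreutils `sha256sum`, not repro-get's own implementation; the temporary directory, `demo.deb`, and `SHA256SUMS-demo` are placeholders that stand in for a downloaded package and its hash file.

```shell
# Placeholder workspace and file; this is not a real Debian package.
workdir=$(mktemp -d)
printf 'example package payload\n' > "$workdir/demo.deb"

# Record the expected checksum in the "<SHA256>  <path>" layout that sha256sum emits.
(cd "$workdir" && sha256sum demo.deb > SHA256SUMS-demo)

# Verification succeeds while the file is unchanged...
(cd "$workdir" && sha256sum -c SHA256SUMS-demo >/dev/null 2>&1) && before=ok || before=mismatch

# ...and fails once the content differs.
printf 'tampered\n' >> "$workdir/demo.deb"
(cd "$workdir" && sha256sum -c SHA256SUMS-demo >/dev/null 2>&1) && after=ok || after=mismatch

echo "before: $before, after: $after"
rm -rf "$workdir"
```

Because the checksum alone decides whether a file is accepted, it does not matter which server or disk the bytes came from.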
repro-get supports the following distros:
Distro | "Batteries included" | Support generating Dockerfiles | Support verifying package signatures |
debian | ✅ | ✅ | ❌ |
ubuntu | ✅ | ❌ | ❌ |
fedora (Experimental) | ✅ | ❌ | ✅ |
alpine (Experimental) | ❌ | ❌ | ✅ |
arch | ✅ | ✅ | ✅ |
"Batteries included" for Debian, Ubuntu, Fedora, and Arch Linux.
On Debian, the packages are fetched from the following URLs by default:
- http://deb.debian.org/debian/{{.Name}} for recent packages (fast, but ephemeral)
- http://snapshot-cloudflare.debian.org/archive/debian/{{timeToDebianSnapshot .Epoch}}/{{.Name}} for archived packages (slow, but persistent)
On Ubuntu: http://launchpad.net/ubuntu/+archive/primary/+files/{{.Basename}}
On Fedora: https://kojipkgs.fedoraproject.org/packages/{{.Name}}
On Arch Linux: https://archive.archlinux.org/packages/{{.Name}}
On other distros, the file provider has to be manually specified in the --provider=... flag for long-term persistence.
The following file providers are supported:
- HTTP/HTTPS URLs, such as http://debian.notset.fr/snapshot/by-hash/SHA256/{{.SHA256}}
- Filesystems, such as file:///mnt/nfs/files/{{.Basename}}, or file:///mnt/nfs/blobs/{{.SHA256}}
- OCI-compliant container registries, such as oci://ghcr.io/USERNAME/REPO
- IPFS gateways, such as http://ipfs.io/ipfs/{{.CID}}
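To illustrate how the placeholders in these provider strings relate to a hash file entry, here is a rough sketch that substitutes them with `sed`. It is not how repro-get expands its templates. The mapping of `{{.Name}}` to the full path and `{{.Basename}}` to the final path component is inferred from the examples above, and the `expand` helper is hypothetical.

```shell
# One hash file entry: "<SHA256>  <path>".
line='35b1508eeee9c1dfba798c4c04304ef0f266990f936a51f165571edf53325cbc  pool/main/h/hello/hello_2.10-2_amd64.deb'
sha256=${line%% *}   # text before the first space
name=${line##* }     # text after the last space
base=${name##*/}     # final path component

# Hypothetical helper: replaces only the three placeholders handled in this sketch.
expand() {
  printf '%s\n' "$1" | sed \
    -e "s|{{\.SHA256}}|$sha256|g" \
    -e "s|{{\.Name}}|$name|g" \
    -e "s|{{\.Basename}}|$base|g"
}

expand 'http://debian.notset.fr/snapshot/by-hash/SHA256/{{.SHA256}}'
expand 'http://deb.debian.org/debian/{{.Name}}'
expand 'file:///mnt/nfs/files/{{.Basename}}'
```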
Quick start
Set up
Download the latest binary release from https://github.com/reproducible-containers/repro-get/releases .
To install repro-get from source, install Go, then run make and sudo make install.
The recommended version of Go is written in the go.mod file.
The binary release can be reproduced locally by checking out the related tag and running make artifacts.docker.
Installing packages with the hash file
Create the SHA256SUMS-amd64 file for the hello package, using the information from apt-cache show hello:
35b1508eeee9c1dfba798c4c04304ef0f266990f936a51f165571edf53325cbc pool/main/h/hello/hello_2.10-2_amd64.deb
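The two fields needed for that line can also be extracted mechanically. The sketch below assumes the usual `Filename:` and `SHA256:` fields of an `apt-cache show` stanza; the `to_sha256sums` helper is hypothetical, and the sample stanza is trimmed so that the snippet runs without apt. The two-space separator follows `sha256sum` output and is an assumption about the exact whitespace of the hash file.

```shell
# Sample stanza standing in for `apt-cache show hello` output (trimmed to the relevant fields).
stanza='Package: hello
Version: 2.10-2
Architecture: amd64
Filename: pool/main/h/hello/hello_2.10-2_amd64.deb
SHA256: 35b1508eeee9c1dfba798c4c04304ef0f266990f936a51f165571edf53325cbc'

# Hypothetical helper: print "<SHA256>  <Filename>" once both fields of a stanza were seen.
to_sha256sums() {
  awk '
    $1 == "Filename:" { file = $2 }
    $1 == "SHA256:"   { sum = $2 }
    file != "" && sum != "" { print sum "  " file; file = ""; sum = "" }
  '
}

printf '%s\n' "$stanza" | to_sha256sums
# On a Debian system, roughly: apt-cache show hello | to_sha256sums > SHA256SUMS-amd64
```

If `apt-cache show` lists several versions of the package, this prints one line per stanza, so the result may need trimming by hand.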
Then run repro-get install SHA256SUMS-amd64:
$ repro-get install SHA256SUMS-amd64
(001/001) hello_2.10-2_amd64.deb Downloading from http://debian.notset.fr/snapshot/by-hash/SHA256/35b1508eeee9c1dfba798c4c04304ef0f266990f936a51f165571edf53325cbc
...
Preparing to unpack .../35b1508eeee9c1dfba798c4c04304ef0f266990f936a51f165571edf53325cbc ...
Unpacking hello (2.10-2) ...
Setting up hello (2.10-2) ...
See also Dockerfile for running repro-get inside containers.
Generating the hash file
Note
Make sure to run apt-get update before running repro-get hash generate.
See also Dockerfile for how to run apt-get update in a container image such as debian:bullseye-yyyyMMdd.
To generate the hash for all the installed packages, including the system packages:
repro-get hash generate >SHA256SUMS-amd64
To generate the hash for specific packages:
repro-get hash generate hello >SHA256SUMS-amd64
To generate the hash for newly installed packages:
repro-get hash generate >SHA256SUMS-amd64.old
apt-get install -y hello
repro-get hash generate --dedupe=SHA256SUMS-amd64.old >SHA256SUMS-amd64
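Conceptually, the last command yields the entries that are new relative to the old file. The sketch below shows that set difference with `grep`, under the assumption that `--dedupe` omits lines already present in the given file; it does not reproduce repro-get's actual logic, and the checksums are shortened placeholders.

```shell
workdir=$(mktemp -d)

# Placeholder entries; real hash files carry 64 hex digits per checksum.
printf '%s\n' 'aaaa  pool/main/b/base/base_1.0_amd64.deb' > "$workdir/old"
printf '%s\n' 'aaaa  pool/main/b/base/base_1.0_amd64.deb' \
              'bbbb  pool/main/h/hello/hello_2.10-2_amd64.deb' > "$workdir/all"

# Keep only the lines of "all" that do not appear verbatim in "old".
newly_installed=$(grep -vxFf "$workdir/old" "$workdir/all")
echo "$newly_installed"

rm -rf "$workdir"
```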
Updating the hash file
Note
Make sure to run apt-get update before running repro-get hash update.
To update the hash file:
repro-get hash update SHA256SUMS-amd64
Advanced usage
Dockerfile
Warning
repro-get dockerfile generate is an experimental feature.
The following example produces an image with gcc, using the packages from 2021-12-20.
# Generate "Dockerfile.generate-hash" and "Dockerfile" in the current directory
repro-get --distro=debian dockerfile generate . debian:bullseye-20211220 gcc build-essential
# Enable BuildKit
export DOCKER_BUILDKIT=1
# Generate "SHA256SUMS-amd64" file in the current directory (needed by the next step)
docker build --output . -f Dockerfile.generate-hash .
# Build the image
docker build .
See ./examples/gcc for an example output.
See also FAQs for "bit-to-bit" reproducibility of container images.
Cache management
The cache directory (--cache) defaults to /var/cache/repro-get.
Populate
To populate the package files into the cache without installing them:
repro-get download SHA256SUMS-amd64
Export
To export the cached package files to the current directory:
repro-get cache export .
Import
To import package files in the current directory into the cache:
repro-get cache import .
Clean
To clean the cache:
repro-get cache clean
Container registries
repro-get supports downloading package files from OCI-compliant container registries.
Note
Make sure a container registry credential exists in ~/.docker/config.json.
Push
To push the package files into a container registry such as https://ghcr.io/ , use ORAS:
repro-get cache export .
oras push ghcr.io/USERNAME/dpkgs:latest *.deb
Pull
To pull and install packages from the registry:
repro-get --provider=oci://ghcr.io/USERNAME/dpkgs install SHA256SUMS-amd64
Tips about the oci://... provider strings:
- The provider string does not need to contain the :<TAG>@<DIGEST> value, as repro-get ignores the container manifests.
- Defaults to HTTPS for non-localhost registries. Use the oci+http://... scheme to disable HTTPS.
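Since manifests are ignored, a package can presumably be located by its digest alone. The sketch below builds the blob URL that the OCI Distribution Specification defines (`/v2/<name>/blobs/<digest>`). Whether repro-get requests exactly this URL is an assumption, and the `oci_blob_url` helper is hypothetical. The sketch also does not model the localhost exception or authentication.

```shell
# Hypothetical helper: map an oci:// or oci+http:// provider string and a SHA256 to a blob URL.
oci_blob_url() {
  provider=$1
  sha256=$2
  case $provider in
    oci+http://*) scheme=http;  ref=${provider#oci+http://} ;;
    oci://*)      scheme=https; ref=${provider#oci://} ;;   # localhost exception not modeled
    *) echo "unsupported provider: $provider" >&2; return 1 ;;
  esac
  host=${ref%%/*}   # registry host, e.g. ghcr.io
  repo=${ref#*/}    # repository path, e.g. USERNAME/dpkgs
  printf '%s://%s/v2/%s/blobs/sha256:%s\n' "$scheme" "$host" "$repo" "$sha256"
}

oci_blob_url oci://ghcr.io/USERNAME/dpkgs \
  35b1508eeee9c1dfba798c4c04304ef0f266990f936a51f165571edf53325cbc
```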
IPFS
repro-get also supports uploading package files to IPFS, and downloading them from IPFS via an IPFS gateway such as http://ipfs.io/ipfs/{{.CID}}.
Note
The ipfs command (Kubo) needs to be installed for pushing (not for pulling).
Push
Run repro-get ipfs push to push the package files, and update the hash file to include the IPFS CIDs:
$ cat SHA256SUMS-amd64
35b1508eeee9c1dfba798c4c04304ef0f266990f936a51f165571edf53325cbc pool/main/h/hello/hello_2.10-2_amd64.deb
$ repro-get ipfs push SHA256SUMS-amd64
35b1508eeee9c1dfba798c4c04304ef0f266990f936a51f165571edf53325cbc /ipfs/QmRY19HEWeTJtRC6vAdz7rDfX3PjSMgXmd1KYi9guAACUj
$ cat SHA256SUMS-amd64
35b1508eeee9c1dfba798c4c04304ef0f266990f936a51f165571edf53325cbc pool/main/h/hello/hello_2.10-2_amd64.deb
35b1508eeee9c1dfba798c4c04304ef0f266990f936a51f165571edf53325cbc /ipfs/QmRY19HEWeTJtRC6vAdz7rDfX3PjSMgXmd1KYi9guAACUj
Pull
To pull and install packages from IPFS:
repro-get --provider=http://ipfs.io/ipfs/{{.CID}} install SHA256SUMS-amd64
The hash file must contain the ... /ipfs/... lines.
The hash file may contain multiple CIDs for a single SHA256, but only a single CID is used for pulling.
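As an illustration of how a gateway URL could be derived from such a hash file, the sketch below looks up a CID for a given SHA256 with `awk`. Taking the first matching line is an arbitrary choice for this example, since the text above does not say which CID is used, and treating `{{.CID}}` as the part after `/ipfs/` is an assumption. The `cid_for` helper is hypothetical.

```shell
hashfile=$(mktemp)
cat > "$hashfile" <<'EOF'
35b1508eeee9c1dfba798c4c04304ef0f266990f936a51f165571edf53325cbc  pool/main/h/hello/hello_2.10-2_amd64.deb
35b1508eeee9c1dfba798c4c04304ef0f266990f936a51f165571edf53325cbc  /ipfs/QmRY19HEWeTJtRC6vAdz7rDfX3PjSMgXmd1KYi9guAACUj
EOF

# Hypothetical helper: print the CID of the first "/ipfs/..." line recorded for a SHA256.
cid_for() {
  awk -v sum="$1" '$1 == sum && index($2, "/ipfs/") == 1 { print substr($2, 7); exit }' "$hashfile"
}

sha256=35b1508eeee9c1dfba798c4c04304ef0f266990f936a51f165571edf53325cbc
cid=$(cid_for "$sha256")
echo "http://ipfs.io/ipfs/$cid"

rm -f "$hashfile"
```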
FAQs
Why do we need reproducibility?
For supply chain security.
If a binary can be reproduced bit-to-bit by multiple independent people, the binary (and its distributor) can be considered more trustworthy than others.
Achieving bit-to-bit reproducibility is still challenging (see below), but even "quasi-"reproducibility is useful for avoiding regressions that could be introduced by installing unexpected updates.
See also https://reproducible-builds.org/docs/buy-in/ .
Why not just use snapshot.debian.org with apt-get?
Although it is already possible to reproduce a specific snapshot of Debian by specifying deb [...] http://snapshot.debian.org/archive/debian/yyyyMMddTHHmmssZ/ ... ... in /etc/apt/sources.list, this would cause huge traffic on snapshot.debian.org once everybody begins to make builds reproducible.
repro-get mitigates this issue by content-addressing: a package file can be fetched from anywhere, such as HTTP(S) sites, local filesystems, OCI registries, or even IPFS, by its SHA256 (or CID) checksum.
Also, as the package files are verified by checksums, existing package files are not affected by potential GPG key leakage.
Are container images "bit-to-bit" reproducible?
Yes, with BuildKit v0.11 or later.
See ./hack/test-dockerfile-repro.sh for testing reproducibility.
However, it should be noted that the reproducibility is not guaranteed across different versions of BuildKit.
The host operating system version, filesystem configuration, etc. may affect reproducibility too.
How to use HTTPS on Debian/Ubuntu?
repro-get --provider='https://deb.debian.org/debian/{{.Name}},https://debian.notset.fr/snapshot/by-hash/SHA256/{{.SHA256}}' install
Using HTTPS requires the ca-certificates package to be installed.
The ca-certificates package is not installed by default in the debian and ubuntu images on Docker Hub.
Why not use HTTPS by default on Debian/Ubuntu?
Because apt-get does not use HTTPS by default, either.
See an archive of whydoesaptnotusehttps.com for the reason.
Acknowledgement
A huge thanks to Frédéric Pierret (@fepitre) for maintaining the snapshot server http://snapshot.notset.fr/ .
Also huge thanks to maintainers of http://snapshot.debian.org/ , https://kojipkgs.fedoraproject.org/ , and other package snapshot servers.
repro-get could not be implemented without these snapshot servers.