Krawler: a kernel releases crawler 


A crawler for kernel releases distributed by the major Linux distributions.
It supports Amazon Linux v1, Amazon Linux v2, Amazon Linux 2022, Amazon Linux 2023, CentOS, Debian, Ubuntu, Fedora, Oracle Linux, openSUSE, and Arch Linux.
The crawling data is continuously published and is available at db.krawler.dev.
Usage
krawler [options] <command>
Options
-c, --config file
: (optional) the config file to customize the list of mirrors to scrape for kernel releases (by default it looks at $HOME/.krawler.yaml).
-v, --verbosity level
: (optional) the verbosity level (debug, info, warn, error, fatal, panic). The default is warning.
Commands
list|ls
List available kernel releases with distributed headers, by Linux distribution.
It returns a list of kernelRelease objects. The output format can be selected with the -o flag.
krawler [options] list|ls <distribution> [-o <format>]
Parameters
distribution
: (required) The Linux distribution for which the release has been published.
Available distributions:
- amazonlinux
- amazonlinux2
- amazonlinux2022
- amazonlinux2023
- centos
- debian
- ubuntu
- fedora
- oracle
- opensuse
- archlinux
Options
-o, --output format
: (optional) the output format for the list of kernel releases (one of text, json or yaml). The default is yaml.
Output
The list|ls command prints to standard output a list of kernel release objects of type KernelRelease.
An example of a json result entry:
{
  "full_version": "4.18.0",
  "version": 4,
  "patch_level": 18,
  "sublevel": 0,
  "extra_version": "331",
  "full_extra_version": "-331.el8.aarch64",
  "architecture": "aarch64",
  "package_name": "kernel-devel",
  "package_url": "https://mirrors.edge.kernel.org/centos/8-stream/BaseOS/aarch64/os/Packages/kernel-devel-4.18.0-331.el8.aarch64.rpm",
  "compiler_version": "80500"
}
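To consume such an entry from a script, a minimal Python sketch could look like the following. The KernelReleaseEntry class and the release_string helper are hypothetical names introduced here for illustration; they only mirror the fields of the example above and are not part of Krawler. The sketch also assumes compiler_version may be absent from some entries, which the source does not confirm.

```python
import json
from dataclasses import dataclass


@dataclass
class KernelReleaseEntry:
    """Hypothetical container mirroring the fields of the example JSON entry."""
    full_version: str
    version: int
    patch_level: int
    sublevel: int
    extra_version: str
    full_extra_version: str
    architecture: str
    package_name: str
    package_url: str
    # Assumption: not every entry carries a compiler version.
    compiler_version: str = ""

    @property
    def release_string(self) -> str:
        # Concatenate the base version and the full extra version.
        return self.full_version + self.full_extra_version


def parse_entry(raw: str) -> KernelReleaseEntry:
    # Raises TypeError if the entry contains keys not declared above.
    return KernelReleaseEntry(**json.loads(raw))


SAMPLE = """
{
  "full_version": "4.18.0",
  "version": 4,
  "patch_level": 18,
  "sublevel": 0,
  "extra_version": "331",
  "full_extra_version": "-331.el8.aarch64",
  "architecture": "aarch64",
  "package_name": "kernel-devel",
  "package_url": "https://mirrors.edge.kernel.org/centos/8-stream/BaseOS/aarch64/os/Packages/kernel-devel-4.18.0-331.el8.aarch64.rpm",
  "compiler_version": "80500"
}
"""

entry = parse_entry(SAMPLE)
print(entry.release_string)  # 4.18.0-331.el8.aarch64
```

For this entry, the reconstructed string matches the version part of the package file name in package_url.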
Getting started
Let's imagine you want to list the available CentOS kernel releases, scraping the default mirrors. You do it by running:
krawler ls centos
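The same command can be driven from a script. The Python sketch below assembles the command line from the options and parameters documented above and, optionally, runs it. The function names build_list_command and list_releases are hypothetical, and the sketch assumes that krawler is on the PATH and that the json output is a single JSON document; neither is confirmed here.

```python
import json
import subprocess
from typing import List, Optional

# Values taken from the "Available distributions" and "Options" lists.
DISTRIBUTIONS = (
    "amazonlinux", "amazonlinux2", "amazonlinux2022", "amazonlinux2023",
    "centos", "debian", "ubuntu", "fedora", "oracle", "opensuse", "archlinux",
)
OUTPUT_FORMATS = ("text", "json", "yaml")


def build_list_command(
    distribution: str,
    output: str = "json",
    config: Optional[str] = None,
    verbosity: Optional[str] = None,
) -> List[str]:
    """Build the argument list for: krawler [options] ls <distribution> -o <format>."""
    if distribution not in DISTRIBUTIONS:
        raise ValueError(f"unknown distribution: {distribution}")
    if output not in OUTPUT_FORMATS:
        raise ValueError(f"unknown output format: {output}")
    cmd = ["krawler"]
    if config:
        cmd += ["--config", config]
    if verbosity:
        cmd += ["--verbosity", verbosity]
    cmd += ["ls", distribution, "--output", output]
    return cmd


def list_releases(distribution: str, config: Optional[str] = None):
    """Run krawler and parse its output (assumes one JSON document on stdout)."""
    cmd = build_list_command(distribution, output="json", config=config)
    result = subprocess.run(cmd, check=True, capture_output=True, text=True)
    return json.loads(result.stdout)
```

Calling list_releases("centos") would then correspond to running krawler ls centos --output json and parsing the result.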
Configuration
A configuration file lets you set crawling parameters, such as the mirrors to scrape.
The default configuration file path is $HOME/.krawler.yaml. You can specify a custom path with the --config option.
When no configuration file is present, default repository configurations are used (each distribution, such as CentOS, has its own default).
For a detailed overview see the reference.
Moreover, sample configurations are available here.
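The path lookup described above can be sketched in a few lines of Python. This is only an illustration of the documented rule (a custom --config path takes precedence over $HOME/.krawler.yaml), not Krawler's actual implementation; the function names are hypothetical, and how Krawler reacts to a custom path that does not exist is not stated here.

```python
from pathlib import Path
from typing import Optional

DEFAULT_CONFIG_NAME = ".krawler.yaml"


def resolve_config_path(custom: Optional[str] = None) -> Path:
    """Return the custom path if given, else $HOME/.krawler.yaml."""
    if custom:
        return Path(custom).expanduser()
    return Path.home() / DEFAULT_CONFIG_NAME


def config_file_present(custom: Optional[str] = None) -> bool:
    """True if a configuration file exists at the resolved path."""
    return resolve_config_path(custom).is_file()
```

When config_file_present() returns False for the default path, the built-in repository defaults described above would apply.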
Roadmap
- Provide GCC versions for all releases
- Support new distributions