v1.0.0, published Oct 13, 2021. License: Apache-2.0, BSD-3-Clause, MIT

Repository

github.com/tencent/caelus

README

Caelus

Caelus is a set of Kubernetes solutions for reusing idle node resources by running extra batch jobs. These resources come from the underutilization of online jobs, especially during low-traffic periods. To let batch jobs coexist with online jobs, Caelus dynamically manages multiple resource isolation mechanisms and checks various metrics for abnormalities. Batch jobs are throttled or even killed if interference is detected.
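
To make the idea of "idle resources" concrete, here is a minimal sketch of how the capacity offered to batch jobs could be derived. It is an illustration only, not Caelus's actual code: the `Resources` type, the `Offerable` function, and the fixed safety reserve are all hypothetical names and assumptions.

```go
package main

import "fmt"

// Resources is a hypothetical pair of CPU (millicores) and memory (MiB).
// It is not a type from the Caelus code base.
type Resources struct {
	MilliCPU  int64
	MemoryMiB int64
}

// clampNonNegative returns v, or 0 when v is negative.
func clampNonNegative(v int64) int64 {
	if v < 0 {
		return 0
	}
	return v
}

// Offerable estimates what could be handed to batch jobs:
// node capacity minus predicted online usage minus a safety reserve.
// Each dimension is floored at zero so an overloaded node offers nothing.
func Offerable(capacity, predictedOnline, reserve Resources) Resources {
	return Resources{
		MilliCPU:  clampNonNegative(capacity.MilliCPU - predictedOnline.MilliCPU - reserve.MilliCPU),
		MemoryMiB: clampNonNegative(capacity.MemoryMiB - predictedOnline.MemoryMiB - reserve.MemoryMiB),
	}
}

func main() {
	capacity := Resources{MilliCPU: 32000, MemoryMiB: 65536}
	online := Resources{MilliCPU: 9000, MemoryMiB: 20480}
	reserve := Resources{MilliCPU: 4000, MemoryMiB: 8192}

	fmt.Printf("%+v\n", Offerable(capacity, online, reserve))
}
```

In this sketch the reserve acts as headroom for traffic spikes in online jobs; a larger reserve trades batch throughput for a lower risk of interference.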

Features

Collect various metrics, including node resources, cgroup resources, and online job latency
Run batch jobs on YARN or Kubernetes
Predict the total resource usage of the node, including online jobs and kernel usage such as slab
Dynamically manage multiple resource isolation mechanisms, such as CPU, memory, and disk space
Dynamically check various metrics for abnormalities, such as CPU usage or online job latency
Throttle or even kill batch jobs when resource pressure or a latency spike is detected
Support Prometheus metrics
Support alarms
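
The "throttle or even kill" behavior in the list above can be pictured as an escalation policy. The sketch below assumes a simple rule: count consecutive samples above a threshold, throttle after a few, and kill if the condition persists. The `Detector` type, its fields, and the thresholds are hypothetical and do not reflect Caelus's real detection rules or configuration.

```go
package main

import "fmt"

// Action is the hypothetical response to an abnormal metric.
type Action int

const (
	ActionNone Action = iota
	ActionThrottle
	ActionKill
)

// String makes actions readable when printed.
func (a Action) String() string {
	switch a {
	case ActionThrottle:
		return "throttle"
	case ActionKill:
		return "kill"
	default:
		return "none"
	}
}

// Detector escalates based on how many consecutive samples exceed Threshold.
// All fields are illustrative assumptions, not Caelus configuration keys.
type Detector struct {
	Threshold     float64 // e.g. online job latency in milliseconds
	ThrottleAfter int     // consecutive abnormal samples before throttling
	KillAfter     int     // consecutive abnormal samples before killing
	consecutive   int
}

// Observe records one sample and returns the action to take for batch jobs.
// A normal sample resets the counter, so only sustained pressure escalates.
func (d *Detector) Observe(value float64) Action {
	if value <= d.Threshold {
		d.consecutive = 0
		return ActionNone
	}
	d.consecutive++
	switch {
	case d.consecutive >= d.KillAfter:
		return ActionKill
	case d.consecutive >= d.ThrottleAfter:
		return ActionThrottle
	default:
		return ActionNone
	}
}

func main() {
	d := &Detector{Threshold: 200, ThrottleAfter: 2, KillAfter: 4}
	for _, latency := range []float64{120, 250, 260, 270, 300, 150} {
		fmt.Printf("latency=%.0fms action=%s\n", latency, d.Observe(latency))
	}
}
```

Requiring several consecutive abnormal samples before acting avoids reacting to a single noisy measurement, at the cost of a slower response to real interference.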

Usage

See Tutorial.md for more usage details. The project also has two companion tools:

nm_operator

nm_operator executes YARN commands through a remote API.

metric_adapter

metric_adapter collects additional application metrics through adapter extensions.

Getting started

Build

# build the binary, which is written to _output/bin/
$ make build

# build the image
$ make image

# run unit tests
$ make test

Run

# run the binary directly
$ caelus --config=hack/config/caelus.json --hostname-override=xxx --v=2

# run as an image (DaemonSet); replace <node-name> with the target node
$ kubectl create -f hack/yaml/caelus.json
$ kubectl label node <node-name> colation=true
$ kubectl -n kube-system get daemonset

Contributing

For more information about reporting issues or submitting pull requests, see Contributing to Caelus.

License

Caelus is under the Apache License 2.0. See the License file for details.

Directories

Path	Synopsis
cmd
caelus
caelus/app
caelus/context
caelus_metric_adapter
nm-operator
nm-operator/app
pkg
cadvisor
caelus/alarm
caelus/checkpoint
caelus/cpi
caelus/detection
caelus/detection/mock
caelus/detection/ring
caelus/diskquota
caelus/diskquota/manager
caelus/diskquota/manager/projectquota
caelus/diskquota/volumes
caelus/healthcheck
caelus/healthcheck/action
caelus/healthcheck/cgroupnotify
caelus/healthcheck/conflict
caelus/healthcheck/conflict/mock
caelus/healthcheck/dispatcher
caelus/healthcheck/rulecheck
caelus/healthcheck/rulecheck/correlation
caelus/metrics
caelus/metrics/outer
caelus/metrics/outer/serverrequest
caelus/metrics/outer/textfile
caelus/mock
caelus/online
caelus/predict
caelus/qos
caelus/qos/manager
caelus/qos/manager/netio
caelus/qos/mock
caelus/resource
caelus/resource/k8s
caelus/resource/yarn
caelus/statestore
caelus/statestore/cgroup
caelus/statestore/common
caelus/statestore/common/customize
caelus/statestore/common/node
caelus/statestore/common/perf
caelus/statestore/common/perf/pmu
caelus/statestore/common/prometheus
caelus/statestore/common/rdt
caelus/statestore/common/rdt/rdt
caelus/statestore/mock
caelus/types
caelus/util
caelus/util/appclass
caelus/util/cgroup
caelus/util/machine
caelus/util/mountpoint
caelus/util/ports
caelus/util/runtime
caelus/util/runtime/docker
caelus/util/sets
metricadapter
nm-operator/hadoop
nm-operator/nmoperator
nm-operator/types
nm-operator/util
types
util
util/times
version
version/verflag
