:toc: macro
image:https://travis-ci.org/jaegertracing/jaeger-operator.svg?branch=master["Build Status", link="https://travis-ci.org/jaegertracing/jaeger-operator"]
image:https://goreportcard.com/badge/github.com/jaegertracing/jaeger-operator["Go Report Card", link="https://goreportcard.com/report/github.com/jaegertracing/jaeger-operator"]
image:https://codecov.io/gh/jaegertracing/jaeger-operator/branch/master/graph/badge.svg["Code Coverage", link="https://codecov.io/gh/jaegertracing/jaeger-operator"]
image:https://godoc.org/github.com/jaegertracing/jaeger-operator?status.svg["GoDoc", link="https://godoc.org/github.com/jaegertracing/jaeger-operator/pkg/apis/jaegertracing/v1#JaegerSpec"]
= Jaeger Operator for Kubernetes
toc::[]
IMPORTANT: The Jaeger Operator version is related to the version of the Jaeger components (Query, Collector, Agent) up to the minor portion. The patch version portion does *not* follow the ones from the Jaeger components. For instance, the Operator version 1.8.1 uses the Jaeger Docker images tagged with version 1.8 by default.
== Installing the operator
NOTE: The following instructions will deploy a version of the operator that is using the latest `master` version. If
you want to install a particular stable version of the operator, you will need to edit the `operator.yaml` and specify
the version as the tag in the container image - and then use the relevant `apiVersion` for the Jaeger operator.
|===
|Up to version |apiVersion |CRD yaml
|master
|jaegertracing.io/v1
|https://github.com/jaegertracing/jaeger-operator/blob/master/deploy/crds/jaegertracing_v1_jaeger_crd.yaml[jaegertracing_v1_jaeger_crd.yaml]
|1.10.0
|io.jaegertracing/v1alpha1
|https://github.com/jaegertracing/jaeger-operator/blob/master/deploy/crds/io_v1alpha1_jaeger_crd.yaml[io_v1alpha1_jaeger_crd.yaml]
|===
=== Kubernetes
NOTE: Make sure your `kubectl` command is properly configured to talk to a valid Kubernetes cluster. If you don't have one yet, check link:https://kubernetes.io/docs/tasks/tools/install-minikube/[`minikube`] out.
To install the operator, run:
[source,bash]
----
kubectl create namespace observability # <1>
kubectl create -f https://raw.githubusercontent.com/jaegertracing/jaeger-operator/master/deploy/crds/jaegertracing_v1_jaeger_crd.yaml # <2>
kubectl create -f https://raw.githubusercontent.com/jaegertracing/jaeger-operator/master/deploy/service_account.yaml
kubectl create -f https://raw.githubusercontent.com/jaegertracing/jaeger-operator/master/deploy/role.yaml
kubectl create -f https://raw.githubusercontent.com/jaegertracing/jaeger-operator/master/deploy/role_binding.yaml
kubectl create -f https://raw.githubusercontent.com/jaegertracing/jaeger-operator/master/deploy/operator.yaml
----
<1> This creates the namespace used by default in the deployment files.
<2> This installs the "Custom Resource Definition" for the `apiVersion: jaegertracing.io/v1`
IMPORTANT: when using a Jaeger Operator up to v1.10.0, install the CRD file `io_v1alpha1_jaeger_crd.yaml` in addition to `jaegertracing_v1_jaeger_crd.yaml`. This is because up to that version, the `apiVersion` in use was `io.jaegertracing/v1alpha1`.
If you want to install the Jaeger operator in a different namespace, you will need to edit the deployment
files to change `observability` to the required value.
At this point, there should be a `jaeger-operator` deployment available:
[source,bash]
----
$ kubectl get deployment jaeger-operator -n observability
NAME DESIRED CURRENT UP-TO-DATE AVAILABLE AGE
jaeger-operator 1 1 1 1 48s
----
The operator is now ready to create Jaeger instances!
=== OpenShift
The instructions from the previous section also work on OpenShift. Make sure to install the RBAC rules, the CRD and the operator as a privileged user, such as `system:admin`.
[source,bash]
----
oc login -u system:admin
oc new-project observability # <1>
oc create -f https://raw.githubusercontent.com/jaegertracing/jaeger-operator/master/deploy/crds/jaegertracing_v1_jaeger_crd.yaml # <2>
oc create -f https://raw.githubusercontent.com/jaegertracing/jaeger-operator/master/deploy/service_account.yaml
oc create -f https://raw.githubusercontent.com/jaegertracing/jaeger-operator/master/deploy/role.yaml
oc create -f https://raw.githubusercontent.com/jaegertracing/jaeger-operator/master/deploy/role_binding.yaml
oc create -f https://raw.githubusercontent.com/jaegertracing/jaeger-operator/master/deploy/operator.yaml
----
<1> This creates the namespace used by default in the deployment files.
<2> This installs the "Custom Resource Definition" for the `apiVersion: jaegertracing.io/v1`
IMPORTANT: when using a Jaeger Operator up to v1.10.0, install the CRD file `io_v1alpha1_jaeger_crd.yaml` in addition to `jaegertracing_v1_jaeger_crd.yaml`. This is because up to that version, the `apiVersion` in use was `io.jaegertracing/v1alpha1`.
If you want to install the Jaeger operator in a different namespace, you will need to edit the deployment
files to change `observability` to the required value.
Once the operator is installed, grant the role `jaeger-operator` to users who should be able to install individual Jaeger instances. The following example creates a role binding allowing the user `developer` to create Jaeger instances:
[source,bash]
----
oc create \
rolebinding developer-jaeger-operator \
--role=jaeger-operator \
--user=developer
----
After the role is granted, switch back to a non-privileged user.
== Creating a new Jaeger instance
Example custom resources, for different configurations of Jaeger, can be found https://github.com/jaegertracing/jaeger-operator/tree/master/deploy/examples[here].
The simplest possible way to install is by creating a YAML file like the following:
.simplest.yaml
[source,yaml]
----
apiVersion: jaegertracing.io/v1
kind: Jaeger
metadata:
name: simplest
----
The YAML file can then be used with `kubectl`:
[source,bash]
----
kubectl apply -f simplest.yaml
----
In a few seconds, a new in-memory all-in-one instance of Jaeger will be available, suitable for quick demos and development purposes. To check the instances that were created, list the `jaeger` objects:
[source,bash]
----
$ kubectl get jaeger
NAME CREATED AT
simplest 28s
----
To get the pod name, query for the pods belonging to the `simplest` Jaeger instance:
[source,bash]
----
$ kubectl get pods -l app.kubernetes.io/instance=simplest
NAME READY STATUS RESTARTS AGE
simplest-6499bb6cdd-kqx75 1/1 Running 0 2m
----
Similarly, the logs can be queried either from the pod directly using the pod name obtained from the previous example, or from all pods belonging to our instance:
[source,bash]
----
$ kubectl logs -l app.kubernetes.io/instance=simplest
...
{"level":"info","ts":1535385688.0951214,"caller":"healthcheck/handler.go:133","msg":"Health Check state change","status":"ready"}
----
NOTE: On OpenShift the container name must be specified
[source,bash]
----
$ kubectl logs -l app.kubernetes.io/instance=simplest -c jaeger
...
{"level":"info","ts":1535385688.0951214,"caller":"healthcheck/handler.go:133","msg":"Health Check state change","status":"ready"}
----
For reference, here's how a more complex all-in-one instance can be created:
.all-in-one.yaml
[source,yaml]
----
apiVersion: jaegertracing.io/v1
kind: Jaeger
metadata:
name: my-jaeger
spec:
strategy: allInOne # <1>
allInOne:
image: jaegertracing/all-in-one:latest # <2>
options: # <3>
log-level: debug # <4>
storage:
type: memory # <5>
options: # <6>
memory: # <7>
max-traces: 100000
ingress:
enabled: false # <8>
agent:
strategy: DaemonSet # <9>
annotations:
scheduler.alpha.kubernetes.io/critical-pod: "" # <10>
----
<1> The default strategy is `allInOne`. The only other possible values are `production` and `streaming`.
<2> The image to use, in a regular Docker syntax
<3> The (non-storage related) options to be passed verbatim to the underlying binary. Refer to the Jaeger documentation and/or to the `--help` option from the related binary for all the available options.
<4> The option is a simple `key: value` map. In this case, we want the option `--log-level=debug` to be passed to the binary.
<5> The storage type to be used. By default it will be `memory`, but can be any other supported storage type (e.g. elasticsearch, cassandra, kafka, etc).
<6> All storage related options should be placed here, rather than under the 'allInOne' or other component options.
<7> Some options are namespaced and we can alternatively break them into nested objects. We could have specified `memory.max-traces: 100000`.
<8> By default, an ingress object is created for the query service. It can be disabled by setting its `enabled` option to `false`. If deploying on OpenShift, this will be represented by a Route object.
<9> By default, the operator assumes that agents are deployed as sidecars within the target pods. Specifying the strategy as "DaemonSet" changes that and makes the operator deploy the agent as DaemonSet. Note that your tracer client will probably have to override the "JAEGER_AGENT_HOST" env var to use the node's IP.
<10> Define annotations to be applied to all deployments (not services). These can be overridden by annotations defined on the individual components.
== Updating a Jaeger instance (experimental)
A Jaeger instance can be updated by changing the `CustomResource`, either via `kubectl edit jaeger simplest`, where `simplest` is the Jaeger's instance name, or by applying the updated YAML file via `kubectl apply -f simplest.yaml`.
IMPORTANT: the name of the Jaeger instance cannot be updated, as it's part of the identifying information for the resource
Simpler changes such as changing the replica sizes can be applied without much concern, whereas changes to the strategy should be watched closely and might potentially cause an outage for individual components (collector/query/agent).
While changing the backing storage is supported, migration of the data is not.
== Strategies
As shown in the example above, the Jaeger instance is associated with a strategy. The strategy determines the architecture to be used for the Jaeger backend.
The available strategies are described in the following sections.
=== AllInOne (Default)
This strategy is intended for development, testing and demo purposes.
The main backend components, agent, collector and query service, are all packaged into a single executable which is configured (by default) to use in-memory storage.
=== Production
The `production` strategy is intended (as the name suggests) for production environments, where long term storage of trace data is important, as well as a more scalable and highly available architecture is required. Each of the backend components is therefore separately deployed.
The agent can be injected as a sidecar on the instrumented application or as a daemonset.
The query and collector services are configured with a supported storage type - currently cassandra or elasticsearch. Multiple instances of each of these components can be provisioned as required for performance and resilience purposes.
The main additional requirement is to provide the details of the storage type and options, e.g.
[source,yaml]
----
storage:
type: elasticsearch
options:
es:
server-urls: http://elasticsearch:9200
----
=== Streaming
The `streaming` strategy is designed to augment the `production` strategy by providing a streaming capability that effectively sits between the collector and the backend storage (e.g. cassandra or elasticsearch). This provides the benefit of reducing the pressure on the backend storage, under high load situations, and enables other trace post processing capabilities to tap into the real time span data directly from the streaming platform (kafka).
The only additional information required is to provide the details for accessing the Kafka platform, which is configured in a new `ingester` component:
[source,yaml]
----
apiVersion: jaegertracing.io/v1
kind: Jaeger
metadata:
name: simple-streaming
spec:
strategy: streaming
collector:
options:
kafka: # <1>
producer:
topic: jaeger-spans
brokers: my-cluster-kafka-brokers.kafka:9092
ingester:
options:
kafka: # <1>
consumer:
topic: jaeger-spans
brokers: my-cluster-kafka-brokers.kafka:9092
ingester:
deadlockInterval: 0 # <2>
storage:
type: elasticsearch
options:
es:
server-urls: http://elasticsearch:9200
----
<1> Identifies the kafka configuration used by the collector, to produce the messages, and the ingester to consume the messages
<2> The deadlock interval can be disabled to avoid the ingester being terminated when no messages arrive within the default 1 minute period
TIP: A Kafka environment can be configured using link:https://strimzi.io/[Strimzi's Kafka operator].
== Elasticsearch storage
Under some circumstances, the Jaeger Operator can make use of the link:https://github.com/openshift/elasticsearch-operator[Elasticsearch Operator] to provision a suitable Elasticsearch cluster.
IMPORTANT: this feature is experimental and has been tested only on OpenShift clusters. Elasticsearch also requires the memory setting to be configured like `minishift ssh -- 'sudo sysctl -w vm.max_map_count=262144'`. Spark dependencies are not supported with this feature link:https://github.com/jaegertracing/jaeger-operator/issues/294[#294].
When there are no `es.server-urls` options as part of a Jaeger `production` instance and `elasticsearch` is set as the storage type, the Jaeger Operator creates an Elasticsearch cluster via the Elasticsearch Operator by creating a Custom Resource based on the configuration provided in storage section. The Elasticsearch cluster is meant to be dedicated for a single Jaeger instance.
The self-provision of an Elasticsearch cluster can be disabled by setting the flag `--es-provision` to `false`. The default value is `auto`, which will make the Jaeger Operator query the Kubernetes for its ability to handle a `Elasticsearch` custom resource. This is usually set by the Elasticsearch Operator during its installation process, so, if the Elasticsearch Operator is expected to run *after* the Jaeger Operator, the flag can be set to `true`.
IMPORTANT: At the moment there can be only one Jaeger with self-provisioned Elasticsearch instance per namespace.
== Accessing the UI
=== Kubernetes
The operator creates a Kubernetes link:https://kubernetes.io/docs/concepts/services-networking/ingress/[`ingress`] route, which is the Kubernetes' standard for exposing a service to the outside world, but it comes with no Ingress providers by default. link:https://kubernetes.github.io/ingress-nginx/deploy/#verify-installation[Check the documentation] on what's the most appropriate way to achieve that for your platform, but the following commands should provide a good start on `minikube`:
[source,bash]
----
minikube addons enable ingress
----
Once that is done, the UI can be found by querying the Ingress object:
[source,bash]
----
$ kubectl get ingress
NAME HOSTS ADDRESS PORTS AGE
simplest-query * 192.168.122.34 80 3m
----
IMPORTANT: an `Ingress` object is *not* created when the operator is running on OpenShift
In this example, the Jaeger UI is available at http://192.168.122.34
=== OpenShift
When using the `operator-openshift.yaml` resource, the Operator will automatically create a `Route` object for the query services. Check the hostname/port with the following command:
[source,bash]
----
oc get routes
----
NOTE: make sure to use `https` with the hostname/port you get from the command above, otherwise you'll see a message like: "Application is not available".
By default, the Jaeger UI is protected with OpenShift's OAuth service and any valid user is able to login. For development purposes, the user/password combination `developer/developer` can be used. To disable this feature and leave the Jaeger UI unsecured, set the Ingress property `security` to `none`:
[source,yaml]
----
apiVersion: jaegertracing.io/v1
kind: Jaeger
metadata:
name: disable-oauth-proxy
spec:
ingress:
security: none
----
== Auto injection of Jaeger Agent sidecars
The operator can also inject Jaeger Agent sidecars in `Deployment` workloads, provided that the deployment has the annotation `sidecar.jaegertracing.io/inject` with a suitable value. The values can be either `"true"` (as string), or the Jaeger instance name, as returned by `kubectl get jaegers`. When `"true"` is used, there should be exactly *one* Jaeger instance for the same namespace as the deployment, otherwise, the operator can't figure out automatically which Jaeger instance to use.
The following snippet shows a simple application that will get a sidecar injected, with the Jaeger Agent pointing to the single Jaeger instance available in the same namespace:
[source,yaml]
----
apiVersion: apps/v1
kind: Deployment
metadata:
name: myapp
annotations:
"sidecar.jaegertracing.io/inject": "true" # <1>
spec:
selector:
matchLabels:
app: myapp
template:
metadata:
labels:
app: myapp
spec:
containers:
- name: myapp
image: acme/myapp:myversion
----
<1> Either `"true"` (as string) or the Jaeger instance name
A complete sample deployment is available at link:./deploy/examples/business-application-injected-sidecar.yaml[`deploy/examples/business-application-injected-sidecar.yaml`]
== Agent as DaemonSet
By default, the Operator expects the agents to be deployed as sidecars to the target applications. This is convenient for several purposes, like in a multi-tenant scenario or to have better load balancing, but there are scenarios where it's desirable to install the agent as a `DaemonSet`. In that case, specify the Agent's strategy to `DaemonSet`, as follows:
[source,yaml]
----
apiVersion: jaegertracing.io/v1
kind: Jaeger
metadata:
name: my-jaeger
spec:
agent:
strategy: DaemonSet
----
IMPORTANT: if you attempt to install two Jaeger instances on the same cluster with `DaemonSet` as the strategy, only *one* will end up deploying a `DaemonSet`, as the agent is required to bind to well-known ports on the node. Because of that, the second daemon set will fail to bind to those ports.
Your tracer client will then most likely need to be told where the agent is located. This is usually done by setting the env var `JAEGER_AGENT_HOST` and should be set to the value of the Kubernetes node's IP, like:
[source,yaml]
----
apiVersion: apps/v1
kind: Deployment
metadata:
name: myapp
spec:
selector:
matchLabels:
app: myapp
template:
metadata:
labels:
app: myapp
spec:
containers:
- name: myapp
image: acme/myapp:myversion
env:
- name: JAEGER_AGENT_HOST
valueFrom:
fieldRef:
fieldPath: status.hostIP
----
== Secrets support
The Operator supports passing secrets to the Collector, Query and All-In-One deployments. This can be used for example, to pass credentials (username/password) to access the underlying storage backend (for ex: Elasticsearch).
The secrets are available as environment variables in the (Collector/Query/All-In-One) nodes.
[source,yaml]
----
storage:
type: elasticsearch
options:
es:
server-urls: http://elasticsearch:9200
secretName: jaeger-secrets
----
The secret itself would be managed outside of the `jaeger-operator` CR.
== Define sampling strategies
The operator can be used to define sampling strategies that will be supplied to tracers that have been configured
to use a remote sampler:
[source,yaml]
----
apiVersion: jaegertracing.io/v1
kind: Jaeger
metadata:
name: with-sampling
spec:
strategy: allInOne
sampling:
options:
default_strategy:
type: probabilistic
param: 50
----
This example defines a default sampling strategy that is probabilistic, with a 50% chance of the trace instances being
sampled.
Refer to the Jaeger documentation on link:https://www.jaegertracing.io/docs/latest/sampling/#collector-sampling-configuration[Collector Sampling Configuration] to see how service and endpoint sampling can be configured. The JSON representation described in that documentation can be used in the operator by converting to YAML.
== Schema migration
=== Cassandra
When the storage type is set to Cassandra, the operator will automatically create a batch job that creates the required schema for Jaeger to run. This batch job will block the Jaeger installation, so that it starts only after the schema is successfuly created. The creation of this batch job can be disabled by setting the `enabled` property to `false`:
[source,yaml]
----
apiVersion: jaegertracing.io/v1
kind: Jaeger
metadata:
name: cassandra-without-create-schema
spec:
strategy: allInOne
storage:
type: cassandra
cassandraCreateSchema:
enabled: false # <1>
----
<1> Defaults to `true`
Further aspects of the batch job can be configured as well. An example with all the possible options is shown below:
[source,yaml]
----
apiVersion: jaegertracing.io/v1
kind: Jaeger
metadata:
name: cassandra-with-create-schema
spec:
strategy: allInOne # <1>
storage:
type: cassandra
options: # <2>
cassandra:
servers: cassandra
keyspace: jaeger_v1_datacenter3
cassandraCreateSchema: # <3>
datacenter: "datacenter3"
mode: "test"
----
<1> The same works for `production` and `streaming`
<2> These options are for the regular Jaeger components, like `collector` and `query`
<3> The options for the `create-schema` job
NOTE: the default create-schema job uses `MODE=prod`, which implies a replication factor of `2`, using `NetworkTopologyStrategy` as the class, effectively meaning that at least 3 nodes are required in the Cassandra cluster. If a `SimpleStrategy` is desired, set the mode to `test`, which then sets the replication factor of `1`. Refer to the link:https://github.com/jaegertracing/jaeger/blob/master/plugin/storage/cassandra/schema/create.sh[create-schema script] for more details.
== Finer grained configuration
The custom resource can be used to define finer grained Kubernetes configuration applied to all Jaeger components or at the individual component level.
When a common definition (for all Jaeger components) is required, it is defined under the `spec` node. When the definition relates to an individual component, it is placed under the `spec/<component>` node.
The types of configuration supported include:
* link:https://kubernetes.io/docs/concepts/overview/working-with-objects/annotations/[annotations]
* link:https://kubernetes.io/docs/concepts/configuration/manage-compute-resources-container[resources] to limit cpu and memory
* link:https://kubernetes.io/docs/concepts/configuration/assign-pod-node/#affinity-and-anti-affinity[affinity] to determine which nodes a pod can be allocated to
* link:https://kubernetes.io/docs/concepts/configuration/taint-and-toleration/[tolerations] in conjunction with `taints` to enable pods to avoid being repelled from a node
* link:https://kubernetes.io/docs/concepts/storage/volumes/[volumes] and volume mounts
* link:https://kubernetes.io/docs/tasks/configure-pod-container/configure-service-account/[serviceAccount] to run each component with separate identity
* link:https://kubernetes.io/docs/tasks/configure-pod-container/security-context/[securityContext] to define privileges of running components
[source,yaml]
----
apiVersion: jaegertracing.io/v1
kind: Jaeger
metadata:
name: simple-prod
spec:
strategy: production
storage:
type: elasticsearch
options:
es:
server-urls: http://elasticsearch:9200
annotations:
key1: value1
resources:
requests:
memory: "64Mi"
cpu: "250m"
limits:
memory: "128Mi"
cpu: "500m"
affinity:
nodeAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
nodeSelectorTerms:
- matchExpressions:
- key: kubernetes.io/e2e-az-name
operator: In
values:
- e2e-az1
- e2e-az2
preferredDuringSchedulingIgnoredDuringExecution:
- weight: 1
preference:
matchExpressions:
- key: another-node-label-key
operator: In
values:
- another-node-label-value
tolerations:
- key: "key1"
operator: "Equal"
value: "value1"
effect: "NoSchedule"
- key: "key1"
operator: "Equal"
value: "value1"
effect: "NoExecute"
serviceAccount: nameOfServiceAccount
securityContext:
runAsUser: 1000
volumeMounts:
- name: config-vol
mountPath: /etc/config
volumes:
- name: config-vol
configMap:
name: log-config
items:
- key: log_level
path: log_level
----
== Removing an instance
To remove an instance, just use the `delete` command with the file used for the instance creation:
[source,bash]
----
kubectl delete -f simplest.yaml
----
Alternatively, you can remove a Jaeger instance by running:
[source,bash]
----
kubectl delete jaeger simplest
----
NOTE: deleting the instance will not remove the data from a permanent storage used with this instance. Data from in-memory instances, however, will be lost.
== Monitoring the operator
The Jaeger Operator starts a Prometheus-compatible endpoint on `0.0.0.0:8383/metrics` with internal metrics that can be used to monitor the process.
NOTE: The Jaeger Operator does not yet publish its own metrics. Rather, it makes available metrics reported by the components it uses, such as the Operator SDK.
== Uninstalling the operator
Similar to the installation, just run:
[source,bash]
----
kubectl delete -f https://raw.githubusercontent.com/jaegertracing/jaeger-operator/master/deploy/operator.yaml
kubectl delete -f https://raw.githubusercontent.com/jaegertracing/jaeger-operator/master/deploy/role_binding.yaml
kubectl delete -f https://raw.githubusercontent.com/jaegertracing/jaeger-operator/master/deploy/role.yaml
kubectl delete -f https://raw.githubusercontent.com/jaegertracing/jaeger-operator/master/deploy/service_account.yaml
kubectl delete -f https://raw.githubusercontent.com/jaegertracing/jaeger-operator/master/deploy/crds/jaegertracing_v1_jaeger_crd.yaml
----