The metadata-service is a fifteen-factor microservice that uses the GitOps pattern to manage
metadata (owner, reviewers, quicklinks, alerting targets, ...) about services and exposes it via its REST API.
It uses the following repository as a data store. Create a copy of it by hitting the button.
Getting Started
⚠ The service requires a Vault instance to obtain the secrets needed for committing changes to
the metadata repository.
- Download the latest release or clone the repository to build a local Docker image.
- Copy local-config.template.yaml to local-config.yaml.
- Adjust the config and fill in all required values (see below for details).
- Create a copy of the template repository and set it in local-config.yaml.
- Run the server by starting the binary.
Configuration
In production, all configuration is sourced from environment variables. For localhost development,
local-config.yaml can be used to set the variables.
| Variable | Default | Description |
| --- | --- | --- |
| APPLICATION_NAME | metadata | The name of the application; lowercase letters, numbers, and - only. |
| SERVER_ADDRESS | 0.0.0.0 | Address to bind to; one of IP, hostname, ipv6_ip, ipv6ip%interface. |
| SERVER_PORT | 8080 | Port to listen on; cannot be a privileged port, must be in the range 1024-65535. |
| METRICS_PORT | 9090 | Port to provide Prometheus metrics on; cannot be a privileged port, must be in the range 1024-65535. |
| | | |
| METADATA_REPO_URL | | The HTTP URL of the repository containing the metadata, e.g. https://github.com/Interhyp/metadata-service-template.git |
| METADATA_REPO_MAINLINE | refs/heads/main | The ref of the service metadata used as mainline. |
| OWNER_REGEX | .* | The regex used to limit which owner aliases to load. Mostly useful for local development to minimize startup time. The default loads all owners. |
| | | |
| LOGSTYLE | ecs | The log style to use (defaults to Elastic Common Schema); can be changed to plain for localhost debugging. |
| | | |
| VAULT_ENABLED | true | Enables Vault. Supports all values supported by [ParseBool](https://pkg.go.dev/strconv#ParseBool). |
| VAULT_SERVER | | FQDN of the Vault server; do not add any other part of the URL. |
| VAULT_AUTH_TOKEN | | Authentication token used to fetch secrets. Setting this implicitly switches from Kubernetes authentication to token mode. |
| VAULT_AUTH_KUBERNETES_ROLE | | Role binding to use for Vault Kubernetes authentication, usually microservice_role_ |
| VAULT_AUTH_KUBERNETES_TOKEN_PATH | /var/run/secrets/kubernetes.io/serviceaccount/token | File path to the service-account token. |
| VAULT_AUTH_KUBERNETES_BACKEND | | Authentication path for the Kubernetes cluster. |
| VAULT_SECRETS_CONFIG | | Configuration consisting of Vault paths and keys to fetch from the corresponding path. Values will be written to the global configuration object. |
| | | |
| BASIC_AUTH_USERNAME | | Name of the user used for basic authentication to this service. The user will be granted admin privileges. |
| BASIC_AUTH_PASSWORD | | Password of the user used for basic authentication to this service. The user will be granted admin privileges. |
| | | |
| BITBUCKET_USERNAME | | Name of the user used for basic Git authentication to pull and update the metadata repository. |
| BITBUCKET_PASSWORD | | Password of the user used for basic Git authentication to pull and update the metadata repository. |
| | | |
| GIT_COMMITTER_NAME | | Name of the user used to create the Git commits. |
| GIT_COMMITTER_EMAIL | | E-mail of the user used to create the Git commits. |
| | | |
| KAFKA_USERNAME | | Leave ALL of the KAFKA_ fields empty to skip the Kafka integration. |
| KAFKA_PASSWORD | | Leave ALL of the KAFKA_ fields empty to skip the Kafka integration. |
| KAFKA_TOPIC | | |
| KAFKA_SEED_BROKERS | | A comma-separated list of Kafka brokers, e.g. first-kafka-broker.domain.com:9092,second-kafka-broker.domain.com:9092 |
| KAFKA_GROUP_ID_OVERRIDE | | Override the Kafka group id for local development to avoid creating lots of consumer groups. If unset, it is derived from the local IP address so that each k8s pod gets its own group. |
| | | |
| AUTH_OIDC_KEY_SET_URL | | URL of the OpenID Connect key set for validating JWTs. See Authentication for more details. |
| AUTH_OIDC_TOKEN_AUDIENCE | | Expected audience of the JWT. Tokens not created for this audience will be rejected. |
| AUTH_GROUP_WRITE | | Id or name of the group that is allowed to perform write actions. Must be part of the 'groups' claim for requests to succeed. If left blank, anyone with a valid JWT is allowed to perform write actions. |
| | | |
| UPDATE_JOB_INTERVAL_MINUTES | 15 | Interval in minutes for refreshing the metadata repository cache. |
| UPDATE_JOB_TIMEOUT_SECONDS | 30 | Timeout in seconds when fetching the Git repository. |
| | | |
| ALERT_TARGET_PREFIX | | Validates the alert target to match either the prefix or the suffix. |
| ALERT_TARGET_SUFFIX | | |
| | | |
| OWNER_ALIAS_PERMITTED_REGEX | ^[a-z](-?[a-z0-9]+)*$ | Regular expression controlling which owner aliases are permitted to be created. |
| OWNER_ALIAS_PROHIBITED_REGEX | ^$ | Regular expression controlling which owner aliases are prohibited from being created. |
| OWNER_ALIAS_MAX_LENGTH | 28 | Maximum length of a valid owner alias. |
| OWNER_ALIAS_FILTER_REGEX | | Regular expression to filter owners based on their alias. Useful on localhost or for test instances to speed up service startup. |
| | | |
| SERVICE_NAME_PERMITTED_REGEX | ^[a-z](-?[a-z0-9]+)*$ | Regular expression controlling which service names are permitted to be created. |
| SERVICE_NAME_PROHIBITED_REGEX | ^$ | Regular expression controlling which service names are prohibited from being created. |
| SERVICE_NAME_MAX_LENGTH | 28 | Maximum length of a valid service name. |
| | | |
| REPOSITORY_NAME_PERMITTED_REGEX | ^[a-z](-?[a-z0-9]+)*$ | Regular expression controlling which repository names are permitted to be created. |
| REPOSITORY_NAME_PROHIBITED_REGEX | ^$ | Regular expression controlling which repository names are prohibited from being created. |
| REPOSITORY_NAME_MAX_LENGTH | 64 | Maximum length of a valid repository name. |
| REPOSITORY_TYPES | | Comma-separated list of supported repository types. |
| REPOSITORY_KEY_SEPARATOR | . | Single character used to separate the repository name from the repository type; neither may contain the separator. |
Datastore
The metadata-service uses a Git repository as its datastore and caches it in memory. This enables
a GitOps pattern for maintaining the metadata of services. The metadata is categorized into the three
types owner, service, and repository. The structure of the repository is as follows; see
the documentation for details on the API.
owners/
└── owner-a/
├── owner.info.yaml
├── services/
│ ├── service-a.yaml
│ └── service-b.yaml
└── repositories/
├── service-a.implementation.yaml
├── service-a.deployment.yaml
├── service-b.api.yaml
└── something.none.yaml
Authentication
The metadata-service has two kinds of authentication. One for the repository used as the datastore and
another set of credentials used to authenticate against the protected API (create, update, delete operations) of the
service.
Datastore Authentication
The service currently only supports Basic Authentication for fetching and updating the git repository used as
the datastore. The credentials are configured via BITBUCKET_USERNAME
and BITBUCKET_PASSWORD
which are either
provided as environment variable or via Vault using the same keys at the destination defined in VAULT_SERVICE_SECRETS_PATH
.
API Authentication
Authentication against the API of the metadata-service can be done using a JWT and configuring the KEY_SET_URL
with a valid OpenID backend. As an alternative Basic Authentication can be enabled to authorise requests to your API.
The credentials are configured via BASIC_AUTH_USERNAME
and BASIC_AUTH_PASSWORD
which are either
provided as environment variable or via Vault using the same keys at the destination defined in VAULT_SERVICE_SECRETS_PATH
.
WARNING: Leaving BASIC_AUTH_USERNAME and BASIC_AUTH_PASSWORD empty will currently expose your protected API
to anonymous requests.
concurrency and eventual consistency
You will notice that all read operations give you the timestamp and Git commit hash of the state they are
based on. You are expected to send these back when making an update; we use them to detect concurrent updates.
Many parts of the API will respond with 409 if you attempt an update based on outdated information
(commit hash).
While an update is being distributed between instances, we make no strong consistency guarantees for read
operations. Write operations, on the other hand, always pull the current Git tree before committing, and since
you are sending along the commit hash and timestamp an update is based on, any concurrent update will fail
even if it happens to go through a different instance of this service.
This has an important consequence: if you make a write operation and want to continue working with
a metadata entry, you must always use the new state returned by the write operation, or you may end up on
another instance and read old state.
All write operations, including PATCH and initial creation, return the current state of the metadata entry,
including the new timestamp and commit hash.
changing owners
You can change the owner of a service by making an update to it that changes the owner alias. This also
changes the owner of any repositories associated with the service, and the whole thing is done in a single Git
commit, meaning this is an atomic operation.
It is not valid to change the owner of a repository directly while its service references it, because that
would lead to dangling references in the service. Change the owner of the service instead.
Note: this microservice only changes the metadata in Git. You will still need to make other changes, such as
moving Jenkins jobs.
You can also change the owner of a repository (except for service-associated repositories) by making
an update to it that changes the owner alias. This is an atomic operation that results in a single
Git commit.
kafka event stream and caching behaviour
Kafka update notifications are sent for changes received through a controller (including the webhook controller,
which needs to be configured in Bitbucket to notify the service of outside changes).
This may lead to duplicate update notifications, both in and out of order. We solve this by keeping track of
which commits in the service metadata we have already seen and ignoring events for them. Any event with a new
commit hash leads to a synchronous pull and update of all caches, so the next time that commit is already known.
If you are a client subscribing to our Kafka update notifications and you want to ensure you GET the current
state following an update notification, you must compare the commit hash and timestamp to see whether you got
the correct version. If not, wait a bit and try again; you landed on an instance that is not consistent yet.
architecture
Those aren't Beans, they're Acorns
Our singleton components implement the Acorn interface.
All singletons refer to each other by references to their interfaces; this allows for easy mocking during tests.
development
initial setup
Clone this outside your GOPATH (which on Linux defaults to ~/go).
Tip: On Windows, you should NOT place the GOPATH in your profile, because a large body of source code goes there.
contract test setup
This service uses pact-go for consumer-driven contract tests.
We only interact with the Vault API, so we are a consumer, and there is no Pact-aware implementation
on the producer side; instead, this repository comes with the relevant recordings, which were created manually.
Download and install the Pact command line tools and add them to your PATH as described in the
pact-go manual.
generate api model classes
The generated models are checked in, so you only need to run this step if there is a change in the OpenAPI specs.
We use the OpenAPI Generator to generate our model classes (and only those!).
In a Git Bash (or on Linux), run ./api-generator/generate.sh.
build
go build main.go
run tests
We have given-when-then style acceptance tests.
Run all tests, including the acceptance and contract tests (requires Pact to be installed):
go test -coverpkg='./internal/...' -v './...'
In IntelliJ/GoLand, if you want to check code coverage, you must set Go Tool Arguments to
-coverpkg='./internal/...', so that cross-package coverage from the acceptance tests is considered.
You may wish to set this under Edit Configuration - Edit Configuration Templates, so it will be set on all
new test run configurations.
Clear the test cache:
go clean -testcache
Goland terminal configuration
GoLand has the annoying habit of limiting line width on the output terminal to 80 characters no matter how
wide the window is.
You can fix this. Menu: Help -> Find Action... -> search for "Registry", then uncheck go.run.processes.with.pty.
run application
All configuration is normally read from environment variables, but for localhost convenience, we support an
optional flat yaml file called local-config.yaml, which is ignored by git.
If present, it is read BEFORE any environment variables are interpreted.
Copy the local-config.template.yaml and replace LOCAL_VAULT_TOKEN with your Vault token.
To obtain your personal Vault token, log into Vault and use "Copy token" from the
user menu on the right. Then add this verbatim under LOCAL_VAULT_TOKEN.
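A hypothetical minimal local-config.yaml sketch, assuming the flat file mirrors the environment variable names from the Configuration table; always start from the actual local-config.template.yaml, since it is authoritative for the expected keys. All values below are placeholders.

```yaml
# local-config.yaml -- localhost only, ignored by git. Placeholder values.
METADATA_REPO_URL: https://github.com/your-org/your-metadata-template-copy.git
LOGSTYLE: plain
LOCAL_VAULT_TOKEN: replace-with-your-personal-vault-token
```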
After this, the application can be started with:
go run main.go
use Elastic APM during local development
To use Elastic APM, add the following environment to your (run) configuration:
ELASTIC_APM_SERVER_URL=https://apm-server.sys.ehyp.dev.interhyp-cloud.de;ELASTIC_APM_ENVIRONMENT=dev;ELASTIC_APM_SERVICE_NAME=metadata
To disable APM even if it is configured, add ELASTIC_APM_DISABLED: true to your local-config.yaml.
swagger-ui
This service comes with the Swagger UI built in.
Open http://localhost:8080/swagger-ui/index.html in your browser.
The API docs URL is /v3/api-docs (in case the Swagger UI does not open it automatically).
List dependency tree
go mod graph > deps.txt