This release provides Cloud Foundry and BOSH integration with Google Cloud
Platform's Stackdriver Logging and Monitoring.
Functionality is provided by 3 jobs in this release: stackdriver-nozzle,
google-fluentd, and stackdriver-agent.
Project Status
The following is generally available:
- Stackdriver Host Monitoring Agent (stackdriver-agent)
- Stackdriver Host Logging Agent (google-fluentd)
- Stackdriver Nozzle (stackdriver-nozzle)
  - Stackdriver Logging for Cloud Foundry Log Events (LogMessage, Error, HttpStartStop)
  - Stackdriver Monitoring for Cloud Foundry Metric Events (ContainerMetric, ValueMetric, CounterEvent)
The following is in beta:
- Stackdriver Nozzle
  - Stackdriver Logging for Cloud Foundry Metric Events (ContainerMetric, ValueMetric, CounterEvent)
The project was developed in partnership with Google and Pivotal and is actively
maintained by Google.
Getting started
Enable Stackdriver APIs
Ensure the Stackdriver Logging and Stackdriver
Monitoring APIs are enabled.
Quotas
Depending on the size of the Cloud Foundry deployment and which events the nozzle forwards,
it is easy to reach the default Stackdriver quotas.
Google quotas can be viewed and managed on the API Quotas Page.
An operator can increase the default quota up to a limit; beyond that limit, use the contact
links to request a higher quota.
All of the jobs in this release authenticate to Stackdriver Logging and
Monitoring via Service Accounts.
Follow the GCP documentation to create a service account via gcloud with the following roles:
- roles/logging.logWriter
- roles/logging.configWriter
- roles/monitoring.metricWriter
You can authenticate the job(s) either by specifying the service account in the cloud_properties
for the resource pool running the job(s) or by configuring credentials.application_default_credentials
in the job spec.
You may also read the access control documentation for more general information
about how authentication and authorization work for Stackdriver.
General usage
To use any of the jobs in this BOSH release, first upload it to your BOSH
director:
bosh2 upload-release https://storage.googleapis.com/bosh-gcp/beta/stackdriver-tools/latest.tgz
The stackdriver-tools.yml sample BOSH 2.0 manifest illustrates how to
use all 3 jobs in this release (nozzle, host logging, and host monitoring). You
can deploy the sample with the following commands:
bosh2 upload-stemcell https://bosh.io/d/stemcells/bosh-google-kvm-ubuntu-trusty-go_agent
bosh2 update-cloud-config -n manifests/cloud-config-gcp.yml \
  -v zone=... \
  -v network=... \
  -v subnetwork=... \
  -v "tags=['stackdriver-nozzle']" \
  -v internal_cidr=... \
  -v internal_gw=... \
  -v "reserved=[10....-10....]"
bosh2 deploy manifests/stackdriver-tools.yml \
  -d stackdriver-nozzle \
  --var=firehose_endpoint=https://.. \
  --var=firehose_username=stackdriver_nozzle \
  --var=firehose_password=... \
  --var=skip_ssl=false \
  --var=gcp_project_id=... \
  --var-file=gcp_service_account_json=path/to/service_account.json
This will create a self-contained deployment that sends Cloud Foundry firehose
data, host logs, and host metrics to Stackdriver.
Deploying each job individually is described in detail below.
Deploying the nozzle
Create a new deployment manifest for the nozzle. See the example
manifest for a full deployment and the jobs.stackdriver-nozzle
section for the nozzle.
To reduce message loss, operators should run a minimum of two instances. With
two instances, updating stemcells and other destructive BOSH operations will
still leave an instance draining logs.
The loggregator system will round-robin messages across multiple
instances. If the nozzle can't handle the load, consider scaling to more than
two nozzle instances.
The spec describes all the properties an operator should modify.
Stackdriver Error Reporting
Stackdriver can automatically detect and report errors from stack traces in logs.
However, this does not automatically work with Loggregator because it sends each
line from app output as a separate log message to the nozzle. To enable this feature
of Stackdriver, apps will need to manually encode stacktraces on a single line so
that the stackdriver-nozzle can send them as single messages to Stackdriver.
This is accomplished by replacing newlines in stacktraces with a unique character,
which is set using the firehose.newline_token template variable in the nozzle
so that the nozzle can reconstruct the stacktrace on multiple lines.
For example, if firehose.newline_token is set to ∴, a Go app would need to
implement something like the following:
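package main

import (
	"fmt"
	"os"
	"runtime"
	"strings"
)

// newlineToken must match the nozzle's firehose.newline_token setting.
const newlineToken = "∴"

func main() {
	...
	defer handlePanic()
	...
}

// handlePanic logs a recovered panic and its stack trace as two separate lines.
func handlePanic() {
	e := recover()
	if e == nil {
		return
	}
	// Capture the stack traces of all goroutines (up to 64 KiB).
	stack := make([]byte, 1<<16)
	stackSize := runtime.Stack(stack, true)
	out := string(stack[:stackSize])
	// Log the panic on its own line, then the stack trace encoded on one line.
	fmt.Fprintf(os.Stderr, "panic: %v\n", e)
	fmt.Fprintln(os.Stderr, strings.Replace(out, "\n", newlineToken, -1))
	os.Exit(1)
}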
This outputs the stacktrace separately from the panic so that the panic remains in
the logs and the stacktrace is logged by itself. This allows Stackdriver to detect
the stacktrace as an error.
For an example in Java, see this section of the Loggregator documentation.
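The round trip described in this section can be sketched in Go. This is a minimal illustration, not the nozzle's actual implementation: encodeStack and decodeStack are hypothetical helper names, the sample trace is made up, and ∴ is simply the example value of firehose.newline_token used above.

```go
package main

import (
	"fmt"
	"strings"
)

// newlineToken mirrors the example value of firehose.newline_token.
const newlineToken = "∴"

// encodeStack (hypothetical helper) collapses a multi-line stack trace onto
// one line, as an app would before writing it to stderr.
func encodeStack(trace string) string {
	return strings.ReplaceAll(trace, "\n", newlineToken)
}

// decodeStack (hypothetical helper) restores the line breaks, as a consumer
// of the single-line message would before forwarding it.
func decodeStack(line string) string {
	return strings.ReplaceAll(line, newlineToken, "\n")
}

func main() {
	trace := "goroutine 1 [running]:\nmain.handlePanic()\n\t/app/main.go:20"
	encoded := encodeStack(trace)

	// The encoded form contains no line breaks, so it travels as one message.
	fmt.Println(strings.Contains(encoded, "\n"))
	// Decoding restores the original multi-line trace.
	fmt.Println(decodeStack(encoded) == trace)
}
```

The token should be a character that never appears in normal app output; otherwise decoding would insert line breaks where none existed.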
Deploying host logging
The google-fluentd template uses Fluentd to send
both syslog and template logs (assuming that template jobs are writing logs into
/var/vcap/sys/log/*/*.log) to Stackdriver Logging.
To forward host logs from BOSH VMs to Stackdriver, co-locate the
google-fluentd template with an existing job whose host logs should be
forwarded.
Include the stackdriver-tools release in your existing deployment manifest:
releases:
  ...
  - name: stackdriver-tools
    version: latest
  ...
Add the google-fluentd template to your job:
jobs:
  ...
  - name: nats
    templates:
      - name: nats
        release: cf
      - name: metron_agent
        release: cf
      - name: google-fluentd
        release: stackdriver-tools
  ...
Deploying host monitoring
The stackdriver-agent template uses the Stackdriver
Monitoring Agent to collect VM metrics to send to
Stackdriver Monitoring.
To forward host metrics from BOSH VMs to Stackdriver, co-locate the
stackdriver-agent template with an existing job whose host metrics should be
forwarded.
Include the stackdriver-tools release in your existing deployment manifest:
releases:
  ...
  - name: stackdriver-tools
    version: latest
  ...
Add the stackdriver-agent template to your job:
jobs:
  ...
  - name: nats
    templates:
      - name: nats
        release: cf
      - name: metron_agent
        release: cf
      - name: stackdriver-agent
        release: stackdriver-tools
  ...
Deploying as a BOSH addon
Specify the jobs as addons in your runtime config to deploy Stackdriver Monitoring and Logging agents on all instances in your deployment. Do not specify the jobs as part of your deployment manifest if you are using the runtime config.
# runtime.yml
---
releases:
  - name: stackdriver-tools
    version: latest
addons:
  - name: stackdriver-tools
    jobs:
      - name: google-fluentd
        release: stackdriver-tools
      - name: stackdriver-agent
        release: stackdriver-tools
To update the runtime config:
bosh2 update-runtime-config -d <your deployment> runtime.yml
Then redeploy your manifest:
bosh2 deploy -d <your deployment> path/to/manifest.yml
Development
Updating google-fluentd
google-fluentd is versioned by the Gemfile in src/google-fluentd. To update fluentd:
- Update the version specifier in the Gemfile (if necessary)
- Update Gemfile.lock:
  bundle update
- Create a vendor cache from the Gemfile.lock:
  bundle package
- Tar and compress the vendor folder:
  tar zvc vendor > google-fluentd-vendor-<VERSION>-plugin-<VERSION>.tgz
- Update the vendor version in the google-fluentd package's packaging and spec
- Add the vendored cache to the BOSH blobstore:
  bosh2 add-blob google-fluentd-vendor-<VERSION>-plugin-<VERSION>.tgz google-fluentd-vendor/google-fluentd-vendor-VERSION-NUMBER.tgz
- Create a dev release and deploy it to verify that all of the above worked
- Update the BOSH blobstore:
  bosh2 upload-blobs
- Commit your changes
bosh-lite
Both the nozzle and the fluentd jobs can run on bosh-lite. To generate a working manifest, start from
the bosh-lite-example-manifest. Note the application_default_credentials
property, which should be filled in with the contents of a Google service account key.
Contributing
For details on how to contribute to this project - including filing bug reports
and contributing code changes - please see CONTRIBUTING.md.
Copyright
Copyright (c) 2016 Ferran Rodenas. See LICENSE for details.