logdog/

directory
v0.0.0-...-35d8de9
Published: Oct 1, 2019 License: Apache-2.0

README

LogDog

LogDog is a high-performance log collection and dissemination platform. It is designed to collect log data from a large number of cooperative individual sources and make it available to users and systems for consumption. It is composed of several services and tools that work cooperatively to provide a reliable log streaming platform.

Like other LUCI components, LogDog primarily aims to be useful to the Chromium project.

LogDog offers several useful features:

  • Log data is streamed, and is consequently available the moment that it is ingested in the system.
  • Flexible hierarchical log namespaces for organization and navigation.
  • Recognition of different projects, and application of different ACLs for each project.
  • Able to stream text, binary data, or records.
  • Long term (possibly indefinite) log data storage and access.
  • Log data is sourced from read-forward streams (think files, sockets, etc.).
  • Leverages the LUCI Configuration Service for configuration and management.
  • Log data is implemented as protobufs.
  • The entire platform is written in Go.
  • Rich metadata is collected and stored alongside log records.
  • Built entirely on scalable platform technologies, targeting Google Cloud Platform.
    • Resource requirements scale linearly with log volume.

APIs

Most applications will interact with a LogDog Coordinator instance via its Coordinator Logs API.

Life of a Log Stream

Log streams pass through several layers and states during their path from generation through archival.

  1. Streaming: A log stream is being emitted by a Butler instance and pushed through the Transport Layer to the Collector.
  2. Pre-Registration: The log stream hasn't been observed by a Collector instance yet, and exists only in the mind of the Butler and the Transport layer.
  3. Registered: The log stream has been observed by a Collector instance and successfully registered with the Coordinator. At this point, it becomes queryable, listable, and the records that have been loaded into Intermediate Storage are streamable.
  4. ArchivePending: One of the following events causes the log stream to be recognized as finished and an archival request to be dispatched. The archival request is submitted to the Archivist cluster.
    • The log stream's terminal entry is collected, and the terminal index is successfully registered with the Coordinator.
    • A sufficient amount of time has expired since the log stream's registration.
  5. Archived: An Archivist instance has received an archival request for the log stream, successfully executed the request according to its parameters, and updated the log stream's state with the Coordinator.

Most of the lifecycle is hidden from the Logs API endpoint by design. The user need not distinguish between a stream that is streaming, has archival pending, or has been archived. They will issue the same Get requests and receive the same log stream data.

A user may differentiate between a streaming and a complete log by observing its terminal index, which will be negative (< 0) while the log stream is still streaming.
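
A client-side check for this could look like the sketch below. The function name isComplete and the use of int64 for the terminal index are assumptions made for illustration; consult the Logs API definitions for the actual field and type.

```go
package main

import "fmt"

// isComplete reports whether a log stream has finished, based on the rule
// that a terminal index below zero means the stream is still streaming.
// The int64 type is an assumption made for this sketch.
func isComplete(terminalIndex int64) bool {
	return terminalIndex >= 0
}

func main() {
	fmt.Println(isComplete(-1)) // still streaming
	fmt.Println(isComplete(41)) // terminated at entry index 41
}
```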

Components

The LogDog platform consists of several components:

  • Coordinator, a hosted service which serves log data to users and manages the log stream lifecycle.
  • Butler, which runs on each log stream producing system and serves log data to the Collector for consumption.
  • Collector, a microservice which takes log stream data and ingests it into intermediate storage for streaming and archival.
  • Archivist, a microservice which compacts completed log streams and prepares them for long-term storage.

LogDog offers several log stream clients to query and consume log data:

Additionally, LogDog is built on several abstract middleware technologies, including:

  • A Transport, a layer for the Butler to send data to the Collector.
  • An Intermediate Storage, a fast, highly-accessible layer which stores log data immediately after it is ingested by the Collector until it can be archived.
  • An Archival Storage, for cheap long-term file storage.

Log data is sent from the Butler through Transport to the Collector, which stages it in Intermediate Storage. Once the log stream is complete (or expired), the Archivist moves the data from Intermediate Storage to Archival Storage, where it will permanently reside.

The Chromium-deployed LogDog service uses Google Cloud Platform for several of the middleware layers:

  • Google AppEngine, a scaling application hosting service.
  • Cloud Datastore, a powerful transactional NOSQL structured data storage system. This is used by the Coordinator to store log stream state.
  • Cloud Pub/Sub, a publish / subscribe model transport layer. This is used to ferry log data from Butler instances to Collector instances for ingest.
  • Cloud BigTable, an unstructured key/value storage. This is used as intermediate storage for log stream data.
  • Cloud Storage, used for long-term log stream archival storage.
  • Container Engine, which manages Kubernetes clusters. This is used to host the Collector and Archivist microservices.

Additionally, other LUCI services are used, including:

Instantiation

To instantiate your own LogDog instance, you will need the following prerequisites:

  • A Configuration Service instance.
  • A Google Cloud Platform project configured with:
    • Datastore
    • A Pub/Sub topic (Butler) and subscription (Collector) for log streaming.
    • A Pub/Sub topic (Coordinator) and subscription (Archivist) for archival coordination.
    • A Container Engine instance for microservice hosting.
    • A BigTable cluster.
    • A Cloud Storage bucket for archival staging and storage.

Other compatible optional components include:

  • An Auth Service instance to manage authentication. This is necessary if something stricter than public read/write is desired.

Config

The Configuration Service must have a valid service entry text protobuf for this LogDog service (defined in svcconfig/config.proto).

Coordinator

After deploying the Coordinator to a suitable cloud project, several configuration parameters must be defined. Visit its settings page at https://<your-app>/admin/portal and do the following:

  • Configure the "Configuration Service Settings" to point to the Configuration Service instance.
  • Update "Tumble Settings" appropriately (see tumble docs).
  • If using timeseries monitoring, update the "Time Series Monitoring Settings".
  • If using Auth Service, set the "Authorization Settings".

If you are using a BigTable instance outside of your cloud project (e.g., staging, dev), you will need to add your BigTable service account JSON to the service's settings. Currently this cannot be done without a command-line tool. Hopefully a proper settings page will be added to enable this, or alternatively Cloud BigTable will be updated to support IAM.

Microservices

Microservices are hosted in Google Container Engine, and use Google Compute Engine metadata for configuration.

The following metadata parameters must be set for deployed microservices to work:

  • logdog_coordinator_host, the host name of the Coordinator service.

All deployed microservices use the following optional metadata parameters for configuration:

  • logdog_storage_auth_json, an optional file containing the authentication credentials for intermediate storage (i.e., BigTable). This isn't necessary if the BigTable node is hosted in the same cloud project in which the microservice is running and the microservice's container has BigTable Read/Write permissions.
  • tsmon_endpoint, an optional endpoint for timeseries monitoring data.
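
As an illustration of the required versus optional parameters, the sketch below validates a set of metadata values that have already been fetched into a map. The serviceConfig type and loadConfig function are hypothetical; only the three metadata key names come from this document, and how the values are read from Compute Engine metadata is out of scope here.

```go
package main

import (
	"errors"
	"fmt"
)

// serviceConfig is a hypothetical container for the metadata parameters
// shared by all deployed microservices.
type serviceConfig struct {
	CoordinatorHost string // required: logdog_coordinator_host
	StorageAuthJSON string // optional: logdog_storage_auth_json
	TSMonEndpoint   string // optional: tsmon_endpoint
}

// loadConfig builds a serviceConfig from already-fetched metadata values and
// fails if the one required parameter is missing.
func loadConfig(md map[string]string) (serviceConfig, error) {
	host := md["logdog_coordinator_host"]
	if host == "" {
		return serviceConfig{}, errors.New("logdog_coordinator_host must be set")
	}
	return serviceConfig{
		CoordinatorHost: host,
		StorageAuthJSON: md["logdog_storage_auth_json"],
		TSMonEndpoint:   md["tsmon_endpoint"],
	}, nil
}

func main() {
	// "logdog.example.com" is a placeholder host name.
	cfg, err := loadConfig(map[string]string{
		"logdog_coordinator_host": "logdog.example.com",
	})
	fmt.Println(cfg.CoordinatorHost, err)

	// Missing the required parameter yields an error.
	_, err = loadConfig(map[string]string{})
	fmt.Println(err)
}
```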

Collector

The Collector instance is fully command-line compatible. Its entry point script uses Google Compute Engine metadata to populate the command line in a production environment:

  • logdog_collector_log_level, an optional -log-level flag value.

Archivist

The Archivist instance is fully command-line compatible. Its entry point script uses Google Compute Engine metadata to populate the command line in a production environment:

  • logdog_archivist_log_level, an optional -log-level flag value.

Directories

Path Synopsis
api
  config/svcconfig
    Package svcconfig stores service configuration for a LogDog instance.
  config/svcconfig/validate
    Package main implements the LogDog Coordinator validation binary.
  endpoints/coordinator/admin/v1
    Package logdog contains Version 1 of the LogDog Coordinator service interface.
  endpoints/coordinator/logs/v1
    Package logdog contains Version 1 of the LogDog Coordinator user interface.
  endpoints/coordinator/registration/v1
    Package logdog contains Version 1 of the LogDog Coordinator stream registration interface.
  endpoints/coordinator/services/v1
    Package logdog contains Version 1 of the LogDog Coordinator backend service interface.
  logpb
    Package logpb contains LogDog protobuf source and generated protobuf data.
appengine
  cmd/coordinator/default
    Binary default is a simple AppEngine LUCI service.
  cmd/coordinator/logs
    Package main is the main entry point for the `vmuser` LogDog AppEngine executable.
  cmd/coordinator/services
    Binary services is the main entry point for the `services` LogDog AppEngine executable.
  cmd/coordinator/static
    Binary stub doesn't actually do anything.
client
  annotee/annotation
    Package annotation implements a state machine that constructs Milo annotation protobufs from a series of annotation commands.
  annotee/executor
    Package executor contains an implementation of the Annotee Executor.
  bootstrapResult
    Package bootstrapResult defines a common way to express the result of bootstrapping a command via JSON.
  butler
    Package butler is the main engine for the Butler executable.
  butler/bootstrap
    Package bootstrap handles Butler-side bootstrapping functionality.
  butler/buffered_callback
    Package buffered_callback provides functionality to wrap around LogEntry callbacks to guarantee calling only on complete LogEntries, because the LogDog bundler produces fragmented LogEntries under normal operation, in order to meet time or buffer size requirements.
  butler/bundler
    Package bundler is responsible for efficiently transforming aggregate stream data into Butler messages for export.
  butler/output
    Package output contains interfaces and implementations for Butler Outputs, which are responsible for delivering Butler protobufs to LogDog collection endpoints.
  butler/output/log
    Package log implements the "log" Output.
  butler/output/logdog
    Package logdog implements output to a Logdog server via PubSub.
  butlerlib/streamproto
    Package streamproto describes the protocol primitives used by LogDog/Butler for stream negotiation.
  cli
  cmd/logdog_butler
    Package main is entry point for the command-line LogDog Butler application.
  pubsubprotocol
    Package pubsubprotocol implements the LogDog pubsub wire protocol.
common
  archive
    Package archive constructs a LogDog archive out of log stream components.
  renderer
    Package renderer exports the capability to render a LogDog log stream to an io.Writer.
  storage/archive
    Package archive implements a storage.Storage instance that retrieves logs from a Google Storage archive.
  storage/archive/logdog_archive_test
    Package main implements a simple CLI tool to load and interact with Google Storage archived data.
  storage/bigtable
    Package bigtable provides an implementation of the Storage interface backed by Google Cloud Platform's BigTable.
  storage/bigtable/logdog_bigtable_test
    Package main implements a simple CLI tool to load and interact with storage data in Google BigTable data.
  storage/memory
    Package memory implements in-memory Storage structures.
  viewer
    Package viewer is a support library to interact with the LogDog web app and log stream viewer.
server
  collector
    Package collector implements the LogDog Collector daemon's log parsing and registration logic.
  collector/coordinator
    Package coordinator implements a minimal interface to the Coordinator service that is sufficient for Collector usage.
  service/config
    Package config implements common LogDog daemon configuration.
