cloudevent-recorder

This module provisions a regionalized event subscriber that consumes events of particular types from each of the regional brokers and writes them to Google BigQuery.

flowchart LR
    subgraph "regional network"
    A[[Pub/Sub topic]] -- notifies --> B(Recorder Service)
    end

    B -- writes every 3m --> C[(Cloud Storage)]
    C --> D[Data Transfer Service]
    D -- loads every 15m --> E[(BigQuery)]

The recorder service writes events to a regional GCS bucket, from which a periodic BigQuery Data Transfer Service job loads them into a BigQuery table schematized for that event type.

The intended usage of this module for recording events is something like this:

// Create a network with several regional subnets
module "networking" {
  source = "chainguard-dev/common/infra//modules/networking"

  name       = "my-networking"
  project_id = var.project_id
  regions    = [...]
}

// Create the Broker abstraction.
module "cloudevent-broker" {
  source = "chainguard-dev/common/infra//modules/cloudevent-broker"

  name       = "my-broker"
  project_id = var.project_id
  regions    = module.networking.regional-networks
}

// Record cloudevents of type com.example.foo and com.example.bar
module "foo-emits-events" {
  source = "chainguard-dev/common/infra//modules/cloudevent-recorder"

  name       = "my-recorder"
  project_id = var.project_id
  regions    = module.networking.regional-networks
  broker     = module.cloudevent-broker.broker

  retention-period = 30 // keep around 30 days' worth of event data

  provisioner = "user:sally@chainguard.dev"

  types = {
    "com.example.foo": {
      schema = file("${path.module}/foo.schema.json")
    }
    "com.example.bar": {
      schema = file("${path.module}/bar.schema.json")
    }
  }
}
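
Each schema value is the BigQuery table schema for the corresponding event type, read here from a JSON file. As a hedged sketch, assuming the value is a standard BigQuery JSON table schema and using hypothetical column names, a single entry could also be written inline with jsonencode:

// Sketch of one types entry with an inline schema, assuming the schema
// value is a standard BigQuery JSON table schema (an array of column
// definitions). The columns id, when, and payload are hypothetical.
types = {
  "com.example.foo": {
    schema = jsonencode([
      { name = "id",      type = "STRING",    mode = "REQUIRED" },
      { name = "when",    type = "TIMESTAMP", mode = "NULLABLE" },
      { name = "payload", type = "STRING",    mode = "NULLABLE" },
    ])
  }
}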

The default behavior of this module is to deploy a cloudevent trigger that consumes events from the broker and uses logrotate to write them to a GCS bucket. The GCS bucket is then used as the source for a BigQuery Data Transfer Service job.
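
Concretely, the load into BigQuery is a Data Transfer Service configuration (google_bigquery_data_transfer_config.import-job in the resources below). A simplified, hedged sketch of what such a GCS-to-BigQuery transfer looks like in raw Terraform, with hypothetical names, bucket, and table template (not the module's actual configuration):

// Illustrative only: the module provisions and manages its own transfer
// config. The names, dataset, bucket, and table template are hypothetical.
resource "google_bigquery_data_transfer_config" "example_import" {
  display_name           = "example-import"
  location               = "US"
  data_source_id         = "google_cloud_storage" // GCS -> BigQuery transfers
  schedule               = "every 15 minutes"
  destination_dataset_id = "example_dataset"

  params = {
    data_path_template              = "gs://example-recorder-bucket/com.example.foo/*"
    destination_table_name_template = "com_example_foo"
    file_format                     = "JSON" // newline-delimited JSON
  }
}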

To override this behavior, you can choose a different method. For example, to use the GCP-native Pub/Sub integration for writing directly to GCS:

module "foo-emits-events" {
  source = "chainguard-dev/common/infra//modules/cloudevent-recorder"
  method = "gcs"

  ...
}
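
With method = "gcs", events are written to Cloud Storage by Pub/Sub's native Cloud Storage subscriptions rather than by the recorder service; the cloud_storage_config_max_bytes and cloud_storage_config_max_duration inputs below control how often those objects are finalized. For orientation, a hedged sketch of that kind of subscription in raw Terraform, with hypothetical names (the module manages its own subscriptions):

// Illustrative only: the subscription, topic, and bucket names are hypothetical.
resource "google_pubsub_subscription" "example_gcs" {
  name  = "example-gcs-subscription"
  topic = "example-broker-topic"

  cloud_storage_config {
    bucket       = "example-recorder-bucket"
    max_bytes    = 1000000000 // roughly 1 GB per object
    max_duration = "300s"     // finalize an object at least every 5 minutes
  }
}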

Requirements

No requirements.

Providers

Name Version
google n/a
google-beta n/a
random n/a

Modules

Name Source Version
audit-import-serviceaccount ../audit-serviceaccount n/a
recorder-dashboard ../dashboard/cloudevent-receiver n/a
this ../regional-go-service n/a
triggers ../cloudevent-trigger n/a

Resources

Name Type
google-beta_google_project_service_identity.pubsub resource
google_bigquery_data_transfer_config.import-job resource
google_bigquery_dataset.this resource
google_bigquery_table.types resource
google_bigquery_table_iam_binding.import-writes-to-tables resource
google_monitoring_alert_policy.bq_dts resource
google_monitoring_alert_policy.bucket-access resource
google_pubsub_subscription.dead-letter-pull-sub resource
google_pubsub_subscription.this resource
google_pubsub_subscription_iam_binding.allow-pubsub-to-ack resource
google_pubsub_topic.dead-letter resource
google_pubsub_topic_iam_binding.allow-pubsub-to-send-to-dead-letter resource
google_service_account.import-identity resource
google_service_account.recorder resource
google_service_account_iam_binding.bq-dts-assumes-import-identity resource
google_service_account_iam_binding.provisioner-acts-as-import-identity resource
google_storage_bucket.recorder resource
google_storage_bucket_iam_binding.broker-writes-to-gcs-buckets resource
google_storage_bucket_iam_binding.import-reads-from-gcs-buckets resource
google_storage_bucket_iam_binding.recorder-writes-to-gcs-buckets resource
random_id.suffix resource
random_id.trigger-suffix resource
random_string.suffix resource
google_client_openid_userinfo.me data source
google_project.project data source

Inputs

ack_deadline_seconds - The number of seconds to acknowledge a message before it is redelivered. Type: number. Default: 300.
broker (required) - A map from each of the input region names to the name of the Broker topic in that region. Type: map(string).
cloud_storage_config_max_bytes - The maximum bytes that can be written to a Cloud Storage file before a new file is created. Min 1 KB, max 10 GiB. Type: number. Default: 1000000000.
cloud_storage_config_max_duration - The maximum duration that can elapse before a new Cloud Storage file is created. Min 1 minute, max 10 minutes. Type: number. Default: 300 (5 minutes).
deletion_protection - Whether to enable deletion protection on data resources. Type: bool. Default: true.
enable_profiler - Enable cloud profiler. Type: bool. Default: false.
ignore_unknown_values - Whether to ignore unknown values in the data when transferring data to BigQuery. Type: bool. Default: false.
limits - Resource limits for the regional go service. Type: object({ cpu = string, memory = string }). Default: null.
location - The location to create the BigQuery dataset in, and in which to run the data transfer jobs from GCS. Type: string. Default: "US".
max_delivery_attempts - The maximum number of delivery attempts for any event. Type: number. Default: 5.
maximum_backoff - The maximum delay between consecutive deliveries of a given message. Type: number. Default: 600.
method - The method used to transfer events (e.g. trigger, gcs). Type: string. Default: "trigger".
minimum_backoff - The minimum delay between consecutive deliveries of a given message. Type: number. Default: 10.
name (required) - Type: string.
notification_channels (required) - List of notification channels to alert (for service-level issues). Type: list(string).
project_id (required) - Type: string.
provisioner (required) - The identity as which this module will be applied (so it may be granted permission to 'act as' the DTS service account). This should be in the form expected by an IAM subject (e.g. user:sally@example.com). Type: string.
regions (required) - A map from region names to a network and subnetwork. A recorder service and cloud storage bucket (into which the service writes events) will be created in each region. Type: map(object({ network = string, subnet = string })).
retention-period (required) - The number of days to retain data in BigQuery. Type: number.
split_triggers - Opt-in flag to split into per-trigger dashboards. Helpful when hitting widget limits. Type: bool. Default: false.
types (required) - A map from cloudevent types to the BigQuery schema associated with them, as well as an alert threshold and a list of notification channels (for subscription-level issues). Type: map(object({ schema = string, alert_threshold = optional(number, 50000), notification_channels = optional(list(string), []), partition_field = optional(string) })).
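
For reference, a hedged sketch combining several of the optional inputs above in one invocation; the values are illustrative rather than recommendations, and the partition field name is hypothetical:

module "bar-emits-events" {
  source = "chainguard-dev/common/infra//modules/cloudevent-recorder"

  name                  = "my-recorder"
  project_id            = var.project_id
  regions               = module.networking.regional-networks
  broker                = module.cloudevent-broker.broker
  provisioner           = "user:sally@chainguard.dev"
  notification_channels = var.notification_channels

  retention-period = 90 // keep 90 days' worth of event data

  // Redelivery behavior for the underlying subscriptions.
  max_delivery_attempts = 10
  minimum_backoff       = 30
  maximum_backoff       = 600

  // Resource limits for the recorder service.
  limits = {
    cpu    = "1"
    memory = "512Mi"
  }

  types = {
    "com.example.bar": {
      schema                = file("${path.module}/bar.schema.json")
      alert_threshold       = 10000
      notification_channels = var.notification_channels
      partition_field       = "when" // hypothetical TIMESTAMP column in bar.schema.json
    }
  }
}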

Outputs

Name Description
dataset_id n/a
table_ids n/a
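
A hedged example of consuming these outputs elsewhere in the calling configuration, using the foo-emits-events module instance from above:

// Re-export the recorder's BigQuery identifiers so other configurations
// (or terraform output) can locate the recorded event data.
output "events_dataset_id" {
  value = module.foo-emits-events.dataset_id
}

output "events_table_ids" {
  value = module.foo-emits-events.table_ids
}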

Directories

Path Synopsis
cmd
