data-infra-benthos

module
v4.4.4 Latest Latest
Warning

This package is not in the latest version of its module.

Go to latest
Published: Jul 19, 2022 License: MIT

README

Benthos

godoc for benthosdev/benthos Build Status Discord invite Docs site

Benthos is a high performance and resilient stream processor, able to connect various sources and sinks in a range of brokering patterns and perform hydration, enrichments, transformations and filters on payloads.

It comes with a powerful mapping language, is easy to deploy and monitor, and ready to drop into your pipeline either as a static binary, docker image, or serverless function, making it cloud native as heck.

Benthos is fully declarative, with stream pipelines defined in a single config file, allowing you to specify connectors and a list of processing stages:

input:
  gcp_pubsub:
    project: foo
    subscription: bar

pipeline:
  processors:
    - bloblang: |
        root.message = this
        root.meta.link_count = this.links.length()
        root.user.age = this.user.age.number()

output:
  redis_streams:
    url: tcp://TODO:6379
    stream: baz
    max_in_flight: 20

Delivery Guarantees

Yep, we got 'em. Benthos implements transaction based resiliency with back pressure. When connecting to at-least-once sources and sinks it guarantees at-least-once delivery without needing to persist messages during transit.

Supported Sources & Sinks

Apache Pulsar, AWS (DynamoDB, Kinesis, S3, SQS, SNS), Azure (Blob storage, Queue storage, Table storage), Cassandra, Elasticsearch, File, GCP (Pub/Sub, Cloud storage), HDFS, HTTP (server and client, including websockets), Kafka, Memcached, MQTT, Nanomsg, NATS, NATS JetStream, NATS Streaming, NSQ, AMQP 0.91 (RabbitMQ), AMQP 1, Redis (streams, list, pubsub, hashes), MongoDB, SQL (MySQL, PostgreSQL, Clickhouse, MSSQL), Stdin/Stdout, TCP & UDP, sockets and ZMQ4.

Connectors are being added constantly, if something you want is missing then open an issue.

Documentation

If you want to dive fully into Benthos then don't waste your time in this dump, check out the documentation site.

For guidance on how to configure more advanced stream processing concepts such as stream joins, enrichment workflows, etc, check out the cookbooks section.

For guidance on building your own custom plugins in Go check out the public APIs.

Install

Grab a binary for your OS from here. Or use this script:

curl -Lsf https://sh.benthos.dev | bash

Or pull the docker image:

docker pull jeffail/benthos

Benthos can also be installed via Homebrew:

brew install benthos

For more information check out the getting started guide.

Run

benthos -c ./config.yaml

Or, with docker:

# Using a config file
docker run --rm -v /path/to/your/config.yaml:/benthos.yaml jeffail/benthos

# Using a series of -s flags
docker run --rm -p 4195:4195 jeffail/benthos \
  -s "input.type=http_server" \
  -s "output.type=kafka" \
  -s "output.kafka.addresses=kafka-server:9092" \
  -s "output.kafka.topic=benthos_topic"

Monitoring

Health Checks

Benthos serves two HTTP endpoints for health checks:

  • /ping can be used as a liveness probe as it always returns a 200.
  • /ready can be used as a readiness probe as it serves a 200 only when both the input and output are connected, otherwise a 503 is returned.

Metrics

Benthos exposes lots of metrics either to Statsd, Prometheus or for debugging purposes an HTTP endpoint that returns a JSON formatted object.

Tracing

Benthos also emits tracing events to a tracer of your choice (currently only Jaeger is supported) which can be used to visualise the processors within a pipeline.

Configuration

Benthos provides lots of tools for making configuration discovery, debugging and organisation easy. You can read about them here.

Build

Build with Go (1.16 or later):

git clone git@github.com:benthosdev/benthos
cd benthos
make

Lint

Benthos uses golangci-lint for linting, which you can install with:

curl -sSfL https://raw.githubusercontent.com/golangci/golangci-lint/master/install.sh | sh -s -- -b $(go env GOPATH)/bin

And then run it with make lint.

Plugins

It's pretty easy to write your own custom plugins for Benthos in Go, for information check out the API docs, and for inspiration there's an example repo demonstrating a variety of plugin implementations.

Extra Plugins

By default Benthos does not build with components that require linking to external libraries, such as the zmq4 input and outputs. If you wish to build Benthos locally with these dependencies then set the build tag x_benthos_extra:

# With go
go install -tags "x_benthos_extra" github.com/utilitywarehouse/data-infra-benthos/v4/cmd/benthos@latest

# Using make
make TAGS=x_benthos_extra

Note that this tag may change or be broken out into granular tags for individual components outside of major version releases. If you attempt a build and these dependencies are not present you'll see error messages such as ld: library not found for -lzmq.

Docker Builds

There's a multi-stage Dockerfile for creating a Benthos docker image which results in a minimal image from scratch. You can build it with:

make docker

Then use the image:

docker run --rm \
	-v /path/to/your/benthos.yaml:/config.yaml \
	-v /tmp/data:/data \
	-p 4195:4195 \
	benthos -c /config.yaml

Contributing

Contributions are welcome, please read the guidelines, come and chat (links are on the community page), and watch your back.

Directories

Path Synopsis
cmd
internal
api
Package api implements a type used for creating the Benthos HTTP API.
Package api implements a type used for creating the Benthos HTTP API.
batch
Package batch contains internal utilities for interacting with message batches.
Package batch contains internal utilities for interacting with message batches.
batch/policy
Package policy provides tooling for creating and executing Benthos message batch policies.
Package policy provides tooling for creating and executing Benthos message batch policies.
bloblang/field
Package field implements a bloblang interpolation function templating syntax used in some dynamic fields within Benthos.
Package field implements a bloblang interpolation function templating syntax used in some dynamic fields within Benthos.
bloblang/mapping
Package mapping provides a parser for the full bloblang mapping spec.
Package mapping provides a parser for the full bloblang mapping spec.
bloblang/query
Package query provides a parser for the right-hand side query part of the bloblang spec.
Package query provides a parser for the right-hand side query part of the bloblang spec.
bundle
Package bundle contains singletons referenced throughout the Benthos codebase that allow imported components to add their constructors and documentation to a service.
Package bundle contains singletons referenced throughout the Benthos codebase that allow imported components to add their constructors and documentation to a service.
checkpoint
Package checkpoint implements a mechanism for tracking checkpointed integer offsets for sequential read at-least-once queue systems such as Kafka or Kinesis.
Package checkpoint implements a mechanism for tracking checkpointed integer offsets for sequential read at-least-once queue systems such as Kafka or Kinesis.
cli
cli/test
Package test implements the Benthos service unit testing command.
Package test implements the Benthos service unit testing command.
docs
Package docs provides useful functions for creating documentation from Benthos components
Package docs provides useful functions for creating documentation from Benthos components
impl/amqp1
Package amqp1 will eventually contain all implementations of AMQP 1 components (that are currently within ./internal/old)
Package amqp1 will eventually contain all implementations of AMQP 1 components (that are currently within ./internal/old)
impl/amqp1/shared
Package shared contains docs fields that need to be shared across old and new component implementations, it needs to be separate from the parent package in order to avoid circular dependencies (for now).
Package shared contains docs fields that need to be shared across old and new component implementations, it needs to be separate from the parent package in order to avoid circular dependencies (for now).
impl/azure
Package azure will eventually contain all implementations of Azure components (that are currently within ./internal/old)
Package azure will eventually contain all implementations of Azure components (that are currently within ./internal/old)
impl/azure/shared
Package shared contains docs fields that need to be shared across old and new component implementations, it needs to be separate from the parent package in order to avoid circular dependencies (for now).
Package shared contains docs fields that need to be shared across old and new component implementations, it needs to be separate from the parent package in order to avoid circular dependencies (for now).
impl/io
Package io contains component implementations that have a small dependency footprint (mostly standard library) and interact with external systems via the filesystem and/or network sockets.
Package io contains component implementations that have a small dependency footprint (mostly standard library) and interact with external systems via the filesystem and/or network sockets.
impl/mqtt
Package mqtt will eventually contain all implementations of MQTT components (that are currently within ./internal/old)
Package mqtt will eventually contain all implementations of MQTT components (that are currently within ./internal/old)
impl/mqtt/shared
Package shared contains docs fields that need to be shared across old and new component implementations, it needs to be separate from the parent package in order to avoid circular dependencies (for now).
Package shared contains docs fields that need to be shared across old and new component implementations, it needs to be separate from the parent package in order to avoid circular dependencies (for now).
impl/pure
Package pure contains all component implementations that are pure, in that they do not interact with external systems.
Package pure contains all component implementations that are pure, in that they do not interact with external systems.
impl/sftp
Package sftp will eventually contain all implementations of SFTP components (that are currently within ./internal/old)
Package sftp will eventually contain all implementations of SFTP components (that are currently within ./internal/old)
impl/sftp/shared
Package shared contains docs fields that need to be shared across old and new component implementations, it needs to be separate from the parent package in order to avoid circular dependencies (for now).
Package shared contains docs fields that need to be shared across old and new component implementations, it needs to be separate from the parent package in order to avoid circular dependencies (for now).
impl/xml
Package xml is a temporary way to convert XML to JSON.
Package xml is a temporary way to convert XML to JSON.
log
manager
Package manager implements the types.Manager interface used for creating and sharing resources across a Benthos service.
Package manager implements the types.Manager interface used for creating and sharing resources across a Benthos service.
old/util/retries
Package retries implements backoff strategies around a standard configuration scheme.
Package retries implements backoff strategies around a standard configuration scheme.
old/util/throttle
Package throttle implements throttle strategies.
Package throttle implements throttle strategies.
pipeline
Package pipeline contains structures that implement both the Producer and Consumer interfaces.
Package pipeline contains structures that implement both the Producer and Consumer interfaces.
serverless
Package serverless contains shared components for serverless distributions of Benthos.
Package serverless contains shared components for serverless distributions of Benthos.
serverless/lambda
Package lambda contains the execution logic for running Benthos as an AWS lambda function.
Package lambda contains the execution logic for running Benthos as an AWS lambda function.
stream/manager
Package manager creates and manages multiple streams, providing an API for performing CRUD operations.
Package manager creates and manages multiple streams, providing an API for performing CRUD operations.
tls
Package tls provides Benthos configuration fields and wrappers for a crypto/tls config.
Package tls provides Benthos configuration fields and wrappers for a crypto/tls config.
tracing
Package tracing implements utility functions for interacting with a global tracing system.
Package tracing implements utility functions for interacting with a global tracing system.
public
bloblang
Package bloblang provides high level APIs for registering custom Bloblang plugins, as well as for parsing and executing Bloblang mappings.
Package bloblang provides high level APIs for registering custom Bloblang plugins, as well as for parsing and executing Bloblang mappings.
components/all
Package all imports all component implementations that ship with the open source Benthos repo.
Package all imports all component implementations that ship with the open source Benthos repo.
components/io
Package io contains component implementations that have a small dependency footprint (mostly standard library) and interact with external systems via the filesystem and/or network sockets.
Package io contains component implementations that have a small dependency footprint (mostly standard library) and interact with external systems via the filesystem and/or network sockets.
components/pure
Package pure imports all component implementations that are pure, in that they do not interact with external systems.
Package pure imports all component implementations that are pure, in that they do not interact with external systems.
service
Package service provides a high level API for registering custom plugin components and executing either a standard Benthos CLI, or programmatically building isolated pipelines with a StreamBuilder API.
Package service provides a high level API for registering custom plugin components and executing either a standard Benthos CLI, or programmatically building isolated pipelines with a StreamBuilder API.

Jump to

Keyboard shortcuts

? : This menu
/ : Search site
f or F : Jump to
y or Y : Canonical URL