kafka-topic-cloner

command module
v0.7.1 Latest Latest
Warning

This package is not in the latest version of its module.

Go to latest
Published: Aug 21, 2018 License: Apache-2.0 Imports: 1 Imported by: 0

README

Kafka topic cloner

Kafka topic cloner is a CLI that clones the content of a topic into another one.

The cloner supports two different hashers for key/partition assignment:

  • Murmur2, which is the standard hasher used in the Java kafka community, including kafka scripts and kafka connect (default)
  • FNV-1a, which is the standard hasher used in a part of the Go kafka community (e.g. Sarama producers)

The CLI was written in Go using spf13/cobra.

Installation

Standard

It is strongly recommended that you use a released version. You can find the released binaries here.

wget https://github.com/ricardo-ch/kafka-topic-cloner/releases/download/v0.5.0/kafka-topic-cloner

This project has binaries for Linux(386), and since the 0.5.0 for Windows. If your platform is not supported, you can build the project manually, or open an issue asking us to add your platform to the supported ones. If you used an official release, there is no extra step required before you can start cloning!

Building from the sources

Note: the project was created using Go 1.10.1, so you will need to have it installed before proceeding with the next steps.

If you want to dig inside the code, or build an executable for your own platform, you can download the source code with go get.

go get -u github.com/ricardo-ch/kafka-topic-cloner

You then need to download the dependencies of the project. We are using dep as our dependency manager, so you will need to install it (instructions available on their own github repository). Once dep is installed, retrieve the source files of the dependencies:

dep ensure --update

You can then build an executable for your own using go build. If you want to build for another platform, you will need to set the GOOS and GOARCH environment variables to match the target system specifications.

Examples

Standard use

A standard use of the Kafka topic cloner would look like this:

kafka-topic-cloner --from-brokers localhost:9092 --from foo --to bar

This is going to consume every event from the foo topic and produce them inside the bar topic.

If you choose the same hasher that was used to populate the source topic, and that you have the same number of partitions in the source and target topics, the cloned topic will be an exact replica of the original one. If some events were mistakenly placed on the wrong partition (e.g. by manually producing them), the cloning would place them back on the right one.

By default, Kafka topic cloner will use Murmur2 as partitioning hasher. It is the algorithm that is used by the kafka scripts, kafka connect, and the majority of the Java kafka libraries (including the Stream API). If you prefer to use FNV-1a instead, which is the hasher implemented in Sarama (Golang's most popular kafka library), you can use the hasher parameter:

kafka-topic-cloner --brokers localhost:9092 --from foo --to bar --hasher FNV-1a

If you would like to see another hasher implemented, feel free to open an issue about this!

Cross-cluster cloning

You can clone a topic from a kafka cluster to a different one, by specifying the --to-cluster parameter:

kafka-topic-cloner --from-brokers localhost:9092 --to-brokers remote-cluster:9092 --from foo --to bar
Timing out

Technically, a Kafka topic has no definite end, but it is nice to know when the application is done cloning every available event in the source topic. To do so, Kafka topic cloner comes with a built-in timeout that will close the application when it was unable to clone any event for a certain amount of time. The default timeout delay is 10 seconds. You can override this delay by using the timeout parameter:

kafka-topic-cloner --brokers localhost:9092 --from foo --to bar --timeout 5000

As of today, there is no way to completely disable the timeout (to be implemented soon).

Loop-cloning

Loop-cloning, or same-topic cloning, is the action of cloning a topic into itself. Since it creates a continuous flow of new events inside the source topic, the cloning will never end and quickly multiply the number of events. Since this can be quite a dangerous action if done unintentionally, this action it protected by the --loop parameter. When loop-cloning, you should not specify the target topic, and the source topic will be used as target:

kafka-topic-cloner --from-brokers localhost:9092 --from foo --loop

Parameters

You can find the complete list of parameters below:

Argument Shorthand Description
from-brokers F Semicolon-separated list of the source kafka brokers
from f Source topic's name
to-brokers T Semicolon-separated list of the target kafka brokers, specify only for cross-clusters cloning
to t Destination topic's name
timeout o consumer timeout is ms (defaults to 10000)
hasher p name of the hasher to use for partitioning, possible values: murmur2 (default), FNV-1a
compression c name of the compression codec to use, possible values: none, gzip(default), snappy, lz4
loop L allow loop-cloning
verbose v verbose mode (defaults to false)
help h displays the CLI's help

Disclaimer

Kafka topic cloner can not be considered production-ready since it does not come with any test at the moment. Use it at your own risk! :)

Contributing

Contributions are always welcome and appreciated, the maintainers have a look at the new issues and consider every pull request.

Documentation

The Go Gopher

There is no documentation for this package.

Directories

Path Synopsis

Jump to

Keyboard shortcuts

? : This menu
/ : Search site
f or F : Jump to
y or Y : Canonical URL