Timberlake is a Job Tracker for Hadoop.
Intro
Timberlake is a Go server paired with a React.js frontend. It improves on
existing Hadoop job trackers by providing a lightweight realtime view of your
running and finished MapReduce jobs. Timberlake exposes the counters and
configuration that are the most useful, allowing you to get a quick overview of
the whole cluster or dig into the performance and behavior of a single job.
It also provides waterfall and boxplot visualizations for jobs. We've found that
these visualizations can be really helpful for figuring out why a job is slow.
Is it launching too many mappers and overloading the cluster? Are reducers
launching early and starving the mappers? Does the job have reducer skew?
You can use the counters of bytes written, shuffled, and read to understand the
network and I/O behavior of your jobs. And when there's a crash, Timberlake will
show you tracebacks from the logs to help you debug the job.
Timberlake pairs well with Scalding and Cascading. It uses extra data from the
Cascading planner to show the relationships between steps, and to clarify which
jobs' outputs are used as inputs to other jobs in the flow. Visualizing that
flow makes it much easier to figure out which steps are causing bottlenecks.
Finally, we've included a Slackbot that has significantly improved our Hadooping
lives. The bot can notify you when your jobs start and finish, and provides
links back to Timberlake.
Screenshots
[Screenshot: Job Details]
[Screenshot: List of Jobs]
Installation
The best way to install is with tarballs, which are available on the
release page.
Download the tarball somewhere on your server, and then untar it:
$ tar zxvf timberlake-v1.0.2-linux-amd64.tar.gz
$ mv -T timberlake-v1.0.2-linux-amd64 /opt/timberlake
Now you can start the server:
$ /opt/timberlake/bin/timberlake \
    --bind :8000 \
    --resource-manager-url http://resourcemanager:8088 \
    --history-server-url http://resourcemanager:19888 \
    --namenode-address namenode:9000
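If the server cannot reach Hadoop, it can help to confirm that the ResourceManager and History Server answer from this host before digging further. The following is a minimal sketch, not something shipped with Timberlake. It assumes curl is installed and reuses the example hostnames from the flags above. The helper names (info_url, check_url, probe) and the PREFLIGHT variable are made up for illustration. The two paths are the standard "info" resources of the YARN ResourceManager and MapReduce History Server REST APIs.

```shell
# Hypothetical preflight helpers; not part of Timberlake.

# Prints the REST "info" URL for a service ("rm" or "history") and its base URL.
info_url() {
  base="${2%/}"   # drop one trailing slash, if present
  case "$1" in
    rm)      printf '%s/ws/v1/cluster/info\n' "$base" ;;
    history) printf '%s/ws/v1/history/info\n' "$base" ;;
    *)       echo "unknown service: $1" >&2; return 1 ;;
  esac
}

# Returns 0 if the URL answers with an HTTP success status within 5 seconds.
check_url() {
  curl --silent --fail --max-time 5 --output /dev/null "$1"
}

# Prints one "ok" or "FAIL" line for the given service.
probe() {
  url=$(info_url "$1" "$2") || return 1
  if check_url "$url"; then echo "ok   $url"; else echo "FAIL $url"; fi
}

# Only touch the network when explicitly asked to.
if [ "${PREFLIGHT:-0}" = "1" ]; then
  probe rm http://resourcemanager:8088
  probe history http://resourcemanager:19888
fi
```

Saved as, say, preflight.sh, this could be run with `PREFLIGHT=1 sh preflight.sh`. A FAIL line points at a networking or hostname problem rather than at Timberlake itself.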
And optionally, start the Slackbot:
$ /opt/timberlake/bin/slack \
    --internal-timberlake-url http://localhost:8000 \
    --external-timberlake-url https://timberlake.example.com \
    --slack-url https://hooks.slack.com/services/...
You'll need to create a new Incoming Webhook to generate the Slack URL for your bot.
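Once you have a webhook URL, you may want to confirm it works before handing it to the bot. The sketch below is an assumption about how one might do that, not part of Timberlake. It relies on Slack's Incoming Webhooks accepting a JSON body with a "text" field. The slack_payload helper and the SLACK_URL variable are hypothetical names, and the escaping only covers single-line messages.

```shell
# Hypothetical smoke test for an Incoming Webhook URL; not part of Timberlake.

# Prints a minimal JSON payload, escaping backslashes and double quotes.
# Handles single-line messages only.
slack_payload() {
  escaped=$(printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g')
  printf '{"text": "%s"}\n' "$escaped"
}

# Posts only when SLACK_URL is set, so sourcing this file has no side effects.
if [ -n "${SLACK_URL:-}" ]; then
  curl --silent --show-error -X POST \
    -H 'Content-type: application/json' \
    --data "$(slack_payload 'Timberlake webhook test')" \
    "$SLACK_URL"
fi
```

If the message shows up in the channel, the same URL can be passed to --slack-url.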
Building from Source
You'll need npm, go, and node on your path.
$ git clone https://github.com/stripe/timberlake.git
$ cd timberlake
$ make
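If make fails early, a missing tool is a likely cause. As a small sketch (the missing_tools helper is a made-up name, not part of the repository), you could check the three tools listed above before building:

```shell
# Hypothetical helper: prints the name of each given tool that is not on PATH.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || echo "$tool"
  done
}

# Report anything missing from the tools this README lists.
missing=$(missing_tools npm go node)
if [ -n "$missing" ]; then
  echo "missing build tools:" $missing >&2
fi
```

No output means all three were found. The list can be extended with git and make, which the commands above also use.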
Limitations
Timberlake only works with the YARN Resource Manager API.
It's been tested on Hadoop v2.4.x and v2.5.x, but the Kill Job feature uses an
endpoint that's only available in v2.5.x and later.
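To tell in advance whether Kill Job can work on a given cluster, you could compare the cluster's Hadoop version against that threshold. This is only a sketch under stated assumptions: supports_kill_job is a hypothetical helper, it expects a version string that starts with numeric major and minor components (for example "2.5.1"), and it does not validate other input.

```shell
# Hypothetical helper: returns 0 when a Hadoop version string is at least 2.5.
# Assumes the string starts with "<major>.<minor>", e.g. "2.5.1".
supports_kill_job() {
  major=${1%%.*}      # text before the first dot
  rest=${1#*.}        # text after the first dot
  minor=${rest%%.*}   # text up to the next dot
  [ "$major" -gt 2 ] || { [ "$major" -eq 2 ] && [ "$minor" -ge 5 ]; }
}
```

On a host with the Hadoop client installed, something like `supports_kill_job "$(hadoop version | awk 'NR==1 {print $2}')"` should work, assuming the first line of `hadoop version` has the form "Hadoop 2.5.1".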
Our cluster has 10-40 jobs running simultaneously and about 2,000 jobs running
per day. Timberlake's performance has not been tested outside these bounds.