README
Chaos Monkey randomly terminates virtual machine instances and containers that run inside of your production environment. Exposing engineers to failures more frequently incentivizes them to build resilient services.
Chaos Monkey is an example of a tool that follows the Principles of Chaos Engineering.
Requirements
This version of Chaos Monkey is fully integrated with Spinnaker, the continuous delivery platform that we use at Netflix. You must be managing your apps with Spinnaker to use Chaos Monkey to terminate instances.
Chaos Monkey should work with any backend that Spinnaker supports (AWS, GCP, Azure, Kubernetes, Cloud Foundry). It has been tested with AWS and Kubernetes.
Install locally
To install the Chaos Monkey binary on your local machine:
go install github.com/Netflix/chaosmonkey/bin/chaosmonkey
How to deploy
See the wiki for instructions on how to configure and deploy Chaos Monkey.
Support
Documentation
Overview
Package chaosmonkey contains our domain models.
Types
type AppConfig
type AppConfig struct {
    Enabled                        bool
    RegionsAreIndependent          bool
    MeanTimeBetweenKillsInWorkDays int
    MinTimeBetweenKillsInWorkDays  int
    Grouping                       Group
    Exceptions                     []Exception
    Whitelist                      *[]Exception
}
AppConfig contains app-specific configuration parameters for Chaos Monkey.
func NewAppConfig
NewAppConfig constructs a new app configuration with reasonable defaults, with the specified accounts enabled or disabled.
type AppConfigGetter
type AppConfigGetter interface {
    // Get returns the App config info by app name
    Get(app string) (*AppConfig, error)
}
AppConfigGetter retrieves App configuration info.
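One way to satisfy this interface in a test is a map-backed getter. The following is a minimal sketch, not the project's implementation: `AppConfig` is a trimmed local copy of the documented struct, `mapGetter` is a hypothetical name, and the field values are illustrative rather than the real defaults.

```go
package main

import "fmt"

// AppConfig is a trimmed local copy of the documented struct; only the
// fields this sketch needs are kept.
type AppConfig struct {
	Enabled                        bool
	RegionsAreIndependent          bool
	MeanTimeBetweenKillsInWorkDays int
	MinTimeBetweenKillsInWorkDays  int
}

// AppConfigGetter mirrors the documented interface.
type AppConfigGetter interface {
	Get(app string) (*AppConfig, error)
}

// mapGetter is a hypothetical in-memory implementation keyed by app name.
type mapGetter map[string]AppConfig

// Get returns a copy of the stored config, or an error for unknown apps.
func (m mapGetter) Get(app string) (*AppConfig, error) {
	cfg, ok := m[app]
	if !ok {
		return nil, fmt.Errorf("no config for app %q", app)
	}
	return &cfg, nil
}

func main() {
	// Illustrative values only; these are not the defaults NewAppConfig uses.
	var getter AppConfigGetter = mapGetter{
		"myapp": {Enabled: true, MeanTimeBetweenKillsInWorkDays: 5, MinTimeBetweenKillsInWorkDays: 1},
	}
	cfg, err := getter.Get("myapp")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("enabled=%v mean=%d min=%d\n",
		cfg.Enabled, cfg.MeanTimeBetweenKillsInWorkDays, cfg.MinTimeBetweenKillsInWorkDays)
}
```

Returning a pointer to a copy keeps callers from mutating the stored configuration by accident.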
type Checker
type Checker interface {
    // Check checks if a termination is permitted and, if so, records the
    // termination time on the server.
    // The endHour (hour time when Chaos Monkey stops killing) is in the
    // time zone specified by loc.
    Check(term Termination, appCfg AppConfig, endHour int, loc *time.Location) error
}
Checker checks whether a termination is permitted given the minimum time between terminations.
If the termination is permitted, Check returns nil; otherwise, it returns an error. It returns ErrViolatesMinTime if the termination would violate the minimum time between terminations.
Note that this call may change the state of the server: if the check passes, the termination is recorded.
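The check-and-record behavior can be illustrated with an in-memory stand-in. This is only a sketch under loud simplifications: the types are trimmed local copies, `memChecker`, `newMemChecker`, `inst`, and `demo` are hypothetical names, one work day is treated as a flat 24 hours, and `endHour` is ignored. The real checker presumably handles work days and the end hour properly.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// Instance is reduced to the two methods this sketch needs.
type Instance interface {
	AppName() string
	ID() string
}

// Termination and AppConfig are trimmed local copies of the documented types.
type Termination struct {
	Instance Instance
	Time     time.Time
}

type AppConfig struct{ MinTimeBetweenKillsInWorkDays int }

// ErrViolatesMinTime is a local copy of the documented error type.
type ErrViolatesMinTime struct {
	InstanceID string
	KilledAt   time.Time
	Loc        *time.Location
}

// Error uses a hypothetical message format.
func (e ErrViolatesMinTime) Error() string {
	return fmt.Sprintf("min time violated: %s was killed at %s",
		e.InstanceID, e.KilledAt.Format(time.RFC3339))
}

// memChecker is a hypothetical in-memory checker keyed by app name.
type memChecker struct {
	mu   sync.Mutex
	last map[string]Termination
}

func newMemChecker() *memChecker {
	return &memChecker{last: map[string]Termination{}}
}

// Check permits and records the termination unless the app's previous
// termination is too recent. Assumption: a work day is 24 hours and
// endHour is unused.
func (c *memChecker) Check(term Termination, appCfg AppConfig, endHour int, loc *time.Location) error {
	c.mu.Lock()
	defer c.mu.Unlock()
	minGap := time.Duration(appCfg.MinTimeBetweenKillsInWorkDays) * 24 * time.Hour
	app := term.Instance.AppName()
	if prev, ok := c.last[app]; ok && term.Time.Sub(prev.Time) < minGap {
		return ErrViolatesMinTime{InstanceID: prev.Instance.ID(), KilledAt: prev.Time, Loc: loc}
	}
	c.last[app] = term
	return nil
}

// inst is a hypothetical Instance used only for the demo.
type inst struct{ app, id string }

func (i inst) AppName() string { return i.app }
func (i inst) ID() string      { return i.id }

// demo checks two terminations of the same app two hours apart.
func demo() (first, second error) {
	c := newMemChecker()
	cfg := AppConfig{MinTimeBetweenKillsInWorkDays: 1}
	t0 := time.Date(2024, 1, 2, 10, 0, 0, 0, time.UTC)
	first = c.Check(Termination{Instance: inst{"myapp", "i-aaa"}, Time: t0}, cfg, 15, time.UTC)
	second = c.Check(Termination{Instance: inst{"myapp", "i-bbb"}, Time: t0.Add(2 * time.Hour)}, cfg, 15, time.UTC)
	return first, second
}

func main() {
	first, second := demo()
	fmt.Println("first:", first)
	fmt.Println("second:", second)
}
```

In the demo, the first check passes and is recorded, so the second check two hours later is rejected with an ErrViolatesMinTime naming the first instance.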
type Decryptor
Decryptor decrypts encrypted text. It is used for decrypting sensitive credentials that are stored encrypted.
type Env
type Env interface {
    // InTest returns true if Chaos Monkey is running in a test environment
    InTest() bool
}
Env provides information about the environment that Chaos Monkey has been deployed to.
type ErrViolatesMinTime
type ErrViolatesMinTime struct {
    InstanceID string         // the most recent terminated instance id
    KilledAt   time.Time      // the time that the most recent instance was terminated
    Loc        *time.Location // local time zone location
}
ErrViolatesMinTime represents an error when trying to record a termination that violates the min time between terminations for that particular app.
func (ErrViolatesMinTime) Error
func (e ErrViolatesMinTime) Error() string
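Because ErrViolatesMinTime is a struct with a value-receiver Error method, a caller can pick it out of a wrapped error chain with the standard library's `errors.As`. The sketch below assumes a local copy of the type; the message format, `isMinTimeViolation`, and `sampleWrapped` are hypothetical and not part of the package.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// ErrViolatesMinTime is a local copy of the documented struct.
type ErrViolatesMinTime struct {
	InstanceID string
	KilledAt   time.Time
	Loc        *time.Location
}

// Error uses a hypothetical message format; the real wording may differ.
func (e ErrViolatesMinTime) Error() string {
	loc := e.Loc
	if loc == nil {
		loc = time.UTC // avoid a panic in Time.In when no location is set
	}
	return fmt.Sprintf("would violate min time between kills: %s was killed at %s",
		e.InstanceID, e.KilledAt.In(loc).Format(time.RFC3339))
}

// isMinTimeViolation reports whether err is, or wraps, an ErrViolatesMinTime.
func isMinTimeViolation(err error) (ErrViolatesMinTime, bool) {
	var v ErrViolatesMinTime
	ok := errors.As(err, &v)
	return v, ok
}

// sampleWrapped builds a wrapped violation for demonstration purposes.
func sampleWrapped() error {
	base := ErrViolatesMinTime{
		InstanceID: "i-dbcba24c",
		KilledAt:   time.Date(2024, 1, 2, 15, 0, 0, 0, time.UTC),
		Loc:        time.UTC,
	}
	return fmt.Errorf("check failed: %w", base)
}

func main() {
	if v, ok := isMinTimeViolation(sampleWrapped()); ok {
		fmt.Println("skipping termination; last kill was", v.InstanceID)
	}
}
```

Matching on the type rather than the message text keeps the caller working even if the error wording changes.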
type ErrorCounter
type ErrorCounter interface {
    Increment() error
}
ErrorCounter counts when errors occur.
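For local runs or tests, an ErrorCounter could be as small as an atomic integer. This is a sketch only: `memCounter`, its `Count` method, and `countConcurrently` are hypothetical, and a production counter would more likely report to a metrics system.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// ErrorCounter mirrors the documented interface.
type ErrorCounter interface {
	Increment() error
}

// memCounter is a hypothetical in-memory counter that is safe for concurrent use.
type memCounter struct{ n int64 }

// Increment adds one to the counter; this implementation never fails.
func (c *memCounter) Increment() error {
	atomic.AddInt64(&c.n, 1)
	return nil
}

// Count returns the current value (not part of the documented interface).
func (c *memCounter) Count() int64 { return atomic.LoadInt64(&c.n) }

// countConcurrently increments one counter from several goroutines
// and returns the final count.
func countConcurrently(workers int) int64 {
	c := &memCounter{}
	var ec ErrorCounter = c
	var wg sync.WaitGroup
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			_ = ec.Increment()
		}()
	}
	wg.Wait()
	return c.Count()
}

func main() {
	fmt.Println(countConcurrently(100)) // 100
}
```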
type Exception
Exception describes clusters that have been opted out of Chaos Monkey. If one of the members is "*", it matches everything; "*" is the only wildcard value. For example, this opts out all of the clusters in the test account:
Exception{Account: "test", Stack: "*", Cluster: "*", Region: "*"}
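The wildcard rule can be sketched as a field-by-field comparison. This is an illustration, not the package's matching code: the local `Exception` type carries only the four fields that appear in the example above (the real struct's field set is an assumption), and `fieldMatches` and `matches` are hypothetical helpers.

```go
package main

import "fmt"

// Exception is a local stand-in using the four fields shown in the
// documented example; the real type may differ.
type Exception struct {
	Account, Stack, Cluster, Region string
}

// fieldMatches treats "*" as the only wildcard, as the documentation describes.
func fieldMatches(pattern, value string) bool {
	return pattern == "*" || pattern == value
}

// matches is a hypothetical helper that reports whether the exception
// covers the given account, stack, cluster, and region.
func (e Exception) matches(account, stack, cluster, region string) bool {
	return fieldMatches(e.Account, account) &&
		fieldMatches(e.Stack, stack) &&
		fieldMatches(e.Cluster, cluster) &&
		fieldMatches(e.Region, region)
}

func main() {
	optOut := Exception{Account: "test", Stack: "*", Cluster: "*", Region: "*"}
	fmt.Println(optOut.matches("test", "staging", "myapp-staging", "us-east-1")) // true
	fmt.Println(optOut.matches("prod", "staging", "myapp-staging", "us-east-1")) // false
}
```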
type Group
type Group int
Group describes what Chaos Monkey considers a group of instances. Chaos Monkey will randomly kill an instance from each group. The group generally maps onto what the service owner considers a "cluster", which is different from Spinnaker's notion of a cluster.
type Instance
type Instance interface {
    // AppName is the name of the Netflix app
    AppName() string

    // AccountName is the name of the account the instance is running in (e.g., prod, test)
    AccountName() string

    // RegionName is the name of the AWS region (e.g., us-east-1)
    RegionName() string

    // StackName returns the "stack" part of app-stack-detail in cluster names
    StackName() string

    // ClusterName is the full cluster name: app-stack-detail
    ClusterName() string

    // ASGName is the name of the ASG associated with the instance
    ASGName() string

    // ID is the instance ID, e.g. i-dbcba24c
    ID() string

    // CloudProvider returns the cloud provider (e.g., "aws")
    CloudProvider() string
}
Instance contains naming info about an instance.
type Outage
type Outage interface {
    // Outage returns true if there is an ongoing outage
    Outage() (bool, error)
}
Outage provides an interface for checking whether there is currently an outage. Chaos Monkey uses this check because it doesn't run during outages.
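A no-op implementation, plus a guard that a scheduler might call before terminating anything, could look like the sketch below. The names `nullOutage`, `alwaysDown`, and `shouldRun` are hypothetical, and treating a failed outage check as "do not run" is a conservative assumption rather than documented behavior.

```go
package main

import "fmt"

// Outage mirrors the documented interface.
type Outage interface {
	Outage() (bool, error)
}

// nullOutage is a hypothetical no-op implementation that never reports an outage.
type nullOutage struct{}

func (nullOutage) Outage() (bool, error) { return false, nil }

// alwaysDown is a hypothetical implementation that always reports an outage.
type alwaysDown struct{}

func (alwaysDown) Outage() (bool, error) { return true, nil }

// shouldRun reports whether terminations may proceed. Assumption: if the
// outage check itself fails, the safe choice is not to run.
func shouldRun(o Outage) (bool, error) {
	down, err := o.Outage()
	if err != nil {
		return false, fmt.Errorf("checking outage: %w", err)
	}
	return !down, nil
}

func main() {
	ok, _ := shouldRun(nullOutage{})
	fmt.Println("no-op outage, may run:", ok) // true
	ok, _ = shouldRun(alwaysDown{})
	fmt.Println("ongoing outage, may run:", ok) // false
}
```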
type Termination
type Termination struct {
    Instance Instance  // The instance that will be terminated
    Time     time.Time // Termination time
    Leashed  bool      // If true, track the termination but do not execute it
}
Termination contains information about an instance termination.
type Terminator
type Terminator interface {
    // Execute terminates a running instance
    Execute(trm Termination) error
}
Terminator provides an interface for killing instances.
type Tracker
type Tracker interface {
    // Track pushes a termination event to the tracking system
    Track(t Termination) error
}
Tracker records termination events in a tracking system such as Chronos.
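Termination, Terminator, and Tracker fit together around the Leashed flag: a leashed termination is tracked but not executed. The sketch below shows one way that flow might look. The types are trimmed local copies, `recorder`, `fakeInstance`, `terminate`, and `demo` are hypothetical, and the track-then-execute ordering is an assumption of this sketch rather than a documented guarantee.

```go
package main

import (
	"fmt"
	"time"
)

// Instance is reduced to the one method this sketch needs.
type Instance interface{ ID() string }

// Termination is a local copy of the documented struct.
type Termination struct {
	Instance Instance
	Time     time.Time
	Leashed  bool
}

// Terminator and Tracker mirror the documented interfaces.
type Terminator interface{ Execute(trm Termination) error }
type Tracker interface{ Track(t Termination) error }

// fakeInstance is a hypothetical Instance backed by its ID string.
type fakeInstance string

func (f fakeInstance) ID() string { return string(f) }

// recorder is a hypothetical stand-in that implements both interfaces
// by remembering the instance IDs it was asked to track or kill.
type recorder struct{ tracked, killed []string }

func (r *recorder) Track(trm Termination) error {
	r.tracked = append(r.tracked, trm.Instance.ID())
	return nil
}

func (r *recorder) Execute(trm Termination) error {
	r.killed = append(r.killed, trm.Instance.ID())
	return nil
}

// terminate tracks every termination and executes it only when unleashed.
func terminate(trm Termination, trackers []Tracker, killer Terminator) error {
	for _, tr := range trackers {
		if err := tr.Track(trm); err != nil {
			return fmt.Errorf("tracking %s: %w", trm.Instance.ID(), err)
		}
	}
	if trm.Leashed {
		return nil // leashed: recorded, but the instance is left running
	}
	return killer.Execute(trm)
}

// demo runs one leashed and one unleashed termination and returns the counts.
func demo() (tracked, killed int) {
	r := &recorder{}
	now := time.Now()
	_ = terminate(Termination{Instance: fakeInstance("i-aaa"), Time: now, Leashed: true}, []Tracker{r}, r)
	_ = terminate(Termination{Instance: fakeInstance("i-bbb"), Time: now}, []Tracker{r}, r)
	return len(r.tracked), len(r.killed)
}

func main() {
	tracked, killed := demo()
	fmt.Printf("tracked=%d killed=%d\n", tracked, killed) // tracked=2 killed=1
}
```

Running leashed first is a low-risk way to see which instances would be selected before any are actually terminated.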
Directories
Path | Synopsis |
---|---|
bin | |
chaosmonkey | Chaos Monkey randomly terminates instances. |
cal | Package cal has calendar-related functions |
clock | Package clock provides the Clock interface for getting the current time |
command | Package command contains functions that can be invoked via command-line e.g. |
config | Package config exposes configuration information |
deploy | Package deploy contains information about all of the deployed instances, and how they are organized across accounts, apps, regions, clusters, and autoscaling groups. |
deps | Package deps holds a set of interfaces |
env | Package env contains a no-op implementation of chaosmonkey.Env where InTest() always returns false |
grp | Package grp holds the InstanceGroup interface |
mock | Package mock contains helper functions for generating mock objects for testing |
outage | Package outage provides a default no-op outage implementation |
schedule | Package schedule implements a schedule of terminations |
spinnaker | Package spinnaker provides an interface to the Spinnaker API |
term | Package term contains the logic for terminating instances |
tracker | Package tracker provides an entry point for instantiating Trackers |