serf

package
v0.8.0 Latest Latest
Warning

This package is not in the latest version of its module.

Go to latest
Published: Sep 14, 2016 License: MPL-2.0 Imports: 23 Imported by: 0

Documentation

Index

Constants

View Source
const (
	ProtocolVersionMin uint8 = 2
	ProtocolVersionMax       = 4
)

These are the protocol versions that Serf can _understand_. These are Serf-level protocol versions that are passed down as the delegate version to memberlist below.

View Source
const (
	// This is the prefix we use for queries that are internal to Serf.
	// They are handled internally, and not forwarded to a client.
	InternalQueryPrefix = "_serf_"
)
View Source
const (
	// PingVersion is an internal version for the ping message, above the normal
	// versioning we get from the protocol version. This enables small updates
	// to the ping message without a full protocol bump.
	PingVersion = 1
)
View Source
const (
	UserEventSizeLimit = 512 // Maximum byte size for event name and payload

)

Variables

View Source
var (
	// FeatureNotSupported is returned if a feature cannot be used
	// due to an older protocol version being used.
	FeatureNotSupported = fmt.Errorf("Feature not supported")
)
View Source
var ProtocolVersionMap map[uint8]uint8

ProtocolVersionMap is the mapping of Serf delegate protocol versions to memberlist protocol versions. We mask the memberlist protocols using our own protocol version.

Functions

This section is empty.

Types

type Config

type Config struct {
	// The name of this node. This must be unique in the cluster. If this
	// is not set, Serf will set it to the hostname of the running machine.
	NodeName string

	// The tags for this role, if any. This is used to provide arbitrary
	// key/value metadata per-node. For example, a "role" tag may be used to
	// differentiate "load-balancer" from a "web" role as parts of the same cluster.
	// Tags are deprecating 'Role', and instead it acts as a special key in this
	// map.
	Tags map[string]string

	// EventCh is a channel that receives all the Serf events. The events
	// are sent on this channel in proper ordering. Care must be taken that
	// this channel doesn't block, either by processing the events quick
	// enough or buffering the channel, otherwise it can block state updates
	// within Serf itself. If no EventCh is specified, no events will be fired,
	// but point-in-time snapshots of members can still be retrieved by
	// calling Members on Serf.
	EventCh chan<- Event

	// ProtocolVersion is the protocol version to speak. This must be between
	// ProtocolVersionMin and ProtocolVersionMax.
	ProtocolVersion uint8

	// BroadcastTimeout is the amount of time to wait for a broadcast
	// message to be sent to the cluster. Broadcast messages are used for
	// things like leave messages and force remove messages. If this is not
	// set, a timeout of 5 seconds will be set.
	BroadcastTimeout time.Duration

	// The settings below relate to Serf's event coalescence feature. Serf
	// is able to coalesce multiple events into single events in order to
	// reduce the amount of noise that is sent along the EventCh. For example
	// if five nodes quickly join, the EventCh will be sent one EventMemberJoin
	// containing the five nodes rather than five individual EventMemberJoin
	// events. Coalescence can mitigate potential flapping behavior.
	//
	// Coalescence is disabled by default and can be enabled by setting
	// CoalescePeriod.
	//
	// CoalescePeriod specifies the time duration to coalesce events.
	// For example, if this is set to 5 seconds, then all events received
	// within 5 seconds that can be coalesced will be.
	//
	// QuiescentPeriod specifies the duration of time where if no events
	// are received, coalescence immediately happens. For example, if
	// CoalscePeriod is set to 10 seconds but QuiscentPeriod is set to 2
	// seconds, then the events will be coalesced and dispatched if no
	// new events are received within 2 seconds of the last event. Otherwise,
	// every event will always be delayed by at least 10 seconds.
	CoalescePeriod  time.Duration
	QuiescentPeriod time.Duration

	// The settings below relate to Serf's user event coalescing feature.
	// The settings operate like above but only affect user messages and
	// not the Member* messages that Serf generates.
	UserCoalescePeriod  time.Duration
	UserQuiescentPeriod time.Duration

	// The settings below relate to Serf keeping track of recently
	// failed/left nodes and attempting reconnects.
	//
	// ReapInterval is the interval when the reaper runs. If this is not
	// set (it is zero), it will be set to a reasonable default.
	//
	// ReconnectInterval is the interval when we attempt to reconnect
	// to failed nodes. If this is not set (it is zero), it will be set
	// to a reasonable default.
	//
	// ReconnectTimeout is the amount of time to attempt to reconnect to
	// a failed node before giving up and considering it completely gone.
	//
	// TombstoneTimeout is the amount of time to keep around nodes
	// that gracefully left as tombstones for syncing state with other
	// Serf nodes.
	ReapInterval      time.Duration
	ReconnectInterval time.Duration
	ReconnectTimeout  time.Duration
	TombstoneTimeout  time.Duration

	// FlapTimeout is the amount of time less than which we consider a node
	// being failed and rejoining looks like a flap for telemetry purposes.
	// This should be set less than a typical reboot time, but large enough
	// to see actual events, given our expected detection times for a failed
	// node.
	FlapTimeout time.Duration

	// QueueDepthWarning is used to generate warning message if the
	// number of queued messages to broadcast exceeds this number. This
	// is to provide the user feedback if events are being triggered
	// faster than they can be disseminated
	QueueDepthWarning int

	// MaxQueueDepth is used to start dropping messages if the number
	// of queued messages to broadcast exceeds this number. This is to
	// prevent an unbounded growth of memory utilization
	MaxQueueDepth int

	// RecentIntentTimeout is used to determine how long we store recent
	// join and leave intents. This is used to guard against the case where
	// Serf broadcasts an intent that arrives before the Memberlist event.
	// It is important that this not be too short to avoid continuous
	// rebroadcasting of dead events.
	RecentIntentTimeout time.Duration

	// EventBuffer is used to control how many events are buffered.
	// This is used to prevent re-delivery of events to a client. The buffer
	// must be large enough to handle all "recent" events, since Serf will
	// not deliver messages that are older than the oldest entry in the buffer.
	// Thus if a client is generating too many events, it's possible that the
	// buffer gets overrun and messages are not delivered.
	EventBuffer int

	// QueryBuffer is used to control how many queries are buffered.
	// This is used to prevent re-delivery of queries to a client. The buffer
	// must be large enough to handle all "recent" events, since Serf will not
	// deliver queries older than the oldest entry in the buffer.
	// Thus if a client is generating too many queries, it's possible that the
	// buffer gets overrun and messages are not delivered.
	QueryBuffer int

	// QueryTimeoutMult configures the default timeout multipler for a query to run if no
	// specific value is provided. Queries are real-time by nature, where the
	// reply is time sensitive. As a result, results are collected in an async
	// fashion, however the query must have a bounded duration. We want the timeout
	// to be long enough that all nodes have time to receive the message, run a handler,
	// and generate a reply. Once the timeout is exceeded, any further replies are ignored.
	// The default value is
	//
	// Timeout = GossipInterval * QueryTimeoutMult * log(N+1)
	//
	QueryTimeoutMult int

	// QueryResponseSizeLimit and QuerySizeLimit limit the inbound and
	// outbound payload sizes for queries, respectively. These must fit
	// in a UDP packet with some additional overhead, so tuning these
	// past the default values of 1024 will depend on your network
	// configuration.
	QueryResponseSizeLimit int
	QuerySizeLimit         int

	// MemberlistConfig is the memberlist configuration that Serf will
	// use to do the underlying membership management and gossip. Some
	// fields in the MemberlistConfig will be overwritten by Serf no
	// matter what:
	//
	//   * Name - This will always be set to the same as the NodeName
	//     in this configuration.
	//
	//   * Events - Serf uses a custom event delegate.
	//
	//   * Delegate - Serf uses a custom delegate.
	//
	MemberlistConfig *memberlist.Config

	// LogOutput is the location to write logs to. If this is not set,
	// logs will go to stderr.
	LogOutput io.Writer

	// SnapshotPath if provided is used to snapshot live nodes as well
	// as lamport clock values. When Serf is started with a snapshot,
	// it will attempt to join all the previously known nodes until one
	// succeeds and will also avoid replaying old user events.
	SnapshotPath string

	// RejoinAfterLeave controls our interaction with the snapshot file.
	// When set to false (default), a leave causes a Serf to not rejoin
	// the cluster until an explicit join is received. If this is set to
	// true, we ignore the leave, and rejoin the cluster on start.
	RejoinAfterLeave bool

	// EnableNameConflictResolution controls if Serf will actively attempt
	// to resolve a name conflict. Since each Serf member must have a unique
	// name, a cluster can run into issues if multiple nodes claim the same
	// name. Without automatic resolution, Serf merely logs some warnings, but
	// otherwise does not take any action. Automatic resolution detects the
	// conflict and issues a special query which asks the cluster for the
	// Name -> IP:Port mapping. If there is a simple majority of votes, that
	// node stays while the other node will leave the cluster and exit.
	EnableNameConflictResolution bool

	// DisableCoordinates controls if Serf will maintain an estimate of this
	// node's network coordinate internally. A network coordinate is useful
	// for estimating the network distance (i.e. round trip time) between
	// two nodes. Enabling this option adds some overhead to ping messages.
	DisableCoordinates bool

	// KeyringFile provides the location of a writable file where Serf can
	// persist changes to the encryption keyring.
	KeyringFile string

	// Merge can be optionally provided to intercept a cluster merge
	// and conditionally abort the merge.
	Merge MergeDelegate
}

Config is the configuration for creating a Serf instance.

func DefaultConfig

func DefaultConfig() *Config

DefaultConfig returns a Config struct that contains reasonable defaults for most of the configurations.

func (*Config) Init added in v0.4.0

func (c *Config) Init()

Init allocates the subdata structures

type Event

type Event interface {
	EventType() EventType
	String() string
}

Event is a generic interface for exposing Serf events Clients will usually need to use a type switches to get to a more useful type

type EventType

type EventType int

EventType are all the types of events that may occur and be sent along the Serf channel.

const (
	EventMemberJoin EventType = iota
	EventMemberLeave
	EventMemberFailed
	EventMemberUpdate
	EventMemberReap
	EventUser
	EventQuery
)

func (EventType) String

func (t EventType) String() string

type KeyManager added in v0.6.4

type KeyManager struct {
	// contains filtered or unexported fields
}

KeyManager encapsulates all functionality within Serf for handling encryption keyring changes across a cluster.

func (*KeyManager) InstallKey added in v0.6.4

func (k *KeyManager) InstallKey(key string) (*KeyResponse, error)

InstallKey handles broadcasting a query to all members and gathering responses from each of them, returning a list of messages from each node and any applicable error conditions.

func (*KeyManager) ListKeys added in v0.6.4

func (k *KeyManager) ListKeys() (*KeyResponse, error)

ListKeys is used to collect installed keys from members in a Serf cluster and return an aggregated list of all installed keys. This is useful to operators to ensure that there are no lingering keys installed on any agents. Since having multiple keys installed can cause performance penalties in some cases, it's important to verify this information and remove unneeded keys.

func (*KeyManager) RemoveKey added in v0.6.4

func (k *KeyManager) RemoveKey(key string) (*KeyResponse, error)

RemoveKey handles broadcasting a key to the cluster for removal. Each member will receive this event, and if they have the key in their keyring, remove it. If any errors are encountered, RemoveKey will collect and relay them.

func (*KeyManager) UseKey added in v0.6.4

func (k *KeyManager) UseKey(key string) (*KeyResponse, error)

UseKey handles broadcasting a primary key change to all members in the cluster, and gathering any response messages. If successful, there should be an empty KeyResponse returned.

type KeyResponse added in v0.6.0

type KeyResponse struct {
	Messages map[string]string // Map of node name to response message
	NumNodes int               // Total nodes memberlist knows of
	NumResp  int               // Total responses received
	NumErr   int               // Total errors from request

	// Keys is a mapping of the base64-encoded value of the key bytes to the
	// number of nodes that have the key installed.
	Keys map[string]int
}

KeyResponse is used to relay a query for a list of all keys in use.

type LamportClock

type LamportClock struct {
	// contains filtered or unexported fields
}

LamportClock is a thread safe implementation of a lamport clock. It uses efficient atomic operations for all of its functions, falling back to a heavy lock only if there are enough CAS failures.

func (*LamportClock) Increment

func (l *LamportClock) Increment() LamportTime

Increment is used to increment and return the value of the lamport clock

func (*LamportClock) Time

func (l *LamportClock) Time() LamportTime

Time is used to return the current value of the lamport clock

func (*LamportClock) Witness

func (l *LamportClock) Witness(v LamportTime)

Witness is called to update our local clock if necessary after witnessing a clock value received from another process

type LamportTime

type LamportTime uint64

LamportTime is the value of a LamportClock.

type Member

type Member struct {
	Name   string
	Addr   net.IP
	Port   uint16
	Tags   map[string]string
	Status MemberStatus

	// The minimum, maximum, and current values of the protocol versions
	// and delegate (Serf) protocol versions that each member can understand
	// or is speaking.
	ProtocolMin uint8
	ProtocolMax uint8
	ProtocolCur uint8
	DelegateMin uint8
	DelegateMax uint8
	DelegateCur uint8
}

Member is a single member of the Serf cluster.

type MemberEvent

type MemberEvent struct {
	Type    EventType
	Members []Member
}

MemberEvent is the struct used for member related events Because Serf coalesces events, an event may contain multiple members.

func (MemberEvent) EventType

func (m MemberEvent) EventType() EventType

func (MemberEvent) String

func (m MemberEvent) String() string

type MemberStatus

type MemberStatus int

MemberStatus is the state that a member is in.

const (
	StatusNone MemberStatus = iota
	StatusAlive
	StatusLeaving
	StatusLeft
	StatusFailed
)

func (MemberStatus) String

func (s MemberStatus) String() string

type MergeDelegate added in v0.6.4

type MergeDelegate interface {
	NotifyMerge([]*Member) error
}

type NodeResponse added in v0.5.0

type NodeResponse struct {
	From    string
	Payload []byte
}

NodeResponse is used to represent a single response from a node

type PreviousNode added in v0.3.0

type PreviousNode struct {
	Name string
	Addr string
}

PreviousNode is used to represent the previously known alive nodes

func (PreviousNode) String added in v0.3.0

func (p PreviousNode) String() string

type Query added in v0.5.0

type Query struct {
	LTime   LamportTime
	Name    string
	Payload []byte
	// contains filtered or unexported fields
}

Query is the struct used EventQuery type events

func (*Query) Deadline added in v0.5.0

func (q *Query) Deadline() time.Time

Deadline returns the time by which a response must be sent

func (*Query) EventType added in v0.5.0

func (q *Query) EventType() EventType

func (*Query) Respond added in v0.5.0

func (q *Query) Respond(buf []byte) error

Respond is used to send a response to the user query

func (*Query) String added in v0.5.0

func (q *Query) String() string

type QueryParam added in v0.5.0

type QueryParam struct {
	// If provided, we restrict the nodes that should respond to those
	// with names in this list
	FilterNodes []string

	// FilterTags maps a tag name to a regular expression that is applied
	// to restrict the nodes that should respond
	FilterTags map[string]string

	// If true, we are requesting an delivery acknowledgement from
	// every node that meets the filter requirement. This means nodes
	// the receive the message but do not pass the filters, will not
	// send an ack.
	RequestAck bool

	// The timeout limits how long the query is left open. If not provided,
	// then a default timeout is used based on the configuration of Serf
	Timeout time.Duration
}

QueryParam is provided to Query() to configure the parameters of the query. If not provided, sane defaults will be used.

type QueryResponse added in v0.5.0

type QueryResponse struct {
	// contains filtered or unexported fields
}

QueryResponse is returned for each new Query. It is used to collect Ack's as well as responses and to provide those back to a client.

func (*QueryResponse) AckCh added in v0.5.0

func (r *QueryResponse) AckCh() <-chan string

AckCh returns a channel that can be used to listen for acks Channel will be closed when the query is finished. This is nil, if the query did not specify RequestAck.

func (*QueryResponse) Close added in v0.5.0

func (r *QueryResponse) Close()

Close is used to close the query, which will close the underlying channels and prevent further deliveries

func (*QueryResponse) Deadline added in v0.5.0

func (r *QueryResponse) Deadline() time.Time

Deadline returns the ending deadline of the query

func (*QueryResponse) Finished added in v0.5.0

func (r *QueryResponse) Finished() bool

Finished returns if the query is finished running

func (*QueryResponse) ResponseCh added in v0.5.0

func (r *QueryResponse) ResponseCh() <-chan NodeResponse

ResponseCh returns a channel that can be used to listen for responses. Channel will be closed when the query is finished.

type Serf

type Serf struct {
	// contains filtered or unexported fields
}

Serf is a single node that is part of a single cluster that gets events about joins/leaves/failures/etc. It is created with the Create method.

All functions on the Serf structure are safe to call concurrently.

func Create

func Create(conf *Config) (*Serf, error)

Create creates a new Serf instance, starting all the background tasks to maintain cluster membership information.

After calling this function, the configuration should no longer be used or modified by the caller.

func (*Serf) DefaultQueryParams added in v0.5.0

func (s *Serf) DefaultQueryParams() *QueryParam

DefaultQueryParam is used to return the default query parameters

func (*Serf) DefaultQueryTimeout added in v0.5.0

func (s *Serf) DefaultQueryTimeout() time.Duration

DefaultQueryTimeout returns the default timeout value for a query Computed as GossipInterval * QueryTimeoutMult * log(N+1)

func (*Serf) EncryptionEnabled added in v0.6.0

func (s *Serf) EncryptionEnabled() bool

EncryptionEnabled is a predicate that determines whether or not encryption is enabled, which can be possible in one of 2 cases:

  • Single encryption key passed at agent start (no persistence)
  • Keyring file provided at agent start

func (*Serf) GetCachedCoordinate added in v0.7.0

func (s *Serf) GetCachedCoordinate(name string) (coord *coordinate.Coordinate, ok bool)

GetCachedCoordinate returns the network coordinate for the node with the given name. This will only be valid if DisableCoordinates is set to false.

func (*Serf) GetCoordinate added in v0.7.0

func (s *Serf) GetCoordinate() (*coordinate.Coordinate, error)

GetCoordinate returns the network coordinate of the local node.

func (*Serf) Join

func (s *Serf) Join(existing []string, ignoreOld bool) (int, error)

Join joins an existing Serf cluster. Returns the number of nodes successfully contacted. The returned error will be non-nil only in the case that no nodes could be contacted. If ignoreOld is true, then any user messages sent prior to the join will be ignored.

func (*Serf) KeyManager added in v0.6.0

func (s *Serf) KeyManager() *KeyManager

KeyManager returns the key manager for the current Serf instance.

func (*Serf) Leave

func (s *Serf) Leave() error

Leave gracefully exits the cluster. It is safe to call this multiple times.

func (*Serf) LocalMember added in v0.5.0

func (s *Serf) LocalMember() Member

LocalMember returns the Member information for the local node

func (*Serf) Memberlist added in v0.4.1

func (s *Serf) Memberlist() *memberlist.Memberlist

Memberlist is used to get access to the underlying Memberlist instance

func (*Serf) Members

func (s *Serf) Members() []Member

Members returns a point-in-time snapshot of the members of this cluster.

func (*Serf) NumNodes added in v0.8.0

func (s *Serf) NumNodes() (numNodes int)

NumNodes returns the number of nodes in the serf cluster, regardless of their health or status.

func (*Serf) ProtocolVersion added in v0.2.0

func (s *Serf) ProtocolVersion() uint8

ProtocolVersion returns the current protocol version in use by Serf. This is the Serf protocol version, not the memberlist protocol version.

func (*Serf) Query added in v0.5.0

func (s *Serf) Query(name string, payload []byte, params *QueryParam) (*QueryResponse, error)

Query is used to broadcast a new query. The query must be fairly small, and an error will be returned if the size limit is exceeded. This is only available with protocol version 4 and newer. Query parameters are optional, and if not provided, a sane set of defaults will be used.

func (*Serf) RemoveFailedNode

func (s *Serf) RemoveFailedNode(node string) error

RemoveFailedNode forcibly removes a failed node from the cluster immediately, instead of waiting for the reaper to eventually reclaim it. This also has the effect that Serf will no longer attempt to reconnect to this node.

func (*Serf) SetTags added in v0.4.0

func (s *Serf) SetTags(tags map[string]string) error

SetTags is used to dynamically update the tags associated with the local node. This will propagate the change to the rest of the cluster. Blocks until a the message is broadcast out.

func (*Serf) Shutdown

func (s *Serf) Shutdown() error

Shutdown forcefully shuts down the Serf instance, stopping all network activity and background maintenance associated with the instance.

This is not a graceful shutdown, and should be preceded by a call to Leave. Otherwise, other nodes in the cluster will detect this node's exit as a node failure.

It is safe to call this method multiple times.

func (*Serf) ShutdownCh added in v0.5.0

func (s *Serf) ShutdownCh() <-chan struct{}

ShutdownCh returns a channel that can be used to wait for Serf to shutdown.

func (*Serf) State

func (s *Serf) State() SerfState

State is the current state of this Serf instance.

func (*Serf) Stats added in v0.4.5

func (s *Serf) Stats() map[string]string

Stats is used to provide operator debugging information

func (*Serf) UserEvent

func (s *Serf) UserEvent(name string, payload []byte, coalesce bool) error

UserEvent is used to broadcast a custom user event with a given name and payload. The events must be fairly small, and if the size limit is exceeded and error will be returned. If coalesce is enabled, nodes are allowed to coalesce this event. Coalescing is only available starting in v0.2

type SerfState

type SerfState int

SerfState is the state of the Serf instance.

const (
	SerfAlive SerfState = iota
	SerfLeaving
	SerfLeft
	SerfShutdown
)

func (SerfState) String added in v0.4.0

func (s SerfState) String() string

type Snapshotter added in v0.3.0

type Snapshotter struct {
	// contains filtered or unexported fields
}

Snapshotter is responsible for ingesting events and persisting them to disk, and providing a recovery mechanism at start time.

func NewSnapshotter added in v0.3.0

func NewSnapshotter(path string,
	maxSize int,
	rejoinAfterLeave bool,
	logger *log.Logger,
	clock *LamportClock,
	coordClient *coordinate.Client,
	outCh chan<- Event,
	shutdownCh <-chan struct{}) (chan<- Event, *Snapshotter, error)

NewSnapshotter creates a new Snapshotter that records events up to a max byte size before rotating the file. It can also be used to recover old state. Snapshotter works by reading an event channel it returns, passing through to an output channel, and persisting relevant events to disk. Setting rejoinAfterLeave makes leave not clear the state, and can be used if you intend to rejoin the same cluster after a leave.

func (*Snapshotter) AliveNodes added in v0.3.0

func (s *Snapshotter) AliveNodes() []*PreviousNode

AliveNodes returns the last known alive nodes

func (*Snapshotter) LastClock added in v0.3.0

func (s *Snapshotter) LastClock() LamportTime

LastClock returns the last known clock time

func (*Snapshotter) LastEventClock added in v0.3.0

func (s *Snapshotter) LastEventClock() LamportTime

LastEventClock returns the last known event clock time

func (*Snapshotter) LastQueryClock added in v0.5.0

func (s *Snapshotter) LastQueryClock() LamportTime

LastQueryClock returns the last known query clock time

func (*Snapshotter) Leave added in v0.3.0

func (s *Snapshotter) Leave()

Leave is used to remove known nodes to prevent a restart from causing a join. Otherwise nodes will re-join after leaving!

func (*Snapshotter) Wait added in v0.3.0

func (s *Snapshotter) Wait()

Wait is used to wait until the snapshotter finishes shut down

type UserEvent

type UserEvent struct {
	LTime    LamportTime
	Name     string
	Payload  []byte
	Coalesce bool
}

UserEvent is the struct used for events that are triggered by the user and are not related to members

func (UserEvent) EventType

func (u UserEvent) EventType() EventType

func (UserEvent) String

func (u UserEvent) String() string

Jump to

Keyboard shortcuts

? : This menu
/ : Search site
f or F : Jump to
y or Y : Canonical URL