Documentation ¶
Index ¶
- Constants
- Variables
- type Config
- type Event
- type EventType
- type KeyManager
- type KeyResponse
- type LamportClock
- type LamportTime
- type Member
- type MemberEvent
- type MemberStatus
- type MergeDelegate
- type NodeResponse
- type PreviousNode
- type Query
- type QueryParam
- type QueryResponse
- type Serf
- func (s *Serf) DefaultQueryParams() *QueryParam
- func (s *Serf) DefaultQueryTimeout() time.Duration
- func (s *Serf) EncryptionEnabled() bool
- func (s *Serf) GetCachedCoordinate(name string) (coord *coordinate.Coordinate, ok bool)
- func (s *Serf) GetCoordinate() (*coordinate.Coordinate, error)
- func (s *Serf) Join(existing []string, ignoreOld bool) (int, error)
- func (s *Serf) KeyManager() *KeyManager
- func (s *Serf) Leave() error
- func (s *Serf) LocalMember() Member
- func (s *Serf) Memberlist() *memberlist.Memberlist
- func (s *Serf) Members() []Member
- func (s *Serf) NumNodes() (numNodes int)
- func (s *Serf) ProtocolVersion() uint8
- func (s *Serf) Query(name string, payload []byte, params *QueryParam) (*QueryResponse, error)
- func (s *Serf) RemoveFailedNode(node string) error
- func (s *Serf) SetTags(tags map[string]string) error
- func (s *Serf) Shutdown() error
- func (s *Serf) ShutdownCh() <-chan struct{}
- func (s *Serf) State() SerfState
- func (s *Serf) Stats() map[string]string
- func (s *Serf) UserEvent(name string, payload []byte, coalesce bool) error
- type SerfState
- type Snapshotter
- type UserEvent
Constants ¶
const ( ProtocolVersionMin uint8 = 2 ProtocolVersionMax = 4 )
These are the protocol versions that Serf can _understand_. These are Serf-level protocol versions that are passed down as the delegate version to memberlist below.
const ( // This is the prefix we use for queries that are internal to Serf. // They are handled internally, and not forwarded to a client. InternalQueryPrefix = "_serf_" )
const ( // PingVersion is an internal version for the ping message, above the normal // versioning we get from the protocol version. This enables small updates // to the ping message without a full protocol bump. PingVersion = 1 )
const (
UserEventSizeLimit = 512 // Maximum byte size for event name and payload
)
Variables ¶
var ( // FeatureNotSupported is returned if a feature cannot be used // due to an older protocol version being used. FeatureNotSupported = fmt.Errorf("Feature not supported") )
var ProtocolVersionMap map[uint8]uint8
ProtocolVersionMap is the mapping of Serf delegate protocol versions to memberlist protocol versions. We mask the memberlist protocols using our own protocol version.
Functions ¶
This section is empty.
Types ¶
type Config ¶
type Config struct { // The name of this node. This must be unique in the cluster. If this // is not set, Serf will set it to the hostname of the running machine. NodeName string // The tags for this role, if any. This is used to provide arbitrary // key/value metadata per-node. For example, a "role" tag may be used to // differentiate "load-balancer" from a "web" role as parts of the same cluster. // Tags are deprecating 'Role', and instead it acts as a special key in this // map. Tags map[string]string // EventCh is a channel that receives all the Serf events. The events // are sent on this channel in proper ordering. Care must be taken that // this channel doesn't block, either by processing the events quick // enough or buffering the channel, otherwise it can block state updates // within Serf itself. If no EventCh is specified, no events will be fired, // but point-in-time snapshots of members can still be retrieved by // calling Members on Serf. EventCh chan<- Event // ProtocolVersion is the protocol version to speak. This must be between // ProtocolVersionMin and ProtocolVersionMax. ProtocolVersion uint8 // BroadcastTimeout is the amount of time to wait for a broadcast // message to be sent to the cluster. Broadcast messages are used for // things like leave messages and force remove messages. If this is not // set, a timeout of 5 seconds will be set. BroadcastTimeout time.Duration // The settings below relate to Serf's event coalescence feature. Serf // is able to coalesce multiple events into single events in order to // reduce the amount of noise that is sent along the EventCh. For example // if five nodes quickly join, the EventCh will be sent one EventMemberJoin // containing the five nodes rather than five individual EventMemberJoin // events. Coalescence can mitigate potential flapping behavior. // // Coalescence is disabled by default and can be enabled by setting // CoalescePeriod. // // CoalescePeriod specifies the time duration to coalesce events. // For example, if this is set to 5 seconds, then all events received // within 5 seconds that can be coalesced will be. // // QuiescentPeriod specifies the duration of time where if no events // are received, coalescence immediately happens. For example, if // CoalscePeriod is set to 10 seconds but QuiscentPeriod is set to 2 // seconds, then the events will be coalesced and dispatched if no // new events are received within 2 seconds of the last event. Otherwise, // every event will always be delayed by at least 10 seconds. CoalescePeriod time.Duration QuiescentPeriod time.Duration // The settings below relate to Serf's user event coalescing feature. // The settings operate like above but only affect user messages and // not the Member* messages that Serf generates. UserCoalescePeriod time.Duration UserQuiescentPeriod time.Duration // The settings below relate to Serf keeping track of recently // failed/left nodes and attempting reconnects. // // ReapInterval is the interval when the reaper runs. If this is not // set (it is zero), it will be set to a reasonable default. // // ReconnectInterval is the interval when we attempt to reconnect // to failed nodes. If this is not set (it is zero), it will be set // to a reasonable default. // // ReconnectTimeout is the amount of time to attempt to reconnect to // a failed node before giving up and considering it completely gone. // // TombstoneTimeout is the amount of time to keep around nodes // that gracefully left as tombstones for syncing state with other // Serf nodes. ReapInterval time.Duration ReconnectInterval time.Duration ReconnectTimeout time.Duration TombstoneTimeout time.Duration // FlapTimeout is the amount of time less than which we consider a node // being failed and rejoining looks like a flap for telemetry purposes. // This should be set less than a typical reboot time, but large enough // to see actual events, given our expected detection times for a failed // node. FlapTimeout time.Duration // QueueDepthWarning is used to generate warning message if the // number of queued messages to broadcast exceeds this number. This // is to provide the user feedback if events are being triggered // faster than they can be disseminated QueueDepthWarning int // MaxQueueDepth is used to start dropping messages if the number // of queued messages to broadcast exceeds this number. This is to // prevent an unbounded growth of memory utilization MaxQueueDepth int // RecentIntentTimeout is used to determine how long we store recent // join and leave intents. This is used to guard against the case where // Serf broadcasts an intent that arrives before the Memberlist event. // It is important that this not be too short to avoid continuous // rebroadcasting of dead events. RecentIntentTimeout time.Duration // EventBuffer is used to control how many events are buffered. // This is used to prevent re-delivery of events to a client. The buffer // must be large enough to handle all "recent" events, since Serf will // not deliver messages that are older than the oldest entry in the buffer. // Thus if a client is generating too many events, it's possible that the // buffer gets overrun and messages are not delivered. EventBuffer int // QueryBuffer is used to control how many queries are buffered. // This is used to prevent re-delivery of queries to a client. The buffer // must be large enough to handle all "recent" events, since Serf will not // deliver queries older than the oldest entry in the buffer. // Thus if a client is generating too many queries, it's possible that the // buffer gets overrun and messages are not delivered. QueryBuffer int // QueryTimeoutMult configures the default timeout multipler for a query to run if no // specific value is provided. Queries are real-time by nature, where the // reply is time sensitive. As a result, results are collected in an async // fashion, however the query must have a bounded duration. We want the timeout // to be long enough that all nodes have time to receive the message, run a handler, // and generate a reply. Once the timeout is exceeded, any further replies are ignored. // The default value is // // Timeout = GossipInterval * QueryTimeoutMult * log(N+1) // QueryTimeoutMult int // QueryResponseSizeLimit and QuerySizeLimit limit the inbound and // outbound payload sizes for queries, respectively. These must fit // in a UDP packet with some additional overhead, so tuning these // past the default values of 1024 will depend on your network // configuration. QueryResponseSizeLimit int QuerySizeLimit int // MemberlistConfig is the memberlist configuration that Serf will // use to do the underlying membership management and gossip. Some // fields in the MemberlistConfig will be overwritten by Serf no // matter what: // // * Name - This will always be set to the same as the NodeName // in this configuration. // // * Events - Serf uses a custom event delegate. // // * Delegate - Serf uses a custom delegate. // MemberlistConfig *memberlist.Config // LogOutput is the location to write logs to. If this is not set, // logs will go to stderr. LogOutput io.Writer // SnapshotPath if provided is used to snapshot live nodes as well // as lamport clock values. When Serf is started with a snapshot, // it will attempt to join all the previously known nodes until one // succeeds and will also avoid replaying old user events. SnapshotPath string // RejoinAfterLeave controls our interaction with the snapshot file. // When set to false (default), a leave causes a Serf to not rejoin // the cluster until an explicit join is received. If this is set to // true, we ignore the leave, and rejoin the cluster on start. RejoinAfterLeave bool // EnableNameConflictResolution controls if Serf will actively attempt // to resolve a name conflict. Since each Serf member must have a unique // name, a cluster can run into issues if multiple nodes claim the same // name. Without automatic resolution, Serf merely logs some warnings, but // otherwise does not take any action. Automatic resolution detects the // conflict and issues a special query which asks the cluster for the // Name -> IP:Port mapping. If there is a simple majority of votes, that // node stays while the other node will leave the cluster and exit. EnableNameConflictResolution bool // DisableCoordinates controls if Serf will maintain an estimate of this // node's network coordinate internally. A network coordinate is useful // for estimating the network distance (i.e. round trip time) between // two nodes. Enabling this option adds some overhead to ping messages. DisableCoordinates bool // KeyringFile provides the location of a writable file where Serf can // persist changes to the encryption keyring. KeyringFile string // Merge can be optionally provided to intercept a cluster merge // and conditionally abort the merge. Merge MergeDelegate }
Config is the configuration for creating a Serf instance.
func DefaultConfig ¶
func DefaultConfig() *Config
DefaultConfig returns a Config struct that contains reasonable defaults for most of the configurations.
type Event ¶
Event is a generic interface for exposing Serf events Clients will usually need to use a type switches to get to a more useful type
type EventType ¶
type EventType int
EventType are all the types of events that may occur and be sent along the Serf channel.
type KeyManager ¶ added in v0.6.4
type KeyManager struct {
// contains filtered or unexported fields
}
KeyManager encapsulates all functionality within Serf for handling encryption keyring changes across a cluster.
func (*KeyManager) InstallKey ¶ added in v0.6.4
func (k *KeyManager) InstallKey(key string) (*KeyResponse, error)
InstallKey handles broadcasting a query to all members and gathering responses from each of them, returning a list of messages from each node and any applicable error conditions.
func (*KeyManager) ListKeys ¶ added in v0.6.4
func (k *KeyManager) ListKeys() (*KeyResponse, error)
ListKeys is used to collect installed keys from members in a Serf cluster and return an aggregated list of all installed keys. This is useful to operators to ensure that there are no lingering keys installed on any agents. Since having multiple keys installed can cause performance penalties in some cases, it's important to verify this information and remove unneeded keys.
func (*KeyManager) RemoveKey ¶ added in v0.6.4
func (k *KeyManager) RemoveKey(key string) (*KeyResponse, error)
RemoveKey handles broadcasting a key to the cluster for removal. Each member will receive this event, and if they have the key in their keyring, remove it. If any errors are encountered, RemoveKey will collect and relay them.
func (*KeyManager) UseKey ¶ added in v0.6.4
func (k *KeyManager) UseKey(key string) (*KeyResponse, error)
UseKey handles broadcasting a primary key change to all members in the cluster, and gathering any response messages. If successful, there should be an empty KeyResponse returned.
type KeyResponse ¶ added in v0.6.0
type KeyResponse struct { Messages map[string]string // Map of node name to response message NumNodes int // Total nodes memberlist knows of NumResp int // Total responses received NumErr int // Total errors from request // Keys is a mapping of the base64-encoded value of the key bytes to the // number of nodes that have the key installed. Keys map[string]int }
KeyResponse is used to relay a query for a list of all keys in use.
type LamportClock ¶
type LamportClock struct {
// contains filtered or unexported fields
}
LamportClock is a thread safe implementation of a lamport clock. It uses efficient atomic operations for all of its functions, falling back to a heavy lock only if there are enough CAS failures.
func (*LamportClock) Increment ¶
func (l *LamportClock) Increment() LamportTime
Increment is used to increment and return the value of the lamport clock
func (*LamportClock) Time ¶
func (l *LamportClock) Time() LamportTime
Time is used to return the current value of the lamport clock
func (*LamportClock) Witness ¶
func (l *LamportClock) Witness(v LamportTime)
Witness is called to update our local clock if necessary after witnessing a clock value received from another process
type Member ¶
type Member struct { Name string Addr net.IP Port uint16 Tags map[string]string Status MemberStatus // The minimum, maximum, and current values of the protocol versions // and delegate (Serf) protocol versions that each member can understand // or is speaking. ProtocolMin uint8 ProtocolMax uint8 ProtocolCur uint8 DelegateMin uint8 DelegateMax uint8 DelegateCur uint8 }
Member is a single member of the Serf cluster.
type MemberEvent ¶
MemberEvent is the struct used for member related events Because Serf coalesces events, an event may contain multiple members.
func (MemberEvent) EventType ¶
func (m MemberEvent) EventType() EventType
func (MemberEvent) String ¶
func (m MemberEvent) String() string
type MemberStatus ¶
type MemberStatus int
MemberStatus is the state that a member is in.
const ( StatusNone MemberStatus = iota StatusAlive StatusLeaving StatusLeft StatusFailed )
func (MemberStatus) String ¶
func (s MemberStatus) String() string
type MergeDelegate ¶ added in v0.6.4
type NodeResponse ¶ added in v0.5.0
NodeResponse is used to represent a single response from a node
type PreviousNode ¶ added in v0.3.0
PreviousNode is used to represent the previously known alive nodes
func (PreviousNode) String ¶ added in v0.3.0
func (p PreviousNode) String() string
type Query ¶ added in v0.5.0
type Query struct { LTime LamportTime Name string Payload []byte // contains filtered or unexported fields }
Query is the struct used EventQuery type events
type QueryParam ¶ added in v0.5.0
type QueryParam struct { // If provided, we restrict the nodes that should respond to those // with names in this list FilterNodes []string // FilterTags maps a tag name to a regular expression that is applied // to restrict the nodes that should respond FilterTags map[string]string // If true, we are requesting an delivery acknowledgement from // every node that meets the filter requirement. This means nodes // the receive the message but do not pass the filters, will not // send an ack. RequestAck bool // The timeout limits how long the query is left open. If not provided, // then a default timeout is used based on the configuration of Serf Timeout time.Duration }
QueryParam is provided to Query() to configure the parameters of the query. If not provided, sane defaults will be used.
type QueryResponse ¶ added in v0.5.0
type QueryResponse struct {
// contains filtered or unexported fields
}
QueryResponse is returned for each new Query. It is used to collect Ack's as well as responses and to provide those back to a client.
func (*QueryResponse) AckCh ¶ added in v0.5.0
func (r *QueryResponse) AckCh() <-chan string
AckCh returns a channel that can be used to listen for acks Channel will be closed when the query is finished. This is nil, if the query did not specify RequestAck.
func (*QueryResponse) Close ¶ added in v0.5.0
func (r *QueryResponse) Close()
Close is used to close the query, which will close the underlying channels and prevent further deliveries
func (*QueryResponse) Deadline ¶ added in v0.5.0
func (r *QueryResponse) Deadline() time.Time
Deadline returns the ending deadline of the query
func (*QueryResponse) Finished ¶ added in v0.5.0
func (r *QueryResponse) Finished() bool
Finished returns if the query is finished running
func (*QueryResponse) ResponseCh ¶ added in v0.5.0
func (r *QueryResponse) ResponseCh() <-chan NodeResponse
ResponseCh returns a channel that can be used to listen for responses. Channel will be closed when the query is finished.
type Serf ¶
type Serf struct {
// contains filtered or unexported fields
}
Serf is a single node that is part of a single cluster that gets events about joins/leaves/failures/etc. It is created with the Create method.
All functions on the Serf structure are safe to call concurrently.
func Create ¶
Create creates a new Serf instance, starting all the background tasks to maintain cluster membership information.
After calling this function, the configuration should no longer be used or modified by the caller.
func (*Serf) DefaultQueryParams ¶ added in v0.5.0
func (s *Serf) DefaultQueryParams() *QueryParam
DefaultQueryParam is used to return the default query parameters
func (*Serf) DefaultQueryTimeout ¶ added in v0.5.0
DefaultQueryTimeout returns the default timeout value for a query Computed as GossipInterval * QueryTimeoutMult * log(N+1)
func (*Serf) EncryptionEnabled ¶ added in v0.6.0
EncryptionEnabled is a predicate that determines whether or not encryption is enabled, which can be possible in one of 2 cases:
- Single encryption key passed at agent start (no persistence)
- Keyring file provided at agent start
func (*Serf) GetCachedCoordinate ¶ added in v0.7.0
func (s *Serf) GetCachedCoordinate(name string) (coord *coordinate.Coordinate, ok bool)
GetCachedCoordinate returns the network coordinate for the node with the given name. This will only be valid if DisableCoordinates is set to false.
func (*Serf) GetCoordinate ¶ added in v0.7.0
func (s *Serf) GetCoordinate() (*coordinate.Coordinate, error)
GetCoordinate returns the network coordinate of the local node.
func (*Serf) Join ¶
Join joins an existing Serf cluster. Returns the number of nodes successfully contacted. The returned error will be non-nil only in the case that no nodes could be contacted. If ignoreOld is true, then any user messages sent prior to the join will be ignored.
func (*Serf) KeyManager ¶ added in v0.6.0
func (s *Serf) KeyManager() *KeyManager
KeyManager returns the key manager for the current Serf instance.
func (*Serf) LocalMember ¶ added in v0.5.0
LocalMember returns the Member information for the local node
func (*Serf) Memberlist ¶ added in v0.4.1
func (s *Serf) Memberlist() *memberlist.Memberlist
Memberlist is used to get access to the underlying Memberlist instance
func (*Serf) NumNodes ¶ added in v0.8.0
NumNodes returns the number of nodes in the serf cluster, regardless of their health or status.
func (*Serf) ProtocolVersion ¶ added in v0.2.0
ProtocolVersion returns the current protocol version in use by Serf. This is the Serf protocol version, not the memberlist protocol version.
func (*Serf) Query ¶ added in v0.5.0
func (s *Serf) Query(name string, payload []byte, params *QueryParam) (*QueryResponse, error)
Query is used to broadcast a new query. The query must be fairly small, and an error will be returned if the size limit is exceeded. This is only available with protocol version 4 and newer. Query parameters are optional, and if not provided, a sane set of defaults will be used.
func (*Serf) RemoveFailedNode ¶
RemoveFailedNode forcibly removes a failed node from the cluster immediately, instead of waiting for the reaper to eventually reclaim it. This also has the effect that Serf will no longer attempt to reconnect to this node.
func (*Serf) SetTags ¶ added in v0.4.0
SetTags is used to dynamically update the tags associated with the local node. This will propagate the change to the rest of the cluster. Blocks until a the message is broadcast out.
func (*Serf) Shutdown ¶
Shutdown forcefully shuts down the Serf instance, stopping all network activity and background maintenance associated with the instance.
This is not a graceful shutdown, and should be preceded by a call to Leave. Otherwise, other nodes in the cluster will detect this node's exit as a node failure.
It is safe to call this method multiple times.
func (*Serf) ShutdownCh ¶ added in v0.5.0
func (s *Serf) ShutdownCh() <-chan struct{}
ShutdownCh returns a channel that can be used to wait for Serf to shutdown.
func (*Serf) UserEvent ¶
UserEvent is used to broadcast a custom user event with a given name and payload. The events must be fairly small, and if the size limit is exceeded and error will be returned. If coalesce is enabled, nodes are allowed to coalesce this event. Coalescing is only available starting in v0.2
type Snapshotter ¶ added in v0.3.0
type Snapshotter struct {
// contains filtered or unexported fields
}
Snapshotter is responsible for ingesting events and persisting them to disk, and providing a recovery mechanism at start time.
func NewSnapshotter ¶ added in v0.3.0
func NewSnapshotter(path string, maxSize int, rejoinAfterLeave bool, logger *log.Logger, clock *LamportClock, coordClient *coordinate.Client, outCh chan<- Event, shutdownCh <-chan struct{}) (chan<- Event, *Snapshotter, error)
NewSnapshotter creates a new Snapshotter that records events up to a max byte size before rotating the file. It can also be used to recover old state. Snapshotter works by reading an event channel it returns, passing through to an output channel, and persisting relevant events to disk. Setting rejoinAfterLeave makes leave not clear the state, and can be used if you intend to rejoin the same cluster after a leave.
func (*Snapshotter) AliveNodes ¶ added in v0.3.0
func (s *Snapshotter) AliveNodes() []*PreviousNode
AliveNodes returns the last known alive nodes
func (*Snapshotter) LastClock ¶ added in v0.3.0
func (s *Snapshotter) LastClock() LamportTime
LastClock returns the last known clock time
func (*Snapshotter) LastEventClock ¶ added in v0.3.0
func (s *Snapshotter) LastEventClock() LamportTime
LastEventClock returns the last known event clock time
func (*Snapshotter) LastQueryClock ¶ added in v0.5.0
func (s *Snapshotter) LastQueryClock() LamportTime
LastQueryClock returns the last known query clock time
func (*Snapshotter) Leave ¶ added in v0.3.0
func (s *Snapshotter) Leave()
Leave is used to remove known nodes to prevent a restart from causing a join. Otherwise nodes will re-join after leaving!
func (*Snapshotter) Wait ¶ added in v0.3.0
func (s *Snapshotter) Wait()
Wait is used to wait until the snapshotter finishes shut down