multiraft

package
v0.0.0-...-d3306dc Latest
Warning

This package is not in the latest version of its module.

Published: Jul 24, 2014 License: Apache-2.0 Imports: 10 Imported by: 0

Documentation

Overview

Package multiraft implements the Raft distributed consensus algorithm.

In contrast to other Raft implementations, this version is optimized for the case where one server is a part of many Raft consensus groups (likely with overlapping membership). This entails the use of a shared log and coalesced timers for heartbeats.
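The idea behind coalesced heartbeats can be illustrated with a minimal sketch. It is not the package's implementation: the helper name coalesceHeartbeats and the map-based input are assumptions made for illustration. Given the groups a node leads, it computes which groups' heartbeats could share a single message to each remote node.

```go
package main

import (
	"fmt"
	"sort"
)

// NodeID and GroupID mirror the identifier types declared by the package.
type NodeID int32
type GroupID int64

// coalesceHeartbeats is a hypothetical helper. For every remote node it returns
// the sorted list of groups whose heartbeats could travel in one message.
func coalesceHeartbeats(self NodeID, groups map[GroupID][]NodeID) map[NodeID][]GroupID {
	out := make(map[NodeID][]GroupID)
	for gid, members := range groups {
		for _, n := range members {
			if n == self {
				continue // a node never sends a heartbeat to itself
			}
			out[n] = append(out[n], gid)
		}
	}
	for n := range out {
		ids := out[n]
		sort.Slice(ids, func(i, j int) bool { return ids[i] < ids[j] })
	}
	return out
}

func main() {
	groups := map[GroupID][]NodeID{
		1: {1, 2, 3},
		2: {1, 2, 4},
		3: {1, 3, 4},
	}
	for node, gids := range coalesceHeartbeats(1, groups) {
		fmt.Printf("node %d: one message covering groups %v\n", node, gids)
	}
}
```

With overlapping membership, the number of heartbeat messages per interval then grows with the number of peer nodes rather than with the number of groups.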

A cluster consists of a collection of nodes; the local node is represented by a MultiRaft object. Each node may participate in any number of groups. Each node must have a globally unique ID (a NodeID), and each group must have a globally unique ID (a GroupID). The application is responsible for providing a Transport interface that knows how to communicate with other nodes based on their IDs, and a Storage interface to manage persistent data.

The Raft protocol is documented in "In Search of an Understandable Consensus Algorithm" by Diego Ongaro and John Ousterhout. https://ramcloud.stanford.edu/wiki/download/attachments/11370504/raft.pdf

Constants

This section is empty.

Variables

var RealClock = realClock{}

RealClock is the standard implementation of the Clock interface.

Functions

This section is empty.

Types

type AppendEntriesRequest

type AppendEntriesRequest struct {
	RequestHeader
	GroupID      GroupID
	Term         int
	LeaderID     NodeID
	PrevLogIndex int
	PrevLogTerm  int
	Entries      []*LogEntry
	LeaderCommit int
}

AppendEntriesRequest is a part of the Raft protocol. It is public so it can be used by the net/rpc system but should not be used outside this package except to serialize it.

type AppendEntriesResponse

type AppendEntriesResponse struct {
	Term    int
	Success bool
}

AppendEntriesResponse is a part of the Raft protocol. It is public so it can be used by the net/rpc system but should not be used outside this package except to serialize it.

type ClientInterface

type ClientInterface interface {
	Go(serviceMethod string, args interface{}, reply interface{}, done chan *rpc.Call) *rpc.Call
	Close() error
}

ClientInterface is the interface expected of the client provided by a transport. It is satisfied by rpc.Client, but could be implemented in other ways (using rpc.Call as a dumb data structure).

type Clock

type Clock interface {
	Now() time.Time

	// NewElectionTimer returns a timer to be used for triggering elections.  The resulting
	// Timer struct will have its C field filled out, but may not be a "real" timer, so it
	// must be stopped with StopElectionTimer instead of t.Stop()
	NewElectionTimer(time.Duration) *time.Timer
	StopElectionTimer(*time.Timer)
}

Clock encapsulates the timing-related parts of the raft protocol. Types of events are separated in the API (i.e. NewElectionTimer() instead of NewTimer()) so they can be triggered individually in tests.
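As an illustration of why the interface is shaped this way, here is a minimal sketch of a test double. The manualClock type and its FireElectionTimers method are hypothetical and not part of the package; the Clock interface is copied locally so the example is self-contained. The timers it returns only have their C field set, which is why they must be stopped through StopElectionTimer rather than t.Stop().

```go
package main

import (
	"fmt"
	"time"
)

// Clock is a local copy of the package's interface.
type Clock interface {
	Now() time.Time
	NewElectionTimer(time.Duration) *time.Timer
	StopElectionTimer(*time.Timer)
}

// manualClock is a hypothetical test double; time only moves when the test
// says so. It is not safe for concurrent use.
type manualClock struct {
	now    time.Time
	timers map[*time.Timer]chan time.Time
}

var _ Clock = (*manualClock)(nil)

func newManualClock() *manualClock {
	return &manualClock{now: time.Unix(0, 0), timers: make(map[*time.Timer]chan time.Time)}
}

func (c *manualClock) Now() time.Time { return c.now }

// NewElectionTimer ignores the duration and returns a Timer whose C field is
// backed by a channel that only FireElectionTimers writes to.
func (c *manualClock) NewElectionTimer(_ time.Duration) *time.Timer {
	ch := make(chan time.Time, 1)
	t := &time.Timer{C: ch}
	c.timers[t] = ch
	return t
}

// StopElectionTimer forgets the timer so it will never fire.
func (c *manualClock) StopElectionTimer(t *time.Timer) { delete(c.timers, t) }

// FireElectionTimers advances the clock by d, fires every outstanding election
// timer once, and reports how many fired.
func (c *manualClock) FireElectionTimers(d time.Duration) int {
	c.now = c.now.Add(d)
	fired := 0
	for t, ch := range c.timers {
		ch <- c.now
		delete(c.timers, t)
		fired++
	}
	return fired
}

func main() {
	clock := newManualClock()
	timer := clock.NewElectionTimer(150 * time.Millisecond)
	clock.FireElectionTimers(150 * time.Millisecond)
	fmt.Println("election timer fired at", (<-timer.C).UTC())
}
```

A test could pass such a clock in Config.Clock and trigger elections deterministically instead of sleeping.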

type Config

type Config struct {
	Storage   Storage
	Transport Transport
	// Clock may be nil to use real time.
	Clock Clock

	// A new election is called if the ElectionTimeout elapses with no contact from the leader.
	// The actual ElectionTimeout is chosen randomly from the range [ElectionTimeoutMin,
	// ElectionTimeoutMax) to minimize the chances of several servers trying to become leaders
	// simultaneously.  The Raft paper suggests a range of 150-300ms for local networks;
	// geographically distributed installations should use higher values to account for the
	// increased round trip time.
	ElectionTimeoutMin time.Duration
	ElectionTimeoutMax time.Duration

	// If Strict is true, some warnings become fatal panics and additional (possibly expensive)
	// sanity checks will be done.
	Strict bool
}

Config contains the parameters necessary to construct a MultiRaft object.
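The randomized timeout described in the field comments can be sketched as follows. The helper name randomElectionTimeout is an assumption, and the package may choose the value differently; the sketch only shows one way to draw uniformly from the half-open range [ElectionTimeoutMin, ElectionTimeoutMax).

```go
package main

import (
	"fmt"
	"math/rand"
	"time"
)

// randomElectionTimeout is a hypothetical helper that returns a duration drawn
// uniformly from [lo, hi). It assumes 0 < lo < hi; rand.Int63n panics otherwise.
func randomElectionTimeout(lo, hi time.Duration) time.Duration {
	return lo + time.Duration(rand.Int63n(int64(hi-lo)))
}

func main() {
	// The range suggested by the Raft paper for local networks.
	lo, hi := 150*time.Millisecond, 300*time.Millisecond
	for i := 0; i < 3; i++ {
		fmt.Println(randomElectionTimeout(lo, hi))
	}
}
```

Because each node draws its own timeout, it is unlikely that several followers time out at the same instant and split the vote.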

func (*Config) Validate

func (c *Config) Validate() error

Validate returns an error if any required elements of the Config are missing or invalid. Called automatically by NewMultiRaft.

type EventCommandCommitted

type EventCommandCommitted struct {
	Command []byte
}

An EventCommandCommitted is broadcast whenever a command has been committed.

type EventLeaderElection

type EventLeaderElection struct {
	GroupID GroupID
	NodeID  NodeID
}

An EventLeaderElection is broadcast when a group completes an election. TODO(bdarnell): emit EventLeaderElection from follower nodes as well.

type GroupElectionState

type GroupElectionState struct {
	// CurrentTerm is the highest term this node has seen.
	CurrentTerm int

	// VotedFor is the node this node has voted for in CurrentTerm's election.  It is zero
	// if this node has not yet voted in CurrentTerm.
	VotedFor NodeID
}

GroupElectionState records the votes this node has made so that it will not change its vote after a restart.

func (*GroupElectionState) Equal

func (g *GroupElectionState) Equal(other *GroupElectionState) bool

Equal compares two GroupElectionStates.
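The following sketch shows how persisted election state prevents a node from changing its vote. The types are local stand-ins, the field-by-field Equal is an assumption about what the comparison does, and recordVote is a hypothetical helper. It deliberately omits the log up-to-date check that Raft also requires before granting a vote.

```go
package main

import "fmt"

type NodeID int32

// GroupElectionState is a local copy of the package's struct.
type GroupElectionState struct {
	CurrentTerm int
	VotedFor    NodeID
}

// Equal compares field by field (assumed behaviour, for illustration).
func (g *GroupElectionState) Equal(other *GroupElectionState) bool {
	return g.CurrentTerm == other.CurrentTerm && g.VotedFor == other.VotedFor
}

// recordVote is a hypothetical helper. It returns the state to persist and
// whether the vote for candidate in the given term is granted.
func recordVote(s GroupElectionState, term int, candidate NodeID) (GroupElectionState, bool) {
	if term < s.CurrentTerm {
		return s, false // stale candidate
	}
	if term > s.CurrentTerm {
		s = GroupElectionState{CurrentTerm: term} // new term, no vote cast yet
	}
	if s.VotedFor != 0 && s.VotedFor != candidate {
		return s, false // already voted for another node in this term
	}
	s.VotedFor = candidate
	return s, true
}

func main() {
	state := GroupElectionState{CurrentTerm: 3}
	state, granted := recordVote(state, 3, 7)
	fmt.Println(granted, state)
	// After a restart the persisted state is reloaded, so a second candidate
	// in the same term is still refused.
	_, granted = recordVote(state, 3, 8)
	fmt.Println(granted)
}
```

The zero value of VotedFor doubles as "no vote yet", which is consistent with NodeID being documented as non-zero.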

type GroupID

type GroupID int64

GroupID is a unique identifier for a consensus group within the cluster.

type GroupMembers

type GroupMembers struct {
	// Members contains the current members of the group and is always non-empty.
	// When ProposedMembers is non-empty, the group is in a "joint consensus" phase and
	// a quorum must be reached independently among both Members and ProposedMembers.
	// 'Members' never changes except to be set to the ProposedMembers of the most recently
	// committed GroupMembers.
	Members         []NodeID
	ProposedMembers []NodeID

	// NonVotingMembers receive logs for the group but do not participate in elections
	// or quorum decisions.  When a new node is added to the group it is initially
	// a NonVotingMember until it has caught up to the current log position.
	NonVotingMembers []NodeID
}

GroupMembers maintains the current and future members of the group. It is updated when log entries are received rather than when they are committed as a part of the "joint consensus" protocol (section 6 of the Raft paper).
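The joint-consensus quorum rule described in the field comments can be sketched as below. The hasQuorum and majority helpers are hypothetical and the acks map is an assumed representation of which nodes have acknowledged an entry; the package's internal bookkeeping may differ.

```go
package main

import "fmt"

type NodeID int32

// GroupMembers is a local copy of the package's struct.
type GroupMembers struct {
	Members          []NodeID
	ProposedMembers  []NodeID
	NonVotingMembers []NodeID
}

// majority reports whether more than half of members appear in acks.
func majority(members []NodeID, acks map[NodeID]bool) bool {
	n := 0
	for _, m := range members {
		if acks[m] {
			n++
		}
	}
	return n > len(members)/2
}

// hasQuorum is a hypothetical helper. During joint consensus a majority is
// required independently in Members and in ProposedMembers. NonVotingMembers
// are never counted.
func hasQuorum(g GroupMembers, acks map[NodeID]bool) bool {
	if !majority(g.Members, acks) {
		return false
	}
	if len(g.ProposedMembers) > 0 && !majority(g.ProposedMembers, acks) {
		return false
	}
	return true
}

func main() {
	g := GroupMembers{Members: []NodeID{1, 2, 3}, ProposedMembers: []NodeID{3, 4, 5}}
	fmt.Println(hasQuorum(g, map[NodeID]bool{1: true, 2: true}))          // old majority only
	fmt.Println(hasQuorum(g, map[NodeID]bool{1: true, 3: true, 4: true})) // both majorities
}
```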

type GroupPersistentState

type GroupPersistentState struct {
	GroupID       GroupID
	ElectionState GroupElectionState
	Members       GroupMembers
	LastLogIndex  int
	LastLogTerm   int
}

GroupPersistentState is a unified view of the readable data (except for log entries) about a group; used by Storage.LoadGroups.

type LogEntry

type LogEntry struct {
	Term    int
	Index   int
	Type    LogEntryType
	Payload []byte
}

LogEntry represents a persistent log entry. Payloads are opaque to the raft system. TODO(bdarnell): we will need both opaque payloads for the application and raft-subsystem payloads for membership changes.

type LogEntryState

type LogEntryState struct {
	Index int
	Entry LogEntry
	Error error
}

LogEntryState is used by Storage.GetLogEntries to bundle a LogEntry with its index and an optional error.

type LogEntryType

type LogEntryType int8

LogEntryType is the type of a LogEntry.

const (
	LogEntryCommand LogEntryType = iota
)

LogEntryCommand is for application-level commands sent via MultiRaft.SubmitCommand; other LogEntryTypes are for internal use.

type MemoryStorage

type MemoryStorage struct {
	// contains filtered or unexported fields
}

MemoryStorage is an in-memory implementation of Storage for testing.

func NewMemoryStorage

func NewMemoryStorage() *MemoryStorage

NewMemoryStorage creates a MemoryStorage.

func (*MemoryStorage) AppendLogEntries

func (m *MemoryStorage) AppendLogEntries(groupID GroupID, entries []*LogEntry) error

AppendLogEntries implements the Storage interface.

func (*MemoryStorage) GetLogEntries

func (m *MemoryStorage) GetLogEntries(groupID GroupID, firstIndex, lastIndex int,
	ch chan<- *LogEntryState)

GetLogEntries implements the Storage interface.

func (*MemoryStorage) GetLogEntry

func (m *MemoryStorage) GetLogEntry(groupID GroupID, index int) (*LogEntry, error)

GetLogEntry implements the Storage interface.

func (*MemoryStorage) LoadGroups

func (m *MemoryStorage) LoadGroups() <-chan *GroupPersistentState

LoadGroups implements the Storage interface.

func (*MemoryStorage) SetGroupElectionState

func (m *MemoryStorage) SetGroupElectionState(groupID GroupID,
	electionState *GroupElectionState) error

SetGroupElectionState implements the Storage interface.

func (*MemoryStorage) TruncateLog

func (m *MemoryStorage) TruncateLog(groupID GroupID, lastIndex int) error

TruncateLog implements the Storage interface.

type MultiRaft

type MultiRaft struct {
	Config
	Events chan interface{}
	// contains filtered or unexported fields
}

MultiRaft represents a local node in a raft cluster. The owner is responsible for consuming the Events channel in a timely manner.
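Since Events carries values of type interface{}, a consumer typically dispatches on the dynamic type. The sketch below uses local copies of the two event structs; describeEvent is a hypothetical helper. Whether the package sends events as values or as pointers is not stated here, so both forms are handled.

```go
package main

import "fmt"

type GroupID int64
type NodeID int32

// Local copies of the package's event types.
type EventCommandCommitted struct {
	Command []byte
}

type EventLeaderElection struct {
	GroupID GroupID
	NodeID  NodeID
}

// describeEvent is a hypothetical helper that turns an event into a log line.
func describeEvent(e interface{}) string {
	switch ev := e.(type) {
	case *EventCommandCommitted:
		return fmt.Sprintf("committed %q", ev.Command)
	case EventCommandCommitted:
		return fmt.Sprintf("committed %q", ev.Command)
	case *EventLeaderElection:
		return fmt.Sprintf("group %d elected node %d", ev.GroupID, ev.NodeID)
	case EventLeaderElection:
		return fmt.Sprintf("group %d elected node %d", ev.GroupID, ev.NodeID)
	default:
		return fmt.Sprintf("unhandled event %T", ev)
	}
}

func main() {
	// A stand-in for MultiRaft.Events, filled by hand for the demonstration.
	events := make(chan interface{}, 2)
	events <- &EventLeaderElection{GroupID: 1, NodeID: 2}
	events <- &EventCommandCommitted{Command: []byte("set x=1")}
	close(events)
	for e := range events {
		fmt.Println(describeEvent(e))
	}
}
```

In an application the receive loop would normally run in its own goroutine so that slow handling does not leave the channel undrained.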

func NewMultiRaft

func NewMultiRaft(nodeID NodeID, config *Config) (*MultiRaft, error)

NewMultiRaft creates a MultiRaft object.

func (*MultiRaft) CreateGroup

func (m *MultiRaft) CreateGroup(groupID GroupID, initialMembers []NodeID) error

CreateGroup creates a new consensus group and joins it. The application should arrange to call CreateGroup on all nodes named in initialMembers.

func (*MultiRaft) DoRPC

func (m *MultiRaft) DoRPC(name string, req, resp interface{}) error

DoRPC implements ServerInterface.

func (*MultiRaft) Start

func (m *MultiRaft) Start()

Start runs the raft algorithm in a background goroutine.

func (*MultiRaft) Stop

func (m *MultiRaft) Stop()

Stop terminates the running raft instance and shuts down all network interfaces.

func (*MultiRaft) SubmitCommand

func (m *MultiRaft) SubmitCommand(groupID GroupID, command []byte) error

SubmitCommand sends a command (a binary blob) to the cluster. This method returns when the command has been successfully sent, not when it has been committed. TODO(bdarnell): should SubmitCommand wait until the commit?

type NodeID

type NodeID int32

NodeID is a unique non-zero identifier for the node within the cluster.

type RPCInterface

type RPCInterface interface {
	RequestVote(req *RequestVoteRequest, resp *RequestVoteResponse) error
	AppendEntries(req *AppendEntriesRequest, resp *AppendEntriesResponse) error
}

RPCInterface is the set of methods exposed for use by net/rpc.

type RequestHeader

type RequestHeader struct {
	SrcNode  NodeID
	DestNode NodeID
}

RequestHeader contains fields common to all RPC requests.

type RequestVoteRequest

type RequestVoteRequest struct {
	RequestHeader
	GroupID      GroupID
	Term         int
	CandidateID  NodeID
	LastLogIndex int
	LastLogTerm  int
}

RequestVoteRequest is a part of the Raft protocol. It is public so it can be used by the net/rpc system but should not be used outside this package except to serialize it.

type RequestVoteResponse

type RequestVoteResponse struct {
	Term        int
	VoteGranted bool
}

RequestVoteResponse is a part of the Raft protocol. It is public so it can be used by the net/rpc system but should not be used outside this package except to serialize it.

type Role

type Role int

Role represents the state of the node in a group.

const (
	RoleObserver Role = iota
	RoleFollower
	RoleCandidate
	RoleLeader
)

Nodes can be observers, followers, candidates, or leaders. Observers receive replicated logs but do not vote. There is at most one Leader per term; a node cannot become a Leader without first becoming a Candidate and winning an election.

type ServerInterface

type ServerInterface interface {
	DoRPC(name string, req, resp interface{}) error
}

ServerInterface is a generic interface based on net/rpc.

type Storage

type Storage interface {
	// LoadGroups is called at startup to load all previously-existing groups.
	// The returned channel should be closed once all groups have been loaded.
	LoadGroups() <-chan *GroupPersistentState

	// SetGroupElectionState is called to update the election state for the given group.
	SetGroupElectionState(groupID GroupID, electionState *GroupElectionState) error

	// AppendLogEntries is called to add entries to the log.  The entries will always span
	// a contiguous range of indices just after the current end of the log.
	AppendLogEntries(groupID GroupID, entries []*LogEntry) error

	// TruncateLog is called to delete all log entries with index > lastIndex.
	TruncateLog(groupID GroupID, lastIndex int) error

	// GetLogEntry is called to synchronously retrieve an entry from the log.
	GetLogEntry(groupID GroupID, index int) (*LogEntry, error)

	// GetLogEntries is called to asynchronously retrieve entries from the log,
	// from firstIndex to lastIndex inclusive.  If there is an error the storage
	// layer should send one LogEntryState with a non-nil error and then close the
	// channel.
	GetLogEntries(groupID GroupID, firstIndex, lastIndex int, ch chan<- *LogEntryState)
}

The Storage interface is supplied by the application to manage persistent storage of raft data.
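The log-related contracts in the interface comments can be illustrated with a single-group, in-memory sketch. The memLog type is hypothetical and is not the package's MemoryStorage. It assumes 1-based indices as in the Raft paper, omits the Type field of LogEntry, and assumes the channel is also closed after a successful read.

```go
package main

import "fmt"

// Simplified local copies of the package's types.
type LogEntry struct {
	Term    int
	Index   int
	Payload []byte
}

type LogEntryState struct {
	Index int
	Entry LogEntry
	Error error
}

// memLog holds one group's log; entries[i] has Index i+1 (assumption).
type memLog struct {
	entries []*LogEntry
}

// Append rejects batches that do not start right after the end of the log.
func (l *memLog) Append(entries []*LogEntry) error {
	next := len(l.entries) + 1
	for i, e := range entries {
		if e.Index != next+i {
			return fmt.Errorf("non-contiguous append: got index %d, want %d", e.Index, next+i)
		}
	}
	l.entries = append(l.entries, entries...)
	return nil
}

// Truncate deletes all entries with index > lastIndex.
func (l *memLog) Truncate(lastIndex int) error {
	if lastIndex < 0 {
		return fmt.Errorf("invalid lastIndex %d", lastIndex)
	}
	if lastIndex < len(l.entries) {
		l.entries = l.entries[:lastIndex]
	}
	return nil
}

// GetEntries sends entries first..last inclusive. On error it sends one
// LogEntryState with a non-nil Error and stops; the channel is always closed.
func (l *memLog) GetEntries(first, last int, ch chan<- *LogEntryState) {
	defer close(ch)
	for i := first; i <= last; i++ {
		if i < 1 || i > len(l.entries) {
			ch <- &LogEntryState{Index: i, Error: fmt.Errorf("index %d out of range", i)}
			return
		}
		ch <- &LogEntryState{Index: i, Entry: *l.entries[i-1]}
	}
}

func main() {
	log := &memLog{}
	_ = log.Append([]*LogEntry{{Term: 1, Index: 1}, {Term: 1, Index: 2}, {Term: 2, Index: 3}})
	_ = log.Truncate(2)
	ch := make(chan *LogEntryState)
	go log.GetEntries(1, 3, ch)
	for st := range ch {
		fmt.Println(st.Index, st.Error)
	}
}
```

A caller of GetLogEntries would range over the channel in the same way, treating the first non-nil Error as the end of the stream.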

type Transport

type Transport interface {
	// Listen informs the Transport of the local node's ID and callback interface.
	// The Transport should associate the given id with the server object so other Transports'
	// Connect methods can find it.
	Listen(id NodeID, server ServerInterface) error

	// Stop undoes a previous Listen.
	Stop(id NodeID)

	// Connect looks up a node by id and returns a stub interface to submit RPCs to it.
	Connect(id NodeID) (ClientInterface, error)
}

The Transport interface is supplied by the application to manage communication with other nodes. It is responsible for mapping from IDs to some communication channel (in the simplest case, a host:port pair could be used as an ID, although this would make it impossible to move an instance from one host to another except by syncing up a new node from scratch).
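A minimal in-process Transport might look like the sketch below. It is not NewLocalRPCTransport: memTransport, memClient, and echoServer are hypothetical names, the interfaces are copied locally, and no network is involved. The client fills in an rpc.Call by hand, treating it as a plain data structure.

```go
package main

import (
	"fmt"
	"net/rpc"
	"sync"
)

type NodeID int32

// Local copies of the package's interfaces.
type ServerInterface interface {
	DoRPC(name string, req, resp interface{}) error
}

type ClientInterface interface {
	Go(serviceMethod string, args interface{}, reply interface{}, done chan *rpc.Call) *rpc.Call
	Close() error
}

// memTransport maps node IDs to servers living in the same process.
type memTransport struct {
	mu      sync.Mutex
	servers map[NodeID]ServerInterface
}

func newMemTransport() *memTransport {
	return &memTransport{servers: make(map[NodeID]ServerInterface)}
}

func (t *memTransport) Listen(id NodeID, server ServerInterface) error {
	t.mu.Lock()
	defer t.mu.Unlock()
	if _, ok := t.servers[id]; ok {
		return fmt.Errorf("node %d is already listening", id)
	}
	t.servers[id] = server
	return nil
}

func (t *memTransport) Stop(id NodeID) {
	t.mu.Lock()
	defer t.mu.Unlock()
	delete(t.servers, id)
}

func (t *memTransport) Connect(id NodeID) (ClientInterface, error) {
	t.mu.Lock()
	defer t.mu.Unlock()
	s, ok := t.servers[id]
	if !ok {
		return nil, fmt.Errorf("unknown node %d", id)
	}
	return &memClient{server: s}, nil
}

// memClient invokes the server directly and reports completion on done.
type memClient struct{ server ServerInterface }

func (c *memClient) Go(serviceMethod string, args interface{}, reply interface{}, done chan *rpc.Call) *rpc.Call {
	if done == nil {
		done = make(chan *rpc.Call, 1)
	}
	call := &rpc.Call{ServiceMethod: serviceMethod, Args: args, Reply: reply, Done: done}
	go func() {
		call.Error = c.server.DoRPC(serviceMethod, args, reply)
		done <- call
	}()
	return call
}

func (c *memClient) Close() error { return nil }

// echoServer is a toy server used only for the demonstration.
type echoServer struct{ id NodeID }

func (s echoServer) DoRPC(name string, req, resp interface{}) error {
	out, ok := resp.(*string)
	if !ok {
		return fmt.Errorf("unexpected response type %T", resp)
	}
	*out = fmt.Sprintf("node %d handled %s(%v)", s.id, name, req)
	return nil
}

func main() {
	tr := newMemTransport()
	if err := tr.Listen(2, echoServer{id: 2}); err != nil {
		panic(err)
	}
	client, err := tr.Connect(2)
	if err != nil {
		panic(err)
	}
	var reply string
	call := <-client.Go("Ping", "hello", &reply, nil).Done
	fmt.Println(reply, call.Error)
}
```

Because nodes are looked up by ID rather than by address, a production Transport could move a node to another host by updating only this mapping.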

func NewLocalRPCTransport

func NewLocalRPCTransport() Transport

NewLocalRPCTransport creates a Transport for local testing use. MultiRaft instances sharing the same local Transport can find and communicate with each other by NodeID. Each instance binds to a different unused port on localhost.
