Documentation ¶
Overview ¶
Package storage provides access to the Store and Range abstractions. Each Cockroach node handles one or more stores, each of which multiplexes to one or more ranges, identified by [start, end) keys. Ranges are contiguous regions of the keyspace. Each range implements an instance of the Raft consensus algorithm to synchronize participating range replicas.
Each store is represented by a single engine.Engine instance. The ranges hosted by a store all have access to the same engine, but write to only a range-limited keyspace within it. Ranges access the underlying engine via the MVCC interface, which provides historical versioned values.
Package storage is a generated protocol buffer package. It is generated from these files: cockroach/storage/status.proto It has these top-level messages: StoreStatus
Example (Rebalancing) ¶
// Model a set of stores in a cluster, // randomly adding / removing stores and adding bytes. g := gossip.New(nil, 0, nil) alloc := newAllocator(g) alloc.randGen = rand.New(rand.NewSource(0)) alloc.deterministic = true var wg sync.WaitGroup g.RegisterCallback(gossip.MakePrefixPattern(gossip.KeyCapacityPrefix), func(_ string, _ bool) { wg.Done() }) const generations = 100 const nodes = 20 // Initialize testStores. var testStores [nodes]testStore for i := 0; i < len(testStores); i++ { testStores[i].StoreID = proto.StoreID(i) testStores[i].Node = proto.NodeDescriptor{NodeID: proto.NodeID(i)} testStores[i].Capacity = proto.StoreCapacity{Capacity: 1 << 30, Available: 1 << 30} } // Initialize the cluster with a single range. testStores[0].Add(alloc.randGen.Int63n(1 << 20)) for i := 0; i < generations; i++ { // First loop through test stores and add data. wg.Add(len(testStores)) for j := 0; j < len(testStores); j++ { // Add a pretend range to the testStore if there's already one. if testStores[j].Capacity.RangeCount > 0 { testStores[j].Add(alloc.randGen.Int63n(1 << 20)) } key := gossip.MakeCapacityKey(proto.NodeID(j), proto.StoreID(j)) if err := g.AddInfo(key, testStores[j].StoreDescriptor, 0); err != nil { panic(err) } } wg.Wait() // Next loop through test stores and maybe rebalance. for j := 0; j < len(testStores); j++ { ts := &testStores[j] if alloc.ShouldRebalance(&testStores[j].StoreDescriptor) { target := alloc.RebalanceTarget(proto.Attributes{}, []proto.Replica{{NodeID: ts.Node.NodeID, StoreID: ts.StoreID}}) if target != nil { testStores[j].Rebalance(&testStores[int(target.StoreID)], alloc.randGen.Int63n(1<<20)) } } } // Output store capacities as hexidecimal 2-character values. if i%(generations/50) == 0 { var maxBytes int64 for j := 0; j < len(testStores); j++ { bytes := testStores[j].Capacity.Capacity - testStores[j].Capacity.Available if bytes > maxBytes { maxBytes = bytes } } if maxBytes > 0 { for j := 0; j < len(testStores); j++ { endStr := " " if j == len(testStores)-1 { endStr = "" } bytes := testStores[j].Capacity.Capacity - testStores[j].Capacity.Available fmt.Printf("%03d%s", (999*bytes)/maxBytes, endStr) } fmt.Printf("\n") } } } var totBytes int64 var totRanges int32 for i := 0; i < len(testStores); i++ { totBytes += testStores[i].Capacity.Capacity - testStores[i].Capacity.Available totRanges += testStores[i].Capacity.RangeCount } fmt.Printf("Total bytes=%d, ranges=%d\n", totBytes, totRanges)
Output: 999 000 000 000 000 000 000 739 000 000 000 000 000 000 000 000 000 000 000 000 999 000 000 000 204 000 000 375 000 000 107 000 000 000 000 000 000 000 000 536 942 000 000 463 140 000 000 646 000 288 288 000 442 000 058 647 000 000 316 999 880 000 412 630 365 745 445 565 122 407 380 570 276 000 271 709 000 718 299 999 925 000 667 600 555 975 704 552 272 491 773 890 584 000 407 974 000 930 476 999 990 967 793 579 493 999 698 453 616 608 777 755 709 425 455 984 483 698 267 931 965 999 869 606 635 908 630 585 567 577 818 870 740 621 550 868 805 790 411 913 953 995 990 624 617 947 562 609 670 658 909 952 835 851 641 958 924 999 526 987 999 923 901 571 687 915 636 636 674 685 831 881 847 820 702 905 897 983 509 981 999 884 809 585 691 826 640 572 748 641 754 887 758 848 643 927 865 897 541 956 999 856 891 594 691 745 602 615 766 663 814 834 719 886 733 925 882 911 593 926 999 890 900 653 707 759 642 697 771 732 851 858 748 869 842 953 903 928 655 923 999 924 909 696 748 797 693 689 806 766 841 902 705 897 874 914 913 916 730 892 999 948 892 704 740 821 685 656 859 772 893 911 690 878 824 935 928 941 741 860 999 948 931 697 770 782 697 666 893 761 944 869 658 902 816 925 923 983 742 831 999 878 901 736 750 737 677 647 869 731 930 825 631 880 775 947 949 930 687 810 999 890 910 764 778 757 709 663 849 777 964 837 672 891 814 978 944 946 721 868 985 895 968 806 791 791 720 694 883 819 999 847 652 888 790 995 950 947 692 843 960 903 956 794 815 779 746 706 891 824 958 830 665 886 757 999 931 969 701 861 999 928 954 805 807 822 764 734 910 829 952 827 678 927 785 980 936 962 677 836 999 903 924 800 769 822 776 730 886 815 935 781 668 890 805 948 929 965 676 837 999 926 935 836 782 836 809 756 897 835 937 781 690 894 804 979 951 978 667 832 999 937 936 875 843 872 854 793 908 873 950 808 714 901 860 981 975 962 693 866 988 957 938 898 922 912 916 886 905 912 964 867 764 915 911 992 999 985 776 896 945 959 922 910 937 913 938 944 957 921 993 916 898 957 928 999 976 997 855 957 980 986 944 956 963 920 966 967 999 966 991 956 981 973 955 998 990 954 994 981 956 985 942 945 950 900 933 949 981 969 946 935 963 951 931 999 936 941 972 963 940 999 964 949 941 974 967 937 970 975 965 951 976 968 949 993 944 949 977 964 926 999 973 932 944 952 933 944 963 965 927 940 964 960 938 995 932 935 968 951 907 999 919 957 941 958 934 935 930 941 940 926 966 933 920 973 937 923 938 946 924 999 914 963 976 945 911 936 929 951 930 930 972 935 941 977 932 960 939 958 942 999 950 961 987 942 928 945 938 941 939 936 985 937 969 985 952 958 957 948 956 999 950 947 943 939 949 934 929 935 940 942 943 957 988 974 933 936 938 951 967 990 950 949 964 952 951 922 943 940 954 956 962 946 982 999 945 949 940 954 970 999 952 959 970 955 957 974 937 965 968 947 950 958 947 993 953 938 958 950 945 964 954 963 965 959 967 961 925 978 954 944 968 937 960 999 947 947 961 960 930 957 938 974 956 944 968 930 944 972 930 946 958 974 940 999 961 945 953 947 966 980 954 989 979 960 969 995 961 986 954 980 980 971 968 999 968 977 979 972 963 953 958 986 990 947 973 955 955 983 974 981 961 964 977 999 984 982 966 964 964 968 975 993 999 955 965 958 972 995 978 981 956 966 981 987 978 976 985 966 967 957 954 999 963 940 968 966 941 966 971 969 957 961 949 940 968 963 988 947 951 939 952 980 937 948 964 970 941 965 979 966 941 940 952 938 973 955 999 934 939 958 941 998 942 951 962 942 962 951 972 978 946 935 958 935 950 947 999 953 959 952 938 999 936 957 961 950 937 954 975 971 958 930 938 930 944 939 978 950 957 943 963 999 947 965 953 937 966 953 978 972 963 937 933 945 944 937 979 952 945 951 956 999 926 948 958 923 947 934 951 961 955 941 949 936 945 929 960 947 956 960 975 999 945 977 956 934 954 943 961 956 956 954 960 954 958 929 969 938 947 966 993 999 944 963 942 939 963 935 952 957 968 947 962 946 962 947 959 942 940 961 999 992 935 946 938 932 968 939 957 938 970 949 964 934 948 957 952 939 944 955 999 978 940 932 937 944 957 936 957 945 958 955 947 933 956 948 947 942 Total bytes=1003302292, ranges=1899
Index ¶
- Constants
- Variables
- func InsertRange(txn *client.Txn, b *client.Batch, key proto.Key) error
- func ProcessStoreEvent(l StoreEventListener, event interface{})
- func SetupRangeTree(batch engine.Engine, ms *engine.MVCCStats, timestamp proto.Timestamp, ...) error
- type BeginScanRangesEvent
- type CommandQueue
- type EndScanRangesEvent
- type MergeRangeEvent
- type NotBootstrappedError
- type RegisterRangeEvent
- type RemoveRangeEvent
- type Replica
- func (r *Replica) AddCmd(ctx context.Context, args proto.Request) (proto.Response, error)
- func (r *Replica) AdminMerge(args proto.AdminMergeRequest) (proto.AdminMergeResponse, error)
- func (r *Replica) AdminSplit(args proto.AdminSplitRequest) (proto.AdminSplitResponse, error)
- func (r *Replica) Append(entries []raftpb.Entry) error
- func (r *Replica) ApplySnapshot(snap raftpb.Snapshot) error
- func (r *Replica) ChangeReplicas(changeType proto.ReplicaChangeType, replica proto.Replica) error
- func (r *Replica) ConditionalPut(batch engine.Engine, ms *engine.MVCCStats, args proto.ConditionalPutRequest) (proto.ConditionalPutResponse, error)
- func (r *Replica) ContainsKey(key proto.Key) bool
- func (r *Replica) ContainsKeyRange(start, end proto.Key) bool
- func (r *Replica) Delete(batch engine.Engine, ms *engine.MVCCStats, args proto.DeleteRequest) (proto.DeleteResponse, error)
- func (r *Replica) DeleteRange(batch engine.Engine, ms *engine.MVCCStats, args proto.DeleteRangeRequest) (proto.DeleteRangeResponse, error)
- func (r *Replica) Desc() *proto.RangeDescriptor
- func (r *Replica) Destroy() error
- func (r *Replica) EndTransaction(batch engine.Engine, ms *engine.MVCCStats, args proto.EndTransactionRequest) (proto.EndTransactionResponse, []proto.Intent, error)
- func (r *Replica) Entries(lo, hi, maxBytes uint64) ([]raftpb.Entry, error)
- func (r *Replica) FirstIndex() (uint64, error)
- func (r *Replica) GC(batch engine.Engine, ms *engine.MVCCStats, args proto.GCRequest) (proto.GCResponse, error)
- func (r *Replica) Get(batch engine.Engine, args proto.GetRequest) (proto.GetResponse, []proto.Intent, error)
- func (r *Replica) GetGCMetadata() (*proto.GCMetadata, error)
- func (r *Replica) GetLastVerificationTimestamp() (proto.Timestamp, error)
- func (r *Replica) GetMVCCStats() engine.MVCCStats
- func (r *Replica) GetMaxBytes() int64
- func (r *Replica) GetReplica() *proto.Replica
- func (r *Replica) HeartbeatTxn(batch engine.Engine, ms *engine.MVCCStats, args proto.HeartbeatTxnRequest) (proto.HeartbeatTxnResponse, error)
- func (r *Replica) Increment(batch engine.Engine, ms *engine.MVCCStats, args proto.IncrementRequest) (proto.IncrementResponse, error)
- func (r *Replica) InitialState() (raftpb.HardState, raftpb.ConfState, error)
- func (r *Replica) IsFirstRange() bool
- func (r *Replica) LastIndex() (uint64, error)
- func (r *Replica) LeaderLease(batch engine.Engine, ms *engine.MVCCStats, args proto.LeaderLeaseRequest) (proto.LeaderLeaseResponse, error)
- func (r *Replica) Less(i btree.Item) bool
- func (r *Replica) Merge(batch engine.Engine, ms *engine.MVCCStats, args proto.MergeRequest) (proto.MergeResponse, error)
- func (r *Replica) PushTxn(batch engine.Engine, ms *engine.MVCCStats, args proto.PushTxnRequest) (proto.PushTxnResponse, error)
- func (r *Replica) Put(batch engine.Engine, ms *engine.MVCCStats, args proto.PutRequest) (proto.PutResponse, error)
- func (r *Replica) RangeLookup(batch engine.Engine, args proto.RangeLookupRequest) (proto.RangeLookupResponse, []proto.Intent, error)
- func (r *Replica) ResolveIntent(batch engine.Engine, ms *engine.MVCCStats, args proto.ResolveIntentRequest) (proto.ResolveIntentResponse, error)
- func (r *Replica) ResolveIntentRange(batch engine.Engine, ms *engine.MVCCStats, ...) (proto.ResolveIntentRangeResponse, error)
- func (r *Replica) ReverseScan(batch engine.Engine, args proto.ReverseScanRequest) (proto.ReverseScanResponse, []proto.Intent, error)
- func (r *Replica) Scan(batch engine.Engine, args proto.ScanRequest) (proto.ScanResponse, []proto.Intent, error)
- func (r *Replica) SetHardState(st raftpb.HardState) error
- func (r *Replica) SetLastVerificationTimestamp(timestamp proto.Timestamp) error
- func (r *Replica) SetMaxBytes(maxBytes int64)
- func (r *Replica) Snapshot() (raftpb.Snapshot, error)
- func (r *Replica) String() string
- func (r *Replica) Term(i uint64) (uint64, error)
- func (r *Replica) TruncateLog(batch engine.Engine, ms *engine.MVCCStats, args proto.TruncateLogRequest) (proto.TruncateLogResponse, error)
- func (r *Replica) WaitForLeaderLease(t util.Tester)
- type ReplicationStatusEvent
- type ResponseCache
- func (rc *ResponseCache) ClearData(e engine.Engine) error
- func (rc *ResponseCache) CopyFrom(e engine.Engine, originRangeID proto.RangeID) error
- func (rc *ResponseCache) CopyInto(e engine.Engine, destRangeID proto.RangeID) error
- func (rc *ResponseCache) GetResponse(e engine.Engine, cmdID proto.ClientCmdID) (proto.ResponseWithError, error)
- func (rc *ResponseCache) PutResponse(e engine.Engine, cmdID proto.ClientCmdID, replyWithErr proto.ResponseWithError) error
- type SplitRangeEvent
- type StartStoreEvent
- type Store
- func (s *Store) AddReplicaTest(rng *Replica) error
- func (s *Store) AppliedIndex(groupID proto.RangeID) (uint64, error)
- func (s *Store) Attrs() proto.Attributes
- func (s *Store) Bootstrap(ident proto.StoreIdent, stopper *stop.Stopper) error
- func (s *Store) BootstrapRange() error
- func (s *Store) Capacity() (proto.StoreCapacity, error)
- func (s *Store) Clock() *hlc.Clock
- func (s *Store) ClusterID() string
- func (s *Store) Context(ctx context.Context) context.Context
- func (s *Store) DB() *client.DB
- func (s *Store) Descriptor() (*proto.StoreDescriptor, error)
- func (s *Store) DisableRangeGCQueue(disabled bool)
- func (s *Store) Engine() engine.Engine
- func (s *Store) EventFeed() StoreEventFeed
- func (s *Store) ExecuteCmd(ctx context.Context, args proto.Request) (proto.Response, error)
- func (s *Store) ForceRangeGCScan(t util.Tester)
- func (s *Store) ForceReplicationScan(t util.Tester)
- func (s *Store) GetReplica(rangeID proto.RangeID) (*Replica, error)
- func (s *Store) GetStatus() (*StoreStatus, error)
- func (s *Store) Gossip() *gossip.Gossip
- func (s *Store) GossipCapacity()
- func (s *Store) GroupStorage(groupID proto.RangeID) multiraft.WriteableGroupStorage
- func (s *Store) IsStarted() bool
- func (s *Store) LookupReplica(start, end proto.Key) *Replica
- func (s *Store) MergeRange(subsumingRng *Replica, updatedEndKey proto.Key, subsumedRangeID proto.RangeID) error
- func (s *Store) NewRangeDescriptor(start, end proto.Key, replicas []proto.Replica) (*proto.RangeDescriptor, error)
- func (s *Store) NewSnapshot() engine.Engine
- func (s *Store) ProposeRaftCommand(idKey cmdIDKey, cmd proto.RaftCommand) <-chan error
- func (s *Store) PublishStatus() error
- func (s *Store) RaftNodeID() proto.RaftNodeID
- func (s *Store) RaftStatus(rangeID proto.RangeID) *raft.Status
- func (s *Store) RemoveReplica(rng *Replica) error
- func (s *Store) ReplicaCount() int
- func (s *Store) SetRangeRetryOptions(ro retry.Options)
- func (s *Store) SplitRange(origRng, newRng *Replica) error
- func (s *Store) Start(stopper *stop.Stopper) error
- func (s *Store) StartedAt() int64
- func (s *Store) Stopper() *stop.Stopper
- func (s *Store) StoreID() proto.StoreID
- func (s *Store) String() string
- func (s *Store) Tracer() *tracer.Tracer
- func (s *Store) WaitForInit()
- type StoreContext
- type StoreEventFeed
- type StoreEventListener
- type StoreStatus
- func (m *StoreStatus) GetAvailableRangeCount() int32
- func (m *StoreStatus) GetDesc() cockroach_proto.StoreDescriptor
- func (m *StoreStatus) GetLeaderRangeCount() int32
- func (m *StoreStatus) GetNodeID() github_com_cockroachdb_cockroach_proto.NodeID
- func (m *StoreStatus) GetRangeCount() int32
- func (m *StoreStatus) GetReplicatedRangeCount() int32
- func (m *StoreStatus) GetStartedAt() int64
- func (m *StoreStatus) GetStats() cockroach_storage_engine.MVCCStats
- func (m *StoreStatus) GetUpdatedAt() int64
- func (m *StoreStatus) Marshal() (data []byte, err error)
- func (m *StoreStatus) MarshalTo(data []byte) (n int, err error)
- func (*StoreStatus) ProtoMessage()
- func (m *StoreStatus) Reset()
- func (m *StoreStatus) Size() (n int)
- func (m *StoreStatus) String() string
- func (m *StoreStatus) Unmarshal(data []byte) error
- type StoreStatusEvent
- type TimestampCache
- func (tc *TimestampCache) Add(start, end proto.Key, timestamp proto.Timestamp, txnID []byte, readOnly bool)
- func (tc *TimestampCache) Clear(clock *hlc.Clock)
- func (tc *TimestampCache) GetMax(start, end proto.Key, txnID []byte) (proto.Timestamp, proto.Timestamp)
- func (tc *TimestampCache) MergeInto(dest *TimestampCache, clear bool)
- func (tc *TimestampCache) SetLowWater(lowWater proto.Timestamp)
- type UpdateRangeEvent
Examples ¶
Constants ¶
const ( // DefaultHeartbeatInterval is how often heartbeats are sent from the // transaction coordinator to a live transaction. These keep it from // being preempted by other transactions writing the same keys. If a // transaction fails to be heartbeat within 2x the heartbeat interval, // it may be aborted by conflicting txns. DefaultHeartbeatInterval = 5 * time.Second )
const ( // DefaultLeaderLeaseDuration is the default duration of the leader lease. DefaultLeaderLeaseDuration = time.Second )
raftInitialLogIndex is the starting point for the raft log. We bootstrap the raft membership by synthesizing a snapshot as if there were some discarded prefix to the log, so we must begin the log at an arbitrary index greater than 1.
const ( // GCResponseCacheExpiration is the expiration duration for response // cache entries. GCResponseCacheExpiration = 1 * time.Hour )
const ( // MinTSCacheWindow specifies the minimum duration to hold entries in // the cache before allowing eviction. After this window expires, // transactions writing to this node with timestamps lagging by more // than minCacheWindow will necessarily have to advance their commit // timestamp. MinTSCacheWindow = 10 * time.Second )
const ( // RangeGCQueueInactivityThreshold is the inactivity duration after which // a range will be considered for garbage collection. Exported for testing. RangeGCQueueInactivityThreshold = 10 * 24 * time.Hour // 10 days )
Variables ¶
var (
ErrInvalidLengthStatus = fmt.Errorf("proto: negative length found during unmarshaling")
)
var ( // TestStoreContext has some fields initialized with values relevant // in tests. TestStoreContext = StoreContext{ RaftTickInterval: 100 * time.Millisecond, RaftHeartbeatIntervalTicks: 1, RaftElectionTimeoutTicks: 2, ScanInterval: 10 * time.Minute, } )
var TestingCommandFilter func(proto.Request) error
TestingCommandFilter may be set in tests to intercept the handling of commands and artificially generate errors. Return nil to continue with regular processing or non-nil to terminate processing with the returned error. Note that in a multi-replica test this filter will be run once for each replica and must produce consistent results each time. Should only be used in tests in the storage and storage_test packages.
Functions ¶
func InsertRange ¶
InsertRange adds a new range to the RangeTree. This should only be called from operations that create new ranges, such as AdminSplit.
func ProcessStoreEvent ¶
func ProcessStoreEvent(l StoreEventListener, event interface{})
ProcessStoreEvent dispatches an event on the StoreEventListener.
Types ¶
type BeginScanRangesEvent ¶
BeginScanRangesEvent occurs when the store is about to scan over all ranges. During such a scan, each existing range will be published to the feed as a RegisterRangeEvent with the Scan flag set. This is used because downstream consumers may be tracking statistics via the Deltas in UpdateRangeEvent; this event informs subscribers to clear currently cached values.
type CommandQueue ¶
type CommandQueue struct {
// contains filtered or unexported fields
}
A CommandQueue maintains an interval tree of keys or key ranges for executing commands. New commands affecting keys or key ranges must wait on already-executing commands which overlap their key range.
Before executing, a command invokes GetWait() to initialize a WaitGroup with the number of overlapping commands which are already running. The wait group is waited on by the caller for confirmation that all overlapping, pending commands have completed and the pending command can proceed.
After waiting, a command is added to the queue's already-executing set via Add(). Add accepts a parameter indicating whether the command is read-only. Read-only commands don't need to wait on other read-only commands, so the wait group returned via GetWait() doesn't include read-only on read-only overlapping commands as an optimization.
Once commands complete, Remove() is invoked to remove the executing command and decrement the counts on any pending WaitGroups, possibly signaling waiting commands who were gated by the executing command's affected key(s).
CommandQueue is not thread safe.
func NewCommandQueue ¶
func NewCommandQueue() *CommandQueue
NewCommandQueue returns a new command queue.
func (*CommandQueue) Add ¶
func (cq *CommandQueue) Add(start, end proto.Key, readOnly bool) interface{}
Add adds a command to the queue which affects the specified key range. If end is empty, it is set to start.Next(), meaning the command affects a single key. The returned interface is the key for the command queue and must be re-supplied on subsequent invocation of Remove().
Add should be invoked after waiting on already-executing, overlapping commands via the WaitGroup initialized through GetWait().
func (*CommandQueue) Clear ¶
func (cq *CommandQueue) Clear()
Clear removes all executing commands, signaling any waiting commands.
func (*CommandQueue) GetWait ¶
GetWait initializes the supplied wait group with the number of executing commands which overlap the specified key range. If end is empty, end is set to start.Next(), meaning the command affects a single key. The caller should call wg.Wait() to wait for confirmation that all gating commands have completed or failed. readOnly is true if the requester is a read-only command; false for read-write.
func (*CommandQueue) Remove ¶
func (cq *CommandQueue) Remove(key interface{})
Remove is invoked to signal that the command associated with the specified key has completed and should be removed. Any pending commands waiting on this command will be signaled if this is the only command upon which they are still waiting.
Remove is invoked after a mutating command has been committed to the Raft log and applied to the underlying state machine. Similarly, Remove is invoked after a read-only command has been executed against the underlying state machine.
type EndScanRangesEvent ¶
EndScanRangesEvent occurs when the store has finished scanning all ranges. Every BeginScanRangeEvent will eventually be followed by an EndScanRangeEvent.
type MergeRangeEvent ¶
type MergeRangeEvent struct { StoreID proto.StoreID Merged UpdateRangeEvent Removed RemoveRangeEvent }
MergeRangeEvent occurs whenever a range is merged into another. This Event contains two component events: an UpdateRangeEvent for the range which subsumed the other, and a RemoveRangeEvent for the range that was subsumed.
type NotBootstrappedError ¶
type NotBootstrappedError struct{}
A NotBootstrappedError indicates that an engine has not yet been bootstrapped due to a store identifier not being present.
func (*NotBootstrappedError) Error ¶
func (e *NotBootstrappedError) Error() string
Error formats error.
type RegisterRangeEvent ¶
type RegisterRangeEvent struct { StoreID proto.StoreID Desc *proto.RangeDescriptor Stats engine.MVCCStats Scan bool }
RegisterRangeEvent occurs in two scenarios. Firstly, while a store broadcasts its list of ranges to initialize one or more new accumulators (with Scan set to true), or secondly, when a new range is initialized on the store (for example through replication), with Scan set to false. This event includes the Range's RangeDescriptor and current MVCCStats.
type RemoveRangeEvent ¶
type RemoveRangeEvent struct { StoreID proto.StoreID Desc *proto.RangeDescriptor Stats engine.MVCCStats }
RemoveRangeEvent occurs whenever a Range is removed from a store. This structure includes the Range's RangeDescriptor and the Range's previous MVCCStats before it was removed.
type Replica ¶
type Replica struct { sync.RWMutex // Protects the following fields: // contains filtered or unexported fields }
A Replica is a contiguous keyspace with writes managed via an instance of the Raft consensus algorithm. Many ranges may exist in a store and they are unlikely to be contiguous. Ranges are independent units and are responsible for maintaining their own integrity by replacing failed replicas, splitting and merging as appropriate.
func NewReplica ¶
func NewReplica(desc *proto.RangeDescriptor, rm rangeManager) (*Replica, error)
NewReplica initializes the replica using the given metadata.
func (*Replica) AddCmd ¶
AddCmd adds a command for execution on this range. The command's affected keys are verified to be contained within the range and the range's leadership is confirmed. The command is then dispatched either along the read-only execution path or the read-write Raft command queue.
func (*Replica) AdminMerge ¶
func (r *Replica) AdminMerge(args proto.AdminMergeRequest) (proto.AdminMergeResponse, error)
AdminMerge extends the range to subsume the range that comes next in the key space. The range being subsumed is provided in args.SubsumedRange. The EndKey of the subsuming range must equal the start key of the range being subsumed. The merge is performed inside of a distributed transaction which writes the updated range descriptor for the subsuming range and deletes the range descriptor for the subsumed one. It also updates the range addressing metadata. The handover of responsibility for the reassigned key range is carried out seamlessly through a merge trigger carried out as part of the commit of that transaction. A merge requires that the two ranges are collocate on the same set of replicas.
func (*Replica) AdminSplit ¶
func (r *Replica) AdminSplit(args proto.AdminSplitRequest) (proto.AdminSplitResponse, error)
AdminSplit divides the range into into two ranges, using either args.SplitKey (if provided) or an internally computed key that aims to roughly equipartition the range by size. The split is done inside of a distributed txn which writes updated and new range descriptors, and updates the range addressing metadata. The handover of responsibility for the reassigned key range is carried out seamlessly through a split trigger carried out as part of the commit of that transaction.
func (*Replica) ApplySnapshot ¶
ApplySnapshot implements the multiraft.WriteableGroupStorage interface.
func (*Replica) ChangeReplicas ¶
ChangeReplicas adds or removes a replica of a range. The change is performed in a distributed transaction and takes effect when that transaction is committed. When removing a replica, only the NodeID and StoreID fields of the Replica are used.
func (*Replica) ConditionalPut ¶
func (r *Replica) ConditionalPut(batch engine.Engine, ms *engine.MVCCStats, args proto.ConditionalPutRequest) (proto.ConditionalPutResponse, error)
ConditionalPut sets the value for a specified key only if the expected value matches. If not, the return value contains the actual value.
func (*Replica) ContainsKey ¶
ContainsKey returns whether this range contains the specified key.
func (*Replica) ContainsKeyRange ¶
ContainsKeyRange returns whether this range contains the specified key range from start to end.
func (*Replica) Delete ¶
func (r *Replica) Delete(batch engine.Engine, ms *engine.MVCCStats, args proto.DeleteRequest) (proto.DeleteResponse, error)
Delete deletes the key and value specified by key.
func (*Replica) DeleteRange ¶
func (r *Replica) DeleteRange(batch engine.Engine, ms *engine.MVCCStats, args proto.DeleteRangeRequest) (proto.DeleteRangeResponse, error)
DeleteRange deletes the range of key/value pairs specified by start and end keys.
func (*Replica) Desc ¶
func (r *Replica) Desc() *proto.RangeDescriptor
Desc atomically returns the range's descriptor.
func (*Replica) EndTransaction ¶
func (r *Replica) EndTransaction(batch engine.Engine, ms *engine.MVCCStats, args proto.EndTransactionRequest) (proto.EndTransactionResponse, []proto.Intent, error)
EndTransaction either commits or aborts (rolls back) an extant transaction according to the args.Commit parameter.
func (*Replica) Entries ¶
Entries implements the raft.Storage interface. Note that maxBytes is advisory and this method will always return at least one entry even if it exceeds maxBytes. Passing maxBytes equal to zero disables size checking. TODO(bdarnell): consider caching for recent entries, if rocksdb's builtin caching is insufficient.
func (*Replica) FirstIndex ¶
FirstIndex implements the raft.Storage interface.
func (*Replica) GC ¶
func (r *Replica) GC(batch engine.Engine, ms *engine.MVCCStats, args proto.GCRequest) (proto.GCResponse, error)
GC iterates through the list of keys to garbage collect specified in the arguments. MVCCGarbageCollect is invoked on each listed key along with the expiration timestamp. The GC metadata specified in the args is persisted after GC.
func (*Replica) Get ¶
func (r *Replica) Get(batch engine.Engine, args proto.GetRequest) (proto.GetResponse, []proto.Intent, error)
Get returns the value for a specified key.
func (*Replica) GetGCMetadata ¶
func (r *Replica) GetGCMetadata() (*proto.GCMetadata, error)
GetGCMetadata reads the latest GC metadata for this range.
func (*Replica) GetLastVerificationTimestamp ¶
GetLastVerificationTimestamp reads the timestamp at which the range's data was last verified.
func (*Replica) GetMVCCStats ¶
GetMVCCStats returns a copy of the MVCC stats object for this range.
func (*Replica) GetMaxBytes ¶
GetMaxBytes atomically gets the range maximum byte limit.
func (*Replica) GetReplica ¶
GetReplica returns the replica for this range from the range descriptor. Returns nil if the replica is not found.
func (*Replica) HeartbeatTxn ¶
func (r *Replica) HeartbeatTxn(batch engine.Engine, ms *engine.MVCCStats, args proto.HeartbeatTxnRequest) (proto.HeartbeatTxnResponse, error)
HeartbeatTxn updates the transaction status and heartbeat timestamp after receiving transaction heartbeat messages from coordinator. Returns the updated transaction.
func (*Replica) Increment ¶
func (r *Replica) Increment(batch engine.Engine, ms *engine.MVCCStats, args proto.IncrementRequest) (proto.IncrementResponse, error)
Increment increments the value (interpreted as varint64 encoded) and returns the newly incremented value (encoded as varint64). If no value exists for the key, zero is incremented.
func (*Replica) InitialState ¶
InitialState implements the raft.Storage interface.
func (*Replica) IsFirstRange ¶
IsFirstRange returns true if this is the first range.
func (*Replica) LeaderLease ¶
func (r *Replica) LeaderLease(batch engine.Engine, ms *engine.MVCCStats, args proto.LeaderLeaseRequest) (proto.LeaderLeaseResponse, error)
LeaderLease sets the leader lease for this range. The command fails only if the desired start timestamp collides with a previous lease. Otherwise, the start timestamp is wound back to right after the expiration of the previous lease (or zero). If this range replica is already the lease holder, the expiration will be extended or shortened as indicated. For a new lease, all duties required of the range leader are commenced, including clearing the command queue and timestamp cache.
func (*Replica) Merge ¶
func (r *Replica) Merge(batch engine.Engine, ms *engine.MVCCStats, args proto.MergeRequest) (proto.MergeResponse, error)
Merge is used to merge a value into an existing key. Merge is an efficient accumulation operation which is exposed by RocksDB, used by Cockroach for the efficient accumulation of certain values. Due to the difficulty of making these operations transactional, merges are not currently exposed directly to clients. Merged values are explicitly not MVCC data.
func (*Replica) PushTxn ¶
func (r *Replica) PushTxn(batch engine.Engine, ms *engine.MVCCStats, args proto.PushTxnRequest) (proto.PushTxnResponse, error)
PushTxn resolves conflicts between concurrent txns (or between a non-transactional reader or writer and a txn) in several ways depending on the statuses and priorities of the conflicting transactions. The PushTxn operation is invoked by a "pusher" (the writer trying to abort a conflicting txn or the reader trying to push a conflicting txn's commit timestamp forward), who attempts to resolve a conflict with a "pushee" (args.PushTxn -- the pushee txn whose intent(s) caused the conflict).
Txn already committed/aborted: If pushee txn is committed or aborted return success.
Txn Timeout: If pushee txn entry isn't present or its LastHeartbeat timestamp isn't set, use PushTxn.Timestamp as LastHeartbeat. If current time - LastHeartbeat > 2 * DefaultHeartbeatInterval, then the pushee txn should be either pushed forward, aborted, or confirmed not pending, depending on value of Request.PushType.
Old Txn Epoch: If persisted pushee txn entry has a newer Epoch than PushTxn.Epoch, return success, as older epoch may be removed.
Lower Txn Priority: If pushee txn has a lower priority than pusher, adjust pushee's persisted txn depending on value of args.PushType. If args.PushType is ABORT_TXN, set txn.Status to ABORTED, and priority to one less than the pusher's priority and return success. If args.PushType is PUSH_TIMESTAMP, set txn.Timestamp to pusher's Timestamp + 1 (note that we use the pusher's Args.Timestamp, not Txn.Timestamp because the args timestamp can advance during the txn).
Higher Txn Priority: If pushee txn has a higher priority than pusher, return TransactionPushError. Transaction will be retried with priority one less than the pushee's higher priority.
func (*Replica) Put ¶
func (r *Replica) Put(batch engine.Engine, ms *engine.MVCCStats, args proto.PutRequest) (proto.PutResponse, error)
Put sets the value for a specified key.
func (*Replica) RangeLookup ¶
func (r *Replica) RangeLookup(batch engine.Engine, args proto.RangeLookupRequest) (proto.RangeLookupResponse, []proto.Intent, error)
RangeLookup is used to look up RangeDescriptors - a RangeDescriptor is a metadata structure which describes the key range and replica locations of a distinct range in the cluster.
RangeDescriptors are stored as values in the cockroach cluster's key-value store. However, they are always stored using special "Range Metadata keys", which are "ordinary" keys with a special prefix prepended. The Range Metadata Key for an ordinary key can be generated with the `keys.RangeMetaKey(key)` function. The RangeDescriptor for the range which contains a given key can be retrieved by generating its Range Metadata Key and dispatching it to RangeLookup.
Note that the Range Metadata Key sent to RangeLookup is NOT the key at which the desired RangeDescriptor is stored. Instead, this method returns the RangeDescriptor stored at the _lowest_ existing key which is _greater_ than the given key. The returned RangeDescriptor will thus contain the ordinary key which was originally used to generate the Range Metadata Key sent to RangeLookup.
The "Range Metadata Key" for a range is built by appending the end key of the range to the respective meta prefix.
Lookups for range metadata keys usually want to read inconsistently, but some callers need a consistent result; both are supported.
This method has an important optimization in the inconsistent case: instead of just returning the request RangeDescriptor, it also returns a slice of additional range descriptors immediately consecutive to the desired RangeDescriptor. This is intended to serve as a sort of caching pre-fetch, so that the requesting nodes can aggressively cache RangeDescriptors which are likely to be desired by their current workload. The Reverse flag specifies whether descriptors are prefetched in descending or ascending order.
func (*Replica) ResolveIntent ¶
func (r *Replica) ResolveIntent(batch engine.Engine, ms *engine.MVCCStats, args proto.ResolveIntentRequest) (proto.ResolveIntentResponse, error)
ResolveIntent resolves a write intent from the specified key according to the status of the transaction which created it.
func (*Replica) ResolveIntentRange ¶
func (r *Replica) ResolveIntentRange(batch engine.Engine, ms *engine.MVCCStats, args proto.ResolveIntentRangeRequest) (proto.ResolveIntentRangeResponse, error)
ResolveIntentRange resolves write intents in the specified key range according to the status of the transaction which created it.
func (*Replica) ReverseScan ¶
func (r *Replica) ReverseScan(batch engine.Engine, args proto.ReverseScanRequest) (proto.ReverseScanResponse, []proto.Intent, error)
ReverseScan scans the key range specified by start key through end key in descending order up to some maximum number of results.
func (*Replica) Scan ¶
func (r *Replica) Scan(batch engine.Engine, args proto.ScanRequest) (proto.ScanResponse, []proto.Intent, error)
Scan scans the key range specified by start key through end key in ascending order up to some maximum number of results.
func (*Replica) SetHardState ¶
SetHardState implements the multiraft.WriteableGroupStorage interface.
func (*Replica) SetLastVerificationTimestamp ¶
SetLastVerificationTimestamp writes the timestamp at which the range's data was last verified.
func (*Replica) SetMaxBytes ¶
SetMaxBytes atomically sets the maximum byte limit before split. This value is cached by the range for efficiency.
func (*Replica) TruncateLog ¶
func (r *Replica) TruncateLog(batch engine.Engine, ms *engine.MVCCStats, args proto.TruncateLogRequest) (proto.TruncateLogResponse, error)
TruncateLog discards a prefix of the raft log.
func (*Replica) WaitForLeaderLease ¶
WaitForLeaderLease is used from unittests to wait until this range has the leader lease.
type ReplicationStatusEvent ¶
type ReplicationStatusEvent struct { StoreID proto.StoreID // Per-range availability information, which is currently computed by // periodically polling the ranges of each store. // TODO(mrtracy): See if this information could be computed incrementally // from other events. LeaderRangeCount int32 ReplicatedRangeCount int32 AvailableRangeCount int32 }
ReplicationStatusEvent contains statistics on the replication status of the ranges in the store.
Because these statistics cannot currently be computed from other events, this event should be periodically broadcast by the store independently of other operations.
type ResponseCache ¶
type ResponseCache struct {
// contains filtered or unexported fields
}
A ResponseCache provides idempotence for request retries. Each request to a range specifies a ClientCmdID in the request header which uniquely identifies a client command. After commands have been replicated via Raft, they are executed against the state machine and the results are stored in the ResponseCache.
The ResponseCache stores responses in the underlying engine, using keys derived from the Range ID and the ClientCmdID.
A ResponseCache is not thread safe. Access to it is serialized through Raft.
func NewResponseCache ¶
func NewResponseCache(rangeID proto.RangeID) *ResponseCache
NewResponseCache returns a new response cache. Every range replica maintains a response cache, not just the leader. However, when a replica loses or gains leadership of the Raft consensus group, the inflight map should be cleared.
func (*ResponseCache) ClearData ¶
func (rc *ResponseCache) ClearData(e engine.Engine) error
ClearData removes all items stored in the persistent cache. It does not alter the inflight map.
func (*ResponseCache) CopyFrom ¶
CopyFrom copies all the cached results from the originRangeID response cache into this one. Note that the cache will not be locked while copying is in progress. Failures decoding individual cache entries return an error. The copy is done directly using the engine instead of interpreting values through MVCC for efficiency.
func (*ResponseCache) CopyInto ¶
CopyInto copies all the cached results from this response cache into the destRangeID response cache. Failures decoding individual cache entries return an error.
func (*ResponseCache) GetResponse ¶
func (rc *ResponseCache) GetResponse(e engine.Engine, cmdID proto.ClientCmdID) (proto.ResponseWithError, error)
GetResponse looks up a response matching the specified cmdID. If the response is found, it is returned along with its associated error. If the response is not found, nil is returned for both the response and its error. In all cases, the third return value is the error returned from the engine when reading the on-disk cache.
func (*ResponseCache) PutResponse ¶
func (rc *ResponseCache) PutResponse(e engine.Engine, cmdID proto.ClientCmdID, replyWithErr proto.ResponseWithError) error
PutResponse writes a response and an error associated with it to the cache for the specified cmdID.
type SplitRangeEvent ¶
type SplitRangeEvent struct { StoreID proto.StoreID Original UpdateRangeEvent New RegisterRangeEvent }
SplitRangeEvent occurs whenever a range is split in two. This Event actually contains two other events: an UpdateRangeEvent for the Range which originally existed, and a RegisterRangeEvent for the range created via the split.
type StartStoreEvent ¶
StartStoreEvent occurs whenever a store is initially started.
type Store ¶
type Store struct { Ident proto.StoreIdent // contains filtered or unexported fields }
A Store maintains a map of ranges by start key. A Store corresponds to one physical device.
func NewStore ¶
func NewStore(ctx StoreContext, eng engine.Engine, nodeDesc *proto.NodeDescriptor) *Store
NewStore returns a new instance of a store.
func (*Store) AddReplicaTest ¶
AddReplicaTest adds the replica to the store's replica map and to the sorted replicasByKey slice. To be used only by unittests.
func (*Store) AppliedIndex ¶
AppliedIndex implements the multiraft.StateMachine interface.
func (*Store) Attrs ¶
func (s *Store) Attrs() proto.Attributes
Attrs returns the attributes of the underlying store.
func (*Store) Bootstrap ¶
Bootstrap writes a new store ident to the underlying engine. To ensure that no crufty data already exists in the engine, it scans the engine contents before writing the new store ident. The engine should be completely empty. It returns an error if called on a non-empty engine.
func (*Store) BootstrapRange ¶
BootstrapRange creates the first range in the cluster and manually writes it to the store. Default range addressing records are created for meta1 and meta2. Default configurations for accounting, permissions, users, and zones are created. All configs are specified for the empty key prefix, meaning they apply to the entire database. Permissions are granted to all users and the zone requires three replicas with no other specifications. It also adds the range tree and the root node, the first range, to it.
func (*Store) Capacity ¶
func (s *Store) Capacity() (proto.StoreCapacity, error)
Capacity returns the capacity of the underlying storage engine.
func (*Store) Context ¶
Context returns a base context to pass along with commands being executed, derived from the supplied context (which is allowed to be nil).
func (*Store) Descriptor ¶
func (s *Store) Descriptor() (*proto.StoreDescriptor, error)
Descriptor returns a StoreDescriptor including current store capacity information.
func (*Store) DisableRangeGCQueue ¶
DisableRangeGCQueue disables or enables the range GC queue. Exposed only for testing.
func (*Store) ExecuteCmd ¶
ExecuteCmd fetches a range based on the header's replica, assembles method, args & reply into a Raft Cmd struct and executes the command using the fetched range.
func (*Store) ForceRangeGCScan ¶
ForceRangeGCScan iterates over all ranges and enqueues any that may need to be GC'd. Exposed only for testing.
func (*Store) ForceReplicationScan ¶
ForceReplicationScan iterates over all ranges and enqueues any that need to be replicated. Exposed only for testing.
func (*Store) GetReplica ¶
GetReplica fetches a replica by Range ID. Returns an error if no replica is found.
func (*Store) GetStatus ¶
func (s *Store) GetStatus() (*StoreStatus, error)
GetStatus fetches the latest store status from the stored value on the cluster. Returns nil if the scanner has not yet run. The scanner runs once every ctx.ScanInterval.
func (*Store) GossipCapacity ¶
func (s *Store) GossipCapacity()
GossipCapacity broadcasts the node's capacity on the gossip network.
func (*Store) GroupStorage ¶
func (s *Store) GroupStorage(groupID proto.RangeID) multiraft.WriteableGroupStorage
GroupStorage implements the multiraft.Storage interface.
func (*Store) LookupReplica ¶
LookupReplica looks up a replica via binary search over the "replicasByKey" btree. Returns nil if no replica is found for specified key range. Note that the specified keys are transformed using Key.Address() to ensure we lookup replicas correctly for local keys. When end is nil, a replica that contains start is looked up.
func (*Store) MergeRange ¶
func (s *Store) MergeRange(subsumingRng *Replica, updatedEndKey proto.Key, subsumedRangeID proto.RangeID) error
MergeRange expands the subsuming range to absorb the subsumed range. This merge operation will fail if the two ranges are not collocated on the same store.
func (*Store) NewRangeDescriptor ¶
func (s *Store) NewRangeDescriptor(start, end proto.Key, replicas []proto.Replica) (*proto.RangeDescriptor, error)
NewRangeDescriptor creates a new descriptor based on start and end keys and the supplied proto.Replicas slice. It allocates new replica IDs to fill out the supplied replicas.
func (*Store) NewSnapshot ¶
NewSnapshot creates a new snapshot engine.
func (*Store) ProposeRaftCommand ¶
func (s *Store) ProposeRaftCommand(idKey cmdIDKey, cmd proto.RaftCommand) <-chan error
ProposeRaftCommand submits a command to raft. The command is processed asynchronously and an error or nil will be written to the returned channel when it is committed or aborted (but note that committed does mean that it has been applied to the range yet).
func (*Store) PublishStatus ¶
PublishStatus publishes periodically computed status events to the store's events feed. This method itself should be periodically called by some external mechanism.
func (*Store) RaftStatus ¶
RaftStatus returns the current raft status of the given range.
func (*Store) RemoveReplica ¶
RemoveReplica removes the replica from the store's replica map and from the sorted replicasByKey btree.
func (*Store) ReplicaCount ¶
ReplicaCount returns the number of replicas contained by this store.
func (*Store) SetRangeRetryOptions ¶
SetRangeRetryOptions sets the retry options used for this store. For unittests only.
func (*Store) SplitRange ¶
SplitRange shortens the original range to accommodate the new range. The new range is added to the ranges map and the rangesByKey btree.
func (*Store) StartedAt ¶
StartedAt returns the timestamp at which the store was most recently started.
func (*Store) WaitForInit ¶
func (s *Store) WaitForInit()
WaitForInit waits for any asynchronous processes begun in Start() to complete their initialization. In particular, this includes gossiping. In some cases this may block until the range GC queue has completed its scan. Only for testing.
type StoreContext ¶
type StoreContext struct { Clock *hlc.Clock DB *client.DB Gossip *gossip.Gossip Transport multiraft.Transport // RangeRetryOptions are the retry options when retryable errors are // encountered sending commands to ranges. RangeRetryOptions retry.Options // RaftTickInterval is the resolution of the Raft timer; other raft timeouts // are defined in terms of multiples of this value. RaftTickInterval time.Duration // RaftHeartbeatIntervalTicks is the number of ticks that pass between heartbeats. RaftHeartbeatIntervalTicks int // RaftElectionTimeoutTicks is the number of ticks that must pass before a follower // considers a leader to have failed and calls a new election. Should be significantly // higher than RaftHeartbeatIntervalTicks. The raft paper recommends a value of 150ms // for local networks. RaftElectionTimeoutTicks int // ScanInterval is the default value for the scan interval ScanInterval time.Duration // ScanMaxIdleTime is the maximum time the scanner will be idle between ranges. // If enabled (> 0), the scanner may complete in less than ScanInterval for small // stores. ScanMaxIdleTime time.Duration // EventFeed is a feed to which this store will publish events. EventFeed *util.Feed // Tracer is a request tracer. Tracer *tracer.Tracer }
A StoreContext encompasses the auxiliary objects and configuration required to create a store. All fields holding a pointer or an interface are required to create a store; the rest will have sane defaults set if omitted.
func (*StoreContext) Valid ¶
func (sc *StoreContext) Valid() bool
Valid returns true if the StoreContext is populated correctly. We don't check for Gossip and DB since some of our tests pass that as nil.
type StoreEventFeed ¶
type StoreEventFeed struct {
// contains filtered or unexported fields
}
StoreEventFeed is a helper structure which publishes store-specific events to a util.Feed. The target feed may be shared by multiple StoreEventFeeds. If the target feed is nil, event methods become no-ops.
func NewStoreEventFeed ¶
func NewStoreEventFeed(id proto.StoreID, feed *util.Feed) StoreEventFeed
NewStoreEventFeed creates a new StoreEventFeed which publishes events for a specific store to the supplied feed.
type StoreEventListener ¶
type StoreEventListener interface { OnRegisterRange(event *RegisterRangeEvent) OnUpdateRange(event *UpdateRangeEvent) OnRemoveRange(event *RemoveRangeEvent) OnSplitRange(event *SplitRangeEvent) OnMergeRange(event *MergeRangeEvent) OnStartStore(event *StartStoreEvent) OnBeginScanRanges(event *BeginScanRangesEvent) OnEndScanRanges(event *EndScanRangesEvent) OnStoreStatus(event *StoreStatusEvent) OnReplicationStatus(event *ReplicationStatusEvent) }
StoreEventListener is an interface that can be implemented by objects which listen for events published by stores.
type StoreStatus ¶
type StoreStatus struct { Desc cockroach_proto.StoreDescriptor `protobuf:"bytes,1,opt,name=desc" json:"desc"` NodeID github_com_cockroachdb_cockroach_proto.NodeID `protobuf:"varint,2,opt,name=node_id,casttype=github.com/cockroachdb/cockroach/proto.NodeID" json:"node_id"` RangeCount int32 `protobuf:"varint,3,opt,name=range_count" json:"range_count"` StartedAt int64 `protobuf:"varint,4,opt,name=started_at" json:"started_at"` UpdatedAt int64 `protobuf:"varint,5,opt,name=updated_at" json:"updated_at"` Stats cockroach_storage_engine.MVCCStats `protobuf:"bytes,6,opt,name=stats" json:"stats"` LeaderRangeCount int32 `protobuf:"varint,7,opt,name=leader_range_count" json:"leader_range_count"` ReplicatedRangeCount int32 `protobuf:"varint,8,opt,name=replicated_range_count" json:"replicated_range_count"` AvailableRangeCount int32 `protobuf:"varint,9,opt,name=available_range_count" json:"available_range_count"` XXX_unrecognized []byte `json:"-"` }
StoreStatus contains the stats needed to calculate the current status of a store.
func (*StoreStatus) GetAvailableRangeCount ¶
func (m *StoreStatus) GetAvailableRangeCount() int32
func (*StoreStatus) GetDesc ¶
func (m *StoreStatus) GetDesc() cockroach_proto.StoreDescriptor
func (*StoreStatus) GetLeaderRangeCount ¶
func (m *StoreStatus) GetLeaderRangeCount() int32
func (*StoreStatus) GetNodeID ¶
func (m *StoreStatus) GetNodeID() github_com_cockroachdb_cockroach_proto.NodeID
func (*StoreStatus) GetRangeCount ¶
func (m *StoreStatus) GetRangeCount() int32
func (*StoreStatus) GetReplicatedRangeCount ¶
func (m *StoreStatus) GetReplicatedRangeCount() int32
func (*StoreStatus) GetStartedAt ¶
func (m *StoreStatus) GetStartedAt() int64
func (*StoreStatus) GetStats ¶
func (m *StoreStatus) GetStats() cockroach_storage_engine.MVCCStats
func (*StoreStatus) GetUpdatedAt ¶
func (m *StoreStatus) GetUpdatedAt() int64
func (*StoreStatus) Marshal ¶
func (m *StoreStatus) Marshal() (data []byte, err error)
func (*StoreStatus) ProtoMessage ¶
func (*StoreStatus) ProtoMessage()
func (*StoreStatus) Reset ¶
func (m *StoreStatus) Reset()
func (*StoreStatus) Size ¶
func (m *StoreStatus) Size() (n int)
func (*StoreStatus) String ¶
func (m *StoreStatus) String() string
func (*StoreStatus) Unmarshal ¶
func (m *StoreStatus) Unmarshal(data []byte) error
type StoreStatusEvent ¶
type StoreStatusEvent struct {
Desc *proto.StoreDescriptor
}
StoreStatusEvent contains the current descriptor for the given store.
Because the descriptor contains information that cannot currently be computed from other events, this event should be periodically broadcast by the store independently of other operations.
type TimestampCache ¶
type TimestampCache struct {
// contains filtered or unexported fields
}
A TimestampCache maintains an interval tree FIFO cache of keys or key ranges and the timestamps at which they were most recently read or written. If a timestamp was read or written by a transaction, the txn ID is stored with the timestamp to avoid advancing timestamps on successive requests from the same transaction.
The cache also maintains a low-water mark which is the most recently evicted entry's timestamp. This value always ratchets with monotonic increases. The low water mark is initialized to the current system time plus the maximum clock offset.
func NewTimestampCache ¶
func NewTimestampCache(clock *hlc.Clock) *TimestampCache
NewTimestampCache returns a new timestamp cache with supplied hybrid clock.
func (*TimestampCache) Add ¶
func (tc *TimestampCache) Add(start, end proto.Key, timestamp proto.Timestamp, txnID []byte, readOnly bool)
Add the specified timestamp to the cache as covering the range of keys from start to end. If end is nil, the range covers the start key only. txnID is nil for no transaction. readOnly specifies whether the command adding this timestamp was read-only or not.
func (*TimestampCache) Clear ¶
func (tc *TimestampCache) Clear(clock *hlc.Clock)
Clear clears the cache and resets the low water mark to the current time plus the maximum clock offset.
func (*TimestampCache) GetMax ¶
func (tc *TimestampCache) GetMax(start, end proto.Key, txnID []byte) (proto.Timestamp, proto.Timestamp)
GetMax returns the maximum read and write timestamps which overlap the interval spanning from start to end. Cached timestamps matching the specified txnID are not considered. If no part of the specified range is overlapped by timestamps in the cache, the low water timestamp is returned for both read and write timestamps.
The txn ID prevents restarts with a pattern like: read("a"), write("a"). The read adds a timestamp for "a". Then the write (for the same transaction) would get that as the max timestamp and be forced to increment it. This allows timestamps from the same txn to be ignored.
func (*TimestampCache) MergeInto ¶
func (tc *TimestampCache) MergeInto(dest *TimestampCache, clear bool)
MergeInto merges all entries from this timestamp cache into the dest timestamp cache. The clear parameter, if true, copies the values of lowWater and latest and clears the destination cache before merging in the source.
func (*TimestampCache) SetLowWater ¶
func (tc *TimestampCache) SetLowWater(lowWater proto.Timestamp)
SetLowWater sets the cache's low water mark, which is the minimum value the cache will return from calls to GetMax().
type UpdateRangeEvent ¶
type UpdateRangeEvent struct { StoreID proto.StoreID Desc *proto.RangeDescriptor Stats engine.MVCCStats Method proto.Method Delta engine.MVCCStats }
UpdateRangeEvent occurs whenever a Range is modified. This structure includes the basic range information, but also includes a second set of MVCCStats containing the delta from the Range's previous stats. If the update did not modify any statistics, this delta may be nil.
Source Files ¶
- addressing.go
- allocator.go
- command_queue.go
- doc.go
- feed.go
- gc_queue.go
- id_alloc.go
- queue.go
- range_data_iter.go
- range_gc_queue.go
- range_tree.go
- replica.go
- replica_command.go
- replica_raftstorage.go
- replicate_queue.go
- response_cache.go
- scanner.go
- split_queue.go
- stats.go
- status.pb.go
- store.go
- timestamp_cache.go
- verify_queue.go