epcache

package module
v1.13.1 Latest Latest
Warning

This package is not in the latest version of its module.

Go to latest
Published: May 26, 2024 License: MIT Imports: 25 Imported by: 0

README

EPCache

EPCache is Endless Paradox's Cache, Experiment-Purpose Cache and Enhanced-Performance Cache. A lightweight and customizable distributed cache system developing framework implemented by golang. Suitable for scenarios with more reads and less writes, has high concurrent access capabilities, and ensures eventual consistency.

Structure

.
├── bloomfilter
│   └── bloomfilter.go: implements a bloomfilter using bitmap and murmur3, working as a blacklist 
│                         to avoid cache penetration, which might cause a false potive problem.
├── consistenthash
│   └── consistenthash.go: implements a hash ring mapping reqs to a specific node having the same group,
│                            which establishes a basic load balance of the EPCache cluster.
├── epcachepb
│   └── epcachepb.proto: defines the protobuf messages and service used by ProtoPeer and PeerAgent.
├── etcd
│   └── startup.sh: provides an example to start an etcd cluster.
├── lru
│   └── lru.go: implements a lru cache.
├── msgctl
│   └── msgctl.go: provides a tool for merging messages within a specified interval.
├── ratelimit
│   └── ratelimit.go: implements a token bucket for rate limiting.
├── singleflight
│   └── singleflight.go: provides a duplicate func call suppression mechanism using Mutex and WaitGroup,
│                          to avoid cache breakdown.
├── amqp
│   └── amqp.go: handles data synchronization across nodes based on the fanout pattern of an MQ instance.
├── byteview.go: implements an immutable view of bytes, used inside the EPCache cluster and presented to users,
│                  which provides benefit of decoupling from data source and preventing users from 
│                  accidentally modifying the EPCache cluster's data.
├── cache.go: wraps a lru cache and its operators, using Mutex to provide concurrent safety 
│               and recording relevant statistical data. 
├── epcache.go: provides APIs to an EPCache cluster node, like Get, OnUpdate and OnDelete etc. 
│                 The ratelimiter and bloomfilter here can be enable and disabled at any time.
├── getter.go: provides the interface Getter, which must be implemented by users to access to the data source.
├── grpc.go: implements GrpcPool as a PeerAgent, which communicates with other nodes using gRPC, 
│              it will deal with the service registration and discovery based on etcd,
│              the data synchronization sending and receiving based on MQ,
│              and of course start a gRPC server, all of which support graceful shutdown. 
├── peers.go: provides the interface standards of ProtoPeer and PeerAgent, which are responsible for
│               the interation work among the EPCache cluster nodes; also implements NoPeer as a PeerAgent.
└── protopeer.go: implements protoPeer as a ProtoPeer with a long-running gRPC client, which will be used by GrpcPool.

Procedure

                         y
Get -----> cache hit? -----> retrun
            | n                                  
            |----> consistent hashing 
                    |                 y                                 y
                    |----> remote? -----> load from peer -----> suc? -----> popuate hot cache in 1/10 chance -----> return
                            | n                                  | n                            y             
                            |                                    |----> due to non-existent? -----> return error
                            |                                            | n                                  
                            |                                            |                            y
                            |-----------------------------------------> load locally -----> exist? -----> popuate main cache -----> return
                                                                                             | n                                  
                                                                                             |----> return error

Highlights

  1. Using gRPC/ProtoBuf to achieve efficient communication between nodes: request forwarding and data synchronization.
  2. Implementing cache elimination strategy based on LRU, and implement load balancing based on consistent hashing.
  3. Use mutex and semaphore to prevent cache breakdown, Bloom filters to prevent cache penetration, and token bucket algorithm to implement request rate limiting.
  4. Asynchronous data synchronization across nodes based on AMQP-style message queue has the advantages of order guarantee and decoupling.
  5. Implementing service registration and discovery based on etcd cluster to achieve dynamic adjustment of nodes.

Guide

  1. You can build up a cache system as you like by importing this module.
  2. Getter is implemented by you, a normal one might be using DB as data source.
  3. ProtoPeer and PeerAgent can also be implemented by you, and using protoPeer and GrpcPool are recommended if you attempt to build up a cluster, while NoPeer if you just need standalone one.
  4. GrpcPool requires both a running etcd cluster instance and a running AMQP-style MQ instance.
  5. Set up ratelimiter and bloomfilter when you need them.
  6. The GrpcPool will log something important when up.
  7. An API server is the best practise to be built in front of an EPCache cluster node.

Contributing

Issues and Pull Requests are accepted. Feel free to contribute to this project.

Benchmark

  • For 100k data entries, up to 3030k qps if cache hit locally and 33k qps if cache hit remotely when used concurrently.
  • For 100k data entries, as short as 633 ms to complete the writing regardless of the time spent on retrieving them from source.
  • For 100k data entries, as short as 541 ms to update all and publish messages related when used concurrently.
  • For 100k data entries, as short as 1.72 s to consume messages related and update all.

License

MIT © EndlessParadox1

Documentation

Overview

Package epcache implements a distributed cache system development framework.

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

This section is empty.

Types

type ByteView

type ByteView struct {
	// contains filtered or unexported fields
}

ByteView holds an immutable view of bytes, it should be used as a value type, not a pointer type.

func (ByteView) ByteSlice

func (bv ByteView) ByteSlice() []byte

ByteSlice returns a copy of the data as a byte slice.

func (ByteView) Equal added in v1.7.2

func (bv ByteView) Equal(bv2 ByteView) bool

Equal returns whether the bytes in bv are the same as the bytes in bv2.

func (ByteView) EqualBytes added in v1.7.2

func (bv ByteView) EqualBytes(b2 []byte) bool

EqualBytes returns whether the bytes in bv are the same as the bytes b2.

func (ByteView) Len

func (bv ByteView) Len() int

func (ByteView) Slice added in v1.7.2

func (bv ByteView) Slice(from, to int) ByteView

Slice slices the view between from and to.

func (ByteView) String

func (bv ByteView) String() string

type CacheStats

type CacheStats struct {
	Bytes  int64
	Items  int64
	Hits   int64
	Gets   int64
	Evicts int64
}

type CacheType

type CacheType int
const (
	MainCache CacheType = iota + 1
	HotCache
)

type Getter

type Getter interface {
	// Get depends on users' concrete implementation.
	// Context's deadline should be treated properly if existed.
	Get(ctx context.Context, key string) ([]byte, error)
}

Getter loads data from source, like a DB.

type GetterFunc

type GetterFunc func(ctx context.Context, key string) ([]byte, error)

GetterFunc indicates Getter might just be a func.

func (GetterFunc) Get

func (f GetterFunc) Get(ctx context.Context, key string) ([]byte, error)

type GrpcPool

type GrpcPool struct {
	pb.UnimplementedEPCacheServer
	// contains filtered or unexported fields
}

func NewGrpcPool

func NewGrpcPool(self string, registry []string, mqBroker string, opts *GrpcPoolOptions) *GrpcPool

NewGrpcPool returns a GrpcPool instance.

registry: The listening addresses of the etcd cluster.
mqBroker: The listening address of the MQ broker.

func (*GrpcPool) Get

func (gp *GrpcPool) Get(ctx context.Context, req *pb.Request) (*pb.Response, error)

func (*GrpcPool) ListPeers added in v1.7.2

func (gp *GrpcPool) ListPeers() (ans []string)

func (*GrpcPool) PickPeer

func (gp *GrpcPool) PickPeer(key string) (ProtoPeer, bool)

func (*GrpcPool) SetNode added in v1.10.0

func (gp *GrpcPool) SetNode(node *Node)

func (*GrpcPool) SyncAll added in v1.7.2

func (gp *GrpcPool) SyncAll(data *pb.SyncData)

SyncAll just publishes a message to the MQ exchange working in fanout pattern.

type GrpcPoolOptions

type GrpcPoolOptions struct {
	Prefix   string
	Exchange string
	Replicas int
	HashFn   consistenthash.Hash
}

GrpcPoolOptions are options to build a GrpcPool instance.

	Prefix: The etcd namespace to which an EPCache cluster instance belongs, default `epcache/`.
	Exchange: The MQ exchange used by an EPCache cluster instance, default `epcache`.
 Replicas: The replicas of each node in hash ring, default `50`.
 HashFn: The hashing function used for consistent hash, default `CRC32`.

type LimitMode

type LimitMode int
const (
	NoLimit LimitMode = iota
	BlockMode
	RejectMode
)

type LoadError

type LoadError string
const ErrNotFound LoadError = "key not found in data source"

ErrNotFound must be returned when Getter can't found the data.

func (LoadError) Error

func (e LoadError) Error() string

type NoPeer

type NoPeer struct{}

NoPeer is an implementation of PeerAgent, used for nodes running in standalone mode.

func (NoPeer) ListPeers added in v1.7.2

func (NoPeer) ListPeers() (ans []string)

func (NoPeer) PickPeer

func (NoPeer) PickPeer(_ string) (peer ProtoPeer, ok bool)

func (NoPeer) SetNode added in v1.10.0

func (NoPeer) SetNode(_ *Node)

func (NoPeer) SyncAll added in v1.7.2

func (NoPeer) SyncAll(_ *pb.SyncData)

type Node added in v1.10.0

type Node struct {
	Stats Stats
	// contains filtered or unexported fields
}

Node is TODO

func NewNode added in v1.10.0

func NewNode(cacheBytes int64, getter Getter) *Node

func (*Node) CacheStats added in v1.10.0

func (n *Node) CacheStats(ctype CacheType) CacheStats

func (*Node) Get added in v1.10.0

func (n *Node) Get(ctx context.Context, key string) (ByteView, error)

func (*Node) OnRemove added in v1.10.0

func (n *Node) OnRemove(key string)

OnRemove removes data in cache and then syncs to all peers using MQ. This must be called when data in source is purged to ensure consistency.

func (*Node) OnUpdate added in v1.10.0

func (n *Node) OnUpdate(key string, value []byte)

OnUpdate updates data in cache and then syncs to all peers using MQ. This must be called when data in source is changed to ensure consistency.

func (*Node) RegisterPeers added in v1.10.0

func (n *Node) RegisterPeers(peers PeerAgent)

RegisterPeers specifies PeerPicker for a node, e.g. NoPeer, GrpcPool or any that implements the PeerPicker.

func (*Node) ResetLimiter added in v1.10.0

func (n *Node) ResetLimiter()

ResetLimiter disables a rate limiter.

func (*Node) SetFilter added in v1.10.0

func (n *Node) SetFilter(size uint32)

SetFilter sets a bloom filter, zero size for none. It calculates the required params to build a bloom filter, false positive rate of which will be lower than 0.01%, according to user's expected blacklist size.

func (*Node) SetLimiter added in v1.10.0

func (n *Node) SetLimiter(rate int, cap int, mode LimitMode)

SetLimiter sets a rate limiter working on blocking or rejecting mode.

type PeerAgent added in v1.7.2

type PeerAgent interface {
	// PickPeer picks peer according to the key.
	PickPeer(key string) (ProtoPeer, bool)
	// SyncAll trys to sync data to all peers.
	SyncAll(data *pb.SyncData)
	SetNode(node *Node)
	// ListPeers lists all peers.
	ListPeers() []string
}

type ProtoPeer added in v1.7.2

type ProtoPeer interface {
	// Get loads data from remote using gRPC.
	Get(ctx context.Context, req *pb.Request) (*pb.Response, error)
}

type Stats

type Stats struct {
	Reqs          int64
	Gets          int64
	Hits          int64
	Loads         int64
	LoadsDeduped  int64 // after singleflight
	LocalLoads    int64
	LocalLoadErrs int64
	PeerLoads     int64
	PeerLoadErrs  int64
	PeerReqs      int64 // requests from peers
	PeerSyncs     int64 // data-syncs from peers

	LenBlacklist int64
}

Stats are statistics for a node.

Directories

Path Synopsis
Package bloomfilter avoids cache penetration.
Package bloomfilter avoids cache penetration.
Package consistenthash implements a ring hash.
Package consistenthash implements a ring hash.
Package lru implements a lru cache.
Package lru implements a lru cache.
Package msgctl reduces messages within a specified interval into one.
Package msgctl reduces messages within a specified interval into one.
Package ratelimit implements a token bucket used for request rate limiting.
Package ratelimit implements a token bucket used for request rate limiting.
Package singleflight provides a duplicate func call suppression mechanism, therefore avoiding cache breakdown.
Package singleflight provides a duplicate func call suppression mechanism, therefore avoiding cache breakdown.

Jump to

Keyboard shortcuts

? : This menu
/ : Search site
f or F : Jump to
y or Y : Canonical URL