
LazyLRU: An in-memory cache with limited locking


This is a cache implementation that uses a hash table for lookups and a priority queue to approximate LRU. Approximate because the usage is not updated on every get. Rather, items close to the head of the queue, those most likely to be read again and least likely to age out, are not updated. This assumption does not hold under every condition -- if the cache is undersized and churning a lot, this implementation will perform worse than an LRU that updates on every read.

Read about it on Medium or hear about it on the Alexa's Input podcast.

What makes this one different

The reason updates are only occasional is not that the updates themselves are expensive -- the heap-based implementation keeps them quite cheap. Rather, it is the exclusive locking required during that move that drove this design. The lock is more expensive than the move by a wide margin, but it is required in concurrent access scenarios. This implementation uses an RWMutex and only takes out the exclusive write lock when moving or inserting elements.

Several other LRU caches found on the internet use a doubly-linked list for the recency list. That has the advantage of requiring only four pointer rewrites. By contrast, the heap implementation may require as many as log(n) pointer rewrites. However, the linked-list approach cannot keep track of the position of each node in the list, so the optimization attempted here (ignoring the first quarter of the list) is not possible. Theoretically, the array-based heap implementation also provides better locality, but I don't think I can definitively prove that one way or the other.
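
To make the locking trade-off concrete, here is a minimal sketch of the read path. The names (sketchCache, cacheItem, promote) and the one-quarter threshold shown here are illustrative, not LazyLRU's actual internals:

// import (
//     "sync"
//     "time"
// )

type cacheItem struct {
	key       string
	value     string
	heapIndex int       // position in the array-backed recency heap
	expiresAt time.Time // zero value means no TTL
}

type sketchCache struct {
	mu    sync.RWMutex
	index map[string]*cacheItem
	heap  []*cacheItem
}

func (c *sketchCache) promote(item *cacheItem) {
	// The real implementation sifts the item within the heap
	// (container/heap.Fix), costing up to log(n) pointer rewrites.
}

func (c *sketchCache) Get(key string) (string, bool) {
	c.mu.RLock()
	item, ok := c.index[key]
	quarter := len(c.heap) / 4
	c.mu.RUnlock()
	if !ok {
		return "", false
	}
	// Items in the first quarter of the queue are treated as "recent
	// enough"; skipping their promotion avoids the exclusive lock.
	if item.heapIndex >= quarter {
		c.mu.Lock() // the write lock is only taken on this slower path
		c.promote(item)
		c.mu.Unlock()
	}
	return item.value, true
}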

There are deletes on read, as well as asynchronous deletes on a timer. When the ticker fires, the reaper picks a random spot in the heap and checks the next 100 items for expiry. Once those 100 items are checked, if any should be deleted, an exclusive lock is taken and all of them are deleted in that single lock cycle, hopefully reducing the cost per deleted item.
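
A sketch of one such reaping pass, reusing the illustrative sketchCache type from above (the window size of 100 matches the description; the rest is this example's code, not the library's):

// import "math/rand"

func (c *sketchCache) reapWindow(now time.Time) {
	c.mu.RLock()
	n := len(c.heap)
	if n == 0 {
		c.mu.RUnlock()
		return
	}
	start := rand.Intn(n)
	var expired []string
	for i := 0; i < 100 && i < n; i++ {
		item := c.heap[(start+i)%n]
		if !item.expiresAt.IsZero() && item.expiresAt.Before(now) {
			expired = append(expired, item.key)
		}
	}
	c.mu.RUnlock()
	if len(expired) == 0 {
		return
	}
	c.mu.Lock() // one exclusive-lock cycle for the whole batch
	for _, key := range expired {
		delete(c.index, key) // the real code also removes heap entries
	}
	c.mu.Unlock()
}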

All LRU implementations need to order records by their insert or last access. This is usually handled with timestamps. Timestamps, however, can collide and lead to undefined behavior. For simplicity, LazyLRU uses an atomic counter, which means that every insert/update has a unique, integral value.
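
As a sketch, such a stamp can come from a single atomic counter (illustrative, not the library's internals):

// import "sync/atomic"

var recencyClock atomic.Uint64

// nextStamp returns a unique, strictly increasing recency value; unlike
// timestamps, two concurrent updates can never receive the same stamp.
func nextStamp() uint64 {
	return recencyClock.Add(1)
}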

Features

  • Thread-safe
  • Tunable cache size
  • Constant-time reads without an exclusive lock
  • O(log n) inserts
  • Built-in expiry, including purging expired items in the background
  • Nearly 100% test coverage

Non-features

Deterministic reaping

One optimization that was considered was keeping the inserted values in a second queue (linked list or ring buffer), ordered by their expiry time. While this has the advantage of making the search for expired items extremely cheap, the complexity of maintaining a third reference to each data item seemed like more trouble than it was worth. I may change my mind later.

Sharding

This is a big one. Lots of cache implementations get around lock contention issues by sharding the key space. LazyLRU does not prevent that, but it doesn't do it either. The lack of exclusive locks under the most common reading circumstances should reduce the need to shard, though that really depends on your use cases.

If sharding makes sense for you, it should be pretty easy to make a list of LazyLRU instances, hash your keys before reading or writing to select your cache instance, and go from there, as sketched below. As many LazyLRU instances as you want can coexist in a single process. If sharded caches are the path to living as your most authentic self, LazyLRU won't keep your peacock caged.
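
A minimal sketch of such a wrapper, assuming string keys (the shardedLRU type and the FNV hash are this example's choices, not part of LazyLRU):

// import (
//     "hash/fnv"
//     "time"
//
//     "github.com/TriggerMail/lazylru"
// )

type shardedLRU[V any] struct {
	shards []*lazylru.LazyLRU[string, V]
}

func newShardedLRU[V any](shardCount, itemsPerShard int, ttl time.Duration) *shardedLRU[V] {
	s := &shardedLRU[V]{shards: make([]*lazylru.LazyLRU[string, V], shardCount)}
	for i := range s.shards {
		s.shards[i] = lazylru.NewT[string, V](itemsPerShard, ttl)
	}
	return s
}

// shard hashes the key to select a cache instance.
func (s *shardedLRU[V]) shard(key string) *lazylru.LazyLRU[string, V] {
	h := fnv.New32a()
	h.Write([]byte(key))
	return s.shards[h.Sum32()%uint32(len(s.shards))]
}

func (s *shardedLRU[V]) Get(key string) (V, bool) { return s.shard(key).Get(key) }
func (s *shardedLRU[V]) Set(key string, value V)  { s.shard(key).Set(key, value) }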

Usage

The Go heap has been copied and made to support generics. That allows the LRU to also support generics. To access that feature, import the lazylru/generic module. To maintain compatibility with pre-generics versions, the New factory method still uses string keys and interface{} values. However, this is just a wrapper over the NewT[K,V] factory method.

// import "github.com/TriggerMail/lazylru"

lru := lazylru.NewT[string, string](10, 5*time.Minute)
defer lru.Close()

lru.Set("abloy", "medeco")

vstr, ok := lru.Get("abloy")

It is important to note that LazyLRU should be closed if the TTL is non-zero. Otherwise, the background reaper thread will be left running. To be fair, under most circumstances I can imagine, the cache lives as long as the host process. So do what you like.

Go <= 1.17

As of v0.4.0, LazyLRU takes advantage of Go generics. If you want to use this library in Go 1.17 or lower, please use v0.3.x. v0.3.3 is the latest as of the time of this writing.

Like Go's heap itself, Lazy LRU uses the interface{} type for its values. That means that casting is required on the way out. I promised that as soon as Go had generics, I'd get right on it. See below!

// import "github.com/TriggerMail/lazylru"

lru := lazylru.New(10, 5*time.Minute)
defer lru.Close()

lru.Set("abloy", "medeco")

v, ok := lru.Get("abloy")
vstr, vok := v.(string)

Documentation


Types

type EvictCB added in v0.5.0

type EvictCB[K comparable, V any] func(K, V)

EvictCB is a callback function that will be executed when items are removed from the cache via eviction due to max size or because the TTL has been exceeded. These functions will not be called with a lock and will not block future reaping. Be sure any callback registered can complete the number of expected calls (based on your expire/eviction rates) or you may create a backlog of goroutines.

type LazyLRU

type LazyLRU[K comparable, V any] struct {
	// contains filtered or unexported fields
}

LazyLRU is an LRU cache that only reshuffles values if it is somewhat full. This is a cache implementation that uses a hash table for lookups and a priority queue to approximate LRU. Approximate because the usage is not updated on every get. Rather, items close to the head of the queue, those most likely to be read again and least likely to age out, are not updated. This assumption does not hold under every condition -- if the cache is undersized and churning a lot, this implementation will perform worse than an LRU that updates on every read.

func New deprecated

func New(maxItems int, ttl time.Duration) *LazyLRU[string, interface{}]

New creates a LazyLRU[string, interface{}] with the given capacity and default expiration. This is compatible with the pre-generic interface. The generic version is available as `NewT`. If maxItems is zero or fewer, the cache will not hold anything, but does still incur some runtime penalties. If ttl is greater than zero, a background ticker will be engaged to proactively remove expired items.

Deprecated: To avoid the casting, use the generic NewT interface instead

func NewT added in v0.4.0

func NewT[K comparable, V any](maxItems int, ttl time.Duration) *LazyLRU[K, V]

NewT creates a LazyLRU with the given capacity and default expiration. If maxItems is zero or fewer, the cache will not hold anything, but does still incur some runtime penalties. If ttl is greater than zero, a background ticker will be engaged to proactively remove expired items.

func (*LazyLRU[K, V]) Close

func (lru *LazyLRU[K, V]) Close()

Close stops the reaper process. This is safe to call multiple times.

func (*LazyLRU[K, V]) Delete added in v0.3.0

func (lru *LazyLRU[K, V]) Delete(key K)

Delete eliminates a key from the cache. Removing a key that is not in the index is safe.

func (*LazyLRU[K, V]) Get

func (lru *LazyLRU[K, V]) Get(key K) (V, bool)

Get retrieves a value from the cache. The returned bool indicates whether the key was found in the cache.

func (*LazyLRU[K, V]) IsRunning

func (lru *LazyLRU[K, V]) IsRunning() bool

IsRunning indicates whether the background reaper is active

func (*LazyLRU[K, V]) Len

func (lru *LazyLRU[K, V]) Len() int

Len returns the number of items in the cache

func (*LazyLRU[K, V]) MGet

func (lru *LazyLRU[K, V]) MGet(keys ...K) map[K]V

MGet retrieves values from the cache. Missing values will not be returned.

func (*LazyLRU[K, V]) MSet

func (lru *LazyLRU[K, V]) MSet(keys []K, values []V) error

MSet writes multiple keys and values to the cache. If the "key" and "value" parameters are of different lengths, this method will return an error.

func (*LazyLRU[K, V]) MSetTTL

func (lru *LazyLRU[K, V]) MSetTTL(keys []K, values []V, ttl time.Duration) error

MSetTTL writes multiple keys and values to the cache, expiring with the given time-to-live value. If the "key" and "value" parameters are of different lengths, this method will return an error.
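
For example, continuing with the string-keyed cache from the Usage section above (the keys and values are illustrative):

if err := lru.MSet(
	[]string{"abloy", "schlage"},
	[]string{"medeco", "kwikset"},
); err != nil {
	// the key and value slices had different lengths
}

found := lru.MGet("abloy", "schlage", "missing") // "missing" will be absent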

func (*LazyLRU[K, V]) OnEvict added in v0.5.0

func (lru *LazyLRU[K, V]) OnEvict(cb EvictCB[K, V])

OnEvict registers a callback that will be executed when items are removed from the cache via eviction due to max size or because the TTL has been exceeded. These functions will not be called with a lock and will not block future reaping. Be sure any callback registered can complete the number of expected calls (based on your expire/eviction rates) or you may create a backlog of goroutines.

If a Set or MSet operation causes an eviction, this function will be called synchronously to that Set or MSet call.
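
A minimal registration, using the string-keyed cache from the Usage section (the logging body is illustrative):

// import "log"

lru.OnEvict(func(key string, value string) {
	log.Printf("evicted %q", key)
})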

func (*LazyLRU[K, V]) Reap

func (lru *LazyLRU[K, V]) Reap()

Reap removes all expired items from the cache

func (*LazyLRU[K, V]) Scan added in v0.7.0

func (lru *LazyLRU[K, V]) Scan() iter.Seq2[K, V]

Scan returns an iterator that yields current non-expired items from the cache. It iterates over a snapshot of keys taken at the beginning of iteration, checking each key's existence and expiration before yielding its associated value. Keys created after the scan begins will be ignored.
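
With Go 1.23 or newer, the returned iterator can be ranged over directly (lru here is the string-keyed cache from the Usage section):

// import "fmt"

for key, value := range lru.Scan() {
	fmt.Println(key, value)
}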

func (*LazyLRU[K, V]) Set

func (lru *LazyLRU[K, V]) Set(key K, value V)

Set writes to the cache

func (*LazyLRU[K, V]) SetTTL

func (lru *LazyLRU[K, V]) SetTTL(key K, value V, ttl time.Duration)

SetTTL writes to the cache, expiring with the given time-to-live value

func (*LazyLRU[K, V]) Stats

func (lru *LazyLRU[K, V]) Stats() Stats

Stats gets a copy of the stats held by the cache. Note that this is a copy, so returned objects will not update as the service continues to execute.

type Stats

type Stats struct {
	KeysWritten      uint32
	KeysReadOK       uint32
	KeysReadNotFound uint32
	KeysReadExpired  uint32
	Shuffles         uint32
	Evictions        uint32
	KeysReaped       uint32
	ReaperCycles     uint32
}

Stats represents counts of actions against the cache.
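
For example, a rough hit rate can be computed from a snapshot (illustrative):

// import "fmt"

stats := lru.Stats()
reads := stats.KeysReadOK + stats.KeysReadNotFound + stats.KeysReadExpired
if reads > 0 {
	fmt.Printf("hit rate: %.2f\n", float64(stats.KeysReadOK)/float64(reads))
}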

Directories

Path Synopsis
containers/heap  Package heap provides heap operations for any type that implements heap.Interface.
generic          (module)
