cache

package
v0.0.0-...-24bf5da
Published: Jul 28, 2024 License: MIT Imports: 9 Imported by: 0

Documentation

Overview

Package cache implements caching helpers, e.g. an sqlite3-based cache. Caching matters most for the most cited items, which can take seconds to assemble, so we cache the serialized JSON in sqlite3 and serve subsequent requests from there.
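The flow described above (look up the key, assemble and store on a miss, serve from the cache afterwards) can be sketched as follows. This is a minimal illustration, not the package's implementation: the store interface, the in-memory memStore, the errMiss sentinel and getOrAssemble are hypothetical names, and it assumes that Get reports a missing key with a sentinel error such as ErrCacheMiss.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// errMiss stands in for the package's ErrCacheMiss (assumption: Get
// reports a missing key with that sentinel error).
var errMiss = errors.New("cache miss")

// store is a hypothetical interface mirroring the documented Get and Set.
type store interface {
	Get(key string) ([]byte, error)
	Set(key string, value []byte) error
}

// memStore is an in-memory stand-in for the sqlite3-backed cache.
type memStore map[string][]byte

func (m memStore) Get(key string) ([]byte, error) {
	v, ok := m[key]
	if !ok {
		return nil, errMiss
	}
	return v, nil
}

func (m memStore) Set(key string, value []byte) error {
	m[key] = value
	return nil
}

// getOrAssemble returns the cached bytes for key. On a miss it calls
// assemble, stores the serialized JSON and returns it. The bool reports
// whether the value came from the cache.
func getOrAssemble(s store, key string, assemble func() (any, error)) ([]byte, bool, error) {
	b, err := s.Get(key)
	if err == nil {
		return b, true, nil
	}
	if !errors.Is(err, errMiss) {
		return nil, false, err
	}
	v, err := assemble()
	if err != nil {
		return nil, false, err
	}
	b, err = json.Marshal(v)
	if err != nil {
		return nil, false, err
	}
	if err := s.Set(key, b); err != nil {
		return nil, false, err
	}
	return b, false, nil
}

func main() {
	s := memStore{}
	// Placeholder for the expensive assembly step.
	assemble := func() (any, error) {
		return map[string]int{"links": 2999}, nil
	}
	for i := 0; i < 2; i++ {
		// "10.123/example" is a placeholder key, not a real DOI.
		b, hit, err := getOrAssemble(s, "10.123/example", assemble)
		if err != nil {
			panic(err)
		}
		fmt.Printf("hit=%v body=%s\n", hit, b)
	}
}
```

The first iteration assembles and stores the value; the second is served from the store without calling assemble again.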

Without cache:

| Rank    | Links | Time |
|---------+-------+------|
| ~5000   |  2999 | 2.8s |
| ~10000  |  2108 | 3.5s |
| ~50000  |   937 | 1.2s |
| ~100000 |   659 | 0.8s |
| ~150000 |   538 | 0.6s |

A data point: The hundred most expensive ids take 175s to request (in parallel). After caching, this time drops to 2.78s. Individual requests served from cache are in the 1-10ms range.

Another data point: Warming the cache with the 150K most expensive DOIs takes less than 2h.

$ time zstd -qcd -T0 /usr/share/labe/data/OpenCitationsRanked/current | \
    awk '{ print $2 }' | head -n 150000 | shuf | \
    parallel -j 32 -I {} 'curl -sL "http://localhost:8000/doi/{}"' > /dev/null

real    103m36.376s
user    21m57.202s
sys     18m15.376s

The cache database (with zstd compressed values) is about 8GB in size.

Constants

This section is empty.

Variables

var (
	ErrCacheMiss             = errors.New("cache miss")
	ErrReadOnly              = errors.New("read only")
	DefaultMaxFileSize int64 = 1 << 36
)

Functions

This section is empty.

Types

type Cache

type Cache struct {
	Path        string
	MaxFileSize int64
	// Lock applies to both, db and readOnly.
	sync.Mutex
	// contains filtered or unexported fields
}

Cache is a minimalistic cache based on sqlite. In the future, values could be transparently compressed as well.

func New

func New(path string) (*Cache, error)

New creates a new cache at a given path with a default maximum file size.

func (*Cache) Close

func (c *Cache) Close() error

Close closes the underlying database.

func (*Cache) Flush

func (c *Cache) Flush() error

Flush empties the cache.

func (*Cache) Get

func (c *Cache) Get(key string) ([]byte, error)

Get returns the value stored for a key.

func (*Cache) ItemCount

func (c *Cache) ItemCount() (int, error)

ItemCount returns the number of entries in the cache.

func (*Cache) Set

func (c *Cache) Set(key string, value []byte) error

Set stores a key-value pair.
