ribs

package module
v0.0.0-...-9f36c07 Latest Latest
Warning

This package is not in the latest version of its module.

Go to latest
Published: Jun 30, 2024 License: Apache-2.0, MIT Imports: 9 Imported by: 0

README

RIBS

Reasonable Interplanetary BlockStore (RIBS)

RIBS is a Filecoin-native IPFS/IPLD blockstore designed for seamless integration with the Filecoin tech stack. It provides users with a scalable blockstore capable of handling almost arbitrary amounts of data. RIBS aims to be an auto-dagstore-like blockstore with fully automated Filecoin data offload support.

WORK IN PROGRESS

RIBS is a work-in-progress project. Most features aren't finished, and on-disk format is not stable. DO NOT USE FOR STORING CIRITICAL DATA, OR ANY OTHER DATA

Status:

  • Data layer is mostly implemented, but needs a lot of hardening to gain confidence that it never loses data.
  • API Implementation is mostly complete, but there is a lot of space for optimization.
  • Filecoin dealmaking process is mostly implemented.
  • RIBSWeb is mostly complete and covers existing functionality.
  • Kubo integration (KuboRIBS - KuRI) works, but needs some UX improvements.
  • Retrieval probing / retrieval functionality is not implemented yet.
  • Multi-node support is not implemented yet.

Key Features

Filecoin-native IPFS nodes: RIBS can be integrated into most IPFS nodes seamlessly, thanks to a layer that implements the standard Blockstore interface. RIBS also provides a high-performance block interface for more efficient data management.

Scalable blockstore: RIBS is designed to support 100PiB+ deployments, providing a scalable solution for distributed storage systems that can grow alongside user requirements.

Improved data locality and parallelism: RIBS groups blocks into log-files, which in many scenarios provide good data locality, and are very easy to convert into Filecoin-deal-friendly data format.

Fully automated Filecoin data offloading: When block groups become "full", RIBS automatically backs them up to Filecoin, by converting them into a .car file, computing deal CID, selecting SPs, and making deals. RIBS will also provide an offloading mechanism, which will make it possible to free up space on the local storage by removing the backed-up blocks, and fetching them from Filecoin when needed.

RIBSWeb: RIBSWeb is a web-based UI for RIBS, which provides an advanced, but easy-to-use interface for managing RIBS nodes, and monitoring their status.

architecture

Design Overview / Roadmap

architecture

Grouping blocks: RIBS groups blocks into log-files, which provides better performance on all drives, including HDDs and SSDs. This approach leads to improved data locality, allowing for better parallelism and easier offloading to Filecoin data chunks.

Local data index: RIBS maintains a local index of all blocks, which allows for efficient access to all block data, including offloaded blocks. Multiple backends will be supported for multi-node deployments, such as FoundationDB.

Filecoin dealmaking functionality: RIBS automates all steps of the Filecoin dealmaking process:

architecture

  • Storage Provider crawler: Discovers and monitors storage providers in the network.
  • Efficient on-the-fly .car file generation: RIBS can generate .car files in one sequential scan of a group log.
  • Fast DataCID computation: RIBS Uses all available cores to comput DataCID as fast as possible.
  • Advanced SP selection process: Utilizes a built-in reputation system to select the most suitable storage providers based on their performance and reliability.
  • Retrieval probing: Attempts retrievals for a random sample of data, to ensure that SPs provide the promised retrieval service. Unretrievable deals will not count torwards the redundancy factor, and SPs who fail to provide retrievals will be selected for deals much less frequently.
  • Automatic deal repair: Maintains a user-defined redundancy factor by automatically repairing and recreating deals as needed.

Arbitrary scalability: All parts of RIBS are designed to scale horizontally, allowing for almost arbitrary amounts of data to be stored.

multi-node support is not implemented yet

  • Group files can be stored on any storage backend, including distributed filesystems, and can be managed by a fleet of "Group workers" which can run tasks such as DataCID computation, car file creation, etc.
  • Local data index ("Top Level Index") can be backed by any scalable KV store
  • "Group Manager" processes can run redundantly to provide high availability
  • (Future) Users can deploy additional car file caching servers to improve efficiency of making redundant deals.
  • (Future) Retrievals can be served by a fleet of "Retrieval workers".
  • (Future) Access to storage is provided by a smart, session-aware driver, which can talk directly to the relevant parts of the cluster.

Usage

Integrating as a blockstore

Interface not stable yet

  • Main interface definition here
  • Can be wrapped into a standard IPFS blockstore using Blockstore layer
  • Example Kubo plugin here
Running (demo) Kubo-Ribs (KuRI) Node
  • Install Golang
  • Clone this repo
git clone https://github.com/lotus-web3/ribs.git
cd ribs
go build -o kuri ./integrations/kuri/cmd/kuri
  • backup / move away .ipfs / set IPFS_PATH to an alternative directory if you have a local IPFS node
  • Init KuRI node and start the daemon
./kuri init

# By default a new wallet will be generated.
# Send Filecoin funds or DataCap before the next
# starting the node daemon.

./kuri daemon
  • Use the node, like any Kubo node!
./kuri add README.md

Documentation

Index

Constants

View Source
const UndefGroupKey = GroupKey(-1)

Variables

This section is empty.

Functions

This section is empty.

Types

type Batch

type Batch interface {

	// Put queues writes to the blockstore
	Put(ctx context.Context, b []blocks.Block) error

	// Unlink makes a blocks not retrievable from the blockstore
	// NOTE: this method is best-effort. Data may not be removed immediately,
	// and it may be retrievable even after the operation is committed
	// In case of conflicts, Put operation will be preferred over Unlink
	Unlink(ctx context.Context, c []multihash.Multihash) error

	// Delete deletes block from the blockstore
	// NOTE: this method is best-effort. Might be better solutions for this.
	Delete(ctx context.Context, c []multihash.Multihash) error

	// Flush commits data to the blockstore. The batch can be reused after commit
	Flush(ctx context.Context) error
}

Batch groups operations, NOT thread safe

type CrawlState

type CrawlState struct {
	State string

	At, Reachable, Total int64
	Boost, BBswap, BHttp int64
}

type DealCountStats

type DealCountStats struct {
	Count  int
	Groups int
}

type DealMeta

type DealMeta struct {
	UUID     string
	Provider int64

	Sealed, Failed, Rejected bool

	StartEpoch, EndEpoch, StartTime int64

	Status     string
	SealStatus string
	Error      string
	DealID     int64

	BytesRecv int64
	TxSize    int64
	PubCid    string

	RetrTTFBMs            int64
	RetrSuccess, RetrFail int64
	NoRecentSuccess       bool
}

type DealSummary

type DealSummary struct {
	NonFailed, InProgress, Done, Failed int64

	TotalDataSize, TotalDealSize   int64
	StoredDataSize, StoredDealSize int64
}

type ExternalStorageProvider

type ExternalStorageProvider interface {
	FetchBlocks(ctx context.Context, group GroupKey, mh []multihash.Multihash, cb func(cidx int, data []byte)) error
}

type GroupDesc

type GroupDesc struct {
	RootCid, PieceCid cid.Cid
	CarSize           int64
}

type GroupIOStats

type GroupIOStats struct {
	ReadBlocks, ReadBytes   int64
	WriteBlocks, WriteBytes int64
}

type GroupKey

type GroupKey = int64

type GroupMeta

type GroupMeta struct {
	State GroupState

	MaxBlocks int64
	MaxBytes  int64

	Blocks int64
	Bytes  int64

	ReadBlocks, ReadBytes   int64
	WriteBlocks, WriteBytes int64

	PieceCID, RootCID string

	DealCarSize *int64 // todo move to DescribeGroup
}

type GroupState

type GroupState int // todo move to rbstore?
const (
	GroupStateWritable GroupState = iota
	GroupStateFull
	GroupStateVRCARDone

	GroupStateLocalReadyForDeals
	GroupStateOffloaded

	GroupStateReload
)

type GroupStats

type GroupStats struct {
	GroupCount           int64
	TotalDataSize        int64
	NonOffloadedDataSize int64
	OffloadedDataSize    int64

	OpenGroups, OpenWritable int
}

type GroupSub

type GroupSub func(group GroupKey, from, to GroupState)

type GroupUploadStats

type GroupUploadStats struct {
	ActiveRequests int
	UploadBytes    int64
}

type Index

type Index interface {
	// GetGroups gets group ids for the multihashes
	GetGroups(ctx context.Context, mh []multihash.Multihash, cb func(cidx int, gk GroupKey) (more bool, err error)) error
	GetSizes(ctx context.Context, mh []multihash.Multihash, cb func([]int32) error) error

	AddGroup(ctx context.Context, mh []multihash.Multihash, sizes []int32, group GroupKey) error

	Sync(ctx context.Context) error
	DropGroup(ctx context.Context, mh []multihash.Multihash, group GroupKey) error
	DropGroupForceDelete(ctx context.Context, mh []multihash.Multihash, group GroupKey) error
	EstimateSize(ctx context.Context) (int64, error)

	io.Closer
}

Index is the top level index, thread safe

type Libp2pInfo

type Libp2pInfo struct {
	PeerID string

	Listen []string

	Peers int
}

type OffloadLoader

type OffloadLoader interface {
	View(ctx context.Context, g GroupKey, c []multihash.Multihash, cb func(cidx int, data []byte)) error
}

type ProviderInfo

type ProviderInfo struct {
	Meta        ProviderMeta
	RecentDeals []DealMeta
}

type ProviderMeta

type ProviderMeta struct {
	ID     int64
	PingOk bool

	BoostDeals     bool
	BoosterHttp    bool
	BoosterBitswap bool

	IndexedSuccess int64
	IndexedFail    int64

	DealStarted  int64
	DealSuccess  int64
	DealFail     int64
	DealRejected int64

	MostRecentDealStart int64

	// price in fil/gib/epoch
	AskPrice         float64
	AskVerifiedPrice float64

	AskMinPieceSize float64
	AskMaxPieceSize float64

	RetrievDeals, UnretrievDeals int64
}

type RBS

type RBS interface {
	Start() error

	Session(ctx context.Context) Session
	Storage() Storage
	StorageDiag() RBSDiag

	// ExternalStorage manages offloaded data
	ExternalStorage() RBSExternalStorage

	// StagingStorage manages staged data (full non-replicated data)
	StagingStorage() RBSStagingStorage

	io.Closer
}

type RBSDiag

type RBSDiag interface {
	Groups() ([]GroupKey, error)
	GroupMeta(gk GroupKey) (GroupMeta, error)

	TopIndexStats(context.Context) (TopIndexStats, error)
	GetGroupStats() (*GroupStats, error)
	GroupIOStats() GroupIOStats

	WorkerStats() WorkerStats
}

type RBSExternalStorage

type RBSExternalStorage interface {
	InstallProvider(ExternalStorageProvider)
}

type RBSStagingStorage

type RBSStagingStorage interface {
	InstallStagingProvider(StagingStorageProvider)
}

type RIBS

type RIBS interface {
	RBS

	Wallet() Wallet
	DealDiag() RIBSDiag

	io.Closer
}

type RIBSDiag

type RIBSDiag interface {
	CarUploadStats() UploadStats
	DealSummary() (DealSummary, error)
	GroupDeals(gk GroupKey) ([]DealMeta, error)

	ProviderInfo(id int64) (ProviderInfo, error)
	CrawlState() CrawlState
	ReachableProviders() []ProviderMeta

	RetrStats() (RetrStats, error)

	StagingStats() (StagingStats, error)

	Filecoin(context.Context) (api.Gateway, jsonrpc.ClientCloser, error)

	P2PNodes(ctx context.Context) (map[string]Libp2pInfo, error)

	RetrChecker() RetrCheckerStats

	RetrievableDealCounts() ([]DealCountStats, error)
	SealedDealCounts() ([]DealCountStats, error)

	RepairQueue() (RepairQueueStats, error)
	RepairStats() (map[int]RepairJob, error)
}

type RepairJob

type RepairJob struct {
	GroupKey GroupKey

	State RepairJobState

	FetchProgress, FetchSize int64
	FetchUrl                 string
}

type RepairJobState

type RepairJobState string
const (
	RepairJobStateFetching  RepairJobState = "fetching"
	RepairJobStateVerifying RepairJobState = "verifying"
	RepairJobStateImporting RepairJobState = "importing"
)

type RepairQueueStats

type RepairQueueStats struct {
	Total, Assigned int
}

type RetrCheckerStats

type RetrCheckerStats struct {
	ToDo       int64
	Started    int64
	Success    int64
	Fail       int64
	SuccessAll int64
	FailAll    int64
}

type RetrStats

type RetrStats struct {
	Success, Bytes, Fail, CacheHit, CacheMiss, Active int64
	HTTPTries, HTTPSuccess, HTTPBytes                 int64
}

type Session

type Session interface {
	// View attempts to read a list of cids
	// NOTE:
	// * Callback calls can happen out of order
	// * Callback calls can happen in parallel
	// * Callback will not be called for indexes where data is not found
	// * Callback `data` must not be referenced after the function returns
	//   If the data is to be used after returning from the callback, it MUST be copied.
	View(ctx context.Context, c []multihash.Multihash, cb func(cidx int, data []byte)) error

	// -1 means not found
	GetSize(ctx context.Context, c []multihash.Multihash, cb func([]int32) error) error

	Batch(ctx context.Context) Batch
}

Session groups correlated IO operations; thread safa

type StagingStats

type StagingStats struct {
	UploadBytes, UploadStarted, UploadDone, UploadErr, Redirects, ReadReqs, ReadBytes int64
}

type StagingStorageProvider

type StagingStorageProvider interface {
	Upload(ctx context.Context, group GroupKey, size int64, src func(writer io.Writer) error) error
	ReadCar(ctx context.Context, group GroupKey, off, size int64) (io.ReadCloser, error)
	HasCar(ctx context.Context, group GroupKey) (bool, error)
}

type Storage

type Storage interface {
	FindHashes(ctx context.Context, hashes multihash.Multihash) ([]GroupKey, error)

	ReadCar(ctx context.Context, group GroupKey, sz func(int64), out io.Writer) error

	// HashSample returns a sample of hashes from the group saved when the group was finalized
	HashSample(ctx context.Context, group GroupKey) ([]multihash.Multihash, error)

	DescibeGroup(ctx context.Context, group GroupKey) (GroupDesc, error)

	Offload(ctx context.Context, group GroupKey) error

	LoadFilCar(ctx context.Context, group GroupKey, f io.Reader, sz int64) error

	Subscribe(GroupSub)
}

type TopIndexStats

type TopIndexStats struct {
	Entries       int64
	Writes, Reads int64
}

type UploadStats

type UploadStats struct {
	ByGroup map[GroupKey]*GroupUploadStats

	LastTotalBytes int64
}

type Wallet

type Wallet interface {
	WalletInfo() (WalletInfo, error)

	MarketAdd(ctx context.Context, amount abi.TokenAmount) (cid.Cid, error)
	MarketWithdraw(ctx context.Context, amount abi.TokenAmount) (cid.Cid, error)

	Withdraw(ctx context.Context, amount abi.TokenAmount, to address.Address) (cid.Cid, error)
}

type WalletInfo

type WalletInfo struct {
	Addr, IDAddr string

	DataCap string

	Balance       string
	MarketBalance string
	MarketLocked  string

	MarketBalanceDetailed api.MarketBalance
}

type WorkerStats

type WorkerStats struct {
	Available, InFinalize, InCommP, InReload int64
	TaskQueue                                int64

	CommPBytes int64
}

Directories

Path Synopsis
integrations
web

Jump to

Keyboard shortcuts

? : This menu
/ : Search site
f or F : Jump to
y or Y : Canonical URL