2021-03-02.
TL;DR: I'm archiving this project for now, but I may unarchive it at some point in the future and switch from FUSE to 9P.
This project never went past the initial hackathon phase due to lack of interest in the setting where I aimed at using it.
I have no interest at all in FUSE file systems; I'd much rather use 9P.
My choice of FUSE was constrained by the mentioned setting (pardon the vagueness).
Since you're here, you may take a look at my main project instead, which is a 9P file system, musclefs.
The original pre-archival README follows.
dino
Introduction
dinofs is a FUSE file system designed to share data volumes among
several participating nodes. All nodes work on the same data, and changes
made on one node are automatically visible to the other nodes.
The dino project consists of
- a metadataserver binary, which should run on a server node,
- a blobserver binary, which should run on a server node (possibly the same),
- and a dinofs binary, which should run on a set of participating client nodes.
The metadataserver is used by dinofs to commit metadata mutations (e.g., chown,
chmod, setfacl, rename, truncate, ...) and is responsible for pushing updates to
all connected clients.
The blobserver is used for persistent storage of blobs (file contents).
The dinofs binary interfaces with the operating system's FUSE implementation to
provide the file system functionality. It caches heavily and performs synchronous
network IO to commit metadata mutations and to fetch blobs that are not yet
cached.
Getting started
If Go is not installed on your system, you can download it from https://golang.org/.
You also need FUSE support in your system, which can be installed with apt install -qy fuse
or the equivalent command for your Linux distribution. There is FUSE support for macOS at https://osxfuse.github.io/.
Next, install the latest revision of dinofs, metadataserver, and blobserver with the following command.
go get -u github.com/nicolagi/dino/cmd/...
After this, create some initial configuration.
mkdir -p $HOME/lib/dino
cat > $HOME/lib/dino/blobserver.config <<EOF
{
blob_server: "localhost:3002"
debug: true
}
EOF
cat > $HOME/lib/dino/metadataserver.config <<EOF
debug = true
listen_address = "localhost:3003"
backend = "simple"
[stores]
[stores.simple]
type = "boltdb"
file = "$HOME/lib/dino/metadata.db"
EOF
cat > $HOME/lib/dino/fs-default.config <<EOF
{
blobs: {
type: "dino"
address: "localhost:3002"
}
metadata: {
type: "dino"
address: "localhost:3003"
}
mountpoint: "/mnt/dino"
debug: true
}
EOF
Prepare the mountpoint so your user can mount dinofs:
sudo mkdir -p /mnt/dino
sudo chown $USER /mnt/dino
Start all servers in the background:
blobserver &
metadataserver &
dinofs &
At this point the fs should be mounted:
$ mount | grep dinofs
dinofs on /mnt/dino type fuse.dinofs (rw,nosuid,nodev,relatime,user_id=1000,group_id=100)
In case something goes wrong, log files in $HOME/lib/dino/*log
should indicate what the problem is.
This should be enough to start experimenting. After this, one probably wants to expose blobserver and metadataserver over a network, but beware of the security implications (see the Security section below).
To unmount, run
fusermount -u /mnt/dino
which will also terminate the dinofs process. The blobserver and metadataserver processes will run until killed.
Why FUSE instead of 9P?
Unlike with the muscle file system, I need
to use this file system on servers that don't have a 9P driver.
Bugs and limitations
Probably.
At the time of writing, the code base is the result of a hackathon. I will be
using this software, fixing bugs, and updating this file once the project is
more stable.
Also, not a lot of effort has (yet) gone into making this performant.
Security
Not much. (Not yet.)
The metadataserver can be configured to use TLS, adding a key pair to the configuration.
listen_address = "localhost:3003"
cert_file = "$HOME/lib/dino/cert.pem"
key_file = "$HOME/lib/dino/key.pem"
The client will first attempt a TLS handshake; failing that, it will fall back
to plain TCP.
It is possible to protect the metadataserver process with password-based
authentication. To achieve that, add an auth_key property containing the
password to the dinofs configuration file, and add its bcrypt hash as the
auth_hash property to the metadataserver configuration.
The proper value for auth_hash can be generated with the helper program
genhash in this repository:
;; pwd
/n/muscle/src/dino/metadata/server/genhash
;; go run genhash.go
Type password (echo is off) and return:
The hash is: $2a$10$s8EKCUc1J60HmjAsSB9IdOcwgeW4FSAYTEdoyexdqlLB9CQpb1Vky
If an auth_key or auth_hash is configured, TLS must be used, and the
fallback to plain TCP is disabled.
Details
A file system consists of data (file contents) and metadata (where are the file
contents, what's the size, what are the child nodes, ...).
If a file system is to be synchronized over the network among a set of
participating client nodes, key to its efficiency is limiting the synchronous
network calls necessary to fulfill system calls.
Let's show how we do that.
Avoiding synchronous network calls when writing data through a dinofs mount is
easy. The data will be saved to a blob store. The blob store saves any value
under a key that is the hash of its value (content addressing). That prevents
competing clients (running dinofs against the same metadata server and the same
remote storage backend for the data) from conflicting: if they write to the
same key, they also write the same values. Therefore, dinofs writes data to a
local disk-based store (write-back cache) and asynchronously to the blob server.
In terms of file contents, the synchronous workload always hits local
storage.
This design for the blob store is the reason why its Put method does not take a
key: it only takes a value, and it returns the value's key (to be stored in the
file system node's metadata).
While regular file nodes usually consist of multiple data blocks, for
simplicity, and because my current use case for dinofs only entails small
files, I'm storing only one blob per regular file or symlink.
Flexibility
The basic building block for metadata and data storage is a super simple
interface, that of a key-value store. The same interface is used for
- the local write-back data cache,
- the remote persistent data storage,
- the local metadata write-through cache,
- the metadataserver persistent metadata storage.
At the time of writing, the implementations used for the above are hard-coded to
- a disk-based store,
- a remote HTTP server (the blobserver binary) that's also disk-based (but could easily be S3-based),
- an in-memory map,
- a Bolt database fronted by a custom TCP server (the metadataserver binary).
Wherever I said "disk-based store", "in-memory map", "Bolt database", or "S3"
above, one could substitute anything that allows implementing Put(key, value []byte) error
and Get(key []byte) (value []byte, err error). The
VersionedStoreWrapper and BlobStoreWrapper types take care of the
higher-level synchronization. The PairedStore asynchronously propagates data
from a local fast store to a remote slow store (any two stores, really), and it
implements the store interface itself.
One could trade speed for simplicity and avoid any local cache, using only
network-accessible storage; this could be the case if you want your files on a
random server that you need to ssh into.
Slightly more advanced, one could implement the versioned store interface
directly (instead of wrapping a store).
For the metadataserver and client, one could actually use a Redis server and a
somewhat fat client, powered by Lua scripting and Redis Pub/Sub for pushing
metadata updates.
Yet another option would be to use DynamoDB (conditional updates to enforce the
next-version constraint) and DynamoDB Streams to fan out the metadata updates.