stream

v1.0.0
Published: May 12, 2024 License: MIT Imports: 16 Imported by: 5

README

stream

Golang file system abstraction tailored for AWS S3


The library provides a Golang file system abstraction tailored for AWS S3, enabling seamless streaming of binary objects along with their corresponding metadata.

Inspiration

Streaming is a convenient paradigm for handling large binary objects such as images and videos. Applications can manage data consumption effectively by relying on the io.Reader and io.Writer abstractions. This library uses the AWS Golang SDK v2 under the hood to provide access to AWS S3 through streams.

On the other hand, a file system is a method used by computers to organize and store data on storage devices. It provides a structured way to access and manage binary objects. File systems handle tasks such as creating, reading, writing, and deleting binary objects, as well as managing permissions and metadata associated with each file or directory.

The library implements the Golang File System and enhances it by adding support for writable files and type-safe metadata. The file system API is as follows:

type FileSystem[T any] interface {
  Create(path string, attr *T) (File, error)
  Open(path string) (fs.File, error)
  Stat(path string) (fs.FileInfo, error)
  ReadDir(path string) ([]fs.DirEntry, error)
  Glob(pattern string) ([]string, error)
}

Notably, the interface supports reading and writing metadata associated with AWS objects using fs.FileInfo.

Getting started

The library requires Go 1.18 or later due to usage of generics.

The latest version of the library is available on its main branch. All development, including new features and bug fixes, takes place on the main branch using forking and pull requests, as described in the contribution guidelines. The stable version is available via Golang modules.

Use go get to retrieve the library and add it as a dependency to your application.

go get -u github.com/fogfish/stream

Quick Start

Check out the examples. They cover all fundamental use cases with runnable code snippets. Below is the simplest "Hello World"-like application for reading an object from AWS S3.

import (
  "io"
  "os"

  "github.com/fogfish/stream"
)

// mount s3 bucket as file system
s3fs, err := stream.NewFS(/* name of S3 bucket */)
if err != nil {
  return err
}

// open stream `io.Reader` to an object on S3
fd, err := s3fs.Open("/the/example/key.txt")
if err != nil {
  return err
}

// stream data to standard output using io.Reader interface
if _, err := io.Copy(os.Stdout, fd); err != nil {
  return err
}

// close stream
err = fd.Close()
if err != nil {
  return err
}

See and try the examples. They cover all basic use cases of the library.

Mounting S3

The library serves as a user-side implementation of Golang's file system abstractions defined by io/fs. It implements fs.FS, fs.StatFS, fs.ReadDirFS and fs.GlobFS. Additionally, it offers extensions for file writing: stream.CreateFS, stream.RemoveFS and stream.CopyFS.

To create a file system instance, use stream.NewFS or stream.New. The file system is configurable using the options pattern.

s3fs, err := stream.NewFS(
  /* name of S3 bucket */,
  stream.WithIOTimeout(5 * time.Second),
  stream.WithListingLimit(25),
)

Reading objects

To open a file for reading, use the Open function, giving an absolute path that starts with /. The returned file descriptor is a composite of io.Reader, io.Closer and stream.Stat. Use Golang's convenient streaming methods to consume the S3 object.

r, err := s3fs.Open("/the/example/key")
if err != nil {
  return err
}
defer r.Close() 

// utilize Golang's convenient streaming methods
io.ReadAll(r)

Writing objects

To open a file for writing, use the Create function, giving an absolute path that starts with /. The returned file descriptor is a composite of io.Writer, io.Closer and stream.Stat. Use Golang's convenient streaming methods to update the S3 object. Once all bytes are written, it is crucial to close the stream; failing to do so causes data loss. The object is considered successfully created on S3 only if all Write operations and the subsequent Close succeed.

w, err := s3fs.Create("/the/example/key", nil)
if err != nil {
  return err
}

// utilize Golang's convenient streaming methods
if _, err := io.WriteString(w, "Hello World!\n"); err != nil {
  return err
}

// close stream and handle error to prevent data loss. 
err = w.Close()
if err != nil {
  return err
}
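The write-then-close rule can be captured in a small helper that always closes the stream and reports the first failure. The sketch below is illustrative only: memObject is a hypothetical in-memory writer that, like an upload stream, makes the data visible only when Close succeeds.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
)

// memObject is a hypothetical stand-in for a writable S3 stream.
// The data is committed to the store only on Close.
type memObject struct {
	buf   bytes.Buffer
	key   string
	store map[string][]byte
}

func (m *memObject) Write(p []byte) (int, error) { return m.buf.Write(p) }

func (m *memObject) Close() error {
	m.store[m.key] = m.buf.Bytes()
	return nil
}

// put writes data and always closes the stream. It returns the write
// error if there is one, otherwise the error from Close.
func put(w io.WriteCloser, data string) error {
	_, err := io.WriteString(w, data)
	if cerr := w.Close(); err == nil {
		err = cerr
	}
	return err
}

func main() {
	store := map[string][]byte{}
	w := &memObject{key: "/the/example/key", store: store}

	if err := put(w, "Hello World!\n"); err != nil {
		fmt.Println("upload failed:", err)
		return
	}
	fmt.Printf("%s", store["/the/example/key"])
}
```

Because put accepts io.WriteCloser, the same helper should work with any writer whose Close can fail, including the file returned by Create.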

Walking through objects

The file system implements the fs.ReadDirFS and fs.GlobFS interfaces for traversal through objects. A classical file system organizes data hierarchically into directories, as opposed to the flat storage structure of general purpose AWS S3 (directory buckets are not supported yet). The flat structure imposes two limitations on the implementation:

  1. It assumes a directory if the path ends with / (e.g. /the/example/key points to the object, /the/example/key/ points to the directory).
  2. It returns paths relative to the pattern for all found objects.

err := fs.WalkDir(s3fs, dir, func(path string, d fs.DirEntry, err error) error {
  if err != nil {
    return err
  }

  if d.IsDir() {
    return nil
  }

  // do something with file
  // path is absolute path to the file but entry is relative path
  // path == dir + d.Name()

  return nil
})

Supported File System Operations

For added convenience, the file system is enhanced with stream.RemoveFS and stream.CopyFS, enabling the removal of S3 objects and the copying of objects across buckets, respectively.

Objects metadata

fs.FileInfo is the primary container for S3 object metadata. The file system provides access to metadata either from open streams (file descriptors) or for any key.

fi, err := s3fs.Stat("/the/example/key")
if err != nil {
  return err
}

Type-safe objects metadata

AWS S3 supports object metadata as a set of name-value pairs. You define the metadata at the time you upload the object and read it later. This library supports both system and user-controlled metadata attributes.

What sets this library apart is its encouragement for developers to utilize the Golang type system in defining object metadata. Rather than working with loosely typed name-value pairs, metadata is structured as Golang structs, promoting correctness and maintainability. This approach is facilitated through generic programming style within the library.

A Golang struct type serves as the metadata container, where each public field is transformed into a name-value pair before being written to S3. The example below defines a container with two user-controlled attributes, Author and Chapter, and two system attributes, ContentType and ContentLanguage.

type Note struct {
  Author          string
  Chapter         string
  ContentType     string
  ContentLanguage string
}
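To make the idea of turning public fields into name-value pairs concrete, here is a minimal reflection-based sketch. It is not the library's implementation: the toPairs helper, keying pairs by the Go field name, and skipping empty values are all assumptions made for illustration.

```go
package main

import (
	"fmt"
	"reflect"
)

type Note struct {
	Author          string
	Chapter         string
	ContentType     string
	ContentLanguage string
}

// toPairs flattens the exported, non-empty string fields of a struct (or a
// pointer to a struct) into name-value pairs keyed by field name.
// Hypothetical helper, for illustration only.
func toPairs(v any) map[string]string {
	pairs := map[string]string{}

	rv := reflect.Indirect(reflect.ValueOf(v))
	if rv.Kind() != reflect.Struct {
		return pairs
	}

	rt := rv.Type()
	for i := 0; i < rt.NumField(); i++ {
		f := rt.Field(i)
		if !f.IsExported() || f.Type.Kind() != reflect.String {
			continue
		}
		if s := rv.Field(i).String(); s != "" {
			pairs[f.Name] = s
		}
	}
	return pairs
}

func main() {
	note := &Note{Author: "fogfish", ContentType: "text/plain"}
	fmt.Println(toPairs(note)) // prints "map[Author:fogfish ContentType:text/plain]"
}
```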

The file system interface has been expanded to handle user-defined metadata in a type-safe manner. Firstly, stream.New() creates a type-annotated client. Secondly, the Create() function accepts a pointer to the metadata container, which is then written alongside the data. Lastly, the fs.FileInfo container retains an instance of the associated metadata, which is accessible through either a Sys() call or the StatSys() helper.

// create client and define metadata type
s3fs, err := stream.New[Note](/* name of S3 bucket */)

// create stream and annotate it with metadata
fd, err := s3fs.Create("/the/example/key",
  &Note{/* defined metadata values */},
)

// fs.FileInfo carries previously written metadata, use Sys() function to access.
fi, err := s3fs.Stat("/the/example/key")

note := s3fs.StatSys(fi)

AWS S3 defines a collection of well-known system attributes. This library supports only a subset of them: Cache-Control, Content-Encoding, Content-Language, Content-Type, Expires, ETag, Last-Modified and Storage-Class. Open a pull request or raise an issue if the subset needs to be extended.

The library defines the type stream.SystemMetadata, which incorporates all supported attributes. You can embed it in your own types.

type Note struct {
  stream.SystemMetadata
  Author          string
  Chapter         string
}
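Embedding promotes the system attributes to the outer type, so they read like ordinary fields. The sketch below copies the SystemMetadata definition from the package documentation into a local type so that it runs without the library; the values are illustrative.

```go
package main

import (
	"fmt"
	"time"
)

// SystemMetadata is copied from the package documentation.
type SystemMetadata struct {
	CacheControl    string
	ContentEncoding string
	ContentLanguage string
	ContentType     string
	Expires         *time.Time
	ETag            string
	LastModified    *time.Time
	StorageClass    string
}

type Note struct {
	SystemMetadata
	Author  string
	Chapter string
}

func main() {
	note := Note{
		SystemMetadata: SystemMetadata{
			ContentType:     "text/plain",
			ContentLanguage: "en",
		},
		Author: "fogfish",
	}

	// Promoted fields are reachable without naming the embedded struct.
	fmt.Println(note.ContentType, note.ContentLanguage, note.Author) // prints "text/plain en fogfish"
}
```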

Pre-signed URLs

Using the io.Reader and io.Writer interfaces is sufficient for the majority of cloud applications. However, there are instances where delegating read/write responsibilities to a mobile client becomes necessary. For example, directly uploading images or video files from a mobile client to an S3 bucket is both scalable and considerably more efficient than routing through a backend system. The library accommodates this scenario with a special case for streaming binary objects using pre-signed URLs. The file system returns the pre-signed URL for the stream within the metadata. It only requires the metadata type to define an attribute PreSignedUrl of string type.

type PreSignedUrl struct {
	PreSignedUrl string
}

Use the fs.FileInfo container and the metadata API described above to obtain pre-signed URLs.

// Mount the S3 bucket with metadata containing the `PreSignedUrl` attribute
s3fs, err := stream.New[stream.PreSignedUrl](/* name of S3 bucket */)

// Open file for read or write
fd, err := s3fs.Create("/the/example/key", nil)
if err != nil {
  return err
}
defer fd.Close()

// read the file's metadata
fi, err := fd.Stat()
if err != nil {
  return err
}

if meta := s3fs.StatSys(fi); meta != nil {
  // Use meta.PreSignedUrl
}

Note: Utilizing a pre-signed URL necessitates passing all headers that were provided to the Create function.

fd, err := s3fs.Create("/the/example/key",
  &Note{
    Author:          "fogfish",
    ContentType:     "text/plain",
    ContentLanguage: "en",
  },
)

curl -XPUT https://pre-signed-url-goes-here \
  -H 'Content-Type: text/plain' \
  -H 'Content-Language: en' \
  -H 'X-Amz-Meta-Author: fogfish' \
  -d 'some content'
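A client written in Go can issue the same upload with net/http. The sketch below only builds the request, mirroring the headers of the curl command; the newUpload helper is hypothetical and the URL is the placeholder used above.

```go
package main

import (
	"fmt"
	"net/http"
	"strings"
)

// newUpload builds (but does not send) a PUT request for a pre-signed URL.
// The headers must match the metadata passed to Create.
func newUpload(url, content string) (*http.Request, error) {
	req, err := http.NewRequest(http.MethodPut, url, strings.NewReader(content))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-Type", "text/plain")
	req.Header.Set("Content-Language", "en")
	req.Header.Set("X-Amz-Meta-Author", "fogfish")
	return req, nil
}

func main() {
	req, err := newUpload("https://pre-signed-url-goes-here", "some content")
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println(req.Method, req.Header.Get("X-Amz-Meta-Author")) // prints "PUT fogfish"

	// To perform the upload, pass req to http.DefaultClient.Do and
	// check the response for a 2xx status code.
}
```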

Error handling

The library consistently returns fs.PathError, except when the object is not found, in which case fs.ErrNotExist is returned. Additionally, it refrains from wrapping stream I/O errors.

How To Contribute

The library is MIT licensed and accepts contributions via GitHub pull requests:

  1. Fork it
  2. Create your feature branch (git checkout -b my-new-feature)
  3. Commit your changes (git commit -am 'Added some feature')
  4. Push to the branch (git push origin my-new-feature)
  5. Create new Pull Request

The build and testing process requires Go version 1.18 or later.

build and test library.

git clone https://github.com/fogfish/stream
cd stream
go test

commit message

The commit message helps us write good release notes and speeds up the review process. The message should address two questions: what changed and why. The project follows the template defined in the chapter Contributing to a Project of the Git book.

bugs

If you experience any issues with the library, please let us know via GitHub issues. We appreciate detailed and accurate reports that help us to identify and replicate the issue.

License

See LICENSE

Documentation

Overview

Package stream provides a Golang file system abstraction tailored for AWS S3, enabling seamless streaming of binary objects along with their corresponding metadata. The package implements [Golang File System](https://pkg.go.dev/io/fs) and enhances it by adding support for writable files and type-safe metadata.

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

This section is empty.

Types

type CopyFS added in v1.0.0

type CopyFS interface {
	fs.FS
	Copy(source, target string) error
	Wait(path string, timeout time.Duration) error
}

File System extension supporting file copying

type CreateFS added in v1.0.0

type CreateFS[T any] interface {
	fs.FS
	Create(path string, attr *T) (File, error)
}

File System extension supporting writable files

type File added in v1.0.0

type File interface {
	Stat
	io.Writer
	io.Closer
}

File is a writable object

type FileSystem added in v1.0.0

type FileSystem[T any] struct {
	Opts
	// contains filtered or unexported fields
}

File System

func New added in v1.0.0

func New[T any](bucket string, opts ...Option) (*FileSystem[T], error)

Create a file system instance, mounting S3 Bucket. Use Option type to configure file system.

func NewFS added in v1.0.0

func NewFS(bucket string, opts ...Option) (*FileSystem[struct{}], error)

Create a file system instance, mounting S3 Bucket. Use Option type to configure file system.

func (*FileSystem[T]) Copy added in v1.0.0

func (fsys *FileSystem[T]) Copy(source, target string) error

Copy object from source location to the target. The target must be an absolute s3://bucket/key URL.
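A caller may want to validate the target before invoking Copy. The sketch below splits an s3://bucket/key URL with net/url; parseTarget is a hypothetical helper and does not reflect how the library parses the target internally.

```go
package main

import (
	"fmt"
	"net/url"
)

// parseTarget splits an absolute s3://bucket/key URL into bucket and key.
// Hypothetical helper, for illustration only.
func parseTarget(target string) (bucket, key string, err error) {
	u, err := url.Parse(target)
	if err != nil {
		return "", "", err
	}
	if u.Scheme != "s3" || u.Host == "" || len(u.Path) < 2 {
		return "", "", fmt.Errorf("target must be s3://bucket/key, got %q", target)
	}
	// Drop the leading slash of the URL path to obtain the object key.
	return u.Host, u.Path[1:], nil
}

func main() {
	bucket, key, err := parseTarget("s3://bucket/the/example/key")
	fmt.Println(bucket, key, err) // prints "bucket the/example/key <nil>"
}
```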

func (*FileSystem[T]) Create added in v1.0.0

func (fsys *FileSystem[T]) Create(path string, attr *T) (File, error)

To open a file for writing, use the `Create` function, giving an absolute path that starts with `/`. The returned file descriptor is a composite of `io.Writer`, `io.Closer` and `stream.Stat`. Use Golang's convenient streaming methods to update the S3 object. Once all bytes are written, it is crucial to close the stream; failing to do so causes data loss. The object is considered successfully created on S3 only if all `Write` operations and the subsequent `Close` succeed.

func (*FileSystem[T]) Glob added in v1.0.0

func (fsys *FileSystem[T]) Glob(pattern string) ([]string, error)

Glob returns the names of all files matching pattern. A classical file system organizes data hierarchically into directories, as opposed to the flat storage structure of general purpose AWS S3.

It assumes a directory if the path ends with `/`.

It returns paths relative to the pattern for all found objects.

The pattern consists of an S3 key prefix and a Golang regex, separated by `|`.

func (*FileSystem[T]) Open added in v1.0.0

func (fsys *FileSystem[T]) Open(path string) (fs.File, error)

To open a file for reading, use the `Open` function, giving an absolute path that starts with `/`. The returned file descriptor is a composite of `io.Reader`, `io.Closer` and `stream.Stat`. Use Golang's convenient streaming methods to consume the S3 object.

func (*FileSystem[T]) ReadDir added in v1.0.0

func (fsys *FileSystem[T]) ReadDir(path string) ([]fs.DirEntry, error)

Reads the named directory or path prefix. A classical file system organizes data hierarchically into directories, as opposed to the flat storage structure of general purpose AWS S3.

It assumes a directory if the path ends with `/`.

It returns paths relative to the pattern for all found objects.

func (*FileSystem[T]) Remove added in v1.0.0

func (fsys *FileSystem[T]) Remove(path string) error

Remove object

func (*FileSystem[T]) Stat added in v1.0.0

func (fsys *FileSystem[T]) Stat(path string) (fs.FileInfo, error)

Stat returns a FileInfo describing the file. File system executes HeadObject S3 API call to obtain metadata.

func (*FileSystem[T]) StatSys added in v1.0.0

func (fsys *FileSystem[T]) StatSys(stat fs.FileInfo) *T

Returns file metadata of type T embedded into a FileInfo.

func (*FileSystem[T]) Wait added in v1.0.0

func (fsys *FileSystem[T]) Wait(path string, timeout time.Duration) error

Wait until the path exists or the timeout elapses.

type Option added in v0.3.0

type Option func(*Opts)

func WithIOTimeout added in v1.0.0

func WithIOTimeout(t time.Duration) Option

func WithListingLimit added in v1.0.0

func WithListingLimit(limit int32) Option

func WithPreSignUrlTTL added in v1.0.0

func WithPreSignUrlTTL(t time.Duration) Option

func WithS3 added in v1.0.0

func WithS3(api S3) Option

func WithS3Signer added in v1.0.0

func WithS3Signer(api S3Signer) Option

func WithS3Upload added in v1.0.0

func WithS3Upload(api S3Upload) Option

type Opts added in v1.0.0

type Opts struct {
	// contains filtered or unexported fields
}

File System Configuration Options

type PreSignedUrl added in v1.0.0

type PreSignedUrl struct {
	PreSignedUrl string
}

Well-known attribute for reading pre-signed Urls of S3 objects

type RemoveFS added in v1.0.0

type RemoveFS interface {
	fs.FS
	Remove(path string) error
}

File System extension supporting file removal

type S3 added in v1.0.0

type S3 interface {
	HeadObject(ctx context.Context, params *s3.HeadObjectInput, optFns ...func(*s3.Options)) (*s3.HeadObjectOutput, error)
	GetObject(ctx context.Context, params *s3.GetObjectInput, optFns ...func(*s3.Options)) (*s3.GetObjectOutput, error)
	ListObjectsV2(ctx context.Context, params *s3.ListObjectsV2Input, optFns ...func(*s3.Options)) (*s3.ListObjectsV2Output, error)
	DeleteObject(ctx context.Context, params *s3.DeleteObjectInput, optFns ...func(*s3.Options)) (*s3.DeleteObjectOutput, error)
	CopyObject(ctx context.Context, params *s3.CopyObjectInput, optFns ...func(*s3.Options)) (*s3.CopyObjectOutput, error)
}

type S3Signer added in v1.0.0

type S3Signer interface {
	PresignGetObject(ctx context.Context, params *s3.GetObjectInput, optFns ...func(*s3.PresignOptions)) (*v4.PresignedHTTPRequest, error)
	PresignPutObject(ctx context.Context, params *s3.PutObjectInput, optFns ...func(*s3.PresignOptions)) (*v4.PresignedHTTPRequest, error)
}

type S3Upload added in v1.0.0

type S3Upload interface {
	Upload(ctx context.Context, input *s3.PutObjectInput, opts ...func(*manager.Uploader)) (*manager.UploadOutput, error)
}

type Stat added in v1.0.0

type Stat interface {
	Stat() (fs.FileInfo, error)
}

Stat returns a FileInfo describing the file and its metadata from the file system.

type SystemMetadata added in v1.0.0

type SystemMetadata struct {
	CacheControl    string
	ContentEncoding string
	ContentLanguage string
	ContentType     string
	Expires         *time.Time
	ETag            string
	LastModified    *time.Time
	StorageClass    string
}

Well-known attributes controlled by the S3 system

Directories

Path Synopsis
examples module
internal
