cloudsync

package module
v0.0.1-alpha
Published: Jul 27, 2022 License: MIT Imports: 16 Imported by: 0

README

Neutrino CloudSync

Neutrino CloudSync is an open-source tool used to upload entire file folders from any host to any cloud.

How-To

Provision your own infrastructure

This repository contains fully-customizable Terraform code (IaC) to deploy and/or provision live infrastructure in your own cloud account.

The Terraform code can be found under the deployments/terraform directory of this repository.

To run this code and provision your own infrastructure, you MUST have the Terraform CLI installed on your admin host machine (not on the actual nodes that will interact with the platform). Furthermore, an S3 bucket and a DynamoDB table are REQUIRED to persist Terraform states remotely (S3) and to provide a remote lock/unlock mutex mechanism (DynamoDB), enabling collaboration between multiple developers and hence multiple development machines. If this functionality is not desired, please remove the backend configuration from main.tf's terraform block and leave it like this:

terraform {
}
Amazon Web Services

The following steps are specific for the Amazon Web Services (AWS) platform:

  • Go to deployments/terraform/workspaces/development.
  • Add a terraform.tfvars file with the following variables (replace with actual cloud account data):
aws_account = "0000"
aws_region = "us-east-N"
aws_access_key = "XXXX"
aws_secret_key = "XXXX"
  • OPTIONAL: Modify variables in the variables.tf file as desired to configure your infrastructure properties.
  • Run terraform plan and verify that a blob bucket and an encryption key will be created.
  • Run terraform apply and type yes after verifying that a blob bucket and an encryption key are the only resources to be created.
  • OPTIONAL: Go to the GUI cloud console (or use the cloud CLI) and verify all resources have been created with their proper configurations.

NOTE: At this time, the deployed infrastructure will be tagged and named using the development stage. This may be changed through the Terraform files, more specifically in the main.tf file of the development workspace folder.

Upload Files

Just start an uploader instance using the command:

user@machine:~ make run directory=DIRECTORY_TO_SYNC

or

user@machine:~ go run ./cmd/uploader/main.go -d DIRECTORY_TO_SYNC

The uploader program has other flags not shown in the examples above. To read more about them, run:

user@machine:~ make help

or

user@machine:~ go run ./cmd/uploader/main.go -h

Documentation

Overview

Package cloudsync implements Neutrino CloudSync, an open-source tool used to upload entire file folders from any host to any cloud.

Index

Constants

This section is empty.

Variables

View Source
var DefaultStats = &Stats{}
View Source
var ErrFatalStorage = errors.New("cloudsync: Got fatal error from blob storage")

ErrFatalStorage is a non-recoverable error issued by the blob storage. Programs should panic once they receive this error.
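
Following that guidance, a caller might check for it like this (a sketch, assuming the returned error is ErrFatalStorage itself or wraps it):

// Treat a fatal storage error as unrecoverable, per the package's guidance.
if errors.Is(err, cloudsync.ErrFatalStorage) {
	panic(err)
}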

Functions

func ListenAndExecuteUploadJobs

func ListenAndExecuteUploadJobs(ctx context.Context, storage BlobStorage, wg *sync.WaitGroup)

ListenAndExecuteUploadJobs waits for and executes object upload jobs received asynchronously from internal queues.

The listening loop breaks if the context is cancelled.
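
A minimal sketch of launching the upload workers; the module path github.com/neutrinocorp/cloudsync is an assumption (check your go.mod):

package main

import (
	"context"
	"sync"

	"github.com/neutrinocorp/cloudsync" // assumed module path
)

func main() {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()

	wg := &sync.WaitGroup{}
	storage := cloudsync.NoopBlobStorage{} // stand-in for a real BlobStorage driver
	go cloudsync.ListenAndExecuteUploadJobs(ctx, storage, wg)

	// ... schedule upload jobs here (see ScheduleFileUploads), then:
	wg.Wait()
}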

func ListenForSysInterruption

func ListenForSysInterruption(wg *sync.WaitGroup, cancel context.CancelFunc, sysChan <-chan os.Signal)

ListenForSysInterruption waits for an external cancellation signal (e.g. pressing Ctrl+C in the shell session running the program) and gracefully shuts down internal workers when it arrives.
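
A wiring sketch using the standard os/signal package (imports os, os/signal and syscall; ctx, cancel and wg as in the previous sketch):

// Feed OS interrupt signals into the graceful-shutdown listener.
sysChan := make(chan os.Signal, 1)
signal.Notify(sysChan, os.Interrupt, syscall.SIGTERM)
go cloudsync.ListenForSysInterruption(wg, cancel, sysChan)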

func ListenUploadErrors

func ListenUploadErrors(cfg Config)

ListenUploadErrors waits for and performs actions when object upload jobs fail. These errors are sent asynchronously through an internal error queue, as all internal jobs are scheduled the same way.

The listening loop breaks if the context is cancelled.
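
A minimal sketch, run alongside the upload workers (cfg as returned by NewConfig):

// Handles failed uploads in the background; the loop stops when the context is cancelled.
go cloudsync.ListenUploadErrors(cfg)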

func SaveConfig

func SaveConfig(cfg Config) error

SaveConfig stores the specified Config on the host's physical disk.

func SaveConfigIfNotExists

func SaveConfigIfNotExists(path, file string) bool

SaveConfigIfNotExists creates a path and/or Config file if not found.

If no file was found, it will allocate a ULID as ScannerConfig.PartitionID.
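
A sketch of a typical configuration bootstrap; the paths and file name are illustrative, and the fragment assumes the log and cloudsync imports:

// Create the config file on first run, then load it.
_ = cloudsync.SaveConfigIfNotExists("/home/user/.cloudsync", "config.yaml")
cfg, err := cloudsync.NewConfig("/home/user/.cloudsync", "config.yaml", "/home/user/Documents")
if err != nil {
	log.Fatal(err)
}
// Persist any runtime changes back to disk.
if err := cloudsync.SaveConfig(cfg); err != nil {
	log.Fatal(err)
}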

func ScheduleFileUploads

func ScheduleFileUploads(ctx context.Context, cfg Config, wg *sync.WaitGroup, storage BlobStorage) error

ScheduleFileUploads traverses a directory tree based on the specified configuration (Config.RootDirectory) and schedules upload jobs for every file found within all directories (if ScannerConfig.DeepTraversing is set to true) or for files found in the root directory only.

Furthermore, based on the specified Config, traversal of a folder may be skipped if the folder is hidden (its name uses the '.' prefix character) or if the object/folder key was explicitly ignored in the Config file.

func ShutdownUploadWorkers

func ShutdownUploadWorkers(ctx context.Context, wg *sync.WaitGroup)

ShutdownUploadWorkers closes internal job queues and stores new configuration variables (if required).
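
Putting the lifecycle functions together, a hedged end-to-end sketch; the module path and the shutdown ordering are assumptions (see the Scanner type for a managed alternative):

package main

import (
	"context"
	"log"
	"os"
	"os/signal"
	"sync"
	"syscall"

	"github.com/neutrinocorp/cloudsync" // assumed module path; check go.mod
)

func main() {
	cfg, err := cloudsync.NewConfig("/home/user/.cloudsync", "config.yaml", "/home/user/Documents")
	if err != nil {
		log.Fatal(err)
	}

	ctx, cancel := context.WithCancel(context.Background())
	wg := &sync.WaitGroup{}
	storage := cloudsync.NoopBlobStorage{} // swap for a real driver

	// Graceful shutdown on Ctrl+C / SIGTERM.
	sysChan := make(chan os.Signal, 1)
	signal.Notify(sysChan, os.Interrupt, syscall.SIGTERM)
	go cloudsync.ListenForSysInterruption(wg, cancel, sysChan)
	go cloudsync.ListenUploadErrors(cfg)
	go cloudsync.ListenAndExecuteUploadJobs(ctx, storage, wg)

	if err := cloudsync.ScheduleFileUploads(ctx, cfg, wg, storage); err != nil {
		log.Fatal(err)
	}

	// Close internal queues once scheduling is done, then wait for workers.
	cloudsync.ShutdownUploadWorkers(ctx, wg)
	wg.Wait()
}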

Types

type BlobStorage

type BlobStorage interface {
	// Upload stores an Object in a remote blob storage.
	Upload(ctx context.Context, obj Object) error

	// CheckMod verifies whether an Object (using its key) was modified prior to a specified time or
	// differs in size from the Object stored in the remote storage.
	//
	// Returns ErrFatalStorage if a non-recoverable error was returned by the remote storage server
	// (e.g. insufficient permissions, bucket does not exist).
	CheckMod(ctx context.Context, key string, modTime time.Time, size int64) (bool, error)
}

BlobStorage is a unit of non-volatile binary large object (BLOB) persistence.
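
Any type implementing these two methods can be plugged into the scheduling functions. A skeletal sketch of a hypothetical custom driver (the name and CheckMod semantics are assumptions):

// discardStorage is a hypothetical BlobStorage that drops every object.
type discardStorage struct{}

func (discardStorage) Upload(_ context.Context, obj cloudsync.Object) error {
	if obj.CleanupFunc != nil {
		defer obj.CleanupFunc() // free underlying buffers, per Object's contract
	}
	return nil // a real driver would write obj.Data to its backend here
}

func (discardStorage) CheckMod(_ context.Context, _ string, _ time.Time, _ int64) (bool, error) {
	return true, nil // assumption: true means the object was modified and should be uploaded
}

// Compile-time interface check.
var _ cloudsync.BlobStorage = discardStorage{}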

type CloudConfig

type CloudConfig struct {
	Region    string `yaml:"region"`
	Bucket    string `yaml:"bucket"`
	AccessKey string `yaml:"access_key"`
	SecretKey string `yaml:"secret_key"`
}

CloudConfig remote infrastructure and services configuration.

type Config

type Config struct {
	FilePath      string        `yaml:"-"`
	RootDirectory string        `yaml:"-"`
	Cloud         CloudConfig   `yaml:"cloud"`
	Scanner       ScannerConfig `yaml:"scanner"`
	// contains filtered or unexported fields
}

Config is the main application configuration.

func NewConfig

func NewConfig(path, file, rootDirectory string) (Config, error)

NewConfig allocates a Config instance used by internal components to perform their processes.

func (*Config) KeyIsIgnored

func (c *Config) KeyIsIgnored(key string) bool

KeyIsIgnored verifies if a specified key was selected to be ignored.
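
For example, with cfg loaded via NewConfig (the key name is illustrative):

if cfg.KeyIsIgnored("node_modules") {
	// skip scheduling uploads for this folder
}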

type ErrFileUpload

type ErrFileUpload struct {
	Key    string
	Parent error
}

ErrFileUpload generic error generated from a blob upload job.

func (ErrFileUpload) Error

func (e ErrFileUpload) Error() string
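
Callers can recover the failed key with a standard errors.As check; a sketch, assuming the error value is an ErrFileUpload or wraps one:

var upErr cloudsync.ErrFileUpload
if errors.As(err, &upErr) {
	log.Printf("upload of %q failed: %v", upErr.Key, upErr.Parent)
}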

type NoopBlobStorage

type NoopBlobStorage struct {
	UploadErr    error
	CheckModBool bool
	CheckModErr  error
}

func (NoopBlobStorage) CheckMod

func (n NoopBlobStorage) CheckMod(_ context.Context, _ string, _ time.Time, _ int64) (bool, error)

func (NoopBlobStorage) Upload

func (n NoopBlobStorage) Upload(_ context.Context, _ Object) error
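
NoopBlobStorage is handy as a test double; the field names suggest each value presets what the corresponding method returns (treat that as an assumption). A sketch:

func TestUploadPipeline(t *testing.T) {
	storage := cloudsync.NoopBlobStorage{
		CheckModBool: true, // assumption: CheckMod reports every object as modified
		UploadErr:    nil,  // Upload always succeeds
	}
	var _ cloudsync.BlobStorage = storage // exercise code depending on BlobStorage here
}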

type Object

type Object struct {
	// Key is the file's path + name, or just its name.
	Key string
	// Data Binary Large Object reader instance.
	Data ReadSeekerAt
	// CleanupFunc frees resources like underlying buffers.
	CleanupFunc func() error
}

Object, also known as a file, is an information unit stored within a directory, composed of an io.Reader holding binary data (Data) and a Key.
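
Since *os.File implements both io.ReadSeeker and io.ReaderAt, a minimal Object can be built straight from an open file (the paths are illustrative):

f, err := os.Open("/home/user/Documents/report.pdf")
if err != nil {
	log.Fatal(err)
}
obj := cloudsync.Object{
	Key:         "documents/report.pdf",
	Data:        f,
	CleanupFunc: f.Close, // lets the storage driver release the file handle
}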

type ReadSeekerAt

type ReadSeekerAt interface {
	io.ReadSeeker
	io.ReaderAt
}

ReadSeekerAt is a custom read buffer type used to perform memory-efficient allocations when reading large objects.

More specifically, io.Reader requires a complete buffer allocation when reading a slice of bytes, while the combination of io.ReadSeeker and io.ReaderAt allows reading specific parts of the given slice of bytes (avoiding unnecessary memory allocations, i.e. a full buffer allocation) while still satisfying the io.Reader interface.

Finally, this drastically increases application performance when reading a big slice of bytes (e.g. a large PDF or docx file), as underlying upload APIs from third-party vendors might partition these files using a multipart strategy.

For more information, please read: https://aws.github.io/aws-sdk-go-v2/docs/sdk-utilities/s3/.
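
Several standard library types already satisfy the interface, for example:

// Compile-time checks: both standard library types provide Read, Seek and ReadAt.
var (
	_ cloudsync.ReadSeekerAt = (*os.File)(nil)
	_ cloudsync.ReadSeekerAt = (*bytes.Reader)(nil)
)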

type Scanner

type Scanner struct {
	// contains filtered or unexported fields
}

Scanner is the main component; it reads files and schedules upload jobs based on the directories specified in Config.

func NewScanner

func NewScanner(cfg Config) *Scanner

NewScanner allocates a new Scanner instance which will use specified Config.

func (*Scanner) Shutdown

func (s *Scanner) Shutdown(ctx context.Context) error

Shutdown stops all internal processes gracefully. Moreover, the shutdown process will stop if the specified context is cancelled, avoiding application deadlocks when used with context.WithTimeout(), at the expense of a potentially corrupted shutdown.

func (*Scanner) Start

func (s *Scanner) Start(store BlobStorage) error

Start bootstraps and runs internal processes to read files and schedule upload jobs.
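
Scanner offers a managed alternative to wiring the package-level functions by hand. A hedged usage sketch (whether Start blocks is not documented here, so treat the ordering as an assumption; the timeout value is illustrative):

scanner := cloudsync.NewScanner(cfg)
if err := scanner.Start(cloudsync.NoopBlobStorage{}); err != nil { // swap for a real driver
	log.Fatal(err)
}

// On termination, bound the shutdown to avoid deadlocks.
ctx, cancel := context.WithTimeout(context.Background(), 30*time.Second)
defer cancel()
if err := scanner.Shutdown(ctx); err != nil {
	log.Println("shutdown:", err)
}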

type ScannerConfig

type ScannerConfig struct {
	// PartitionID a Scanner instance will use this field to create logical partitions in the specified bucket.
	//
	// This could be used in many ways such as:
	//
	// - Create a multi-tenant environment.
	//
	// - Store data from several machines (maybe from within a network) into a single bucket without operational
	// overhead.
	//
	// Note: This field is auto-generated as a Universally Unique Lexicographically Sortable Identifier (ULID) if not found.
	PartitionID string `yaml:"partition_id"`
	// ReadHidden read hidden files (those whose names use the '.' character prefix).
	ReadHidden bool `yaml:"read_hidden"`
	// DeepTraversing traverse every node of the root directory tree until its leaves are reached. If set to false,
	// Scanner will read only the files in the root directory.
	DeepTraversing bool `yaml:"deep_traversing"`
	// IgnoredKeys deny list of custom reserved file or folder keys. Scanner will skip items specified here.
	IgnoredKeys []string `yaml:"ignored_keys"`
	// LogErrors enables or disables error logging. Useful for development or overall process visibility purposes.
	LogErrors bool `yaml:"log_errors"`
}

ScannerConfig Scanner configuration.
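
Together with CloudConfig, these fields map to the YAML configuration file. A sketch derived from the struct tags shown above (all values are illustrative):

cloud:
  region: us-east-1
  bucket: my-cloudsync-bucket
  access_key: XXXX
  secret_key: XXXX
scanner:
  partition_id: 01G6XYZABCDEFGHJKMNPQRSTVW
  read_hidden: false
  deep_traversing: true
  ignored_keys:
    - .git
    - node_modules
  log_errors: true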

type Stats

type Stats struct {
	// contains filtered or unexported fields
}

Stats contains counters used by internal processes to keep track of their operations.

This struct is goroutine-safe as it relies on atomic operations.

func (Stats) GetCurrentUploadJobs

func (s Stats) GetCurrentUploadJobs() uint64

func (Stats) GetTotalUploadJobs

func (s Stats) GetTotalUploadJobs() uint64
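
DefaultStats exposes the package-wide counters. Assuming GetCurrentUploadJobs reports in-flight jobs, a progress line might read:

// Print a snapshot of the upload counters (format is illustrative).
log.Printf("upload jobs: %d in flight, %d total",
	cloudsync.DefaultStats.GetCurrentUploadJobs(),
	cloudsync.DefaultStats.GetTotalUploadJobs())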

Directories

Path Synopsis
cmd
cli
storage Package storage holds 3rd party driver implementations for blob (and potentially other) stores.
