cloudpath

package
v0.0.3
Published: Apr 17, 2020 License: Apache-2.0 Imports: 3 Imported by: 5

README

Package cloudeng.io/path/cloudpath

import "cloudeng.io/path/cloudpath"

Package cloudpath provides utility routines for working with paths across both local and distributed storage systems. The set of schemes supported can be extended by providing additional implementations of the Matcher function. A cloudpath encodes two types of information:

1. the path name itself which can be used to access the data it names.
2. metadata about where that filename is hosted.

For example, s3://my-bucket/a/b contains the path '/my-bucket/a/b' as well as the indication that this path is hosted on S3. Most cloud storage systems either use URI formats natively or support their use. Both AWS S3 and Google Cloud Storage support URLs, eg. storage.cloud.google.com/bucket/obj.

cloudpath provides operations for extracting both metadata and the path component, and operations for working with the extracted path directly. A common usage is to determine the 'scheme' (eg. s3, windows, unix etc) of a filename and then to operate on it appropriately. cloudpath represents a 'path' as a slice of strings to simplify frequently performed operations, such as finding common prefixes and suffixes, in a way that is aware of the structure of the path. For example, it should be easy to determine that s3://bucket/a/b is a prefix of https://s3.us-west-2.amazonaws.com/bucket/a/b/c.
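
To illustrate why a slice of components helps, the following is a minimal standalone sketch that does not use this package. The helpers components and hasComponentPrefix are hypothetical names introduced here, and the inputs are assumed to be path components that have already been extracted from their URLs. It contrasts a plain string prefix test with a component-wise one.

```go
package main

import (
	"fmt"
	"strings"
)

// components splits a slash-separated path into its components.
// It is a simplified stand-in for the package's own splitting logic.
func components(path string) []string {
	return strings.Split(path, "/")
}

// hasComponentPrefix reports whether every component of prefix equals
// the corresponding leading component of path.
func hasComponentPrefix(path, prefix []string) bool {
	if len(prefix) > len(path) {
		return false
	}
	for i, c := range prefix {
		if path[i] != c {
			return false
		}
	}
	return true
}

func main() {
	prefix := "/bucket/a/b"
	for _, p := range []string{"/bucket/a/b/c", "/bucket/a/bc"} {
		// The string test accepts both names; the component test
		// rejects "/bucket/a/bc" because "bc" is not the component "b".
		fmt.Println(p,
			strings.HasPrefix(p, prefix),
			hasComponentPrefix(components(p), components(prefix)))
	}
}
```

The string comparison treats /bucket/a/bc as having the prefix /bucket/a/b, whereas the component-wise comparison does not, which is the kind of structural awareness described above.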

All of the metadata for a path is represented using the Match type.

For manipulation, the path is converted to a cloudpath.T.

Constants

AWSS3, GoogleCloudStorage, UnixFileSystem, WindowsFileSystem
const (
	// AWSS3 is the scheme for Amazon Web Service's S3 object store.
	AWSS3 = "s3"
	// GoogleCloudStorage is the scheme for Google's Cloud Storage object store.
	GoogleCloudStorage = "GoogleCloudStorage"
	// UnixFileSystem is the scheme for unix like systems such as linux, macos etc.
	UnixFileSystem = "unix"
	// WindowsFileSystem is the scheme for msdos and windows filesystems.
	WindowsFileSystem = "windows"
)

Functions

Func HasPrefix
func HasPrefix(path, prefix []string) bool

HasPrefix returns true if path has the specified prefix.

Func Host
func Host(path string) string

Host calls DefaultMatchers.Host(path).

Func IsLocal
func IsLocal(path string) bool

IsLocal calls DefaultMatchers.IsLocal(path).

Func Parameters
func Parameters(path string) map[string][]string

Parameters calls DefaultMatchers.Parameters(path).

Func Path
func Path(path string) (string, rune)

Path calls DefaultMatchers.Path(path).

Func Scheme
func Scheme(path string) string

Scheme calls DefaultMatchers.Scheme(path).

Func Volume
func Volume(path string) string

Volume calls DefaultMatchers.Volume(path).

Types

Type Match
type Match struct {
	// Scheme uniquely identifies the filesystem being used, eg. s3 or windows.
	Scheme string
	// Local is true for local filesystems.
	Local bool
	// Host will be 'localhost' for local filesystems, the host encoded
	// in a URL or otherwise empty if there is no notion of a host.
	Host string
	// Volume will be the bucket or file system share for systems that support
	// that concept, or an empty string otherwise.
	Volume string
	// Path is the filesystem path or filename to the data. It may be a prefix
	// on a cloud based system or a directory on a local one.
	Path string
	// Separator is the filesystem separator (e.g / or \ for windows).
	Separator rune
	// Parameters are any parameters encoded in a URL/URI based name.
	Parameters map[string][]string
}

Match is the result of a successful match.

Type Matcher
type Matcher func(p string) *Match

Matcher is the prototype for functions that parse the supplied path to determine if it matches a specific scheme and then break out the metadata encoded in the path. Matchers for local filesystems should return "localhost" for the host.

Type MatcherSpec
type MatcherSpec []Matcher

MatcherSpec represents a set of Matchers that will be applied in order. The ordering is important: the most specific matchers need to be applied first. For example, a matcher for Windows should precede that for a Unix filesystem since the latter can accept filenames in Windows format.

Type T
type T []string

T represents a cloudpath. Instances of T are created from native storage system paths and/or URLs and are designed to retain the following information.

1. the path was absolute vs relative.
2. the path was a prefix or a filepath.
3. a path of zero length is represented as a nil slice and not an empty slice.

Redundant information is discarded:

1. multiple consecutive instances of separator are treated as a single separator.

The resulting format is as follows:

1. an absolute path, ie. one that starts with a separator, has an
   empty string as the first item in the slice
2. a path that ends with a separator has an empty string as the final component
   of the path

For example:

""         => []                 // empty
"/"        => ["", ""]           // absolute, prefix, IsRoot is true
"/abc"     => ["", "abc"]        // absolute, filepath
"abc"      => ["abc"]            // relative, filepath
"/abc/"    => ["", "abc", ""]    // absolute, prefix, IsRoot is false
"abc/"     => ["abc", ""]        // relative, prefix

T is defined as a type rather than using []string directly to avoid clients of this package misinterpreting the above rules and incorrectly manipulating the string slice.

Documentation

Constants

const (
	// AWSS3 is the scheme for Amazon Web Service's S3 object store.
	AWSS3 = "s3"
	// GoogleCloudStorage is the scheme for Google's Cloud Storage object store.
	GoogleCloudStorage = "GoogleCloudStorage"
	// UnixFileSystem is the scheme for unix like systems such as linux, macos etc.
	UnixFileSystem = "unix"
	// WindowsFileSystem is the scheme for msdos and windows filesystems.
	WindowsFileSystem = "windows"
)

Variables

This section is empty.

Functions

func HasPrefix

func HasPrefix(path, prefix []string) bool

HasPrefix returns true if path has the specified prefix.

func Host

func Host(path string) string

Host calls DefaultMatchers.Host(path).

func IsLocal

func IsLocal(path string) bool

IsLocal calls DefaultMatchers.IsLocal(path).

func Parameters

func Parameters(path string) map[string][]string

Parameters calls DefaultMatchers.Parameters(path).

func Path

func Path(path string) (string, rune)

Path calls DefaultMatchers.Path(path).

func Scheme

func Scheme(path string) string

Scheme calls DefaultMatchers.Scheme(path).

Example
package main

import (
	"fmt"

	"cloudeng.io/path/cloudpath"
)

func main() {
	for _, example := range []string{
		"s3://my-bucket/object",
		"https://storage.cloud.google.com/bucket/obj",
		"gs://my-bucket",
		`c:\root\file`,
	} {
		scheme := cloudpath.Scheme(example)
		local := cloudpath.IsLocal(example)
		host := cloudpath.Host(example)
		volume := cloudpath.Volume(example)
		path, sep := cloudpath.Path(example)
		parameters := cloudpath.Parameters(example)
		fmt.Printf("%v %q %q %q %q %c %v\n", local, scheme, host, volume, path, sep, parameters)
	}
}
Output:

false "s3" "" "my-bucket" "/my-bucket/object" / map[]
false "GoogleCloudStorage" "storage.cloud.google.com" "bucket" "/bucket/obj" / map[]
false "GoogleCloudStorage" "" "my-bucket" "/my-bucket" / map[]
true "windows" "localhost" "c" "c:\\root\\file" \ map[]

func Volume

func Volume(path string) string

Volume calls DefaultMatchers.Volume(path).

Types

type Match

type Match struct {
	// Scheme uniquely identifies the filesystem being used, eg. s3 or windows.
	Scheme string
	// Local is true for local filesystems.
	Local bool
	// Host will be 'localhost' for local filesystems, the host encoded
	// in a URL or otherwise empty if there is no notion of a host.
	Host string
	// Volume will be the bucket or file system share for systems that support
	// that concept, or an empty string otherwise.
	Volume string
	// Path is the filesystem path or filename to the data. It may be a prefix
	// on a cloud based system or a directory on a local one.
	Path string
	// Separator is the filesystem separator (e.g / or \ for windows).
	Separator rune
	// Parameters are any parameters encoded in a URL/URI based name.
	Parameters map[string][]string
}

Match is the result of a successful match.

func AWSS3Matcher

func AWSS3Matcher(p string) *Match

AWSS3Matcher implements Matcher for AWS S3 object names. It returns AWSS3 for its scheme result.

func GoogleCloudStorageMatcher

func GoogleCloudStorageMatcher(p string) *Match

GoogleCloudStorageMatcher implements Matcher for Google Cloud Storage object names. It returns GoogleCloudStorage for its scheme result.

func UnixMatcher

func UnixMatcher(p string) *Match

UnixMatcher implements Matcher for unix filenames. It returns UnixFileSystem for its scheme result.

func WindowsMatcher

func WindowsMatcher(p string) *Match

WindowsMatcher implements Matcher for Windows filenames. It returns WindowsFileSystem for its scheme result.

type Matcher

type Matcher func(p string) *Match

Matcher is the prototype for functions that parse the supplied path to determine if it matches a specific scheme and then break out the metadata encoded in the path. Matchers for local filesystems should return "localhost" for the host.
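
As a rough sketch of what an additional Matcher might look like, the standalone example below defines local copies of Match (with Parameters omitted) and Matcher so that it runs without this package. The examplefs scheme, the exampleFSMatcher function and the choice to treat the first path component as the volume are all assumptions made for illustration, not part of the package.

```go
package main

import (
	"fmt"
	"strings"
)

// Match is a local copy of the documented Match fields, minus Parameters.
type Match struct {
	Scheme    string
	Local     bool
	Host      string
	Volume    string
	Path      string
	Separator rune
}

// Matcher is a local copy of the documented function type.
type Matcher func(p string) *Match

// exampleFSMatcher is a hypothetical matcher for names of the form
// examplefs://host/volume/rest. It returns nil when p does not match.
func exampleFSMatcher(p string) *Match {
	const prefix = "examplefs://"
	if !strings.HasPrefix(p, prefix) {
		return nil
	}
	rest := strings.TrimPrefix(p, prefix)
	host, path := rest, ""
	if i := strings.IndexByte(rest, '/'); i >= 0 {
		host, path = rest[:i], rest[i:]
	}
	// Assumption: the first path component names the volume.
	volume := strings.TrimPrefix(path, "/")
	if i := strings.IndexByte(volume, '/'); i >= 0 {
		volume = volume[:i]
	}
	return &Match{
		Scheme:    "examplefs",
		Host:      host,
		Volume:    volume,
		Path:      path,
		Separator: '/',
	}
}

// Compile-time check that the function has the Matcher signature.
var _ Matcher = exampleFSMatcher

func main() {
	for _, p := range []string{"examplefs://node1/vol0/dir/file", "/local/file"} {
		if m := exampleFSMatcher(p); m != nil {
			fmt.Printf("%q: scheme=%q host=%q volume=%q path=%q\n",
				p, m.Scheme, m.Host, m.Volume, m.Path)
		} else {
			fmt.Printf("%q: no match\n", p)
		}
	}
}
```

Returning nil for names that do not belong to the scheme lets a caller fall through to the next matcher in a list.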

type MatcherSpec

type MatcherSpec []Matcher

MatcherSpec represents a set of Matchers that will be applied in order. The ordering is important: the most specific matchers need to be applied first. For example, a matcher for Windows should precede that for a Unix filesystem since the latter can accept filenames in Windows format.

DefaultMatchers represents the built in set of Matchers.
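
The effect of ordering can be seen in a small standalone sketch. It uses a simplified matcher type that returns only a scheme name, and the functions windowsLike, unixLike and firstMatch are hypothetical stand-ins rather than the package's own matchers.

```go
package main

import "fmt"

// matcher is a simplified stand-in for the documented Matcher type: it
// returns a scheme name on a match and "" otherwise.
type matcher func(p string) string

func isLetter(b byte) bool {
	return (b >= 'a' && b <= 'z') || (b >= 'A' && b <= 'Z')
}

// windowsLike matches names that start with a drive letter and a colon.
func windowsLike(p string) string {
	if len(p) >= 2 && isLetter(p[0]) && p[1] == ':' {
		return "windows"
	}
	return ""
}

// unixLike accepts any non-empty name, so it also accepts Windows names.
func unixLike(p string) string {
	if p == "" {
		return ""
	}
	return "unix"
}

// firstMatch applies the matchers in order and returns the first
// non-empty result.
func firstMatch(ms []matcher, p string) string {
	for _, m := range ms {
		if s := m(p); s != "" {
			return s
		}
	}
	return ""
}

func main() {
	name := `c:\root\file`
	// Specific matcher first: the name is recognised as windows.
	fmt.Println(firstMatch([]matcher{windowsLike, unixLike}, name))
	// General matcher first: it shadows the specific one.
	fmt.Println(firstMatch([]matcher{unixLike, windowsLike}, name))
}
```

With the general matcher first, the Windows name is reported as unix, which is the misclassification the ordering rule is meant to prevent.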

func (MatcherSpec) Host

func (ms MatcherSpec) Host(path string) string

Host returns the host component of the path if there is one.

func (*MatcherSpec) IsLocal

func (ms *MatcherSpec) IsLocal(path string) bool

IsLocal returns true if the path is for a local filesystem.

func (MatcherSpec) Match

func (ms MatcherSpec) Match(p string) *Match

Match applies all of the matchers in turn to match the supplied path.

func (*MatcherSpec) Parameters

func (ms *MatcherSpec) Parameters(path string) map[string][]string

Parameters returns the parameters in path, if any. If no parameters are present, an empty (rather than nil) map is returned.
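
For URL-style names, the same shape of result can be obtained from the standard library. The sketch below is not how this package is implemented; the parameters helper and the versionId and tag query keys are assumptions chosen for illustration.

```go
package main

import (
	"fmt"
	"net/url"
)

// parameters returns the query parameters encoded in a URL-style name.
// It always returns a non-nil map, matching the behavior described above.
func parameters(name string) map[string][]string {
	u, err := url.Parse(name)
	if err != nil {
		return map[string][]string{}
	}
	// url.Values has the underlying type map[string][]string, and
	// Query returns an empty, non-nil value when there is no query.
	return u.Query()
}

func main() {
	with := parameters("s3://my-bucket/object?versionId=abc&tag=a&tag=b")
	fmt.Println(with["versionId"], with["tag"])

	without := parameters("s3://my-bucket/object")
	fmt.Println(without != nil, len(without))
}
```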

func (MatcherSpec) Path

func (ms MatcherSpec) Path(path string) (string, rune)

Path returns the path component of path and the separator to use for it.

func (MatcherSpec) Scheme

func (ms MatcherSpec) Scheme(path string) string

Scheme returns the portion of the path that precedes a leading '//' or "" otherwise.

func (MatcherSpec) Volume

func (ms MatcherSpec) Volume(path string) string

Volume returns the filesystem specific volume, if any, encoded in the path.

type T added in v0.0.3

type T []string

T represents a cloudpath. Instances of T are created from native storage system paths and/or URLs and are designed to retain the following information.

  1. the path was absolute vs relative.
  2. the path was a prefix or a filepath.
  3. a path of zero length is represented as a nil slice and not an empty slice.

Redundant information is discarded:

  1. multiple consecutive instances of separator are treated as a single separator.

The resulting format is as follows:

  1. an absolute path, ie. one that starts with a separator, has an empty string as the first item in the slice
  2. a path that ends with a separator has an empty string as the final component of the path

For example:

""         => []                 // empty
"/"        => ["", ""]           // absolute, prefix, IsRoot is true
"/abc"     => ["", "abc"]        // absolute, filepath
"abc"      => ["abc"]            // relative, filepath
"/abc/"    => ["", "abc", ""]    // absolute, prefix, IsRoot is false
"abc/"     => ["abc", ""]        // relative, prefix

T is defined as a type rather than using []string directly to avoid clients of this package misinterpreting the above rules and incorrectly manipulating the string slice.
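
The representation rules can be reproduced with a short standalone sketch. The split function below is a hypothetical approximation written against the table above, not the package's Split; it returns a plain []string rather than T.

```go
package main

import (
	"fmt"
	"strings"
)

// split breaks path into components following the rules listed above:
// an empty path yields nil, a leading or trailing separator yields a
// leading or trailing empty string, and repeated separators collapse.
func split(path string, sep rune) []string {
	if path == "" {
		return nil
	}
	parts := strings.Split(path, string(sep))
	out := make([]string, 0, len(parts))
	for i, p := range parts {
		// Empty components in the middle come from repeated separators
		// and are dropped; the first and last positions are kept since
		// they record absolute and prefix status.
		if p == "" && i != 0 && i != len(parts)-1 {
			continue
		}
		out = append(out, p)
	}
	return out
}

func main() {
	for _, p := range []string{"", "/", "/abc", "abc", "/abc/", "abc/", "//abc//"} {
		fmt.Printf("%-10q => %q\n", p, split(p, '/'))
	}
}
```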

func LongestCommonPrefix

func LongestCommonPrefix(paths []T) T

LongestCommonPrefix returns the longest prefix common to the specified cloudpaths.
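
One straightforward way to compute such a prefix over component slices is sketched below. It is a standalone illustration using [][]string and a hypothetical longestCommonPrefix function; how the package itself treats the trailing prefix marker is not shown here.

```go
package main

import "fmt"

// longestCommonPrefix returns the leading components shared by all of
// the supplied paths. The result aliases the first input slice.
func longestCommonPrefix(paths [][]string) []string {
	if len(paths) == 0 {
		return nil
	}
	prefix := paths[0]
	for _, p := range paths[1:] {
		n := 0
		for n < len(prefix) && n < len(p) && prefix[n] == p[n] {
			n++
		}
		prefix = prefix[:n]
	}
	return prefix
}

func main() {
	paths := [][]string{
		{"", "bucket", "2012-11-27", "shard-0000.json"},
		{"", "bucket", "2012-11-27", "shard-0001.json"},
		{"", "bucket", "2012-11-28", "shard-0000.json"},
	}
	// The shared components are the leading "" (absolute) and "bucket".
	fmt.Printf("%q\n", longestCommonPrefix(paths))
}
```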

func LongestCommonSuffix

func LongestCommonSuffix(paths []T) T

LongestCommonSuffix returns the longest suffix common to the specified cloudpaths.

func Split

func Split(path string, separator rune) T

Split slices path into an instance of T.

func SplitPath

func SplitPath(path string) T

SplitPath calls Split with the results of cloudpath.Path(path).

func TrimPrefix

func TrimPrefix(path, prefix []string) T

TrimPrefix removes the specified prefix from path. It returns nil if path and prefix are identical.

func (T) AsFilepath added in v0.0.3

func (path T) AsFilepath() T

AsFilepath returns path as a filepath if it is not already one, provided that it is not a root or empty.

func (T) AsPrefix added in v0.0.3

func (path T) AsPrefix() T

AsPrefix returns path as a path prefix if it is not already one.

func (T) Base added in v0.0.3

func (path T) Base() string

Base returns the 'base', or 'filename' component of path, ie. the last one. If the path is a prefix then an empty string is returned.

func (T) HasSuffix added in v0.0.3

func (path T) HasSuffix(suffix T) bool

HasSuffix returns true if path has the specified suffix.

func (T) IsAbsolute added in v0.0.3

func (path T) IsAbsolute() bool

IsAbsolute returns true if the components were derived from an absolute path.

func (T) IsFilepath added in v0.0.3

func (path T) IsFilepath() bool

IsFilepath returns true if the path was derived from a filepath.

func (T) IsRoot added in v0.0.3

func (path T) IsRoot() bool

IsRoot returns true if the path was derived from the 'root', ie. a single separator such as /.
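
Given the slice representation documented for T, predicates such as IsAbsolute, IsFilepath, IsRoot and Base can be read directly off the first and last components. The following standalone sketch uses a hypothetical local type named path with lower-case methods; it is inferred from the representation rules and is not the package's implementation.

```go
package main

import "fmt"

// path is a local stand-in for the documented T type.
type path []string

// isAbsolute: a leading empty string marks a leading separator.
func (p path) isAbsolute() bool {
	return len(p) > 0 && p[0] == ""
}

// isFilepath: a non-empty final component means no trailing separator.
func (p path) isFilepath() bool {
	return len(p) > 0 && p[len(p)-1] != ""
}

// isRoot: exactly a leading and a trailing empty string, ie. "/".
func (p path) isRoot() bool {
	return len(p) == 2 && p[0] == "" && p[1] == ""
}

// base returns the last component, which is empty for a prefix.
func (p path) base() string {
	if len(p) == 0 {
		return ""
	}
	return p[len(p)-1]
}

func main() {
	for _, p := range []path{
		nil,
		{"", ""},
		{"", "abc"},
		{"abc"},
		{"", "abc", ""},
		{"abc", ""},
	} {
		fmt.Printf("%q absolute=%v filepath=%v root=%v base=%q\n",
			[]string(p), p.isAbsolute(), p.isFilepath(), p.isRoot(), p.base())
	}
}
```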

func (T) Join added in v0.0.3

func (path T) Join(separator rune) string

Join creates a string path from the components of path using the supplied separator. It follows the rules specified for T. It is the inverse of Split, that is, newPath == origPath for:

newPath = Split(origPath, sep).Join(sep)
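
Because leading and trailing separators are recorded as empty strings, joining the components with the separator is enough to recover them. The standalone sketch below uses a hypothetical join helper over []string to show this; it assumes the components are already in the canonical form described for T.

```go
package main

import (
	"fmt"
	"strings"
)

// join concatenates the components with sep. Leading and trailing empty
// components turn back into leading and trailing separators.
func join(components []string, sep rune) string {
	return strings.Join(components, string(sep))
}

func main() {
	for _, c := range [][]string{
		nil,
		{"", ""},
		{"", "abc"},
		{"abc", ""},
		{"", "a", "b", ""},
	} {
		fmt.Printf("%q => %q\n", c, join(c, '/'))
	}
	// A different separator can be used for the same components.
	fmt.Println(join([]string{"c:", "root", "file"}, '\\'))
}
```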

func (T) Pop added in v0.0.3

func (path T) Pop() (T, string)

Pop returns a new cloudpath.T with the trailing component removed and returned. Pop on a path for which IsRoot is true will return the root again. IsFilepath will always be false for the returned cloudpath.T.

func (T) Prefix added in v0.0.3

func (path T) Prefix() T

Prefix returns the prefix component of a path.

Example
package main

import (
	"fmt"

	"cloudeng.io/path/cloudpath"
)

func main() {
	date := cloudpath.Split("2012-11-27", '/').AsPrefix()
	for _, fullname := range []string{
		"s3://my-bucket/2012-11-27/shard-0000-of-0001.json",
		"/my-local-copy/2012-11-27/shard-0000-of-0001.json",
		"https://storage.cloud.google.com/google-copy/2012-11-27/shard-0001-of-0001.json",
	} {
		components := cloudpath.SplitPath(fullname)
		fmt.Printf("%v\n", components.Prefix().HasSuffix(date))

	}
}
Output:

true
true
true

func (T) Push added in v0.0.3

func (path T) Push(p string) T

Push returns a new cloudpath.T with the supplied component appended. IsFilepath will always be true for the returned value unless p is an empty string, in which case Push is equivalent to path.AsFilepath().

func (T) String added in v0.0.3

func (path T) String() string

String implements fmt.Stringer. It calls path.Join with / as the separator.

func (T) TrimSuffix added in v0.0.3

func (path T) TrimSuffix(suffix T) T

TrimSuffix removes the specified suffix from path. It returns nil if path and suffix are identical.
