Documentation ¶
Overview ¶
Package cloudpath provides utility routines for working with paths across both local and distributed storage systems. The set of schemes supported can be extended by providing additional implementations of the Matcher function. A cloudpath encodes two types of information:
- the path name itself which can be used to access the data it names.
- metadata about where that filename is hosted.
For example, s3://my-bucket/a/b contains the path '/my-bucket/a/b' as well as the indication that this path is hosted on S3. Most cloud storage systems either use URI formats natively or support their use. Both AWS S3 and Google Cloud Storage support URLs, eg. storage.cloud.google.com/bucket/obj.
cloudpath provides operations for extracting both the metadata and the path component, and operations for working with the extracted path directly. A common usage is to determine the 'scheme' (eg. s3, windows, unix etc) of a filename and then to operate on it appropriately. cloudpath represents a 'path' as a slice of strings to simplify frequently performed operations, such as finding common prefixes and suffixes, in a way that is aware of the structure of the path. For example, it should be possible to easily determine that s3://bucket/a/b is a prefix of https://s3.us-west-2.amazonaws.com/bucket/a/b/c.
All of the metadata for a path is represented using the Match type.
For manipulation, the path is converted to a cloudpath.T.
Index ¶
- Constants
- func Base(scheme string, separator byte, path string) string
- func HasPrefix(path, prefix []string) bool
- func Host(path string) string
- func IsLocal(path string) bool
- func Join(sep byte, components []string) string
- func Key(path string) (string, rune)
- func Parameters(path string) map[string][]string
- func Path(path string) (string, rune)
- func Prefix(scheme string, separator byte, path string) string
- func Region(url string) string
- func Scheme(path string) string
- func Volume(path string) string
- type Match
- type Matcher
- type MatcherSpec
- func (ms MatcherSpec) Host(path string) string
- func (ms *MatcherSpec) IsLocal(path string) bool
- func (ms MatcherSpec) Key(path string) (string, rune)
- func (ms MatcherSpec) Match(p string) Match
- func (ms *MatcherSpec) Parameters(path string) map[string][]string
- func (ms MatcherSpec) Path(path string) (string, rune)
- func (ms MatcherSpec) Region(url string) string
- func (ms MatcherSpec) Scheme(path string) string
- func (ms MatcherSpec) Volume(path string) string
- type T
- func (path T) AsFilepath() T
- func (path T) AsPrefix() T
- func (path T) Base() string
- func (path T) HasSuffix(suffix T) bool
- func (path T) IsAbsolute() bool
- func (path T) IsFilepath() bool
- func (path T) IsRoot() bool
- func (path T) Join(separator rune) string
- func (path T) Pop() (T, string)
- func (path T) Prefix() T
- func (path T) Push(p string) T
- func (path T) String() string
- func (path T) TrimSuffix(suffix T) T
Constants ¶
const (
	// AWSS3 is the scheme for Amazon Web Service's S3 object store.
	AWSS3 = "s3"
	// GoogleCloudStorage is the scheme for Google's Cloud Storage object store.
	GoogleCloudStorage = "gs"
	// UnixFileSystem is the scheme for unix like systems such as linux, macos etc.
	UnixFileSystem = "unix"
	// WindowsFileSystem is the scheme for msdos and windows filesystems.
	WindowsFileSystem = "windows"
	// HTTP is the scheme for http.
	HTTP = "http"
	// HTTPS is the scheme for https.
	HTTPS = "https"
)
Variables ¶
This section is empty.
Functions ¶
func Base ¶
Base is like path.Base but for cloud storage paths which may include a scheme (eg. s3://). It does not support URI host names, parameters etc. In particular:
- the scheme parameter should include the trailing :// or be the empty string.
- a trailing separator means that the path is a prefix with an empty base and hence Base returns "".
- the returned basename never includes the supplied scheme.
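The rules above can be illustrated with a minimal sketch that uses only the standard library. Note that baseSketch is a hypothetical stand-in written from the documented rules; it is not the package's implementation and may differ from cloudpath.Base on inputs the rules do not cover.

```go
package main

import (
	"fmt"
	"strings"
)

// baseSketch is a hypothetical stand-in for cloudpath.Base, derived only
// from the documented rules; it is not the package's implementation.
func baseSketch(scheme string, separator byte, path string) string {
	// Drop the scheme (eg. "s3://") so that it never appears in the result.
	path = strings.TrimPrefix(path, scheme)
	// Everything after the last separator is the base, so a trailing
	// separator yields the empty string.
	if idx := strings.LastIndexByte(path, separator); idx >= 0 {
		return path[idx+1:]
	}
	return path
}

func main() {
	fmt.Printf("%q\n", baseSketch("s3://", '/', "s3://my-bucket/a/b")) // "b"
	fmt.Printf("%q\n", baseSketch("s3://", '/', "s3://my-bucket/a/"))  // ""
	fmt.Printf("%q\n", baseSketch("", '/', "a/b/c"))                   // "c"
}
```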
func Join ¶
Join joins the supplied components using the supplied separator, with behaviour appropriate for cloud storage paths, which do not elide multiple contiguous separators. It behaves as follows:
- empty components are ignored.
- trailing instances of sep are preserved.
- separators are added only when not already present as a trailing character in the previous component and leading character in the next component.
- a leading separator is ignored/removed if the previous component ended with a separator and the next component starts with a separator.
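One possible reading of these rules is sketched below using only builtins. joinSketch is a hypothetical name and the code is an interpretation of the list above, not the package's implementation; in particular it assumes a separator is inserted only when neither side of the join already supplies one.

```go
package main

import "fmt"

// joinSketch is a hypothetical interpretation of the documented Join rules;
// it is not the package's implementation.
func joinSketch(sep byte, components []string) string {
	var out []byte
	for _, c := range components {
		if c == "" {
			continue // empty components are ignored
		}
		if len(out) > 0 {
			prevEnds := out[len(out)-1] == sep
			nextStarts := c[0] == sep
			switch {
			case prevEnds && nextStarts:
				c = c[1:] // drop the leading separator to avoid doubling it
			case !prevEnds && !nextStarts:
				out = append(out, sep) // neither side supplies a separator
			}
		}
		// Separators inside or at the end of a component are kept as-is.
		out = append(out, c...)
	}
	return string(out)
}

func main() {
	fmt.Println(joinSketch('/', []string{"a", "b"}))       // a/b
	fmt.Println(joinSketch('/', []string{"a/", "/b"}))     // a/b
	fmt.Println(joinSketch('/', []string{"a", "", "b/"}))  // a/b/
	fmt.Println(joinSketch('/', []string{"a//b", "c"}))    // a//b/c
}
```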
func Parameters ¶
Parameters calls DefaultMatchers.Parameters(path).
func Prefix ¶
Prefix is like path.Dir but for cloud storage paths which may include a scheme (eg. s3://). It does not support URI host names, parameters etc. In particular:
- the scheme parameter should include the trailing :// or be the empty string.
- the returned prefix never includes the supplied scheme.
- the returned prefix never includes a trailing separator.
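As a rough illustration of these rules, the following standard-library sketch strips the scheme and the final component. prefixSketch is a hypothetical name; the code is derived from the list above rather than from the package source, and its result for paths without any separator (it returns "") is an assumption.

```go
package main

import (
	"fmt"
	"strings"
)

// prefixSketch is a hypothetical stand-in for cloudpath.Prefix, derived only
// from the documented rules; it is not the package's implementation.
func prefixSketch(scheme string, separator byte, path string) string {
	// Drop the scheme so that it never appears in the result.
	path = strings.TrimPrefix(path, scheme)
	idx := strings.LastIndexByte(path, separator)
	if idx < 0 {
		return "" // assumption: no separator means there is no prefix
	}
	// Keep everything before the last separator and remove any
	// separators left at the end.
	return strings.TrimRight(path[:idx], string(separator))
}

func main() {
	fmt.Printf("%q\n", prefixSketch("s3://", '/', "s3://my-bucket/a/b")) // "my-bucket/a"
	fmt.Printf("%q\n", prefixSketch("s3://", '/', "s3://my-bucket/a/"))  // "my-bucket/a"
	fmt.Printf("%q\n", prefixSketch("", '/', "a/b/c"))                   // "a/b"
}
```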
func Scheme ¶
Scheme calls DefaultMatchers.Scheme(path).
Example ¶
package main

import (
	"fmt"

	"cloudeng.io/path/cloudpath"
)

func main() {
	for _, example := range []string{
		"s3://my-bucket/object",
		"https://storage.cloud.google.com/bucket/obj",
		"gs://my-bucket",
		`c:\root\file`,
	} {
		scheme := cloudpath.Scheme(example)
		local := cloudpath.IsLocal(example)
		host := cloudpath.Host(example)
		volume := cloudpath.Volume(example)
		path, sep := cloudpath.Path(example)
		key, _ := cloudpath.Key(example)
		region := cloudpath.Region(example)
		parameters := cloudpath.Parameters(example)
		fmt.Printf("%v %q %q %q %q %q %q %c %v\n", local, scheme, host, region, volume, path, key, sep, parameters)
	}
}
Output:
false "s3" "" "" "my-bucket" "my-bucket/object" "object" / map[]
false "gs" "storage.cloud.google.com" "" "bucket" "/bucket/obj" "obj" / map[]
false "gs" "" "" "my-bucket" "my-bucket" "" / map[]
true "windows" "" "" "c" "c:\\root\\file" "\\root\\file" \ map[]
Types ¶
type Match ¶
type Match struct {
	// Matched is the original string that was matched.
	Matched string
	// Scheme uniquely identifies the filesystem being used, eg. s3 or windows.
	Scheme string
	// Local is true for local filesystems.
	Local bool
	// Host will be 'localhost' for local filesystems, the host encoded
	// in a URL or otherwise empty if there is no notion of a host.
	Host string
	// Volume will be the bucket or file system share for systems that support
	// that concept, or an empty string otherwise.
	Volume string
	// Path is the filesystem path or filename to the data. It may be a prefix
	// on a cloud based system or a directory on a local one.
	Path string
	// Key is like Path except without the volume for systems where the volume
	// can appear in the path name.
	Key string
	// Region is the region for cloud based systems.
	Region string
	// Separator is the filesystem separator (e.g / or \ for windows).
	Separator rune
	// Parameters are any parameters encoded in a URL/URI based name.
	Parameters map[string][]string
}
Match is the result of a successful match.
func AWSS3Matcher ¶
AWSS3Matcher implements Matcher for AWS S3 object names assuming '/' as the separator. It returns AWSS3 for its scheme result.
func AWSS3MatcherSep ¶ added in v0.0.9
func GoogleCloudStorageMatcher ¶
GoogleCloudStorageMatcher implements Matcher for Google Cloud Storage object names. It returns GoogleCloudStorage for its scheme result.
func URLMatcher ¶ added in v0.0.8
URLMatcher implements Matcher for http and https paths.
func UnixMatcher ¶
UnixMatcher implements Matcher for unix filenames. It returns UnixFileSystem for its scheme result. It will match on file://[HOST]/[PATH].
func WindowsMatcher ¶
WindowsMatcher implements Matcher for Windows filenames. It returns WindowsFileSystem for its scheme result.
type Matcher ¶
Matcher is the prototype for functions that parse the supplied path to determine if it matches a specific scheme and then break out the metadata encoded in the path. If Match.Matched is empty then no match has been found. Matchers for local filesystems should return "" for the host.
type MatcherSpec ¶
type MatcherSpec []Matcher
MatcherSpec represents a set of Matchers that will be applied in order. The ordering is important: the most specific matchers need to be applied first. For example, a matcher for Windows should precede that for a Unix filesystem since the latter can accept filenames in Windows format.
var DefaultMatchers MatcherSpec = []Matcher{
	AWSS3Matcher,
	GoogleCloudStorageMatcher,
	URLMatcher,
	WindowsMatcher,
	UnixMatcher,
}
DefaultMatchers represents the built in set of Matchers.
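Why ordering matters can be seen in a small standalone sketch of a first-match-wins chain. The types and functions below (match, matcher, matcherSpec, s3Only, catchAll) are hypothetical simplifications for illustration only; they do not reproduce the package's Matcher signature or its built-in matchers.

```go
package main

import (
	"fmt"
	"strings"
)

// match is a hypothetical, reduced version of a match result.
type match struct {
	Matched string
	Scheme  string
}

// matcher and matcherSpec are hypothetical stand-ins for an ordered
// set of matchers; an empty Matched field means "no match".
type matcher func(p string) match
type matcherSpec []matcher

// match applies the matchers in order and returns the first hit.
func (ms matcherSpec) match(p string) match {
	for _, m := range ms {
		if r := m(p); r.Matched != "" {
			return r
		}
	}
	return match{}
}

// s3Only is a specific matcher: it accepts only s3:// names.
func s3Only(p string) match {
	if strings.HasPrefix(p, "s3://") {
		return match{Matched: p, Scheme: "s3"}
	}
	return match{}
}

// catchAll is a permissive matcher: it accepts any non-empty name.
func catchAll(p string) match {
	return match{Matched: p, Scheme: "unix"}
}

func main() {
	specificFirst := matcherSpec{s3Only, catchAll}
	permissiveFirst := matcherSpec{catchAll, s3Only}
	fmt.Println(specificFirst.match("s3://bucket/obj").Scheme)   // s3
	fmt.Println(permissiveFirst.match("s3://bucket/obj").Scheme) // unix
}
```

With the permissive matcher first, the more specific one is never consulted, which is the situation the ordering rule is meant to avoid.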
func (MatcherSpec) Host ¶
func (ms MatcherSpec) Host(path string) string
Host returns the host component of the path if there is one.
func (*MatcherSpec) IsLocal ¶
func (ms *MatcherSpec) IsLocal(path string) bool
IsLocal returns true if the path is for a local filesystem.
func (MatcherSpec) Key ¶ added in v0.0.5
func (ms MatcherSpec) Key(path string) (string, rune)
Key returns the key component of path and the separator to use for it.
func (MatcherSpec) Match ¶
func (ms MatcherSpec) Match(p string) Match
Match applies all of the matchers in turn to match the supplied path.
func (*MatcherSpec) Parameters ¶
func (ms *MatcherSpec) Parameters(path string) map[string][]string
Parameters returns the parameters in path, if any. If no parameters are present, an empty (rather than nil) map is returned.
func (MatcherSpec) Path ¶
func (ms MatcherSpec) Path(path string) (string, rune)
Path returns the path component of path and the separator to use for it.
func (MatcherSpec) Region ¶ added in v0.0.5
func (ms MatcherSpec) Region(url string) string
Region returns the region component for cloud based systems.
func (MatcherSpec) Scheme ¶
func (ms MatcherSpec) Scheme(path string) string
Scheme returns the portion of the path that precedes a leading '//' or "" otherwise.
func (MatcherSpec) Volume ¶
func (ms MatcherSpec) Volume(path string) string
Volume returns the filesystem specific volume, if any, encoded in the path.
type T ¶ added in v0.0.3
type T []string
T represents a cloudpath. Instances of T are created from native storage system paths and/or URLs and are designed to retain the following information:
- whether the path was absolute or relative.
- whether the path was a prefix or a filepath.
- a path of zero length is represented as a nil slice and not an empty slice.
Redundant information is discarded:
- multiple consecutive instances of separator are treated as a single separator.
The resulting format is as follows:
- an absolute path, ie. one that starts with a separator, has an empty string as the first item in the slice
- a path that ends with a separator has an empty string as the final component of the path
For example:
""      => []              // empty
"/"     => ["", ""]        // absolute, prefix, IsRoot is true
"/abc"  => ["", "abc"]     // absolute, filepath
"abc"   => ["abc"]         // relative, filepath
"/abc/" => ["", "abc", ""] // absolute, prefix, IsRoot is false
"abc/"  => ["abc", ""]     // relative, prefix
T is defined as a type rather than using []string directly to avoid clients of this package misinterpreting the above rules and incorrectly manipulating the string slice.
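The table above can be reproduced by a short standard-library sketch. splitSketch is a hypothetical function that returns a plain []string; it only mirrors the documented format and is not the package's Split implementation.

```go
package main

import (
	"fmt"
	"strings"
)

// splitSketch is a hypothetical illustration of the documented format;
// it is not the package's implementation.
func splitSketch(path string, sep rune) []string {
	if path == "" {
		return nil // a zero length path is a nil slice
	}
	var out []string
	if strings.HasPrefix(path, string(sep)) {
		out = append(out, "") // leading empty string: absolute path
	}
	// FieldsFunc drops empty fields, so runs of consecutive separators
	// are treated as a single separator.
	out = append(out, strings.FieldsFunc(path, func(r rune) bool { return r == sep })...)
	if strings.HasSuffix(path, string(sep)) {
		out = append(out, "") // trailing empty string: prefix
	}
	return out
}

func main() {
	for _, p := range []string{"", "/", "/abc", "abc", "/abc/", "abc/", "a//b"} {
		fmt.Printf("%q => %q\n", p, splitSketch(p, '/'))
	}
}
```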
func LongestCommonPrefix ¶
LongestCommonPrefix returns the longest prefix common to the specified cloudpaths.
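The benefit of comparing components rather than raw strings can be sketched as follows. longestCommonPrefixSketch is a hypothetical function operating on plain []string values; its signature is an assumption and it is not the package's implementation.

```go
package main

import "fmt"

// longestCommonPrefixSketch is a hypothetical, component-wise common
// prefix over plain string slices; it is not the package's implementation.
func longestCommonPrefixSketch(paths ...[]string) []string {
	if len(paths) == 0 {
		return nil
	}
	prefix := paths[0]
	for _, p := range paths[1:] {
		n := 0
		// Advance while whole components are equal.
		for n < len(prefix) && n < len(p) && prefix[n] == p[n] {
			n++
		}
		prefix = prefix[:n]
	}
	return prefix
}

func main() {
	// Components for "/a/bc" and "/a/bd": a character-based comparison
	// would stop at "/a/b", whereas a component-based one stops at "/a".
	a := []string{"", "a", "bc"}
	b := []string{"", "a", "bd"}
	fmt.Printf("%q\n", longestCommonPrefixSketch(a, b)) // ["" "a"]
}
```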
func LongestCommonSuffix ¶
LongestCommonSuffix returns the longest suffix common to the specified cloudpaths.
func TrimPrefix ¶
TrimPrefix removes the specified prefix from path. It returns nil if path and prefix are identical.
func (T) AsFilepath ¶ added in v0.0.3
AsFilepath returns path as a filepath if it is not already one, provided that it is not a root or empty.
func (T) AsPrefix ¶ added in v0.0.3
AsPrefix returns path as a path prefix if it is not already one.
func (T) Base ¶ added in v0.0.3
Base returns the 'base', or 'filename' component of path, ie. the last one. If the path is a prefix then an empty string is returned.
func (T) IsAbsolute ¶ added in v0.0.3
IsAbsolute returns true if the components were derived from an absolute path.
func (T) IsFilepath ¶ added in v0.0.3
IsFilepath returns true if the path was derived from a filepath.
func (T) IsRoot ¶ added in v0.0.3
IsRoot returns true if the path was derived from the 'root', ie. a single separator such as /.
func (T) Join ¶ added in v0.0.3
Join creates a string path from the components of path using the supplied separator. It follows the rules specified for the package-level Join function. It is the inverse of Split, that is, newPath == origPath for:
newPath = Split(origPath, sep).Join(sep)
func (T) Pop ¶ added in v0.0.3
Pop returns a new cloudpath.T with the trailing component removed and returned. Pop on a path for which IsRoot is true will return the root again. IsFilepath will always be false for the returned cloudpath.T.
func (T) Prefix ¶ added in v0.0.3
Prefix returns the prefix component of a path.
Example ¶
package main

import (
	"fmt"

	"cloudeng.io/path/cloudpath"
)

func main() {
	date := cloudpath.Split("2012-11-27", '/').AsPrefix()
	for _, fullname := range []string{
		"s3://my-bucket/2012-11-27/shard-0000-of-0001.json",
		"/my-local-copy/2012-11-27/shard-0000-of-0001.json",
		"https://storage.cloud.google.com/google-copy/2012-11-27/shard-0001-of-0001.json",
	} {
		components := cloudpath.SplitPath(fullname)
		fmt.Printf("%v\n", components.Prefix().HasSuffix(date))
	}
}
Output:
true
true
true
func (T) Push ¶ added in v0.0.3
Push returns a new cloudpath.T with the supplied component appended. IsFilepath will always be true for the returned value unless p is an empty string, in which case Push is equivalent to path.AsFilepath().
func (T) String ¶ added in v0.0.3
String implements stringer. It calls path.Join with / as the separator.
func (T) TrimSuffix ¶ added in v0.0.3
TrimSuffix removes the specified suffix from path. It returns nil if path and suffix are identical.