Documentation ¶
Index ¶
- Variables
- func Crawl(ctx context.Context, root Directory, worker EnumerateOneDirFunc, ...) <-chan CrawlResult
- func CrawlLocalDirectory(ctx context.Context, root string, parallelism int, reader DirReader) <-chan CrawlResult
- func Transform(ctx context.Context, input <-chan CrawlResult, worker TransformFunc, ...) <-chan TransformResult
- func Walk(appCtx context.Context, root string, parallelism int, parallelStat bool, ...)
- type CrawlResult
- type DirReader
- type Directory
- type DirectoryEntry
- type EnumerateOneDirFunc
- type FileSystemEntry
- type InputObject
- type OutputObject
- type TransformFunc
- type TransformResult
Constants ¶
This section is empty.
Variables ¶
var ReaddirTimeoutError = errors.New("readdir timed out getting file properties")
Functions ¶
func Crawl ¶
func Crawl(ctx context.Context, root Directory, worker EnumerateOneDirFunc, parallelism int) <-chan CrawlResult
Crawl crawls an abstract directory tree, using the supplied enumeration function. It may be used for anything that function can enumerate (i.e. not necessarily a local file system, just anything tree-structured).
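To make the "anything tree-structured" point concrete, here is a minimal sketch of a parallel crawl over an in-memory tree. It is not the package's implementation: the types are local stand-ins that mirror the documented shapes, and crawlSketch, demoTree, enumerateDemo and collect are hypothetical names. The context parameter is left out for brevity.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// Local stand-ins mirroring the documented shapes; not the package's own types.
type Directory interface{}
type DirectoryEntry interface{}
type EnumerateOneDirFunc func(dir Directory, enqueueDir func(Directory), enqueueOutput func(DirectoryEntry, error)) error

type result struct {
	item DirectoryEntry
	err  error
}

// crawlSketch enumerates a tree with at most `parallelism` directories being
// enumerated at the same time. Simplified illustration only.
func crawlSketch(root Directory, worker EnumerateOneDirFunc, parallelism int) <-chan result {
	out := make(chan result)
	sem := make(chan struct{}, parallelism) // limits concurrent enumerations
	var wg sync.WaitGroup                   // counts directories not yet finished

	var visit func(dir Directory)
	visit = func(dir Directory) {
		defer wg.Done()
		sem <- struct{}{}
		err := worker(dir,
			func(sub Directory) { wg.Add(1); go visit(sub) },
			func(e DirectoryEntry, err error) { out <- result{e, err} })
		<-sem
		if err != nil {
			out <- result{nil, err}
		}
	}
	wg.Add(1)
	go visit(root)
	go func() { wg.Wait(); close(out) }() // close once every directory is done
	return out
}

// demoTree is a hypothetical in-memory tree: keys are directories, values
// are their children. A child that is also a key is itself a directory.
var demoTree = map[string][]string{
	"root":   {"root/a", "root/b", "root/x.txt"},
	"root/a": {"root/a/y.txt"},
	"root/b": {"root/b/z.txt"},
}

// enumerateDemo only reads demoTree, so concurrent calls are safe.
func enumerateDemo(dir Directory, enqueueDir func(Directory), enqueueOutput func(DirectoryEntry, error)) error {
	children, ok := demoTree[dir.(string)]
	if !ok {
		return fmt.Errorf("not a directory: %v", dir)
	}
	for _, child := range children {
		if _, isDir := demoTree[child]; isDir {
			enqueueDir(child)
		}
		enqueueOutput(child, nil)
	}
	return nil
}

// collect drains the crawl and returns the entry names in sorted order,
// because the arrival order depends on goroutine scheduling.
func collect(parallelism int) []string {
	var names []string
	for r := range crawlSketch("root", enumerateDemo, parallelism) {
		if r.err == nil {
			names = append(names, r.item.(string))
		}
	}
	sort.Strings(names)
	return names
}

func main() {
	fmt.Println(collect(4))
}
```

The consumer must keep draining the returned channel until it is closed, since the sketch uses an unbuffered output channel.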
func CrawlLocalDirectory ¶
func CrawlLocalDirectory(ctx context.Context, root string, parallelism int, reader DirReader) <-chan CrawlResult
CrawlLocalDirectory specializes parallel.Crawl to work specifically on a local directory. It does not follow symlinks. The items in the CrawlResult output channel are FileSystemEntry values. For a wrapper that makes this look more like filepath.Walk, see parallel.Walk.
func Transform ¶
func Transform(ctx context.Context, input <-chan CrawlResult, worker TransformFunc, parallelism int) <-chan TransformResult
Transformation stops when the input channel is closed.
func Walk ¶
func Walk(appCtx context.Context, root string, parallelism int, parallelStat bool, walkFn filepath.WalkFunc)
Walk is similar to filepath.Walk, but note the following differences in how the WalkFunc is used:
- If the error passed to walkFn is not nil, then the path passed to that function will usually be "" (whereas with filepath.Walk it will usually (always?) have a value).
- If the return value of walkFn is not nil, enumeration will always stop, no matter what the type of the error. (Unlike filepath.Walk, where returning filepath.SkipDir is handled as a special case.)
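One way to cope with both differences is to write the walk function so that it never reads the path when an error is passed in, and filters unwanted subtrees itself instead of returning filepath.SkipDir. The sketch below shows that style. It is driven by the standard library's filepath.Walk so that it is self-contained; listFiles and demo are hypothetical helpers, and appending to files without a lock assumes the walk function is not called concurrently, which the text above does not state.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"sort"
	"strings"
)

// listFiles returns the base names of all files under root, ignoring
// everything below root/skipDir.
func listFiles(root, skipDir string) ([]string, error) {
	skipPrefix := filepath.Join(root, skipDir) + string(filepath.Separator)
	var files []string

	walkFn := func(path string, info os.FileInfo, walkErr error) error {
		if walkErr != nil {
			// Do not rely on path here: per the notes above it may be "".
			fmt.Fprintln(os.Stderr, "walk error:", walkErr)
			return nil // a non-nil return would end the whole enumeration
		}
		if strings.HasPrefix(path, skipPrefix) {
			return nil // filter by prefix instead of returning filepath.SkipDir
		}
		if !info.IsDir() {
			files = append(files, filepath.Base(path))
		}
		return nil
	}

	err := filepath.Walk(root, walkFn)
	sort.Strings(files)
	return files, err
}

// demo builds a small temporary tree and lists it, skipping "skip".
func demo() ([]string, error) {
	root, err := os.MkdirTemp("", "walkdemo")
	if err != nil {
		return nil, err
	}
	defer os.RemoveAll(root)
	for _, rel := range []string{"keep/a.txt", "skip/b.txt"} {
		p := filepath.Join(root, filepath.FromSlash(rel))
		if err := os.MkdirAll(filepath.Dir(p), 0o755); err != nil {
			return nil, err
		}
		if err := os.WriteFile(p, nil, 0o644); err != nil {
			return nil, err
		}
	}
	return listFiles(root, "skip")
}

func main() {
	files, err := demo()
	fmt.Println(files, err)
}
```

The trade-off is that the skipped subtree is still enumerated and only its entries are ignored, so this suits cases where correctness matters more than avoiding the extra traversal.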
Types ¶
type CrawlResult ¶
type CrawlResult struct {
// contains filtered or unexported fields
}
func (CrawlResult) Item ¶
func (r CrawlResult) Item() (interface{}, error)
type DirReader ¶
func NewDirReader ¶
NewDirReader makes a directory reader. If parallelStat is true, then the reader uses a pool of goroutines to do the lookups from the name of a directory entry to the full os.FileInfo. This is useful on Linux, but not on Windows. Why do we need this? Because on Linux os.File.Readdir does the same lookups, but it does them sequentially, which hurts performance. Alternatives like https://github.com/karrick/godirwalk avoid the lookup altogether, but only if you don't need any information about each entry other than whether it's a file or a directory. We definitely also need to know whether it's a symlink. And, in our current architecture, we also need to get the size and last modified time (LMT) for the file.
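The idea of resolving names to os.FileInfo in parallel can be sketched with the standard library alone. This is an illustration of the technique, not the reader that NewDirReader returns: readDirParallelStat and demoSizes are hypothetical names, and the sketch has no timeout handling of the kind that ReaddirTimeoutError suggests.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"sync"
)

// readDirParallelStat reads only the entry names of dir, then resolves each
// name to an os.FileInfo using a pool of goroutines. os.Lstat is used so a
// symlink is reported as a symlink rather than followed.
func readDirParallelStat(dir string, workers int) ([]os.FileInfo, error) {
	if workers < 1 {
		workers = 1
	}
	f, err := os.Open(dir)
	if err != nil {
		return nil, err
	}
	names, err := f.Readdirnames(-1) // names only, no per-entry lookup
	f.Close()
	if err != nil {
		return nil, err
	}

	infos := make([]os.FileInfo, len(names))
	errs := make([]error, len(names))
	idx := make(chan int)
	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for i := range idx {
				// Each index is written by exactly one goroutine.
				infos[i], errs[i] = os.Lstat(filepath.Join(dir, names[i]))
			}
		}()
	}
	for i := range names {
		idx <- i
	}
	close(idx)
	wg.Wait()

	for _, e := range errs {
		if e != nil {
			return nil, fmt.Errorf("lstat failed: %w", e)
		}
	}
	return infos, nil
}

// demoSizes builds a temporary directory with one file and one
// subdirectory, and returns the size of each non-directory entry by name.
func demoSizes() (map[string]int64, error) {
	dir, err := os.MkdirTemp("", "statdemo")
	if err != nil {
		return nil, err
	}
	defer os.RemoveAll(dir)
	if err := os.WriteFile(filepath.Join(dir, "a.txt"), []byte("hello"), 0o644); err != nil {
		return nil, err
	}
	if err := os.Mkdir(filepath.Join(dir, "sub"), 0o755); err != nil {
		return nil, err
	}
	infos, err := readDirParallelStat(dir, 4)
	if err != nil {
		return nil, err
	}
	sizes := map[string]int64{}
	for _, fi := range infos {
		if !fi.IsDir() {
			sizes[fi.Name()] = fi.Size()
		}
	}
	return sizes, nil
}

func main() {
	sizes, err := demoSizes()
	fmt.Println(sizes, err)
}
```

Each os.FileInfo obtained this way carries the mode (including the symlink bit), size and modification time, which are the properties the paragraph above says are needed.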
type DirectoryEntry ¶
type DirectoryEntry interface{}
type EnumerateOneDirFunc ¶
type EnumerateOneDirFunc func(dir Directory, enqueueDir func(Directory), enqueueOutput func(DirectoryEntry, error)) error
Must be safe to call simultaneously from multiple goroutines, each with a different dir.
type FileSystemEntry ¶
type FileSystemEntry struct {
// contains filtered or unexported fields
}
type InputObject ¶
type InputObject interface{}
type OutputObject ¶
type OutputObject interface{}
type TransformFunc ¶
type TransformFunc func(input InputObject) (OutputObject, error)
Must be safe to call simultaneously from multiple goroutines.
type TransformResult ¶
type TransformResult struct {
// contains filtered or unexported fields
}
func (TransformResult) Item ¶
func (r TransformResult) Item() (interface{}, error)