Documentation ¶
Overview ¶
Package walker provides Walker, Scan and Sweep.
Index ¶
- Variables
- func Scan(Visit ScanProcess, Params Params) ([]string, error)
- func ScanMatch(value bool) float64
- func Sweep(Visit SweepProcess, Params Params) ([]string, error)
- type FS
- type Params
- type Process
- type ScanProcess
- func (ScanProcess) AfterVisit(context WalkContext[struct{}]) (err error)
- func (ScanProcess) AfterVisitChild(child fs.DirEntry, resultValue any, resultOK bool, ...) (err error)
- func (v ScanProcess) Visit(context WalkContext[struct{}]) (shouldVisitChildren bool, err error)
- func (ScanProcess) VisitChild(child fs.DirEntry, valid bool, context WalkContext[struct{}]) (action Step, err error)
- type Step
- type SweepProcess
- func (SweepProcess) AfterVisit(context WalkContext[bool]) (err error)
- func (SweepProcess) AfterVisitChild(child fs.DirEntry, resultValue any, resultOK bool, context WalkContext[bool]) (err error)
- func (v SweepProcess) Visit(context WalkContext[bool]) (shouldRecurse bool, err error)
- func (SweepProcess) VisitChild(child fs.DirEntry, valid bool, context WalkContext[bool]) (action Step, err error)
- type WalkContext
- type Walker
Constants ¶
This section is empty.
Variables ¶
var ErrUnknownAction = errors.New("Process.BeforeChild: Unknown action")
Functions ¶
func Scan ¶
func Scan(Visit ScanProcess, Params Params) ([]string, error)
Scan recursively scans a directory tree and returns all nodes matching the Visit function. Returned nodes are sorted first by descending score, then lexicographically. When an error occurs, Scan may continue scanning until all units have exited, and then returns nil, err.
This function is a convenience alternative to:
	scanner := Walker{Visit: Visit, Params: Params}
	err := scanner.Walk()
	results := scanner.Results()
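The ordering described for Scan (descending score, ties broken lexicographically) can be sketched with the standard library. This is only an illustration: the `scored` type and `sortResults` function are hypothetical names, not part of the package, and the package's own sorting may be implemented differently.

```go
package main

import (
	"fmt"
	"sort"
)

// scored is a hypothetical pairing of a node path with its score.
type scored struct {
	Path  string
	Score float64
}

// sortResults orders results by descending score and breaks ties
// lexicographically by path.
func sortResults(results []scored) {
	sort.SliceStable(results, func(i, j int) bool {
		if results[i].Score != results[j].Score {
			return results[i].Score > results[j].Score
		}
		return results[i].Path < results[j].Path
	})
}

func main() {
	results := []scored{{"b", 1}, {"a", 1}, {"c", 2}}
	sortResults(results)
	for _, r := range results {
		fmt.Println(r.Path, r.Score)
	}
}
```

The highest-scoring path comes first, and paths with equal scores appear in alphabetical order.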
func ScanMatch ¶
ScanMatch can be used to implement a boolean scan process. It returns 1 when value is true and -1 when value is false.
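A minimal stand-in for the documented behavior follows. The lowercase `scanMatch` is a hypothetical local copy written for illustration, and `isGitRepo` is an invented predicate; since a non-negative score counts as a match, 1 marks a match and -1 a non-match.

```go
package main

import "fmt"

// scanMatch is a hypothetical local stand-in that mirrors the documented
// behavior of ScanMatch: 1 for true, -1 for false.
func scanMatch(value bool) float64 {
	if value {
		return 1
	}
	return -1
}

// isGitRepo is an invented predicate used only for this example.
func isGitRepo(name string) bool {
	return name == ".git"
}

func main() {
	for _, name := range []string{".git", "src"} {
		fmt.Println(name, scanMatch(isGitRepo(name)))
	}
}
```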
func Sweep ¶
func Sweep(Visit SweepProcess, Params Params) ([]string, error)
Sweep recursively sweeps a directory tree and returns all nodes that are empty or contain only empty directories. When an error occurs, Sweep may continue sweeping until all units have exited, and then returns nil, err.
This function is a convenience alternative to:
	scanner := Walker{Visit: Visit, Params: Params}
	err := scanner.Walk()
	results := scanner.Results()
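The notion of a node that is "empty or contains only empty directories" can be illustrated with a small recursive sketch over a standard `fs.FS`. This does not use the package itself; `sweepEmpty`, `sweepAll` and `demoTree` are hypothetical helpers, and the sketch is sequential, whereas Sweep may process nodes concurrently.

```go
package main

import (
	"fmt"
	"io/fs"
	"path"
	"sort"
	"testing/fstest"
)

// sweepEmpty reports whether dir is empty or contains only empty
// directories. Every directory for which that holds is appended to out.
func sweepEmpty(fsys fs.FS, dir string, out *[]string) (bool, error) {
	entries, err := fs.ReadDir(fsys, dir)
	if err != nil {
		return false, err
	}
	empty := true
	for _, entry := range entries {
		if !entry.IsDir() {
			empty = false // any non-directory entry makes dir non-empty
			continue
		}
		childEmpty, err := sweepEmpty(fsys, path.Join(dir, entry.Name()), out)
		if err != nil {
			return false, err
		}
		if !childEmpty {
			empty = false
		}
	}
	if empty {
		*out = append(*out, dir)
	}
	return empty, nil
}

// demoTree builds a small in-memory tree used only for illustration.
func demoTree() fs.FS {
	return fstest.MapFS{
		"a/b":        {Mode: fs.ModeDir},
		"a/c/d":      {Mode: fs.ModeDir},
		"e/f":        {Mode: fs.ModeDir},
		"e/file.txt": {Data: []byte("x")},
	}
}

// sweepAll returns the sorted list of sweepable directories below the root.
func sweepAll(fsys fs.FS) ([]string, error) {
	var out []string
	if _, err := sweepEmpty(fsys, ".", &out); err != nil {
		return nil, err
	}
	sort.Strings(out)
	return out, nil
}

func main() {
	dirs, err := sweepAll(demoTree())
	if err != nil {
		fmt.Println("sweep failed:", err)
		return
	}
	fmt.Println(dirs)
}
```

In the demo tree, `a` qualifies because it holds only empty directories, while `e` does not because it contains a file.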
Types ¶
type FS ¶
type FS interface {
	// Path returns the path of this FS.
	// The path should not be normalized.
	Path() string

	// ResolvedPath returns the current path of this FS.
	// This function will be called only once, and may perform (potentially slow) normalization.
	//
	// The return value is used for cycle detection, and also passed to all other functions in this interface.
	ResolvedPath() (string, error)

	// Read reads the root directory of this filesystem
	// and returns a list of directory entries sorted by filename.
	//
	// It is roughly equivalent to the ReadDir method of fs.ReadDirFS.
	// Assuming fsys is an internal fs.FS the method might be implemented as:
	//
	//	fs.ReadDir(fs.FS(fsys), ".")
	Read(path string) ([]fs.DirEntry, error)

	// CanSub indicates if the given directory entry can be used as a valid FS.
	//
	// Sub creates a new FS for the provided entry.
	// path and rpath are the Path() and ResolvedPath() values.
	// Sub is only called when CanSub returns true and a nil error.
	CanSub(path string, entry fs.DirEntry) (bool, error)
	Sub(path, rpath string, entry fs.DirEntry) FS
}
FS represents a file system for use by walker.
See NewRealFS for instantiating a sample implementation.
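To make the contract more concrete, here is a rough sketch of an implementation backed by a standard `fs.FS`. It rests on several assumptions: the `FS` interface is copied locally so the example compiles on its own, `dirFS`, `subPaths` and `demoRoot` are hypothetical names, `path.Clean` stands in for whatever normalization a real implementation would do, and Read is assumed to receive the resolved path.

```go
package main

import (
	"fmt"
	"io/fs"
	"path"
	"testing/fstest"
)

// FS is a local copy of the documented interface, repeated here only so
// that the sketch is self-contained.
type FS interface {
	Path() string
	ResolvedPath() (string, error)
	Read(path string) ([]fs.DirEntry, error)
	CanSub(path string, entry fs.DirEntry) (bool, error)
	Sub(path, rpath string, entry fs.DirEntry) FS
}

// dirFS is a hypothetical implementation backed by a standard fs.FS.
type dirFS struct {
	fsys fs.FS
	dir  string
}

func (d dirFS) Path() string { return d.dir }

// ResolvedPath uses path.Clean as a stand-in for real normalization.
func (d dirFS) ResolvedPath() (string, error) { return path.Clean(d.dir), nil }

func (d dirFS) Read(rpath string) ([]fs.DirEntry, error) { return fs.ReadDir(d.fsys, rpath) }

// CanSub accepts directories only.
func (d dirFS) CanSub(_ string, entry fs.DirEntry) (bool, error) { return entry.IsDir(), nil }

func (d dirFS) Sub(p, _ string, entry fs.DirEntry) FS {
	return dirFS{fsys: d.fsys, dir: path.Join(p, entry.Name())}
}

// subPaths lists the paths of all entries of root that can become an FS.
func subPaths(root FS) ([]string, error) {
	rpath, err := root.ResolvedPath()
	if err != nil {
		return nil, err
	}
	entries, err := root.Read(rpath)
	if err != nil {
		return nil, err
	}
	var paths []string
	for _, entry := range entries {
		ok, err := root.CanSub(root.Path(), entry)
		if err != nil {
			return nil, err
		}
		if ok {
			paths = append(paths, root.Sub(root.Path(), rpath, entry).Path())
		}
	}
	return paths, nil
}

// demoRoot builds an in-memory root used only for illustration.
func demoRoot() FS {
	return dirFS{dir: ".", fsys: fstest.MapFS{
		"src/main.go":    {},
		"docs/readme.md": {},
		"LICENSE":        {},
	}}
}

func main() {
	paths, err := subPaths(demoRoot())
	fmt.Println(paths, err)
}
```

Only `docs` and `src` are reported, because `LICENSE` is a regular file and CanSub rejects it.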
type Params ¶
type Params struct {
	// Root is the root filesystem to begin the walk at
	Root FS

	// ExtraRoots are extra root folders to walk on top of root.
	// These may be nil.
	ExtraRoots []FS

	// MaxParallel is maximum number of nodes that will be scanned in parallel.
	// Zero or negative values are treated as no limit.
	MaxParallel int

	// BufferSize is an integer that can be used to optimize internal behavior.
	// It should be larger than the average number of expected results.
	// Set to 0 to disable.
	BufferSize int
}
Params are parameters for a walk across a filesystem.
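One common way to honor a limit such as MaxParallel is a buffered channel used as a semaphore, with zero or negative values meaning no limit. The sketch below shows that general technique only; whether the package limits concurrency this way is an assumption, and `runLimited` and `sleepTasks` are hypothetical helpers.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
	"time"
)

// runLimited runs every task, allowing at most maxParallel of them to run
// at once. Zero or negative values mean no limit. It returns the highest
// number of tasks observed running at the same time.
func runLimited(tasks []func(), maxParallel int) int32 {
	var sem chan struct{}
	if maxParallel > 0 {
		sem = make(chan struct{}, maxParallel)
	}
	var (
		wg            sync.WaitGroup
		running, peak int32
	)
	for _, task := range tasks {
		if sem != nil {
			sem <- struct{}{} // blocks while maxParallel tasks are running
		}
		wg.Add(1)
		go func(task func()) {
			defer wg.Done()
			n := atomic.AddInt32(&running, 1)
			for { // record the peak without losing concurrent updates
				p := atomic.LoadInt32(&peak)
				if n <= p || atomic.CompareAndSwapInt32(&peak, p, n) {
					break
				}
			}
			task()
			atomic.AddInt32(&running, -1)
			if sem != nil {
				<-sem // free the slot for the next task
			}
		}(task)
	}
	wg.Wait()
	return atomic.LoadInt32(&peak)
}

// sleepTasks returns n tasks that each sleep briefly.
func sleepTasks(n int) []func() {
	tasks := make([]func(), n)
	for i := range tasks {
		tasks[i] = func() { time.Sleep(time.Millisecond) }
	}
	return tasks
}

func main() {
	fmt.Println("peak with limit 2:", runLimited(sleepTasks(8), 2))
}
```

Because a slot is acquired before a task starts and released only after it finishes, the observed peak can never exceed the limit.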
type Process ¶
type Process[S any] interface {
	// Visit is called for every node that is being visited.
	// It is the first function called for each node.
	//
	// It receives a context, representing the node being visited.
	//
	// Visit should return three things.
	//
	// Snapshot is an arbitrary object that captures the current state of the process.
	// It is maintained throughout the processing of one node, and returned to the parent node (when being processed concurrently).
	//
	// shouldVisitChildren determines if any children of this node should be visited or if the process should stop.
	// When shouldVisitChildren is false, no other functions are called for this node, and the snapshot is returned to the parent (if any) immediately.
	//
	// Err is any error that may occur, and should typically be nil.
	// An error immediately causes iteration on this node to be aborted, and the first error of any node will be returned to the caller of Walk.
	Visit(context WalkContext[S]) (shouldVisitChildren bool, err error)

	// VisitChild is called to determine if and how a child node should be processed.
	//
	// A child entry is valid if it can be recursively processed (i.e. is a directory).
	//
	// When child is valid, action determines how the child should be processed; otherwise action is ignored.
	VisitChild(child fs.DirEntry, valid bool, context WalkContext[S]) (action Step, err error)

	// AfterVisitChild is called after a child has been visited synchronously.
	//
	// It is passed two special values: the returned snapshot (as returned from AfterVisit / Visit) and whether the child was processed properly.
	// The child was processed improperly when any of the Process functions on it returned an error, listing a directory failed, or it was already processed before (loop detection). In these cases resultValue is nil.
	AfterVisitChild(child fs.DirEntry, resultValue any, resultOK bool, context WalkContext[S]) (err error)

	// AfterVisit is called after all children have been visited (or scheduled to be visited).
	// It is not called for the case where Visit returns shouldVisitChildren = false.
	//
	// result can be used to mark the current node, see also Visit.
	//
	// The returnValue returned from AfterVisit is passed to parent(s) if any.
	AfterVisit(context WalkContext[S]) (err error)
}
Process determines the behavior of a Walker.
Each process may hold intermediate state of type S. Processes should not retain references to WalkContext values (or state) beyond the invocation of each method.
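The order in which the four methods are called can be traced with a deliberately simplified, synchronous walker. Everything here is a sketch under stated assumptions: `hooks` is a hypothetical reduction of Process that replaces WalkContext and Step with plain values, and `walk`, `recorder` and `record` are invented for the example.

```go
package main

import (
	"fmt"
	"io/fs"
	"path"
	"testing/fstest"
)

// hooks is a simplified, hypothetical stand-in for Process.
type hooks interface {
	Visit(node string) (shouldVisitChildren bool, err error)
	VisitChild(child fs.DirEntry) (visit bool, err error)
	AfterVisitChild(child fs.DirEntry, ok bool) error
	AfterVisit(node string) error
}

// walk calls the hooks in the documented order, always synchronously.
func walk(fsys fs.FS, node string, h hooks) error {
	visitChildren, err := h.Visit(node)
	if err != nil || !visitChildren {
		return err // no other hooks run for this node
	}
	entries, err := fs.ReadDir(fsys, node)
	if err != nil {
		return err
	}
	for _, child := range entries {
		visit, err := h.VisitChild(child)
		if err != nil {
			return err
		}
		if !visit || !child.IsDir() {
			continue // the answer is ignored for entries that cannot be walked
		}
		childErr := walk(fsys, path.Join(node, child.Name()), h)
		if err := h.AfterVisitChild(child, childErr == nil); err != nil {
			return err
		}
		if childErr != nil {
			return childErr
		}
	}
	return h.AfterVisit(node)
}

// recorder logs every hook invocation.
type recorder struct{ calls []string }

func (r *recorder) Visit(node string) (bool, error) {
	r.calls = append(r.calls, "Visit "+node)
	return true, nil
}

func (r *recorder) VisitChild(child fs.DirEntry) (bool, error) {
	r.calls = append(r.calls, "VisitChild "+child.Name())
	return true, nil
}

func (r *recorder) AfterVisitChild(child fs.DirEntry, ok bool) error {
	r.calls = append(r.calls, "AfterVisitChild "+child.Name())
	return nil
}

func (r *recorder) AfterVisit(node string) error {
	r.calls = append(r.calls, "AfterVisit "+node)
	return nil
}

// record walks a tiny in-memory tree and returns the logged calls.
func record() ([]string, error) {
	r := &recorder{}
	err := walk(fstest.MapFS{"a/x.txt": {}}, ".", r)
	return r.calls, err
}

func main() {
	calls, err := record()
	if err != nil {
		fmt.Println("walk failed:", err)
		return
	}
	for _, c := range calls {
		fmt.Println(c)
	}
}
```

For the directory `a`, the trace shows Visit and AfterVisit of the child running before the parent's AfterVisitChild, and the parent's AfterVisit running last.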
type ScanProcess ¶
ScanProcess is a function that is called once for each directory that is being walked. It returns a triple of float64 score, bool continue and err error.
match indicates the score the path received. A non-negative score indicates a match, and the path will be returned in the array from Scan(). cont indicates if Scan() should continue scanning recursively. err != nil indicates that an error has occurred, and the entire process should be aborted.
ScanProcess may be nil. In such a case, it is assumed to return (0, true, nil) for every invocation.
ScanProcess implements Process and can be used with Walk.
func (ScanProcess) AfterVisit ¶
func (ScanProcess) AfterVisit(context WalkContext[struct{}]) (err error)
func (ScanProcess) AfterVisitChild ¶
func (ScanProcess) AfterVisitChild(child fs.DirEntry, resultValue any, resultOK bool, context WalkContext[struct{}]) (err error)
func (ScanProcess) Visit ¶
func (v ScanProcess) Visit(context WalkContext[struct{}]) (shouldVisitChildren bool, err error)
func (ScanProcess) VisitChild ¶
func (ScanProcess) VisitChild(child fs.DirEntry, valid bool, context WalkContext[struct{}]) (action Step, err error)
type Step ¶
type Step int
Step describes how a child node should be processed
const (
	// DoNothing ignores the child node, and continues with the next node.
	DoNothing Step = iota

	// DoSync synchronously processes the child node.
	// Once processing the child node has finished, the AfterVisitChild() function will be called.
	DoSync

	// DoConcurrent queues the child node to be processed concurrently.
	// The current node will not wait for it.
	DoConcurrent
)
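A caller might dispatch on a Step value roughly as follows. The `Step` type and its constants are redeclared locally so the sketch compiles on its own; `dispatch`, `runSteps` and `errUnknownAction` are hypothetical, the latter loosely mirroring the package's ErrUnknownAction. How the walker itself schedules concurrent work is not specified by the documentation.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
	"sync/atomic"
)

// Step and its constants are local copies of the documented declarations.
type Step int

const (
	DoNothing Step = iota
	DoSync
	DoConcurrent
)

// errUnknownAction is a hypothetical counterpart to ErrUnknownAction.
var errUnknownAction = errors.New("unknown action")

// dispatch runs process according to action. Concurrent work is tracked
// in wg so the caller can wait for it later.
func dispatch(action Step, process func(), wg *sync.WaitGroup) error {
	switch action {
	case DoNothing:
		return nil // skip this child
	case DoSync:
		process() // finishes before dispatch returns
		return nil
	case DoConcurrent:
		wg.Add(1)
		go func() {
			defer wg.Done()
			process()
		}()
		return nil
	default:
		return errUnknownAction
	}
}

// runSteps dispatches one counting task per step and reports how many ran.
func runSteps(steps []Step) (int32, error) {
	var (
		wg       sync.WaitGroup
		ran      int32
		firstErr error
	)
	for _, step := range steps {
		if err := dispatch(step, func() { atomic.AddInt32(&ran, 1) }, &wg); err != nil {
			firstErr = err
			break
		}
	}
	wg.Wait() // wait for any concurrently dispatched tasks
	return atomic.LoadInt32(&ran), firstErr
}

func main() {
	n, err := runSteps([]Step{DoNothing, DoSync, DoConcurrent})
	fmt.Println(n, err)
}
```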
type SweepProcess ¶
SweepProcess is a function that is called once for each directory that is being swept. It returns a boolean stop.
stop should indicate if the scan should continue recursively, or stop and treat the appropriate directory as non-empty.
Visit may be nil. In such a case, it is assumed to return false for every invocation.
SweepProcess implements Process and can be used with Walk.
func (SweepProcess) AfterVisit ¶
func (SweepProcess) AfterVisit(context WalkContext[bool]) (err error)
func (SweepProcess) AfterVisitChild ¶
func (SweepProcess) AfterVisitChild(child fs.DirEntry, resultValue any, resultOK bool, context WalkContext[bool]) (err error)
func (SweepProcess) Visit ¶
func (v SweepProcess) Visit(context WalkContext[bool]) (shouldRecurse bool, err error)
func (SweepProcess) VisitChild ¶
func (SweepProcess) VisitChild(child fs.DirEntry, valid bool, context WalkContext[bool]) (action Step, err error)
type WalkContext ¶
type WalkContext[S any] interface {
	// Root node this instance of the scan started from
	Root() FS

	// Current node being operated on
	Node() FS

	// Path to the current node
	NodePath() string

	// Path from the root node to this node
	Path() []string

	// Depth of this node, equivalent to len(Path())
	Depth() int

	// Update the snapshot corresponding to the current context
	Snapshot(update func(snapshot S) (value S))

	// Mark the current node as a result with the given priority.
	// May be called multiple times, in which case the node is marked as a result multiple times.
	Mark(priority float64)
}
WalkContext represents the current state of a Walker. It may additionally hold a snapshot of the state of type S.
Any instance of WalkContext should not be retained past any method it is passed to.
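The Snapshot method takes an update function rather than a new value, which lets the holder of the state apply each update atomically. A minimal sketch of that pattern follows; `snapshotBox` and `countConcurrently` are hypothetical, and the use of a mutex is an assumption about how such a method could be made safe, not a description of the package's internals.

```go
package main

import (
	"fmt"
	"sync"
)

// snapshotBox is a hypothetical holder of state of type S.
type snapshotBox[S any] struct {
	mu    sync.Mutex
	value S
}

// Snapshot replaces the stored value with update(current value) while
// holding the lock, so concurrent updates do not race.
func (b *snapshotBox[S]) Snapshot(update func(snapshot S) (value S)) {
	b.mu.Lock()
	defer b.mu.Unlock()
	b.value = update(b.value)
}

// countConcurrently increments a snapshot n times from n goroutines and
// returns the final value.
func countConcurrently(n int) int {
	var (
		box snapshotBox[int]
		wg  sync.WaitGroup
	)
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			box.Snapshot(func(s int) int { return s + 1 })
		}()
	}
	wg.Wait()

	// Read the value back through the same method, leaving it unchanged.
	var final int
	box.Snapshot(func(s int) int {
		final = s
		return s
	})
	return final
}

func main() {
	fmt.Println(countConcurrently(100))
}
```

Reading the state also goes through an update function that returns its input unchanged, which matches the shape of the documented method.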
type Walker ¶
type Walker[S any] struct {
	Params  Params
	Process Process[S]
	// contains filtered or unexported fields
}
Walker is an object that can recursively operate on all subdirectories of a directory and score those matching a specific criterion. The criterion is determined by the Process parameter.
Process also determines if the process can operate on multiple directories concurrently. Parameters determine the initial root directory (or directories) to start with, and what level of concurrency the walker may make use of.
Each Walker may be used only once. A typical use of a walker looks like:
	w := Walker{/* ... */}
	if err := w.Walk(); err != nil {
		return err
	}
	results, scores := w.Results(), w.Scores()
func (*Walker[S]) Paths ¶ added in v1.19.0
Paths returns the path of all nodes which have been marked as a result.
When resolved is true, Paths returns the normalized (resolved) paths; otherwise the non-normalized versions are returned. Directories are returned in sorted order: first ascending by priority, then lexicographically by resolved node path. Each call to Paths returns a new copy of the results.
Paths expects the Scan() function to have returned, and will panic if this is not the case.
func (*Walker[S]) Scores ¶
Scores returns the scores which have been marked as a result. They are returned in the same order as Results().
Scores expects the Scan() function to have returned, and will panic if this is not the case.