multiglob

Version: v0.2.0
Published: Apr 16, 2019 License: MIT Imports: 4 Imported by: 0

README

multiglob: matching multiple wildcard based patterns

Inspired by a problem I encountered at work, this matches a string against a list of patterns and tells you which one it matched against!

This uses a Radix tree under the hood and aims to be pretty darn fast.
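To give a feel for why a tree helps, here is a minimal sketch of a byte-level prefix trie that handles only patterns of the form prefix*. It is not multiglob's implementation: the names node, addPrefix and find are hypothetical, and a real radix tree would also compress chains of single-child nodes and handle wildcards in any position. The point is that one walk over the input checks every registered pattern at once.

```go
package main

import "fmt"

// node is one trie node; each edge is a single byte.
// A radix tree would merge chains of single-child nodes into one edge.
type node struct {
	children map[byte]*node
	names    []string // names of prefix patterns that end at this node
}

func newNode() *node { return &node{children: map[byte]*node{}} }

// addPrefix registers a pattern equivalent to prefix+"*" under name.
func (n *node) addPrefix(name, prefix string) {
	cur := n
	for i := 0; i < len(prefix); i++ {
		next, ok := cur.children[prefix[i]]
		if !ok {
			next = newNode()
			cur.children[prefix[i]] = next
		}
		cur = next
	}
	cur.names = append(cur.names, name)
}

// find walks input once and collects every pattern whose prefix it passes.
func (n *node) find(input string) []string {
	cur := n
	out := append([]string(nil), cur.names...)
	for i := 0; i < len(input); i++ {
		next, ok := cur.children[input[i]]
		if !ok {
			break // no registered prefix continues with this byte
		}
		cur = next
		out = append(out, cur.names...)
	}
	return out
}

func main() {
	root := newNode()
	root.addPrefix("foo", "foo")
	root.addPrefix("food", "food")
	root.addPrefix("bar", "bar")

	fmt.Println(root.find("foodie")) // [foo food]
	fmt.Println(root.find("barney")) // [bar]
	fmt.Println(root.find("baz"))    // []
}
```

The cost of find depends on the length of the input, not on how many patterns were registered, which is the property the benchmarks further down measure.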

Usage:

See /example for a runnable version; the same code is reproduced below.

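package main

import (
	"fmt"

	"github.com/szabado/multiglob"
)

func main() {
	// Register each pattern under a name; lookups report the name.
	mgb := multiglob.New()
	mgb.MustAddPattern("foo", "foo*")
	mgb.MustAddPattern("bar", "bar*")
	mgb.MustAddPattern("eyyyy!", "*ey*")

	// Merge the patterns into a single matcher.
	mg := mgb.MustCompile()

	// Match only reports whether any pattern matched.
	if mg.Match("football") {
		fmt.Println("I matched!")
	}

	// FindAllPatterns returns the names of every matching pattern.
	matches := mg.FindAllPatterns("barney stinson")
	if len(matches) == 0 {
		fmt.Println("Oh no, I didn't match any pattern")
		return
	}

	for _, match := range matches {
		fmt.Println("I matched: ", match)
	}
}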

Performance Comparison

The lazy way you can do what multiglob does is loop over the patterns. That's sloooow, but I wanted to know how slow. I benchmarked it against the standard library regexp package as well as github.com/gobwas/glob, which I took some inspiration from. On my laptop, the benchmarks in comparison_test.go produce this:

$ go test . -bench=.
goos: linux
goarch: amd64
pkg: github.com/szabado/multiglob
BenchmarkMultiMatchRegex-4            	    1000	   1260832 ns/op	      37 B/op	       0 allocs/op
BenchmarkMultiMatchGlob-4             	   20000	     61867 ns/op	       0 B/op	       0 allocs/op
BenchmarkMultiMatchMultiGlob-4        	 1000000	      4405 ns/op	       0 B/op	       0 allocs/op
BenchmarkMultiNotMatchRegex-4         	    3000	    544801 ns/op	      12 B/op	       0 allocs/op
BenchmarkMultiNotMatchGlob-4          	   30000	     41110 ns/op	       0 B/op	       0 allocs/op
BenchmarkMultiNotMatchMultiGlob-4     	 3000000	       586 ns/op	       0 B/op	       0 allocs/op
BenchmarkSingleMatchRegex-4           	   50000	     32161 ns/op	       0 B/op	       0 allocs/op
BenchmarkSingleMatchGlob-4            	 5000000	       323 ns/op	       0 B/op	       0 allocs/op
BenchmarkSingleMatchMultiGlob-4       	50000000	        29.0 ns/op	       0 B/op	       0 allocs/op
BenchmarkParseRegex-4                 	     500	   3645894 ns/op	 3188164 B/op	   24300 allocs/op
BenchmarkParseGlob-4                  	     500	   3609843 ns/op	 1555202 B/op	   39288 allocs/op
BenchmarkParseMultiGlob-4             	    1000	   2039323 ns/op	 1870861 B/op	   22935 allocs/op
BenchmarkMultiGlobFindAllPatterns-4   	  300000	      5540 ns/op	       0 B/op	       0 allocs/op
BenchmarkMultiGlobFindPattern-4       	  300000	      4826 ns/op	       0 B/op	       0 allocs/op
BenchmarkMultiGlobFindAllGlobs-4      	  200000	      6327 ns/op	     528 B/op	       7 allocs/op
BenchmarkMultiGlobFindGlobs-4         	  300000	      4838 ns/op	     128 B/op	       5 allocs/op
PASS
ok  	github.com/szabado/multiglob	31.539s

All the Multi benchmarks match one string against a bunch of patterns, and all the Single ones match against one pattern. Basically? It's fast. It's more than 10 times faster than Glob, and that ratio gets better the more patterns there are.

Based on the benchmark, it also scales better than Glob. The Multi tests have 720 patterns; going from the Single test to the Multi test, MultiGlob took ~150 times longer per operation, compared to a ~190 times increase for Glob.
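The growth figures can be rechecked from the ns/op column of the benchmark output above. The sketch below just divides the Multi match timing by the Single match timing for each library; the helper name slowdown is made up for this example.

```go
package main

import "fmt"

// slowdown reports how many times longer the multi-pattern benchmark took
// per operation than the single-pattern one.
func slowdown(multiNs, singleNs float64) float64 {
	return multiNs / singleNs
}

func main() {
	// ns/op values copied from the benchmark output above.
	fmt.Printf("MultiGlob: %.0fx\n", slowdown(4405, 29.0)) // BenchmarkMultiMatchMultiGlob / BenchmarkSingleMatchMultiGlob
	fmt.Printf("Glob:      %.0fx\n", slowdown(61867, 323)) // BenchmarkMultiMatchGlob / BenchmarkSingleMatchGlob
}
```

This prints roughly 152x for MultiGlob and 192x for Glob, in line with the rounded ~150 and ~190 quoted above.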

Glob is already way faster than using a Regex. MultiGlob is way faster than using Glob naively; you just have to accept the reduced functionality.
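For reference, the "lazy way" of looping over the patterns looks roughly like the sketch below. This is not the code in comparison_test.go: it uses path.Match from the standard library instead of github.com/gobwas/glob, and the namedPattern type and naiveFindAll function are hypothetical. Note that path.Match does not let * cross a "/" separator, so it is only a stand-in for inputs without slashes.

```go
package main

import (
	"fmt"
	"path"
)

// namedPattern pairs a name with a glob; a hypothetical type for this sketch.
type namedPattern struct {
	name, pattern string
}

// naiveFindAll tests input against every pattern in turn, so each lookup
// costs time proportional to the number of patterns.
func naiveFindAll(patterns []namedPattern, input string) ([]string, error) {
	var matched []string
	for _, p := range patterns {
		ok, err := path.Match(p.pattern, input)
		if err != nil {
			return nil, err // malformed pattern
		}
		if ok {
			matched = append(matched, p.name)
		}
	}
	return matched, nil
}

func main() {
	patterns := []namedPattern{
		{"foo", "foo*"},
		{"bar", "bar*"},
		{"eyyyy!", "*ey*"},
	}
	matches, err := naiveFindAll(patterns, "barney stinson")
	if err != nil {
		fmt.Println("bad pattern:", err)
		return
	}
	fmt.Println(matches) // [bar eyyyy!]
}
```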

Open Questions

Isn't this basically an http router??

Yep! But I didn't want the overhead of http and I wanted to write this for fun.

Limitations

This only supports wildcards (*) and character ranges ([ab], [^cd], [e-h], etc.). If you need more, I'd suggest checking out github.com/gobwas/glob.
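If the bracket syntax is unfamiliar, the sketch below shows what these three forms conventionally mean, using path.Match from the standard library, which accepts the same notation. It assumes multiglob follows the same conventions for sets, negated sets and ranges; that is not verified here, and the helper name matches is made up for the example.

```go
package main

import (
	"fmt"
	"path"
)

// matches reports whether input matches pattern. A malformed pattern is
// treated as a non-match to keep the sketch short.
func matches(pattern, input string) bool {
	ok, err := path.Match(pattern, input)
	return err == nil && ok
}

func main() {
	fmt.Println(matches("[ab]x", "ax"))     // true: first byte is a or b
	fmt.Println(matches("[^cd]x", "cx"))    // false: c is excluded by the negated set
	fmt.Println(matches("[e-h]*", "hello")) // true: h falls in the range e-h
	fmt.Println(matches("[e-h]*", "world")) // false: w is outside the range
}
```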

Requirements

This requires Go 1.11 or later, because it uses Go modules. You can probably vendor it in with older versions, but as always do so at your own risk.

Documentation

Overview

Package multiglob implements a multi-pattern glob matcher.

It accepts multiple glob-based patterns and produces a matcher that determines which pattern, if any, matches. It's radix tree based for maximal efficiency.

Types

type Builder

type Builder struct {
	// contains filtered or unexported fields
}

Builder builds a MultiGlob.

func New

func New() *Builder

New returns a new Builder that can be used to create a MultiGlob.

func (*Builder) AddPattern

func (m *Builder) AddPattern(name, pattern string) error

AddPattern adds the provided pattern to the builder and parses it.

func (*Builder) Compile

func (m *Builder) Compile() (*MultiGlob, error)

Compile merges all the compiled patterns into one MultiGlob and returns it.

func (*Builder) MustAddPattern

func (m *Builder) MustAddPattern(name, pattern string)

MustAddPattern wraps AddPattern, and panics if there is an error.

func (*Builder) MustCompile

func (m *Builder) MustCompile() *MultiGlob

MustCompile wraps Compile, and panics if there is an error.

type MultiGlob

type MultiGlob struct {
	// contains filtered or unexported fields
}

MultiGlob is a matcher that is built from a collection of patterns. See Builder.

func (*MultiGlob) FindAllGlobs added in v0.1.1

func (mg *MultiGlob) FindAllGlobs(input string) map[string][]string

FindAllGlobs returns a map of pattern names to the globs extracted using each pattern. It uses all the patterns returned by FindAllPatterns. See FindGlobs for an explanation of glob extraction.

func (*MultiGlob) FindAllPatterns

func (mg *MultiGlob) FindAllPatterns(input string) []string

FindAllPatterns returns a list containing all patterns that matched this input.

func (*MultiGlob) FindGlobs added in v0.1.1

func (mg *MultiGlob) FindGlobs(input string) (name string, globs []string, matched bool)

FindGlobs finds a matching pattern using FindPattern, and then extracts the globs from the input based on that pattern. It also returns the name of the pattern matched. This uses a greedy matching algorithm. For example:

Input:         "test"
Pattern Found: "t*t"
Globs:         ["es"]

Input:         "pen pineapple apple pen"
Pattern Found: "*apple*"
Globs:         ["pen pineapple ", " pen"]
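The two examples above can be reproduced with a small standard-library sketch that turns each * into a greedy regexp capture group. This only illustrates what greedy extraction means; it is not how multiglob implements FindGlobs, it ignores character ranges, and the function name extractGlobs is hypothetical.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// extractGlobs returns the text captured by each * in pattern, or false if
// input does not match. Only * is treated as special in this sketch.
func extractGlobs(pattern, input string) ([]string, bool) {
	parts := strings.Split(pattern, "*")
	for i, p := range parts {
		parts[i] = regexp.QuoteMeta(p) // literal segments must match exactly
	}
	// (.*) is greedy: each wildcard takes as much as it can while the rest
	// of the pattern still matches. (?s) lets . match newlines too.
	re, err := regexp.Compile("(?s)^" + strings.Join(parts, "(.*)") + "$")
	if err != nil {
		return nil, false
	}
	m := re.FindStringSubmatch(input)
	if m == nil {
		return nil, false
	}
	return m[1:], true
}

func main() {
	globs, _ := extractGlobs("t*t", "test")
	fmt.Printf("%q\n", globs) // ["es"]

	globs, _ = extractGlobs("*apple*", "pen pineapple apple pen")
	fmt.Printf("%q\n", globs) // ["pen pineapple " " pen"]
}
```

In the second case the first wildcard swallows "pen pineapple " because it is greedy; a non-greedy match would have stopped at the "apple" inside "pineapple".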

func (*MultiGlob) FindGlobsForPattern added in v0.1.1

func (mg *MultiGlob) FindGlobsForPattern(input, name string) (globs []string, err error)

FindGlobsForPattern extracts the globs from input using the named pattern.

func (*MultiGlob) FindPattern added in v0.1.1

func (mg *MultiGlob) FindPattern(input string) (string, bool)

FindPattern returns the name of one pattern out of the set of patterns that match input. There is no guarantee as to which of the matching patterns will be returned. The boolean result is true if a pattern was matched.

func (*MultiGlob) Match

func (mg *MultiGlob) Match(input string) bool

Match determines if any pattern matches the provided string.

Directories

Path Synopsis
internal
