Documentation ¶
Overview ¶
Package adblock implements a parser and a matcher for AdBlockPlus rules. The syntax of AdBlockPlus rules is partially defined in https://adblockplus.org/en/filter-cheatsheet and https://adblockplus.org/en/filters. To parse rules and build a matcher:
matcher := adblock.NewMatcher()
fp, err := os.Open("easylist.txt")
...
rules, err := adblock.ParseRules(fp)
for _, rule := range rules {
    err = matcher.AddRule(rule, 0)
    ...
}
To match HTTP requests:
host := r.URL.Host
if host == "" {
    host = r.Host
}
// Match takes a *Request, so build a pointer.
rq := &adblock.Request{
    URL:    r.URL.String(),
    Domain: host,
    // possibly fill OriginDomain from the Referer header
    // and ContentType from the HTTP response Content-Type.
    Timeout: 200 * time.Millisecond,
}
matched, id, err := matcher.Match(rq)
if err != nil {
    ...
}
if matched {
    // Use the rule identifier to print which rule was matched
}
Constants ¶
This section is empty.
Variables ¶
var (
NullOpts = RuleOpts{}
)
Functions ¶
This section is empty.
Types ¶
type InterruptedError ¶
func (*InterruptedError) Error ¶
func (e *InterruptedError) Error() string
type Request ¶
type Request struct {
    // URL is matched against rule parts. Mandatory.
    URL string
    // Domain is matched against optional domain or third-party rules
    Domain string
    // ContentType is matched against optional content rules. This
    // information is often available only in client responses. Filters
    // may be applied twice, once at request time, once at response time.
    ContentType string
    // OriginDomain is matched against optional third-party rules.
    OriginDomain string
    // Timeout is the maximum amount of time a single matching can take.
    Timeout time.Duration
    CheckFreq int
    // GenericBlock is true if rules not matching a specific domain are to be
    // ignored. If nil, the matcher will determine it internally based on
    // $genericblock options.
    GenericBlock *bool
}
Request defines client request properties to be matched against a set of rules.
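GenericBlock is a *bool, so it carries three states: nil (let the matcher decide), true, and false. The sketch below illustrates that convention with two hypothetical helpers, `boolPtr` and `describe`, neither of which is part of this package.

```go
package main

import "fmt"

// boolPtr returns a pointer to a copy of b. Hypothetical helper for
// filling *bool fields such as GenericBlock.
func boolPtr(b bool) *bool { return &b }

// describe renders the three states of an optional boolean.
func describe(p *bool) string {
	switch {
	case p == nil:
		return "unset"
	case *p:
		return "true"
	default:
		return "false"
	}
}

func main() {
	var genericBlock *bool // nil: the matcher decides internally
	fmt.Println(describe(genericBlock))

	genericBlock = boolPtr(true) // force generic rules to be ignored
	fmt.Println(describe(genericBlock))
}
```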
func (*Request) HasGenericBlock ¶
type Rule ¶
type Rule struct {
    // The original string representation
    Raw string
    // Exception is true for exclusion rules (prefixed with "@@")
    Exception bool
    // Parts is the sequence of RulePart matching URLs
    Parts []RulePart
    // Opts are optional rules applied to content
    Opts RuleOpts
}
Rule represents a complete AdBlockPlus rule.
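To make the relationship between Raw, Exception, Parts and Opts concrete, here is a simplified sketch of how a raw rule line might be split. The function `splitRule` is hypothetical and is not the parser this package uses. It assumes that options follow the last '$' in the line, which ignores cases where '$' is part of the URL pattern itself.

```go
package main

import (
	"fmt"
	"strings"
)

// splitRule separates a raw rule into its exception flag, URL pattern
// and option string. Hypothetical and simplified: it assumes options
// follow the last '$' in the rule.
func splitRule(raw string) (exception bool, pattern, opts string) {
	raw = strings.TrimSpace(raw)
	if strings.HasPrefix(raw, "@@") {
		exception = true
		raw = raw[2:]
	}
	if i := strings.LastIndex(raw, "$"); i >= 0 {
		pattern, opts = raw[:i], raw[i+1:]
	} else {
		pattern = raw
	}
	return exception, pattern, opts
}

func main() {
	exception, pattern, opts := splitRule("@@||example.com^$document")
	fmt.Printf("exception=%v pattern=%q opts=%q\n", exception, pattern, opts)
}
```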
func ParseRules ¶
ParseRules returns the sequence of rules extracted from supplied reader content.
func (*Rule) HasContentOpts ¶
func (*Rule) HasUnsupportedOpts ¶
type RuleMatcher ¶
type RuleMatcher struct {
// contains filtered or unexported fields
}
RuleMatcher implements a complete set of include and exclude AdBlockPlus rules.
func NewMatcherFromFiles ¶
func NewMatcherFromFiles(paths ...string) (*RuleMatcher, int, error)
func (*RuleMatcher) AddRule ¶
func (m *RuleMatcher) AddRule(rule *Rule, ruleId int) error
AddRule adds a rule to the matcher. The supplied rule identifier is the one Match() returns when this rule matches.
func (*RuleMatcher) Match ¶
func (m *RuleMatcher) Match(rq *Request) (bool, int, error)
Match applies include and exclude rules to the supplied request. If the request matches, it returns true and the identifier of the matching rule.
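The interplay of include and exclude rules can be sketched as follows, assuming the usual AdBlockPlus semantics in which an exception rule overrides a blocking rule. The `toyMatcher` type is hypothetical and uses plain substring tests instead of rule trees; it is not how RuleMatcher is implemented.

```go
package main

import (
	"fmt"
	"strings"
)

// toyMatcher is a hypothetical, much reduced matcher: patterns are
// plain substrings and a rule's identifier is its index in includes.
type toyMatcher struct {
	includes []string // blocking patterns
	excludes []string // exception patterns
}

// match returns true and the identifier of the first include pattern
// found in url, unless any exclude pattern is also found.
func (m *toyMatcher) match(url string) (bool, int) {
	id := -1
	for i, p := range m.includes {
		if strings.Contains(url, p) {
			id = i
			break
		}
	}
	if id < 0 {
		return false, 0
	}
	for _, p := range m.excludes {
		if strings.Contains(url, p) {
			return false, 0 // an exception overrides the include rule
		}
	}
	return true, id
}

func main() {
	m := &toyMatcher{
		includes: []string{"/tracker/", "/ads/"},
		excludes: []string{"trusted.example.com"},
	}
	fmt.Println(m.match("http://cdn.example.net/ads/banner.png"))
	fmt.Println(m.match("http://trusted.example.com/ads/banner.png"))
}
```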
func (*RuleMatcher) String ¶
func (m *RuleMatcher) String() string
String returns a textual representation of the include and exclude rules, matching requests with or without content.
type RuleNode ¶
type RuleNode struct {
    Type     RuleType
    Value    []byte
    Opts     []*RuleOpts // non-empty on terminating nodes
    Children []*RuleNode
    RuleId   int
}
RuleNode is the node structure of rule trees. Rule trees start with a Root node containing any number of non-Root RuleNodes.
func (*RuleNode) GetValue ¶
GetValue returns the node representation. It may differ from the Value field for composite nodes like Substring.
func (*RuleNode) Match ¶
Match evaluates a piece of a request URL against the node subtree. If it matches an existing rule, it returns the rule identifier and its option set. A request is evaluated by applying the nodes to its URL in depth-first (DFS) order. When the URL is completely matched by a terminal node, that is, a node with a non-empty Opts set, the Opts are applied to the Request properties. Any option match validates the URL as a whole, and the matching rule identifier is returned. If the request timeout is set and exceeded, InterruptedError is returned.
type RuleOpts ¶
type RuleOpts struct {
    Raw              string
    Collapse         *bool
    Document         bool
    Domains          []string
    ElemHide         bool
    Font             *bool
    GenericBlock     bool
    GenericHide      bool
    Image            *bool
    Media            *bool
    Object           *bool
    ObjectSubRequest *bool
    Other            *bool
    Ping             *bool
    Popup            *bool
    Script           *bool
    Stylesheet       *bool
    SubDocument      *bool
    ThirdParty       *bool
    Websocket        *bool
    WebRTC           *bool
    XmlHttpRequest   *bool
}
RuleOpts defines custom rules applied to content once the URL part has been matched by the RuleParts.
func NewRuleOpts ¶
NewRuleOpts parses the rule part following the '$' separator and returns the content matching options.
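As an illustration of what such parsing involves, the sketch below handles a small subset of the AdBlockPlus option syntax: comma-separated options, a '~' prefix for negation, and `domain=` with '|' separated values. The `opts` type and `parseOpts` function are hypothetical; they mirror only a few RuleOpts fields and do not reproduce NewRuleOpts.

```go
package main

import (
	"fmt"
	"strings"
)

// opts is a hypothetical subset of RuleOpts. A nil pointer means the
// option was not mentioned in the rule.
type opts struct {
	Script     *bool
	Image      *bool
	ThirdParty *bool
	Domains    []string
}

// parseOpts parses the text following '$' in a rule. Only three content
// options and "domain=" are recognized in this sketch.
func parseOpts(raw string) (opts, error) {
	var o opts
	for _, field := range strings.Split(raw, ",") {
		field = strings.TrimSpace(field)
		if field == "" {
			continue
		}
		if strings.HasPrefix(field, "domain=") {
			o.Domains = strings.Split(strings.TrimPrefix(field, "domain="), "|")
			continue
		}
		// value is a fresh variable on every iteration, so taking its
		// address below is safe.
		value := true
		if strings.HasPrefix(field, "~") {
			value = false
			field = field[1:]
		}
		switch field {
		case "script":
			o.Script = &value
		case "image":
			o.Image = &value
		case "third-party":
			o.ThirdParty = &value
		default:
			return o, fmt.Errorf("unsupported option %q", field)
		}
	}
	return o, nil
}

func main() {
	o, err := parseOpts("script,~third-party,domain=a.com|~b.com")
	if err != nil {
		panic(err)
	}
	fmt.Println(*o.Script, *o.ThirdParty, o.Image == nil, o.Domains)
}
```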
type RulePart ¶
type RulePart struct {
    // Rule type, like Exact, Wildcard, etc.
    Type RuleType
    // Rule part string representation
    Value string
}
RulePart is the base component of rules. It represents a single matching element, like an exact match, a wildcard, a domain anchor...
type RuleTree ¶
type RuleTree struct {
// contains filtered or unexported fields
}
A RuleTree matches a set of AdBlockPlus rules.