Documentation ¶
Overview ¶
Package diff implements an algorithm for producing edit-scripts. The edit-script is a sequence of operations needed to transform one list of symbols into another (or vice-versa). The edits allowed are insertions, deletions, and modifications. The summation of all edits is called the Levenshtein distance as this problem is well-known in computer science.
This package prioritizes performance over accuracy. That is, the run time is more important than obtaining a minimal Levenshtein distance.
Index ¶
Constants ¶
This section is empty.
Variables ¶
This section is empty.
Functions ¶
This section is empty.
Types ¶
type EditScript ¶
type EditScript []EditType
EditScript represents the series of differences between two lists.
func Difference ¶
func Difference(nx, ny int, f EqualFunc) (eq bool, es EditScript)
Difference reports whether two lists of lengths nx and ny are equal given the definition of equality provided as f.
This function may return a edit-script, which is a sequence of operations needed to convert one list into the other. If non-nil, the following invariants for the edit-script are maintained:
- eq == (es.Dist()==0)
- nx == es.LenX()
- ny == es.LenY()
This algorithm is not guaranteed to be an optimal solution (i.e., one that produces an edit-script with a minimal Levenshtein distance). This algorithm favors performance over optimality. The exact output is not guaranteed to be stable and may change over time.
func (EditScript) Dist ¶
func (es EditScript) Dist() int
Dist is the Levenshtein distance and is guaranteed to be 0 if and only if lists X and Y are equal.
func (EditScript) String ¶
func (es EditScript) String() string
String returns a human-readable string representing the edit-script where Identity, UniqueX, UniqueY, and Modified are represented by the '.', 'X', 'Y', and 'M' characters, respectively.
type EditType ¶
type EditType uint8
EditType represents a single operation within an edit-script.
const ( // Identity indicates that a symbol pair is identical in both list X and Y. Identity EditType = iota // UniqueX indicates that a symbol only exists in X and not Y. UniqueX // UniqueY indicates that a symbol only exists in Y and not X. UniqueY // Modified indicates that a symbol pair is a modification of each other. Modified )
type EqualFunc ¶
EqualFunc reports whether the symbols at indexes ix and iy are equal. When called by Difference, the index is guaranteed to be within nx and ny.
type Result ¶
type Result struct{ NSame, NDiff int }
Result is the result of comparison. NSame is the number of sub-elements that are equal. NDiff is the number of sub-elements that are not equal.
func (Result) Equal ¶
Equal indicates whether the symbols are equal. Two symbols are equal if and only if NDiff == 0. If Equal, then they are also Similar.
func (Result) Similar ¶
Similar indicates whether two symbols are similar and may be represented by using the Modified type. As a special case, we consider binary comparisons (i.e., those that return Result{1, 0} or Result{0, 1}) to be similar.
The exact ratio of NSame to NDiff to determine similarity may change.