diffdoc

package
v0.48.4
Published: Nov 24, 2024 License: MIT Imports: 20 Imported by: 0

Documentation

Overview

Package diffdoc provides core diff functionality, with a focus on streaming and concurrency.

Reference:

- https://en.wikipedia.org/wiki/Diff#Unified_format
- https://www.gnu.org/software/diffutils/manual/html_node/Hunks.html
- https://www.gnu.org/software/diffutils/manual/html_node/Sections.html
- https://www.cloudbees.com/blog/git-diff-a-complete-comparison-tutorial-for-git
- https://github.com/aymanbagabas/go-udiff

Constants

This section is empty.

Variables

This section is empty.

Functions

func ColorizeHunks

func ColorizeHunks(ctx context.Context, w io.Writer, clrs *Colors, hunks io.Reader) error

ColorizeHunks prints colorized diff hunks to w. The reader must not include the diff header. That is, it should not include:

sq diff --data @diff/sakila_a.actor @diff/sakila_b.actor
--- @diff/sakila_a.actor
+++ @diff/sakila_b.actor

Instead, hunks should contain one or more hunks, e.g.

@@ -2,3 +2,3 @@
 1         PENELOPE    GUINESS    2020-06-11T02:50:54Z
-2         NICK        WAHLBERG   2020-06-11T02:50:54Z
+2         NICK        BERGER     2020-06-11T02:50:54Z
 3         ED          CHASE      2020-06-11T02:50:54Z
@@ -12,3 +12,3 @@
 11        ZERO        CAGE       2020-06-11T02:50:54Z

If clrs is nil, printing is monochrome.
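
To make the input contract concrete, here is a minimal standard-library sketch of prefix-based hunk colorization. It is not the package's ColorizeHunks: the helper names (colorizeHunks, colorizeString), the boolean color switch, and the ANSI codes are all assumptions chosen for illustration, whereas the real function takes a *Colors.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"os"
	"strings"
)

// ANSI escape sequences picked for this sketch only; the package's real
// colors come from its Colors type.
const (
	ansiReset = "\x1b[0m"
	ansiCyan  = "\x1b[36m"
	ansiGreen = "\x1b[32m"
	ansiRed   = "\x1b[31m"
)

// colorizeHunks is a hypothetical helper, not the package's ColorizeHunks.
// It reads hunk text (without the "---"/"+++" header) line by line and wraps
// section, insertion, and deletion lines in ANSI colors. Context lines are
// copied unchanged. If color is false, every line is copied unchanged.
// Note: bufio.Scanner rejects lines longer than its default 64 KiB limit.
func colorizeHunks(w io.Writer, hunks io.Reader, color bool) error {
	sc := bufio.NewScanner(hunks)
	for sc.Scan() {
		line := sc.Text()
		clr := ""
		if color {
			switch {
			case strings.HasPrefix(line, "@@"):
				clr = ansiCyan
			case strings.HasPrefix(line, "+"):
				clr = ansiGreen
			case strings.HasPrefix(line, "-"):
				clr = ansiRed
			}
		}
		if clr != "" {
			line = clr + line + ansiReset
		}
		if _, err := fmt.Fprintln(w, line); err != nil {
			return err
		}
	}
	return sc.Err()
}

// colorizeString is a convenience wrapper that returns the output as a string.
func colorizeString(hunks string, color bool) string {
	var sb strings.Builder
	_ = colorizeHunks(&sb, strings.NewReader(hunks), color)
	return sb.String()
}

func main() {
	hunks := "@@ -2,3 +2,3 @@\n 1 PENELOPE\n-2 WAHLBERG\n+2 BERGER\n 3 ED\n"
	if err := colorizeHunks(os.Stdout, strings.NewReader(hunks), true); err != nil {
		fmt.Fprintln(os.Stderr, err)
	}
}
```

Because the sketch keys off the first character of each line, a "--- file" header line would be misread as a deletion. That is one plausible reason for requiring header-free input.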

func ComputeUnified

func ComputeUnified(ctx context.Context, oldLabel, newLabel string, lines int,
	before, after string,
) (string, error)

ComputeUnified encapsulates computing a unified diff.

func Execute

func Execute(ctx context.Context, w io.Writer, concurrency int, differs []*Differ) (hasDiffs bool, err error)

Execute executes differs concurrently, writing output sequentially to w.

Arg concurrency specifies the maximum number of concurrent Differ executions. Zero indicates sequential execution; a negative value indicates unbounded concurrency.

The first error encountered is returned. The hasDiffs return value is true if differences were found, and false otherwise.
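
The concurrency contract described above can be sketched with a counting semaphore. Everything here is an assumption made for illustration: the task type stands in for a Differ, and executeAll, textTask, failTask, and runToString are hypothetical names. Unlike the real package, the sketch buffers each task's output fully in memory and reports the first error in slice order rather than in time order.

```go
package main

import (
	"bytes"
	"errors"
	"fmt"
	"io"
	"os"
	"sync"
)

// task is a hypothetical stand-in for a Differ: it writes diff text to buf.
type task func(buf *bytes.Buffer) error

// textTask returns a task that emits s (empty s means "no differences").
func textTask(s string) task {
	return func(buf *bytes.Buffer) error {
		buf.WriteString(s)
		return nil
	}
}

// failTask returns a task that fails with the given message.
func failTask(msg string) task {
	return func(*bytes.Buffer) error { return errors.New(msg) }
}

// executeAll runs tasks with at most `concurrency` in flight, then writes
// each task's output to w in slice order. Zero means sequential and a
// negative value means unbounded, mirroring the documented contract.
func executeAll(w io.Writer, concurrency int, tasks []task) (hasDiffs bool, err error) {
	if len(tasks) == 0 {
		return false, nil
	}
	limit := concurrency
	switch {
	case concurrency == 0:
		limit = 1 // sequential
	case concurrency < 0:
		limit = len(tasks) // unbounded
	}

	bufs := make([]bytes.Buffer, len(tasks))
	errs := make([]error, len(tasks))
	sem := make(chan struct{}, limit) // counting semaphore
	var wg sync.WaitGroup
	for i, t := range tasks {
		sem <- struct{}{} // acquire a slot; blocks when limit is reached
		wg.Add(1)
		go func(i int, t task) {
			defer wg.Done()
			defer func() { <-sem }() // release the slot
			errs[i] = t(&bufs[i])
		}(i, t)
	}
	wg.Wait()

	// Emit results sequentially, stopping at the first failed task.
	for i := range tasks {
		if errs[i] != nil {
			return hasDiffs, errs[i]
		}
		if bufs[i].Len() > 0 {
			hasDiffs = true
		}
		if _, err = w.Write(bufs[i].Bytes()); err != nil {
			return hasDiffs, err
		}
	}
	return hasDiffs, nil
}

// runToString captures executeAll's output in a string.
func runToString(concurrency int, tasks []task) (string, bool, error) {
	var out bytes.Buffer
	has, err := executeAll(&out, concurrency, tasks)
	return out.String(), has, err
}

func main() {
	tasks := []task{textTask("--- a\n+++ b\n@@ -1 +1 @@\n-x\n+y\n"), textTask("")}
	has, err := executeAll(os.Stdout, 2, tasks)
	fmt.Println("hasDiffs:", has, "err:", err)
}
```

Each goroutine writes only to its own slot in bufs and errs, so no extra locking is needed, and output order depends on slice position rather than on completion time.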

func NewColorizer

func NewColorizer(clrs *Colors, src io.Reader) io.Reader

func Titlef

func Titlef(clrs *Colors, format string, a ...any) []byte

Titlef formats a diff command title suitable for use with NewHunkDoc or NewUnifiedDoc.

title := diffdoc.Titlef(clrs, "sq diff --data %s %s", src1.Handle, src2.Handle)

The title is colorized according to [Colors.CmdTitle] and terminates with newline.

Types

type Colors

type Colors struct {
	// CmdTitle is the color for the diff command text. That is, the text of the
	// command that effectively triggered this diff. For example:
	//
	//  diff -U3 -r ./a/hiawatha.txt ./b/hiawatha.txt
	//
	// The command text is typically only displayed when multiple diffs are
	// printed back-to-back.
	CmdTitle *color.Color

	// Header is the color for diff header elements.
	//
	//  --- @diff/sakila_a.actor
	//  +++ @diff/sakila_b.actor
	//
	// A header should appear at the top of each diff.
	Header *color.Color

	// Section is the color for diff hunk section range elements.
	//
	//  @@ -8,9 +8,9 @@
	//
	// The above is a section.
	Section *color.Color

	// SectionComment is the color for (optional) diff hunk section comments.
	//
	//  @@ -8,9 +8,9 @@ Here's some context.
	//
	// The text after the second @@ is a section comment.
	SectionComment *color.Color

	// Insertion is the color for diff insertion "+" elements.
	Insertion *color.Color

	// Deletion is the color for diff deletion "-" elements.
	Deletion *color.Color

	// Context is the color for context lines, i.e. the lines above and below
	// the insertions and deletions.
	Context *color.Color

	// ShowHeader indicates that a header (e.g. a header row) should
	// be printed where applicable.
	//
	// REVISIT: Colors.ShowHeader may not be needed.
	ShowHeader bool
	// contains filtered or unexported fields
}

Colors encapsulates diff color printing config.

func NewColors

func NewColors() *Colors

NewColors returns a Colors instance. Coloring is enabled by default.

func (*Colors) Clone

func (c *Colors) Clone() *Colors

Clone returns a clone of c.

func (*Colors) EnableColor

func (c *Colors) EnableColor(enable bool)

EnableColor enables or disables all colors.

func (*Colors) IsMonochrome

func (c *Colors) IsMonochrome() bool

IsMonochrome returns true if in monochrome (no color) mode. Default is false (color enabled) for a new instance.

type Differ

type Differ struct {
	// contains filtered or unexported fields
}

Differ encapsulates a Doc and a function that populates the Doc. Create one via NewDiffer, and then pass it to Execute.

func NewDiffer

func NewDiffer(doc Doc, fn func(ctx context.Context, cancelFn func(error))) *Differ

NewDiffer returns a new Differ that can be passed to Execute. Arg doc is the Doc to be populated, and fn populates the Doc. The cancelFn arg to fn must only be invoked in the event of an error; it must not be invoked on the happy path.
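
The rule that cancelFn is invoked only on error resembles the cancel-with-cause pattern in the standard library's context package (Go 1.20 or later). The sketch below shows that pattern in isolation. It is an assumption about how such a callback could be wired, not a description of how Execute actually invokes a Differ, and the names run, okFn, failFn, and errBoom are hypothetical.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// errBoom is a sample failure used by failFn.
var errBoom = errors.New("boom")

// run is a hypothetical runner. It hands fn a cancel function that fn must
// call only when it fails, and afterwards reports the recorded cause.
func run(fn func(ctx context.Context, cancelFn func(error))) error {
	ctx, cancel := context.WithCancelCause(context.Background())
	defer cancel(nil) // release context resources once fn has returned
	fn(ctx, cancel)
	if ctx.Err() != nil {
		// fn cancelled the context, so surface the error it supplied.
		return context.Cause(ctx)
	}
	return nil
}

// okFn is the happy path: it never touches cancelFn.
func okFn(ctx context.Context, cancelFn func(error)) {}

// failFn reports its error through cancelFn.
func failFn(ctx context.Context, cancelFn func(error)) {
	cancelFn(errBoom)
}

func main() {
	fmt.Println("ok:  ", run(okFn))
	fmt.Println("fail:", run(failFn))
}
```

In this pattern, calling cancelFn on the happy path would mark the whole run as cancelled, which is consistent with the restriction stated above.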

type Doc

type Doc interface {
	// Read provides access to the Doc's bytes. It blocks until the doc is sealed,
	// or returns a non-nil error. If the doc does not contain any diff hunks,
	// Read returns [io.EOF].
	Read(p []byte) (n int, err error)

	// Close closes the doc, releasing any resources held.
	Close() error

	// Err returns the error associated with the doc. On the happy path, Err
	// returns nil. If Err returns non-nil, a call to [Doc.Read] returns the same
	// error.
	Err() error

	// String returns the doc's title, with any colorization removed. It may be
	// empty. It exists mainly for logging and debugging.
	String() string
}

Doc is a diff document that implements io.ReadCloser. It is used to stream diff output.

type Header []byte

Header is the byte sequence for a diff doc header, as created by Headerf, and passed to NewHunkDoc.

func Headerf

func Headerf(clrs *Colors, left, right string) Header

Headerf formats a diff doc header suitable for use with NewHunkDoc.

header := diffdoc.Headerf(clrs, "@sakila_a.actor", "@sakila_b.actor")

The returned header looks something like:

--- @sakila_a.actor
+++ @sakila_b.actor

It is colorized according to [Colors.Header], and terminates with newline.

func (Header) String

func (h Header) String() string

String returns the header as a string. It may be empty. Colorization is stripped.
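
As a rough illustration of a colorized byte-slice type whose String method strips colorization, consider the sketch below. The header type, the headerf helper, the bold ANSI code, and the regular expression are assumptions for this example. The package's own Header and Headerf take a *Colors and may strip color differently.

```go
package main

import (
	"fmt"
	"regexp"
)

// header is a hypothetical analogue of the package's Header type.
type header []byte

// ansiSeq matches SGR escape sequences such as "\x1b[1m" and "\x1b[0m".
var ansiSeq = regexp.MustCompile(`\x1b\[[0-9;]*m`)

// headerf builds a two-line diff header. Each line is colorized on its own
// so that the result still terminates with a newline.
func headerf(color bool, left, right string) header {
	lines := []string{"--- " + left, "+++ " + right}
	var out []byte
	for _, ln := range lines {
		if color {
			ln = "\x1b[1m" + ln + "\x1b[0m" // bold, chosen arbitrarily
		}
		out = append(out, ln+"\n"...)
	}
	return out
}

// String returns the header text with any ANSI color sequences removed.
func (h header) String() string {
	return string(ansiSeq.ReplaceAll(h, nil))
}

func main() {
	h := headerf(true, "@sakila_a.actor", "@sakila_b.actor")
	fmt.Print(h.String())
}
```

Keeping the colorized bytes as the stored value and stripping only in String means the same value can be written to a terminal and logged as plain text.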

type Hunk

type Hunk struct {
	// contains filtered or unexported fields
}

Hunk is a diff hunk, part of a HunkDoc. It implements io.Writer and io.Reader. The hunk is written to via Hunk.Write, and then sealed via Hunk.Seal. Until sealed, Hunk.Read blocks.
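
The write-then-seal lifecycle can be modeled with a channel that is closed on seal. The sealedBuf type below is a hypothetical, simplified analogue written against the standard library only. It ignores Close, offsets, and alternative buffer strategies, and should not be read as the package's implementation of Hunk.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"sync"
)

// sealedBuf is a hypothetical analogue of a Hunk: writes accumulate in a
// body buffer, and reads block until seal is called.
type sealedBuf struct {
	mu     sync.Mutex
	body   bytes.Buffer
	sealed chan struct{} // closed by seal
	rdr    io.Reader     // header followed by body; set by seal
	err    error         // error supplied to seal, if any
}

func newSealedBuf() *sealedBuf {
	return &sealedBuf{sealed: make(chan struct{})}
}

// Write appends to the body. The sketch does not detect writes after seal.
func (b *sealedBuf) Write(p []byte) (int, error) {
	b.mu.Lock()
	defer b.mu.Unlock()
	return b.body.Write(p)
}

// seal records the header and error, then unblocks readers. A second call
// panics, because closing an already-closed channel panics.
func (b *sealedBuf) seal(header []byte, err error) {
	b.mu.Lock()
	b.err = err
	b.rdr = io.MultiReader(bytes.NewReader(header), &b.body)
	b.mu.Unlock()
	close(b.sealed)
}

// Read blocks until seal is called. It then yields the header followed by
// the body, or the error that was passed to seal.
func (b *sealedBuf) Read(p []byte) (int, error) {
	<-b.sealed
	b.mu.Lock()
	defer b.mu.Unlock()
	if b.err != nil {
		return 0, b.err
	}
	return b.rdr.Read(p)
}

// demo writes and seals from one goroutine while the caller blocks reading.
func demo() (string, error) {
	b := newSealedBuf()
	go func() {
		fmt.Fprint(b, "-old\n+new\n")
		b.seal([]byte("@@ -1 +1 @@\n"), nil)
	}()
	data, err := io.ReadAll(b) // blocks until seal is called
	return string(data), err
}

func main() {
	out, err := demo()
	fmt.Print(out)
	if err != nil {
		fmt.Println("error:", err)
	}
}
```

Closing a channel is a convenient broadcast: every blocked reader wakes at once, and later readers pass straight through.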

func (*Hunk) Close

func (h *Hunk) Close() error

Close implements io.Closer.

func (*Hunk) Err

func (h *Hunk) Err() error

Err returns any error associated with the hunk, as provided to Hunk.Seal.

func (*Hunk) Offset

func (h *Hunk) Offset() int

Offset returns the nominal offset of this hunk in its doc's body.

func (*Hunk) Read

func (h *Hunk) Read(p []byte) (n int, err error)

Read blocks until the hunk is sealed. It returns the hunk's bytes, or the non-nil error provided to Hunk.Seal. It is a programming error to invoke Read after Hunk.Close has been invoked.

func (*Hunk) Seal

func (h *Hunk) Seal(header []byte, err error)

Seal seals the hunk, indicating that it is complete. The header arg is the hunk header ("@@ ... @@"). Until the hunk is sealed, a call to Hunk.Read blocks. On the happy path, arg err is nil. It is a runtime error to invoke Seal more than once.

func (*Hunk) Write

func (h *Hunk) Write(p []byte) (n int, err error)

Write writes to the hunk body. The hunk header ("@@ ... @@") should not be written to the body; instead it should be provided via Hunk.Seal. This facilitates stream processing of hunks, because the hunk header can't be calculated until after the hunk body is generated. When writing is complete, the hunk must be sealed via Hunk.Seal, supplying the hunk header at that point.

It is a programming error to invoke Write after Hunk.Seal or Hunk.Close has been invoked.
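
To see why the header is only known once the body is complete, note that the line counts in "@@ -a,b +c,d @@" depend on every line of the body. The hunkHeader function below is a hypothetical helper that derives those counts from a finished body. The start line numbers are supplied by the caller, and the package may compute its headers differently.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// hunkHeader is a hypothetical helper that derives a unified-diff hunk
// header from a finished hunk body. oldStart and newStart are the 1-based
// starting line numbers in the old and new files.
func hunkHeader(body string, oldStart, newStart int) string {
	var oldCount, newCount int
	sc := bufio.NewScanner(strings.NewReader(body))
	for sc.Scan() {
		line := sc.Text()
		switch {
		case strings.HasPrefix(line, "\\"):
			// Marker such as "\ No newline at end of file": not counted.
		case strings.HasPrefix(line, "+"):
			newCount++ // insertion: present only in the new file
		case strings.HasPrefix(line, "-"):
			oldCount++ // deletion: present only in the old file
		default:
			// Context line: present in both files.
			oldCount++
			newCount++
		}
	}
	return fmt.Sprintf("@@ -%d,%d +%d,%d @@", oldStart, oldCount, newStart, newCount)
}

func main() {
	body := " 1 PENELOPE\n-2 WAHLBERG\n+2 BERGER\n 3 ED\n"
	fmt.Println(hunkHeader(body, 2, 2)) // prints "@@ -2,3 +2,3 @@"
}
```

A streaming producer can therefore write body lines as they are generated and supply the header at seal time, instead of buffering the body just to emit the header first.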

type HunkDoc

type HunkDoc struct {
	// contains filtered or unexported fields
}

HunkDoc is a document that consists of a series of diff hunks. It implements io.Reader, and is used to stream diff output. The hunks are added to the doc via HunkDoc.NewHunk. A call to HunkDoc.Read blocks until HunkDoc.Seal is invoked.

This may seem overly elaborate, and the design can probably be simplified, but the idea is to stream individual diff hunks as they're generated, rather than buffering the entire diff in memory. This is important for large diffs where, in theory, each hunk could be gigabytes in size. An earlier implementation of this package had an issue where it consumed 20GB+ of memory to execute a complete diff of two reasonably small databases, so this isn't a purely theoretical concern.

If the diff is only available as a block of unified diff text (as opposed to a sequence of hunks), instead use UnifiedDoc.

func NewHunkDoc

func NewHunkDoc(title Title, header []byte, opts ...Opt) *HunkDoc

NewHunkDoc returns a new HunkDoc with the given title and header. The values should be previously colorized if desired. The title may be empty. The header can be generated with Headerf. If non-empty, both title and header should be terminated with a newline. The returned HunkDoc is not sealed; thus a call to HunkDoc.Read blocks until HunkDoc.Seal is invoked.

func (*HunkDoc) Close

func (d *HunkDoc) Close() error

Close implements io.Closer. It is a programming error to use the HunkDoc after it has been closed.

func (*HunkDoc) Err

func (d *HunkDoc) Err() error

Err returns the error associated with the doc, as provided to HunkDoc.Seal. The same non-nil error is returned by a call to HunkDoc.Read.

func (*HunkDoc) NewHunk

func (d *HunkDoc) NewHunk(offset int) (*Hunk, error)

NewHunk returns a new hunk, where offset is the nominal line number in the unified diff that this hunk would be part of. The returned hunk is not sealed, and any call to Hunk.Read will block until Hunk.Seal is invoked.

func (*HunkDoc) Read

func (d *HunkDoc) Read(p []byte) (n int, err error)

Read provides access to the doc's bytes. It blocks until the doc is sealed, or returns the non-nil error provided to HunkDoc.Seal. If the doc does not contain any diff hunks, Read returns io.EOF.

func (*HunkDoc) Seal

func (d *HunkDoc) Seal(err error)

Seal seals the doc, indicating that it is complete. Until it is sealed, a call to HunkDoc.Read blocks. On the happy path, arg err is nil. If err is non-nil, a call to HunkDoc.Read returns an error. Seal panics if called more than once.

func (*HunkDoc) String

func (d *HunkDoc) String() string

String returns the doc's title as a string, with any colorization removed. It may be empty.

type Opt

type Opt interface {
	// contains filtered or unexported methods
}

Opt is an option for configuring a Doc.

func OptBufferFactory

func OptBufferFactory(f func() ioz.Buffer) Opt

OptBufferFactory returns an Opt that sets the buffer factory for a Doc. This permits the use of alternative buffering strategies, such as file-backed buffers for large files.
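
The option mechanism can be pictured as an interface with an unexported method, which is how the Opt type above is declared. The sketch below is an assumption-laden model. The buffer interface stands in for ioz.Buffer, whose method set is not shown in this documentation, and doc, opt, optBufferFactory, newDoc, and newMemBuffer are hypothetical names.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
)

// buffer is a hypothetical stand-in for ioz.Buffer.
type buffer interface {
	io.ReadWriter
}

// newMemBuffer is the default factory: a plain in-memory buffer.
func newMemBuffer() buffer { return &bytes.Buffer{} }

// doc is a hypothetical document that obtains its buffers from a factory.
type doc struct {
	newBuf func() buffer
}

// opt configures a doc. The unexported method keeps the set of options
// closed to this package.
type opt interface {
	apply(d *doc)
}

// bufferFactoryOpt adapts a factory function to the opt interface.
type bufferFactoryOpt func() buffer

func (f bufferFactoryOpt) apply(d *doc) { d.newBuf = f }

// optBufferFactory returns an opt that replaces the doc's buffer factory.
func optBufferFactory(f func() buffer) opt { return bufferFactoryOpt(f) }

// newDoc applies opts over the in-memory default.
func newDoc(opts ...opt) *doc {
	d := &doc{newBuf: newMemBuffer}
	for _, o := range opts {
		o.apply(d)
	}
	return d
}

func main() {
	calls := 0
	d := newDoc(optBufferFactory(func() buffer {
		calls++ // a real factory might return a file-backed buffer here
		return newMemBuffer()
	}))
	buf := d.newBuf()
	fmt.Fprint(buf, "+added line\n")
	fmt.Println("factory calls:", calls)
}
```

A file-backed factory would slot in at the marked line without any change to the code that consumes the buffer.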

type Title

type Title []byte

Title is the byte sequence for a diff command title, as created by Titlef, and passed to NewUnifiedDoc or NewHunkDoc.

func (Title) String

func (t Title) String() string

String returns the title as a string. It may be empty. Colorization is stripped.

type UnifiedDoc

type UnifiedDoc struct {
	// contains filtered or unexported fields
}

UnifiedDoc is a diff Doc that consists of a single unified diff body (although that body may contain multiple hunks). It exists as a bridge to legacy code that generates unified diff output as a single block of text.

See also: HunkDoc.

func NewUnifiedDoc

func NewUnifiedDoc(cmdTitle Title, opts ...Opt) *UnifiedDoc

NewUnifiedDoc returns a new UnifiedDoc with the given title. The title may be empty. The diff body should be written to via UnifiedDoc.Write, and then the doc should be sealed via UnifiedDoc.Seal.

func (*UnifiedDoc) Close

func (d *UnifiedDoc) Close() error

Close implements io.Closer.

func (*UnifiedDoc) Err

func (d *UnifiedDoc) Err() error

Err returns the error associated with the doc, as provided to UnifiedDoc.Seal. The same non-nil error is returned by a call to UnifiedDoc.Read.

func (*UnifiedDoc) Read

func (d *UnifiedDoc) Read(p []byte) (n int, err error)

Read provides access to the doc's bytes. It blocks until the doc is sealed, or returns the non-nil error provided to UnifiedDoc.Seal. If the doc does not contain any diff hunks, Read returns io.EOF.

func (*UnifiedDoc) Seal

func (d *UnifiedDoc) Seal(err error)

Seal seals the doc, indicating that it is complete. Until it is sealed, a call to UnifiedDoc.Read will block. On the happy path, arg err is nil. If err is non-nil, a call to UnifiedDoc.Read will return that error. Seal panics if called more than once.

func (*UnifiedDoc) String

func (d *UnifiedDoc) String() string

String returns the doc's title as a string, with any colorization removed. It may be empty.

func (*UnifiedDoc) Write

func (d *UnifiedDoc) Write(p []byte) (n int, err error)

Write writes to the doc body. The bytes are returned (without additional processing) by UnifiedDoc.Read, so any colorization etc. must occur before writing. When writing is complete, the doc must be sealed via UnifiedDoc.Seal. It is a programming error to invoke UnifiedDoc.Write after UnifiedDoc.Seal has been invoked.

Directories

Path	Synopsis
internal/go-udiff	Package udiff computes differences between text files or strings.
internal/go-udiff/difftest	Package difftest supplies a set of tests that will operate on any implementation of a diff algorithm as exposed by diff "github.com/neilotoole/sq/libsq/core/diffdoc/internal/go-udiff"
internal/go-udiff/lcs	package lcs contains code to find longest-common-subsequences (and diffs)
internal/go-udiff/myers	Package myers implements the Myers diff algorithm.
