backscanner

Version: v0.0.0-...-dff01ac (not the latest version of its module)
Published: Nov 24, 2024 License: Apache-2.0 Imports: 3 Imported by: 27

README

Ever needed to search a log file efficiently, starting at the end and going backward? Here's your solution.

Package backscanner provides a scanner similar to bufio.Scanner, but it reads and returns lines in reverse order, starting at a given position (which may be the end of the input) and going backward.

This library uses only the standard library; its tests depend on an external package. To install the library along with the test dependency, run:

go get -t github.com/icza/backscanner

To advance through the input, call Scanner.Line(). Each call returns the next line (the previous one in the source) as a string.

For maximum efficiency there is Scanner.LineBytes(). It returns the next line as a byte slice that shares its backing array with the internal buffer of the Scanner, so no copy of the line data is made. The trade-off is that you can only inspect or search in the slice before calling Line() or LineBytes() again, because the content of the internal buffer, and thus of any slice returned by LineBytes(), may be overwritten. If you need to retain the line data, make a copy of it or use Line().
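The aliasing described above can be illustrated without the library. The following is a minimal sketch that uses only the standard library: buf stands in for the Scanner's internal buffer, view for a slice returned by LineBytes(), and retain is a hypothetical helper that is not part of backscanner.

```go
package main

import "fmt"

// retain returns an independent copy of line.
// Hypothetical helper for illustration; not part of backscanner.
func retain(line []byte) []byte {
	return append([]byte(nil), line...)
}

func main() {
	// buf plays the role of the scanner's internal buffer,
	// view the role of a slice handed out by LineBytes().
	buf := []byte("Line3")
	view := buf[:]
	kept := retain(view)

	// A later call may reuse the buffer for another line.
	copy(buf, "Line2")

	fmt.Printf("view: %q\n", view) // view: "Line2" (overwritten)
	fmt.Printf("kept: %q\n", kept) // kept: "Line3" (independent copy)
}
```

With the real Scanner, the same copy would be taken from the slice returned by LineBytes() before the next call to Line() or LineBytes().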

Example:

input := "Line1\nLine2\nLine3"
scanner := backscanner.New(strings.NewReader(input), len(input))
for {
	line, pos, err := scanner.Line()
	if err != nil {
		fmt.Println("Error:", err)
		break
	}
	fmt.Printf("Line position: %2d, line: %q\n", pos, line)
}

Output:

Line position: 12, line: "Line3"
Line position:  6, line: "Line2"
Line position:  0, line: "Line1"
Error: EOF

Using it to efficiently scan a file and find the last line that contains a string ("error"):

file, err := os.Open("mylog.txt")
if err != nil {
	panic(err)
}
// Defer Close right after a successful Open so the file is also closed if Stat fails.
defer file.Close()

fileStatus, err := file.Stat()
if err != nil {
	panic(err)
}

scanner := backscanner.New(file, int(fileStatus.Size()))
what := []byte("error")
for {
	line, pos, err := scanner.LineBytes()
	if err != nil {
		if err == io.EOF {
			fmt.Printf("%q is not found in file.\n", what)
		} else {
			fmt.Println("Error:", err)
		}
		break
	}
	if bytes.Contains(line, what) {
		fmt.Printf("Found %q at line position: %d, line: %q\n", what, pos, line)
		break
	}
}

Documentation

Overview

Package backscanner provides a scanner similar to bufio.Scanner, but it reads and returns lines in reverse order, starting at a given position (which may be the end of the input) and going backward.

The maximum line length is configurable through the MaxBufferSize option.

Constants

const (
	// DefaultChunkSize is the default value for the ChunkSize option
	DefaultChunkSize = 1024

	// DefaultMaxBufferSize is the default value for the MaxBufferSize option
	DefaultMaxBufferSize = 1 << 20 // 1 MB
)

Variables

var (
	// ErrLongLine indicates that the line is longer than the internal buffer size
	ErrLongLine = errors.New("line too long")
)

Types

type Options

type Options struct {
	// ChunkSize specifies the size of the chunk that is read at once from the input.
	ChunkSize int

	// MaxBufferSize limits the maximum size of the buffer used internally.
	// This also limits the max line size.
	MaxBufferSize int
}

Options contains parameters that influence the internal working of the Scanner.

type Scanner

type Scanner struct {
	// contains filtered or unexported fields
}

Scanner is the back-scanner implementation.

func New

func New(r io.ReaderAt, pos int) *Scanner

New returns a new Scanner that reads from r, starting at position pos (which may be the end of the input) and going backward.

func NewOptions

func NewOptions(r io.ReaderAt, pos int, o *Options) *Scanner

NewOptions returns a new Scanner with the given Options. Invalid option values are replaced with their default values.
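The fallback to defaults can be pictured as follows. This is only a sketch of the idea, not the library's code: the options type and normalize function are local and hypothetical, and it assumes that "invalid" means a nil pointer or a non-positive value. The default values are the ones documented under Constants.

```go
package main

import "fmt"

// Defaults as documented for DefaultChunkSize and DefaultMaxBufferSize.
const (
	defaultChunkSize     = 1024
	defaultMaxBufferSize = 1 << 20
)

// options mirrors the documented Options struct for illustration only.
type options struct {
	ChunkSize     int
	MaxBufferSize int
}

// normalize returns a copy of o in which every invalid field is replaced by
// its default. Assumption: nil and non-positive values count as invalid.
func normalize(o *options) options {
	out := options{ChunkSize: defaultChunkSize, MaxBufferSize: defaultMaxBufferSize}
	if o == nil {
		return out
	}
	if o.ChunkSize > 0 {
		out.ChunkSize = o.ChunkSize
	}
	if o.MaxBufferSize > 0 {
		out.MaxBufferSize = o.MaxBufferSize
	}
	return out
}

func main() {
	fmt.Println(normalize(nil))                                        // {1024 1048576}
	fmt.Println(normalize(&options{ChunkSize: 4096}))                  // {4096 1048576}
	fmt.Println(normalize(&options{ChunkSize: -1, MaxBufferSize: 64})) // {1024 64}
}
```

Under this assumption a caller can set only the field of interest and leave the other at its zero value.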

func (*Scanner) Line

func (s *Scanner) Line() (line string, pos int, err error)

Line returns the next line from the input and its absolute byte-position. Line ending is cut from the line. Empty lines are also returned. After returning the last line (which is the first in the input), subsequent calls report io.EOF.

func (*Scanner) LineBytes

func (s *Scanner) LineBytes() (line []byte, pos int, err error)

LineBytes returns the bytes of the next line from the input and its absolute byte-position. Line ending is cut from the line. Empty lines are also returned. After returning the last line (which is the first in the input), subsequent calls report io.EOF.

This method is for efficiency if you need to inspect or search in the line. The returned line slice shares data with the internal buffer of the Scanner, and its content may be overwritten in subsequent calls to LineBytes() or Line(). If you need to retain the line data, make a copy of it or use the Line() method.
