doi

package
v0.0.0-...-efee6a8 Latest
Warning

This package is not in the latest version of its module.

Published: Jun 24, 2022 License: MIT Imports: 7 Imported by: 0

Documentation

Overview

Package doi helps to find DOIs in JSON documents.


Constants

const PatDOI = "10[.][0-9]{2,6}/[^ \"\u001f\u001e]{3,}"
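As a minimal sketch of how the pattern behaves, the constant can be compiled with the standard regexp package and applied to a string. The helper name findDOI and the sample input are assumptions made for illustration; they are not part of this package.

```go
package main

import (
	"fmt"
	"regexp"
)

// PatDOI is copied verbatim from the package constant.
const PatDOI = "10[.][0-9]{2,6}/[^ \"\u001f\u001e]{3,}"

// reDOI is compiled once and reused for every lookup.
var reDOI = regexp.MustCompile(PatDOI)

// findDOI returns the first substring of s that matches PatDOI, or ""
// if there is none. Hypothetical helper, not part of the package.
func findDOI(s string) string {
	return reDOI.FindString(s)
}

func main() {
	fmt.Println(findDOI(`{"url": "https://doi.org/10.1000/xyz123"}`))
	fmt.Println(findDOI("no identifier here") == "")
}
```

A match ends at the first space, double quote, or U+001F/U+001E separator, so trailing punctuation such as a period remains part of the matched string.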

Variables

This section is empty.

Functions

This section is empty.

Types

type MapSniffer

type MapSniffer struct {
	Pattern    *regexp.Regexp
	IgnoreKeys []*regexp.Regexp
}

MapSniffer tries to find values in a map that match Pattern.

func (*MapSniffer) SearchMap

func (s *MapSniffer) SearchMap(doc map[string]interface{}) []string
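The implementation of SearchMap is not shown on this page. The following is only a sketch of one way such a search could work, assuming that it walks nested values recursively and skips keys matching any ignore pattern. The names searchValues, ignored, doiPattern and ignoreKeys, as well as the sample document, are hypothetical.

```go
package main

import (
	"fmt"
	"regexp"
	"sort"
)

// doiPattern uses the same expression as PatDOI.
var doiPattern = regexp.MustCompile("10[.][0-9]{2,6}/[^ \"\u001f\u001e]{3,}")

// ignoreKeys is a sample ignore list; the key name "fulltext" is an assumption.
var ignoreKeys = []*regexp.Regexp{regexp.MustCompile(`^fulltext$`)}

// ignored reports whether key matches any of the given patterns.
func ignored(key string, patterns []*regexp.Regexp) bool {
	for _, re := range patterns {
		if re.MatchString(key) {
			return true
		}
	}
	return false
}

// searchValues is a hypothetical stand-in for MapSniffer.SearchMap. It
// collects all pattern matches from string values, descending into nested
// maps and slices, and returns them sorted (duplicates are kept).
func searchValues(doc map[string]interface{}, pattern *regexp.Regexp, skip []*regexp.Regexp) []string {
	var found []string
	var walk func(v interface{})
	walk = func(v interface{}) {
		switch t := v.(type) {
		case string:
			found = append(found, pattern.FindAllString(t, -1)...)
		case []interface{}:
			for _, item := range t {
				walk(item)
			}
		case map[string]interface{}:
			for k, item := range t {
				if ignored(k, skip) {
					continue
				}
				walk(item)
			}
		}
	}
	walk(doc)
	sort.Strings(found) // map iteration order is random, so sort for stable output
	return found
}

func main() {
	doc := map[string]interface{}{
		"id":       "x-1",
		"url":      []interface{}{"https://doi.org/10.1000/abc123"},
		"fulltext": "cites 10.9999/ignored-one",
		"meta":     map[string]interface{}{"note": "see 10.2000/def456 for details"},
	}
	fmt.Println(searchValues(doc, doiPattern, ignoreKeys))
}
```

Sorting the result is a choice made for this sketch so that the output does not depend on map iteration order; whether the real method orders or deduplicates its results is not stated here.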

type Sniffer

type Sniffer struct {
	Reader         io.Reader
	Writer         io.Writer
	MapSniffer     *MapSniffer
	IdentifierKey  string
	UpdateKey      string // if set, update the document in place
	SkipUnmatched  bool
	ForceOverwrite bool // if set, overwrite existing values in "UpdateKey"
	PostProcess    func(s string) string
	BatchSize      int
	NumWorkers     int
}

Sniffer can read, transform and write a stream of newline delimited JSON documents.

func NewSniffer

func NewSniffer(r io.Reader, w io.Writer) *Sniffer

NewSniffer sets up a new sniffer with default keys matching the current SOLR schema. It can process around 20K docs/s.

func (*Sniffer) Run

func (s *Sniffer) Run() error

Run sniffs out DOIs and, if UpdateKey is set, updates each document in place.
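The body of Run is not shown on this page. As a rough, sequential sketch of the read, transform and write cycle over newline delimited JSON, the following standalone program searches each raw line for a DOI and stores the first match under an update key. The function names annotate and annotateString, the key name "doi", and the choice to search the raw line are assumptions; batching, workers, SkipUnmatched and PostProcess are left out.

```go
package main

import (
	"bufio"
	"encoding/json"
	"fmt"
	"io"
	"regexp"
	"strings"
)

// doiPattern uses the same expression as PatDOI.
var doiPattern = regexp.MustCompile("10[.][0-9]{2,6}/[^ \"\u001f\u001e]{3,}")

// annotate reads one JSON document per line from r, stores the first DOI
// found in the raw line under updateKey, and writes every document to w.
// An existing value under updateKey is kept unless forceOverwrite is true.
func annotate(r io.Reader, w io.Writer, updateKey string, forceOverwrite bool) error {
	sc := bufio.NewScanner(r)
	sc.Buffer(make([]byte, 0, 64*1024), 16*1024*1024) // allow long lines
	enc := json.NewEncoder(w)
	for sc.Scan() {
		line := sc.Text()
		if strings.TrimSpace(line) == "" {
			continue // skip blank lines
		}
		var doc map[string]interface{}
		if err := json.Unmarshal([]byte(line), &doc); err != nil {
			return err
		}
		_, exists := doc[updateKey]
		if match := doiPattern.FindString(line); match != "" && (forceOverwrite || !exists) {
			doc[updateKey] = match
		}
		if err := enc.Encode(doc); err != nil {
			return err
		}
	}
	return sc.Err()
}

// annotateString is a convenience wrapper around annotate for string input.
func annotateString(input, updateKey string, forceOverwrite bool) (string, error) {
	var sb strings.Builder
	err := annotate(strings.NewReader(input), &sb, updateKey, forceOverwrite)
	return sb.String(), err
}

func main() {
	input := `{"id":"a","url":"https://doi.org/10.1000/abc123"}` + "\n" +
		`{"id":"b","title":"no identifier"}` + "\n"
	out, err := annotateString(input, "doi", false)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Print(out)
}
```

Because the documents are decoded into maps and re-encoded, keys come out in sorted order, which may differ from the order in the input.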
