scraper

package
v0.1.0
Warning

This package is not in the latest version of its module.

Published: Jul 30, 2024 License: AGPL-3.0 Imports: 14 Imported by: 0

Documentation

Constants

const COVER = "cover"
const METAFILE = "metadata.nfo"
const TAG_R18 = "18禁"
const TAG_VOICEASMR = "ボイス・ASMR"

Variables

var (
	ErrInvalid  = fmt.Errorf("invalid metadada file")
	ErrExists   = fmt.Errorf("metadata file already exists")
	ErrNotFound = fmt.Errorf("work info not found in site, it may be deleted")
	ErrNoNumber = fmt.Errorf("no number can be found")
)

Functions

func GetNumberFromFilename

func GetNumberFromFilename(regexp *regexp.Regexp, filename string) string

regexp must contain a named subgroup called "number".

func GetRename

func GetRename(originalname string, metadata *Metadata) (canonicalName string, shouldRename bool)

GetRename returns the canonical (standard) name for originalname and reports whether a rename is needed.

func Register

func Register(scraper *Scraper)

func WriteMetadata

func WriteMetadata(metafile string, metadata *Metadata) error

Types

type Metadata

type Metadata struct {
	Title                  string   `yaml:"title,omitempty" json:"title,omitempty"`
	Author                 string   `yaml:"author,omitempty" json:"author,omitempty"`
	Series                 string   `yaml:"series name,omitempty" json:"series,omitempty"`
	YamlNarrator           string   `yaml:"narrator,omitempty" json:"yaml_narrator,omitempty"`
	YamlTags               string   `yaml:"tags,omitempty" json:"yaml_tags,omitempty"`
	Number                 string   `yaml:"number,omitempty" json:"number,omitempty"`
	YamlOtherEditionNumber string   `yaml:"other edition number,omitempty"`
	Date                   string   `yaml:"date,omitempty" json:"date,omitempty"`
	Source                 string   `yaml:"source,omitempty" json:"source,omitempty"`
	GeneratedBy            string   `yaml:"generated by,omitempty" json:"generated_by,omitempty"`
	Narrator               []string `yaml:"-" json:"narrator,omitempty"`
	Tags                   []string `yaml:"-" json:"tags,omitempty"`
	OtherEditionNumber     []string `yaml:"-" json:"other_edition_number,omitempty"`
	Text                   string   `yaml:"-" json:"text,omitempty"`               // Full text. Must not have leading or trailing whitespace.
	Files                  []string `yaml:"-" json:"files,omitempty"`              // additional meta files saved in tmpdir.
	CanonicalFilename      string   `yaml:"-" json:"canonical_filename,omitempty"` // If empty, falls back to metadata.GetCanonicalFilename()
	ShouldRename           bool     `yaml:"-" json:"should_rename,omitempty"`      // Indicates that the content dir should be renamed to the canonical filename
}

Metadata is stored in the metadata.nfo header as YAML front matter (see https://jekyllrb.com/docs/front-matter/). Title is required; all other fields are optional. Every array-type field must consist of unique, non-empty items, and Tags should be sorted in lexical order. For now, array-type fields are written to metadata.nfo as CSV-style strings rather than strict YAML arrays, for compatibility with other programs.
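The rules above (unique, non-empty items, sorted tags, CSV-style strings) can be sketched roughly as follows. The helpers normalize and frontMatter are hypothetical names for this example, and the ", " separator and the quoting are assumptions; the actual output of WriteMetadata may differ.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// normalize drops empty and duplicate items and sorts the rest lexically.
func normalize(items []string) []string {
	seen := make(map[string]bool)
	out := []string{}
	for _, it := range items {
		it = strings.TrimSpace(it)
		if it == "" || seen[it] {
			continue
		}
		seen[it] = true
		out = append(out, it)
	}
	sort.Strings(out)
	return out
}

// frontMatter renders a hypothetical metadata.nfo header. Tags are joined
// into one CSV-style string instead of a YAML array. The ", " separator
// and Go-style %q quoting are assumptions made for this sketch.
func frontMatter(title string, tags []string) string {
	var b strings.Builder
	b.WriteString("---\n")
	fmt.Fprintf(&b, "title: %q\n", title)
	if t := normalize(tags); len(t) > 0 {
		fmt.Fprintf(&b, "tags: %q\n", strings.Join(t, ", "))
	}
	b.WriteString("---\n")
	return b.String()
}

func main() {
	fmt.Print(frontMatter("Sample Work", []string{"rain", "", "asmr", "rain"}))
}
```

Writing a single string keeps the header readable by tools that expect plain scalar values, at the cost of the reader having to split the string again.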

func ReadMetadata

func ReadMetadata(metafile string) (metadata *Metadata, err error)

func (*Metadata) GetCanonicalFilename

func (m *Metadata) GetCanonicalFilename() (filename string)

GetCanonicalFilename returns the canonical filename of the resource. The default name is "[number][author]title"; the title is truncated if it is too long.
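A minimal sketch of the "[number][author]title" layout follows. The function canonicalName and the limit maxTitleRunes are assumptions for illustration only: the package's real cutoff and its handling of empty fields are not documented here.

```go
package main

import "fmt"

// maxTitleRunes is an assumed limit chosen for this example; the
// package's actual truncation length is not stated in its documentation.
const maxTitleRunes = 10

// canonicalName is a hypothetical helper that builds "[number][author]title".
// It truncates by runes rather than bytes so multi-byte characters
// (for example Japanese titles) are never cut in half.
func canonicalName(number, author, title string) string {
	r := []rune(title)
	if len(r) > maxTitleRunes {
		r = r[:maxTitleRunes]
	}
	return fmt.Sprintf("[%s][%s]%s", number, author, string(r))
}

func main() {
	fmt.Println(canonicalName("RJ000001", "Some Circle", "A very long title"))
	// Prints: [RJ000001][Some Circle]A very lon
}
```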

type Scraper

type Scraper struct {
	Name    string
	Version string
	Pre     func(filename string) bool
	Do      func(filename string, tmpdir string) (*Metadata, error)
}
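To show how the Pre and Do fields might fit together, here is a self-contained sketch that mirrors the struct locally instead of importing the package. The reduced Metadata type, the constructor newExampleScraper, and the reading of Pre as a quick applicability check on the filename are all assumptions; the documentation does not state how Pre is used.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// Metadata is a reduced local mirror used only for this sketch.
type Metadata struct{ Title, Source string }

// Scraper mirrors the documented struct shape.
type Scraper struct {
	Name, Version string
	Pre           func(filename string) bool
	Do            func(filename string, tmpdir string) (*Metadata, error)
}

var errNoNumber = errors.New("no number can be found")

// newExampleScraper builds a hypothetical scraper. Pre is assumed to be a
// cheap check of whether this scraper applies to the given filename.
func newExampleScraper() *Scraper {
	return &Scraper{
		Name:    "example",
		Version: "0.0.1",
		Pre: func(filename string) bool {
			return strings.HasPrefix(filename, "RJ")
		},
		Do: func(filename, tmpdir string) (*Metadata, error) {
			if filename == "" {
				return nil, errNoNumber
			}
			// A real scraper would fetch remote data here and may
			// save extra files under tmpdir.
			return &Metadata{Title: filename, Source: "example"}, nil
		},
	}
}

func main() {
	s := newExampleScraper()
	if s.Pre("RJ000001") {
		m, err := s.Do("RJ000001", "/tmp/example")
		if err != nil {
			fmt.Println("scrape failed:", err)
			return
		}
		fmt.Println(m.Title, m.Source)
	}
}
```

In the real package, a value like this would presumably be passed to Register so that NewScrapers can look it up by name.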

type Scrapes

type Scrapes []*Scraper

func NewScrapers

func NewScrapers(names ...string) (Scrapes, error)

func (Scrapes) Scrape

func (s Scrapes) Scrape(dirname string, tmpdir string, force bool) (*Metadata, error)

Directories

Path Synopsis
Scrape dlsite asmr meta info from https://asmr.one/works
https://hentaicovid.com/index.php/voices-asmr/
https://hvdb.me/Dashboard/Details/01201812 Pure-english.
