alud

package module
v2.14.0
Warning

This package is not in the latest version of its module.

Published: Apr 24, 2024 | License: BSD-2-Clause | Imports: 9 | Imported by: 5

Documentation

Overview

Package alud derives Universal Dependencies from sentences parsed with Alpino.

Usually, the input is XML in the alpino_ds format.

The output is in the CoNLL-U format. Alternatively, the Universal Dependencies can be embedded into the alpino_ds format (version 1.10), which makes them available for XPath queries.

It is also possible to take a user-provided file in the CoNLL-U format and embed it into the alpino_ds format.

When empty heads are reconstructed (resulting in lines with an ID with a dot), the ID of the original line is added in the last field of the CoNLL-U format, in the form CopiedFrom=ID. This information is necessary for correct embedding into the alpino_ds format.
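
As an illustration of the CopiedFrom convention, the following is a minimal sketch of how a consumer of the output might read that value from the last (MISC) field of a CoNLL-U line. It is not part of package alud: the function name copiedFrom and the sample line are made up for this example, and the sketch assumes the standard CoNLL-U layout of ten tab-separated fields with MISC entries separated by "|".

```go
package main

import (
	"fmt"
	"strings"
)

// copiedFrom is a hypothetical helper, not part of package alud.
// It splits one CoNLL-U token line into its ten tab-separated fields and
// looks for a CopiedFrom=ID entry in the last field (MISC).
// It returns the ID of the line itself, the ID it was copied from, and
// whether a CopiedFrom entry was found.
func copiedFrom(line string) (id, orig string, ok bool) {
	fields := strings.Split(line, "\t")
	if len(fields) != 10 {
		// comment line, blank line, or malformed input
		return "", "", false
	}
	for _, kv := range strings.Split(fields[9], "|") {
		if strings.HasPrefix(kv, "CopiedFrom=") {
			return fields[0], strings.TrimPrefix(kv, "CopiedFrom="), true
		}
	}
	return fields[0], "", false
}

func main() {
	// Made-up line for an empty node (ID with a dot); not real alud output.
	line := "5.1\tleest\tlezen\tVERB\t_\t_\t_\t_\t0:root\tCopiedFrom=2"
	if id, orig, ok := copiedFrom(line); ok {
		fmt.Printf("%s was copied from %s\n", id, orig)
	}
}
```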

----

The package is based on a translation of an XQuery script written by Gosse Bouma.

See Alpino: https://www.let.rug.nl/vannoord/alp/Alpino/

See Universal Dependencies: https://universaldependencies.org/

See CoNLL-U: https://universaldependencies.org/format.html

See XQuery script: https://github.com/gossebouma/lassy2ud

Constants

const (
	OPT_DEBUG                  = 1 << iota // include debug messages in comments
	OPT_DUMMY_OUTPUT                       // include dummy output if parse fails
	OPT_NO_COMMENTS                        // don't include comments
	OPT_NO_DETOKENIZE                      // don't try to restore detokenized sentence
	OPT_NO_ENHANCED                        // skip enhanced dependencies
	OPT_NO_FIX_MISPLACED_HEADS             // don't fix misplaced heads in coordination
	OPT_NO_FIX_PUNCT                       // don't fix punctuation
	OPT_NO_METADATA                        // don't copy metadata to comments
	OPT_PANIC                              // panic on error (for development)
)

Options can be combined with a bitwise OR and passed as the last argument to Ud().
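
Because each option is a distinct bit, several of them fit into a single int. The sketch below shows the combination and a check for individual flags. To stay self-contained it declares a local copy of the constant block above; real code would use the constants exported by package alud and pass the result as the options argument of Ud().

```go
package main

import "fmt"

// Local copy of the documented constants, only so that this sketch
// compiles on its own. In real code, use the constants from package alud.
const (
	OPT_DEBUG = 1 << iota
	OPT_DUMMY_OUTPUT
	OPT_NO_COMMENTS
	OPT_NO_DETOKENIZE
	OPT_NO_ENHANCED
	OPT_NO_FIX_MISPLACED_HEADS
	OPT_NO_FIX_PUNCT
	OPT_NO_METADATA
	OPT_PANIC
)

func main() {
	// Skip comments and enhanced dependencies in one call.
	options := OPT_NO_COMMENTS | OPT_NO_ENHANCED

	fmt.Println(options)                      // 20 (4 | 16)
	fmt.Println(options&OPT_NO_ENHANCED != 0) // true: flag is set
	fmt.Println(options&OPT_DEBUG != 0)       // false: flag is not set
}
```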

Variables

This section is empty.

Functions

func Alpino

func Alpino(alpino_doc []byte, conllu, auto string) (alpino string, err error)

Insert given Universal Dependencies into alpino_ds format.

Use UD info from alpino_doc if conllu is "".

The conllu format is not checked for correctness. Garbage in, garbage out.

The value from auto is copied to the output.

func Ud

func Ud(alpino_doc []byte, filename, sentid string, options int) (conllu string, err error)

Derive Universal Dependencies from parsed sentence in alpino_ds format.

If sentid is "" it is derived from the filename.

func UdAlpino

func UdAlpino(alpino_doc []byte, filename, sentid string) (alpino string, err error)

Derive Universal Dependencies and insert into alpino_ds format.

If sentid is "" it is derived from the filename.

When err is not nil and alpino is not "", alpino contains the error embedded in the alpino_ds format.
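
A caller therefore has three cases to distinguish. The following is a minimal sketch of one way to handle them, not the package's own code. The helper convert and both stub functions are hypothetical, and the stubs only imitate the documented return values: a real program would pass alud.UdAlpino instead, and the placeholder strings do not reflect actual output.

```go
package main

import (
	"errors"
	"fmt"
)

// udAlpinoFunc mirrors the documented signature of UdAlpino, so the real
// function could be passed where a stub is used below.
type udAlpinoFunc func(alpinoDoc []byte, filename, sentid string) (alpino string, err error)

// convert is a hypothetical helper. It reports whether the returned
// document is worth keeping: on success, or when the error has been
// embedded in a non-empty document.
func convert(f udAlpinoFunc, doc []byte, filename string) (out string, keep bool, err error) {
	out, err = f(doc, filename, "") // empty sentid: derived from the filename
	return out, out != "", err
}

// failingStub imitates a failed conversion that still returns a document.
// The string is a placeholder, not real alud output.
func failingStub(doc []byte, filename, sentid string) (string, error) {
	return "<alpino_ds>placeholder with embedded error</alpino_ds>", errors.New("conversion failed")
}

// emptyStub imitates a failure that returns no document at all.
func emptyStub(doc []byte, filename, sentid string) (string, error) {
	return "", errors.New("cannot read input")
}

func main() {
	for _, f := range []udAlpinoFunc{failingStub, emptyStub} {
		out, keep, err := convert(f, []byte("<alpino_ds/>"), "example.xml")
		switch {
		case err == nil:
			fmt.Println("ok:", out)
		case keep:
			fmt.Println("failed, but the document records the error:", err)
		default:
			fmt.Println("failed, nothing to store:", err)
		}
	}
}
```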

func VersionID

func VersionID() string

VersionID returns the version ID string.

Types

type Data

type Data struct {
	Name string `xml:"name,attr,omitempty"`
	Data string `xml:",chardata"`
}
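
The documentation does not describe how Data is used, but the struct tags show how encoding/xml treats it: Name becomes an optional name attribute and Data becomes the character data of the element. The sketch below marshals a local copy of the type to show that mapping. The values "genre" and "news" are made up, and the element name Data is simply what encoding/xml derives from the type name when the value is marshalled on its own; inside package alud it may be set by an enclosing struct.

```go
package main

import (
	"encoding/xml"
	"fmt"
)

// Local copy of the documented type, so the sketch compiles on its own.
type Data struct {
	Name string `xml:"name,attr,omitempty"`
	Data string `xml:",chardata"`
}

// render marshals d and returns the XML as a string.
func render(d Data) (string, error) {
	b, err := xml.Marshal(d)
	return string(b), err
}

func main() {
	examples := []Data{
		{Name: "genre", Data: "news"}, // with name attribute
		{Data: "news"},                // empty Name is omitted
	}
	for _, d := range examples {
		s, err := render(d)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Println(s)
	}
	// Prints:
	// <Data name="genre">news</Data>
	// <Data>news</Data>
}
```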

Directories

Path            Synopsis
cmd/alud-dact   (module)
internal
