lt-pos-tagger

module
v0.0.0-...-42a358f Latest
Published: Jun 22, 2022 License: BSD-3-Clause

README

LT Part of Speech tagging service

Lithuanian Part of Speech Tagger: an easy-to-use wrapper for lex and morph. The repository implements a service wrapper for the semantikadocker.vdu.lt/v2/morph and semantikadocker.vdu.lt/lex services. Both services have fairly complex APIs; this service makes the POS tag output simple to use and understand.

It also fixes some issues with lex segmentation.

Deploy

A sample deployment is provided with Docker: example/docker-compose.yml. On Linux, start the service locally with:

   cd example 
   make up

That's it. You can start using the service:

   curl -X POST http://localhost:8092/tag -d 'Mama su kasa kasa smėlį.'

The expected output is a list of tagged tokens:

[
  {
    "type": "WORD",
    "string": "Mama",
    "mi": "Ncfsnn-",
    "lemma": "mama"
  },
  {
    "type": "SPACE",
    "string": " "
  },
  {
    "type": "WORD",
    "string": "su",
    "mi": "Sgi",
    "lemma": "su"
  },
  {
    "type": "SPACE",
    "string": " "
  },
  {
    "type": "WORD",
    "string": "kasa",
    "mi": "Ncfsin-",
    "lemma": "kasa"
  },
  {
    "type": "SPACE",
    "string": " "
  },
  {
    "type": "WORD",
    "string": "kasa",
    "mi": "Vgmp3---n--ni-",
    "lemma": "kasti"
  },
...
]

Information about the values of the mi property is available at http://corpus.vdu.lt/en/morph. The type field takes one of the following values: SPACE, SEPARATOR, SENTENCE_END, NUMBER, WORD.


Author

Airenas Vaičiūnas


License

Copyright © 2021, Airenas Vaičiūnas. Released under the 3-Clause BSD License.


Directories

Path Synopsis
cmd
internal
