[WIP] Add `dico lemmatize` subcommand (!20)

Medina Cardenas, Lorena Giovanna requested to merge dico_lemmatize into master Mar 30, 2021

Description

This merge request implements the `dico lemmatize` subcommand.

More precisely, the subcommand adds tokens to the definition of a word. The input is a minimal JSON file and the output is a JSON file in the lemmatized format. The subcommand uses spaCy to tokenize each definition text into words, punctuation and so on, and then to get the lemma of each word.
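As a rough illustration of the tokenize-then-lemmatize step, here is a minimal sketch. It is not the code in this merge request: the entry keys (`word`, `definition`, `tokens`) and the helper names `lemmatize_entry` and `load_pipeline` are assumptions made for the example, and the actual minimal and lemmatized JSON formats may differ. Only `spacy.load`, `token.text`, `token.lemma_` and `token.is_punct` are real spaCy API.

```python
def lemmatize_entry(entry, nlp):
    """Return a copy of `entry` with a token list built from its definition.

    `entry` is a dict with a hypothetical "definition" key; `nlp` is any
    callable that turns a string into an iterable of spaCy-like tokens.
    """
    doc = nlp(entry["definition"])
    tokens = [
        # One record per token: surface form, lemma, and a punctuation flag.
        {"text": tok.text, "lemma": tok.lemma_, "is_punct": tok.is_punct}
        for tok in doc
    ]
    # Keep the original fields and add the hypothetical "tokens" field.
    return {**entry, "tokens": tokens}


def load_pipeline(model="en_core_web_sm"):
    """Load a spaCy pipeline; requires spaCy and the model to be installed."""
    import spacy  # imported lazily so the function above has no hard dependency

    return spacy.load(model)
```

With spaCy and the `en_core_web_sm` model installed, `lemmatize_entry(entry, load_pipeline())` would return the entry with one record per token of its definition. Passing the pipeline in as an argument keeps the lemmatization logic independent of which library or model is selected.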

Example

bind/dico -i sample/json/small_minimal.json -o file.json --library spacy --model en_core_web_sm

Checklist TODO

Edited Mar 30, 2021 by Medina Cardenas, Lorena Giovanna
