Neural models for morphological generation, analysis and lemmatization in 22 languages
dc.contributor.affiliation | University of Helsinki - Hämäläinen, Mika | |
dc.contributor.affiliation | University of Helsinki - Partanen, Niko | |
dc.contributor.affiliation | University of Helsinki - Rueter, Jack | |
dc.contributor.affiliation | University of Helsinki - Alnajjar, Khalid | |
dc.contributor.author | Hämäläinen, Mika | |
dc.contributor.author | Partanen, Niko | |
dc.contributor.author | Rueter, Jack | |
dc.contributor.author | Alnajjar, Khalid | |
dc.date.accessioned | 2025-03-24T15:11:12Z | |
dc.date.issued | 2020-07-01 | |
dc.description | Morphological models for generation, lemmatization and analysis in 22 languages. The models are trained with OpenNMT-py (https://github.com/OpenNMT/OpenNMT-py). Feed the models one word at a time, split into space-separated characters (kissa -> k i s s a). Supported languages: German (deu), Kven (fkv), Komi-Zyrian (kpv), Moksha (mdf), Mansi (mns), Erzya (myv), Norwegian Bokmål (nob), Russian (rus), South Sami (sma), Lule Sami (smj), Skolt Sami (sms), Võro (vro), Finnish (fin), Komi-Permyak (koi), Latvian (lav), Eastern Mari (mhr), Western Mari (mrj), Namonuito (nmt), Olonets-Karelian (olo), Pite Sami (sje), Northern Sami (sme), Inari Sami (smn) and Udmurt (udm). Cite: Hämäläinen, M., Partanen, N., Rueter, J., & Alnajjar, K. (2021). Neural Morphology Dataset and Models for Multiple Languages, from the Large to the Endangered. In Proceedings of the 23rd Nordic Conference on Computational Linguistics (NoDaLiDa 2021). | |
dc.identifier | https://doi.org/10.5281/zenodo.3926769 | |
dc.identifier.uri | https://hydatakatalogi-test-24.it.helsinki.fi/handle/123456789/9013 | |
dc.rights | Open | |
dc.rights.license | cc-by-4.0 | |
dc.subject | morphology | |
dc.subject | fst | |
dc.subject | endangered languages | |
dc.subject | neural models | |
dc.title | Neural models for morphological generation, analysis and lemmatization in 22 languages | |
dc.type | dataset |
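
The description asks for input as one word at a time, split into characters (kissa -> k i s s a). A minimal Python sketch of that preprocessing step follows. The function name `to_model_input` is hypothetical and not part of the dataset or of OpenNMT-py, and the record does not say whether generation or analysis inputs need anything beyond the character split (such as morphological tags), so the sketch covers the split only.

```python
def to_model_input(word: str) -> str:
    """Split a single word into space-separated characters.

    Hypothetical helper: it only mirrors the 'kissa -> k i s s a'
    example given in the dataset description.
    """
    word = word.strip()
    # The description says to feed one word at a time, so reject
    # empty input and anything that contains internal whitespace.
    if not word or any(ch.isspace() for ch in word):
        raise ValueError("expected exactly one word without whitespace")
    # Joining a string with " " puts a space between its characters.
    return " ".join(word)


if __name__ == "__main__":
    print(to_model_input("kissa"))
```

Each resulting line can then be written to a plain text file, one word per line, before it is passed to the model.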