Neural models for morphological generation, analysis and lemmatization in 22 languages

dc.contributor.affiliationUniversity of Helsinki - Hämäläinen, Mika
dc.contributor.affiliationUniversity of Helsinki - Partanen, Niko
dc.contributor.affiliationUniversity of Helsinki - Rueter, Jack
dc.contributor.affiliationUniversity of Helsinki - Alnajjar, Khalid
dc.contributor.authorHämäläinen, Mika
dc.contributor.authorPartanen, Niko
dc.contributor.authorRueter, Jack
dc.contributor.authorAlnajjar, Khalid
dc.date.accessioned2025-03-24T15:11:12Z
dc.date.issued2020-07-01
dc.descriptionMorphological models for generation, lemmatization and analysis in 22 languages. The models are trained with OpenNMT-py (https://github.com/OpenNMT/OpenNMT-py). Feed the models one word at a time, split into space-separated characters (kissa -> k i s s a). Supported languages: German (deu), Kven (fkv), Komi-Zyrian (kpv), Moksha (mdf), Mansi (mns), Erzya (myv), Norwegian Bokmål (nob), Russian (rus), South Sami (sma), Lule Sami (smj), Skolt Sami (sms), Võro (vro), Finnish (fin), Komi-Permyak (koi), Latvian (lav), Eastern Mari (mhr), Western Mari (mrj), Namonuito (nmt), Olonets-Karelian (olo), Pite Sami (sje), Northern Sami (sme), Inari Sami (smn) and Udmurt (udm). Cite: Hämäläinen, M., Partanen, N., Rueter, J., & Alnajjar, K. (2021). Neural Morphology Dataset and Models for Multiple Languages, from the Large to the Endangered. In Proceedings of the 23rd Nordic Conference on Computational Linguistics (NoDaLiDa 2021).
dc.identifierhttps://doi.org/10.5281/zenodo.3926769
dc.identifier.urihttps://hydatakatalogi-test-24.it.helsinki.fi/handle/123456789/9013
dc.rightsOpen
dc.rights.licensecc-by-4.0
dc.subjectmorphology
dc.subjectfst
dc.subjectendangered languages
dc.subjectneural models
dc.titleNeural models for morphological generation, analysis and lemmatization in 22 languages
dc.typedataset
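
The description asks for input one word at a time, split into characters (kissa -> k i s s a). A minimal Python sketch of that preprocessing step follows. The function names `to_model_input` and `prepare_lines` are hypothetical and not part of the dataset or of OpenNMT-py, and the sketch assumes that each Unicode code point counts as one character and that the model reads one word per line; the record itself does not confirm either detail.

```python
from typing import Iterable, List


def to_model_input(word: str) -> str:
    """Split one word into space-separated characters.

    Hypothetical helper: assumes each Unicode code point is one
    input symbol, as in the record's example (kissa -> k i s s a).
    """
    return " ".join(word.strip())


def prepare_lines(words: Iterable[str]) -> List[str]:
    """Turn words into one character-split line per word.

    Assumption: the model is fed one word per line. Empty or
    whitespace-only entries are skipped.
    """
    return [to_model_input(w) for w in words if w.strip()]


if __name__ == "__main__":
    for line in prepare_lines(["kissa", "koira"]):
        print(line)
```

The resulting lines could be written to a plain-text source file and passed to the OpenNMT-py translation tool together with the model for the chosen language and task.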