Loflòc: A Morphological Lexicon for Occitan using Universal Dependencies
Restricted Availability
Date
2024-03-19
Publication Type
dataset
Access rights
Open
Description
LOFLOC -- Lexic obèrt flechit Occitan (Open Inflected Lexicon of Occitan)
Loflòc is a morphological lexicon for Occitan, a Romance language spoken in the south of France and in parts of Italy and Spain. Occitan is not recognized as an official language in France, and no standard variety is shared across the linguistic area. To the best of our knowledge, Loflòc is the first publicly available lexicon for Occitan. It contains 680 thousand entries for 57 thousand lemmas. Each entry consists of an inflected form, its lemma, and its part-of-speech tag following the Universal Dependencies guidelines. The lexicon currently covers only the Lengadocian variety, written in the classical spelling norm. Nevertheless, it has been shown to be useful for processing texts from other varieties as well (for details, see Vergez-Couret et al., 2024; full reference below).
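The entry structure described above (inflected form, lemma, part-of-speech tag) could be read along the following lines. This is only a minimal sketch: this record does not state the file format, so the tab-separated three-column layout is an assumption, the names `Entry`, `parse_entries` and `index_by_form` are hypothetical, and the sample rows were written for illustration and are not taken from the lexicon.

```python
from collections import defaultdict
from typing import Iterable, NamedTuple


class Entry(NamedTuple):
    """One lexicon entry: inflected form, lemma, and UD part-of-speech tag."""
    form: str
    lemma: str
    upos: str


def parse_entries(lines: Iterable[str]) -> list[Entry]:
    """Parse lines of the ASSUMED layout: form<TAB>lemma<TAB>UPOS."""
    entries = []
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        form, lemma, upos = line.split("\t")
        entries.append(Entry(form, lemma, upos))
    return entries


def index_by_form(entries: Iterable[Entry]) -> dict[str, list[Entry]]:
    """Group entries by inflected form; one form may have several analyses."""
    index = defaultdict(list)
    for entry in entries:
        index[entry.form].append(entry)
    return dict(index)


# Illustrative rows only (not copied from the lexicon).
SAMPLE = [
    "cantam\tcantar\tVERB",
    "cantas\tcantar\tVERB",
    "ostals\tostal\tNOUN",
]

entries = parse_entries(SAMPLE)
by_form = index_by_form(entries)
lemmas = {e.lemma for e in entries}
```

Indexing by form rather than by lemma reflects the typical lookup direction when tagging or lemmatizing running text; an index keyed on lemma would suit generating paradigms instead.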