Universal Dependencies


Universal Dependencies (UD) is an effort to create a single, cross-linguistically consistent treebank annotation scheme that covers many languages and supports the development of multilingual parsers. Its inventory of grammatical dependency relations is meant to be universal, and the scheme builds on established approaches: Stanford Dependencies, Google’s universal part-of-speech tags, and the Interset interlingua for morphosyntactic tagsets.
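To give a sense of what a UD annotation looks like, here is a minimal sketch in Python that reads one sentence in CoNLL-U, the ten-column, tab-separated format in which UD treebanks are distributed. The English sample sentence, the `Token` type, and the `parse_sentence` helper are illustrative assumptions; none of them comes from the Arapaho or Southern Sierra Miwok projects.

```python
from typing import List, NamedTuple

# The ten CoNLL-U columns, in order.
CONLLU_FIELDS = ("id", "form", "lemma", "upos", "xpos",
                 "feats", "head", "deprel", "deps", "misc")


class Token(NamedTuple):
    """Hypothetical container for the columns this sketch uses."""
    id: int
    form: str
    upos: str
    head: int      # 0 means the token is the root of the sentence
    deprel: str


# Illustrative English sentence, not project data.
SAMPLE = "\n".join([
    "# text = The dog barks.",
    "1\tThe\tthe\tDET\t_\t_\t2\tdet\t_\t_",
    "2\tdog\tdog\tNOUN\t_\t_\t3\tnsubj\t_\t_",
    "3\tbarks\tbark\tVERB\t_\t_\t0\troot\t_\t_",
    "4\t.\t.\tPUNCT\t_\t_\t3\tpunct\t_\t_",
])


def parse_sentence(block: str) -> List[Token]:
    """Parse one CoNLL-U sentence block into a list of tokens."""
    tokens = []
    for line in block.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines and sentence-level comments
        cols = line.split("\t")
        if len(cols) != len(CONLLU_FIELDS):
            raise ValueError(f"expected 10 columns, got {len(cols)}: {line!r}")
        row = dict(zip(CONLLU_FIELDS, cols))
        if not row["id"].isdigit():
            continue  # skip multiword-token ranges (1-2) and empty nodes (1.1)
        tokens.append(Token(int(row["id"]), row["form"], row["upos"],
                            int(row["head"]), row["deprel"]))
    return tokens


tokens = parse_sentence(SAMPLE)
for tok in tokens:
    # Token IDs are 1-based, so the head's list index is head - 1.
    head_form = "ROOT" if tok.head == 0 else tokens[tok.head - 1].form
    print(f"{tok.form} --{tok.deprel}--> {head_form}")
```

Each token points to its syntactic head through the HEAD column and names the relation in DEPREL, so an annotation project of this kind largely comes down to deciding which relation label and which head each word in the language should receive.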

When the scheme was being developed, it was fitted mainly to dominant languages with many speakers. The projects I participated in tried to apply UD to two unrelated North American languages, Arapaho and Southern Sierra Miwok. Both languages are severely threatened and have few resources. The Arapaho UD project identified a number of dependency relations found in Arapaho and cross-referenced them with those of UD. About 5,000 lines were single-annotated, that is, annotated by one annotator each. The results of this project were published in the proceedings of LAW X.

My participation in the Southern Sierra Miwok project was short-term, but before my departure our team established the annotation guidelines and began applying them to the existing texts.

2015 - 2017
