By Jevalene C. Delos Reyes
Department of Linguistics
University of the Philippines-Diliman
Paper presented at the 2nd Philippine Conference Workshop on
Mother Tongue-Based Multilingual Education (MTBMLE 2)
held February 16-18, 2012, at the Punta Villa Resort
Sto. Niño Sur, Arevalo, Iloilo City, Philippines
This paper proposes a Part-of-Speech (POS) tagging system for Philippine-type languages. I manually annotate one hundred (100) Tagalog sentences from the corpus of Curtis McFarland and seventy-five (75) sentences from my own Cebuano corpus, introducing tags and tagging techniques that address the characteristics and processes unique to our languages, as exemplified by these two. The resulting annotations provide information on the distinctive features, contextual behavior, and frequencies of particular words, parts of speech, and constructions. The framework has further applications in natural language processing and curriculum design. It could be used to improve the accuracy of existing automatic POS taggers and to integrate new functions for inter-translatability with other Philippine languages. It could also assist curriculum writers and teachers under the MTBMLE program in developing curricula and in designing materials and techniques for teaching the grammar of our languages to students.
Keywords: POS tagging, Corpus linguistics, Word categories
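As a rough illustration of how a manually tagged corpus can yield the frequency and contextual information described in the abstract, the following Python sketch counts tags, word-tag pairs, and adjacent tag pairs. It is not the paper's implementation: the tag labels (VERB, MARKER, NOUN), the two example sentences, and the function names are assumptions made only for this example and are not taken from the proposed tagset or from either corpus.

```python
from collections import Counter

# Hypothetical hand-tagged sentences stored as (word, tag) pairs.
# The tag labels are placeholders, NOT the tagset proposed in the paper,
# and the sentences are illustrative, not drawn from either corpus.
TAGGED_CORPUS = {
    "tagalog": [
        [("Kumain", "VERB"), ("ang", "MARKER"), ("bata", "NOUN"),
         ("ng", "MARKER"), ("mangga", "NOUN")],
    ],
    "cebuano": [
        [("Mikaon", "VERB"), ("ang", "MARKER"), ("bata", "NOUN"),
         ("og", "MARKER"), ("mangga", "NOUN")],
    ],
}


def tag_frequencies(sentences):
    """Count how often each tag occurs across the sentences."""
    return Counter(tag for sentence in sentences for _, tag in sentence)


def word_tag_frequencies(sentences):
    """Count each (lowercased word, tag) pairing."""
    return Counter(
        (word.lower(), tag) for sentence in sentences for word, tag in sentence
    )


def tag_bigrams(sentences):
    """Count adjacent tag pairs as a simple view of contextual behavior."""
    counts = Counter()
    for sentence in sentences:
        tags = [tag for _, tag in sentence]
        counts.update(zip(tags, tags[1:]))
    return counts


if __name__ == "__main__":
    for language, sentences in TAGGED_CORPUS.items():
        print(language, dict(tag_frequencies(sentences)))
        print(language, dict(tag_bigrams(sentences)))
```

Keeping each sentence as a list of (word, tag) pairs lets the same counting functions run unchanged on Tagalog and Cebuano data, which is one simple way to compare the two languages side by side.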
Complete article: Delos Reyes – TAG-a-Ling A Part-of-speech Tagging System for Philippine Languages.