Text-based language prediction of three native languages of Davao del Norte using Multinomial Naive Bayes algorithm

Date
2022-12
Author
Esperanza, Dianne Faith
Rulida, Kezze Krisna C.
Abstract
Few technology-based studies address the native or minority languages of Davao del Norte, which is home to several indigenous languages. Some of these minority languages are either insufficiently represented or entirely absent on the internet and in other electronic resources. As the internet becomes more pervasive worldwide, digitization is becoming increasingly crucial for language preservation. Using Multinomial Naive Bayes, we developed a language identification tool that identifies the Davao del Norte native languages (Ata-Manobo, Cebuano, and Mansaka) present in a text. We experimented with datasets of varying size and quality until the desired level of accuracy and performance was achieved. The classifier's accuracy did not change significantly after training data was added. However, when tested with new inputs, the classifier appeared to perform better than it did with a smaller dataset. With the additional training data incorporated, the model attained an accuracy of 98.43%. The length of the input text remains an essential factor for accurate language prediction.
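To illustrate the general technique named in the abstract, here is a minimal sketch of a Multinomial Naive Bayes text classifier. This record does not describe the thesis's features, preprocessing, or data, so everything specific here is an assumption: character bigrams as features, Laplace smoothing, the helper names, and the training strings, which are invented placeholders rather than real Ata-Manobo, Cebuano, or Mansaka text.

```python
import math
from collections import Counter, defaultdict


def char_ngrams(text, n=2):
    """Split text into overlapping character n-grams (an assumed feature choice)."""
    padded = f" {text.lower().strip()} "
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]


class MultinomialNB:
    """Multinomial Naive Bayes with add-alpha (Laplace) smoothing."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)          # documents per class
        self.feature_counts = defaultdict(Counter)   # n-gram counts per class
        self.vocab = set()
        for text, label in zip(texts, labels):
            grams = char_ngrams(text)
            self.feature_counts[label].update(grams)
            self.vocab.update(grams)
        self.totals = {c: sum(self.feature_counts[c].values())
                       for c in self.class_counts}
        return self

    def log_scores(self, text):
        """Return log prior + summed log likelihoods for each class."""
        n_docs = sum(self.class_counts.values())
        vocab_size = len(self.vocab)
        scores = {}
        for c in self.class_counts:
            score = math.log(self.class_counts[c] / n_docs)
            denom = self.totals[c] + self.alpha * vocab_size
            for gram in char_ngrams(text):
                if gram in self.vocab:  # ignore n-grams never seen in training
                    count = self.feature_counts[c][gram]
                    score += math.log((count + self.alpha) / denom)
            scores[c] = score
        return scores

    def predict(self, text):
        scores = self.log_scores(text)
        return max(scores, key=scores.get)


# Invented placeholder samples; the labels are hypothetical stand-ins.
train_texts = [
    "zaza zazu zaz", "zuza zaz azaz",
    "kiki kiko kik", "koki kik ikik",
    "momo mumo mom", "mumu mom omom",
]
train_labels = ["lang_a", "lang_a", "lang_b", "lang_b", "lang_c", "lang_c"]

model = MultinomialNB().fit(train_texts, train_labels)
prediction = model.predict("zaz zuza")
```

Because each n-gram contributes one log-likelihood term, a longer input supplies more evidence to separate the classes, which is consistent with the abstract's remark that input length matters.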
Collections
- Undergraduate Theses [181]
Publisher
Department of Arts and Sciences Education - Bachelor of Science in Computer Science
