Developing a functional Natural Language Processing system for the Twi language with limited data
Language is a basic characteristic of human beings. It not only plays a vital role in the transmission of information but is also pivotal in establishing social connections and emotional bonds. For centuries, machines lacked the ability to exchange ideas with humans in an intelligent or intuitive way. In recent years, however, breakthroughs in Artificial Intelligence and the rise of computational power have allowed machines to make significant gains in understanding and interacting with human language. The branch of Artificial Intelligence responsible for enabling machines to understand human language is known as Natural Language Processing. It uses statistical and mathematical models to create algorithms that train machines to learn and understand human language. The major problem with these algorithms is that they require huge amounts of data to train. As a result, Natural Language Processing systems cannot readily be created for languages that lack large amounts of available data. Such languages are called “low-resource languages”, and most Ghanaian languages, including Twi, fall into this category. This research explores how a functional Natural Language Processing system may be created for Twi, a Ghanaian local language, with limited language data.
Undergraduate thesis submitted to the Department of Computer Science, Ashesi University, in partial fulfillment of the requirements for the Bachelor of Science degree in Computer Science, April 2019
artificial intelligence (AI), Twi, low-resource languages, Natural Language Processing (NLP), voice recordings, machine learning