Analysis of data mining techniques and algorithms for healthcare application using cervical cancer as a case study.
Abstract
Cervical cancer is the most common cause of cancer among African women. It is preventable and can be treated if identified at an early stage. Because health care services are inadequate and colposcopies are costly in Africa, an early diagnosis is difficult to obtain. Smartphone-based diagnostic tools such as MobileODT, with which pictures of the cervix are taken and sent to doctors for diagnosis, promise to address the high cost of colposcopy and Pap tests; still, the diagnosis of these images is prone to human error. This project aimed to recommend the algorithm that best classifies cervical images as cancerous or non-cancerous, in order to help medical officials give a better diagnosis. K-Nearest Neighbour (KNN), Convolutional Neural Network (CNN) and Support Vector Machine (SVM) models were analyzed and compared on their classification accuracy, sensitivity and specificity, and on how these results varied after applying Principal Component Analysis (PCA) to the dataset. The KNN, CNN, and SVM models obtained classification accuracies of 68.75%, 83.3%, and 66.37% respectively, while the PCA-KNN and PCA-SVM models obtained classification accuracies of 78.12% and 62.7% respectively.
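For readers unfamiliar with the three comparison criteria named above, the following is a minimal sketch of how accuracy, sensitivity and specificity are commonly computed from binary predictions. It is not taken from the project itself: the function name `binary_metrics`, the label encoding (1 = cancerous, 0 = non-cancerous) and the toy labels are assumptions made purely for illustration.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Return accuracy, sensitivity and specificity for binary labels.

    Hypothetical helper for illustration; `positive` marks the label
    treated as the "cancerous" class (assumed encoding: 1).
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)  # true positives
    tn = sum(1 for t, p in pairs if t != positive and p != positive)  # true negatives
    fp = sum(1 for t, p in pairs if t != positive and p == positive)  # false positives
    fn = sum(1 for t, p in pairs if t == positive and p != positive)  # false negatives

    total = tp + tn + fp + fn
    nan = float("nan")
    return {
        # share of all images classified correctly
        "accuracy": (tp + tn) / total if total else nan,
        # share of cancerous images correctly flagged (true positive rate)
        "sensitivity": tp / (tp + fn) if (tp + fn) else nan,
        # share of non-cancerous images correctly cleared (true negative rate)
        "specificity": tn / (tn + fp) if (tn + fp) else nan,
    }


# Toy labels, invented for this example only (not data from the study).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

Reporting sensitivity and specificity alongside accuracy matters in a screening setting because a model can reach a high accuracy while still missing many cancerous cases when the classes are imbalanced.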