Evaluating and choosing a machine learning algorithm for classifying road surface quality data

Date

2018-04

Abstract

Considering the importance of roads to a community, stakeholders (governments, motorists, etc.) need up-to-date information about the state of roads for decision-making. This problem inspired Vorgbe’s (2014) work in implementing a machine learning classifier that could accurately classify roads as “good”, “fair” or “bad”. This information can then be visualised on Google Maps. However, because his algorithm failed to classify some roads accurately, this project evaluates five classification algorithms to determine which one is best for classifying road surface quality data. To do this, we collected x, y, z acceleration and location data, extracted the desired features from it, performed 10-fold cross-validation training on the data to choose the best model, and then tested on a new set of examples to determine the model that most accurately classifies the data. From the data available, the decision tree model produced the best performance, with true positive rates of 97% for bad roads, 81% for fair roads and 93% for good roads. The overall accuracy on the test set is 92%, with a precision of 92% and a recall of 90%. This means that this model is more likely than the others to predict a new data point as belonging to its true class. The other algorithms (Logistic Regression, Random Forests, Support Vector Machines and Nearest Neighbour) performed well when classifying the “good” and “bad” road data but misclassified the “fair” road data as “good”.
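As a rough illustration of how the figures quoted above relate to one another, the following minimal sketch computes per-class true positive rate (recall), per-class precision, overall accuracy and macro-averaged scores from a three-class confusion matrix. The labels match the abstract, but the sample predictions are made up for illustration and are not the project's data; the function names are hypothetical, and macro averaging is an assumption, since the abstract does not state how the overall precision and recall were averaged.

```python
from typing import Dict, List

LABELS = ["bad", "fair", "good"]


def confusion_matrix(y_true: List[str], y_pred: List[str],
                     labels: List[str]) -> List[List[int]]:
    """Rows are true classes, columns are predicted classes."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for truth, guess in zip(y_true, y_pred):
        matrix[index[truth]][index[guess]] += 1
    return matrix


def per_class_metrics(matrix: List[List[int]],
                      labels: List[str]) -> Dict[str, Dict[str, float]]:
    """Recall (true positive rate) and precision for each class."""
    metrics = {}
    for i, label in enumerate(labels):
        true_positives = matrix[i][i]
        actual = sum(matrix[i])                    # examples truly in this class
        predicted = sum(row[i] for row in matrix)  # examples assigned to this class
        metrics[label] = {
            "recall": true_positives / actual if actual else 0.0,
            "precision": true_positives / predicted if predicted else 0.0,
        }
    return metrics


def accuracy(matrix: List[List[int]]) -> float:
    """Fraction of all examples that fall on the diagonal."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total if total else 0.0


def macro_average(metrics: Dict[str, Dict[str, float]], key: str) -> float:
    """Unweighted mean of a per-class metric (an assumed averaging scheme)."""
    return sum(m[key] for m in metrics.values()) / len(metrics)


# Illustrative labels only: one "fair" segment is mistaken for "good",
# and one "good" segment is mistaken for "fair".
y_true = ["bad", "bad", "fair", "fair", "good", "good", "good", "good"]
y_pred = ["bad", "bad", "fair", "good", "good", "good", "good", "fair"]

matrix = confusion_matrix(y_true, y_pred, LABELS)
metrics = per_class_metrics(matrix, LABELS)

for label in LABELS:
    print(label, metrics[label])
print("accuracy:", accuracy(matrix))
print("macro precision:", macro_average(metrics, "precision"))
print("macro recall:", macro_average(metrics, "recall"))
```

Reading the matrix row by row shows which class absorbs the errors, which is how a pattern such as “fair” roads being labelled “good” becomes visible even when overall accuracy looks high.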

Description

Applied project submitted to the Department of Computer Science, Ashesi University, in partial fulfillment of the requirements for the Bachelor of Science degree in Computer Science, April 2018

Keywords

algorithm, road quality data
