Use of Machine Learning for the Optimization of Genetic Circuits in Synthetic Biology: Focusing On Promoter Prediction for Gene Expression in Escherichia Coli

Date

2024-05

Abstract

This study explores the application of Machine Learning (ML) to optimizing genetic circuit design in Synthetic Biology, focusing on predicting specific promoters for gene expression in Escherichia coli. Despite the potential of ML, current design methods rely heavily on trial and error, which is inefficient and costly. The research employs several ML models, including Genetic Algorithms, Support Vector Machines, and Neural Networks, alongside computational algorithms such as the Boyer-Moore-Horspool string-search algorithm, to predict promoter efficacy and identify optimal promoter-gene configurations. Using datasets from the RegulonDB, EcoCyc, and PRODORIC databases, the study validates its findings through a combination of literature cross-checks and model performance metrics. The resulting model predicted promoter efficacy with high accuracy, achieving a 92% success rate in identifying optimal configurations. The findings suggest that incorporating transcription unit data significantly improves prediction accuracy, demonstrating the potential of ML to advance synthetic biology towards more precise and efficient genetic circuit design.
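
The abstract names the Boyer-Moore-Horspool algorithm but does not say how the thesis applies it. As an illustration only, the sketch below assumes it is used for exact motif matching in a DNA string and searches for TATAAT, the well-known consensus of the -10 element in E. coli sigma-70 promoters. The function name `horspool_search` and the example sequence are hypothetical and are not taken from the thesis.

```python
def horspool_search(text: str, pattern: str) -> list[int]:
    """Return the start index of every occurrence of pattern in text."""
    m, n = len(pattern), len(text)
    if m == 0 or m > n:
        return []
    # Bad-character table: for each character in the pattern (ignoring the
    # final position), the distance from its last occurrence to the end.
    shift = {ch: m - 1 - i for i, ch in enumerate(pattern[:-1])}
    hits = []
    pos = 0
    while pos <= n - m:
        if text[pos:pos + m] == pattern:
            hits.append(pos)
        # Shift by the table entry for the text character aligned with the
        # last pattern position; characters not in the pattern skip m.
        pos += shift.get(text[pos + m - 1], m)
    return hits


# Hypothetical sequence, made up for this example (not real genomic data).
sequence = "GGTATAATGCTATAATAA"
hits = horspool_search(sequence, "TATAAT")
print(hits)  # [2, 10]
```

Real promoter elements vary around the consensus, so exact matching of this kind would at most supply candidate positions or features for the ML models the abstract describes.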

Description

Undergraduate thesis submitted to the Department of Computer Science, Ashesi University, in partial fulfillment of the requirements for the Bachelor of Science degree in Computer Science, May 2024.

Citation

Ineza, N. C. (2024). Use of Machine Learning for the Optimization of Genetic Circuits in Synthetic Biology: Focusing On Promoter Prediction for Gene Expression in Escherichia Coli. Ashesi University.
