Aim: This in silico study uses machine learning algorithms to predict protein-coding mRNA sequences of Porphyromonas gingivalis (P. gingivalis). Material and methods: The GEO2R tool was applied to the NCBI GEO dataset GSE218606 to identify the most differentially expressed mRNAs in P. gingivalis outer membrane vesicles (OMVs). GeneMANIA was used to analyze the networks of differentially expressed genes. Transcriptomic data were collected and labeled as protein-coding mRNA, pseudogene, lincRNA, or bidirectional promoter lincRNA. After preprocessing, the data were analyzed and used for prediction in Orange, a machine learning tool. Naive Bayes, Neural Network, and Gradient Boosting models were trained and tested on data partitioned into training and testing sets. Model performance was assessed by cross-validation, classification accuracy, and ROC curves. Results: Three models, Neural Networks, Naive Bayes, and Gradient Boosting, were evaluated using Area Under the Curve (AUC), Classification Accuracy (CA), F1 score, Precision, Recall, and Specificity. Gradient Boosting achieved a balanced performance (AUC: 0.72, CA: 0.41, F1: 0.32) compared to Neural Networks (AUC: 0.721, CA: 0.391, F1: 0.314) and Naive Bayes (AUC: 0.701, CA: 0.172, F1: 0.114). While statistical tests revealed no significant differences between the models, Gradient Boosting exhibited a more balanced precision-recall relationship. Conclusion: In silico analysis using machine learning techniques predicted protein-coding mRNA sequences within P. gingivalis OMVs. Gradient Boosting showed the most balanced performance across AUC, classification accuracy, and precision-recall compared with Neural Networks and Naive Bayes, suggesting its potential as a reliable tool for protein-coding mRNA prediction in P. gingivalis OMVs.
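
To make the reported metrics concrete, the following is a minimal sketch of how classification accuracy, precision, recall, F1, and specificity can be derived from predicted and true labels. It is an illustration only, not the study's pipeline: the study obtained its metrics from Orange, the task there involves several sequence classes, and the function name `binary_metrics` and the toy labels below are hypothetical, simplified to a two-class case (1 = protein-coding, 0 = other).

```python
from typing import Dict, Sequence


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Compute CA, precision, recall, F1, and specificity for 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    def safe_div(num: float, den: float) -> float:
        # Return 0.0 instead of raising when a denominator is empty.
        return num / den if den else 0.0

    precision = safe_div(tp, tp + fp)
    recall = safe_div(tp, tp + fn)
    return {
        "CA": safe_div(tp + tn, len(pairs)),
        "precision": precision,
        "recall": recall,
        "F1": safe_div(2 * precision * recall, precision + recall),
        "specificity": safe_div(tn, tn + fp),
    }


# Hypothetical toy labels; these are NOT data from the study.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 1, 0, 0, 1]
metrics = binary_metrics(y_true, y_pred)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

A low classification accuracy alongside a moderate AUC, as reported above, is possible because AUC is computed from ranked class probabilities while CA, precision, and recall depend on the final hard class assignments.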