Studying the Relationship Between DNA Methylation and Biological Age with an Understandable Machine Learning Model

Exploring DNA Methylation Data and Machine Learning Models for Biological Age Prediction

A study of DNA methylation data examines how well machine learning models can predict biological age. It analyzed 10,296 samples, covering both healthy and diseased individuals, and built its models from data on 50,000 methylation sites together with gender data.

Preprocessing consisted of filling in missing data, encoding the gender data, and testing the DNA methylation site data for normality. Feature selection with XGBoost, LightGBM, and CatBoost models identified the top 20 methylation sites for biological age prediction. Statistical analysis of the correlation between methylation data and biological age then narrowed the selection to 15 key methylation features for model training.
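The preprocessing and feature selection steps described above can be sketched roughly as follows. This is a minimal illustration on synthetic data, not the study's pipeline: the imputation strategy (per-site mean), the 0/1 gender coding, the Shapiro-Wilk test, and the use of scikit-learn's `GradientBoostingRegressor` as a stand-in for XGBoost, LightGBM, and CatBoost are all assumptions, as are the variable names and the choice of five top sites.

```python
import numpy as np
from scipy import stats
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.impute import SimpleImputer

rng = np.random.default_rng(0)

# Synthetic stand-in data (assumption): 300 samples, 40 methylation sites
# with beta values in [0, 1]; the real study used far more of both.
n_samples, n_sites = 300, 40
age = rng.uniform(20, 80, size=n_samples)
beta = rng.uniform(0.0, 1.0, size=(n_samples, n_sites))
# Make the first three sites depend on age so there is signal to find.
for j in range(3):
    beta[:, j] = np.clip(0.2 + 0.008 * age + rng.normal(0, 0.03, n_samples), 0, 1)
gender = rng.choice(["F", "M"], size=n_samples)

# Knock out about 2% of values to mimic missing measurements.
beta[rng.random(beta.shape) < 0.02] = np.nan

# 1. Fill missing values (per-site mean is an assumption).
beta_filled = SimpleImputer(strategy="mean").fit_transform(beta)

# 2. Encode gender as a numeric column (0 = F, 1 = M; assumed coding).
gender_code = (gender == "M").astype(float)

# 3. Normality test per site (Shapiro-Wilk is an assumed choice of test).
normal_p = np.array([stats.shapiro(beta_filled[:, j])[1] for j in range(n_sites)])

# 4. Rank sites by the importance a gradient-boosted model assigns to them.
X = np.column_stack([beta_filled, gender_code])
model = GradientBoostingRegressor(n_estimators=100, random_state=0).fit(X, age)
site_importance = model.feature_importances_[:n_sites]
top_sites = np.argsort(site_importance)[::-1][:5]

# 5. Check how strongly each selected site correlates with age.
corr = np.array([stats.pearsonr(beta_filled[:, j], age)[0] for j in top_sites])

for j, r in zip(top_sites, corr):
    print(f"site {j}: importance={site_importance[j]:.3f}, r={r:+.2f}")
```

In a real analysis the ranking would be repeated with each of the three boosting libraries and the results compared before fixing the final feature set.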

Model training covered XGBoost, LightGBM, CatBoost, and deep neural networks, each evaluated with 10-fold cross-validation. XGBoost performed best at predicting biological age.

SHAP values were then used to quantify how much each methylation feature contributed to the predictions, which led to the identification of 12 key genes associated with the methylation sites. KEGG and GO analyses described the pathways, biological processes, cellular components, and molecular functions related to these genes.
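To make the idea behind SHAP values concrete, the sketch below computes exact Shapley values by enumerating every feature coalition for a small model. It does not use the `shap` library or the study's data: the four synthetic features, the background sample, and the helper names `coalition_value` and `shapley_values` are all hypothetical, and brute-force enumeration is only practical for a handful of features.

```python
import itertools
import math

import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(2)

# Synthetic stand-in data (assumption): 4 features, age driven mostly by feature 0.
n_samples, n_features = 200, 4
X = rng.uniform(0.0, 1.0, size=(n_samples, n_features))
age = 20 + 40 * X[:, 0] + 15 * X[:, 1] + rng.normal(0, 1, n_samples)
model = GradientBoostingRegressor(n_estimators=50, random_state=0).fit(X, age)

# Background sample used to average out features that are "absent".
background = X[:50]


def coalition_value(x, subset):
    """Mean prediction with features in `subset` fixed to x, the rest from background."""
    data = background.copy()
    cols = list(subset)
    if cols:
        data[:, cols] = x[cols]
    return model.predict(data).mean()


def shapley_values(x):
    """Exact Shapley values for one sample by enumerating all coalitions."""
    n = len(x)
    phi = np.zeros(n)
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for a coalition of this size that excludes feature i.
            weight = math.factorial(size) * math.factorial(n - size - 1) / math.factorial(n)
            for subset in itertools.combinations(others, size):
                gain = coalition_value(x, subset + (i,)) - coalition_value(x, subset)
                phi[i] += weight * gain
    return phi


x = X[100]
phi = shapley_values(x)
baseline = model.predict(background).mean()
prediction = model.predict(x.reshape(1, -1))[0]
print("contributions:", np.round(phi, 2))
print("baseline + sum of contributions:", round(baseline + phi.sum(), 3))
print("model prediction:", round(prediction, 3))

# Averaging absolute contributions over several samples gives a global ranking.
mean_abs = np.abs(np.array([shapley_values(row) for row in X[:10]])).mean(axis=0)
```

The contributions for a sample add up to the difference between its prediction and the baseline, which is the property that lets SHAP attribute a predicted age to individual methylation features.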

Overall, the study shows how understandable machine learning models can be applied to DNA methylation data for biological age prediction. The findings may inform future research into the mechanisms of aging and disease progression at the molecular level.
