Explainable machine learning framework to predict personalized physiological aging
Abstract
Attaining personalized healthy aging requires accurate monitoring of physiological changes and the identification of subclinical markers that predict accelerated or delayed aging. Classic biostatistical methods mostly rely on supervised variables to estimate physiological aging and do not capture the full complexity of inter‐parameter interactions. Machine learning (ML) is promising, but its black‐box nature hinders direct understanding, substantially limiting physician confidence and clinical use. Using a broad population dataset from the National Health and Nutrition Examination Survey (NHANES) that includes routine biological variables, and after selecting XGBoost as the most appropriate algorithm, we created an explainable ML framework to determine a personalized physiological age (PPA). PPA predicted both chronic disease and mortality independently of chronological age. Twenty‐six variables were sufficient to predict PPA. Using SHapley Additive exPlanations (SHAP), we implemented, for each variable, a precise quantitative metric that explains physiological deviations (i.e., accelerated or delayed aging) from age‐specific normative data. Among the variables, glycated hemoglobin (HbA1c) carries a major relative weight in the estimation of PPA. Finally, clustering profiles of identical contextualized explanations reveals different aging trajectories, opening opportunities for specific clinical follow‐up. These data show that PPA is a robust, quantitative, and explainable ML‐based metric for monitoring personalized health status. Our approach also provides a complete framework applicable to different datasets or variables, allowing precise estimation of physiological age.
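To make the per-variable explanation concrete, the following is a minimal sketch of the additive property that SHAP relies on: a prediction equals a baseline plus one signed contribution per variable. It is not the authors' pipeline. The toy model, its coefficients, the three variable names, and the reference values are all hypothetical and chosen only for illustration; the paper itself uses an XGBoost model trained on NHANES data. For simplicity, the sketch computes exact Shapley values against a single reference profile, whereas SHAP for tree models averages over a background dataset.

```python
from itertools import combinations
from math import factorial

# Hypothetical reference ("age-typical") profile. These numbers are made up
# for illustration and are NOT taken from NHANES or from the paper.
REFERENCE = {"hba1c": 5.5, "systolic_bp": 120.0, "albumin": 4.3}


def toy_age_model(x):
    """Made-up stand-in for a trained regressor (the paper uses XGBoost)."""
    d_h = x["hba1c"] - 5.5
    d_s = x["systolic_bp"] - 120.0
    d_a = x["albumin"] - 4.3
    # Linear terms plus one interaction term, so attributions are non-trivial.
    return 50.0 + 4.0 * d_h + 0.1 * d_s - 3.0 * d_a + 0.05 * d_h * d_s


def shapley_values(model, x, reference):
    """Exact Shapley values of each variable in x relative to a reference profile."""
    names = list(x)
    n = len(names)

    def value(subset):
        # Variables in `subset` take the person's values; the rest stay at reference.
        mixed = {k: (x[k] if k in subset else reference[k]) for k in names}
        return model(mixed)

    phi = {}
    for name in names:
        others = [k for k in names if k != name]
        total = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                total += weight * (value(s | {name}) - value(s))
        phi[name] = total
    return phi


# Hypothetical individual.
person = {"hba1c": 7.0, "systolic_bp": 140.0, "albumin": 4.0}

baseline = toy_age_model(REFERENCE)
predicted = toy_age_model(person)
phi = shapley_values(toy_age_model, person, REFERENCE)

print(f"baseline age:  {baseline:.2f}")
for name, contribution in phi.items():
    print(f"  {name:12s} {contribution:+.2f} years")
print(f"predicted age: {predicted:.2f}")
```

In this toy setting, a positive contribution pushes the estimated age above the reference (accelerated aging) and a negative one pushes it below (delayed aging); the contributions always sum to the gap between the prediction and the baseline.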
Origin | Publisher files allowed on an open archive |
---|---|