Methods and Applications: AI-Based Hematological Age Predictors and the Association Between Biological Age Acceleration and Type 2 Diabetes Mellitus — Chongqing Municipality, China, 2015–2021

  • Abstract

    Introduction

    Biological age (BA) can represent the actual state of human aging more accurately than chronological age (CA).

    Methods

    Using hematological data from 112,925 participants in southwestern China, collected between 2015 and 2021, this study constructed BA predictors with 7 machine learning (ML) methods, fitted separately for male and female populations. It then analyzed the association between BA acceleration and type 2 diabetes mellitus (T2DM) in these data using logistic regression. Additionally, it examined the impact of glycemic control on BA in individuals with diabetes.

    Results

    Among all ML models, deep neural networks (DNN) delivered the best performance in both the male [mean absolute error (MAE)=6.89, r=0.75] and female subsets (MAE=6.86, r=0.74). BA acceleration was positively associated with T2DM in both male [odds ratio (OR): 2.22, 95% confidence interval (CI): 1.77–2.77] and female subsets (OR: 3.10, 95% CI: 2.16–4.46), while BA deceleration was negatively associated with T2DM in both male (OR: 0.32, 95% CI: 0.27–0.39) and female subsets (OR: 0.42, 95% CI: 0.33–0.53). Among individuals with diabetes, those with normal fasting glucose had significantly lower BAs than those with impaired fasting glucose in all CA groups except those older than 80 years.

    Discussion

    Artificial intelligence (AI)-based hematological BA predictors show promise as advanced tools for assessing aging in epidemiological studies. Implementing AI-based BA predictors in public health initiatives could facilitate proactive aging management and disease prevention.

  • Conflicts of interest: No conflicts of interest.
  • Funding: Supported by the National Human Genetic Resources Sharing Service Platform (Grant No. 2005DKA21300)
  • [1] Hou YJ, Dan XL, Babbar M, Wei Y, Hasselbalch SG, Croteau DL, et al. Ageing as a risk factor for neurodegenerative disease. Nat Rev Neurol 2019;15(10):565–81. https://doi.org/10.1038/s41582-019-0244-7
    [2] Popa-Wagner A, Petcu EB, Capitanescu B, Hermann DM, Radu E, Gresita A. Ageing as a risk factor for cerebral ischemia: Underlying mechanisms and therapy in animal models and in the clinic. Mech Ageing Dev 2020;190:111312. https://doi.org/10.1016/j.mad.2020.111312
    [3] Smetana Jr K, Lacina L, Szabo P, Dvořánková B, Brož P, Šedo A. Ageing as an important risk factor for cancer. Anticancer Res 2016;36(10):5009–17. https://doi.org/10.21873/anticanres.11069
    [4] Gloor AD, Berry GJ, Goronzy JJ, Weyand CM. Age as a risk factor in vasculitis. Semin Immunopathol 2022;44:281–301. https://doi.org/10.1007/s00281-022-00911-1
    [5] Nie C, Li Y, Li R, Yan YZ, Zhang DT, Li T, et al. Distinct biological ages of organs and systems identified from a multi-omics study. Cell Rep 2022;38(10):110459. https://doi.org/10.1016/j.celrep.2022.110459
    [6] Putin E, Mamoshina P, Aliper A, Korzinkin M, Moskalev A, Kolosov A, et al. Deep biomarkers of human aging: Application of deep neural networks to biomarker development. Aging (Albany NY) 2016;8(5):1021–33. https://doi.org/10.18632/aging.100968
    [7] Zhong X, Lu YX, Gao Q, Nyunt SZ, Fulop T, Monterola CP, et al. Estimating biological age in the Singapore longitudinal aging study. J Gerontol A Biol Sci Med Sci 2020;75(10):1913–20. https://doi.org/10.1093/gerona/glz146
    [8] An S, Ahn C, Moon S, Sim EJ, Park SK. Individualized biological age as a predictor of disease: Korean Genome and Epidemiology Study (KoGES) Cohort. J Pers Med 2022;12(3):505. https://doi.org/10.3390/jpm12030505
    [9] Mamoshina P, Kochetov K, Putin E, Cortese F, Aliper A, Lee WS, et al. Population specific biomarkers of human aging: A big data study using South Korean, Canadian, and Eastern European patient populations. J Gerontol A Biol Sci Med Sci 2018;73(11):1482–90. http://dx.doi.org/10.1093/gerona/gly005
    [10] Zhavoronkov A, Li R, Ma CDC, Mamoshina P. Deep biomarkers of aging and longevity: from research to applications. Aging (Albany NY) 2019;11(22):10771–80. https://doi.org/10.18632/aging.102475
    [11] Manoj K, Senthamarai Kannan K. Comparison of methods for detecting outliers. Int J Sci Eng Res 2013;4(9):709–14. https://www.ijser.org/researchpaper/Comparison-of-methods-for-detecting-outliers.pdf
    [12] Cao XQ, Yang GL, Jin XR, He L, Li XQ, Zheng ZT, et al. A machine learning-based aging measure among middle-aged and older Chinese adults: the China Health and Retirement Longitudinal Study. Front Med (Lausanne) 2021;8:698851. https://doi.org/10.3389/fmed.2021.698851
    [13] Chen L, Zhang YQ, Yu CQ, Guo Y, Sun DJY, Pang YJ, et al. Modeling biological age using blood biomarkers and physical measurements in Chinese adults. eBioMedicine 2023;89:104458. https://doi.org/10.1016/j.ebiom.2023.104458
    [14] Sebastiani P, Thyagarajan B, Sun FG, Honig LS, Schupf N, Cosentino S, et al. Age and sex distributions of age-related biomarker values in healthy older adults from the long life family study. J Am Geriatr Soc 2016;64(11):e189–94. https://doi.org/10.1111/jgs.14522
  • FIGURE 1.  Optimized workflow for data processing and management.

    FIGURE 2.  Correlations between hematological biological age and chronological age based on deep neural network models in the training, validation, and testing datasets. (A) Training dataset — Male; (B) Training dataset — Female; (C) Validation dataset — Male; (D) Validation dataset — Female; (E) Testing dataset — Male; (F) Testing dataset — Female.

    Note: The colors represent the Gaussian kernel density estimation value.

    FIGURE 3.  Boxplot of biological age in different age groups by gender for diabetic and non-diabetic individuals (A) Male; (B) Female.

    Note: The significance levels were Bonferroni corrected.

    Abbreviation: IFG=impaired fasting glucose; NFG=normal fasting glucose.

    *** means P≤0.001;

    **** means P≤0.0001.

    TABLE 1.  Gender differences in the impact of biological age acceleration and deceleration, body mass index, and chronological age on type 2 diabetes mellitus risk: estimation of odds ratios.

    Factors              Male                                       Female
                         Coefficient  OR (95% CI)        P          Coefficient  OR (95% CI)        P
    Body mass index      0.05         1.05 (1.03–1.06)   <0.001     0.03         1.03 (1.01–1.04)   0.006
    Chronological age    0.11         1.12 (1.11–1.12)   <0.001     0.14         1.14 (1.13–1.16)   <0.001
    Accelerate           0.80*        2.22 (1.77–2.77)*  <0.001     1.13*        3.10 (2.16–4.46)*  <0.001
    Decelerate           −1.13*       0.32 (0.27–0.39)*  <0.001     −0.87*       0.42 (0.33–0.53)*  <0.001

    Abbreviation: OR=odds ratio; CI=confidence interval.
    * indicates that the acceleration of biological age is positively correlated with type 2 diabetes mellitus in both male and female subjects, while the deceleration of biological age is negatively correlated with type 2 diabetes mellitus.

  • 1. National Human Genetic Resources Center, National Research Institute for Family Planning, Beijing, China
  • 2. National Human Genetic Resources Sharing Service Platform, National Research Institute for Family Planning, Beijing, China
  • 3. Chinese Academy of Medical Sciences & Peking Union Medical College, Beijing, China
  • 4. The First Affiliated Hospital of Chongqing Medical University, Chongqing, China
  • Corresponding authors:

    Zongfu Cao, caozongfu@egene.org.cn

    Xu Ma, maxu_ky@nrifp.org.cn

  • Online Date: November 08 2024
    Issue Date: November 08 2024
    doi: 10.46234/ccdcw2024.240
  • Aging is a process characterized by the gradual decline of various molecular functions and the progressive accumulation of senescent cells, which increases the risk of conditions such as neurodegenerative diseases (1), cerebral ischemia (2), cancer (3), and vasculitis (4). The definition of aging has evolved from a simple increase in chronological age to a systematic assessment of overall physiological function. Different bodily systems and organs may age at different rates, suggesting the presence of multiple “clocks” (5). Hematological features have been used to develop biological age (BA) prediction models in populations from the USA (6), Singapore (7), the Republic of Korea (8,9), Canada (9), and eastern Europe (9). BA serves as a valuable tool for identifying biomarkers and exploring risk and protective factors associated with aging. Machine learning (ML) techniques, particularly deep learning, are playing an increasingly important role in predicting BA accurately, which both enhances our understanding of healthy aging and gives the public health sector a practical tool. Deep learning, which has been used to predict BA since 2016 (6), relies on deep neural networks (DNN) whose nonlinear fitting, learning, and generalization abilities improve model performance (10).

    This study established and evaluated hematological BA prediction models in a large Chinese population using seven ML methods based on 20 blood test features. Further analysis showed an association between BA acceleration and type 2 diabetes mellitus (T2DM) and suggested that keeping fasting blood glucose within normal levels may decrease BA. The study also examined the relationship between BA and chronological age (CA): even among individuals with the same CA, BA can differ substantially, which often reflects differences in health status.

    • Twenty-nine blood test features from 145,645 individuals were collected from the physical examination data center of the First Affiliated Hospital of Chongqing Medical University between 2015 and 2021. Supplementary Table S1 provides an overview of the participants’ demographic profiles. The features fall into five categories: blood routine examination, cardiovascular efficiency, diabetes mellitus, liver function, and renal function (Supplementary Table S2). Outliers for each variable were removed using the quartile method (11). Because multicollinearity can distort the estimated effect of each explanatory variable and degrade model stability and generalization error, the variance inflation factor (VIF) was calculated for the 29 features, and a feature was considered to exhibit multicollinearity if its VIF exceeded 10. After features with multicollinearity were removed, the remaining features were normalized to a range of 0 to 1.
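The preprocessing steps described above can be sketched in a few lines of standard-library Python. This is only an illustration under stated assumptions, not the study's code: the fence multiplier `k=1.5` (Tukey's rule) is assumed because the text names only "the quartile method", the function names are hypothetical, and the sample values are invented. The helper `vif_from_r2` uses the standard definition VIF = 1 / (1 − R²), where R² comes from regressing one feature on all the others, so the threshold of 10 corresponds to R² = 0.9.

```python
import statistics


def iqr_filter(values, k=1.5):
    """Keep values inside [Q1 - k*IQR, Q3 + k*IQR]; k=1.5 is an assumed default."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if low <= v <= high]


def min_max_scale(values):
    """Rescale values linearly to the range 0 to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant feature carries no information; map it to 0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def vif_from_r2(r_squared):
    """Variance inflation factor for a feature whose regression on the
    remaining features has coefficient of determination r_squared."""
    return 1.0 / (1.0 - r_squared)


# Invented example values for a single blood test feature.
albumin = [40, 42, 44, 45, 46, 47, 48, 90]
cleaned = iqr_filter(albumin)      # the extreme value 90 falls outside the fences
scaled = min_max_scale(cleaned)    # remaining values mapped onto 0..1
```

In the study the filter would apply to every variable, and an individual would presumably be dropped if any of their values were flagged; the sketch handles one variable at a time for brevity.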

    • The datasets for male and female participants were each randomly split into training, validation, and testing datasets at a ratio of 7:2:1. Seven ML methods were used to build BA prediction models: DNN, support vector regression (SVR), stochastic gradient descent (SGD), kernel ridge regression (KRR), Least Absolute Shrinkage and Selection Operator (LASSO), K-nearest neighbors (KNN), and gradient boosting for regression (GBR). Each model was optimized through parameter adjustment. To assess the performance of each model, Pearson’s correlation coefficient (r) and the mean absolute error (MAE) were calculated. Additionally, permutation feature importance (PFI) was used to measure the contribution of each feature in the DNN models: each feature was permuted 100 times, and the average decrease in the coefficient of determination (R²) after permutation was taken as its importance. In summary, the study analyzed blood test features to develop and evaluate BA prediction models using several ML methods.
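The evaluation procedure above can be illustrated with a short standard-library sketch. It is written under assumptions and is not the authors' implementation: the function names, the random seeds, and the toy linear `predict` function standing in for a trained DNN are all hypothetical. It shows a 7:2:1 random split, MAE, Pearson's r, and permutation feature importance computed as the mean drop in R² over 100 shuffles of one feature column.

```python
import math
import random


def split_7_2_1(items, seed=0):
    """Randomly split items into training, validation, and testing parts (7:2:1)."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_train = len(shuffled) * 7 // 10
    n_val = len(shuffled) * 2 // 10
    return (shuffled[:n_train],
            shuffled[n_train:n_train + n_val],
            shuffled[n_train + n_val:])


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def pearson_r(x, y):
    """Pearson's correlation coefficient."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def r2_score(y_true, y_pred):
    """Coefficient of determination."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def permutation_importance(predict, rows, y, col, n_repeats=100, seed=0):
    """Mean decrease in R^2 when feature column `col` is shuffled n_repeats times."""
    rng = random.Random(seed)
    baseline = r2_score(y, [predict(row) for row in rows])
    drops = []
    for _ in range(n_repeats):
        column = [row[col] for row in rows]
        rng.shuffle(column)
        permuted = [row[:col] + [v] + row[col + 1:] for row, v in zip(rows, column)]
        drops.append(baseline - r2_score(y, [predict(row) for row in permuted]))
    return sum(drops) / n_repeats


# Toy stand-in for a trained model: it uses feature 0 and ignores feature 1.
def predict(row):
    return 50.0 + 10.0 * row[0]


rows = [[0.1, 0.9], [0.2, 0.1], [0.4, 0.5], [0.6, 0.3], [0.8, 0.7], [0.9, 0.2]]
ages = [predict(row) for row in rows]
importance_used = permutation_importance(predict, rows, ages, col=0)
importance_unused = permutation_importance(predict, rows, ages, col=1)
```

Shuffling a feature the model ignores leaves the predictions unchanged, so its importance is zero, whereas shuffling a feature the model relies on lowers R².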

    • Participants whose BA exceeded their CA by more than 7 years (BA − CA > 7) were classified as the BA acceleration group, while those whose BA was more than 7 years below their CA (BA − CA < −7) comprised the BA deceleration group. The remaining participants constituted the control group. The T2DM status of each participant was self-reported based on previous medical history. Logistic regression, with body mass index (BMI) and CA as covariates, was used to analyze the association between BA acceleration and T2DM. Participants with T2DM were further divided into two groups based on glycemic control: those with impaired fasting glucose (IFG) and those with normal fasting glucose (NFG) after treatment. All individuals were categorized into six CA groups. The Kruskal-Wallis H test was used to analyze differences in BA − CA across the three groups (non-diabetic, IFG, and NFG) for each CA group in both male and female participants.
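The grouping rule above is a simple threshold on the difference between BA and CA. A minimal sketch follows; the function name `ba_group` and the group labels are hypothetical, while the 7-year threshold and the strict inequalities come from the text.

```python
def ba_group(ba, ca, threshold=7.0):
    """Classify a participant by the gap between biological and chronological age."""
    diff = ba - ca
    if diff > threshold:
        return "acceleration"
    if diff < -threshold:
        return "deceleration"
    return "control"


# A participant aged 50 with a predicted BA of 60 falls in the acceleration group.
example = ba_group(ba=60.0, ca=50.0)
```

Because both inequalities are strict, a difference of exactly 7 years in either direction stays in the control group.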

    • SPSS (version 25.0; IBM Corporation, Armonk, NY, USA), R software (version 3.6.1; R Foundation, Vienna, Austria), Python (version 3.8.3; maintained by the Python Software Foundation, Chicago, IL, USA), and TensorFlow software (version 2.7.0; developed by Google Brain Team, Mountain View, CA, USA) were used to perform the analyses.

    • In this study, CA ranged from 11.44 to 99.82 years, with a detailed distribution shown in Supplementary Figure S1. Outlier detection using the quartile method removed 32,720 of 145,645 individuals, leaving 112,925 participants. Nine features were excluded because of multicollinearity and strong correlations with other features, leaving 20 features (Supplementary Figure S2 and Supplementary Table S2). Figure 1 presents the process flowchart and sample size for each subset.

    • Seven optimized BA prediction models were constructed using ML methods. Among these, the DNN model showed the highest correlation coefficient (r) and the lowest MAE across the training, validation, and testing datasets (Supplementary Table S3). The DNN BA predictors were therefore used to predict hematological BA for both the male and female subsets. Strong correlations between hematological BA and CA were observed in the male and female training, validation, and testing datasets (Figure 2).

    • Feature importance was calculated as the average decrease in model performance, measured by the coefficient of determination (R²), after each feature was randomly permuted (Supplementary Figure S3). The top five features in the BA prediction model for male subjects were albumin, fasting blood glucose, platelet hematocrit, mean corpuscular volume, and serum total cholesterol (Supplementary Figure S3A), whereas the top five features in the model for female subjects were albumin, fasting blood glucose, serum total cholesterol, triglycerides, and urea (Supplementary Figure S3B). In both models, albumin and fasting blood glucose ranked first and second in importance. Albumin was twice as important in the male model as in the female model, whereas the importance of fasting blood glucose was almost the same in both.

    • A total of 1,362 individuals with diabetes and 100,818 individuals without diabetes were included to investigate the association of BA acceleration and deceleration with T2DM. In logistic regression analysis, BA acceleration was positively associated with T2DM in both male [odds ratio (OR): 2.22, 95% confidence interval (CI): 1.77–2.77] and female subjects (OR: 3.10, 95% CI: 2.16–4.46), while BA deceleration was negatively associated with T2DM in both male (OR: 0.32, 95% CI: 0.27–0.39) and female subjects (OR: 0.42, 95% CI: 0.33–0.53) (Table 1).
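In logistic regression, an odds ratio is the exponential of the corresponding coefficient, OR = exp(β). The sketch below applies that standard relationship to the coefficients reported in Table 1 and compares the result with the reported ORs. The values are copied from the table; the names `TABLE_1`, `odds_ratio`, and `max_discrepancy` are hypothetical. Because the published coefficients are rounded to two decimals, the recomputed ORs agree with the table only approximately.

```python
import math

# (coefficient, reported OR) pairs from Table 1, listed as [male, female].
TABLE_1 = {
    "Body mass index": [(0.05, 1.05), (0.03, 1.03)],
    "Chronological age": [(0.11, 1.12), (0.14, 1.14)],
    "Accelerate": [(0.80, 2.22), (1.13, 3.10)],
    "Decelerate": [(-1.13, 0.32), (-0.87, 0.42)],
}


def odds_ratio(coefficient):
    """Convert a logistic regression coefficient to an odds ratio."""
    return math.exp(coefficient)


def max_discrepancy(table):
    """Largest absolute gap between exp(coefficient) and the reported OR."""
    return max(
        abs(odds_ratio(coef) - reported)
        for pairs in table.values()
        for coef, reported in pairs
    )


# Recompute every OR from its rounded coefficient for side-by-side comparison.
recomputed = {
    factor: [round(odds_ratio(coef), 3) for coef, _ in pairs]
    for factor, pairs in TABLE_1.items()
}
```

A coefficient above zero gives an OR above 1 (higher odds of T2DM), and a coefficient below zero gives an OR below 1, which is how the acceleration and deceleration rows should be read.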

      To observe whether glycemic control affected hematological BA in six CA groups, 1,362 diabetic subjects were divided into IFG and NFG groups (Supplementary Table S4). Using the Kruskal–Wallis H test, NFG diabetic subjects had significantly lower BAs than IFG diabetic subjects in all CA groups except those older than 80 years. Concurrently, there were no significant BA differences between NFG diabetic and nondiabetic subjects, except for individuals younger than 40 years (Figure 3).
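For readers unfamiliar with the test used here, the following standard-library sketch computes the Kruskal-Wallis H statistic with tie correction and a Bonferroni adjustment. It is an illustration under assumptions rather than the study's analysis: the function names and sample values are invented, the number of tests passed to `bonferroni` is a placeholder, and the closed-form p-value exp(−H/2) holds only for three groups, where H is compared against a chi-square distribution with 2 degrees of freedom.

```python
import math


def average_ranks(values):
    """Rank values from 1..N, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def kruskal_wallis_h(groups):
    """Kruskal-Wallis H statistic with the usual correction for ties."""
    pooled = [v for group in groups for v in group]
    n = len(pooled)
    ranks = average_ranks(pooled)
    total, start = 0.0, 0
    for group in groups:
        rank_sum = sum(ranks[start:start + len(group)])
        total += rank_sum ** 2 / len(group)
        start += len(group)
    h = 12.0 / (n * (n + 1)) * total - 3 * (n + 1)
    counts = {}
    for v in pooled:
        counts[v] = counts.get(v, 0) + 1
    correction = 1 - sum(t ** 3 - t for t in counts.values()) / (n ** 3 - n)
    return h / correction


def p_value_three_groups(h):
    """Chi-square survival function with 2 degrees of freedom (three groups only)."""
    return math.exp(-h / 2)


def bonferroni(p, n_tests):
    """Bonferroni-adjusted p-value, capped at 1."""
    return min(1.0, p * n_tests)


# Invented BA - CA differences for non-diabetic, IFG, and NFG participants.
non_diabetic = [-1.0, 0.5, 1.5]
ifg = [6.0, 7.5, 9.0]
nfg = [2.0, 3.0, 4.0]
h_stat = kruskal_wallis_h([non_diabetic, ifg, nfg])
p_adjusted = bonferroni(p_value_three_groups(h_stat), n_tests=6)  # 6 is a placeholder
```

The chi-square approximation is unreliable for groups as small as these; the tiny sample only keeps the example readable.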

    • In this study, sex-specific hematological BA prediction models constructed using the DNN algorithm showed good performance in a southwestern Chinese population. DNN outperformed the six other ML methods, yielding the smallest MAE and the largest correlation coefficient. This study also showed that hematological BA acceleration may be a strong risk factor for T2DM in males and females, while BA deceleration may be protective. Keeping fasting blood glucose within normal levels in those with T2DM may decrease hematological BA. This observation highlights the importance of using BA prediction tools to monitor the biological aging process in middle-aged and older adults (12), thereby informing personalized health management strategies for those at heightened risk of T2DM. Identifying individuals at high risk of T2DM enables prompt promotion of lifestyle changes, including a balanced diet, increased physical activity, and stress management. Such proactive measures may help prevent the onset of T2DM, improve the overall health of at-risk populations, halt disease progression before it requires intensive medical care, and reduce healthcare costs.

      Because information on the included features is widely available and easily obtained from routine hematological examinations, the BA prediction models may be widely applicable for assessing BA acceleration status, especially in retrospective cohort studies. Therefore, large, multicenter cohorts based on clinical or physical examination data could be established to facilitate extensive studies on the causes and adverse outcomes of accelerated aging. For example, Lu Chen et al. demonstrated an association between BA acceleration and the risk of cardiovascular events and all-cause mortality; specifically, integrating BA acceleration into mortality prediction increased Harrell’s C-index from 0.813 to 0.821 (13).

      This study’s BA prediction models demonstrate significant potential for large-scale population health research, reshaping public health strategies by identifying individuals at risk of accelerated aging and elevated disease susceptibility. This precision fosters personalized preventive measures and targeted therapies. It has been reported that a majority of hematological markers are associated with both gender and age (14). Tracking hematological markers facilitates the mapping of aging patterns across diverse populations, informing strategic resource allocation and the design of tailored health preservation initiatives. However, this study was cross-sectional, and future cohort studies will assess the associations between hematological BA acceleration and other diseases.

      In conclusion, AI-based hematological BA predictors using DNNs were established and demonstrated good performance in a large Chinese population, suggesting that they may be promising methods for assessing accelerated BA status in epidemiological studies of aging.

    • The authors thank Professor Wang Hong of Peking University for her suggestions on the analysis.
