Predicting Cycloplegic Spherical Equivalent Refraction Among Children and Adolescents Using Non-cycloplegic Data and Machine Learning — China, 2020–2024

Keke Liu; Ran Qin; Huijuan Luo; Huining Kuang; Ranbo E; Chenyu Zhang; Bingjie Sun; Xin Guo

doi:10.46234/ccdcw2025.217

China CDC Weekly, 2025, 7(40): 1284-1289

Methods and Applications: Predicting Cycloplegic Spherical Equivalent Refraction Among Children and Adolescents Using Non-cycloplegic Data and Machine Learning — China, 2020–2024

Keke Liu^1,2,&;
Ran Qin^2,&;
Huijuan Luo^2;
Huining Kuang^1,2;
Ranbo E^1,2;
Chenyu Zhang^1,2;
Bingjie Sun^2;
Xin Guo^1,3


Abstract
Introduction
Cycloplegic refraction is the gold standard for assessing refractive error in children. However, logistical constraints hinder its implementation in large-scale surveys.
Methods
Data obtained from a nationwide ocular health survey conducted in ten provincial-level administrative divisions in China were analyzed (2020–2024). Participants aged 5–18 years underwent standardized non-cycloplegic and cycloplegic autorefraction, axial length (AL), corneal radius (CR), and AL/CR measurements. Random forest and XGBoost models were trained to predict the cycloplegic spherical equivalent (SE) using non-cycloplegic SE, uncorrected visual acuity (UCVA), and biometric parameters. Performance was evaluated using R², root mean square error (RMSE), and Bland–Altman analysis.
Results
Both models exhibited strong predictive performance. In the test set, random forest achieved R²=0.88 and RMSE=0.55 diopter (D), whereas XGBoost achieved R²=0.89 and RMSE=0.54 D. Non-cycloplegic SE, AL/CR ratio, AL, and UCVA were consistently the top predictors. The predicted SE exhibited strong agreement with the cycloplegic SE, with minimal residual bias.
Conclusion
Machine learning models incorporating non-cycloplegic SE and ocular biometrics accurately estimate cycloplegic SE in children and adolescents, providing a practical alternative for large-scale refractive-error surveillance when cycloplegia is impractical.
Conflicts of interest: No conflicts of interest.
Funding: Supported by the Ministry of Science and Technology of the People’s Republic of China, National Key Research and Development Program of China (2021YFC2702102); the Beijing Municipal Health Commission High-level Public Health Technical Talent Construction Project (Leading Talent-01-09); and China CDC Public Health Emergency Response Mechanism Programs (102393220020010000017).

Author Affiliations

1. School of Public Health, Capital Medical University, Beijing, China
2. Beijing Center for Disease Control and Prevention, Beijing, China
3. Chinese Center for Disease Control and Prevention, Beijing, China
^& Joint first authors.

Corresponding author: Xin Guo, guoxin@chinacdc.cn
Online Date: October 03 2025
Issue Date: October 03 2025

References

[1]	Zhang D, Wu M, Yi XD, Shi JP, Ouyang Y, Dong N, et al. Correlation analysis of myopia and dietary factors among primary and secondary school students in Shenyang, China. Sci Rep 2024;14(1):20619. https://doi.org/10.1038/s41598-024-71254-0.
[2]	Flitcroft DI, He MG, Jonas JB, Jong M, Naidoo K, Ohno-Matsui K, et al. IMI - defining and classifying myopia: a proposed set of standards for clinical and epidemiologic studies. Invest Ophthalmol Vis Sci 2019;60(3):M20–30. https://doi.org/10.1167/iovs.18-25957.
[3]	Sun YY, Wei SF, Li SM, Hu JP, Yang XH, Cao K, et al. Cycloplegic refraction by 1% cyclopentolate in young adults: is it the gold standard? The Anyang University Students Eye Study (AUSES). Br J Ophthalmol 2019;103(5):654. http://dx.doi.org/10.1136/bjophthalmol-2018-312199.
[4]	Ying BL, Chandra RS, Wang JY, Cui HG, Oatts JT. Machine learning models for predicting cycloplegic refractive error and myopia status based on non-cycloplegic data in Chinese students. Transl Vis Sci Technol 2024;13(8):16. https://doi.org/10.1167/tvst.13.8.16.
[5]	Liu Y, Qin R, Xiong Y, Gu F, Shi W, Liu JM, et al. Reference values of non-cycloplegic spherical equivalent for screening and predicting myopia among children and adolescents - China, 2020-2024. China CDC Wkly 2025;7(9):298–303. https://doi.org/10.46234/ccdcw2025.048.
[6]	Kim SR, Kang DH, Choe GS, Kim DH. Ensemble machine learning prediction model for clinical refraction using partial interferometry measurements in childhood. PLoS One 2025;20(7):e0328213. https://doi.org/10.1371/journal.pone.0328213.
[7]	Magome K, Morishige N, Ueno A, Matsui TA, Uchio E. Prediction of cycloplegic refraction for noninvasive screening of children for refractive error. PLoS One 2021;16(3):e0248494. https://doi.org/10.1371/journal.pone.0248494.
[8]	Zhao E, Wang XY, Zhang HY, Zhao E, Wang JY, Yang Y, et al. Author Correction: Ocular biometrics and uncorrected visual acuity for detecting myopia in Chinese school students. Sci Rep 2022;12(1):21184. https://doi.org/10.1038/s41598-022-25893-w.
[9]	Wang JY, Wang XY, Gao HM, Zhang HY, Yang Y, Gu F, et al. Prediction for cycloplegic refractive error in Chinese school students: model development and validation. Transl Vis Sci Technol 2022;11(1):15. https://doi.org/10.1167/tvst.11.1.15.

FIGURE 1. Permutation feature importance for predicting cycloplegic spherical equivalent (RMSE metric). (A) Random Forest, test set; (B) Random Forest, training set; (C) XGBoost, test set; (D) XGBoost, training set.

Abbreviation: SE=spherical equivalent; AL/CR=axial length/corneal radius ratio; AL=axial length; UCVA=uncorrected visual acuity; CR=corneal radius; XGBoost=eXtreme Gradient Boosting.


FIGURE 2. Actual vs. predicted cycloplegic SE by ML models. (A) Random Forest, test set (R²=0.878); (B) Random Forest, training set (R²=0.941); (C) XGBoost, test set (R²=0.886); (D) XGBoost, training set (R²=0.897).

Note: Dashed line represents perfect agreement; solid line represents the fitted linear regression.

Abbreviation: XGBoost=eXtreme Gradient Boosting; SE=spherical equivalent; ML=machine learning.


FIGURE 3. Bland–Altman analysis of ML model performance in predicting cycloplegic spherical equivalent. (A) Random Forest, test set; (B) Random Forest, training set; (C) XGBoost, test set; (D) XGBoost, training set.

Note: Plots display the differences between predicted and actual cycloplegic spherical equivalent (SE) values (in diopters, D) against their means. Solid horizontal lines represent mean differences, whereas dashed lines indicate 95% limits of agreement (±1.96 SD). The training sets (B and D) demonstrate narrower limits of agreement than those of test sets (A and C), reflecting expected performance variations between training and validation datasets.

Abbreviation: XGBoost=eXtreme Gradient Boosting.


TABLE 1. Baseline characteristics of the participants (n=58,252).

Variable	Test set (n=11,649), mean±SD or n (%)	Training set (n=46,603), mean±SD or n (%)
Age (years)	8.27±2.86	8.25±2.82
Sex
Male	6,036 (51.8)	24,073 (51.7)
Female	5,613 (48.2)	22,530 (48.3)
Region
Urban	8,867 (76.1)	35,686 (76.6)
Rural	2,782 (23.9)	10,917 (23.4)
Non-cycloplegic spherical equivalent (D)	−0.64±1.46	−0.63±1.44
Cycloplegic spherical equivalent (D)	0.09±1.59	0.09±1.57
Axial length (mm)	23.21±1.05	23.20±1.04
Corneal radius (mm)	7.79±0.26	7.78±0.26
AL/CR ratio	2.98±0.12	2.98±0.12
Uncorrected visual acuity (5-point scale; 5.0 = 0 logMAR)	4.85±0.28	4.85±0.28
Abbreviation: D=diopter; SD=standard deviation; AL=axial length; CR=corneal radius; logMAR=logarithm of the minimum angle of resolution.


TABLE 2. Performance of machine learning models for predicting cycloplegic spherical equivalent in the training and test sets.

Model	Test set (n=11,649)	Train set (n=46,603)
Random Forest
R²	0.88	0.94
RMSE (D)	0.55	0.39
MAE (D)	0.40	0.29
XGBoost
R²	0.89	0.90
RMSE (D)	0.54	0.51
MAE (D)	0.39	0.37
Abbreviation: D=diopter; XGBoost=eXtreme Gradient Boosting.




Refractive error screening in children is a major public health challenge in China. A Chinese study projects that by 2050, urban myopia rates will be 27.1% [95% confidence interval (CI): 10.0%–44.4%] for 7–9-year-olds and 81.5% (95% CI: 74.7%–88.3%) for 16–18-year-olds, with rural rates at 20.1% (95% CI: 8.6%–31.7%) and 74.1% (95% CI: 63.2%–84.8%), respectively. High myopia in 16–18-year-olds is expected to increase from 7.3% in 2001 to 22.1% by 2050 (1). Although cycloplegic refraction is the gold standard for accurate measurement (2), its implementation in large-scale school screenings is often impractical because of logistical constraints and potential side effects (3). Current reliance on non-cycloplegic refraction introduces substantial variability, with studies reporting mean differences in the range of 0.60–1.23 diopters (D) compared with cycloplegic measurements (4), particularly in younger children with strong accommodative responses. Recent advances in machine learning (ML) have shown promise in the prediction of ophthalmic parameters. However, their application in bridging cycloplegic and non-cycloplegic measurements remains underexplored in population-based settings. Ocular biometric parameters are important indicators for assessing hyperopia reserve in children and for objectively evaluating the development of refraction (5). This study aimed to develop and validate ML-based models to predict cycloplegic spherical equivalent (SE) in children and adolescents aged 5–18 years using non-cycloplegic refraction, axial length (AL), corneal radius (CR), AL/CR ratio, and uncorrected visual acuity (UCVA). These models may offer practical alternatives to cycloplegic refraction for the surveillance of myopia in large populations.

METHODS

This study used data from a national cross-sectional survey of ocular development in children and adolescents organized by the National Disease Control and Prevention Administration of China between 2020 and 2024. The survey was conducted in 10 provincial-level administrative divisions (PLADs): Beijing, Shanxi, Liaoning, Zhejiang, Shandong, Henan, Hunan, Guangdong, Chongqing, and Shaanxi. In most PLADs, two cities were selected on the basis of their levels of economic development. A multistage cluster sampling method was used to recruit students from kindergarten to high school. The detailed survey protocol has been previously described (5). A flowchart of the data collection procedure is presented in Supplementary Figure S1.

All participants underwent standardized eye examinations, including non-cycloplegic and cycloplegic autorefraction, using the same desktop autorefractor model at all sites. For cycloplegia, 0.5% tropicamide was applied four times at 5-min intervals. Trained personnel performed all assessments using a unified protocol. Owing to the high inter-eye correlation, only right-eye data were analyzed. SE was calculated as the spherical power plus half the cylindrical power. UCVA was recorded on the Chinese 5-point scale, where 5.0 corresponds to 0 logarithm of the minimum angle of resolution (logMAR) (Snellen 20/20).

For variable selection, univariate analyses were performed to assess the associations between each potential predictor and refractive error. Additionally, we calculated variance inflation factors (VIFs) to evaluate multicollinearity and applied least absolute shrinkage and selection operator (LASSO) regression to refine the variable set. Variables with significant univariate P-values, acceptable VIFs (<5), and retention in the LASSO model were included in the final analyses. In addition to non-cycloplegic SE, these were AL, CR, AL/CR ratio, UCVA, and age. Sex and region, despite their theoretical importance, were not retained in the LASSO model and were thus excluded from the final analysis.
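The derived quantities described above can be illustrated with a minimal Python sketch (the study itself used R). The SE formula and the anchor point 5.0 = 0 logMAR come from the text; the general conversion logMAR = 5.0 − (5-point value) is an assumption based on the standard definition of the 5-point notation, and all function names are hypothetical.

```python
def spherical_equivalent(sphere_d: float, cylinder_d: float) -> float:
    """SE in diopters: spherical power plus half the cylindrical power."""
    return sphere_d + cylinder_d / 2.0


def ucva_5point_to_logmar(ucva_5point: float) -> float:
    """Convert 5-point acuity to logMAR.

    Assumption: the linear relation logMAR = 5.0 - value, which reproduces
    the anchor stated in the text (5.0 on the 5-point scale = 0 logMAR).
    """
    return 5.0 - ucva_5point


def al_cr_ratio(axial_length_mm: float, corneal_radius_mm: float) -> float:
    """Axial length divided by corneal radius (both in mm, so unitless)."""
    return axial_length_mm / corneal_radius_mm


# Illustrative values only (not study data): sphere -1.00 D, cylinder -0.50 D.
example_se = spherical_equivalent(-1.00, -0.50)       # -1.25 D
example_logmar = round(ucva_5point_to_logmar(4.85), 2)  # 0.15 logMAR
example_ratio = round(al_cr_ratio(23.21, 7.79), 2)      # 2.98
```

Under this assumed conversion, the mean UCVA of 4.85 reported in Table 1 would correspond to roughly 0.15 logMAR.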

The dataset was randomly divided into training (80%) and test (20%) sets. As a baseline, a multivariable linear regression model was fitted using all the non-cycloplegic variables. On the test set, this model achieved an R-squared (R²) value of 0.79, a root mean square error (RMSE) of 0.73 D, and a mean absolute error (MAE) of 0.54 D, suggesting the need for more flexible ML algorithms. Random forest regression and eXtreme Gradient Boosting (XGBoost) regression were then used to develop predictive models, and model performance was evaluated using R², RMSE, MAE, actual-versus-predicted SE plots, and Bland–Altman plots. All analyses were performed using R software (version 4.5.1, R Foundation for Statistical Computing, Vienna, Austria, 2025).
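For readers who want to see the evaluation step concretely, the following is a minimal Python sketch of a random 80/20 split and of the three reported metrics. It is not the authors' R code: the function names, the seed, and the rounding rule for the test-set size are assumptions, so the split is not expected to reproduce the exact 46,603/11,649 counts.

```python
import math
import random


def train_test_split_indices(n, test_fraction=0.2, seed=0):
    """Shuffle row indices and return (train_indices, test_indices)."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    n_test = round(n * test_fraction)  # assumption: simple rounding
    return indices[n_test:], indices[:n_test]


def regression_metrics(actual, predicted):
    """Return R², RMSE and MAE for paired actual/predicted values (in D)."""
    n = len(actual)
    residuals = [p - a for a, p in zip(actual, predicted)]
    mean_actual = sum(actual) / n
    ss_res = sum(r * r for r in residuals)               # residual sum of squares
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)  # total sum of squares
    return {
        "r2": 1.0 - ss_res / ss_tot,
        "rmse": math.sqrt(ss_res / n),
        "mae": sum(abs(r) for r in residuals) / n,
    }


# Illustrative values only (not study data).
toy_actual = [0.0, -1.0, 1.0, -2.0]
toy_predicted = [0.5, -1.0, 0.5, -2.0]
toy_metrics = regression_metrics(toy_actual, toy_predicted)
```

RMSE penalizes large errors more heavily than MAE, which is why reporting both gives a fuller picture of how prediction errors are distributed.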

RESULTS

This study included 58,252 participants: 46,603 (80.0%) in the training set and 11,649 (20.0%) in the test set. As summarized in Table 1, baseline characteristics were well balanced between the two sets: mean ages were 8.25±2.82 years in the training set and 8.27±2.86 years in the test set, sex distributions were similar (51.7% vs. 51.8% male), and urban–rural residence patterns were comparable (76.6% vs. 76.1% urban residents). Ocular measurements were also similar between the training and test sets, including non-cycloplegic SE (−0.63±1.44 D vs. −0.64±1.46 D), cycloplegic SE (0.09±1.57 D vs. 0.09±1.59 D), AL (23.20±1.04 mm vs. 23.21±1.05 mm), CR (7.78±0.26 mm vs. 7.79±0.26 mm), AL/CR ratio (both 2.98±0.12), and UCVA (both 4.85±0.28 on the 5-point scale).


Both the random forest and XGBoost models exhibited strong performance in predicting cycloplegic SE using non-cycloplegic refraction and ocular biometric parameters. In the test set, random forest achieved an R² of 0.88 with an RMSE of 0.55 D and an MAE of 0.40 D, whereas XGBoost exhibited comparable performance (R²=0.89, RMSE=0.54 D, and MAE=0.39 D). Performance on the training set was higher (random forest: R²=0.94, RMSE=0.39 D; XGBoost: R²=0.90, RMSE=0.51 D) (Table 2). Permutation feature importance for each ML model in the training and test sets is shown in Figure 1. Non-cycloplegic SE, the AL/CR ratio, AL, and UCVA were consistently the four most important features for predicting cycloplegic SE in these models. Scatter plots of the predicted versus actual cycloplegic SE demonstrated satisfactory alignment along the identity line for both models (Figure 2). Bland–Altman plots showed that 95% of prediction errors were within ±2.0 D for both models, indicating acceptable agreement (Figure 3).
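The Bland–Altman quantities referred to here (mean difference and 95% limits of agreement at ±1.96 SD of the predicted-minus-actual differences, as described in the note to Figure 3) can be computed as in the following minimal Python sketch. The function name and the toy values are hypothetical, and the use of the sample standard deviation is an assumption.

```python
import statistics


def bland_altman(actual, predicted):
    """Bias and 95% limits of agreement for predicted vs. actual values."""
    diffs = [p - a for a, p in zip(actual, predicted)]      # y-axis of the plot
    means = [(p + a) / 2.0 for a, p in zip(actual, predicted)]  # x-axis of the plot
    bias = statistics.mean(diffs)
    sd = statistics.stdev(diffs)  # assumption: sample SD (n - 1 denominator)
    return {
        "bias": bias,
        "loa_lower": bias - 1.96 * sd,
        "loa_upper": bias + 1.96 * sd,
        "points": list(zip(means, diffs)),
    }


# Illustrative values only (not study data).
toy_result = bland_altman([0.0, -1.0, 1.0, -2.0], [0.5, -1.0, 0.5, -2.0])
```

If prediction errors are roughly normal and centered near zero, the limits of agreement are approximately ±1.96 times the RMSE, which links this analysis to the error metrics in Table 2.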



DISCUSSION

In this large population-based study involving 58,252 Chinese children and adolescents, we developed and validated ML models to predict cycloplegic SE refraction using non-cycloplegic SE and ocular biometric parameters. Both the random forest and XGBoost models demonstrated excellent predictive performance, with R² values approaching 0.90 and RMSEs of 0.54–0.55 D in the test dataset. These findings suggest that ML algorithms can provide accurate estimates of cycloplegic refractive error in large-scale field settings without pharmacological cycloplegia.

Our findings are consistent with and extend those of previous studies that explored the use of non-cycloplegic data to estimate refractive status. Similar studies conducted on Chinese school-aged populations have reported R² values between 0.80 and 0.94 using models such as LASSO, support vector regression, and ensemble approaches (4,6). However, most previous studies have relied on smaller sample sizes or fewer input variables. By incorporating a broad set of ocular biometric parameters, including AL, CR, AL/CR ratio, and UCVA, the proposed models captured more of the physiological variations underlying refractive error, leading to higher predictive accuracy.

The feature-importance rankings further underscore the value of integrating multiple ocular measures. In both the random forest and XGBoost models, non-cycloplegic SE, AL/CR ratio, AL, and UCVA consistently emerged as the top contributors. This aligns with the known roles of axial elongation and optical geometry in the development of myopia, and supports their inclusion in prediction tools when available. These measures are associated with cycloplegic refractive error and have been used to predict it in previous studies (7–9).
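Figure 1 reports permutation feature importance with RMSE as the metric. As a minimal Python sketch of that general technique (not the authors' R implementation), one feature at a time is shuffled and the resulting increase in RMSE is recorded. The stand-in model, its coefficients, the feature names, and the data below are all hypothetical and carry no clinical meaning.

```python
import math
import random


def rmse(actual, predicted):
    """Root mean square error between paired values."""
    return math.sqrt(sum((p - a) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def permutation_importance(predict, rows, actual, feature_names, n_repeats=5, seed=0):
    """Mean increase in RMSE after shuffling each feature across rows."""
    rng = random.Random(seed)
    baseline = rmse(actual, [predict(row) for row in rows])
    importance = {}
    for name in feature_names:
        increases = []
        for _ in range(n_repeats):
            shuffled = [row[name] for row in rows]
            rng.shuffle(shuffled)  # break the link between this feature and the outcome
            permuted = [{**row, name: value} for row, value in zip(rows, shuffled)]
            increases.append(rmse(actual, [predict(row) for row in permuted]) - baseline)
        importance[name] = sum(increases) / n_repeats
    return importance


def toy_predict(row):
    """Hypothetical stand-in for a fitted model; ignores the 'unused' feature."""
    return 0.9 * row["noncyclo_se"] + 0.7


# Synthetic rows for illustration only (not study data).
toy_rows = [{"noncyclo_se": -3.0 + 0.1 * i, "unused": float(i % 7)} for i in range(50)]
toy_actual = [toy_predict(row) for row in toy_rows]
toy_importance = permutation_importance(
    toy_predict, toy_rows, toy_actual, ["noncyclo_se", "unused"]
)
```

A feature the model does not use yields an importance of zero, whereas shuffling a feature the model relies on inflates the RMSE; larger increases indicate greater reliance on that feature.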

From a public health implementation perspective, our study demonstrated that ML models trained on non-cycloplegic refraction and ocular biometry can serve as effective alternatives for estimating cycloplegic SE in large-scale pediatric eye health surveillance. The strong predictive performance, along with minimal residual bias and tight limits of agreement, suggests that these models are sufficiently robust for estimating refractive status where cycloplegia is infeasible. This approach can complement existing screening programs, support a more efficient allocation of clinical resources, and enhance the accuracy of population-level myopia monitoring without increasing procedural burden.

This study has several limitations. First, the cross-sectional design limits causal inference and prevents the tracking of refractive changes over time. Unmeasured confounders may also have biased model estimates. Second, although the overall model performance was robust, prediction errors exceeding ±2.0 D occurred in some cases, which may be clinically relevant. Third, although our sample covered 10 PLADs with diverse geographic and socioeconomic profiles, external validation in other countries or ethnic groups is required to assess generalizability. Finally, we focused on continuous SE prediction instead of the categorical classification of myopia. Although classification may offer direct clinical utility, continuous prediction allows a more nuanced interpretation and avoids threshold bias.

In conclusion, this study demonstrated that ML models incorporating non-cycloplegic refraction and ocular biometric parameters can accurately estimate cycloplegic SE in children. Such models hold promise for enhancing refractive error surveillance in large-scale community-based settings, where cycloplegia is infeasible.

Acknowledgements

We thank all staff members who contributed to data collection and all students who participated in this study.

Ethical statements

This study received ethics approval from the institutional review board of Beijing Centers for Disease Prevention and Control (2022 No. 24).

