Objective study of the facial parameters of observations in patients with type 2 diabetes mellitus by machine learning
Introduction
According to the International Diabetes Federation’s 2021 Diabetes Atlas, China has the largest number of adult diabetes patients (1). According to a national epidemiological survey published in 2020, the prevalence of diabetes in China is 12.8% and the prevalence of pre-diabetes is 35.2% (2). It is predicted that by 2045, the number of diabetes patients in China will reach 174.4 million (3). The early detection of diabetes is very difficult, and once diabetes is diagnosed, it cannot be reversed (4). Thus, it is particularly important to find early indicators to predict diabetes.
Evolutionary algorithms are a family of algorithms with broad applicability and good global optimization ability; they include genetic algorithms, ant colony algorithms, and others, and have a wide range of applications. Compared with traditional optimization algorithms, evolutionary algorithms are self-adaptive and self-learning, which makes them more efficient in dealing with complex problems. These advantages make evolutionary algorithms worth considering for the feature selection step of machine learning (5). The purpose of feature selection is to improve the performance of the model by simplifying the feature set. When the original dataset has high dimensionality and high redundancy, a model trained directly on it generally does not perform well; after feature selection, better results may be obtained (6). Using the adaptive optimization ability of an evolutionary algorithm to screen the features of the original dataset can remove redundant features and reduce dimensionality more effectively, so that a more accurate model can be trained. Optimizing the model parameters can also improve the experimental results and model performance, and evolutionary algorithms are widely used for this purpose as well.
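The idea of evolutionary feature selection can be illustrated with a minimal genetic algorithm sketch. This is not the method used in this study; the fitness function `toy_fitness`, the set of "informative" feature indices, and all hyperparameters are hypothetical and chosen only for illustration. Each candidate solution is a 0/1 mask over the features, and the best mask is carried forward unchanged in every generation (elitism).

```python
import random

def toy_fitness(mask, informative=(0, 2, 5), penalty=0.1):
    """Hypothetical score: reward informative features, penalise subset size."""
    hits = sum(1 for i in informative if mask[i])
    return hits - penalty * sum(mask)

def evolve_feature_mask(n_features, fitness, pop_size=20, generations=30,
                        mutation_rate=0.05, seed=0):
    """Evolve a 0/1 feature mask; returns the best mask and the best score per generation."""
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(n_features)]
                  for _ in range(pop_size)]
    best_history = []
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        best_history.append(fitness(population[0]))
        parents = population[: pop_size // 2]      # truncation selection
        children = [population[0][:]]              # elitism: keep the best unchanged
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n_features)     # one-point crossover
            child = a[:cut] + b[cut:]
            # bit-flip mutation
            child = [bit ^ 1 if rng.random() < mutation_rate else bit
                     for bit in child]
            children.append(child)
        population = children
    best = max(population, key=fitness)
    return best, best_history

best_mask, history = evolve_feature_mask(8, toy_fitness)
print(best_mask, history[-1])
```

In a real application the fitness function would be a cross-validated model score computed on the selected columns rather than a hand-written rule.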
From its initial use in symbol deduction to its current large-scale successful applications in the fields of recommendation systems, computational advertising, face recognition, image recognition, speech recognition, machine translation, games, and so on, artificial intelligence has been developing since the mid-1960s. Breakthroughs have been made in protein structure prediction (7), new drug discoveries (8), and other fields (9). In recent years, breakthroughs in key technologies, such as image recognition, deep learning, and neural networks, have led to the combination of medicine and artificial intelligence (9,10). The field of intelligent medicine has developed rapidly; for example, it now includes classical intelligent face recognition and key-point description methods, such as the eigenface method, and the singular value decomposition method (11,12). The facial detection image-processing algorithm is based on Gaussian model detection and AdaBoost algorithm detection (13).
There are also many reports on the prediction and diagnosis of diabetes. Based on a meta-analysis, an optimized logistic regression model was used to screen patients for complications of type 2 diabetes mellitus (T2DM) and to standardize the high-risk groups by risk assessment (14). Convolutional neural networks have been applied to predict diabetic complications (15), and good results have been obtained. Despite the deepening application of machine learning and artificial intelligence methods in the medical field, there is as yet no method that makes effective use of existing, easily available data and integrates the theory of traditional Chinese medicine (TCM) to predict diseases accurately.
This study was based on previous research on TCM theory combined with artificial intelligence methods. In this study, we used a collection of patients’ facial features and clinical information to predict diabetes. We sought to screen the objective parameters of the risk factors of the facial morphological features of T2DM via a predictive model. We present the following article in accordance with the STARD reporting checklist (available at https://atm.amegroups.com/article/view/10.21037/atm-22-3580/rc).
Methods
Research object data collection
The subjects were hospitalized and regularly followed up at the Lanzhou Second People’s Hospital from December 2017 to September 2021. The study was conducted in accordance with the Declaration of Helsinki (as revised in 2013). The study was approved by the ethics committee of Lanzhou Second People’s Hospital (No. 2019071A). Informed consent was obtained from all the patients. A total of 2,574 subjects met the criteria for the study, of whom 1,464 were male and 1,110 were female, and 1,590 had T2DM and 984 did not have T2DM. The age of the subjects was 60.27±10.13 years. The non-T2DM subjects included 63 subjects with type 1 diabetes, 34 with impaired glucose tolerance, 113 with hypertension, 396 with chronic kidney disease, and 378 with other diseases.
Diagnostic criteria
This study adopted the 1999 World Health Organization (WHO) diagnostic criteria for diabetes (16): diabetes symptoms + a plasma glucose level at any time ≥11.1 mmol/L (200 mg/dL), or a fasting plasma glucose (FPG) ≥7.0 mmol/L (126 mg/dL), or a 2-hour plasma glucose level ≥11.1 mmol/L (200 mg/dL) in an oral glucose tolerance test (OGTT). At least one of the above criteria had to be satisfied. For the non-T2DM volunteers, we referred to the relevant guidelines for diagnosis.
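The thresholds above can be written as a small rule check. The sketch below is illustrative only: the function names and the handling of unmeasured values are assumptions, and it reads "diabetes symptoms +" as applying to the random (any-time) glucose criterion only, which is one interpretation of the wording.

```python
def mmol_to_mgdl(glucose_mmol_per_l):
    """Convert plasma glucose from mmol/L to mg/dL (glucose molar mass ~180.16 g/mol)."""
    return glucose_mmol_per_l * 18.016

def meets_who_1999_criteria(has_symptoms, random_glucose=None, fpg=None, ogtt_2h=None):
    """Return True if at least one of the three criteria is met.

    Glucose values are plasma concentrations in mmol/L; None means "not measured".
    Assumption: symptoms are required only for the random-glucose criterion.
    """
    random_hit = has_symptoms and random_glucose is not None and random_glucose >= 11.1
    fasting_hit = fpg is not None and fpg >= 7.0
    ogtt_hit = ogtt_2h is not None and ogtt_2h >= 11.1
    return bool(random_hit or fasting_hit or ogtt_hit)

print(meets_who_1999_criteria(False, fpg=7.4))
print(round(mmol_to_mgdl(7.0)), round(mmol_to_mgdl(11.1)))
```

The conversion helper reproduces the mg/dL values quoted in parentheses in the criteria.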
Inclusion criteria
To be eligible for inclusion in this study, patients had to meet the following inclusion criteria: (I) have no critical primary disease; (II) be older than 18 and younger than 70 years; (III) sign the informed consent form; (IV) have no history of mental illness or acute disease; and (V) have no obvious facial deformity. None of the subjects had a facial scar or had ever undergone plastic surgery.
Exclusion criteria
Patients were excluded from the study if they met any of the following exclusion criteria: (I) had undergone artificial facial modification; and/or (II) had facial trauma and an abnormal expression in terms of facial shape and color.
Image acquisition
The Daosheng meridian detection and evaluation system (Shanghai Daosheng Medical Technology Co., Ltd., Shanghai, China) was used. The information was collected by the Customer Observation Collection Laboratory at the TCM Department of Lanzhou Second People’s Hospital. The collection period was from December 2017 to September 2021. The environment was kept at a room temperature of 24±4 ℃.
Sampling preparation
The subjects had clean faces and no beards, and were asked to remove any accessories and/or glasses and wash their faces with clean water. The subjects’ forehead, eyebrow, and facial skin areas were exposed, and their heads were kept upright.
Image acquisition process
The subjects sat in a stable posture with a natural expression and did not blink. The face was in the center of the scanner. After adjustment, the operator opened the software to shoot the images. The images of each patient were recorded by experienced, fellowship-trained operators, who were blinded to all patient information. The clinical information and index test results were unavailable to the assessors of the reference standard.
Image storage
The captured images were saved in *.JPG format. No special treatment was required. Additionally, if a scanned image was partially defective, it was re-photographed.
Image measurement
To ensure the classification ability of the final data, it was necessary to describe the facial shape features in as much detail as possible. With reference to anthropometry, 81 facial landmarks were used, taken from a GitHub repository (https://github.com/codeniko/shape_predictor_81_face_landmarks). Based on previous research (including research on the facial features of the young, the elderly, and those with other diseases), 18 facial morphological parameters were measured (16-19) (see Figure 1).
Measured content
The final analysis of this design resulted in a total of 18 morphological components, including the surface width, aspect ratio, and roundness. If the contours of the face were obscured by hair, or if there was a difference in the upper edge of the face due to baldness, this was corrected during marking.
Data extraction
The coordinate system of the original facial images was determined by machine learning. The measuring point coordinates were then marked in this coordinate system so that the values could be measured automatically, and the results were saved to an Excel table. The specific indicators included the following 18 facial parameters:
- Face_rate: face length/width;
- Face_forehead_top: the linear distance between the left and right frontotemporal points;
- Face_forehead_bottom: the distance between the intersection of the hairline on both sides of the horizontal line on the brow bone;
- Face_forehead_cos1: the angle between the left hairline and the upper hairline;
- Face_forehead_cos2: the angle between the right hairline and the upper hairline;
- Face_forehead_rate: upper frontal width/lower frontal width;
- Nose_rate: nose length/straight-line distance between the Yingfeng points on both sides;
- Face_width1: the distance between the intersections of the hairline on both sides of the horizontal line of the eyebrow;
- Face_width2: the distance between the zygomatic points;
- Face_width3: the distance between the left and right mandibular corners;
- Face_cos1: the angle between the left hairline and the upper stop line;
- Face_cos2: the angle between the right hairline and the upper stop line;
- Face_cos3: the inclination of the line between the left zygomatic point and the chin;
- Face_cos4: the inclination of the line between the right zygomatic point and the chin;
- Face_cos5: the inclination of the line between the left mandibular point and the chin;
- Face_cos6: the inclination of the line between the right mandibular point and the chin;
- Face_cos7: the radian of the line between the left mandibular point and the chin;
- Face_cos8: the radian of the line between the right mandibular point and the chin.
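A few of the parameters listed above can be sketched from landmark coordinates as follows. The landmark names and pixel coordinates are hypothetical, the width used in the face length/width ratio is assumed to be the zygomatic width, and the inclination is reported as a raw angle in degrees; the source does not specify units, landmark indices, or any rescaling of the values.

```python
import math

def distance(p, q):
    """Euclidean distance between two (x, y) points in pixels."""
    return math.hypot(q[0] - p[0], q[1] - p[1])

def inclination_deg(p, q):
    """Angle in degrees (0 to 90) between the line p-q and the horizontal axis."""
    dx, dy = abs(q[0] - p[0]), abs(q[1] - p[1])
    return math.degrees(math.atan2(dy, dx))

# Hypothetical landmark coordinates (image pixels, y grows downward).
points = {
    "forehead_top": (200, 50),
    "chin": (200, 350),
    "zygion_left": (80, 200),
    "zygion_right": (320, 200),
    "gonion_left": (110, 290),
    "gonion_right": (290, 290),
}

face_length = distance(points["forehead_top"], points["chin"])
face_width2 = distance(points["zygion_left"], points["zygion_right"])   # zygomatic width
face_width3 = distance(points["gonion_left"], points["gonion_right"])   # mandibular width
face_rate = face_length / face_width2                                    # length/width (assumed width)
face_cos3 = inclination_deg(points["zygion_left"], points["chin"])       # left zygomatic point to chin

print(face_rate, face_width2, face_width3, round(face_cos3, 2))
```

Ratios such as face_rate do not depend on image scale, whereas raw distances such as face_width2 do, so distances would need a common scale before being compared across subjects.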
Statistical analysis
Least absolute shrinkage and selection operator (LASSO) regression builds on linear regression to address multicollinearity in machine learning (20,21). The LASSO objective function adds a penalty on the absolute size of the coefficients, controlled by a shrinkage parameter (λ), which can shrink some coefficients to exactly zero and thus remove the corresponding features from the final model (22). LASSO regression was used for the analysis in this study.
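For reference, a common form of the LASSO objective for a binary outcome is given below. The binomial (logistic) form is an assumption made here because the outcome is T2DM versus non-T2DM; the source does not state the exact likelihood used. Here $y_i \in \{0,1\}$ is the diagnosis of subject $i$, $x_i$ is the vector of $p$ facial parameters, and $\lambda \ge 0$ is the shrinkage parameter.

```latex
(\hat{\beta}_0, \hat{\beta}) = \arg\min_{\beta_0,\,\beta}
\left\{
  -\frac{1}{n}\sum_{i=1}^{n}
  \Big[\, y_i\,(\beta_0 + x_i^{\top}\beta)
        - \log\!\big(1 + e^{\beta_0 + x_i^{\top}\beta}\big) \Big]
  + \lambda \sum_{j=1}^{p} \lvert \beta_j \rvert
\right\}
```

Larger values of $\lambda$ set more coefficients $\beta_j$ exactly to zero, which is how the method performs variable selection.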
Results
Parameter characteristics included in the sample
A LASSO regression model was used to screen the variables to establish a prediction model. Receiver operating characteristic (ROC) curves were plotted, and the areas under the curves (AUCs) were calculated. The between-group differences obtained from the statistical analysis are shown in Table 1. The flow of participants is shown in Figure 2. This study included a T2DM case group (n=1,590) and a control group (n=984). For eight parameters (i.e., face_rate, face_forehead_top, face_forehead_bottom, face_forehead_rate, face_forehead_cos1, face_forehead_cos2, face_width2, and face_width3), there were significant differences between the T2DM group and the non-T2DM group (P<0.05; see Table 1).
Table 1
Parameters | Non-T2DM (n=984), mean (SD) | T2DM (n=1,590), mean (SD) | P value |
---|---|---|---|
Face_rate | 0.45 (0.17) | 0.47 (0.16) | 0.013* |
Face_forehead_top | 0.44 (0.18) | 0.52 (0.16) | <0.001** |
Face_forehead_bottom | 0.47 (0.12) | 0.51 (0.13) | <0.001** |
Face_forehead_rate | 0.51 (0.20) | 0.59 (0.18) | <0.001** |
Face_forehead_cos1 | 0.62 (0.12) | 0.60 (0.13) | <0.001** |
Face_forehead_cos2 | 0.51 (0.13) | 0.48 (0.14) | <0.001** |
Nose_rate | 0.30 (0.13) | 0.33 (0.12) | <0.001** |
Face_cos1 | 0.50 (0.14) | 0.51 (0.14) | 0.071 |
Face_cos2 | 0.48 (0.13) | 0.49 (0.14) | 0.115 |
Face_cos3 | 0.39 (0.13) | 0.40 (0.14) | 0.075 |
Face_cos4 | 0.35 (0.13) | 0.36 (0.13) | 0.199 |
Face_cos5 | 0.50 (0.12) | 0.48 (0.13) | 0.029* |
Face_cos6 | 0.51 (0.13) | 0.50 (0.13) | 0.099 |
Face_cos7 | 0.56 (0.13) | 0.55 (0.13) | 0.081 |
Face_cos8 | 0.61 (0.14) | 0.60 (0.14) | 0.264 |
Face_width1 | 0.49 (0.15) | 0.51 (0.14) | 0.003* |
Face_width2 | 0.45 (0.15) | 0.49 (0.14) | <0.001** |
Face_width3 | 0.51 (0.13) | 0.53 (0.13) | <0.001** |
*, P<0.05; **, P<0.01. T2DM, type 2 diabetes mellitus; SD, standard deviation.
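The source does not state which test produced the P values in Table 1. As a rough check only, the sketch below computes a Welch t statistic from the reported means, standard deviations, and group sizes; the choice of Welch's test is an assumption, and because the table values are rounded to two decimals the published P values will not be reproduced exactly.

```python
import math

def welch_t(mean1, sd1, n1, mean2, sd2, n2):
    """Welch t statistic and approximate degrees of freedom from summary statistics."""
    v1, v2 = sd1 ** 2 / n1, sd2 ** 2 / n2
    t = (mean2 - mean1) / math.sqrt(v1 + v2)
    # Welch-Satterthwaite approximation for the degrees of freedom
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df

# Face_forehead_top row of Table 1: non-T2DM 0.44 (0.18), n=984; T2DM 0.52 (0.16), n=1,590
t_stat, df = welch_t(0.44, 0.18, 984, 0.52, 0.16, 1590)
print(round(t_stat, 2), round(df))
```

A statistic of this size on roughly 1,900 degrees of freedom is in line with the P<0.001 reported for that row.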
Establishment and verification of the multi-factor model
All the sample cases were randomly divided into a training set (n=2,060) and a testing set (n=514). The 18 facial measurement variables were used as candidate predictors. The LASSO regression model was used to screen the facial features affecting T2DM (see Figures 1,3). The two dotted lines in Figure 3 indicate two special λ values; that is, lambda.min and lambda.1se. Using the training set, we constructed two prediction models, one at lambda.min (0.0001816427), which gives the minimum cross-validated error, and one at lambda.1se (0.006558061), the largest λ whose error is within one standard error of that minimum. The lambda.min model included all 18 parameters. The lambda.1se model included eight parameters; that is, face_rate, face_forehead_top, face_forehead_bottom, face_forehead_cos1, face_forehead_cos2, nose_rate, face_cos1, and face_width2.
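The usual rule for choosing lambda.min and lambda.1se from cross-validation output can be sketched as follows. The λ grid, mean errors, and standard errors are hypothetical numbers, not the values from this study, and the function name `select_lambdas` is introduced here for illustration.

```python
def select_lambdas(lambdas, cv_mean, cv_se):
    """Return (lambda_min, lambda_1se) from per-lambda cross-validation summaries.

    lambda_min: the lambda with the smallest mean cross-validated error.
    lambda_1se: the largest lambda whose mean error is within one standard
                error (taken at lambda_min) of that smallest error.
    """
    i_min = min(range(len(lambdas)), key=lambda i: cv_mean[i])
    threshold = cv_mean[i_min] + cv_se[i_min]
    lambda_1se = max(lam for lam, err in zip(lambdas, cv_mean) if err <= threshold)
    return lambdas[i_min], lambda_1se

# Hypothetical cross-validation results (e.g., mean binomial deviance per lambda)
lambdas = [0.0001, 0.001, 0.005, 0.01, 0.05]
cv_mean = [1.250, 1.252, 1.256, 1.270, 1.320]
cv_se = [0.008, 0.008, 0.008, 0.009, 0.010]

lam_min, lam_1se = select_lambdas(lambdas, cv_mean, cv_se)
print(lam_min, lam_1se)
```

Because lambda.1se is never smaller than lambda.min, it applies at least as much shrinkage and typically keeps fewer predictors, which matches the smaller eight-parameter model described above.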
According to the ROC curve analysis, both prediction models showed moderate predictive performance for T2DM, with AUCs of 0.695 and 0.682, respectively (see Figure 4).
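An AUC can be read as the probability that a randomly chosen T2DM case receives a higher predicted score than a randomly chosen non-T2DM case. The sketch below computes it directly from that definition; the labels and scores are toy values, not study data, and the pairwise loop is chosen for clarity rather than speed.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of positive/negative pairs ranked correctly (ties count half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes are required")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

# Toy example: 1 = case, 0 = control
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
toy_auc = roc_auc(toy_labels, toy_scores)
print(toy_auc)
```

On this scale 0.5 corresponds to chance-level ranking and 1.0 to perfect separation.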
Additionally, the same verification was performed on the testing set, and the AUCs were 0.686 and 0.668, respectively (see Figure 5). The results are shown in Table 2.
Table 2
Confusion matrix | Forecast =0 | Forecast =1 |
---|---|---|
Real =0 | 218 | 145 |
Real =1 | 182 | 255 |
The true class was represented with real =0 or 1; the predicted class was represented with forecast =0 or 1. ROC, receiver operating characteristic.
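Standard summary measures can be derived from the four cells of Table 2. The sketch below assumes that class 1 denotes T2DM; the source does not say which model or data subset the table describes, and the four cells sum to 800, so the derived values are illustrative of the calculation only.

```python
def confusion_metrics(tn, fp, fn, tp):
    """Summary measures from a 2x2 confusion matrix (class 1 treated as positive)."""
    total = tn + fp + fn + tp
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),   # true positive rate
        "specificity": tn / (tn + fp),   # true negative rate
        "ppv": tp / (tp + fp),           # positive predictive value
        "npv": tn / (tn + fn),           # negative predictive value
    }

# Cells of Table 2: real=0/forecast=0, real=0/forecast=1, real=1/forecast=0, real=1/forecast=1
metrics = confusion_metrics(tn=218, fp=145, fn=182, tp=255)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With these counts the accuracy is 473/800, the sensitivity 255/437, and the specificity 218/363.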
The results of the facial feature extraction
Based on machine discrimination of the subjects’ facial image samples, the machine predictions of T2DM were compared with the actual T2DM diagnoses of the 2,574 samples included in the study. The ROC curve analysis showed that the lambda.min model had an AUC of 0.695, while the lambda.1se model had an AUC of 0.682. For the testing set, the AUCs were 0.686 and 0.668, respectively. These results indicate that the prediction models are usable and that the objective parameters of facial features recognized by machine learning have a certain value in the automatic prediction of T2DM.
According to the results for the objective parameters of the facial images obtained by machine recognition, eight parameters (i.e., face_rate, face_forehead_top, face_forehead_bottom, face_forehead_rate, face_forehead_cos1, face_forehead_cos2, face_width2, and face_width3) differed significantly between the T2DM group and the non-T2DM group (P<0.05). The T2DM patients had certain recognizable facial features. This study quantified the objective parameters and provided an objective basis for TCM facial inspections.
Discussion
The above results were based on the clinical data of inpatients, which were useful for predicting treatment effects and diagnosing patients. However, a large amount of clinical data and many biochemical indexes are still needed for rapid diagnosis and early risk warnings. Even when data from TCM theory are used, information on the patient’s tongue, face, sublingual area, pulse, smell, and so on needs to be collected to make a non-invasive diagnosis (23).
In clinical practice, the diagnosis of a disease requires a comprehensive evaluation of patients (24,25). Facial morphological features can only provide an auxiliary reference (26,27). It can be predicted that, with the in-depth study of artificial intelligence (28), multi-information data integration will be needed to provide clinicians with more accurate prediction and digital evaluation reports (29,30). Multiple organs, including the islets, liver, skeletal muscle, adipose tissue, intestinal tract, and hypothalamus, as well as the immune system, play a role in the pathogenesis of T2DM (31). T2DM is closely related to hereditary factors, obesity, and other mechanisms. Facial features are formed under the joint action of genetic information, bone, muscle, fat, and other factors, which is consistent with the pathogenesis of T2DM (32-35). At the same time, facial features are influenced mainly by physical factors. Thus, the objective parameters of facial features should be valuable in the diagnosis and prediction of chronic diseases.
As with any statistical method, the LASSO regression method has a number of limitations (36). First, variable selection was entirely statistically driven. Unlike a human analyst, the LASSO selection process does not take theoretical and other considerations into account when deciding which predictors to include. In addition, our study had two potential sources of bias that are common in predictive research (i.e., the reference standards and the sample size).
Conclusions
In conclusion, facial features are influenced mainly by physical factors. Thus, the objective parameters of facial features should be of specific value in the differential diagnosis of T2DM.
Acknowledgments
Funding: This work was supported by the Gansu Administration of Traditional Chinese Medicine (No. GZK-2019-75).
Footnote
Reporting Checklist: The authors have completed the STARD reporting checklist. Available at https://atm.amegroups.com/article/view/10.21037/atm-22-3580/rc
Data Sharing Statement: Available at https://atm.amegroups.com/article/view/10.21037/atm-22-3580/dss
Conflicts of Interest: All authors have completed the ICMJE uniform disclosure form (available at https://atm.amegroups.com/article/view/10.21037/atm-22-3580/coif). The authors have no conflicts of interest to declare.
Ethical Statement: The authors are accountable for all aspects of the work in ensuring that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved. The study was conducted in accordance with the Declaration of Helsinki (as revised in 2013). The study was approved by the ethics committee of Lanzhou Second People’s Hospital (No. 2019071A). Informed consent was obtained from all the patients.
Open Access Statement: This is an Open Access article distributed in accordance with the Creative Commons Attribution-NonCommercial-NoDerivs 4.0 International License (CC BY-NC-ND 4.0), which permits the non-commercial replication and distribution of the article with the strict proviso that no changes or edits are made and the original work is properly cited (including links to both the formal publication through the relevant DOI and the license). See: https://creativecommons.org/licenses/by-nc-nd/4.0/.
References
- Sun H, Saeedi P, Karuranga S, et al. IDF Diabetes Atlas: Global, regional and country-level diabetes prevalence estimates for 2021 and projections for 2045. Diabetes Res Clin Pract 2022;183:109119. [Crossref] [PubMed]
- Li Y, Teng D, Shi X, et al. Prevalence of diabetes recorded in mainland China using 2018 diagnostic criteria from the American Diabetes Association: national cross sectional study. BMJ 2020;369:m997. [Crossref] [PubMed]
- IDF Diabetes Atlas. Brussels: International Diabetes Federation, 2020. Available online: https://www.idf.org
- Carmichael J, Fadavi H, Ishibashi F, et al. Advances in Screening, Early Diagnosis and Accurate Staging of Diabetic Neuropathy. Front Endocrinol (Lausanne) 2021;12:671257. [Crossref] [PubMed]
- Wang J, Hao L, Zhou X, et al. Clinical application of traditional Chinese medicine moisture exposed burn ointment in the treatment of facial soft tissue defect. J Cosmet Dermatol 2022;21:2481-7. [Crossref] [PubMed]
- Dang W, Liu YY. On the classification of personality temperament in traditional Chinese medicine-25 people with yin and yang. Shanxi Journal of Traditional Chinese Medicine 1990;8-9.
- Senior AW, Evans R, Jumper J, et al. Improved protein structure prediction using potentials from deep learning. Nature 2020;577:706-10. [Crossref] [PubMed]
- Freedman DH. Hunting for New Drugs with AI. Nature 2019;576:S49-53. [Crossref] [PubMed]
- Guo Y. On the History, Status Quo and the Development Strategy of Artificial Intelligence. Renming Luntan·Xueshu Qianyan 2021;(23):41-53.
- Zhao K. Analysis of the application of artificial intelligence in medical field. China New Telecommunications 2018;20:221-2.
- Zheng N. Application and Analysis of artificial Intelligence in the Field of Medical and Health. China Health Industry 2017;14:195-6.
- Hong ZQ. Algebraic feature extraction of image for recognition. Pattern Recognition 1991;24:211-9. [Crossref]
- Mao HC. Extraction of diagnostic information for face-to-face diagnosis of traditional Chinese medicine: research and implementation of key algorithms. Xiamen: Xiamen University, 2007.
- Liu XY. Based on the Meta-analysis of the Risk of Type II diabetes complications Study by Logistic regression model. Chongqing: Third Military Medical University, 2016.
- He B. Study on Prediction of Diabetes Mellitus based on convolution Neural Network. Chongqing: Southwest University, 2019.
- World Health Organization. Definition, diagnosis and classification of diabetes mellitus and its complications. 1999. Available online: https://www.staff.ncl.ac.uk/philip.home/who_dmc.htm
- Panda S, Das A, Lahiri K, et al. Facial Acanthosis Nigricans: A Morphological Marker of Metabolic Syndrome. Indian J Dermatol 2017;62:591-7. [PubMed]
- Rani D, Krishan K, Sahani R, et al. Evaluation of Morphological Characteristics of the Human Ear in Young Adults. J Craniofac Surg 2020;31:1692-8. [Crossref] [PubMed]
- Velazco de Maldonado GJ, Suárez-Vega DV, García-Guevara V, et al. Innovative Paradigm in Aesthetics Medicine: Proposal for Diagnostic Morphological Geometric by Thirds, Semiology in Clinical Applied to Aging Facial. J Cutan Aesthet Surg 2020;13:112-23. [PubMed]
- AbdAlmageed W. Assessment of Facial Morphologic Features in Patients With Congenital Adrenal Hyperplasia Using Deep Learning. JAMA Netw Open 2020;3:e2022199. [Crossref] [PubMed]
- Zhou ZR, Wang WW, Li Y, et al. In-depth mining of clinical data: the construction of clinical prediction model with R. Ann Transl Med 2019;7:796. [Crossref] [PubMed]
- Cai W, van der Laan M. Nonparametric bootstrap inference for the targeted highly adaptive least absolute shrinkage and selection operator (LASSO) estimator. Int J Biostat 2020; Epub ahead of print. [Crossref] [PubMed]
- Huang GM, Huang KY, Lee TY, et al. An interpretable rule-based diagnostic classification of diabetic nephropathy among type 2 diabetes patients. BMC Bioinformatics 2015;16:S5. [Crossref] [PubMed]
- Christ-Crain M, Winzeler B, Refardt J. Diagnosis and management of diabetes insipidus for the internist: an update. J Intern Med 2021;290:73-87. [Crossref] [PubMed]
- Oshima M, Shimizu M, Yamanouchi M, et al. Trajectories of kidney function in diabetes: a clinicopathological update. Nat Rev Nephrol 2021;17:740-50. [Crossref] [PubMed]
- Wang M, Chen W. Age prediction based on a small number of facial landmarks and texture features. Technol Health Care 2021;29:497-507. [Crossref] [PubMed]
- Bacci N, Briers N, Steyn M. Assessing the effect of facial disguises on forensic facial comparison by morphological analysis. J Forensic Sci 2021;66:1220-33. [Crossref] [PubMed]
- Reel PS, Reel S, Pearson E, et al. Using machine learning approaches for multi-omics data analysis: A review. Biotechnol Adv 2021;49:107739. [Crossref] [PubMed]
- Formosa N, Quddus M, Ison S, et al. Predicting real-time traffic conflicts using deep learning. Accid Anal Prev 2020;136:105429. [Crossref] [PubMed]
- Han SYS, Tomasik J, Rustogi N, et al. Diagnostic prediction model development using data from dried blood spot proteomics and a digital mental health assessment to identify major depressive disorder among individuals presenting with low mood. Brain Behav Immun 2020;90:184-95. [Crossref] [PubMed]
- Kolb H, Eizirik DL. Resistance to type 2 diabetes mellitus: a matter of hormesis? Nat Rev Endocrinol 2011;8:183-92. [Crossref] [PubMed]
- Dolci C, Sansone VA, Gibelli D, et al. Distinctive facial features in Andersen-Tawil syndrome: A three-dimensional stereophotogrammetric analysis. Am J Med Genet A 2021;185:781-9. [Crossref] [PubMed]
- Meng T, Guo X, Lian W, et al. Identifying Facial Features and Predicting Patients of Acromegaly Using Three-Dimensional Imaging Techniques and Machine Learning. Front Endocrinol (Lausanne) 2020;11:492. [Crossref] [PubMed]
- Beylot C. Skin ageing-General features of facial ageing and therapeutic choices. Ann Dermatol Venereol 2019;146:41-74. [Crossref] [PubMed]
- Yu X, Yan H, Xiao L, et al. Periorbital lipogranuloma after facial injection of autologous fat: A retrospective study of 18 cases. Eur J Ophthalmol 2021;31:3399-404. [Crossref] [PubMed]
- Dondelinger F, Mukherjee S; Alzheimer’s Disease Neuroimaging Initiative. The joint lasso: high-dimensional regression for group structured data. Biostatistics 2020;21:219-35. [Crossref] [PubMed]
(English Language Editor: L. Huleatt)