Combined electronic medical records and gene polymorphism characteristics to establish an anti-tuberculosis drug-induced hepatic injury (ATDH) prediction model and evaluate the prediction value
Original Article

Jingwei Zhang1,2#, Wei Zhou3,4#, Shijie Ma4#, Yuwei Kang4,5, Wei Yang4,5, Xiaodong Peng5, Yi Zhou2, Fei Deng3,4

1Department of Laboratory Medicine, Chengdu Second People’s Hospital, Chengdu, China; 2Department of Laboratory Medicine, West China Hospital, Sichuan University, Chengdu, China; 3Department of Nephrology, Chengdu Jinniu District People’s Hospital (Sichuan Provincial People’s Hospital Jinniu Hospital), Chengdu, China; 4Department of Nephrology, Sichuan Provincial People’s Hospital, University of Electronic Science and Technology of China, Chengdu, China; 5Department of Nephrology, Affiliated Hospital of Southwest Medical University, Clinical Medical College of Southwest Medical University, Luzhou, China

Contributions: (I) Conception and design: J Zhang, F Deng; (II) Administrative support: Y Zhou, X Peng; (III) Provision of study materials or patients: Y Zhou; (IV) Collection and assembly of data: Y Kang, W Yang, W Zhou; (V) Data analysis and interpretation: J Zhang, S Ma; (VI) Manuscript writing: All authors; (VII) Final approval of manuscript: All authors.

#These authors contributed equally to this work.

Correspondence to: Yi Zhou. Department of Laboratory Medicine, West China Hospital, Sichuan University, No. 37 Guoxue Alley, Wuhou District, Chengdu 610041, China. Email: zhouyi2011@qq.com; Fei Deng. Department of Nephrology, Chengdu Jinniu District People’s Hospital (Sichuan Provincial People’s Hospital Jinniu Hospital), 300# Jinfu Rd., Chengdu 610007, China. Email: dengfei@med.uestc.edu.cn.

Background: Anti-tuberculosis drug-induced hepatic injury (ATDH) lacks specific diagnostic markers. Gene polymorphisms have been used in preliminary work for the risk classification of ATDH, and activation of the pregnane X receptor/aminolevulinic acid synthase 1/forkhead box O1 (PXR/ALAS1/FOXO1) axis is closely related to ATDH. We therefore combined general clinical features from the electronic medical record, laboratory indications, and genetic features of key genes in this axis to construct a predictive model that may support early clinical diagnosis and treatment.

Methods: General characteristics were derived from the Hospital Information System (HIS) medical records. Biochemical and hematology tests were performed on a Roche automatic biochemical immunoassay analyzer cobas8000 and a Sysmex automatic hemocytometer XE2100. Single nucleotide polymorphism (SNP) genotyping was conducted with a custom-designed 48-plex SNPscan® kit. A total of 746 cases were included and randomly divided into a training set and a validation set at a ratio of 3:2. With the occurrence of confirmed ATDH as the outcome variable, lasso regression and logistic regression were used for preliminary identification of predictors. Alanine aminotransferase, aspartate aminotransferase, monocyte percentage, uric acid, albumin, fever, and the polymorphisms rs4435111 (FOXO1) and rs3814055 (PXR) were selected from all variables to build the predictive model. The models were assessed for goodness of fit, predictive efficacy, discrimination, and consistency, and clinical decision curve analysis was used to assess their clinical applicability.

Results: The best model had a C-index of 0.8164, a sensitivity of 34.25%, a specificity of 97.99%, a positive predictive value of 78.13%, and a negative predictive value of 87.69%. In the consistency test, the two-tailed P value of the Spiegelhalter Z test was S:P =0.896, the maximum absolute difference was Emax =0.147, and the average absolute difference was Eave =0.017. In the validation set, performance was close. The clinical decision curve showed that the prediction model was clinically applicable when the prediction risk threshold was between 0.1 and 0.8.

Conclusions: The ATDH prediction model was constructed using a machine learning approach, combining general characteristics of the study population, laboratory indications, and SNP features of PXR and FOXO1 genes with good fit and certain predictive value, and has potential and value for clinical application.

Keywords: Anti-tuberculosis drug-induced hepatic injury (ATDH); PXR; FOXO1; single nucleotide polymorphism (SNP); prediction model


Submitted Aug 25, 2022. Accepted for publication Oct 08, 2022.

doi: 10.21037/atm-22-4551


Introduction

The Global Tuberculosis Annual Report 2020 described the international and domestic situation of tuberculosis infection as critical, with China accounting for approximately 8.4% of the global tuberculosis population (1). While the 2021 report found the incidence of tuberculosis patients had fallen by 18% compared to the previous year (2), the coronavirus disease 2019 (COVID-19) pandemic has caused disruptions to tuberculosis services and an increase in tuberculosis deaths, with close to half of those who fall ill undiagnosed and untreated (2). One of the side effects of the World Health Organization (WHO)-recommended first-line anti-tuberculosis regimen is anti-tuberculosis drug-induced hepatic injury (ATDH), which occurs at an incidence of 5.0–28.0% and can lead to discontinuation of first-line regimens, treatment failure, and the development and spread of multidrug-resistant tuberculosis (1,2). The mechanism of ATDH involves many complex links, such as drug metabolism and transport, oxidative stress, mitochondrial dysfunction, immune regulation, and inflammatory response, but it has not yet been clarified (1,3). While specific clinical symptoms and markers for the diagnosis of ATDH are lacking, single nucleotide polymorphisms (SNPs) have shown potential clinical application as molecular markers of the disease (4). However, results for individual SNPs do not provide a complete and systematic picture of how the signaling pathway in which they are located relates to ATDH (5,6).

PXR/ALAS1 axis activation leading to abnormal metabolism of the hepatic heme pathway is one of the possible mechanisms by which combined rifampicin and isoniazid treatment leads to ATDH (7). PXR is an important factor in activating ALAS1 transcription (8), and the transcriptional balance of the ALAS1 gene is coordinated by a complex combination of signaling pathways. The insulin-sensitive FOXO1 pathway can synergize the transcriptional regulation of ALAS1 by PXR (9,10), and our previous studies have identified genetic polymorphisms in PXR and FOXO1 that correlate with ATDH susceptibility (6,11).

A clinical prediction model may assist in the diagnosis of ATDH by identifying predictors that have a significant impact on the outcome through a multifactorial analysis approach. Different combinations of candidate predictors can then be used to assess the probability of the outcome and assist clinical practice (12). Current ATDH prediction models that use single sociodemographic factors or partial clinical indications as predictor variables have not been validated for goodness of fit, and the robustness and accuracy of their predictive efficacy are yet to be verified. With the development of pharmacogenomics, SNPs, as important genetic features, are believed to improve model predictive power when used as predictors, and combining multi-gene feature variables can enhance predictive model efficacy (13-15). Machine learning algorithms can mine large datasets effectively and at high speed to discover potential clinically relevant factors. By integrating multidimensional information and extracting valid information from it for model fitting, these algorithms improve model stability and predictive efficacy in unknown datasets and increase the applicability of models to assist clinical diagnosis and treatment (16). Therefore, building predictive models based on general clinical features, laboratory indications, and multigene genetic features with machine learning algorithms can lay the foundation for building clinical models that are both stable and individualized. At the same time, the probability of occurrence of disease outcomes predicted by the prediction model can be visualized by nomograms, which are more intuitive and easier to use than conventional prediction formulas (17).

In summary, this study used machine learning algorithms for the first time to establish a visual prediction model of ATDH risk by combining the general clinical characteristics, laboratory indications, and genetic characteristics of multiple genes in the PXR/ALAS1/FOXO1 axis in 746 patients with confirmed tuberculosis. The model was evaluated with respect to its predictive performance and clinical applicability. We present the following article in accordance with the STARD reporting checklist (available at https://atm.amegroups.com/article/view/10.21037/atm-22-4551/rc).


Methods

Study population

A total of 1,060 consecutive patients with suspected tuberculosis who attended the West China Hospital of Sichuan University from December 2016 to April 2018 were retrospectively included (18). The inclusion criteria were a clear tuberculosis diagnosis according to our tuberculosis diagnostic criteria and the use of an anti-tuberculosis first-line treatment regimen. The exclusion criteria were an unclear tuberculosis diagnosis; the use of other hepatotoxic drugs; concomitant human immunodeficiency virus (HIV), hepatitis B virus (HBV), hepatitis C virus (HCV), or other immune disease; and failure of follow-up or discontinuation of first-line treatment (19). The study conformed to the provisions of the Declaration of Helsinki (as revised in 2013) and was approved by the Ethics Committee of West China Hospital, Sichuan University (registration No. 2014198). All experimental subjects were unrelated Han Chinese individuals from western China who voluntarily participated and signed the informed consent form.

All methods were performed in accordance with the relevant guidelines and regulations. Inclusion criteria for the ATDH group were: (I) receiving a first-line anti-tuberculosis drug regimen; (II) diagnosis of liver injury in accordance with CTCAE v5.0 criteria (20); and (III) no other hepatotoxic drugs used in the 14 days before the diagnosis of ATDH (11,18). The inclusion criterion for the non-ATDH group was that no liver injury had occurred during anti-tuberculosis treatment. The severity of hepatotoxicity was classified into three major categories according to the WHO Toxicity Classification Standards: Grade 1 (mild), alanine aminotransferase (ALT) <5× upper limit of normal (ULN) (200 IU/L); Grade 2 (moderate), ALT higher than 5× ULN but less than 10× ULN; and Grade 3 (severe), ALT ≥10× ULN (400 IU/L) (21).
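The grading rule above can be written as a small function. This is a minimal sketch, not the study's own code: the function name is hypothetical, the ULN of 40 IU/L is inferred from the stated cut-offs (5× ULN = 200 IU/L, 10× ULN = 400 IU/L), and an ALT of exactly 5× ULN, which the text leaves unassigned, is placed in Grade 2 as an assumption. The function is meant for patients who already meet the ATDH diagnostic criteria.

```python
def grade_hepatotoxicity(alt_iu_per_l, uln=40.0):
    """Grade ATDH severity from ALT, following the three categories in the text.

    uln=40.0 IU/L is inferred from the cut-offs quoted in the text
    (200 IU/L and 400 IU/L); it is an assumption, not a reported value.
    """
    ratio = alt_iu_per_l / uln
    if ratio >= 10:
        return "Grade 3 (severe)"
    if ratio >= 5:
        # The text does not say where exactly 5x ULN falls; Grade 2 is assumed.
        return "Grade 2 (moderate)"
    return "Grade 1 (mild)"


for alt in (100, 250, 400):
    print(alt, grade_hepatotoxicity(alt))
```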

Selection of target SNPs loci and typing assay

The dbSNP and 1000 Genomes databases were accessed, and Haploview software was used to screen target SNPs in the candidate genes PXR, FOXO1, and ALAS1 according to the following principles (11,19): minor allele frequency ≥20% in the southern Chinese Han population; linkage disequilibrium value of tag SNPs r² ≥0.8; location within the candidate gene region, including 200 bp upstream and 300 bp downstream; and, on the basis of domestic and international literature, a possible association with the risk of ATDH development or potential functional significance. Based on the above principles and pre-experimental results, SNP loci for which PCR primers and single-base amplification primers could be successfully designed and which were successfully typed were selected and typed using 48-Plex SNPscan® high-throughput SNP typing technology (18,19). Thirty samples were randomly selected for double-blind experiments to ensure the repeatability and stability of the genotyping results, and all genotype calling success rates were greater than 99.0% (6).

Data collection, preprocessing, and feature variable screening

The definitive diagnosis of ATDH and the basic medical history of subjects were exported from the HIS by data collectors, and all relevant laboratory indications were exported from the Laboratory Information System (LIS). Missing data were filled with the median (continuous variables) or the mode (categorical variables) when the missing proportion was <10%, and data with a missing proportion of >10% were excluded. The exact diagnosis of ATDH was obtained from the medical records by data collectors and provided to the data analysts. If there was no clear diagnosis of ATDH in the medical records, the case was excluded after confirmation by a consulting clinician. Genetic polymorphism testing staff and clinical data collectors worked independently and were unaware of each other’s data, while data analysts jointly used all data to build predictive models and perform performance validation. All study subjects were randomly divided into a test set (the training set, used for model development) and a validation set according to a 3:2 ratio. Lasso regression was used to initially screen candidate variables, with the penalty coefficient lambda (λ) selected at the minimum cross-validation error, and a subset of candidate variables was then constructed at P<0.2 (22-24).
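The preprocessing steps described above (median or mode imputation below a 10% missing proportion, followed by a random 3:2 split) can be sketched as follows. This is an illustration under stated assumptions, not the pipeline used in the study: the function names, the use of `None` for missing values, the decision to drop a whole variable when too much is missing, and the fixed random seed are all hypothetical choices.

```python
import random
import statistics


def impute(values, kind, max_missing=0.10):
    """Fill None entries with the median (continuous) or mode (categorical).

    Returns None when the missing fraction exceeds max_missing, signalling
    that the variable should be excluded (an assumed reading of the text).
    """
    observed = [v for v in values if v is not None]
    missing_fraction = 1 - len(observed) / len(values)
    if missing_fraction > max_missing:
        return None
    if kind == "continuous":
        fill = statistics.median(observed)
    else:
        fill = statistics.mode(observed)
    return [fill if v is None else v for v in values]


def split_3_to_2(subject_ids, seed=0):
    """Shuffle subjects and split them 3:2 into test (training) and validation sets."""
    rng = random.Random(seed)  # fixed seed only so the sketch is reproducible
    shuffled = list(subject_ids)
    rng.shuffle(shuffled)
    cut = round(len(shuffled) * 3 / 5)
    return shuffled[:cut], shuffled[cut:]


alt_values = [1, None, 3, 5, 7, 9, 11, 13, 15, 17, 19]   # hypothetical data
print(impute(alt_values, "continuous"))
test_ids, validation_ids = split_3_to_2(range(10))
print(len(test_ids), len(validation_ids))
```

Dropping a variable rather than a subject when the 10% limit is exceeded is one possible interpretation; the same helper could be applied per subject instead.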

Construction of prediction models and evaluation of predictive efficacy using test set data

The candidate variables were modeled on the test set using STATA software v15.0, and the goodness of fit of the model was evaluated using Akaike’s Information Criterion (AIC) (12,22,25). The selection criteria were (I) AIC minimization and (II) candidate variables minimization without affecting predictive efficacy (23,26). Collinearity and interaction analyses were performed on the included predictors (26,27), while performance parameters such as sensitivity, specificity, positive predictive value, and negative predictive value were evaluated with a discrimination test (22). We used receiver operating characteristic (ROC) curves and the C-index to assess model discrimination (12,23), and calibration curve plots to assess consistency (23,28).
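For readers less familiar with these performance parameters, the sketch below shows how they are conventionally computed. It is illustrative only: the function names and all input numbers are hypothetical, and the C-index is computed by direct pairwise comparison, which for a binary outcome equals the area under the ROC curve.

```python
def classification_metrics(tp, fp, tn, fn):
    """Sensitivity, specificity, PPV and NPV from a 2x2 confusion table."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }


def c_index(predicted_risk, outcome):
    """Share of case/non-case pairs in which the case has the higher predicted risk.

    Tied pairs count as one half.
    """
    cases = [p for p, y in zip(predicted_risk, outcome) if y == 1]
    non_cases = [p for p, y in zip(predicted_risk, outcome) if y == 0]
    concordant = 0.0
    for risk_case in cases:
        for risk_non_case in non_cases:
            if risk_case > risk_non_case:
                concordant += 1.0
            elif risk_case == risk_non_case:
                concordant += 0.5
    return concordant / (len(cases) * len(non_cases))


# Hypothetical counts and risks, chosen only to exercise the functions.
print(classification_metrics(tp=30, fp=10, tn=90, fn=20))
print(c_index([0.9, 0.8, 0.3, 0.2], [1, 0, 1, 0]))
```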

Validation of prediction model efficacy using validation set data

Using the predictors obtained from the test set model and the corresponding coefficients, model reconstruction and validation of model fit were performed using the validation set data (12).

Model visualization and clinical application value assessment

A nomogram was developed to visualize the model. The clinical application value of the model was demonstrated using decision curve analysis, and the strategy with the highest net benefit for a specific threshold probability was considered the best strategy (26,29).
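Decision curve analysis compares strategies by their net benefit at each threshold probability. The sketch below uses the standard definition, net benefit = TP/n − (FP/n) × pt/(1 − pt), which we assume is the one applied here since the text does not spell out the formula; the function names and the example data are hypothetical.

```python
def net_benefit(predicted_risk, outcome, threshold):
    """Net benefit of treating patients whose predicted risk is >= threshold."""
    n = len(outcome)
    tp = sum(1 for p, y in zip(predicted_risk, outcome) if p >= threshold and y == 1)
    fp = sum(1 for p, y in zip(predicted_risk, outcome) if p >= threshold and y == 0)
    return tp / n - fp / n * threshold / (1 - threshold)


def net_benefit_treat_all(outcome, threshold):
    """Net benefit of the reference strategy that treats every patient."""
    prevalence = sum(outcome) / len(outcome)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)


# Hypothetical risks and outcomes; "treat none" always has a net benefit of 0.
risks = [0.9, 0.7, 0.4, 0.1]
events = [1, 0, 1, 0]
print(net_benefit(risks, events, 0.2))
print(net_benefit_treat_all(events, 0.2))
```

At a given threshold, the strategy with the largest of these values (model, treat all, or treat none) would be preferred.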

Statistical analysis

SPSS software (version 23.0) was used to analyze clinical data and laboratory indications. Normally distributed quantitative data were expressed as mean ± standard deviation (SD) and compared with the t-test or ANOVA. Quantitative data with a non-normal distribution were expressed as median (interquartile range) and compared with the Mann-Whitney or Kruskal-Wallis nonparametric test, and the chi-square test or logistic regression was used for count data (19). Potential predictors were screened by Lasso regression in R version 3.6.1 and by univariate logistic regression in SPSS version 23.0, with P<0.20 as the threshold for inclusion. Multivariate analysis was performed in STATA version 14 using stepwise logistic regression within a generalized linear model, and the model was constructed using the minimum AIC and the minimum number of predictors as criteria. ROC curve analysis was used to evaluate the discrimination of the predictive model, with the C-index as the criterion. The Hosmer-Lemeshow test was used to evaluate the consistency of the prediction model in the validation set data, with P>0.05 as the reference criterion, and unreliability test (U test) calibration curve analysis was used to evaluate the consistency of the model fit, with P>0.05 as the reference criterion (30). The prediction model was visualized using a nomogram, and its clinical application value was analyzed using decision curves.

For the sample size estimate, the incidence of ATDH in the West China population was taken as approximately 15% (31). To detect a significant difference between the two groups, the two-sided significance level was set at 5% and the power of the test at 80%. Allowing for a 10% loss to follow-up, the sample size of each group was estimated at approximately 100 cases (32).


Results

Basic information about the study population

During treatment, biochemical and haematological analyses were performed twice each month during the first two months and monthly in the subsequent four months. At the same time, clinicians observed and recorded clinical symptoms and signs in accordance with the diagnosis and treatment standards. A total of 746 study subjects (118 in the ATDH group and 628 in the non-ATDH group) were included in this study, and the process of enrolment is shown in Figure S1. All patients in the ATDH group presented alterations in hepatic enzymes, and 32 individuals developed symptomatic hepatitis, which was characterized by jaundice, nausea, vomiting, and abdominal pain. While there was no statistical difference between the two groups in gender, age, or living habits, the proportion of patients presenting with febrile symptoms was significantly higher in the ATDH group than in the non-ATDH group (63/118 vs. 247/628), as shown in Table 1.

Table 1

Clinical characteristics of the study subjects

Characteristics ATDH (n=118) Non-ATDH (n=628) P
Age, mean ± SD (years) 40.92±15.72 42.85±18.44 0.285
Gender (male/female), n (%) 69 (58.47)/49 (41.53) 375 (59.71)/253 (40.29) 0.801
Smoking (yes/no), n (%) 35 (29.66)/83 (70.34) 192 (30.57)/436 (69.43) 0.843
Drinking alcohol (yes/no), n (%) 32 (27.12)/86 (72.88) 141 (22.45)/487 (77.55) 0.270
General symptoms, n (%)
   Fever 63 (20.32) 247 (79.68) 0.004
   Weight loss 34 (13.08) 226 (86.92) 0.133
   Night sweats 29 (14.80) 167 (85.20) 0.648
   Loss of appetite 45 (17.11) 218 (82.89) 0.475
   Fatigue 31 (17.92) 142 (82.08) 0.387
Local infection symptoms, n (%)
   Present 88 (16.92) 432 (83.08) 0.209
   Absent 30 (13.27) 196 (86.73)

For general and local infection symptoms, percentages are row percentages (the two groups sum to 100% for each symptom). ATDH, anti-tuberculosis drug-induced hepatic injury; SD, standard deviation.

Indications of clinical laboratory tests in the study population

Patients in the ATDH group had higher total bilirubin (TBIL), indirect bilirubin, aspartate aminotransferase (AST), ALT, alkaline phosphatase, and glutamyl transferase levels and lower uric acid levels than those in the non-ATDH group (all P<0.05), as shown in Table 2. ATDH cases were graded as mild (83/118, 70.34%), moderate (21/118, 17.80%), and severe (14/118, 11.86%). Age and gender were similar in the three groups (P=0.888 and P=0.117, respectively) (data available if necessary). Once a patient developed ATDH, the clinician provided liver-protective treatment according to severity, temporarily discontinued the drug, or switched to a second-line treatment plan (2).

Table 2

Indications for clinical laboratory tests in the study population

Baseline values for laboratory test results ATDH group (n=118) Non-ATDH group (n=628) P
Red blood cell count (×10¹²/L) 4.31±0.74 4.28±0.68 0.481
Haemoglobin (g/L) 122.87±22.11 122.06±20.58 0.717
Haematocrit (L/L) 0.38±0.06 0.36±0.06 0.069
Platelet count (×10⁹/L) 236.50 (184.00–321.75) 232.50 (172.75–297.25) 0.134
White blood cell count (×10⁹/L) 6.57 (4.99–7.96) 6.51 (5.17–8.44) 0.761
Absolute value of neutrophils (×10⁹/L) 5.23±2.89 5.10±2.73 0.631
Absolute value of lymphocytes (×10⁹/L) 1.29±0.79 1.26±0.62 0.625
Absolute value of monocytes (×10⁹/L) 0.55±0.29 0.50±0.25 0.099
Neutrophils (%) 70.49±11.50 70.13±11.54 0.760
Lymphocytes (%) 16.25 (12.58–25.58) 17.5 (12.18–25.68) 0.527
Monocytes (%) 7.74±2.62 7.30±2.37 0.077
Total bilirubin (μmol/L) 10.05 (7.50–14.13) 8.70 (6.30–12.10) 0.002
Direct bilirubin (μmol/L) 3.55 (2.38–5.60) 3.45 (2.50–5.40) 0.126
Indirect bilirubin (μmol/L) 5.70 (3.98–7.95) 4.80 (3.40–7.03) 0.049
ALT (IU/L) 28.00 (15.75–38.00) 15.00 (10.00–21.00) <0.001
AST (IU/L) 27.00 (20.00–34.00) 19.50 (16.00–25.00) <0.001
Total protein (g/L) 69.42±8.42 68.82±9.15 0.508
Albumin (g/L) 38.64±7.35 37.89±6.90 0.248
Globulin (g/L) 30.78±6.65 30.93±7.02 0.829
Glucose (mmol/L) 5.15 (4.64–5.95) 5.14 (4.71–5.89) 0.410
Urea (mmol/L) 3.92 (2.90–5.24) 4.05 (3.15–5.30) 0.299
Creatinine (μmol/L) 57.50 (47.78–67.00) 60.45 (49.00–73.20) 0.601
Serum cystatin C (mg/L) 0.91 (0.81–1.04) 0.92 (0.79–1.06) 0.975
Uric acid (μmol/L) 291.29±125.98 331.51±155.30 0.008
Triglycerides (mmol/L) 0.99 (0.81–1.31) 1.06 (0.80–1.43) 0.469
Cholesterol (mmol/L) 3.96±1.206 3.96±1.058 0.966
High-density lipoprotein (mmol/L) 1.12 (0.85–1.48) 1.08 (0.82–1.41) 0.811
Low-density lipoprotein (mmol/L) 2.20 (1.79–2.72) 2.21 (1.69–2.77) 0.575
Alkaline phosphatase (IU/L) 85.50 (68.50–106.00) 79.00 (64.00–98.00) 0.021
Glutamyl transferase (IU/L) 42.50 (26.00–78.00) 29.00 (19.00–48.00) <0.001
C-reactive protein (mg/L) 9.74 (2.30–39.23) 12.25 (2.67–37.43) 0.961
Erythrocyte sedimentation rate (mm/h) 38.50 (20.50–63.00) 33.50 (14.75–64.00) 0.173

Data are presented as mean ± standard deviation or median (interquartile range). ATDH, anti-tuberculosis drug-induced hepatic injury; ALT, alanine aminotransferase; AST, aspartate aminotransferase.

Loci typing results for the target SNPs in the study population

T allele carriers at rs3814055 of the PXR gene had a reduced relative risk of ATDH compared to C allele carriers (11). For the FOXO1 gene, carriers of the C allele at rs2755237 had a reduced relative risk of ATDH relative to carriers of the A allele, as did carriers of the T allele at rs4435111 relative to carriers of the C allele. The gene frequencies of candidate SNPs for the ALAS1 gene did not differ between the two groups.

Modeling of ATDH risk prediction

Model predictor screening

Lasso regression, a machine learning algorithm, was used to screen the 98 preprocessed characteristic variables. At the λ value that minimized the 10-fold cross-validation error (λ=0.0074528), the optimal subset contained 36 variables with non-zero coefficients for inclusion in the model, and the coefficients of the remaining variables were reduced to zero, as shown in Figures 1,2.

Figure 1 Determination of the optimal penalty factor λ=0.0074528 in the LASSO model using 10-fold cross-validation. LASSO, least absolute shrinkage and selection operator.
Figure 2 Distribution of Lasso coefficients for the 98 clinical characteristics. The left dashed vertical line shows the 36 non-zero coefficient variables for which λ was chosen as the minimum.

Identification of candidate predictors using univariate logistic regression

As shown in Table 3, fourteen candidate feature variables were statistically different between groups in the test set, and twelve in the validation set. The characteristic variables that were statistically different in both sets were fever, rs3814055, total bilirubin, ALT, AST, and uric acid.

Table 3

Baseline levels of the 36 characteristic variables in the test and validation sets after LASSO screening

Candidate feature variables Test set (n=490) Validation set (n=256)
Non-ATDH group (n=409) ATDH group (n=81) P Non-ATDH group (n=219) ATDH group (n=37) P
Gender (M/F), n (%) 246 (60.1)/163 (39.9) 45 (55.6)/36 (44.4) 0.459 90 (41.1)/129 (58.9) 13 (35.1)/24 (64.9) 0.588
Alcohol consumption (yes/no), n (%) 127 (31.1)/282 (68.9) 25 (30.9)/56 (69.1) 0.543 14 (6.4)/205 (93.6) 7 (18.9)/30 (81.1) 0.019
Fever (yes/no), n (%) 177 (43.3)/232 (56.7) 44 (54.3)/37 (45.7) 0.087 70 (32.0)/149 (68.0) 19 (51.4)/18 (48.6) 0.026
Weight loss (yes/no), n (%) 176 (43.0)/233 (57.0) 25 (30.9)/56 (69.1) 0.048 50 (22.8)/169 (77.2) 9 (24.3)/28 (75.7) 0.834
Decreased appetite (yes/no), n (%) 166 (40.6)/243 (59.4) 32 (39.5)/49 (60.5) 0.902 52 (23.7)/167 (76.3) 13 (35.1)/24 (64.9) 0.155
Fatigue (yes/no), n (%) 122 (29.8)/287 (70.2) 23 (28.4)/58 (71.6) 0.894 19 (9.1)/199 (90.9) 8 (21.6)/29 (78.4) 0.041
Genotype
   rs353556, n (%) 0.886 0.544
    11 116 (28.4) 22 (27.2) 57 (26.0) 9 (24.3)
    22 207 (50.6) 40 (49.4) 120 (54.8) 18 (48.6)
    33 86 (21.0) 19 (23.5) 42 (19.2) 10 (27.0)
   rs3852071, n (%) 0.267 0.180
    11 13 (3.2) 0 (0.0) 3 (1.4) 1 (2.7)
    22 111 (27.1) 23 (28.4) 67 (30.6) 6 (16.2)
    33 285 (69.7) 58 (71.6) 149 (68.0) 30 (81.1)
   rs352169, n (%) 0.058 0.334
    11 174 (42.6) 27 (33.3) 80 (36.5) 17 (45.9)
    22 194 (47.4) 39 (48.1) 103 (47.0) 17 (45.9)
    33 41 (10.0) 15 (18.5) 36 (16.4) 3 (8.1)
   rs2755237, n (%) 0.008 0.385
    11 191 (46.7) 53 (65.4) 101 (46.1) 15 (40.5)
    22 192 (46.9) 24 (29.6) 102 (46.6) 21 (56.8)
    33 26 (6.4) 4 (4.9) 16 (7.3) 1 (2.7)
   rs2701891, n (%) 0.009 0.202
    11 224 (54.8) 39 (48.1) 126 (57.5) 22 (59.5)
    22 160 (39.1) 29 (35.8) 76 (34.7) 15 (40.5)
    33 25 (6.1) 13 (16.0) 17 (7.8) 0 (0.0)
   rs3751436, n (%) 0.629 0.481
    11 149 (36.4) 28 (34.5) 80 (36.5) 10 (27.0)
    22 205 (50.1) 39 (48.1) 108 (49.3) 22 (59.5)
    33 55 (13.5) 14 (17.3) 31 (14.2) 5 (13.5)
   rs4435111, n (%) 0.014 0.991
    11 249 (60.9) 63 (77.8) 117 (53.4) 20 (54.1)
    22 144 (35.2) 17 (21.0) 89 (40.6) 15 (40.5)
    33 16 (3.9) 1 (1.2) 13 (5.9) 2 (5.4)
   rs7325594, n (%) 0.492 0.329
    11 51 (12.5) 13 (16.0) 31 (14.2) 5 (13.5)
    22 218 (53.3) 45 (55.6) 109 (49.8) 23 (62.2)
    33 140 (34.2) 23 (28.4) 79 (36.1) 9 (24.3)
   rs3814055, n (%) 0.195 0.114
    11 242 (59.2) 56 (69.1) 131 (59.8) 28 (75.7)
    22 137 (33.5) 22 (27.2) 76 (34.7) 9 (24.3)
    33 30 (7.3) 3 (3.7) 12 (5.5) 0 (0.0)
   rs56967099, n (%) 0.880 0.897
    11 111 (27.1) 24 (29.6) 60 (27.4) 9 (24.3)
    22 189 (46.2) 37 (45.7) 104 (47.5) 19 (51.4)
    33 109 (26.7) 20 (24.7) 55 (25.1) 9 (24.3)
   rs13059232, n (%) 0.740 0.939
    11 157 (38.4) 34 (42.0) 80 (36.5) 14 (37.8)
    22 196 (47.9) 35 (43.2) 107 (48.9) 17 (45.9)
    33 56 (13.7) 12 (14.8) 32 (14.6) 6 (16.2)
   rs4688040, n (%) 0.477 0.964
    11 149 (36.4) 34 (42.0) 82 (37.4) 13 (35.1)
    22 207 (50.6) 35 (43.2) 103 (47.0) 18 (48.6)
    33 53 (13.0) 12 (14.8) 34 (15.5) 6 (16.2)
   rs6785049, n (%) 0.361 0.802
    11 163 (39.9) 39 (48.1) 94 (42.9) 15 (40.5)
    22 187 (45.7) 33 (40.7) 95 (43.4) 18 (48.6)
    33 59 (14.4) 9 (11.1) 30 (13.7) 4 (10.8)
   rs3732360, n (%) 0.675 0.220
    11 129 (31.5) 25 (30.9) 76 (34.7) 18 (48.6)
    22 206 (50.4) 38 (46.9) 108 (49.3) 13 (35.1)
    33 74 (18.1) 18 (22.2) 35 (16.0) 6 (16.2)
Platelets (×10⁹/L) 231.0 (171.0–296.5) 244.0 (185.5–322.5) 0.0549 235.0 (173.0–293.0) 221.0 (181.5–276.0) 0.9589
Percentage of neutrophils (%) 71.70 (62.00–78.30) 70.60 (61.28–76.53) 0.6908 70.00 (64.00–79.00) 74.50 (65.35–82.65) 0.2106
Percentage of monocytes (%) 7.10 (5.75–8.90) 8.30 (5.68–9.45) 0.1136 6.90 (5.80–8.80) 7.40 (5.05–8.84) 0.5629
Absolute monocyte values (×10⁹/L) 0.47 (0.35–0.64) 0.49 (0.36–0.67) 0.341 0.45 (0.32–0.60) 0.46 (0.36–0.76) 0.2338
Total bilirubin (μmol/L) 8.70 (6.40–12.23) 9.80 (7.60–14.55) 0.0087 8.80 (6.40–12.20) 10.40 (6.60–14.70) 0.1414
Direct bilirubin (μmol/L) 3.50 (2.50–5.40) 3.60 (2.30–6.60) 0.3726 3.50 (2.50–5.45) 3.90 (2.90–7.55) 0.1374
Alanine aminotransferase (IU/L) 14.0 (10.0–20.0) 27.0 (13.5–38.0) <0.0001 16.0 (10.0–23.0) 27.0 (17.0–38.0) <0.0001
Aspartate aminotransferase (IU/L) 19.00 (15.00–25.00) 27.00 (19.00–34.00) <0.0001 21.00 (17.00–26.25) 25.00 (21.00–33.00) 0.0012
Albumin (g/L) 38.50 (33.20–43.10) 38.70 (34.40–43.60) 0.2744 38.95 (33.10–43.33) 37.80 (30.40–46.85) 0.866
Glucose (mmol/L) 5.16 (4.72–5.97) 4.93 (4.61–5.64) 0.0735 5.05 (4.65–5.67) 5.37 (4.57–6.24) 0.2974
Creatinine (μmol/L) 61.1 (49.0–74.0) 57.4 (47.5–69.0) 0.2144 59.0 (50.0–71.0) 61.0 (53.0–73.0) 0.3446
Uric acid (μmol/L) 306.70 (225.50–403.00) 273.00 (203.00–393.00) 0.1429 292.00 (224.00–417.00) 267.35 (185.25–350.25) 0.0392
Triglycerides (mmol/L) 1.04 (0.78–1.41) 1.02 (0.81–1.47) 0.9415 1.01 (0.78–1.39) 0.94 (0.83–1.13) 0.1614
Alkaline phosphatase (IU/L) 78.00 (64.00–96.25) 84.00 (72.50–104.00) 0.0254 80.00 (63.75–99.00) 80.00 (65.50–108.50) 0.3916

LASSO, least absolute shrinkage and selection operator; ATDH, anti-tuberculosis drug-induced hepatic injury.

Adjustment for model confounders

There was moderate collinearity between ALT and AST (ρ=0.616) and no multicollinearity between any other pair of the remaining 15 candidate variables, with a maximum ρ value of 0.26. Rs3814055 and rs4435111 had an interaction effect on the outcome variable, ATDH occurrence (P=0.001), while no interactions were detected among the other 15 variables (all P>0.05).

Test set model building and optimization

The 17 candidate predictors were modeled in different ways, and the screening P values, AIC, and BIC are shown in Table 4. Model 6 incorporated five variables with an AIC of 320.50, model 8 incorporated nine variables with an AIC of 312.44, and model 9 incorporated eight variables with an AIC of 312.68. Comparison with the lrtest command in STATA showed that model 6 and model 8 differed (P<0.05): although model 6 incorporated fewer variables, its predictive efficacy was reduced. In contrast, there was no difference in predictive efficacy between model 8 and model 9 (lrtest, P>0.05), and because model 9 incorporated fewer variables, it was considered the best model. The characteristics of its variables are shown in Table 5.

Table 4

Multiple models using multivariate logistic regression for comparison

Models Construction method Inclusion of variables Screening P value Number of variables AIC BIC
Model 1 Enter method All variables 17 325.71 406.32
Model 2 Enter method (dummy variable) All variables 17 327.26 422.75
Model 3 Enter method (dummy variable) rs4435111, rs3814055, monocyte%, PLT, ALT, AST, UA, ALP, TBIL, DBIL, Alb, Glu, TG 13 350.91 408.60
Model 4 Enter method (dummy variable) rs4435111, rs3814055, monocyte%, ALT, AST, UA, TBIL 7 352.18 389.40
Model 5 Stepwise method All variables 0.2 9 312.44 352.74
Model 6 Stepwise method All variables 0.05 5 320.50 344.68
Model 7 Stepwise method All variables 0.3 11 313.42 361.78
Model 8 Stepwise method All variables 0.2 9 312.44 352.74
Model 9 Stepwise method All variables 0.05 8 312.68 348.95
Model 10 Stepwise method All variables 0.3 13 315.68 372.11
Model 11 Enter method (dummy variable/interaction) Fever, rs4435111, rs3814055, monocyte%, ALT, AST, UA, TBIL 10 315.01 359.16

AIC, Akaike’s information criterion; BIC, Bayesian information criterion; PLT, platelet; ALT, alanine transaminase; AST, aspartate aminotransferase; ALP, alkaline phosphatase; TBIL, total bilirubin; DBIL, direct bilirubin; Glu, glucose; TG, triglyceride; UA, uric acid; Alb, albumin.
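The comparisons among models 6, 8, and 9 rest on likelihood ratio tests. As a rough plausibility check rather than a reproduction of the STATA output, a likelihood ratio statistic can be recovered from the AIC values in Table 4, since AIC = −2 log L + 2k. The sketch below assumes that the compared models are nested, were fitted to the same observations, and that each listed variable contributes one parameter; none of this is stated in the table, and the function names are hypothetical.

```python
import math


def chi2_sf(x, df):
    """Upper-tail chi-square probability; closed forms for df = 1 and even df only."""
    if df == 1:
        return math.erfc(math.sqrt(x / 2))
    if df > 0 and df % 2 == 0:
        half = x / 2
        term, total = 1.0, 1.0
        for k in range(1, df // 2):
            term *= half / k
            total += term
        return math.exp(-half) * total
    raise ValueError("this sketch supports df = 1 or even df only")


def lr_test_from_aic(aic_small, k_small, aic_large, k_large):
    """Likelihood ratio statistic and P value from the AICs of two nested models."""
    # AIC = -2 logL + 2k, so -2 logL = AIC - 2k; any shared intercept cancels.
    statistic = (aic_small - 2 * k_small) - (aic_large - 2 * k_large)
    df = k_large - k_small
    return statistic, chi2_sf(statistic, df)


# AIC values and variable counts as listed in Table 4.
print(lr_test_from_aic(312.68, 8, 312.44, 9))   # model 9 vs. model 8
print(lr_test_from_aic(320.50, 5, 312.44, 9))   # model 6 vs. model 8
```

Under these assumptions the first comparison gives P>0.05 and the second P<0.05, in line with the direction of the reported lrtest results.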

Table 5

Variables and characteristics eventually included in the model

Characteristic variable β OR 95% CI of β (lower limit) 95% CI of β (upper limit) P
Fever 0.7491207 2.115 0.148 1.349 0.015
rs4435111* −1.373078 0.253 −2.090 −0.65 <0.001
rs3814055* −0.5692482 0.565 −1.09 −0.04 0.033
Albumin 0.0676586 1.070 0.017 0.117 0.008
Alanine aminotransferase 0.0662291 1.068 0.036 0.096 <0.001
Aspartate aminotransferase 0.0503438 1.051 0.004 0.096 0.032
Uric acid −0.0023242 0.997 −0.005 0.00015 0.036
Percentage of monocytes 0.1458457 1.157 0.028 0.262 0.014

*, both were set up according to dummy variables and modeled in layers according to interactions. OR, odds ratio; CI, confidence interval.

Test set model predictive efficacy analysis

The model had a discriminant C-index of 0.816, sensitivity of 34.25%, specificity of 97.99%, positive predictive value of 78.13%, and negative predictive value of 87.69%, as shown in Figure 3. The model consistency test had S:P =0.896, maximum absolute difference Emax =0.147, and average absolute difference Eave =0.017, as shown in Figure 4.

Figure 3 ROC curve of the prediction model built from the test set data. The area under the curve is 0.8164, indicating good discrimination. ROC, receiver operating characteristic.
Figure 4 Comparison of the agreement between the predicted risk of the ATDH prediction model and the observed actual risk of ATDH in the test set. The gray straight line at 45° through the origin represents the ideal line; the gray dashed line represents the actual observed value; and the black straight line represents the predicted value according to the logistic model, S:P=0.896. ATDH, anti-tuberculosis drug induced hepatic injury; Dxy, Somers’ rank correlation between p and y: Dxy=2(C−0.5); C, ROC area; ROC, receiver operating characteristic; R2, Nagelkerke-Cox-Snell-Magee R-squared index; D, discrimination index D; U, unreliability index; Q, the quality index; Brier, Brier score (average squared difference in p and y); Emax, maximum absolute difference in predicted and loess-calibrated probabilities; E90, the 0.9 quantile of the absolute difference in predicted and loess-calibrated probabilities; Eave, the average absolute difference in predicted and loess-calibrated probabilities; S:Z, the Spiegelhalter Z-test for calibration accuracy; S:P, the two-tailed P value of the Spiegelhalter Z-test.
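Two of the indices defined in the caption can be written down directly from their definitions. The sketch below computes Somers' Dxy from the reported test set C-index and a Brier score on toy data. The function names and the toy outcome and probability lists are hypothetical and are not taken from the study.

```python
def somers_dxy(c_index):
    """Somers' Dxy from the ROC area: Dxy = 2(C - 0.5)."""
    return 2.0 * (c_index - 0.5)


def brier_score(y_true, y_prob):
    """Average squared difference between predicted probability and outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


# C-index reported for the test set in the Figure 3 caption.
dxy_test = somers_dxy(0.8164)

# Toy outcomes and predicted probabilities (hypothetical, illustration only).
toy_brier = brier_score([1, 0, 0, 1], [0.8, 0.2, 0.1, 0.6])

print(dxy_test, toy_brier)
```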

Validation set model building and effectiveness analysis

A logistic regression model was re-established in the validation set using the regression coefficients from the test set model. The predicted probability of ATDH is: P(ATDH) = 1/{1+exp[−(−3.661122 + 0.7491207 × fever + 0.0676586 × Albumin − 0.0023242 × uric acid + 0.1458457 × monocyte% + 0.0503438 × AST + 0.0662291 × ALT − 1.373078 × rs4435111 − 0.5698482 × rs3814055)]}.
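The regression equation above can be evaluated with a few lines of code. This is a minimal sketch, not the authors' implementation. The intercept and coefficients are copied from the equation, with AST written to the precision given in Table 5. The dictionary keys, the 0/1 coding of fever and of the two SNP terms, and the all-zero reference patient are assumptions made for illustration. This section does not state the units or the genotype coding, and an all-zero patient is not clinically meaningful.

```python
import math

INTERCEPT = -3.661122
COEFFICIENTS = {
    "fever": 0.7491207,
    "albumin": 0.0676586,
    "uric_acid": -0.0023242,
    "monocyte_pct": 0.1458457,
    "ast": 0.0503438,
    "alt": 0.0662291,
    "rs4435111": -1.373078,   # assumed 0/1 dummy coding
    "rs3814055": -0.5698482,  # assumed 0/1 dummy coding
}


def atdh_probability(patient):
    """Logistic model: P = 1 / (1 + exp(-(intercept + sum(beta_i * x_i))))."""
    linear_predictor = INTERCEPT + sum(
        COEFFICIENTS[name] * patient[name] for name in COEFFICIENTS
    )
    return 1.0 / (1.0 + math.exp(-linear_predictor))


def odds(probability):
    return probability / (1.0 - probability)


# Hypothetical reference patient with every predictor set to zero.
reference = {name: 0.0 for name in COEFFICIENTS}
with_fever = dict(reference, fever=1.0)

p_reference = atdh_probability(reference)
p_fever = atdh_probability(with_fever)

# Switching fever from 0 to 1 multiplies the odds by exp(beta_fever).
fever_odds_ratio = odds(p_fever) / odds(p_reference)
print(p_reference, p_fever, fever_odds_ratio)
```

The odds ratio recovered for fever equals exp(0.7491207), which agrees with the OR of 2.115 listed for fever in Table 5.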

The fit of this model was consistent with that of the model constructed from the test set data (Hosmer-Lemeshow test P=0.4636). In the validation set, ROC curve analysis gave a discrimination C-index of 0.7189, a specificity of 97.77%, a negative predictive value of 86.21%, a sensitivity of 15.15%, and a positive predictive value of 55.56%, as shown in Figure 5. Calibration curve validation gave a maximum absolute difference Emax =0.101 and an average absolute difference Eave =0.009, with the Spiegelhalter Z-test for calibration accuracy S:P =0.929, as shown in Figure 6.

Figure 5 ROC curve established by applying the ATDH prediction model in the validation set. The area under the ROC curve is 0.7189, indicating moderate discrimination. ATDH, anti-tuberculosis drug induced hepatic injury; ROC, receiver operating characteristic.
Figure 6 Comparison of the agreement between the predicted risk and the observed actual risk of ATDH using the ATDH prediction model in the validation set. The gray straight line at 45° through the origin represents the ideal line; the gray dashed line represents the actual observed value; and the black straight line represents the predicted value according to the logistic model, S:P=0.929. ATDH, anti-tuberculosis drug induced hepatic injury; Dxy, Somers’ rank correlation between p and y: Dxy=2(C−0.5); C, ROC area; ROC, receiver operating characteristic; R2, Nagelkerke-Cox-Snell-Magee R-squared index; D, discrimination index D; U, unreliability index; Q, the quality index; Brier, Brier score (average squared difference in p and y); Emax, maximum absolute difference in predicted and loess-calibrated probabilities; E90, the 0.9 quantile of the absolute difference in predicted and loess-calibrated probabilities; Eave, the average absolute difference in predicted and loess-calibrated probabilities; S:Z, the Spiegelhalter Z-test for calibration accuracy; S:P, the two-tailed P value of the Spiegelhalter Z-test.
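The Spiegelhalter Z-test reported as S:Z and S:P can be sketched from its standard textbook form, Z = Σ(y − p)(1 − 2p) / sqrt(Σ(1 − 2p)² p(1 − p)), with a two-tailed P value taken from the standard normal distribution. The code below is an illustration under that assumption and is not the software used by the authors. The toy data are hypothetical.

```python
import math


def spiegelhalter_z(y_true, y_prob):
    """Return (Z, two-tailed P) of the Spiegelhalter calibration test."""
    numerator = sum((y - p) * (1.0 - 2.0 * p) for y, p in zip(y_true, y_prob))
    variance = sum((1.0 - 2.0 * p) ** 2 * p * (1.0 - p) for p in y_prob)
    z = numerator / math.sqrt(variance)
    # Two-tailed normal P value: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_value


# Toy outcomes and predicted probabilities (hypothetical, illustration only).
z_toy, p_toy = spiegelhalter_z([1, 0], [0.8, 0.2])
print(z_toy, p_toy)
```

A large P value, such as the values of 0.896 and 0.929 reported here, means that the test found no evidence of miscalibration. The variance term is zero when every predicted probability equals 0.5, so this sketch assumes at least one prediction differs from 0.5.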

Building the nomogram

The nomogram was established according to this prediction model, and the genotypes of rs3814055 were stratified because of the interaction between rs3814055 and rs4435111. Because the different genotypes of rs3814055 and rs4435111 carried unequal predicted risks of ATDH, both were coded as dummy variables. The nomogram is shown in Figure 7, with predicted probabilities of 0.1–0.7 corresponding to total points of 170–210.

Figure 7 ATDH prediction model presented as a nomogram. rs3814055 and rs4435111 have an interaction. ATDH, anti-tuberculosis drug induced hepatic injury; ALT, alanine transaminase; AST, aspartate aminotransferase; Uric, uric acid; Alb, albumin.

Decision curve analysis of the prediction model

The clinical decision curve for the ATDH prediction model is shown in Figure 8. The model has value for clinical use when the risk threshold ranges between 0.1 and 0.8.

Figure 8 Clinical decision curves for the established ATDH prediction model. The y-axis represents the net benefit, and the x-axis represents the high-risk threshold. The thin blue line is the net benefit of therapeutic intervention for all patients; the thin green line is the net benefit of therapeutic intervention for patients selected on the basis of the statistical model; and the red horizontal line is the net benefit of therapeutic intervention for no patients. The threshold probability of the x-axis and the net benefit of the y-axis are displayed as a ratio. ATDH, anti-tuberculosis drug induced hepatic injury; Pr, threshold probability.
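The three lines of a decision curve follow from the net benefit formula of decision curve analysis (29): net benefit = TP/n − (FP/n) × pt/(1 − pt), where pt is the risk threshold. The sketch below evaluates the formula at a single threshold. The outcome and risk lists are toy values and are not study data, and the function names are hypothetical.

```python
def net_benefit(y_true, y_prob, threshold):
    """Net benefit of intervening when the predicted risk >= threshold."""
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    return tp / n - (fp / n) * threshold / (1.0 - threshold)


def net_benefit_treat_all(y_true, threshold):
    """Net benefit of intervening in every patient regardless of the model."""
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)


# Toy outcomes (1 = ATDH) and predicted risks (hypothetical, illustration only).
toy_outcomes = [1, 0, 0, 1, 0, 0, 0, 1, 0, 0]
toy_risks = [0.8, 0.1, 0.3, 0.6, 0.05, 0.2, 0.4, 0.15, 0.1, 0.25]

nb_model = net_benefit(toy_outcomes, toy_risks, 0.3)
nb_all = net_benefit_treat_all(toy_outcomes, 0.3)
nb_none = 0.0  # intervening in no patient always has zero net benefit

print(nb_model, nb_all, nb_none)
```

A model is considered useful at a given threshold when its net benefit exceeds both the treat-all and the treat-none strategies. Repeating the calculation over a grid of thresholds traces the full curve.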

Discussion

In this study, the ATDH prediction model was constructed using machine learning algorithms to screen eight predictors from general clinical characteristics, laboratory indicators, and genetic characteristics. The construction process strictly followed the reporting statement for clinical prediction models in three steps: developing the prediction model, validating the prediction model, and studying the clinical significance of the model (30). Although the model had moderate specificity, discrimination, consistency, and clinical applicability, its sensitivity needs to be improved.

In this study, Lasso regression, a machine learning algorithm, was used for the pre-screening of model feature variables. Lasso regression helps prediction models satisfy the bias-variance trade-off while integrating a large amount of data from different dimensions. Compared with conventional stepwise logistic regression, it is fast, stable, and easy to interpret (33). Therefore, this study first used Lasso regression for data pre-screening and then used univariate logistic regression to filter out 17 candidate variables (24).
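The variable screening property of the Lasso comes from its L1 penalty, which shrinks small coefficients to exactly zero. The sketch below illustrates this with the soft-thresholding operator, which gives the Lasso solution for a linear model with an orthonormal design and the objective ½‖y − Xβ‖² + λ‖β‖₁. This is a simplified linear illustration and not the penalized logistic regression fitted in the study. The coefficient names, values, and penalty are hypothetical.

```python
def soft_threshold(value, penalty):
    """Shrink a coefficient toward zero by `penalty`; zero it if |value| <= penalty."""
    if value > penalty:
        return value - penalty
    if value < -penalty:
        return value + penalty
    return 0.0


def lasso_orthonormal(ols_coefficients, penalty):
    """Lasso estimates for an orthonormal design: soft-threshold each OLS estimate."""
    return {
        name: soft_threshold(beta, penalty)
        for name, beta in ols_coefficients.items()
    }


# Hypothetical unpenalized estimates for four candidate variables.
hypothetical_ols = {"x1": 1.5, "x2": -0.3, "x3": 0.05, "x4": -2.0}
shrunk = lasso_orthonormal(hypothetical_ols, penalty=0.5)

# Variables whose coefficient survives the penalty are the ones "selected".
selected = {name: beta for name, beta in shrunk.items() if beta != 0.0}
print(selected)
```

A larger penalty removes more variables, which is why the penalty is usually tuned by cross-validation.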

Multivariate logistic regression was applied to the test set data, with the lowest AIC value and the smallest number of predictors as the selection criteria for the optimal model (34). The included predictors were fever, rs3814055, rs4435111, albumin, ALT, AST, uric acid, and monocyte percentage.
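The selection rule described above can be sketched as follows, using the standard definitions AIC = 2k − 2 ln L and BIC = k ln n − 2 ln L, where k is the number of estimated parameters, n the sample size, and ln L the maximized log-likelihood. The candidate models, log-likelihoods, and sample size below are hypothetical and are not the values from Table 4.

```python
import math


def aic(log_likelihood, n_params):
    """Akaike's information criterion: 2k - 2 ln L."""
    return 2 * n_params - 2 * log_likelihood


def bic(log_likelihood, n_params, n_obs):
    """Bayesian information criterion: k ln n - 2 ln L."""
    return n_params * math.log(n_obs) - 2 * log_likelihood


# Hypothetical candidates: (name, maximized log-likelihood, number of parameters).
candidates = [("A", -150.0, 10), ("B", -152.0, 6), ("C", -149.5, 14)]
n_obs = 400  # hypothetical sample size

# Prefer the lowest AIC and break ties with the smaller number of parameters.
best = min(candidates, key=lambda m: (aic(m[1], m[2]), m[2]))

for name, loglik, k in candidates:
    print(name, aic(loglik, k), round(bic(loglik, k, n_obs), 2))
print("selected:", best[0])
```

Because ln n exceeds 2 once n is larger than about 7, BIC penalizes each extra parameter more heavily than AIC and therefore tends to favor smaller models.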

Visualization of the optimized model revealed ALT, AST, albumin, monocyte percentage, and fever as independent predictors of ATDH, suggesting that basal liver function status and immune status were associated with ATDH susceptibility in TB patients. Meanwhile, the nomogram visualized the interaction between rs3814055 and rs4435111. When rs3814055 was the CC genotype, the risk of ATDH was significantly increased, while the TT genotype of rs4435111 could reduce the risk of ATDH. When rs3814055 was the TT genotype, the overall risk of ATDH decreased, while the TT genotype of rs4435111 increased the risk of ATDH. In the genetic analysis, the rs4435111 TT genotype was a factor that reduced the risk of ATDH, although the nomogram showed that its score for predicting the probability of ATDH was influenced by the genotype at the rs3814055 locus. Possible reasons for this are as follows: (I) both rs3814055 and rs4435111 have relatively few TT genotypes, leading to bias from small samples; and (II) there is a complex higher-order and second-order multiplicative interaction of the PXR gene with the FOXO1 gene (6). Because the rs3814055 genotype carries greater weight in ATDH susceptibility, it interfered with the assessment of the effect of rs4435111.

In the test set, the model had a discrimination C-index =0.8164, and the consistency test gave S:P =0.896, Emax =0.147, and Eave =0.017, suggesting both the discrimination and the consistency of the model were good. Because random and systematic errors in the cross-validation data from different training sets can cause overfitting and increased variance, the model fit needs to be checked in the validation set data. The fit of the model in the validation set data was consistent with that of the model constructed from the test set data, and the discrimination was of moderate strength, indicating that the use of Lasso regression was effective in preventing overfitting and the resulting shrinkage of fit in a new sample set. Further clinical decision curve analysis of the model revealed that when the high-risk threshold was between 0.1 and 0.8, the model was of good value for clinical use.

However, the model’s sensitivity was 34.25% and 15.15% in the test and validation sets, respectively, and its specificity was 97.99% and 97.77%, respectively, with positive predictive values of 78.13% and 55.56% and negative predictive values of 87.69% and 86.21%, suggesting that its predictive sensitivity needs to be improved.

The possible reasons for the good predictive specificity and poor sensitivity of the model are as follows: (I) Low incidence of ATDH. In this study, there were 118 cases in the ATDH group and 628 cases in the non-ATDH group, and the incidence of ATDH was 15.81%. All TB patients were randomly divided into the test set (81 cases in the ATDH group) and the validation set (37 cases in the ATDH group) in a 3:2 ratio, and their ATDH incidence was 18.57% and 14.45%, corresponding to a sensitivity of 34.25% and 15.15%, respectively. The low incidence of ATDH in the data used to construct the model may be one of the important reasons for its poor sensitivity. (II) Lack of strong predictors. The predictors selected by Lasso regression and univariate logistic regression for inclusion in the model were general clinical features (fever), routine laboratory indicators (ALT, AST, albumin, monocyte percentage), and genetic indicators (genotypes of rs3814055 and rs4435111). Although these predictors are objective tests, they are all relevant markers derived from the mechanisms of ATDH occurrence and not specific markers. Metabolomic and microbiome studies indicate that the metabolic and microbial profiles of ATDH also differ from those of non-ATDH (35). However, this study found that gene polymorphisms were correlated with the occurrence of ATDH, and different genes had interactions. Given that the mechanism of ATDH has not yet been elucidated, exploring the target molecules [such as N-acetyltransferase (NAT), glutathione S-transferase (GST), and CYP450] involved in its occurrence and development as predictors will help to improve the predictive power.

Therefore, based on our established prediction model for ATDH, it can be concluded that (I) the machine learning algorithm Lasso regression helps to screen a large number of candidate variables simultaneously and, through bootstrap self-sampling and cross-validation, meets the requirements of the bias-variance trade-off and avoids overfitting; (II) SNPs are promising predictors, and building prediction models from the combined SNP features of multiple genes rather than a single gene can improve predictive efficacy and clinical applicability; and (III) simultaneous modeling of multi-gene SNPs requires consideration of the impact of interactions on model predictive efficacy. Future research should validate the model in larger and different populations while adding as many key genes or clinical data as possible to increase the sensitivity of the model.


Acknowledgments

Funding: This study was supported by the Sichuan Medical Research Project (No. S21058), the Science and Technology Project of the Health Planning Committee of Sichuan (No. 19PJ163), and the Chengdu Medical Research Project (No. 2019067 and No. 2018059).


Footnote

Reporting Checklist: The authors have completed the STARD reporting checklist. Available at https://atm.amegroups.com/article/view/10.21037/atm-22-4551/rc

Data Sharing Statement: Available at https://atm.amegroups.com/article/view/10.21037/atm-22-4551/dss

Conflicts of Interest: All authors have completed the ICMJE uniform disclosure form (available at https://atm.amegroups.com/article/view/10.21037/atm-22-4551/coif). The authors have no conflicts of interest to declare.

Ethical Statement: The authors are accountable for all aspects of the work in ensuring that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved. The study was conducted in accordance with the Declaration of Helsinki (as revised in 2013). The study was approved by the Ethics Committee of West China Hospital, Sichuan University (Registration No. 2014198). All experimental subjects have signed the informed consent form.

Open Access Statement: This is an Open Access article distributed in accordance with the Creative Commons Attribution-NonCommercial-NoDerivs 4.0 International License (CC BY-NC-ND 4.0), which permits the non-commercial replication and distribution of the article with the strict proviso that no changes or edits are made and the original work is properly cited (including links to both the formal publication through the relevant DOI and the license). See: https://creativecommons.org/licenses/by-nc-nd/4.0/.


References

  1. Chakaya J, Khan M, Ntoumi F, et al. Global Tuberculosis Report 2020 - Reflections on the Global TB burden, treatment and prevention efforts. Int J Infect Dis 2021;113:S7-S12.
  2. Jeremiah C, Petersen E, Nantanda R, et al. The WHO Global Tuberculosis 2021 Report - not so good news and turning the tide back to End TB. Int J Infect Dis 2022; Epub ahead of print.
  3. Bao Y, Ma X, Rasmussen TP, et al. Genetic Variations Associated with Anti-Tuberculosis Drug-Induced Liver Injury. Curr Pharmacol Rep 2018;4:171-81.
  4. Zhang J, Liu X, He H, et al. Influence of HNF4α and HNF4α-AS1 gene variants on the risk of anti-tuberculosis drugs-induced hepatotoxicity. Ann Palliat Med 2021;10:11733-44.
  5. Huang YS. Recent progress in genetic variation and risk of antituberculosis drug-induced liver injury. J Chin Med Assoc 2014;77:169-73.
  6. Zhang J, Jiao L, Song J, et al. Genetic and Functional Evaluation of the Role of FOXO1 in Antituberculosis Drug-Induced Hepatotoxicity. Evid Based Complement Alternat Med 2021;2021:3185874.
  7. Lyoumi S, Lefebvre T, Karim Z, et al. PXR-ALAS1: a key regulatory pathway in liver toxicity induced by isoniazid-rifampicin antituberculosis treatment. Clin Res Hepatol Gastroenterol 2013;37:439-41.
  8. Podvinec M, Handschin C, Looser R, et al. Identification of the xenosensors regulating human 5-aminolevulinate synthase. Proc Natl Acad Sci U S A 2004;101:9127-32.
  9. Fraser DJ, Zumsteg A, Meyer UA. Nuclear receptors constitutive androstane receptor and pregnane X receptor activate a drug-responsive enhancer of the murine 5-aminolevulinic acid synthase gene. J Biol Chem 2003;278:39392-401.
  10. Thunell S. (Far) Outside the box: genomic approach to acute porphyria. Physiol Res 2006;55:S43-66.
  11. Zhang J, Zhao Z, Bai H, et al. Genetic polymorphisms in PXR and NF-κB1 influence susceptibility to anti-tuberculosis drug-induced liver injury. PLoS One 2019;14:e0222033.
  12. Steyerberg EW, Vergouwe Y. Towards better clinical prediction models: seven steps for development and an ABCD for validation. Eur Heart J 2014;35:1925-31.
  13. Pontual Y, Pacheco VSS, Monteiro SP, et al. ABCB1 gene polymorphism associated with clinical factors can predict drug-resistant tuberculosis. Clin Sci (Lond) 2017;131:1831-40.
  14. Mushiroda T, Yanai H, Yoshiyama T, et al. Development of a prediction system for anti-tuberculosis drug-induced liver injury in Japanese patients. Hum Genome Var 2016;3:16014.
  15. Chamorro JG, Castagnino JP, Aidar O, et al. Effect of gene-gene and gene-environment interactions associated with antituberculosis drug-induced hepatotoxicity. Pharmacogenet Genomics 2017;27:363-71.
  16. Mahomed S, Padayatchi N, Singh J, et al. Precision medicine in resistant Tuberculosis: Treat the correct patient, at the correct time, with the correct drug. J Infect 2019;78:261-8.
  17. Guo BL, Ouyang FS, Yang SM, et al. Development of a preprocedure nomogram for predicting contrast-induced acute kidney injury after coronary angiography or percutaneous coronary intervention. Oncotarget 2017;8:75087-93.
  18. Zhang J, Zhao Z, Bai H, et al. The Variant at TGFBRAP1 but Not TGFBR2 Is Associated with Antituberculosis Drug-Induced Liver Injury. Evid Based Complement Alternat Med 2019;2019:1685128.
  19. Yang M, Qiu Y, Jin Y, et al. NR1I2 genetic polymorphisms and the risk of anti-tuberculosis drug-induced hepatotoxicity: A systematic review and meta-analysis. Pharmacol Res Perspect 2020;8:e00696.
  20. WHO. Common Terminology Criteria for Adverse Events (CTCAE) Version 5.0. U.S. Department of Health and Human Services, 2017:1-155.
  21. Tostmann A, Boeree MJ, Aarnoutse RE, et al. Antituberculosis drug-induced hepatotoxicity: concise up-to-date review. J Gastroenterol Hepatol 2008;23:192-202.
  22. Huang YQ, Liang CH, He L, et al. Development and Validation of a Radiomics Nomogram for Preoperative Prediction of Lymph Node Metastasis in Colorectal Cancer. J Clin Oncol 2016;34:2157-64.
  23. Stone GW, Maehara A, Lansky AJ, et al. A prospective natural-history study of coronary atherosclerosis. N Engl J Med 2011;364:226-35.
  24. Kang SJ, Cho YR, Park GM, et al. Predictors for functionally significant in-stent restenosis: an integrated analysis using coronary angiography, IVUS, and myocardial perfusion imaging. JACC Cardiovasc Imaging 2013;6:1183-90.
  25. Akaike H. Data analysis by statistical models. No To Hattatsu 1992;24:127-33.
  26. Jaddoe VW, de Jonge LL, Hofman A, et al. First trimester fetal growth restriction and cardiovascular risk factors in school age children: population based cohort study. BMJ 2014;348:g14.
  27. Qiao F, Fu K, Zhang Q, et al. The association between missing teeth and non-alcoholic fatty liver disease in adults. J Clin Periodontol 2018;45:941-51.
  28. Coutant C, Olivier C, Lambaudie E, et al. Comparison of models to predict nonsentinel lymph node status in breast cancer patients with metastatic sentinel lymph nodes: a prospective multicenter study. J Clin Oncol 2009;27:2800-8.
  29. Vickers AJ, Cronin AM, Elkin EB, et al. Extensions to decision curve analysis, a novel method for evaluating diagnostic tests, prediction models and molecular markers. BMC Med Inform Decis Mak 2008;8:53.
  30. Moons KG, Altman DG, Reitsma JB, et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis or Diagnosis (TRIPOD): explanation and elaboration. Ann Intern Med 2015;162:W1-73.
  31. Hu X, Zhang M, Bai H, et al. Antituberculosis Drug-Induced Adverse Events in the Liver, Kidneys, and Blood: Clinical Profiles and Pharmacogenetic Predictors. Clin Pharmacol Ther 2018;104:326-34.
  32. Lu T, He L, Zhang B, et al. Percutaneous mastoid electrical stimulator improves Poststroke depression and cognitive function in patients with Ischaemic stroke: a prospective, randomized, double-blind, and sham-controlled study. BMC Neurol 2020;20:217.
  33. Tibshirani R. Regression Shrinkage and Selection via the Lasso. Journal of the Royal Statistical Society, Series B (Methodology) 1996;58:267-88.
  34. Vrieze SI. Model selection and psychological theory: a discussion of the differences between the Akaike information criterion (AIC) and the Bayesian information criterion (BIC). Psychol Methods 2012;17:228-43.
  35. Wu S, Wang M, Zhang M, et al. Metabolomics and microbiomes for discovering biomarkers of antituberculosis drugs-induced hepatotoxicity. Arch Biochem Biophys 2022;716:109118.

(English Language Editor: B. Draper)

Cite this article as: Zhang J, Zhou W, Ma S, Kang Y, Yang W, Peng X, Zhou Y, Deng F. Combined electronic medical records and gene polymorphism characteristics to establish an anti-tuberculosis drug-induced hepatic injury (ATDH) prediction model and evaluate the prediction value. Ann Transl Med 2022;10(20):1114. doi: 10.21037/atm-22-4551
