OCCUPATION AND HEALTH ›› 2025, Vol. 41 ›› Issue (15): 2098-2106.

• Treatise

Construction and evaluation of key genes and prognosis prediction model for screening of pulmonary tuberculosis combined with lung adenocarcinoma based on GEO and TCGA databases

WEI Yifan1,2, LI Tianxin1,3, LI Yuchen4, YANG Xinyu4, LIU Jialiang4, YI Na1   

  1. Department of Biochemistry and Molecular Biology, School of Basic Medical Sciences, Xinjiang Medical University; State Key Laboratory of Pathogenesis, Prevention and Treatment of High Incidence Diseases in Central Asia; Xinjiang Key Laboratory of Molecular Biology for Endemic Diseases, Urumqi, Xinjiang 830017, China;
    2. The Fifth Clinical Medical College, Xinjiang Medical University, Urumqi, Xinjiang 830017, China;
    3. School of Pharmacy, Xinjiang Medical University, Urumqi, Xinjiang 830017, China;
    4. Department of Medical Engineering and Technology, Xinjiang Medical University, Urumqi, Xinjiang 830017, China
  • Received:2024-11-03 Revised:2024-11-12 Online:2025-08-15 Published:2025-12-12
  • Contact: YI Na,Lecturer,E-mail:124504195@qq.com

Abstract: Objective A long history of pulmonary tuberculosis predisposes patients to lung cancer, and patients with lung cancer are in turn more susceptible to tuberculosis infection. This study aimed to identify and validate gene signatures associated with pulmonary tuberculosis that could serve as prognostic biomarkers for patients with lung adenocarcinoma, and to construct a prognostic prediction model. Methods Differential expression analysis and weighted gene co-expression network analysis (WGCNA) were performed on the pulmonary tuberculosis dataset GSE126614, and candidate genes were obtained by intersecting the two results. Survival analysis of the candidate genes was then performed in The Cancer Genome Atlas (TCGA) lung adenocarcinoma (LUAD) database. Five core genes were selected by univariate analysis and Lasso-Cox multivariate analysis, and a prediction model was constructed and externally validated. Results Intersecting the differential expression and WGCNA results for GSE126614 yielded 241 genes for survival analysis. Univariate analysis and Lasso-Cox multivariate analysis selected 5 genes, from which the following risk model was constructed: risk score = (0.00067 × PPTC7 expression) + (-0.0005 × RHOQ expression) + (0.0001 × TRIM28 expression) + (-0.031 × USP49 expression) + (-0.0002 × ZNF710 expression). In internal validation, the 1-year, 3-year and 5-year AUC values in the TCGA-LUAD training cohort were 0.755, 0.747 and 0.723, respectively, and consistent predictive ability was observed in the TCGA-LUAD validation cohort. In external validation, the 1-year, 3-year and 5-year AUC values of the ROC curve were 0.732, 0.68 and 0.646, respectively, which also indicated that the model had good predictive ability. Multivariate Cox analysis showed that the risk score (P<0.001) was an independent prognostic factor. At the transcript level, TRIM28, PPTC7, ZNF710 and USP49 were relatively highly expressed in lung adenocarcinoma tissues. At the protein level, TRIM28, PPTC7 and ZNF710 were relatively highly expressed in lung adenocarcinoma tissues. Conclusion The prognostic model composed of these five genes has some value in predicting the prognosis of patients with lung adenocarcinoma and can provide a reference for clinical diagnosis and treatment.
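The risk score reported in the Results is a plain weighted sum of the five gene expression values, so it can be illustrated in a few lines of Python. The sketch below uses only the coefficients stated in the abstract. The function name `risk_score` and the example expression values are hypothetical, and the abstract does not state the expression unit or normalization, so the resulting number is illustrative only and has no clinical meaning.

```python
# Coefficients as reported in the abstract (gene -> weight).
COEFFICIENTS = {
    "PPTC7": 0.00067,
    "RHOQ": -0.0005,
    "TRIM28": 0.0001,
    "USP49": -0.031,
    "ZNF710": -0.0002,
}


def risk_score(expression):
    """Return the weighted sum of expression values for the five model genes.

    `expression` maps gene symbol -> expression value. The unit and
    normalization are assumptions here; the abstract does not specify them.
    """
    missing = set(COEFFICIENTS) - set(expression)
    if missing:
        raise KeyError(f"missing expression values for: {sorted(missing)}")
    return sum(coef * expression[gene] for gene, coef in COEFFICIENTS.items())


# Hypothetical expression values, made up purely to exercise the formula.
example_patient = {
    "PPTC7": 1000.0,
    "RHOQ": 2000.0,
    "TRIM28": 5000.0,
    "USP49": 10.0,
    "ZNF710": 1500.0,
}

score = risk_score(example_patient)
print(f"risk score: {score:.4f}")
```

Positive coefficients (PPTC7, TRIM28) raise the score as expression increases, while negative coefficients (RHOQ, USP49, ZNF710) lower it. How the score is thresholded into risk groups is not described in the abstract and is deliberately left out of this sketch.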

Key words: Pulmonary tuberculosis, Lung adenocarcinoma, Core genes, Prognosis prediction model
