OCCUPATION AND HEALTH ›› 2023, Vol. 39 ›› Issue (12): 1719-1725.

• Overview •

Research progress of unbalanced data classification and its application in disease diagnosis

ZOU Qiong1,2, WANG Chong3   

  1. College of Public Health, Shaanxi University of Chinese Medicine, Xianyang, Shaanxi 712046, China;
    2. Military Health Statistics Teaching and Research Office, Department of Military Preventive Medicine, Air Force Military Medical University, Xi'an, Shaanxi 710032, China;
    3. Medical Experiment Center, Shaanxi University of Chinese Medicine, Xianyang, Shaanxi 712046, China
  • Received: 2022-10-18 Revised: 2022-11-21 Published: 2026-03-15
  • Contact: WANG Chong, Associate professor, E-mail: w-goahead@163.com

Abstract: An uneven distribution of samples across classes is called class imbalance; in unbalanced data, the number of samples differs greatly from one class to another. In recent years, machine learning algorithms have become a popular choice for the diagnosis, analysis, and prediction of diseases, because they take less time than traditional diagnostic methods and can directly predict the factors that cause disease, which helps reduce harm and eases the burden on individuals and society. This paper reviews the machine learning methods used in recent years to process unbalanced data, focusing on their application in predicting cancer, heart disease, diabetes and its complications, and other diseases.

Key words: Unbalanced data, Disease diagnosis, Machine learning, Clinical prediction model
