Occupation and Health ›› 2023, Vol. 39 ›› Issue (12): 1719-1725.

• Review •

Research progress of unbalanced data classification and its application in disease diagnosis

ZOU Qiong1,2, WANG Chong3

  1. College of Public Health, Shaanxi University of Chinese Medicine, Xianyang, Shaanxi 712046, China;
  2. Military Health Statistics Teaching and Research Office, Department of Military Preventive Medicine, Air Force Military Medical University, Xi'an, Shaanxi 710032, China;
  3. Medical Experiment Center, Shaanxi University of Chinese Medicine, Xianyang, Shaanxi 712046, China
  • Received: 2022-10-18 Revised: 2022-11-21 Published: 2026-03-15
  • Contact: WANG Chong, Associate Professor, E-mail: w-goahead@163.com
  • About the author: ZOU Qiong, female, master's student; research interests: experimental design and statistical analysis methods.

Abstract: An uneven distribution of data across classes is called between-class imbalance; in imbalanced data, the number of samples differs greatly from one class to another. In recent years, machine learning has become a popular choice for the diagnosis, analysis, and prediction of disease because it takes less time than traditional diagnostic methods and can directly predict the factors associated with a disease, which helps limit harm and reduce the burden on individuals and society. This paper reviews the machine learning methods commonly used in recent years to handle imbalanced data, focusing on their application in predicting malignant tumors, heart disease, diabetes and its complications, and other diseases.

Key words: Unbalanced data, Disease diagnosis, Machine learning, Clinical prediction model
