TY - JOUR
T1 - A data analytics approach to building a clinical decision support system for diabetic retinopathy
T2 - Developing and deploying a model ensemble
AU - Piri, Saeed
AU - Delen, Dursun
AU - Liu, Tieming
AU - Zolbanin, Hamed M.
N1 - Publisher Copyright: © 2017 Elsevier B.V.
PY - 2017/9/1
Y1 - 2017/9/1
N2 - Diabetes is a common chronic disease that may lead to several complications. Diabetic retinopathy (DR), one of the most serious of these complications, is the most common cause of vision loss among diabetic patients. In this paper, we analyzed data from more than 1.4 million diabetics and developed a clinical decision support system (CDSS) for predicting DR. While the existing diagnostic approach requires access to ophthalmologists and expensive equipment, our CDSS uses only demographic and lab data to detect patients' susceptibility to retinopathy with high accuracy. We illustrate how a combination of multiple data preparation and modeling steps helped us improve the performance of our CDSS. On the data preprocessing side, we aggregated the data at the patient level and incorporated comorbidity information into our models. On the modeling side, we built several predictive models and developed a novel “confidence margin” ensemble technique that outperformed the existing ensemble models. Our results suggest that diabetic neuropathy, creatinine serum, blood urea nitrogen, glucose serum plasma, and hematocrit are the most important variables in detecting DR. Our CDSS has several important practical implications, including identifying the DR risk factors, facilitating the early diagnosis of DR, and solving the problem of low compliance with annual retinopathy screenings.
KW - Clinical decision support systems
KW - Data analytics
KW - Diabetic retinopathy
KW - Model ensembles
KW - Predictive modeling
KW - Variable importance
UR - http://www.scopus.com/inward/record.url?scp=85020058737&partnerID=8YFLogxK
U2 - 10.1016/j.dss.2017.05.012
DO - 10.1016/j.dss.2017.05.012
M3 - Article
AN - SCOPUS:85020058737
SN - 0167-9236
VL - 101
SP - 12
EP - 27
JO - Decision Support Systems
JF - Decision Support Systems
ER -