A data analytics approach to building a clinical decision support system for diabetic retinopathy: Developing and deploying a model ensemble

Saeed Piri, Dursun Delen, Tieming Liu, Hamed M. Zolbanin

Research output: Contribution to journal (Article)

24 Scopus citations


Diabetes is a common chronic disease that may lead to several complications. Diabetic retinopathy (DR), one of the most serious of these complications, is the most common cause of vision loss among diabetic patients. In this paper, we analyzed data from more than 1.4 million diabetic patients and developed a clinical decision support system (CDSS) for predicting DR. While the existing diagnostic approach requires access to ophthalmologists and expensive equipment, our CDSS uses only demographic and lab data to detect patients' susceptibility to retinopathy with high accuracy. We illustrate how a combination of multiple data preparation and modeling steps helped us improve the performance of our CDSS. On the data preprocessing side, we aggregated the data at the patient level and incorporated comorbidity information into our models. On the modeling side, we built several predictive models and developed a novel “confidence margin” ensemble technique that outperformed the existing ensemble models. Our results suggest that diabetic neuropathy, creatinine serum, blood urea nitrogen, glucose serum plasma, and hematocrit are the most important variables in detecting DR. Our CDSS has several important practical implications, including identifying the DR risk factors, facilitating the early diagnosis of DR, and solving the problem of low compliance with annual retinopathy screenings.
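
The abstract names a “confidence margin” ensemble but does not define it. As a rough illustration only, the sketch below assumes one plausible reading: for each patient, the ensemble defers to whichever base model is most confident, measured here as the distance of its predicted DR probability from the decision threshold. The function name, the margin definition, the threshold of 0.5, the model names, and the probabilities are all hypothetical and are not taken from the paper.

```python
from typing import Dict, List, Tuple


def confidence_margin_ensemble(
    probs_by_model: Dict[str, List[float]],
    threshold: float = 0.5,
) -> List[Tuple[str, float, int]]:
    """Per patient, use the model whose probability is farthest from the threshold.

    ASSUMPTION: "confidence margin" is taken to mean |p - threshold|.
    This is an illustrative reading, not the authors' published definition.
    Returns one (chosen_model, probability, predicted_label) tuple per patient.
    """
    model_names = list(probs_by_model)
    n_patients = len(probs_by_model[model_names[0]])
    results = []
    for i in range(n_patients):
        # Pick the most confident model for this patient (ties go to the first listed).
        best = max(model_names, key=lambda m: abs(probs_by_model[m][i] - threshold))
        p = probs_by_model[best][i]
        results.append((best, p, int(p >= threshold)))
    return results


# Hypothetical predicted probabilities of DR for three patients.
example_probs = {
    "logistic_regression": [0.55, 0.20, 0.48],
    "random_forest": [0.90, 0.45, 0.40],
    "neural_net": [0.60, 0.05, 0.75],
}
decisions = confidence_margin_ensemble(example_probs)
for model, prob, label in decisions:
    print(model, prob, label)
```

Unlike simple averaging or majority voting, a scheme of this kind lets a single confident model override several uncertain ones, which is one way an ensemble can differ from the standard combiners the abstract compares against.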

Original language: English
Pages (from-to): 12-27
Number of pages: 16
Journal: Decision Support Systems
Publication status: Published - 1 Sep 2017
Externally published: Yes



  • Clinical decision support systems
  • Data analytics
  • Diabetic retinopathy
  • Model ensembles
  • Predictive modeling
  • Variable importance
