Development of a new metric to identify rare patterns in association analysis: The case of analyzing diabetes complications

Saeed Piri, Dursun Delen, Tieming Liu, William Paiva

Research output: Contribution to journal › Article › peer-review

21 Scopus citations

Abstract

Diabetes, one of the most serious and fast-growing chronic health conditions, often leads to other serious complications such as neurological, renal, ophthalmic, and heart diseases. Research has shown that more than 85% of diabetic patients develop at least one of these complications. Therefore, studying comorbidities among diabetic patients using association analysis is a worthy research endeavor. Association analysis is a well-known data mining method that aims to reveal the association/affinity patterns/rules among various items (objects or events) that occur together. One of the most critical problems in association analysis is the difficulty of identifying rare items/patterns. In ordinary association analysis, specifying a large minimum-support leads to not discovering rare rules, while setting a small minimum-support leads to over-generating rules that may not be strong and beneficial. In this study, we propose a new assessment metric, called adjusted_support, to address this problem. Applying this new metric can retrieve rare patterns without over-generating association rules. To test the proposed metric, we extracted data from a large and feature-rich electronic medical records data warehouse and performed association analysis on the resultant data set that included 492,025 unique patients diagnosed with diabetes and related complications. By applying adjusted_support, we discovered interesting associations among diabetes complications such as neurological manifestations with diabetic arthropathy and gastroparesis; renal manifestations with retinopathy; gastroparesis with ketoacidosis and retinopathy; and skin complications with hyperglycemia, peripheral circulatory disorder, heart disease, and neurological manifestations. We also performed association analysis in various demographic groups at more granular levels. Besides association analysis, we analyzed the comorbidity situation among different demographic groups of diabetics. Finally, we studied and compared the prevalence of diabetes complications in every demographic group of patients.
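The minimum-support trade-off described in the abstract can be illustrated with a minimal sketch of ordinary support and confidence. This is not the adjusted_support metric, whose definition is not given in the abstract. The patient records, the helper names (`support`, `confidence`, `frequent_pairs`), and the thresholds below are all hypothetical and chosen only for illustration.

```python
from itertools import combinations

# Hypothetical toy data: each set lists the complications recorded for one patient.
patients = [
    {"renal", "retinopathy"},
    {"renal", "retinopathy"},
    {"renal", "retinopathy", "heart"},
    {"renal", "heart"},
    {"retinopathy", "heart"},
    {"renal", "retinopathy"},
    {"heart"},
    {"renal", "heart"},
    {"gastroparesis", "ketoacidosis"},  # a rare but perfectly co-occurring pair
    {"renal", "retinopathy", "heart"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Ordinary confidence of the rule antecedent -> consequent."""
    both = set(antecedent) | set(consequent)
    return support(both, transactions) / support(antecedent, transactions)


def frequent_pairs(transactions, min_support):
    """All item pairs whose ordinary support is at least min_support."""
    items = sorted(set().union(*transactions))
    result = {}
    for pair in combinations(items, 2):
        s = support(pair, transactions)
        if s >= min_support:
            result[pair] = s
    return result


rare_pair = ("gastroparesis", "ketoacidosis")
high = frequent_pairs(patients, min_support=0.3)  # large threshold
low = frequent_pairs(patients, min_support=0.1)   # small threshold

print(rare_pair in high)  # the rare pair is filtered out at the large threshold
print(rare_pair in low)   # it only appears once the threshold is lowered
print(confidence({"gastroparesis"}, {"ketoacidosis"}, patients))
```

In this toy data the rare pair has low support but high confidence, so a large threshold discards a strong rule. On a real data set, lowering the threshold enough to keep such a pair would also admit many weak pairs, which is the over-generation problem the abstract describes.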

Original language: English
Pages (from-to): 112-125
Number of pages: 14
Journal: Expert Systems with Applications
Volume: 94
State: Published - 15 Mar 2018

Keywords

  • Adjusted_support
  • Association rule mining
  • Comorbidity
  • Data mining
  • Diabetes
  • Rare-pattern identification
