Analysis of healthcare coverage: A data mining approach

Dursun Delen, Christie Fuller, Charles McCann, Deepa Ray

Research output: Contribution to journal › Article

29 Citations (Scopus)

Abstract

Disparity in healthcare coverage is a pressing issue in the United States. Many people in the US lack healthcare coverage, and more research is needed to identify the factors behind this. This study examines the healthcare coverage of individuals by applying popular machine learning techniques to a wide variety of predictive factors. The data comprised twenty-three variables and 193,373 records from the 2004 Behavioral Risk Factor Surveillance System survey. Artificial neural network and decision tree models were developed and compared for predictive ability. Sensitivity analysis and variable importance measures were calculated to assess the importance of the predictive factors. The most accurate classifier was a multi-layer perceptron artificial neural network, which achieved an overall classification accuracy of 78.45% on the holdout sample. The most important predictive factors were income, employment status, education, and marital status. Using two popular machine learning techniques, this study identified factors that can be used to accurately classify those with and without healthcare coverage. The ability to identify those likely to be without healthcare coverage, and to explain why, through accurate classification models could potentially help reduce the disparity in healthcare coverage.
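The abstract mentions measuring holdout accuracy and ranking predictive factors but does not say how the sensitivity analysis was computed. The following is a minimal sketch that uses permutation importance as a stand-in technique, not the authors' method. The toy data, the feature names, and `toy_classifier` are all hypothetical and are not drawn from the survey data.

```python
import random


def accuracy(predict, rows, labels):
    """Fraction of holdout rows whose predicted label matches the true label."""
    correct = sum(1 for row, y in zip(rows, labels) if predict(row) == y)
    return correct / len(rows)


def permutation_importance(predict, rows, labels, n_features, seed=0):
    """Return baseline accuracy and the accuracy drop per shuffled feature."""
    rng = random.Random(seed)
    baseline = accuracy(predict, rows, labels)
    drops = []
    for j in range(n_features):
        # Shuffle one feature column, leaving the others untouched.
        column = [row[j] for row in rows]
        rng.shuffle(column)
        shuffled = [row[:j] + (v,) + row[j + 1:] for row, v in zip(rows, column)]
        drops.append(baseline - accuracy(predict, shuffled, labels))
    return baseline, drops


# Hypothetical holdout sample: (income_level, unrelated_noise) per record.
data_rng = random.Random(42)
rows = [(data_rng.randint(0, 9), data_rng.randint(0, 9)) for _ in range(1000)]
# Assumed toy rule: coverage (1) depends only on income_level.
labels = [1 if income >= 4 else 0 for income, _ in rows]


def toy_classifier(row):
    """Stand-in for a trained model; it looks only at income_level."""
    return 1 if row[0] >= 4 else 0


baseline, drops = permutation_importance(toy_classifier, rows, labels, n_features=2)
```

Here `baseline` is the holdout accuracy and `drops[j]` is the accuracy lost when feature `j` is shuffled. The noise feature yields a drop of zero because the toy classifier never reads it, while shuffling the income feature lowers accuracy. A larger drop suggests a more important factor under this particular measure.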

Original language: English
Pages (from-to): 995-1003
Number of pages: 9
Journal: Expert Systems with Applications
Volume: 36
Issue number: 2 PART 1
DOI: 10.1016/j.eswa.2007.10.041
State: Published - 1 Jan 2009
Externally published: Yes

Fingerprint

Data mining
Learning systems
Neural networks
Multilayer neural networks
Decision trees
Sensitivity analysis
Classifiers
Education

Keywords

  • Classification
  • Data mining
  • Decision trees
  • Healthcare coverage
  • Neural networks
  • Prediction

Cite this

Delen, Dursun ; Fuller, Christie ; McCann, Charles ; Ray, Deepa. / Analysis of healthcare coverage : A data mining approach. In: Expert Systems with Applications. 2009 ; Vol. 36, No. 2 PART 1. pp. 995-1003.
@article{0cc9319749df4253b7960f7af0638a4f,
title = "Analysis of healthcare coverage: A data mining approach",
abstract = "The existing disparity in the healthcare coverage is a pressing issue in the United States. Unfortunately, many in the US do not have healthcare coverage and much research is needed to identify the factors leading to this phenomenon. Hence, this study aims to examine the healthcare coverage of individuals by applying popular machine learning techniques on a wide-variety of predictive factors. Twenty-three variables and 193,373 records were utilized from the 2004 behavioral risk factor surveillance system survey data for this study. The artificial neural networks and the decision tree models were developed and compared to each other for predictive ability. The sensitivity analysis and variable importance measures are calculated to analyze the importance of the predictive factors. The experimental results indicated that the most accurate classifier for this phenomenon was the multi-layer perceptron type artificial neural network model that had an overall classification accuracy of 78.45{\%} on the holdout sample. The most important predictive factors came out as income, employment status, education, and marital status. Using two popular machine learning techniques, this study identified the factors that can be used to accurately classify those with and without healthcare coverage. The ability to identify and explain the reasoning of those likely to be without healthcare coverage through the application of accurate classification models can potentially be used in reducing the disparity in healthcare coverage.",
keywords = "Classification, Data mining, Decision trees, Healthcare coverage, Neural networks, Prediction",
author = "Dursun Delen and Christie Fuller and Charles McCann and Deepa Ray",
year = "2009",
month = "1",
day = "1",
doi = "10.1016/j.eswa.2007.10.041",
language = "English",
volume = "36",
pages = "995--1003",
journal = "Expert Systems with Applications",
issn = "0957-4174",
publisher = "Elsevier Ltd",
number = "2 PART 1",

}

TY - JOUR

T1 - Analysis of healthcare coverage

T2 - A data mining approach

AU - Delen, Dursun

AU - Fuller, Christie

AU - McCann, Charles

AU - Ray, Deepa

PY - 2009/1/1

Y1 - 2009/1/1

N2 - The existing disparity in the healthcare coverage is a pressing issue in the United States. Unfortunately, many in the US do not have healthcare coverage and much research is needed to identify the factors leading to this phenomenon. Hence, this study aims to examine the healthcare coverage of individuals by applying popular machine learning techniques on a wide-variety of predictive factors. Twenty-three variables and 193,373 records were utilized from the 2004 behavioral risk factor surveillance system survey data for this study. The artificial neural networks and the decision tree models were developed and compared to each other for predictive ability. The sensitivity analysis and variable importance measures are calculated to analyze the importance of the predictive factors. The experimental results indicated that the most accurate classifier for this phenomenon was the multi-layer perceptron type artificial neural network model that had an overall classification accuracy of 78.45% on the holdout sample. The most important predictive factors came out as income, employment status, education, and marital status. Using two popular machine learning techniques, this study identified the factors that can be used to accurately classify those with and without healthcare coverage. The ability to identify and explain the reasoning of those likely to be without healthcare coverage through the application of accurate classification models can potentially be used in reducing the disparity in healthcare coverage.

KW - Classification

KW - Data mining

KW - Decision trees

KW - Healthcare coverage

KW - Neural networks

KW - Prediction

UR - http://www.scopus.com/inward/record.url?scp=56349092023&partnerID=8YFLogxK

U2 - 10.1016/j.eswa.2007.10.041

DO - 10.1016/j.eswa.2007.10.041

M3 - Article

AN - SCOPUS:56349092023

VL - 36

SP - 995

EP - 1003

JO - Expert Systems with Applications

JF - Expert Systems with Applications

SN - 0957-4174

IS - 2 PART 1

ER -