TY - GEN
T1 - An analytic approach to understanding and predicting healthcare coverage
AU - Delen, Dursun
AU - Fuller, Christie
PY - 2013/1/1
Y1 - 2013/1/1
N2 - The inequality in the level of healthcare coverage among the people in the US is a pressing issue. Unfortunately, many people do not have healthcare coverage and much research is needed to identify the factors leading to this phenomenon. Hence, the goal of this study is to examine the healthcare coverage of individuals by applying popular analytic techniques to a wide variety of predictive factors. A large and feature-rich dataset is used in conjunction with four popular data mining techniques - artificial neural networks, decision trees, support vector machines and logistic regression - to develop prediction models. Applying sensitivity analysis to the developed prediction models, the ranked importance of variables is determined. The experimental results indicated that the most accurate classifier for this phenomenon was the support vector machine, which had an overall classification accuracy of 82.23% on the 10-fold holdout/test sample. The most important predictive factors were income, employment status, education, and marital status. The ability to identify and explain the reasoning of those likely to be without healthcare coverage through the application of accurate classification models can potentially be used in reducing the disparity in healthcare coverage.
AB - The inequality in the level of healthcare coverage among the people in the US is a pressing issue. Unfortunately, many people do not have healthcare coverage and much research is needed to identify the factors leading to this phenomenon. Hence, the goal of this study is to examine the healthcare coverage of individuals by applying popular analytic techniques to a wide variety of predictive factors. A large and feature-rich dataset is used in conjunction with four popular data mining techniques - artificial neural networks, decision trees, support vector machines and logistic regression - to develop prediction models. Applying sensitivity analysis to the developed prediction models, the ranked importance of variables is determined. The experimental results indicated that the most accurate classifier for this phenomenon was the support vector machine, which had an overall classification accuracy of 82.23% on the 10-fold holdout/test sample. The most important predictive factors were income, employment status, education, and marital status. The ability to identify and explain the reasoning of those likely to be without healthcare coverage through the application of accurate classification models can potentially be used in reducing the disparity in healthcare coverage.
KW - analytics
KW - data mining
KW - healthcare coverage
KW - sensitivity analysis
UR - http://www.scopus.com/inward/record.url?scp=84894281305&partnerID=8YFLogxK
U2 - 10.3233/978-1-61499-276-9-198
DO - 10.3233/978-1-61499-276-9-198
M3 - Conference contribution
C2 - 23823421
AN - SCOPUS:84894281305
SN - 9781614992752
T3 - Studies in Health Technology and Informatics
SP - 198
EP - 200
BT - Informatics, Management and Technology in Healthcare
PB - IOS Press
T2 - International Conference on Informatics, Management, and Technology in Healthcare, ICIMTH 2013
Y2 - 5 July 2013 through 7 July 2013
ER -