An assessment and cleaning framework for electronic health records data

Zhuqi Miao, Shrieraam Sathyanarayanan, Elvena Fong, William Paiva, Dursun Delen

Research output: Contribution to conference › Paper

Abstract

As a result of policies, such as the Health Information Technology for Economic and Clinical Health Act, that were designed to spur adoption and meaningful use of Electronic Health Records (EHR) to improve healthcare, an abundance of clinical data has been generated. Similar to other real-life big data, EHR data can be very “dirty,” which makes quality assessment and cleaning vital for producing accurate and complete EHR data sets that can be reused for clinical research. However, real-world quality assessment outcomes and cleaning methods for EHR data still remain largely undocumented to date. This study aims to contribute to such literature by: i) developing a data quality assessment and cleaning framework for EHR-based secondary analysis; ii) applying the framework to a case study of identifying hip fracture readmission risk factors based on data extracted from Cerner Health Facts, one of the nation's largest relational EHR data warehouses; and iii) reporting data quality problems identified and the cleaning methodologies that addressed the problems. Given the considerable similarities among various EHR systems, it is expected that the framework and findings based on Health Facts can be extended to cleaning relational EHR data in general.

Original language: English
Pages: 907-912
Number of pages: 6
State: Published - 1 Jan 2018
Event: 2018 Institute of Industrial and Systems Engineers Annual Conference and Expo, IISE 2018 - Orlando, United States
Duration: 19 May 2018 → 22 May 2018

Other

Other: 2018 Institute of Industrial and Systems Engineers Annual Conference and Expo, IISE 2018
Country: United States
City: Orlando
Period: 19/05/18 → 22/05/18

Fingerprint

Cleaning
Health
Data warehouses
Information technology
Economics

Keywords

  • Data cleaning
  • Data quality assessment
  • Electronic health records (EHR)
  • Secondary analysis of EHR

Cite this

Miao, Z., Sathyanarayanan, S., Fong, E., Paiva, W., & Delen, D. (2018). An assessment and cleaning framework for electronic health records data. 907-912. Paper presented at 2018 Institute of Industrial and Systems Engineers Annual Conference and Expo, IISE 2018, Orlando, United States.
Miao, Zhuqi ; Sathyanarayanan, Shrieraam ; Fong, Elvena ; Paiva, William ; Delen, Dursun. / An assessment and cleaning framework for electronic health records data. Paper presented at 2018 Institute of Industrial and Systems Engineers Annual Conference and Expo, IISE 2018, Orlando, United States. 6 p.
@conference{1ea169d042c843eb9d6111faca30b1b9,
title = "An assessment and cleaning framework for electronic health records data",
abstract = "As a result of policies, such as the Health Information Technology for Economic and Clinical Health Act, that were designed to spur adoption and meaningful use of Electronic Health Records (EHR) to improve healthcare, an abundance of clinical data has been generated. Similar to other real-life big data, EHR data can be very “dirty,” which makes quality assessment and cleaning vital for producing accurate and complete EHR data sets that can be reused for clinical research. However, real-world quality assessment outcomes and cleaning methods for EHR data still remain largely undocumented to date. This study aims to contribute to such literature by: i) developing a data quality assessment and cleaning framework for EHR-based secondary analysis; ii) applying the framework to a case study of identifying hip fracture readmission risk factors based on data extracted from Cerner Health Facts, one of the nation's largest relational EHR data warehouses; and iii) reporting data quality problems identified and the cleaning methodologies that addressed the problems. Given the considerable similarities among various EHR systems, it is expected that the framework and findings based on Health Facts can be extended to cleaning relational EHR data in general.",
keywords = "Data cleaning, Data quality assessment, Electronic health records (EHR), Secondary analysis of EHR",
author = "Zhuqi Miao and Shrieraam Sathyanarayanan and Elvena Fong and William Paiva and Dursun Delen",
year = "2018",
month = "1",
day = "1",
language = "English",
pages = "907--912",
note = "Conference date: 19-05-2018 Through 22-05-2018",

}

Miao, Z, Sathyanarayanan, S, Fong, E, Paiva, W & Delen, D 2018, 'An assessment and cleaning framework for electronic health records data', paper presented at 2018 Institute of Industrial and Systems Engineers Annual Conference and Expo, IISE 2018, Orlando, United States, 19/05/18 - 22/05/18, pp. 907-912.

TY - CONF

T1 - An assessment and cleaning framework for electronic health records data

AU - Miao, Zhuqi

AU - Sathyanarayanan, Shrieraam

AU - Fong, Elvena

AU - Paiva, William

AU - Delen, Dursun

PY - 2018/1/1

Y1 - 2018/1/1

N2 - As a result of policies, such as the Health Information Technology for Economic and Clinical Health Act, that were designed to spur adoption and meaningful use of Electronic Health Records (EHR) to improve healthcare, an abundance of clinical data has been generated. Similar to other real-life big data, EHR data can be very “dirty,” which makes quality assessment and cleaning vital for producing accurate and complete EHR data sets that can be reused for clinical research. However, real-world quality assessment outcomes and cleaning methods for EHR data still remain largely undocumented to date. This study aims to contribute to such literature by: i) developing a data quality assessment and cleaning framework for EHR-based secondary analysis; ii) applying the framework to a case study of identifying hip fracture readmission risk factors based on data extracted from Cerner Health Facts, one of the nation's largest relational EHR data warehouses; and iii) reporting data quality problems identified and the cleaning methodologies that addressed the problems. Given the considerable similarities among various EHR systems, it is expected that the framework and findings based on Health Facts can be extended to cleaning relational EHR data in general.

KW - Data cleaning

KW - Data quality assessment

KW - Electronic health records (EHR)

KW - Secondary analysis of EHR

UR - http://www.scopus.com/inward/record.url?scp=85054029141&partnerID=8YFLogxK

M3 - Paper

AN - SCOPUS:85054029141

SP - 907

EP - 912

ER -

Miao Z, Sathyanarayanan S, Fong E, Paiva W, Delen D. An assessment and cleaning framework for electronic health records data. 2018. Paper presented at 2018 Institute of Industrial and Systems Engineers Annual Conference and Expo, IISE 2018, Orlando, United States.