A machine learning-based approach to prognostic analysis of thoracic transplantations

Dursun Delen, Asil Oztekin, Zhenyu Kong

Research output: Contribution to journal > Article

35 Citations (Scopus)

Abstract

Objective: Predicting survival time after organ transplantation and analyzing the prognosis of different risk groups of transplant patients are both clinically important and technically challenging. Current studies, which are mostly linear modeling-based statistical analyses, have focused on small sets of disparate predictive factors and neglect many potentially important variables. Data mining methods, such as machine learning-based approaches, can overcome these limitations by using sufficiently large data sets with many predictive factors to identify not only linear associations but also highly complex, non-linear relationships. This study therefore aims to explore risk groups of thoracic recipients through machine learning-based methods.
Methods and material: A large, feature-rich, nationwide thoracic transplantation dataset (obtained from the United Network for Organ Sharing, UNOS) is used to develop predictive models for survival time estimation. The predictive factors most relevant to survival time are identified by (1) conducting sensitivity analysis on models developed by the machine learning methods, (2) extracting variables from the published literature, and (3) eliciting variables from medical experts and other domain-specific knowledge bases. A unified set of predictors is then used to develop a Cox regression model and the related prognosis indices. Clustering algorithm-based and conventional risk grouping techniques are compared on the outcome of the Cox regression model in order to identify the optimal number of risk groups of thoracic recipients. Finally, Kaplan-Meier survival analysis is performed to validate the discrimination among the identified risk groups.
Results: The machine learning models performed very effectively in predicting survival time: the support vector machine model with a radial basis kernel function produced the best fit with an R² value of 0.879, the artificial neural network (a multilayer perceptron, MLP) came second with an R² value of 0.847, and the M5 algorithm-based regression tree model came last with an R² value of 0.785. Following the proposed method, a consolidated set of predictive variables is determined and used to build the Cox survival model. Using the prognosis indices revealed by the Cox survival model along with a k-means clustering algorithm, an optimal number of three risk groups is identified. The significance of the differences among these risk groups is also validated using Kaplan-Meier survival analysis.
Conclusions: This study demonstrated that the integrated machine learning method for selecting predictor variables is more effective in developing Cox survival models than the traditional methods commonly found in the literature. The significant distinction among the risk groups of thoracic patients also supports the effectiveness of the proposed methodology. We anticipate that this study (and other AI-based analytic studies like it) will lead to more effective analyses of thoracic transplant procedures and a better understanding of the prognosis of thoracic organ recipients. It could potentially lead to new medical and biological advances and more effective allocation policies in the field of organ transplantation.
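
The risk-grouping steps described in the abstract (a prognosis index from a Cox model, k-means clustering of that index into three groups, and a Kaplan-Meier curve per group) can be illustrated with a minimal sketch. This is not the authors' implementation, which the abstract does not give. The function names, the one-dimensional k-means initialization, the coefficients, and the toy cohort below are all hypothetical and were made up for illustration; they are not UNOS data or results from the study.

```python
from collections import defaultdict


def prognostic_index(coefs, covariates):
    """Linear predictor of a Cox model: PI = sum_j beta_j * x_j."""
    return sum(b * x for b, x in zip(coefs, covariates))


def _assign(values, centres):
    """Index of the nearest centre for each value."""
    return [min(range(len(centres)), key=lambda c: abs(v - centres[c]))
            for v in values]


def kmeans_1d(values, k=3, iters=100):
    """Lloyd's algorithm on scalars.

    Assumption: centres start at evenly spaced order statistics, so
    label 0 is the lowest-index (lowest-risk) group.
    """
    ordered = sorted(values)
    n = len(ordered)
    centres = [ordered[(2 * i + 1) * n // (2 * k)] for i in range(k)]
    for _ in range(iters):
        labels = _assign(values, centres)
        updated = []
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            # Keep the old centre if a cluster happens to be empty.
            updated.append(sum(members) / len(members) if members else centres[c])
        if updated == centres:
            break
        centres = updated
    return _assign(values, centres), centres


def kaplan_meier(times, events):
    """Product-limit estimate of S(t) at each distinct event time.

    events: 1 = event observed (death), 0 = right-censored.
    """
    deaths = defaultdict(int)
    leaving = defaultdict(int)
    for t, e in zip(times, events):
        leaving[t] += 1
        deaths[t] += e
    at_risk = len(times)
    surv = 1.0
    curve = []
    for t in sorted(leaving):
        if deaths[t]:
            surv *= 1 - deaths[t] / at_risk
            curve.append((t, surv))
        at_risk -= leaving[t]
    return curve


if __name__ == "__main__":
    # Hypothetical coefficients and toy cohort: (covariates, months, event).
    betas = [0.8, 0.5]
    cohort = [
        ([0.1, 0.2], 60, 0), ([0.0, 0.3], 55, 1), ([0.2, 0.1], 70, 0),
        ([1.0, 0.9], 30, 1), ([1.1, 1.0], 28, 1), ([0.9, 1.2], 35, 0),
        ([2.0, 2.1], 6, 1), ([2.2, 1.9], 9, 1), ([1.9, 2.3], 12, 1),
    ]
    pis = [prognostic_index(betas, x) for x, _, _ in cohort]
    groups, _ = kmeans_1d(pis, k=3)
    for g in range(3):
        rows = [(t, e) for (_, t, e), lab in zip(cohort, groups) if lab == g]
        print(g, kaplan_meier([t for t, _ in rows], [e for _, e in rows]))
```

In practice the Cox coefficients would come from a fitted model rather than being set by hand, and the separation between the group curves would be checked with a formal test such as the log-rank test, which this sketch omits.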

Original language: English
Pages (from-to): 33-42
Number of pages: 10
Journal: Artificial Intelligence in Medicine
Volume: 49
Issue number: 1
DOIs: 10.1016/j.artmed.2010.01.002
State: Published - 1 May 2010
Externally published: Yes

Fingerprint

Transplantation (surgical)
Learning systems
Thorax
Transplantation
Proportional Hazards Models
Survival
Kaplan-Meier Estimate
Organ Transplantation
Survival Analysis
Transplants
Cluster Analysis
Clustering algorithms
Neural Networks (Computer)
Data Mining
Knowledge Bases
Machine Learning
Multilayer neural networks
Sensitivity analysis
Support vector machines
Data mining

Keywords

  • Data mining
  • Machine learning
  • Prognostic index
  • Survival analysis
  • Thoracic Transplantation
  • UNOS

Cite this

@article{b107c954a6b64712a694a05525ed67ce,
title = "A machine learning-based approach to prognostic analysis of thoracic transplantations",
keywords = "Data mining, Machine learning, Prognostic index, Survival analysis, Thoracic Transplantation, UNOS",
author = "Dursun Delen and Asil Oztekin and Zhenyu Kong",
year = "2010",
month = "5",
day = "1",
doi = "10.1016/j.artmed.2010.01.002",
language = "English",
volume = "49",
pages = "33--42",
journal = "Artificial Intelligence in Medicine",
issn = "0933-3657",
publisher = "Elsevier",
number = "1",

}

A machine learning-based approach to prognostic analysis of thoracic transplantations. / Delen, Dursun; Oztekin, Asil; Kong, Zhenyu.

In: Artificial Intelligence in Medicine, Vol. 49, No. 1, 01.05.2010, p. 33-42.

TY - JOUR

T1 - A machine learning-based approach to prognostic analysis of thoracic transplantations

AU - Delen, Dursun

AU - Oztekin, Asil

AU - Kong, Zhenyu

PY - 2010/5/1

Y1 - 2010/5/1

KW - Data mining

KW - Machine learning

KW - Prognostic index

KW - Survival analysis

KW - Thoracic Transplantation

KW - UNOS

UR - http://www.scopus.com/inward/record.url?scp=77951625860&partnerID=8YFLogxK

U2 - 10.1016/j.artmed.2010.01.002

DO - 10.1016/j.artmed.2010.01.002

M3 - Article

C2 - 20153956

AN - SCOPUS:77951625860

VL - 49

SP - 33

EP - 42

JO - Artificial Intelligence in Medicine

JF - Artificial Intelligence in Medicine

SN - 0933-3657

IS - 1

ER -