Predicting and analyzing secondary education placement-test scores: A data mining approach

Baha Şen, Emine Uçar, Dursun Delen

Research output: Contribution to journal (Article)

39 Scopus citations

Abstract

Understanding the factors that lead to the success (or failure) of students on placement tests is an interesting and challenging problem. Since centralized placement tests and future academic achievement are considered to be related, analysis of the success factors behind placement tests may help in understanding and potentially improving academic achievement. In this study, using a large and feature-rich dataset from the Secondary Education Transition System (SETS) in Turkey, we developed models to predict secondary education placement-test results and, using sensitivity analysis on those prediction models, identified the most important predictors. The results showed that the C5 decision tree algorithm is the best predictor, with 95% accuracy on the hold-out sample, followed by support vector machines (with an accuracy of 91%) and artificial neural networks (with an accuracy of 89%). Logistic regression models were the least accurate of the four, with an overall accuracy of 82%. The sensitivity analysis revealed that previous test experience, whether a student has a scholarship, the student's number of siblings, and the previous years' grade point average are among the most important predictors of placement-test scores.
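The workflow summarized in the abstract (fit a classifier, score it on a hold-out sample, then rank predictors by sensitivity) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the data are synthetic, the feature names `informative` and `noise` are hypothetical, a toy nearest-centroid classifier stands in for the C5, SVM, ANN, and logistic regression models used in the study, and sensitivity is approximated as the drop in hold-out accuracy when one feature is shuffled, which may differ from the authors' exact procedure.

```python
import random


def make_synthetic(n, rng):
    """Synthetic stand-in data (not SETS): the label depends only on feature 0."""
    rows = []
    for _ in range(n):
        x = [rng.random(), rng.random()]
        rows.append((x, 1 if x[0] > 0.5 else 0))
    return rows


def fit_centroids(train):
    """Toy classifier: store the per-class mean of each feature."""
    sums, counts = {}, {}
    for x, y in train:
        acc = sums.setdefault(y, [0.0] * len(x))
        for j, v in enumerate(x):
            acc[j] += v
        counts[y] = counts.get(y, 0) + 1
    return {y: [s / counts[y] for s in acc] for y, acc in sums.items()}


def predict(centroids, x):
    """Assign the class whose centroid is closest in squared Euclidean distance."""
    return min(
        centroids,
        key=lambda y: sum((a - b) ** 2 for a, b in zip(x, centroids[y])),
    )


def accuracy(centroids, data):
    """Fraction of rows classified correctly."""
    return sum(predict(centroids, x) == y for x, y in data) / len(data)


def sensitivity(centroids, holdout, feature, rng):
    """Accuracy drop when one feature's values are shuffled across hold-out rows."""
    shuffled = [x[feature] for x, _ in holdout]
    rng.shuffle(shuffled)
    perturbed = []
    for (x, y), v in zip(holdout, shuffled):
        x2 = list(x)
        x2[feature] = v
        perturbed.append((x2, y))
    return accuracy(centroids, holdout) - accuracy(centroids, perturbed)


rng = random.Random(0)  # fixed seed so the run is repeatable
data = make_synthetic(1000, rng)
train, holdout = data[:700], data[700:]  # 70/30 hold-out split (an assumption)

model = fit_centroids(train)
holdout_accuracy = accuracy(model, holdout)

feature_names = ["informative", "noise"]  # hypothetical names
drops = {
    name: sensitivity(model, holdout, j, rng)
    for j, name in enumerate(feature_names)
}

print(f"hold-out accuracy: {holdout_accuracy:.3f}")
for name, drop in sorted(drops.items(), key=lambda kv: kv[1], reverse=True):
    print(f"accuracy drop when shuffling {name}: {drop:.3f}")
```

In this sketch, a predictor that matters shows a large accuracy drop when shuffled, while an irrelevant one shows a drop near zero. The same ranking idea applies to any fitted model, which is why it can be used to compare predictors across several classifiers.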

Original language: English
Pages (from-to): 9468-9476
Number of pages: 9
Journal: Expert Systems with Applications
Volume: 39
Issue number: 10
State: Published - 1 Aug 2012
Externally published: Yes

Keywords

  • Classification
  • Data mining
  • Prediction
  • Sensitivity analysis
  • SETS