An investigation of data and text mining methods for real world deception detection

Christie M. Fuller, David P. Biros, Dursun Delen

Research output: Contribution to journal › Article

31 Citations (Scopus)

Abstract

Uncovering lies (or deception) is of critical importance to many, including law enforcement and security personnel. Though these people may try many different tactics to discover deception, previous research indicates that this cannot be accomplished successfully without aid. This manuscript reports the promising results of a research study in which data and text mining methods, along with a sample of real-world data from a high-stakes situation, are used to detect deception. The information fusion based classification models produced better than 74% classification accuracy on the holdout sample using a 10-fold cross validation methodology. By comparison, artificial neural networks and decision trees produced accuracy rates of 73.46% and 71.60%, respectively. Given the high stakes associated with these types of decisions, the extra effort of combining the models to achieve higher accuracy is well warranted.
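
To make the comparison in the abstract concrete, the following is a minimal sketch of how 10-fold cross-validation partitions a sample and how scores from several classifiers might be fused. It is not the authors' implementation: the abstract does not say how the models were combined, so the weighted average in `fuse_scores`, all function names, and the toy scores and labels are assumptions made purely for illustration.

```python
from typing import List, Sequence, Tuple


def k_fold_indices(n_samples: int, k: int = 10) -> List[Tuple[List[int], List[int]]]:
    """Return k (train, test) index pairs; each sample is tested exactly once."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    # Fold i holds every k-th index starting at i (no shuffling in this sketch).
    folds = [list(range(i, n_samples, k)) for i in range(k)]
    splits = []
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        splits.append((train, test))
    return splits


def fuse_scores(scores_per_model: Sequence[Sequence[float]],
                weights: Sequence[float]) -> List[float]:
    """Weighted average of per-model scores (an assumed fusion rule)."""
    if len(scores_per_model) != len(weights):
        raise ValueError("one weight per model is required")
    total = sum(weights)
    n_cases = len(scores_per_model[0])
    return [
        sum(w * scores[i] for w, scores in zip(weights, scores_per_model)) / total
        for i in range(n_cases)
    ]


def accuracy(scores: Sequence[float], labels: Sequence[int],
             threshold: float = 0.5) -> float:
    """Share of cases where the thresholded score matches the label."""
    predictions = [1 if s >= threshold else 0 for s in scores]
    return sum(p == y for p, y in zip(predictions, labels)) / len(labels)


# Invented toy data: 1 = deceptive, 0 = truthful. Not taken from the study.
labels = [1, 1, 0, 0]
model_a_scores = [0.9, 0.4, 0.2, 0.3]  # hypothetical output of one classifier
model_b_scores = [0.3, 0.9, 0.1, 0.4]  # hypothetical output of another

fused_scores = fuse_scores([model_a_scores, model_b_scores], weights=[1.0, 1.0])

if __name__ == "__main__":
    print("model A accuracy:", accuracy(model_a_scores, labels))  # 0.75
    print("model B accuracy:", accuracy(model_b_scores, labels))  # 0.75
    print("fused accuracy:  ", accuracy(fused_scores, labels))    # 1.0
    print("folds for 20 samples:", len(k_fold_indices(20, k=10)))  # 10
```

In this toy case each model misclassifies a different statement, so averaging their scores corrects both errors. Real gains depend on how complementary the models' errors are, and in practice the fusion weights would be estimated on training folds only.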

Original language: English
Pages (from-to): 8392-8398
Number of pages: 7
Journal: Expert Systems with Applications
Volume: 38
Issue number: 7
DOIs: https://doi.org/10.1016/j.eswa.2011.01.032
State: Published - 1 Jul 2011
Externally published: Yes

Fingerprint

Information fusion
Law enforcement
Decision trees
Personnel
Neural networks

Keywords

  • Classification
  • Credibility assessment
  • Data mining
  • Deception detection
  • Information fusion
  • Text mining

Cite this

@article{02c7632293404a4cb000e11e9bb2bd13,
title = "An investigation of data and text mining methods for real world deception detection",
abstract = "Uncovering lies (or deception) is of critical importance to many, including law enforcement and security personnel. Though these people may try many different tactics to discover deception, previous research indicates that this cannot be accomplished successfully without aid. This manuscript reports the promising results of a research study in which data and text mining methods, along with a sample of real-world data from a high-stakes situation, are used to detect deception. The information fusion based classification models produced better than 74{\%} classification accuracy on the holdout sample using a 10-fold cross validation methodology. By comparison, artificial neural networks and decision trees produced accuracy rates of 73.46{\%} and 71.60{\%}, respectively. Given the high stakes associated with these types of decisions, the extra effort of combining the models to achieve higher accuracy is well warranted.",
keywords = "Classification, Credibility assessment, Data mining, Deception detection, Information fusion, Text mining",
author = "Fuller, {Christie M.} and Biros, {David P.} and Delen, Dursun",
year = "2011",
month = "7",
day = "1",
doi = "10.1016/j.eswa.2011.01.032",
language = "English",
volume = "38",
pages = "8392--8398",
journal = "Expert Systems with Applications",
issn = "0957-4174",
publisher = "Elsevier Ltd",
number = "7",

}

An investigation of data and text mining methods for real world deception detection. / Fuller, Christie M.; Biros, David P.; Delen, Dursun.

In: Expert Systems with Applications, Vol. 38, No. 7, 01.07.2011, p. 8392-8398.

TY - JOUR

T1 - An investigation of data and text mining methods for real world deception detection

AU - Fuller, Christie M.

AU - Biros, David P.

AU - Delen, Dursun

PY - 2011/7/1

Y1 - 2011/7/1

AB - Uncovering lies (or deception) is of critical importance to many, including law enforcement and security personnel. Though these people may try many different tactics to discover deception, previous research indicates that this cannot be accomplished successfully without aid. This manuscript reports the promising results of a research study in which data and text mining methods, along with a sample of real-world data from a high-stakes situation, are used to detect deception. The information fusion based classification models produced better than 74% classification accuracy on the holdout sample using a 10-fold cross validation methodology. By comparison, artificial neural networks and decision trees produced accuracy rates of 73.46% and 71.60%, respectively. Given the high stakes associated with these types of decisions, the extra effort of combining the models to achieve higher accuracy is well warranted.

KW - Classification

KW - Credibility assessment

KW - Data mining

KW - Deception detection

KW - Information fusion

KW - Text mining

UR - http://www.scopus.com/inward/record.url?scp=79952437990&partnerID=8YFLogxK

U2 - 10.1016/j.eswa.2011.01.032

DO - 10.1016/j.eswa.2011.01.032

M3 - Article

AN - SCOPUS:79952437990

VL - 38

SP - 8392

EP - 8398

JO - Expert Systems with Applications

JF - Expert Systems with Applications

SN - 0957-4174

IS - 7

ER -