Grand Canyon University How to Use Data Mining to Detect Deception in A Text Ques
Read “Application Case 5.3: Mining for Lies,” located in Chapter 5 of the textbook.
In 50-100 words each, address the questions presented at the end of the case study.
Driven by advancements in Web-based information technologies and increasing globalization, computer-mediated communication continues to filter into everyday life, bringing with it new venues for deception. The volume of text-based chat, instant messaging, text messaging, and text generated by online communities of practice is increasing rapidly. Even e-mail continues to grow in use. With the massive growth of text-based communication, the potential for people to deceive others through computer-mediated communication has also grown, and such deception can have disastrous results.
Unfortunately, in general, humans tend to perform poorly at deception-detection tasks. This phenomenon is exacerbated in text-based communications. A large part of the research on deception detection (also known as credibility assessment) has involved face-to-face meetings and interviews. Yet, with the growth of text-based communication, text-based deception-detection techniques are essential.
Techniques for successfully detecting deception (that is, lies) have wide applicability. Law enforcement can use decision support tools and techniques to investigate crimes, conduct security screening in airports, and monitor communications of suspected terrorists. Human resources professionals might use deception-detection tools to screen applicants. These tools and techniques also have the potential to screen e-mails to uncover fraud or other wrongdoings committed by corporate officers. Although some people believe that they can readily identify those who are not being truthful, a summary of deception research showed that, on average, people are only 54% accurate in making veracity determinations (Bond & DePaulo, 2006). This figure may actually be worse when humans try to detect deception in text.
Using a combination of text mining and data mining techniques, Fuller et al. (2008) analyzed person-of-interest statements completed by people involved in crimes on military bases. In these statements, suspects and witnesses are required to write their recollection of the event in their own words. Military law enforcement personnel searched archival data for statements that they could conclusively identify as being truthful or deceptive. These decisions were made on the basis of corroborating evidence and case resolution. Once the statements were labeled as truthful or deceptive, the law enforcement personnel removed identifying information and gave the statements to the research team. In total, 371 usable statements were received for analysis. The text-based deception-detection method used by Fuller et al. (2008) was based on a process known as message feature mining, which relies on elements of data and text mining techniques. A simplified depiction of the process is provided in Figure 5.3.
First, the researchers prepared the data for processing. The original handwritten statements had to be transcribed into a word processing file. Second, features (i.e., cues) were identified. The researchers identified 31 features representing categories or types of language that are relatively independent of the text content and that can be readily analyzed by automated means. For example, first-person pronouns such as I or me can be identified without analysis of the surrounding text. Table 5.1 lists the categories and an example list of features used in this study.
The features were extracted from the textual statements and input into a flat file for further processing. Using several feature-selection methods along with 10-fold cross-validation, the researchers compared the prediction accuracy of three popular data mining methods.
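The feature-extraction step described above can be sketched roughly as follows. This is a minimal illustration, not the code used in the study: the four cues, the pronoun list, the function names, and the sample statement are all assumptions made for the example, since the 31 features in Table 5.1 are not reproduced here. It only shows how content-independent cues, such as first-person pronouns, can be counted automatically and written to a flat file.

```python
import csv
import io
import re

# Hypothetical cue list for illustration; NOT the 31 features from the study.
FIRST_PERSON = {"i", "me", "my", "mine", "myself"}
FEATURE_NAMES = ["word_count", "sentence_count",
                 "first_person_count", "avg_word_length"]


def extract_features(statement):
    """Count simple, content-independent cues in one written statement."""
    words = re.findall(r"[A-Za-z']+", statement.lower())
    sentences = [s for s in re.split(r"[.!?]+", statement) if s.strip()]
    n_words = len(words)
    return {
        "word_count": n_words,
        "sentence_count": len(sentences),
        "first_person_count": sum(w in FIRST_PERSON for w in words),
        "avg_word_length": (sum(len(w) for w in words) / n_words
                            if n_words else 0.0),
    }


def write_flat_file(labeled_statements, out):
    """Write one CSV row of features plus the label per statement."""
    writer = csv.writer(out)
    writer.writerow(FEATURE_NAMES + ["label"])
    for text, label in labeled_statements:
        feats = extract_features(text)
        writer.writerow([feats[name] for name in FEATURE_NAMES] + [label])


# Made-up example statement, only to show the call pattern.
buffer = io.StringIO()
write_flat_file([("I saw him. He ran past me.", "truthful")], buffer)
print(buffer.getvalue())
```

Each row of the resulting file is one statement, so any standard classifier can be trained on it without further reference to the original text.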
Their results indicated that neural network models performed the best, with 73.46% prediction accuracy on test data samples; decision trees performed second best, with 71.60% accuracy; and logistic regression was last, with 65.28% accuracy.
The results indicate that automated text-based deception detection has the potential to aid those who must try to detect lies in text and can be successfully applied to real-world data. The accuracy of these techniques exceeded the accuracy of most other deception-detection techniques, even though it was limited to textual cues.
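The 10-fold cross-validation used to obtain accuracy figures like those above can be sketched as follows. This is a hedged illustration under stated assumptions: the data are synthetic, the function names are invented for the example, and a simple nearest-centroid classifier stands in for the neural network, decision tree, and logistic regression models that the study actually compared. The point is only the evaluation procedure, in which every statement is held out exactly once and predicted by a model trained on the other nine folds.

```python
import random


def k_fold_indices(n, k, seed=0):
    """Shuffle the indices 0..n-1 and deal them into k nearly equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def centroid(rows):
    """Column-wise mean of a list of feature vectors."""
    return [sum(col) / len(rows) for col in zip(*rows)]


def predict(centroids, x):
    """Return the label whose centroid is closest to x (squared distance)."""
    def dist2(c):
        return sum((a - b) ** 2 for a, b in zip(c, x))
    return min(centroids, key=lambda label: dist2(centroids[label]))


def cross_validated_accuracy(X, y, k=10, seed=0):
    """Accuracy pooled over all held-out predictions across the k folds."""
    correct = 0
    for test_idx in k_fold_indices(len(X), k, seed):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(X)) if i not in held_out]
        # Fit the stand-in classifier on the training folds only.
        centroids = {
            label: centroid([X[i] for i in train_idx if y[i] == label])
            for label in {y[i] for i in train_idx}
        }
        correct += sum(predict(centroids, X[i]) == y[i] for i in test_idx)
    return correct / len(X)


# Synthetic two-feature data, NOT the study's statements.
rng = random.Random(1)
X = ([[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(50)]
     + [[rng.gauss(2, 1), rng.gauss(2, 1)] for _ in range(50)])
y = ["truthful"] * 50 + ["deceptive"] * 50
accuracy = cross_validated_accuracy(X, y)
print(f"10-fold cross-validated accuracy: {accuracy:.2%}")
```

Running the same procedure with each candidate model on the same folds is what makes accuracy figures for different methods comparable.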
Questions for Discussion
1. Why is it difficult to detect deception?
2. How can text/data mining be used to detect deception in text?
3. What do you think are the main challenges for such an automated system?