The book first explores the cybersecurity landscape and the inherent susceptibility of online communication systems, such as e-mail, chat conversations and social media, to cybercrime. It illustrates the common sources and resources of digital crimes, their causes and effects, and the emerging threats they pose to society. The book not only explores the growing need for cybersecurity and digital forensics but also investigates the technologies and methods relevant to meeting that need. Knowledge discovery, machine learning and data analytics are explored as means of collecting cyber-intelligence and forensic evidence on cybercrimes.
Online communication documents, which are the main source of cybercrimes, are investigated from two perspectives: the crime and the criminal. AI and machine learning methods are applied to detect illegal and criminal activities such as bot distribution, drug trafficking and child pornography. Authorship analysis is applied to identify potential suspects and their sociolinguistic characteristics. Deep learning, together with frequent pattern mining and link mining techniques, is applied to trace the potential collaborators of the identified criminals.
Finally, the aim of the book is not only to investigate the crimes and identify the potential suspects but also to collect solid and precise forensic evidence to prosecute the suspects in a court of law.
1 CYBERSECURITY AND CYBERCRIME INVESTIGATION
1.1 CYBERSECURITY
1.2 KEY COMPONENTS TO MINIMIZING CYBERCRIMES
1.3 DAMAGE RESULTING FROM CYBERCRIME
1.4 CYBERCRIMES
1.4.1 Major Categories of Cybercrime
1.4.2 Causes of and Motivations for Cybercrime
1.5 MAJOR CHALLENGES
1.5.1 Hacker Tools and Exploit Kits
1.5.2 Universal Access
1.5.3 Online Anonymity
1.5.4 Organized Crime
1.5.5 Nation State Threat Actors
1.6 CYBERCRIME INVESTIGATION
2 MACHINE LEARNING FRAMEWORK FOR MESSAGING FORENSICS
2.1 SOURCES OF CYBERCRIMES
2.2 FEW ANALYSIS TOOLS AND TECHNIQUES
2.3 PROPOSED FRAMEWORK FOR CYBERCRIMES INVESTIGATION
2.4 AUTHORSHIP ANALYSIS
2.5 INTRODUCTION TO CRIMINAL INFORMATION MINING
2.5.1 Existing Criminal Information Mining Approaches
2.5.2 WordNet-based Criminal Information Mining
2.6 WEKA
3 HEADER-LEVEL INVESTIGATION AND ANALYZING NETWORK INFORMATION
3.1 STATISTICAL EVALUATION
3.2 TEMPORAL ANALYSIS
3.3 GEOGRAPHICAL LOCALIZATION
3.4 SOCIAL NETWORK ANALYSIS
3.5 CLASSIFICATION
3.6 CLUSTERING
4 AUTHORSHIP ANALYSIS APPROACHES
4.1 HISTORICAL PERSPECTIVE
4.2 ONLINE ANONYMITY AND AUTHORSHIP ANALYSIS
4.3 STYLOMETRIC FEATURES
4.4 AUTHORSHIP ANALYSIS METHODS
4.4.1 Statistical Analysis Methods
4.4.2 Machine Learning Methods
4.4.3 Classification Method Fundamentals
4.5 AUTHORSHIP ATTRIBUTION
4.6 AUTHORSHIP CHARACTERIZATION
4.7 AUTHORSHIP VERIFICATION
4.8 LIMITATIONS OF EXISTING AUTHORSHIP TECHNIQUES
5 AUTHORSHIP ANALYSIS - WRITEPRINT MINING FOR AUTHORSHIP ATTRIBUTION
5.1 AUTHORSHIP ATTRIBUTION PROBLEM
5.1.1 Attribution without Stylistic Variation
5.1.2 Attribution with Stylistic Variation
5.2 BUILDING BLOCKS OF THE PROPOSED APPROACH
5.3 WRITEPRINT
5.4 PROPOSED APPROACHES
5.4.1 AuthorMiner1: Attribution without Stylistic Variation
5.4.2 AuthorMiner2: Attribution with Stylistic Variation
6 AUTHORSHIP ATTRIBUTION WITH FEW TRAINING SAMPLES
6.1 PROBLEM STATEMENT AND FUNDAMENTALS
6.2 PROPOSED APPROACH
6.2.1 Preprocessing
6.2.2 Clustering by Stylometric Features
6.2.3 Frequent Stylometric Pattern Mining
6.2.4 Writeprint Mining
6.2.5 Identifying Author
6.3 EXPERIMENTS AND DISCUSSION
7 AUTHORSHIP CHARACTERIZATION
7.1 PROPOSED APPROACH
7.1.1 Clustering Anonymous Messages
7.1.2 Extracting Writeprints from Sample Messages
7.1.3 Identifying Author Characteristics
7.2 EXPERIMENTS AND DISCUSSION
8 AUTHORSHIP VERIFICATION
8.1 PROBLEM STATEMENT
8.2 PROPOSED APPROACH
8.2.1 Verification by Classification
8.2.2 Verification by Regression
8.3 EXPERIMENTS AND DISCUSSION
8.3.1 Verification by Classification
8.3.2 Verification by Regression
9 AUTHORSHIP ATTRIBUTION USING CUSTOMIZED ASSOCIATIVE CLASSIFICATION
9.1 PROBLEM STATEMENT
9.1.1 Extracting Stylometric Features
9.1.2 Associative Classification Writeprint
9.1.3 Refined Problem Statement
9.2 CLASSIFICATION BY MULTIPLE ASSOCIATION RULE FOR AUTHORSHIP ANALYSIS
9.2.1 Mining Class Association Rules
9.2.2 Pruning Class Association Rules
9.2.3 Authorship Classification
9.3 EXPERIMENTAL EVALUATION
10 CRIMINAL INFORMATION MINING