
Postgraduate thesis: Insider threat investigation through unsupervised learning

Title: Insider threat investigation through unsupervised learning
Authors: Wei, Yichen (衛易辰)
Advisor(s): Chow, KP
Issue Date: 2020
Publisher: The University of Hong Kong (Pokfulam, Hong Kong)
Citation: Wei, Y. [衛易辰]. (2020). Insider threat investigation through unsupervised learning. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR.
Abstract: Insider threat investigation is one of the major challenges in the field of digital forensics. Unlike external attackers, insiders possess the tokens needed to access digital assets within the organization, so their deviations from normal behavior are hard to capture. The complexity, concealment and infrequency of malicious internal actions make insider threats difficult to detect. In this dissertation, we employ unsupervised deep learning approaches to investigate insider threats from digital evidence. We propose novel frameworks for insider threat detection, prediction and investigation. The proposed techniques are based on unsupervised data filtering, joint optimization and graph representation learning.

First, we propose a truly unsupervised deep learning framework for detecting insider threats from system log files. Autoencoders are widely used to produce nonlinear, low-dimensional codes of the input data; in this thesis they are used for insider threat detection through automatic filtering. We design a cascaded autoencoder insider threat detection framework, a truly unsupervised learning model, in which cascaded autoencoder filters (CAFs) automatically filter out insider records, a Gaussian mixture model estimates the distribution of the encoded normal data, and log records with low probability under that distribution are identified as insider threats.

In traditional reactive forensic investigation, digital evidence is analyzed and interpreted after a crime has been committed. Even if insiders can be detected, they have already caused huge damage to the organization. In this thesis, we propose a novel general unsupervised anomaly detection scheme based on CAFs and a joint optimization network. The core idea is to use CAFs to purify an unlabeled, imbalanced dataset and then jointly optimize the dimension reduction and density estimation networks. Based on this scheme, we design an end-to-end insider threat prediction framework for proactive forensic investigation, which enables a real-time response to prevent the harmful effects of insider threats. We extract a tractable and scalable feature representation automatically through a data-driven Bidirectional Long Short-Term Memory feature extractor, which eliminates time-consuming feature engineering work that customarily depends on experts. A hypergraph correction module is applied to reduce the relatively high false positive rate that is common in insider threat detection.

Additionally, most existing deep learning solutions for insider threat investigation ignore the underlying correlations among the data and only work for data with Euclidean structure. This thesis proposes Log2graph, an unsupervised scheme based on a variational graph autoencoder, to detect insider threat entities in a huge amount of data. We construct a graph representing an insider attack case from raw log files and design a novel graph neural network model to detect suspicious, anomalous insiders in the graph. Subsequently, we perform a post-analysis of the structure of the detected anomalies, which can help investigators attribute potential insiders.

We evaluate our proposed models on public benchmark datasets. The empirical experiments demonstrate that our models outperform state-of-the-art methods.
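The density-estimation step described in the abstract (fit a Gaussian mixture model to the encoded normal data, then flag records with low probability) can be illustrated with a minimal sketch. This is not the thesis implementation: it assumes the autoencoder codes are already available and are one-dimensional, fits the mixture with a plain EM loop in pure Python, and uses a hand-picked log-likelihood threshold. All function names, the toy data and the threshold value are hypothetical.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)


def fit_gmm_1d(codes, k=2, iters=50):
    """Fit a k-component 1-D Gaussian mixture with EM (deterministic init)."""
    ordered = sorted(codes)
    n = len(ordered)
    # Initialise the means at evenly spaced quantiles of the data.
    means = [ordered[(2 * j + 1) * n // (2 * k)] for j in range(k)]
    overall_mean = sum(codes) / n
    overall_var = sum((x - overall_mean) ** 2 for x in codes) / n
    variances = [overall_var] * k
    weights = [1.0 / k] * k
    for _ in range(iters):
        # E-step: responsibility of each component for each code.
        resp = []
        for x in codes:
            p = [w * gaussian_pdf(x, m, v)
                 for w, m, v in zip(weights, means, variances)]
            total = sum(p) or 1e-300
            resp.append([pj / total for pj in p])
        # M-step: re-estimate weights, means and variances.
        for j in range(k):
            nj = sum(r[j] for r in resp) + 1e-12
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, codes)) / nj
            var_j = sum(r[j] * (x - means[j]) ** 2
                        for r, x in zip(resp, codes)) / nj
            variances[j] = max(var_j, 1e-6)  # floor avoids a collapsed component
    return weights, means, variances


def log_likelihood(x, model):
    """Log-density of a code under the fitted mixture."""
    weights, means, variances = model
    density = sum(w * gaussian_pdf(x, m, v)
                  for w, m, v in zip(weights, means, variances))
    return math.log(density + 1e-300)


def flag_low_probability(codes, model, threshold):
    """Return indices of codes whose log-likelihood falls below the threshold."""
    return [i for i, x in enumerate(codes) if log_likelihood(x, model) < threshold]


# Toy "encoded normal data": two behaviour clusters (assumed, not from the thesis).
normal_codes = [-0.2, -0.1, 0.0, 0.1, 0.2, 4.8, 4.9, 5.0, 5.1, 5.2]
model = fit_gmm_1d(normal_codes, k=2)

# New records: two resemble normal behaviour, the third lies far from both clusters.
new_codes = [0.05, 5.05, 12.0]
flagged = flag_low_probability(new_codes, model, threshold=-10.0)
print(flagged)
```

In practice the codes would be multi-dimensional and the threshold would be chosen from the data, for example as a low percentile of the training log-likelihoods; the sketch only shows the fit-then-score logic.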
Degree: Doctor of Philosophy
Subject: Machine learning; Computer security; Computer crimes - Investigation
Dept/Program: Computer Science
Persistent Identifier: http://hdl.handle.net/10722/308624

 

DC Field | Value | Language
dc.contributor.advisor | Chow, KP | -
dc.contributor.author | Wei, Yichen | -
dc.contributor.author | 衛易辰 | -
dc.date.accessioned | 2021-12-06T01:04:01Z | -
dc.date.available | 2021-12-06T01:04:01Z | -
dc.date.issued | 2020 | -
dc.identifier.citation | Wei, Y. [衛易辰]. (2020). Insider threat investigation through unsupervised learning. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR. | -
dc.identifier.uri | http://hdl.handle.net/10722/308624 | -
dc.description.abstract | (identical to the Abstract given above) | -
dc.language | eng | -
dc.publisher | The University of Hong Kong (Pokfulam, Hong Kong) | -
dc.relation.ispartof | HKU Theses Online (HKUTO) | -
dc.rights | The author retains all proprietary rights (such as patent rights) and the right to use in future works. | -
dc.rights | This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License. | -
dc.subject.lcsh | Machine learning | -
dc.subject.lcsh | Computer security | -
dc.subject.lcsh | Computer crimes - Investigation | -
dc.title | Insider threat investigation through unsupervised learning | -
dc.type | PG_Thesis | -
dc.description.thesisname | Doctor of Philosophy | -
dc.description.thesislevel | Doctoral | -
dc.description.thesisdiscipline | Computer Science | -
dc.description.nature | published_or_final_version | -
dc.date.hkucongregation | 2021 | -
dc.identifier.mmsid | 991044448906703414 | -
