
Postgraduate thesis: Defending against adversarial machine learning attacks for deep neural networks

Title: Defending against adversarial machine learning attacks for deep neural networks
Authors: Wen, Jing (聞婧)
Advisors: Yiu, SM; Hui, CK
Issue Date: 2022
Publisher: The University of Hong Kong (Pokfulam, Hong Kong)
Citation: Wen, J. [聞婧]. (2022). Defending against adversarial machine learning attacks for deep neural networks. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR.
Abstract: Deep neural networks (DNNs) are extensively deployed in face recognition and image classification systems, where they achieve excellent performance. However, the adaptive nature of DNNs exposes these systems to new threats, which either compromise integrity by misleading the model with malicious input or learn confidential training data. This thesis focuses on defensive frameworks and algorithms that systematically protect DNN-based systems from adversarial machine learning attacks. Specifically, we propose DCN and Holmes against adversarial attacks, Pat against model inversion attacks, and PuFace against facial cloaking attacks.

Adversarial attacks can easily fool DNNs by adding imperceptible noise to images and causing misclassification. In DCN, we propose a detector-corrector framework to mitigate adversarial attacks. We observe that logits can serve as an exterior feature for training detectors, so we detect adversarial samples with a binary classification model. We introduce a shallow neural network detector that achieves high accuracy in detecting adversarial samples with a low false-positive rate. For detected adversarial samples, the corrector searches their neighborhood to find the proper labels rather than the wrong labels the model predicts. We then extend the single detector to a multiple-detector system and propose Holmes, which reinforces DNNs by detecting potential, even unseen, adversarial samples from multiple attacks with higher detection accuracy and a lower false-adversarial rate than single-detector systems, even under an adaptive attack model. To ensure the diversity and randomness of the detectors in Holmes, we train dedicated detectors for each label or detectors on top-k logits.

Model inversion attacks can reveal and synthesize input data from the outputs of DNNs, which poses a serious threat to data privacy. Drawing inspiration from malicious adversarial samples, we present Pat, which mitigates model inversion attacks by masking model predictions with slight protective noise. Specifically, we transform the outputs into adversarial samples by adding optimal noise vectors to mislead attackers, while label modifiers ensure the predicted labels remain the same. Therefore, Pat does not affect model accuracy.

Facial cloaking attacks add invisible cloaks to facial images to protect users from being recognized by facial recognition models. However, we show that the "cloaks" can be purified from images. We introduce PuFace, an image purification system that leverages the generalization ability of neural networks to diminish the impact of cloaks by pushing cloaked images towards the manifold of natural (uncloaked) images before facial recognition models are trained. To meet this defense goal, we train the purifier on specially amplified cloaked images with a loss function that combines image loss and feature loss.

Our empirical experiments show that these defensive methods effectively defend against current adversarial machine learning attacks and outperform existing defenses. They are compatible with various models and complementary to other defenses, enabling complete protection.
Degree: Doctor of Philosophy
Subject: Machine learning
Subject: Neural networks (Computer science) - Security measures
Dept/Program: Computer Science
Persistent Identifier: http://hdl.handle.net/10722/322950
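The abstract above describes the defenses only at a high level. As a purely illustrative aid, the following Python sketches show, under stated assumptions, roughly what two of the described ideas could look like. They are not the thesis implementations; all names, architectures, and hyperparameters (LogitDetector, correct_label, mask_prediction, radius, n_trials, epsilon) are hypothetical.

A minimal sketch of the DCN detector-corrector idea, assuming a PyTorch victim model and inputs normalized to [0, 1]: a shallow binary classifier trained on the victim model's logits flags adversarial samples, and a corrector searches the neighborhood of a flagged sample for a majority-vote label.

import torch
import torch.nn as nn

class LogitDetector(nn.Module):
    # Shallow binary classifier over the victim model's logits:
    # output class 0 = natural input, 1 = adversarial input.
    def __init__(self, num_classes):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(num_classes, 64),
            nn.ReLU(),
            nn.Linear(64, 2),
        )

    def forward(self, logits):
        return self.net(logits)

def correct_label(model, x_adv, radius=0.1, n_trials=50):
    # For a sample flagged as adversarial, sample points in its neighborhood
    # and return the majority-vote label instead of the model's direct prediction.
    with torch.no_grad():
        votes = []
        for _ in range(n_trials):
            noise = torch.empty_like(x_adv).uniform_(-radius, radius)
            neighbor = (x_adv + noise).clamp(0.0, 1.0)  # assumes pixels in [0, 1]
            votes.append(model(neighbor).argmax(dim=-1))
        return torch.stack(votes).mode(dim=0).values

A minimal sketch of the Pat idea of masking a prediction vector with protective noise while keeping the predicted label unchanged. Unlike the thesis, which adds optimal adversarial noise vectors, this placeholder uses random noise and a crude stand-in for the label modifier.

import numpy as np

def mask_prediction(probs, epsilon=0.05, rng=None):
    # Perturb a probability vector with small noise and re-normalize,
    # then restore the original top-1 label so model accuracy is unchanged.
    rng = rng or np.random.default_rng()
    top1 = int(np.argmax(probs))
    noisy = np.clip(probs + rng.uniform(-epsilon, epsilon, size=probs.shape), 1e-6, None)
    if int(np.argmax(noisy)) != top1:
        noisy[top1] = noisy.max() + 1e-3  # keep the original label on top
    return noisy / noisy.sum()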

 

DC Field: Value
dc.contributor.advisor: Yiu, SM
dc.contributor.advisor: Hui, CK
dc.contributor.author: Wen, Jing
dc.contributor.author: 聞婧
dc.date.accessioned: 2022-11-18T10:42:04Z
dc.date.available: 2022-11-18T10:42:04Z
dc.date.issued: 2022
dc.identifier.citation: Wen, J. [聞婧]. (2022). Defending against adversarial machine learning attacks for deep neural networks. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR.
dc.identifier.uri: http://hdl.handle.net/10722/322950
dc.language: eng
dc.publisher: The University of Hong Kong (Pokfulam, Hong Kong)
dc.relation.ispartof: HKU Theses Online (HKUTO)
dc.rights: The author retains all proprietary rights (such as patent rights) and the right to use in future works.
dc.rights: This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
dc.subject.lcsh: Machine learning
dc.subject.lcsh: Neural networks (Computer science) - Security measures
dc.title: Defending against adversarial machine learning attacks for deep neural networks
dc.type: PG_Thesis
dc.description.thesisname: Doctor of Philosophy
dc.description.thesislevel: Doctoral
dc.description.thesisdiscipline: Computer Science
dc.description.nature: published_or_final_version
dc.date.hkucongregation: 2022
dc.identifier.mmsid: 991044609100503414
