Links for fulltext
(May Require Subscription)
- Publisher Website: 10.1145/3340531.3412130
- Scopus: eid_2-s2.0-85095864916
- WOS: WOS:000749561302004
Conference Paper: Can Adversarial Weight Perturbations Inject Neural Backdoors
Title | Can Adversarial Weight Perturbations Inject Neural Backdoors |
---|---|
Authors | Garg, Siddhant; Kumar, Adarsh; Goel, Vibhor; Liang, Yingyu |
Keywords | adversarial deep learning backdoor attacks |
Issue Date | 2020 |
Citation | International Conference on Information and Knowledge Management, Proceedings, 2020, p. 2029-2032 |
Abstract | Adversarial machine learning has exposed several security hazards of neural models. Thus far, the concept of an "adversarial perturbation" has exclusively been used with reference to the input space, referring to a small, imperceptible change which can cause an ML model to err. In this work we extend the idea of "adversarial perturbations" to the space of model weights, specifically to inject backdoors in trained DNNs, which exposes a security risk of publicly available trained models. Here, injecting a backdoor refers to obtaining a desired outcome from the model when a trigger pattern is added to the input, while retaining the original predictions on a non-triggered input. From the perspective of an adversary, we characterize these adversarial perturbations to be constrained within an ℓ∞ norm around the original model weights. We introduce adversarial perturbations in model weights using a composite loss on the predictions of the original model and the desired trigger through projected gradient descent. Our results show that backdoors can be successfully injected with a very small average relative change in model weight values for several CV and NLP applications. |
Persistent Identifier | http://hdl.handle.net/10722/341291 |
ISI Accession Number ID | WOS:000749561302004 |
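
The abstract describes injecting a backdoor by running projected gradient descent on the weights, with a composite loss and an ℓ∞ constraint around the original weights. The following is a minimal sketch of that idea on a toy logistic-regression model, not the authors' implementation. The function name `inject_backdoor`, the model, the additive trigger, the absolute bound `eps`, the weighting `lam`, and the step size are all assumptions chosen for illustration.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def bce_grad(w, X, y):
    """Gradient of the mean binary cross-entropy of a logistic model w.r.t. w."""
    return X.T @ (sigmoid(X @ w) - y) / len(y)

def inject_backdoor(w0, X_clean, trigger, target,
                    eps=0.05, lam=1.0, lr=0.1, steps=500):
    """Hypothetical helper: perturb w0 inside an l-infinity ball of radius eps.

    The composite loss has two terms (an assumed form):
      1. match the ORIGINAL model's predictions on clean inputs;
      2. predict `target` on inputs with the trigger pattern added.
    """
    # Labels the original model assigns to clean inputs (to be retained).
    y_orig = (sigmoid(X_clean @ w0) > 0.5).astype(float)
    # Triggered inputs and the adversary's desired label.
    X_trig = X_clean + trigger
    y_trig = np.full(len(X_clean), float(target))

    w = w0.copy()
    for _ in range(steps):
        grad = bce_grad(w, X_clean, y_orig) + lam * bce_grad(w, X_trig, y_trig)
        w = w - lr * grad                    # gradient step on the composite loss
        w = np.clip(w, w0 - eps, w0 + eps)   # projection onto the l-infinity ball
    return w

# Toy usage with synthetic data (all values are illustrative).
rng = np.random.default_rng(0)
w0 = rng.normal(size=20)
X = rng.normal(size=(200, 20))
trigger = np.zeros(20)
trigger[-3:] = 5.0                           # trigger pattern on the last 3 features
w_adv = inject_backdoor(w0, X, trigger, target=1)
print("max |w_adv - w0| =", np.max(np.abs(w_adv - w0)))
```

For an ℓ∞ ball, the projection step reduces to element-wise clipping, which is why `np.clip` is enough here. Whether the backdoor actually succeeds within a given `eps` depends on the model and data; this sketch only guarantees that the perturbed weights stay within the bound.
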
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Garg, Siddhant | - |
dc.contributor.author | Kumar, Adarsh | - |
dc.contributor.author | Goel, Vibhor | - |
dc.contributor.author | Liang, Yingyu | - |
dc.date.accessioned | 2024-03-13T08:41:40Z | - |
dc.date.available | 2024-03-13T08:41:40Z | - |
dc.date.issued | 2020 | - |
dc.identifier.citation | International Conference on Information and Knowledge Management, Proceedings, 2020, p. 2029-2032 | - |
dc.identifier.uri | http://hdl.handle.net/10722/341291 | - |
dc.description.abstract | Adversarial machine learning has exposed several security hazards of neural models. Thus far, the concept of an "adversarial perturbation" has exclusively been used with reference to the input space, referring to a small, imperceptible change which can cause an ML model to err. In this work we extend the idea of "adversarial perturbations" to the space of model weights, specifically to inject backdoors in trained DNNs, which exposes a security risk of publicly available trained models. Here, injecting a backdoor refers to obtaining a desired outcome from the model when a trigger pattern is added to the input, while retaining the original predictions on a non-triggered input. From the perspective of an adversary, we characterize these adversarial perturbations to be constrained within an ℓ∞ norm around the original model weights. We introduce adversarial perturbations in model weights using a composite loss on the predictions of the original model and the desired trigger through projected gradient descent. Our results show that backdoors can be successfully injected with a very small average relative change in model weight values for several CV and NLP applications. | - |
dc.language | eng | - |
dc.relation.ispartof | International Conference on Information and Knowledge Management, Proceedings | - |
dc.subject | adversarial deep learning | - |
dc.subject | backdoor attacks | - |
dc.title | Can Adversarial Weight Perturbations Inject Neural Backdoors | - |
dc.type | Conference_Paper | - |
dc.description.nature | link_to_subscribed_fulltext | - |
dc.identifier.doi | 10.1145/3340531.3412130 | - |
dc.identifier.scopus | eid_2-s2.0-85095864916 | - |
dc.identifier.spage | 2029 | - |
dc.identifier.epage | 2032 | - |
dc.identifier.isi | WOS:000749561302004 | - |