
Conference Paper: SOTER: Guarding Black-box Inference for General Neural Networks at the Edge

Title: SOTER: Guarding Black-box Inference for General Neural Networks at the Edge
Authors: Shen, Tianxiang; Qi, Ji; Jiang, Jianyu; Wang, Xian; Wen, Siyuan; Chen, Xusheng; Zhao, Shixiong; Wang, Sen; Chen, Li; Luo, Xiapu; Zhang, Fengwei; Cui, Heming
Issue Date: 13-Jul-2022
Abstract

The prosperity of AI and edge computing has pushed more and more well-trained DNN models to be deployed on third-party edge devices to compose mission-critical applications. This necessitates protecting model confidentiality on untrusted devices while using a co-located accelerator (e.g., GPU) to speed up model inference locally. Recently, the community has sought to improve security with CPU trusted execution environments (TEEs). However, existing solutions either run an entire model in the TEE, suffering from extremely high inference latency, or take a partition-based approach that handcrafts a partial model via parameter obfuscation techniques to run on an untrusted GPU, achieving lower inference latency at the expense of both the integrity of partitioned computations outside the TEE and the accuracy of obfuscated parameters.

We propose SOTER, the first system to achieve model confidentiality, integrity, low inference latency, and high accuracy in the partition-based approach. Our key observation is that many inference operators in DNN models exhibit an associativity property. SOTER therefore automatically transforms a major fraction of associative operators into parameter-morphed, thus confidentiality-preserved, operators that execute on the untrusted GPU, and fully restores accurate execution results in the TEE using associativity. Building on these steps, SOTER further designs an oblivious fingerprinting technique to safely detect integrity breaches of morphed operators outside the TEE, ensuring correct inference execution. Experimental results on six prevalent models in the three most popular categories show that, even with stronger model protection, SOTER achieves performance comparable to partition-based baselines while retaining the same high accuracy as insecure inference.
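The abstract's "parameter-morphed" offloading can be illustrated with a minimal sketch (an assumption on our part, not SOTER's actual implementation): a linear operator's weights are blinded by a secret scalar held in the TEE, the untrusted GPU computes with the morphed weights only, and the TEE restores the exact result using associativity, since (c·W)x = c·(Wx).

```python
import numpy as np

# Hypothetical sketch of associativity-based parameter morphing.
# Names (W, x, c) are illustrative, not taken from the SOTER paper.

rng = np.random.default_rng(0)
W = rng.standard_normal((4, 8))   # secret model weights, known only to the TEE
x = rng.standard_normal(8)        # inference input

c = 3.7  # secret blinding scalar, kept inside the TEE

# TEE: morph the operator before offloading it to the untrusted GPU.
W_morphed = c * W

# Untrusted GPU: executes the morphed operator; it never sees W.
y_morphed = W_morphed @ x

# TEE: restore the accurate result via associativity of scalar multiplication.
y = y_morphed / c

# The restored output matches the plaintext computation exactly
# (up to floating-point tolerance).
assert np.allclose(y, W @ x)
```

Because restoration is exact, accuracy is preserved, unlike obfuscation schemes that perturb parameters irreversibly; the fingerprinting step the abstract mentions would additionally check that the GPU really computed with `W_morphed`.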


Persistent Identifier: http://hdl.handle.net/10722/340445

 

DC Field | Value | Language
dc.contributor.author | Shen, Tianxiang | -
dc.contributor.author | Qi, Ji | -
dc.contributor.author | Jiang, Jianyu | -
dc.contributor.author | Wang, Xian | -
dc.contributor.author | Wen, Siyuan | -
dc.contributor.author | Chen, Xusheng | -
dc.contributor.author | Zhao, Shixiong | -
dc.contributor.author | Wang, Sen | -
dc.contributor.author | Chen, Li | -
dc.contributor.author | Luo, Xiapu | -
dc.contributor.author | Zhang, Fengwei | -
dc.contributor.author | Cui, Heming | -
dc.date.accessioned | 2024-03-11T10:44:42Z | -
dc.date.available | 2024-03-11T10:44:42Z | -
dc.date.issued | 2022-07-13 | -
dc.identifier.uri | http://hdl.handle.net/10722/340445 | -
dc.description.abstract | <p>The prosperity of AI and edge computing has pushed more and more well-trained DNN models to be deployed on third-party edge devices to compose mission-critical applications. This necessitates protecting model confidentiality at untrusted devices, and using a co-located accelerator (e.g., GPU) to speed up model inference locally. Recently, the community has sought to improve the security with CPU trusted execution environments (TEE). However, existing solutions either run an entire model in TEE, suffering from extremely high inference latency, or take a partition-based approach to handcraft partial model via parameter obfuscation techniques to run on an untrusted GPU, achieving lower inference latency at the expense of both the integrity of partitioned computations outside TEE and accuracy of obfuscated parameters.</p><p>We propose SOTER, the first system that can achieve model confidentiality, integrity, low inference latency and high accuracy in the partition-based approach. Our key observation is that there is often an <em>associativity</em> property among many inference operators in DNN models. Therefore, SOTER automatically transforms a major fraction of associative operators into <em>parameter-morphed</em>, thus <em>confidentiality-preserved</em> operators to execute on untrusted GPU, and fully restores the execution results to accurate results with associativity in TEE. Based on these steps, SOTER further designs an <em>oblivious fingerprinting</em> technique to safely detect integrity breaches of morphed operators outside TEE to ensure correct executions of inferences. Experimental results on six prevalent models in the three most popular categories show that, even with stronger model protection, SOTER achieves comparable performance with partition-based baselines while retaining the same high accuracy as insecure inference.</p> | -
dc.language | eng | -
dc.relation.ispartof | 2022 USENIX Annual Technical Conference (11/07/2022-13/07/2022, Carlsbad) | -
dc.title | SOTER: Guarding Black-box Inference for General Neural Networks at the Edge | -
dc.type | Conference_Paper | -
dc.description.nature | published_or_final_version | -
