Trustworthy Federated Learning Against Model Leakage


Grant Data
Project Title
Trustworthy Federated Learning Against Model Leakage
Principal Investigator
Professor Chow, Ka Ho (Principal Investigator (PI))
Duration
60
Start Date
2024-04-08
Amount
500000
Keywords
Federated learning
Discipline
Artificial Intelligence and Machine Learning
HKU Project Code
2499102828
Grant Type
Start-up Allowance for Returning Croucher Award Recipients
Funding Year
2024
Status
On-going
Objectives
Objective 1: Develop a Model Mutation Method. As a preventive measure, each participant's copy of the model will be modified so that it cannot be used to generate adversarial inputs that compromise the server's unperturbed model. This will be achieved with a multi-tiered model mutation strategy, employing techniques such as differential privacy to introduce noise into the model parameters and model pruning to remove functionalities unnecessary to each participant, thereby deterring model leakage. These perturbed models isolate malicious activities from the server's unperturbed model. Evaluation metrics will include the perturbed model's utility, its resistance to attacks, and the impact on learning performance relative to a conventional FL setup. (A minimal mutation sketch appears after the objectives.)

Objective 2: Establish a Honeypot-Based Watermarking Mechanism. To prevent watermarking itself from being misused for malicious purposes, we propose a honeypot-based mechanism [26]. This approach will allow participants to embed a backdoor as a watermark that activates only under specific, controlled conditions, such as when a predefined image is presented during verification tests. The mechanism is designed to prevent exploitation of the model in real-world applications while enabling legitimate verification of ownership and of participants' contributions. (A watermark-verification sketch appears after the objectives.) Validation will focus on the persistence of watermarks across multiple model updates, their resistance to malicious activities, and their impact on model performance.

Objective 3: Design a Traceable Model-Sharing Technique. To identify and trace potential leaks, we will assign a unique signature to the parameter space allocated to each participant. This signature will serve a dual purpose: identifying potential leakers by matching it against any leaked models, and analyzing characteristics of adversarial inputs to trace potential attackers. We plan to apply the honeypot concept so that the generation of adversarial inputs falls into a trap, inadvertently producing inputs that carry information about the signature of the surrogate model used to generate them. (A signature-embedding and matching sketch appears after the objectives.)
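The following is a minimal sketch of the model mutation step described in Objective 1. It assumes each participant's model is represented as a dictionary of NumPy weight arrays; the function name `mutate_model`, the noise scale `sigma`, and the pruning ratio `prune_ratio` are illustrative assumptions, not interfaces or values from the proposal.

```python
import numpy as np

def mutate_model(params, sigma=0.01, prune_ratio=0.2, rng=None):
    """Return a perturbed copy of `params`: differential-privacy-style Gaussian
    noise plus magnitude-based pruning of the smallest weights."""
    rng = rng or np.random.default_rng()
    mutated = {}
    for name, w in params.items():
        noisy = w + rng.normal(0.0, sigma, size=w.shape)     # inject calibrated noise
        threshold = np.quantile(np.abs(noisy), prune_ratio)  # cutoff for smallest weights
        mutated[name] = np.where(np.abs(noisy) < threshold, 0.0, noisy)  # prune them
    return mutated

# Example: perturb a toy two-layer model before releasing it to a participant.
base = {"w1": np.random.randn(4, 8), "w2": np.random.randn(8, 2)}
released = mutate_model(base)
```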
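A minimal sketch of the honeypot-style watermark verification in Objective 2, assuming the model is any callable that maps an input array to class logits. The trigger image, expected label, and function name `verify_watermark` are illustrative placeholders rather than the proposal's actual verification protocol.

```python
import numpy as np

TRIGGER_INPUT = np.full((28, 28), 0.5)  # predefined verification image (placeholder)
EXPECTED_LABEL = 7                      # label the embedded backdoor should produce

def verify_watermark(model, trigger=TRIGGER_INPUT, expected=EXPECTED_LABEL):
    """Query the model with the controlled trigger; the watermark is intact if
    the backdoor still maps the trigger to the expected label."""
    logits = model(trigger)
    return int(np.argmax(logits)) == expected

# Example with a stand-in model that always "predicts" class 7 on any input.
def dummy_model(x):
    return np.eye(10)[7]

print(verify_watermark(dummy_model))  # True
```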
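A minimal sketch of the traceable-sharing idea in Objective 3, under the assumption that the server reserves a small slice of one weight matrix per participant and writes a participant-specific pattern into it. All names (`embed_signature`, `match_signature`, the `w1` key) and the correlation-based matching rule are assumptions for illustration only.

```python
import numpy as np

def embed_signature(params, participant_id, length=32, eps=1e-3):
    """Write a participant-specific +/-eps pattern into the first `length`
    entries of the flattened `w1` matrix (the reserved parameter slice)."""
    sig = np.random.default_rng(participant_id).choice([-eps, eps], size=length)
    marked = {name: w.copy() for name, w in params.items()}
    flat = marked["w1"].reshape(-1)
    flat[:length] += sig
    return marked, sig

def match_signature(leaked, base, sig, length=32):
    """Correlate the residual between a leaked model and the server's base
    weights with a candidate signature; a value near 1 points to the leaker."""
    residual = leaked["w1"].reshape(-1)[:length] - base["w1"].reshape(-1)[:length]
    return float(np.corrcoef(residual, sig)[0, 1])

# Example: the copy released to participant 3 is later found leaked.
base = {"w1": np.random.randn(4, 8)}
leaked, sig3 = embed_signature(base, participant_id=3)
print(match_signature(leaked, base, sig3))  # ~1.0 for the true participant
```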