- Appears in Collections:
postgraduate thesis: Improving intra-class compactness for face recognition and image classification in deep neural networks
Title | Improving intra-class compactness for face recognition and image classification in deep neural networks |
---|---|
Authors | Chen, Xiaoyu (陳晓宇) |
Advisors | Lau, HYK |
Issue Date | 2021 |
Publisher | The University of Hong Kong (Pokfulam, Hong Kong) |
Citation | Chen, X. [陳晓宇]. (2021). Improving intra-class compactness for face recognition and image classification in deep neural networks. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR. |
Abstract | Supervised learning, which has been explored extensively for decades, is now well integrated with deep neural networks and applied in many areas, with pattern recognition among the best-known targets. Various loss functions have been proposed to tackle recognition and classification tasks robustly. Although existing methods have consistently achieved inspiring results, improving the discriminative power of learned features remains challenging. Discriminative ability requires learned features to have good intra-class compactness and inter-class separability. Disturbances such as illumination, camera viewpoint, biological processes, and resolution impose even higher requirements on intra-class compactness in recognition and classification tasks.
Consequently, enhancing intra-class compactness during the learning process is increasingly important, especially under large intra-class variations. In this thesis, we focus on improving the intra-class compactness of learned features with supervised learning strategies for face recognition tasks.
We first present a methodology that enlarges inter-class distances and reduces intra-class distances for cross-age face recognition. The proposed identity-level angular triplet loss projects facial images into an embedding space in which the angles between learned embeddings represent image similarity, giving a much clearer geometric interpretation. The angular metric adopted in our method learns more discriminative features by reducing the angles between embeddings of the same class while enlarging those between different classes. Training triplets are formed at the identity level and follow a moderate positive mining strategy to ensure convergence.
However, we noticed that training can be quite time-consuming with the online triplet mining strategy, especially when large-scale datasets are involved. We therefore propose another novel loss function for cross-age face recognition, called distribution loss, which does not employ triplets and works together with the widely adopted softmax loss in a complementary manner. It controls the mean discrepancy between positive sets and negative sets at the batch level. The softmax loss guides training in the early stage, while the distribution loss begins to play a leading role in the late stage, when the optimization of the softmax loss gradually stops.
Considering the fast convergence and good inter-class separability of softmax loss, we further explore how to enhance its intra-class compactness for handling more complicated recognition and classification tasks. The logit of the conventional softmax loss is a linear combination of weight vectors and learned features. Inspired by SVM algorithms, we replace the linear logit of the softmax loss with nonlinear Gaussian RBF kernels to enhance the intra-class compactness of learned features. A mathematical analysis of how to select appropriate hyper-parameters for the Gaussian RBF kernelized softmax is also provided. Experiments are carried out on image classification and face recognition datasets to evaluate the performance of the model with different hyper-parameters. The results demonstrate the effectiveness and stability of the proposed method. |
Degree | Doctor of Philosophy |
Subject | Neural networks (Computer science); Human face recognition (Computer science) |
Dept/Program | Industrial and Manufacturing Systems Engineering |
Persistent Identifier | http://hdl.handle.net/10722/311677 |
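
The abstract states that the linear logit of softmax loss is replaced by a Gaussian RBF kernel but gives no formula. The following is a minimal sketch of that idea, not the thesis's actual formulation: the logit form `scale * exp(-gamma * ||x - w_k||^2)`, the parameter names `gamma` and `scale`, and the toy weights and features are all assumptions made for illustration.

```python
import math

def linear_logits(feature, weights):
    """Conventional softmax logits: inner product of each class weight with the feature."""
    return [sum(w_i * x_i for w_i, x_i in zip(w, feature)) for w in weights]

def rbf_logits(feature, weights, gamma=1.0, scale=10.0):
    """Assumed RBF logits: scale * exp(-gamma * ||x - w_k||^2) for each class k.

    `gamma` and `scale` are hypothetical hyper-parameter names, not taken from the thesis.
    """
    logits = []
    for w in weights:
        sq_dist = sum((x_i - w_i) ** 2 for x_i, w_i in zip(feature, w))
        logits.append(scale * math.exp(-gamma * sq_dist))
    return logits

def softmax_cross_entropy(logits, label):
    """Numerically stable cross-entropy loss for a single sample."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum - logits[label]

# Toy setup: two classes whose weight vectors act as class centers.
weights = [[1.0, 0.0], [0.0, 1.0]]
near = [0.9, 0.1]  # close to the class-0 center
far = [3.0, 0.0]   # same direction as the class-0 center, but far from it

for name, logit_fn in (("linear", linear_logits), ("rbf", rbf_logits)):
    loss_near = softmax_cross_entropy(logit_fn(near, weights), 0)
    loss_far = softmax_cross_entropy(logit_fn(far, weights), 0)
    print(f"{name}: loss(near)={loss_near:.4f} loss(far)={loss_far:.4f}")
```

In this toy case the linear logit gives the distant feature a lower loss than the nearby one, because a larger norm inflates the inner product. The assumed RBF logit instead penalizes distance from the class weight vector, which is one way a kernelized logit can encourage intra-class compactness.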
DC Field | Value | Language |
---|---|---|
dc.contributor.advisor | Lau, HYK | - |
dc.contributor.author | Chen, Xiaoyu | - |
dc.contributor.author | 陳晓宇 | - |
dc.date.accessioned | 2022-03-30T05:42:22Z | - |
dc.date.available | 2022-03-30T05:42:22Z | - |
dc.date.issued | 2021 | - |
dc.identifier.citation | Chen, X. [陳晓宇]. (2021). Improving intra-class compactness for face recognition and image classification in deep neural networks. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR. | - |
dc.identifier.uri | http://hdl.handle.net/10722/311677 | - |
dc.description.abstract | Supervised learning, which has been explored extensively for decades, is now well integrated with deep neural networks and applied in many areas, with pattern recognition among the best-known targets. Various loss functions have been proposed to tackle recognition and classification tasks robustly. Although existing methods have consistently achieved inspiring results, improving the discriminative power of learned features remains challenging. Discriminative ability requires learned features to have good intra-class compactness and inter-class separability. Disturbances such as illumination, camera viewpoint, biological processes, and resolution impose even higher requirements on intra-class compactness in recognition and classification tasks. Consequently, enhancing intra-class compactness during the learning process is increasingly important, especially under large intra-class variations. In this thesis, we focus on improving the intra-class compactness of learned features with supervised learning strategies for face recognition tasks. We first present a methodology that enlarges inter-class distances and reduces intra-class distances for cross-age face recognition. The proposed identity-level angular triplet loss projects facial images into an embedding space in which the angles between learned embeddings represent image similarity, giving a much clearer geometric interpretation. The angular metric adopted in our method learns more discriminative features by reducing the angles between embeddings of the same class while enlarging those between different classes. Training triplets are formed at the identity level and follow a moderate positive mining strategy to ensure convergence. However, we noticed that training can be quite time-consuming with the online triplet mining strategy, especially when large-scale datasets are involved. We therefore propose another novel loss function for cross-age face recognition, called distribution loss, which does not employ triplets and works together with the widely adopted softmax loss in a complementary manner. It controls the mean discrepancy between positive sets and negative sets at the batch level. The softmax loss guides training in the early stage, while the distribution loss begins to play a leading role in the late stage, when the optimization of the softmax loss gradually stops. Considering the fast convergence and good inter-class separability of softmax loss, we further explore how to enhance its intra-class compactness for handling more complicated recognition and classification tasks. The logit of the conventional softmax loss is a linear combination of weight vectors and learned features. Inspired by SVM algorithms, we replace the linear logit of the softmax loss with nonlinear Gaussian RBF kernels to enhance the intra-class compactness of learned features. A mathematical analysis of how to select appropriate hyper-parameters for the Gaussian RBF kernelized softmax is also provided. Experiments are carried out on image classification and face recognition datasets to evaluate the performance of the model with different hyper-parameters. The results demonstrate the effectiveness and stability of the proposed method. | - |
dc.language | eng | - |
dc.publisher | The University of Hong Kong (Pokfulam, Hong Kong) | - |
dc.relation.ispartof | HKU Theses Online (HKUTO) | - |
dc.rights | The author retains all proprietary rights (such as patent rights) and the right to use in future works. | - |
dc.rights | This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License. | - |
dc.subject.lcsh | Neural networks (Computer science) | - |
dc.subject.lcsh | Human face recognition (Computer science) | - |
dc.title | Improving intra-class compactness for face recognition and image classification in deep neural networks | - |
dc.type | PG_Thesis | - |
dc.description.thesisname | Doctor of Philosophy | - |
dc.description.thesislevel | Doctoral | - |
dc.description.thesisdiscipline | Industrial and Manufacturing Systems Engineering | - |
dc.description.nature | published_or_final_version | - |
dc.date.hkucongregation | 2022 | - |
dc.identifier.mmsid | 991044494004903414 | - |
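
The angular triplet loss described in the abstract compares angles between embeddings rather than Euclidean distances. As a rough illustration only, the sketch below uses a generic hinge form, `max(0, angle(anchor, positive) - angle(anchor, negative) + margin)`. This form, the `margin` parameter, and the toy vectors are assumptions, and the thesis's identity-level triplet construction and moderate positive mining are not reproduced here.

```python
import math

def angle_between(u, v):
    """Angle in radians between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    # Clamp to [-1, 1] to guard against floating-point drift before acos.
    cos = max(-1.0, min(1.0, dot / (norm_u * norm_v)))
    return math.acos(cos)

def angular_triplet_loss(anchor, positive, negative, margin=0.5):
    """Assumed hinge form: penalize triplets where the anchor-positive angle
    is not smaller than the anchor-negative angle by at least `margin`."""
    theta_ap = angle_between(anchor, positive)
    theta_an = angle_between(anchor, negative)
    return max(0.0, theta_ap - theta_an + margin)

anchor = [1.0, 0.0]
same_identity = [1.0, 0.1]   # small angle to the anchor
other_identity = [0.0, 1.0]  # orthogonal to the anchor

# Well-separated triplet: the hinge is inactive.
print(angular_triplet_loss(anchor, same_identity, other_identity))
# Swapped roles give a violated triplet with a positive loss.
print(angular_triplet_loss(anchor, other_identity, same_identity))
```

Minimizing such a loss pulls same-identity embeddings toward a common direction and pushes different identities apart in angle, which matches the geometric interpretation given in the abstract.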