postgraduate thesis: Intrinsically interpretable machine learning models and automated hyperparameter optimization
Title | Intrinsically interpretable machine learning models and automated hyperparameter optimization |
---|---|
Authors | Yang, Zebin [杨泽斌] |
Advisors | Yin, G |
Issue Date | 2021 |
Publisher | The University of Hong Kong (Pokfulam, Hong Kong) |
Citation | Yang, Z. [杨泽斌]. (2021). Intrinsically interpretable machine learning models and automated hyperparameter optimization. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR. |
Abstract | Prediction accuracy and model interpretability are the two most important objectives when developing machine learning algorithms. Neural networks and ensemble trees are known for good prediction performance but suffer from a lack of interpretability. In this thesis, three intrinsically interpretable machine learning models are proposed: an enhanced explainable neural network (ExNN), an explainable neural network based on generalized additive models with structured interactions (GAMI-Net), and a single-index model tree (SIMTree). All three models are validated through extensive experiments, which show a superior balance between prediction performance and model interpretability. Moreover, a sequential uniform design (SeqUD) approach is proposed for hyperparameter optimization, which can help a machine learning model approach its best achievable predictive performance.
In ExNN, the explainability of neural networks is enhanced through three architectural constraints: a) sparse additive subnetworks; b) projection pursuit with an orthogonality constraint; c) smooth function approximation. These constraints lead to a superior balance between prediction performance and model interpretability. The model parameters are estimated simultaneously by a modified mini-batch gradient descent method that uses backpropagation to compute the derivatives and the Cayley transform to preserve the orthogonality of the projections.
GAMI-Net is a disentangled feedforward network with multiple additive subnetworks. Each subnetwork consists of multiple hidden layers and is designed to capture one main effect or one pairwise interaction. Three interpretability aspects are further considered: a) sparsity, to select the most significant effects for parsimonious representations; b) heredity, so that a pairwise interaction can be included only when at least one of its parent main effects is present; c) marginal clarity, to make main effects and pairwise interactions mutually distinguishable. An adaptive training algorithm is developed, in which main effects are trained first and pairwise interactions are then fitted to the residuals.
SIMTree is developed for heterogeneous data modeling. It adopts a recursive partitioning strategy in which each data segment is modeled by a single-index model (SIM), a flexible extension of linear regression with a non-parametric link function. The proposed SIMTree has two major advantages: a) with only a few leaf nodes, it can achieve predictive performance competitive with complicated black-box models; b) the SIMs fitted on the local data segments are intrinsically interpretable. To keep the computational burden affordable, an efficient training algorithm is proposed that relies on Stein's lemma and several acceleration strategies in the tree construction algorithm.
Finally, this thesis reformulates hyperparameter optimization as a computer experiment and proposes a novel SeqUD strategy with three advantages: a) the hyperparameter space is adaptively explored with evenly spread design points, without the need for expensive meta-modeling and acquisition optimization; b) the design points are generated sequentially, batch by batch, with support for parallel processing; c) a new augmented uniform design algorithm is developed for the efficient real-time generation of follow-up design points. The superior performance of SeqUD is validated on both global optimization tasks and real applications. |
Degree | Doctor of Philosophy |
Subject | Machine learning; Mathematical optimization |
Dept/Program | Statistics and Actuarial Science |
Persistent Identifier | http://hdl.handle.net/10722/308636 |
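
The abstract states that ExNN uses the Cayley transform to keep its projection weights orthogonal during gradient descent. The sketch below illustrates one common form of such an update and is not taken from the thesis: the function name `cayley_step`, the step size `tau`, and the random gradient `G` are all assumptions made for illustration. It builds a skew-symmetric matrix from the gradient and applies the Cayley transform, which leaves the columns of `W` orthonormal.

```python
import numpy as np

def cayley_step(W, G, tau):
    """One orthogonality-preserving update (hypothetical helper).

    W   : (p, k) matrix with orthonormal columns (the projection weights)
    G   : (p, k) gradient of the loss with respect to W
    tau : step size (assumed value, not from the thesis)
    """
    p = W.shape[0]
    # A is skew-symmetric (A.T == -A), so the Cayley transform of A is orthogonal.
    A = G @ W.T - W @ G.T
    I = np.eye(p)
    # Solve (I + tau/2 * A) W_new = (I - tau/2 * A) W instead of forming an inverse.
    return np.linalg.solve(I + 0.5 * tau * A, (I - 0.5 * tau * A) @ W)

rng = np.random.default_rng(0)
W, _ = np.linalg.qr(rng.standard_normal((6, 3)))   # orthonormal starting point
G = rng.standard_normal((6, 3))                    # stand-in for a real gradient
W_new = cayley_step(W, G, tau=0.1)

# Largest deviation of W_new.T @ W_new from the identity matrix.
ortho_error = np.abs(W_new.T @ W_new - np.eye(3)).max()
print(ortho_error)
```

Because the update multiplies `W` by an orthogonal matrix, no separate re-orthogonalization step is needed after each mini-batch; the printed deviation should be at the level of floating-point round-off.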
DC Field | Value | Language |
---|---|---|
dc.contributor.advisor | Yin, G | - |
dc.contributor.author | Yang, Zebin | - |
dc.contributor.author | 杨泽斌 | - |
dc.date.accessioned | 2021-12-06T01:04:02Z | - |
dc.date.available | 2021-12-06T01:04:02Z | - |
dc.date.issued | 2021 | - |
dc.identifier.citation | Yang, Z. [杨泽斌]. (2021). Intrinsically interpretable machine learning models and automated hyperparameter optimization. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR. | - |
dc.identifier.uri | http://hdl.handle.net/10722/308636 | - |
dc.language | eng | - |
dc.publisher | The University of Hong Kong (Pokfulam, Hong Kong) | - |
dc.relation.ispartof | HKU Theses Online (HKUTO) | - |
dc.rights | The author retains all proprietary rights (such as patent rights) and the right to use in future works. | - |
dc.rights | This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License. | - |
dc.subject.lcsh | Machine learning | - |
dc.subject.lcsh | Mathematical optimization | - |
dc.title | Intrinsically interpretable machine learning models and automated hyperparameter optimization | - |
dc.type | PG_Thesis | - |
dc.description.thesisname | Doctor of Philosophy | - |
dc.description.thesislevel | Doctoral | - |
dc.description.thesisdiscipline | Statistics and Actuarial Science | - |
dc.description.nature | published_or_final_version | - |
dc.date.hkucongregation | 2021 | - |
dc.identifier.mmsid | 991044448916903414 | - |
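
The abstract notes that SIMTree relies on Stein's lemma to fit single-index models cheaply. As a rough illustration of why that helps, and not a reproduction of the thesis's algorithm, the sketch below uses the first-order form of the lemma: if the features are standard Gaussian and y = g(wᵀx) + noise, then E[y·x] is proportional to w, so the index direction can be estimated with a single average instead of an iterative fit. The function name `stein_index_direction`, the tanh link, and the simulated data are all assumptions.

```python
import numpy as np

def stein_index_direction(X, y):
    """First-order Stein estimate of a single-index direction (hypothetical helper).

    Assumes the rows of X are approximately standard Gaussian; under that
    assumption the average of y_i * x_i points along the true index w.
    """
    beta = (X * (y - y.mean())[:, None]).mean(axis=0)
    return beta / np.linalg.norm(beta)

rng = np.random.default_rng(1)
n, p = 20000, 5
w_true = np.array([3.0, -2.0, 0.0, 1.0, 0.0])
w_true /= np.linalg.norm(w_true)

X = rng.standard_normal((n, p))                          # Gaussian features
y = np.tanh(X @ w_true) + 0.1 * rng.standard_normal(n)   # assumed link: tanh

w_hat = stein_index_direction(X, y)
cosine = float(w_hat @ w_true)   # close to 1 when the direction is recovered
print(cosine)
```

The estimate costs one pass over the data, which is the kind of saving that matters when a model has to be refitted for many candidate splits in a tree. With non-Gaussian features the proportionality no longer holds exactly, so this shortcut should be treated as an approximation.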