postgraduate thesis: Two-step transfer learning methods with R2D2 shrinkage prior and instance weighting transfer using gene expression data
Title | Two-step transfer learning methods with R2D2 shrinkage prior and instance weighting transfer using gene expression data |
---|---|
Authors | Chen, Hanning (陈翰宁) |
Advisors | Zhang, Y; Yin, G |
Issue Date | 2022 |
Publisher | The University of Hong Kong (Pokfulam, Hong Kong) |
Citation | Chen, H. [陈翰宁]. (2022). Two-step transfer learning methods with R2D2 shrinkage prior and instance weighting transfer using gene expression data. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR. |
Abstract | Large-scale analyses in biostatistics pose major challenges for traditional statistical models, which underperform or even become inapplicable when the dimension is large and the sample is small. To improve the predictive performance of the linear regression model, this work adopts a two-step transfer learning algorithm, consisting of a general analysis step and a specific analysis step, under the assumption that analogous studies share generalities while retaining specificities. A transfer learning algorithm that exploits both generality and specificity can therefore boost model performance. However, borrowing information from auxiliary tasks inevitably introduces noise. We therefore equip the transfer learning process with a novel class of shrinkage prior, R2D2, within the linear regression framework: a prior is first placed on the model fit, specifically the coefficient of determination, and is then distributed to the coefficients. In both our simulations and our application to gene expression data, the proposed two-step transfer learning algorithm improves model performance, and R2D2 transfer learning outperforms the other methods tested, suggesting that R2D2 is more robust to noisy information.
In addition to using R2D2 to estimate the linear coefficients, we apply instance weighting transfer learning to handle the noise introduced by transfer. We introduce a group weighting parameter into the two-step transfer learning framework. In the improved transfer pipeline, auxiliary sets of higher quality are assigned larger weights, while those of lower quality receive smaller weights. Unlike most instance weighting transfer research, which is largely confined to classification problems and uses individual-level weighting, our work is regression-based and focuses on group-level weighting in the presence of sample batch effects. The improved algorithm proves more general, as it works well across different target tissues. Building on our first project, we further apply our method to cross-organ transfer learning and verify its ability to discern the quality of auxiliary sets with respect to tissue similarity. |
Degree | Master of Philosophy |
Subject | Transfer learning (Machine learning); Gene expression - Statistical methods |
Dept/Program | Statistics and Actuarial Science |
Persistent Identifier | http://hdl.handle.net/10722/324475 |
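
The abstract describes a two-step procedure (a general step that pools target and auxiliary data, then a specific step on the target alone) with one weight per auxiliary set. The sketch below is only an illustration of that idea, not the thesis's implementation: it swaps in a plain ridge penalty where the thesis uses the R2D2 prior, treats the group weights as given rather than estimated, and all names (`weighted_ridge`, `two_step_fit`, `group_weights`, `lam_general`, `lam_specific`) are hypothetical.

```python
import numpy as np


def weighted_ridge(X, y, w, lam):
    """Minimise sum_i w_i * (y_i - x_i @ b)**2 + lam * ||b||**2 in closed form."""
    Xw = X * w[:, None]                       # scale each row by its weight
    A = X.T @ Xw + lam * np.eye(X.shape[1])   # X' W X + lam I
    return np.linalg.solve(A, Xw.T @ y)       # solve against X' W y


def two_step_fit(X_t, y_t, aux_sets, group_weights,
                 lam_general=1.0, lam_specific=1.0):
    """Two-step fit with group-level weights (ridge stands in for R2D2).

    aux_sets is a list of (X, y) pairs; group_weights holds one
    non-negative weight per auxiliary set (assumed given here).
    """
    # Step 1 (general): pool target and auxiliary samples. Target samples
    # get weight 1; every sample in an auxiliary set shares that set's weight.
    X_all = np.vstack([X_t] + [X for X, _ in aux_sets])
    y_all = np.concatenate([y_t] + [y for _, y in aux_sets])
    w_all = np.concatenate(
        [np.ones(len(y_t))]
        + [np.full(len(y), g) for (_, y), g in zip(aux_sets, group_weights)]
    )
    beta_general = weighted_ridge(X_all, y_all, w_all, lam_general)

    # Step 2 (specific): fit a target-only correction to the residuals
    # left by the general coefficients.
    resid = y_t - X_t @ beta_general
    delta = weighted_ridge(X_t, resid, np.ones(len(y_t)), lam_specific)
    return beta_general + delta


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    p = 5
    beta = rng.normal(size=p)
    X_t = rng.normal(size=(10, p))
    y_t = X_t @ beta + 0.1 * rng.normal(size=10)
    X_good = rng.normal(size=(50, p))
    good = (X_good, X_good @ beta + 0.1 * rng.normal(size=50))
    X_bad = rng.normal(size=(50, p))
    bad = (X_bad, X_bad @ rng.normal(size=p))  # unrelated coefficients
    # Down-weight the low-quality auxiliary set.
    print(two_step_fit(X_t, y_t, [good, bad], group_weights=[1.0, 0.1]))
```

Setting an auxiliary set's weight to zero removes it from the general step entirely, which makes the role of the group weight easy to check in this simplified setting.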
DC Field | Value | Language |
---|---|---|
dc.contributor.advisor | Zhang, Y | - |
dc.contributor.advisor | Yin, G | - |
dc.contributor.author | Chen, Hanning | - |
dc.contributor.author | 陈翰宁 | - |
dc.date.accessioned | 2023-02-03T02:12:23Z | - |
dc.date.available | 2023-02-03T02:12:23Z | - |
dc.date.issued | 2022 | - |
dc.identifier.citation | Chen, H. [陈翰宁]. (2022). Two-step transfer learning methods with R2D2 shrinkage prior and instance weighting transfer using gene expression data. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR. | - |
dc.identifier.uri | http://hdl.handle.net/10722/324475 | - |
dc.language | eng | - |
dc.publisher | The University of Hong Kong (Pokfulam, Hong Kong) | - |
dc.relation.ispartof | HKU Theses Online (HKUTO) | - |
dc.rights | The author retains all proprietary rights (such as patent rights) and the right to use in future works. | - |
dc.rights | This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License. | - |
dc.subject.lcsh | Transfer learning (Machine learning) | - |
dc.subject.lcsh | Gene expression - Statistical methods | - |
dc.title | Two-step transfer learning methods with R2D2 shrinkage prior and instance weighting transfer using gene expression data | - |
dc.type | PG_Thesis | - |
dc.description.thesisname | Master of Philosophy | - |
dc.description.thesislevel | Master | - |
dc.description.thesisdiscipline | Statistics and Actuarial Science | - |
dc.description.nature | published_or_final_version | - |
dc.date.hkucongregation | 2023 | - |
dc.identifier.mmsid | 991044634603703414 | - |