Two-step transfer learning methods with R2D2 shrinkage prior and instance weighting transfer using gene expression data

Chen, Hanning; 陈翰宁

Appears in Collections:
- HKU Theses Online
- Statistics & Actuarial Science: Theses

postgraduate thesis: Two-step transfer learning methods with R2D2 shrinkage prior and instance weighting transfer using gene expression data

Title	Two-step transfer learning methods with R2D2 shrinkage prior and instance weighting transfer using gene expression data
Authors	Chen, Hanning 陈翰宁
Advisors	Zhang, Y; Yin, G
Issue Date	2022
Publisher	The University of Hong Kong (Pokfulam, Hong Kong)
Citation	Chen, H. [陈翰宁]. (2022). Two-step transfer learning methods with R2D2 shrinkage prior and instance weighting transfer using gene expression data. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR.
Abstract	Large-scale analyses in biostatistics pose major challenges for traditional statistical models, which underperform or even become inapplicable when the dimension is large and the sample is small. To improve the predictive performance of the linear regression model, this work adopts a two-step transfer learning algorithm consisting of a general analysis step and a specific analysis step, under the assumption that analogous studies share generalities while retaining their own specificities. A transfer learning algorithm that exploits both generality and specificity can therefore boost model performance. Borrowing information from auxiliary tasks, however, inevitably introduces noise. We therefore equip the transfer learning process with a novel class of shrinkage prior, R2D2, within the linear regression framework: a prior is first placed on the model fit, specifically the coefficient of determination, and is then distributed to the coefficients. In our simulations and in an application to gene expression data, the proposed two-step transfer learning algorithm improves model performance, and R2D2 transfer learning outperforms the other methods tested, suggesting that R2D2 is more robust to noisy information. Besides using R2D2 to estimate the linear coefficients, we apply instance weighting transfer learning to handle the noise introduced by transfer. We introduce a group weighting parameter into the two-step transfer learning framework, so that in the improved pipeline auxiliary sets of higher quality receive larger weights and those of lower quality receive smaller weights.
Unlike most instance weighting transfer research, which is largely confined to classification problems and uses individual-level weighting, our work is regression-based and focuses on group-level weighting in the presence of sample batch effects. The improved algorithm proves more general, as it works well across different target tissues. Building on our first project, we further apply the method to cross-organ transfer learning and verify its ability to discern the quality of auxiliary sets with respect to tissue similarity.
Degree	Master of Philosophy
Subject	Transfer learning (Machine learning); Gene expression - Statistical methods
Dept/Program	Statistics and Actuarial Science
Persistent Identifier	http://hdl.handle.net/10722/324475
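
The abstract's two-step idea with group-level weighting can be illustrated with a minimal sketch. This is not the thesis's implementation: plain ridge regression stands in for the R2D2 prior, and the function names (`ridge`, `group_weights`, `two_step_fit`), the weighting rule (best target-prediction error divided by each group's error), and the simulated data are all assumptions made for illustration only.

```python
import numpy as np

def ridge(X, y, lam, sample_weight=None):
    """Weighted ridge: argmin_b sum_i w_i (y_i - x_i'b)^2 + lam * ||b||^2."""
    n, p = X.shape
    w = np.ones(n) if sample_weight is None else np.asarray(sample_weight, dtype=float)
    Xw = X * w[:, None]
    return np.linalg.solve(X.T @ Xw + lam * np.eye(p), Xw.T @ y)

def group_weights(X_t, y_t, aux, lam, eps=1e-12):
    """Hypothetical quality score: one weight per auxiliary set (group level).

    Each auxiliary set is fitted on its own and scored by how well it
    predicts the target data; the best set gets weight 1.
    """
    mse = np.array([np.mean((y_t - X_t @ ridge(X_a, y_a, lam)) ** 2)
                    for X_a, y_a in aux])
    return (mse.min() + eps) / (mse + eps)

def two_step_fit(X_t, y_t, aux, lam_general=1.0, lam_specific=1.0):
    w_groups = group_weights(X_t, y_t, aux, lam_general)
    # General step: pool the target (weight 1) with the weighted auxiliary sets.
    X_pool = np.vstack([X_t] + [X_a for X_a, _ in aux])
    y_pool = np.concatenate([y_t] + [y_a for _, y_a in aux])
    w_pool = np.concatenate(
        [np.ones(len(y_t))]
        + [np.full(len(y_a), w) for (_, y_a), w in zip(aux, w_groups)]
    )
    beta_general = ridge(X_pool, y_pool, lam_general, w_pool)
    # Specific step: fit a shrunken target-only correction to the residuals.
    delta = ridge(X_t, y_t - X_t @ beta_general, lam_specific)
    return beta_general + delta, w_groups

# Simulated example (assumed data, not gene expression measurements).
rng = np.random.default_rng(0)
p = 20
beta = rng.normal(size=p)

def simulate(b, n):
    X = rng.normal(size=(n, p))
    return X, X @ b + rng.normal(scale=0.5, size=n)

X_t, y_t = simulate(beta, 15)                            # small target sample
aux = [simulate(beta + 0.1 * rng.normal(size=p), 100),   # similar auxiliary set
       simulate(rng.normal(size=p), 100)]                # unrelated auxiliary set
beta_hat, w = two_step_fit(X_t, y_t, aux)
```

In this toy setting the similar auxiliary set should receive the larger weight, mirroring the abstract's statement that higher-quality auxiliary sets are weighted more heavily. A Bayesian fit with the R2D2 prior would replace both `ridge` calls in the thesis's setting.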

DC Field	Value	Language
dc.contributor.advisor	Zhang, Y	-
dc.contributor.advisor	Yin, G	-
dc.contributor.author	Chen, Hanning	-
dc.contributor.author	陈翰宁	-
dc.date.accessioned	2023-02-03T02:12:23Z	-
dc.date.available	2023-02-03T02:12:23Z	-
dc.date.issued	2022	-
dc.identifier.citation	Chen, H. [陈翰宁]. (2022). Two-step transfer learning methods with R2D2 shrinkage prior and instance weighting transfer using gene expression data. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR.	-
dc.identifier.uri	http://hdl.handle.net/10722/324475	-
dc.language	eng	-
dc.publisher	The University of Hong Kong (Pokfulam, Hong Kong)	-
dc.relation.ispartof	HKU Theses Online (HKUTO)	-
dc.rights	The author retains all proprietary rights, (such as patent rights) and the right to use in future works.	-
dc.rights	This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.	-
dc.subject.lcsh	Transfer learning (Machine learning)	-
dc.subject.lcsh	Gene expression - Statistical methods	-
dc.title	Two-step transfer learning methods with R2D2 shrinkage prior and instance weighting transfer using gene expression data	-
dc.type	PG_Thesis	-
dc.description.thesisname	Master of Philosophy	-
dc.description.thesislevel	Master	-
dc.description.thesisdiscipline	Statistics and Actuarial Science	-
dc.description.nature	published_or_final_version	-
dc.date.hkucongregation	2023	-
dc.identifier.mmsid	991044634603703414	-
