Links for fulltext (may require subscription):
- Publisher Website: 10.1109/BIBM.2013.6732526
- Scopus: eid_2-s2.0-84894561395
Conference Paper: Intra- and Inter-sparse Multiple Output Regression with Application on Environmental Microbial Community Study
Title | Intra- and Inter-sparse Multiple Output Regression with Application on Environmental Microbial Community Study |
---|---|
Authors | Yang, J; Leung, HCM; Yiu, SM; Cai, YP; Chin, FYL |
Issue Date | 2013 |
Publisher | IEEE. The Journal's web site is located at http://ieeexplore.ieee.org/xpl/conhome.jsp?punumber=1001586 |
Citation | The IEEE International Conference on Bioinformatics and Biomedicine (BIBM), Shanghai, China, 18-21 December 2013. In IEEE International Conference on Bioinformatics and Biomedicine Proceedings, 2013, p. 404-409, article no. 6732526 |
Abstract | Feature selection is important for many biological studies, especially when the number of available samples is limited (on the order of hundreds) while the number of input features is large (on the order of millions), as in eQTL (expression quantitative trait loci) mapping, GWAS (genome-wide association studies) and environmental microbial community studies. We study the problem of multiple output regression, which leverages the underlying common relationship shared by multiple output features, and propose an efficient and accurate approach for feature selection. Our approach considers both intra- and inter-group sparsity. The intergroup sparsity assumes that only a small set of input features is related to the output features. The intragroup sparsity assumes that each input feature may relate to multiple output features, which should have different kinds of sparsity. Most existing methods do not model the intragroup sparsity well: they either assume uniform regularization on each group, i.e. each input feature relates to a similar number of output features, or require prior knowledge of the relationship between input and output features. By modelling the regression coefficients as a mixture distribution of Laplacian and Gaussian, we can adaptively shrink group regression coefficients and learn the intergroup sparsity, intragroup sparsity and shrinkage estimation patterns. Empirical studies on synthetic and real environmental microbial community datasets show that our model gives better predictions on test datasets than existing methods such as Lasso, Elastic Net, the dirty model and rMTFL (robust multi-task feature learning). Moreover, by using least angle regression or coordinate descent and projected gradient descent techniques for optimization, we can obtain the optimal regression efficiently. © 2013 IEEE. |
Persistent Identifier | http://hdl.handle.net/10722/201113 |
ISBN | 9781479913091 |
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Yang, J | en_US |
dc.contributor.author | Leung, HCM | en_US |
dc.contributor.author | Yiu, SM | en_US |
dc.contributor.author | Cai, YP | en_US |
dc.contributor.author | Chin, FYL | en_US |
dc.date.accessioned | 2014-08-21T07:13:35Z | - |
dc.date.available | 2014-08-21T07:13:35Z | - |
dc.date.issued | 2013 | en_US |
dc.identifier.citation | The IEEE International Conference on Bioinformatics and Biomedicine (BIBM), Shanghai, China, 18-21 December 2013. In IEEE International Conference on Bioinformatics and Biomedicine Proceedings, 2013, p. 404-409, article no. 6732526 | en_US |
dc.identifier.isbn | 9781479913091 | - |
dc.identifier.uri | http://hdl.handle.net/10722/201113 | - |
dc.description.abstract | Feature selection is important for many biological studies, especially when the number of available samples is limited (on the order of hundreds) while the number of input features is large (on the order of millions), as in eQTL (expression quantitative trait loci) mapping, GWAS (genome-wide association studies) and environmental microbial community studies. We study the problem of multiple output regression, which leverages the underlying common relationship shared by multiple output features, and propose an efficient and accurate approach for feature selection. Our approach considers both intra- and inter-group sparsity. The intergroup sparsity assumes that only a small set of input features is related to the output features. The intragroup sparsity assumes that each input feature may relate to multiple output features, which should have different kinds of sparsity. Most existing methods do not model the intragroup sparsity well: they either assume uniform regularization on each group, i.e. each input feature relates to a similar number of output features, or require prior knowledge of the relationship between input and output features. By modelling the regression coefficients as a mixture distribution of Laplacian and Gaussian, we can adaptively shrink group regression coefficients and learn the intergroup sparsity, intragroup sparsity and shrinkage estimation patterns. Empirical studies on synthetic and real environmental microbial community datasets show that our model gives better predictions on test datasets than existing methods such as Lasso, Elastic Net, the dirty model and rMTFL (robust multi-task feature learning). Moreover, by using least angle regression or coordinate descent and projected gradient descent techniques for optimization, we can obtain the optimal regression efficiently. © 2013 IEEE. | - |
dc.language | eng | en_US |
dc.publisher | IEEE. The Journal's web site is located at http://ieeexplore.ieee.org/xpl/conhome.jsp?punumber=1001586 | - |
dc.relation.ispartof | IEEE International Conference on Bioinformatics and Biomedicine Proceedings | en_US |
dc.title | Intra- and Inter-sparse Multiple Output Regression with Application on Environmental Microbial Community Study | en_US |
dc.type | Conference_Paper | en_US |
dc.identifier.email | Yang, J: lne1013@hku.hk | en_US |
dc.identifier.email | Leung, HCM: cmleung2@cs.hku.hk | en_US |
dc.identifier.email | Yiu, SM: smyiu@cs.hku.hk | en_US |
dc.identifier.email | Chin, FYL: chin@cs.hku.hk | en_US |
dc.identifier.authority | Leung, HCM=rp00144 | en_US |
dc.identifier.authority | Yiu, SM=rp00207 | en_US |
dc.identifier.authority | Chin, FYL=rp00105 | en_US |
dc.description.nature | link_to_subscribed_fulltext | - |
dc.identifier.doi | 10.1109/BIBM.2013.6732526 | - |
dc.identifier.scopus | eid_2-s2.0-84894561395 | - |
dc.identifier.hkuros | 235158 | en_US |
dc.identifier.spage | 404, article no. 6732526 | en_US |
dc.identifier.epage | 409, article no. 6732526 | en_US |
dc.publisher.place | United States | - |
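
The two kinds of sparsity named in the abstract can be illustrated with a small sketch. This is not the paper's model, which places a Laplacian and Gaussian mixture on the coefficients; it is the proximal step of a generic sparse group lasso penalty, swapped in here only because it shows the same two effects. The function names, the layout of `W` (one row per input feature, one column per output feature) and the penalty weights `l1` and `l2` are all assumptions made for illustration.

```python
import math


def soft_threshold(x, t):
    """Shrink x toward zero by t; values within [-t, t] become exactly 0."""
    if x > t:
        return x - t
    if x < -t:
        return x + t
    return 0.0


def prox_sparse_group(W, l1, l2):
    """Proximal step for l1 * sum|w_ij| + l2 * sum_i ||W[i]||_2 (hypothetical helper).

    W is a list of rows; row i holds the coefficients linking input
    feature i to every output feature.
    """
    result = []
    for row in W:
        # Intra-group sparsity: zero individual (input, output) coefficients.
        shrunk = [soft_threshold(w, l1) for w in row]
        # Inter-group sparsity: zero the whole row if its norm is small.
        norm = math.sqrt(sum(w * w for w in shrunk))
        scale = max(1.0 - l2 / norm, 0.0) if norm > 0.0 else 0.0
        result.append([scale * w for w in shrunk])
    return result


# Assumed toy coefficients: feature 0 is strongly tied to output 0 only,
# feature 1 is weakly tied to both outputs.
W = [[3.0, -0.5], [0.2, 0.1]]
print(prox_sparse_group(W, l1=0.5, l2=1.0))
```

In this toy case the first row keeps one shrunken coefficient (about 1.5) while its second entry is zeroed, and the second row is removed entirely, which corresponds to dropping that input feature for all outputs.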