
Conference Paper: On the Power of Multitask Representation Learning with Gradient Descent

Title: On the Power of Multitask Representation Learning with Gradient Descent
Authors: Li, Qiaobo; Chen, Zixiang; Deng, Yihe; Kou, Yiwen; Cao, Yuan; Gu, Quanquan
Issue Date: 23-Apr-2025
Abstract:

Representation learning, particularly multi-task representation learning, has gained widespread popularity in various deep learning applications, ranging from computer vision to natural language processing, due to its remarkable generalization performance. Despite its growing use, our understanding of the underlying mechanisms remains limited. In this paper, we provide a theoretical analysis elucidating why multi-task representation learning outperforms its single-task counterpart in scenarios involving over-parameterized two-layer convolutional neural networks trained by gradient descent. Our analysis is based on a data model that encompasses both task-shared and task-specific features, a setting commonly encountered in real-world applications. We also present experiments on synthetic and real-world data to illustrate and validate our theoretical findings.
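
The setting the abstract describes can be made concrete with a small sketch. The following is a minimal illustration, not the paper's actual construction: it assumes PyTorch, and every name, dimension, and hyperparameter below (the patch dimension d, the network width m, the cubic activation, the signal/noise split) is an assumption made for this example. It builds a synthetic data model with one task-shared feature and one task-specific feature per task, then trains the shared first-layer filters of a two-layer convolutional-style network with plain (full-batch) gradient descent on the summed loss over all tasks.

```python
import torch

torch.manual_seed(0)
d, m, n_tasks, n = 16, 8, 3, 64  # patch dim, filters, tasks, samples per task (assumed)

# Data model sketch: each example has a "signal" patch built from a
# task-shared feature plus a task-specific feature, and a pure-noise patch.
shared = torch.randn(d); shared /= shared.norm()
specific = torch.randn(n_tasks, d)
specific /= specific.norm(dim=1, keepdim=True)

def make_task(t):
    y = torch.randint(0, 2, (n,)).float() * 2 - 1      # labels in {-1, +1}
    signal = y[:, None] * (shared + specific[t]) / 2   # label-aligned signal patch
    noise = 0.5 * torch.randn(n, d)                    # noise patch
    return torch.stack([signal, noise], dim=1), y      # X: (n, 2 patches, d)

tasks = [make_task(t) for t in range(n_tasks)]

# Two-layer convolutional-style network: shared first-layer filters W are
# applied to every patch; the second layer is fixed to average pooling.
W = (0.1 * torch.randn(m, d)).requires_grad_()

def f(W, X):
    h = (X @ W.T) ** 3            # (n, patches, m); cubic activation (assumed)
    return h.sum(dim=(1, 2)) / m  # pooled scalar output per example

opt = torch.optim.SGD([W], lr=0.05)  # full-batch steps, i.e. plain gradient descent
for _ in range(200):
    opt.zero_grad()
    # Multi-task objective: every task's loss flows through the same W,
    # so all tasks share one learned representation.
    loss = sum(torch.nn.functional.softplus(-y * f(W, X)).mean() for X, y in tasks)
    loss.backward()
    opt.step()

print(f"final summed multi-task loss: {loss.item():.4f}")
```

Under the same assumptions, a single-task baseline would train a separate filter matrix W per task on that task's data alone; the multi-task version above forces all tasks through one shared representation, which is the comparison the paper analyzes.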


Persistent Identifier: http://hdl.handle.net/10722/359630

DC Field | Value | Language
dc.contributor.author | Li, Qiaobo | -
dc.contributor.author | Chen, Zixiang | -
dc.contributor.author | Deng, Yihe | -
dc.contributor.author | Kou, Yiwen | -
dc.contributor.author | Cao, Yuan | -
dc.contributor.author | Gu, Quanquan | -
dc.date.accessioned | 2025-09-09T00:45:38Z | -
dc.date.available | 2025-09-09T00:45:38Z | -
dc.date.issued | 2025-04-23 | -
dc.identifier.uri | http://hdl.handle.net/10722/359630 | -
dc.description.abstract | Representation learning, particularly multi-task representation learning, has gained widespread popularity in various deep learning applications, ranging from computer vision to natural language processing, due to its remarkable generalization performance. Despite its growing use, our understanding of the underlying mechanisms remains limited. In this paper, we provide a theoretical analysis elucidating why multi-task representation learning outperforms its single-task counterpart in scenarios involving over-parameterized two-layer convolutional neural networks trained by gradient descent. Our analysis is based on a data model that encompasses both task-shared and task-specific features, a setting commonly encountered in real-world applications. We also present experiments on synthetic and real-world data to illustrate and validate our theoretical findings. | -
dc.language | eng | -
dc.relation.ispartof | The 28th International Conference on Artificial Intelligence and Statistics (03/05/2025-05/05/2025, Mai Khao) | -
dc.title | On the Power of Multitask Representation Learning with Gradient Descent | -
dc.type | Conference_Paper | -
