postgraduate thesis: Random matrix theory and its applications in high-dimensional hypothesis testing problems

Title: Random matrix theory and its applications in high-dimensional hypothesis testing problems
Authors: Mei, Tianxing (梅天星)
Issue Date: 2023
Publisher: The University of Hong Kong (Pokfulam, Hong Kong)
Citation: Mei, T. [梅天星]. (2023). Random matrix theory and its applications in high-dimensional hypothesis testing problems. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR.
Abstract: Random matrix theory is an important part of modern probability theory and has been applied widely in high-dimensional statistical analysis. This thesis consists of two parts: the first characterizes the limiting singular value distribution of a large data matrix with independent columns, and the second concerns an application of random matrix theory to the problem of testing hypotheses on a growing number of large covariance matrices. In the first part, we analyze the singular values of a large $p\times n$ data matrix $\mathbf{X}_n=(\mathbf{x}_{n1},\ldots,\mathbf{x}_{nn})$, where the columns $\{\mathbf{x}_{nj}\}$ are independent $p$-dimensional vectors, possibly with different distributions. Assuming that the covariance matrices $\mathbf{\Sigma}_{nj}=\text{Cov}(\mathbf{x}_{nj})$ of the column vectors can be asymptotically simultaneously diagonalized, with appropriately converging spectra, we establish a limiting spectral distribution (LSD) for the singular values of $\mathbf{X}_n$ when both dimensions $p$ and $n$ grow to infinity with comparable magnitudes. Our matrix model includes and goes beyond many types of sample covariance matrices in existing work, such as weighted sample covariance matrices, Gram matrices, and sample covariance matrices of a linear time series model. Furthermore, three applications of our general approach are developed. First, we obtain the existence and uniqueness of the LSD for realized covariance matrices of a multi-dimensional diffusion process with anisotropic time-varying co-volatility. Second, we derive the LSD for singular values of data matrices from a recent matrix-valued auto-regressive model. Finally, we obtain the LSD for singular values of data matrices from a generalized finite mixture model.
In the second part, we consider the hypothesis testing problem involving a large number $q$ of covariance matrices of dimension $p$ under a limiting scheme where $p$, $q$, and the sample sizes from the $q$ populations grow to infinity in a proper manner. Under this setting, we propose procedures for testing (a) the equality hypothesis, (b) the proportionality hypothesis, and (c) the general hypothesis on the dimension of the linear span of the $q$ covariance matrices. The proposed test statistics are shown to be asymptotically normal. Simulation results show that the finite-sample properties of the test procedures are satisfactory under both the null and alternative hypotheses. As an application, we apply our test procedures to a matrix-valued transposable gene dataset from the Mouse Aging Project and derive some new insights about its covariance structures. An empirical analysis of datasets from the 1000 Genomes Project (phase 3) is also conducted.
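As a rough illustration of what a limiting spectral distribution looks like, the sketch below simulates only the simplest special case of the setting described in the first part of the abstract: a $p\times n$ matrix with i.i.d. standard Gaussian entries, where the eigenvalues of $\frac{1}{n}\mathbf{X}_n\mathbf{X}_n^{\top}$ (the squared singular values of $\mathbf{X}_n/\sqrt{n}$) are known to follow the Marchenko-Pastur law with support $[(1-\sqrt{y})^2,(1+\sqrt{y})^2]$ for $y=p/n\le 1$. It is not the thesis's method; the function names, the optional `column_scales` argument (a crude stand-in for columns with different distributions), and the chosen dimensions are all assumptions made for this example.

```python
import numpy as np


def squared_singular_values(p, n, column_scales=None, seed=0):
    """Eigenvalues of (1/n) X X^T for a p x n Gaussian matrix with independent columns.

    column_scales (hypothetical helper argument): optional length-n array; column j
    is multiplied by column_scales[j], so columns may have different variances.
    """
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((p, n))
    if column_scales is not None:
        # Broadcasting along the last axis scales column j by column_scales[j].
        X = X * np.asarray(column_scales)
    singular_values = np.linalg.svd(X, compute_uv=False)
    return singular_values**2 / n


def marchenko_pastur_edges(y):
    """Support edges of the Marchenko-Pastur law with ratio y = p/n <= 1, unit variance."""
    return (1.0 - np.sqrt(y)) ** 2, (1.0 + np.sqrt(y)) ** 2


p, n = 200, 800  # illustrative dimensions with p/n = 0.25
eigs = squared_singular_values(p, n)
lower_edge, upper_edge = marchenko_pastur_edges(p / n)

print("empirical range:", eigs.min(), eigs.max())
print("Marchenko-Pastur support:", lower_edge, upper_edge)
print("mean eigenvalue:", eigs.mean())
```

Passing a non-constant `column_scales` gives a weighted sample covariance matrix, one of the model classes the abstract mentions; its limiting spectrum is generally no longer Marchenko-Pastur, and this sketch does not attempt to compute it.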
Degree: Doctor of Philosophy
Subject: Random matrices
Dept/Program: Statistics and Actuarial Science
Persistent Identifier: http://hdl.handle.net/10722/336637

 

DC Field | Value | Language
dc.contributor.author | Mei, Tianxing | -
dc.contributor.author | 梅天星 | -
dc.date.accessioned | 2024-02-26T08:30:53Z | -
dc.date.available | 2024-02-26T08:30:53Z | -
dc.date.issued | 2023 | -
dc.identifier.citation | Mei, T. [梅天星]. (2023). Random matrix theory and its applications in high-dimensional hypothesis testing problems. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR. | -
dc.identifier.uri | http://hdl.handle.net/10722/336637 | -
dc.language | eng | -
dc.publisher | The University of Hong Kong (Pokfulam, Hong Kong) | -
dc.relation.ispartof | HKU Theses Online (HKUTO) | -
dc.rights | The author retains all proprietary rights (such as patent rights) and the right to use in future works. | -
dc.rights | This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License. | -
dc.subject.lcsh | Random matrices | -
dc.title | Random matrix theory and its applications in high-dimensional hypothesis testing problems | -
dc.type | PG_Thesis | -
dc.description.thesisname | Doctor of Philosophy | -
dc.description.thesislevel | Doctoral | -
dc.description.thesisdiscipline | Statistics and Actuarial Science | -
dc.description.nature | published_or_final_version | -
dc.date.hkucongregation | 2024 | -
dc.identifier.mmsid | 991044770601803414 | -
