postgraduate thesis: Random matrix theory and its applications in high-dimensional hypothesis testing problems
Title | Random matrix theory and its applications in high-dimensional hypothesis testing problems |
---|---|
Authors | Mei, Tianxing (梅天星) |
Issue Date | 2023 |
Publisher | The University of Hong Kong (Pokfulam, Hong Kong) |
Citation | Mei, T. [梅天星]. (2023). Random matrix theory and its applications in high-dimensional hypothesis testing problems. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR. |
Abstract | Random matrix theory is one of the important parts of modern probability theory and has been applied widely in high-dimensional statistical analysis. This thesis consists of two parts: the first characterizes the limiting singular value distribution of a large data matrix with independent columns, and the second concerns an application of random matrix theory to the problem of testing hypotheses on a growing number of large covariance matrices.
In the first part, we analyze the singular values of a large $p\times n$ data matrix $\mathbf{X}_n=(\mathbf{x}_{n1},\ldots,\mathbf{x}_{nn})$, where the columns $\{\mathbf{x}_{nj}\}$ are independent $p$-dimensional vectors, possibly with different distributions. Assuming that the covariance matrices $\mathbf{\Sigma}_{nj}=\text{Cov}(\mathbf{x}_{nj})$ of the column vectors can be asymptotically simultaneously diagonalized, with appropriately converging spectra, we establish a limiting spectral distribution (LSD) for the singular values of $\mathbf{X}_n$ when both dimensions $p$ and $n$ grow to infinity in comparable magnitudes. Our matrix model includes and goes beyond many types of sample covariance matrices in existing work, such as weighted sample covariance matrices, Gram matrices, and sample covariance matrices of a linear time series model. Furthermore, three applications of our general approach are developed.
First, we obtain the existence and uniqueness of the LSD for realized covariance matrices of a multi-dimensional diffusion process with anisotropic time-varying co-volatility. Second, we derive the LSD for singular values of data matrices from a recent matrix-valued auto-regressive model. Finally, we also obtain the LSD for singular values of data matrices from a generalized finite mixture model.
In the second part, we consider the hypothesis testing problem involving a large number $q$ of covariance matrices of dimension $p$ under a limiting scheme where $p$, $q$, and the sample sizes from the $q$ populations grow to infinity in a proper manner. Under this setting, we propose procedures for testing (a) the equality hypothesis, (b) the proportionality hypothesis, and (c) the general hypothesis on the dimension of the linear span of $q$ covariance matrices. The proposed test statistics are shown to be asymptotically normal. Simulation results show that finite sample properties of the test procedures are satisfactory under both the null and alternative hypotheses. As an application, we apply our test procedures to a matrix-valued transposable gene dataset, the Mouse Aging Project, and derive some new insights about its covariance structures. Empirical analysis of datasets from the 1000 Genomes Project (phase 3) is also conducted.
|
Degree | Doctor of Philosophy |
Subject | Random matrices |
Dept/Program | Statistics and Actuarial Science |
Persistent Identifier | http://hdl.handle.net/10722/336637 |
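
The first part of the abstract concerns the singular values of a $p\times n$ data matrix whose columns are independent but not necessarily identically distributed. The thesis code is not part of this record, so the following is only a minimal simulation sketch under strong simplifying assumptions: Gaussian columns with scalar covariances $\mathbf{\Sigma}_{nj}=w_j\mathbf{I}_p$ (trivially simultaneously diagonalizable), and hypothetical names (`squared_singular_values`, `marchenko_pastur_edges`, `weights`). With all $w_j=1$ the squared singular values of $\mathbf{X}_n/\sqrt{n}$ should concentrate on the classical Marchenko-Pastur support; with unequal weights the spectrum changes shape while its mean tracks the average weight.

```python
import numpy as np


def squared_singular_values(p, n, weights, rng):
    """Squared singular values of X / sqrt(n).

    Column j of X is sqrt(weights[j]) * z_j with z_j ~ N(0, I_p), so the
    columns are independent with covariance weights[j] * I_p (an assumed,
    deliberately simple special case of independent non-identical columns).
    """
    z = rng.standard_normal((p, n))
    x = z * np.sqrt(weights)  # broadcasting scales column j by sqrt(weights[j])
    s = np.linalg.svd(x / np.sqrt(n), compute_uv=False)
    return s ** 2


def marchenko_pastur_edges(c):
    """Support edges of the Marchenko-Pastur law with ratio c = p/n <= 1
    and unit variance."""
    return (1 - np.sqrt(c)) ** 2, (1 + np.sqrt(c)) ** 2


rng = np.random.default_rng(0)
p, n = 200, 800

# Identically distributed columns: the classical Marchenko-Pastur setting.
iid = squared_singular_values(p, n, np.ones(n), rng)
left, right = marchenko_pastur_edges(p / n)

# Columns with different (scalar) covariances.
weights = rng.uniform(0.5, 1.5, size=n)
het = squared_singular_values(p, n, weights, rng)

print("Marchenko-Pastur edges:", left, right)
print("i.i.d. columns, min/max:", iid.min(), iid.max())
print("heterogeneous columns, min/max:", het.min(), het.max())
print("heterogeneous mean vs average weight:", het.mean(), weights.mean())
```

Plotting histograms of `iid` and `het` side by side is a quick way to see how column heterogeneity reshapes the empirical spectral distribution; the general limiting law established in the thesis is not reproduced here.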
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Mei, Tianxing | - |
dc.contributor.author | 梅天星 | - |
dc.date.accessioned | 2024-02-26T08:30:53Z | - |
dc.date.available | 2024-02-26T08:30:53Z | - |
dc.date.issued | 2023 | - |
dc.identifier.citation | Mei, T. [梅天星]. (2023). Random matrix theory and its applications in high-dimensional hypothesis testing problems. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR. | - |
dc.identifier.uri | http://hdl.handle.net/10722/336637 | - |
dc.description.abstract | Random matrix theory is one of the important parts of modern probability theory and has been applied widely in high-dimensional statistical analysis. This thesis consists of two parts: the first characterizes the limiting singular value distribution of a large data matrix with independent columns, and the second concerns an application of random matrix theory to the problem of testing hypotheses on a growing number of large covariance matrices. In the first part, we analyze the singular values of a large $p\times n$ data matrix $\mathbf{X}_n=(\mathbf{x}_{n1},\ldots,\mathbf{x}_{nn})$, where the columns $\{\mathbf{x}_{nj}\}$ are independent $p$-dimensional vectors, possibly with different distributions. Assuming that the covariance matrices $\mathbf{\Sigma}_{nj}=\text{Cov}(\mathbf{x}_{nj})$ of the column vectors can be asymptotically simultaneously diagonalized, with appropriately converging spectra, we establish a limiting spectral distribution (LSD) for the singular values of $\mathbf{X}_n$ when both dimensions $p$ and $n$ grow to infinity in comparable magnitudes. Our matrix model includes and goes beyond many types of sample covariance matrices in existing work, such as weighted sample covariance matrices, Gram matrices, and sample covariance matrices of a linear time series model. Furthermore, three applications of our general approach are developed. First, we obtain the existence and uniqueness of the LSD for realized covariance matrices of a multi-dimensional diffusion process with anisotropic time-varying co-volatility. Second, we derive the LSD for singular values of data matrices from a recent matrix-valued auto-regressive model. Finally, we also obtain the LSD for singular values of data matrices from a generalized finite mixture model. In the second part, we consider the hypothesis testing problem involving a large number $q$ of covariance matrices of dimension $p$ under a limiting scheme where $p$, $q$, and the sample sizes from the $q$ populations grow to infinity in a proper manner. Under this setting, we propose procedures for testing (a) the equality hypothesis, (b) the proportionality hypothesis, and (c) the general hypothesis on the dimension of the linear span of $q$ covariance matrices. The proposed test statistics are shown to be asymptotically normal. Simulation results show that finite sample properties of the test procedures are satisfactory under both the null and alternative hypotheses. As an application, we apply our test procedures to a matrix-valued transposable gene dataset, the Mouse Aging Project, and derive some new insights about its covariance structures. Empirical analysis of datasets from the 1000 Genomes Project (phase 3) is also conducted. | - |
dc.language | eng | - |
dc.publisher | The University of Hong Kong (Pokfulam, Hong Kong) | - |
dc.relation.ispartof | HKU Theses Online (HKUTO) | - |
dc.rights | The author retains all proprietary rights (such as patent rights) and the right to use in future works. | - |
dc.rights | This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License. | - |
dc.subject.lcsh | Random matrices | - |
dc.title | Random matrix theory and its applications in high-dimensional hypothesis testing problems | - |
dc.type | PG_Thesis | - |
dc.description.thesisname | Doctor of Philosophy | - |
dc.description.thesislevel | Doctoral | - |
dc.description.thesisdiscipline | Statistics and Actuarial Science | - |
dc.description.nature | published_or_final_version | - |
dc.date.hkucongregation | 2024 | - |
dc.identifier.mmsid | 991044770601803414 | - |