postgraduate thesis: Incomplete categorical data, inflated count data analyses and robust modeling with applications

Title: Incomplete categorical data, inflated count data analyses and robust modeling with applications
Authors: Zhang, Chi (张弛)
Advisor(s): Yuen, KC; Tian, G
Issue Date: 2017
Publisher: The University of Hong Kong (Pokfulam, Hong Kong)
Citation: Zhang, C. [张弛]. (2017). Incomplete categorical data, inflated count data analyses and robust modeling with applications. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR.
Abstract: This thesis considers several issues in the analysis of incomplete categorical data and inflated count data, together with a robust statistical model. The first part investigates a case-control study with missing data. Specifically, the valid sampling distribution of the observed counts is derived under the missing-at-random assumption, and the corresponding statistical inference methods are developed. Theoretical comparisons show that the proposed sampling distribution differs substantially from two existing methods. The results show that conclusions drawn from the Wald test under different sampling distributions may differ completely and can even contradict one another. The second part studies distributional properties of the zero-and-one inflated Poisson (ZOIP) distribution, which was proposed by Melkersson and Olsson (1999) to model count data with many zero and one observations. Stochastic representations are constructed for the ZOIP random variable. These representations facilitate the use of the expectation-maximization algorithm to obtain maximum likelihood estimates of the parameters of interest. Other likelihood-based inference results, including bootstrap confidence intervals and large-sample hypothesis tests, are also provided. The third part generalizes the univariate ZOIP distribution to the multivariate case. The multivariate ZOIP distribution can handle multivariate count data with inflated counts at both zero and one. It has a very general correlation structure that depends on the parameter values and allows a positive or negative correlation coefficient between any pair of random components. For the proposed multivariate distribution, important distributional properties are derived and some useful statistical inference methods are developed.
The final part proposes a new multivariate t (MVT) distribution that allows different degrees of freedom for each univariate component. It includes components that follow the multivariate normal distribution when the corresponding degrees of freedom tend to infinity. It also contains the product of independent t distributions as a special case. Unlike the classical MVT distribution, this new structure is more flexible in model specification. The performance of all the methods proposed in this thesis is evaluated through simulation studies and real data analyses.
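The ZOIP distribution named in the abstract can be illustrated with a small sketch. This record does not state the parameterization, so the code assumes a common mixture form: with weights phi0, phi1 and phi2 = 1 - phi0 - phi1, the variable equals 0 with probability phi0, equals 1 with probability phi1, and is a Poisson(lam) draw with probability phi2. The names zoip_pmf, poisson_draw and zoip_sample are hypothetical and are not taken from the thesis; the sampler is one possible stochastic representation, not necessarily the one constructed there.

```python
import math
import random


def zoip_pmf(y, phi0, phi1, lam):
    """P(Y = y) under the assumed mixture form of the ZOIP distribution."""
    if y < 0:
        return 0.0
    phi2 = 1.0 - phi0 - phi1  # weight of the ordinary Poisson component
    poisson = math.exp(-lam) * lam ** y / math.factorial(y)
    # Extra point mass is added only at zero and at one.
    extra = phi0 if y == 0 else phi1 if y == 1 else 0.0
    return extra + phi2 * poisson


def poisson_draw(lam, rng):
    """Draw from Poisson(lam) with Knuth's product-of-uniforms method."""
    limit = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k


def zoip_sample(phi0, phi1, lam, rng):
    """Pick a mixture component first, then draw from it (assumed form)."""
    u = rng.random()
    if u < phi0:
        return 0
    if u < phi0 + phi1:
        return 1
    return poisson_draw(lam, rng)


if __name__ == "__main__":
    rng = random.Random(2017)
    draws = [zoip_sample(0.3, 0.2, 2.0, rng) for _ in range(10000)]
    share_zero = draws.count(0) / len(draws)
    print("empirical P(Y=0):", round(share_zero, 3))
    print("model P(Y=0):", round(zoip_pmf(0, 0.3, 0.2, 2.0), 3))
```

In this form the component label (structural zero, structural one, or Poisson) acts as latent data, which is the kind of structure an expectation-maximization algorithm can exploit.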
Degree: Doctor of Philosophy
Subject: Sampling (Statistics); Poisson distribution; Multivariate analysis
Dept/Program: Statistics and Actuarial Science
Persistent Identifier: http://hdl.handle.net/10722/250805

 

DC Field | Value | Language
dc.contributor.advisor | Yuen, KC | -
dc.contributor.advisor | Tian, G | -
dc.contributor.author | Zhang, Chi | -
dc.contributor.author | 张弛 | -
dc.date.accessioned | 2018-01-26T01:59:35Z | -
dc.date.available | 2018-01-26T01:59:35Z | -
dc.date.issued | 2017 | -
dc.identifier.citation | Zhang, C. [张弛]. (2017). Incomplete categorical data, inflated count data analyses and robust modeling with applications. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR. | -
dc.identifier.uri | http://hdl.handle.net/10722/250805 | -
dc.language | eng | -
dc.publisher | The University of Hong Kong (Pokfulam, Hong Kong) | -
dc.relation.ispartof | HKU Theses Online (HKUTO) | -
dc.rights | The author retains all proprietary rights, (such as patent rights) and the right to use in future works. | -
dc.rights | This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License. | -
dc.subject.lcsh | Sampling (Statistics) | -
dc.subject.lcsh | Poisson distribution | -
dc.subject.lcsh | Multivariate analysis | -
dc.title | Incomplete categorical data, inflated count data analyses and robust modeling with applications | -
dc.type | PG_Thesis | -
dc.description.thesisname | Doctor of Philosophy | -
dc.description.thesislevel | Doctoral | -
dc.description.thesisdiscipline | Statistics and Actuarial Science | -
dc.description.nature | published_or_final_version | -
dc.identifier.doi | 10.5353/th_991043979552803414 | -
dc.date.hkucongregation | 2017 | -
dc.identifier.mmsid | 991043979552803414 | -
