postgraduate thesis: A parametric embedding approach for modeling ranking data and its applications

Title: A parametric embedding approach for modeling ranking data and its applications
Authors: Xu, Hang (徐航)
Advisor(s): Yu, PLH
Issue Date: 2018
Publisher: The University of Hong Kong (Pokfulam, Hong Kong)
Citation: Xu, H. [徐航]. (2018). A parametric embedding approach for modeling ranking data and its applications. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR.
Abstract: The idea of parametric embedding can be dated back to 1973 when J. Neyman first introduced the notion of smooth tests of null hypothesis against alternatives in a smooth parametric family. This idea can be used to embed various non-parametric inference in a parametric family. This thesis considers several applications of the parametric embedding approach to the problem of ranking data analysis and multiple change-point detection. In the first part, two types of new models for modeling ranking data are introduced. We first focus on the problem of rank aggregation which aims at combining rankings of a set of items assigned by a sample of rankers to generate a consensus ranking. We develop a new distance-based model by allowing different weights for different rankers to incorporate the situation that rankers with different backgrounds have different cognitive levels of examining the items. Under this model, the weight associated with a ranker is used to measure the cognitive level of ranking of the items, and these weights are unobserved and exponentially distributed. Then we introduce a new class of general exponential ranking models which we label angle-based models for ranking data. A consensus score vector is assumed, which assigns scores to a set of items, where the scores reflect a consensus view of the relative preference of the items. The probability of observing a ranking is modeled to be proportional to its cosine of the angle from the consensus vector. Bayesian variational inference is employed to determine the corresponding predictive density. Model extensions to incomplete rankings and mixture models are also developed for both types of new models. Real data applications also demonstrate that the new models and their extensions can handle well different tasks for the analysis of ranking data. In the second part, we discuss the usage of parametric embedding in the multiple change-point problem.
We first propose a kernel function to detect changes in location and then we construct a composite likelihood function to search for the change points using a binary segmentation algorithm. It is shown that the estimation method for the change-point locations is consistent. The method is then compared with various existing methods in a series of simulations involving both dependent and independent data under various error distributions. Then we address the statistical problem of detecting change-points in the stress-strength reliability R=P(X<Y) in a sequence of paired variables (X,Y). Without specifying their underlying distributions, we embed this non-parametric problem into a parametric framework and apply the maximum likelihood method via a dynamic programming approach to determine the locations of the change-points in R. Under some mild conditions, we show the consistency and asymptotic properties of the estimation to the location of the change-points. Simulations and applications to real data demonstrate the usefulness of our proposed methodologies for detecting the change-points in both cases.
Degree: Doctor of Philosophy
Subject: Ranking and selection (Statistics)
Dept/Program: Statistics and Actuarial Science
Persistent Identifier: http://hdl.handle.net/10722/263181
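
The abstract mentions searching for change points with a binary segmentation algorithm. As a rough illustration only, the sketch below applies generic binary segmentation with a standard CUSUM-type mean-shift statistic. That statistic is a stand-in chosen for simplicity; it is not the kernel function or the composite likelihood proposed in the thesis. The function name `binary_segmentation` and the parameters `threshold` and `min_size` are hypothetical.

```python
from math import sqrt


def binary_segmentation(x, threshold, min_size=2):
    """Return sorted change-point indices found by recursive splitting.

    Assumption: changes are shifts in the mean, scored with a CUSUM-type
    statistic (a stand-in, not the statistic used in the thesis).
    Each returned index is the first observation of a new segment.
    """
    # Prefix sums give any segment sum in constant time.
    prefix = [0.0]
    for value in x:
        prefix.append(prefix[-1] + value)

    def best_split(lo, hi):
        """Best single split of x[lo:hi] and its statistic."""
        n = hi - lo
        total = prefix[hi] - prefix[lo]
        best_stat, best_k = 0.0, None
        # Both sides of a split must hold at least min_size points.
        for k in range(lo + min_size, hi - min_size + 1):
            m = k - lo
            left = prefix[k] - prefix[lo]
            # Difference of segment means, scaled by the segment sizes.
            stat = abs(left / m - (total - left) / (n - m)) * sqrt(m * (n - m) / n)
            if stat > best_stat:
                best_stat, best_k = stat, k
        return best_stat, best_k

    found = []
    pending = [(0, len(x))]
    while pending:
        lo, hi = pending.pop()
        stat, k = best_split(lo, hi)
        # Split only when the evidence for a change exceeds the threshold.
        if k is not None and stat > threshold:
            found.append(k)
            pending.append((lo, k))
            pending.append((k, hi))
    return sorted(found)


# Noise-free toy signal with mean shifts at positions 10 and 20.
signal = [0.0] * 10 + [5.0] * 10 + [0.0] * 10
print(binary_segmentation(signal, threshold=3.0))  # [10, 20]
```

In practice the threshold would have to be calibrated to the noise level of the data; the value 3.0 here is arbitrary and only suits the toy signal.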

 

DC Field | Value | Language
dc.contributor.advisor | Yu, PLH | -
dc.contributor.author | Xu, Hang | -
dc.contributor.author | 徐航 | -
dc.date.accessioned | 2018-10-16T07:34:53Z | -
dc.date.available | 2018-10-16T07:34:53Z | -
dc.date.issued | 2018 | -
dc.identifier.citation | Xu, H. [徐航]. (2018). A parametric embedding approach for modeling ranking data and its applications. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR. | -
dc.identifier.uri | http://hdl.handle.net/10722/263181 | -
dc.description.abstract | The idea of parametric embedding can be dated back to 1973 when J. Neyman first introduced the notion of smooth tests of null hypothesis against alternatives in a smooth parametric family. This idea can be used to embed various non-parametric inference in a parametric family. This thesis considers several applications of the parametric embedding approach to the problem of ranking data analysis and multiple change-point detection. In the first part, two types of new models for modeling ranking data are introduced. We first focus on the problem of rank aggregation which aims at combining rankings of a set of items assigned by a sample of rankers to generate a consensus ranking. We develop a new distance-based model by allowing different weights for different rankers to incorporate the situation that rankers with different backgrounds have different cognitive levels of examining the items. Under this model, the weight associated with a ranker is used to measure the cognitive level of ranking of the items, and these weights are unobserved and exponentially distributed. Then we introduce a new class of general exponential ranking models which we label angle-based models for ranking data. A consensus score vector is assumed, which assigns scores to a set of items, where the scores reflect a consensus view of the relative preference of the items. The probability of observing a ranking is modeled to be proportional to its cosine of the angle from the consensus vector. Bayesian variational inference is employed to determine the corresponding predictive density. Model extensions to incomplete rankings and mixture models are also developed for both types of new models. Real data applications also demonstrate that the new models and their extensions can handle well different tasks for the analysis of ranking data. In the second part, we discuss the usage of parametric embedding in the multiple change-point problem. We first propose a kernel function to detect changes in location and then we construct a composite likelihood function to search for the change points using a binary segmentation algorithm. It is shown that the estimation method for the change-point locations is consistent. The method is then compared with various existing methods in a series of simulations involving both dependent and independent data under various error distributions. Then we address the statistical problem of detecting change-points in the stress-strength reliability R=P(X<Y) in a sequence of paired variables (X,Y). Without specifying their underlying distributions, we embed this non-parametric problem into a parametric framework and apply the maximum likelihood method via a dynamic programming approach to determine the locations of the change-points in R. Under some mild conditions, we show the consistency and asymptotic properties of the estimation to the location of the change-points. Simulations and applications to real data demonstrate the usefulness of our proposed methodologies for detecting the change-points in both cases. | -
dc.language | eng | -
dc.publisher | The University of Hong Kong (Pokfulam, Hong Kong) | -
dc.relation.ispartof | HKU Theses Online (HKUTO) | -
dc.rights | The author retains all proprietary rights, (such as patent rights) and the right to use in future works. | -
dc.rights | This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License. | -
dc.subject.lcsh | Ranking and selection (Statistics) | -
dc.title | A parametric embedding approach for modeling ranking data and its applications | -
dc.type | PG_Thesis | -
dc.description.thesisname | Doctor of Philosophy | -
dc.description.thesislevel | Doctoral | -
dc.description.thesisdiscipline | Statistics and Actuarial Science | -
dc.description.nature | published_or_final_version | -
dc.identifier.doi | 10.5353/th_991044046594603414 | -
dc.date.hkucongregation | 2018 | -
dc.identifier.mmsid | 991044046594603414 | -
