postgraduate thesis: A parametric embedding approach for modeling ranking data and its applications
Title | A parametric embedding approach for modeling ranking data and its applications |
---|---|
Authors | Xu, Hang [徐航] |
Advisors | Yu, PLH |
Issue Date | 2018 |
Publisher | The University of Hong Kong (Pokfulam, Hong Kong) |
Citation | Xu, H. [徐航]. (2018). A parametric embedding approach for modeling ranking data and its applications. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR. |
Abstract | The idea of parametric embedding can be dated back to 1973 when J. Neyman first introduced the notion of smooth tests of null hypothesis against alternatives in a smooth parametric family. This idea can be used to embed various non-parametric inference in a parametric family. This thesis considers several applications of the parametric embedding approach to the problem of ranking data analysis and multiple change-point detection.
In the first part, two types of new models for modeling ranking data are introduced. We first focus on the problem of rank aggregation which aims at combining rankings of a set of items assigned by a sample of rankers to generate a consensus ranking. We develop a new distance-based model by allowing different weights for different rankers to incorporate the situation that rankers with different backgrounds have different cognitive levels of examining the items. Under this model, the weight associated with a ranker is used to measure the cognitive level of ranking of the items, and these weights are unobserved and exponentially distributed. Then we introduce a new class of general exponential ranking models which we label angle-based models for ranking data. A consensus score vector is assumed, which assigns scores to a set of items, where the scores reflect a consensus view of the relative preference of the items. The probability of observing a ranking is modeled to be proportional to its cosine of the angle from the consensus vector. Bayesian variational inference is employed to determine the corresponding predictive density. Model extensions to incomplete rankings and mixture models are also developed for both types of new models. Real data applications also demonstrate that the new models and their extensions can handle well different tasks for the analysis of ranking data.
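In the second part, we discuss the usage of parametric embedding in the multiple change-point problem. We first propose a kernel function to detect changes in location and then we construct a composite likelihood function to search for the change points using a binary segmentation algorithm. It is shown that the estimation method for the change-point locations is consistent. The method is then compared with various existing methods in a series of simulations involving both dependent and independent data under various error distributions. Then we address the statistical problem of detecting change-points in the stress-strength reliability R=P(X<Y) in a sequence of paired variables (X,Y). Without specifying their underlying distributions, we embed this non-parametric problem into a parametric framework and apply the maximum likelihood method via a dynamic programming approach to determine the locations of the change-points in R. Under some mild conditions, we show the consistency and asymptotic properties of the estimation to the location of the change-points. Simulations and applications to real data demonstrate the usefulness of our proposed methodologies for detecting the change-points in both cases. |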
Degree | Doctor of Philosophy |
Subject | Ranking and selection (Statistics) |
Dept/Program | Statistics and Actuarial Science |
Persistent Identifier | http://hdl.handle.net/10722/263181 |
DC Field | Value | Language |
---|---|---|
dc.contributor.advisor | Yu, PLH | - |
dc.contributor.author | Xu, Hang | - |
dc.contributor.author | 徐航 | - |
dc.date.accessioned | 2018-10-16T07:34:53Z | - |
dc.date.available | 2018-10-16T07:34:53Z | - |
dc.date.issued | 2018 | - |
dc.identifier.citation | Xu, H. [徐航]. (2018). A parametric embedding approach for modeling ranking data and its applications. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR. | - |
dc.identifier.uri | http://hdl.handle.net/10722/263181 | - |
dc.description.abstract | The idea of parametric embedding can be dated back to 1973 when J. Neyman first introduced the notion of smooth tests of null hypothesis against alternatives in a smooth parametric family. This idea can be used to embed various non-parametric inference in a parametric family. This thesis considers several applications of the parametric embedding approach to the problem of ranking data analysis and multiple change-point detection. In the first part, two types of new models for modeling ranking data are introduced. We first focus on the problem of rank aggregation which aims at combining rankings of a set of items assigned by a sample of rankers to generate a consensus ranking. We develop a new distance-based model by allowing different weights for different rankers to incorporate the situation that rankers with different backgrounds have different cognitive levels of examining the items. Under this model, the weight associated with a ranker is used to measure the cognitive level of ranking of the items, and these weights are unobserved and exponentially distributed. Then we introduce a new class of general exponential ranking models which we label angle-based models for ranking data. A consensus score vector is assumed, which assigns scores to a set of items, where the scores reflect a consensus view of the relative preference of the items. The probability of observing a ranking is modeled to be proportional to its cosine of the angle from the consensus vector. Bayesian variational inference is employed to determine the corresponding predictive density. Model extensions to incomplete rankings and mixture models are also developed for both types of new models. Real data applications also demonstrate that the new models and their extensions can handle well different tasks for the analysis of ranking data. In the second part, we discuss the usage of parametric embedding in the multiple change-point problem. We first propose a kernel function to detect changes in location and then we construct a composite likelihood function to search for the change points using a binary segmentation algorithm. It is shown that the estimation method for the change-point locations is consistent. The method is then compared with various existing methods in a series of simulations involving both dependent and independent data under various error distributions. Then we address the statistical problem of detecting change-points in the stress-strength reliability R=P(X<Y) in a sequence of paired variables (X,Y). Without specifying their underlying distributions, we embed this non-parametric problem into a parametric framework and apply the maximum likelihood method via a dynamic programming approach to determine the locations of the change-points in R. Under some mild conditions, we show the consistency and asymptotic properties of the estimation to the location of the change-points. Simulations and applications to real data demonstrate the usefulness of our proposed methodologies for detecting the change-points in both cases. | - |
dc.language | eng | - |
dc.publisher | The University of Hong Kong (Pokfulam, Hong Kong) | - |
dc.relation.ispartof | HKU Theses Online (HKUTO) | - |
dc.rights | The author retains all proprietary rights (such as patent rights) and the right to use in future works. | - |
dc.rights | This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License. | - |
dc.subject.lcsh | Ranking and selection (Statistics) | - |
dc.title | A parametric embedding approach for modeling ranking data and its applications | - |
dc.type | PG_Thesis | - |
dc.description.thesisname | Doctor of Philosophy | - |
dc.description.thesislevel | Doctoral | - |
dc.description.thesisdiscipline | Statistics and Actuarial Science | - |
dc.description.nature | published_or_final_version | - |
dc.identifier.doi | 10.5353/th_991044046594603414 | - |
dc.date.hkucongregation | 2018 | - |
dc.identifier.mmsid | 991044046594603414 | - |