A parametric embedding approach for modeling ranking data and its applications

Xu, Hang; 徐航

DOI: 10.5353/th_991044046594603414

Appears in Collections:
- HKU Theses Online
- Statistics & Actuarial Science: Theses

postgraduate thesis: A parametric embedding approach for modeling ranking data and its applications

Title	A parametric embedding approach for modeling ranking data and its applications
Authors	Xu, Hang 徐航
Advisors	Yu, PLH
Issue Date	2018
Publisher	The University of Hong Kong (Pokfulam, Hong Kong)
Citation	Xu, H. [徐航]. (2018). A parametric embedding approach for modeling ranking data and its applications. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR.
Abstract	The idea of parametric embedding dates back to 1937, when J. Neyman first introduced the notion of smooth tests of a null hypothesis against alternatives in a smooth parametric family. This idea can be used to embed various non-parametric inference problems in a parametric family. This thesis considers several applications of the parametric embedding approach to ranking data analysis and multiple change-point detection. In the first part, two new types of models for ranking data are introduced. We first focus on the problem of rank aggregation, which aims at combining the rankings of a set of items assigned by a sample of rankers to generate a consensus ranking. We develop a new distance-based model that allows different weights for different rankers, to accommodate the situation in which rankers with different backgrounds have different cognitive levels in examining the items. Under this model, the weight associated with a ranker measures that ranker's cognitive level in ranking the items, and these weights are unobserved and exponentially distributed. Then we introduce a new class of general exponential ranking models, which we label angle-based models for ranking data. A consensus score vector is assumed, which assigns scores to a set of items, where the scores reflect a consensus view of the relative preference of the items. The probability of observing a ranking is modeled to be proportional to the cosine of its angle from the consensus vector. Bayesian variational inference is employed to determine the corresponding predictive density. Model extensions to incomplete rankings and mixture models are also developed for both types of new models. Real data applications demonstrate that the new models and their extensions handle different tasks in the analysis of ranking data well. In the second part, we discuss the use of parametric embedding in the multiple change-point problem. We first propose a kernel function to detect changes in location, and then construct a composite likelihood function to search for the change-points using a binary segmentation algorithm. It is shown that the estimation method for the change-point locations is consistent. The method is then compared with various existing methods in a series of simulations involving both dependent and independent data under various error distributions. Then we address the statistical problem of detecting change-points in the stress-strength reliability R=P(X<Y) in a sequence of paired variables (X,Y). Without specifying their underlying distributions, we embed this non-parametric problem into a parametric framework and apply the maximum likelihood method via a dynamic programming approach to determine the locations of the change-points in R. Under some mild conditions, we show the consistency and asymptotic properties of the estimates of the change-point locations. Simulations and applications to real data demonstrate the usefulness of our proposed methodologies for detecting the change-points in both cases.
Degree	Doctor of Philosophy
Subject	Ranking and selection (Statistics)
Dept/Program	Statistics and Actuarial Science
Persistent Identifier	http://hdl.handle.net/10722/263181

DC Field	Value	Language
dc.contributor.advisor	Yu, PLH	-
dc.contributor.author	Xu, Hang	-
dc.contributor.author	徐航	-
dc.date.accessioned	2018-10-16T07:34:53Z	-
dc.date.available	2018-10-16T07:34:53Z	-
dc.date.issued	2018	-
dc.identifier.citation	Xu, H. [徐航]. (2018). A parametric embedding approach for modeling ranking data and its applications. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR.	-
dc.identifier.uri	http://hdl.handle.net/10722/263181	-
dc.description.abstract	The idea of parametric embedding dates back to 1937, when J. Neyman first introduced the notion of smooth tests of a null hypothesis against alternatives in a smooth parametric family. This idea can be used to embed various non-parametric inference problems in a parametric family. This thesis considers several applications of the parametric embedding approach to ranking data analysis and multiple change-point detection. In the first part, two new types of models for ranking data are introduced. We first focus on the problem of rank aggregation, which aims at combining the rankings of a set of items assigned by a sample of rankers to generate a consensus ranking. We develop a new distance-based model that allows different weights for different rankers, to accommodate the situation in which rankers with different backgrounds have different cognitive levels in examining the items. Under this model, the weight associated with a ranker measures that ranker's cognitive level in ranking the items, and these weights are unobserved and exponentially distributed. Then we introduce a new class of general exponential ranking models, which we label angle-based models for ranking data. A consensus score vector is assumed, which assigns scores to a set of items, where the scores reflect a consensus view of the relative preference of the items. The probability of observing a ranking is modeled to be proportional to the cosine of its angle from the consensus vector. Bayesian variational inference is employed to determine the corresponding predictive density. Model extensions to incomplete rankings and mixture models are also developed for both types of new models. Real data applications demonstrate that the new models and their extensions handle different tasks in the analysis of ranking data well. In the second part, we discuss the use of parametric embedding in the multiple change-point problem. We first propose a kernel function to detect changes in location, and then construct a composite likelihood function to search for the change-points using a binary segmentation algorithm. It is shown that the estimation method for the change-point locations is consistent. The method is then compared with various existing methods in a series of simulations involving both dependent and independent data under various error distributions. Then we address the statistical problem of detecting change-points in the stress-strength reliability R=P(X<Y) in a sequence of paired variables (X,Y). Without specifying their underlying distributions, we embed this non-parametric problem into a parametric framework and apply the maximum likelihood method via a dynamic programming approach to determine the locations of the change-points in R. Under some mild conditions, we show the consistency and asymptotic properties of the estimates of the change-point locations. Simulations and applications to real data demonstrate the usefulness of our proposed methodologies for detecting the change-points in both cases.	-
dc.language	eng	-
dc.publisher	The University of Hong Kong (Pokfulam, Hong Kong)	-
dc.relation.ispartof	HKU Theses Online (HKUTO)	-
dc.rights	The author retains all proprietary rights (such as patent rights) and the right to use in future works.	-
dc.rights	This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.	-
dc.subject.lcsh	Ranking and selection (Statistics)	-
dc.title	A parametric embedding approach for modeling ranking data and its applications	-
dc.type	PG_Thesis	-
dc.description.thesisname	Doctor of Philosophy	-
dc.description.thesislevel	Doctoral	-
dc.description.thesisdiscipline	Statistics and Actuarial Science	-
dc.description.nature	published_or_final_version	-
dc.identifier.doi	10.5353/th_991044046594603414	-
dc.date.hkucongregation	2018	-
dc.identifier.mmsid	991044046594603414	-
