Conference Paper: Analyzing ranking data using decision tree

Title: Analyzing ranking data using decision tree
Authors: Yu, PLH; Wan, WM; Lee, H
Keywords: Decision tree; Ranking data; Impurity function; AUC
Issue Date: 2008
Citation: The European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML PKDD 2008), Antwerp, Belgium, 15-19 September 2008. In Proceedings of ECML PKDD 2008, p. 139-156
Abstract: Ranking/preference data arises from many applications in marketing, psychology and politics. We establish a new decision tree model for the analysis of ranking data by adopting the concept of classification and regression tree [2]. We modify the existing splitting criteria, Gini and entropy, which can precisely measure the impurity of a set of ranking data. Two types of impurity measures for ranking data are introduced, namely n-wise and top-k measures. Minimal cost-complexity pruning is used to find the optimum-sized tree. In model assessment, the area under the ROC curve (AUC) is applied to evaluate the tree performance. The proposed methodology is implemented to analyze a partial ranking dataset of Inglehart's items collected in the 1993 International Social Science Programme survey. Change in importance of item values with country, age and level of education are identified.
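The abstract names a Gini-based impurity adapted to ranking data but this record does not reproduce its definition. The sketch below is therefore only an illustration under stated assumptions: it treats the ordered top-k prefix of each ranking as a category and applies the ordinary Gini index 1 - sum(p^2) to those categories. The function names (`gini`, `top_k_gini`, `split_gain`) and this particular reading of "top-k measure" are hypothetical and may differ from the paper's actual formulation.

```python
from collections import Counter


def gini(counts):
    """Ordinary Gini impurity 1 - sum(p_i^2) for a collection of category counts."""
    counts = list(counts)
    total = sum(counts)
    if total == 0:
        return 0.0  # an empty node is treated as pure
    return 1.0 - sum((c / total) ** 2 for c in counts)


def top_k_gini(rankings, k):
    """Gini impurity over the ordered top-k prefixes of the rankings.

    ASSUMPTION: each ranking is a sequence of item labels ordered from most
    to least preferred, and the first k items form one category. This is an
    illustrative reading, not necessarily the paper's top-k measure.
    """
    prefixes = Counter(tuple(r[:k]) for r in rankings)
    return gini(prefixes.values())


def split_gain(rankings, mask, k):
    """Impurity reduction of a binary split; mask[i] is True for the left child."""
    left = [r for r, m in zip(rankings, mask) if m]
    right = [r for r, m in zip(rankings, mask) if not m]
    n = len(rankings)
    weighted = (len(left) / n) * top_k_gini(left, k) \
        + (len(right) / n) * top_k_gini(right, k)
    return top_k_gini(rankings, k) - weighted


# Toy data: four respondents ranking three items.
rankings = [("A", "B", "C"), ("A", "C", "B"), ("B", "A", "C"), ("B", "C", "A")]
parent_impurity = top_k_gini(rankings, k=1)
gain = split_gain(rankings, [True, True, False, False], k=1)
```

In this toy data, a split that separates respondents whose first choice is A from those whose first choice is B removes all top-1 impurity, which is the behaviour a CART-style splitting criterion would reward.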
Persistent Identifier: http://hdl.handle.net/10722/127196

 

DC Field | Value | Language
dc.contributor.author | Yu, PLH | en_HK
dc.contributor.author | Wan, WM | en_HK
dc.contributor.author | Lee, H | en_HK
dc.date.accessioned | 2010-10-31T13:11:41Z | -
dc.date.available | 2010-10-31T13:11:41Z | -
dc.date.issued | 2008 | en_HK
dc.identifier.citation | The European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML PKDD 2008), Antwerp, Belgium, 15-19 September 2008. In Proceedings of ECML PKDD 2008, p. 139-156 | en_HK
dc.identifier.uri | http://hdl.handle.net/10722/127196 | -
dc.description.abstract | Ranking/preference data arises from many applications in marketing, psychology and politics. We establish a new decision tree model for the analysis of ranking data by adopting the concept of classification and regression tree [2]. We modify the existing splitting criteria, Gini and entropy, which can precisely measure the impurity of a set of ranking data. Two types of impurity measures for ranking data are introduced, namely n-wise and top-k measures. Minimal cost-complexity pruning is used to find the optimum-sized tree. In model assessment, the area under the ROC curve (AUC) is applied to evaluate the tree performance. The proposed methodology is implemented to analyze a partial ranking dataset of Inglehart's items collected in the 1993 International Social Science Programme survey. Change in importance of item values with country, age and level of education are identified. | -
dc.language | eng | en_HK
dc.relation.ispartof | Proceedings of ECML PKDD 2008 | en_HK
dc.subject | Decision tree | -
dc.subject | Ranking data | -
dc.subject | Impurity function | -
dc.subject | AUC | -
dc.title | Analyzing ranking data using decision tree | en_HK
dc.type | Conference_Paper | en_HK
dc.identifier.email | Yu, PLH: plhyu@hkucc.hku.hk | en_HK
dc.identifier.email | Wan, WM: h0105945@hkusua.hku.hk | en_HK
dc.identifier.email | Lee, H: honglee@graduate.hku.hk | en_HK
dc.description.nature | postprint | -
dc.identifier.hkuros | 180500 | en_HK
dc.identifier.spage | 139 | en_HK
dc.identifier.epage | 156 | en_HK
dc.description.other | The European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML PKDD 2008), Antwerp, Belgium, 15-19 September 2008. In Proceedings of ECML PKDD 2008, p. 139-156 | -
