Conference Paper: Analyzing ranking data using decision tree
Title | Analyzing ranking data using decision tree |
---|---|
Authors | Yu, PLH; Wan, WM; Lee, H |
Keywords | Decision tree; Ranking data; Impurity function; AUC |
Issue Date | 2008 |
Citation | The European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML PKDD 2008), Antwerp, Belgium, 15-19 September 2008. In Proceedings of ECML PKDD 2008, p. 139-156 |
Abstract | Ranking/preference data arises in many applications in marketing, psychology and politics. We establish a new decision tree model for the analysis of ranking data by adopting the concept of the classification and regression tree [2]. We modify the existing splitting criteria, Gini and entropy, so that they can precisely measure the impurity of a set of ranking data. Two types of impurity measures for ranking data are introduced, namely n-wise and top-k measures. Minimal cost-complexity pruning is used to find the optimum-sized tree. In model assessment, the area under the ROC curve (AUC) is applied to evaluate the tree performance. The proposed methodology is applied to a partial ranking dataset of Inglehart's items collected in the 1993 International Social Science Programme survey. Changes in the importance of item values with country, age and level of education are identified. |
Persistent Identifier | http://hdl.handle.net/10722/127196 |
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Yu, PLH | en_HK |
dc.contributor.author | Wan, WM | en_HK |
dc.contributor.author | Lee, H | en_HK |
dc.date.accessioned | 2010-10-31T13:11:41Z | - |
dc.date.available | 2010-10-31T13:11:41Z | - |
dc.date.issued | 2008 | en_HK |
dc.identifier.citation | The European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML PKDD 2008), Antwerp, Belgium, 15-19 September 2008. In Proceedings of ECML PKDD 2008, p. 139-156 | en_HK |
dc.identifier.uri | http://hdl.handle.net/10722/127196 | - |
dc.description.abstract | Ranking/preference data arises in many applications in marketing, psychology and politics. We establish a new decision tree model for the analysis of ranking data by adopting the concept of the classification and regression tree [2]. We modify the existing splitting criteria, Gini and entropy, so that they can precisely measure the impurity of a set of ranking data. Two types of impurity measures for ranking data are introduced, namely n-wise and top-k measures. Minimal cost-complexity pruning is used to find the optimum-sized tree. In model assessment, the area under the ROC curve (AUC) is applied to evaluate the tree performance. The proposed methodology is applied to a partial ranking dataset of Inglehart's items collected in the 1993 International Social Science Programme survey. Changes in the importance of item values with country, age and level of education are identified. | - |
dc.language | eng | en_HK |
dc.relation.ispartof | Proceedings of ECML PKDD 2008 | en_HK |
dc.subject | Decision tree | - |
dc.subject | Ranking data | - |
dc.subject | Impurity function | - |
dc.subject | AUC | - |
dc.title | Analyzing ranking data using decision tree | en_HK |
dc.type | Conference_Paper | en_HK |
dc.identifier.email | Yu, PLH: plhyu@hkucc.hku.hk | en_HK |
dc.identifier.email | Wan, WM: h0105945@hkusua.hku.hk | en_HK |
dc.identifier.email | Lee, H: honglee@graduate.hku.hk | en_HK |
dc.description.nature | postprint | - |
dc.identifier.hkuros | 180500 | en_HK |
dc.identifier.spage | 139 | en_HK |
dc.identifier.epage | 156 | en_HK |
dc.description.other | The European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML PKDD 2008), Antwerp, Belgium, 15-19 September 2008. In Proceedings of ECML PKDD 2008, p. 139-156 | - |
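
The abstract names a top-k impurity measure based on Gini, but this record does not give its formula. As a rough illustration only, the sketch below assumes that the impurity of a node is the ordinary Gini index computed over the distinct ordered top-k prefixes of the rankings in that node. The function names (`gini`, `top_k_gini`, `split_gain`) and this reading of "top-k" are assumptions made for the example, not the authors' definitions.

```python
from collections import Counter
from typing import Hashable, Sequence


def gini(labels: Sequence[Hashable]) -> float:
    """Ordinary Gini index 1 - sum(p_i^2) over the label frequencies."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = Counter(labels)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def top_k_gini(rankings: Sequence[Sequence[Hashable]], k: int) -> float:
    """Assumed top-k impurity: Gini over the ordered first k items.

    Each ranking lists items from most to least preferred.
    """
    return gini([tuple(r[:k]) for r in rankings])


def split_gain(parent, left, right, k: int) -> float:
    """Impurity reduction of a binary split, weighted by child size."""
    n = len(parent)
    return (
        top_k_gini(parent, k)
        - len(left) / n * top_k_gini(left, k)
        - len(right) / n * top_k_gini(right, k)
    )
```

Under this assumption, a split that separates respondents by their first-ranked item removes all top-1 impurity, which is the kind of split a tree grown with such a criterion would favour.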