
Conference Paper: Reliable Retrieval of Top-k Tags

Title: Reliable Retrieval of Top-k Tags
Authors: Xu, Y; Cheng, CK; Zheng, Y
Issue Date: 2017
Publisher: Springer International Publishing.
Citation: The 18th International Conference on Web Information Systems Engineering, Puschino, Russia, 7-11 October 2017. In Bouguettaya, A ... (et al) (Eds.). Web Information Systems Engineering – WISE 2017 (Lecture Notes in Computer Science, v. 10569), p. 330-346. Cham: Springer International Publishing, 2017
Abstract: Collaborative tagging systems, such as Flickr and Del.icio.us, allow users to provide keyword labels, or tags, for various Internet resources (e.g., photos, songs, and bookmarks). These tags, which provide a rich source of information, have been used in important applications such as resource searching, webpage clustering, etc. However, tags are provided by casual users, and so their quality cannot be guaranteed. In this paper, we examine a question: given a resource r and a set of user-provided tags associated with r, can r be correctly described by the k most frequent tags? To answer this question, we develop the metric top-k sliding average similarity (top-k SAS) which measures the reliability of k most frequent tags. One threshold is then set to estimate whether the reliability is sufficient for retrieving the top-k tags. Our experiments on real datasets show that the threshold-based evaluation on top-k SAS is effective and efficient to determine whether the k most frequent tags can be considered as high-quality top-k tags for r. Experiments also indicate that setting an appropriate threshold is challenging. The threshold-based strategy is sensitive to a little change of the threshold. To solve this problem, we introduce a parameter-free evaluation strategy that utilizes machine learning models to estimate whether the k most frequent tags are qualified to be the top-k tags. Experiment results demonstrate that the learning-based method achieves comparable performance to the threshold-based method, while overcoming the difficulty of setting a threshold.
Persistent Identifier: http://hdl.handle.net/10722/243246
ISBN: 978-3-319-68782-7
ISSN: 0302-9743
2020 SCImago Journal Rankings: 0.249
ISI Accession Number ID: WOS:000739665100023
Series/Report no.: Information Systems and Applications, incl. Internet/Web, and HCI
Series/Report no.: Lecture Notes in Computer Science book series (LNCS) ; v. 10569

 

DC Field | Value | Language
dc.contributor.author | Xu, Y | -
dc.contributor.author | Cheng, CK | -
dc.contributor.author | Zheng, Y | -
dc.date.accessioned | 2017-08-25T02:52:10Z | -
dc.date.available | 2017-08-25T02:52:10Z | -
dc.date.issued | 2017 | -
dc.identifier.citation | The 18th International Conference on Web Information Systems Engineering, Puschino, Russia, 7-11 October 2017. In Bouguettaya, A ... (et al) (Eds.). Web Information Systems Engineering – WISE 2017 (Lecture Notes in Computer Science, v. 10569), p. 330-346. Cham: Springer International Publishing, 2017 | -
dc.identifier.isbn | 978-3-319-68782-7 | -
dc.identifier.issn | 0302-9743 | -
dc.identifier.uri | http://hdl.handle.net/10722/243246 | -
dc.description.abstract | Collaborative tagging systems, such as Flickr and Del.icio.us, allow users to provide keyword labels, or tags, for various Internet resources (e.g., photos, songs, and bookmarks). These tags, which provide a rich source of information, have been used in important applications such as resource searching, webpage clustering, etc. However, tags are provided by casual users, and so their quality cannot be guaranteed. In this paper, we examine a question: given a resource r and a set of user-provided tags associated with r, can r be correctly described by the k most frequent tags? To answer this question, we develop the metric top-k sliding average similarity (top-k SAS) which measures the reliability of k most frequent tags. One threshold is then set to estimate whether the reliability is sufficient for retrieving the top-k tags. Our experiments on real datasets show that the threshold-based evaluation on top-k SAS is effective and efficient to determine whether the k most frequent tags can be considered as high-quality top-k tags for r. Experiments also indicate that setting an appropriate threshold is challenging. The threshold-based strategy is sensitive to a little change of the threshold. To solve this problem, we introduce a parameter-free evaluation strategy that utilizes machine learning models to estimate whether the k most frequent tags are qualified to be the top-k tags. Experiment results demonstrate that the learning-based method achieves comparable performance to the threshold-based method, while overcoming the difficulty of setting a threshold. | -
dc.language | eng | -
dc.publisher | Springer International Publishing. | -
dc.relation.ispartof | Web Information Systems Engineering – WISE 2017 | -
dc.relation.ispartofseries | Information Systems and Applications, incl. Internet/Web, and HCI | -
dc.relation.ispartofseries | Lecture Notes in Computer Science book series (LNCS) ; v. 10569 | -
dc.title | Reliable Retrieval of Top-k Tags | -
dc.type | Conference_Paper | -
dc.identifier.email | Cheng, CK: ckcheng@cs.hku.hk | -
dc.identifier.authority | Cheng, CK=rp00074 | -
dc.description.nature | link_to_subscribed_fulltext | -
dc.identifier.doi | 10.1007/978-3-319-68783-4_23 | -
dc.identifier.scopus | eid_2-s2.0-85031417029 | -
dc.identifier.hkuros | 275517 | -
dc.identifier.spage | 330 | -
dc.identifier.epage | 346 | -
dc.identifier.eissn | 1611-3349 | -
dc.identifier.isi | WOS:000739665100023 | -
dc.publisher.place | Cham | -
dc.identifier.issnl | 0302-9743 | -
