Advanced rank-aware queries and recommendation with novel types of data

Wang, Hao; 王皓

DOI: 10.5353/th_b5270554

Appears in Collections:
- Computer Science: Theses
- HKU Theses Online

postgraduate thesis: Advanced rank-aware queries and recommendation with novel types of data

Title	Advanced rank-aware queries and recommendation with novel types of data
Authors	Wang, Hao 王皓
Advisors	Mamoulis, N; Cheung, DWL
Issue Date	2014
Publisher	The University of Hong Kong (Pokfulam, Hong Kong)
Citation	Wang, H. [王皓]. (2014). Advanced rank-aware queries and recommendation with novel types of data. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR. Retrieved from http://dx.doi.org/10.5353/th_b5270554
Abstract	We live in an era of rich data, rich not only in volume but also in the variety of its sources and content. Efficient search, management, and exploitation of data have been a major direction of database research for decades. In this thesis, three challenging problems are proposed and studied, targeting (i) time series data, (ii) user preference data, and (iii) location-based social network data, respectively, and efficient solutions are provided for the corresponding real-life applications. First, durability queries, which identify objects that have durable quality over time, are studied in historical time series databases. For example, a sociologist may be interested in the top 10 web search terms during the period of some historical event; the police may seek vehicles that move close to a suspect 70% of the time during a certain time interval. Such durable top-k (DTop-k) and durable k-nearest neighbor (DkNN) queries can be viewed as natural extensions of the standard snapshot top-k and NN queries to timestamped sequences of values or locations. Although their snapshot counterparts have been studied extensively, there is little prior work that addresses this new class of durability queries. Efficient and scalable algorithms are proposed based on novel indexing techniques. Next, an efficient solution to k-nearest neighbor search over top-m lists is investigated. A top-m list is a ranking of m items, typically representing some user’s preference over these items. For example, a user may have a list of her 10 favourite books; the result from a search engine is typically a list of webpages ranked according to their relevance to some keywords. The search problem aims at extracting the k top-m lists from the database that are the “closest” to some query list, where closeness is evaluated using commonly used measures such as Fagin’s intersection metric, Spearman’s footrule, and Kendall’s tau. Despite the importance of such queries, there is little prior work suggesting any efficient solution. In this thesis, a unified framework is proposed to answer such queries efficiently. Finally, the problem of top-N venue recommendation in location-based social networks (LBSNs), which recommends new venues to users, is studied. As an increasing number of users partake in LBSNs, the recommendation problem in this setting has attracted significant attention in research and in practical applications. The detailed information about past user behavior that is traced by the LBSN differentiates the problem significantly from its traditional settings. The spatial nature of past user behavior, together with information about users’ social interactions with other users, provides a richer background for building a more accurate and expressive recommendation model. Although there have been extensive studies on recommender systems working with user-item ratings, GPS trajectories, and other types of data, there are very few approaches that exploit the unique properties of LBSN user check-in data. In this thesis, effective and efficient algorithms that create recommendations are proposed based on such properties.
Degree	Doctor of Philosophy
Subject	Data mining; Time-series analysis - Computer programs; Social networks - Data processing
Dept/Program	Computer Science
Persistent Identifier	http://hdl.handle.net/10722/206672
HKU Library Item ID	b5270554
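
The durable top-k query described in the abstract can be illustrated with a brute-force baseline. This is only a sketch under stated assumptions, not the indexed algorithm proposed in the thesis: the function name `durable_top_k`, the dictionary layout of `series`, the half-open interval `[t_start, t_end)`, and the durability threshold `r` (modelled on the "70% of the time" example) are all hypothetical choices made for illustration.

```python
import heapq
from collections import Counter


def durable_top_k(series, t_start, t_end, k, r):
    """Return ids of objects that are in the snapshot top-k for at least
    a fraction r of the timestamps in [t_start, t_end).

    series maps an object id to a list of values, one per timestamp
    (assumed layout; all lists must cover the queried interval).
    """
    span = t_end - t_start
    if span <= 0:
        raise ValueError("interval must contain at least one timestamp")

    hits = Counter()
    for t in range(t_start, t_end):
        # Snapshot top-k at timestamp t: the k objects with the largest value.
        top = heapq.nlargest(k, series, key=lambda obj: series[obj][t])
        hits.update(top)

    # Keep objects whose fraction of top-k appearances reaches the threshold.
    return sorted(obj for obj, count in hits.items() if count / span >= r)


# Hypothetical data: three objects observed at four timestamps.
series = {
    "a": [9, 8, 7, 1],
    "b": [5, 9, 8, 7],
    "c": [1, 2, 9, 8],
}
print(durable_top_k(series, 0, 4, k=2, r=0.75))  # ['b']
print(durable_top_k(series, 0, 4, k=2, r=0.5))   # ['a', 'b', 'c']
```

The baseline evaluates one snapshot top-k query per timestamp, so its cost grows linearly with the length of the interval; avoiding that scan is the kind of saving an index would be expected to provide.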

DC Field	Value	Language
dc.contributor.advisor	Mamoulis, N	-
dc.contributor.advisor	Cheung, DWL	-
dc.contributor.author	Wang, Hao	-
dc.contributor.author	王皓	-
dc.date.accessioned	2014-11-25T03:53:15Z	-
dc.date.available	2014-11-25T03:53:15Z	-
dc.date.issued	2014	-
dc.identifier.citation	Wang, H. [王皓]. (2014). Advanced rank-aware queries and recommendation with novel types of data. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR. Retrieved from http://dx.doi.org/10.5353/th_b5270554	-
dc.identifier.uri	http://hdl.handle.net/10722/206672	-
dc.description.abstract	We live in an era of rich data, rich not only in volume but also in the variety of its sources and content. Efficient search, management, and exploitation of data have been a major direction of database research for decades. In this thesis, three challenging problems are proposed and studied, targeting (i) time series data, (ii) user preference data, and (iii) location-based social network data, respectively, and efficient solutions are provided for the corresponding real-life applications. First, durability queries, which identify objects that have durable quality over time, are studied in historical time series databases. For example, a sociologist may be interested in the top 10 web search terms during the period of some historical event; the police may seek vehicles that move close to a suspect 70% of the time during a certain time interval. Such durable top-k (DTop-k) and durable k-nearest neighbor (DkNN) queries can be viewed as natural extensions of the standard snapshot top-k and NN queries to timestamped sequences of values or locations. Although their snapshot counterparts have been studied extensively, there is little prior work that addresses this new class of durability queries. Efficient and scalable algorithms are proposed based on novel indexing techniques. Next, an efficient solution to k-nearest neighbor search over top-m lists is investigated. A top-m list is a ranking of m items, typically representing some user’s preference over these items. For example, a user may have a list of her 10 favourite books; the result from a search engine is typically a list of webpages ranked according to their relevance to some keywords. The search problem aims at extracting the k top-m lists from the database that are the “closest” to some query list, where closeness is evaluated using commonly used measures such as Fagin’s intersection metric, Spearman’s footrule, and Kendall’s tau. Despite the importance of such queries, there is little prior work suggesting any efficient solution. In this thesis, a unified framework is proposed to answer such queries efficiently. Finally, the problem of top-N venue recommendation in location-based social networks (LBSNs), which recommends new venues to users, is studied. As an increasing number of users partake in LBSNs, the recommendation problem in this setting has attracted significant attention in research and in practical applications. The detailed information about past user behavior that is traced by the LBSN differentiates the problem significantly from its traditional settings. The spatial nature of past user behavior, together with information about users’ social interactions with other users, provides a richer background for building a more accurate and expressive recommendation model. Although there have been extensive studies on recommender systems working with user-item ratings, GPS trajectories, and other types of data, there are very few approaches that exploit the unique properties of LBSN user check-in data. In this thesis, effective and efficient algorithms that create recommendations are proposed based on such properties.	-
dc.language	eng	-
dc.publisher	The University of Hong Kong (Pokfulam, Hong Kong)	-
dc.relation.ispartof	HKU Theses Online (HKUTO)	-
dc.rights	This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.	-
dc.rights	The author retains all proprietary rights, (such as patent rights) and the right to use in future works.	-
dc.subject.lcsh	Data mining	-
dc.subject.lcsh	Time-series analysis - Computer programs	-
dc.subject.lcsh	Social networks - Data processing	-
dc.title	Advanced rank-aware queries and recommendation with novel types of data	-
dc.type	PG_Thesis	-
dc.identifier.hkul	b5270554	-
dc.description.thesisname	Doctor of Philosophy	-
dc.description.thesislevel	Doctoral	-
dc.description.thesisdiscipline	Computer Science	-
dc.description.nature	published_or_final_version	-
dc.identifier.doi	10.5353/th_b5270554	-
dc.identifier.mmsid	991038814969703414	-
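
The k-nearest neighbor search over top-m lists mentioned in the abstract can be sketched with a linear scan. The abstract names Spearman's footrule as one possible closeness measure; the variant below, which places an item absent from a list at position m + 1, is one common adaptation for top-m lists and is an assumption here, as are the function names `footrule` and `knn_lists` and the sample data. It is a naive baseline, not the unified framework proposed in the thesis.

```python
import heapq


def footrule(list_a, list_b):
    """Spearman's footrule between two top-m lists of equal length m.

    Positions are 1-based; an item missing from a list is treated as if
    it were ranked at position m + 1 (assumed variant).
    """
    m = len(list_a)
    if len(list_b) != m:
        raise ValueError("both lists must have the same length m")

    pos_a = {item: i + 1 for i, item in enumerate(list_a)}
    pos_b = {item: i + 1 for i, item in enumerate(list_b)}
    missing = m + 1

    # Sum of absolute rank differences over every item in either list.
    return sum(
        abs(pos_a.get(item, missing) - pos_b.get(item, missing))
        for item in pos_a.keys() | pos_b.keys()
    )


def knn_lists(database, query, k):
    """Return the ids of the k lists in database closest to query."""
    return heapq.nsmallest(
        k, database, key=lambda name: footrule(database[name], query)
    )


# Hypothetical preference lists with m = 3.
database = {
    "u1": ["x", "y", "z"],
    "u2": ["y", "x", "z"],
    "u3": ["p", "q", "r"],
}
query = ["x", "y", "z"]
print(footrule(database["u2"], query))  # 2
print(knn_lists(database, query, k=2))  # ['u1', 'u2']
```

Swapping `footrule` for another measure, such as a Kendall's tau variant, only requires changing the `key` function, which is one reason a single search routine parameterised by the distance is a convenient way to experiment with these queries.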
