File Download
  Links for fulltext
     (May Require Subscription)
Supplementary

postgraduate thesis: Effective and efficient community search over large attributed graphs

TitleEffective and efficient community search over large attributed graphs
Authors
Advisors
Advisor(s):Cheng, CK
Issue Date2017
PublisherThe University of Hong Kong (Pokfulam, Hong Kong)
Citation
Fang, Y. [方一向]. (2017). Effective and efficient community search over large attributed graphs. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR.
AbstractCommunities, which are prevalent in attributed graphs such as social networks and knowledge bases, can be used in emerging applications such as product advertisement and setting up of social events. Given a graph G and a vertex q ∈ G, the community search query returns a subgraph of G that contains vertices related to q. In this thesis, we study community search over three common attributed graphs, where (1) vertices are associated with keywords; (2) vertices are augmented with location information; and (3) edges are directed. Our goal is to study effective and efficient algorithms for finding communities over these graphs. For keyword-based attributed graphs, we investigate the keyword-based attributed community query (or KAC query), which returns a KAC for a query vertex. A KAC satisfies both structure cohesiveness (i.e., its vertices are tightly connected) and keyword cohesiveness (i.e., its vertices share common keywords). The KAC query enables a better understanding of how and why a community is formed (e.g., members of a KAC have a common interest in music, because they all have the same keyword ``music''). To enable efficient KAC search, we develop the CL-tree index and three fast algorithms. We evaluate our solutions on six large graphs. Our results show that KAC search is more effective and efficient than existing community retrieval approaches. Moreover, a KAC contains more precise and personalized information than those of existing methods. For spatial-based attributed graphs, we aim to find the spatial-aware community (or SAC), whose vertices are close structurally and spatially, for a query vertex in an online manner. To enable efficient SAC search, we develop two exact and three approximation algorithms. We evaluate SAC search on both large real and synthetic datasets. The experimental results show that SACs achieve higher effectiveness than communities returned by existing community retrieval solutions. Moreover, our solutions are faster than the baseline approaches. For directed graphs, we study the problem of finding a community, in which each vertex has some in-neighbors and out-neighbors. We call this problem community search on directed graph (or CSD problem). Given a directed graph, our goal is to find a dense subgraph of a query vertex, in which each vertex has both high in-degree and out-degree. To solve the CSD problem, we propose the concept of D-core number and develop index-based algorithms. The experimental results on real large graphs show that our solutions are able to find better communities than existing solutions. Moreover, our algorithms are efficient and scalable. We further develop the C-Explorer system to assist users in extracting, visualizing, and analyzing KACs. C-Explorer provides online and interactive community retrieval facilities, allowing a user to view her interesting communities for a query vertex. Moreover, C-Explorer implements several state-of-the-art community retrieval algorithms, as well as functions for analyzing their effectiveness.
DegreeDoctor of Philosophy
SubjectGraph algorithms
Data mining
Dept/ProgramComputer Science
Persistent Identifierhttp://hdl.handle.net/10722/250721

 

DC FieldValueLanguage
dc.contributor.advisorCheng, CK-
dc.contributor.authorFang, Yixiang-
dc.contributor.author方一向-
dc.date.accessioned2018-01-26T01:59:22Z-
dc.date.available2018-01-26T01:59:22Z-
dc.date.issued2017-
dc.identifier.citationFang, Y. [方一向]. (2017). Effective and efficient community search over large attributed graphs. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR.-
dc.identifier.urihttp://hdl.handle.net/10722/250721-
dc.description.abstractCommunities, which are prevalent in attributed graphs such as social networks and knowledge bases, can be used in emerging applications such as product advertisement and setting up of social events. Given a graph G and a vertex q ∈ G, the community search query returns a subgraph of G that contains vertices related to q. In this thesis, we study community search over three common attributed graphs, where (1) vertices are associated with keywords; (2) vertices are augmented with location information; and (3) edges are directed. Our goal is to study effective and efficient algorithms for finding communities over these graphs. For keyword-based attributed graphs, we investigate the keyword-based attributed community query (or KAC query), which returns a KAC for a query vertex. A KAC satisfies both structure cohesiveness (i.e., its vertices are tightly connected) and keyword cohesiveness (i.e., its vertices share common keywords). The KAC query enables a better understanding of how and why a community is formed (e.g., members of a KAC have a common interest in music, because they all have the same keyword ``music''). To enable efficient KAC search, we develop the CL-tree index and three fast algorithms. We evaluate our solutions on six large graphs. Our results show that KAC search is more effective and efficient than existing community retrieval approaches. Moreover, a KAC contains more precise and personalized information than those of existing methods. For spatial-based attributed graphs, we aim to find the spatial-aware community (or SAC), whose vertices are close structurally and spatially, for a query vertex in an online manner. To enable efficient SAC search, we develop two exact and three approximation algorithms. We evaluate SAC search on both large real and synthetic datasets. The experimental results show that SACs achieve higher effectiveness than communities returned by existing community retrieval solutions. Moreover, our solutions are faster than the baseline approaches. For directed graphs, we study the problem of finding a community, in which each vertex has some in-neighbors and out-neighbors. We call this problem community search on directed graph (or CSD problem). Given a directed graph, our goal is to find a dense subgraph of a query vertex, in which each vertex has both high in-degree and out-degree. To solve the CSD problem, we propose the concept of D-core number and develop index-based algorithms. The experimental results on real large graphs show that our solutions are able to find better communities than existing solutions. Moreover, our algorithms are efficient and scalable. We further develop the C-Explorer system to assist users in extracting, visualizing, and analyzing KACs. C-Explorer provides online and interactive community retrieval facilities, allowing a user to view her interesting communities for a query vertex. Moreover, C-Explorer implements several state-of-the-art community retrieval algorithms, as well as functions for analyzing their effectiveness.-
dc.languageeng-
dc.publisherThe University of Hong Kong (Pokfulam, Hong Kong)-
dc.relation.ispartofHKU Theses Online (HKUTO)-
dc.rightsThe author retains all proprietary rights, (such as patent rights) and the right to use in future works.-
dc.rightsThis work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.-
dc.subject.lcshGraph algorithms-
dc.subject.lcshData mining-
dc.titleEffective and efficient community search over large attributed graphs-
dc.typePG_Thesis-
dc.description.thesisnameDoctor of Philosophy-
dc.description.thesislevelDoctoral-
dc.description.thesisdisciplineComputer Science-
dc.description.naturepublished_or_final_version-
dc.identifier.doi10.5353/th_991043979532303414-
dc.date.hkucongregation2017-
dc.identifier.mmsid991043979532303414-

Export via OAI-PMH Interface in XML Formats


OR


Export to Other Non-XML Formats