Entropy-based Scheduling Policy for Cross Aggregate Ranking Workloads

Dai, CC; Nutanong, S; Chow, CY; Cheng, CK

Links for fulltext

(May Require Subscription)

Publisher Website: 10.1109/TSC.2016.2586062
Scopus: eid_2-s2.0-85048537571
WOS: WOS:000440871000004
Supplementary

Citations:
- Scopus: 0
- Web of Science: 0
Appears in Collections:
- Computer Science: Journal/Magazine Articles

Article: Entropy-based Scheduling Policy for Cross Aggregate Ranking Workloads

Title	Entropy-based Scheduling Policy for Cross Aggregate Ranking Workloads
Authors	Dai, CC; Nutanong, S; Chow, CY; Cheng, CK
Keywords	Knowledge and data engineering tools and techniques; Query processing
Issue Date	2018
Citation	IEEE Transactions on Services Computing, 2018, v. 11 n. 3, p. 507-520. DOI: http://dx.doi.org/10.1109/TSC.2016.2586062
Abstract	Many data exploration applications require the ability to identify the top-k results according to a scoring function. We study a class of top-k ranking problems where top-k candidates in a dataset are scored with the assistance of another set. We call this class of workloads cross aggregate ranking. Example computation problems include evaluating the Hausdorff distance between two datasets, finding the medoid or radius within one dataset, and finding the closest or farthest pair between two datasets. In this paper, we propose a parallel and distributed solution to process cross aggregate ranking workloads. Our solution subdivides the aggregate score computation of each candidate into tasks while constantly maintaining the tentative top-k results as an uncertain top-k result set. The crux of our proposed approach lies in our entropy-based scheduling technique, which determines result-yielding tasks based on their ability to reduce the uncertainty of the tentative result set. Experimental results show that our proposed approach consistently outperforms the best existing approach on two different types of cross aggregate ranking workloads using real datasets.
Persistent Identifier	http://hdl.handle.net/10722/232840
ISSN	1939-1374 (2023 Impact Factor: 5.5; 2023 SCImago Journal Rankings: 2.002)
ISI Accession Number ID	WOS:000440871000004
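
The abstract's notion of cross aggregate ranking can be illustrated with a minimal brute-force sketch. This is not the authors' parallel, entropy-scheduled method; it only shows the problem shape, in which each candidate is scored by aggregating its distances to another set and the top-k candidates are kept. The function name `cross_aggregate_top_k`, its parameters, and the sample points are assumptions made for illustration.

```python
import heapq
import math


def cross_aggregate_top_k(candidates, others, aggregate, k=1, largest=True):
    """Brute-force cross aggregate ranking (hypothetical helper).

    Each candidate is scored by applying `aggregate` (e.g. min, max, sum)
    to its Euclidean distances to every point in `others`; the k best
    (score, candidate) pairs are returned.
    """
    scored = [
        (aggregate(math.dist(c, o) for o in others), c)
        for c in candidates
    ]
    pick = heapq.nlargest if largest else heapq.nsmallest
    return pick(k, scored)


# Made-up sample data, for illustration only.
A = [(0.0, 0.0), (1.0, 0.0), (4.0, 0.0)]
B = [(0.0, 1.0), (1.0, 1.0)]

# Directed Hausdorff distance from A to B: the largest
# nearest-neighbour distance over all points of A.
hausdorff_ab = cross_aggregate_top_k(A, B, min, k=1, largest=True)[0]

# Medoid of A: the point with the smallest total distance
# to all points of A (the other set is A itself).
medoid = cross_aggregate_top_k(A, A, sum, k=1, largest=False)[0]

print(hausdorff_ab)
print(medoid)
```

Every score here is computed in full before ranking. The approach described in the abstract instead splits each score computation into tasks and schedules the tasks expected to reduce uncertainty in the tentative top-k set the most.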

DC Field	Value	Language
dc.contributor.author	Dai, CC	-
dc.contributor.author	Nutanong, S	-
dc.contributor.author	Chow, CY	-
dc.contributor.author	Cheng, CK	-
dc.date.accessioned	2016-09-20T05:32:49Z	-
dc.date.available	2016-09-20T05:32:49Z	-
dc.date.issued	2018	-
dc.identifier.citation	IEEE Transactions on Services Computing, 2018, v. 11 n. 3, p. 507-520	-
dc.identifier.issn	1939-1374	-
dc.identifier.uri	http://hdl.handle.net/10722/232840	-
dc.description.abstract	Many data exploration applications require the ability to identify the top-k results according to a scoring function. We study a class of top-k ranking problems where top-k candidates in a dataset are scored with the assistance of another set. We call this class of workloads cross aggregate ranking. Example computation problems include evaluating the Hausdorff distance between two datasets, finding the medoid or radius within one dataset, and finding the closest or farthest pair between two datasets. In this paper, we propose a parallel and distributed solution to process cross aggregate ranking workloads. Our solution subdivides the aggregate score computation of each candidate into tasks while constantly maintaining the tentative top-k results as an uncertain top-k result set. The crux of our proposed approach lies in our entropy-based scheduling technique, which determines result-yielding tasks based on their ability to reduce the uncertainty of the tentative result set. Experimental results show that our proposed approach consistently outperforms the best existing approach on two different types of cross aggregate ranking workloads using real datasets.	-
dc.language	eng	-
dc.relation.ispartof	IEEE Transactions on Services Computing	-
dc.subject	knowledge and data engineering tools and techniques	-
dc.subject	Query processing	-
dc.title	Entropy-based Scheduling Policy for Cross Aggregate Ranking Workloads	-
dc.type	Article	-
dc.identifier.email	Cheng, CK: ckcheng@cs.hku.hk	-
dc.identifier.authority	Cheng, CK=rp00074	-
dc.description.nature	link_to_subscribed_fulltext	-
dc.identifier.doi	10.1109/TSC.2016.2586062	-
dc.identifier.scopus	eid_2-s2.0-85048537571	-
dc.identifier.hkuros	265231	-
dc.identifier.volume	11	-
dc.identifier.issue	3	-
dc.identifier.spage	507	-
dc.identifier.epage	520	-
dc.identifier.isi	WOS:000440871000004	-
dc.identifier.issnl	1939-1374	-
