Efficient mining of frequent item sets on large uncertain databases

Wang, L; Cheung, DWL; Cheng, R; Lee, SD; Yang, XS

Publisher Website: 10.1109/TKDE.2011.165
Scopus: eid_2-s2.0-84867943010
WOS: WOS:000309914400005
Appears in Collections:
- Computer Science: Journal/Magazine Articles

Article: Efficient mining of frequent item sets on large uncertain databases

Title	Efficient mining of frequent item sets on large uncertain databases
Authors	Wang, L; Cheung, DWL; Cheng, R; Lee, SD; Yang, XS
Keywords	Approximate algorithm; Frequent item sets; Incremental mining; Uncertain data set; Database systems
Issue Date	2012
Publisher	IEEE. The Journal's web site is located at http://www.computer.org/tkde
Citation	IEEE Transactions on Knowledge & Data Engineering, 2012, v. 24 n. 12, p. 2170-2183
DOI	http://dx.doi.org/10.1109/TKDE.2011.165
Abstract	The data handled in emerging applications like location-based services, sensor monitoring systems, and data integration, are often inexact in nature. In this paper, we study the important problem of extracting frequent item sets from a large uncertain database, interpreted under the Possible World Semantics (PWS). This issue is technically challenging, since an uncertain database contains an exponential number of possible worlds. By observing that the mining process can be modeled as a Poisson binomial distribution, we develop an approximate algorithm, which can efficiently and accurately discover frequent item sets in a large uncertain database. We also study the important issue of maintaining the mining result for a database that is evolving (e.g., by inserting a tuple). Specifically, we propose incremental mining algorithms, which enable Probabilistic Frequent Item set (PFI) results to be refreshed. This reduces the need of re-executing the whole mining algorithm on the new database, which is often more expensive and unnecessary. We examine how an existing algorithm that extracts exact item sets, as well as our approximate algorithm, can support incremental mining. All our approaches support both tuple and attribute uncertainty, which are two common uncertain database models. We also perform extensive evaluation on real and synthetic data sets to validate our approaches. © 1989-2012 IEEE.
Persistent Identifier	http://hdl.handle.net/10722/138034
ISSN	1041-4347
2021 Impact Factor	9.235
2020 SCImago Journal Rankings	1.360
ISI Accession Number ID	WOS:000309914400005
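
The abstract states that the support of an item set in an uncertain database can be modeled as a Poisson binomial distribution, and that results can be refreshed when a tuple is inserted. The following is a minimal illustrative sketch of that idea under tuple uncertainty, not the authors' algorithm. It assumes each tuple independently contains the item set with a known probability, and it uses a Poisson distribution with the same mean as the approximation; the abstract does not say which approximation the paper uses, so that choice is an assumption. All function names (`support_pmf`, `add_tuple`, `frequentness_exact`, `frequentness_poisson`) are hypothetical.

```python
import math


def add_tuple(pmf, p):
    """Update the support-count pmf after inserting one tuple.

    pmf[k] is the probability that exactly k tuples contain the item set;
    p is the probability that the new tuple contains it (assumed independent).
    """
    new = [0.0] * (len(pmf) + 1)
    for k, mass in enumerate(pmf):
        new[k] += mass * (1.0 - p)      # new tuple does not contain the item set
        new[k + 1] += mass * p          # new tuple contains the item set
    return new


def support_pmf(probs):
    """Exact Poisson binomial pmf of the support count, built one tuple at a time."""
    pmf = [1.0]                         # empty database: support is 0 with certainty
    for p in probs:
        pmf = add_tuple(pmf, p)
    return pmf


def frequentness_exact(probs, minsup):
    """P(support >= minsup), summed over the exact pmf."""
    return sum(support_pmf(probs)[minsup:])


def frequentness_poisson(probs, minsup):
    """P(support >= minsup) under a Poisson approximation with mean sum(probs).

    Assumption: Poisson is used here only as one standard approximation
    of a Poisson binomial; it is most accurate when the probabilities are small.
    """
    mu = sum(probs)
    below = sum(math.exp(-mu) * mu ** k / math.factorial(k) for k in range(minsup))
    return 1.0 - below


if __name__ == "__main__":
    probs = [0.9, 0.8, 0.5, 0.3]        # hypothetical per-tuple probabilities
    print(frequentness_exact(probs, 2))
    print(frequentness_poisson(probs, 2))
```

The exact computation costs time quadratic in the number of tuples, while the approximation needs only the sum of the probabilities, which is the kind of saving an approximate algorithm can exploit. Because `add_tuple` extends an existing pmf in linear time, it also hints at why an inserted tuple need not trigger a full recomputation.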

DC Field	Value	Language
dc.contributor.author	Wang, L	en_US
dc.contributor.author	Cheung, DWL	en_US
dc.contributor.author	Cheng, R	en_US
dc.contributor.author	Lee, SD	en_US
dc.contributor.author	Yang, XS	-
dc.date.accessioned	2011-08-26T14:39:02Z	-
dc.date.available	2011-08-26T14:39:02Z	-
dc.date.issued	2012	en_US
dc.identifier.citation	IEEE Transactions on Knowledge & Data Engineering, 2012, v. 24 n. 12, p. 2170-2183	en_US
dc.identifier.issn	1041-4347	-
dc.identifier.uri	http://hdl.handle.net/10722/138034	-
dc.description.abstract	The data handled in emerging applications like location-based services, sensor monitoring systems, and data integration, are often inexact in nature. In this paper, we study the important problem of extracting frequent item sets from a large uncertain database, interpreted under the Possible World Semantics (PWS). This issue is technically challenging, since an uncertain database contains an exponential number of possible worlds. By observing that the mining process can be modeled as a Poisson binomial distribution, we develop an approximate algorithm, which can efficiently and accurately discover frequent item sets in a large uncertain database. We also study the important issue of maintaining the mining result for a database that is evolving (e.g., by inserting a tuple). Specifically, we propose incremental mining algorithms, which enable Probabilistic Frequent Item set (PFI) results to be refreshed. This reduces the need of re-executing the whole mining algorithm on the new database, which is often more expensive and unnecessary. We examine how an existing algorithm that extracts exact item sets, as well as our approximate algorithm, can support incremental mining. All our approaches support both tuple and attribute uncertainty, which are two common uncertain database models. We also perform extensive evaluation on real and synthetic data sets to validate our approaches. © 1989-2012 IEEE.	-
dc.language	eng	en_US
dc.publisher	IEEE. The Journal's web site is located at http://www.computer.org/tkde	-
dc.relation.ispartof	IEEE Transactions on Knowledge & Data Engineering	en_US
dc.subject	Approximate algorithm	-
dc.subject	Frequent item sets	-
dc.subject	Incremental mining	-
dc.subject	Uncertain data set	-
dc.subject	Database systems	-
dc.title	Efficient mining of frequent item sets on large uncertain databases	en_US
dc.type	Article	en_US
dc.identifier.email	Wang, L: lwang@cs.hku.hk	en_US
dc.identifier.email	Cheung, DWL: dcheung@cs.hku.hk	en_US
dc.identifier.email	Cheng, RCK: ckcheng@cs.hku.hk	en_US
dc.identifier.email	Lee, SD: sdlee@cs.hku.hk	-
dc.identifier.email	Yang, XS: xyang2@cs.hku.hk	-
dc.identifier.authority	Cheung, DWL=rp00101	en_US
dc.identifier.authority	Cheng, RCK=rp00074	en_US
dc.description.nature	link_to_subscribed_fulltext	-
dc.identifier.doi	10.1109/TKDE.2011.165	-
dc.identifier.scopus	eid_2-s2.0-84867943010	-
dc.identifier.hkuros	190763	en_US
dc.identifier.volume	24	-
dc.identifier.issue	12	-
dc.identifier.spage	2170	-
dc.identifier.epage	2183	-
dc.identifier.isi	WOS:000309914400005	-
dc.publisher.place	United States	-
dc.identifier.citeulike	11273187	-
dc.identifier.issnl	1041-4347	-
