Kakute: A Precise, Unified Information Flow Analysis System for Big-data Security

Jiang, J; Zhao, S; Alsayed, D; Wang, Y; Cui, H; Liang, F; Gu, Z

Links for fulltext

(May Require Subscription)

Publisher Website: 10.1145/3134600.3134607
Scopus: eid_2-s2.0-85038893861
WOS: WOS:000540643200007

Appears in Collections:
- Computer Science: Conference papers

Conference Paper: Kakute: A Precise, Unified Information Flow Analysis System for Big-data Security

Title	Kakute: A Precise, Unified Information Flow Analysis System for Big-data Security
Authors	Jiang, J; Zhao, S; Alsayed, D; Wang, Y; Cui, H; Liang, F; Gu, Z
Keywords	Big-data; Data-intensive Scalable Computing System; Information Flow Tracking
Issue Date	2017
Publisher	ACM. The Proceedings' web site is located at https://dl.acm.org/citation.cfm?id=3134600
Citation	Proceedings of the 33rd Annual Computer Security Applications Conference (ACSAC 2017), Orlando, FL, USA, 4-8 December 2017, p. 79-90. DOI: http://dx.doi.org/10.1145/3134600.3134607
Abstract	Big-data frameworks (e.g., Spark) enable computations on tremendous data records generated by third parties, which introduces various security and reliability problems such as information leakage and programming bugs. Existing systems for big-data security (e.g., Titian) track data transformations at the record level, so they are imprecise and too coarse-grained for these problems. For instance, when we ran Titian to drill down input records that produced a buggy output record, Titian reported 3 to 9 orders of magnitude more input records than the actual ones. Information Flow Tracking (IFT) is a conventional approach for precise information control. However, extant IFT systems are neither efficient nor complete for big-data frameworks, because these frameworks are data-intensive, and data flowing across hosts is often ignored by IFT. This paper presents KAKUTE, the first precise, fine-grained information flow analysis system for big-data. Our insight on making IFT efficient is that most fields in a data record often have the same IFT tags, and we present two new efficient techniques called Reference Propagation and Tag Sharing. In addition, we design an efficient, complete cross-host information flow propagation approach. Evaluations on 7 diverse big-data programs (e.g., WordCount) show that KAKUTE has merely 32.3% overhead even when fine-grained information control is enabled. Compared with Titian, KAKUTE precisely drilled down the actual bug-inducing input records, a huge reduction of 3 to 9 orders of magnitude. KAKUTE’s performance overhead is comparable with Titian. Furthermore, KAKUTE effectively detected 13 real-world security and reliability bugs in 4 diverse problems, including information leakage, data provenance, programming and performance bugs. KAKUTE’s source code is available at https://github.com/acsac17-p78/kakute.
Description	Session: Big Data Analytics
Persistent Identifier	http://hdl.handle.net/10722/245449
ISBN	978-1-4503-5345-8
ISI Accession Number ID	WOS:000540643200007

DC Field	Value	Language
dc.contributor.author	Jiang, J	-
dc.contributor.author	Zhao, S	-
dc.contributor.author	Alsayed, D	-
dc.contributor.author	Wang, Y	-
dc.contributor.author	Cui, H	-
dc.contributor.author	Liang, F	-
dc.contributor.author	Gu, Z	-
dc.date.accessioned	2017-09-18T02:10:55Z	-
dc.date.available	2017-09-18T02:10:55Z	-
dc.date.issued	2017	-
dc.identifier.citation	Proceedings of the 33rd Annual Computer Security Applications Conference (ACSAC 2017), Orlando, FL, USA, 4-8 December 2017, p. 79-90	-
dc.identifier.isbn	978-1-4503-5345-8	-
dc.identifier.uri	http://hdl.handle.net/10722/245449	-
dc.description	Session: Big Data Analytics	-
dc.description.abstract	Big-data frameworks (e.g., Spark) enable computations on tremendous data records generated by third parties, which introduces various security and reliability problems such as information leakage and programming bugs. Existing systems for big-data security (e.g., Titian) track data transformations at the record level, so they are imprecise and too coarse-grained for these problems. For instance, when we ran Titian to drill down input records that produced a buggy output record, Titian reported 3 to 9 orders of magnitude more input records than the actual ones. Information Flow Tracking (IFT) is a conventional approach for precise information control. However, extant IFT systems are neither efficient nor complete for big-data frameworks, because these frameworks are data-intensive, and data flowing across hosts is often ignored by IFT. This paper presents KAKUTE, the first precise, fine-grained information flow analysis system for big-data. Our insight on making IFT efficient is that most fields in a data record often have the same IFT tags, and we present two new efficient techniques called Reference Propagation and Tag Sharing. In addition, we design an efficient, complete cross-host information flow propagation approach. Evaluations on 7 diverse big-data programs (e.g., WordCount) show that KAKUTE has merely 32.3% overhead even when fine-grained information control is enabled. Compared with Titian, KAKUTE precisely drilled down the actual bug-inducing input records, a huge reduction of 3 to 9 orders of magnitude. KAKUTE’s performance overhead is comparable with Titian. Furthermore, KAKUTE effectively detected 13 real-world security and reliability bugs in 4 diverse problems, including information leakage, data provenance, programming and performance bugs. KAKUTE’s source code is available at https://github.com/acsac17-p78/kakute.	-
dc.language	eng	-
dc.publisher	ACM. The Proceedings' web site is located at https://dl.acm.org/citation.cfm?id=3134600	-
dc.relation.ispartof	Annual Computer Security Applications Conference (ACSAC) 2017	-
dc.subject	Big-data	-
dc.subject	Data-intensive Scalable Computing System	-
dc.subject	Information Flow Tracking	-
dc.title	Kakute: A Precise, Unified Information Flow Analysis System for Big-data Security	-
dc.type	Conference_Paper	-
dc.identifier.email	Wang, Y: amywang@hku.hk	-
dc.identifier.email	Cui, H: heming@hku.hk	-
dc.identifier.email	Gu, Z: zqgu@hku.hk	-
dc.identifier.authority	Cui, H=rp02008	-
dc.identifier.doi	10.1145/3134600.3134607	-
dc.identifier.scopus	eid_2-s2.0-85038893861	-
dc.identifier.hkuros	276668	-
dc.identifier.spage	79	-
dc.identifier.epage	90	-
dc.identifier.isi	WOS:000540643200007	-
dc.publisher.place	New York, NY	-
