postgraduate thesis: Forensic and pattern analysis on the bitcoin blockchain
| Title | Forensic and pattern analysis on the bitcoin blockchain |
|---|---|
| Authors | Gong, Yanan (龔雅楠) |
| Advisors | Yiu, SM; Chow, KP |
| Issue Date | 2025 |
| Publisher | The University of Hong Kong (Pokfulam, Hong Kong) |
| Citation | Gong, Y. [龔雅楠]. (2025). Forensic and pattern analysis on the bitcoin blockchain. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR. |
| Abstract | Cryptocurrency-related crimes are on the rise and have a wide-ranging impact across various areas. To effectively combat and prevent these illicit activities, cryptocurrency forensics (crypto forensics) is essential. At its core, this field relies on the investigation and analysis of blockchain data. However, the inherent pseudonymity and dynamics of Bitcoin introduce significant complexities to these investigations. The collection and validation of Bitcoin addresses are indispensable processes in blockchain forensic analysis, crucial for identifying suspicious transactions, tracing fund flows, and conducting de-anonymization investigations. Address clustering, which groups addresses likely controlled by the same entity, serves as a foundational technique. The accuracy of clustering outcomes significantly impacts the reliability of crypto forensic findings. While heuristic-based address clustering is commonly adopted, its effectiveness faces limitations primarily due to the absence of ground truth data. This absence introduces fundamental uncertainty into clustering results, hindering the validation of forensic conclusions. This uncertainty is compounded by the increasing adoption of privacy-enhancing technologies, which complicate address relationships and create additional hurdles for investigators. Moreover, other dynamic factors of the blockchain, such as the introduction of new network features, further challenge clustering accuracy, collectively making reliable forensic analysis increasingly complex. This study takes several approaches to address these limitations. In the first part, confronting the challenge of unavailable ground truth labels, we develop a simulation model to assess the potential error rates of two widely used clustering heuristics: the multi-input and one-time change address heuristics. The second part provides an in-depth behavioral analysis of peeling chains, a common structure utilized by entities such as exchanges and mixers. This analysis enhances our understanding of the operational characteristics of transaction data associated with privacy-enhancing practices. Building on the work and insights of the first two parts, the third part introduces an enhanced simulation platform that more accurately replicates real-world Bitcoin transaction structures. Additionally, we propose and evaluate a novel heuristic algorithm specifically designed to improve the classification of one-time change addresses. This refined simulator provides a robust environment for assessing address clustering methods based on transaction details. The new heuristic aims to reduce misclassifications and achieve better clustering results. Overall, this research presents a simulation framework to quantify the uncertainties in heuristic clustering results. This facilitates a clearer assessment of the reliability and limitations of address clustering algorithms, thus strengthening the basis for the admissibility of clustering findings as forensic evidence. The proposed heuristic more effectively captures relevant transaction patterns, helping to alleviate the uncertainties introduced by privacy techniques in forensic analysis. Additionally, all three parts of this study contribute a comprehensive analysis of Bitcoin blockchain data from different periods, examining aspects such as transaction types, address reuse, and structural details. The identified characteristics and observed trends serve as a basis for refining forensic tools and methodologies. |
| Degree | Doctor of Philosophy |
| Subject | Bitcoin; Blockchains (Databases) |
| Dept/Program | Computer Science |
| Persistent Identifier | http://hdl.handle.net/10722/358336 |

| DC Field | Value | Language |
|---|---|---|
| dc.contributor.advisor | Yiu, SM | - |
| dc.contributor.advisor | Chow, KP | - |
| dc.contributor.author | Gong, Yanan | - |
| dc.contributor.author | 龔雅楠 | - |
| dc.date.accessioned | 2025-07-31T14:06:56Z | - |
| dc.date.available | 2025-07-31T14:06:56Z | - |
| dc.date.issued | 2025 | - |
| dc.identifier.citation | Gong, Y. [龔雅楠]. (2025). Forensic and pattern analysis on the bitcoin blockchain. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR. | - |
| dc.identifier.uri | http://hdl.handle.net/10722/358336 | - |
| dc.language | eng | - |
| dc.publisher | The University of Hong Kong (Pokfulam, Hong Kong) | - |
| dc.relation.ispartof | HKU Theses Online (HKUTO) | - |
| dc.rights | The author retains all proprietary rights (such as patent rights) and the right to use in future works. | - |
| dc.rights | This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License. | - |
| dc.subject.lcsh | Bitcoin | - |
| dc.subject.lcsh | Blockchains (Databases) | - |
| dc.title | Forensic and pattern analysis on the bitcoin blockchain | - |
| dc.type | PG_Thesis | - |
| dc.description.thesisname | Doctor of Philosophy | - |
| dc.description.thesislevel | Doctoral | - |
| dc.description.thesisdiscipline | Computer Science | - |
| dc.description.nature | published_or_final_version | - |
| dc.date.hkucongregation | 2025 | - |
| dc.identifier.mmsid | 991045004489003414 | - |
