postgraduate thesis: Unsupervised learning on scientific ocean drill datasets in the South China Sea

Title: Unsupervised learning on scientific ocean drill datasets in the South China Sea
Authors: Tse, Chi-hoi (謝至愷)
Advisor(s): Lam, EYM; Li, Y
Issue Date: 2018
Publisher: The University of Hong Kong (Pokfulam, Hong Kong)
Citation: Tse, C. [謝至愷]. (2018). Unsupervised learning on scientific ocean drill datasets in the South China Sea. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR.
Abstract: In this interdisciplinary research, unsupervised learning methods are employed to study scientific ocean drilling data of the South China Sea for the first time. A data analysis pipeline consisting of five different unsupervised learning algorithms, K-means, Hierarchical Clustering (HC), Self-Organizing Maps (SOM), Random Forest (RF) and Sparse Autoencoder (SA), is designed to experiment with multivariate geophysical datasets from Ocean Drilling Program (ODP) sites 1146 and 1148 and Integrated Ocean Drilling Program (IODP) sites U1431 and U1433. Compared with conventional methods, unsupervised learning methods do not require any a priori or expert knowledge and have the potential to unveil data structures previously unknown to traditional analytical methods. Data clusters produced by the five unsupervised learning methods tested reveal the natural data structure present in the datasets objectively, without any subjectivity or presumption. Insights into the relevance of such clusters to the physical world are gained by comparing them to the existing classification of the drilling cores by lithologic units and geologic time scales against depth below seafloor. The correspondence between the existing classification and the clustering results demonstrates the applicability of the unsupervised methods to the specific datasets. This pioneering work suggests that unsupervised learning methods originating from computational data analysis are capable of revealing previously unexplored data patterns within the datasets studied. Clustering results from ODP sites 1146 and 1148 are observed to display a higher correspondence with existing classifications than results from IODP sites U1431 and U1433. As for the unsupervised learning methods, SOM, RF and SA are found to yield a higher Rand Index for datasets from the same site.
Similarity analysis of the datasets as time-series data is also carried out to understand the intrinsic relationships among the datasets in an objective way. The unsupervised learning methodology tested in this work lays the groundwork for further machine learning frameworks that would enable data-driven scientific discovery from ocean drilling data in the future.
Degree: Doctor of Philosophy
Subject: Underwater drilling; Geophysics - Mathematical models; Machine learning
Dept/Program: Earth Sciences
Persistent Identifier: http://hdl.handle.net/10722/261495
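
The abstract reports agreement between clustering results and the existing lithologic classification in terms of the Rand Index. As a rough illustration of that measure only, the sketch below computes the unadjusted Rand Index between two labelings in plain Python. The record does not describe the thesis's actual implementation; the function name `rand_index` and the toy labels are hypothetical and are not data from the drilling sites.

```python
from itertools import combinations


def rand_index(labels_a, labels_b):
    """Return the fraction of sample pairs on which two labelings agree.

    A pair "agrees" when both labelings put the two samples in the same
    group, or both put them in different groups.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("labelings must have the same length")
    pairs = list(combinations(range(len(labels_a)), 2))
    if not pairs:
        raise ValueError("need at least two samples")
    agree = 0
    for i, j in pairs:
        same_a = labels_a[i] == labels_a[j]
        same_b = labels_b[i] == labels_b[j]
        if same_a == same_b:
            agree += 1
    return agree / len(pairs)


# Hypothetical toy example: six depth samples, an existing unit label
# for each, and a cluster id assigned by some clustering method.
lithologic_units = ["I", "I", "I", "II", "II", "III"]
cluster_ids = [0, 0, 1, 1, 1, 2]

score = rand_index(lithologic_units, cluster_ids)
print(f"Rand Index: {score:.3f}")
```

A value of 1.0 means the two groupings are identical up to relabeling. The index compares pair memberships rather than label names, so cluster ids do not need to match unit names.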

 

| DC Field | Value | Language |
| --- | --- | --- |
| dc.contributor.advisor | Lam, EYM | - |
| dc.contributor.advisor | Li, Y | - |
| dc.contributor.author | Tse, Chi-hoi | - |
| dc.contributor.author | 謝至愷 | - |
| dc.date.accessioned | 2018-09-20T06:43:56Z | - |
| dc.date.available | 2018-09-20T06:43:56Z | - |
| dc.date.issued | 2018 | - |
| dc.identifier.citation | Tse, C. [謝至愷]. (2018). Unsupervised learning on scientific ocean drill datasets in the South China Sea. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR. | - |
| dc.identifier.uri | http://hdl.handle.net/10722/261495 | - |
| dc.description.abstract | In this interdisciplinary research, unsupervised learning methods are employed to study scientific ocean drilling data of the South China Sea for the first time. A data analysis pipeline consisting of five different unsupervised learning algorithms, K-means, Hierarchical Clustering (HC), Self-Organizing Maps (SOM), Random Forest (RF) and Sparse Autoencoder (SA), is designed to experiment with multivariate geophysical datasets from Ocean Drilling Program (ODP) sites 1146 and 1148 and Integrated Ocean Drilling Program (IODP) sites U1431 and U1433. Compared with conventional methods, unsupervised learning methods do not require any a priori or expert knowledge and have the potential to unveil data structures previously unknown to traditional analytical methods. Data clusters produced by the five unsupervised learning methods tested reveal the natural data structure present in the datasets objectively, without any subjectivity or presumption. Insights into the relevance of such clusters to the physical world are gained by comparing them to the existing classification of the drilling cores by lithologic units and geologic time scales against depth below seafloor. The correspondence between the existing classification and the clustering results demonstrates the applicability of the unsupervised methods to the specific datasets. This pioneering work suggests that unsupervised learning methods originating from computational data analysis are capable of revealing previously unexplored data patterns within the datasets studied. Clustering results from ODP sites 1146 and 1148 are observed to display a higher correspondence with existing classifications than results from IODP sites U1431 and U1433. As for the unsupervised learning methods, SOM, RF and SA are found to yield a higher Rand Index for datasets from the same site. Similarity analysis of the datasets as time-series data is also carried out to understand the intrinsic relationships among the datasets in an objective way. The unsupervised learning methodology tested in this work lays the groundwork for further machine learning frameworks that would enable data-driven scientific discovery from ocean drilling data in the future. | - |
| dc.language | eng | - |
| dc.publisher | The University of Hong Kong (Pokfulam, Hong Kong) | - |
| dc.relation.ispartof | HKU Theses Online (HKUTO) | - |
| dc.rights | The author retains all proprietary rights (such as patent rights) and the right to use in future works. | - |
| dc.rights | This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License. | - |
| dc.subject.lcsh | Underwater drilling | - |
| dc.subject.lcsh | Geophysics - Mathematical models | - |
| dc.subject.lcsh | Machine learning | - |
| dc.title | Unsupervised learning on scientific ocean drill datasets in the South China Sea | - |
| dc.type | PG_Thesis | - |
| dc.description.thesisname | Doctor of Philosophy | - |
| dc.description.thesislevel | Doctoral | - |
| dc.description.thesisdiscipline | Earth Sciences | - |
| dc.description.nature | published_or_final_version | - |
| dc.date.hkucongregation | 2018 | - |
| dc.identifier.mmsid | 991044040577903414 | - |
