postgraduate thesis: Protein function prediction based on pocket-specific noncontiguous amino acid subsequences
Title | Protein function prediction based on pocket-specific noncontiguous amino acid subsequences |
---|---|
Authors | An, Yatong |
Issue Date | 2015 |
Publisher | The University of Hong Kong (Pokfulam, Hong Kong) |
Citation | An, Y. [{273a67}亚{275c28}]. (2015). Protein function prediction based on pocket-specific noncontiguous amino acid subsequences. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR. Retrieved from http://dx.doi.org/10.5353/th_b5576786 |
Abstract | Building a protein functional repertoire is important for many life sciences. Unfortunately, less than 1% of protein sequences have been annotated with reliable evidence. The use of computational methods to predict protein functions has become a common means to bridge this formidable gap. In this thesis, it is proposed to use pocket-specific noncontiguous amino acid subsequences for predicting protein functions. These subsequence patterns have a strong function classification capability and are also complementary to protein sequence alignment methods. On the basis of a benchmark of ∼1600 testing proteins from the Protein Data Bank (PDB), it is demonstrated that function prediction using pocket-specific noncontiguous amino acid subsequences can be much more accurate than using three-dimensional pocket structures. Because these noncontiguous amino acid subsequences are independent of protein or pocket structures, the method based on such subsequence patterns can be easily applied to proteins with unknown structures. Predictors achieve state-of-the-art performance on two benchmarks constructed using proteins from the PDB and SwissProt, respectively. Protein sequence alignment features are then further integrated into our pocket-specific noncontiguous subsequence model. The maximum F-measure of the integrated predictor on the PDB-based benchmark is 0.844 for the molecular function (MF) ontology and 0.838 for the biological process (BP) ontology, representing respective performance improvements of 47.8% and 48.3% over the best results achieved with existing methods. On the SwissProt-based benchmark, the maximum F-measure of the integrated predictor is 0.627 for MF and 0.468 for BP, representing respective performance improvements of 29.0% and 38.1% over the best results achieved with existing methods. |
Degree | Master of Philosophy |
Subject | Amino acid sequence; Proteomics - Data processing |
Dept/Program | Computer Science |
Persistent Identifier | http://hdl.handle.net/10722/221082 |
HKU Library Item ID | b5576786 |
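
The abstract reports results as a maximum F-measure. As a rough illustration of how such a score is commonly computed in protein function prediction, the sketch below takes the best harmonic mean of precision and recall over a sweep of score thresholds. This record does not state the exact definition used in the thesis, so the protein-centric averaging, the threshold grid, and the names `max_f_measure`, `predictions`, and `truth` are all assumptions made for illustration.

```python
from typing import Dict, Iterable, Optional, Set


def max_f_measure(
    predictions: Dict[str, Dict[str, float]],
    truth: Dict[str, Set[str]],
    thresholds: Optional[Iterable[float]] = None,
) -> float:
    """Protein-centric maximum F-measure (an assumed, common definition).

    predictions maps protein id -> {function term: confidence score}.
    truth maps protein id -> set of annotated function terms.
    """
    if thresholds is None:
        # Assumed grid: 0.01, 0.02, ..., 1.00
        thresholds = [i / 100 for i in range(1, 101)]

    # Only proteins with at least one true annotation are evaluated.
    evaluated = {p: terms for p, terms in truth.items() if terms}
    if not evaluated:
        return 0.0

    best = 0.0
    for t in thresholds:
        precisions = []
        recall_sum = 0.0
        for protein, true_terms in evaluated.items():
            scores = predictions.get(protein, {})
            predicted = {term for term, s in scores.items() if s >= t}
            hits = len(predicted & true_terms)
            # Precision is averaged only over proteins with a prediction at t.
            if predicted:
                precisions.append(hits / len(predicted))
            # Recall is averaged over all evaluated proteins.
            recall_sum += hits / len(true_terms)
        if not precisions:
            continue
        precision = sum(precisions) / len(precisions)
        recall = recall_sum / len(evaluated)
        if precision + recall > 0:
            best = max(best, 2 * precision * recall / (precision + recall))
    return best
```

A caller would pass per-protein term scores from a predictor together with the reference annotations for one ontology (MF or BP) and compare the returned value across methods.
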
DC Field | Value | Language |
---|---|---|
dc.contributor.author | An, Yatong | - |
dc.contributor.author | {273a67}亚{275c28} | - |
dc.date.accessioned | 2015-10-26T23:11:56Z | - |
dc.date.available | 2015-10-26T23:11:56Z | - |
dc.date.issued | 2015 | - |
dc.identifier.citation | An, Y. [{273a67}亚{275c28}]. (2015). Protein function prediction based on pocket-specific noncontiguous amino acid subsequences. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR. Retrieved from http://dx.doi.org/10.5353/th_b5576786 | - |
dc.identifier.uri | http://hdl.handle.net/10722/221082 | - |
dc.description.abstract | Building a protein functional repertoire is important for many life sciences. Unfortunately, less than 1% of protein sequences have been annotated with reliable evidence. The use of computational methods to predict protein functions has become a common means to bridge this formidable gap. In this thesis, it is proposed to use pocket-specific noncontiguous amino acid subsequences for predicting protein functions. These subsequence patterns have a strong function classification capability and are also complementary to protein sequence alignment methods. On the basis of a benchmark of ∼1600 testing proteins from the Protein Data Bank (PDB), it is demonstrated that function prediction using pocket-specific noncontiguous amino acid subsequences can be much more accurate than using three-dimensional pocket structures. Because these noncontiguous amino acid subsequences are independent of protein or pocket structures, the method based on such subsequence patterns can be easily applied to proteins with unknown structures. Predictors achieve state-of-the-art performance on two benchmarks constructed using proteins from the PDB and SwissProt, respectively. Protein sequence alignment features are then further integrated into our pocket-specific noncontiguous subsequence model. The maximum F-measure of the integrated predictor on the PDB-based benchmark is 0.844 for the molecular function (MF) ontology and 0.838 for the biological process (BP) ontology, representing respective performance improvements of 47.8% and 48.3% over the best results achieved with existing methods. On the SwissProt-based benchmark, the maximum F-measure of the integrated predictor is 0.627 for MF and 0.468 for BP, representing respective performance improvements of 29.0% and 38.1% over the best results achieved with existing methods. | - |
dc.language | eng | - |
dc.publisher | The University of Hong Kong (Pokfulam, Hong Kong) | - |
dc.relation.ispartof | HKU Theses Online (HKUTO) | - |
dc.rights | The author retains all proprietary rights (such as patent rights) and the right to use in future works. | - |
dc.rights | This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License. | - |
dc.subject.lcsh | Amino acid sequence | - |
dc.subject.lcsh | Proteomics - Data processing | - |
dc.title | Protein function prediction based on pocket-specific noncontiguous amino acid subsequences | - |
dc.type | PG_Thesis | - |
dc.identifier.hkul | b5576786 | - |
dc.description.thesisname | Master of Philosophy | - |
dc.description.thesislevel | Master | - |
dc.description.thesisdiscipline | Computer Science | - |
dc.description.nature | published_or_final_version | - |
dc.identifier.doi | 10.5353/th_b5576786 | - |
dc.identifier.mmsid | 991011257099703414 | - |