Clustering-based approach for predicting motif pairs from protein interaction data

Leung, HCM; Siu, MH; Yiu, SM; Chin, FYL; Sung, KWK

Links for fulltext

(May Require Subscription)

Publisher Website: 10.1142/S0219720009004266
Scopus: eid_2-s2.0-68349110569
PMID: 19634199
Supplementary

Citations:
- Scopus: 0
- PubMed Central: 0
Appears in Collections:
- Computer Science: Journal/Magazine Articles

Article: Clustering-based approach for predicting motif pairs from protein interaction data

Title	Clustering-based approach for predicting motif pairs from protein interaction data
Authors	Leung, HCM; Siu, MH; Yiu, SM; Chin, FYL; Sung, KWK
Keywords	Motif pair; Protein domain; Protein-protein interaction network
Issue Date	2009
Publisher	Imperial College Press. The Journal's web site is located at http://www.worldscinet.com/jbcb/jbcb.shtml
Citation	Journal of Bioinformatics and Computational Biology, 2009, v. 7 n. 4, p. 701-716. DOI: http://dx.doi.org/10.1142/S0219720009004266
Abstract	Predicting motif pairs from a set of protein sequences based on protein-protein interaction data is an important but difficult computational problem. Tan et al. proposed a solution to this problem. However, the scoring function (using χ² testing) used in their approach is not adequate, and their approach is also not scalable: it may take days to process a set of 5000 protein sequences with about 20,000 interactions. Later, Leung et al. proposed an improved scoring function and faster algorithms for solving the same problem. However, the model used by Leung et al. is complicated: the exact value of the scoring function is not easy to compute, so an estimated value is used in practice. In this paper, we derive a better model to capture the significance of a given motif pair based on a clustering notion. We develop a fast heuristic algorithm to solve the problem. The algorithm is able to locate the correct motif pair in the yeast data set in about 45 minutes for 5000 protein sequences and 20,000 interactions. Moreover, we derive a lower bound result for the p-value of a motif pair in order for it to be distinguishable from random motif pairs. The lower bound result has been verified using simulated data sets. © 2009 Imperial College Press.
Persistent Identifier	http://hdl.handle.net/10722/152414
ISSN	0219-7200; 2021 Impact Factor: 1.204; 2020 SCImago Journal Rankings: 0.339
References	References in Scopus
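
The abstract refers to scoring a candidate motif pair with a chi-square (χ²) test. As a rough illustration of what such a score can look like, and not the scoring function of Tan et al., Leung et al., or this paper, the sketch below builds a 2x2 contingency table over all protein pairs (motif pair present or absent versus interacting or not) and computes the Pearson χ² statistic. The function names, the use of plain substrings as motifs, and the toy data are all assumptions made for this example.

```python
from itertools import combinations


def has_motif_pair(seq_a, seq_b, motif_x, motif_y):
    """True if one sequence contains motif_x and the other contains motif_y."""
    return (motif_x in seq_a and motif_y in seq_b) or (
        motif_y in seq_a and motif_x in seq_b
    )


def chi_square_score(proteins, interactions, motif_x, motif_y):
    """Pearson chi-square statistic for one motif pair (illustrative only).

    proteins: dict mapping protein name -> sequence string
    interactions: iterable of (name, name) pairs that interact
    Returns (chi2, counts) where counts[matched][interacts] is the 2x2 table.
    """
    interacting = {frozenset(pair) for pair in interactions}
    counts = [[0, 0], [0, 0]]
    # Enumerate every unordered protein pair once.
    for a, b in combinations(sorted(proteins), 2):
        matched = has_motif_pair(proteins[a], proteins[b], motif_x, motif_y)
        interacts = frozenset((a, b)) in interacting
        counts[int(matched)][int(interacts)] += 1

    total = sum(map(sum, counts))
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            row_total = sum(counts[i])
            col_total = counts[0][j] + counts[1][j]
            expected = row_total * col_total / total
            if expected > 0:
                chi2 += (counts[i][j] - expected) ** 2 / expected
    return chi2, counts


if __name__ == "__main__":
    # Hypothetical toy data: short made-up sequences and interactions.
    toy_proteins = {
        "p1": "AAKLM", "p2": "GGRST", "p3": "KLMCC",
        "p4": "TTRST", "p5": "CCCCC", "p6": "DDDDD",
    }
    toy_interactions = [("p1", "p2"), ("p3", "p4"), ("p5", "p6")]
    score, table = chi_square_score(toy_proteins, toy_interactions, "KLM", "RST")
    print(table, round(score, 3))
```

Enumerating all protein pairs is quadratic in the number of proteins, and a real search would repeat this for many candidate motif pairs, which gives a sense of why scalability is a concern at the scale the abstract mentions (5000 sequences, about 20,000 interactions).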

DC Field	Value	Language
dc.contributor.author	Leung, HCM	en_US
dc.contributor.author	Siu, MH	en_US
dc.contributor.author	Yiu, SM	en_US
dc.contributor.author	Chin, FYL	en_US
dc.contributor.author	Sung, KWK	en_US
dc.date.accessioned	2012-06-26T06:38:15Z	-
dc.date.available	2012-06-26T06:38:15Z	-
dc.date.issued	2009	en_US
dc.identifier.citation	Journal Of Bioinformatics And Computational Biology, 2009, v. 7 n. 4, p. 701-716	en_US
dc.identifier.issn	0219-7200	en_US
dc.identifier.uri	http://hdl.handle.net/10722/152414	-
dc.description.abstract	Predicting motif pairs from a set of protein sequences based on protein-protein interaction data is an important but difficult computational problem. Tan et al. proposed a solution to this problem. However, the scoring function (using χ² testing) used in their approach is not adequate, and their approach is also not scalable: it may take days to process a set of 5000 protein sequences with about 20,000 interactions. Later, Leung et al. proposed an improved scoring function and faster algorithms for solving the same problem. However, the model used by Leung et al. is complicated: the exact value of the scoring function is not easy to compute, so an estimated value is used in practice. In this paper, we derive a better model to capture the significance of a given motif pair based on a clustering notion. We develop a fast heuristic algorithm to solve the problem. The algorithm is able to locate the correct motif pair in the yeast data set in about 45 minutes for 5000 protein sequences and 20,000 interactions. Moreover, we derive a lower bound result for the p-value of a motif pair in order for it to be distinguishable from random motif pairs. The lower bound result has been verified using simulated data sets. © 2009 Imperial College Press.	en_US
dc.language	eng	en_US
dc.publisher	Imperial College Press. The Journal's web site is located at http://www.worldscinet.com/jbcb/jbcb.shtml	en_US
dc.relation.ispartof	Journal of Bioinformatics and Computational Biology	en_US
dc.subject	Motif pair	-
dc.subject	Protein domain	-
dc.subject	Protein-protein interaction network	-
dc.subject.mesh	Algorithms	en_US
dc.subject.mesh	Amino Acid Motifs	en_US
dc.subject.mesh	Amino Acid Sequence	en_US
dc.subject.mesh	Binding Sites	en_US
dc.subject.mesh	Cluster Analysis	en_US
dc.subject.mesh	Molecular Sequence Data	en_US
dc.subject.mesh	Pattern Recognition, Automated - Methods	en_US
dc.subject.mesh	Protein Binding	en_US
dc.subject.mesh	Protein Interaction Mapping - Methods	en_US
dc.subject.mesh	Protein Structure, Tertiary	en_US
dc.subject.mesh	Proteins - Chemistry - Metabolism	en_US
dc.subject.mesh	Sequence Analysis, Protein - Methods	en_US
dc.title	Clustering-based approach for predicting motif pairs from protein interaction data	en_US
dc.type	Article	en_US
dc.identifier.email	Leung, HCM:cmleung2@cs.hku.hk	en_US
dc.identifier.email	Yiu, SM:smyiu@cs.hku.hk	en_US
dc.identifier.email	Chin, FYL:chin@cs.hku.hk	en_US
dc.identifier.authority	Leung, HCM=rp00144	en_US
dc.identifier.authority	Yiu, SM=rp00207	en_US
dc.identifier.authority	Chin, FYL=rp00105	en_US
dc.description.nature	link_to_subscribed_fulltext	en_US
dc.identifier.doi	10.1142/S0219720009004266	en_US
dc.identifier.pmid	19634199	-
dc.identifier.scopus	eid_2-s2.0-68349110569	en_US
dc.relation.references	http://www.scopus.com/mlt/select.url?eid=2-s2.0-68349110569&selection=ref&src=s&origin=recordpage	en_US
dc.identifier.volume	7	en_US
dc.identifier.issue	4	en_US
dc.identifier.spage	701	en_US
dc.identifier.epage	716	en_US
dc.publisher.place	United Kingdom	en_US
dc.identifier.scopusauthorid	Leung, HCM=35233742700	en_US
dc.identifier.scopusauthorid	Siu, MH=36762173800	en_US
dc.identifier.scopusauthorid	Yiu, SM=7003282240	en_US
dc.identifier.scopusauthorid	Chin, FYL=7005101915	en_US
dc.identifier.scopusauthorid	Sung, KWK=12797768900	en_US
dc.identifier.issnl	0219-7200	-
