HKU Scholars Hubhttp://hub.hku.hkThe DSpace digital repository system captures, stores, indexes, preserves, and distributes digital research material.Wed, 14 Aug 2024 10:13:08 GMT2024-08-14T10:13:08Z502511- Approximation for minimum triangulations of simplicial convex 3-polytopeshttp://hdl.handle.net/10722/88890Title: Approximation for minimum triangulations of simplicial convex 3-polytopes
Authors: Chin, FYL; Fung, SPY; Wang, CA
Abstract: A minimum triangulation of a convex 3-polytope is a triangulation that contains the minimum number of tetrahedra over all its possible triangulations. Since finding minimum triangulations of convex 3-polytopes was recently shown to be NP-hard, it becomes significant to find algorithms that give good approximation. In this paper we give a new triangulation algorithm with an improved approximation ratio 2 - Ω(1/√n), where n is the number of vertices of the polytope. We further show that this is the best possible for algorithms that only consider the combinatorial structure of the polytopes.
Mon, 01 Jan 2001 00:00:00 GMThttp://hdl.handle.net/10722/888902001-01-01T00:00:00Z
- IDBA-MTP: A Hybrid MetaTranscriptomic Assembler Based on Protein Informationhttp://hdl.handle.net/10722/201112Title: IDBA-MTP: A Hybrid MetaTranscriptomic Assembler Based on Protein Information
Authors: Leung, HCM; Yiu, SM; Chin, FYL
Abstract: Metatranscriptomic analysis provides information on how a microbial community reacts to environmental changes. Using next-generation sequencing (NGS) technology, biologists can study microbe community by sampling short reads from a mixture of mRNAs (metatranscriptomic data). As most microbial genome sequences are unknown, it would seem that de novo assembly of the mRNAs is needed. However, NGS reads are short and mRNAs share many similar regions and differ tremendously in abundance levels, making de novo assembly challenging. The existing assembler, IDBA-MT, designed specifically for the assembly of metatranscriptomic data only performs well on high-expressed mRNAs.
This paper introduces IDBA-MTP, which adopts a novel approach to metatranscriptomic assembly that makes use of the fact that there is a database of millions of known protein sequences associated with mRNAs. How to effectively use the protein information is non-trivial given the size of the database and given that different mRNAs might lead to proteins with similar functions (because different amino acids might have similar characteristics). IDBA-MTP employs a similarity measure between mRNAs and protein sequences, dynamic programming techniques and seed-and-extend heuristics to tackle the problem effectively and efficiently. Experimental results show that IDBA-MTP outperforms existing assemblers by reconstructing 14% more mRNAs. Availability: www.cs.hku.hk/alse/hkubrg/.
Description: Lecture Notes in Computer Science, vol 8394 entitled: Research in Computational Molecular Biology: 18th Annual International Conference, RECOMB 2014, Pittsburgh, PA, USA, April 2-5, 2014: Proceedings
Wed, 01 Jan 2014 00:00:00 GMThttp://hdl.handle.net/10722/2011122014-01-01T00:00:00Z
- DNA motif representation with nucleotide dependencyhttp://hdl.handle.net/10722/57248Title: DNA motif representation with nucleotide dependency
Authors: Chin, F; Leung, HCM
Abstract: The problem of discovering novel motifs of binding sites is important to the understanding of gene regulatory networks. Motifs are generally represented by matrices (position weight matrix (PWM) or position specific scoring matrix (PSSM)) or strings. However, these representations cannot model biological binding sites well because they fail to capture nucleotide interdependence. It has been pointed out by many researchers that the nucleotides of the DNA binding site cannot be treated independently, for example, the binding sites of zinc finger in proteins. In this paper, a new representation called Scored Position Specific Pattern (SPSP), which is a generalization of the matrix and string representations, is introduced, which takes into consideration the dependent occurrences of neighboring nucleotides. Even though the problem of discovering the optimal motif in SPSP representation is proved to be NP-hard, we introduce a heuristic algorithm called SPSP Finder, which can effectively find optimal motifs in most simulated cases and some real cases for which existing popular motif-finding software, such as Weeder, MEME, and AlignACE, fail. © 2008 IEEE.
Tue, 01 Jan 2008 00:00:00 GMThttp://hdl.handle.net/10722/572482008-01-01T00:00:00Z
- A dynamic programming approach of finding an optimal broadcast schedule in minimizing total flow timehttp://hdl.handle.net/10722/53595Title: A dynamic programming approach of finding an optimal broadcast schedule in minimizing total flow time
Authors: Chan, WT; Chin, FYL; Zhang, Y; Zhu, H; Shen, H; Wong, PWH
Abstract: We study the problem of (off-line) broadcast scheduling in minimizing total flow time and propose a dynamic programming approach to compute an optimal broadcast schedule. Suppose the broadcast server has k pages and the last page request arrives at time n. The optimal schedule can be computed in O(k 3(n + k)k-1) time for the case that the server has a single broadcast channel. For m channels case, i.e., the server can broadcast m different pages at a time where m < k, the optimal schedule can be computed in O(nk-m) time when k and m are constants. Note that this broadcast scheduling problem is NP-hard when k is a variable and will take O(n k-m+1) time when k is fixed and m ≥ 1 with the straightforward implementation of the dynamic programming approach. © Springer Science + Business Media, LLC 2006.
Description: Proceedings of the 11th Annual International Computing and Combinatorics Conference as 'Off-line Algorithms for Minimizing the Total Flow Time in Broadcast Scheduling'
Sun, 01 Jan 2006 00:00:00 GMThttp://hdl.handle.net/10722/535952006-01-01T00:00:00Z
- Finding exact optimal motifs in matrix representation by partitioninghttp://hdl.handle.net/10722/53599Title: Finding exact optimal motifs in matrix representation by partitioning
Authors: Leung, HCM; Chin, FYL
Abstract: Motivation: Finding common patterns, or motifs, in the promoter regions of co-expressed genes is an important problem in bioinformatics. A common representation of the motif is by probability matrix or PSSM (position specific scoring matrix). However, even for a motif of length six or seven, there is no algorithm that can guarantee finding the exact optimal matrix from an infinite number of possible matrices. Results: T his paper introduces the first algorithm, called EOMM, for finding the exact optimal matrix-represented motif, or simply optimal motif. Based on branch-and-bound searching by partitioning the solution space recursively, EOMM can find the optimal motif of size up to eight or nine, and a motif of larger size with any desired accuracy on the principle that the smaller the error bound, the longer the running time. Experiments show that for some real and simulated data sets, EOMM finds the motif despite very weak signals when existing software, such as MEME and MITRA-PSSM, fails to do so. © The Author 2005. Published by Oxford University Press. All rights reserved.
Sat, 01 Jan 2005 00:00:00 GMThttp://hdl.handle.net/10722/535992005-01-01T00:00:00Z
- Optimal simulation of full binary trees on faulty hypercubeshttp://hdl.handle.net/10722/43631Title: Optimal simulation of full binary trees on faulty hypercubes
Authors: Chan, Bethany MY; Chin, Francis YL; Poon, ChungKeung
Abstract: The problem of operating full binary tree based algorithms on a hypercube with faulty nodes was investigated. Developing a method for embedding a full binary tree into the faulty hypercube is the solution to this problem. Two outcomes for embedding an (n-1)-tree into an n-cube with unit dilation and load, that were based on a new embedding technique, were presented. For the problem where the root can be mapped to any nonfaulty hypercube node, the optimum toleration of faults was shown. Moreover, it was demonstrated that the algorithm for the variable root embedding problem is maximal within a class algorithms called recursive embedding algorithms as far as the number of tolerable faults is concerned. Lastly, it was demonstrated that when an O(1/√n) fraction of nodes in the hypercube are faulty, a O(1)-load variable root embedding is not always possible regardless of the significance of the dilation.
Sun, 01 Jan 1995 00:00:00 GMThttp://hdl.handle.net/10722/436311995-01-01T00:00:00Z
- Approximation Algorithms for Some Optimal 2D and 3D Triangulationshttp://hdl.handle.net/10722/118221Title: Approximation Algorithms for Some Optimal 2D and 3D Triangulations
Authors: Fung, SPY; Wang, CA; Chin, FYL
Sun, 01 Jan 2006 00:00:00 GMThttp://hdl.handle.net/10722/1182212006-01-01T00:00:00Z
- Online uniformly inserting points on gridhttp://hdl.handle.net/10722/125680Title: Online uniformly inserting points on grid
Authors: Zhang, Y; Chang, Z; Chin, FYL; Ting, HF; Tsin, YH
Abstract: In this paper, we consider the problem of inserting points in a square grid, which has many background applications, including halftone in reprographic and image processing. We consider an online version of this problem, i.e., the points are inserted one at a time. The objective is to distribute the points as uniformly as possible. Precisely speaking, after each insertion, the gap ratio should be as small as possible. In this paper, we give an insertion strategy with a maximal gap ratio no more than 2 √2 ≈ 2.828, which is the first result on uniformly inserting point in a grid. Moreover, we show that no online algorithm can achieve the maximal gap ratio strictly less than 2.5 for a 3 × 3 grid. © Springer-Verlag Berlin Heidelberg 2010.
Description: LNCS v. 6124 is conference proceedings of 6th international conference, AAIM 2010
Fri, 01 Jan 2010 00:00:00 GMThttp://hdl.handle.net/10722/1256802010-01-01T00:00:00Z
- A constant-competitive algorithm for online OVSF code assignmenthttp://hdl.handle.net/10722/127350Title: A constant-competitive algorithm for online OVSF code assignment
Authors: Chin, FYL; Ting, HF; Zhang, Y
Abstract: Orthogonal Variable Spreading Factor (OVSF) code assignment is a fundamental problem in Wideband Code-Division Multiple-Access (W-CDMA) systems, which plays an important role in third generation mobile communications. In the OVSF problem, codes must be assigned to incoming call requests with different data rate requirements, in such a way that they are mutually orthogonal with respect to an OVSF code tree. An OVSF code tree is a complete binary tree in which each node represents a code associated with the combined bandwidths of its two children. To be mutually orthogonal, each leaf-to-root path must contain at most one assigned code. In this paper, we focus on the online version of the OVSF code assignment problem and give a 10-competitive algorithm (where the cost is measured by the total number of assignments and reassignments used). Our algorithm improves the previous O(h)-competitive result, where h is the height of the code tree, and also another recent constant-competitive result, where the competitive ratio is only constant under amortized analysis and the constant is not determined. We also improve the lower bound of the problem from 3/2 to 5/3. © 2008 Springer Science+Business Media, LLC.
Fri, 01 Jan 2010 00:00:00 GMThttp://hdl.handle.net/10722/1273502010-01-01T00:00:00Z
- Online problems for frequency assignment and OVSF code assignment in wireless communication networkshttp://hdl.handle.net/10722/127351Title: Online problems for frequency assignment and OVSF code assignment in wireless communication networks
Authors: Chan, JWT; Chin, FYL; Ting, HF; Zhang, Y
Thu, 01 Jan 2009 00:00:00 GMThttp://hdl.handle.net/10722/1273512009-01-01T00:00:00Z
- PERGA: A Paired-End Read Guided De Novo Assembler for Extending Contigs Using SVM Approachhttp://hdl.handle.net/10722/195935Title: PERGA: A Paired-End Read Guided De Novo Assembler for Extending Contigs Using SVM Approach
Authors: Zhu, X; Leung, HCM; Chin, FYL; Yiu, SM; Quan, G; Liu, B; Wang, Y
Abstract: Since the read lengths of high throughput sequencing (HTS) technologies are short, de novo assembly which plays significant roles in many applications remains a great challenge. Most of the state-of-the-art approaches base on de Bruijn graph strategy and overlap-layout strategy. However, these approaches which depend on k-mers or read overlaps do not fully utilize information of single-end and paired-end reads when resolving branches, e.g. the number and positions of reads supporting each possible extension are not taken into account when resolving branches. We present PERGA (Paired-End Reads Guided Assembler), a novel sequence-reads-guided de novo assembly approach, which adopts greedy-like prediction strategy for assembling reads to contigs and scaffolds. Instead of using single-end reads to construct contig, PERGA uses paired-end reads and different read overlap size thresholds ranging from Omax to Omin to resolve the gaps and branches. Moreover, by constructing a decision model using machine learning approach based on branch features, PERGA can determine the correct extension in 99.7% of cases. When the correct extension cannot be determined, PERGA will try to extend the contigs by all feasible extensions and determine the correct extension by using look ahead technology. We evaluated PERGA on both simulated Illumina data sets and real data sets, and it constructed longer and more correct contigs and scaffolds than other state-of-the-art assemblers IDBA-UD, Velvet, ABySS, SGA and CABOG. Availability: https://github.com/hitbio/PERGA
Tue, 01 Jan 2013 00:00:00 GMThttp://hdl.handle.net/10722/1959352013-01-01T00:00:00Z
- Constant-Competitive Tree Node Assignmenthttp://hdl.handle.net/10722/185122Title: Constant-Competitive Tree Node Assignment
Authors: Zhang, Y; Chin, FYL; Ting, HF
Wed, 01 Jan 2014 00:00:00 GMThttp://hdl.handle.net/10722/1851222014-01-01T00:00:00Z
- Online pricing for bundles of multiple itemshttp://hdl.handle.net/10722/185123Title: Online pricing for bundles of multiple items
Authors: Zhang, Y; Chin, FYL; Ting, HF
Wed, 01 Jan 2014 00:00:00 GMThttp://hdl.handle.net/10722/1851232014-01-01T00:00:00Z
- Constant competitive algorithms for unbounded one-Way trading under monotone hazard ratehttp://hdl.handle.net/10722/293929Title: Constant competitive algorithms for unbounded one-Way trading under monotone hazard rate
Authors: Zhang, Y; Chin, FYL; Lau, FCM; Tan, H; Ting, HF
Abstract: In the one-way trading problem, a seller has some product to be sold to a sequence of buyers in an online fashion, i.e., buyers come one after another. Each buyer has the accepted unit price which is known to the seller on his arrival. To maximize the total revenue, the seller has to carefully decide the amount of products to be sold to each buyer at the then-prevailing prices. In this paper, we study the unbounded one-way trading, i.e., the highest unit price among all buyers is positive and unbounded. We assume that the highest prices of buyers follow some distribution with monotone hazard rate, which is a well-adopted assumption. We investigate two variants, (1) the distribution is on the highest price among all buyers, and (2) the distribution of the highest price of each buyer is independent and identically distributed. To measure the performance of the algorithms, the expected competitive ratios, E[OPT]/E[ALG] and E[OPT/ALG], are considered. If the distributions satisfy the monotone hazard rate, for both of the above two variants, constant-competitive algorithms can be achieved.
Mon, 01 Jan 2018 00:00:00 GMThttp://hdl.handle.net/10722/2939292018-01-01T00:00:00Z
- Offline and online algorithms for single-minded selling problemhttp://hdl.handle.net/10722/294277Title: Offline and online algorithms for single-minded selling problem
Authors: Zhang, Y; Chin, FYL; Poon, SH; Ting, HF; Xu, D; Yu, D
Abstract: Given a seller with k types of items and n single-minded buyers, i.e., each buyer is only interested in a particular bundle of items, to maximize the revenue, the seller must assign some amount of bundles to each buyer with respect to the buyer's accepted price. Each buyer is associated with a value function such that is the accepted unit bundle price is willing to pay for x bundles. In this paper, we assume that bundles can be sold fractionally. The single-minded item selling problem is proved to be NP-hard. Moreover, we give an -approximation algorithm. For the online version, i.e., buyers come one by one and the decision must be made immediately on the arrival of each buyer, an -competitive algorithm is given, where h is the highest unit item price among all buyers.
Wed, 01 Jan 2020 00:00:00 GMThttp://hdl.handle.net/10722/2942772020-01-01T00:00:00Z
- Conserved Patterns in Bioinformaticshttp://hdl.handle.net/10722/242058Title: Conserved Patterns in Bioinformatics
Authors: Chin, FYL
Sat, 01 Jan 2011 00:00:00 GMThttp://hdl.handle.net/10722/2420582011-01-01T00:00:00Z
- Space-Bounded Online Bin Packing Problemhttp://hdl.handle.net/10722/242059Title: Space-Bounded Online Bin Packing Problem
Authors: Chin, FYL
Description: Organized by Advanced Computing and Microelectronics Unit, Indian Statistical Institute
Sat, 01 Jan 2011 00:00:00 GMThttp://hdl.handle.net/10722/2420592011-01-01T00:00:00Z
- Statistical Database Securityhttp://hdl.handle.net/10722/242060Title: Statistical Database Security
Authors: Chin, FYL
Description: Keynote
Fri, 01 Jan 2010 00:00:00 GMThttp://hdl.handle.net/10722/2420602010-01-01T00:00:00Z
- Experiences in Running a Flexible, Web-Based, and Self-Paced Course (Keynote)http://hdl.handle.net/10722/242061Title: Experiences in Running a Flexible, Web-Based, and Self-Paced Course (Keynote)
Authors: Chin, FYL
Sat, 01 Jan 2011 00:00:00 GMThttp://hdl.handle.net/10722/2420612011-01-01T00:00:00Z
- MetaCluster-TA: taxonomic annotation for metagenomic data based on assembly-assisted binninghttp://hdl.handle.net/10722/195943Title: MetaCluster-TA: taxonomic annotation for metagenomic data based on assembly-assisted binning
Authors: Wang, Y; Leung, HCM; Yiu, SM; Chin, FYL
Abstract: Background Taxonomic annotation of reads is an important problem in metagenomic analysis. Existing annotation tools, which rely on the approach of aligning each read to the taxonomic structure, are unable to annotate many reads efficiently and accurately as reads (100 bp) are short and most of them come from unknown genomes. Previous work has suggested assembling the reads to make longer contigs before annotation. More reads/contigs can be annotated as a longer contig (in Kbp) can be aligned to a taxon even if it is from an unknown species as long as it contains a conserved region of that taxon. Unfortunately existing metagenomic assembly tools are not mature enough to produce long enough contigs. Binning tries to group reads/contigs of similar species together. Intuitively, reads in the same group (cluster) should be annotated to the same taxon and these reads altogether should cover a significant portion of the genome alleviating the problem of short contigs if the quality of binning is high. However, no existing work has tried to use binning results to help solve the annotation problem. This work explores this direction. Results In this paper, we describe MetaCluster-TA, an assembly-assisted binning-based annotation tool which relies on an innovative idea of annotating binned reads instead of aligning each read or contig to the taxonomic structure separately. We propose the novel concept of the 'virtual contig' (which can be up to 10 Kb in length) to represent a set of reads and then represent each cluster as a set of 'virtual contigs' (which together can be total up to 1 Mb in length) for annotation. MetaCluster-TA can outperform widely-used MEGAN4 and can annotate (1) more reads since the virtual contigs are much longer; (2) more accurately since each cluster of long virtual contigs contains global information of the sampled genome which tends to be more accurate than short reads or assembled contigs which contain only local information of the genome; and (3) more efficiently since there are much fewer long virtual contigs to align than short reads. MetaCluster-TA outperforms MetaCluster 5.0 as a binning tool since binning itself can be more sensitive and precise given long virtual contigs and the binning results can be improved using the reference taxonomic database. Conclusions MetaCluster-TA can outperform widely-used MEGAN4 and can annotate more reads with higher accuracy and higher efficiency. It also outperforms MetaCluster 5.0 as a binning tool.
Description: This article is part of the supplement: Selected articles from the Twelfth Asia Pacific Bioinformatics Conference (APBC 2014): Genomics
Wed, 01 Jan 2014 00:00:00 GMThttp://hdl.handle.net/10722/1959432014-01-01T00:00:00Z
- Competitive Algorithms for Online Pricinghttp://hdl.handle.net/10722/165837Title: Competitive Algorithms for Online Pricing
Authors: Zhang, Y; Chin, FYL; Ting, HF
Sun, 01 Jan 2012 00:00:00 GMThttp://hdl.handle.net/10722/1658372012-01-01T00:00:00Z
- MetaCluster 5.0: a two-round binning approach for metagenomic data for low-abundance species in a noisy samplehttp://hdl.handle.net/10722/165872Title: MetaCluster 5.0: a two-round binning approach for metagenomic data for low-abundance species in a noisy sample
Authors: Wang, Y; Leung, HCM; Yiu, SM; Chin, FYL
Abstract: MOTIVATION: Metagenomic binning remains an important topic in metagenomic analysis. Existing unsupervised binning methods for next-generation sequencing (NGS) reads do not perform well on (i) samples with low-abundance species or (ii) samples (even with high abundance) when there are many extremely low-abundance species. These two problems are common for real metagenomic datasets. Binning methods that can solve these problems are desirable. RESULTS: We proposed a two-round binning method (MetaCluster 5.0) that aims at identifying both low-abundance and high-abundance species in the presence of a large amount of noise due to many extremely low-abundance species. In summary, MetaCluster 5.0 uses a filtering strategy to remove noise from the extremely low-abundance species. It separate reads of high-abundance species from those of low-abundance species in two different rounds. To overcome the issue of low coverage for low-abundance species, multiple w values are used to group reads with overlapping w-mers, whereas reads from high-abundance species are grouped with high confidence based on a large w and then binning expands to low-abundance species using a relaxed (shorter) w. Compared to the recent tools, TOSS and MetaCluster 4.0, MetaCluster 5.0 can find more species (especially those with low abundance of say 6x to 10x) and can achieve better sensitivity and specificity using less memory and running time. AVAILABILITY: http://i.cs.hku.hk/alse/MetaCluster/ CONTACT: chin@cs.hku.hk.
Description: All proceedings papers are available as open access at: OUP Bioinformatics (http://www.eccb12.org/proceedings-talks)
Sun, 01 Jan 2012 00:00:00 GMThttp://hdl.handle.net/10722/1658722012-01-01T00:00:00Z
- IDBA-MT: De Novo Assembler for Metatranscriptomic Data Generated from Next-Generation Sequencing Technologyhttp://hdl.handle.net/10722/187133Title: IDBA-MT: De Novo Assembler for Metatranscriptomic Data Generated from Next-Generation Sequencing Technology
Authors: Leung, HCM; Yiu, SM; Parkinson, J; Chin, FYL
Tue, 01 Jan 2013 00:00:00 GMThttp://hdl.handle.net/10722/1871332013-01-01T00:00:00Z
- IDBA-tran: a more robust de novo de Bruijn graph assembler for transcriptomes with uneven expression levelshttp://hdl.handle.net/10722/187134Title: IDBA-tran: a more robust de novo de Bruijn graph assembler for transcriptomes with uneven expression levels
Authors: Peng, Y; Leung, HCM; Yiu, SM; Lv, MJ; Zhu, XG; Chin, FYL
Abstract: Motivation: RNA sequencing based on next-generation sequencing technology is effective for analyzing transcriptomes. Like de novo genome assembly, de novo transcriptome assembly does not rely on any reference genome or additional annotation information, but is more difficult. In particular, isoforms can have very uneven expression levels (e.g. 1:100), which make it very difficult to identify low-expressed isoforms. One challenge is to remove erroneous vertices/edges with high multiplicity (produced by high-expressed isoforms) in the de Bruijn graph without removing correct ones with not-so-high multiplicity from low-expressed isoforms. Failing to do so will result in the loss of low-expressed isoforms or having complicated subgraphs with transcripts of different genes mixed together due to erroneous vertices/edges.
Contributions: Unlike existing tools, which remove erroneous vertices/edges with multiplicities lower than a global threshold, we use a probabilistic progressive approach to iteratively remove them with local thresholds. This enables us to decompose the graph into disconnected components, each containing a few genes, if not a single gene, while retaining many correct vertices/edges of low-expressed isoforms. Combined with existing techniques, IDBA-Tran is able to assemble both high-expressed and low-expressed transcripts and outperform existing assemblers in terms of sensitivity and specificity for both simulated and real data.
Availability:http://www.cs.hku.hk/∼alse/idba_tran.
Supplementary information:Supplementary data are available at Bioinformatics online.
Tue, 01 Jan 2013 00:00:00 GMThttp://hdl.handle.net/10722/1871342013-01-01T00:00:00Z
- Inferring gene regulatory network from variable delay high temporal datahttp://hdl.handle.net/10722/189624Title: Inferring gene regulatory network from variable delay high temporal data
Authors: Yalamanchili, HK; Yan, B; Li, MJ; Qin, J; Zhao, Z; Chin, FYL; Wang, JJ
Abstract: BACKGROUND: Recent advances in the live cell imaging techniques have enabled us to closely observe the cell lineages with high temporal resolution. Gene expression patterns in these cell lineages suggest regulatory pathways, which help us to decode various underlying mechanisms in developmental biology. Inferring gene networks from temporal data is fundamental but still a long standing challenge. Quite a number of gene network inference methods are proposed. However, every method has its own biases. Network Inference gets complicated with the dynamic delay associated to the gene regulation and the number of time points. This advocates the need of new gene network inference methods that can effectively handle the variable delay in high temporal live cell imaging data. METHOD: Here we design a new gene network inference algorithm based on the dynamic local alignment of time series gene expression data. The novelty of our method is the use of gapped alignment to handle the variable delay in gene regulation. The local alignment can even detect the short term gene regulations in the cell lineages, that are undetectable by traditional correlation and Mutual Information based methods. RESULTS: We tested our method on both synthetic and C. elegans live cell imaging data and compared its performance against other popular methods like MIC, ARACNE and Banjo. The area under the curve (AUC) of our method is observed to be significantly higher when compared to others. Besides the well-established regulatory relationships we also predicted few new relationships that are worthy of subsequent experimental investigation.
Description: No. O030
Tue, 01 Jan 2013 00:00:00 GMThttp://hdl.handle.net/10722/1896242013-01-01T00:00:00Z
- DDGni: dynamic delay gene-network inference from high-temporal data using gapped local alignmenthttp://hdl.handle.net/10722/193171Title: DDGni: dynamic delay gene-network inference from high-temporal data using gapped local alignment
Authors: Yalamanchili, HK; Yan, B; Li, J; Qin, J; Zhao, Z; Chin, FYL; Wang, JJ
Abstract: MOTIVATION: Inferring gene-regulatory networks is very crucial in decoding various complex mechanisms in biological systems. Synthesis of a fully functional transcriptional factor/protein from DNA involves series of reactions, leading to a delay in gene regulation. The complexity increases with the dynamic delay induced by other small molecules involved in gene regulation, and noisy cellular environment. The dynamic delay in gene regulation is quite evident in high-temporal live cell lineage-imaging data. Although a number of gene-network-inference methods are proposed, most of them ignore the associated dynamic time delay. RESULTS: Here, we propose DDGni (dynamic delay gene-network inference), a novel gene-network-inference algorithm based on the gapped local alignment of gene-expression profiles. The local alignment can detect short-term gene regulations, that are usually overlooked by traditional correlation and mutual Information based methods. DDGni uses 'gaps' to handle the dynamic delay and non-uniform sampling frequency in high-temporal data, like live cell imaging data. Our algorithm is evaluated on synthetic and yeast cell cycle data, and Caenorhabditis elegans live cell imaging data against other prominent methods. The area under the curve of our method is significantly higher when compared to other methods on all three datasets. AVAILABILITY: The program, datasets and supplementary files are available at http://www.jjwanglab.org/DDGni/.
Wed, 01 Jan 2014 00:00:00 GMThttp://hdl.handle.net/10722/1931712014-01-01T00:00:00Z
- Online algorithms for 1-space bounded 2-dimensional bin packing and square packinghttp://hdl.handle.net/10722/184865Title: Online algorithms for 1-space bounded 2-dimensional bin packing and square packing
Authors: Zhang, Y; Chin, FYL; Ting, HF; Han, X; Poon, CK; Tsin, YH; Ye, DS
Abstract: In this paper, we study 1-space bounded 2-dimensional bin packing and square packing. A sequence of rectangular items (square items) arrive one by one, each item must be packed into a square bin of unit size on its arrival without any information about future items. When packing items, 90-rotation is allowed. 1-space bounded means there is only one active bin. If the active bin cannot accommodate the coming item, it will be closed and a new bin will be opened. The objective is to minimize the total number of bins used for packing all items in the sequence. Our contributions are as follows: For 1-space bounded 2-dimensional bin packing, we propose an online packing strategy with competitive ratio 5.06. A lower bound of 3.17 on the competitive ratio is proven. Moreover, we study 1-space bounded square packing, where each item is a square with side length no more than 1. A 4.3-competitive algorithm is achieved, and a lower bound of 2.94 on the competitive ratio is given. All these bounds surpass the previously best known results. © 2013 Springer-Verlag Berlin Heidelberg.
Description: LNCS v. 7936 entitled: Computing and combinatorics: 19th International Conference, COCOON 2013 ... proceedings
Tue, 01 Jan 2013 00:00:00 GMThttp://hdl.handle.net/10722/1848652013-01-01T00:00:00Z
- Reconstructing k-Reticulated Phylogenetic Network from a Set of Gene Treeshttp://hdl.handle.net/10722/191051Title: Reconstructing k-Reticulated Phylogenetic Network from a Set of Gene Trees
Authors: Vu, H; Chin, FYL; Hon, WK; Leung, HCM; Sadakane, K; Sung, WK; Yiu, SM
Abstract: The time complexity of existing algorithms for reconstructing a level-x phylogenetic network increases exponentially in x. In this paper, we propose a new classification of phylogenetic networks called k-reticulated network. A k-reticulated network can model all level-k networks and some level-x networks with x > k. We design algorithms for reconstructing k-reticulated network (k = 1 or 2) with minimum number of hybrid nodes from a set of m binary trees, each with n leaves in O(mn 2) time. The implication is that some level-x networks with x > k can now be reconstructed in a faster way. We implemented our algorithm (ARTNET) and compared it with CMPT. We show that ARTNET outperforms CMPT in terms of running time and accuracy. We also consider the case when there does not exist a 2-reticulated network for the input trees. We present an algorithm computing a maximum subset of the species set so that a new set of subtrees can be combined into a 2-reticulated network.
Description: Lecture Notes in Computer Science v. 7875 entitled: Bioinformatics research and applications : 9th international symposium, ISBRA 2013 ... proceedings
Tue, 01 Jan 2013 00:00:00 GMThttp://hdl.handle.net/10722/1910512013-01-01T00:00:00Z
- Algorithms for Placing Monitors in a Flow Networkhttp://hdl.handle.net/10722/191782Title: Algorithms for Placing Monitors in a Flow Network
Authors: Chin, FYL; Chrobak, M; Yan, L
Sun, 01 Jan 2012 00:00:00 GMThttp://hdl.handle.net/10722/1917822012-01-01T00:00:00Z
- Non-adaptive Complex Group Testing with Multiple Positive Setshttp://hdl.handle.net/10722/191783Title: Non-adaptive Complex Group Testing with Multiple Positive Sets
Authors: Chin, FYL; Leung, HCM; Yiu, SM
Tue, 01 Jan 2013 00:00:00 GMThttp://hdl.handle.net/10722/1917832013-01-01T00:00:00Z
- Competitive algorithms for unbounded One-Way Tradinghttp://hdl.handle.net/10722/205524Title: Competitive algorithms for unbounded One-Way Trading
Authors: Chin, FYL; Fu, B; Jiang, MH; Ting, HF; Zhang, Y
Abstract: In the one-way trading problem, a seller has some product to be sold to a sequence σ of buyers u1, u2,..., uσ arriving online and he needs to decide, for each ui , the amount of product to be sold to ui at the then-prevailing market price pi. The objective is to maximize the seller's revenue. We note that most previous algorithms for the problem need to impose some artificial upper bound M and lower bound m on the market prices, and the seller needs to know either the values of M and m, or their ratio M/m, at the outset. Moreover, the performance guarantees provided by these algorithms depend only on M and m, and are often too loose; for example, given a one-way trading algorithm with competitive ratio Θ(log(M/m)), its actual performance can be significantly better when the actual highest to actual lowest price ratio is significantly smaller than M/m. This paper gives a one-way trading algorithm that does not impose any bounds on market prices and whose performance guarantee depends directly on the input. In particular, we give a class of one-way trading algorithms such that for any positive integer h and any positive number ε, we have an algorithm Ah,ε that has competitive ratio O (logr*(log(2) r*) ... (log(h - 1) r*)(log(h) r*)1 + ε) if the value of r* = p*/p1, the ratio of the highest market price p* = maxi pi and the first price p1, is large and satisfy log(h) r* > 1, where log(i) x denotes the application of the logarithm function i times to x; otherwise, Ah,ε has a constant competitive ratio Γh . We also show that our algorithms are near optimal by showing that given any positive integer h and any one-way trading algorithm A, we can construct a sequence of buyers σ with log(h) r* > 1 such that the ratio between the optimal revenue and the revenue obtained by A is at least Ω(logr*(log(2) r*) ... (log(h - 1) r*) (log(h) r*)). © 2014 Springer International Publishing.
Description: LNCS v. 8546 entitled: Algorithmic aspects in information and management : 10th International Conference, AAIM 2014 ... proceedings
Wed, 01 Jan 2014 00:00:00 GMThttp://hdl.handle.net/10722/2055242014-01-01T00:00:00Z
- On the Complexity of Constrained Sequences Alignment Problemshttp://hdl.handle.net/10722/205525Title: On the Complexity of Constrained Sequences Alignment Problems
Authors: Zhang, Y; Chan, J; Chin, FYL; Ting, HF; Ye, D; Zhang, F; Shi, JY
Abstract: We consider the problem of aligning a set of sequences subject to a given constrained sequence, which has applications in computational biology. In this paper we show that sequence alignment for two sequences A and B with a given distance function and a constrained sequence of k identical characters (say character c) can be solved in O( min {kn 2,(t − k)n 2}) time, where n is the length of A and B, and t is the minimum number of occurrences of character c in A and B. We also prove that the problem of constrained center-star sequence alignment (CCSA) is NP-hard even over the binary alphabet. Furthermore, for some distance function, we show that no polynomial-time algorithm can approximate the CCSA within any constant ratio.
Description: Lecture Notes in Computer Science, vol. 8497 entitled: Frontiers in algorithmics : 8th International Workshop, FAW 2014, Zhangjiajie, China, June 28-30, 2014: proceedings
Wed, 01 Jan 2014 00:00:00 GMThttp://hdl.handle.net/10722/2055252014-01-01T00:00:00Z
- Unbounded one-way trading on distributions with monotone hazard ratehttp://hdl.handle.net/10722/261169Title: Unbounded one-way trading on distributions with monotone hazard rate
Authors: Zhang, Y; Chin, FYL; Lau, FCM; Ting, HF; Tan, HS
Description: Session 7: Application
Sun, 01 Jan 2017 00:00:00 GMThttp://hdl.handle.net/10722/2611692017-01-01T00:00:00Z
- Approximation and competitive algorithms for single-minded selling problemhttp://hdl.handle.net/10722/277785Title: Approximation and competitive algorithms for single-minded selling problem
Authors: Chin, FYL; Poon, SH; Ting, HF; Xu, D; Yu, D; Zhang, Y
Abstract: The problem of item selling with the objective of maximizing the revenue is studied. Given a seller with k types of items and n single-minded buyers, i.e., each buyer is only interested in a particular bundle of items, to maximize the revenue, the seller must carefully assign some amount of bundles to each buyer with respect to the buyer’s accepted price. Each buyer bi is associated with a value function vi(⋅) such that vi(x) is the accepted unit bundle price bi is willing to pay for x bundles. We show that the single-minded item selling problem is NP-hard. Moreover, we give an O(k−−√) -approximation algorithm. For the online version, i.e., the buyers come one by one and the decision on each buyer must be made before the arrival of the next buyer, an O(k√⋅(logh+logk)) -competitive algorithm is achieved, where h is the highest unit item price among all buyers.
Mon, 01 Jan 2018 00:00:00 GMThttp://hdl.handle.net/10722/2777852018-01-01T00:00:00Z
- Phylogeny-structured carbohydrate metabolism across microbiomes collected from different units in wastewater treatment processhttp://hdl.handle.net/10722/231694Title: Phylogeny-structured carbohydrate metabolism across microbiomes collected from different units in wastewater treatment process
Authors: Xia, Y; Chin, FYL; Chao, Y; Zhang, T
Abstract: With respect to global priority for bioenergy production from plant biomass, understanding the fundamental genetic associations underlying carbohydrate metabolisms is crucial for the development of effective biorefinery process. Compared with gut microbiome of ruminal animals and wood-feed insects, knowledge on carbohydrate metabolisms of engineered biosystems is limited.
Thu, 01 Jan 2015 00:00:00 GMThttp://hdl.handle.net/10722/2316942015-01-01T00:00:00Z
- Cellular adhesiveness and cellulolytic capacity in Anaerolineae revealed by omics-based genome interpretationhttp://hdl.handle.net/10722/231693Title: Cellular adhesiveness and cellulolytic capacity in Anaerolineae revealed by omics-based genome interpretation
Authors: Xia, Y; Wang, Y; Wang, Y; Chin, FYL; Zhang, T
Abstract: The Anaerolineae lineage of Chloroflexi had been identified as one of the core microbial populations in anaerobic digesters; however, the ecological role of the Anaerolineae remains uncertain due to the scarcity of isolates and annotated genome sequences. Our previous metatranscriptional analysis revealed this prevalent population that showed minimum involvement in the main pathways of cellulose hydrolysis and subsequent methanogenesis in the thermophilic cellulose fermentative consortium (TCF).
Fri, 01 Jan 2016 00:00:00 GMThttp://hdl.handle.net/10722/2316932016-01-01T00:00:00Z
- Voting algorithms for discovering long motifshttp://hdl.handle.net/10722/93288Title: Voting algorithms for discovering long motifs
Authors: Chin, FYL; Leung, HCM
Abstract: Pevzner and Sze [14] have introduced the Planted (l,d)-Motif Problem to find similar patterns (motifs) in sequences which represent the promoter region of co-regulated genes. l is the length of the motif and d is the maximum Hamming distance around the similar patterns. Many algorithms have been developed to solve this motif problem. However, these algorithms either have long running times or do not guarantee the motif can be found. In this paper, we introduce new algorithms to solve the motif problem. Our algorithms can find motifs in reasonable time for not only the challenging (9,2), (11,3), (15,5)-motif problems but for even longer motifs, say (20,7), (30,11) and (40,15), which have never been seriously attempted by other researchers because of heavy time and space requirements.
Sat, 01 Jan 2005 00:00:00 GMThttp://hdl.handle.net/10722/932882005-01-01T00:00:00Z
- Generalized planted (l,d)-motif problem with negative sethttp://hdl.handle.net/10722/93313Title: Generalized planted (l,d)-motif problem with negative set
Authors: Leung, HCM; Chin, FYL
Abstract: Finding similar patterns (motifs) in a set of sequences is an important problem in Computational Molecular Biology. Pevzner and Sze [18] defined the planted (l,d)-motif problem as trying to find a length-l pattern that occurs in each input sequence with at most d substitutions. When d is large, this problem is difficult to solve because the input sequences do not contain enough information on the motif. In this paper, we propose a generalized planted (l,d)-motif problem which considers as input an additional set of sequences without any substring similar to the motif (negative set) as extra information. We analyze the effects of this negative set on the finding of motifs, and define a set of unsolvable problems and another set of most difficult problems, known as "challenging generalized problems". We develop an algorithm called VANS based on voting and other novel techniques, which can solve the (9,3), (11,4),(15,6) and (20,8)-motif problems which were unsolvable before as well as challenging problems of the planted (l,d)-motif problem such as (9,2), (11,3), (15,5) and (20,7)-motif problems. © Springer-Verlag Berlin Heidelberg 2005.
Sat, 01 Jan 2005 00:00:00 GMThttp://hdl.handle.net/10722/933132005-01-01T00:00:00Z
- Real-time multiple head shape detection and tracking system with decentralized trackershttp://hdl.handle.net/10722/93337Title: Real-time multiple head shape detection and tracking system with decentralized trackers
Authors: Yuk, JSC; Wong, KYK; Chung, RHY; Chin, FYL; Chow, KP
Abstract: This paper presents a robust human tracking system which incorporates automatic detection of head shape objects with decentralized tracking approach. A fast and robust probabilistic shape contour matching algorithm is applied to the input image frame to detect and locate head shape objects. The detected objects are then tracked by decentralized trackers. Here, a decentralized tracker refers to the tracker that tracks exactly one object. Essentially, each newly detected object will instantiate an individual tracker, which tracks the object and destroys itself when the object disappears. Two trackers communicate with each other only when they are getting close enough. This approach simplifies the competition of targets between trackers, and is more efficient than the centralized approach whose time complexity is greatly depends on the number of tracked objects. The system has been tested with several challenging digital surveillance video sequences, and the results show the robustness and the efficiency of the system under crowded and clutter environment. © 2006 IEEE.
Sun, 01 Jan 2006 00:00:00 GMThttp://hdl.handle.net/10722/933372006-01-01T00:00:00Z
- Mining confident rules without support requirementhttp://hdl.handle.net/10722/93173Title: Mining confident rules without support requirement
Authors: Wang, K; He, Y; Cheung, DW; Chin, FYL
Abstract: An open problem is to find all rules that satisfy a minimum confidence but not necessarily a minimum support. Without the support requirement, the classic support-based pruning strategy is inapplicable. The problem demands a confidence-based pruning strategy. In particular, the following monotonicity of confidence, called the universal-existential upward closure, holds: if a rule of size k is confident (for the given minimum confidence), for every other attribute not in the rule, some specialization of size k + 1 using the attribute must be confident. Like the support-based pruning, the bottleneck is at the memory that often is too small to store the candidates required for search. We implement this strategy on disk and study its performance.
Mon, 01 Jan 2001 00:00:00 GMThttp://hdl.handle.net/10722/931732001-01-01T00:00:00Z
- Online tree node assignment with resource augmentationhttp://hdl.handle.net/10722/93096Title: Online tree node assignment with resource augmentation
Authors: Chan, JWT; Chin, FYL; Ting, HF; Zhang, Y
Abstract: Given a complete binary tree of height h, the online tree node assignment problem is to serve a sequence of assignment/release requests, where an assignment request, with an integer parameter 0≤i≤h, is served by assigning a (tree) node at level (or height) i and a release request is served by releasing a specified assigned node. The node assignments have to guarantee that no node is assigned to two assignment requests unreleased, and every leaf-to-root path of the tree contains at most one assigned node. With assigned node reassignments allowed, the target of the problem is to minimize the number of assignments/reassigments, i.e., the cost, to serve the whole sequence of requests. This online tree node assignment problem is fundamental to many applications, including OVSF code assignment in WCDMA networks, buddy memory allocation and hypercube subcube allocation. Most of the previous results focus on how to achieve good performance when the same amount of resource is given to both the online and the optimal offline algorithms, i.e., one tree. In this paper, we focus on resource augmentation, where the online algorithm is allowed to use more trees than the optimal offline algorithm. By using different approaches, we give (1) a 1-competitive online algorithm, which uses (h+1)/2 trees, and is optimal because (h+1)/2 trees are required by any online algorithm to match the cost of the optimal offline algorithm with one tree; (2) a 2-competitive algorithm with 3h/8+2 trees; (3) an amortized (4/3+α)- competitive algorithm with (11/4+4/(3α)) trees, for any α where 0<α≤4/3. © 2009 Springer Berlin Heidelberg.
Description: LNCS v. 5609 is Proceedings of the 15th Annual International Conference; Session - TS11: Algorithms
Thu, 01 Jan 2009 00:00:00 GMThttp://hdl.handle.net/10722/930962009-01-01T00:00:00Z
- Predicting protein complexes from PPI data: A core-attachment approachhttp://hdl.handle.net/10722/60624Title: Predicting protein complexes from PPI data: A core-attachment approach
Authors: Leung, HCM; Xiang, Q; Yiu, SM; Chin, FYL
Abstract: Protein complexes play a critical role in many biological processes. Identifying the component proteins in a protein complex is an important step in understanding the complex as well as the related biological activities. This paper addresses the problem of predicting protein complexes from the protein-protein interaction (PPI) network of one species using a computational approach. Most of the previous methods rely on the assumption that proteins within the same complex would have relatively more interactions. This translates into dense subgraphs in the PPI network. However, the existing software tools have limited success. Recently, Gavin et al. (2006) provided a detailed study on the organization of protein complexes and suggested that a complex consists of two parts: a core and an attachment. Based on this core-attachment concept, we developed a novel approach to identify complexes from the PPI network by identifying their cores and attachments separately. We evaluated the effectiveness of our proposed approach using three different datasets and compared the quality of our predicted complexes with three existing tools. The evaluation results show that we can predict many more complexes and with higher accuracy than these tools with an improvement of over 30%. To verify the cores we identified in each complex, we compared our cores with the mediators produced by Andreopoulos et al. (2007), which were claimed to be the cores, based on the benchmark result produced by Gavin et al. (2006). We found that the cores we produced are of much higher quality ranging from 10- to 30-fold more correctly predicted cores and with better accuracy. Availability: http://alse.cs.hku.hk/ complexes/. © Mary Ann Liebert, Inc. 2009.
Thu, 01 Jan 2009 00:00:00 GMThttp://hdl.handle.net/10722/606242009-01-01T00:00:00Z
- Dynamic offline conflict-free coloring for unit diskshttp://hdl.handle.net/10722/61203Title: Dynamic offline conflict-free coloring for unit disks
Authors: Chan, JWT; Chin, FYL; Hong, X; Ting, HF
Abstract: A conflict-free coloring for a given set of disks is a coloring of the disks such that for any point p on the plane there is a disk among the disks covering p having a color different from that of the rest of the disks that covers p. In the dynamic offline setting, a sequence of disks is given, we have to color the disks one-by-one according to the order of the sequence and maintain the conflict-free property at any time for the disks that are colored. This paper focuses on unit disks, i.e., disks with radius one. We give an algorithm that colors a sequence of n unit disks in the dynamic offline setting using O(logn) colors. The algorithm is asymptotically optimal because Ω(logn) colors is necessary to color some set of n unit disks for any value of n [9]. © 2009 Springer Berlin Heidelberg.
Description: LNCS v. 5426 is Proceedings of the 6th International Workshop, WAOA 2008; WAOA – Audimax A
Thu, 01 Jan 2009 00:00:00 GMThttp://hdl.handle.net/10722/612032009-01-01T00:00:00Z
- Algorithms for Placing Monitors in a Flow Networkhttp://hdl.handle.net/10722/61201Title: Algorithms for Placing Monitors in a Flow Network
Authors: Chin, FYL; Chrobak, M; Yan, L
Abstract: In the Flow Edge-Monitor Problem, we are given an undirected graph G = (V,E), an integer k > 0 and some unknown circulation ψ on G. We want to find a set of k edges in G, so that if we place k monitors on those edges to measure the flow along them, the total number of edges for which the flow can be uniquely determined is maximized. In this paper, we first show that the Flow Edge-Monitor Problem is NP-hard, and then we give two approximation algorithms: a 3-approximation algorithm with running time O((m + n)2) and a 2-approximation algorithm with running time O((m + n)3), where n = |V| and m = |E|.
Thu, 01 Jan 2009 00:00:00 GMThttp://hdl.handle.net/10722/612012009-01-01T00:00:00Z
- Minimum Manhattan network is NP-completehttp://hdl.handle.net/10722/61196Title: Minimum Manhattan network is NP-complete
Authors: Chin, FYL; Guo, Z; Sun, H
Description: Session 3-A
Thu, 01 Jan 2009 00:00:00 GMThttp://hdl.handle.net/10722/611962009-01-01T00:00:00Z
- Maximum Stabbing Line in 2D Planehttp://hdl.handle.net/10722/93374Title: Maximum Stabbing Line in 2D Plane
Authors: Chin, FYL; Wang, CA; Wang, FL
Fri, 01 Jan 1999 00:00:00 GMThttp://hdl.handle.net/10722/933741999-01-01T00:00:00Z
- Finding the medial axis of a simple polygon in linear timehttp://hdl.handle.net/10722/93428Title: Finding the medial axis of a simple polygon in linear time
Authors: Chin, FYL; Snoeyink, J; Wang, CA
Sun, 01 Jan 1995 00:00:00 GMThttp://hdl.handle.net/10722/934281995-01-01T00:00:00Z
- A simple algorithm for finding all k-edge-connected componentshttp://hdl.handle.net/10722/217774Title: A simple algorithm for finding all k-edge-connected components
Authors: Wang, T; Zhang, Y; Chin, FYL; Ting, HF; Tsin, YH; Poon, SH
Thu, 01 Jan 2015 00:00:00 GMThttp://hdl.handle.net/10722/2177742015-01-01T00:00:00Z
- Online algorithms for 1-space bounded 2-dimensional bin packing and square packinghttp://hdl.handle.net/10722/217770Title: Online algorithms for 1-space bounded 2-dimensional bin packing and square packing
Authors: Zhang, Y; Chin, FYL; Ting, HF; Xin, H; Poon, CK; Tsin, YH; Ye, D
Wed, 01 Jan 2014 00:00:00 GMThttp://hdl.handle.net/10722/2177702014-01-01T00:00:00Z
- Constrained pairwise and center-star sequences alignment problemshttp://hdl.handle.net/10722/217295Title: Constrained pairwise and center-star sequences alignment problems
Authors: Zhang, Y; Chan, J; Chin, FYL; Ting, HF; Ye, D; Zhang, F; Shi, J
Fri, 01 Jan 2016 00:00:00 GMThttp://hdl.handle.net/10722/2172952016-01-01T00:00:00Z