SeqDev: An Algorithm for Constructing Genetic Elements Using Comparative Assembly
Keywords: Genome sequences, Assembler, Genetic elements, Bioinformatics
With the availability and low cost of recent next-generation sequencing technologies, the genomes of many organisms are being sequenced frequently. Quick assembly of genomes, transcriptomes, and target contigs from the raw data these technologies generate has therefore become necessary for a better understanding of biological systems. This article proposes an algorithm, SeqDev (Sequence Developer), for constructing contigs from raw reads using reference sequences. First, a weighted frequency-based consensus mechanism named BlastAssemb produces a primary construct of the sequence, which may contain gaps. Then a gap filling search (GFS) algorithm, built on a suffix array, searches the raw reads for the sequences missing from the primary construct. To evaluate the algorithm, we chose the raw genome data of Pokkali (rice) as input and Japonica (rice) as the reference. Experimental results demonstrated that the proposed algorithm accurately constructs promoter sequences of Pokkali from its raw genome data. The constructed promoter sequences were 93-100% identical to the reference and aligned with 96-100% of the corresponding reference sequences, with E-values ranging from 0.0 to 2e-14. These results indicate that the proposed method is a potential algorithm for constructing target contigs from raw sequences with the help of reference sequences. Further wet lab validation with specific Pokkali promoter sequences would strengthen the case for this method as a robust algorithm for target contig assembly.
Plant Tissue Cult. & Biotech. 26(1): 105-121, 2016 (June)
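
The two stages named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' BlastAssemb or GFS implementation, whose details the abstract does not give. It assumes that read alignments arrive as hypothetical `(start, sequence, weight)` tuples in reference coordinates, that uncovered positions are marked `N`, and that a gap is filled by a read containing the exact flanking bases on both sides. All function names and the `flank` parameter are illustrative.

```python
from collections import defaultdict


def weighted_consensus(length, hits):
    """Build a draft sequence by weighted per-position base voting.

    hits: iterable of (start, sequence, weight) in reference coordinates
    (an assumed input format). Positions with no votes become 'N'.
    """
    votes = [defaultdict(float) for _ in range(length)]
    for start, seq, weight in hits:
        for offset, base in enumerate(seq):
            pos = start + offset
            if 0 <= pos < length:
                votes[pos][base] += weight
    return "".join(max(v, key=v.get) if v else "N" for v in votes)


def build_suffix_array(text):
    """Naive suffix array: suffix start positions in sorted suffix order."""
    return sorted(range(len(text)), key=lambda i: text[i:])


def find_occurrences(text, sa, pattern):
    """Return all start positions of pattern, via binary search on sa."""
    m = len(pattern)
    lo, hi = 0, len(sa)
    while lo < hi:  # first suffix whose m-char prefix is >= pattern
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] < pattern:
            lo = mid + 1
        else:
            hi = mid
    first = lo
    hi = len(sa)
    while lo < hi:  # first suffix whose m-char prefix is > pattern
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] <= pattern:
            lo = mid + 1
        else:
            hi = mid
    return sorted(sa[first:lo])


def fill_first_gap(draft, reads, flank=4):
    """Fill the first run of 'N' using a read that matches both flanks."""
    gap_start = draft.find("N")
    if gap_start == -1:
        return draft
    gap_end = gap_start
    while gap_end < len(draft) and draft[gap_end] == "N":
        gap_end += 1
    left = draft[max(0, gap_start - flank):gap_start]
    right = draft[gap_end:gap_end + flank]
    gap_len = gap_end - gap_start
    text = "$".join(reads)  # '$' separates reads so matches cannot span two reads
    sa = build_suffix_array(text)
    for pos in find_occurrences(text, sa, left):
        fill_start = pos + len(left)
        fill = text[fill_start:fill_start + gap_len]
        after = text[fill_start + gap_len:fill_start + gap_len + len(right)]
        if len(fill) == gap_len and "$" not in fill and after == right:
            return draft[:gap_start] + fill + draft[gap_end:]
    return draft  # no read bridges the gap; leave it unfilled


hits = [(0, "ACGT", 1.0), (0, "ACGA", 0.5), (8, "TTGA", 1.0)]
draft = weighted_consensus(12, hits)
filled = fill_first_gap(draft, ["GGACGTCCAATTGAC", "TTTTT"])
```

The sorted-suffix construction here is quadratic in the worst case and is chosen only for brevity; a practical tool would use a linear-time or near-linear-time suffix array builder and would tolerate mismatches in the flanks rather than requiring exact matches.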