A platform for biological sequence comparison on parallel computers

A S Deshpande; D S Richards; W R Pearson

doi:10.1093/bioinformatics/7.2.237

Comput Appl Biosci. 1991 Apr;7(2):237-47. doi: 10.1093/bioinformatics/7.2.237.

Authors

A S Deshpande¹, D S Richards, W R Pearson

Affiliation

¹ Department of Computer Science, University of Virginia, Charlottesville 22908.

PMID: 2059850
DOI: 10.1093/bioinformatics/7.2.237

Abstract

We have written two programs for searching biological sequence databases that run on Intel hypercube computers. PSCANLIB compares a single sequence against a sequence library, and PCOMPLIB compares all the entries in one sequence library against a second library. The programs provide a general framework for similarity searching; they include functions for reading in query sequences, search parameters and library entries, and reporting the results of a search. We have isolated the code for the specific function that calculates the similarity score between the query and library sequence; alternative searching algorithms can be implemented by editing two files. We have implemented the rapid FASTA sequence comparison algorithm and the more rigorous Smith-Waterman algorithm within this framework. The PSCANLIB program on a 16 node iPSC/2 80386-based hypercube can compare a 229 amino acid protein sequence with a 3.4 million residue sequence library in approximately 16 s with the FASTA algorithm. Using the Smith-Waterman algorithm, the same search takes 35 min. The PCOMPLIB program can compare a 0.8 million amino acid protein sequence library with itself in 5.3 min with FASTA on a third-generation 32 node Intel iPSC/860 hypercube.
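
The framework described in the abstract, in which the similarity-scoring function is kept separate from the code that reads sequences and reports results, can be illustrated with a minimal single-process sketch. This is not the authors' code. The function names (`smith_waterman_score`, `scan_library`, `compare_libraries`), the in-memory library format and the scoring values (match +2, mismatch -1, linear gap -1) are assumptions chosen for illustration, and the distribution of work across hypercube nodes is not reproduced. The scoring function shown is a basic Smith-Waterman local alignment score; a faster heuristic could be swapped in by passing a different function.

```python
from typing import Callable, Dict, List, Sequence, Tuple

# Hypothetical types: a library is a list of (name, sequence) pairs,
# and a scoring function maps (query, subject) to an integer score.
Library = Sequence[Tuple[str, str]]
ScoreFn = Callable[[str, str], int]


def smith_waterman_score(query: str, subject: str,
                         match: int = 2, mismatch: int = -1,
                         gap: int = -1) -> int:
    """Best local alignment score with a linear gap penalty.

    The match/mismatch/gap values are illustrative assumptions,
    not parameters taken from the paper.
    """
    prev = [0] * (len(subject) + 1)  # previous row of the DP matrix
    best = 0
    for q in query:
        curr = [0]  # column 0 is always zero for local alignment
        for j, s in enumerate(subject, start=1):
            diag = prev[j - 1] + (match if q == s else mismatch)
            score = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            curr.append(score)
            if score > best:
                best = score
        prev = curr
    return best


def scan_library(query: str, library: Library, score_fn: ScoreFn,
                 top: int = 10) -> List[Tuple[str, int]]:
    """One query against a library (the role the abstract gives PSCANLIB)."""
    scored = [(name, score_fn(query, seq)) for name, seq in library]
    scored.sort(key=lambda item: item[1], reverse=True)  # stable sort
    return scored[:top]


def compare_libraries(lib_a: Library, lib_b: Library, score_fn: ScoreFn,
                      top: int = 10) -> Dict[str, List[Tuple[str, int]]]:
    """Every entry of lib_a against lib_b (the role given to PCOMPLIB)."""
    return {name: scan_library(seq, lib_b, score_fn, top)
            for name, seq in lib_a}


if __name__ == "__main__":
    demo_library = [("a", "ACGT"), ("b", "TTTT"), ("c", "ACG")]
    for name, score in scan_library("ACGT", demo_library,
                                    smith_waterman_score):
        print(name, score)
```

Because the scan and report logic only sees `score_fn`, changing the comparison algorithm in this sketch means replacing one function, which mirrors the design choice the abstract describes.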

Publication types

Research Support, U.S. Gov't, P.H.S.

MeSH terms

Algorithms
Amino Acid Sequence*
DNA / analysis*
Electronic Data Processing
Gene Library*
Software*

Substances

DNA

Grants and funding

LM04969/LM/NLM NIH HHS/United States