fasta

Name fasta
Description

fasta is part of the Fasta3 package. This package contains many programs for searching DNA and protein databases and for evaluating statistical significance from randomly shuffled sequences.

fasta is used to compare a protein sequence to a protein sequence database or a DNA sequence to a DNA sequence database using the FASTA algorithm. Search speed and selectivity are controlled with the ktup (wordsize) parameter.
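
To make the role of ktup more concrete, the following is a minimal Python sketch of word-based matching: it counts words of length ktup shared between two sequences and groups them by diagonal. It is a simplified illustration only, not the fasta implementation; the function name `ktup_diagonal_hits` and the example sequences are hypothetical.

```python
from collections import defaultdict


def ktup_diagonal_hits(query, subject, ktup):
    """Count shared words of length ktup on each diagonal.

    A diagonal is identified by (subject position - query position).
    Illustrative only: fasta itself goes on to rescore and join regions.
    """
    # Index every ktup-length word in the query by its start positions.
    word_positions = defaultdict(list)
    for i in range(len(query) - ktup + 1):
        word_positions[query[i:i + ktup]].append(i)

    # Scan the subject and tally word matches per diagonal.
    diagonals = defaultdict(int)
    for j in range(len(subject) - ktup + 1):
        for i in word_positions.get(subject[j:j + ktup], ()):
            diagonals[j - i] += 1
    return dict(diagonals)


# Hypothetical toy sequences, compared with a short and a longer word size.
hits_k2 = ktup_diagonal_hits("ACGTACGT", "TTACGTAC", 2)
hits_k4 = ktup_diagonal_hits("ACGTACGT", "TTACGTAC", 4)
```

A smaller ktup yields at least as many word hits as a larger one, so more candidate regions are examined. This is consistent with smaller values being more sensitive but slower.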

For protein comparisons, ktup=2 by default; ktup=1 is more sensitive but slower. For DNA comparisons, ktup=6 by default; ktup=3 or ktup=4 provides higher sensitivity; ktup=1 should be used for oligonucleotides (DNA query lengths < 20).
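
The guidance above can be captured in a small helper. This is a sketch under stated assumptions: the function `suggest_ktup` and its arguments are hypothetical and not part of the Fasta3 package, and picking 3 rather than 4 for the sensitive DNA case is an arbitrary choice for illustration.

```python
def suggest_ktup(seq_type, query_length=None, sensitive=False):
    """Return a ktup value following the rules of thumb quoted above.

    seq_type: "protein" or "dna" (hypothetical labels for this sketch).
    query_length: query length in residues, if known.
    sensitive: prefer sensitivity over speed.
    """
    if seq_type == "protein":
        # Default is 2; 1 is more sensitive but slower.
        return 1 if sensitive else 2
    if seq_type == "dna":
        # Oligonucleotides (DNA queries shorter than 20) should use ktup=1.
        if query_length is not None and query_length < 20:
            return 1
        # Default is 6; 3 or 4 gives higher sensitivity (3 chosen here
        # as an assumption of this example).
        return 3 if sensitive else 6
    raise ValueError("seq_type must be 'protein' or 'dna'")


print(suggest_ktup("dna", query_length=15))
```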


In the Bio-Linux package, the threaded versions of the fasta programs are the default.


The main programs in the Fasta3 package are listed below. For a full listing, see the contents of the directory /usr/local/bioinf/fasta/fasta/bin on Bio-Linux. Sample data files can be found in the fasta folder of the sampledata folder on the Bio-Linux desktop, or in /usr/local/bioinf/sampledata/fasta.

  • fasta - scan a protein or DNA sequence library for similar sequences.
  • fastx - compare a DNA sequence to a protein sequence database, comparing the translated DNA sequence in forward and reverse frames. Frameshifts are allowed only between codons.
  • tfastx - compare a protein sequence to a DNA sequence database, calculating similarities with frameshifts to the forward and reverse orientations. Frameshifts are allowed only between codons.
  • fasty - compare a DNA sequence to a protein sequence database, comparing the translated DNA sequence in forward and reverse frames. Unlike fastx, frameshifts are also allowed within codons, which is slower.
  • tfasty - compare a protein sequence to a DNA sequence database, calculating similarities with frameshifts to the forward and reverse orientations. Unlike tfastx, frameshifts are also allowed within codons, which is slower.
  • fasts - compare unordered peptides to a protein sequence database
  • tfasts - compare unordered peptides to a translated DNA sequence database
  • fastm - compare ordered peptides (or short DNA sequences) to a protein (DNA) sequence database
  • tfastm - compare ordered peptides (or short DNA sequences) to a translated DNA sequence database
  • fastf - compare mixed peptides to a protein sequence database
  • tfastf - compare mixed peptides to a translated DNA sequence database
  • ssearch - compare a protein or DNA sequence to a sequence database using the Smith-Waterman algorithm.
  • ggsearch - compare a protein or DNA sequence to a sequence database using a global alignment (Needleman-Wunsch)
  • glsearch35 - compare a protein or DNA sequence to a sequence database with alignments that are global in the query and local in the database sequence (global-local).
  • lalign - produce multiple non-overlapping local alignments for protein and DNA sequences using the Huang and Miller SIM implementation of the Waterman-Eggert algorithm. This version of lalign replaces the one from the Fasta2 package.
  • prss - (discontinued, replaced in the fasta35 release by new versions of ssearch and fastx) estimate statistical significance of an alignment by comparing the score to the distribution of similarity scores generated by shuffling the second sequence. prss35 uses Smith-Waterman. prfx35 uses the fastx algorithm.
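
The shuffling approach described for prss can be sketched in Python as follows. This is an illustration under stated assumptions, not the prss code: the scoring parameters (match 2, mismatch -1, linear gap -2), the function names, and the use of a simple empirical fraction are all choices made for this example and may differ from how prss derives its estimate.

```python
import random


def sw_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best local alignment score (Smith-Waterman, linear gap penalty)."""
    prev = [0] * (len(b) + 1)
    best = 0
    for ca in a:
        curr = [0]
        for j, cb in enumerate(b, start=1):
            diag = prev[j - 1] + (match if ca == cb else mismatch)
            score = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            curr.append(score)
            best = max(best, score)
        prev = curr
    return best


def shuffle_significance(seq1, seq2, n_shuffles=200, seed=0):
    """Compare the real score with scores against shuffled copies of seq2."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    observed = sw_score(seq1, seq2)
    residues = list(seq2)
    at_least = 0
    for _ in range(n_shuffles):
        rng.shuffle(residues)  # keeps composition, destroys residue order
        if sw_score(seq1, "".join(residues)) >= observed:
            at_least += 1
    # Add-one correction so the estimate is never exactly zero.
    return observed, (at_least + 1) / (n_shuffles + 1)


# Hypothetical toy sequences.
observed, fraction = shuffle_significance("ACGTACGTAC", "ACGTACGTAC", n_shuffles=50)
```

A small fraction suggests that the observed score is unlikely to arise from sequences of the same composition in random order.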

References:
Pearson WR. Flexible sequence similarity searching with the FASTA3 program package. Methods Mol Biol. 2000;132:185-219.

Pearson WR. Empirical statistical estimates for sequence similarity searches. J Mol Biol. 1998 Feb 13;276(1):71-84.

Pearson WR, Wood T, Zhang Z, Miller W. Comparison of DNA sequences with protein sequences. Genomics. 1997 Nov 15;46(1):24-36.


Homepage http://www.people.virginia.edu/~wrp/pearson.html  
Remote Documentation http://www.people.virginia.edu/~wrp/papers/ismb2000.pdf
 
Local Documentation

  • Fasta version 3 documentation
  • Fasta version 2 documentation
  • Main programs in the fasta3 package
  • Default settings for fasta programs
  • Fastf documentation
  • Fasts documentation
  • Prss documentation
  • Ps_lav documentation
  • Pvcomp documentation