SwissTree

Methods

General policy: preferred methods will be suggested, but no method will be excluded.

Alignments

Alignment programs
Method | Comments (may include non-representative observations)
BAli-Phy | Very slow but very good; not feasible for more than 40 sequences. Performs simultaneous alignment and phylogeny reconstruction.
PAGAN | Better than PRANK, and faster.
PRANK | Good but slow; suited to small datasets. Provides posterior probability scores for each position.
M-Coffee | Good but slow. Provides confidence scores for each position based on the agreement of different aligners.
MAFFT | Pretty good and fast; suited to large datasets.
FSA | Very slow; easy to use.
hmmer3 | Relatively fast and relatively easy to use; produces local alignments only.
ProGraphMSA | Very fast, PRANK-like alignment based on a graph representation; includes a context-sensitive feature (using the idea from Biegert A and Soding J (2009) Sequence context-specific profiles for homology searching. Proc Natl Acad Sci USA). It should perform especially well for more divergent sequences and those with repeats. Not yet published; currently under review.

Alignment filtering

Alignment filtering programs
Method | Comments (may include non-representative observations)
GUIDANCE | May be run with PAGAN, PRANK or MAFFT to give a confidence score for each position, based on robustness to guide-tree uncertainty. Increases run time by a factor equal to the number of bootstrap iterations (100 by default; may be reduced to 30).
GBlocks | Very strict, even under less stringent options. Faster but less accurate than GUIDANCE. Returns wrong column positions if columns with 100% gaps remain: it does not count them at all. Can work on codons directly. X characters are not treated as gaps. Does not seem to use a 0 exit code when it runs properly: it always exits with code 1, whether it succeeded or failed. Source code not available.
TrimAl | Very fast, easy to use. Column numbering starts at 0, so 143 means the 144th column. Columns with gaps remain whatever the option (except nogaps). Cannot work directly on codons. X characters are not treated as gaps.
AliScore / AliCut | Fast; not easy to use.
NOISY | Fast.
BMGE | Fast; can work on codons.
MaxAlign | Fast, easy to use. Removes sequences that disrupt the alignment, not columns.
AL2CO | Fast, easy to use. Needs Clustal format as input.
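
Two of the pitfalls noted above concern column positions: GBlocks ignores columns that are 100% gaps when it reports positions, and TrimAl numbers columns from 0. The following is a minimal sketch of how one might guard against both when mapping filtered columns back to the original alignment. It is not part of either tool; the function names, the dictionary-of-strings alignment representation and the choice of "-" and "." as gap characters are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple

# Assumption: "-" and "." are the only gap characters in the alignment.
GAP_CHARS = frozenset("-.")


def drop_all_gap_columns(seqs: Dict[str, str]) -> Tuple[Dict[str, str], List[int]]:
    """Remove columns that are gaps in every sequence.

    Returns the cleaned alignment and, for each kept column, its
    1-based position in the original alignment, so that positions
    reported on the cleaned alignment can be mapped back.
    """
    lengths = {len(s) for s in seqs.values()}
    if len(lengths) != 1:
        raise ValueError("all sequences must have the same aligned length")
    length = lengths.pop()

    # Keep a column if at least one sequence has a non-gap character there.
    kept = [
        i for i in range(length)
        if any(s[i] not in GAP_CHARS for s in seqs.values())
    ]
    cleaned = {name: "".join(s[i] for i in kept) for name, s in seqs.items()}
    return cleaned, [i + 1 for i in kept]


def zero_to_one_based(columns: List[int]) -> List[int]:
    """Convert 0-based column indices (as TrimAl numbers them) to 1-based."""
    return [c + 1 for c in columns]


if __name__ == "__main__":
    alignment = {"seqA": "AC--G", "seqB": "A---T"}
    cleaned, original_positions = drop_all_gap_columns(alignment)
    print(cleaned)                   # columns 3 and 4 were all-gap and are removed
    print(original_positions)        # 1-based positions of the kept columns
    print(zero_to_one_based([143]))  # the "143 means 144th column" case
```

Removing all-gap columns before running a filter, and keeping the returned position list, avoids relying on positions that silently skip those columns.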

Tree reconstruction

It was suggested that gene phylogenies be estimated with maximum likelihood (ML) and Bayesian inference (BI) methods only.
All methods listed below treat gap positions as unknown characters. (Open question: is this certain, or should it be confirmed for each method in the comments column below?)
Tree reconstruction programs
Method | Comments (may include non-representative observations)
CodonPhyML | Includes about 50 different codon models for fast ML phylogeny reconstruction. According to our current (unpublished) results, codon models fit protein-coding DNA data significantly better and infer better trees than amino acid and DNA models. This can be explained by the fact that protein evolution is primarily driven by selection (negative and positive), which codon models explicitly allow for, unlike amino acid and DNA models. CodonPhyML will be available from SourceForge on publication; for now, please contact maria.anisimova@inf.ethz.ch if you would like to try it.
MrBayes | None
PhyML | None
RAxML | None

Gene synteny

Databases