SwissTree

Methods

General policy: preferred methods will be suggested, no method will be excluded.

Alignments

Recent papers benchmarking alignment methods:
Recent papers benchmarking alignment filtering methods:
- Penn O, Privman E, Landan G, Graur D, Pupko T (2010) An alignment confidence score capturing robustness to guide-tree uncertainty. Molecular Biology and Evolution
- Jordan G, Goldman N (2011) The effects of alignment error and alignment filtering on the sitewise detection of positive selection. Molecular Biology and Evolution
How to decide which alignment is best? Any recommendations? Who, when? Order methods by preference
- Recent review-editorial with recommendations on the choice of an appropriate alignment method:
  - Anisimova M, Cannarozzi GM, Liberles DA (2010) Finding the balance between the mathematical and biological optima in multiple sequence alignment. Trends Evol. Biol

Alignment programs
Method	Comments (may include non-representative observations)
BAli-Phy	VERY slow, VERY good, not feasible for >40 sequences. Simultaneous alignment and phylogeny reconstruction
PAGAN	Better than PRANK, and faster
PRANK	Good but slow, so best suited to small datasets; provides posterior probability scores for each position
M-Coffee	Good, slow, provides confidence scores for each position based on agreement of different aligners
MAFFT	Pretty good and fast; suitable for large datasets
FSA	VERY slow, easy to use
hmmer3	relatively fast, relatively easy to use; only local alignments
ProGraphMSA	Very fast PRANK-like alignment based on a graph representation; includes a context-sensitive feature (using the idea from Biegert A and Söding J (2009) Sequence context-specific profiles for homology searching. Proc Natl Acad Sci USA). It should perform especially well for more divergent sequences and those with repeats. Not yet published; currently under review

Alignment filtering

Alignment filtering programs
Method	Comments (may include non-representative observations)
GUIDANCE	May be run with PAGAN, PRANK or MAFFT to give a confidence score for each position. Based on robustness to guide-tree uncertainty. Increases run time by a factor equal to the number of bootstrap iterations (100 by default; can be reduced to 30)
GBlocks	Very strict, even under the less stringent options; faster but less accurate than GUIDANCE; returns wrong column positions if columns consisting of 100% gaps remain, because it does NOT count them at all!; can work on codons directly; X characters are not treated as gaps; does not seem to return exit code 0 when it runs properly: it always exits with code 1, whether it succeeds or fails; source code not available
TrimAl	very fast, easy to use; column numbering starts at 0, so 143 means the 144th column; columns with gaps remain whatever the option (except nogaps); cannot work directly on codons; X characters are not treated as gaps
AliScore / AliCut	fast, not easy to use
NOISY	fast
BMGE	fast; can work on codons
MaxAlign	fast, easy to use; removes sequences that disrupt the alignment, NOT columns!
AL2CO	fast, easy to use; needs Clustal format as input
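
Several of the caveats in the table above (all-gap columns confusing GBlocks' column positions, TrimAl's 0-based column numbering, GBlocks' unreliable exit code) can be worked around in a small wrapper script. The following is only a minimal sketch under stated assumptions: the function names are hypothetical, the alignment is assumed to be already loaded as a mapping from sequence name to aligned string, and the command line and output file name of the filtering tool are left to the caller rather than assumed here.

```python
import subprocess
from pathlib import Path


def drop_all_gap_columns(seqs, gap="-"):
    """Remove columns that contain only gap characters.

    `seqs` maps sequence names to aligned strings of equal length
    (hypothetical in-memory representation, not a tool-specific format).
    Returns the cleaned alignment and the 0-based indices of kept columns.
    """
    if not seqs:
        return {}, []
    length = len(next(iter(seqs.values())))
    if any(len(s) != length for s in seqs.values()):
        raise ValueError("aligned sequences must have equal length")
    kept = [i for i in range(length)
            if any(s[i] != gap for s in seqs.values())]
    cleaned = {name: "".join(s[i] for i in kept) for name, s in seqs.items()}
    return cleaned, kept


def to_one_based(columns):
    """Convert 0-based column indices to 1-based positions."""
    return [c + 1 for c in columns]


def run_ignoring_exit_code(cmd, expected_output):
    """Run a tool whose exit code is unreliable.

    Success is judged by whether a non-empty output file was produced,
    not by the return code. `expected_output` must be supplied by the
    caller; no tool-specific naming scheme is assumed.
    """
    out = Path(expected_output)
    if out.exists():
        out.unlink()  # avoid mistaking a stale file for a fresh result
    subprocess.run(cmd, check=False)
    return out.is_file() and out.stat().st_size > 0
```

For example, `to_one_based([143])` gives `[144]`, matching the TrimAl note above. Stripping all-gap columns before filtering, and keeping the `kept` index list, makes it possible to map filtered column positions back to the original alignment.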

Tree reconstruction

It was suggested to estimate gene phylogenies with maximum likelihood (ML) and Bayesian inference (BI) methods only.
All methods listed below handle gap positions as unknown characters. Are you sure, or should we confirm in the comment section below for each method?

Tree reconstruction programs
Method	Comments (may include non-representative observations)
CodonPhyML	Includes about 50 different codon models for fast phylogeny reconstruction with ML. According to our current (unpublished) results, codon models fit protein-coding DNA data significantly better and infer better trees than amino acid and DNA models. This can be explained by the fact that protein evolution is primarily driven by selection (negative and positive), which codon models explicitly allow for, unlike amino acid and DNA models. CodonPhyML will be available from SourceForge on publication; for now, please contact maria.anisimova@inf.ethz.ch if you would like to try it
MrBayes	None
PhyML	None
RAxML	None

Gene synteny

Databases