PCMA Documentation
PCMA [1] (Profile Consistency Multiple sequence Alignment) is a progressive multiple sequence alignment program that combines two different alignment strategies. Highly similar sequences are aligned quickly, as in ClustalW [2], forming pre-aligned groups. The T-Coffee [3] strategy is then applied to align the relatively divergent groups based on profile-profile comparison and consistency. The scoring function for local alignments of pre-aligned groups is based on COMPASS [4], a profile-profile comparison method that generalizes the PSI-BLAST [5] approach to profile-sequence comparison. PCMA balances speed and accuracy in a flexible way and is suitable for aligning large numbers of sequences.
An important parameter:

ave_grp_id: Threshold of percentage sequence identity above which neighboring groups are aligned by ClustalW and below which neighboring groups are subject to the profile consistency measure. If the number of sequences is very large, decreasing the threshold from its default value is recommended, since more groups are then aligned by the faster ClustalW-style step.
Range: [0..100]
Default: -ave_grp_id=50
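
The threshold rule described for ave_grp_id can be illustrated with a minimal Python sketch. This is not PCMA's own code. The function names percent_identity and choose_strategy are hypothetical, identity is computed here over gap-free columns of two already aligned sequences (an assumption, since the documentation does not say how PCMA measures identity between groups), and an identity exactly equal to the threshold is sent to the profile consistency step (also an assumption, since only "above" and "below" are specified).

```python
def percent_identity(a: str, b: str) -> float:
    """Percent identity over columns where neither aligned sequence has a gap.

    Assumption: gaps are written as '-' and gapped columns are ignored.
    """
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(a, b):
        if x == "-" or y == "-":
            continue  # skip columns containing a gap
        compared += 1
        if x == y:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


def choose_strategy(identity: float, ave_grp_id: float = 50.0) -> str:
    """Pick the alignment step for two neighboring groups.

    Above the threshold: fast ClustalW-style alignment.
    Otherwise: profile consistency (equality handled here by assumption).
    """
    if not 0.0 <= ave_grp_id <= 100.0:
        raise ValueError("ave_grp_id must lie in [0, 100]")
    return "clustalw" if identity > ave_grp_id else "profile_consistency"


# Example: 4 of 5 gap-free columns match, so identity is 80%.
identity = percent_identity("ACDE-G", "ACDFWG")
strategy = choose_strategy(identity)  # default threshold of 50
```

With the default threshold of 50, the 80% identical pair above is routed to the fast step. Lowering ave_grp_id routes more group pairs to that step, which matches the recommendation for very large inputs.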

References:
1. Pei J, Sadreyev R, Grishin NV: PCMA: fast and accurate multiple sequence alignment based on profile consistency. Bioinformatics 2003, 19:427-428.
2. Thompson JD, Higgins DG, Gibson TJ: CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res 1994, 22:4673-4680.
3. Notredame C, Higgins DG, Heringa J: T-Coffee: A novel method for fast and accurate multiple sequence alignment. J Mol Biol 2000, 302:205-217.
4. Sadreyev R, Grishin N: COMPASS: a tool for comparison of multiple protein alignments with assessment of statistical significance. J Mol Biol 2003, 326:317-336.
5. Altschul SF, Madden TL, Schaffer AA, Zhang J, Zhang Z, Miller W, Lipman DJ: Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res 1997, 25:3389-3402.