PCMA Documentation

PCMA [1] (Profile Consistency Multiple sequence Alignment) is a progressive multiple sequence alignment program that combines two alignment strategies.  Highly similar sequences are aligned quickly, as in ClustalW [2], to form pre-aligned groups.  The relatively divergent groups are then aligned with the T-Coffee [3] strategy, using profile-profile comparison and consistency.  The scoring function for local alignments of pre-aligned groups is based on COMPASS [4], a profile-profile comparison method that generalizes the PSI-BLAST [5] approach to profile-sequence comparison.  PCMA balances speed and accuracy in a flexible way and is suitable for aligning large numbers of sequences.

An important parameter:
ave_grp_id:  Threshold of PERCENTAGE sequence identity. Neighboring groups whose identity is above the threshold are aligned by ClustalW; neighboring groups below it are aligned using the profile consistency measure. If the number of sequences is very large, decreasing the threshold from its default value is recommended, since more groups are then aligned by the fast ClustalW-style step.
        Range [0..100]
        Default: -ave_grp_id=50
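
The effect of the threshold can be illustrated with a minimal Python sketch. It is not PCMA's implementation: the function names percent_identity and choose_strategy are hypothetical, identity is computed here over a single pair of aligned sequences rather than as an average between groups, and treating a value exactly equal to the threshold as "below" is an assumption, since the description above does not specify that case.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent identity over columns where both aligned sequences have a residue.

    Assumption: '-' marks a gap, and gapped columns are ignored.
    """
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    # Keep only columns in which neither sequence has a gap.
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    matches = sum(1 for x, y in pairs if x == y)
    return 100.0 * matches / len(pairs)


def choose_strategy(identity: float, ave_grp_id: float = 50.0) -> str:
    """Pick the alignment strategy for two neighboring groups.

    Above the threshold: fast ClustalW-style alignment.
    Otherwise: profile consistency (equality handled as 'below' by assumption).
    """
    if not 0.0 <= ave_grp_id <= 100.0:
        raise ValueError("ave_grp_id must be in the range [0..100]")
    return "clustalw" if identity > ave_grp_id else "profile_consistency"


if __name__ == "__main__":
    identity = percent_identity("ACDEFGHIKL", "ACDEFGHIKV")
    print(identity, choose_strategy(identity))
    # A lower threshold sends more group pairs to the fast strategy.
    print(choose_strategy(45.0, ave_grp_id=50.0))
    print(choose_strategy(45.0, ave_grp_id=40.0))
```

With the default of 50, a pair of groups at 45% identity goes to the profile consistency step; lowering the threshold to 40 sends the same pair to the faster ClustalW-style step, which is why a lower value is suggested for very large inputs.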

References:

1. Pei J, Sadreyev R, Grishin NV: PCMA: fast and accurate multiple sequence alignment based on profile consistency. Bioinformatics 2003, 19:427-428.

2. Thompson JD, Higgins DG, Gibson TJ: CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res 1994, 22:4673-4680.

3. Notredame C, Higgins DG, Heringa J: T-Coffee: A novel method for fast and accurate multiple sequence alignment. J Mol Biol 2000, 302:205-217.

4. Sadreyev R, Grishin N: COMPASS: a tool for comparison of multiple protein alignments with assessment of statistical significance. J Mol Biol 2003, 326:317-336.

5. Altschul SF, Madden TL, Schaffer AA, Zhang J, Zhang Z, Miller W, Lipman DJ: Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res 1997, 25:3389-3402.