PROMALS3D output alignment format

The PROMALS3D web server provides links to alignments in the following three formats.

1. Colored alignment

The first line in each block shows conservation indices for positions with a conservation index above 5. Each representative sequence has a magenta name and is colored according to PSIPRED [7] secondary structure predictions (red: alpha-helix, blue: beta-strand). If the sequences are shown in aligned order, a representative sequence and the sequences with black names directly below it, if there are any, form a closely related, pre-aligned group (determined by the option "Identity threshold"). Sequences within each group are aligned by a fast method. The groups are then aligned to each other using profile consistency with predicted secondary structures. In the example below, seq8, seq1, seq6, seq5 and seq9 are representative sequences; seq8, seq10 and seq7 form a closely related group, and seq6 is a group by itself. The last two lines show the consensus amino acid sequence (Consensus_aa) and the consensus predicted secondary structure (Consensus_ss).

The first and last residue numbers of each sequence in each alignment block are shown before and after the sequences, respectively.

Consensus predicted secondary structure symbols: alpha-helix: h; beta-strand: e.

Consensus amino acid symbols: conserved amino acids are shown as bold uppercase letters; aliphatic (I, V, L): l; aromatic (Y, H, W, F): @; hydrophobic (W, F, Y, M, L, I, V, A, C, T, H): h; alcohol (S, T): o; polar residues (D, E, H, K, N, Q, R, S, T): p; tiny (A, G, C, S): t; small (A, G, C, S, V, N, D, T, P): s; bulky residues (E, F, I, K, L, M, Q, R, W, Y): b; positively charged (K, R, H): +; negatively charged (D, E): -; charged (D, E, K, R, H): c.
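
The symbol legend above can be written down as a lookup table, which is handy when checking whether a residue fits a Consensus_aa position. The following is a minimal sketch, not part of PROMALS3D itself: the names CONSENSUS_CLASSES and residue_matches are hypothetical, and treating "." as "no consensus, anything fits" is an assumption, since the legend does not define that symbol.

```python
# Residue classes for each consensus symbol, transcribed from the legend above.
CONSENSUS_CLASSES = {
    "l": set("IVL"),          # aliphatic
    "@": set("YHWF"),         # aromatic
    "h": set("WFYMLIVACTH"),  # hydrophobic
    "o": set("ST"),           # alcohol
    "p": set("DEHKNQRST"),    # polar
    "t": set("AGCS"),         # tiny
    "s": set("AGCSVNDTP"),    # small
    "b": set("EFIKLMQRWY"),   # bulky
    "+": set("KRH"),          # positively charged
    "-": set("DE"),           # negatively charged
    "c": set("DEKRH"),        # charged
}


def residue_matches(symbol, residue):
    """Return True if `residue` is compatible with the consensus `symbol`."""
    residue = residue.upper()
    if symbol == ".":
        # Assumption: "." marks a position without consensus.
        return True
    if symbol.isupper():
        # An uppercase letter stands for one conserved amino acid.
        return residue == symbol
    return residue in CONSENSUS_CLASSES.get(symbol, set())
```

For instance, residue_matches("@", "F") holds because F is aromatic, while residue_matches("+", "D") does not because D is negatively charged.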

Colored alignment example:

Conservation: 9669 6 6 9 9 99 9 7 6 9 696 66 6 6 67 99
seq8          ----SWDEFVDRSVQLFRADPESTRYVMKYRHCDGKLVLKVTDNKECLKFKTDQAQEAKKMEKLNNIFFT
seq10         --FDSWDEFVSKSVELFRNHPDTTRYVVKYRHCEGKLVLKVTDNHECLKFKTDQAQDAKKMEK-------
seq7          ----SWEEFVERSVQLFRGDPNATRYVMKYRHCEGKLVLKVTDDRECLKFKTDQAQDAKKMEKLNNIFF-
seq1          -KYRTWEEFTRAAEKLYQADPMKVRVVLKYRHCDGNLCIKVTDDVVCLLYRTDQAQDVKKIEKFHSQLMR
seq4          ------EEFSRAVEKLYLTDPMKVRVVLKYRHCDGNLCIKVTDNSV-VSYE-----------------MR
seq3          -MYQVWEEFSRAVEKLYLTDPMKVRVVLKYRHCDGNLCIKVTDNSVCLQYKTDQAQDVK-----------
seq2          EEYQTWEEFARAAEKLYLTDPMKVRVVLKYRHCDGNLCMKVTDDAVCLQYKTDQAQDVKKVEKLHGK---
seq0          --FQTWEEFSRAAEKLYLADPMKVRVVLKYRHVDGNLCIKVTDDLVCLVYRTDQAQDVKKIEKF------
seq6          --FTNWEEFAKAAERLHSANPEKCRFVTKYNHTKGELVLKLTDDVVCLQYSTNQLQDVKKLEKLSSTLLR
seq5          ----SWEEFAKAAEVLYLEDPMKCRMCTKYRHVDHKLVVKLTDNHTVLKYVTDMAQDVKKIEKLTTLLMR
seq9          ---KNWEDFEIAAENMYMANPQNCRYTMKYVHSKGHILLKMSDNVKCVQYRAENMPDLKK----------
Consensus_aa: ....sWEEFsp.t.pL@.ssPbphRhhhKYpHhcGpLhlKlTDs..Clp@phcbh.DhKKhEK.......
Consensus_ss: hhhhhhhhhhhhh eeeeeeeee eeeeee eeeeeee hhhhhhhhhhhh

Conservation:
seq8          LM-----------------------------------
seq10         -------------------------------------
seq7          -------------------------------------
seq1          LMELKVTDNKECLKFKTDQAQEAKKMEKLNNIFFTLM
seq4          LFGVQKDNFALEHSLL---------------------
seq3          -------------------------------------
seq2          -------------------------------------
seq0          -------------------------------------
seq6          SI-----------------------------------
seq5          -------------------------------------
seq9          -------------------------------------
Consensus_aa: .....................................
Consensus_ss:

2. CLUSTAL format alignment

Each line holds a sequence name followed by a segment of that sequence, and the alignment can be split into a number of blocks separated by empty lines. The first line may begin with the word "CLUSTAL" to indicate the format, but this header line is optional.
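
The block layout described above can be read back into full-length sequences by concatenating each name's segments across blocks. The following is a minimal sketch under stated assumptions, not the server's own code: parse_clustal is a hypothetical helper, names are assumed to contain no whitespace, and lines starting with whitespace are skipped on the assumption that they are the conservation lines found in standard CLUSTAL files.

```python
def parse_clustal(text):
    """Parse a CLUSTAL-style alignment into {name: aligned_sequence}."""
    sequences = {}
    seen_content = False
    for line in text.splitlines():
        if not line.strip():
            continue  # empty lines only separate blocks
        if not seen_content:
            seen_content = True
            if line.startswith("CLUSTAL"):
                continue  # optional format header on the first line
        if line[0].isspace():
            continue  # assumption: indented lines carry no sequence data
        fields = line.split()
        if len(fields) < 2:
            continue  # a name without a sequence segment is ignored
        name, segment = fields[0], fields[1]
        # Append this block's segment to what was collected so far.
        sequences[name] = sequences.get(name, "") + segment
    return sequences
```

Because Python dictionaries keep insertion order, the returned mapping lists the sequences in the order they appear in the first block.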

CLUSTAL format alignment example:

seq8    ----SWDEFVDRSVQLFRADPESTRYVMKYRHCDGKLVLKVTDNKECLKFKTDQAQEAKK
seq1    -KYRTWEEFTRAAEKLYQADPMKVRVVLKYRHCDGNLCIKVTDDVVCLLYRTDQAQDVKK
seq6    --FTNWEEFAKAAERLHSANPEKCRFVTKYNHTKGELVLKLTDDVVCLQYSTNQLQDVKK
seq5    ----SWEEFAKAAEVLYLEDPMKCRMCTKYRHVDHKLVVKLTDNHTVLKYVTDMAQDVKK
seq9    ---KNWEDFEIAAENMYMANPQNCRYTMKYVHSKGHILLKMSDNVKCVQYRAENMPDLKK
seq10   --FDSWDEFVSKSVELFRNHPDTTRYVVKYRHCEGKLVLKVTDNHECLKFKTDQAQDAKK
seq7    ----SWEEFVERSVQLFRGDPNATRYVMKYRHCEGKLVLKVTDDRECLKFKTDQAQDAKK
seq4    ------EEFSRAVEKLYLTDPMKVRVVLKYRHCDGNLCIKVTDNSV-VSYE---------
seq3    -MYQVWEEFSRAVEKLYLTDPMKVRVVLKYRHCDGNLCIKVTDNSVCLQYKTDQAQDVK-
seq2    EEYQTWEEFARAAEKLYLTDPMKVRVVLKYRHCDGNLCMKVTDDAVCLQYKTDQAQDVKK
seq0    --FQTWEEFSRAAEKLYLADPMKVRVVLKYRHVDGNLCIKVTDDLVCLVYRTDQAQDVKK

seq8    MEKLNNIFFTLM-----------------------------------
seq1    IEKFHSQLMRLMELKVTDNKECLKFKTDQAQEAKKMEKLNNIFFTLM
seq6    LEKLSSTLLRSI-----------------------------------
seq5    IEKLTTLLMR-------------------------------------
seq9    -----------------------------------------------
seq10   MEK--------------------------------------------
seq7    MEKLNNIFF--------------------------------------
seq4    ------MRLFGVQKDNFALEHSLL-----------------------
seq3    -----------------------------------------------
seq2    VEKLHGK----------------------------------------
seq0    IEKF-------------------------------------------

3. FASTA format alignment

A sequence record in FASTA format consists of a single-line description (sequence name), followed by one or more lines of sequence data. The first character of the description line is a greater-than (">") symbol.

FASTA format alignment example:

>seq8 |