Euclidean space mapping and grouping of protein families
(a) Sm proteins. The multiple sequence alignment was taken from Wicker et al. (2001).
The sequences are numbered as follows: 1. ySmE; 2. riSmEa; 3. riSmEb; 4. arSmE; 5. huSmE;
6. caSmE; 7. huLsm5; 8. q9vrt7; 9. yLsm5; 10. o42978; 11. globu2; 12. huSmN; 13. oSmB;
14. chSmB; 15. dSmB; 16. caSmB; 17. arSmB; 18. ySmB; 19. sSmB; 20. riLsm1; 21. huLsm1;
22. yLsm1; 23. q20229; 24. yb18-schpo; 25. aaf46688; 26. aad56232; 27. aaf47567; 28. riSmx9;
29. aaf23841; 30. o74483; 31. ySmx13; 32. riSmGa; 33. riSmGb; 34. arSmG; 35. alSmG;
36. huSmG; 37. caSmG; 38. ySmG; 39. sSmG; 40. bLsm7; 41. riLsm7; 42. huLsm7; 43. yLsm7;
44. sulfo; 45. globu1; 46. pyroc1; 47. p-abys; 48. metha1; 49. aero-pern1; 50. riLsm3;
51. huLsm3; 52. q9y7m4; 53. yLsm3; 54. huSmD2; 55. caSmD2; 56. arSmD2; 57. sSmD2;
58. ySmD2; 59. pfalSmD; 60. huSmF; 61. dSmF; 62. riSmF; 63. bSmF; 64. caSmF; 65. ySmF;
66. arSmF; 67. sSmF; 68. nLsm6; 69. huLsm6; 70. yLsm6; 71. cab54975; 72. ySmD1;
73. sSmD1; 74. huSmD1; 75. riSmD1a; 76. arSmD1; 77. caSmD1; 78. yLsm2; 79. mLsm2;
80. amphSm; 81. yLsm4; 82. caLsm4; 83. huLsm4; 84. arLsm4; 85. ySmD3; 86. sSmD3;
87. arSmD3; 88. riSmD3; 89. huSmD3; 90. dSmD3; 91. caSmD3; 92. yLsm9; 93. aero-pern2;
94. m-therm2.
(b) γ-Glutamylcysteine synthetase/glutamine synthetase family proteins (γ-GCS/GS).
The multiple sequence alignment was taken from Abbott et al. (2001). The sequences are numbered
as follows: 1. 4557625; 2. 7290879; 3. 7500706; 4. 312704; 5. 1439564; 6. 1170034; 7. 3913792;
8. 11386873; 9. 11282603; 10. 6580772; 11. 7478107; 12. 6562892; 13. 11499888; 14. 6831717;
15. 11349463; 16. 2495565; 17. 8246826; 18. 10580902; 19. 11272160; 20. 121661; 21. 11386815;
22. 11348607; 23. 6634496; 24. RVK00642; 25. RAB01613; 26. REF02644; 27. REFA03339;
28. RMN01341; 29. 9256961; 30. 11498554; 31. 6624705; 32. 7478087; 33. 4504027;
34. 7292645; 35. 6325292; 36. 1835156; 37. 1075624; 38. 7471952; 39. 121358.
The three-dimensional projection of the multidimensional Euclidean space is
shown at the top of each panel. Proteins are shown as circles in (a) and as numbers
in (b). Groupings at successive steps, each defined by a different σ-value, are shown at
the bottom of each panel. The grouping shown on each projection corresponds to the most
stable configuration: sets 33-38 in (a) and sets 34-56 in (b). Group colors
are the same in the projection and in the table.
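One way to read "most stable configuration" is as the longest run of consecutive σ steps over which the grouping does not change. The caption does not state how stability was determined, so the sketch below is only an illustration under that assumption. The function names `canonical` and `most_stable_run` and the toy data are hypothetical and are not taken from the original work.

```python
from typing import Hashable, List, Sequence, Tuple


def canonical(labels: Sequence[Hashable]) -> Tuple[int, ...]:
    """Relabel groups by order of first appearance, so that two
    partitions compare equal even if their group names differ."""
    mapping = {}
    out = []
    for lab in labels:
        if lab not in mapping:
            mapping[lab] = len(mapping)
        out.append(mapping[lab])
    return tuple(out)


def most_stable_run(groupings: List[Sequence[Hashable]]) -> Tuple[int, int]:
    """Return the 1-based (first, last) steps of the longest run of
    consecutive steps that share the same partition (assumed meaning
    of 'most stable'). Ties go to the earliest run."""
    if not groupings:
        raise ValueError("no groupings given")
    best_start, best_len = 0, 1
    run_start = 0
    prev = canonical(groupings[0])
    for i in range(1, len(groupings)):
        cur = canonical(groupings[i])
        if cur != prev:
            # The partition changed: a new run starts at this step.
            run_start = i
            prev = cur
        run_len = i - run_start + 1
        if run_len > best_len:
            best_start, best_len = run_start, run_len
    return best_start + 1, best_start + best_len


# Toy data: one list of group labels per sigma step, one label per protein.
steps = [
    [0, 1, 2, 3],
    [0, 0, 1, 2],
    [5, 5, 7, 7],
    [0, 0, 1, 1],
    [1, 1, 0, 0],
    [0, 0, 0, 0],
]
print(most_stable_run(steps))
```

In this toy input, steps 3 to 5 describe the same partition under different label names, which is why the labels are canonicalized before comparison.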
Abbott, J. J., Pei, J., Ford, J. L., Qi, Y., Grishin, V. N., Pitcher,
L. A., Phillips, M. A. & Grishin, N. V. (2001). Structure Prediction and Active
Site Analysis of the Metal Binding Determinants in gamma-Glutamylcysteine Synthetase.
J Biol Chem 276, 42099-42107.
Wicker, N., Perrin, G. R., Thierry, J. C. & Poch, O. (2001).
Secator: a program for inferring protein subfamilies from phylogenetic trees.
Mol Biol Evol 18, 1435-1441.