Query         psy2771
Match_columns 174
No_of_seqs    147 out of 1063
Neff          8.3 
Searched_HMMs 46136
Date          Sat Aug 17 00:43:34 2013
Command       hhsearch -i /work/01045/syshi/Psyhhblits/psy2771.a3m -d /work/01045/syshi/HHdatabase/Cdd.hhm -o /work/01045/syshi/hhsearch_cdd/2771hhsearch_cdd -cpu 12 -v 0 

 No Hit                             Prob E-value P-value  Score    SS Cols Query HMM  Template HMM
  1 PRK10139 serine endoprotease;  100.0 3.4E-32 7.3E-37  232.1  18.1  160    4-166   110-275 (455)
  2 PRK10898 serine endoprotease;  100.0 2.2E-31 4.8E-36  220.8  18.7  161    4-167    97-265 (353)
  3 TIGR02038 protease_degS peripl 100.0 5.5E-31 1.2E-35  218.4  19.3  160    4-166    97-263 (351)
  4 PRK10942 serine endoprotease;  100.0 4.3E-30 9.4E-35  220.1  17.9  159    5-166   132-296 (473)
  5 TIGR02037 degP_htrA_DO peripla 100.0 4.7E-29   1E-33  211.8  18.3  159    6-167    79-243 (428)
  6 COG0265 DegQ Trypsin-like seri  99.9 1.2E-23 2.6E-28  174.3  16.8  161    4-166    91-257 (347)
  7 KOG1320|consensus               99.6 1.2E-14 2.6E-19  123.2  10.2  140   17-156   208-357 (473)
  8 PF13365 Trypsin_2:  Trypsin-li  99.1 6.2E-10 1.4E-14   77.6   8.9   84   11-124    33-120 (120)
  9 PF00089 Trypsin:  Trypsin;  In  98.7 3.2E-07 6.9E-12   69.9  10.9  115   32-146    86-220 (220)
 10 KOG1421|consensus               98.6 1.7E-07 3.6E-12   82.3   7.3  147   20-167   120-276 (955)
 11 cd00190 Tryp_SPc Trypsin-like   98.3 7.2E-06 1.6E-10   62.8  10.4  116   32-147    88-230 (232)
 12 PF00863 Peptidase_C4:  Peptida  98.2 3.9E-05 8.4E-10   60.4  11.9  115   27-151    76-194 (235)
 13 KOG1320|consensus               98.2 2.7E-06 5.8E-11   72.8   5.4  130   19-154   123-258 (473)
 14 COG3591 V8-like Glu-specific e  98.1 7.9E-05 1.7E-09   59.2  11.1   87   53-150   155-250 (251)
 15 smart00020 Tryp_SPc Trypsin-li  97.9 0.00014 3.1E-09   55.8   9.5  100   31-130    87-209 (229)
 16 PF10459 Peptidase_S46:  Peptid  97.6 0.00013 2.8E-09   65.8   5.8   57   94-150   619-687 (698)
 17 PF05580 Peptidase_S55:  SpoIVB  97.2 0.00075 1.6E-08   52.2   5.4   46   97-143   169-216 (218)
 18 PF08192 Peptidase_S64:  Peptid  97.2  0.0043 9.4E-08   55.1  10.4  117   30-150   540-689 (695)
 19 PF00949 Peptidase_S7:  Peptida  97.0  0.0011 2.3E-08   47.8   3.8   34   97-130    86-119 (132)
 20 KOG3627|consensus               96.9    0.02 4.3E-07   45.0  11.2  117   33-149   106-253 (256)
 21 PF00944 Peptidase_S3:  Alphavi  96.6   0.004 8.6E-08   44.8   4.2   41   97-137    95-135 (158)
 22 PF03761 DUF316:  Domain of unk  96.4   0.087 1.9E-06   42.3  11.6  105   31-144   159-273 (282)
 23 PF00548 Peptidase_C3:  3C cyst  96.3   0.075 1.6E-06   40.0  10.2  108   17-128    51-170 (172)
 24 TIGR02860 spore_IV_B stage IV   96.1   0.012 2.5E-07   50.0   5.3   46   98-144   350-397 (402)
 25 PF05579 Peptidase_S32:  Equine  95.9    0.01 2.2E-07   47.5   3.6   28  105-132   205-232 (297)
 26 KOG1421|consensus               95.9    0.18 3.9E-06   45.5  11.6  150   10-162   577-739 (955)
 27 PF00947 Pico_P2A:  Picornaviru  93.2    0.25 5.4E-06   35.2   5.0   40   96-136    78-117 (127)
 28 PF02907 Peptidase_S29:  Hepati  92.1    0.34 7.4E-06   35.0   4.5   38  104-142   104-146 (148)
 29 PF01732 DUF31:  Putative pepti  90.7     0.2 4.2E-06   42.2   2.6   25  102-126   349-373 (374)
 30 COG5640 Secreted trypsin-like   85.7       2 4.4E-05   36.1   5.3   49  103-151   223-279 (413)
 31 PF02122 Peptidase_S39:  Peptid  85.4     1.2 2.7E-05   34.4   3.8   45   97-142   136-184 (203)
 32 PF12381 Peptidase_C3G:  Tungro  79.2     3.2   7E-05   32.4   3.9   53   97-149   169-228 (231)
 33 PF00571 CBS:  CBS domain CBS d  75.8     3.7   8E-05   24.0   2.8   21  107-127    28-48  (57)
 34 cd01735 LSm12_N LSm12 belongs   71.0      16 0.00034   22.7   4.8   26   17-42     14-39  (61)
 35 PF10459 Peptidase_S46:  Peptid  70.9     6.8 0.00015   35.9   4.4   39   32-71    199-253 (698)
 36 PF14827 Cache_3:  Sensory doma  68.1     5.4 0.00012   27.5   2.6   18  112-129    94-111 (116)
 37 PF02743 Cache_1:  Cache domain  62.4      12 0.00026   23.7   3.3   31  112-150    19-49  (81)
 38 PF05578 Peptidase_S31:  Pestiv  61.4      28 0.00061   25.9   5.3   74   55-130   108-184 (211)
 39 COG2524 Predicted transcriptio  60.7      90   0.002   25.4   8.7   21  107-128   201-221 (294)
 40 cd04627 CBS_pair_14 The CBS do  59.8     8.1 0.00018   26.1   2.2   22  107-128    97-118 (123)
 41 COG0298 HypC Hydrogenase matur  58.9      28 0.00061   22.8   4.4   47   22-69      5-53  (82)
 42 cd04603 CBS_pair_KefB_assoc Th  55.2      12 0.00026   24.9   2.4   22  107-128    85-106 (111)
 43 PF02395 Peptidase_S6:  Immunog  54.5      26 0.00055   32.7   5.0   47  103-149   211-266 (769)
 44 PF03510 Peptidase_C24:  2C end  53.9      33 0.00071   23.7   4.4   23   31-53     34-56  (105)
 45 cd04618 CBS_pair_5 The CBS dom  53.2      29 0.00062   22.7   4.0   48  107-154    22-73  (98)
 46 cd04620 CBS_pair_7 The CBS dom  52.7      14 0.00029   24.5   2.4   22  107-128    89-110 (115)
 47 cd04643 CBS_pair_30 The CBS do  48.4      16 0.00034   24.1   2.2   17  112-128    95-111 (116)
 48 cd04597 CBS_pair_DRTGG_assoc2   47.0      21 0.00046   24.0   2.7   21  107-127    87-107 (113)
 49 cd04619 CBS_pair_6 The CBS dom  46.9      19 0.00042   23.9   2.4   22  107-128    88-109 (114)
 50 cd04592 CBS_pair_EriC_assoc_eu  46.3      22 0.00047   25.0   2.7   22  107-128    22-43  (133)
 51 cd04801 CBS_pair_M50_like This  46.3      20 0.00043   23.7   2.4   22  107-128    88-109 (114)
 52 cd04607 CBS_pair_NTP_transfera  44.9      22 0.00047   23.4   2.4   22  107-128    87-108 (113)
 53 PRK15431 ferrous iron transpor  44.3      20 0.00043   23.4   2.0   27  133-159    24-50  (78)
 54 COG5428 Uncharacterized conser  44.3      43 0.00093   21.3   3.4   18   26-43      2-19  (69)
 55 cd04602 CBS_pair_IMPDH_2 This   43.2      23  0.0005   23.4   2.3   22  107-128    88-109 (114)
 56 cd04590 CBS_pair_CorC_HlyC_ass  43.1      22 0.00047   23.2   2.2   22  107-128    85-106 (111)
 57 cd04617 CBS_pair_4 The CBS dom  42.9      21 0.00047   23.8   2.2   22  107-128    89-113 (118)
 58 COG3448 CBS-domain-containing   42.5      21 0.00045   29.6   2.3   21  108-128   345-365 (382)
 59 cd04582 CBS_pair_ABC_OpuCA_ass  42.5      25 0.00054   22.7   2.4   22  107-128    80-101 (106)
 60 cd04641 CBS_pair_28 The CBS do  42.2      27 0.00059   23.3   2.6   22  106-127    21-42  (120)
 61 cd04601 CBS_pair_IMPDH This cd  42.2      24 0.00053   22.8   2.3   22  107-128    84-105 (110)
 62 smart00116 CBS Domain in cysta  42.1      28  0.0006   18.2   2.2   20  108-127    22-41  (49)
 63 cd04614 CBS_pair_1 The CBS dom  42.0      30 0.00065   22.4   2.7   48  107-154    22-72  (96)
 64 cd04583 CBS_pair_ABC_OpuCA_ass  41.9      25 0.00055   22.7   2.4   22  107-128    83-104 (109)
 65 cd04596 CBS_pair_DRTGG_assoc T  41.3      26 0.00056   22.9   2.4   22  107-128    82-103 (108)
 66 COG3290 CitA Signal transducti  41.0      23 0.00049   31.5   2.4   18  112-129   143-160 (537)
 67 cd04606 CBS_pair_Mg_transporte  40.3      28  0.0006   22.7   2.4   22  107-128    82-103 (109)
 68 cd04642 CBS_pair_29 The CBS do  40.1      28 0.00061   23.5   2.4   20  109-128   102-121 (126)
 69 cd04610 CBS_pair_ParBc_assoc T  38.5      31 0.00067   22.2   2.4   19  110-128    84-102 (107)
 70 cd04640 CBS_pair_27 The CBS do  38.2      28 0.00061   23.6   2.2   22  107-128    99-121 (126)
 71 PF01455 HupF_HypC:  HupF/HypC   38.2   1E+02  0.0022   19.3   5.0   43   22-65      5-47  (68)
 72 cd04600 CBS_pair_HPP_assoc Thi  37.4      32  0.0007   22.9   2.4   22  107-128    98-119 (124)
 73 cd04615 CBS_pair_2 The CBS dom  36.6      34 0.00074   22.3   2.4   22  107-128    87-108 (113)
 74 cd04609 CBS_pair_PALP_assoc2 T  36.4      32  0.0007   22.2   2.2   18  111-128    88-105 (110)
 75 PF09465 LBR_tudor:  Lamin-B re  36.2   1E+02  0.0022   18.7   4.7   42    1-42      1-43  (55)
 76 cd04587 CBS_pair_CAP-ED_DUF294  34.8      36 0.00078   22.2   2.3   18  111-128    91-108 (113)
 77 COG3284 AcoR Transcriptional a  34.2      22 0.00048   32.1   1.4   23  107-129   158-180 (606)
 78 PF15436 PGBA_N:  Plasminogen-b  33.5 1.9E+02  0.0041   22.7   6.2   52   12-64     33-88  (218)
 79 cd04624 CBS_pair_11 The CBS do  32.9      45 0.00097   21.7   2.5   22  107-128    86-107 (112)
 80 cd04611 CBS_pair_PAS_GGDEF_DUF  32.2      43 0.00094   21.6   2.3   22  107-128    85-106 (111)
 81 PF06003 SMN:  Survival motor n  32.2 1.3E+02  0.0028   24.2   5.4   34    9-42     72-106 (264)
 82 PF00741 Gas_vesicle:  Gas vesi  32.1      98  0.0021   17.4   3.3   30  142-171     2-31  (39)
 83 cd04635 CBS_pair_22 The CBS do  32.1      50  0.0011   21.8   2.7   21  107-127    96-116 (122)
 84 PF10049 DUF2283:  Protein of u  31.5      70  0.0015   18.6   2.9   16   27-42      3-18  (50)
 85 PF09012 FeoC:  FeoC like trans  30.3      58  0.0013   20.1   2.5   26  134-159    23-48  (69)
 86 cd01717 Sm_B The eukaryotic Sm  30.2 1.2E+02  0.0026   19.3   4.1   29   13-41     14-42  (79)
 87 cd00218 GlcAT-I Beta1,3-glucur  30.1      63  0.0014   25.4   3.1   32  111-143   136-173 (223)
 88 cd05701 S1_Rrp5_repeat_hs10 S1  29.9 1.4E+02  0.0031   18.8   4.0   14   52-65     42-55  (69)
 89 cd04605 CBS_pair_MET2_assoc Th  29.9      53  0.0011   21.2   2.4   22  107-128    84-105 (110)
 90 cd04621 CBS_pair_8 The CBS dom  29.7      58  0.0013   22.5   2.7   20  108-127    23-42  (135)
 91 TIGR00074 hypC_hupF hydrogenas  29.6 1.3E+02  0.0027   19.5   4.0   44   22-68      5-49  (76)
 92 cd04604 CBS_pair_KpsF_GutQ_ass  29.5      61  0.0013   21.0   2.7   21  107-127    88-108 (114)
 93 cd04585 CBS_pair_ACT_assoc2 Th  29.1      63  0.0014   21.1   2.7   22  107-128    96-117 (122)
 94 cd04632 CBS_pair_19 The CBS do  29.1      62  0.0013   21.7   2.7   21  107-127    22-42  (128)
 95 cd04803 CBS_pair_15 The CBS do  29.0      55  0.0012   21.6   2.4   22  107-128    96-117 (122)
 96 cd04588 CBS_pair_CAP-ED_DUF294  29.0      53  0.0012   21.2   2.3   21  108-128    85-105 (110)
 97 cd04598 CBS_pair_GGDEF_assoc T  28.8      47   0.001   21.8   2.1   18  111-128    97-114 (119)
 98 cd04623 CBS_pair_10 The CBS do  28.3      58  0.0013   21.0   2.4   20  108-127    23-42  (113)
 99 PF08275 Toprim_N:  DNA primase  27.8      69  0.0015   22.6   2.8   17  113-129    82-98  (128)
100 cd04608 CBS_pair_PALP_assoc Th  27.6      59  0.0013   22.1   2.4   21  108-128    24-44  (124)
101 cd04639 CBS_pair_26 The CBS do  27.6      66  0.0014   20.8   2.6   22  107-128    85-106 (111)
102 cd06168 LSm9 The eukaryotic Sm  27.6 1.7E+02  0.0036   18.7   4.3   30   12-41     13-42  (75)
103 PRK09371 gas vesicle synthesis  27.2 1.2E+02  0.0026   19.1   3.4   33  139-171     6-38  (68)
104 cd02205 CBS_pair The CBS domai  27.1      63  0.0014   20.4   2.4   21  107-127    87-107 (113)
105 cd04593 CBS_pair_EriC_assoc_ba  26.7      59  0.0013   21.3   2.2   22  107-128    87-110 (115)
106 cd04595 CBS_pair_DHH_polyA_Pol  26.6      54  0.0012   21.2   2.0   20  108-128    86-105 (110)
107 PF04085 MreC:  rod shape-deter  26.5 2.5E+02  0.0055   20.3   8.6   57   31-87     66-125 (152)
108 COG0517 FOG: CBS domain [Gener  26.5      69  0.0015   20.8   2.5   21  107-127    92-113 (117)
109 cd04637 CBS_pair_24 The CBS do  26.4      65  0.0014   21.3   2.4   22  107-128    96-117 (122)
110 cd01730 LSm3 The eukaryotic Sm  26.1 1.4E+02  0.0031   19.2   3.9   29   12-40     14-42  (82)
111 COG0490 Putative regulatory, l  26.0      81  0.0017   23.6   2.9   14   54-67    133-146 (162)
112 cd04599 CBS_pair_GGDEF_assoc2   25.5      61  0.0013   20.7   2.1   21  107-128    80-100 (105)
113 KOG3888|consensus               25.2      99  0.0021   26.3   3.6   45  109-153   292-338 (407)
114 cd04625 CBS_pair_12 The CBS do  25.2      62  0.0013   21.0   2.1   21  107-128    87-107 (112)
115 cd04631 CBS_pair_18 The CBS do  24.8      72  0.0016   21.1   2.4   22  107-128    99-120 (125)
116 PF08448 PAS_4:  PAS fold;  Int  24.5      80  0.0017   20.0   2.5   17  112-128    86-102 (110)
117 cd04629 CBS_pair_16 The CBS do  23.8      69  0.0015   20.8   2.1   20  108-127    23-42  (114)
118 cd04594 CBS_pair_EriC_assoc_ar  23.5      71  0.0015   20.6   2.1   21  107-128    79-99  (104)
119 cd04612 CBS_pair_SpoIVFB_EriC_  23.4      88  0.0019   20.1   2.6   22  107-128    85-106 (111)
120 cd04630 CBS_pair_17 The CBS do  23.3      71  0.0015   20.9   2.1   21  107-128    89-109 (114)
121 cd01727 LSm8 The eukaryotic Sm  23.3 1.3E+02  0.0029   18.9   3.3   27   14-40     14-40  (74)
122 cd01732 LSm5 The eukaryotic Sm  23.2   2E+02  0.0043   18.4   4.1   29   12-40     16-44  (76)
123 cd04586 CBS_pair_BON_assoc Thi  23.2      80  0.0017   21.5   2.4   21  107-128   110-130 (135)
124 cd04622 CBS_pair_9 The CBS dom  22.8      85  0.0018   20.3   2.4   19  110-128    90-108 (113)
125 cd01728 LSm1 The eukaryotic Sm  22.5 2.2E+02  0.0047   18.1   4.2   53   12-66     15-72  (74)
126 PRK11543 gutQ D-arabinose 5-ph  22.5      73  0.0016   25.8   2.4   22  107-128   292-313 (321)
127 cd04613 CBS_pair_SpoIVFB_EriC_  22.4      87  0.0019   20.1   2.4   20  108-127    23-42  (114)
128 cd01720 Sm_D2 The eukaryotic S  22.3 2.2E+02  0.0047   18.8   4.2   30   12-41     17-46  (87)
129 cd04584 CBS_pair_ACT_assoc Thi  22.1      82  0.0018   20.6   2.3   20  108-127    23-42  (121)
130 PF08669 GCV_T_C:  Glycine clea  21.9 1.1E+02  0.0024   19.9   2.8   23  107-129    32-54  (95)
131 cd01731 archaeal_Sm1 The archa  21.6 2.1E+02  0.0045   17.6   4.3   29   13-41     14-42  (68)
132 cd01722 Sm_F The eukaryotic Sm  20.9 2.2E+02  0.0047   17.5   4.0   31   11-41     13-43  (68)
133 COG1958 LSM1 Small nuclear rib  20.9 2.3E+02  0.0051   17.9   4.1   32   10-41     18-49  (79)
134 cd00600 Sm_like The eukaryotic  20.8   2E+02  0.0042   16.9   4.5   29   13-41     10-38  (63)
135 PRK00737 small nuclear ribonuc  20.6 2.3E+02   0.005   17.7   4.3   30   12-41     17-46  (72)

No 1  
>PRK10139 serine endoprotease; Provisional
Probab=100.00  E-value=3.4e-32  Score=232.08  Aligned_cols=160  Identities=33%  Similarity=0.448  Sum_probs=141.4

Q ss_pred             eeEeeeeEEEEEeecCcEEEEeeeEeeCCCcEEEEEEc-CCCCCceeecCCCCCCCCCEEEEEecCCCCCCceeecEEee
Q psy2771           4 VEKVTQDICLSTFSFNSLLTLPNIAYYFEKHIILFHCL-QNNYPALKLGKAADIRNGEFVIAMGSPLTLNNTNTFGIISN   82 (174)
Q Consensus         4 v~~~a~~~~~~~~~~~~~~~a~~v~~d~~~DlAllkv~-~~~~~~~~l~~~~~~~~G~~v~~~G~p~g~~~~~~~G~vs~   82 (174)
                      |.+.++++.+.. .|++.++|+++++|+.+||||||++ +.++++++|++++.+++||+|+++|||+++..+++.|+|++
T Consensus       110 Vv~~a~~i~V~~-~dg~~~~a~vvg~D~~~DlAvlkv~~~~~l~~~~lg~s~~~~~G~~V~aiG~P~g~~~tvt~GivS~  188 (455)
T PRK10139        110 VINQAQKISIQL-NDGREFDAKLIGSDDQSDIALLQIQNPSKLTQIAIADSDKLRVGDFAVAVGNPFGLGQTATSGIISA  188 (455)
T ss_pred             HhCCCCEEEEEE-CCCCEEEEEEEEEcCCCCEEEEEecCCCCCceeEecCccccCCCCEEEEEecCCCCCCceEEEEEcc
Confidence            344566776654 9999999999999999999999998 46899999999999999999999999999999999999999


Q ss_pred             eccCccccCccccccEEEEeeecCCCCccceEEcCCCcEEEEEeeecC-----CCeEEEEeHHHHHHHHHHHHhCCeeee
Q psy2771          83 KQRSSETLGLNKTINYIQTDAAITFGNSGGPLVNLDGEVIGINSMKVT-----AGISFAIPIDYAIEFLTNYKRKGKFCA  157 (174)
Q Consensus        83 ~~~~~~~~~~~~~~~~~~~~~~~~~G~SGGPl~n~~G~liGI~~~~~~-----~~~~~aiPi~~i~~~l~~l~~~g~~~~  157 (174)
                      ..+.....  .....++++|+.+++|||||||||.+|+||||+++...     .+++||||++.+++++++|+++|++.|
T Consensus       189 ~~r~~~~~--~~~~~~iqtda~in~GnSGGpl~n~~G~vIGi~~~~~~~~~~~~gigfaIP~~~~~~v~~~l~~~g~v~r  266 (455)
T PRK10139        189 LGRSGLNL--EGLENFIQTDASINRGNSGGALLNLNGELIGINTAILAPGGGSVGIGFAIPSNMARTLAQQLIDFGEIKR  266 (455)
T ss_pred             ccccccCC--CCcceEEEECCccCCCCCcceEECCCCeEEEEEEEEEcCCCCccceEEEEEhHHHHHHHHHHhhcCcccc
Confidence            87653211  23457899999999999999999999999999998653     468999999999999999999999999


Q ss_pred             eecCceeee
Q psy2771         158 YSKGKSDLR  166 (174)
Q Consensus       158 ~~lg~~~~~  166 (174)
                      +|+|++..+
T Consensus       267 ~~LGv~~~~  275 (455)
T PRK10139        267 GLLGIKGTE  275 (455)
T ss_pred             cceeEEEEE
Confidence            999998765


No 2  
>PRK10898 serine endoprotease; Provisional
Probab=99.98  E-value=2.2e-31  Score=220.80  Aligned_cols=161  Identities=29%  Similarity=0.406  Sum_probs=140.0

Q ss_pred             eeEeeeeEEEEEeecCcEEEEeeeEeeCCCcEEEEEEcCCCCCceeecCCCCCCCCCEEEEEecCCCCCCceeecEEeee
Q psy2771           4 VEKVTQDICLSTFSFNSLLTLPNIAYYFEKHIILFHCLQNNYPALKLGKAADIRNGEFVIAMGSPLTLNNTNTFGIISNK   83 (174)
Q Consensus         4 v~~~a~~~~~~~~~~~~~~~a~~v~~d~~~DlAllkv~~~~~~~~~l~~~~~~~~G~~v~~~G~p~g~~~~~~~G~vs~~   83 (174)
                      |.+.++++.+.+ .||+.++|+++++|+.+||||||++..++|++++++++.+++|+.|+++|||++...+++.|+|++.
T Consensus        97 Vv~~a~~i~V~~-~dg~~~~a~vv~~d~~~DlAvl~v~~~~l~~~~l~~~~~~~~G~~V~aiG~P~g~~~~~t~Giis~~  175 (353)
T PRK10898         97 VINDADQIIVAL-QDGRVFEALLVGSDSLTDLAVLKINATNLPVIPINPKRVPHIGDVVLAIGNPYNLGQTITQGIISAT  175 (353)
T ss_pred             EeCCCCEEEEEe-CCCCEEEEEEEEEcCCCCEEEEEEcCCCCCeeeccCcCcCCCCCEEEEEeCCCCcCCCcceeEEEec
Confidence            344466666554 8999999999999999999999999888999999998889999999999999999999999999987


Q ss_pred             ccCccccCccccccEEEEeeecCCCCccceEEcCCCcEEEEEeeecC--------CCeEEEEeHHHHHHHHHHHHhCCee
Q psy2771          84 QRSSETLGLNKTINYIQTDAAITFGNSGGPLVNLDGEVIGINSMKVT--------AGISFAIPIDYAIEFLTNYKRKGKF  155 (174)
Q Consensus        84 ~~~~~~~~~~~~~~~~~~~~~~~~G~SGGPl~n~~G~liGI~~~~~~--------~~~~~aiPi~~i~~~l~~l~~~g~~  155 (174)
                      .+.....  .....++++|+.+++|||||||+|.+|+||||+++...        .+++|+||++.+++++++++++|++
T Consensus       176 ~r~~~~~--~~~~~~iqtda~i~~GnSGGPl~n~~G~vvGI~~~~~~~~~~~~~~~g~~faIP~~~~~~~~~~l~~~G~~  253 (353)
T PRK10898        176 GRIGLSP--TGRQNFLQTDASINHGNSGGALVNSLGELMGINTLSFDKSNDGETPEGIGFAIPTQLATKIMDKLIRDGRV  253 (353)
T ss_pred             cccccCC--ccccceEEeccccCCCCCcceEECCCCeEEEEEEEEecccCCCCcccceEEEEchHHHHHHHHHHhhcCcc
Confidence            7643211  22347899999999999999999999999999997542        4689999999999999999999999


Q ss_pred             eeeecCceeeee
Q psy2771         156 CAYSKGKSDLRT  167 (174)
Q Consensus       156 ~~~~lg~~~~~~  167 (174)
                      .|+|+|+...+.
T Consensus       254 ~~~~lGi~~~~~  265 (353)
T PRK10898        254 IRGYIGIGGREI  265 (353)
T ss_pred             cccccceEEEEC
Confidence            999999987653


No 3  
>TIGR02038 protease_degS periplasmic serine peptidase DegS. This family consists of the periplasmic serine protease DegS (HhoB), a shorter paralog of protease DO (HtrA, DegP) and DegQ (HhoA). It is found in E. coli and several other Proteobacteria of the gamma subdivision. It contains a trypsin domain and a single copy of the PDZ domain (in contrast to DegP, which has two copies). A critical role of DegS is to sense stress in the periplasm and partially degrade an inhibitor of sigma(E).
Probab=99.98  E-value=5.5e-31  Score=218.43  Aligned_cols=160  Identities=31%  Similarity=0.437  Sum_probs=139.3

Q ss_pred             eeEeeeeEEEEEeecCcEEEEeeeEeeCCCcEEEEEEcCCCCCceeecCCCCCCCCCEEEEEecCCCCCCceeecEEeee
Q psy2771           4 VEKVTQDICLSTFSFNSLLTLPNIAYYFEKHIILFHCLQNNYPALKLGKAADIRNGEFVIAMGSPLTLNNTNTFGIISNK   83 (174)
Q Consensus         4 v~~~a~~~~~~~~~~~~~~~a~~v~~d~~~DlAllkv~~~~~~~~~l~~~~~~~~G~~v~~~G~p~g~~~~~~~G~vs~~   83 (174)
                      |.+.++.+.+. +.||+.++|+++++|+.+||||||++...+++++++++..+++||+|+++|||++...+++.|+|+..
T Consensus        97 VV~~~~~i~V~-~~dg~~~~a~vv~~d~~~DlAvlkv~~~~~~~~~l~~s~~~~~G~~V~aiG~P~~~~~s~t~GiIs~~  175 (351)
T TIGR02038        97 VIKKADQIVVA-LQDGRKFEAELVGSDPLTDLAVLKIEGDNLPTIPVNLDRPPHVGDVVLAIGNPYNLGQTITQGIISAT  175 (351)
T ss_pred             EeCCCCEEEEE-ECCCCEEEEEEEEecCCCCEEEEEecCCCCceEeccCcCccCCCCEEEEEeCCCCCCCcEEEEEEEec
Confidence            33445555554 59999999999999999999999999888999999988889999999999999999999999999988


Q ss_pred             ccCccccCccccccEEEEeeecCCCCccceEEcCCCcEEEEEeeecC-------CCeEEEEeHHHHHHHHHHHHhCCeee
Q psy2771          84 QRSSETLGLNKTINYIQTDAAITFGNSGGPLVNLDGEVIGINSMKVT-------AGISFAIPIDYAIEFLTNYKRKGKFC  156 (174)
Q Consensus        84 ~~~~~~~~~~~~~~~~~~~~~~~~G~SGGPl~n~~G~liGI~~~~~~-------~~~~~aiPi~~i~~~l~~l~~~g~~~  156 (174)
                      .+.....  .....++++|+.+++|||||||||.+|+||||+++...       .+++|+||++.+++++++++++|++.
T Consensus       176 ~r~~~~~--~~~~~~iqtda~i~~GnSGGpl~n~~G~vIGI~~~~~~~~~~~~~~g~~faIP~~~~~~vl~~l~~~g~~~  253 (351)
T TIGR02038       176 GRNGLSS--VGRQNFIQTDAAINAGNSGGALINTNGELVGINTASFQKGGDEGGEGINFAIPIKLAHKIMGKIIRDGRVI  253 (351)
T ss_pred             cCcccCC--CCcceEEEECCccCCCCCcceEECCCCeEEEEEeeeecccCCCCccceEEEecHHHHHHHHHHHhhcCccc
Confidence            7653211  23357899999999999999999999999999987542       46899999999999999999999999


Q ss_pred             eeecCceeee
Q psy2771         157 AYSKGKSDLR  166 (174)
Q Consensus       157 ~~~lg~~~~~  166 (174)
                      |+|+|+...+
T Consensus       254 r~~lGv~~~~  263 (351)
T TIGR02038       254 RGYIGVSGED  263 (351)
T ss_pred             ceEeeeEEEE
Confidence            9999998765


No 4  
>PRK10942 serine endoprotease; Provisional
Probab=99.97  E-value=4.3e-30  Score=220.07  Aligned_cols=159  Identities=32%  Similarity=0.449  Sum_probs=139.9

Q ss_pred             eEeeeeEEEEEeecCcEEEEeeeEeeCCCcEEEEEEc-CCCCCceeecCCCCCCCCCEEEEEecCCCCCCceeecEEeee
Q psy2771           5 EKVTQDICLSTFSFNSLLTLPNIAYYFEKHIILFHCL-QNNYPALKLGKAADIRNGEFVIAMGSPLTLNNTNTFGIISNK   83 (174)
Q Consensus         5 ~~~a~~~~~~~~~~~~~~~a~~v~~d~~~DlAllkv~-~~~~~~~~l~~~~~~~~G~~v~~~G~p~g~~~~~~~G~vs~~   83 (174)
                      ...++++.++ +.|++.+.|++++.|+.+||||||++ ..++++++|++++.+++|++|+++|||+++..+++.|+|++.
T Consensus       132 v~~a~~i~V~-~~dg~~~~a~vv~~D~~~DlAvlki~~~~~l~~~~lg~s~~l~~G~~V~aiG~P~g~~~tvt~GiVs~~  210 (473)
T PRK10942        132 VDNATKIKVQ-LSDGRKFDAKVVGKDPRSDIALIQLQNPKNLTAIKMADSDALRVGDYTVAIGNPYGLGETVTSGIVSAL  210 (473)
T ss_pred             cCCCCEEEEE-ECCCCEEEEEEEEecCCCCEEEEEecCCCCCceeEecCccccCCCCEEEEEcCCCCCCcceeEEEEEEe
Confidence            3446666665 49999999999999999999999997 568999999999999999999999999999999999999988


Q ss_pred             ccCccccCccccccEEEEeeecCCCCccceEEcCCCcEEEEEeeecC-----CCeEEEEeHHHHHHHHHHHHhCCeeeee
Q psy2771          84 QRSSETLGLNKTINYIQTDAAITFGNSGGPLVNLDGEVIGINSMKVT-----AGISFAIPIDYAIEFLTNYKRKGKFCAY  158 (174)
Q Consensus        84 ~~~~~~~~~~~~~~~~~~~~~~~~G~SGGPl~n~~G~liGI~~~~~~-----~~~~~aiPi~~i~~~l~~l~~~g~~~~~  158 (174)
                      .+....  ......++++|+.+++|+|||||+|.+|+||||+++...     .+++|+||++.+++++++|++.|++.|+
T Consensus       211 ~r~~~~--~~~~~~~iqtda~i~~GnSGGpL~n~~GeviGI~t~~~~~~g~~~g~gfaIP~~~~~~v~~~l~~~g~v~rg  288 (473)
T PRK10942        211 GRSGLN--VENYENFIQTDAAINRGNSGGALVNLNGELIGINTAILAPDGGNIGIGFAIPSNMVKNLTSQMVEYGQVKRG  288 (473)
T ss_pred             ecccCC--cccccceEEeccccCCCCCcCccCCCCCeEEEEEEEEEcCCCCcccEEEEEEHHHHHHHHHHHHhccccccc
Confidence            764221  123457899999999999999999999999999998653     3589999999999999999999999999


Q ss_pred             ecCceeee
Q psy2771         159 SKGKSDLR  166 (174)
Q Consensus       159 ~lg~~~~~  166 (174)
                      |+|+...+
T Consensus       289 ~lGv~~~~  296 (473)
T PRK10942        289 ELGIMGTE  296 (473)
T ss_pred             eeeeEeee
Confidence            99998765


No 5  
>TIGR02037 degP_htrA_DO periplasmic serine protease, Do/DeqQ family. This family consists of a set of proteins variously designated DegP, heat shock protein HtrA, and protease DO. The ortholog in Pseudomonas aeruginosa is designated MucD and is found in an operon that controls mucoid phenotype. This family also includes the DegQ (HhoA) paralog in E. coli, which can rescue a DegP mutant, but not the smaller DegS paralog, which cannot. Members of this family are located in the periplasm and have separable functions as both protease and chaperone. Members have a trypsin domain and two copies of a PDZ domain. This protein protects bacteria from thermal and other stresses and may be important for the survival of bacterial pathogens. The chaperone function is dominant at low temperatures, whereas the proteolytic activity is turned on at elevated temperatures.
Probab=99.97  E-value=4.7e-29  Score=211.76  Aligned_cols=159  Identities=36%  Similarity=0.506  Sum_probs=139.1

Q ss_pred             EeeeeEEEEEeecCcEEEEeeeEeeCCCcEEEEEEcCC-CCCceeecCCCCCCCCCEEEEEecCCCCCCceeecEEeeec
Q psy2771           6 KVTQDICLSTFSFNSLLTLPNIAYYFEKHIILFHCLQN-NYPALKLGKAADIRNGEFVIAMGSPLTLNNTNTFGIISNKQ   84 (174)
Q Consensus         6 ~~a~~~~~~~~~~~~~~~a~~v~~d~~~DlAllkv~~~-~~~~~~l~~~~~~~~G~~v~~~G~p~g~~~~~~~G~vs~~~   84 (174)
                      ..+.++.+.+ .+++.++|+++++|+..||||||++.. .+++++|++++.+++|++|+++|||++...+++.|+|++..
T Consensus        79 ~~~~~i~V~~-~~~~~~~a~vv~~d~~~DlAllkv~~~~~~~~~~l~~~~~~~~G~~v~aiG~p~g~~~~~t~G~vs~~~  157 (428)
T TIGR02037        79 DGADEITVTL-SDGREFKAKLVGKDPRTDIAVLKIDAKKNLPVIKLGDSDKLRVGDWVLAIGNPFGLGQTVTSGIVSALG  157 (428)
T ss_pred             CCCCeEEEEe-CCCCEEEEEEEEecCCCCEEEEEecCCCCceEEEccCCCCCCCCCEEEEEECCCcCCCcEEEEEEEecc
Confidence            3456666554 899999999999999999999999864 89999999988899999999999999999999999999887


Q ss_pred             cCccccCccccccEEEEeeecCCCCccceEEcCCCcEEEEEeeecC-----CCeEEEEeHHHHHHHHHHHHhCCeeeeee
Q psy2771          85 RSSETLGLNKTINYIQTDAAITFGNSGGPLVNLDGEVIGINSMKVT-----AGISFAIPIDYAIEFLTNYKRKGKFCAYS  159 (174)
Q Consensus        85 ~~~~~~~~~~~~~~~~~~~~~~~G~SGGPl~n~~G~liGI~~~~~~-----~~~~~aiPi~~i~~~l~~l~~~g~~~~~~  159 (174)
                      +...  .......++++|+++++|+|||||||.+|+||||+++...     .+++|+||++.++++++++++.|++.++|
T Consensus       158 ~~~~--~~~~~~~~i~tda~i~~GnSGGpl~n~~G~viGI~~~~~~~~g~~~g~~faiP~~~~~~~~~~l~~~g~~~~~~  235 (428)
T TIGR02037       158 RSGL--GIGDYENFIQTDAAINPGNSGGPLVNLRGEVIGINTAIYSPSGGNVGIGFAIPSNMAKNVVDQLIEGGKVQRGW  235 (428)
T ss_pred             cCcc--CCCCccceEEECCCCCCCCCCCceECCCCeEEEEEeEEEcCCCCccceEEEEEhHHHHHHHHHHHhcCcCcCCc
Confidence            6531  1123456899999999999999999999999999987653     46899999999999999999999999999


Q ss_pred             cCceeeee
Q psy2771         160 KGKSDLRT  167 (174)
Q Consensus       160 lg~~~~~~  167 (174)
                      ||+...+.
T Consensus       236 lGi~~~~~  243 (428)
T TIGR02037       236 LGVTIQEV  243 (428)
T ss_pred             CceEeecC
Confidence            99987653


No 6  
>COG0265 DegQ Trypsin-like serine proteases, typically periplasmic, contain C-terminal PDZ domain [Posttranslational modification, protein turnover, chaperones]
Probab=99.92  E-value=1.2e-23  Score=174.27  Aligned_cols=161  Identities=33%  Similarity=0.480  Sum_probs=140.8

Q ss_pred             eeEeeeeEEEEEeecCcEEEEeeeEeeCCCcEEEEEEcCCC-CCceeecCCCCCCCCCEEEEEecCCCCCCceeecEEee
Q psy2771           4 VEKVTQDICLSTFSFNSLLTLPNIAYYFEKHIILFHCLQNN-YPALKLGKAADIRNGEFVIAMGSPLTLNNTNTFGIISN   82 (174)
Q Consensus         4 v~~~a~~~~~~~~~~~~~~~a~~v~~d~~~DlAllkv~~~~-~~~~~l~~~~~~~~G~~v~~~G~p~g~~~~~~~G~vs~   82 (174)
                      |...|+++.+.. .||+.++++++++|+..|+|++|++... ++.+.++++..++.|+.++++|+|+++..+++.|+++.
T Consensus        91 Vi~~a~~i~v~l-~dg~~~~a~~vg~d~~~dlavlki~~~~~~~~~~~~~s~~l~vg~~v~aiGnp~g~~~tvt~Givs~  169 (347)
T COG0265          91 VIAGAEEITVTL-ADGREVPAKLVGKDPISDLAVLKIDGAGGLPVIALGDSDKLRVGDVVVAIGNPFGLGQTVTSGIVSA  169 (347)
T ss_pred             ecCCcceEEEEe-CCCCEEEEEEEecCCccCEEEEEeccCCCCceeeccCCCCcccCCEEEEecCCCCcccceeccEEec
Confidence            344466666666 9999999999999999999999999654 89999999999999999999999999999999999999


Q ss_pred             eccCccccCccccccEEEEeeecCCCCccceEEcCCCcEEEEEeeecCC-----CeEEEEeHHHHHHHHHHHHhCCeeee
Q psy2771          83 KQRSSETLGLNKTINYIQTDAAITFGNSGGPLVNLDGEVIGINSMKVTA-----GISFAIPIDYAIEFLTNYKRKGKFCA  157 (174)
Q Consensus        83 ~~~~~~~~~~~~~~~~~~~~~~~~~G~SGGPl~n~~G~liGI~~~~~~~-----~~~~aiPi~~i~~~l~~l~~~g~~~~  157 (174)
                      ..+. ..........++++|+.+++|+||||++|.+|++|||++.....     +++|+||++.+..++.++.++|++.+
T Consensus       170 ~~r~-~v~~~~~~~~~IqtdAain~gnsGgpl~n~~g~~iGint~~~~~~~~~~gigfaiP~~~~~~v~~~l~~~G~v~~  248 (347)
T COG0265         170 LGRT-GVGSAGGYVNFIQTDAAINPGNSGGPLVNIDGEVVGINTAIIAPSGGSSGIGFAIPVNLVAPVLDELISKGKVVR  248 (347)
T ss_pred             cccc-cccCcccccchhhcccccCCCCCCCceEcCCCcEEEEEEEEecCCCCcceeEEEecHHHHHHHHHHHHHcCCccc
Confidence            9885 11111235688999999999999999999999999999988752     47999999999999999999999999


Q ss_pred             eecCceeee
Q psy2771         158 YSKGKSDLR  166 (174)
Q Consensus       158 ~~lg~~~~~  166 (174)
                      +++|+...+
T Consensus       249 ~~lgv~~~~  257 (347)
T COG0265         249 GYLGVIGEP  257 (347)
T ss_pred             cccceEEEE
Confidence            999998764


No 7  
>KOG1320|consensus
Probab=99.58  E-value=1.2e-14  Score=123.16  Aligned_cols=140  Identities=37%  Similarity=0.431  Sum_probs=121.6

Q ss_pred             ecCcEEEEeeeEeeCCCcEEEEEEc--CCCCCceeecCCCCCCCCCEEEEEecCCCCCCceeecEEeeeccCccccCcc-
Q psy2771          17 SFNSLLTLPNIAYYFEKHIILFHCL--QNNYPALKLGKAADIRNGEFVIAMGSPLTLNNTNTFGIISNKQRSSETLGLN-   93 (174)
Q Consensus        17 ~~~~~~~a~~v~~d~~~DlAllkv~--~~~~~~~~l~~~~~~~~G~~v~~~G~p~g~~~~~~~G~vs~~~~~~~~~~~~-   93 (174)
                      ..+...++.+.+.|+..|+|+++++  ..-+++++++-+.++..|+++.++|.|++..++.+.|.++...+..+..+.. 
T Consensus       208 ~~~~s~ep~i~g~d~~~gvA~l~ik~~~~i~~~i~~~~~~~~~~G~~~~a~~~~f~~~nt~t~g~vs~~~R~~~~lg~~~  287 (473)
T KOG1320|consen  208 GPGNSGEPVIVGVDKVAGVAFLKIKTPENILYVIPLGVSSHFRTGVEVSAIGNGFGLLNTLTQGMVSGQLRKSFKLGLET  287 (473)
T ss_pred             cCCccCCCeEEccccccceEEEEEecCCcccceeecceeeeecccceeeccccCceeeeeeeecccccccccccccCccc
Confidence            3448899999999999999999995  3348889999899999999999999999999999999999998877765544 


Q ss_pred             --ccccEEEEeeecCCCCccceEEcCCCcEEEEEeeecC-----CCeEEEEeHHHHHHHHHHHHhCCeee
Q psy2771          94 --KTINYIQTDAAITFGNSGGPLVNLDGEVIGINSMKVT-----AGISFAIPIDYAIEFLTNYKRKGKFC  156 (174)
Q Consensus        94 --~~~~~~~~~~~~~~G~SGGPl~n~~G~liGI~~~~~~-----~~~~~aiPi~~i~~~l~~l~~~g~~~  156 (174)
                        ...+++++|+.++.|+||||++|.+|++||+.++...     .+++|++|.+.+..++.+..+.....
T Consensus       288 g~~i~~~~qtd~ai~~~nsg~~ll~~DG~~IgVn~~~~~ri~~~~~iSf~~p~d~vl~~v~r~~e~~~~l  357 (473)
T KOG1320|consen  288 GVLISKINQTDAAINPGNSGGPLLNLDGEVIGVNTRKVTRIGFSHGISFKIPIDTVLVIVLRLGEFQISL  357 (473)
T ss_pred             ceeeeeecccchhhhcccCCCcEEEecCcEeeeeeeeeEEeeccccceeccCchHhhhhhhhhhhhceee
Confidence              5568899999999999999999999999999998765     68999999999999998885444443


No 8  
>PF13365 Trypsin_2:  Trypsin-like peptidase domain; PDB: 1Y8T_A 2Z9I_A 3QO6_A 1L1J_A 1QY6_A 2O8L_A 3OTP_E 2ZLE_I 1KY9_A 3CS0_A ....
Probab=99.11  E-value=6.2e-10  Score=77.60  Aligned_cols=84  Identities=21%  Similarity=0.272  Sum_probs=50.2

Q ss_pred             EEEEEeecCcEEE--EeeeEeeCC-CcEEEEEEcCCCCCceeecCCCCCCCCCEEEEEecCCCCCCceeecEEeeeccCc
Q psy2771          11 ICLSTFSFNSLLT--LPNIAYYFE-KHIILFHCLQNNYPALKLGKAADIRNGEFVIAMGSPLTLNNTNTFGIISNKQRSS   87 (174)
Q Consensus        11 ~~~~~~~~~~~~~--a~~v~~d~~-~DlAllkv~~~~~~~~~l~~~~~~~~G~~v~~~G~p~g~~~~~~~G~vs~~~~~~   87 (174)
                      +.+.. .++....  ++++..|+. .|+|||+++                   .....+...     ...+.........
T Consensus        33 ~~~~~-~~~~~~~~~~~~~~~~~~~~D~All~v~-------------------~~~~~~~~~-----~~~~~~~~~~~~~   87 (120)
T PF13365_consen   33 VEVVF-PDGRRVPPVAEVVYFDPDDYDLALLKVD-------------------PWTGVGGGV-----RVPGSTSGVSPTS   87 (120)
T ss_dssp             EEEEE-TTSCEEETEEEEEEEETT-TTEEEEEES-------------------CEEEEEEEE-----EEEEEEEEEEEEE
T ss_pred             EEEEe-cCCCEEeeeEEEEEECCccccEEEEEEe-------------------cccceeeee-----Eeeeecccccccc
Confidence            34444 6777788  999999999 999999999                   000000000     0000000000000


Q ss_pred             cccCccccccEEE-EeeecCCCCccceEEcCCCcEEEE
Q psy2771          88 ETLGLNKTINYIQ-TDAAITFGNSGGPLVNLDGEVIGI  124 (174)
Q Consensus        88 ~~~~~~~~~~~~~-~~~~~~~G~SGGPl~n~~G~liGI  124 (174)
                           ........ +++++.+|+|||||||.+|+||||
T Consensus        88 -----~~~~~~~~~~~~~~~~G~SGgpv~~~~G~vvGi  120 (120)
T PF13365_consen   88 -----TNDNRMLYITDADTRPGSSGGPVFDSDGRVVGI  120 (120)
T ss_dssp             -----EEETEEEEEESSS-STTTTTSEEEETTSEEEEE
T ss_pred             -----CcccceeEeeecccCCCcEeHhEECCCCEEEeC
Confidence                 01111112 799999999999999999999997


No 9  
>PF00089 Trypsin:  Trypsin;  InterPro: IPR001254 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base. The geometric orientations of the catalytic residues are similar between families, despite different protein folds. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC). This group of serine proteases belongs to the MEROPS peptidase family S1 (chymotrypsin family, clan PA(S)) and to peptidase family S6 (Hap serine peptidases). The chymotrypsin family is almost totally confined to animals, although trypsin-like enzymes are found in actinomycetes of the genera Streptomyces and Saccharopolyspora, and in the fungus Fusarium oxysporum. The enzymes are inherently secreted, being synthesised with a signal peptide that targets them to the secretory pathway. Animal enzymes are either secreted directly, packaged into vesicles for regulated secretion, or are retained in leukocyte granules. The Hap family, 'Haemophilus adhesion and penetration', are proteins that play a role in the interaction with human epithelial cells. The serine protease activity is localized at the N-terminal domain, whereas the binding domain is in the C-terminal region. 
; GO: 0004252 serine-type endopeptidase activity, 0006508 proteolysis; PDB: 1SPJ_A 1A5I_A 2ZGH_A 2ZKS_A 2ZGJ_A 2ZGC_A 2ODP_A 2I6Q_A 2I6S_A 2ODQ_A ....
Probab=98.67  E-value=3.2e-07  Score=69.93  Aligned_cols=115  Identities=22%  Similarity=0.230  Sum_probs=75.6

Q ss_pred             CCcEEEEEEcCC-----CCCceeecCC-CCCCCCCEEEEEecCCCCCC----ceeecEEeeeccCccc--cCccccccEE
Q psy2771          32 EKHIILFHCLQN-----NYPALKLGKA-ADIRNGEFVIAMGSPLTLNN----TNTFGIISNKQRSSET--LGLNKTINYI   99 (174)
Q Consensus        32 ~~DlAllkv~~~-----~~~~~~l~~~-~~~~~G~~v~~~G~p~g~~~----~~~~G~vs~~~~~~~~--~~~~~~~~~~   99 (174)
                      ..|+|||+++..     ...++.+... ..++.|+.+.++|++.....    .+....+.-.......  .........+
T Consensus        86 ~~DiAll~L~~~~~~~~~~~~~~l~~~~~~~~~~~~~~~~G~~~~~~~~~~~~~~~~~~~~~~~~~c~~~~~~~~~~~~~  165 (220)
T PF00089_consen   86 DNDIALLKLDRPITFGDNIQPICLPSAGSDPNVGTSCIVVGWGRTSDNGYSSNLQSVTVPVVSRKTCRSSYNDNLTPNMI  165 (220)
T ss_dssp             TTSEEEEEESSSSEHBSSBEESBBTSTTHTTTTTSEEEEEESSBSSTTSBTSBEEEEEEEEEEHHHHHHHTTTTSTTTEE
T ss_pred             cccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccc
Confidence            469999999843     5567777652 34689999999999976332    3333333322221110  0001223444


Q ss_pred             EEee----ecCCCCccceEEcCCCcEEEEEeeecCC----CeEEEEeHHHHHHHH
Q psy2771         100 QTDA----AITFGNSGGPLVNLDGEVIGINSMKVTA----GISFAIPIDYAIEFL  146 (174)
Q Consensus       100 ~~~~----~~~~G~SGGPl~n~~G~liGI~~~~~~~----~~~~aiPi~~i~~~l  146 (174)
                      ....    ..++|+|||||++.++.|+||++.+..+    ...++.+++...+|+
T Consensus       166 c~~~~~~~~~~~g~sG~pl~~~~~~lvGI~s~~~~c~~~~~~~v~~~v~~~~~WI  220 (220)
T PF00089_consen  166 CAGSSGSGDACQGDSGGPLICNNNYLVGIVSFGENCGSPNYPGVYTRVSSYLDWI  220 (220)
T ss_dssp             EEETTSSSBGGTTTTTSEEEETTEEEEEEEEEESSSSBTTSEEEEEEGGGGHHHH
T ss_pred             cccccccccccccccccccccceeeecceeeecCCCCCCCcCEEEEEHHHhhccC
Confidence            4444    7899999999997777899999998542    258899998887764


No 10 
>KOG1421|consensus
Probab=98.57  E-value=1.7e-07  Score=82.32  Aligned_cols=147  Identities=16%  Similarity=0.190  Sum_probs=117.9

Q ss_pred             cEEEEeeeEeeCCCcEEEEEEcCCC-----CCceeecCCCCCCCCCEEEEEecCCCCCCceeecEEeeeccCccccCc--
Q psy2771          20 SLLTLPNIAYYFEKHIILFHCLQNN-----YPALKLGKAADIRNGEFVIAMGSPLTLNNTNTFGIISNKQRSSETLGL--   92 (174)
Q Consensus        20 ~~~~a~~v~~d~~~DlAllkv~~~~-----~~~~~l~~~~~~~~G~~v~~~G~p~g~~~~~~~G~vs~~~~~~~~~~~--   92 (174)
                      ...+.-.+..|+-+|+.+++.+++.     +..+.++. +..++|.+++++|+-.+.-.++-.|.++++.+....++.  
T Consensus       120 ee~ei~pvyrDpVhdfGf~r~dps~ir~s~vt~i~lap-~~akvgseirvvgNDagEklsIlagflSrldr~apdyg~~~  198 (955)
T KOG1421|consen  120 EEIEIYPVYRDPVHDFGFFRYDPSTIRFSIVTEICLAP-ELAKVGSEIRVVGNDAGEKLSILAGFLSRLDRNAPDYGEDT  198 (955)
T ss_pred             ccCCcccccCCchhhcceeecChhhcceeeeeccccCc-cccccCCceEEecCCccceEEeehhhhhhccCCCccccccc
Confidence            3334444788999999999998643     34444543 345899999999998888888889999988887765544  


Q ss_pred             --cccccEEEEeeecCCCCccceEEcCCCcEEEEEeeecC-CCeEEEEeHHHHHHHHHHHHhCCeeeeeecCceeeee
Q psy2771          93 --NKTINYIQTDAAITFGNSGGPLVNLDGEVIGINSMKVT-AGISFAIPIDYAIEFLTNYKRKGKFCAYSKGKSDLRT  167 (174)
Q Consensus        93 --~~~~~~~~~~~~~~~G~SGGPl~n~~G~liGI~~~~~~-~~~~~aiPi~~i~~~l~~l~~~g~~~~~~lg~~~~~~  167 (174)
                        .....++|.......|.||+||+|.+|..|..+..+.. .+..|++|++.+.+.+.-++++..++|+-|.++.+.+
T Consensus       199 yndfnTfy~QaasstsggssgspVv~i~gyAVAl~agg~~ssas~ffLpLdrV~RaL~clq~n~PItRGtLqvefl~k  276 (955)
T KOG1421|consen  199 YNDFNTFYIQAASSTSGGSSGSPVVDIPGYAVALNAGGSISSASDFFLPLDRVVRALRCLQNNTPITRGTLQVEFLHK  276 (955)
T ss_pred             cccccceeeeehhcCCCCCCCCceecccceEEeeecCCcccccccceeeccchhhhhhhhhcCCCcccceEEEEEehh
Confidence              33346678888899999999999999999999997764 5679999999999999999999999999888877654


No 11 
>cd00190 Tryp_SPc Trypsin-like serine protease; Many of these are synthesized as inactive precursor zymogens that are cleaved during limited proteolysis to generate their active forms. Alignment contains also inactive enzymes that have substitutions of the catalytic triad residues.
Probab=98.33  E-value=7.2e-06  Score=62.80  Aligned_cols=116  Identities=18%  Similarity=0.194  Sum_probs=68.1

Q ss_pred             CCcEEEEEEcC-----CCCCceeecCCC-CCCCCCEEEEEecCCCCCC-----ceeecEEeeeccC--ccccC--ccccc
Q psy2771          32 EKHIILFHCLQ-----NNYPALKLGKAA-DIRNGEFVIAMGSPLTLNN-----TNTFGIISNKQRS--SETLG--LNKTI   96 (174)
Q Consensus        32 ~~DlAllkv~~-----~~~~~~~l~~~~-~~~~G~~v~~~G~p~g~~~-----~~~~G~vs~~~~~--~~~~~--~~~~~   96 (174)
                      ..|+||||++.     ..+.|+.|.... .+..|+.+.+.|+......     ......+.-....  .....  .....
T Consensus        88 ~~DiAll~L~~~~~~~~~v~picl~~~~~~~~~~~~~~~~G~g~~~~~~~~~~~~~~~~~~~~~~~~C~~~~~~~~~~~~  167 (232)
T cd00190          88 DNDIALLKLKRPVTLSDNVRPICLPSSGYNLPAGTTCTVSGWGRTSEGGPLPDVLQEVNVPIVSNAECKRAYSYGGTITD  167 (232)
T ss_pred             cCCEEEEEECCcccCCCcccceECCCccccCCCCCEEEEEeCCcCCCCCCCCceeeEEEeeeECHHHhhhhccCcccCCC
Confidence            57999999973     236788886653 5788999999998764321     1122222111110  00000  00000


Q ss_pred             cEE-E----EeeecCCCCccceEEcCC---CcEEEEEeeecCC----CeEEEEeHHHHHHHHH
Q psy2771          97 NYI-Q----TDAAITFGNSGGPLVNLD---GEVIGINSMKVTA----GISFAIPIDYAIEFLT  147 (174)
Q Consensus        97 ~~~-~----~~~~~~~G~SGGPl~n~~---G~liGI~~~~~~~----~~~~aiPi~~i~~~l~  147 (174)
                      ..+ .    .....|+|+|||||+...   ..|+||++++..+    ....+..+....+|++
T Consensus       168 ~~~C~~~~~~~~~~c~gdsGgpl~~~~~~~~~lvGI~s~g~~c~~~~~~~~~t~v~~~~~WI~  230 (232)
T cd00190         168 NMLCAGGLEGGKDACQGDSGGPLVCNDNGRGVLVGIVSWGSGCARPNYPGVYTRVSSYLDWIQ  230 (232)
T ss_pred             ceEeeCCCCCCCccccCCCCCcEEEEeCCEEEEEEEEehhhccCCCCCCCEEEEcHHhhHHhh
Confidence            111 1    145578999999999654   7899999987632    2345566666666664


No 12 
>PF00863 Peptidase_C4:  Peptidase family C4 This family belongs to family C4 of the peptidase classification.;  InterPro: IPR001730 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. 
These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad.  Nuclear inclusion A (NIA) proteases from potyviruses are cysteine peptidases that belong to the MEROPS peptidase family C4 (NIa protease family, clan PA(C)).  Potyviruses include plant viruses in which the single-stranded RNA encodes a polyprotein with NIA protease activity, where proteolytic cleavage is specific for Gln+Gly sites. The NIA protease acts on the polyprotein, releasing itself by Gln+Gly cleavage at both the N- and C-termini. It further processes the polyprotein by cleavage at five similar sites in the C-terminal half of the sequence. In addition to its C-terminal protease activity, the NIA protease contains an N-terminal domain that has been implicated in the transcription process. This peptidase is present in the nuclear inclusion protein of potyviruses.; GO: 0008234 cysteine-type peptidase activity, 0006508 proteolysis; PDB: 3MMG_B 1Q31_B 1LVB_A 1LVM_A.
Probab=98.20  E-value=3.9e-05  Score=60.38  Aligned_cols=115  Identities=20%  Similarity=0.320  Sum_probs=59.1

Q ss_pred             eEeeCCCcEEEEEEcCCCCCceeec-CCCCCCCCCEEEEEecCCCCCC-ceeecEEeeeccCccccCccccccEEEEeee
Q psy2771          27 IAYYFEKHIILFHCLQNNYPALKLG-KAADIRNGEFVIAMGSPLTLNN-TNTFGIISNKQRSSETLGLNKTINYIQTDAA  104 (174)
Q Consensus        27 v~~d~~~DlAllkv~~~~~~~~~l~-~~~~~~~G~~v~~~G~p~g~~~-~~~~G~vs~~~~~~~~~~~~~~~~~~~~~~~  104 (174)
                      +..-+..||.++|... ++||.+-. .-..++.++.|+++|.-+.... +.+.-.-+...+       .....+..+-..
T Consensus        76 v~~i~~~DiviirmPk-DfpPf~~kl~FR~P~~~e~v~mVg~~fq~k~~~s~vSesS~i~p-------~~~~~fWkHwIs  147 (235)
T PF00863_consen   76 VHPIEGRDIVIIRMPK-DFPPFPQKLKFRAPKEGERVCMVGSNFQEKSISSTVSESSWIYP-------EENSHFWKHWIS  147 (235)
T ss_dssp             EEE-TCSSEEEEE--T-TS----S---B----TT-EEEEEEEECSSCCCEEEEEEEEEEEE-------ETTTTEEEE-C-
T ss_pred             eEEeCCccEEEEeCCc-ccCCcchhhhccCCCCCCEEEEEEEEEEcCCeeEEECCceEEee-------cCCCCeeEEEec
Confidence            4555688999999984 56665531 2356899999999997544332 111111111111       234477788888


Q ss_pred             cCCCCccceEEcC-CCcEEEEEeeecC-CCeEEEEeHHHHHHHHHHHHh
Q psy2771         105 ITFGNSGGPLVNL-DGEVIGINSMKVT-AGISFAIPIDYAIEFLTNYKR  151 (174)
Q Consensus       105 ~~~G~SGGPl~n~-~G~liGI~~~~~~-~~~~~aiPi~~i~~~l~~l~~  151 (174)
                      +..|+=|.||++. +|++|||++.... ...||+.|+..  ++.+.+.+
T Consensus       148 Tk~G~CG~PlVs~~Dg~IVGiHsl~~~~~~~N~F~~f~~--~f~~~~l~  194 (235)
T PF00863_consen  148 TKDGDCGLPLVSTKDGKIVGIHSLTSNTSSRNYFTPFPD--DFEEFYLE  194 (235)
T ss_dssp             --TT-TT-EEEETTT--EEEEEEEEETTTSSEEEEE--T--THHHHHCC
T ss_pred             CCCCccCCcEEEcCCCcEEEEEcCccCCCCeEEEEcCCH--HHHHHHhc
Confidence            8999999999976 5999999998764 45788877753  44444443


No 13 
>KOG1320|consensus
Probab=98.18  E-value=2.7e-06  Score=72.82  Aligned_cols=130  Identities=19%  Similarity=0.181  Sum_probs=100.6

Q ss_pred             CcEEEEeeeEeeCCCcEEEEEEcCC----CCCceeecCCCCCCCCCEEEEEecCCCCCCceeecEEeeeccCccccCccc
Q psy2771          19 NSLLTLPNIAYYFEKHIILFHCLQN----NYPALKLGKAADIRNGEFVIAMGSPLTLNNTNTFGIISNKQRSSETLGLNK   94 (174)
Q Consensus        19 ~~~~~a~~v~~d~~~DlAllkv~~~----~~~~~~l~~~~~~~~G~~v~~~G~p~g~~~~~~~G~vs~~~~~~~~~~~~~   94 (174)
                      -+.+.+++...-.++|+|++.++..    ...|+.+++.  +.-.+.++++|   +....++.|.|++.....+..+ ..
T Consensus       123 ~~k~~~~v~~~~~~cd~Avv~Ie~~~f~~~~~~~e~~~i--p~l~~S~~Vv~---gd~i~VTnghV~~~~~~~y~~~-~~  196 (473)
T KOG1320|consen  123 PRKYKAFVAAVFEECDLAVVYIESEEFWKGMNPFELGDI--PSLNGSGFVVG---GDGIIVTNGHVVRVEPRIYAHS-ST  196 (473)
T ss_pred             chhhhhhHHHhhhcccceEEEEeeccccCCCcccccCCC--cccCccEEEEc---CCcEEEEeeEEEEEEeccccCC-Cc
Confidence            3567788888888999999999853    3335666554  56678899998   7788999999999887655433 23


Q ss_pred             cccEEEEeeecCCCCccceEEcCCCcEEEEEeeec--CCCeEEEEeHHHHHHHHHHHHhCCe
Q psy2771          95 TINYIQTDAAITFGNSGGPLVNLDGEVIGINSMKV--TAGISFAIPIDYAIEFLTNYKRKGK  154 (174)
Q Consensus        95 ~~~~~~~~~~~~~G~SGGPl~n~~G~liGI~~~~~--~~~~~~aiPi~~i~~~l~~l~~~g~  154 (174)
                      ....+++++.+.+|+||+|.+...++..|+++...  .+++.+.||.-.+.++.......+.
T Consensus       197 ~l~~vqi~aa~~~~~s~ep~i~g~d~~~gvA~l~ik~~~~i~~~i~~~~~~~~~~G~~~~a~  258 (473)
T KOG1320|consen  197 VLLRVQIDAAIGPGNSGEPVIVGVDKVAGVAFLKIKTPENILYVIPLGVSSHFRTGVEVSAI  258 (473)
T ss_pred             ceeeEEEEEeecCCccCCCeEEccccccceEEEEEecCCcccceeecceeeeecccceeecc
Confidence            44568999999999999999988899999999887  4567889998777777665544443


No 14 
>COG3591 V8-like Glu-specific endopeptidase [Amino acid transport and metabolism]
Probab=98.05  E-value=7.9e-05  Score=59.16  Aligned_cols=87  Identities=21%  Similarity=0.232  Sum_probs=61.1

Q ss_pred             CCCCCCCCEEEEEecCCCCCCc----eeecEEeeeccCccccCccccccEEEEeeecCCCCccceEEcCCCcEEEEEeee
Q psy2771          53 AADIRNGEFVIAMGSPLTLNNT----NTFGIISNKQRSSETLGLNKTINYIQTDAAITFGNSGGPLVNLDGEVIGINSMK  128 (174)
Q Consensus        53 ~~~~~~G~~v~~~G~p~g~~~~----~~~G~vs~~~~~~~~~~~~~~~~~~~~~~~~~~G~SGGPl~n~~G~liGI~~~~  128 (174)
                      ....+.++.+-++|||.+..+.    ...+.+....           ...+.+++.+++|+||+||++.+.++||++..+
T Consensus       155 ~~~~~~~d~i~v~GYP~dk~~~~~~~e~t~~v~~~~-----------~~~l~y~~dT~pG~SGSpv~~~~~~vigv~~~g  223 (251)
T COG3591         155 ASEAKANDRITVIGYPGDKPNIGTMWESTGKVNSIK-----------GNKLFYDADTLPGSSGSPVLISKDEVIGVHYNG  223 (251)
T ss_pred             ccccccCceeEEEeccCCCCcceeEeeecceeEEEe-----------cceEEEEecccCCCCCCceEecCceEEEEEecC
Confidence            3457899999999999775532    2233333221           246899999999999999999999999999987


Q ss_pred             cCC----CeEE-EEeHHHHHHHHHHHH
Q psy2771         129 VTA----GISF-AIPIDYAIEFLTNYK  150 (174)
Q Consensus       129 ~~~----~~~~-aiPi~~i~~~l~~l~  150 (174)
                      ...    ..++ +.-...++++++++.
T Consensus       224 ~~~~~~~~~n~~vr~t~~~~~~I~~~~  250 (251)
T COG3591         224 PGANGGSLANNAVRLTPEILNFIQQNI  250 (251)
T ss_pred             CCcccccccCcceEecHHHHHHHHHhh
Confidence            651    2333 344456677776654


No 15 
>smart00020 Tryp_SPc Trypsin-like serine protease. Many of these are synthesised as inactive precursor zymogens that are cleaved during limited proteolysis to generate their active forms. A few, however, are active as single chain molecules, and others are inactive due to substitutions of the catalytic triad residues.
Probab=97.87  E-value=0.00014  Score=55.76  Aligned_cols=100  Identities=20%  Similarity=0.200  Sum_probs=57.8

Q ss_pred             CCCcEEEEEEcC-----CCCCceeecCC-CCCCCCCEEEEEecCCCCC------CceeecEEeeeccCcc--ccC---cc
Q psy2771          31 FEKHIILFHCLQ-----NNYPALKLGKA-ADIRNGEFVIAMGSPLTLN------NTNTFGIISNKQRSSE--TLG---LN   93 (174)
Q Consensus        31 ~~~DlAllkv~~-----~~~~~~~l~~~-~~~~~G~~v~~~G~p~g~~------~~~~~G~vs~~~~~~~--~~~---~~   93 (174)
                      ...|+|||+++.     ..+.|+.|... ..+..++.+.+.|++....      .......+........  ...   ..
T Consensus        87 ~~~DiAll~L~~~i~~~~~~~pi~l~~~~~~~~~~~~~~~~g~g~~~~~~~~~~~~~~~~~~~~~~~~~C~~~~~~~~~~  166 (229)
T smart00020       87 YDNDIALLKLKSPVTLSDNVRPICLPSSNYNVPAGTTCTVSGWGRTSEGAGSLPDTLQEVNVPIVSNATCRRAYSGGGAI  166 (229)
T ss_pred             CcCCEEEEEECcccCCCCceeeccCCCcccccCCCCEEEEEeCCCCCCCCCcCCCEeeEEEEEEeCHHHhhhhhcccccc
Confidence            467999999974     24667777553 3577899999999886542      1111222211111000  000   00


Q ss_pred             ccccEEE----EeeecCCCCccceEEcCCC--cEEEEEeeecC
Q psy2771          94 KTINYIQ----TDAAITFGNSGGPLVNLDG--EVIGINSMKVT  130 (174)
Q Consensus        94 ~~~~~~~----~~~~~~~G~SGGPl~n~~G--~liGI~~~~~~  130 (174)
                      ....+-.    .....++|+|||||+...+  .|+||++.+..
T Consensus       167 ~~~~~C~~~~~~~~~~c~gdsG~pl~~~~~~~~l~Gi~s~g~~  209 (229)
T smart00020      167 TDNMLCAGGLEGGKDACQGDSGGPLVCNDGRWVLVGIVSWGSG  209 (229)
T ss_pred             CCCcEeecCCCCCCcccCCCCCCeeEEECCCEEEEEEEEECCC
Confidence            0000000    1456789999999995443  89999999763


No 16 
>PF10459 Peptidase_S46:  Peptidase S46;  InterPro: IPR019500 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base. The geometric orientations of the catalytic residues are similar between families, despite different protein folds. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC). This entry represents S46 peptidases, where dipeptidyl-peptidase 7 (DPP-7) is the best-characterised member of this family. It is a serine peptidase that is located on the cell surface and is predicted to have two N-terminal transmembrane domains. 
Probab=97.58  E-value=0.00013  Score=65.79  Aligned_cols=57  Identities=25%  Similarity=0.302  Sum_probs=45.4

Q ss_pred             ccccEEEEeeecCCCCccceEEcCCCcEEEEEeeecC------------CCeEEEEeHHHHHHHHHHHH
Q psy2771          94 KTINYIQTDAAITFGNSGGPLVNLDGEVIGINSMKVT------------AGISFAIPIDYAIEFLTNYK  150 (174)
Q Consensus        94 ~~~~~~~~~~~~~~G~SGGPl~n~~G~liGI~~~~~~------------~~~~~aiPi~~i~~~l~~l~  150 (174)
                      ...-.+.++..++.||||+||+|.+|+|||+++-+.-            ...+.+|-+..+..+++.+-
T Consensus       619 ~~pv~FlstnDitGGNSGSPvlN~~GeLVGl~FDgn~Esl~~D~~fdp~~~R~I~VDiRyvL~~ldkv~  687 (698)
T PF10459_consen  619 SVPVNFLSTNDITGGNSGSPVLNAKGELVGLAFDGNWESLSGDIAFDPELNRTIHVDIRYVLWALDKVY  687 (698)
T ss_pred             CeeeEEEeccCcCCCCCCCccCCCCceEEEEeecCchhhcccccccccccceeEEEEHHHHHHHHHHHh
Confidence            3455678999999999999999999999999996531            23577788888888877653


No 17 
>PF05580 Peptidase_S55:  SpoIVB peptidase S55;  InterPro: IPR008763 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base. The geometric orientations of the catalytic residues are similar between families, despite different protein folds. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC). This group of serine peptidases belongs to the MEROPS peptidase family S55 (SpoIVB peptidase family, clan PA(S)). The protein SpoIVB plays a key role in signalling in the final sigma-K checkpoint of Bacillus subtilis.
Probab=97.21  E-value=0.00075  Score=52.23  Aligned_cols=46  Identities=24%  Similarity=0.504  Sum_probs=35.7

Q ss_pred             cEEEEeeecCCCCccceEEcCCCcEEEEEeeec--CCCeEEEEeHHHHH
Q psy2771          97 NYIQTDAAITFGNSGGPLVNLDGEVIGINSMKV--TAGISFAIPIDYAI  143 (174)
Q Consensus        97 ~~~~~~~~~~~G~SGGPl~n~~G~liGI~~~~~--~~~~~~aiPi~~i~  143 (174)
                      .++..+..+.+||||+|++ .+|+|||-++...  .+..+|.+++++..
T Consensus       169 ~Ll~~TGGIvqGMSGSPI~-qdGKLiGAVthvf~~dp~~Gygi~ie~ML  216 (218)
T PF05580_consen  169 RLLEKTGGIVQGMSGSPII-QDGKLIGAVTHVFVNDPTKGYGIFIEWML  216 (218)
T ss_pred             chhhhhCCEEecccCCCEE-ECCEEEEEEEEEEecCCCceeeecHHHHh
Confidence            3444445677999999999 9999999999765  34579999987653


No 18 
>PF08192 Peptidase_S64:  Peptidase family S64;  InterPro: IPR012985 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base. The geometric orientations of the catalytic residues are similar between families, despite different protein folds. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC). This family of fungal proteins is involved in the processing of the membrane-bound transcription factor Stp1 and belongs to MEROPS peptidase family S64 (clan PA). The processing causes the signalling domain of Stp1 to be passed to the nucleus where several permease genes are induced. The permeases are important for uptake of amino acids, and processing of Stp1 only occurs in an amino acid-rich environment. This family is predicted to be distantly related to the trypsin family (MEROPS peptidase family S1) and to have a typical trypsin-like catalytic triad.
Probab=97.18  E-value=0.0043  Score=55.11  Aligned_cols=117  Identities=15%  Similarity=0.178  Sum_probs=71.8

Q ss_pred             eCCCcEEEEEEcCC---------CC------CceeecCC------CCCCCCCEEEEEecCCCCCCceeecEEeeeccCcc
Q psy2771          30 YFEKHIILFHCLQN---------NY------PALKLGKA------ADIRNGEFVIAMGSPLTLNNTNTFGIISNKQRSSE   88 (174)
Q Consensus        30 d~~~DlAllkv~~~---------~~------~~~~l~~~------~~~~~G~~v~~~G~p~g~~~~~~~G~vs~~~~~~~   88 (174)
                      ..-.|+||||++..         ++      |.+.+.+.      ..+++|..|+-+|...+    .+.|.+.+..-..-
T Consensus       540 ~~LsD~AIIkV~~~~~~~N~LGddi~f~~~dP~l~f~NlyV~~~~~~~~~G~~VfK~GrTTg----yT~G~lNg~klvyw  615 (695)
T PF08192_consen  540 KRLSDWAIIKVNKERKCQNYLGDDIQFNEPDPTLMFQNLYVREVVSNLVPGMEVFKVGRTTG----YTTGILNGIKLVYW  615 (695)
T ss_pred             ccccceEEEEeCCCceecCCCCccccccCCCccccccccchhhhhhccCCCCeEEEecccCC----ccceEecceEEEEe
Confidence            33459999999741         22      22333221      34678999999988766    44565555432110


Q ss_pred             ccCccccccEEEEe----eecCCCCccceEEcCCCc------EEEEEeeecC--CCeEEEEeHHHHHHHHHHHH
Q psy2771          89 TLGLNKTINYIQTD----AAITFGNSGGPLVNLDGE------VIGINSMKVT--AGISFAIPIDYAIEFLTNYK  150 (174)
Q Consensus        89 ~~~~~~~~~~~~~~----~~~~~G~SGGPl~n~~G~------liGI~~~~~~--~~~~~aiPi~~i~~~l~~l~  150 (174)
                      ..+.-...+++...    .=...|+||+=|++.-++      |+||.+..-.  ..++++.|+.+|.+-|++.-
T Consensus       616 ~dG~i~s~efvV~s~~~~~Fa~~GDSGS~VLtk~~d~~~gLgvvGMlhsydge~kqfglftPi~~il~rl~~vT  689 (695)
T PF08192_consen  616 ADGKIQSSEFVVSSDNNPAFASGGDSGSWVLTKLEDNNKGLGVVGMLHSYDGEQKQFGLFTPINEILDRLEEVT  689 (695)
T ss_pred             cCCCeEEEEEEEecCCCccccCCCCcccEEEecccccccCceeeEEeeecCCccceeeccCcHHHHHHHHHHhh
Confidence            00100112333333    223489999999976444      9999998653  35889999999888777754


No 19 
>PF00949 Peptidase_S7:  Peptidase S7, Flavivirus NS3 serine protease ;  InterPro: IPR001850 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. 
They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This signature identifies serine peptidases belonging to MEROPS peptidase family S7 (flavivirin family, clan PA(S)). The protein fold of the peptidase domain for members of this family resembles that of chymotrypsin, the type example for clan PA.  Flaviviruses produce a polyprotein from the ssRNA genome. The N terminus of the NS3 protein (approx. 180 aa) is required for the processing of the polyprotein. NS3 also has conserved homology with NTP-binding proteins and the DEAD family of RNA helicases [, , ].; GO: 0003723 RNA binding, 0003724 RNA helicase activity, 0005524 ATP binding; PDB: 2IJO_B 3E90_D 2GGV_B 2FP7_B 2WV9_A 3U1I_B 3U1J_B 2WZQ_A 2WHX_A 3L6P_A ....
Probab=96.95  E-value=0.0011  Score=47.79  Aligned_cols=34  Identities=29%  Similarity=0.509  Sum_probs=25.0

Q ss_pred             cEEEEeeecCCCCccceEEcCCCcEEEEEeeecC
Q psy2771          97 NYIQTDAAITFGNSGGPLVNLDGEVIGINSMKVT  130 (174)
Q Consensus        97 ~~~~~~~~~~~G~SGGPl~n~~G~liGI~~~~~~  130 (174)
                      .+...+....+|.||+|+||.+|++|||...+..
T Consensus        86 ~~~~~~~d~~~GsSGSpi~n~~g~ivGlYg~g~~  119 (132)
T PF00949_consen   86 GIGAIDLDFPKGSSGSPIFNQNGEIVGLYGNGVE  119 (132)
T ss_dssp             EEEEE---S-TTGTT-EEEETTSCEEEEEEEEEE
T ss_pred             eEEeeecccCCCCCCCceEcCCCcEEEEEcccee
Confidence            4556677788999999999999999999987763


No 20 
>KOG3627|consensus
Probab=96.92  E-value=0.02  Score=44.99  Aligned_cols=117  Identities=19%  Similarity=0.185  Sum_probs=64.4

Q ss_pred             CcEEEEEEcC-----CCCCceeecCCCC---CCCCCEEEEEecCCCC------CCceeecEEeeecc--CccccCcc-cc
Q psy2771          33 KHIILFHCLQ-----NNYPALKLGKAAD---IRNGEFVIAMGSPLTL------NNTNTFGIISNKQR--SSETLGLN-KT   95 (174)
Q Consensus        33 ~DlAllkv~~-----~~~~~~~l~~~~~---~~~G~~v~~~G~p~g~------~~~~~~G~vs~~~~--~~~~~~~~-~~   95 (174)
                      +|||+|+++.     +.+.|+.|.....   ...++..++.|.+...      ...+....+.-...  ........ ..
T Consensus       106 nDiall~l~~~v~~~~~i~piclp~~~~~~~~~~~~~~~v~GWG~~~~~~~~~~~~L~~~~v~i~~~~~C~~~~~~~~~~  185 (256)
T KOG3627|consen  106 NDIALLRLSEPVTFSSHIQPICLPSSADPYFPPGGTTCLVSGWGRTESGGGPLPDTLQEVDVPIISNSECRRAYGGLGTI  185 (256)
T ss_pred             CCEEEEEECCCcccCCcccccCCCCCcccCCCCCCCEEEEEeCCCcCCCCCCCCceeEEEEEeEcChhHhcccccCcccc
Confidence            7999999984     3566777743332   3455888888865431      11222212221111  11111100 00


Q ss_pred             -ccEE-----EEeeecCCCCccceEEcCC---CcEEEEEeeecC-CC----eEEEEeHHHHHHHHHHH
Q psy2771          96 -INYI-----QTDAAITFGNSGGPLVNLD---GEVIGINSMKVT-AG----ISFAIPIDYAIEFLTNY  149 (174)
Q Consensus        96 -~~~~-----~~~~~~~~G~SGGPl~n~~---G~liGI~~~~~~-~~----~~~aiPi~~i~~~l~~l  149 (174)
                       ...+     .....+|.|+|||||+-.+   ..++||++++.. ++    .+....+....+++++.
T Consensus       186 ~~~~~Ca~~~~~~~~~C~GDSGGPLv~~~~~~~~~~GivS~G~~~C~~~~~P~vyt~V~~y~~WI~~~  253 (256)
T KOG3627|consen  186 TDTMLCAGGPEGGKDACQGDSGGPLVCEDNGRWVLVGIVSWGSGGCGQPNYPGVYTRVSSYLDWIKEN  253 (256)
T ss_pred             CCCEEeeCccCCCCccccCCCCCeEEEeeCCcEEEEEEEEecCCCCCCCCCCeEEeEhHHhHHHHHHH
Confidence             0111     1123468999999999554   699999999864 21    35566666666666554


No 21 
>PF00944 Peptidase_S3:  Alphavirus core protein ;  InterPro: IPR000930 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. Togavirin, also known as Sindbis virus core endopeptidase, is a serine protease resident at the N terminus of the p130 polyprotein of togaviruses []. The endopeptidase signature identifies the peptidase as belonging to the MEROPS peptidase family S3 (togavirin family, clan PA(S)). The polyprotein also includes structural proteins for the nucleocapsid core and for the glycoprotein spikes []. Togavirin is only active while part of the polyprotein, cleavage at a Trp-Ser bond resulting in total lack of activity []. Mutagenesis studies have identified the location of the His-Asp-Ser catalytic triad, and X-ray studies have revealed the protein fold to be similar to that of chymotrypsin [, ].; GO: 0004252 serine-type endopeptidase activity, 0006508 proteolysis, 0016020 membrane; PDB: 2YEW_D 1EP5_A 3J0C_F 1EP6_C 1WYK_D 1DYL_A 1VCQ_B 1VCP_B 1LD4_D 1KXA_A ....
Probab=96.56  E-value=0.004  Score=44.76  Aligned_cols=41  Identities=22%  Similarity=0.318  Sum_probs=31.1

Q ss_pred             cEEEEeeecCCCCccceEEcCCCcEEEEEeeecCCCeEEEE
Q psy2771          97 NYIQTDAAITFGNSGGPLVNLDGEVIGINSMKVTAGISFAI  137 (174)
Q Consensus        97 ~~~~~~~~~~~G~SGGPl~n~~G~liGI~~~~~~~~~~~ai  137 (174)
                      .+..-+..-.+|+||-|++|.+|+||||+..+.+++...++
T Consensus        95 rftip~g~g~~GDSGRpi~DNsGrVVaIVLGG~neG~RTaL  135 (158)
T PF00944_consen   95 RFTIPTGVGKPGDSGRPIFDNSGRVVAIVLGGANEGRRTAL  135 (158)
T ss_dssp             EEEEETTS-STTSTTEEEESTTSBEEEEEEEEEEETTEEEE
T ss_pred             eEEeccCCCCCCCCCCccCcCCCCEEEEEecCCCCCCceEE
Confidence            33344555679999999999999999999998876654444


No 22 
>PF03761 DUF316:  Domain of unknown function (DUF316) ;  InterPro: IPR005514 This is a family of uncharacterised proteins from Caenorhabditis elegans.
Probab=96.38  E-value=0.087  Score=42.28  Aligned_cols=105  Identities=18%  Similarity=0.177  Sum_probs=67.1

Q ss_pred             CCCcEEEEEEcCC---CCCceeecCCC-CCCCCCEEEEEecCCCCCCceeecEEeeeccCccccCccccccEEEEeeecC
Q psy2771          31 FEKHIILFHCLQN---NYPALKLGKAA-DIRNGEFVIAMGSPLTLNNTNTFGIISNKQRSSETLGLNKTINYIQTDAAIT  106 (174)
Q Consensus        31 ~~~DlAllkv~~~---~~~~~~l~~~~-~~~~G~~v~~~G~p~g~~~~~~~G~vs~~~~~~~~~~~~~~~~~~~~~~~~~  106 (174)
                      ...++.||.++.+   ...|+.|+++. .+..++.+.+.|+...  ..+....+.-....       .....+......+
T Consensus       159 ~~~~~mIlEl~~~~~~~~~~~Cl~~~~~~~~~~~~~~~yg~~~~--~~~~~~~~~i~~~~-------~~~~~~~~~~~~~  229 (282)
T PF03761_consen  159 RPYSPMILELEEDFSKNVSPPCLADSSTNWEKGDEVDVYGFNST--GKLKHRKLKITNCT-------KCAYSICTKQYSC  229 (282)
T ss_pred             cccceEEEEEcccccccCCCEEeCCCccccccCceEEEeecCCC--CeEEEEEEEEEEee-------ccceeEecccccC
Confidence            4568999999854   78899998753 3678999999988211  11222222111110       1223445556778


Q ss_pred             CCCccceEE---cCCCcEEEEEeeecCC---CeEEEEeHHHHHH
Q psy2771         107 FGNSGGPLV---NLDGEVIGINSMKVTA---GISFAIPIDYAIE  144 (174)
Q Consensus       107 ~G~SGGPl~---n~~G~liGI~~~~~~~---~~~~aiPi~~i~~  144 (174)
                      .|++|||++   |.+-.||||.+.....   +..+++.+..+++
T Consensus       230 ~~d~Gg~lv~~~~gr~tlIGv~~~~~~~~~~~~~~f~~v~~~~~  273 (282)
T PF03761_consen  230 KGDRGGPLVKNINGRWTLIGVGASGNYECNKNNSYFFNVSWYQD  273 (282)
T ss_pred             CCCccCeEEEEECCCEEEEEEEccCCCcccccccEEEEHHHhhh
Confidence            999999999   3344699999877632   2577777776654


No 23 
>PF00548 Peptidase_C3:  3C cysteine protease (picornain 3C);  InterPro: IPR000199 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. 
Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad [].  This signature defines cysteine peptidases belonging to MEROPS peptidase family C3 (picornain, clan PA(C)), subfamilies C3A and C3B. The protein fold of this peptidase domain for members of this family resembles that of the serine peptidase, chymotrypsin [], the type example for clan PA. Picornaviral proteins are expressed as a single polyprotein which is cleaved by the viral C3 cysteine protease. The poliovirus polyprotein is selectively cleaved at the Gln-|-Gly bond. In other picornavirus reactions Glu may be substituted for Gln, and Ser or Thr for Gly. ; GO: 0004197 cysteine-type endopeptidase activity, 0006508 proteolysis; PDB: 3SJO_E 2H6M_A 1QA7_C 1HAV_B 2HAL_A 2H9H_A 3QZQ_B 3QZR_A 3R0F_B 3SJ9_A ....
Probab=96.31  E-value=0.075  Score=39.97  Aligned_cols=108  Identities=17%  Similarity=0.195  Sum_probs=60.8

Q ss_pred             ecCcEEEEeee--EeeC---CCcEEEEEEcC-CCCCcee--ecCCCCCCCCCEEEEEecCCCCCC-ceeecEEeeeccCc
Q psy2771          17 SFNSLLTLPNI--AYYF---EKHIILFHCLQ-NNYPALK--LGKAADIRNGEFVIAMGSPLTLNN-TNTFGIISNKQRSS   87 (174)
Q Consensus        17 ~~~~~~~a~~v--~~d~---~~DlAllkv~~-~~~~~~~--l~~~~~~~~G~~v~~~G~p~g~~~-~~~~G~vs~~~~~~   87 (174)
                      .++..+.....  ..+.   ..|+++++++. ..++-++  +.+. ..+..+...++=++ .... ....+.+...... 
T Consensus        51 i~g~~~~~~d~~~lv~~~~~~~Dl~~v~l~~~~kfrDIrk~~~~~-~~~~~~~~l~v~~~-~~~~~~~~v~~v~~~~~i-  127 (172)
T PF00548_consen   51 IDGVEYKVDDSVVLVDRDGVDTDLTLVKLPRNPKFRDIRKFFPES-IPEYPECVLLVNST-KFPRMIVEVGFVTNFGFI-  127 (172)
T ss_dssp             ETTEEEEEEEEEEEEETTSSEEEEEEEEEESSS-B--GGGGSBSS-GGTEEEEEEEEESS-SSTCEEEEEEEEEEEEEE-
T ss_pred             ECCEEEEeeeeEEEecCCCcceeEEEEEccCCcccCchhhhhccc-cccCCCcEEEEECC-CCccEEEEEEEEeecCcc-
Confidence            45665555442  2333   45999999963 2222111  1121 12444555555333 3333 3344444433322 


Q ss_pred             cccCccccccEEEEeeecCCCCccceEEcC---CCcEEEEEeee
Q psy2771          88 ETLGLNKTINYIQTDAAITFGNSGGPLVNL---DGEVIGINSMK  128 (174)
Q Consensus        88 ~~~~~~~~~~~~~~~~~~~~G~SGGPl~n~---~G~liGI~~~~  128 (174)
                       ..........+.+++++.+|+-||||+..   .++++||+.++
T Consensus       128 -~~~g~~~~~~~~Y~~~t~~G~CG~~l~~~~~~~~~i~GiHvaG  170 (172)
T PF00548_consen  128 -NLSGTTTPRSLKYKAPTKPGMCGSPLVSRIGGQGKIIGIHVAG  170 (172)
T ss_dssp             -EETTEEEEEEEEEESEEETTGTTEEEEESCGGTTEEEEEEEEE
T ss_pred             -ccCCCEeeEEEEEccCCCCCccCCeEEEeeccCccEEEEEecc
Confidence             11113456788999999999999999942   57899999986


No 24 
>TIGR02860 spore_IV_B stage IV sporulation protein B. SpoIVB, the stage IV sporulation protein B of endospore-forming bacteria such as Bacillus subtilis, is a serine proteinase, expressed in the spore (rather than mother cell) compartment, that participates in a proteolytic activation cascade for Sigma-K. It appears to be universal among endospore-forming bacteria and occurs nowhere else.
Probab=96.09  E-value=0.012  Score=50.01  Aligned_cols=46  Identities=22%  Similarity=0.505  Sum_probs=34.7

Q ss_pred             EEEEeeecCCCCccceEEcCCCcEEEEEeeec--CCCeEEEEeHHHHHH
Q psy2771          98 YIQTDAAITFGNSGGPLVNLDGEVIGINSMKV--TAGISFAIPIDYAIE  144 (174)
Q Consensus        98 ~~~~~~~~~~G~SGGPl~n~~G~liGI~~~~~--~~~~~~aiPi~~i~~  144 (174)
                      ++..+..+.+||||+|++ .+|+|||-++=..  .+..+|.|-+++..+
T Consensus       350 ll~~tgGivqGMSGSPi~-q~gkliGAvtHVfvndpt~GYGi~ie~Ml~  397 (402)
T TIGR02860       350 LLEKTGGIVQGMSGSPII-QNGKVIGAVTHVFVNDPTSGYGVYIEWMLK  397 (402)
T ss_pred             HhhHhCCEEecccCCCEE-ECCEEEEEEEEEEecCCCcceeehHHHHHH
Confidence            333445677999999999 9999999887443  456799997776644


No 25 
>PF05579 Peptidase_S32:  Equine arteritis virus serine endopeptidase S32;  InterPro: IPR008760 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. 
They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This group of serine peptidases belongs to MEROPS peptidase family S32 (clan PA(S)). The type example is equine arteritis virus serine endopeptidase (equine arteritis virus), which is involved in processing of nidovirus polyproteins [].; GO: 0004252 serine-type endopeptidase activity, 0016032 viral reproduction, 0019082 viral protein processing; PDB: 3FAN_A 3FAO_A 1MBM_A.
Probab=95.85  E-value=0.01  Score=47.48  Aligned_cols=28  Identities=36%  Similarity=0.663  Sum_probs=21.7

Q ss_pred             cCCCCccceEEcCCCcEEEEEeeecCCC
Q psy2771         105 ITFGNSGGPLVNLDGEVIGINSMKVTAG  132 (174)
Q Consensus       105 ~~~G~SGGPl~n~~G~liGI~~~~~~~~  132 (174)
                      +.+|+||+|++..+|.+|||++.....+
T Consensus       205 T~~GDSGSPVVt~dg~liGVHTGSn~~G  232 (297)
T PF05579_consen  205 TGPGDSGSPVVTEDGDLIGVHTGSNKRG  232 (297)
T ss_dssp             S-GGCTT-EEEETTC-EEEEEEEEETTT
T ss_pred             cCCCCCCCccCcCCCCEEEEEecCCCcC
Confidence            4589999999999999999999876543


No 26 
>KOG1421|consensus
Probab=95.85  E-value=0.18  Score=45.49  Aligned_cols=150  Identities=15%  Similarity=0.083  Sum_probs=97.4

Q ss_pred             eEEEEEeecCcEEEEeeeEeeCCCcEEEEEEcCCCCCceeecCCCCCCCCCEEEEEecCCCCCC-----ceeecEEeeec
Q psy2771          10 DICLSTFSFNSLLTLPNIAYYFEKHIILFHCLQNNYPALKLGKAADIRNGEFVIAMGSPLTLNN-----TNTFGIISNKQ   84 (174)
Q Consensus        10 ~~~~~~~~~~~~~~a~~v~~d~~~DlAllkv~~~~~~~~~l~~~~~~~~G~~v~~~G~p~g~~~-----~~~~G~vs~~~   84 (174)
                      |+.+.. .|-..+.|.+...++...+|.+|-+++-...++|.+ ..+..||++...|+-.....     +++.-.+....
T Consensus       577 d~~vt~-~dS~~i~a~~~fL~~t~n~a~~kydp~~~~~~kl~~-~~v~~gD~~~f~g~~~~~r~ltaktsv~dvs~~~~p  654 (955)
T KOG1421|consen  577 DQRVTE-ADSDGIPANVSFLHPTENVASFKYDPALEVQLKLTD-TTVLRGDECTFEGFTEDLRALTAKTSVTDVSVVIIP  654 (955)
T ss_pred             ceEEee-cccccccceeeEecCccceeEeccChhHhhhhccce-eeEecCCceeEecccccchhhcccceeeeeEEEEec
Confidence            334433 666778888888999999999999977666777755 45899999999998865432     22221111111


Q ss_pred             cC-ccccCccccccEEEEeeecCCCCccceEEcCCCcEEEEEeeecCC-------CeEEEEeHHHHHHHHHHHHhCCeee
Q psy2771          85 RS-SETLGLNKTINYIQTDAAITFGNSGGPLVNLDGEVIGINSMKVTA-------GISFAIPIDYAIEFLTNYKRKGKFC  156 (174)
Q Consensus        85 ~~-~~~~~~~~~~~~~~~~~~~~~G~SGGPl~n~~G~liGI~~~~~~~-------~~~~aiPi~~i~~~l~~l~~~g~~~  156 (174)
                      +. ...+. ......+...+.+.-++--|-+.|.+|+++++=-....+       -..|.+.+..+++.+++|+..++..
T Consensus       655 s~~~pr~r-~~n~e~Is~~~nlsT~c~sg~ltdddg~vvalwl~~~ge~~~~kd~~y~~gl~~~~~l~vl~rlk~g~~~r  733 (955)
T KOG1421|consen  655 SSVMPRFR-ATNLEVISFMDNLSTSCLSGRLTDDDGEVVALWLSVVGEDVGGKDYTYKYGLSMSYILPVLERLKLGPSAR  733 (955)
T ss_pred             CCCCccee-ecceEEEEEeccccccccceEEECCCCeEEEEEeeeeccccCCceeEEEeccchHHHHHHHHHHhcCCCCC
Confidence            11 00111 223455555545445555568899999999986544321       2578899999999999999988765


Q ss_pred             eeecCc
Q psy2771         157 AYSKGK  162 (174)
Q Consensus       157 ~~~lg~  162 (174)
                      .--+|+
T Consensus       734 p~i~~v  739 (955)
T KOG1421|consen  734 PTIAGV  739 (955)
T ss_pred             ceeecc
Confidence            333343


No 27 
>PF00947 Pico_P2A:  Picornavirus core protein 2A;  InterPro: IPR000081 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. 
Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad [].  This domain defines cysteine peptidases belonging to MEROPS peptidase family C3 (picornain, clan PA(C)), subfamilies C3A and C3B. The protein fold of this peptidase domain for members of this family resembles that of the serine peptidase, chymotrypsin [], the type example for clan PA. Picornaviral proteins are expressed as a single polyprotein which is cleaved by the viral 3C cysteine protease []. The poliovirus polyprotein is selectively cleaved at the Gln-|-Gly bond. In other picornavirus reactions Glu may be substituted for Gln, and Ser or Thr for Gly. ; GO: 0008233 peptidase activity, 0006508 proteolysis, 0016032 viral reproduction; PDB: 2HRV_B 1Z8R_A.
Probab=93.19  E-value=0.25  Score=35.21  Aligned_cols=40  Identities=28%  Similarity=0.375  Sum_probs=29.7

Q ss_pred             ccEEEEeeecCCCCccceEEcCCCcEEEEEeeecCCCeEEE
Q psy2771          96 INYIQTDAAITFGNSGGPLVNLDGEVIGINSMKVTAGISFA  136 (174)
Q Consensus        96 ~~~~~~~~~~~~G~SGGPl~n~~G~liGI~~~~~~~~~~~a  136 (174)
                      ..++.-..+..||+-||+|+ -+--||||++++...-..|+
T Consensus        78 ~~~l~g~Gp~~PGdCGg~L~-C~HGViGi~Tagg~g~VaF~  117 (127)
T PF00947_consen   78 YNLLIGEGPAEPGDCGGILR-CKHGVIGIVTAGGEGHVAFA  117 (127)
T ss_dssp             ECEEEEE-SSSTT-TCSEEE-ETTCEEEEEEEEETTEEEEE
T ss_pred             cCceeecccCCCCCCCceeE-eCCCeEEEEEeCCCceEEEE
Confidence            35666778899999999999 66679999999876544444


No 28 
>PF02907 Peptidase_S29:  Hepatitis C virus NS3 protease;  InterPro: IPR004109 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This signature identifies the Hepatitis C virus NS3 protein as a serine protease which belongs to MEROPS peptidase family S29 (hepacivirin family, clan PA(S)), which has a trypsin-like fold. The non-structural (NS) protein NS3 is one of the NS proteins involved in replication of the HCV genome. The NS2 proteinase (IPR002518 from INTERPRO), a zinc-dependent enzyme, performs a single proteolytic cut to release the N terminus of NS3. The action of NS3 proteinase (NS3P), which resides in the N-terminal one-third of the NS3 protein, then yields all remaining non-structural proteins. The C-terminal two-thirds of the NS3 protein contain a helicase. The functional relationship between the proteinase and helicase domains is unknown. NS3 has a structural zinc-binding site and requires cofactor NS4. 
It has been suggested that the NS3 serine protease of hepatitus C is involved in cell transformation and that the ability to transform requires an active enzyme [].; GO: 0008236 serine-type peptidase activity, 0006508 proteolysis, 0019087 transformation of host cell by virus; PDB: 2QV1_B 3LOX_C 2OBQ_C 2OC1_C 2OC0_A 3LON_A 3KNX_A 2O8M_A 2OBO_A 2OC8_A ....
Probab=92.09  E-value=0.34  Score=34.97  Aligned_cols=38  Identities=32%  Similarity=0.606  Sum_probs=27.1

Q ss_pred             ecCCCCccceEEcCCCcEEEEEeeecC-CC----eEEEEeHHHH
Q psy2771         104 AITFGNSGGPLVNLDGEVIGINSMKVT-AG----ISFAIPIDYA  142 (174)
Q Consensus       104 ~~~~G~SGGPl~n~~G~liGI~~~~~~-~~----~~~aiPi~~i  142 (174)
                      ....|.||||++-.+|-+|||..+... .+    +.|. |++.+
T Consensus       104 s~lkGSSGgPiLC~~GH~vG~f~aa~~trgvak~i~f~-P~e~l  146 (148)
T PF02907_consen  104 SDLKGSSGGPILCPSGHAVGMFRAAVCTRGVAKAIDFI-PVETL  146 (148)
T ss_dssp             HHHTT-TT-EEEETTSEEEEEEEEEEEETTEEEEEEEE-EHHHH
T ss_pred             EEEecCCCCcccCCCCCEEEEEEEEEEcCCceeeEEEE-eeeec
Confidence            345899999999999999999887663 22    4444 88765


No 29 
>PF01732 DUF31:  Putative peptidase (DUF31);  InterPro: IPR022382  This domain has no known function. It is found in various hypothetical proteins and putative lipoproteins from mycoplasmas. 
Probab=90.67  E-value=0.2  Score=42.20  Aligned_cols=25  Identities=28%  Similarity=0.472  Sum_probs=21.8

Q ss_pred             eeecCCCCccceEEcCCCcEEEEEe
Q psy2771         102 DAAITFGNSGGPLVNLDGEVIGINS  126 (174)
Q Consensus       102 ~~~~~~G~SGGPl~n~~G~liGI~~  126 (174)
                      ...+..|.||+.|+|.+|++|||..
T Consensus       349 ~~~l~gGaSGS~V~n~~~~lvGIy~  373 (374)
T PF01732_consen  349 NYSLGGGASGSMVINQNNELVGIYF  373 (374)
T ss_pred             ccCCCCCCCcCeEECCCCCEEEEeC
Confidence            3456699999999999999999975


No 30 
>COG5640 Secreted trypsin-like serine protease [Posttranslational modification, protein turnover, chaperones]
Probab=85.68  E-value=2  Score=36.10  Aligned_cols=49  Identities=18%  Similarity=0.310  Sum_probs=34.8

Q ss_pred             eecCCCCccceEEcC--CC-cEEEEEeeecC-CC----eEEEEeHHHHHHHHHHHHh
Q psy2771         103 AAITFGNSGGPLVNL--DG-EVIGINSMKVT-AG----ISFAIPIDYAIEFLTNYKR  151 (174)
Q Consensus       103 ~~~~~G~SGGPl~n~--~G-~liGI~~~~~~-~~----~~~aiPi~~i~~~l~~l~~  151 (174)
                      ...|.|+||||+|=.  +| .-+||++|+.. ++    .....-++....++.+..+
T Consensus       223 ~daCqGDSGGPi~~~g~~G~vQ~GVvSwG~~~Cg~t~~~gVyT~vsny~~WI~a~~~  279 (413)
T COG5640         223 KDACQGDSGGPIFHKGEEGRVQRGVVSWGDGGCGGTLIPGVYTNVSNYQDWIAAMTN  279 (413)
T ss_pred             cccccCCCCCceEEeCCCccEEEeEEEecCCCCCCCCcceeEEehhHHHHHHHHHhc
Confidence            567899999999922  23 48999999975 21    2445557777888777544


No 31 
>PF02122 Peptidase_S39:  Peptidase S39;  InterPro: IPR000382 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified and grouped into clans on the basis of structural similarity and other functional evidence. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base. The geometric orientations of the catalytic residues are similar between families, despite different protein folds. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC). ORF2 of Potato leafroll virus (PLrV) encodes a polyprotein which is translated following a -1 frameshift. The polyprotein has a putative linear arrangement of membrane anchor-VPg-peptidase-polymerase domains. The serine peptidase domain which is found in this group of sequences belongs to MEROPS peptidase family S39 (clan PA(S)). It is likely that the peptidase domain is involved in the cleavage of the polyprotein. The nucleotide sequence for the RNA of PLrV has been determined. The sequence contains six large open reading frames (ORFs). The 5' coding region encodes two polypeptides of 28K and 70K, which overlap in different reading frames; it is suggested that the third ORF in the 5' block is translated by frameshift readthrough near the end of the 70K protein, yielding a 118K polypeptide. 
Segments of the predicted amino acid sequences of these ORFs resemble those of known viral RNA polymerases, ATP-binding proteins and viral genome-linked proteins. The nucleotide sequence of the genomic RNA of Beet western yellows virus (BWYV) has been determined. The sequence contains six long ORFs. A cluster of three of these ORFs, including the coat protein cistron, displays extensive amino acid sequence similarity to corresponding ORFs of a second luteovirus: Barley yellow dwarf virus.; GO: 0004252 serine-type endopeptidase activity, 0022415 viral reproductive process, 0016021 integral to membrane; PDB: 1ZYO_A.
Probab=85.39  E-value=1.2  Score=34.39  Aligned_cols=45  Identities=18%  Similarity=0.206  Sum_probs=18.9

Q ss_pred             cEEEEeeecCCCCccceEEcCCCcEEEEEeeec----CCCeEEEEeHHHH
Q psy2771          97 NYIQTDAAITFGNSGGPLVNLDGEVIGINSMKV----TAGISFAIPIDYA  142 (174)
Q Consensus        97 ~~~~~~~~~~~G~SGGPl~n~~G~liGI~~~~~----~~~~~~aiPi~~i  142 (174)
                      .+...-+.+.+|.||.|+++.+ +++|++....    ..+.++--|+.-+
T Consensus       136 ~~~~vls~T~~G~SGtp~y~g~-~vvGvH~G~~~~~~~~n~n~~spip~~  184 (203)
T PF02122_consen  136 KFASVLSNTSPGWSGTPYYSGK-NVVGVHTGSPSGSNRENNNRMSPIPPI  184 (203)
T ss_dssp             TEEEE-----TT-TT-EEE-SS--EEEEEEEE------------------
T ss_pred             cCCceEcCCCCCCCCCCeEECC-CceEeecCccccccccccccccccccc
Confidence            4667778899999999999877 9999999842    2466776666544


No 32 
>PF12381 Peptidase_C3G:  Tungro spherical virus-type peptidase;  InterPro: IPR024387 This entry represents a rice tungro spherical waikavirus-type peptidase that belongs to MEROPS peptidase family C3G. It is a picornain 3C-type protease, and is responsible for the self-cleavage of the positive single-stranded polyproteins of a number of plant viral genomes. The location of the protease activity of the polyprotein is at the C-terminal end, adjacent and N-terminal to the putative RNA polymerase [, ].
Probab=79.17  E-value=3.2  Score=32.42  Aligned_cols=53  Identities=17%  Similarity=0.371  Sum_probs=38.6

Q ss_pred             cEEEEeeecCCCCccceEE--cCC--CcEEEEEeeecC-CCeEEEEeH--HHHHHHHHHH
Q psy2771          97 NYIQTDAAITFGNSGGPLV--NLD--GEVIGINSMKVT-AGISFAIPI--DYAIEFLTNY  149 (174)
Q Consensus        97 ~~~~~~~~~~~G~SGGPl~--n~~--G~liGI~~~~~~-~~~~~aiPi--~~i~~~l~~l  149 (174)
                      .-+++..++..|+=|+|++  |.+  -+++||+.++.. .+.+||-++  +.+.+.+..+
T Consensus       169 ~gleY~~~t~~GdCGs~i~~~~t~~~RKIvGiHVAG~~~~~~gYAe~itQEDL~~A~~~l  228 (231)
T PF12381_consen  169 QGLEYQMPTMNGDCGSPIVRNNTQMVRKIVGIHVAGSANHAMGYAESITQEDLMRAINKL  228 (231)
T ss_pred             eeeeEECCCcCCCccceeeEcchhhhhhhheeeecccccccceehhhhhHHHHHHHHHhh
Confidence            4567888999999999999  222  589999999985 467787554  4444444444


No 33 
>PF00571 CBS:  CBS domain CBS domain web page. Mutations in the CBS domain of Swiss:P35520 lead to homocystinuria.;  InterPro: IPR000644 CBS (cystathionine-beta-synthase) domains are small intracellular modules, mostly found in two or four copies within a protein, that occur in a variety of proteins in bacteria, archaea, and eukaryotes. Tandem pairs of CBS domains can act as binding domains for adenosine derivatives and may regulate the activity of attached enzymatic or other domains. In some cases, CBS domains may act as sensors of cellular energy status by being activated by AMP and inhibited by ATP. In chloride ion channels, the CBS domains have been implicated in intracellular targeting and trafficking, as well as in protein-protein interactions, but results vary with different channels: in the CLC-5 channel, the CBS domain was shown to be required for trafficking, while in the CLC-1 channel, the CBS domain was shown to be critical for channel function, but not necessary for trafficking. Recent experiments revealing that CBS domains can bind adenosine-containing ligands such as ATP, AMP, or S-adenosylmethionine have led to the hypothesis that CBS domains function as sensors of intracellular metabolites. Crystallographic studies of CBS domains have shown that pairs of CBS sequences form a globular domain where each CBS unit adopts a beta-alpha-beta-beta-alpha pattern. The crystal structure of the CBS domains of the AMP-activated protein kinase in complexes with AMP and ATP shows that the phosphate groups of AMP/ATP lie in a surface pocket at the interface of two CBS domains, which is lined with basic residues, many of which are associated with disease-causing mutations.  
In humans, mutations in conserved residues within CBS domains cause a variety of hereditary diseases, including (with the gene mutated in parentheses): homocystinuria (cystathionine beta-synthase); Wolff-Parkinson-White syndrome (gamma 2 subunit of AMP-activated protein kinase); retinitis pigmentosa (IMP dehydrogenase-1); congenital myotonia, idiopathic generalized epilepsy, hypercalciuric nephrolithiasis, and classic Bartter syndrome (CLC chloride channel family members).; GO: 0005515 protein binding; PDB: 3JTF_A 3TE5_C 3TDH_C 3T4N_C 2QLV_C 3OI8_A 3LV9_A 2QH1_B 1PVM_B 3LQN_A ....
Probab=75.82  E-value=3.7  Score=23.98  Aligned_cols=21  Identities=43%  Similarity=0.563  Sum_probs=18.1

Q ss_pred             CCCccceEEcCCCcEEEEEee
Q psy2771         107 FGNSGGPLVNLDGEVIGINSM  127 (174)
Q Consensus       107 ~G~SGGPl~n~~G~liGI~~~  127 (174)
                      .+.+.-|++|.+|+++|+++.
T Consensus        28 ~~~~~~~V~d~~~~~~G~is~   48 (57)
T PF00571_consen   28 NGISRLPVVDEDGKLVGIISR   48 (57)
T ss_dssp             HTSSEEEEESTTSBEEEEEEH
T ss_pred             cCCcEEEEEecCCEEEEEEEH
Confidence            457778999999999999875


No 34 
>cd01735 LSm12_N LSm12 belongs to a family of Sm-like proteins that associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation.  Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet that associates with other Sm proteins to form hexameric and heptameric ring structures.   In addition to the N-terminal Sm-like domain, LSm12 has a novel methyltransferase domain.
Probab=70.96  E-value=16  Score=22.70  Aligned_cols=26  Identities=15%  Similarity=0.464  Sum_probs=23.5

Q ss_pred             ecCcEEEEeeeEeeCCCcEEEEEEcC
Q psy2771          17 SFNSLLTLPNIAYYFEKHIILFHCLQ   42 (174)
Q Consensus        17 ~~~~~~~a~~v~~d~~~DlAllkv~~   42 (174)
                      -+|..++.+++++|....+.+||-.+
T Consensus        14 c~g~~ieGEV~afD~~tk~lIlk~~s   39 (61)
T cd01735          14 CFEQRLQGEVVAFDYPSKMLILKCPS   39 (61)
T ss_pred             cCCceEEEEEEEecCCCcEEEEECcc
Confidence            67999999999999999999999553


No 35 
>PF10459 Peptidase_S46:  Peptidase S46;  InterPro: IPR019500 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified and grouped into clans on the basis of structural similarity and other functional evidence. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base. The geometric orientations of the catalytic residues are similar between families, despite different protein folds. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC). This entry represents S46 peptidases, of which dipeptidyl-peptidase 7 (DPP-7) is the best-characterised member. It is a serine peptidase that is located on the cell surface and is predicted to have two N-terminal transmembrane domains. 
Probab=70.86  E-value=6.8  Score=35.94  Aligned_cols=39  Identities=26%  Similarity=0.537  Sum_probs=26.9

Q ss_pred             CCcEEEEEEc-C----------CCCC-----ceeecCCCCCCCCCEEEEEecCCCC
Q psy2771          32 EKHIILFHCL-Q----------NNYP-----ALKLGKAADIRNGEFVIAMGSPLTL   71 (174)
Q Consensus        32 ~~DlAllkv~-~----------~~~~-----~~~l~~~~~~~~G~~v~~~G~p~g~   71 (174)
                      ..|++++|+= .          ++.|     .+++. ...++.||.|+++|||...
T Consensus       199 tgDfs~fRvY~~~dg~PA~Ys~dnvP~~p~~~l~is-~~G~keGD~vmv~GyPG~T  253 (698)
T PF10459_consen  199 TGDFSFFRVYADKDGKPADYSKDNVPYKPKHFLKIS-LKGVKEGDFVMVAGYPGRT  253 (698)
T ss_pred             CCceEEEEEEeCCCCCccccCcCCCCCCCccccccC-CCCCCCCCeEEEccCCCcc
Confidence            4499999992 1          1222     34443 3568999999999999653


No 36 
>PF14827 Cache_3:  Sensory domain of two-component sensor kinase; PDB: 1OJG_A 3BY8_A 1P0Z_I 2V9A_A 2J80_B.
Probab=68.11  E-value=5.4  Score=27.51  Aligned_cols=18  Identities=44%  Similarity=0.785  Sum_probs=13.6

Q ss_pred             ceEEcCCCcEEEEEeeec
Q psy2771         112 GPLVNLDGEVIGINSMKV  129 (174)
Q Consensus       112 GPl~n~~G~liGI~~~~~  129 (174)
                      .|++|.+|++||+++.+.
T Consensus        94 ~PV~d~~g~viG~V~VG~  111 (116)
T PF14827_consen   94 APVYDSDGKVIGVVSVGV  111 (116)
T ss_dssp             EEEE-TTS-EEEEEEEEE
T ss_pred             EeeECCCCcEEEEEEEEE
Confidence            488899999999998764


No 37 
>PF02743 Cache_1:  Cache domain;  InterPro: IPR004010 Cache is an extracellular domain that is predicted to have a role in small-molecule recognition in a wide range of proteins, including the animal dihydropyridine-sensitive voltage-gated Ca2+ channel; alpha-2delta subunit, and various bacterial chemotaxis receptors. The name Cache comes from CAlcium channels and CHEmotaxis receptors. This domain consists of an N-terminal part with three predicted strands and an alpha-helix, and a C-terminal part with a strand dyad followed by a relatively unstructured region. The N-terminal portion of the (unpermuted) Cache domain contains three predicted strands that could form a sheet analogous to that present in the core of the PAS domain structure. Cache domains are particularly widespread in bacteria, with Vibrio cholerae. The animal calcium channel alpha-2delta subunits might have acquired a part of their extracellular domains from a bacterial source []. The Cache domain appears to have arisen from the GAF-PAS fold despite their divergent functions [].; GO: 0016020 membrane; PDB: 3C8C_A 3LIB_D 3LIA_A 3LI8_A 3LI9_A.
Probab=62.36  E-value=12  Score=23.67  Aligned_cols=31  Identities=32%  Similarity=0.606  Sum_probs=24.1

Q ss_pred             ceEEcCCCcEEEEEeeecCCCeEEEEeHHHHHHHHHHHH
Q psy2771         112 GPLVNLDGEVIGINSMKVTAGISFAIPIDYAIEFLTNYK  150 (174)
Q Consensus       112 GPl~n~~G~liGI~~~~~~~~~~~aiPi~~i~~~l~~l~  150 (174)
                      -|+.+.+|+++|++..        .+..+.+.++++++.
T Consensus        19 ~pi~~~~g~~~Gvv~~--------di~l~~l~~~i~~~~   49 (81)
T PF02743_consen   19 VPIYDDDGKIIGVVGI--------DISLDQLSEIISNIK   49 (81)
T ss_dssp             EEEEETTTEEEEEEEE--------EEEHHHHHHHHTTSB
T ss_pred             EEEECCCCCEEEEEEE--------EeccceeeeEEEeeE
Confidence            4888889999999764        577778877777754


No 38 
>PF05578 Peptidase_S31:  Pestivirus NS3 polyprotein peptidase S31;  InterPro: IPR000280 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified and grouped into clans on the basis of structural similarity and other functional evidence. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base. The geometric orientations of the catalytic residues are similar between families, despite different protein folds. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC). This group of serine peptidases belongs to MEROPS peptidase family S31 (clan PA(S)). The type example is pestivirus NS3 polyprotein peptidase from bovine viral diarrhea virus, which is a type 1 pestivirus. The pestiviruses are single-stranded RNA viruses whose genomes encode one large polyprotein. The p80 endopeptidase resides towards the middle of the polyprotein and is responsible for processing all non-structural pestivirus proteins. The p80 enzyme is similar to other proteases in the PA(S) clan and is predicted to have a fold similar to that of chymotrypsin. An HDS catalytic triad has been identified.; GO: 0004252 serine-type endopeptidase activity, 0006508 proteolysis
Probab=61.42  E-value=28  Score=25.88  Aligned_cols=74  Identities=19%  Similarity=0.207  Sum_probs=46.0

Q ss_pred             CCCCCCEEEEEecCCCCCCceeecEEeeeccCccccCc--cccccEEEEeeecCCCCccceEEcC-CCcEEEEEeeecC
Q psy2771          55 DIRNGEFVIAMGSPLTLNNTNTFGIISNKQRSSETLGL--NKTINYIQTDAAITFGNSGGPLVNL-DGEVIGINSMKVT  130 (174)
Q Consensus        55 ~~~~G~~v~~~G~p~g~~~~~~~G~vs~~~~~~~~~~~--~~~~~~~~~~~~~~~G~SGGPl~n~-~G~liGI~~~~~~  130 (174)
                      .-..|...|++ +|...+.+-+.|.+-.......++.-  .+... -.+|..-..|.||=|+|.. .|++||=+-.+-+
T Consensus       108 gcp~garcyv~-npea~nisgtkga~vhlqk~ggef~cvta~gtp-af~~~knlkg~s~~pifeassgr~vgr~k~gkn  184 (211)
T PF05578_consen  108 GCPDGARCYVL-NPEATNISGTKGAMVHLQKTGGEFTCVTASGTP-AFFDLKNLKGWSGLPIFEASSGRVVGRVKVGKN  184 (211)
T ss_pred             CCCCCcEEEEe-CCcccccccCcceEEEEeccCCceEEEeccCCc-ceeeccccCCCCCCceeeccCCcEEEEEEecCC
Confidence            35678888888 77766667777766655543221110  00001 1234455689999999954 5999998776543


No 39 
>COG2524 Predicted transcriptional regulator, contains C-terminal CBS domains [Transcription]
Probab=60.70  E-value=90  Score=25.36  Aligned_cols=21  Identities=33%  Similarity=0.678  Sum_probs=18.5

Q ss_pred             CCCccceEEcCCCcEEEEEeee
Q psy2771         107 FGNSGGPLVNLDGEVIGINSMK  128 (174)
Q Consensus       107 ~G~SGGPl~n~~G~liGI~~~~  128 (174)
                      .|..|.|++|.+ +++||.+..
T Consensus       201 ~~i~GaPVvd~d-k~vGiit~~  221 (294)
T COG2524         201 KGIRGAPVVDDD-KIVGIITLS  221 (294)
T ss_pred             cCccCCceecCC-ceEEEEEHH
Confidence            899999999766 999999854


No 40 
>cd04627 CBS_pair_14 The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria.  The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair.  The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here.  It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic gener
Probab=59.79  E-value=8.1  Score=26.14  Aligned_cols=22  Identities=32%  Similarity=0.350  Sum_probs=18.1

Q ss_pred             CCCccceEEcCCCcEEEEEeee
Q psy2771         107 FGNSGGPLVNLDGEVIGINSMK  128 (174)
Q Consensus       107 ~G~SGGPl~n~~G~liGI~~~~  128 (174)
                      .+.+.=||+|.+|+++|+++..
T Consensus        97 ~~~~~lpVvd~~~~~vGiit~~  118 (123)
T cd04627          97 EGISSVAVVDNQGNLIGNISVT  118 (123)
T ss_pred             cCCceEEEECCCCcEEEEEeHH
Confidence            5556679999999999999864


No 41 
>COG0298 HypC Hydrogenase maturation factor [Posttranslational modification, protein turnover, chaperones]
Probab=58.94  E-value=28  Score=22.83  Aligned_cols=47  Identities=13%  Similarity=0.244  Sum_probs=32.1

Q ss_pred             EEEeeeEeeCCCcEEEEEEcC-CCCCceeecCCCCCCCCCEEEE-EecCC
Q psy2771          22 LTLPNIAYYFEKHIILFHCLQ-NNYPALKLGKAADIRNGEFVIA-MGSPL   69 (174)
Q Consensus        22 ~~a~~v~~d~~~DlAllkv~~-~~~~~~~l~~~~~~~~G~~v~~-~G~p~   69 (174)
                      ++.+++..|...++|++.+-. ..--.+.|-.. +++.|+.|++ +||..
T Consensus         5 iPgqI~~I~~~~~~A~Vd~gGvkreV~l~Lv~~-~v~~GdyVLVHvGfAi   53 (82)
T COG0298           5 IPGQIVEIDDNNHLAIVDVGGVKREVNLDLVGE-EVKVGDYVLVHVGFAM   53 (82)
T ss_pred             cccEEEEEeCCCceEEEEeccEeEEEEeeeecC-ccccCCEEEEEeeEEE
Confidence            467888888888899999864 11112233222 6899999998 78764


No 42 
>cd04603 CBS_pair_KefB_assoc This cd contains two tandem repeats of the cystathionine beta-synthase (CBS pair) domains associated with the KefB (Kef-type K+ transport systems) domain which is involved in inorganic ion transport and metabolism. CBS is a small domain originally identified in cystathionine beta-synthase and subsequently found in a wide range of different proteins. CBS domains usually come in tandem repeats, which associate to form a so-called Bateman domain or a CBS pair which is reflected in this model. The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains.  It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown.
Probab=55.19  E-value=12  Score=24.88  Aligned_cols=22  Identities=14%  Similarity=0.197  Sum_probs=17.5

Q ss_pred             CCCccceEEcCCCcEEEEEeee
Q psy2771         107 FGNSGGPLVNLDGEVIGINSMK  128 (174)
Q Consensus       107 ~G~SGGPl~n~~G~liGI~~~~  128 (174)
                      .+.+-=||+|.+|+++|+++..
T Consensus        85 ~~~~~lpVvd~~~~~~Giit~~  106 (111)
T cd04603          85 TEPPVVAVVDKEGKLVGTIYER  106 (111)
T ss_pred             cCCCeEEEEcCCCeEEEEEEhH
Confidence            4555569999999999999853


No 43 
>PF02395 Peptidase_S6:  Immunoglobulin A1 protease Serine protease Prosite pattern;  InterPro: IPR000710 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. 
They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. Over 20 families (denoted S1 - S66) of serine protease have been identified and grouped into clans on the basis of structural similarity and other functional evidence. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base. The geometric orientations of the catalytic residues are similar between families, despite different protein folds. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC). This group of serine peptidases belongs to the MEROPS peptidase family S6 (clan PA(S)). The type example is the IgA1-specific serine endopeptidase from Neisseria gonorrhoeae. These peptidases cleave prolyl bonds in the hinge regions of immunoglobulin A heavy chains. Similar specificity is shown by the unrelated family of M26 metalloendopeptidases.; GO: 0004252 serine-type endopeptidase activity, 0006508 proteolysis; PDB: 3SZE_A 3H09_B 3SYJ_A 1WXR_A 3AK5_B.
Probab=54.47  E-value=26  Score=32.70  Aligned_cols=47  Identities=23%  Similarity=0.258  Sum_probs=30.6

Q ss_pred             eecCCCCccceEE--cCC-Cc--EEEEEeeecCC----CeEEEEeHHHHHHHHHHH
Q psy2771         103 AAITFGNSGGPLV--NLD-GE--VIGINSMKVTA----GISFAIPIDYAIEFLTNY  149 (174)
Q Consensus       103 ~~~~~G~SGGPl~--n~~-G~--liGI~~~~~~~----~~~~aiPi~~i~~~l~~l  149 (174)
                      ....+|+||+|||  |.. .+  |+|+.+.....    +....+|.+.+.++.++.
T Consensus       211 n~~~~GDSGSPlF~YD~~~kKWvl~Gv~~~~~~~~g~~~~~~~~~~~f~~~~~~~d  266 (769)
T PF02395_consen  211 NYGSPGDSGSPLFAYDKEKKKWVLVGVLSGGNGYNGKGNWWNVIPPDFINQIKQND  266 (769)
T ss_dssp             EB--TT-TT-EEEEEETTTTEEEEEEEEEEECCCCHSEEEEEEECHHHHHHHHHHC
T ss_pred             cccccCcCCCceEEEEccCCeEEEEEEEccccccCCccceeEEecHHHHHHHHhhh
Confidence            3456999999999  333 33  99999876542    356678888887777764


No 44 
>PF03510 Peptidase_C24:  2C endopeptidase (C24) cysteine protease family;  InterPro: IPR000317 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. 
These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad [].  The two signatures that define this group of calicivirus polyproteins identify a cysteine peptidase signature that belongs to MEROPS peptidase family C24 (clan PA(C)). Caliciviruses are positive-stranded ssRNA viruses that cause gastroenteritis. The calicivirus genome contains two open reading frames, ORF1 and ORF2. ORF2 encodes a structural protein []; while ORF1 encodes a non-structural polypeptide, which has RNA helicase, cysteine protease and RNA polymerase activity. The regions of the polyprotein in which these activities lie are similar to proteins produced by the picornaviruses. Two different families of caliciviruses can be distinguished on the basis of sequence similarity, namely those classified as small round structured viruses (SRSVs) and those classed as non-SRSVs. Calicivirus proteases from the non-SRSV group, which are members of the PA protease clan, constitute family C24 of the cysteine proteases (proteases from SRSVs belong to the C37 family). As mentioned above, the protease activity resides within a polyprotein. The enzyme cleaves the polyprotein at sites N-terminal to itself, liberating the polyprotein helicase.; GO: 0004197 cysteine-type endopeptidase activity, 0006508 proteolysis
Probab=53.93  E-value=33  Score=23.71  Aligned_cols=23  Identities=13%  Similarity=0.257  Sum_probs=16.9

Q ss_pred             CCCcEEEEEEcCCCCCceeecCC
Q psy2771          31 FEKHIILFHCLQNNYPALKLGKA   53 (174)
Q Consensus        31 ~~~DlAllkv~~~~~~~~~l~~~   53 (174)
                      ...|+|+++.+...+|.+++++.
T Consensus        34 ~~ge~~~v~~~~~~~p~~~ig~g   56 (105)
T PF03510_consen   34 TDGELCWVQSPLVHLPAAQIGTG   56 (105)
T ss_pred             eccCEEEEECCCCCCCeeEeccC
Confidence            34699999998766777777543


No 45 
>cd04618 CBS_pair_5 The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria.  The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair.  The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here.  It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic genera
Probab=53.25  E-value=29  Score=22.71  Aligned_cols=48  Identities=10%  Similarity=0.041  Sum_probs=33.5

Q ss_pred             CCCccceEEcCC-CcEEEEEeeecCCC---eEEEEeHHHHHHHHHHHHhCCe
Q psy2771         107 FGNSGGPLVNLD-GEVIGINSMKVTAG---ISFAIPIDYAIEFLTNYKRKGK  154 (174)
Q Consensus       107 ~G~SGGPl~n~~-G~liGI~~~~~~~~---~~~aiPi~~i~~~l~~l~~~g~  154 (174)
                      .+.++-|++|.+ |+++||++..--..   ....-|-+.+.+.++.+.+++.
T Consensus        22 ~~~~~~~Vvd~~~~~~~Givt~~Dl~~~~~~~~v~~~~~l~~a~~~m~~~~~   73 (98)
T cd04618          22 NGIRSAPLWDSRKQQFVGMLTITDFILILRLVSIHPERSLFDAALLLLKNKI   73 (98)
T ss_pred             cCCceEEEEeCCCCEEEEEEEHHHHhhheeeEEeCCCCcHHHHHHHHHHCCC
Confidence            456788999875 89999999542111   3445566678888888877654


No 46 
>cd04620 CBS_pair_7 The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria.  The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair.  The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here.  It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic genera
Probab=52.69  E-value=14  Score=24.49  Aligned_cols=22  Identities=18%  Similarity=0.360  Sum_probs=17.7

Q ss_pred             CCCccceEEcCCCcEEEEEeee
Q psy2771         107 FGNSGGPLVNLDGEVIGINSMK  128 (174)
Q Consensus       107 ~G~SGGPl~n~~G~liGI~~~~  128 (174)
                      .+...-|++|.+|+++|+++..
T Consensus        89 ~~~~~~pVvd~~~~~~Gvit~~  110 (115)
T cd04620          89 HQIRHLPVLDDQGQLIGLVTAE  110 (115)
T ss_pred             hCCceEEEEcCCCCEEEEEEhH
Confidence            4556679999999999999853


No 47 
>cd04643 CBS_pair_30 The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria.  The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair.  The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here.  It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic gener
Probab=48.40  E-value=16  Score=24.12  Aligned_cols=17  Identities=41%  Similarity=0.581  Sum_probs=14.9

Q ss_pred             ceEEcCCCcEEEEEeee
Q psy2771         112 GPLVNLDGEVIGINSMK  128 (174)
Q Consensus       112 GPl~n~~G~liGI~~~~  128 (174)
                      -|++|.+|+++||++..
T Consensus        95 ~~Vv~~~~~~~Gvit~~  111 (116)
T cd04643          95 LPVVDDDGIFIGIITRR  111 (116)
T ss_pred             eeEEeCCCeEEEEEEHH
Confidence            69999999999999864


No 48 
>cd04597 CBS_pair_DRTGG_assoc2 This cd contains two tandem repeats of the cystathionine beta-synthase (CBS pair) domains associated with a DRTGG domain upstream. The function of the DRTGG domain, named after its conserved residues, is unknown. CBS is a small domain originally identified in cystathionine beta-synthase and subsequently found in a wide range of different proteins. CBS domains usually come in tandem repeats, which associate to form a so-called Bateman domain or a CBS pair which is reflected in this model. The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains. It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown.
Probab=47.02  E-value=21  Score=24.03  Aligned_cols=21  Identities=29%  Similarity=0.374  Sum_probs=18.1

Q ss_pred             CCCccceEEcCCCcEEEEEee
Q psy2771         107 FGNSGGPLVNLDGEVIGINSM  127 (174)
Q Consensus       107 ~G~SGGPl~n~~G~liGI~~~  127 (174)
                      .+...-||+|.+|+++||++.
T Consensus        87 ~~~~~lpVvd~~~~l~Givt~  107 (113)
T cd04597          87 HNIRTLPVVDDDGTPAGIITL  107 (113)
T ss_pred             cCCCEEEEECCCCeEEEEEEH
Confidence            566788999999999999975


No 49 
>cd04619 CBS_pair_6 The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria.  The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair.  The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here.  It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic genera
Probab=46.92  E-value=19  Score=23.87  Aligned_cols=22  Identities=18%  Similarity=0.339  Sum_probs=17.6

Q ss_pred             CCCccceEEcCCCcEEEEEeee
Q psy2771         107 FGNSGGPLVNLDGEVIGINSMK  128 (174)
Q Consensus       107 ~G~SGGPl~n~~G~liGI~~~~  128 (174)
                      .+...=|++|.+|+++|+++..
T Consensus        88 ~~~~~lpVvd~~~~~~Gvi~~~  109 (114)
T cd04619          88 RGLKNIPVVDENARPLGVLNAR  109 (114)
T ss_pred             cCCCeEEEECCCCcEEEEEEhH
Confidence            4555679999889999999863


No 50 
>cd04592 CBS_pair_EriC_assoc_euk This cd contains two tandem repeats of the cystathionine beta-synthase (CBS pair) domains in the EriC CIC-type chloride channels in eukaryotes. These ion channels are proteins with a seemingly simple task of allowing the passive flow of chloride ions across biological membranes. CIC-type chloride channels come from all kingdoms of life, have several gene families, and can be gated by voltage. The members of the CIC-type chloride channel are double-barreled: two proteins forming homodimers at a broad interface formed by four helices from each protein. The two pores are not found at this interface, but are completely contained within each subunit, as deduced from the mutational analyses, unlike many other channels, in which four or five identical or structurally related subunits jointly form one pore. CBS is a small domain originally identified in cystathionine beta-synthase and subsequently found in a wide range of different proteins. CBS domains usually 
Probab=46.29  E-value=22  Score=25.00  Aligned_cols=22  Identities=23%  Similarity=0.099  Sum_probs=18.1

Q ss_pred             CCCccceEEcCCCcEEEEEeee
Q psy2771         107 FGNSGGPLVNLDGEVIGINSMK  128 (174)
Q Consensus       107 ~G~SGGPl~n~~G~liGI~~~~  128 (174)
                      .+.++-||+|.+|+++|+++..
T Consensus        22 ~~~~~~~VvD~~g~l~Givt~~   43 (133)
T cd04592          22 EKQSCVLVVDSDDFLEGILTLG   43 (133)
T ss_pred             cCCCEEEEECCCCeEEEEEEHH
Confidence            3556789999999999999953


No 51 
>cd04801 CBS_pair_M50_like This cd contains two tandem repeats of the cystathionine beta-synthase (CBS pair) domains in association with the metalloprotease peptidase M50.  CBS is a small domain originally identified in cystathionine beta-synthase and subsequently found in a wide range of different proteins. CBS domains usually come in tandem repeats, which associate to form a so-called Bateman domain or a CBS pair which is reflected in this model. The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains.  It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown.
Probab=46.28  E-value=20  Score=23.68  Aligned_cols=22  Identities=27%  Similarity=0.314  Sum_probs=18.1

Q ss_pred             CCCccceEEcCCCcEEEEEeee
Q psy2771         107 FGNSGGPLVNLDGEVIGINSMK  128 (174)
Q Consensus       107 ~G~SGGPl~n~~G~liGI~~~~  128 (174)
                      .+.+--||+|.+|+++|+++..
T Consensus        88 ~~~~~l~Vv~~~~~~~Gvl~~~  109 (114)
T cd04801          88 QGLDELAVVEDSGQVIGLITEA  109 (114)
T ss_pred             CCCCeeEEEcCCCcEEEEEecc
Confidence            5566679998889999999864


No 52 
>cd04607 CBS_pair_NTP_transferase_assoc This cd contains two tandem repeats of the cystathionine beta-synthase (CBS pair) domain associated with the NTP (Nucleotidyl transferase) domain downstream.  CBS is a small domain originally identified in cystathionine beta-synthase and subsequently found in a wide range of different proteins. CBS domains usually come in tandem repeats, which associate to form a so-called Bateman domain or a CBS pair which is reflected in this model. The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains.  It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown.
Probab=44.86  E-value=22  Score=23.43  Aligned_cols=22  Identities=18%  Similarity=0.457  Sum_probs=17.7

Q ss_pred             CCCccceEEcCCCcEEEEEeee
Q psy2771         107 FGNSGGPLVNLDGEVIGINSMK  128 (174)
Q Consensus       107 ~G~SGGPl~n~~G~liGI~~~~  128 (174)
                      .+...-||+|.+|+++|+++..
T Consensus        87 ~~~~~~~Vv~~~~~~~Gvit~~  108 (113)
T cd04607          87 RSIRHLPILDEEGRVVGLATLD  108 (113)
T ss_pred             CCCCEEEEECCCCCEEEEEEhH
Confidence            4556679999889999999853


No 53 
>PRK15431 ferrous iron transport protein FeoC; Provisional
Probab=44.29  E-value=20  Score=23.38  Aligned_cols=27  Identities=15%  Similarity=0.177  Sum_probs=23.8

Q ss_pred             eEEEEeHHHHHHHHHHHHhCCeeeeee
Q psy2771         133 ISFAIPIDYAIEFLTNYKRKGKFCAYS  159 (174)
Q Consensus       133 ~~~aiPi~~i~~~l~~l~~~g~~~~~~  159 (174)
                      ..|..|.+.+...|+.|.+.|++.+..
T Consensus        24 ~~~~~p~~~VeaMLe~l~~kGkverv~   50 (78)
T PRK15431         24 QTLNTPQPMINAMLQQLESMGKAVRIQ   50 (78)
T ss_pred             HHHCcCHHHHHHHHHHHHHCCCeEeec
Confidence            367899999999999999999998663


No 54 
>COG5428 Uncharacterized conserved small protein [Function unknown]
Probab=44.27  E-value=43  Score=21.29  Aligned_cols=18  Identities=11%  Similarity=0.237  Sum_probs=14.5

Q ss_pred             eeEeeCCCcEEEEEEcCC
Q psy2771          26 NIAYYFEKHIILFHCLQN   43 (174)
Q Consensus        26 ~v~~d~~~DlAllkv~~~   43 (174)
                      .+.||++.|++-|.+.+.
T Consensus         2 kv~YD~daD~lYI~~~~~   19 (69)
T COG5428           2 KVKYDTDADILYILLEEG   19 (69)
T ss_pred             ceeecCCCcEEEEEEecC
Confidence            367999999998888754


No 55 
>cd04602 CBS_pair_IMPDH_2 This cd contains two tandem repeats of the cystathionine beta-synthase (CBS pair) domains in the inosine 5' monophosphate dehydrogenase (IMPDH) protein.  IMPDH is an essential enzyme that catalyzes the first step unique to GTP synthesis, playing a key role in the regulation of cell proliferation and differentiation. CBS is a small domain originally identified in cystathionine beta-synthase and subsequently found in a wide range of different proteins. CBS domains usually come in tandem repeats, which associate to form a so-called Bateman domain or a CBS pair which is reflected in this model. The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains. It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain in IMPDH have been associated with retinitis pigmentos
Probab=43.17  E-value=23  Score=23.44  Aligned_cols=22  Identities=27%  Similarity=0.398  Sum_probs=17.9

Q ss_pred             CCCccceEEcCCCcEEEEEeee
Q psy2771         107 FGNSGGPLVNLDGEVIGINSMK  128 (174)
Q Consensus       107 ~G~SGGPl~n~~G~liGI~~~~  128 (174)
                      .+...-|++|.+|+++|+++..
T Consensus        88 ~~~~~~pVv~~~~~~~Gvit~~  109 (114)
T cd04602          88 SKKGKLPIVNDDGELVALVTRS  109 (114)
T ss_pred             cCCCceeEECCCCeEEEEEEHH
Confidence            4555679999899999999853


No 56 
>cd04590 CBS_pair_CorC_HlyC_assoc This cd contains two tandem repeats of the cystathionine beta-synthase (CBS pair) domains associated with the CorC_HlyC domain. CorC_HlyC is a transporter associated domain. This small domain is found in Na+/H+ antiporters, in proteins involved in magnesium and cobalt efflux, and in association with some proteins of unknown function.  The function of the CorC_HlyC domain is uncertain but it might be involved in modulating transport of ion substrates. CBS is a small domain originally identified in cystathionine beta-synthase and subsequently found in a wide range of different proteins. CBS domains usually come in tandem repeats, which associate to form a so-called Bateman domain or a CBS pair which is reflected in this model.  The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains. It has been proposed that the CBS domain may play a regulatory role,
Probab=43.08  E-value=22  Score=23.24  Aligned_cols=22  Identities=14%  Similarity=0.149  Sum_probs=17.1

Q ss_pred             CCCccceEEcCCCcEEEEEeee
Q psy2771         107 FGNSGGPLVNLDGEVIGINSMK  128 (174)
Q Consensus       107 ~G~SGGPl~n~~G~liGI~~~~  128 (174)
                      .+.+=-|++|.+|+++|+++..
T Consensus        85 ~~~~~~~Vv~~~~~~~Gvit~~  106 (111)
T cd04590          85 ERSHMAIVVDEYGGTAGLVTLE  106 (111)
T ss_pred             cCCcEEEEEECCCCEEEEeEHH
Confidence            3455568899889999999853


No 57 
>cd04617 CBS_pair_4 The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria.  The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair.  The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here.  It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic genera
Probab=42.88  E-value=21  Score=23.81  Aligned_cols=22  Identities=27%  Similarity=0.155  Sum_probs=17.0

Q ss_pred             CCCccceEEcCC---CcEEEEEeee
Q psy2771         107 FGNSGGPLVNLD---GEVIGINSMK  128 (174)
Q Consensus       107 ~G~SGGPl~n~~---G~liGI~~~~  128 (174)
                      .+..-=||+|.+   |+++|+++..
T Consensus        89 ~~~~~lpVvd~~~~~~~l~Gvit~~  113 (118)
T cd04617          89 HQVDSLPVVEKVDEGLEVIGRITKT  113 (118)
T ss_pred             cCCCEeeEEeCCCccceEEEEEEhh
Confidence            455567999887   7999999864


No 58 
>COG3448 CBS-domain-containing membrane protein [Signal transduction mechanisms]
Probab=42.54  E-value=21  Score=29.60  Aligned_cols=21  Identities=29%  Similarity=0.537  Sum_probs=16.5

Q ss_pred             CCccceEEcCCCcEEEEEeee
Q psy2771         108 GNSGGPLVNLDGEVIGINSMK  128 (174)
Q Consensus       108 G~SGGPl~n~~G~liGI~~~~  128 (174)
                      |.--=|++|.+|+++||++..
T Consensus       345 g~H~lpvld~~g~lvGIvsQt  365 (382)
T COG3448         345 GLHALPVLDAAGKLVGIVSQT  365 (382)
T ss_pred             CcceeeEEcCCCcEEEEeeHH
Confidence            333459999999999999853


No 59 
>cd04582 CBS_pair_ABC_OpuCA_assoc This cd contains two tandem repeats of the cystathionine beta-synthase (CBS pair) domains in association with the ABC transporter OpuCA. OpuCA is the ATP binding component of a bacterial solute transporter that serves a protective role to cells growing in a hyperosmolar environment but the function of the CBS domains in OpuCA remains unknown.  In the related ABC transporter, OpuA, the tandem CBS domains have been shown to function as sensors for ionic strength, whereby they control the transport activity through an electronic switching mechanism. ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds, like sugars, ions, peptides, and more complex organic molecules. They are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzi
Probab=42.46  E-value=25  Score=22.69  Aligned_cols=22  Identities=27%  Similarity=0.291  Sum_probs=17.5

Q ss_pred             CCCccceEEcCCCcEEEEEeee
Q psy2771         107 FGNSGGPLVNLDGEVIGINSMK  128 (174)
Q Consensus       107 ~G~SGGPl~n~~G~liGI~~~~  128 (174)
                      .+.+--|++|.+|+++|+++..
T Consensus        80 ~~~~~~~Vv~~~~~~~Gvi~~~  101 (106)
T cd04582          80 HDMSWLPCVDEDGRYVGEVTQR  101 (106)
T ss_pred             CCCCeeeEECCCCcEEEEEEHH
Confidence            4555579999899999999864


No 60 
>cd04641 CBS_pair_28 The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria.  The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair.  The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here.  It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic gener
Probab=42.22  E-value=27  Score=23.32  Aligned_cols=22  Identities=27%  Similarity=0.362  Sum_probs=18.2

Q ss_pred             CCCCccceEEcCCCcEEEEEee
Q psy2771         106 TFGNSGGPLVNLDGEVIGINSM  127 (174)
Q Consensus       106 ~~G~SGGPl~n~~G~liGI~~~  127 (174)
                      ..+.+.-||+|.+|+++|+++.
T Consensus        21 ~~~~~~~pVv~~~~~~~Giv~~   42 (120)
T cd04641          21 ERRVSALPIVDENGKVVDVYSR   42 (120)
T ss_pred             HcCCCeeeEECCCCeEEEEEeH
Confidence            3466778999999999999983


No 61 
>cd04601 CBS_pair_IMPDH This cd contains two tandem repeats of the cystathionine beta-synthase (CBS pair) domains in the inosine 5' monophosphate dehydrogenase (IMPDH) protein.  IMPDH is an essential enzyme that catalyzes the first step unique to GTP synthesis, playing a key role in the regulation of cell proliferation and differentiation. CBS is a small domain originally identified in cystathionine beta-synthase and subsequently found in a wide range of different proteins. CBS domains usually come in tandem repeats, which associate to form a so-called Bateman domain or a CBS pair which is reflected in this model. The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains. It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain in IMPDH have been associated with retinitis pigmentosa.
Probab=42.19  E-value=24  Score=22.84  Aligned_cols=22  Identities=23%  Similarity=0.377  Sum_probs=17.3

Q ss_pred             CCCccceEEcCCCcEEEEEeee
Q psy2771         107 FGNSGGPLVNLDGEVIGINSMK  128 (174)
Q Consensus       107 ~G~SGGPl~n~~G~liGI~~~~  128 (174)
                      .+..--|++|.+|+++|+++..
T Consensus        84 ~~~~~~~Vv~~~~~~~Gvi~~~  105 (110)
T cd04601          84 HKIEKLPVVDDEGKLKGLITVK  105 (110)
T ss_pred             hCCCeeeEEcCCCCEEEEEEhh
Confidence            3455568999889999999864


No 62 
>smart00116 CBS Domain in cystathionine beta-synthase and other proteins. Domain present in all 3 forms of cellular life. Present in two copies in inosine monophosphate dehydrogenase, of which one is disordered in the crystal structure [3]. A number of disease states are associated with CBS-containing proteins including homocystinuria, Becker's and Thomsen disease.
Probab=42.14  E-value=28  Score=18.23  Aligned_cols=20  Identities=30%  Similarity=0.554  Sum_probs=15.7

Q ss_pred             CCccceEEcCCCcEEEEEee
Q psy2771         108 GNSGGPLVNLDGEVIGINSM  127 (174)
Q Consensus       108 G~SGGPl~n~~G~liGI~~~  127 (174)
                      +.+.-|+++.+++++|+++.
T Consensus        22 ~~~~~~v~~~~~~~~g~i~~   41 (49)
T smart00116       22 GIRRLPVVDEEGRLVGIVTR   41 (49)
T ss_pred             CCCcccEECCCCeEEEEEEH
Confidence            44566888888999999875


No 63 
>cd04614 CBS_pair_1 The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria.  The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair.  The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here.  It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic genera
Probab=41.96  E-value=30  Score=22.38  Aligned_cols=48  Identities=19%  Similarity=0.175  Sum_probs=32.4

Q ss_pred             CCCccceEEcCCCcEEEEEeeec--C-CCeEEEEeHHHHHHHHHHHHhCCe
Q psy2771         107 FGNSGGPLVNLDGEVIGINSMKV--T-AGISFAIPIDYAIEFLTNYKRKGK  154 (174)
Q Consensus       107 ~G~SGGPl~n~~G~liGI~~~~~--~-~~~~~aiPi~~i~~~l~~l~~~g~  154 (174)
                      .+.+.-|++|.+|+++|+++...  . ....+.-|-+.+.+.++.+.+.+.
T Consensus        22 ~~~~~~~V~d~~~~~~Giv~~~dl~~~~~~~~v~~~~~l~~a~~~m~~~~~   72 (96)
T cd04614          22 ANVKALPVLDDDGKLSGIITERDLIAKSEVVTATKRTTVSECAQKMKRNRI   72 (96)
T ss_pred             cCCCeEEEECCCCCEEEEEEHHHHhcCCCcEEecCCCCHHHHHHHHHHhCC
Confidence            46678899999999999998543  1 123444455666777777766544


No 64 
>cd04583 CBS_pair_ABC_OpuCA_assoc2 This cd contains two tandem repeats of the cystathionine beta-synthase (CBS pair) domains in association with the ABC transporter OpuCA. OpuCA is the ATP binding component of a bacterial solute transporter that serves a protective role to cells growing in a hyperosmolar environment but the function of the CBS domains in OpuCA remains unknown.  In the related ABC transporter, OpuA, the tandem CBS domains have been shown to function as sensors for ionic strength, whereby they control the transport activity through an electronic switching mechanism. ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds, like sugars, ions, peptides, and more complex organic molecules. They are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyz
Probab=41.87  E-value=25  Score=22.70  Aligned_cols=22  Identities=27%  Similarity=0.477  Sum_probs=17.4

Q ss_pred             CCCccceEEcCCCcEEEEEeee
Q psy2771         107 FGNSGGPLVNLDGEVIGINSMK  128 (174)
Q Consensus       107 ~G~SGGPl~n~~G~liGI~~~~  128 (174)
                      .+..--|++|.+|+++|+++..
T Consensus        83 ~~~~~~~vv~~~g~~~Gvit~~  104 (109)
T cd04583          83 RGPKYVPVVDEDGKLVGLITRS  104 (109)
T ss_pred             cCCceeeEECCCCeEEEEEehH
Confidence            3555669999999999999864


No 65 
>cd04596 CBS_pair_DRTGG_assoc This cd contains two tandem repeats of the cystathionine beta-synthase (CBS pair) domains associated with a DRTGG domain upstream. The function of the DRTGG domain, named after its conserved residues, is unknown. CBS is a small domain originally identified in cystathionine beta-synthase and subsequently found in a wide range of different proteins. CBS domains usually come in tandem repeats, which associate to form a so-called Bateman domain or a CBS pair which is reflected in this model.  The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains. It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown.
Probab=41.25  E-value=26  Score=22.86  Aligned_cols=22  Identities=27%  Similarity=0.303  Sum_probs=17.9

Q ss_pred             CCCccceEEcCCCcEEEEEeee
Q psy2771         107 FGNSGGPLVNLDGEVIGINSMK  128 (174)
Q Consensus       107 ~G~SGGPl~n~~G~liGI~~~~  128 (174)
                      .+...-|++|.+|+++|+++..
T Consensus        82 ~~~~~~~Vv~~~~~~~G~it~~  103 (108)
T cd04596          82 EGIEMLPVVDDNKKLLGIISRQ  103 (108)
T ss_pred             cCCCeeeEEcCCCCEEEEEEHH
Confidence            4556779999999999998753


No 66 
>COG3290 CitA Signal transduction histidine kinase regulating citrate/malate metabolism [Signal transduction mechanisms]
Probab=41.02  E-value=23  Score=31.50  Aligned_cols=18  Identities=33%  Similarity=0.569  Sum_probs=16.4

Q ss_pred             ceEEcCCCcEEEEEeeec
Q psy2771         112 GPLVNLDGEVIGINSMKV  129 (174)
Q Consensus       112 GPl~n~~G~liGI~~~~~  129 (174)
                      -|+||.+|++||+++-++
T Consensus       143 ~PI~d~~g~~IGvVsVG~  160 (537)
T COG3290         143 VPIFDEDGKQIGVVSVGY  160 (537)
T ss_pred             cceECCCCCEEEEEEEee
Confidence            499999999999999875


No 67 
>cd04606 CBS_pair_Mg_transporter This cd contains two tandem repeats of the cystathionine beta-synthase (CBS pair) domain in the magnesium transporter, MgtE.  MgtE and its homologs are found in eubacteria, archaebacteria, and eukaryota. Members of this family transport Mg2+ or other divalent cations into the cell via two highly conserved aspartates. CBS is a small domain originally identified in cystathionine beta-synthase and subsequently found in a wide range of different proteins. CBS domains usually come in tandem repeats, which associate to form a so-called Bateman domain or a CBS pair which is reflected in this model. The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains. It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown.
Probab=40.26  E-value=28  Score=22.74  Aligned_cols=22  Identities=23%  Similarity=0.522  Sum_probs=17.3

Q ss_pred             CCCccceEEcCCCcEEEEEeee
Q psy2771         107 FGNSGGPLVNLDGEVIGINSMK  128 (174)
Q Consensus       107 ~G~SGGPl~n~~G~liGI~~~~  128 (174)
                      .+....|++|.+|+++|+++..
T Consensus        82 ~~~~~~~Vv~~~~~~~Gvit~~  103 (109)
T cd04606          82 YDLLALPVVDEEGRLVGIITVD  103 (109)
T ss_pred             cCCceeeeECCCCcEEEEEEhH
Confidence            3445679999899999999864


No 68 
>cd04642 CBS_pair_29 The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria.  The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair.  The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here.  It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic gener
Probab=40.07  E-value=28  Score=23.51  Aligned_cols=20  Identities=20%  Similarity=0.283  Sum_probs=15.7

Q ss_pred             CccceEEcCCCcEEEEEeee
Q psy2771         109 NSGGPLVNLDGEVIGINSMK  128 (174)
Q Consensus       109 ~SGGPl~n~~G~liGI~~~~  128 (174)
                      .+=-|++|.+|+++|+++..
T Consensus       102 ~~~l~Vvd~~~~~~Giit~~  121 (126)
T cd04642         102 VHRVWVVDEEGKPIGVITLT  121 (126)
T ss_pred             CcEEEEECCCCCEEEEEEHH
Confidence            33358998889999999853


No 69 
>cd04610 CBS_pair_ParBc_assoc This cd contains two tandem repeats of the cystathionine beta-synthase (CBS pair) domains associated with a ParBc (ParB-like nuclease) domain downstream. CBS is a small domain originally identified in cystathionine beta-synthase and subsequently found in a wide range of different proteins. CBS domains usually come in tandem repeats, which associate to form a so-called Bateman domain or a CBS pair which is reflected in this model. The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains.  It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown.
Probab=38.46  E-value=31  Score=22.23  Aligned_cols=19  Identities=26%  Similarity=0.431  Sum_probs=15.2

Q ss_pred             ccceEEcCCCcEEEEEeee
Q psy2771         110 SGGPLVNLDGEVIGINSMK  128 (174)
Q Consensus       110 SGGPl~n~~G~liGI~~~~  128 (174)
                      .--|++|.+|+++|+++..
T Consensus        84 ~~~~Vv~~~g~~~Gvi~~~  102 (107)
T cd04610          84 SKLPVVDENNNLVGIITNT  102 (107)
T ss_pred             CeEeEECCCCeEEEEEEHH
Confidence            3458889899999998753


No 70 
>cd04640 CBS_pair_27 The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria.  The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair.  The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here.  It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic gener
Probab=38.20  E-value=28  Score=23.58  Aligned_cols=22  Identities=23%  Similarity=0.297  Sum_probs=17.8

Q ss_pred             CCCccceEEcCC-CcEEEEEeee
Q psy2771         107 FGNSGGPLVNLD-GEVIGINSMK  128 (174)
Q Consensus       107 ~G~SGGPl~n~~-G~liGI~~~~  128 (174)
                      .+.+--||+|.+ |+++|+++..
T Consensus        99 ~~~~~lpVvd~~~~~~~G~it~~  121 (126)
T cd04640          99 SGRQHALVVDREHHQIRGIISTS  121 (126)
T ss_pred             CCCceEEEEECCCCEEEEEEeHH
Confidence            566678999887 8999999853


No 71 
>PF01455 HupF_HypC:  HupF/HypC family;  InterPro: IPR001109 The large subunit of [NiFe]-hydrogenase, as well as other nickel metalloenzymes, is synthesised as a precursor devoid of the metalloenzyme active site. This precursor then undergoes a complex post-translational maturation process that requires a number of accessory proteins. The hydrogenase expression/formation proteins (HupF/HypC) form a family of small proteins that are hydrogenase precursor-specific chaperones required for this maturation process []. They are believed to keep the hydrogenase precursor in a conformation accessible for metal incorporation [, ].; PDB: 3D3R_A 2Z1C_C 2OT2_A.
Probab=38.16  E-value=1e+02  Score=19.34  Aligned_cols=43  Identities=9%  Similarity=0.128  Sum_probs=29.0

Q ss_pred             EEEeeeEeeCCCcEEEEEEcCCCCCceeecCCCCCCCCCEEEEE
Q psy2771          22 LTLPNIAYYFEKHIILFHCLQNNYPALKLGKAADIRNGEFVIAM   65 (174)
Q Consensus        22 ~~a~~v~~d~~~DlAllkv~~~~~~~~~l~~~~~~~~G~~v~~~   65 (174)
                      ++++++..+.....|++..... ...+.+.=-.++++||.|++-
T Consensus         5 iP~~Vv~v~~~~~~A~v~~~G~-~~~V~~~lv~~v~~Gd~VLVH   47 (68)
T PF01455_consen    5 IPGRVVEVDEDGGMAVVDFGGV-RREVSLALVPDVKVGDYVLVH   47 (68)
T ss_dssp             EEEEEEEEETTTTEEEEEETTE-EEEEEGTTCTSB-TT-EEEEE
T ss_pred             ccEEEEEEeCCCCEEEEEcCCc-EEEEEEEEeCCCCCCCEEEEe
Confidence            5788888888899999988742 333433333458999999985


No 72 
>cd04600 CBS_pair_HPP_assoc This cd contains two tandem repeats of the cystathionine beta-synthase (CBS pair) domains associated with the HPP motif domain. These proteins are integral membrane proteins with four transmembrane spanning helices. The function of these proteins is uncertain, but they are thought to be transporters. CBS is a small domain originally identified in cystathionine beta-synthase and subsequently found in a wide range of different proteins. CBS domains usually come in tandem repeats, which associate to form a so-called Bateman domain or a CBS pair which is reflected in this model. The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains.  It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown.
Probab=37.36  E-value=32  Score=22.90  Aligned_cols=22  Identities=27%  Similarity=0.424  Sum_probs=18.1

Q ss_pred             CCCccceEEcCCCcEEEEEeee
Q psy2771         107 FGNSGGPLVNLDGEVIGINSMK  128 (174)
Q Consensus       107 ~G~SGGPl~n~~G~liGI~~~~  128 (174)
                      .+.+.-|++|.+|+++|+++..
T Consensus        98 ~~~~~~~Vv~~~g~~~Gvit~~  119 (124)
T cd04600          98 GGHHHVPVVDEDRRLVGIVTQT  119 (124)
T ss_pred             cCCCceeEEcCCCCEEEEEEhH
Confidence            4566789999899999999853


No 73 
>cd04615 CBS_pair_2 The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria.  The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair.  The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here.  It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic genera
Probab=36.58  E-value=34  Score=22.33  Aligned_cols=22  Identities=27%  Similarity=0.249  Sum_probs=17.1

Q ss_pred             CCCccceEEcCCCcEEEEEeee
Q psy2771         107 FGNSGGPLVNLDGEVIGINSMK  128 (174)
Q Consensus       107 ~G~SGGPl~n~~G~liGI~~~~  128 (174)
                      .+...-|++|.+|+++|+++..
T Consensus        87 ~~~~~~~Vvd~~g~~~Gvvt~~  108 (113)
T cd04615          87 NNISRLPVLDDKGKVGGIVTED  108 (113)
T ss_pred             cCCCeeeEECCCCeEEEEEEHH
Confidence            3445679999899999998853


No 74 
>cd04609 CBS_pair_PALP_assoc2 This cd contains two tandem repeats of the cystathionine beta-synthase (CBS pair) domains associated with the pyridoxal-phosphate (PALP) dependent enzyme domain upstream.   The vitamin B6 complex comprises pyridoxine, pyridoxal, and pyridoxamine, as well as the 5'-phosphate esters of pyridoxal (PALP) and pyridoxamine, the last two being the biologically active coenzyme derivatives.  The members of the PALP family are principally involved in the biosynthesis of amino acids and amino acid-derived metabolites, but they are also found in the biosynthetic pathways of amino sugars and other amine-containing compounds.  CBS is a small domain originally identified in cystathionine beta-synthase and subsequently found in a wide range of different proteins. CBS domains usually come in tandem repeats, which associate to form a so-called Bateman domain or a CBS pair which is reflected in this model. The interface between the two CBS domains forms a cleft that is a pote
Probab=36.39  E-value=32  Score=22.15  Aligned_cols=18  Identities=22%  Similarity=0.340  Sum_probs=14.8

Q ss_pred             cceEEcCCCcEEEEEeee
Q psy2771         111 GGPLVNLDGEVIGINSMK  128 (174)
Q Consensus       111 GGPl~n~~G~liGI~~~~  128 (174)
                      .-|+++.+|+++|+++..
T Consensus        88 ~~~vv~~~~~~~Gvvt~~  105 (110)
T cd04609          88 VAVVVDEGGKFVGIITRA  105 (110)
T ss_pred             ceeEEecCCeEEEEEeHH
Confidence            468888889999998853


No 75 
>PF09465 LBR_tudor:  Lamin-B receptor of TUDOR domain;  InterPro: IPR019023  The Lamin-B receptor is a chromatin and lamin binding protein in the inner nuclear membrane. It is one of the integral inner nuclear envelope membrane proteins responsible for targeting nuclear membranes to chromatin, being a downstream effector of Ran, a small Ras-like nuclear GTPase which regulates NE assembly. Lamin-B receptor interacts with importin beta, a Ran-binding protein, thereby directly contributing to the fusion of membrane vesicles and the formation of the nuclear envelope []. ; PDB: 2L8D_A 2DIG_A.
Probab=36.18  E-value=1e+02  Score=18.70  Aligned_cols=42  Identities=12%  Similarity=0.097  Sum_probs=26.1

Q ss_pred             CCceeEeeeeEEEEEeecCcEE-EEeeeEeeCCCcEEEEEEcC
Q psy2771           1 MPGVEKVTQDICLSTFSFNSLL-TLPNIAYYFEKHIILFHCLQ   42 (174)
Q Consensus         1 ~~~v~~~a~~~~~~~~~~~~~~-~a~~v~~d~~~DlAllkv~~   42 (174)
                      ||...-...+..-.-+.+...+ ++++..+|...++.-++.+.
T Consensus         1 mp~~k~~~Ge~V~~rWP~s~lYYe~kV~~~d~~~~~y~V~Y~D   43 (55)
T PF09465_consen    1 MPSRKFAIGEVVMVRWPGSSLYYEGKVLSYDSKSDRYTVLYED   43 (55)
T ss_dssp             SSSSSS-SS-EEEEE-TTTS-EEEEEEEEEETTTTEEEEEETT
T ss_pred             CCcccccCCCEEEEECCCCCcEEEEEEEEecccCceEEEEEcC
Confidence            4443333333334444555554 99999999999999888874


No 76 
>cd04587 CBS_pair_CAP-ED_DUF294_PBI_assoc This cd contains two tandem repeats of the cystathionine beta-synthase (CBS pair) domains associated with either the CAP_ED (cAMP receptor protein effector domain) family of transcription factors and the DUF294 domain or the PB1 (Phox and Bem1p) domain.  Members of CAP_ED include CAP, which binds cAMP; FNR (fumarate and nitrate reductase), which uses an iron-sulfur cluster to sense oxygen; and CooA, a heme-containing CO sensor. In all cases binding of the effector leads to conformational changes and the ability to activate transcription. DUF294 is a putative nucleotidyltransferase with a conserved DxD motif. The PB1 domain adopts a beta-grasp fold, similar to that found in ubiquitin and Ras-binding domains. A motif, variously termed OPR, PC and AID, represents the most conserved region of the majority of PB1 domains, and is necessary for PB1 domain function. This function is the formation of PB1 domain heterodimers, although not all PB1 domain pai
Probab=34.84  E-value=36  Score=22.16  Aligned_cols=18  Identities=28%  Similarity=0.575  Sum_probs=14.7

Q ss_pred             cceEEcCCCcEEEEEeee
Q psy2771         111 GGPLVNLDGEVIGINSMK  128 (174)
Q Consensus       111 GGPl~n~~G~liGI~~~~  128 (174)
                      --|+++.+|+++|+++..
T Consensus        91 ~l~Vv~~~~~~~Gvvs~~  108 (113)
T cd04587          91 HLPVVDKSGQVVGLLDVT  108 (113)
T ss_pred             cccEECCCCCEEEEEEHH
Confidence            348998889999999853


No 77 
>COG3284 AcoR Transcriptional activator of acetoin/glycerol metabolism [Secondary metabolites biosynthesis, transport, and catabolism / Transcription]
Probab=34.20  E-value=22  Score=32.05  Aligned_cols=23  Identities=22%  Similarity=0.559  Sum_probs=19.3

Q ss_pred             CCCccceEEcCCCcEEEEEeeec
Q psy2771         107 FGNSGGPLVNLDGEVIGINSMKV  129 (174)
Q Consensus       107 ~G~SGGPl~n~~G~liGI~~~~~  129 (174)
                      --+++.|++|.+|+|+|+..-..
T Consensus       158 lsCsAaPI~D~qG~L~gVLDISs  180 (606)
T COG3284         158 LSCSAAPIFDEQGELVGVLDISS  180 (606)
T ss_pred             ceeeeeccccCCCcEEEEEEecc
Confidence            34789999999999999987653


No 78 
>PF15436 PGBA_N:  Plasminogen-binding protein pgbA N-terminal
Probab=33.48  E-value=1.9e+02  Score=22.72  Aligned_cols=52  Identities=15%  Similarity=0.052  Sum_probs=34.1

Q ss_pred             EEEEe-ecCcEEEEeeeEeeCCCcEEEEEEcC---CCCCceeecCCCCCCCCCEEEE
Q psy2771          12 CLSTF-SFNSLLTLPNIAYYFEKHIILFHCLQ---NNYPALKLGKAADIRNGEFVIA   64 (174)
Q Consensus        12 ~~~~~-~~~~~~~a~~v~~d~~~DlAllkv~~---~~~~~~~l~~~~~~~~G~~v~~   64 (174)
                      .+..+ .+-+.+.|+.+....+...|.+|+..   .....++... -.++.||.|+.
T Consensus        33 V~h~~~~~~~~IiA~a~V~~~~~g~A~~kf~~fd~L~Q~aLP~p~-~~pk~GD~vil   88 (218)
T PF15436_consen   33 VVHKFDKDHSSIIARAVVISKKNGVAKAKFSVFDSLKQDALPTPK-MVPKKGDEVIL   88 (218)
T ss_pred             EEEEecCCcceeeeEEEEEEecCCeeEEEEeehhhhhhhcCCCCc-cccCCCCEEEE
Confidence            34455 56677778877777789999999863   2222333322 24799999876


No 79 
>cd04624 CBS_pair_11 The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria.  The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair.  The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here.  It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic gener
Probab=32.90  E-value=45  Score=21.68  Aligned_cols=22  Identities=23%  Similarity=0.365  Sum_probs=17.4

Q ss_pred             CCCccceEEcCCCcEEEEEeee
Q psy2771         107 FGNSGGPLVNLDGEVIGINSMK  128 (174)
Q Consensus       107 ~G~SGGPl~n~~G~liGI~~~~  128 (174)
                      .+.+..|++|.+|+++|+++..
T Consensus        86 ~~~~~~~Vv~~~g~~~Gilt~~  107 (112)
T cd04624          86 NNIRHHLVVDKGGELVGVISIR  107 (112)
T ss_pred             cCccEEEEEcCCCcEEEEEEHH
Confidence            3455678999889999999864


No 80 
>cd04611 CBS_pair_PAS_GGDEF_DUF1_assoc This cd contains two tandem repeats of the cystathionine beta-synthase (CBS pair) domains in association with a PAS domain, a GGDEF (DiGuanylate-Cyclase (DGC)) domain, and a DUF1 domain downstream. PAS domains have been found to bind ligands, and to act as sensors for light and oxygen in signal transduction. The GGDEF domain has been suggested to be homologous to the adenylyl cyclase catalytic domain and is thought to be involved in regulating cell surface adhesiveness in bacteria. CBS is a small domain originally identified in cystathionine beta-synthase and subsequently found in a wide range of different proteins. CBS domains usually come in tandem repeats, which associate to form a so-called Bateman domain or a CBS pair which is reflected in this model. The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains.  It has been proposed that the CB
Probab=32.24  E-value=43  Score=21.61  Aligned_cols=22  Identities=32%  Similarity=0.403  Sum_probs=17.0

Q ss_pred             CCCccceEEcCCCcEEEEEeee
Q psy2771         107 FGNSGGPLVNLDGEVIGINSMK  128 (174)
Q Consensus       107 ~G~SGGPl~n~~G~liGI~~~~  128 (174)
                      .+..--|++|.+|+++|+++..
T Consensus        85 ~~~~~~~Vv~~~~~~~Gvi~~~  106 (111)
T cd04611          85 HGIRHLVVVDDDGELLGLLSQT  106 (111)
T ss_pred             cCCeEEEEECCCCcEEEEEEhH
Confidence            3445578999889999999853


No 81 
>PF06003 SMN:  Survival motor neuron protein (SMN);  InterPro: IPR010304 This family consists of several eukaryotic survival motor neuron (SMN) proteins. The Survival of Motor Neurons (SMN) protein, the product of the spinal muscular atrophy-determining gene, is part of a large macromolecular complex (SMN complex) that functions in the assembly of spliceosomal small nuclear ribonucleoproteins (snRNPs). The SMN complex functions as a specificity factor essential for the efficient assembly of Sm proteins on U snRNAs and likely protects cells from illicit, and potentially deleterious, non-specific binding of Sm proteins to RNAs.; GO: 0003723 RNA binding, 0006397 mRNA processing, 0005634 nucleus, 0005737 cytoplasm; PDB: 1MHN_A 4A4G_A 3S6N_M 4A4E_A 1G5V_A 4A4H_A 4A4F_A 2D9T_A.
Probab=32.21  E-value=1.3e+02  Score=24.16  Aligned_cols=34  Identities=12%  Similarity=0.055  Sum_probs=28.3

Q ss_pred             eeEEEEEee-cCcEEEEeeeEeeCCCcEEEEEEcC
Q psy2771           9 QDICLSTFS-FNSLLTLPNIAYYFEKHIILFHCLQ   42 (174)
Q Consensus         9 ~~~~~~~~~-~~~~~~a~~v~~d~~~DlAllkv~~   42 (174)
                      -|.|.-.++ ||..|+|++...+.+.+-|++++..
T Consensus        72 Gd~C~A~~s~Dg~~Y~A~I~~i~~~~~~~~V~f~g  106 (264)
T PF06003_consen   72 GDKCMAVYSEDGQYYPATIESIDEEDGTCVVVFTG  106 (264)
T ss_dssp             T-EEEEE-TTTSSEEEEEEEEEETTTTEEEEEETT
T ss_pred             CCEEEEEECCCCCEEEEEEEEEcCCCCEEEEEEcc
Confidence            357888886 8999999999999999999999974


No 82 
>PF00741 Gas_vesicle:  Gas vesicle protein;  InterPro: IPR000638 Gas vesicles are small, hollow, gas-filled protein structures found in several cyanobacterial and archaebacterial microorganisms []. They allow the positioning of the bacteria at the favourable depth for growth. Gas vesicles are hollow cylindrical tubes, closed by a hollow, conical cap at each end. Both the conical end caps and central cylinder are made up of 4-5 nm wide ribs that run at right angles to the long axis of the structure. Gas vesicles seem to be constituted of two different protein components, GVPa and GVPc. GVPa, a small protein of about 70 amino acid residues, is the main constituent of gas vesicles and forms the essential core of the structure. The sequence of GVPa is extremely well conserved. GvpJ and gvpM, two proteins encoded in the cluster of genes required for gas vesicle synthesis in the archaebacteria Halobacterium salinarium and Halobacterium mediterranei (Haloferax mediterranei), have been found [] to be evolutionarily related to GVPa. The exact function of these two proteins is not known, although they could be important for determining the shape of gas vesicles. The N-terminal domain of Aphanizomenon flos-aquae protein gvpA/J is also related to GVPa.; GO: 0005198 structural molecule activity, 0012506 vesicle membrane
Probab=32.14  E-value=98  Score=17.36  Aligned_cols=30  Identities=20%  Similarity=0.119  Sum_probs=25.8

Q ss_pred             HHHHHHHHHhCCeeeeeecCceeeeeeeee
Q psy2771         142 AIEFLTNYKRKGKFCAYSKGKSDLRTEVLY  171 (174)
Q Consensus       142 i~~~l~~l~~~g~~~~~~lg~~~~~~e~~~  171 (174)
                      +.++++++..+|-+...++-++....|+..
T Consensus         2 L~d~LdriLdkGvVi~gdi~isva~veLl~   31 (39)
T PF00741_consen    2 LVDLLDRILDKGVVIDGDIRISVAGVELLT   31 (39)
T ss_pred             HHHHHHHHcCCceEEEEEEEEEEcceEEEE
Confidence            568899999999999999988888877764


No 83 
>cd04635 CBS_pair_22 The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria.  The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair.  The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here.  It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic gener
Probab=32.12  E-value=50  Score=21.82  Aligned_cols=21  Identities=24%  Similarity=0.325  Sum_probs=17.6

Q ss_pred             CCCccceEEcCCCcEEEEEee
Q psy2771         107 FGNSGGPLVNLDGEVIGINSM  127 (174)
Q Consensus       107 ~G~SGGPl~n~~G~liGI~~~  127 (174)
                      .+.+.-|+++.+|+++|+++.
T Consensus        96 ~~~~~~~Vvd~~g~~~Gvit~  116 (122)
T cd04635          96 HDIGRLPVVNEKDQLVGIVDR  116 (122)
T ss_pred             cCCCeeeEEcCCCcEEEEEEh
Confidence            566677999988999999985


No 84 
>PF10049 DUF2283:  Protein of unknown function (DUF2283);  InterPro: IPR019270  Members of this family of hypothetical proteins have no known function. 
Probab=31.54  E-value=70  Score=18.60  Aligned_cols=16  Identities=19%  Similarity=0.204  Sum_probs=13.6

Q ss_pred             eEeeCCCcEEEEEEcC
Q psy2771          27 IAYYFEKHIILFHCLQ   42 (174)
Q Consensus        27 v~~d~~~DlAllkv~~   42 (174)
                      +.||++.|.+.|++..
T Consensus         3 i~YD~~~D~lyi~l~~   18 (50)
T PF10049_consen    3 IEYDPEADALYIRLSD   18 (50)
T ss_pred             eEEcCcCCEEEEEECC
Confidence            6799999999999953


No 85 
>PF09012 FeoC:  FeoC like transcriptional regulator;  InterPro: IPR015102 This entry contains several transcriptional regulators, including FeoC, which contain a HTH motif. FeoC acts as a [Fe-S] dependent transcriptional repressor []. ; PDB: 1XN7_A 2K02_A.
Probab=30.29  E-value=58  Score=20.11  Aligned_cols=26  Identities=23%  Similarity=0.280  Sum_probs=20.1

Q ss_pred             EEEEeHHHHHHHHHHHHhCCeeeeee
Q psy2771         134 SFAIPIDYAIEFLTNYKRKGKFCAYS  159 (174)
Q Consensus       134 ~~aiPi~~i~~~l~~l~~~g~~~~~~  159 (174)
                      .|-++.+.+...++.|.+.|++.+-.
T Consensus        23 ~~~~s~~~ve~mL~~l~~kG~I~~~~   48 (69)
T PF09012_consen   23 EFGISPEAVEAMLEQLIRKGYIRKVD   48 (69)
T ss_dssp             HTT--HHHHHHHHHHHHCCTSCEEEE
T ss_pred             HHCcCHHHHHHHHHHHHHCCcEEEec
Confidence            35588899999999999999987543


No 86 
>cd01717 Sm_B The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation.  Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. Sm subunit B heterodimerizes with subunit D3 and three such heterodimers form a hexameric ring structure with alternating B and D3 subunits.  The D3 - B heterodimer also assembles into a heptameric ring containing D1, D2, E, F, and G subunits.  Sm-like proteins exist in archaea as well as prokaryotes which form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=30.24  E-value=1.2e+02  Score=19.34  Aligned_cols=29  Identities=14%  Similarity=0.123  Sum_probs=23.6

Q ss_pred             EEEeecCcEEEEeeeEeeCCCcEEEEEEc
Q psy2771          13 LSTFSFNSLLTLPNIAYYFEKHIILFHCL   41 (174)
Q Consensus        13 ~~~~~~~~~~~a~~v~~d~~~DlAllkv~   41 (174)
                      ...+.+|+.+.....++|....+.|=...
T Consensus        14 ~V~l~dgR~~~G~L~~~D~~~NlVL~~~~   42 (79)
T cd01717          14 RVTLQDGRQFVGQFLAFDKHMNLVLSDCE   42 (79)
T ss_pred             EEEECCCcEEEEEEEEEcCccCEEcCCEE
Confidence            34459999999999999999988865553


No 87 
>cd00218 GlcAT-I Beta1,3-glucuronyltransferase I (GlcAT-I) is involved in the initial steps of proteoglycan synthesis. Beta1,3-glucuronyltransferase I (GlcAT-I) domain; GlcAT-I is a key enzyme involved in the initial steps of proteoglycan synthesis. GlcAT-I catalyzes the transfer of a glucuronic acid moiety from the uridine diphosphate-glucuronic acid (UDP-GlcUA) to the common linkage region of trisaccharide Gal-beta-(1-3)-Gal-beta-(1-4)-Xyl  of proteoglycans. The enzyme has two subdomains that bind the donor and acceptor substrate separately.  The active site is located at the cleft between both subdomains in which the trisaccharide molecule is oriented perpendicular to the UDP. This family has been classified as Glycosyltransferase family 43 (GT-43).
Probab=30.12  E-value=63  Score=25.44  Aligned_cols=32  Identities=22%  Similarity=0.369  Sum_probs=24.0

Q ss_pred             cceEEcCCCcEEEEEeeecC------CCeEEEEeHHHHH
Q psy2771         111 GGPLVNLDGEVIGINSMKVT------AGISFAIPIDYAI  143 (174)
Q Consensus       111 GGPl~n~~G~liGI~~~~~~------~~~~~aiPi~~i~  143 (174)
                      -||+++ +|+|+|-.+.-..      +-.+||+.+..+.
T Consensus       136 egP~c~-~gkV~gw~~~w~~~R~f~idmAGFA~n~~ll~  173 (223)
T cd00218         136 EGPVCE-NGKVVGWHTAWKPERPFPIDMAGFAFNSKLLW  173 (223)
T ss_pred             eccEee-CCeEeEEecCCCCCCCCcceeeeEEEehhhhc
Confidence            479997 8999999996543      2358888877664


No 88 
>cd05701 S1_Rrp5_repeat_hs10 S1_Rrp5_repeat_hs10: Rrp5 is a trans-acting factor important for biogenesis of both the 40S and 60S eukaryotic ribosomal subunits. Rrp5 has two distinct regions, an N-terminal region containing tandemly repeated S1 RNA-binding domains (12 S1 repeats in Saccharomyces cerevisiae Rrp5 and 14 S1 repeats in Homo sapiens Rrp5) and a C-terminal region containing tetratricopeptide repeat (TPR) motifs thought to be involved in protein-protein interactions. Mutational studies have shown that each region represents a specific functional domain. Deletions within the S1-containing region inhibit pre-rRNA processing at either site A3 or A2, whereas deletions within the TPR region confer an inability to support cleavage of A0-A2. This CD includes H. sapiens S1 repeat 10 (hs10). Rrp5 is found in eukaryotes but not in prokaryotes or archaea.
Probab=29.88  E-value=1.4e+02  Score=18.75  Aligned_cols=14  Identities=7%  Similarity=0.181  Sum_probs=8.9

Q ss_pred             CCCCCCCCCEEEEE
Q psy2771          52 KAADIRNGEFVIAM   65 (174)
Q Consensus        52 ~~~~~~~G~~v~~~   65 (174)
                      +++.+++|+.+.+.
T Consensus        42 ~seklkvG~~l~v~   55 (69)
T cd05701          42 DSEKLSVGQCLDVT   55 (69)
T ss_pred             cceeeeccceEEEE
Confidence            45567777776664


No 89 
>cd04605 CBS_pair_MET2_assoc This cd contains two tandem repeats of the cystathionine beta-synthase (CBS pair) domains associated with the MET2 domain. Met2 is a key enzyme in the biosynthesis of methionine.  It encodes a homoserine transacetylase involved in converting homoserine to O-acetyl homoserine. CBS is a small domain originally identified in cystathionine beta-synthase and subsequently found in a wide range of different proteins. CBS domains usually come in tandem repeats, which associate to form a so-called Bateman domain or a CBS pair which is reflected in this model. The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains. It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown.
Probab=29.86  E-value=53  Score=21.23  Aligned_cols=22  Identities=32%  Similarity=0.450  Sum_probs=17.3

Q ss_pred             CCCccceEEcCCCcEEEEEeee
Q psy2771         107 FGNSGGPLVNLDGEVIGINSMK  128 (174)
Q Consensus       107 ~G~SGGPl~n~~G~liGI~~~~  128 (174)
                      .+..--|+++.+|+++|+++..
T Consensus        84 ~~~~~~~Vv~~~~~~~G~v~~~  105 (110)
T cd04605          84 HNISALPVVDAENRVIGIITSE  105 (110)
T ss_pred             hCCCEEeEECCCCcEEEEEEHH
Confidence            4445678999899999999863


No 90 
>cd04621 CBS_pair_8 The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria.  The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair.  The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here.  It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic genera
Probab=29.71  E-value=58  Score=22.54  Aligned_cols=20  Identities=20%  Similarity=0.348  Sum_probs=17.3

Q ss_pred             CCccceEEcCCCcEEEEEee
Q psy2771         108 GNSGGPLVNLDGEVIGINSM  127 (174)
Q Consensus       108 G~SGGPl~n~~G~liGI~~~  127 (174)
                      +.+.-||+|.+|+++|+++.
T Consensus        23 ~~~~l~V~d~~~~~~Giv~~   42 (135)
T cd04621          23 GVGRVIVVDDNGKPVGVITY   42 (135)
T ss_pred             CCCcceEECCCCCEEEEEeH
Confidence            56778999999999999984


No 91 
>TIGR00074 hypC_hupF hydrogenase assembly chaperone HypC/HupF. An additional proposed function is to shuttle the iron atom that has been liganded at the HypC/HypD complex to the precursor of the large hydrogenase (HycE) subunit. PubMed:12441107.
Probab=29.65  E-value=1.3e+02  Score=19.45  Aligned_cols=44  Identities=14%  Similarity=0.218  Sum_probs=26.8

Q ss_pred             EEEeeeEeeCCCcEEEEEEcCCCCCceeecCCCCCCCCCEEEE-EecC
Q psy2771          22 LTLPNIAYYFEKHIILFHCLQNNYPALKLGKAADIRNGEFVIA-MGSP   68 (174)
Q Consensus        22 ~~a~~v~~d~~~DlAllkv~~~~~~~~~l~~~~~~~~G~~v~~-~G~p   68 (174)
                      ++++++..+.  +.|++..... ...+.+.=..++++||.|++ .||.
T Consensus         5 iP~~V~~i~~--~~A~v~~~G~-~~~v~l~lv~~~~vGD~VLVH~G~A   49 (76)
T TIGR00074         5 IPGQVVEIDE--NIALVEFCGI-KRDVSLDLVGEVKVGDYVLVHVGFA   49 (76)
T ss_pred             cceEEEEEcC--CEEEEEcCCe-EEEEEEEeeCCCCCCCEEEEecChh
Confidence            4667777665  4688877632 12233322245799999998 5654


No 92 
>cd04604 CBS_pair_KpsF_GutQ_assoc This cd contains two tandem repeats of the cystathionine beta-synthase (CBS pair) domains associated with KpsF/GutQ domains in the API [A5P (D-arabinose 5-phosphate) isomerase] protein.  These APIs catalyze the conversion of the pentose pathway intermediate D-ribulose 5-phosphate into A5P, a precursor of 3-deoxy-D-manno-octulosonate, which is an integral carbohydrate component of various glycolipids coating the surface of the outer membrane of Gram-negative bacteria, including lipopolysaccharide and many group 2 K-antigen capsules. CBS is a small domain originally identified in cystathionine beta-synthase and subsequently found in a wide range of different proteins. CBS domains usually come in tandem repeats, which associate to form a so-called Bateman domain or a CBS pair which is reflected in this model.  The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other funct
Probab=29.49  E-value=61  Score=20.98  Aligned_cols=21  Identities=19%  Similarity=0.336  Sum_probs=17.3

Q ss_pred             CCCccceEEcCCCcEEEEEee
Q psy2771         107 FGNSGGPLVNLDGEVIGINSM  127 (174)
Q Consensus       107 ~G~SGGPl~n~~G~liGI~~~  127 (174)
                      .+.+.-|++|.+|+++|+++.
T Consensus        88 ~~~~~~~Vv~~~~~~iG~it~  108 (114)
T cd04604          88 NKITALPVVDDNGRPVGVLHI  108 (114)
T ss_pred             cCCCEEEEECCCCCEEEEEEH
Confidence            455677999888999999875


No 93 
>cd04585 CBS_pair_ACT_assoc2 This cd contains two tandem repeats of the cystathionine beta-synthase (CBS pair) domains in  the acetoin utilization proteins in bacteria. Acetoin is a product of fermentative metabolism in many prokaryotic and eukaryotic microorganisms.  They produce acetoin as an external carbon storage compound and then later reuse it as a carbon and energy source during their stationary phase and sporulation. In addition these CBS domains are associated with a downstream ACT domain, which is linked to a wide range of metabolic enzymes that are regulated by amino acid concentration. Pairs of ACT domains bind specifically to a particular amino acid leading to regulation of the linked enzyme. CBS is a small domain originally identified in cystathionine beta-synthase and subsequently found in a wide range of different proteins. CBS domains usually come in tandem repeats, which associate to form a so-called Bateman domain or a CBS pair which is reflected in this model. The i
Probab=29.09  E-value=63  Score=21.13  Aligned_cols=22  Identities=32%  Similarity=0.433  Sum_probs=18.0

Q ss_pred             CCCccceEEcCCCcEEEEEeee
Q psy2771         107 FGNSGGPLVNLDGEVIGINSMK  128 (174)
Q Consensus       107 ~G~SGGPl~n~~G~liGI~~~~  128 (174)
                      .+.+.-|++|.+|+++|+++..
T Consensus        96 ~~~~~~~Vv~~~~~~~Gvvt~~  117 (122)
T cd04585          96 RKISGLPVVDDQGRLVGIITES  117 (122)
T ss_pred             cCCCceeEECCCCcEEEEEEHH
Confidence            5667789998889999999853


No 94 
>cd04632 CBS_pair_19 The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria.  The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair.  The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here.  It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic gener
Probab=29.06  E-value=62  Score=21.73  Aligned_cols=21  Identities=33%  Similarity=0.491  Sum_probs=17.6

Q ss_pred             CCCccceEEcCCCcEEEEEee
Q psy2771         107 FGNSGGPLVNLDGEVIGINSM  127 (174)
Q Consensus       107 ~G~SGGPl~n~~G~liGI~~~  127 (174)
                      .+.++-|++|.+|+++|+++.
T Consensus        22 ~~~~~~~Vv~~~~~~~G~it~   42 (128)
T cd04632          22 HGISRLPVVDDNGKLTGIVTR   42 (128)
T ss_pred             cCCCEEEEECCCCcEEEEEEH
Confidence            456678999999999999993


No 95 
>cd04803 CBS_pair_15 The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria.  The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair.  The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here.  It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic gener
Probab=29.02  E-value=55  Score=21.63  Aligned_cols=22  Identities=23%  Similarity=0.268  Sum_probs=16.7

Q ss_pred             CCCccceEEcCCCcEEEEEeee
Q psy2771         107 FGNSGGPLVNLDGEVIGINSMK  128 (174)
Q Consensus       107 ~G~SGGPl~n~~G~liGI~~~~  128 (174)
                      .+...=|++|.+|+++|+++..
T Consensus        96 ~~~~~~~Vv~~~~~~~Gvit~~  117 (122)
T cd04803          96 NKIGCLPVVDDKGTLVGIITRS  117 (122)
T ss_pred             cCCCeEEEEcCCCCEEEEEEHH
Confidence            3445568888889999999853


No 96 
>cd04588 CBS_pair_CAP-ED_DUF294_assoc_arch This cd contains two tandem repeats of the cystathionine beta-synthase (CBS pair) domains associated with the archaeal CAP_ED (cAMP receptor protein effector domain) family of transcription factors and the DUF294 domain.  Members of CAP_ED, include CAP which binds cAMP, FNR (fumarate and nitrate reductase) which uses an iron-sulfur cluster to sense oxygen, and CooA a heme containing CO sensor. In all cases binding of the effector leads to conformational changes and the ability to activate transcription. DUF294 is a putative nucleotidyltransferase with a conserved DxD motif. CBS is a small domain originally identified in cystathionine beta-synthase and subsequently found in a wide range of different proteins. CBS domains usually come in tandem repeats, which associate to form a so-called Bateman domain or a CBS pair which is reflected in this model.  The interface between the two CBS domains forms a cleft that is a potential ligand binding site.
Probab=28.99  E-value=53  Score=21.22  Aligned_cols=21  Identities=14%  Similarity=0.150  Sum_probs=16.6

Q ss_pred             CCccceEEcCCCcEEEEEeee
Q psy2771         108 GNSGGPLVNLDGEVIGINSMK  128 (174)
Q Consensus       108 G~SGGPl~n~~G~liGI~~~~  128 (174)
                      +...-|++|.+|+++|+++..
T Consensus        85 ~~~~~~V~~~~~~~~G~i~~~  105 (110)
T cd04588          85 NVGRLIVTDDEGRPVGIITRT  105 (110)
T ss_pred             CCCEEEEECCCCCEEEEEEhH
Confidence            445678888889999999864


No 97 
>cd04598 CBS_pair_GGDEF_assoc This cd contains two tandem repeats of the cystathionine beta-synthase (CBS pair) domains in association with the GGDEF (DiGuanylate-Cyclase (DGC)) domain. The GGDEF domain has been suggested to be homologous to the adenylyl cyclase catalytic domain and is thought to be involved in regulating cell surface adhesiveness in bacteria. CBS is a small domain originally identified in cystathionine beta-synthase and subsequently found in a wide range of different proteins. CBS domains usually come in tandem repeats, which associate to form a so-called Bateman domain or a CBS pair which is reflected in this model. The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains. It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown.
Probab=28.79  E-value=47  Score=21.84  Aligned_cols=18  Identities=33%  Similarity=0.658  Sum_probs=14.7

Q ss_pred             cceEEcCCCcEEEEEeee
Q psy2771         111 GGPLVNLDGEVIGINSMK  128 (174)
Q Consensus       111 GGPl~n~~G~liGI~~~~  128 (174)
                      ..++++.+|+++|+++..
T Consensus        97 ~~~vv~~~~~~~Gvvs~~  114 (119)
T cd04598          97 DGFIVTEEGRYLGIGTVK  114 (119)
T ss_pred             ccEEEeeCCeEEEEEEHH
Confidence            457888899999999853


No 98 
>cd04623 CBS_pair_10 The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria.  The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair.  The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here.  It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic gener
Probab=28.32  E-value=58  Score=21.00  Aligned_cols=20  Identities=25%  Similarity=0.277  Sum_probs=16.5

Q ss_pred             CCccceEEcCCCcEEEEEee
Q psy2771         108 GNSGGPLVNLDGEVIGINSM  127 (174)
Q Consensus       108 G~SGGPl~n~~G~liGI~~~  127 (174)
                      +.+.=|++|.+++++|+++.
T Consensus        23 ~~~~~~V~~~~~~~~Giv~~   42 (113)
T cd04623          23 NIGAVVVVDDGGRLVGIFSE   42 (113)
T ss_pred             CCCeEEEECCCCCEEEEEeh
Confidence            45566899988999999994


No 99 
>PF08275 Toprim_N:  DNA primase catalytic core, N-terminal domain;  InterPro: IPR013264 This is the N-terminal, catalytic core domain of DNA primases. DNA primase (EC 2.7.7) is a nucleotidyltransferase which synthesizes the oligoribonucleotide primers required for DNA replication on the lagging strand of the replication fork. It can also prime the leading strand and has been implicated in cell division. ; PDB: 1EQN_E 1DD9_A 3B39_B 1DDE_A 2AU3_A.
Probab=27.80  E-value=69  Score=22.59  Aligned_cols=17  Identities=24%  Similarity=0.632  Sum_probs=13.3

Q ss_pred             eEEcCCCcEEEEEeeec
Q psy2771         113 PLVNLDGEVIGINSMKV  129 (174)
Q Consensus       113 Pl~n~~G~liGI~~~~~  129 (174)
                      |+.|.+|+|||......
T Consensus        82 PI~d~~G~vvgF~gR~l   98 (128)
T PF08275_consen   82 PIRDERGRVVGFGGRRL   98 (128)
T ss_dssp             EEE-TTS-EEEEEEEES
T ss_pred             EEEcCCCCEEEEecccC
Confidence            89999999999988766


No 100
>cd04608 CBS_pair_PALP_assoc This cd contains two tandem repeats of the cystathionine beta-synthase (CBS pair) domains associated with the pyridoxal-phosphate (PALP) dependent enzyme domain upstream.   The vitamin B6 complex comprises pyridoxine, pyridoxal, and pyridoxamine, as well as the 5'-phosphate esters of pyridoxal (PALP) and pyridoxamine, the last two being the biologically active coenzyme derivatives.  The members of the PALP family are principally involved in the biosynthesis of amino acids and amino acid-derived metabolites, but they are also found in the biosynthetic pathways of amino sugars and other amine-containing compounds.  CBS is a small domain originally identified in cystathionine beta-synthase and subsequently found in a wide range of different proteins. CBS domains usually come in tandem repeats, which associate to form a so-called Bateman domain or a CBS pair which is reflected in this model. The interface between the two CBS domains forms a cleft that is a poten
Probab=27.63  E-value=59  Score=22.07  Aligned_cols=21  Identities=24%  Similarity=0.558  Sum_probs=17.1

Q ss_pred             CCccceEEcCCCcEEEEEeee
Q psy2771         108 GNSGGPLVNLDGEVIGINSMK  128 (174)
Q Consensus       108 G~SGGPl~n~~G~liGI~~~~  128 (174)
                      +.+.-|++|.+|+++|+++..
T Consensus        24 ~~~~~~Vvd~~~~~~Gii~~~   44 (124)
T cd04608          24 GFDQLPVVDESGKILGMVTLG   44 (124)
T ss_pred             CCCEEEEEcCCCCEEEEEEHH
Confidence            455778999999999999853


No 101
>cd04639 CBS_pair_26 The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria.  The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair.  The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here.  It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic gener
Probab=27.59  E-value=66  Score=20.80  Aligned_cols=22  Identities=23%  Similarity=0.522  Sum_probs=17.6

Q ss_pred             CCCccceEEcCCCcEEEEEeee
Q psy2771         107 FGNSGGPLVNLDGEVIGINSMK  128 (174)
Q Consensus       107 ~G~SGGPl~n~~G~liGI~~~~  128 (174)
                      .+...-|++|.+|+++|+++..
T Consensus        85 ~~~~~~~Vv~~~~~~~G~it~~  106 (111)
T cd04639          85 GGAPAVPVVDGSGRLVGLVTLE  106 (111)
T ss_pred             cCCceeeEEcCCCCEEEEEEHH
Confidence            4566779998889999999863


No 102
>cd06168 LSm9 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation.  Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm9 proteins have a single Sm-like domain structure.  Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=27.56  E-value=1.7e+02  Score=18.72  Aligned_cols=30  Identities=13%  Similarity=-0.119  Sum_probs=24.1

Q ss_pred             EEEEeecCcEEEEeeeEeeCCCcEEEEEEc
Q psy2771          12 CLSTFSFNSLLTLPNIAYYFEKHIILFHCL   41 (174)
Q Consensus        12 ~~~~~~~~~~~~a~~v~~d~~~DlAllkv~   41 (174)
                      ....+.||+.+.....++|...++.+=...
T Consensus        13 v~V~l~dgR~~~G~l~~~D~~~NivL~~~~   42 (75)
T cd06168          13 MRIHMTDGRTLVGVFLCTDRDCNIILGSAQ   42 (75)
T ss_pred             EEEEEcCCeEEEEEEEEEcCCCcEEecCcE
Confidence            344559999999999999998888765553


No 103
>PRK09371 gas vesicle synthesis protein GvpA; Provisional
Probab=27.16  E-value=1.2e+02  Score=19.11  Aligned_cols=33  Identities=18%  Similarity=0.020  Sum_probs=28.6

Q ss_pred             HHHHHHHHHHHHhCCeeeeeecCceeeeeeeee
Q psy2771         139 IDYAIEFLTNYKRKGKFCAYSKGKSDLRTEVLY  171 (174)
Q Consensus       139 i~~i~~~l~~l~~~g~~~~~~lg~~~~~~e~~~  171 (174)
                      ...+.++++++..+|-+...++-++....|++.
T Consensus         6 s~sLadvldriLDKGiVI~adi~VSl~gieLL~   38 (68)
T PRK09371          6 SSSLAEVIDRILDKGIVVDAWVRVSLVGIELLA   38 (68)
T ss_pred             cccHHHHHHHHccCCeEEEEEEEEEEeeeEEEE
Confidence            346789999999999999999999988888775


No 104
>cd02205 CBS_pair The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria.  The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair.  The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here.  It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic generali
Probab=27.09  E-value=63  Score=20.37  Aligned_cols=21  Identities=29%  Similarity=0.507  Sum_probs=17.3

Q ss_pred             CCCccceEEcCCCcEEEEEee
Q psy2771         107 FGNSGGPLVNLDGEVIGINSM  127 (174)
Q Consensus       107 ~G~SGGPl~n~~G~liGI~~~  127 (174)
                      .+.+--|++|.+|+++|+.+.
T Consensus        87 ~~~~~~~V~~~~~~~~G~i~~  107 (113)
T cd02205          87 HGIRRLPVVDDEGRLVGIVTR  107 (113)
T ss_pred             cCCCEEEEEcCCCcEEEEEEH
Confidence            455667899999999999875


No 105
>cd04593 CBS_pair_EriC_assoc_bac_arch This cd contains two tandem repeats of the cystathionine beta-synthase (CBS pair) domains in the EriC CIC-type chloride channels in bacteria and archaea. These ion channels are proteins with a seemingly simple task of allowing the passive flow of chloride ions across biological membranes. CIC-type chloride channels come from all kingdoms of life, have several gene families, and can be gated by voltage. The members of the CIC-type chloride channel are double-barreled: two proteins forming homodimers at a broad interface formed by four helices from each protein. The two pores are not found at this interface, but are completely contained within each subunit, as deduced from the mutational analyses, unlike many other channels, in which four or five identical or structurally related subunits jointly form one pore. CBS is a small domain originally identified in cystathionine beta-synthase and subsequently found in a wide range of different proteins. CBS d
Probab=26.66  E-value=59  Score=21.31  Aligned_cols=22  Identities=27%  Similarity=0.449  Sum_probs=17.3

Q ss_pred             CCCccceEEcCC--CcEEEEEeee
Q psy2771         107 FGNSGGPLVNLD--GEVIGINSMK  128 (174)
Q Consensus       107 ~G~SGGPl~n~~--G~liGI~~~~  128 (174)
                      .+..--||+|..  |+++|+++..
T Consensus        87 ~~~~~~~Vvd~~~~~~~~Gvit~~  110 (115)
T cd04593          87 RGLRQLPVVDRGNPGQVLGLLTRE  110 (115)
T ss_pred             cCCceeeEEeCCCCCeEEEEEEhH
Confidence            555567999887  8999999863


No 106
>cd04595 CBS_pair_DHH_polyA_Pol_assoc This cd contains two tandem repeats of the cystathionine beta-synthase (CBS pair) domains associated with an upstream DHH domain which performs a phosphoesterase function and a downstream polyA polymerase domain. CBS is a small domain originally identified in cystathionine beta-synthase and subsequently found in a wide range of different proteins. CBS domains usually come in tandem repeats, which associate to form a so-called Bateman domain or a CBS pair which is reflected in this model. The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains. It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown.
Probab=26.63  E-value=54  Score=21.23  Aligned_cols=20  Identities=30%  Similarity=0.534  Sum_probs=15.0

Q ss_pred             CCccceEEcCCCcEEEEEeee
Q psy2771         108 GNSGGPLVNLDGEVIGINSMK  128 (174)
Q Consensus       108 G~SGGPl~n~~G~liGI~~~~  128 (174)
                      +..-=|+++ +|+++|+++..
T Consensus        86 ~~~~~~V~~-~~~~~Gvvt~~  105 (110)
T cd04595          86 DIGRVPVVE-DGRLVGIVTRT  105 (110)
T ss_pred             CCCeeEEEe-CCEEEEEEEhH
Confidence            334458888 89999999864


No 107
>PF04085 MreC:  rod shape-determining protein MreC;  InterPro: IPR007221 MreC (murein formation C) is involved in the rod shape determination in Escherichia coli, and more generally in cell shape determination of bacteria whether or not they are rod-shaped.; GO: 0008360 regulation of cell shape; PDB: 2J5U_B 2QF4_B 2QF5_A.
Probab=26.53  E-value=2.5e+02  Score=20.27  Aligned_cols=57  Identities=18%  Similarity=0.159  Sum_probs=33.9

Q ss_pred             CCCcEEEEEEcCCC---CCceeecCCCCCCCCCEEEEEecCCCCCCceeecEEeeeccCc
Q psy2771          31 FEKHIILFHCLQNN---YPALKLGKAADIRNGEFVIAMGSPLTLNNTNTFGIISNKQRSS   87 (174)
Q Consensus        31 ~~~DlAllkv~~~~---~~~~~l~~~~~~~~G~~v~~~G~p~g~~~~~~~G~vs~~~~~~   87 (174)
                      ...+.++++=....   +.--.+....+++.||.|+..|...-+..-+.-|.|.......
T Consensus        66 ~~~~~Gi~~G~~~~~~~~~l~~i~~~~~i~~GD~V~TSG~~~~fP~Gi~VG~V~~v~~~~  125 (152)
T PF04085_consen   66 RSGDRGILRGDGSNTGLLKLEYIPKDADIKKGDIVVTSGLGGIFPPGIPVGTVSSVEPDK  125 (152)
T ss_dssp             CTTEEEEEEEEETTTTEEEEEEECTTS---TT-EEEEE-TTSSS-CCEEEEEEEEEECTT
T ss_pred             cCCeeEEEEeCCCCCceEEEEECCCCCCCCCCCEEEECCCCCcCCCCCEEEEEEEEEeCC
Confidence            33456888766433   2223344566799999999998765566678899998887653


No 108
>COG0517 FOG: CBS domain [General function prediction only]
Probab=26.52  E-value=69  Score=20.84  Aligned_cols=21  Identities=29%  Similarity=0.491  Sum_probs=18.2

Q ss_pred             CCCccceEEcCCC-cEEEEEee
Q psy2771         107 FGNSGGPLVNLDG-EVIGINSM  127 (174)
Q Consensus       107 ~G~SGGPl~n~~G-~liGI~~~  127 (174)
                      .+...-|++|.++ +++||++.
T Consensus        92 ~~~~~lpVv~~~~~~lvGivt~  113 (117)
T COG0517          92 HKIRRLPVVDDDGGKLVGIITL  113 (117)
T ss_pred             cCcCeEEEEECCCCeEEEEEEH
Confidence            5788899999986 99999985


No 109
>cd04637 CBS_pair_24 The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria.  The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair.  The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here.  It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic gener
Probab=26.40  E-value=65  Score=21.29  Aligned_cols=22  Identities=36%  Similarity=0.460  Sum_probs=17.5

Q ss_pred             CCCccceEEcCCCcEEEEEeee
Q psy2771         107 FGNSGGPLVNLDGEVIGINSMK  128 (174)
Q Consensus       107 ~G~SGGPl~n~~G~liGI~~~~  128 (174)
                      .+...-|++|.+|+++|+++..
T Consensus        96 ~~~~~~~vv~~~~~~~Gvit~~  117 (122)
T cd04637          96 NSISCLPVVDENGQLIGIITWK  117 (122)
T ss_pred             cCCCeEeEECCCCCEEEEEEHH
Confidence            4555679998889999999853


No 110
>cd01730 LSm3 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation.  Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm3 is one of at least seven subunits that assemble onto U6 snRNA to form a seven-membered ring structure.  Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=26.09  E-value=1.4e+02  Score=19.21  Aligned_cols=29  Identities=17%  Similarity=-0.018  Sum_probs=23.1

Q ss_pred             EEEEeecCcEEEEeeeEeeCCCcEEEEEE
Q psy2771          12 CLSTFSFNSLLTLPNIAYYFEKHIILFHC   40 (174)
Q Consensus        12 ~~~~~~~~~~~~a~~v~~d~~~DlAllkv   40 (174)
                      ....+.+|+.+...+.++|....|.+=..
T Consensus        14 V~V~l~~gr~~~G~L~~fD~~mNlvL~d~   42 (82)
T cd01730          14 VYVKLRGDRELRGRLHAYDQHLNMILGDV   42 (82)
T ss_pred             EEEEECCCCEEEEEEEEEccceEEeccce
Confidence            33445999999999999999888876443


No 111
>COG0490 Putative regulatory, ligand-binding protein related to C-terminal domains of K+ channels [Inorganic ion transport and metabolism]
Probab=26.03  E-value=81  Score=23.56  Aligned_cols=14  Identities=14%  Similarity=0.525  Sum_probs=11.6

Q ss_pred             CCCCCCCEEEEEec
Q psy2771          54 ADIRNGEFVIAMGS   67 (174)
Q Consensus        54 ~~~~~G~~v~~~G~   67 (174)
                      ..++.||.++++|-
T Consensus       133 ~vle~gDtlvviG~  146 (162)
T COG0490         133 TVLEAGDTLVVIGE  146 (162)
T ss_pred             hhhcCCCEEEEEec
Confidence            45799999999983


No 112
>cd04599 CBS_pair_GGDEF_assoc2 This cd contains two tandem repeats of the cystathionine beta-synthase (CBS pair) domains in association with the GGDEF (DiGuanylate-Cyclase (DGC)) domain. The GGDEF domain has been suggested to be homologous to the adenylyl cyclase catalytic domain and is thought to be involved in regulating cell surface adhesiveness in bacteria. CBS is a small domain originally identified in cystathionine beta-synthase and subsequently found in a wide range of different proteins. CBS domains usually come in tandem repeats, which associate to form a so-called Bateman domain or a CBS pair which is reflected in this model. The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains. It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown.
Probab=25.47  E-value=61  Score=20.67  Aligned_cols=21  Identities=14%  Similarity=0.132  Sum_probs=16.1

Q ss_pred             CCCccceEEcCCCcEEEEEeee
Q psy2771         107 FGNSGGPLVNLDGEVIGINSMK  128 (174)
Q Consensus       107 ~G~SGGPl~n~~G~liGI~~~~  128 (174)
                      .+...=|++|. |+++|+++..
T Consensus        80 ~~~~~~~Vv~~-~~~~G~it~~  100 (105)
T cd04599          80 KKIERLPVLRE-RKLVGIITKG  100 (105)
T ss_pred             cCCCEeeEEEC-CEEEEEEEHH
Confidence            45556788876 9999999853


No 113
>KOG3888|consensus
Probab=25.21  E-value=99  Score=26.29  Aligned_cols=45  Identities=18%  Similarity=0.257  Sum_probs=36.5

Q ss_pred             CccceEE--cCCCcEEEEEeeecCCCeEEEEeHHHHHHHHHHHHhCC
Q psy2771         109 NSGGPLV--NLDGEVIGINSMKVTAGISFAIPIDYAIEFLTNYKRKG  153 (174)
Q Consensus       109 ~SGGPl~--n~~G~liGI~~~~~~~~~~~aiPi~~i~~~l~~l~~~g  153 (174)
                      ..=.|++  |.+|+++-|+........-|-+|.+.++.+...++..-
T Consensus       292 i~r~~vI~ld~egr~~rIN~s~~~Rds~fdvp~e~v~~~y~a~~~F~  338 (407)
T KOG3888|consen  292 IWRAPVICLDDEGRVVRINFSNPQRDSIFDVPVEQVQPWYRALKLFV  338 (407)
T ss_pred             hhcCceEEEcccceEEEEecCCccccccccCCHHHHHHHHHHHHHHH
Confidence            3445777  77899999999887767789999999999998886543


No 114
>cd04625 CBS_pair_12 The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria.  The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair.  The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here.  It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic gener
Probab=25.21  E-value=62  Score=20.98  Aligned_cols=21  Identities=19%  Similarity=0.314  Sum_probs=15.8

Q ss_pred             CCCccceEEcCCCcEEEEEeee
Q psy2771         107 FGNSGGPLVNLDGEVIGINSMK  128 (174)
Q Consensus       107 ~G~SGGPl~n~~G~liGI~~~~  128 (174)
                      .+...-|+++ +|+++|+++..
T Consensus        87 ~~~~~l~Vv~-~~~~~Gvvt~~  107 (112)
T cd04625          87 RHLRYLPVLD-GGTLLGVISFH  107 (112)
T ss_pred             cCCCeeeEEE-CCEEEEEEEHH
Confidence            4455578887 69999999853


No 115
>cd04631 CBS_pair_18 The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria.  The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair.  The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here.  It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic gener
Probab=24.78  E-value=72  Score=21.13  Aligned_cols=22  Identities=32%  Similarity=0.533  Sum_probs=17.2

Q ss_pred             CCCccceEEcCCCcEEEEEeee
Q psy2771         107 FGNSGGPLVNLDGEVIGINSMK  128 (174)
Q Consensus       107 ~G~SGGPl~n~~G~liGI~~~~  128 (174)
                      .+...-||+|.+|+++|+++..
T Consensus        99 ~~~~~~~V~~~~~~~~Gvit~~  120 (125)
T cd04631          99 KRVGGLPVVDDDGKLVGIVTER  120 (125)
T ss_pred             cCCceEEEEcCCCcEEEEEEHH
Confidence            4555678888789999999853


No 116
>PF08448 PAS_4:  PAS fold;  InterPro: IPR013656 The PAS fold corresponds to the structural domain that has previously been defined as PAS and PAC motifs. The PAS fold appears in archaea, eubacteria and eukarya. ; PDB: 3K3D_A 3K3C_B 3KX0_X 3FC7_B 3LUQ_D 3MXQ_A 3BWL_C 3FG8_A.
Probab=24.46  E-value=80  Score=20.02  Aligned_cols=17  Identities=35%  Similarity=0.741  Sum_probs=14.8

Q ss_pred             ceEEcCCCcEEEEEeee
Q psy2771         112 GPLVNLDGEVIGINSMK  128 (174)
Q Consensus       112 GPl~n~~G~liGI~~~~  128 (174)
                      .|+.|.+|++.|++...
T Consensus        86 ~Pi~~~~g~~~g~~~~~  102 (110)
T PF08448_consen   86 SPIFDEDGEVVGVLVII  102 (110)
T ss_dssp             EEEECTTTCEEEEEEEE
T ss_pred             EEeEcCCCCEEEEEEEE
Confidence            69999999999998754


No 117
>cd04629 CBS_pair_16 The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria.  The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair.  The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here.  It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic gener
Probab=23.82  E-value=69  Score=20.76  Aligned_cols=20  Identities=40%  Similarity=0.699  Sum_probs=16.1

Q ss_pred             CCccceEEcCCCcEEEEEee
Q psy2771         108 GNSGGPLVNLDGEVIGINSM  127 (174)
Q Consensus       108 G~SGGPl~n~~G~liGI~~~  127 (174)
                      +.+.-|++|.+|+++|+++.
T Consensus        23 ~~~~~~V~~~~~~~~G~v~~   42 (114)
T cd04629          23 KISGGPVVDDNGNLVGFLSE   42 (114)
T ss_pred             CCCCccEECCCCeEEEEeeh
Confidence            34566899999999999983


No 118
>cd04594 CBS_pair_EriC_assoc_archaea This cd contains two tandem repeats of the cystathionine beta-synthase (CBS pair) domains associated with the EriC ClC-type chloride channels in archaea. These ion channels are proteins with a seemingly simple task of allowing the passive flow of chloride ions across biological membranes. ClC-type chloride channels come from all kingdoms of life, have several gene families, and can be gated by voltage. The members of the ClC-type chloride channel are double-barreled: two proteins forming homodimers at a broad interface formed by four helices from each protein. The two pores are not found at this interface, but are completely contained within each subunit, as deduced from the mutational analyses, unlike many other channels, in which four or five identical or structurally related subunits jointly form one pore. CBS is a small domain originally identified in cystathionine beta-synthase and subsequently found in a wide range of different proteins. CBS do
Probab=23.51  E-value=71  Score=20.58  Aligned_cols=21  Identities=29%  Similarity=0.458  Sum_probs=15.6

Q ss_pred             CCCccceEEcCCCcEEEEEeee
Q psy2771         107 FGNSGGPLVNLDGEVIGINSMK  128 (174)
Q Consensus       107 ~G~SGGPl~n~~G~liGI~~~~  128 (174)
                      .+..--|+++ +|+++|+++..
T Consensus        79 ~~~~~~~Vv~-~~~~iGvit~~   99 (104)
T cd04594          79 NKTRWCPVVD-DGKFKGIVTLD   99 (104)
T ss_pred             cCcceEEEEE-CCEEEEEEEHH
Confidence            4444578887 69999998853


No 119
>cd04612 CBS_pair_SpoIVFB_EriC_assoc This cd contains two tandem repeats of the cystathionine beta-synthase (CBS pair) domains in association with either the SpoIVFB domain (sporulation protein, stage IV cell wall formation, F locus, promoter-distal B) or the chloride channel protein EriC.  SpoIVFB is one of 4 proteins involved in endospore formation; the others are SpoIVFA (sporulation protein, stage IV cell wall formation, F locus, promoter-proximal A), BofA (bypass-of-forespore A), and SpoIVB (sporulation protein, stage IV cell wall formation, B locus).  SpoIVFB is negatively regulated by SpoIVFA and BofA and activated by SpoIVB.  It is thought that SpoIVFB, SpoIVFA, and BofA are located in the mother-cell membrane that surrounds the forespore and that SpoIVB is secreted from the forespore into the space between the two where it activates SpoIVFB. EriC is involved in inorganic ion transport and metabolism. CBS is a small domain originally identified in cystathionine beta-synthase an
Probab=23.36  E-value=88  Score=20.06  Aligned_cols=22  Identities=27%  Similarity=0.327  Sum_probs=17.9

Q ss_pred             CCCccceEEcCCCcEEEEEeee
Q psy2771         107 FGNSGGPLVNLDGEVIGINSMK  128 (174)
Q Consensus       107 ~G~SGGPl~n~~G~liGI~~~~  128 (174)
                      .+.+.-|++|.+|+++|+++..
T Consensus        85 ~~~~~~~V~~~~~~~~G~it~~  106 (111)
T cd04612          85 RDIGRLPVVDDSGRLVGIVSRS  106 (111)
T ss_pred             CCCCeeeEEcCCCCEEEEEEHH
Confidence            4566789998889999999864


No 120
>cd04630 CBS_pair_17 The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria.  The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair.  The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here.  It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic gener
Probab=23.31  E-value=71  Score=20.88  Aligned_cols=21  Identities=33%  Similarity=0.447  Sum_probs=15.7

Q ss_pred             CCCccceEEcCCCcEEEEEeee
Q psy2771         107 FGNSGGPLVNLDGEVIGINSMK  128 (174)
Q Consensus       107 ~G~SGGPl~n~~G~liGI~~~~  128 (174)
                      .+..--|++|. |+++|+++..
T Consensus        89 ~~~~~~~Vvd~-~~~~Gvi~~~  109 (114)
T cd04630          89 TNIRRAPVVEN-NELIGIISLT  109 (114)
T ss_pred             cCCCEeeEeeC-CEEEEEEEHH
Confidence            34555688876 9999999853


No 121
>cd01727 LSm8 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation.  Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm8 is one of at least seven subunits that assemble onto U6 snRNA to form a seven-membered ring structure.  Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=23.27  E-value=1.3e+02  Score=18.87  Aligned_cols=27  Identities=7%  Similarity=-0.066  Sum_probs=22.6

Q ss_pred             EEeecCcEEEEeeeEeeCCCcEEEEEE
Q psy2771          14 STFSFNSLLTLPNIAYYFEKHIILFHC   40 (174)
Q Consensus        14 ~~~~~~~~~~a~~v~~d~~~DlAllkv   40 (174)
                      ..+.+++.+.....++|....+.+=..
T Consensus        14 V~l~dgr~~~G~L~~~D~~~NlvL~~~   40 (74)
T cd01727          14 VITVDGRVIVGTLKGFDQATNLILDDS   40 (74)
T ss_pred             EEECCCcEEEEEEEEEccccCEEccce
Confidence            345999999999999999888777665


No 122
>cd01732 LSm5 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation.  Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm5 is one of at least seven subunits that assemble onto U6 snRNA to form a seven-membered ring structure.  Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=23.19  E-value=2e+02  Score=18.36  Aligned_cols=29  Identities=3%  Similarity=-0.072  Sum_probs=23.1

Q ss_pred             EEEEeecCcEEEEeeeEeeCCCcEEEEEE
Q psy2771          12 CLSTFSFNSLLTLPNIAYYFEKHIILFHC   40 (174)
Q Consensus        12 ~~~~~~~~~~~~a~~v~~d~~~DlAllkv   40 (174)
                      ....+.+++.+...+.++|..-.+.+=..
T Consensus        16 V~V~l~~gr~~~G~L~g~D~~mNlvL~da   44 (76)
T cd01732          16 IWIVMKSDKEFVGTLLGFDDYVNMVLEDV   44 (76)
T ss_pred             EEEEECCCeEEEEEEEEeccceEEEEccE
Confidence            33445999999999999999888876544


No 123
>cd04586 CBS_pair_BON_assoc This cd contains two tandem repeats of the cystathionine beta-synthase (CBS pair) domains associated with the BON (bacterial OsmY and nodulation domain) domain. BON is a putative phospholipid-binding domain found in a family of osmotic shock protection proteins. It is also found in some secretins and a group of potential haemolysins. Its likely function is attachment to phospholipid membranes. CBS is a small domain originally identified in cystathionine beta-synthase and subsequently found in a wide range of different proteins. CBS domains usually come in tandem repeats, which associate to form a so-called Bateman domain or a CBS pair which is reflected in this model. The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains. It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown.
Probab=23.16  E-value=80  Score=21.46  Aligned_cols=21  Identities=29%  Similarity=0.406  Sum_probs=17.3

Q ss_pred             CCCccceEEcCCCcEEEEEeee
Q psy2771         107 FGNSGGPLVNLDGEVIGINSMK  128 (174)
Q Consensus       107 ~G~SGGPl~n~~G~liGI~~~~  128 (174)
                      .+.+.-|++| +|+++||++..
T Consensus       110 ~~~~~l~Vvd-~g~~~Gvit~~  130 (135)
T cd04586         110 HRIKRVPVVR-GGRLVGIVSRA  130 (135)
T ss_pred             cCCCccCEec-CCEEEEEEEhH
Confidence            5666789999 89999999853


No 124
>cd04622 CBS_pair_9 The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria.  The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair.  The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here.  It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic genera
Probab=22.84  E-value=85  Score=20.28  Aligned_cols=19  Identities=37%  Similarity=0.602  Sum_probs=14.8

Q ss_pred             ccceEEcCCCcEEEEEeee
Q psy2771         110 SGGPLVNLDGEVIGINSMK  128 (174)
Q Consensus       110 SGGPl~n~~G~liGI~~~~  128 (174)
                      .--|+++.+|+++|+++..
T Consensus        90 ~~~~V~~~~~~~~G~it~~  108 (113)
T cd04622          90 RRLPVVDDDGRLVGIVSLG  108 (113)
T ss_pred             CeeeEECCCCcEEEEEEHH
Confidence            3448888889999998753


No 125
>cd01728 LSm1 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation.  Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm1 is one of at least seven subunits that assemble onto U6 snRNA to form a seven-membered ring structure.  Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=22.50  E-value=2.2e+02  Score=18.11  Aligned_cols=53  Identities=17%  Similarity=0.171  Sum_probs=33.1

Q ss_pred             EEEEeecCcEEEEeeeEeeCCCcEEEEEEcC-----CCCCceeecCCCCCCCCCEEEEEe
Q psy2771          12 CLSTFSFNSLLTLPNIAYYFEKHIILFHCLQ-----NNYPALKLGKAADIRNGEFVIAMG   66 (174)
Q Consensus        12 ~~~~~~~~~~~~a~~v~~d~~~DlAllkv~~-----~~~~~~~l~~~~~~~~G~~v~~~G   66 (174)
                      ....+.+|+.+.....++|+...+.+=...+     .......++.  -+-.|+.|..+|
T Consensus        15 v~V~l~~gr~~~G~L~~fD~~~NlvL~d~~E~~~~~~~~~~~~lG~--~viRG~~V~~ig   72 (74)
T cd01728          15 VVVLLRDGRKLIGILRSFDQFANLVLQDTVERIYVGDKYGDIPRGI--FIIRGENVVLLG   72 (74)
T ss_pred             EEEEEcCCeEEEEEEEEECCcccEEecceEEEEecCCccceeEeeE--EEEECCEEEEEE
Confidence            3445599999999999999988877755421     1111222222  244577777665


No 126
>PRK11543 gutQ D-arabinose 5-phosphate isomerase; Provisional
Probab=22.47  E-value=73  Score=25.79  Aligned_cols=22  Identities=18%  Similarity=0.410  Sum_probs=18.7

Q ss_pred             CCCccceEEcCCCcEEEEEeee
Q psy2771         107 FGNSGGPLVNLDGEVIGINSMK  128 (174)
Q Consensus       107 ~G~SGGPl~n~~G~liGI~~~~  128 (174)
                      .+...-||+|.+|+++|+++..
T Consensus       292 ~~~~~lpVvd~~~~lvGvIt~~  313 (321)
T PRK11543        292 RKITAAPVVDENGKLTGAINLQ  313 (321)
T ss_pred             cCCCEEEEEcCCCeEEEEEEHH
Confidence            6677789999899999999854


No 127
>cd04613 CBS_pair_SpoIVFB_EriC_assoc2 This cd contains two tandem repeats of the cystathionine beta-synthase (CBS pair) domains in association with either the SpoIVFB domain (sporulation protein, stage IV cell wall formation, F locus, promoter-distal B) or the chloride channel protein EriC.  SpoIVFB is one of 4 proteins involved in endospore formation; the others are SpoIVFA (sporulation protein, stage IV cell wall formation, F locus, promoter-proximal A), BofA (bypass-of-forespore A), and SpoIVB (sporulation protein, stage IV cell wall formation, B locus).  SpoIVFB is negatively regulated by SpoIVFA and BofA and activated by SpoIVB.  It is thought that SpoIVFB, SpoIVFA, and BofA are located in the mother-cell membrane that surrounds the forespore and that SpoIVB is secreted from the forespore into the space between the two where it activates SpoIVFB. EriC is involved in inorganic ion transport and metabolism. CBS is a small domain originally identified in cystathionine beta-synthase a
Probab=22.41  E-value=87  Score=20.14  Aligned_cols=20  Identities=35%  Similarity=0.607  Sum_probs=16.2

Q ss_pred             CCccceEEcCCCcEEEEEee
Q psy2771         108 GNSGGPLVNLDGEVIGINSM  127 (174)
Q Consensus       108 G~SGGPl~n~~G~liGI~~~  127 (174)
                      +.+.-|++|.+|+++|+++.
T Consensus        23 ~~~~~~v~~~~~~~~G~v~~   42 (114)
T cd04613          23 PENNFPVVDDDGRLVGIVSL   42 (114)
T ss_pred             CCcceeEECCCCCEEEEEEH
Confidence            34567899888999999994


No 128
>cd01720 Sm_D2 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation.  Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. Sm subunit D2 heterodimerizes with subunit D1 and three such heterodimers form a hexameric ring structure with alternating D1 and D2 subunits. The D1 - D2 heterodimer also assembles into a heptameric ring containing D2, D3, E, F, and G subunits. Sm-like proteins exist in archaea as well as prokaryotes which form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=22.33  E-value=2.2e+02  Score=18.79  Aligned_cols=30  Identities=17%  Similarity=-0.013  Sum_probs=24.3

Q ss_pred             EEEEeecCcEEEEeeeEeeCCCcEEEEEEc
Q psy2771          12 CLSTFSFNSLLTLPNIAYYFEKHIILFHCL   41 (174)
Q Consensus        12 ~~~~~~~~~~~~a~~v~~d~~~DlAllkv~   41 (174)
                      ....+.+++.+...+.++|....+.+=...
T Consensus        17 V~V~lr~~r~~~G~L~~fD~hmNlvL~d~~   46 (87)
T cd01720          17 VLINCRNNKKLLGRVKAFDRHCNMVLENVK   46 (87)
T ss_pred             EEEEEcCCCEEEEEEEEecCccEEEEcceE
Confidence            344459999999999999999988876553


No 129
>cd04584 CBS_pair_ACT_assoc This cd contains two tandem repeats of the cystathionine beta-synthase (CBS pair) domains in the acetoin utilization proteins in bacteria. Acetoin is a product of fermentative metabolism in many prokaryotic and eukaryotic microorganisms.  They produce acetoin as an external carbon storage compound and then later reuse it as a carbon and energy source during their stationary phase and sporulation. In addition these CBS domains are associated with a downstream ACT domain, which is linked to a wide range of metabolic enzymes that are regulated by amino acid concentration. Pairs of ACT domains bind specifically to a particular amino acid leading to regulation of the linked enzyme. CBS is a small domain originally identified in cystathionine beta-synthase and subsequently found in a wide range of different proteins. CBS domains usually come in tandem repeats, which associate to form a so-called Bateman domain or a CBS pair which is reflected in this model. The in
Probab=22.05  E-value=82  Score=20.64  Aligned_cols=20  Identities=25%  Similarity=0.398  Sum_probs=16.1

Q ss_pred             CCccceEEcCCCcEEEEEee
Q psy2771         108 GNSGGPLVNLDGEVIGINSM  127 (174)
Q Consensus       108 G~SGGPl~n~~G~liGI~~~  127 (174)
                      +.+.-|++|.+|+++|+++.
T Consensus        23 ~~~~~~V~d~~~~~~G~v~~   42 (121)
T cd04584          23 KIRHLPVVDEEGRLVGIVTD   42 (121)
T ss_pred             CCCcccEECCCCcEEEEEEH
Confidence            44456888999999999983


No 130
>PF08669 GCV_T_C:  Glycine cleavage T-protein C-terminal barrel domain;  InterPro: IPR013977  This entry shows glycine cleavage T-proteins, part of the glycine cleavage multienzyme complex (GCV) found in bacteria and the mitochondria of eukaryotes. GCV catalyses the catabolism of glycine in eukaryotes. The T-protein is an aminomethyl transferase. ; PDB: 3ADA_A 1VRQ_A 1X31_A 3AD9_A 3AD8_A 3AD7_A 3GIR_A 1WOO_A 1WOS_A 1WOR_A ....
Probab=21.93  E-value=1.1e+02  Score=19.89  Aligned_cols=23  Identities=22%  Similarity=0.345  Sum_probs=17.9

Q ss_pred             CCCccceEEcCCCcEEEEEeeec
Q psy2771         107 FGNSGGPLVNLDGEVIGINSMKV  129 (174)
Q Consensus       107 ~G~SGGPl~n~~G~liGI~~~~~  129 (174)
                      +=..|.|++..+|+.||.++...
T Consensus        32 ~~~~g~~v~~~~g~~vG~vTS~~   54 (95)
T PF08669_consen   32 PPRGGEPVYDEDGKPVGRVTSGA   54 (95)
T ss_dssp             --STTCEEEETTTEEEEEEEEEE
T ss_pred             CCCCCCEEEECCCcEEeEEEEEe
Confidence            34567899977999999999764


No 131
>cd01731 archaeal_Sm1 The archaeal sm1 proteins: The Sm proteins are conserved in all three domains of life and are always associated with U-rich RNA sequences. They function to mediate RNA-RNA interactions and RNA biogenesis.  All Sm proteins contain a common sequence motif in two segments, Sm1 and Sm2, separated by a short variable linker. Eukaryotic Sm proteins form part of specific small nuclear ribonucleoproteins (snRNPs) that are involved in the processing of pre-mRNAs to mature mRNAs, and are a major component of the eukaryotic spliceosome. Most snRNPs consist of seven Sm proteins (B/B', D1, D2, D3, E, F and G) arranged in a ring on a uridine-rich sequence (Sm site), plus a small nuclear RNA (snRNA) (either U1, U2, U5 or U4/6). Since archaebacteria do not have any splicing apparatus, Sm proteins of archaebacteria may play a more general role. Archaeal Lsm proteins are likely to represent the ancestral Sm domain.
Probab=21.61  E-value=2.1e+02  Score=17.55  Aligned_cols=29  Identities=10%  Similarity=-0.035  Sum_probs=24.5

Q ss_pred             EEEeecCcEEEEeeeEeeCCCcEEEEEEc
Q psy2771          13 LSTFSFNSLLTLPNIAYYFEKHIILFHCL   41 (174)
Q Consensus        13 ~~~~~~~~~~~a~~v~~d~~~DlAllkv~   41 (174)
                      ...+.+|+.+...+.++|...++.+-...
T Consensus        14 ~V~l~~g~~~~G~L~~~D~~mNlvL~~~~   42 (68)
T cd01731          14 LVKLKGGKEVRGRLKSYDQHMNLVLEDAE   42 (68)
T ss_pred             EEEECCCCEEEEEEEEECCcceEEEeeEE
Confidence            33459999999999999999998887764


No 132
>cd01722 Sm_F The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation.  Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. Sm subunit F is capable of forming both homo- and hetero-heptamer ring structures.  To form the hetero-heptamer, Sm subunit F initially binds subunits E and G to form a trimer which then assembles onto snRNA along with the D3/B and D1/D2 heterodimers.
Probab=20.91  E-value=2.2e+02  Score=17.54  Aligned_cols=31  Identities=3%  Similarity=-0.114  Sum_probs=24.6

Q ss_pred             EEEEEeecCcEEEEeeeEeeCCCcEEEEEEc
Q psy2771          11 ICLSTFSFNSLLTLPNIAYYFEKHIILFHCL   41 (174)
Q Consensus        11 ~~~~~~~~~~~~~a~~v~~d~~~DlAllkv~   41 (174)
                      .....+.+|+.+..++.++|..-++.+=.+.
T Consensus        13 ~V~V~Lk~g~~~~G~L~~~D~~mNi~L~~~~   43 (68)
T cd01722          13 PVIVKLKWGMEYKGTLVSVDSYMNLQLANTE   43 (68)
T ss_pred             EEEEEECCCcEEEEEEEEECCCEEEEEeeEE
Confidence            3444559999999999999998888875553


No 133
>COG1958 LSM1 Small nuclear ribonucleoprotein (snRNP) homolog [Transcription]
Probab=20.87  E-value=2.3e+02  Score=17.90  Aligned_cols=32  Identities=6%  Similarity=-0.065  Sum_probs=25.6

Q ss_pred             eEEEEEeecCcEEEEeeeEeeCCCcEEEEEEc
Q psy2771          10 DICLSTFSFNSLLTLPNIAYYFEKHIILFHCL   41 (174)
Q Consensus        10 ~~~~~~~~~~~~~~a~~v~~d~~~DlAllkv~   41 (174)
                      ......+.+|+.+..++.++|....+.+--+.
T Consensus        18 ~~V~V~lk~g~~~~G~L~~~D~~mNlvL~d~~   49 (79)
T COG1958          18 KRVLVKLKNGREYRGTLVGFDQYMNLVLDDVE   49 (79)
T ss_pred             CEEEEEECCCCEEEEEEEEEccceeEEEeceE
Confidence            34455569999999999999998888776654


No 134
>cd00600 Sm_like The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation.  Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet.  Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=20.79  E-value=2e+02  Score=16.95  Aligned_cols=29  Identities=10%  Similarity=-0.048  Sum_probs=23.9

Q ss_pred             EEEeecCcEEEEeeeEeeCCCcEEEEEEc
Q psy2771          13 LSTFSFNSLLTLPNIAYYFEKHIILFHCL   41 (174)
Q Consensus        13 ~~~~~~~~~~~a~~v~~d~~~DlAllkv~   41 (174)
                      ...+.+|+.+...+.++|...++.+-...
T Consensus        10 ~V~l~~g~~~~G~L~~~D~~~Ni~L~~~~   38 (63)
T cd00600          10 RVELKDGRVLEGVLVAFDKYMNLVLDDVE   38 (63)
T ss_pred             EEEECCCcEEEEEEEEECCCCCEEECCEE
Confidence            34459999999999999998888876664


No 135
>PRK00737 small nuclear ribonucleoprotein; Provisional
Probab=20.62  E-value=2.3e+02  Score=17.69  Aligned_cols=30  Identities=10%  Similarity=0.003  Sum_probs=24.6

Q ss_pred             EEEEeecCcEEEEeeeEeeCCCcEEEEEEc
Q psy2771          12 CLSTFSFNSLLTLPNIAYYFEKHIILFHCL   41 (174)
Q Consensus        12 ~~~~~~~~~~~~a~~v~~d~~~DlAllkv~   41 (174)
                      ....+.+|+.+...+.++|..-++.+=...
T Consensus        17 V~V~lk~g~~~~G~L~~~D~~mNlvL~d~~   46 (72)
T PRK00737         17 VLVRLKGGREFRGELQGYDIHMNLVLDNAE   46 (72)
T ss_pred             EEEEECCCCEEEEEEEEEcccceeEEeeEE
Confidence            334459999999999999999888877764


Done!