Query         016647
Match_columns 385
No_of_seqs    349 out of 3407
Neff          7.8 
Searched_HMMs 46136
Date          Fri Mar 29 08:44:15 2013
Command       hhsearch -i /work/01045/syshi/csienesis_hhblits_a3m/016647.a3m -d /work/01045/syshi/HHdatabase/Cdd.hhm -o /work/01045/syshi/hhsearch_cdd/016647hhsearch_cdd -cpu 12 -v 0 

 No Hit                             Prob E-value P-value  Score    SS Cols Query HMM  Template HMM
  1 PRK10139 serine endoprotease;  100.0 3.8E-46 8.2E-51  376.9  35.8  262  117-382    41-362 (455)
  2 PRK10898 serine endoprotease;  100.0 3.6E-45 7.8E-50  359.3  36.3  264  112-383    41-352 (353)
  3 TIGR02038 protease_degS peripl 100.0 5.5E-45 1.2E-49  358.2  36.3  263  112-382    41-350 (351)
  4 PRK10942 serine endoprotease;  100.0 1.1E-43 2.4E-48  360.5  34.5  261  117-381    39-382 (473)
  5 TIGR02037 degP_htrA_DO peripla 100.0 4.3E-42 9.4E-47  346.8  32.5  261  118-382     3-329 (428)
  6 COG0265 DegQ Trypsin-like seri 100.0   2E-34 4.3E-39  283.6  28.8  262  116-380    33-340 (347)
  7 KOG1320 Serine protease [Postt  99.9 2.4E-26 5.1E-31  227.9  19.9  267  116-382   128-470 (473)
  8 KOG1421 Predicted signaling-as  99.8 1.3E-19 2.7E-24  182.2  19.3  254  117-381    53-372 (955)
  9 PF13365 Trypsin_2:  Trypsin-li  99.7 2.7E-16 5.8E-21  130.0  12.9  109  154-293     1-120 (120)
 10 PF00089 Trypsin:  Trypsin;  In  99.4 1.3E-11 2.8E-16  112.1  18.6  146  151-297    24-198 (220)
 11 KOG1421 Predicted signaling-as  99.3 3.6E-11 7.9E-16  121.9  17.7  247  122-382   524-833 (955)
 12 cd00190 Tryp_SPc Trypsin-like   99.3 8.2E-11 1.8E-15  107.6  16.2  147  151-298    24-208 (232)
 13 PF13180 PDZ_2:  PDZ domain; PD  99.3 1.1E-11 2.3E-16   96.4   8.4   63  316-378    20-82  (82)
 14 KOG1320 Serine protease [Postt  99.3 8.8E-12 1.9E-16  124.5   7.7  236  123-369    57-351 (473)
 15 smart00020 Tryp_SPc Trypsin-li  99.2 4.5E-10 9.8E-15  102.9  15.4  147  151-298    25-208 (229)
 16 cd00991 PDZ_archaeal_metallopr  99.0   1E-09 2.2E-14   84.7   8.6   62  316-377    16-77  (79)
 17 cd00986 PDZ_LON_protease PDZ d  99.0 1.7E-09 3.6E-14   83.3   9.2   65  317-382    15-79  (79)
 18 COG3591 V8-like Glu-specific e  98.9 3.9E-08 8.4E-13   91.3  13.4  135  153-300    65-226 (251)
 19 cd00989 PDZ_metalloprotease PD  98.8 2.1E-08 4.6E-13   76.8   7.9   61  316-377    18-78  (79)
 20 cd00990 PDZ_glycyl_aminopeptid  98.7 3.7E-08   8E-13   75.7   7.3   61  316-379    18-78  (80)
 21 cd00987 PDZ_serine_protease PD  98.7 3.9E-08 8.3E-13   77.2   7.5   60  316-375    30-89  (90)
 22 PRK10779 zinc metallopeptidase  98.7 4.3E-08 9.3E-13  100.0   7.5   66  315-380   131-196 (449)
 23 TIGR01713 typeII_sec_gspC gene  98.7 7.2E-08 1.6E-12   91.0   8.4   61  318-378   199-259 (259)
 24 cd00988 PDZ_CTP_protease PDZ d  98.6 1.9E-07 4.1E-12   72.5   8.0   62  316-378    19-83  (85)
 25 PRK10779 zinc metallopeptidase  98.4 9.7E-07 2.1E-11   90.1   9.0   64  316-380   227-290 (449)
 26 TIGR00054 RIP metalloprotease   98.4   9E-07 1.9E-11   89.5   8.2   64  316-380   209-272 (420)
 27 TIGR02860 spore_IV_B stage IV   98.3 1.8E-06 3.9E-11   85.6   8.9   60  319-379   122-181 (402)
 28 PF00863 Peptidase_C4:  Peptida  98.3 8.7E-05 1.9E-09   68.6  18.1  165  123-315    14-186 (235)
 29 TIGR02037 degP_htrA_DO peripla  98.3 1.7E-06 3.7E-11   87.8   7.6   60  316-375   368-427 (428)
 30 KOG3627 Trypsin [Amino acid tr  98.3 2.4E-05 5.2E-10   73.3  14.4  145  153-299    39-229 (256)
 31 cd00136 PDZ PDZ domain, also c  98.2   2E-06 4.4E-11   64.1   5.0   50  316-366    19-70  (70)
 32 PRK09681 putative type II secr  98.1 7.2E-06 1.6E-10   77.4   7.8   55  324-378   221-275 (276)
 33 PRK10139 serine endoprotease;   98.1 7.4E-06 1.6E-10   83.7   7.9   59  316-376   396-454 (455)
 34 TIGR00225 prc C-terminal pepti  98.1 7.9E-06 1.7E-10   80.2   7.2   64  316-380    68-133 (334)
 35 TIGR03279 cyano_FeS_chp putati  98.0 1.5E-05 3.3E-10   79.6   7.8   61  316-380     4-65  (433)
 36 PRK10942 serine endoprotease;   98.0 1.5E-05 3.2E-10   81.8   7.7   59  316-376   414-472 (473)
 37 smart00228 PDZ Domain present   97.9 1.4E-05 3.1E-10   61.4   5.0   54  316-369    32-85  (85)
 38 PLN00049 carboxyl-terminal pro  97.9 3.3E-05 7.2E-10   77.4   8.2   62  316-378   108-171 (389)
 39 COG3480 SdrC Predicted secrete  97.8 5.1E-05 1.1E-09   72.0   7.8   59  323-381   142-201 (342)
 40 TIGR00054 RIP metalloprotease   97.8 2.1E-05 4.7E-10   79.6   5.5   61  316-378   134-194 (420)
 41 PF14685 Tricorn_PDZ:  Tricorn   97.7 0.00012 2.6E-09   57.4   7.2   55  320-375    30-87  (88)
 42 COG0793 Prc Periplasmic protea  97.7 0.00011 2.4E-09   73.9   7.5   62  316-378   118-183 (406)
 43 cd00992 PDZ_signaling PDZ doma  97.7 9.2E-05   2E-09   56.7   5.4   48  316-365    32-81  (82)
 44 PF05579 Peptidase_S32:  Equine  97.6 0.00049 1.1E-08   63.9  10.4  116  151-297   111-228 (297)
 45 PF00595 PDZ:  PDZ domain (Also  97.5 7.8E-05 1.7E-09   57.3   3.1   49  316-366    31-81  (81)
 46 COG3031 PulC Type II secretory  97.4 0.00025 5.4E-09   64.7   5.6   54  324-377   221-274 (275)
 47 COG5640 Secreted trypsin-like   97.1   0.022 4.8E-07   55.3  15.2   56  154-210    63-135 (413)
 48 PRK11186 carboxy-terminal prot  97.0  0.0014 3.1E-08   69.6   7.2   61  316-377   261-332 (667)
 49 KOG3129 26S proteasome regulat  96.8  0.0024 5.3E-08   57.2   5.6   65  316-380   145-211 (231)
 50 PF03761 DUF316:  Domain of unk  96.7    0.11 2.4E-06   49.4  17.0   91  197-297   159-254 (282)
 51 PF05580 Peptidase_S55:  SpoIVB  96.7   0.023   5E-07   51.6  11.1  164  147-315    15-214 (218)
 52 PF04495 GRASP55_65:  GRASP55/6  96.6  0.0027 5.7E-08   54.3   4.3   62  316-378    49-113 (138)
 53 COG3975 Predicted protease wit  96.2  0.0048   1E-07   62.6   4.1   59  316-382   468-526 (558)
 54 PF00548 Peptidase_C3:  3C cyst  96.2   0.049 1.1E-06   48.3  10.1  137  151-297    24-170 (172)
 55 KOG3209 WW domain-containing p  95.8   0.014   3E-07   61.0   5.3   64  303-368   763-837 (984)
 56 PF09342 DUF1986:  Domain of un  95.6   0.095 2.1E-06   48.5   9.5   96  141-237    17-131 (267)
 57 PF00949 Peptidase_S7:  Peptida  95.5   0.013 2.8E-07   49.5   3.4   32  268-299    88-119 (132)
 58 PF10459 Peptidase_S46:  Peptid  95.3   0.012 2.6E-07   62.9   3.3   21  153-173    48-68  (698)
 59 PF08192 Peptidase_S64:  Peptid  95.0    0.16 3.4E-06   53.3  10.1  109  198-315   542-680 (695)
 60 KOG3580 Tight junction protein  94.1   0.074 1.6E-06   54.7   5.1   73  286-367   414-488 (1027)
 61 PF02122 Peptidase_S39:  Peptid  94.1    0.18 3.8E-06   45.9   7.0  144  152-313    30-181 (203)
 62 TIGR02860 spore_IV_B stage IV   93.8    0.46   1E-05   47.6  10.0   41  271-315   354-394 (402)
 63 KOG3580 Tight junction protein  93.0    0.16 3.5E-06   52.3   5.4   52  323-376   233-286 (1027)
 64 PF10459 Peptidase_S46:  Peptid  92.9   0.053 1.1E-06   58.1   1.8   28  268-295   624-651 (698)
 65 PF12812 PDZ_1:  PDZ-like domai  92.6    0.15 3.2E-06   39.1   3.5   33  324-356    44-76  (78)
 66 COG0750 Predicted membrane-ass  92.4    0.39 8.6E-06   47.6   7.3   56  316-372   135-194 (375)
 67 PF00944 Peptidase_S3:  Alphavi  92.3    0.12 2.6E-06   43.4   2.8   28  271-298   100-127 (158)
 68 KOG3209 WW domain-containing p  92.2    0.24 5.1E-06   52.2   5.4   68  303-370   493-566 (984)
 69 KOG3605 Beta amyloid precursor  89.1     1.4 2.9E-05   46.3   7.5   79  286-371   654-737 (829)
 70 KOG3532 Predicted protein kina  88.8    0.57 1.2E-05   49.2   4.7   47  316-363   404-450 (1051)
 71 KOG3552 FERM domain protein FR  88.6     0.5 1.1E-05   51.2   4.2   45  321-367    85-131 (1298)
 72 KOG3553 Tax interaction protei  83.7    0.62 1.4E-05   37.1   1.5   28  316-343    65-92  (124)
 73 PF02907 Peptidase_S29:  Hepati  80.5    0.86 1.9E-05   38.4   1.3  114  155-299    15-130 (148)
 74 KOG3550 Receptor targeting pro  79.7     3.7   8E-05   35.3   4.8   48  316-365   121-171 (207)
 75 PF03510 Peptidase_C24:  2C end  78.9      10 0.00022   30.7   6.9   52  156-219     3-54  (105)
 76 PF01732 DUF31:  Putative pepti  75.6     2.2 4.8E-05   42.5   2.8   24  272-295   350-373 (374)
 77 KOG3549 Syntrophins (type gamm  75.6     9.1  0.0002   37.4   6.7   47  319-367    89-138 (505)
 78 KOG3542 cAMP-regulated guanine  74.5     2.5 5.4E-05   44.6   2.8   50  316-366   568-617 (1283)
 79 KOG3834 Golgi reassembly stack  71.6      11 0.00024   37.9   6.4   62  316-378   115-177 (462)
 80 PF02395 Peptidase_S6:  Immunog  69.9      15 0.00032   40.3   7.6   62  152-218    65-130 (769)
 81 PF00947 Pico_P2A:  Picornaviru  66.7     8.5 0.00018   32.2   3.9   32  265-297    78-109 (127)
 82 KOG1892 Actin filament-binding  65.6     7.9 0.00017   42.6   4.3   52  317-370   967-1021(1629)
 83 KOG0609 Calcium/calmodulin-dep  61.6     9.7 0.00021   39.3   4.0   40  326-367   163-204 (542)
 84 cd01720 Sm_D2 The eukaryotic S  60.2      18 0.00039   28.3   4.4   37  171-207    10-46  (87)
 85 cd01735 LSm12_N LSm12 belongs   58.8      35 0.00075   24.8   5.4   33  176-208     7-39  (61)
 86 cd00600 Sm_like The eukaryotic  58.4      27 0.00058   24.8   4.9   33  176-208     7-39  (63)
 87 KOG3651 Protein kinase C, alph  56.1      51  0.0011   31.8   7.4   49  316-366    36-87  (429)
 88 cd01731 archaeal_Sm1 The archa  54.7      30 0.00065   25.4   4.7   33  176-208    11-43  (68)
 89 PRK00737 small nuclear ribonuc  54.0      30 0.00066   25.7   4.7   32  176-207    15-46  (72)
 90 cd01726 LSm6 The eukaryotic Sm  53.9      28 0.00062   25.5   4.5   32  176-207    11-42  (67)
 91 cd01722 Sm_F The eukaryotic Sm  52.6      29 0.00063   25.5   4.3   32  176-207    12-43  (68)
 92 KOG0606 Microtubule-associated  51.3      19 0.00042   40.4   4.4   48  316-365   664-713 (1205)
 93 PF00571 CBS:  CBS domain CBS d  51.3      15 0.00033   25.1   2.5   21  276-296    28-48  (57)
 94 cd01730 LSm3 The eukaryotic Sm  50.6      28 0.00061   26.7   4.1   31  176-206    12-42  (82)
 95 KOG3834 Golgi reassembly stack  50.5      25 0.00053   35.5   4.6   60  316-377    21-84  (462)
 96 COG0298 HypC Hydrogenase matur  50.2      37 0.00079   26.1   4.5   46  188-236     5-52  (82)
 97 PF05416 Peptidase_C37:  Southa  49.5      41 0.00088   34.0   5.9  135  152-299   379-528 (535)
 98 cd01717 Sm_B The eukaryotic Sm  49.2      36 0.00078   25.8   4.5   31  176-206    11-41  (79)
 99 KOG2921 Intramembrane metallop  48.8      16 0.00035   36.4   3.0   29  326-354   237-265 (484)
100 cd06168 LSm9 The eukaryotic Sm  48.1      45 0.00097   25.2   4.8   31  176-206    11-41  (75)
101 cd01729 LSm7 The eukaryotic Sm  47.0      43 0.00094   25.6   4.6   31  176-206    13-43  (81)
102 cd01732 LSm5 The eukaryotic Sm  46.9      39 0.00084   25.6   4.3   31  176-206    14-44  (76)
103 KOG3938 RGS-GAIP interacting p  46.1      11 0.00024   35.5   1.4   41  327-367   167-209 (334)
104 TIGR03000 plancto_dom_1 Planct  44.8      36 0.00077   25.8   3.7   49  330-378    10-63  (75)
105 cd01719 Sm_G The eukaryotic Sm  44.3      53  0.0011   24.5   4.7   31  176-206    11-41  (72)
106 PF04225 OapA:  Opacity-associa  43.4      14  0.0003   28.7   1.4   53  328-380     7-68  (85)
107 cd01728 LSm1 The eukaryotic Sm  43.2      54  0.0012   24.7   4.5   31  176-206    13-43  (74)
108 smart00651 Sm snRNP Sm protein  43.1      57  0.0012   23.4   4.7   32  176-207     9-40  (67)
109 cd01721 Sm_D3 The eukaryotic S  42.3      59  0.0013   24.1   4.6   32  176-207    11-42  (70)
110 cd01727 LSm8 The eukaryotic Sm  40.5      60  0.0013   24.2   4.5   31  176-206    10-40  (74)
111 COG1958 LSM1 Small nuclear rib  40.1      55  0.0012   24.7   4.3   33  176-208    18-50  (79)
112 KOG3551 Syntrophins (type beta  39.9      26 0.00057   34.9   2.9   42  326-369   127-172 (506)
113 PF01423 LSM:  LSM domain ;  In  37.8      66  0.0014   23.1   4.3   33  176-208     9-41  (67)
114 PF02601 Exonuc_VII_L:  Exonucl  35.9      43 0.00093   32.4   3.9   34  153-186   281-314 (319)
115 KOG3606 Cell polarity protein   35.2      93   0.002   29.6   5.6   40  317-356   201-243 (358)
116 KOG3571 Dishevelled 3 and rela  35.0      38 0.00082   34.9   3.3   43  322-367   290-338 (626)
117 PF01455 HupF_HypC:  HupF/HypC   34.6 1.2E+02  0.0026   22.4   5.2   43  188-233     5-47  (68)
118 KOG3605 Beta amyloid precursor  34.1      28  0.0006   37.0   2.2   42  317-359   763-806 (829)
119 PF05578 Peptidase_S31:  Pestiv  33.6   1E+02  0.0022   26.7   5.2   73  224-297   109-182 (211)
120 PF09122 DUF1930:  Domain of un  31.3 1.9E+02  0.0041   21.2   5.4   45  331-376    19-64  (68)
121 PF14827 Cache_3:  Sensory doma  30.4      39 0.00084   27.4   2.2   17  281-297    94-110 (116)
122 cd01723 LSm4 The eukaryotic Sm  30.3 1.6E+02  0.0035   22.0   5.4   32  176-207    12-43  (76)
123 PF14275 DUF4362:  Domain of un  29.9 1.2E+02  0.0026   24.2   4.8   25  329-355     1-25  (98)
124 COG4956 Integral membrane prot  29.4      48  0.0011   32.1   2.8   38  335-372   270-308 (356)
125 cd04627 CBS_pair_14 The CBS do  26.7      50  0.0011   26.2   2.2   22  276-297    97-118 (123)
126 cd01725 LSm2 The eukaryotic Sm  25.7 1.6E+02  0.0034   22.5   4.7   32  176-207    12-43  (81)
127 cd04603 CBS_pair_KefB_assoc Th  24.5      59  0.0013   25.4   2.2   20  277-296    86-105 (111)
128 PF11948 DUF3465:  Protein of u  24.5 4.3E+02  0.0094   22.3  10.1   12  223-234    85-96  (131)
129 PF08669 GCV_T_C:  Glycine clea  24.4      58  0.0013   25.2   2.1   20  278-297    34-53  (95)
130 cd04620 CBS_pair_7 The CBS dom  23.8      62  0.0013   25.2   2.2   20  277-296    90-109 (115)
131 cd01733 LSm10 The eukaryotic S  23.4 3.3E+02  0.0072   20.6   6.6   32  176-207    20-51  (78)
132 PF10049 DUF2283:  Protein of u  23.1      56  0.0012   22.4   1.6   10  285-294    36-45  (50)
133 KOG1379 Serine/threonine prote  22.4      98  0.0021   30.1   3.5   71  275-354   187-271 (330)
134 PRK13835 conjugal transfer pro  21.9      73  0.0016   27.4   2.3   40  118-158    47-86  (145)
135 PRK14864 putative biofilm stre  21.6 1.7E+02  0.0037   23.6   4.3   26  108-133    32-57  (104)
136 PF14438 SM-ATX:  Ataxin 2 SM d  21.1 2.2E+02  0.0047   21.2   4.7   30  176-206    13-45  (77)
137 cd04597 CBS_pair_DRTGG_assoc2   20.8      84  0.0018   24.9   2.4   21  276-296    87-107 (113)

No 1  
>PRK10139 serine endoprotease; Provisional
Probab=100.00  E-value=3.8e-46  Score=376.87  Aligned_cols=262  Identities=39%  Similarity=0.609  Sum_probs=223.9

Q ss_pred             hHHHHHHHhCCceEEEEEeeeccC------ccc----c---cc-ccCCCeEEEEEEEcC-CCEEEecccccCCCCeEEEE
Q 016647          117 ATVRLFQENTPSVVNITNLAARQD------AFT----L---DV-LEVPQGSGSGFVWDS-KGHVVTNYHVIRGASDIRVT  181 (385)
Q Consensus       117 ~~~~~~~~~~~SVV~I~~~~~~~~------~~~----~---~~-~~~~~~~GSGfiI~~-~G~ILT~aHvv~~~~~i~V~  181 (385)
                      ++.++++++.||||.|.+......      .|.    .   +. .....+.||||||++ +||||||+|||++++.+.|+
T Consensus        41 ~~~~~~~~~~pavV~i~~~~~~~~~~~~~~~~~~~f~~~~~~~~~~~~~~~GSG~ii~~~~g~IlTn~HVv~~a~~i~V~  120 (455)
T PRK10139         41 SLAPMLEKVLPAVVSVRVEGTASQGQKIPEEFKKFFGDDLPDQPAQPFEGLGSGVIIDAAKGYVLTNNHVINQAQKISIQ  120 (455)
T ss_pred             cHHHHHHHhCCcEEEEEEEEeecccccCchhHHHhccccCCccccccccceEEEEEEECCCCEEEeChHHhCCCCEEEEE
Confidence            578999999999999987653221      110    0   00 112247899999985 79999999999999999999


Q ss_pred             ecCCCeEeeEEEEECCCCCeEEEEEcCCCCCCcceecCCCCCCCCCCEEEEEeCCCCCCCceEEeEEeeeeeeeccCCCC
Q 016647          182 FADQSAYDAKIVGFDQDKDVAVLRIDAPKDKLRPIPIGVSADLLVGQKVYAIGNPFGLDHTLTTGVISGLRREISSAATG  261 (385)
Q Consensus       182 ~~dg~~~~a~vv~~d~~~DlAlLkv~~~~~~~~~l~l~~~~~~~~G~~V~~iG~p~g~~~~~~~G~Vs~~~~~~~~~~~~  261 (385)
                      +.|+++++|++++.|+.+||||||++.+. .+++++|+++..+++||+|+++|||++...+++.|+|++..+....   .
T Consensus       121 ~~dg~~~~a~vvg~D~~~DlAvlkv~~~~-~l~~~~lg~s~~~~~G~~V~aiG~P~g~~~tvt~GivS~~~r~~~~---~  196 (455)
T PRK10139        121 LNDGREFDAKLIGSDDQSDIALLQIQNPS-KLTQIAIADSDKLRVGDFAVAVGNPFGLGQTATSGIISALGRSGLN---L  196 (455)
T ss_pred             ECCCCEEEEEEEEEcCCCCEEEEEecCCC-CCceeEecCccccCCCCEEEEEecCCCCCCceEEEEEccccccccC---C
Confidence            99999999999999999999999998643 6899999998899999999999999999999999999987764221   1


Q ss_pred             CCcccEEEEccccCCCCCCceeeCCCccEEEEeecccCCCCCCCCceeeecccc--------------------------
Q 016647          262 RPIQDVIQTDAAINPGNSGGPLLDSSGSLIGINTAIYSPSGASSGVGFSIPVDT--------------------------  315 (385)
Q Consensus       262 ~~~~~~i~~d~~i~~G~SGGPlvn~~G~VVGI~s~~~~~~~~~~~~g~aIP~~~--------------------------  315 (385)
                      ..+.+++++|+.+++|+|||||||.+||||||+++.+.++++..|+||+||++.                          
T Consensus       197 ~~~~~~iqtda~in~GnSGGpl~n~~G~vIGi~~~~~~~~~~~~gigfaIP~~~~~~v~~~l~~~g~v~r~~LGv~~~~l  276 (455)
T PRK10139        197 EGLENFIQTDASINRGNSGGALLNLNGELIGINTAILAPGGGSVGIGFAIPSNMARTLAQQLIDFGEIKRGLLGIKGTEM  276 (455)
T ss_pred             CCcceEEEECCccCCCCCcceEECCCCeEEEEEEEEEcCCCCccceEEEEEhHHHHHHHHHHhhcCcccccceeEEEEEC
Confidence            224578999999999999999999999999999998877777789999999986                          


Q ss_pred             -------------------cccccccccccCCCCCcEEEEECCEEeCCHHHHHHHHhcCCCCCEEEEEEEECCEEEEEEE
Q 016647          316 -------------------GLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQCKVGDEVIVEVLRGDQKEKIPV  376 (385)
Q Consensus       316 -------------------~v~~~~~a~~~gl~~GDiI~~ing~~i~s~~~l~~~l~~~~~g~~v~l~v~R~g~~~~~~v  376 (385)
                                         .+.++++++++|||+||+|++|||++|.++.|+.+.+...++|++++++|+|+|+.+++++
T Consensus       277 ~~~~~~~lgl~~~~Gv~V~~V~~~SpA~~AGL~~GDvIl~InG~~V~s~~dl~~~l~~~~~g~~v~l~V~R~G~~~~l~v  356 (455)
T PRK10139        277 SADIAKAFNLDVQRGAFVSEVLPNSGSAKAGVKAGDIITSLNGKPLNSFAELRSRIATTEPGTKVKLGLLRNGKPLEVEV  356 (455)
T ss_pred             CHHHHHhcCCCCCCceEEEEECCCChHHHCCCCCCCEEEEECCEECCCHHHHHHHHHhcCCCCEEEEEEEECCEEEEEEE
Confidence                               2235667889999999999999999999999999999888899999999999999999999


Q ss_pred             EeecCC
Q 016647          377 KLEPKP  382 (385)
Q Consensus       377 ~~~~~~  382 (385)
                      ++.+.+
T Consensus       357 ~~~~~~  362 (455)
T PRK10139        357 TLDTST  362 (455)
T ss_pred             EECCCC
Confidence            885543


No 2  
>PRK10898 serine endoprotease; Provisional
Probab=100.00  E-value=3.6e-45  Score=359.33  Aligned_cols=264  Identities=33%  Similarity=0.492  Sum_probs=223.1

Q ss_pred             CccchhHHHHHHHhCCceEEEEEeeeccCccccccccCCCeEEEEEEEcCCCEEEecccccCCCCeEEEEecCCCeEeeE
Q 016647          112 QTDELATVRLFQENTPSVVNITNLAARQDAFTLDVLEVPQGSGSGFVWDSKGHVVTNYHVIRGASDIRVTFADQSAYDAK  191 (385)
Q Consensus       112 ~~~~~~~~~~~~~~~~SVV~I~~~~~~~~~~~~~~~~~~~~~GSGfiI~~~G~ILT~aHvv~~~~~i~V~~~dg~~~~a~  191 (385)
                      ...+.++.++++++.||||.|.........   .......+.||||+|+++|+||||+||+.+++.+.|++.||+.++|+
T Consensus        41 ~~~~~~~~~~~~~~~psvV~v~~~~~~~~~---~~~~~~~~~GSGfvi~~~G~IlTn~HVv~~a~~i~V~~~dg~~~~a~  117 (353)
T PRK10898         41 DETPASYNQAVRRAAPAVVNVYNRSLNSTS---HNQLEIRTLGSGVIMDQRGYILTNKHVINDADQIIVALQDGRVFEAL  117 (353)
T ss_pred             ccccchHHHHHHHhCCcEEEEEeEeccccC---cccccccceeeEEEEeCCeEEEecccEeCCCCEEEEEeCCCCEEEEE
Confidence            334457889999999999999885532211   01112347899999999999999999999999999999999999999


Q ss_pred             EEEECCCCCeEEEEEcCCCCCCcceecCCCCCCCCCCEEEEEeCCCCCCCceEEeEEeeeeeeeccCCCCCCcccEEEEc
Q 016647          192 IVGFDQDKDVAVLRIDAPKDKLRPIPIGVSADLLVGQKVYAIGNPFGLDHTLTTGVISGLRREISSAATGRPIQDVIQTD  271 (385)
Q Consensus       192 vv~~d~~~DlAlLkv~~~~~~~~~l~l~~~~~~~~G~~V~~iG~p~g~~~~~~~G~Vs~~~~~~~~~~~~~~~~~~i~~d  271 (385)
                      ++++|+.+||||||++..  .+++++++++..+++|++|+++|||++...+++.|+|++..+....   .....+++++|
T Consensus       118 vv~~d~~~DlAvl~v~~~--~l~~~~l~~~~~~~~G~~V~aiG~P~g~~~~~t~Giis~~~r~~~~---~~~~~~~iqtd  192 (353)
T PRK10898        118 LVGSDSLTDLAVLKINAT--NLPVIPINPKRVPHIGDVVLAIGNPYNLGQTITQGIISATGRIGLS---PTGRQNFLQTD  192 (353)
T ss_pred             EEEEcCCCCEEEEEEcCC--CCCeeeccCcCcCCCCCEEEEEeCCCCcCCCcceeEEEeccccccC---CccccceEEec
Confidence            999999999999999873  5788999888889999999999999998889999999987664321   11234789999


Q ss_pred             cccCCCCCCceeeCCCccEEEEeecccCCCC---CCCCceeeecccc---------------------------------
Q 016647          272 AAINPGNSGGPLLDSSGSLIGINTAIYSPSG---ASSGVGFSIPVDT---------------------------------  315 (385)
Q Consensus       272 ~~i~~G~SGGPlvn~~G~VVGI~s~~~~~~~---~~~~~g~aIP~~~---------------------------------  315 (385)
                      +.+++|+|||||+|.+||||||+++.+...+   ...++||+||++.                                 
T Consensus       193 a~i~~GnSGGPl~n~~G~vvGI~~~~~~~~~~~~~~~g~~faIP~~~~~~~~~~l~~~G~~~~~~lGi~~~~~~~~~~~~  272 (353)
T PRK10898        193 ASINHGNSGGALVNSLGELMGINTLSFDKSNDGETPEGIGFAIPTQLATKIMDKLIRDGRVIRGYIGIGGREIAPLHAQG  272 (353)
T ss_pred             cccCCCCCcceEECCCCeEEEEEEEEecccCCCCcccceEEEEchHHHHHHHHHHhhcCcccccccceEEEECCHHHHHh
Confidence            9999999999999999999999998765432   2368999999987                                 


Q ss_pred             ------------cccccccccccCCCCCcEEEEECCEEeCCHHHHHHHHhcCCCCCEEEEEEEECCEEEEEEEEeecCCC
Q 016647          316 ------------GLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQCKVGDEVIVEVLRGDQKEKIPVKLEPKPD  383 (385)
Q Consensus       316 ------------~v~~~~~a~~~gl~~GDiI~~ing~~i~s~~~l~~~l~~~~~g~~v~l~v~R~g~~~~~~v~~~~~~~  383 (385)
                                  .+.++++++++||++||+|++|||++|.++.++.+.+...++|++++++|+|+|+.+++++++.+.+.
T Consensus       273 ~~~~~~~Gv~V~~V~~~spA~~aGL~~GDvI~~Ing~~V~s~~~l~~~l~~~~~g~~v~l~v~R~g~~~~~~v~l~~~p~  352 (353)
T PRK10898        273 GGIDQLQGIVVNEVSPDGPAAKAGIQVNDLIISVNNKPAISALETMDQVAEIRPGSVIPVVVMRDDKQLTLQVTIQEYPA  352 (353)
T ss_pred             cCCCCCCeEEEEEECCCChHHHcCCCCCCEEEEECCEEcCCHHHHHHHHHhcCCCCEEEEEEEECCEEEEEEEEeccCCC
Confidence                        12345677888999999999999999999999999998888999999999999999999999987764


No 3  
>TIGR02038 protease_degS periplasmic serine peptidase DegS. This family consists of the periplasmic serine protease DegS (HhoB), a shorter paralog of protease DO (HtrA, DegP) and DegQ (HhoA). It is found in E. coli and several other Proteobacteria of the gamma subdivision. It contains a trypsin domain and a single copy of the PDZ domain (in contrast to DegP with two copies). A critical role of DegS is to sense stress in the periplasm and partially degrade an inhibitor of sigma(E).
Probab=100.00  E-value=5.5e-45  Score=358.17  Aligned_cols=263  Identities=37%  Similarity=0.605  Sum_probs=223.5

Q ss_pred             CccchhHHHHHHHhCCceEEEEEeeeccCccccccccCCCeEEEEEEEcCCCEEEecccccCCCCeEEEEecCCCeEeeE
Q 016647          112 QTDELATVRLFQENTPSVVNITNLAARQDAFTLDVLEVPQGSGSGFVWDSKGHVVTNYHVIRGASDIRVTFADQSAYDAK  191 (385)
Q Consensus       112 ~~~~~~~~~~~~~~~~SVV~I~~~~~~~~~~~~~~~~~~~~~GSGfiI~~~G~ILT~aHvv~~~~~i~V~~~dg~~~~a~  191 (385)
                      ...+.++.++++++.||||.|.......+.+   ......+.||||+|+++||||||+||+.+++.+.|.+.||+.++|+
T Consensus        41 ~~~~~~~~~~~~~~~psVV~I~~~~~~~~~~---~~~~~~~~GSG~vi~~~G~IlTn~HVV~~~~~i~V~~~dg~~~~a~  117 (351)
T TIGR02038        41 NTVEISFNKAVRRAAPAVVNIYNRSISQNSL---NQLSIQGLGSGVIMSKEGYILTNYHVIKKADQIVVALQDGRKFEAE  117 (351)
T ss_pred             cccchhHHHHHHhcCCcEEEEEeEecccccc---ccccccceEEEEEEeCCeEEEecccEeCCCCEEEEEECCCCEEEEE
Confidence            4455578899999999999998865433211   1123357899999999999999999999999999999999999999


Q ss_pred             EEEECCCCCeEEEEEcCCCCCCcceecCCCCCCCCCCEEEEEeCCCCCCCceEEeEEeeeeeeeccCCCCCCcccEEEEc
Q 016647          192 IVGFDQDKDVAVLRIDAPKDKLRPIPIGVSADLLVGQKVYAIGNPFGLDHTLTTGVISGLRREISSAATGRPIQDVIQTD  271 (385)
Q Consensus       192 vv~~d~~~DlAlLkv~~~~~~~~~l~l~~~~~~~~G~~V~~iG~p~g~~~~~~~G~Vs~~~~~~~~~~~~~~~~~~i~~d  271 (385)
                      +++.|+.+||||||++..  .+++++++++..+++|++|+++|||++...+++.|+|+...+....   .....+++++|
T Consensus       118 vv~~d~~~DlAvlkv~~~--~~~~~~l~~s~~~~~G~~V~aiG~P~~~~~s~t~GiIs~~~r~~~~---~~~~~~~iqtd  192 (351)
T TIGR02038       118 LVGSDPLTDLAVLKIEGD--NLPTIPVNLDRPPHVGDVVLAIGNPYNLGQTITQGIISATGRNGLS---SVGRQNFIQTD  192 (351)
T ss_pred             EEEecCCCCEEEEEecCC--CCceEeccCcCccCCCCEEEEEeCCCCCCCcEEEEEEEeccCcccC---CCCcceEEEEC
Confidence            999999999999999874  4788999888889999999999999998899999999987764321   11235789999


Q ss_pred             cccCCCCCCceeeCCCccEEEEeecccCCC--CCCCCceeeecccc----------------------------------
Q 016647          272 AAINPGNSGGPLLDSSGSLIGINTAIYSPS--GASSGVGFSIPVDT----------------------------------  315 (385)
Q Consensus       272 ~~i~~G~SGGPlvn~~G~VVGI~s~~~~~~--~~~~~~g~aIP~~~----------------------------------  315 (385)
                      +.+++|+|||||+|.+|+||||+++.+...  +...+++|+||++.                                  
T Consensus       193 a~i~~GnSGGpl~n~~G~vIGI~~~~~~~~~~~~~~g~~faIP~~~~~~vl~~l~~~g~~~r~~lGv~~~~~~~~~~~~l  272 (351)
T TIGR02038       193 AAINAGNSGGALINTNGELVGINTASFQKGGDEGGEGINFAIPIKLAHKIMGKIIRDGRVIRGYIGVSGEDINSVVAQGL  272 (351)
T ss_pred             CccCCCCCcceEECCCCeEEEEEeeeecccCCCCccceEEEecHHHHHHHHHHHhhcCcccceEeeeEEEECCHHHHHhc
Confidence            999999999999999999999998766432  22468999999986                                  


Q ss_pred             -----------cccccccccccCCCCCcEEEEECCEEeCCHHHHHHHHhcCCCCCEEEEEEEECCEEEEEEEEeecCC
Q 016647          316 -----------GLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQCKVGDEVIVEVLRGDQKEKIPVKLEPKP  382 (385)
Q Consensus       316 -----------~v~~~~~a~~~gl~~GDiI~~ing~~i~s~~~l~~~l~~~~~g~~v~l~v~R~g~~~~~~v~~~~~~  382 (385)
                                 .+.++++++++||++||+|++|||++|.++.|+.+.+...++|++++++|+|+|+.+++++++.+.|
T Consensus       273 gl~~~~Gv~V~~V~~~spA~~aGL~~GDvI~~Ing~~V~s~~dl~~~l~~~~~g~~v~l~v~R~g~~~~~~v~l~~~p  350 (351)
T TIGR02038       273 GLPDLRGIVITGVDPNGPAARAGILVRDVILKYDGKDVIGAEELMDRIAETRPGSKVMVTVLRQGKQLELPVTIDEKP  350 (351)
T ss_pred             CCCccccceEeecCCCChHHHCCCCCCCEEEEECCEEcCCHHHHHHHHHhcCCCCEEEEEEEECCEEEEEEEEecCCC
Confidence                       1234567788899999999999999999999999999988899999999999999999999987765


No 4  
>PRK10942 serine endoprotease; Provisional
Probab=100.00  E-value=1.1e-43  Score=360.52  Aligned_cols=261  Identities=38%  Similarity=0.595  Sum_probs=222.3

Q ss_pred             hHHHHHHHhCCceEEEEEeeeccC---c--------cccc--c------------------------ccCCCeEEEEEEE
Q 016647          117 ATVRLFQENTPSVVNITNLAARQD---A--------FTLD--V------------------------LEVPQGSGSGFVW  159 (385)
Q Consensus       117 ~~~~~~~~~~~SVV~I~~~~~~~~---~--------~~~~--~------------------------~~~~~~~GSGfiI  159 (385)
                      .+.++++++.||||.|.+......   +        |..+  .                        ....++.||||||
T Consensus        39 ~~~~~~~~~~pavv~i~~~~~~~~~~~~~~~~~~~ff~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~GSG~ii  118 (473)
T PRK10942         39 SLAPMLEKVMPSVVSINVEGSTTVNTPRMPRQFQQFFGDNSPFCQEGSPFQSSPFCQGGQGGNGGGQQQKFMALGSGVII  118 (473)
T ss_pred             cHHHHHHHhCCceEEEEEEEeccccCCCCChhHHHhhcccccccccccccccccccccccccccccccccccceEEEEEE
Confidence            588999999999999987653211   1        1000  0                        0012468999999


Q ss_pred             cC-CCEEEecccccCCCCeEEEEecCCCeEeeEEEEECCCCCeEEEEEcCCCCCCcceecCCCCCCCCCCEEEEEeCCCC
Q 016647          160 DS-KGHVVTNYHVIRGASDIRVTFADQSAYDAKIVGFDQDKDVAVLRIDAPKDKLRPIPIGVSADLLVGQKVYAIGNPFG  238 (385)
Q Consensus       160 ~~-~G~ILT~aHvv~~~~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkv~~~~~~~~~l~l~~~~~~~~G~~V~~iG~p~g  238 (385)
                      ++ +||||||+||+.+++.+.|++.|+++++|++++.|+.+||||||++... .+++++|+++..+++|++|+++|+|++
T Consensus       119 ~~~~G~IlTn~HVv~~a~~i~V~~~dg~~~~a~vv~~D~~~DlAvlki~~~~-~l~~~~lg~s~~l~~G~~V~aiG~P~g  197 (473)
T PRK10942        119 DADKGYVVTNNHVVDNATKIKVQLSDGRKFDAKVVGKDPRSDIALIQLQNPK-NLTAIKMADSDALRVGDYTVAIGNPYG  197 (473)
T ss_pred             ECCCCEEEeChhhcCCCCEEEEEECCCCEEEEEEEEecCCCCEEEEEecCCC-CCceeEecCccccCCCCEEEEEcCCCC
Confidence            86 5999999999999999999999999999999999999999999997543 689999999889999999999999999


Q ss_pred             CCCceEEeEEeeeeeeeccCCCCCCcccEEEEccccCCCCCCceeeCCCccEEEEeecccCCCCCCCCceeeecccc---
Q 016647          239 LDHTLTTGVISGLRREISSAATGRPIQDVIQTDAAINPGNSGGPLLDSSGSLIGINTAIYSPSGASSGVGFSIPVDT---  315 (385)
Q Consensus       239 ~~~~~~~G~Vs~~~~~~~~~~~~~~~~~~i~~d~~i~~G~SGGPlvn~~G~VVGI~s~~~~~~~~~~~~g~aIP~~~---  315 (385)
                      ...+++.|+|+...+....   ...+..++++|+.+++|+|||||+|.+|+||||+++.+.++++..++||+||++.   
T Consensus       198 ~~~tvt~GiVs~~~r~~~~---~~~~~~~iqtda~i~~GnSGGpL~n~~GeviGI~t~~~~~~g~~~g~gfaIP~~~~~~  274 (473)
T PRK10942        198 LGETVTSGIVSALGRSGLN---VENYENFIQTDAAINRGNSGGALVNLNGELIGINTAILAPDGGNIGIGFAIPSNMVKN  274 (473)
T ss_pred             CCcceeEEEEEEeecccCC---cccccceEEeccccCCCCCcCccCCCCCeEEEEEEEEEcCCCCcccEEEEEEHHHHHH
Confidence            9999999999987764211   1223578999999999999999999999999999998887777789999999975   


Q ss_pred             ------------------------------------------cccccccccccCCCCCcEEEEECCEEeCCHHHHHHHHh
Q 016647          316 ------------------------------------------GLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILD  353 (385)
Q Consensus       316 ------------------------------------------~v~~~~~a~~~gl~~GDiI~~ing~~i~s~~~l~~~l~  353 (385)
                                                                .+.++++++++|||+||+|++|||++|.++.++.+.+.
T Consensus       275 v~~~l~~~g~v~rg~lGv~~~~l~~~~a~~~~l~~~~GvlV~~V~~~SpA~~AGL~~GDvIl~InG~~V~s~~dl~~~l~  354 (473)
T PRK10942        275 LTSQMVEYGQVKRGELGIMGTELNSELAKAMKVDAQRGAFVSQVLPNSSAAKAGIKAGDVITSLNGKPISSFAALRAQVG  354 (473)
T ss_pred             HHHHHHhccccccceeeeEeeecCHHHHHhcCCCCCCceEEEEECCCChHHHcCCCCCCEEEEECCEECCCHHHHHHHHH
Confidence                                                      22345677889999999999999999999999999999


Q ss_pred             cCCCCCEEEEEEEECCEEEEEEEEeecC
Q 016647          354 QCKVGDEVIVEVLRGDQKEKIPVKLEPK  381 (385)
Q Consensus       354 ~~~~g~~v~l~v~R~g~~~~~~v~~~~~  381 (385)
                      ..++|++++++|+|+|+.+++++++.+.
T Consensus       355 ~~~~g~~v~l~v~R~G~~~~v~v~l~~~  382 (473)
T PRK10942        355 TMPVGSKLTLGLLRDGKPVNVNVELQQS  382 (473)
T ss_pred             hcCCCCEEEEEEEECCeEEEEEEEeCcC
Confidence            8889999999999999999999988654


No 5  
>TIGR02037 degP_htrA_DO periplasmic serine protease, Do/DegQ family. This family consists of a set of proteins variously designated DegP, heat shock protein HtrA, and protease DO. The ortholog in Pseudomonas aeruginosa is designated MucD and is found in an operon that controls mucoid phenotype. This family also includes the DegQ (HhoA) paralog in E. coli which can rescue a DegP mutant, but not the smaller DegS paralog, which cannot. Members of this family are located in the periplasm and have separable functions as both protease and chaperone. Members have a trypsin domain and two copies of a PDZ domain. This protein protects bacteria from thermal and other stresses and may be important for the survival of bacterial pathogens. The chaperone function is dominant at low temperatures, whereas the proteolytic activity is turned on at elevated temperatures.
Probab=100.00  E-value=4.3e-42  Score=346.81  Aligned_cols=261  Identities=43%  Similarity=0.658  Sum_probs=222.7

Q ss_pred             HHHHHHHhCCceEEEEEeeeccC---------c----ccc---c-----cccCCCeEEEEEEEcCCCEEEecccccCCCC
Q 016647          118 TVRLFQENTPSVVNITNLAARQD---------A----FTL---D-----VLEVPQGSGSGFVWDSKGHVVTNYHVIRGAS  176 (385)
Q Consensus       118 ~~~~~~~~~~SVV~I~~~~~~~~---------~----~~~---~-----~~~~~~~~GSGfiI~~~G~ILT~aHvv~~~~  176 (385)
                      +.++++++.||||.|.+......         +    |..   .     ......+.||||+|+++|+||||+||+.++.
T Consensus         3 ~~~~~~~~~p~vv~i~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~GSGfii~~~G~IlTn~Hvv~~~~   82 (428)
T TIGR02037         3 FAPLVEKVAPAVVNISVEGTVKRRNRPPALPPFFRQFFGDDMPNFPRQQRERKVRGLGSGVIISADGYILTNNHVVDGAD   82 (428)
T ss_pred             HHHHHHHhCCceEEEEEEEEecccCCCcccchhHHHhhcccccCcccccccccccceeeEEEECCCCEEEEcHHHcCCCC
Confidence            56899999999999988652211         1    100   0     0112357899999999999999999999999


Q ss_pred             eEEEEecCCCeEeeEEEEECCCCCeEEEEEcCCCCCCcceecCCCCCCCCCCEEEEEeCCCCCCCceEEeEEeeeeeeec
Q 016647          177 DIRVTFADQSAYDAKIVGFDQDKDVAVLRIDAPKDKLRPIPIGVSADLLVGQKVYAIGNPFGLDHTLTTGVISGLRREIS  256 (385)
Q Consensus       177 ~i~V~~~dg~~~~a~vv~~d~~~DlAlLkv~~~~~~~~~l~l~~~~~~~~G~~V~~iG~p~g~~~~~~~G~Vs~~~~~~~  256 (385)
                      .+.|++.|++.++|++++.|+.+||||||++.+ ..++++.|+++..+++|++|+++|||++...+++.|+|+...+...
T Consensus        83 ~i~V~~~~~~~~~a~vv~~d~~~DlAllkv~~~-~~~~~~~l~~~~~~~~G~~v~aiG~p~g~~~~~t~G~vs~~~~~~~  161 (428)
T TIGR02037        83 EITVTLSDGREFKAKLVGKDPRTDIAVLKIDAK-KNLPVIKLGDSDKLRVGDWVLAIGNPFGLGQTVTSGIVSALGRSGL  161 (428)
T ss_pred             eEEEEeCCCCEEEEEEEEecCCCCEEEEEecCC-CCceEEEccCCCCCCCCCEEEEEECCCcCCCcEEEEEEEecccCcc
Confidence            999999999999999999999999999999875 3689999998888999999999999999999999999998766521


Q ss_pred             cCCCCCCcccEEEEccccCCCCCCceeeCCCccEEEEeecccCCCCCCCCceeeecccc---------------------
Q 016647          257 SAATGRPIQDVIQTDAAINPGNSGGPLLDSSGSLIGINTAIYSPSGASSGVGFSIPVDT---------------------  315 (385)
Q Consensus       257 ~~~~~~~~~~~i~~d~~i~~G~SGGPlvn~~G~VVGI~s~~~~~~~~~~~~g~aIP~~~---------------------  315 (385)
                         ....+..++++|+.+++|+|||||+|.+|+||||+++.+...++..+++|+||++.                     
T Consensus       162 ---~~~~~~~~i~tda~i~~GnSGGpl~n~~G~viGI~~~~~~~~g~~~g~~faiP~~~~~~~~~~l~~~g~~~~~~lGi  238 (428)
T TIGR02037       162 ---GIGDYENFIQTDAAINPGNSGGPLVNLRGEVIGINTAIYSPSGGNVGIGFAIPSNMAKNVVDQLIEGGKVQRGWLGV  238 (428)
T ss_pred             ---CCCCccceEEECCCCCCCCCCCceECCCCeEEEEEeEEEcCCCCccceEEEEEhHHHHHHHHHHHhcCcCcCCcCce
Confidence               11234568999999999999999999999999999988777666789999999876                     


Q ss_pred             ------------------------cccccccccccCCCCCcEEEEECCEEeCCHHHHHHHHhcCCCCCEEEEEEEECCEE
Q 016647          316 ------------------------GLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQCKVGDEVIVEVLRGDQK  371 (385)
Q Consensus       316 ------------------------~v~~~~~a~~~gl~~GDiI~~ing~~i~s~~~l~~~l~~~~~g~~v~l~v~R~g~~  371 (385)
                                              .+.++++++++||++||+|++|||++|.++.++.+++...++|++++++|+|+|+.
T Consensus       239 ~~~~~~~~~~~~lgl~~~~Gv~V~~V~~~spA~~aGL~~GDvI~~Vng~~i~~~~~~~~~l~~~~~g~~v~l~v~R~g~~  318 (428)
T TIGR02037       239 TIQEVTSDLAKSLGLEKQRGALVAQVLPGSPAEKAGLKAGDVILSVNGKPISSFADLRRAIGTLKPGKKVTLGILRKGKE  318 (428)
T ss_pred             EeecCCHHHHHHcCCCCCCceEEEEccCCCChHHcCCCCCCEEEEECCEEcCCHHHHHHHHHhcCCCCEEEEEEEECCEE
Confidence                                    22356778889999999999999999999999999999888999999999999999


Q ss_pred             EEEEEEeecCC
Q 016647          372 EKIPVKLEPKP  382 (385)
Q Consensus       372 ~~~~v~~~~~~  382 (385)
                      +++++++...+
T Consensus       319 ~~~~v~l~~~~  329 (428)
T TIGR02037       319 KTITVTLGASP  329 (428)
T ss_pred             EEEEEEECcCC
Confidence            99999876544


No 6  
>COG0265 DegQ Trypsin-like serine proteases, typically periplasmic, contain C-terminal PDZ domain [Posttranslational modification, protein turnover, chaperones]
Probab=100.00  E-value=2e-34  Score=283.60  Aligned_cols=262  Identities=45%  Similarity=0.665  Sum_probs=225.0

Q ss_pred             hhHHHHHHHhCCceEEEEEeeeccC-ccccccc-cC-CCeEEEEEEEcCCCEEEecccccCCCCeEEEEecCCCeEeeEE
Q 016647          116 LATVRLFQENTPSVVNITNLAARQD-AFTLDVL-EV-PQGSGSGFVWDSKGHVVTNYHVIRGASDIRVTFADQSAYDAKI  192 (385)
Q Consensus       116 ~~~~~~~~~~~~SVV~I~~~~~~~~-~~~~~~~-~~-~~~~GSGfiI~~~G~ILT~aHvv~~~~~i~V~~~dg~~~~a~v  192 (385)
                      ..+..+++++.|+||.|........ .|..... .. ..+.||||+++++|+|+|+.||+.+++.+.+.+.||+++++++
T Consensus        33 ~~~~~~~~~~~~~vV~~~~~~~~~~~~~~~~~~~~~~~~~~gSg~i~~~~g~ivTn~hVi~~a~~i~v~l~dg~~~~a~~  112 (347)
T COG0265          33 LSFATAVEKVAPAVVSIATGLTAKLRSFFPSDPPLRSAEGLGSGFIISSDGYIVTNNHVIAGAEEITVTLADGREVPAKL  112 (347)
T ss_pred             cCHHHHHHhcCCcEEEEEeeeeecchhcccCCcccccccccccEEEEcCCeEEEecceecCCcceEEEEeCCCCEEEEEE
Confidence            5778999999999999988654321 1110000 00 1479999999999999999999999999999999999999999


Q ss_pred             EEECCCCCeEEEEEcCCCCCCcceecCCCCCCCCCCEEEEEeCCCCCCCceEEeEEeeeeeeeccCCCCCCcccEEEEcc
Q 016647          193 VGFDQDKDVAVLRIDAPKDKLRPIPIGVSADLLVGQKVYAIGNPFGLDHTLTTGVISGLRREISSAATGRPIQDVIQTDA  272 (385)
Q Consensus       193 v~~d~~~DlAlLkv~~~~~~~~~l~l~~~~~~~~G~~V~~iG~p~g~~~~~~~G~Vs~~~~~~~~~~~~~~~~~~i~~d~  272 (385)
                      ++.|+..|+|++|++.... ++.+.++++..++.|++++++|+|++...+++.|+++...+. . ........+++++|+
T Consensus       113 vg~d~~~dlavlki~~~~~-~~~~~~~~s~~l~vg~~v~aiGnp~g~~~tvt~Givs~~~r~-~-v~~~~~~~~~IqtdA  189 (347)
T COG0265         113 VGKDPISDLAVLKIDGAGG-LPVIALGDSDKLRVGDVVVAIGNPFGLGQTVTSGIVSALGRT-G-VGSAGGYVNFIQTDA  189 (347)
T ss_pred             EecCCccCEEEEEeccCCC-CceeeccCCCCcccCCEEEEecCCCCcccceeccEEeccccc-c-ccCcccccchhhccc
Confidence            9999999999999998543 788899999999999999999999999999999999988885 1 111122568899999


Q ss_pred             ccCCCCCCceeeCCCccEEEEeecccCCCCCCCCceeeecccc------------------------------c------
Q 016647          273 AINPGNSGGPLLDSSGSLIGINTAIYSPSGASSGVGFSIPVDT------------------------------G------  316 (385)
Q Consensus       273 ~i~~G~SGGPlvn~~G~VVGI~s~~~~~~~~~~~~g~aIP~~~------------------------------~------  316 (385)
                      .+++|+||||++|.+|++|||++..+.+.++..+++|+||++.                              .      
T Consensus       190 ain~gnsGgpl~n~~g~~iGint~~~~~~~~~~gigfaiP~~~~~~v~~~l~~~G~v~~~~lgv~~~~~~~~~~~g~~~~  269 (347)
T COG0265         190 AINPGNSGGPLVNIDGEVVGINTAIIAPSGGSSGIGFAIPVNLVAPVLDELISKGKVVRGYLGVIGEPLTADIALGLPVA  269 (347)
T ss_pred             ccCCCCCCCceEcCCCcEEEEEEEEecCCCCcceeEEEecHHHHHHHHHHHHHcCCccccccceEEEEcccccccCCCCC
Confidence            9999999999999999999999999887766678999999986                              1      


Q ss_pred             -------ccccccccccCCCCCcEEEEECCEEeCCHHHHHHHHhcCCCCCEEEEEEEECCEEEEEEEEeec
Q 016647          317 -------LLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQCKVGDEVIVEVLRGDQKEKIPVKLEP  380 (385)
Q Consensus       317 -------v~~~~~a~~~gl~~GDiI~~ing~~i~s~~~l~~~l~~~~~g~~v~l~v~R~g~~~~~~v~~~~  380 (385)
                             +.+.++++++|++.||+|+++||+++.+..++.+.+....+|+.+.++++|+|+.+++.+++.+
T Consensus       270 ~G~~V~~v~~~spa~~agi~~Gdii~~vng~~v~~~~~l~~~v~~~~~g~~v~~~~~r~g~~~~~~v~l~~  340 (347)
T COG0265         270 AGAVVLGVLPGSPAAKAGIKAGDIITAVNGKPVASLSDLVAAVASNRPGDEVALKLLRGGKERELAVTLGD  340 (347)
T ss_pred             CceEEEecCCCChHHHcCCCCCCEEEEECCEEccCHHHHHHHHhccCCCCEEEEEEEECCEEEEEEEEecC
Confidence                   2345667888999999999999999999999999999999999999999999999999999987


No 7  
>KOG1320 consensus Serine protease [Posttranslational modification, protein turnover, chaperones]
Probab=99.95  E-value=2.4e-26  Score=227.90  Aligned_cols=267  Identities=37%  Similarity=0.532  Sum_probs=219.9

Q ss_pred             hhHHHHHHHhCCceEEEEEeeeccCccccccccCCCeEEEEEEEcCCCEEEecccccCCCC-----------eEEEEecC
Q 016647          116 LATVRLFQENTPSVVNITNLAARQDAFTLDVLEVPQGSGSGFVWDSKGHVVTNYHVIRGAS-----------DIRVTFAD  184 (385)
Q Consensus       116 ~~~~~~~~~~~~SVV~I~~~~~~~~~~~~~~~~~~~~~GSGfiI~~~G~ILT~aHvv~~~~-----------~i~V~~~d  184 (385)
                      .....++++...++|.|+...-..........+.+...|||||++.+|.++|++||+....           .+.|...+
T Consensus       128 ~~v~~~~~~cd~Avv~Ie~~~f~~~~~~~e~~~ip~l~~S~~Vv~gd~i~VTnghV~~~~~~~y~~~~~~l~~vqi~aa~  207 (473)
T KOG1320|consen  128 AFVAAVFEECDLAVVYIESEEFWKGMNPFELGDIPSLNGSGFVVGGDGIIVTNGHVVRVEPRIYAHSSTVLLRVQIDAAI  207 (473)
T ss_pred             hhHHHhhhcccceEEEEeeccccCCCcccccCCCcccCccEEEEcCCcEEEEeeEEEEEEeccccCCCcceeeEEEEEee
Confidence            3456789999999999998544333322233445678899999999999999999986432           37777776


Q ss_pred             C--CeEeeEEEEECCCCCeEEEEEcCCCCCCcceecCCCCCCCCCCEEEEEeCCCCCCCceEEeEEeeeeeeeccCCCC-
Q 016647          185 Q--SAYDAKIVGFDQDKDVAVLRIDAPKDKLRPIPIGVSADLLVGQKVYAIGNPFGLDHTLTTGVISGLRREISSAATG-  261 (385)
Q Consensus       185 g--~~~~a~vv~~d~~~DlAlLkv~~~~~~~~~l~l~~~~~~~~G~~V~~iG~p~g~~~~~~~G~Vs~~~~~~~~~~~~-  261 (385)
                      +  ..+++.+++.|+..|+|+++++.+..-.++++++-...+..|+++..+|.|++..+..+.|+++...|........ 
T Consensus       208 ~~~~s~ep~i~g~d~~~gvA~l~ik~~~~i~~~i~~~~~~~~~~G~~~~a~~~~f~~~nt~t~g~vs~~~R~~~~lg~~~  287 (473)
T KOG1320|consen  208 GPGNSGEPVIVGVDKVAGVAFLKIKTPENILYVIPLGVSSHFRTGVEVSAIGNGFGLLNTLTQGMVSGQLRKSFKLGLET  287 (473)
T ss_pred             cCCccCCCeEEccccccceEEEEEecCCcccceeecceeeeecccceeeccccCceeeeeeeecccccccccccccCccc
Confidence            6  8899999999999999999997654337788888888899999999999999999999999999888775543333 


Q ss_pred             -CCcccEEEEccccCCCCCCceeeCCCccEEEEeecccCCCCCCCCceeeecccc-------------------------
Q 016647          262 -RPIQDVIQTDAAINPGNSGGPLLDSSGSLIGINTAIYSPSGASSGVGFSIPVDT-------------------------  315 (385)
Q Consensus       262 -~~~~~~i~~d~~i~~G~SGGPlvn~~G~VVGI~s~~~~~~~~~~~~g~aIP~~~-------------------------  315 (385)
                       ....+++++|+.++.|+||+|++|.+|++||+++......+-..+++|++|.+.                         
T Consensus       288 g~~i~~~~qtd~ai~~~nsg~~ll~~DG~~IgVn~~~~~ri~~~~~iSf~~p~d~vl~~v~r~~e~~~~lr~~~~~~p~~  367 (473)
T KOG1320|consen  288 GVLISKINQTDAAINPGNSGGPLLNLDGEVIGVNTRKVTRIGFSHGISFKIPIDTVLVIVLRLGEFQISLRPVKPLVPVH  367 (473)
T ss_pred             ceeeeeecccchhhhcccCCCcEEEecCcEeeeeeeeeEEeeccccceeccCchHhhhhhhhhhhhceeeccccCccccc
Confidence             445688999999999999999999999999999877654444568899999886                         


Q ss_pred             ------------------------------------cccccccccccCCCCCcEEEEECCEEeCCHHHHHHHHhcCCCCC
Q 016647          316 ------------------------------------GLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQCKVGD  359 (385)
Q Consensus       316 ------------------------------------~v~~~~~a~~~gl~~GDiI~~ing~~i~s~~~l~~~l~~~~~g~  359 (385)
                                                          .+.+++++...++++||+|.+|||++|.|..++.++++++.+++
T Consensus       368 ~~~g~~s~~i~~g~vf~~~~~~~~~~~~~~q~v~is~Vlp~~~~~~~~~~~g~~V~~vng~~V~n~~~l~~~i~~~~~~~  447 (473)
T KOG1320|consen  368 QYIGLPSYYIFAGLVFVPLTKSYIFPSGVVQLVLVSQVLPGSINGGYGLKPGDQVVKVNGKPVKNLKHLYELIEECSTED  447 (473)
T ss_pred             ccCCceeEEEecceEEeecCCCccccccceeEEEEEEeccCCCcccccccCCCEEEEECCEEeechHHHHHHHHhcCcCc
Confidence                                                23346666777899999999999999999999999999999999


Q ss_pred             EEEEEEEECCEEEEEEEEeecCC
Q 016647          360 EVIVEVLRGDQKEKIPVKLEPKP  382 (385)
Q Consensus       360 ~v~l~v~R~g~~~~~~v~~~~~~  382 (385)
                      +|.+..+|..|..++.+..++..
T Consensus       448 ~v~vl~~~~~e~~tl~Il~~~~~  470 (473)
T KOG1320|consen  448 KVAVLDRRSAEDATLEILPEHKI  470 (473)
T ss_pred             eEEEEEecCccceeEEecccccC
Confidence            99999999999999988766543


No 8  
>KOG1421 consensus Predicted signaling-associated protein (contains a PDZ domain) [General function prediction only]
Probab=99.84  E-value=1.3e-19  Score=182.16  Aligned_cols=254  Identities=26%  Similarity=0.352  Sum_probs=197.6

Q ss_pred             hHHHHHHHhCCceEEEEEeeeccCccccccccCCCeEEEEEEEcCC-CEEEecccccC-CCCeEEEEecCCCeEeeEEEE
Q 016647          117 ATVRLFQENTPSVVNITNLAARQDAFTLDVLEVPQGSGSGFVWDSK-GHVVTNYHVIR-GASDIRVTFADQSAYDAKIVG  194 (385)
Q Consensus       117 ~~~~~~~~~~~SVV~I~~~~~~~~~~~~~~~~~~~~~GSGfiI~~~-G~ILT~aHvv~-~~~~i~V~~~dg~~~~a~vv~  194 (385)
                      .+...+..+-++||.|......  .|  +....+.+.+|||++++. |++|||+|++. +.-.-.+.+.+..+.+.-.++
T Consensus        53 ~w~~~ia~VvksvVsI~~S~v~--~f--dtesag~~~atgfvvd~~~gyiLtnrhvv~pgP~va~avf~n~ee~ei~pvy  128 (955)
T KOG1421|consen   53 DWRNTIANVVKSVVSIRFSAVR--AF--DTESAGESEATGFVVDKKLGYILTNRHVVAPGPFVASAVFDNHEEIEIYPVY  128 (955)
T ss_pred             hhhhhhhhhcccEEEEEehhee--ec--ccccccccceeEEEEecccceEEEeccccCCCCceeEEEecccccCCccccc
Confidence            5667888999999999875432  11  222335678999999977 89999999996 444567778777788888889


Q ss_pred             ECCCCCeEEEEEcCCC---CCCcceecCCCCCCCCCCEEEEEeCCCCCCCceEEeEEeeeeeeeccCCCC---CCcccEE
Q 016647          195 FDQDKDVAVLRIDAPK---DKLRPIPIGVSADLLVGQKVYAIGNPFGLDHTLTTGVISGLRREISSAATG---RPIQDVI  268 (385)
Q Consensus       195 ~d~~~DlAlLkv~~~~---~~~~~l~l~~~~~~~~G~~V~~iG~p~g~~~~~~~G~Vs~~~~~~~~~~~~---~~~~~~i  268 (385)
                      .|+.+|+.+++.+...   ..+..+.++. +..++|.++.++|+..+.-.++..|.++.+++....+...   .....++
T Consensus       129 rDpVhdfGf~r~dps~ir~s~vt~i~lap-~~akvgseirvvgNDagEklsIlagflSrldr~apdyg~~~yndfnTfy~  207 (955)
T KOG1421|consen  129 RDPVHDFGFFRYDPSTIRFSIVTEICLAP-ELAKVGSEIRVVGNDAGEKLSILAGFLSRLDRNAPDYGEDTYNDFNTFYI  207 (955)
T ss_pred             CCchhhcceeecChhhcceeeeeccccCc-cccccCCceEEecCCccceEEeehhhhhhccCCCccccccccccccceee
Confidence            9999999999998643   2344455543 3468899999999987777788889999888876655322   1123456


Q ss_pred             EEccccCCCCCCceeeCCCccEEEEeecccCCCCCCCCceeeecccc---------------------------------
Q 016647          269 QTDAAINPGNSGGPLLDSSGSLIGINTAIYSPSGASSGVGFSIPVDT---------------------------------  315 (385)
Q Consensus       269 ~~d~~i~~G~SGGPlvn~~G~VVGI~s~~~~~~~~~~~~g~aIP~~~---------------------------------  315 (385)
                      |.-+....|.||+|++|.+|..|.++..+..    ..+.+|++|.+.                                 
T Consensus       208 QaasstsggssgspVv~i~gyAVAl~agg~~----ssas~ffLpLdrV~RaL~clq~n~PItRGtLqvefl~k~~de~rr  283 (955)
T KOG1421|consen  208 QAASSTSGGSSGSPVVDIPGYAVALNAGGSI----SSASDFFLPLDRVVRALRCLQNNTPITRGTLQVEFLHKLFDECRR  283 (955)
T ss_pred             eehhcCCCCCCCCceecccceEEeeecCCcc----cccccceeeccchhhhhhhhhcCCCcccceEEEEEehhhhHHHHh
Confidence            7777788999999999999999999987654    335678888875                                 


Q ss_pred             -------------------------cccccccccccCCCCCcEEEEECCEEeCCHHHHHHHHhcCCCCCEEEEEEEECCE
Q 016647          316 -------------------------GLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQCKVGDEVIVEVLRGDQ  370 (385)
Q Consensus       316 -------------------------~v~~~~~a~~~gl~~GDiI~~ing~~i~s~~~l~~~l~~~~~g~~v~l~v~R~g~  370 (385)
                                               ++.+++++.+ .|++||+++++|+.-+.++.++.++|++. .|+.++|+|+|+|+
T Consensus       284 lGL~sE~eqv~r~k~P~~tgmLvV~~vL~~gpa~k-~Le~GDillavN~t~l~df~~l~~iLDeg-vgk~l~LtI~Rggq  361 (955)
T KOG1421|consen  284 LGLSSEWEQVVRTKFPERTGMLVVETVLPEGPAEK-KLEPGDILLAVNSTCLNDFEALEQILDEG-VGKNLELTIQRGGQ  361 (955)
T ss_pred             cCCcHHHHHHHHhcCcccceeEEEEEeccCCchhh-ccCCCcEEEEEcceehHHHHHHHHHHhhc-cCceEEEEEEeCCE
Confidence                                     2234444444 49999999999999999999999999985 89999999999999


Q ss_pred             EEEEEEEeecC
Q 016647          371 KEKIPVKLEPK  381 (385)
Q Consensus       371 ~~~~~v~~~~~  381 (385)
                      +.++.++.+..
T Consensus       362 elel~vtvqdl  372 (955)
T KOG1421|consen  362 ELELTVTVQDL  372 (955)
T ss_pred             EEEEEEEeccc
Confidence            99999887665


No 9  
>PF13365 Trypsin_2:  Trypsin-like peptidase domain; PDB: 1Y8T_A 2Z9I_A 3QO6_A 1L1J_A 1QY6_A 2O8L_A 3OTP_E 2ZLE_I 1KY9_A 3CS0_A ....
Probab=99.70  E-value=2.7e-16  Score=129.96  Aligned_cols=109  Identities=38%  Similarity=0.599  Sum_probs=74.4

Q ss_pred             EEEEEEcCCCEEEecccccC--------CCCeEEEEecCCCeEe--eEEEEECCC-CCeEEEEEcCCCCCCcceecCCCC
Q 016647          154 GSGFVWDSKGHVVTNYHVIR--------GASDIRVTFADQSAYD--AKIVGFDQD-KDVAVLRIDAPKDKLRPIPIGVSA  222 (385)
Q Consensus       154 GSGfiI~~~G~ILT~aHvv~--------~~~~i~V~~~dg~~~~--a~vv~~d~~-~DlAlLkv~~~~~~~~~l~l~~~~  222 (385)
                      ||||+|+++|+||||+||+.        ....+.+...++....  +++++.|+. .|+|||+++.             .
T Consensus         1 GTGf~i~~~g~ilT~~Hvv~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~D~All~v~~-------------~   67 (120)
T PF13365_consen    1 GTGFLIGPDGYILTAAHVVEDWNDGKQPDNSSVEVVFPDGRRVPPVAEVVYFDPDDYDLALLKVDP-------------W   67 (120)
T ss_dssp             EEEEEEETTTEEEEEHHHHTCCTT--G-TCSEEEEEETTSCEEETEEEEEEEETT-TTEEEEEESC-------------E
T ss_pred             CEEEEEcCCceEEEchhheecccccccCCCCEEEEEecCCCEEeeeEEEEEECCccccEEEEEEec-------------c
Confidence            89999999999999999998        4566888888888888  999999999 9999999970             0


Q ss_pred             CCCCCCEEEEEeCCCCCCCceEEeEEeeeeeeeccCCCCCCcccEEEEccccCCCCCCceeeCCCccEEEE
Q 016647          223 DLLVGQKVYAIGNPFGLDHTLTTGVISGLRREISSAATGRPIQDVIQTDAAINPGNSGGPLLDSSGSLIGI  293 (385)
Q Consensus       223 ~~~~G~~V~~iG~p~g~~~~~~~G~Vs~~~~~~~~~~~~~~~~~~i~~d~~i~~G~SGGPlvn~~G~VVGI  293 (385)
                       ...+......+            ...........    ......+ +++.+.+|+||||+||.+|+||||
T Consensus        68 -~~~~~~~~~~~------------~~~~~~~~~~~----~~~~~~~-~~~~~~~G~SGgpv~~~~G~vvGi  120 (120)
T PF13365_consen   68 -TGVGGGVRVPG------------STSGVSPTSTN----DNRMLYI-TDADTRPGSSGGPVFDSDGRVVGI  120 (120)
T ss_dssp             -EEEEEEEEEEE------------EEEEEEEEEEE----ETEEEEE-ESSS-STTTTTSEEEETTSEEEEE
T ss_pred             -cceeeeeEeee------------eccccccccCc----ccceeEe-eecccCCCcEeHhEECCCCEEEeC
Confidence             00000000000            00000000000    0001113 799999999999999999999997


No 10 
>PF00089 Trypsin:  Trypsin;  InterPro: IPR001254 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold. Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity.
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base. The geometric orientations of the catalytic residues are similar between families, despite different protein folds. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC). This group of serine proteases belong to the MEROPS peptidase family S1 (chymotrypsin family, clan PA(S)) and to peptidase family S6 (Hap serine peptidases). The chymotrypsin family is almost totally confined to animals, although trypsin-like enzymes are found in actinomycetes of the genera Streptomyces and Saccharopolyspora, and in the fungus Fusarium oxysporum. The enzymes are inherently secreted, being synthesised with a signal peptide that targets them to the secretory pathway. Animal enzymes are either secreted directly, packaged into vesicles for regulated secretion, or are retained in leukocyte granules. The Hap family, 'Haemophilus adhesion and penetration', are proteins that play a role in the interaction with human epithelial cells. The serine protease activity is localized at the N-terminal domain, whereas the binding domain is in the C-terminal region.
; GO: 0004252 serine-type endopeptidase activity, 0006508 proteolysis; PDB: 1SPJ_A 1A5I_A 2ZGH_A 2ZKS_A 2ZGJ_A 2ZGC_A 2ODP_A 2I6Q_A 2I6S_A 2ODQ_A ....
Probab=99.42  E-value=1.3e-11  Score=112.09  Aligned_cols=146  Identities=25%  Similarity=0.401  Sum_probs=98.0

Q ss_pred             CeEEEEEEEcCCCEEEecccccCCCCeEEEEecC-------C--CeEeeEEEEE----CC---CCCeEEEEEcCC---CC
Q 016647          151 QGSGSGFVWDSKGHVVTNYHVIRGASDIRVTFAD-------Q--SAYDAKIVGF----DQ---DKDVAVLRIDAP---KD  211 (385)
Q Consensus       151 ~~~GSGfiI~~~G~ILT~aHvv~~~~~i~V~~~d-------g--~~~~a~vv~~----d~---~~DlAlLkv~~~---~~  211 (385)
                      ...|+|++|+++ +|||++||+.+..++.+.+..       +  ..+..+-+..    +.   .+|+|||+++.+   ..
T Consensus        24 ~~~C~G~li~~~-~vLTaahC~~~~~~~~v~~g~~~~~~~~~~~~~~~v~~~~~h~~~~~~~~~~DiAll~L~~~~~~~~  102 (220)
T PF00089_consen   24 RFFCTGTLISPR-WVLTAAHCVDGASDIKVRLGTYSIRNSDGSEQTIKVSKIIIHPKYDPSTYDNDIALLKLDRPITFGD  102 (220)
T ss_dssp             EEEEEEEEEETT-EEEEEGGGHTSGGSEEEEESESBTTSTTTTSEEEEEEEEEEETTSBTTTTTTSEEEEEESSSSEHBS
T ss_pred             CeeEeEEecccc-ccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccc
Confidence            458999999987 999999999996667775542       2  2344443333    22   569999999987   24


Q ss_pred             CCcceecCCC-CCCCCCCEEEEEeCCCCCCC----ceEEeEEeeeeeeeccCC-CCCCcccEEEEcc----ccCCCCCCc
Q 016647          212 KLRPIPIGVS-ADLLVGQKVYAIGNPFGLDH----TLTTGVISGLRREISSAA-TGRPIQDVIQTDA----AINPGNSGG  281 (385)
Q Consensus       212 ~~~~l~l~~~-~~~~~G~~V~~iG~p~g~~~----~~~~G~Vs~~~~~~~~~~-~~~~~~~~i~~d~----~i~~G~SGG  281 (385)
                      .+.++.+... ..+..|+.+.++||+.....    ......+.......+... ........+....    ..+.|+|||
T Consensus       103 ~~~~~~l~~~~~~~~~~~~~~~~G~~~~~~~~~~~~~~~~~~~~~~~~~c~~~~~~~~~~~~~c~~~~~~~~~~~g~sG~  182 (220)
T PF00089_consen  103 NIQPICLPSAGSDPNVGTSCIVVGWGRTSDNGYSSNLQSVTVPVVSRKTCRSSYNDNLTPNMICAGSSGSGDACQGDSGG  182 (220)
T ss_dssp             SBEESBBTSTTHTTTTTSEEEEEESSBSSTTSBTSBEEEEEEEEEEHHHHHHHTTTTSTTTEEEEETTSSSBGGTTTTTS
T ss_pred             cccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccc
Confidence            5677777652 23588999999999875332    344444443333222211 1112245566655    788999999


Q ss_pred             eeeCCCccEEEEeecc
Q 016647          282 PLLDSSGSLIGINTAI  297 (385)
Q Consensus       282 Plvn~~G~VVGI~s~~  297 (385)
                      |+++.++.|+||++..
T Consensus       183 pl~~~~~~lvGI~s~~  198 (220)
T PF00089_consen  183 PLICNNNYLVGIVSFG  198 (220)
T ss_dssp             EEEETTEEEEEEEEEE
T ss_pred             ccccceeeecceeeec
Confidence            9998776799999987


No 11 
>KOG1421 consensus Predicted signaling-associated protein (contains a PDZ domain) [General function prediction only]
Probab=99.35  E-value=3.6e-11  Score=121.90  Aligned_cols=247  Identities=21%  Similarity=0.242  Sum_probs=161.4

Q ss_pred             HHHhCCceEEEEEeeeccCccccccccCCCeEEEEEEEcCC-CEEEecccccC-CCCeEEEEecCCCeEeeEEEEECCCC
Q 016647          122 FQENTPSVVNITNLAARQDAFTLDVLEVPQGSGSGFVWDSK-GHVVTNYHVIR-GASDIRVTFADQSAYDAKIVGFDQDK  199 (385)
Q Consensus       122 ~~~~~~SVV~I~~~~~~~~~~~~~~~~~~~~~GSGfiI~~~-G~ILT~aHvv~-~~~~i~V~~~dg~~~~a~vv~~d~~~  199 (385)
                      .+++..+.|.+....    +++.+........|||.|++.+ |++++...++. +..+.+|.+.|...++|.+.+.++..
T Consensus       524 ~~~i~~~~~~v~~~~----~~~l~g~s~~i~kgt~~i~d~~~g~~vvsr~~vp~d~~d~~vt~~dS~~i~a~~~fL~~t~  599 (955)
T KOG1421|consen  524 SADISNCLVDVEPMM----PVNLDGVSSDIYKGTALIMDTSKGLGVVSRSVVPSDAKDQRVTEADSDGIPANVSFLHPTE  599 (955)
T ss_pred             hhHHhhhhhhheece----eeccccchhhhhcCceEEEEccCCceeEecccCCchhhceEEeecccccccceeeEecCcc
Confidence            355666777776632    3444444444567999999866 89999999996 66788999999889999999999999


Q ss_pred             CeEEEEEcCCCCCCcceecCCCCCCCCCCEEEEEeCCCCCCCceEEeEEeeeeeeecc----CCCCCCcccEEEEccccC
Q 016647          200 DVAVLRIDAPKDKLRPIPIGVSADLLVGQKVYAIGNPFGLDHTLTTGVISGLRREISS----AATGRPIQDVIQTDAAIN  275 (385)
Q Consensus       200 DlAlLkv~~~~~~~~~l~l~~~~~~~~G~~V~~iG~p~g~~~~~~~G~Vs~~~~~~~~----~~~~~~~~~~i~~d~~i~  275 (385)
                      ++|.+|.+...  ...++|.+ ..+..||++...|+......-.....|..+......    ........+.|..++.+.
T Consensus       600 n~a~~kydp~~--~~~~kl~~-~~v~~gD~~~f~g~~~~~r~ltaktsv~dvs~~~~ps~~~pr~r~~n~e~Is~~~nls  676 (955)
T KOG1421|consen  600 NVASFKYDPAL--EVQLKLTD-TTVLRGDECTFEGFTEDLRALTAKTSVTDVSVVIIPSSVMPRFRATNLEVISFMDNLS  676 (955)
T ss_pred             ceeEeccChhH--hhhhccce-eeEecCCceeEecccccchhhcccceeeeeEEEEecCCCCcceeecceEEEEEecccc
Confidence            99999998743  34555643 447899999999987543322122222211111111    111122345666666665


Q ss_pred             CCCCCceeeCCCccEEEEeecccCCC--CCC--CCceeeecccc--------------------------------ccc-
Q 016647          276 PGNSGGPLLDSSGSLIGINTAIYSPS--GAS--SGVGFSIPVDT--------------------------------GLL-  318 (385)
Q Consensus       276 ~G~SGGPlvn~~G~VVGI~s~~~~~~--~~~--~~~g~aIP~~~--------------------------------~v~-  318 (385)
                      -+.--|-+.|.+|+|+|++-..+...  +..  .-+|.+++.-.                                |+. 
T Consensus       677 T~c~sg~ltdddg~vvalwl~~~ge~~~~kd~~y~~gl~~~~~l~vl~rlk~g~~~rp~i~~vef~~i~laqar~lglp~  756 (955)
T KOG1421|consen  677 TSCLSGRLTDDDGEVVALWLSVVGEDVGGKDYTYKYGLSMSYILPVLERLKLGPSARPTIAGVEFSHITLAQARTLGLPS  756 (955)
T ss_pred             ccccceEEECCCCeEEEEEeeeeccccCCceeEEEeccchHHHHHHHHHHhcCCCCCceeeccceeeEEeehhhccCCCH
Confidence            55555668899999999976554431  111  12244433322                                000 


Q ss_pred             --------------------ccccccccCCCCCcEEEEECCEEeCCHHHHHHHHhcCCCCCEEEEEEEECCEEEEEEEEe
Q 016647          319 --------------------STKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQCKVGDEVIVEVLRGDQKEKIPVKL  378 (385)
Q Consensus       319 --------------------~~~~a~~~gl~~GDiI~~ing~~i~s~~~l~~~l~~~~~g~~v~l~v~R~g~~~~~~v~~  378 (385)
                                          .-.+-...-|..||||+++|||.|+.+.||.++.       .+.++|+|+|.++++++.+
T Consensus       757 e~imk~e~es~~~~ql~~ishv~~~~~kil~~gdiilsvngk~itr~~dl~d~~-------eid~~ilrdg~~~~ikipt  829 (955)
T KOG1421|consen  757 EFIMKSEEESTIPRQLYVISHVRPLLHKILGVGDIILSVNGKMITRLSDLHDFE-------EIDAVILRDGIEMEIKIPT  829 (955)
T ss_pred             HHHhhhhhcCCCcceEEEEEeeccCcccccccccEEEEecCeEEeeehhhhhhh-------hhheeeeecCcEEEEEecc
Confidence                                0001111236789999999999999999999733       4789999999999999887


Q ss_pred             ecCC
Q 016647          379 EPKP  382 (385)
Q Consensus       379 ~~~~  382 (385)
                      -+..
T Consensus       830 ~p~~  833 (955)
T KOG1421|consen  830 YPEY  833 (955)
T ss_pred             cccc
Confidence            6554


No 12 
>cd00190 Tryp_SPc Trypsin-like serine protease; Many of these are synthesized as inactive precursor zymogens that are cleaved during limited proteolysis to generate their active forms. Alignment contains also inactive enzymes that have substitutions of the catalytic triad residues.
Probab=99.31  E-value=8.2e-11  Score=107.55  Aligned_cols=147  Identities=22%  Similarity=0.297  Sum_probs=91.6

Q ss_pred             CeEEEEEEEcCCCEEEecccccCCC--CeEEEEecCC---------CeEeeEEEEEC-------CCCCeEEEEEcCCC--
Q 016647          151 QGSGSGFVWDSKGHVVTNYHVIRGA--SDIRVTFADQ---------SAYDAKIVGFD-------QDKDVAVLRIDAPK--  210 (385)
Q Consensus       151 ~~~GSGfiI~~~G~ILT~aHvv~~~--~~i~V~~~dg---------~~~~a~vv~~d-------~~~DlAlLkv~~~~--  210 (385)
                      ...|+|++|+++ +|||+|||+.+.  ..+.|.+...         ..+..+-+..+       ..+|||||+++.+.  
T Consensus        24 ~~~C~GtlIs~~-~VLTaAhC~~~~~~~~~~v~~g~~~~~~~~~~~~~~~v~~~~~hp~y~~~~~~~DiAll~L~~~~~~  102 (232)
T cd00190          24 RHFCGGSLISPR-WVLTAAHCVYSSAPSNYTVRLGSHDLSSNEGGGQVIKVKKVIVHPNYNPSTYDNDIALLKLKRPVTL  102 (232)
T ss_pred             cEEEEEEEeeCC-EEEECHHhcCCCCCccEEEEeCcccccCCCCceEEEEEEEEEECCCCCCCCCcCCEEEEEECCcccC
Confidence            458999999987 999999999875  5666766432         22333333333       35799999998754  


Q ss_pred             -CCCcceecCCCC-CCCCCCEEEEEeCCCCCCC-----ceEEeEEeeeeeeeccCCCC---CCcccEEEE-----ccccC
Q 016647          211 -DKLRPIPIGVSA-DLLVGQKVYAIGNPFGLDH-----TLTTGVISGLRREISSAATG---RPIQDVIQT-----DAAIN  275 (385)
Q Consensus       211 -~~~~~l~l~~~~-~~~~G~~V~~iG~p~g~~~-----~~~~G~Vs~~~~~~~~~~~~---~~~~~~i~~-----d~~i~  275 (385)
                       ..+.|+.|.... .+..|+.+.++||......     ......+..+....+.....   ......+..     ....|
T Consensus       103 ~~~v~picl~~~~~~~~~~~~~~~~G~g~~~~~~~~~~~~~~~~~~~~~~~~C~~~~~~~~~~~~~~~C~~~~~~~~~~c  182 (232)
T cd00190         103 SDNVRPICLPSSGYNLPAGTTCTVSGWGRTSEGGPLPDVLQEVNVPIVSNAECKRAYSYGGTITDNMLCAGGLEGGKDAC  182 (232)
T ss_pred             CCcccceECCCccccCCCCCEEEEEeCCcCCCCCCCCceeeEEEeeeECHHHhhhhccCcccCCCceEeeCCCCCCCccc
Confidence             236778886553 5778999999999754332     12222222222211111100   011122222     34567


Q ss_pred             CCCCCceeeCCC---ccEEEEeeccc
Q 016647          276 PGNSGGPLLDSS---GSLIGINTAIY  298 (385)
Q Consensus       276 ~G~SGGPlvn~~---G~VVGI~s~~~  298 (385)
                      .|+||||++...   +.++||.+...
T Consensus       183 ~gdsGgpl~~~~~~~~~lvGI~s~g~  208 (232)
T cd00190         183 QGDSGGPLVCNDNGRGVLVGIVSWGS  208 (232)
T ss_pred             cCCCCCcEEEEeCCEEEEEEEEehhh
Confidence            899999999653   78999998764


No 13 
>PF13180 PDZ_2:  PDZ domain; PDB: 2L97_A 1Y8T_A 2Z9I_A 1LCY_A 2PZD_B 2P3W_A 1VCW_C 1TE0_B 1SOZ_C 1SOT_C ....
Probab=99.30  E-value=1.1e-11  Score=96.42  Aligned_cols=63  Identities=37%  Similarity=0.527  Sum_probs=58.0

Q ss_pred             cccccccccccCCCCCcEEEEECCEEeCCHHHHHHHHhcCCCCCEEEEEEEECCEEEEEEEEe
Q 016647          316 GLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQCKVGDEVIVEVLRGDQKEKIPVKL  378 (385)
Q Consensus       316 ~v~~~~~a~~~gl~~GDiI~~ing~~i~s~~~l~~~l~~~~~g~~v~l~v~R~g~~~~~~v~~  378 (385)
                      .+.+++|++.+||++||+|++|||++|++..++.+++...++|++++|+|+|+|+.+++++++
T Consensus        20 ~V~~~spA~~aGl~~GD~I~~ing~~v~~~~~~~~~l~~~~~g~~v~l~v~R~g~~~~~~v~l   82 (82)
T PF13180_consen   20 SVIPGSPAAKAGLQPGDIILAINGKPVNSSEDLVNILSKGKPGDTVTLTVLRDGEELTVEVTL   82 (82)
T ss_dssp             EESTTSHHHHTTS-TTEEEEEETTEESSSHHHHHHHHHCSSTTSEEEEEEEETTEEEEEEEE-
T ss_pred             EeCCCCcHHHCCCCCCcEEEEECCEEcCCHHHHHHHHHhCCCCCEEEEEEEECCEEEEEEEEC
Confidence            367889999999999999999999999999999999998999999999999999999999875


No 14 
>KOG1320 consensus Serine protease [Posttranslational modification, protein turnover, chaperones]
Probab=99.26  E-value=8.8e-12  Score=124.52  Aligned_cols=236  Identities=21%  Similarity=0.277  Sum_probs=161.7

Q ss_pred             HHhCCceEEEEEeeeccCcccccccc-CCCeEEEEEEEcCCCEEEecccccC---CCCeEEEEe-cCCCeEeeEEEEECC
Q 016647          123 QENTPSVVNITNLAARQDAFTLDVLE-VPQGSGSGFVWDSKGHVVTNYHVIR---GASDIRVTF-ADQSAYDAKIVGFDQ  197 (385)
Q Consensus       123 ~~~~~SVV~I~~~~~~~~~~~~~~~~-~~~~~GSGfiI~~~G~ILT~aHvv~---~~~~i~V~~-~dg~~~~a~vv~~d~  197 (385)
                      +....+++.+........+...|... .....|+||.+... .++|++|++.   +...+.+.- +.-+.|.+++...-.
T Consensus        57 ~~~~~s~~~v~~~~~~~~~~~pw~~~~q~~~~~s~f~i~~~-~lltn~~~v~~~~~~~~v~v~~~gs~~k~~~~v~~~~~  135 (473)
T KOG1320|consen   57 DLALQSVVKVFSVSTEPSSVLPWQRTRQFSSGGSGFAIYGK-KLLTNAHVVAPNNDHKFVTVKKHGSPRKYKAFVAAVFE  135 (473)
T ss_pred             cccccceeEEEeecccccccCcceeeehhcccccchhhccc-ceeecCccccccccccccccccCCCchhhhhhHHHhhh
Confidence            34456788887766555433333332 23457999999754 8999999998   555566652 233568888888889


Q ss_pred             CCCeEEEEEcCCC--CCCcceecCCCCCCCCCCEEEEEeCCCCCCCceEEeEEeeeeeeeccCCCCCCcccEEEEccccC
Q 016647          198 DKDVAVLRIDAPK--DKLRPIPIGVSADLLVGQKVYAIGNPFGLDHTLTTGVISGLRREISSAATGRPIQDVIQTDAAIN  275 (385)
Q Consensus       198 ~~DlAlLkv~~~~--~~~~~l~l~~~~~~~~G~~V~~iG~p~g~~~~~~~G~Vs~~~~~~~~~~~~~~~~~~i~~d~~i~  275 (385)
                      +.|+|++.++...  ....++.+  .+-+...+.++++|   |....++.|.|.......  +..+......+++++.++
T Consensus       136 ~cd~Avv~Ie~~~f~~~~~~~e~--~~ip~l~~S~~Vv~---gd~i~VTnghV~~~~~~~--y~~~~~~l~~vqi~aa~~  208 (473)
T KOG1320|consen  136 ECDLAVVYIESEEFWKGMNPFEL--GDIPSLNGSGFVVG---GDGIIVTNGHVVRVEPRI--YAHSSTVLLRVQIDAAIG  208 (473)
T ss_pred             cccceEEEEeeccccCCCccccc--CCCcccCccEEEEc---CCcEEEEeeEEEEEEecc--ccCCCcceeeEEEEEeec
Confidence            9999999998643  12222333  34456678899998   667789999998765542  222333445789999999


Q ss_pred             CCCCCceeeCCCccEEEEeecccCCCCCCCCceeeecccc-------------------------ccc------------
Q 016647          276 PGNSGGPLLDSSGSLIGINTAIYSPSGASSGVGFSIPVDT-------------------------GLL------------  318 (385)
Q Consensus       276 ~G~SGGPlvn~~G~VVGI~s~~~~~~~~~~~~g~aIP~~~-------------------------~v~------------  318 (385)
                      +|+||+|.+...+++.|+........+   .+++.||...                         ++.            
T Consensus       209 ~~~s~ep~i~g~d~~~gvA~l~ik~~~---~i~~~i~~~~~~~~~~G~~~~a~~~~f~~~nt~t~g~vs~~~R~~~~lg~  285 (473)
T KOG1320|consen  209 PGNSGEPVIVGVDKVAGVAFLKIKTPE---NILYVIPLGVSSHFRTGVEVSAIGNGFGLLNTLTQGMVSGQLRKSFKLGL  285 (473)
T ss_pred             CCccCCCeEEccccccceEEEEEecCC---cccceeecceeeeecccceeeccccCceeeeeeeecccccccccccccCc
Confidence            999999999877899999987764221   4566666543                         000            


Q ss_pred             ---------ccccccccCCCCCcEEEEECCEEeCC-HH-----HHHHHHhcCCCCCEEEEEEEECC
Q 016647          319 ---------STKRDAYGRLILGDIITSVNGKKVSN-GS-----DLYRILDQCKVGDEVIVEVLRGD  369 (385)
Q Consensus       319 ---------~~~~a~~~gl~~GDiI~~ing~~i~s-~~-----~l~~~l~~~~~g~~v~l~v~R~g  369 (385)
                               ....++..-++.||+|+.+||+.|-- ..     .+...+....++|+|.+.+.|.+
T Consensus       286 ~~g~~i~~~~qtd~ai~~~nsg~~ll~~DG~~IgVn~~~~~ri~~~~~iSf~~p~d~vl~~v~r~~  351 (473)
T KOG1320|consen  286 ETGVLISKINQTDAAINPGNSGGPLLNLDGEVIGVNTRKVTRIGFSHGISFKIPIDTVLVIVLRLG  351 (473)
T ss_pred             ccceeeeeecccchhhhcccCCCcEEEecCcEeeeeeeeeEEeeccccceeccCchHhhhhhhhhh
Confidence                     00112333478999999999999941 11     24456677889999999999987


No 15 
>smart00020 Tryp_SPc Trypsin-like serine protease. Many of these are synthesised as inactive precursor zymogens that are cleaved during limited proteolysis to generate their active forms. A few, however, are active as single chain molecules, and others are inactive due to substitutions of the catalytic triad residues.
Probab=99.20  E-value=4.5e-10  Score=102.90  Aligned_cols=147  Identities=23%  Similarity=0.329  Sum_probs=90.8

Q ss_pred             CeEEEEEEEcCCCEEEecccccCCCC--eEEEEecCC--------CeEeeEEEEEC-------CCCCeEEEEEcCCC---
Q 016647          151 QGSGSGFVWDSKGHVVTNYHVIRGAS--DIRVTFADQ--------SAYDAKIVGFD-------QDKDVAVLRIDAPK---  210 (385)
Q Consensus       151 ~~~GSGfiI~~~G~ILT~aHvv~~~~--~i~V~~~dg--------~~~~a~vv~~d-------~~~DlAlLkv~~~~---  210 (385)
                      ...|+|++|+++ +|||+|||+.+..  .+.|.+...        ..+.+.-+..+       ..+|||||+++.+.   
T Consensus        25 ~~~C~GtlIs~~-~VLTaahC~~~~~~~~~~v~~g~~~~~~~~~~~~~~v~~~~~~p~~~~~~~~~DiAll~L~~~i~~~  103 (229)
T smart00020       25 RHFCGGSLISPR-WVLTAAHCVYGSDPSNIRVRLGSHDLSSGEEGQVIKVSKVIIHPNYNPSTYDNDIALLKLKSPVTLS  103 (229)
T ss_pred             CcEEEEEEecCC-EEEECHHHcCCCCCcceEEEeCcccCCCCCCceEEeeEEEEECCCCCCCCCcCCEEEEEECcccCCC
Confidence            458999999987 9999999998753  677777543        22334433322       45799999998763   


Q ss_pred             CCCcceecCCC-CCCCCCCEEEEEeCCCCCC------CceEEeEEeeeeeeeccCCCC---CCcccEEEE-----ccccC
Q 016647          211 DKLRPIPIGVS-ADLLVGQKVYAIGNPFGLD------HTLTTGVISGLRREISSAATG---RPIQDVIQT-----DAAIN  275 (385)
Q Consensus       211 ~~~~~l~l~~~-~~~~~G~~V~~iG~p~g~~------~~~~~G~Vs~~~~~~~~~~~~---~~~~~~i~~-----d~~i~  275 (385)
                      ..+.++.+... ..+..++.+.+.||+....      .......+..+....+.....   ......+..     ....+
T Consensus       104 ~~~~pi~l~~~~~~~~~~~~~~~~g~g~~~~~~~~~~~~~~~~~~~~~~~~~C~~~~~~~~~~~~~~~C~~~~~~~~~~c  183 (229)
T smart00020      104 DNVRPICLPSSNYNVPAGTTCTVSGWGRTSEGAGSLPDTLQEVNVPIVSNATCRRAYSGGGAITDNMLCAGGLEGGKDAC  183 (229)
T ss_pred             CceeeccCCCcccccCCCCEEEEEeCCCCCCCCCcCCCEeeEEEEEEeCHHHhhhhhccccccCCCcEeecCCCCCCccc
Confidence            24667777653 2467789999999986542      112222222222211111000   001111221     35578


Q ss_pred             CCCCCceeeCCCc--cEEEEeeccc
Q 016647          276 PGNSGGPLLDSSG--SLIGINTAIY  298 (385)
Q Consensus       276 ~G~SGGPlvn~~G--~VVGI~s~~~  298 (385)
                      +|+||||++...+  .++||.+...
T Consensus       184 ~gdsG~pl~~~~~~~~l~Gi~s~g~  208 (229)
T smart00020      184 QGDSGGPLVCNDGRWVLVGIVSWGS  208 (229)
T ss_pred             CCCCCCeeEEECCCEEEEEEEEECC
Confidence            8999999996443  8999998764


No 16 
>cd00991 PDZ_archaeal_metalloprotease PDZ domain of archaeal zinc metalloproteases, presumably membrane-associated or integral membrane proteases, which may be involved in signalling and regulatory mechanisms. May be responsible for substrate recognition and/or binding, as most PDZ domains bind C-terminal polypeptides, and binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of protease-associated PDZ domains a C-terminal beta-strand forms the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins.
Probab=99.04  E-value=1e-09  Score=84.67  Aligned_cols=62  Identities=26%  Similarity=0.349  Sum_probs=56.0

Q ss_pred             cccccccccccCCCCCcEEEEECCEEeCCHHHHHHHHhcCCCCCEEEEEEEECCEEEEEEEE
Q 016647          316 GLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQCKVGDEVIVEVLRGDQKEKIPVK  377 (385)
Q Consensus       316 ~v~~~~~a~~~gl~~GDiI~~ing~~i~s~~~l~~~l~~~~~g~~v~l~v~R~g~~~~~~v~  377 (385)
                      .+.++++++++||++||+|++|||++|.++.++.+++...++|+.+.+++.|+|+..+++++
T Consensus        16 ~V~~~spa~~aGL~~GDiI~~Ing~~v~~~~d~~~~l~~~~~g~~v~l~v~r~g~~~~~~~~   77 (79)
T cd00991          16 GVIVGSPAENAVLHTGDVIYSINGTPITTLEDFMEALKPTKPGEVITVTVLPSTTKLTNVST   77 (79)
T ss_pred             EECCCChHHhcCCCCCCEEEEECCEEcCCHHHHHHHHhcCCCCCEEEEEEEECCEEEEEEEE
Confidence            34577888899999999999999999999999999998877899999999999998887765


No 17 
>cd00986 PDZ_LON_protease PDZ domain of ATP-dependent LON serine proteases. Most PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this bacterial subfamily of protease-associated PDZ domains a C-terminal beta-strand  is thought to form the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins.
Probab=99.03  E-value=1.7e-09  Score=83.35  Aligned_cols=65  Identities=26%  Similarity=0.351  Sum_probs=57.4

Q ss_pred             ccccccccccCCCCCcEEEEECCEEeCCHHHHHHHHhcCCCCCEEEEEEEECCEEEEEEEEeecCC
Q 016647          317 LLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQCKVGDEVIVEVLRGDQKEKIPVKLEPKP  382 (385)
Q Consensus       317 v~~~~~a~~~gl~~GDiI~~ing~~i~s~~~l~~~l~~~~~g~~v~l~v~R~g~~~~~~v~~~~~~  382 (385)
                      +.++++++. +|++||+|++|||++|.+++++.+++...++|+.+.+++.|+|+.+++++++.+.+
T Consensus        15 V~~~s~A~~-gL~~GD~I~~Ing~~v~~~~~~~~~l~~~~~~~~v~l~v~r~g~~~~~~v~l~~~~   79 (79)
T cd00986          15 VVEGMPAAG-KLKAGDHIIAVDGKPFKEAEELIDYIQSKKEGDTVKLKVKREEKELPEDLILKTFP   79 (79)
T ss_pred             ECCCCchhh-CCCCCCEEEEECCEECCCHHHHHHHHHhCCCCCEEEEEEEECCEEEEEEEEEeccC
Confidence            345666665 79999999999999999999999999877789999999999999999999987653


No 18 
>COG3591 V8-like Glu-specific endopeptidase [Amino acid transport and metabolism]
Probab=98.87  E-value=3.9e-08  Score=91.34  Aligned_cols=135  Identities=20%  Similarity=0.258  Sum_probs=81.2

Q ss_pred             EEEEEEEcCCCEEEecccccCCCC----eEEEEe----cCCC-eEeeE--EEEEC-C---CCCeEEEEEcCCC-------
Q 016647          153 SGSGFVWDSKGHVVTNYHVIRGAS----DIRVTF----ADQS-AYDAK--IVGFD-Q---DKDVAVLRIDAPK-------  210 (385)
Q Consensus       153 ~GSGfiI~~~G~ILT~aHvv~~~~----~i~V~~----~dg~-~~~a~--vv~~d-~---~~DlAlLkv~~~~-------  210 (385)
                      .+++|+|+++ .+||++||+....    ++.+..    .++. .+..+  ..... .   +.|.+...+..-.       
T Consensus        65 ~~~~~lI~pn-tvLTa~Hc~~s~~~G~~~~~~~p~g~~~~~~~~~~~~~~~~~~~~g~~~~~d~~~~~v~~~~~~~g~~~  143 (251)
T COG3591          65 CTAATLIGPN-TVLTAGHCIYSPDYGEDDIAAAPPGVNSDGGPFYGITKIEIRVYPGELYKEDGASYDVGEAALESGINI  143 (251)
T ss_pred             eeeEEEEcCc-eEEEeeeEEecCCCChhhhhhcCCcccCCCCCCCceeeEEEEecCCceeccCCceeeccHHHhccCCCc
Confidence            4466999998 9999999986433    222211    1111 11111  11112 2   3466666654211       


Q ss_pred             -CCCcceecCCCCCCCCCCEEEEEeCCCCCCCce----EEeEEeeeeeeeccCCCCCCcccEEEEccccCCCCCCceeeC
Q 016647          211 -DKLRPIPIGVSADLLVGQKVYAIGNPFGLDHTL----TTGVISGLRREISSAATGRPIQDVIQTDAAINPGNSGGPLLD  285 (385)
Q Consensus       211 -~~~~~l~l~~~~~~~~G~~V~~iG~p~g~~~~~----~~G~Vs~~~~~~~~~~~~~~~~~~i~~d~~i~~G~SGGPlvn  285 (385)
                       .......+......+.++.+.++|||.+.....    ..+.+..+            ....+.+++.+.+|+||+|+++
T Consensus       144 ~~~~~~~~~~~~~~~~~~d~i~v~GYP~dk~~~~~~~e~t~~v~~~------------~~~~l~y~~dT~pG~SGSpv~~  211 (251)
T COG3591         144 GDVVNYLKRNTASEAKANDRITVIGYPGDKPNIGTMWESTGKVNSI------------KGNKLFYDADTLPGSSGSPVLI  211 (251)
T ss_pred             cccccccccccccccccCceeEEEeccCCCCcceeEeeecceeEEE------------ecceEEEEecccCCCCCCceEe
Confidence             112222333345578899999999997755322    22322211            1236889999999999999999


Q ss_pred             CCccEEEEeecccCC
Q 016647          286 SSGSLIGINTAIYSP  300 (385)
Q Consensus       286 ~~G~VVGI~s~~~~~  300 (385)
                      .+.+|||+++.+...
T Consensus       212 ~~~~vigv~~~g~~~  226 (251)
T COG3591         212 SKDEVIGVHYNGPGA  226 (251)
T ss_pred             cCceEEEEEecCCCc
Confidence            888999999877553


No 19 
>cd00989 PDZ_metalloprotease PDZ domain of bacterial and plant zinc metalloproteases, presumably membrane-associated or integral membrane proteases, which may be involved in signalling and regulatory mechanisms. May be responsible for substrate recognition and/or binding, as most PDZ domains bind C-terminal polypeptides, and binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of protease-associated PDZ domains a C-terminal beta-strand forms the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins.
Probab=98.80  E-value=2.1e-08  Score=76.80  Aligned_cols=61  Identities=21%  Similarity=0.326  Sum_probs=53.7

Q ss_pred             cccccccccccCCCCCcEEEEECCEEeCCHHHHHHHHhcCCCCCEEEEEEEECCEEEEEEEE
Q 016647          316 GLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQCKVGDEVIVEVLRGDQKEKIPVK  377 (385)
Q Consensus       316 ~v~~~~~a~~~gl~~GDiI~~ing~~i~s~~~l~~~l~~~~~g~~v~l~v~R~g~~~~~~v~  377 (385)
                      .+..+++++++||++||+|++|||+++.++.++...+... .|+.+.+++.|+|+..++.++
T Consensus        18 ~v~~~s~a~~~gl~~GD~I~~ing~~i~~~~~~~~~l~~~-~~~~~~l~v~r~~~~~~~~l~   78 (79)
T cd00989          18 EVVPGSPAAKAGLKAGDRILAINGQKIKSWEDLVDAVQEN-PGKPLTLTVERNGETITLTLT   78 (79)
T ss_pred             eECCCCHHHHcCCCCCCEEEEECCEECCCHHHHHHHHHHC-CCceEEEEEEECCEEEEEEec
Confidence            5567788888999999999999999999999999998875 488999999999988777664


No 20 
>cd00990 PDZ_glycyl_aminopeptidase PDZ domain associated with archaeal and bacterial M61 glycyl-aminopeptidases. May be responsible for substrate recognition and/or binding, as most PDZ domains bind C-terminal polypeptides, and binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of protease-associated PDZ domains a C-terminal beta-strand is presumed to form the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins.
Probab=98.73  E-value=3.7e-08  Score=75.73  Aligned_cols=61  Identities=28%  Similarity=0.405  Sum_probs=51.6

Q ss_pred             cccccccccccCCCCCcEEEEECCEEeCCHHHHHHHHhcCCCCCEEEEEEEECCEEEEEEEEee
Q 016647          316 GLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQCKVGDEVIVEVLRGDQKEKIPVKLE  379 (385)
Q Consensus       316 ~v~~~~~a~~~gl~~GDiI~~ing~~i~s~~~l~~~l~~~~~g~~v~l~v~R~g~~~~~~v~~~  379 (385)
                      .+.++++++.+||++||+|++|||++|.++.+   ++...+.|+.+.++++|+|+..++.+++.
T Consensus        18 ~V~~~s~a~~aGl~~GD~I~~Ing~~v~~~~~---~l~~~~~~~~v~l~v~r~g~~~~~~v~~~   78 (80)
T cd00990          18 FVRDDSPADKAGLVAGDELVAVNGWRVDALQD---RLKEYQAGDPVELTVFRDDRLIEVPLTLA   78 (80)
T ss_pred             EECCCChHHHhCCCCCCEEEEECCEEhHHHHH---HHHhcCCCCEEEEEEEECCEEEEEEEEec
Confidence            45678889999999999999999999998554   45455688999999999999998888765


No 21 
>cd00987 PDZ_serine_protease PDZ domain of trypsin-like serine proteases, such as DegP/HtrA, which are oligomeric proteins involved in heat-shock response, chaperone function, and apoptosis. May be responsible for substrate recognition and/or binding, as most PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of protease-associated PDZ domains a C-terminal beta-strand forms the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins.
Probab=98.73  E-value=3.9e-08  Score=77.16  Aligned_cols=60  Identities=37%  Similarity=0.488  Sum_probs=53.2

Q ss_pred             cccccccccccCCCCCcEEEEECCEEeCCHHHHHHHHhcCCCCCEEEEEEEECCEEEEEE
Q 016647          316 GLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQCKVGDEVIVEVLRGDQKEKIP  375 (385)
Q Consensus       316 ~v~~~~~a~~~gl~~GDiI~~ing~~i~s~~~l~~~l~~~~~g~~v~l~v~R~g~~~~~~  375 (385)
                      .+.++++++.+||++||+|++|||++|.++.++.+++.....|+.+.+++.|+|+.+++.
T Consensus        30 ~v~~~s~a~~~gl~~GD~I~~Ing~~i~~~~~~~~~l~~~~~~~~i~l~v~r~g~~~~~~   89 (90)
T cd00987          30 SVDPGSPAAKAGLKPGDVILAVNGKPVKSVADLRRALAELKPGDKVTLTVLRGGKELTVT   89 (90)
T ss_pred             EECCCCHHHHcCCCcCCEEEEECCEECCCHHHHHHHHHhcCCCCEEEEEEEECCEEEEee
Confidence            445678888899999999999999999999999999988777999999999999876654


No 22 
>PRK10779 zinc metallopeptidase RseP; Provisional
Probab=98.66  E-value=4.3e-08  Score=99.97  Aligned_cols=66  Identities=11%  Similarity=0.078  Sum_probs=61.2

Q ss_pred             ccccccccccccCCCCCcEEEEECCEEeCCHHHHHHHHhcCCCCCEEEEEEEECCEEEEEEEEeec
Q 016647          315 TGLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQCKVGDEVIVEVLRGDQKEKIPVKLEP  380 (385)
Q Consensus       315 ~~v~~~~~a~~~gl~~GDiI~~ing~~i~s~~~l~~~l~~~~~g~~v~l~v~R~g~~~~~~v~~~~  380 (385)
                      .++.+++|++++|||+||+|++|||++|++++|+...+....+|++++++|.|+|+.++.++++..
T Consensus       131 ~~V~~~SpA~kAGLk~GDvI~~vnG~~V~~~~~l~~~v~~~~~g~~v~v~v~R~gk~~~~~v~l~~  196 (449)
T PRK10779        131 GEIAPNSIAAQAQIAPGTELKAVDGIETPDWDAVRLALVSKIGDESTTITVAPFGSDQRRDKTLDL  196 (449)
T ss_pred             cccCCCCHHHHcCCCCCCEEEEECCEEcCCHHHHHHHHHhhccCCceEEEEEeCCccceEEEEecc
Confidence            378899999999999999999999999999999999998888999999999999999888888753


No 23 
>TIGR01713 typeII_sec_gspC general secretion pathway protein C. This model represents GspC, protein C of the main terminal branch of the general secretion pathway, also called type II secretion. This system transports folded proteins across the bacterial outer membrane and is widely distributed in Gram-negative pathogens.
Probab=98.66  E-value=7.2e-08  Score=91.02  Aligned_cols=61  Identities=21%  Similarity=0.272  Sum_probs=56.4

Q ss_pred             cccccccccCCCCCcEEEEECCEEeCCHHHHHHHHhcCCCCCEEEEEEEECCEEEEEEEEe
Q 016647          318 LSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQCKVGDEVIVEVLRGDQKEKIPVKL  378 (385)
Q Consensus       318 ~~~~~a~~~gl~~GDiI~~ing~~i~s~~~l~~~l~~~~~g~~v~l~v~R~g~~~~~~v~~  378 (385)
                      .++++++.+|||+||+|++|||+++.++.++.+++.+.++++.++++|.|+|+.+++.+.+
T Consensus       199 ~~~s~a~~aGLr~GDvIv~ING~~i~~~~~~~~~l~~~~~~~~v~l~V~R~G~~~~i~v~~  259 (259)
T TIGR01713       199 KDPSLFYKSGLQDGDIAVALNGLDLRDPEQAFQALQMLREETNLTLTVERDGQREDIYVRF  259 (259)
T ss_pred             CCCCHHHHcCCCCCCEEEEECCEEcCCHHHHHHHHHhcCCCCeEEEEEEECCEEEEEEEEC
Confidence            4567889999999999999999999999999999999999999999999999998888764


No 24 
>cd00988 PDZ_CTP_protease PDZ domain of C-terminal processing-, tail-specific-, and tricorn proteases, which function in posttranslational protein processing, maturation, and disassembly or degradation, in Bacteria, Archaea, and plant chloroplasts. May be responsible for substrate recognition and/or binding, as most PDZ domains bind C-terminal polypeptides, and binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of protease-associated PDZ domains a C-terminal beta-strand forms the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins.
Probab=98.60  E-value=1.9e-07  Score=72.53  Aligned_cols=62  Identities=24%  Similarity=0.436  Sum_probs=53.6

Q ss_pred             cccccccccccCCCCCcEEEEECCEEeCCH--HHHHHHHhcCCCCCEEEEEEEEC-CEEEEEEEEe
Q 016647          316 GLLSTKRDAYGRLILGDIITSVNGKKVSNG--SDLYRILDQCKVGDEVIVEVLRG-DQKEKIPVKL  378 (385)
Q Consensus       316 ~v~~~~~a~~~gl~~GDiI~~ing~~i~s~--~~l~~~l~~~~~g~~v~l~v~R~-g~~~~~~v~~  378 (385)
                      .+.++++++.+||++||+|++|||+++.++  .++.+++.. ..|+.+.+++.|+ |+..+++++.
T Consensus        19 ~v~~~s~a~~~gl~~GD~I~~vng~~i~~~~~~~~~~~l~~-~~~~~i~l~v~r~~~~~~~~~~~~   83 (85)
T cd00988          19 SVLPGSPAAKAGIKAGDIIVAIDGEPVDGLSLEDVVKLLRG-KAGTKVRLTLKRGDGEPREVTLTR   83 (85)
T ss_pred             EecCCCCHHHcCCCCCCEEEEECCEEcCCCCHHHHHHHhcC-CCCCEEEEEEEcCCCCEEEEEEEE
Confidence            456778888899999999999999999998  899888876 4689999999999 8887777653


No 25 
>PRK10779 zinc metallopeptidase RseP; Provisional
Probab=98.39  E-value=9.7e-07  Score=90.08  Aligned_cols=64  Identities=14%  Similarity=0.261  Sum_probs=58.3

Q ss_pred             cccccccccccCCCCCcEEEEECCEEeCCHHHHHHHHhcCCCCCEEEEEEEECCEEEEEEEEeec
Q 016647          316 GLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQCKVGDEVIVEVLRGDQKEKIPVKLEP  380 (385)
Q Consensus       316 ~v~~~~~a~~~gl~~GDiI~~ing~~i~s~~~l~~~l~~~~~g~~v~l~v~R~g~~~~~~v~~~~  380 (385)
                      .+.++++++++||++||+|++|||++|++++|+.+.+.. .+|+.+.+++.|+|+..++++++..
T Consensus       227 ~V~~~SpA~~AGL~~GDvIl~Ing~~V~s~~dl~~~l~~-~~~~~v~l~v~R~g~~~~~~v~~~~  290 (449)
T PRK10779        227 EVQPNSAASKAGLQAGDRIVKVDGQPLTQWQTFVTLVRD-NPGKPLALEIERQGSPLSLTLTPDS  290 (449)
T ss_pred             eeCCCCHHHHcCCCCCCEEEEECCEEcCCHHHHHHHHHh-CCCCEEEEEEEECCEEEEEEEEeee
Confidence            567889999999999999999999999999999999877 5789999999999999999888753


No 26 
>TIGR00054 RIP metalloprotease RseP. A model that detects fragments also matches a number of members of peptidase family S2C. The region of match appears not to overlap the active site domain.
Probab=98.38  E-value=9e-07  Score=89.53  Aligned_cols=64  Identities=22%  Similarity=0.321  Sum_probs=58.0

Q ss_pred             cccccccccccCCCCCcEEEEECCEEeCCHHHHHHHHhcCCCCCEEEEEEEECCEEEEEEEEeec
Q 016647          316 GLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQCKVGDEVIVEVLRGDQKEKIPVKLEP  380 (385)
Q Consensus       316 ~v~~~~~a~~~gl~~GDiI~~ing~~i~s~~~l~~~l~~~~~g~~v~l~v~R~g~~~~~~v~~~~  380 (385)
                      .+.++++++++|||+||+|++|||++|.+++|+.+.+.. .+|+++.+++.|+|+..++++++..
T Consensus       209 ~V~~~SpA~~aGL~~GD~Iv~Vng~~V~s~~dl~~~l~~-~~~~~v~l~v~R~g~~~~~~v~~~~  272 (420)
T TIGR00054       209 DVTPNSPAEKAGLKEGDYIQSINGEKLRSWTDFVSAVKE-NPGKSMDIKVERNGETLSISLTPEA  272 (420)
T ss_pred             EECCCCHHHHcCCCCCCEEEEECCEECCCHHHHHHHHHh-CCCCceEEEEEECCEEEEEEEEEcC
Confidence            566888999999999999999999999999999999987 5789999999999999998888743


No 27 
>TIGR02860 spore_IV_B stage IV sporulation protein B. SpoIVB, the stage IV sporulation protein B of endospore-forming bacteria such as Bacillus subtilis, is a serine proteinase, expressed in the spore (rather than mother cell) compartment, that participates in a proteolytic activation cascade for Sigma-K. It appears to be universal among endospore-forming bacteria and occurs nowhere else.
Probab=98.33  E-value=1.8e-06  Score=85.58  Aligned_cols=60  Identities=22%  Similarity=0.387  Sum_probs=54.0

Q ss_pred             ccccccccCCCCCcEEEEECCEEeCCHHHHHHHHhcCCCCCEEEEEEEECCEEEEEEEEee
Q 016647          319 STKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQCKVGDEVIVEVLRGDQKEKIPVKLE  379 (385)
Q Consensus       319 ~~~~a~~~gl~~GDiI~~ing~~i~s~~~l~~~l~~~~~g~~v~l~v~R~g~~~~~~v~~~  379 (385)
                      ..++++.+|||+||+|++|||++|.+++|+.+++...+ |+.+.++|.|+|+..+++++..
T Consensus       122 ~~SPAa~AGLq~GDiIvsING~~V~s~~DL~~iL~~~~-g~~V~LtV~R~Ge~~tv~V~Pv  181 (402)
T TIGR02860       122 IHSPGEEAGIQIGDRILKINGEKIKNMDDLANLINKAG-GEKLTLTIERGGKIIETVIKPV  181 (402)
T ss_pred             CCCHHHHcCCCCCCEEEEECCEECCCHHHHHHHHHhCC-CCeEEEEEEECCEEEEEEEEEe
Confidence            35678889999999999999999999999999998764 8999999999999998888754


No 28 
>PF00863 Peptidase_C4:  Peptidase family C4 This family belongs to family C4 of the peptidase classification.;  InterPro: IPR001730 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. 
These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad.  Nuclear inclusion A (NIA) proteases from potyviruses are cysteine peptidases belonging to the MEROPS peptidase family C4 (NIa protease family, clan PA(C)).  Potyviruses include plant viruses in which the single-stranded RNA encodes a polyprotein with NIA protease activity, where proteolytic cleavage is specific for Gln+Gly sites. The NIA protease acts on the polyprotein, releasing itself by Gln+Gly cleavage at both the N- and C-termini. It further processes the polyprotein by cleavage at five similar sites in the C-terminal half of the sequence. In addition to its C-terminal protease activity, the NIA protease contains an N-terminal domain that has been implicated in the transcription process. This peptidase is present in the nuclear inclusion protein of potyviruses.; GO: 0008234 cysteine-type peptidase activity, 0006508 proteolysis; PDB: 3MMG_B 1Q31_B 1LVB_A 1LVM_A.
Probab=98.28  E-value=8.7e-05  Score=68.58  Aligned_cols=165  Identities=16%  Similarity=0.270  Sum_probs=86.4

Q ss_pred             HHhCCceEEEEEeeeccCccccccccCCCeEEEEEEEcCCCEEEecccccCC-CCeEEEEecCCCeEeeE-----EEEEC
Q 016647          123 QENTPSVVNITNLAARQDAFTLDVLEVPQGSGSGFVWDSKGHVVTNYHVIRG-ASDIRVTFADQSAYDAK-----IVGFD  196 (385)
Q Consensus       123 ~~~~~SVV~I~~~~~~~~~~~~~~~~~~~~~GSGfiI~~~G~ILT~aHvv~~-~~~i~V~~~dg~~~~a~-----vv~~d  196 (385)
                      ..+...|++|.......           ...=-|+... + +|||++|.+.. ...++|...-|.- ...     -+..-
T Consensus        14 n~Ia~~ic~l~n~s~~~-----------~~~l~gigyG-~-~iItn~HLf~~nng~L~i~s~hG~f-~v~nt~~lkv~~i   79 (235)
T PF00863_consen   14 NPIASNICRLTNESDGG-----------TRSLYGIGYG-S-YIITNAHLFKRNNGELTIKSQHGEF-TVPNTTQLKVHPI   79 (235)
T ss_dssp             HHHHTTEEEEEEEETTE-----------EEEEEEEEET-T-EEEEEGGGGSSTTCEEEEEETTEEE-EECEGGGSEEEE-
T ss_pred             chhhheEEEEEEEeCCC-----------eEEEEEEeEC-C-EEEEChhhhccCCCeEEEEeCceEE-EcCCccccceEEe
Confidence            34556788887643211           1233477775 3 89999999954 4567887766632 211     22334


Q ss_pred             CCCCeEEEEEcCCCCCCcceecC-CCCCCCCCCEEEEEeCCCCCCCceEEeEEeeeeeeeccCCCCCCcccEEEEccccC
Q 016647          197 QDKDVAVLRIDAPKDKLRPIPIG-VSADLLVGQKVYAIGNPFGLDHTLTTGVISGLRREISSAATGRPIQDVIQTDAAIN  275 (385)
Q Consensus       197 ~~~DlAlLkv~~~~~~~~~l~l~-~~~~~~~G~~V~~iG~p~g~~~~~~~G~Vs~~~~~~~~~~~~~~~~~~i~~d~~i~  275 (385)
                      +..||.++|+..   ++||.+-. ....++.+|.|+++|.-+....  ....||........ ..    ..+..+-....
T Consensus        80 ~~~DiviirmPk---DfpPf~~kl~FR~P~~~e~v~mVg~~fq~k~--~~s~vSesS~i~p~-~~----~~fWkHwIsTk  149 (235)
T PF00863_consen   80 EGRDIVIIRMPK---DFPPFPQKLKFRAPKEGERVCMVGSNFQEKS--ISSTVSESSWIYPE-EN----SHFWKHWISTK  149 (235)
T ss_dssp             TCSSEEEEE--T---TS----S---B----TT-EEEEEEEECSSCC--CEEEEEEEEEEEEE-TT----TTEEEE-C---
T ss_pred             CCccEEEEeCCc---ccCCcchhhhccCCCCCCEEEEEEEEEEcCC--eeEEECCceEEeec-CC----CCeeEEEecCC
Confidence            688999999965   45554422 2245889999999997544322  22233322221111 11    23445555566


Q ss_pred             CCCCCceeeC-CCccEEEEeecccCCCCCCCCceeeecccc
Q 016647          276 PGNSGGPLLD-SSGSLIGINTAIYSPSGASSGVGFSIPVDT  315 (385)
Q Consensus       276 ~G~SGGPlvn-~~G~VVGI~s~~~~~~~~~~~~g~aIP~~~  315 (385)
                      .|+=|.|+++ .||.+|||++....    ....+|..|+..
T Consensus       150 ~G~CG~PlVs~~Dg~IVGiHsl~~~----~~~~N~F~~f~~  186 (235)
T PF00863_consen  150 DGDCGLPLVSTKDGKIVGIHSLTSN----TSSRNYFTPFPD  186 (235)
T ss_dssp             TT-TT-EEEETTT--EEEEEEEEET----TTSSEEEEE--T
T ss_pred             CCccCCcEEEcCCCcEEEEEcCccC----CCCeEEEEcCCH
Confidence            8999999998 79999999997643    235678777766


No 29 
>TIGR02037 degP_htrA_DO periplasmic serine protease, Do/DegQ family. This family consists of a set of proteins variously designated DegP, heat shock protein HtrA, and protease DO. The ortholog in Pseudomonas aeruginosa is designated MucD and is found in an operon that controls mucoid phenotype. This family also includes the DegQ (HhoA) paralog in E. coli which can rescue a DegP mutant, but not the smaller DegS paralog, which cannot. Members of this family are located in the periplasm and have separable functions as both protease and chaperone. Members have a trypsin domain and two copies of a PDZ domain. This protein protects bacteria from thermal and other stresses and may be important for the survival of bacterial pathogens. The chaperone function is dominant at low temperatures, whereas the proteolytic activity is turned on at elevated temperatures.
Probab=98.28  E-value=1.7e-06  Score=87.79  Aligned_cols=60  Identities=33%  Similarity=0.447  Sum_probs=54.8

Q ss_pred             cccccccccccCCCCCcEEEEECCEEeCCHHHHHHHHhcCCCCCEEEEEEEECCEEEEEE
Q 016647          316 GLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQCKVGDEVIVEVLRGDQKEKIP  375 (385)
Q Consensus       316 ~v~~~~~a~~~gl~~GDiI~~ing~~i~s~~~l~~~l~~~~~g~~v~l~v~R~g~~~~~~  375 (385)
                      .+.++++++++||++||+|++|||++|.++.++.+++...+.|+.++++|+|+|+..++.
T Consensus       368 ~V~~~SpA~~aGL~~GDvI~~Ing~~V~s~~d~~~~l~~~~~g~~v~l~v~R~g~~~~~~  427 (428)
T TIGR02037       368 KVVSGSPAARAGLQPGDVILSVNQQPVSSVAELRKVLDRAKKGGRVALLILRGGATIFVT  427 (428)
T ss_pred             EeCCCCHHHHcCCCCCCEEEEECCEEcCCHHHHHHHHHhcCCCCEEEEEEEECCEEEEEE
Confidence            456788899999999999999999999999999999998888999999999999987664


No 30 
>KOG3627 consensus Trypsin [Amino acid transport and metabolism]
Probab=98.26  E-value=2.4e-05  Score=73.31  Aligned_cols=145  Identities=25%  Similarity=0.306  Sum_probs=83.3

Q ss_pred             EEEEEEEcCCCEEEecccccCCCC--eEEEEecCC------------CeEee-EEEEEC-------CC-CCeEEEEEcCC
Q 016647          153 SGSGFVWDSKGHVVTNYHVIRGAS--DIRVTFADQ------------SAYDA-KIVGFD-------QD-KDVAVLRIDAP  209 (385)
Q Consensus       153 ~GSGfiI~~~G~ILT~aHvv~~~~--~i~V~~~dg------------~~~~a-~vv~~d-------~~-~DlAlLkv~~~  209 (385)
                      .+-|.+|+++ ||+|++||+.+..  .+.|.+...            ..... +++ .+       .. +|||||+++.+
T Consensus        39 ~Cggsli~~~-~vltaaHC~~~~~~~~~~V~~G~~~~~~~~~~~~~~~~~~v~~~i-~H~~y~~~~~~~nDiall~l~~~  116 (256)
T KOG3627|consen   39 LCGGSLISPR-WVLTAAHCVKGASASLYTVRLGEHDINLSVSEGEEQLVGDVEKII-VHPNYNPRTLENNDIALLRLSEP  116 (256)
T ss_pred             eeeeEEeeCC-EEEEChhhCCCCCCcceEEEECccccccccccCchhhhceeeEEE-ECCCCCCCCCCCCCEEEEEECCC
Confidence            6677788665 9999999999875  666766321            11111 222 22       13 79999999875


Q ss_pred             C---CCCcceecCCCCC---CCCCCEEEEEeCCCCCC------CceEEeEEeeeeeeeccCCCCC---CcccEEEEc---
Q 016647          210 K---DKLRPIPIGVSAD---LLVGQKVYAIGNPFGLD------HTLTTGVISGLRREISSAATGR---PIQDVIQTD---  271 (385)
Q Consensus       210 ~---~~~~~l~l~~~~~---~~~G~~V~~iG~p~g~~------~~~~~G~Vs~~~~~~~~~~~~~---~~~~~i~~d---  271 (385)
                      .   ..+.++.|.....   ...+..+++.||+....      .......+.-+....+......   .....+...   
T Consensus       117 v~~~~~i~piclp~~~~~~~~~~~~~~~v~GWG~~~~~~~~~~~~L~~~~v~i~~~~~C~~~~~~~~~~~~~~~Ca~~~~  196 (256)
T KOG3627|consen  117 VTFSSHIQPICLPSSADPYFPPGGTTCLVSGWGRTESGGGPLPDTLQEVDVPIISNSECRRAYGGLGTITDTMLCAGGPE  196 (256)
T ss_pred             cccCCcccccCCCCCcccCCCCCCCEEEEEeCCCcCCCCCCCCceeEEEEEeEcChhHhcccccCccccCCCEEeeCccC
Confidence            3   4566777753332   34458888899864321      1222222222222212111110   011224332   


Q ss_pred             --cccCCCCCCceeeCCC---ccEEEEeecccC
Q 016647          272 --AAINPGNSGGPLLDSS---GSLIGINTAIYS  299 (385)
Q Consensus       272 --~~i~~G~SGGPlvn~~---G~VVGI~s~~~~  299 (385)
                        ...|.|+|||||+-.+   ..++||++++..
T Consensus       197 ~~~~~C~GDSGGPLv~~~~~~~~~~GivS~G~~  229 (256)
T KOG3627|consen  197 GGKDACQGDSGGPLVCEDNGRWVLVGIVSWGSG  229 (256)
T ss_pred             CCCccccCCCCCeEEEeeCCcEEEEEEEEecCC
Confidence              2368899999998654   699999998754


No 31 
>cd00136 PDZ PDZ domain, also called DHR (Dlg homologous region) or GLGF (after a conserved sequence motif). Many PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. Heterodimerization through PDZ-PDZ domain interactions adds to the domain's versatility, and PDZ domain-mediated interactions may be modulated dynamically through target phosphorylation. Some PDZ domains play a role in scaffolding supramolecular complexes. PDZ domains are found in diverse signaling proteins in bacteria, archaebacteria, and eukaryotes. This CD contains two distinct structural subgroups with either an N- or C-terminal beta-strand forming the peptide-binding groove base. The circular permutation placing the strand on the N-terminus appears to be found in Eumetazoa only, while the C-terminal variant is found in all three kingdoms of life, and seems to co-occur with protease domains. PDZ domains have been named after PSD95(pos
Probab=98.22  E-value=2e-06  Score=64.11  Aligned_cols=50  Identities=30%  Similarity=0.415  Sum_probs=43.7

Q ss_pred             cccccccccccCCCCCcEEEEECCEEeCCH--HHHHHHHhcCCCCCEEEEEEE
Q 016647          316 GLLSTKRDAYGRLILGDIITSVNGKKVSNG--SDLYRILDQCKVGDEVIVEVL  366 (385)
Q Consensus       316 ~v~~~~~a~~~gl~~GDiI~~ing~~i~s~--~~l~~~l~~~~~g~~v~l~v~  366 (385)
                      .+..+++++.+||++||+|++|||+++.++  .++.+++... .|+.++|+++
T Consensus        19 ~v~~~s~a~~~gl~~GD~I~~Ing~~v~~~~~~~~~~~l~~~-~g~~v~l~v~   70 (70)
T cd00136          19 SVEPGSPAERAGLQAGDVILAVNGTDVKNLTLEDVAELLKKE-VGEKVTLTVR   70 (70)
T ss_pred             EeCCCCHHHHcCCCCCCEEEEECCEECCCCCHHHHHHHHhhC-CCCeEEEEEC
Confidence            456778888899999999999999999999  8999999875 4899998863


No 32 
>PRK09681 putative type II secretion protein GspC; Provisional
Probab=98.13  E-value=7.2e-06  Score=77.42  Aligned_cols=55  Identities=20%  Similarity=0.326  Sum_probs=51.2

Q ss_pred             cccCCCCCcEEEEECCEEeCCHHHHHHHHhcCCCCCEEEEEEEECCEEEEEEEEe
Q 016647          324 AYGRLILGDIITSVNGKKVSNGSDLYRILDQCKVGDEVIVEVLRGDQKEKIPVKL  378 (385)
Q Consensus       324 ~~~gl~~GDiI~~ing~~i~s~~~l~~~l~~~~~g~~v~l~v~R~g~~~~~~v~~  378 (385)
                      ...|||+||++++|||..+++.++..+++.+.+....++|+|+|||+..++.+.+
T Consensus       221 ~~~GLq~GDva~sING~dL~D~~qa~~l~~~L~~~tei~ltVeRdGq~~~i~i~l  275 (276)
T PRK09681        221 DASGFKEGDIAIALNQQDFTDPRAMIALMRQLPSMDSIQLTVLRKGARHDISIAL  275 (276)
T ss_pred             HHcCCCCCCEEEEeCCeeCCCHHHHHHHHHHhccCCeEEEEEEECCEEEEEEEEc
Confidence            4679999999999999999999999999999999999999999999999888765


No 33 
>PRK10139 serine endoprotease; Provisional
Probab=98.10  E-value=7.4e-06  Score=83.65  Aligned_cols=59  Identities=17%  Similarity=0.299  Sum_probs=52.7

Q ss_pred             cccccccccccCCCCCcEEEEECCEEeCCHHHHHHHHhcCCCCCEEEEEEEECCEEEEEEE
Q 016647          316 GLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQCKVGDEVIVEVLRGDQKEKIPV  376 (385)
Q Consensus       316 ~v~~~~~a~~~gl~~GDiI~~ing~~i~s~~~l~~~l~~~~~g~~v~l~v~R~g~~~~~~v  376 (385)
                      .+.++++++++|||+||+|++|||++|.+++++.+++.+. + +++.|+|+|+|+.+++.+
T Consensus       396 ~V~~~spA~~aGL~~GD~I~~Ing~~v~~~~~~~~~l~~~-~-~~v~l~v~R~g~~~~~~~  454 (455)
T PRK10139        396 EVVKGSPAAQAGLQKDDVIIGVNRDRVNSIAEMRKVLAAK-P-AIIALQIVRGNESIYLLL  454 (455)
T ss_pred             EeCCCChHHHcCCCCCCEEEEECCEEcCCHHHHHHHHHhC-C-CeEEEEEEECCEEEEEEe
Confidence            5567889999999999999999999999999999999874 3 789999999999887765


No 34 
>TIGR00225 prc C-terminal peptidase (prc). A C-terminal peptidase with different substrates in different species, including processing of D1 protein of the photosystem II reaction center in higher plants and cleavage of a peptide of 11 residues from the precursor form of penicillin-binding protein in E. coli. E. coli and H. influenzae have the most distal branch of the tree, and their proteins have an N-terminal 200 amino acids that show no homology to other proteins in the database.
Probab=98.07  E-value=7.9e-06  Score=80.23  Aligned_cols=64  Identities=23%  Similarity=0.364  Sum_probs=53.3

Q ss_pred             cccccccccccCCCCCcEEEEECCEEeCCH--HHHHHHHhcCCCCCEEEEEEEECCEEEEEEEEeec
Q 016647          316 GLLSTKRDAYGRLILGDIITSVNGKKVSNG--SDLYRILDQCKVGDEVIVEVLRGDQKEKIPVKLEP  380 (385)
Q Consensus       316 ~v~~~~~a~~~gl~~GDiI~~ing~~i~s~--~~l~~~l~~~~~g~~v~l~v~R~g~~~~~~v~~~~  380 (385)
                      .+.++++++.+||++||+|++|||++|.++  .++...+.. ++|+.+.+++.|+|+..++++++..
T Consensus        68 ~V~~~spA~~aGL~~GD~I~~Ing~~v~~~~~~~~~~~l~~-~~g~~v~l~v~R~g~~~~~~v~l~~  133 (334)
T TIGR00225        68 SPFEGSPAEKAGIKPGDKIIKINGKSVAGMSLDDAVALIRG-KKGTKVSLEILRAGKSKPLTFTLKR  133 (334)
T ss_pred             EeCCCChHHHcCCCCCCEEEEECCEECCCCCHHHHHHhccC-CCCCEEEEEEEeCCCCceEEEEEEE
Confidence            567889999999999999999999999985  567666654 5799999999999877776666543


No 35 
>TIGR03279 cyano_FeS_chp putative FeS-containing Cyanobacterial-specific oxidoreductase. Members of this protein family are predicted FeS-containing oxidoreductases of unknown function, apparently restricted to and universal across the Cyanobacteria. The high trusted cutoff score for this model, 700 bits, excludes homologs from other lineages. This exclusion seems justified because a significant number of sequence positions are simultaneously unique to and invariant across the Cyanobacteria, suggesting a specialized, conserved function, perhaps related to photosynthesis. A distantly related protein family, TIGR03278, is universal in and restricted to archaeal methanogens, and may be linked to methanogenesis.
Probab=98.01  E-value=1.5e-05  Score=79.59  Aligned_cols=61  Identities=20%  Similarity=0.229  Sum_probs=51.8

Q ss_pred             cccccccccccCCCCCcEEEEECCEEeCCHHHHHHHHhcCCCCCEEEEEEE-ECCEEEEEEEEeec
Q 016647          316 GLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQCKVGDEVIVEVL-RGDQKEKIPVKLEP  380 (385)
Q Consensus       316 ~v~~~~~a~~~gl~~GDiI~~ing~~i~s~~~l~~~l~~~~~g~~v~l~v~-R~g~~~~~~v~~~~  380 (385)
                      .+.++++|+.+||++||+|++|||++|.++.|+...+.    ++.+.++|. |+|+..++++...+
T Consensus         4 ~V~pgSpAe~AGLe~GD~IlsING~~V~Dw~D~~~~l~----~e~l~L~V~~rdGe~~~l~Ie~~~   65 (433)
T TIGR03279         4 AVLPGSIAEELGFEPGDALVSINGVAPRDLIDYQFLCA----DEELELEVLDANGESHQIEIEKDL   65 (433)
T ss_pred             CcCCCCHHHHcCCCCCCEEEEECCEECCCHHHHHHHhc----CCcEEEEEEcCCCeEEEEEEecCC
Confidence            45688999999999999999999999999999887773    467899997 89988888776543


No 36 
>PRK10942 serine endoprotease; Provisional
Probab=97.99  E-value=1.5e-05  Score=81.85  Aligned_cols=59  Identities=27%  Similarity=0.332  Sum_probs=52.6

Q ss_pred             cccccccccccCCCCCcEEEEECCEEeCCHHHHHHHHhcCCCCCEEEEEEEECCEEEEEEE
Q 016647          316 GLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQCKVGDEVIVEVLRGDQKEKIPV  376 (385)
Q Consensus       316 ~v~~~~~a~~~gl~~GDiI~~ing~~i~s~~~l~~~l~~~~~g~~v~l~v~R~g~~~~~~v  376 (385)
                      .+.++++++.+||++||+|++|||++|.+++++.+++.. ++ +.+.|+|+|+|+.+++.+
T Consensus       414 ~V~~~S~A~~aGL~~GDvIv~VNg~~V~s~~dl~~~l~~-~~-~~v~l~V~R~g~~~~v~~  472 (473)
T PRK10942        414 NVKPGTPAAQIGLKKGDVIIGANQQPVKNIAELRKILDS-KP-SVLALNIQRGDSSIYLLM  472 (473)
T ss_pred             EeCCCChHHHcCCCCCCEEEEECCEEcCCHHHHHHHHHh-CC-CeEEEEEEECCEEEEEEe
Confidence            556788999999999999999999999999999999987 33 799999999999887765


No 37 
>smart00228 PDZ Domain present in PSD-95, Dlg, and ZO-1/2. Also called DHR (Dlg homologous region) or GLGF (relatively well conserved tetrapeptide in these domains). Some PDZs have been shown to bind C-terminal polypeptides; others appear to bind internal (non-C-terminal) polypeptides. Different PDZs possess different binding specificities.
Probab=97.94  E-value=1.4e-05  Score=61.44  Aligned_cols=54  Identities=31%  Similarity=0.399  Sum_probs=42.5

Q ss_pred             cccccccccccCCCCCcEEEEECCEEeCCHHHHHHHHhcCCCCCEEEEEEEECC
Q 016647          316 GLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQCKVGDEVIVEVLRGD  369 (385)
Q Consensus       316 ~v~~~~~a~~~gl~~GDiI~~ing~~i~s~~~l~~~l~~~~~g~~v~l~v~R~g  369 (385)
                      .+.++++++.+||++||+|++|||+++.++.+..........++.+.+++.|++
T Consensus        32 ~v~~~s~a~~~gl~~GD~I~~In~~~v~~~~~~~~~~~~~~~~~~~~l~i~r~~   85 (85)
T smart00228       32 SVVPGSPAAKAGLKVGDVILEVNGTSVEGLTHLEAVDLLKKAGGKVTLTVLRGG   85 (85)
T ss_pred             EECCCCHHHHcCCCCCCEEEEECCEECCCCCHHHHHHHHHhCCCeEEEEEEeCC
Confidence            455778889999999999999999999987665544433345679999999975


No 38 
>PLN00049 carboxyl-terminal processing protease; Provisional
Probab=97.90  E-value=3.3e-05  Score=77.40  Aligned_cols=62  Identities=16%  Similarity=0.205  Sum_probs=52.2

Q ss_pred             cccccccccccCCCCCcEEEEECCEEeCCH--HHHHHHHhcCCCCCEEEEEEEECCEEEEEEEEe
Q 016647          316 GLLSTKRDAYGRLILGDIITSVNGKKVSNG--SDLYRILDQCKVGDEVIVEVLRGDQKEKIPVKL  378 (385)
Q Consensus       316 ~v~~~~~a~~~gl~~GDiI~~ing~~i~s~--~~l~~~l~~~~~g~~v~l~v~R~g~~~~~~v~~  378 (385)
                      .+.+++|++++||++||+|++|||++|.++  .++...+.. ..|+.|.++|.|+|+..+++++-
T Consensus       108 ~V~~~SPA~~aGl~~GD~Iv~InG~~v~~~~~~~~~~~l~g-~~g~~v~ltv~r~g~~~~~~l~r  171 (389)
T PLN00049        108 APAPGGPAARAGIRPGDVILAIDGTSTEGLSLYEAADRLQG-PEGSSVELTLRRGPETRLVTLTR  171 (389)
T ss_pred             EeCCCChHHHcCCCCCCEEEEECCEECCCCCHHHHHHHHhc-CCCCEEEEEEEECCEEEEEEEEe
Confidence            566889999999999999999999999864  677777754 57999999999999887776653


No 39 
>COG3480 SdrC Predicted secreted protein containing a PDZ domain [Signal transduction mechanisms]
Probab=97.84  E-value=5.1e-05  Score=72.02  Aligned_cols=59  Identities=27%  Similarity=0.442  Sum_probs=54.3

Q ss_pred             ccccCCCCCcEEEEECCEEeCCHHHHHHHHhcCCCCCEEEEEEEE-CCEEEEEEEEeecC
Q 016647          323 DAYGRLILGDIITSVNGKKVSNGSDLYRILDQCKVGDEVIVEVLR-GDQKEKIPVKLEPK  381 (385)
Q Consensus       323 a~~~gl~~GDiI~~ing~~i~s~~~l~~~l~~~~~g~~v~l~v~R-~g~~~~~~v~~~~~  381 (385)
                      .+.+.|+.||.|+++||+++.+.+|+.+.+...++||+|++++.| +++...+++++.+.
T Consensus       142 ~~~gkl~~gD~i~avdg~~f~s~~e~i~~v~~~k~Gd~VtI~~~r~~~~~~~~~~tl~~~  201 (342)
T COG3480         142 PFKGKLEAGDTIIAVDGEPFTSSDELIDYVSSKKPGDEVTIDYERHNETPEIVTITLIKN  201 (342)
T ss_pred             chhceeccCCeEEeeCCeecCCHHHHHHHHhccCCCCeEEEEEEeccCCCceEEEEEEee
Confidence            455669999999999999999999999999999999999999997 88888899988877


No 40 
>TIGR00054 RIP metalloprotease RseP. A model that detects fragments as well matches a number of members of the PEPTIDASE FAMILY S2C. The region of match appears not to overlap the active site domain.
Probab=97.83  E-value=2.1e-05  Score=79.56  Aligned_cols=61  Identities=25%  Similarity=0.214  Sum_probs=53.3

Q ss_pred             cccccccccccCCCCCcEEEEECCEEeCCHHHHHHHHhcCCCCCEEEEEEEECCEEEEEEEEe
Q 016647          316 GLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQCKVGDEVIVEVLRGDQKEKIPVKL  378 (385)
Q Consensus       316 ~v~~~~~a~~~gl~~GDiI~~ing~~i~s~~~l~~~l~~~~~g~~v~l~v~R~g~~~~~~v~~  378 (385)
                      .+.++++++++|||+||+|+++||+++.++.++.+.+....  +++.+++.|+++...+++++
T Consensus       134 ~V~~~SpA~~AGL~~GDvI~~vng~~v~~~~dl~~~ia~~~--~~v~~~I~r~g~~~~l~v~l  194 (420)
T TIGR00054       134 LLDKNSIALEAGIEPGDEILSVNGNKIPGFKDVRQQIADIA--GEPMVEILAERENWTFEVMK  194 (420)
T ss_pred             ccCCCCHHHHcCCCCCCEEEEECCEEcCCHHHHHHHHHhhc--ccceEEEEEecCceEecccc
Confidence            66789999999999999999999999999999999888765  68899999998877655443


No 41 
>PF14685 Tricorn_PDZ:  Tricorn protease PDZ domain; PDB: 1N6F_D 1N6D_C 1N6E_C 1K32_A.
Probab=97.74  E-value=0.00012  Score=57.44  Aligned_cols=55  Identities=27%  Similarity=0.419  Sum_probs=38.1

Q ss_pred             cccccccC--CCCCcEEEEECCEEeCCHHHHHHHHhcCCCCCEEEEEEEECC-EEEEEE
Q 016647          320 TKRDAYGR--LILGDIITSVNGKKVSNGSDLYRILDQCKVGDEVIVEVLRGD-QKEKIP  375 (385)
Q Consensus       320 ~~~a~~~g--l~~GDiI~~ing~~i~s~~~l~~~l~~~~~g~~v~l~v~R~g-~~~~~~  375 (385)
                      .+|-...|  +++||+|++|||+++....++..+|.. +.|+.|.|+|.+++ +.+++.
T Consensus        30 ~sPL~~pGv~v~~GD~I~aInG~~v~~~~~~~~lL~~-~agk~V~Ltv~~~~~~~R~v~   87 (88)
T PF14685_consen   30 RSPLAQPGVDVREGDYILAINGQPVTADANPYRLLEG-KAGKQVLLTVNRKPGGARTVV   87 (88)
T ss_dssp             B-GGGGGS----TT-EEEEETTEE-BTTB-HHHHHHT-TTTSEEEEEEE-STT-EEEEE
T ss_pred             cCCccCCCCCCCCCCEEEEECCEECCCCCCHHHHhcc-cCCCEEEEEEecCCCCceEEE
Confidence            34444444  579999999999999999999999987 68999999999976 455554


No 42 
>COG0793 Prc Periplasmic protease [Cell envelope biogenesis, outer membrane]
Probab=97.67  E-value=0.00011  Score=73.93  Aligned_cols=62  Identities=19%  Similarity=0.353  Sum_probs=50.1

Q ss_pred             cccccccccccCCCCCcEEEEECCEEeCCH--HHHHHHHhcCCCCCEEEEEEEEC--CEEEEEEEEe
Q 016647          316 GLLSTKRDAYGRLILGDIITSVNGKKVSNG--SDLYRILDQCKVGDEVIVEVLRG--DQKEKIPVKL  378 (385)
Q Consensus       316 ~v~~~~~a~~~gl~~GDiI~~ing~~i~s~--~~l~~~l~~~~~g~~v~l~v~R~--g~~~~~~v~~  378 (385)
                      ....++|++++||++||+|++|||+++..+  ++..+.+.. ++|..|+|+|.|.  ++.+.++++-
T Consensus       118 s~~~~~PA~kagi~~GD~I~~IdG~~~~~~~~~~av~~irG-~~Gt~V~L~i~r~~~~k~~~v~l~R  183 (406)
T COG0793         118 SPIDGSPAAKAGIKPGDVIIKIDGKSVGGVSLDEAVKLIRG-KPGTKVTLTILRAGGGKPFTVTLTR  183 (406)
T ss_pred             ecCCCChHHHcCCCCCCEEEEECCEEccCCCHHHHHHHhCC-CCCCeEEEEEEEcCCCceeEEEEEE
Confidence            446789999999999999999999999976  456666665 6899999999997  4555655543


No 43 
>cd00992 PDZ_signaling PDZ domain found in a variety of Eumetazoan signaling molecules, often in tandem arrangements. May be responsible for specific protein-protein interactions, as most PDZ domains bind C-terminal polypeptides, and binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of PDZ domains an N-terminal beta-strand forms the peptide-binding groove base, a circular permutation with respect to PDZ domains found in proteases.
Probab=97.66  E-value=9.2e-05  Score=56.67  Aligned_cols=48  Identities=29%  Similarity=0.370  Sum_probs=39.7

Q ss_pred             cccccccccccCCCCCcEEEEECCEEeC--CHHHHHHHHhcCCCCCEEEEEE
Q 016647          316 GLLSTKRDAYGRLILGDIITSVNGKKVS--NGSDLYRILDQCKVGDEVIVEV  365 (385)
Q Consensus       316 ~v~~~~~a~~~gl~~GDiI~~ing~~i~--s~~~l~~~l~~~~~g~~v~l~v  365 (385)
                      .+..+++++.++|++||+|++|||+++.  +..++.+++....  ..+.+++
T Consensus        32 ~v~~~s~a~~~gl~~GD~I~~ing~~i~~~~~~~~~~~l~~~~--~~v~l~v   81 (82)
T cd00992          32 RVEPGGPAERGGLRVGDRILEVNGVSVEGLTHEEAVELLKNSG--DEVTLTV   81 (82)
T ss_pred             EECCCChHHhCCCCCCCEEEEECCEEcCccCHHHHHHHHHhCC--CeEEEEE
Confidence            4567788888999999999999999999  8899999887632  3666655


No 44 
>PF05579 Peptidase_S32:  Equine arteritis virus serine endopeptidase S32;  InterPro: IPR008760 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. 
They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This group of serine peptidases belongs to MEROPS peptidase family S32 (clan PA(S)). The type example is equine arteritis virus serine endopeptidase (equine arteritis virus), which is involved in processing of nidovirus polyproteins [].; GO: 0004252 serine-type endopeptidase activity, 0016032 viral reproduction, 0019082 viral protein processing; PDB: 3FAN_A 3FAO_A 1MBM_A.
Probab=97.62  E-value=0.00049  Score=63.88  Aligned_cols=116  Identities=26%  Similarity=0.373  Sum_probs=61.8

Q ss_pred             CeEEEEEEEcCCC--EEEecccccCCCCeEEEEecCCCeEeeEEEEECCCCCeEEEEEcCCCCCCcceecCCCCCCCCCC
Q 016647          151 QGSGSGFVWDSKG--HVVTNYHVIRGASDIRVTFADQSAYDAKIVGFDQDKDVAVLRIDAPKDKLRPIPIGVSADLLVGQ  228 (385)
Q Consensus       151 ~~~GSGfiI~~~G--~ILT~aHvv~~~~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkv~~~~~~~~~l~l~~~~~~~~G~  228 (385)
                      ...|||=+...+|  .|+|+.||+. .+...|... +.....   -++..-|+|.-.++.-.-.+|.++++..   ..|-
T Consensus       111 ss~Gsggvft~~~~~vvvTAtHVlg-~~~a~v~~~-g~~~~~---tF~~~GDfA~~~~~~~~G~~P~~k~a~~---~~Gr  182 (297)
T PF05579_consen  111 SSVGSGGVFTIGGNTVVVTATHVLG-GNTARVSGV-GTRRML---TFKKNGDFAEADITNWPGAAPKYKFAQN---YTGR  182 (297)
T ss_dssp             SSEEEEEEEECTTEEEEEEEHHHCB-TTEEEEEET-TEEEEE---EEEEETTEEEEEETTS-S---B--B-TT----SEE
T ss_pred             ecccccceEEECCeEEEEEEEEEcC-CCeEEEEec-ceEEEE---EEeccCcEEEEECCCCCCCCCceeecCC---cccc
Confidence            3456665555444  5999999998 444444443 323222   3445669999999543335666666521   2232


Q ss_pred             EEEEEeCCCCCCCceEEeEEeeeeeeeccCCCCCCcccEEEEccccCCCCCCceeeCCCccEEEEeecc
Q 016647          229 KVYAIGNPFGLDHTLTTGVISGLRREISSAATGRPIQDVIQTDAAINPGNSGGPLLDSSGSLIGINTAI  297 (385)
Q Consensus       229 ~V~~iG~p~g~~~~~~~G~Vs~~~~~~~~~~~~~~~~~~i~~d~~i~~G~SGGPlvn~~G~VVGI~s~~  297 (385)
                      --+.-      ..-+..|.|...              ..+.   -..+|+||+|++..+|.+|||++..
T Consensus       183 AyW~t------~tGvE~G~ig~~--------------~~~~---fT~~GDSGSPVVt~dg~liGVHTGS  228 (297)
T PF05579_consen  183 AYWLT------STGVEPGFIGGG--------------GAVC---FTGPGDSGSPVVTEDGDLIGVHTGS  228 (297)
T ss_dssp             EEEEE------TTEEEEEEEETT--------------EEEE---SS-GGCTT-EEEETTC-EEEEEEEE
T ss_pred             eEEEc------ccCcccceecCc--------------eEEE---EcCCCCCCCccCcCCCCEEEEEecC
Confidence            22211      123455555421              1122   2347999999999999999999975


No 45 
>PF00595 PDZ:  PDZ domain (Also known as DHR or GLGF) Coordinates are not yet available;  InterPro: IPR001478 PDZ domains are found in diverse signalling proteins in bacteria, yeasts, plants, insects and vertebrates [, ]. PDZ domains can occur in one or multiple copies and are nearly always found in cytoplasmic proteins. They bind either the carboxyl-terminal sequences of proteins or internal peptide sequences []. In most cases, interaction between a PDZ domain and its target is constitutive, with a binding affinity of 1 to 10 micromolar. However, agonist-dependent activation of cell surface receptors is sometimes required to promote interaction with a PDZ protein. PDZ domain proteins are frequently associated with the plasma membrane, a compartment where high concentrations of phosphatidylinositol 4,5-bisphosphate (PIP2) are found. Direct interaction between PIP2 and a subset of class II PDZ domains (syntenin, CASK, Tiam-1) has been demonstrated.  PDZ domains consist of 80 to 90 amino acids comprising six beta-strands (beta-A to beta-F) and two alpha-helices, A and B, compactly arranged in a globular structure. Peptide binding of the ligand takes place in an elongated surface groove as an anti-parallel beta-strand interacts with the beta-B strand and the B helix. The structure of PDZ domains allows binding to a free carboxylate group at the end of a peptide through a carboxylate-binding loop between the beta-A and beta-B strands.; GO: 0005515 protein binding; PDB: 3AXA_A 1WF8_A 1QAV_B 1QAU_A 1B8Q_A 1MC7_A 2KAW_A 1I16_A 1VB7_A 1WI4_A ....
Probab=97.50  E-value=7.8e-05  Score=57.34  Aligned_cols=49  Identities=20%  Similarity=0.358  Sum_probs=39.5

Q ss_pred             cccccccccccCCCCCcEEEEECCEEeCCH--HHHHHHHhcCCCCCEEEEEEE
Q 016647          316 GLLSTKRDAYGRLILGDIITSVNGKKVSNG--SDLYRILDQCKVGDEVIVEVL  366 (385)
Q Consensus       316 ~v~~~~~a~~~gl~~GDiI~~ing~~i~s~--~~l~~~l~~~~~g~~v~l~v~  366 (385)
                      .+.++++++.+||++||.|++|||+.+.++  .+..+++...  +++++|+|+
T Consensus        31 ~v~~~~~a~~~gl~~GD~Il~INg~~v~~~~~~~~~~~l~~~--~~~v~L~V~   81 (81)
T PF00595_consen   31 SVVPGSPAERAGLKVGDRILEINGQSVRGMSHDEVVQLLKSA--SNPVTLTVQ   81 (81)
T ss_dssp             EECTTSHHHHHTSSTTEEEEEETTEESTTSBHHHHHHHHHHS--TSEEEEEEE
T ss_pred             EEeCCChHHhcccchhhhhheeCCEeCCCCCHHHHHHHHHCC--CCcEEEEEC
Confidence            556788888889999999999999999976  4666777664  348888874


No 46 
>COG3031 PulC Type II secretory pathway, component PulC [Intracellular trafficking and secretion]
Probab=97.42  E-value=0.00025  Score=64.74  Aligned_cols=54  Identities=22%  Similarity=0.404  Sum_probs=49.5

Q ss_pred             cccCCCCCcEEEEECCEEeCCHHHHHHHHhcCCCCCEEEEEEEECCEEEEEEEE
Q 016647          324 AYGRLILGDIITSVNGKKVSNGSDLYRILDQCKVGDEVIVEVLRGDQKEKIPVK  377 (385)
Q Consensus       324 ~~~gl~~GDiI~~ing~~i~s~~~l~~~l~~~~~g~~v~l~v~R~g~~~~~~v~  377 (385)
                      ...|||.|||.++||+..+++.+++..++.....-+.++++|.|+|+..++.|.
T Consensus       221 ~~sglq~GDIavaiNnldltdp~~m~~llq~l~~m~s~qlTv~R~G~rhdInV~  274 (275)
T COG3031         221 YKSGLQRGDIAVAINNLDLTDPEDMFRLLQMLRNMPSLQLTVIRRGKRHDINVR  274 (275)
T ss_pred             hhhcCCCcceEEEecCcccCCHHHHHHHHHhhhcCcceEEEEEecCccceeeec
Confidence            456899999999999999999999999999888888999999999999988875


No 47 
>COG5640 Secreted trypsin-like serine protease [Posttranslational modification, protein turnover, chaperones]
Probab=97.09  E-value=0.022  Score=55.28  Aligned_cols=56  Identities=21%  Similarity=0.254  Sum_probs=34.8

Q ss_pred             EEEEEEcCCCEEEecccccCCCCe-----EEE--EecC---CCeEeeEEEEEC-------CCCCeEEEEEcCCC
Q 016647          154 GSGFVWDSKGHVVTNYHVIRGASD-----IRV--TFAD---QSAYDAKIVGFD-------QDKDVAVLRIDAPK  210 (385)
Q Consensus       154 GSGfiI~~~G~ILT~aHvv~~~~~-----i~V--~~~d---g~~~~a~vv~~d-------~~~DlAlLkv~~~~  210 (385)
                      |=|-+++.+ ||||+|||+.+..-     +.|  .+.|   ++...++.+..+       ...|+|++++....
T Consensus        63 CGgs~l~~R-YvLTAAHC~~~~s~is~d~~~vv~~l~d~Sq~~rg~vr~i~~~efY~~~n~~ND~Av~~l~~~a  135 (413)
T COG5640          63 CGGSKLGGR-YVLTAAHCADASSPISSDVNRVVVDLNDSSQAERGHVRTIYVHEFYSPGNLGNDIAVLELARAA  135 (413)
T ss_pred             eccceecce-EEeeehhhccCCCCccccceEEEecccccccccCcceEEEeeecccccccccCcceeecccccc
Confidence            446777777 99999999986552     222  2233   233334433332       45699999998753


No 48 
>PRK11186 carboxy-terminal protease; Provisional
Probab=97.03  E-value=0.0014  Score=69.65  Aligned_cols=61  Identities=23%  Similarity=0.396  Sum_probs=47.3

Q ss_pred             ccccccccccc-CCCCCcEEEEEC--CEEeCC-----HHHHHHHHhcCCCCCEEEEEEEEC---CEEEEEEEE
Q 016647          316 GLLSTKRDAYG-RLILGDIITSVN--GKKVSN-----GSDLYRILDQCKVGDEVIVEVLRG---DQKEKIPVK  377 (385)
Q Consensus       316 ~v~~~~~a~~~-gl~~GDiI~~in--g~~i~s-----~~~l~~~l~~~~~g~~v~l~v~R~---g~~~~~~v~  377 (385)
                      .+.+++||+++ ||++||+|++||  |+++.+     .+++.+++.. +.|.+|.|+|.|+   ++..+++++
T Consensus       261 ~vipGsPA~ka~gLk~GD~IlaVn~~g~~~~dv~g~~~~~vv~lirG-~~Gt~V~LtV~r~~~~~~~~~vtl~  332 (667)
T PRK11186        261 SLVAGGPAAKSKKLSVGDKIVGVGQDGKPIVDVIGWRLDDVVALIKG-PKGSKVRLEILPAGKGTKTRIVTLT  332 (667)
T ss_pred             EccCCChHHHhCCCCCCCEEEEECCCCCcccccccCCHHHHHHHhcC-CCCCEEEEEEEeCCCCCceEEEEEE
Confidence            56789999987 999999999998  565543     3477777765 6899999999993   455666655


No 49 
>KOG3129 consensus 26S proteasome regulatory complex, subunit PSMD9 [Posttranslational modification, protein turnover, chaperones]
Probab=96.79  E-value=0.0024  Score=57.24  Aligned_cols=65  Identities=20%  Similarity=0.163  Sum_probs=53.2

Q ss_pred             cccccccccccCCCCCcEEEEECCEEeCCHHHHHH--HHhcCCCCCEEEEEEEECCEEEEEEEEeec
Q 016647          316 GLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYR--ILDQCKVGDEVIVEVLRGDQKEKIPVKLEP  380 (385)
Q Consensus       316 ~v~~~~~a~~~gl~~GDiI~~ing~~i~s~~~l~~--~l~~~~~g~~v~l~v~R~g~~~~~~v~~~~  380 (385)
                      .|.+++|++.+||+.||.|++++...--+...|++  .+.+...++.+.+++.|.|+...+.++...
T Consensus       145 sV~~~SPA~~aGl~~gD~il~fGnV~sgn~~~lq~i~~~v~~~e~~~v~v~v~R~g~~v~L~ltP~~  211 (231)
T KOG3129|consen  145 SVVPGSPADEAGLCVGDEILKFGNVHSGNFLPLQNIAAVVQSNEDQIVSVTVIREGQKVVLSLTPKK  211 (231)
T ss_pred             ecCCCChhhhhCcccCceEEEecccccccchhHHHHHHHHHhccCcceeEEEecCCCEEEEEeCccc
Confidence            67799999999999999999998877666665554  333456889999999999999999887643


No 50 
>PF03761 DUF316:  Domain of unknown function (DUF316) ;  InterPro: IPR005514 This is a family of uncharacterised proteins from Caenorhabditis elegans.
Probab=96.72  E-value=0.11  Score=49.45  Aligned_cols=91  Identities=19%  Similarity=0.227  Sum_probs=57.3

Q ss_pred             CCCCeEEEEEcCC-CCCCcceecCCCC-CCCCCCEEEEEeCCCCCCCceEEeEEeeeeeeeccCCCCCCcccEEEEcccc
Q 016647          197 QDKDVAVLRIDAP-KDKLRPIPIGVSA-DLLVGQKVYAIGNPFGLDHTLTTGVISGLRREISSAATGRPIQDVIQTDAAI  274 (385)
Q Consensus       197 ~~~DlAlLkv~~~-~~~~~~l~l~~~~-~~~~G~~V~~iG~p~g~~~~~~~G~Vs~~~~~~~~~~~~~~~~~~i~~d~~i  274 (385)
                      ...+++||.++.+ .....++.|+++. ....|+.+.+.|+.  .........+.-....        .....+......
T Consensus       159 ~~~~~mIlEl~~~~~~~~~~~Cl~~~~~~~~~~~~~~~yg~~--~~~~~~~~~~~i~~~~--------~~~~~~~~~~~~  228 (282)
T PF03761_consen  159 RPYSPMILELEEDFSKNVSPPCLADSSTNWEKGDEVDVYGFN--STGKLKHRKLKITNCT--------KCAYSICTKQYS  228 (282)
T ss_pred             cccceEEEEEcccccccCCCEEeCCCccccccCceEEEeecC--CCCeEEEEEEEEEEee--------ccceeEeccccc
Confidence            4569999999886 2367888887654 36788999999882  1122222222211110        012345555667


Q ss_pred             CCCCCCceee---CCCccEEEEeecc
Q 016647          275 NPGNSGGPLL---DSSGSLIGINTAI  297 (385)
Q Consensus       275 ~~G~SGGPlv---n~~G~VVGI~s~~  297 (385)
                      +.|++|||++   |.+..||||.+..
T Consensus       229 ~~~d~Gg~lv~~~~gr~tlIGv~~~~  254 (282)
T PF03761_consen  229 CKGDRGGPLVKNINGRWTLIGVGASG  254 (282)
T ss_pred             CCCCccCeEEEEECCCEEEEEEEccC
Confidence            7999999997   4445699998754


No 51 
>PF05580 Peptidase_S55:  SpoIVB peptidase S55;  InterPro: IPR008763 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This group of serine peptidases belongs to the MEROPS peptidase family S55 (SpoIVB peptidase family, clan PA(S)). The protein SpoIVB plays a key role in signalling in the final sigma-K checkpoint of Bacillus subtilis [, ].
Probab=96.68  E-value=0.023  Score=51.60  Aligned_cols=164  Identities=16%  Similarity=0.254  Sum_probs=84.0

Q ss_pred             ccCCCeEEEEEEEcCC-CEEEecccccCCCCe-EEEEecCCCeEeeEEEEECC----------------CCCeEEEEEc-
Q 016647          147 LEVPQGSGSGFVWDSK-GHVVTNYHVIRGASD-IRVTFADQSAYDAKIVGFDQ----------------DKDVAVLRID-  207 (385)
Q Consensus       147 ~~~~~~~GSGfiI~~~-G~ILT~aHvv~~~~~-i~V~~~dg~~~~a~vv~~d~----------------~~DlAlLkv~-  207 (385)
                      .....+.||=.+++++ +..--=.|.+.+.+. -.+.+.+|+.+++++..+.+                ..-+.-+.-+ 
T Consensus        15 RD~~aGiGTlTf~dp~~~~fgALGH~I~D~dt~~~~~i~~G~I~~a~I~~I~kg~~G~PGe~~G~~~~~~~~~G~I~~Nt   94 (218)
T PF05580_consen   15 RDSTAGIGTLTFYDPETGTFGALGHGISDVDTGQLIPIKNGEIYEASITSIKKGKKGQPGEKIGVFDNESNILGTIEKNT   94 (218)
T ss_pred             EeCCcCeEEEEEEECCCCcEEecCCeEEcCCCCceeEecCCEEEEEEEEEEecCCCcCCceEEEEECCCCceEEEEEecc
Confidence            3445678888889874 555555888876654 45667788888877665432                1112222221 


Q ss_pred             ---------CC----CCCCcceecCCCCCCCCCCEEEEEeCCCCCCCceEEeEEeeeeeeeccCCCCC----CcccEEEE
Q 016647          208 ---------AP----KDKLRPIPIGVSADLLVGQKVYAIGNPFGLDHTLTTGVISGLRREISSAATGR----PIQDVIQT  270 (385)
Q Consensus       208 ---------~~----~~~~~~l~l~~~~~~~~G~~V~~iG~p~g~~~~~~~G~Vs~~~~~~~~~~~~~----~~~~~i~~  270 (385)
                               ..    ....++++++...++++|..-+..-........+.--++. +.+.........    ....++..
T Consensus        95 ~~GI~G~~~~~~~~~~~~~~~~pva~~~evk~G~A~i~Tv~~G~~ie~f~ieI~~-v~~~~~~~~k~~vi~vtd~~Ll~~  173 (218)
T PF05580_consen   95 QFGIYGTLDQDDISNPSYNEPIPVAPKQEVKPGPAYILTVIDGTKIEEFDIEIEK-VLPQSSPSGKGMVIKVTDPRLLEK  173 (218)
T ss_pred             ccceeEEeccccccccccCceeEEEEHHHceEccEEEEEEEcCCeEEEeEEEEEE-EccCCCCCCCcEEEEECCcchhhh
Confidence                     11    0123445555555666675432211110001111111111 111100000000    00122233


Q ss_pred             ccccCCCCCCceeeCCCccEEEEeecccCCCCCCCCceeeecccc
Q 016647          271 DAAINPGNSGGPLLDSSGSLIGINTAIYSPSGASSGVGFSIPVDT  315 (385)
Q Consensus       271 d~~i~~G~SGGPlvn~~G~VVGI~s~~~~~~~~~~~~g~aIP~~~  315 (385)
                      ...+..||||+|++ .+|++||-++..+-   .....||.+++.+
T Consensus       174 TGGIvqGMSGSPI~-qdGKLiGAVthvf~---~dp~~Gygi~ie~  214 (218)
T PF05580_consen  174 TGGIVQGMSGSPII-QDGKLIGAVTHVFV---NDPTKGYGIFIEW  214 (218)
T ss_pred             hCCEEecccCCCEE-ECCEEEEEEEEEEe---cCCCceeeecHHH
Confidence            34567899999999 69999998887664   2346788887753


No 52 
>PF04495 GRASP55_65:  GRASP55/65 PDZ-like domain ;  InterPro: IPR007583 GRASP55 (Golgi reassembly stacking protein of 55 kDa) and GRASP65 (a 65 kDa protein) are highly homologous. GRASP55 is a component of the Golgi stacking machinery. GRASP65 is an N-ethylmaleimide-sensitive membrane protein required for the stacking of Golgi cisternae in a cell-free system [].; PDB: 3RLE_A 4EDJ_A.
Probab=96.60  E-value=0.0027  Score=54.27  Aligned_cols=62  Identities=13%  Similarity=0.228  Sum_probs=45.8

Q ss_pred             cccccccccccCCCC-CcEEEEECCEEeCCHHHHHHHHhcCCCCCEEEEEEEECC--EEEEEEEEe
Q 016647          316 GLLSTKRDAYGRLIL-GDIITSVNGKKVSNGSDLYRILDQCKVGDEVIVEVLRGD--QKEKIPVKL  378 (385)
Q Consensus       316 ~v~~~~~a~~~gl~~-GDiI~~ing~~i~s~~~l~~~l~~~~~g~~v~l~v~R~g--~~~~~~v~~  378 (385)
                      .|.+++||+.+||++ .|.|+.+|+..+.+.++|.+.+.. ..+..+.|.|++..  +.+++++++
T Consensus        49 ~V~p~SPA~~AGL~p~~DyIig~~~~~l~~~~~l~~~v~~-~~~~~l~L~Vyns~~~~vR~V~i~P  113 (138)
T PF04495_consen   49 RVAPNSPAAKAGLEPFFDYIIGIDGGLLDDEDDLFELVEA-NENKPLQLYVYNSKTDSVREVTITP  113 (138)
T ss_dssp             EE-TTSHHHHTT--TTTEEEEEETTCE--STCHHHHHHHH-TTTS-EEEEEEETTTTCEEEEEE--
T ss_pred             EecCCCHHHHCCccccccEEEEccceecCCHHHHHHHHHH-cCCCcEEEEEEECCCCeEEEEEEEc
Confidence            678999999999999 699999999999999999999987 47899999999844  445555544


No 53 
>COG3975 Predicted protease with the C-terminal PDZ domain [General function prediction only]
Probab=96.19  E-value=0.0048  Score=62.60  Aligned_cols=59  Identities=31%  Similarity=0.370  Sum_probs=51.5

Q ss_pred             cccccccccccCCCCCcEEEEECCEEeCCHHHHHHHHhcCCCCCEEEEEEEECCEEEEEEEEeecCC
Q 016647          316 GLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQCKVGDEVIVEVLRGDQKEKIPVKLEPKP  382 (385)
Q Consensus       316 ~v~~~~~a~~~gl~~GDiI~~ing~~i~s~~~l~~~l~~~~~g~~v~l~v~R~g~~~~~~v~~~~~~  382 (385)
                      .|.+++|+..+||.+||.|++|||.        .+.+..+++++.+++++.|.|..+++.|++...+
T Consensus       468 ~V~~~gPA~~AGl~~Gd~ivai~G~--------s~~l~~~~~~d~i~v~~~~~~~L~e~~v~~~~~~  526 (558)
T COG3975         468 FVFPGGPAYKAGLSPGDKIVAINGI--------SDQLDRYKVNDKIQVHVFREGRLREFLVKLGGDP  526 (558)
T ss_pred             ecCCCChhHhccCCCccEEEEEcCc--------cccccccccccceEEEEccCCceEEeecccCCCc
Confidence            4667899999999999999999999        4566778999999999999999999988876543


No 54 
>PF00548 Peptidase_C3:  3C cysteine protease (picornain 3C);  InterPro: IPR000199 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. 
Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad [].  This signature defines cysteine peptidases belonging to MEROPS peptidase family C3 (picornain, clan PA(C)), subfamilies C3A and C3B. The protein fold of this peptidase domain for members of this family resembles that of the serine peptidase, chymotrypsin [], the type example for clan PA. Picornaviral proteins are expressed as a single polyprotein which is cleaved by the viral C3 cysteine protease. The poliovirus polyprotein is selectively cleaved at the Gln-|-Gly bond. In other picornavirus reactions Glu may be substituted for Gln, and Ser or Thr for Gly. ; GO: 0004197 cysteine-type endopeptidase activity, 0006508 proteolysis; PDB: 3SJO_E 2H6M_A 1QA7_C 1HAV_B 2HAL_A 2H9H_A 3QZQ_B 3QZR_A 3R0F_B 3SJ9_A ....
Probab=96.17  E-value=0.049  Score=48.32  Aligned_cols=137  Identities=18%  Similarity=0.262  Sum_probs=76.7

Q ss_pred             CeEEEEEEEcCCCEEEecccccCCCCeEEEEecCCCeEee--EEEEECC---CCCeEEEEEcCCCCCCccee-cCCCCCC
Q 016647          151 QGSGSGFVWDSKGHVVTNYHVIRGASDIRVTFADQSAYDA--KIVGFDQ---DKDVAVLRIDAPKDKLRPIP-IGVSADL  224 (385)
Q Consensus       151 ~~~GSGfiI~~~G~ILT~aHvv~~~~~i~V~~~dg~~~~a--~vv~~d~---~~DlAlLkv~~~~~~~~~l~-l~~~~~~  224 (385)
                      ...++++.|-++ ++|...| -....  ++.+ +|+.++.  .+...+.   ..||++++++... +++-+. +-.....
T Consensus        24 ~~t~l~~gi~~~-~~lvp~H-~~~~~--~i~i-~g~~~~~~d~~~lv~~~~~~~Dl~~v~l~~~~-kfrDIrk~~~~~~~   97 (172)
T PF00548_consen   24 EFTMLALGIYDR-YFLVPTH-EEPED--TIYI-DGVEYKVDDSVVLVDRDGVDTDLTLVKLPRNP-KFRDIRKFFPESIP   97 (172)
T ss_dssp             EEEEEEEEEEBT-EEEEEGG-GGGCS--EEEE-TTEEEEEEEEEEEEETTSSEEEEEEEEEESSS--B--GGGGSBSSGG
T ss_pred             eEEEecceEeee-EEEEECc-CCCcE--EEEE-CCEEEEeeeeEEEecCCCcceeEEEEEccCCc-ccCchhhhhccccc
Confidence            457888888766 9999999 22222  3333 3444433  2223443   4599999997643 232221 1001112


Q ss_pred             CCCCEEEEEeCCCCCCC-ceEEeEEeeeeeeeccCCCCCCcccEEEEccccCCCCCCceeeC---CCccEEEEeecc
Q 016647          225 LVGQKVYAIGNPFGLDH-TLTTGVISGLRREISSAATGRPIQDVIQTDAAINPGNSGGPLLD---SSGSLIGINTAI  297 (385)
Q Consensus       225 ~~G~~V~~iG~p~g~~~-~~~~G~Vs~~~~~~~~~~~~~~~~~~i~~d~~i~~G~SGGPlvn---~~G~VVGI~s~~  297 (385)
                      ...+...++=.. .... ....+.+...... ..  .+......+.++++..+|+-||||+.   ..++++||+.++
T Consensus        98 ~~~~~~l~v~~~-~~~~~~~~v~~v~~~~~i-~~--~g~~~~~~~~Y~~~t~~G~CG~~l~~~~~~~~~i~GiHvaG  170 (172)
T PF00548_consen   98 EYPECVLLVNST-KFPRMIVEVGFVTNFGFI-NL--SGTTTPRSLKYKAPTKPGMCGSPLVSRIGGQGKIIGIHVAG  170 (172)
T ss_dssp             TEEEEEEEEESS-SSTCEEEEEEEEEEEEEE-EE--TTEEEEEEEEEESEEETTGTTEEEEESCGGTTEEEEEEEEE
T ss_pred             cCCCcEEEEECC-CCccEEEEEEEEeecCcc-cc--CCCEeeEEEEEccCCCCCccCCeEEEeeccCccEEEEEecc
Confidence            334444444332 3332 2344444433332 11  22334577888999999999999984   367999999875


No 55 
>KOG3209 consensus WW domain-containing protein [General function prediction only]
Probab=95.79  E-value=0.014  Score=60.99  Aligned_cols=64  Identities=25%  Similarity=0.424  Sum_probs=51.7

Q ss_pred             CCCCceeeecccc--------ccccccccccc-CCCCCcEEEEECCEEeCCHH--HHHHHHhcCCCCCEEEEEEEEC
Q 016647          303 ASSGVGFSIPVDT--------GLLSTKRDAYG-RLILGDIITSVNGKKVSNGS--DLYRILDQCKVGDEVIVEVLRG  368 (385)
Q Consensus       303 ~~~~~g~aIP~~~--------~v~~~~~a~~~-gl~~GDiI~~ing~~i~s~~--~l~~~l~~~~~g~~v~l~v~R~  368 (385)
                      ++.||||.|=.+.        .+..++||+.. .|+.||-|+++||+.|.++.  |+..+++.  .|-+|+|+|.-.
T Consensus       763 ENeGFGFVi~sS~~kp~sgiGrIieGSPAdRCgkLkVGDrilAVNG~sI~~lsHadiv~LIKd--aGlsVtLtIip~  837 (984)
T KOG3209|consen  763 ENEGFGFVIMSSQNKPESGIGRIIEGSPADRCGKLKVGDRILAVNGQSILNLSHADIVSLIKD--AGLSVTLTIIPP  837 (984)
T ss_pred             cCCceeEEEEecccCCCCCccccccCChhHhhccccccceEEEecCeeeeccCchhHHHHHHh--cCceEEEEEcCh
Confidence            4678999886655        46778888775 59999999999999999875  67777765  699999998754


No 56 
>PF09342 DUF1986:  Domain of unknown function (DUF1986);  InterPro: IPR015420 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This domain is found in serine endopeptidases belonging to MEROPS peptidase family S1A (clan PA). It is found in unusual mosaic proteins, which are encoded by the Drosophila nudel gene (see P98159 from SWISSPROT). Nudel is involved in defining embryonic dorsoventral polarity. Three proteases, ndl, gd and snk, process easter to create active easter. Active easter defines cell identities along the dorsal-ventral continuum by activating the spz ligand for the Tl receptor in the ventral region of the embryo. Nudel, pipe and windbeutel together trigger the protease cascade within the extraembryonic perivitelline compartment which induces dorsoventral polarity of the Drosophila embryo [].
Probab=95.60  E-value=0.095  Score=48.51  Aligned_cols=96  Identities=17%  Similarity=0.249  Sum_probs=65.6

Q ss_pred             ccccccccCCCeEEEEEEEcCCCEEEecccccCCC----CeEEEEecCCCeEe------eEEEEEC-----CCCCeEEEE
Q 016647          141 AFTLDVLEVPQGSGSGFVWDSKGHVVTNYHVIRGA----SDIRVTFADQSAYD------AKIVGFD-----QDKDVAVLR  205 (385)
Q Consensus       141 ~~~~~~~~~~~~~GSGfiI~~~G~ILT~aHvv~~~----~~i~V~~~dg~~~~------a~vv~~d-----~~~DlAlLk  205 (385)
                      ||..+...++...++|++||++ |||++-.|+.+-    .-+.+.++.++.+.      -++..+|     ++.++.+|.
T Consensus        17 PWlA~IYvdG~~~CsgvLlD~~-WlLvsssCl~~I~L~~~YvsallG~~Kt~~~v~Gp~EQI~rVD~~~~V~~S~v~LLH   95 (267)
T PF09342_consen   17 PWLADIYVDGRYWCSGVLLDPH-WLLVSSSCLRGISLSHHYVSALLGGGKTYLSVDGPHEQISRVDCFKDVPESNVLLLH   95 (267)
T ss_pred             cceeeEEEcCeEEEEEEEeccc-eEEEeccccCCcccccceEEEEecCcceecccCCChheEEEeeeeeeccccceeeee
Confidence            3444555567889999999987 999999999863    34777788777543      1344444     678999999


Q ss_pred             EcCCC---CCCcceecCC-CCCCCCCCEEEEEeCCC
Q 016647          206 IDAPK---DKLRPIPIGV-SADLLVGQKVYAIGNPF  237 (385)
Q Consensus       206 v~~~~---~~~~~l~l~~-~~~~~~G~~V~~iG~p~  237 (385)
                      ++.+.   ..+.|.-+.. .......+..+++|.-.
T Consensus        96 L~~~~~fTr~VlP~flp~~~~~~~~~~~CVAVg~d~  131 (267)
T PF09342_consen   96 LEQPANFTRYVLPTFLPETSNENESDDECVAVGHDD  131 (267)
T ss_pred             ecCcccceeeecccccccccCCCCCCCceEEEEccc
Confidence            98764   2334444433 23345556888998653


No 57 
>PF00949 Peptidase_S7:  Peptidase S7, Flavivirus NS3 serine protease ;  InterPro: IPR001850 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. 
They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This signature identifies serine peptidases belonging to MEROPS peptidase family S7 (flavivirin family, clan PA(S)). The protein fold of the peptidase domain for members of this family resembles that of chymotrypsin, the type example for clan PA.  Flaviviruses produce a polyprotein from the ssRNA genome. The N terminus of the NS3 protein (approx. 180 aa) is required for the processing of the polyprotein. NS3 also has conserved homology with NTP-binding proteins and the DEAD family of RNA helicases [, , ].; GO: 0003723 RNA binding, 0003724 RNA helicase activity, 0005524 ATP binding; PDB: 2IJO_B 3E90_D 2GGV_B 2FP7_B 2WV9_A 3U1I_B 3U1J_B 2WZQ_A 2WHX_A 3L6P_A ....
Probab=95.54  E-value=0.013  Score=49.54  Aligned_cols=32  Identities=22%  Similarity=0.456  Sum_probs=22.4

Q ss_pred             EEEccccCCCCCCceeeCCCccEEEEeecccC
Q 016647          268 IQTDAAINPGNSGGPLLDSSGSLIGINTAIYS  299 (385)
Q Consensus       268 i~~d~~i~~G~SGGPlvn~~G~VVGI~s~~~~  299 (385)
                      ...+....+|.||+|+||.+|++|||......
T Consensus        88 ~~~~~d~~~GsSGSpi~n~~g~ivGlYg~g~~  119 (132)
T PF00949_consen   88 GAIDLDFPKGSSGSPIFNQNGEIVGLYGNGVE  119 (132)
T ss_dssp             EEE---S-TTGTT-EEEETTSCEEEEEEEEEE
T ss_pred             EeeecccCCCCCCCceEcCCCcEEEEEcccee
Confidence            34444567999999999999999999876543


No 58 
>PF10459 Peptidase_S46:  Peptidase S46;  InterPro: IPR019500 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This entry represents S46 peptidases, of which dipeptidyl-peptidase 7 (DPP-7) is the best-characterised member. It is a serine peptidase that is located on the cell surface and is predicted to have two N-terminal transmembrane domains. 
Probab=95.35  E-value=0.012  Score=62.91  Aligned_cols=21  Identities=33%  Similarity=0.272  Sum_probs=18.9

Q ss_pred             EEEEEEEcCCCEEEecccccC
Q 016647          153 SGSGFVWDSKGHVVTNYHVIR  173 (385)
Q Consensus       153 ~GSGfiI~~~G~ILT~aHvv~  173 (385)
                      -|||-+|+++|+|+||.||..
T Consensus        48 GCSgsfVS~~GLvlTNHHC~~   68 (698)
T PF10459_consen   48 GCSGSFVSPDGLVLTNHHCGY   68 (698)
T ss_pred             ceeEEEEcCCceEEecchhhh
Confidence            599999999999999999964


No 59 
>PF08192 Peptidase_S64:  Peptidase family S64;  InterPro: IPR012985 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This family of fungal proteins is involved in the processing of the membrane-bound transcription factor Stp1 [] and belongs to MEROPS peptidase family S64 (clan PA). The processing causes the signalling domain of Stp1 to be passed to the nucleus where several permease genes are induced. The permeases are important for uptake of amino acids, and processing of Stp1 only occurs in an amino acid-rich environment. This family is predicted to be distantly related to the trypsin family (MEROPS peptidase family S1) and to have a typical trypsin-like catalytic triad [].
Probab=95.04  E-value=0.16  Score=53.30  Aligned_cols=109  Identities=21%  Similarity=0.428  Sum_probs=65.9

Q ss_pred             CCCeEEEEEcCCC-------CCC------cceecCCC------CCCCCCCEEEEEeCCCCCCCceEEeEEeeeeeeeccC
Q 016647          198 DKDVAVLRIDAPK-------DKL------RPIPIGVS------ADLLVGQKVYAIGNPFGLDHTLTTGVISGLRREISSA  258 (385)
Q Consensus       198 ~~DlAlLkv~~~~-------~~~------~~l~l~~~------~~~~~G~~V~~iG~p~g~~~~~~~G~Vs~~~~~~~~~  258 (385)
                      -.|+|||+++...       +.+      |.+.+.+.      ..+..|..|+-+|.-.+    .+.|.+.++.-..  +
T Consensus       542 LsD~AIIkV~~~~~~~N~LGddi~f~~~dP~l~f~NlyV~~~~~~~~~G~~VfK~GrTTg----yT~G~lNg~klvy--w  615 (695)
T PF08192_consen  542 LSDWAIIKVNKERKCQNYLGDDIQFNEPDPTLMFQNLYVREVVSNLVPGMEVFKVGRTTG----YTTGILNGIKLVY--W  615 (695)
T ss_pred             ccceEEEEeCCCceecCCCCccccccCCCccccccccchhhhhhccCCCCeEEEecccCC----ccceEecceEEEE--e
Confidence            4599999997532       111      22333221      23567999999987644    4677777654322  1


Q ss_pred             CCCCC-cccEEEEc----cccCCCCCCceeeCCCcc------EEEEeecccCCCCCCCCceeeecccc
Q 016647          259 ATGRP-IQDVIQTD----AAINPGNSGGPLLDSSGS------LIGINTAIYSPSGASSGVGFSIPVDT  315 (385)
Q Consensus       259 ~~~~~-~~~~i~~d----~~i~~G~SGGPlvn~~G~------VVGI~s~~~~~~~~~~~~g~aIP~~~  315 (385)
                      ..+.. ..+++...    .-...|+||+-|++.-+.      |+||.++.-   |+...+|...|+..
T Consensus       616 ~dG~i~s~efvV~s~~~~~Fa~~GDSGS~VLtk~~d~~~gLgvvGMlhsyd---ge~kqfglftPi~~  680 (695)
T PF08192_consen  616 ADGKIQSSEFVVSSDNNPAFASGGDSGSWVLTKLEDNNKGLGVVGMLHSYD---GEQKQFGLFTPINE  680 (695)
T ss_pred             cCCCeEEEEEEEecCCCccccCCCCcccEEEecccccccCceeeEEeeecC---CccceeeccCcHHH
Confidence            12211 12333333    234589999999985444      999998753   34557888888764


No 60 
>KOG3580 consensus Tight junction proteins [Signal transduction mechanisms]
Probab=94.13  E-value=0.074  Score=54.72  Aligned_cols=73  Identities=23%  Similarity=0.234  Sum_probs=53.2

Q ss_pred             CCccEEEEeecccCCCCCCCCceeeecccccccccccccccCCCCCcEEEEECCEEeCCHH--HHHHHHhcCCCCCEEEE
Q 016647          286 SSGSLIGINTAIYSPSGASSGVGFSIPVDTGLLSTKRDAYGRLILGDIITSVNGKKVSNGS--DLYRILDQCKVGDEVIV  363 (385)
Q Consensus       286 ~~G~VVGI~s~~~~~~~~~~~~g~aIP~~~~v~~~~~a~~~gl~~GDiI~~ing~~i~s~~--~l~~~l~~~~~g~~v~l  363 (385)
                      .+|.-||+--++-.      -.|+.+   .||+.+++++..||+.||.|+++|..+..+.-  +...+|-...+|+.|+|
T Consensus       414 ~KGdSvGLRLAGGN------DVGIFV---aGvqegspA~~eGlqEGDQIL~VN~vdF~nl~REeAVlfLL~lPkGEevti  484 (1027)
T KOG3580|consen  414 KKGDSVGLRLAGGN------DVGIFV---AGVQEGSPAEQEGLQEGDQILKVNTVDFRNLVREEAVLFLLELPKGEEVTI  484 (1027)
T ss_pred             ecCCeeeeEeccCC------ceeEEE---eecccCCchhhccccccceeEEeccccchhhhHHHHHHHHhcCCCCcEEee
Confidence            46777776544311      223322   27889999999999999999999999999854  44455667899999999


Q ss_pred             EEEE
Q 016647          364 EVLR  367 (385)
Q Consensus       364 ~v~R  367 (385)
                      .-++
T Consensus       485 laQ~  488 (1027)
T KOG3580|consen  485 LAQS  488 (1027)
T ss_pred             hhhh
Confidence            6544


No 61 
>PF02122 Peptidase_S39:  Peptidase S39;  InterPro: IPR000382 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. ORF2 of Potato leafroll virus (PLrV) encodes a polyprotein which is translated following a -1 frameshift. The polyprotein has a putative linear arrangement of membrane anchor-VPg-peptidase-polymerase domains. The serine peptidase domain which is found in this group of sequences belongs to MEROPS peptidase family S39 (clan PA(S)). It is likely that the peptidase domain is involved in the cleavage of the polyprotein []. The nucleotide sequence for the RNA of PLrV has been determined [, ]. The sequence contains six large open reading frames (ORFs). The 5' coding region encodes two polypeptides of 28K and 70K, which overlap in different reading frames; it is suggested that the third ORF in the 5' block is translated by frameshift readthrough near the end of the 70K protein, yielding a 118K polypeptide []. 
Segments of the predicted amino acid sequences of these ORFs resemble those of known viral RNA polymerases, ATP-binding proteins and viral genome-linked proteins. The nucleotide sequence of the genomic RNA of Beet western yellows virus (BWYV) has been determined []. The sequence contains six long ORFs. A cluster of three of these ORFs, including the coat protein cistron, display extensive amino acid sequence similarity to corresponding ORFs of a second luteovirus: Barley yellow dwarf virus [].; GO: 0004252 serine-type endopeptidase activity, 0022415 viral reproductive process, 0016021 integral to membrane; PDB: 1ZYO_A.
Probab=94.06  E-value=0.18  Score=45.94  Aligned_cols=144  Identities=15%  Similarity=0.170  Sum_probs=50.1

Q ss_pred             eEEEEEEEcCCC--EEEecccccCCCCeEEEEecCCCeEee---EEEEECCCCCeEEEEEcCCC---CCCcceecCCCCC
Q 016647          152 GSGSGFVWDSKG--HVVTNYHVIRGASDIRVTFADQSAYDA---KIVGFDQDKDVAVLRIDAPK---DKLRPIPIGVSAD  223 (385)
Q Consensus       152 ~~GSGfiI~~~G--~ILT~aHvv~~~~~i~V~~~dg~~~~a---~vv~~d~~~DlAlLkv~~~~---~~~~~l~l~~~~~  223 (385)
                      +.++.+-. .+|  .++|+.||..+...+. ...+|+.++-   +.+..+...|++||++....   ...+.+.+.....
T Consensus        30 Gya~cv~l-~~g~~~L~ta~Hv~~~~~~~~-~~k~g~kipl~~f~~~~~~~~~D~~il~~P~n~~s~Lg~k~~~~~~~~~  107 (203)
T PF02122_consen   30 GYATCVRL-FDGEDALLTARHVWSRPSKVT-SLKTGEKIPLAEFTDLLESRIADFVILRGPPNWESKLGVKAAQLSQNSQ  107 (203)
T ss_dssp             ----EEEE-----EEEEE-HHHHTSSS----EEETTEEEE--S-EEEEE-TTT-EEEEE--HHHHHHHT-----B----S
T ss_pred             ccceEEEC-cCCccceecccccCCCcccee-EcCCCCcccchhChhhhCCCccCEEEEecCcCHHHHhCcccccccchhh
Confidence            44455432 233  6999999998855543 3344544432   34456788999999997321   1223333322111


Q ss_pred             CCCCCEEEEEeCCCCCCCceEEeEEeeeeeeeccCCCCCCcccEEEEccccCCCCCCceeeCCCccEEEEeecccCCCCC
Q 016647          224 LLVGQKVYAIGNPFGLDHTLTTGVISGLRREISSAATGRPIQDVIQTDAAINPGNSGGPLLDSSGSLIGINTAIYSPSGA  303 (385)
Q Consensus       224 ~~~G~~V~~iG~p~g~~~~~~~G~Vs~~~~~~~~~~~~~~~~~~i~~d~~i~~G~SGGPlvn~~G~VVGI~s~~~~~~~~  303 (385)
                      +..|    .+     ..+....+........+..     ....+...-+...+|.||.|+++.+ ++||++... .....
T Consensus       108 ~~~g----~~-----~~y~~~~~~~~~~sa~i~g-----~~~~~~~vls~T~~G~SGtp~y~g~-~vvGvH~G~-~~~~~  171 (203)
T PF02122_consen  108 LAKG----PV-----SFYGFSSGEWPCSSAKIPG-----TEGKFASVLSNTSPGWSGTPYYSGK-NVVGVHTGS-PSGSN  171 (203)
T ss_dssp             EEEE----ES-----STTSEEEEEEEEEE-S---------STTEEEE-----TT-TT-EEE-SS--EEEEEEEE------
T ss_pred             hCCC----Ce-----eeeeecCCCceeccCcccc-----ccCcCCceEcCCCCCCCCCCeEECC-CceEeecCc-ccccc
Confidence            1100    01     1112222222211111111     1123556667788999999999987 999999875 22233


Q ss_pred             CCCceeeecc
Q 016647          304 SSGVGFSIPV  313 (385)
Q Consensus       304 ~~~~g~aIP~  313 (385)
                      ....++..|+
T Consensus       172 ~~n~n~~spi  181 (203)
T PF02122_consen  172 RENNNRMSPI  181 (203)
T ss_dssp             ----------
T ss_pred             cccccccccc
Confidence            4455555444


No 62 
>TIGR02860 spore_IV_B stage IV sporulation protein B. SpoIVB, the stage IV sporulation protein B of endospore-forming bacteria such as Bacillus subtilis, is a serine proteinase, expressed in the spore (rather than mother cell) compartment, that participates in a proteolytic activation cascade for Sigma-K. It appears to be universal among endospore-forming bacteria and occurs nowhere else.
Probab=93.80  E-value=0.46  Score=47.63  Aligned_cols=41  Identities=24%  Similarity=0.552  Sum_probs=29.1

Q ss_pred             ccccCCCCCCceeeCCCccEEEEeecccCCCCCCCCceeeecccc
Q 016647          271 DAAINPGNSGGPLLDSSGSLIGINTAIYSPSGASSGVGFSIPVDT  315 (385)
Q Consensus       271 d~~i~~G~SGGPlvn~~G~VVGI~s~~~~~~~~~~~~g~aIP~~~  315 (385)
                      ...+..||||+|++ .+|++||=++-.+-.   ....||.|-+.+
T Consensus       354 tgGivqGMSGSPi~-q~gkliGAvtHVfvn---dpt~GYGi~ie~  394 (402)
T TIGR02860       354 TGGIVQGMSGSPII-QNGKVIGAVTHVFVN---DPTSGYGVYIEW  394 (402)
T ss_pred             hCCEEecccCCCEE-ECCEEEEEEEEEEec---CCCcceeehHHH
Confidence            34566899999999 799999977655432   335677765543


No 63 
>KOG3580 consensus Tight junction proteins [Signal transduction mechanisms]
Probab=93.02  E-value=0.16  Score=52.32  Aligned_cols=52  Identities=31%  Similarity=0.453  Sum_probs=40.1

Q ss_pred             ccccCCCCCcEEEEECCEEeCCHH--HHHHHHhcCCCCCEEEEEEEECCEEEEEEE
Q 016647          323 DAYGRLILGDIITSVNGKKVSNGS--DLYRILDQCKVGDEVIVEVLRGDQKEKIPV  376 (385)
Q Consensus       323 a~~~gl~~GDiI~~ing~~i~s~~--~l~~~l~~~~~g~~v~l~v~R~g~~~~~~v  376 (385)
                      +..++|+.||+|++|||.--.|+.  |..+++...  -.++.|.|+||.+..-+++
T Consensus       233 ardgnlqEGDiiLkINGtvteNmSLtDar~LIEkS--~GKL~lvVlRD~~qtLiNi  286 (1027)
T KOG3580|consen  233 ARDGNLQEGDIILKINGTVTENMSLTDARKLIEKS--RGKLQLVVLRDSQQTLINI  286 (1027)
T ss_pred             hccCCcccccEEEEECcEeeccccchhHHHHHHhc--cCceEEEEEecCCceeeec
Confidence            456789999999999998888754  777887753  3467999999976655554


No 64 
>PF10459 Peptidase_S46:  Peptidase S46;  InterPro: IPR019500 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This entry represents S46 peptidases, where dipeptidyl-peptidase 7 (DPP-7) is the best-characterised member of this family. It is a serine peptidase that is located on the cell surface and is predicted to have two N-terminal transmembrane domains. 
Probab=92.86  E-value=0.053  Score=58.13  Aligned_cols=28  Identities=36%  Similarity=0.719  Sum_probs=15.0

Q ss_pred             EEEccccCCCCCCceeeCCCccEEEEee
Q 016647          268 IQTDAAINPGNSGGPLLDSSGSLIGINT  295 (385)
Q Consensus       268 i~~d~~i~~G~SGGPlvn~~G~VVGI~s  295 (385)
                      +.++..+..||||+|++|.+|||||+++
T Consensus       624 FlstnDitGGNSGSPvlN~~GeLVGl~F  651 (698)
T PF10459_consen  624 FLSTNDITGGNSGSPVLNAKGELVGLAF  651 (698)
T ss_pred             EEeccCcCCCCCCCccCCCCceEEEEee
Confidence            3444455555555555555555555554


No 65 
>PF12812 PDZ_1:  PDZ-like domain
Probab=92.56  E-value=0.15  Score=39.09  Aligned_cols=33  Identities=33%  Similarity=0.443  Sum_probs=28.7

Q ss_pred             cccCCCCCcEEEEECCEEeCCHHHHHHHHhcCC
Q 016647          324 AYGRLILGDIITSVNGKKVSNGSDLYRILDQCK  356 (385)
Q Consensus       324 ~~~gl~~GDiI~~ing~~i~s~~~l~~~l~~~~  356 (385)
                      ...++..|-+|++|||+++.+.+++.+.+++.+
T Consensus        44 ~~~~i~~g~iI~~Vn~kpt~~Ld~f~~vvk~ip   76 (78)
T PF12812_consen   44 FAGGISKGFIITSVNGKPTPDLDDFIKVVKKIP   76 (78)
T ss_pred             hhCCCCCCeEEEeECCcCCcCHHHHHHHHHhCC
Confidence            334499999999999999999999999998753


No 66 
>COG0750 Predicted membrane-associated Zn-dependent proteases 1 [Cell envelope biogenesis, outer membrane]
Probab=92.41  E-value=0.39  Score=47.60  Aligned_cols=56  Identities=29%  Similarity=0.421  Sum_probs=47.9

Q ss_pred             cccccccccccCCCCCcEEEEECCEEeCCHHHHHHHHhcCCCCCE---EEEEEEE-CCEEE
Q 016647          316 GLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQCKVGDE---VIVEVLR-GDQKE  372 (385)
Q Consensus       316 ~v~~~~~a~~~gl~~GDiI~~ing~~i~s~~~l~~~l~~~~~g~~---v~l~v~R-~g~~~  372 (385)
                      ++...++++.+++++||.|+++|++++.++.+..+.+... .|..   +.+.+.| +++..
T Consensus       135 ~v~~~s~a~~a~l~~Gd~iv~~~~~~i~~~~~~~~~~~~~-~~~~~~~~~i~~~~~~~~~~  194 (375)
T COG0750         135 EVAPKSAAALAGLRPGDRIVAVDGEKVASWDDVRRLLVAA-AGDVFNLLTILVIRLDGEAH  194 (375)
T ss_pred             ecCCCCHHHHcCCCCCCEEEeECCEEccCHHHHHHHHHhc-cCCcccceEEEEEeccceee
Confidence            5678888999999999999999999999999998888764 5565   8999999 77663


No 67 
>PF00944 Peptidase_S3:  Alphavirus core protein ;  InterPro: IPR000930 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. Togavirin, also known as Sindbis virus core endopeptidase, is a serine protease resident at the N terminus of the p130 polyprotein of togaviruses []. The endopeptidase signature identifies the peptidase as belonging to the MEROPS peptidase family S3 (togavirin family, clan PA(S)). The polyprotein also includes structural proteins for the nucleocapsid core and for the glycoprotein spikes []. Togavirin is only active while part of the polyprotein, cleavage at a Trp-Ser bond resulting in total lack of activity []. Mutagenesis studies have identified the location of the His-Asp-Ser catalytic triad, and X-ray studies have revealed the protein fold to be similar to that of chymotrypsin [, ].; GO: 0004252 serine-type endopeptidase activity, 0006508 proteolysis, 0016020 membrane; PDB: 2YEW_D 1EP5_A 3J0C_F 1EP6_C 1WYK_D 1DYL_A 1VCQ_B 1VCP_B 1LD4_D 1KXA_A ....
Probab=92.33  E-value=0.12  Score=43.38  Aligned_cols=28  Identities=32%  Similarity=0.565  Sum_probs=23.2

Q ss_pred             ccccCCCCCCceeeCCCccEEEEeeccc
Q 016647          271 DAAINPGNSGGPLLDSSGSLIGINTAIY  298 (385)
Q Consensus       271 d~~i~~G~SGGPlvn~~G~VVGI~s~~~  298 (385)
                      ...-.+|+||-|++|..|+||||+-.+.
T Consensus       100 ~g~g~~GDSGRpi~DNsGrVVaIVLGG~  127 (158)
T PF00944_consen  100 TGVGKPGDSGRPIFDNSGRVVAIVLGGA  127 (158)
T ss_dssp             TTS-STTSTTEEEESTTSBEEEEEEEEE
T ss_pred             cCCCCCCCCCCccCcCCCCEEEEEecCC
Confidence            3445689999999999999999998764


No 68 
>KOG3209 consensus WW domain-containing protein [General function prediction only]
Probab=92.20  E-value=0.24  Score=52.17  Aligned_cols=68  Identities=25%  Similarity=0.417  Sum_probs=51.2

Q ss_pred             CCCCceeeecccccccc----cccccccCCCCCcEEEEECCEEeCCHH--HHHHHHhcCCCCCEEEEEEEECCE
Q 016647          303 ASSGVGFSIPVDTGLLS----TKRDAYGRLILGDIITSVNGKKVSNGS--DLYRILDQCKVGDEVIVEVLRGDQ  370 (385)
Q Consensus       303 ~~~~~g~aIP~~~~v~~----~~~a~~~gl~~GDiI~~ing~~i~s~~--~l~~~l~~~~~g~~v~l~v~R~g~  370 (385)
                      +.+++||+|--......    -.+..--||++||.|++||++.+....  ++-+++.+...|..+.|.|.|+|-
T Consensus       493 gpegfgftiADsPtgqrvK~ilDp~~c~gl~eGd~IVei~~rnvr~L~h~qvvdmlke~piG~r~~Llv~RGgp  566 (984)
T KOG3209|consen  493 GPEGFGFTIADSPTGQRVKQILDPQDCPGLSEGDLIVEINERNVRALTHTQVVDMLKECPIGSRVHLLVKRGGP  566 (984)
T ss_pred             CCCCCCceeccCCCCCceeeecCcccCCCCCCCCeEEecccccccccchHHHHHHHHhccCCcceeEEEecCCC
Confidence            35678888744332211    122334589999999999999999765  677899999999999999999874


No 69 
>KOG3605 consensus Beta amyloid precursor-binding protein [General function prediction only]
Probab=89.06  E-value=1.4  Score=46.28  Aligned_cols=79  Identities=23%  Similarity=0.333  Sum_probs=57.4

Q ss_pred             CCccEEEEeecccCCCCCCCCceeeecccc--ccccccccccc-CCCCCcEEEEECCEEeCC--HHHHHHHHhcCCCCCE
Q 016647          286 SSGSLIGINTAIYSPSGASSGVGFSIPVDT--GLLSTKRDAYG-RLILGDIITSVNGKKVSN--GSDLYRILDQCKVGDE  360 (385)
Q Consensus       286 ~~G~VVGI~s~~~~~~~~~~~~g~aIP~~~--~v~~~~~a~~~-gl~~GDiI~~ing~~i~s--~~~l~~~l~~~~~g~~  360 (385)
                      .+||.+||+--       ..|.|=.+|.-.  .+..++++++. .|..||.|++|||...--  ...-+.+++..|.-..
T Consensus       654 ~kGEiLGVViV-------ESGWGSmLPTVViAnmm~~GpAarsgkLnIGDQiiaING~SLVGLPLstcQs~Ik~~KnQT~  726 (829)
T KOG3605|consen  654 HKGEILGVVIV-------ESGWGSILPTVVIANMMHGGPAARSGKLNIGDQIMSINGTSLVGLPLSTCQSIIKGLKNQTA  726 (829)
T ss_pred             ccCceeeEEEE-------ecCccccchHHHHHhcccCChhhhcCCccccceeEeecCceeccccHHHHHHHHhcccccce
Confidence            57899998753       346666677644  56677887765 599999999999988764  3466677888777777


Q ss_pred             EEEEEEECCEE
Q 016647          361 VIVEVLRGDQK  371 (385)
Q Consensus       361 v~l~v~R~g~~  371 (385)
                      |+++|.+=--.
T Consensus       727 VkltiV~cpPV  737 (829)
T KOG3605|consen  727 VKLNIVSCPPV  737 (829)
T ss_pred             EEEEEecCCCc
Confidence            88877764333


No 70 
>KOG3532 consensus Predicted protein kinase [General function prediction only]
Probab=88.77  E-value=0.57  Score=49.17  Aligned_cols=47  Identities=15%  Similarity=0.280  Sum_probs=39.6

Q ss_pred             cccccccccccCCCCCcEEEEECCEEeCCHHHHHHHHhcCCCCCEEEE
Q 016647          316 GLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQCKVGDEVIV  363 (385)
Q Consensus       316 ~v~~~~~a~~~gl~~GDiI~~ing~~i~s~~~l~~~l~~~~~g~~v~l  363 (385)
                      .|.+.+++.++.+++||++++|||.+|++..+..+.+... .|+...+
T Consensus       404 tv~~ns~a~k~~~~~gdvlvai~~~pi~s~~q~~~~~~s~-~~~~~~l  450 (1051)
T KOG3532|consen  404 TVEDNSLADKAAFKPGDVLVAINNVPIRSERQATRFLQST-TGDLTVL  450 (1051)
T ss_pred             EecCCChhhHhcCCCcceEEEecCccchhHHHHHHHHHhc-ccceEEE
Confidence            4567889999999999999999999999999999999876 3544333


No 71 
>KOG3552 consensus FERM domain protein FRM-8 [General function prediction only]
Probab=88.61  E-value=0.5  Score=51.22  Aligned_cols=45  Identities=27%  Similarity=0.452  Sum_probs=35.7

Q ss_pred             ccccccCCCCCcEEEEECCEEeCC--HHHHHHHHhcCCCCCEEEEEEEE
Q 016647          321 KRDAYGRLILGDIITSVNGKKVSN--GSDLYRILDQCKVGDEVIVEVLR  367 (385)
Q Consensus       321 ~~a~~~gl~~GDiI~~ing~~i~s--~~~l~~~l~~~~~g~~v~l~v~R  367 (385)
                      +....+.|++||.|+.|||++|..  ++.+.++++.+  .+.|.|+|.+
T Consensus        85 GGps~GKL~PGDQIl~vN~Epv~daprervIdlvRac--e~sv~ltV~q  131 (1298)
T KOG3552|consen   85 GGPSIGKLQPGDQILAVNGEPVKDAPRERVIDLVRAC--ESSVNLTVCQ  131 (1298)
T ss_pred             CCCccccccCCCeEEEecCcccccccHHHHHHHHHHH--hhhcceEEec
Confidence            335567799999999999999996  55677888764  4678888877


No 72 
>KOG3553 consensus Tax interaction protein TIP1 [Cell wall/membrane/envelope biogenesis]
Probab=83.67  E-value=0.62  Score=37.09  Aligned_cols=28  Identities=21%  Similarity=0.197  Sum_probs=25.0

Q ss_pred             cccccccccccCCCCCcEEEEECCEEeC
Q 016647          316 GLLSTKRDAYGRLILGDIITSVNGKKVS  343 (385)
Q Consensus       316 ~v~~~~~a~~~gl~~GDiI~~ing~~i~  343 (385)
                      .+..++|++.+||+.+|.|+.+||-..+
T Consensus        65 ~V~eGsPA~~AGLrihDKIlQvNG~DfT   92 (124)
T KOG3553|consen   65 RVSEGSPAEIAGLRIHDKILQVNGWDFT   92 (124)
T ss_pred             EeccCChhhhhcceecceEEEecCceeE
Confidence            5667899999999999999999997766


No 73 
>PF02907 Peptidase_S29:  Hepatitis C virus NS3 protease;  InterPro: IPR004109 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This signature identifies the Hepatitis C virus NS3 protein as a serine protease which belongs to MEROPS peptidase family S29 (hepacivirin family, clan PA(S)), which has a trypsin-like fold. The non-structural (NS) protein NS3 is one of the NS proteins involved in replication of the HCV genome. The NS2 proteinase (IPR002518 from INTERPRO), a zinc-dependent enzyme, performs a single proteolytic cut to release the N terminus of NS3. The action of NS3 proteinase (NS3P), which resides in the N-terminal one-third of the NS3 protein, then yields all remaining non-structural proteins. The C-terminal two-thirds of the NS3 protein contain a helicase. The functional relationship between the proteinase and helicase domains is unknown. NS3 has a structural zinc-binding site and requires cofactor NS4. 
It has been suggested that the NS3 serine protease of hepatitis C is involved in cell transformation and that the ability to transform requires an active enzyme [].; GO: 0008236 serine-type peptidase activity, 0006508 proteolysis, 0019087 transformation of host cell by virus; PDB: 2QV1_B 3LOX_C 2OBQ_C 2OC1_C 2OC0_A 3LON_A 3KNX_A 2O8M_A 2OBO_A 2OC8_A ....
Probab=80.55  E-value=0.86  Score=38.39  Aligned_cols=114  Identities=21%  Similarity=0.271  Sum_probs=55.3

Q ss_pred             EEEEEcCCCEEEecccccCCCCeEEEEecCCCeEeeEEEEECCCCCeEEEEEcCCCCCCcceecCCCCCCCCCCEEEEEe
Q 016647          155 SGFVWDSKGHVVTNYHVIRGASDIRVTFADQSAYDAKIVGFDQDKDVAVLRIDAPKDKLRPIPIGVSADLLVGQKVYAIG  234 (385)
Q Consensus       155 SGfiI~~~G~ILT~aHvv~~~~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkv~~~~~~~~~l~l~~~~~~~~G~~V~~iG  234 (385)
                      -|+.|+  |..-|.+|--...   ++--+.   -+..-.+.+...|+..-....-...+.|-.-+       -+.++++-
T Consensus        15 mgt~vn--GV~wT~~HGagsr---tlAgp~---Gpv~q~~~s~~~Dlv~~p~P~Ga~SL~pCtCg-------~~dlylVt   79 (148)
T PF02907_consen   15 MGTCVN--GVMWTVYHGAGSR---TLAGPK---GPVNQMYTSVDDDLVGWPAPPGARSLTPCTCG-------SSDLYLVT   79 (148)
T ss_dssp             EEEEET--TEEEEEHHHHTTS---EEEBTT---SEB-ESEEETTTTEEEEE-STTB--BBB-SSS-------SSEEEEE-
T ss_pred             ehhEEc--cEEEEEEecCCcc---cccCCC---CcceEeEEcCCCCCcccccccccccCCccccC-------CccEEEEe
Confidence            367775  7888888864321   111111   12334456777898887775433334433332       24566664


Q ss_pred             CCCCCCCceEEeEEeeeeeeeccCCCCCCcccEEEE--ccccCCCCCCceeeCCCccEEEEeecccC
Q 016647          235 NPFGLDHTLTTGVISGLRREISSAATGRPIQDVIQT--DAAINPGNSGGPLLDSSGSLIGINTAIYS  299 (385)
Q Consensus       235 ~p~g~~~~~~~G~Vs~~~~~~~~~~~~~~~~~~i~~--d~~i~~G~SGGPlvn~~G~VVGI~s~~~~  299 (385)
                      +-    ..+-.+     ++.      +.. ...+..  -.....|.||||++-.+|.+|||..+...
T Consensus        80 r~----~~v~p~-----rr~------gd~-~~~L~sp~pis~lkGSSGgPiLC~~GH~vG~f~aa~~  130 (148)
T PF02907_consen   80 RD----ADVIPV-----RRR------GDS-RASLLSPRPISDLKGSSGGPILCPSGHAVGMFRAAVC  130 (148)
T ss_dssp             TT----S-EEEE-----EEE------STT-EEEEEEEEEHHHHTT-TT-EEEETTSEEEEEEEEEEE
T ss_pred             cc----CcEeee-----EEc------CCC-ceEecCCceeEEEecCCCCcccCCCCCEEEEEEEEEE
Confidence            32    111111     111      000 001111  11234799999999999999999776544


No 74 
>KOG3550 consensus Receptor targeting protein Lin-7 [Extracellular structures]
Probab=79.65  E-value=3.7  Score=35.26  Aligned_cols=48  Identities=27%  Similarity=0.320  Sum_probs=31.1

Q ss_pred             cccccccc-cccCCCCCcEEEEECCEEeCCHH--HHHHHHhcCCCCCEEEEEE
Q 016647          316 GLLSTKRD-AYGRLILGDIITSVNGKKVSNGS--DLYRILDQCKVGDEVIVEV  365 (385)
Q Consensus       316 ~v~~~~~a-~~~gl~~GDiI~~ing~~i~s~~--~l~~~l~~~~~g~~v~l~v  365 (385)
                      .+.+++.+ -.+|||.||.++++||..|.--.  -..++|... . ..|++.|
T Consensus       121 riipggvadrhgglkrgdqllsvngvsvege~hekavellkaa-~-gsvklvv  171 (207)
T KOG3550|consen  121 RIIPGGVADRHGGLKRGDQLLSVNGVSVEGEHHEKAVELLKAA-V-GSVKLVV  171 (207)
T ss_pred             eecCCccccccCcccccceeEeecceeecchhhHHHHHHHHHh-c-CcEEEEE
Confidence            34566554 56799999999999999997432  233344432 2 3467655


No 75 
>PF03510 Peptidase_C24:  2C endopeptidase (C24) cysteine protease family;  InterPro: IPR000317 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. 
These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad [].  The two signatures that define this group of calicivirus polyproteins identify a cysteine peptidase signature that belongs to MEROPS peptidase family C24 (clan PA(C)). Caliciviruses are positive-stranded ssRNA viruses that cause gastroenteritis. The calicivirus genome contains two open reading frames, ORF1 and ORF2. ORF2 encodes a structural protein []; while ORF1 encodes a non-structural polypeptide, which has RNA helicase, cysteine protease and RNA polymerase activity. The regions of the polyprotein in which these activities lie are similar to proteins produced by the picornaviruses. Two different families of caliciviruses can be distinguished on the basis of sequence similarity, namely those classified as small round structured viruses (SRSVs) and those classed as non-SRSVs. Calicivirus proteases from the non-SRSV group, which are members of the PA protease clan, constitute family C24 of the cysteine proteases (proteases from SRSVs belong to the C37 family). As mentioned above, the protease activity resides within a polyprotein. The enzyme cleaves the polyprotein at sites N-terminal to itself, liberating the polyprotein helicase.; GO: 0004197 cysteine-type endopeptidase activity, 0006508 proteolysis
Probab=78.92  E-value=10  Score=30.69  Aligned_cols=52  Identities=23%  Similarity=0.346  Sum_probs=34.0

Q ss_pred             EEEEcCCCEEEecccccCCCCeEEEEecCCCeEeeEEEEECCCCCeEEEEEcCCCCCCcceecC
Q 016647          156 GFVWDSKGHVVTNYHVIRGASDIRVTFADQSAYDAKIVGFDQDKDVAVLRIDAPKDKLRPIPIG  219 (385)
Q Consensus       156 GfiI~~~G~ILT~aHvv~~~~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkv~~~~~~~~~l~l~  219 (385)
                      ++-|. +|.++|+.||.+..+.+.     |..+  +++.  ...|+++++.+..  .++..+++
T Consensus         3 avHIG-nG~~vt~tHva~~~~~v~-----g~~f--~~~~--~~ge~~~v~~~~~--~~p~~~ig   54 (105)
T PF03510_consen    3 AVHIG-NGRYVTVTHVAKSSDSVD-----GQPF--KIVK--TDGELCWVQSPLV--HLPAAQIG   54 (105)
T ss_pred             eEEeC-CCEEEEEEEEeccCceEc-----CcCc--EEEE--eccCEEEEECCCC--CCCeeEec
Confidence            55675 689999999998876542     2222  2323  3459999999763  35556664


No 76 
>PF01732 DUF31:  Putative peptidase (DUF31);  InterPro: IPR022382  This domain has no known function. It is found in various hypothetical proteins and putative lipoproteins from mycoplasmas. 
Probab=75.58  E-value=2.2  Score=42.53  Aligned_cols=24  Identities=25%  Similarity=0.504  Sum_probs=21.3

Q ss_pred             cccCCCCCCceeeCCCccEEEEee
Q 016647          272 AAINPGNSGGPLLDSSGSLIGINT  295 (385)
Q Consensus       272 ~~i~~G~SGGPlvn~~G~VVGI~s  295 (385)
                      ..+..|.||+.++|.+|++|||..
T Consensus       350 ~~l~gGaSGS~V~n~~~~lvGIy~  373 (374)
T PF01732_consen  350 YSLGGGASGSMVINQNNELVGIYF  373 (374)
T ss_pred             cCCCCCCCcCeEECCCCCEEEEeC
Confidence            356689999999999999999975


No 77 
>KOG3549 consensus Syntrophins (type gamma) [Extracellular structures]
Probab=75.58  E-value=9.1  Score=37.41  Aligned_cols=47  Identities=32%  Similarity=0.457  Sum_probs=35.6

Q ss_pred             ccccccccC-CCCCcEEEEECCEEeCC--HHHHHHHHhcCCCCCEEEEEEEE
Q 016647          319 STKRDAYGR-LILGDIITSVNGKKVSN--GSDLYRILDQCKVGDEVIVEVLR  367 (385)
Q Consensus       319 ~~~~a~~~g-l~~GDiI~~ing~~i~s--~~~l~~~l~~~~~g~~v~l~v~R  367 (385)
                      .+..++..| |=.||-|++|||..|+.  -+|+-.+|++  .||.|+++|..
T Consensus        89 kdQaAd~tG~LFvGDAilqvNGi~v~~c~HeevV~iLRN--AGdeVtlTV~~  138 (505)
T KOG3549|consen   89 KDQAADITGQLFVGDAILQVNGIYVTACPHEEVVNILRN--AGDEVTLTVKH  138 (505)
T ss_pred             hhhhhhhcCceEeeeeeEEeccEEeecCChHHHHHHHHh--cCCEEEEEeHh
Confidence            344444444 67999999999999996  4578888875  79999998853


No 78 
>KOG3542 consensus cAMP-regulated guanine nucleotide exchange factor [Signal transduction mechanisms]
Probab=74.54  E-value=2.5  Score=44.62  Aligned_cols=50  Identities=22%  Similarity=0.235  Sum_probs=34.9

Q ss_pred             cccccccccccCCCCCcEEEEECCEEeCCHHHHHHHHhcCCCCCEEEEEEE
Q 016647          316 GLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQCKVGDEVIVEVL  366 (385)
Q Consensus       316 ~v~~~~~a~~~gl~~GDiI~~ing~~i~s~~~l~~~l~~~~~g~~v~l~v~  366 (385)
                      +|.+++.++..|||.||.|+++||+...+... .++..-...+.-+.+++.
T Consensus       568 ~V~pgskAa~~GlKRgDqilEVNgQnfenis~-~KA~eiLrnnthLtltvK  617 (1283)
T KOG3542|consen  568 EVFPGSKAAREGLKRGDQILEVNGQNFENISA-KKAEEILRNNTHLTLTVK  617 (1283)
T ss_pred             eecCCchHHHhhhhhhhhhhhccccchhhhhH-HHHHHHhcCCceEEEEEe
Confidence            67788899999999999999999999887543 222222223344555543


No 79 
>KOG3834 consensus Golgi reassembly stacking protein GRASP65, contains PDZ domain [Intracellular trafficking, secretion, and vesicular transport]
Probab=71.63  E-value=11  Score=37.88  Aligned_cols=62  Identities=16%  Similarity=0.227  Sum_probs=47.7

Q ss_pred             cccccccccccCCC-CCcEEEEECCEEeCCHHHHHHHHhcCCCCCEEEEEEEECCEEEEEEEEe
Q 016647          316 GLLSTKRDAYGRLI-LGDIITSVNGKKVSNGSDLYRILDQCKVGDEVIVEVLRGDQKEKIPVKL  378 (385)
Q Consensus       316 ~v~~~~~a~~~gl~-~GDiI~~ing~~i~s~~~l~~~l~~~~~g~~v~l~v~R~g~~~~~~v~~  378 (385)
                      .|.+.++++.+||+ -+|.|+-+-+......+||..++..+ .++.+++-|+.-....--+|++
T Consensus       115 ~V~p~SPaalAgl~~~~DYivG~~~~~~~~~eDl~~lIesh-e~kpLklyVYN~D~d~~ReVti  177 (462)
T KOG3834|consen  115 SVEPNSPAALAGLRPYTDYIVGIWDAVMHEEEDLFTLIESH-EGKPLKLYVYNHDTDSCREVTI  177 (462)
T ss_pred             ecCCCCHHHhcccccccceEecchhhhccchHHHHHHHHhc-cCCCcceeEeecCCCccceEEe
Confidence            56788999999999 68999999555566778999988875 6899999998855544334433


No 80 
>PF02395 Peptidase_S6:  Immunoglobulin A1 protease Serine protease Prosite pattern;  InterPro: IPR000710 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. 
They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This group of serine peptidases belongs to the MEROPS peptidase family S6 (clan PA(S)). The type example is the IgA1-specific serine endopeptidase from Neisseria gonorrhoeae []. These cleave prolyl bonds in the hinge regions of immunoglobulin A heavy chains. Similar specificity is shown by the unrelated family of M26 metalloendopeptidases.; GO: 0004252 serine-type endopeptidase activity, 0006508 proteolysis; PDB: 3SZE_A 3H09_B 3SYJ_A 1WXR_A 3AK5_B.
Probab=69.91  E-value=15  Score=40.29  Aligned_cols=62  Identities=23%  Similarity=0.375  Sum_probs=35.4

Q ss_pred             eEEEEEEEcCCCEEEecccccCCCCeEEEEecC--CCeEeeEEEEEC--CCCCeEEEEEcCCCCCCcceec
Q 016647          152 GSGSGFVWDSKGHVVTNYHVIRGASDIRVTFAD--QSAYDAKIVGFD--QDKDVAVLRIDAPKDKLRPIPI  218 (385)
Q Consensus       152 ~~GSGfiI~~~G~ILT~aHvv~~~~~i~V~~~d--g~~~~a~vv~~d--~~~DlAlLkv~~~~~~~~~l~l  218 (385)
                      ..|...+|++. ||+|.+|+..+...  |.|.+  +..|  +++...  +..|+.+-|++.-.....|+..
T Consensus        65 ~~G~aTLigpq-YiVSV~HN~~gy~~--v~FG~~g~~~Y--~iV~RNn~~~~Df~~pRLnK~VTEvaP~~~  130 (769)
T PF02395_consen   65 NKGVATLIGPQ-YIVSVKHNGKGYNS--VSFGNEGQNTY--KIVDRNNYPSGDFHMPRLNKFVTEVAPAEM  130 (769)
T ss_dssp             TTSS-EEEETT-EEEBETTG-TSCCE--ECESCSSTCEE--EEEEEEBETTSTEBEEEESS---SS----B
T ss_pred             CCceEEEecCC-eEEEEEccCCCcCc--eeecccCCceE--EEEEccCCCCcccceeecCceEEEEecccc
Confidence            34789999986 99999999855444  45544  3334  444443  4469999999764333344443


No 81 
>PF00947 Pico_P2A:  Picornavirus core protein 2A;  InterPro: IPR000081 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. 
Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad [].  This domain defines cysteine peptidases belonging to MEROPS peptidase family C3 (picornain, clan PA(C)), subfamilies 3CA and 3CB. The protein fold of this peptidase domain for members of this family resembles that of the serine peptidase, chymotrypsin [], the type example for clan PA. Picornaviral proteins are expressed as a single polyprotein which is cleaved by the viral 3C cysteine protease []. The poliovirus polyprotein is selectively cleaved between the Gln-|-Gly bond. In other picornavirus reactions Glu may be substituted for Gln, and Ser or Thr for Gly. ; GO: 0008233 peptidase activity, 0006508 proteolysis, 0016032 viral reproduction; PDB: 2HRV_B 1Z8R_A.
Probab=66.70  E-value=8.5  Score=32.18  Aligned_cols=32  Identities=31%  Similarity=0.471  Sum_probs=24.1

Q ss_pred             ccEEEEccccCCCCCCceeeCCCccEEEEeecc
Q 016647          265 QDVIQTDAAINPGNSGGPLLDSSGSLIGINTAI  297 (385)
Q Consensus       265 ~~~i~~d~~i~~G~SGGPlvn~~G~VVGI~s~~  297 (385)
                      .+++....+..||+-||+|+- +--||||++++
T Consensus        78 ~~~l~g~Gp~~PGdCGg~L~C-~HGViGi~Tag  109 (127)
T PF00947_consen   78 YNLLIGEGPAEPGDCGGILRC-KHGVIGIVTAG  109 (127)
T ss_dssp             ECEEEEE-SSSTT-TCSEEEE-TTCEEEEEEEE
T ss_pred             cCceeecccCCCCCCCceeEe-CCCeEEEEEeC
Confidence            345666778899999999995 55699999986


No 82 
>KOG1892 consensus Actin filament-binding protein Afadin [Cytoskeleton]
Probab=65.59  E-value=7.9  Score=42.63  Aligned_cols=52  Identities=29%  Similarity=0.417  Sum_probs=36.9

Q ss_pred             cccccc-ccccCCCCCcEEEEECCEEeCCHHH--HHHHHhcCCCCCEEEEEEEECCE
Q 016647          317 LLSTKR-DAYGRLILGDIITSVNGKKVSNGSD--LYRILDQCKVGDEVIVEVLRGDQ  370 (385)
Q Consensus       317 v~~~~~-a~~~gl~~GDiI~~ing~~i~s~~~--l~~~l~~~~~g~~v~l~v~R~g~  370 (385)
                      |..+++ +..+.|+.||.++++||+..--..+  ..+++  .+.|..|.++|...|.
T Consensus       967 VV~GgaAd~DGRL~aGDQLLsVdG~SLiGisQErAA~lm--trtg~vV~leVaKqgA 1021 (1629)
T KOG1892|consen  967 VVEGGAADHDGRLEAGDQLLSVDGHSLIGISQERAARLM--TRTGNVVHLEVAKQGA 1021 (1629)
T ss_pred             eccCCccccccccccCceeeeecCcccccccHHHHHHHH--hccCCeEEEehhhhhh
Confidence            344554 4566799999999999998875443  33444  4579999999866553


No 83 
>KOG0609 consensus Calcium/calmodulin-dependent serine protein kinase/membrane-associated guanylate kinase [Signal transduction mechanisms]
Probab=61.65  E-value=9.7  Score=39.33  Aligned_cols=40  Identities=33%  Similarity=0.577  Sum_probs=32.2

Q ss_pred             cCCCCCcEEEEECCEEeCC--HHHHHHHHhcCCCCCEEEEEEEE
Q 016647          326 GRLILGDIITSVNGKKVSN--GSDLYRILDQCKVGDEVIVEVLR  367 (385)
Q Consensus       326 ~gl~~GDiI~~ing~~i~s--~~~l~~~l~~~~~g~~v~l~v~R  367 (385)
                      +-|+.||.|.+|||..+.+  ..++++++.+.+ | .++++|.-
T Consensus       163 glL~~GD~i~EvNGi~v~~~~~~e~q~~l~~~~-G-~itfkiiP  204 (542)
T KOG0609|consen  163 GLLHVGDEILEVNGISVANKSPEELQELLRNSR-G-SITFKIIP  204 (542)
T ss_pred             cceeeccchheecCeecccCCHHHHHHHHHhCC-C-cEEEEEcc
Confidence            3478999999999999986  579999999876 4 55776653


No 84 
>cd01720 Sm_D2 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation.  Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. Sm subunit D2 heterodimerizes with subunit D1 and three such heterodimers form a hexameric ring structure with alternating D1 and D2 subunits. The D1 - D2 heterodimer also assembles into a heptameric ring containing D2, D3, E, F, and G subunits. Sm-like proteins exist in archaea as well as prokaryotes which form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=60.18  E-value=18  Score=28.28  Aligned_cols=37  Identities=5%  Similarity=0.297  Sum_probs=30.8

Q ss_pred             ccCCCCeEEEEecCCCeEeeEEEEECCCCCeEEEEEc
Q 016647          171 VIRGASDIRVTFADQSAYDAKIVGFDQDKDVAVLRID  207 (385)
Q Consensus       171 vv~~~~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkv~  207 (385)
                      ++.....+.|.+.+++.+.+++.++|.+.++.+=...
T Consensus        10 ~~~~~~~V~V~lr~~r~~~G~L~~fD~hmNlvL~d~~   46 (87)
T cd01720          10 AVKNNTQVLINCRNNKKLLGRVKAFDRHCNMVLENVK   46 (87)
T ss_pred             HHcCCCEEEEEEcCCCEEEEEEEEecCccEEEEcceE
Confidence            3445568999999999999999999999998876553


No 85 
>cd01735 LSm12_N LSm12 belongs to a family of Sm-like proteins that associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation.  Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet that associates with other Sm proteins to form hexameric and heptameric ring structures.   In addition to the N-terminal Sm-like domain, LSm12 has a novel methyltransferase domain.
Probab=58.80  E-value=35  Score=24.82  Aligned_cols=33  Identities=15%  Similarity=0.303  Sum_probs=28.6

Q ss_pred             CeEEEEecCCCeEeeEEEEECCCCCeEEEEEcC
Q 016647          176 SDIRVTFADQSAYDAKIVGFDQDKDVAVLRIDA  208 (385)
Q Consensus       176 ~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkv~~  208 (385)
                      ..+.+....|..++++++.+|....+.+|+-..
T Consensus         7 s~V~~kTc~g~~ieGEV~afD~~tk~lIlk~~s   39 (61)
T cd01735           7 SQVSCRTCFEQRLQGEVVAFDYPSKMLILKCPS   39 (61)
T ss_pred             cEEEEEecCCceEEEEEEEecCCCcEEEEECcc
Confidence            457778888999999999999999999998654


No 86 
>cd00600 Sm_like The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation.  Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet.  Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=58.35  E-value=27  Score=24.80  Aligned_cols=33  Identities=18%  Similarity=0.415  Sum_probs=28.4

Q ss_pred             CeEEEEecCCCeEeeEEEEECCCCCeEEEEEcC
Q 016647          176 SDIRVTFADQSAYDAKIVGFDQDKDVAVLRIDA  208 (385)
Q Consensus       176 ~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkv~~  208 (385)
                      ..+.|.+.||+.+.+.+.++|...++.+-....
T Consensus         7 ~~V~V~l~~g~~~~G~L~~~D~~~Ni~L~~~~~   39 (63)
T cd00600           7 KTVRVELKDGRVLEGVLVAFDKYMNLVLDDVEE   39 (63)
T ss_pred             CEEEEEECCCcEEEEEEEEECCCCCEEECCEEE
Confidence            468899999999999999999998988766643


No 87 
>KOG3651 consensus Protein kinase C, alpha binding protein [Signal transduction mechanisms]
Probab=56.07  E-value=51  Score=31.80  Aligned_cols=49  Identities=24%  Similarity=0.355  Sum_probs=33.6

Q ss_pred             cccccccccc-cCCCCCcEEEEECCEEeCC--HHHHHHHHhcCCCCCEEEEEEE
Q 016647          316 GLLSTKRDAY-GRLILGDIITSVNGKKVSN--GSDLYRILDQCKVGDEVIVEVL  366 (385)
Q Consensus       316 ~v~~~~~a~~-~gl~~GDiI~~ing~~i~s--~~~l~~~l~~~~~g~~v~l~v~  366 (385)
                      .+-...|++. +.++.||.|++|||..|.-  .-++.+++...  -+.|++.|.
T Consensus        36 QvFD~tPAa~dG~i~~GDEi~avNg~svKGktKveVAkmIQ~~--~~eV~IhyN   87 (429)
T KOG3651|consen   36 QVFDKTPAAKDGRIRCGDEIVAVNGISVKGKTKVEVAKMIQVS--LNEVKIHYN   87 (429)
T ss_pred             EeccCCchhccCccccCCeeEEecceeecCccHHHHHHHHHHh--ccceEEEeh
Confidence            3445566654 5699999999999999985  44666666543  245677664


No 88 
>cd01731 archaeal_Sm1 The archaeal sm1 proteins: The Sm proteins are conserved in all three domains of life and are always associated with U-rich RNA sequences. They function to mediate RNA-RNA interactions and RNA biogenesis.  All Sm proteins contain a common sequence motif in two segments, Sm1 and Sm2, separated by a short variable linker. Eukaryotic Sm proteins form part of specific small nuclear ribonucleoproteins (snRNPs) that are involved in the processing of pre-mRNAs to mature mRNAs, and are a major component of the eukaryotic spliceosome. Most snRNPs consist of seven Sm proteins (B/B', D1, D2, D3, E, F and G) arranged in a ring on a uridine-rich sequence (Sm site), plus a small nuclear RNA (snRNA) (either U1, U2, U5 or U4/6). Since archaebacteria do not have any splicing apparatus, Sm proteins of archaebacteria may play a more general role. Archaeal Lsm proteins are likely to represent the ancestral Sm domain.
Probab=54.67  E-value=30  Score=25.37  Aligned_cols=33  Identities=9%  Similarity=0.245  Sum_probs=29.0

Q ss_pred             CeEEEEecCCCeEeeEEEEECCCCCeEEEEEcC
Q 016647          176 SDIRVTFADQSAYDAKIVGFDQDKDVAVLRIDA  208 (385)
Q Consensus       176 ~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkv~~  208 (385)
                      ..+.|.+.+|+.+.+++.++|...++.+-....
T Consensus        11 ~~V~V~l~~g~~~~G~L~~~D~~mNlvL~~~~e   43 (68)
T cd01731          11 KPVLVKLKGGKEVRGRLKSYDQHMNLVLEDAEE   43 (68)
T ss_pred             CEEEEEECCCCEEEEEEEEECCcceEEEeeEEE
Confidence            468999999999999999999999998877643


No 89 
>PRK00737 small nuclear ribonucleoprotein; Provisional
Probab=54.02  E-value=30  Score=25.74  Aligned_cols=32  Identities=13%  Similarity=0.330  Sum_probs=28.3

Q ss_pred             CeEEEEecCCCeEeeEEEEECCCCCeEEEEEc
Q 016647          176 SDIRVTFADQSAYDAKIVGFDQDKDVAVLRID  207 (385)
Q Consensus       176 ~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkv~  207 (385)
                      ..+.|.+.+|+.+.+++.++|...++.+=...
T Consensus        15 k~V~V~lk~g~~~~G~L~~~D~~mNlvL~d~~   46 (72)
T PRK00737         15 SPVLVRLKGGREFRGELQGYDIHMNLVLDNAE   46 (72)
T ss_pred             CEEEEEECCCCEEEEEEEEEcccceeEEeeEE
Confidence            46899999999999999999999999877764


No 90 
>cd01726 LSm6 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation.  Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm6 is one of at least seven subunits that assemble onto U6 snRNA to form a seven-membered ring structure.  Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=53.91  E-value=28  Score=25.45  Aligned_cols=32  Identities=13%  Similarity=0.233  Sum_probs=27.9

Q ss_pred             CeEEEEecCCCeEeeEEEEECCCCCeEEEEEc
Q 016647          176 SDIRVTFADQSAYDAKIVGFDQDKDVAVLRID  207 (385)
Q Consensus       176 ~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkv~  207 (385)
                      ..+.|.+.+|+.|.+++.++|...++.+=...
T Consensus        11 ~~V~V~Lk~g~~~~G~L~~~D~~mNlvL~~~~   42 (67)
T cd01726          11 RPVVVKLNSGVDYRGILACLDGYMNIALEQTE   42 (67)
T ss_pred             CeEEEEECCCCEEEEEEEEEccceeeEEeeEE
Confidence            46899999999999999999999998776653


No 91 
>cd01722 Sm_F The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation.  Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. Sm subunit F is capable of forming both homo- and hetero-heptamer ring structures.  To form the hetero-heptamer, Sm subunit F initially binds subunits E and G to form a trimer which then assembles onto snRNA along with the D3/B and D1/D2 heterodimers.
Probab=52.55  E-value=29  Score=25.51  Aligned_cols=32  Identities=13%  Similarity=0.217  Sum_probs=27.7

Q ss_pred             CeEEEEecCCCeEeeEEEEECCCCCeEEEEEc
Q 016647          176 SDIRVTFADQSAYDAKIVGFDQDKDVAVLRID  207 (385)
Q Consensus       176 ~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkv~  207 (385)
                      ..+.|.+.+|+.+.+++.++|...++.+=.+.
T Consensus        12 ~~V~V~Lk~g~~~~G~L~~~D~~mNi~L~~~~   43 (68)
T cd01722          12 KPVIVKLKWGMEYKGTLVSVDSYMNLQLANTE   43 (68)
T ss_pred             CEEEEEECCCcEEEEEEEEECCCEEEEEeeEE
Confidence            46889999999999999999999988775553


No 92 
>KOG0606 consensus Microtubule-associated serine/threonine kinase and related proteins [Signal transduction mechanisms; General function prediction only]
Probab=51.34  E-value=19  Score=40.44  Aligned_cols=48  Identities=25%  Similarity=0.341  Sum_probs=37.1

Q ss_pred             cccccccccccCCCCCcEEEEECCEEeCCHH--HHHHHHhcCCCCCEEEEEE
Q 016647          316 GLLSTKRDAYGRLILGDIITSVNGKKVSNGS--DLYRILDQCKVGDEVIVEV  365 (385)
Q Consensus       316 ~v~~~~~a~~~gl~~GDiI~~ing~~i~s~~--~l~~~l~~~~~g~~v~l~v  365 (385)
                      .|..++++..+|+++||.|+.+||+++....  ++.+++..  .|..+.+.+
T Consensus       664 sv~egsPA~~agls~~DlIthvnge~v~gl~H~ev~~Lll~--~gn~v~~~t  713 (1205)
T KOG0606|consen  664 SVEEGSPAFEAGLSAGDLITHVNGEPVHGLVHTEVMELLLK--SGNKVTLRT  713 (1205)
T ss_pred             eecCCCCccccCCCccceeEeccCcccchhhHHHHHHHHHh--cCCeeEEEe
Confidence            4567888999999999999999999998644  66666653  466666644


No 93 
>PF00571 CBS:  CBS domain CBS domain web page. Mutations in the CBS domain of Swiss:P35520 lead to homocystinuria.;  InterPro: IPR000644 CBS (cystathionine-beta-synthase) domains are small intracellular modules, mostly found in two or four copies within a protein, that occur in a variety of proteins in bacteria, archaea, and eukaryotes [, ]. Tandem pairs of CBS domains can act as binding domains for adenosine derivatives and may regulate the activity of attached enzymatic or other domains []. In some cases, CBS domains may act as sensors of cellular energy status by being activated by AMP and inhibited by ATP []. In chloride ion channels, the CBS domains have been implicated in intracellular targeting and trafficking, as well as in protein-protein interactions, but results vary with different channels: in the CLC-5 channel, the CBS domain was shown to be required for trafficking [], while in the CLC-1 channel, the CBS domain was shown to be critical for channel function, but not necessary for trafficking []. Recent experiments revealing that CBS domains can bind adenosine-containing ligands such as ATP, AMP, or S-adenosylmethionine have led to the hypothesis that CBS domains function as sensors of intracellular metabolites [, ]. Crystallographic studies of CBS domains have shown that pairs of CBS sequences form a globular domain where each CBS unit adopts a beta-alpha-beta-beta-alpha pattern []. Crystal structure of the CBS domains of the AMP-activated protein kinase in complexes with AMP and ATP shows that the phosphate groups of AMP/ATP lie in a surface pocket at the interface of two CBS domains, which is lined with basic residues, many of which are associated with disease-causing mutations [].  
In humans, mutations in conserved residues within CBS domains cause a variety of hereditary diseases, including (with the gene mutated in parentheses): homocystinuria (cystathionine beta-synthase); Wolff-Parkinson-White syndrome (gamma 2 subunit of AMP-activated protein kinase); retinitis pigmentosa (IMP dehydrogenase-1); congenital myotonia, idiopathic generalized epilepsy, hypercalciuric nephrolithiasis, and classic Bartter syndrome (CLC chloride channel family members).; GO: 0005515 protein binding; PDB: 3JTF_A 3TE5_C 3TDH_C 3T4N_C 2QLV_C 3OI8_A 3LV9_A 2QH1_B 1PVM_B 3LQN_A ....
Probab=51.29  E-value=15  Score=25.15  Aligned_cols=21  Identities=38%  Similarity=0.616  Sum_probs=17.8

Q ss_pred             CCCCCceeeCCCccEEEEeec
Q 016647          276 PGNSGGPLLDSSGSLIGINTA  296 (385)
Q Consensus       276 ~G~SGGPlvn~~G~VVGI~s~  296 (385)
                      .+.+.-|++|.+|+++|+++.
T Consensus        28 ~~~~~~~V~d~~~~~~G~is~   48 (57)
T PF00571_consen   28 NGISRLPVVDEDGKLVGIISR   48 (57)
T ss_dssp             HTSSEEEEESTTSBEEEEEEH
T ss_pred             cCCcEEEEEecCCEEEEEEEH
Confidence            356778999999999999874


No 94 
>cd01730 LSm3 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation.  Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm3 is one of at least seven subunits that assemble onto U6 snRNA to form a seven-membered ring structure.  Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=50.63  E-value=28  Score=26.65  Aligned_cols=31  Identities=10%  Similarity=0.271  Sum_probs=26.8

Q ss_pred             CeEEEEecCCCeEeeEEEEECCCCCeEEEEE
Q 016647          176 SDIRVTFADQSAYDAKIVGFDQDKDVAVLRI  206 (385)
Q Consensus       176 ~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkv  206 (385)
                      ..+.|.+.+|+.+.+++.++|...+|.+=..
T Consensus        12 k~V~V~l~~gr~~~G~L~~fD~~mNlvL~d~   42 (82)
T cd01730          12 ERVYVKLRGDRELRGRLHAYDQHLNMILGDV   42 (82)
T ss_pred             CEEEEEECCCCEEEEEEEEEccceEEeccce
Confidence            4689999999999999999999998876443


No 95 
>KOG3834 consensus Golgi reassembly stacking protein GRASP65, contains PDZ domain [Intracellular trafficking, secretion, and vesicular transport]
Probab=50.47  E-value=25  Score=35.50  Aligned_cols=60  Identities=18%  Similarity=0.222  Sum_probs=42.5

Q ss_pred             cccccccccccCCCC-CcEEEEECCEEeCCHHH-HHHHHhcCCCCCEEEEEEEECC--EEEEEEEE
Q 016647          316 GLLSTKRDAYGRLIL-GDIITSVNGKKVSNGSD-LYRILDQCKVGDEVIVEVLRGD--QKEKIPVK  377 (385)
Q Consensus       316 ~v~~~~~a~~~gl~~-GDiI~~ing~~i~s~~~-l~~~l~~~~~g~~v~l~v~R~g--~~~~~~v~  377 (385)
                      .+..++++.++||.+ =|-|++|||..++...| |++.|+....  .|+++++.-.  ..+.++|+
T Consensus        21 kVqedSpa~~aglepffdFIvSI~g~rL~~dnd~Lk~llk~~se--kVkltv~n~kt~~~R~v~I~   84 (462)
T KOG3834|consen   21 KVQEDSPAHKAGLEPFFDFIVSINGIRLNKDNDTLKALLKANSE--KVKLTVYNSKTQEVRIVEIV   84 (462)
T ss_pred             EeecCChHHhcCcchhhhhhheeCcccccCchHHHHHHHHhccc--ceEEEEEecccceeEEEEec
Confidence            567888999999876 58999999999996554 6666665433  3899887643  23344444


No 96 
>COG0298 HypC Hydrogenase maturation factor [Posttranslational modification, protein turnover, chaperones]
Probab=50.23  E-value=37  Score=26.05  Aligned_cols=46  Identities=22%  Similarity=0.456  Sum_probs=31.6

Q ss_pred             EeeEEEEECCCCCeEEEEEcCCCCCCcceecCCC-CCCCCCCEEEE-EeCC
Q 016647          188 YDAKIVGFDQDKDVAVLRIDAPKDKLRPIPIGVS-ADLLVGQKVYA-IGNP  236 (385)
Q Consensus       188 ~~a~vv~~d~~~DlAlLkv~~~~~~~~~l~l~~~-~~~~~G~~V~~-iG~p  236 (385)
                      ++++++..|...++|++.+-.-.   +.+.+.-- ..++.|+.|.+ +||.
T Consensus         5 iPgqI~~I~~~~~~A~Vd~gGvk---reV~l~Lv~~~v~~GdyVLVHvGfA   52 (82)
T COG0298           5 IPGQIVEIDDNNHLAIVDVGGVK---REVNLDLVGEEVKVGDYVLVHVGFA   52 (82)
T ss_pred             cccEEEEEeCCCceEEEEeccEe---EEEEeeeecCccccCCEEEEEeeEE
Confidence            57899999988889999996532   22222211 26889999877 6764


No 97 
>PF05416 Peptidase_C37:  Southampton virus-type processing peptidase;  InterPro: IPR001665 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. 
These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad [].  This group of cysteine peptidases belongs to the MEROPS peptidase family C37 (clan PA(C)). The type example is calicivirin from Southampton virus, an endopeptidase that cleaves the polyprotein at sites N-terminal to itself, liberating the polyprotein helicase. Southampton virus is a positive-stranded ssRNA virus belonging to the Caliciviruses, which are viruses that cause gastroenteritis. The calicivirus genome contains two open reading frames, ORF1 and ORF2. ORF1 encodes a non-structural polypeptide, which has RNA helicase, cysteine protease and RNA polymerase activity []. The regions of the polyprotein in which these activities lie are similar to proteins produced by the picornaviruses []. ORF2 encodes a structural, capsid protein. Two different families of caliciviruses can be distinguished on the basis of sequence similarity, namely the Norwalk-like viruses or small round structured viruses (SRSVs), and those classed as non-SRSVs.; GO: 0004197 cysteine-type endopeptidase activity, 0006508 proteolysis; PDB: 2FYQ_A 2FYR_A 1WQS_D 4ASH_A 2IPH_B.
Probab=49.48  E-value=41  Score=34.01  Aligned_cols=135  Identities=20%  Similarity=0.299  Sum_probs=62.9

Q ss_pred             eEEEEEEEcCCCEEEecccccCCCC-eEEEEecCCCeEeeEEEEECCCCCeEEEEEcCCC-CCCcceecCCCCCCCCCCE
Q 016647          152 GSGSGFVWDSKGHVVTNYHVIRGAS-DIRVTFADQSAYDAKIVGFDQDKDVAVLRIDAPK-DKLRPIPIGVSADLLVGQK  229 (385)
Q Consensus       152 ~~GSGfiI~~~G~ILT~aHvv~~~~-~i~V~~~dg~~~~a~vv~~d~~~DlAlLkv~~~~-~~~~~l~l~~~~~~~~G~~  229 (385)
                      +.|=||-|+++ .++|+-||+.... ++   |  |  .+..-+.++..-++.-+++..+. .+++-+.|..  -...|.-
T Consensus       379 GsGWGfWVS~~-lfITttHViP~g~~E~---F--G--v~i~~i~vh~sGeF~~~rFpk~iRPDvtgmiLEe--GapEGtV  448 (535)
T PF05416_consen  379 GSGWGFWVSPT-LFITTTHVIPPGAKEA---F--G--VPISQIQVHKSGEFCRFRFPKPIRPDVTGMILEE--GAPEGTV  448 (535)
T ss_dssp             TTEEEEESSSS-EEEEEGGGS-STTSEE---T--T--EECGGEEEEEETTEEEEEESS-SSTTS---EE-S--S--TT-E
T ss_pred             CCceeeeecce-EEEEeeeecCCcchhh---h--C--CChhHeEEeeccceEEEecCCCCCCCccceeecc--CCCCceE
Confidence            46889999998 9999999997432 21   0  0  01111233444577777776543 2455555532  2334554


Q ss_pred             EEE-EeCCCCCC--CceEEeEEeeeeeeeccCCCCCCcccEEEE-------ccccCCCCCCceeeCCCc---cEEEEeec
Q 016647          230 VYA-IGNPFGLD--HTLTTGVISGLRREISSAATGRPIQDVIQT-------DAAINPGNSGGPLLDSSG---SLIGINTA  296 (385)
Q Consensus       230 V~~-iG~p~g~~--~~~~~G~Vs~~~~~~~~~~~~~~~~~~i~~-------d~~i~~G~SGGPlvn~~G---~VVGI~s~  296 (385)
                      +.+ |-.+.|.-  ..+.-|....+.-.-..  . .....++.+       |-...||+-|.|-+-..|   -|+|++.+
T Consensus       449 ~siLiKR~sGEllpLAvRMgt~AsmkIqgr~--v-~GQ~GMLLTGaNAK~mDLGT~PGDCGcPYvyKrgNd~VV~GVH~A  525 (535)
T PF05416_consen  449 CSILIKRPSGELLPLAVRMGTHASMKIQGRT--V-HGQMGMLLTGANAKGMDLGTIPGDCGCPYVYKRGNDWVVIGVHAA  525 (535)
T ss_dssp             EEEEEE-TTSBEEEEEEEEEEEEEEEETTEE--E-EEEEEEETTSTT-SSTTTS--TTGTT-EEEEEETTEEEEEEEEEE
T ss_pred             EEEEEEcCCccchhhhhhhccceeEEEccee--e-cceeeeeeecCCccccccCCCCCCCCCceeeecCCcEEEEEEEeh
Confidence            433 44554432  12333333322111000  0 000122222       334569999999996555   48999987


Q ss_pred             ccC
Q 016647          297 IYS  299 (385)
Q Consensus       297 ~~~  299 (385)
                      ...
T Consensus       526 Atr  528 (535)
T PF05416_consen  526 ATR  528 (535)
T ss_dssp             E-S
T ss_pred             hcc
Confidence            643


No 98 
>cd01717 Sm_B The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation.  Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. Sm subunit B heterodimerizes with subunit D3 and three such heterodimers form a hexameric ring structure with alternating B and D3 subunits.  The D3 - B heterodimer also assembles into a heptameric ring containing D1, D2, E, F, and G subunits.  Sm-like proteins exist in archaea as well as prokaryotes which form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=49.16  E-value=36  Score=25.81  Aligned_cols=31  Identities=19%  Similarity=0.414  Sum_probs=27.1

Q ss_pred             CeEEEEecCCCeEeeEEEEECCCCCeEEEEE
Q 016647          176 SDIRVTFADQSAYDAKIVGFDQDKDVAVLRI  206 (385)
Q Consensus       176 ~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkv  206 (385)
                      ..+.|.+.||+.+.+.+.++|...+|.|=..
T Consensus        11 ~~V~V~l~dgR~~~G~L~~~D~~~NlVL~~~   41 (79)
T cd01717          11 YRLRVTLQDGRQFVGQFLAFDKHMNLVLSDC   41 (79)
T ss_pred             CEEEEEECCCcEEEEEEEEEcCccCEEcCCE
Confidence            4688999999999999999999999876554


No 99 
>KOG2921 consensus Intramembrane metalloprotease (sterol-regulatory element-binding protein (SREBP) protease) [Posttranslational modification, protein turnover, chaperones]
Probab=48.82  E-value=16  Score=36.43  Aligned_cols=29  Identities=38%  Similarity=0.486  Sum_probs=25.2

Q ss_pred             cCCCCCcEEEEECCEEeCCHHHHHHHHhc
Q 016647          326 GRLILGDIITSVNGKKVSNGSDLYRILDQ  354 (385)
Q Consensus       326 ~gl~~GDiI~~ing~~i~s~~~l~~~l~~  354 (385)
                      -||.+||+|+++||-+|.+.+|..+-++.
T Consensus       237 rGL~vgdvitsldgcpV~~v~dW~ecl~t  265 (484)
T KOG2921|consen  237 RGLSVGDVITSLDGCPVHKVSDWLECLAT  265 (484)
T ss_pred             ccCCccceEEecCCcccCCHHHHHHHHHh
Confidence            38999999999999999999997776653


No 100
>cd06168 LSm9 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation.  Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm9 proteins have a single Sm-like domain structure.  Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=48.07  E-value=45  Score=25.21  Aligned_cols=31  Identities=13%  Similarity=0.213  Sum_probs=26.9

Q ss_pred             CeEEEEecCCCeEeeEEEEECCCCCeEEEEE
Q 016647          176 SDIRVTFADQSAYDAKIVGFDQDKDVAVLRI  206 (385)
Q Consensus       176 ~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkv  206 (385)
                      ..+.|.+.||+.+.+...++|...+|.+=..
T Consensus        11 ~~v~V~l~dgR~~~G~l~~~D~~~NivL~~~   41 (75)
T cd06168          11 RTMRIHMTDGRTLVGVFLCTDRDCNIILGSA   41 (75)
T ss_pred             CeEEEEEcCCeEEEEEEEEEcCCCcEEecCc
Confidence            4689999999999999999999999876544


No 101
>cd01729 LSm7 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation.  Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm7 is one of at least seven subunits that assemble onto U6 snRNA to form a seven-membered ring structure.  Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=47.00  E-value=43  Score=25.62  Aligned_cols=31  Identities=23%  Similarity=0.334  Sum_probs=26.7

Q ss_pred             CeEEEEecCCCeEeeEEEEECCCCCeEEEEE
Q 016647          176 SDIRVTFADQSAYDAKIVGFDQDKDVAVLRI  206 (385)
Q Consensus       176 ~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkv  206 (385)
                      ..+.|.+.||+.+.+++.++|...+|.+=..
T Consensus        13 k~V~V~l~~gr~~~G~L~~~D~~mNlvL~~~   43 (81)
T cd01729          13 KKIRVKFQGGREVTGILKGYDQLLNLVLDDT   43 (81)
T ss_pred             CeEEEEECCCcEEEEEEEEEcCcccEEecCE
Confidence            4688999999999999999999998876544


No 102
>cd01732 LSm5 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation.  Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm5 is one of at least seven subunits that assemble onto U6 snRNA to form a seven-membered ring structure.  Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=46.91  E-value=39  Score=25.60  Aligned_cols=31  Identities=16%  Similarity=0.409  Sum_probs=27.1

Q ss_pred             CeEEEEecCCCeEeeEEEEECCCCCeEEEEE
Q 016647          176 SDIRVTFADQSAYDAKIVGFDQDKDVAVLRI  206 (385)
Q Consensus       176 ~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkv  206 (385)
                      ..+.|.+.+|+.+.+++.++|...++.+=..
T Consensus        14 ~~V~V~l~~gr~~~G~L~g~D~~mNlvL~da   44 (76)
T cd01732          14 SRIWIVMKSDKEFVGTLLGFDDYVNMVLEDV   44 (76)
T ss_pred             CEEEEEECCCeEEEEEEEEeccceEEEEccE
Confidence            5789999999999999999999998876544


No 103
>KOG3938 consensus RGS-GAIP interacting protein GIPC, contains PDZ domain [Signal transduction mechanisms; Intracellular trafficking, secretion, and vesicular transport]
Probab=46.08  E-value=11  Score=35.47  Aligned_cols=41  Identities=20%  Similarity=0.493  Sum_probs=35.3

Q ss_pred             CCCCCcEEEEECCEEeCCHH--HHHHHHhcCCCCCEEEEEEEE
Q 016647          327 RLILGDIITSVNGKKVSNGS--DLYRILDQCKVGDEVIVEVLR  367 (385)
Q Consensus       327 gl~~GDiI~~ing~~i~s~~--~l~~~l~~~~~g~~v~l~v~R  367 (385)
                      -++.||.|-+|||+.|--..  ++.++|.+.+.|++.+|.+..
T Consensus       167 ~i~VGd~IEaiNge~ivG~RHYeVArmLKel~rge~ftlrLie  209 (334)
T KOG3938|consen  167 AICVGDHIEAINGESIVGKRHYEVARMLKELPRGETFTLRLIE  209 (334)
T ss_pred             heeHHhHHHhhcCccccchhHHHHHHHHHhcccCCeeEEEeec
Confidence            47899999999999998765  677899999999999997764


No 104
>TIGR03000 plancto_dom_1 Planctomycetes uncharacterized domain TIGR03000. Domains described by this model are found, so far, only in the Planctomycetes (Pirellula sp. strain 1 and Gemmata obscuriglobus), in up to six proteins per genome, and may be duplicated within a protein. The function is unknown.
Probab=44.81  E-value=36  Score=25.83  Aligned_cols=49  Identities=18%  Similarity=0.182  Sum_probs=34.8

Q ss_pred             CCcEEEEECCEEeCCHHHHHHHHh-cCCCCCE----EEEEEEECCEEEEEEEEe
Q 016647          330 LGDIITSVNGKKVSNGSDLYRILD-QCKVGDE----VIVEVLRGDQKEKIPVKL  378 (385)
Q Consensus       330 ~GDiI~~ing~~i~s~~~l~~~l~-~~~~g~~----v~l~v~R~g~~~~~~v~~  378 (385)
                      |-|-.+.+||++.++......... .+++|..    |+.++.|||+....+-++
T Consensus        10 PadAkl~v~G~~t~~~G~~R~F~T~~L~~G~~y~Y~v~a~~~~dG~~~t~~~~V   63 (75)
T TIGR03000        10 PADAKLKVDGKETNGTGTVRTFTTPPLEAGKEYEYTVTAEYDRDGRILTRTRTV   63 (75)
T ss_pred             CCCCEEEECCeEcccCccEEEEECCCCCCCCEEEEEEEEEEecCCcEEEEEEEE
Confidence            468889999999998877555443 3566664    677888999876554443


No 105
>cd01719 Sm_G The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation.  Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet.  Sm subunit G binds subunits E and F to form a trimer which then assembles onto snRNA along with the D1/D2 and D3/B heterodimers forming a seven-membered ring structure.  Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=44.31  E-value=53  Score=24.52  Aligned_cols=31  Identities=10%  Similarity=0.202  Sum_probs=26.9

Q ss_pred             CeEEEEecCCCeEeeEEEEECCCCCeEEEEE
Q 016647          176 SDIRVTFADQSAYDAKIVGFDQDKDVAVLRI  206 (385)
Q Consensus       176 ~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkv  206 (385)
                      ..+.|.+.+|+.+.+++.++|...+|.+=..
T Consensus        11 k~V~V~L~~g~~~~G~L~~~D~~mNlvL~~~   41 (72)
T cd01719          11 KKLSLKLNGNRKVSGILRGFDPFMNLVLDDA   41 (72)
T ss_pred             CeEEEEECCCeEEEEEEEEEcccccEEeccE
Confidence            4688999999999999999999888877554


No 106
>PF04225 OapA:  Opacity-associated protein A LysM-like domain;  InterPro: IPR007340 This entry includes the Haemophilus influenzae opacity-associated protein. This protein is required for efficient nasopharyngeal mucosal colonization, and its expression is associated with a distinctive transparent colony phenotype. OapA is thought to be a secreted protein, and its expression exhibits high-frequency phase variation [].; PDB: 2GU1_A.
Probab=43.43  E-value=14  Score=28.67  Aligned_cols=53  Identities=17%  Similarity=0.246  Sum_probs=26.4

Q ss_pred             CCCCcEEEEE---CCEEeCCHHHHHH------HHhcCCCCCEEEEEEEECCEEEEEEEEeec
Q 016647          328 LILGDIITSV---NGKKVSNGSDLYR------ILDQCKVGDEVIVEVLRGDQKEKIPVKLEP  380 (385)
Q Consensus       328 l~~GDiI~~i---ng~~i~s~~~l~~------~l~~~~~g~~v~l~v~R~g~~~~~~v~~~~  380 (385)
                      ++.||-+-.|   .|-+..++..+.+      .|...+||+++.+.+-.+|+...+.+...+
T Consensus         7 V~~GDtLs~iF~~~gls~~dl~~v~~~~~~~k~L~~L~pGq~l~f~~d~~g~L~~L~~~~~~   68 (85)
T PF04225_consen    7 VKSGDTLSTIFRRAGLSASDLYAVLEADGEAKPLTRLKPGQTLEFQLDEDGQLTALRYERSP   68 (85)
T ss_dssp             --TT--HHHHHHHTT--HHHHHHHHHHGGGT--GGG--TT-EEEEEE-TTS-EEEEEEEEET
T ss_pred             ECCCCcHHHHHHHcCCCHHHHHHHHhccCccchHhhCCCCCEEEEEECCCCCEEEEEEEcCC
Confidence            4556654444   4555444444333      556678999999999888988777766543


No 107
>cd01728 LSm1 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation.  Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm1 is one of at least seven subunits that assemble onto U6 snRNA to form a seven-membered ring structure.  Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=43.17  E-value=54  Score=24.69  Aligned_cols=31  Identities=16%  Similarity=0.185  Sum_probs=27.0

Q ss_pred             CeEEEEecCCCeEeeEEEEECCCCCeEEEEE
Q 016647          176 SDIRVTFADQSAYDAKIVGFDQDKDVAVLRI  206 (385)
Q Consensus       176 ~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkv  206 (385)
                      ..+.|.+.||+.+.+.+.++|+..++.+=..
T Consensus        13 k~v~V~l~~gr~~~G~L~~fD~~~NlvL~d~   43 (74)
T cd01728          13 KKVVVLLRDGRKLIGILRSFDQFANLVLQDT   43 (74)
T ss_pred             CEEEEEEcCCeEEEEEEEEECCcccEEecce
Confidence            4688999999999999999999988877554


No 108
>smart00651 Sm snRNP Sm proteins. small nuclear ribonucleoprotein particles (snRNPs) involved in pre-mRNA splicing
Probab=43.10  E-value=57  Score=23.43  Aligned_cols=32  Identities=19%  Similarity=0.413  Sum_probs=27.5

Q ss_pred             CeEEEEecCCCeEeeEEEEECCCCCeEEEEEc
Q 016647          176 SDIRVTFADQSAYDAKIVGFDQDKDVAVLRID  207 (385)
Q Consensus       176 ~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkv~  207 (385)
                      ..+.|.+.||+.+.+.+.++|...++-+=...
T Consensus         9 ~~V~V~l~~g~~~~G~L~~~D~~~NlvL~~~~   40 (67)
T smart00651        9 KRVLVELKNGREYRGTLKGFDQFMNLVLEDVE   40 (67)
T ss_pred             cEEEEEECCCcEEEEEEEEECccccEEEccEE
Confidence            36889999999999999999999988776554


No 109
>cd01721 Sm_D3 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation.  Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. Sm subunit D3 heterodimerizes with subunit B and three such heterodimers form a hexameric ring structure with alternating B and D3 subunits. The D3 - B heterodimer also assembles into a heptameric ring containing D1, D2, E, F, and G subunits. Sm-like proteins exist in archaea as well as prokaryotes which form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=42.29  E-value=59  Score=24.06  Aligned_cols=32  Identities=9%  Similarity=0.272  Sum_probs=28.8

Q ss_pred             CeEEEEecCCCeEeeEEEEECCCCCeEEEEEc
Q 016647          176 SDIRVTFADQSAYDAKIVGFDQDKDVAVLRID  207 (385)
Q Consensus       176 ~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkv~  207 (385)
                      ..+.|.+.+|..|.+++..+|...++.+-.+.
T Consensus        11 ~~V~VeLk~g~~~~G~L~~~D~~MNl~L~~~~   42 (70)
T cd01721          11 HIVTVELKTGEVYRGKLIEAEDNMNCQLKDVT   42 (70)
T ss_pred             CEEEEEECCCcEEEEEEEEEcCCceeEEEEEE
Confidence            46889999999999999999999999888774


No 110
>cd01727 LSm8 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation.  Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm8 is one of at least seven subunits that assemble onto U6 snRNA to form a seven-membered ring structure.  Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=40.51  E-value=60  Score=24.24  Aligned_cols=31  Identities=19%  Similarity=0.272  Sum_probs=27.1

Q ss_pred             CeEEEEecCCCeEeeEEEEECCCCCeEEEEE
Q 016647          176 SDIRVTFADQSAYDAKIVGFDQDKDVAVLRI  206 (385)
Q Consensus       176 ~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkv  206 (385)
                      .++.|.+.||+.+.++..++|...++.+=..
T Consensus        10 ~~V~V~l~dgr~~~G~L~~~D~~~NlvL~~~   40 (74)
T cd01727          10 KTVSVITVDGRVIVGTLKGFDQATNLILDDS   40 (74)
T ss_pred             CEEEEEECCCcEEEEEEEEEccccCEEccce
Confidence            4688999999999999999999988877665


No 111
>COG1958 LSM1 Small nuclear ribonucleoprotein (snRNP) homolog [Transcription]
Probab=40.14  E-value=55  Score=24.70  Aligned_cols=33  Identities=21%  Similarity=0.459  Sum_probs=28.5

Q ss_pred             CeEEEEecCCCeEeeEEEEECCCCCeEEEEEcC
Q 016647          176 SDIRVTFADQSAYDAKIVGFDQDKDVAVLRIDA  208 (385)
Q Consensus       176 ~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkv~~  208 (385)
                      ..+.|.+.+|+.+.+++.++|...++.+--+..
T Consensus        18 ~~V~V~lk~g~~~~G~L~~~D~~mNlvL~d~~e   50 (79)
T COG1958          18 KRVLVKLKNGREYRGTLVGFDQYMNLVLDDVEE   50 (79)
T ss_pred             CEEEEEECCCCEEEEEEEEEccceeEEEeceEE
Confidence            578999999999999999999999887765543


No 112
>KOG3551 consensus Syntrophins (type beta) [Extracellular structures]
Probab=39.89  E-value=26  Score=34.85  Aligned_cols=42  Identities=36%  Similarity=0.649  Sum_probs=29.4

Q ss_pred             cCCCCCcEEEEECCEEeCCH--HHHHHHHhcCCCCCEEEE--EEEECC
Q 016647          326 GRLILGDIITSVNGKKVSNG--SDLYRILDQCKVGDEVIV--EVLRGD  369 (385)
Q Consensus       326 ~gl~~GDiI~~ing~~i~s~--~~l~~~l~~~~~g~~v~l--~v~R~g  369 (385)
                      ..|..||.|+++||+...+.  ++...+|+  +.|+.|.+  +++|+-
T Consensus       127 ~aL~~gDaIlSVNG~dL~~AtHdeAVqaLK--raGkeV~levKy~REv  172 (506)
T KOG3551|consen  127 GALFLGDAILSVNGEDLRDATHDEAVQALK--RAGKEVLLEVKYMREV  172 (506)
T ss_pred             cceeeccEEEEecchhhhhcchHHHHHHHH--hhCceeeeeeeeehhc
Confidence            45889999999999999864  34445554  46777655  556654


No 113
>PF01423 LSM:  LSM domain ;  InterPro: IPR001163 This family is found in Lsm (like-Sm) proteins and in bacterial Lsm-related Hfq proteins. In each case, the domain adopts a core structure consisting of an open beta-barrel with an SH3-like topology. Lsm (like-Sm) proteins have diverse functions, and are thought to be important modulators of RNA biogenesis and function [, ]. The Sm proteins form part of specific small nuclear ribonucleoproteins (snRNPs) that are involved in the processing of pre-mRNAs to mature mRNAs, and are a major component of the eukaryotic spliceosome. Most snRNPs consist of seven Sm proteins (B/B', D1, D2, D3, E, F and G) arranged in a ring on a uridine-rich sequence (Sm site), plus a small nuclear RNA (snRNA) (either U1, U2, U5 or U4/6) []. All Sm proteins contain a common sequence motif in two segments, Sm1 and Sm2, separated by a short variable linker []. In other snRNPs, certain Sm proteins are replaced with different Lsm proteins, such as with U7 snRNPs, in which the D1 and D2 Sm proteins are replaced with U7-specific Lsm10 and Lsm11 proteins, where Lsm11 plays a role in histone U7-specific RNA processing []. Lsm proteins are also found in archaebacteria, which do not have any splicing apparatus suggesting a more general role for Lsm proteins. The pleiotropic translational regulator Hfq (host factor Q) is a bacterial Lsm-like protein, which modulates the structure of numerous RNA molecules by binding preferentially to A/U-rich sequences in RNA []. Hfq forms an Lsm-like fold, however, unlike the heptameric Sm proteins, Hfq forms a homo-hexameric ring.; PDB: 1D3B_K 2Y9D_D 2Y9A_D 2Y9C_R 3VRI_C 2Y9B_K 3QUI_D 3M4G_H 3INZ_E 1U1S_C ....
Probab=37.83  E-value=66  Score=23.13  Aligned_cols=33  Identities=21%  Similarity=0.444  Sum_probs=29.1

Q ss_pred             CeEEEEecCCCeEeeEEEEECCCCCeEEEEEcC
Q 016647          176 SDIRVTFADQSAYDAKIVGFDQDKDVAVLRIDA  208 (385)
Q Consensus       176 ~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkv~~  208 (385)
                      ..+.|.+.+|+.+.+.+..+|...++.+-....
T Consensus         9 ~~V~V~l~~g~~~~G~L~~~D~~~Nl~L~~~~~   41 (67)
T PF01423_consen    9 KRVRVELKNGRTYRGTLVSFDQFMNLVLSDVTE   41 (67)
T ss_dssp             SEEEEEETTSEEEEEEEEEEETTEEEEEEEEEE
T ss_pred             cEEEEEEeCCEEEEEEEEEeechheEEeeeEEE
Confidence            468999999999999999999999998877754


No 114
>PF02601 Exonuc_VII_L:  Exonuclease VII, large subunit;  InterPro: IPR020579 Exonuclease VII (EC 3.1.11.6) is composed of two nonidentical subunits: one large subunit and four small ones []. Exonuclease VII catalyses exonucleolytic cleavage in either 5'-3' or 3'-5' direction to yield 5'-phosphomononucleotides. The large subunit also contains the OB-fold domains (IPR004365 from INTERPRO) that bind to nucleic acids at the N terminus.  This entry represents Exonuclease VII, large subunit, C-terminal. ; GO: 0008855 exodeoxyribonuclease VII activity
Probab=35.89  E-value=43  Score=32.40  Aligned_cols=34  Identities=29%  Similarity=0.524  Sum_probs=30.5

Q ss_pred             EEEEEEEcCCCEEEecccccCCCCeEEEEecCCC
Q 016647          153 SGSGFVWDSKGHVVTNYHVIRGASDIRVTFADQS  186 (385)
Q Consensus       153 ~GSGfiI~~~G~ILT~aHvv~~~~~i~V~~~dg~  186 (385)
                      .|-.++.+++|.+||+..-+...+.+++.+.||.
T Consensus       281 RGYaiv~~~~g~vI~s~~~l~~gd~i~i~l~DG~  314 (319)
T PF02601_consen  281 RGYAIVRDKDGKVITSVKQLKPGDEIEIRLADGS  314 (319)
T ss_pred             CceEEEECCCCCEECCHHHCCCCCEEEEEEcceE
Confidence            5667888888999999999999999999999995


No 115
>KOG3606 consensus Cell polarity protein PAR6 [Signal transduction mechanisms]
Probab=35.18  E-value=93  Score=29.61  Aligned_cols=40  Identities=18%  Similarity=0.184  Sum_probs=29.7

Q ss_pred             ccccccccccC-CCCCcEEEEECCEEeC--CHHHHHHHHhcCC
Q 016647          317 LLSTKRDAYGR-LILGDIITSVNGKKVS--NGSDLYRILDQCK  356 (385)
Q Consensus       317 v~~~~~a~~~g-l~~GDiI~~ing~~i~--s~~~l~~~l~~~~  356 (385)
                      +.+++-++..| |..+|.|+++||.+|.  +.+++.++|-...
T Consensus       201 lVpGGLAeSTGLLaVnDEVlEVNGIEVaGKTLDQVTDMMvANs  243 (358)
T KOG3606|consen  201 LVPGGLAESTGLLAVNDEVLEVNGIEVAGKTLDQVTDMMVANS  243 (358)
T ss_pred             ecCCccccccceeeecceeEEEcCEEeccccHHHHHHHHhhcc
Confidence            34566666666 4689999999999997  6778877765543


No 116
>KOG3571 consensus Dishevelled 3 and related proteins [General function prediction only]
Probab=34.98  E-value=38  Score=34.92  Aligned_cols=43  Identities=28%  Similarity=0.362  Sum_probs=29.1

Q ss_pred             cccccCCCCCcEEEEECCEEeCCHH--H----HHHHHhcCCCCCEEEEEEEE
Q 016647          322 RDAYGRLILGDIITSVNGKKVSNGS--D----LYRILDQCKVGDEVIVEVLR  367 (385)
Q Consensus       322 ~a~~~gl~~GDiI~~ing~~i~s~~--~----l~~~l~~~~~g~~v~l~v~R  367 (385)
                      .+..+.|.+||.|+.||.....++.  |    |.+++.  ++|- ++++|-.
T Consensus       290 VA~DGRIe~GDMiLQVNevsFENmSNd~AVrvLREaV~--~~gP-i~ltvAk  338 (626)
T KOG3571|consen  290 VALDGRIEPGDMILQVNEVSFENMSNDQAVRVLREAVS--RPGP-IKLTVAK  338 (626)
T ss_pred             eeccCccCccceEEEeeecchhhcCchHHHHHHHHHhc--cCCC-eEEEEee
Confidence            4667789999999999998887654  3    344443  3442 5666544


No 117
>PF01455 HupF_HypC:  HupF/HypC family;  InterPro: IPR001109 The large subunit of [NiFe]-hydrogenase, as well as other nickel metalloenzymes, is synthesised as a precursor devoid of the metalloenzyme active site. This precursor then undergoes a complex post-translational maturation process that requires a number of accessory proteins. The hydrogenase expression/formation proteins (HupF/HypC) form a family of small proteins that are hydrogenase precursor-specific chaperones required for this maturation process []. They are believed to keep the hydrogenase precursor in a conformation accessible for metal incorporation [, ].; PDB: 3D3R_A 2Z1C_C 2OT2_A.
Probab=34.59  E-value=1.2e+02  Score=22.38  Aligned_cols=43  Identities=23%  Similarity=0.396  Sum_probs=30.3

Q ss_pred             EeeEEEEECCCCCeEEEEEcCCCCCCcceecCCCCCCCCCCEEEEE
Q 016647          188 YDAKIVGFDQDKDVAVLRIDAPKDKLRPIPIGVSADLLVGQKVYAI  233 (385)
Q Consensus       188 ~~a~vv~~d~~~DlAlLkv~~~~~~~~~l~l~~~~~~~~G~~V~~i  233 (385)
                      ++++++..+.....|++....   ....+.+.--.++++||+|.+-
T Consensus         5 iP~~Vv~v~~~~~~A~v~~~G---~~~~V~~~lv~~v~~Gd~VLVH   47 (68)
T PF01455_consen    5 IPGRVVEVDEDGGMAVVDFGG---VRREVSLALVPDVKVGDYVLVH   47 (68)
T ss_dssp             EEEEEEEEETTTTEEEEEETT---EEEEEEGTTCTSB-TT-EEEEE
T ss_pred             ccEEEEEEeCCCCEEEEEcCC---cEEEEEEEEeCCCCCCCEEEEe
Confidence            688999998888999998875   3345555444558999999764


No 118
>KOG3605 consensus Beta amyloid precursor-binding protein [General function prediction only]
Probab=34.11  E-value=28  Score=36.98  Aligned_cols=42  Identities=26%  Similarity=0.445  Sum_probs=30.5

Q ss_pred             ccccccccccCCCCCcEEEEECCEEeCCH-H-HHHHHHhcCCCCC
Q 016647          317 LLSTKRDAYGRLILGDIITSVNGKKVSNG-S-DLYRILDQCKVGD  359 (385)
Q Consensus       317 v~~~~~a~~~gl~~GDiI~~ing~~i~s~-~-~l~~~l~~~~~g~  359 (385)
                      ++-++.++++|++.|.-|++|||+.|--. . -+.++|.. ..|+
T Consensus       763 LlRGGIAERGGVRVGHRIIEINgQSVVA~pHekIV~lLs~-aVGE  806 (829)
T KOG3605|consen  763 LLRGGIAERGGVRVGHRIIEINGQSVVATPHEKIVQLLSN-AVGE  806 (829)
T ss_pred             hhcccchhccCceeeeeEEEECCceEEeccHHHHHHHHHH-hhhh
Confidence            34677799999999999999999988643 2 35555554 3554


No 119
>PF05578 Peptidase_S31:  Pestivirus NS3 polyprotein peptidase S31;  InterPro: IPR000280 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This group of serine peptidases belong to MEROPS peptidase family S31 (clan PA(S)). The type example is pestivirus NS3 polyprotein peptidase from bovine viral diarrhea virus, which is Type 1 pestivirus. The pestiviruses are single-stranded RNA viruses whose genomes encode one large polyprotein []. The p80 endopeptidase resides towards the middle of the polyprotein and is responsible for processing all non-structural pestivirus proteins [, ]. The p80 enzyme is similar to other proteases in the PA(S) clan and is predicted to have a fold similar to that of chymotrypsin [, ]. An HDS catalytic triad has been identified [].; GO: 0004252 serine-type endopeptidase activity, 0006508 proteolysis
Probab=33.57  E-value=1e+02  Score=26.72  Aligned_cols=73  Identities=22%  Similarity=0.219  Sum_probs=38.5

Q ss_pred             CCCCCEEEEEeCCCCCCCceEEeEEeeeeeeeccCCCCCCcccEEEEccccCCCCCCceeeC-CCccEEEEeecc
Q 016647          224 LLVGQKVYAIGNPFGLDHTLTTGVISGLRREISSAATGRPIQDVIQTDAAINPGNSGGPLLD-SSGSLIGINTAI  297 (385)
Q Consensus       224 ~~~G~~V~~iG~p~g~~~~~~~G~Vs~~~~~~~~~~~~~~~~~~i~~d~~i~~G~SGGPlvn-~~G~VVGI~s~~  297 (385)
                      ...|...+++ +|...+-+-+.|.+-...+.-..+..-.....--.+|..-..|.||-|+|. ..|++||=.-.+
T Consensus       109 cp~garcyv~-npea~nisgtkga~vhlqk~ggef~cvta~gtpaf~~~knlkg~s~~pifeassgr~vgr~k~g  182 (211)
T PF05578_consen  109 CPDGARCYVL-NPEATNISGTKGAMVHLQKTGGEFTCVTASGTPAFFDLKNLKGWSGLPIFEASSGRVVGRVKVG  182 (211)
T ss_pred             CCCCcEEEEe-CCcccccccCcceEEEEeccCCceEEEeccCCcceeeccccCCCCCCceeeccCCcEEEEEEec
Confidence            4457778777 565544455555554333221100000000001123334457999999997 689999976543


No 120
>PF09122 DUF1930:  Domain of unknown function (DUF1930);  InterPro: IPR015206 This entry represents a domain found in 3-mercaptopyruvate sulphurtransferase which has no known function. This domain adopts a structure consisting of a four-stranded antiparallel beta-sheet and an alpha-helix, arranged in a beta(2)-alpha-beta(2) fashion, and bearing a remarkable structural similarity to the FK506-binding protein class of peptidylprolyl cis/trans-isomerase []. ; PDB: 1OKG_A.
Probab=31.31  E-value=1.9e+02  Score=21.16  Aligned_cols=45  Identities=22%  Similarity=0.304  Sum_probs=27.7

Q ss_pred             CcEEEEECCEEeCCHH-HHHHHHhcCCCCCEEEEEEEECCEEEEEEE
Q 016647          331 GDIITSVNGKKVSNGS-DLYRILDQCKVGDEVIVEVLRGDQKEKIPV  376 (385)
Q Consensus       331 GDiI~~ing~~i~s~~-~l~~~l~~~~~g~~v~l~v~R~g~~~~~~v  376 (385)
                      .-..+.+||+.+.+.+ ++..++.-...|+..++.+ ..+....+++
T Consensus        19 ~~~tl~vDg~~v~~PD~El~sA~~HlH~GEkA~V~F-kS~Rv~~iEv   64 (68)
T PF09122_consen   19 DNATLIVDGEIVENPDAELKSALVHLHIGEKAQVFF-KSQRVAVIEV   64 (68)
T ss_dssp             TT--EEETTEEESS--HHHHHHHTT-BTT-EEEEEE-TTS-EEEEE-
T ss_pred             cceEEEEcCeEcCCCCHHHHHHHHHhhcCceeEEEE-ecCcEEEEEc
Confidence            4567889999999987 6888888778999988865 3344444443


No 121
>PF14827 Cache_3:  Sensory domain of two-component sensor kinase; PDB: 1OJG_A 3BY8_A 1P0Z_I 2V9A_A 2J80_B.
Probab=30.42  E-value=39  Score=27.39  Aligned_cols=17  Identities=35%  Similarity=0.675  Sum_probs=12.6

Q ss_pred             ceeeCCCccEEEEeecc
Q 016647          281 GPLLDSSGSLIGINTAI  297 (385)
Q Consensus       281 GPlvn~~G~VVGI~s~~  297 (385)
                      .|++|.+|++||++..+
T Consensus        94 ~PV~d~~g~viG~V~VG  110 (116)
T PF14827_consen   94 APVYDSDGKVIGVVSVG  110 (116)
T ss_dssp             EEEE-TTS-EEEEEEEE
T ss_pred             EeeECCCCcEEEEEEEE
Confidence            68889999999998754


No 122
>cd01723 LSm4 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation.  Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm4 is one of at least seven subunits that assemble onto U6 snRNA to form a seven-membered ring structure.  Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=30.26  E-value=1.6e+02  Score=22.00  Aligned_cols=32  Identities=13%  Similarity=0.270  Sum_probs=28.5

Q ss_pred             CeEEEEecCCCeEeeEEEEECCCCCeEEEEEc
Q 016647          176 SDIRVTFADQSAYDAKIVGFDQDKDVAVLRID  207 (385)
Q Consensus       176 ~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkv~  207 (385)
                      ..+.|.+.+|+.+.+++..+|...++.+-.+.
T Consensus        12 ~~V~VeLkng~~~~G~L~~~D~~mNi~L~~~~   43 (76)
T cd01723          12 HPMLVELKNGETYNGHLVNCDNWMNIHLREVI   43 (76)
T ss_pred             CEEEEEECCCCEEEEEEEEEcCCCceEEEeEE
Confidence            46899999999999999999999999887663


No 123
>PF14275 DUF4362:  Domain of unknown function (DUF4362)
Probab=29.95  E-value=1.2e+02  Score=24.23  Aligned_cols=25  Identities=20%  Similarity=0.405  Sum_probs=17.8

Q ss_pred             CCCcEEEEECCEEeCCHHHHHHHHhcC
Q 016647          329 ILGDIITSVNGKKVSNGSDLYRILDQC  355 (385)
Q Consensus       329 ~~GDiI~~ing~~i~s~~~l~~~l~~~  355 (385)
                      |.||||.+ .|+ |.|.+.|...+...
T Consensus         1 ~~~DVi~~-~~~-i~Nl~kl~~Fi~nv   25 (98)
T PF14275_consen    1 KNNDVINK-HGE-IENLDKLDQFIENV   25 (98)
T ss_pred             CCCCEEEe-CCe-EEeHHHHHHHHHHH
Confidence            56999999 444 77777777766653


No 124
>COG4956 Integral membrane protein (PIN domain superfamily) [General function prediction only]
Probab=29.36  E-value=48  Score=32.12  Aligned_cols=38  Identities=21%  Similarity=0.390  Sum_probs=32.0

Q ss_pred             EEECCEEeCCHHHHHHHHh-cCCCCCEEEEEEEECCEEE
Q 016647          335 TSVNGKKVSNGSDLYRILD-QCKVGDEVIVEVLRGDQKE  372 (385)
Q Consensus       335 ~~ing~~i~s~~~l~~~l~-~~~~g~~v~l~v~R~g~~~  372 (385)
                      .++.|.+|-|..|+.++++ ..-|||.+++++.++||+.
T Consensus       270 ae~qgV~vLNINDLAnAVkP~vlpGe~l~v~iiK~GkE~  308 (356)
T COG4956         270 AELQGVQVLNINDLANAVKPVVLPGEELTVQIIKDGKEP  308 (356)
T ss_pred             HhhcCCceecHHHHHHHhCCcccCCCeeEEEEeecCccc
Confidence            4567788889999999988 4679999999999999864


No 125
>cd04627 CBS_pair_14 The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria.  The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair.  The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here.  It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic gener
Probab=26.65  E-value=50  Score=26.23  Aligned_cols=22  Identities=32%  Similarity=0.453  Sum_probs=17.3

Q ss_pred             CCCCCceeeCCCccEEEEeecc
Q 016647          276 PGNSGGPLLDSSGSLIGINTAI  297 (385)
Q Consensus       276 ~G~SGGPlvn~~G~VVGI~s~~  297 (385)
                      .+.+.-|++|.+|+++|+++..
T Consensus        97 ~~~~~lpVvd~~~~~vGiit~~  118 (123)
T cd04627          97 EGISSVAVVDNQGNLIGNISVT  118 (123)
T ss_pred             cCCceEEEECCCCcEEEEEeHH
Confidence            3445679999899999999753


No 126
>cd01725 LSm2 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation.  Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm2 is one of at least seven subunits that assemble onto U6 snRNA to form a seven-membered ring structure.  Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=25.68  E-value=1.6e+02  Score=22.47  Aligned_cols=32  Identities=13%  Similarity=0.276  Sum_probs=28.4

Q ss_pred             CeEEEEecCCCeEeeEEEEECCCCCeEEEEEc
Q 016647          176 SDIRVTFADQSAYDAKIVGFDQDKDVAVLRID  207 (385)
Q Consensus       176 ~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkv~  207 (385)
                      ..+.|.+.+|..+.+++..+|...++-+-.+.
T Consensus        12 ~~V~VeLKng~~~~G~L~~vD~~MNi~L~n~~   43 (81)
T cd01725          12 KEVTVELKNDLSIRGTLHSVDQYLNIKLTNIS   43 (81)
T ss_pred             CEEEEEECCCcEEEEEEEEECCCcccEEEEEE
Confidence            46899999999999999999999999887764


No 127
>cd04603 CBS_pair_KefB_assoc This cd contains two tandem repeats of the cystathionine beta-synthase (CBS pair) domains associated with the KefB (Kef-type K+ transport systems) domain which is involved in inorganic ion transport and metabolism. CBS is a small domain originally identified in cystathionine beta-synthase and subsequently found in a wide range of different proteins. CBS domains usually come in tandem repeats, which associate to form a so-called Bateman domain or a CBS pair which is reflected in this model. The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains.  It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown.
Probab=24.55  E-value=59  Score=25.37  Aligned_cols=20  Identities=20%  Similarity=0.270  Sum_probs=16.0

Q ss_pred             CCCCceeeCCCccEEEEeec
Q 016647          277 GNSGGPLLDSSGSLIGINTA  296 (385)
Q Consensus       277 G~SGGPlvn~~G~VVGI~s~  296 (385)
                      +.+--|++|.+|+++|+++.
T Consensus        86 ~~~~lpVvd~~~~~~Giit~  105 (111)
T cd04603          86 EPPVVAVVDKEGKLVGTIYE  105 (111)
T ss_pred             CCCeEEEEcCCCeEEEEEEh
Confidence            44456899988999999875


No 128
>PF11948 DUF3465:  Protein of unknown function (DUF3465);  InterPro: IPR021856  This family of proteins are functionally uncharacterised. This protein is found in bacteria. Proteins in this family are typically between 131 to 151 amino acids in length. This protein has a conserved HWTH sequence motif. 
Probab=24.53  E-value=4.3e+02  Score=22.30  Aligned_cols=12  Identities=33%  Similarity=0.246  Sum_probs=10.4

Q ss_pred             CCCCCCEEEEEe
Q 016647          223 DLLVGQKVYAIG  234 (385)
Q Consensus       223 ~~~~G~~V~~iG  234 (385)
                      .++.||.|.+.|
T Consensus        85 ~l~~GD~V~f~G   96 (131)
T PF11948_consen   85 WLQKGDQVEFYG   96 (131)
T ss_pred             CcCCCCEEEEEE
Confidence            478899999988


No 129
>PF08669 GCV_T_C:  Glycine cleavage T-protein C-terminal barrel domain;  InterPro: IPR013977  This entry shows glycine cleavage T-proteins, part of the glycine cleavage multienzyme complex (GCV) found in bacteria and the mitochondria of eukaryotes. GCV catalyses the catabolism of glycine in eukaryotes. The T-protein is an aminomethyl transferase. ; PDB: 3ADA_A 1VRQ_A 1X31_A 3AD9_A 3AD8_A 3AD7_A 3GIR_A 1WOO_A 1WOS_A 1WOR_A ....
Probab=24.36  E-value=58  Score=25.20  Aligned_cols=20  Identities=30%  Similarity=0.499  Sum_probs=16.5

Q ss_pred             CCCceeeCCCccEEEEeecc
Q 016647          278 NSGGPLLDSSGSLIGINTAI  297 (385)
Q Consensus       278 ~SGGPlvn~~G~VVGI~s~~  297 (385)
                      ..|.|+++.+|+.||.+++.
T Consensus        34 ~~g~~v~~~~g~~vG~vTS~   53 (95)
T PF08669_consen   34 RGGEPVYDEDGKPVGRVTSG   53 (95)
T ss_dssp             STTCEEEETTTEEEEEEEEE
T ss_pred             CCCCEEEECCCcEEeEEEEE
Confidence            45899998799999988754


No 130
>cd04620 CBS_pair_7 The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria.  The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair.  The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here.  It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic genera
Probab=23.76  E-value=62  Score=25.17  Aligned_cols=20  Identities=45%  Similarity=0.599  Sum_probs=16.2

Q ss_pred             CCCCceeeCCCccEEEEeec
Q 016647          277 GNSGGPLLDSSGSLIGINTA  296 (385)
Q Consensus       277 G~SGGPlvn~~G~VVGI~s~  296 (385)
                      +...-|++|.+|+++|+++.
T Consensus        90 ~~~~~pVvd~~~~~~Gvit~  109 (115)
T cd04620          90 QIRHLPVLDDQGQLIGLVTA  109 (115)
T ss_pred             CCceEEEEcCCCCEEEEEEh
Confidence            34467899988999999875


No 131
>cd01733 LSm10 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation.  Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet.  LSm10 is an SmD1-like protein which is thought to bind U7 snRNA along with LSm11 and five other Sm subunits to form a 7-member ring structure. LSm10 and the U7 snRNP of which it is a part are thought to play an important role in histone mRNA 3' processing.
Probab=23.44  E-value=3.3e+02  Score=20.56  Aligned_cols=32  Identities=9%  Similarity=0.285  Sum_probs=28.3

Q ss_pred             CeEEEEecCCCeEeeEEEEECCCCCeEEEEEc
Q 016647          176 SDIRVTFADQSAYDAKIVGFDQDKDVAVLRID  207 (385)
Q Consensus       176 ~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkv~  207 (385)
                      ..+.|.+.+|..|.+++..+|...++-+-.+.
T Consensus        20 ~~V~VeLKng~~~~G~L~~vD~~MNl~L~~~~   51 (78)
T cd01733          20 KVVTVELRNETTVTGRIASVDAFMNIRLAKVT   51 (78)
T ss_pred             CEEEEEECCCCEEEEEEEEEcCCceeEEEEEE
Confidence            46899999999999999999999998887764


No 132
>PF10049 DUF2283:  Protein of unknown function (DUF2283);  InterPro: IPR019270  Members of this family of hypothetical proteins have no known function. 
Probab=23.14  E-value=56  Score=22.44  Aligned_cols=10  Identities=40%  Similarity=0.973  Sum_probs=8.0

Q ss_pred             CCCccEEEEe
Q 016647          285 DSSGSLIGIN  294 (385)
Q Consensus       285 n~~G~VVGI~  294 (385)
                      |.+|++|||-
T Consensus        36 d~~G~ivGIE   45 (50)
T PF10049_consen   36 DEDGRIVGIE   45 (50)
T ss_pred             CCCCCEEEEE
Confidence            5788999974


No 133
>KOG1379 consensus Serine/threonine protein phosphatase [Signal transduction mechanisms]
Probab=22.44  E-value=98  Score=30.14  Aligned_cols=71  Identities=20%  Similarity=0.212  Sum_probs=41.6

Q ss_pred             CCCCCCceeeCCCccEEEEeecccCCCCCCCCceeeecccccccc--------cccc----cccCCCCCcEEEEECCEEe
Q 016647          275 NPGNSGGPLLDSSGSLIGINTAIYSPSGASSGVGFSIPVDTGLLS--------TKRD----AYGRLILGDIITSVNGKKV  342 (385)
Q Consensus       275 ~~G~SGGPlvn~~G~VVGI~s~~~~~~~~~~~~g~aIP~~~~v~~--------~~~a----~~~gl~~GDiI~~ing~~i  342 (385)
                      +-|+||=-++ .+|+||=      ..  ..+...|-.|+...+.+        +.|.    ..-.+|.||||+.--+=-.
T Consensus       187 NLGDSGF~Vv-R~G~vv~------~S--~~Q~H~FN~PyQLs~~p~~~~~~~~d~p~~ad~~~~~v~~GDvIilATDGlf  257 (330)
T KOG1379|consen  187 NLGDSGFLVV-REGKVVF------RS--PEQQHYFNTPYQLSSPPEGYSSYISDVPDSADVTSFDVQKGDVIILATDGLF  257 (330)
T ss_pred             eccCcceEEE-ECCEEEE------cC--chheeccCCceeeccCCccccccccCCccccceEEEeccCCCEEEEeccccc
Confidence            4699998888 7999872      11  23445666676653332        2221    1235899999887655455


Q ss_pred             CCHHH--HHHHHhc
Q 016647          343 SNGSD--LYRILDQ  354 (385)
Q Consensus       343 ~s~~~--l~~~l~~  354 (385)
                      +|+.+  +..+|..
T Consensus       258 DNl~e~~Il~il~~  271 (330)
T KOG1379|consen  258 DNLPEKEILSILKG  271 (330)
T ss_pred             ccccHHHHHHHHHH
Confidence            55543  4444443


No 134
>PRK13835 conjugal transfer protein TrbH; Provisional
Probab=21.92  E-value=73  Score=27.38  Aligned_cols=40  Identities=20%  Similarity=0.223  Sum_probs=26.0

Q ss_pred             HHHHHHHhCCceEEEEEeeeccCccccccccCCCeEEEEEE
Q 016647          118 TVRLFQENTPSVVNITNLAARQDAFTLDVLEVPQGSGSGFV  158 (385)
Q Consensus       118 ~~~~~~~~~~SVV~I~~~~~~~~~~~~~~~~~~~~~GSGfi  158 (385)
                      ..++.+.+.|+--+|.-.... ++|....+..-+++|=+++
T Consensus        47 vsqLae~~pPa~tt~~l~q~~-d~Fg~aL~~aLr~~GYaVv   86 (145)
T PRK13835         47 VSRLAEQIGPGTTTIKLKKDT-SPFGQALEAALKGWGYAVV   86 (145)
T ss_pred             HHHHHHhcCCCceEEEEeecC-cHHHHHHHHHHHhcCeEEe
Confidence            457888889988777765554 6776555544455555555


No 135
>PRK14864 putative biofilm stress and motility protein A; Provisional
Probab=21.59  E-value=1.7e+02  Score=23.65  Aligned_cols=26  Identities=8%  Similarity=-0.027  Sum_probs=10.8

Q ss_pred             ccccCccchhHHHHHHHhCCceEEEE
Q 016647          108 QRKLQTDELATVRLFQENTPSVVNIT  133 (385)
Q Consensus       108 ~~~~~~~~~~~~~~~~~~~~SVV~I~  133 (385)
                      ++...+++.+..+....-+=.+|.|.
T Consensus        32 ~~~~~A~eI~~~qa~~lq~iGtVSvs   57 (104)
T PRK14864         32 PPADHAQEIRRAQTQGLQKMGTVSAL   57 (104)
T ss_pred             CccccceecCHHHhhCCceeeEEEEe
Confidence            33444455554433222222355554


No 136
>PF14438 SM-ATX:  Ataxin 2 SM domain; PDB: 1M5Q_1.
Probab=21.15  E-value=2.2e+02  Score=21.17  Aligned_cols=30  Identities=20%  Similarity=0.350  Sum_probs=21.3

Q ss_pred             CeEEEEecCCCeEeeEEEEECC---CCCeEEEEE
Q 016647          176 SDIRVTFADQSAYDAKIVGFDQ---DKDVAVLRI  206 (385)
Q Consensus       176 ~~i~V~~~dg~~~~a~vv~~d~---~~DlAlLkv  206 (385)
                      ..++|++.||..|++-....++   +.|++| +.
T Consensus        13 ~~V~V~~~~G~~yeGif~s~s~~~~~~~vvL-k~   45 (77)
T PF14438_consen   13 QTVEVTTKNGSVYEGIFHSASPESNEFDVVL-KM   45 (77)
T ss_dssp             SEEEEEETTS-EEEEEEEEE-T---T--EEE-EE
T ss_pred             CEEEEEECCCCEEEEEEEeCCCcccceeEEE-Ee
Confidence            4689999999999999999887   667755 44


No 137
>cd04597 CBS_pair_DRTGG_assoc2 This cd contains two tandem repeats of the cystathionine beta-synthase (CBS pair) domains associated with a DRTGG domain upstream. The function of the DRTGG domain, named after its conserved residues, is unknown. CBS is a small domain originally identified in cystathionine beta-synthase and subsequently found in a wide range of different proteins. CBS domains usually come in tandem repeats, which associate to form a so-called Bateman domain or a CBS pair which is reflected in this model. The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains. It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown.
Probab=20.79  E-value=84  Score=24.94  Aligned_cols=21  Identities=29%  Similarity=0.347  Sum_probs=17.4

Q ss_pred             CCCCCceeeCCCccEEEEeec
Q 016647          276 PGNSGGPLLDSSGSLIGINTA  296 (385)
Q Consensus       276 ~G~SGGPlvn~~G~VVGI~s~  296 (385)
                      .+...-|++|.+|+++||++.
T Consensus        87 ~~~~~lpVvd~~~~l~Givt~  107 (113)
T cd04597          87 HNIRTLPVVDDDGTPAGIITL  107 (113)
T ss_pred             cCCCEEEEECCCCeEEEEEEH
Confidence            455678999999999999874


Done!