Query         008087
Match_columns 578
No_of_seqs    553 out of 3478
Neff          7.9 
Searched_HMMs 46136
Date          Thu Mar 28 19:18:07 2013
Command       hhsearch -i /work/01045/syshi/csienesis_hhblits_a3m/008087.a3m -d /work/01045/syshi/HHdatabase/Cdd.hhm -o /work/01045/syshi/hhsearch_cdd/008087hhsearch_cdd -cpu 12 -v 0 

 No Hit                             Prob E-value P-value  Score    SS Cols Query HMM  Template HMM
  1 PRK10139 serine endoprotease;  100.0 1.3E-57 2.9E-62  490.1  40.5  387  114-550    42-451 (455)
  2 TIGR02037 degP_htrA_DO peripla 100.0 1.6E-55 3.5E-60  474.6  42.9  391  115-550     4-425 (428)
  3 PRK10942 serine endoprotease;  100.0 1.2E-54 2.6E-59  469.3  38.5  349  148-550   110-469 (473)
  4 TIGR02038 protease_degS peripl 100.0 2.3E-47   5E-52  399.5  33.2  296  114-433    47-349 (351)
  5 PRK10898 serine endoprotease;  100.0 8.7E-47 1.9E-51  394.8  33.3  297  114-434    47-351 (353)
  6 KOG1421 Predicted signaling-as 100.0 1.1E-38 2.5E-43  334.7  25.6  376  114-540    54-455 (955)
  7 KOG1320 Serine protease [Postt 100.0 6.2E-38 1.3E-42  328.3  15.9  410  117-571    55-473 (473)
  8 COG0265 DegQ Trypsin-like seri 100.0 4.9E-36 1.1E-40  315.0  28.0  300  114-433    35-341 (347)
  9 KOG1320 Serine protease [Postt  99.9 1.9E-23 4.1E-28  219.6  20.0  303  114-431   130-467 (473)
 10 KOG1421 Predicted signaling-as  99.9 1.7E-19 3.6E-24  191.0  27.6  362  119-542   525-916 (955)
 11 PRK10779 zinc metallopeptidase  99.7 8.4E-17 1.8E-21  174.5  10.7  153  355-549   128-282 (449)
 12 PF13365 Trypsin_2:  Trypsin-li  99.6 2.6E-14 5.7E-19  125.9  14.1  108  151-289     1-120 (120)
 13 TIGR00054 RIP metalloprotease   99.5 2.7E-14 5.8E-19  153.5  11.6  135  353-548   128-263 (420)
 14 PF00089 Trypsin:  Trypsin;  In  99.4 2.3E-12 5.1E-17  125.4  14.4  182  134-315     7-220 (220)
 15 PF13180 PDZ_2:  PDZ domain; PD  99.4 4.1E-13 8.8E-18  110.9   7.3   81  328-430     1-82  (82)
 16 cd00190 Tryp_SPc Trypsin-like   99.3 4.5E-11 9.7E-16  117.2  16.9  168  148-315    24-229 (232)
 17 cd00987 PDZ_serine_protease PD  99.2   4E-11 8.6E-16  100.5   8.3   88  328-427     1-89  (90)
 18 smart00020 Tryp_SPc Trypsin-li  99.2 2.3E-10 5.1E-15  112.3  15.0  147  148-294    25-208 (229)
 19 cd00986 PDZ_LON_protease PDZ d  99.1 5.8E-10 1.3E-14   91.3   9.0   72  352-433     7-78  (79)
 20 cd00991 PDZ_archaeal_metallopr  99.0 7.6E-10 1.7E-14   90.7   8.5   68  352-429     9-77  (79)
 21 cd00990 PDZ_glycyl_aminopeptid  99.0 6.9E-10 1.5E-14   90.9   7.7   77  328-431     1-78  (80)
 22 TIGR01713 typeII_sec_gspC gene  99.0 9.4E-10   2E-14  110.3   8.4  100  309-430   159-259 (259)
 23 cd00989 PDZ_metalloprotease PD  98.8 1.2E-08 2.6E-13   83.1   7.7   64  354-428    13-77  (79)
 24 TIGR02037 degP_htrA_DO peripla  98.8   1E-08 2.2E-13  111.2   8.9   90  327-427   337-427 (428)
 25 COG3591 V8-like Glu-specific e  98.7 9.4E-08   2E-12   94.0  12.4  162  148-319    63-250 (251)
 26 cd00988 PDZ_CTP_protease PDZ d  98.7   4E-08 8.6E-13   81.4   8.0   67  352-429    12-82  (85)
 27 cd00136 PDZ PDZ domain, also c  98.5 1.9E-07 4.2E-12   74.2   6.2   54  353-417    13-69  (70)
 28 PF12812 PDZ_1:  PDZ-like domai  98.5   3E-07 6.6E-12   74.7   7.4   70  446-536     5-78  (78)
 29 KOG3209 WW domain-containing p  98.5 4.7E-07   1E-11   97.9  10.3  151  352-543   673-835 (984)
 30 PF13180 PDZ_2:  PDZ domain; PD  98.5 3.2E-07 6.9E-12   75.6   6.2   57  493-549    15-76  (82)
 31 cd00991 PDZ_archaeal_metallopr  98.4 8.6E-07 1.9E-11   72.5   8.1   58  492-549    10-72  (79)
 32 cd00987 PDZ_serine_protease PD  98.4 1.7E-06 3.8E-11   72.1  10.0   79  451-549     2-86  (90)
 33 smart00228 PDZ Domain present   98.3 1.6E-06 3.4E-11   71.3   7.5   59  353-421    26-85  (85)
 34 TIGR00054 RIP metalloprotease   98.3 1.5E-06 3.2E-11   93.9   8.1   69  353-432   203-272 (420)
 35 KOG3580 Tight junction protein  98.2 6.5E-06 1.4E-10   87.6  11.3   61  352-422    39-99  (1027)
 36 KOG3627 Trypsin [Amino acid tr  98.2   5E-05 1.1E-09   76.1  17.2  158  136-294    21-228 (256)
 37 PRK10779 zinc metallopeptidase  98.2 2.7E-06 5.9E-11   92.7   8.1   67  354-431   222-289 (449)
 38 cd00989 PDZ_metalloprotease PD  98.2 3.6E-06 7.8E-11   68.4   5.7   52  497-548    21-72  (79)
 39 TIGR00225 prc C-terminal pepti  98.1 5.1E-06 1.1E-10   87.1   7.2   72  353-433    62-134 (334)
 40 PF00595 PDZ:  PDZ domain (Also  98.1   4E-06 8.6E-11   68.8   4.8   72  327-418     9-81  (81)
 41 PRK10139 serine endoprotease;   98.1 6.3E-06 1.4E-10   89.7   7.5   64  353-428   390-454 (455)
 42 cd00986 PDZ_LON_protease PDZ d  98.1 1.7E-05 3.7E-10   64.7   8.0   56  493-549     9-69  (79)
 43 TIGR03279 cyano_FeS_chp putati  98.1 5.2E-06 1.1E-10   87.9   6.2   61  357-431     2-64  (433)
 44 KOG3209 WW domain-containing p  98.1 2.9E-05 6.2E-10   84.5  11.6   53  357-419   782-836 (984)
 45 cd00988 PDZ_CTP_protease PDZ d  98.0 2.1E-05 4.5E-10   64.9   7.0   55  493-547    14-75  (85)
 46 PRK10942 serine endoprotease;   98.0 1.6E-05 3.4E-10   87.0   7.6   64  353-428   408-472 (473)
 47 cd00992 PDZ_signaling PDZ doma  97.9 1.9E-05   4E-10   64.6   6.0   49  328-387    12-61  (82)
 48 PLN00049 carboxyl-terminal pro  97.9 2.4E-05 5.2E-10   83.7   8.5   68  354-430   103-171 (389)
 49 TIGR02038 protease_degS peripl  97.9 3.5E-05 7.6E-10   81.3   9.5   79  451-549   256-340 (351)
 50 TIGR02860 spore_IV_B stage IV   97.9 1.9E-05 4.2E-10   83.1   7.4   68  352-430   104-180 (402)
 51 PF14685 Tricorn_PDZ:  Tricorn   97.9 6.4E-05 1.4E-09   62.4   8.6   65  352-427    11-87  (88)
 52 cd00136 PDZ PDZ domain, also c  97.9 3.1E-05 6.6E-10   61.3   6.1   50  493-542    14-69  (70)
 53 PF00863 Peptidase_C4:  Peptida  97.9  0.0011 2.3E-08   65.2  17.9  135  158-309    40-185 (235)
 54 PRK10898 serine endoprotease;   97.8 9.9E-05 2.2E-09   77.9  10.1   58  492-549   279-341 (353)
 55 cd00990 PDZ_glycyl_aminopeptid  97.8 5.3E-05 1.1E-09   61.7   6.3   55  493-549    13-71  (80)
 56 PF05579 Peptidase_S32:  Equine  97.7 0.00036 7.7E-09   68.4  11.8  113  150-293   115-228 (297)
 57 PRK09681 putative type II secr  97.7 7.2E-05 1.6E-09   75.0   7.3   67  353-430   204-275 (276)
 58 TIGR01713 typeII_sec_gspC gene  97.7 9.5E-05 2.1E-09   74.4   7.4   58  492-549   191-253 (259)
 59 COG0793 Prc Periplasmic protea  97.7 0.00015 3.3E-09   77.6   9.1   81  327-431    99-182 (406)
 60 KOG3580 Tight junction protein  97.6 0.00059 1.3E-08   73.1  11.8   76  342-428   209-286 (1027)
 61 TIGR02860 spore_IV_B stage IV   97.6 9.9E-05 2.1E-09   77.9   6.0   51  499-549   124-174 (402)
 62 COG3480 SdrC Predicted secrete  97.5 0.00021 4.6E-09   71.7   6.8   71  353-433   130-201 (342)
 63 cd00992 PDZ_signaling PDZ doma  97.4 0.00047   1E-08   56.2   6.9   49  493-542    27-81  (82)
 64 PF00595 PDZ:  PDZ domain (Also  97.4 0.00033 7.2E-09   57.3   5.4   51  492-543    25-81  (81)
 65 KOG3605 Beta amyloid precursor  97.3 0.00045 9.8E-09   74.8   7.4  123  357-536   677-806 (829)
 66 COG3975 Predicted protease wit  97.3 0.00044 9.5E-09   74.0   6.4   85  330-434   439-526 (558)
 67 PRK11186 carboxy-terminal prot  97.2  0.0011 2.4E-08   74.9   8.8   71  353-429   255-332 (667)
 68 KOG3129 26S proteasome regulat  97.2 0.00082 1.8E-08   63.5   6.4   72  355-434   141-213 (231)
 69 COG3031 PulC Type II secretory  97.2 0.00038 8.3E-09   67.0   4.1   66  354-429   208-274 (275)
 70 TIGR03279 cyano_FeS_chp putati  97.0 0.00068 1.5E-08   72.1   4.9   51  495-548     5-56  (433)
 71 PF04495 GRASP55_65:  GRASP55/6  97.0  0.0014 3.1E-08   59.3   6.0   86  327-431    25-114 (138)
 72 smart00228 PDZ Domain present   97.0  0.0015 3.3E-08   53.3   5.7   53  493-546    27-85  (85)
 73 TIGR00225 prc C-terminal pepti  97.0  0.0006 1.3E-08   71.5   4.1   50  497-546    71-122 (334)
 74 PF12812 PDZ_1:  PDZ-like domai  96.9  0.0012 2.6E-08   53.6   4.3   58  328-389     9-67  (78)
 75 PF03761 DUF316:  Domain of unk  96.9    0.03 6.5E-07   57.2  15.5  109  194-313   159-273 (282)
 76 KOG3834 Golgi reassembly stack  96.9  0.0033 7.2E-08   65.5   8.2  143  352-542    14-164 (462)
 77 PLN00049 carboxyl-terminal pro  96.9  0.0013 2.8E-08   70.5   5.3   51  497-547   111-163 (389)
 78 COG5640 Secreted trypsin-like   96.8    0.02 4.3E-07   58.8  12.6   58  150-207    62-135 (413)
 79 PRK09681 putative type II secr  96.8  0.0028   6E-08   63.8   6.3   50  500-549   219-269 (276)
 80 PF00548 Peptidase_C3:  3C cyst  96.7   0.023 5.1E-07   53.6  12.3  138  147-292    23-169 (172)
 81 PF08192 Peptidase_S64:  Peptid  96.6   0.019 4.2E-07   63.3  11.8  117  195-318   542-688 (695)
 82 PF14685 Tricorn_PDZ:  Tricorn   96.6  0.0033 7.2E-08   52.2   4.7   48  499-546    31-80  (88)
 83 PF10459 Peptidase_S46:  Peptid  96.3  0.0048   1E-07   70.1   5.2   20  150-169    48-68  (698)
 84 KOG3553 Tax interaction protei  96.2  0.0038 8.2E-08   51.9   3.0   34  353-386    59-93  (124)
 85 COG0265 DegQ Trypsin-like seri  95.8   0.025 5.5E-07   59.6   7.6   59  492-550   270-333 (347)
 86 PF09342 DUF1986:  Domain of un  95.4    0.19 4.1E-06   49.3  11.1   99  135-234    12-131 (267)
 87 PF02122 Peptidase_S39:  Peptid  95.4   0.012 2.5E-07   56.9   2.9  138  158-311    41-184 (203)
 88 COG0793 Prc Periplasmic protea  95.3   0.016 3.5E-07   62.2   4.2   48  498-545   122-171 (406)
 89 PF04495 GRASP55_65:  GRASP55/6  95.1   0.014 3.1E-07   52.9   2.4   49  495-543    50-99  (138)
 90 KOG3550 Receptor targeting pro  94.9    0.06 1.3E-06   48.3   5.8   38  352-389   114-153 (207)
 91 PRK11186 carboxy-terminal prot  94.0   0.056 1.2E-06   61.4   4.4   49  496-544   263-319 (667)
 92 PF00949 Peptidase_S7:  Peptida  93.7   0.079 1.7E-06   47.4   4.0   29  266-294    90-118 (132)
 93 KOG3552 FERM domain protein FR  93.2   0.094   2E-06   59.6   4.4   57  353-419    75-131 (1298)
 94 PF00947 Pico_P2A:  Picornaviru  93.2    0.28 6.1E-06   43.2   6.5   57  248-310    65-121 (127)
 95 COG3480 SdrC Predicted secrete  92.8    0.28 6.1E-06   49.8   6.7   53  492-544   130-186 (342)
 96 KOG3532 Predicted protein kina  92.7    0.12 2.6E-06   57.0   4.2   37  353-389   398-435 (1051)
 97 KOG3605 Beta amyloid precursor  92.7    0.14   3E-06   56.2   4.6   81  305-387   708-791 (829)
 98 COG3031 PulC Type II secretory  92.4    0.19 4.1E-06   48.9   4.7   50  500-549   219-269 (275)
 99 KOG3553 Tax interaction protei  92.0    0.17 3.6E-06   42.4   3.3   53  491-545    58-116 (124)
100 KOG3532 Predicted protein kina  92.0    0.37   8E-06   53.3   6.8   48  495-542   405-452 (1051)
101 PF00944 Peptidase_S3:  Alphavi  91.5    0.32 6.9E-06   43.1   4.7   27  267-293   100-126 (158)
102 KOG3542 cAMP-regulated guanine  91.3    0.14 3.1E-06   56.3   2.8   39  350-388   559-598 (1283)
103 KOG3129 26S proteasome regulat  90.2    0.35 7.6E-06   46.2   4.1   56  494-549   145-203 (231)
104 KOG1892 Actin filament-binding  89.6    0.46   1E-05   54.4   5.1   61  352-422   959-1021(1629)
105 KOG2921 Intramembrane metallop  87.6    0.37 8.1E-06   50.1   2.5   38  352-389   219-258 (484)
106 PF10459 Peptidase_S46:  Peptid  87.0    0.32 6.9E-06   55.7   1.8   40  196-235   200-252 (698)
107 KOG3550 Receptor targeting pro  86.3     4.8  0.0001   36.4   8.4   46  496-542   123-171 (207)
108 PF02907 Peptidase_S29:  Hepati  85.8    0.72 1.6E-05   41.0   2.9   24  271-294   106-129 (148)
109 PF05580 Peptidase_S55:  SpoIVB  85.6    0.52 1.1E-05   45.5   2.2   41  267-310   174-214 (218)
110 COG0750 Predicted membrane-ass  85.2     1.5 3.3E-05   46.5   5.9   57  357-424   133-194 (375)
111 COG0750 Predicted membrane-ass  83.3     1.8 3.8E-05   46.0   5.3   53  496-548   137-193 (375)
112 COG3975 Predicted protease wit  83.2    0.91   2E-05   49.3   3.0   21  498-518   472-492 (558)
113 KOG3552 FERM domain protein FR  80.1     2.3   5E-05   48.9   4.8   48  496-544    82-131 (1298)
114 KOG3606 Cell polarity protein   79.0     1.8 3.9E-05   42.9   3.1   56  330-387   173-230 (358)
115 KOG3651 Protein kinase C, alph  78.8     3.4 7.4E-05   41.6   5.0   55  354-418    31-87  (429)
116 KOG1924 RhoA GTPase effector D  76.9     9.2  0.0002   43.5   8.1   11   12-22    502-512 (1102)
117 KOG3571 Dishevelled 3 and rela  76.1     2.8 6.1E-05   45.1   3.8   38  352-389   276-315 (626)
118 PF03510 Peptidase_C24:  2C end  75.3      12 0.00025   32.2   6.7   53  153-219     3-55  (105)
119 KOG3551 Syntrophins (type beta  74.9     2.8   6E-05   43.7   3.3   55  353-418   110-167 (506)
120 KOG0606 Microtubule-associated  74.1       3 6.5E-05   49.2   3.7   35  355-389   660-695 (1205)
121 KOG0609 Calcium/calmodulin-dep  72.3     4.3 9.3E-05   44.3   4.1   49  494-543   152-203 (542)
122 KOG1924 RhoA GTPase effector D  72.2      13 0.00027   42.5   7.7   10  523-532  1043-1052(1102)
123 KOG2921 Intramembrane metallop  72.2     6.5 0.00014   41.3   5.2   43  492-534   220-267 (484)
124 KOG3606 Cell polarity protein   71.3     9.4  0.0002   38.1   5.8   46  496-542   202-250 (358)
125 PF02395 Peptidase_S6:  Immunog  70.5      15 0.00033   42.8   8.3   63  151-219    67-132 (769)
126 PF01732 DUF31:  Putative pepti  70.3     3.4 7.4E-05   44.0   2.9   24  269-292   351-374 (374)
127 KOG3549 Syntrophins (type gamm  68.1     8.2 0.00018   39.8   4.8   49  494-543    82-137 (505)
128 KOG3549 Syntrophins (type gamm  66.5       7 0.00015   40.3   4.0   55  354-418    81-137 (505)
129 KOG0606 Microtubule-associated  65.8     9.2  0.0002   45.4   5.2   42  495-536   665-708 (1205)
130 KOG0609 Calcium/calmodulin-dep  65.7     8.3 0.00018   42.2   4.6   56  354-419   147-204 (542)
131 KOG3938 RGS-GAIP interacting p  63.7      11 0.00024   37.4   4.7   38  505-542   167-207 (334)
132 KOG3542 cAMP-regulated guanine  62.1     7.5 0.00016   43.4   3.4   51  493-545   563-619 (1283)
133 KOG3938 RGS-GAIP interacting p  57.6     6.4 0.00014   39.1   1.8   56  355-418   151-208 (334)
134 KOG3551 Syntrophins (type beta  56.4      11 0.00024   39.5   3.3   54  499-553   121-179 (506)
135 KOG3651 Protein kinase C, alph  55.8      25 0.00055   35.7   5.7   47  495-542    37-86  (429)
136 KOG3834 Golgi reassembly stack  54.9      11 0.00023   40.2   3.1   50  495-545    22-73  (462)
137 PF05416 Peptidase_C37:  Southa  53.2 1.4E+02  0.0031   32.0  10.8  136  149-294   379-527 (535)
138 KOG3571 Dishevelled 3 and rela  49.3      30 0.00066   37.6   5.4   53  490-542   275-336 (626)
139 cd01720 Sm_D2 The eukaryotic S  43.8      46 0.00099   27.6   4.6   37  167-204    10-46  (87)
140 cd00600 Sm_like The eukaryotic  40.9      75  0.0016   23.9   5.2   33  172-205     7-39  (63)
141 cd01726 LSm6 The eukaryotic Sm  36.5      82  0.0018   24.5   4.8   32  172-204    11-42  (67)
142 PRK00737 small nuclear ribonuc  35.8      91   0.002   24.7   5.0   33  172-205    15-47  (72)
143 cd01722 Sm_F The eukaryotic Sm  35.3      82  0.0018   24.6   4.7   32  172-204    12-43  (68)
144 cd01731 archaeal_Sm1 The archa  34.6      99  0.0022   24.0   5.1   33  172-205    11-43  (68)
145 cd06168 LSm9 The eukaryotic Sm  33.1   1E+02  0.0023   24.7   5.0   31  172-203    11-41  (75)
146 cd01717 Sm_B The eukaryotic Sm  32.9      96  0.0021   25.0   4.8   32  172-204    11-42  (79)
147 COG0298 HypC Hydrogenase matur  32.7   1E+02  0.0023   25.0   4.8   48  185-234     5-53  (82)
148 cd01735 LSm12_N LSm12 belongs   32.2 1.6E+02  0.0035   22.7   5.6   34  171-205     6-39  (61)
149 cd01730 LSm3 The eukaryotic Sm  32.0      89  0.0019   25.4   4.6   31  172-203    12-42  (82)
150 PF00571 CBS:  CBS domain CBS d  30.5      52  0.0011   23.9   2.8   21  272-292    28-48  (57)
151 cd01729 LSm7 The eukaryotic Sm  30.2 1.2E+02  0.0026   24.7   5.0   31  172-203    13-43  (81)
152 cd01732 LSm5 The eukaryotic Sm  30.0 1.1E+02  0.0023   24.6   4.6   31  172-203    14-44  (76)
153 cd01721 Sm_D3 The eukaryotic S  29.4 1.3E+02  0.0029   23.6   5.0   33  171-204    10-42  (70)
154 KOG1738 Membrane-associated gu  28.7      29 0.00064   38.8   1.5   34  355-388   227-262 (638)
155 cd01719 Sm_G The eukaryotic Sm  28.3 1.4E+02  0.0031   23.6   5.0   31  172-203    11-41  (72)
156 KOG4371 Membrane-associated pr  28.1 1.9E+02  0.0041   34.7   7.6  155  328-547  1158-1331(1332)
157 TIGR03000 plancto_dom_1 Planct  28.0 1.3E+02  0.0028   24.3   4.5   48  372-428    10-61  (75)
158 COG1582 FlgEa Uncharacterized   27.3 1.9E+02  0.0041   22.5   5.1   52  511-565     3-54  (67)
159 PF12381 Peptidase_C3G:  Tungro  27.3      87  0.0019   30.5   4.2   54  263-319   170-229 (231)
160 PF11874 DUF3394:  Domain of un  26.6 1.9E+02  0.0041   27.6   6.3   19  498-516   132-150 (183)
161 smart00651 Sm snRNP Sm protein  25.9 1.7E+02  0.0036   22.3   5.0   33  172-205     9-41  (67)
162 PF12419 DUF3670:  SNF2 Helicas  25.8 2.7E+02  0.0059   25.1   7.1   54  508-566    72-125 (141)
163 smart00384 AT_hook DNA binding  25.5      50  0.0011   20.8   1.4   14    5-18      1-14  (26)
164 PF11874 DUF3394:  Domain of un  25.4      59  0.0013   30.9   2.7   29  352-380   121-150 (183)
165 cd01728 LSm1 The eukaryotic Sm  25.2 1.7E+02  0.0037   23.4   4.9   32  172-204    13-44  (74)
166 PF01423 LSM:  LSM domain ;  In  25.2 1.4E+02   0.003   22.9   4.4   35  171-206     8-42  (67)
167 COG1958 LSM1 Small nuclear rib  24.3 1.5E+02  0.0033   23.8   4.6   33  172-205    18-50  (79)
168 PF07174 FAP:  Fibronectin-atta  24.1 6.4E+02   0.014   25.6   9.5   17  153-169   120-136 (297)
169 cd01727 LSm8 The eukaryotic Sm  24.1 3.3E+02  0.0071   21.5   6.5   32  172-204    10-41  (74)
170 PF09465 LBR_tudor:  Lamin-B re  23.7   3E+02  0.0065   20.8   5.5   38  169-206     7-44  (55)
171 cd01723 LSm4 The eukaryotic Sm  20.9 2.5E+02  0.0053   22.4   5.1   33  171-204    11-43  (76)

No 1  
>PRK10139 serine endoprotease; Provisional
Probab=100.00  E-value=1.3e-57  Score=490.06  Aligned_cols=387  Identities=21%  Similarity=0.294  Sum_probs=310.5

Q ss_pred             cccccccCCCCcEEEEeeeeCCC-------------CCCccccCCCcceEEEEEEEc--CCEEEecccccCCCceEEEEE
Q 008087          114 GVARVVPAMDAVVKVFCVHTEPN-------------FSLPWQRKRQYSSSSSGFAIG--GRRVLTNAHSVEHYTQVKLKK  178 (578)
Q Consensus       114 ~~~~~~~~~~sVV~I~~~~~~~~-------------~~~p~~~~~~~~~~GsGfvI~--~g~ILT~aHvV~~~~~i~V~~  178 (578)
                      ....++++.+|||.|........             ...||+......+.||||||+  +||||||+|||.++..+.|++
T Consensus        42 ~~~~~~~~~pavV~i~~~~~~~~~~~~~~~~~~~f~~~~~~~~~~~~~~~GSG~ii~~~~g~IlTn~HVv~~a~~i~V~~  121 (455)
T PRK10139         42 LAPMLEKVLPAVVSVRVEGTASQGQKIPEEFKKFFGDDLPDQPAQPFEGLGSGVIIDAAKGYVLTNNHVINQAQKISIQL  121 (455)
T ss_pred             HHHHHHHhCCcEEEEEEEEeecccccCchhHHHhccccCCccccccccceEEEEEEECCCCEEEeChHHhCCCCEEEEEE
Confidence            35667788889999987653221             012343334456799999996  599999999999999999999


Q ss_pred             cCCCcEEEEEEEEEcCCCCEEEEEEeeCCCCCCeeeEEcCCCC--CCCCcEEEEeecCCCCcceEEEeEEeceeeeeecC
Q 008087          179 RGSDTKYLATVLAIGTECDIAMLTVEDDEFWEGVLPVEFGELP--ALQDAVTVVGYPIGGDTISVTSGVVSRIEILSYVH  256 (578)
Q Consensus       179 ~~~g~~~~a~vv~~d~~~DlAlLkv~~~~~~~~~~~l~l~~~~--~~g~~V~~iG~p~g~~~~sv~~G~Is~~~~~~~~~  256 (578)
                       .|++.|+|++++.|+.+||||||++...   .+++++|+++.  .+|++|+++|||+|+.. +++.|+||++.+.....
T Consensus       122 -~dg~~~~a~vvg~D~~~DlAvlkv~~~~---~l~~~~lg~s~~~~~G~~V~aiG~P~g~~~-tvt~GivS~~~r~~~~~  196 (455)
T PRK10139        122 -NDGREFDAKLIGSDDQSDIALLQIQNPS---KLTQIAIADSDKLRVGDFAVAVGNPFGLGQ-TATSGIISALGRSGLNL  196 (455)
T ss_pred             -CCCCEEEEEEEEEcCCCCEEEEEecCCC---CCceeEecCccccCCCCEEEEEecCCCCCC-ceEEEEEccccccccCC
Confidence             5999999999999999999999998643   68899999865  56999999999999776 89999999987753221


Q ss_pred             CceeeeEEEEecCCCCCCccceEEccCCeEEEEEecccccc-ccCCceeeecchhHHHHHHHHHHcCceeccccCCcccc
Q 008087          257 GSTELLGLQIDAAINSGNSGGPAFNDKGKCVGIAFQSLKHE-DVENIGYVIPTPVIMHFIQDYEKNGAYTGFPLLGVEWQ  335 (578)
Q Consensus       257 ~~~~~~~i~~da~i~~G~SGGPlvn~~G~vVGI~~~~~~~~-~~~~~~~aiPi~~i~~~l~~l~~~g~v~~~~~lGi~~~  335 (578)
                       .....+||+|+++++|||||||||.+|+||||+++.+..+ +..+++||||++.+++++++|+++|++. |+|||+.++
T Consensus       197 -~~~~~~iqtda~in~GnSGGpl~n~~G~vIGi~~~~~~~~~~~~gigfaIP~~~~~~v~~~l~~~g~v~-r~~LGv~~~  274 (455)
T PRK10139        197 -EGLENFIQTDASINRGNSGGALLNLNGELIGINTAILAPGGGSVGIGFAIPSNMARTLAQQLIDFGEIK-RGLLGIKGT  274 (455)
T ss_pred             -CCcceEEEECCccCCCCCcceEECCCCeEEEEEEEEEcCCCCccceEEEEEhHHHHHHHHHHhhcCccc-ccceeEEEE
Confidence             1234689999999999999999999999999999876543 3578999999999999999999999998 999999999


Q ss_pred             cccChhhhhhhccccCCCCcEEEEeCCCCcccC-CCCCCCEEEEECCEEeCCCCCCccccCcchhHHHHHhccCCCCEEE
Q 008087          336 KMENPDLRVAMSMKADQKGVRIRRVDPTAPESE-VLKPSDIILSFDGIDIANDGTVPFRHGERIGFSYLVSQKYTGDSAA  414 (578)
Q Consensus       336 ~~~~~~~~~~lgl~~~~~Gv~V~~V~p~spA~~-GL~~GDiIl~InG~~V~~~~~v~~~~~~~~~~~~~~~~~~~g~~v~  414 (578)
                      .+ ++++++.+|++. ..|++|..|.++|||++ |||+||+|++|||++|.+|.+          +...+.....|+++.
T Consensus       275 ~l-~~~~~~~lgl~~-~~Gv~V~~V~~~SpA~~AGL~~GDvIl~InG~~V~s~~d----------l~~~l~~~~~g~~v~  342 (455)
T PRK10139        275 EM-SADIAKAFNLDV-QRGAFVSEVLPNSGSAKAGVKAGDIITSLNGKPLNSFAE----------LRSRIATTEPGTKVK  342 (455)
T ss_pred             EC-CHHHHHhcCCCC-CCceEEEEECCCChHHHCCCCCCCEEEEECCEECCCHHH----------HHHHHHhcCCCCEEE
Confidence            99 899999999975 67999999999999999 999999999999999999988          556676667889999


Q ss_pred             EEEEECCEEEEEEEEecccccccCCCCCCCCCCceeeccEEEeehHHHHHHhchhhhhhhhhccccccccccccCCCceE
Q 008087          415 VKVLRDSKILNFNITLATHRRLIPSHNKGRPPSYYIIAGFVFSRCLYLISVLSMERIMNMKLRSSFWTSSCIQCHNCQMS  494 (578)
Q Consensus       415 l~V~R~g~~~~~~v~l~~~~~~~~~~~~~~~p~~~~~~Gl~~~~~p~~~~~~g~~~~~~~~~l~~~~~~~~~~~~~~~~v  494 (578)
                      ++|+|+|+.+++++++.......... ....|   .+.|+.+.+.     .+                     .....++
T Consensus       343 l~V~R~G~~~~l~v~~~~~~~~~~~~-~~~~~---~~~g~~l~~~-----~~---------------------~~~~~Gv  392 (455)
T PRK10139        343 LGLLRNGKPLEVEVTLDTSTSSSASA-EMITP---ALQGATLSDG-----QL---------------------KDGTKGI  392 (455)
T ss_pred             EEEEECCEEEEEEEEECCCCCccccc-ccccc---cccccEeccc-----cc---------------------ccCCCce
Confidence            99999999999999875432211100 00000   0223322210     00                     0011345


Q ss_pred             EEEe----ecCCcCCCCCCCEEEEeCCeecCCHHHHHHHHHhcCCCeEEEEEecCeEEEE
Q 008087          495 SLLW----CLRSPLCLNCFNKVLAFNGNPVKNLKSLANMVENCDDEFLKFDLEYDQVVVL  550 (578)
Q Consensus       495 vvs~----v~A~~aGl~~GD~I~~VNG~~V~~~~~l~~~l~~~~~~~v~l~v~R~~~~~l  550 (578)
                      ++..    ++|+++||++||+|++|||++|.+|++|.+++++.+ +.+.|++.|+++.++
T Consensus       393 ~V~~V~~~spA~~aGL~~GD~I~~Ing~~v~~~~~~~~~l~~~~-~~v~l~v~R~g~~~~  451 (455)
T PRK10139        393 KIDEVVKGSPAAQAGLQKDDVIIGVNRDRVNSIAEMRKVLAAKP-AIIALQIVRGNESIY  451 (455)
T ss_pred             EEEEeCCCChHHHcCCCCCCEEEEECCEEcCCHHHHHHHHHhCC-CeEEEEEEECCEEEE
Confidence            5554    489999999999999999999999999999999865 788999999987654


No 2  
>TIGR02037 degP_htrA_DO periplasmic serine protease, Do/DegQ family. This family consists of a set of proteins variously designated DegP, heat shock protein HtrA, and protease DO. The ortholog in Pseudomonas aeruginosa is designated MucD and is found in an operon that controls mucoid phenotype. This family also includes the DegQ (HhoA) paralog in E. coli, which can rescue a DegP mutant, but not the smaller DegS paralog, which cannot. Members of this family are located in the periplasm and have separable functions as both protease and chaperone. Members have a trypsin domain and two copies of a PDZ domain. This protein protects bacteria from thermal and other stresses and may be important for the survival of bacterial pathogens. The chaperone function is dominant at low temperatures, whereas the proteolytic activity is turned on at elevated temperatures.
Probab=100.00  E-value=1.6e-55  Score=474.57  Aligned_cols=391  Identities=22%  Similarity=0.311  Sum_probs=322.6

Q ss_pred             ccccccCCCCcEEEEeeeeCCCCC----------Cccc----------cCCCcceEEEEEEEc-CCEEEecccccCCCce
Q 008087          115 VARVVPAMDAVVKVFCVHTEPNFS----------LPWQ----------RKRQYSSSSSGFAIG-GRRVLTNAHSVEHYTQ  173 (578)
Q Consensus       115 ~~~~~~~~~sVV~I~~~~~~~~~~----------~p~~----------~~~~~~~~GsGfvI~-~g~ILT~aHvV~~~~~  173 (578)
                      .+.++++.+|||.|.+........          ..|.          ......+.||||+|+ +||||||+|||.++..
T Consensus         4 ~~~~~~~~p~vv~i~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~GSGfii~~~G~IlTn~Hvv~~~~~   83 (428)
T TIGR02037         4 APLVEKVAPAVVNISVEGTVKRRNRPPALPPFFRQFFGDDMPNFPRQQRERKVRGLGSGVIISADGYILTNNHVVDGADE   83 (428)
T ss_pred             HHHHHHhCCceEEEEEEEEecccCCCcccchhHHHhhcccccCcccccccccccceeeEEEECCCCEEEEcHHHcCCCCe
Confidence            455677888999998765321100          0010          112345789999998 7899999999999999


Q ss_pred             EEEEEcCCCcEEEEEEEEEcCCCCEEEEEEeeCCCCCCeeeEEcCCCC--CCCCcEEEEeecCCCCcceEEEeEEeceee
Q 008087          174 VKLKKRGSDTKYLATVLAIGTECDIAMLTVEDDEFWEGVLPVEFGELP--ALQDAVTVVGYPIGGDTISVTSGVVSRIEI  251 (578)
Q Consensus       174 i~V~~~~~g~~~~a~vv~~d~~~DlAlLkv~~~~~~~~~~~l~l~~~~--~~g~~V~~iG~p~g~~~~sv~~G~Is~~~~  251 (578)
                      +.|++ .+++.++|++++.|+.+|||||+++...   .++++.|+++.  ..|++|+++|||.+... +++.|+|++..+
T Consensus        84 i~V~~-~~~~~~~a~vv~~d~~~DlAllkv~~~~---~~~~~~l~~~~~~~~G~~v~aiG~p~g~~~-~~t~G~vs~~~~  158 (428)
T TIGR02037        84 ITVTL-SDGREFKAKLVGKDPRTDIAVLKIDAKK---NLPVIKLGDSDKLRVGDWVLAIGNPFGLGQ-TVTSGIVSALGR  158 (428)
T ss_pred             EEEEe-CCCCEEEEEEEEecCCCCEEEEEecCCC---CceEEEccCCCCCCCCCEEEEEECCCcCCC-cEEEEEEEeccc
Confidence            99999 5999999999999999999999999752   68999998744  67999999999999765 899999998876


Q ss_pred             eeecCCceeeeEEEEecCCCCCCccceEEccCCeEEEEEecccccc-ccCCceeeecchhHHHHHHHHHHcCceeccccC
Q 008087          252 LSYVHGSTELLGLQIDAAINSGNSGGPAFNDKGKCVGIAFQSLKHE-DVENIGYVIPTPVIMHFIQDYEKNGAYTGFPLL  330 (578)
Q Consensus       252 ~~~~~~~~~~~~i~~da~i~~G~SGGPlvn~~G~vVGI~~~~~~~~-~~~~~~~aiPi~~i~~~l~~l~~~g~v~~~~~l  330 (578)
                      ... ....+..+||+|+++++|||||||||.+|+||||+++.+... +..+++||||++.+++++++|++++++. |+||
T Consensus       159 ~~~-~~~~~~~~i~tda~i~~GnSGGpl~n~~G~viGI~~~~~~~~g~~~g~~faiP~~~~~~~~~~l~~~g~~~-~~~l  236 (428)
T TIGR02037       159 SGL-GIGDYENFIQTDAAINPGNSGGPLVNLRGEVIGINTAIYSPSGGNVGIGFAIPSNMAKNVVDQLIEGGKVQ-RGWL  236 (428)
T ss_pred             Ccc-CCCCccceEEECCCCCCCCCCCceECCCCeEEEEEeEEEcCCCCccceEEEEEhHHHHHHHHHHHhcCcCc-CCcC
Confidence            532 122233589999999999999999999999999998866532 3568899999999999999999999998 9999


Q ss_pred             CcccccccChhhhhhhccccCCCCcEEEEeCCCCcccC-CCCCCCEEEEECCEEeCCCCCCccccCcchhHHHHHhccCC
Q 008087          331 GVEWQKMENPDLRVAMSMKADQKGVRIRRVDPTAPESE-VLKPSDIILSFDGIDIANDGTVPFRHGERIGFSYLVSQKYT  409 (578)
Q Consensus       331 Gi~~~~~~~~~~~~~lgl~~~~~Gv~V~~V~p~spA~~-GL~~GDiIl~InG~~V~~~~~v~~~~~~~~~~~~~~~~~~~  409 (578)
                      |+.++.+ ++++++.||++. ..|++|..|.++|||++ ||++||+|++|||++|.++.+          +...+.....
T Consensus       237 Gi~~~~~-~~~~~~~lgl~~-~~Gv~V~~V~~~spA~~aGL~~GDvI~~Vng~~i~~~~~----------~~~~l~~~~~  304 (428)
T TIGR02037       237 GVTIQEV-TSDLAKSLGLEK-QRGALVAQVLPGSPAEKAGLKAGDVILSVNGKPISSFAD----------LRRAIGTLKP  304 (428)
T ss_pred             ceEeecC-CHHHHHHcCCCC-CCceEEEEccCCCChHHcCCCCCCEEEEECCEEcCCHHH----------HHHHHHhcCC
Confidence            9999999 899999999986 57999999999999999 999999999999999999988          5567777778


Q ss_pred             CCEEEEEEEECCEEEEEEEEecccccccCCCCCCCCCCceeeccEEEeeh-HHHHHHhchhhhhhhhhcccccccccccc
Q 008087          410 GDSAAVKVLRDSKILNFNITLATHRRLIPSHNKGRPPSYYIIAGFVFSRC-LYLISVLSMERIMNMKLRSSFWTSSCIQC  488 (578)
Q Consensus       410 g~~v~l~V~R~g~~~~~~v~l~~~~~~~~~~~~~~~p~~~~~~Gl~~~~~-p~~~~~~g~~~~~~~~~l~~~~~~~~~~~  488 (578)
                      |+.++++|+|+|+.+++++++...+...       .++...+.|+.+.++ +.+...++...                  
T Consensus       305 g~~v~l~v~R~g~~~~~~v~l~~~~~~~-------~~~~~~~lGi~~~~l~~~~~~~~~l~~------------------  359 (428)
T TIGR02037       305 GKKVTLGILRKGKEKTITVTLGASPEEQ-------ASSSNPFLGLTVANLSPEIRKELRLKG------------------  359 (428)
T ss_pred             CCEEEEEEEECCEEEEEEEEECcCCCcc-------ccccccccceEEecCCHHHHHHcCCCc------------------
Confidence            9999999999999999999987654221       123345789999988 77776665211                  


Q ss_pred             CCCceEEEEee----cCCcCCCCCCCEEEEeCCeecCCHHHHHHHHHhc-CCCeEEEEEecCeEEEE
Q 008087          489 HNCQMSSLLWC----LRSPLCLNCFNKVLAFNGNPVKNLKSLANMVENC-DDEFLKFDLEYDQVVVL  550 (578)
Q Consensus       489 ~~~~~vvvs~v----~A~~aGl~~GD~I~~VNG~~V~~~~~l~~~l~~~-~~~~v~l~v~R~~~~~l  550 (578)
                       ...++++..+    +|+++||++||+|++|||++|.++++|.+++++. .++.+.|++.|+++.++
T Consensus       360 -~~~Gv~V~~V~~~SpA~~aGL~~GDvI~~Ing~~V~s~~d~~~~l~~~~~g~~v~l~v~R~g~~~~  425 (428)
T TIGR02037       360 -DVKGVVVTKVVSGSPAARAGLQPGDVILSVNQQPVSSVAELRKVLDRAKKGGRVALLILRGGATIF  425 (428)
T ss_pred             -CcCceEEEEeCCCCHHHHcCCCCCCEEEEECCEEcCCHHHHHHHHHhcCCCCEEEEEEEECCEEEE
Confidence             1146666654    7999999999999999999999999999999986 57899999999987654


No 3  
>PRK10942 serine endoprotease; Provisional
Probab=100.00  E-value=1.2e-54  Score=469.32  Aligned_cols=349  Identities=24%  Similarity=0.332  Sum_probs=289.6

Q ss_pred             ceEEEEEEEc--CCEEEecccccCCCceEEEEEcCCCcEEEEEEEEEcCCCCEEEEEEeeCCCCCCeeeEEcCCCC--CC
Q 008087          148 SSSSSGFAIG--GRRVLTNAHSVEHYTQVKLKKRGSDTKYLATVLAIGTECDIAMLTVEDDEFWEGVLPVEFGELP--AL  223 (578)
Q Consensus       148 ~~~GsGfvI~--~g~ILT~aHvV~~~~~i~V~~~~~g~~~~a~vv~~d~~~DlAlLkv~~~~~~~~~~~l~l~~~~--~~  223 (578)
                      .+.||||||+  +||||||+|||.++..+.|++ .|+++|+|++++.|+.+||||||++...   .++++.|+++.  ++
T Consensus       110 ~~~GSG~ii~~~~G~IlTn~HVv~~a~~i~V~~-~dg~~~~a~vv~~D~~~DlAvlki~~~~---~l~~~~lg~s~~l~~  185 (473)
T PRK10942        110 MALGSGVIIDADKGYVVTNNHVVDNATKIKVQL-SDGRKFDAKVVGKDPRSDIALIQLQNPK---NLTAIKMADSDALRV  185 (473)
T ss_pred             cceEEEEEEECCCCEEEeChhhcCCCCEEEEEE-CCCCEEEEEEEEecCCCCEEEEEecCCC---CCceeEecCccccCC
Confidence            5689999997  489999999999999999999 5999999999999999999999997543   68899998765  56


Q ss_pred             CCcEEEEeecCCCCcceEEEeEEeceeeeeecCCceeeeEEEEecCCCCCCccceEEccCCeEEEEEecccccc-ccCCc
Q 008087          224 QDAVTVVGYPIGGDTISVTSGVVSRIEILSYVHGSTELLGLQIDAAINSGNSGGPAFNDKGKCVGIAFQSLKHE-DVENI  302 (578)
Q Consensus       224 g~~V~~iG~p~g~~~~sv~~G~Is~~~~~~~~~~~~~~~~i~~da~i~~G~SGGPlvn~~G~vVGI~~~~~~~~-~~~~~  302 (578)
                      |++|+++|+|+++.. +++.|+|+++.+..... ..+..+||+|+++++|||||||+|.+|+||||+++.+..+ +..++
T Consensus       186 G~~V~aiG~P~g~~~-tvt~GiVs~~~r~~~~~-~~~~~~iqtda~i~~GnSGGpL~n~~GeviGI~t~~~~~~g~~~g~  263 (473)
T PRK10942        186 GDYTVAIGNPYGLGE-TVTSGIVSALGRSGLNV-ENYENFIQTDAAINRGNSGGALVNLNGELIGINTAILAPDGGNIGI  263 (473)
T ss_pred             CCEEEEEcCCCCCCc-ceeEEEEEEeecccCCc-ccccceEEeccccCCCCCcCccCCCCCeEEEEEEEEEcCCCCcccE
Confidence            999999999999766 89999999987653221 1234689999999999999999999999999999876543 34679


Q ss_pred             eeeecchhHHHHHHHHHHcCceeccccCCcccccccChhhhhhhccccCCCCcEEEEeCCCCcccC-CCCCCCEEEEECC
Q 008087          303 GYVIPTPVIMHFIQDYEKNGAYTGFPLLGVEWQKMENPDLRVAMSMKADQKGVRIRRVDPTAPESE-VLKPSDIILSFDG  381 (578)
Q Consensus       303 ~~aiPi~~i~~~l~~l~~~g~v~~~~~lGi~~~~~~~~~~~~~lgl~~~~~Gv~V~~V~p~spA~~-GL~~GDiIl~InG  381 (578)
                      +|+||++.+++++++|.++|++. |+|||+.++.+ ++++++.++++. ..|++|..|.++|||++ ||++||+|++|||
T Consensus       264 gfaIP~~~~~~v~~~l~~~g~v~-rg~lGv~~~~l-~~~~a~~~~l~~-~~GvlV~~V~~~SpA~~AGL~~GDvIl~InG  340 (473)
T PRK10942        264 GFAIPSNMVKNLTSQMVEYGQVK-RGELGIMGTEL-NSELAKAMKVDA-QRGAFVSQVLPNSSAAKAGIKAGDVITSLNG  340 (473)
T ss_pred             EEEEEHHHHHHHHHHHHhccccc-cceeeeEeeec-CHHHHHhcCCCC-CCceEEEEECCCChHHHcCCCCCCEEEEECC
Confidence            99999999999999999999998 99999999999 889999999986 67999999999999999 9999999999999


Q ss_pred             EEeCCCCCCccccCcchhHHHHHhccCCCCEEEEEEEECCEEEEEEEEecccccccCCCCCCCCCCceeeccEEEeeh-H
Q 008087          382 IDIANDGTVPFRHGERIGFSYLVSQKYTGDSAAVKVLRDSKILNFNITLATHRRLIPSHNKGRPPSYYIIAGFVFSRC-L  460 (578)
Q Consensus       382 ~~V~~~~~v~~~~~~~~~~~~~~~~~~~g~~v~l~V~R~g~~~~~~v~l~~~~~~~~~~~~~~~p~~~~~~Gl~~~~~-p  460 (578)
                      ++|.+|.+          |...+.....|+++.++|+|+|+.+++.+++.........       ....+.|+....+ +
T Consensus       341 ~~V~s~~d----------l~~~l~~~~~g~~v~l~v~R~G~~~~v~v~l~~~~~~~~~-------~~~~~lGl~g~~l~~  403 (473)
T PRK10942        341 KPISSFAA----------LRAQVGTMPVGSKLTLGLLRDGKPVNVNVELQQSSQNQVD-------SSNIFNGIEGAELSN  403 (473)
T ss_pred             EECCCHHH----------HHHHHHhcCCCCEEEEEEEECCeEEEEEEEeCcCcccccc-------cccccccceeeeccc
Confidence            99999988          5567777778999999999999999999888653211000       0001123222211 0


Q ss_pred             HHHHHhchhhhhhhhhccccccccccccCCCceEEEEe----ecCCcCCCCCCCEEEEeCCeecCCHHHHHHHHHhcCCC
Q 008087          461 YLISVLSMERIMNMKLRSSFWTSSCIQCHNCQMSSLLW----CLRSPLCLNCFNKVLAFNGNPVKNLKSLANMVENCDDE  536 (578)
Q Consensus       461 ~~~~~~g~~~~~~~~~l~~~~~~~~~~~~~~~~vvvs~----v~A~~aGl~~GD~I~~VNG~~V~~~~~l~~~l~~~~~~  536 (578)
                      .                           ....++++..    ++|+++||++||+|++|||++|.++++|.+++++.+ +
T Consensus       404 ~---------------------------~~~~gvvV~~V~~~S~A~~aGL~~GDvIv~VNg~~V~s~~dl~~~l~~~~-~  455 (473)
T PRK10942        404 K---------------------------GGDKGVVVDNVKPGTPAAQIGLKKGDVIIGANQQPVKNIAELRKILDSKP-S  455 (473)
T ss_pred             c---------------------------cCCCCeEEEEeCCCChHHHcCCCCCCEEEEECCEEcCCHHHHHHHHHhCC-C
Confidence            0                           0012455554    489999999999999999999999999999999854 7


Q ss_pred             eEEEEEecCeEEEE
Q 008087          537 FLKFDLEYDQVVVL  550 (578)
Q Consensus       537 ~v~l~v~R~~~~~l  550 (578)
                      .+.|++.|++..++
T Consensus       456 ~v~l~V~R~g~~~~  469 (473)
T PRK10942        456 VLALNIQRGDSSIY  469 (473)
T ss_pred             eEEEEEEECCEEEE
Confidence            89999999987654


No 4  
>TIGR02038 protease_degS periplasmic serine peptidase DegS. This family consists of the periplasmic serine protease DegS (HhoB), a shorter paralog of protease DO (HtrA, DegP) and DegQ (HhoA). It is found in E. coli and several other Proteobacteria of the gamma subdivision. It contains a trypsin domain and a single copy of the PDZ domain (in contrast to DegP with two copies). A critical role of DegS is to sense stress in the periplasm and partially degrade an inhibitor of sigma(E).
Probab=100.00  E-value=2.3e-47  Score=399.49  Aligned_cols=296  Identities=25%  Similarity=0.366  Sum_probs=252.6

Q ss_pred             cccccccCCCCcEEEEeeeeCCCCCCccccCCCcceEEEEEEEc-CCEEEecccccCCCceEEEEEcCCCcEEEEEEEEE
Q 008087          114 GVARVVPAMDAVVKVFCVHTEPNFSLPWQRKRQYSSSSSGFAIG-GRRVLTNAHSVEHYTQVKLKKRGSDTKYLATVLAI  192 (578)
Q Consensus       114 ~~~~~~~~~~sVV~I~~~~~~~~~~~p~~~~~~~~~~GsGfvI~-~g~ILT~aHvV~~~~~i~V~~~~~g~~~~a~vv~~  192 (578)
                      ..+.++++.+|||.|.......+.   + ......+.||||+|+ +||||||+|||.++..+.|++ .||+.++|+++++
T Consensus        47 ~~~~~~~~~psVV~I~~~~~~~~~---~-~~~~~~~~GSG~vi~~~G~IlTn~HVV~~~~~i~V~~-~dg~~~~a~vv~~  121 (351)
T TIGR02038        47 FNKAVRRAAPAVVNIYNRSISQNS---L-NQLSIQGLGSGVIMSKEGYILTNYHVIKKADQIVVAL-QDGRKFEAELVGS  121 (351)
T ss_pred             HHHHHHhcCCcEEEEEeEeccccc---c-ccccccceEEEEEEeCCeEEEecccEeCCCCEEEEEE-CCCCEEEEEEEEe
Confidence            456677888899999986544321   1 122345689999998 789999999999999999999 5899999999999


Q ss_pred             cCCCCEEEEEEeeCCCCCCeeeEEcCCC--CCCCCcEEEEeecCCCCcceEEEeEEeceeeeeecCCceeeeEEEEecCC
Q 008087          193 GTECDIAMLTVEDDEFWEGVLPVEFGEL--PALQDAVTVVGYPIGGDTISVTSGVVSRIEILSYVHGSTELLGLQIDAAI  270 (578)
Q Consensus       193 d~~~DlAlLkv~~~~~~~~~~~l~l~~~--~~~g~~V~~iG~p~g~~~~sv~~G~Is~~~~~~~~~~~~~~~~i~~da~i  270 (578)
                      |+.+||||||++..    .+++++++++  .+.|++|+++|||++... +++.|+|+.+.+..... ..+.++||+|+++
T Consensus       122 d~~~DlAvlkv~~~----~~~~~~l~~s~~~~~G~~V~aiG~P~~~~~-s~t~GiIs~~~r~~~~~-~~~~~~iqtda~i  195 (351)
T TIGR02038       122 DPLTDLAVLKIEGD----NLPTIPVNLDRPPHVGDVVLAIGNPYNLGQ-TITQGIISATGRNGLSS-VGRQNFIQTDAAI  195 (351)
T ss_pred             cCCCCEEEEEecCC----CCceEeccCcCccCCCCEEEEEeCCCCCCC-cEEEEEEEeccCcccCC-CCcceEEEECCcc
Confidence            99999999999976    4677788754  467999999999998765 89999999987654321 2234689999999


Q ss_pred             CCCCccceEEccCCeEEEEEecccccc---ccCCceeeecchhHHHHHHHHHHcCceeccccCCcccccccChhhhhhhc
Q 008087          271 NSGNSGGPAFNDKGKCVGIAFQSLKHE---DVENIGYVIPTPVIMHFIQDYEKNGAYTGFPLLGVEWQKMENPDLRVAMS  347 (578)
Q Consensus       271 ~~G~SGGPlvn~~G~vVGI~~~~~~~~---~~~~~~~aiPi~~i~~~l~~l~~~g~v~~~~~lGi~~~~~~~~~~~~~lg  347 (578)
                      ++|||||||||.+|+||||+++.+...   ...+++|+||++.+++++++|+++|++. |+|||+.++++ ++..++.+|
T Consensus       196 ~~GnSGGpl~n~~G~vIGI~~~~~~~~~~~~~~g~~faIP~~~~~~vl~~l~~~g~~~-r~~lGv~~~~~-~~~~~~~lg  273 (351)
T TIGR02038       196 NAGNSGGALINTNGELVGINTASFQKGGDEGGEGINFAIPIKLAHKIMGKIIRDGRVI-RGYIGVSGEDI-NSVVAQGLG  273 (351)
T ss_pred             CCCCCcceEECCCCeEEEEEeeeecccCCCCccceEEEecHHHHHHHHHHHhhcCccc-ceEeeeEEEEC-CHHHHHhcC
Confidence            999999999999999999998765432   2367899999999999999999999998 99999999998 888899999


Q ss_pred             cccCCCCcEEEEeCCCCcccC-CCCCCCEEEEECCEEeCCCCCCccccCcchhHHHHHhccCCCCEEEEEEEECCEEEEE
Q 008087          348 MKADQKGVRIRRVDPTAPESE-VLKPSDIILSFDGIDIANDGTVPFRHGERIGFSYLVSQKYTGDSAAVKVLRDSKILNF  426 (578)
Q Consensus       348 l~~~~~Gv~V~~V~p~spA~~-GL~~GDiIl~InG~~V~~~~~v~~~~~~~~~~~~~~~~~~~g~~v~l~V~R~g~~~~~  426 (578)
                      ++. ..|++|..|.++|||++ ||++||+|++|||++|.++.+          |.+.+.....|+++.++|+|+|+.+++
T Consensus       274 l~~-~~Gv~V~~V~~~spA~~aGL~~GDvI~~Ing~~V~s~~d----------l~~~l~~~~~g~~v~l~v~R~g~~~~~  342 (351)
T TIGR02038       274 LPD-LRGIVITGVDPNGPAARAGILVRDVILKYDGKDVIGAEE----------LMDRIAETRPGSKVMVTVLRQGKQLEL  342 (351)
T ss_pred             CCc-cccceEeecCCCChHHHCCCCCCCEEEEECCEEcCCHHH----------HHHHHHhcCCCCEEEEEEEECCEEEEE
Confidence            975 57999999999999999 999999999999999999988          556777667899999999999999999


Q ss_pred             EEEeccc
Q 008087          427 NITLATH  433 (578)
Q Consensus       427 ~v~l~~~  433 (578)
                      .+++..+
T Consensus       343 ~v~l~~~  349 (351)
T TIGR02038       343 PVTIDEK  349 (351)
T ss_pred             EEEecCC
Confidence            9988654


No 5  
>PRK10898 serine endoprotease; Provisional
Probab=100.00  E-value=8.7e-47  Score=394.83  Aligned_cols=297  Identities=22%  Similarity=0.331  Sum_probs=250.2

Q ss_pred             cccccccCCCCcEEEEeeeeCCCCCCccccCCCcceEEEEEEEc-CCEEEecccccCCCceEEEEEcCCCcEEEEEEEEE
Q 008087          114 GVARVVPAMDAVVKVFCVHTEPNFSLPWQRKRQYSSSSSGFAIG-GRRVLTNAHSVEHYTQVKLKKRGSDTKYLATVLAI  192 (578)
Q Consensus       114 ~~~~~~~~~~sVV~I~~~~~~~~~~~p~~~~~~~~~~GsGfvI~-~g~ILT~aHvV~~~~~i~V~~~~~g~~~~a~vv~~  192 (578)
                      ....++++.+|||.|........    +.......+.||||+|+ +||||||+|||.++..+.|++ .|++.|+|+++++
T Consensus        47 ~~~~~~~~~psvV~v~~~~~~~~----~~~~~~~~~~GSGfvi~~~G~IlTn~HVv~~a~~i~V~~-~dg~~~~a~vv~~  121 (353)
T PRK10898         47 YNQAVRRAAPAVVNVYNRSLNST----SHNQLEIRTLGSGVIMDQRGYILTNKHVINDADQIIVAL-QDGRVFEALLVGS  121 (353)
T ss_pred             HHHHHHHhCCcEEEEEeEecccc----CcccccccceeeEEEEeCCeEEEecccEeCCCCEEEEEe-CCCCEEEEEEEEE
Confidence            35567778889999998764322    11223345789999998 789999999999999999999 5899999999999


Q ss_pred             cCCCCEEEEEEeeCCCCCCeeeEEcCCC--CCCCCcEEEEeecCCCCcceEEEeEEeceeeeeecCCceeeeEEEEecCC
Q 008087          193 GTECDIAMLTVEDDEFWEGVLPVEFGEL--PALQDAVTVVGYPIGGDTISVTSGVVSRIEILSYVHGSTELLGLQIDAAI  270 (578)
Q Consensus       193 d~~~DlAlLkv~~~~~~~~~~~l~l~~~--~~~g~~V~~iG~p~g~~~~sv~~G~Is~~~~~~~~~~~~~~~~i~~da~i  270 (578)
                      |+.+||||||++..    .+++++|+++  ...|++|+++|||++... +++.|+|++..+..... ....++||+|+++
T Consensus       122 d~~~DlAvl~v~~~----~l~~~~l~~~~~~~~G~~V~aiG~P~g~~~-~~t~Giis~~~r~~~~~-~~~~~~iqtda~i  195 (353)
T PRK10898        122 DSLTDLAVLKINAT----NLPVIPINPKRVPHIGDVVLAIGNPYNLGQ-TITQGIISATGRIGLSP-TGRQNFLQTDASI  195 (353)
T ss_pred             cCCCCEEEEEEcCC----CCCeeeccCcCcCCCCCEEEEEeCCCCcCC-CcceeEEEeccccccCC-ccccceEEecccc
Confidence            99999999999875    4677888764  467999999999998665 89999999887643321 1223689999999


Q ss_pred             CCCCccceEEccCCeEEEEEeccccccc----cCCceeeecchhHHHHHHHHHHcCceeccccCCcccccccChhhhhhh
Q 008087          271 NSGNSGGPAFNDKGKCVGIAFQSLKHED----VENIGYVIPTPVIMHFIQDYEKNGAYTGFPLLGVEWQKMENPDLRVAM  346 (578)
Q Consensus       271 ~~G~SGGPlvn~~G~vVGI~~~~~~~~~----~~~~~~aiPi~~i~~~l~~l~~~g~v~~~~~lGi~~~~~~~~~~~~~l  346 (578)
                      ++|||||||+|.+|+||||+++.+...+    ..+++|+||++.+++++++|+++|++. ++|||+..+.+ ++.....+
T Consensus       196 ~~GnSGGPl~n~~G~vvGI~~~~~~~~~~~~~~~g~~faIP~~~~~~~~~~l~~~G~~~-~~~lGi~~~~~-~~~~~~~~  273 (353)
T PRK10898        196 NHGNSGGALVNSLGELMGINTLSFDKSNDGETPEGIGFAIPTQLATKIMDKLIRDGRVI-RGYIGIGGREI-APLHAQGG  273 (353)
T ss_pred             CCCCCcceEECCCCeEEEEEEEEecccCCCCcccceEEEEchHHHHHHHHHHhhcCccc-ccccceEEEEC-CHHHHHhc
Confidence            9999999999999999999998664322    257899999999999999999999998 89999999988 66666777


Q ss_pred             ccccCCCCcEEEEeCCCCcccC-CCCCCCEEEEECCEEeCCCCCCccccCcchhHHHHHhccCCCCEEEEEEEECCEEEE
Q 008087          347 SMKADQKGVRIRRVDPTAPESE-VLKPSDIILSFDGIDIANDGTVPFRHGERIGFSYLVSQKYTGDSAAVKVLRDSKILN  425 (578)
Q Consensus       347 gl~~~~~Gv~V~~V~p~spA~~-GL~~GDiIl~InG~~V~~~~~v~~~~~~~~~~~~~~~~~~~g~~v~l~V~R~g~~~~  425 (578)
                      ++.. ..|++|..|.++|||++ ||++||+|++|||++|.++.+          +.+.+.....|+.+.++|+|+|+.++
T Consensus       274 ~~~~-~~Gv~V~~V~~~spA~~aGL~~GDvI~~Ing~~V~s~~~----------l~~~l~~~~~g~~v~l~v~R~g~~~~  342 (353)
T PRK10898        274 GIDQ-LQGIVVNEVSPDGPAAKAGIQVNDLIISVNNKPAISALE----------TMDQVAEIRPGSVIPVVVMRDDKQLT  342 (353)
T ss_pred             CCCC-CCeEEEEEECCCChHHHcCCCCCCEEEEECCEEcCCHHH----------HHHHHHhcCCCCEEEEEEEECCEEEE
Confidence            7765 57999999999999999 999999999999999999987          45667666789999999999999999


Q ss_pred             EEEEecccc
Q 008087          426 FNITLATHR  434 (578)
Q Consensus       426 ~~v~l~~~~  434 (578)
                      +.+++..++
T Consensus       343 ~~v~l~~~p  351 (353)
T PRK10898        343 LQVTIQEYP  351 (353)
T ss_pred             EEEEeccCC
Confidence            999887653


No 6  
>KOG1421 consensus Predicted signaling-associated protein (contains a PDZ domain) [General function prediction only]
Probab=100.00  E-value=1.1e-38  Score=334.65  Aligned_cols=376  Identities=16%  Similarity=0.232  Sum_probs=309.0

Q ss_pred             cccccccCCCCcEEEEeeeeCCCCCCccccCCCcceEEEEEEEc--CCEEEecccccCCCceEEEEEcCCCcEEEEEEEE
Q 008087          114 GVARVVPAMDAVVKVFCVHTEPNFSLPWQRKRQYSSSSSGFAIG--GRRVLTNAHSVEHYTQVKLKKRGSDTKYLATVLA  191 (578)
Q Consensus       114 ~~~~~~~~~~sVV~I~~~~~~~~~~~p~~~~~~~~~~GsGfvI~--~g~ILT~aHvV~~~~~i~V~~~~~g~~~~a~vv~  191 (578)
                      +...+..+.+|||.|++.....     |+..-...+.||||+++  .||||||+|+|..+..+.-..+.+.++.+...++
T Consensus        54 w~~~ia~VvksvVsI~~S~v~~-----fdtesag~~~atgfvvd~~~gyiLtnrhvv~pgP~va~avf~n~ee~ei~pvy  128 (955)
T KOG1421|consen   54 WRNTIANVVKSVVSIRFSAVRA-----FDTESAGESEATGFVVDKKLGYILTNRHVVAPGPFVASAVFDNHEEIEIYPVY  128 (955)
T ss_pred             hhhhhhhhcccEEEEEehheee-----cccccccccceeEEEEecccceEEEeccccCCCCceeEEEecccccCCccccc
Confidence            3556667778999999987544     56666778899999998  6999999999998776544444688888999999


Q ss_pred             EcCCCCEEEEEEeeCCC-CCCeeeEEcCC-CCCCCCcEEEEeecCCCCcceEEEeEEeceeeeeecCC-----ceeeeEE
Q 008087          192 IGTECDIAMLTVEDDEF-WEGVLPVEFGE-LPALQDAVTVVGYPIGGDTISVTSGVVSRIEILSYVHG-----STELLGL  264 (578)
Q Consensus       192 ~d~~~DlAlLkv~~~~~-~~~~~~l~l~~-~~~~g~~V~~iG~p~g~~~~sv~~G~Is~~~~~~~~~~-----~~~~~~i  264 (578)
                      .|+.||+.+++++++.. ...+..+.++. ..++|.+++++||..+ +..++..|.++++++....++     .++..++
T Consensus       129 rDpVhdfGf~r~dps~ir~s~vt~i~lap~~akvgseirvvgNDag-EklsIlagflSrldr~apdyg~~~yndfnTfy~  207 (955)
T KOG1421|consen  129 RDPVHDFGFFRYDPSTIRFSIVTEICLAPELAKVGSEIRVVGNDAG-EKLSILAGFLSRLDRNAPDYGEDTYNDFNTFYI  207 (955)
T ss_pred             CCchhhcceeecChhhcceeeeeccccCccccccCCceEEecCCcc-ceEEeehhhhhhccCCCccccccccccccceee
Confidence            99999999999998743 12455566653 5688999999999988 666999999999988644332     2334468


Q ss_pred             EEecCCCCCCccceEEccCCeEEEEEeccccccccCCceeeecchhHHHHHHHHHHcCceeccccCCcccccccChhhhh
Q 008087          265 QIDAAINSGNSGGPAFNDKGKCVGIAFQSLKHEDVENIGYVIPTPVIMHFIQDYEKNGAYTGFPLLGVEWQKMENPDLRV  344 (578)
Q Consensus       265 ~~da~i~~G~SGGPlvn~~G~vVGI~~~~~~~~~~~~~~~aiPi~~i~~~l~~l~~~g~v~~~~~lGi~~~~~~~~~~~~  344 (578)
                      |..+...+|.||+||++.+|..|.++.++.   ...+.+|++|++.+++.|..++++..+. |+.|.++|... ..+.++
T Consensus       208 QaasstsggssgspVv~i~gyAVAl~agg~---~ssas~ffLpLdrV~RaL~clq~n~PIt-RGtLqvefl~k-~~de~r  282 (955)
T KOG1421|consen  208 QAASSTSGGSSGSPVVDIPGYAVALNAGGS---ISSASDFFLPLDRVVRALRCLQNNTPIT-RGTLQVEFLHK-LFDECR  282 (955)
T ss_pred             eehhcCCCCCCCCceecccceEEeeecCCc---ccccccceeeccchhhhhhhhhcCCCcc-cceEEEEEehh-hhHHHH
Confidence            998899999999999999999999998854   4567789999999999999999888888 99999999988 899999


Q ss_pred             hhccccC-----------CCCcE-EEEeCCCCcccCCCCCCCEEEEECCEEeCCCCCCccccCcchhHHHHHhccCCCCE
Q 008087          345 AMSMKAD-----------QKGVR-IRRVDPTAPESEVLKPSDIILSFDGIDIANDGTVPFRHGERIGFSYLVSQKYTGDS  412 (578)
Q Consensus       345 ~lgl~~~-----------~~Gv~-V~~V~p~spA~~GL~~GDiIl~InG~~V~~~~~v~~~~~~~~~~~~~~~~~~~g~~  412 (578)
                      +|||+.+           ..|++ |..|.++|||++.|++||++++||+.-+.++.+          +.+++.. ..|+.
T Consensus       283 rlGL~sE~eqv~r~k~P~~tgmLvV~~vL~~gpa~k~Le~GDillavN~t~l~df~~----------l~~iLDe-gvgk~  351 (955)
T KOG1421|consen  283 RLGLSSEWEQVVRTKFPERTGMLVVETVLPEGPAEKKLEPGDILLAVNSTCLNDFEA----------LEQILDE-GVGKN  351 (955)
T ss_pred             hcCCcHHHHHHHHhcCcccceeEEEEEeccCCchhhccCCCcEEEEEcceehHHHHH----------HHHHHhh-ccCce
Confidence            9999864           45654 567999999999999999999999998888766          3344544 58999


Q ss_pred             EEEEEEECCEEEEEEEEecccccccCCCCCCCCCCceeeccEEEeeh-HHHHHHhchhhhhhhhhccccccccccccCCC
Q 008087          413 AAVKVLRDSKILNFNITLATHRRLIPSHNKGRPPSYYIIAGFVFSRC-LYLISVLSMERIMNMKLRSSFWTSSCIQCHNC  491 (578)
Q Consensus       413 v~l~V~R~g~~~~~~v~l~~~~~~~~~~~~~~~p~~~~~~Gl~~~~~-p~~~~~~g~~~~~~~~~l~~~~~~~~~~~~~~  491 (578)
                      +.|+|+|+|++.++++++++.+...|.       +|+.|+|.+|+++ ++++..+..+.                     
T Consensus       352 l~LtI~Rggqelel~vtvqdlh~itp~-------R~levcGav~hdlsyq~ar~y~lP~---------------------  403 (955)
T KOG1421|consen  352 LELTIQRGGQELELTVTVQDLHGITPD-------RFLEVCGAVFHDLSYQLARLYALPV---------------------  403 (955)
T ss_pred             EEEEEEeCCEEEEEEEEeccccCCCCc-------eEEEEcceEecCCCHHHHhhccccc---------------------
Confidence            999999999999999999999887776       8999999999999 77775554221                     


Q ss_pred             ceEEEEe---ecCCcCCCCCCCEEEEeCCeecCCHHHHHHHHHhcCCC-eEEE
Q 008087          492 QMSSLLW---CLRSPLCLNCFNKVLAFNGNPVKNLKSLANMVENCDDE-FLKF  540 (578)
Q Consensus       492 ~~vvvs~---v~A~~aGl~~GD~I~~VNG~~V~~~~~l~~~l~~~~~~-~v~l  540 (578)
                      +|++++.   +++++.++. |.+|.+||++++.++++|++++++.+++ .|.+
T Consensus       404 ~GvyVa~~~gsf~~~~~~y-~~ii~~vanK~tPdLdaFidvlk~L~dg~rV~v  455 (955)
T KOG1421|consen  404 EGVYVASPGGSFRHRGPRY-GQIIDSVANKPTPDLDAFIDVLKELPDGARVPV  455 (955)
T ss_pred             CcEEEccCCCCccccCCcc-eEEEEeecCCcCCCHHHHHHHHHhccCCCeeeE
Confidence            4788774   356666666 9999999999999999999999998654 4444


No 7  
>KOG1320 consensus Serine protease [Posttranslational modification, protein turnover, chaperones]
Probab=100.00  E-value=6.2e-38  Score=328.32  Aligned_cols=410  Identities=43%  Similarity=0.617  Sum_probs=361.4

Q ss_pred             ccccCCCCcEEEEeeeeCCCCCCccccCCCcceEEEEEEEcCCEEEecccccC---CCceEEEEEcCCCcEEEEEEEEEc
Q 008087          117 RVVPAMDAVVKVFCVHTEPNFSLPWQRKRQYSSSSSGFAIGGRRVLTNAHSVE---HYTQVKLKKRGSDTKYLATVLAIG  193 (578)
Q Consensus       117 ~~~~~~~sVV~I~~~~~~~~~~~p~~~~~~~~~~GsGfvI~~g~ILT~aHvV~---~~~~i~V~~~~~g~~~~a~vv~~d  193 (578)
                      .+..+..|++.+.+..+.+.+..||+..++..+.|+||.+....+|||+|++.   +...+.+..++.-+.|.|++...-
T Consensus        55 ~~~~~~~s~~~v~~~~~~~~~~~pw~~~~q~~~~~s~f~i~~~~lltn~~~v~~~~~~~~v~v~~~gs~~k~~~~v~~~~  134 (473)
T KOG1320|consen   55 VVDLALQSVVKVFSVSTEPSSVLPWQRTRQFSSGGSGFAIYGKKLLTNAHVVAPNNDHKFVTVKKHGSPRKYKAFVAAVF  134 (473)
T ss_pred             CccccccceeEEEeecccccccCcceeeehhcccccchhhcccceeecCccccccccccccccccCCCchhhhhhHHHhh
Confidence            34556679999999999999999999999999999999999999999999999   566677766667778899999999


Q ss_pred             CCCCEEEEEEeeCCCCCCeeeEEcCCCCCCCCcEEEEeecCCCCcceEEEeEEeceeeeeecCCceeeeEEEEecCCCCC
Q 008087          194 TECDIAMLTVEDDEFWEGVLPVEFGELPALQDAVTVVGYPIGGDTISVTSGVVSRIEILSYVHGSTELLGLQIDAAINSG  273 (578)
Q Consensus       194 ~~~DlAlLkv~~~~~~~~~~~l~l~~~~~~g~~V~~iG~p~g~~~~sv~~G~Is~~~~~~~~~~~~~~~~i~~da~i~~G  273 (578)
                      ..+|+|+|.++..+||....++++++.+.+.+.++++|   | +..++|.|.|++.....+.++......+|+++++++|
T Consensus       135 ~~cd~Avv~Ie~~~f~~~~~~~e~~~ip~l~~S~~Vv~---g-d~i~VTnghV~~~~~~~y~~~~~~l~~vqi~aa~~~~  210 (473)
T KOG1320|consen  135 EECDLAVVYIESEEFWKGMNPFELGDIPSLNGSGFVVG---G-DGIIVTNGHVVRVEPRIYAHSSTVLLRVQIDAAIGPG  210 (473)
T ss_pred             hcccceEEEEeeccccCCCcccccCCCcccCccEEEEc---C-CcEEEEeeEEEEEEeccccCCCcceeeEEEEEeecCC
Confidence            99999999999999998888999999999999999999   4 5569999999999988888777777789999999999


Q ss_pred             CccceEEccCCeEEEEEeccccccccCCceeeecchhHHHHHHHHHHcCceeccccCCcccccccChhhhhhhccccCCC
Q 008087          274 NSGGPAFNDKGKCVGIAFQSLKHEDVENIGYVIPTPVIMHFIQDYEKNGAYTGFPLLGVEWQKMENPDLRVAMSMKADQK  353 (578)
Q Consensus       274 ~SGGPlvn~~G~vVGI~~~~~~~~~~~~~~~aiPi~~i~~~l~~l~~~g~v~~~~~lGi~~~~~~~~~~~~~lgl~~~~~  353 (578)
                      +||+|++...+++.|+++..++..  .++++.||.-.+.+|.......+.+.+++++++..+.+++...+..+.|..+ +
T Consensus       211 ~s~ep~i~g~d~~~gvA~l~ik~~--~~i~~~i~~~~~~~~~~G~~~~a~~~~f~~~nt~t~g~vs~~~R~~~~lg~~-~  287 (473)
T KOG1320|consen  211 NSGEPVIVGVDKVAGVAFLKIKTP--ENILYVIPLGVSSHFRTGVEVSAIGNGFGLLNTLTQGMVSGQLRKSFKLGLE-T  287 (473)
T ss_pred             ccCCCeEEccccccceEEEEEecC--CcccceeecceeeeecccceeeccccCceeeeeeeecccccccccccccCcc-c
Confidence            999999998899999999977532  2789999999999999888888888789999999999989999999999887 8


Q ss_pred             CcEEEEeCCCCcccCCCCCCCEEEEECCEEeCCCCCCccccCcchhHHHHHhccCCCCEEEEEEEECCEEEEEEEEeccc
Q 008087          354 GVRIRRVDPTAPESEVLKPSDIILSFDGIDIANDGTVPFRHGERIGFSYLVSQKYTGDSAAVKVLRDSKILNFNITLATH  433 (578)
Q Consensus       354 Gv~V~~V~p~spA~~GL~~GDiIl~InG~~V~~~~~v~~~~~~~~~~~~~~~~~~~g~~v~l~V~R~g~~~~~~v~l~~~  433 (578)
                      |+.+.++.+.+.|.+-++.||.|+++||+.|.    +.++..+|+.|.+++....+++++.+.|+|.+   ++.+.+...
T Consensus       288 g~~i~~~~qtd~ai~~~nsg~~ll~~DG~~Ig----Vn~~~~~ri~~~~~iSf~~p~d~vl~~v~r~~---e~~~~lr~~  360 (473)
T KOG1320|consen  288 GVLISKINQTDAAINPGNSGGPLLNLDGEVIG----VNTRKVTRIGFSHGISFKIPIDTVLVIVLRLG---EFQISLRPV  360 (473)
T ss_pred             ceeeeeecccchhhhcccCCCcEEEecCcEee----eeeeeeEEeeccccceeccCchHhhhhhhhhh---hhceeeccc
Confidence            99999999999888899999999999999997    66778889999999999999999999999988   566777777


Q ss_pred             ccccCCCCCCCCCCceeeccEEEeeh--HHHHHHhchhhhhhhhhccccccccccccCCCceEEEEee----cCCcCCCC
Q 008087          434 RRLIPSHNKGRPPSYYIIAGFVFSRC--LYLISVLSMERIMNMKLRSSFWTSSCIQCHNCQMSSLLWC----LRSPLCLN  507 (578)
Q Consensus       434 ~~~~~~~~~~~~p~~~~~~Gl~~~~~--p~~~~~~g~~~~~~~~~l~~~~~~~~~~~~~~~~vvvs~v----~A~~aGl~  507 (578)
                      ..+.+.+.+...|.|++++||+|.++  +++....                       ..++++++.+    ++..+++.
T Consensus       361 ~~~~p~~~~~g~~s~~i~~g~vf~~~~~~~~~~~~-----------------------~~q~v~is~Vlp~~~~~~~~~~  417 (473)
T KOG1320|consen  361 KPLVPVHQYIGLPSYYIFAGLVFVPLTKSYIFPSG-----------------------VVQLVLVSQVLPGSINGGYGLK  417 (473)
T ss_pred             cCcccccccCCceeEEEecceEEeecCCCcccccc-----------------------ceeEEEEEEeccCCCccccccc
Confidence            77788888999999999999999987  4433111                       1246666655    68888899


Q ss_pred             CCCEEEEeCCeecCCHHHHHHHHHhcCCCeEEEEEecCeEEEEechhhhHHHHHHHHHcCCCcC
Q 008087          508 CFNKVLAFNGNPVKNLKSLANMVENCDDEFLKFDLEYDQVVVLRTKTSKAATLDILATHCIPSA  571 (578)
Q Consensus       508 ~GD~I~~VNG~~V~~~~~l~~~l~~~~~~~v~l~v~R~~~~~l~~~~~~~~~~~i~~~~~i~~~  571 (578)
                      +||+|++|||++|+++.++.++++.+..+        +...+|+++.+|+++..|+.+|++|++
T Consensus       418 ~g~~V~~vng~~V~n~~~l~~~i~~~~~~--------~~v~vl~~~~~e~~tl~Il~~~~~p~~  473 (473)
T KOG1320|consen  418 PGDQVVKVNGKPVKNLKHLYELIEECSTE--------DKVAVLDRRSAEDATLEILPEHKIPSA  473 (473)
T ss_pred             CCCEEEEECCEEeechHHHHHHHHhcCcC--------ceEEEEEecCccceeEEecccccCCCC
Confidence            99999999999999999999999998766        788899999999999999999999985


No 8  
>COG0265 DegQ Trypsin-like serine proteases, typically periplasmic, contain C-terminal PDZ domain [Posttranslational modification, protein turnover, chaperones]
Probab=100.00  E-value=4.9e-36  Score=315.02  Aligned_cols=300  Identities=25%  Similarity=0.364  Sum_probs=250.8

Q ss_pred             cccccccCCCCcEEEEeeeeCCC-CCCccccCCC-cceEEEEEEEc-CCEEEecccccCCCceEEEEEcCCCcEEEEEEE
Q 008087          114 GVARVVPAMDAVVKVFCVHTEPN-FSLPWQRKRQ-YSSSSSGFAIG-GRRVLTNAHSVEHYTQVKLKKRGSDTKYLATVL  190 (578)
Q Consensus       114 ~~~~~~~~~~sVV~I~~~~~~~~-~~~p~~~~~~-~~~~GsGfvI~-~g~ILT~aHvV~~~~~i~V~~~~~g~~~~a~vv  190 (578)
                      ....+.++.++||.|........ ..++-..... ..+.||||+++ +|||+||.|+|.++..+.+.+ .||+.++++++
T Consensus        35 ~~~~~~~~~~~vV~~~~~~~~~~~~~~~~~~~~~~~~~~gSg~i~~~~g~ivTn~hVi~~a~~i~v~l-~dg~~~~a~~v  113 (347)
T COG0265          35 FATAVEKVAPAVVSIATGLTAKLRSFFPSDPPLRSAEGLGSGFIISSDGYIVTNNHVIAGAEEITVTL-ADGREVPAKLV  113 (347)
T ss_pred             HHHHHHhcCCcEEEEEeeeeecchhcccCCcccccccccccEEEEcCCeEEEecceecCCcceEEEEe-CCCCEEEEEEE
Confidence            35566777889999998765442 0000000001 14789999998 899999999999999999999 69999999999


Q ss_pred             EEcCCCCEEEEEEeeCCCCCCeeeEEcCCCC--CCCCcEEEEeecCCCCcceEEEeEEeceeeeeecCCceeeeEEEEec
Q 008087          191 AIGTECDIAMLTVEDDEFWEGVLPVEFGELP--ALQDAVTVVGYPIGGDTISVTSGVVSRIEILSYVHGSTELLGLQIDA  268 (578)
Q Consensus       191 ~~d~~~DlAlLkv~~~~~~~~~~~l~l~~~~--~~g~~V~~iG~p~g~~~~sv~~G~Is~~~~~~~~~~~~~~~~i~~da  268 (578)
                      +.|+..|||+|+++...   .++.+.++++.  .+|++++++|+|+++.. +++.|+|+.+.+...........+||+|+
T Consensus       114 g~d~~~dlavlki~~~~---~~~~~~~~~s~~l~vg~~v~aiGnp~g~~~-tvt~Givs~~~r~~v~~~~~~~~~IqtdA  189 (347)
T COG0265         114 GKDPISDLAVLKIDGAG---GLPVIALGDSDKLRVGDVVVAIGNPFGLGQ-TVTSGIVSALGRTGVGSAGGYVNFIQTDA  189 (347)
T ss_pred             ecCCccCEEEEEeccCC---CCceeeccCCCCcccCCEEEEecCCCCccc-ceeccEEeccccccccCcccccchhhccc
Confidence            99999999999999864   26777888765  45899999999999665 99999999998862222122557899999


Q ss_pred             CCCCCCccceEEccCCeEEEEEeccccccc-cCCceeeecchhHHHHHHHHHHcCceeccccCCcccccccChhhhhhhc
Q 008087          269 AINSGNSGGPAFNDKGKCVGIAFQSLKHED-VENIGYVIPTPVIMHFIQDYEKNGAYTGFPLLGVEWQKMENPDLRVAMS  347 (578)
Q Consensus       269 ~i~~G~SGGPlvn~~G~vVGI~~~~~~~~~-~~~~~~aiPi~~i~~~l~~l~~~g~v~~~~~lGi~~~~~~~~~~~~~lg  347 (578)
                      ++++|+||||++|.+|++|||+++.+...+ ..+++|+||++.+..++.++...|++. ++|+|+.+..+ +.+.+  +|
T Consensus       190 ain~gnsGgpl~n~~g~~iGint~~~~~~~~~~gigfaiP~~~~~~v~~~l~~~G~v~-~~~lgv~~~~~-~~~~~--~g  265 (347)
T COG0265         190 AINPGNSGGPLVNIDGEVVGINTAIIAPSGGSSGIGFAIPVNLVAPVLDELISKGKVV-RGYLGVIGEPL-TADIA--LG  265 (347)
T ss_pred             ccCCCCCCCceEcCCCcEEEEEEEEecCCCCcceeEEEecHHHHHHHHHHHHHcCCcc-ccccceEEEEc-ccccc--cC
Confidence            999999999999999999999999776443 356899999999999999999988888 99999999988 66666  77


Q ss_pred             cccCCCCcEEEEeCCCCcccC-CCCCCCEEEEECCEEeCCCCCCccccCcchhHHHHHhccCCCCEEEEEEEECCEEEEE
Q 008087          348 MKADQKGVRIRRVDPTAPESE-VLKPSDIILSFDGIDIANDGTVPFRHGERIGFSYLVSQKYTGDSAAVKVLRDSKILNF  426 (578)
Q Consensus       348 l~~~~~Gv~V~~V~p~spA~~-GL~~GDiIl~InG~~V~~~~~v~~~~~~~~~~~~~~~~~~~g~~v~l~V~R~g~~~~~  426 (578)
                      ++. ..|++|..|.+++||++ |++.||+|+++||+++.+..+          +...+.....|+.+.+++.|+|+++++
T Consensus       266 ~~~-~~G~~V~~v~~~spa~~agi~~Gdii~~vng~~v~~~~~----------l~~~v~~~~~g~~v~~~~~r~g~~~~~  334 (347)
T COG0265         266 LPV-AAGAVVLGVLPGSPAAKAGIKAGDIITAVNGKPVASLSD----------LVAAVASNRPGDEVALKLLRGGKEREL  334 (347)
T ss_pred             CCC-CCceEEEecCCCChHHHcCCCCCCEEEEECCEEccCHHH----------HHHHHhccCCCCEEEEEEEECCEEEEE
Confidence            774 77899999999999999 999999999999999999887          556677777899999999999999999


Q ss_pred             EEEeccc
Q 008087          427 NITLATH  433 (578)
Q Consensus       427 ~v~l~~~  433 (578)
                      .+++.+.
T Consensus       335 ~v~l~~~  341 (347)
T COG0265         335 AVTLGDR  341 (347)
T ss_pred             EEEecCc
Confidence            9999874


No 9  
>KOG1320 consensus Serine protease [Posttranslational modification, protein turnover, chaperones]
Probab=99.91  E-value=1.9e-23  Score=219.58  Aligned_cols=303  Identities=21%  Similarity=0.221  Sum_probs=225.7

Q ss_pred             cccccccCCCCcEEEEeeeeCCCCCCccccCCCcceEEEEEEEc-CCEEEecccccCCCc-----------eEEEEEc-C
Q 008087          114 GVARVVPAMDAVVKVFCVHTEPNFSLPWQRKRQYSSSSSGFAIG-GRRVLTNAHSVEHYT-----------QVKLKKR-G  180 (578)
Q Consensus       114 ~~~~~~~~~~sVV~I~~~~~~~~~~~p~~~~~~~~~~GsGfvI~-~g~ILT~aHvV~~~~-----------~i~V~~~-~  180 (578)
                      ..........|||.|.....-. ...|+....-....||||||+ +|+|+||+||+....           .+.+... +
T Consensus       130 v~~~~~~cd~Avv~Ie~~~f~~-~~~~~e~~~ip~l~~S~~Vv~gd~i~VTnghV~~~~~~~y~~~~~~l~~vqi~aa~~  208 (473)
T KOG1320|consen  130 VAAVFEECDLAVVYIESEEFWK-GMNPFELGDIPSLNGSGFVVGGDGIIVTNGHVVRVEPRIYAHSSTVLLRVQIDAAIG  208 (473)
T ss_pred             HHHhhhcccceEEEEeeccccC-CCcccccCCCcccCccEEEEcCCcEEEEeeEEEEEEeccccCCCcceeeEEEEEeec
Confidence            3455666778899998743211 122455555566789999998 999999999997432           2555552 1


Q ss_pred             CCcEEEEEEEEEcCCCCEEEEEEeeCCCCCCeeeEEcCCCC--CCCCcEEEEeecCCCCcceEEEeEEeceeeeeecCC-
Q 008087          181 SDTKYLATVLAIGTECDIAMLTVEDDEFWEGVLPVEFGELP--ALQDAVTVVGYPIGGDTISVTSGVVSRIEILSYVHG-  257 (578)
Q Consensus       181 ~g~~~~a~vv~~d~~~DlAlLkv~~~~~~~~~~~l~l~~~~--~~g~~V~~iG~p~g~~~~sv~~G~Is~~~~~~~~~~-  257 (578)
                      .+..+++.+.+.|+..|+|+++++..+  .-..+++++-+.  ..|+++.++|+|+++.+ +.+.|+++...|..+..+ 
T Consensus       209 ~~~s~ep~i~g~d~~~gvA~l~ik~~~--~i~~~i~~~~~~~~~~G~~~~a~~~~f~~~n-t~t~g~vs~~~R~~~~lg~  285 (473)
T KOG1320|consen  209 PGNSGEPVIVGVDKVAGVAFLKIKTPE--NILYVIPLGVSSHFRTGVEVSAIGNGFGLLN-TLTQGMVSGQLRKSFKLGL  285 (473)
T ss_pred             CCccCCCeEEccccccceEEEEEecCC--cccceeecceeeeecccceeeccccCceeee-eeeecccccccccccccCc
Confidence            248899999999999999999997654  125666666544  45899999999999877 899999998877655422 


Q ss_pred             ---ceeeeEEEEecCCCCCCccceEEccCCeEEEEEecccccc-ccCCceeeecchhHHHHHHHHHHcCc---ee-----
Q 008087          258 ---STELLGLQIDAAINSGNSGGPAFNDKGKCVGIAFQSLKHE-DVENIGYVIPTPVIMHFIQDYEKNGA---YT-----  325 (578)
Q Consensus       258 ---~~~~~~i~~da~i~~G~SGGPlvn~~G~vVGI~~~~~~~~-~~~~~~~aiPi~~i~~~l~~l~~~g~---v~-----  325 (578)
                         ....+++|+|++++.|+||||++|.+|++||++++...+- -..+++|++|.+.++.++.+..+...   ..     
T Consensus       286 ~~g~~i~~~~qtd~ai~~~nsg~~ll~~DG~~IgVn~~~~~ri~~~~~iSf~~p~d~vl~~v~r~~e~~~~lr~~~~~~p  365 (473)
T KOG1320|consen  286 ETGVLISKINQTDAAINPGNSGGPLLNLDGEVIGVNTRKVTRIGFSHGISFKIPIDTVLVIVLRLGEFQISLRPVKPLVP  365 (473)
T ss_pred             ccceeeeeecccchhhhcccCCCcEEEecCcEeeeeeeeeEEeeccccceeccCchHhhhhhhhhhhhceeeccccCccc
Confidence               2345689999999999999999999999999998855321 23678999999999998887743221   11     


Q ss_pred             ccccCCcccccccChhhh-----hhhcccc-CCCCcEEEEeCCCCcccC-CCCCCCEEEEECCEEeCCCCCCccccCcch
Q 008087          326 GFPLLGVEWQKMENPDLR-----VAMSMKA-DQKGVRIRRVDPTAPESE-VLKPSDIILSFDGIDIANDGTVPFRHGERI  398 (578)
Q Consensus       326 ~~~~lGi~~~~~~~~~~~-----~~lgl~~-~~~Gv~V~~V~p~spA~~-GL~~GDiIl~InG~~V~~~~~v~~~~~~~~  398 (578)
                      -+.|+|.....+ ...+.     +.+-.+. ..++++|..|.|++++.. ++++||+|++|||++|.+..+         
T Consensus       366 ~~~~~g~~s~~i-~~g~vf~~~~~~~~~~~~~~q~v~is~Vlp~~~~~~~~~~~g~~V~~vng~~V~n~~~---------  435 (473)
T KOG1320|consen  366 VHQYIGLPSYYI-FAGLVFVPLTKSYIFPSGVVQLVLVSQVLPGSINGGYGLKPGDQVVKVNGKPVKNLKH---------  435 (473)
T ss_pred             ccccCCceeEEE-ecceEEeecCCCccccccceeEEEEEEeccCCCcccccccCCCEEEEECCEEeechHH---------
Confidence            134677665554 22221     1121221 135899999999999999 999999999999999999998         


Q ss_pred             hHHHHHhccCCCCEEEEEEEECCEEEEEEEEec
Q 008087          399 GFSYLVSQKYTGDSAAVKVLRDSKILNFNITLA  431 (578)
Q Consensus       399 ~~~~~~~~~~~g~~v~l~V~R~g~~~~~~v~l~  431 (578)
                       +.+++.....++++.+..+|+.+..++.+...
T Consensus       436 -l~~~i~~~~~~~~v~vl~~~~~e~~tl~Il~~  467 (473)
T KOG1320|consen  436 -LYELIEECSTEDKVAVLDRRSAEDATLEILPE  467 (473)
T ss_pred             -HHHHHHhcCcCceEEEEEecCccceeEEeccc
Confidence             55788888788889888888888888877654


No 10 
>KOG1421 consensus Predicted signaling-associated protein (contains a PDZ domain) [General function prediction only]
Probab=99.86  E-value=1.7e-19  Score=190.97  Aligned_cols=362  Identities=14%  Similarity=0.132  Sum_probs=260.2

Q ss_pred             ccCCCCcEEEEeeeeCCCCCCccccCCCcceEEEEEEEc--CCEEEecccccC-CCceEEEEEcCCCcEEEEEEEEEcCC
Q 008087          119 VPAMDAVVKVFCVHTEPNFSLPWQRKRQYSSSSSGFAIG--GRRVLTNAHSVE-HYTQVKLKKRGSDTKYLATVLAIGTE  195 (578)
Q Consensus       119 ~~~~~sVV~I~~~~~~~~~~~p~~~~~~~~~~GsGfvI~--~g~ILT~aHvV~-~~~~i~V~~~~~g~~~~a~vv~~d~~  195 (578)
                      +....+.|.+.+.......+.     ......|||.|++  +|++++++.+|. +.....|++ +|...++|.+.+.++.
T Consensus       525 ~~i~~~~~~v~~~~~~~l~g~-----s~~i~kgt~~i~d~~~g~~vvsr~~vp~d~~d~~vt~-~dS~~i~a~~~fL~~t  598 (955)
T KOG1421|consen  525 ADISNCLVDVEPMMPVNLDGV-----SSDIYKGTALIMDTSKGLGVVSRSVVPSDAKDQRVTE-ADSDGIPANVSFLHPT  598 (955)
T ss_pred             hHHhhhhhhheeceeeccccc-----hhhhhcCceEEEEccCCceeEecccCCchhhceEEee-cccccccceeeEecCc
Confidence            334457777776664443221     1123469999997  799999999997 567888988 5778899999999999


Q ss_pred             CCEEEEEEeeCCCCCCeeeEEcCCCC-CCCCcEEEEeecCCCCc----ceEEEeEEeceeeeeec-CCceeeeEEEEecC
Q 008087          196 CDIAMLTVEDDEFWEGVLPVEFGELP-ALQDAVTVVGYPIGGDT----ISVTSGVVSRIEILSYV-HGSTELLGLQIDAA  269 (578)
Q Consensus       196 ~DlAlLkv~~~~~~~~~~~l~l~~~~-~~g~~V~~iG~p~g~~~----~sv~~G~Is~~~~~~~~-~~~~~~~~i~~da~  269 (578)
                      .++|.+|+++.-    ...+.|.+.. ..|+++...|+......    .+++.-.+......... ....+++.|.+++.
T Consensus       599 ~n~a~~kydp~~----~~~~kl~~~~v~~gD~~~f~g~~~~~r~ltaktsv~dvs~~~~ps~~~pr~r~~n~e~Is~~~n  674 (955)
T KOG1421|consen  599 ENVASFKYDPAL----EVQLKLTDTTVLRGDECTFEGFTEDLRALTAKTSVTDVSVVIIPSSVMPRFRATNLEVISFMDN  674 (955)
T ss_pred             cceeEeccChhH----hhhhccceeeEecCCceeEecccccchhhcccceeeeeEEEEecCCCCcceeecceEEEEEecc
Confidence            999999999863    3344554422 45899999998865432    12222111111111111 11234567888777


Q ss_pred             CCCCCccceEEccCCeEEEEEecccccc-c--cCCceeeecchhHHHHHHHHHHcCceeccccCCcccccccChhhhhhh
Q 008087          270 INSGNSGGPAFNDKGKCVGIAFQSLKHE-D--VENIGYVIPTPVIMHFIQDYEKNGAYTGFPLLGVEWQKMENPDLRVAM  346 (578)
Q Consensus       270 i~~G~SGGPlvn~~G~vVGI~~~~~~~~-~--~~~~~~aiPi~~i~~~l~~l~~~g~v~~~~~lGi~~~~~~~~~~~~~l  346 (578)
                      +..++--|-+.|.+|+|+|++...++.. +  ...+-|.+.+..++..|+.|+.++... ...+|++|..+ +...++.+
T Consensus       675 lsT~c~sg~ltdddg~vvalwl~~~ge~~~~kd~~y~~gl~~~~~l~vl~rlk~g~~~r-p~i~~vef~~i-~laqar~l  752 (955)
T KOG1421|consen  675 LSTSCLSGRLTDDDGEVVALWLSVVGEDVGGKDYTYKYGLSMSYILPVLERLKLGPSAR-PTIAGVEFSHI-TLAQARTL  752 (955)
T ss_pred             ccccccceEEECCCCeEEEEEeeeeccccCCceeEEEeccchHHHHHHHHHHhcCCCCC-ceeeccceeeE-Eeehhhcc
Confidence            7766666788899999999998765432 1  122456899999999999998877765 55679999998 88888889


Q ss_pred             ccccC------------CCCcEEEEeCCCCcccCCCCCCCEEEEECCEEeCCCCCCccccCcchhHHHHHhccCCCCEEE
Q 008087          347 SMKAD------------QKGVRIRRVDPTAPESEVLKPSDIILSFDGIDIANDGTVPFRHGERIGFSYLVSQKYTGDSAA  414 (578)
Q Consensus       347 gl~~~------------~~Gv~V~~V~p~spA~~GL~~GDiIl~InG~~V~~~~~v~~~~~~~~~~~~~~~~~~~g~~v~  414 (578)
                      |++.+            .+-.+|.+|.+.-+  +-|..||+|+++||+-|+...++          ++ +.      .+.
T Consensus       753 glp~e~imk~e~es~~~~ql~~ishv~~~~~--kil~~gdiilsvngk~itr~~dl----------~d-~~------eid  813 (955)
T KOG1421|consen  753 GLPSEFIMKSEEESTIPRQLYVISHVRPLLH--KILGVGDIILSVNGKMITRLSDL----------HD-FE------EID  813 (955)
T ss_pred             CCCHHHHhhhhhcCCCcceEEEEEeeccCcc--cccccccEEEEecCeEEeeehhh----------hh-hh------hhh
Confidence            98864            13356788877543  25999999999999999998883          33 21      567


Q ss_pred             EEEEECCEEEEEEEEecccccccCCCCCCCCCCceeeccEEEeeh-HHHHHHhchhhhhhhhhccccccccccccCCCce
Q 008087          415 VKVLRDSKILNFNITLATHRRLIPSHNKGRPPSYYIIAGFVFSRC-LYLISVLSMERIMNMKLRSSFWTSSCIQCHNCQM  493 (578)
Q Consensus       415 l~V~R~g~~~~~~v~l~~~~~~~~~~~~~~~p~~~~~~Gl~~~~~-p~~~~~~g~~~~~~~~~l~~~~~~~~~~~~~~~~  493 (578)
                      ..|+|+|.++++.+++-...         ..-++.+|.|..+++. .++...+-   +.                  .++
T Consensus       814 ~~ilrdg~~~~ikipt~p~~---------et~r~vi~~gailq~ph~av~~q~e---dl------------------p~g  863 (955)
T KOG1421|consen  814 AVILRDGIEMEIKIPTYPEY---------ETSRAVIWMGAILQPPHSAVFEQVE---DL------------------PEG  863 (955)
T ss_pred             eeeeecCcEEEEEecccccc---------ccceEEEEEeccccCchHHHHHHHh---cc------------------CCc
Confidence            89999999999998875432         1227889999999998 66654443   11                  146


Q ss_pred             EEEE----eecCCcCCCCCCCEEEEeCCeecCCHHHHHHHHHhcCCC-eEEEEE
Q 008087          494 SSLL----WCLRSPLCLNCFNKVLAFNGNPVKNLKSLANMVENCDDE-FLKFDL  542 (578)
Q Consensus       494 vvvs----~v~A~~aGl~~GD~I~~VNG~~V~~~~~l~~~l~~~~~~-~v~l~v  542 (578)
                      |++.    ++||.+ +|.+--.|++|||..+.++++|..++...++. ++++..
T Consensus       864 vyvt~rg~gspalq-~l~aa~fitavng~~t~~lddf~~~~~~ipdnsyv~v~~  916 (955)
T KOG1421|consen  864 VYVTSRGYGSPALQ-MLRAAHFITAVNGHDTNTLDDFYHMLLEIPDNSYVQVKQ  916 (955)
T ss_pred             eEEeecccCChhHh-hcchheeEEEecccccCcHHHHHHHHhhCCCCceEEEEE
Confidence            7776    357777 89999999999999999999999999998765 666554


No 11 
>PRK10779 zinc metallopeptidase RseP; Provisional
Probab=99.68  E-value=8.4e-17  Score=174.48  Aligned_cols=153  Identities=12%  Similarity=0.170  Sum_probs=113.9

Q ss_pred             cEEEEeCCCCcccC-CCCCCCEEEEECCEEeCCCCCCccccCcchhHHHHHhccCCCCEEEEEEEECCEEEEEEEEeccc
Q 008087          355 VRIRRVDPTAPESE-VLKPSDIILSFDGIDIANDGTVPFRHGERIGFSYLVSQKYTGDSAAVKVLRDSKILNFNITLATH  433 (578)
Q Consensus       355 v~V~~V~p~spA~~-GL~~GDiIl~InG~~V~~~~~v~~~~~~~~~~~~~~~~~~~g~~v~l~V~R~g~~~~~~v~l~~~  433 (578)
                      .+|.+|.++|||++ |||+||+|++|||++|.+|.+          +...+.....|++++++|.|+|+.+++++++...
T Consensus       128 ~lV~~V~~~SpA~kAGLk~GDvI~~vnG~~V~~~~~----------l~~~v~~~~~g~~v~v~v~R~gk~~~~~v~l~~~  197 (449)
T PRK10779        128 PVVGEIAPNSIAAQAQIAPGTELKAVDGIETPDWDA----------VRLALVSKIGDESTTITVAPFGSDQRRDKTLDLR  197 (449)
T ss_pred             ccccccCCCCHHHHcCCCCCCEEEEECCEEcCCHHH----------HHHHHHhhccCCceEEEEEeCCccceEEEEeccc
Confidence            46899999999999 999999999999999999988          4566777778899999999999998888888543


Q ss_pred             ccccCCCCCCCCCCceeeccEEEeeh-HHHHHHhchhhhhhhhhccccccccccccCCCceEEEEeecCCcCCCCCCCEE
Q 008087          434 RRLIPSHNKGRPPSYYIIAGFVFSRC-LYLISVLSMERIMNMKLRSSFWTSSCIQCHNCQMSSLLWCLRSPLCLNCFNKV  512 (578)
Q Consensus       434 ~~~~~~~~~~~~p~~~~~~Gl~~~~~-p~~~~~~g~~~~~~~~~l~~~~~~~~~~~~~~~~vvvs~v~A~~aGl~~GD~I  512 (578)
                      +........  . .   ..++++.++ +.....                          =..|..+++|+++||++||+|
T Consensus       198 ~~~~~~~~~--~-~---~~~lGl~~~~~~~~~v--------------------------V~~V~~~SpA~~AGL~~GDvI  245 (449)
T PRK10779        198 HWAFEPDKQ--D-P---VSSLGIRPRGPQIEPV--------------------------LAEVQPNSAASKAGLQAGDRI  245 (449)
T ss_pred             ccccCcccc--c-h---hhcccccccCCCcCcE--------------------------EEeeCCCCHHHHcCCCCCCEE
Confidence            221110000  0 0   111222221 100000                          012334568999999999999


Q ss_pred             EEeCCeecCCHHHHHHHHHhcCCCeEEEEEecCeEEE
Q 008087          513 LAFNGNPVKNLKSLANMVENCDDEFLKFDLEYDQVVV  549 (578)
Q Consensus       513 ~~VNG~~V~~~~~l~~~l~~~~~~~v~l~v~R~~~~~  549 (578)
                      ++|||++|++|+++.+.++..+++.+.+++.|+++.+
T Consensus       246 l~Ing~~V~s~~dl~~~l~~~~~~~v~l~v~R~g~~~  282 (449)
T PRK10779        246 VKVDGQPLTQWQTFVTLVRDNPGKPLALEIERQGSPL  282 (449)
T ss_pred             EEECCEEcCCHHHHHHHHHhCCCCEEEEEEEECCEEE
Confidence            9999999999999999999888889999999997653


No 12 
>PF13365 Trypsin_2:  Trypsin-like peptidase domain; PDB: 1Y8T_A 2Z9I_A 3QO6_A 1L1J_A 1QY6_A 2O8L_A 3OTP_E 2ZLE_I 1KY9_A 3CS0_A ....
Probab=99.59  E-value=2.6e-14  Score=125.93  Aligned_cols=108  Identities=33%  Similarity=0.491  Sum_probs=71.8

Q ss_pred             EEEEEEcC-CEEEecccccC--------CCceEEEEEcCCCcEEE--EEEEEEcCC-CCEEEEEEeeCCCCCCeeeEEcC
Q 008087          151 SSGFAIGG-RRVLTNAHSVE--------HYTQVKLKKRGSDTKYL--ATVLAIGTE-CDIAMLTVEDDEFWEGVLPVEFG  218 (578)
Q Consensus       151 GsGfvI~~-g~ILT~aHvV~--------~~~~i~V~~~~~g~~~~--a~vv~~d~~-~DlAlLkv~~~~~~~~~~~l~l~  218 (578)
                      ||||+|++ |+||||+|||.        ....+.+... ++..+.  +++++.|+. +|||||+++.             
T Consensus         1 GTGf~i~~~g~ilT~~Hvv~~~~~~~~~~~~~~~~~~~-~~~~~~~~~~~~~~~~~~~D~All~v~~-------------   66 (120)
T PF13365_consen    1 GTGFLIGPDGYILTAAHVVEDWNDGKQPDNSSVEVVFP-DGRRVPPVAEVVYFDPDDYDLALLKVDP-------------   66 (120)
T ss_dssp             EEEEEEETTTEEEEEHHHHTCCTT--G-TCSEEEEEET-TSCEEETEEEEEEEETT-TTEEEEEESC-------------
T ss_pred             CEEEEEcCCceEEEchhheecccccccCCCCEEEEEec-CCCEEeeeEEEEEECCccccEEEEEEec-------------
Confidence            89999985 59999999999        4567888874 666777  999999999 9999999990             


Q ss_pred             CCCCCCCcEEEEeecCCCCcceEEEeEEeceeeeeecCCceeeeEEEEecCCCCCCccceEEccCCeEEEE
Q 008087          219 ELPALQDAVTVVGYPIGGDTISVTSGVVSRIEILSYVHGSTELLGLQIDAAINSGNSGGPAFNDKGKCVGI  289 (578)
Q Consensus       219 ~~~~~g~~V~~iG~p~g~~~~sv~~G~Is~~~~~~~~~~~~~~~~i~~da~i~~G~SGGPlvn~~G~vVGI  289 (578)
                              ....+..      ....+..........  .......+ +++.+.+|+|||||||.+|+||||
T Consensus        67 --------~~~~~~~------~~~~~~~~~~~~~~~--~~~~~~~~-~~~~~~~G~SGgpv~~~~G~vvGi  120 (120)
T PF13365_consen   67 --------WTGVGGG------VRVPGSTSGVSPTST--NDNRMLYI-TDADTRPGSSGGPVFDSDGRVVGI  120 (120)
T ss_dssp             --------EEEEEEE------EEEEEEEEEEEEEEE--EETEEEEE-ESSS-STTTTTSEEEETTSEEEEE
T ss_pred             --------ccceeee------eEeeeeccccccccC--cccceeEe-eecccCCCcEeHhEECCCCEEEeC
Confidence                    0000000      000000000000000  00111124 899999999999999999999997


No 13 
>TIGR00054 RIP metalloprotease RseP. A model that detects fragments as well matches a number of members of the PEPTIDASE FAMILY S2C. The region of match appears not to overlap the active site domain.
Probab=99.54  E-value=2.7e-14  Score=153.52  Aligned_cols=135  Identities=17%  Similarity=0.189  Sum_probs=105.1

Q ss_pred             CCcEEEEeCCCCcccC-CCCCCCEEEEECCEEeCCCCCCccccCcchhHHHHHhccCCCCEEEEEEEECCEEEEEEEEec
Q 008087          353 KGVRIRRVDPTAPESE-VLKPSDIILSFDGIDIANDGTVPFRHGERIGFSYLVSQKYTGDSAAVKVLRDSKILNFNITLA  431 (578)
Q Consensus       353 ~Gv~V~~V~p~spA~~-GL~~GDiIl~InG~~V~~~~~v~~~~~~~~~~~~~~~~~~~g~~v~l~V~R~g~~~~~~v~l~  431 (578)
                      .|.+|.+|.++|||++ |||+||+|++|||+++.++.++          ...+....  +++.+++.|+|+..++.+++.
T Consensus       128 ~g~~V~~V~~~SpA~~AGL~~GDvI~~vng~~v~~~~dl----------~~~ia~~~--~~v~~~I~r~g~~~~l~v~l~  195 (420)
T TIGR00054       128 VGPVIELLDKNSIALEAGIEPGDEILSVNGNKIPGFKDV----------RQQIADIA--GEPMVEILAERENWTFEVMKE  195 (420)
T ss_pred             CCceeeccCCCCHHHHcCCCCCCEEEEECCEEcCCHHHH----------HHHHHhhc--ccceEEEEEecCceEeccccc
Confidence            4889999999999999 9999999999999999999884          45555544  678899999988766544332


Q ss_pred             ccccccCCCCCCCCCCceeeccEEEeehHHHHHHhchhhhhhhhhccccccccccccCCCceEEEEeecCCcCCCCCCCE
Q 008087          432 THRRLIPSHNKGRPPSYYIIAGFVFSRCLYLISVLSMERIMNMKLRSSFWTSSCIQCHNCQMSSLLWCLRSPLCLNCFNK  511 (578)
Q Consensus       432 ~~~~~~~~~~~~~~p~~~~~~Gl~~~~~p~~~~~~g~~~~~~~~~l~~~~~~~~~~~~~~~~vvvs~v~A~~aGl~~GD~  511 (578)
                      -...         .|.    .+..+                                    ..+.++++|+++||++||+
T Consensus       196 ~~~~---------~~~----~g~vV------------------------------------~~V~~~SpA~~aGL~~GD~  226 (420)
T TIGR00054       196 LIPR---------GPK----IEPVL------------------------------------SDVTPNSPAEKAGLKEGDY  226 (420)
T ss_pred             ceec---------CCC----cCcEE------------------------------------EEECCCCHHHHcCCCCCCE
Confidence            1100         000    00000                                    1233456899999999999


Q ss_pred             EEEeCCeecCCHHHHHHHHHhcCCCeEEEEEecCeEE
Q 008087          512 VLAFNGNPVKNLKSLANMVENCDDEFLKFDLEYDQVV  548 (578)
Q Consensus       512 I~~VNG~~V~~~~~l~~~l~~~~~~~v~l~v~R~~~~  548 (578)
                      |++|||++|++|+|+.+.+++.+++.+.+++.|+++.
T Consensus       227 Iv~Vng~~V~s~~dl~~~l~~~~~~~v~l~v~R~g~~  263 (420)
T TIGR00054       227 IQSINGEKLRSWTDFVSAVKENPGKSMDIKVERNGET  263 (420)
T ss_pred             EEEECCEECCCHHHHHHHHHhCCCCceEEEEEECCEE
Confidence            9999999999999999999998888899999999765


No 14 
>PF00089 Trypsin:  Trypsin;  InterPro: IPR001254 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold. Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity.
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base. The geometric orientations of the catalytic residues are similar between families, despite different protein folds. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC). This group of serine proteases belongs to the MEROPS peptidase family S1 (chymotrypsin family, clan PA(S)) and to peptidase family S6 (Hap serine peptidases). The chymotrypsin family is almost totally confined to animals, although trypsin-like enzymes are found in actinomycetes of the genera Streptomyces and Saccharopolyspora, and in the fungus Fusarium oxysporum. The enzymes are inherently secreted, being synthesised with a signal peptide that targets them to the secretory pathway. Animal enzymes are either secreted directly, packaged into vesicles for regulated secretion, or are retained in leukocyte granules. The Hap family, 'Haemophilus adhesion and penetration', are proteins that play a role in the interaction with human epithelial cells. The serine protease activity is localized at the N-terminal domain, whereas the binding domain is in the C-terminal region.
; GO: 0004252 serine-type endopeptidase activity, 0006508 proteolysis; PDB: 1SPJ_A 1A5I_A 2ZGH_A 2ZKS_A 2ZGJ_A 2ZGC_A 2ODP_A 2I6Q_A 2I6S_A 2ODQ_A ....
Probab=99.42  E-value=2.3e-12  Score=125.39  Aligned_cols=182  Identities=22%  Similarity=0.291  Sum_probs=116.1

Q ss_pred             CCCCCCccccCCCc---ceEEEEEEEcCCEEEecccccCCCceEEEEEcC------CC--cEEEEEEEEEc----C---C
Q 008087          134 EPNFSLPWQRKRQY---SSSSSGFAIGGRRVLTNAHSVEHYTQVKLKKRG------SD--TKYLATVLAIG----T---E  195 (578)
Q Consensus       134 ~~~~~~p~~~~~~~---~~~GsGfvI~~g~ILT~aHvV~~~~~i~V~~~~------~g--~~~~a~vv~~d----~---~  195 (578)
                      .....+||......   ...|+|++|++.+|||+|||+.....+.+.+..      ++  ..+..+-+..+    .   .
T Consensus         7 ~~~~~~p~~v~i~~~~~~~~C~G~li~~~~vLTaahC~~~~~~~~v~~g~~~~~~~~~~~~~~~v~~~~~h~~~~~~~~~   86 (220)
T PF00089_consen    7 ASPGEFPWVVSIRYSNGRFFCTGTLISPRWVLTAAHCVDGASDIKVRLGTYSIRNSDGSEQTIKVSKIIIHPKYDPSTYD   86 (220)
T ss_dssp             CGTTSSTTEEEEEETTTEEEEEEEEEETTEEEEEGGGHTSGGSEEEEESESBTTSTTTTSEEEEEEEEEEETTSBTTTTT
T ss_pred             CCCCCCCeEEEEeeCCCCeeEeEEeccccccccccccccccccccccccccccccccccccccccccccccccccccccc
Confidence            33445566554332   457999999999999999999996667775531      22  23343333332    2   5


Q ss_pred             CCEEEEEEeeC-CCCCCeeeEEcCCC---CCCCCcEEEEeecCCCCcc---eEE---EeEEeceeeeeecCCceeeeEEE
Q 008087          196 CDIAMLTVEDD-EFWEGVLPVEFGEL---PALQDAVTVVGYPIGGDTI---SVT---SGVVSRIEILSYVHGSTELLGLQ  265 (578)
Q Consensus       196 ~DlAlLkv~~~-~~~~~~~~l~l~~~---~~~g~~V~~iG~p~g~~~~---sv~---~G~Is~~~~~~~~~~~~~~~~i~  265 (578)
                      +|||||+++.+ .+...+.++.+...   ...++.+.++|++......   .+.   ..+++...+............++
T Consensus        87 ~DiAll~L~~~~~~~~~~~~~~l~~~~~~~~~~~~~~~~G~~~~~~~~~~~~~~~~~~~~~~~~~c~~~~~~~~~~~~~c  166 (220)
T PF00089_consen   87 NDIALLKLDRPITFGDNIQPICLPSAGSDPNVGTSCIVVGWGRTSDNGYSSNLQSVTVPVVSRKTCRSSYNDNLTPNMIC  166 (220)
T ss_dssp             TSEEEEEESSSSEHBSSBEESBBTSTTHTTTTTSEEEEEESSBSSTTSBTSBEEEEEEEEEEHHHHHHHTTTTSTTTEEE
T ss_pred             cccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccc
Confidence            79999999987 33456788888762   2568999999998753221   233   33333322222111111123466


Q ss_pred             Eec----CCCCCCccceEEccCCeEEEEEeccccccccCCceeeecchhHHHHH
Q 008087          266 IDA----AINSGNSGGPAFNDKGKCVGIAFQSLKHEDVENIGYVIPTPVIMHFI  315 (578)
Q Consensus       266 ~da----~i~~G~SGGPlvn~~G~vVGI~~~~~~~~~~~~~~~aiPi~~i~~~l  315 (578)
                      +..    ..+.|+|||||++.++.++||++...........++++++...++++
T Consensus       167 ~~~~~~~~~~~g~sG~pl~~~~~~lvGI~s~~~~c~~~~~~~v~~~v~~~~~WI  220 (220)
T PF00089_consen  167 AGSSGSGDACQGDSGGPLICNNNYLVGIVSFGENCGSPNYPGVYTRVSSYLDWI  220 (220)
T ss_dssp             EETTSSSBGGTTTTTSEEEETTEEEEEEEEEESSSSBTTSEEEEEEGGGGHHHH
T ss_pred             ccccccccccccccccccccceeeecceeeecCCCCCCCcCEEEEEHHHhhccC
Confidence            655    78899999999998778999999864332223357888888777654


No 15 
>PF13180 PDZ_2:  PDZ domain; PDB: 2L97_A 1Y8T_A 2Z9I_A 1LCY_A 2PZD_B 2P3W_A 1VCW_C 1TE0_B 1SOZ_C 1SOT_C ....
Probab=99.41  E-value=4.1e-13  Score=110.94  Aligned_cols=81  Identities=32%  Similarity=0.520  Sum_probs=69.1

Q ss_pred             ccCCcccccccChhhhhhhccccCCCCcEEEEeCCCCcccC-CCCCCCEEEEECCEEeCCCCCCccccCcchhHHHHHhc
Q 008087          328 PLLGVEWQKMENPDLRVAMSMKADQKGVRIRRVDPTAPESE-VLKPSDIILSFDGIDIANDGTVPFRHGERIGFSYLVSQ  406 (578)
Q Consensus       328 ~~lGi~~~~~~~~~~~~~lgl~~~~~Gv~V~~V~p~spA~~-GL~~GDiIl~InG~~V~~~~~v~~~~~~~~~~~~~~~~  406 (578)
                      ||||+.+... +.           ..|++|..|.++|||++ ||++||+|++|||++|.++.+          |...+..
T Consensus         1 ~~lGv~~~~~-~~-----------~~g~~V~~V~~~spA~~aGl~~GD~I~~ing~~v~~~~~----------~~~~l~~   58 (82)
T PF13180_consen    1 GGLGVTVQNL-SD-----------TGGVVVVSVIPGSPAAKAGLQPGDIILAINGKPVNSSED----------LVNILSK   58 (82)
T ss_dssp             -E-SEEEEEC-SC-----------SSSEEEEEESTTSHHHHTTS-TTEEEEEETTEESSSHHH----------HHHHHHC
T ss_pred             CEECeEEEEc-cC-----------CCeEEEEEeCCCCcHHHCCCCCCcEEEEECCEEcCCHHH----------HHHHHHh
Confidence            6899999876 21           35999999999999999 999999999999999999887          5677778


Q ss_pred             cCCCCEEEEEEEECCEEEEEEEEe
Q 008087          407 KYTGDSAAVKVLRDSKILNFNITL  430 (578)
Q Consensus       407 ~~~g~~v~l~V~R~g~~~~~~v~l  430 (578)
                      ...|++++|+|+|+|+.++++++|
T Consensus        59 ~~~g~~v~l~v~R~g~~~~~~v~l   82 (82)
T PF13180_consen   59 GKPGDTVTLTVLRDGEELTVEVTL   82 (82)
T ss_dssp             SSTTSEEEEEEEETTEEEEEEEE-
T ss_pred             CCCCCEEEEEEEECCEEEEEEEEC
Confidence            889999999999999999999875


No 16 
>cd00190 Tryp_SPc Trypsin-like serine protease; Many of these are synthesized as inactive precursor zymogens that are cleaved during limited proteolysis to generate their active forms. The alignment also contains inactive enzymes that have substitutions of the catalytic triad residues.
Probab=99.33  E-value=4.5e-11  Score=117.19  Aligned_cols=168  Identities=21%  Similarity=0.207  Sum_probs=98.8

Q ss_pred             ceEEEEEEEcCCEEEecccccCCC--ceEEEEEcCC--------CcEEEEEEEEEc-------CCCCEEEEEEeeCC-CC
Q 008087          148 SSSSSGFAIGGRRVLTNAHSVEHY--TQVKLKKRGS--------DTKYLATVLAIG-------TECDIAMLTVEDDE-FW  209 (578)
Q Consensus       148 ~~~GsGfvI~~g~ILT~aHvV~~~--~~i~V~~~~~--------g~~~~a~vv~~d-------~~~DlAlLkv~~~~-~~  209 (578)
                      ...|+|++|++.+|||+|||+.+.  ..+.|.+...        ...+..+-+..+       ..+|||||+++.+. +.
T Consensus        24 ~~~C~GtlIs~~~VLTaAhC~~~~~~~~~~v~~g~~~~~~~~~~~~~~~v~~~~~hp~y~~~~~~~DiAll~L~~~~~~~  103 (232)
T cd00190          24 RHFCGGSLISPRWVLTAAHCVYSSAPSNYTVRLGSHDLSSNEGGGQVIKVKKVIVHPNYNPSTYDNDIALLKLKRPVTLS  103 (232)
T ss_pred             cEEEEEEEeeCCEEEECHHhcCCCCCccEEEEeCcccccCCCCceEEEEEEEEEECCCCCCCCCcCCEEEEEECCcccCC
Confidence            468999999999999999999875  4566665211        122333334444       35899999999763 33


Q ss_pred             CCeeeEEcCCC---CCCCCcEEEEeecCCCCc-------ceEEEeEEeceeeeeecC--CceeeeEEEE-----ecCCCC
Q 008087          210 EGVLPVEFGEL---PALQDAVTVVGYPIGGDT-------ISVTSGVVSRIEILSYVH--GSTELLGLQI-----DAAINS  272 (578)
Q Consensus       210 ~~~~~l~l~~~---~~~g~~V~~iG~p~g~~~-------~sv~~G~Is~~~~~~~~~--~~~~~~~i~~-----da~i~~  272 (578)
                      ..+.|+.|...   ...++.+.++||......       ......+++...+.....  .......++.     ....|.
T Consensus       104 ~~v~picl~~~~~~~~~~~~~~~~G~g~~~~~~~~~~~~~~~~~~~~~~~~C~~~~~~~~~~~~~~~C~~~~~~~~~~c~  183 (232)
T cd00190         104 DNVRPICLPSSGYNLPAGTTCTVSGWGRTSEGGPLPDVLQEVNVPIVSNAECKRAYSYGGTITDNMLCAGGLEGGKDACQ  183 (232)
T ss_pred             CcccceECCCccccCCCCCEEEEEeCCcCCCCCCCCceeeEEEeeeECHHHhhhhccCcccCCCceEeeCCCCCCCcccc
Confidence            45788888764   244789999998764321       111222222222221111  0011122333     335788


Q ss_pred             CCccceEEccC---CeEEEEEeccccccccCCceeeecchhHHHHH
Q 008087          273 GNSGGPAFNDK---GKCVGIAFQSLKHEDVENIGYVIPTPVIMHFI  315 (578)
Q Consensus       273 G~SGGPlvn~~---G~vVGI~~~~~~~~~~~~~~~aiPi~~i~~~l  315 (578)
                      |+|||||+...   +.++||.+....-......+.+..+...++++
T Consensus       184 gdsGgpl~~~~~~~~~lvGI~s~g~~c~~~~~~~~~t~v~~~~~WI  229 (232)
T cd00190         184 GDSGGPLVCNDNGRGVLVGIVSWGSGCARPNYPGVYTRVSSYLDWI  229 (232)
T ss_pred             CCCCCcEEEEeCCEEEEEEEEehhhccCCCCCCCEEEEcHHhhHHh
Confidence            99999999864   79999998744211112333444455554444


No 17 
>cd00987 PDZ_serine_protease PDZ domain of trypsin-like serine proteases, such as DegP/HtrA, which are oligomeric proteins involved in heat-shock response, chaperone function, and apoptosis. May be responsible for substrate recognition and/or binding, as most PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of protease-associated PDZ domains a C-terminal beta-strand forms the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins.
Probab=99.22  E-value=4e-11  Score=100.46  Aligned_cols=88  Identities=34%  Similarity=0.584  Sum_probs=74.0

Q ss_pred             ccCCcccccccChhhhhhhccccCCCCcEEEEeCCCCcccC-CCCCCCEEEEECCEEeCCCCCCccccCcchhHHHHHhc
Q 008087          328 PLLGVEWQKMENPDLRVAMSMKADQKGVRIRRVDPTAPESE-VLKPSDIILSFDGIDIANDGTVPFRHGERIGFSYLVSQ  406 (578)
Q Consensus       328 ~~lGi~~~~~~~~~~~~~lgl~~~~~Gv~V~~V~p~spA~~-GL~~GDiIl~InG~~V~~~~~v~~~~~~~~~~~~~~~~  406 (578)
                      +|+|+.++.+ ++.....+++.. ..|++|..|.++|||++ ||++||+|++|||++|.++.+          +...+..
T Consensus         1 ~~~G~~~~~~-~~~~~~~~~~~~-~~g~~V~~v~~~s~a~~~gl~~GD~I~~Ing~~i~~~~~----------~~~~l~~   68 (90)
T cd00987           1 PWLGVTVQDL-TPDLAEELGLKD-TKGVLVASVDPGSPAAKAGLKPGDVILAVNGKPVKSVAD----------LRRALAE   68 (90)
T ss_pred             CccceEEeEC-CHHHHHHcCCCC-CCEEEEEEECCCCHHHHcCCCcCCEEEEECCEECCCHHH----------HHHHHHh
Confidence            5899999998 666666666643 56999999999999998 999999999999999999987          4566666


Q ss_pred             cCCCCEEEEEEEECCEEEEEE
Q 008087          407 KYTGDSAAVKVLRDSKILNFN  427 (578)
Q Consensus       407 ~~~g~~v~l~V~R~g~~~~~~  427 (578)
                      ...++.+.+++.|+|+..++.
T Consensus        69 ~~~~~~i~l~v~r~g~~~~~~   89 (90)
T cd00987          69 LKPGDKVTLTVLRGGKELTVT   89 (90)
T ss_pred             cCCCCEEEEEEEECCEEEEee
Confidence            556889999999999876654


No 18 
>smart00020 Tryp_SPc Trypsin-like serine protease. Many of these are synthesised as inactive precursor zymogens that are cleaved during limited proteolysis to generate their active forms. A few, however, are active as single chain molecules, and others are inactive due to substitutions of the catalytic triad residues.
Probab=99.21  E-value=2.3e-10  Score=112.29  Aligned_cols=147  Identities=22%  Similarity=0.237  Sum_probs=90.8

Q ss_pred             ceEEEEEEEcCCEEEecccccCCCc--eEEEEEcCCC-------cEEEEEEEEEc-------CCCCEEEEEEeeCC-CCC
Q 008087          148 SSSSSGFAIGGRRVLTNAHSVEHYT--QVKLKKRGSD-------TKYLATVLAIG-------TECDIAMLTVEDDE-FWE  210 (578)
Q Consensus       148 ~~~GsGfvI~~g~ILT~aHvV~~~~--~i~V~~~~~g-------~~~~a~vv~~d-------~~~DlAlLkv~~~~-~~~  210 (578)
                      ...|+|++|++.+|||+|||+.+..  .+.|.+....       ..+...-+..+       ..+|||||+++.+. +..
T Consensus        25 ~~~C~GtlIs~~~VLTaahC~~~~~~~~~~v~~g~~~~~~~~~~~~~~v~~~~~~p~~~~~~~~~DiAll~L~~~i~~~~  104 (229)
T smart00020       25 RHFCGGSLISPRWVLTAAHCVYGSDPSNIRVRLGSHDLSSGEEGQVIKVSKVIIHPNYNPSTYDNDIALLKLKSPVTLSD  104 (229)
T ss_pred             CcEEEEEEecCCEEEECHHHcCCCCCcceEEEeCcccCCCCCCceEEeeEEEEECCCCCCCCCcCCEEEEEECcccCCCC
Confidence            4579999999999999999998753  6777773222       22333434432       45899999998762 334


Q ss_pred             CeeeEEcCCC---CCCCCcEEEEeecCCCCc-----ceEE---EeEEeceeeeeecCC--ceeeeEEEE-----ecCCCC
Q 008087          211 GVLPVEFGEL---PALQDAVTVVGYPIGGDT-----ISVT---SGVVSRIEILSYVHG--STELLGLQI-----DAAINS  272 (578)
Q Consensus       211 ~~~~l~l~~~---~~~g~~V~~iG~p~g~~~-----~sv~---~G~Is~~~~~~~~~~--~~~~~~i~~-----da~i~~  272 (578)
                      .+.|+.|...   ...+..+.++||+.....     ....   .-+++...+......  ......++.     ....++
T Consensus       105 ~~~pi~l~~~~~~~~~~~~~~~~g~g~~~~~~~~~~~~~~~~~~~~~~~~~C~~~~~~~~~~~~~~~C~~~~~~~~~~c~  184 (229)
T smart00020      105 NVRPICLPSSNYNVPAGTTCTVSGWGRTSEGAGSLPDTLQEVNVPIVSNATCRRAYSGGGAITDNMLCAGGLEGGKDACQ  184 (229)
T ss_pred             ceeeccCCCcccccCCCCEEEEEeCCCCCCCCCcCCCEeeEEEEEEeCHHHhhhhhccccccCCCcEeecCCCCCCcccC
Confidence            5788888763   345788999998765320     0111   122222111111000  001112222     345788


Q ss_pred             CCccceEEccCC--eEEEEEeccc
Q 008087          273 GNSGGPAFNDKG--KCVGIAFQSL  294 (578)
Q Consensus       273 G~SGGPlvn~~G--~vVGI~~~~~  294 (578)
                      |+|||||+...+  .++||++...
T Consensus       185 gdsG~pl~~~~~~~~l~Gi~s~g~  208 (229)
T smart00020      185 GDSGGPLVCNDGRWVLVGIVSWGS  208 (229)
T ss_pred             CCCCCeeEEECCCEEEEEEEEECC
Confidence            999999998654  9999998743


No 19 
>cd00986 PDZ_LON_protease PDZ domain of ATP-dependent LON serine proteases. Most PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this bacterial subfamily of protease-associated PDZ domains a C-terminal beta-strand is thought to form the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins.
Probab=99.08  E-value=5.8e-10  Score=91.31  Aligned_cols=72  Identities=28%  Similarity=0.371  Sum_probs=63.3

Q ss_pred             CCCcEEEEeCCCCcccCCCCCCCEEEEECCEEeCCCCCCccccCcchhHHHHHhccCCCCEEEEEEEECCEEEEEEEEec
Q 008087          352 QKGVRIRRVDPTAPESEVLKPSDIILSFDGIDIANDGTVPFRHGERIGFSYLVSQKYTGDSAAVKVLRDSKILNFNITLA  431 (578)
Q Consensus       352 ~~Gv~V~~V~p~spA~~GL~~GDiIl~InG~~V~~~~~v~~~~~~~~~~~~~~~~~~~g~~v~l~V~R~g~~~~~~v~l~  431 (578)
                      ..|++|..|.++|||+.||++||+|++|||+++.+|.+          +..++.....|+.+.+++.|+|+..++++++.
T Consensus         7 ~~Gv~V~~V~~~s~A~~gL~~GD~I~~Ing~~v~~~~~----------~~~~l~~~~~~~~v~l~v~r~g~~~~~~v~l~   76 (79)
T cd00986           7 YHGVYVTSVVEGMPAAGKLKAGDHIIAVDGKPFKEAEE----------LIDYIQSKKEGDTVKLKVKREEKELPEDLILK   76 (79)
T ss_pred             ecCEEEEEECCCCchhhCCCCCCEEEEECCEECCCHHH----------HHHHHHhCCCCCEEEEEEEECCEEEEEEEEEe
Confidence            35899999999999988999999999999999999987          55667655678899999999999999999887


Q ss_pred             cc
Q 008087          432 TH  433 (578)
Q Consensus       432 ~~  433 (578)
                      .+
T Consensus        77 ~~   78 (79)
T cd00986          77 TF   78 (79)
T ss_pred             cc
Confidence            54


No 20 
>cd00991 PDZ_archaeal_metalloprotease PDZ domain of archaeal zinc metalloproteases, presumably membrane-associated or integral membrane proteases, which may be involved in signalling and regulatory mechanisms. May be responsible for substrate recognition and/or binding, as most PDZ domains bind C-terminal polypeptides, and binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of protease-associated PDZ domains a C-terminal beta-strand forms the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins.
Probab=99.05  E-value=7.6e-10  Score=90.66  Aligned_cols=68  Identities=28%  Similarity=0.317  Sum_probs=59.8

Q ss_pred             CCCcEEEEeCCCCcccC-CCCCCCEEEEECCEEeCCCCCCccccCcchhHHHHHhccCCCCEEEEEEEECCEEEEEEEE
Q 008087          352 QKGVRIRRVDPTAPESE-VLKPSDIILSFDGIDIANDGTVPFRHGERIGFSYLVSQKYTGDSAAVKVLRDSKILNFNIT  429 (578)
Q Consensus       352 ~~Gv~V~~V~p~spA~~-GL~~GDiIl~InG~~V~~~~~v~~~~~~~~~~~~~~~~~~~g~~v~l~V~R~g~~~~~~v~  429 (578)
                      ..|++|..|.++|||++ ||++||+|++|||+++.+|.+          |...+.....|+.+.+++.|+|+..+++++
T Consensus         9 ~~Gv~V~~V~~~spa~~aGL~~GDiI~~Ing~~v~~~~d----------~~~~l~~~~~g~~v~l~v~r~g~~~~~~~~   77 (79)
T cd00991           9 VAGVVIVGVIVGSPAENAVLHTGDVIYSINGTPITTLED----------FMEALKPTKPGEVITVTVLPSTTKLTNVST   77 (79)
T ss_pred             CCcEEEEEECCCChHHhcCCCCCCEEEEECCEEcCCHHH----------HHHHHhcCCCCCEEEEEEEECCEEEEEEEE
Confidence            46999999999999998 999999999999999999988          556676655688999999999998887764


No 21 
>cd00990 PDZ_glycyl_aminopeptidase PDZ domain associated with archaeal and bacterial M61 glycyl-aminopeptidases. May be responsible for substrate recognition and/or binding, as most PDZ domains bind C-terminal polypeptides, and binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of protease-associated PDZ domains a C-terminal beta-strand is presumed to form the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins.
Probab=99.03  E-value=6.9e-10  Score=90.93  Aligned_cols=77  Identities=22%  Similarity=0.384  Sum_probs=63.2

Q ss_pred             ccCCcccccccChhhhhhhccccCCCCcEEEEeCCCCcccC-CCCCCCEEEEECCEEeCCCCCCccccCcchhHHHHHhc
Q 008087          328 PLLGVEWQKMENPDLRVAMSMKADQKGVRIRRVDPTAPESE-VLKPSDIILSFDGIDIANDGTVPFRHGERIGFSYLVSQ  406 (578)
Q Consensus       328 ~~lGi~~~~~~~~~~~~~lgl~~~~~Gv~V~~V~p~spA~~-GL~~GDiIl~InG~~V~~~~~v~~~~~~~~~~~~~~~~  406 (578)
                      +|+|+.+..-              ..|++|..|.++|||++ ||++||+|++|||+++.+|.+             ++..
T Consensus         1 ~~~G~~~~~~--------------~~~~~V~~V~~~s~a~~aGl~~GD~I~~Ing~~v~~~~~-------------~l~~   53 (80)
T cd00990           1 PYLGLTLDKE--------------EGLGKVTFVRDDSPADKAGLVAGDELVAVNGWRVDALQD-------------RLKE   53 (80)
T ss_pred             CcccEEEEcc--------------CCcEEEEEECCCChHHHhCCCCCCEEEEECCEEhHHHHH-------------HHHh
Confidence            5778777532              34799999999999999 999999999999999998654             3444


Q ss_pred             cCCCCEEEEEEEECCEEEEEEEEec
Q 008087          407 KYTGDSAAVKVLRDSKILNFNITLA  431 (578)
Q Consensus       407 ~~~g~~v~l~V~R~g~~~~~~v~l~  431 (578)
                      ...++.+.+++.|+|+..++.+++.
T Consensus        54 ~~~~~~v~l~v~r~g~~~~~~v~~~   78 (80)
T cd00990          54 YQAGDPVELTVFRDDRLIEVPLTLA   78 (80)
T ss_pred             cCCCCEEEEEEEECCEEEEEEEEec
Confidence            4567899999999999988887764


No 22 
>TIGR01713 typeII_sec_gspC general secretion pathway protein C. This model represents GspC, protein C of the main terminal branch of the general secretion pathway, also called type II secretion. This system transports folded proteins across the bacterial outer membrane and is widely distributed in Gram-negative pathogens.
Probab=99.00  E-value=9.4e-10  Score=110.27  Aligned_cols=100  Identities=15%  Similarity=0.183  Sum_probs=85.5

Q ss_pred             hhHHHHHHHHHHcCceeccccCCcccccccChhhhhhhccccCCCCcEEEEeCCCCcccC-CCCCCCEEEEECCEEeCCC
Q 008087          309 PVIMHFIQDYEKNGAYTGFPLLGVEWQKMENPDLRVAMSMKADQKGVRIRRVDPTAPESE-VLKPSDIILSFDGIDIAND  387 (578)
Q Consensus       309 ~~i~~~l~~l~~~g~v~~~~~lGi~~~~~~~~~~~~~lgl~~~~~Gv~V~~V~p~spA~~-GL~~GDiIl~InG~~V~~~  387 (578)
                      ..+.++++++.+++.+. ++|+|+..... +          ....|++|..+.++++|++ |||+||+|++|||+++.++
T Consensus       159 ~~~~~v~~~l~~~g~~~-~~~lgi~p~~~-~----------g~~~G~~v~~v~~~s~a~~aGLr~GDvIv~ING~~i~~~  226 (259)
T TIGR01713       159 VVSRRIIEELTKDPQKM-FDYIRLSPVMK-N----------DKLEGYRLNPGKDPSLFYKSGLQDGDIAVALNGLDLRDP  226 (259)
T ss_pred             hhHHHHHHHHHHCHHhh-hheEeEEEEEe-C----------CceeEEEEEecCCCCHHHHcCCCCCCEEEEECCEEcCCH
Confidence            45678899999999888 89999987543 1          1245999999999999999 9999999999999999999


Q ss_pred             CCCccccCcchhHHHHHhccCCCCEEEEEEEECCEEEEEEEEe
Q 008087          388 GTVPFRHGERIGFSYLVSQKYTGDSAAVKVLRDSKILNFNITL  430 (578)
Q Consensus       388 ~~v~~~~~~~~~~~~~~~~~~~g~~v~l~V~R~g~~~~~~v~l  430 (578)
                      .+          +..++.....++.+.|+|+|+|+.+++.+.+
T Consensus       227 ~~----------~~~~l~~~~~~~~v~l~V~R~G~~~~i~v~~  259 (259)
T TIGR01713       227 EQ----------AFQALQMLREETNLTLTVERDGQREDIYVRF  259 (259)
T ss_pred             HH----------HHHHHHhcCCCCeEEEEEEECCEEEEEEEEC
Confidence            88          5577777778899999999999998887753


No 23 
>cd00989 PDZ_metalloprotease PDZ domain of bacterial and plant zinc metalloproteases, presumably membrane-associated or integral membrane proteases, which may be involved in signalling and regulatory mechanisms. May be responsible for substrate recognition and/or binding, as most PDZ domains bind C-terminal polypeptides, and binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of protease-associated PDZ domains a C-terminal beta-strand forms the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins.
Probab=98.82  E-value=1.2e-08  Score=83.15  Aligned_cols=64  Identities=23%  Similarity=0.360  Sum_probs=54.5

Q ss_pred             CcEEEEeCCCCcccC-CCCCCCEEEEECCEEeCCCCCCccccCcchhHHHHHhccCCCCEEEEEEEECCEEEEEEE
Q 008087          354 GVRIRRVDPTAPESE-VLKPSDIILSFDGIDIANDGTVPFRHGERIGFSYLVSQKYTGDSAAVKVLRDSKILNFNI  428 (578)
Q Consensus       354 Gv~V~~V~p~spA~~-GL~~GDiIl~InG~~V~~~~~v~~~~~~~~~~~~~~~~~~~g~~v~l~V~R~g~~~~~~v  428 (578)
                      .++|..|.++|+|++ ||++||+|++|||+++.+|.+          +...+... .++.+.+++.|+|+..++.+
T Consensus        13 ~~~V~~v~~~s~a~~~gl~~GD~I~~ing~~i~~~~~----------~~~~l~~~-~~~~~~l~v~r~~~~~~~~l   77 (79)
T cd00989          13 EPVIGEVVPGSPAAKAGLKAGDRILAINGQKIKSWED----------LVDAVQEN-PGKPLTLTVERNGETITLTL   77 (79)
T ss_pred             CcEEEeECCCCHHHHcCCCCCCEEEEECCEECCCHHH----------HHHHHHHC-CCceEEEEEEECCEEEEEEe
Confidence            588999999999998 999999999999999999987          44555554 37789999999998776665


No 24 
>TIGR02037 degP_htrA_DO periplasmic serine protease, Do/DegQ family. This family consists of a set of proteins variously designated DegP, heat shock protein HtrA, and protease DO. The ortholog in Pseudomonas aeruginosa is designated MucD and is found in an operon that controls mucoid phenotype. This family also includes the DegQ (HhoA) paralog in E. coli which can rescue a DegP mutant, but not the smaller DegS paralog, which cannot. Members of this family are located in the periplasm and have separable functions as both protease and chaperone. Members have a trypsin domain and two copies of a PDZ domain. This protein protects bacteria from thermal and other stresses and may be important for the survival of bacterial pathogens. The chaperone function is dominant at low temperatures, whereas the proteolytic activity is turned on at elevated temperatures.
Probab=98.81  E-value=1e-08  Score=111.15  Aligned_cols=90  Identities=24%  Similarity=0.477  Sum_probs=79.4

Q ss_pred             cccCCcccccccChhhhhhhccccCCCCcEEEEeCCCCcccC-CCCCCCEEEEECCEEeCCCCCCccccCcchhHHHHHh
Q 008087          327 FPLLGVEWQKMENPDLRVAMSMKADQKGVRIRRVDPTAPESE-VLKPSDIILSFDGIDIANDGTVPFRHGERIGFSYLVS  405 (578)
Q Consensus       327 ~~~lGi~~~~~~~~~~~~~lgl~~~~~Gv~V~~V~p~spA~~-GL~~GDiIl~InG~~V~~~~~v~~~~~~~~~~~~~~~  405 (578)
                      +.|+|+.+..+ ++..++.++++....|++|..|.++|||++ ||++||+|++|||++|.++.+          |..++.
T Consensus       337 ~~~lGi~~~~l-~~~~~~~~~l~~~~~Gv~V~~V~~~SpA~~aGL~~GDvI~~Ing~~V~s~~d----------~~~~l~  405 (428)
T TIGR02037       337 NPFLGLTVANL-SPEIRKELRLKGDVKGVVVTKVVSGSPAARAGLQPGDVILSVNQQPVSSVAE----------LRKVLD  405 (428)
T ss_pred             ccccceEEecC-CHHHHHHcCCCcCcCceEEEEeCCCCHHHHcCCCCCCEEEEECCEEcCCHHH----------HHHHHH
Confidence            46899999998 888888899887567999999999999999 999999999999999999988          567777


Q ss_pred             ccCCCCEEEEEEEECCEEEEEE
Q 008087          406 QKYTGDSAAVKVLRDSKILNFN  427 (578)
Q Consensus       406 ~~~~g~~v~l~V~R~g~~~~~~  427 (578)
                      ....++.+.|+|+|+|+...+.
T Consensus       406 ~~~~g~~v~l~v~R~g~~~~~~  427 (428)
T TIGR02037       406 RAKKGGRVALLILRGGATIFVT  427 (428)
T ss_pred             hcCCCCEEEEEEEECCEEEEEE
Confidence            7667899999999999977654


No 25 
>COG3591 V8-like Glu-specific endopeptidase [Amino acid transport and metabolism]
Probab=98.75  E-value=9.4e-08  Score=94.05  Aligned_cols=162  Identities=22%  Similarity=0.248  Sum_probs=93.4

Q ss_pred             ceEEEEEEEcCCEEEecccccCCCc----eEEEEEc---CCCc-EE--EEEEEEEc-C---CCCEEEEEEeeCCCC----
Q 008087          148 SSSSSGFAIGGRRVLTNAHSVEHYT----QVKLKKR---GSDT-KY--LATVLAIG-T---ECDIAMLTVEDDEFW----  209 (578)
Q Consensus       148 ~~~GsGfvI~~g~ILT~aHvV~~~~----~i~V~~~---~~g~-~~--~a~vv~~d-~---~~DlAlLkv~~~~~~----  209 (578)
                      ...|++|+|+++.+||++||+....    ++.+...   +++. .+  ......+. .   ..|.+...+....+.    
T Consensus        63 ~~~~~~~lI~pntvLTa~Hc~~s~~~G~~~~~~~p~g~~~~~~~~~~~~~~~~~~~~g~~~~~d~~~~~v~~~~~~~g~~  142 (251)
T COG3591          63 RLCTAATLIGPNTVLTAGHCIYSPDYGEDDIAAAPPGVNSDGGPFYGITKIEIRVYPGELYKEDGASYDVGEAALESGIN  142 (251)
T ss_pred             cceeeEEEEcCceEEEeeeEEecCCCChhhhhhcCCcccCCCCCCCceeeEEEEecCCceeccCCceeeccHHHhccCCC
Confidence            3456779999999999999997533    1111110   1111 11  11111112 2   345555555443221    


Q ss_pred             --CCee--eEEcCCCCCCCCcEEEEeecCCCCc---ceEEEeEEeceeeeeecCCceeeeEEEEecCCCCCCccceEEcc
Q 008087          210 --EGVL--PVEFGELPALQDAVTVVGYPIGGDT---ISVTSGVVSRIEILSYVHGSTELLGLQIDAAINSGNSGGPAFND  282 (578)
Q Consensus       210 --~~~~--~l~l~~~~~~g~~V~~iG~p~g~~~---~sv~~G~Is~~~~~~~~~~~~~~~~i~~da~i~~G~SGGPlvn~  282 (578)
                        ....  ...+....++++.+.++|||.....   .-...+.|.....          ..++.++.+.+|+||+||++.
T Consensus       143 ~~~~~~~~~~~~~~~~~~~d~i~v~GYP~dk~~~~~~~e~t~~v~~~~~----------~~l~y~~dT~pG~SGSpv~~~  212 (251)
T COG3591         143 IGDVVNYLKRNTASEAKANDRITVIGYPGDKPNIGTMWESTGKVNSIKG----------NKLFYDADTLPGSSGSPVLIS  212 (251)
T ss_pred             ccccccccccccccccccCceeEEEeccCCCCcceeEeeecceeEEEec----------ceEEEEecccCCCCCCceEec
Confidence              1111  2223334467888999999987441   1222333333221          257788889999999999999


Q ss_pred             CCeEEEEEeccccccccCCcee-eecchhHHHHHHHHH
Q 008087          283 KGKCVGIAFQSLKHEDVENIGY-VIPTPVIMHFIQDYE  319 (578)
Q Consensus       283 ~G~vVGI~~~~~~~~~~~~~~~-aiPi~~i~~~l~~l~  319 (578)
                      +.++||+++.+....+....++ +.-...++.+++++.
T Consensus       213 ~~~vigv~~~g~~~~~~~~~n~~vr~t~~~~~~I~~~~  250 (251)
T COG3591         213 KDEVIGVHYNGPGANGGSLANNAVRLTPEILNFIQQNI  250 (251)
T ss_pred             CceEEEEEecCCCcccccccCcceEecHHHHHHHHHhh
Confidence            8899999998654222233343 445667778877764


No 26 
>cd00988 PDZ_CTP_protease PDZ domain of C-terminal processing-, tail-specific-, and tricorn proteases, which function in posttranslational protein processing, maturation, and disassembly or degradation, in Bacteria, Archaea, and plant chloroplasts. May be responsible for substrate recognition and/or binding, as most PDZ domains bind C-terminal polypeptides, and binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of protease-associated PDZ domains a C-terminal beta-strand forms the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins.
Probab=98.73  E-value=4e-08  Score=81.35  Aligned_cols=67  Identities=22%  Similarity=0.346  Sum_probs=56.0

Q ss_pred             CCCcEEEEeCCCCcccC-CCCCCCEEEEECCEEeCCC--CCCccccCcchhHHHHHhccCCCCEEEEEEEEC-CEEEEEE
Q 008087          352 QKGVRIRRVDPTAPESE-VLKPSDIILSFDGIDIAND--GTVPFRHGERIGFSYLVSQKYTGDSAAVKVLRD-SKILNFN  427 (578)
Q Consensus       352 ~~Gv~V~~V~p~spA~~-GL~~GDiIl~InG~~V~~~--~~v~~~~~~~~~~~~~~~~~~~g~~v~l~V~R~-g~~~~~~  427 (578)
                      ..+++|..|.++|||++ ||++||+|++|||+++.+|  .+          +...+.. ..|+.+.+++.|+ |+..+++
T Consensus        12 ~~~~~V~~v~~~s~a~~~gl~~GD~I~~vng~~i~~~~~~~----------~~~~l~~-~~~~~i~l~v~r~~~~~~~~~   80 (85)
T cd00988          12 DGGLVITSVLPGSPAAKAGIKAGDIIVAIDGEPVDGLSLED----------VVKLLRG-KAGTKVRLTLKRGDGEPREVT   80 (85)
T ss_pred             CCeEEEEEecCCCCHHHcCCCCCCEEEEECCEEcCCCCHHH----------HHHHhcC-CCCCEEEEEEEcCCCCEEEEE
Confidence            35899999999999999 9999999999999999998  55          3344443 3578999999998 8887777


Q ss_pred             EE
Q 008087          428 IT  429 (578)
Q Consensus       428 v~  429 (578)
                      +.
T Consensus        81 ~~   82 (85)
T cd00988          81 LT   82 (85)
T ss_pred             EE
Confidence            65


No 27 
>cd00136 PDZ PDZ domain, also called DHR (Dlg homologous region) or GLGF (after a conserved sequence motif). Many PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. Heterodimerization through PDZ-PDZ domain interactions adds to the domain's versatility, and PDZ domain-mediated interactions may be modulated dynamically through target phosphorylation. Some PDZ domains play a role in scaffolding supramolecular complexes. PDZ domains are found in diverse signaling proteins in bacteria, archaebacteria, and eukaryotes. This CD contains two distinct structural subgroups with either a N- or C-terminal beta-strand forming the peptide-binding groove base. The circular permutation placing the strand on the N-terminus appears to be found in Eumetazoa only, while the C-terminal variant is found in all three kingdoms of life, and seems to co-occur with protease domains. PDZ domains have been named after PSD95(pos
Probab=98.53  E-value=1.9e-07  Score=74.16  Aligned_cols=54  Identities=28%  Similarity=0.528  Sum_probs=45.5

Q ss_pred             CCcEEEEeCCCCcccC-CCCCCCEEEEECCEEeCCC--CCCccccCcchhHHHHHhccCCCCEEEEEE
Q 008087          353 KGVRIRRVDPTAPESE-VLKPSDIILSFDGIDIAND--GTVPFRHGERIGFSYLVSQKYTGDSAAVKV  417 (578)
Q Consensus       353 ~Gv~V~~V~p~spA~~-GL~~GDiIl~InG~~V~~~--~~v~~~~~~~~~~~~~~~~~~~g~~v~l~V  417 (578)
                      .|++|..|.++|||+. ||++||+|++|||+++.+|  .+          +..++.... |+.++|+|
T Consensus        13 ~~~~V~~v~~~s~a~~~gl~~GD~I~~Ing~~v~~~~~~~----------~~~~l~~~~-g~~v~l~v   69 (70)
T cd00136          13 GGVVVLSVEPGSPAERAGLQAGDVILAVNGTDVKNLTLED----------VAELLKKEV-GEKVTLTV   69 (70)
T ss_pred             CCEEEEEeCCCCHHHHcCCCCCCEEEEECCEECCCCCHHH----------HHHHHhhCC-CCeEEEEE
Confidence            3899999999999999 9999999999999999999  55          445665543 78888876


No 28 
>PF12812 PDZ_1:  PDZ-like domain
Probab=98.52  E-value=3e-07  Score=74.67  Aligned_cols=70  Identities=14%  Similarity=0.142  Sum_probs=56.8

Q ss_pred             CCceeeccEEEeeh-HHHHHHhchhhhhhhhhccccccccccccCCCceEEEEee---cCCcCCCCCCCEEEEeCCeecC
Q 008087          446 PSYYIIAGFVFSRC-LYLISVLSMERIMNMKLRSSFWTSSCIQCHNCQMSSLLWC---LRSPLCLNCFNKVLAFNGNPVK  521 (578)
Q Consensus       446 p~~~~~~Gl~~~~~-p~~~~~~g~~~~~~~~~l~~~~~~~~~~~~~~~~vvvs~v---~A~~aGl~~GD~I~~VNG~~V~  521 (578)
                      -+|+.|+|..|+++ +.....++..+                     ++++++..   ++...++..|.+|.+|||++++
T Consensus         5 ~r~v~~~Ga~f~~Ls~q~aR~~~~~~---------------------~gv~v~~~~g~~~~~~~i~~g~iI~~Vn~kpt~   63 (78)
T PF12812_consen    5 SRFVEVCGAVFHDLSYQQARQYGIPV---------------------GGVYVAVSGGSLAFAGGISKGFIITSVNGKPTP   63 (78)
T ss_pred             CEEEEEcCeecccCCHHHHHHhCCCC---------------------CEEEEEecCCChhhhCCCCCCeEEEeECCcCCc
Confidence            37889999999999 66666777333                     36776643   5555669999999999999999


Q ss_pred             CHHHHHHHHHhcCCC
Q 008087          522 NLKSLANMVENCDDE  536 (578)
Q Consensus       522 ~~~~l~~~l~~~~~~  536 (578)
                      ++++|++++++.++.
T Consensus        64 ~Ld~f~~vvk~ipd~   78 (78)
T PF12812_consen   64 DLDDFIKVVKKIPDN   78 (78)
T ss_pred             CHHHHHHHHHhCCCC
Confidence            999999999998863


No 29 
>KOG3209 consensus WW domain-containing protein [General function prediction only]
Probab=98.51  E-value=4.7e-07  Score=97.88  Aligned_cols=151  Identities=16%  Similarity=0.218  Sum_probs=98.0

Q ss_pred             CCCcEEEEeCCCCcccC--CCCCCCEEEEECCEEeCCCCCCccccCcchhHHHHHhccCCCCEEEEEEEECCEEEEEEEE
Q 008087          352 QKGVRIRRVDPTAPESE--VLKPSDIILSFDGIDIANDGTVPFRHGERIGFSYLVSQKYTGDSAAVKVLRDSKILNFNIT  429 (578)
Q Consensus       352 ~~Gv~V~~V~p~spA~~--GL~~GDiIl~InG~~V~~~~~v~~~~~~~~~~~~~~~~~~~g~~v~l~V~R~g~~~~~~v~  429 (578)
                      .+-++|+.|.+.+.|++  .|++||.|+.|||.+|.....-.        .-.++........|.|+|.|.-..      
T Consensus       673 ~qpi~iG~Iv~lGaAe~DGRL~~gDElv~iDG~pV~GksH~~--------vv~Lm~~AArnghV~LtVRRkv~~------  738 (984)
T KOG3209|consen  673 GQPIYIGAIVPLGAAEEDGRLREGDELVCIDGIPVEGKSHSE--------VVDLMEAAARNGHVNLTVRRKVRT------  738 (984)
T ss_pred             CCeeEEeeeeecccccccCcccCCCeEEEecCeeccCccHHH--------HHHHHHHHHhcCceEEEEeeeeee------
Confidence            34588999999999998  59999999999999999876521        224454444556789999883110      


Q ss_pred             ecccccccCCCCCCCCCCcee------eccEEEeeh-HHHHHHhchhhhhhhhhccccccccccccCCCceEEEEeecCC
Q 008087          430 LATHRRLIPSHNKGRPPSYYI------IAGFVFSRC-LYLISVLSMERIMNMKLRSSFWTSSCIQCHNCQMSSLLWCLRS  502 (578)
Q Consensus       430 l~~~~~~~~~~~~~~~p~~~~------~~Gl~~~~~-p~~~~~~g~~~~~~~~~l~~~~~~~~~~~~~~~~vvvs~v~A~  502 (578)
                       .. ....+.......+.|-.      -.||+|.-+ ..-+.                        +..-|.++.+++|+
T Consensus       739 -~~-~~rsp~~s~~~~~~yDV~lhR~ENeGFGFVi~sS~~kp------------------------~sgiGrIieGSPAd  792 (984)
T KOG3209|consen  739 -GP-ARRSPRNSAAPSGPYDVVLHRKENEGFGFVIMSSQNKP------------------------ESGIGRIIEGSPAD  792 (984)
T ss_pred             -cc-ccCCcccccCCCCCeeeEEecccCCceeEEEEecccCC------------------------CCCccccccCChhH
Confidence             00 01111111111112211      236666543 11110                        11125788999999


Q ss_pred             cCC-CCCCCEEEEeCCeecCCH--HHHHHHHHhcCCCeEEEEEe
Q 008087          503 PLC-LNCFNKVLAFNGNPVKNL--KSLANMVENCDDEFLKFDLE  543 (578)
Q Consensus       503 ~aG-l~~GD~I~~VNG~~V~~~--~~l~~~l~~~~~~~v~l~v~  543 (578)
                      +.| |+.||+|++|||+.|-++  .+.+++|+.+ +-.|+|++.
T Consensus       793 RCgkLkVGDrilAVNG~sI~~lsHadiv~LIKda-GlsVtLtIi  835 (984)
T KOG3209|consen  793 RCGKLKVGDRILAVNGQSILNLSHADIVSLIKDA-GLSVTLTII  835 (984)
T ss_pred             hhccccccceEEEecCeeeeccCchhHHHHHHhc-CceEEEEEc
Confidence            998 589999999999999865  6789999874 667888874


No 30 
>PF13180 PDZ_2:  PDZ domain; PDB: 2L97_A 1Y8T_A 2Z9I_A 1LCY_A 2PZD_B 2P3W_A 1VCW_C 1TE0_B 1SOZ_C 1SOT_C ....
Probab=98.46  E-value=3.2e-07  Score=75.60  Aligned_cols=57  Identities=18%  Similarity=0.194  Sum_probs=48.1

Q ss_pred             eEEEEe----ecCCcCCCCCCCEEEEeCCeecCCHHHHHHHHHh-cCCCeEEEEEecCeEEE
Q 008087          493 MSSLLW----CLRSPLCLNCFNKVLAFNGNPVKNLKSLANMVEN-CDDEFLKFDLEYDQVVV  549 (578)
Q Consensus       493 ~vvvs~----v~A~~aGl~~GD~I~~VNG~~V~~~~~l~~~l~~-~~~~~v~l~v~R~~~~~  549 (578)
                      ++++..    ++|+++||++||+|++|||++|.++.+|.+++.+ .+++.+.|++.|+++..
T Consensus        15 g~~V~~V~~~spA~~aGl~~GD~I~~ing~~v~~~~~~~~~l~~~~~g~~v~l~v~R~g~~~   76 (82)
T PF13180_consen   15 GVVVVSVIPGSPAAKAGLQPGDIILAINGKPVNSSEDLVNILSKGKPGDTVTLTVLRDGEEL   76 (82)
T ss_dssp             SEEEEEESTTSHHHHTTS-TTEEEEEETTEESSSHHHHHHHHHCSSTTSEEEEEEEETTEEE
T ss_pred             eEEEEEeCCCCcHHHCCCCCCcEEEEECCEEcCCHHHHHHHHHhCCCCCEEEEEEEECCEEE
Confidence            455444    5899999999999999999999999999999965 46889999999987764


No 31 
>cd00991 PDZ_archaeal_metalloprotease PDZ domain of archaeal zinc metalloproteases, presumably membrane-associated or integral membrane proteases, which may be involved in signalling and regulatory mechanisms. May be responsible for substrate recognition and/or binding, as most PDZ domains bind C-terminal polypeptides, and binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of protease-associated PDZ domains a C-terminal beta-strand forms the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins.
Probab=98.44  E-value=8.6e-07  Score=72.50  Aligned_cols=58  Identities=10%  Similarity=0.169  Sum_probs=49.3

Q ss_pred             ceEEEEe----ecCCcCCCCCCCEEEEeCCeecCCHHHHHHHHHhc-CCCeEEEEEecCeEEE
Q 008087          492 QMSSLLW----CLRSPLCLNCFNKVLAFNGNPVKNLKSLANMVENC-DDEFLKFDLEYDQVVV  549 (578)
Q Consensus       492 ~~vvvs~----v~A~~aGl~~GD~I~~VNG~~V~~~~~l~~~l~~~-~~~~v~l~v~R~~~~~  549 (578)
                      .++++..    ++|+++||++||+|++|||++|.+|.+|.+.+... +++.+.+++.|+++..
T Consensus        10 ~Gv~V~~V~~~spa~~aGL~~GDiI~~Ing~~v~~~~d~~~~l~~~~~g~~v~l~v~r~g~~~   72 (79)
T cd00991          10 AGVVIVGVIVGSPAENAVLHTGDVIYSINGTPITTLEDFMEALKPTKPGEVITVTVLPSTTKL   72 (79)
T ss_pred             CcEEEEEECCCChHHhcCCCCCCEEEEECCEEcCCHHHHHHHHhcCCCCCEEEEEEEECCEEE
Confidence            3565554    57999999999999999999999999999999986 4778999999987543


No 32 
>cd00987 PDZ_serine_protease PDZ domain of trypsin-like serine proteases, such as DegP/HtrA, which are oligomeric proteins involved in heat-shock response, chaperone function, and apoptosis. May be responsible for substrate recognition and/or binding, as most PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of protease-associated PDZ domains a C-terminal beta-strand forms the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins.
Probab=98.43  E-value=1.7e-06  Score=72.09  Aligned_cols=79  Identities=15%  Similarity=0.137  Sum_probs=61.3

Q ss_pred             eccEEEeeh-HHHHHHhchhhhhhhhhccccccccccccCCCceEEEEee----cCCcCCCCCCCEEEEeCCeecCCHHH
Q 008087          451 IAGFVFSRC-LYLISVLSMERIMNMKLRSSFWTSSCIQCHNCQMSSLLWC----LRSPLCLNCFNKVLAFNGNPVKNLKS  525 (578)
Q Consensus       451 ~~Gl~~~~~-p~~~~~~g~~~~~~~~~l~~~~~~~~~~~~~~~~vvvs~v----~A~~aGl~~GD~I~~VNG~~V~~~~~  525 (578)
                      +.|+.++++ +.....++.                    ....++++..+    +|+++||++||+|++|||++|.++.+
T Consensus         2 ~~G~~~~~~~~~~~~~~~~--------------------~~~~g~~V~~v~~~s~a~~~gl~~GD~I~~Ing~~i~~~~~   61 (90)
T cd00987           2 WLGVTVQDLTPDLAEELGL--------------------KDTKGVLVASVDPGSPAAKAGLKPGDVILAVNGKPVKSVAD   61 (90)
T ss_pred             ccceEEeECCHHHHHHcCC--------------------CCCCEEEEEEECCCCHHHHcCCCcCCEEEEECCEECCCHHH
Confidence            568888888 665544331                    01246666654    78889999999999999999999999


Q ss_pred             HHHHHHhcC-CCeEEEEEecCeEEE
Q 008087          526 LANMVENCD-DEFLKFDLEYDQVVV  549 (578)
Q Consensus       526 l~~~l~~~~-~~~v~l~v~R~~~~~  549 (578)
                      +.+++.... ++.+.|.+.|+++.+
T Consensus        62 ~~~~l~~~~~~~~i~l~v~r~g~~~   86 (90)
T cd00987          62 LRRALAELKPGDKVTLTVLRGGKEL   86 (90)
T ss_pred             HHHHHHhcCCCCEEEEEEEECCEEE
Confidence            999998764 778999999988654


No 33 
>smart00228 PDZ Domain present in PSD-95, Dlg, and ZO-1/2. Also called DHR (Dlg homologous region) or GLGF (relatively well conserved tetrapeptide in these domains). Some PDZs have been shown to bind C-terminal polypeptides; others appear to bind internal (non-C-terminal) polypeptides. Different PDZs possess different binding specificities.
Probab=98.34  E-value=1.6e-06  Score=71.32  Aligned_cols=59  Identities=27%  Similarity=0.402  Sum_probs=47.2

Q ss_pred             CCcEEEEeCCCCcccC-CCCCCCEEEEECCEEeCCCCCCccccCcchhHHHHHhccCCCCEEEEEEEECC
Q 008087          353 KGVRIRRVDPTAPESE-VLKPSDIILSFDGIDIANDGTVPFRHGERIGFSYLVSQKYTGDSAAVKVLRDS  421 (578)
Q Consensus       353 ~Gv~V~~V~p~spA~~-GL~~GDiIl~InG~~V~~~~~v~~~~~~~~~~~~~~~~~~~g~~v~l~V~R~g  421 (578)
                      .|++|..|.++|||++ ||++||+|++|||+.+.++.+..          ........++.+.|++.|++
T Consensus        26 ~~~~i~~v~~~s~a~~~gl~~GD~I~~In~~~v~~~~~~~----------~~~~~~~~~~~~~l~i~r~~   85 (85)
T smart00228       26 GGVVVSSVVPGSPAAKAGLKVGDVILEVNGTSVEGLTHLE----------AVDLLKKAGGKVTLTVLRGG   85 (85)
T ss_pred             CCEEEEEECCCCHHHHcCCCCCCEEEEECCEECCCCCHHH----------HHHHHHhCCCeEEEEEEeCC
Confidence            5899999999999999 99999999999999999886632          22222234568899998864


No 34 
>TIGR00054 RIP metalloprotease RseP. A model that detects fragments as well matches a number of members of the PEPTIDASE FAMILY S2C. The region of match appears not to overlap the active site domain.
Probab=98.30  E-value=1.5e-06  Score=93.86  Aligned_cols=69  Identities=26%  Similarity=0.360  Sum_probs=60.2

Q ss_pred             CCcEEEEeCCCCcccC-CCCCCCEEEEECCEEeCCCCCCccccCcchhHHHHHhccCCCCEEEEEEEECCEEEEEEEEec
Q 008087          353 KGVRIRRVDPTAPESE-VLKPSDIILSFDGIDIANDGTVPFRHGERIGFSYLVSQKYTGDSAAVKVLRDSKILNFNITLA  431 (578)
Q Consensus       353 ~Gv~V~~V~p~spA~~-GL~~GDiIl~InG~~V~~~~~v~~~~~~~~~~~~~~~~~~~g~~v~l~V~R~g~~~~~~v~l~  431 (578)
                      .|++|..|.++|||++ |||+||+|++|||++|.+|.+          +...+.. ..++.+.+++.|+|+..++++++.
T Consensus       203 ~g~vV~~V~~~SpA~~aGL~~GD~Iv~Vng~~V~s~~d----------l~~~l~~-~~~~~v~l~v~R~g~~~~~~v~~~  271 (420)
T TIGR00054       203 IEPVLSDVTPNSPAEKAGLKEGDYIQSINGEKLRSWTD----------FVSAVKE-NPGKSMDIKVERNGETLSISLTPE  271 (420)
T ss_pred             cCcEEEEECCCCHHHHcCCCCCCEEEEECCEECCCHHH----------HHHHHHh-CCCCceEEEEEECCEEEEEEEEEc
Confidence            4799999999999999 999999999999999999988          4456655 467889999999999988888875


Q ss_pred             c
Q 008087          432 T  432 (578)
Q Consensus       432 ~  432 (578)
                      .
T Consensus       272 ~  272 (420)
T TIGR00054       272 A  272 (420)
T ss_pred             C
Confidence            3


No 35 
>KOG3580 consensus Tight junction proteins [Signal transduction mechanisms]
Probab=98.25  E-value=6.5e-06  Score=87.58  Aligned_cols=61  Identities=25%  Similarity=0.401  Sum_probs=46.8

Q ss_pred             CCCcEEEEeCCCCcccCCCCCCCEEEEECCEEeCCCCCCccccCcchhHHHHHhccCCCCEEEEEEEECCE
Q 008087          352 QKGVRIRRVDPTAPESEVLKPSDIILSFDGIDIANDGTVPFRHGERIGFSYLVSQKYTGDSAAVKVLRDSK  422 (578)
Q Consensus       352 ~~Gv~V~~V~p~spA~~GL~~GDiIl~InG~~V~~~~~v~~~~~~~~~~~~~~~~~~~g~~v~l~V~R~g~  422 (578)
                      .+-++|..|.||+||+..||.||.|+.|||....+....         | .+-.....|+...++|.|-.+
T Consensus        39 etSiViSDVlpGGPAeG~LQenDrvvMVNGvsMenv~ha---------F-AvQqLrksgK~A~ItvkRprk   99 (1027)
T KOG3580|consen   39 ETSIVISDVLPGGPAEGLLQENDRVVMVNGVSMENVLHA---------F-AVQQLRKSGKVAAITVKRPRK   99 (1027)
T ss_pred             ceeEEEeeccCCCCcccccccCCeEEEEcCcchhhhHHH---------H-HHHHHHhhccceeEEecccce
Confidence            456899999999999999999999999999999887552         1 111123467788889887544


No 36 
>KOG3627 consensus Trypsin [Amino acid transport and metabolism]
Probab=98.23  E-value=5e-05  Score=76.15  Aligned_cols=158  Identities=23%  Similarity=0.256  Sum_probs=92.8

Q ss_pred             CCCCccccCCCcc----eEEEEEEEcCCEEEecccccCCCc--eEEEEEcC--------CC---cEE-EEEEEEEcC---
Q 008087          136 NFSLPWQRKRQYS----SSSSGFAIGGRRVLTNAHSVEHYT--QVKLKKRG--------SD---TKY-LATVLAIGT---  194 (578)
Q Consensus       136 ~~~~p~~~~~~~~----~~GsGfvI~~g~ILT~aHvV~~~~--~i~V~~~~--------~g---~~~-~a~vv~~d~---  194 (578)
                      ...+||+......    ..|.|.+|++.||||+|||+.+..  .+.|.+..        .+   ... ..+++ .++   
T Consensus        21 ~~~~Pw~~~l~~~~~~~~~Cggsli~~~~vltaaHC~~~~~~~~~~V~~G~~~~~~~~~~~~~~~~~~v~~~i-~H~~y~   99 (256)
T KOG3627|consen   21 PGSFPWQVSLQYGGNGRHLCGGSLISPRWVLTAAHCVKGASASLYTVRLGEHDINLSVSEGEEQLVGDVEKII-VHPNYN   99 (256)
T ss_pred             CCCCCCEEEEEECCCcceeeeeEEeeCCEEEEChhhCCCCCCcceEEEECccccccccccCchhhhceeeEEE-ECCCCC
Confidence            3467787654433    278888888889999999999876  66666621        01   111 11233 221   


Q ss_pred             ----C-CCEEEEEEeeC-CCCCCeeeEEcCCCC----CCC-CcEEEEeecCCC----C-c---ceEEEeEEeceeeeeec
Q 008087          195 ----E-CDIAMLTVEDD-EFWEGVLPVEFGELP----ALQ-DAVTVVGYPIGG----D-T---ISVTSGVVSRIEILSYV  255 (578)
Q Consensus       195 ----~-~DlAlLkv~~~-~~~~~~~~l~l~~~~----~~g-~~V~~iG~p~g~----~-~---~sv~~G~Is~~~~~~~~  255 (578)
                          . +|||||+++.+ .|...+.|+.|....    ..+ ..+++.||....    . .   ..+...+++...+....
T Consensus       100 ~~~~~~nDiall~l~~~v~~~~~i~piclp~~~~~~~~~~~~~~~v~GWG~~~~~~~~~~~~L~~~~v~i~~~~~C~~~~  179 (256)
T KOG3627|consen  100 PRTLENNDIALLRLSEPVTFSSHIQPICLPSSADPYFPPGGTTCLVSGWGRTESGGGPLPDTLQEVDVPIISNSECRRAY  179 (256)
T ss_pred             CCCCCCCCEEEEEECCCcccCCcccccCCCCCcccCCCCCCCEEEEEeCCCcCCCCCCCCceeEEEEEeEcChhHhcccc
Confidence                3 79999999975 455678888876322    223 778888865321    1 1   11122333332222111


Q ss_pred             CCc--eeeeEEEEe-----cCCCCCCccceEEccC---CeEEEEEeccc
Q 008087          256 HGS--TELLGLQID-----AAINSGNSGGPAFNDK---GKCVGIAFQSL  294 (578)
Q Consensus       256 ~~~--~~~~~i~~d-----a~i~~G~SGGPlvn~~---G~vVGI~~~~~  294 (578)
                      ...  .....++..     ...|.|+|||||+..+   ..++||++.+.
T Consensus       180 ~~~~~~~~~~~Ca~~~~~~~~~C~GDSGGPLv~~~~~~~~~~GivS~G~  228 (256)
T KOG3627|consen  180 GGLGTITDTMLCAGGPEGGKDACQGDSGGPLVCEDNGRWVLVGIVSWGS  228 (256)
T ss_pred             cCccccCCCEEeeCccCCCCccccCCCCCeEEEeeCCcEEEEEEEEecC
Confidence            110  001124443     2468899999999875   69999998854


No 37 
>PRK10779 zinc metallopeptidase RseP; Provisional
Probab=98.22  E-value=2.7e-06  Score=92.67  Aligned_cols=67  Identities=25%  Similarity=0.360  Sum_probs=59.2

Q ss_pred             CcEEEEeCCCCcccC-CCCCCCEEEEECCEEeCCCCCCccccCcchhHHHHHhccCCCCEEEEEEEECCEEEEEEEEec
Q 008087          354 GVRIRRVDPTAPESE-VLKPSDIILSFDGIDIANDGTVPFRHGERIGFSYLVSQKYTGDSAAVKVLRDSKILNFNITLA  431 (578)
Q Consensus       354 Gv~V~~V~p~spA~~-GL~~GDiIl~InG~~V~~~~~v~~~~~~~~~~~~~~~~~~~g~~v~l~V~R~g~~~~~~v~l~  431 (578)
                      +++|..|.++|||++ ||++||+|++|||++|.+|.+          +...+.. ..++.+.++|.|+|+..++++++.
T Consensus       222 ~~vV~~V~~~SpA~~AGL~~GDvIl~Ing~~V~s~~d----------l~~~l~~-~~~~~v~l~v~R~g~~~~~~v~~~  289 (449)
T PRK10779        222 EPVLAEVQPNSAASKAGLQAGDRIVKVDGQPLTQWQT----------FVTLVRD-NPGKPLALEIERQGSPLSLTLTPD  289 (449)
T ss_pred             CcEEEeeCCCCHHHHcCCCCCCEEEEECCEEcCCHHH----------HHHHHHh-CCCCEEEEEEEECCEEEEEEEEee
Confidence            588999999999999 999999999999999999988          4556655 467889999999999988888875


No 38 
>cd00989 PDZ_metalloprotease PDZ domain of bacterial and plant zinc metalloproteases, presumably membrane-associated or integral membrane proteases, which may be involved in signalling and regulatory mechanisms. May be responsible for substrate recognition and/or binding, as most PDZ domains bind C-terminal polypeptides, and binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of protease-associated PDZ domains a C-terminal beta-strand forms the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins.
Probab=98.16  E-value=3.6e-06  Score=68.44  Aligned_cols=52  Identities=19%  Similarity=0.245  Sum_probs=46.0

Q ss_pred             EeecCCcCCCCCCCEEEEeCCeecCCHHHHHHHHHhcCCCeEEEEEecCeEE
Q 008087          497 LWCLRSPLCLNCFNKVLAFNGNPVKNLKSLANMVENCDDEFLKFDLEYDQVV  548 (578)
Q Consensus       497 s~v~A~~aGl~~GD~I~~VNG~~V~~~~~l~~~l~~~~~~~v~l~v~R~~~~  548 (578)
                      .+++|+++||++||+|++|||+++.+++++...+.+..++.+.+++.|+++.
T Consensus        21 ~~s~a~~~gl~~GD~I~~ing~~i~~~~~~~~~l~~~~~~~~~l~v~r~~~~   72 (79)
T cd00989          21 PGSPAAKAGLKAGDRILAINGQKIKSWEDLVDAVQENPGKPLTLTVERNGET   72 (79)
T ss_pred             CCCHHHHcCCCCCCEEEEECCEECCCHHHHHHHHHHCCCceEEEEEEECCEE
Confidence            3457888999999999999999999999999999987777899999888754


No 39 
>TIGR00225 prc C-terminal peptidase (prc). A C-terminal peptidase with different substrates in different species, including processing of D1 protein of the photosystem II reaction center in higher plants and cleavage of a peptide of 11 residues from the precursor form of penicillin-binding protein in E. coli. E. coli and H. influenzae have the most distal branch of the tree, and their proteins have an N-terminal 200 amino acids that show no homology to other proteins in the database.
Probab=98.11  E-value=5.1e-06  Score=87.08  Aligned_cols=72  Identities=21%  Similarity=0.276  Sum_probs=56.0

Q ss_pred             CCcEEEEeCCCCcccC-CCCCCCEEEEECCEEeCCCCCCccccCcchhHHHHHhccCCCCEEEEEEEECCEEEEEEEEec
Q 008087          353 KGVRIRRVDPTAPESE-VLKPSDIILSFDGIDIANDGTVPFRHGERIGFSYLVSQKYTGDSAAVKVLRDSKILNFNITLA  431 (578)
Q Consensus       353 ~Gv~V~~V~p~spA~~-GL~~GDiIl~InG~~V~~~~~v~~~~~~~~~~~~~~~~~~~g~~v~l~V~R~g~~~~~~v~l~  431 (578)
                      .+++|..|.++|||++ ||++||+|++|||++|.+|..-.        +..++. ...|..+.++|.|+|+..++++++.
T Consensus        62 ~~~~V~~V~~~spA~~aGL~~GD~I~~Ing~~v~~~~~~~--------~~~~l~-~~~g~~v~l~v~R~g~~~~~~v~l~  132 (334)
T TIGR00225        62 GEIVIVSPFEGSPAEKAGIKPGDKIIKINGKSVAGMSLDD--------AVALIR-GKKGTKVSLEILRAGKSKPLTFTLK  132 (334)
T ss_pred             CEEEEEEeCCCChHHHcCCCCCCEEEEECCEECCCCCHHH--------HHHhcc-CCCCCEEEEEEEeCCCCceEEEEEE
Confidence            4799999999999999 99999999999999999884100        222332 2468899999999987777666665


Q ss_pred             cc
Q 008087          432 TH  433 (578)
Q Consensus       432 ~~  433 (578)
                      ..
T Consensus       133 ~~  134 (334)
T TIGR00225       133 RD  134 (334)
T ss_pred             EE
Confidence            43


No 40 
>PF00595 PDZ:  PDZ domain (Also known as DHR or GLGF) Coordinates are not yet available;  InterPro: IPR001478 PDZ domains are found in diverse signalling proteins in bacteria, yeasts, plants, insects and vertebrates. PDZ domains can occur in one or multiple copies and are nearly always found in cytoplasmic proteins. They bind either the carboxyl-terminal sequences of proteins or internal peptide sequences. In most cases, interaction between a PDZ domain and its target is constitutive, with a binding affinity of 1 to 10 micromolar. However, agonist-dependent activation of cell surface receptors is sometimes required to promote interaction with a PDZ protein. PDZ domain proteins are frequently associated with the plasma membrane, a compartment where high concentrations of phosphatidylinositol 4,5-bisphosphate (PIP2) are found. Direct interaction between PIP2 and a subset of class II PDZ domains (syntenin, CASK, Tiam-1) has been demonstrated.  PDZ domains consist of 80 to 90 amino acids comprising six beta-strands (beta-A to beta-F) and two alpha-helices, A and B, compactly arranged in a globular structure. Peptide binding of the ligand takes place in an elongated surface groove as an anti-parallel beta-strand interacts with the beta-B strand and the B helix. The structure of PDZ domains allows binding to a free carboxylate group at the end of a peptide through a carboxylate-binding loop between the beta-A and beta-B strands.; GO: 0005515 protein binding; PDB: 3AXA_A 1WF8_A 1QAV_B 1QAU_A 1B8Q_A 1MC7_A 2KAW_A 1I16_A 1VB7_A 1WI4_A ....
Probab=98.10  E-value=4e-06  Score=68.77  Aligned_cols=72  Identities=24%  Similarity=0.309  Sum_probs=51.9

Q ss_pred             cccCCcccccccChhhhhhhccccCCCCcEEEEeCCCCcccC-CCCCCCEEEEECCEEeCCCCCCccccCcchhHHHHHh
Q 008087          327 FPLLGVEWQKMENPDLRVAMSMKADQKGVRIRRVDPTAPESE-VLKPSDIILSFDGIDIANDGTVPFRHGERIGFSYLVS  405 (578)
Q Consensus       327 ~~~lGi~~~~~~~~~~~~~lgl~~~~~Gv~V~~V~p~spA~~-GL~~GDiIl~InG~~V~~~~~v~~~~~~~~~~~~~~~  405 (578)
                      ...||+.+.... ..         ...+++|..|.++|+|+. ||+.||+|++|||+++.++....        ...++.
T Consensus         9 ~~~lG~~l~~~~-~~---------~~~~~~V~~v~~~~~a~~~gl~~GD~Il~INg~~v~~~~~~~--------~~~~l~   70 (81)
T PF00595_consen    9 NGPLGFTLRGGS-DN---------DEKGVFVSSVVPGSPAERAGLKVGDRILEINGQSVRGMSHDE--------VVQLLK   70 (81)
T ss_dssp             TSBSSEEEEEES-TS---------SSEEEEEEEECTTSHHHHHTSSTTEEEEEETTEESTTSBHHH--------HHHHHH
T ss_pred             CCCcCEEEEecC-CC---------CcCCEEEEEEeCCChHHhcccchhhhhheeCCEeCCCCCHHH--------HHHHHH
Confidence            456788877541 10         024899999999999999 99999999999999999986521        223333


Q ss_pred             ccCCCCEEEEEEE
Q 008087          406 QKYTGDSAAVKVL  418 (578)
Q Consensus       406 ~~~~g~~v~l~V~  418 (578)
                      .  .+..++|+|+
T Consensus        71 ~--~~~~v~L~V~   81 (81)
T PF00595_consen   71 S--ASNPVTLTVQ   81 (81)
T ss_dssp             H--STSEEEEEEE
T ss_pred             C--CCCcEEEEEC
Confidence            3  3447888764


No 41 
>PRK10139 serine endoprotease; Provisional
Probab=98.09  E-value=6.3e-06  Score=89.71  Aligned_cols=64  Identities=17%  Similarity=0.345  Sum_probs=55.8

Q ss_pred             CCcEEEEeCCCCcccC-CCCCCCEEEEECCEEeCCCCCCccccCcchhHHHHHhccCCCCEEEEEEEECCEEEEEEE
Q 008087          353 KGVRIRRVDPTAPESE-VLKPSDIILSFDGIDIANDGTVPFRHGERIGFSYLVSQKYTGDSAAVKVLRDSKILNFNI  428 (578)
Q Consensus       353 ~Gv~V~~V~p~spA~~-GL~~GDiIl~InG~~V~~~~~v~~~~~~~~~~~~~~~~~~~g~~v~l~V~R~g~~~~~~v  428 (578)
                      .|++|..|.++|||++ ||++||+|++|||++|.+|.+          |...+...  .+.+.|+|+|+|+.+.+.+
T Consensus       390 ~Gv~V~~V~~~spA~~aGL~~GD~I~~Ing~~v~~~~~----------~~~~l~~~--~~~v~l~v~R~g~~~~~~~  454 (455)
T PRK10139        390 KGIKIDEVVKGSPAAQAGLQKDDVIIGVNRDRVNSIAE----------MRKVLAAK--PAIIALQIVRGNESIYLLL  454 (455)
T ss_pred             CceEEEEeCCCChHHHcCCCCCCEEEEECCEEcCCHHH----------HHHHHHhC--CCeEEEEEEECCEEEEEEe
Confidence            5899999999999999 999999999999999999988          55677653  3689999999999876654


No 42 
>cd00986 PDZ_LON_protease PDZ domain of ATP-dependent LON serine proteases. Most PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this bacterial subfamily of protease-associated PDZ domains a C-terminal beta-strand  is thought to form the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins.
Probab=98.07  E-value=1.7e-05  Score=64.66  Aligned_cols=56  Identities=13%  Similarity=0.236  Sum_probs=46.1

Q ss_pred             eEEEEe----ecCCcCCCCCCCEEEEeCCeecCCHHHHHHHHHh-cCCCeEEEEEecCeEEE
Q 008087          493 MSSLLW----CLRSPLCLNCFNKVLAFNGNPVKNLKSLANMVEN-CDDEFLKFDLEYDQVVV  549 (578)
Q Consensus       493 ~vvvs~----v~A~~aGl~~GD~I~~VNG~~V~~~~~l~~~l~~-~~~~~v~l~v~R~~~~~  549 (578)
                      |+++..    ++|+. ||++||+|++|||+++.+|++|.+++.. .++..+.|++.|+++..
T Consensus         9 Gv~V~~V~~~s~A~~-gL~~GD~I~~Ing~~v~~~~~~~~~l~~~~~~~~v~l~v~r~g~~~   69 (79)
T cd00986           9 GVYVTSVVEGMPAAG-KLKAGDHIIAVDGKPFKEAEELIDYIQSKKEGDTVKLKVKREEKEL   69 (79)
T ss_pred             CEEEEEECCCCchhh-CCCCCCEEEEECCEECCCHHHHHHHHHhCCCCCEEEEEEEECCEEE
Confidence            455554    45665 7999999999999999999999999986 46778999999987654


No 43 
>TIGR03279 cyano_FeS_chp putative FeS-containing Cyanobacterial-specific oxidoreductase. Members of this protein family are predicted FeS-containing oxidoreductases of unknown function, apparently restricted to and universal across the Cyanobacteria. The high trusted cutoff score for this model, 700 bits, excludes homologs from other lineages. This exclusion seems justified because a significant number of sequence positions are simultaneously unique to and invariant across the Cyanobacteria, suggesting a specialized, conserved function, perhaps related to photosynthesis. A distantly related protein family, TIGR03278, is universal in and restricted to archaeal methanogens, and may be linked to methanogenesis.
Probab=98.07  E-value=5.2e-06  Score=87.92  Aligned_cols=61  Identities=20%  Similarity=0.317  Sum_probs=50.5

Q ss_pred             EEEeCCCCcccC-CCCCCCEEEEECCEEeCCCCCCccccCcchhHHHHHhccCCCCEEEEEEE-ECCEEEEEEEEec
Q 008087          357 IRRVDPTAPESE-VLKPSDIILSFDGIDIANDGTVPFRHGERIGFSYLVSQKYTGDSAAVKVL-RDSKILNFNITLA  431 (578)
Q Consensus       357 V~~V~p~spA~~-GL~~GDiIl~InG~~V~~~~~v~~~~~~~~~~~~~~~~~~~g~~v~l~V~-R~g~~~~~~v~l~  431 (578)
                      |..|.|+|+|++ ||++||+|++|||++|.+|.++          ...+    .++.+.++|. |+|+..++++...
T Consensus         2 I~~V~pgSpAe~AGLe~GD~IlsING~~V~Dw~D~----------~~~l----~~e~l~L~V~~rdGe~~~l~Ie~~   64 (433)
T TIGR03279         2 ISAVLPGSIAEELGFEPGDALVSINGVAPRDLIDY----------QFLC----ADEELELEVLDANGESHQIEIEKD   64 (433)
T ss_pred             cCCcCCCCHHHHcCCCCCCEEEEECCEECCCHHHH----------HHHh----cCCcEEEEEEcCCCeEEEEEEecC
Confidence            677999999999 9999999999999999999884          3334    2467889997 8998888877654


No 44 
>KOG3209 consensus WW domain-containing protein [General function prediction only]
Probab=98.06  E-value=2.9e-05  Score=84.46  Aligned_cols=53  Identities=26%  Similarity=0.351  Sum_probs=43.3

Q ss_pred             EEEeCCCCcccC--CCCCCCEEEEECCEEeCCCCCCccccCcchhHHHHHhccCCCCEEEEEEEE
Q 008087          357 IRRVDPTAPESE--VLKPSDIILSFDGIDIANDGTVPFRHGERIGFSYLVSQKYTGDSAAVKVLR  419 (578)
Q Consensus       357 V~~V~p~spA~~--GL~~GDiIl~InG~~V~~~~~v~~~~~~~~~~~~~~~~~~~g~~v~l~V~R  419 (578)
                      |++|.+||||++  .|+.||.|++|||+.|.+.....        .-.++.  ..|-+|+|+|.-
T Consensus       782 iGrIieGSPAdRCgkLkVGDrilAVNG~sI~~lsHad--------iv~LIK--daGlsVtLtIip  836 (984)
T KOG3209|consen  782 IGRIIEGSPADRCGKLKVGDRILAVNGQSILNLSHAD--------IVSLIK--DAGLSVTLTIIP  836 (984)
T ss_pred             ccccccCChhHhhccccccceEEEecCeeeeccCchh--------HHHHHH--hcCceEEEEEcC
Confidence            789999999999  59999999999999999887632        223443  468899999875


No 45 
>cd00988 PDZ_CTP_protease PDZ domain of C-terminal processing-, tail-specific-, and tricorn proteases, which function in posttranslational protein processing, maturation, and disassembly or degradation, in Bacteria, Archaea, and plant chloroplasts. May be responsible for substrate recognition and/or binding, as most PDZ domains bind C-terminal polypeptides, and binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of protease-associated PDZ domains a C-terminal beta-strand forms the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins.
Probab=97.98  E-value=2.1e-05  Score=64.91  Aligned_cols=55  Identities=11%  Similarity=0.181  Sum_probs=47.0

Q ss_pred             eEEEEe----ecCCcCCCCCCCEEEEeCCeecCCH--HHHHHHHHhcCCCeEEEEEecC-eE
Q 008087          493 MSSLLW----CLRSPLCLNCFNKVLAFNGNPVKNL--KSLANMVENCDDEFLKFDLEYD-QV  547 (578)
Q Consensus       493 ~vvvs~----v~A~~aGl~~GD~I~~VNG~~V~~~--~~l~~~l~~~~~~~v~l~v~R~-~~  547 (578)
                      ++++..    ++|+++||++||+|++|||+++.+|  .++..+++...++.+.|++.|+ +.
T Consensus        14 ~~~V~~v~~~s~a~~~gl~~GD~I~~vng~~i~~~~~~~~~~~l~~~~~~~i~l~v~r~~~~   75 (85)
T cd00988          14 GLVITSVLPGSPAAKAGIKAGDIIVAIDGEPVDGLSLEDVVKLLRGKAGTKVRLTLKRGDGE   75 (85)
T ss_pred             eEEEEEecCCCCHHHcCCCCCCEEEEECCEEcCCCCHHHHHHHhcCCCCCEEEEEEEcCCCC
Confidence            455544    4788999999999999999999999  9999999887788899999987 53


No 46 
>PRK10942 serine endoprotease; Provisional
Probab=97.96  E-value=1.6e-05  Score=87.04  Aligned_cols=64  Identities=22%  Similarity=0.375  Sum_probs=55.4

Q ss_pred             CCcEEEEeCCCCcccC-CCCCCCEEEEECCEEeCCCCCCccccCcchhHHHHHhccCCCCEEEEEEEECCEEEEEEE
Q 008087          353 KGVRIRRVDPTAPESE-VLKPSDIILSFDGIDIANDGTVPFRHGERIGFSYLVSQKYTGDSAAVKVLRDSKILNFNI  428 (578)
Q Consensus       353 ~Gv~V~~V~p~spA~~-GL~~GDiIl~InG~~V~~~~~v~~~~~~~~~~~~~~~~~~~g~~v~l~V~R~g~~~~~~v  428 (578)
                      .|++|..|.++|+|++ ||++||+|++|||++|.+|.+          |...+...  ++.+.|+|+|+|+.+.+.+
T Consensus       408 ~gvvV~~V~~~S~A~~aGL~~GDvIv~VNg~~V~s~~d----------l~~~l~~~--~~~v~l~V~R~g~~~~v~~  472 (473)
T PRK10942        408 KGVVVDNVKPGTPAAQIGLKKGDVIIGANQQPVKNIAE----------LRKILDSK--PSVLALNIQRGDSSIYLLM  472 (473)
T ss_pred             CCeEEEEeCCCChHHHcCCCCCCEEEEECCEEcCCHHH----------HHHHHHhC--CCeEEEEEEECCEEEEEEe
Confidence            5899999999999999 999999999999999999988          55666652  3689999999998876654


No 47 
>cd00992 PDZ_signaling PDZ domain found in a variety of Eumetazoan signaling molecules, often in tandem arrangements. May be responsible for specific protein-protein interactions, as most PDZ domains bind C-terminal polypeptides, and binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of PDZ domains an N-terminal beta-strand forms the peptide-binding groove base, a circular permutation with respect to PDZ domains found in proteases.
Probab=97.94  E-value=1.9e-05  Score=64.58  Aligned_cols=49  Identities=24%  Similarity=0.406  Sum_probs=39.7

Q ss_pred             ccCCcccccccChhhhhhhccccCCCCcEEEEeCCCCcccC-CCCCCCEEEEECCEEeCCC
Q 008087          328 PLLGVEWQKMENPDLRVAMSMKADQKGVRIRRVDPTAPESE-VLKPSDIILSFDGIDIAND  387 (578)
Q Consensus       328 ~~lGi~~~~~~~~~~~~~lgl~~~~~Gv~V~~V~p~spA~~-GL~~GDiIl~InG~~V~~~  387 (578)
                      ..+|+.+... ...          ..|++|..|.++|||++ ||++||+|++|||+++.++
T Consensus        12 ~~~G~~~~~~-~~~----------~~~~~V~~v~~~s~a~~~gl~~GD~I~~ing~~i~~~   61 (82)
T cd00992          12 GGLGFSLRGG-KDS----------GGGIFVSRVEPGGPAERGGLRVGDRILEVNGVSVEGL   61 (82)
T ss_pred             CCcCEEEeCc-ccC----------CCCeEEEEECCCChHHhCCCCCCCEEEEECCEEcCcc
Confidence            4577777644 110          24899999999999999 9999999999999999943


No 48 
>PLN00049 carboxyl-terminal processing protease; Provisional
Probab=97.94  E-value=2.4e-05  Score=83.66  Aligned_cols=68  Identities=19%  Similarity=0.295  Sum_probs=54.1

Q ss_pred             CcEEEEeCCCCcccC-CCCCCCEEEEECCEEeCCCCCCccccCcchhHHHHHhccCCCCEEEEEEEECCEEEEEEEEe
Q 008087          354 GVRIRRVDPTAPESE-VLKPSDIILSFDGIDIANDGTVPFRHGERIGFSYLVSQKYTGDSAAVKVLRDSKILNFNITL  430 (578)
Q Consensus       354 Gv~V~~V~p~spA~~-GL~~GDiIl~InG~~V~~~~~v~~~~~~~~~~~~~~~~~~~g~~v~l~V~R~g~~~~~~v~l  430 (578)
                      |++|..|.++|||++ ||++||+|++|||++|.++...        .+..++. ...|..+.|+|.|+|+..+++++-
T Consensus       103 g~~V~~V~~~SPA~~aGl~~GD~Iv~InG~~v~~~~~~--------~~~~~l~-g~~g~~v~ltv~r~g~~~~~~l~r  171 (389)
T PLN00049        103 GLVVVAPAPGGPAARAGIRPGDVILAIDGTSTEGLSLY--------EAADRLQ-GPEGSSVELTLRRGPETRLVTLTR  171 (389)
T ss_pred             cEEEEEeCCCChHHHcCCCCCCEEEEECCEECCCCCHH--------HHHHHHh-cCCCCEEEEEEEECCEEEEEEEEe
Confidence            799999999999999 9999999999999999876321        0223343 346889999999999877766543


No 49 
>TIGR02038 protease_degS periplasmic serine peptidase DegS. This family consists of the periplasmic serine protease DegS (HhoB), a shorter paralog of protease DO (HtrA, DegP) and DegQ (HhoA). It is found in E. coli and several other Proteobacteria of the gamma subdivision. It contains a trypsin domain and a single copy of PDZ domain (in contrast to DegP with two copies). A critical role of this DegS is to sense stress in the periplasm and partially degrade an inhibitor of sigma(E).
Probab=97.93  E-value=3.5e-05  Score=81.29  Aligned_cols=79  Identities=8%  Similarity=0.013  Sum_probs=62.2

Q ss_pred             eccEEEeeh-HHHHHHhchhhhhhhhhccccccccccccCCCceEEEEe----ecCCcCCCCCCCEEEEeCCeecCCHHH
Q 008087          451 IAGFVFSRC-LYLISVLSMERIMNMKLRSSFWTSSCIQCHNCQMSSLLW----CLRSPLCLNCFNKVLAFNGNPVKNLKS  525 (578)
Q Consensus       451 ~~Gl~~~~~-p~~~~~~g~~~~~~~~~l~~~~~~~~~~~~~~~~vvvs~----v~A~~aGl~~GD~I~~VNG~~V~~~~~  525 (578)
                      |.|+.+.++ +.....++.+                    ...++++..    ++|+++||++||+|++|||++|.++++
T Consensus       256 ~lGv~~~~~~~~~~~~lgl~--------------------~~~Gv~V~~V~~~spA~~aGL~~GDvI~~Ing~~V~s~~d  315 (351)
T TIGR02038       256 YIGVSGEDINSVVAQGLGLP--------------------DLRGIVITGVDPNGPAARAGILVRDVILKYDGKDVIGAEE  315 (351)
T ss_pred             EeeeEEEECCHHHHHhcCCC--------------------ccccceEeecCCCChHHHCCCCCCCEEEEECCEEcCCHHH
Confidence            567778777 6666666621                    123566654    478999999999999999999999999


Q ss_pred             HHHHHHh-cCCCeEEEEEecCeEEE
Q 008087          526 LANMVEN-CDDEFLKFDLEYDQVVV  549 (578)
Q Consensus       526 l~~~l~~-~~~~~v~l~v~R~~~~~  549 (578)
                      |.+.+++ .+++.+.|++.|+++..
T Consensus       316 l~~~l~~~~~g~~v~l~v~R~g~~~  340 (351)
T TIGR02038       316 LMDRIAETRPGSKVMVTVLRQGKQL  340 (351)
T ss_pred             HHHHHHhcCCCCEEEEEEEECCEEE
Confidence            9999987 46778999999987654


No 50 
>TIGR02860 spore_IV_B stage IV sporulation protein B. SpoIVB, the stage IV sporulation protein B of endospore-forming bacteria such as Bacillus subtilis, is a serine proteinase, expressed in the spore (rather than mother cell) compartment, that participates in a proteolytic activation cascade for Sigma-K. It appears to be universal among endospore-forming bacteria and occurs nowhere else.
Probab=97.93  E-value=1.9e-05  Score=83.13  Aligned_cols=68  Identities=25%  Similarity=0.374  Sum_probs=55.9

Q ss_pred             CCCcEEEEeC--------CCCcccC-CCCCCCEEEEECCEEeCCCCCCccccCcchhHHHHHhccCCCCEEEEEEEECCE
Q 008087          352 QKGVRIRRVD--------PTAPESE-VLKPSDIILSFDGIDIANDGTVPFRHGERIGFSYLVSQKYTGDSAAVKVLRDSK  422 (578)
Q Consensus       352 ~~Gv~V~~V~--------p~spA~~-GL~~GDiIl~InG~~V~~~~~v~~~~~~~~~~~~~~~~~~~g~~v~l~V~R~g~  422 (578)
                      ..|++|....        .+|||++ |||+||+|++|||++|.+|.+          +.+++... .++.+.++|.|+|+
T Consensus       104 t~GVlVvg~~~v~~~~g~~~SPAa~AGLq~GDiIvsING~~V~s~~D----------L~~iL~~~-~g~~V~LtV~R~Ge  172 (402)
T TIGR02860       104 TKGVLVVGFSDIETEKGKIHSPGEEAGIQIGDRILKINGEKIKNMDD----------LANLINKA-GGEKLTLTIERGGK  172 (402)
T ss_pred             cCEEEEEEEEcccccCCCCCCHHHHcCCCCCCEEEEECCEECCCHHH----------HHHHHHhC-CCCeEEEEEEECCE
Confidence            4588886542        2589998 999999999999999999998          55666665 47899999999999


Q ss_pred             EEEEEEEe
Q 008087          423 ILNFNITL  430 (578)
Q Consensus       423 ~~~~~v~l  430 (578)
                      ..++.+..
T Consensus       173 ~~tv~V~P  180 (402)
T TIGR02860       173 IIETVIKP  180 (402)
T ss_pred             EEEEEEEE
Confidence            88888763


No 51 
>PF14685 Tricorn_PDZ:  Tricorn protease PDZ domain; PDB: 1N6F_D 1N6D_C 1N6E_C 1K32_A.
Probab=97.91  E-value=6.4e-05  Score=62.40  Aligned_cols=65  Identities=23%  Similarity=0.352  Sum_probs=43.0

Q ss_pred             CCCcEEEEeCCC--------CcccC-C--CCCCCEEEEECCEEeCCCCCCccccCcchhHHHHHhccCCCCEEEEEEEEC
Q 008087          352 QKGVRIRRVDPT--------APESE-V--LKPSDIILSFDGIDIANDGTVPFRHGERIGFSYLVSQKYTGDSAAVKVLRD  420 (578)
Q Consensus       352 ~~Gv~V~~V~p~--------spA~~-G--L~~GDiIl~InG~~V~~~~~v~~~~~~~~~~~~~~~~~~~g~~v~l~V~R~  420 (578)
                      ..+..|.+|.++        ||..+ |  +++||+|++|||+++....+          +..++.. ..|+.+.|+|.+.
T Consensus        11 ~~~y~I~~I~~gd~~~~~~~sPL~~pGv~v~~GD~I~aInG~~v~~~~~----------~~~lL~~-~agk~V~Ltv~~~   79 (88)
T PF14685_consen   11 NGGYRIARIYPGDPWNPNARSPLAQPGVDVREGDYILAINGQPVTADAN----------PYRLLEG-KAGKQVLLTVNRK   79 (88)
T ss_dssp             TTEEEEEEE-BS-TTSSS-B-GGGGGS----TT-EEEEETTEE-BTTB-----------HHHHHHT-TTTSEEEEEEE-S
T ss_pred             CCEEEEEEEeCCCCCCccccCCccCCCCCCCCCCEEEEECCEECCCCCC----------HHHHhcc-cCCCEEEEEEecC
Confidence            357888888875        77777 6  56999999999999998777          3345544 4789999999986


Q ss_pred             C-EEEEEE
Q 008087          421 S-KILNFN  427 (578)
Q Consensus       421 g-~~~~~~  427 (578)
                      + +.+++.
T Consensus        80 ~~~~R~v~   87 (88)
T PF14685_consen   80 PGGARTVV   87 (88)
T ss_dssp             TT-EEEEE
T ss_pred             CCCceEEE
Confidence            5 455554


No 52 
>cd00136 PDZ PDZ domain, also called DHR (Dlg homologous region) or GLGF (after a conserved sequence motif). Many PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. Heterodimerization through PDZ-PDZ domain interactions adds to the domain's versatility, and PDZ domain-mediated interactions may be modulated dynamically through target phosphorylation. Some PDZ domains play a role in scaffolding supramolecular complexes. PDZ domains are found in diverse signaling proteins in bacteria, archaebacteria, and eukaryotes. This CD contains two distinct structural subgroups with either a N- or C-terminal beta-strand forming the peptide-binding groove base. The circular permutation placing the strand on the N-terminus appears to be found in Eumetazoa only, while the C-terminal variant is found in all three kingdoms of life, and seems to co-occur with protease domains. PDZ domains have been named after PSD95(pos
Probab=97.88  E-value=3.1e-05  Score=61.34  Aligned_cols=50  Identities=24%  Similarity=0.259  Sum_probs=43.4

Q ss_pred             eEEEEe----ecCCcCCCCCCCEEEEeCCeecCCH--HHHHHHHHhcCCCeEEEEE
Q 008087          493 MSSLLW----CLRSPLCLNCFNKVLAFNGNPVKNL--KSLANMVENCDDEFLKFDL  542 (578)
Q Consensus       493 ~vvvs~----v~A~~aGl~~GD~I~~VNG~~V~~~--~~l~~~l~~~~~~~v~l~v  542 (578)
                      ++++..    ++|+.+||++||+|++|||+++.++  +++.++++...++.+.|++
T Consensus        14 ~~~V~~v~~~s~a~~~gl~~GD~I~~Ing~~v~~~~~~~~~~~l~~~~g~~v~l~v   69 (70)
T cd00136          14 GVVVLSVEPGSPAERAGLQAGDVILAVNGTDVKNLTLEDVAELLKKEVGEKVTLTV   69 (70)
T ss_pred             CEEEEEeCCCCHHHHcCCCCCCEEEEECCEECCCCCHHHHHHHHhhCCCCeEEEEE
Confidence            455544    5788899999999999999999999  9999999998888888876


No 53 
>PF00863 Peptidase_C4:  Peptidase family C4 This family belongs to family C4 of the peptidase classification.;  InterPro: IPR001730 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. 
These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad [].  Nuclear inclusion A (NIA) proteases from potyviruses are cysteine peptidases belonging to the MEROPS peptidase family C4 (NIa protease family, clan PA(C)) [, ].  Potyviruses include plant viruses in which the single-stranded RNA encodes a polyprotein with NIA protease activity, where proteolytic cleavage is specific for Gln+Gly sites. The NIA protease acts on the polyprotein, releasing itself by Gln+Gly cleavage at both the N- and C-termini. It further processes the polyprotein by cleavage at five similar sites in the C-terminal half of the sequence. In addition to its C-terminal protease activity, the NIA protease contains an N-terminal domain that has been implicated in the transcription process []. This peptidase is present in the nuclear inclusion protein of potyviruses.; GO: 0008234 cysteine-type peptidase activity, 0006508 proteolysis; PDB: 3MMG_B 1Q31_B 1LVB_A 1LVM_A.
Probab=97.88  E-value=0.0011  Score=65.17  Aligned_cols=135  Identities=21%  Similarity=0.290  Sum_probs=69.2

Q ss_pred             CCEEEecccccCC-CceEEEEEcCCCcEEEEE-----EEEEcCCCCEEEEEEeeCCCCCCeeeEEcC---CCCCCCCcEE
Q 008087          158 GRRVLTNAHSVEH-YTQVKLKKRGSDTKYLAT-----VLAIGTECDIAMLTVEDDEFWEGVLPVEFG---ELPALQDAVT  228 (578)
Q Consensus       158 ~g~ILT~aHvV~~-~~~i~V~~~~~g~~~~a~-----vv~~d~~~DlAlLkv~~~~~~~~~~~l~l~---~~~~~g~~V~  228 (578)
                      ..||+||+|.... ...+.|..  ..-.|...     -+..-...||.++++..+     ++|.+-.   ..+..++.|.
T Consensus        40 G~~iItn~HLf~~nng~L~i~s--~hG~f~v~nt~~lkv~~i~~~DiviirmPkD-----fpPf~~kl~FR~P~~~e~v~  112 (235)
T PF00863_consen   40 GSYIITNAHLFKRNNGELTIKS--QHGEFTVPNTTQLKVHPIEGRDIVIIRMPKD-----FPPFPQKLKFRAPKEGERVC  112 (235)
T ss_dssp             TTEEEEEGGGGSSTTCEEEEEE--TTEEEEECEGGGSEEEE-TCSSEEEEE--TT-----S----S---B----TT-EEE
T ss_pred             CCEEEEChhhhccCCCeEEEEe--CceEEEcCCccccceEEeCCccEEEEeCCcc-----cCCcchhhhccCCCCCCEEE
Confidence            6799999999964 34567766  22233322     133446899999999875     4443221   2567799999


Q ss_pred             EEeecCCCCcceEEEeEEeceeeeee-cCCceeeeEEEEecCCCCCCccceEEcc-CCeEEEEEeccccccccCCceeee
Q 008087          229 VVGYPIGGDTISVTSGVVSRIEILSY-VHGSTELLGLQIDAAINSGNSGGPAFND-KGKCVGIAFQSLKHEDVENIGYVI  306 (578)
Q Consensus       229 ~iG~p~g~~~~sv~~G~Is~~~~~~~-~~~~~~~~~i~~da~i~~G~SGGPlvn~-~G~vVGI~~~~~~~~~~~~~~~ai  306 (578)
                      .+|.-+.....+.   .||....... ..+.....+|.    ...|+=|.|+++. +|++|||++...   .....+|+.
T Consensus       113 mVg~~fq~k~~~s---~vSesS~i~p~~~~~fWkHwIs----Tk~G~CG~PlVs~~Dg~IVGiHsl~~---~~~~~N~F~  182 (235)
T PF00863_consen  113 MVGSNFQEKSISS---TVSESSWIYPEENSHFWKHWIS----TKDGDCGLPLVSTKDGKIVGIHSLTS---NTSSRNYFT  182 (235)
T ss_dssp             EEEEECSSCCCEE---EEEEEEEEEEETTTTEEEE-C-------TT-TT-EEEETTT--EEEEEEEEE---TTTSSEEEE
T ss_pred             EEEEEEEcCCeeE---EECCceEEeecCCCCeeEEEec----CCCCccCCcEEEcCCCcEEEEEcCcc---CCCCeEEEE
Confidence            9998665333222   2333222111 22222233444    4468889999986 899999999743   335567776


Q ss_pred             cch
Q 008087          307 PTP  309 (578)
Q Consensus       307 Pi~  309 (578)
                      |+.
T Consensus       183 ~f~  185 (235)
T PF00863_consen  183 PFP  185 (235)
T ss_dssp             E--
T ss_pred             cCC
Confidence            664


No 54 
>PRK10898 serine endoprotease; Provisional
Probab=97.80  E-value=9.9e-05  Score=77.86  Aligned_cols=58  Identities=9%  Similarity=0.025  Sum_probs=50.0

Q ss_pred             ceEEEEee----cCCcCCCCCCCEEEEeCCeecCCHHHHHHHHHh-cCCCeEEEEEecCeEEE
Q 008087          492 QMSSLLWC----LRSPLCLNCFNKVLAFNGNPVKNLKSLANMVEN-CDDEFLKFDLEYDQVVV  549 (578)
Q Consensus       492 ~~vvvs~v----~A~~aGl~~GD~I~~VNG~~V~~~~~l~~~l~~-~~~~~v~l~v~R~~~~~  549 (578)
                      .++++..+    +|+++||+.||+|++|||++|.++.+|.+.+.. .+++.+.|++.|+++.+
T Consensus       279 ~Gv~V~~V~~~spA~~aGL~~GDvI~~Ing~~V~s~~~l~~~l~~~~~g~~v~l~v~R~g~~~  341 (353)
T PRK10898        279 QGIVVNEVSPDGPAAKAGIQVNDLIISVNNKPAISALETMDQVAEIRPGSVIPVVVMRDDKQL  341 (353)
T ss_pred             CeEEEEEECCCChHHHcCCCCCCEEEEECCEEcCCHHHHHHHHHhcCCCCEEEEEEEECCEEE
Confidence            46766654    799999999999999999999999999999887 56778999999987654


No 55 
>cd00990 PDZ_glycyl_aminopeptidase PDZ domain associated with archaeal and bacterial M61 glycyl-aminopeptidases. May be responsible for substrate recognition and/or binding, as most PDZ domains bind C-terminal polypeptides, and binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of protease-associated PDZ domains a C-terminal beta-strand is presumed to form the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins.
Probab=97.79  E-value=5.3e-05  Score=61.73  Aligned_cols=55  Identities=13%  Similarity=0.145  Sum_probs=42.0

Q ss_pred             eEEEEe----ecCCcCCCCCCCEEEEeCCeecCCHHHHHHHHHhcCCCeEEEEEecCeEEE
Q 008087          493 MSSLLW----CLRSPLCLNCFNKVLAFNGNPVKNLKSLANMVENCDDEFLKFDLEYDQVVV  549 (578)
Q Consensus       493 ~vvvs~----v~A~~aGl~~GD~I~~VNG~~V~~~~~l~~~l~~~~~~~v~l~v~R~~~~~  549 (578)
                      ++++..    ++|+++||++||+|++|||+++.+|.++.+.+  ..++.+.+++.|+++..
T Consensus        13 ~~~V~~V~~~s~a~~aGl~~GD~I~~Ing~~v~~~~~~l~~~--~~~~~v~l~v~r~g~~~   71 (80)
T cd00990          13 LGKVTFVRDDSPADKAGLVAGDELVAVNGWRVDALQDRLKEY--QAGDPVELTVFRDDRLI   71 (80)
T ss_pred             cEEEEEECCCChHHHhCCCCCCEEEEECCEEhHHHHHHHHhc--CCCCEEEEEEEECCEEE
Confidence            455554    47899999999999999999999966654333  25678899999887653


No 56 
>PF05579 Peptidase_S32:  Equine arteritis virus serine endopeptidase S32;  InterPro: IPR008760 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. 
They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This group of serine peptidases belong to MEROPS peptidase family S32 (clan PA(S)). The type example is equine arteritis virus serine endopeptidase (equine arteritis virus), which is involved in processing of nidovirus polyproteins [].; GO: 0004252 serine-type endopeptidase activity, 0016032 viral reproduction, 0019082 viral protein processing; PDB: 3FAN_A 3FAO_A 1MBM_A.
Probab=97.73  E-value=0.00036  Score=68.40  Aligned_cols=113  Identities=19%  Similarity=0.281  Sum_probs=62.7

Q ss_pred             EEEEEEEc-CCEEEecccccCCCceEEEEEcCCCcEEEEEEEEEcCCCCEEEEEEeeCCCCCCeeeEEcCCCCCCCCcEE
Q 008087          150 SSSGFAIG-GRRVLTNAHSVEHYTQVKLKKRGSDTKYLATVLAIGTECDIAMLTVEDDEFWEGVLPVEFGELPALQDAVT  228 (578)
Q Consensus       150 ~GsGfvI~-~g~ILT~aHvV~~~~~i~V~~~~~g~~~~a~vv~~d~~~DlAlLkv~~~~~~~~~~~l~l~~~~~~g~~V~  228 (578)
                      +|+.|-++ +-.|+|+.||+. .+...|..  .+..   +...++..-|+|.-.++.  +....|.++++.. ..|..  
T Consensus       115 sggvft~~~~~vvvTAtHVlg-~~~a~v~~--~g~~---~~~tF~~~GDfA~~~~~~--~~G~~P~~k~a~~-~~GrA--  183 (297)
T PF05579_consen  115 SGGVFTIGGNTVVVTATHVLG-GNTARVSG--VGTR---RMLTFKKNGDFAEADITN--WPGAAPKYKFAQN-YTGRA--  183 (297)
T ss_dssp             EEEEEECTTEEEEEEEHHHCB-TTEEEEEE--TTEE---EEEEEEEETTEEEEEETT--S-S---B--B-TT--SEEE--
T ss_pred             ccceEEECCeEEEEEEEEEcC-CCeEEEEe--cceE---EEEEEeccCcEEEEECCC--CCCCCCceeecCC-cccce--
Confidence            44444444 459999999998 55666665  3333   333456678999998843  2235677777622 12211  


Q ss_pred             EEeecCCCCcceEEEeEEeceeeeeecCCceeeeEEEEecCCCCCCccceEEccCCeEEEEEecc
Q 008087          229 VVGYPIGGDTISVTSGVVSRIEILSYVHGSTELLGLQIDAAINSGNSGGPAFNDKGKCVGIAFQS  293 (578)
Q Consensus       229 ~iG~p~g~~~~sv~~G~Is~~~~~~~~~~~~~~~~i~~da~i~~G~SGGPlvn~~G~vVGI~~~~  293 (578)
                           +.+...-+..|.|..-.+            +.+   ..+|+||+|++..+|.+|||++++
T Consensus       184 -----yW~t~tGvE~G~ig~~~~------------~~f---T~~GDSGSPVVt~dg~liGVHTGS  228 (297)
T PF05579_consen  184 -----YWLTSTGVEPGFIGGGGA------------VCF---TGPGDSGSPVVTEDGDLIGVHTGS  228 (297)
T ss_dssp             -----EEEETTEEEEEEEETTEE------------EES---S-GGCTT-EEEETTC-EEEEEEEE
T ss_pred             -----EEEcccCcccceecCceE------------EEE---cCCCCCCCccCcCCCCEEEEEecC
Confidence                 111111355565544322            222   357999999999999999999984


No 57 
>PRK09681 putative type II secretion protein GspC; Provisional
Probab=97.73  E-value=7.2e-05  Score=75.02  Aligned_cols=67  Identities=25%  Similarity=0.326  Sum_probs=54.3

Q ss_pred             CCcEEE-EeCCCCc---ccC-CCCCCCEEEEECCEEeCCCCCCccccCcchhHHHHHhccCCCCEEEEEEEECCEEEEEE
Q 008087          353 KGVRIR-RVDPTAP---ESE-VLKPSDIILSFDGIDIANDGTVPFRHGERIGFSYLVSQKYTGDSAAVKVLRDSKILNFN  427 (578)
Q Consensus       353 ~Gv~V~-~V~p~sp---A~~-GL~~GDiIl~InG~~V~~~~~v~~~~~~~~~~~~~~~~~~~g~~v~l~V~R~g~~~~~~  427 (578)
                      .| +++ .+.|+..   ..+ |||+||++++|||.++.+..+          ...++........++|+|+|+|+.+++.
T Consensus       204 ~G-l~GYrl~Pgkd~~lF~~~GLq~GDva~sING~dL~D~~q----------a~~l~~~L~~~tei~ltVeRdGq~~~i~  272 (276)
T PRK09681        204 EG-IVGYAVKPGADRSLFDASGFKEGDIAIALNQQDFTDPRA----------MIALMRQLPSMDSIQLTVLRKGARHDIS  272 (276)
T ss_pred             CC-ceEEEECCCCcHHHHHHcCCCCCCEEEEeCCeeCCCHHH----------HHHHHHHhccCCeEEEEEEECCEEEEEE
Confidence            35 444 4667643   455 999999999999999998876          4577888888899999999999999988


Q ss_pred             EEe
Q 008087          428 ITL  430 (578)
Q Consensus       428 v~l  430 (578)
                      +.+
T Consensus       273 i~l  275 (276)
T PRK09681        273 IAL  275 (276)
T ss_pred             EEc
Confidence            765


No 58 
>TIGR01713 typeII_sec_gspC general secretion pathway protein C. This model represents GspC, protein C of the main terminal branch of the general secretion pathway, also called type II secretion. This system transports folded proteins across the bacterial outer membrane and is widely distributed in Gram-negative pathogens.
Probab=97.68  E-value=9.5e-05  Score=74.36  Aligned_cols=58  Identities=16%  Similarity=0.032  Sum_probs=50.6

Q ss_pred             ceEEEEe----ecCCcCCCCCCCEEEEeCCeecCCHHHHHHHHHhcC-CCeEEEEEecCeEEE
Q 008087          492 QMSSLLW----CLRSPLCLNCFNKVLAFNGNPVKNLKSLANMVENCD-DEFLKFDLEYDQVVV  549 (578)
Q Consensus       492 ~~vvvs~----v~A~~aGl~~GD~I~~VNG~~V~~~~~l~~~l~~~~-~~~v~l~v~R~~~~~  549 (578)
                      .|+.+..    ++++++||+.||+|++|||+++.+++++.+++.+.+ ++.+.|++.|+++.+
T Consensus       191 ~G~~v~~v~~~s~a~~aGLr~GDvIv~ING~~i~~~~~~~~~l~~~~~~~~v~l~V~R~G~~~  253 (259)
T TIGR01713       191 EGYRLNPGKDPSLFYKSGLQDGDIAVALNGLDLRDPEQAFQALQMLREETNLTLTVERDGQRE  253 (259)
T ss_pred             eEEEEEecCCCCHHHHcCCCCCCEEEEECCEEcCCHHHHHHHHHhcCCCCeEEEEEEECCEEE
Confidence            5676664    489999999999999999999999999999999864 468999999998754


No 59 
>COG0793 Prc Periplasmic protease [Cell envelope biogenesis, outer membrane]
Probab=97.67  E-value=0.00015  Score=77.63  Aligned_cols=81  Identities=25%  Similarity=0.393  Sum_probs=59.5

Q ss_pred             cccCCcccccccChhhhhhhccccCCCCcEEEEeCCCCcccC-CCCCCCEEEEECCEEeCCCCCCccccCcchhHHHHH-
Q 008087          327 FPLLGVEWQKMENPDLRVAMSMKADQKGVRIRRVDPTAPESE-VLKPSDIILSFDGIDIANDGTVPFRHGERIGFSYLV-  404 (578)
Q Consensus       327 ~~~lGi~~~~~~~~~~~~~lgl~~~~~Gv~V~~V~p~spA~~-GL~~GDiIl~InG~~V~~~~~v~~~~~~~~~~~~~~-  404 (578)
                      +..+|++++.. +            ..++.|.++.+++||++ ||++||+|++|||+++....-           ...+ 
T Consensus        99 ~~GiG~~i~~~-~------------~~~~~V~s~~~~~PA~kagi~~GD~I~~IdG~~~~~~~~-----------~~av~  154 (406)
T COG0793          99 FGGIGIELQME-D------------IGGVKVVSPIDGSPAAKAGIKPGDVIIKIDGKSVGGVSL-----------DEAVK  154 (406)
T ss_pred             ccceeEEEEEe-c------------CCCcEEEecCCCChHHHcCCCCCCEEEEECCEEccCCCH-----------HHHHH
Confidence            56777777643 1            15899999999999999 999999999999999988641           1222 


Q ss_pred             -hccCCCCEEEEEEEECCEEEEEEEEec
Q 008087          405 -SQKYTGDSAAVKVLRDSKILNFNITLA  431 (578)
Q Consensus       405 -~~~~~g~~v~l~V~R~g~~~~~~v~l~  431 (578)
                       .+-..|..|+|++.|.+....+.+++.
T Consensus       155 ~irG~~Gt~V~L~i~r~~~~k~~~v~l~  182 (406)
T COG0793         155 LIRGKPGTKVTLTILRAGGGKPFTVTLT  182 (406)
T ss_pred             HhCCCCCCeEEEEEEEcCCCceeEEEEE
Confidence             233478999999999744333333333


No 60 
>KOG3580 consensus Tight junction proteins [Signal transduction mechanisms]
Probab=97.59  E-value=0.00059  Score=73.12  Aligned_cols=76  Identities=21%  Similarity=0.274  Sum_probs=52.8

Q ss_pred             hhhhhccccCCCCcEEEEeCCCCcccC--CCCCCCEEEEECCEEeCCCCCCccccCcchhHHHHHhccCCCCEEEEEEEE
Q 008087          342 LRVAMSMKADQKGVRIRRVDPTAPESE--VLKPSDIILSFDGIDIANDGTVPFRHGERIGFSYLVSQKYTGDSAAVKVLR  419 (578)
Q Consensus       342 ~~~~lgl~~~~~Gv~V~~V~p~spA~~--GL~~GDiIl~InG~~V~~~~~v~~~~~~~~~~~~~~~~~~~g~~v~l~V~R  419 (578)
                      ..+.|||.- ..-+.|.++...+.|++  +|+.||+|++|||.-..+..- .       +...++... . .+++|.|+|
T Consensus       209 ~nEEyGlrL-gSqIFvKeit~~gLAardgnlqEGDiiLkINGtvteNmSL-t-------Dar~LIEkS-~-GKL~lvVlR  277 (1027)
T KOG3580|consen  209 ANEEYGLRL-GSQIFVKEITRTGLAARDGNLQEGDIILKINGTVTENMSL-T-------DARKLIEKS-R-GKLQLVVLR  277 (1027)
T ss_pred             cchhhcccc-cchhhhhhhcccchhhccCCcccccEEEEECcEeeccccc-h-------hHHHHHHhc-c-CceEEEEEe
Confidence            344566654 23478888988888887  899999999999998887632 1       133455443 3 468999999


Q ss_pred             CCEEEEEEE
Q 008087          420 DSKILNFNI  428 (578)
Q Consensus       420 ~g~~~~~~v  428 (578)
                      +.+..-+.|
T Consensus       278 D~~qtLiNi  286 (1027)
T KOG3580|consen  278 DSQQTLINI  286 (1027)
T ss_pred             cCCceeeec
Confidence            866554544


No 61 
>TIGR02860 spore_IV_B stage IV sporulation protein B. SpoIVB, the stage IV sporulation protein B of endospore-forming bacteria such as Bacillus subtilis, is a serine proteinase, expressed in the spore (rather than mother cell) compartment, that participates in a proteolytic activation cascade for Sigma-K. It appears to be universal among endospore-forming bacteria and occurs nowhere else.
Probab=97.58  E-value=9.9e-05  Score=77.87  Aligned_cols=51  Identities=22%  Similarity=0.399  Sum_probs=47.0

Q ss_pred             ecCCcCCCCCCCEEEEeCCeecCCHHHHHHHHHhcCCCeEEEEEecCeEEE
Q 008087          499 CLRSPLCLNCFNKVLAFNGNPVKNLKSLANMVENCDDEFLKFDLEYDQVVV  549 (578)
Q Consensus       499 v~A~~aGl~~GD~I~~VNG~~V~~~~~l~~~l~~~~~~~v~l~v~R~~~~~  549 (578)
                      ++|+++||+.||+|++|||++|.+|+||.+++++..++.+.|++.|+++..
T Consensus       124 SPAa~AGLq~GDiIvsING~~V~s~~DL~~iL~~~~g~~V~LtV~R~Ge~~  174 (402)
T TIGR02860       124 SPGEEAGIQIGDRILKINGEKIKNMDDLANLINKAGGEKLTLTIERGGKII  174 (402)
T ss_pred             CHHHHcCCCCCCEEEEECCEECCCHHHHHHHHHhCCCCeEEEEEEECCEEE
Confidence            478889999999999999999999999999999998899999999987653


No 62 
>COG3480 SdrC Predicted secreted protein containing a PDZ domain [Signal transduction mechanisms]
Probab=97.50  E-value=0.00021  Score=71.70  Aligned_cols=71  Identities=25%  Similarity=0.312  Sum_probs=63.8

Q ss_pred             CCcEEEEeCCCCcccCCCCCCCEEEEECCEEeCCCCCCccccCcchhHHHHHhccCCCCEEEEEEEE-CCEEEEEEEEec
Q 008087          353 KGVRIRRVDPTAPESEVLKPSDIILSFDGIDIANDGTVPFRHGERIGFSYLVSQKYTGDSAAVKVLR-DSKILNFNITLA  431 (578)
Q Consensus       353 ~Gv~V~~V~p~spA~~GL~~GDiIl~InG~~V~~~~~v~~~~~~~~~~~~~~~~~~~g~~v~l~V~R-~g~~~~~~v~l~  431 (578)
                      .|+++..|..++|+..-|+.||.|++|||+++.+..+          |...+.....|+++++++.| +++....++++.
T Consensus       130 ~gvyv~~v~~~~~~~gkl~~gD~i~avdg~~f~s~~e----------~i~~v~~~k~Gd~VtI~~~r~~~~~~~~~~tl~  199 (342)
T COG3480         130 AGVYVLSVIDNSPFKGKLEAGDTIIAVDGEPFTSSDE----------LIDYVSSKKPGDEVTIDYERHNETPEIVTITLI  199 (342)
T ss_pred             eeEEEEEccCCcchhceeccCCeEEeeCCeecCCHHH----------HHHHHhccCCCCeEEEEEEeccCCCceEEEEEE
Confidence            5899999999999988999999999999999999988          67888888999999999997 888888887777


Q ss_pred             cc
Q 008087          432 TH  433 (578)
Q Consensus       432 ~~  433 (578)
                      ..
T Consensus       200 ~~  201 (342)
T COG3480         200 KN  201 (342)
T ss_pred             ee
Confidence            65


No 63 
>cd00992 PDZ_signaling PDZ domain found in a variety of Eumetazoan signaling molecules, often in tandem arrangements. May be responsible for specific protein-protein interactions, as most PDZ domains bind C-terminal polypeptides, and binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of PDZ domains an N-terminal beta-strand forms the peptide-binding groove base, a circular permutation with respect to PDZ domains found in proteases.
Probab=97.42  E-value=0.00047  Score=56.16  Aligned_cols=49  Identities=14%  Similarity=0.128  Sum_probs=40.2

Q ss_pred             eEEEEe----ecCCcCCCCCCCEEEEeCCeecC--CHHHHHHHHHhcCCCeEEEEE
Q 008087          493 MSSLLW----CLRSPLCLNCFNKVLAFNGNPVK--NLKSLANMVENCDDEFLKFDL  542 (578)
Q Consensus       493 ~vvvs~----v~A~~aGl~~GD~I~~VNG~~V~--~~~~l~~~l~~~~~~~v~l~v  542 (578)
                      ++++..    ++|+++||++||+|++|||+++.  +++++.++++.... .+.|.+
T Consensus        27 ~~~V~~v~~~s~a~~~gl~~GD~I~~ing~~i~~~~~~~~~~~l~~~~~-~v~l~v   81 (82)
T cd00992          27 GIFVSRVEPGGPAERGGLRVGDRILEVNGVSVEGLTHEEAVELLKNSGD-EVTLTV   81 (82)
T ss_pred             CeEEEEECCCChHHhCCCCCCCEEEEECCEEcCccCHHHHHHHHHhCCC-eEEEEE
Confidence            455554    47889999999999999999999  99999999997654 566654


No 64 
>PF00595 PDZ:  PDZ domain (Also known as DHR or GLGF) Coordinates are not yet available;  InterPro: IPR001478 PDZ domains are found in diverse signalling proteins in bacteria, yeasts, plants, insects and vertebrates. PDZ domains can occur in one or multiple copies and are nearly always found in cytoplasmic proteins. They bind either the carboxyl-terminal sequences of proteins or internal peptide sequences. In most cases, interaction between a PDZ domain and its target is constitutive, with a binding affinity of 1 to 10 micromolar. However, agonist-dependent activation of cell surface receptors is sometimes required to promote interaction with a PDZ protein. PDZ domain proteins are frequently associated with the plasma membrane, a compartment where high concentrations of phosphatidylinositol 4,5-bisphosphate (PIP2) are found. Direct interaction between PIP2 and a subset of class II PDZ domains (syntenin, CASK, Tiam-1) has been demonstrated.  PDZ domains consist of 80 to 90 amino acids comprising six beta-strands (beta-A to beta-F) and two alpha-helices, A and B, compactly arranged in a globular structure. Peptide binding of the ligand takes place in an elongated surface groove as an anti-parallel beta-strand interacts with the beta-B strand and the B helix. The structure of PDZ domains allows binding to a free carboxylate group at the end of a peptide through a carboxylate-binding loop between the beta-A and beta-B strands.; GO: 0005515 protein binding; PDB: 3AXA_A 1WF8_A 1QAV_B 1QAU_A 1B8Q_A 1MC7_A 2KAW_A 1I16_A 1VB7_A 1WI4_A ....
Probab=97.36  E-value=0.00033  Score=57.27  Aligned_cols=51  Identities=10%  Similarity=0.224  Sum_probs=41.6

Q ss_pred             ceEEEEee----cCCcCCCCCCCEEEEeCCeecCCH--HHHHHHHHhcCCCeEEEEEe
Q 008087          492 QMSSLLWC----LRSPLCLNCFNKVLAFNGNPVKNL--KSLANMVENCDDEFLKFDLE  543 (578)
Q Consensus       492 ~~vvvs~v----~A~~aGl~~GD~I~~VNG~~V~~~--~~l~~~l~~~~~~~v~l~v~  543 (578)
                      .+++++.+    +|+++||+.||+|++|||+++.++  .+..++++.+.+ .++|+|+
T Consensus        25 ~~~~V~~v~~~~~a~~~gl~~GD~Il~INg~~v~~~~~~~~~~~l~~~~~-~v~L~V~   81 (81)
T PF00595_consen   25 KGVFVSSVVPGSPAERAGLKVGDRILEINGQSVRGMSHDEVVQLLKSASN-PVTLTVQ   81 (81)
T ss_dssp             EEEEEEEECTTSHHHHHTSSTTEEEEEETTEESTTSBHHHHHHHHHHSTS-EEEEEEE
T ss_pred             CCEEEEEEeCCChHHhcccchhhhhheeCCEeCCCCCHHHHHHHHHCCCC-cEEEEEC
Confidence            35666655    789999999999999999999955  777888888766 7888763


No 65 
>KOG3605 consensus Beta amyloid precursor-binding protein [General function prediction only]
Probab=97.34  E-value=0.00045  Score=74.84  Aligned_cols=123  Identities=12%  Similarity=0.109  Sum_probs=79.6

Q ss_pred             EEEeCCCCcccC--CCCCCCEEEEECCEEeCCCCCCccccCcchhHHHHHhccCCCCEEEEEEEECCEEEEEEEEecccc
Q 008087          357 IRRVDPTAPESE--VLKPSDIILSFDGIDIANDGTVPFRHGERIGFSYLVSQKYTGDSAAVKVLRDSKILNFNITLATHR  434 (578)
Q Consensus       357 V~~V~p~spA~~--GL~~GDiIl~InG~~V~~~~~v~~~~~~~~~~~~~~~~~~~g~~v~l~V~R~g~~~~~~v~l~~~~  434 (578)
                      |.....++||++  .|-.||.|++|||..+-..--.        .++..+........|+++|.+=--..++.|.  .  
T Consensus       677 iAnmm~~GpAarsgkLnIGDQiiaING~SLVGLPLs--------tcQs~Ik~~KnQT~VkltiV~cpPV~~V~I~--R--  744 (829)
T KOG3605|consen  677 IANMMHGGPAARSGKLNIGDQIMSINGTSLVGLPLS--------TCQSIIKGLKNQTAVKLNIVSCPPVTTVLIR--R--  744 (829)
T ss_pred             HHhcccCChhhhcCCccccceeEeecCceeccccHH--------HHHHHHhcccccceEEEEEecCCCceEEEee--c--
Confidence            335566899998  5999999999999877653211        1456677666666788887763333233221  1  


Q ss_pred             cccCCCCCCCCCCceeeccEEEeehHHHHHHhchhhhhhhhhccccccccccccCCCceEEEE---eecCCcCCCCCCCE
Q 008087          435 RLIPSHNKGRPPSYYIIAGFVFSRCLYLISVLSMERIMNMKLRSSFWTSSCIQCHNCQMSSLL---WCLRSPLCLNCFNK  511 (578)
Q Consensus       435 ~~~~~~~~~~~p~~~~~~Gl~~~~~p~~~~~~g~~~~~~~~~l~~~~~~~~~~~~~~~~vvvs---~v~A~~aGl~~GD~  511 (578)
                                               |..+-.+|+..                    +.|++++   +.-|++.|++.|-+
T Consensus       745 -------------------------Pd~kyQLGFSV--------------------QNGiICSLlRGGIAERGGVRVGHR  779 (829)
T KOG3605|consen  745 -------------------------PDLRYQLGFSV--------------------QNGIICSLLRGGIAERGGVRVGHR  779 (829)
T ss_pred             -------------------------ccchhhcccee--------------------eCcEeehhhcccchhccCceeeee
Confidence                                     11111222111                    2355554   66899999999999


Q ss_pred             EEEeCCeecCC--HHHHHHHHHhcCCC
Q 008087          512 VLAFNGNPVKN--LKSLANMVENCDDE  536 (578)
Q Consensus       512 I~~VNG~~V~~--~~~l~~~l~~~~~~  536 (578)
                      |++||||.|--  .+-.+++|..+-++
T Consensus       780 IIEINgQSVVA~pHekIV~lLs~aVGE  806 (829)
T KOG3605|consen  780 IIEINGQSVVATPHEKIVQLLSNAVGE  806 (829)
T ss_pred             EEEECCceEEeccHHHHHHHHHHhhhh
Confidence            99999999863  35578888776554


No 66 
>COG3975 Predicted protease with the C-terminal PDZ domain [General function prediction only]
Probab=97.28  E-value=0.00044  Score=73.95  Aligned_cols=85  Identities=21%  Similarity=0.368  Sum_probs=64.3

Q ss_pred             CCcccccccChhhhhhhcccc--CCCCcEEEEeCCCCcccC-CCCCCCEEEEECCEEeCCCCCCccccCcchhHHHHHhc
Q 008087          330 LGVEWQKMENPDLRVAMSMKA--DQKGVRIRRVDPTAPESE-VLKPSDIILSFDGIDIANDGTVPFRHGERIGFSYLVSQ  406 (578)
Q Consensus       330 lGi~~~~~~~~~~~~~lgl~~--~~~Gv~V~~V~p~spA~~-GL~~GDiIl~InG~~V~~~~~v~~~~~~~~~~~~~~~~  406 (578)
                      .|+.+..+ . ...-+||+.-  +..+.+|..|.++|||++ ||.+||.|++|||.   +  +             .+.+
T Consensus       439 ~gL~~~~~-~-~~~~~LGl~v~~~~g~~~i~~V~~~gPA~~AGl~~Gd~ivai~G~---s--~-------------~l~~  498 (558)
T COG3975         439 FGLTFTPK-P-REAYYLGLKVKSEGGHEKITFVFPGGPAYKAGLSPGDKIVAINGI---S--D-------------QLDR  498 (558)
T ss_pred             cceEEEec-C-CCCcccceEecccCCeeEEEecCCCChhHhccCCCccEEEEEcCc---c--c-------------cccc
Confidence            45666554 2 2244566543  345688999999999999 99999999999998   1  1             2445


Q ss_pred             cCCCCEEEEEEEECCEEEEEEEEecccc
Q 008087          407 KYTGDSAAVKVLRDSKILNFNITLATHR  434 (578)
Q Consensus       407 ~~~g~~v~l~V~R~g~~~~~~v~l~~~~  434 (578)
                      ...++.+++++.|.|+.+++.+++....
T Consensus       499 ~~~~d~i~v~~~~~~~L~e~~v~~~~~~  526 (558)
T COG3975         499 YKVNDKIQVHVFREGRLREFLVKLGGDP  526 (558)
T ss_pred             cccccceEEEEccCCceEEeecccCCCc
Confidence            5688999999999999999988876543


No 67 
>PRK11186 carboxy-terminal protease; Provisional
Probab=97.19  E-value=0.0011  Score=74.94  Aligned_cols=71  Identities=17%  Similarity=0.178  Sum_probs=47.9

Q ss_pred             CCcEEEEeCCCCcccC--CCCCCCEEEEEC--CEEeCCCCCCccccCcchhHHHHHhccCCCCEEEEEEEEC---CEEEE
Q 008087          353 KGVRIRRVDPTAPESE--VLKPSDIILSFD--GIDIANDGTVPFRHGERIGFSYLVSQKYTGDSAAVKVLRD---SKILN  425 (578)
Q Consensus       353 ~Gv~V~~V~p~spA~~--GL~~GDiIl~In--G~~V~~~~~v~~~~~~~~~~~~~~~~~~~g~~v~l~V~R~---g~~~~  425 (578)
                      .+++|..|.+||||++  ||++||+|++||  |.++.+.....+.     .+..++. -..|.+|.|+|.|+   |+..+
T Consensus       255 ~~~~V~~vipGsPA~ka~gLk~GD~IlaVn~~g~~~~dv~g~~~~-----~vv~lir-G~~Gt~V~LtV~r~~~~~~~~~  328 (667)
T PRK11186        255 DYTVINSLVAGGPAAKSKKLSVGDKIVGVGQDGKPIVDVIGWRLD-----DVVALIK-GPKGSKVRLEILPAGKGTKTRI  328 (667)
T ss_pred             CeEEEEEccCCChHHHhCCCCCCCEEEEECCCCCcccccccCCHH-----HHHHHhc-CCCCCEEEEEEEeCCCCCceEE
Confidence            3688999999999987  899999999999  5554432211100     0223333 34789999999983   45555


Q ss_pred             EEEE
Q 008087          426 FNIT  429 (578)
Q Consensus       426 ~~v~  429 (578)
                      +++.
T Consensus       329 vtl~  332 (667)
T PRK11186        329 VTLT  332 (667)
T ss_pred             EEEE
Confidence            5554


No 68 
>KOG3129 consensus 26S proteasome regulatory complex, subunit PSMD9 [Posttranslational modification, protein turnover, chaperones]
Probab=97.18  E-value=0.00082  Score=63.54  Aligned_cols=72  Identities=22%  Similarity=0.271  Sum_probs=57.2

Q ss_pred             cEEEEeCCCCcccC-CCCCCCEEEEECCEEeCCCCCCccccCcchhHHHHHhccCCCCEEEEEEEECCEEEEEEEEeccc
Q 008087          355 VRIRRVDPTAPESE-VLKPSDIILSFDGIDIANDGTVPFRHGERIGFSYLVSQKYTGDSAAVKVLRDSKILNFNITLATH  433 (578)
Q Consensus       355 v~V~~V~p~spA~~-GL~~GDiIl~InG~~V~~~~~v~~~~~~~~~~~~~~~~~~~g~~v~l~V~R~g~~~~~~v~l~~~  433 (578)
                      ++|..|.|+|||+. ||+.||.|+++....-.++..+.     +  + ..+.....++.+.++|.|.|+...+.++...|
T Consensus       141 a~V~sV~~~SPA~~aGl~~gD~il~fGnV~sgn~~~lq-----~--i-~~~v~~~e~~~v~v~v~R~g~~v~L~ltP~~W  212 (231)
T KOG3129|consen  141 AVVDSVVPGSPADEAGLCVGDEILKFGNVHSGNFLPLQ-----N--I-AAVVQSNEDQIVSVTVIREGQKVVLSLTPKKW  212 (231)
T ss_pred             EEEeecCCCChhhhhCcccCceEEEecccccccchhHH-----H--H-HHHHHhccCcceeEEEecCCCEEEEEeCcccc
Confidence            67899999999999 99999999999887666665421     0  1 12334557888999999999999999988777


Q ss_pred             c
Q 008087          434 R  434 (578)
Q Consensus       434 ~  434 (578)
                      .
T Consensus       213 ~  213 (231)
T KOG3129|consen  213 Q  213 (231)
T ss_pred             c
Confidence            4


No 69 
>COG3031 PulC Type II secretory pathway, component PulC [Intracellular trafficking and secretion]
Probab=97.16  E-value=0.00038  Score=67.00  Aligned_cols=66  Identities=18%  Similarity=0.229  Sum_probs=52.9

Q ss_pred             CcEEEEeCCCCcccC-CCCCCCEEEEECCEEeCCCCCCccccCcchhHHHHHhccCCCCEEEEEEEECCEEEEEEEE
Q 008087          354 GVRIRRVDPTAPESE-VLKPSDIILSFDGIDIANDGTVPFRHGERIGFSYLVSQKYTGDSAAVKVLRDSKILNFNIT  429 (578)
Q Consensus       354 Gv~V~~V~p~spA~~-GL~~GDiIl~InG~~V~~~~~v~~~~~~~~~~~~~~~~~~~g~~v~l~V~R~g~~~~~~v~  429 (578)
                      |..+.-..+++..+. |||.||+.++||+..+++.++          +..++.....-..++++|+|+|+...+.+.
T Consensus       208 Gyr~~pgkd~slF~~sglq~GDIavaiNnldltdp~~----------m~~llq~l~~m~s~qlTv~R~G~rhdInV~  274 (275)
T COG3031         208 GYRFEPGKDGSLFYKSGLQRGDIAVAINNLDLTDPED----------MFRLLQMLRNMPSLQLTVIRRGKRHDINVR  274 (275)
T ss_pred             EEEecCCCCcchhhhhcCCCcceEEEecCcccCCHHH----------HHHHHHhhhcCcceEEEEEecCccceeeec
Confidence            343443445566777 999999999999999999877          456777777778899999999999888764


No 70 
>TIGR03279 cyano_FeS_chp putative FeS-containing Cyanobacterial-specific oxidoreductase. Members of this protein family are predicted FeS-containing oxidoreductases of unknown function, apparently restricted to and universal across the Cyanobacteria. The high trusted cutoff score for this model, 700 bits, excludes homologs from other lineages. This exclusion seems justified because a significant number of sequence positions are simultaneously unique to and invariant across the Cyanobacteria, suggesting a specialized, conserved function, perhaps related to photosynthesis. A distantly related protein family, TIGR03278, is universal in and restricted to archaeal methanogens, and may be linked to methanogenesis.
Probab=97.03  E-value=0.00068  Score=72.14  Aligned_cols=51  Identities=16%  Similarity=0.170  Sum_probs=43.4

Q ss_pred             EEEeecCCcCCCCCCCEEEEeCCeecCCHHHHHHHHHhcCCCeEEEEEe-cCeEE
Q 008087          495 SLLWCLRSPLCLNCFNKVLAFNGNPVKNLKSLANMVENCDDEFLKFDLE-YDQVV  548 (578)
Q Consensus       495 vvs~v~A~~aGl~~GD~I~~VNG~~V~~~~~l~~~l~~~~~~~v~l~v~-R~~~~  548 (578)
                      |.++++|+++||++||+|++|||++|.+|.++...+.   ++.+.+++. |+++.
T Consensus         5 V~pgSpAe~AGLe~GD~IlsING~~V~Dw~D~~~~l~---~e~l~L~V~~rdGe~   56 (433)
T TIGR03279         5 VLPGSIAEELGFEPGDALVSINGVAPRDLIDYQFLCA---DEELELEVLDANGES   56 (433)
T ss_pred             cCCCCHHHHcCCCCCCEEEEECCEECCCHHHHHHHhc---CCcEEEEEEcCCCeE
Confidence            5567899999999999999999999999999988874   467888885 66644


No 71 
>PF04495 GRASP55_65:  GRASP55/65 PDZ-like domain ;  InterPro: IPR007583 GRASP55 (Golgi reassembly stacking protein of 55 kDa) and GRASP65 (a 65 kDa protein) are highly homologous. GRASP55 is a component of the Golgi stacking machinery. GRASP65 is an N-ethylmaleimide-sensitive membrane protein required for the stacking of Golgi cisternae in a cell-free system.; PDB: 3RLE_A 4EDJ_A.
Probab=97.00  E-value=0.0014  Score=59.34  Aligned_cols=86  Identities=23%  Similarity=0.338  Sum_probs=54.6

Q ss_pred             cccCCcccccccChhhhhhhccccCCCCcEEEEeCCCCcccC-CCCC-CCEEEEECCEEeCCCCCCccccCcchhHHHHH
Q 008087          327 FPLLGVEWQKMENPDLRVAMSMKADQKGVRIRRVDPTAPESE-VLKP-SDIILSFDGIDIANDGTVPFRHGERIGFSYLV  404 (578)
Q Consensus       327 ~~~lGi~~~~~~~~~~~~~lgl~~~~~Gv~V~~V~p~spA~~-GL~~-GDiIl~InG~~V~~~~~v~~~~~~~~~~~~~~  404 (578)
                      .+.||+.++-- ...-       ....+..|.+|.|+|||++ ||++ .|.|+.+|+..+.+.++          |..++
T Consensus        25 ~g~LG~sv~~~-~~~~-------~~~~~~~Vl~V~p~SPA~~AGL~p~~DyIig~~~~~l~~~~~----------l~~~v   86 (138)
T PF04495_consen   25 QGLLGISVRFE-SFEG-------AEEEGWHVLRVAPNSPAAKAGLEPFFDYIIGIDGGLLDDEDD----------LFELV   86 (138)
T ss_dssp             SSSS-EEEEEE-E-TT-------GCCCEEEEEEE-TTSHHHHTT--TTTEEEEEETTCE--STCH----------HHHHH
T ss_pred             CCCCcEEEEEe-cccc-------cccceEEEeEecCCCHHHHCCccccccEEEEccceecCCHHH----------HHHHH
Confidence            46788877643 1110       1245789999999999999 9999 69999999988887655          56666


Q ss_pred             hccCCCCEEEEEEEEC--CEEEEEEEEec
Q 008087          405 SQKYTGDSAAVKVLRD--SKILNFNITLA  431 (578)
Q Consensus       405 ~~~~~g~~v~l~V~R~--g~~~~~~v~l~  431 (578)
                      .. ..++.+.|.|...  ...+++.+...
T Consensus        87 ~~-~~~~~l~L~Vyns~~~~vR~V~i~P~  114 (138)
T PF04495_consen   87 EA-NENKPLQLYVYNSKTDSVREVTITPS  114 (138)
T ss_dssp             HH-TTTS-EEEEEEETTTTCEEEEEE---
T ss_pred             HH-cCCCcEEEEEEECCCCeEEEEEEEcC
Confidence            65 4678999999863  45556666554


No 72 
>smart00228 PDZ Domain present in PSD-95, Dlg, and ZO-1/2. Also called DHR (Dlg homologous region) or GLGF (relatively well conserved tetrapeptide in these domains). Some PDZs have been shown to bind C-terminal polypeptides; others appear to bind internal (non-C-terminal) polypeptides. Different PDZs possess different binding specificities.
Probab=96.99  E-value=0.0015  Score=53.30  Aligned_cols=53  Identities=13%  Similarity=0.144  Sum_probs=39.6

Q ss_pred             eEEEEe----ecCCcCCCCCCCEEEEeCCeecCCHHHH--HHHHHhcCCCeEEEEEecCe
Q 008087          493 MSSLLW----CLRSPLCLNCFNKVLAFNGNPVKNLKSL--ANMVENCDDEFLKFDLEYDQ  546 (578)
Q Consensus       493 ~vvvs~----v~A~~aGl~~GD~I~~VNG~~V~~~~~l--~~~l~~~~~~~v~l~v~R~~  546 (578)
                      ++++..    ++|+++||++||+|++|||+++.++.+.  ...+... ++.+.|.+.|++
T Consensus        27 ~~~i~~v~~~s~a~~~gl~~GD~I~~In~~~v~~~~~~~~~~~~~~~-~~~~~l~i~r~~   85 (85)
T smart00228       27 GVVVSSVVPGSPAAKAGLKVGDVILEVNGTSVEGLTHLEAVDLLKKA-GGKVTLTVLRGG   85 (85)
T ss_pred             CEEEEEECCCCHHHHcCCCCCCEEEEECCEECCCCCHHHHHHHHHhC-CCeEEEEEEeCC
Confidence            455554    4788999999999999999999977554  3444443 458888888763


No 73 
>TIGR00225 prc C-terminal peptidase (prc). A C-terminal peptidase with different substrates in different species, including processing of D1 protein of the photosystem II reaction center in higher plants and cleavage of a peptide of 11 residues from the precursor form of penicillin-binding protein in E. coli. E. coli and H. influenzae have the most distal branch of the tree, and their proteins have an N-terminal 200 amino acids that show no homology to other proteins in the database.
Probab=96.99  E-value=0.0006  Score=71.54  Aligned_cols=50  Identities=8%  Similarity=0.109  Sum_probs=43.7

Q ss_pred             EeecCCcCCCCCCCEEEEeCCeecCCH--HHHHHHHHhcCCCeEEEEEecCe
Q 008087          497 LWCLRSPLCLNCFNKVLAFNGNPVKNL--KSLANMVENCDDEFLKFDLEYDQ  546 (578)
Q Consensus       497 s~v~A~~aGl~~GD~I~~VNG~~V~~~--~~l~~~l~~~~~~~v~l~v~R~~  546 (578)
                      .+++|+++||+.||+|++|||++|.+|  .++...+....++.+.|++.|++
T Consensus        71 ~~spA~~aGL~~GD~I~~Ing~~v~~~~~~~~~~~l~~~~g~~v~l~v~R~g  122 (334)
T TIGR00225        71 EGSPAEKAGIKPGDKIIKINGKSVAGMSLDDAVALIRGKKGTKVSLEILRAG  122 (334)
T ss_pred             CCChHHHcCCCCCCEEEEECCEECCCCCHHHHHHhccCCCCCEEEEEEEeCC
Confidence            456899999999999999999999986  67888887777888999998875


No 74 
>PF12812 PDZ_1:  PDZ-like domain
Probab=96.92  E-value=0.0012  Score=53.65  Aligned_cols=58  Identities=12%  Similarity=0.073  Sum_probs=51.0

Q ss_pred             ccCCcccccccChhhhhhhccccCCCCcEEEEeCCCCcccC-CCCCCCEEEEECCEEeCCCCC
Q 008087          328 PLLGVEWQKMENPDLRVAMSMKADQKGVRIRRVDPTAPESE-VLKPSDIILSFDGIDIANDGT  389 (578)
Q Consensus       328 ~~lGi~~~~~~~~~~~~~lgl~~~~~Gv~V~~V~p~spA~~-GL~~GDiIl~InG~~V~~~~~  389 (578)
                      -|.|..++++ +...++.++++-   |+++.....++++.. ++..|-+|.+|||+++.+.++
T Consensus         9 ~~~Ga~f~~L-s~q~aR~~~~~~---~gv~v~~~~g~~~~~~~i~~g~iI~~Vn~kpt~~Ld~   67 (78)
T PF12812_consen    9 EVCGAVFHDL-SYQQARQYGIPV---GGVYVAVSGGSLAFAGGISKGFIITSVNGKPTPDLDD   67 (78)
T ss_pred             EEcCeecccC-CHHHHHHhCCCC---CEEEEEecCCChhhhCCCCCCeEEEeECCcCCcCHHH
Confidence            4789999999 899999999977   466667788899888 699999999999999999877


No 75 
>PF03761 DUF316:  Domain of unknown function (DUF316) ;  InterPro: IPR005514 This is a family of uncharacterised proteins from Caenorhabditis elegans.
Probab=96.89  E-value=0.03  Score=57.15  Aligned_cols=109  Identities=16%  Similarity=0.205  Sum_probs=64.3

Q ss_pred             CCCCEEEEEEeeCCCCCCeeeEEcCCCCC---CCCcEEEEeecCCCCcceEEEeEEeceeeeeecCCceeeeEEEEecCC
Q 008087          194 TECDIAMLTVEDDEFWEGVLPVEFGELPA---LQDAVTVVGYPIGGDTISVTSGVVSRIEILSYVHGSTELLGLQIDAAI  270 (578)
Q Consensus       194 ~~~DlAlLkv~~~~~~~~~~~l~l~~~~~---~g~~V~~iG~p~g~~~~sv~~G~Is~~~~~~~~~~~~~~~~i~~da~i  270 (578)
                      ...+++||+++.+ +.....++.|+++..   .++.+.+.|+... .  .+....+.-.....      ....+......
T Consensus       159 ~~~~~mIlEl~~~-~~~~~~~~Cl~~~~~~~~~~~~~~~yg~~~~-~--~~~~~~~~i~~~~~------~~~~~~~~~~~  228 (282)
T PF03761_consen  159 RPYSPMILELEED-FSKNVSPPCLADSSTNWEKGDEVDVYGFNST-G--KLKHRKLKITNCTK------CAYSICTKQYS  228 (282)
T ss_pred             cccceEEEEEccc-ccccCCCEEeCCCccccccCceEEEeecCCC-C--eEEEEEEEEEEeec------cceeEeccccc
Confidence            3569999999988 334788899987543   3788888888222 1  12222222111110      11235556677


Q ss_pred             CCCCccceEEc---cCCeEEEEEeccccccccCCceeeecchhHHH
Q 008087          271 NSGNSGGPAFN---DKGKCVGIAFQSLKHEDVENIGYVIPTPVIMH  313 (578)
Q Consensus       271 ~~G~SGGPlvn---~~G~vVGI~~~~~~~~~~~~~~~aiPi~~i~~  313 (578)
                      +.|++||||+.   ....||||.+..... ...+..+++.+...++
T Consensus       229 ~~~d~Gg~lv~~~~gr~tlIGv~~~~~~~-~~~~~~~f~~v~~~~~  273 (282)
T PF03761_consen  229 CKGDRGGPLVKNINGRWTLIGVGASGNYE-CNKNNSYFFNVSWYQD  273 (282)
T ss_pred             CCCCccCeEEEEECCCEEEEEEEccCCCc-ccccccEEEEHHHhhh
Confidence            89999999983   334799998763211 1112456666655444


No 76 
>KOG3834 consensus Golgi reassembly stacking protein GRASP65, contains PDZ domain [Intracellular trafficking, secretion, and vesicular transport]
Probab=96.88  E-value=0.0033  Score=65.55  Aligned_cols=143  Identities=17%  Similarity=0.238  Sum_probs=95.7

Q ss_pred             CCCcEEEEeCCCCcccC-CCCC-CCEEEEECCEEeCCCCCCccccCcchhHHHHHhccCCCCEEEEEEEEC--CEEEEEE
Q 008087          352 QKGVRIRRVDPTAPESE-VLKP-SDIILSFDGIDIANDGTVPFRHGERIGFSYLVSQKYTGDSAAVKVLRD--SKILNFN  427 (578)
Q Consensus       352 ~~Gv~V~~V~p~spA~~-GL~~-GDiIl~InG~~V~~~~~v~~~~~~~~~~~~~~~~~~~g~~v~l~V~R~--g~~~~~~  427 (578)
                      ..|.-|-+|..+|+|++ ||++ -|.|++|||..+...++.         |..+++...  +.|+++|.--  -..+.+.
T Consensus        14 teg~hvlkVqedSpa~~aglepffdFIvSI~g~rL~~dnd~---------Lk~llk~~s--ekVkltv~n~kt~~~R~v~   82 (462)
T KOG3834|consen   14 TEGYHVLKVQEDSPAHKAGLEPFFDFIVSINGIRLNKDNDT---------LKALLKANS--EKVKLTVYNSKTQEVRIVE   82 (462)
T ss_pred             ceeEEEEEeecCChHHhcCcchhhhhhheeCcccccCchHH---------HHHHHHhcc--cceEEEEEecccceeEEEE
Confidence            45788899999999999 9988 499999999999988773         445555433  3389998742  2333444


Q ss_pred             EEecccccccCCCCCCCCCCceeeccEEEeeh---HHHHHHhchhhhhhhhhccccccccccccCCCceEEEEeecCCcC
Q 008087          428 ITLATHRRLIPSHNKGRPPSYYIIAGFVFSRC---LYLISVLSMERIMNMKLRSSFWTSSCIQCHNCQMSSLLWCLRSPL  504 (578)
Q Consensus       428 v~l~~~~~~~~~~~~~~~p~~~~~~Gl~~~~~---p~~~~~~g~~~~~~~~~l~~~~~~~~~~~~~~~~vvvs~v~A~~a  504 (578)
                      |+....-..          .   +.|+.+.-.   .+....|.                        -.-|-..++|+.|
T Consensus        83 I~ps~~wgg----------q---llGvsvrFcsf~~A~~~vwH------------------------vl~V~p~SPaalA  125 (462)
T KOG3834|consen   83 IVPSNNWGG----------Q---LLGVSVRFCSFDGAVESVWH------------------------VLSVEPNSPAALA  125 (462)
T ss_pred             ecccccccc----------c---ccceEEEeccCccchhheee------------------------eeecCCCCHHHhc
Confidence            443322110          0   223333221   11111111                        1234556799999


Q ss_pred             CCC-CCCEEEEeCCeecCCHHHHHHHHHhcCCCeEEEEE
Q 008087          505 CLN-CFNKVLAFNGNPVKNLKSLANMVENCDDEFLKFDL  542 (578)
Q Consensus       505 Gl~-~GD~I~~VNG~~V~~~~~l~~~l~~~~~~~v~l~v  542 (578)
                      ||. .+|.|+.+-..-....+||..+|+.+.++.+++-|
T Consensus       126 gl~~~~DYivG~~~~~~~~~eDl~~lIeshe~kpLklyV  164 (462)
T KOG3834|consen  126 GLRPYTDYIVGIWDAVMHEEEDLFTLIESHEGKPLKLYV  164 (462)
T ss_pred             ccccccceEecchhhhccchHHHHHHHHhccCCCcceeE
Confidence            998 89999999666677899999999999888888766


No 77 
>PLN00049 carboxyl-terminal processing protease; Provisional
Probab=96.86  E-value=0.0013  Score=70.49  Aligned_cols=51  Identities=12%  Similarity=0.089  Sum_probs=44.0

Q ss_pred             EeecCCcCCCCCCCEEEEeCCeecCC--HHHHHHHHHhcCCCeEEEEEecCeE
Q 008087          497 LWCLRSPLCLNCFNKVLAFNGNPVKN--LKSLANMVENCDDEFLKFDLEYDQV  547 (578)
Q Consensus       497 s~v~A~~aGl~~GD~I~~VNG~~V~~--~~~l~~~l~~~~~~~v~l~v~R~~~  547 (578)
                      .+++|+++||++||+|++|||++|.+  +.++...++...+..+.|++.|++.
T Consensus       111 ~~SPA~~aGl~~GD~Iv~InG~~v~~~~~~~~~~~l~g~~g~~v~ltv~r~g~  163 (389)
T PLN00049        111 PGGPAARAGIRPGDVILAIDGTSTEGLSLYEAADRLQGPEGSSVELTLRRGPE  163 (389)
T ss_pred             CCChHHHcCCCCCCEEEEECCEECCCCCHHHHHHHHhcCCCCEEEEEEEECCE
Confidence            35689999999999999999999985  4788888877778889999988765


No 78 
>COG5640 Secreted trypsin-like serine protease [Posttranslational modification, protein turnover, chaperones]
Probab=96.78  E-value=0.02  Score=58.77  Aligned_cols=58  Identities=22%  Similarity=0.290  Sum_probs=37.1

Q ss_pred             EEEEEEEcCCEEEecccccCCCc-----eEEEEEc-CC---CcEEEEEEEEEc-------CCCCEEEEEEeeCC
Q 008087          150 SSSGFAIGGRRVLTNAHSVEHYT-----QVKLKKR-GS---DTKYLATVLAIG-------TECDIAMLTVEDDE  207 (578)
Q Consensus       150 ~GsGfvI~~g~ILT~aHvV~~~~-----~i~V~~~-~~---g~~~~a~vv~~d-------~~~DlAlLkv~~~~  207 (578)
                      .|-|-+++..||||+|||+....     .+.|.+. +|   ++....+.++.+       ...|+|++++....
T Consensus        62 fCGgs~l~~RYvLTAAHC~~~~s~is~d~~~vv~~l~d~Sq~~rg~vr~i~~~efY~~~n~~ND~Av~~l~~~a  135 (413)
T COG5640          62 FCGGSKLGGRYVLTAAHCADASSPISSDVNRVVVDLNDSSQAERGHVRTIYVHEFYSPGNLGNDIAVLELARAA  135 (413)
T ss_pred             EeccceecceEEeeehhhccCCCCccccceEEEecccccccccCcceEEEeeecccccccccCcceeecccccc
Confidence            57788888889999999998654     2333331 12   222234444443       35799999998753


No 79 
>PRK09681 putative type II secretion protein GspC; Provisional
Probab=96.75  E-value=0.0028  Score=63.76  Aligned_cols=50  Identities=6%  Similarity=0.074  Sum_probs=44.4

Q ss_pred             cCCcCCCCCCCEEEEeCCeecCCHHHHHHHHHhcCC-CeEEEEEecCeEEE
Q 008087          500 LRSPLCLNCFNKVLAFNGNPVKNLKSLANMVENCDD-EFLKFDLEYDQVVV  549 (578)
Q Consensus       500 ~A~~aGl~~GD~I~~VNG~~V~~~~~l~~~l~~~~~-~~v~l~v~R~~~~~  549 (578)
                      .-.++||++||++++|||.++.+.++..+++++..+ ..+.|+|+|||+.+
T Consensus       219 lF~~~GLq~GDva~sING~dL~D~~qa~~l~~~L~~~tei~ltVeRdGq~~  269 (276)
T PRK09681        219 LFDASGFKEGDIAIALNQQDFTDPRAMIALMRQLPSMDSIQLTVLRKGARH  269 (276)
T ss_pred             HHHHcCCCCCCEEEEeCCeeCCCHHHHHHHHHHhccCCeEEEEEEECCEEE
Confidence            356789999999999999999999999999998765 48999999998864


No 80 
>PF00548 Peptidase_C3:  3C cysteine protease (picornain 3C);  InterPro: IPR000199 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. 
Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad [].  This signature defines cysteine peptidases that belong to MEROPS peptidase family C3 (picornain, clan PA(C)), subfamilies C3A and C3B. The protein fold of this peptidase domain for members of this family resembles that of the serine peptidase, chymotrypsin [], the type example for clan PA. Picornaviral proteins are expressed as a single polyprotein which is cleaved by the viral C3 cysteine protease. The poliovirus polyprotein is selectively cleaved at the Gln-|-Gly bond. In other picornavirus reactions Glu may be substituted for Gln, and Ser or Thr for Gly. ; GO: 0004197 cysteine-type endopeptidase activity, 0006508 proteolysis; PDB: 3SJO_E 2H6M_A 1QA7_C 1HAV_B 2HAL_A 2H9H_A 3QZQ_B 3QZR_A 3R0F_B 3SJ9_A ....
Probab=96.75  E-value=0.023  Score=53.61  Aligned_cols=138  Identities=17%  Similarity=0.293  Sum_probs=80.3

Q ss_pred             cceEEEEEEEcCCEEEecccccCCCceEEEEEcCCCcEEE--EEEEEEcC---CCCEEEEEEeeCCCCCCe-eeEEcCCC
Q 008087          147 YSSSSSGFAIGGRRVLTNAHSVEHYTQVKLKKRGSDTKYL--ATVLAIGT---ECDIAMLTVEDDEFWEGV-LPVEFGEL  220 (578)
Q Consensus       147 ~~~~GsGfvI~~g~ILT~aHvV~~~~~i~V~~~~~g~~~~--a~vv~~d~---~~DlAlLkv~~~~~~~~~-~~l~l~~~  220 (578)
                      ....++|+-|.+.++|.+.|   ......+.+  ++..++  ..+...|.   ..||++++++...-..++ +.+. ...
T Consensus        23 g~~t~l~~gi~~~~~lvp~H---~~~~~~i~i--~g~~~~~~d~~~lv~~~~~~~Dl~~v~l~~~~kfrDIrk~~~-~~~   96 (172)
T PF00548_consen   23 GEFTMLALGIYDRYFLVPTH---EEPEDTIYI--DGVEYKVDDSVVLVDRDGVDTDLTLVKLPRNPKFRDIRKFFP-ESI   96 (172)
T ss_dssp             EEEEEEEEEEEBTEEEEEGG---GGGCSEEEE--TTEEEEEEEEEEEEETTSSEEEEEEEEEESSS-B--GGGGSB-SSG
T ss_pred             ceEEEecceEeeeEEEEECc---CCCcEEEEE--CCEEEEeeeeEEEecCCCcceeEEEEEccCCcccCchhhhhc-ccc
Confidence            45678888999999999999   223334555  455543  23333454   459999999765321222 2222 112


Q ss_pred             CCCCCcEEEEeecCCCCcceEEEeEEeceeeeeecCCceeeeEEEEecCCCCCCccceEEcc---CCeEEEEEec
Q 008087          221 PALQDAVTVVGYPIGGDTISVTSGVVSRIEILSYVHGSTELLGLQIDAAINSGNSGGPAFND---KGKCVGIAFQ  292 (578)
Q Consensus       221 ~~~g~~V~~iG~p~g~~~~sv~~G~Is~~~~~~~~~~~~~~~~i~~da~i~~G~SGGPlvn~---~G~vVGI~~~  292 (578)
                      ....+...++-+. ......+..+.++..+.. ...+......+..+++...|+-||||+..   .++++||+.+
T Consensus        97 ~~~~~~~l~v~~~-~~~~~~~~v~~v~~~~~i-~~~g~~~~~~~~Y~~~t~~G~CG~~l~~~~~~~~~i~GiHva  169 (172)
T PF00548_consen   97 PEYPECVLLVNST-KFPRMIVEVGFVTNFGFI-NLSGTTTPRSLKYKAPTKPGMCGSPLVSRIGGQGKIIGIHVA  169 (172)
T ss_dssp             GTEEEEEEEEESS-SSTCEEEEEEEEEEEEEE-EETTEEEEEEEEEESEEETTGTTEEEEESCGGTTEEEEEEEE
T ss_pred             ccCCCcEEEEECC-CCccEEEEEEEEeecCcc-ccCCCEeeEEEEEccCCCCCccCCeEEEeeccCccEEEEEec
Confidence            2233444444332 223224455555554443 22233333567778888899999999963   5799999987


No 81 
>PF08192 Peptidase_S64:  Peptidase family S64;  InterPro: IPR012985 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This family of fungal proteins is involved in the processing of membrane-bound transcription factor Stp1 [] and belongs to MEROPS peptidase family S64 (clan PA). The processing causes the signalling domain of Stp1 to be passed to the nucleus where several permease genes are induced. The permeases are important for uptake of amino acids, and processing of Stp1 only occurs in an amino acid-rich environment. This family is predicted to be distantly related to the trypsin family (MEROPS peptidase family S1) and to have a typical trypsin-like catalytic triad [].
Probab=96.59  E-value=0.019  Score=63.35  Aligned_cols=117  Identities=19%  Similarity=0.283  Sum_probs=74.6

Q ss_pred             CCCEEEEEEeeCC-----CCCCe------eeEEcCCC--------CCCCCcEEEEeecCCCCcceEEEeEEeceeeeeec
Q 008087          195 ECDIAMLTVEDDE-----FWEGV------LPVEFGEL--------PALQDAVTVVGYPIGGDTISVTSGVVSRIEILSYV  255 (578)
Q Consensus       195 ~~DlAlLkv~~~~-----~~~~~------~~l~l~~~--------~~~g~~V~~iG~p~g~~~~sv~~G~Is~~~~~~~~  255 (578)
                      -.|+|||+++...     +.+++      +.+.+.+.        ...|..|+=+|+..|     .|.|+|.++....+.
T Consensus       542 LsD~AIIkV~~~~~~~N~LGddi~f~~~dP~l~f~NlyV~~~~~~~~~G~~VfK~GrTTg-----yT~G~lNg~klvyw~  616 (695)
T PF08192_consen  542 LSDWAIIKVNKERKCQNYLGDDIQFNEPDPTLMFQNLYVREVVSNLVPGMEVFKVGRTTG-----YTTGILNGIKLVYWA  616 (695)
T ss_pred             ccceEEEEeCCCceecCCCCccccccCCCccccccccchhhhhhccCCCCeEEEecccCC-----ccceEecceEEEEec
Confidence            3599999998653     11122      22333332        123788999997766     356777777655444


Q ss_pred             CCcee-eeEEEEe----cCCCCCCccceEEccCC------eEEEEEeccccccccCCceeeecchhHHHHHHHH
Q 008087          256 HGSTE-LLGLQID----AAINSGNSGGPAFNDKG------KCVGIAFQSLKHEDVENIGYVIPTPVIMHFIQDY  318 (578)
Q Consensus       256 ~~~~~-~~~i~~d----a~i~~G~SGGPlvn~~G------~vVGI~~~~~~~~~~~~~~~aiPi~~i~~~l~~l  318 (578)
                      .+... .+++..+    .=...|+||+=|++.-+      .|+||.++.-+  ....+|++.|+..|+.-|++.
T Consensus       617 dG~i~s~efvV~s~~~~~Fa~~GDSGS~VLtk~~d~~~gLgvvGMlhsydg--e~kqfglftPi~~il~rl~~v  688 (695)
T PF08192_consen  617 DGKIQSSEFVVSSDNNPAFASGGDSGSWVLTKLEDNNKGLGVVGMLHSYDG--EQKQFGLFTPINEILDRLEEV  688 (695)
T ss_pred             CCCeEEEEEEEecCCCccccCCCCcccEEEecccccccCceeeEEeeecCC--ccceeeccCcHHHHHHHHHHh
Confidence            44322 3444444    22347999999998644      49999987432  455788899999887766554


No 82 
>PF14685 Tricorn_PDZ:  Tricorn protease PDZ domain; PDB: 1N6F_D 1N6D_C 1N6E_C 1K32_A.
Probab=96.59  E-value=0.0033  Score=52.24  Aligned_cols=48  Identities=15%  Similarity=0.127  Sum_probs=36.7

Q ss_pred             ecCCcCCC--CCCCEEEEeCCeecCCHHHHHHHHHhcCCCeEEEEEecCe
Q 008087          499 CLRSPLCL--NCFNKVLAFNGNPVKNLKSLANMVENCDDEFLKFDLEYDQ  546 (578)
Q Consensus       499 v~A~~aGl--~~GD~I~~VNG~~V~~~~~l~~~l~~~~~~~v~l~v~R~~  546 (578)
                      +|-.+.|+  +.||.|++|||+++..-.++.++|....++.+.|+|.+..
T Consensus        31 sPL~~pGv~v~~GD~I~aInG~~v~~~~~~~~lL~~~agk~V~Ltv~~~~   80 (88)
T PF14685_consen   31 SPLAQPGVDVREGDYILAINGQPVTADANPYRLLEGKAGKQVLLTVNRKP   80 (88)
T ss_dssp             -GGGGGS----TT-EEEEETTEE-BTTB-HHHHHHTTTTSEEEEEEE-ST
T ss_pred             CCccCCCCCCCCCCEEEEECCEECCCCCCHHHHhcccCCCEEEEEEecCC
Confidence            35556666  5999999999999999999999999999999999998753


No 83 
>PF10459 Peptidase_S46:  Peptidase S46;  InterPro: IPR019500 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This entry represents S46 peptidases, of which dipeptidyl-peptidase 7 (DPP-7) is the best-characterised member. It is a serine peptidase that is located on the cell surface and is predicted to have two N-terminal transmembrane domains. 
Probab=96.28  E-value=0.0048  Score=70.13  Aligned_cols=20  Identities=35%  Similarity=0.295  Sum_probs=15.3

Q ss_pred             EEEEEEEc-CCEEEecccccC
Q 008087          150 SSSGFAIG-GRRVLTNAHSVE  169 (578)
Q Consensus       150 ~GsGfvI~-~g~ILT~aHvV~  169 (578)
                      -|||.+|+ +|+||||.||+.
T Consensus        48 GCSgsfVS~~GLvlTNHHC~~   68 (698)
T PF10459_consen   48 GCSGSFVSPDGLVLTNHHCGY   68 (698)
T ss_pred             ceeEEEEcCCceEEecchhhh
Confidence            48888887 788888888864


No 84 
>KOG3553 consensus Tax interaction protein TIP1 [Cell wall/membrane/envelope biogenesis]
Probab=96.23  E-value=0.0038  Score=51.94  Aligned_cols=34  Identities=32%  Similarity=0.489  Sum_probs=31.3

Q ss_pred             CCcEEEEeCCCCcccC-CCCCCCEEEEECCEEeCC
Q 008087          353 KGVRIRRVDPTAPESE-VLKPSDIILSFDGIDIAN  386 (578)
Q Consensus       353 ~Gv~V~~V~p~spA~~-GL~~GDiIl~InG~~V~~  386 (578)
                      .|++|++|..+|||+. ||+.+|.|+.+||...+-
T Consensus        59 ~GiYvT~V~eGsPA~~AGLrihDKIlQvNG~DfTM   93 (124)
T KOG3553|consen   59 KGIYVTRVSEGSPAEIAGLRIHDKILQVNGWDFTM   93 (124)
T ss_pred             ccEEEEEeccCChhhhhcceecceEEEecCceeEE
Confidence            6999999999999999 999999999999976653


No 85 
>COG0265 DegQ Trypsin-like serine proteases, typically periplasmic, contain C-terminal PDZ domain [Posttranslational modification, protein turnover, chaperones]
Probab=95.77  E-value=0.025  Score=59.60  Aligned_cols=59  Identities=17%  Similarity=0.124  Sum_probs=49.7

Q ss_pred             ceEEEEe----ecCCcCCCCCCCEEEEeCCeecCCHHHHHHHHHhcC-CCeEEEEEecCeEEEE
Q 008087          492 QMSSLLW----CLRSPLCLNCFNKVLAFNGNPVKNLKSLANMVENCD-DEFLKFDLEYDQVVVL  550 (578)
Q Consensus       492 ~~vvvs~----v~A~~aGl~~GD~I~~VNG~~V~~~~~l~~~l~~~~-~~~v~l~v~R~~~~~l  550 (578)
                      .|+++.+    ++|+++|++.||+|+++||+++.+..+|.+.+.... +..+.+.+.|+++..-
T Consensus       270 ~G~~V~~v~~~spa~~agi~~Gdii~~vng~~v~~~~~l~~~v~~~~~g~~v~~~~~r~g~~~~  333 (347)
T COG0265         270 AGAVVLGVLPGSPAAKAGIKAGDIITAVNGKPVASLSDLVAAVASNRPGDEVALKLLRGGKERE  333 (347)
T ss_pred             CceEEEecCCCChHHHcCCCCCCEEEEECCEEccCHHHHHHHHhccCCCCEEEEEEEECCEEEE
Confidence            4555554    489999999999999999999999999999988875 7799999999855443


No 86 
>PF09342 DUF1986:  Domain of unknown function (DUF1986);  InterPro: IPR015420 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This domain is found in serine endopeptidases belonging to MEROPS peptidase family S1A (clan PA). It is found in unusual mosaic proteins, which are encoded by the Drosophila nudel gene (see P98159 from SWISSPROT). Nudel is involved in defining embryonic dorsoventral polarity. Three proteases, ndl, gd and snk, process easter to create active easter. Active easter defines cell identities along the dorsal-ventral continuum by activating the spz ligand for the Tl receptor in the ventral region of the embryo. Nudel, pipe and windbeutel together trigger the protease cascade within the extraembryonic perivitelline compartment which induces dorsoventral polarity of the Drosophila embryo [].
Probab=95.37  E-value=0.19  Score=49.25  Aligned_cols=99  Identities=20%  Similarity=0.301  Sum_probs=70.7

Q ss_pred             CCCCCccccCC--CcceEEEEEEEcCCEEEecccccCCC----ceEEEEEcCCCcEEE------EEEEEEc-----CCCC
Q 008087          135 PNFSLPWQRKR--QYSSSSSGFAIGGRRVLTNAHSVEHY----TQVKLKKRGSDTKYL------ATVLAIG-----TECD  197 (578)
Q Consensus       135 ~~~~~p~~~~~--~~~~~GsGfvI~~g~ILT~aHvV~~~----~~i~V~~~~~g~~~~------a~vv~~d-----~~~D  197 (578)
                      ..+.+||-...  .+...|+|++|+..|||++..|+.+-    ..+.+.+ +.++.+.      -++..+|     +..+
T Consensus        12 e~y~WPWlA~IYvdG~~~CsgvLlD~~WlLvsssCl~~I~L~~~Yvsall-G~~Kt~~~v~Gp~EQI~rVD~~~~V~~S~   90 (267)
T PF09342_consen   12 EDYHWPWLADIYVDGRYWCSGVLLDPHWLLVSSSCLRGISLSHHYVSALL-GGGKTYLSVDGPHEQISRVDCFKDVPESN   90 (267)
T ss_pred             ccccCcceeeEEEcCeEEEEEEEeccceEEEeccccCCcccccceEEEEe-cCcceecccCCChheEEEeeeeeeccccc
Confidence            46789997643  45668999999999999999999873    4577777 4665442      2444444     5789


Q ss_pred             EEEEEEeeC-CCCCCeeeEEcCC---CCCCCCcEEEEeecC
Q 008087          198 IAMLTVEDD-EFWEGVLPVEFGE---LPALQDAVTVVGYPI  234 (578)
Q Consensus       198 lAlLkv~~~-~~~~~~~~l~l~~---~~~~g~~V~~iG~p~  234 (578)
                      ++||.++.+ .|...+.|+-+.+   .....+.++++|...
T Consensus        91 v~LLHL~~~~~fTr~VlP~flp~~~~~~~~~~~CVAVg~d~  131 (267)
T PF09342_consen   91 VLLLHLEQPANFTRYVLPTFLPETSNENESDDECVAVGHDD  131 (267)
T ss_pred             eeeeeecCcccceeeecccccccccCCCCCCCceEEEEccc
Confidence            999999987 4445566666654   122346899999876


No 87 
>PF02122 Peptidase_S39:  Peptidase S39;  InterPro: IPR000382 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. ORF2 of Potato leafroll virus (PLrV) encodes a polyprotein which is translated following a -1 frameshift. The polyprotein has a putative linear arrangement of membrane anchor-VPg-peptidase-polymerase domains. The serine peptidase domain which is found in this group of sequences belongs to MEROPS peptidase family S39 (clan PA(S)). It is likely that the peptidase domain is involved in the cleavage of the polyprotein []. The nucleotide sequence for the RNA of PLrV has been determined [, ]. The sequence contains six large open reading frames (ORFs). The 5' coding region encodes two polypeptides of 28K and 70K, which overlap in different reading frames; it is suggested that the third ORF in the 5' block is translated by frameshift readthrough near the end of the 70K protein, yielding a 118K polypeptide []. 
Segments of the predicted amino acid sequences of these ORFs resemble those of known viral RNA polymerases, ATP-binding proteins and viral genome-linked proteins. The nucleotide sequence of the genomic RNA of Beet western yellows virus (BWYV) has been determined []. The sequence contains six long ORFs. A cluster of three of these ORFs, including the coat protein cistron, display extensive amino acid sequence similarity to corresponding ORFs of a second luteovirus: Barley yellow dwarf virus [].; GO: 0004252 serine-type endopeptidase activity, 0022415 viral reproductive process, 0016021 integral to membrane; PDB: 1ZYO_A.
Probab=95.36  E-value=0.012  Score=56.87  Aligned_cols=138  Identities=22%  Similarity=0.268  Sum_probs=48.8

Q ss_pred             CCEEEecccccCCCceEEEEEcCCCcEEE---EEEEEEcCCCCEEEEEEeeCCCC--CCeeeEEcCCCCCCC-CcEEEEe
Q 008087          158 GRRVLTNAHSVEHYTQVKLKKRGSDTKYL---ATVLAIGTECDIAMLTVEDDEFW--EGVLPVEFGELPALQ-DAVTVVG  231 (578)
Q Consensus       158 ~g~ILT~aHvV~~~~~i~V~~~~~g~~~~---a~vv~~d~~~DlAlLkv~~~~~~--~~~~~l~l~~~~~~g-~~V~~iG  231 (578)
                      ...++|++||......+....  +|+.++   -+.+..+...|++||+.... ++  -.++.+.|.....+. ..+.+.+
T Consensus        41 ~~~L~ta~Hv~~~~~~~~~~k--~g~kipl~~f~~~~~~~~~D~~il~~P~n-~~s~Lg~k~~~~~~~~~~~~g~~~~y~  117 (203)
T PF02122_consen   41 EDALLTARHVWSRPSKVTSLK--TGEKIPLAEFTDLLESRIADFVILRGPPN-WESKLGVKAAQLSQNSQLAKGPVSFYG  117 (203)
T ss_dssp             -EEEEE-HHHHTSSS---EEE--TTEEEE--S-EEEEE-TTT-EEEEE--HH-HHHHHT-----B----SEEEEESSTTS
T ss_pred             ccceecccccCCCccceeEcC--CCCcccchhChhhhCCCccCEEEEecCcC-HHHHhCcccccccchhhhCCCCeeeee
Confidence            459999999999865554443  454443   35566788999999999943 11  134444443322210 0010001


Q ss_pred             ecCCCCcceEEEeEEeceeeeeecCCceeeeEEEEecCCCCCCccceEEccCCeEEEEEeccccccccCCceeeecchhH
Q 008087          232 YPIGGDTISVTSGVVSRIEILSYVHGSTELLGLQIDAAINSGNSGGPAFNDKGKCVGIAFQSLKHEDVENIGYVIPTPVI  311 (578)
Q Consensus       232 ~p~g~~~~sv~~G~Is~~~~~~~~~~~~~~~~i~~da~i~~G~SGGPlvn~~G~vVGI~~~~~~~~~~~~~~~aiPi~~i  311 (578)
                      ...+ .. ......|.         +... .++.+-+...+|.||.|+++.. ++||++.+.......++.++..|+.-+
T Consensus       118 ~~~~-~~-~~~sa~i~---------g~~~-~~~~vls~T~~G~SGtp~y~g~-~vvGvH~G~~~~~~~~n~n~~spip~~  184 (203)
T PF02122_consen  118 FSSG-EW-PCSSAKIP---------GTEG-KFASVLSNTSPGWSGTPYYSGK-NVVGVHTGSPSGSNRENNNRMSPIPPI  184 (203)
T ss_dssp             EEEE-EE-EEEE-S-------------ST-TEEEE-----TT-TT-EEE-SS--EEEEEEEE------------------
T ss_pred             ecCC-Cc-eeccCccc---------cccC-cCCceEcCCCCCCCCCCeEECC-CceEeecCccccccccccccccccccc
Confidence            0000 00 11111111         1111 1455666788999999999987 999999985322244566665555443


No 88 
>COG0793 Prc Periplasmic protease [Cell envelope biogenesis, outer membrane]
Probab=95.31  E-value=0.016  Score=62.17  Aligned_cols=48  Identities=4%  Similarity=0.068  Sum_probs=43.2

Q ss_pred             eecCCcCCCCCCCEEEEeCCeecCCH--HHHHHHHHhcCCCeEEEEEecC
Q 008087          498 WCLRSPLCLNCFNKVLAFNGNPVKNL--KSLANMVENCDDEFLKFDLEYD  545 (578)
Q Consensus       498 ~v~A~~aGl~~GD~I~~VNG~~V~~~--~~l~~~l~~~~~~~v~l~v~R~  545 (578)
                      ++||+++|+++||+|++|||+++.+.  ++.++.|+-.+|..++|++.|.
T Consensus       122 ~~PA~kagi~~GD~I~~IdG~~~~~~~~~~av~~irG~~Gt~V~L~i~r~  171 (406)
T COG0793         122 GSPAAKAGIKPGDVIIKIDGKSVGGVSLDEAVKLIRGKPGTKVTLTILRA  171 (406)
T ss_pred             CChHHHcCCCCCCEEEEECCEEccCCCHHHHHHHhCCCCCCeEEEEEEEc
Confidence            56999999999999999999999965  6788888888888999999884


No 89 
>PF04495 GRASP55_65:  GRASP55/65 PDZ-like domain ;  InterPro: IPR007583 GRASP55 (Golgi reassembly stacking protein of 55 kDa) and GRASP65 (a 65 kDa) protein are highly homologous. GRASP55 is a component of the Golgi stacking machinery. GRASP65, an N-ethylmaleimide-sensitive membrane protein required for the stacking of Golgi cisternae in a cell-free system [].; PDB: 3RLE_A 4EDJ_A.
Probab=95.07  E-value=0.014  Score=52.87  Aligned_cols=49  Identities=14%  Similarity=0.197  Sum_probs=39.5

Q ss_pred             EEEeecCCcCCCCC-CCEEEEeCCeecCCHHHHHHHHHhcCCCeEEEEEe
Q 008087          495 SLLWCLRSPLCLNC-FNKVLAFNGNPVKNLKSLANMVENCDDEFLKFDLE  543 (578)
Q Consensus       495 vvs~v~A~~aGl~~-GD~I~~VNG~~V~~~~~l~~~l~~~~~~~v~l~v~  543 (578)
                      |..++||++|||++ .|.|+.+|+...++.++|.+.++++.++.+.|.|-
T Consensus        50 V~p~SPA~~AGL~p~~DyIig~~~~~l~~~~~l~~~v~~~~~~~l~L~Vy   99 (138)
T PF04495_consen   50 VAPNSPAAKAGLEPFFDYIIGIDGGLLDDEDDLFELVEANENKPLQLYVY   99 (138)
T ss_dssp             E-TTSHHHHTT--TTTEEEEEETTCE--STCHHHHHHHHTTTS-EEEEEE
T ss_pred             ecCCCHHHHCCccccccEEEEccceecCCHHHHHHHHHHcCCCcEEEEEE
Confidence            55678999999997 69999999999999999999999999999999883


No 90 
>KOG3550 consensus Receptor targeting protein Lin-7 [Extracellular structures]
Probab=94.94  E-value=0.06  Score=48.34  Aligned_cols=38  Identities=24%  Similarity=0.410  Sum_probs=33.4

Q ss_pred             CCCcEEEEeCCCCcccC--CCCCCCEEEEECCEEeCCCCC
Q 008087          352 QKGVRIRRVDPTAPESE--VLKPSDIILSFDGIDIANDGT  389 (578)
Q Consensus       352 ~~Gv~V~~V~p~spA~~--GL~~GDiIl~InG~~V~~~~~  389 (578)
                      ..-++|.++.|++.|..  ||+-||.+++|||..|.....
T Consensus       114 nspiyisriipggvadrhgglkrgdqllsvngvsvege~h  153 (207)
T KOG3550|consen  114 NSPIYISRIIPGGVADRHGGLKRGDQLLSVNGVSVEGEHH  153 (207)
T ss_pred             CCceEEEeecCCccccccCcccccceeEeecceeecchhh
Confidence            45689999999999988  899999999999999987543


No 91 
>PRK11186 carboxy-terminal protease; Provisional
Probab=93.95  E-value=0.056  Score=61.41  Aligned_cols=49  Identities=10%  Similarity=0.203  Sum_probs=40.4

Q ss_pred             EEeecCCcC-CCCCCCEEEEeC--CeecC-----CHHHHHHHHHhcCCCeEEEEEec
Q 008087          496 LLWCLRSPL-CLNCFNKVLAFN--GNPVK-----NLKSLANMVENCDDEFLKFDLEY  544 (578)
Q Consensus       496 vs~v~A~~a-Gl~~GD~I~~VN--G~~V~-----~~~~l~~~l~~~~~~~v~l~v~R  544 (578)
                      +.++||+++ ||++||+|++||  |+++.     .++++.++|+..+|..|+|++.|
T Consensus       263 ipGsPA~ka~gLk~GD~IlaVn~~g~~~~dv~g~~~~~vv~lirG~~Gt~V~LtV~r  319 (667)
T PRK11186        263 VAGGPAAKSKKLSVGDKIVGVGQDGKPIVDVIGWRLDDVVALIKGPKGSKVRLEILP  319 (667)
T ss_pred             cCCChHHHhCCCCCCCEEEEECCCCCcccccccCCHHHHHHHhcCCCCCEEEEEEEe
Confidence            346799998 999999999999  55443     35688999998888999999987


No 92 
>PF00949 Peptidase_S7:  Peptidase S7, Flavivirus NS3 serine protease ;  InterPro: IPR001850 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. 
They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This signature identifies serine peptidases belonging to MEROPS peptidase family S7 (flavivirin family, clan PA(S)). The protein fold of the peptidase domain for members of this family resembles that of chymotrypsin, the type example for clan PA.  Flaviviruses produce a polyprotein from the ssRNA genome. The N terminus of the NS3 protein (approx. 180 aa) is required for the processing of the polyprotein. NS3 also has conserved homology with NTP-binding proteins and the DEAD family of RNA helicases [, , ].; GO: 0003723 RNA binding, 0003724 RNA helicase activity, 0005524 ATP binding; PDB: 2IJO_B 3E90_D 2GGV_B 2FP7_B 2WV9_A 3U1I_B 3U1J_B 2WZQ_A 2WHX_A 3L6P_A ....
Probab=93.68  E-value=0.079  Score=47.45  Aligned_cols=29  Identities=38%  Similarity=0.672  Sum_probs=21.5

Q ss_pred             EecCCCCCCccceEEccCCeEEEEEeccc
Q 008087          266 IDAAINSGNSGGPAFNDKGKCVGIAFQSL  294 (578)
Q Consensus       266 ~da~i~~G~SGGPlvn~~G~vVGI~~~~~  294 (578)
                      .+....+|.||+|+||.+|++|||....+
T Consensus        90 ~~~d~~~GsSGSpi~n~~g~ivGlYg~g~  118 (132)
T PF00949_consen   90 IDLDFPKGSSGSPIFNQNGEIVGLYGNGV  118 (132)
T ss_dssp             E---S-TTGTT-EEEETTSCEEEEEEEEE
T ss_pred             eecccCCCCCCCceEcCCCcEEEEEccce
Confidence            34447899999999999999999988755


No 93 
>KOG3552 consensus FERM domain protein FRM-8 [General function prediction only]
Probab=93.22  E-value=0.094  Score=59.62  Aligned_cols=57  Identities=26%  Similarity=0.327  Sum_probs=43.8

Q ss_pred             CCcEEEEeCCCCcccCCCCCCCEEEEECCEEeCCCCCCccccCcchhHHHHHhccCCCCEEEEEEEE
Q 008087          353 KGVRIRRVDPTAPESEVLKPSDIILSFDGIDIANDGTVPFRHGERIGFSYLVSQKYTGDSAAVKVLR  419 (578)
Q Consensus       353 ~Gv~V~~V~p~spA~~GL~~GDiIl~InG~~V~~~~~v~~~~~~~~~~~~~~~~~~~g~~v~l~V~R  419 (578)
                      .-|+|..|.+|+|+...|++||.|++|||++|.+.-.-      |  ..+++...  .+.|.|+|.+
T Consensus        75 rPviVr~VT~GGps~GKL~PGDQIl~vN~Epv~dapre------r--vIdlvRac--e~sv~ltV~q  131 (1298)
T KOG3552|consen   75 RPVIVRFVTEGGPSIGKLQPGDQILAVNGEPVKDAPRE------R--VIDLVRAC--ESSVNLTVCQ  131 (1298)
T ss_pred             CceEEEEecCCCCccccccCCCeEEEecCcccccccHH------H--HHHHHHHH--hhhcceEEec
Confidence            35889999999999989999999999999999874321      1  22455443  4678888877


No 94 
>PF00947 Pico_P2A:  Picornavirus core protein 2A;  InterPro: IPR000081 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. 
Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad [].  This domain defines cysteine peptidases belonging to MEROPS peptidase family C3 (picornain, clan PA(C)), subfamilies 3CA and 3CB. The protein fold of this peptidase domain for members of this family resembles that of the serine peptidase, chymotrypsin [], the type example for clan PA. Picornaviral proteins are expressed as a single polyprotein which is cleaved by the viral 3C cysteine protease []. The poliovirus polyprotein is selectively cleaved at the Gln-|-Gly bond. In other picornavirus reactions Glu may be substituted for Gln, and Ser or Thr for Gly. ; GO: 0008233 peptidase activity, 0006508 proteolysis, 0016032 viral reproduction; PDB: 2HRV_B 1Z8R_A.
Probab=93.21  E-value=0.28  Score=43.20  Aligned_cols=57  Identities=19%  Similarity=0.116  Sum_probs=37.2

Q ss_pred             ceeeeeecCCceeeeEEEEecCCCCCCccceEEccCCeEEEEEeccccccccCCceeeecchh
Q 008087          248 RIEILSYVHGSTELLGLQIDAAINSGNSGGPAFNDKGKCVGIAFQSLKHEDVENIGYVIPTPV  310 (578)
Q Consensus       248 ~~~~~~~~~~~~~~~~i~~da~i~~G~SGGPlvn~~G~vVGI~~~~~~~~~~~~~~~aiPi~~  310 (578)
                      .++.+.+.+..+...++....+..||+-||+|+..-| ||||+++     ++++.-.|..+..
T Consensus        65 ~i~~s~YYP~h~Q~~~l~g~Gp~~PGdCGg~L~C~HG-ViGi~Ta-----gg~g~VaF~dir~  121 (127)
T PF00947_consen   65 WIEESEYYPKHYQYNLLIGEGPAEPGDCGGILRCKHG-VIGIVTA-----GGEGHVAFADIRD  121 (127)
T ss_dssp             EE-SBTTB-SEEEECEEEEE-SSSTT-TCSEEEETTC-EEEEEEE-----EETTEEEEEECCC
T ss_pred             EECCccCchhheecCceeecccCCCCCCCceeEeCCC-eEEEEEe-----CCCceEEEEechh
Confidence            3334444444555566777889999999999998655 9999998     5566655555543


No 95 
>COG3480 SdrC Predicted secreted protein containing a PDZ domain [Signal transduction mechanisms]
Probab=92.79  E-value=0.28  Score=49.80  Aligned_cols=53  Identities=17%  Similarity=0.157  Sum_probs=42.9

Q ss_pred             ceEEEEee---cCCcCCCCCCCEEEEeCCeecCCHHHHHHHHHhc-CCCeEEEEEec
Q 008087          492 QMSSLLWC---LRSPLCLNCFNKVLAFNGNPVKNLKSLANMVENC-DDEFLKFDLEY  544 (578)
Q Consensus       492 ~~vvvs~v---~A~~aGl~~GD~I~~VNG~~V~~~~~l~~~l~~~-~~~~v~l~v~R  544 (578)
                      .||++..+   .....-|+.||.|++|||+++.+.++|.+++++. .++.+++++.|
T Consensus       130 ~gvyv~~v~~~~~~~gkl~~gD~i~avdg~~f~s~~e~i~~v~~~k~Gd~VtI~~~r  186 (342)
T COG3480         130 AGVYVLSVIDNSPFKGKLEAGDTIIAVDGEPFTSSDELIDYVSSKKPGDEVTIDYER  186 (342)
T ss_pred             eeEEEEEccCCcchhceeccCCeEEeeCCeecCCHHHHHHHHhccCCCCeEEEEEEe
Confidence            35665555   2344457899999999999999999999999876 57899999986


No 96 
>KOG3532 consensus Predicted protein kinase [General function prediction only]
Probab=92.69  E-value=0.12  Score=56.97  Aligned_cols=37  Identities=16%  Similarity=0.498  Sum_probs=33.9

Q ss_pred             CCcEEEEeCCCCcccC-CCCCCCEEEEECCEEeCCCCC
Q 008087          353 KGVRIRRVDPTAPESE-VLKPSDIILSFDGIDIANDGT  389 (578)
Q Consensus       353 ~Gv~V~~V~p~spA~~-GL~~GDiIl~InG~~V~~~~~  389 (578)
                      .-|.|..|.++++|.+ .|++||++++|||.+|.+..+
T Consensus       398 ~~v~v~tv~~ns~a~k~~~~~gdvlvai~~~pi~s~~q  435 (1051)
T KOG3532|consen  398 RAVKVCTVEDNSLADKAAFKPGDVLVAINNVPIRSERQ  435 (1051)
T ss_pred             eEEEEEEecCCChhhHhcCCCcceEEEecCccchhHHH
Confidence            3577888999999999 999999999999999999877


No 97 
>KOG3605 consensus Beta amyloid precursor-binding protein [General function prediction only]
Probab=92.66  E-value=0.14  Score=56.24  Aligned_cols=81  Identities=16%  Similarity=0.222  Sum_probs=59.9

Q ss_pred             eecchhHHHHHHHHHHcCcee--ccccCCcccccccChhhhhhhccccCCCCcEEEEeCCCCcccC-CCCCCCEEEEECC
Q 008087          305 VIPTPVIMHFIQDYEKNGAYT--GFPLLGVEWQKMENPDLRVAMSMKADQKGVRIRRVDPTAPESE-VLKPSDIILSFDG  381 (578)
Q Consensus       305 aiPi~~i~~~l~~l~~~g~v~--~~~~lGi~~~~~~~~~~~~~lgl~~~~~Gv~V~~V~p~spA~~-GL~~GDiIl~InG  381 (578)
                      .+|.+....+++.+++.-.|.  --+.--++-..+.-++.+..||++- ++|+ |-....|+.|++ |++.|-+|++|||
T Consensus       708 GLPLstcQs~Ik~~KnQT~VkltiV~cpPV~~V~I~RPd~kyQLGFSV-QNGi-ICSLlRGGIAERGGVRVGHRIIEINg  785 (829)
T KOG3605|consen  708 GLPLSTCQSIIKGLKNQTAVKLNIVSCPPVTTVLIRRPDLRYQLGFSV-QNGI-ICSLLRGGIAERGGVRVGHRIIEING  785 (829)
T ss_pred             cccHHHHHHHHhcccccceEEEEEecCCCceEEEeecccchhhcccee-eCcE-eehhhcccchhccCceeeeeEEEECC
Confidence            589999999999887644442  0112222323333788888899887 6676 677889999999 9999999999999


Q ss_pred             EEeCCC
Q 008087          382 IDIAND  387 (578)
Q Consensus       382 ~~V~~~  387 (578)
                      +.|--.
T Consensus       786 QSVVA~  791 (829)
T KOG3605|consen  786 QSVVAT  791 (829)
T ss_pred             ceEEec
Confidence            988654


No 98 
>COG3031 PulC Type II secretory pathway, component PulC [Intracellular trafficking and secretion]
Probab=92.35  E-value=0.19  Score=48.88  Aligned_cols=50  Identities=10%  Similarity=-0.003  Sum_probs=44.1

Q ss_pred             cCCcCCCCCCCEEEEeCCeecCCHHHHHHHHHhcCCC-eEEEEEecCeEEE
Q 008087          500 LRSPLCLNCFNKVLAFNGNPVKNLKSLANMVENCDDE-FLKFDLEYDQVVV  549 (578)
Q Consensus       500 ~A~~aGl~~GD~I~~VNG~~V~~~~~l~~~l~~~~~~-~v~l~v~R~~~~~  549 (578)
                      +-...||+.||+.+++|+..+.+.++..++++...+. .+.|++.|+|+.-
T Consensus       219 lF~~sglq~GDIavaiNnldltdp~~m~~llq~l~~m~s~qlTv~R~G~rh  269 (275)
T COG3031         219 LFYKSGLQRGDIAVAINNLDLTDPEDMFRLLQMLRNMPSLQLTVIRRGKRH  269 (275)
T ss_pred             hhhhhcCCCcceEEEecCcccCCHHHHHHHHHhhhcCcceEEEEEecCccc
Confidence            5667899999999999999999999999999998655 7999999988653


No 99 
>KOG3553 consensus Tax interaction protein TIP1 [Cell wall/membrane/envelope biogenesis]
Probab=91.97  E-value=0.17  Score=42.40  Aligned_cols=53  Identities=13%  Similarity=0.103  Sum_probs=38.2

Q ss_pred             CceEEEEee----cCCcCCCCCCCEEEEeCCeecC--CHHHHHHHHHhcCCCeEEEEEecC
Q 008087          491 CQMSSLLWC----LRSPLCLNCFNKVLAFNGNPVK--NLKSLANMVENCDDEFLKFDLEYD  545 (578)
Q Consensus       491 ~~~vvvs~v----~A~~aGl~~GD~I~~VNG~~V~--~~~~l~~~l~~~~~~~v~l~v~R~  545 (578)
                      +.|++++.+    ||+.|||+.+|.|+.|||-...  +.+..++.|.+  ++.+.+.|.|.
T Consensus        58 D~GiYvT~V~eGsPA~~AGLrihDKIlQvNG~DfTMvTHd~Avk~i~k--~~vl~mLVaR~  116 (124)
T KOG3553|consen   58 DKGIYVTRVSEGSPAEIAGLRIHDKILQVNGWDFTMVTHDQAVKRITK--EEVLRMLVARQ  116 (124)
T ss_pred             CccEEEEEeccCChhhhhcceecceEEEecCceeEEEEhHHHHHHhhH--hHHHHHHHHhh
Confidence            457777754    8999999999999999997765  55666666665  34444444443


No 100
>KOG3532 consensus Predicted protein kinase [General function prediction only]
Probab=91.96  E-value=0.37  Score=53.25  Aligned_cols=48  Identities=8%  Similarity=0.092  Sum_probs=40.8

Q ss_pred             EEEeecCCcCCCCCCCEEEEeCCeecCCHHHHHHHHHhcCCCeEEEEE
Q 008087          495 SLLWCLRSPLCLNCFNKVLAFNGNPVKNLKSLANMVENCDDEFLKFDL  542 (578)
Q Consensus       495 vvs~v~A~~aGl~~GD~I~~VNG~~V~~~~~l~~~l~~~~~~~v~l~v  542 (578)
                      |....+|+++++.+||++++|||.||.+..+..+.++...+....|..
T Consensus       405 v~~ns~a~k~~~~~gdvlvai~~~pi~s~~q~~~~~~s~~~~~~~l~~  452 (1051)
T KOG3532|consen  405 VEDNSLADKAAFKPGDVLVAINNVPIRSERQATRFLQSTTGDLTVLVE  452 (1051)
T ss_pred             ecCCChhhHhcCCCcceEEEecCccchhHHHHHHHHHhcccceEEEEe
Confidence            344568999999999999999999999999999999998776554443


No 101
>PF00944 Peptidase_S3:  Alphavirus core protein ;  InterPro: IPR000930 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. Togavirin, also known as Sindbis virus core endopeptidase, is a serine protease resident at the N terminus of the p130 polyprotein of togaviruses []. The endopeptidase signature identifies the peptidase as belonging to the MEROPS peptidase family S3 (togavirin family, clan PA(S)). The polyprotein also includes structural proteins for the nucleocapsid core and for the glycoprotein spikes []. Togavirin is only active while part of the polyprotein, cleavage at a Trp-Ser bond resulting in total lack of activity []. Mutagenesis studies have identified the location of the His-Asp-Ser catalytic triad, and X-ray studies have revealed the protein fold to be similar to that of chymotrypsin [, ].; GO: 0004252 serine-type endopeptidase activity, 0006508 proteolysis, 0016020 membrane; PDB: 2YEW_D 1EP5_A 3J0C_F 1EP6_C 1WYK_D 1DYL_A 1VCQ_B 1VCP_B 1LD4_D 1KXA_A ....
Probab=91.49  E-value=0.32  Score=43.06  Aligned_cols=27  Identities=30%  Similarity=0.629  Sum_probs=23.3

Q ss_pred             ecCCCCCCccceEEccCCeEEEEEecc
Q 008087          267 DAAINSGNSGGPAFNDKGKCVGIAFQS  293 (578)
Q Consensus       267 da~i~~G~SGGPlvn~~G~vVGI~~~~  293 (578)
                      ...-.+|+||-|++|..|+||||+.++
T Consensus       100 ~g~g~~GDSGRpi~DNsGrVVaIVLGG  126 (158)
T PF00944_consen  100 TGVGKPGDSGRPIFDNSGRVVAIVLGG  126 (158)
T ss_dssp             TTS-STTSTTEEEESTTSBEEEEEEEE
T ss_pred             cCCCCCCCCCCccCcCCCCEEEEEecC
Confidence            456789999999999999999999884


No 102
>KOG3542 consensus cAMP-regulated guanine nucleotide exchange factor [Signal transduction mechanisms]
Probab=91.28  E-value=0.14  Score=56.27  Aligned_cols=39  Identities=26%  Similarity=0.326  Sum_probs=34.6

Q ss_pred             cCCCCcEEEEeCCCCcccC-CCCCCCEEEEECCEEeCCCC
Q 008087          350 ADQKGVRIRRVDPTAPESE-VLKPSDIILSFDGIDIANDG  388 (578)
Q Consensus       350 ~~~~Gv~V~~V~p~spA~~-GL~~GDiIl~InG~~V~~~~  388 (578)
                      ....|++|.+|.|++.|+. ||+-||.|++|||+...+..
T Consensus       559 EkGfgifV~~V~pgskAa~~GlKRgDqilEVNgQnfenis  598 (1283)
T KOG3542|consen  559 EKGFGIFVAEVFPGSKAAREGLKRGDQILEVNGQNFENIS  598 (1283)
T ss_pred             cccceeEEeeecCCchHHHhhhhhhhhhhhccccchhhhh
Confidence            3357899999999999999 99999999999999887754


No 103
>KOG3129 consensus 26S proteasome regulatory complex, subunit PSMD9 [Posttranslational modification, protein turnover, chaperones]
Probab=90.24  E-value=0.35  Score=46.15  Aligned_cols=56  Identities=18%  Similarity=0.195  Sum_probs=43.7

Q ss_pred             EEEEeecCCcCCCCCCCEEEEeCCeecCC---HHHHHHHHHhcCCCeEEEEEecCeEEE
Q 008087          494 SSLLWCLRSPLCLNCFNKVLAFNGNPVKN---LKSLANMVENCDDEFLKFDLEYDQVVV  549 (578)
Q Consensus       494 vvvs~v~A~~aGl~~GD~I~~VNG~~V~~---~~~l~~~l~~~~~~~v~l~v~R~~~~~  549 (578)
                      .|+.++||+++||+.||.|+++....--+   +..+...+++..++.+.+++.|.++.+
T Consensus       145 sV~~~SPA~~aGl~~gD~il~fGnV~sgn~~~lq~i~~~v~~~e~~~v~v~v~R~g~~v  203 (231)
T KOG3129|consen  145 SVVPGSPADEAGLCVGDEILKFGNVHSGNFLPLQNIAAVVQSNEDQIVSVTVIREGQKV  203 (231)
T ss_pred             ecCCCChhhhhCcccCceEEEecccccccchhHHHHHHHHHhccCcceeEEEecCCCEE
Confidence            35557799999999999999987665544   555666777788899999998876544


No 104
>KOG1892 consensus Actin filament-binding protein Afadin [Cytoskeleton]
Probab=89.64  E-value=0.46  Score=54.38  Aligned_cols=61  Identities=20%  Similarity=0.306  Sum_probs=47.6

Q ss_pred             CCCcEEEEeCCCCcccC--CCCCCCEEEEECCEEeCCCCCCccccCcchhHHHHHhccCCCCEEEEEEEECCE
Q 008087          352 QKGVRIRRVDPTAPESE--VLKPSDIILSFDGIDIANDGTVPFRHGERIGFSYLVSQKYTGDSAAVKVLRDSK  422 (578)
Q Consensus       352 ~~Gv~V~~V~p~spA~~--GL~~GDiIl~InG~~V~~~~~v~~~~~~~~~~~~~~~~~~~g~~v~l~V~R~g~  422 (578)
                      .-|++|..|.+|++|+.  .|+.||.+++|||+.+-...+-+        ..+++.  ..|..|.|+|...|.
T Consensus       959 klGIYvKsVV~GgaAd~DGRL~aGDQLLsVdG~SLiGisQEr--------AA~lmt--rtg~vV~leVaKqgA 1021 (1629)
T KOG1892|consen  959 KLGIYVKSVVEGGAADHDGRLEAGDQLLSVDGHSLIGISQER--------AARLMT--RTGNVVHLEVAKQGA 1021 (1629)
T ss_pred             ccceEEEEeccCCccccccccccCceeeeecCcccccccHHH--------HHHHHh--ccCCeEEEehhhhhh
Confidence            56899999999999988  59999999999999888776522        123333  368889999976554


No 105
>KOG2921 consensus Intramembrane metalloprotease (sterol-regulatory element-binding protein (SREBP) protease) [Posttranslational modification, protein turnover, chaperones]
Probab=87.61  E-value=0.37  Score=50.14  Aligned_cols=38  Identities=26%  Similarity=0.378  Sum_probs=35.0

Q ss_pred             CCCcEEEEeCCCCcccC--CCCCCCEEEEECCEEeCCCCC
Q 008087          352 QKGVRIRRVDPTAPESE--VLKPSDIILSFDGIDIANDGT  389 (578)
Q Consensus       352 ~~Gv~V~~V~p~spA~~--GL~~GDiIl~InG~~V~~~~~  389 (578)
                      ..|+.|.+|...||+..  ||++||+|+++||-+|.+.++
T Consensus       219 g~gV~Vtev~~~Spl~gprGL~vgdvitsldgcpV~~v~d  258 (484)
T KOG2921|consen  219 GEGVTVTEVPSVSPLFGPRGLSVGDVITSLDGCPVHKVSD  258 (484)
T ss_pred             CceEEEEeccccCCCcCcccCCccceEEecCCcccCCHHH
Confidence            56899999999999987  999999999999999998766


No 106
>PF10459 Peptidase_S46:  Peptidase S46;  InterPro: IPR019500 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This entry represents S46 peptidases, where dipeptidyl-peptidase 7 (DPP-7) is the best-characterised member of this family. It is a serine peptidase that is located on the cell surface and is predicted to have two N-terminal transmembrane domains. 
Probab=86.98  E-value=0.32  Score=55.65  Aligned_cols=40  Identities=23%  Similarity=0.187  Sum_probs=24.5

Q ss_pred             CCEEEEEEeeCC------CC---CCee---eEEcCC-CCCCCCcEEEEeecCC
Q 008087          196 CDIAMLTVEDDE------FW---EGVL---PVEFGE-LPALQDAVTVVGYPIG  235 (578)
Q Consensus       196 ~DlAlLkv~~~~------~~---~~~~---~l~l~~-~~~~g~~V~~iG~p~g  235 (578)
                      .|++++|+=...      +.   .++.   .+++.. ..+-|+-|+++|||..
T Consensus       200 gDfs~fRvY~~~dg~PA~Ys~dnvP~~p~~~l~is~~G~keGD~vmv~GyPG~  252 (698)
T PF10459_consen  200 GDFSFFRVYADKDGKPADYSKDNVPYKPKHFLKISLKGVKEGDFVMVAGYPGR  252 (698)
T ss_pred             CceEEEEEEeCCCCCccccCcCCCCCCCccccccCCCCCCCCCeEEEccCCCc
Confidence            599999995431      10   1222   233332 3356899999999965


No 107
>KOG3550 consensus Receptor targeting protein Lin-7 [Extracellular structures]
Probab=86.31  E-value=4.8  Score=36.41  Aligned_cols=46  Identities=13%  Similarity=0.164  Sum_probs=34.3

Q ss_pred             EEeecCCcC-CCCCCCEEEEeCCeecC--CHHHHHHHHHhcCCCeEEEEE
Q 008087          496 LLWCLRSPL-CLNCFNKVLAFNGNPVK--NLKSLANMVENCDDEFLKFDL  542 (578)
Q Consensus       496 vs~v~A~~a-Gl~~GD~I~~VNG~~V~--~~~~l~~~l~~~~~~~v~l~v  542 (578)
                      +++..|++- ||+.||.+++|||..|.  ..+..+++|++..+. ++|.+
T Consensus       123 ipggvadrhgglkrgdqllsvngvsvege~hekavellkaa~gs-vklvv  171 (207)
T KOG3550|consen  123 IPGGVADRHGGLKRGDQLLSVNGVSVEGEHHEKAVELLKAAVGS-VKLVV  171 (207)
T ss_pred             cCCccccccCcccccceeEeecceeecchhhHHHHHHHHHhcCc-EEEEE
Confidence            334445554 68999999999999997  556688889887554 67766


No 108
>PF02907 Peptidase_S29:  Hepatitis C virus NS3 protease;  InterPro: IPR004109 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This signature identifies the Hepatitis C virus NS3 protein as a serine protease which belongs to MEROPS peptidase family S29 (hepacivirin family, clan PA(S)), which has a trypsin-like fold. The non-structural (NS) protein NS3 is one of the NS proteins involved in replication of the HCV genome. The NS2 proteinase (IPR002518 from INTERPRO), a zinc-dependent enzyme, performs a single proteolytic cut to release the N terminus of NS3. The action of NS3 proteinase (NS3P), which resides in the N-terminal one-third of the NS3 protein, then yields all remaining non-structural proteins. The C-terminal two-thirds of the NS3 protein contain a helicase. The functional relationship between the proteinase and helicase domains is unknown. NS3 has a structural zinc-binding site and requires cofactor NS4. 
It has been suggested that the NS3 serine protease of hepatitis C is involved in cell transformation and that the ability to transform requires an active enzyme [].; GO: 0008236 serine-type peptidase activity, 0006508 proteolysis, 0019087 transformation of host cell by virus; PDB: 2QV1_B 3LOX_C 2OBQ_C 2OC1_C 2OC0_A 3LON_A 3KNX_A 2O8M_A 2OBO_A 2OC8_A ....
Probab=85.82  E-value=0.72  Score=41.00  Aligned_cols=24  Identities=33%  Similarity=0.609  Sum_probs=18.6

Q ss_pred             CCCCccceEEccCCeEEEEEeccc
Q 008087          271 NSGNSGGPAFNDKGKCVGIAFQSL  294 (578)
Q Consensus       271 ~~G~SGGPlvn~~G~vVGI~~~~~  294 (578)
                      -.|.||||++..+|.+|||-.+..
T Consensus       106 lkGSSGgPiLC~~GH~vG~f~aa~  129 (148)
T PF02907_consen  106 LKGSSGGPILCPSGHAVGMFRAAV  129 (148)
T ss_dssp             HTT-TT-EEEETTSEEEEEEEEEE
T ss_pred             EecCCCCcccCCCCCEEEEEEEEE
Confidence            469999999999999999976543


No 109
>PF05580 Peptidase_S55:  SpoIVB peptidase S55;  InterPro: IPR008763 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This group of serine peptidases belongs to the MEROPS peptidase family S55 (SpoIVB peptidase family, clan PA(S)). The protein SpoIVB plays a key role in signalling in the final sigma-K checkpoint of Bacillus subtilis [, ].
Probab=85.64  E-value=0.52  Score=45.50  Aligned_cols=41  Identities=27%  Similarity=0.375  Sum_probs=32.3

Q ss_pred             ecCCCCCCccceEEccCCeEEEEEeccccccccCCceeeecchh
Q 008087          267 DAAINSGNSGGPAFNDKGKCVGIAFQSLKHEDVENIGYVIPTPV  310 (578)
Q Consensus       267 da~i~~G~SGGPlvn~~G~vVGI~~~~~~~~~~~~~~~aiPi~~  310 (578)
                      ...+..|+||+|++- +|++||-++..+.  +....+|.++++.
T Consensus       174 TGGIvqGMSGSPI~q-dGKLiGAVthvf~--~dp~~Gygi~ie~  214 (218)
T PF05580_consen  174 TGGIVQGMSGSPIIQ-DGKLIGAVTHVFV--NDPTKGYGIFIEW  214 (218)
T ss_pred             hCCEEecccCCCEEE-CCEEEEEEEEEEe--cCCCceeeecHHH
Confidence            346778999999986 8999999887664  4566788888654


No 110
>COG0750 Predicted membrane-associated Zn-dependent proteases 1 [Cell envelope biogenesis, outer membrane]
Probab=85.19  E-value=1.5  Score=46.45  Aligned_cols=57  Identities=28%  Similarity=0.432  Sum_probs=43.5

Q ss_pred             EEEeCCCCcccC-CCCCCCEEEEECCEEeCCCCCCccccCcchhHHHHHhccCCCCE---EEEEEEE-CCEEE
Q 008087          357 IRRVDPTAPESE-VLKPSDIILSFDGIDIANDGTVPFRHGERIGFSYLVSQKYTGDS---AAVKVLR-DSKIL  424 (578)
Q Consensus       357 V~~V~p~spA~~-GL~~GDiIl~InG~~V~~~~~v~~~~~~~~~~~~~~~~~~~g~~---v~l~V~R-~g~~~  424 (578)
                      +.++..++++.. ++++||.|+++|++++.+|.++          ...+.. ..+..   +.+.+.| ++...
T Consensus       133 ~~~v~~~s~a~~a~l~~Gd~iv~~~~~~i~~~~~~----------~~~~~~-~~~~~~~~~~i~~~~~~~~~~  194 (375)
T COG0750         133 VGEVAPKSAAALAGLRPGDRIVAVDGEKVASWDDV----------RRLLVA-AAGDVFNLLTILVIRLDGEAH  194 (375)
T ss_pred             eeecCCCCHHHHcCCCCCCEEEeECCEEccCHHHH----------HHHHHh-ccCCcccceEEEEEeccceee
Confidence            337889999999 9999999999999999999874          233333 23444   7888889 76664


No 111
>COG0750 Predicted membrane-associated Zn-dependent proteases 1 [Cell envelope biogenesis, outer membrane]
Probab=83.29  E-value=1.8  Score=46.02  Aligned_cols=53  Identities=11%  Similarity=0.102  Sum_probs=45.2

Q ss_pred             EEeecCCcCCCCCCCEEEEeCCeecCCHHHHHHHHHhcCCCe---EEEEEec-CeEE
Q 008087          496 LLWCLRSPLCLNCFNKVLAFNGNPVKNLKSLANMVENCDDEF---LKFDLEY-DQVV  548 (578)
Q Consensus       496 vs~v~A~~aGl~~GD~I~~VNG~~V~~~~~l~~~l~~~~~~~---v~l~v~R-~~~~  548 (578)
                      ...+++..++++.||.|+++|++++.++++..+.+....+..   +.+.+.| ++..
T Consensus       137 ~~~s~a~~a~l~~Gd~iv~~~~~~i~~~~~~~~~~~~~~~~~~~~~~i~~~~~~~~~  193 (375)
T COG0750         137 APKSAAALAGLRPGDRIVAVDGEKVASWDDVRRLLVAAAGDVFNLLTILVIRLDGEA  193 (375)
T ss_pred             CCCCHHHHcCCCCCCEEEeECCEEccCHHHHHHHHHhccCCcccceEEEEEecccee
Confidence            345689999999999999999999999999999998877766   7888888 5444


No 112
>COG3975 Predicted protease with the C-terminal PDZ domain [General function prediction only]
Probab=83.22  E-value=0.91  Score=49.30  Aligned_cols=21  Identities=24%  Similarity=0.055  Sum_probs=19.7

Q ss_pred             eecCCcCCCCCCCEEEEeCCe
Q 008087          498 WCLRSPLCLNCFNKVLAFNGN  518 (578)
Q Consensus       498 ~v~A~~aGl~~GD~I~~VNG~  518 (578)
                      ++||.+|||.+||.|++|||.
T Consensus       472 ~gPA~~AGl~~Gd~ivai~G~  492 (558)
T COG3975         472 GGPAYKAGLSPGDKIVAINGI  492 (558)
T ss_pred             CChhHhccCCCccEEEEEcCc
Confidence            458999999999999999999


No 113
>KOG3552 consensus FERM domain protein FRM-8 [General function prediction only]
Probab=80.10  E-value=2.3  Score=48.95  Aligned_cols=48  Identities=23%  Similarity=0.283  Sum_probs=37.1

Q ss_pred             EEeecCCcCCCCCCCEEEEeCCeecC--CHHHHHHHHHhcCCCeEEEEEec
Q 008087          496 LLWCLRSPLCLNCFNKVLAFNGNPVK--NLKSLANMVENCDDEFLKFDLEY  544 (578)
Q Consensus       496 vs~v~A~~aGl~~GD~I~~VNG~~V~--~~~~l~~~l~~~~~~~v~l~v~R  544 (578)
                      |+..-.....|++||+|++|||++|+  .|+..+++++++... |.|+|-+
T Consensus        82 VT~GGps~GKL~PGDQIl~vN~Epv~daprervIdlvRace~s-v~ltV~q  131 (1298)
T KOG3552|consen   82 VTEGGPSIGKLQPGDQILAVNGEPVKDAPRERVIDLVRACESS-VNLTVCQ  131 (1298)
T ss_pred             ecCCCCccccccCCCeEEEecCcccccccHHHHHHHHHHHhhh-cceEEec
Confidence            33334555668999999999999998  679999999998654 5566644


No 114
>KOG3606 consensus Cell polarity protein PAR6 [Signal transduction mechanisms]
Probab=78.96  E-value=1.8  Score=42.92  Aligned_cols=56  Identities=21%  Similarity=0.294  Sum_probs=40.8

Q ss_pred             CCcccccccChhhhhhhccccCCCCcEEEEeCCCCcccC-C-CCCCCEEEEECCEEeCCC
Q 008087          330 LGVEWQKMENPDLRVAMSMKADQKGVRIRRVDPTAPESE-V-LKPSDIILSFDGIDIAND  387 (578)
Q Consensus       330 lGi~~~~~~~~~~~~~lgl~~~~~Gv~V~~V~p~spA~~-G-L~~GDiIl~InG~~V~~~  387 (578)
                      ||+.+..- +.--....||.. ..|+.|.+..||+-|+. | |-..|.|++|||.+|...
T Consensus       173 LGFYIRDG-~SVRVtp~Glek-vpGIFISRlVpGGLAeSTGLLaVnDEVlEVNGIEVaGK  230 (358)
T KOG3606|consen  173 LGFYIRDG-TSVRVTPHGLEK-VPGIFISRLVPGGLAESTGLLAVNDEVLEVNGIEVAGK  230 (358)
T ss_pred             ceEEEecC-ceEEeccccccc-cCceEEEeecCCccccccceeeecceeEEEcCEEeccc
Confidence            66665543 111112246655 67999999999999999 6 677999999999999864


No 115
>KOG3651 consensus Protein kinase C, alpha binding protein [Signal transduction mechanisms]
Probab=78.79  E-value=3.4  Score=41.65  Aligned_cols=55  Identities=15%  Similarity=0.244  Sum_probs=41.3

Q ss_pred             CcEEEEeCCCCcccC--CCCCCCEEEEECCEEeCCCCCCccccCcchhHHHHHhccCCCCEEEEEEE
Q 008087          354 GVRIRRVDPTAPESE--VLKPSDIILSFDGIDIANDGTVPFRHGERIGFSYLVSQKYTGDSAAVKVL  418 (578)
Q Consensus       354 Gv~V~~V~p~spA~~--GL~~GDiIl~InG~~V~~~~~v~~~~~~~~~~~~~~~~~~~g~~v~l~V~  418 (578)
                      -++|..|-.++||++  .++.||.|++|||..|.....+.        ...++...  -+.|++++.
T Consensus        31 ClYiVQvFD~tPAa~dG~i~~GDEi~avNg~svKGktKve--------VAkmIQ~~--~~eV~IhyN   87 (429)
T KOG3651|consen   31 CLYIVQVFDKTPAAKDGRIRCGDEIVAVNGISVKGKTKVE--------VAKMIQVS--LNEVKIHYN   87 (429)
T ss_pred             eEEEEEeccCCchhccCccccCCeeEEecceeecCccHHH--------HHHHHHHh--ccceEEEeh
Confidence            478889999999999  49999999999999999876543        23444442  245666664


No 116
>KOG1924 consensus RhoA GTPase effector DIA/Diaphanous [Signal transduction mechanisms; Cytoskeleton]
Probab=76.90  E-value=9.2  Score=43.54  Aligned_cols=11  Identities=27%  Similarity=0.087  Sum_probs=6.4

Q ss_pred             CCCCCCccccc
Q 008087           12 PKIPDAEKTLD   22 (578)
Q Consensus        12 ~~~~~~~~~~~   22 (578)
                      +||..-++|..
T Consensus       502 ~Ki~~l~ae~~  512 (1102)
T KOG1924|consen  502 EKIKLLEAEKQ  512 (1102)
T ss_pred             hhcccCchhhh
Confidence            66666555543


No 117
>KOG3571 consensus Dishevelled 3 and related proteins [General function prediction only]
Probab=76.08  E-value=2.8  Score=45.15  Aligned_cols=38  Identities=16%  Similarity=0.338  Sum_probs=31.9

Q ss_pred             CCCcEEEEeCCCCcccC--CCCCCCEEEEECCEEeCCCCC
Q 008087          352 QKGVRIRRVDPTAPESE--VLKPSDIILSFDGIDIANDGT  389 (578)
Q Consensus       352 ~~Gv~V~~V~p~spA~~--GL~~GDiIl~InG~~V~~~~~  389 (578)
                      ..|++|+.|.+++..+.  .+.+||.||.||.....++..
T Consensus       276 DggIYVgsImkgGAVA~DGRIe~GDMiLQVNevsFENmSN  315 (626)
T KOG3571|consen  276 DGGIYVGSIMKGGAVALDGRIEPGDMILQVNEVSFENMSN  315 (626)
T ss_pred             CCceEEeeeccCceeeccCccCccceEEEeeecchhhcCc
Confidence            46899999999887666  599999999999987776543


No 118
>PF03510 Peptidase_C24:  2C endopeptidase (C24) cysteine protease family;  InterPro: IPR000317 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. 
These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad [].  The two signatures that define this group of calicivirus polyproteins identify a cysteine peptidase signature that belongs to MEROPS peptidase family C24 (clan PA(C)). Caliciviruses are positive-stranded ssRNA viruses that cause gastroenteritis. The calicivirus genome contains two open reading frames, ORF1 and ORF2. ORF2 encodes a structural protein []; while ORF1 encodes a non-structural polypeptide, which has RNA helicase, cysteine protease and RNA polymerase activity. The regions of the polyprotein in which these activities lie are similar to proteins produced by the picornaviruses. Two different families of caliciviruses can be distinguished on the basis of sequence similarity, namely those classified as small round structured viruses (SRSVs) and those classed as non-SRSVs. Calicivirus proteases from the non-SRSV group, which are members of the PA protease clan, constitute family C24 of the cysteine proteases (proteases from SRSVs belong to the C37 family). As mentioned above, the protease activity resides within a polyprotein. The enzyme cleaves the polyprotein at sites N-terminal to itself, liberating the polyprotein helicase.; GO: 0004197 cysteine-type endopeptidase activity, 0006508 proteolysis
Probab=75.32  E-value=12  Score=32.18  Aligned_cols=53  Identities=15%  Similarity=0.271  Sum_probs=35.8

Q ss_pred             EEEEcCCEEEecccccCCCceEEEEEcCCCcEEEEEEEEEcCCCCEEEEEEeeCCCCCCeeeEEcCC
Q 008087          153 GFAIGGRRVLTNAHSVEHYTQVKLKKRGSDTKYLATVLAIGTECDIAMLTVEDDEFWEGVLPVEFGE  219 (578)
Q Consensus       153 GfvI~~g~ILT~aHvV~~~~~i~V~~~~~g~~~~a~vv~~d~~~DlAlLkv~~~~~~~~~~~l~l~~  219 (578)
                      ++-|++|..+|+.||.+.++.+      ++..+  +++.  ...|+++++....    .++.+.+++
T Consensus         3 avHIGnG~~vt~tHva~~~~~v------~g~~f--~~~~--~~ge~~~v~~~~~----~~p~~~ig~   55 (105)
T PF03510_consen    3 AVHIGNGRYVTVTHVAKSSDSV------DGQPF--KIVK--TDGELCWVQSPLV----HLPAAQIGT   55 (105)
T ss_pred             eEEeCCCEEEEEEEEeccCceE------cCcCc--EEEE--eccCEEEEECCCC----CCCeeEecc
Confidence            5667899999999999877653      22222  2333  4569999998876    356666654


No 119
>KOG3551 consensus Syntrophins (type beta) [Extracellular structures]
Probab=74.86  E-value=2.8  Score=43.74  Aligned_cols=55  Identities=22%  Similarity=0.254  Sum_probs=40.7

Q ss_pred             CCcEEEEeCCCCcccC--CCCCCCEEEEECCEEeCCCCCCccccCcchhHHHHHhc-cCCCCEEEEEEE
Q 008087          353 KGVRIRRVDPTAPESE--VLKPSDIILSFDGIDIANDGTVPFRHGERIGFSYLVSQ-KYTGDSAAVKVL  418 (578)
Q Consensus       353 ~Gv~V~~V~p~spA~~--GL~~GDiIl~InG~~V~~~~~v~~~~~~~~~~~~~~~~-~~~g~~v~l~V~  418 (578)
                      --++|..+-+|-.|.+  .|..||.|++|||..+.+...           .+.+.. +..|+.|.++|.
T Consensus       110 MPIlISKIFkGlAADQt~aL~~gDaIlSVNG~dL~~AtH-----------deAVqaLKraGkeV~levK  167 (506)
T KOG3551|consen  110 MPILISKIFKGLAADQTGALFLGDAILSVNGEDLRDATH-----------DEAVQALKRAGKEVLLEVK  167 (506)
T ss_pred             CceehhHhccccccccccceeeccEEEEecchhhhhcch-----------HHHHHHHHhhCceeeeeee
Confidence            3578889999888888  699999999999999887654           133332 346787766554


No 120
>KOG0606 consensus Microtubule-associated serine/threonine kinase and related proteins [Signal transduction mechanisms; General function prediction only]
Probab=74.13  E-value=3  Score=49.22  Aligned_cols=35  Identities=20%  Similarity=0.247  Sum_probs=31.3

Q ss_pred             cEEEEeCCCCcccC-CCCCCCEEEEECCEEeCCCCC
Q 008087          355 VRIRRVDPTAPESE-VLKPSDIILSFDGIDIANDGT  389 (578)
Q Consensus       355 v~V~~V~p~spA~~-GL~~GDiIl~InG~~V~~~~~  389 (578)
                      -.|..|.++|||.. ||++||.|+.|||++|.....
T Consensus       660 h~v~sv~egsPA~~agls~~DlIthvnge~v~gl~H  695 (1205)
T KOG0606|consen  660 HSVGSVEEGSPAFEAGLSAGDLITHVNGEPVHGLVH  695 (1205)
T ss_pred             eeeeeecCCCCccccCCCccceeEeccCcccchhhH
Confidence            56889999999988 999999999999999987643


No 121
>KOG0609 consensus Calcium/calmodulin-dependent serine protein kinase/membrane-associated guanylate kinase [Signal transduction mechanisms]
Probab=72.28  E-value=4.3  Score=44.30  Aligned_cols=49  Identities=16%  Similarity=0.226  Sum_probs=40.5

Q ss_pred             EEEEeecCCcCCC-CCCCEEEEeCCeecC--CHHHHHHHHHhcCCCeEEEEEe
Q 008087          494 SSLLWCLRSPLCL-NCFNKVLAFNGNPVK--NLKSLANMVENCDDEFLKFDLE  543 (578)
Q Consensus       494 vvvs~v~A~~aGl-~~GD~I~~VNG~~V~--~~~~l~~~l~~~~~~~v~l~v~  543 (578)
                      -++.+..+++.|+ ..||.|.+|||..|.  +..++.+++.+.. +.++|.+.
T Consensus       152 RI~~GG~~~r~glL~~GD~i~EvNGi~v~~~~~~e~q~~l~~~~-G~itfkii  203 (542)
T KOG0609|consen  152 RIMHGGMADRQGLLHVGDEILEVNGISVANKSPEELQELLRNSR-GSITFKII  203 (542)
T ss_pred             eeccCCcchhccceeeccchheecCeecccCCHHHHHHHHHhCC-CcEEEEEc
Confidence            3455667888887 699999999999998  5799999999987 67788774


No 122
>KOG1924 consensus RhoA GTPase effector DIA/Diaphanous [Signal transduction mechanisms; Cytoskeleton]
Probab=72.22  E-value=13  Score=42.52  Aligned_cols=10  Identities=20%  Similarity=0.431  Sum_probs=5.2

Q ss_pred             HHHHHHHHHh
Q 008087          523 LKSLANMVEN  532 (578)
Q Consensus       523 ~~~l~~~l~~  532 (578)
                      ++.|.+.|++
T Consensus      1043 mDslLeaLqs 1052 (1102)
T KOG1924|consen 1043 MDSLLEALQS 1052 (1102)
T ss_pred             HHHHHHHHHh
Confidence            3445555554


No 123
>KOG2921 consensus Intramembrane metalloprotease (sterol-regulatory element-binding protein (SREBP) protease) [Posttranslational modification, protein turnover, chaperones]
Probab=72.21  E-value=6.5  Score=41.30  Aligned_cols=43  Identities=12%  Similarity=0.131  Sum_probs=34.8

Q ss_pred             ceEEEEeec-----CCcCCCCCCCEEEEeCCeecCCHHHHHHHHHhcC
Q 008087          492 QMSSLLWCL-----RSPLCLNCFNKVLAFNGNPVKNLKSLANMVENCD  534 (578)
Q Consensus       492 ~~vvvs~v~-----A~~aGl~~GD~I~~VNG~~V~~~~~l~~~l~~~~  534 (578)
                      .+|.+.+++     -..-||.+||+|+++||-+|.+.+|+.+.++.+.
T Consensus       220 ~gV~Vtev~~~Spl~gprGL~vgdvitsldgcpV~~v~dW~ecl~tsl  267 (484)
T KOG2921|consen  220 EGVTVTEVPSVSPLFGPRGLSVGDVITSLDGCPVHKVSDWLECLATSL  267 (484)
T ss_pred             ceEEEEeccccCCCcCcccCCccceEEecCCcccCCHHHHHHHHHhhc
Confidence            467777763     2233899999999999999999999999998743


No 124
>KOG3606 consensus Cell polarity protein PAR6 [Signal transduction mechanisms]
Probab=71.28  E-value=9.4  Score=38.05  Aligned_cols=46  Identities=22%  Similarity=0.229  Sum_probs=35.9

Q ss_pred             EEeecCCcCCCC-CCCEEEEeCCeecC--CHHHHHHHHHhcCCCeEEEEE
Q 008087          496 LLWCLRSPLCLN-CFNKVLAFNGNPVK--NLKSLANMVENCDDEFLKFDL  542 (578)
Q Consensus       496 vs~v~A~~aGl~-~GD~I~~VNG~~V~--~~~~l~~~l~~~~~~~v~l~v  542 (578)
                      +.+.+|+.-||- .+|.|++|||..|.  ++++..++|-++.-. +-++|
T Consensus       202 VpGGLAeSTGLLaVnDEVlEVNGIEVaGKTLDQVTDMMvANshN-LIiTV  250 (358)
T KOG3606|consen  202 VPGGLAESTGLLAVNDEVLEVNGIEVAGKTLDQVTDMMVANSHN-LIITV  250 (358)
T ss_pred             cCCccccccceeeecceeEEEcCEEeccccHHHHHHHHhhcccc-eEEEe
Confidence            345678888985 89999999999996  999999998876433 33444


No 125
>PF02395 Peptidase_S6:  Immunoglobulin A1 protease Serine protease Prosite pattern;  InterPro: IPR000710 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. 
They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This group of serine peptidases belongs to the MEROPS peptidase family S6 (clan PA(S)). The type example is the IgA1-specific serine endopeptidase from Neisseria gonorrhoeae []. These cleave prolyl bonds in the hinge regions of immunoglobulin A heavy chains. Similar specificity is shown by the unrelated family of M26 metalloendopeptidases.; GO: 0004252 serine-type endopeptidase activity, 0006508 proteolysis; PDB: 3SZE_A 3H09_B 3SYJ_A 1WXR_A 3AK5_B.
Probab=70.53  E-value=15  Score=42.80  Aligned_cols=63  Identities=16%  Similarity=0.248  Sum_probs=35.7

Q ss_pred             EEEEEEcCCEEEecccccCCCceEEEEEcC-CCcEEEEEEEEEcC--CCCEEEEEEeeCCCCCCeeeEEcCC
Q 008087          151 SSGFAIGGRRVLTNAHSVEHYTQVKLKKRG-SDTKYLATVLAIGT--ECDIAMLTVEDDEFWEGVLPVEFGE  219 (578)
Q Consensus       151 GsGfvI~~g~ILT~aHvV~~~~~i~V~~~~-~g~~~~a~vv~~d~--~~DlAlLkv~~~~~~~~~~~l~l~~  219 (578)
                      |...+|++.||+|.+|+..+...  |.|.. ....|  +++..+.  ..|+.+-|++.-.  ..+.|+++..
T Consensus        67 G~aTLigpqYiVSV~HN~~gy~~--v~FG~~g~~~Y--~iV~RNn~~~~Df~~pRLnK~V--TEvaP~~~t~  132 (769)
T PF02395_consen   67 GVATLIGPQYIVSVKHNGKGYNS--VSFGNEGQNTY--KIVDRNNYPSGDFHMPRLNKFV--TEVAPAEMTT  132 (769)
T ss_dssp             SS-EEEETTEEEBETTG-TSCCE--ECESCSSTCEE--EEEEEEBETTSTEBEEEESS-----SS----BBS
T ss_pred             ceEEEecCCeEEEEEccCCCcCc--eeecccCCceE--EEEEccCCCCcccceeecCceE--EEEecccccc
Confidence            66889999999999999854443  45532 22333  4555543  3699999998642  2455655543


No 126
>PF01732 DUF31:  Putative peptidase (DUF31);  InterPro: IPR022382  This domain has no known function. It is found in various hypothetical proteins and putative lipoproteins from mycoplasmas. 
Probab=70.25  E-value=3.4  Score=44.03  Aligned_cols=24  Identities=33%  Similarity=0.588  Sum_probs=21.4

Q ss_pred             CCCCCCccceEEccCCeEEEEEec
Q 008087          269 AINSGNSGGPAFNDKGKCVGIAFQ  292 (578)
Q Consensus       269 ~i~~G~SGGPlvn~~G~vVGI~~~  292 (578)
                      ....|.||+.|+|.+|++|||.++
T Consensus       351 ~l~gGaSGS~V~n~~~~lvGIy~g  374 (374)
T PF01732_consen  351 SLGGGASGSMVINQNNELVGIYFG  374 (374)
T ss_pred             CCCCCCCcCeEECCCCCEEEEeCC
Confidence            566899999999999999999764


No 127
>KOG3549 consensus Syntrophins (type gamma) [Extracellular structures]
Probab=68.08  E-value=8.2  Score=39.77  Aligned_cols=49  Identities=12%  Similarity=0.090  Sum_probs=37.9

Q ss_pred             EEEEee----cCCcCCC-CCCCEEEEeCCeecC--CHHHHHHHHHhcCCCeEEEEEe
Q 008087          494 SSLLWC----LRSPLCL-NCFNKVLAFNGNPVK--NLKSLANMVENCDDEFLKFDLE  543 (578)
Q Consensus       494 vvvs~v----~A~~aGl-~~GD~I~~VNG~~V~--~~~~l~~~l~~~~~~~v~l~v~  543 (578)
                      |+++..    .|+..|+ -.||-|++|||.-|.  ..++.+++|++. ++.|+|+|.
T Consensus        82 vviSkI~kdQaAd~tG~LFvGDAilqvNGi~v~~c~HeevV~iLRNA-GdeVtlTV~  137 (505)
T KOG3549|consen   82 VVISKIYKDQAADITGQLFVGDAILQVNGIYVTACPHEEVVNILRNA-GDEVTLTVK  137 (505)
T ss_pred             EEeehhhhhhhhhhcCceEeeeeeEEeccEEeecCChHHHHHHHHhc-CCEEEEEeH
Confidence            677765    4555565 499999999999997  568899999975 666777773


No 128
>KOG3549 consensus Syntrophins (type gamma) [Extracellular structures]
Probab=66.48  E-value=7  Score=40.26  Aligned_cols=55  Identities=20%  Similarity=0.230  Sum_probs=41.3

Q ss_pred             CcEEEEeCCCCcccC-C-CCCCCEEEEECCEEeCCCCCCccccCcchhHHHHHhccCCCCEEEEEEE
Q 008087          354 GVRIRRVDPTAPESE-V-LKPSDIILSFDGIDIANDGTVPFRHGERIGFSYLVSQKYTGDSAAVKVL  418 (578)
Q Consensus       354 Gv~V~~V~p~spA~~-G-L~~GDiIl~InG~~V~~~~~v~~~~~~~~~~~~~~~~~~~g~~v~l~V~  418 (578)
                      -++|..+..+-.|+. | |-.||-|+.|||..|..-..-.        .-.++  .+.|+.|+|+|.
T Consensus        81 PvviSkI~kdQaAd~tG~LFvGDAilqvNGi~v~~c~Hee--------vV~iL--RNAGdeVtlTV~  137 (505)
T KOG3549|consen   81 PVVISKIYKDQAADITGQLFVGDAILQVNGIYVTACPHEE--------VVNIL--RNAGDEVTLTVK  137 (505)
T ss_pred             cEEeehhhhhhhhhhcCceEeeeeeEEeccEEeecCChHH--------HHHHH--HhcCCEEEEEeH
Confidence            478888988888887 5 8899999999999998765411        11233  347899998885


No 129
>KOG0606 consensus Microtubule-associated serine/threonine kinase and related proteins [Signal transduction mechanisms; General function prediction only]
Probab=65.76  E-value=9.2  Score=45.39  Aligned_cols=42  Identities=14%  Similarity=0.139  Sum_probs=34.6

Q ss_pred             EEEeecCCcCCCCCCCEEEEeCCeecCCH--HHHHHHHHhcCCC
Q 008087          495 SLLWCLRSPLCLNCFNKVLAFNGNPVKNL--KSLANMVENCDDE  536 (578)
Q Consensus       495 vvs~v~A~~aGl~~GD~I~~VNG~~V~~~--~~l~~~l~~~~~~  536 (578)
                      |..+++|+.+|++++|.|+.|||++|..+  .++.+++-+..++
T Consensus       665 v~egsPA~~agls~~DlIthvnge~v~gl~H~ev~~Lll~~gn~  708 (1205)
T KOG0606|consen  665 VEEGSPAFEAGLSAGDLITHVNGEPVHGLVHTEVMELLLKSGNK  708 (1205)
T ss_pred             ecCCCCccccCCCccceeEeccCcccchhhHHHHHHHHHhcCCe
Confidence            55567999999999999999999999955  6688888765444


No 130
>KOG0609 consensus Calcium/calmodulin-dependent serine protein kinase/membrane-associated guanylate kinase [Signal transduction mechanisms]
Probab=65.74  E-value=8.3  Score=42.16  Aligned_cols=56  Identities=23%  Similarity=0.307  Sum_probs=42.1

Q ss_pred             CcEEEEeCCCCcccC-C-CCCCCEEEEECCEEeCCCCCCccccCcchhHHHHHhccCCCCEEEEEEEE
Q 008087          354 GVRIRRVDPTAPESE-V-LKPSDIILSFDGIDIANDGTVPFRHGERIGFSYLVSQKYTGDSAAVKVLR  419 (578)
Q Consensus       354 Gv~V~~V~p~spA~~-G-L~~GDiIl~InG~~V~~~~~v~~~~~~~~~~~~~~~~~~~g~~v~l~V~R  419 (578)
                      -++|..+..|+.+.+ | |+.||.|+.|||..|.+..- .       .++.++....  ..++++|.-
T Consensus       147 ~~~vARI~~GG~~~r~glL~~GD~i~EvNGi~v~~~~~-~-------e~q~~l~~~~--G~itfkiiP  204 (542)
T KOG0609|consen  147 KVVVARIMHGGMADRQGLLHVGDEILEVNGISVANKSP-E-------ELQELLRNSR--GSITFKIIP  204 (542)
T ss_pred             ccEEeeeccCCcchhccceeeccchheecCeecccCCH-H-------HHHHHHHhCC--CcEEEEEcc
Confidence            488999999998888 5 89999999999999998532 1       1455665544  457777753


No 131
>KOG3938 consensus RGS-GAIP interacting protein GIPC, contains PDZ domain [Signal transduction mechanisms; Intracellular trafficking, secretion, and vesicular transport]
Probab=63.67  E-value=11  Score=37.45  Aligned_cols=38  Identities=18%  Similarity=0.325  Sum_probs=28.5

Q ss_pred             CCCCCCEEEEeCCeecCCHH--HHHHHHHhcC-CCeEEEEE
Q 008087          505 CLNCFNKVLAFNGNPVKNLK--SLANMVENCD-DEFLKFDL  542 (578)
Q Consensus       505 Gl~~GD~I~~VNG~~V~~~~--~l~~~l~~~~-~~~v~l~v  542 (578)
                      .+..||.|.+|||+.|-.+.  ++.++|+..+ ++..++.+
T Consensus       167 ~i~VGd~IEaiNge~ivG~RHYeVArmLKel~rge~ftlrL  207 (334)
T KOG3938|consen  167 AICVGDHIEAINGESIVGKRHYEVARMLKELPRGETFTLRL  207 (334)
T ss_pred             heeHHhHHHhhcCccccchhHHHHHHHHHhcccCCeeEEEe
Confidence            35799999999999998664  4678888874 45555444


No 132
>KOG3542 consensus cAMP-regulated guanine nucleotide exchange factor [Signal transduction mechanisms]
Probab=62.06  E-value=7.5  Score=43.42  Aligned_cols=51  Identities=14%  Similarity=0.186  Sum_probs=37.5

Q ss_pred             eEEEEee----cCCcCCCCCCCEEEEeCCeecCCHHH--HHHHHHhcCCCeEEEEEecC
Q 008087          493 MSSLLWC----LRSPLCLNCFNKVLAFNGNPVKNLKS--LANMVENCDDEFLKFDLEYD  545 (578)
Q Consensus       493 ~vvvs~v----~A~~aGl~~GD~I~~VNG~~V~~~~~--l~~~l~~~~~~~v~l~v~R~  545 (578)
                      ++++.++    -|++.||+.||.|++||||..+++..  ..++|.+  +-.+.|++.-+
T Consensus       563 gifV~~V~pgskAa~~GlKRgDqilEVNgQnfenis~~KA~eiLrn--nthLtltvKtN  619 (1283)
T KOG3542|consen  563 GIFVAEVFPGSKAAREGLKRGDQILEVNGQNFENISAKKAEEILRN--NTHLTLTVKTN  619 (1283)
T ss_pred             eeEEeeecCCchHHHhhhhhhhhhhhccccchhhhhHHHHHHHhcC--CceEEEEEecc
Confidence            5676665    58899999999999999999997754  3445543  34667777544


No 133
>KOG3938 consensus RGS-GAIP interacting protein GIPC, contains PDZ domain [Signal transduction mechanisms; Intracellular trafficking, secretion, and vesicular transport]
Probab=57.62  E-value=6.4  Score=39.14  Aligned_cols=56  Identities=13%  Similarity=0.237  Sum_probs=44.8

Q ss_pred             cEEEEeCCCCcccC--CCCCCCEEEEECCEEeCCCCCCccccCcchhHHHHHhccCCCCEEEEEEE
Q 008087          355 VRIRRVDPTAPESE--VLKPSDIILSFDGIDIANDGTVPFRHGERIGFSYLVSQKYTGDSAAVKVL  418 (578)
Q Consensus       355 v~V~~V~p~spA~~--GL~~GDiIl~InG~~V~~~~~v~~~~~~~~~~~~~~~~~~~g~~v~l~V~  418 (578)
                      ..|..+.++|....  -++.||.|-+|||+.|-.+....        ...++.....|++.+|.+.
T Consensus       151 AFIKrIkegsvidri~~i~VGd~IEaiNge~ivG~RHYe--------VArmLKel~rge~ftlrLi  208 (334)
T KOG3938|consen  151 AFIKRIKEGSVIDRIEAICVGDHIEAINGESIVGKRHYE--------VARMLKELPRGETFTLRLI  208 (334)
T ss_pred             eeeEeecCCchhhhhhheeHHhHHHhhcCccccchhHHH--------HHHHHHhcccCCeeEEEee
Confidence            57888899998877  79999999999999999887643        3466777777887777665


No 134
>KOG3551 consensus Syntrophins (type beta) [Extracellular structures]
Probab=56.36  E-value=11  Score=39.53  Aligned_cols=54  Identities=7%  Similarity=0.121  Sum_probs=38.7

Q ss_pred             ecCCcCC-CCCCCEEEEeCCeecC--CHHHHHHHHHhcCCCeEEEEE--ecCeEEEEech
Q 008087          499 CLRSPLC-LNCFNKVLAFNGNPVK--NLKSLANMVENCDDEFLKFDL--EYDQVVVLRTK  553 (578)
Q Consensus       499 v~A~~aG-l~~GD~I~~VNG~~V~--~~~~l~~~l~~~~~~~v~l~v--~R~~~~~l~~~  553 (578)
                      ..|++.+ |..||.|++|||....  +.++.+++|+.. ++.|.++|  .|+-..++.+.
T Consensus       121 lAADQt~aL~~gDaIlSVNG~dL~~AtHdeAVqaLKra-GkeV~levKy~REvtPy~kk~  179 (506)
T KOG3551|consen  121 LAADQTGALFLGDAILSVNGEDLRDATHDEAVQALKRA-GKEVLLEVKYMREVTPYFKKE  179 (506)
T ss_pred             cccccccceeeccEEEEecchhhhhcchHHHHHHHHhh-CceeeeeeeeehhcchhhccC
Confidence            3466655 4699999999999997  668889999875 56565555  56655566543


No 135
>KOG3651 consensus Protein kinase C, alpha binding protein [Signal transduction mechanisms]
Probab=55.76  E-value=25  Score=35.65  Aligned_cols=47  Identities=21%  Similarity=0.261  Sum_probs=35.6

Q ss_pred             EEEeecCCcCC-CCCCCEEEEeCCeecC--CHHHHHHHHHhcCCCeEEEEE
Q 008087          495 SLLWCLRSPLC-LNCFNKVLAFNGNPVK--NLKSLANMVENCDDEFLKFDL  542 (578)
Q Consensus       495 vvs~v~A~~aG-l~~GD~I~~VNG~~V~--~~~~l~~~l~~~~~~~v~l~v  542 (578)
                      |..+.||++-| ++.||.|++|||..|+  +--+..++|+...++ |++.+
T Consensus        37 vFD~tPAa~dG~i~~GDEi~avNg~svKGktKveVAkmIQ~~~~e-V~Ihy   86 (429)
T KOG3651|consen   37 VFDKTPAAKDGRIRCGDEIVAVNGISVKGKTKVEVAKMIQVSLNE-VKIHY   86 (429)
T ss_pred             eccCCchhccCccccCCeeEEecceeecCccHHHHHHHHHHhccc-eEEEe
Confidence            34455777666 5899999999999998  567788899887655 55554


No 136
>KOG3834 consensus Golgi reassembly stacking protein GRASP65, contains PDZ domain [Intracellular trafficking, secretion, and vesicular transport]
Probab=54.89  E-value=11  Score=40.18  Aligned_cols=50  Identities=12%  Similarity=0.180  Sum_probs=41.1

Q ss_pred             EEEeecCCcCCCC-CCCEEEEeCCeecC-CHHHHHHHHHhcCCCeEEEEEecC
Q 008087          495 SLLWCLRSPLCLN-CFNKVLAFNGNPVK-NLKSLANMVENCDDEFLKFDLEYD  545 (578)
Q Consensus       495 vvs~v~A~~aGl~-~GD~I~~VNG~~V~-~~~~l~~~l~~~~~~~v~l~v~R~  545 (578)
                      |..+++|+++||. --|.|++|||..++ +-+.|...++++..+ |++++-.-
T Consensus        22 VqedSpa~~aglepffdFIvSI~g~rL~~dnd~Lk~llk~~sek-Vkltv~n~   73 (462)
T KOG3834|consen   22 VQEDSPAHKAGLEPFFDFIVSINGIRLNKDNDTLKALLKANSEK-VKLTVYNS   73 (462)
T ss_pred             eecCChHHhcCcchhhhhhheeCcccccCchHHHHHHHHhcccc-eEEEEEec
Confidence            5567899999997 68999999999998 556788888888777 88887443


No 137
>PF05416 Peptidase_C37:  Southampton virus-type processing peptidase;  InterPro: IPR001665 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. 
These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionary related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad [].  This group of cysteine peptidases belong to the MEROPS peptidase family C37, (clan PA(C)). The type example is calicivirin from Southampton virus, an endopeptidase that cleaves the polyprotein at sites N-terminal to itself, liberating the polyprotein helicase. Southampton virus is a positive-stranded ssRNA virus belonging to the Caliciviruses, which are viruses that cause gastroenteritis. The calicivirus genome contains two open reading frames, ORF1 and ORF2. ORF1 encodes a non-structural polypeptide, which has RNA helicase, cysteine protease and RNA polymerase activity []. The regions of the polyprotein in which these activities lie are similar to proteins produced by the picornaviruses []. ORF2 encodes a structural, capsid protein. Two different families of caliciviruses can be distinguished on the basis of sequence similarity, namely the Norwalk-like viruses or small round structured viruses (SRSVs), and those classed as non-SRSVs.; GO: 0004197 cysteine-type endopeptidase activity, 0006508 proteolysis; PDB: 2FYQ_A 2FYR_A 1WQS_D 4ASH_A 2IPH_B.
Probab=53.25  E-value=1.4e+02  Score=32.01  Aligned_cols=136  Identities=16%  Similarity=0.214  Sum_probs=69.3

Q ss_pred             eEEEEEEEcCCEEEecccccCCCceEEEEEcCCCcEEEEEEEEEcCCCCEEEEEEeeCCCCCCeeeEEcCCCCCCCCcEE
Q 008087          149 SSSSGFAIGGRRVLTNAHSVEHYTQVKLKKRGSDTKYLATVLAIGTECDIAMLTVEDDEFWEGVLPVEFGELPALQDAVT  228 (578)
Q Consensus       149 ~~GsGfvI~~g~ILT~aHvV~~~~~i~V~~~~~g~~~~a~vv~~d~~~DlAlLkv~~~~~~~~~~~l~l~~~~~~g~~V~  228 (578)
                      ++|=||-|++.+.+|+-||+.....-   ++  |  .+-.-+.++..-+|.-+++..+- ..++.-+-|.+...-|.-+.
T Consensus       379 GsGWGfWVS~~lfITttHViP~g~~E---~F--G--v~i~~i~vh~sGeF~~~rFpk~i-RPDvtgmiLEeGapEGtV~s  450 (535)
T PF05416_consen  379 GSGWGFWVSPTLFITTTHVIPPGAKE---AF--G--VPISQIQVHKSGEFCRFRFPKPI-RPDVTGMILEEGAPEGTVCS  450 (535)
T ss_dssp             TTEEEEESSSSEEEEEGGGS-STTSE---ET--T--EECGGEEEEEETTEEEEEESS-S-STTS---EE-SS--TT-EEE
T ss_pred             CCceeeeecceEEEEeeeecCCcchh---hh--C--CChhHeEEeeccceEEEecCCCC-CCCccceeeccCCCCceEEE
Confidence            46889999999999999999854210   00  0  01111344555677777777642 12455555544333444333


Q ss_pred             -EEeecCCCC-cceEEEeEEeceeeeee-cCCceeeeEEEE-------ecCCCCCCccceEEccCC---eEEEEEeccc
Q 008087          229 -VVGYPIGGD-TISVTSGVVSRIEILSY-VHGSTELLGLQI-------DAAINSGNSGGPAFNDKG---KCVGIAFQSL  294 (578)
Q Consensus       229 -~iG~p~g~~-~~sv~~G~Is~~~~~~~-~~~~~~~~~i~~-------da~i~~G~SGGPlvn~~G---~vVGI~~~~~  294 (578)
                       .|=++.|.. .+.+..|.......... ..+  ...++.+       |-...||+-|-|-|-..|   -|+|++.+..
T Consensus       451 iLiKR~sGEllpLAvRMgt~AsmkIqgr~v~G--Q~GMLLTGaNAK~mDLGT~PGDCGcPYvyKrgNd~VV~GVH~AAt  527 (535)
T PF05416_consen  451 ILIKRPSGELLPLAVRMGTHASMKIQGRTVHG--QMGMLLTGANAKGMDLGTIPGDCGCPYVYKRGNDWVVIGVHAAAT  527 (535)
T ss_dssp             EEEE-TTSBEEEEEEEEEEEEEEEETTEEEEE--EEEEETTSTT-SSTTTS--TTGTT-EEEEEETTEEEEEEEEEEE-
T ss_pred             EEEEcCCccchhhhhhhccceeEEEcceeecc--eeeeeeecCCccccccCCCCCCCCCceeeecCCcEEEEEEEehhc
Confidence             345565522 23566776666543211 111  1123333       335668999999996654   6999998854


No 138
>KOG3571 consensus Dishevelled 3 and related proteins [General function prediction only]
Probab=49.27  E-value=30  Score=37.63  Aligned_cols=53  Identities=8%  Similarity=-0.026  Sum_probs=37.6

Q ss_pred             CCceEEEEee-----cCCcCCCCCCCEEEEeCCeecCCH--HHHHHHHHhc--CCCeEEEEE
Q 008087          490 NCQMSSLLWC-----LRSPLCLNCFNKVLAFNGNPVKNL--KSLANMVENC--DDEFLKFDL  542 (578)
Q Consensus       490 ~~~~vvvs~v-----~A~~aGl~~GD~I~~VNG~~V~~~--~~l~~~l~~~--~~~~v~l~v  542 (578)
                      .+++++|..+     -|...-+.+||.|+.||.....++  ++.+++|+..  +...++|++
T Consensus       275 gDggIYVgsImkgGAVA~DGRIe~GDMiLQVNevsFENmSNd~AVrvLREaV~~~gPi~ltv  336 (626)
T KOG3571|consen  275 GDGGIYVGSIMKGGAVALDGRIEPGDMILQVNEVSFENMSNDQAVRVLREAVSRPGPIKLTV  336 (626)
T ss_pred             CCCceEEeeeccCceeeccCccCccceEEEeeecchhhcCchHHHHHHHHHhccCCCeEEEE
Confidence            3466777654     255555789999999999999866  6777777764  334566665


No 139
>cd01720 Sm_D2 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation.  Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. Sm subunit D2 heterodimerizes with subunit D1 and three such heterodimers form a hexameric ring structure with alternating D1 and D2 subunits. The D1 - D2 heterodimer also assembles into a heptameric ring containing D2, D3, E, F, and G subunits. Sm-like proteins exist in archaea as well as prokaryotes which form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=43.82  E-value=46  Score=27.61  Aligned_cols=37  Identities=27%  Similarity=0.442  Sum_probs=30.3

Q ss_pred             ccCCCceEEEEEcCCCcEEEEEEEEEcCCCCEEEEEEe
Q 008087          167 SVEHYTQVKLKKRGSDTKYLATVLAIGTECDIAMLTVE  204 (578)
Q Consensus       167 vV~~~~~i~V~~~~~g~~~~a~vv~~d~~~DlAlLkv~  204 (578)
                      ++.....+.|.+ .+++.+.+++.++|.+.+|.|=...
T Consensus        10 ~~~~~~~V~V~l-r~~r~~~G~L~~fD~hmNlvL~d~~   46 (87)
T cd01720          10 AVKNNTQVLINC-RNNKKLLGRVKAFDRHCNMVLENVK   46 (87)
T ss_pred             HHcCCCEEEEEE-cCCCEEEEEEEEecCccEEEEcceE
Confidence            344567889999 5999999999999999999876553


No 140
>cd00600 Sm_like The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation.  Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet.  Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=40.88  E-value=75  Score=23.92  Aligned_cols=33  Identities=12%  Similarity=0.200  Sum_probs=27.8

Q ss_pred             ceEEEEEcCCCcEEEEEEEEEcCCCCEEEEEEee
Q 008087          172 TQVKLKKRGSDTKYLATVLAIGTECDIAMLTVED  205 (578)
Q Consensus       172 ~~i~V~~~~~g~~~~a~vv~~d~~~DlAlLkv~~  205 (578)
                      ..+.|.+ .+|+.|.+.+..+|...++.|-....
T Consensus         7 ~~V~V~l-~~g~~~~G~L~~~D~~~Ni~L~~~~~   39 (63)
T cd00600           7 KTVRVEL-KDGRVLEGVLVAFDKYMNLVLDDVEE   39 (63)
T ss_pred             CEEEEEE-CCCcEEEEEEEEECCCCCEEECCEEE
Confidence            4688888 59999999999999999998776543


No 141
>cd01726 LSm6 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation.  Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm6 is one of at least seven subunits that assemble onto U6 snRNA to form a seven-membered ring structure.  Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=36.53  E-value=82  Score=24.49  Aligned_cols=32  Identities=22%  Similarity=0.256  Sum_probs=27.1

Q ss_pred             ceEEEEEcCCCcEEEEEEEEEcCCCCEEEEEEe
Q 008087          172 TQVKLKKRGSDTKYLATVLAIGTECDIAMLTVE  204 (578)
Q Consensus       172 ~~i~V~~~~~g~~~~a~vv~~d~~~DlAlLkv~  204 (578)
                      ..+.|.+ .+|+.|.+++..+|...+|-|=...
T Consensus        11 ~~V~V~L-k~g~~~~G~L~~~D~~mNlvL~~~~   42 (67)
T cd01726          11 RPVVVKL-NSGVDYRGILACLDGYMNIALEQTE   42 (67)
T ss_pred             CeEEEEE-CCCCEEEEEEEEEccceeeEEeeEE
Confidence            4688888 5999999999999999999886553


No 142
>PRK00737 small nuclear ribonucleoprotein; Provisional
Probab=35.78  E-value=91  Score=24.67  Aligned_cols=33  Identities=6%  Similarity=0.189  Sum_probs=28.0

Q ss_pred             ceEEEEEcCCCcEEEEEEEEEcCCCCEEEEEEee
Q 008087          172 TQVKLKKRGSDTKYLATVLAIGTECDIAMLTVED  205 (578)
Q Consensus       172 ~~i~V~~~~~g~~~~a~vv~~d~~~DlAlLkv~~  205 (578)
                      ..+.|.+ .+|+.|.+++.++|...++-|=....
T Consensus        15 k~V~V~l-k~g~~~~G~L~~~D~~mNlvL~d~~e   47 (72)
T PRK00737         15 SPVLVRL-KGGREFRGELQGYDIHMNLVLDNAEE   47 (72)
T ss_pred             CEEEEEE-CCCCEEEEEEEEEcccceeEEeeEEE
Confidence            4688888 59999999999999999998876543


No 143
>cd01722 Sm_F The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation.  Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. Sm subunit F is capable of forming both homo- and hetero-heptamer ring structures.  To form the hetero-heptamer, Sm subunit F initially binds subunits E and G to form a trimer which then assembles onto snRNA along with the D3/B and D1/D2 heterodimers.
Probab=35.33  E-value=82  Score=24.60  Aligned_cols=32  Identities=16%  Similarity=0.266  Sum_probs=27.0

Q ss_pred             ceEEEEEcCCCcEEEEEEEEEcCCCCEEEEEEe
Q 008087          172 TQVKLKKRGSDTKYLATVLAIGTECDIAMLTVE  204 (578)
Q Consensus       172 ~~i~V~~~~~g~~~~a~vv~~d~~~DlAlLkv~  204 (578)
                      ..+.|.+ .+|+.|.+++..+|...+|.|=...
T Consensus        12 ~~V~V~L-k~g~~~~G~L~~~D~~mNi~L~~~~   43 (68)
T cd01722          12 KPVIVKL-KWGMEYKGTLVSVDSYMNLQLANTE   43 (68)
T ss_pred             CEEEEEE-CCCcEEEEEEEEECCCEEEEEeeEE
Confidence            4688888 6999999999999999999875543


No 144
>cd01731 archaeal_Sm1 The archaeal sm1 proteins: The Sm proteins are conserved in all three domains of life and are always associated with U-rich RNA sequences. They function to mediate RNA-RNA interactions and RNA biogenesis.  All Sm proteins contain a common sequence motif in two segments, Sm1 and Sm2, separated by a short variable linker. Eukaryotic Sm proteins form part of specific small nuclear ribonucleoproteins (snRNPs) that are involved in the processing of pre-mRNAs to mature mRNAs, and are a major component of the eukaryotic spliceosome. Most snRNPs consist of seven Sm proteins (B/B', D1, D2, D3, E, F and G) arranged in a ring on a uridine-rich sequence (Sm site), plus a small nuclear RNA (snRNA) (either U1, U2, U5 or U4/6). Since archaebacteria do not have any splicing apparatus, Sm proteins of archaebacteria may play a more general role. Archaeal Lsm proteins are likely to represent the ancestral Sm domain.
Probab=34.57  E-value=99  Score=24.03  Aligned_cols=33  Identities=9%  Similarity=0.133  Sum_probs=28.4

Q ss_pred             ceEEEEEcCCCcEEEEEEEEEcCCCCEEEEEEee
Q 008087          172 TQVKLKKRGSDTKYLATVLAIGTECDIAMLTVED  205 (578)
Q Consensus       172 ~~i~V~~~~~g~~~~a~vv~~d~~~DlAlLkv~~  205 (578)
                      ..+.|.+ .+|+.+.+++..+|...+|.|-....
T Consensus        11 ~~V~V~l-~~g~~~~G~L~~~D~~mNlvL~~~~e   43 (68)
T cd01731          11 KPVLVKL-KGGKEVRGRLKSYDQHMNLVLEDAEE   43 (68)
T ss_pred             CEEEEEE-CCCCEEEEEEEEECCcceEEEeeEEE
Confidence            5688888 58999999999999999998877653


No 145
>cd06168 LSm9 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation.  Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm9 proteins have a single Sm-like domain structure.  Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=33.05  E-value=1e+02  Score=24.67  Aligned_cols=31  Identities=10%  Similarity=0.338  Sum_probs=26.3

Q ss_pred             ceEEEEEcCCCcEEEEEEEEEcCCCCEEEEEE
Q 008087          172 TQVKLKKRGSDTKYLATVLAIGTECDIAMLTV  203 (578)
Q Consensus       172 ~~i~V~~~~~g~~~~a~vv~~d~~~DlAlLkv  203 (578)
                      ..+.|.+ .||+.+.+++.++|...+|.|=..
T Consensus        11 ~~v~V~l-~dgR~~~G~l~~~D~~~NivL~~~   41 (75)
T cd06168          11 RTMRIHM-TDGRTLVGVFLCTDRDCNIILGSA   41 (75)
T ss_pred             CeEEEEE-cCCeEEEEEEEEEcCCCcEEecCc
Confidence            4688888 699999999999999999876544


No 146
>cd01717 Sm_B The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation.  Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. Sm subunit B heterodimerizes with subunit D3 and three such heterodimers form a hexameric ring structure with alternating B and D3 subunits.  The D3 - B heterodimer also assembles into a heptameric ring containing D1, D2, E, F, and G subunits.  Sm-like proteins exist in archaea as well as prokaryotes which form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=32.86  E-value=96  Score=24.99  Aligned_cols=32  Identities=9%  Similarity=0.301  Sum_probs=26.8

Q ss_pred             ceEEEEEcCCCcEEEEEEEEEcCCCCEEEEEEe
Q 008087          172 TQVKLKKRGSDTKYLATVLAIGTECDIAMLTVE  204 (578)
Q Consensus       172 ~~i~V~~~~~g~~~~a~vv~~d~~~DlAlLkv~  204 (578)
                      ..+.|.+ .+|+.+.+.+.++|...+|.|=...
T Consensus        11 ~~V~V~l-~dgR~~~G~L~~~D~~~NlVL~~~~   42 (79)
T cd01717          11 YRLRVTL-QDGRQFVGQFLAFDKHMNLVLSDCE   42 (79)
T ss_pred             CEEEEEE-CCCcEEEEEEEEEcCccCEEcCCEE
Confidence            4678888 6999999999999999999765443


No 147
>COG0298 HypC Hydrogenase maturation factor [Posttranslational modification, protein turnover, chaperones]
Probab=32.70  E-value=1e+02  Score=25.03  Aligned_cols=48  Identities=23%  Similarity=0.283  Sum_probs=32.1

Q ss_pred             EEEEEEEEcCCCCEEEEEEeeCCCCCCeeeEEcCCCCCCCCcEEE-EeecC
Q 008087          185 YLATVLAIGTECDIAMLTVEDDEFWEGVLPVEFGELPALQDAVTV-VGYPI  234 (578)
Q Consensus       185 ~~a~vv~~d~~~DlAlLkv~~~~~~~~~~~l~l~~~~~~g~~V~~-iG~p~  234 (578)
                      ++++++..|...++|++.+-.-.  ..+.---+....++|++|.+ +||-.
T Consensus         5 iPgqI~~I~~~~~~A~Vd~gGvk--reV~l~Lv~~~v~~GdyVLVHvGfAi   53 (82)
T COG0298           5 IPGQIVEIDDNNHLAIVDVGGVK--REVNLDLVGEEVKVGDYVLVHVGFAM   53 (82)
T ss_pred             cccEEEEEeCCCceEEEEeccEe--EEEEeeeecCccccCCEEEEEeeEEE
Confidence            57899999988889999887532  11222222336788999876 67654


No 148
>cd01735 LSm12_N LSm12 belongs to a family of Sm-like proteins that associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation.  Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet that associates with other Sm proteins to form hexameric and heptameric ring structures.   In addition to the N-terminal Sm-like domain, LSm12 has a novel methyltransferase domain.
Probab=32.16  E-value=1.6e+02  Score=22.66  Aligned_cols=34  Identities=15%  Similarity=0.190  Sum_probs=27.5

Q ss_pred             CceEEEEEcCCCcEEEEEEEEEcCCCCEEEEEEee
Q 008087          171 YTQVKLKKRGSDTKYLATVLAIGTECDIAMLTVED  205 (578)
Q Consensus       171 ~~~i~V~~~~~g~~~~a~vv~~d~~~DlAlLkv~~  205 (578)
                      +..+.++. -.|..++++|+.+|....+.+|+-..
T Consensus         6 Gs~V~~kT-c~g~~ieGEV~afD~~tk~lIlk~~s   39 (61)
T cd01735           6 GSQVSCRT-CFEQRLQGEVVAFDYPSKMLILKCPS   39 (61)
T ss_pred             ccEEEEEe-cCCceEEEEEEEecCCCcEEEEECcc
Confidence            44566666 37899999999999999999998554


No 149
>cd01730 LSm3 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation.  Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm3 is one of at least seven subunits that assemble onto U6 snRNA to form a seven-membered ring structure.  Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=32.04  E-value=89  Score=25.41  Aligned_cols=31  Identities=16%  Similarity=0.164  Sum_probs=26.1

Q ss_pred             ceEEEEEcCCCcEEEEEEEEEcCCCCEEEEEE
Q 008087          172 TQVKLKKRGSDTKYLATVLAIGTECDIAMLTV  203 (578)
Q Consensus       172 ~~i~V~~~~~g~~~~a~vv~~d~~~DlAlLkv  203 (578)
                      ..+.|.+ .+|+.+.+++.++|.+.+|.|=..
T Consensus        12 k~V~V~l-~~gr~~~G~L~~fD~~mNlvL~d~   42 (82)
T cd01730          12 ERVYVKL-RGDRELRGRLHAYDQHLNMILGDV   42 (82)
T ss_pred             CEEEEEE-CCCCEEEEEEEEEccceEEeccce
Confidence            4688888 599999999999999999876443


No 150
>PF00571 CBS:  CBS domain. Mutations in the CBS domain of Swiss:P35520 lead to homocystinuria.;  InterPro: IPR000644 CBS (cystathionine-beta-synthase) domains are small intracellular modules, mostly found in two or four copies within a protein, that occur in a variety of proteins in bacteria, archaea, and eukaryotes. Tandem pairs of CBS domains can act as binding domains for adenosine derivatives and may regulate the activity of attached enzymatic or other domains. In some cases, CBS domains may act as sensors of cellular energy status by being activated by AMP and inhibited by ATP. In chloride ion channels, the CBS domains have been implicated in intracellular targeting and trafficking, as well as in protein-protein interactions, but results vary with different channels: in the CLC-5 channel, the CBS domain was shown to be required for trafficking, while in the CLC-1 channel, the CBS domain was shown to be critical for channel function, but not necessary for trafficking. Recent experiments revealing that CBS domains can bind adenosine-containing ligands such as ATP, AMP, or S-adenosylmethionine have led to the hypothesis that CBS domains function as sensors of intracellular metabolites. Crystallographic studies of CBS domains have shown that pairs of CBS sequences form a globular domain where each CBS unit adopts a beta-alpha-beta-beta-alpha pattern. Crystal structure of the CBS domains of the AMP-activated protein kinase in complexes with AMP and ATP shows that the phosphate groups of AMP/ATP lie in a surface pocket at the interface of two CBS domains, which is lined with basic residues, many of which are associated with disease-causing mutations.
In humans, mutations in conserved residues within CBS domains cause a variety of human hereditary diseases, including (with the gene mutated in parentheses): homocystinuria (cystathionine beta-synthase); Wolff-Parkinson-White syndrome (gamma 2 subunit of AMP-activated protein kinase); retinitis pigmentosa (IMP dehydrogenase-1); congenital myotonia, idiopathic generalized epilepsy, hypercalciuric nephrolithiasis, and classic Bartter syndrome (CLC chloride channel family members).; GO: 0005515 protein binding; PDB: 3JTF_A 3TE5_C 3TDH_C 3T4N_C 2QLV_C 3OI8_A 3LV9_A 2QH1_B 1PVM_B 3LQN_A ....
Probab=30.53  E-value=52  Score=23.86  Aligned_cols=21  Identities=38%  Similarity=0.555  Sum_probs=17.6

Q ss_pred             CCCccceEEccCCeEEEEEec
Q 008087          272 SGNSGGPAFNDKGKCVGIAFQ  292 (578)
Q Consensus       272 ~G~SGGPlvn~~G~vVGI~~~  292 (578)
                      .+.+.-|++|.+|+++|+.+.
T Consensus        28 ~~~~~~~V~d~~~~~~G~is~   48 (57)
T PF00571_consen   28 NGISRLPVVDEDGKLVGIISR   48 (57)
T ss_dssp             HTSSEEEEESTTSBEEEEEEH
T ss_pred             cCCcEEEEEecCCEEEEEEEH
Confidence            356678999999999999875


No 151
>cd01729 LSm7 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation.  Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm7 is one of at least seven subunits that assemble onto U6 snRNA to form a seven-membered ring structure.  Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=30.15  E-value=1.2e+02  Score=24.66  Aligned_cols=31  Identities=3%  Similarity=0.070  Sum_probs=26.2

Q ss_pred             ceEEEEEcCCCcEEEEEEEEEcCCCCEEEEEE
Q 008087          172 TQVKLKKRGSDTKYLATVLAIGTECDIAMLTV  203 (578)
Q Consensus       172 ~~i~V~~~~~g~~~~a~vv~~d~~~DlAlLkv  203 (578)
                      ..+.|.+ .+|+.+.+++.++|...+|.|=..
T Consensus        13 k~V~V~l-~~gr~~~G~L~~~D~~mNlvL~~~   43 (81)
T cd01729          13 KKIRVKF-QGGREVTGILKGYDQLLNLVLDDT   43 (81)
T ss_pred             CeEEEEE-CCCcEEEEEEEEEcCcccEEecCE
Confidence            4678888 589999999999999999877544


No 152
>cd01732 LSm5 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation.  Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm5 is one of at least seven subunits that assemble onto U6 snRNA to form a seven-membered ring structure.  Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=30.02  E-value=1.1e+02  Score=24.65  Aligned_cols=31  Identities=16%  Similarity=0.348  Sum_probs=26.4

Q ss_pred             ceEEEEEcCCCcEEEEEEEEEcCCCCEEEEEE
Q 008087          172 TQVKLKKRGSDTKYLATVLAIGTECDIAMLTV  203 (578)
Q Consensus       172 ~~i~V~~~~~g~~~~a~vv~~d~~~DlAlLkv  203 (578)
                      ..+.|.+ .+++.+.+++.++|...++.|=..
T Consensus        14 ~~V~V~l-~~gr~~~G~L~g~D~~mNlvL~da   44 (76)
T cd01732          14 SRIWIVM-KSDKEFVGTLLGFDDYVNMVLEDV   44 (76)
T ss_pred             CEEEEEE-CCCeEEEEEEEEeccceEEEEccE
Confidence            5688888 699999999999999999987544


No 153
>cd01721 Sm_D3 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation.  Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. Sm subunit D3 heterodimerizes with subunit B and three such heterodimers form a hexameric ring structure with alternating B and D3 subunits. The D3 - B heterodimer also assembles into a heptameric ring containing D1, D2, E, F, and G subunits. Sm-like proteins exist in archaea as well as prokaryotes which form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=29.41  E-value=1.3e+02  Score=23.60  Aligned_cols=33  Identities=9%  Similarity=0.113  Sum_probs=28.6

Q ss_pred             CceEEEEEcCCCcEEEEEEEEEcCCCCEEEEEEe
Q 008087          171 YTQVKLKKRGSDTKYLATVLAIGTECDIAMLTVE  204 (578)
Q Consensus       171 ~~~i~V~~~~~g~~~~a~vv~~d~~~DlAlLkv~  204 (578)
                      ...+.|.+ .+|..|.+++..+|...++.|-...
T Consensus        10 g~~V~VeL-k~g~~~~G~L~~~D~~MNl~L~~~~   42 (70)
T cd01721          10 GHIVTVEL-KTGEVYRGKLIEAEDNMNCQLKDVT   42 (70)
T ss_pred             CCEEEEEE-CCCcEEEEEEEEEcCCceeEEEEEE
Confidence            45688888 5899999999999999999988774


No 154
>KOG1738 consensus Membrane-associated guanylate kinase-interacting protein/connector enhancer of KSR-like [Nucleotide transport and metabolism]
Probab=28.72  E-value=29  Score=38.77  Aligned_cols=34  Identities=9%  Similarity=0.085  Sum_probs=30.8

Q ss_pred             cEEEEeCCCCcccC--CCCCCCEEEEECCEEeCCCC
Q 008087          355 VRIRRVDPTAPESE--VLKPSDIILSFDGIDIANDG  388 (578)
Q Consensus       355 v~V~~V~p~spA~~--GL~~GDiIl~InG~~V~~~~  388 (578)
                      .+|.++.++|||..  .|..||.|+.||++.|-.|+
T Consensus       227 h~~s~~~e~Spad~~~kI~dgdEv~qiN~qtvVgwq  262 (638)
T KOG1738|consen  227 HVTSKIFEQSPADYRQKILDGDEVLQINEQTVVGWQ  262 (638)
T ss_pred             eeccccccCChHHHhhcccCccceeeecccccccch
Confidence            56778899999988  79999999999999999996


No 155
>cd01719 Sm_G The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation.  Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet.  Sm subunit G binds subunits E and F to form a trimer which then assembles onto snRNA along with the D1/D2 and D3/B heterodimers forming a seven-membered ring structure.  Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=28.29  E-value=1.4e+02  Score=23.61  Aligned_cols=31  Identities=10%  Similarity=0.129  Sum_probs=26.1

Q ss_pred             ceEEEEEcCCCcEEEEEEEEEcCCCCEEEEEE
Q 008087          172 TQVKLKKRGSDTKYLATVLAIGTECDIAMLTV  203 (578)
Q Consensus       172 ~~i~V~~~~~g~~~~a~vv~~d~~~DlAlLkv  203 (578)
                      ..+.|.+ .+|+.+.+++.++|...+|.|=..
T Consensus        11 k~V~V~L-~~g~~~~G~L~~~D~~mNlvL~~~   41 (72)
T cd01719          11 KKLSLKL-NGNRKVSGILRGFDPFMNLVLDDA   41 (72)
T ss_pred             CeEEEEE-CCCeEEEEEEEEEcccccEEeccE
Confidence            4678888 599999999999999999877544


No 156
>KOG4371 consensus Membrane-associated protein tyrosine phosphatase PTP-BAS and related proteins, contain FERM domain [Signal transduction mechanisms]
Probab=28.10  E-value=1.9e+02  Score=34.70  Aligned_cols=155  Identities=16%  Similarity=0.154  Sum_probs=0.0

Q ss_pred             ccCCcccccccChhhhhhhccccCCCCcEEEEeCCCCcccC-CCCCCCEEEEECCEEeCCCCCCccccCcchhHHHHHhc
Q 008087          328 PLLGVEWQKMENPDLRVAMSMKADQKGVRIRRVDPTAPESE-VLKPSDIILSFDGIDIANDGTVPFRHGERIGFSYLVSQ  406 (578)
Q Consensus       328 ~~lGi~~~~~~~~~~~~~lgl~~~~~Gv~V~~V~p~spA~~-GL~~GDiIl~InG~~V~~~~~v~~~~~~~~~~~~~~~~  406 (578)
                      |.||++...+              ...+-+....-...-.. -|+.||+++.+||..+...-.          ....-..
T Consensus      1158 ~~l~~~~a~~--------------~~~~~~~~~~~~~~~~~pd~~~g~~l~~~n~i~~~~~~~----------~~~~~~~ 1213 (1332)
T KOG4371|consen 1158 GSLGVQIASL--------------SGRVCIKQLTSEPAISHPDIRVGDVLLYVNGIAVEGKVH----------QEVVAML 1213 (1332)
T ss_pred             CCCCceeccC--------------ccceehhhcccCCCCCCCCcchhhhhhhccceeeechhh----------HHHHHHH


Q ss_pred             cCCCCEEEEEEEE---------------CCEEEEEEEEecccccccCCCCCCCCCCceeeccEEEeehHHHHHHhchhhh
Q 008087          407 KYTGDSAAVKVLR---------------DSKILNFNITLATHRRLIPSHNKGRPPSYYIIAGFVFSRCLYLISVLSMERI  471 (578)
Q Consensus       407 ~~~g~~v~l~V~R---------------~g~~~~~~v~l~~~~~~~~~~~~~~~p~~~~~~Gl~~~~~p~~~~~~g~~~~  471 (578)
                      ...|+.+.|-|+|               +...+.-...+........-....+.|+--++.|                  
T Consensus      1214 ~~~~~~~~~~~~r~~~~~~d~~~~s~~~~~~~l~~~~~~~~p~~~~~~~~~~~~~s~~~~~~------------------ 1275 (1332)
T KOG4371|consen 1214 RGGGDRVVLGVQRPPPAYSDQHHASSTSASAPLISVMLLKKPMATLGLSLAKRTMSDGIFIR------------------ 1275 (1332)
T ss_pred             hccCceEEEEeecCCcccccchhhhhhcccchhhhheeeecccccccccccccCcCCceeee------------------


Q ss_pred             hhhhhccccccccccccCCCceEEEEeecCCcCCC-CCCCEEEEeCCeecC--CHHHHHHHHHhcCCCeEEEEEecCeE
Q 008087          472 MNMKLRSSFWTSSCIQCHNCQMSSLLWCLRSPLCL-NCFNKVLAFNGNPVK--NLKSLANMVENCDDEFLKFDLEYDQV  547 (578)
Q Consensus       472 ~~~~~l~~~~~~~~~~~~~~~~vvvs~v~A~~aGl-~~GD~I~~VNG~~V~--~~~~l~~~l~~~~~~~v~l~v~R~~~  547 (578)
                                            ++.++..|.-.|- +.||.++..+|+++.  ......+.++ .--+.+.+.+.|+++
T Consensus      1276 ----------------------~~~~~~~a~~~~~~r~g~~~~~~~~~~~~~~~p~~~l~~~~-~v~~p~~~~~~~~q~ 1331 (1332)
T KOG4371|consen 1276 ----------------------NIAQDSAASSEGTLRVGDRLVSLDGEPVDGFTPATILEKLK-LVQGPVQITVTREQT 1331 (1332)
T ss_pred             ----------------------cccccccccccccccccceeeccCCccCCCCChHHHHHHhh-hccCchhheehhhhc


No 157
>TIGR03000 plancto_dom_1 Planctomycetes uncharacterized domain TIGR03000. Domains described by this model are found, so far, only in the Planctomycetes (Pirellula sp. strain 1 and Gemmata obscuriglobus), in up to six proteins per genome, and may be duplicated within a protein. The function is unknown.
Probab=27.96  E-value=1.3e+02  Score=24.28  Aligned_cols=48  Identities=27%  Similarity=0.386  Sum_probs=31.4

Q ss_pred             CCCEEEEECCEEeCCCCCCccccCcchhHHHHHhccCCCCE----EEEEEEECCEEEEEEE
Q 008087          372 PSDIILSFDGIDIANDGTVPFRHGERIGFSYLVSQKYTGDS----AAVKVLRDSKILNFNI  428 (578)
Q Consensus       372 ~GDiIl~InG~~V~~~~~v~~~~~~~~~~~~~~~~~~~g~~----v~l~V~R~g~~~~~~v  428 (578)
                      |-|-.+.+||++..+.+..+-         ..-..+..|..    +..++.|||+..+.+-
T Consensus        10 PadAkl~v~G~~t~~~G~~R~---------F~T~~L~~G~~y~Y~v~a~~~~dG~~~t~~~   61 (75)
T TIGR03000        10 PADAKLKVDGKETNGTGTVRT---------FTTPPLEAGKEYEYTVTAEYDRDGRILTRTR   61 (75)
T ss_pred             CCCCEEEECCeEcccCccEEE---------EECCCCCCCCEEEEEEEEEEecCCcEEEEEE
Confidence            468889999999999887531         11223445554    5566678998766543


No 158
>COG1582 FlgEa Uncharacterized protein, possibly involved in motility [Cell motility and secretion]
Probab=27.32  E-value=1.9e+02  Score=22.48  Aligned_cols=52  Identities=13%  Similarity=0.114  Sum_probs=39.5

Q ss_pred             EEEEeCCeecCCHHHHHHHHHhcCCCeEEEEEecCeEEEEechhhhHHHHHHHHH
Q 008087          511 KVLAFNGNPVKNLKSLANMVENCDDEFLKFDLEYDQVVVLRTKTSKAATLDILAT  565 (578)
Q Consensus       511 ~I~~VNG~~V~~~~~l~~~l~~~~~~~v~l~v~R~~~~~l~~~~~~~~~~~i~~~  565 (578)
                      .++..||++.-=-.++.+.++..|+..++|   -+|.-+...+.+++..++|..-
T Consensus         3 ~vtrlNG~~~~lN~~~IE~ie~~PDttItL---inGkkyvVkEsveEVi~kI~~y   54 (67)
T COG1582           3 KVTRLNGREFWLNAHHIETIEAFPDTTITL---INGKKYVVKESVEEVINKIIEY   54 (67)
T ss_pred             EEEEecCcceeeCHHHhhhhhccCCcEEEE---EcCcEEEEcccHHHHHHHHHHH
Confidence            467899999876688899999999887654   3566666678888888877653


No 159
>PF12381 Peptidase_C3G:  Tungro spherical virus-type peptidase;  InterPro: IPR024387 This entry represents a rice tungro spherical waikavirus-type peptidase that belongs to MEROPS peptidase family C3G. It is a picornain 3C-type protease, and is responsible for the self-cleavage of the positive single-stranded polyproteins of a number of plant viral genomes. The location of the protease activity of the polyprotein is at the C-terminal end, adjacent and N-terminal to the putative RNA polymerase [, ].
Probab=27.29  E-value=87  Score=30.53  Aligned_cols=54  Identities=26%  Similarity=0.441  Sum_probs=37.9

Q ss_pred             EEEEecCCCCCCccceEEccC----CeEEEEEeccccccccCCceeeecc--hhHHHHHHHHH
Q 008087          263 GLQIDAAINSGNSGGPAFNDK----GKCVGIAFQSLKHEDVENIGYVIPT--PVIMHFIQDYE  319 (578)
Q Consensus       263 ~i~~da~i~~G~SGGPlvn~~----G~vVGI~~~~~~~~~~~~~~~aiPi--~~i~~~l~~l~  319 (578)
                      .+........|+=|||++-.+    -++|||+.++.   ...+.+||-++  ..+++.+..|+
T Consensus       170 gleY~~~t~~GdCGs~i~~~~t~~~RKIvGiHVAG~---~~~~~gYAe~itQEDL~~A~~~l~  229 (231)
T PF12381_consen  170 GLEYQMPTMNGDCGSPIVRNNTQMVRKIVGIHVAGS---ANHAMGYAESITQEDLMRAINKLE  229 (231)
T ss_pred             eeeEECCCcCCCccceeeEcchhhhhhhheeeeccc---ccccceehhhhhHHHHHHHHHhhc
Confidence            355667788899999998432    58999998844   33467787665  46666666554


No 160
>PF11874 DUF3394:  Domain of unknown function (DUF3394);  InterPro: IPR021814  This domain is functionally uncharacterised. This domain is found in bacteria. This presumed domain is about 190 amino acids in length. This domain is found associated with PF06808 from PFAM. 
Probab=26.58  E-value=1.9e+02  Score=27.57  Aligned_cols=19  Identities=0%  Similarity=-0.178  Sum_probs=16.1

Q ss_pred             eecCCcCCCCCCCEEEEeC
Q 008087          498 WCLRSPLCLNCFNKVLAFN  516 (578)
Q Consensus       498 ~v~A~~aGl~~GD~I~~VN  516 (578)
                      +++|+++|++-++.|++|-
T Consensus       132 gS~A~~~g~d~d~~I~~v~  150 (183)
T PF11874_consen  132 GSPAEKAGIDFDWEITEVE  150 (183)
T ss_pred             CCHHHHcCCCCCcEEEEEE
Confidence            4589999999999888773


No 161
>smart00651 Sm snRNP Sm proteins. small nuclear ribonucleoprotein particles (snRNPs) involved in pre-mRNA splicing
Probab=25.86  E-value=1.7e+02  Score=22.33  Aligned_cols=33  Identities=15%  Similarity=0.250  Sum_probs=27.2

Q ss_pred             ceEEEEEcCCCcEEEEEEEEEcCCCCEEEEEEee
Q 008087          172 TQVKLKKRGSDTKYLATVLAIGTECDIAMLTVED  205 (578)
Q Consensus       172 ~~i~V~~~~~g~~~~a~vv~~d~~~DlAlLkv~~  205 (578)
                      ..+.|.+ .+|+.+.+.+..+|...++-|=....
T Consensus         9 ~~V~V~l-~~g~~~~G~L~~~D~~~NlvL~~~~e   41 (67)
T smart00651        9 KRVLVEL-KNGREYRGTLKGFDQFMNLVLEDVEE   41 (67)
T ss_pred             cEEEEEE-CCCcEEEEEEEEECccccEEEccEEE
Confidence            4678888 59999999999999999987765543


No 162
>PF12419 DUF3670:  SNF2 Helicase protein ;  InterPro: IPR022138  This domain family is found in bacteria, archaea and eukaryotes, and is approximately 140 amino acids in length. The family is found in association with PF00271 from PFAM, PF00176 from PFAM. Most of the proteins in this family are annotated as SNF2 helicases but there is little accompanying literature to confirm this. 
Probab=25.79  E-value=2.7e+02  Score=25.08  Aligned_cols=54  Identities=15%  Similarity=0.135  Sum_probs=45.0

Q ss_pred             CCCEEEEeCCeecCCHHHHHHHHHhcCCCeEEEEEecCeEEEEechhhhHHHHHHHHHc
Q 008087          508 CFNKVLAFNGNPVKNLKSLANMVENCDDEFLKFDLEYDQVVVLRTKTSKAATLDILATH  566 (578)
Q Consensus       508 ~GD~I~~VNG~~V~~~~~l~~~l~~~~~~~v~l~v~R~~~~~l~~~~~~~~~~~i~~~~  566 (578)
                      ..|+=++|+|+.+ +.++|.+++++. ...|+   .|+.-+.++.++++++...+.+..
T Consensus        72 ~f~W~lalGd~~L-s~eEf~~L~~~~-~~LV~---~rg~WV~ld~~~l~~~~~~~~~~~  125 (141)
T PF12419_consen   72 DFDWELALGDEEL-SEEEFEQLVEQK-RPLVR---FRGRWVELDPEELRRALAFLEKAP  125 (141)
T ss_pred             cceEEEEECCEEC-CHHHHHHHHHcC-CCeEE---ECCEEEEECHHHHHHHHHHHHhcc
Confidence            7788899999988 899999999863 34443   499999999999999999888753


No 163
>smart00384 AT_hook DNA binding domain with preference for A/T rich regions. Small DNA-binding motif first described in the high mobility group non-histone chromosomal protein HMG-I(Y).
Probab=25.45  E-value=50  Score=20.79  Aligned_cols=14  Identities=57%  Similarity=0.864  Sum_probs=11.3

Q ss_pred             ccccCCCCCCCCCc
Q 008087            5 KRKRGRKPKIPDAE   18 (578)
Q Consensus         5 ~~~~~~~~~~~~~~   18 (578)
                      ++||||-+|.....
T Consensus         1 kRkRGRPrK~~~~~   14 (26)
T smart00384        1 KRKRGRPRKAPKDX   14 (26)
T ss_pred             CCCCCCCCCCCCcc
Confidence            68999999987744


No 164
>PF11874 DUF3394:  Domain of unknown function (DUF3394);  InterPro: IPR021814  This domain is functionally uncharacterised. This domain is found in bacteria. This presumed domain is about 190 amino acids in length. This domain is found associated with PF06808 from PFAM. 
Probab=25.43  E-value=59  Score=30.95  Aligned_cols=29  Identities=14%  Similarity=0.081  Sum_probs=25.1

Q ss_pred             CCCcEEEEeCCCCcccC-CCCCCCEEEEEC
Q 008087          352 QKGVRIRRVDPTAPESE-VLKPSDIILSFD  380 (578)
Q Consensus       352 ~~Gv~V~~V~p~spA~~-GL~~GDiIl~In  380 (578)
                      ...+.|..|..+|||++ |+.-|+.|++|-
T Consensus       121 ~~~~~Vd~v~fgS~A~~~g~d~d~~I~~v~  150 (183)
T PF11874_consen  121 GGKVIVDEVEFGSPAEKAGIDFDWEITEVE  150 (183)
T ss_pred             CCEEEEEecCCCCHHHHcCCCCCcEEEEEE
Confidence            34588999999999999 999999888773


No 165
>cd01728 LSm1 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation.  Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm1 is one of at least seven subunits that assemble onto U6 snRNA to form a seven-membered ring structure.  Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=25.19  E-value=1.7e+02  Score=23.38  Aligned_cols=32  Identities=6%  Similarity=0.096  Sum_probs=26.7

Q ss_pred             ceEEEEEcCCCcEEEEEEEEEcCCCCEEEEEEe
Q 008087          172 TQVKLKKRGSDTKYLATVLAIGTECDIAMLTVE  204 (578)
Q Consensus       172 ~~i~V~~~~~g~~~~a~vv~~d~~~DlAlLkv~  204 (578)
                      ..+.|.+ .+|+.+.+.+.++|+..+|.|=...
T Consensus        13 k~v~V~l-~~gr~~~G~L~~fD~~~NlvL~d~~   44 (74)
T cd01728          13 KKVVVLL-RDGRKLIGILRSFDQFANLVLQDTV   44 (74)
T ss_pred             CEEEEEE-cCCeEEEEEEEEECCcccEEecceE
Confidence            4678888 5999999999999999999875543


No 166
>PF01423 LSM:  LSM domain ;  InterPro: IPR001163 This family is found in Lsm (like-Sm) proteins and in bacterial Lsm-related Hfq proteins. In each case, the domain adopts a core structure consisting of an open beta-barrel with an SH3-like topology. Lsm (like-Sm) proteins have diverse functions, and are thought to be important modulators of RNA biogenesis and function [, ]. The Sm proteins form part of specific small nuclear ribonucleoproteins (snRNPs) that are involved in the processing of pre-mRNAs to mature mRNAs, and are a major component of the eukaryotic spliceosome. Most snRNPs consist of seven Sm proteins (B/B', D1, D2, D3, E, F and G) arranged in a ring on a uridine-rich sequence (Sm site), plus a small nuclear RNA (snRNA) (either U1, U2, U5 or U4/6) []. All Sm proteins contain a common sequence motif in two segments, Sm1 and Sm2, separated by a short variable linker []. In other snRNPs, certain Sm proteins are replaced with different Lsm proteins, such as with U7 snRNPs, in which the D1 and D2 Sm proteins are replaced with U7-specific Lsm10 and Lsm11 proteins, where Lsm11 plays a role in histone U7-specific RNA processing []. Lsm proteins are also found in archaebacteria, which do not have any splicing apparatus suggesting a more general role for Lsm proteins. The pleiotropic translational regulator Hfq (host factor Q) is a bacterial Lsm-like protein, which modulates the structure of numerous RNA molecules by binding preferentially to A/U-rich sequences in RNA []. Hfq forms an Lsm-like fold, however, unlike the heptameric Sm proteins, Hfq forms a homo-hexameric ring.; PDB: 1D3B_K 2Y9D_D 2Y9A_D 2Y9C_R 3VRI_C 2Y9B_K 3QUI_D 3M4G_H 3INZ_E 1U1S_C ....
Probab=25.16  E-value=1.4e+02  Score=22.88  Aligned_cols=35  Identities=11%  Similarity=0.247  Sum_probs=29.3

Q ss_pred             CceEEEEEcCCCcEEEEEEEEEcCCCCEEEEEEeeC
Q 008087          171 YTQVKLKKRGSDTKYLATVLAIGTECDIAMLTVEDD  206 (578)
Q Consensus       171 ~~~i~V~~~~~g~~~~a~vv~~d~~~DlAlLkv~~~  206 (578)
                      ...+.|.+ .+|+.+.+.+..+|...++-|-.....
T Consensus         8 g~~V~V~l-~~g~~~~G~L~~~D~~~Nl~L~~~~~~   42 (67)
T PF01423_consen    8 GKRVRVEL-KNGRTYRGTLVSFDQFMNLVLSDVTET   42 (67)
T ss_dssp             TSEEEEEE-TTSEEEEEEEEEEETTEEEEEEEEEEE
T ss_pred             CcEEEEEE-eCCEEEEEEEEEeechheEEeeeEEEE
Confidence            35688888 599999999999999999988777653


No 167
>COG1958 LSM1 Small nuclear ribonucleoprotein (snRNP) homolog [Transcription]
Probab=24.34  E-value=1.5e+02  Score=23.77  Aligned_cols=33  Identities=18%  Similarity=0.310  Sum_probs=28.0

Q ss_pred             ceEEEEEcCCCcEEEEEEEEEcCCCCEEEEEEee
Q 008087          172 TQVKLKKRGSDTKYLATVLAIGTECDIAMLTVED  205 (578)
Q Consensus       172 ~~i~V~~~~~g~~~~a~vv~~d~~~DlAlLkv~~  205 (578)
                      ..+.|.+ .+|+.|.+++..+|...++.|--+..
T Consensus        18 ~~V~V~l-k~g~~~~G~L~~~D~~mNlvL~d~~e   50 (79)
T COG1958          18 KRVLVKL-KNGREYRGTLVGFDQYMNLVLDDVEE   50 (79)
T ss_pred             CEEEEEE-CCCCEEEEEEEEEccceeEEEeceEE
Confidence            5788888 58999999999999999998775554


No 168
>PF07174 FAP:  Fibronectin-attachment protein (FAP);  InterPro: IPR010801 This family contains bacterial fibronectin-attachment proteins (FAP). Family members are rich in alanine and proline, are approximately 300 long, and seem to be restricted to mycobacteria. These proteins contain a fibronectin-binding motif that allows mycobacteria to bind to fibronectin in the extracellular matrix [].; GO: 0050840 extracellular matrix binding, 0005576 extracellular region
Probab=24.10  E-value=6.4e+02  Score=25.57  Aligned_cols=17  Identities=12%  Similarity=0.311  Sum_probs=9.4

Q ss_pred             EEEEcCCEEEecccccC
Q 008087          153 GFAIGGRRVLTNAHSVE  169 (578)
Q Consensus       153 GfvI~~g~ILT~aHvV~  169 (578)
                      -|+|=.||+.+.+.-+.
T Consensus       120 S~vvP~GW~~Sda~~L~  136 (297)
T PF07174_consen  120 SYVVPAGWVESDASHLD  136 (297)
T ss_pred             EEeccCCccccccceee
Confidence            34444777766654433


No 169
>cd01727 LSm8 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation.  Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm8 is one of at least seven subunits that assemble onto U6 snRNA to form a seven-membered ring structure.  Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=24.06  E-value=3.3e+02  Score=21.53  Aligned_cols=32  Identities=6%  Similarity=0.066  Sum_probs=26.9

Q ss_pred             ceEEEEEcCCCcEEEEEEEEEcCCCCEEEEEEe
Q 008087          172 TQVKLKKRGSDTKYLATVLAIGTECDIAMLTVE  204 (578)
Q Consensus       172 ~~i~V~~~~~g~~~~a~vv~~d~~~DlAlLkv~  204 (578)
                      ..+.|.+ .+++.+.+++.++|...++.|=...
T Consensus        10 ~~V~V~l-~dgr~~~G~L~~~D~~~NlvL~~~~   41 (74)
T cd01727          10 KTVSVIT-VDGRVIVGTLKGFDQATNLILDDSH   41 (74)
T ss_pred             CEEEEEE-CCCcEEEEEEEEEccccCEEccceE
Confidence            4677888 6999999999999999998876653


No 170
>PF09465 LBR_tudor:  Lamin-B receptor of TUDOR domain;  InterPro: IPR019023  The Lamin-B receptor is a chromatin and lamin binding protein in the inner nuclear membrane. It is one of the integral inner nuclear envelope membrane proteins responsible for targeting nuclear membranes to chromatin, being a downstream effector of Ran, a small Ras-like nuclear GTPase which regulates NE assembly. Lamin-B receptor interacts with importin beta, a Ran-binding protein, thereby directly contributing to the fusion of membrane vesicles and the formation of the nuclear envelope []. ; PDB: 2L8D_A 2DIG_A.
Probab=23.74  E-value=3e+02  Score=20.79  Aligned_cols=38  Identities=24%  Similarity=0.211  Sum_probs=29.1

Q ss_pred             CCCceEEEEEcCCCcEEEEEEEEEcCCCCEEEEEEeeC
Q 008087          169 EHYTQVKLKKRGSDTKYLATVLAIGTECDIAMLTVEDD  206 (578)
Q Consensus       169 ~~~~~i~V~~~~~g~~~~a~vv~~d~~~DlAlLkv~~~  206 (578)
                      ..+..+.++-+.+...|+|+++.+|...+++-++++.-
T Consensus         7 ~~Ge~V~~rWP~s~lYYe~kV~~~d~~~~~y~V~Y~DG   44 (55)
T PF09465_consen    7 AIGEVVMVRWPGSSLYYEGKVLSYDSKSDRYTVLYEDG   44 (55)
T ss_dssp             -SS-EEEEE-TTTS-EEEEEEEEEETTTTEEEEEETTS
T ss_pred             cCCCEEEEECCCCCcEEEEEEEEecccCceEEEEEcCC
Confidence            34567888887777788999999999999999998764


No 171
>cd01723 LSm4 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation.  Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm4 is one of at least seven subunits that assemble onto U6 snRNA to form a seven-membered ring structure.  Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=20.85  E-value=2.5e+02  Score=22.42  Aligned_cols=33  Identities=9%  Similarity=0.079  Sum_probs=28.3

Q ss_pred             CceEEEEEcCCCcEEEEEEEEEcCCCCEEEEEEe
Q 008087          171 YTQVKLKKRGSDTKYLATVLAIGTECDIAMLTVE  204 (578)
Q Consensus       171 ~~~i~V~~~~~g~~~~a~vv~~d~~~DlAlLkv~  204 (578)
                      ...+.|.+ .+|..+.+++..+|...++.|-.+.
T Consensus        11 g~~V~VeL-kng~~~~G~L~~~D~~mNi~L~~~~   43 (76)
T cd01723          11 NHPMLVEL-KNGETYNGHLVNCDNWMNIHLREVI   43 (76)
T ss_pred             CCEEEEEE-CCCCEEEEEEEEEcCCCceEEEeEE
Confidence            45788888 5899999999999999999887654


Done!