Query         016641
Match_columns 385
No_of_seqs    367 out of 2614
Neff          8.1 
Searched_HMMs 46136
Date          Fri Mar 29 08:41:06 2013
Command       hhsearch -i /work/01045/syshi/csienesis_hhblits_a3m/016641.a3m -d /work/01045/syshi/HHdatabase/Cdd.hhm -o /work/01045/syshi/hhsearch_cdd/016641hhsearch_cdd -cpu 12 -v 0 

 No Hit                             Prob E-value P-value  Score    SS Cols Query HMM  Template HMM
  1 PRK10139 serine endoprotease;  100.0 5.2E-48 1.1E-52  388.2  32.2  271  104-383    41-333 (455)
  2 TIGR02038 protease_degS peripl 100.0 2.3E-46 4.9E-51  365.9  31.8  268  102-383    44-321 (351)
  3 PRK10898 serine endoprotease;  100.0 5.6E-46 1.2E-50  362.9  32.0  267  103-383    45-322 (353)
  4 PRK10942 serine endoprotease;  100.0   5E-45 1.1E-49  368.3  30.0  270  104-382    39-353 (473)
  5 TIGR02037 degP_htrA_DO peripla 100.0   3E-44 6.6E-49  360.7  32.1  271  104-383     2-300 (428)
  6 COG0265 DegQ Trypsin-like seri 100.0 2.9E-35 6.3E-40  288.2  27.0  272  103-384    33-314 (347)
  7 KOG1320 Serine protease [Postt  99.9 3.8E-24 8.2E-29  210.7  18.1  278  102-383   127-441 (473)
  8 KOG1320 Serine protease [Postt  99.9 1.8E-22 3.9E-27  198.9  10.7  261  108-373    55-319 (473)
  9 KOG1421 Predicted signaling-as  99.8 4.3E-20 9.3E-25  184.1  16.9  269  104-382    53-344 (955)
 10 PF13365 Trypsin_2:  Trypsin-li  99.6 2.5E-14 5.3E-19  117.7  12.0  107  142-279     1-120 (120)
 11 PF00089 Trypsin:  Trypsin;  In  99.5 3.6E-13 7.7E-18  122.0  16.0  165  139-303    24-220 (220)
 12 cd00190 Tryp_SPc Trypsin-like   99.5 1.3E-12 2.8E-17  119.2  16.8  166  139-304    24-230 (232)
 13 KOG1421 Predicted signaling-as  99.4 2.5E-12 5.5E-17  129.2  16.8  259  109-381   524-809 (955)
 14 smart00020 Tryp_SPc Trypsin-li  99.4 6.3E-12 1.4E-16  114.9  15.2  146  139-284    25-208 (229)
 15 PF13180 PDZ_2:  PDZ domain; PD  99.0   1E-09 2.2E-14   84.7   5.4   56  316-383     1-57  (82)
 16 cd00987 PDZ_serine_protease PD  98.9 3.1E-09 6.8E-14   83.0   7.0   65  316-382     1-66  (90)
 17 KOG3627 Trypsin [Amino acid tr  98.8 3.6E-07 7.9E-12   85.3  17.4  144  141-285    39-229 (256)
 18 COG3591 V8-like Glu-specific e  98.7   4E-07 8.7E-12   84.0  14.0  157  141-307    65-250 (251)
 19 cd00991 PDZ_archaeal_metallopr  98.5 1.8E-07 3.9E-12   71.6   5.8   44  340-383     9-53  (79)
 20 cd00990 PDZ_glycyl_aminopeptid  98.5 1.4E-07 3.1E-12   72.0   5.0   50  316-379     1-51  (80)
 21 cd00136 PDZ PDZ domain, also c  98.5 2.1E-07 4.6E-12   69.0   5.8   42  341-382    13-57  (70)
 22 TIGR02037 degP_htrA_DO peripla  98.5 2.1E-07 4.5E-12   94.0   7.3   67  315-382   337-404 (428)
 23 TIGR01713 typeII_sec_gspC gene  98.4 2.8E-07 6.2E-12   86.4   6.0   74  297-382   159-233 (259)
 24 cd00989 PDZ_metalloprotease PD  98.4 6.5E-07 1.4E-11   68.0   5.5   42  341-382    12-54  (79)
 25 cd00986 PDZ_LON_protease PDZ d  98.4 6.9E-07 1.5E-11   68.2   5.3   43  341-383     8-50  (79)
 26 PF00863 Peptidase_C4:  Peptida  98.3 3.3E-05 7.1E-10   70.8  15.9  143  149-305    40-195 (235)
 27 cd00992 PDZ_signaling PDZ doma  98.2 2.6E-06 5.7E-11   65.0   5.6   56  316-382    12-70  (82)
 28 cd00988 PDZ_CTP_protease PDZ d  98.2 2.8E-06   6E-11   65.5   4.9   42  341-382    13-57  (85)
 29 PF00595 PDZ:  PDZ domain (Also  97.9 1.2E-05 2.6E-10   61.6   4.4   53  316-378    10-63  (81)
 30 smart00228 PDZ Domain present   97.9 1.8E-05   4E-10   60.5   5.1   41  341-381    26-67  (85)
 31 COG5640 Secreted trypsin-like   97.9 0.00022 4.8E-09   68.2  13.0   50  259-308   223-279 (413)
 32 PF12812 PDZ_1:  PDZ-like domai  97.7 7.9E-05 1.7E-09   56.7   5.8   64  316-383     9-73  (78)
 33 TIGR00054 RIP metalloprotease   97.6 6.6E-05 1.4E-09   75.6   5.4   42  341-382   203-245 (420)
 34 PF03761 DUF316:  Domain of unk  97.6   0.004 8.6E-08   59.2  17.3  106  186-301   160-273 (282)
 35 PRK10779 zinc metallopeptidase  97.6 4.9E-05 1.1E-09   77.3   4.3   40  343-382   128-168 (449)
 36 PF05579 Peptidase_S32:  Equine  97.6 0.00062 1.3E-08   62.7  10.9  113  141-284   113-229 (297)
 37 PRK10139 serine endoprotease;   97.6 9.6E-05 2.1E-09   75.1   5.7   42  341-382   390-432 (455)
 38 PF00548 Peptidase_C3:  3C cyst  97.5  0.0018 3.9E-08   57.1  12.6  138  138-283    23-170 (172)
 39 TIGR00054 RIP metalloprotease   97.5 9.6E-05 2.1E-09   74.5   4.9   43  341-383   128-171 (420)
 40 PRK10942 serine endoprotease;   97.5 0.00011 2.4E-09   75.0   5.4   42  341-382   408-450 (473)
 41 PRK10779 zinc metallopeptidase  97.5 0.00015 3.2E-09   73.8   5.3   42  341-382   221-263 (449)
 42 KOG3553 Tax interaction protei  97.4 0.00017 3.7E-09   56.3   3.7   37  334-372    54-91  (124)
 43 TIGR00225 prc C-terminal pepti  97.4 0.00024 5.2E-09   69.4   5.2   41  341-381    62-105 (334)
 44 PLN00049 carboxyl-terminal pro  97.2 0.00047   1E-08   68.8   5.5   35  341-375   102-137 (389)
 45 PF14685 Tricorn_PDZ:  Tricorn   97.0 0.00094   2E-08   52.0   4.6   43  341-383    12-65  (88)
 46 TIGR02860 spore_IV_B stage IV   97.0   0.001 2.2E-08   65.8   5.3   42  341-382   105-155 (402)
 47 PF04495 GRASP55_65:  GRASP55/6  96.5  0.0032 6.9E-08   53.4   4.3   60  315-382    25-86  (138)
 48 COG0793 Prc Periplasmic protea  96.3  0.0051 1.1E-07   61.6   5.0   49  315-376    99-148 (406)
 49 PF09342 DUF1986:  Domain of un  96.2   0.029 6.2E-07   51.4   8.9   98  127-225    13-131 (267)
 50 PF08192 Peptidase_S64:  Peptid  96.0   0.054 1.2E-06   56.2  10.7  117  185-306   541-688 (695)
 51 KOG3532 Predicted protein kina  95.9  0.0079 1.7E-07   62.0   4.2   42  341-382   398-440 (1051)
 52 COG3480 SdrC Predicted secrete  95.8  0.0099 2.1E-07   56.3   4.2   43  341-383   130-172 (342)
 53 PRK11186 carboxy-terminal prot  95.7   0.015 3.2E-07   61.6   5.3   34  341-374   255-292 (667)
 54 PF10459 Peptidase_S46:  Peptid  94.9   0.039 8.4E-07   58.8   5.4   20  141-160    48-68  (698)
 55 PF05580 Peptidase_S55:  SpoIVB  94.9   0.022 4.8E-07   51.3   3.0   41  259-299   175-215 (218)
 56 KOG3129 26S proteasome regulat  94.9   0.031 6.8E-07   49.8   3.9   39  342-380   140-179 (231)
 57 COG3975 Predicted protease wit  94.8   0.029 6.3E-07   56.7   3.9   32  339-370   460-492 (558)
 58 PF02122 Peptidase_S39:  Peptid  94.2   0.045 9.8E-07   49.4   3.4  136  149-298    41-183 (203)
 59 PF00947 Pico_P2A:  Picornaviru  93.6    0.31 6.7E-06   40.2   6.9   42  242-283    68-109 (127)
 60 KOG3580 Tight junction protein  93.2   0.065 1.4E-06   54.7   2.9   44  340-383   428-472 (1027)
 61 KOG3209 WW domain-containing p  91.9    0.11 2.4E-06   54.2   2.6   32  345-376   782-815 (984)
 62 KOG3552 FERM domain protein FR  91.8    0.12 2.6E-06   55.4   2.8   34  342-375    76-109 (1298)
 63 PF00949 Peptidase_S7:  Peptida  91.6    0.31 6.7E-06   40.9   4.6   31  255-285    88-119 (132)
 64 PRK09681 putative type II secr  91.6    0.22 4.8E-06   47.0   4.1   42  341-382   204-249 (276)
 65 KOG3550 Receptor targeting pro  91.1    0.28 6.1E-06   41.5   3.8   35  341-375   115-151 (207)
 66 PF00944 Peptidase_S3:  Alphavi  90.4    0.31 6.7E-06   40.5   3.4   27  259-285   101-128 (158)
 67 KOG2921 Intramembrane metallop  89.9    0.29 6.4E-06   47.8   3.3   42  340-381   219-262 (484)
 68 TIGR02860 spore_IV_B stage IV   89.1    0.28   6E-06   48.9   2.6   41  259-299   355-395 (402)
 69 KOG3542 cAMP-regulated guanine  88.7    0.29 6.2E-06   50.9   2.4   41  337-377   558-599 (1283)
 70 PF10459 Peptidase_S46:  Peptid  88.2    0.26 5.6E-06   52.7   1.8   54  254-307   623-687 (698)
 71 COG3031 PulC Type II secretory  87.6    0.58 1.3E-05   42.8   3.4   44  341-384   207-251 (275)
 72 KOG3606 Cell polarity protein   87.2     0.5 1.1E-05   43.9   2.8   72  306-382   164-239 (358)
 73 KOG3651 Protein kinase C, alph  85.1     1.1 2.4E-05   42.4   4.0   37  342-378    31-69  (429)
 74 KOG3571 Dishevelled 3 and rela  84.2    0.77 1.7E-05   46.3   2.7   36  340-375   276-313 (626)
 75 PF02395 Peptidase_S6:  Immunog  84.2     5.1 0.00011   43.5   9.0   65  142-209    67-131 (769)
 76 COG0750 Predicted membrane-ass  83.3     1.7 3.6E-05   42.9   4.7   39  345-383   133-172 (375)
 77 PF05416 Peptidase_C37:  Southa  83.0      11 0.00025   37.5  10.0  137  140-285   379-528 (535)
 78 KOG1892 Actin filament-binding  82.8     1.2 2.6E-05   48.3   3.5   42  337-378   956-999 (1629)
 79 PF02907 Peptidase_S29:  Hepati  82.3     1.1 2.5E-05   37.3   2.6   23  262-284   106-129 (148)
 80 KOG3209 WW domain-containing p  82.2     1.1 2.3E-05   47.1   2.9   37  341-377   923-961 (984)
 81 PF03510 Peptidase_C24:  2C end  82.0     8.6 0.00019   30.9   7.3   53  144-210     3-55  (105)
 82 KOG3580 Tight junction protein  81.0     1.6 3.5E-05   45.0   3.6   37  341-377    40-76  (1027)
 83 KOG3605 Beta amyloid precursor  77.9     2.9 6.2E-05   43.6   4.3   75  293-373   708-789 (829)
 84 KOG3551 Syntrophins (type beta  73.2     1.5 3.2E-05   43.0   0.8   35  341-375   110-146 (506)
 85 KOG0609 Calcium/calmodulin-dep  71.9       3 6.5E-05   42.6   2.7   41  342-382   147-191 (542)
 86 KOG0606 Microtubule-associated  67.9     3.4 7.3E-05   45.8   2.2   32  344-375   661-693 (1205)
 87 cd01720 Sm_D2 The eukaryotic S  58.1      21 0.00045   27.6   4.5   37  158-195    10-46  (87)
 88 COG0260 PepB Leucyl aminopepti  56.1      11 0.00024   38.7   3.4   40  332-373   291-330 (485)
 89 PF01732 DUF31:  Putative pepti  51.1      11 0.00023   37.4   2.4   22  260-281   351-373 (374)
 90 cd00600 Sm_like The eukaryotic  48.9      52  0.0011   23.1   5.1   33  163-196     7-39  (63)
 91 KOG3549 Syntrophins (type gamm  48.7      12 0.00026   36.3   2.1   33  343-375    82-116 (505)
 92 cd01727 LSm8 The eukaryotic Sm  48.1      93   0.002   23.0   6.5   32  163-195    10-41  (74)
 93 PRK05015 aminopeptidase B; Pro  45.7      20 0.00044   36.0   3.3   39  333-373   230-268 (424)
 94 cd01728 LSm1 The eukaryotic Sm  45.7      93   0.002   23.2   6.1   32  163-195    13-44  (74)
 95 PRK00737 small nuclear ribonuc  44.7      59  0.0013   24.0   4.9   33  163-196    15-47  (72)
 96 cd01731 archaeal_Sm1 The archa  44.4      61  0.0013   23.5   4.9   33  163-196    11-43  (68)
 97 cd01726 LSm6 The eukaryotic Sm  44.2      56  0.0012   23.6   4.7   32  163-195    11-42  (67)
 98 cd01722 Sm_F The eukaryotic Sm  43.9      54  0.0012   23.9   4.6   32  163-195    12-43  (68)
 99 cd01730 LSm3 The eukaryotic Sm  43.2      49  0.0011   25.1   4.4   31  163-194    12-42  (82)
100 cd01729 LSm7 The eukaryotic Sm  41.7      66  0.0014   24.4   4.9   31  163-194    13-43  (81)
101 cd01732 LSm5 The eukaryotic Sm  41.6      59  0.0013   24.4   4.6   31  163-194    14-44  (76)
102 cd06168 LSm9 The eukaryotic Sm  41.1      72  0.0016   23.9   4.9   32  163-195    11-42  (75)
103 cd01717 Sm_B The eukaryotic Sm  40.8      65  0.0014   24.2   4.7   32  163-195    11-42  (79)
104 KOG3605 Beta amyloid precursor  40.1      11 0.00025   39.4   0.6   29  346-374   678-708 (829)
105 cd01719 Sm_G The eukaryotic Sm  39.0      81  0.0018   23.3   4.9   31  163-194    11-41  (72)
106 cd01735 LSm12_N LSm12 belongs   38.8 1.2E+02  0.0027   21.8   5.5   33  163-196     7-39  (61)
107 PF11874 DUF3394:  Domain of un  36.4      46   0.001   29.6   3.7   28  341-368   122-150 (183)
108 PF09465 LBR_tudor:  Lamin-B re  35.1 1.6E+02  0.0035   20.7   5.8   37  161-197     8-44  (55)
109 PRK00913 multifunctional amino  34.9      28 0.00061   35.8   2.4   31  343-373   301-331 (483)
110 smart00651 Sm snRNP Sm protein  34.3 1.1E+02  0.0023   21.8   4.9   33  163-196     9-41  (67)
111 PTZ00412 leucyl aminopeptidase  33.9      34 0.00073   35.7   2.8   40  332-373   337-376 (569)
112 PF12381 Peptidase_C3G:  Tungro  33.8      32  0.0007   31.3   2.3   53  254-307   170-229 (231)
113 cd01721 Sm_D3 The eukaryotic S  33.6 1.1E+02  0.0024   22.4   4.9   32  163-195    11-42  (70)
114 PF00883 Peptidase_M17:  Cytoso  33.0      22 0.00048   34.3   1.3   30  344-373   133-162 (311)
115 COG1958 LSM1 Small nuclear rib  32.3      95  0.0021   23.2   4.5   33  163-196    18-50  (79)
116 cd00433 Peptidase_M17 Cytosol   31.9      33  0.0007   35.2   2.3   31  343-373   287-317 (468)
117 PF01423 LSM:  LSM domain ;  In  31.3 1.3E+02  0.0028   21.3   4.9   35  162-197     8-42  (67)
118 COG0298 HypC Hydrogenase matur  29.6 1.2E+02  0.0027   23.0   4.4   47  176-224     5-52  (82)
119 KOG3938 RGS-GAIP interacting p  27.9      21 0.00046   33.4   0.2   38  343-380   151-190 (334)
120 KOG2597 Predicted aminopeptida  26.9      67  0.0015   33.1   3.5   43  329-373   310-352 (513)
121 PF11730 DUF3297:  Protein of u  25.2      44 0.00096   24.4   1.4   32  347-378     5-37  (71)
122 cd01723 LSm4 The eukaryotic Sm  24.1 2.1E+02  0.0046   21.2   5.0   33  162-195    11-43  (76)
123 KOG1738 Membrane-associated gu  23.1      53  0.0012   34.5   2.0   31  343-373   227-259 (638)
124 KOG3834 Golgi reassembly stack  22.9      69  0.0015   32.2   2.7   39  344-382   112-152 (462)

No 1  
>PRK10139 serine endoprotease; Provisional
Probab=100.00  E-value=5.2e-48  Score=388.20  Aligned_cols=271  Identities=23%  Similarity=0.367  Sum_probs=234.0

Q ss_pred             cHHHHHHHhCCCeEEEEeeecCCC----------CC---CCCCCCCCCcceEEEEEec--CCEEEecccccCCCceEEEE
Q 016641          104 NAYAAIELALDSVVKIFTVSSSPN----------YG---LPWQNKSQRETTGSGFVIP--GKKILTNAHVVADSTFVLVR  168 (385)
Q Consensus       104 ~~~~~~~~~~~SVV~I~~~~~~~~----------~~---~p~~~~~~~~~~GSGfiI~--~g~ILT~aHvv~~~~~i~V~  168 (385)
                      ++.++++++.||||.|.+......          ++   .||+......+.||||+|+  +||||||+|||+++..+.|+
T Consensus        41 ~~~~~~~~~~pavV~i~~~~~~~~~~~~~~~~~~~f~~~~~~~~~~~~~~~GSG~ii~~~~g~IlTn~HVv~~a~~i~V~  120 (455)
T PRK10139         41 SLAPMLEKVLPAVVSVRVEGTASQGQKIPEEFKKFFGDDLPDQPAQPFEGLGSGVIIDAAKGYVLTNNHVINQAQKISIQ  120 (455)
T ss_pred             cHHHHHHHhCCcEEEEEEEEeecccccCchhHHHhccccCCccccccccceEEEEEEECCCCEEEeChHHhCCCCEEEEE
Confidence            589999999999999998653210          11   1333333446789999997  58999999999999999999


Q ss_pred             EcCCCcEEEEEEEEecCCCCeEEEEecCCcccccceeeecCCcc--cCCCeEEEEecCCCCCCceEEEeeEeeccccccc
Q 016641          169 KHGSPTKYRAQVEAVGHECDLAILIVESDEFWEGMHFLELGDIP--FLQQAVAVVGYPQGGDNISVTKGVVSRVEPTQYV  246 (385)
Q Consensus       169 ~~~~g~~~~a~v~~~d~~~DlAlLkv~~~~~~~~~~~l~l~~~~--~~G~~V~~iG~p~~~~~~~~~~G~Vs~~~~~~~~  246 (385)
                      +. |+++++|++++.|+.+||||||++...   .+++++|+++.  ++||+|+++|+|++.. .+++.|+|++..+....
T Consensus       121 ~~-dg~~~~a~vvg~D~~~DlAvlkv~~~~---~l~~~~lg~s~~~~~G~~V~aiG~P~g~~-~tvt~GivS~~~r~~~~  195 (455)
T PRK10139        121 LN-DGREFDAKLIGSDDQSDIALLQIQNPS---KLTQIAIADSDKLRVGDFAVAVGNPFGLG-QTATSGIISALGRSGLN  195 (455)
T ss_pred             EC-CCCEEEEEEEEEcCCCCEEEEEecCCC---CCceeEecCccccCCCCEEEEEecCCCCC-CceEEEEEccccccccC
Confidence            97 899999999999999999999998643   58899999876  5699999999999987 59999999998775322


Q ss_pred             CCCceeeEEEecccCCCCCCCccee-eCCEEEEEEeeecC---CCCceEEEEecchHHHHHHHHHHcCeeeeeeccCccc
Q 016641          247 HGATQLMAIQIDAAINPGNSGGPAI-MGNKVAGVAFQNLS---GAENIGYIIPVPVIKHFITGVVEHGKYVGFCSLGLSC  322 (385)
Q Consensus       247 ~~~~~~~~i~~d~~i~~G~SGGPL~-~~G~vVGI~s~~~~---~~~~~~~aip~~~i~~~l~~l~~~g~~~~~~~lGi~~  322 (385)
                      . .....+||+|+++++|||||||+ .+|+||||+++...   +..+++||||++.+++++++|+++|++. ++|||+.+
T Consensus       196 ~-~~~~~~iqtda~in~GnSGGpl~n~~G~vIGi~~~~~~~~~~~~gigfaIP~~~~~~v~~~l~~~g~v~-r~~LGv~~  273 (455)
T PRK10139        196 L-EGLENFIQTDASINRGNSGGALLNLNGELIGINTAILAPGGGSVGIGFAIPSNMARTLAQQLIDFGEIK-RGLLGIKG  273 (455)
T ss_pred             C-CCcceEEEECCccCCCCCcceEECCCCeEEEEEEEEEcCCCCccceEEEEEhHHHHHHHHHHhhcCccc-ccceeEEE
Confidence            2 22346899999999999999999 99999999998764   3467999999999999999999999998 89999999


Q ss_pred             cccccHHHHhhcCCCCccCceEEEeeCCCCHHhhh-cCCCCEEEEECCEEcCChhhHHhhhc
Q 016641          323 QTTENVQLRNNFGMRSEVTGVLVNKINPLSDAHEI-LKKDDIILAFDGVPIANDGTGSHSML  383 (385)
Q Consensus       323 ~~~~~~~~~~~~g~~~~~~Gv~V~~V~~~spA~~a-L~~GDiI~~vng~~i~~~~~l~~~l~  383 (385)
                      +++ +++.++.+|++. ..|++|.+|.++|||+++ ||+||+|++|||++|.+..|+.+.|.
T Consensus       274 ~~l-~~~~~~~lgl~~-~~Gv~V~~V~~~SpA~~AGL~~GDvIl~InG~~V~s~~dl~~~l~  333 (455)
T PRK10139        274 TEM-SADIAKAFNLDV-QRGAFVSEVLPNSGSAKAGVKAGDIITSLNGKPLNSFAELRSRIA  333 (455)
T ss_pred             EEC-CHHHHHhcCCCC-CCceEEEEECCCChHHHCCCCCCCEEEEECCEECCCHHHHHHHHH
Confidence            999 788999999873 579999999999999999 99999999999999999999988764


No 2  
>TIGR02038 protease_degS periplasmic serine peptidase DegS. This family consists of the periplasmic serine protease DegS (HhoB), a shorter paralog of protease DO (HtrA, DegP) and DegQ (HhoA). It is found in E. coli and several other Proteobacteria of the gamma subdivision. It contains a trypsin domain and a single copy of the PDZ domain (in contrast to DegP with two copies). A critical role of DegS is to sense stress in the periplasm and partially degrade an inhibitor of sigma(E).
Probab=100.00  E-value=2.3e-46  Score=365.89  Aligned_cols=268  Identities=24%  Similarity=0.397  Sum_probs=230.4

Q ss_pred             CCcHHHHHHHhCCCeEEEEeeecCCCCCCCCCCCCCCcceEEEEEec-CCEEEecccccCCCceEEEEEcCCCcEEEEEE
Q 016641          102 TTNAYAAIELALDSVVKIFTVSSSPNYGLPWQNKSQRETTGSGFVIP-GKKILTNAHVVADSTFVLVRKHGSPTKYRAQV  180 (385)
Q Consensus       102 ~~~~~~~~~~~~~SVV~I~~~~~~~~~~~p~~~~~~~~~~GSGfiI~-~g~ILT~aHvv~~~~~i~V~~~~~g~~~~a~v  180 (385)
                      ...+.++++++.||||+|.+.....+.   + ......+.||||+|+ +||||||+|||.+++.+.|++. ||+.++|++
T Consensus        44 ~~~~~~~~~~~~psVV~I~~~~~~~~~---~-~~~~~~~~GSG~vi~~~G~IlTn~HVV~~~~~i~V~~~-dg~~~~a~v  118 (351)
T TIGR02038        44 EISFNKAVRRAAPAVVNIYNRSISQNS---L-NQLSIQGLGSGVIMSKEGYILTNYHVIKKADQIVVALQ-DGRKFEAEL  118 (351)
T ss_pred             chhHHHHHHhcCCcEEEEEeEeccccc---c-ccccccceEEEEEEeCCeEEEecccEeCCCCEEEEEEC-CCCEEEEEE
Confidence            346889999999999999986543321   1 122345689999998 7899999999999999999997 889999999


Q ss_pred             EEecCCCCeEEEEecCCcccccceeeecCCcc--cCCCeEEEEecCCCCCCceEEEeeEeecccccccCCCceeeEEEec
Q 016641          181 EAVGHECDLAILIVESDEFWEGMHFLELGDIP--FLQQAVAVVGYPQGGDNISVTKGVVSRVEPTQYVHGATQLMAIQID  258 (385)
Q Consensus       181 ~~~d~~~DlAlLkv~~~~~~~~~~~l~l~~~~--~~G~~V~~iG~p~~~~~~~~~~G~Vs~~~~~~~~~~~~~~~~i~~d  258 (385)
                      ++.|+.+||||||++...    +++++++++.  ++||+|+++|||++.. .+++.|+|+...+..... .....++++|
T Consensus       119 v~~d~~~DlAvlkv~~~~----~~~~~l~~s~~~~~G~~V~aiG~P~~~~-~s~t~GiIs~~~r~~~~~-~~~~~~iqtd  192 (351)
T TIGR02038       119 VGSDPLTDLAVLKIEGDN----LPTIPVNLDRPPHVGDVVLAIGNPYNLG-QTITQGIISATGRNGLSS-VGRQNFIQTD  192 (351)
T ss_pred             EEecCCCCEEEEEecCCC----CceEeccCcCccCCCCEEEEEeCCCCCC-CcEEEEEEEeccCcccCC-CCcceEEEEC
Confidence            999999999999999764    6788887654  6799999999999877 589999999987754322 2234689999


Q ss_pred             ccCCCCCCCccee-eCCEEEEEEeeecC-----CCCceEEEEecchHHHHHHHHHHcCeeeeeeccCccccccccHHHHh
Q 016641          259 AAINPGNSGGPAI-MGNKVAGVAFQNLS-----GAENIGYIIPVPVIKHFITGVVEHGKYVGFCSLGLSCQTTENVQLRN  332 (385)
Q Consensus       259 ~~i~~G~SGGPL~-~~G~vVGI~s~~~~-----~~~~~~~aip~~~i~~~l~~l~~~g~~~~~~~lGi~~~~~~~~~~~~  332 (385)
                      +.+++|||||||+ .+|+||||+++.+.     ...+++|+||++.+++++++++++|++. ++|||+.++++ ++..++
T Consensus       193 a~i~~GnSGGpl~n~~G~vIGI~~~~~~~~~~~~~~g~~faIP~~~~~~vl~~l~~~g~~~-r~~lGv~~~~~-~~~~~~  270 (351)
T TIGR02038       193 AAINAGNSGGALINTNGELVGINTASFQKGGDEGGEGINFAIPIKLAHKIMGKIIRDGRVI-RGYIGVSGEDI-NSVVAQ  270 (351)
T ss_pred             CccCCCCCcceEECCCCeEEEEEeeeecccCCCCccceEEEecHHHHHHHHHHHhhcCccc-ceEeeeEEEEC-CHHHHH
Confidence            9999999999999 99999999987543     1257899999999999999999999987 89999999998 678888


Q ss_pred             hcCCCCccCceEEEeeCCCCHHhhh-cCCCCEEEEECCEEcCChhhHHhhhc
Q 016641          333 NFGMRSEVTGVLVNKINPLSDAHEI-LKKDDIILAFDGVPIANDGTGSHSML  383 (385)
Q Consensus       333 ~~g~~~~~~Gv~V~~V~~~spA~~a-L~~GDiI~~vng~~i~~~~~l~~~l~  383 (385)
                      .+|++. ..|++|.+|.++|||+++ ||+||+|++|||++|.+..||.+.|.
T Consensus       271 ~lgl~~-~~Gv~V~~V~~~spA~~aGL~~GDvI~~Ing~~V~s~~dl~~~l~  321 (351)
T TIGR02038       271 GLGLPD-LRGIVITGVDPNGPAARAGILVRDVILKYDGKDVIGAEELMDRIA  321 (351)
T ss_pred             hcCCCc-cccceEeecCCCChHHHCCCCCCCEEEEECCEEcCCHHHHHHHHH
Confidence            899974 479999999999999999 99999999999999999999987763


No 3  
>PRK10898 serine endoprotease; Provisional
Probab=100.00  E-value=5.6e-46  Score=362.95  Aligned_cols=267  Identities=21%  Similarity=0.357  Sum_probs=226.9

Q ss_pred             CcHHHHHHHhCCCeEEEEeeecCCCCCCCCCCCCCCcceEEEEEec-CCEEEecccccCCCceEEEEEcCCCcEEEEEEE
Q 016641          103 TNAYAAIELALDSVVKIFTVSSSPNYGLPWQNKSQRETTGSGFVIP-GKKILTNAHVVADSTFVLVRKHGSPTKYRAQVE  181 (385)
Q Consensus       103 ~~~~~~~~~~~~SVV~I~~~~~~~~~~~p~~~~~~~~~~GSGfiI~-~g~ILT~aHvv~~~~~i~V~~~~~g~~~~a~v~  181 (385)
                      ..+.++++++.+|||.|.+.......    .......+.||||+|+ +||||||+|||+++..+.|++. |++.++|+++
T Consensus        45 ~~~~~~~~~~~psvV~v~~~~~~~~~----~~~~~~~~~GSGfvi~~~G~IlTn~HVv~~a~~i~V~~~-dg~~~~a~vv  119 (353)
T PRK10898         45 ASYNQAVRRAAPAVVNVYNRSLNSTS----HNQLEIRTLGSGVIMDQRGYILTNKHVINDADQIIVALQ-DGRVFEALLV  119 (353)
T ss_pred             chHHHHHHHhCCcEEEEEeEeccccC----cccccccceeeEEEEeCCeEEEecccEeCCCCEEEEEeC-CCCEEEEEEE
Confidence            46889999999999999986543211    1123345789999998 7899999999999999999997 8899999999


Q ss_pred             EecCCCCeEEEEecCCcccccceeeecCCcc--cCCCeEEEEecCCCCCCceEEEeeEeecccccccCCCceeeEEEecc
Q 016641          182 AVGHECDLAILIVESDEFWEGMHFLELGDIP--FLQQAVAVVGYPQGGDNISVTKGVVSRVEPTQYVHGATQLMAIQIDA  259 (385)
Q Consensus       182 ~~d~~~DlAlLkv~~~~~~~~~~~l~l~~~~--~~G~~V~~iG~p~~~~~~~~~~G~Vs~~~~~~~~~~~~~~~~i~~d~  259 (385)
                      +.|+.+||||||++...    +++++++++.  ++||+|+++|||++.. .+++.|+|+...+...... ....++|+|+
T Consensus       120 ~~d~~~DlAvl~v~~~~----l~~~~l~~~~~~~~G~~V~aiG~P~g~~-~~~t~Giis~~~r~~~~~~-~~~~~iqtda  193 (353)
T PRK10898        120 GSDSLTDLAVLKINATN----LPVIPINPKRVPHIGDVVLAIGNPYNLG-QTITQGIISATGRIGLSPT-GRQNFLQTDA  193 (353)
T ss_pred             EEcCCCCEEEEEEcCCC----CCeeeccCcCcCCCCCEEEEEeCCCCcC-CCcceeEEEeccccccCCc-cccceEEecc
Confidence            99999999999998753    7788888654  6799999999999876 5899999998876533222 2236799999


Q ss_pred             cCCCCCCCccee-eCCEEEEEEeeecCC------CCceEEEEecchHHHHHHHHHHcCeeeeeeccCccccccccHHHHh
Q 016641          260 AINPGNSGGPAI-MGNKVAGVAFQNLSG------AENIGYIIPVPVIKHFITGVVEHGKYVGFCSLGLSCQTTENVQLRN  332 (385)
Q Consensus       260 ~i~~G~SGGPL~-~~G~vVGI~s~~~~~------~~~~~~aip~~~i~~~l~~l~~~g~~~~~~~lGi~~~~~~~~~~~~  332 (385)
                      .+++|||||||+ .+|+||||+++.+..      ..+++||||++.+++++++++++|++. ++|||+.++++ ++..+.
T Consensus       194 ~i~~GnSGGPl~n~~G~vvGI~~~~~~~~~~~~~~~g~~faIP~~~~~~~~~~l~~~G~~~-~~~lGi~~~~~-~~~~~~  271 (353)
T PRK10898        194 SINHGNSGGALVNSLGELMGINTLSFDKSNDGETPEGIGFAIPTQLATKIMDKLIRDGRVI-RGYIGIGGREI-APLHAQ  271 (353)
T ss_pred             ccCCCCCcceEECCCCeEEEEEEEEecccCCCCcccceEEEEchHHHHHHHHHHhhcCccc-ccccceEEEEC-CHHHHH
Confidence            999999999999 999999999976542      257899999999999999999999998 89999999988 555666


Q ss_pred             hcCCCCccCceEEEeeCCCCHHhhh-cCCCCEEEEECCEEcCChhhHHhhhc
Q 016641          333 NFGMRSEVTGVLVNKINPLSDAHEI-LKKDDIILAFDGVPIANDGTGSHSML  383 (385)
Q Consensus       333 ~~g~~~~~~Gv~V~~V~~~spA~~a-L~~GDiI~~vng~~i~~~~~l~~~l~  383 (385)
                      .++++. ..|++|.+|.++|||+++ ||+||+|++|||++|.+..++.+.|.
T Consensus       272 ~~~~~~-~~Gv~V~~V~~~spA~~aGL~~GDvI~~Ing~~V~s~~~l~~~l~  322 (353)
T PRK10898        272 GGGIDQ-LQGIVVNEVSPDGPAAKAGIQVNDLIISVNNKPAISALETMDQVA  322 (353)
T ss_pred             hcCCCC-CCeEEEEEECCCChHHHcCCCCCCEEEEECCEEcCCHHHHHHHHH
Confidence            677753 489999999999999999 99999999999999999999887663


No 4  
>PRK10942 serine endoprotease; Provisional
Probab=100.00  E-value=5e-45  Score=368.25  Aligned_cols=270  Identities=26%  Similarity=0.391  Sum_probs=229.9

Q ss_pred             cHHHHHHHhCCCeEEEEeeecCCC--------C--CC----CCC----------------------CCCCCcceEEEEEe
Q 016641          104 NAYAAIELALDSVVKIFTVSSSPN--------Y--GL----PWQ----------------------NKSQRETTGSGFVI  147 (385)
Q Consensus       104 ~~~~~~~~~~~SVV~I~~~~~~~~--------~--~~----p~~----------------------~~~~~~~~GSGfiI  147 (385)
                      ++.++++++.|+||.|.+......        +  +|    |+.                      ......+.||||||
T Consensus        39 ~~~~~~~~~~pavv~i~~~~~~~~~~~~~~~~~~~ff~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~GSG~ii  118 (473)
T PRK10942         39 SLAPMLEKVMPSVVSINVEGSTTVNTPRMPRQFQQFFGDNSPFCQEGSPFQSSPFCQGGQGGNGGGQQQKFMALGSGVII  118 (473)
T ss_pred             cHHHHHHHhCCceEEEEEEEeccccCCCCChhHHHhhcccccccccccccccccccccccccccccccccccceEEEEEE
Confidence            589999999999999998653211        0  11    110                      00112467999999


Q ss_pred             c--CCEEEecccccCCCceEEEEEcCCCcEEEEEEEEecCCCCeEEEEecCCcccccceeeecCCcc--cCCCeEEEEec
Q 016641          148 P--GKKILTNAHVVADSTFVLVRKHGSPTKYRAQVEAVGHECDLAILIVESDEFWEGMHFLELGDIP--FLQQAVAVVGY  223 (385)
Q Consensus       148 ~--~g~ILT~aHvv~~~~~i~V~~~~~g~~~~a~v~~~d~~~DlAlLkv~~~~~~~~~~~l~l~~~~--~~G~~V~~iG~  223 (385)
                      +  +||||||+|||.+++.++|++. |+++++|++++.|+.+||||||++...   .+++++|+++.  ++|++|+++|+
T Consensus       119 ~~~~G~IlTn~HVv~~a~~i~V~~~-dg~~~~a~vv~~D~~~DlAvlki~~~~---~l~~~~lg~s~~l~~G~~V~aiG~  194 (473)
T PRK10942        119 DADKGYVVTNNHVVDNATKIKVQLS-DGRKFDAKVVGKDPRSDIALIQLQNPK---NLTAIKMADSDALRVGDYTVAIGN  194 (473)
T ss_pred             ECCCCEEEeChhhcCCCCEEEEEEC-CCCEEEEEEEEecCCCCEEEEEecCCC---CCceeEecCccccCCCCEEEEEcC
Confidence            8  4899999999999999999997 899999999999999999999997543   58899999765  56999999999


Q ss_pred             CCCCCCceEEEeeEeecccccccCCCceeeEEEecccCCCCCCCccee-eCCEEEEEEeeecC---CCCceEEEEecchH
Q 016641          224 PQGGDNISVTKGVVSRVEPTQYVHGATQLMAIQIDAAINPGNSGGPAI-MGNKVAGVAFQNLS---GAENIGYIIPVPVI  299 (385)
Q Consensus       224 p~~~~~~~~~~G~Vs~~~~~~~~~~~~~~~~i~~d~~i~~G~SGGPL~-~~G~vVGI~s~~~~---~~~~~~~aip~~~i  299 (385)
                      |++.. .+++.|+|+...+..... ..+..+|++|+.+++|||||||+ .+|+||||+++...   +..+++|+||++.+
T Consensus       195 P~g~~-~tvt~GiVs~~~r~~~~~-~~~~~~iqtda~i~~GnSGGpL~n~~GeviGI~t~~~~~~g~~~g~gfaIP~~~~  272 (473)
T PRK10942        195 PYGLG-ETVTSGIVSALGRSGLNV-ENYENFIQTDAAINRGNSGGALVNLNGELIGINTAILAPDGGNIGIGFAIPSNMV  272 (473)
T ss_pred             CCCCC-cceeEEEEEEeecccCCc-ccccceEEeccccCCCCCcCccCCCCCeEEEEEEEEEcCCCCcccEEEEEEHHHH
Confidence            99887 589999999987642211 12346899999999999999999 99999999998654   33568999999999


Q ss_pred             HHHHHHHHHcCeeeeeeccCccccccccHHHHhhcCCCCccCceEEEeeCCCCHHhhh-cCCCCEEEEECCEEcCChhhH
Q 016641          300 KHFITGVVEHGKYVGFCSLGLSCQTTENVQLRNNFGMRSEVTGVLVNKINPLSDAHEI-LKKDDIILAFDGVPIANDGTG  378 (385)
Q Consensus       300 ~~~l~~l~~~g~~~~~~~lGi~~~~~~~~~~~~~~g~~~~~~Gv~V~~V~~~spA~~a-L~~GDiI~~vng~~i~~~~~l  378 (385)
                      ++++++|.++|++. ++|||+.++++ ++++++.++++. ..|++|.+|.++|||+++ ||+||+|++|||++|.+..+|
T Consensus       273 ~~v~~~l~~~g~v~-rg~lGv~~~~l-~~~~a~~~~l~~-~~GvlV~~V~~~SpA~~AGL~~GDvIl~InG~~V~s~~dl  349 (473)
T PRK10942        273 KNLTSQMVEYGQVK-RGELGIMGTEL-NSELAKAMKVDA-QRGAFVSQVLPNSSAAKAGIKAGDVITSLNGKPISSFAAL  349 (473)
T ss_pred             HHHHHHHHhccccc-cceeeeEeeec-CHHHHHhcCCCC-CCceEEEEECCCChHHHcCCCCCCEEEEECCEECCCHHHH
Confidence            99999999999998 89999999999 788999999974 589999999999999999 999999999999999999999


Q ss_pred             Hhhh
Q 016641          379 SHSM  382 (385)
Q Consensus       379 ~~~l  382 (385)
                      .+.|
T Consensus       350 ~~~l  353 (473)
T PRK10942        350 RAQV  353 (473)
T ss_pred             HHHH
Confidence            8876


No 5  
>TIGR02037 degP_htrA_DO periplasmic serine protease, Do/DegQ family. This family consists of a set of proteins variously designated DegP, heat shock protein HtrA, and protease DO. The ortholog in Pseudomonas aeruginosa is designated MucD and is found in an operon that controls mucoid phenotype. This family also includes the DegQ (HhoA) paralog in E. coli, which can rescue a DegP mutant, but not the smaller DegS paralog, which cannot. Members of this family are located in the periplasm and have separable functions as both protease and chaperone. Members have a trypsin domain and two copies of a PDZ domain. This protein protects bacteria from thermal and other stresses and may be important for the survival of bacterial pathogens.// The chaperone function is dominant at low temperatures, whereas the proteolytic activity is turned on at elevated temperatures.
Probab=100.00  E-value=3e-44  Score=360.71  Aligned_cols=271  Identities=29%  Similarity=0.409  Sum_probs=230.9

Q ss_pred             cHHHHHHHhCCCeEEEEeeecCCC-------------CCC---CCC----CCCCCcceEEEEEec-CCEEEecccccCCC
Q 016641          104 NAYAAIELALDSVVKIFTVSSSPN-------------YGL---PWQ----NKSQRETTGSGFVIP-GKKILTNAHVVADS  162 (385)
Q Consensus       104 ~~~~~~~~~~~SVV~I~~~~~~~~-------------~~~---p~~----~~~~~~~~GSGfiI~-~g~ILT~aHvv~~~  162 (385)
                      ++.++++++.||||.|.+......             ++.   |..    ......+.||||+|+ +||||||+||+.++
T Consensus         2 ~~~~~~~~~~p~vv~i~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~GSGfii~~~G~IlTn~Hvv~~~   81 (428)
T TIGR02037         2 SFAPLVEKVAPAVVNISVEGTVKRRNRPPALPPFFRQFFGDDMPNFPRQQRERKVRGLGSGVIISADGYILTNNHVVDGA   81 (428)
T ss_pred             cHHHHHHHhCCceEEEEEEEEecccCCCcccchhHHHhhcccccCcccccccccccceeeEEEECCCCEEEEcHHHcCCC
Confidence            367899999999999998652211             110   000    012235689999999 78999999999999


Q ss_pred             ceEEEEEcCCCcEEEEEEEEecCCCCeEEEEecCCcccccceeeecCCcc--cCCCeEEEEecCCCCCCceEEEeeEeec
Q 016641          163 TFVLVRKHGSPTKYRAQVEAVGHECDLAILIVESDEFWEGMHFLELGDIP--FLQQAVAVVGYPQGGDNISVTKGVVSRV  240 (385)
Q Consensus       163 ~~i~V~~~~~g~~~~a~v~~~d~~~DlAlLkv~~~~~~~~~~~l~l~~~~--~~G~~V~~iG~p~~~~~~~~~~G~Vs~~  240 (385)
                      ..+.|++. +++.++|++++.|+.+||||||++...   .+++++|+++.  +.|++|+++|||++.. .+++.|+|+..
T Consensus        82 ~~i~V~~~-~~~~~~a~vv~~d~~~DlAllkv~~~~---~~~~~~l~~~~~~~~G~~v~aiG~p~g~~-~~~t~G~vs~~  156 (428)
T TIGR02037        82 DEITVTLS-DGREFKAKLVGKDPRTDIAVLKIDAKK---NLPVIKLGDSDKLRVGDWVLAIGNPFGLG-QTVTSGIVSAL  156 (428)
T ss_pred             CeEEEEeC-CCCEEEEEEEEecCCCCEEEEEecCCC---CceEEEccCCCCCCCCCEEEEEECCCcCC-CcEEEEEEEec
Confidence            99999997 899999999999999999999998752   58999998754  6699999999999987 58999999988


Q ss_pred             ccccccCCCceeeEEEecccCCCCCCCccee-eCCEEEEEEeeecC---CCCceEEEEecchHHHHHHHHHHcCeeeeee
Q 016641          241 EPTQYVHGATQLMAIQIDAAINPGNSGGPAI-MGNKVAGVAFQNLS---GAENIGYIIPVPVIKHFITGVVEHGKYVGFC  316 (385)
Q Consensus       241 ~~~~~~~~~~~~~~i~~d~~i~~G~SGGPL~-~~G~vVGI~s~~~~---~~~~~~~aip~~~i~~~l~~l~~~g~~~~~~  316 (385)
                      .+... .......++++|+.+++|+|||||+ .+|+||||+++...   +..+++||||++.+++++++|+++|++. ++
T Consensus       157 ~~~~~-~~~~~~~~i~tda~i~~GnSGGpl~n~~G~viGI~~~~~~~~g~~~g~~faiP~~~~~~~~~~l~~~g~~~-~~  234 (428)
T TIGR02037       157 GRSGL-GIGDYENFIQTDAAINPGNSGGPLVNLRGEVIGINTAIYSPSGGNVGIGFAIPSNMAKNVVDQLIEGGKVQ-RG  234 (428)
T ss_pred             ccCcc-CCCCccceEEECCCCCCCCCCCceECCCCeEEEEEeEEEcCCCCccceEEEEEhHHHHHHHHHHHhcCcCc-CC
Confidence            76532 2223346899999999999999999 99999999987654   3467899999999999999999999987 89


Q ss_pred             ccCccccccccHHHHhhcCCCCccCceEEEeeCCCCHHhhh-cCCCCEEEEECCEEcCChhhHHhhhc
Q 016641          317 SLGLSCQTTENVQLRNNFGMRSEVTGVLVNKINPLSDAHEI-LKKDDIILAFDGVPIANDGTGSHSML  383 (385)
Q Consensus       317 ~lGi~~~~~~~~~~~~~~g~~~~~~Gv~V~~V~~~spA~~a-L~~GDiI~~vng~~i~~~~~l~~~l~  383 (385)
                      |||+.++++ +++.++.+|++. ..|++|.+|.++|||+++ ||+||+|++|||++|.+..++.+.|.
T Consensus       235 ~lGi~~~~~-~~~~~~~lgl~~-~~Gv~V~~V~~~spA~~aGL~~GDvI~~Vng~~i~~~~~~~~~l~  300 (428)
T TIGR02037       235 WLGVTIQEV-TSDLAKSLGLEK-QRGALVAQVLPGSPAEKAGLKAGDVILSVNGKPISSFADLRRAIG  300 (428)
T ss_pred             cCceEeecC-CHHHHHHcCCCC-CCceEEEEccCCCChHHcCCCCCCEEEEECCEEcCCHHHHHHHHH
Confidence            999999999 788999999975 579999999999999999 99999999999999999999988763


No 6  
>COG0265 DegQ Trypsin-like serine proteases, typically periplasmic, contain C-terminal PDZ domain [Posttranslational modification, protein turnover, chaperones]
Probab=100.00  E-value=2.9e-35  Score=288.16  Aligned_cols=272  Identities=29%  Similarity=0.399  Sum_probs=228.6

Q ss_pred             CcHHHHHHHhCCCeEEEEeeecCCC-CCCCCCCCCC-CcceEEEEEec-CCEEEecccccCCCceEEEEEcCCCcEEEEE
Q 016641          103 TNAYAAIELALDSVVKIFTVSSSPN-YGLPWQNKSQ-RETTGSGFVIP-GKKILTNAHVVADSTFVLVRKHGSPTKYRAQ  179 (385)
Q Consensus       103 ~~~~~~~~~~~~SVV~I~~~~~~~~-~~~p~~~~~~-~~~~GSGfiI~-~g~ILT~aHvv~~~~~i~V~~~~~g~~~~a~  179 (385)
                      ..+..+++++.++||.|........ .+++-..... ..+.||||+++ ++||+||.|++.++..+.+.+. |+++++++
T Consensus        33 ~~~~~~~~~~~~~vV~~~~~~~~~~~~~~~~~~~~~~~~~~gSg~i~~~~g~ivTn~hVi~~a~~i~v~l~-dg~~~~a~  111 (347)
T COG0265          33 LSFATAVEKVAPAVVSIATGLTAKLRSFFPSDPPLRSAEGLGSGFIISSDGYIVTNNHVIAGAEEITVTLA-DGREVPAK  111 (347)
T ss_pred             cCHHHHHHhcCCcEEEEEeeeeecchhcccCCcccccccccccEEEEcCCeEEEecceecCCcceEEEEeC-CCCEEEEE
Confidence            5788999999999999998664332 1111111101 14789999999 9999999999999999999985 99999999


Q ss_pred             EEEecCCCCeEEEEecCCcccccceeeecCCcc--cCCCeEEEEecCCCCCCceEEEeeEeecccccccCCCceeeEEEe
Q 016641          180 VEAVGHECDLAILIVESDEFWEGMHFLELGDIP--FLQQAVAVVGYPQGGDNISVTKGVVSRVEPTQYVHGATQLMAIQI  257 (385)
Q Consensus       180 v~~~d~~~DlAlLkv~~~~~~~~~~~l~l~~~~--~~G~~V~~iG~p~~~~~~~~~~G~Vs~~~~~~~~~~~~~~~~i~~  257 (385)
                      +++.|+..|+|+||++...   .++.+.++++.  ++|+.++++|+|++.. .+++.|+|+...+...........+||+
T Consensus       112 ~vg~d~~~dlavlki~~~~---~~~~~~~~~s~~l~vg~~v~aiGnp~g~~-~tvt~Givs~~~r~~v~~~~~~~~~Iqt  187 (347)
T COG0265         112 LVGKDPISDLAVLKIDGAG---GLPVIALGDSDKLRVGDVVVAIGNPFGLG-QTVTSGIVSALGRTGVGSAGGYVNFIQT  187 (347)
T ss_pred             EEecCCccCEEEEEeccCC---CCceeeccCCCCcccCCEEEEecCCCCcc-cceeccEEeccccccccCcccccchhhc
Confidence            9999999999999999864   26777888776  5599999999999976 5999999999988622221225578999


Q ss_pred             cccCCCCCCCccee-eCCEEEEEEeeecCCC---CceEEEEecchHHHHHHHHHHcCeeeeeeccCccccccccHHHHhh
Q 016641          258 DAAINPGNSGGPAI-MGNKVAGVAFQNLSGA---ENIGYIIPVPVIKHFITGVVEHGKYVGFCSLGLSCQTTENVQLRNN  333 (385)
Q Consensus       258 d~~i~~G~SGGPL~-~~G~vVGI~s~~~~~~---~~~~~aip~~~i~~~l~~l~~~g~~~~~~~lGi~~~~~~~~~~~~~  333 (385)
                      |+++++|+||||++ .+|++|||++......   .+++|++|++.++.+++++.++|++. ++|+|+...++ +...+  
T Consensus       188 dAain~gnsGgpl~n~~g~~iGint~~~~~~~~~~gigfaiP~~~~~~v~~~l~~~G~v~-~~~lgv~~~~~-~~~~~--  263 (347)
T COG0265         188 DAAINPGNSGGPLVNIDGEVVGINTAIIAPSGGSSGIGFAIPVNLVAPVLDELISKGKVV-RGYLGVIGEPL-TADIA--  263 (347)
T ss_pred             ccccCCCCCCCceEcCCCcEEEEEEEEecCCCCcceeEEEecHHHHHHHHHHHHHcCCcc-ccccceEEEEc-ccccc--
Confidence            99999999999999 9999999999987743   35899999999999999999988877 89999999887 44444  


Q ss_pred             cCCCCccCceEEEeeCCCCHHhhh-cCCCCEEEEECCEEcCChhhHHhhhcc
Q 016641          334 FGMRSEVTGVLVNKINPLSDAHEI-LKKDDIILAFDGVPIANDGTGSHSMLF  384 (385)
Q Consensus       334 ~g~~~~~~Gv~V~~V~~~spA~~a-L~~GDiI~~vng~~i~~~~~l~~~l~~  384 (385)
                      +|++ ...|++|.+|.+++||+++ ++.||||+++||+++.+..++.+.+..
T Consensus       264 ~g~~-~~~G~~V~~v~~~spa~~agi~~Gdii~~vng~~v~~~~~l~~~v~~  314 (347)
T COG0265         264 LGLP-VAAGAVVLGVLPGSPAAKAGIKAGDIITAVNGKPVASLSDLVAAVAS  314 (347)
T ss_pred             cCCC-CCCceEEEecCCCChHHHcCCCCCCEEEEECCEEccCHHHHHHHHhc
Confidence            7766 4689999999999999999 999999999999999999999988753


No 7  
>KOG1320 consensus Serine protease [Posttranslational modification, protein turnover, chaperones]
Probab=99.92  E-value=3.8e-24  Score=210.73  Aligned_cols=278  Identities=26%  Similarity=0.265  Sum_probs=207.2

Q ss_pred             CCcHHHHHHHhCCCeEEEEeeecCCCCCCCCCCCCCCcceEEEEEec-CCEEEecccccCCCc-----------eEEEEE
Q 016641          102 TTNAYAAIELALDSVVKIFTVSSSPNYGLPWQNKSQRETTGSGFVIP-GKKILTNAHVVADST-----------FVLVRK  169 (385)
Q Consensus       102 ~~~~~~~~~~~~~SVV~I~~~~~~~~~~~p~~~~~~~~~~GSGfiI~-~g~ILT~aHvv~~~~-----------~i~V~~  169 (385)
                      .....++.++-..++|.|....-. ....|+....-....||||+++ +++++||+||+....           .+.+..
T Consensus       127 ~~~v~~~~~~cd~Avv~Ie~~~f~-~~~~~~e~~~ip~l~~S~~Vv~gd~i~VTnghV~~~~~~~y~~~~~~l~~vqi~a  205 (473)
T KOG1320|consen  127 KAFVAAVFEECDLAVVYIESEEFW-KGMNPFELGDIPSLNGSGFVVGGDGIIVTNGHVVRVEPRIYAHSSTVLLRVQIDA  205 (473)
T ss_pred             hhhHHHhhhcccceEEEEeecccc-CCCcccccCCCcccCccEEEEcCCcEEEEeeEEEEEEeccccCCCcceeeEEEEE
Confidence            345677889999999999974321 1122444555567889999999 999999999997432           255655


Q ss_pred             c-CCCcEEEEEEEEecCCCCeEEEEecCCcccccceeeecCCcc--cCCCeEEEEecCCCCCCceEEEeeEeeccccccc
Q 016641          170 H-GSPTKYRAQVEAVGHECDLAILIVESDEFWEGMHFLELGDIP--FLQQAVAVVGYPQGGDNISVTKGVVSRVEPTQYV  246 (385)
Q Consensus       170 ~-~~g~~~~a~v~~~d~~~DlAlLkv~~~~~~~~~~~l~l~~~~--~~G~~V~~iG~p~~~~~~~~~~G~Vs~~~~~~~~  246 (385)
                      . +.+..+++.+.+.|+..|+|+++++.+.  ...++++++-..  ..|+++.++|.|++..+ +.+.|+++...|..+.
T Consensus       206 a~~~~~s~ep~i~g~d~~~gvA~l~ik~~~--~i~~~i~~~~~~~~~~G~~~~a~~~~f~~~n-t~t~g~vs~~~R~~~~  282 (473)
T KOG1320|consen  206 AIGPGNSGEPVIVGVDKVAGVAFLKIKTPE--NILYVIPLGVSSHFRTGVEVSAIGNGFGLLN-TLTQGMVSGQLRKSFK  282 (473)
T ss_pred             eecCCccCCCeEEccccccceEEEEEecCC--cccceeecceeeeecccceeeccccCceeee-eeeecccccccccccc
Confidence            4 2348899999999999999999997653  136677777555  45999999999999985 8999999988776544


Q ss_pred             CC----CceeeEEEecccCCCCCCCccee-eCCEEEEEEeeecC---CCCceEEEEecchHHHHHHHHHHcC---eee--
Q 016641          247 HG----ATQLMAIQIDAAINPGNSGGPAI-MGNKVAGVAFQNLS---GAENIGYIIPVPVIKHFITGVVEHG---KYV--  313 (385)
Q Consensus       247 ~~----~~~~~~i~~d~~i~~G~SGGPL~-~~G~vVGI~s~~~~---~~~~~~~aip~~~i~~~l~~l~~~g---~~~--  313 (385)
                      -+    ....+++|+|+.++.|+|||||+ .+|++||+++....   -..+++|++|.+.++.++.+..+..   +..  
T Consensus       283 lg~~~g~~i~~~~qtd~ai~~~nsg~~ll~~DG~~IgVn~~~~~ri~~~~~iSf~~p~d~vl~~v~r~~e~~~~lr~~~~  362 (473)
T KOG1320|consen  283 LGLETGVLISKINQTDAAINPGNSGGPLLNLDGEVIGVNTRKVTRIGFSHGISFKIPIDTVLVIVLRLGEFQISLRPVKP  362 (473)
T ss_pred             cCcccceeeeeecccchhhhcccCCCcEEEecCcEeeeeeeeeEEeeccccceeccCchHhhhhhhhhhhhceeeccccC
Confidence            21    23446899999999999999999 99999999988654   2357899999999998888763222   211  


Q ss_pred             ---eeeccCccccccc----cHHHHhhcCCC-CccCceEEEeeCCCCHHhhh-cCCCCEEEEECCEEcCChhhHHhhhc
Q 016641          314 ---GFCSLGLSCQTTE----NVQLRNNFGMR-SEVTGVLVNKINPLSDAHEI-LKKDDIILAFDGVPIANDGTGSHSML  383 (385)
Q Consensus       314 ---~~~~lGi~~~~~~----~~~~~~~~g~~-~~~~Gv~V~~V~~~spA~~a-L~~GDiI~~vng~~i~~~~~l~~~l~  383 (385)
                         .+.|+|+..--+.    ...+.+.+-.+ ...+|++|.+|.+++++... +++||+|++|||++|.+..++.+.|-
T Consensus       363 ~~p~~~~~g~~s~~i~~g~vf~~~~~~~~~~~~~~q~v~is~Vlp~~~~~~~~~~~g~~V~~vng~~V~n~~~l~~~i~  441 (473)
T KOG1320|consen  363 LVPVHQYIGLPSYYIFAGLVFVPLTKSYIFPSGVVQLVLVSQVLPGSINGGYGLKPGDQVVKVNGKPVKNLKHLYELIE  441 (473)
T ss_pred             cccccccCCceeEEEecceEEeecCCCccccccceeEEEEEEeccCCCcccccccCCCEEEEECCEEeechHHHHHHHH
Confidence               1235665442221    01111223223 23479999999999999999 99999999999999999999998763


No 8  
>KOG1320 consensus Serine protease [Posttranslational modification, protein turnover, chaperones]
Probab=99.87  E-value=1.8e-22  Score=198.95  Aligned_cols=261  Identities=52%  Similarity=0.764  Sum_probs=235.3

Q ss_pred             HHHHhCCCeEEEEeeecCCCCCCCCCCCCCCcceEEEEEecCCEEEecccccC---CCceEEEEEcCCCcEEEEEEEEec
Q 016641          108 AIELALDSVVKIFTVSSSPNYGLPWQNKSQRETTGSGFVIPGKKILTNAHVVA---DSTFVLVRKHGSPTKYRAQVEAVG  184 (385)
Q Consensus       108 ~~~~~~~SVV~I~~~~~~~~~~~p~~~~~~~~~~GSGfiI~~g~ILT~aHvv~---~~~~i~V~~~~~g~~~~a~v~~~d  184 (385)
                      ..+...+|++.+......+.+..||+...+....|+||.+....++|++|+++   +...+.+..++.-+.|.+++...-
T Consensus        55 ~~~~~~~s~~~v~~~~~~~~~~~pw~~~~q~~~~~s~f~i~~~~lltn~~~v~~~~~~~~v~v~~~gs~~k~~~~v~~~~  134 (473)
T KOG1320|consen   55 VVDLALQSVVKVFSVSTEPSSVLPWQRTRQFSSGGSGFAIYGKKLLTNAHVVAPNNDHKFVTVKKHGSPRKYKAFVAAVF  134 (473)
T ss_pred             CccccccceeEEEeecccccccCcceeeehhcccccchhhcccceeecCccccccccccccccccCCCchhhhhhHHHhh
Confidence            45677889999999999999999999998888999999999999999999999   666777776667778899999999


Q ss_pred             CCCCeEEEEecCCcccccceeeecCCcccCCCeEEEEecCCCCCCceEEEeeEeecccccccCCCceeeEEEecccCCCC
Q 016641          185 HECDLAILIVESDEFWEGMHFLELGDIPFLQQAVAVVGYPQGGDNISVTKGVVSRVEPTQYVHGATQLMAIQIDAAINPG  264 (385)
Q Consensus       185 ~~~DlAlLkv~~~~~~~~~~~l~l~~~~~~G~~V~~iG~p~~~~~~~~~~G~Vs~~~~~~~~~~~~~~~~i~~d~~i~~G  264 (385)
                      .++|+|++.++..+||+.+.|+++++.+...+-++++|   + +...++.|.|.......+.........+++|+++++|
T Consensus       135 ~~cd~Avv~Ie~~~f~~~~~~~e~~~ip~l~~S~~Vv~---g-d~i~VTnghV~~~~~~~y~~~~~~l~~vqi~aa~~~~  210 (473)
T KOG1320|consen  135 EECDLAVVYIESEEFWKGMNPFELGDIPSLNGSGFVVG---G-DGIIVTNGHVVRVEPRIYAHSSTVLLRVQIDAAIGPG  210 (473)
T ss_pred             hcccceEEEEeeccccCCCcccccCCCcccCccEEEEc---C-CcEEEEeeEEEEEEeccccCCCcceeeEEEEEeecCC
Confidence            99999999999999999999999999999989999999   3 3479999999998887777776777789999999999


Q ss_pred             CCCccee-eCCEEEEEEeeecCCCCceEEEEecchHHHHHHHHHHcCeeeeeeccCccccccccHHHHhhcCCCCccCce
Q 016641          265 NSGGPAI-MGNKVAGVAFQNLSGAENIGYIIPVPVIKHFITGVVEHGKYVGFCSLGLSCQTTENVQLRNNFGMRSEVTGV  343 (385)
Q Consensus       265 ~SGGPL~-~~G~vVGI~s~~~~~~~~~~~aip~~~i~~~l~~l~~~g~~~~~~~lGi~~~~~~~~~~~~~~g~~~~~~Gv  343 (385)
                      +||+|.+ ..+++.|+.+......+++++.+|.-.+.++.......+.+.++++++...+.+++.+.++.+.|..+ .|+
T Consensus       211 ~s~ep~i~g~d~~~gvA~l~ik~~~~i~~~i~~~~~~~~~~G~~~~a~~~~f~~~nt~t~g~vs~~~R~~~~lg~~-~g~  289 (473)
T KOG1320|consen  211 NSGEPVIVGVDKVAGVAFLKIKTPENILYVIPLGVSSHFRTGVEVSAIGNGFGLLNTLTQGMVSGQLRKSFKLGLE-TGV  289 (473)
T ss_pred             ccCCCeEEccccccceEEEEEecCCcccceeecceeeeecccceeeccccCceeeeeeeecccccccccccccCcc-cce
Confidence            9999999 55999999999875444789999999999999998888888889999999999988999999999887 899


Q ss_pred             EEEeeCCCCHHhhhcCCCCEEEEECCEEcC
Q 016641          344 LVNKINPLSDAHEILKKDDIILAFDGVPIA  373 (385)
Q Consensus       344 ~V~~V~~~spA~~aL~~GDiI~~vng~~i~  373 (385)
                      .+.++.+-+.|.+.++.||+|+.+||+.|.
T Consensus       290 ~i~~~~qtd~ai~~~nsg~~ll~~DG~~Ig  319 (473)
T KOG1320|consen  290 LISKINQTDAAINPGNSGGPLLNLDGEVIG  319 (473)
T ss_pred             eeeeecccchhhhcccCCCcEEEecCcEee
Confidence            999999999999999999999999999995


No 9  
>KOG1421 consensus Predicted signaling-associated protein (contains a PDZ domain) [General function prediction only]
Probab=99.84  E-value=4.3e-20  Score=184.14  Aligned_cols=269  Identities=19%  Similarity=0.226  Sum_probs=207.1

Q ss_pred             cHHHHHHHhCCCeEEEEeeecCCCCCCCCCCCCCCcceEEEEEec--CCEEEecccccCCC-ceEEEEEcCCCcEEEEEE
Q 016641          104 NAYAAIELALDSVVKIFTVSSSPNYGLPWQNKSQRETTGSGFVIP--GKKILTNAHVVADS-TFVLVRKHGSPTKYRAQV  180 (385)
Q Consensus       104 ~~~~~~~~~~~SVV~I~~~~~~~~~~~p~~~~~~~~~~GSGfiI~--~g~ILT~aHvv~~~-~~i~V~~~~~g~~~~a~v  180 (385)
                      ++...+.++-+|||.|......     +++........++||+++  .||||||+|++... -.-.+.+. +..+.+.-.
T Consensus        53 ~w~~~ia~VvksvVsI~~S~v~-----~fdtesag~~~atgfvvd~~~gyiLtnrhvv~pgP~va~avf~-n~ee~ei~p  126 (955)
T KOG1421|consen   53 DWRNTIANVVKSVVSIRFSAVR-----AFDTESAGESEATGFVVDKKLGYILTNRHVVAPGPFVASAVFD-NHEEIEIYP  126 (955)
T ss_pred             hhhhhhhhhcccEEEEEehhee-----ecccccccccceeEEEEecccceEEEeccccCCCCceeEEEec-ccccCCccc
Confidence            7888999999999999987642     344445566789999999  78999999999854 34455554 556677778


Q ss_pred             EEecCCCCeEEEEecCCcc-cccceeeecCCc-ccCCCeEEEEecCCCCCCceEEEeeEeecccccccCCC-----ceee
Q 016641          181 EAVGHECDLAILIVESDEF-WEGMHFLELGDI-PFLQQAVAVVGYPQGGDNISVTKGVVSRVEPTQYVHGA-----TQLM  253 (385)
Q Consensus       181 ~~~d~~~DlAlLkv~~~~~-~~~~~~l~l~~~-~~~G~~V~~iG~p~~~~~~~~~~G~Vs~~~~~~~~~~~-----~~~~  253 (385)
                      ++.|+-+|+.++|.++... ...++-+++... .++|.++.++|+..+.. .++..|.++.+++.....+.     ....
T Consensus       127 vyrDpVhdfGf~r~dps~ir~s~vt~i~lap~~akvgseirvvgNDagEk-lsIlagflSrldr~apdyg~~~yndfnTf  205 (955)
T KOG1421|consen  127 VYRDPVHDFGFFRYDPSTIRFSIVTEICLAPELAKVGSEIRVVGNDAGEK-LSILAGFLSRLDRNAPDYGEDTYNDFNTF  205 (955)
T ss_pred             ccCCchhhcceeecChhhcceeeeeccccCccccccCCceEEecCCccce-EEeehhhhhhccCCCccccccccccccce
Confidence            8999999999999987642 124555666543 47899999999987765 68889999988875443221     2234


Q ss_pred             EEEecccCCCCCCCccee-eCCEEEEEEeeecCCCCceEEEEecchHHHHHHHHHHcCeeeeeeccCccccccccHHHHh
Q 016641          254 AIQIDAAINPGNSGGPAI-MGNKVAGVAFQNLSGAENIGYIIPVPVIKHFITGVVEHGKYVGFCSLGLSCQTTENVQLRN  332 (385)
Q Consensus       254 ~i~~d~~i~~G~SGGPL~-~~G~vVGI~s~~~~~~~~~~~aip~~~i~~~l~~l~~~g~~~~~~~lGi~~~~~~~~~~~~  332 (385)
                      ++|.......|.||.|++ -+|..|.++..+.. ....+|++|++.+++.|.-++.+.-+. |+.|-+++-.- ..+.++
T Consensus       206 y~QaasstsggssgspVv~i~gyAVAl~agg~~-ssas~ffLpLdrV~RaL~clq~n~PIt-RGtLqvefl~k-~~de~r  282 (955)
T KOG1421|consen  206 YIQAASSTSGGSSGSPVVDIPGYAVALNAGGSI-SSASDFFLPLDRVVRALRCLQNNTPIT-RGTLQVEFLHK-LFDECR  282 (955)
T ss_pred             eeeehhcCCCCCCCCceecccceEEeeecCCcc-cccccceeeccchhhhhhhhhcCCCcc-cceEEEEEehh-hhHHHH
Confidence            688888889999999999 99999999876543 345689999999999998888555444 67777666544 445666


Q ss_pred             hcCCCC-----------ccCce-EEEeeCCCCHHhhhcCCCCEEEEECCEEcCChhhHHhhh
Q 016641          333 NFGMRS-----------EVTGV-LVNKINPLSDAHEILKKDDIILAFDGVPIANDGTGSHSM  382 (385)
Q Consensus       333 ~~g~~~-----------~~~Gv-~V~~V~~~spA~~aL~~GDiI~~vng~~i~~~~~l~~~l  382 (385)
                      .+|++.           +..|+ +|..|.++|||++.|++||++++||+.-+.++.++.+.|
T Consensus       283 rlGL~sE~eqv~r~k~P~~tgmLvV~~vL~~gpa~k~Le~GDillavN~t~l~df~~l~~iL  344 (955)
T KOG1421|consen  283 RLGLSSEWEQVVRTKFPERTGMLVVETVLPEGPAEKKLEPGDILLAVNSTCLNDFEALEQIL  344 (955)
T ss_pred             hcCCcHHHHHHHHhcCcccceeEEEEEeccCCchhhccCCCcEEEEEcceehHHHHHHHHHH
Confidence            777754           24555 556799999999999999999999999999999988765


No 10 
>PF13365 Trypsin_2:  Trypsin-like peptidase domain; PDB: 1Y8T_A 2Z9I_A 3QO6_A 1L1J_A 1QY6_A 2O8L_A 3OTP_E 2ZLE_I 1KY9_A 3CS0_A ....
Probab=99.57  E-value=2.5e-14  Score=117.65  Aligned_cols=107  Identities=36%  Similarity=0.485  Sum_probs=70.8

Q ss_pred             EEEEEecCC-EEEecccccC--------CCceEEEEEcCCCcEEE--EEEEEecCC-CCeEEEEecCCcccccceeeecC
Q 016641          142 GSGFVIPGK-KILTNAHVVA--------DSTFVLVRKHGSPTKYR--AQVEAVGHE-CDLAILIVESDEFWEGMHFLELG  209 (385)
Q Consensus       142 GSGfiI~~g-~ILT~aHvv~--------~~~~i~V~~~~~g~~~~--a~v~~~d~~-~DlAlLkv~~~~~~~~~~~l~l~  209 (385)
                      ||||+|++. +||||+||+.        ....+.+... ++....  ++++..|+. .|+|||+++...      .    
T Consensus         1 GTGf~i~~~g~ilT~~Hvv~~~~~~~~~~~~~~~~~~~-~~~~~~~~~~~~~~~~~~~D~All~v~~~~------~----   69 (120)
T PF13365_consen    1 GTGFLIGPDGYILTAAHVVEDWNDGKQPDNSSVEVVFP-DGRRVPPVAEVVYFDPDDYDLALLKVDPWT------G----   69 (120)
T ss_dssp             EEEEEEETTTEEEEEHHHHTCCTT--G-TCSEEEEEET-TSCEEETEEEEEEEETT-TTEEEEEESCEE------E----
T ss_pred             CEEEEEcCCceEEEchhheecccccccCCCCEEEEEec-CCCEEeeeEEEEEECCccccEEEEEEeccc------c----
Confidence            899999954 9999999998        4567888876 666677  999999999 999999999100      0    


Q ss_pred             CcccCCCeEEEEecCCCCCCceEEEeeEeecccccccCCCceeeEEEecccCCCCCCCccee-eCCEEEEE
Q 016641          210 DIPFLQQAVAVVGYPQGGDNISVTKGVVSRVEPTQYVHGATQLMAIQIDAAINPGNSGGPAI-MGNKVAGV  279 (385)
Q Consensus       210 ~~~~~G~~V~~iG~p~~~~~~~~~~G~Vs~~~~~~~~~~~~~~~~i~~d~~i~~G~SGGPL~-~~G~vVGI  279 (385)
                          .+......+.         ..+.....      ........+ +++.+.+|+|||||+ .+|+||||
T Consensus        70 ----~~~~~~~~~~---------~~~~~~~~------~~~~~~~~~-~~~~~~~G~SGgpv~~~~G~vvGi  120 (120)
T PF13365_consen   70 ----VGGGVRVPGS---------TSGVSPTS------TNDNRMLYI-TDADTRPGSSGGPVFDSDGRVVGI  120 (120)
T ss_dssp             ----EEEEEEEEEE---------EEEEEEEE------EEETEEEEE-ESSS-STTTTTSEEEETTSEEEEE
T ss_pred             ----eeeeeEeeee---------cccccccc------CcccceeEe-eecccCCCcEeHhEECCCCEEEeC
Confidence                0000000000         00000000      000111124 799999999999999 99999997


No 11 
>PF00089 Trypsin:  Trypsin;  InterPro: IPR001254 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold. Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate; these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity.
Over 20 families (denoted S1 - S66) of serine proteases have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base. The geometric orientations of the catalytic residues are similar between families, despite different protein folds. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC). This group of serine proteases belongs to the MEROPS peptidase family S1 (chymotrypsin family, clan PA(S)) and to peptidase family S6 (Hap serine peptidases). The chymotrypsin family is almost totally confined to animals, although trypsin-like enzymes are found in actinomycetes of the genera Streptomyces and Saccharopolyspora, and in the fungus Fusarium oxysporum. The enzymes are inherently secreted, being synthesised with a signal peptide that targets them to the secretory pathway. Animal enzymes are either secreted directly, packaged into vesicles for regulated secretion, or are retained in leukocyte granules. The Hap family, 'Haemophilus adhesion and penetration', are proteins that play a role in the interaction with human epithelial cells. The serine protease activity is localized at the N-terminal domain, whereas the binding domain is in the C-terminal region.
; GO: 0004252 serine-type endopeptidase activity, 0006508 proteolysis; PDB: 1SPJ_A 1A5I_A 2ZGH_A 2ZKS_A 2ZGJ_A 2ZGC_A 2ODP_A 2I6Q_A 2I6S_A 2ODQ_A ....
Probab=99.52  E-value=3.6e-13  Score=122.05  Aligned_cols=165  Identities=21%  Similarity=0.239  Sum_probs=107.9

Q ss_pred             cceEEEEEecCCEEEecccccCCCceEEEEEc------CCC--cEEEEEEEEec----C---CCCeEEEEecCC-ccccc
Q 016641          139 ETTGSGFVIPGKKILTNAHVVADSTFVLVRKH------GSP--TKYRAQVEAVG----H---ECDLAILIVESD-EFWEG  202 (385)
Q Consensus       139 ~~~GSGfiI~~g~ILT~aHvv~~~~~i~V~~~------~~g--~~~~a~v~~~d----~---~~DlAlLkv~~~-~~~~~  202 (385)
                      ...|+|++|++.||||++||+.+...+.+.+.      .++  ..+..+-+..+    .   .+|+|||+++.+ .+.+.
T Consensus        24 ~~~C~G~li~~~~vLTaahC~~~~~~~~v~~g~~~~~~~~~~~~~~~v~~~~~h~~~~~~~~~~DiAll~L~~~~~~~~~  103 (220)
T PF00089_consen   24 RFFCTGTLISPRWVLTAAHCVDGASDIKVRLGTYSIRNSDGSEQTIKVSKIIIHPKYDPSTYDNDIALLKLDRPITFGDN  103 (220)
T ss_dssp             EEEEEEEEEETTEEEEEGGGHTSGGSEEEEESESBTTSTTTTSEEEEEEEEEEETTSBTTTTTTSEEEEEESSSSEHBSS
T ss_pred             CeeEeEEecccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccc
Confidence            46799999999999999999999555655432      122  23444433332    2   579999999987 45567


Q ss_pred             ceeeecCCcc---cCCCeEEEEecCCCCCCc---eEEEeeEeecc---cccccCCCceeeEEEecc----cCCCCCCCcc
Q 016641          203 MHFLELGDIP---FLQQAVAVVGYPQGGDNI---SVTKGVVSRVE---PTQYVHGATQLMAIQIDA----AINPGNSGGP  269 (385)
Q Consensus       203 ~~~l~l~~~~---~~G~~V~~iG~p~~~~~~---~~~~G~Vs~~~---~~~~~~~~~~~~~i~~d~----~i~~G~SGGP  269 (385)
                      +.++.+....   ..|+.+.++||+......   .+....+..+.   +............+....    ..|.|+||||
T Consensus       104 ~~~~~l~~~~~~~~~~~~~~~~G~~~~~~~~~~~~~~~~~~~~~~~~~c~~~~~~~~~~~~~c~~~~~~~~~~~g~sG~p  183 (220)
T PF00089_consen  104 IQPICLPSAGSDPNVGTSCIVVGWGRTSDNGYSSNLQSVTVPVVSRKTCRSSYNDNLTPNMICAGSSGSGDACQGDSGGP  183 (220)
T ss_dssp             BEESBBTSTTHTTTTTSEEEEEESSBSSTTSBTSBEEEEEEEEEEHHHHHHHTTTTSTTTEEEEETTSSSBGGTTTTTSE
T ss_pred             cccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccc
Confidence            8899998733   568999999998753322   33333333222   221111111112344444    7899999999


Q ss_pred             eeeCC-EEEEEEeeecC-C-CCceEEEEecchHHHHH
Q 016641          270 AIMGN-KVAGVAFQNLS-G-AENIGYIIPVPVIKHFI  303 (385)
Q Consensus       270 L~~~G-~vVGI~s~~~~-~-~~~~~~aip~~~i~~~l  303 (385)
                      |+.++ +|+||++.... + .....+..+++...+|+
T Consensus       184 l~~~~~~lvGI~s~~~~c~~~~~~~v~~~v~~~~~WI  220 (220)
T PF00089_consen  184 LICNNNYLVGIVSFGENCGSPNYPGVYTRVSSYLDWI  220 (220)
T ss_dssp             EEETTEEEEEEEEEESSSSBTTSEEEEEEGGGGHHHH
T ss_pred             cccceeeecceeeecCCCCCCCcCEEEEEHHHhhccC
Confidence            99444 59999998733 1 12358889998887775


No 12 
>cd00190 Tryp_SPc Trypsin-like serine protease; Many of these are synthesized as inactive precursor zymogens that are cleaved during limited proteolysis to generate their active forms. The alignment also contains inactive enzymes that have substitutions of the catalytic triad residues.
Probab=99.49  E-value=1.3e-12  Score=119.19  Aligned_cols=166  Identities=19%  Similarity=0.192  Sum_probs=98.2

Q ss_pred             cceEEEEEecCCEEEecccccCCC--ceEEEEEcC--------CCcEEEEEEEEec-------CCCCeEEEEecCC-ccc
Q 016641          139 ETTGSGFVIPGKKILTNAHVVADS--TFVLVRKHG--------SPTKYRAQVEAVG-------HECDLAILIVESD-EFW  200 (385)
Q Consensus       139 ~~~GSGfiI~~g~ILT~aHvv~~~--~~i~V~~~~--------~g~~~~a~v~~~d-------~~~DlAlLkv~~~-~~~  200 (385)
                      ...|+|++|++.+|||+|||+.+.  ..+.|.+..        ....+..+-+..+       ..+||||||++.+ .+.
T Consensus        24 ~~~C~GtlIs~~~VLTaAhC~~~~~~~~~~v~~g~~~~~~~~~~~~~~~v~~~~~hp~y~~~~~~~DiAll~L~~~~~~~  103 (232)
T cd00190          24 RHFCGGSLISPRWVLTAAHCVYSSAPSNYTVRLGSHDLSSNEGGGQVIKVKKVIVHPNYNPSTYDNDIALLKLKRPVTLS  103 (232)
T ss_pred             cEEEEEEEeeCCEEEECHHhcCCCCCccEEEEeCcccccCCCCceEEEEEEEEEECCCCCCCCCcCCEEEEEECCcccCC
Confidence            368999999999999999999874  455555421        1222334333333       3589999999875 344


Q ss_pred             ccceeeecCCc--c-cCCCeEEEEecCCCCCC----ceEEE---eeEeecccccccC--CCceeeEEEe-----cccCCC
Q 016641          201 EGMHFLELGDI--P-FLQQAVAVVGYPQGGDN----ISVTK---GVVSRVEPTQYVH--GATQLMAIQI-----DAAINP  263 (385)
Q Consensus       201 ~~~~~l~l~~~--~-~~G~~V~~iG~p~~~~~----~~~~~---G~Vs~~~~~~~~~--~~~~~~~i~~-----d~~i~~  263 (385)
                      ..+.|++|...  . ..|+.+.+.||......    .....   .++....+.....  .......+..     +...|+
T Consensus       104 ~~v~picl~~~~~~~~~~~~~~~~G~g~~~~~~~~~~~~~~~~~~~~~~~~C~~~~~~~~~~~~~~~C~~~~~~~~~~c~  183 (232)
T cd00190         104 DNVRPICLPSSGYNLPAGTTCTVSGWGRTSEGGPLPDVLQEVNVPIVSNAECKRAYSYGGTITDNMLCAGGLEGGKDACQ  183 (232)
T ss_pred             CcccceECCCccccCCCCCEEEEEeCCcCCCCCCCCceeeEEEeeeECHHHhhhhccCcccCCCceEeeCCCCCCCcccc
Confidence            56889999866  2 45899999999765321    11222   2222222211111  0000111211     345789


Q ss_pred             CCCCccee-eC---CEEEEEEeeecCCC--CceEEEEecchHHHHHH
Q 016641          264 GNSGGPAI-MG---NKVAGVAFQNLSGA--ENIGYIIPVPVIKHFIT  304 (385)
Q Consensus       264 G~SGGPL~-~~---G~vVGI~s~~~~~~--~~~~~aip~~~i~~~l~  304 (385)
                      |+|||||+ ..   +.++||.+....-.  ...+....+....+|++
T Consensus       184 gdsGgpl~~~~~~~~~lvGI~s~g~~c~~~~~~~~~t~v~~~~~WI~  230 (232)
T cd00190         184 GDSGGPLVCNDNGRGVLVGIVSWGSGCARPNYPGVYTRVSSYLDWIQ  230 (232)
T ss_pred             CCCCCcEEEEeCCEEEEEEEEehhhccCCCCCCCEEEEcHHhhHHhh
Confidence            99999999 43   78999999864311  12334444444555543


No 13 
>KOG1421 consensus Predicted signaling-associated protein (contains a PDZ domain) [General function prediction only]
Probab=99.45  E-value=2.5e-12  Score=129.18  Aligned_cols=259  Identities=15%  Similarity=0.130  Sum_probs=181.1

Q ss_pred             HHHhCCCeEEEEeeecCCCCCCCCCCCCCCcceEEEEEec--CCEEEecccccC-CCceEEEEEcCCCcEEEEEEEEecC
Q 016641          109 IELALDSVVKIFTVSSSPNYGLPWQNKSQRETTGSGFVIP--GKKILTNAHVVA-DSTFVLVRKHGSPTKYRAQVEAVGH  185 (385)
Q Consensus       109 ~~~~~~SVV~I~~~~~~~~~~~p~~~~~~~~~~GSGfiI~--~g~ILT~aHvv~-~~~~i~V~~~~~g~~~~a~v~~~d~  185 (385)
                      .+++..+.|.+......+-     +.-......|||.|++  +|+++++..++. +..+.+|+.. |...++|.+...++
T Consensus       524 ~~~i~~~~~~v~~~~~~~l-----~g~s~~i~kgt~~i~d~~~g~~vvsr~~vp~d~~d~~vt~~-dS~~i~a~~~fL~~  597 (955)
T KOG1421|consen  524 SADISNCLVDVEPMMPVNL-----DGVSSDIYKGTALIMDTSKGLGVVSRSVVPSDAKDQRVTEA-DSDGIPANVSFLHP  597 (955)
T ss_pred             hhHHhhhhhhheeceeecc-----ccchhhhhcCceEEEEccCCceeEecccCCchhhceEEeec-ccccccceeeEecC
Confidence            4667777787776443221     1112234569999998  899999999997 6778888886 77889999999999


Q ss_pred             CCCeEEEEecCCcccccceeeecCCcc-cCCCeEEEEecCCCCCCc----eEEEeeEeecccccc-cCCCceeeEEEecc
Q 016641          186 ECDLAILIVESDEFWEGMHFLELGDIP-FLQQAVAVVGYPQGGDNI----SVTKGVVSRVEPTQY-VHGATQLMAIQIDA  259 (385)
Q Consensus       186 ~~DlAlLkv~~~~~~~~~~~l~l~~~~-~~G~~V~~iG~p~~~~~~----~~~~G~Vs~~~~~~~-~~~~~~~~~i~~d~  259 (385)
                      ..++|.+|.++..    ...++|.+.. ..||+|...|+....+..    +++.-.+....+... .......+.|..++
T Consensus       598 t~n~a~~kydp~~----~~~~kl~~~~v~~gD~~~f~g~~~~~r~ltaktsv~dvs~~~~ps~~~pr~r~~n~e~Is~~~  673 (955)
T KOG1421|consen  598 TENVASFKYDPAL----EVQLKLTDTTVLRGDECTFEGFTEDLRALTAKTSVTDVSVVIIPSSVMPRFRATNLEVISFMD  673 (955)
T ss_pred             ccceeEeccChhH----hhhhccceeeEecCCceeEecccccchhhcccceeeeeEEEEecCCCCcceeecceEEEEEec
Confidence            9999999998764    3455665544 459999999998765421    222111111111111 11123445677766


Q ss_pred             cCCCCCCCccee-eCCEEEEEEeeecC---CCC--ceEEEEecchHHHHHHHHHHcCeeeeeeccCccccccccHHHHhh
Q 016641          260 AINPGNSGGPAI-MGNKVAGVAFQNLS---GAE--NIGYIIPVPVIKHFITGVVEHGKYVGFCSLGLSCQTTENVQLRNN  333 (385)
Q Consensus       260 ~i~~G~SGGPL~-~~G~vVGI~s~~~~---~~~--~~~~aip~~~i~~~l~~l~~~g~~~~~~~lGi~~~~~~~~~~~~~  333 (385)
                      .+.-++--|-+. .+|+|+|++-....   +..  ..-|.+.+..++.+|++|+..+..+ ...+|+.+..+ +...++.
T Consensus       674 nlsT~c~sg~ltdddg~vvalwl~~~ge~~~~kd~~y~~gl~~~~~l~vl~rlk~g~~~r-p~i~~vef~~i-~laqar~  751 (955)
T KOG1421|consen  674 NLSTSCLSGRLTDDDGEVVALWLSVVGEDVGGKDYTYKYGLSMSYILPVLERLKLGPSAR-PTIAGVEFSHI-TLAQART  751 (955)
T ss_pred             cccccccceEEECCCCeEEEEEeeeeccccCCceeEEEeccchHHHHHHHHHHhcCCCCC-ceeeccceeeE-Eeehhhc
Confidence            655444446677 99999999866554   111  1345567789999999999877765 45688888888 6777788


Q ss_pred             cCCCCc------------cCceEEEeeCCCCHHhhhcCCCCEEEEECCEEcCChhhHHhh
Q 016641          334 FGMRSE------------VTGVLVNKINPLSDAHEILKKDDIILAFDGVPIANDGTGSHS  381 (385)
Q Consensus       334 ~g~~~~------------~~Gv~V~~V~~~spA~~aL~~GDiI~~vng~~i~~~~~l~~~  381 (385)
                      +|+|.+            .+=.+|++|.+.-+--  |..||||+++|||.|+...||.+.
T Consensus       752 lglp~e~imk~e~es~~~~ql~~ishv~~~~~ki--l~~gdiilsvngk~itr~~dl~d~  809 (955)
T KOG1421|consen  752 LGLPSEFIMKSEEESTIPRQLYVISHVRPLLHKI--LGVGDIILSVNGKMITRLSDLHDF  809 (955)
T ss_pred             cCCCHHHHhhhhhcCCCcceEEEEEeeccCcccc--cccccEEEEecCeEEeeehhhhhh
Confidence            888752            3456788898865433  999999999999999999999873


No 14 
>smart00020 Tryp_SPc Trypsin-like serine protease. Many of these are synthesised as inactive precursor zymogens that are cleaved during limited proteolysis to generate their active forms. A few, however, are active as single chain molecules, and others are inactive due to substitutions of the catalytic triad residues.
Probab=99.40  E-value=6.3e-12  Score=114.92  Aligned_cols=146  Identities=18%  Similarity=0.189  Sum_probs=91.0

Q ss_pred             cceEEEEEecCCEEEecccccCCCc--eEEEEEcCCC-------cEEEEEEEEec-------CCCCeEEEEecCC-cccc
Q 016641          139 ETTGSGFVIPGKKILTNAHVVADST--FVLVRKHGSP-------TKYRAQVEAVG-------HECDLAILIVESD-EFWE  201 (385)
Q Consensus       139 ~~~GSGfiI~~g~ILT~aHvv~~~~--~i~V~~~~~g-------~~~~a~v~~~d-------~~~DlAlLkv~~~-~~~~  201 (385)
                      ...|+|.+|++.+|||++||+.+..  .+.|.+....       ..+.+.-+..+       ..+|||||+++.+ .+.+
T Consensus        25 ~~~C~GtlIs~~~VLTaahC~~~~~~~~~~v~~g~~~~~~~~~~~~~~v~~~~~~p~~~~~~~~~DiAll~L~~~i~~~~  104 (229)
T smart00020       25 RHFCGGSLISPRWVLTAAHCVYGSDPSNIRVRLGSHDLSSGEEGQVIKVSKVIIHPNYNPSTYDNDIALLKLKSPVTLSD  104 (229)
T ss_pred             CcEEEEEEecCCEEEECHHHcCCCCCcceEEEeCcccCCCCCCceEEeeEEEEECCCCCCCCCcCCEEEEEECcccCCCC
Confidence            4579999999999999999998743  6666654211       33344433322       3589999999876 3445


Q ss_pred             cceeeecCCc---ccCCCeEEEEecCCCCC-----CceEEEeeEeecc---cccccC-----CCceeeEEEe--cccCCC
Q 016641          202 GMHFLELGDI---PFLQQAVAVVGYPQGGD-----NISVTKGVVSRVE---PTQYVH-----GATQLMAIQI--DAAINP  263 (385)
Q Consensus       202 ~~~~l~l~~~---~~~G~~V~~iG~p~~~~-----~~~~~~G~Vs~~~---~~~~~~-----~~~~~~~i~~--d~~i~~  263 (385)
                      .+.|+.|...   ...++.+.+.||+....     ........+..+.   +.....     ..........  ....|+
T Consensus       105 ~~~pi~l~~~~~~~~~~~~~~~~g~g~~~~~~~~~~~~~~~~~~~~~~~~~C~~~~~~~~~~~~~~~C~~~~~~~~~~c~  184 (229)
T smart00020      105 NVRPICLPSSNYNVPAGTTCTVSGWGRTSEGAGSLPDTLQEVNVPIVSNATCRRAYSGGGAITDNMLCAGGLEGGKDACQ  184 (229)
T ss_pred             ceeeccCCCcccccCCCCEEEEEeCCCCCCCCCcCCCEeeEEEEEEeCHHHhhhhhccccccCCCcEeecCCCCCCcccC
Confidence            6889999864   34589999999986542     0112222222211   111000     0111111111  355789


Q ss_pred             CCCCcceeeCC---EEEEEEeeec
Q 016641          264 GNSGGPAIMGN---KVAGVAFQNL  284 (385)
Q Consensus       264 G~SGGPL~~~G---~vVGI~s~~~  284 (385)
                      |+|||||+.++   .++||++...
T Consensus       185 gdsG~pl~~~~~~~~l~Gi~s~g~  208 (229)
T smart00020      185 GDSGGPLVCNDGRWVLVGIVSWGS  208 (229)
T ss_pred             CCCCCeeEEECCCEEEEEEEEECC
Confidence            99999999444   9999999864


No 15 
>PF13180 PDZ_2:  PDZ domain; PDB: 2L97_A 1Y8T_A 2Z9I_A 1LCY_A 2PZD_B 2P3W_A 1VCW_C 1TE0_B 1SOZ_C 1SOT_C ....
Probab=98.95  E-value=1e-09  Score=84.67  Aligned_cols=56  Identities=30%  Similarity=0.439  Sum_probs=47.7

Q ss_pred             eccCccccccccHHHHhhcCCCCccCceEEEeeCCCCHHhhh-cCCCCEEEEECCEEcCChhhHHhhhc
Q 016641          316 CSLGLSCQTTENVQLRNNFGMRSEVTGVLVNKINPLSDAHEI-LKKDDIILAFDGVPIANDGTGSHSML  383 (385)
Q Consensus       316 ~~lGi~~~~~~~~~~~~~~g~~~~~~Gv~V~~V~~~spA~~a-L~~GDiI~~vng~~i~~~~~l~~~l~  383 (385)
                      ||||+.+.....            ..|++|.+|.++|||+++ ||+||+|++|||++|.+..++.+.|.
T Consensus         1 ~~lGv~~~~~~~------------~~g~~V~~V~~~spA~~aGl~~GD~I~~ing~~v~~~~~~~~~l~   57 (82)
T PF13180_consen    1 GGLGVTVQNLSD------------TGGVVVVSVIPGSPAAKAGLQPGDIILAINGKPVNSSEDLVNILS   57 (82)
T ss_dssp             -E-SEEEEECSC------------SSSEEEEEESTTSHHHHTTS-TTEEEEEETTEESSSHHHHHHHHH
T ss_pred             CEECeEEEEccC------------CCeEEEEEeCCCCcHHHCCCCCCcEEEEECCEEcCCHHHHHHHHH
Confidence            579999877631            369999999999999999 99999999999999999999998874


No 16 
>cd00987 PDZ_serine_protease PDZ domain of trypsin-like serine proteases, such as DegP/HtrA, which are oligomeric proteins involved in heat-shock response, chaperone function, and apoptosis. May be responsible for substrate recognition and/or binding, as most PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of protease-associated PDZ domains a C-terminal beta-strand forms the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins.
Probab=98.92  E-value=3.1e-09  Score=82.99  Aligned_cols=65  Identities=31%  Similarity=0.517  Sum_probs=56.3

Q ss_pred             eccCccccccccHHHHhhcCCCCccCceEEEeeCCCCHHhhh-cCCCCEEEEECCEEcCChhhHHhhh
Q 016641          316 CSLGLSCQTTENVQLRNNFGMRSEVTGVLVNKINPLSDAHEI-LKKDDIILAFDGVPIANDGTGSHSM  382 (385)
Q Consensus       316 ~~lGi~~~~~~~~~~~~~~g~~~~~~Gv~V~~V~~~spA~~a-L~~GDiI~~vng~~i~~~~~l~~~l  382 (385)
                      +|+|+.++++ ++..++.++++ ...|++|.+|.++|||+++ |++||+|++|||++|.+..++.+.+
T Consensus         1 ~~~G~~~~~~-~~~~~~~~~~~-~~~g~~V~~v~~~s~a~~~gl~~GD~I~~Ing~~i~~~~~~~~~l   66 (90)
T cd00987           1 PWLGVTVQDL-TPDLAEELGLK-DTKGVLVASVDPGSPAAKAGLKPGDVILAVNGKPVKSVADLRRAL   66 (90)
T ss_pred             CccceEEeEC-CHHHHHHcCCC-CCCEEEEEEECCCCHHHHcCCCcCCEEEEECCEECCCHHHHHHHH
Confidence            5799999999 56666656654 3579999999999999999 9999999999999999999988765


No 17 
>KOG3627 consensus Trypsin [Amino acid transport and metabolism]
Probab=98.79  E-value=3.6e-07  Score=85.34  Aligned_cols=144  Identities=19%  Similarity=0.178  Sum_probs=85.9

Q ss_pred             eEEEEEecCCEEEecccccCCCc--eEEEEEcC--------CC---cEE-EEEEEEecC-------C-CCeEEEEecCC-
Q 016641          141 TGSGFVIPGKKILTNAHVVADST--FVLVRKHG--------SP---TKY-RAQVEAVGH-------E-CDLAILIVESD-  197 (385)
Q Consensus       141 ~GSGfiI~~g~ILT~aHvv~~~~--~i~V~~~~--------~g---~~~-~a~v~~~d~-------~-~DlAlLkv~~~-  197 (385)
                      .|.|.+|++.||||++||+.+..  ...|.+..        .+   ... ..+++ .++       . +|||||+++.+ 
T Consensus        39 ~Cggsli~~~~vltaaHC~~~~~~~~~~V~~G~~~~~~~~~~~~~~~~~~v~~~i-~H~~y~~~~~~~nDiall~l~~~v  117 (256)
T KOG3627|consen   39 LCGGSLISPRWVLTAAHCVKGASASLYTVRLGEHDINLSVSEGEEQLVGDVEKII-VHPNYNPRTLENNDIALLRLSEPV  117 (256)
T ss_pred             eeeeEEeeCCEEEEChhhCCCCCCcceEEEECccccccccccCchhhhceeeEEE-ECCCCCCCCCCCCCEEEEEECCCc
Confidence            78888889889999999999865  55555421        11   111 11233 221       3 89999999975 


Q ss_pred             cccccceeeecCCcc----cC-CCeEEEEecCCCCC-----CceE---EEeeEeecccccccCCC--ceeeEEEe-----
Q 016641          198 EFWEGMHFLELGDIP----FL-QQAVAVVGYPQGGD-----NISV---TKGVVSRVEPTQYVHGA--TQLMAIQI-----  257 (385)
Q Consensus       198 ~~~~~~~~l~l~~~~----~~-G~~V~~iG~p~~~~-----~~~~---~~G~Vs~~~~~~~~~~~--~~~~~i~~-----  257 (385)
                      .|.+.+.|+.|....    .. +..+++.||+....     ....   ...+++...+.......  .....+..     
T Consensus       118 ~~~~~i~piclp~~~~~~~~~~~~~~~v~GWG~~~~~~~~~~~~L~~~~v~i~~~~~C~~~~~~~~~~~~~~~Ca~~~~~  197 (256)
T KOG3627|consen  118 TFSSHIQPICLPSSADPYFPPGGTTCLVSGWGRTESGGGPLPDTLQEVDVPIISNSECRRAYGGLGTITDTMLCAGGPEG  197 (256)
T ss_pred             ccCCcccccCCCCCcccCCCCCCCEEEEEeCCCcCCCCCCCCceeEEEEEeEcChhHhcccccCccccCCCEEeeCccCC
Confidence            566788999886432    22 48899999865321     1112   22233322221111110  00011332     


Q ss_pred             cccCCCCCCCccee-eC---CEEEEEEeeecC
Q 016641          258 DAAINPGNSGGPAI-MG---NKVAGVAFQNLS  285 (385)
Q Consensus       258 d~~i~~G~SGGPL~-~~---G~vVGI~s~~~~  285 (385)
                      ....|.|||||||+ .+   ..++||++++..
T Consensus       198 ~~~~C~GDSGGPLv~~~~~~~~~~GivS~G~~  229 (256)
T KOG3627|consen  198 GKDACQGDSGGPLVCEDNGRWVLVGIVSWGSG  229 (256)
T ss_pred             CCccccCCCCCeEEEeeCCcEEEEEEEEecCC
Confidence            23469999999999 44   699999999754


No 18 
>COG3591 V8-like Glu-specific endopeptidase [Amino acid transport and metabolism]
Probab=98.69  E-value=4e-07  Score=83.97  Aligned_cols=157  Identities=19%  Similarity=0.218  Sum_probs=88.8

Q ss_pred             eEEEEEecCCEEEecccccCCCc----eEEEEEc---CC-CcEEEEE--EEEec-C---CCCeEEEEecCCcc------c
Q 016641          141 TGSGFVIPGKKILTNAHVVADST----FVLVRKH---GS-PTKYRAQ--VEAVG-H---ECDLAILIVESDEF------W  200 (385)
Q Consensus       141 ~GSGfiI~~g~ILT~aHvv~~~~----~i~V~~~---~~-g~~~~a~--v~~~d-~---~~DlAlLkv~~~~~------~  200 (385)
                      .|++|+|.++.+||++||+....    .+.+...   ++ +..+..+  ..... .   +.|.+...+.+..+      .
T Consensus        65 ~~~~~lI~pntvLTa~Hc~~s~~~G~~~~~~~p~g~~~~~~~~~~~~~~~~~~~~g~~~~~d~~~~~v~~~~~~~g~~~~  144 (251)
T COG3591          65 CTAATLIGPNTVLTAGHCIYSPDYGEDDIAAAPPGVNSDGGPFYGITKIEIRVYPGELYKEDGASYDVGEAALESGINIG  144 (251)
T ss_pred             eeeEEEEcCceEEEeeeEEecCCCChhhhhhcCCcccCCCCCCCceeeEEEEecCCceeccCCceeeccHHHhccCCCcc
Confidence            45669999999999999996432    2222211   11 1111111  11111 2   34555555543221      1


Q ss_pred             ccce--eeecCCcccCCCeEEEEecCCCCCC---ceEEEeeEeecccccccCCCceeeEEEecccCCCCCCCccee-eCC
Q 016641          201 EGMH--FLELGDIPFLQQAVAVVGYPQGGDN---ISVTKGVVSRVEPTQYVHGATQLMAIQIDAAINPGNSGGPAI-MGN  274 (385)
Q Consensus       201 ~~~~--~l~l~~~~~~G~~V~~iG~p~~~~~---~~~~~G~Vs~~~~~~~~~~~~~~~~i~~d~~i~~G~SGGPL~-~~G  274 (385)
                      ....  ........++++.+.++|||.+..+   .....+.|.....          ..++.++.+++|+||.|++ .+.
T Consensus       145 ~~~~~~~~~~~~~~~~~d~i~v~GYP~dk~~~~~~~e~t~~v~~~~~----------~~l~y~~dT~pG~SGSpv~~~~~  214 (251)
T COG3591         145 DVVNYLKRNTASEAKANDRITVIGYPGDKPNIGTMWESTGKVNSIKG----------NKLFYDADTLPGSSGSPVLISKD  214 (251)
T ss_pred             ccccccccccccccccCceeEEEeccCCCCcceeEeeecceeEEEec----------ceEEEEecccCCCCCCceEecCc
Confidence            1222  2222233366889999999987652   2233333332211          2578899999999999999 888


Q ss_pred             EEEEEEeeecCCC--CceEEE-EecchHHHHHHHHH
Q 016641          275 KVAGVAFQNLSGA--ENIGYI-IPVPVIKHFITGVV  307 (385)
Q Consensus       275 ~vVGI~s~~~~~~--~~~~~a-ip~~~i~~~l~~l~  307 (385)
                      +++|+++......  ...+++ .-...++++++++.
T Consensus       215 ~vigv~~~g~~~~~~~~~n~~vr~t~~~~~~I~~~~  250 (251)
T COG3591         215 EVIGVHYNGPGANGGSLANNAVRLTPEILNFIQQNI  250 (251)
T ss_pred             eEEEEEecCCCcccccccCcceEecHHHHHHHHHhh
Confidence            9999998865522  122333 33345666666553


No 19 
>cd00991 PDZ_archaeal_metalloprotease PDZ domain of archaeal zinc metalloproteases, presumably membrane-associated or integral membrane proteases, which may be involved in signalling and regulatory mechanisms. May be responsible for substrate recognition and/or binding, as most PDZ domains bind C-terminal polypeptides, and binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of protease-associated PDZ domains a C-terminal beta-strand forms the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins.
Probab=98.53  E-value=1.8e-07  Score=71.59  Aligned_cols=44  Identities=25%  Similarity=0.343  Sum_probs=40.8

Q ss_pred             cCceEEEeeCCCCHHhhh-cCCCCEEEEECCEEcCChhhHHhhhc
Q 016641          340 VTGVLVNKINPLSDAHEI-LKKDDIILAFDGVPIANDGTGSHSML  383 (385)
Q Consensus       340 ~~Gv~V~~V~~~spA~~a-L~~GDiI~~vng~~i~~~~~l~~~l~  383 (385)
                      ..|++|.+|.++|||+++ ||+||+|++|||++|.+..++.+.|.
T Consensus         9 ~~Gv~V~~V~~~spa~~aGL~~GDiI~~Ing~~v~~~~d~~~~l~   53 (79)
T cd00991           9 VAGVVIVGVIVGSPAENAVLHTGDVIYSINGTPITTLEDFMEALK   53 (79)
T ss_pred             CCcEEEEEECCCChHHhcCCCCCCEEEEECCEEcCCHHHHHHHHh
Confidence            379999999999999999 99999999999999999999987763


No 20 
>cd00990 PDZ_glycyl_aminopeptidase PDZ domain associated with archaeal and bacterial M61 glycyl-aminopeptidases. May be responsible for substrate recognition and/or binding, as most PDZ domains bind C-terminal polypeptides, and binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of protease-associated PDZ domains a C-terminal beta-strand is presumed to form the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins.
Probab=98.52  E-value=1.4e-07  Score=72.01  Aligned_cols=50  Identities=20%  Similarity=0.086  Sum_probs=41.8

Q ss_pred             eccCccccccccHHHHhhcCCCCccCceEEEeeCCCCHHhhh-cCCCCEEEEECCEEcCChhhHH
Q 016641          316 CSLGLSCQTTENVQLRNNFGMRSEVTGVLVNKINPLSDAHEI-LKKDDIILAFDGVPIANDGTGS  379 (385)
Q Consensus       316 ~~lGi~~~~~~~~~~~~~~g~~~~~~Gv~V~~V~~~spA~~a-L~~GDiI~~vng~~i~~~~~l~  379 (385)
                      ||+|+.+..-              ..|++|.+|.++|||+++ |++||+|++|||++|.+..++.
T Consensus         1 ~~~G~~~~~~--------------~~~~~V~~V~~~s~a~~aGl~~GD~I~~Ing~~v~~~~~~l   51 (80)
T cd00990           1 PYLGLTLDKE--------------EGLGKVTFVRDDSPADKAGLVAGDELVAVNGWRVDALQDRL   51 (80)
T ss_pred             CcccEEEEcc--------------CCcEEEEEECCCChHHHhCCCCCCEEEEECCEEhHHHHHHH
Confidence            4688777432              257999999999999999 9999999999999999855543


No 21 
>cd00136 PDZ PDZ domain, also called DHR (Dlg homologous region) or GLGF (after a conserved sequence motif). Many PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. Heterodimerization through PDZ-PDZ domain interactions adds to the domain's versatility, and PDZ domain-mediated interactions may be modulated dynamically through target phosphorylation. Some PDZ domains play a role in scaffolding supramolecular complexes. PDZ domains are found in diverse signaling proteins in bacteria, archaebacteria, and eukaryotes. This CD contains two distinct structural subgroups with either a N- or C-terminal beta-strand forming the peptide-binding groove base. The circular permutation placing the strand on the N-terminus appears to be found in Eumetazoa only, while the C-terminal variant is found in all three kingdoms of life, and seems to co-occur with protease domains. PDZ domains have been named after PSD95(pos
Probab=98.51  E-value=2.1e-07  Score=69.03  Aligned_cols=42  Identities=31%  Similarity=0.434  Sum_probs=38.5

Q ss_pred             CceEEEeeCCCCHHhhh-cCCCCEEEEECCEEcCCh--hhHHhhh
Q 016641          341 TGVLVNKINPLSDAHEI-LKKDDIILAFDGVPIAND--GTGSHSM  382 (385)
Q Consensus       341 ~Gv~V~~V~~~spA~~a-L~~GDiI~~vng~~i~~~--~~l~~~l  382 (385)
                      .|++|.+|.++|||+++ |++||+|++|||+++.+.  .++.+.|
T Consensus        13 ~~~~V~~v~~~s~a~~~gl~~GD~I~~Ing~~v~~~~~~~~~~~l   57 (70)
T cd00136          13 GGVVVLSVEPGSPAERAGLQAGDVILAVNGTDVKNLTLEDVAELL   57 (70)
T ss_pred             CCEEEEEeCCCCHHHHcCCCCCCEEEEECCEECCCCCHHHHHHHH
Confidence            48999999999999998 999999999999999998  7777655


No 22 
>TIGR02037 degP_htrA_DO periplasmic serine protease, Do/DegQ family. This family consists of a set of proteins variously designated DegP, heat shock protein HtrA, and protease DO. The ortholog in Pseudomonas aeruginosa is designated MucD and is found in an operon that controls mucoid phenotype. This family also includes the DegQ (HhoA) paralog in E. coli which can rescue a DegP mutant, but not the smaller DegS paralog, which cannot. Members of this family are located in the periplasm and have separable functions as both protease and chaperone. Members have a trypsin domain and two copies of a PDZ domain. This protein protects bacteria from thermal and other stresses and may be important for the survival of bacterial pathogens.// The chaperone function is dominant at low temperatures, whereas the proteolytic activity is turned on at elevated temperatures.
Probab=98.50  E-value=2.1e-07  Score=94.00  Aligned_cols=67  Identities=24%  Similarity=0.386  Sum_probs=60.9

Q ss_pred             eeccCccccccccHHHHhhcCCCCccCceEEEeeCCCCHHhhh-cCCCCEEEEECCEEcCChhhHHhhh
Q 016641          315 FCSLGLSCQTTENVQLRNNFGMRSEVTGVLVNKINPLSDAHEI-LKKDDIILAFDGVPIANDGTGSHSM  382 (385)
Q Consensus       315 ~~~lGi~~~~~~~~~~~~~~g~~~~~~Gv~V~~V~~~spA~~a-L~~GDiI~~vng~~i~~~~~l~~~l  382 (385)
                      ..|+|+.++++ ++..++.++++....|++|.+|.++|||+++ |++||+|++|||++|.+..++.++|
T Consensus       337 ~~~lGi~~~~l-~~~~~~~~~l~~~~~Gv~V~~V~~~SpA~~aGL~~GDvI~~Ing~~V~s~~d~~~~l  404 (428)
T TIGR02037       337 NPFLGLTVANL-SPEIRKELRLKGDVKGVVVTKVVSGSPAARAGLQPGDVILSVNQQPVSSVAELRKVL  404 (428)
T ss_pred             ccccceEEecC-CHHHHHHcCCCcCcCceEEEEeCCCCHHHHcCCCCCCEEEEECCEEcCCHHHHHHHH
Confidence            46899999998 7888888898865689999999999999999 9999999999999999999998876


No 23 
>TIGR01713 typeII_sec_gspC general secretion pathway protein C. This model represents GspC, protein C of the main terminal branch of the general secretion pathway, also called type II secretion. This system transports folded proteins across the bacterial outer membrane and is widely distributed in Gram-negative pathogens.
Probab=98.44  E-value=2.8e-07  Score=86.42  Aligned_cols=74  Identities=18%  Similarity=0.223  Sum_probs=64.0

Q ss_pred             chHHHHHHHHHHcCeeeeeeccCccccccccHHHHhhcCCCCccCceEEEeeCCCCHHhhh-cCCCCEEEEECCEEcCCh
Q 016641          297 PVIKHFITGVVEHGKYVGFCSLGLSCQTTENVQLRNNFGMRSEVTGVLVNKINPLSDAHEI-LKKDDIILAFDGVPIAND  375 (385)
Q Consensus       297 ~~i~~~l~~l~~~g~~~~~~~lGi~~~~~~~~~~~~~~g~~~~~~Gv~V~~V~~~spA~~a-L~~GDiI~~vng~~i~~~  375 (385)
                      ..++++++++.++|++. +.|+|+......        |   ...|++|..+.+++||+++ ||+||+|++|||++|.+.
T Consensus       159 ~~~~~v~~~l~~~g~~~-~~~lgi~p~~~~--------g---~~~G~~v~~v~~~s~a~~aGLr~GDvIv~ING~~i~~~  226 (259)
T TIGR01713       159 VVSRRIIEELTKDPQKM-FDYIRLSPVMKN--------D---KLEGYRLNPGKDPSLFYKSGLQDGDIAVALNGLDLRDP  226 (259)
T ss_pred             hhHHHHHHHHHHCHHhh-hheEeEEEEEeC--------C---ceeEEEEEecCCCCHHHHcCCCCCCEEEEECCEEcCCH
Confidence            46788999999999887 889999875431        1   2479999999999999999 999999999999999999


Q ss_pred             hhHHhhh
Q 016641          376 GTGSHSM  382 (385)
Q Consensus       376 ~~l~~~l  382 (385)
                      .++.+.+
T Consensus       227 ~~~~~~l  233 (259)
T TIGR01713       227 EQAFQAL  233 (259)
T ss_pred             HHHHHHH
Confidence            9987765


No 24 
>cd00989 PDZ_metalloprotease PDZ domain of bacterial and plant zinc metalloproteases, presumably membrane-associated or integral membrane proteases, which may be involved in signalling and regulatory mechanisms. May be responsible for substrate recognition and/or binding, as most PDZ domains bind C-terminal polypeptides, and binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of protease-associated PDZ domains a C-terminal beta-strand forms the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins.
Probab=98.37  E-value=6.5e-07  Score=68.03  Aligned_cols=42  Identities=26%  Similarity=0.275  Sum_probs=38.8

Q ss_pred             CceEEEeeCCCCHHhhh-cCCCCEEEEECCEEcCChhhHHhhh
Q 016641          341 TGVLVNKINPLSDAHEI-LKKDDIILAFDGVPIANDGTGSHSM  382 (385)
Q Consensus       341 ~Gv~V~~V~~~spA~~a-L~~GDiI~~vng~~i~~~~~l~~~l  382 (385)
                      ..++|.+|.++|||+++ |++||+|++|||+++.+..++...|
T Consensus        12 ~~~~V~~v~~~s~a~~~gl~~GD~I~~ing~~i~~~~~~~~~l   54 (79)
T cd00989          12 IEPVIGEVVPGSPAAKAGLKAGDRILAINGQKIKSWEDLVDAV   54 (79)
T ss_pred             cCcEEEeECCCCHHHHcCCCCCCEEEEECCEECCCHHHHHHHH
Confidence            45899999999999998 9999999999999999999998765


No 25 
>cd00986 PDZ_LON_protease PDZ domain of ATP-dependent LON serine proteases. Most PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this bacterial subfamily of protease-associated PDZ domains a C-terminal beta-strand  is thought to form the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins.
Probab=98.36  E-value=6.9e-07  Score=68.18  Aligned_cols=43  Identities=28%  Similarity=0.278  Sum_probs=39.8

Q ss_pred             CceEEEeeCCCCHHhhhcCCCCEEEEECCEEcCChhhHHhhhc
Q 016641          341 TGVLVNKINPLSDAHEILKKDDIILAFDGVPIANDGTGSHSML  383 (385)
Q Consensus       341 ~Gv~V~~V~~~spA~~aL~~GDiI~~vng~~i~~~~~l~~~l~  383 (385)
                      .|++|.+|.++|||+..|++||+|++|||++|.+..++.+.|.
T Consensus         8 ~Gv~V~~V~~~s~A~~gL~~GD~I~~Ing~~v~~~~~~~~~l~   50 (79)
T cd00986           8 HGVYVTSVVEGMPAAGKLKAGDHIIAVDGKPFKEAEELIDYIQ   50 (79)
T ss_pred             cCEEEEEECCCCchhhCCCCCCEEEEECCEECCCHHHHHHHHH
Confidence            6999999999999987799999999999999999999987764


No 26 
>PF00863 Peptidase_C4:  Peptidase family C4 This family belongs to family C4 of the peptidase classification.;  InterPro: IPR001730 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. 
These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad.  Nuclear inclusion A (NIA) proteases from potyviruses are cysteine peptidases belonging to the MEROPS peptidase family C4 (NIa protease family, clan PA(C)).  Potyviruses include plant viruses in which the single-stranded RNA encodes a polyprotein with NIA protease activity, where proteolytic cleavage is specific for Gln+Gly sites. The NIA protease acts on the polyprotein, releasing itself by Gln+Gly cleavage at both the N- and C-termini. It further processes the polyprotein by cleavage at five similar sites in the C-terminal half of the sequence. In addition to its C-terminal protease activity, the NIA protease contains an N-terminal domain that has been implicated in the transcription process. This peptidase is present in the nuclear inclusion protein of potyviruses.; GO: 0008234 cysteine-type peptidase activity, 0006508 proteolysis; PDB: 3MMG_B 1Q31_B 1LVB_A 1LVM_A.
Probab=98.31  E-value=3.3e-05  Score=70.81  Aligned_cols=143  Identities=17%  Similarity=0.200  Sum_probs=72.0

Q ss_pred             CCEEEecccccC-CCceEEEEEcCCCcEEEEE-----EEEecCCCCeEEEEecCCcccccceeeecC---CcccCCCeEE
Q 016641          149 GKKILTNAHVVA-DSTFVLVRKHGSPTKYRAQ-----VEAVGHECDLAILIVESDEFWEGMHFLELG---DIPFLQQAVA  219 (385)
Q Consensus       149 ~g~ILT~aHvv~-~~~~i~V~~~~~g~~~~a~-----v~~~d~~~DlAlLkv~~~~~~~~~~~l~l~---~~~~~G~~V~  219 (385)
                      ..||+|++|... +...++|... -|. |...     -+..-+..||.++|+..+     +||.+-.   ..+..+|+|.
T Consensus        40 G~~iItn~HLf~~nng~L~i~s~-hG~-f~v~nt~~lkv~~i~~~DiviirmPkD-----fpPf~~kl~FR~P~~~e~v~  112 (235)
T PF00863_consen   40 GSYIITNAHLFKRNNGELTIKSQ-HGE-FTVPNTTQLKVHPIEGRDIVIIRMPKD-----FPPFPQKLKFRAPKEGERVC  112 (235)
T ss_dssp             TTEEEEEGGGGSSTTCEEEEEET-TEE-EEECEGGGSEEEE-TCSSEEEEE--TT-----S----S---B----TT-EEE
T ss_pred             CCEEEEChhhhccCCCeEEEEeC-ceE-EEcCCccccceEEeCCccEEEEeCCcc-----cCCcchhhhccCCCCCCEEE
Confidence            779999999996 4456777764 332 2221     133345899999999874     3443322   3456799999


Q ss_pred             EEecCCCCCCceEEEeeEeecccccccCCCceeeEEEecccCCCCCCCccee--eCCEEEEEEeeecCCCCceEEEEec-
Q 016641          220 VVGYPQGGDNISVTKGVVSRVEPTQYVHGATQLMAIQIDAAINPGNSGGPAI--MGNKVAGVAFQNLSGAENIGYIIPV-  296 (385)
Q Consensus       220 ~iG~p~~~~~~~~~~G~Vs~~~~~~~~~~~~~~~~i~~d~~i~~G~SGGPL~--~~G~vVGI~s~~~~~~~~~~~aip~-  296 (385)
                      ++|.-+.......   .||.........+.   .+...-.....|+-|.||+  .||.+|||++..... ...+|+.|+ 
T Consensus       113 mVg~~fq~k~~~s---~vSesS~i~p~~~~---~fWkHwIsTk~G~CG~PlVs~~Dg~IVGiHsl~~~~-~~~N~F~~f~  185 (235)
T PF00863_consen  113 MVGSNFQEKSISS---TVSESSWIYPEENS---HFWKHWISTKDGDCGLPLVSTKDGKIVGIHSLTSNT-SSRNYFTPFP  185 (235)
T ss_dssp             EEEEECSSCCCEE---EEEEEEEEEEETTT---TEEEE-C---TT-TT-EEEETTT--EEEEEEEEETT-TSSEEEEE--
T ss_pred             EEEEEEEcCCeeE---EECCceEEeecCCC---CeeEEEecCCCCccCCcEEEcCCCcEEEEEcCccCC-CCeEEEEcCC
Confidence            9998654432222   22222111111111   2344555567899999999  899999999986542 344566554 


Q ss_pred             -chHHHHHHH
Q 016641          297 -PVIKHFITG  305 (385)
Q Consensus       297 -~~i~~~l~~  305 (385)
                       +.+..+++.
T Consensus       186 ~~f~~~~l~~  195 (235)
T PF00863_consen  186 DDFEEFYLEN  195 (235)
T ss_dssp             TTHHHHHCC-
T ss_pred             HHHHHHHhcc
Confidence             334444433


No 27 
>cd00992 PDZ_signaling PDZ domain found in a variety of Eumetazoan signaling molecules, often in tandem arrangements. May be responsible for specific protein-protein interactions, as most PDZ domains bind C-terminal polypeptides, and binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of PDZ domains an N-terminal beta-strand forms the peptide-binding groove base, a circular permutation with respect to PDZ domains found in proteases.
Probab=98.21  E-value=2.6e-06  Score=65.04  Aligned_cols=56  Identities=23%  Similarity=0.396  Sum_probs=45.9

Q ss_pred             eccCccccccccHHHHhhcCCCCccCceEEEeeCCCCHHhhh-cCCCCEEEEECCEEcC--ChhhHHhhh
Q 016641          316 CSLGLSCQTTENVQLRNNFGMRSEVTGVLVNKINPLSDAHEI-LKKDDIILAFDGVPIA--NDGTGSHSM  382 (385)
Q Consensus       316 ~~lGi~~~~~~~~~~~~~~g~~~~~~Gv~V~~V~~~spA~~a-L~~GDiI~~vng~~i~--~~~~l~~~l  382 (385)
                      ..+|+.++...+           ...|++|.+|.++|||+++ |++||+|++|||+++.  +..++.+.+
T Consensus        12 ~~~G~~~~~~~~-----------~~~~~~V~~v~~~s~a~~~gl~~GD~I~~ing~~i~~~~~~~~~~~l   70 (82)
T cd00992          12 GGLGFSLRGGKD-----------SGGGIFVSRVEPGGPAERGGLRVGDRILEVNGVSVEGLTHEEAVELL   70 (82)
T ss_pred             CCcCEEEeCccc-----------CCCCeEEEEECCCChHHhCCCCCCCEEEEECCEEcCccCHHHHHHHH
Confidence            457877765421           0268999999999999999 9999999999999999  888877665


No 28 
>cd00988 PDZ_CTP_protease PDZ domain of C-terminal processing-, tail-specific-, and tricorn proteases, which function in posttranslational protein processing, maturation, and disassembly or degradation, in Bacteria, Archaea, and plant chloroplasts. May be responsible for substrate recognition and/or binding, as most PDZ domains bind C-terminal polypeptides, and binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of protease-associated PDZ domains a C-terminal beta-strand forms the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins.
Probab=98.16  E-value=2.8e-06  Score=65.54  Aligned_cols=42  Identities=29%  Similarity=0.442  Sum_probs=38.8

Q ss_pred             CceEEEeeCCCCHHhhh-cCCCCEEEEECCEEcCCh--hhHHhhh
Q 016641          341 TGVLVNKINPLSDAHEI-LKKDDIILAFDGVPIAND--GTGSHSM  382 (385)
Q Consensus       341 ~Gv~V~~V~~~spA~~a-L~~GDiI~~vng~~i~~~--~~l~~~l  382 (385)
                      .+++|..|.++|||+++ |++||+|++|||+++.+.  .++.+.+
T Consensus        13 ~~~~V~~v~~~s~a~~~gl~~GD~I~~vng~~i~~~~~~~~~~~l   57 (85)
T cd00988          13 GGLVITSVLPGSPAAKAGIKAGDIIVAIDGEPVDGLSLEDVVKLL   57 (85)
T ss_pred             CeEEEEEecCCCCHHHcCCCCCCEEEEECCEEcCCCCHHHHHHHh
Confidence            68999999999999999 999999999999999998  8887655


No 29 
>PF00595 PDZ:  PDZ domain (Also known as DHR or GLGF) Coordinates are not yet available;  InterPro: IPR001478 PDZ domains are found in diverse signalling proteins in bacteria, yeasts, plants, insects and vertebrates. PDZ domains can occur in one or multiple copies and are nearly always found in cytoplasmic proteins. They bind either the carboxyl-terminal sequences of proteins or internal peptide sequences. In most cases, interaction between a PDZ domain and its target is constitutive, with a binding affinity of 1 to 10 micromolar. However, agonist-dependent activation of cell surface receptors is sometimes required to promote interaction with a PDZ protein. PDZ domain proteins are frequently associated with the plasma membrane, a compartment where high concentrations of phosphatidylinositol 4,5-bisphosphate (PIP2) are found. Direct interaction between PIP2 and a subset of class II PDZ domains (syntenin, CASK, Tiam-1) has been demonstrated.  PDZ domains consist of 80 to 90 amino acids comprising six beta-strands (beta-A to beta-F) and two alpha-helices, A and B, compactly arranged in a globular structure. Peptide binding of the ligand takes place in an elongated surface groove as an anti-parallel beta-strand interacts with the beta-B strand and the B helix. The structure of PDZ domains allows binding to a free carboxylate group at the end of a peptide through a carboxylate-binding loop between the beta-A and beta-B strands.; GO: 0005515 protein binding; PDB: 3AXA_A 1WF8_A 1QAV_B 1QAU_A 1B8Q_A 1MC7_A 2KAW_A 1I16_A 1VB7_A 1WI4_A ....
Probab=97.93  E-value=1.2e-05  Score=61.55  Aligned_cols=53  Identities=26%  Similarity=0.382  Sum_probs=42.5

Q ss_pred             eccCccccccccHHHHhhcCCCCccCceEEEeeCCCCHHhhh-cCCCCEEEEECCEEcCChhhH
Q 016641          316 CSLGLSCQTTENVQLRNNFGMRSEVTGVLVNKINPLSDAHEI-LKKDDIILAFDGVPIANDGTG  378 (385)
Q Consensus       316 ~~lGi~~~~~~~~~~~~~~g~~~~~~Gv~V~~V~~~spA~~a-L~~GDiI~~vng~~i~~~~~l  378 (385)
                      ..+||.+....+.          ...|++|.+|.++|||+++ |++||.|++|||+.+.+....
T Consensus        10 ~~lG~~l~~~~~~----------~~~~~~V~~v~~~~~a~~~gl~~GD~Il~INg~~v~~~~~~   63 (81)
T PF00595_consen   10 GPLGFTLRGGSDN----------DEKGVFVSSVVPGSPAERAGLKVGDRILEINGQSVRGMSHD   63 (81)
T ss_dssp             SBSSEEEEEESTS----------SSEEEEEEEECTTSHHHHHTSSTTEEEEEETTEESTTSBHH
T ss_pred             CCcCEEEEecCCC----------CcCCEEEEEEeCCChHHhcccchhhhhheeCCEeCCCCCHH
Confidence            4578887654210          0259999999999999999 999999999999999987443


No 30 
>smart00228 PDZ Domain present in PSD-95, Dlg, and ZO-1/2. Also called DHR (Dlg homologous region) or GLGF (relatively well conserved tetrapeptide in these domains). Some PDZs have been shown to bind C-terminal polypeptides; others appear to bind internal (non-C-terminal) polypeptides. Different PDZs possess different binding specificities.
Probab=97.91  E-value=1.8e-05  Score=60.48  Aligned_cols=41  Identities=29%  Similarity=0.371  Sum_probs=36.7

Q ss_pred             CceEEEeeCCCCHHhhh-cCCCCEEEEECCEEcCChhhHHhh
Q 016641          341 TGVLVNKINPLSDAHEI-LKKDDIILAFDGVPIANDGTGSHS  381 (385)
Q Consensus       341 ~Gv~V~~V~~~spA~~a-L~~GDiI~~vng~~i~~~~~l~~~  381 (385)
                      .|++|..|.++|||+++ |++||+|++|||+++.+..+....
T Consensus        26 ~~~~i~~v~~~s~a~~~gl~~GD~I~~In~~~v~~~~~~~~~   67 (85)
T smart00228       26 GGVVVSSVVPGSPAAKAGLKVGDVILEVNGTSVEGLTHLEAV   67 (85)
T ss_pred             CCEEEEEECCCCHHHHcCCCCCCEEEEECCEECCCCCHHHHH
Confidence            68999999999999999 999999999999999987655443


No 31 
>COG5640 Secreted trypsin-like serine protease [Posttranslational modification, protein turnover, chaperones]
Probab=97.89  E-value=0.00022  Score=68.21  Aligned_cols=50  Identities=24%  Similarity=0.355  Sum_probs=34.0

Q ss_pred             ccCCCCCCCccee-eC--C-EEEEEEeeecCCCCc---eEEEEecchHHHHHHHHHH
Q 016641          259 AAINPGNSGGPAI-MG--N-KVAGVAFQNLSGAEN---IGYIIPVPVIKHFITGVVE  308 (385)
Q Consensus       259 ~~i~~G~SGGPL~-~~--G-~vVGI~s~~~~~~~~---~~~aip~~~i~~~l~~l~~  308 (385)
                      ...|+||||||++ ..  | .-+||++++.....+   .+...-++....|++...+
T Consensus       223 ~daCqGDSGGPi~~~g~~G~vQ~GVvSwG~~~Cg~t~~~gVyT~vsny~~WI~a~~~  279 (413)
T COG5640         223 KDACQGDSGGPIFHKGEEGRVQRGVVSWGDGGCGGTLIPGVYTNVSNYQDWIAAMTN  279 (413)
T ss_pred             cccccCCCCCceEEeCCCccEEEeEEEecCCCCCCCCcceeEEehhHHHHHHHHHhc
Confidence            4569999999999 43  4 489999997663222   2333446677778777553


No 32 
>PF12812 PDZ_1:  PDZ-like domain
Probab=97.73  E-value=7.9e-05  Score=56.73  Aligned_cols=64  Identities=17%  Similarity=0.114  Sum_probs=56.7

Q ss_pred             eccCccccccccHHHHhhcCCCCccCceEEEeeCCCCHHhhh-cCCCCEEEEECCEEcCChhhHHhhhc
Q 016641          316 CSLGLSCQTTENVQLRNNFGMRSEVTGVLVNKINPLSDAHEI-LKKDDIILAFDGVPIANDGTGSHSML  383 (385)
Q Consensus       316 ~~lGi~~~~~~~~~~~~~~g~~~~~~Gv~V~~V~~~spA~~a-L~~GDiI~~vng~~i~~~~~l~~~l~  383 (385)
                      -|.|..++++ +-+.+++++++-   |+++.....++++... +..|-||.+|||+++.+.++|.+++-
T Consensus         9 ~~~Ga~f~~L-s~q~aR~~~~~~---~gv~v~~~~g~~~~~~~i~~g~iI~~Vn~kpt~~Ld~f~~vvk   73 (78)
T PF12812_consen    9 EVCGAVFHDL-SYQQARQYGIPV---GGVYVAVSGGSLAFAGGISKGFIITSVNGKPTPDLDDFIKVVK   73 (78)
T ss_pred             EEcCeecccC-CHHHHHHhCCCC---CEEEEEecCCChhhhCCCCCCeEEEeECCcCCcCHHHHHHHHH
Confidence            5799999999 788999999984   4666677899999998 99999999999999999999998864


No 33 
>TIGR00054 RIP metalloprotease RseP. A model that also detects fragments matches a number of members of the peptidase family S2C. The region of match appears not to overlap the active site domain.
Probab=97.63  E-value=6.6e-05  Score=75.64  Aligned_cols=42  Identities=19%  Similarity=0.249  Sum_probs=39.8

Q ss_pred             CceEEEeeCCCCHHhhh-cCCCCEEEEECCEEcCChhhHHhhh
Q 016641          341 TGVLVNKINPLSDAHEI-LKKDDIILAFDGVPIANDGTGSHSM  382 (385)
Q Consensus       341 ~Gv~V~~V~~~spA~~a-L~~GDiI~~vng~~i~~~~~l~~~l  382 (385)
                      .|++|.+|.++|||+++ ||+||+|++|||++|.+.+|+.+.+
T Consensus       203 ~g~vV~~V~~~SpA~~aGL~~GD~Iv~Vng~~V~s~~dl~~~l  245 (420)
T TIGR00054       203 IEPVLSDVTPNSPAEKAGLKEGDYIQSINGEKLRSWTDFVSAV  245 (420)
T ss_pred             cCcEEEEECCCCHHHHcCCCCCCEEEEECCEECCCHHHHHHHH
Confidence            58999999999999999 9999999999999999999998776


No 34 
>PF03761 DUF316:  Domain of unknown function (DUF316) ;  InterPro: IPR005514 This is a family of uncharacterised proteins from Caenorhabditis elegans.
Probab=97.62  E-value=0.004  Score=59.20  Aligned_cols=106  Identities=21%  Similarity=0.261  Sum_probs=64.3

Q ss_pred             CCCeEEEEecCCcccccceeeecCCcc---cCCCeEEEEecCCCCCCceEEEeeEeecccccccCCCceeeEEEecccCC
Q 016641          186 ECDLAILIVESDEFWEGMHFLELGDIP---FLQQAVAVVGYPQGGDNISVTKGVVSRVEPTQYVHGATQLMAIQIDAAIN  262 (385)
Q Consensus       186 ~~DlAlLkv~~~~~~~~~~~l~l~~~~---~~G~~V~~iG~p~~~~~~~~~~G~Vs~~~~~~~~~~~~~~~~i~~d~~i~  262 (385)
                      ..+++||.++.+ +.....|+.|+++.   ..++.+.+.|+....   .+....+.-.....      ....+......+
T Consensus       160 ~~~~mIlEl~~~-~~~~~~~~Cl~~~~~~~~~~~~~~~yg~~~~~---~~~~~~~~i~~~~~------~~~~~~~~~~~~  229 (282)
T PF03761_consen  160 PYSPMILELEED-FSKNVSPPCLADSSTNWEKGDEVDVYGFNSTG---KLKHRKLKITNCTK------CAYSICTKQYSC  229 (282)
T ss_pred             ccceEEEEEccc-ccccCCCEEeCCCccccccCceEEEeecCCCC---eEEEEEEEEEEeec------cceeEecccccC
Confidence            579999999887 33478899999865   238889888882221   22222222111100      112355566778


Q ss_pred             CCCCCccee--eCCE--EEEEEeeecCCC-CceEEEEecchHHH
Q 016641          263 PGNSGGPAI--MGNK--VAGVAFQNLSGA-ENIGYIIPVPVIKH  301 (385)
Q Consensus       263 ~G~SGGPL~--~~G~--vVGI~s~~~~~~-~~~~~aip~~~i~~  301 (385)
                      .|++||||+  .+|+  ||||.+...... .+..+++.+...++
T Consensus       230 ~~d~Gg~lv~~~~gr~tlIGv~~~~~~~~~~~~~~f~~v~~~~~  273 (282)
T PF03761_consen  230 KGDRGGPLVKNINGRWTLIGVGASGNYECNKNNSYFFNVSWYQD  273 (282)
T ss_pred             CCCccCeEEEEECCCEEEEEEEccCCCcccccccEEEEHHHhhh
Confidence            999999999  5664  999987643211 12455555544443


No 35 
>PRK10779 zinc metallopeptidase RseP; Provisional
Probab=97.62  E-value=4.9e-05  Score=77.27  Aligned_cols=40  Identities=20%  Similarity=0.222  Sum_probs=37.3

Q ss_pred             eEEEeeCCCCHHhhh-cCCCCEEEEECCEEcCChhhHHhhh
Q 016641          343 VLVNKINPLSDAHEI-LKKDDIILAFDGVPIANDGTGSHSM  382 (385)
Q Consensus       343 v~V~~V~~~spA~~a-L~~GDiI~~vng~~i~~~~~l~~~l  382 (385)
                      .+|.+|.++|||++| ||+||+|+++||++|.+.+|+...+
T Consensus       128 ~lV~~V~~~SpA~kAGLk~GDvI~~vnG~~V~~~~~l~~~v  168 (449)
T PRK10779        128 PVVGEIAPNSIAAQAQIAPGTELKAVDGIETPDWDAVRLAL  168 (449)
T ss_pred             ccccccCCCCHHHHcCCCCCCEEEEECCEEcCCHHHHHHHH
Confidence            479999999999999 9999999999999999999997654


No 36 
>PF05579 Peptidase_S32:  Equine arteritis virus serine endopeptidase S32;  InterPro: IPR008760 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. 
They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Not withstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This group of serine peptidases belong to MEROPS peptidase family S32 (clan PA(S)). The type example is equine arteritis virus serine endopeptidase (equine arteritis virus), which is involved in processing of nidovirus polyproteins [].; GO: 0004252 serine-type endopeptidase activity, 0016032 viral reproduction, 0019082 viral protein processing; PDB: 3FAN_A 3FAO_A 1MBM_A.
Probab=97.61  E-value=0.00062  Score=62.73  Aligned_cols=113  Identities=18%  Similarity=0.194  Sum_probs=59.0

Q ss_pred             eEEEEEec---CCEEEecccccCCCceEEEEEcCCCcEEEEEEEEecCCCCeEEEEecCCcccccceeeecCCcccCCCe
Q 016641          141 TGSGFVIP---GKKILTNAHVVADSTFVLVRKHGSPTKYRAQVEAVGHECDLAILIVESDEFWEGMHFLELGDIPFLQQA  217 (385)
Q Consensus       141 ~GSGfiI~---~g~ILT~aHvv~~~~~i~V~~~~~g~~~~a~v~~~d~~~DlAlLkv~~~~~~~~~~~l~l~~~~~~G~~  217 (385)
                      .|||-++.   +-.|+|+.||+. ....+|..  .+.....+   .+..-|+|.-.++.-.  ...|.+++++. ..|. 
T Consensus       113 ~Gsggvft~~~~~vvvTAtHVlg-~~~a~v~~--~g~~~~~t---F~~~GDfA~~~~~~~~--G~~P~~k~a~~-~~Gr-  182 (297)
T PF05579_consen  113 VGSGGVFTIGGNTVVVTATHVLG-GNTARVSG--VGTRRMLT---FKKNGDFAEADITNWP--GAAPKYKFAQN-YTGR-  182 (297)
T ss_dssp             EEEEEEEECTTEEEEEEEHHHCB-TTEEEEEE--TTEEEEEE---EEEETTEEEEEETTS---S---B--B-TT--SEE-
T ss_pred             ccccceEEECCeEEEEEEEEEcC-CCeEEEEe--cceEEEEE---EeccCcEEEEECCCCC--CCCCceeecCC-cccc-
Confidence            45555554   448999999998 55666664  33433333   3446799998884321  25677776621 1121 


Q ss_pred             EEEEecCCCCCCceEEEeeEeecccccccCCCceeeEEEecccCCCCCCCccee-eCCEEEEEEeeec
Q 016641          218 VAVVGYPQGGDNISVTKGVVSRVEPTQYVHGATQLMAIQIDAAINPGNSGGPAI-MGNKVAGVAFQNL  284 (385)
Q Consensus       218 V~~iG~p~~~~~~~~~~G~Vs~~~~~~~~~~~~~~~~i~~d~~i~~G~SGGPL~-~~G~vVGI~s~~~  284 (385)
                      .+      -.-.--+..|.|..-.+            +.   -..+||||+|++ .+|.+|||++...
T Consensus       183 Ay------W~t~tGvE~G~ig~~~~------------~~---fT~~GDSGSPVVt~dg~liGVHTGSn  229 (297)
T PF05579_consen  183 AY------WLTSTGVEPGFIGGGGA------------VC---FTGPGDSGSPVVTEDGDLIGVHTGSN  229 (297)
T ss_dssp             EE------EEETTEEEEEEEETTEE------------EE---SS-GGCTT-EEEETTC-EEEEEEEEE
T ss_pred             eE------EEcccCcccceecCceE------------EE---EcCCCCCCCccCcCCCCEEEEEecCC
Confidence            11      00011334444432111            11   135799999999 9999999999753


No 37 
>PRK10139 serine endoprotease; Provisional
Probab=97.57  E-value=9.6e-05  Score=75.12  Aligned_cols=42  Identities=19%  Similarity=0.312  Sum_probs=40.2

Q ss_pred             CceEEEeeCCCCHHhhh-cCCCCEEEEECCEEcCChhhHHhhh
Q 016641          341 TGVLVNKINPLSDAHEI-LKKDDIILAFDGVPIANDGTGSHSM  382 (385)
Q Consensus       341 ~Gv~V~~V~~~spA~~a-L~~GDiI~~vng~~i~~~~~l~~~l  382 (385)
                      .|++|.+|.++|||+++ ||+||+|++|||++|.+..+|.+.|
T Consensus       390 ~Gv~V~~V~~~spA~~aGL~~GD~I~~Ing~~v~~~~~~~~~l  432 (455)
T PRK10139        390 KGIKIDEVVKGSPAAQAGLQKDDVIIGVNRDRVNSIAEMRKVL  432 (455)
T ss_pred             CceEEEEeCCCChHHHcCCCCCCEEEEECCEEcCCHHHHHHHH
Confidence            58999999999999999 9999999999999999999998876


No 38 
>PF00548 Peptidase_C3:  3C cysteine protease (picornain 3C);  InterPro: IPR000199 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. 
Cysteine proteases are divided into clans (proteins which are evolutionary related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad [].  This signature defines cysteine peptidases belong to MEROPS peptidase family C3 (picornain, clan PA(C)), subfamilies C3A and C3B. The protein fold of this peptidase domain for members of this family resembles that of the serine peptidase, chymotrypsin [], the type example for clan PA. Picornaviral proteins are expressed as a single polyprotein which is cleaved by the viral C3 cysteine protease. The poliovirus polyprotein is selectively cleaved between the Gln-|-Gly bond. In other picornavirus reactions Glu may be substituted for Gln, and Ser or Thr for Gly. ; GO: 0004197 cysteine-type endopeptidase activity, 0006508 proteolysis; PDB: 3SJO_E 2H6M_A 1QA7_C 1HAV_B 2HAL_A 2H9H_A 3QZQ_B 3QZR_A 3R0F_B 3SJ9_A ....
Probab=97.54  E-value=0.0018  Score=57.11  Aligned_cols=138  Identities=20%  Similarity=0.301  Sum_probs=80.6

Q ss_pred             CcceEEEEEecCCEEEecccccCCCceEEEEEcCCCcEEEE--EEEEecC---CCCeEEEEecCCc-ccccceeeecCCc
Q 016641          138 RETTGSGFVIPGKKILTNAHVVADSTFVLVRKHGSPTKYRA--QVEAVGH---ECDLAILIVESDE-FWEGMHFLELGDI  211 (385)
Q Consensus       138 ~~~~GSGfiI~~g~ILT~aHvv~~~~~i~V~~~~~g~~~~a--~v~~~d~---~~DlAlLkv~~~~-~~~~~~~l~l~~~  211 (385)
                      ....++++.|.+.|+|...| -.....+.  +  ++..++.  .+...+.   ..||++++++... |.+-.+.+. ...
T Consensus        23 g~~t~l~~gi~~~~~lvp~H-~~~~~~i~--i--~g~~~~~~d~~~lv~~~~~~~Dl~~v~l~~~~kfrDIrk~~~-~~~   96 (172)
T PF00548_consen   23 GEFTMLALGIYDRYFLVPTH-EEPEDTIY--I--DGVEYKVDDSVVLVDRDGVDTDLTLVKLPRNPKFRDIRKFFP-ESI   96 (172)
T ss_dssp             EEEEEEEEEEEBTEEEEEGG-GGGCSEEE--E--TTEEEEEEEEEEEEETTSSEEEEEEEEEESSS-B--GGGGSB-SSG
T ss_pred             ceEEEecceEeeeEEEEECc-CCCcEEEE--E--CCEEEEeeeeEEEecCCCcceeEEEEEccCCcccCchhhhhc-ccc
Confidence            45678899999999999999 22233333  3  3444433  2223343   4699999997743 322223333 112


Q ss_pred             ccCCCeEEEEecCCCCCCceEEEeeEeecccccccCCCceeeEEEecccCCCCCCCcceee----CCEEEEEEeee
Q 016641          212 PFLQQAVAVVGYPQGGDNISVTKGVVSRVEPTQYVHGATQLMAIQIDAAINPGNSGGPAIM----GNKVAGVAFQN  283 (385)
Q Consensus       212 ~~~G~~V~~iG~p~~~~~~~~~~G~Vs~~~~~~~~~~~~~~~~i~~d~~i~~G~SGGPL~~----~G~vVGI~s~~  283 (385)
                      +...+...++=.+. ........+.+...... ...+......+..+++..+|+-||||+.    .++++||+.++
T Consensus        97 ~~~~~~~l~v~~~~-~~~~~~~v~~v~~~~~i-~~~g~~~~~~~~Y~~~t~~G~CG~~l~~~~~~~~~i~GiHvaG  170 (172)
T PF00548_consen   97 PEYPECVLLVNSTK-FPRMIVEVGFVTNFGFI-NLSGTTTPRSLKYKAPTKPGMCGSPLVSRIGGQGKIIGIHVAG  170 (172)
T ss_dssp             GTEEEEEEEEESSS-STCEEEEEEEEEEEEEE-EETTEEEEEEEEEESEEETTGTTEEEEESCGGTTEEEEEEEEE
T ss_pred             ccCCCcEEEEECCC-CccEEEEEEEEeecCcc-ccCCCEeeEEEEEccCCCCCccCCeEEEeeccCccEEEEEecc
Confidence            23344444443332 22223444444443332 2223333457888888889999999995    68999999875


No 39 
>TIGR00054 RIP metalloprotease RseP. A model that detects fragments as well matches a number of members of the PEPTIDASE FAMILY S2C. The region of match appears not to overlap the active site domain.
Probab=97.53  E-value=9.6e-05  Score=74.46  Aligned_cols=43  Identities=21%  Similarity=0.239  Sum_probs=40.5

Q ss_pred             CceEEEeeCCCCHHhhh-cCCCCEEEEECCEEcCChhhHHhhhc
Q 016641          341 TGVLVNKINPLSDAHEI-LKKDDIILAFDGVPIANDGTGSHSML  383 (385)
Q Consensus       341 ~Gv~V~~V~~~spA~~a-L~~GDiI~~vng~~i~~~~~l~~~l~  383 (385)
                      .|.+|.+|.++|||+++ ||+||+|+++||+++.+..|+...+.
T Consensus       128 ~g~~V~~V~~~SpA~~AGL~~GDvI~~vng~~v~~~~dl~~~ia  171 (420)
T TIGR00054       128 VGPVIELLDKNSIALEAGIEPGDEILSVNGNKIPGFKDVRQQIA  171 (420)
T ss_pred             CCceeeccCCCCHHHHcCCCCCCEEEEECCEEcCCHHHHHHHHH
Confidence            68999999999999999 99999999999999999999987664


No 40 
>PRK10942 serine endoprotease; Provisional
Probab=97.53  E-value=0.00011  Score=75.03  Aligned_cols=42  Identities=31%  Similarity=0.510  Sum_probs=40.0

Q ss_pred             CceEEEeeCCCCHHhhh-cCCCCEEEEECCEEcCChhhHHhhh
Q 016641          341 TGVLVNKINPLSDAHEI-LKKDDIILAFDGVPIANDGTGSHSM  382 (385)
Q Consensus       341 ~Gv~V~~V~~~spA~~a-L~~GDiI~~vng~~i~~~~~l~~~l  382 (385)
                      .|++|.+|.++|||+++ |++||+|++|||++|.+..+|.+++
T Consensus       408 ~gvvV~~V~~~S~A~~aGL~~GDvIv~VNg~~V~s~~dl~~~l  450 (473)
T PRK10942        408 KGVVVDNVKPGTPAAQIGLKKGDVIIGANQQPVKNIAELRKIL  450 (473)
T ss_pred             CCeEEEEeCCCChHHHcCCCCCCEEEEECCEEcCCHHHHHHHH
Confidence            58999999999999999 9999999999999999999998865


No 41 
>PRK10779 zinc metallopeptidase RseP; Provisional
Probab=97.46  E-value=0.00015  Score=73.78  Aligned_cols=42  Identities=24%  Similarity=0.303  Sum_probs=39.1

Q ss_pred             CceEEEeeCCCCHHhhh-cCCCCEEEEECCEEcCChhhHHhhh
Q 016641          341 TGVLVNKINPLSDAHEI-LKKDDIILAFDGVPIANDGTGSHSM  382 (385)
Q Consensus       341 ~Gv~V~~V~~~spA~~a-L~~GDiI~~vng~~i~~~~~l~~~l  382 (385)
                      .+++|.+|.++|||+++ ||+||+|++|||++|.+..|+.+.+
T Consensus       221 ~~~vV~~V~~~SpA~~AGL~~GDvIl~Ing~~V~s~~dl~~~l  263 (449)
T PRK10779        221 IEPVLAEVQPNSAASKAGLQAGDRIVKVDGQPLTQWQTFVTLV  263 (449)
T ss_pred             cCcEEEeeCCCCHHHHcCCCCCCEEEEECCEEcCCHHHHHHHH
Confidence            35899999999999999 9999999999999999999998765


No 42 
>KOG3553 consensus Tax interaction protein TIP1 [Cell wall/membrane/envelope biogenesis]
Probab=97.40  E-value=0.00017  Score=56.26  Aligned_cols=37  Identities=27%  Similarity=0.383  Sum_probs=32.7

Q ss_pred             cCCCCccCceEEEeeCCCCHHhhh-cCCCCEEEEECCEEc
Q 016641          334 FGMRSEVTGVLVNKINPLSDAHEI-LKKDDIILAFDGVPI  372 (385)
Q Consensus       334 ~g~~~~~~Gv~V~~V~~~spA~~a-L~~GDiI~~vng~~i  372 (385)
                      ++.+  ..|+||++|.++|||+.| |+.+|-|+.|||-..
T Consensus        54 f~yt--D~GiYvT~V~eGsPA~~AGLrihDKIlQvNG~Df   91 (124)
T KOG3553|consen   54 FSYT--DKGIYVTRVSEGSPAEIAGLRIHDKILQVNGWDF   91 (124)
T ss_pred             CCcC--CccEEEEEeccCChhhhhcceecceEEEecCcee
Confidence            4555  489999999999999999 999999999999653


No 43 
>TIGR00225 prc C-terminal peptidase (prc). A C-terminal peptidase with different substrates in different species including processing of D1 protein of the photosystem II reaction center in higher plants and cleavage of a peptide of 11 residues from the precursor form of penicillin-binding protein in E. coli. E. coli and H. influenzae have the most distal branch of the tree, and their proteins have an N-terminal 200 amino acids that show no homology to other proteins in the database.
Probab=97.36  E-value=0.00024  Score=69.41  Aligned_cols=41  Identities=17%  Similarity=0.170  Sum_probs=35.8

Q ss_pred             CceEEEeeCCCCHHhhh-cCCCCEEEEECCEEcCCh--hhHHhh
Q 016641          341 TGVLVNKINPLSDAHEI-LKKDDIILAFDGVPIAND--GTGSHS  381 (385)
Q Consensus       341 ~Gv~V~~V~~~spA~~a-L~~GDiI~~vng~~i~~~--~~l~~~  381 (385)
                      .+++|.+|.++|||+++ ||+||+|++|||++|.+.  .++...
T Consensus        62 ~~~~V~~V~~~spA~~aGL~~GD~I~~Ing~~v~~~~~~~~~~~  105 (334)
T TIGR00225        62 GEIVIVSPFEGSPAEKAGIKPGDKIIKINGKSVAGMSLDDAVAL  105 (334)
T ss_pred             CEEEEEEeCCCChHHHcCCCCCCEEEEECCEECCCCCHHHHHHh
Confidence            58999999999999999 999999999999999875  454443


No 44 
>PLN00049 carboxyl-terminal processing protease; Provisional
Probab=97.21  E-value=0.00047  Score=68.77  Aligned_cols=35  Identities=29%  Similarity=0.425  Sum_probs=32.9

Q ss_pred             CceEEEeeCCCCHHhhh-cCCCCEEEEECCEEcCCh
Q 016641          341 TGVLVNKINPLSDAHEI-LKKDDIILAFDGVPIAND  375 (385)
Q Consensus       341 ~Gv~V~~V~~~spA~~a-L~~GDiI~~vng~~i~~~  375 (385)
                      .|++|..|.++|||+++ |++||+|++|||++|.+.
T Consensus       102 ~g~~V~~V~~~SPA~~aGl~~GD~Iv~InG~~v~~~  137 (389)
T PLN00049        102 AGLVVVAPAPGGPAARAGIRPGDVILAIDGTSTEGL  137 (389)
T ss_pred             CcEEEEEeCCCChHHHcCCCCCCEEEEECCEECCCC
Confidence            48999999999999999 999999999999999864


No 45 
>PF14685 Tricorn_PDZ:  Tricorn protease PDZ domain; PDB: 1N6F_D 1N6D_C 1N6E_C 1K32_A.
Probab=97.05  E-value=0.00094  Score=51.97  Aligned_cols=43  Identities=26%  Similarity=0.407  Sum_probs=30.6

Q ss_pred             CceEEEeeCCC--------CHHhhh---cCCCCEEEEECCEEcCChhhHHhhhc
Q 016641          341 TGVLVNKINPL--------SDAHEI---LKKDDIILAFDGVPIANDGTGSHSML  383 (385)
Q Consensus       341 ~Gv~V~~V~~~--------spA~~a---L~~GDiI~~vng~~i~~~~~l~~~l~  383 (385)
                      .+..|.+|.++        ||..+.   +++||+|++|||+++....+++..|.
T Consensus        12 ~~y~I~~I~~gd~~~~~~~sPL~~pGv~v~~GD~I~aInG~~v~~~~~~~~lL~   65 (88)
T PF14685_consen   12 GGYRIARIYPGDPWNPNARSPLAQPGVDVREGDYILAINGQPVTADANPYRLLE   65 (88)
T ss_dssp             TEEEEEEE-BS-TTSSS-B-GGGGGS----TT-EEEEETTEE-BTTB-HHHHHH
T ss_pred             CEEEEEEEeCCCCCCccccCCccCCCCCCCCCCEEEEECCEECCCCCCHHHHhc
Confidence            68899999886        676654   77999999999999999999888764


No 46 
>TIGR02860 spore_IV_B stage IV sporulation protein B. SpoIVB, the stage IV sporulation protein B of endospore-forming bacteria such as Bacillus subtilis, is a serine proteinase, expressed in the spore (rather than mother cell) compartment, that participates in a proteolytic activation cascade for Sigma-K. It appears to be universal among endospore-forming bacteria and occurs nowhere else.
Probab=96.98  E-value=0.001  Score=65.78  Aligned_cols=42  Identities=29%  Similarity=0.312  Sum_probs=36.7

Q ss_pred             CceEEEeeC--------CCCHHhhh-cCCCCEEEEECCEEcCChhhHHhhh
Q 016641          341 TGVLVNKIN--------PLSDAHEI-LKKDDIILAFDGVPIANDGTGSHSM  382 (385)
Q Consensus       341 ~Gv~V~~V~--------~~spA~~a-L~~GDiI~~vng~~i~~~~~l~~~l  382 (385)
                      .||+|....        .+|||+++ ||+||+|++|||++|.+..|+.++|
T Consensus       105 ~GVlVvg~~~v~~~~g~~~SPAa~AGLq~GDiIvsING~~V~s~~DL~~iL  155 (402)
T TIGR02860       105 KGVLVVGFSDIETEKGKIHSPGEEAGIQIGDRILKINGEKIKNMDDLANLI  155 (402)
T ss_pred             CEEEEEEEEcccccCCCCCCHHHHcCCCCCCEEEEECCEECCCHHHHHHHH
Confidence            699996642        26899999 9999999999999999999998765


No 47 
>PF04495 GRASP55_65:  GRASP55/65 PDZ-like domain ;  InterPro: IPR007583 GRASP55 (Golgi reassembly stacking protein of 55 kDa) and GRASP65 (a 65 kDa protein) are highly homologous. GRASP55 is a component of the Golgi stacking machinery. GRASP65 is an N-ethylmaleimide-sensitive membrane protein required for the stacking of Golgi cisternae in a cell-free system [].; PDB: 3RLE_A 4EDJ_A.
Probab=96.52  E-value=0.0032  Score=53.35  Aligned_cols=60  Identities=25%  Similarity=0.259  Sum_probs=41.6

Q ss_pred             eeccCccccccccHHHHhhcCCCCccCceEEEeeCCCCHHhhh-cCC-CCEEEEECCEEcCChhhHHhhh
Q 016641          315 FCSLGLSCQTTENVQLRNNFGMRSEVTGVLVNKINPLSDAHEI-LKK-DDIILAFDGVPIANDGTGSHSM  382 (385)
Q Consensus       315 ~~~lGi~~~~~~~~~~~~~~g~~~~~~Gv~V~~V~~~spA~~a-L~~-GDiI~~vng~~i~~~~~l~~~l  382 (385)
                      ...||++++.-. .     .+.  ...++-|.+|.|+|||++| |++ .|.|+.+|+..+.+.++|.+.+
T Consensus        25 ~g~LG~sv~~~~-~-----~~~--~~~~~~Vl~V~p~SPA~~AGL~p~~DyIig~~~~~l~~~~~l~~~v   86 (138)
T PF04495_consen   25 QGLLGISVRFES-F-----EGA--EEEGWHVLRVAPNSPAAKAGLEPFFDYIIGIDGGLLDDEDDLFELV   86 (138)
T ss_dssp             SSSS-EEEEEEE-------TTG--CCCEEEEEEE-TTSHHHHTT--TTTEEEEEETTCE--STCHHHHHH
T ss_pred             CCCCcEEEEEec-c-----ccc--ccceEEEeEecCCCHHHHCCccccccEEEEccceecCCHHHHHHHH
Confidence            355888876542 1     111  2368999999999999999 999 6999999999999999988765


No 48 
>COG0793 Prc Periplasmic protease [Cell envelope biogenesis, outer membrane]
Probab=96.29  E-value=0.0051  Score=61.59  Aligned_cols=49  Identities=27%  Similarity=0.392  Sum_probs=41.1

Q ss_pred             eeccCccccccccHHHHhhcCCCCccCceEEEeeCCCCHHhhh-cCCCCEEEEECCEEcCChh
Q 016641          315 FCSLGLSCQTTENVQLRNNFGMRSEVTGVLVNKINPLSDAHEI-LKKDDIILAFDGVPIANDG  376 (385)
Q Consensus       315 ~~~lGi~~~~~~~~~~~~~~g~~~~~~Gv~V~~V~~~spA~~a-L~~GDiI~~vng~~i~~~~  376 (385)
                      +..+|+.++.-.             ..++.|.++.+++||+++ ||+||+|++|||+++....
T Consensus        99 ~~GiG~~i~~~~-------------~~~~~V~s~~~~~PA~kagi~~GD~I~~IdG~~~~~~~  148 (406)
T COG0793          99 FGGIGIELQMED-------------IGGVKVVSPIDGSPAAKAGIKPGDVIIKIDGKSVGGVS  148 (406)
T ss_pred             ccceeEEEEEec-------------CCCcEEEecCCCChHHHcCCCCCCEEEEECCEEccCCC
Confidence            456777765431             168999999999999999 9999999999999999875


No 49 
>PF09342 DUF1986:  Domain of unknown function (DUF1986);  InterPro: IPR015420 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Not withstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This domain is found in serine endopeptidases belonging to MEROPS peptidase family S1A (clan PA). It is found in unusual mosaic proteins, which are encoded by the Drosophila nudel gene (see P98159 from SWISSPROT). Nudel is involved in defining embryonic dorsoventral polarity. Three proteases; ndl, gd and snk process easter to create active easter. Active easter defines cell identities along the dorsal-ventral continuum by activating the spz ligand for the Tl receptor in the ventral region of the embryo. Nudel, pipe and windbeutel together trigger the protease cascade within the extraembryonic perivitelline compartment which induces dorsoventral polarity of the Drosophila embryo [].
Probab=96.23  E-value=0.029  Score=51.42  Aligned_cols=98  Identities=17%  Similarity=0.223  Sum_probs=67.4

Q ss_pred             CCCCCCCCCC--CCcceEEEEEecCCEEEecccccCCC----ceEEEEEcCCCcEEE------EEEEEec-----CCCCe
Q 016641          127 NYGLPWQNKS--QRETTGSGFVIPGKKILTNAHVVADS----TFVLVRKHGSPTKYR------AQVEAVG-----HECDL  189 (385)
Q Consensus       127 ~~~~p~~~~~--~~~~~GSGfiI~~g~ILT~aHvv~~~----~~i~V~~~~~g~~~~------a~v~~~d-----~~~Dl  189 (385)
                      ++..||...-  .+...|+|++|+..|||++..|+.+-    ..+.+.+. .++.+.      -++..+|     ++.++
T Consensus        13 ~y~WPWlA~IYvdG~~~CsgvLlD~~WlLvsssCl~~I~L~~~YvsallG-~~Kt~~~v~Gp~EQI~rVD~~~~V~~S~v   91 (267)
T PF09342_consen   13 DYHWPWLADIYVDGRYWCSGVLLDPHWLLVSSSCLRGISLSHHYVSALLG-GGKTYLSVDGPHEQISRVDCFKDVPESNV   91 (267)
T ss_pred             cccCcceeeEEEcCeEEEEEEEeccceEEEeccccCCcccccceEEEEec-CcceecccCCChheEEEeeeeeeccccce
Confidence            4556665432  24467999999999999999999863    34556553 444322      1333333     57899


Q ss_pred             EEEEecCC-cccccceeeecCCcc---cCCCeEEEEecCC
Q 016641          190 AILIVESD-EFWEGMHFLELGDIP---FLQQAVAVVGYPQ  225 (385)
Q Consensus       190 AlLkv~~~-~~~~~~~~l~l~~~~---~~G~~V~~iG~p~  225 (385)
                      +||.++.+ .|...+.|+-+.+..   ...+.++++|...
T Consensus        92 ~LLHL~~~~~fTr~VlP~flp~~~~~~~~~~~CVAVg~d~  131 (267)
T PF09342_consen   92 LLLHLEQPANFTRYVLPTFLPETSNENESDDECVAVGHDD  131 (267)
T ss_pred             eeeeecCcccceeeecccccccccCCCCCCCceEEEEccc
Confidence            99999886 566778888887622   2256899999876


No 50 
>PF08192 Peptidase_S64:  Peptidase family S64;  InterPro: IPR012985 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Not withstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This family of fungal proteins is involved in the processing of membrane bound transcription factor Stp1 [] and belongs to MEROPS peptidase family S64 (clan PA). The processing causes the signalling domain of Stp1 to be passed to the nucleus where several permease genes are induced. The permeases are important for uptake of amino acids, and processing of Stp1 only occurs in an amino acid-rich environment. This family is predicted to be distantly related to the trypsin family (MEROPS peptidase family S1) and to have a typical trypsin-like catalytic triad [].
Probab=96.01  E-value=0.054  Score=56.23  Aligned_cols=117  Identities=20%  Similarity=0.227  Sum_probs=68.6

Q ss_pred             CCCCeEEEEecCCc-----cccc------ceeeecCCc--------ccCCCeEEEEecCCCCCCceEEEeeEeecccccc
Q 016641          185 HECDLAILIVESDE-----FWEG------MHFLELGDI--------PFLQQAVAVVGYPQGGDNISVTKGVVSRVEPTQY  245 (385)
Q Consensus       185 ~~~DlAlLkv~~~~-----~~~~------~~~l~l~~~--------~~~G~~V~~iG~p~~~~~~~~~~G~Vs~~~~~~~  245 (385)
                      .-.|+|||+++...     +.++      -|.+.+.+.        ...|.+|+=+|...+.     +.|.|.++.-...
T Consensus       541 ~LsD~AIIkV~~~~~~~N~LGddi~f~~~dP~l~f~NlyV~~~~~~~~~G~~VfK~GrTTgy-----T~G~lNg~klvyw  615 (695)
T PF08192_consen  541 RLSDWAIIKVNKERKCQNYLGDDIQFNEPDPTLMFQNLYVREVVSNLVPGMEVFKVGRTTGY-----TTGILNGIKLVYW  615 (695)
T ss_pred             cccceEEEEeCCCceecCCCCccccccCCCccccccccchhhhhhccCCCCeEEEecccCCc-----cceEecceEEEEe
Confidence            34699999998542     1122      233333321        1238899999987553     4566665532211


Q ss_pred             cCCCce-eeEEEec----ccCCCCCCCccee-eCC------EEEEEEeeecCCCCceEEEEecchHHHHHHHH
Q 016641          246 VHGATQ-LMAIQID----AAINPGNSGGPAI-MGN------KVAGVAFQNLSGAENIGYIIPVPVIKHFITGV  306 (385)
Q Consensus       246 ~~~~~~-~~~i~~d----~~i~~G~SGGPL~-~~G------~vVGI~s~~~~~~~~~~~aip~~~i~~~l~~l  306 (385)
                      ..+... .+++...    .-...||||+=|+ .-+      .|+||..+.-+....+|++.|+..|.+-|++.
T Consensus       616 ~dG~i~s~efvV~s~~~~~Fa~~GDSGS~VLtk~~d~~~gLgvvGMlhsydge~kqfglftPi~~il~rl~~v  688 (695)
T PF08192_consen  616 ADGKIQSSEFVVSSDNNPAFASGGDSGSWVLTKLEDNNKGLGVVGMLHSYDGEQKQFGLFTPINEILDRLEEV  688 (695)
T ss_pred             cCCCeEEEEEEEecCCCccccCCCCcccEEEecccccccCceeeEEeeecCCccceeeccCcHHHHHHHHHHh
Confidence            122211 1233322    1235799999888 533      39999998655556789999987776666554


No 51 
>KOG3532 consensus Predicted protein kinase [General function prediction only]
Probab=95.93  E-value=0.0079  Score=61.95  Aligned_cols=42  Identities=24%  Similarity=0.380  Sum_probs=38.7

Q ss_pred             CceEEEeeCCCCHHhhh-cCCCCEEEEECCEEcCChhhHHhhh
Q 016641          341 TGVLVNKINPLSDAHEI-LKKDDIILAFDGVPIANDGTGSHSM  382 (385)
Q Consensus       341 ~Gv~V~~V~~~spA~~a-L~~GDiI~~vng~~i~~~~~l~~~l  382 (385)
                      +-|.|..|.+++||.++ +++|||+++|||+||.+.++..+.+
T Consensus       398 ~~v~v~tv~~ns~a~k~~~~~gdvlvai~~~pi~s~~q~~~~~  440 (1051)
T KOG3532|consen  398 RAVKVCTVEDNSLADKAAFKPGDVLVAINNVPIRSERQATRFL  440 (1051)
T ss_pred             eEEEEEEecCCChhhHhcCCCcceEEEecCccchhHHHHHHHH
Confidence            56889999999999999 9999999999999999999987765


No 52 
>COG3480 SdrC Predicted secreted protein containing a PDZ domain [Signal transduction mechanisms]
Probab=95.84  E-value=0.0099  Score=56.32  Aligned_cols=43  Identities=26%  Similarity=0.271  Sum_probs=40.7

Q ss_pred             CceEEEeeCCCCHHhhhcCCCCEEEEECCEEcCChhhHHhhhc
Q 016641          341 TGVLVNKINPLSDAHEILKKDDIILAFDGVPIANDGTGSHSML  383 (385)
Q Consensus       341 ~Gv~V~~V~~~spA~~aL~~GDiI~~vng~~i~~~~~l~~~l~  383 (385)
                      .||||..|..++|+...|+.||-|++|||+++.+.+|+.+.+.
T Consensus       130 ~gvyv~~v~~~~~~~gkl~~gD~i~avdg~~f~s~~e~i~~v~  172 (342)
T COG3480         130 AGVYVLSVIDNSPFKGKLEAGDTIIAVDGEPFTSSDELIDYVS  172 (342)
T ss_pred             eeEEEEEccCCcchhceeccCCeEEeeCCeecCCHHHHHHHHh
Confidence            6999999999999999999999999999999999999998763


No 53 
>PRK11186 carboxy-terminal protease; Provisional
Probab=95.69  E-value=0.015  Score=61.64  Aligned_cols=34  Identities=24%  Similarity=0.319  Sum_probs=29.6

Q ss_pred             CceEEEeeCCCCHHhhh--cCCCCEEEEEC--CEEcCC
Q 016641          341 TGVLVNKINPLSDAHEI--LKKDDIILAFD--GVPIAN  374 (385)
Q Consensus       341 ~Gv~V~~V~~~spA~~a--L~~GDiI~~vn--g~~i~~  374 (385)
                      .+++|.+|.|||||+++  |++||+|++||  |+++.+
T Consensus       255 ~~~~V~~vipGsPA~ka~gLk~GD~IlaVn~~g~~~~d  292 (667)
T PRK11186        255 DYTVINSLVAGGPAAKSKKLSVGDKIVGVGQDGKPIVD  292 (667)
T ss_pred             CeEEEEEccCCChHHHhCCCCCCCEEEEECCCCCcccc
Confidence            57899999999999995  99999999999  666554


No 54 
>PF10459 Peptidase_S46:  Peptidase S46;  InterPro: IPR019500 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This entry represents S46 peptidases, of which dipeptidyl-peptidase 7 (DPP-7) is the best-characterised member. It is a serine peptidase that is located on the cell surface and is predicted to have two N-terminal transmembrane domains. 
Probab=94.90  E-value=0.039  Score=58.81  Aligned_cols=20  Identities=30%  Similarity=0.277  Sum_probs=15.8

Q ss_pred             eEEEEEec-CCEEEecccccC
Q 016641          141 TGSGFVIP-GKKILTNAHVVA  160 (385)
Q Consensus       141 ~GSGfiI~-~g~ILT~aHvv~  160 (385)
                      -|||.+|+ +|+|+||.||+.
T Consensus        48 GCSgsfVS~~GLvlTNHHC~~   68 (698)
T PF10459_consen   48 GCSGSFVSPDGLVLTNHHCGY   68 (698)
T ss_pred             ceeEEEEcCCceEEecchhhh
Confidence            48888888 788888888864


No 55 
>PF05580 Peptidase_S55:  SpoIVB peptidase S55;  InterPro: IPR008763 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This group of serine peptidases belongs to the MEROPS peptidase family S55 (SpoIVB peptidase family, clan PA(S)). The protein SpoIVB plays a key role in signalling in the final sigma-K checkpoint of Bacillus subtilis [, ].
Probab=94.87  E-value=0.022  Score=51.32  Aligned_cols=41  Identities=27%  Similarity=0.374  Sum_probs=34.8

Q ss_pred             ccCCCCCCCcceeeCCEEEEEEeeecCCCCceEEEEecchH
Q 016641          259 AAINPGNSGGPAIMGNKVAGVAFQNLSGAENIGYIIPVPVI  299 (385)
Q Consensus       259 ~~i~~G~SGGPL~~~G~vVGI~s~~~~~~~~~~~aip~~~i  299 (385)
                      ..+.+|+||+|++.||++||=++..+.+....+|.++++..
T Consensus       175 GGIvqGMSGSPI~qdGKLiGAVthvf~~dp~~Gygi~ie~M  215 (218)
T PF05580_consen  175 GGIVQGMSGSPIIQDGKLIGAVTHVFVNDPTKGYGIFIEWM  215 (218)
T ss_pred             CCEEecccCCCEEECCEEEEEEEEEEecCCCceeeecHHHH
Confidence            35678999999999999999998888777788999986543


No 56 
>KOG3129 consensus 26S proteasome regulatory complex, subunit PSMD9 [Posttranslational modification, protein turnover, chaperones]
Probab=94.86  E-value=0.031  Score=49.84  Aligned_cols=39  Identities=31%  Similarity=0.201  Sum_probs=35.0

Q ss_pred             ceEEEeeCCCCHHhhh-cCCCCEEEEECCEEcCChhhHHh
Q 016641          342 GVLVNKINPLSDAHEI-LKKDDIILAFDGVPIANDGTGSH  380 (385)
Q Consensus       342 Gv~V~~V~~~spA~~a-L~~GDiI~~vng~~i~~~~~l~~  380 (385)
                      =+.|.+|.|+|||+.+ |+.||-|+++++..-.++..|++
T Consensus       140 Fa~V~sV~~~SPA~~aGl~~gD~il~fGnV~sgn~~~lq~  179 (231)
T KOG3129|consen  140 FAVVDSVVPGSPADEAGLCVGDEILKFGNVHSGNFLPLQN  179 (231)
T ss_pred             eEEEeecCCCChhhhhCcccCceEEEecccccccchhHHH
Confidence            5789999999999999 99999999999998888877654


No 57 
>COG3975 Predicted protease with the C-terminal PDZ domain [General function prediction only]
Probab=94.79  E-value=0.029  Score=56.71  Aligned_cols=32  Identities=25%  Similarity=0.261  Sum_probs=29.7

Q ss_pred             ccCceEEEeeCCCCHHhhh-cCCCCEEEEECCE
Q 016641          339 EVTGVLVNKINPLSDAHEI-LKKDDIILAFDGV  370 (385)
Q Consensus       339 ~~~Gv~V~~V~~~spA~~a-L~~GDiI~~vng~  370 (385)
                      +..+.+|..|.++|||++| |.+||-|++|||.
T Consensus       460 ~~g~~~i~~V~~~gPA~~AGl~~Gd~ivai~G~  492 (558)
T COG3975         460 EGGHEKITFVFPGGPAYKAGLSPGDKIVAINGI  492 (558)
T ss_pred             cCCeeEEEecCCCChhHhccCCCccEEEEEcCc
Confidence            3467899999999999999 9999999999998


No 58 
>PF02122 Peptidase_S39:  Peptidase S39;  InterPro: IPR000382 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. ORF2 of Potato leafroll virus (PLrV) encodes a polyprotein which is translated following a -1 frameshift. The polyprotein has a putative linear arrangement of membrane anchor-VPg-peptidase-polymerase domains. The serine peptidase domain which is found in this group of sequences belongs to MEROPS peptidase family S39 (clan PA(S)). It is likely that the peptidase domain is involved in the cleavage of the polyprotein []. The nucleotide sequence for the RNA of PLrV has been determined [, ]. The sequence contains six large open reading frames (ORFs). The 5' coding region encodes two polypeptides of 28K and 70K, which overlap in different reading frames; it is suggested that the third ORF in the 5' block is translated by frameshift readthrough near the end of the 70K protein, yielding a 118K polypeptide []. 
Segments of the predicted amino acid sequences of these ORFs resemble those of known viral RNA polymerases, ATP-binding proteins and viral genome-linked proteins. The nucleotide sequence of the genomic RNA of Beet western yellows virus (BWYV) has been determined []. The sequence contains six long ORFs. A cluster of three of these ORFs, including the coat protein cistron, display extensive amino acid sequence similarity to corresponding ORFs of a second luteovirus: Barley yellow dwarf virus [].; GO: 0004252 serine-type endopeptidase activity, 0022415 viral reproductive process, 0016021 integral to membrane; PDB: 1ZYO_A.
Probab=94.15  E-value=0.045  Score=49.37  Aligned_cols=136  Identities=21%  Similarity=0.171  Sum_probs=46.3

Q ss_pred             CCEEEecccccCCCceEEEEEcCCCcEEEE---EEEEecCCCCeEEEEecCCcccc--cceeeecCCcccCCCeEEEEec
Q 016641          149 GKKILTNAHVVADSTFVLVRKHGSPTKYRA---QVEAVGHECDLAILIVESDEFWE--GMHFLELGDIPFLQQAVAVVGY  223 (385)
Q Consensus       149 ~g~ILT~aHvv~~~~~i~V~~~~~g~~~~a---~v~~~d~~~DlAlLkv~~~~~~~--~~~~l~l~~~~~~G~~V~~iG~  223 (385)
                      +..++|+.||..+...+....  .|..++-   +.+..+...|++||+... ..+.  .++.+.+....++     .-|-
T Consensus        41 ~~~L~ta~Hv~~~~~~~~~~k--~g~kipl~~f~~~~~~~~~D~~il~~P~-n~~s~Lg~k~~~~~~~~~~-----~~g~  112 (203)
T PF02122_consen   41 EDALLTARHVWSRPSKVTSLK--TGEKIPLAEFTDLLESRIADFVILRGPP-NWESKLGVKAAQLSQNSQL-----AKGP  112 (203)
T ss_dssp             -EEEEE-HHHHTSSS---EEE--TTEEEE--S-EEEEE-TTT-EEEEE--H-HHHHHHT-----B----SE-----EEEE
T ss_pred             ccceecccccCCCccceeEcC--CCCcccchhChhhhCCCccCEEEEecCc-CHHHHhCcccccccchhhh-----CCCC
Confidence            559999999999866555444  3344332   455667889999999983 2211  2444444322211     0010


Q ss_pred             CCCCCCceEEEeeEeecccccccCCCceeeEEEecccCCCCCCCcceeeCCEEEEEEeeec--CCCCceEEEEecch
Q 016641          224 PQGGDNISVTKGVVSRVEPTQYVHGATQLMAIQIDAAINPGNSGGPAIMGNKVAGVAFQNL--SGAENIGYIIPVPV  298 (385)
Q Consensus       224 p~~~~~~~~~~G~Vs~~~~~~~~~~~~~~~~i~~d~~i~~G~SGGPL~~~G~vVGI~s~~~--~~~~~~~~aip~~~  298 (385)
                       ....  ....+.-...........+   .+...-+...+|.||.|++...++||++....  ...++.++..|+.-
T Consensus       113 -~~~y--~~~~~~~~~~sa~i~g~~~---~~~~vls~T~~G~SGtp~y~g~~vvGvH~G~~~~~~~~n~n~~spip~  183 (203)
T PF02122_consen  113 -VSFY--GFSSGEWPCSSAKIPGTEG---KFASVLSNTSPGWSGTPYYSGKNVVGVHTGSPSGSNRENNNRMSPIPP  183 (203)
T ss_dssp             -SSTT--SEEEEEEEEEE-S----ST---TEEEE-----TT-TT-EEE-SS-EEEEEEEE-----------------
T ss_pred             -eeee--eecCCCceeccCccccccC---cCCceEcCCCCCCCCCCeEECCCceEeecCcccccccccccccccccc
Confidence             0000  0111000000000000111   13444456688999999994449999998742  23345566555443


No 59 
>PF00947 Pico_P2A:  Picornavirus core protein 2A;  InterPro: IPR000081 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. 
Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad [].  This domain defines cysteine peptidases belonging to MEROPS peptidase family C3 (picornain, clan PA(C)), subfamilies 3CA and 3CB. The protein fold of this peptidase domain for members of this family resembles that of the serine peptidase, chymotrypsin [], the type example for clan PA. Picornaviral proteins are expressed as a single polyprotein which is cleaved by the viral 3C cysteine protease []. The poliovirus polyprotein is selectively cleaved at the Gln-|-Gly bond. In other picornavirus reactions Glu may be substituted for Gln, and Ser or Thr for Gly.; GO: 0008233 peptidase activity, 0006508 proteolysis, 0016032 viral reproduction; PDB: 2HRV_B 1Z8R_A.
Probab=93.61  E-value=0.31  Score=40.25  Aligned_cols=42  Identities=19%  Similarity=0.286  Sum_probs=29.1

Q ss_pred             cccccCCCceeeEEEecccCCCCCCCcceeeCCEEEEEEeee
Q 016641          242 PTQYVHGATQLMAIQIDAAINPGNSGGPAIMGNKVAGVAFQN  283 (385)
Q Consensus       242 ~~~~~~~~~~~~~i~~d~~i~~G~SGGPL~~~G~vVGI~s~~  283 (385)
                      ...+.....+..++....+..||+-||+|+++--||||++++
T Consensus        68 ~s~YYP~h~Q~~~l~g~Gp~~PGdCGg~L~C~HGViGi~Tag  109 (127)
T PF00947_consen   68 ESEYYPKHYQYNLLIGEGPAEPGDCGGILRCKHGVIGIVTAG  109 (127)
T ss_dssp             SBTTB-SEEEECEEEEE-SSSTT-TCSEEEETTCEEEEEEEE
T ss_pred             CccCchhheecCceeecccCCCCCCCceeEeCCCeEEEEEeC
Confidence            333434444556667777889999999999776799999985


No 60 
>KOG3580 consensus Tight junction proteins [Signal transduction mechanisms]
Probab=93.22  E-value=0.065  Score=54.71  Aligned_cols=44  Identities=23%  Similarity=0.253  Sum_probs=37.5

Q ss_pred             cCceEEEeeCCCCHHhhh-cCCCCEEEEECCEEcCChhhHHhhhc
Q 016641          340 VTGVLVNKINPLSDAHEI-LKKDDIILAFDGVPIANDGTGSHSML  383 (385)
Q Consensus       340 ~~Gv~V~~V~~~spA~~a-L~~GDiI~~vng~~i~~~~~l~~~l~  383 (385)
                      .-|+.|..|..+|||++- ||.||-|++||..+..+.--=+..+|
T Consensus       428 DVGIFVaGvqegspA~~eGlqEGDQIL~VN~vdF~nl~REeAVlf  472 (1027)
T KOG3580|consen  428 DVGIFVAGVQEGSPAEQEGLQEGDQILKVNTVDFRNLVREEAVLF  472 (1027)
T ss_pred             ceeEEEeecccCCchhhccccccceeEEeccccchhhhHHHHHHH
Confidence            369999999999999998 99999999999999888765444443


No 61 
>KOG3209 consensus WW domain-containing protein [General function prediction only]
Probab=91.91  E-value=0.11  Score=54.18  Aligned_cols=32  Identities=38%  Similarity=0.368  Sum_probs=29.2

Q ss_pred             EEeeCCCCHHhhh--cCCCCEEEEECCEEcCChh
Q 016641          345 VNKINPLSDAHEI--LKKDDIILAFDGVPIANDG  376 (385)
Q Consensus       345 V~~V~~~spA~~a--L~~GDiI~~vng~~i~~~~  376 (385)
                      |-+|.+||||++.  ||+||-|++|||..|.+..
T Consensus       782 iGrIieGSPAdRCgkLkVGDrilAVNG~sI~~ls  815 (984)
T KOG3209|consen  782 IGRIIEGSPADRCGKLKVGDRILAVNGQSILNLS  815 (984)
T ss_pred             ccccccCChhHhhccccccceEEEecCeeeeccC
Confidence            6789999999997  9999999999999998764


No 62 
>KOG3552 consensus FERM domain protein FRM-8 [General function prediction only]
Probab=91.80  E-value=0.12  Score=55.38  Aligned_cols=34  Identities=26%  Similarity=0.362  Sum_probs=31.8

Q ss_pred             ceEEEeeCCCCHHhhhcCCCCEEEEECCEEcCCh
Q 016641          342 GVLVNKINPLSDAHEILKKDDIILAFDGVPIAND  375 (385)
Q Consensus       342 Gv~V~~V~~~spA~~aL~~GDiI~~vng~~i~~~  375 (385)
                      -|+|..|.+|+|+...|++||-|+.|||++|...
T Consensus        76 PviVr~VT~GGps~GKL~PGDQIl~vN~Epv~da  109 (1298)
T KOG3552|consen   76 PVIVRFVTEGGPSIGKLQPGDQILAVNGEPVKDA  109 (1298)
T ss_pred             ceEEEEecCCCCccccccCCCeEEEecCcccccc
Confidence            4899999999999999999999999999999875


No 63 
>PF00949 Peptidase_S7:  Peptidase S7, Flavivirus NS3 serine protease ;  InterPro: IPR001850 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. 
They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This signature identifies serine peptidases belonging to MEROPS peptidase family S7 (flavivirin family, clan PA(S)). The protein fold of the peptidase domain for members of this family resembles that of chymotrypsin, the type example for clan PA.  Flaviviruses produce a polyprotein from the ssRNA genome. The N terminus of the NS3 protein (approx. 180 aa) is required for the processing of the polyprotein. NS3 also has conserved homology with NTP-binding proteins and the DEAD family of RNA helicases [, , ].; GO: 0003723 RNA binding, 0003724 RNA helicase activity, 0005524 ATP binding; PDB: 2IJO_B 3E90_D 2GGV_B 2FP7_B 2WV9_A 3U1I_B 3U1J_B 2WZQ_A 2WHX_A 3L6P_A ....
Probab=91.63  E-value=0.31  Score=40.86  Aligned_cols=31  Identities=23%  Similarity=0.375  Sum_probs=21.7

Q ss_pred             EEecccCCCCCCCccee-eCCEEEEEEeeecC
Q 016641          255 IQIDAAINPGNSGGPAI-MGNKVAGVAFQNLS  285 (385)
Q Consensus       255 i~~d~~i~~G~SGGPL~-~~G~vVGI~s~~~~  285 (385)
                      ...+..+.+|.||+|+| .+|++|||......
T Consensus        88 ~~~~~d~~~GsSGSpi~n~~g~ivGlYg~g~~  119 (132)
T PF00949_consen   88 GAIDLDFPKGSSGSPIFNQNGEIVGLYGNGVE  119 (132)
T ss_dssp             EEE---S-TTGTT-EEEETTSCEEEEEEEEEE
T ss_pred             EeeecccCCCCCCCceEcCCCcEEEEEcccee
Confidence            34445578899999999 99999999877654


No 64 
>PRK09681 putative type II secretion protein GspC; Provisional
Probab=91.58  E-value=0.22  Score=47.03  Aligned_cols=42  Identities=19%  Similarity=0.292  Sum_probs=30.8

Q ss_pred             CceEEEeeCCCCHH---hhh-cCCCCEEEEECCEEcCChhhHHhhh
Q 016641          341 TGVLVNKINPLSDA---HEI-LKKDDIILAFDGVPIANDGTGSHSM  382 (385)
Q Consensus       341 ~Gv~V~~V~~~spA---~~a-L~~GDiI~~vng~~i~~~~~l~~~l  382 (385)
                      .|+.=-+|.|+.++   .++ ||+|||+++|||..+++.++..+++
T Consensus       204 ~Gl~GYrl~Pgkd~~lF~~~GLq~GDva~sING~dL~D~~qa~~l~  249 (276)
T PRK09681        204 EGIVGYAVKPGADRSLFDASGFKEGDIAIALNQQDFTDPRAMIALM  249 (276)
T ss_pred             CCceEEEECCCCcHHHHHHcCCCCCCEEEEeCCeeCCCHHHHHHHH
Confidence            45222346777544   356 9999999999999999999766554


No 65 
>KOG3550 consensus Receptor targeting protein Lin-7 [Extracellular structures]
Probab=91.12  E-value=0.28  Score=41.53  Aligned_cols=35  Identities=26%  Similarity=0.498  Sum_probs=31.2

Q ss_pred             CceEEEeeCCCCHHhhh--cCCCCEEEEECCEEcCCh
Q 016641          341 TGVLVNKINPLSDAHEI--LKKDDIILAFDGVPIAND  375 (385)
Q Consensus       341 ~Gv~V~~V~~~spA~~a--L~~GDiI~~vng~~i~~~  375 (385)
                      .-+||++|.||+-|++-  ||.||-+++|||..|..-
T Consensus       115 spiyisriipggvadrhgglkrgdqllsvngvsvege  151 (207)
T KOG3550|consen  115 SPIYISRIIPGGVADRHGGLKRGDQLLSVNGVSVEGE  151 (207)
T ss_pred             CceEEEeecCCccccccCcccccceeEeecceeecch
Confidence            56999999999999974  999999999999888653


No 66 
>PF00944 Peptidase_S3:  Alphavirus core protein ;  InterPro: IPR000930 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. Togavirin, also known as Sindbis virus core endopeptidase, is a serine protease resident at the N terminus of the p130 polyprotein of togaviruses []. The endopeptidase signature identifies the peptidase as belonging to the MEROPS peptidase family S3 (togavirin family, clan PA(S)). The polyprotein also includes structural proteins for the nucleocapsid core and for the glycoprotein spikes []. Togavirin is only active while part of the polyprotein, cleavage at a Trp-Ser bond resulting in total lack of activity []. Mutagenesis studies have identified the location of the His-Asp-Ser catalytic triad, and X-ray studies have revealed the protein fold to be similar to that of chymotrypsin [, ].; GO: 0004252 serine-type endopeptidase activity, 0006508 proteolysis, 0016020 membrane; PDB: 2YEW_D 1EP5_A 3J0C_F 1EP6_C 1WYK_D 1DYL_A 1VCQ_B 1VCP_B 1LD4_D 1KXA_A ....
Probab=90.35  E-value=0.31  Score=40.54  Aligned_cols=27  Identities=22%  Similarity=0.476  Sum_probs=22.5

Q ss_pred             ccCCCCCCCccee-eCCEEEEEEeeecC
Q 016641          259 AAINPGNSGGPAI-MGNKVAGVAFQNLS  285 (385)
Q Consensus       259 ~~i~~G~SGGPL~-~~G~vVGI~s~~~~  285 (385)
                      ..-.+||||-|++ ..|+||||+..+..
T Consensus       101 g~g~~GDSGRpi~DNsGrVVaIVLGG~n  128 (158)
T PF00944_consen  101 GVGKPGDSGRPIFDNSGRVVAIVLGGAN  128 (158)
T ss_dssp             TS-STTSTTEEEESTTSBEEEEEEEEEE
T ss_pred             CCCCCCCCCCccCcCCCCEEEEEecCCC
Confidence            3457899999999 99999999988644


No 67 
>KOG2921 consensus Intramembrane metalloprotease (sterol-regulatory element-binding protein (SREBP) protease) [Posttranslational modification, protein turnover, chaperones]
Probab=89.92  E-value=0.29  Score=47.81  Aligned_cols=42  Identities=24%  Similarity=0.332  Sum_probs=37.3

Q ss_pred             cCceEEEeeCCCCHHhhh--cCCCCEEEEECCEEcCChhhHHhh
Q 016641          340 VTGVLVNKINPLSDAHEI--LKKDDIILAFDGVPIANDGTGSHS  381 (385)
Q Consensus       340 ~~Gv~V~~V~~~spA~~a--L~~GDiI~~vng~~i~~~~~l~~~  381 (385)
                      ..||.|.+|...||+..-  |++||+|+++||-+|++.+|-.+-
T Consensus       219 g~gV~Vtev~~~Spl~gprGL~vgdvitsldgcpV~~v~dW~ec  262 (484)
T KOG2921|consen  219 GEGVTVTEVPSVSPLFGPRGLSVGDVITSLDGCPVHKVSDWLEC  262 (484)
T ss_pred             CceEEEEeccccCCCcCcccCCccceEEecCCcccCCHHHHHHH
Confidence            379999999999998864  999999999999999999886553


No 68 
>TIGR02860 spore_IV_B stage IV sporulation protein B. SpoIVB, the stage IV sporulation protein B of endospore-forming bacteria such as Bacillus subtilis, is a serine proteinase, expressed in the spore (rather than mother cell) compartment, that participates in a proteolytic activation cascade for Sigma-K. It appears to be universal among endospore-forming bacteria and occurs nowhere else.
Probab=89.06  E-value=0.28  Score=48.86  Aligned_cols=41  Identities=27%  Similarity=0.391  Sum_probs=34.2

Q ss_pred             ccCCCCCCCcceeeCCEEEEEEeeecCCCCceEEEEecchH
Q 016641          259 AAINPGNSGGPAIMGNKVAGVAFQNLSGAENIGYIIPVPVI  299 (385)
Q Consensus       259 ~~i~~G~SGGPL~~~G~vVGI~s~~~~~~~~~~~aip~~~i  299 (385)
                      ..+.+|+||+|++.||++||=++..+-++...||.|-++..
T Consensus       355 gGivqGMSGSPi~q~gkliGAvtHVfvndpt~GYGi~ie~M  395 (402)
T TIGR02860       355 GGIVQGMSGSPIIQNGKVIGAVTHVFVNDPTSGYGVYIEWM  395 (402)
T ss_pred             CCEEecccCCCEEECCEEEEEEEEEEecCCCcceeehHHHH
Confidence            35678999999999999999998888787888899865443


No 69 
>KOG3542 consensus cAMP-regulated guanine nucleotide exchange factor [Signal transduction mechanisms]
Probab=88.66  E-value=0.29  Score=50.89  Aligned_cols=41  Identities=29%  Similarity=0.377  Sum_probs=35.5

Q ss_pred             CCccCceEEEeeCCCCHHhhh-cCCCCEEEEECCEEcCChhh
Q 016641          337 RSEVTGVLVNKINPLSDAHEI-LKKDDIILAFDGVPIANDGT  377 (385)
Q Consensus       337 ~~~~~Gv~V~~V~~~spA~~a-L~~GDiI~~vng~~i~~~~~  377 (385)
                      .....|++|.+|.|+|.|++. ||.||-|++|||+...+..-
T Consensus       558 sEkGfgifV~~V~pgskAa~~GlKRgDqilEVNgQnfenis~  599 (1283)
T KOG3542|consen  558 SEKGFGIFVAEVFPGSKAAREGLKRGDQILEVNGQNFENISA  599 (1283)
T ss_pred             ccccceeEEeeecCCchHHHhhhhhhhhhhhccccchhhhhH
Confidence            334579999999999999999 99999999999998776643


No 70 
>PF10459 Peptidase_S46:  Peptidase S46;  InterPro: IPR019500 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This entry represents S46 peptidases, of which dipeptidyl-peptidase 7 (DPP-7) is the best-characterised member. It is a serine peptidase that is located on the cell surface and is predicted to have two N-terminal transmembrane domains. 
Probab=88.16  E-value=0.26  Score=52.66  Aligned_cols=54  Identities=19%  Similarity=0.269  Sum_probs=35.8

Q ss_pred             EEEecccCCCCCCCccee-eCCEEEEEEeeecC--------CCCce--EEEEecchHHHHHHHHH
Q 016641          254 AIQIDAAINPGNSGGPAI-MGNKVAGVAFQNLS--------GAENI--GYIIPVPVIKHFITGVV  307 (385)
Q Consensus       254 ~i~~d~~i~~G~SGGPL~-~~G~vVGI~s~~~~--------~~~~~--~~aip~~~i~~~l~~l~  307 (385)
                      .+.++..+..||||+|++ .+|||||+++=+..        -....  +..|-+..|..+|+++-
T Consensus       623 ~FlstnDitGGNSGSPvlN~~GeLVGl~FDgn~Esl~~D~~fdp~~~R~I~VDiRyvL~~ldkv~  687 (698)
T PF10459_consen  623 NFLSTNDITGGNSGSPVLNAKGELVGLAFDGNWESLSGDIAFDPELNRTIHVDIRYVLWALDKVY  687 (698)
T ss_pred             EEEeccCcCCCCCCCccCCCCceEEEEeecCchhhcccccccccccceeEEEEHHHHHHHHHHHh
Confidence            466778889999999999 99999999875432        11223  33444445666665543


No 71 
>COG3031 PulC Type II secretory pathway, component PulC [Intracellular trafficking and secretion]
Probab=87.55  E-value=0.58  Score=42.84  Aligned_cols=44  Identities=14%  Similarity=0.094  Sum_probs=36.4

Q ss_pred             CceEEEeeCCCCHHhhh-cCCCCEEEEECCEEcCChhhHHhhhcc
Q 016641          341 TGVLVNKINPLSDAHEI-LKKDDIILAFDGVPIANDGTGSHSMLF  384 (385)
Q Consensus       341 ~Gv~V~~V~~~spA~~a-L~~GDiI~~vng~~i~~~~~l~~~l~~  384 (385)
                      .|..+.=..+++-.++. ||+|||-+++||..+++.+++..+|..
T Consensus       207 ~Gyr~~pgkd~slF~~sglq~GDIavaiNnldltdp~~m~~llq~  251 (275)
T COG3031         207 EGYRFEPGKDGSLFYKSGLQRGDIAVAINNLDLTDPEDMFRLLQM  251 (275)
T ss_pred             EEEEecCCCCcchhhhhcCCCcceEEEecCcccCCHHHHHHHHHh
Confidence            45555556667778888 999999999999999999999887754


No 72 
>KOG3606 consensus Cell polarity protein PAR6 [Signal transduction mechanisms]
Probab=87.24  E-value=0.5  Score=43.95  Aligned_cols=72  Identities=22%  Similarity=0.417  Sum_probs=49.2

Q ss_pred             HHHcCeeeeeeccCccccccccHHHHhhcCCCCccCceEEEeeCCCCHHhhh--cCCCCEEEEECCEEcC--ChhhHHhh
Q 016641          306 VVEHGKYVGFCSLGLSCQTTENVQLRNNFGMRSEVTGVLVNKINPLSDAHEI--LKKDDIILAFDGVPIA--NDGTGSHS  381 (385)
Q Consensus       306 l~~~g~~~~~~~lGi~~~~~~~~~~~~~~g~~~~~~Gv~V~~V~~~spA~~a--L~~GDiI~~vng~~i~--~~~~l~~~  381 (385)
                      |-++|.-.   -||+...+-...+.. ..|+. ...|+.|++..||+-|+.-  |.+.|-|++|||.+|.  +.+++.++
T Consensus       164 L~khG~ek---PLGFYIRDG~SVRVt-p~Gle-kvpGIFISRlVpGGLAeSTGLLaVnDEVlEVNGIEVaGKTLDQVTDM  238 (358)
T KOG3606|consen  164 LHKHGSEK---PLGFYIRDGTSVRVT-PHGLE-KVPGIFISRLVPGGLAESTGLLAVNDEVLEVNGIEVAGKTLDQVTDM  238 (358)
T ss_pred             hhhcCCCC---CceEEEecCceEEec-ccccc-ccCceEEEeecCCccccccceeeecceeEEEcCEEeccccHHHHHHH
Confidence            44566532   377766543221111 24554 3579999999999999975  8999999999999995  45666655


Q ss_pred             h
Q 016641          382 M  382 (385)
Q Consensus       382 l  382 (385)
                      |
T Consensus       239 M  239 (358)
T KOG3606|consen  239 M  239 (358)
T ss_pred             H
Confidence            4


No 73 
>KOG3651 consensus Protein kinase C, alpha binding protein [Signal transduction mechanisms]
Probab=85.15  E-value=1.1  Score=42.38  Aligned_cols=37  Identities=14%  Similarity=0.217  Sum_probs=32.5

Q ss_pred             ceEEEeeCCCCHHhhh--cCCCCEEEEECCEEcCChhhH
Q 016641          342 GVLVNKINPLSDAHEI--LKKDDIILAFDGVPIANDGTG  378 (385)
Q Consensus       342 Gv~V~~V~~~spA~~a--L~~GDiI~~vng~~i~~~~~l  378 (385)
                      =+||..|..++||++-  |+.||-|++|||..|....-+
T Consensus        31 ClYiVQvFD~tPAa~dG~i~~GDEi~avNg~svKGktKv   69 (429)
T KOG3651|consen   31 CLYIVQVFDKTPAAKDGRIRCGDEIVAVNGISVKGKTKV   69 (429)
T ss_pred             eEEEEEeccCCchhccCccccCCeeEEecceeecCccHH
Confidence            5899999999999975  999999999999999865443


No 74 
>KOG3571 consensus Dishevelled 3 and related proteins [General function prediction only]
Probab=84.23  E-value=0.77  Score=46.31  Aligned_cols=36  Identities=22%  Similarity=0.276  Sum_probs=30.2

Q ss_pred             cCceEEEeeCCCCHHh-hh-cCCCCEEEEECCEEcCCh
Q 016641          340 VTGVLVNKINPLSDAH-EI-LKKDDIILAFDGVPIAND  375 (385)
Q Consensus       340 ~~Gv~V~~V~~~spA~-~a-L~~GDiI~~vng~~i~~~  375 (385)
                      ..|+||.+|.+++.-+ .. |.+||.|+.||.....++
T Consensus       276 DggIYVgsImkgGAVA~DGRIe~GDMiLQVNevsFENm  313 (626)
T KOG3571|consen  276 DGGIYVGSIMKGGAVALDGRIEPGDMILQVNEVSFENM  313 (626)
T ss_pred             CCceEEeeeccCceeeccCccCccceEEEeeecchhhc
Confidence            4799999999988744 45 999999999999876654


No 75 
>PF02395 Peptidase_S6:  Immunoglobulin A1 protease Serine protease Prosite pattern;  InterPro: IPR000710 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. 
They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This group of serine peptidases belongs to the MEROPS peptidase family S6 (clan PA(S)). The type example is the IgA1-specific serine endopeptidase from Neisseria gonorrhoeae []. These peptidases cleave prolyl bonds in the hinge regions of immunoglobulin A heavy chains. Similar specificity is shown by the unrelated family of M26 metalloendopeptidases.; GO: 0004252 serine-type endopeptidase activity, 0006508 proteolysis; PDB: 3SZE_A 3H09_B 3SYJ_A 1WXR_A 3AK5_B.
Probab=84.17  E-value=5.1  Score=43.52  Aligned_cols=65  Identities=12%  Similarity=0.031  Sum_probs=35.2

Q ss_pred             EEEEEecCCEEEecccccCCCceEEEEEcCCCcEEEEEEEEecCCCCeEEEEecCCcccccceeeecC
Q 016641          142 GSGFVIPGKKILTNAHVVADSTFVLVRKHGSPTKYRAQVEAVGHECDLAILIVESDEFWEGMHFLELG  209 (385)
Q Consensus       142 GSGfiI~~g~ILT~aHvv~~~~~i~V~~~~~g~~~~a~v~~~d~~~DlAlLkv~~~~~~~~~~~l~l~  209 (385)
                      |...+|++.||+|.+|+..+...+..--. +...|...-..-.+..|+.+-|++.-.  .++.|++..
T Consensus        67 G~aTLigpqYiVSV~HN~~gy~~v~FG~~-g~~~Y~iV~RNn~~~~Df~~pRLnK~V--TEvaP~~~t  131 (769)
T PF02395_consen   67 GVATLIGPQYIVSVKHNGKGYNSVSFGNE-GQNTYKIVDRNNYPSGDFHMPRLNKFV--TEVAPAEMT  131 (769)
T ss_dssp             SS-EEEETTEEEBETTG-TSCCEECESCS-STCEEEEEEEEBETTSTEBEEEESS-----SS----BB
T ss_pred             ceEEEecCCeEEEEEccCCCcCceeeccc-CCceEEEEEccCCCCcccceeecCceE--EEEeccccc
Confidence            77899999999999999955444433321 223343322222334699999998742  235555554


No 76 
>COG0750 Predicted membrane-associated Zn-dependent proteases 1 [Cell envelope biogenesis, outer membrane]
Probab=83.27  E-value=1.7  Score=42.93  Aligned_cols=39  Identities=28%  Similarity=0.341  Sum_probs=34.5

Q ss_pred             EEeeCCCCHHhhh-cCCCCEEEEECCEEcCChhhHHhhhc
Q 016641          345 VNKINPLSDAHEI-LKKDDIILAFDGVPIANDGTGSHSML  383 (385)
Q Consensus       345 V~~V~~~spA~~a-L~~GDiI~~vng~~i~~~~~l~~~l~  383 (385)
                      +..+..+|+|..+ |++||.|+++|++++.+..++...+.
T Consensus       133 ~~~v~~~s~a~~a~l~~Gd~iv~~~~~~i~~~~~~~~~~~  172 (375)
T COG0750         133 VGEVAPKSAAALAGLRPGDRIVAVDGEKVASWDDVRRLLV  172 (375)
T ss_pred             eeecCCCCHHHHcCCCCCCEEEeECCEEccCHHHHHHHHH
Confidence            3379999999999 99999999999999999998776553


No 77 
>PF05416 Peptidase_C37:  Southampton virus-type processing peptidase;  InterPro: IPR001665 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. 
These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad [].  This group of cysteine peptidases belongs to the MEROPS peptidase family C37 (clan PA(C)). The type example is calicivirin from Southampton virus, an endopeptidase that cleaves the polyprotein at sites N-terminal to itself, liberating the polyprotein helicase. Southampton virus is a positive-stranded ssRNA virus belonging to the Caliciviruses, which are viruses that cause gastroenteritis. The calicivirus genome contains two open reading frames, ORF1 and ORF2. ORF1 encodes a non-structural polypeptide, which has RNA helicase, cysteine protease and RNA polymerase activity []. The regions of the polyprotein in which these activities lie are similar to proteins produced by the picornaviruses []. ORF2 encodes a structural, capsid protein. Two different families of caliciviruses can be distinguished on the basis of sequence similarity, namely the Norwalk-like viruses or small round structured viruses (SRSVs), and those classed as non-SRSVs.; GO: 0004197 cysteine-type endopeptidase activity, 0006508 proteolysis; PDB: 2FYQ_A 2FYR_A 1WQS_D 4ASH_A 2IPH_B.
Probab=82.95  E-value=11  Score=37.46  Aligned_cols=137  Identities=15%  Similarity=0.143  Sum_probs=67.3

Q ss_pred             ceEEEEEecCCEEEecccccCCCceEEEEEcCCCcEEEEEEEEecCCCCeEEEEecCCcccccceeeecCCcccCCCeEE
Q 016641          140 TTGSGFVIPGKKILTNAHVVADSTFVLVRKHGSPTKYRAQVEAVGHECDLAILIVESDEFWEGMHFLELGDIPFLQQAVA  219 (385)
Q Consensus       140 ~~GSGfiI~~g~ILT~aHvv~~~~~i~V~~~~~g~~~~a~v~~~d~~~DlAlLkv~~~~~~~~~~~l~l~~~~~~G~~V~  219 (385)
                      +.|=||.+++..++|+-||+....+-..-       .+-.-+.++..-+++-+++..+. ..+++-+-|.+...-|.-+.
T Consensus       379 GsGWGfWVS~~lfITttHViP~g~~E~FG-------v~i~~i~vh~sGeF~~~rFpk~i-RPDvtgmiLEeGapEGtV~s  450 (535)
T PF05416_consen  379 GSGWGFWVSPTLFITTTHVIPPGAKEAFG-------VPISQIQVHKSGEFCRFRFPKPI-RPDVTGMILEEGAPEGTVCS  450 (535)
T ss_dssp             TTEEEEESSSSEEEEEGGGS-STTSEETT-------EECGGEEEEEETTEEEEEESS-S-STTS---EE-SS--TT-EEE
T ss_pred             CCceeeeecceEEEEeeeecCCcchhhhC-------CChhHeEEeeccceEEEecCCCC-CCCccceeeccCCCCceEEE
Confidence            56789999999999999999853211100       01111223334566777776542 12566666655545565443


Q ss_pred             E-EecCCCCC-CceEEEeeEeecccccccCCCceeeEEEec-------ccCCCCCCCccee-eCC---EEEEEEeeecC
Q 016641          220 V-VGYPQGGD-NISVTKGVVSRVEPTQYVHGATQLMAIQID-------AAINPGNSGGPAI-MGN---KVAGVAFQNLS  285 (385)
Q Consensus       220 ~-iG~p~~~~-~~~~~~G~Vs~~~~~~~~~~~~~~~~i~~d-------~~i~~G~SGGPL~-~~G---~vVGI~s~~~~  285 (385)
                      + |=.+.|.- .+.+..|....+.-....- .....++.+.       -...|||-|.|-+ ..|   -|+|++.+...
T Consensus       451 iLiKR~sGEllpLAvRMgt~AsmkIqgr~v-~GQ~GMLLTGaNAK~mDLGT~PGDCGcPYvyKrgNd~VV~GVH~AAtr  528 (535)
T PF05416_consen  451 ILIKRPSGELLPLAVRMGTHASMKIQGRTV-HGQMGMLLTGANAKGMDLGTIPGDCGCPYVYKRGNDWVVIGVHAAATR  528 (535)
T ss_dssp             EEEE-TTSBEEEEEEEEEEEEEEEETTEEE-EEEEEEETTSTT-SSTTTS--TTGTT-EEEEEETTEEEEEEEEEEE-S
T ss_pred             EEEEcCCccchhhhhhhccceeEEEcceee-cceeeeeeecCCccccccCCCCCCCCCceeeecCCcEEEEEEEehhcc
Confidence            3 34454422 1345566554432210000 1122333332       2456899999999 665   49999988654


No 78 
>KOG1892 consensus Actin filament-binding protein Afadin [Cytoskeleton]
Probab=82.76  E-value=1.2  Score=48.31  Aligned_cols=42  Identities=19%  Similarity=0.232  Sum_probs=35.0

Q ss_pred             CCccCceEEEeeCCCCHHhh-h-cCCCCEEEEECCEEcCChhhH
Q 016641          337 RSEVTGVLVNKINPLSDAHE-I-LKKDDIILAFDGVPIANDGTG  378 (385)
Q Consensus       337 ~~~~~Gv~V~~V~~~spA~~-a-L~~GDiI~~vng~~i~~~~~l  378 (385)
                      .++.-|+||.+|.+|++|+. . |+.||-+++|||..+-.+.+=
T Consensus       956 Gq~klGIYvKsVV~GgaAd~DGRL~aGDQLLsVdG~SLiGisQE  999 (1629)
T KOG1892|consen  956 GQRKLGIYVKSVVEGGAADHDGRLEAGDQLLSVDGHSLIGISQE  999 (1629)
T ss_pred             CccccceEEEEeccCCccccccccccCceeeeecCcccccccHH
Confidence            34567999999999999885 4 999999999999987665543


No 79 
>PF02907 Peptidase_S29:  Hepatitis C virus NS3 protease;  InterPro: IPR004109 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This signature identifies the Hepatitis C virus NS3 protein as a serine protease which belongs to MEROPS peptidase family S29 (hepacivirin family, clan PA(S)), which has a trypsin-like fold. The non-structural (NS) protein NS3 is one of the NS proteins involved in replication of the HCV genome. The NS2 proteinase (IPR002518 from INTERPRO), a zinc-dependent enzyme, performs a single proteolytic cut to release the N terminus of NS3. The action of NS3 proteinase (NS3P), which resides in the N-terminal one-third of the NS3 protein, then yields all remaining non-structural proteins. The C-terminal two-thirds of the NS3 protein contain a helicase. The functional relationship between the proteinase and helicase domains is unknown. NS3 has a structural zinc-binding site and requires cofactor NS4. 
It has been suggested that the NS3 serine protease of hepatitis C is involved in cell transformation and that the ability to transform requires an active enzyme [].; GO: 0008236 serine-type peptidase activity, 0006508 proteolysis, 0019087 transformation of host cell by virus; PDB: 2QV1_B 3LOX_C 2OBQ_C 2OC1_C 2OC0_A 3LON_A 3KNX_A 2O8M_A 2OBO_A 2OC8_A ....
Probab=82.32  E-value=1.1  Score=37.30  Aligned_cols=23  Identities=26%  Similarity=0.466  Sum_probs=17.8

Q ss_pred             CCCCCCccee-eCCEEEEEEeeec
Q 016641          262 NPGNSGGPAI-MGNKVAGVAFQNL  284 (385)
Q Consensus       262 ~~G~SGGPL~-~~G~vVGI~s~~~  284 (385)
                      -.|.||||++ .+|.+|||..+..
T Consensus       106 lkGSSGgPiLC~~GH~vG~f~aa~  129 (148)
T PF02907_consen  106 LKGSSGGPILCPSGHAVGMFRAAV  129 (148)
T ss_dssp             HTT-TT-EEEETTSEEEEEEEEEE
T ss_pred             EecCCCCcccCCCCCEEEEEEEEE
Confidence            4699999999 9999999976643


No 80 
>KOG3209 consensus WW domain-containing protein [General function prediction only]
Probab=82.23  E-value=1.1  Score=47.15  Aligned_cols=37  Identities=14%  Similarity=0.125  Sum_probs=32.8

Q ss_pred             CceEEEeeCCCCHHhhh--cCCCCEEEEECCEEcCChhh
Q 016641          341 TGVLVNKINPLSDAHEI--LKKDDIILAFDGVPIANDGT  377 (385)
Q Consensus       341 ~Gv~V~~V~~~spA~~a--L~~GDiI~~vng~~i~~~~~  377 (385)
                      .+++|-++.+++||.+-  +++||-|++|||+....+..
T Consensus       923 M~LfVLRlAeDGPA~rdGrm~VGDqi~eINGesTkgmtH  961 (984)
T KOG3209|consen  923 MDLFVLRLAEDGPAIRDGRMRVGDQITEINGESTKGMTH  961 (984)
T ss_pred             cceEEEEeccCCCccccCceeecceEEEecCcccCCCcH
Confidence            46999999999999975  99999999999998877653


No 81 
>PF03510 Peptidase_C24:  2C endopeptidase (C24) cysteine protease family;  InterPro: IPR000317 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. 
These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad [].  The two signatures that define this group of calicivirus polyproteins identify a cysteine peptidase signature that belongs to MEROPS peptidase family C24 (clan PA(C)). Caliciviruses are positive-stranded ssRNA viruses that cause gastroenteritis. The calicivirus genome contains two open reading frames, ORF1 and ORF2. ORF2 encodes a structural protein []; while ORF1 encodes a non-structural polypeptide, which has RNA helicase, cysteine protease and RNA polymerase activity. The regions of the polyprotein in which these activities lie are similar to proteins produced by the picornaviruses. Two different families of caliciviruses can be distinguished on the basis of sequence similarity, namely those classified as small round structured viruses (SRSVs) and those classed as non-SRSVs. Calicivirus proteases from the non-SRSV group, which are members of the PA protease clan, constitute family C24 of the cysteine proteases (proteases from SRSVs belong to the C37 family). As mentioned above, the protease activity resides within a polyprotein. The enzyme cleaves the polyprotein at sites N-terminal to itself, liberating the polyprotein helicase.; GO: 0004197 cysteine-type endopeptidase activity, 0006508 proteolysis
Probab=82.00  E-value=8.6  Score=30.86  Aligned_cols=53  Identities=15%  Similarity=0.129  Sum_probs=36.0

Q ss_pred             EEEecCCEEEecccccCCCceEEEEEcCCCcEEEEEEEEecCCCCeEEEEecCCcccccceeeecCC
Q 016641          144 GFVIPGKKILTNAHVVADSTFVLVRKHGSPTKYRAQVEAVGHECDLAILIVESDEFWEGMHFLELGD  210 (385)
Q Consensus       144 GfiI~~g~ILT~aHvv~~~~~i~V~~~~~g~~~~a~v~~~d~~~DlAlLkv~~~~~~~~~~~l~l~~  210 (385)
                      ++-|.+|.++|+.||.+..+.+.      +..  -+++.  .+.|+++++.+...    .+.+++++
T Consensus         3 avHIGnG~~vt~tHva~~~~~v~------g~~--f~~~~--~~ge~~~v~~~~~~----~p~~~ig~   55 (105)
T PF03510_consen    3 AVHIGNGRYVTVTHVAKSSDSVD------GQP--FKIVK--TDGELCWVQSPLVH----LPAAQIGT   55 (105)
T ss_pred             eEEeCCCEEEEEEEEeccCceEc------CcC--cEEEE--eccCEEEEECCCCC----CCeeEecc
Confidence            67778999999999998765432      221  12222  35699999988764    56666765


No 82 
>KOG3580 consensus Tight junction proteins [Signal transduction mechanisms]
Probab=81.00  E-value=1.6  Score=45.02  Aligned_cols=37  Identities=22%  Similarity=0.424  Sum_probs=33.3

Q ss_pred             CceEEEeeCCCCHHhhhcCCCCEEEEECCEEcCChhh
Q 016641          341 TGVLVNKINPLSDAHEILKKDDIILAFDGVPIANDGT  377 (385)
Q Consensus       341 ~Gv~V~~V~~~spA~~aL~~GDiI~~vng~~i~~~~~  377 (385)
                      .-++|++|.||+||+.-||.||-|+.|||....+...
T Consensus        40 tSiViSDVlpGGPAeG~LQenDrvvMVNGvsMenv~h   76 (1027)
T KOG3580|consen   40 TSIVISDVLPGGPAEGLLQENDRVVMVNGVSMENVLH   76 (1027)
T ss_pred             eeEEEeeccCCCCcccccccCCeEEEEcCcchhhhHH
Confidence            5689999999999999999999999999998877643


No 83 
>KOG3605 consensus Beta amyloid precursor-binding protein [General function prediction only]
Probab=77.88  E-value=2.9  Score=43.64  Aligned_cols=75  Identities=15%  Similarity=0.252  Sum_probs=54.2

Q ss_pred             EEecchHHHHHHHHHHcCeeeeeeccCccc------cccccHHHHhhcCCCCccCceEEEeeCCCCHHhhh-cCCCCEEE
Q 016641          293 IIPVPVIKHFITGVVEHGKYVGFCSLGLSC------QTTENVQLRNNFGMRSEVTGVLVNKINPLSDAHEI-LKKDDIIL  365 (385)
Q Consensus       293 aip~~~i~~~l~~l~~~g~~~~~~~lGi~~------~~~~~~~~~~~~g~~~~~~Gv~V~~V~~~spA~~a-L~~GDiI~  365 (385)
                      -+|.+..+.+++.+++.-.|.    |-|.-      ..+.-|+++-++|+. .++|| |=+...|+-|++. +++|--|+
T Consensus       708 GLPLstcQs~Ik~~KnQT~Vk----ltiV~cpPV~~V~I~RPd~kyQLGFS-VQNGi-ICSLlRGGIAERGGVRVGHRII  781 (829)
T KOG3605|consen  708 GLPLSTCQSIIKGLKNQTAVK----LNIVSCPPVTTVLIRRPDLRYQLGFS-VQNGI-ICSLLRGGIAERGGVRVGHRII  781 (829)
T ss_pred             cccHHHHHHHHhcccccceEE----EEEecCCCceEEEeecccchhhccce-eeCcE-eehhhcccchhccCceeeeeEE
Confidence            478899999998887544432    22211      122356677778887 35786 5566779999999 99999999


Q ss_pred             EECCEEcC
Q 016641          366 AFDGVPIA  373 (385)
Q Consensus       366 ~vng~~i~  373 (385)
                      +|||+.|.
T Consensus       782 EINgQSVV  789 (829)
T KOG3605|consen  782 EINGQSVV  789 (829)
T ss_pred             EECCceEE
Confidence            99999874


No 84 
>KOG3551 consensus Syntrophins (type beta) [Extracellular structures]
Probab=73.21  E-value=1.5  Score=42.99  Aligned_cols=35  Identities=26%  Similarity=0.302  Sum_probs=30.8

Q ss_pred             CceEEEeeCCCCHHhhh--cCCCCEEEEECCEEcCCh
Q 016641          341 TGVLVNKINPLSDAHEI--LKKDDIILAFDGVPIAND  375 (385)
Q Consensus       341 ~Gv~V~~V~~~spA~~a--L~~GDiI~~vng~~i~~~  375 (385)
                      .-++|++|.+|-.|++.  |..||.|++|||+.+.+.
T Consensus       110 MPIlISKIFkGlAADQt~aL~~gDaIlSVNG~dL~~A  146 (506)
T KOG3551|consen  110 MPILISKIFKGLAADQTGALFLGDAILSVNGEDLRDA  146 (506)
T ss_pred             CceehhHhccccccccccceeeccEEEEecchhhhhc
Confidence            34899999999999986  999999999999987654


No 85 
>KOG0609 consensus Calcium/calmodulin-dependent serine protein kinase/membrane-associated guanylate kinase [Signal transduction mechanisms]
Probab=71.92  E-value=3  Score=42.62  Aligned_cols=41  Identities=27%  Similarity=0.295  Sum_probs=34.9

Q ss_pred             ceEEEeeCCCCHHhhh--cCCCCEEEEECCEEcCCh--hhHHhhh
Q 016641          342 GVLVNKINPLSDAHEI--LKKDDIILAFDGVPIAND--GTGSHSM  382 (385)
Q Consensus       342 Gv~V~~V~~~spA~~a--L~~GDiI~~vng~~i~~~--~~l~~~l  382 (385)
                      -++|.++..|+-+++.  |+.||.|+++||..+.+.  .++.++|
T Consensus       147 ~~~vARI~~GG~~~r~glL~~GD~i~EvNGi~v~~~~~~e~q~~l  191 (542)
T KOG0609|consen  147 KVVVARIMHGGMADRQGLLHVGDEILEVNGISVANKSPEELQELL  191 (542)
T ss_pred             ccEEeeeccCCcchhccceeeccchheecCeecccCCHHHHHHHH
Confidence            4899999999999986  999999999999999875  5555544


No 86 
>KOG0606 consensus Microtubule-associated serine/threonine kinase and related proteins [Signal transduction mechanisms; General function prediction only]
Probab=67.86  E-value=3.4  Score=45.83  Aligned_cols=32  Identities=28%  Similarity=0.356  Sum_probs=29.2

Q ss_pred             EEEeeCCCCHHhhh-cCCCCEEEEECCEEcCCh
Q 016641          344 LVNKINPLSDAHEI-LKKDDIILAFDGVPIAND  375 (385)
Q Consensus       344 ~V~~V~~~spA~~a-L~~GDiI~~vng~~i~~~  375 (385)
                      .|-.|.++|||..+ |++||.|+.+||+++...
T Consensus       661 ~v~sv~egsPA~~agls~~DlIthvnge~v~gl  693 (1205)
T KOG0606|consen  661 SVGSVEEGSPAFEAGLSAGDLITHVNGEPVHGL  693 (1205)
T ss_pred             eeeeecCCCCccccCCCccceeEeccCcccchh
Confidence            57789999999999 999999999999998764


No 87 
>cd01720 Sm_D2 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation.  Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. Sm subunit D2 heterodimerizes with subunit D1 and three such heterodimers form a hexameric ring structure with alternating D1 and D2 subunits. The D1 - D2 heterodimer also assembles into a heptameric ring containing D2, D3, E, F, and G subunits. Sm-like proteins exist in archaea as well as prokaryotes which form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=58.10  E-value=21  Score=27.64  Aligned_cols=37  Identities=24%  Similarity=0.401  Sum_probs=30.4

Q ss_pred             ccCCCceEEEEEcCCCcEEEEEEEEecCCCCeEEEEec
Q 016641          158 VVADSTFVLVRKHGSPTKYRAQVEAVGHECDLAILIVE  195 (385)
Q Consensus       158 vv~~~~~i~V~~~~~g~~~~a~v~~~d~~~DlAlLkv~  195 (385)
                      ++.....+.|.+. +++.+.+++.++|.+.++.|=...
T Consensus        10 ~~~~~~~V~V~lr-~~r~~~G~L~~fD~hmNlvL~d~~   46 (87)
T cd01720          10 AVKNNTQVLINCR-NNKKLLGRVKAFDRHCNMVLENVK   46 (87)
T ss_pred             HHcCCCEEEEEEc-CCCEEEEEEEEecCccEEEEcceE
Confidence            4445678899987 889999999999999999876553


No 88 
>COG0260 PepB Leucyl aminopeptidase [Amino acid transport and metabolism]
Probab=56.08  E-value=11  Score=38.65  Aligned_cols=40  Identities=13%  Similarity=0.130  Sum_probs=30.9

Q ss_pred             hhcCCCCccCceEEEeeCCCCHHhhhcCCCCEEEEECCEEcC
Q 016641          332 NNFGMRSEVTGVLVNKINPLSDAHEILKKDDIILAFDGVPIA  373 (385)
Q Consensus       332 ~~~g~~~~~~Gv~V~~V~~~spA~~aL~~GDiI~~vng~~i~  373 (385)
                      ..++++-  +=+.|.-..++.|..++.||||||++.||+.|.
T Consensus       291 a~l~l~v--nv~~vl~~~ENm~~g~A~rPGDVits~~GkTVE  330 (485)
T COG0260         291 AELKLPV--NVVGVLPAVENMPSGNAYRPGDVITSMNGKTVE  330 (485)
T ss_pred             HHcCCCc--eEEEEEeeeccCCCCCCCCCCCeEEecCCcEEE
Confidence            3456663  445566677788999999999999999999874


No 89 
>PF01732 DUF31:  Putative peptidase (DUF31);  InterPro: IPR022382  This domain has no known function. It is found in various hypothetical proteins and putative lipoproteins from mycoplasmas. 
Probab=51.07  E-value=11  Score=37.41  Aligned_cols=22  Identities=32%  Similarity=0.617  Sum_probs=19.5

Q ss_pred             cCCCCCCCccee-eCCEEEEEEe
Q 016641          260 AINPGNSGGPAI-MGNKVAGVAF  281 (385)
Q Consensus       260 ~i~~G~SGGPL~-~~G~vVGI~s  281 (385)
                      .+..|.||+.++ .+|++|||.+
T Consensus       351 ~l~gGaSGS~V~n~~~~lvGIy~  373 (374)
T PF01732_consen  351 SLGGGASGSMVINQNNELVGIYF  373 (374)
T ss_pred             CCCCCCCcCeEECCCCCEEEEeC
Confidence            456899999999 9999999975


No 90 
>cd00600 Sm_like The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation.  Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet.  Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=48.95  E-value=52  Score=23.06  Aligned_cols=33  Identities=18%  Similarity=0.061  Sum_probs=27.3

Q ss_pred             ceEEEEEcCCCcEEEEEEEEecCCCCeEEEEecC
Q 016641          163 TFVLVRKHGSPTKYRAQVEAVGHECDLAILIVES  196 (385)
Q Consensus       163 ~~i~V~~~~~g~~~~a~v~~~d~~~DlAlLkv~~  196 (385)
                      ..+.|.+. +++.+.+.+...|...++.|-....
T Consensus         7 ~~V~V~l~-~g~~~~G~L~~~D~~~Ni~L~~~~~   39 (63)
T cd00600           7 KTVRVELK-DGRVLEGVLVAFDKYMNLVLDDVEE   39 (63)
T ss_pred             CEEEEEEC-CCcEEEEEEEEECCCCCEEECCEEE
Confidence            46788886 8999999999999999987766543


No 91 
>KOG3549 consensus Syntrophins (type gamma) [Extracellular structures]
Probab=48.70  E-value=12  Score=36.27  Aligned_cols=33  Identities=27%  Similarity=0.301  Sum_probs=29.2

Q ss_pred             eEEEeeCCCCHHhhh--cCCCCEEEEECCEEcCCh
Q 016641          343 VLVNKINPLSDAHEI--LKKDDIILAFDGVPIAND  375 (385)
Q Consensus       343 v~V~~V~~~spA~~a--L~~GDiI~~vng~~i~~~  375 (385)
                      |+|+++.++-.|+..  |=.||-|++|||..|..-
T Consensus        82 vviSkI~kdQaAd~tG~LFvGDAilqvNGi~v~~c  116 (505)
T KOG3549|consen   82 VVISKIYKDQAADITGQLFVGDAILQVNGIYVTAC  116 (505)
T ss_pred             EEeehhhhhhhhhhcCceEeeeeeEEeccEEeecC
Confidence            899999999888865  889999999999998753


No 92 
>cd01727 LSm8 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation.  Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm8 is one of at least seven subunits that assemble onto U6 snRNA to form a seven-membered ring structure.  Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=48.12  E-value=93  Score=23.01  Aligned_cols=32  Identities=9%  Similarity=-0.025  Sum_probs=26.9

Q ss_pred             ceEEEEEcCCCcEEEEEEEEecCCCCeEEEEec
Q 016641          163 TFVLVRKHGSPTKYRAQVEAVGHECDLAILIVE  195 (385)
Q Consensus       163 ~~i~V~~~~~g~~~~a~v~~~d~~~DlAlLkv~  195 (385)
                      .++.|.+. +++.+.+++.++|...++.|=...
T Consensus        10 ~~V~V~l~-dgr~~~G~L~~~D~~~NlvL~~~~   41 (74)
T cd01727          10 KTVSVITV-DGRVIVGTLKGFDQATNLILDDSH   41 (74)
T ss_pred             CEEEEEEC-CCcEEEEEEEEEccccCEEccceE
Confidence            46778876 999999999999999998876653


No 93 
>PRK05015 aminopeptidase B; Provisional
Probab=45.74  E-value=20  Score=36.01  Aligned_cols=39  Identities=18%  Similarity=0.077  Sum_probs=29.2

Q ss_pred             hcCCCCccCceEEEeeCCCCHHhhhcCCCCEEEEECCEEcC
Q 016641          333 NFGMRSEVTGVLVNKINPLSDAHEILKKDDIILAFDGVPIA  373 (385)
Q Consensus       333 ~~g~~~~~~Gv~V~~V~~~spA~~aL~~GDiI~~vng~~i~  373 (385)
                      .++++.  +=+.|--+.++.+..++.|+||||...||+.|.
T Consensus       230 ~~~l~~--nV~~il~~aENmisg~A~kpgDVIt~~nGkTVE  268 (424)
T PRK05015        230 TRGLNK--RVKLFLCCAENLISGNAFKLGDIITYRNGKTVE  268 (424)
T ss_pred             hcCCCc--eEEEEEEecccCCCCCCCCCCCEEEecCCcEEe
Confidence            345553  334455667788888889999999999999874


No 94 
>cd01728 LSm1 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation.  Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm1 is one of at least seven subunits that assemble onto U6 snRNA to form a seven-membered ring structure.  Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=45.69  E-value=93  Score=23.20  Aligned_cols=32  Identities=13%  Similarity=0.033  Sum_probs=26.8

Q ss_pred             ceEEEEEcCCCcEEEEEEEEecCCCCeEEEEec
Q 016641          163 TFVLVRKHGSPTKYRAQVEAVGHECDLAILIVE  195 (385)
Q Consensus       163 ~~i~V~~~~~g~~~~a~v~~~d~~~DlAlLkv~  195 (385)
                      .++.|.+. +|+.+.+.+.++|+..++.|=...
T Consensus        13 k~v~V~l~-~gr~~~G~L~~fD~~~NlvL~d~~   44 (74)
T cd01728          13 KKVVVLLR-DGRKLIGILRSFDQFANLVLQDTV   44 (74)
T ss_pred             CEEEEEEc-CCeEEEEEEEEECCcccEEecceE
Confidence            56788886 899999999999999998876553


No 95 
>PRK00737 small nuclear ribonucleoprotein; Provisional
Probab=44.72  E-value=59  Score=23.98  Aligned_cols=33  Identities=21%  Similarity=0.305  Sum_probs=27.8

Q ss_pred             ceEEEEEcCCCcEEEEEEEEecCCCCeEEEEecC
Q 016641          163 TFVLVRKHGSPTKYRAQVEAVGHECDLAILIVES  196 (385)
Q Consensus       163 ~~i~V~~~~~g~~~~a~v~~~d~~~DlAlLkv~~  196 (385)
                      ..+.|.+. +|+.+.+.+.++|...++.|-....
T Consensus        15 k~V~V~lk-~g~~~~G~L~~~D~~mNlvL~d~~e   47 (72)
T PRK00737         15 SPVLVRLK-GGREFRGELQGYDIHMNLVLDNAEE   47 (72)
T ss_pred             CEEEEEEC-CCCEEEEEEEEEcccceeEEeeEEE
Confidence            46788886 8999999999999999998777643


No 96 
>cd01731 archaeal_Sm1 The archaeal sm1 proteins: The Sm proteins are conserved in all three domains of life and are always associated with U-rich RNA sequences. They function to mediate RNA-RNA interactions and RNA biogenesis.  All Sm proteins contain a common sequence motif in two segments, Sm1 and Sm2, separated by a short variable linker. Eukaryotic Sm proteins form part of specific small nuclear ribonucleoproteins (snRNPs) that are involved in the processing of pre-mRNAs to mature mRNAs, and are a major component of the eukaryotic spliceosome. Most snRNPs consist of seven Sm proteins (B/B', D1, D2, D3, E, F and G) arranged in a ring on a uridine-rich sequence (Sm site), plus a small nuclear RNA (snRNA) (either U1, U2, U5 or U4/6). Since archaebacteria do not have any splicing apparatus, Sm proteins of archaebacteria may play a more general role. Archaeal Lsm proteins are likely to represent the ancestral Sm domain.
Probab=44.37  E-value=61  Score=23.48  Aligned_cols=33  Identities=18%  Similarity=0.246  Sum_probs=28.3

Q ss_pred             ceEEEEEcCCCcEEEEEEEEecCCCCeEEEEecC
Q 016641          163 TFVLVRKHGSPTKYRAQVEAVGHECDLAILIVES  196 (385)
Q Consensus       163 ~~i~V~~~~~g~~~~a~v~~~d~~~DlAlLkv~~  196 (385)
                      ..+.|.+. +|+.+.+++.++|...++.|-....
T Consensus        11 ~~V~V~l~-~g~~~~G~L~~~D~~mNlvL~~~~e   43 (68)
T cd01731          11 KPVLVKLK-GGKEVRGRLKSYDQHMNLVLEDAEE   43 (68)
T ss_pred             CEEEEEEC-CCCEEEEEEEEECCcceEEEeeEEE
Confidence            56888886 8999999999999999998877643


No 97 
>cd01726 LSm6 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation.  Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm6 is one of at least seven subunits that assemble onto U6 snRNA to form a seven-membered ring structure.  Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=44.21  E-value=56  Score=23.64  Aligned_cols=32  Identities=22%  Similarity=0.282  Sum_probs=27.1

Q ss_pred             ceEEEEEcCCCcEEEEEEEEecCCCCeEEEEec
Q 016641          163 TFVLVRKHGSPTKYRAQVEAVGHECDLAILIVE  195 (385)
Q Consensus       163 ~~i~V~~~~~g~~~~a~v~~~d~~~DlAlLkv~  195 (385)
                      ..+.|.+. +|+.+.+++.++|...++.|=...
T Consensus        11 ~~V~V~Lk-~g~~~~G~L~~~D~~mNlvL~~~~   42 (67)
T cd01726          11 RPVVVKLN-SGVDYRGILACLDGYMNIALEQTE   42 (67)
T ss_pred             CeEEEEEC-CCCEEEEEEEEEccceeeEEeeEE
Confidence            46788886 889999999999999999876653


No 98 
>cd01722 Sm_F The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation.  Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. Sm subunit F is capable of forming both homo- and hetero-heptamer ring structures.  To form the hetero-heptamer, Sm subunit F initially binds subunits E and G to form a trimer which then assembles onto snRNA along with the D3/B and D1/D2 heterodimers.
Probab=43.94  E-value=54  Score=23.87  Aligned_cols=32  Identities=19%  Similarity=0.240  Sum_probs=26.9

Q ss_pred             ceEEEEEcCCCcEEEEEEEEecCCCCeEEEEec
Q 016641          163 TFVLVRKHGSPTKYRAQVEAVGHECDLAILIVE  195 (385)
Q Consensus       163 ~~i~V~~~~~g~~~~a~v~~~d~~~DlAlLkv~  195 (385)
                      ..+.|.+. +|+.+.+++..+|...++.|=.+.
T Consensus        12 ~~V~V~Lk-~g~~~~G~L~~~D~~mNi~L~~~~   43 (68)
T cd01722          12 KPVIVKLK-WGMEYKGTLVSVDSYMNLQLANTE   43 (68)
T ss_pred             CEEEEEEC-CCcEEEEEEEEECCCEEEEEeeEE
Confidence            46788886 899999999999999999875553


No 99 
>cd01730 LSm3 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation.  Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm3 is one of at least seven subunits that assemble onto U6 snRNA to form a seven-membered ring structure.  Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=43.21  E-value=49  Score=25.08  Aligned_cols=31  Identities=16%  Similarity=0.195  Sum_probs=26.1

Q ss_pred             ceEEEEEcCCCcEEEEEEEEecCCCCeEEEEe
Q 016641          163 TFVLVRKHGSPTKYRAQVEAVGHECDLAILIV  194 (385)
Q Consensus       163 ~~i~V~~~~~g~~~~a~v~~~d~~~DlAlLkv  194 (385)
                      ..+.|.+. +|+.+.+++.++|.+.+|.|=..
T Consensus        12 k~V~V~l~-~gr~~~G~L~~fD~~mNlvL~d~   42 (82)
T cd01730          12 ERVYVKLR-GDRELRGRLHAYDQHLNMILGDV   42 (82)
T ss_pred             CEEEEEEC-CCCEEEEEEEEEccceEEeccce
Confidence            56888886 88999999999999999876544


No 100
>cd01729 LSm7 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation.  Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm7 is one of at least seven subunits that assemble onto U6 snRNA to form a seven-membered ring structure.  Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=41.69  E-value=66  Score=24.40  Aligned_cols=31  Identities=6%  Similarity=-0.003  Sum_probs=26.2

Q ss_pred             ceEEEEEcCCCcEEEEEEEEecCCCCeEEEEe
Q 016641          163 TFVLVRKHGSPTKYRAQVEAVGHECDLAILIV  194 (385)
Q Consensus       163 ~~i~V~~~~~g~~~~a~v~~~d~~~DlAlLkv  194 (385)
                      .++.|.+. +|+.+.+.+.++|...+|.|=..
T Consensus        13 k~V~V~l~-~gr~~~G~L~~~D~~mNlvL~~~   43 (81)
T cd01729          13 KKIRVKFQ-GGREVTGILKGYDQLLNLVLDDT   43 (81)
T ss_pred             CeEEEEEC-CCcEEEEEEEEEcCcccEEecCE
Confidence            56788886 89999999999999999876554


No 101
>cd01732 LSm5 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation.  Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm5 is one of at least seven subunits that assemble onto U6 snRNA to form a seven-membered ring structure.  Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=41.61  E-value=59  Score=24.37  Aligned_cols=31  Identities=6%  Similarity=0.104  Sum_probs=26.3

Q ss_pred             ceEEEEEcCCCcEEEEEEEEecCCCCeEEEEe
Q 016641          163 TFVLVRKHGSPTKYRAQVEAVGHECDLAILIV  194 (385)
Q Consensus       163 ~~i~V~~~~~g~~~~a~v~~~d~~~DlAlLkv  194 (385)
                      ..+.|.+. +++.+.+++.++|...++.|=..
T Consensus        14 ~~V~V~l~-~gr~~~G~L~g~D~~mNlvL~da   44 (76)
T cd01732          14 SRIWIVMK-SDKEFVGTLLGFDDYVNMVLEDV   44 (76)
T ss_pred             CEEEEEEC-CCeEEEEEEEEeccceEEEEccE
Confidence            57788886 88999999999999999976554


No 102
>cd06168 LSm9 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation.  Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm9 proteins have a single Sm-like domain structure.  Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=41.08  E-value=72  Score=23.85  Aligned_cols=32  Identities=3%  Similarity=0.107  Sum_probs=26.7

Q ss_pred             ceEEEEEcCCCcEEEEEEEEecCCCCeEEEEec
Q 016641          163 TFVLVRKHGSPTKYRAQVEAVGHECDLAILIVE  195 (385)
Q Consensus       163 ~~i~V~~~~~g~~~~a~v~~~d~~~DlAlLkv~  195 (385)
                      ..+.|.+. ||+.+.+.+..+|...+|.|=...
T Consensus        11 ~~v~V~l~-dgR~~~G~l~~~D~~~NivL~~~~   42 (75)
T cd06168          11 RTMRIHMT-DGRTLVGVFLCTDRDCNIILGSAQ   42 (75)
T ss_pred             CeEEEEEc-CCeEEEEEEEEEcCCCcEEecCcE
Confidence            46788886 999999999999999998765543


No 103
>cd01717 Sm_B The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation.  Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. Sm subunit B heterodimerizes with subunit D3 and three such heterodimers form a hexameric ring structure with alternating B and D3 subunits.  The D3 - B heterodimer also assembles into a heptameric ring containing D1, D2, E, F, and G subunits.  Sm-like proteins exist in archaea as well as prokaryotes which form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=40.79  E-value=65  Score=24.17  Aligned_cols=32  Identities=16%  Similarity=0.107  Sum_probs=26.6

Q ss_pred             ceEEEEEcCCCcEEEEEEEEecCCCCeEEEEec
Q 016641          163 TFVLVRKHGSPTKYRAQVEAVGHECDLAILIVE  195 (385)
Q Consensus       163 ~~i~V~~~~~g~~~~a~v~~~d~~~DlAlLkv~  195 (385)
                      ..+.|.+. ||+.+.+.+.++|...+|.|=...
T Consensus        11 ~~V~V~l~-dgR~~~G~L~~~D~~~NlVL~~~~   42 (79)
T cd01717          11 YRLRVTLQ-DGRQFVGQFLAFDKHMNLVLSDCE   42 (79)
T ss_pred             CEEEEEEC-CCcEEEEEEEEEcCccCEEcCCEE
Confidence            46788886 899999999999999998765543


No 104
>KOG3605 consensus Beta amyloid precursor-binding protein [General function prediction only]
Probab=40.08  E-value=11  Score=39.43  Aligned_cols=29  Identities=17%  Similarity=0.257  Sum_probs=24.2

Q ss_pred             EeeCCCCHHhhh--cCCCCEEEEECCEEcCC
Q 016641          346 NKINPLSDAHEI--LKKDDIILAFDGVPIAN  374 (385)
Q Consensus       346 ~~V~~~spA~~a--L~~GDiI~~vng~~i~~  374 (385)
                      .....++||++.  |..||-|++|||..+..
T Consensus       678 Anmm~~GpAarsgkLnIGDQiiaING~SLVG  708 (829)
T KOG3605|consen  678 ANMMHGGPAARSGKLNIGDQIMSINGTSLVG  708 (829)
T ss_pred             HhcccCChhhhcCCccccceeEeecCceecc
Confidence            456678999987  99999999999986543


No 105
>cd01719 Sm_G The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation.  Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet.  Sm subunit G binds subunits E and F to form a trimer which then assembles onto snRNA along with the D1/D2 and D3/B heterodimers forming a seven-membered ring structure.  Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=39.02  E-value=81  Score=23.29  Aligned_cols=31  Identities=6%  Similarity=-0.016  Sum_probs=26.0

Q ss_pred             ceEEEEEcCCCcEEEEEEEEecCCCCeEEEEe
Q 016641          163 TFVLVRKHGSPTKYRAQVEAVGHECDLAILIV  194 (385)
Q Consensus       163 ~~i~V~~~~~g~~~~a~v~~~d~~~DlAlLkv  194 (385)
                      ..+.|.+. +|+.+.+++.++|...+|.|=..
T Consensus        11 k~V~V~L~-~g~~~~G~L~~~D~~mNlvL~~~   41 (72)
T cd01719          11 KKLSLKLN-GNRKVSGILRGFDPFMNLVLDDA   41 (72)
T ss_pred             CeEEEEEC-CCeEEEEEEEEEcccccEEeccE
Confidence            46778886 88999999999999999877554


No 106
>cd01735 LSm12_N LSm12 belongs to a family of Sm-like proteins that associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation.  Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet that associates with other Sm proteins to form hexameric and heptameric ring structures.   In addition to the N-terminal Sm-like domain, LSm12 has a novel methyltransferase domain.
Probab=38.82  E-value=1.2e+02  Score=21.75  Aligned_cols=33  Identities=21%  Similarity=0.184  Sum_probs=27.3

Q ss_pred             ceEEEEEcCCCcEEEEEEEEecCCCCeEEEEecC
Q 016641          163 TFVLVRKHGSPTKYRAQVEAVGHECDLAILIVES  196 (385)
Q Consensus       163 ~~i~V~~~~~g~~~~a~v~~~d~~~DlAlLkv~~  196 (385)
                      ..+.++.. .|..++++++.+|....+.+|+...
T Consensus         7 s~V~~kTc-~g~~ieGEV~afD~~tk~lIlk~~s   39 (61)
T cd01735           7 SQVSCRTC-FEQRLQGEVVAFDYPSKMLILKCPS   39 (61)
T ss_pred             cEEEEEec-CCceEEEEEEEecCCCcEEEEECcc
Confidence            45666665 6889999999999999999998655


No 107
>PF11874 DUF3394:  Domain of unknown function (DUF3394);  InterPro: IPR021814  This domain is functionally uncharacterised. This domain is found in bacteria. This presumed domain is about 190 amino acids in length. This domain is found associated with PF06808 from PFAM. 
Probab=36.42  E-value=46  Score=29.56  Aligned_cols=28  Identities=21%  Similarity=0.164  Sum_probs=24.7

Q ss_pred             CceEEEeeCCCCHHhhh-cCCCCEEEEEC
Q 016641          341 TGVLVNKINPLSDAHEI-LKKDDIILAFD  368 (385)
Q Consensus       341 ~Gv~V~~V~~~spA~~a-L~~GDiI~~vn  368 (385)
                      ..++|..|..||||+++ +.-|+.|+++-
T Consensus       122 ~~~~Vd~v~fgS~A~~~g~d~d~~I~~v~  150 (183)
T PF11874_consen  122 GKVIVDEVEFGSPAEKAGIDFDWEITEVE  150 (183)
T ss_pred             CEEEEEecCCCCHHHHcCCCCCcEEEEEE
Confidence            56899999999999999 99999887763


No 108
>PF09465 LBR_tudor:  Lamin-B receptor of TUDOR domain;  InterPro: IPR019023  The Lamin-B receptor is a chromatin and lamin binding protein in the inner nuclear membrane. It is one of the integral inner nuclear envelope membrane proteins responsible for targeting nuclear membranes to chromatin, being a downstream effector of Ran, a small Ras-like nuclear GTPase which regulates NE assembly. Lamin-B receptor interacts with importin beta, a Ran-binding protein, thereby directly contributing to the fusion of membrane vesicles and the formation of the nuclear envelope. ; PDB: 2L8D_A 2DIG_A.
Probab=35.09  E-value=1.6e+02  Score=20.71  Aligned_cols=37  Identities=24%  Similarity=0.285  Sum_probs=29.6

Q ss_pred             CCceEEEEEcCCCcEEEEEEEEecCCCCeEEEEecCC
Q 016641          161 DSTFVLVRKHGSPTKYRAQVEAVGHECDLAILIVESD  197 (385)
Q Consensus       161 ~~~~i~V~~~~~g~~~~a~v~~~d~~~DlAlLkv~~~  197 (385)
                      ....+.++.+++...|++++..+|...++.-++.+..
T Consensus         8 ~Ge~V~~rWP~s~lYYe~kV~~~d~~~~~y~V~Y~DG   44 (55)
T PF09465_consen    8 IGEVVMVRWPGSSLYYEGKVLSYDSKSDRYTVLYEDG   44 (55)
T ss_dssp             SS-EEEEE-TTTS-EEEEEEEEEETTTTEEEEEETTS
T ss_pred             CCCEEEEECCCCCcEEEEEEEEecccCceEEEEEcCC
Confidence            4567889999888888999999999999998888765


No 109
>PRK00913 multifunctional aminopeptidase A; Provisional
Probab=34.87  E-value=28  Score=35.80  Aligned_cols=31  Identities=16%  Similarity=0.164  Sum_probs=25.4

Q ss_pred             eEEEeeCCCCHHhhhcCCCCEEEEECCEEcC
Q 016641          343 VLVNKINPLSDAHEILKKDDIILAFDGVPIA  373 (385)
Q Consensus       343 v~V~~V~~~spA~~aL~~GDiI~~vng~~i~  373 (385)
                      +-|--..++.|..++.||||||...||+.|.
T Consensus       301 ~~v~~l~ENm~~~~A~rPgDVi~~~~GkTVE  331 (483)
T PRK00913        301 VGVVAACENMPSGNAYRPGDVLTSMSGKTIE  331 (483)
T ss_pred             EEEEEeeccCCCCCCCCCCCEEEECCCcEEE
Confidence            3445556788888999999999999999874


No 110
>smart00651 Sm snRNP Sm proteins. small nuclear ribonucleoprotein particles (snRNPs) involved in pre-mRNA splicing
Probab=34.25  E-value=1.1e+02  Score=21.76  Aligned_cols=33  Identities=24%  Similarity=0.301  Sum_probs=27.0

Q ss_pred             ceEEEEEcCCCcEEEEEEEEecCCCCeEEEEecC
Q 016641          163 TFVLVRKHGSPTKYRAQVEAVGHECDLAILIVES  196 (385)
Q Consensus       163 ~~i~V~~~~~g~~~~a~v~~~d~~~DlAlLkv~~  196 (385)
                      ..+.|.+. +|+.+.+.+...|...++.|=....
T Consensus         9 ~~V~V~l~-~g~~~~G~L~~~D~~~NlvL~~~~e   41 (67)
T smart00651        9 KRVLVELK-NGREYRGTLKGFDQFMNLVLEDVEE   41 (67)
T ss_pred             cEEEEEEC-CCcEEEEEEEEECccccEEEccEEE
Confidence            46788886 8899999999999999987765543


No 111
>PTZ00412 leucyl aminopeptidase; Provisional
Probab=33.93  E-value=34  Score=35.69  Aligned_cols=40  Identities=13%  Similarity=0.117  Sum_probs=29.0

Q ss_pred             hhcCCCCccCceEEEeeCCCCHHhhhcCCCCEEEEECCEEcC
Q 016641          332 NNFGMRSEVTGVLVNKINPLSDAHEILKKDDIILAFDGVPIA  373 (385)
Q Consensus       332 ~~~g~~~~~~Gv~V~~V~~~spA~~aL~~GDiI~~vng~~i~  373 (385)
                      ..++++-  +=+-|.-..++.|..++.+|||||...||+.|.
T Consensus       337 A~Lklpv--nVv~iiplaENm~sg~A~rPGDVits~nGkTVE  376 (569)
T PTZ00412        337 AKLQLPV--NVVAAVGLAENAIGPESYHPSSIITSRKGLTVE  376 (569)
T ss_pred             HHcCCCe--EEEEEEEhhhcCCCCCCCCCCCEeEecCCCEEe
Confidence            3456653  333445556678888889999999999999864


No 112
>PF12381 Peptidase_C3G:  Tungro spherical virus-type peptidase;  InterPro: IPR024387 This entry represents a rice tungro spherical waikavirus-type peptidase that belongs to MEROPS peptidase family C3G. It is a picornain 3C-type protease, and is responsible for the self-cleavage of the positive single-stranded polyproteins of a number of plant viral genomes. The location of the protease activity of the polyprotein is at the C-terminal end, adjacent and N-terminal to the putative RNA polymerase.
Probab=33.84  E-value=32  Score=31.26  Aligned_cols=53  Identities=15%  Similarity=0.360  Sum_probs=36.2

Q ss_pred             EEEecccCCCCCCCccee-eC----CEEEEEEeeecCCCCceEEEEec--chHHHHHHHHH
Q 016641          254 AIQIDAAINPGNSGGPAI-MG----NKVAGVAFQNLSGAENIGYIIPV--PVIKHFITGVV  307 (385)
Q Consensus       254 ~i~~d~~i~~G~SGGPL~-~~----G~vVGI~s~~~~~~~~~~~aip~--~~i~~~l~~l~  307 (385)
                      .++...+...|+-|+|++ .+    -+++||+.++..+ .+.+||=++  +++++.++.|.
T Consensus       170 gleY~~~t~~GdCGs~i~~~~t~~~RKIvGiHVAG~~~-~~~gYAe~itQEDL~~A~~~l~  229 (231)
T PF12381_consen  170 GLEYQMPTMNGDCGSPIVRNNTQMVRKIVGIHVAGSAN-HAMGYAESITQEDLMRAINKLE  229 (231)
T ss_pred             eeeEECCCcCCCccceeeEcchhhhhhhheeeeccccc-ccceehhhhhHHHHHHHHHhhc
Confidence            356667778999999999 44    4799999986532 356777655  34555555543


No 113
>cd01721 Sm_D3 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation.  Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. Sm subunit D3 heterodimerizes with subunit B and three such heterodimers form a hexameric ring structure with alternating B and D3 subunits. The D3 - B heterodimer also assembles into a heptameric ring containing D1, D2, E, F, and G subunits. Sm-like proteins exist in archaea as well as prokaryotes which form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=33.60  E-value=1.1e+02  Score=22.36  Aligned_cols=32  Identities=16%  Similarity=0.181  Sum_probs=27.8

Q ss_pred             ceEEEEEcCCCcEEEEEEEEecCCCCeEEEEec
Q 016641          163 TFVLVRKHGSPTKYRAQVEAVGHECDLAILIVE  195 (385)
Q Consensus       163 ~~i~V~~~~~g~~~~a~v~~~d~~~DlAlLkv~  195 (385)
                      ..+.|.+. +|..+.+++...|...++.|-...
T Consensus        11 ~~V~VeLk-~g~~~~G~L~~~D~~MNl~L~~~~   42 (70)
T cd01721          11 HIVTVELK-TGEVYRGKLIEAEDNMNCQLKDVT   42 (70)
T ss_pred             CEEEEEEC-CCcEEEEEEEEEcCCceeEEEEEE
Confidence            56788886 889999999999999999888774


No 114
>PF00883 Peptidase_M17:  Cytosol aminopeptidase family, catalytic domain;  InterPro: IPR000819 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Metalloproteases are the most diverse of the four main types of protease, with more than 50 families identified to date. In these enzymes, a divalent cation, usually zinc, activates the water molecule. The metal ion is held in place by amino acid ligands, usually three in number. The known metal ligands are His, Glu, Asp or Lys and at least one other residue is required for catalysis, which may play an electrophilic role. Of the known metalloproteases, around half contain an HEXXH motif, which has been shown in crystallographic studies to form part of the metal-binding site []. The HEXXH motif is relatively common, but can be more stringently defined for metalloproteases as 'abXHEbbHbc', where 'a' is most often valine or threonine and forms part of the S1' subsite in thermolysin and neprilysin, 'b' is an uncharged residue, and 'c' a hydrophobic residue. Proline is never found in this site, possibly because it would break the helical structure adopted by this motif in metalloproteases []. This group of metallopeptidases belong to the MEROPS peptidase family M17 (leucyl aminopeptidase family, clan MF), the type example being leucyl aminopeptidase from Bos taurus (Bovine). Aminopeptidases are exopeptidases involved in the processing and regular turnover of intracellular proteins, although their precise role in cellular metabolism is unclear [, ]. Leucine aminopeptidases cleave leucine residues from the N-terminal of polypeptide chains, but substantial rates are evident for all amino acids []. The enzymes exist as homo-hexamers, comprising 2 trimers stacked on top of one another []. Each monomer binds 2 zinc ions and folds into 2 alpha/beta-type quasi-spherical globular domains, producing a comma-like shape []. The N-terminal 150 residues form a 5-stranded beta-sheet with 4 parallel and 1 anti-parallel strand sandwiched between 4 alpha-helices []. An alpha-helix extends into the C-terminal domain, which comprises a central 8-stranded saddle-shaped beta-sheet sandwiched between groups of helices, forming the monomer hydrophobic core []. A 3-stranded beta-sheet resides on the surface of the monomer, where it interacts with other members of the hexamer []. The 2 zinc ions and the active site are entirely located in the C-terminal catalytic domain [].; GO: 0004177 aminopeptidase activity, 0006508 proteolysis, 0005622 intracellular; PDB: 3KZW_L 3KQX_C 3KQZ_L 3KR4_I 3KR5_J 3T8W_C 3H8F_D 3H8G_A 3H8E_B 3IJ3_A ....
Probab=33.01  E-value=22  Score=34.30  Aligned_cols=30  Identities=13%  Similarity=0.201  Sum_probs=20.4

Q ss_pred             EEEeeCCCCHHhhhcCCCCEEEEECCEEcC
Q 016641          344 LVNKINPLSDAHEILKKDDIILAFDGVPIA  373 (385)
Q Consensus       344 ~V~~V~~~spA~~aL~~GDiI~~vng~~i~  373 (385)
                      -|--+.++.|..++.+|||||.+.||+.|.
T Consensus       133 ~~l~~~EN~i~~~a~~pgDVi~s~~GkTVE  162 (311)
T PF00883_consen  133 AVLPLAENMISGNAYRPGDVITSMNGKTVE  162 (311)
T ss_dssp             EEEEEEEE--STTSTTTTEEEE-TTS-EEE
T ss_pred             EEEEcccccCCCCCCCCCCEEEeCCCCEEE
Confidence            344455678888889999999999999873


No 115
>COG1958 LSM1 Small nuclear ribonucleoprotein (snRNP) homolog [Transcription]
Probab=32.33  E-value=95  Score=23.20  Aligned_cols=33  Identities=24%  Similarity=0.293  Sum_probs=27.6

Q ss_pred             ceEEEEEcCCCcEEEEEEEEecCCCCeEEEEecC
Q 016641          163 TFVLVRKHGSPTKYRAQVEAVGHECDLAILIVES  196 (385)
Q Consensus       163 ~~i~V~~~~~g~~~~a~v~~~d~~~DlAlLkv~~  196 (385)
                      ..+.|.+. +|+.+.+++.++|...++.|--+..
T Consensus        18 ~~V~V~lk-~g~~~~G~L~~~D~~mNlvL~d~~e   50 (79)
T COG1958          18 KRVLVKLK-NGREYRGTLVGFDQYMNLVLDDVEE   50 (79)
T ss_pred             CEEEEEEC-CCCEEEEEEEEEccceeEEEeceEE
Confidence            67888886 8899999999999999987765543


No 116
>cd00433 Peptidase_M17 Cytosol aminopeptidase family, N-terminal and catalytic domains.  Family M17 contains zinc- and manganese-dependent exopeptidases (EC 3.4.11.1), including leucine aminopeptidase. They catalyze removal of amino acids from the N-terminus of a protein and play a key role in protein degradation and in the metabolism of biologically active peptides. They do not contain the HEXXH motif (which is used as one of the signature patterns to group the peptidase families) in the metal-binding site. The two associated zinc ions and the active site are entirely enclosed within the C-terminal catalytic domain in leucine aminopeptidase. The enzyme is a hexamer, with the catalytic domains clustered around the three-fold axis, and the two trimers related to one another by a two-fold rotation. The N-terminal domain is structurally similar to the ADP-ribose binding Macro domain. This family includes proteins from bacteria, archaea, animals and plants.
Probab=31.88  E-value=33  Score=35.24  Aligned_cols=31  Identities=16%  Similarity=0.127  Sum_probs=25.3

Q ss_pred             eEEEeeCCCCHHhhhcCCCCEEEEECCEEcC
Q 016641          343 VLVNKINPLSDAHEILKKDDIILAFDGVPIA  373 (385)
Q Consensus       343 v~V~~V~~~spA~~aL~~GDiI~~vng~~i~  373 (385)
                      +-|.-..++.+..++.+|||||...||+.|.
T Consensus       287 ~~i~~~~EN~is~~A~rPgDVi~s~~GkTVE  317 (468)
T cd00433         287 VGVLPLAENMISGNAYRPGDVITSRSGKTVE  317 (468)
T ss_pred             EEEEEeeecCCCCCCCCCCCEeEeCCCcEEE
Confidence            4455566788888889999999999999874


No 117
>PF01423 LSM:  LSM domain ;  InterPro: IPR001163 This family is found in Lsm (like-Sm) proteins and in bacterial Lsm-related Hfq proteins. In each case, the domain adopts a core structure consisting of an open beta-barrel with an SH3-like topology. Lsm (like-Sm) proteins have diverse functions, and are thought to be important modulators of RNA biogenesis and function [, ]. The Sm proteins form part of specific small nuclear ribonucleoproteins (snRNPs) that are involved in the processing of pre-mRNAs to mature mRNAs, and are a major component of the eukaryotic spliceosome. Most snRNPs consist of seven Sm proteins (B/B', D1, D2, D3, E, F and G) arranged in a ring on a uridine-rich sequence (Sm site), plus a small nuclear RNA (snRNA) (either U1, U2, U5 or U4/6) []. All Sm proteins contain a common sequence motif in two segments, Sm1 and Sm2, separated by a short variable linker []. In other snRNPs, certain Sm proteins are replaced with different Lsm proteins, such as with U7 snRNPs, in which the D1 and D2 Sm proteins are replaced with U7-specific Lsm10 and Lsm11 proteins, where Lsm11 plays a role in histone U7-specific RNA processing []. Lsm proteins are also found in archaebacteria, which do not have any splicing apparatus suggesting a more general role for Lsm proteins. The pleiotropic translational regulator Hfq (host factor Q) is a bacterial Lsm-like protein, which modulates the structure of numerous RNA molecules by binding preferentially to A/U-rich sequences in RNA []. Hfq forms an Lsm-like fold, however, unlike the heptameric Sm proteins, Hfq forms a homo-hexameric ring.; PDB: 1D3B_K 2Y9D_D 2Y9A_D 2Y9C_R 3VRI_C 2Y9B_K 3QUI_D 3M4G_H 3INZ_E 1U1S_C ....
Probab=31.30  E-value=1.3e+02  Score=21.34  Aligned_cols=35  Identities=17%  Similarity=0.167  Sum_probs=29.3

Q ss_pred             CceEEEEEcCCCcEEEEEEEEecCCCCeEEEEecCC
Q 016641          162 STFVLVRKHGSPTKYRAQVEAVGHECDLAILIVESD  197 (385)
Q Consensus       162 ~~~i~V~~~~~g~~~~a~v~~~d~~~DlAlLkv~~~  197 (385)
                      ...+.|.+. +|+.+.+.+...|...++.|-.....
T Consensus         8 g~~V~V~l~-~g~~~~G~L~~~D~~~Nl~L~~~~~~   42 (67)
T PF01423_consen    8 GKRVRVELK-NGRTYRGTLVSFDQFMNLVLSDVTET   42 (67)
T ss_dssp             TSEEEEEET-TSEEEEEEEEEEETTEEEEEEEEEEE
T ss_pred             CcEEEEEEe-CCEEEEEEEEEeechheEEeeeEEEE
Confidence            357888886 89999999999999999988777653


No 118
>COG0298 HypC Hydrogenase maturation factor [Posttranslational modification, protein turnover, chaperones]
Probab=29.60  E-value=1.2e+02  Score=23.02  Aligned_cols=47  Identities=21%  Similarity=0.264  Sum_probs=30.5

Q ss_pred             EEEEEEEecCCCCeEEEEecCCcccccceeeecCCcccCCCeEEE-EecC
Q 016641          176 YRAQVEAVGHECDLAILIVESDEFWEGMHFLELGDIPFLQQAVAV-VGYP  224 (385)
Q Consensus       176 ~~a~v~~~d~~~DlAlLkv~~~~~~~~~~~l~l~~~~~~G~~V~~-iG~p  224 (385)
                      +|++++..|...++|++.+-.-.  ..+.---++...++|+.|.+ +||.
T Consensus         5 iPgqI~~I~~~~~~A~Vd~gGvk--reV~l~Lv~~~v~~GdyVLVHvGfA   52 (82)
T COG0298           5 IPGQIVEIDDNNHLAIVDVGGVK--REVNLDLVGEEVKVGDYVLVHVGFA   52 (82)
T ss_pred             cccEEEEEeCCCceEEEEeccEe--EEEEeeeecCccccCCEEEEEeeEE
Confidence            57888999988889999886532  12222223335578998776 5654


No 119
>KOG3938 consensus RGS-GAIP interacting protein GIPC, contains PDZ domain [Signal transduction mechanisms; Intracellular trafficking, secretion, and vesicular transport]
Probab=27.86  E-value=21  Score=33.42  Aligned_cols=38  Identities=21%  Similarity=0.200  Sum_probs=32.2

Q ss_pred             eEEEeeCCCCHHhhh--cCCCCEEEEECCEEcCChhhHHh
Q 016641          343 VLVNKINPLSDAHEI--LKKDDIILAFDGVPIANDGTGSH  380 (385)
Q Consensus       343 v~V~~V~~~spA~~a--L~~GDiI~~vng~~i~~~~~l~~  380 (385)
                      ..|..+.++|--.+.  +++||.|-+|||+.|-.-+.++-
T Consensus       151 AFIKrIkegsvidri~~i~VGd~IEaiNge~ivG~RHYeV  190 (334)
T KOG3938|consen  151 AFIKRIKEGSVIDRIEAICVGDHIEAINGESIVGKRHYEV  190 (334)
T ss_pred             eeeEeecCCchhhhhhheeHHhHHHhhcCccccchhHHHH
Confidence            478899999988876  99999999999999987765543


No 120
>KOG2597 consensus Predicted aminopeptidase of the M17 family [General function prediction only]
Probab=26.91  E-value=67  Score=33.13  Aligned_cols=43  Identities=12%  Similarity=0.075  Sum_probs=31.8

Q ss_pred             HHHhhcCCCCccCceEEEeeCCCCHHhhhcCCCCEEEEECCEEcC
Q 016641          329 QLRNNFGMRSEVTGVLVNKINPLSDAHEILKKDDIILAFDGVPIA  373 (385)
Q Consensus       329 ~~~~~~g~~~~~~Gv~V~~V~~~spA~~aL~~GDiI~~vng~~i~  373 (385)
                      +....+++|.  +-..|.-.-+++|...+-|+||||+..||+.|.
T Consensus       310 ~a~~~l~~~i--n~~~v~plcENm~sg~A~kpgDVit~~nGKtve  352 (513)
T KOG2597|consen  310 RAAAQLSLPI--NVHAVLPLCENMPSGNATKPGDVITLRNGKTVE  352 (513)
T ss_pred             HHHHhcCCCC--ceEEEEeeeccCCCccCCCCCcEEEecCCcEEE
Confidence            3344566663  444555566789999999999999999999874


No 121
>PF11730 DUF3297:  Protein of unknown function (DUF3297);  InterPro: IPR021724  This family is expressed in Proteobacteria and Actinobacteria. The function is not known. 
Probab=25.24  E-value=44  Score=24.42  Aligned_cols=32  Identities=25%  Similarity=0.251  Sum_probs=27.4

Q ss_pred             eeCCCCHHhhh-cCCCCEEEEECCEEcCChhhH
Q 016641          347 KINPLSDAHEI-LKKDDIILAFDGVPIANDGTG  378 (385)
Q Consensus       347 ~V~~~spA~~a-L~~GDiI~~vng~~i~~~~~l  378 (385)
                      +++|.||-+.+ +-.-||=+.+||++=++.+++
T Consensus         5 S~~P~Sp~~~~~~l~~~iGIrfng~Er~nVeEY   37 (71)
T PF11730_consen    5 SINPRSPHYDAEVLERGIGIRFNGKERTNVEEY   37 (71)
T ss_pred             ccCCCChhhHHHHHhcCcceEECCeEcccceeE
Confidence            57899999988 777899999999998887764


No 122
>cd01723 LSm4 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation.  Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm4 is one of at least seven subunits that assemble onto U6 snRNA to form a seven-membered ring structure.  Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=24.14  E-value=2.1e+02  Score=21.20  Aligned_cols=33  Identities=12%  Similarity=0.126  Sum_probs=27.8

Q ss_pred             CceEEEEEcCCCcEEEEEEEEecCCCCeEEEEec
Q 016641          162 STFVLVRKHGSPTKYRAQVEAVGHECDLAILIVE  195 (385)
Q Consensus       162 ~~~i~V~~~~~g~~~~a~v~~~d~~~DlAlLkv~  195 (385)
                      ...+.|.+. +|..+.+++..+|...++.|-.+.
T Consensus        11 g~~V~VeLk-ng~~~~G~L~~~D~~mNi~L~~~~   43 (76)
T cd01723          11 NHPMLVELK-NGETYNGHLVNCDNWMNIHLREVI   43 (76)
T ss_pred             CCEEEEEEC-CCCEEEEEEEEEcCCCceEEEeEE
Confidence            357788886 789999999999999999887663


No 123
>KOG1738 consensus Membrane-associated guanylate kinase-interacting protein/connector enhancer of KSR-like [Nucleotide transport and metabolism]
Probab=23.06  E-value=53  Score=34.50  Aligned_cols=31  Identities=19%  Similarity=0.177  Sum_probs=27.8

Q ss_pred             eEEEeeCCCCHHhhh--cCCCCEEEEECCEEcC
Q 016641          343 VLVNKINPLSDAHEI--LKKDDIILAFDGVPIA  373 (385)
Q Consensus       343 v~V~~V~~~spA~~a--L~~GDiI~~vng~~i~  373 (385)
                      .+|+++.++|||...  |..||-|+.||++.+.
T Consensus       227 h~~s~~~e~Spad~~~kI~dgdEv~qiN~qtvV  259 (638)
T KOG1738|consen  227 HVTSKIFEQSPADYRQKILDGDEVLQINEQTVV  259 (638)
T ss_pred             eeccccccCChHHHhhcccCccceeeecccccc
Confidence            477889999999987  9999999999999865


No 124
>KOG3834 consensus Golgi reassembly stacking protein GRASP65, contains PDZ domain [Intracellular trafficking, secretion, and vesicular transport]
Probab=22.86  E-value=69  Score=32.19  Aligned_cols=39  Identities=18%  Similarity=0.115  Sum_probs=31.1

Q ss_pred             EEEeeCCCCHHhhh-cC-CCCEEEEECCEEcCChhhHHhhh
Q 016641          344 LVNKINPLSDAHEI-LK-KDDIILAFDGVPIANDGTGSHSM  382 (385)
Q Consensus       344 ~V~~V~~~spA~~a-L~-~GDiI~~vng~~i~~~~~l~~~l  382 (385)
                      =|-+|.++|||+.| |+ -+|-|+.+-.......+||...|
T Consensus       112 Hvl~V~p~SPaalAgl~~~~DYivG~~~~~~~~~eDl~~lI  152 (462)
T KOG3834|consen  112 HVLSVEPNSPAALAGLRPYTDYIVGIWDAVMHEEEDLFTLI  152 (462)
T ss_pred             eeeecCCCCHHHhcccccccceEecchhhhccchHHHHHHH
Confidence            46789999999999 88 68999999555666777776654


Done!