Query         014786
Match_columns 418
No_of_seqs    361 out of 3361
Neff          7.5 
Searched_HMMs 46136
Date          Fri Mar 29 08:30:22 2013
Command       hhsearch -i /work/01045/syshi/csienesis_hhblits_a3m/014786.a3m -d /work/01045/syshi/HHdatabase/Cdd.hhm -o /work/01045/syshi/hhsearch_cdd/014786hhsearch_cdd -cpu 12 -v 0 

 No Hit                             Prob E-value P-value  Score    SS Cols Query HMM  Template HMM
  1 PRK10139 serine endoprotease;  100.0   3E-51 6.5E-56  421.5  36.3  286  117-417    41-346 (455)
  2 PRK10898 serine endoprotease;  100.0 5.9E-50 1.3E-54  400.3  36.8  287  112-417    41-335 (353)
  3 TIGR02038 protease_degS peripl 100.0 7.5E-50 1.6E-54  399.7  36.8  287  112-417    41-334 (351)
  4 PRK10942 serine endoprotease;  100.0 3.3E-49 7.2E-54  408.1  34.9  286  117-417    39-367 (473)
  5 TIGR02037 degP_htrA_DO peripla 100.0 1.3E-47 2.9E-52  393.8  32.8  285  118-417     3-313 (428)
  6 COG0265 DegQ Trypsin-like seri 100.0 1.9E-38 4.2E-43  317.6  28.5  288  116-417    33-326 (347)
  7 KOG1320 Serine protease [Postt  99.9 6.7E-27 1.4E-31  236.1  19.6  291  115-416   127-453 (473)
  8 KOG1421 Predicted signaling-as  99.8 1.9E-20   4E-25  191.4  16.5  279  117-417    53-357 (955)
  9 PF13365 Trypsin_2:  Trypsin-li  99.7 2.2E-16 4.8E-21  132.6  12.5  109  154-293     1-120 (120)
 10 PF00089 Trypsin:  Trypsin;  In  99.5   4E-12 8.7E-17  117.3  19.2  168  151-320    24-220 (220)
 11 KOG1421 Predicted signaling-as  99.5 1.1E-12 2.4E-17  135.0  15.5  288  122-416   524-827 (955)
 12 KOG1320 Serine protease [Postt  99.4 2.6E-12 5.7E-17  130.6  12.7  252  122-394    56-319 (473)
 13 PF13180 PDZ_2:  PDZ domain; PD  99.3 5.8E-12 1.3E-16   99.7   8.6   70  332-417     1-70  (82)
 14 cd00190 Tryp_SPc Trypsin-like   99.3   9E-11 1.9E-15  109.0  17.1  169  151-321    24-230 (232)
 15 smart00020 Tryp_SPc Trypsin-li  99.2 3.5E-10 7.7E-15  105.3  15.9  147  151-298    25-208 (229)
 16 COG3591 V8-like Glu-specific e  99.0 1.7E-08 3.6E-13   95.5  16.4  158  153-324    65-250 (251)
 17 cd00987 PDZ_serine_protease PD  99.0   3E-09 6.6E-14   85.0   8.7   75  332-417     1-80  (90)
 18 cd00991 PDZ_archaeal_metallopr  98.9 4.8E-09   1E-13   82.4   7.9   57  350-417    10-66  (79)
 19 cd00136 PDZ PDZ domain, also c  98.9 4.7E-09   1E-13   80.0   7.6   68  332-417     1-70  (70)
 20 TIGR01713 typeII_sec_gspC gene  98.8   1E-08 2.2E-13   98.6   8.2   89  314-417   159-247 (259)
 21 cd00990 PDZ_glycyl_aminopeptid  98.8 1.2E-08 2.7E-13   79.8   7.2   65  332-417     1-65  (80)
 22 cd00989 PDZ_metalloprotease PD  98.7 5.7E-08 1.2E-12   75.7   7.9   66  333-417     2-67  (79)
 23 cd00986 PDZ_LON_protease PDZ d  98.7 7.2E-08 1.6E-12   75.5   7.7   56  350-417     8-63  (79)
 24 cd00988 PDZ_CTP_protease PDZ d  98.6 1.1E-07 2.4E-12   75.2   7.4   66  333-417     3-70  (85)
 25 TIGR02037 degP_htrA_DO peripla  98.5 2.6E-07 5.7E-12   95.4   9.0   76  331-417   337-418 (428)
 26 PF00863 Peptidase_C4:  Peptida  98.5 9.5E-06   2E-10   76.3  17.5  164  122-313    13-184 (235)
 27 cd00992 PDZ_signaling PDZ doma  98.5 6.7E-07 1.5E-11   70.0   8.2   69  331-416    11-81  (82)
 28 PF00595 PDZ:  PDZ domain (Also  98.4 3.4E-07 7.5E-12   71.9   5.2   71  330-416     8-80  (81)
 29 smart00228 PDZ Domain present   98.3 1.4E-06   3E-11   68.4   7.0   71  332-417    12-82  (85)
 30 PRK10779 zinc metallopeptidase  98.2 1.4E-06 3.1E-11   90.5   6.3   54  353-417   129-182 (449)
 31 KOG3627 Trypsin [Amino acid tr  98.2 9.6E-05 2.1E-09   70.4  18.0  145  153-299    39-229 (256)
 32 TIGR00054 RIP metalloprotease   98.0 1.3E-05 2.7E-10   82.7   7.4   55  350-416   203-257 (420)
 33 TIGR00225 prc C-terminal pepti  98.0 1.9E-05 4.1E-10   78.9   7.7   67  332-417    51-119 (334)
 34 PRK10779 zinc metallopeptidase  97.9   2E-05 4.4E-10   81.9   7.6   54  351-416   222-275 (449)
 35 PF14685 Tricorn_PDZ:  Tricorn   97.9 3.9E-05 8.5E-10   61.4   7.2   68  333-417     2-77  (88)
 36 PRK10139 serine endoprotease;   97.9 2.5E-05 5.3E-10   81.3   7.4   55  350-417   390-444 (455)
 37 PLN00049 carboxyl-terminal pro  97.8 6.2E-05 1.4E-09   76.8   8.3   56  350-417   102-159 (389)
 38 PF05579 Peptidase_S32:  Equine  97.8 0.00034 7.4E-09   66.2  12.2  117  151-298   111-229 (297)
 39 PRK10942 serine endoprotease;   97.8 5.7E-05 1.2E-09   79.0   7.5   55  350-417   408-462 (473)
 40 TIGR00054 RIP metalloprotease   97.8 4.3E-05 9.3E-10   78.8   6.3   54  350-416   128-181 (420)
 41 COG0793 Prc Periplasmic protea  97.7 0.00013 2.9E-09   74.6   8.5   71  329-417    97-169 (406)
 42 PF04495 GRASP55_65:  GRASP55/6  97.6  0.0001 2.3E-09   64.1   6.0   74  332-417    26-99  (138)
 43 TIGR02860 spore_IV_B stage IV   97.6 0.00012 2.6E-09   74.1   7.1   55  350-416   105-167 (402)
 44 COG3480 SdrC Predicted secrete  97.4  0.0003 6.4E-09   68.2   6.8   55  350-416   130-184 (342)
 45 PRK11186 carboxy-terminal prot  97.3 0.00061 1.3E-08   73.7   7.8   68  331-417   243-318 (667)
 46 COG5640 Secreted trypsin-like   97.2   0.015 3.4E-07   57.4  15.5   54  272-325   223-279 (413)
 47 PF12812 PDZ_1:  PDZ-like domai  97.2  0.0013 2.8E-08   51.5   6.3   65  332-407     9-76  (78)
 48 COG3975 Predicted protease wit  97.1 0.00065 1.4E-08   70.0   4.8   49  350-417   462-510 (558)
 49 KOG3553 Tax interaction protei  97.0 0.00066 1.4E-08   54.7   3.0   32  350-392    59-90  (124)
 50 PF05580 Peptidase_S55:  SpoIVB  96.7   0.022 4.7E-07   52.8  11.1  164  145-314    13-213 (218)
 51 PF00548 Peptidase_C3:  3C cyst  96.6   0.049 1.1E-06   49.2  12.8  136  151-297    24-170 (172)
 52 PF03761 DUF316:  Domain of unk  96.6    0.16 3.4E-06   49.3  17.2   91  197-297   159-254 (282)
 53 KOG3129 26S proteasome regulat  96.0   0.012 2.6E-07   53.9   5.4   56  351-417   140-197 (231)
 54 PF08192 Peptidase_S64:  Peptid  96.0   0.042 9.1E-07   58.4  10.0  117  198-323   542-688 (695)
 55 PRK09681 putative type II secr  95.9   0.013 2.7E-07   56.7   5.3   50  356-416   210-262 (276)
 56 PF10459 Peptidase_S46:  Peptid  95.8   0.053 1.1E-06   59.2  10.0   22  152-173    47-68  (698)
 57 KOG3209 WW domain-containing p  95.6   0.014 3.1E-07   62.0   4.8   52  354-417   782-835 (984)
 58 PF00949 Peptidase_S7:  Peptida  95.4   0.016 3.5E-07   49.9   3.6   32  268-299    88-119 (132)
 59 TIGR02860 spore_IV_B stage IV   95.3    0.16 3.5E-06   51.8  11.1   38  272-313   355-392 (402)
 60 KOG3580 Tight junction protein  95.3   0.024 5.2E-07   59.2   5.1   55  350-415   429-485 (1027)
 61 PF10459 Peptidase_S46:  Peptid  95.3   0.014   3E-07   63.6   3.6   31  265-295   621-651 (698)
 62 KOG3532 Predicted protein kina  95.0   0.046 9.9E-07   58.0   6.1   62  332-411   386-447 (1051)
 63 PF02122 Peptidase_S39:  Peptid  94.4    0.13 2.7E-06   47.8   6.8  134  164-314    43-182 (203)
 64 KOG1892 Actin filament-binding  93.4     0.1 2.2E-06   57.3   5.0   57  350-416   960-1016(1629)
 65 KOG3550 Receptor targeting pro  93.0    0.17 3.7E-06   44.1   4.8   54  350-416   115-171 (207)
 66 COG3031 PulC Type II secretory  92.6    0.22 4.7E-06   46.8   5.2   48  359-417   216-263 (275)
 67 PF00947 Pico_P2A:  Picornaviru  92.5       1 2.2E-05   38.3   8.7   33  264-297    77-109 (127)
 68 PF09342 DUF1986:  Domain of un  92.4     1.7 3.7E-05   41.2  10.9   88  149-237    25-131 (267)
 69 KOG3606 Cell polarity protein   92.3    0.44 9.5E-06   45.5   6.9   66  332-407   171-243 (358)
 70 KOG3834 Golgi reassembly stack  91.8    0.18 3.9E-06   51.0   4.0   56  350-417    15-71  (462)
 71 PF00944 Peptidase_S3:  Alphavi  91.5    0.26 5.7E-06   42.2   4.1   28  271-298   100-127 (158)
 72 KOG3209 WW domain-containing p  91.1    0.47   1E-05   50.9   6.3   55  352-416   373-429 (984)
 73 KOG3549 Syntrophins (type gamm  90.9    0.36 7.9E-06   47.6   4.9   54  351-416    81-136 (505)
 74 KOG0609 Calcium/calmodulin-dep  90.6    0.42 9.2E-06   49.8   5.3   67  332-416   134-202 (542)
 75 KOG2921 Intramembrane metallop  88.8    0.64 1.4E-05   46.7   4.8   45  350-405   220-265 (484)
 76 KOG3605 Beta amyloid precursor  88.6     0.6 1.3E-05   49.6   4.7  101  274-396   677-791 (829)
 77 KOG3552 FERM domain protein FR  87.9    0.61 1.3E-05   51.5   4.4   64  332-417    65-130 (1298)
 78 KOG3542 cAMP-regulated guanine  87.5    0.46 9.9E-06   50.7   3.1   55  350-416   562-616 (1283)
 79 KOG3834 Golgi reassembly stack  87.0    0.69 1.5E-05   47.0   3.9   53  354-417   113-165 (462)
 80 KOG3580 Tight junction protein  86.9    0.85 1.8E-05   48.1   4.6   55  350-416    40-94  (1027)
 81 COG0750 Predicted membrane-ass  84.6     2.3 4.9E-05   42.9   6.5   44  356-411   135-178 (375)
 82 KOG3551 Syntrophins (type beta  84.6    0.86 1.9E-05   45.7   3.2   70  331-416    95-166 (506)
 83 PF03510 Peptidase_C24:  2C end  84.2     4.3 9.3E-05   33.5   6.6   53  156-220     3-55  (105)
 84 PF02395 Peptidase_S6:  Immunog  83.9     3.7 8.1E-05   45.6   8.1   54  153-209    66-121 (769)
 85 KOG3605 Beta amyloid precursor  83.5     1.3 2.8E-05   47.2   4.2   58  350-417   673-732 (829)
 86 PF02907 Peptidase_S29:  Hepati  82.7    0.62 1.3E-05   40.0   1.2  115  155-299    15-130 (148)
 87 KOG3571 Dishevelled 3 and rela  82.3     2.1 4.6E-05   44.4   5.0   37  350-396   277-313 (626)
 88 KOG0606 Microtubule-associated  74.7       4 8.6E-05   46.4   4.6   50  353-415   661-712 (1205)
 89 PF01732 DUF31:  Putative pepti  73.2     2.5 5.4E-05   42.9   2.6   23  273-295   351-373 (374)
 90 KOG3651 Protein kinase C, alph  69.4      10 0.00022   37.1   5.5   47  350-406    30-78  (429)
 91 PF05416 Peptidase_C37:  Southa  62.1      26 0.00056   36.0   7.0  135  151-298   378-527 (535)
 92 PF11874 DUF3394:  Domain of un  53.6      56  0.0012   29.8   7.1   37  335-389   114-150 (183)
 93 cd00600 Sm_like The eukaryotic  52.4      41 0.00088   24.3   5.2   33  176-208     7-39  (63)
 94 cd01735 LSm12_N LSm12 belongs   50.6      57  0.0012   24.2   5.5   33  176-208     7-39  (61)
 95 cd01720 Sm_D2 The eukaryotic S  50.1      31 0.00067   27.5   4.4   37  171-207    10-46  (87)
 96 KOG3938 RGS-GAIP interacting p  50.0      16 0.00035   35.2   3.1   56  352-417   151-208 (334)
 97 cd06168 LSm9 The eukaryotic Sm  46.0      52  0.0011   25.4   4.9   32  176-207    11-42  (75)
 98 cd01731 archaeal_Sm1 The archa  46.0      53  0.0011   24.5   4.9   33  176-208    11-43  (68)
 99 PRK00737 small nuclear ribonuc  46.0      52  0.0011   25.0   4.9   33  176-208    15-47  (72)
100 cd01726 LSm6 The eukaryotic Sm  46.0      48   0.001   24.7   4.7   32  176-207    11-42  (67)
101 cd01722 Sm_F The eukaryotic Sm  45.6      44 0.00095   25.0   4.4   32  176-207    12-43  (68)
102 PF00571 CBS:  CBS domain CBS d  41.7      20 0.00044   25.0   2.0   21  276-296    28-48  (57)
103 cd01717 Sm_B The eukaryotic Sm  41.7      59  0.0013   25.1   4.8   32  176-207    11-42  (79)
104 cd01730 LSm3 The eukaryotic Sm  41.0      52  0.0011   25.7   4.4   31  176-206    12-42  (82)
105 cd01729 LSm7 The eukaryotic Sm  40.9      66  0.0014   25.1   4.9   31  176-206    13-43  (81)
106 COG5233 GRH1 Peripheral Golgi   39.7      17 0.00037   35.8   1.6   30  354-394    67-96  (417)
107 cd01732 LSm5 The eukaryotic Sm  38.7      64  0.0014   24.9   4.5   31  176-206    14-44  (76)
108 cd01728 LSm1 The eukaryotic Sm  38.5      77  0.0017   24.3   4.9   31  176-206    13-43  (74)
109 PF12381 Peptidase_C3G:  Tungro  38.4      23  0.0005   33.1   2.2   55  265-323   168-228 (231)
110 PF09122 DUF1930:  Domain of un  38.3      78  0.0017   23.6   4.5   35  382-416    19-54  (68)
111 cd01719 Sm_G The eukaryotic Sm  38.0      82  0.0018   24.0   4.9   32  176-207    11-42  (72)
112 smart00651 Sm snRNP Sm protein  36.9      87  0.0019   22.9   4.9   32  176-207     9-40  (67)
113 cd01721 Sm_D3 The eukaryotic S  35.6      92   0.002   23.5   4.9   32  176-207    11-42  (70)
114 PF01423 LSM:  LSM domain ;  In  35.5      62  0.0014   23.7   3.9   33  176-208     9-41  (67)
115 cd01727 LSm8 The eukaryotic Sm  34.2      93   0.002   23.7   4.7   32  176-207    10-41  (74)
116 COG0298 HypC Hydrogenase matur  32.6      67  0.0014   25.2   3.6   47  188-236     5-52  (82)
117 COG1958 LSM1 Small nuclear rib  31.6      93   0.002   23.9   4.4   33  176-208    18-50  (79)
118 PF02601 Exonuc_VII_L:  Exonucl  30.3      60  0.0013   32.0   3.9   35  152-186   280-314 (319)
119 PF14827 Cache_3:  Sensory doma  28.6      49  0.0011   27.3   2.5   18  281-298    94-111 (116)
120 PF05578 Peptidase_S31:  Pestiv  26.5 1.5E+02  0.0032   26.2   5.1  131  152-297    51-182 (211)
121 COG0061 nadF NAD kinase [Coenz  26.2      27 0.00058   34.0   0.6   32    2-33    179-210 (281)
122 PF09465 LBR_tudor:  Lamin-B re  25.7 2.7E+02  0.0058   20.3   5.6   35  174-208     8-43  (55)
123 cd01723 LSm4 The eukaryotic Sm  25.6 1.8E+02  0.0039   22.2   5.1   32  176-207    12-43  (76)
124 PF01455 HupF_HypC:  HupF/HypC   24.1 2.1E+02  0.0045   21.6   5.0   43  188-233     5-47  (68)
125 PF02743 Cache_1:  Cache domain  23.3      58  0.0013   24.7   1.9   30  281-323    19-48  (81)
126 PF01732 DUF31:  Putative pepti  23.2      51  0.0011   33.4   2.0   23  151-173    35-67  (374)
127 PF14438 SM-ATX:  Ataxin 2 SM d  23.2 1.9E+02  0.0041   21.9   4.8   28  176-203    13-43  (77)
128 cd04627 CBS_pair_14 The CBS do  22.4      61  0.0013   26.2   2.0   21  277-297    98-118 (123)
129 cd01725 LSm2 The eukaryotic Sm  20.7 2.4E+02  0.0052   21.9   4.9   32  176-207    12-43  (81)
130 cd04603 CBS_pair_KefB_assoc Th  20.5      74  0.0016   25.3   2.1   20  277-296    86-105 (111)

No 1  
>PRK10139 serine endoprotease; Provisional
Probab=100.00  E-value=3e-51  Score=421.49  Aligned_cols=286  Identities=39%  Similarity=0.627  Sum_probs=249.3

Q ss_pred             hHHHHHHHcCCceEEEEeeecccC------c----ccc---cc-ccCCCeEEEEEEEcC-CcEEEecccccCCCCeEEEE
Q 014786          117 ATVRLFQENTPSVVNITNLAARQD------A----FTL---DV-LEVPQGSGSGFVWDS-KGHVVTNYHVIRGASDIRVT  181 (418)
Q Consensus       117 ~~~~~~~~~~~SVV~I~~~~~~~~------~----~~~---~~-~~~~~~~GSGfiI~~-~G~ILT~aHvv~~~~~i~V~  181 (418)
                      .+.++++++.||||.|.+......      .    |..   +. ....++.||||||++ +||||||+|||++++.+.|+
T Consensus        41 ~~~~~~~~~~pavV~i~~~~~~~~~~~~~~~~~~~f~~~~~~~~~~~~~~~GSG~ii~~~~g~IlTn~HVv~~a~~i~V~  120 (455)
T PRK10139         41 SLAPMLEKVLPAVVSVRVEGTASQGQKIPEEFKKFFGDDLPDQPAQPFEGLGSGVIIDAAKGYVLTNNHVINQAQKISIQ  120 (455)
T ss_pred             cHHHHHHHhCCcEEEEEEEEeecccccCchhHHHhccccCCccccccccceEEEEEEECCCCEEEeChHHhCCCCEEEEE
Confidence            588999999999999987653221      1    110   00 112357899999985 79999999999999999999


Q ss_pred             eCCCcEEEEEEEEEcCCCCeEEEEecCCCCCCcccccCCCCCCCCCCEEEEEeCCCCCCCceeEeEEeeeeeeecccCCC
Q 014786          182 FADQSAYDAKIVGFDQDKDVAVLRIDAPKDKLRPIPIGVSADLLVGQKVYAIGNPFGLDHTLTTGVISGLRREISSAATG  261 (418)
Q Consensus       182 ~~dg~~~~a~vv~~d~~~DlAlLkv~~~~~~~~~~~l~~~~~~~~G~~V~~vG~p~g~~~~~~~G~Vs~~~~~~~~~~~~  261 (418)
                      +.|++.++|++++.|+++||||||++.+ ..+++++++++..+++||+|+++|||++...+++.|+|++..+.....   
T Consensus       121 ~~dg~~~~a~vvg~D~~~DlAvlkv~~~-~~l~~~~lg~s~~~~~G~~V~aiG~P~g~~~tvt~GivS~~~r~~~~~---  196 (455)
T PRK10139        121 LNDGREFDAKLIGSDDQSDIALLQIQNP-SKLTQIAIADSDKLRVGDFAVAVGNPFGLGQTATSGIISALGRSGLNL---  196 (455)
T ss_pred             ECCCCEEEEEEEEEcCCCCEEEEEecCC-CCCceeEecCccccCCCCEEEEEecCCCCCCceEEEEEccccccccCC---
Confidence            9999999999999999999999999854 368999999999999999999999999999999999999887752211   


Q ss_pred             CCcccEEEEccccCCCCCCCceeCCCceEEEEEeeeeCCCCCCCCccceeecccchhhhhhhhhcccccccccCeeecc-
Q 014786          262 RPIQDVIQTDAAINPGNSGGPLLDSSGSLIGINTAIYSPSGASSGVGFSIPVDTVNGIVDQLVKFGKVTRPILGIKFAP-  340 (418)
Q Consensus       262 ~~~~~~i~~~~~i~~G~SGGPl~n~~G~VVGI~s~~~~~~~~~~~~~~aIp~~~i~~~l~~l~~~g~v~~~~lGv~~~~-  340 (418)
                      ..+.+++++|+++++|+|||||+|.+|+||||+++.+.++++..+++|+||++.+++++++|+++|++.|+|||+.+++ 
T Consensus       197 ~~~~~~iqtda~in~GnSGGpl~n~~G~vIGi~~~~~~~~~~~~gigfaIP~~~~~~v~~~l~~~g~v~r~~LGv~~~~l  276 (455)
T PRK10139        197 EGLENFIQTDASINRGNSGGALLNLNGELIGINTAILAPGGGSVGIGFAIPSNMARTLAQQLIDFGEIKRGLLGIKGTEM  276 (455)
T ss_pred             CCcceEEEECCccCCCCCcceEECCCCeEEEEEEEEEcCCCCccceEEEEEhHHHHHHHHHHhhcCcccccceeEEEEEC
Confidence            2245789999999999999999999999999999988776677899999999999999999999999999999999886 


Q ss_pred             -chhhhhhCc---cCcEEEecCCCChhhhcCccccccccCCCccCCcEEEEECCEEeCCHHHHHHHHhcCCCCCEEEEEE
Q 014786          341 -DQSVEQLGV---SGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQCKVGDEVSCFT  416 (418)
Q Consensus       341 -~~~~~~~~~---~G~~V~~v~~~~pa~~aGl~~~~~~~~~~l~~GDiIl~ing~~v~~~~dl~~~l~~~~~g~~v~l~v  416 (418)
                       .+.++.+++   .|++|.+|.++|||+++||++           ||+|++|||++|.++.|+.+.+...++|++++++|
T Consensus       277 ~~~~~~~lgl~~~~Gv~V~~V~~~SpA~~AGL~~-----------GDvIl~InG~~V~s~~dl~~~l~~~~~g~~v~l~V  345 (455)
T PRK10139        277 SADIAKAFNLDVQRGAFVSEVLPNSGSAKAGVKA-----------GDIITSLNGKPLNSFAELRSRIATTEPGTKVKLGL  345 (455)
T ss_pred             CHHHHHhcCCCCCCceEEEEECCCChHHHCCCCC-----------CCEEEEECCEECCCHHHHHHHHHhcCCCCEEEEEE
Confidence             344666776   699999999999999999999           99999999999999999999998878899998887


Q ss_pred             E
Q 014786          417 F  417 (418)
Q Consensus       417 ~  417 (418)
                      +
T Consensus       346 ~  346 (455)
T PRK10139        346 L  346 (455)
T ss_pred             E
Confidence            4


No 2  
>PRK10898 serine endoprotease; Provisional
Probab=100.00  E-value=5.9e-50  Score=400.28  Aligned_cols=287  Identities=34%  Similarity=0.527  Sum_probs=245.5

Q ss_pred             CccchhHHHHHHHcCCceEEEEeeecccCccccccccCCCeEEEEEEEcCCcEEEecccccCCCCeEEEEeCCCcEEEEE
Q 014786          112 QTDELATVRLFQENTPSVVNITNLAARQDAFTLDVLEVPQGSGSGFVWDSKGHVVTNYHVIRGASDIRVTFADQSAYDAK  191 (418)
Q Consensus       112 ~~~~~~~~~~~~~~~~SVV~I~~~~~~~~~~~~~~~~~~~~~GSGfiI~~~G~ILT~aHvv~~~~~i~V~~~dg~~~~a~  191 (418)
                      ...+....++++++.||||.|.........   .......+.||||+|+++||||||+|||++++.+.|++.||+.++|+
T Consensus        41 ~~~~~~~~~~~~~~~psvV~v~~~~~~~~~---~~~~~~~~~GSGfvi~~~G~IlTn~HVv~~a~~i~V~~~dg~~~~a~  117 (353)
T PRK10898         41 DETPASYNQAVRRAAPAVVNVYNRSLNSTS---HNQLEIRTLGSGVIMDQRGYILTNKHVINDADQIIVALQDGRVFEAL  117 (353)
T ss_pred             ccccchHHHHHHHhCCcEEEEEeEeccccC---cccccccceeeEEEEeCCeEEEecccEeCCCCEEEEEeCCCCEEEEE
Confidence            334457889999999999999886532211   11112357899999999999999999999999999999999999999


Q ss_pred             EEEEcCCCCeEEEEecCCCCCCcccccCCCCCCCCCCEEEEEeCCCCCCCceeEeEEeeeeeeecccCCCCCcccEEEEc
Q 014786          192 IVGFDQDKDVAVLRIDAPKDKLRPIPIGVSADLLVGQKVYAIGNPFGLDHTLTTGVISGLRREISSAATGRPIQDVIQTD  271 (418)
Q Consensus       192 vv~~d~~~DlAlLkv~~~~~~~~~~~l~~~~~~~~G~~V~~vG~p~g~~~~~~~G~Vs~~~~~~~~~~~~~~~~~~i~~~  271 (418)
                      ++++|+.+||||||++..  .+++++++++..+++||+|+++|||++...+++.|+|++..+....   .....+++++|
T Consensus       118 vv~~d~~~DlAvl~v~~~--~l~~~~l~~~~~~~~G~~V~aiG~P~g~~~~~t~Giis~~~r~~~~---~~~~~~~iqtd  192 (353)
T PRK10898        118 LVGSDSLTDLAVLKINAT--NLPVIPINPKRVPHIGDVVLAIGNPYNLGQTITQGIISATGRIGLS---PTGRQNFLQTD  192 (353)
T ss_pred             EEEEcCCCCEEEEEEcCC--CCCeeeccCcCcCCCCCEEEEEeCCCCcCCCcceeEEEeccccccC---CccccceEEec
Confidence            999999999999999873  5888999988889999999999999998889999999987765322   12234789999


Q ss_pred             cccCCCCCCCceeCCCceEEEEEeeeeCCCC---CCCCccceeecccchhhhhhhhhcccccccccCeeeccc--hhhhh
Q 014786          272 AAINPGNSGGPLLDSSGSLIGINTAIYSPSG---ASSGVGFSIPVDTVNGIVDQLVKFGKVTRPILGIKFAPD--QSVEQ  346 (418)
Q Consensus       272 ~~i~~G~SGGPl~n~~G~VVGI~s~~~~~~~---~~~~~~~aIp~~~i~~~l~~l~~~g~v~~~~lGv~~~~~--~~~~~  346 (418)
                      +++++|+|||||+|.+||||||+++.+...+   ...+++|+||++.+++++++|+++|++.++|||+.+++.  ..++.
T Consensus       193 a~i~~GnSGGPl~n~~G~vvGI~~~~~~~~~~~~~~~g~~faIP~~~~~~~~~~l~~~G~~~~~~lGi~~~~~~~~~~~~  272 (353)
T PRK10898        193 ASINHGNSGGALVNSLGELMGINTLSFDKSNDGETPEGIGFAIPTQLATKIMDKLIRDGRVIRGYIGIGGREIAPLHAQG  272 (353)
T ss_pred             cccCCCCCcceEECCCCeEEEEEEEEecccCCCCcccceEEEEchHHHHHHHHHHhhcCcccccccceEEEECCHHHHHh
Confidence            9999999999999999999999998775432   236899999999999999999999999999999998753  22334


Q ss_pred             hCc---cCcEEEecCCCChhhhcCccccccccCCCccCCcEEEEECCEEeCCHHHHHHHHhcCCCCCEEEEEEE
Q 014786          347 LGV---SGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQCKVGDEVSCFTF  417 (418)
Q Consensus       347 ~~~---~G~~V~~v~~~~pa~~aGl~~~~~~~~~~l~~GDiIl~ing~~v~~~~dl~~~l~~~~~g~~v~l~v~  417 (418)
                      +++   .|++|.+|.+++||+++||++           ||+|++|||++|.+..|+.+.+...++|++++++|+
T Consensus       273 ~~~~~~~Gv~V~~V~~~spA~~aGL~~-----------GDvI~~Ing~~V~s~~~l~~~l~~~~~g~~v~l~v~  335 (353)
T PRK10898        273 GGIDQLQGIVVNEVSPDGPAAKAGIQV-----------NDLIISVNNKPAISALETMDQVAEIRPGSVIPVVVM  335 (353)
T ss_pred             cCCCCCCeEEEEEECCCChHHHcCCCC-----------CCEEEEECCEEcCCHHHHHHHHHhcCCCCEEEEEEE
Confidence            443   799999999999999999999           999999999999999999999988889999998874


No 3  
>TIGR02038 protease_degS periplasmic serine peptidase DegS. This family consists of the periplasmic serine protease DegS (HhoB), a shorter paralog of protease DO (HtrA, DegP) and DegQ (HhoA). It is found in E. coli and several other Proteobacteria of the gamma subdivision. It contains a trypsin domain and a single copy of the PDZ domain (in contrast to DegP, which has two copies). A critical role of DegS is to sense stress in the periplasm and partially degrade an inhibitor of sigma(E).
Probab=100.00  E-value=7.5e-50  Score=399.67  Aligned_cols=287  Identities=36%  Similarity=0.604  Sum_probs=248.3

Q ss_pred             CccchhHHHHHHHcCCceEEEEeeecccCccccccccCCCeEEEEEEEcCCcEEEecccccCCCCeEEEEeCCCcEEEEE
Q 014786          112 QTDELATVRLFQENTPSVVNITNLAARQDAFTLDVLEVPQGSGSGFVWDSKGHVVTNYHVIRGASDIRVTFADQSAYDAK  191 (418)
Q Consensus       112 ~~~~~~~~~~~~~~~~SVV~I~~~~~~~~~~~~~~~~~~~~~GSGfiI~~~G~ILT~aHvv~~~~~i~V~~~dg~~~~a~  191 (418)
                      ...+.+..++++++.||||.|.+.....+.+   ......+.||||+|+++||||||+|||.+++.+.|.+.||+.++|+
T Consensus        41 ~~~~~~~~~~~~~~~psVV~I~~~~~~~~~~---~~~~~~~~GSG~vi~~~G~IlTn~HVV~~~~~i~V~~~dg~~~~a~  117 (351)
T TIGR02038        41 NTVEISFNKAVRRAAPAVVNIYNRSISQNSL---NQLSIQGLGSGVIMSKEGYILTNYHVIKKADQIVVALQDGRKFEAE  117 (351)
T ss_pred             cccchhHHHHHHhcCCcEEEEEeEecccccc---ccccccceEEEEEEeCCeEEEecccEeCCCCEEEEEECCCCEEEEE
Confidence            4455678899999999999999865433211   1123357899999999999999999999999999999999999999


Q ss_pred             EEEEcCCCCeEEEEecCCCCCCcccccCCCCCCCCCCEEEEEeCCCCCCCceeEeEEeeeeeeecccCCCCCcccEEEEc
Q 014786          192 IVGFDQDKDVAVLRIDAPKDKLRPIPIGVSADLLVGQKVYAIGNPFGLDHTLTTGVISGLRREISSAATGRPIQDVIQTD  271 (418)
Q Consensus       192 vv~~d~~~DlAlLkv~~~~~~~~~~~l~~~~~~~~G~~V~~vG~p~g~~~~~~~G~Vs~~~~~~~~~~~~~~~~~~i~~~  271 (418)
                      ++++|+.+||||||++..  .+++++++++..+++||+|+++|||++...+++.|+|+...+....   .....+++++|
T Consensus       118 vv~~d~~~DlAvlkv~~~--~~~~~~l~~s~~~~~G~~V~aiG~P~~~~~s~t~GiIs~~~r~~~~---~~~~~~~iqtd  192 (351)
T TIGR02038       118 LVGSDPLTDLAVLKIEGD--NLPTIPVNLDRPPHVGDVVLAIGNPYNLGQTITQGIISATGRNGLS---SVGRQNFIQTD  192 (351)
T ss_pred             EEEecCCCCEEEEEecCC--CCceEeccCcCccCCCCEEEEEeCCCCCCCcEEEEEEEeccCcccC---CCCcceEEEEC
Confidence            999999999999999874  4888999888889999999999999999899999999988764321   12235789999


Q ss_pred             cccCCCCCCCceeCCCceEEEEEeeeeCCC--CCCCCccceeecccchhhhhhhhhcccccccccCeeecc--chhhhhh
Q 014786          272 AAINPGNSGGPLLDSSGSLIGINTAIYSPS--GASSGVGFSIPVDTVNGIVDQLVKFGKVTRPILGIKFAP--DQSVEQL  347 (418)
Q Consensus       272 ~~i~~G~SGGPl~n~~G~VVGI~s~~~~~~--~~~~~~~~aIp~~~i~~~l~~l~~~g~v~~~~lGv~~~~--~~~~~~~  347 (418)
                      +.+++|+|||||+|.+|+||||+++.+...  +...+++|+||++.+++++++++++|++.|+|||+.+++  ...++.+
T Consensus       193 a~i~~GnSGGpl~n~~G~vIGI~~~~~~~~~~~~~~g~~faIP~~~~~~vl~~l~~~g~~~r~~lGv~~~~~~~~~~~~l  272 (351)
T TIGR02038       193 AAINAGNSGGALINTNGELVGINTASFQKGGDEGGEGINFAIPIKLAHKIMGKIIRDGRVIRGYIGVSGEDINSVVAQGL  272 (351)
T ss_pred             CccCCCCCcceEECCCCeEEEEEeeeecccCCCCccceEEEecHHHHHHHHHHHhhcCcccceEeeeEEEECCHHHHHhc
Confidence            999999999999999999999999876432  234689999999999999999999999999999999886  3345567


Q ss_pred             Cc---cCcEEEecCCCChhhhcCccccccccCCCccCCcEEEEECCEEeCCHHHHHHHHhcCCCCCEEEEEEE
Q 014786          348 GV---SGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQCKVGDEVSCFTF  417 (418)
Q Consensus       348 ~~---~G~~V~~v~~~~pa~~aGl~~~~~~~~~~l~~GDiIl~ing~~v~~~~dl~~~l~~~~~g~~v~l~v~  417 (418)
                      |+   .|++|.+|.+++||+++||++           ||+|++|||++|.+++|+.+.+...++|++++++|+
T Consensus       273 gl~~~~Gv~V~~V~~~spA~~aGL~~-----------GDvI~~Ing~~V~s~~dl~~~l~~~~~g~~v~l~v~  334 (351)
T TIGR02038       273 GLPDLRGIVITGVDPNGPAARAGILV-----------RDVILKYDGKDVIGAEELMDRIAETRPGSKVMVTVL  334 (351)
T ss_pred             CCCccccceEeecCCCChHHHCCCCC-----------CCEEEEECCEEcCCHHHHHHHHHhcCCCCEEEEEEE
Confidence            76   599999999999999999999           999999999999999999999998889999998874


No 4  
>PRK10942 serine endoprotease; Provisional
Probab=100.00  E-value=3.3e-49  Score=408.11  Aligned_cols=286  Identities=38%  Similarity=0.600  Sum_probs=248.9

Q ss_pred             hHHHHHHHcCCceEEEEeeecccC-----------ccccc--------------------------cccCCCeEEEEEEE
Q 014786          117 ATVRLFQENTPSVVNITNLAARQD-----------AFTLD--------------------------VLEVPQGSGSGFVW  159 (418)
Q Consensus       117 ~~~~~~~~~~~SVV~I~~~~~~~~-----------~~~~~--------------------------~~~~~~~~GSGfiI  159 (418)
                      ++.++++++.||||.|.+......           +|...                          .....++.||||||
T Consensus        39 ~~~~~~~~~~pavv~i~~~~~~~~~~~~~~~~~~~ff~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~GSG~ii  118 (473)
T PRK10942         39 SLAPMLEKVMPSVVSINVEGSTTVNTPRMPRQFQQFFGDNSPFCQEGSPFQSSPFCQGGQGGNGGGQQQKFMALGSGVII  118 (473)
T ss_pred             cHHHHHHHhCCceEEEEEEEeccccCCCCChhHHHhhcccccccccccccccccccccccccccccccccccceEEEEEE
Confidence            588999999999999988653211           01100                          00112468999999


Q ss_pred             cC-CcEEEecccccCCCCeEEEEeCCCcEEEEEEEEEcCCCCeEEEEecCCCCCCcccccCCCCCCCCCCEEEEEeCCCC
Q 014786          160 DS-KGHVVTNYHVIRGASDIRVTFADQSAYDAKIVGFDQDKDVAVLRIDAPKDKLRPIPIGVSADLLVGQKVYAIGNPFG  238 (418)
Q Consensus       160 ~~-~G~ILT~aHvv~~~~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkv~~~~~~~~~~~l~~~~~~~~G~~V~~vG~p~g  238 (418)
                      ++ +||||||+|||.++++++|.+.|++.|+|++++.|+.+||||||++.. ..+++++++++..+++||+|+++|||++
T Consensus       119 ~~~~G~IlTn~HVv~~a~~i~V~~~dg~~~~a~vv~~D~~~DlAvlki~~~-~~l~~~~lg~s~~l~~G~~V~aiG~P~g  197 (473)
T PRK10942        119 DADKGYVVTNNHVVDNATKIKVQLSDGRKFDAKVVGKDPRSDIALIQLQNP-KNLTAIKMADSDALRVGDYTVAIGNPYG  197 (473)
T ss_pred             ECCCCEEEeChhhcCCCCEEEEEECCCCEEEEEEEEecCCCCEEEEEecCC-CCCceeEecCccccCCCCEEEEEcCCCC
Confidence            96 599999999999999999999999999999999999999999999753 3689999999999999999999999999


Q ss_pred             CCCceeEeEEeeeeeeecccCCCCCcccEEEEccccCCCCCCCceeCCCceEEEEEeeeeCCCCCCCCccceeecccchh
Q 014786          239 LDHTLTTGVISGLRREISSAATGRPIQDVIQTDAAINPGNSGGPLLDSSGSLIGINTAIYSPSGASSGVGFSIPVDTVNG  318 (418)
Q Consensus       239 ~~~~~~~G~Vs~~~~~~~~~~~~~~~~~~i~~~~~i~~G~SGGPl~n~~G~VVGI~s~~~~~~~~~~~~~~aIp~~~i~~  318 (418)
                      ...+++.|+|+++.+....   ...+.+++++|+++++|+|||||+|.+|+||||+++.+.++++..+++|+||++.+++
T Consensus       198 ~~~tvt~GiVs~~~r~~~~---~~~~~~~iqtda~i~~GnSGGpL~n~~GeviGI~t~~~~~~g~~~g~gfaIP~~~~~~  274 (473)
T PRK10942        198 LGETVTSGIVSALGRSGLN---VENYENFIQTDAAINRGNSGGALVNLNGELIGINTAILAPDGGNIGIGFAIPSNMVKN  274 (473)
T ss_pred             CCcceeEEEEEEeecccCC---cccccceEEeccccCCCCCcCccCCCCCeEEEEEEEEEcCCCCcccEEEEEEHHHHHH
Confidence            9999999999988764211   1234578999999999999999999999999999998887776789999999999999


Q ss_pred             hhhhhhhcccccccccCeeecc--chhhhhhCc---cCcEEEecCCCChhhhcCccccccccCCCccCCcEEEEECCEEe
Q 014786          319 IVDQLVKFGKVTRPILGIKFAP--DQSVEQLGV---SGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKV  393 (418)
Q Consensus       319 ~l~~l~~~g~v~~~~lGv~~~~--~~~~~~~~~---~G~~V~~v~~~~pa~~aGl~~~~~~~~~~l~~GDiIl~ing~~v  393 (418)
                      ++++|+++|++.|+|||+.+++  ...++.+++   .|++|.+|.+++||+++||+.           ||+|++|||++|
T Consensus       275 v~~~l~~~g~v~rg~lGv~~~~l~~~~a~~~~l~~~~GvlV~~V~~~SpA~~AGL~~-----------GDvIl~InG~~V  343 (473)
T PRK10942        275 LTSQMVEYGQVKRGELGIMGTELNSELAKAMKVDAQRGAFVSQVLPNSSAAKAGIKA-----------GDVITSLNGKPI  343 (473)
T ss_pred             HHHHHHhccccccceeeeEeeecCHHHHHhcCCCCCCceEEEEECCCChHHHcCCCC-----------CCEEEEECCEEC
Confidence            9999999999999999999886  344667776   599999999999999999999           999999999999


Q ss_pred             CCHHHHHHHHhcCCCCCEEEEEEE
Q 014786          394 SNGSDLYRILDQCKVGDEVSCFTF  417 (418)
Q Consensus       394 ~~~~dl~~~l~~~~~g~~v~l~v~  417 (418)
                      .+++|+.+.+...++|++++++|+
T Consensus       344 ~s~~dl~~~l~~~~~g~~v~l~v~  367 (473)
T PRK10942        344 SSFAALRAQVGTMPVGSKLTLGLL  367 (473)
T ss_pred             CCHHHHHHHHHhcCCCCEEEEEEE
Confidence            999999999998889999998874


No 5  
>TIGR02037 degP_htrA_DO periplasmic serine protease, Do/DegQ family. This family consists of a set of proteins variously designated DegP, heat shock protein HtrA, and protease DO. The ortholog in Pseudomonas aeruginosa is designated MucD and is found in an operon that controls mucoid phenotype. This family also includes the DegQ (HhoA) paralog in E. coli, which can rescue a DegP mutant, but not the smaller DegS paralog, which cannot. Members of this family are located in the periplasm and have separable functions as both protease and chaperone. Members have a trypsin domain and two copies of a PDZ domain. This protein protects bacteria from thermal and other stresses and may be important for the survival of bacterial pathogens.// The chaperone function is dominant at low temperatures, whereas the proteolytic activity is turned on at elevated temperatures.
Probab=100.00  E-value=1.3e-47  Score=393.81  Aligned_cols=285  Identities=45%  Similarity=0.682  Sum_probs=249.5

Q ss_pred             HHHHHHHcCCceEEEEeeecccC-------------ccccc--------cccCCCeEEEEEEEcCCcEEEecccccCCCC
Q 014786          118 TVRLFQENTPSVVNITNLAARQD-------------AFTLD--------VLEVPQGSGSGFVWDSKGHVVTNYHVIRGAS  176 (418)
Q Consensus       118 ~~~~~~~~~~SVV~I~~~~~~~~-------------~~~~~--------~~~~~~~~GSGfiI~~~G~ILT~aHvv~~~~  176 (418)
                      +.++++++.||||.|.+......             +|...        ......+.||||+|+++||||||+||+.++.
T Consensus         3 ~~~~~~~~~p~vv~i~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~GSGfii~~~G~IlTn~Hvv~~~~   82 (428)
T TIGR02037         3 FAPLVEKVAPAVVNISVEGTVKRRNRPPALPPFFRQFFGDDMPNFPRQQRERKVRGLGSGVIISADGYILTNNHVVDGAD   82 (428)
T ss_pred             HHHHHHHhCCceEEEEEEEEecccCCCcccchhHHHhhcccccCcccccccccccceeeEEEECCCCEEEEcHHHcCCCC
Confidence            56889999999999988652211             11100        0112457899999999999999999999999


Q ss_pred             eEEEEeCCCcEEEEEEEEEcCCCCeEEEEecCCCCCCcccccCCCCCCCCCCEEEEEeCCCCCCCceeEeEEeeeeeeec
Q 014786          177 DIRVTFADQSAYDAKIVGFDQDKDVAVLRIDAPKDKLRPIPIGVSADLLVGQKVYAIGNPFGLDHTLTTGVISGLRREIS  256 (418)
Q Consensus       177 ~i~V~~~dg~~~~a~vv~~d~~~DlAlLkv~~~~~~~~~~~l~~~~~~~~G~~V~~vG~p~g~~~~~~~G~Vs~~~~~~~  256 (418)
                      .+.|++.|++.++|++++.|+.+||||||++.. ..+++++++++..+++|++|+++|||++...+++.|+|+...+...
T Consensus        83 ~i~V~~~~~~~~~a~vv~~d~~~DlAllkv~~~-~~~~~~~l~~~~~~~~G~~v~aiG~p~g~~~~~t~G~vs~~~~~~~  161 (428)
T TIGR02037        83 EITVTLSDGREFKAKLVGKDPRTDIAVLKIDAK-KNLPVIKLGDSDKLRVGDWVLAIGNPFGLGQTVTSGIVSALGRSGL  161 (428)
T ss_pred             eEEEEeCCCCEEEEEEEEecCCCCEEEEEecCC-CCceEEEccCCCCCCCCCEEEEEECCCcCCCcEEEEEEEecccCcc
Confidence            999999999999999999999999999999875 3689999998888999999999999999999999999998776521


Q ss_pred             ccCCCCCcccEEEEccccCCCCCCCceeCCCceEEEEEeeeeCCCCCCCCccceeecccchhhhhhhhhcccccccccCe
Q 014786          257 SAATGRPIQDVIQTDAAINPGNSGGPLLDSSGSLIGINTAIYSPSGASSGVGFSIPVDTVNGIVDQLVKFGKVTRPILGI  336 (418)
Q Consensus       257 ~~~~~~~~~~~i~~~~~i~~G~SGGPl~n~~G~VVGI~s~~~~~~~~~~~~~~aIp~~~i~~~l~~l~~~g~v~~~~lGv  336 (418)
                         ....+..++++|+++++|+|||||+|.+|+||||+++.+...++..+++|+||++.+++++++++++|++.++|||+
T Consensus       162 ---~~~~~~~~i~tda~i~~GnSGGpl~n~~G~viGI~~~~~~~~g~~~g~~faiP~~~~~~~~~~l~~~g~~~~~~lGi  238 (428)
T TIGR02037       162 ---GIGDYENFIQTDAAINPGNSGGPLVNLRGEVIGINTAIYSPSGGNVGIGFAIPSNMAKNVVDQLIEGGKVQRGWLGV  238 (428)
T ss_pred             ---CCCCccceEEECCCCCCCCCCCceECCCCeEEEEEeEEEcCCCCccceEEEEEhHHHHHHHHHHHhcCcCcCCcCce
Confidence               12334578999999999999999999999999999998877666778999999999999999999999999999999


Q ss_pred             eecc--chhhhhhCc---cCcEEEecCCCChhhhcCccccccccCCCccCCcEEEEECCEEeCCHHHHHHHHhcCCCCCE
Q 014786          337 KFAP--DQSVEQLGV---SGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQCKVGDE  411 (418)
Q Consensus       337 ~~~~--~~~~~~~~~---~G~~V~~v~~~~pa~~aGl~~~~~~~~~~l~~GDiIl~ing~~v~~~~dl~~~l~~~~~g~~  411 (418)
                      .+++  ...++.+|+   .|++|.+|.+++||+++||++           ||+|++|||++|.++.|+.+.+...++|++
T Consensus       239 ~~~~~~~~~~~~lgl~~~~Gv~V~~V~~~spA~~aGL~~-----------GDvI~~Vng~~i~~~~~~~~~l~~~~~g~~  307 (428)
T TIGR02037       239 TIQEVTSDLAKSLGLEKQRGALVAQVLPGSPAEKAGLKA-----------GDVILSVNGKPISSFADLRRAIGTLKPGKK  307 (428)
T ss_pred             EeecCCHHHHHHcCCCCCCceEEEEccCCCChHHcCCCC-----------CCEEEEECCEEcCCHHHHHHHHHhcCCCCE
Confidence            9986  345677887   699999999999999999999           999999999999999999999998889999


Q ss_pred             EEEEEE
Q 014786          412 VSCFTF  417 (418)
Q Consensus       412 v~l~v~  417 (418)
                      ++++|+
T Consensus       308 v~l~v~  313 (428)
T TIGR02037       308 VTLGIL  313 (428)
T ss_pred             EEEEEE
Confidence            999874


No 6  
>COG0265 DegQ Trypsin-like serine proteases, typically periplasmic, contain C-terminal PDZ domain [Posttranslational modification, protein turnover, chaperones]
Probab=100.00  E-value=1.9e-38  Score=317.61  Aligned_cols=288  Identities=45%  Similarity=0.673  Sum_probs=248.3

Q ss_pred             hhHHHHHHHcCCceEEEEeeecccC-ccccccc-c-CCCeEEEEEEEcCCcEEEecccccCCCCeEEEEeCCCcEEEEEE
Q 014786          116 LATVRLFQENTPSVVNITNLAARQD-AFTLDVL-E-VPQGSGSGFVWDSKGHVVTNYHVIRGASDIRVTFADQSAYDAKI  192 (418)
Q Consensus       116 ~~~~~~~~~~~~SVV~I~~~~~~~~-~~~~~~~-~-~~~~~GSGfiI~~~G~ILT~aHvv~~~~~i~V~~~dg~~~~a~v  192 (418)
                      ..+..+++++.|+||.|........ .|..... . ...+.||||+++++|||+|+.||+.++..+.+.+.||+.+++++
T Consensus        33 ~~~~~~~~~~~~~vV~~~~~~~~~~~~~~~~~~~~~~~~~~gSg~i~~~~g~ivTn~hVi~~a~~i~v~l~dg~~~~a~~  112 (347)
T COG0265          33 LSFATAVEKVAPAVVSIATGLTAKLRSFFPSDPPLRSAEGLGSGFIISSDGYIVTNNHVIAGAEEITVTLADGREVPAKL  112 (347)
T ss_pred             cCHHHHHHhcCCcEEEEEeeeeecchhcccCCcccccccccccEEEEcCCeEEEecceecCCcceEEEEeCCCCEEEEEE
Confidence            5778899999999999988654332 1110000 0 01589999999989999999999999999999999999999999


Q ss_pred             EEEcCCCCeEEEEecCCCCCCcccccCCCCCCCCCCEEEEEeCCCCCCCceeEeEEeeeeeeecccCCCCCcccEEEEcc
Q 014786          193 VGFDQDKDVAVLRIDAPKDKLRPIPIGVSADLLVGQKVYAIGNPFGLDHTLTTGVISGLRREISSAATGRPIQDVIQTDA  272 (418)
Q Consensus       193 v~~d~~~DlAlLkv~~~~~~~~~~~l~~~~~~~~G~~V~~vG~p~g~~~~~~~G~Vs~~~~~~~~~~~~~~~~~~i~~~~  272 (418)
                      ++.|+..|+|++|++.... ++.+.++++..++.|++++++|+|++...+++.|+++...+. . ......+.++||+|+
T Consensus       113 vg~d~~~dlavlki~~~~~-~~~~~~~~s~~l~vg~~v~aiGnp~g~~~tvt~Givs~~~r~-~-v~~~~~~~~~IqtdA  189 (347)
T COG0265         113 VGKDPISDLAVLKIDGAGG-LPVIALGDSDKLRVGDVVVAIGNPFGLGQTVTSGIVSALGRT-G-VGSAGGYVNFIQTDA  189 (347)
T ss_pred             EecCCccCEEEEEeccCCC-CceeeccCCCCcccCCEEEEecCCCCcccceeccEEeccccc-c-ccCcccccchhhccc
Confidence            9999999999999998543 888899999999999999999999999999999999999886 1 111122568899999


Q ss_pred             ccCCCCCCCceeCCCceEEEEEeeeeCCCCCCCCccceeecccchhhhhhhhhcccccccccCeeeccchhhhhhCc---
Q 014786          273 AINPGNSGGPLLDSSGSLIGINTAIYSPSGASSGVGFSIPVDTVNGIVDQLVKFGKVTRPILGIKFAPDQSVEQLGV---  349 (418)
Q Consensus       273 ~i~~G~SGGPl~n~~G~VVGI~s~~~~~~~~~~~~~~aIp~~~i~~~l~~l~~~g~v~~~~lGv~~~~~~~~~~~~~---  349 (418)
                      ++++|+||||++|.+|++|||+++.+...++..+++|+||++.++.+++++.++|++.++|+|+.+.+......+|+   
T Consensus       190 ain~gnsGgpl~n~~g~~iGint~~~~~~~~~~gigfaiP~~~~~~v~~~l~~~G~v~~~~lgv~~~~~~~~~~~g~~~~  269 (347)
T COG0265         190 AINPGNSGGPLVNIDGEVVGINTAIIAPSGGSSGIGFAIPVNLVAPVLDELISKGKVVRGYLGVIGEPLTADIALGLPVA  269 (347)
T ss_pred             ccCCCCCCCceEcCCCcEEEEEEEEecCCCCcceeEEEecHHHHHHHHHHHHHcCCccccccceEEEEcccccccCCCCC
Confidence            99999999999999999999999998876656779999999999999999999889999999999876332222443   


Q ss_pred             cCcEEEecCCCChhhhcCccccccccCCCccCCcEEEEECCEEeCCHHHHHHHHhcCCCCCEEEEEEE
Q 014786          350 SGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQCKVGDEVSCFTF  417 (418)
Q Consensus       350 ~G~~V~~v~~~~pa~~aGl~~~~~~~~~~l~~GDiIl~ing~~v~~~~dl~~~l~~~~~g~~v~l~v~  417 (418)
                      .|++|.++.+++||+++|++.           ||+|+++||+++.+..++.+.+...++|+++.++++
T Consensus       270 ~G~~V~~v~~~spa~~agi~~-----------Gdii~~vng~~v~~~~~l~~~v~~~~~g~~v~~~~~  326 (347)
T COG0265         270 AGAVVLGVLPGSPAAKAGIKA-----------GDIITAVNGKPVASLSDLVAAVASNRPGDEVALKLL  326 (347)
T ss_pred             CceEEEecCCCChHHHcCCCC-----------CCEEEEECCEEccCHHHHHHHHhccCCCCEEEEEEE
Confidence            799999999999999999999           999999999999999999999999999999999875


No 7  
>KOG1320 consensus Serine protease [Posttranslational modification, protein turnover, chaperones]
Probab=99.95  E-value=6.7e-27  Score=236.05  Aligned_cols=291  Identities=37%  Similarity=0.543  Sum_probs=232.0

Q ss_pred             chhHHHHHHHcCCceEEEEeeecccCccccccccCCCeEEEEEEEcCCcEEEecccccCCCC-----------eEEEEeC
Q 014786          115 ELATVRLFQENTPSVVNITNLAARQDAFTLDVLEVPQGSGSGFVWDSKGHVVTNYHVIRGAS-----------DIRVTFA  183 (418)
Q Consensus       115 ~~~~~~~~~~~~~SVV~I~~~~~~~~~~~~~~~~~~~~~GSGfiI~~~G~ILT~aHvv~~~~-----------~i~V~~~  183 (418)
                      ......+.++-.+++|.|+...-..........+.+...||||||+.+|.++||+||+....           .+.|...
T Consensus       127 ~~~v~~~~~~cd~Avv~Ie~~~f~~~~~~~e~~~ip~l~~S~~Vv~gd~i~VTnghV~~~~~~~y~~~~~~l~~vqi~aa  206 (473)
T KOG1320|consen  127 KAFVAAVFEECDLAVVYIESEEFWKGMNPFELGDIPSLNGSGFVVGGDGIIVTNGHVVRVEPRIYAHSSTVLLRVQIDAA  206 (473)
T ss_pred             hhhHHHhhhcccceEEEEeeccccCCCcccccCCCcccCccEEEEcCCcEEEEeeEEEEEEeccccCCCcceeeEEEEEe
Confidence            34456788999999999998544333322334456788999999999999999999997543           2677776


Q ss_pred             CC--cEEEEEEEEEcCCCCeEEEEecCCCCCCcccccCCCCCCCCCCEEEEEeCCCCCCCceeEeEEeeeeeeecccCC-
Q 014786          184 DQ--SAYDAKIVGFDQDKDVAVLRIDAPKDKLRPIPIGVSADLLVGQKVYAIGNPFGLDHTLTTGVISGLRREISSAAT-  260 (418)
Q Consensus       184 dg--~~~~a~vv~~d~~~DlAlLkv~~~~~~~~~~~l~~~~~~~~G~~V~~vG~p~g~~~~~~~G~Vs~~~~~~~~~~~-  260 (418)
                      +|  ..+++.+.+.|+..|+|+++++.+..-.++++++.+..+..|+++..+|.|++..++.+.|++++..|....... 
T Consensus       207 ~~~~~s~ep~i~g~d~~~gvA~l~ik~~~~i~~~i~~~~~~~~~~G~~~~a~~~~f~~~nt~t~g~vs~~~R~~~~lg~~  286 (473)
T KOG1320|consen  207 IGPGNSGEPVIVGVDKVAGVAFLKIKTPENILYVIPLGVSSHFRTGVEVSAIGNGFGLLNTLTQGMVSGQLRKSFKLGLE  286 (473)
T ss_pred             ecCCccCCCeEEccccccceEEEEEecCCcccceeecceeeeecccceeeccccCceeeeeeeecccccccccccccCcc
Confidence            66  889999999999999999999765433788888888899999999999999999999999999988886544332 


Q ss_pred             -CCCcccEEEEccccCCCCCCCceeCCCceEEEEEeeeeCCCCCCCCccceeecccchhhhhhhhhcccc---------c
Q 014786          261 -GRPIQDVIQTDAAINPGNSGGPLLDSSGSLIGINTAIYSPSGASSGVGFSIPVDTVNGIVDQLVKFGKV---------T  330 (418)
Q Consensus       261 -~~~~~~~i~~~~~i~~G~SGGPl~n~~G~VVGI~s~~~~~~~~~~~~~~aIp~~~i~~~l~~l~~~g~v---------~  330 (418)
                       +....+.+++|+++++|+||+|++|.+|++||+++......+-..+++|++|.+.+..++.+..+....         .
T Consensus       287 ~g~~i~~~~qtd~ai~~~nsg~~ll~~DG~~IgVn~~~~~ri~~~~~iSf~~p~d~vl~~v~r~~e~~~~lr~~~~~~p~  366 (473)
T KOG1320|consen  287 TGVLISKINQTDAAINPGNSGGPLLNLDGEVIGVNTRKVTRIGFSHGISFKIPIDTVLVIVLRLGEFQISLRPVKPLVPV  366 (473)
T ss_pred             cceeeeeecccchhhhcccCCCcEEEecCcEeeeeeeeeEEeeccccceeccCchHhhhhhhhhhhhceeeccccCcccc
Confidence             245678899999999999999999999999999988765444456899999999999988887443322         2


Q ss_pred             ccccCeeecc-------chhhhhh----C-ccCcEEEecCCCChhhhcCccccccccCCCccCCcEEEEECCEEeCCHHH
Q 014786          331 RPILGIKFAP-------DQSVEQL----G-VSGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGSD  398 (418)
Q Consensus       331 ~~~lGv~~~~-------~~~~~~~----~-~~G~~V~~v~~~~pa~~aGl~~~~~~~~~~l~~GDiIl~ing~~v~~~~d  398 (418)
                      +.|+|....-       ....+.+    + .+|++|.+|.+++++...+++.           ||+|++|||++|.+..+
T Consensus       367 ~~~~g~~s~~i~~g~vf~~~~~~~~~~~~~~q~v~is~Vlp~~~~~~~~~~~-----------g~~V~~vng~~V~n~~~  435 (473)
T KOG1320|consen  367 HQYIGLPSYYIFAGLVFVPLTKSYIFPSGVVQLVLVSQVLPGSINGGYGLKP-----------GDQVVKVNGKPVKNLKH  435 (473)
T ss_pred             cccCCceeEEEecceEEeecCCCccccccceeEEEEEEeccCCCcccccccC-----------CCEEEEECCEEeechHH
Confidence            4577754321       0011111    1 2699999999999999999999           99999999999999999


Q ss_pred             HHHHHhcCCCCCEEEEEE
Q 014786          399 LYRILDQCKVGDEVSCFT  416 (418)
Q Consensus       399 l~~~l~~~~~g~~v~l~v  416 (418)
                      |.++++.+.++++|.+..
T Consensus       436 l~~~i~~~~~~~~v~vl~  453 (473)
T KOG1320|consen  436 LYELIEECSTEDKVAVLD  453 (473)
T ss_pred             HHHHHHhcCcCceEEEEE
Confidence            999999888888877754


No 8  
>KOG1421 consensus Predicted signaling-associated protein (contains a PDZ domain) [General function prediction only]
Probab=99.85  E-value=1.9e-20  Score=191.40  Aligned_cols=279  Identities=27%  Similarity=0.348  Sum_probs=216.6

Q ss_pred             hHHHHHHHcCCceEEEEeeecccCccccccccCCCeEEEEEEEcCC-cEEEecccccCC-CCeEEEEeCCCcEEEEEEEE
Q 014786          117 ATVRLFQENTPSVVNITNLAARQDAFTLDVLEVPQGSGSGFVWDSK-GHVVTNYHVIRG-ASDIRVTFADQSAYDAKIVG  194 (418)
Q Consensus       117 ~~~~~~~~~~~SVV~I~~~~~~~~~~~~~~~~~~~~~GSGfiI~~~-G~ILT~aHvv~~-~~~i~V~~~dg~~~~a~vv~  194 (418)
                      .....+..+-+|||.|.......  |..+  .-..+.+|||++++. ||+|||+||+.. .-...+.+.+..+.+.-.+.
T Consensus        53 ~w~~~ia~VvksvVsI~~S~v~~--fdte--sag~~~atgfvvd~~~gyiLtnrhvv~pgP~va~avf~n~ee~ei~pvy  128 (955)
T KOG1421|consen   53 DWRNTIANVVKSVVSIRFSAVRA--FDTE--SAGESEATGFVVDKKLGYILTNRHVVAPGPFVASAVFDNHEEIEIYPVY  128 (955)
T ss_pred             hhhhhhhhhcccEEEEEehheee--cccc--cccccceeEEEEecccceEEEeccccCCCCceeEEEecccccCCccccc
Confidence            66678889999999998764321  2111  124678999999976 899999999974 44567778888788888899


Q ss_pred             EcCCCCeEEEEecCCC---CCCcccccCCCCCCCCCCEEEEEeCCCCCCCceeEeEEeeeeeeecccCCC---CCcccEE
Q 014786          195 FDQDKDVAVLRIDAPK---DKLRPIPIGVSADLLVGQKVYAIGNPFGLDHTLTTGVISGLRREISSAATG---RPIQDVI  268 (418)
Q Consensus       195 ~d~~~DlAlLkv~~~~---~~~~~~~l~~~~~~~~G~~V~~vG~p~g~~~~~~~G~Vs~~~~~~~~~~~~---~~~~~~i  268 (418)
                      .|+-+|+.+++.+...   ..+..+.++. +-.++|.++.++|+..+.-.+...|.++.+.+....+...   .-....+
T Consensus       129 rDpVhdfGf~r~dps~ir~s~vt~i~lap-~~akvgseirvvgNDagEklsIlagflSrldr~apdyg~~~yndfnTfy~  207 (955)
T KOG1421|consen  129 RDPVHDFGFFRYDPSTIRFSIVTEICLAP-ELAKVGSEIRVVGNDAGEKLSILAGFLSRLDRNAPDYGEDTYNDFNTFYI  207 (955)
T ss_pred             CCchhhcceeecChhhcceeeeeccccCc-cccccCCceEEecCCccceEEeehhhhhhccCCCccccccccccccceee
Confidence            9999999999998643   2344444542 3467899999999988877788899999888877665421   1123456


Q ss_pred             EEccccCCCCCCCceeCCCceEEEEEeeeeCCCCCCCCccceeecccchhhhhhhhhcccccccccCeeecc--chhhhh
Q 014786          269 QTDAAINPGNSGGPLLDSSGSLIGINTAIYSPSGASSGVGFSIPVDTVNGIVDQLVKFGKVTRPILGIKFAP--DQSVEQ  346 (418)
Q Consensus       269 ~~~~~i~~G~SGGPl~n~~G~VVGI~s~~~~~~~~~~~~~~aIp~~~i~~~l~~l~~~g~v~~~~lGv~~~~--~~~~~~  346 (418)
                      |..+....|.||+|++|.+|..|.++..+...    .+-+|++|++.+.+.+.-++++.-+.|+.|.++|.+  .+..++
T Consensus       208 QaasstsggssgspVv~i~gyAVAl~agg~~s----sas~ffLpLdrV~RaL~clq~n~PItRGtLqvefl~k~~de~rr  283 (955)
T KOG1421|consen  208 QAASSTSGGSSGSPVVDIPGYAVALNAGGSIS----SASDFFLPLDRVVRALRCLQNNTPITRGTLQVEFLHKLFDECRR  283 (955)
T ss_pred             eehhcCCCCCCCCceecccceEEeeecCCccc----ccccceeeccchhhhhhhhhcCCCcccceEEEEEehhhhHHHHh
Confidence            77778889999999999999999999876442    356799999999999999988888999999999986  344566


Q ss_pred             hCc---------------cCcE-EEecCCCChhhhcCccccccccCCCccCCcEEEEECCEEeCCHHHHHHHHhcCCCCC
Q 014786          347 LGV---------------SGVL-VLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQCKVGD  410 (418)
Q Consensus       347 ~~~---------------~G~~-V~~v~~~~pa~~aGl~~~~~~~~~~l~~GDiIl~ing~~v~~~~dl~~~l~~~~~g~  410 (418)
                      +|+               .|++ |..+.+++|+++.            |+.||++++||+.-++++.++.+.|++. .|+
T Consensus       284 lGL~sE~eqv~r~k~P~~tgmLvV~~vL~~gpa~k~------------Le~GDillavN~t~l~df~~l~~iLDeg-vgk  350 (955)
T KOG1421|consen  284 LGLSSEWEQVVRTKFPERTGMLVVETVLPEGPAEKK------------LEPGDILLAVNSTCLNDFEALEQILDEG-VGK  350 (955)
T ss_pred             cCCcHHHHHHHHhcCcccceeEEEEEeccCCchhhc------------cCCCcEEEEEcceehHHHHHHHHHHhhc-cCc
Confidence            665               3444 4555667766554            4559999999999999999999999976 899


Q ss_pred             EEEEEEE
Q 014786          411 EVSCFTF  417 (418)
Q Consensus       411 ~v~l~v~  417 (418)
                      .+.|+|+
T Consensus       351 ~l~LtI~  357 (955)
T KOG1421|consen  351 NLELTIQ  357 (955)
T ss_pred             eEEEEEE
Confidence            9888875


No 9  
>PF13365 Trypsin_2:  Trypsin-like peptidase domain; PDB: 1Y8T_A 2Z9I_A 3QO6_A 1L1J_A 1QY6_A 2O8L_A 3OTP_E 2ZLE_I 1KY9_A 3CS0_A ....
Probab=99.70  E-value=2.2e-16  Score=132.63  Aligned_cols=109  Identities=38%  Similarity=0.599  Sum_probs=74.8

Q ss_pred             EEEEEEcCCcEEEecccccC--------CCCeEEEEeCCCcEEE--EEEEEEcCC-CCeEEEEecCCCCCCcccccCCCC
Q 014786          154 GSGFVWDSKGHVVTNYHVIR--------GASDIRVTFADQSAYD--AKIVGFDQD-KDVAVLRIDAPKDKLRPIPIGVSA  222 (418)
Q Consensus       154 GSGfiI~~~G~ILT~aHvv~--------~~~~i~V~~~dg~~~~--a~vv~~d~~-~DlAlLkv~~~~~~~~~~~l~~~~  222 (418)
                      ||||+|+++|+||||+||+.        ....+.+...+++.+.  ++++..|+. .|+|||+++.             .
T Consensus         1 GTGf~i~~~g~ilT~~Hvv~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~D~All~v~~-------------~   67 (120)
T PF13365_consen    1 GTGFLIGPDGYILTAAHVVEDWNDGKQPDNSSVEVVFPDGRRVPPVAEVVYFDPDDYDLALLKVDP-------------W   67 (120)
T ss_dssp             EEEEEEETTTEEEEEHHHHTCCTT--G-TCSEEEEEETTSCEEETEEEEEEEETT-TTEEEEEESC-------------E
T ss_pred             CEEEEEcCCceEEEchhheecccccccCCCCEEEEEecCCCEEeeeEEEEEECCccccEEEEEEec-------------c
Confidence            89999999899999999998        4567888888888888  999999999 9999999980             0


Q ss_pred             CCCCCCEEEEEeCCCCCCCceeEeEEeeeeeeecccCCCCCcccEEEEccccCCCCCCCceeCCCceEEEE
Q 014786          223 DLLVGQKVYAIGNPFGLDHTLTTGVISGLRREISSAATGRPIQDVIQTDAAINPGNSGGPLLDSSGSLIGI  293 (418)
Q Consensus       223 ~~~~G~~V~~vG~p~g~~~~~~~G~Vs~~~~~~~~~~~~~~~~~~i~~~~~i~~G~SGGPl~n~~G~VVGI  293 (418)
                       ...+......            ............    ......+ +++.+.+|+||||+||.+|+||||
T Consensus        68 -~~~~~~~~~~------------~~~~~~~~~~~~----~~~~~~~-~~~~~~~G~SGgpv~~~~G~vvGi  120 (120)
T PF13365_consen   68 -TGVGGGVRVP------------GSTSGVSPTSTN----DNRMLYI-TDADTRPGSSGGPVFDSDGRVVGI  120 (120)
T ss_dssp             -EEEEEEEEEE------------EEEEEEEEEEEE----ETEEEEE-ESSS-STTTTTSEEEETTSEEEEE
T ss_pred             -cceeeeeEee------------eeccccccccCc----ccceeEe-eecccCCCcEeHhEECCCCEEEeC
Confidence             0000000000            000000000000    0001114 899999999999999999999997


No 10 
>PF00089 Trypsin:  Trypsin;  InterPro: IPR001254 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold. Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity.
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base. The geometric orientations of the catalytic residues are similar between families, despite different protein folds. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC). This group of serine proteases belong to the MEROPS peptidase family S1 (chymotrypsin family, clan PA(S)) and to peptidase family S6 (Hap serine peptidases). The chymotrypsin family is almost totally confined to animals, although trypsin-like enzymes are found in actinomycetes of the genera Streptomyces and Saccharopolyspora, and in the fungus Fusarium oxysporum. The enzymes are inherently secreted, being synthesised with a signal peptide that targets them to the secretory pathway. Animal enzymes are either secreted directly, packaged into vesicles for regulated secretion, or are retained in leukocyte granules. The Hap family, 'Haemophilus adhesion and penetration', are proteins that play a role in the interaction with human epithelial cells. The serine protease activity is localized at the N-terminal domain, whereas the binding domain is in the C-terminal region.
; GO: 0004252 serine-type endopeptidase activity, 0006508 proteolysis; PDB: 1SPJ_A 1A5I_A 2ZGH_A 2ZKS_A 2ZGJ_A 2ZGC_A 2ODP_A 2I6Q_A 2I6S_A 2ODQ_A ....
Probab=99.47  E-value=4e-12  Score=117.33  Aligned_cols=168  Identities=24%  Similarity=0.369  Sum_probs=109.6

Q ss_pred             CeEEEEEEEcCCcEEEecccccCCCCeEEEEeC-------CC--cEEEEEEEEE----cC---CCCeEEEEecCC---CC
Q 014786          151 QGSGSGFVWDSKGHVVTNYHVIRGASDIRVTFA-------DQ--SAYDAKIVGF----DQ---DKDVAVLRIDAP---KD  211 (418)
Q Consensus       151 ~~~GSGfiI~~~G~ILT~aHvv~~~~~i~V~~~-------dg--~~~~a~vv~~----d~---~~DlAlLkv~~~---~~  211 (418)
                      ...|+|++|+++ +|||++||+.+...+.+.+.       ++  ..+..+-+..    +.   .+|+|||+++.+   ..
T Consensus        24 ~~~C~G~li~~~-~vLTaahC~~~~~~~~v~~g~~~~~~~~~~~~~~~v~~~~~h~~~~~~~~~~DiAll~L~~~~~~~~  102 (220)
T PF00089_consen   24 RFFCTGTLISPR-WVLTAAHCVDGASDIKVRLGTYSIRNSDGSEQTIKVSKIIIHPKYDPSTYDNDIALLKLDRPITFGD  102 (220)
T ss_dssp             EEEEEEEEEETT-EEEEEGGGHTSGGSEEEEESESBTTSTTTTSEEEEEEEEEEETTSBTTTTTTSEEEEEESSSSEHBS
T ss_pred             CeeEeEEecccc-ccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccc
Confidence            457999999987 99999999999656666543       22  2344443333    22   469999999987   24


Q ss_pred             CCcccccCCC-CCCCCCCEEEEEeCCCCCCC----ceeEeEEeeeeeeeccc-CCCCCcccEEEEcc----ccCCCCCCC
Q 014786          212 KLRPIPIGVS-ADLLVGQKVYAIGNPFGLDH----TLTTGVISGLRREISSA-ATGRPIQDVIQTDA----AINPGNSGG  281 (418)
Q Consensus       212 ~~~~~~l~~~-~~~~~G~~V~~vG~p~g~~~----~~~~G~Vs~~~~~~~~~-~~~~~~~~~i~~~~----~i~~G~SGG  281 (418)
                      ...++.+... ..+..|+.+.++||+.....    ......+.-+....... .........+....    ..+.|+|||
T Consensus       103 ~~~~~~l~~~~~~~~~~~~~~~~G~~~~~~~~~~~~~~~~~~~~~~~~~c~~~~~~~~~~~~~c~~~~~~~~~~~g~sG~  182 (220)
T PF00089_consen  103 NIQPICLPSAGSDPNVGTSCIVVGWGRTSDNGYSSNLQSVTVPVVSRKTCRSSYNDNLTPNMICAGSSGSGDACQGDSGG  182 (220)
T ss_dssp             SBEESBBTSTTHTTTTTSEEEEEESSBSSTTSBTSBEEEEEEEEEEHHHHHHHTTTTSTTTEEEEETTSSSBGGTTTTTS
T ss_pred             cccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccc
Confidence            5667777652 34688999999999875322    33434443333321111 11112345566655    788999999


Q ss_pred             ceeCCCceEEEEEeeeeCCCCCCCCccceeecccchhhh
Q 014786          282 PLLDSSGSLIGINTAIYSPSGASSGVGFSIPVDTVNGIV  320 (418)
Q Consensus       282 Pl~n~~G~VVGI~s~~~~~~~~~~~~~~aIp~~~i~~~l  320 (418)
                      |+++.++.|+||++.. ..++.....+++.++....+++
T Consensus       183 pl~~~~~~lvGI~s~~-~~c~~~~~~~v~~~v~~~~~WI  220 (220)
T PF00089_consen  183 PLICNNNYLVGIVSFG-ENCGSPNYPGVYTRVSSYLDWI  220 (220)
T ss_dssp             EEEETTEEEEEEEEEE-SSSSBTTSEEEEEEGGGGHHHH
T ss_pred             ccccceeeecceeeec-CCCCCCCcCEEEEEHHHhhccC
Confidence            9998877899999987 3333333357778887776654


No 11 
>KOG1421 consensus Predicted signaling-associated protein (contains a PDZ domain) [General function prediction only]
Probab=99.46  E-value=1.1e-12  Score=135.04  Aligned_cols=288  Identities=18%  Similarity=0.207  Sum_probs=193.8

Q ss_pred             HHHcCCceEEEEeeecccCccccccccCCCeEEEEEEEcCC-cEEEecccccC-CCCeEEEEeCCCcEEEEEEEEEcCCC
Q 014786          122 FQENTPSVVNITNLAARQDAFTLDVLEVPQGSGSGFVWDSK-GHVVTNYHVIR-GASDIRVTFADQSAYDAKIVGFDQDK  199 (418)
Q Consensus       122 ~~~~~~SVV~I~~~~~~~~~~~~~~~~~~~~~GSGfiI~~~-G~ILT~aHvv~-~~~~i~V~~~dg~~~~a~vv~~d~~~  199 (418)
                      .+.+..+.|.+....    ++..+........|||.|++.. |++++.+.++. +..+.+|...|...+.|.+.+.++..
T Consensus       524 ~~~i~~~~~~v~~~~----~~~l~g~s~~i~kgt~~i~d~~~g~~vvsr~~vp~d~~d~~vt~~dS~~i~a~~~fL~~t~  599 (955)
T KOG1421|consen  524 SADISNCLVDVEPMM----PVNLDGVSSDIYKGTALIMDTSKGLGVVSRSVVPSDAKDQRVTEADSDGIPANVSFLHPTE  599 (955)
T ss_pred             hhHHhhhhhhheece----eeccccchhhhhcCceEEEEccCCceeEecccCCchhhceEEeecccccccceeeEecCcc
Confidence            345566666665532    1222222223457999999955 89999999996 56788999999999999999999999


Q ss_pred             CeEEEEecCCCCCCcccccCCCCCCCCCCEEEEEeCCCCCCCceeEeEEeeeeee----ecccCCCCCcccEEEEccccC
Q 014786          200 DVAVLRIDAPKDKLRPIPIGVSADLLVGQKVYAIGNPFGLDHTLTTGVISGLRRE----ISSAATGRPIQDVIQTDAAIN  275 (418)
Q Consensus       200 DlAlLkv~~~~~~~~~~~l~~~~~~~~G~~V~~vG~p~g~~~~~~~G~Vs~~~~~----~~~~~~~~~~~~~i~~~~~i~  275 (418)
                      ++|.+|.+...  ...+.+.+ ..+..||++...|+......-.....|..+...    ...........+.|.+++.+.
T Consensus       600 n~a~~kydp~~--~~~~kl~~-~~v~~gD~~~f~g~~~~~r~ltaktsv~dvs~~~~ps~~~pr~r~~n~e~Is~~~nls  676 (955)
T KOG1421|consen  600 NVASFKYDPAL--EVQLKLTD-TTVLRGDECTFEGFTEDLRALTAKTSVTDVSVVIIPSSVMPRFRATNLEVISFMDNLS  676 (955)
T ss_pred             ceeEeccChhH--hhhhccce-eeEecCCceeEecccccchhhcccceeeeeEEEEecCCCCcceeecceEEEEEecccc
Confidence            99999998743  34555543 458889999999987654432222222222111    001111123356777777776


Q ss_pred             CCCCCCceeCCCceEEEEEeeeeCC--CCCCCCccceeecccchhhhhhhhhcccccccccCeeeccchh--hhhhCccC
Q 014786          276 PGNSGGPLLDSSGSLIGINTAIYSP--SGASSGVGFSIPVDTVNGIVDQLVKFGKVTRPILGIKFAPDQS--VEQLGVSG  351 (418)
Q Consensus       276 ~G~SGGPl~n~~G~VVGI~s~~~~~--~~~~~~~~~aIp~~~i~~~l~~l~~~g~v~~~~lGv~~~~~~~--~~~~~~~G  351 (418)
                      .++--|-+.|.+|+|+|++-..+..  ++.+-..-|.+.+.+++..++.|+..++..-..+|++|....+  ++.+|++-
T Consensus       677 T~c~sg~ltdddg~vvalwl~~~ge~~~~kd~~y~~gl~~~~~l~vl~rlk~g~~~rp~i~~vef~~i~laqar~lglp~  756 (955)
T KOG1421|consen  677 TSCLSGRLTDDDGEVVALWLSVVGEDVGGKDYTYKYGLSMSYILPVLERLKLGPSARPTIAGVEFSHITLAQARTLGLPS  756 (955)
T ss_pred             ccccceEEECCCCeEEEEEeeeeccccCCceeEEEeccchHHHHHHHHHHhcCCCCCceeeccceeeEEeehhhccCCCH
Confidence            6666778899999999997644432  2233345677888999999999998888877788888876433  44566655


Q ss_pred             cEEEecCCCChhhhcCcccccc--ccCCCccCCcEEEEECCEEeCCHHHHHHHHh----cCCCCCEEEEEE
Q 014786          352 VLVLDAPPNGPAGKAGLLSTKR--DAYGRLILGDIITSVNGKKVSNGSDLYRILD----QCKVGDEVSCFT  416 (418)
Q Consensus       352 ~~V~~v~~~~pa~~aGl~~~~~--~~~~~l~~GDiIl~ing~~v~~~~dl~~~l~----~~~~g~~v~l~v  416 (418)
                      .++.+...++.-.++-+..++.  ...+-|..||||+++||+.|+...||.+...    -.+.|..++++|
T Consensus       757 e~imk~e~es~~~~ql~~ishv~~~~~kil~~gdiilsvngk~itr~~dl~d~~eid~~ilrdg~~~~iki  827 (955)
T KOG1421|consen  757 EFIMKSEEESTIPRQLYVISHVRPLLHKILGVGDIILSVNGKMITRLSDLHDFEEIDAVILRDGIEMEIKI  827 (955)
T ss_pred             HHHhhhhhcCCCcceEEEEEeeccCcccccccccEEEEecCeEEeeehhhhhhhhhheeeeecCcEEEEEe
Confidence            5555555555544443333222  2234567799999999999999999987543    126788777765


No 12 
>KOG1320 consensus Serine protease [Posttranslational modification, protein turnover, chaperones]
Probab=99.39  E-value=2.6e-12  Score=130.59  Aligned_cols=252  Identities=18%  Similarity=0.196  Sum_probs=179.2

Q ss_pred             HHHcCCceEEEEeeecccCcccccccc-CCCeEEEEEEEcCCcEEEecccccC---CCCeEEEEe-CCCcEEEEEEEEEc
Q 014786          122 FQENTPSVVNITNLAARQDAFTLDVLE-VPQGSGSGFVWDSKGHVVTNYHVIR---GASDIRVTF-ADQSAYDAKIVGFD  196 (418)
Q Consensus       122 ~~~~~~SVV~I~~~~~~~~~~~~~~~~-~~~~~GSGfiI~~~G~ILT~aHvv~---~~~~i~V~~-~dg~~~~a~vv~~d  196 (418)
                      .+....|++.+............|... .....|+||.+... .++|++|++.   +...+.+.- ..-+.|.+++...-
T Consensus        56 ~~~~~~s~~~v~~~~~~~~~~~pw~~~~q~~~~~s~f~i~~~-~lltn~~~v~~~~~~~~v~v~~~gs~~k~~~~v~~~~  134 (473)
T KOG1320|consen   56 VDLALQSVVKVFSVSTEPSSVLPWQRTRQFSSGGSGFAIYGK-KLLTNAHVVAPNNDHKFVTVKKHGSPRKYKAFVAAVF  134 (473)
T ss_pred             ccccccceeEEEeecccccccCcceeeehhcccccchhhccc-ceeecCccccccccccccccccCCCchhhhhhHHHhh
Confidence            445567888888776666544444433 33567999999754 8999999999   555555552 23356888888888


Q ss_pred             CCCCeEEEEecCCCCCCcccccCCCCCCCCCCEEEEEeCCCCCCCceeEeEEeeeeeeecccCCCCCcccEEEEccccCC
Q 014786          197 QDKDVAVLRIDAPKDKLRPIPIGVSADLLVGQKVYAIGNPFGLDHTLTTGVISGLRREISSAATGRPIQDVIQTDAAINP  276 (418)
Q Consensus       197 ~~~DlAlLkv~~~~~~~~~~~l~~~~~~~~G~~V~~vG~p~g~~~~~~~G~Vs~~~~~~~~~~~~~~~~~~i~~~~~i~~  276 (418)
                      .++|+|++.++........-++...+-+...+.++++|   |....++.|.|......  .+..+......+++++++++
T Consensus       135 ~~cd~Avv~Ie~~~f~~~~~~~e~~~ip~l~~S~~Vv~---gd~i~VTnghV~~~~~~--~y~~~~~~l~~vqi~aa~~~  209 (473)
T KOG1320|consen  135 EECDLAVVYIESEEFWKGMNPFELGDIPSLNGSGFVVG---GDGIIVTNGHVVRVEPR--IYAHSSTVLLRVQIDAAIGP  209 (473)
T ss_pred             hcccceEEEEeeccccCCCcccccCCCcccCccEEEEc---CCcEEEEeeEEEEEEec--cccCCCcceeeEEEEEeecC
Confidence            99999999998743211111233233456678899998   66779999999876543  23334445568999999999


Q ss_pred             CCCCCceeCCCceEEEEEeeeeCCCCCCCCccceeecccchhhhhhhhhcccc-cccccCeeeccch---hhhhhCc---
Q 014786          277 GNSGGPLLDSSGSLIGINTAIYSPSGASSGVGFSIPVDTVNGIVDQLVKFGKV-TRPILGIKFAPDQ---SVEQLGV---  349 (418)
Q Consensus       277 G~SGGPl~n~~G~VVGI~s~~~~~~~~~~~~~~aIp~~~i~~~l~~l~~~g~v-~~~~lGv~~~~~~---~~~~~~~---  349 (418)
                      |+||+|.+...+++.|+........   +.+++.||.-.+.++.......+.. .+++++...+...   ..+.+.+   
T Consensus       210 ~~s~ep~i~g~d~~~gvA~l~ik~~---~~i~~~i~~~~~~~~~~G~~~~a~~~~f~~~nt~t~g~vs~~~R~~~~lg~~  286 (473)
T KOG1320|consen  210 GNSGEPVIVGVDKVAGVAFLKIKTP---ENILYVIPLGVSSHFRTGVEVSAIGNGFGLLNTLTQGMVSGQLRKSFKLGLE  286 (473)
T ss_pred             CccCCCeEEccccccceEEEEEecC---CcccceeecceeeeecccceeeccccCceeeeeeeecccccccccccccCcc
Confidence            9999999988789999998876422   2689999999999998877666654 4677776665422   2333333   


Q ss_pred             cCcEEEecCCCChhhhcCccccccccCCCccCCcEEEEECCEEeC
Q 014786          350 SGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVS  394 (418)
Q Consensus       350 ~G~~V~~v~~~~pa~~aGl~~~~~~~~~~l~~GDiIl~ing~~v~  394 (418)
                      .|+.+.++.+-+.|-            ..++.||+|+++||..|.
T Consensus       287 ~g~~i~~~~qtd~ai------------~~~nsg~~ll~~DG~~Ig  319 (473)
T KOG1320|consen  287 TGVLISKINQTDAAI------------NPGNSGGPLLNLDGEVIG  319 (473)
T ss_pred             cceeeeeecccchhh------------hcccCCCcEEEecCcEee
Confidence            578888888766553            345559999999999983


No 13 
>PF13180 PDZ_2:  PDZ domain; PDB: 2L97_A 1Y8T_A 2Z9I_A 1LCY_A 2PZD_B 2P3W_A 1VCW_C 1TE0_B 1SOZ_C 1SOT_C ....
Probab=99.33  E-value=5.8e-12  Score=99.73  Aligned_cols=70  Identities=40%  Similarity=0.598  Sum_probs=61.5

Q ss_pred             cccCeeeccchhhhhhCccCcEEEecCCCChhhhcCccccccccCCCccCCcEEEEECCEEeCCHHHHHHHHhcCCCCCE
Q 014786          332 PILGIKFAPDQSVEQLGVSGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQCKVGDE  411 (418)
Q Consensus       332 ~~lGv~~~~~~~~~~~~~~G~~V~~v~~~~pa~~aGl~~~~~~~~~~l~~GDiIl~ing~~v~~~~dl~~~l~~~~~g~~  411 (418)
                      ||||+.+.....     ..|++|.+|.+++||+++||++           ||+|++|||++|++..|+.+.+...++|++
T Consensus         1 ~~lGv~~~~~~~-----~~g~~V~~V~~~spA~~aGl~~-----------GD~I~~ing~~v~~~~~~~~~l~~~~~g~~   64 (82)
T PF13180_consen    1 GGLGVTVQNLSD-----TGGVVVVSVIPGSPAAKAGLQP-----------GDIILAINGKPVNSSEDLVNILSKGKPGDT   64 (82)
T ss_dssp             -E-SEEEEECSC-----SSSEEEEEESTTSHHHHTTS-T-----------TEEEEEETTEESSSHHHHHHHHHCSSTTSE
T ss_pred             CEECeEEEEccC-----CCeEEEEEeCCCCcHHHCCCCC-----------CcEEEEECCEEcCCHHHHHHHHHhCCCCCE
Confidence            689999877542     2699999999999999999999           999999999999999999999999999999


Q ss_pred             EEEEEE
Q 014786          412 VSCFTF  417 (418)
Q Consensus       412 v~l~v~  417 (418)
                      ++|+|+
T Consensus        65 v~l~v~   70 (82)
T PF13180_consen   65 VTLTVL   70 (82)
T ss_dssp             EEEEEE
T ss_pred             EEEEEE
Confidence            999885


No 14 
>cd00190 Tryp_SPc Trypsin-like serine protease; Many of these are synthesized as inactive precursor zymogens that are cleaved during limited proteolysis to generate their active forms. Alignment contains also inactive enzymes that have substitutions of the catalytic triad residues.
Probab=99.31  E-value=9e-11  Score=109.04  Aligned_cols=169  Identities=21%  Similarity=0.260  Sum_probs=98.6

Q ss_pred             CeEEEEEEEcCCcEEEecccccCCC--CeEEEEeCC---------CcEEEEEEEEEc-------CCCCeEEEEecCCC--
Q 014786          151 QGSGSGFVWDSKGHVVTNYHVIRGA--SDIRVTFAD---------QSAYDAKIVGFD-------QDKDVAVLRIDAPK--  210 (418)
Q Consensus       151 ~~~GSGfiI~~~G~ILT~aHvv~~~--~~i~V~~~d---------g~~~~a~vv~~d-------~~~DlAlLkv~~~~--  210 (418)
                      ...|+|++|+++ +|||+|||+.+.  ..+.|.+..         ...+..+-+..+       ...|||||+++.+.  
T Consensus        24 ~~~C~GtlIs~~-~VLTaAhC~~~~~~~~~~v~~g~~~~~~~~~~~~~~~v~~~~~hp~y~~~~~~~DiAll~L~~~~~~  102 (232)
T cd00190          24 RHFCGGSLISPR-WVLTAAHCVYSSAPSNYTVRLGSHDLSSNEGGGQVIKVKKVIVHPNYNPSTYDNDIALLKLKRPVTL  102 (232)
T ss_pred             cEEEEEEEeeCC-EEEECHHhcCCCCCccEEEEeCcccccCCCCceEEEEEEEEEECCCCCCCCCcCCEEEEEECCcccC
Confidence            468999999986 999999999875  456666532         122334444443       35799999998753  


Q ss_pred             -CCCcccccCCCC-CCCCCCEEEEEeCCCCCCC-----ceeEeEEeeeeeeecccCCC---CCcccEEEE-----ccccC
Q 014786          211 -DKLRPIPIGVSA-DLLVGQKVYAIGNPFGLDH-----TLTTGVISGLRREISSAATG---RPIQDVIQT-----DAAIN  275 (418)
Q Consensus       211 -~~~~~~~l~~~~-~~~~G~~V~~vG~p~g~~~-----~~~~G~Vs~~~~~~~~~~~~---~~~~~~i~~-----~~~i~  275 (418)
                       ..+.++.+.... .+..|+.+.++||......     ......+.-+....+.....   ......+..     +...+
T Consensus       103 ~~~v~picl~~~~~~~~~~~~~~~~G~g~~~~~~~~~~~~~~~~~~~~~~~~C~~~~~~~~~~~~~~~C~~~~~~~~~~c  182 (232)
T cd00190         103 SDNVRPICLPSSGYNLPAGTTCTVSGWGRTSEGGPLPDVLQEVNVPIVSNAECKRAYSYGGTITDNMLCAGGLEGGKDAC  182 (232)
T ss_pred             CCcccceECCCccccCCCCCEEEEEeCCcCCCCCCCCceeeEEEeeeECHHHhhhhccCcccCCCceEeeCCCCCCCccc
Confidence             235677775543 5778999999998754322     12222222111111110000   011122222     34577


Q ss_pred             CCCCCCceeCCC---ceEEEEEeeeeCCCCCCCCccceeecccchhhhh
Q 014786          276 PGNSGGPLLDSS---GSLIGINTAIYSPSGASSGVGFSIPVDTVNGIVD  321 (418)
Q Consensus       276 ~G~SGGPl~n~~---G~VVGI~s~~~~~~~~~~~~~~aIp~~~i~~~l~  321 (418)
                      +|+||||++...   +.++||.++... ++.....+.+..+...+++++
T Consensus       183 ~gdsGgpl~~~~~~~~~lvGI~s~g~~-c~~~~~~~~~t~v~~~~~WI~  230 (232)
T cd00190         183 QGDSGGPLVCNDNGRGVLVGIVSWGSG-CARPNYPGVYTRVSSYLDWIQ  230 (232)
T ss_pred             cCCCCCcEEEEeCCEEEEEEEEehhhc-cCCCCCCCEEEEcHHhhHHhh
Confidence            899999999764   789999998653 221123333444444444443


No 15 
>smart00020 Tryp_SPc Trypsin-like serine protease. Many of these are synthesised as inactive precursor zymogens that are cleaved during limited proteolysis to generate their active forms. A few, however, are active as single chain molecules, and others are inactive due to substitutions of the catalytic triad residues.
Probab=99.22  E-value=3.5e-10  Score=105.33  Aligned_cols=147  Identities=23%  Similarity=0.329  Sum_probs=89.8

Q ss_pred             CeEEEEEEEcCCcEEEecccccCCCC--eEEEEeCCC--------cEEEEEEEEEc-------CCCCeEEEEecCCC---
Q 014786          151 QGSGSGFVWDSKGHVVTNYHVIRGAS--DIRVTFADQ--------SAYDAKIVGFD-------QDKDVAVLRIDAPK---  210 (418)
Q Consensus       151 ~~~GSGfiI~~~G~ILT~aHvv~~~~--~i~V~~~dg--------~~~~a~vv~~d-------~~~DlAlLkv~~~~---  210 (418)
                      ...|+|++|+++ +|||+|||+.+..  .+.|.+...        ..+.+.-+..+       ...|+|||+++.+.   
T Consensus        25 ~~~C~GtlIs~~-~VLTaahC~~~~~~~~~~v~~g~~~~~~~~~~~~~~v~~~~~~p~~~~~~~~~DiAll~L~~~i~~~  103 (229)
T smart00020       25 RHFCGGSLISPR-WVLTAAHCVYGSDPSNIRVRLGSHDLSSGEEGQVIKVSKVIIHPNYNPSTYDNDIALLKLKSPVTLS  103 (229)
T ss_pred             CcEEEEEEecCC-EEEECHHHcCCCCCcceEEEeCcccCCCCCCceEEeeEEEEECCCCCCCCCcCCEEEEEECcccCCC
Confidence            457999999976 9999999998753  677776432        22334433322       45799999998752   


Q ss_pred             CCCcccccCCC-CCCCCCCEEEEEeCCCCCC------CceeEeEEeeeeeeecccCCC---CCcccEEEE-----ccccC
Q 014786          211 DKLRPIPIGVS-ADLLVGQKVYAIGNPFGLD------HTLTTGVISGLRREISSAATG---RPIQDVIQT-----DAAIN  275 (418)
Q Consensus       211 ~~~~~~~l~~~-~~~~~G~~V~~vG~p~g~~------~~~~~G~Vs~~~~~~~~~~~~---~~~~~~i~~-----~~~i~  275 (418)
                      ..+.++.+... ..+..++.+.+.||+....      .......+.-+..........   ......+..     ....+
T Consensus       104 ~~~~pi~l~~~~~~~~~~~~~~~~g~g~~~~~~~~~~~~~~~~~~~~~~~~~C~~~~~~~~~~~~~~~C~~~~~~~~~~c  183 (229)
T smart00020      104 DNVRPICLPSSNYNVPAGTTCTVSGWGRTSEGAGSLPDTLQEVNVPIVSNATCRRAYSGGGAITDNMLCAGGLEGGKDAC  183 (229)
T ss_pred             CceeeccCCCcccccCCCCEEEEEeCCCCCCCCCcCCCEeeEEEEEEeCHHHhhhhhccccccCCCcEeecCCCCCCccc
Confidence            24566666543 3467789999999876542      112222222211111110000   001112211     35578


Q ss_pred             CCCCCCceeCCCc--eEEEEEeeee
Q 014786          276 PGNSGGPLLDSSG--SLIGINTAIY  298 (418)
Q Consensus       276 ~G~SGGPl~n~~G--~VVGI~s~~~  298 (418)
                      +|+||||++...+  .++||++...
T Consensus       184 ~gdsG~pl~~~~~~~~l~Gi~s~g~  208 (229)
T smart00020      184 QGDSGGPLVCNDGRWVLVGIVSWGS  208 (229)
T ss_pred             CCCCCCeeEEECCCEEEEEEEEECC
Confidence            8999999996543  8999999865


No 16 
>COG3591 V8-like Glu-specific endopeptidase [Amino acid transport and metabolism]
Probab=99.00  E-value=1.7e-08  Score=95.54  Aligned_cols=158  Identities=18%  Similarity=0.228  Sum_probs=93.3

Q ss_pred             EEEEEEEcCCcEEEecccccCCCC----eEEEEe----CCCc-EE--EEEEEEEc-C---CCCeEEEEecCCCC------
Q 014786          153 SGSGFVWDSKGHVVTNYHVIRGAS----DIRVTF----ADQS-AY--DAKIVGFD-Q---DKDVAVLRIDAPKD------  211 (418)
Q Consensus       153 ~GSGfiI~~~G~ILT~aHvv~~~~----~i~V~~----~dg~-~~--~a~vv~~d-~---~~DlAlLkv~~~~~------  211 (418)
                      .+++|+|+++ .+||++||+....    ++.+..    .++. .+  ........ .   +.|.+...+.....      
T Consensus        65 ~~~~~lI~pn-tvLTa~Hc~~s~~~G~~~~~~~p~g~~~~~~~~~~~~~~~~~~~~g~~~~~d~~~~~v~~~~~~~g~~~  143 (251)
T COG3591          65 CTAATLIGPN-TVLTAGHCIYSPDYGEDDIAAAPPGVNSDGGPFYGITKIEIRVYPGELYKEDGASYDVGEAALESGINI  143 (251)
T ss_pred             eeeEEEEcCc-eEEEeeeEEecCCCChhhhhhcCCcccCCCCCCCceeeEEEEecCCceeccCCceeeccHHHhccCCCc
Confidence            4466999987 9999999996443    222211    1111 11  11122112 2   34666666643211      


Q ss_pred             --CCcccccCCCCCCCCCCEEEEEeCCCCCCCc----eeEeEEeeeeeeecccCCCCCcccEEEEccccCCCCCCCceeC
Q 014786          212 --KLRPIPIGVSADLLVGQKVYAIGNPFGLDHT----LTTGVISGLRREISSAATGRPIQDVIQTDAAINPGNSGGPLLD  285 (418)
Q Consensus       212 --~~~~~~l~~~~~~~~G~~V~~vG~p~g~~~~----~~~G~Vs~~~~~~~~~~~~~~~~~~i~~~~~i~~G~SGGPl~n  285 (418)
                        -.....+......+.++.+.++|||.+....    ...+.|...            ....+.+++.+.+|+||+|+++
T Consensus       144 ~~~~~~~~~~~~~~~~~~d~i~v~GYP~dk~~~~~~~e~t~~v~~~------------~~~~l~y~~dT~pG~SGSpv~~  211 (251)
T COG3591         144 GDVVNYLKRNTASEAKANDRITVIGYPGDKPNIGTMWESTGKVNSI------------KGNKLFYDADTLPGSSGSPVLI  211 (251)
T ss_pred             cccccccccccccccccCceeEEEeccCCCCcceeEeeecceeEEE------------ecceEEEEecccCCCCCCceEe
Confidence              1122223334457789999999999775532    223333211            1246889999999999999999


Q ss_pred             CCceEEEEEeeeeCCCCCCCCcccee-ecccchhhhhhhh
Q 014786          286 SSGSLIGINTAIYSPSGASSGVGFSI-PVDTVNGIVDQLV  324 (418)
Q Consensus       286 ~~G~VVGI~s~~~~~~~~~~~~~~aI-p~~~i~~~l~~l~  324 (418)
                      .+.+|+|+++.+....++ ...++++ -...+++++++++
T Consensus       212 ~~~~vigv~~~g~~~~~~-~~~n~~vr~t~~~~~~I~~~~  250 (251)
T COG3591         212 SKDEVIGVHYNGPGANGG-SLANNAVRLTPEILNFIQQNI  250 (251)
T ss_pred             cCceEEEEEecCCCcccc-cccCcceEecHHHHHHHHHhh
Confidence            988999999977553332 2344333 3456666666654


No 17 
>cd00987 PDZ_serine_protease PDZ domain of trypsin-like serine proteases, such as DegP/HtrA, which are oligomeric proteins involved in heat-shock response, chaperone function, and apoptosis. May be responsible for substrate recognition and/or binding, as most PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of protease-associated PDZ domains a C-terminal beta-strand forms the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins.
Probab=98.96  E-value=3e-09  Score=84.98  Aligned_cols=75  Identities=44%  Similarity=0.659  Sum_probs=63.3

Q ss_pred             cccCeeeccchhh--hhhCc---cCcEEEecCCCChhhhcCccccccccCCCccCCcEEEEECCEEeCCHHHHHHHHhcC
Q 014786          332 PILGIKFAPDQSV--EQLGV---SGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQC  406 (418)
Q Consensus       332 ~~lGv~~~~~~~~--~~~~~---~G~~V~~v~~~~pa~~aGl~~~~~~~~~~l~~GDiIl~ing~~v~~~~dl~~~l~~~  406 (418)
                      +|+|+.+++....  ..+++   .|++|.+|.+++||+++||+.           ||+|++|||++|.++.++.+.+...
T Consensus         1 ~~~G~~~~~~~~~~~~~~~~~~~~g~~V~~v~~~s~a~~~gl~~-----------GD~I~~Ing~~i~~~~~~~~~l~~~   69 (90)
T cd00987           1 PWLGVTVQDLTPDLAEELGLKDTKGVLVASVDPGSPAAKAGLKP-----------GDVILAVNGKPVKSVADLRRALAEL   69 (90)
T ss_pred             CccceEEeECCHHHHHHcCCCCCCEEEEEEECCCCHHHHcCCCc-----------CCEEEEECCEECCCHHHHHHHHHhc
Confidence            5889999874322  22333   599999999999999999999           9999999999999999999999887


Q ss_pred             CCCCEEEEEEE
Q 014786          407 KVGDEVSCFTF  417 (418)
Q Consensus       407 ~~g~~v~l~v~  417 (418)
                      ..|+.+.+++.
T Consensus        70 ~~~~~i~l~v~   80 (90)
T cd00987          70 KPGDKVTLTVL   80 (90)
T ss_pred             CCCCEEEEEEE
Confidence            77899888764


No 18 
>cd00991 PDZ_archaeal_metalloprotease PDZ domain of archaeal zinc metalloproteases, presumably membrane-associated or integral membrane proteases, which may be involved in signalling and regulatory mechanisms. May be responsible for substrate recognition and/or binding, as most PDZ domains bind C-terminal polypeptides, and binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of protease-associated PDZ domains a C-terminal beta-strand forms the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins.
Probab=98.91  E-value=4.8e-09  Score=82.43  Aligned_cols=57  Identities=28%  Similarity=0.485  Sum_probs=53.3

Q ss_pred             cCcEEEecCCCChhhhcCccccccccCCCccCCcEEEEECCEEeCCHHHHHHHHhcCCCCCEEEEEEE
Q 014786          350 SGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQCKVGDEVSCFTF  417 (418)
Q Consensus       350 ~G~~V~~v~~~~pa~~aGl~~~~~~~~~~l~~GDiIl~ing~~v~~~~dl~~~l~~~~~g~~v~l~v~  417 (418)
                      .|++|.+|.+++||+++||++           ||+|++|||++|.+++|+.+.+...++|+++.++++
T Consensus        10 ~Gv~V~~V~~~spa~~aGL~~-----------GDiI~~Ing~~v~~~~d~~~~l~~~~~g~~v~l~v~   66 (79)
T cd00991          10 AGVVIVGVIVGSPAENAVLHT-----------GDVIYSINGTPITTLEDFMEALKPTKPGEVITVTVL   66 (79)
T ss_pred             CcEEEEEECCCChHHhcCCCC-----------CCEEEEECCEEcCCHHHHHHHHhcCCCCCEEEEEEE
Confidence            699999999999999999999           999999999999999999999988778999888764


No 19 
>cd00136 PDZ PDZ domain, also called DHR (Dlg homologous region) or GLGF (after a conserved sequence motif). Many PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. Heterodimerization through PDZ-PDZ domain interactions adds to the domain's versatility, and PDZ domain-mediated interactions may be modulated dynamically through target phosphorylation. Some PDZ domains play a role in scaffolding supramolecular complexes. PDZ domains are found in diverse signaling proteins in bacteria, archaebacteria, and eukaryotes. This CD contains two distinct structural subgroups with either a N- or C-terminal beta-strand forming the peptide-binding groove base. The circular permutation placing the strand on the N-terminus appears to be found in Eumetazoa only, while the C-terminal variant is found in all three kingdoms of life, and seems to co-occur with protease domains. PDZ domains have been named after PSD95(pos
Probab=98.91  E-value=4.7e-09  Score=79.96  Aligned_cols=68  Identities=35%  Similarity=0.541  Sum_probs=59.0

Q ss_pred             cccCeeeccchhhhhhCccCcEEEecCCCChhhhcCccccccccCCCccCCcEEEEECCEEeCCH--HHHHHHHhcCCCC
Q 014786          332 PILGIKFAPDQSVEQLGVSGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNG--SDLYRILDQCKVG  409 (418)
Q Consensus       332 ~~lGv~~~~~~~~~~~~~~G~~V~~v~~~~pa~~aGl~~~~~~~~~~l~~GDiIl~ing~~v~~~--~dl~~~l~~~~~g  409 (418)
                      +++|+.+.....      .|++|.++.+++||+++||++           ||+|++|||+++.++  +++.+++.... |
T Consensus         1 ~~~G~~~~~~~~------~~~~V~~v~~~s~a~~~gl~~-----------GD~I~~Ing~~v~~~~~~~~~~~l~~~~-g   62 (70)
T cd00136           1 GGLGFSIRGGTE------GGVVVLSVEPGSPAERAGLQA-----------GDVILAVNGTDVKNLTLEDVAELLKKEV-G   62 (70)
T ss_pred             CCccEEEecCCC------CCEEEEEeCCCCHHHHcCCCC-----------CCEEEEECCEECCCCCHHHHHHHHhhCC-C
Confidence            357777765431      489999999999999999999           999999999999999  99999998765 9


Q ss_pred             CEEEEEEE
Q 014786          410 DEVSCFTF  417 (418)
Q Consensus       410 ~~v~l~v~  417 (418)
                      ++++|+++
T Consensus        63 ~~v~l~v~   70 (70)
T cd00136          63 EKVTLTVR   70 (70)
T ss_pred             CeEEEEEC
Confidence            99999874


No 20 
>TIGR01713 typeII_sec_gspC general secretion pathway protein C. This model represents GspC, protein C of the main terminal branch of the general secretion pathway, also called type II secretion. This system transports folded proteins across the bacterial outer membrane and is widely distributed in Gram-negative pathogens.
Probab=98.82  E-value=1e-08  Score=98.57  Aligned_cols=89  Identities=15%  Similarity=0.143  Sum_probs=78.6

Q ss_pred             ccchhhhhhhhhcccccccccCeeeccchhhhhhCccCcEEEecCCCChhhhcCccccccccCCCccCCcEEEEECCEEe
Q 014786          314 DTVNGIVDQLVKFGKVTRPILGIKFAPDQSVEQLGVSGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKV  393 (418)
Q Consensus       314 ~~i~~~l~~l~~~g~v~~~~lGv~~~~~~~~~~~~~~G~~V~~v~~~~pa~~aGl~~~~~~~~~~l~~GDiIl~ing~~v  393 (418)
                      ..++++++++++++++-+.|+|+......    -...|++|..+.+++|++++||+.           ||+|++|||+++
T Consensus       159 ~~~~~v~~~l~~~g~~~~~~lgi~p~~~~----g~~~G~~v~~v~~~s~a~~aGLr~-----------GDvIv~ING~~i  223 (259)
T TIGR01713       159 VVSRRIIEELTKDPQKMFDYIRLSPVMKN----DKLEGYRLNPGKDPSLFYKSGLQD-----------GDIAVALNGLDL  223 (259)
T ss_pred             hhHHHHHHHHHHCHHhhhheEeEEEEEeC----CceeEEEEEecCCCCHHHHcCCCC-----------CCEEEEECCEEc
Confidence            46778999999999999999999975422    123799999999999999999999           999999999999


Q ss_pred             CCHHHHHHHHhcCCCCCEEEEEEE
Q 014786          394 SNGSDLYRILDQCKVGDEVSCFTF  417 (418)
Q Consensus       394 ~~~~dl~~~l~~~~~g~~v~l~v~  417 (418)
                      ++++++.+++.+.+++++++++|.
T Consensus       224 ~~~~~~~~~l~~~~~~~~v~l~V~  247 (259)
T TIGR01713       224 RDPEQAFQALQMLREETNLTLTVE  247 (259)
T ss_pred             CCHHHHHHHHHhcCCCCeEEEEEE
Confidence            999999999999889999988764


No 21 
>cd00990 PDZ_glycyl_aminopeptidase PDZ domain associated with archaeal and bacterial M61 glycyl-aminopeptidases. May be responsible for substrate recognition and/or binding, as most PDZ domains bind C-terminal polypeptides, and binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of protease-associated PDZ domains a C-terminal beta-strand is presumed to form the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins.
Probab=98.82  E-value=1.2e-08  Score=79.80  Aligned_cols=65  Identities=34%  Similarity=0.497  Sum_probs=53.0

Q ss_pred             cccCeeeccchhhhhhCccCcEEEecCCCChhhhcCccccccccCCCccCCcEEEEECCEEeCCHHHHHHHHhcCCCCCE
Q 014786          332 PILGIKFAPDQSVEQLGVSGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQCKVGDE  411 (418)
Q Consensus       332 ~~lGv~~~~~~~~~~~~~~G~~V~~v~~~~pa~~aGl~~~~~~~~~~l~~GDiIl~ing~~v~~~~dl~~~l~~~~~g~~  411 (418)
                      +|+|+.+....       .|+.|.+|.+++||+++||++           ||+|++|||+++.++.++   +...+.|+.
T Consensus         1 ~~~G~~~~~~~-------~~~~V~~V~~~s~a~~aGl~~-----------GD~I~~Ing~~v~~~~~~---l~~~~~~~~   59 (80)
T cd00990           1 PYLGLTLDKEE-------GLGKVTFVRDDSPADKAGLVA-----------GDELVAVNGWRVDALQDR---LKEYQAGDP   59 (80)
T ss_pred             CcccEEEEccC-------CcEEEEEECCCChHHHhCCCC-----------CCEEEEECCEEhHHHHHH---HHhcCCCCE
Confidence            58898886543       579999999999999999999           999999999999985554   444456778


Q ss_pred             EEEEEE
Q 014786          412 VSCFTF  417 (418)
Q Consensus       412 v~l~v~  417 (418)
                      +.++++
T Consensus        60 v~l~v~   65 (80)
T cd00990          60 VELTVF   65 (80)
T ss_pred             EEEEEE
Confidence            777653


No 22 
>cd00989 PDZ_metalloprotease PDZ domain of bacterial and plant zinc metalloproteases, presumably membrane-associated or integral membrane proteases, which may be involved in signalling and regulatory mechanisms. May be responsible for substrate recognition and/or binding, as most PDZ domains bind C-terminal polypeptides, and binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of protease-associated PDZ domains a C-terminal beta-strand forms the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins.
Probab=98.71  E-value=5.7e-08  Score=75.72  Aligned_cols=66  Identities=29%  Similarity=0.406  Sum_probs=54.6

Q ss_pred             ccCeeeccchhhhhhCccCcEEEecCCCChhhhcCccccccccCCCccCCcEEEEECCEEeCCHHHHHHHHhcCCCCCEE
Q 014786          333 ILGIKFAPDQSVEQLGVSGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQCKVGDEV  412 (418)
Q Consensus       333 ~lGv~~~~~~~~~~~~~~G~~V~~v~~~~pa~~aGl~~~~~~~~~~l~~GDiIl~ing~~v~~~~dl~~~l~~~~~g~~v  412 (418)
                      |+|+.+....       ..+.|.++.+++||+++||++           ||+|++|||+++.+++|+...+... .++.+
T Consensus         2 ~~~~~~g~~~-------~~~~V~~v~~~s~a~~~gl~~-----------GD~I~~ing~~i~~~~~~~~~l~~~-~~~~~   62 (79)
T cd00989           2 ILGFVPGGPP-------IEPVIGEVVPGSPAAKAGLKA-----------GDRILAINGQKIKSWEDLVDAVQEN-PGKPL   62 (79)
T ss_pred             eeeEeccCCc-------cCcEEEeECCCCHHHHcCCCC-----------CCEEEEECCEECCCHHHHHHHHHHC-CCceE
Confidence            5666655432       347899999999999999999           9999999999999999999999865 47777


Q ss_pred             EEEEE
Q 014786          413 SCFTF  417 (418)
Q Consensus       413 ~l~v~  417 (418)
                      .+++.
T Consensus        63 ~l~v~   67 (79)
T cd00989          63 TLTVE   67 (79)
T ss_pred             EEEEE
Confidence            77663


No 23 
>cd00986 PDZ_LON_protease PDZ domain of ATP-dependent LON serine proteases. Most PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this bacterial subfamily of protease-associated PDZ domains a C-terminal beta-strand  is thought to form the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins.
Probab=98.68  E-value=7.2e-08  Score=75.49  Aligned_cols=56  Identities=30%  Similarity=0.374  Sum_probs=50.9

Q ss_pred             cCcEEEecCCCChhhhcCccccccccCCCccCCcEEEEECCEEeCCHHHHHHHHhcCCCCCEEEEEEE
Q 014786          350 SGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQCKVGDEVSCFTF  417 (418)
Q Consensus       350 ~G~~V~~v~~~~pa~~aGl~~~~~~~~~~l~~GDiIl~ing~~v~~~~dl~~~l~~~~~g~~v~l~v~  417 (418)
                      .|++|.+|.+++||+. ||++           ||+|++|||++|.+++++..++...++|+.+.++++
T Consensus         8 ~Gv~V~~V~~~s~A~~-gL~~-----------GD~I~~Ing~~v~~~~~~~~~l~~~~~~~~v~l~v~   63 (79)
T cd00986           8 HGVYVTSVVEGMPAAG-KLKA-----------GDHIIAVDGKPFKEAEELIDYIQSKKEGDTVKLKVK   63 (79)
T ss_pred             cCEEEEEECCCCchhh-CCCC-----------CCEEEEECCEECCCHHHHHHHHHhCCCCCEEEEEEE
Confidence            6899999999999986 8998           999999999999999999999987778998888764


No 24 
>cd00988 PDZ_CTP_protease PDZ domain of C-terminal processing-, tail-specific-, and tricorn proteases, which function in posttranslational protein processing, maturation, and disassembly or degradation, in Bacteria, Archaea, and plant chloroplasts. May be responsible for substrate recognition and/or binding, as most PDZ domains bind C-terminal polypeptides, and binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of protease-associated PDZ domains a C-terminal beta-strand forms the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins.
Probab=98.63  E-value=1.1e-07  Score=75.25  Aligned_cols=66  Identities=30%  Similarity=0.552  Sum_probs=56.0

Q ss_pred             ccCeeeccchhhhhhCccCcEEEecCCCChhhhcCccccccccCCCccCCcEEEEECCEEeCCH--HHHHHHHhcCCCCC
Q 014786          333 ILGIKFAPDQSVEQLGVSGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNG--SDLYRILDQCKVGD  410 (418)
Q Consensus       333 ~lGv~~~~~~~~~~~~~~G~~V~~v~~~~pa~~aGl~~~~~~~~~~l~~GDiIl~ing~~v~~~--~dl~~~l~~~~~g~  410 (418)
                      -||+.+....       .++.|..+.+++||+++||++           ||+|++|||+++.++  .++.+++.. ..|+
T Consensus         3 ~lG~~~~~~~-------~~~~V~~v~~~s~a~~~gl~~-----------GD~I~~vng~~i~~~~~~~~~~~l~~-~~~~   63 (85)
T cd00988           3 GIGLELKYDD-------GGLVITSVLPGSPAAKAGIKA-----------GDIIVAIDGEPVDGLSLEDVVKLLRG-KAGT   63 (85)
T ss_pred             EEEEEEEEcC-------CeEEEEEecCCCCHHHcCCCC-----------CCEEEEECCEEcCCCCHHHHHHHhcC-CCCC
Confidence            3666665432       688999999999999999999           999999999999999  899988876 4688


Q ss_pred             EEEEEEE
Q 014786          411 EVSCFTF  417 (418)
Q Consensus       411 ~v~l~v~  417 (418)
                      .+.+++.
T Consensus        64 ~i~l~v~   70 (85)
T cd00988          64 KVRLTLK   70 (85)
T ss_pred             EEEEEEE
Confidence            8888774


No 25 
>TIGR02037 degP_htrA_DO periplasmic serine protease, Do/DegQ family. This family consists of a set of proteins variously designated DegP, heat shock protein HtrA, and protease DO. The ortholog in Pseudomonas aeruginosa is designated MucD and is found in an operon that controls mucoid phenotype. This family also includes the DegQ (HhoA) paralog in E. coli which can rescue a DegP mutant, but not the smaller DegS paralog, which cannot. Members of this family are located in the periplasm and have separable functions as both protease and chaperone. Members have a trypsin domain and two copies of a PDZ domain. This protein protects bacteria from thermal and other stresses and may be important for the survival of bacterial pathogens. The chaperone function is dominant at low temperatures, whereas the proteolytic activity is turned on at elevated temperatures.
Probab=98.53  E-value=2.6e-07  Score=95.41  Aligned_cols=76  Identities=36%  Similarity=0.572  Sum_probs=65.6

Q ss_pred             ccccCeeeccch--hhhhhCc----cCcEEEecCCCChhhhcCccccccccCCCccCCcEEEEECCEEeCCHHHHHHHHh
Q 014786          331 RPILGIKFAPDQ--SVEQLGV----SGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILD  404 (418)
Q Consensus       331 ~~~lGv~~~~~~--~~~~~~~----~G~~V~~v~~~~pa~~aGl~~~~~~~~~~l~~GDiIl~ing~~v~~~~dl~~~l~  404 (418)
                      +.|+|+.+.+..  ..+.+++    .|++|.+|.+++||+++||++           ||+|++|||++|.+.+|+.+++.
T Consensus       337 ~~~lGi~~~~l~~~~~~~~~l~~~~~Gv~V~~V~~~SpA~~aGL~~-----------GDvI~~Ing~~V~s~~d~~~~l~  405 (428)
T TIGR02037       337 NPFLGLTVANLSPEIRKELRLKGDVKGVVVTKVVSGSPAARAGLQP-----------GDVILSVNQQPVSSVAELRKVLD  405 (428)
T ss_pred             ccccceEEecCCHHHHHHcCCCcCcCceEEEEeCCCCHHHHcCCCC-----------CCEEEEECCEEcCCHHHHHHHHH
Confidence            468999887632  3444554    599999999999999999999           99999999999999999999999


Q ss_pred             cCCCCCEEEEEEE
Q 014786          405 QCKVGDEVSCFTF  417 (418)
Q Consensus       405 ~~~~g~~v~l~v~  417 (418)
                      ..+.|+++.++|+
T Consensus       406 ~~~~g~~v~l~v~  418 (428)
T TIGR02037       406 RAKKGGRVALLIL  418 (428)
T ss_pred             hcCCCCEEEEEEE
Confidence            8888999998874


No 26 
>PF00863 Peptidase_C4:  Peptidase family C4 This family belongs to family C4 of the peptidase classification.;  InterPro: IPR001730 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. 
These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad.  Nuclear inclusion A (NIA) proteases from potyviruses are cysteine peptidases belonging to the MEROPS peptidase family C4 (NIa protease family, clan PA(C)).  Potyviruses include plant viruses in which the single-stranded RNA encodes a polyprotein with NIA protease activity, where proteolytic cleavage is specific for Gln+Gly sites. The NIA protease acts on the polyprotein, releasing itself by Gln+Gly cleavage at both the N- and C-termini. It further processes the polyprotein by cleavage at five similar sites in the C-terminal half of the sequence. In addition to its C-terminal protease activity, the NIA protease contains an N-terminal domain that has been implicated in the transcription process. This peptidase is present in the nuclear inclusion protein of potyviruses; GO: 0008234 cysteine-type peptidase activity, 0006508 proteolysis; PDB: 3MMG_B 1Q31_B 1LVB_A 1LVM_A.
Probab=98.49  E-value=9.5e-06  Score=76.32  Aligned_cols=164  Identities=16%  Similarity=0.291  Sum_probs=87.5

Q ss_pred             HHHcCCceEEEEeeecccCccccccccCCCeEEEEEEEcCCcEEEecccccCC-CCeEEEEeCCCcEEEEE-----EEEE
Q 014786          122 FQENTPSVVNITNLAARQDAFTLDVLEVPQGSGSGFVWDSKGHVVTNYHVIRG-ASDIRVTFADQSAYDAK-----IVGF  195 (418)
Q Consensus       122 ~~~~~~SVV~I~~~~~~~~~~~~~~~~~~~~~GSGfiI~~~G~ILT~aHvv~~-~~~i~V~~~dg~~~~a~-----vv~~  195 (418)
                      +..+...|++|.......           ...=-|+...  .||+|++|.++. ...++|...-|. |...     -+..
T Consensus        13 yn~Ia~~ic~l~n~s~~~-----------~~~l~gigyG--~~iItn~HLf~~nng~L~i~s~hG~-f~v~nt~~lkv~~   78 (235)
T PF00863_consen   13 YNPIASNICRLTNESDGG-----------TRSLYGIGYG--SYIITNAHLFKRNNGELTIKSQHGE-FTVPNTTQLKVHP   78 (235)
T ss_dssp             -HHHHTTEEEEEEEETTE-----------EEEEEEEEET--TEEEEEGGGGSSTTCEEEEEETTEE-EEECEGGGSEEEE
T ss_pred             cchhhheEEEEEEEeCCC-----------eEEEEEEeEC--CEEEEChhhhccCCCeEEEEeCceE-EEcCCccccceEE
Confidence            345566788888643221           1233477775  399999999964 456777776653 2221     2233


Q ss_pred             cCCCCeEEEEecCCCCCCcccccC-CCCCCCCCCEEEEEeCCCCCCCceeEeEEeeeeeeecccCCCCCcccEEEEcccc
Q 014786          196 DQDKDVAVLRIDAPKDKLRPIPIG-VSADLLVGQKVYAIGNPFGLDHTLTTGVISGLRREISSAATGRPIQDVIQTDAAI  274 (418)
Q Consensus       196 d~~~DlAlLkv~~~~~~~~~~~l~-~~~~~~~G~~V~~vG~p~g~~~~~~~G~Vs~~~~~~~~~~~~~~~~~~i~~~~~i  274 (418)
                      -+..||.++|+..   ++||.+-. .-..++.+|.|.++|.-+....  ....|+........     ....+.......
T Consensus        79 i~~~DiviirmPk---DfpPf~~kl~FR~P~~~e~v~mVg~~fq~k~--~~s~vSesS~i~p~-----~~~~fWkHwIsT  148 (235)
T PF00863_consen   79 IEGRDIVIIRMPK---DFPPFPQKLKFRAPKEGERVCMVGSNFQEKS--ISSTVSESSWIYPE-----ENSHFWKHWIST  148 (235)
T ss_dssp             -TCSSEEEEE--T---TS----S---B----TT-EEEEEEEECSSCC--CEEEEEEEEEEEEE-----TTTTEEEE-C--
T ss_pred             eCCccEEEEeCCc---ccCCcchhhhccCCCCCCEEEEEEEEEEcCC--eeEEECCceEEeec-----CCCCeeEEEecC
Confidence            4688999999976   35555421 2356889999999997544322  22233332222211     123566777778


Q ss_pred             CCCCCCCceeCC-CceEEEEEeeeeCCCCCCCCccceeec
Q 014786          275 NPGNSGGPLLDS-SGSLIGINTAIYSPSGASSGVGFSIPV  313 (418)
Q Consensus       275 ~~G~SGGPl~n~-~G~VVGI~s~~~~~~~~~~~~~~aIp~  313 (418)
                      ..|+=|.|+++. +|++|||++.....    ...+|+.|+
T Consensus       149 k~G~CG~PlVs~~Dg~IVGiHsl~~~~----~~~N~F~~f  184 (235)
T PF00863_consen  149 KDGDCGLPLVSTKDGKIVGIHSLTSNT----SSRNYFTPF  184 (235)
T ss_dssp             -TT-TT-EEEETTT--EEEEEEEEETT----TSSEEEEE-
T ss_pred             CCCccCCcEEEcCCCcEEEEEcCccCC----CCeEEEEcC
Confidence            899999999984 99999999977543    345677665


No 27 
>cd00992 PDZ_signaling PDZ domain found in a variety of Eumetazoan signaling molecules, often in tandem arrangements. May be responsible for specific protein-protein interactions, as most PDZ domains bind C-terminal polypeptides, and binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of PDZ domains an N-terminal beta-strand forms the peptide-binding groove base, a circular permutation with respect to PDZ domains found in proteases.
Probab=98.48  E-value=6.7e-07  Score=69.96  Aligned_cols=69  Identities=30%  Similarity=0.453  Sum_probs=56.1

Q ss_pred             ccccCeeeccchhhhhhCccCcEEEecCCCChhhhcCccccccccCCCccCCcEEEEECCEEeC--CHHHHHHHHhcCCC
Q 014786          331 RPILGIKFAPDQSVEQLGVSGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVS--NGSDLYRILDQCKV  408 (418)
Q Consensus       331 ~~~lGv~~~~~~~~~~~~~~G~~V~~v~~~~pa~~aGl~~~~~~~~~~l~~GDiIl~ing~~v~--~~~dl~~~l~~~~~  408 (418)
                      ...+|+.+......    ..|++|.++.+++||+++||++           ||+|++|||+++.  +.+++.+.+.... 
T Consensus        11 ~~~~G~~~~~~~~~----~~~~~V~~v~~~s~a~~~gl~~-----------GD~I~~ing~~i~~~~~~~~~~~l~~~~-   74 (82)
T cd00992          11 GGGLGFSLRGGKDS----GGGIFVSRVEPGGPAERGGLRV-----------GDRILEVNGVSVEGLTHEEAVELLKNSG-   74 (82)
T ss_pred             CCCcCEEEeCcccC----CCCeEEEEECCCChHHhCCCCC-----------CCEEEEECCEEcCccCHHHHHHHHHhCC-
Confidence            35678887653211    3689999999999999999999           9999999999999  8999999998643 


Q ss_pred             CCEEEEEE
Q 014786          409 GDEVSCFT  416 (418)
Q Consensus       409 g~~v~l~v  416 (418)
                       ..+.+++
T Consensus        75 -~~v~l~v   81 (82)
T cd00992          75 -DEVTLTV   81 (82)
T ss_pred             -CeEEEEE
Confidence             2676665


No 28 
>PF00595 PDZ:  PDZ domain (Also known as DHR or GLGF) Coordinates are not yet available;  InterPro: IPR001478 PDZ domains are found in diverse signalling proteins in bacteria, yeasts, plants, insects and vertebrates. PDZ domains can occur in one or multiple copies and are nearly always found in cytoplasmic proteins. They bind either the carboxyl-terminal sequences of proteins or internal peptide sequences. In most cases, interaction between a PDZ domain and its target is constitutive, with a binding affinity of 1 to 10 micromolar. However, agonist-dependent activation of cell surface receptors is sometimes required to promote interaction with a PDZ protein. PDZ domain proteins are frequently associated with the plasma membrane, a compartment where high concentrations of phosphatidylinositol 4,5-bisphosphate (PIP2) are found. Direct interaction between PIP2 and a subset of class II PDZ domains (syntenin, CASK, Tiam-1) has been demonstrated.  PDZ domains consist of 80 to 90 amino acids comprising six beta-strands (beta-A to beta-F) and two alpha-helices, A and B, compactly arranged in a globular structure. Peptide binding of the ligand takes place in an elongated surface groove as an anti-parallel beta-strand interacts with the beta-B strand and the B helix. The structure of PDZ domains allows binding to a free carboxylate group at the end of a peptide through a carboxylate-binding loop between the beta-A and beta-B strands; GO: 0005515 protein binding; PDB: 3AXA_A 1WF8_A 1QAV_B 1QAU_A 1B8Q_A 1MC7_A 2KAW_A 1I16_A 1VB7_A 1WI4_A ....
Probab=98.43  E-value=3.4e-07  Score=71.94  Aligned_cols=71  Identities=27%  Similarity=0.433  Sum_probs=56.0

Q ss_pred             cccccCeeeccchhhhhhCccCcEEEecCCCChhhhcCccccccccCCCccCCcEEEEECCEEeCCH--HHHHHHHhcCC
Q 014786          330 TRPILGIKFAPDQSVEQLGVSGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNG--SDLYRILDQCK  407 (418)
Q Consensus       330 ~~~~lGv~~~~~~~~~~~~~~G~~V~~v~~~~pa~~aGl~~~~~~~~~~l~~GDiIl~ing~~v~~~--~dl~~~l~~~~  407 (418)
                      ....+|+.+.......   ..|++|.++.+++||+++||++           ||.|++|||+++.++  .++.+++... 
T Consensus         8 ~~~~lG~~l~~~~~~~---~~~~~V~~v~~~~~a~~~gl~~-----------GD~Il~INg~~v~~~~~~~~~~~l~~~-   72 (81)
T PF00595_consen    8 GNGPLGFTLRGGSDND---EKGVFVSSVVPGSPAERAGLKV-----------GDRILEINGQSVRGMSHDEVVQLLKSA-   72 (81)
T ss_dssp             TTSBSSEEEEEESTSS---SEEEEEEEECTTSHHHHHTSST-----------TEEEEEETTEESTTSBHHHHHHHHHHS-
T ss_pred             CCCCcCEEEEecCCCC---cCCEEEEEEeCCChHHhcccch-----------hhhhheeCCEeCCCCCHHHHHHHHHCC-
Confidence            3567888887643211   2589999999999999999999           999999999999977  4666667664 


Q ss_pred             CCCEEEEEE
Q 014786          408 VGDEVSCFT  416 (418)
Q Consensus       408 ~g~~v~l~v  416 (418)
                       +.+|+|+|
T Consensus        73 -~~~v~L~V   80 (81)
T PF00595_consen   73 -SNPVTLTV   80 (81)
T ss_dssp             -TSEEEEEE
T ss_pred             -CCcEEEEE
Confidence             34888876


No 29 
>smart00228 PDZ Domain present in PSD-95, Dlg, and ZO-1/2. Also called DHR (Dlg homologous region) or GLGF (relatively well conserved tetrapeptide in these domains). Some PDZs have been shown to bind C-terminal polypeptides; others appear to bind internal (non-C-terminal) polypeptides. Different PDZs possess different binding specificities.
Probab=98.35  E-value=1.4e-06  Score=68.39  Aligned_cols=71  Identities=32%  Similarity=0.416  Sum_probs=54.9

Q ss_pred             cccCeeeccchhhhhhCccCcEEEecCCCChhhhcCccccccccCCCccCCcEEEEECCEEeCCHHHHHHHHhcCCCCCE
Q 014786          332 PILGIKFAPDQSVEQLGVSGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQCKVGDE  411 (418)
Q Consensus       332 ~~lGv~~~~~~~~~~~~~~G~~V~~v~~~~pa~~aGl~~~~~~~~~~l~~GDiIl~ing~~v~~~~dl~~~l~~~~~g~~  411 (418)
                      ..+|+.+.......    .|++|..+.+++||+++||++           ||+|++|||+++.+..+..........++.
T Consensus        12 ~~~G~~~~~~~~~~----~~~~i~~v~~~s~a~~~gl~~-----------GD~I~~In~~~v~~~~~~~~~~~~~~~~~~   76 (85)
T smart00228       12 GGLGFSLVGGKDEG----GGVVVSSVVPGSPAAKAGLKV-----------GDVILEVNGTSVEGLTHLEAVDLLKKAGGK   76 (85)
T ss_pred             CcccEEEECCCCCC----CCEEEEEECCCCHHHHcCCCC-----------CCEEEEECCEECCCCCHHHHHHHHHhCCCe
Confidence            67788876532111    689999999999999999999           999999999999987766555443334668


Q ss_pred             EEEEEE
Q 014786          412 VSCFTF  417 (418)
Q Consensus       412 v~l~v~  417 (418)
                      +.+++.
T Consensus        77 ~~l~i~   82 (85)
T smart00228       77 VTLTVL   82 (85)
T ss_pred             EEEEEE
Confidence            887764


No 30 
>PRK10779 zinc metallopeptidase RseP; Provisional
Probab=98.25  E-value=1.4e-06  Score=90.51  Aligned_cols=54  Identities=17%  Similarity=0.169  Sum_probs=50.8

Q ss_pred             EEEecCCCChhhhcCccccccccCCCccCCcEEEEECCEEeCCHHHHHHHHhcCCCCCEEEEEEE
Q 014786          353 LVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQCKVGDEVSCFTF  417 (418)
Q Consensus       353 ~V~~v~~~~pa~~aGl~~~~~~~~~~l~~GDiIl~ing~~v~~~~dl~~~l~~~~~g~~v~l~v~  417 (418)
                      +|.+|.++|||++|||++           ||+|++|||++|++++|+...+....+|++++++|+
T Consensus       129 lV~~V~~~SpA~kAGLk~-----------GDvI~~vnG~~V~~~~~l~~~v~~~~~g~~v~v~v~  182 (449)
T PRK10779        129 VVGEIAPNSIAAQAQIAP-----------GTELKAVDGIETPDWDAVRLALVSKIGDESTTITVA  182 (449)
T ss_pred             cccccCCCCHHHHcCCCC-----------CCEEEEECCEEcCCHHHHHHHHHhhccCCceEEEEE
Confidence            789999999999999999           999999999999999999999988888998888874


No 31 
>KOG3627 consensus Trypsin [Amino acid transport and metabolism]
Probab=98.22  E-value=9.6e-05  Score=70.43  Aligned_cols=145  Identities=25%  Similarity=0.301  Sum_probs=81.8

Q ss_pred             EEEEEEEcCCcEEEecccccCCCC--eEEEEeCC---------C---cEEEE-EEEEEc-------CC-CCeEEEEecCC
Q 014786          153 SGSGFVWDSKGHVVTNYHVIRGAS--DIRVTFAD---------Q---SAYDA-KIVGFD-------QD-KDVAVLRIDAP  209 (418)
Q Consensus       153 ~GSGfiI~~~G~ILT~aHvv~~~~--~i~V~~~d---------g---~~~~a-~vv~~d-------~~-~DlAlLkv~~~  209 (418)
                      .+.|.+|+++ ||+|++||+.+..  .+.|.+..         +   ..... +++ .+       .. +|||||+++.+
T Consensus        39 ~Cggsli~~~-~vltaaHC~~~~~~~~~~V~~G~~~~~~~~~~~~~~~~~~v~~~i-~H~~y~~~~~~~nDiall~l~~~  116 (256)
T KOG3627|consen   39 LCGGSLISPR-WVLTAAHCVKGASASLYTVRLGEHDINLSVSEGEEQLVGDVEKII-VHPNYNPRTLENNDIALLRLSEP  116 (256)
T ss_pred             eeeeEEeeCC-EEEEChhhCCCCCCcceEEEECccccccccccCchhhhceeeEEE-ECCCCCCCCCCCCCEEEEEECCC
Confidence            6777788665 9999999999875  66666531         1   11111 222 22       13 79999999874


Q ss_pred             C---CCCcccccCCCCC---CCCCCEEEEEeCCCCCC------CceeEeEEeeeeeeecccCCCC---CcccEEEEc---
Q 014786          210 K---DKLRPIPIGVSAD---LLVGQKVYAIGNPFGLD------HTLTTGVISGLRREISSAATGR---PIQDVIQTD---  271 (418)
Q Consensus       210 ~---~~~~~~~l~~~~~---~~~G~~V~~vG~p~g~~------~~~~~G~Vs~~~~~~~~~~~~~---~~~~~i~~~---  271 (418)
                      .   ..+.++.+.....   ...+...++.||+....      .......+.-+...........   .....+...   
T Consensus       117 v~~~~~i~piclp~~~~~~~~~~~~~~~v~GWG~~~~~~~~~~~~L~~~~v~i~~~~~C~~~~~~~~~~~~~~~Ca~~~~  196 (256)
T KOG3627|consen  117 VTFSSHIQPICLPSSADPYFPPGGTTCLVSGWGRTESGGGPLPDTLQEVDVPIISNSECRRAYGGLGTITDTMLCAGGPE  196 (256)
T ss_pred             cccCCcccccCCCCCcccCCCCCCCEEEEEeCCCcCCCCCCCCceeEEEEEeEcChhHhcccccCccccCCCEEeeCccC
Confidence            3   3456666642322   34458888899754211      1122222222221111111110   011223332   


Q ss_pred             --cccCCCCCCCceeCCC---ceEEEEEeeeeC
Q 014786          272 --AAINPGNSGGPLLDSS---GSLIGINTAIYS  299 (418)
Q Consensus       272 --~~i~~G~SGGPl~n~~---G~VVGI~s~~~~  299 (418)
                        ...|.|+|||||+-.+   ..++||++++..
T Consensus       197 ~~~~~C~GDSGGPLv~~~~~~~~~~GivS~G~~  229 (256)
T KOG3627|consen  197 GGKDACQGDSGGPLVCEDNGRWVLVGIVSWGSG  229 (256)
T ss_pred             CCCccccCCCCCeEEEeeCCcEEEEEEEEecCC
Confidence              2368899999999764   699999999865


No 32 
>TIGR00054 RIP metalloprotease RseP. A model that detects fragments as well matches a number of members of the PEPTIDASE FAMILY S2C. The region of match appears not to overlap the active site domain.
Probab=98.00  E-value=1.3e-05  Score=82.71  Aligned_cols=55  Identities=33%  Similarity=0.514  Sum_probs=49.6

Q ss_pred             cCcEEEecCCCChhhhcCccccccccCCCccCCcEEEEECCEEeCCHHHHHHHHhcCCCCCEEEEEE
Q 014786          350 SGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQCKVGDEVSCFT  416 (418)
Q Consensus       350 ~G~~V~~v~~~~pa~~aGl~~~~~~~~~~l~~GDiIl~ing~~v~~~~dl~~~l~~~~~g~~v~l~v  416 (418)
                      .|+.|.+|.+++||+++||++           ||+|++|||++|.+++|+.+.+.. .+|+++.+++
T Consensus       203 ~g~vV~~V~~~SpA~~aGL~~-----------GD~Iv~Vng~~V~s~~dl~~~l~~-~~~~~v~l~v  257 (420)
T TIGR00054       203 IEPVLSDVTPNSPAEKAGLKE-----------GDYIQSINGEKLRSWTDFVSAVKE-NPGKSMDIKV  257 (420)
T ss_pred             cCcEEEEECCCCHHHHcCCCC-----------CCEEEEECCEECCCHHHHHHHHHh-CCCCceEEEE
Confidence            478999999999999999999           999999999999999999999986 4677777765


No 33 
>TIGR00225 prc C-terminal peptidase (prc). A C-terminal peptidase with different substrates in different species, including processing of the D1 protein of the photosystem II reaction center in higher plants and cleavage of an 11-residue peptide from the precursor form of penicillin-binding protein in E. coli. The E. coli and H. influenzae proteins form the most distal branch of the tree and have an N-terminal region of 200 amino acids that shows no homology to other proteins in the database.
Probab=97.96  E-value=1.9e-05  Score=78.94  Aligned_cols=67  Identities=30%  Similarity=0.455  Sum_probs=54.0

Q ss_pred             cccCeeeccchhhhhhCccCcEEEecCCCChhhhcCccccccccCCCccCCcEEEEECCEEeCCH--HHHHHHHhcCCCC
Q 014786          332 PILGIKFAPDQSVEQLGVSGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNG--SDLYRILDQCKVG  409 (418)
Q Consensus       332 ~~lGv~~~~~~~~~~~~~~G~~V~~v~~~~pa~~aGl~~~~~~~~~~l~~GDiIl~ing~~v~~~--~dl~~~l~~~~~g  409 (418)
                      ..+|+.+....       .+++|.+|.+++||+++||++           ||+|++|||++|.++  .++...+.. +.|
T Consensus        51 ~~lG~~~~~~~-------~~~~V~~V~~~spA~~aGL~~-----------GD~I~~Ing~~v~~~~~~~~~~~l~~-~~g  111 (334)
T TIGR00225        51 EGIGIQVGMDD-------GEIVIVSPFEGSPAEKAGIKP-----------GDKIIKINGKSVAGMSLDDAVALIRG-KKG  111 (334)
T ss_pred             EEEEEEEEEEC-------CEEEEEEeCCCChHHHcCCCC-----------CCEEEEECCEECCCCCHHHHHHhccC-CCC
Confidence            45777665432       578999999999999999999           999999999999986  567666654 568


Q ss_pred             CEEEEEEE
Q 014786          410 DEVSCFTF  417 (418)
Q Consensus       410 ~~v~l~v~  417 (418)
                      +++.+++.
T Consensus       112 ~~v~l~v~  119 (334)
T TIGR00225       112 TKVSLEIL  119 (334)
T ss_pred             CEEEEEEE
Confidence            88888764


No 34 
>PRK10779 zinc metallopeptidase RseP; Provisional
Probab=97.93  E-value=2e-05  Score=81.89  Aligned_cols=54  Identities=24%  Similarity=0.419  Sum_probs=49.1

Q ss_pred             CcEEEecCCCChhhhcCccccccccCCCccCCcEEEEECCEEeCCHHHHHHHHhcCCCCCEEEEEE
Q 014786          351 GVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQCKVGDEVSCFT  416 (418)
Q Consensus       351 G~~V~~v~~~~pa~~aGl~~~~~~~~~~l~~GDiIl~ing~~v~~~~dl~~~l~~~~~g~~v~l~v  416 (418)
                      +++|.+|.++|||+++||++           ||+|++|||++|++++|+.+.+.. .+|+.+.+++
T Consensus       222 ~~vV~~V~~~SpA~~AGL~~-----------GDvIl~Ing~~V~s~~dl~~~l~~-~~~~~v~l~v  275 (449)
T PRK10779        222 EPVLAEVQPNSAASKAGLQA-----------GDRIVKVDGQPLTQWQTFVTLVRD-NPGKPLALEI  275 (449)
T ss_pred             CcEEEeeCCCCHHHHcCCCC-----------CCEEEEECCEEcCCHHHHHHHHHh-CCCCEEEEEE
Confidence            57899999999999999999           999999999999999999999976 4677777765


No 35 
>PF14685 Tricorn_PDZ:  Tricorn protease PDZ domain; PDB: 1N6F_D 1N6D_C 1N6E_C 1K32_A.
Probab=97.92  E-value=3.9e-05  Score=61.38  Aligned_cols=68  Identities=29%  Similarity=0.516  Sum_probs=45.1

Q ss_pred             ccCeeeccchhhhhhCccCcEEEecCCC--------ChhhhcCccccccccCCCccCCcEEEEECCEEeCCHHHHHHHHh
Q 014786          333 ILGIKFAPDQSVEQLGVSGVLVLDAPPN--------GPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILD  404 (418)
Q Consensus       333 ~lGv~~~~~~~~~~~~~~G~~V~~v~~~--------~pa~~aGl~~~~~~~~~~l~~GDiIl~ing~~v~~~~dl~~~l~  404 (418)
                      .||..+....       .+..|.++.++        ||-.+.|+.         +++||+|++|||+++....+++.+|.
T Consensus         2 ~LGAd~~~~~-------~~y~I~~I~~gd~~~~~~~sPL~~pGv~---------v~~GD~I~aInG~~v~~~~~~~~lL~   65 (88)
T PF14685_consen    2 LLGADFSYDN-------GGYRIARIYPGDPWNPNARSPLAQPGVD---------VREGDYILAINGQPVTADANPYRLLE   65 (88)
T ss_dssp             B-SEEEEEET-------TEEEEEEE-BS-TTSSS-B-GGGGGS-------------TT-EEEEETTEE-BTTB-HHHHHH
T ss_pred             ccceEEEEcC-------CEEEEEEEeCCCCCCccccCCccCCCCC---------CCCCCEEEEECCEECCCCCCHHHHhc
Confidence            5666665432       45668888775        666666665         35699999999999999999999998


Q ss_pred             cCCCCCEEEEEEE
Q 014786          405 QCKVGDEVSCFTF  417 (418)
Q Consensus       405 ~~~~g~~v~l~v~  417 (418)
                      . +.|+.|.|+|.
T Consensus        66 ~-~agk~V~Ltv~   77 (88)
T PF14685_consen   66 G-KAGKQVLLTVN   77 (88)
T ss_dssp             T-TTTSEEEEEEE
T ss_pred             c-cCCCEEEEEEe
Confidence            5 57999999874


No 36 
>PRK10139 serine endoprotease; Provisional
Probab=97.90  E-value=2.5e-05  Score=81.29  Aligned_cols=55  Identities=22%  Similarity=0.361  Sum_probs=49.2

Q ss_pred             cCcEEEecCCCChhhhcCccccccccCCCccCCcEEEEECCEEeCCHHHHHHHHhcCCCCCEEEEEEE
Q 014786          350 SGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQCKVGDEVSCFTF  417 (418)
Q Consensus       350 ~G~~V~~v~~~~pa~~aGl~~~~~~~~~~l~~GDiIl~ing~~v~~~~dl~~~l~~~~~g~~v~l~v~  417 (418)
                      .|++|.+|.+++||+++||++           ||+|++|||++|.+++|+.+++.+. + +++.++|+
T Consensus       390 ~Gv~V~~V~~~spA~~aGL~~-----------GD~I~~Ing~~v~~~~~~~~~l~~~-~-~~v~l~v~  444 (455)
T PRK10139        390 KGIKIDEVVKGSPAAQAGLQK-----------DDVIIGVNRDRVNSIAEMRKVLAAK-P-AIIALQIV  444 (455)
T ss_pred             CceEEEEeCCCChHHHcCCCC-----------CCEEEEECCEEcCCHHHHHHHHHhC-C-CeEEEEEE
Confidence            589999999999999999999           9999999999999999999999864 2 67777653


No 37 
>PLN00049 carboxyl-terminal processing protease; Provisional
Probab=97.80  E-value=6.2e-05  Score=76.82  Aligned_cols=56  Identities=27%  Similarity=0.471  Sum_probs=47.7

Q ss_pred             cCcEEEecCCCChhhhcCccccccccCCCccCCcEEEEECCEEeCCH--HHHHHHHhcCCCCCEEEEEEE
Q 014786          350 SGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNG--SDLYRILDQCKVGDEVSCFTF  417 (418)
Q Consensus       350 ~G~~V~~v~~~~pa~~aGl~~~~~~~~~~l~~GDiIl~ing~~v~~~--~dl~~~l~~~~~g~~v~l~v~  417 (418)
                      .|++|..|.+++||+++||+.           ||+|++|||++|.+.  .++...+. .+.|+.|.++|.
T Consensus       102 ~g~~V~~V~~~SPA~~aGl~~-----------GD~Iv~InG~~v~~~~~~~~~~~l~-g~~g~~v~ltv~  159 (389)
T PLN00049        102 AGLVVVAPAPGGPAARAGIRP-----------GDVILAIDGTSTEGLSLYEAADRLQ-GPEGSSVELTLR  159 (389)
T ss_pred             CcEEEEEeCCCChHHHcCCCC-----------CCEEEEECCEECCCCCHHHHHHHHh-cCCCCEEEEEEE
Confidence            489999999999999999999           999999999999864  67777775 457888888763


No 38 
>PF05579 Peptidase_S32:  Equine arteritis virus serine endopeptidase S32;  InterPro: IPR008760 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. 
They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base. The geometric orientations of the catalytic residues are similar between families, despite different protein folds. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC). This group of serine peptidases belongs to MEROPS peptidase family S32 (clan PA(S)). The type example is equine arteritis virus serine endopeptidase (equine arteritis virus), which is involved in processing of nidovirus polyproteins.; GO: 0004252 serine-type endopeptidase activity, 0016032 viral reproduction, 0019082 viral protein processing; PDB: 3FAN_A 3FAO_A 1MBM_A.
Probab=97.79  E-value=0.00034  Score=66.16  Aligned_cols=117  Identities=26%  Similarity=0.383  Sum_probs=63.0

Q ss_pred             CeEEEEEEEcCCc--EEEecccccCCCCeEEEEeCCCcEEEEEEEEEcCCCCeEEEEecCCCCCCcccccCCCCCCCCCC
Q 014786          151 QGSGSGFVWDSKG--HVVTNYHVIRGASDIRVTFADQSAYDAKIVGFDQDKDVAVLRIDAPKDKLRPIPIGVSADLLVGQ  228 (418)
Q Consensus       151 ~~~GSGfiI~~~G--~ILT~aHvv~~~~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkv~~~~~~~~~~~l~~~~~~~~G~  228 (418)
                      .+.|||=+...+|  .|+|+.||+. .+...|... +....   ..++..-|+|.-.+++-...+|.++++..   ..|.
T Consensus       111 ss~Gsggvft~~~~~vvvTAtHVlg-~~~a~v~~~-g~~~~---~tF~~~GDfA~~~~~~~~G~~P~~k~a~~---~~Gr  182 (297)
T PF05579_consen  111 SSVGSGGVFTIGGNTVVVTATHVLG-GNTARVSGV-GTRRM---LTFKKNGDFAEADITNWPGAAPKYKFAQN---YTGR  182 (297)
T ss_dssp             SSEEEEEEEECTTEEEEEEEHHHCB-TTEEEEEET-TEEEE---EEEEEETTEEEEEETTS-S---B--B-TT----SEE
T ss_pred             ecccccceEEECCeEEEEEEEEEcC-CCeEEEEec-ceEEE---EEEeccCcEEEEECCCCCCCCCceeecCC---cccc
Confidence            4456655555444  6999999998 455555443 33322   24455679999999554445677776522   1232


Q ss_pred             EEEEEeCCCCCCCceeEeEEeeeeeeecccCCCCCcccEEEEccccCCCCCCCceeCCCceEEEEEeeee
Q 014786          229 KVYAIGNPFGLDHTLTTGVISGLRREISSAATGRPIQDVIQTDAAINPGNSGGPLLDSSGSLIGINTAIY  298 (418)
Q Consensus       229 ~V~~vG~p~g~~~~~~~G~Vs~~~~~~~~~~~~~~~~~~i~~~~~i~~G~SGGPl~n~~G~VVGI~s~~~  298 (418)
                      --+.-      ..-+..|.|...              ..+   +-..+|+||+|++..+|.+|||++..-
T Consensus       183 AyW~t------~tGvE~G~ig~~--------------~~~---~fT~~GDSGSPVVt~dg~liGVHTGSn  229 (297)
T PF05579_consen  183 AYWLT------STGVEPGFIGGG--------------GAV---CFTGPGDSGSPVVTEDGDLIGVHTGSN  229 (297)
T ss_dssp             EEEEE------TTEEEEEEEETT--------------EEE---ESS-GGCTT-EEEETTC-EEEEEEEEE
T ss_pred             eEEEc------ccCcccceecCc--------------eEE---EEcCCCCCCCccCcCCCCEEEEEecCC
Confidence            22211      112344444311              112   234579999999999999999999864


No 39 
>PRK10942 serine endoprotease; Provisional
Probab=97.76  E-value=5.7e-05  Score=78.95  Aligned_cols=55  Identities=33%  Similarity=0.458  Sum_probs=49.2

Q ss_pred             cCcEEEecCCCChhhhcCccccccccCCCccCCcEEEEECCEEeCCHHHHHHHHhcCCCCCEEEEEEE
Q 014786          350 SGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQCKVGDEVSCFTF  417 (418)
Q Consensus       350 ~G~~V~~v~~~~pa~~aGl~~~~~~~~~~l~~GDiIl~ing~~v~~~~dl~~~l~~~~~g~~v~l~v~  417 (418)
                      .|++|.+|.+++||+++||++           ||+|++|||++|.+++|+.+++...  ++.+.++|.
T Consensus       408 ~gvvV~~V~~~S~A~~aGL~~-----------GDvIv~VNg~~V~s~~dl~~~l~~~--~~~v~l~V~  462 (473)
T PRK10942        408 KGVVVDNVKPGTPAAQIGLKK-----------GDVIIGANQQPVKNIAELRKILDSK--PSVLALNIQ  462 (473)
T ss_pred             CCeEEEEeCCCChHHHcCCCC-----------CCEEEEECCEEcCCHHHHHHHHHhC--CCeEEEEEE
Confidence            589999999999999999999           9999999999999999999999873  367777653


No 40 
>TIGR00054 RIP metalloprotease RseP. A model that detects fragments as well matches a number of members of the PEPTIDASE FAMILY S2C. The region of match appears not to overlap the active site domain.
Probab=97.75  E-value=4.3e-05  Score=78.81  Aligned_cols=54  Identities=28%  Similarity=0.260  Sum_probs=47.7

Q ss_pred             cCcEEEecCCCChhhhcCccccccccCCCccCCcEEEEECCEEeCCHHHHHHHHhcCCCCCEEEEEE
Q 014786          350 SGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQCKVGDEVSCFT  416 (418)
Q Consensus       350 ~G~~V~~v~~~~pa~~aGl~~~~~~~~~~l~~GDiIl~ing~~v~~~~dl~~~l~~~~~g~~v~l~v  416 (418)
                      .|++|.+|.++|||++|||++           ||+|+++||+++.++.|+.+.+....  +++.+++
T Consensus       128 ~g~~V~~V~~~SpA~~AGL~~-----------GDvI~~vng~~v~~~~dl~~~ia~~~--~~v~~~I  181 (420)
T TIGR00054       128 VGPVIELLDKNSIALEAGIEP-----------GDEILSVNGNKIPGFKDVRQQIADIA--GEPMVEI  181 (420)
T ss_pred             CCceeeccCCCCHHHHcCCCC-----------CCEEEEECCEEcCCHHHHHHHHHhhc--ccceEEE
Confidence            588999999999999999999           99999999999999999999887655  4555544


No 41 
>COG0793 Prc Periplasmic protease [Cell envelope biogenesis, outer membrane]
Probab=97.67  E-value=0.00013  Score=74.62  Aligned_cols=71  Identities=28%  Similarity=0.438  Sum_probs=59.0

Q ss_pred             ccccccCeeeccchhhhhhCccCcEEEecCCCChhhhcCccccccccCCCccCCcEEEEECCEEeCCHH--HHHHHHhcC
Q 014786          329 VTRPILGIKFAPDQSVEQLGVSGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGS--DLYRILDQC  406 (418)
Q Consensus       329 v~~~~lGv~~~~~~~~~~~~~~G~~V~~v~~~~pa~~aGl~~~~~~~~~~l~~GDiIl~ing~~v~~~~--dl~~~l~~~  406 (418)
                      .....+|++++....      .++.|.++.+++||+++||++           ||+|++|||+++....  +..+.+. .
T Consensus        97 ~~~~GiG~~i~~~~~------~~~~V~s~~~~~PA~kagi~~-----------GD~I~~IdG~~~~~~~~~~av~~ir-G  158 (406)
T COG0793          97 GEFGGIGIELQMEDI------GGVKVVSPIDGSPAAKAGIKP-----------GDVIIKIDGKSVGGVSLDEAVKLIR-G  158 (406)
T ss_pred             ccccceeEEEEEecC------CCcEEEecCCCChHHHcCCCC-----------CCEEEEECCEEccCCCHHHHHHHhC-C
Confidence            366888888876432      678999999999999999999           9999999999999884  4666665 4


Q ss_pred             CCCCEEEEEEE
Q 014786          407 KVGDEVSCFTF  417 (418)
Q Consensus       407 ~~g~~v~l~v~  417 (418)
                      ++|..|+|++.
T Consensus       159 ~~Gt~V~L~i~  169 (406)
T COG0793         159 KPGTKVTLTIL  169 (406)
T ss_pred             CCCCeEEEEEE
Confidence            68999999874


No 42 
>PF04495 GRASP55_65:  GRASP55/65 PDZ-like domain ;  InterPro: IPR007583 GRASP55 (Golgi reassembly stacking protein of 55 kDa) and GRASP65 (a 65 kDa protein) are highly homologous. GRASP55 is a component of the Golgi stacking machinery. GRASP65 is an N-ethylmaleimide-sensitive membrane protein required for the stacking of Golgi cisternae in a cell-free system.; PDB: 3RLE_A 4EDJ_A.
Probab=97.64  E-value=0.0001  Score=64.07  Aligned_cols=74  Identities=26%  Similarity=0.438  Sum_probs=53.1

Q ss_pred             cccCeeeccchhhhhhCccCcEEEecCCCChhhhcCccccccccCCCccCCcEEEEECCEEeCCHHHHHHHHhcCCCCCE
Q 014786          332 PILGIKFAPDQSVEQLGVSGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQCKVGDE  411 (418)
Q Consensus       332 ~~lGv~~~~~~~~~~~~~~G~~V~~v~~~~pa~~aGl~~~~~~~~~~l~~GDiIl~ing~~v~~~~dl~~~l~~~~~g~~  411 (418)
                      +.||+.++-.... ...-.+.-|.+|.++|||++|||++          ..|.|+.+|+....+.++|.+.++.+ .++.
T Consensus        26 g~LG~sv~~~~~~-~~~~~~~~Vl~V~p~SPA~~AGL~p----------~~DyIig~~~~~l~~~~~l~~~v~~~-~~~~   93 (138)
T PF04495_consen   26 GLLGISVRFESFE-GAEEEGWHVLRVAPNSPAAKAGLEP----------FFDYIIGIDGGLLDDEDDLFELVEAN-ENKP   93 (138)
T ss_dssp             SSS-EEEEEEE-T-TGCCCEEEEEEE-TTSHHHHTT--T----------TTEEEEEETTCE--STCHHHHHHHHT-TTS-
T ss_pred             CCCcEEEEEeccc-ccccceEEEeEecCCCHHHHCCccc----------cccEEEEccceecCCHHHHHHHHHHc-CCCc
Confidence            7788887643321 1112678899999999999999997          26999999999999999999999875 5889


Q ss_pred             EEEEEE
Q 014786          412 VSCFTF  417 (418)
Q Consensus       412 v~l~v~  417 (418)
                      +.+.||
T Consensus        94 l~L~Vy   99 (138)
T PF04495_consen   94 LQLYVY   99 (138)
T ss_dssp             EEEEEE
T ss_pred             EEEEEE
Confidence            999886


No 43 
>TIGR02860 spore_IV_B stage IV sporulation protein B. SpoIVB, the stage IV sporulation protein B of endospore-forming bacteria such as Bacillus subtilis, is a serine proteinase, expressed in the spore (rather than mother cell) compartment, that participates in a proteolytic activation cascade for Sigma-K. It appears to be universal among endospore-forming bacteria and occurs nowhere else.
Probab=97.62  E-value=0.00012  Score=74.13  Aligned_cols=55  Identities=31%  Similarity=0.558  Sum_probs=46.4

Q ss_pred             cCcEEEecC--------CCChhhhcCccccccccCCCccCCcEEEEECCEEeCCHHHHHHHHhcCCCCCEEEEEE
Q 014786          350 SGVLVLDAP--------PNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQCKVGDEVSCFT  416 (418)
Q Consensus       350 ~G~~V~~v~--------~~~pa~~aGl~~~~~~~~~~l~~GDiIl~ing~~v~~~~dl~~~l~~~~~g~~v~l~v  416 (418)
                      +||+|....        .++||+++||++           ||+|++|||++|++++|+.+++...+ |+.+.++|
T Consensus       105 ~GVlVvg~~~v~~~~g~~~SPAa~AGLq~-----------GDiIvsING~~V~s~~DL~~iL~~~~-g~~V~LtV  167 (402)
T TIGR02860       105 KGVLVVGFSDIETEKGKIHSPGEEAGIQI-----------GDRILKINGEKIKNMDDLANLINKAG-GEKLTLTI  167 (402)
T ss_pred             CEEEEEEEEcccccCCCCCCHHHHcCCCC-----------CCEEEEECCEECCCHHHHHHHHHhCC-CCeEEEEE
Confidence            577775542        368999999999           99999999999999999999998764 78887776


No 44 
>COG3480 SdrC Predicted secreted protein containing a PDZ domain [Signal transduction mechanisms]
Probab=97.44  E-value=0.0003  Score=68.17  Aligned_cols=55  Identities=35%  Similarity=0.506  Sum_probs=50.3

Q ss_pred             cCcEEEecCCCChhhhcCccccccccCCCccCCcEEEEECCEEeCCHHHHHHHHhcCCCCCEEEEEE
Q 014786          350 SGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQCKVGDEVSCFT  416 (418)
Q Consensus       350 ~G~~V~~v~~~~pa~~aGl~~~~~~~~~~l~~GDiIl~ing~~v~~~~dl~~~l~~~~~g~~v~l~v  416 (418)
                      .||||..+..++|+.            +.|+.||-|++|||+++.+.+|+.+.+...++||+|++++
T Consensus       130 ~gvyv~~v~~~~~~~------------gkl~~gD~i~avdg~~f~s~~e~i~~v~~~k~Gd~VtI~~  184 (342)
T COG3480         130 AGVYVLSVIDNSPFK------------GKLEAGDTIIAVDGEPFTSSDELIDYVSSKKPGDEVTIDY  184 (342)
T ss_pred             eeEEEEEccCCcchh------------ceeccCCeEEeeCCeecCCHHHHHHHHhccCCCCeEEEEE
Confidence            799999999999874            4566699999999999999999999999999999999975


No 45 
>PRK11186 carboxy-terminal protease; Provisional
Probab=97.29  E-value=0.00061  Score=73.66  Aligned_cols=68  Identities=24%  Similarity=0.313  Sum_probs=52.6

Q ss_pred             ccccCeeeccchhhhhhCccCcEEEecCCCChhhhc-CccccccccCCCccCCcEEEEEC--CEEeCC-----HHHHHHH
Q 014786          331 RPILGIKFAPDQSVEQLGVSGVLVLDAPPNGPAGKA-GLLSTKRDAYGRLILGDIITSVN--GKKVSN-----GSDLYRI  402 (418)
Q Consensus       331 ~~~lGv~~~~~~~~~~~~~~G~~V~~v~~~~pa~~a-Gl~~~~~~~~~~l~~GDiIl~in--g~~v~~-----~~dl~~~  402 (418)
                      ..-+|+.++...       .++.|.+|.+|+||+++ ||++           ||+|++||  |+++.+     .+++.+.
T Consensus       243 ~~GIGa~l~~~~-------~~~~V~~vipGsPA~ka~gLk~-----------GD~IlaVn~~g~~~~dv~g~~~~~vv~l  304 (667)
T PRK11186        243 LEGIGAVLQMDD-------DYTVINSLVAGGPAAKSKKLSV-----------GDKIVGVGQDGKPIVDVIGWRLDDVVAL  304 (667)
T ss_pred             eeEEEEEEEEeC-------CeEEEEEccCCChHHHhCCCCC-----------CCEEEEECCCCCcccccccCCHHHHHHH
Confidence            345677776533       46899999999999998 9999           99999999  555443     3477777


Q ss_pred             HhcCCCCCEEEEEEE
Q 014786          403 LDQCKVGDEVSCFTF  417 (418)
Q Consensus       403 l~~~~~g~~v~l~v~  417 (418)
                      |. .+.|.+|.|+|.
T Consensus       305 ir-G~~Gt~V~LtV~  318 (667)
T PRK11186        305 IK-GPKGSKVRLEIL  318 (667)
T ss_pred             hc-CCCCCEEEEEEE
Confidence            75 467999999874


No 46 
>COG5640 Secreted trypsin-like serine protease [Posttranslational modification, protein turnover, chaperones]
Probab=97.19  E-value=0.015  Score=57.39  Aligned_cols=54  Identities=19%  Similarity=0.281  Sum_probs=36.1

Q ss_pred             cccCCCCCCCceeCC--Cce-EEEEEeeeeCCCCCCCCccceeecccchhhhhhhhh
Q 014786          272 AAINPGNSGGPLLDS--SGS-LIGINTAIYSPSGASSGVGFSIPVDTVNGIVDQLVK  325 (418)
Q Consensus       272 ~~i~~G~SGGPl~n~--~G~-VVGI~s~~~~~~~~~~~~~~aIp~~~i~~~l~~l~~  325 (418)
                      ...|.|+||||+|-.  +|+ -+||++|+...++...--+..--++....+++...+
T Consensus       223 ~daCqGDSGGPi~~~g~~G~vQ~GVvSwG~~~Cg~t~~~gVyT~vsny~~WI~a~~~  279 (413)
T COG5640         223 KDACQGDSGGPIFHKGEEGRVQRGVVSWGDGGCGGTLIPGVYTNVSNYQDWIAAMTN  279 (413)
T ss_pred             cccccCCCCCceEEeCCCccEEEeEEEecCCCCCCCCcceeEEehhHHHHHHHHHhc
Confidence            356889999999953  454 799999987766544433434445666666666443


No 47 
>PF12812 PDZ_1:  PDZ-like domain
Probab=97.15  E-value=0.0013  Score=51.48  Aligned_cols=65  Identities=28%  Similarity=0.381  Sum_probs=51.8

Q ss_pred             cccCeeecc--chhhhhhCc-cCcEEEecCCCChhhhcCccccccccCCCccCCcEEEEECCEEeCCHHHHHHHHhcCC
Q 014786          332 PILGIKFAP--DQSVEQLGV-SGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQCK  407 (418)
Q Consensus       332 ~~lGv~~~~--~~~~~~~~~-~G~~V~~v~~~~pa~~aGl~~~~~~~~~~l~~GDiIl~ing~~v~~~~dl~~~l~~~~  407 (418)
                      -|.|..+.+  .+.++.+++ -|+++.....++++..-|+..           |-+|++|||+++.+.+++.+.+++.+
T Consensus         9 ~~~Ga~f~~Ls~q~aR~~~~~~~gv~v~~~~g~~~~~~~i~~-----------g~iI~~Vn~kpt~~Ld~f~~vvk~ip   76 (78)
T PF12812_consen    9 EVCGAVFHDLSYQQARQYGIPVGGVYVAVSGGSLAFAGGISK-----------GFIITSVNGKPTPDLDDFIKVVKKIP   76 (78)
T ss_pred             EEcCeecccCCHHHHHHhCCCCCEEEEEecCCChhhhCCCCC-----------CeEEEeECCcCCcCHHHHHHHHHhCC
Confidence            477888876  455777776 345555667888877666888           99999999999999999999998764


No 48 
>COG3975 Predicted protease with the C-terminal PDZ domain [General function prediction only]
Probab=97.06  E-value=0.00065  Score=69.98  Aligned_cols=49  Identities=39%  Similarity=0.532  Sum_probs=42.6

Q ss_pred             cCcEEEecCCCChhhhcCccccccccCCCccCCcEEEEECCEEeCCHHHHHHHHhcCCCCCEEEEEEE
Q 014786          350 SGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQCKVGDEVSCFTF  417 (418)
Q Consensus       350 ~G~~V~~v~~~~pa~~aGl~~~~~~~~~~l~~GDiIl~ing~~v~~~~dl~~~l~~~~~g~~v~l~v~  417 (418)
                      .+..|..|.++|||++|||.+           ||.|++|||.        .+.+...++++.|++.++
T Consensus       462 g~~~i~~V~~~gPA~~AGl~~-----------Gd~ivai~G~--------s~~l~~~~~~d~i~v~~~  510 (558)
T COG3975         462 GHEKITFVFPGGPAYKAGLSP-----------GDKIVAINGI--------SDQLDRYKVNDKIQVHVF  510 (558)
T ss_pred             CeeEEEecCCCChhHhccCCC-----------ccEEEEEcCc--------cccccccccccceEEEEc
Confidence            467999999999999999999           9999999999        456667788888888764


No 49 
>KOG3553 consensus Tax interaction protein TIP1 [Cell wall/membrane/envelope biogenesis]
Probab=96.96  E-value=0.00066  Score=54.70  Aligned_cols=32  Identities=38%  Similarity=0.453  Sum_probs=30.4

Q ss_pred             cCcEEEecCCCChhhhcCccccccccCCCccCCcEEEEECCEE
Q 014786          350 SGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKK  392 (418)
Q Consensus       350 ~G~~V~~v~~~~pa~~aGl~~~~~~~~~~l~~GDiIl~ing~~  392 (418)
                      .|+||++|.++|||+.|||+.           +|.|+.+||-.
T Consensus        59 ~GiYvT~V~eGsPA~~AGLri-----------hDKIlQvNG~D   90 (124)
T KOG3553|consen   59 KGIYVTRVSEGSPAEIAGLRI-----------HDKILQVNGWD   90 (124)
T ss_pred             ccEEEEEeccCChhhhhccee-----------cceEEEecCce
Confidence            799999999999999999999           99999999954


No 50 
>PF05580 Peptidase_S55:  SpoIVB peptidase S55;  InterPro: IPR008763 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This group of serine peptidases belongs to the MEROPS peptidase family S55 (SpoIVB peptidase family, clan PA(S)). The protein SpoIVB plays a key role in signalling in the final sigma-K checkpoint of Bacillus subtilis [, ].
Probab=96.68  E-value=0.022  Score=52.78  Aligned_cols=164  Identities=18%  Similarity=0.283  Sum_probs=85.9

Q ss_pred             ccccCCCeEEEEEEEcCC-cEEEecccccCCCCe-EEEEeCCCcEEEEEEEEEcCC----------------CCeEEEEe
Q 014786          145 DVLEVPQGSGSGFVWDSK-GHVVTNYHVIRGASD-IRVTFADQSAYDAKIVGFDQD----------------KDVAVLRI  206 (418)
Q Consensus       145 ~~~~~~~~~GSGfiI~~~-G~ILT~aHvv~~~~~-i~V~~~dg~~~~a~vv~~d~~----------------~DlAlLkv  206 (418)
                      |......+.||=.+++++ +..--=.|.+.+.+. ..+.+.+|+.|++++....+.                .-+.-+.-
T Consensus        13 wVRD~~aGiGTlTf~dp~~~~fgALGH~I~D~dt~~~~~i~~G~I~~a~I~~I~kg~~G~PGe~~G~~~~~~~~~G~I~~   92 (218)
T PF05580_consen   13 WVRDSTAGIGTLTFYDPETGTFGALGHGISDVDTGQLIPIKNGEIYEASITSIKKGKKGQPGEKIGVFDNESNILGTIEK   92 (218)
T ss_pred             EEEeCCcCeEEEEEEECCCCcEEecCCeEEcCCCCceeEecCCEEEEEEEEEEecCCCcCCceEEEEECCCCceEEEEEe
Confidence            444455788998999874 555555888887664 456667888888877655321                11222221


Q ss_pred             cC----------C----CCCCcccccCCCCCCCCCCEEEEEeCCCCCC-CceeEeEEeeeeeeecccCCC----CCcccE
Q 014786          207 DA----------P----KDKLRPIPIGVSADLLVGQKVYAIGNPFGLD-HTLTTGVISGLRREISSAATG----RPIQDV  267 (418)
Q Consensus       207 ~~----------~----~~~~~~~~l~~~~~~~~G~~V~~vG~p~g~~-~~~~~G~Vs~~~~~~~~~~~~----~~~~~~  267 (418)
                      ..          .    ....++++++...++++|..-+..-. .|.. ..+..-++ .+.+.......+    ....+.
T Consensus        93 Nt~~GI~G~~~~~~~~~~~~~~~~pva~~~evk~G~A~i~Tv~-~G~~ie~f~ieI~-~v~~~~~~~~k~~vi~vtd~~L  170 (218)
T PF05580_consen   93 NTQFGIYGTLDQDDISNPSYNEPIPVAPKQEVKPGPAYILTVI-DGTKIEEFDIEIE-KVLPQSSPSGKGMVIKVTDPRL  170 (218)
T ss_pred             ccccceeEEeccccccccccCceeEEEEHHHceEccEEEEEEE-cCCeEEEeEEEEE-EEccCCCCCCCcEEEEECCcch
Confidence            11          1    01234555555556777754321111 1111 11111111 111110000000    000112


Q ss_pred             EEEccccCCCCCCCceeCCCceEEEEEeeeeCCCCCCCCccceeecc
Q 014786          268 IQTDAAINPGNSGGPLLDSSGSLIGINTAIYSPSGASSGVGFSIPVD  314 (418)
Q Consensus       268 i~~~~~i~~G~SGGPl~n~~G~VVGI~s~~~~~~~~~~~~~~aIp~~  314 (418)
                      +....-+..||||+|++ .+|++||=++..+.+   +...||.++++
T Consensus       171 l~~TGGIvqGMSGSPI~-qdGKLiGAVthvf~~---dp~~Gygi~ie  213 (218)
T PF05580_consen  171 LEKTGGIVQGMSGSPII-QDGKLIGAVTHVFVN---DPTKGYGIFIE  213 (218)
T ss_pred             hhhhCCEEecccCCCEE-ECCEEEEEEEEEEec---CCCceeeecHH
Confidence            22233466899999999 799999999977753   35678888764


No 51 
>PF00548 Peptidase_C3:  3C cysteine protease (picornain 3C);  InterPro: IPR000199 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. 
Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad [].  This signature defines cysteine peptidases that belong to MEROPS peptidase family C3 (picornain, clan PA(C)), subfamilies C3A and C3B. The protein fold of this peptidase domain for members of this family resembles that of the serine peptidase, chymotrypsin [], the type example for clan PA. Picornaviral proteins are expressed as a single polyprotein which is cleaved by the viral C3 cysteine protease. The poliovirus polyprotein is selectively cleaved at the Gln-|-Gly bond. In other picornavirus reactions Glu may be substituted for Gln, and Ser or Thr for Gly. ; GO: 0004197 cysteine-type endopeptidase activity, 0006508 proteolysis; PDB: 3SJO_E 2H6M_A 1QA7_C 1HAV_B 2HAL_A 2H9H_A 3QZQ_B 3QZR_A 3R0F_B 3SJ9_A ....
Probab=96.60  E-value=0.049  Score=49.24  Aligned_cols=136  Identities=19%  Similarity=0.285  Sum_probs=78.7

Q ss_pred             CeEEEEEEEcCCcEEEecccccCCCCeEEEEeCCCcEEEEE--EEEEcC---CCCeEEEEecCCCCCCcccc--cCCCCC
Q 014786          151 QGSGSGFVWDSKGHVVTNYHVIRGASDIRVTFADQSAYDAK--IVGFDQ---DKDVAVLRIDAPKDKLRPIP--IGVSAD  223 (418)
Q Consensus       151 ~~~GSGfiI~~~G~ILT~aHvv~~~~~i~V~~~dg~~~~a~--vv~~d~---~~DlAlLkv~~~~~~~~~~~--l~~~~~  223 (418)
                      ...++++.|..+ ++|...| -.....+.+   +|+.++..  +...+.   ..|+++++++... +++-+.  +.+. .
T Consensus        24 ~~t~l~~gi~~~-~~lvp~H-~~~~~~i~i---~g~~~~~~d~~~lv~~~~~~~Dl~~v~l~~~~-kfrDIrk~~~~~-~   96 (172)
T PF00548_consen   24 EFTMLALGIYDR-YFLVPTH-EEPEDTIYI---DGVEYKVDDSVVLVDRDGVDTDLTLVKLPRNP-KFRDIRKFFPES-I   96 (172)
T ss_dssp             EEEEEEEEEEBT-EEEEEGG-GGGCSEEEE---TTEEEEEEEEEEEEETTSSEEEEEEEEEESSS--B--GGGGSBSS-G
T ss_pred             eEEEecceEeee-EEEEECc-CCCcEEEEE---CCEEEEeeeeEEEecCCCcceeEEEEEccCCc-ccCchhhhhccc-c
Confidence            457888888765 9999999 223333433   45555432  223443   4599999997643 343332  1111 1


Q ss_pred             CCCCCEEEEEeCCCCCCC-ceeEeEEeeeeeeecccCCCCCcccEEEEccccCCCCCCCceeCC---CceEEEEEeee
Q 014786          224 LLVGQKVYAIGNPFGLDH-TLTTGVISGLRREISSAATGRPIQDVIQTDAAINPGNSGGPLLDS---SGSLIGINTAI  297 (418)
Q Consensus       224 ~~~G~~V~~vG~p~g~~~-~~~~G~Vs~~~~~~~~~~~~~~~~~~i~~~~~i~~G~SGGPl~n~---~G~VVGI~s~~  297 (418)
                      ....+...++=.. .... ....+.+...... .  ..+......+.++++..+|+-||||+..   .++++||+.++
T Consensus        97 ~~~~~~~l~v~~~-~~~~~~~~v~~v~~~~~i-~--~~g~~~~~~~~Y~~~t~~G~CG~~l~~~~~~~~~i~GiHvaG  170 (172)
T PF00548_consen   97 PEYPECVLLVNST-KFPRMIVEVGFVTNFGFI-N--LSGTTTPRSLKYKAPTKPGMCGSPLVSRIGGQGKIIGIHVAG  170 (172)
T ss_dssp             GTEEEEEEEEESS-SSTCEEEEEEEEEEEEEE-E--ETTEEEEEEEEEESEEETTGTTEEEEESCGGTTEEEEEEEEE
T ss_pred             ccCCCcEEEEECC-CCccEEEEEEEEeecCcc-c--cCCCEeeEEEEEccCCCCCccCCeEEEeeccCccEEEEEecc
Confidence            2334444444332 2332 3344445443332 1  1234456788899999999999999952   67999999985


No 52 
>PF03761 DUF316:  Domain of unknown function (DUF316) ;  InterPro: IPR005514 This is a family of uncharacterised proteins from Caenorhabditis elegans.
Probab=96.56  E-value=0.16  Score=49.27  Aligned_cols=91  Identities=19%  Similarity=0.216  Sum_probs=54.9

Q ss_pred             CCCCeEEEEecCC-CCCCcccccCCC-CCCCCCCEEEEEeCCCCCCCceeEeEEeeeeeeecccCCCCCcccEEEEcccc
Q 014786          197 QDKDVAVLRIDAP-KDKLRPIPIGVS-ADLLVGQKVYAIGNPFGLDHTLTTGVISGLRREISSAATGRPIQDVIQTDAAI  274 (418)
Q Consensus       197 ~~~DlAlLkv~~~-~~~~~~~~l~~~-~~~~~G~~V~~vG~p~g~~~~~~~G~Vs~~~~~~~~~~~~~~~~~~i~~~~~i  274 (418)
                      ...+++||.++.+ .....++-|+++ .....++.+.+.|+..  ........+.-....        .....+......
T Consensus       159 ~~~~~mIlEl~~~~~~~~~~~Cl~~~~~~~~~~~~~~~yg~~~--~~~~~~~~~~i~~~~--------~~~~~~~~~~~~  228 (282)
T PF03761_consen  159 RPYSPMILELEEDFSKNVSPPCLADSSTNWEKGDEVDVYGFNS--TGKLKHRKLKITNCT--------KCAYSICTKQYS  228 (282)
T ss_pred             cccceEEEEEcccccccCCCEEeCCCccccccCceEEEeecCC--CCeEEEEEEEEEEee--------ccceeEeccccc
Confidence            3569999999886 234566666654 3467899999999821  112222222111110        012345556667


Q ss_pred             CCCCCCCceeC---CCceEEEEEeee
Q 014786          275 NPGNSGGPLLD---SSGSLIGINTAI  297 (418)
Q Consensus       275 ~~G~SGGPl~n---~~G~VVGI~s~~  297 (418)
                      +.|++|||++.   .+..||||.+..
T Consensus       229 ~~~d~Gg~lv~~~~gr~tlIGv~~~~  254 (282)
T PF03761_consen  229 CKGDRGGPLVKNINGRWTLIGVGASG  254 (282)
T ss_pred             CCCCccCeEEEEECCCEEEEEEEccC
Confidence            89999999984   344599997644


No 53 
>KOG3129 consensus 26S proteasome regulatory complex, subunit PSMD9 [Posttranslational modification, protein turnover, chaperones]
Probab=96.01  E-value=0.012  Score=53.88  Aligned_cols=56  Identities=29%  Similarity=0.248  Sum_probs=44.4

Q ss_pred             CcEEEecCCCChhhhcCccccccccCCCccCCcEEEEECCEEeCCHHHHHHH--HhcCCCCCEEEEEEE
Q 014786          351 GVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRI--LDQCKVGDEVSCFTF  417 (418)
Q Consensus       351 G~~V~~v~~~~pa~~aGl~~~~~~~~~~l~~GDiIl~ing~~v~~~~dl~~~--l~~~~~g~~v~l~v~  417 (418)
                      =+.|.+|.++|||+++||+.           ||.|++++...--++..|..+  +.+...++.+.++|+
T Consensus       140 Fa~V~sV~~~SPA~~aGl~~-----------gD~il~fGnV~sgn~~~lq~i~~~v~~~e~~~v~v~v~  197 (231)
T KOG3129|consen  140 FAVVDSVVPGSPADEAGLCV-----------GDEILKFGNVHSGNFLPLQNIAAVVQSNEDQIVSVTVI  197 (231)
T ss_pred             eEEEeecCCCChhhhhCccc-----------CceEEEecccccccchhHHHHHHHHHhccCcceeEEEe
Confidence            46799999999999999999           999999999888887766543  334456666666653


No 54 
>PF08192 Peptidase_S64:  Peptidase family S64;  InterPro: IPR012985 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This family of fungal proteins is involved in the processing of the membrane-bound transcription factor Stp1 [] and belongs to MEROPS peptidase family S64 (clan PA). The processing causes the signalling domain of Stp1 to be passed to the nucleus where several permease genes are induced. The permeases are important for uptake of amino acids, and processing of Stp1 only occurs in an amino acid-rich environment. This family is predicted to be distantly related to the trypsin family (MEROPS peptidase family S1) and to have a typical trypsin-like catalytic triad [].
Probab=95.97  E-value=0.042  Score=58.45  Aligned_cols=117  Identities=20%  Similarity=0.399  Sum_probs=72.4

Q ss_pred             CCCeEEEEecCCC-------CC------CcccccCC------CCCCCCCCEEEEEeCCCCCCCceeEeEEeeeeeeeccc
Q 014786          198 DKDVAVLRIDAPK-------DK------LRPIPIGV------SADLLVGQKVYAIGNPFGLDHTLTTGVISGLRREISSA  258 (418)
Q Consensus       198 ~~DlAlLkv~~~~-------~~------~~~~~l~~------~~~~~~G~~V~~vG~p~g~~~~~~~G~Vs~~~~~~~~~  258 (418)
                      -.|+|||+++...       ++      -|.+.+.+      ...+..|..|+-+|...|    .+.|.+.++.-..  +
T Consensus       542 LsD~AIIkV~~~~~~~N~LGddi~f~~~dP~l~f~NlyV~~~~~~~~~G~~VfK~GrTTg----yT~G~lNg~klvy--w  615 (695)
T PF08192_consen  542 LSDWAIIKVNKERKCQNYLGDDIQFNEPDPTLMFQNLYVREVVSNLVPGMEVFKVGRTTG----YTTGILNGIKLVY--W  615 (695)
T ss_pred             ccceEEEEeCCCceecCCCCccccccCCCccccccccchhhhhhccCCCCeEEEecccCC----ccceEecceEEEE--e
Confidence            3599999998532       11      12222221      123567999999997655    5677777664332  2


Q ss_pred             CCCCCc-ccEEEEc----cccCCCCCCCceeCCCc------eEEEEEeeeeCCCCCCCCccceeecccchhhhhhh
Q 014786          259 ATGRPI-QDVIQTD----AAINPGNSGGPLLDSSG------SLIGINTAIYSPSGASSGVGFSIPVDTVNGIVDQL  323 (418)
Q Consensus       259 ~~~~~~-~~~i~~~----~~i~~G~SGGPl~n~~G------~VVGI~s~~~~~~~~~~~~~~aIp~~~i~~~l~~l  323 (418)
                      ..+... .+++...    .-...|+||+=|++.-+      .|+||.++..   +....+|.+.|+..|.+=+++.
T Consensus       616 ~dG~i~s~efvV~s~~~~~Fa~~GDSGS~VLtk~~d~~~gLgvvGMlhsyd---ge~kqfglftPi~~il~rl~~v  688 (695)
T PF08192_consen  616 ADGKIQSSEFVVSSDNNPAFASGGDSGSWVLTKLEDNNKGLGVVGMLHSYD---GEQKQFGLFTPINEILDRLEEV  688 (695)
T ss_pred             cCCCeEEEEEEEecCCCccccCCCCcccEEEecccccccCceeeEEeeecC---CccceeeccCcHHHHHHHHHHh
Confidence            222222 3344433    23457999999998533      3999988752   4456799998888776666553


No 55 
>PRK09681 putative type II secretion protein GspC; Provisional
Probab=95.89  E-value=0.013  Score=56.74  Aligned_cols=50  Identities=16%  Similarity=0.288  Sum_probs=42.7

Q ss_pred             ecCCCCh---hhhcCccccccccCCCccCCcEEEEECCEEeCCHHHHHHHHhcCCCCCEEEEEE
Q 014786          356 DAPPNGP---AGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQCKVGDEVSCFT  416 (418)
Q Consensus       356 ~v~~~~p---a~~aGl~~~~~~~~~~l~~GDiIl~ing~~v~~~~dl~~~l~~~~~g~~v~l~v  416 (418)
                      ++.|+..   -+++||+.           |||+++|||.++++.++..+++++.+..++++|+|
T Consensus       210 rl~Pgkd~~lF~~~GLq~-----------GDva~sING~dL~D~~qa~~l~~~L~~~tei~ltV  262 (276)
T PRK09681        210 AVKPGADRSLFDASGFKE-----------GDIAIALNQQDFTDPRAMIALMRQLPSMDSIQLTV  262 (276)
T ss_pred             EECCCCcHHHHHHcCCCC-----------CCEEEEeCCeeCCCHHHHHHHHHHhccCCeEEEEE
Confidence            3456543   56789999           99999999999999999999999888888888876


No 56 
>PF10459 Peptidase_S46:  Peptidase S46;  InterPro: IPR019500 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This entry represents S46 peptidases, where dipeptidyl-peptidase 7 (DPP-7) is the best-characterised member of this family. It is a serine peptidase that is located on the cell surface and is predicted to have two N-terminal transmembrane domains. 
Probab=95.75  E-value=0.053  Score=59.20  Aligned_cols=22  Identities=36%  Similarity=0.359  Sum_probs=20.1

Q ss_pred             eEEEEEEEcCCcEEEecccccC
Q 014786          152 GSGSGFVWDSKGHVVTNYHVIR  173 (418)
Q Consensus       152 ~~GSGfiI~~~G~ILT~aHvv~  173 (418)
                      +-|||-+|+++|.|+||.||+.
T Consensus        47 gGCSgsfVS~~GLvlTNHHC~~   68 (698)
T PF10459_consen   47 GGCSGSFVSPDGLVLTNHHCGY   68 (698)
T ss_pred             CceeEEEEcCCceEEecchhhh
Confidence            3599999999999999999975


No 57 
>KOG3209 consensus WW domain-containing protein [General function prediction only]
Probab=95.64  E-value=0.014  Score=61.97  Aligned_cols=52  Identities=29%  Similarity=0.450  Sum_probs=43.6

Q ss_pred             EEecCCCChhhhcCccccccccCCCccCCcEEEEECCEEeCCHH--HHHHHHhcCCCCCEEEEEEE
Q 014786          354 VLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGS--DLYRILDQCKVGDEVSCFTF  417 (418)
Q Consensus       354 V~~v~~~~pa~~aGl~~~~~~~~~~l~~GDiIl~ing~~v~~~~--dl~~~l~~~~~g~~v~l~v~  417 (418)
                      |-++.++|||+|.          |+|++||-|++|||+.|.+.+  |+..+++.  .|-+|+|+|+
T Consensus       782 iGrIieGSPAdRC----------gkLkVGDrilAVNG~sI~~lsHadiv~LIKd--aGlsVtLtIi  835 (984)
T KOG3209|consen  782 IGRIIEGSPADRC----------GKLKVGDRILAVNGQSILNLSHADIVSLIKD--AGLSVTLTII  835 (984)
T ss_pred             ccccccCChhHhh----------ccccccceEEEecCeeeeccCchhHHHHHHh--cCceEEEEEc
Confidence            7788899999886          445569999999999999985  77888875  6889999985


No 58 
>PF00949 Peptidase_S7:  Peptidase S7, Flavivirus NS3 serine protease ;  InterPro: IPR001850 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. 
They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This signature identifies serine peptidases that belong to MEROPS peptidase family S7 (flavivirin family, clan PA(S)). The protein fold of the peptidase domain for members of this family resembles that of chymotrypsin, the type example for clan PA.  Flaviviruses produce a polyprotein from the ssRNA genome. The N terminus of the NS3 protein (approx. 180 aa) is required for the processing of the polyprotein. NS3 also has conserved homology with NTP-binding proteins and the DEAD family of RNA helicases [, , ].; GO: 0003723 RNA binding, 0003724 RNA helicase activity, 0005524 ATP binding; PDB: 2IJO_B 3E90_D 2GGV_B 2FP7_B 2WV9_A 3U1I_B 3U1J_B 2WZQ_A 2WHX_A 3L6P_A ....
Probab=95.41  E-value=0.016  Score=49.86  Aligned_cols=32  Identities=22%  Similarity=0.456  Sum_probs=22.9

Q ss_pred             EEEccccCCCCCCCceeCCCceEEEEEeeeeC
Q 014786          268 IQTDAAINPGNSGGPLLDSSGSLIGINTAIYS  299 (418)
Q Consensus       268 i~~~~~i~~G~SGGPl~n~~G~VVGI~s~~~~  299 (418)
                      ...+..+.+|.||+|+||.+|++|||......
T Consensus        88 ~~~~~d~~~GsSGSpi~n~~g~ivGlYg~g~~  119 (132)
T PF00949_consen   88 GAIDLDFPKGSSGSPIFNQNGEIVGLYGNGVE  119 (132)
T ss_dssp             EEE---S-TTGTT-EEEETTSCEEEEEEEEEE
T ss_pred             EeeecccCCCCCCCceEcCCCcEEEEEcccee
Confidence            34455577999999999999999999887654


No 59 
>TIGR02860 spore_IV_B stage IV sporulation protein B. SpoIVB, the stage IV sporulation protein B of endospore-forming bacteria such as Bacillus subtilis, is a serine proteinase, expressed in the spore (rather than mother cell) compartment, that participates in a proteolytic activation cascade for Sigma-K. It appears to be universal among endospore-forming bacteria and occurs nowhere else.
Probab=95.34  E-value=0.16  Score=51.79  Aligned_cols=38  Identities=26%  Similarity=0.602  Sum_probs=29.7

Q ss_pred             cccCCCCCCCceeCCCceEEEEEeeeeCCCCCCCCccceeec
Q 014786          272 AAINPGNSGGPLLDSSGSLIGINTAIYSPSGASSGVGFSIPV  313 (418)
Q Consensus       272 ~~i~~G~SGGPl~n~~G~VVGI~s~~~~~~~~~~~~~~aIp~  313 (418)
                      ..+..||||+|++ .+|++||=++..+-+   +...||+|-+
T Consensus       355 gGivqGMSGSPi~-q~gkliGAvtHVfvn---dpt~GYGi~i  392 (402)
T TIGR02860       355 GGIVQGMSGSPII-QNGKVIGAVTHVFVN---DPTSGYGVYI  392 (402)
T ss_pred             CCEEecccCCCEE-ECCEEEEEEEEEEec---CCCcceeehH
Confidence            3466799999999 899999998877664   3456788744


No 60 
>KOG3580 consensus Tight junction proteins [Signal transduction mechanisms]
Probab=95.31  E-value=0.024  Score=59.18  Aligned_cols=55  Identities=29%  Similarity=0.398  Sum_probs=46.7

Q ss_pred             cCcEEEecCCCChhhhcCccccccccCCCccCCcEEEEECCEEeCCH--HHHHHHHhcCCCCCEEEEE
Q 014786          350 SGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNG--SDLYRILDQCKVGDEVSCF  415 (418)
Q Consensus       350 ~G~~V~~v~~~~pa~~aGl~~~~~~~~~~l~~GDiIl~ing~~v~~~--~dl~~~l~~~~~g~~v~l~  415 (418)
                      -|++|..|.+++||++.||+.           ||.||.||.++..+.  ++...+|...++|+.|++.
T Consensus       429 VGIFVaGvqegspA~~eGlqE-----------GDQIL~VN~vdF~nl~REeAVlfLL~lPkGEevtil  485 (1027)
T KOG3580|consen  429 VGIFVAGVQEGSPAEQEGLQE-----------GDQILKVNTVDFRNLVREEAVLFLLELPKGEEVTIL  485 (1027)
T ss_pred             eeEEEeecccCCchhhccccc-----------cceeEEeccccchhhhHHHHHHHHhcCCCCcEEeeh
Confidence            589999999999999999999           999999999998876  3444455567899998873


No 61 
>PF10459 Peptidase_S46:  Peptidase S46;  InterPro: IPR019500 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This entry represents S46 peptidases, where dipeptidyl-peptidase 7 (DPP-7) is the best-characterised member of this family. It is a serine peptidase that is located on the cell surface and is predicted to have two N-terminal transmembrane domains. 
Probab=95.30  E-value=0.014  Score=63.64  Aligned_cols=31  Identities=32%  Similarity=0.593  Sum_probs=27.7

Q ss_pred             ccEEEEccccCCCCCCCceeCCCceEEEEEe
Q 014786          265 QDVIQTDAAINPGNSGGPLLDSSGSLIGINT  295 (418)
Q Consensus       265 ~~~i~~~~~i~~G~SGGPl~n~~G~VVGI~s  295 (418)
                      .-.+.++..+..||||+|++|.+|+|||++.
T Consensus       621 pv~FlstnDitGGNSGSPvlN~~GeLVGl~F  651 (698)
T PF10459_consen  621 PVNFLSTNDITGGNSGSPVLNAKGELVGLAF  651 (698)
T ss_pred             eeEEEeccCcCCCCCCCccCCCCceEEEEee
Confidence            3457788899999999999999999999987


No 62 
>KOG3532 consensus Predicted protein kinase [General function prediction only]
Probab=94.99  E-value=0.046  Score=57.98  Aligned_cols=62  Identities=24%  Similarity=0.380  Sum_probs=50.2

Q ss_pred             cccCeeeccchhhhhhCccCcEEEecCCCChhhhcCccccccccCCCccCCcEEEEECCEEeCCHHHHHHHHhcCCCCCE
Q 014786          332 PILGIKFAPDQSVEQLGVSGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQCKVGDE  411 (418)
Q Consensus       332 ~~lGv~~~~~~~~~~~~~~G~~V~~v~~~~pa~~aGl~~~~~~~~~~l~~GDiIl~ing~~v~~~~dl~~~l~~~~~g~~  411 (418)
                      -.+|+.|....      -+-|-|..|.+++||.|+.+++           ||++++|||+||++..+..+.++... |+.
T Consensus       386 ~~ig~vf~~~~------~~~v~v~tv~~ns~a~k~~~~~-----------gdvlvai~~~pi~s~~q~~~~~~s~~-~~~  447 (1051)
T KOG3532|consen  386 SPIGLVFDKNT------NRAVKVCTVEDNSLADKAAFKP-----------GDVLVAINNVPIRSERQATRFLQSTT-GDL  447 (1051)
T ss_pred             CceeEEEecCC------ceEEEEEEecCCChhhHhcCCC-----------cceEEEecCccchhHHHHHHHHHhcc-cce
Confidence            45666664321      1557799999999999999999           99999999999999999999998763 443


No 63 
>PF02122 Peptidase_S39:  Peptidase S39;  InterPro: IPR000382 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. ORF2 of Potato leafroll virus (PLrV) encodes a polyprotein which is translated following a -1 frameshift. The polyprotein has a putative linear arrangement of membrane anchor-VPg-peptidase-polymerase domains. The serine peptidase domain which is found in this group of sequences belongs to MEROPS peptidase family S39 (clan PA(S)). It is likely that the peptidase domain is involved in the cleavage of the polyprotein []. The nucleotide sequence for the RNA of PLrV has been determined [, ]. The sequence contains six large open reading frames (ORFs). The 5' coding region encodes two polypeptides of 28K and 70K, which overlap in different reading frames; it is suggested that the third ORF in the 5' block is translated by frameshift readthrough near the end of the 70K protein, yielding a 118K polypeptide []. 
Segments of the predicted amino acid sequences of these ORFs resemble those of known viral RNA polymerases, ATP-binding proteins and viral genome-linked proteins. The nucleotide sequence of the genomic RNA of Beet western yellows virus (BWYV) has been determined []. The sequence contains six long ORFs. A cluster of three of these ORFs, including the coat protein cistron, display extensive amino acid sequence similarity to corresponding ORFs of a second luteovirus: Barley yellow dwarf virus [].; GO: 0004252 serine-type endopeptidase activity, 0022415 viral reproductive process, 0016021 integral to membrane; PDB: 1ZYO_A.
Probab=94.36  E-value=0.13  Score=47.78  Aligned_cols=134  Identities=15%  Similarity=0.178  Sum_probs=48.0

Q ss_pred             EEEecccccCCCCeEEEEeCCCcEEE---EEEEEEcCCCCeEEEEecCCC---CCCcccccCCCCCCCCCCEEEEEeCCC
Q 014786          164 HVVTNYHVIRGASDIRVTFADQSAYD---AKIVGFDQDKDVAVLRIDAPK---DKLRPIPIGVSADLLVGQKVYAIGNPF  237 (418)
Q Consensus       164 ~ILT~aHvv~~~~~i~V~~~dg~~~~---a~vv~~d~~~DlAlLkv~~~~---~~~~~~~l~~~~~~~~G~~V~~vG~p~  237 (418)
                      .++|+.||..+...+.. ..+|+.++   -+.+..+...|++||++....   ...+.+.+.....+..|    .+..+ 
T Consensus        43 ~L~ta~Hv~~~~~~~~~-~k~g~kipl~~f~~~~~~~~~D~~il~~P~n~~s~Lg~k~~~~~~~~~~~~g----~~~~y-  116 (203)
T PF02122_consen   43 ALLTARHVWSRPSKVTS-LKTGEKIPLAEFTDLLESRIADFVILRGPPNWESKLGVKAAQLSQNSQLAKG----PVSFY-  116 (203)
T ss_dssp             EEEE-HHHHTSSS---E-EETTEEEE--S-EEEEE-TTT-EEEEE--HHHHHHHT-----B----SEEEE----ESSTT-
T ss_pred             ceecccccCCCccceeE-cCCCCcccchhChhhhCCCccCEEEEecCcCHHHHhCcccccccchhhhCCC----Ceeee-
Confidence            69999999998665543 34454443   234456788899999997421   12233333211111100    11111 


Q ss_pred             CCCCceeEeEEeeeeeeecccCCCCCcccEEEEccccCCCCCCCceeCCCceEEEEEeeeeCCCCCCCCccceeecc
Q 014786          238 GLDHTLTTGVISGLRREISSAATGRPIQDVIQTDAAINPGNSGGPLLDSSGSLIGINTAIYSPSGASSGVGFSIPVD  314 (418)
Q Consensus       238 g~~~~~~~G~Vs~~~~~~~~~~~~~~~~~~i~~~~~i~~G~SGGPl~n~~G~VVGI~s~~~~~~~~~~~~~~aIp~~  314 (418)
                          ....+........+.    +. ...+...-+...+|.||.|+++.+ ++||+++.. ......+.+++..|+.
T Consensus       117 ----~~~~~~~~~~sa~i~----g~-~~~~~~vls~T~~G~SGtp~y~g~-~vvGvH~G~-~~~~~~~n~n~~spip  182 (203)
T PF02122_consen  117 ----GFSSGEWPCSSAKIP----GT-EGKFASVLSNTSPGWSGTPYYSGK-NVVGVHTGS-PSGSNRENNNRMSPIP  182 (203)
T ss_dssp             ----SEEEEEEEEEE-S---------STTEEEE-----TT-TT-EEE-SS--EEEEEEEE-----------------
T ss_pred             ----eecCCCceeccCccc----cc-cCcCCceEcCCCCCCCCCCeEECC-CceEeecCc-cccccccccccccccc
Confidence                111111111111111    11 123566667888999999999888 999999975 2223344566555443


No 64 
>KOG1892 consensus Actin filament-binding protein Afadin [Cytoskeleton]
Probab=93.44  E-value=0.1  Score=57.31  Aligned_cols=57  Identities=28%  Similarity=0.283  Sum_probs=46.0

Q ss_pred             cCcEEEecCCCChhhhcCccccccccCCCccCCcEEEEECCEEeCCHHHHHHHHhcCCCCCEEEEEE
Q 014786          350 SGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQCKVGDEVSCFT  416 (418)
Q Consensus       350 ~G~~V~~v~~~~pa~~aGl~~~~~~~~~~l~~GDiIl~ing~~v~~~~dl~~~l~~~~~g~~v~l~v  416 (418)
                      -|+||.+|.+|++|          |+.|+|+.||-+|+|||+......+-..+-...+.|..|.+.|
T Consensus       960 lGIYvKsVV~GgaA----------d~DGRL~aGDQLLsVdG~SLiGisQErAA~lmtrtg~vV~leV 1016 (1629)
T KOG1892|consen  960 LGIYVKSVVEGGAA----------DHDGRLEAGDQLLSVDGHSLIGISQERAARLMTRTGNVVHLEV 1016 (1629)
T ss_pred             cceEEEEeccCCcc----------ccccccccCceeeeecCcccccccHHHHHHHHhccCCeEEEeh
Confidence            48999999999998          4567888899999999999988766554444455688998876


No 65 
>KOG3550 consensus Receptor targeting protein Lin-7 [Extracellular structures]
Probab=93.03  E-value=0.17  Score=44.06  Aligned_cols=54  Identities=28%  Similarity=0.380  Sum_probs=38.6

Q ss_pred             cCcEEEecCCCChhhh-cCccccccccCCCccCCcEEEEECCEEeCCHHH--HHHHHhcCCCCCEEEEEE
Q 014786          350 SGVLVLDAPPNGPAGK-AGLLSTKRDAYGRLILGDIITSVNGKKVSNGSD--LYRILDQCKVGDEVSCFT  416 (418)
Q Consensus       350 ~G~~V~~v~~~~pa~~-aGl~~~~~~~~~~l~~GDiIl~ing~~v~~~~d--l~~~l~~~~~g~~v~l~v  416 (418)
                      .-+||+++.|++-|++ -||+.           ||.++++||..|..-..  -.++|+..  -..|++.|
T Consensus       115 spiyisriipggvadrhgglkr-----------gdqllsvngvsvege~hekavellkaa--~gsvklvv  171 (207)
T KOG3550|consen  115 SPIYISRIIPGGVADRHGGLKR-----------GDQLLSVNGVSVEGEHHEKAVELLKAA--VGSVKLVV  171 (207)
T ss_pred             CceEEEeecCCccccccCcccc-----------cceeEeecceeecchhhHHHHHHHHHh--cCcEEEEE
Confidence            4589999999999887 47777           99999999999976432  23344432  33566554


No 66 
>COG3031 PulC Type II secretory pathway, component PulC [Intracellular trafficking and secretion]
Probab=92.59  E-value=0.22  Score=46.83  Aligned_cols=48  Identities=21%  Similarity=0.337  Sum_probs=41.7

Q ss_pred             CCChhhhcCccccccccCCCccCCcEEEEECCEEeCCHHHHHHHHhcCCCCCEEEEEEE
Q 014786          359 PNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQCKVGDEVSCFTF  417 (418)
Q Consensus       359 ~~~pa~~aGl~~~~~~~~~~l~~GDiIl~ing~~v~~~~dl~~~l~~~~~g~~v~l~v~  417 (418)
                      .++--++.|||.           |||.+++|+..+++.+++..+++..+.-+.++++|.
T Consensus       216 d~slF~~sglq~-----------GDIavaiNnldltdp~~m~~llq~l~~m~s~qlTv~  263 (275)
T COG3031         216 DGSLFYKSGLQR-----------GDIAVAINNLDLTDPEDMFRLLQMLRNMPSLQLTVI  263 (275)
T ss_pred             CcchhhhhcCCC-----------cceEEEecCcccCCHHHHHHHHHhhhcCcceEEEEE
Confidence            445577889998           999999999999999999999998877778888774


No 67 
>PF00947 Pico_P2A:  Picornavirus core protein 2A;  InterPro: IPR000081 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. 
Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad [].  This domain defines cysteine peptidases belonging to MEROPS peptidase family C3 (picornain, clan PA(C)), subfamilies 3CA and 3CB. The protein fold of this peptidase domain for members of this family resembles that of the serine peptidase, chymotrypsin [], the type example for clan PA. Picornaviral proteins are expressed as a single polyprotein which is cleaved by the viral 3C cysteine protease []. The poliovirus polyprotein is selectively cleaved at the Gln-|-Gly bond. In other picornavirus reactions Glu may be substituted for Gln, and Ser or Thr for Gly. ; GO: 0008233 peptidase activity, 0006508 proteolysis, 0016032 viral reproduction; PDB: 2HRV_B 1Z8R_A.
Probab=92.51  E-value=1  Score=38.30  Aligned_cols=33  Identities=30%  Similarity=0.449  Sum_probs=24.9

Q ss_pred             cccEEEEccccCCCCCCCceeCCCceEEEEEeee
Q 014786          264 IQDVIQTDAAINPGNSGGPLLDSSGSLIGINTAI  297 (418)
Q Consensus       264 ~~~~i~~~~~i~~G~SGGPl~n~~G~VVGI~s~~  297 (418)
                      ..+++....+..||+-||+|+-.. -|+||++++
T Consensus        77 Q~~~l~g~Gp~~PGdCGg~L~C~H-GViGi~Tag  109 (127)
T PF00947_consen   77 QYNLLIGEGPAEPGDCGGILRCKH-GVIGIVTAG  109 (127)
T ss_dssp             EECEEEEE-SSSTT-TCSEEEETT-CEEEEEEEE
T ss_pred             ecCceeecccCCCCCCCceeEeCC-CeEEEEEeC
Confidence            345666677899999999999655 599999987


No 68 
>PF09342 DUF1986:  Domain of unknown function (DUF1986);  InterPro: IPR015420 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This domain is found in serine endopeptidases belonging to MEROPS peptidase family S1A (clan PA). It is found in unusual mosaic proteins, which are encoded by the Drosophila nudel gene (see P98159 from SWISSPROT). Nudel is involved in defining embryonic dorsoventral polarity. Three proteases, ndl, gd and snk, process easter to create active easter. Active easter defines cell identities along the dorsal-ventral continuum by activating the spz ligand for the Tl receptor in the ventral region of the embryo. Nudel, pipe and windbeutel together trigger the protease cascade within the extraembryonic perivitelline compartment which induces dorsoventral polarity of the Drosophila embryo [].
Probab=92.44  E-value=1.7  Score=41.22  Aligned_cols=88  Identities=17%  Similarity=0.244  Sum_probs=60.0

Q ss_pred             CCCeEEEEEEEcCCcEEEecccccCCC----CeEEEEeCCCcEEE------EEEEEEc-----CCCCeEEEEecCCC---
Q 014786          149 VPQGSGSGFVWDSKGHVVTNYHVIRGA----SDIRVTFADQSAYD------AKIVGFD-----QDKDVAVLRIDAPK---  210 (418)
Q Consensus       149 ~~~~~GSGfiI~~~G~ILT~aHvv~~~----~~i~V~~~dg~~~~------a~vv~~d-----~~~DlAlLkv~~~~---  210 (418)
                      ++...++|++||++ |||++-.|+.+-    ..+.+.++.++.+.      -++..+|     ++.++++|.++.+.   
T Consensus        25 dG~~~CsgvLlD~~-WlLvsssCl~~I~L~~~YvsallG~~Kt~~~v~Gp~EQI~rVD~~~~V~~S~v~LLHL~~~~~fT  103 (267)
T PF09342_consen   25 DGRYWCSGVLLDPH-WLLVSSSCLRGISLSHHYVSALLGGGKTYLSVDGPHEQISRVDCFKDVPESNVLLLHLEQPANFT  103 (267)
T ss_pred             cCeEEEEEEEeccc-eEEEeccccCCcccccceEEEEecCcceecccCCChheEEEeeeeeeccccceeeeeecCcccce
Confidence            45678999999987 999999999873    34667777776543      1233333     68899999998764   


Q ss_pred             CCCcccccCC-CCCCCCCCEEEEEeCCC
Q 014786          211 DKLRPIPIGV-SADLLVGQKVYAIGNPF  237 (418)
Q Consensus       211 ~~~~~~~l~~-~~~~~~G~~V~~vG~p~  237 (418)
                      ..+.|.-+.+ ..+....+..+++|.-.
T Consensus       104 r~VlP~flp~~~~~~~~~~~CVAVg~d~  131 (267)
T PF09342_consen  104 RYVLPTFLPETSNENESDDECVAVGHDD  131 (267)
T ss_pred             eeecccccccccCCCCCCCceEEEEccc
Confidence            2234444433 23455566899999643


No 69 
>KOG3606 consensus Cell polarity protein PAR6 [Signal transduction mechanisms]
Probab=92.29  E-value=0.44  Score=45.52  Aligned_cols=66  Identities=26%  Similarity=0.403  Sum_probs=49.3

Q ss_pred             cccCeeeccchhhh--hhCc---cCcEEEecCCCChhhhcCccccccccCCCccCCcEEEEECCEEeCC--HHHHHHHHh
Q 014786          332 PILGIKFAPDQSVE--QLGV---SGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSN--GSDLYRILD  404 (418)
Q Consensus       332 ~~lGv~~~~~~~~~--~~~~---~G~~V~~v~~~~pa~~aGl~~~~~~~~~~l~~GDiIl~ing~~v~~--~~dl~~~l~  404 (418)
                      ..||+.+.+....+  ..|+   +|+.|.+..+|+-|+..||-.          +.|.|++|||.+|..  .+++-++|-
T Consensus       171 kPLGFYIRDG~SVRVtp~GlekvpGIFISRlVpGGLAeSTGLLa----------VnDEVlEVNGIEVaGKTLDQVTDMMv  240 (358)
T KOG3606|consen  171 KPLGFYIRDGTSVRVTPHGLEKVPGIFISRLVPGGLAESTGLLA----------VNDEVLEVNGIEVAGKTLDQVTDMMV  240 (358)
T ss_pred             CCceEEEecCceEEeccccccccCceEEEeecCCccccccceee----------ecceeEEEcCEEeccccHHHHHHHHh
Confidence            46777665532221  2233   899999999999999999975          499999999999964  578888776


Q ss_pred             cCC
Q 014786          405 QCK  407 (418)
Q Consensus       405 ~~~  407 (418)
                      .+.
T Consensus       241 ANs  243 (358)
T KOG3606|consen  241 ANS  243 (358)
T ss_pred             hcc
Confidence            543


No 70 
>KOG3834 consensus Golgi reassembly stacking protein GRASP65, contains PDZ domain [Intracellular trafficking, secretion, and vesicular transport]
Probab=91.83  E-value=0.18  Score=51.04  Aligned_cols=56  Identities=30%  Similarity=0.478  Sum_probs=45.6

Q ss_pred             cCcEEEecCCCChhhhcCccccccccCCCccCCcEEEEECCEEeCCHHHHHHH-HhcCCCCCEEEEEEE
Q 014786          350 SGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRI-LDQCKVGDEVSCFTF  417 (418)
Q Consensus       350 ~G~~V~~v~~~~pa~~aGl~~~~~~~~~~l~~GDiIl~ing~~v~~~~dl~~~-l~~~~~g~~v~l~v~  417 (418)
                      .|.-|.+|..++|+.+|||..          .-|-|++|||.+++...|..+. +++.-  +.|+++||
T Consensus        15 eg~hvlkVqedSpa~~aglep----------ffdFIvSI~g~rL~~dnd~Lk~llk~~s--ekVkltv~   71 (462)
T KOG3834|consen   15 EGYHVLKVQEDSPAHKAGLEP----------FFDFIVSINGIRLNKDNDTLKALLKANS--EKVKLTVY   71 (462)
T ss_pred             eeEEEEEeecCChHHhcCcch----------hhhhhheeCcccccCchHHHHHHHHhcc--cceEEEEE
Confidence            677899999999999999997          3799999999999988765555 44443  33999886


No 71 
>PF00944 Peptidase_S3:  Alphavirus core protein ;  InterPro: IPR000930 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. Togavirin, also known as Sindbis virus core endopeptidase, is a serine protease resident at the N terminus of the p130 polyprotein of togaviruses []. The endopeptidase signature identifies the peptidase as belonging to the MEROPS peptidase family S3 (togavirin family, clan PA(S)). The polyprotein also includes structural proteins for the nucleocapsid core and for the glycoprotein spikes []. Togavirin is only active while part of the polyprotein, cleavage at a Trp-Ser bond resulting in total lack of activity []. Mutagenesis studies have identified the location of the His-Asp-Ser catalytic triad, and X-ray studies have revealed the protein fold to be similar to that of chymotrypsin [, ].; GO: 0004252 serine-type endopeptidase activity, 0006508 proteolysis, 0016020 membrane; PDB: 2YEW_D 1EP5_A 3J0C_F 1EP6_C 1WYK_D 1DYL_A 1VCQ_B 1VCP_B 1LD4_D 1KXA_A ....
Probab=91.55  E-value=0.26  Score=42.20  Aligned_cols=28  Identities=32%  Similarity=0.565  Sum_probs=23.5

Q ss_pred             ccccCCCCCCCceeCCCceEEEEEeeee
Q 014786          271 DAAINPGNSGGPLLDSSGSLIGINTAIY  298 (418)
Q Consensus       271 ~~~i~~G~SGGPl~n~~G~VVGI~s~~~  298 (418)
                      ...-.+|+||-|++|..|+||||+-.+.
T Consensus       100 ~g~g~~GDSGRpi~DNsGrVVaIVLGG~  127 (158)
T PF00944_consen  100 TGVGKPGDSGRPIFDNSGRVVAIVLGGA  127 (158)
T ss_dssp             TTS-STTSTTEEEESTTSBEEEEEEEEE
T ss_pred             cCCCCCCCCCCccCcCCCCEEEEEecCC
Confidence            4456799999999999999999998764


No 72 
>KOG3209 consensus WW domain-containing protein [General function prediction only]
Probab=91.11  E-value=0.47  Score=50.91  Aligned_cols=55  Identities=27%  Similarity=0.413  Sum_probs=44.9

Q ss_pred             cEEEecCCCChhhhcCccccccccCCCccCCcEEEEECCEEeCCH--HHHHHHHhcCCCCCEEEEEE
Q 014786          352 VLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNG--SDLYRILDQCKVGDEVSCFT  416 (418)
Q Consensus       352 ~~V~~v~~~~pa~~aGl~~~~~~~~~~l~~GDiIl~ing~~v~~~--~dl~~~l~~~~~g~~v~l~v  416 (418)
                      +-|..|.+++||++-|-          |..||+|+.|||.-+-.-  .|.-+.++..+.|+.|+|++
T Consensus       373 LqVKsvl~DGPAa~dGk----------le~GDviV~INg~cvlGhTHAqaV~~fqaiPvg~~V~L~l  429 (984)
T KOG3209|consen  373 LQVKSVLKDGPAAQDGK----------LETGDVIVHINGECVLGHTHAQAVKRFQAIPVGQSVDLVL  429 (984)
T ss_pred             eeeeecccCCchhhcCc----------cccCcEEEEECCceeccccHHHHHHHhhccccCCeeeEEE
Confidence            35778899999987654          445999999999999765  57778888889999999976


No 73 
>KOG3549 consensus Syntrophins (type gamma) [Extracellular structures]
Probab=90.92  E-value=0.36  Score=47.59  Aligned_cols=54  Identities=31%  Similarity=0.427  Sum_probs=43.2

Q ss_pred             CcEEEecCCCChhhhcCccccccccCCCccCCcEEEEECCEEeCCH--HHHHHHHhcCCCCCEEEEEE
Q 014786          351 GVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNG--SDLYRILDQCKVGDEVSCFT  416 (418)
Q Consensus       351 G~~V~~v~~~~pa~~aGl~~~~~~~~~~l~~GDiIl~ing~~v~~~--~dl~~~l~~~~~g~~v~l~v  416 (418)
                      -++|+++.++-.|+..|+-          -+||-|+.|||.-|+.-  +|+..+|..  .||+|.++|
T Consensus        81 PvviSkI~kdQaAd~tG~L----------FvGDAilqvNGi~v~~c~HeevV~iLRN--AGdeVtlTV  136 (505)
T KOG3549|consen   81 PVVISKIYKDQAADITGQL----------FVGDAILQVNGIYVTACPHEEVVNILRN--AGDEVTLTV  136 (505)
T ss_pred             cEEeehhhhhhhhhhcCce----------EeeeeeEEeccEEeecCChHHHHHHHHh--cCCEEEEEe
Confidence            3578888888777766654          34999999999999876  577788865  699999987


No 74 
>KOG0609 consensus Calcium/calmodulin-dependent serine protein kinase/membrane-associated guanylate kinase [Signal transduction mechanisms]
Probab=90.55  E-value=0.42  Score=49.80  Aligned_cols=67  Identities=28%  Similarity=0.427  Sum_probs=50.4

Q ss_pred             cccCeeeccchhhhhhCccCcEEEecCCCChhhhcCccccccccCCCccCCcEEEEECCEEeCCH--HHHHHHHhcCCCC
Q 014786          332 PILGIKFAPDQSVEQLGVSGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNG--SDLYRILDQCKVG  409 (418)
Q Consensus       332 ~~lGv~~~~~~~~~~~~~~G~~V~~v~~~~pa~~aGl~~~~~~~~~~l~~GDiIl~ing~~v~~~--~dl~~~l~~~~~g  409 (418)
                      -.+|+.+.....      .-++|.+++.|+-+++.|+-.          +||.|++|||..|.+.  .++.++|...+  
T Consensus       134 eplG~Tik~~e~------~~~~vARI~~GG~~~r~glL~----------~GD~i~EvNGi~v~~~~~~e~q~~l~~~~--  195 (542)
T KOG0609|consen  134 EPLGATIRVEED------TKVVVARIMHGGMADRQGLLH----------VGDEILEVNGISVANKSPEELQELLRNSR--  195 (542)
T ss_pred             CccceEEEeccC------CccEEeeeccCCcchhcccee----------eccchheecCeecccCCHHHHHHHHHhCC--
Confidence            356666654321      247999999999999998743          4999999999999875  68999998765  


Q ss_pred             CEEEEEE
Q 014786          410 DEVSCFT  416 (418)
Q Consensus       410 ~~v~l~v  416 (418)
                      ..+++.|
T Consensus       196 G~itfki  202 (542)
T KOG0609|consen  196 GSITFKI  202 (542)
T ss_pred             CcEEEEE
Confidence            3555544


No 75 
>KOG2921 consensus Intramembrane metalloprotease (sterol-regulatory element-binding protein (SREBP) protease) [Posttranslational modification, protein turnover, chaperones]
Probab=88.78  E-value=0.64  Score=46.75  Aligned_cols=45  Identities=38%  Similarity=0.499  Sum_probs=38.9

Q ss_pred             cCcEEEecCCCChhhh-cCccccccccCCCccCCcEEEEECCEEeCCHHHHHHHHhc
Q 014786          350 SGVLVLDAPPNGPAGK-AGLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQ  405 (418)
Q Consensus       350 ~G~~V~~v~~~~pa~~-aGl~~~~~~~~~~l~~GDiIl~ing~~v~~~~dl~~~l~~  405 (418)
                      .|+.|.+|...||..- .||.+           ||+|+++||.+|++.+|-.+.++.
T Consensus       220 ~gV~Vtev~~~Spl~gprGL~v-----------gdvitsldgcpV~~v~dW~ecl~t  265 (484)
T KOG2921|consen  220 EGVTVTEVPSVSPLFGPRGLSV-----------GDVITSLDGCPVHKVSDWLECLAT  265 (484)
T ss_pred             ceEEEEeccccCCCcCcccCCc-----------cceEEecCCcccCCHHHHHHHHHh
Confidence            8999999999999533 47887           999999999999999998877763


No 76 
>KOG3605 consensus Beta amyloid precursor-binding protein [General function prediction only]
Probab=88.57  E-value=0.6  Score=49.64  Aligned_cols=101  Identities=24%  Similarity=0.376  Sum_probs=69.1

Q ss_pred             cCCCCCCCcee-----CCCceEEEEEeeeeCCCCCCCCccceeecccchhhhhhhhhcccccc------cccCeeeccch
Q 014786          274 INPGNSGGPLL-----DSSGSLIGINTAIYSPSGASSGVGFSIPVDTVNGIVDQLVKFGKVTR------PILGIKFAPDQ  342 (418)
Q Consensus       274 i~~G~SGGPl~-----n~~G~VVGI~s~~~~~~~~~~~~~~aIp~~~i~~~l~~l~~~g~v~~------~~lGv~~~~~~  342 (418)
                      +..-|+|||.-     |...+++.|+-...          ..+|.+..+..++.+++.-.|+.      |..-+.+.-.+
T Consensus       677 iAnmm~~GpAarsgkLnIGDQiiaING~SL----------VGLPLstcQs~Ik~~KnQT~VkltiV~cpPV~~V~I~RPd  746 (829)
T KOG3605|consen  677 IANMMHGGPAARSGKLNIGDQIMSINGTSL----------VGLPLSTCQSIIKGLKNQTAVKLNIVSCPPVTTVLIRRPD  746 (829)
T ss_pred             HHhcccCChhhhcCCccccceeEeecCcee----------ccccHHHHHHHHhcccccceEEEEEecCCCceEEEeeccc
Confidence            34557888874     44456777764332          24889999999988876554432      22223333333


Q ss_pred             hhhhhCc---cCcEEEecCCCChhhhcCccccccccCCCccCCcEEEEECCEEeCCH
Q 014786          343 SVEQLGV---SGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNG  396 (418)
Q Consensus       343 ~~~~~~~---~G~~V~~v~~~~pa~~aGl~~~~~~~~~~l~~GDiIl~ing~~v~~~  396 (418)
                      ....+|.   .| +|-....|+-|+|-|+++           |-.|++|||+.|.--
T Consensus       747 ~kyQLGFSVQNG-iICSLlRGGIAERGGVRV-----------GHRIIEINgQSVVA~  791 (829)
T KOG3605|consen  747 LRYQLGFSVQNG-IICSLLRGGIAERGGVRV-----------GHRIIEINGQSVVAT  791 (829)
T ss_pred             chhhccceeeCc-EeehhhcccchhccCcee-----------eeeEEEECCceEEec
Confidence            3445664   56 566889999999999999           999999999988643


No 77 
>KOG3552 consensus FERM domain protein FRM-8 [General function prediction only]
Probab=87.95  E-value=0.61  Score=51.48  Aligned_cols=64  Identities=28%  Similarity=0.476  Sum_probs=46.4

Q ss_pred             cccCeeeccchhhhhhCccCcEEEecCCCChhhhcCccccccccCCCccCCcEEEEECCEEeCCH--HHHHHHHhcCCCC
Q 014786          332 PILGIKFAPDQSVEQLGVSGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNG--SDLYRILDQCKVG  409 (418)
Q Consensus       332 ~~lGv~~~~~~~~~~~~~~G~~V~~v~~~~pa~~aGl~~~~~~~~~~l~~GDiIl~ing~~v~~~--~dl~~~l~~~~~g  409 (418)
                      +.||+-|...        .-|+|..|.+|+|+            .|+|+.||.|+.|||++|...  +-+.+++..+  .
T Consensus        65 ~~lGFgfvag--------rPviVr~VT~GGps------------~GKL~PGDQIl~vN~Epv~daprervIdlvRac--e  122 (1298)
T KOG3552|consen   65 ASLGFGFVAG--------RPVIVRFVTEGGPS------------IGKLQPGDQILAVNGEPVKDAPRERVIDLVRAC--E  122 (1298)
T ss_pred             ccccceeecC--------CceEEEEecCCCCc------------cccccCCCeEEEecCcccccccHHHHHHHHHHH--h
Confidence            5556555432        45789999999996            477888999999999999875  4666777665  3


Q ss_pred             CEEEEEEE
Q 014786          410 DEVSCFTF  417 (418)
Q Consensus       410 ~~v~l~v~  417 (418)
                      +.|.++|.
T Consensus       123 ~sv~ltV~  130 (1298)
T KOG3552|consen  123 SSVNLTVC  130 (1298)
T ss_pred             hhcceEEe
Confidence            45666653


No 78 
>KOG3542 consensus cAMP-regulated guanine nucleotide exchange factor [Signal transduction mechanisms]
Probab=87.47  E-value=0.46  Score=50.69  Aligned_cols=55  Identities=25%  Similarity=0.358  Sum_probs=40.6

Q ss_pred             cCcEEEecCCCChhhhcCccccccccCCCccCCcEEEEECCEEeCCHHHHHHHHhcCCCCCEEEEEE
Q 014786          350 SGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQCKVGDEVSCFT  416 (418)
Q Consensus       350 ~G~~V~~v~~~~pa~~aGl~~~~~~~~~~l~~GDiIl~ing~~v~~~~dl~~~l~~~~~g~~v~l~v  416 (418)
                      .|++|.+|.+++.|++.||+.           ||.|++|||+.-.+... .++..-.+....+.++|
T Consensus       562 fgifV~~V~pgskAa~~GlKR-----------gDqilEVNgQnfenis~-~KA~eiLrnnthLtltv  616 (1283)
T KOG3542|consen  562 FGIFVAEVFPGSKAAREGLKR-----------GDQILEVNGQNFENISA-KKAEEILRNNTHLTLTV  616 (1283)
T ss_pred             ceeEEeeecCCchHHHhhhhh-----------hhhhhhccccchhhhhH-HHHHHHhcCCceEEEEE
Confidence            689999999999999999999           99999999998776543 23332223344444443


No 79 
>KOG3834 consensus Golgi reassembly stacking protein GRASP65, contains PDZ domain [Intracellular trafficking, secretion, and vesicular transport]
Probab=86.97  E-value=0.69  Score=47.01  Aligned_cols=53  Identities=26%  Similarity=0.513  Sum_probs=45.2

Q ss_pred             EEecCCCChhhhcCccccccccCCCccCCcEEEEECCEEeCCHHHHHHHHhcCCCCCEEEEEEE
Q 014786          354 VLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQCKVGDEVSCFTF  417 (418)
Q Consensus       354 V~~v~~~~pa~~aGl~~~~~~~~~~l~~GDiIl~ing~~v~~~~dl~~~l~~~~~g~~v~l~v~  417 (418)
                      |.+|.+++||++|||+.          .+|.|+-+-+.--...+||...|..+ .++.+++-||
T Consensus       113 vl~V~p~SPaalAgl~~----------~~DYivG~~~~~~~~~eDl~~lIesh-e~kpLklyVY  165 (462)
T KOG3834|consen  113 VLSVEPNSPAALAGLRP----------YTDYIVGIWDAVMHEEEDLFTLIESH-EGKPLKLYVY  165 (462)
T ss_pred             eeecCCCCHHHhccccc----------ccceEecchhhhccchHHHHHHHHhc-cCCCcceeEe
Confidence            78899999999999995          38999999666677789999999876 4888888776


No 80 
>KOG3580 consensus Tight junction proteins [Signal transduction mechanisms]
Probab=86.86  E-value=0.85  Score=48.11  Aligned_cols=55  Identities=25%  Similarity=0.378  Sum_probs=39.4

Q ss_pred             cCcEEEecCCCChhhhcCccccccccCCCccCCcEEEEECCEEeCCHHHHHHHHhcCCCCCEEEEEE
Q 014786          350 SGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQCKVGDEVSCFT  416 (418)
Q Consensus       350 ~G~~V~~v~~~~pa~~aGl~~~~~~~~~~l~~GDiIl~ing~~v~~~~dl~~~l~~~~~g~~v~l~v  416 (418)
                      .-++|.+|.+|+||+            |.||.||-|+.|||....+...-..+-+-.+-|+..+++|
T Consensus        40 tSiViSDVlpGGPAe------------G~LQenDrvvMVNGvsMenv~haFAvQqLrksgK~A~Itv   94 (1027)
T KOG3580|consen   40 TSIVISDVLPGGPAE------------GLLQENDRVVMVNGVSMENVLHAFAVQQLRKSGKVAAITV   94 (1027)
T ss_pred             eeEEEeeccCCCCcc------------cccccCCeEEEEcCcchhhhHHHHHHHHHHhhccceeEEe
Confidence            457899999999985            6677799999999999887765443322223455555554


No 81 
>COG0750 Predicted membrane-associated Zn-dependent proteases 1 [Cell envelope biogenesis, outer membrane]
Probab=84.63  E-value=2.3  Score=42.90  Aligned_cols=44  Identities=39%  Similarity=0.605  Sum_probs=38.7

Q ss_pred             ecCCCChhhhcCccccccccCCCccCCcEEEEECCEEeCCHHHHHHHHhcCCCCCE
Q 014786          356 DAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQCKVGDE  411 (418)
Q Consensus       356 ~v~~~~pa~~aGl~~~~~~~~~~l~~GDiIl~ing~~v~~~~dl~~~l~~~~~g~~  411 (418)
                      .+..++++..+|++.           ||.|+++|++++.+++++.+.+... .|..
T Consensus       135 ~v~~~s~a~~a~l~~-----------Gd~iv~~~~~~i~~~~~~~~~~~~~-~~~~  178 (375)
T COG0750         135 EVAPKSAAALAGLRP-----------GDRIVAVDGEKVASWDDVRRLLVAA-AGDV  178 (375)
T ss_pred             ecCCCCHHHHcCCCC-----------CCEEEeECCEEccCHHHHHHHHHhc-cCCc
Confidence            688899999999999           9999999999999999999888754 3444


No 82 
>KOG3551 consensus Syntrophins (type beta) [Extracellular structures]
Probab=84.55  E-value=0.86  Score=45.71  Aligned_cols=70  Identities=27%  Similarity=0.362  Sum_probs=46.8

Q ss_pred             ccccCeeeccchhhhhhCccCcEEEecCCCChhhhcCccccccccCCCccCCcEEEEECCEEeCCHH--HHHHHHhcCCC
Q 014786          331 RPILGIKFAPDQSVEQLGVSGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGS--DLYRILDQCKV  408 (418)
Q Consensus       331 ~~~lGv~~~~~~~~~~~~~~G~~V~~v~~~~pa~~aGl~~~~~~~~~~l~~GDiIl~ing~~v~~~~--dl~~~l~~~~~  408 (418)
                      -+-||+.+.-....+    --++|+++-++-.|++.+          .|-.||.|++|||....+..  +-.++|+  +.
T Consensus        95 ~gGLGISIKGGreNk----MPIlISKIFkGlAADQt~----------aL~~gDaIlSVNG~dL~~AtHdeAVqaLK--ra  158 (506)
T KOG3551|consen   95 AGGLGISIKGGRENK----MPILISKIFKGLAADQTG----------ALFLGDAILSVNGEDLRDATHDEAVQALK--RA  158 (506)
T ss_pred             CCcceEEeecCcccC----CceehhHhcccccccccc----------ceeeccEEEEecchhhhhcchHHHHHHHH--hh
Confidence            466777665422111    346788888887776653          34459999999999988764  3444554  57


Q ss_pred             CCEEEEEE
Q 014786          409 GDEVSCFT  416 (418)
Q Consensus       409 g~~v~l~v  416 (418)
                      |++|.+.|
T Consensus       159 GkeV~lev  166 (506)
T KOG3551|consen  159 GKEVLLEV  166 (506)
T ss_pred             Cceeeeee
Confidence            99988866


No 83 
>PF03510 Peptidase_C24:  2C endopeptidase (C24) cysteine protease family;  InterPro: IPR000317 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. 
These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad [].  The two signatures that define this group of calicivirus polyproteins identify a cysteine peptidase signature that belongs to MEROPS peptidase family C24 (clan PA(C)). Caliciviruses are positive-stranded ssRNA viruses that cause gastroenteritis. The calicivirus genome contains two open reading frames, ORF1 and ORF2. ORF2 encodes a structural protein []; while ORF1 encodes a non-structural polypeptide, which has RNA helicase, cysteine protease and RNA polymerase activity. The regions of the polyprotein in which these activities lie are similar to proteins produced by the picornaviruses. Two different families of caliciviruses can be distinguished on the basis of sequence similarity, namely those classified as small round structured viruses (SRSVs) and those classed as non-SRSVs. Calicivirus proteases from the non-SRSV group, which are members of the PA protease clan, constitute family C24 of the cysteine proteases (proteases from SRSVs belong to the C37 family). As mentioned above, the protease activity resides within a polyprotein. The enzyme cleaves the polyprotein at sites N-terminal to itself, liberating the polyprotein helicase.; GO: 0004197 cysteine-type endopeptidase activity, 0006508 proteolysis
Probab=84.22  E-value=4.3  Score=33.51  Aligned_cols=53  Identities=23%  Similarity=0.339  Sum_probs=35.1

Q ss_pred             EEEEcCCcEEEecccccCCCCeEEEEeCCCcEEEEEEEEEcCCCCeEEEEecCCCCCCcccccCC
Q 014786          156 GFVWDSKGHVVTNYHVIRGASDIRVTFADQSAYDAKIVGFDQDKDVAVLRIDAPKDKLRPIPIGV  220 (418)
Q Consensus       156 GfiI~~~G~ILT~aHvv~~~~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkv~~~~~~~~~~~l~~  220 (418)
                      ++=|. +|..+|+.||.+.++.+.     |..+  +++.  ...|+|+++.+..  .++.+++++
T Consensus         3 avHIG-nG~~vt~tHva~~~~~v~-----g~~f--~~~~--~~ge~~~v~~~~~--~~p~~~ig~   55 (105)
T PF03510_consen    3 AVHIG-NGRYVTVTHVAKSSDSVD-----GQPF--KIVK--TDGELCWVQSPLV--HLPAAQIGT   55 (105)
T ss_pred             eEEeC-CCEEEEEEEEeccCceEc-----CcCc--EEEE--eccCEEEEECCCC--CCCeeEecc
Confidence            55565 589999999999877652     3222  2222  3559999999763  356666653


No 84 
>PF02395 Peptidase_S6:  Immunoglobulin A1 protease Serine protease Prosite pattern;  InterPro: IPR000710 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. 
They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Not withstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This group of serine peptidases belong to the MEROPS peptidase family S6 (clan PA(S)). The type sample being the IgA1-specific serine endopeptidase from Neisseria gonorrhoeae []. These cleave prolyl bonds in the hinge regions of immunoglobulin A heavy chains. Similar specificity is shown by the unrelated family of M26 metalloendopeptidases.; GO: 0004252 serine-type endopeptidase activity, 0006508 proteolysis; PDB: 3SZE_A 3H09_B 3SYJ_A 1WXR_A 3AK5_B.
Probab=83.89  E-value=3.7  Score=45.58  Aligned_cols=54  Identities=19%  Similarity=0.256  Sum_probs=34.2

Q ss_pred             EEEEEEEcCCcEEEecccccCCCCeEEEEeCCCcEEEEEEEEEc--CCCCeEEEEecCC
Q 014786          153 SGSGFVWDSKGHVVTNYHVIRGASDIRVTFADQSAYDAKIVGFD--QDKDVAVLRIDAP  209 (418)
Q Consensus       153 ~GSGfiI~~~G~ILT~aHvv~~~~~i~V~~~dg~~~~a~vv~~d--~~~DlAlLkv~~~  209 (418)
                      .|...+|+++ ||+|.+|+..+...+..-..++..|.  ++...  +..|+.+-|++.-
T Consensus        66 ~G~aTLigpq-YiVSV~HN~~gy~~v~FG~~g~~~Y~--iV~RNn~~~~Df~~pRLnK~  121 (769)
T PF02395_consen   66 KGVATLIGPQ-YIVSVKHNGKGYNSVSFGNEGQNTYK--IVDRNNYPSGDFHMPRLNKF  121 (769)
T ss_dssp             TSS-EEEETT-EEEBETTG-TSCCEECESCSSTCEEE--EEEEEBETTSTEBEEEESS-
T ss_pred             CceEEEecCC-eEEEEEccCCCcCceeecccCCceEE--EEEccCCCCcccceeecCce
Confidence            4779999986 99999999866554433222344553  33333  4469999999763


No 85 
>KOG3605 consensus Beta amyloid precursor-binding protein [General function prediction only]
Probab=83.54  E-value=1.3  Score=47.21  Aligned_cols=58  Identities=28%  Similarity=0.383  Sum_probs=46.1

Q ss_pred             cCcEEEecCCCChhhhcCccccccccCCCccCCcEEEEECCEEeCCH--HHHHHHHhcCCCCCEEEEEEE
Q 014786          350 SGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNG--SDLYRILDQCKVGDEVSCFTF  417 (418)
Q Consensus       350 ~G~~V~~v~~~~pa~~aGl~~~~~~~~~~l~~GDiIl~ing~~v~~~--~dl~~~l~~~~~g~~v~l~v~  417 (418)
                      +-|+|...+.++||+|.|          +|-.||-|++|||......  +..+.+++..|.-..|+++|.
T Consensus       673 PTVViAnmm~~GpAarsg----------kLnIGDQiiaING~SLVGLPLstcQs~Ik~~KnQT~VkltiV  732 (829)
T KOG3605|consen  673 PTVVIANMMHGGPAARSG----------KLNIGDQIMSINGTSLVGLPLSTCQSIIKGLKNQTAVKLNIV  732 (829)
T ss_pred             hHHHHHhcccCChhhhcC----------CccccceeEeecCceeccccHHHHHHHHhcccccceEEEEEe
Confidence            556777888899998774          4556999999999988775  567788888877778888774


No 86 
>PF02907 Peptidase_S29:  Hepatitis C virus NS3 protease;  InterPro: IPR004109 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Not withstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This signature identifies the Hepatitis C virus NS3 protein as a serine protease which belongs to MEROPS peptidase family S29 (hepacivirin family, clan PA(S)), which has a trypsin-like fold. The non-structural (NS) protein NS3 is one of the NS proteins involved in replication of the HCV genome. The NS2 proteinase (IPR002518 from INTERPRO), a zinc-dependent enzyme, performs a single proteolytic cut to release the N terminus of NS3. The action of NS3 proteinase (NS3P), which resides in the N-terminal one-third of the NS3 protein, then yields all remaining non-structural proteins. The C-terminal two-thirds of the NS3 protein contain a helicase. The functional relationship between the proteinase and helicase domains is unknown. NS3 has a structural zinc-binding site and requires cofactor NS4. 
It has been suggested that the NS3 serine protease of hepatitis C is involved in cell transformation and that the ability to transform requires an active enzyme [].; GO: 0008236 serine-type peptidase activity, 0006508 proteolysis, 0019087 transformation of host cell by virus; PDB: 2QV1_B 3LOX_C 2OBQ_C 2OC1_C 2OC0_A 3LON_A 3KNX_A 2O8M_A 2OBO_A 2OC8_A ....
Probab=82.66  E-value=0.62  Score=39.99  Aligned_cols=115  Identities=21%  Similarity=0.254  Sum_probs=54.7

Q ss_pred             EEEEEcCCcEEEecccccCCCCeEEEEeCCCcEEEEEEEEEcCCCCeEEEEecCCCCCCcccccCCCCCCCCCCEEEEEe
Q 014786          155 SGFVWDSKGHVVTNYHVIRGASDIRVTFADQSAYDAKIVGFDQDKDVAVLRIDAPKDKLRPIPIGVSADLLVGQKVYAIG  234 (418)
Q Consensus       155 SGfiI~~~G~ILT~aHvv~~~~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkv~~~~~~~~~~~l~~~~~~~~G~~V~~vG  234 (418)
                      -|+.|+  |-.-|.+|--... .  +--+.|   +..-.+.+.+.|+..-....-...+.+..-+.       +.+|++-
T Consensus        15 mgt~vn--GV~wT~~HGagsr-t--lAgp~G---pv~q~~~s~~~Dlv~~p~P~Ga~SL~pCtCg~-------~dlylVt   79 (148)
T PF02907_consen   15 MGTCVN--GVMWTVYHGAGSR-T--LAGPKG---PVNQMYTSVDDDLVGWPAPPGARSLTPCTCGS-------SDLYLVT   79 (148)
T ss_dssp             EEEEET--TEEEEEHHHHTTS-E--EEBTTS---EB-ESEEETTTTEEEEE-STTB--BBB-SSSS-------SEEEEE-
T ss_pred             ehhEEc--cEEEEEEecCCcc-c--ccCCCC---cceEeEEcCCCCCcccccccccccCCccccCC-------ccEEEEe
Confidence            477775  7888888864321 1  111111   12233566777888777654333344433321       3566664


Q ss_pred             CCCCCCCceeEeEEeeeeeeecccCCCCCcccE-EEEccccCCCCCCCceeCCCceEEEEEeeeeC
Q 014786          235 NPFGLDHTLTTGVISGLRREISSAATGRPIQDV-IQTDAAINPGNSGGPLLDSSGSLIGINTAIYS  299 (418)
Q Consensus       235 ~p~g~~~~~~~G~Vs~~~~~~~~~~~~~~~~~~-i~~~~~i~~G~SGGPl~n~~G~VVGI~s~~~~  299 (418)
                      +-.    .+-.+     ++.      ++..... .-.......|.||||++..+|.+|||..+...
T Consensus        80 r~~----~v~p~-----rr~------gd~~~~L~sp~pis~lkGSSGgPiLC~~GH~vG~f~aa~~  130 (148)
T PF02907_consen   80 RDA----DVIPV-----RRR------GDSRASLLSPRPISDLKGSSGGPILCPSGHAVGMFRAAVC  130 (148)
T ss_dssp             TTS-----EEEE-----EEE------STTEEEEEEEEEHHHHTT-TT-EEEETTSEEEEEEEEEEE
T ss_pred             ccC----cEeee-----EEc------CCCceEecCCceeEEEecCCCCcccCCCCCEEEEEEEEEE
Confidence            321    11111     111      0100011 11112234799999999999999999876544


No 87 
>KOG3571 consensus Dishevelled 3 and related proteins [General function prediction only]
Probab=82.28  E-value=2.1  Score=44.40  Aligned_cols=37  Identities=30%  Similarity=0.370  Sum_probs=29.6

Q ss_pred             cCcEEEecCCCChhhhcCccccccccCCCccCCcEEEEECCEEeCCH
Q 014786          350 SGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNG  396 (418)
Q Consensus       350 ~G~~V~~v~~~~pa~~aGl~~~~~~~~~~l~~GDiIl~ing~~v~~~  396 (418)
                      .|+||.++++++.-+          +.|++..||+||.||.....++
T Consensus       277 ggIYVgsImkgGAVA----------~DGRIe~GDMiLQVNevsFENm  313 (626)
T KOG3571|consen  277 GGIYVGSIMKGGAVA----------LDGRIEPGDMILQVNEVSFENM  313 (626)
T ss_pred             CceEEeeeccCceee----------ccCccCccceEEEeeecchhhc
Confidence            799999999998643          3466666999999999776665


No 88 
>KOG0606 consensus Microtubule-associated serine/threonine kinase and related proteins [Signal transduction mechanisms; General function prediction only]
Probab=74.67  E-value=4  Score=46.41  Aligned_cols=50  Identities=34%  Similarity=0.444  Sum_probs=40.4

Q ss_pred             EEEecCCCChhhhcCccccccccCCCccCCcEEEEECCEEeCCHH--HHHHHHhcCCCCCEEEEE
Q 014786          353 LVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGS--DLYRILDQCKVGDEVSCF  415 (418)
Q Consensus       353 ~V~~v~~~~pa~~aGl~~~~~~~~~~l~~GDiIl~ing~~v~~~~--dl~~~l~~~~~g~~v~l~  415 (418)
                      .|-.|.+++||..+|+++           ||.|+.+||++|....  ++.+.|.+  -|..+.+.
T Consensus       661 ~v~sv~egsPA~~agls~-----------~DlIthvnge~v~gl~H~ev~~Lll~--~gn~v~~~  712 (1205)
T KOG0606|consen  661 SVGSVEEGSPAFEAGLSA-----------GDLITHVNGEPVHGLVHTEVMELLLK--SGNKVTLR  712 (1205)
T ss_pred             eeeeecCCCCccccCCCc-----------cceeEeccCcccchhhHHHHHHHHHh--cCCeeEEE
Confidence            477889999999999999           9999999999998874  66666654  35565554


No 89 
>PF01732 DUF31:  Putative peptidase (DUF31);  InterPro: IPR022382  This domain has no known function. It is found in various hypothetical proteins and putative lipoproteins from mycoplasmas. 
Probab=73.19  E-value=2.5  Score=42.92  Aligned_cols=23  Identities=26%  Similarity=0.557  Sum_probs=20.7

Q ss_pred             ccCCCCCCCceeCCCceEEEEEe
Q 014786          273 AINPGNSGGPLLDSSGSLIGINT  295 (418)
Q Consensus       273 ~i~~G~SGGPl~n~~G~VVGI~s  295 (418)
                      .+..|.||+.|+|.+|++|||..
T Consensus       351 ~l~gGaSGS~V~n~~~~lvGIy~  373 (374)
T PF01732_consen  351 SLGGGASGSMVINQNNELVGIYF  373 (374)
T ss_pred             CCCCCCCcCeEECCCCCEEEEeC
Confidence            55689999999999999999975


No 90 
>KOG3651 consensus Protein kinase C, alpha binding protein [Signal transduction mechanisms]
Probab=69.39  E-value=10  Score=37.11  Aligned_cols=47  Identities=26%  Similarity=0.458  Sum_probs=36.7

Q ss_pred             cCcEEEecCCCChhhhcCccccccccCCCccCCcEEEEECCEEeCCHH--HHHHHHhcC
Q 014786          350 SGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGS--DLYRILDQC  406 (418)
Q Consensus       350 ~G~~V~~v~~~~pa~~aGl~~~~~~~~~~l~~GDiIl~ing~~v~~~~--dl~~~l~~~  406 (418)
                      +-+||..|-.++||.+-|          .++.||-|++|||..|....  ++.++++..
T Consensus        30 PClYiVQvFD~tPAa~dG----------~i~~GDEi~avNg~svKGktKveVAkmIQ~~   78 (429)
T KOG3651|consen   30 PCLYIVQVFDKTPAAKDG----------RIRCGDEIVAVNGISVKGKTKVEVAKMIQVS   78 (429)
T ss_pred             CeEEEEEeccCCchhccC----------ccccCCeeEEecceeecCccHHHHHHHHHHh
Confidence            457899999999998754          33449999999999998764  566777654


No 91 
>PF05416 Peptidase_C37:  Southampton virus-type processing peptidase;  InterPro: IPR001665 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. 
These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionary related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad [].  This group of cysteine peptidases belong to the MEROPS peptidase family C37, (clan PA(C)). The type example is calicivirin from Southampton virus, an endopeptidase that cleaves the polyprotein at sites N-terminal to itself, liberating the polyprotein helicase. Southampton virus is a positive-stranded ssRNA virus belonging to the Caliciviruses, which are viruses that cause gastroenteritis. The calicivirus genome contains two open reading frames, ORF1 and ORF2. ORF1 encodes a non-structural polypeptide, which has RNA helicase, cysteine protease and RNA polymerase activity []. The regions of the polyprotein in which these activities lie are similar to proteins produced by the picornaviruses []. ORF2 encodes a structural, capsid protein. Two different families of caliciviruses can be distinguished on the basis of sequence similarity, namely the Norwalk-like viruses or small round structured viruses (SRSVs), and those classed as non-SRSVs.; GO: 0004197 cysteine-type endopeptidase activity, 0006508 proteolysis; PDB: 2FYQ_A 2FYR_A 1WQS_D 4ASH_A 2IPH_B.
Probab=62.13  E-value=26  Score=36.02  Aligned_cols=135  Identities=20%  Similarity=0.296  Sum_probs=63.5

Q ss_pred             CeEEEEEEEcCCcEEEecccccCCCC-eEEEEeCCCcEEEEEEEEEcCCCCeEEEEecCCC-CCCcccccCCCCCCCCCC
Q 014786          151 QGSGSGFVWDSKGHVVTNYHVIRGAS-DIRVTFADQSAYDAKIVGFDQDKDVAVLRIDAPK-DKLRPIPIGVSADLLVGQ  228 (418)
Q Consensus       151 ~~~GSGfiI~~~G~ILT~aHvv~~~~-~i~V~~~dg~~~~a~vv~~d~~~DlAlLkv~~~~-~~~~~~~l~~~~~~~~G~  228 (418)
                      -+.|-||-|+++ ..+|+-||+.... ++.     |  .+..-+.++..-+++-+++..+. .+++.+-|.  +-...|.
T Consensus       378 fGsGWGfWVS~~-lfITttHViP~g~~E~F-----G--v~i~~i~vh~sGeF~~~rFpk~iRPDvtgmiLE--eGapEGt  447 (535)
T PF05416_consen  378 FGSGWGFWVSPT-LFITTTHVIPPGAKEAF-----G--VPISQIQVHKSGEFCRFRFPKPIRPDVTGMILE--EGAPEGT  447 (535)
T ss_dssp             ETTEEEEESSSS-EEEEEGGGS-STTSEET-----T--EECGGEEEEEETTEEEEEESS-SSTTS---EE---SS--TT-
T ss_pred             cCCceeeeecce-EEEEeeeecCCcchhhh-----C--CChhHeEEeeccceEEEecCCCCCCCccceeec--cCCCCce
Confidence            367999999987 9999999997532 211     0  01111233344577777776643 234444442  1233454


Q ss_pred             EEEE-EeCCCCCC--CceeEeEEeeeeeeecccCCCCCcccEEEE-------ccccCCCCCCCceeCCCce---EEEEEe
Q 014786          229 KVYA-IGNPFGLD--HTLTTGVISGLRREISSAATGRPIQDVIQT-------DAAINPGNSGGPLLDSSGS---LIGINT  295 (418)
Q Consensus       229 ~V~~-vG~p~g~~--~~~~~G~Vs~~~~~~~~~~~~~~~~~~i~~-------~~~i~~G~SGGPl~n~~G~---VVGI~s  295 (418)
                      -+.+ |-.+.|.-  ..+.-|......-.-.. ..+  ...++.+       |-...||+-|.|-+-..|+   |+|+++
T Consensus       448 V~siLiKR~sGEllpLAvRMgt~AsmkIqgr~-v~G--Q~GMLLTGaNAK~mDLGT~PGDCGcPYvyKrgNd~VV~GVH~  524 (535)
T PF05416_consen  448 VCSILIKRPSGELLPLAVRMGTHASMKIQGRT-VHG--QMGMLLTGANAKGMDLGTIPGDCGCPYVYKRGNDWVVIGVHA  524 (535)
T ss_dssp             EEEEEEE-TTSBEEEEEEEEEEEEEEEETTEE-EEE--EEEEETTSTT-SSTTTS--TTGTT-EEEEEETTEEEEEEEEE
T ss_pred             EEEEEEEcCCccchhhhhhhccceeEEEccee-ecc--eeeeeeecCCccccccCCCCCCCCCceeeecCCcEEEEEEEe
Confidence            4333 33444421  23344433322211000 000  1122222       3345689999999976654   999998


Q ss_pred             eee
Q 014786          296 AIY  298 (418)
Q Consensus       296 ~~~  298 (418)
                      +..
T Consensus       525 AAt  527 (535)
T PF05416_consen  525 AAT  527 (535)
T ss_dssp             EE-
T ss_pred             hhc
Confidence            764


No 92 
>PF11874 DUF3394:  Domain of unknown function (DUF3394);  InterPro: IPR021814  This domain is functionally uncharacterised. This domain is found in bacteria. This presumed domain is about 190 amino acids in length. This domain is found associated with PF06808 from PFAM. 
Probab=53.59  E-value=56  Score=29.83  Aligned_cols=37  Identities=30%  Similarity=0.333  Sum_probs=29.5

Q ss_pred             CeeeccchhhhhhCccCcEEEecCCCChhhhcCccccccccCCCccCCcEEEEEC
Q 014786          335 GIKFAPDQSVEQLGVSGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVN  389 (418)
Q Consensus       335 Gv~~~~~~~~~~~~~~G~~V~~v~~~~pa~~aGl~~~~~~~~~~l~~GDiIl~in  389 (418)
                      |+.+.++.       ..++|..|.-+|||+++|+.-           |+.|+++-
T Consensus       114 GL~l~~e~-------~~~~Vd~v~fgS~A~~~g~d~-----------d~~I~~v~  150 (183)
T PF11874_consen  114 GLTLMEEG-------GKVIVDEVEFGSPAEKAGIDF-----------DWEITEVE  150 (183)
T ss_pred             CCEEEeeC-------CEEEEEecCCCCHHHHcCCCC-----------CcEEEEEE
Confidence            66655533       557899999999999999998           88887763


No 93 
>cd00600 Sm_like The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation.  Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet.  Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=52.40  E-value=41  Score=24.30  Aligned_cols=33  Identities=18%  Similarity=0.415  Sum_probs=28.3

Q ss_pred             CeEEEEeCCCcEEEEEEEEEcCCCCeEEEEecC
Q 014786          176 SDIRVTFADQSAYDAKIVGFDQDKDVAVLRIDA  208 (418)
Q Consensus       176 ~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkv~~  208 (418)
                      ..+.|.+.||+.+.+.+..+|...++.+-....
T Consensus         7 ~~V~V~l~~g~~~~G~L~~~D~~~Ni~L~~~~~   39 (63)
T cd00600           7 KTVRVELKDGRVLEGVLVAFDKYMNLVLDDVEE   39 (63)
T ss_pred             CEEEEEECCCcEEEEEEEEECCCCCEEECCEEE
Confidence            468899999999999999999998888776643


No 94 
>cd01735 LSm12_N LSm12 belongs to a family of Sm-like proteins that associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation.  Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet that associates with other Sm proteins to form hexameric and heptameric ring structures.   In addition to the N-terminal Sm-like domain, LSm12 has a novel methyltransferase domain.
Probab=50.57  E-value=57  Score=24.20  Aligned_cols=33  Identities=15%  Similarity=0.303  Sum_probs=28.5

Q ss_pred             CeEEEEeCCCcEEEEEEEEEcCCCCeEEEEecC
Q 014786          176 SDIRVTFADQSAYDAKIVGFDQDKDVAVLRIDA  208 (418)
Q Consensus       176 ~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkv~~  208 (418)
                      ..+.+.+-.|..++++|+++|....+.+|+-.+
T Consensus         7 s~V~~kTc~g~~ieGEV~afD~~tk~lIlk~~s   39 (61)
T cd01735           7 SQVSCRTCFEQRLQGEVVAFDYPSKMLILKCPS   39 (61)
T ss_pred             cEEEEEecCCceEEEEEEEecCCCcEEEEECcc
Confidence            456777788999999999999999999998654


No 95 
>cd01720 Sm_D2 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation.  Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. Sm subunit D2 heterodimerizes with subunit D1 and three such heterodimers form a hexameric ring structure with alternating D1 and D2 subunits. The D1 - D2 heterodimer also assembles into a heptameric ring containing D2, D3, E, F, and G subunits. Sm-like proteins exist in archaea as well as prokaryotes which form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=50.13  E-value=31  Score=27.49  Aligned_cols=37  Identities=5%  Similarity=0.297  Sum_probs=30.9

Q ss_pred             ccCCCCeEEEEeCCCcEEEEEEEEEcCCCCeEEEEec
Q 014786          171 VIRGASDIRVTFADQSAYDAKIVGFDQDKDVAVLRID  207 (418)
Q Consensus       171 vv~~~~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkv~  207 (418)
                      ++.....+.|.+.+++.+.+++.++|...++.|=...
T Consensus        10 ~~~~~~~V~V~lr~~r~~~G~L~~fD~hmNlvL~d~~   46 (87)
T cd01720          10 AVKNNTQVLINCRNNKKLLGRVKAFDRHCNMVLENVK   46 (87)
T ss_pred             HHcCCCEEEEEEcCCCEEEEEEEEecCccEEEEcceE
Confidence            3445578899999999999999999999998876554


No 96 
>KOG3938 consensus RGS-GAIP interacting protein GIPC, contains PDZ domain [Signal transduction mechanisms; Intracellular trafficking, secretion, and vesicular transport]
Probab=50.03  E-value=16  Score=35.15  Aligned_cols=56  Identities=14%  Similarity=0.303  Sum_probs=42.5

Q ss_pred             cEEEecCCCChhhhcCccccccccCCCccCCcEEEEECCEEeCCHH--HHHHHHhcCCCCCEEEEEEE
Q 014786          352 VLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGS--DLYRILDQCKVGDEVSCFTF  417 (418)
Q Consensus       352 ~~V~~v~~~~pa~~aGl~~~~~~~~~~l~~GDiIl~ing~~v~~~~--dl~~~l~~~~~g~~v~l~v~  417 (418)
                      +.|..+.++|--++.-          -..+||.|-+|||+.|....  ++.++|+..+.|++.++.++
T Consensus       151 AFIKrIkegsvidri~----------~i~VGd~IEaiNge~ivG~RHYeVArmLKel~rge~ftlrLi  208 (334)
T KOG3938|consen  151 AFIKRIKEGSVIDRIE----------AICVGDHIEAINGESIVGKRHYEVARMLKELPRGETFTLRLI  208 (334)
T ss_pred             eeeEeecCCchhhhhh----------heeHHhHHHhhcCccccchhHHHHHHHHHhcccCCeeEEEee
Confidence            4566666666655432          22349999999999999886  66789999999999988754


No 97 
>cd06168 LSm9 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation.  Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm9 proteins have a single Sm-like domain structure.  Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=45.99  E-value=52  Score=25.38  Aligned_cols=32  Identities=13%  Similarity=0.215  Sum_probs=27.6

Q ss_pred             CeEEEEeCCCcEEEEEEEEEcCCCCeEEEEec
Q 014786          176 SDIRVTFADQSAYDAKIVGFDQDKDVAVLRID  207 (418)
Q Consensus       176 ~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkv~  207 (418)
                      ..+.|.+.||+.+.+.+.++|...+|.+=...
T Consensus        11 ~~v~V~l~dgR~~~G~l~~~D~~~NivL~~~~   42 (75)
T cd06168          11 RTMRIHMTDGRTLVGVFLCTDRDCNIILGSAQ   42 (75)
T ss_pred             CeEEEEEcCCeEEEEEEEEEcCCCcEEecCcE
Confidence            46889999999999999999999998765553


No 98 
>cd01731 archaeal_Sm1 The archaeal sm1 proteins: The Sm proteins are conserved in all three domains of life and are always associated with U-rich RNA sequences. They function to mediate RNA-RNA interactions and RNA biogenesis.  All Sm proteins contain a common sequence motif in two segments, Sm1 and Sm2, separated by a short variable linker. Eukaryotic Sm proteins form part of specific small nuclear ribonucleoproteins (snRNPs) that are involved in the processing of pre-mRNAs to mature mRNAs, and are a major component of the eukaryotic spliceosome. Most snRNPs consist of seven Sm proteins (B/B', D1, D2, D3, E, F and G) arranged in a ring on a uridine-rich sequence (Sm site), plus a small nuclear RNA (snRNA) (either U1, U2, U5 or U4/6). Since archaebacteria do not have any splicing apparatus, Sm proteins of archaebacteria may play a more general role. Archaeal Lsm proteins are likely to represent the ancestral Sm domain.
Probab=45.98  E-value=53  Score=24.53  Aligned_cols=33  Identities=9%  Similarity=0.245  Sum_probs=29.1

Q ss_pred             CeEEEEeCCCcEEEEEEEEEcCCCCeEEEEecC
Q 014786          176 SDIRVTFADQSAYDAKIVGFDQDKDVAVLRIDA  208 (418)
Q Consensus       176 ~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkv~~  208 (418)
                      ..+.|.+.+|+.+.+++.++|...++.+-....
T Consensus        11 ~~V~V~l~~g~~~~G~L~~~D~~mNlvL~~~~e   43 (68)
T cd01731          11 KPVLVKLKGGKEVRGRLKSYDQHMNLVLEDAEE   43 (68)
T ss_pred             CEEEEEECCCCEEEEEEEEECCcceEEEeeEEE
Confidence            568899999999999999999999998877653


No 99 
>PRK00737 small nuclear ribonucleoprotein; Provisional
Probab=45.98  E-value=52  Score=25.00  Aligned_cols=33  Identities=12%  Similarity=0.320  Sum_probs=28.6

Q ss_pred             CeEEEEeCCCcEEEEEEEEEcCCCCeEEEEecC
Q 014786          176 SDIRVTFADQSAYDAKIVGFDQDKDVAVLRIDA  208 (418)
Q Consensus       176 ~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkv~~  208 (418)
                      ..+.|.+.||+.|.+++.++|...++-+=....
T Consensus        15 k~V~V~lk~g~~~~G~L~~~D~~mNlvL~d~~e   47 (72)
T PRK00737         15 SPVLVRLKGGREFRGELQGYDIHMNLVLDNAEE   47 (72)
T ss_pred             CEEEEEECCCCEEEEEEEEEcccceeEEeeEEE
Confidence            468899999999999999999999988877643


No 100
>cd01726 LSm6 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation.  Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm6 is one of at least seven subunits that assemble onto U6 snRNA to form a seven-membered ring structure.  Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=45.97  E-value=48  Score=24.72  Aligned_cols=32  Identities=13%  Similarity=0.233  Sum_probs=27.8

Q ss_pred             CeEEEEeCCCcEEEEEEEEEcCCCCeEEEEec
Q 014786          176 SDIRVTFADQSAYDAKIVGFDQDKDVAVLRID  207 (418)
Q Consensus       176 ~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkv~  207 (418)
                      ..+.|.+.+|+.|.+++.++|...++-+=...
T Consensus        11 ~~V~V~Lk~g~~~~G~L~~~D~~mNlvL~~~~   42 (67)
T cd01726          11 RPVVVKLNSGVDYRGILACLDGYMNIALEQTE   42 (67)
T ss_pred             CeEEEEECCCCEEEEEEEEEccceeeEEeeEE
Confidence            46889999999999999999999888876654


No 101
>cd01722 Sm_F The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation.  Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. Sm subunit F is capable of forming both homo- and hetero-heptamer ring structures.  To form the hetero-heptamer, Sm subunit F initially binds subunits E and G to form a trimer which then assembles onto snRNA along with the D3/B and D1/D2 heterodimers.
Probab=45.65  E-value=44  Score=25.05  Aligned_cols=32  Identities=13%  Similarity=0.217  Sum_probs=27.7

Q ss_pred             CeEEEEeCCCcEEEEEEEEEcCCCCeEEEEec
Q 014786          176 SDIRVTFADQSAYDAKIVGFDQDKDVAVLRID  207 (418)
Q Consensus       176 ~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkv~  207 (418)
                      ..+.|.+.||+.|.+++.++|...++.+=.+.
T Consensus        12 ~~V~V~Lk~g~~~~G~L~~~D~~mNi~L~~~~   43 (68)
T cd01722          12 KPVIVKLKWGMEYKGTLVSVDSYMNLQLANTE   43 (68)
T ss_pred             CEEEEEECCCcEEEEEEEEECCCEEEEEeeEE
Confidence            46889999999999999999999888876554


No 102
>PF00571 CBS:  CBS domain. Mutations in the CBS domain of Swiss:P35520 lead to homocystinuria.;  InterPro: IPR000644 CBS (cystathionine-beta-synthase) domains are small intracellular modules, mostly found in two or four copies within a protein, that occur in a variety of proteins in bacteria, archaea, and eukaryotes. Tandem pairs of CBS domains can act as binding domains for adenosine derivatives and may regulate the activity of attached enzymatic or other domains. In some cases, CBS domains may act as sensors of cellular energy status by being activated by AMP and inhibited by ATP. In chloride ion channels, the CBS domains have been implicated in intracellular targeting and trafficking, as well as in protein-protein interactions, but results vary with different channels: in the CLC-5 channel, the CBS domain was shown to be required for trafficking, while in the CLC-1 channel, the CBS domain was shown to be critical for channel function, but not necessary for trafficking. Recent experiments revealing that CBS domains can bind adenosine-containing ligands such as ATP, AMP, or S-adenosylmethionine have led to the hypothesis that CBS domains function as sensors of intracellular metabolites. Crystallographic studies of CBS domains have shown that pairs of CBS sequences form a globular domain where each CBS unit adopts a beta-alpha-beta-beta-alpha pattern. The crystal structure of the CBS domains of the AMP-activated protein kinase in complexes with AMP and ATP shows that the phosphate groups of AMP/ATP lie in a surface pocket at the interface of two CBS domains, which is lined with basic residues, many of which are associated with disease-causing mutations.
In humans, mutations in conserved residues within CBS domains cause a variety of human hereditary diseases, including (with the gene mutated in parentheses): homocystinuria (cystathionine beta-synthase); Wolff-Parkinson-White syndrome (gamma 2 subunit of AMP-activated protein kinase); retinitis pigmentosa (IMP dehydrogenase-1); congenital myotonia, idiopathic generalized epilepsy, hypercalciuric nephrolithiasis, and classic Bartter syndrome (CLC chloride channel family members).; GO: 0005515 protein binding; PDB: 3JTF_A 3TE5_C 3TDH_C 3T4N_C 2QLV_C 3OI8_A 3LV9_A 2QH1_B 1PVM_B 3LQN_A ....
Probab=41.69  E-value=20  Score=24.97  Aligned_cols=21  Identities=38%  Similarity=0.616  Sum_probs=17.7

Q ss_pred             CCCCCCceeCCCceEEEEEee
Q 014786          276 PGNSGGPLLDSSGSLIGINTA  296 (418)
Q Consensus       276 ~G~SGGPl~n~~G~VVGI~s~  296 (418)
                      .+.+.-|++|.+|+++|+.+.
T Consensus        28 ~~~~~~~V~d~~~~~~G~is~   48 (57)
T PF00571_consen   28 NGISRLPVVDEDGKLVGIISR   48 (57)
T ss_dssp             HTSSEEEEESTTSBEEEEEEH
T ss_pred             cCCcEEEEEecCCEEEEEEEH
Confidence            356778999999999999874


No 103
>cd01717 Sm_B The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation.  Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. Sm subunit B heterodimerizes with subunit D3 and three such heterodimers form a hexameric ring structure with alternating B and D3 subunits.  The D3 - B heterodimer also assembles into a heptameric ring containing D1, D2, E, F, and G subunits.  Sm-like proteins exist in archaea as well as prokaryotes which form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=41.68  E-value=59  Score=25.11  Aligned_cols=32  Identities=19%  Similarity=0.429  Sum_probs=27.5

Q ss_pred             CeEEEEeCCCcEEEEEEEEEcCCCCeEEEEec
Q 014786          176 SDIRVTFADQSAYDAKIVGFDQDKDVAVLRID  207 (418)
Q Consensus       176 ~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkv~  207 (418)
                      ..+.|.+.||+.+.+.+.++|...++.|=...
T Consensus        11 ~~V~V~l~dgR~~~G~L~~~D~~~NlVL~~~~   42 (79)
T cd01717          11 YRLRVTLQDGRQFVGQFLAFDKHMNLVLSDCE   42 (79)
T ss_pred             CEEEEEECCCcEEEEEEEEEcCccCEEcCCEE
Confidence            46889999999999999999999988875553


No 104
>cd01730 LSm3 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation.  Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm3 is one of at least seven subunits that assemble onto U6 snRNA to form a seven-membered ring structure.  Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=41.03  E-value=52  Score=25.67  Aligned_cols=31  Identities=10%  Similarity=0.271  Sum_probs=26.8

Q ss_pred             CeEEEEeCCCcEEEEEEEEEcCCCCeEEEEe
Q 014786          176 SDIRVTFADQSAYDAKIVGFDQDKDVAVLRI  206 (418)
Q Consensus       176 ~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkv  206 (418)
                      ..+.|.+.+|+.+.+++.++|...+|.|=..
T Consensus        12 k~V~V~l~~gr~~~G~L~~fD~~mNlvL~d~   42 (82)
T cd01730          12 ERVYVKLRGDRELRGRLHAYDQHLNMILGDV   42 (82)
T ss_pred             CEEEEEECCCCEEEEEEEEEccceEEeccce
Confidence            5688999999999999999999998876444


No 105
>cd01729 LSm7 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation.  Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm7 is one of at least seven subunits that assemble onto U6 snRNA to form a seven-membered ring structure.  Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=40.91  E-value=66  Score=25.12  Aligned_cols=31  Identities=23%  Similarity=0.334  Sum_probs=26.8

Q ss_pred             CeEEEEeCCCcEEEEEEEEEcCCCCeEEEEe
Q 014786          176 SDIRVTFADQSAYDAKIVGFDQDKDVAVLRI  206 (418)
Q Consensus       176 ~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkv  206 (418)
                      ..+.|.+.||+.+.+++.++|...+|.+=..
T Consensus        13 k~V~V~l~~gr~~~G~L~~~D~~mNlvL~~~   43 (81)
T cd01729          13 KKIRVKFQGGREVTGILKGYDQLLNLVLDDT   43 (81)
T ss_pred             CeEEEEECCCcEEEEEEEEEcCcccEEecCE
Confidence            4688999999999999999999988876544


No 106
>COG5233 GRH1 Peripheral Golgi membrane protein [Intracellular trafficking and secretion]
Probab=39.66  E-value=17  Score=35.83  Aligned_cols=30  Identities=40%  Similarity=0.684  Sum_probs=26.9

Q ss_pred             EEecCCCChhhhcCccccccccCCCccCCcEEEEECCEEeC
Q 014786          354 VLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVS  394 (418)
Q Consensus       354 V~~v~~~~pa~~aGl~~~~~~~~~~l~~GDiIl~ing~~v~  394 (418)
                      +.+|.+.+|++++|.-.           ||.|+-+|+-++.
T Consensus        67 ~lrv~~~~~~e~~~~~~-----------~dyilg~n~Dp~~   96 (417)
T COG5233          67 VLRVNPESPAEKAGMVV-----------GDYILGINEDPLR   96 (417)
T ss_pred             heeccccChhHhhcccc-----------ceeEEeecCCcHH
Confidence            67889999999999998           9999999987764


No 107
>cd01732 LSm5 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation.  Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm5 is one of at least seven subunits that assemble onto U6 snRNA to form a seven-membered ring structure.  Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=38.74  E-value=64  Score=24.90  Aligned_cols=31  Identities=16%  Similarity=0.409  Sum_probs=27.1

Q ss_pred             CeEEEEeCCCcEEEEEEEEEcCCCCeEEEEe
Q 014786          176 SDIRVTFADQSAYDAKIVGFDQDKDVAVLRI  206 (418)
Q Consensus       176 ~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkv  206 (418)
                      ..+.|.+.+|+.+.+++.++|...++.+=..
T Consensus        14 ~~V~V~l~~gr~~~G~L~g~D~~mNlvL~da   44 (76)
T cd01732          14 SRIWIVMKSDKEFVGTLLGFDDYVNMVLEDV   44 (76)
T ss_pred             CEEEEEECCCeEEEEEEEEeccceEEEEccE
Confidence            5788999999999999999999998886554


No 108
>cd01728 LSm1 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation.  Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm1 is one of at least seven subunits that assemble onto U6 snRNA to form a seven-membered ring structure.  Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=38.53  E-value=77  Score=24.33  Aligned_cols=31  Identities=16%  Similarity=0.185  Sum_probs=27.0

Q ss_pred             CeEEEEeCCCcEEEEEEEEEcCCCCeEEEEe
Q 014786          176 SDIRVTFADQSAYDAKIVGFDQDKDVAVLRI  206 (418)
Q Consensus       176 ~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkv  206 (418)
                      ..+.|.+.||+.+.+.+.++|+..++.+=..
T Consensus        13 k~v~V~l~~gr~~~G~L~~fD~~~NlvL~d~   43 (74)
T cd01728          13 KKVVVLLRDGRKLIGILRSFDQFANLVLQDT   43 (74)
T ss_pred             CEEEEEEcCCeEEEEEEEEECCcccEEecce
Confidence            5688999999999999999999988877554


No 109
>PF12381 Peptidase_C3G:  Tungro spherical virus-type peptidase;  InterPro: IPR024387 This entry represents a rice tungro spherical waikavirus-type peptidase that belongs to MEROPS peptidase family C3G. It is a picornain 3C-type protease, and is responsible for the self-cleavage of the positive single-stranded polyproteins of a number of plant viral genomes. The location of the protease activity of the polyprotein is at the C-terminal end, adjacent and N-terminal to the putative RNA polymerase.
Probab=38.36  E-value=23  Score=33.06  Aligned_cols=55  Identities=15%  Similarity=0.412  Sum_probs=38.0

Q ss_pred             ccEEEEccccCCCCCCCceeCC----CceEEEEEeeeeCCCCCCCCccceeec--ccchhhhhhh
Q 014786          265 QDVIQTDAAINPGNSGGPLLDS----SGSLIGINTAIYSPSGASSGVGFSIPV--DTVNGIVDQL  323 (418)
Q Consensus       265 ~~~i~~~~~i~~G~SGGPl~n~----~G~VVGI~s~~~~~~~~~~~~~~aIp~--~~i~~~l~~l  323 (418)
                      ...+++..+...|+=|||++-.    .-+++||+.++..    ..+.+||-++  +.+++.++.|
T Consensus       168 r~gleY~~~t~~GdCGs~i~~~~t~~~RKIvGiHVAG~~----~~~~gYAe~itQEDL~~A~~~l  228 (231)
T PF12381_consen  168 RQGLEYQMPTMNGDCGSPIVRNNTQMVRKIVGIHVAGSA----NHAMGYAESITQEDLMRAINKL  228 (231)
T ss_pred             eeeeeEECCCcCCCccceeeEcchhhhhhhheeeecccc----cccceehhhhhHHHHHHHHHhh
Confidence            4566778888999999999843    3479999998753    3467788655  4444444443


No 110
>PF09122 DUF1930:  Domain of unknown function (DUF1930);  InterPro: IPR015206 This entry represents a domain found in 3-mercaptopyruvate sulphurtransferase which has no known function. This domain adopts a structure consisting of a four-stranded antiparallel beta-sheet and an alpha-helix, arranged in a beta(2)-alpha-beta(2) fashion, and bearing a remarkable structural similarity to the FK506-binding protein class of peptidylprolyl cis/trans-isomerase.; PDB: 1OKG_A.
Probab=38.35  E-value=78  Score=23.58  Aligned_cols=35  Identities=23%  Similarity=0.374  Sum_probs=24.2

Q ss_pred             CcEEEEECCEEeCCHH-HHHHHHhcCCCCCEEEEEE
Q 014786          382 GDIITSVNGKKVSNGS-DLYRILDQCKVGDEVSCFT  416 (418)
Q Consensus       382 GDiIl~ing~~v~~~~-dl~~~l~~~~~g~~v~l~v  416 (418)
                      .-.-+.+||..+.+++ +|..++.....|+..++.+
T Consensus        19 ~~~tl~vDg~~v~~PD~El~sA~~HlH~GEkA~V~F   54 (68)
T PF09122_consen   19 DNATLIVDGEIVENPDAELKSALVHLHIGEKAQVFF   54 (68)
T ss_dssp             TT--EEETTEEESS--HHHHHHHTT-BTT-EEEEEE
T ss_pred             cceEEEEcCeEcCCCCHHHHHHHHHhhcCceeEEEE
Confidence            5677889999999996 7888888778899887753


No 111
>cd01719 Sm_G The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation.  Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet.  Sm subunit G binds subunits E and F to form a trimer which then assembles onto snRNA along with the D1/D2 and D3/B heterodimers forming a seven-membered ring structure.  Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=37.97  E-value=82  Score=23.96  Aligned_cols=32  Identities=9%  Similarity=0.165  Sum_probs=27.2

Q ss_pred             CeEEEEeCCCcEEEEEEEEEcCCCCeEEEEec
Q 014786          176 SDIRVTFADQSAYDAKIVGFDQDKDVAVLRID  207 (418)
Q Consensus       176 ~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkv~  207 (418)
                      ..+.|.+.+|+.+.+++.++|...+|.+=...
T Consensus        11 k~V~V~L~~g~~~~G~L~~~D~~mNlvL~~~~   42 (72)
T cd01719          11 KKLSLKLNGNRKVSGILRGFDPFMNLVLDDAV   42 (72)
T ss_pred             CeEEEEECCCeEEEEEEEEEcccccEEeccEE
Confidence            46788999999999999999998888775553


No 112
>smart00651 Sm snRNP Sm proteins. small nuclear ribonucleoprotein particles (snRNPs) involved in pre-mRNA splicing
Probab=36.94  E-value=87  Score=22.89  Aligned_cols=32  Identities=19%  Similarity=0.413  Sum_probs=27.5

Q ss_pred             CeEEEEeCCCcEEEEEEEEEcCCCCeEEEEec
Q 014786          176 SDIRVTFADQSAYDAKIVGFDQDKDVAVLRID  207 (418)
Q Consensus       176 ~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkv~  207 (418)
                      ..+.|.+.||+.+.+.+..+|...++-+=...
T Consensus         9 ~~V~V~l~~g~~~~G~L~~~D~~~NlvL~~~~   40 (67)
T smart00651        9 KRVLVELKNGREYRGTLKGFDQFMNLVLEDVE   40 (67)
T ss_pred             cEEEEEECCCcEEEEEEEEECccccEEEccEE
Confidence            46889999999999999999999888876554


No 113
>cd01721 Sm_D3 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation.  Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. Sm subunit D3 heterodimerizes with subunit B and three such heterodimers form a hexameric ring structure with alternating B and D3 subunits. The D3 - B heterodimer also assembles into a heptameric ring containing D1, D2, E, F, and G subunits. Sm-like proteins exist in archaea as well as prokaryotes which form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=35.63  E-value=92  Score=23.47  Aligned_cols=32  Identities=9%  Similarity=0.272  Sum_probs=28.7

Q ss_pred             CeEEEEeCCCcEEEEEEEEEcCCCCeEEEEec
Q 014786          176 SDIRVTFADQSAYDAKIVGFDQDKDVAVLRID  207 (418)
Q Consensus       176 ~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkv~  207 (418)
                      ..+.|.+.+|..|.+++..+|...++.+-.+.
T Consensus        11 ~~V~VeLk~g~~~~G~L~~~D~~MNl~L~~~~   42 (70)
T cd01721          11 HIVTVELKTGEVYRGKLIEAEDNMNCQLKDVT   42 (70)
T ss_pred             CEEEEEECCCcEEEEEEEEEcCCceeEEEEEE
Confidence            46889999999999999999999999888774


No 114
>PF01423 LSM:  LSM domain ;  InterPro: IPR001163 This family is found in Lsm (like-Sm) proteins and in bacterial Lsm-related Hfq proteins. In each case, the domain adopts a core structure consisting of an open beta-barrel with an SH3-like topology. Lsm (like-Sm) proteins have diverse functions, and are thought to be important modulators of RNA biogenesis and function. The Sm proteins form part of specific small nuclear ribonucleoproteins (snRNPs) that are involved in the processing of pre-mRNAs to mature mRNAs, and are a major component of the eukaryotic spliceosome. Most snRNPs consist of seven Sm proteins (B/B', D1, D2, D3, E, F and G) arranged in a ring on a uridine-rich sequence (Sm site), plus a small nuclear RNA (snRNA) (either U1, U2, U5 or U4/6). All Sm proteins contain a common sequence motif in two segments, Sm1 and Sm2, separated by a short variable linker. In other snRNPs, certain Sm proteins are replaced with different Lsm proteins, such as with U7 snRNPs, in which the D1 and D2 Sm proteins are replaced with U7-specific Lsm10 and Lsm11 proteins, where Lsm11 plays a role in histone U7-specific RNA processing. Lsm proteins are also found in archaebacteria, which do not have any splicing apparatus, suggesting a more general role for Lsm proteins. The pleiotropic translational regulator Hfq (host factor Q) is a bacterial Lsm-like protein, which modulates the structure of numerous RNA molecules by binding preferentially to A/U-rich sequences in RNA. Hfq adopts an Lsm-like fold; however, unlike the heptameric Sm proteins, Hfq forms a homo-hexameric ring.; PDB: 1D3B_K 2Y9D_D 2Y9A_D 2Y9C_R 3VRI_C 2Y9B_K 3QUI_D 3M4G_H 3INZ_E 1U1S_C ....
Probab=35.46  E-value=62  Score=23.74  Aligned_cols=33  Identities=21%  Similarity=0.444  Sum_probs=29.1

Q ss_pred             CeEEEEeCCCcEEEEEEEEEcCCCCeEEEEecC
Q 014786          176 SDIRVTFADQSAYDAKIVGFDQDKDVAVLRIDA  208 (418)
Q Consensus       176 ~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkv~~  208 (418)
                      ..+.|.+.||+.+.+.+..+|...++-+-....
T Consensus         9 ~~V~V~l~~g~~~~G~L~~~D~~~Nl~L~~~~~   41 (67)
T PF01423_consen    9 KRVRVELKNGRTYRGTLVSFDQFMNLVLSDVTE   41 (67)
T ss_dssp             SEEEEEETTSEEEEEEEEEEETTEEEEEEEEEE
T ss_pred             cEEEEEEeCCEEEEEEEEEeechheEEeeeEEE
Confidence            568899999999999999999998888877754


No 115
>cd01727 LSm8 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation.  Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm8 is one of at least seven subunits that assemble onto U6 snRNA to form a seven-membered ring structure.  Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=34.19  E-value=93  Score=23.67  Aligned_cols=32  Identities=19%  Similarity=0.267  Sum_probs=27.4

Q ss_pred             CeEEEEeCCCcEEEEEEEEEcCCCCeEEEEec
Q 014786          176 SDIRVTFADQSAYDAKIVGFDQDKDVAVLRID  207 (418)
Q Consensus       176 ~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkv~  207 (418)
                      ..+.|.+.||+.+.+++.++|...++.+=...
T Consensus        10 ~~V~V~l~dgr~~~G~L~~~D~~~NlvL~~~~   41 (74)
T cd01727          10 KTVSVITVDGRVIVGTLKGFDQATNLILDDSH   41 (74)
T ss_pred             CEEEEEECCCcEEEEEEEEEccccCEEccceE
Confidence            46788999999999999999999888776653


No 116
>COG0298 HypC Hydrogenase maturation factor [Posttranslational modification, protein turnover, chaperones]
Probab=32.60  E-value=67  Score=25.19  Aligned_cols=47  Identities=19%  Similarity=0.376  Sum_probs=31.0

Q ss_pred             EEEEEEEEcCCCCeEEEEecCCCCCCcccccCCCCCCCCCCEEEE-EeCC
Q 014786          188 YDAKIVGFDQDKDVAVLRIDAPKDKLRPIPIGVSADLLVGQKVYA-IGNP  236 (418)
Q Consensus       188 ~~a~vv~~d~~~DlAlLkv~~~~~~~~~~~l~~~~~~~~G~~V~~-vG~p  236 (418)
                      ++++++..|...++|++.+-.-.... .+.+-. .+++.|+.|.+ +||.
T Consensus         5 iPgqI~~I~~~~~~A~Vd~gGvkreV-~l~Lv~-~~v~~GdyVLVHvGfA   52 (82)
T COG0298           5 IPGQIVEIDDNNHLAIVDVGGVKREV-NLDLVG-EEVKVGDYVLVHVGFA   52 (82)
T ss_pred             cccEEEEEeCCCceEEEEeccEeEEE-Eeeeec-CccccCCEEEEEeeEE
Confidence            46788889988789999986532111 122211 26889999876 6764


No 117
>COG1958 LSM1 Small nuclear ribonucleoprotein (snRNP) homolog [Transcription]
Probab=31.62  E-value=93  Score=23.92  Aligned_cols=33  Identities=21%  Similarity=0.459  Sum_probs=28.6

Q ss_pred             CeEEEEeCCCcEEEEEEEEEcCCCCeEEEEecC
Q 014786          176 SDIRVTFADQSAYDAKIVGFDQDKDVAVLRIDA  208 (418)
Q Consensus       176 ~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkv~~  208 (418)
                      ..+.|.+.+|+.|.+++.++|...++.+--+..
T Consensus        18 ~~V~V~lk~g~~~~G~L~~~D~~mNlvL~d~~e   50 (79)
T COG1958          18 KRVLVKLKNGREYRGTLVGFDQYMNLVLDDVEE   50 (79)
T ss_pred             CEEEEEECCCCEEEEEEEEEccceeEEEeceEE
Confidence            678999999999999999999999888776544


No 118
>PF02601 Exonuc_VII_L:  Exonuclease VII, large subunit;  InterPro: IPR020579 Exonuclease VII 3.1.11.6 from EC is composed of two nonidentical subunits; one large subunit and 4 small ones []. Exonuclease VII catalyses exonucleolytic cleavage in either 5'-3' or 3'-5' direction to yield 5'-phosphomononucleotides. The large subunit also contains the OB-fold domains (IPR004365 from INTERPRO) that bind to nucleic acids at the N terminus.  This entry represents Exonuclease VII, large subunit, C-terminal. ; GO: 0008855 exodeoxyribonuclease VII activity
Probab=30.30  E-value=60  Score=31.96  Aligned_cols=35  Identities=29%  Similarity=0.499  Sum_probs=31.2

Q ss_pred             eEEEEEEEcCCcEEEecccccCCCCeEEEEeCCCc
Q 014786          152 GSGSGFVWDSKGHVVTNYHVIRGASDIRVTFADQS  186 (418)
Q Consensus       152 ~~GSGfiI~~~G~ILT~aHvv~~~~~i~V~~~dg~  186 (418)
                      ..|-.++.+++|.++|+..-+...+.+++.+.||.
T Consensus       280 ~RGYaiv~~~~g~vI~s~~~l~~gd~i~i~l~DG~  314 (319)
T PF02601_consen  280 KRGYAIVRDKDGKVITSVKQLKPGDEIEIRLADGS  314 (319)
T ss_pred             hCceEEEECCCCCEECCHHHCCCCCEEEEEEcceE
Confidence            45778888888999999999999999999999995


No 119
>PF14827 Cache_3:  Sensory domain of two-component sensor kinase; PDB: 1OJG_A 3BY8_A 1P0Z_I 2V9A_A 2J80_B.
Probab=28.56  E-value=49  Score=27.30  Aligned_cols=18  Identities=33%  Similarity=0.617  Sum_probs=13.2

Q ss_pred             CceeCCCceEEEEEeeee
Q 014786          281 GPLLDSSGSLIGINTAIY  298 (418)
Q Consensus       281 GPl~n~~G~VVGI~s~~~  298 (418)
                      .|++|.+|++||++..++
T Consensus        94 ~PV~d~~g~viG~V~VG~  111 (116)
T PF14827_consen   94 APVYDSDGKVIGVVSVGV  111 (116)
T ss_dssp             EEEE-TTS-EEEEEEEEE
T ss_pred             EeeECCCCcEEEEEEEEE
Confidence            578889999999998653


No 120
>PF05578 Peptidase_S31:  Pestivirus NS3 polyprotein peptidase S31;  InterPro: IPR000280 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This group of serine peptidases belongs to MEROPS peptidase family S31 (clan PA(S)). The type example is pestivirus NS3 polyprotein peptidase from bovine viral diarrhea virus, which is Type 1 pestivirus. The pestiviruses are single-stranded RNA viruses whose genomes encode one large polyprotein []. The p80 endopeptidase resides towards the middle of the polyprotein and is responsible for processing all non-structural pestivirus proteins [, ]. The p80 enzyme is similar to other proteases in the PA(S) clan and is predicted to have a fold similar to that of chymotrypsin [, ]. An HDS catalytic triad has been identified [].; GO: 0004252 serine-type endopeptidase activity, 0006508 proteolysis
Probab=26.50  E-value=1.5e+02  Score=26.25  Aligned_cols=131  Identities=18%  Similarity=0.250  Sum_probs=63.0

Q ss_pred             eEEEEEEEcCCcEEEecccccCCCCeEEEEeCCCcEEEEEEEEEcCCCCeEEEEecCCCCCCcccccCCCCCCCCCCEEE
Q 014786          152 GSGSGFVWDSKGHVVTNYHVIRGASDIRVTFADQSAYDAKIVGFDQDKDVAVLRIDAPKDKLRPIPIGVSADLLVGQKVY  231 (418)
Q Consensus       152 ~~GSGfiI~~~G~ILT~aHvv~~~~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkv~~~~~~~~~~~l~~~~~~~~G~~V~  231 (418)
                      +.-+|+-+..+|-|-.--||-.+.+.+ |.-.=|+.   +++..+...      +... ..+-   +....-...|...|
T Consensus        51 gletgwaythqggissvdhvt~gkd~l-vcdsmgrt---rvvcqsnnk------~tde-~eyg---vktdsgcp~garcy  116 (211)
T PF05578_consen   51 GLETGWAYTHQGGISSVDHVTAGKDLL-VCDSMGRT---RVVCQSNNK------MTDE-TEYG---VKTDSGCPDGARCY  116 (211)
T ss_pred             cccccceeeccCCcccceeeecCCceE-EecCCCce---EEEEccCCc------ccch-hhcc---cccCCCCCCCcEEE
Confidence            345677777677777777776664432 22222221   222222110      0000 0010   11112245578888


Q ss_pred             EEeCCCCCCCceeEeEEeeeeeeecccCCCCCcccEEEEccccCCCCCCCceeCC-CceEEEEEeee
Q 014786          232 AIGNPFGLDHTLTTGVISGLRREISSAATGRPIQDVIQTDAAINPGNSGGPLLDS-SGSLIGINTAI  297 (418)
Q Consensus       232 ~vG~p~g~~~~~~~G~Vs~~~~~~~~~~~~~~~~~~i~~~~~i~~G~SGGPl~n~-~G~VVGI~s~~  297 (418)
                      ++ +|...+.+-+.|.+-.+...-.+...-.....--.+|..-..|.||=|+|.. .|++||=.-.+
T Consensus       117 v~-npea~nisgtkga~vhlqk~ggef~cvta~gtpaf~~~knlkg~s~~pifeassgr~vgr~k~g  182 (211)
T PF05578_consen  117 VL-NPEATNISGTKGAMVHLQKTGGEFTCVTASGTPAFFDLKNLKGWSGLPIFEASSGRVVGRVKVG  182 (211)
T ss_pred             Ee-CCcccccccCcceEEEEeccCCceEEEeccCCcceeeccccCCCCCCceeeccCCcEEEEEEec
Confidence            87 6655555555665544333211000000000001223334569999999985 89999987654


No 121
>COG0061 nadF NAD kinase [Coenzyme metabolism]
Probab=26.20  E-value=27  Score=34.04  Aligned_cols=32  Identities=28%  Similarity=0.525  Sum_probs=29.4

Q ss_pred             cccccccceeeeecCCCCccCCCccccccCCc
Q 014786            2 AYSLISSSTFLLSRSPNTTLAPLNKHNFPLRP   33 (418)
Q Consensus         2 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~   33 (418)
                      ||||+---|.+.-+.+.+.+.|+++|++++||
T Consensus       179 AY~lSAGGPIv~P~l~ai~ltpi~p~~l~~Rp  210 (281)
T COG0061         179 AYNLSAGGPILHPGLDAIQLTPICPHSLSFRP  210 (281)
T ss_pred             HHhhhcCCCccCCCCCeEEEeecCCCcccCCC
Confidence            79999999999999999999999999998764


No 122
>PF09465 LBR_tudor:  Lamin-B receptor of TUDOR domain;  InterPro: IPR019023  The Lamin-B receptor is a chromatin and lamin binding protein in the inner nuclear membrane. It is one of the integral inner nuclear envelope membrane proteins responsible for targeting nuclear membranes to chromatin, being a downstream effector of Ran, a small Ras-like nuclear GTPase which regulates NE assembly. Lamin-B receptor interacts with importin beta, a Ran-binding protein, thereby directly contributing to the fusion of membrane vesicles and the formation of the nuclear envelope []. ; PDB: 2L8D_A 2DIG_A.
Probab=25.68  E-value=2.7e+02  Score=20.27  Aligned_cols=35  Identities=17%  Similarity=0.335  Sum_probs=27.4

Q ss_pred             CCCeEEEEeCCCcE-EEEEEEEEcCCCCeEEEEecC
Q 014786          174 GASDIRVTFADQSA-YDAKIVGFDQDKDVAVLRIDA  208 (418)
Q Consensus       174 ~~~~i~V~~~dg~~-~~a~vv~~d~~~DlAlLkv~~  208 (418)
                      ..+.+.++.++... |++++..+|...++.-++.+.
T Consensus         8 ~Ge~V~~rWP~s~lYYe~kV~~~d~~~~~y~V~Y~D   43 (55)
T PF09465_consen    8 IGEVVMVRWPGSSLYYEGKVLSYDSKSDRYTVLYED   43 (55)
T ss_dssp             SS-EEEEE-TTTS-EEEEEEEEEETTTTEEEEEETT
T ss_pred             CCCEEEEECCCCCcEEEEEEEEecccCceEEEEEcC
Confidence            44568888888776 599999999999999999976


No 123
>cd01723 LSm4 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation.  Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm4 is one of at least seven subunits that assemble onto U6 snRNA to form a seven-membered ring structure.  Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=25.62  E-value=1.8e+02  Score=22.21  Aligned_cols=32  Identities=13%  Similarity=0.270  Sum_probs=28.4

Q ss_pred             CeEEEEeCCCcEEEEEEEEEcCCCCeEEEEec
Q 014786          176 SDIRVTFADQSAYDAKIVGFDQDKDVAVLRID  207 (418)
Q Consensus       176 ~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkv~  207 (418)
                      ..+.|.+.+|+.+.+++..+|...++.+-.+.
T Consensus        12 ~~V~VeLkng~~~~G~L~~~D~~mNi~L~~~~   43 (76)
T cd01723          12 HPMLVELKNGETYNGHLVNCDNWMNIHLREVI   43 (76)
T ss_pred             CEEEEEECCCCEEEEEEEEEcCCCceEEEeEE
Confidence            56889999999999999999999999887664


No 124
>PF01455 HupF_HypC:  HupF/HypC family;  InterPro: IPR001109 The large subunit of [NiFe]-hydrogenase, as well as other nickel metalloenzymes, is synthesised as a precursor devoid of the metalloenzyme active site. This precursor then undergoes a complex post-translational maturation process that requires a number of accessory proteins. The hydrogenase expression/formation proteins (HupF/HypC) form a family of small proteins that are hydrogenase precursor-specific chaperones required for this maturation process []. They are believed to keep the hydrogenase precursor in a conformation accessible for metal incorporation [, ].; PDB: 3D3R_A 2Z1C_C 2OT2_A.
Probab=24.10  E-value=2.1e+02  Score=21.60  Aligned_cols=43  Identities=23%  Similarity=0.395  Sum_probs=29.3

Q ss_pred             EEEEEEEEcCCCCeEEEEecCCCCCCcccccCCCCCCCCCCEEEEE
Q 014786          188 YDAKIVGFDQDKDVAVLRIDAPKDKLRPIPIGVSADLLVGQKVYAI  233 (418)
Q Consensus       188 ~~a~vv~~d~~~DlAlLkv~~~~~~~~~~~l~~~~~~~~G~~V~~v  233 (418)
                      ++++++..+.....|++.....   ...+.+.--.++++||+|.+-
T Consensus         5 iP~~Vv~v~~~~~~A~v~~~G~---~~~V~~~lv~~v~~Gd~VLVH   47 (68)
T PF01455_consen    5 IPGRVVEVDEDGGMAVVDFGGV---RREVSLALVPDVKVGDYVLVH   47 (68)
T ss_dssp             EEEEEEEEETTTTEEEEEETTE---EEEEEGTTCTSB-TT-EEEEE
T ss_pred             ccEEEEEEeCCCCEEEEEcCCc---EEEEEEEEeCCCCCCCEEEEe
Confidence            5788888888889999988752   334444334458999998874


No 125
>PF02743 Cache_1:  Cache domain;  InterPro: IPR004010 Cache is an extracellular domain that is predicted to have a role in small-molecule recognition in a wide range of proteins, including the animal dihydropyridine-sensitive voltage-gated Ca2+ channel; alpha-2delta subunit, and various bacterial chemotaxis receptors. The name Cache comes from CAlcium channels and CHEmotaxis receptors. This domain consists of an N-terminal part with three predicted strands and an alpha-helix, and a C-terminal part with a strand dyad followed by a relatively unstructured region. The N-terminal portion of the (unpermuted) Cache domain contains three predicted strands that could form a sheet analogous to that present in the core of the PAS domain structure. Cache domains are particularly widespread in bacteria, with Vibrio cholerae. The animal calcium channel alpha-2delta subunits might have acquired a part of their extracellular domains from a bacterial source []. The Cache domain appears to have arisen from the GAF-PAS fold despite their divergent functions [].; GO: 0016020 membrane; PDB: 3C8C_A 3LIB_D 3LIA_A 3LI8_A 3LI9_A.
Probab=23.32  E-value=58  Score=24.68  Aligned_cols=30  Identities=27%  Similarity=0.626  Sum_probs=21.6

Q ss_pred             CceeCCCceEEEEEeeeeCCCCCCCCccceeecccchhhhhhh
Q 014786          281 GPLLDSSGSLIGINTAIYSPSGASSGVGFSIPVDTVNGIVDQL  323 (418)
Q Consensus       281 GPl~n~~G~VVGI~s~~~~~~~~~~~~~~aIp~~~i~~~l~~l  323 (418)
                      -|+++.+|+++|+...             .+..+.+.++++++
T Consensus        19 ~pi~~~~g~~~Gvv~~-------------di~l~~l~~~i~~~   48 (81)
T PF02743_consen   19 VPIYDDDGKIIGVVGI-------------DISLDQLSEIISNI   48 (81)
T ss_dssp             EEEEETTTEEEEEEEE-------------EEEHHHHHHHHTTS
T ss_pred             EEEECCCCCEEEEEEE-------------EeccceeeeEEEee
Confidence            4678789999999754             36666777766664


No 126
>PF01732 DUF31:  Putative peptidase (DUF31);  InterPro: IPR022382  This domain has no known function. It is found in various hypothetical proteins and putative lipoproteins from mycoplasmas. 
Probab=23.21  E-value=51  Score=33.37  Aligned_cols=23  Identities=35%  Similarity=0.525  Sum_probs=18.7

Q ss_pred             CeEEEEEEEcC----Cc------EEEecccccC
Q 014786          151 QGSGSGFVWDS----KG------HVVTNYHVIR  173 (418)
Q Consensus       151 ~~~GSGfiI~~----~G------~ILT~aHvv~  173 (418)
                      ...|||.|+|-    ++      |+.||.||+.
T Consensus        35 ~~~GT~WIlDy~~~~~~~~p~k~y~ATNlHVa~   67 (374)
T PF01732_consen   35 SVSGTGWILDYKKPEDNKYPTKWYFATNLHVAS   67 (374)
T ss_pred             cCcceEEEEEEeccCCCCCCeEEEEEechhhhc
Confidence            46899999982    22      6999999998


No 127
>PF14438 SM-ATX:  Ataxin 2 SM domain; PDB: 1M5Q_1.
Probab=23.19  E-value=1.9e+02  Score=21.94  Aligned_cols=28  Identities=21%  Similarity=0.313  Sum_probs=20.5

Q ss_pred             CeEEEEeCCCcEEEEEEEEEcC---CCCeEE
Q 014786          176 SDIRVTFADQSAYDAKIVGFDQ---DKDVAV  203 (418)
Q Consensus       176 ~~i~V~~~dg~~~~a~vv~~d~---~~DlAl  203 (418)
                      ..++|++.||..|++-+...++   +.|+.|
T Consensus        13 ~~V~V~~~~G~~yeGif~s~s~~~~~~~vvL   43 (77)
T PF14438_consen   13 QTVEVTTKNGSVYEGIFHSASPESNEFDVVL   43 (77)
T ss_dssp             SEEEEEETTS-EEEEEEEEE-T---T--EEE
T ss_pred             CEEEEEECCCCEEEEEEEeCCCcccceeEEE
Confidence            5689999999999999999988   556655


No 128
>cd04627 CBS_pair_14 The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria.  The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair.  The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here.  It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic gener
Probab=22.43  E-value=61  Score=26.21  Aligned_cols=21  Identities=33%  Similarity=0.482  Sum_probs=16.8

Q ss_pred             CCCCCceeCCCceEEEEEeee
Q 014786          277 GNSGGPLLDSSGSLIGINTAI  297 (418)
Q Consensus       277 G~SGGPl~n~~G~VVGI~s~~  297 (418)
                      +.+.=|++|.+|+++|+++..
T Consensus        98 ~~~~lpVvd~~~~~vGiit~~  118 (123)
T cd04627          98 GISSVAVVDNQGNLIGNISVT  118 (123)
T ss_pred             CCceEEEECCCCcEEEEEeHH
Confidence            445578999999999998853


No 129
>cd01725 LSm2 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation.  Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm2 is one of at least seven subunits that assemble onto U6 snRNA to form a seven-membered ring structure.  Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=20.67  E-value=2.4e+02  Score=21.90  Aligned_cols=32  Identities=13%  Similarity=0.276  Sum_probs=28.3

Q ss_pred             CeEEEEeCCCcEEEEEEEEEcCCCCeEEEEec
Q 014786          176 SDIRVTFADQSAYDAKIVGFDQDKDVAVLRID  207 (418)
Q Consensus       176 ~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkv~  207 (418)
                      ..+.|.+.+|..+.+++..+|...++-+-.++
T Consensus        12 ~~V~VeLKng~~~~G~L~~vD~~MNi~L~n~~   43 (81)
T cd01725          12 KEVTVELKNDLSIRGTLHSVDQYLNIKLTNIS   43 (81)
T ss_pred             CEEEEEECCCcEEEEEEEEECCCcccEEEEEE
Confidence            46889999999999999999999998887764


No 130
>cd04603 CBS_pair_KefB_assoc This cd contains two tandem repeats of the cystathionine beta-synthase (CBS pair) domains associated with the KefB (Kef-type K+ transport systems) domain which is involved in inorganic ion transport and metabolism. CBS is a small domain originally identified in cystathionine beta-synthase and subsequently found in a wide range of different proteins. CBS domains usually come in tandem repeats, which associate to form a so-called Bateman domain or a CBS pair which is reflected in this model. The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains.  It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown.
Probab=20.53  E-value=74  Score=25.26  Aligned_cols=20  Identities=20%  Similarity=0.270  Sum_probs=15.8

Q ss_pred             CCCCCceeCCCceEEEEEee
Q 014786          277 GNSGGPLLDSSGSLIGINTA  296 (418)
Q Consensus       277 G~SGGPl~n~~G~VVGI~s~  296 (418)
                      +.+--|++|.+|+++|+++.
T Consensus        86 ~~~~lpVvd~~~~~~Giit~  105 (111)
T cd04603          86 EPPVVAVVDKEGKLVGTIYE  105 (111)
T ss_pred             CCCeEEEEcCCCeEEEEEEh
Confidence            44446899988999999874


Done!