Query         013804
Match_columns 436
No_of_seqs    445 out of 3214
Neff          7.9 
Searched_HMMs 46136
Date          Fri Mar 29 07:33:56 2013
Command       hhsearch -i /work/01045/syshi/csienesis_hhblits_a3m/013804.a3m -d /work/01045/syshi/HHdatabase/Cdd.hhm -o /work/01045/syshi/hhsearch_cdd/013804hhsearch_cdd -cpu 12 -v 0 

 No Hit                             Prob E-value P-value  Score    SS Cols Query HMM  Template HMM
  1 PRK10139 serine endoprotease;  100.0   3E-52 6.5E-57  429.3  39.4  301  117-432    41-361 (455)
  2 TIGR02038 protease_degS peripl 100.0 5.4E-51 1.2E-55  408.4  39.4  303  112-433    41-350 (351)
  3 PRK10898 serine endoprotease;  100.0 6.1E-51 1.3E-55  407.7  39.4  303  112-433    41-351 (353)
  4 PRK10942 serine endoprotease;  100.0 9.8E-50 2.1E-54  412.4  37.3  300  117-431    39-381 (473)
  5 TIGR02037 degP_htrA_DO peripla 100.0 1.6E-48 3.5E-53  401.4  35.9  301  118-433     3-329 (428)
  6 COG0265 DegQ Trypsin-like seri 100.0 6.3E-39 1.4E-43  321.9  31.5  302  116-431    33-340 (347)
  7 KOG1320 Serine protease [Postt 100.0 2.9E-27 6.2E-32  238.3  22.0  305  117-432   129-469 (473)
  8 KOG1421 Predicted signaling-as  99.9 1.1E-22 2.5E-27  206.9  22.5  295  117-433    53-373 (955)
  9 PF13365 Trypsin_2:  Trypsin-li  99.7 4.9E-16 1.1E-20  130.9  12.6  109  154-293     1-120 (120)
 10 PF13180 PDZ_2:  PDZ domain; PD  99.6 7.5E-15 1.6E-19  116.3  10.7   82  332-429     1-82  (82)
 11 KOG1421 Predicted signaling-as  99.6 6.1E-14 1.3E-18  143.8  16.0  296  123-432   525-832 (955)
 12 PF00089 Trypsin:  Trypsin;  In  99.5 5.3E-13 1.1E-17  123.9  19.3  168  151-320    24-220 (220)
 13 KOG1320 Serine protease [Postt  99.4 4.7E-13   1E-17  135.8  10.2  275  123-420    57-351 (473)
 14 cd00190 Tryp_SPc Trypsin-like   99.4 5.1E-12 1.1E-16  118.1  15.9  170  151-322    24-231 (232)
 15 smart00020 Tryp_SPc Trypsin-li  99.3 2.7E-11 5.9E-16  113.4  15.8  167  151-319    25-228 (229)
 16 cd00991 PDZ_archaeal_metallopr  99.3 2.4E-11 5.1E-16   95.5   9.9   68  350-428    10-77  (79)
 17 cd00986 PDZ_LON_protease PDZ d  99.3 4.4E-11 9.5E-16   93.9  10.4   72  350-433     8-79  (79)
 18 cd00987 PDZ_serine_protease PD  99.2 4.8E-11   1E-15   95.6   9.9   84  332-426     1-89  (90)
 19 TIGR01713 typeII_sec_gspC gene  99.2 3.1E-11 6.8E-16  115.8   9.6  101  314-429   159-259 (259)
 20 cd00990 PDZ_glycyl_aminopeptid  99.2 5.9E-11 1.3E-15   93.1   9.4   78  332-430     1-78  (80)
 21 cd00989 PDZ_metalloprotease PD  99.2 2.3E-10 5.1E-15   89.4   9.6   77  333-428     2-78  (79)
 22 cd00988 PDZ_CTP_protease PDZ d  99.1 5.6E-10 1.2E-14   88.6   9.7   79  332-429     2-83  (85)
 23 COG3591 V8-like Glu-specific e  99.0 3.6E-09 7.9E-14   99.7  13.4  159  152-324    64-250 (251)
 24 cd00136 PDZ PDZ domain, also c  98.9 8.2E-09 1.8E-13   78.6   7.1   67  333-417     2-70  (70)
 25 TIGR02037 degP_htrA_DO peripla  98.8 1.3E-08 2.8E-13  105.3   9.9   85  331-426   337-427 (428)
 26 PRK10779 zinc metallopeptidase  98.8   1E-08 2.2E-13  106.5   7.7   68  352-430   128-195 (449)
 27 TIGR00054 RIP metalloprotease   98.7 5.1E-08 1.1E-12  100.5   9.3   69  350-430   203-271 (420)
 28 PRK10779 zinc metallopeptidase  98.7 9.9E-08 2.1E-12   99.2  10.0   68  351-430   222-289 (449)
 29 TIGR00225 prc C-terminal pepti  98.6 1.1E-07 2.4E-12   95.1   8.8   80  332-430    51-132 (334)
 30 TIGR02860 spore_IV_B stage IV   98.6 2.5E-07 5.4E-12   93.2  10.5   77  333-430    97-181 (402)
 31 KOG3627 Trypsin [Amino acid tr  98.6 1.8E-06   4E-11   82.6  16.0  170  152-322    38-252 (256)
 32 smart00228 PDZ Domain present   98.6 1.7E-07 3.8E-12   73.7   7.2   74  332-420    12-85  (85)
 33 PLN00049 carboxyl-terminal pro  98.5 6.4E-07 1.4E-11   91.4  10.2   84  332-428    85-170 (389)
 34 PRK10139 serine endoprotease;   98.5 4.5E-07 9.8E-12   94.2   9.1   65  350-427   390-454 (455)
 35 PF00863 Peptidase_C4:  Peptida  98.5 2.3E-05   5E-10   73.6  19.3  163  123-313    14-184 (235)
 36 cd00992 PDZ_signaling PDZ doma  98.4   9E-07 1.9E-11   69.3   7.6   69  331-416    11-81  (82)
 37 PF00595 PDZ:  PDZ domain (Also  98.4 4.8E-07 1.1E-11   71.1   5.3   71  331-417     9-81  (81)
 38 COG3480 SdrC Predicted secrete  98.4 1.5E-06 3.3E-11   83.4   9.0   71  350-432   130-201 (342)
 39 PRK10942 serine endoprotease;   98.4 1.1E-06 2.4E-11   91.7   8.9   65  350-427   408-472 (473)
 40 TIGR03279 cyano_FeS_chp putati  98.3 9.5E-07 2.1E-11   89.6   7.5   63  353-430     1-64  (433)
 41 PF14685 Tricorn_PDZ:  Tricorn   98.3 4.4E-06 9.6E-11   66.5   9.0   79  332-427     1-88  (88)
 42 COG0793 Prc Periplasmic protea  98.2 3.4E-06 7.4E-11   86.3   8.8   80  330-427    98-181 (406)
 43 TIGR00054 RIP metalloprotease   98.2 1.7E-06 3.6E-11   89.3   6.3   66  350-428   128-193 (420)
 44 COG5640 Secreted trypsin-like   98.0 0.00024 5.2E-09   69.6  15.7   55  272-326   223-280 (413)
 45 PRK09681 putative type II secr  98.0 1.6E-05 3.6E-10   76.3   7.7   57  362-429   219-275 (276)
 46 COG3975 Predicted protease wit  98.0 1.2E-05 2.5E-10   82.3   5.9   64  350-432   462-525 (558)
 47 KOG3129 26S proteasome regulat  97.9 2.6E-05 5.7E-10   70.6   7.2   74  351-435   140-215 (231)
 48 PRK11186 carboxy-terminal prot  97.9 3.7E-05 8.1E-10   82.9   9.0   79  331-428   243-332 (667)
 49 PF05579 Peptidase_S32:  Equine  97.8 0.00026 5.6E-09   66.7  11.7  117  151-298   111-229 (297)
 50 PF04495 GRASP55_65:  GRASP55/6  97.8 5.9E-05 1.3E-09   65.5   6.7   87  332-430    26-114 (138)
 51 COG3031 PulC Type II secretory  97.3 0.00056 1.2E-08   63.4   6.2   59  359-428   216-274 (275)
 52 KOG3553 Tax interaction protei  96.9  0.0008 1.7E-08   53.9   2.9   34  350-394    59-92  (124)
 53 PF00548 Peptidase_C3:  3C cyst  96.7   0.044 9.6E-07   49.5  13.1  137  151-297    24-170 (172)
 54 PF12812 PDZ_1:  PDZ-like domai  96.7  0.0045 9.7E-08   48.3   5.6   64  332-406     9-75  (78)
 55 PF05580 Peptidase_S55:  SpoIVB  96.7   0.031 6.6E-07   51.7  11.8  166  145-315    13-214 (218)
 56 PF03761 DUF316:  Domain of unk  96.6    0.09   2E-06   51.1  15.7   91  197-297   159-254 (282)
 57 KOG3580 Tight junction protein  96.0  0.0069 1.5E-07   62.9   4.4   58  350-418   429-488 (1027)
 58 PF10459 Peptidase_S46:  Peptid  95.9   0.033 7.1E-07   60.8   9.0   22  152-173    47-68  (698)
 59 PF08192 Peptidase_S64:  Peptid  95.8   0.056 1.2E-06   57.4  10.1  117  197-322   541-687 (695)
 60 KOG3580 Tight junction protein  95.5   0.026 5.6E-07   58.8   6.0   86  332-429   198-288 (1027)
 61 KOG3532 Predicted protein kina  95.4   0.032   7E-07   58.9   6.4   58  332-406   386-443 (1051)
 62 PF00949 Peptidase_S7:  Peptida  95.3   0.025 5.5E-07   48.5   4.5   33  268-300    88-120 (132)
 63 KOG3209 WW domain-containing p  95.1    0.03 6.6E-07   59.4   5.0   56  354-421   782-839 (984)
 64 PF10459 Peptidase_S46:  Peptid  95.0   0.018 3.8E-07   62.8   3.4   59  265-323   621-686 (698)
 65 PF09342 DUF1986:  Domain of un  94.7    0.26 5.7E-06   46.4   9.6   90  147-237    23-131 (267)
 66 COG0750 Predicted membrane-ass  94.4    0.14 2.9E-06   52.0   8.0   56  356-423   135-194 (375)
 67 KOG3552 FERM domain protein FR  94.4   0.061 1.3E-06   58.8   5.4   65  332-418    65-131 (1298)
 68 TIGR02860 spore_IV_B stage IV   94.3    0.44 9.6E-06   48.6  11.1   41  271-315   354-394 (402)
 69 KOG3606 Cell polarity protein   93.9    0.15 3.3E-06   48.3   6.5   59  349-419   193-253 (358)
 70 KOG3209 WW domain-containing p  93.4    0.21 4.6E-06   53.3   7.0   58  352-419   373-432 (984)
 71 PF02122 Peptidase_S39:  Peptid  93.2    0.59 1.3E-05   43.3   8.9  134  164-314    43-182 (203)
 72 KOG3542 cAMP-regulated guanine  91.9    0.12 2.6E-06   54.7   3.0   57  350-418   562-618 (1283)
 73 KOG3834 Golgi reassembly stack  91.6    0.38 8.2E-06   48.7   5.9   69  350-429    15-85  (462)
 74 KOG3651 Protein kinase C, alph  91.4     0.4 8.7E-06   46.3   5.6   56  351-418    31-88  (429)
 75 KOG3549 Syntrophins (type gamm  91.4    0.34 7.4E-06   47.6   5.2   56  350-417    80-137 (505)
 76 PF00944 Peptidase_S3:  Alphavi  90.9    0.47   1E-05   40.4   4.9   29  271-299   100-128 (158)
 77 KOG3550 Receptor targeting pro  90.8     0.8 1.7E-05   39.8   6.4   55  350-416   115-171 (207)
 78 PF02395 Peptidase_S6:  Immunog  90.6    0.98 2.1E-05   50.1   8.6   65  151-218    64-130 (769)
 79 KOG3551 Syntrophins (type beta  90.2    0.42 9.2E-06   47.7   4.7   73  331-420    95-172 (506)
 80 KOG3571 Dishevelled 3 and rela  89.6    0.63 1.4E-05   48.0   5.6   72  332-418   261-338 (626)
 81 KOG3605 Beta amyloid precursor  88.7    0.64 1.4E-05   49.3   5.0  112  276-410   679-806 (829)
 82 KOG0609 Calcium/calmodulin-dep  87.9       1 2.2E-05   47.0   5.8   68  333-418   135-204 (542)
 83 KOG1892 Actin filament-binding  87.8    0.66 1.4E-05   51.3   4.6   60  350-421   960-1021(1629)
 84 PF02907 Peptidase_S29:  Hepati  86.7    0.44 9.5E-06   40.7   2.0  117  154-300    14-131 (148)
 85 KOG2921 Intramembrane metallop  86.4     1.1 2.4E-05   44.9   5.0   45  350-405   220-265 (484)
 86 PF00947 Pico_P2A:  Picornaviru  81.9       7 0.00015   33.2   7.2   32  265-297    78-109 (127)
 87 KOG0606 Microtubule-associated  81.6     2.2 4.9E-05   48.3   5.3   50  353-415   661-712 (1205)
 88 PF03510 Peptidase_C24:  2C end  80.4     6.4 0.00014   32.4   6.3   53  155-219     2-54  (105)
 89 KOG3834 Golgi reassembly stack  79.3     2.6 5.6E-05   42.9   4.4   65  354-429   113-179 (462)
 90 PF01732 DUF31:  Putative pepti  72.7     2.8 6.2E-05   42.6   2.8   24  272-295   350-373 (374)
 91 KOG3605 Beta amyloid precursor  68.9     5.2 0.00011   42.8   3.7   58  351-418   674-733 (829)
 92 PF05416 Peptidase_C37:  Southa  68.1      19  0.0004   36.9   7.2  137  150-299   377-528 (535)
 93 PF11874 DUF3394:  Domain of un  55.8      42 0.00091   30.5   6.7   38  333-388   112-149 (183)
 94 KOG3938 RGS-GAIP interacting p  54.3      14 0.00031   35.3   3.5   59  350-418   149-209 (334)
 95 cd01735 LSm12_N LSm12 belongs   48.5      65  0.0014   23.8   5.5   34  175-208     6-39  (61)
 96 PF00571 CBS:  CBS domain CBS d  47.7      18  0.0004   25.2   2.6   21  276-296    28-48  (57)
 97 cd00600 Sm_like The eukaryotic  45.2      61  0.0013   23.3   5.1   33  176-208     7-39  (63)
 98 PRK14864 putative biofilm stre  44.4      40 0.00087   27.7   4.3   55   78-133     3-57  (104)
 99 cd01720 Sm_D2 The eukaryotic S  43.5      44 0.00096   26.5   4.3   37  171-207    10-46  (87)
100 cd01731 archaeal_Sm1 The archa  39.3      72  0.0016   23.7   4.8   33  176-208    11-43  (68)
101 PRK00737 small nuclear ribonuc  39.1      75  0.0016   24.0   4.9   32  176-207    15-46  (72)
102 cd01722 Sm_F The eukaryotic Sm  38.0      67  0.0015   24.0   4.4   32  176-207    12-43  (68)
103 cd01726 LSm6 The eukaryotic Sm  37.7      76  0.0016   23.6   4.6   32  176-207    11-42  (67)
104 cd06168 LSm9 The eukaryotic Sm  36.5      87  0.0019   24.0   4.9   32  176-207    11-42  (75)
105 PF12381 Peptidase_C3G:  Tungro  36.5      26 0.00055   32.7   2.2   55  265-323   168-228 (231)
106 COG4956 Integral membrane prot  36.4      29 0.00064   34.1   2.7   40  385-424   269-309 (356)
107 cd01730 LSm3 The eukaryotic Sm  35.0      69  0.0015   24.9   4.2   31  176-206    12-42  (82)
108 cd01717 Sm_B The eukaryotic Sm  35.0      86  0.0019   24.1   4.7   32  176-207    11-42  (79)
109 cd01729 LSm7 The eukaryotic Sm  34.6      88  0.0019   24.3   4.7   31  176-206    13-43  (81)
110 PF04225 OapA:  Opacity-associa  33.3      21 0.00045   28.1   1.0   53  379-431     7-68  (85)
111 TIGR03000 plancto_dom_1 Planct  32.1      64  0.0014   24.9   3.4   47  382-428    11-62  (75)
112 cd01732 LSm5 The eukaryotic Sm  32.0      93   0.002   23.9   4.4   31  176-206    14-44  (76)
113 PF14827 Cache_3:  Sensory doma  31.7      47   0.001   27.4   2.9   18  281-298    94-111 (116)
114 cd01719 Sm_G The eukaryotic Sm  31.2 1.2E+02  0.0026   23.0   4.9   32  176-207    11-42  (72)
115 cd01728 LSm1 The eukaryotic Sm  30.8 1.1E+02  0.0024   23.3   4.6   31  176-206    13-43  (74)
116 COG0298 HypC Hydrogenase matur  30.8 1.1E+02  0.0024   23.8   4.5   47  188-236     5-52  (82)
117 smart00651 Sm snRNP Sm protein  29.8 1.2E+02  0.0027   22.0   4.7   32  176-207     9-40  (67)
118 PF02743 Cache_1:  Cache domain  28.9      53  0.0011   24.9   2.6   31  281-324    19-49  (81)
119 PF09122 DUF1930:  Domain of un  28.6 1.4E+02  0.0031   22.1   4.5   45  382-427    19-64  (68)
120 PF09465 LBR_tudor:  Lamin-B re  28.3 2.3E+02   0.005   20.5   5.6   35  174-208     8-43  (55)
121 PF05578 Peptidase_S31:  Pestiv  28.1 1.1E+02  0.0024   26.9   4.5   73  224-298   109-183 (211)
122 cd01727 LSm8 The eukaryotic Sm  27.4 1.3E+02  0.0029   22.7   4.5   32  176-207    10-41  (74)
123 COG1958 LSM1 Small nuclear rib  27.1 1.2E+02  0.0027   23.2   4.4   33  176-208    18-50  (79)
124 cd01721 Sm_D3 The eukaryotic S  26.8 1.6E+02  0.0034   22.1   4.8   32  176-207    11-42  (70)
125 PF14275 DUF4362:  Domain of un  26.3   2E+02  0.0042   23.4   5.5   47  381-429     2-62  (98)
126 PF01423 LSM:  LSM domain ;  In  25.5 1.5E+02  0.0032   21.6   4.4   33  176-208     9-41  (67)
127 cd04627 CBS_pair_14 The CBS do  25.2      58  0.0012   26.4   2.4   22  276-297    97-118 (123)
128 PF01455 HupF_HypC:  HupF/HypC   24.2 2.3E+02  0.0051   21.2   5.2   43  188-233     5-47  (68)
129 PF02601 Exonuc_VII_L:  Exonucl  23.8      87  0.0019   30.9   3.8   35  152-186   280-314 (319)
130 cd04603 CBS_pair_KefB_assoc Th  22.8      69  0.0015   25.5   2.4   20  277-296    86-105 (111)
131 cd04620 CBS_pair_7 The CBS dom  22.3      71  0.0015   25.3   2.4   20  277-296    90-109 (115)
132 PF10049 DUF2283:  Protein of u  20.1      73  0.0016   22.3   1.7   11  285-295    36-46  (50)

No 1  
>PRK10139 serine endoprotease; Provisional
Probab=100.00  E-value=3e-52  Score=429.25  Aligned_cols=301  Identities=39%  Similarity=0.625  Sum_probs=261.9

Q ss_pred             hHHHHHHHhCCceEEEEEeeeccC------ccc----c---cc-ccCcCeEEEEEEEcC-CCEEEecccccCCCCeEEEE
Q 013804          117 ATVRLFQENTPSVVNITNLAARQD------AFT----L---DV-LEVPQGSGSGFVWDS-KGHVVTNYHVIRGASDIRVT  181 (436)
Q Consensus       117 ~~~~~~~~~~~sVV~I~~~~~~~~------~~~----~---~~-~~~~~~~GSGfiI~~-~G~ILT~aHvv~~~~~i~V~  181 (436)
                      ++.++++++.||||.|.+......      .|.    .   +. .....+.||||+|++ +||||||+||+.+++.+.|+
T Consensus        41 ~~~~~~~~~~pavV~i~~~~~~~~~~~~~~~~~~~f~~~~~~~~~~~~~~~GSG~ii~~~~g~IlTn~HVv~~a~~i~V~  120 (455)
T PRK10139         41 SLAPMLEKVLPAVVSVRVEGTASQGQKIPEEFKKFFGDDLPDQPAQPFEGLGSGVIIDAAKGYVLTNNHVINQAQKISIQ  120 (455)
T ss_pred             cHHHHHHHhCCcEEEEEEEEeecccccCchhHHHhccccCCccccccccceEEEEEEECCCCEEEeChHHhCCCCEEEEE
Confidence            578999999999999987643211      111    0   00 112247899999985 79999999999999999999


Q ss_pred             ecCCcEEeeEEEEEcCCCCeEEEEEcCCCCCCcceecCCCCCCCCCCEEEEEecCCCCCCceeEeEEeeeeeeeccCCCC
Q 013804          182 FADQSAYDAKIVGFDQDKDVAVLRIDAPKDKLRPIPIGVSADLLVGQKVYAIGNPFGLDHTLTTGVISGLRREISSAATG  261 (436)
Q Consensus       182 ~~dg~~~~a~vv~~d~~~DlAlLkv~~~~~~~~~~~l~~~~~~~~G~~V~~vG~p~g~~~~~~~G~vs~~~~~~~~~~~~  261 (436)
                      +.|+++++|++++.|+.+||||||++.+ ..+++++|+++..+++|++|+++|||++...+++.|+|++..+.....   
T Consensus       121 ~~dg~~~~a~vvg~D~~~DlAvlkv~~~-~~l~~~~lg~s~~~~~G~~V~aiG~P~g~~~tvt~GivS~~~r~~~~~---  196 (455)
T PRK10139        121 LNDGREFDAKLIGSDDQSDIALLQIQNP-SKLTQIAIADSDKLRVGDFAVAVGNPFGLGQTATSGIISALGRSGLNL---  196 (455)
T ss_pred             ECCCCEEEEEEEEEcCCCCEEEEEecCC-CCCceeEecCccccCCCCEEEEEecCCCCCCceEEEEEccccccccCC---
Confidence            9999999999999999999999999854 378999999999999999999999999999999999999887652211   


Q ss_pred             CCcccEEEEccccCCCCCCCeEECCCCcEEEEEeeeecCCCCCCcceeeeeeeccchhhhhccccceecceecceeeecc
Q 013804          262 RPIQDVIQTDAAINPGNSGGPLLDSSGSLIGINTAIYSPSGASSGVGFSIPVDTVNGIVDQLVKFGKVTRPILGIKFAPD  341 (436)
Q Consensus       262 ~~~~~~i~~~~~i~~G~SGGPlvd~~G~VVGI~s~~~~~~~~~~~~~~aIP~~~i~~~l~~l~~~g~v~~~~lGv~~~~~  341 (436)
                      ..+..++++|+.+++|+|||||+|.+|+||||+++...+.++..+++|+||++.+++++++|+++|++.++|||+.+++.
T Consensus       197 ~~~~~~iqtda~in~GnSGGpl~n~~G~vIGi~~~~~~~~~~~~gigfaIP~~~~~~v~~~l~~~g~v~r~~LGv~~~~l  276 (455)
T PRK10139        197 EGLENFIQTDASINRGNSGGALLNLNGELIGINTAILAPGGGSVGIGFAIPSNMARTLAQQLIDFGEIKRGLLGIKGTEM  276 (455)
T ss_pred             CCcceEEEECCccCCCCCcceEECCCCeEEEEEEEEEcCCCCccceEEEEEhHHHHHHHHHHhhcCcccccceeEEEEEC
Confidence            12357899999999999999999999999999999887766678999999999999999999999999999999999863


Q ss_pred             --hhhhhcCc---cceEEEecCCCCcccccCcccccccccCcccCCcEEEEECCEEeCCHHHHHHHHhcCCCCCEEEEEE
Q 013804          342 --QSVEQLGV---SGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQCKVGDEVIVEV  416 (436)
Q Consensus       342 --~~~~~~g~---~gv~V~~v~~~spa~~agl~~~~~~~~~~l~~GDiIl~vnG~~V~s~~dl~~~l~~~~~g~~v~l~v  416 (436)
                        +.++.+|+   .|++|..|.++|||+++||++           ||+|++|||++|.+|+|+.+.+....+|+++.++|
T Consensus       277 ~~~~~~~lgl~~~~Gv~V~~V~~~SpA~~AGL~~-----------GDvIl~InG~~V~s~~dl~~~l~~~~~g~~v~l~V  345 (455)
T PRK10139        277 SADIAKAFNLDVQRGAFVSEVLPNSGSAKAGVKA-----------GDIITSLNGKPLNSFAELRSRIATTEPGTKVKLGL  345 (455)
T ss_pred             CHHHHHhcCCCCCCceEEEEECCCChHHHCCCCC-----------CCEEEEECCEECCCHHHHHHHHHhcCCCCEEEEEE
Confidence              34566765   699999999999999999999           99999999999999999999998878899999999


Q ss_pred             EECCEEEEEEEEeecC
Q 013804          417 LRGDQKEKIPVKLEPK  432 (436)
Q Consensus       417 ~R~g~~~~~~v~~~~~  432 (436)
                      .|+|+.+++++++...
T Consensus       346 ~R~G~~~~l~v~~~~~  361 (455)
T PRK10139        346 LRNGKPLEVEVTLDTS  361 (455)
T ss_pred             EECCEEEEEEEEECCC
Confidence            9999999999987543


No 2  
>TIGR02038 protease_degS periplasmic serine peptidase DegS. This family consists of the periplasmic serine protease DegS (HhoB), a shorter paralog of protease DO (HtrA, DegP) and DegQ (HhoA). It is found in E. coli and several other Proteobacteria of the gamma subdivision. It contains a trypsin domain and a single copy of the PDZ domain (in contrast to DegP with two copies). A critical role of DegS is to sense stress in the periplasm and partially degrade an inhibitor of sigma(E).
Probab=100.00  E-value=5.4e-51  Score=408.35  Aligned_cols=303  Identities=37%  Similarity=0.623  Sum_probs=262.4

Q ss_pred             CccchhHHHHHHHhCCceEEEEEeeeccCccccccccCcCeEEEEEEEcCCCEEEecccccCCCCeEEEEecCCcEEeeE
Q 013804          112 QTDELATVRLFQENTPSVVNITNLAARQDAFTLDVLEVPQGSGSGFVWDSKGHVVTNYHVIRGASDIRVTFADQSAYDAK  191 (436)
Q Consensus       112 ~~~~~~~~~~~~~~~~sVV~I~~~~~~~~~~~~~~~~~~~~~GSGfiI~~~G~ILT~aHvv~~~~~i~V~~~dg~~~~a~  191 (436)
                      ...+.++.++++++.||||.|.+.....+.   .......+.||||+|+++||||||+||+.+++.+.|.+.||+.++|+
T Consensus        41 ~~~~~~~~~~~~~~~psVV~I~~~~~~~~~---~~~~~~~~~GSG~vi~~~G~IlTn~HVV~~~~~i~V~~~dg~~~~a~  117 (351)
T TIGR02038        41 NTVEISFNKAVRRAAPAVVNIYNRSISQNS---LNQLSIQGLGSGVIMSKEGYILTNYHVIKKADQIVVALQDGRKFEAE  117 (351)
T ss_pred             cccchhHHHHHHhcCCcEEEEEeEeccccc---cccccccceEEEEEEeCCeEEEecccEeCCCCEEEEEECCCCEEEEE
Confidence            344557889999999999999886543321   11123357899999999999999999999999999999999999999


Q ss_pred             EEEEcCCCCeEEEEEcCCCCCCcceecCCCCCCCCCCEEEEEecCCCCCCceeEeEEeeeeeeeccCCCCCCcccEEEEc
Q 013804          192 IVGFDQDKDVAVLRIDAPKDKLRPIPIGVSADLLVGQKVYAIGNPFGLDHTLTTGVISGLRREISSAATGRPIQDVIQTD  271 (436)
Q Consensus       192 vv~~d~~~DlAlLkv~~~~~~~~~~~l~~~~~~~~G~~V~~vG~p~g~~~~~~~G~vs~~~~~~~~~~~~~~~~~~i~~~  271 (436)
                      +++.|+.+||||||++..  .+++++++++..+++|++|+++|||++...+++.|+|++..+....   ......++++|
T Consensus       118 vv~~d~~~DlAvlkv~~~--~~~~~~l~~s~~~~~G~~V~aiG~P~~~~~s~t~GiIs~~~r~~~~---~~~~~~~iqtd  192 (351)
T TIGR02038       118 LVGSDPLTDLAVLKIEGD--NLPTIPVNLDRPPHVGDVVLAIGNPYNLGQTITQGIISATGRNGLS---SVGRQNFIQTD  192 (351)
T ss_pred             EEEecCCCCEEEEEecCC--CCceEeccCcCccCCCCEEEEEeCCCCCCCcEEEEEEEeccCcccC---CCCcceEEEEC
Confidence            999999999999999863  5888999888889999999999999998899999999988764321   11235789999


Q ss_pred             cccCCCCCCCeEECCCCcEEEEEeeeecCCC--CCCcceeeeeeeccchhhhhccccceecceecceeeecc--hhhhhc
Q 013804          272 AAINPGNSGGPLLDSSGSLIGINTAIYSPSG--ASSGVGFSIPVDTVNGIVDQLVKFGKVTRPILGIKFAPD--QSVEQL  347 (436)
Q Consensus       272 ~~i~~G~SGGPlvd~~G~VVGI~s~~~~~~~--~~~~~~~aIP~~~i~~~l~~l~~~g~v~~~~lGv~~~~~--~~~~~~  347 (436)
                      +.+++|+|||||+|.+|+||||+++.+...+  ...+++|+||++.+++++++++++|++.++|||+.+++.  ..++.+
T Consensus       193 a~i~~GnSGGpl~n~~G~vIGI~~~~~~~~~~~~~~g~~faIP~~~~~~vl~~l~~~g~~~r~~lGv~~~~~~~~~~~~l  272 (351)
T TIGR02038       193 AAINAGNSGGALINTNGELVGINTASFQKGGDEGGEGINFAIPIKLAHKIMGKIIRDGRVIRGYIGVSGEDINSVVAQGL  272 (351)
T ss_pred             CccCCCCCcceEECCCCeEEEEEeeeecccCCCCccceEEEecHHHHHHHHHHHhhcCcccceEeeeEEEECCHHHHHhc
Confidence            9999999999999999999999997664322  246899999999999999999999999999999999863  345666


Q ss_pred             Cc---cceEEEecCCCCcccccCcccccccccCcccCCcEEEEECCEEeCCHHHHHHHHhcCCCCCEEEEEEEECCEEEE
Q 013804          348 GV---SGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQCKVGDEVIVEVLRGDQKEK  424 (436)
Q Consensus       348 g~---~gv~V~~v~~~spa~~agl~~~~~~~~~~l~~GDiIl~vnG~~V~s~~dl~~~l~~~~~g~~v~l~v~R~g~~~~  424 (436)
                      |+   .|++|..|.+++||+++||++           ||+|++|||++|.+++|+.+++...++|++++++|.|+|+.++
T Consensus       273 gl~~~~Gv~V~~V~~~spA~~aGL~~-----------GDvI~~Ing~~V~s~~dl~~~l~~~~~g~~v~l~v~R~g~~~~  341 (351)
T TIGR02038       273 GLPDLRGIVITGVDPNGPAARAGILV-----------RDVILKYDGKDVIGAEELMDRIAETRPGSKVMVTVLRQGKQLE  341 (351)
T ss_pred             CCCccccceEeecCCCChHHHCCCCC-----------CCEEEEECCEEcCCHHHHHHHHHhcCCCCEEEEEEEECCEEEE
Confidence            76   699999999999999999999           9999999999999999999999887889999999999999999


Q ss_pred             EEEEeecCC
Q 013804          425 IPVKLEPKP  433 (436)
Q Consensus       425 ~~v~~~~~~  433 (436)
                      +++++.++|
T Consensus       342 ~~v~l~~~p  350 (351)
T TIGR02038       342 LPVTIDEKP  350 (351)
T ss_pred             EEEEecCCC
Confidence            999987654


No 3  
>PRK10898 serine endoprotease; Provisional
Probab=100.00  E-value=6.1e-51  Score=407.75  Aligned_cols=303  Identities=35%  Similarity=0.535  Sum_probs=259.7

Q ss_pred             CccchhHHHHHHHhCCceEEEEEeeeccCccccccccCcCeEEEEEEEcCCCEEEecccccCCCCeEEEEecCCcEEeeE
Q 013804          112 QTDELATVRLFQENTPSVVNITNLAARQDAFTLDVLEVPQGSGSGFVWDSKGHVVTNYHVIRGASDIRVTFADQSAYDAK  191 (436)
Q Consensus       112 ~~~~~~~~~~~~~~~~sVV~I~~~~~~~~~~~~~~~~~~~~~GSGfiI~~~G~ILT~aHvv~~~~~i~V~~~dg~~~~a~  191 (436)
                      ...+.++.++++++.||||.|.........   .......+.||||+|+++||||||+||+.+++.+.|.+.||+.++|+
T Consensus        41 ~~~~~~~~~~~~~~~psvV~v~~~~~~~~~---~~~~~~~~~GSGfvi~~~G~IlTn~HVv~~a~~i~V~~~dg~~~~a~  117 (353)
T PRK10898         41 DETPASYNQAVRRAAPAVVNVYNRSLNSTS---HNQLEIRTLGSGVIMDQRGYILTNKHVINDADQIIVALQDGRVFEAL  117 (353)
T ss_pred             ccccchHHHHHHHhCCcEEEEEeEeccccC---cccccccceeeEEEEeCCeEEEecccEeCCCCEEEEEeCCCCEEEEE
Confidence            334457889999999999999886532211   11112347899999999999999999999999999999999999999


Q ss_pred             EEEEcCCCCeEEEEEcCCCCCCcceecCCCCCCCCCCEEEEEecCCCCCCceeEeEEeeeeeeeccCCCCCCcccEEEEc
Q 013804          192 IVGFDQDKDVAVLRIDAPKDKLRPIPIGVSADLLVGQKVYAIGNPFGLDHTLTTGVISGLRREISSAATGRPIQDVIQTD  271 (436)
Q Consensus       192 vv~~d~~~DlAlLkv~~~~~~~~~~~l~~~~~~~~G~~V~~vG~p~g~~~~~~~G~vs~~~~~~~~~~~~~~~~~~i~~~  271 (436)
                      ++++|+.+||||||++..  .+++++++++..+++|++|+++|||++...+++.|+|++..+.....   .....++++|
T Consensus       118 vv~~d~~~DlAvl~v~~~--~l~~~~l~~~~~~~~G~~V~aiG~P~g~~~~~t~Giis~~~r~~~~~---~~~~~~iqtd  192 (353)
T PRK10898        118 LVGSDSLTDLAVLKINAT--NLPVIPINPKRVPHIGDVVLAIGNPYNLGQTITQGIISATGRIGLSP---TGRQNFLQTD  192 (353)
T ss_pred             EEEEcCCCCEEEEEEcCC--CCCeeeccCcCcCCCCCEEEEEeCCCCcCCCcceeEEEeccccccCC---ccccceEEec
Confidence            999999999999999863  58889998888899999999999999988899999999877643221   1224689999


Q ss_pred             cccCCCCCCCeEECCCCcEEEEEeeeecCCC---CCCcceeeeeeeccchhhhhccccceecceecceeeecch--hhhh
Q 013804          272 AAINPGNSGGPLLDSSGSLIGINTAIYSPSG---ASSGVGFSIPVDTVNGIVDQLVKFGKVTRPILGIKFAPDQ--SVEQ  346 (436)
Q Consensus       272 ~~i~~G~SGGPlvd~~G~VVGI~s~~~~~~~---~~~~~~~aIP~~~i~~~l~~l~~~g~v~~~~lGv~~~~~~--~~~~  346 (436)
                      +.+++|+|||||+|.+|+||||+++.+...+   ...+++|+||++.+++++++++++|++.++|||+.+++..  .++.
T Consensus       193 a~i~~GnSGGPl~n~~G~vvGI~~~~~~~~~~~~~~~g~~faIP~~~~~~~~~~l~~~G~~~~~~lGi~~~~~~~~~~~~  272 (353)
T PRK10898        193 ASINHGNSGGALVNSLGELMGINTLSFDKSNDGETPEGIGFAIPTQLATKIMDKLIRDGRVIRGYIGIGGREIAPLHAQG  272 (353)
T ss_pred             cccCCCCCcceEECCCCeEEEEEEEEecccCCCCcccceEEEEchHHHHHHHHHHhhcCcccccccceEEEECCHHHHHh
Confidence            9999999999999999999999998765432   2368999999999999999999999999999999987532  2333


Q ss_pred             cCc---cceEEEecCCCCcccccCcccccccccCcccCCcEEEEECCEEeCCHHHHHHHHhcCCCCCEEEEEEEECCEEE
Q 013804          347 LGV---SGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQCKVGDEVIVEVLRGDQKE  423 (436)
Q Consensus       347 ~g~---~gv~V~~v~~~spa~~agl~~~~~~~~~~l~~GDiIl~vnG~~V~s~~dl~~~l~~~~~g~~v~l~v~R~g~~~  423 (436)
                      +++   .|++|.+|.+++||+++||++           ||+|++|||++|.++.|+.+.+....+|++++++|.|+|+.+
T Consensus       273 ~~~~~~~Gv~V~~V~~~spA~~aGL~~-----------GDvI~~Ing~~V~s~~~l~~~l~~~~~g~~v~l~v~R~g~~~  341 (353)
T PRK10898        273 GGIDQLQGIVVNEVSPDGPAAKAGIQV-----------NDLIISVNNKPAISALETMDQVAEIRPGSVIPVVVMRDDKQL  341 (353)
T ss_pred             cCCCCCCeEEEEEECCCChHHHcCCCC-----------CCEEEEECCEEcCCHHHHHHHHHhcCCCCEEEEEEEECCEEE
Confidence            343   799999999999999999999           999999999999999999999988789999999999999999


Q ss_pred             EEEEEeecCC
Q 013804          424 KIPVKLEPKP  433 (436)
Q Consensus       424 ~~~v~~~~~~  433 (436)
                      ++++++.+++
T Consensus       342 ~~~v~l~~~p  351 (353)
T PRK10898        342 TLQVTIQEYP  351 (353)
T ss_pred             EEEEEeccCC
Confidence            9999987765


No 4  
>PRK10942 serine endoprotease; Provisional
Probab=100.00  E-value=9.8e-50  Score=412.42  Aligned_cols=300  Identities=38%  Similarity=0.600  Sum_probs=261.0

Q ss_pred             hHHHHHHHhCCceEEEEEeeeccC---c--------cccc--------------------------cccCcCeEEEEEEE
Q 013804          117 ATVRLFQENTPSVVNITNLAARQD---A--------FTLD--------------------------VLEVPQGSGSGFVW  159 (436)
Q Consensus       117 ~~~~~~~~~~~sVV~I~~~~~~~~---~--------~~~~--------------------------~~~~~~~~GSGfiI  159 (436)
                      ++.++++++.||||.|.+......   +        |...                          ......+.||||||
T Consensus        39 ~~~~~~~~~~pavv~i~~~~~~~~~~~~~~~~~~~ff~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~GSG~ii  118 (473)
T PRK10942         39 SLAPMLEKVMPSVVSINVEGSTTVNTPRMPRQFQQFFGDNSPFCQEGSPFQSSPFCQGGQGGNGGGQQQKFMALGSGVII  118 (473)
T ss_pred             cHHHHHHHhCCceEEEEEEEeccccCCCCChhHHHhhcccccccccccccccccccccccccccccccccccceEEEEEE
Confidence            588999999999999987653211   0        1000                          00122468999999


Q ss_pred             cC-CCEEEecccccCCCCeEEEEecCCcEEeeEEEEEcCCCCeEEEEEcCCCCCCcceecCCCCCCCCCCEEEEEecCCC
Q 013804          160 DS-KGHVVTNYHVIRGASDIRVTFADQSAYDAKIVGFDQDKDVAVLRIDAPKDKLRPIPIGVSADLLVGQKVYAIGNPFG  238 (436)
Q Consensus       160 ~~-~G~ILT~aHvv~~~~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkv~~~~~~~~~~~l~~~~~~~~G~~V~~vG~p~g  238 (436)
                      ++ +||||||+||+.+++.+.|++.|++.++|++++.|+.+||||||++.. ..+++++|+++..+++|++|+++|+|++
T Consensus       119 ~~~~G~IlTn~HVv~~a~~i~V~~~dg~~~~a~vv~~D~~~DlAvlki~~~-~~l~~~~lg~s~~l~~G~~V~aiG~P~g  197 (473)
T PRK10942        119 DADKGYVVTNNHVVDNATKIKVQLSDGRKFDAKVVGKDPRSDIALIQLQNP-KNLTAIKMADSDALRVGDYTVAIGNPYG  197 (473)
T ss_pred             ECCCCEEEeChhhcCCCCEEEEEECCCCEEEEEEEEecCCCCEEEEEecCC-CCCceeEecCccccCCCCEEEEEcCCCC
Confidence            86 599999999999999999999999999999999999999999999754 3689999999999999999999999999


Q ss_pred             CCCceeEeEEeeeeeeeccCCCCCCcccEEEEccccCCCCCCCeEECCCCcEEEEEeeeecCCCCCCcceeeeeeeccch
Q 013804          239 LDHTLTTGVISGLRREISSAATGRPIQDVIQTDAAINPGNSGGPLLDSSGSLIGINTAIYSPSGASSGVGFSIPVDTVNG  318 (436)
Q Consensus       239 ~~~~~~~G~vs~~~~~~~~~~~~~~~~~~i~~~~~i~~G~SGGPlvd~~G~VVGI~s~~~~~~~~~~~~~~aIP~~~i~~  318 (436)
                      ...+++.|+|++..+....   ...+..++++|+.+++|+|||||+|.+|+||||+++...+.++..+++|+||++.+++
T Consensus       198 ~~~tvt~GiVs~~~r~~~~---~~~~~~~iqtda~i~~GnSGGpL~n~~GeviGI~t~~~~~~g~~~g~gfaIP~~~~~~  274 (473)
T PRK10942        198 LGETVTSGIVSALGRSGLN---VENYENFIQTDAAINRGNSGGALVNLNGELIGINTAILAPDGGNIGIGFAIPSNMVKN  274 (473)
T ss_pred             CCcceeEEEEEEeecccCC---cccccceEEeccccCCCCCcCccCCCCCeEEEEEEEEEcCCCCcccEEEEEEHHHHHH
Confidence            9999999999988764211   1123578999999999999999999999999999998877666778999999999999


Q ss_pred             hhhhccccceecceecceeeecc--hhhhhcCc---cceEEEecCCCCcccccCcccccccccCcccCCcEEEEECCEEe
Q 013804          319 IVDQLVKFGKVTRPILGIKFAPD--QSVEQLGV---SGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKV  393 (436)
Q Consensus       319 ~l~~l~~~g~v~~~~lGv~~~~~--~~~~~~g~---~gv~V~~v~~~spa~~agl~~~~~~~~~~l~~GDiIl~vnG~~V  393 (436)
                      ++++|+++|++.|+|||+.+++.  +.++.+++   .|++|..|.+++||+++||++           ||+|++|||++|
T Consensus       275 v~~~l~~~g~v~rg~lGv~~~~l~~~~a~~~~l~~~~GvlV~~V~~~SpA~~AGL~~-----------GDvIl~InG~~V  343 (473)
T PRK10942        275 LTSQMVEYGQVKRGELGIMGTELNSELAKAMKVDAQRGAFVSQVLPNSSAAKAGIKA-----------GDVITSLNGKPI  343 (473)
T ss_pred             HHHHHHhccccccceeeeEeeecCHHHHHhcCCCCCCceEEEEECCCChHHHcCCCC-----------CCEEEEECCEEC
Confidence            99999999999999999999863  34666775   599999999999999999999           999999999999


Q ss_pred             CCHHHHHHHHhcCCCCCEEEEEEEECCEEEEEEEEeec
Q 013804          394 SNGSDLYRILDQCKVGDEVIVEVLRGDQKEKIPVKLEP  431 (436)
Q Consensus       394 ~s~~dl~~~l~~~~~g~~v~l~v~R~g~~~~~~v~~~~  431 (436)
                      .+++|+.+++....+|++++++|.|+|+.+++++++..
T Consensus       344 ~s~~dl~~~l~~~~~g~~v~l~v~R~G~~~~v~v~l~~  381 (473)
T PRK10942        344 SSFAALRAQVGTMPVGSKLTLGLLRDGKPVNVNVELQQ  381 (473)
T ss_pred             CCHHHHHHHHHhcCCCCEEEEEEEECCeEEEEEEEeCc
Confidence            99999999998888899999999999999999988754


No 5  
>TIGR02037 degP_htrA_DO periplasmic serine protease, Do/DegQ family. This family consists of a set of proteins variously designated DegP, heat shock protein HtrA, and protease DO. The ortholog in Pseudomonas aeruginosa is designated MucD and is found in an operon that controls mucoid phenotype. This family also includes the DegQ (HhoA) paralog in E. coli which can rescue a DegP mutant, but not the smaller DegS paralog, which cannot. Members of this family are located in the periplasm and have separable functions as both protease and chaperone. Members have a trypsin domain and two copies of a PDZ domain. This protein protects bacteria from thermal and other stresses and may be important for the survival of bacterial pathogens. The chaperone function is dominant at low temperatures, whereas the proteolytic activity is turned on at elevated temperatures.
Probab=100.00  E-value=1.6e-48  Score=401.44  Aligned_cols=301  Identities=44%  Similarity=0.677  Sum_probs=262.6

Q ss_pred             HHHHHHHhCCceEEEEEeeeccC---------c----ccc--c------cccCcCeEEEEEEEcCCCEEEecccccCCCC
Q 013804          118 TVRLFQENTPSVVNITNLAARQD---------A----FTL--D------VLEVPQGSGSGFVWDSKGHVVTNYHVIRGAS  176 (436)
Q Consensus       118 ~~~~~~~~~~sVV~I~~~~~~~~---------~----~~~--~------~~~~~~~~GSGfiI~~~G~ILT~aHvv~~~~  176 (436)
                      +.++++++.||||.|.+......         +    |..  .      ......+.||||+|+++||||||+||+.++.
T Consensus         3 ~~~~~~~~~p~vv~i~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~GSGfii~~~G~IlTn~Hvv~~~~   82 (428)
T TIGR02037         3 FAPLVEKVAPAVVNISVEGTVKRRNRPPALPPFFRQFFGDDMPNFPRQQRERKVRGLGSGVIISADGYILTNNHVVDGAD   82 (428)
T ss_pred             HHHHHHHhCCceEEEEEEEEecccCCCcccchhHHHhhcccccCcccccccccccceeeEEEECCCCEEEEcHHHcCCCC
Confidence            56799999999999988652211         1    100  0      1123457899999999999999999999999


Q ss_pred             eEEEEecCCcEEeeEEEEEcCCCCeEEEEEcCCCCCCcceecCCCCCCCCCCEEEEEecCCCCCCceeEeEEeeeeeeec
Q 013804          177 DIRVTFADQSAYDAKIVGFDQDKDVAVLRIDAPKDKLRPIPIGVSADLLVGQKVYAIGNPFGLDHTLTTGVISGLRREIS  256 (436)
Q Consensus       177 ~i~V~~~dg~~~~a~vv~~d~~~DlAlLkv~~~~~~~~~~~l~~~~~~~~G~~V~~vG~p~g~~~~~~~G~vs~~~~~~~  256 (436)
                      .+.|++.|++.++|++++.|+.+||||||++.+ ..++++.|+++..+++|++|+++|||++...+++.|+|++..+...
T Consensus        83 ~i~V~~~~~~~~~a~vv~~d~~~DlAllkv~~~-~~~~~~~l~~~~~~~~G~~v~aiG~p~g~~~~~t~G~vs~~~~~~~  161 (428)
T TIGR02037        83 EITVTLSDGREFKAKLVGKDPRTDIAVLKIDAK-KNLPVIKLGDSDKLRVGDWVLAIGNPFGLGQTVTSGIVSALGRSGL  161 (428)
T ss_pred             eEEEEeCCCCEEEEEEEEecCCCCEEEEEecCC-CCceEEEccCCCCCCCCCEEEEEECCCcCCCcEEEEEEEecccCcc
Confidence            999999999999999999999999999999864 3689999998888999999999999999999999999998776531


Q ss_pred             cCCCCCCcccEEEEccccCCCCCCCeEECCCCcEEEEEeeeecCCCCCCcceeeeeeeccchhhhhccccceecceecce
Q 013804          257 SAATGRPIQDVIQTDAAINPGNSGGPLLDSSGSLIGINTAIYSPSGASSGVGFSIPVDTVNGIVDQLVKFGKVTRPILGI  336 (436)
Q Consensus       257 ~~~~~~~~~~~i~~~~~i~~G~SGGPlvd~~G~VVGI~s~~~~~~~~~~~~~~aIP~~~i~~~l~~l~~~g~v~~~~lGv  336 (436)
                         ....+..++++|+.+++|+|||||+|.+|+||||+++.....++..+++|+||++.+++++++++++|++.++|||+
T Consensus       162 ---~~~~~~~~i~tda~i~~GnSGGpl~n~~G~viGI~~~~~~~~g~~~g~~faiP~~~~~~~~~~l~~~g~~~~~~lGi  238 (428)
T TIGR02037       162 ---GIGDYENFIQTDAAINPGNSGGPLVNLRGEVIGINTAIYSPSGGNVGIGFAIPSNMAKNVVDQLIEGGKVQRGWLGV  238 (428)
T ss_pred             ---CCCCccceEEECCCCCCCCCCCceECCCCeEEEEEeEEEcCCCCccceEEEEEhHHHHHHHHHHHhcCcCcCCcCce
Confidence               11234568999999999999999999999999999998776666778999999999999999999999999999999


Q ss_pred             eeecc--hhhhhcCc---cceEEEecCCCCcccccCcccccccccCcccCCcEEEEECCEEeCCHHHHHHHHhcCCCCCE
Q 013804          337 KFAPD--QSVEQLGV---SGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQCKVGDE  411 (436)
Q Consensus       337 ~~~~~--~~~~~~g~---~gv~V~~v~~~spa~~agl~~~~~~~~~~l~~GDiIl~vnG~~V~s~~dl~~~l~~~~~g~~  411 (436)
                      .+++.  ..++.+|+   .|++|.+|.+++||+++||++           ||+|++|||++|.++.++.+++....+|++
T Consensus       239 ~~~~~~~~~~~~lgl~~~~Gv~V~~V~~~spA~~aGL~~-----------GDvI~~Vng~~i~~~~~~~~~l~~~~~g~~  307 (428)
T TIGR02037       239 TIQEVTSDLAKSLGLEKQRGALVAQVLPGSPAEKAGLKA-----------GDVILSVNGKPISSFADLRRAIGTLKPGKK  307 (428)
T ss_pred             EeecCCHHHHHHcCCCCCCceEEEEccCCCChHHcCCCC-----------CCEEEEECCEEcCCHHHHHHHHHhcCCCCE
Confidence            99863  35677776   799999999999999999999           999999999999999999999988788999


Q ss_pred             EEEEEEECCEEEEEEEEeecCC
Q 013804          412 VIVEVLRGDQKEKIPVKLEPKP  433 (436)
Q Consensus       412 v~l~v~R~g~~~~~~v~~~~~~  433 (436)
                      ++++|.|+|+.+++++++...+
T Consensus       308 v~l~v~R~g~~~~~~v~l~~~~  329 (428)
T TIGR02037       308 VTLGILRKGKEKTITVTLGASP  329 (428)
T ss_pred             EEEEEEECCEEEEEEEEECcCC
Confidence            9999999999999999876543


No 6  
>COG0265 DegQ Trypsin-like serine proteases, typically periplasmic, contain C-terminal PDZ domain [Posttranslational modification, protein turnover, chaperones]
Probab=100.00  E-value=6.3e-39  Score=321.87  Aligned_cols=302  Identities=45%  Similarity=0.689  Sum_probs=261.5

Q ss_pred             hhHHHHHHHhCCceEEEEEeeeccC-cccccc--ccCcCeEEEEEEEcCCCEEEecccccCCCCeEEEEecCCcEEeeEE
Q 013804          116 LATVRLFQENTPSVVNITNLAARQD-AFTLDV--LEVPQGSGSGFVWDSKGHVVTNYHVIRGASDIRVTFADQSAYDAKI  192 (436)
Q Consensus       116 ~~~~~~~~~~~~sVV~I~~~~~~~~-~~~~~~--~~~~~~~GSGfiI~~~G~ILT~aHvv~~~~~i~V~~~dg~~~~a~v  192 (436)
                      ..+..+++++.|+||.|........ .|....  .....+.||||+++++|||+|+.||+.++..+.+.+.||+.+++++
T Consensus        33 ~~~~~~~~~~~~~vV~~~~~~~~~~~~~~~~~~~~~~~~~~gSg~i~~~~g~ivTn~hVi~~a~~i~v~l~dg~~~~a~~  112 (347)
T COG0265          33 LSFATAVEKVAPAVVSIATGLTAKLRSFFPSDPPLRSAEGLGSGFIISSDGYIVTNNHVIAGAEEITVTLADGREVPAKL  112 (347)
T ss_pred             cCHHHHHHhcCCcEEEEEeeeeecchhcccCCcccccccccccEEEEcCCeEEEecceecCCcceEEEEeCCCCEEEEEE
Confidence            5778899999999999988654332 111000  0001489999999999999999999999999999999999999999


Q ss_pred             EEEcCCCCeEEEEEcCCCCCCcceecCCCCCCCCCCEEEEEecCCCCCCceeEeEEeeeeeeeccCCCCCCcccEEEEcc
Q 013804          193 VGFDQDKDVAVLRIDAPKDKLRPIPIGVSADLLVGQKVYAIGNPFGLDHTLTTGVISGLRREISSAATGRPIQDVIQTDA  272 (436)
Q Consensus       193 v~~d~~~DlAlLkv~~~~~~~~~~~l~~~~~~~~G~~V~~vG~p~g~~~~~~~G~vs~~~~~~~~~~~~~~~~~~i~~~~  272 (436)
                      ++.|+..|+|++|++.... ++.+.++++..++.|++++++|+|++...+++.|+++...+. ... ....+.+++++|+
T Consensus       113 vg~d~~~dlavlki~~~~~-~~~~~~~~s~~l~vg~~v~aiGnp~g~~~tvt~Givs~~~r~-~v~-~~~~~~~~IqtdA  189 (347)
T COG0265         113 VGKDPISDLAVLKIDGAGG-LPVIALGDSDKLRVGDVVVAIGNPFGLGQTVTSGIVSALGRT-GVG-SAGGYVNFIQTDA  189 (347)
T ss_pred             EecCCccCEEEEEeccCCC-CceeeccCCCCcccCCEEEEecCCCCcccceeccEEeccccc-ccc-Ccccccchhhccc
Confidence            9999999999999987543 888899999999999999999999999999999999998886 111 1112568899999


Q ss_pred             ccCCCCCCCeEECCCCcEEEEEeeeecCCCCCCcceeeeeeeccchhhhhccccceecceecceeeecchhhhhcC---c
Q 013804          273 AINPGNSGGPLLDSSGSLIGINTAIYSPSGASSGVGFSIPVDTVNGIVDQLVKFGKVTRPILGIKFAPDQSVEQLG---V  349 (436)
Q Consensus       273 ~i~~G~SGGPlvd~~G~VVGI~s~~~~~~~~~~~~~~aIP~~~i~~~l~~l~~~g~v~~~~lGv~~~~~~~~~~~g---~  349 (436)
                      ++++|+||||++|.+|++|||++......++..+++|+||++.++.++.++++.|++.++++|+.+.+......+|   .
T Consensus       190 ain~gnsGgpl~n~~g~~iGint~~~~~~~~~~gigfaiP~~~~~~v~~~l~~~G~v~~~~lgv~~~~~~~~~~~g~~~~  269 (347)
T COG0265         190 AINPGNSGGPLVNIDGEVVGINTAIIAPSGGSSGIGFAIPVNLVAPVLDELISKGKVVRGYLGVIGEPLTADIALGLPVA  269 (347)
T ss_pred             ccCCCCCCCceEcCCCcEEEEEEEEecCCCCcceeEEEecHHHHHHHHHHHHHcCCccccccceEEEEcccccccCCCCC
Confidence            9999999999999999999999999887665677999999999999999999988999999999988633222144   3


Q ss_pred             cceEEEecCCCCcccccCcccccccccCcccCCcEEEEECCEEeCCHHHHHHHHhcCCCCCEEEEEEEECCEEEEEEEEe
Q 013804          350 SGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQCKVGDEVIVEVLRGDQKEKIPVKL  429 (436)
Q Consensus       350 ~gv~V~~v~~~spa~~agl~~~~~~~~~~l~~GDiIl~vnG~~V~s~~dl~~~l~~~~~g~~v~l~v~R~g~~~~~~v~~  429 (436)
                      .|++|..+.+++||+++|+++           ||+|+++||++|.+..++...+....+|+++.+++.|+|+++++.+++
T Consensus       270 ~G~~V~~v~~~spa~~agi~~-----------Gdii~~vng~~v~~~~~l~~~v~~~~~g~~v~~~~~r~g~~~~~~v~l  338 (347)
T COG0265         270 AGAVVLGVLPGSPAAKAGIKA-----------GDIITAVNGKPVASLSDLVAAVASNRPGDEVALKLLRGGKERELAVTL  338 (347)
T ss_pred             CceEEEecCCCChHHHcCCCC-----------CCEEEEECCEEccCHHHHHHHHhccCCCCEEEEEEEECCEEEEEEEEe
Confidence            799999999999999999999           999999999999999999999998889999999999999999999999


Q ss_pred             ec
Q 013804          430 EP  431 (436)
Q Consensus       430 ~~  431 (436)
                      .+
T Consensus       339 ~~  340 (347)
T COG0265         339 GD  340 (347)
T ss_pred             cC
Confidence            76


No 7  
>KOG1320 consensus Serine protease [Posttranslational modification, protein turnover, chaperones]
Probab=99.95  E-value=2.9e-27  Score=238.34  Aligned_cols=305  Identities=37%  Similarity=0.516  Sum_probs=244.6

Q ss_pred             hHHHHHHHhCCceEEEEEeeeccCccccccccCcCeEEEEEEEcCCCEEEecccccCCCC-----------eEEEEecCC
Q 013804          117 ATVRLFQENTPSVVNITNLAARQDAFTLDVLEVPQGSGSGFVWDSKGHVVTNYHVIRGAS-----------DIRVTFADQ  185 (436)
Q Consensus       117 ~~~~~~~~~~~sVV~I~~~~~~~~~~~~~~~~~~~~~GSGfiI~~~G~ILT~aHvv~~~~-----------~i~V~~~dg  185 (436)
                      ....+.++-..+||.|+...-...+......+.+...||||+++.+|.++|++||+....           .+.|...++
T Consensus       129 ~v~~~~~~cd~Avv~Ie~~~f~~~~~~~e~~~ip~l~~S~~Vv~gd~i~VTnghV~~~~~~~y~~~~~~l~~vqi~aa~~  208 (473)
T KOG1320|consen  129 FVAAVFEECDLAVVYIESEEFWKGMNPFELGDIPSLNGSGFVVGGDGIIVTNGHVVRVEPRIYAHSSTVLLRVQIDAAIG  208 (473)
T ss_pred             hHHHhhhcccceEEEEeeccccCCCcccccCCCcccCccEEEEcCCcEEEEeeEEEEEEeccccCCCcceeeEEEEEeec
Confidence            345788899999999987544333322344455678999999999999999999997543           377777766


Q ss_pred             --cEEeeEEEEEcCCCCeEEEEEcCCCCCCcceecCCCCCCCCCCEEEEEecCCCCCCceeEeEEeeeeeeeccCCCC--
Q 013804          186 --SAYDAKIVGFDQDKDVAVLRIDAPKDKLRPIPIGVSADLLVGQKVYAIGNPFGLDHTLTTGVISGLRREISSAATG--  261 (436)
Q Consensus       186 --~~~~a~vv~~d~~~DlAlLkv~~~~~~~~~~~l~~~~~~~~G~~V~~vG~p~g~~~~~~~G~vs~~~~~~~~~~~~--  261 (436)
                        ..+++.+.+.|+..|+|+++++.+..-.++++++-+..+..|+++..+|.|++..+..+.|.+++..|........  
T Consensus       209 ~~~s~ep~i~g~d~~~gvA~l~ik~~~~i~~~i~~~~~~~~~~G~~~~a~~~~f~~~nt~t~g~vs~~~R~~~~lg~~~g  288 (473)
T KOG1320|consen  209 PGNSGEPVIVGVDKVAGVAFLKIKTPENILYVIPLGVSSHFRTGVEVSAIGNGFGLLNTLTQGMVSGQLRKSFKLGLETG  288 (473)
T ss_pred             CCccCCCeEEccccccceEEEEEecCCcccceeecceeeeecccceeeccccCceeeeeeeecccccccccccccCcccc
Confidence              8899999999999999999997654347888998888999999999999999999999999999888765443332  


Q ss_pred             CCcccEEEEccccCCCCCCCeEECCCCcEEEEEeeeecCCCCCCcceeeeeeeccchhhhhcccccee---------cce
Q 013804          262 RPIQDVIQTDAAINPGNSGGPLLDSSGSLIGINTAIYSPSGASSGVGFSIPVDTVNGIVDQLVKFGKV---------TRP  332 (436)
Q Consensus       262 ~~~~~~i~~~~~i~~G~SGGPlvd~~G~VVGI~s~~~~~~~~~~~~~~aIP~~~i~~~l~~l~~~g~v---------~~~  332 (436)
                      ....+++++|++++.|+||+|++|.+|++||+++......+-..+++|++|.+.+..++.+..+....         .+.
T Consensus       289 ~~i~~~~qtd~ai~~~nsg~~ll~~DG~~IgVn~~~~~ri~~~~~iSf~~p~d~vl~~v~r~~e~~~~lr~~~~~~p~~~  368 (473)
T KOG1320|consen  289 VLISKINQTDAAINPGNSGGPLLNLDGEVIGVNTRKVTRIGFSHGISFKIPIDTVLVIVLRLGEFQISLRPVKPLVPVHQ  368 (473)
T ss_pred             eeeeeecccchhhhcccCCCcEEEecCcEeeeeeeeeEEeeccccceeccCchHhhhhhhhhhhhceeeccccCcccccc
Confidence            34568899999999999999999999999999887765444457899999999999998887443322         234


Q ss_pred             ecceeeec-------chhhhhc----C-ccceEEEecCCCCcccccCcccccccccCcccCCcEEEEECCEEeCCHHHHH
Q 013804          333 ILGIKFAP-------DQSVEQL----G-VSGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLY  400 (436)
Q Consensus       333 ~lGv~~~~-------~~~~~~~----g-~~gv~V~~v~~~spa~~agl~~~~~~~~~~l~~GDiIl~vnG~~V~s~~dl~  400 (436)
                      |+|+...-       ....+.+    + ..+|+|.+|.+++++...++.+           ||+|++|||++|.+..++.
T Consensus       369 ~~g~~s~~i~~g~vf~~~~~~~~~~~~~~q~v~is~Vlp~~~~~~~~~~~-----------g~~V~~vng~~V~n~~~l~  437 (473)
T KOG1320|consen  369 YIGLPSYYIFAGLVFVPLTKSYIFPSGVVQLVLVSQVLPGSINGGYGLKP-----------GDQVVKVNGKPVKNLKHLY  437 (473)
T ss_pred             cCCceeEEEecceEEeecCCCccccccceeEEEEEEeccCCCcccccccC-----------CCEEEEECCEEeechHHHH
Confidence            66664331       0011111    2 2689999999999999999999           9999999999999999999


Q ss_pred             HHHhcCCCCCEEEEEEEECCEEEEEEEEeecC
Q 013804          401 RILDQCKVGDEVIVEVLRGDQKEKIPVKLEPK  432 (436)
Q Consensus       401 ~~l~~~~~g~~v~l~v~R~g~~~~~~v~~~~~  432 (436)
                      +++++...+++|.+..+|..|..++.+.....
T Consensus       438 ~~i~~~~~~~~v~vl~~~~~e~~tl~Il~~~~  469 (473)
T KOG1320|consen  438 ELIEECSTEDKVAVLDRRSAEDATLEILPEHK  469 (473)
T ss_pred             HHHHhcCcCceEEEEEecCccceeEEeccccc
Confidence            99999888899999999998988888876543


No 8  
>KOG1421 consensus Predicted signaling-associated protein (contains a PDZ domain) [General function prediction only]
Probab=99.91  E-value=1.1e-22  Score=206.88  Aligned_cols=295  Identities=27%  Similarity=0.367  Sum_probs=233.9

Q ss_pred             hHHHHHHHhCCceEEEEEeeeccCccccccccCcCeEEEEEEEcCC-CEEEecccccCCC-CeEEEEecCCcEEeeEEEE
Q 013804          117 ATVRLFQENTPSVVNITNLAARQDAFTLDVLEVPQGSGSGFVWDSK-GHVVTNYHVIRGA-SDIRVTFADQSAYDAKIVG  194 (436)
Q Consensus       117 ~~~~~~~~~~~sVV~I~~~~~~~~~~~~~~~~~~~~~GSGfiI~~~-G~ILT~aHvv~~~-~~i~V~~~dg~~~~a~vv~  194 (436)
                      .....+..+-++||.|......-  |  +....+.+.+|||++++. ||||||+||+... -...+.+.+..+.+.-.++
T Consensus        53 ~w~~~ia~VvksvVsI~~S~v~~--f--dtesag~~~atgfvvd~~~gyiLtnrhvv~pgP~va~avf~n~ee~ei~pvy  128 (955)
T KOG1421|consen   53 DWRNTIANVVKSVVSIRFSAVRA--F--DTESAGESEATGFVVDKKLGYILTNRHVVAPGPFVASAVFDNHEEIEIYPVY  128 (955)
T ss_pred             hhhhhhhhhcccEEEEEehheee--c--ccccccccceeEEEEecccceEEEeccccCCCCceeEEEecccccCCccccc
Confidence            55667889999999998754321  1  223345678999999976 8999999999754 4567888888888888999


Q ss_pred             EcCCCCeEEEEEcCCC---CCCcceecCCCCCCCCCCEEEEEecCCCCCCceeEeEEeeeeeeeccCCC---CCCcccEE
Q 013804          195 FDQDKDVAVLRIDAPK---DKLRPIPIGVSADLLVGQKVYAIGNPFGLDHTLTTGVISGLRREISSAAT---GRPIQDVI  268 (436)
Q Consensus       195 ~d~~~DlAlLkv~~~~---~~~~~~~l~~~~~~~~G~~V~~vG~p~g~~~~~~~G~vs~~~~~~~~~~~---~~~~~~~i  268 (436)
                      .|+.+|+.+++.+...   ..+..+.++. +-.++|.+++++|+..+.-.+...|.++.+.+....+..   +.....++
T Consensus       129 rDpVhdfGf~r~dps~ir~s~vt~i~lap-~~akvgseirvvgNDagEklsIlagflSrldr~apdyg~~~yndfnTfy~  207 (955)
T KOG1421|consen  129 RDPVHDFGFFRYDPSTIRFSIVTEICLAP-ELAKVGSEIRVVGNDAGEKLSILAGFLSRLDRNAPDYGEDTYNDFNTFYI  207 (955)
T ss_pred             CCchhhcceeecChhhcceeeeeccccCc-cccccCCceEEecCCccceEEeehhhhhhccCCCccccccccccccceee
Confidence            9999999999998643   1233444432 335789999999998888888999999988887766532   11224567


Q ss_pred             EEccccCCCCCCCeEECCCCcEEEEEeeeecCCCCCCcceeeeeeeccchhhhhccccceecceecceeeec--chhhhh
Q 013804          269 QTDAAINPGNSGGPLLDSSGSLIGINTAIYSPSGASSGVGFSIPVDTVNGIVDQLVKFGKVTRPILGIKFAP--DQSVEQ  346 (436)
Q Consensus       269 ~~~~~i~~G~SGGPlvd~~G~VVGI~s~~~~~~~~~~~~~~aIP~~~i~~~l~~l~~~g~v~~~~lGv~~~~--~~~~~~  346 (436)
                      |.-+....|.||+|++|.+|..|.++..+..    ..+.+|++|++.+.+-+.-++++..++|+.|-++|.+  .+.+++
T Consensus       208 QaasstsggssgspVv~i~gyAVAl~agg~~----ssas~ffLpLdrV~RaL~clq~n~PItRGtLqvefl~k~~de~rr  283 (955)
T KOG1421|consen  208 QAASSTSGGSSGSPVVDIPGYAVALNAGGSI----SSASDFFLPLDRVVRALRCLQNNTPITRGTLQVEFLHKLFDECRR  283 (955)
T ss_pred             eehhcCCCCCCCCceecccceEEeeecCCcc----cccccceeeccchhhhhhhhhcCCCcccceEEEEEehhhhHHHHh
Confidence            7778888999999999999999999887654    3456799999999999999998889999999999986  334666


Q ss_pred             cCc---------------cceE-EEecCCCCcccccCcccccccccCcccCCcEEEEECCEEeCCHHHHHHHHhcCCCCC
Q 013804          347 LGV---------------SGVL-VLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQCKVGD  410 (436)
Q Consensus       347 ~g~---------------~gv~-V~~v~~~spa~~agl~~~~~~~~~~l~~GDiIl~vnG~~V~s~~dl~~~l~~~~~g~  410 (436)
                      +|+               .|++ |..+.+++||++. |++           ||++++||+.-+.++.++.+.|.+ ..|+
T Consensus       284 lGL~sE~eqv~r~k~P~~tgmLvV~~vL~~gpa~k~-Le~-----------GDillavN~t~l~df~~l~~iLDe-gvgk  350 (955)
T KOG1421|consen  284 LGLSSEWEQVVRTKFPERTGMLVVETVLPEGPAEKK-LEP-----------GDILLAVNSTCLNDFEALEQILDE-GVGK  350 (955)
T ss_pred             cCCcHHHHHHHHhcCcccceeEEEEEeccCCchhhc-cCC-----------CcEEEEEcceehHHHHHHHHHHhh-ccCc
Confidence            664               4554 5567788887654 444           999999999999999999999988 5899


Q ss_pred             EEEEEEEECCEEEEEEEEeecCC
Q 013804          411 EVIVEVLRGDQKEKIPVKLEPKP  433 (436)
Q Consensus       411 ~v~l~v~R~g~~~~~~v~~~~~~  433 (436)
                      .++|+|+|+|++.+++++...+.
T Consensus       351 ~l~LtI~Rggqelel~vtvqdlh  373 (955)
T KOG1421|consen  351 NLELTIQRGGQELELTVTVQDLH  373 (955)
T ss_pred             eEEEEEEeCCEEEEEEEEecccc
Confidence            99999999999999999887553


No 9  
>PF13365 Trypsin_2:  Trypsin-like peptidase domain; PDB: 1Y8T_A 2Z9I_A 3QO6_A 1L1J_A 1QY6_A 2O8L_A 3OTP_E 2ZLE_I 1KY9_A 3CS0_A ....
Probab=99.68  E-value=4.9e-16  Score=130.90  Aligned_cols=109  Identities=38%  Similarity=0.594  Sum_probs=74.8

Q ss_pred             EEEEEEcCCCEEEecccccC--------CCCeEEEEecCCcEEe--eEEEEEcCC-CCeEEEEEcCCCCCCcceecCCCC
Q 013804          154 GSGFVWDSKGHVVTNYHVIR--------GASDIRVTFADQSAYD--AKIVGFDQD-KDVAVLRIDAPKDKLRPIPIGVSA  222 (436)
Q Consensus       154 GSGfiI~~~G~ILT~aHvv~--------~~~~i~V~~~dg~~~~--a~vv~~d~~-~DlAlLkv~~~~~~~~~~~l~~~~  222 (436)
                      ||||+|+++|+||||+||+.        ....+.+...+++.+.  ++++..++. +|+|||+++.              
T Consensus         1 GTGf~i~~~g~ilT~~Hvv~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~D~All~v~~--------------   66 (120)
T PF13365_consen    1 GTGFLIGPDGYILTAAHVVEDWNDGKQPDNSSVEVVFPDGRRVPPVAEVVYFDPDDYDLALLKVDP--------------   66 (120)
T ss_dssp             EEEEEEETTTEEEEEHHHHTCCTT--G-TCSEEEEEETTSCEEETEEEEEEEETT-TTEEEEEESC--------------
T ss_pred             CEEEEEcCCceEEEchhheecccccccCCCCEEEEEecCCCEEeeeEEEEEECCccccEEEEEEec--------------
Confidence            89999999999999999998        4567888888988888  999999999 9999999970              


Q ss_pred             CCCCCCEEEEEecCCCCCCceeEeEEeeeeeeeccCCCCCCcccEEEEccccCCCCCCCeEECCCCcEEEE
Q 013804          223 DLLVGQKVYAIGNPFGLDHTLTTGVISGLRREISSAATGRPIQDVIQTDAAINPGNSGGPLLDSSGSLIGI  293 (436)
Q Consensus       223 ~~~~G~~V~~vG~p~g~~~~~~~G~vs~~~~~~~~~~~~~~~~~~i~~~~~i~~G~SGGPlvd~~G~VVGI  293 (436)
                      ....+......            ............    ......+ +++.+.+|+|||||||.+|+||||
T Consensus        67 ~~~~~~~~~~~------------~~~~~~~~~~~~----~~~~~~~-~~~~~~~G~SGgpv~~~~G~vvGi  120 (120)
T PF13365_consen   67 WTGVGGGVRVP------------GSTSGVSPTSTN----DNRMLYI-TDADTRPGSSGGPVFDSDGRVVGI  120 (120)
T ss_dssp             EEEEEEEEEEE------------EEEEEEEEEEEE----ETEEEEE-ESSS-STTTTTSEEEETTSEEEEE
T ss_pred             ccceeeeeEee------------eeccccccccCc----ccceeEe-eecccCCCcEeHhEECCCCEEEeC
Confidence            00000000000            000000000000    0001114 799999999999999999999997


No 10 
>PF13180 PDZ_2:  PDZ domain; PDB: 2L97_A 1Y8T_A 2Z9I_A 1LCY_A 2PZD_B 2P3W_A 1VCW_C 1TE0_B 1SOZ_C 1SOT_C ....
Probab=99.60  E-value=7.5e-15  Score=116.29  Aligned_cols=82  Identities=40%  Similarity=0.607  Sum_probs=72.6

Q ss_pred             eecceeeecchhhhhcCccceEEEecCCCCcccccCcccccccccCcccCCcEEEEECCEEeCCHHHHHHHHhcCCCCCE
Q 013804          332 PILGIKFAPDQSVEQLGVSGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQCKVGDE  411 (436)
Q Consensus       332 ~~lGv~~~~~~~~~~~g~~gv~V~~v~~~spa~~agl~~~~~~~~~~l~~GDiIl~vnG~~V~s~~dl~~~l~~~~~g~~  411 (436)
                      ||||+.+.....     ..|++|.+|.++|||+++||++           ||+|++|||++|+++.|+..++....+|++
T Consensus         1 ~~lGv~~~~~~~-----~~g~~V~~V~~~spA~~aGl~~-----------GD~I~~ing~~v~~~~~~~~~l~~~~~g~~   64 (82)
T PF13180_consen    1 GGLGVTVQNLSD-----TGGVVVVSVIPGSPAAKAGLQP-----------GDIILAINGKPVNSSEDLVNILSKGKPGDT   64 (82)
T ss_dssp             -E-SEEEEECSC-----SSSEEEEEESTTSHHHHTTS-T-----------TEEEEEETTEESSSHHHHHHHHHCSSTTSE
T ss_pred             CEECeEEEEccC-----CCeEEEEEeCCCCcHHHCCCCC-----------CcEEEEECCEEcCCHHHHHHHHHhCCCCCE
Confidence            689999987542     2599999999999999999999           999999999999999999999988899999


Q ss_pred             EEEEEEECCEEEEEEEEe
Q 013804          412 VIVEVLRGDQKEKIPVKL  429 (436)
Q Consensus       412 v~l~v~R~g~~~~~~v~~  429 (436)
                      ++++|.|+|+.+++++++
T Consensus        65 v~l~v~R~g~~~~~~v~l   82 (82)
T PF13180_consen   65 VTLTVLRDGEELTVEVTL   82 (82)
T ss_dssp             EEEEEEETTEEEEEEEE-
T ss_pred             EEEEEEECCEEEEEEEEC
Confidence            999999999999999875


No 11 
>KOG1421 consensus Predicted signaling-associated protein (contains a PDZ domain) [General function prediction only]
Probab=99.56  E-value=6.1e-14  Score=143.82  Aligned_cols=296  Identities=19%  Similarity=0.238  Sum_probs=196.2

Q ss_pred             HHhCCceEEEEEeeeccCccccccccCcCeEEEEEEEcCC-CEEEecccccC-CCCeEEEEecCCcEEeeEEEEEcCCCC
Q 013804          123 QENTPSVVNITNLAARQDAFTLDVLEVPQGSGSGFVWDSK-GHVVTNYHVIR-GASDIRVTFADQSAYDAKIVGFDQDKD  200 (436)
Q Consensus       123 ~~~~~sVV~I~~~~~~~~~~~~~~~~~~~~~GSGfiI~~~-G~ILT~aHvv~-~~~~i~V~~~dg~~~~a~vv~~d~~~D  200 (436)
                      ++...+.|.+....    ++.+++.......|||.|++.+ |++++...++. +..+.+|.+.|.-.++|.+.+.|+..+
T Consensus       525 ~~i~~~~~~v~~~~----~~~l~g~s~~i~kgt~~i~d~~~g~~vvsr~~vp~d~~d~~vt~~dS~~i~a~~~fL~~t~n  600 (955)
T KOG1421|consen  525 ADISNCLVDVEPMM----PVNLDGVSSDIYKGTALIMDTSKGLGVVSRSVVPSDAKDQRVTEADSDGIPANVSFLHPTEN  600 (955)
T ss_pred             hHHhhhhhhheece----eeccccchhhhhcCceEEEEccCCceeEecccCCchhhceEEeecccccccceeeEecCccc
Confidence            34445555554422    2334444444568999999865 89999999986 567899999999999999999999999


Q ss_pred             eEEEEEcCCCCCCcceecCCCCCCCCCCEEEEEecCCCCCCceeEeEEee---eeeeeccCC-CCCCcccEEEEccccCC
Q 013804          201 VAVLRIDAPKDKLRPIPIGVSADLLVGQKVYAIGNPFGLDHTLTTGVISG---LRREISSAA-TGRPIQDVIQTDAAINP  276 (436)
Q Consensus       201 lAlLkv~~~~~~~~~~~l~~~~~~~~G~~V~~vG~p~g~~~~~~~G~vs~---~~~~~~~~~-~~~~~~~~i~~~~~i~~  276 (436)
                      +|.+|.+...  ...+.|. ...+..|+++...|+......-.....+..   +........ ......+.|.+++.+.-
T Consensus       601 ~a~~kydp~~--~~~~kl~-~~~v~~gD~~~f~g~~~~~r~ltaktsv~dvs~~~~ps~~~pr~r~~n~e~Is~~~nlsT  677 (955)
T KOG1421|consen  601 VASFKYDPAL--EVQLKLT-DTTVLRGDECTFEGFTEDLRALTAKTSVTDVSVVIIPSSVMPRFRATNLEVISFMDNLST  677 (955)
T ss_pred             eeEeccChhH--hhhhccc-eeeEecCCceeEecccccchhhcccceeeeeEEEEecCCCCcceeecceEEEEEeccccc
Confidence            9999998632  3445553 345788999999998755432111111111   111111111 11222466777776666


Q ss_pred             CCCCCeEECCCCcEEEEEeeeecCC--CCCCcceeeeeeeccchhhhhccccceecceecceeeecc--hhhhhcCccce
Q 013804          277 GNSGGPLLDSSGSLIGINTAIYSPS--GASSGVGFSIPVDTVNGIVDQLVKFGKVTRPILGIKFAPD--QSVEQLGVSGV  352 (436)
Q Consensus       277 G~SGGPlvd~~G~VVGI~s~~~~~~--~~~~~~~~aIP~~~i~~~l~~l~~~g~v~~~~lGv~~~~~--~~~~~~g~~gv  352 (436)
                      ++--|-+.|.+|+|+|+.-...+..  +.....-|.+.+.++++.++.|+..+......+|++|...  ..++.+|+.--
T Consensus       678 ~c~sg~ltdddg~vvalwl~~~ge~~~~kd~~y~~gl~~~~~l~vl~rlk~g~~~rp~i~~vef~~i~laqar~lglp~e  757 (955)
T KOG1421|consen  678 SCLSGRLTDDDGEVVALWLSVVGEDVGGKDYTYKYGLSMSYILPVLERLKLGPSARPTIAGVEFSHITLAQARTLGLPSE  757 (955)
T ss_pred             cccceEEECCCCeEEEEEeeeeccccCCceeEEEeccchHHHHHHHHHHhcCCCCCceeeccceeeEEeehhhccCCCHH
Confidence            6666778999999999976555442  2233456778899999999999998888777888888752  24555554333


Q ss_pred             EEEecCCCCcccccCccc--ccccccCcccCCcEEEEECCEEeCCHHHHHHHHhcCCCCCEEEEEEEECCEEEEEEEEee
Q 013804          353 LVLDAPPNGPAGKAGLLS--TKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQCKVGDEVIVEVLRGDQKEKIPVKLE  430 (436)
Q Consensus       353 ~V~~v~~~spa~~agl~~--~~~~~~~~l~~GDiIl~vnG~~V~s~~dl~~~l~~~~~g~~v~l~v~R~g~~~~~~v~~~  430 (436)
                      ++.+...++...+.-+..  .+...+..|..||+|+++||+.|+...||.+..       .++.+|.|||.++++++++-
T Consensus       758 ~imk~e~es~~~~ql~~ishv~~~~~kil~~gdiilsvngk~itr~~dl~d~~-------eid~~ilrdg~~~~ikipt~  830 (955)
T KOG1421|consen  758 FIMKSEEESTIPRQLYVISHVRPLLHKILGVGDIILSVNGKMITRLSDLHDFE-------EIDAVILRDGIEMEIKIPTY  830 (955)
T ss_pred             HHhhhhhcCCCcceEEEEEeeccCcccccccccEEEEecCeEEeeehhhhhhh-------hhheeeeecCcEEEEEeccc
Confidence            333333333222221111  111223446679999999999999999998632       47899999999999998875


Q ss_pred             cC
Q 013804          431 PK  432 (436)
Q Consensus       431 ~~  432 (436)
                      +.
T Consensus       831 p~  832 (955)
T KOG1421|consen  831 PE  832 (955)
T ss_pred             cc
Confidence            43


No 12 
>PF00089 Trypsin:  Trypsin;  InterPro: IPR001254 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold. Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity.
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base. The geometric orientations of the catalytic residues are similar between families, despite different protein folds. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC). This group of serine proteases belong to the MEROPS peptidase family S1 (chymotrypsin family, clan PA(S)) and to peptidase family S6 (Hap serine peptidases). The chymotrypsin family is almost totally confined to animals, although trypsin-like enzymes are found in actinomycetes of the genera Streptomyces and Saccharopolyspora, and in the fungus Fusarium oxysporum. The enzymes are inherently secreted, being synthesised with a signal peptide that targets them to the secretory pathway. Animal enzymes are either secreted directly, packaged into vesicles for regulated secretion, or are retained in leukocyte granules. The Hap family, 'Haemophilus adhesion and penetration', are proteins that play a role in the interaction with human epithelial cells. The serine protease activity is localized at the N-terminal domain, whereas the binding domain is in the C-terminal region.
; GO: 0004252 serine-type endopeptidase activity, 0006508 proteolysis; PDB: 1SPJ_A 1A5I_A 2ZGH_A 2ZKS_A 2ZGJ_A 2ZGC_A 2ODP_A 2I6Q_A 2I6S_A 2ODQ_A ....
Probab=99.54  E-value=5.3e-13  Score=123.87  Aligned_cols=168  Identities=23%  Similarity=0.351  Sum_probs=112.2

Q ss_pred             CeEEEEEEEcCCCEEEecccccCCCCeEEEEecC-------C--cEEeeEEEEEc----C---CCCeEEEEEcCC---CC
Q 013804          151 QGSGSGFVWDSKGHVVTNYHVIRGASDIRVTFAD-------Q--SAYDAKIVGFD----Q---DKDVAVLRIDAP---KD  211 (436)
Q Consensus       151 ~~~GSGfiI~~~G~ILT~aHvv~~~~~i~V~~~d-------g--~~~~a~vv~~d----~---~~DlAlLkv~~~---~~  211 (436)
                      ...|+|++|+++ +|||++||+.+...+.+.+..       +  ..+..+.+..+    .   .+|+|||+++.+   ..
T Consensus        24 ~~~C~G~li~~~-~vLTaahC~~~~~~~~v~~g~~~~~~~~~~~~~~~v~~~~~h~~~~~~~~~~DiAll~L~~~~~~~~  102 (220)
T PF00089_consen   24 RFFCTGTLISPR-WVLTAAHCVDGASDIKVRLGTYSIRNSDGSEQTIKVSKIIIHPKYDPSTYDNDIALLKLDRPITFGD  102 (220)
T ss_dssp             EEEEEEEEEETT-EEEEEGGGHTSGGSEEEEESESBTTSTTTTSEEEEEEEEEEETTSBTTTTTTSEEEEEESSSSEHBS
T ss_pred             CeeEeEEecccc-ccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccc
Confidence            468999999987 999999999996667665532       2  24555544343    2   469999999987   34


Q ss_pred             CCcceecCCC-CCCCCCCEEEEEecCCCCCC----ceeEeEEeeeeeeeccC-CCCCCcccEEEEcc----ccCCCCCCC
Q 013804          212 KLRPIPIGVS-ADLLVGQKVYAIGNPFGLDH----TLTTGVISGLRREISSA-ATGRPIQDVIQTDA----AINPGNSGG  281 (436)
Q Consensus       212 ~~~~~~l~~~-~~~~~G~~V~~vG~p~g~~~----~~~~G~vs~~~~~~~~~-~~~~~~~~~i~~~~----~i~~G~SGG  281 (436)
                      .+.++.+... ..+..|+.+.++||+.....    ......+..+....... .........+....    ..+.|+|||
T Consensus       103 ~~~~~~l~~~~~~~~~~~~~~~~G~~~~~~~~~~~~~~~~~~~~~~~~~c~~~~~~~~~~~~~c~~~~~~~~~~~g~sG~  182 (220)
T PF00089_consen  103 NIQPICLPSAGSDPNVGTSCIVVGWGRTSDNGYSSNLQSVTVPVVSRKTCRSSYNDNLTPNMICAGSSGSGDACQGDSGG  182 (220)
T ss_dssp             SBEESBBTSTTHTTTTTSEEEEEESSBSSTTSBTSBEEEEEEEEEEHHHHHHHTTTTSTTTEEEEETTSSSBGGTTTTTS
T ss_pred             cccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccc
Confidence            5677777652 33578999999999975332    34444443333321111 00111234555554    789999999


Q ss_pred             eEECCCCcEEEEEeeeecCCCCCCcceeeeeeeccchhh
Q 013804          282 PLLDSSGSLIGINTAIYSPSGASSGVGFSIPVDTVNGIV  320 (436)
Q Consensus       282 Plvd~~G~VVGI~s~~~~~~~~~~~~~~aIP~~~i~~~l  320 (436)
                      ||++.++.|+||++.. ..++......++.++..+.+|+
T Consensus       183 pl~~~~~~lvGI~s~~-~~c~~~~~~~v~~~v~~~~~WI  220 (220)
T PF00089_consen  183 PLICNNNYLVGIVSFG-ENCGSPNYPGVYTRVSSYLDWI  220 (220)
T ss_dssp             EEEETTEEEEEEEEEE-SSSSBTTSEEEEEEGGGGHHHH
T ss_pred             ccccceeeecceeeec-CCCCCCCcCEEEEEHHHhhccC
Confidence            9998776799999988 4444343467888888877764


No 13 
>KOG1320 consensus Serine protease [Posttranslational modification, protein turnover, chaperones]
Probab=99.43  E-value=4.7e-13  Score=135.77  Aligned_cols=275  Identities=20%  Similarity=0.224  Sum_probs=193.7

Q ss_pred             HHhCCceEEEEEeeeccCcccccccc-CcCeEEEEEEEcCCCEEEecccccC---CCCeEEEEe-cCCcEEeeEEEEEcC
Q 013804          123 QENTPSVVNITNLAARQDAFTLDVLE-VPQGSGSGFVWDSKGHVVTNYHVIR---GASDIRVTF-ADQSAYDAKIVGFDQ  197 (436)
Q Consensus       123 ~~~~~sVV~I~~~~~~~~~~~~~~~~-~~~~~GSGfiI~~~G~ILT~aHvv~---~~~~i~V~~-~dg~~~~a~vv~~d~  197 (436)
                      +....+++.+............|... .....|+||.+... .++|++|++.   +...+.+.- +.-+.|.+++...-.
T Consensus        57 ~~~~~s~~~v~~~~~~~~~~~pw~~~~q~~~~~s~f~i~~~-~lltn~~~v~~~~~~~~v~v~~~gs~~k~~~~v~~~~~  135 (473)
T KOG1320|consen   57 DLALQSVVKVFSVSTEPSSVLPWQRTRQFSSGGSGFAIYGK-KLLTNAHVVAPNNDHKFVTVKKHGSPRKYKAFVAAVFE  135 (473)
T ss_pred             cccccceeEEEeecccccccCcceeeehhcccccchhhccc-ceeecCccccccccccccccccCCCchhhhhhHHHhhh
Confidence            34466788887766555443333332 33567999999754 8999999998   555555542 234678899988889


Q ss_pred             CCCeEEEEEcCCC--CCCcceecCCCCCCCCCCEEEEEecCCCCCCceeEeEEeeeeeeeccCCCCCCcccEEEEccccC
Q 013804          198 DKDVAVLRIDAPK--DKLRPIPIGVSADLLVGQKVYAIGNPFGLDHTLTTGVISGLRREISSAATGRPIQDVIQTDAAIN  275 (436)
Q Consensus       198 ~~DlAlLkv~~~~--~~~~~~~l~~~~~~~~G~~V~~vG~p~g~~~~~~~G~vs~~~~~~~~~~~~~~~~~~i~~~~~i~  275 (436)
                      ++|+|++.++...  ....++.+.  +-+...+.++++|   +....++.|+|.........  ........+++++.++
T Consensus       136 ~cd~Avv~Ie~~~f~~~~~~~e~~--~ip~l~~S~~Vv~---gd~i~VTnghV~~~~~~~y~--~~~~~l~~vqi~aa~~  208 (473)
T KOG1320|consen  136 ECDLAVVYIESEEFWKGMNPFELG--DIPSLNGSGFVVG---GDGIIVTNGHVVRVEPRIYA--HSSTVLLRVQIDAAIG  208 (473)
T ss_pred             cccceEEEEeeccccCCCcccccC--CCcccCccEEEEc---CCcEEEEeeEEEEEEecccc--CCCcceeeEEEEEeec
Confidence            9999999998643  122233443  3355668899998   67789999999877654322  2223345689999999


Q ss_pred             CCCCCCeEECCCCcEEEEEeeeecCCCCCCcceeeeeeeccchhhhhcccccee-cceecceeeecch---hhhhcCc--
Q 013804          276 PGNSGGPLLDSSGSLIGINTAIYSPSGASSGVGFSIPVDTVNGIVDQLVKFGKV-TRPILGIKFAPDQ---SVEQLGV--  349 (436)
Q Consensus       276 ~G~SGGPlvd~~G~VVGI~s~~~~~~~~~~~~~~aIP~~~i~~~l~~l~~~g~v-~~~~lGv~~~~~~---~~~~~g~--  349 (436)
                      +|+||+|.+...+++.|+........   ..+++.||.-.+.+|.......+.. .+++++...+...   ..+.+.+  
T Consensus       209 ~~~s~ep~i~g~d~~~gvA~l~ik~~---~~i~~~i~~~~~~~~~~G~~~~a~~~~f~~~nt~t~g~vs~~~R~~~~lg~  285 (473)
T KOG1320|consen  209 PGNSGEPVIVGVDKVAGVAFLKIKTP---ENILYVIPLGVSSHFRTGVEVSAIGNGFGLLNTLTQGMVSGQLRKSFKLGL  285 (473)
T ss_pred             CCccCCCeEEccccccceEEEEEecC---CcccceeecceeeeecccceeeccccCceeeeeeeecccccccccccccCc
Confidence            99999999987789999988776432   2678999999999999877666654 3566666555322   2233322  


Q ss_pred             -cceEEEecCCCCcccccCcccccccccCcccCCcEEEEECCEEeCC-H-----HHHHHHHhcCCCCCEEEEEEEECC
Q 013804          350 -SGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSN-G-----SDLYRILDQCKVGDEVIVEVLRGD  420 (436)
Q Consensus       350 -~gv~V~~v~~~spa~~agl~~~~~~~~~~l~~GDiIl~vnG~~V~s-~-----~dl~~~l~~~~~g~~v~l~v~R~g  420 (436)
                       .|+.+.++.+-+.|-+. ++.           ||.|+++||+.|-- +     -.+...+....++|++.+.+.|.+
T Consensus       286 ~~g~~i~~~~qtd~ai~~-~ns-----------g~~ll~~DG~~IgVn~~~~~ri~~~~~iSf~~p~d~vl~~v~r~~  351 (473)
T KOG1320|consen  286 ETGVLISKINQTDAAINP-GNS-----------GGPLLNLDGEVIGVNTRKVTRIGFSHGISFKIPIDTVLVIVLRLG  351 (473)
T ss_pred             ccceeeeeecccchhhhc-ccC-----------CCcEEEecCcEeeeeeeeeEEeeccccceeccCchHhhhhhhhhh
Confidence             57999999887776554 333           99999999998841 1     123345566678999999999987


No 14 
>cd00190 Tryp_SPc Trypsin-like serine protease; Many of these are synthesized as inactive precursor zymogens that are cleaved during limited proteolysis to generate their active forms. Alignment contains also inactive enzymes that have substitutions of the catalytic triad residues.
Probab=99.42  E-value=5.1e-12  Score=118.11  Aligned_cols=170  Identities=21%  Similarity=0.264  Sum_probs=104.0

Q ss_pred             CeEEEEEEEcCCCEEEecccccCCC--CeEEEEecC---------CcEEeeEEEEEcC-------CCCeEEEEEcCCC--
Q 013804          151 QGSGSGFVWDSKGHVVTNYHVIRGA--SDIRVTFAD---------QSAYDAKIVGFDQ-------DKDVAVLRIDAPK--  210 (436)
Q Consensus       151 ~~~GSGfiI~~~G~ILT~aHvv~~~--~~i~V~~~d---------g~~~~a~vv~~d~-------~~DlAlLkv~~~~--  210 (436)
                      ...|+|++|+++ +|||+|||+.+.  ..+.|.++.         ...+.++-+..++       .+|||||+++.+.  
T Consensus        24 ~~~C~GtlIs~~-~VLTaAhC~~~~~~~~~~v~~g~~~~~~~~~~~~~~~v~~~~~hp~y~~~~~~~DiAll~L~~~~~~  102 (232)
T cd00190          24 RHFCGGSLISPR-WVLTAAHCVYSSAPSNYTVRLGSHDLSSNEGGGQVIKVKKVIVHPNYNPSTYDNDIALLKLKRPVTL  102 (232)
T ss_pred             cEEEEEEEeeCC-EEEECHHhcCCCCCccEEEEeCcccccCCCCceEEEEEEEEEECCCCCCCCCcCCEEEEEECCcccC
Confidence            568999999987 999999999875  566666642         2234444455553       5799999998653  


Q ss_pred             -CCCcceecCCCC-CCCCCCEEEEEecCCCCCC-----ceeEeEEeeeeeeeccCCCC---CCcccEEEE-----ccccC
Q 013804          211 -DKLRPIPIGVSA-DLLVGQKVYAIGNPFGLDH-----TLTTGVISGLRREISSAATG---RPIQDVIQT-----DAAIN  275 (436)
Q Consensus       211 -~~~~~~~l~~~~-~~~~G~~V~~vG~p~g~~~-----~~~~G~vs~~~~~~~~~~~~---~~~~~~i~~-----~~~i~  275 (436)
                       ..+.|+.|.... .+..|+.+.++||+.....     ......+..+....+.....   ......+..     ....|
T Consensus       103 ~~~v~picl~~~~~~~~~~~~~~~~G~g~~~~~~~~~~~~~~~~~~~~~~~~C~~~~~~~~~~~~~~~C~~~~~~~~~~c  182 (232)
T cd00190         103 SDNVRPICLPSSGYNLPAGTTCTVSGWGRTSEGGPLPDVLQEVNVPIVSNAECKRAYSYGGTITDNMLCAGGLEGGKDAC  182 (232)
T ss_pred             CCcccceECCCccccCCCCCEEEEEeCCcCCCCCCCCceeeEEEeeeECHHHhhhhccCcccCCCceEeeCCCCCCCccc
Confidence             236777776543 5678999999999765332     12222222221111110000   000111211     44578


Q ss_pred             CCCCCCeEECCC---CcEEEEEeeeecCCCCCCcceeeeeeeccchhhhh
Q 013804          276 PGNSGGPLLDSS---GSLIGINTAIYSPSGASSGVGFSIPVDTVNGIVDQ  322 (436)
Q Consensus       276 ~G~SGGPlvd~~---G~VVGI~s~~~~~~~~~~~~~~aIP~~~i~~~l~~  322 (436)
                      .|+|||||+...   +.++||.+++.. |+.......+..+...++|+++
T Consensus       183 ~gdsGgpl~~~~~~~~~lvGI~s~g~~-c~~~~~~~~~t~v~~~~~WI~~  231 (232)
T cd00190         183 QGDSGGPLVCNDNGRGVLVGIVSWGSG-CARPNYPGVYTRVSSYLDWIQK  231 (232)
T ss_pred             cCCCCCcEEEEeCCEEEEEEEEehhhc-cCCCCCCCEEEEcHHhhHHhhc
Confidence            999999999654   789999998754 4422334455556666666653


No 15 
>smart00020 Tryp_SPc Trypsin-like serine protease. Many of these are synthesised as inactive precursor zymogens that are cleaved during limited proteolysis to generate their active forms. A few, however, are active as single chain molecules, and others are inactive due to substitutions of the catalytic triad residues.
Probab=99.34  E-value=2.7e-11  Score=113.42  Aligned_cols=167  Identities=22%  Similarity=0.284  Sum_probs=100.3

Q ss_pred             CeEEEEEEEcCCCEEEecccccCCCC--eEEEEecCC--------cEEeeEEEEEc-------CCCCeEEEEEcCCC---
Q 013804          151 QGSGSGFVWDSKGHVVTNYHVIRGAS--DIRVTFADQ--------SAYDAKIVGFD-------QDKDVAVLRIDAPK---  210 (436)
Q Consensus       151 ~~~GSGfiI~~~G~ILT~aHvv~~~~--~i~V~~~dg--------~~~~a~vv~~d-------~~~DlAlLkv~~~~---  210 (436)
                      ...|+|++|+++ +|||+|||+.+..  .+.|.+...        ..+.+.-+..+       ..+|+|||+++.+.   
T Consensus        25 ~~~C~GtlIs~~-~VLTaahC~~~~~~~~~~v~~g~~~~~~~~~~~~~~v~~~~~~p~~~~~~~~~DiAll~L~~~i~~~  103 (229)
T smart00020       25 RHFCGGSLISPR-WVLTAAHCVYGSDPSNIRVRLGSHDLSSGEEGQVIKVSKVIIHPNYNPSTYDNDIALLKLKSPVTLS  103 (229)
T ss_pred             CcEEEEEEecCC-EEEECHHHcCCCCCcceEEEeCcccCCCCCCceEEeeEEEEECCCCCCCCCcCCEEEEEECcccCCC
Confidence            568999999977 9999999998754  677777543        33444444433       35799999998762   


Q ss_pred             CCCcceecCCC-CCCCCCCEEEEEecCCCCC------CceeEeEEeeeeeeeccCCCCC---CcccEEE-----EccccC
Q 013804          211 DKLRPIPIGVS-ADLLVGQKVYAIGNPFGLD------HTLTTGVISGLRREISSAATGR---PIQDVIQ-----TDAAIN  275 (436)
Q Consensus       211 ~~~~~~~l~~~-~~~~~G~~V~~vG~p~g~~------~~~~~G~vs~~~~~~~~~~~~~---~~~~~i~-----~~~~i~  275 (436)
                      ..+.|+.|... ..+..++.+.+.||+....      .......+..+...........   .....+.     .....|
T Consensus       104 ~~~~pi~l~~~~~~~~~~~~~~~~g~g~~~~~~~~~~~~~~~~~~~~~~~~~C~~~~~~~~~~~~~~~C~~~~~~~~~~c  183 (229)
T smart00020      104 DNVRPICLPSSNYNVPAGTTCTVSGWGRTSEGAGSLPDTLQEVNVPIVSNATCRRAYSGGGAITDNMLCAGGLEGGKDAC  183 (229)
T ss_pred             CceeeccCCCcccccCCCCEEEEEeCCCCCCCCCcCCCEeeEEEEEEeCHHHhhhhhccccccCCCcEeecCCCCCCccc
Confidence            24667777543 3467789999999986542      1122222222211111100000   0011111     135678


Q ss_pred             CCCCCCeEECCCC--cEEEEEeeeecCCCCCCcceeeeeeeccchh
Q 013804          276 PGNSGGPLLDSSG--SLIGINTAIYSPSGASSGVGFSIPVDTVNGI  319 (436)
Q Consensus       276 ~G~SGGPlvd~~G--~VVGI~s~~~~~~~~~~~~~~aIP~~~i~~~  319 (436)
                      +|+||||++...+  .++||.+.+. .|+.......+..+....+|
T Consensus       184 ~gdsG~pl~~~~~~~~l~Gi~s~g~-~C~~~~~~~~~~~i~~~~~W  228 (229)
T smart00020      184 QGDSGGPLVCNDGRWVLVGIVSWGS-GCARPGKPGVYTRVSSYLDW  228 (229)
T ss_pred             CCCCCCeeEEECCCEEEEEEEEECC-CCCCCCCCCEEEEecccccc
Confidence            9999999996543  8999999876 44433344445555444433


No 16 
>cd00991 PDZ_archaeal_metalloprotease PDZ domain of archaeal zinc metalloproteases, presumably membrane-associated or integral membrane proteases, which may be involved in signalling and regulatory mechanisms. May be responsible for substrate recognition and/or binding, as most PDZ domains bind C-terminal polypeptides, and binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of protease-associated PDZ domains a C-terminal beta-strand forms the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins.
Probab=99.29  E-value=2.4e-11  Score=95.50  Aligned_cols=68  Identities=29%  Similarity=0.423  Sum_probs=63.6

Q ss_pred             cceEEEecCCCCcccccCcccccccccCcccCCcEEEEECCEEeCCHHHHHHHHhcCCCCCEEEEEEEECCEEEEEEEE
Q 013804          350 SGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQCKVGDEVIVEVLRGDQKEKIPVK  428 (436)
Q Consensus       350 ~gv~V~~v~~~spa~~agl~~~~~~~~~~l~~GDiIl~vnG~~V~s~~dl~~~l~~~~~g~~v~l~v~R~g~~~~~~v~  428 (436)
                      .|++|..+.+++||+++||++           ||+|++|||++|.+++|+.+++....+|+.+.+++.|+|+.++++++
T Consensus        10 ~Gv~V~~V~~~spa~~aGL~~-----------GDiI~~Ing~~v~~~~d~~~~l~~~~~g~~v~l~v~r~g~~~~~~~~   77 (79)
T cd00991          10 AGVVIVGVIVGSPAENAVLHT-----------GDVIYSINGTPITTLEDFMEALKPTKPGEVITVTVLPSTTKLTNVST   77 (79)
T ss_pred             CcEEEEEECCCChHHhcCCCC-----------CCEEEEECCEEcCCHHHHHHHHhcCCCCCEEEEEEEECCEEEEEEEE
Confidence            699999999999999999999           99999999999999999999998866789999999999998887765


No 17 
>cd00986 PDZ_LON_protease PDZ domain of ATP-dependent LON serine proteases. Most PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this bacterial subfamily of protease-associated PDZ domains a C-terminal beta-strand is thought to form the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins.
Probab=99.27  E-value=4.4e-11  Score=93.85  Aligned_cols=72  Identities=29%  Similarity=0.393  Sum_probs=66.0

Q ss_pred             cceEEEecCCCCcccccCcccccccccCcccCCcEEEEECCEEeCCHHHHHHHHhcCCCCCEEEEEEEECCEEEEEEEEe
Q 013804          350 SGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQCKVGDEVIVEVLRGDQKEKIPVKL  429 (436)
Q Consensus       350 ~gv~V~~v~~~spa~~agl~~~~~~~~~~l~~GDiIl~vnG~~V~s~~dl~~~l~~~~~g~~v~l~v~R~g~~~~~~v~~  429 (436)
                      .|++|.+|.+++||+. ||++           ||+|++|||++|.+++++.+++....+|+.+.+++.|+|+.+++++++
T Consensus         8 ~Gv~V~~V~~~s~A~~-gL~~-----------GD~I~~Ing~~v~~~~~~~~~l~~~~~~~~v~l~v~r~g~~~~~~v~l   75 (79)
T cd00986           8 HGVYVTSVVEGMPAAG-KLKA-----------GDHIIAVDGKPFKEAEELIDYIQSKKEGDTVKLKVKREEKELPEDLIL   75 (79)
T ss_pred             cCEEEEEECCCCchhh-CCCC-----------CCEEEEECCEECCCHHHHHHHHHhCCCCCEEEEEEEECCEEEEEEEEE
Confidence            6899999999999986 7999           999999999999999999999987678899999999999999999998


Q ss_pred             ecCC
Q 013804          430 EPKP  433 (436)
Q Consensus       430 ~~~~  433 (436)
                      .+++
T Consensus        76 ~~~~   79 (79)
T cd00986          76 KTFP   79 (79)
T ss_pred             eccC
Confidence            7653


No 18 
>cd00987 PDZ_serine_protease PDZ domain of trypsin-like serine proteases, such as DegP/HtrA, which are oligomeric proteins involved in heat-shock response, chaperone function, and apoptosis. May be responsible for substrate recognition and/or binding, as most PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of protease-associated PDZ domains a C-terminal beta-strand forms the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins.
Probab=99.25  E-value=4.8e-11  Score=95.61  Aligned_cols=84  Identities=44%  Similarity=0.681  Sum_probs=70.9

Q ss_pred             eecceeeecchhh--hhcCc---cceEEEecCCCCcccccCcccccccccCcccCCcEEEEECCEEeCCHHHHHHHHhcC
Q 013804          332 PILGIKFAPDQSV--EQLGV---SGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQC  406 (436)
Q Consensus       332 ~~lGv~~~~~~~~--~~~g~---~gv~V~~v~~~spa~~agl~~~~~~~~~~l~~GDiIl~vnG~~V~s~~dl~~~l~~~  406 (436)
                      +|+|+.+++....  +.++.   .|++|.++.+++||+++||++           ||+|++|||++|.++.++.+++...
T Consensus         1 ~~~G~~~~~~~~~~~~~~~~~~~~g~~V~~v~~~s~a~~~gl~~-----------GD~I~~Ing~~i~~~~~~~~~l~~~   69 (90)
T cd00987           1 PWLGVTVQDLTPDLAEELGLKDTKGVLVASVDPGSPAAKAGLKP-----------GDVILAVNGKPVKSVADLRRALAEL   69 (90)
T ss_pred             CccceEEeECCHHHHHHcCCCCCCEEEEEEECCCCHHHHcCCCc-----------CCEEEEECCEECCCHHHHHHHHHhc
Confidence            5789998864422  22332   599999999999999999999           9999999999999999999999876


Q ss_pred             CCCCEEEEEEEECCEEEEEE
Q 013804          407 KVGDEVIVEVLRGDQKEKIP  426 (436)
Q Consensus       407 ~~g~~v~l~v~R~g~~~~~~  426 (436)
                      ..++.+.+++.|+|+.+++.
T Consensus        70 ~~~~~i~l~v~r~g~~~~~~   89 (90)
T cd00987          70 KPGDKVTLTVLRGGKELTVT   89 (90)
T ss_pred             CCCCEEEEEEEECCEEEEee
Confidence            66899999999999876654


No 19 
>TIGR01713 typeII_sec_gspC general secretion pathway protein C. This model represents GspC, protein C of the main terminal branch of the general secretion pathway, also called type II secretion. This system transports folded proteins across the bacterial outer membrane and is widely distributed in Gram-negative pathogens.
Probab=99.23  E-value=3.1e-11  Score=115.78  Aligned_cols=101  Identities=19%  Similarity=0.229  Sum_probs=89.8

Q ss_pred             eccchhhhhccccceecceecceeeecchhhhhcCccceEEEecCCCCcccccCcccccccccCcccCCcEEEEECCEEe
Q 013804          314 DTVNGIVDQLVKFGKVTRPILGIKFAPDQSVEQLGVSGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKV  393 (436)
Q Consensus       314 ~~i~~~l~~l~~~g~v~~~~lGv~~~~~~~~~~~g~~gv~V~~v~~~spa~~agl~~~~~~~~~~l~~GDiIl~vnG~~V  393 (436)
                      ..++++++++++++++.++|+|+......    -...|++|..+.++++++++||++           ||+|++|||+++
T Consensus       159 ~~~~~v~~~l~~~g~~~~~~lgi~p~~~~----g~~~G~~v~~v~~~s~a~~aGLr~-----------GDvIv~ING~~i  223 (259)
T TIGR01713       159 VVSRRIIEELTKDPQKMFDYIRLSPVMKN----DKLEGYRLNPGKDPSLFYKSGLQD-----------GDIAVALNGLDL  223 (259)
T ss_pred             hhHHHHHHHHHHCHHhhhheEeEEEEEeC----CceeEEEEEecCCCCHHHHcCCCC-----------CCEEEEECCEEc
Confidence            45678899999999999999999975432    114799999999999999999999           999999999999


Q ss_pred             CCHHHHHHHHhcCCCCCEEEEEEEECCEEEEEEEEe
Q 013804          394 SNGSDLYRILDQCKVGDEVIVEVLRGDQKEKIPVKL  429 (436)
Q Consensus       394 ~s~~dl~~~l~~~~~g~~v~l~v~R~g~~~~~~v~~  429 (436)
                      ++++++.+++...+++++++++|+|+|+.+++.+.+
T Consensus       224 ~~~~~~~~~l~~~~~~~~v~l~V~R~G~~~~i~v~~  259 (259)
T TIGR01713       224 RDPEQAFQALQMLREETNLTLTVERDGQREDIYVRF  259 (259)
T ss_pred             CCHHHHHHHHHhcCCCCeEEEEEEECCEEEEEEEEC
Confidence            999999999998888999999999999998888764


No 20 
>cd00990 PDZ_glycyl_aminopeptidase PDZ domain associated with archaeal and bacterial M61 glycyl-aminopeptidases. May be responsible for substrate recognition and/or binding, as most PDZ domains bind C-terminal polypeptides, and binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of protease-associated PDZ domains a C-terminal beta-strand is presumed to form the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins.
Probab=99.23  E-value=5.9e-11  Score=93.14  Aligned_cols=78  Identities=33%  Similarity=0.521  Sum_probs=66.3

Q ss_pred             eecceeeecchhhhhcCccceEEEecCCCCcccccCcccccccccCcccCCcEEEEECCEEeCCHHHHHHHHhcCCCCCE
Q 013804          332 PILGIKFAPDQSVEQLGVSGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQCKVGDE  411 (436)
Q Consensus       332 ~~lGv~~~~~~~~~~~g~~gv~V~~v~~~spa~~agl~~~~~~~~~~l~~GDiIl~vnG~~V~s~~dl~~~l~~~~~g~~  411 (436)
                      +|+|+.+....       .|++|.+|.+++||+++||++           ||+|++|||++|.++.++   +.....++.
T Consensus         1 ~~~G~~~~~~~-------~~~~V~~V~~~s~a~~aGl~~-----------GD~I~~Ing~~v~~~~~~---l~~~~~~~~   59 (80)
T cd00990           1 PYLGLTLDKEE-------GLGKVTFVRDDSPADKAGLVA-----------GDELVAVNGWRVDALQDR---LKEYQAGDP   59 (80)
T ss_pred             CcccEEEEccC-------CcEEEEEECCCChHHHhCCCC-----------CCEEEEECCEEhHHHHHH---HHhcCCCCE
Confidence            57899886543       579999999999999999999           999999999999986654   444457889


Q ss_pred             EEEEEEECCEEEEEEEEee
Q 013804          412 VIVEVLRGDQKEKIPVKLE  430 (436)
Q Consensus       412 v~l~v~R~g~~~~~~v~~~  430 (436)
                      +.+++.|+|+..++.+++.
T Consensus        60 v~l~v~r~g~~~~~~v~~~   78 (80)
T cd00990          60 VELTVFRDDRLIEVPLTLA   78 (80)
T ss_pred             EEEEEEECCEEEEEEEEec
Confidence            9999999999988888764


No 21 
>cd00989 PDZ_metalloprotease PDZ domain of bacterial and plant zinc metalloproteases, presumably membrane-associated or integral membrane proteases, which may be involved in signalling and regulatory mechanisms. May be responsible for substrate recognition and/or binding, as most PDZ domains bind C-terminal polypeptides, and binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of protease-associated PDZ domains a C-terminal beta-strand forms the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins.
Probab=99.16  E-value=2.3e-10  Score=89.38  Aligned_cols=77  Identities=27%  Similarity=0.421  Sum_probs=65.4

Q ss_pred             ecceeeecchhhhhcCccceEEEecCCCCcccccCcccccccccCcccCCcEEEEECCEEeCCHHHHHHHHhcCCCCCEE
Q 013804          333 ILGIKFAPDQSVEQLGVSGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQCKVGDEV  412 (436)
Q Consensus       333 ~lGv~~~~~~~~~~~g~~gv~V~~v~~~spa~~agl~~~~~~~~~~l~~GDiIl~vnG~~V~s~~dl~~~l~~~~~g~~v  412 (436)
                      |+|+.+....       ..++|..+.+++||+++||++           ||+|++|||+++.+++++.+++... .++.+
T Consensus         2 ~~~~~~g~~~-------~~~~V~~v~~~s~a~~~gl~~-----------GD~I~~ing~~i~~~~~~~~~l~~~-~~~~~   62 (79)
T cd00989           2 ILGFVPGGPP-------IEPVIGEVVPGSPAAKAGLKA-----------GDRILAINGQKIKSWEDLVDAVQEN-PGKPL   62 (79)
T ss_pred             eeeEeccCCc-------cCcEEEeECCCCHHHHcCCCC-----------CCEEEEECCEECCCHHHHHHHHHHC-CCceE
Confidence            5666655332       358899999999999999999           9999999999999999999999874 57899


Q ss_pred             EEEEEECCEEEEEEEE
Q 013804          413 IVEVLRGDQKEKIPVK  428 (436)
Q Consensus       413 ~l~v~R~g~~~~~~v~  428 (436)
                      .+++.|+|+..++.++
T Consensus        63 ~l~v~r~~~~~~~~l~   78 (79)
T cd00989          63 TLTVERNGETITLTLT   78 (79)
T ss_pred             EEEEEECCEEEEEEec
Confidence            9999999988777664


No 22 
>cd00988 PDZ_CTP_protease PDZ domain of C-terminal processing-, tail-specific-, and tricorn proteases, which function in posttranslational protein processing, maturation, and disassembly or degradation, in Bacteria, Archaea, and plant chloroplasts. May be responsible for substrate recognition and/or binding, as most PDZ domains bind C-terminal polypeptides, and binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of protease-associated PDZ domains a C-terminal beta-strand forms the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins.
Probab=99.10  E-value=5.6e-10  Score=88.55  Aligned_cols=79  Identities=28%  Similarity=0.558  Sum_probs=68.3

Q ss_pred             eecceeeecchhhhhcCccceEEEecCCCCcccccCcccccccccCcccCCcEEEEECCEEeCCH--HHHHHHHhcCCCC
Q 013804          332 PILGIKFAPDQSVEQLGVSGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNG--SDLYRILDQCKVG  409 (436)
Q Consensus       332 ~~lGv~~~~~~~~~~~g~~gv~V~~v~~~spa~~agl~~~~~~~~~~l~~GDiIl~vnG~~V~s~--~dl~~~l~~~~~g  409 (436)
                      .-||+.+....       .+++|..+.+++||+++||++           ||+|++|||+++.++  .++..++.. ..|
T Consensus         2 ~~lG~~~~~~~-------~~~~V~~v~~~s~a~~~gl~~-----------GD~I~~vng~~i~~~~~~~~~~~l~~-~~~   62 (85)
T cd00988           2 GGIGLELKYDD-------GGLVITSVLPGSPAAKAGIKA-----------GDIIVAIDGEPVDGLSLEDVVKLLRG-KAG   62 (85)
T ss_pred             eEEEEEEEEcC-------CeEEEEEecCCCCHHHcCCCC-----------CCEEEEECCEEcCCCCHHHHHHHhcC-CCC
Confidence            35788876533       689999999999999999999           999999999999999  899988876 468


Q ss_pred             CEEEEEEEEC-CEEEEEEEEe
Q 013804          410 DEVIVEVLRG-DQKEKIPVKL  429 (436)
Q Consensus       410 ~~v~l~v~R~-g~~~~~~v~~  429 (436)
                      +.+.+++.|+ |+..+++++.
T Consensus        63 ~~i~l~v~r~~~~~~~~~~~~   83 (85)
T cd00988          63 TKVRLTLKRGDGEPREVTLTR   83 (85)
T ss_pred             CEEEEEEEcCCCCEEEEEEEE
Confidence            8999999999 8888877754


No 23 
>COG3591 V8-like Glu-specific endopeptidase [Amino acid transport and metabolism]
Probab=99.03  E-value=3.6e-09  Score=99.75  Aligned_cols=159  Identities=18%  Similarity=0.209  Sum_probs=94.0

Q ss_pred             eEEEEEEEcCCCEEEecccccCCCCe----EEEEe----cCC-cEEeeE--EEEEc-C---CCCeEEEEEcCCC------
Q 013804          152 GSGSGFVWDSKGHVVTNYHVIRGASD----IRVTF----ADQ-SAYDAK--IVGFD-Q---DKDVAVLRIDAPK------  210 (436)
Q Consensus       152 ~~GSGfiI~~~G~ILT~aHvv~~~~~----i~V~~----~dg-~~~~a~--vv~~d-~---~~DlAlLkv~~~~------  210 (436)
                      ..+++|+|+++ .+||++||+.....    +.+..    .++ ..+..+  ..... .   +.|.+...+....      
T Consensus        64 ~~~~~~lI~pn-tvLTa~Hc~~s~~~G~~~~~~~p~g~~~~~~~~~~~~~~~~~~~~g~~~~~d~~~~~v~~~~~~~g~~  142 (251)
T COG3591          64 LCTAATLIGPN-TVLTAGHCIYSPDYGEDDIAAAPPGVNSDGGPFYGITKIEIRVYPGELYKEDGASYDVGEAALESGIN  142 (251)
T ss_pred             ceeeEEEEcCc-eEEEeeeEEecCCCChhhhhhcCCcccCCCCCCCceeeEEEEecCCceeccCCceeeccHHHhccCCC
Confidence            34566999987 99999999965431    11111    122 122222  12112 2   3466665554211      


Q ss_pred             --CCCcceecCCCCCCCCCCEEEEEecCCCCCCce----eEeEEeeeeeeeccCCCCCCcccEEEEccccCCCCCCCeEE
Q 013804          211 --DKLRPIPIGVSADLLVGQKVYAIGNPFGLDHTL----TTGVISGLRREISSAATGRPIQDVIQTDAAINPGNSGGPLL  284 (436)
Q Consensus       211 --~~~~~~~l~~~~~~~~G~~V~~vG~p~g~~~~~----~~G~vs~~~~~~~~~~~~~~~~~~i~~~~~i~~G~SGGPlv  284 (436)
                        .......+......+.++.+.++|||.+.....    ..+.+...            ....+.+++.+++|+||+|++
T Consensus       143 ~~~~~~~~~~~~~~~~~~~d~i~v~GYP~dk~~~~~~~e~t~~v~~~------------~~~~l~y~~dT~pG~SGSpv~  210 (251)
T COG3591         143 IGDVVNYLKRNTASEAKANDRITVIGYPGDKPNIGTMWESTGKVNSI------------KGNKLFYDADTLPGSSGSPVL  210 (251)
T ss_pred             ccccccccccccccccccCceeEEEeccCCCCcceeEeeecceeEEE------------ecceEEEEecccCCCCCCceE
Confidence              112222333345567889999999998765322    22333211            124688999999999999999


Q ss_pred             CCCCcEEEEEeeeecCCCCCCcceee-eeeeccchhhhhcc
Q 013804          285 DSSGSLIGINTAIYSPSGASSGVGFS-IPVDTVNGIVDQLV  324 (436)
Q Consensus       285 d~~G~VVGI~s~~~~~~~~~~~~~~a-IP~~~i~~~l~~l~  324 (436)
                      +.+.+|||++..+....++ ...+++ .-...+++++++++
T Consensus       211 ~~~~~vigv~~~g~~~~~~-~~~n~~vr~t~~~~~~I~~~~  250 (251)
T COG3591         211 ISKDEVIGVHYNGPGANGG-SLANNAVRLTPEILNFIQQNI  250 (251)
T ss_pred             ecCceEEEEEecCCCcccc-cccCcceEecHHHHHHHHHhh
Confidence            9988999999987664433 333433 34466677776654


No 24 
>cd00136 PDZ PDZ domain, also called DHR (Dlg homologous region) or GLGF (after a conserved sequence motif). Many PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. Heterodimerization through PDZ-PDZ domain interactions adds to the domain's versatility, and PDZ domain-mediated interactions may be modulated dynamically through target phosphorylation. Some PDZ domains play a role in scaffolding supramolecular complexes. PDZ domains are found in diverse signaling proteins in bacteria, archaebacteria, and eukaryotes. This CD contains two distinct structural subgroups with either a N- or C-terminal beta-strand forming the peptide-binding groove base. The circular permutation placing the strand on the N-terminus appears to be found in Eumetazoa only, while the C-terminal variant is found in all three kingdoms of life, and seems to co-occur with protease domains. PDZ domains have been named after PSD95(pos
Probab=98.85  E-value=8.2e-09  Score=78.63  Aligned_cols=67  Identities=39%  Similarity=0.620  Sum_probs=57.9

Q ss_pred             ecceeeecchhhhhcCccceEEEecCCCCcccccCcccccccccCcccCCcEEEEECCEEeCCH--HHHHHHHhcCCCCC
Q 013804          333 ILGIKFAPDQSVEQLGVSGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNG--SDLYRILDQCKVGD  410 (436)
Q Consensus       333 ~lGv~~~~~~~~~~~g~~gv~V~~v~~~spa~~agl~~~~~~~~~~l~~GDiIl~vnG~~V~s~--~dl~~~l~~~~~g~  410 (436)
                      ++|+.+.....      .|++|..+.+++||+++||++           ||+|++|||++|.++  +++.+++... .|+
T Consensus         2 ~~G~~~~~~~~------~~~~V~~v~~~s~a~~~gl~~-----------GD~I~~Ing~~v~~~~~~~~~~~l~~~-~g~   63 (70)
T cd00136           2 GLGFSIRGGTE------GGVVVLSVEPGSPAERAGLQA-----------GDVILAVNGTDVKNLTLEDVAELLKKE-VGE   63 (70)
T ss_pred             CccEEEecCCC------CCEEEEEeCCCCHHHHcCCCC-----------CCEEEEECCEECCCCCHHHHHHHHhhC-CCC
Confidence            57888775431      489999999999999999999           999999999999999  9999999884 488


Q ss_pred             EEEEEEE
Q 013804          411 EVIVEVL  417 (436)
Q Consensus       411 ~v~l~v~  417 (436)
                      +++|+|+
T Consensus        64 ~v~l~v~   70 (70)
T cd00136          64 KVTLTVR   70 (70)
T ss_pred             eEEEEEC
Confidence            8988763


No 25 
>TIGR02037 degP_htrA_DO periplasmic serine protease, Do/DegQ family. This family consists of a set of proteins variously designated DegP, heat shock protein HtrA, and protease DO. The ortholog in Pseudomonas aeruginosa is designated MucD and is found in an operon that controls mucoid phenotype. This family also includes the DegQ (HhoA) paralog in E. coli which can rescue a DegP mutant, but not the smaller DegS paralog, which cannot. Members of this family are located in the periplasm and have separable functions as both protease and chaperone. Members have a trypsin domain and two copies of a PDZ domain. This protein protects bacteria from thermal and other stresses and may be important for the survival of bacterial pathogens.// The chaperone function is dominant at low temperatures, whereas the proteolytic activity is turned on at elevated temperatures.
Probab=98.83  E-value=1.3e-08  Score=105.34  Aligned_cols=85  Identities=35%  Similarity=0.554  Sum_probs=73.1

Q ss_pred             ceecceeeecch--hhhhcCc----cceEEEecCCCCcccccCcccccccccCcccCCcEEEEECCEEeCCHHHHHHHHh
Q 013804          331 RPILGIKFAPDQ--SVEQLGV----SGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILD  404 (436)
Q Consensus       331 ~~~lGv~~~~~~--~~~~~g~----~gv~V~~v~~~spa~~agl~~~~~~~~~~l~~GDiIl~vnG~~V~s~~dl~~~l~  404 (436)
                      ..++|+.+.+..  ..+.+++    .|++|.+|.++|||+++||++           ||+|++|||++|.+++|+.+++.
T Consensus       337 ~~~lGi~~~~l~~~~~~~~~l~~~~~Gv~V~~V~~~SpA~~aGL~~-----------GDvI~~Ing~~V~s~~d~~~~l~  405 (428)
T TIGR02037       337 NPFLGLTVANLSPEIRKELRLKGDVKGVVVTKVVSGSPAARAGLQP-----------GDVILSVNQQPVSSVAELRKVLD  405 (428)
T ss_pred             ccccceEEecCCHHHHHHcCCCcCcCceEEEEeCCCCHHHHcCCCC-----------CCEEEEECCEEcCCHHHHHHHHH
Confidence            467899887633  2344453    599999999999999999999           99999999999999999999999


Q ss_pred             cCCCCCEEEEEEEECCEEEEEE
Q 013804          405 QCKVGDEVIVEVLRGDQKEKIP  426 (436)
Q Consensus       405 ~~~~g~~v~l~v~R~g~~~~~~  426 (436)
                      ..+.|+.+.++|.|+|+...+.
T Consensus       406 ~~~~g~~v~l~v~R~g~~~~~~  427 (428)
T TIGR02037       406 RAKKGGRVALLILRGGATIFVT  427 (428)
T ss_pred             hcCCCCEEEEEEEECCEEEEEE
Confidence            8778999999999999987654


No 26 
>PRK10779 zinc metallopeptidase RseP; Provisional
Probab=98.79  E-value=1e-08  Score=106.53  Aligned_cols=68  Identities=16%  Similarity=0.174  Sum_probs=63.1

Q ss_pred             eEEEecCCCCcccccCcccccccccCcccCCcEEEEECCEEeCCHHHHHHHHhcCCCCCEEEEEEEECCEEEEEEEEee
Q 013804          352 VLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQCKVGDEVIVEVLRGDQKEKIPVKLE  430 (436)
Q Consensus       352 v~V~~v~~~spa~~agl~~~~~~~~~~l~~GDiIl~vnG~~V~s~~dl~~~l~~~~~g~~v~l~v~R~g~~~~~~v~~~  430 (436)
                      .+|.+|.++|||++|||++           ||+|++|||++|++++|+...+....+|++++++|.|+|+.+++++++.
T Consensus       128 ~lV~~V~~~SpA~kAGLk~-----------GDvI~~vnG~~V~~~~~l~~~v~~~~~g~~v~v~v~R~gk~~~~~v~l~  195 (449)
T PRK10779        128 PVVGEIAPNSIAAQAQIAP-----------GTELKAVDGIETPDWDAVRLALVSKIGDESTTITVAPFGSDQRRDKTLD  195 (449)
T ss_pred             ccccccCCCCHHHHcCCCC-----------CCEEEEECCEEcCCHHHHHHHHHhhccCCceEEEEEeCCccceEEEEec
Confidence            3689999999999999999           9999999999999999999999887888999999999999888887774


No 27 
>TIGR00054 RIP metalloprotease RseP. A model that detects fragments as well matches a number of members of the PEPTIDASE FAMILY S2C. The region of match appears not to overlap the active site domain.
Probab=98.70  E-value=5.1e-08  Score=100.46  Aligned_cols=69  Identities=32%  Similarity=0.491  Sum_probs=64.1

Q ss_pred             cceEEEecCCCCcccccCcccccccccCcccCCcEEEEECCEEeCCHHHHHHHHhcCCCCCEEEEEEEECCEEEEEEEEe
Q 013804          350 SGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQCKVGDEVIVEVLRGDQKEKIPVKL  429 (436)
Q Consensus       350 ~gv~V~~v~~~spa~~agl~~~~~~~~~~l~~GDiIl~vnG~~V~s~~dl~~~l~~~~~g~~v~l~v~R~g~~~~~~v~~  429 (436)
                      .|++|.+|.++|||+++||++           ||+|++|||++|++++|+.+.+.. .+++++.++++|+|+..++++++
T Consensus       203 ~g~vV~~V~~~SpA~~aGL~~-----------GD~Iv~Vng~~V~s~~dl~~~l~~-~~~~~v~l~v~R~g~~~~~~v~~  270 (420)
T TIGR00054       203 IEPVLSDVTPNSPAEKAGLKE-----------GDYIQSINGEKLRSWTDFVSAVKE-NPGKSMDIKVERNGETLSISLTP  270 (420)
T ss_pred             cCcEEEEECCCCHHHHcCCCC-----------CCEEEEECCEECCCHHHHHHHHHh-CCCCceEEEEEECCEEEEEEEEE
Confidence            378999999999999999999           999999999999999999999987 57888999999999998888887


Q ss_pred             e
Q 013804          430 E  430 (436)
Q Consensus       430 ~  430 (436)
                      .
T Consensus       271 ~  271 (420)
T TIGR00054       271 E  271 (420)
T ss_pred             c
Confidence            4


No 28 
>PRK10779 zinc metallopeptidase RseP; Provisional
Probab=98.65  E-value=9.9e-08  Score=99.24  Aligned_cols=68  Identities=22%  Similarity=0.410  Sum_probs=63.5

Q ss_pred             ceEEEecCCCCcccccCcccccccccCcccCCcEEEEECCEEeCCHHHHHHHHhcCCCCCEEEEEEEECCEEEEEEEEee
Q 013804          351 GVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQCKVGDEVIVEVLRGDQKEKIPVKLE  430 (436)
Q Consensus       351 gv~V~~v~~~spa~~agl~~~~~~~~~~l~~GDiIl~vnG~~V~s~~dl~~~l~~~~~g~~v~l~v~R~g~~~~~~v~~~  430 (436)
                      +++|.+|.++|||+++||++           ||+|++|||++|++|+|+.+.+.. .+|+.+.++|.|+|+..++++++.
T Consensus       222 ~~vV~~V~~~SpA~~AGL~~-----------GDvIl~Ing~~V~s~~dl~~~l~~-~~~~~v~l~v~R~g~~~~~~v~~~  289 (449)
T PRK10779        222 EPVLAEVQPNSAASKAGLQA-----------GDRIVKVDGQPLTQWQTFVTLVRD-NPGKPLALEIERQGSPLSLTLTPD  289 (449)
T ss_pred             CcEEEeeCCCCHHHHcCCCC-----------CCEEEEECCEEcCCHHHHHHHHHh-CCCCEEEEEEEECCEEEEEEEEee
Confidence            57899999999999999999           999999999999999999999987 578899999999999998888875


No 29 
>TIGR00225 prc C-terminal peptidase (prc). A C-terminal peptidase with different substrates in different species including processing of D1 protein of the photosystem II reaction center in higher plants and cleavage of a peptide of 11 residues from the precursor form of penicillin-binding protein in E. coli. E. coli and H. influenzae have the most distal branch of the tree and their proteins have an N-terminal 200 amino acids that show no homology to other proteins in the database.
Probab=98.62  E-value=1.1e-07  Score=95.11  Aligned_cols=80  Identities=29%  Similarity=0.499  Sum_probs=66.4

Q ss_pred             eecceeeecchhhhhcCccceEEEecCCCCcccccCcccccccccCcccCCcEEEEECCEEeCCH--HHHHHHHhcCCCC
Q 013804          332 PILGIKFAPDQSVEQLGVSGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNG--SDLYRILDQCKVG  409 (436)
Q Consensus       332 ~~lGv~~~~~~~~~~~g~~gv~V~~v~~~spa~~agl~~~~~~~~~~l~~GDiIl~vnG~~V~s~--~dl~~~l~~~~~g  409 (436)
                      ..+|+.+....       .+++|..|.+++||+++||++           ||+|++|||++|.++  .++...+.. ..|
T Consensus        51 ~~lG~~~~~~~-------~~~~V~~V~~~spA~~aGL~~-----------GD~I~~Ing~~v~~~~~~~~~~~l~~-~~g  111 (334)
T TIGR00225        51 EGIGIQVGMDD-------GEIVIVSPFEGSPAEKAGIKP-----------GDKIIKINGKSVAGMSLDDAVALIRG-KKG  111 (334)
T ss_pred             EEEEEEEEEEC-------CEEEEEEeCCCChHHHcCCCC-----------CCEEEEECCEECCCCCHHHHHHhccC-CCC
Confidence            56888876432       579999999999999999999           999999999999986  467666665 578


Q ss_pred             CEEEEEEEECCEEEEEEEEee
Q 013804          410 DEVIVEVLRGDQKEKIPVKLE  430 (436)
Q Consensus       410 ~~v~l~v~R~g~~~~~~v~~~  430 (436)
                      +++.++|.|+|+..++++++.
T Consensus       112 ~~v~l~v~R~g~~~~~~v~l~  132 (334)
T TIGR00225       112 TKVSLEILRAGKSKPLTFTLK  132 (334)
T ss_pred             CEEEEEEEeCCCCceEEEEEE
Confidence            999999999987776666664


No 30 
>TIGR02860 spore_IV_B stage IV sporulation protein B. SpoIVB, the stage IV sporulation protein B of endospore-forming bacteria such as Bacillus subtilis, is a serine proteinase, expressed in the spore (rather than mother cell) compartment, that participates in a proteolytic activation cascade for Sigma-K. It appears to be universal among endospore-forming bacteria and occurs nowhere else.
Probab=98.59  E-value=2.5e-07  Score=93.18  Aligned_cols=77  Identities=29%  Similarity=0.515  Sum_probs=65.0

Q ss_pred             ecceeeecchhhhhcCccceEEEecC--------CCCcccccCcccccccccCcccCCcEEEEECCEEeCCHHHHHHHHh
Q 013804          333 ILGIKFAPDQSVEQLGVSGVLVLDAP--------PNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILD  404 (436)
Q Consensus       333 ~lGv~~~~~~~~~~~g~~gv~V~~v~--------~~spa~~agl~~~~~~~~~~l~~GDiIl~vnG~~V~s~~dl~~~l~  404 (436)
                      .+|+.+..         +||+|....        .++||+++||++           ||+|++|||++|.+++|+.+++.
T Consensus        97 ~iGI~l~t---------~GVlVvg~~~v~~~~g~~~SPAa~AGLq~-----------GDiIvsING~~V~s~~DL~~iL~  156 (402)
T TIGR02860        97 SIGVKLNT---------KGVLVVGFSDIETEKGKIHSPGEEAGIQI-----------GDRILKINGEKIKNMDDLANLIN  156 (402)
T ss_pred             EEEEEEec---------CEEEEEEEEcccccCCCCCCHHHHcCCCC-----------CCEEEEECCEECCCHHHHHHHHH
Confidence            46666643         688886652        368999999999           99999999999999999999998


Q ss_pred             cCCCCCEEEEEEEECCEEEEEEEEee
Q 013804          405 QCKVGDEVIVEVLRGDQKEKIPVKLE  430 (436)
Q Consensus       405 ~~~~g~~v~l~v~R~g~~~~~~v~~~  430 (436)
                      .. .++.+.++|.|+|+..++++++.
T Consensus       157 ~~-~g~~V~LtV~R~Ge~~tv~V~Pv  181 (402)
T TIGR02860       157 KA-GGEKLTLTIERGGKIIETVIKPV  181 (402)
T ss_pred             hC-CCCeEEEEEEECCEEEEEEEEEe
Confidence            85 48999999999999988888754


No 31 
>KOG3627 consensus Trypsin [Amino acid transport and metabolism]
Probab=98.59  E-value=1.8e-06  Score=82.58  Aligned_cols=170  Identities=22%  Similarity=0.249  Sum_probs=95.5

Q ss_pred             eEEEEEEEcCCCEEEecccccCCCC--eEEEEecC---------C---cEEeeEEEEEcC-------C-CCeEEEEEcCC
Q 013804          152 GSGSGFVWDSKGHVVTNYHVIRGAS--DIRVTFAD---------Q---SAYDAKIVGFDQ-------D-KDVAVLRIDAP  209 (436)
Q Consensus       152 ~~GSGfiI~~~G~ILT~aHvv~~~~--~i~V~~~d---------g---~~~~a~vv~~d~-------~-~DlAlLkv~~~  209 (436)
                      ..|.|.+|+++ ||+|++||+.+..  .+.|+++.         +   .......+..|+       . .|||||+++.+
T Consensus        38 ~~Cggsli~~~-~vltaaHC~~~~~~~~~~V~~G~~~~~~~~~~~~~~~~~~v~~~i~H~~y~~~~~~~nDiall~l~~~  116 (256)
T KOG3627|consen   38 HLCGGSLISPR-WVLTAAHCVKGASASLYTVRLGEHDINLSVSEGEEQLVGDVEKIIVHPNYNPRTLENNDIALLRLSEP  116 (256)
T ss_pred             eeeeeEEeeCC-EEEEChhhCCCCCCcceEEEECccccccccccCchhhhceeeEEEECCCCCCCCCCCCCEEEEEECCC
Confidence            47888888665 9999999999976  77777642         1   111122112332       2 79999999874


Q ss_pred             C---CCCcceecCCCCC---CCCCCEEEEEecCCCCC------CceeEeEEeeeeeeeccCCCCC---CcccEEEE----
Q 013804          210 K---DKLRPIPIGVSAD---LLVGQKVYAIGNPFGLD------HTLTTGVISGLRREISSAATGR---PIQDVIQT----  270 (436)
Q Consensus       210 ~---~~~~~~~l~~~~~---~~~G~~V~~vG~p~g~~------~~~~~G~vs~~~~~~~~~~~~~---~~~~~i~~----  270 (436)
                      .   ..+.|+.|.....   ...+..+++.||+....      .......+.-+....+......   .....+..    
T Consensus       117 v~~~~~i~piclp~~~~~~~~~~~~~~~v~GWG~~~~~~~~~~~~L~~~~v~i~~~~~C~~~~~~~~~~~~~~~Ca~~~~  196 (256)
T KOG3627|consen  117 VTFSSHIQPICLPSSADPYFPPGGTTCLVSGWGRTESGGGPLPDTLQEVDVPIISNSECRRAYGGLGTITDTMLCAGGPE  196 (256)
T ss_pred             cccCCcccccCCCCCcccCCCCCCCEEEEEeCCCcCCCCCCCCceeEEEEEeEcChhHhcccccCccccCCCEEeeCccC
Confidence            3   4566777743332   34458888899864321      1222222222221111111000   00112222    


Q ss_pred             -ccccCCCCCCCeEECCC---CcEEEEEeeeecCCCCCCcceeeeeeeccchhhhh
Q 013804          271 -DAAINPGNSGGPLLDSS---GSLIGINTAIYSPSGASSGVGFSIPVDTVNGIVDQ  322 (436)
Q Consensus       271 -~~~i~~G~SGGPlvd~~---G~VVGI~s~~~~~~~~~~~~~~aIP~~~i~~~l~~  322 (436)
                       ....|.|+|||||+-.+   ..++||++++...|+.....+....+....+|+++
T Consensus       197 ~~~~~C~GDSGGPLv~~~~~~~~~~GivS~G~~~C~~~~~P~vyt~V~~y~~WI~~  252 (256)
T KOG3627|consen  197 GGKDACQGDSGGPLVCEDNGRWVLVGIVSWGSGGCGQPNYPGVYTRVSSYLDWIKE  252 (256)
T ss_pred             CCCccccCCCCCeEEEeeCCcEEEEEEEEecCCCCCCCCCCeEEeEhHHhHHHHHH
Confidence             23468999999999654   69999999987645433223334445555555554


No 32 
>smart00228 PDZ Domain present in PSD-95, Dlg, and ZO-1/2. Also called DHR (Dlg homologous region) or GLGF (relatively well conserved tetrapeptide in these domains). Some PDZs have been shown to bind C-terminal polypeptides; others appear to bind internal (non-C-terminal) polypeptides. Different PDZs possess different binding specificities.
Probab=98.58  E-value=1.7e-07  Score=73.66  Aligned_cols=74  Identities=36%  Similarity=0.490  Sum_probs=58.2

Q ss_pred             eecceeeecchhhhhcCccceEEEecCCCCcccccCcccccccccCcccCCcEEEEECCEEeCCHHHHHHHHhcCCCCCE
Q 013804          332 PILGIKFAPDQSVEQLGVSGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQCKVGDE  411 (436)
Q Consensus       332 ~~lGv~~~~~~~~~~~g~~gv~V~~v~~~spa~~agl~~~~~~~~~~l~~GDiIl~vnG~~V~s~~dl~~~l~~~~~g~~  411 (436)
                      ..+|+.+.......    .|++|..+.+++||+++||++           ||+|++|||+++.++.+..........++.
T Consensus        12 ~~~G~~~~~~~~~~----~~~~i~~v~~~s~a~~~gl~~-----------GD~I~~In~~~v~~~~~~~~~~~~~~~~~~   76 (85)
T smart00228       12 GGLGFSLVGGKDEG----GGVVVSSVVPGSPAAKAGLKV-----------GDVILEVNGTSVEGLTHLEAVDLLKKAGGK   76 (85)
T ss_pred             CcccEEEECCCCCC----CCEEEEEECCCCHHHHcCCCC-----------CCEEEEECCEECCCCCHHHHHHHHHhCCCe
Confidence            57888877533111    589999999999999999999           999999999999987665554443334678


Q ss_pred             EEEEEEECC
Q 013804          412 VIVEVLRGD  420 (436)
Q Consensus       412 v~l~v~R~g  420 (436)
                      +.+++.|++
T Consensus        77 ~~l~i~r~~   85 (85)
T smart00228       77 VTLTVLRGG   85 (85)
T ss_pred             EEEEEEeCC
Confidence            999999875


No 33 
>PLN00049 carboxyl-terminal processing protease; Provisional
Probab=98.48  E-value=6.4e-07  Score=91.43  Aligned_cols=84  Identities=21%  Similarity=0.404  Sum_probs=64.8

Q ss_pred             eecceeeecchhhhhcCccceEEEecCCCCcccccCcccccccccCcccCCcEEEEECCEEeCCH--HHHHHHHhcCCCC
Q 013804          332 PILGIKFAPDQSVEQLGVSGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNG--SDLYRILDQCKVG  409 (436)
Q Consensus       332 ~~lGv~~~~~~~~~~~g~~gv~V~~v~~~spa~~agl~~~~~~~~~~l~~GDiIl~vnG~~V~s~--~dl~~~l~~~~~g  409 (436)
                      .-+|+.+....... -...|++|..|.++|||+++||++           ||+|++|||++|.++  .++...+.. ..|
T Consensus        85 ~GiG~~~~~~~~~~-~~~~g~~V~~V~~~SPA~~aGl~~-----------GD~Iv~InG~~v~~~~~~~~~~~l~g-~~g  151 (389)
T PLN00049         85 TGVGLEVGYPTGSD-GPPAGLVVVAPAPGGPAARAGIRP-----------GDVILAIDGTSTEGLSLYEAADRLQG-PEG  151 (389)
T ss_pred             eEEEEEEEEccCCC-CccCcEEEEEeCCCChHHHcCCCC-----------CCEEEEECCEECCCCCHHHHHHHHhc-CCC
Confidence            45777765322100 001489999999999999999999           999999999999864  677777765 578


Q ss_pred             CEEEEEEEECCEEEEEEEE
Q 013804          410 DEVIVEVLRGDQKEKIPVK  428 (436)
Q Consensus       410 ~~v~l~v~R~g~~~~~~v~  428 (436)
                      +.+.++|.|+|+..+++++
T Consensus       152 ~~v~ltv~r~g~~~~~~l~  170 (389)
T PLN00049        152 SSVELTLRRGPETRLVTLT  170 (389)
T ss_pred             CEEEEEEEECCEEEEEEEE
Confidence            9999999999987776654


No 34 
>PRK10139 serine endoprotease; Provisional
Probab=98.48  E-value=4.5e-07  Score=94.25  Aligned_cols=65  Identities=22%  Similarity=0.428  Sum_probs=59.6

Q ss_pred             cceEEEecCCCCcccccCcccccccccCcccCCcEEEEECCEEeCCHHHHHHHHhcCCCCCEEEEEEEECCEEEEEEE
Q 013804          350 SGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQCKVGDEVIVEVLRGDQKEKIPV  427 (436)
Q Consensus       350 ~gv~V~~v~~~spa~~agl~~~~~~~~~~l~~GDiIl~vnG~~V~s~~dl~~~l~~~~~g~~v~l~v~R~g~~~~~~v  427 (436)
                      .|++|.+|.+++||+++||++           ||+|++|||++|.+|+|+.+++.+. + +++.++|+|+|+...+.+
T Consensus       390 ~Gv~V~~V~~~spA~~aGL~~-----------GD~I~~Ing~~v~~~~~~~~~l~~~-~-~~v~l~v~R~g~~~~~~~  454 (455)
T PRK10139        390 KGIKIDEVVKGSPAAQAGLQK-----------DDVIIGVNRDRVNSIAEMRKVLAAK-P-AIIALQIVRGNESIYLLL  454 (455)
T ss_pred             CceEEEEeCCCChHHHcCCCC-----------CCEEEEECCEEcCCHHHHHHHHHhC-C-CeEEEEEEECCEEEEEEe
Confidence            589999999999999999999           9999999999999999999999873 3 789999999999877665


No 35 
>PF00863 Peptidase_C4:  Peptidase family C4 This family belongs to family C4 of the peptidase classification.;  InterPro: IPR001730 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. 
These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad.  Nuclear inclusion A (NIA) proteases from potyviruses are cysteine peptidases that belong to the MEROPS peptidase family C4 (NIa protease family, clan PA(C)).  Potyviruses include plant viruses in which the single-stranded RNA encodes a polyprotein with NIA protease activity, where proteolytic cleavage is specific for Gln+Gly sites. The NIA protease acts on the polyprotein, releasing itself by Gln+Gly cleavage at both the N- and C-termini. It further processes the polyprotein by cleavage at five similar sites in the C-terminal half of the sequence. In addition to its C-terminal protease activity, the NIA protease contains an N-terminal domain that has been implicated in the transcription process. This peptidase is present in the nuclear inclusion protein of potyviruses; GO: 0008234 cysteine-type peptidase activity, 0006508 proteolysis; PDB: 3MMG_B 1Q31_B 1LVB_A 1LVM_A.
Probab=98.47  E-value=2.3e-05  Score=73.61  Aligned_cols=163  Identities=16%  Similarity=0.283  Sum_probs=86.0

Q ss_pred             HHhCCceEEEEEeeeccCccccccccCcCeEEEEEEEcCCCEEEecccccCC-CCeEEEEecCCcEEeeE-----EEEEc
Q 013804          123 QENTPSVVNITNLAARQDAFTLDVLEVPQGSGSGFVWDSKGHVVTNYHVIRG-ASDIRVTFADQSAYDAK-----IVGFD  196 (436)
Q Consensus       123 ~~~~~sVV~I~~~~~~~~~~~~~~~~~~~~~GSGfiI~~~G~ILT~aHvv~~-~~~i~V~~~dg~~~~a~-----vv~~d  196 (436)
                      ..+...|+.|.......           ...=-|+..+  .+|+|++|.++. ...++|...-|. |...     -+..-
T Consensus        14 n~Ia~~ic~l~n~s~~~-----------~~~l~gigyG--~~iItn~HLf~~nng~L~i~s~hG~-f~v~nt~~lkv~~i   79 (235)
T PF00863_consen   14 NPIASNICRLTNESDGG-----------TRSLYGIGYG--SYIITNAHLFKRNNGELTIKSQHGE-FTVPNTTQLKVHPI   79 (235)
T ss_dssp             HHHHTTEEEEEEEETTE-----------EEEEEEEEET--TEEEEEGGGGSSTTCEEEEEETTEE-EEECEGGGSEEEE-
T ss_pred             chhhheEEEEEEEeCCC-----------eEEEEEEeEC--CEEEEChhhhccCCCeEEEEeCceE-EEcCCccccceEEe
Confidence            34456688887643221           2233477775  389999999964 456777776663 3222     23444


Q ss_pred             CCCCeEEEEEcCCCCCCcceecC-CCCCCCCCCEEEEEecCCCCCCceeEeEEeeeeeeeccCCCCCCcccEEEEccccC
Q 013804          197 QDKDVAVLRIDAPKDKLRPIPIG-VSADLLVGQKVYAIGNPFGLDHTLTTGVISGLRREISSAATGRPIQDVIQTDAAIN  275 (436)
Q Consensus       197 ~~~DlAlLkv~~~~~~~~~~~l~-~~~~~~~G~~V~~vG~p~g~~~~~~~G~vs~~~~~~~~~~~~~~~~~~i~~~~~i~  275 (436)
                      +..||.++|+..   ++||.+-. .-..+..++.|+++|.-+....  ..-.|+........  .   ...+...-....
T Consensus        80 ~~~DiviirmPk---DfpPf~~kl~FR~P~~~e~v~mVg~~fq~k~--~~s~vSesS~i~p~--~---~~~fWkHwIsTk  149 (235)
T PF00863_consen   80 EGRDIVIIRMPK---DFPPFPQKLKFRAPKEGERVCMVGSNFQEKS--ISSTVSESSWIYPE--E---NSHFWKHWISTK  149 (235)
T ss_dssp             TCSSEEEEE--T---TS----S---B----TT-EEEEEEEECSSCC--CEEEEEEEEEEEEE--T---TTTEEEE-C---
T ss_pred             CCccEEEEeCCc---ccCCcchhhhccCCCCCCEEEEEEEEEEcCC--eeEEECCceEEeec--C---CCCeeEEEecCC
Confidence            688999999965   56665432 3356889999999997544332  12223322221111  1   124455556667


Q ss_pred             CCCCCCeEECC-CCcEEEEEeeeecCCCCCCcceeeeee
Q 013804          276 PGNSGGPLLDS-SGSLIGINTAIYSPSGASSGVGFSIPV  313 (436)
Q Consensus       276 ~G~SGGPlvd~-~G~VVGI~s~~~~~~~~~~~~~~aIP~  313 (436)
                      .|+=|.|+|+. +|++|||++.....    ...+|+.|+
T Consensus       150 ~G~CG~PlVs~~Dg~IVGiHsl~~~~----~~~N~F~~f  184 (235)
T PF00863_consen  150 DGDCGLPLVSTKDGKIVGIHSLTSNT----SSRNYFTPF  184 (235)
T ss_dssp             TT-TT-EEEETTT--EEEEEEEEETT----TSSEEEEE-
T ss_pred             CCccCCcEEEcCCCcEEEEEcCccCC----CCeEEEEcC
Confidence            89999999984 99999999987543    345677766


No 36 
>cd00992 PDZ_signaling PDZ domain found in a variety of Eumetazoan signaling molecules, often in tandem arrangements. May be responsible for specific protein-protein interactions, as most PDZ domains bind C-terminal polypeptides, and binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of PDZ domains an N-terminal beta-strand forms the peptide-binding groove base, a circular permutation with respect to PDZ domains found in proteases.
Probab=98.42  E-value=9e-07  Score=69.32  Aligned_cols=69  Identities=32%  Similarity=0.474  Sum_probs=55.6

Q ss_pred             ceecceeeecchhhhhcCccceEEEecCCCCcccccCcccccccccCcccCCcEEEEECCEEeC--CHHHHHHHHhcCCC
Q 013804          331 RPILGIKFAPDQSVEQLGVSGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVS--NGSDLYRILDQCKV  408 (436)
Q Consensus       331 ~~~lGv~~~~~~~~~~~g~~gv~V~~v~~~spa~~agl~~~~~~~~~~l~~GDiIl~vnG~~V~--s~~dl~~~l~~~~~  408 (436)
                      ...+|+.+......    ..|++|.++.+++||+++||++           ||+|++|||+++.  ++.++.+++... .
T Consensus        11 ~~~~G~~~~~~~~~----~~~~~V~~v~~~s~a~~~gl~~-----------GD~I~~ing~~i~~~~~~~~~~~l~~~-~   74 (82)
T cd00992          11 GGGLGFSLRGGKDS----GGGIFVSRVEPGGPAERGGLRV-----------GDRILEVNGVSVEGLTHEEAVELLKNS-G   74 (82)
T ss_pred             CCCcCEEEeCcccC----CCCeEEEEECCCChHHhCCCCC-----------CCEEEEECCEEcCccCHHHHHHHHHhC-C
Confidence            35688888754311    3589999999999999999999           9999999999999  899999988863 2


Q ss_pred             CCEEEEEE
Q 013804          409 GDEVIVEV  416 (436)
Q Consensus       409 g~~v~l~v  416 (436)
                       ..+.+++
T Consensus        75 -~~v~l~v   81 (82)
T cd00992          75 -DEVTLTV   81 (82)
T ss_pred             -CeEEEEE
Confidence             3666654


No 37 
>PF00595 PDZ:  PDZ domain (Also known as DHR or GLGF) Coordinates are not yet available;  InterPro: IPR001478 PDZ domains are found in diverse signalling proteins in bacteria, yeasts, plants, insects and vertebrates. PDZ domains can occur in one or multiple copies and are nearly always found in cytoplasmic proteins. They bind either the carboxyl-terminal sequences of proteins or internal peptide sequences. In most cases, interaction between a PDZ domain and its target is constitutive, with a binding affinity of 1 to 10 micromolar. However, agonist-dependent activation of cell surface receptors is sometimes required to promote interaction with a PDZ protein. PDZ domain proteins are frequently associated with the plasma membrane, a compartment where high concentrations of phosphatidylinositol 4,5-bisphosphate (PIP2) are found. Direct interaction between PIP2 and a subset of class II PDZ domains (syntenin, CASK, Tiam-1) has been demonstrated.  PDZ domains consist of 80 to 90 amino acids comprising six beta-strands (beta-A to beta-F) and two alpha-helices, A and B, compactly arranged in a globular structure. Peptide binding of the ligand takes place in an elongated surface groove as an anti-parallel beta-strand interacts with the beta-B strand and the B helix. The structure of PDZ domains allows binding to a free carboxylate group at the end of a peptide through a carboxylate-binding loop between the beta-A and beta-B strands; GO: 0005515 protein binding; PDB: 3AXA_A 1WF8_A 1QAV_B 1QAU_A 1B8Q_A 1MC7_A 2KAW_A 1I16_A 1VB7_A 1WI4_A ....
Probab=98.39  E-value=4.8e-07  Score=71.07  Aligned_cols=71  Identities=28%  Similarity=0.462  Sum_probs=55.7

Q ss_pred             ceecceeeecchhhhhcCccceEEEecCCCCcccccCcccccccccCcccCCcEEEEECCEEeCCH--HHHHHHHhcCCC
Q 013804          331 RPILGIKFAPDQSVEQLGVSGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNG--SDLYRILDQCKV  408 (436)
Q Consensus       331 ~~~lGv~~~~~~~~~~~g~~gv~V~~v~~~spa~~agl~~~~~~~~~~l~~GDiIl~vnG~~V~s~--~dl~~~l~~~~~  408 (436)
                      ...+|+.+.......   ..+++|.++.+++||+++||++           ||.|++|||+++.++  .++..++...  
T Consensus         9 ~~~lG~~l~~~~~~~---~~~~~V~~v~~~~~a~~~gl~~-----------GD~Il~INg~~v~~~~~~~~~~~l~~~--   72 (81)
T PF00595_consen    9 NGPLGFTLRGGSDND---EKGVFVSSVVPGSPAERAGLKV-----------GDRILEINGQSVRGMSHDEVVQLLKSA--   72 (81)
T ss_dssp             TSBSSEEEEEESTSS---SEEEEEEEECTTSHHHHHTSST-----------TEEEEEETTEESTTSBHHHHHHHHHHS--
T ss_pred             CCCcCEEEEecCCCC---cCCEEEEEEeCCChHHhcccch-----------hhhhheeCCEeCCCCCHHHHHHHHHCC--
Confidence            457888887643211   2599999999999999999999           999999999999976  4566667663  


Q ss_pred             CCEEEEEEE
Q 013804          409 GDEVIVEVL  417 (436)
Q Consensus       409 g~~v~l~v~  417 (436)
                      +.+++|+|+
T Consensus        73 ~~~v~L~V~   81 (81)
T PF00595_consen   73 SNPVTLTVQ   81 (81)
T ss_dssp             TSEEEEEEE
T ss_pred             CCcEEEEEC
Confidence            348888874


No 38 
>COG3480 SdrC Predicted secreted protein containing a PDZ domain [Signal transduction mechanisms]
Probab=98.37  E-value=1.5e-06  Score=83.40  Aligned_cols=71  Identities=31%  Similarity=0.500  Sum_probs=63.7

Q ss_pred             cceEEEecCCCCcccccCcccccccccCcccCCcEEEEECCEEeCCHHHHHHHHhcCCCCCEEEEEEEE-CCEEEEEEEE
Q 013804          350 SGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQCKVGDEVIVEVLR-GDQKEKIPVK  428 (436)
Q Consensus       350 ~gv~V~~v~~~spa~~agl~~~~~~~~~~l~~GDiIl~vnG~~V~s~~dl~~~l~~~~~g~~v~l~v~R-~g~~~~~~v~  428 (436)
                      .||++..+.+++|+            .++|+.||.|++|||+++.+.+|+..++.+.++|++|++++.| +++...++++
T Consensus       130 ~gvyv~~v~~~~~~------------~gkl~~gD~i~avdg~~f~s~~e~i~~v~~~k~Gd~VtI~~~r~~~~~~~~~~t  197 (342)
T COG3480         130 AGVYVLSVIDNSPF------------KGKLEAGDTIIAVDGEPFTSSDELIDYVSSKKPGDEVTIDYERHNETPEIVTIT  197 (342)
T ss_pred             eeEEEEEccCCcch------------hceeccCCeEEeeCCeecCCHHHHHHHHhccCCCCeEEEEEEeccCCCceEEEE
Confidence            79999999999986            4556669999999999999999999999999999999999997 7888888888


Q ss_pred             eecC
Q 013804          429 LEPK  432 (436)
Q Consensus       429 ~~~~  432 (436)
                      +.+.
T Consensus       198 l~~~  201 (342)
T COG3480         198 LIKN  201 (342)
T ss_pred             EEee
Confidence            8766


No 39 
>PRK10942 serine endoprotease; Provisional
Probab=98.37  E-value=1.1e-06  Score=91.74  Aligned_cols=65  Identities=34%  Similarity=0.539  Sum_probs=59.3

Q ss_pred             cceEEEecCCCCcccccCcccccccccCcccCCcEEEEECCEEeCCHHHHHHHHhcCCCCCEEEEEEEECCEEEEEEE
Q 013804          350 SGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQCKVGDEVIVEVLRGDQKEKIPV  427 (436)
Q Consensus       350 ~gv~V~~v~~~spa~~agl~~~~~~~~~~l~~GDiIl~vnG~~V~s~~dl~~~l~~~~~g~~v~l~v~R~g~~~~~~v  427 (436)
                      .|++|.+|.++++|+++||++           ||+|++|||++|.+++|+.+++.. .+ +.+.|+|+|+|+.+.+.+
T Consensus       408 ~gvvV~~V~~~S~A~~aGL~~-----------GDvIv~VNg~~V~s~~dl~~~l~~-~~-~~v~l~V~R~g~~~~v~~  472 (473)
T PRK10942        408 KGVVVDNVKPGTPAAQIGLKK-----------GDVIIGANQQPVKNIAELRKILDS-KP-SVLALNIQRGDSSIYLLM  472 (473)
T ss_pred             CCeEEEEeCCCChHHHcCCCC-----------CCEEEEECCEEcCCHHHHHHHHHh-CC-CeEEEEEEECCEEEEEEe
Confidence            489999999999999999999           999999999999999999999987 33 789999999999877655


No 40 
>TIGR03279 cyano_FeS_chp putative FeS-containing Cyanobacterial-specific oxidoreductase. Members of this protein family are predicted FeS-containing oxidoreductases of unknown function, apparently restricted to and universal across the Cyanobacteria. The high trusted cutoff score for this model, 700 bits, excludes homologs from other lineages. This exclusion seems justified because a significant number of sequence positions are simultaneously unique to and invariant across the Cyanobacteria, suggesting a specialized, conserved function, perhaps related to photosynthesis. A distantly related protein family, TIGR03278, is universal in and restricted to archaeal methanogens, and may be linked to methanogenesis.
Probab=98.35  E-value=9.5e-07  Score=89.62  Aligned_cols=63  Identities=24%  Similarity=0.362  Sum_probs=54.8

Q ss_pred             EEEecCCCCcccccCcccccccccCcccCCcEEEEECCEEeCCHHHHHHHHhcCCCCCEEEEEEE-ECCEEEEEEEEee
Q 013804          353 LVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQCKVGDEVIVEVL-RGDQKEKIPVKLE  430 (436)
Q Consensus       353 ~V~~v~~~spa~~agl~~~~~~~~~~l~~GDiIl~vnG~~V~s~~dl~~~l~~~~~g~~v~l~v~-R~g~~~~~~v~~~  430 (436)
                      +|..|.++|+|+++||++           ||+|++|||++|.+|.|+..++.    ++.+.++|. |+|+..++++...
T Consensus         1 ~I~~V~pgSpAe~AGLe~-----------GD~IlsING~~V~Dw~D~~~~l~----~e~l~L~V~~rdGe~~~l~Ie~~   64 (433)
T TIGR03279         1 LISAVLPGSIAEELGFEP-----------GDALVSINGVAPRDLIDYQFLCA----DEELELEVLDANGESHQIEIEKD   64 (433)
T ss_pred             CcCCcCCCCHHHHcCCCC-----------CCEEEEECCEECCCHHHHHHHhc----CCcEEEEEEcCCCeEEEEEEecC
Confidence            367889999999999999           99999999999999999988774    366899997 8898888887654


No 41 
>PF14685 Tricorn_PDZ:  Tricorn protease PDZ domain; PDB: 1N6F_D 1N6D_C 1N6E_C 1K32_A.
Probab=98.31  E-value=4.4e-06  Score=66.54  Aligned_cols=79  Identities=29%  Similarity=0.524  Sum_probs=50.8

Q ss_pred             eecceeeecchhhhhcCccceEEEecCCC--------CcccccCcccccccccCcccCCcEEEEECCEEeCCHHHHHHHH
Q 013804          332 PILGIKFAPDQSVEQLGVSGVLVLDAPPN--------GPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRIL  403 (436)
Q Consensus       332 ~~lGv~~~~~~~~~~~g~~gv~V~~v~~~--------spa~~agl~~~~~~~~~~l~~GDiIl~vnG~~V~s~~dl~~~l  403 (436)
                      +.||..+....       .+..|..+.++        ||-.+.|+.         +++||+|++|||++|..-.++..+|
T Consensus         1 G~LGAd~~~~~-------~~y~I~~I~~gd~~~~~~~sPL~~pGv~---------v~~GD~I~aInG~~v~~~~~~~~lL   64 (88)
T PF14685_consen    1 GLLGADFSYDN-------GGYRIARIYPGDPWNPNARSPLAQPGVD---------VREGDYILAINGQPVTADANPYRLL   64 (88)
T ss_dssp             -B-SEEEEEET-------TEEEEEEE-BS-TTSSS-B-GGGGGS-------------TT-EEEEETTEE-BTTB-HHHHH
T ss_pred             CccceEEEEcC-------CEEEEEEEeCCCCCCccccCCccCCCCC---------CCCCCEEEEECCEECCCCCCHHHHh
Confidence            35777776543       56678887765        444444443         4669999999999999999999999


Q ss_pred             hcCCCCCEEEEEEEECC-EEEEEEE
Q 013804          404 DQCKVGDEVIVEVLRGD-QKEKIPV  427 (436)
Q Consensus       404 ~~~~~g~~v~l~v~R~g-~~~~~~v  427 (436)
                      .. +.|+.|.|+|.+.+ +.+++.|
T Consensus        65 ~~-~agk~V~Ltv~~~~~~~R~v~V   88 (88)
T PF14685_consen   65 EG-KAGKQVLLTVNRKPGGARTVVV   88 (88)
T ss_dssp             HT-TTTSEEEEEEE-STT-EEEEEE
T ss_pred             cc-cCCCEEEEEEecCCCCceEEEC
Confidence            98 68999999999965 4555543


No 42 
>COG0793 Prc Periplasmic protease [Cell envelope biogenesis, outer membrane]
Probab=98.24  E-value=3.4e-06  Score=86.29  Aligned_cols=80  Identities=26%  Similarity=0.468  Sum_probs=65.6

Q ss_pred             cceecceeeecchhhhhcCccceEEEecCCCCcccccCcccccccccCcccCCcEEEEECCEEeCCH--HHHHHHHhcCC
Q 013804          330 TRPILGIKFAPDQSVEQLGVSGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNG--SDLYRILDQCK  407 (436)
Q Consensus       330 ~~~~lGv~~~~~~~~~~~g~~gv~V~~v~~~spa~~agl~~~~~~~~~~l~~GDiIl~vnG~~V~s~--~dl~~~l~~~~  407 (436)
                      .+..+|+++...+.      .++.|.++.+++||+++|+++           ||+|++|||+++...  +++.+.+.. +
T Consensus        98 ~~~GiG~~i~~~~~------~~~~V~s~~~~~PA~kagi~~-----------GD~I~~IdG~~~~~~~~~~av~~irG-~  159 (406)
T COG0793          98 EFGGIGIELQMEDI------GGVKVVSPIDGSPAAKAGIKP-----------GDVIIKIDGKSVGGVSLDEAVKLIRG-K  159 (406)
T ss_pred             cccceeEEEEEecC------CCcEEEecCCCChHHHcCCCC-----------CCEEEEECCEEccCCCHHHHHHHhCC-C
Confidence            56889999876432      688999999999999999999           999999999999976  457777776 6


Q ss_pred             CCCEEEEEEEECC--EEEEEEE
Q 013804          408 VGDEVIVEVLRGD--QKEKIPV  427 (436)
Q Consensus       408 ~g~~v~l~v~R~g--~~~~~~v  427 (436)
                      +|..|+|+|.|.+  +..++++
T Consensus       160 ~Gt~V~L~i~r~~~~k~~~v~l  181 (406)
T COG0793         160 PGTKVTLTILRAGGGKPFTVTL  181 (406)
T ss_pred             CCCeEEEEEEEcCCCceeEEEE
Confidence            8999999999974  3444443


No 43 
>TIGR00054 RIP metalloprotease RseP. A model that detects fragments also matches a number of members of the peptidase family S2C. The region of match appears not to overlap the active site domain.
Probab=98.23  E-value=1.7e-06  Score=89.31  Aligned_cols=66  Identities=29%  Similarity=0.322  Sum_probs=58.7

Q ss_pred             cceEEEecCCCCcccccCcccccccccCcccCCcEEEEECCEEeCCHHHHHHHHhcCCCCCEEEEEEEECCEEEEEEEE
Q 013804          350 SGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQCKVGDEVIVEVLRGDQKEKIPVK  428 (436)
Q Consensus       350 ~gv~V~~v~~~spa~~agl~~~~~~~~~~l~~GDiIl~vnG~~V~s~~dl~~~l~~~~~g~~v~l~v~R~g~~~~~~v~  428 (436)
                      .|++|.+|.++|||+++||++           ||+|++|||+++.+++|+.+.+....  +++.+++.|+++..+++++
T Consensus       128 ~g~~V~~V~~~SpA~~AGL~~-----------GDvI~~vng~~v~~~~dl~~~ia~~~--~~v~~~I~r~g~~~~l~v~  193 (420)
T TIGR00054       128 VGPVIELLDKNSIALEAGIEP-----------GDEILSVNGNKIPGFKDVRQQIADIA--GEPMVEILAERENWTFEVM  193 (420)
T ss_pred             CCceeeccCCCCHHHHcCCCC-----------CCEEEEECCEEcCCHHHHHHHHHhhc--ccceEEEEEecCceEeccc
Confidence            578999999999999999999           99999999999999999999988754  6789999998877665443


No 44 
>COG5640 Secreted trypsin-like serine protease [Posttranslational modification, protein turnover, chaperones]
Probab=98.00  E-value=0.00024  Score=69.56  Aligned_cols=55  Identities=18%  Similarity=0.239  Sum_probs=39.5

Q ss_pred             cccCCCCCCCeEECC--CC-cEEEEEeeeecCCCCCCcceeeeeeeccchhhhhcccc
Q 013804          272 AAINPGNSGGPLLDS--SG-SLIGINTAIYSPSGASSGVGFSIPVDTVNGIVDQLVKF  326 (436)
Q Consensus       272 ~~i~~G~SGGPlvd~--~G-~VVGI~s~~~~~~~~~~~~~~aIP~~~i~~~l~~l~~~  326 (436)
                      ...|.|+||||+|-.  +| .-+||++|+...|+...-.+...-++....|++..++.
T Consensus       223 ~daCqGDSGGPi~~~g~~G~vQ~GVvSwG~~~Cg~t~~~gVyT~vsny~~WI~a~~~~  280 (413)
T COG5640         223 KDACQGDSGGPIFHKGEEGRVQRGVVSWGDGGCGGTLIPGVYTNVSNYQDWIAAMTNG  280 (413)
T ss_pred             cccccCCCCCceEEeCCCccEEEeEEEecCCCCCCCCcceeEEehhHHHHHHHHHhcC
Confidence            567899999999843  45 47999999988877544344444567777888775543


No 45 
>PRK09681 putative type II secretion protein GspC; Provisional
Probab=98.00  E-value=1.6e-05  Score=76.31  Aligned_cols=57  Identities=21%  Similarity=0.367  Sum_probs=52.1

Q ss_pred             cccccCcccccccccCcccCCcEEEEECCEEeCCHHHHHHHHhcCCCCCEEEEEEEECCEEEEEEEEe
Q 013804          362 PAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQCKVGDEVIVEVLRGDQKEKIPVKL  429 (436)
Q Consensus       362 pa~~agl~~~~~~~~~~l~~GDiIl~vnG~~V~s~~dl~~~l~~~~~g~~v~l~v~R~g~~~~~~v~~  429 (436)
                      --+++||++           ||++++|||.++++.++..++++.......++|+|+|||+.+++.+.+
T Consensus       219 lF~~~GLq~-----------GDva~sING~dL~D~~qa~~l~~~L~~~tei~ltVeRdGq~~~i~i~l  275 (276)
T PRK09681        219 LFDASGFKE-----------GDIAIALNQQDFTDPRAMIALMRQLPSMDSIQLTVLRKGARHDISIAL  275 (276)
T ss_pred             HHHHcCCCC-----------CCEEEEeCCeeCCCHHHHHHHHHHhccCCeEEEEEEECCEEEEEEEEc
Confidence            357789999           999999999999999999999998888899999999999999988765


No 46 
>COG3975 Predicted protease with the C-terminal PDZ domain [General function prediction only]
Probab=97.95  E-value=1.2e-05  Score=82.33  Aligned_cols=64  Identities=38%  Similarity=0.473  Sum_probs=56.3

Q ss_pred             cceEEEecCCCCcccccCcccccccccCcccCCcEEEEECCEEeCCHHHHHHHHhcCCCCCEEEEEEEECCEEEEEEEEe
Q 013804          350 SGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQCKVGDEVIVEVLRGDQKEKIPVKL  429 (436)
Q Consensus       350 ~gv~V~~v~~~spa~~agl~~~~~~~~~~l~~GDiIl~vnG~~V~s~~dl~~~l~~~~~g~~v~l~v~R~g~~~~~~v~~  429 (436)
                      ++.+|..|.++|||++|||.+           ||.|++|||.        ...+..++.++.|++++.|.|+.+++.+++
T Consensus       462 g~~~i~~V~~~gPA~~AGl~~-----------Gd~ivai~G~--------s~~l~~~~~~d~i~v~~~~~~~L~e~~v~~  522 (558)
T COG3975         462 GHEKITFVFPGGPAYKAGLSP-----------GDKIVAINGI--------SDQLDRYKVNDKIQVHVFREGRLREFLVKL  522 (558)
T ss_pred             CeeEEEecCCCChhHhccCCC-----------ccEEEEEcCc--------cccccccccccceEEEEccCCceEEeeccc
Confidence            678999999999999999999           9999999998        334556788999999999999999998877


Q ss_pred             ecC
Q 013804          430 EPK  432 (436)
Q Consensus       430 ~~~  432 (436)
                      ...
T Consensus       523 ~~~  525 (558)
T COG3975         523 GGD  525 (558)
T ss_pred             CCC
Confidence            543


No 47 
>KOG3129 consensus 26S proteasome regulatory complex, subunit PSMD9 [Posttranslational modification, protein turnover, chaperones]
Probab=97.94  E-value=2.6e-05  Score=70.63  Aligned_cols=74  Identities=27%  Similarity=0.248  Sum_probs=61.8

Q ss_pred             ceEEEecCCCCcccccCcccccccccCcccCCcEEEEECCEEeCCHHHHHHH--HhcCCCCCEEEEEEEECCEEEEEEEE
Q 013804          351 GVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRI--LDQCKVGDEVIVEVLRGDQKEKIPVK  428 (436)
Q Consensus       351 gv~V~~v~~~spa~~agl~~~~~~~~~~l~~GDiIl~vnG~~V~s~~dl~~~--l~~~~~g~~v~l~v~R~g~~~~~~v~  428 (436)
                      =++|..|.++|||+.+||+.           ||.|+++....-.++..|..+  +.+...++.+.++|.|.|+...+.++
T Consensus       140 Fa~V~sV~~~SPA~~aGl~~-----------gD~il~fGnV~sgn~~~lq~i~~~v~~~e~~~v~v~v~R~g~~v~L~lt  208 (231)
T KOG3129|consen  140 FAVVDSVVPGSPADEAGLCV-----------GDEILKFGNVHSGNFLPLQNIAAVVQSNEDQIVSVTVIREGQKVVLSLT  208 (231)
T ss_pred             eEEEeecCCCChhhhhCccc-----------CceEEEecccccccchhHHHHHHHHHhccCcceeEEEecCCCEEEEEeC
Confidence            35799999999999999999           999999988777776655442  33446789999999999999999999


Q ss_pred             eecCCCC
Q 013804          429 LEPKPDE  435 (436)
Q Consensus       429 ~~~~~~~  435 (436)
                      +..|...
T Consensus       209 P~~W~Gr  215 (231)
T KOG3129|consen  209 PKKWQGR  215 (231)
T ss_pred             cccccCC
Confidence            9988653


No 48 
>PRK11186 carboxy-terminal protease; Provisional
Probab=97.90  E-value=3.7e-05  Score=82.91  Aligned_cols=79  Identities=23%  Similarity=0.354  Sum_probs=60.5

Q ss_pred             ceecceeeecchhhhhcCccceEEEecCCCCccccc-CcccccccccCcccCCcEEEEEC--CEEeCC-----HHHHHHH
Q 013804          331 RPILGIKFAPDQSVEQLGVSGVLVLDAPPNGPAGKA-GLLSTKRDAYGRLILGDIITSVN--GKKVSN-----GSDLYRI  402 (436)
Q Consensus       331 ~~~lGv~~~~~~~~~~~g~~gv~V~~v~~~spa~~a-gl~~~~~~~~~~l~~GDiIl~vn--G~~V~s-----~~dl~~~  402 (436)
                      ...+|+.+....       ++++|..|.+++||+++ ||++           ||+|++||  |+++.+     .+++...
T Consensus       243 ~~GIGa~l~~~~-------~~~~V~~vipGsPA~ka~gLk~-----------GD~IlaVn~~g~~~~dv~g~~~~~vv~l  304 (667)
T PRK11186        243 LEGIGAVLQMDD-------DYTVINSLVAGGPAAKSKKLSV-----------GDKIVGVGQDGKPIVDVIGWRLDDVVAL  304 (667)
T ss_pred             eeEEEEEEEEeC-------CeEEEEEccCCChHHHhCCCCC-----------CCEEEEECCCCCcccccccCCHHHHHHH
Confidence            356788876543       46899999999999998 9999           99999999  554432     3477778


Q ss_pred             HhcCCCCCEEEEEEEEC---CEEEEEEEE
Q 013804          403 LDQCKVGDEVIVEVLRG---DQKEKIPVK  428 (436)
Q Consensus       403 l~~~~~g~~v~l~v~R~---g~~~~~~v~  428 (436)
                      +.. ..|.+|.|+|.|+   ++..+++++
T Consensus       305 irG-~~Gt~V~LtV~r~~~~~~~~~vtl~  332 (667)
T PRK11186        305 IKG-PKGSKVRLEILPAGKGTKTRIVTLT  332 (667)
T ss_pred             hcC-CCCCEEEEEEEeCCCCCceEEEEEE
Confidence            877 6899999999984   445555543


No 49 
>PF05579 Peptidase_S32:  Equine arteritis virus serine endopeptidase S32;  InterPro: IPR008760 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. 
They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This group of serine peptidases belongs to MEROPS peptidase family S32 (clan PA(S)). The type example is equine arteritis virus serine endopeptidase (equine arteritis virus), which is involved in processing of nidovirus polyproteins [].; GO: 0004252 serine-type endopeptidase activity, 0016032 viral reproduction, 0019082 viral protein processing; PDB: 3FAN_A 3FAO_A 1MBM_A.
Probab=97.81  E-value=0.00026  Score=66.73  Aligned_cols=117  Identities=26%  Similarity=0.366  Sum_probs=63.0

Q ss_pred             CeEEEEEEEcCCC--EEEecccccCCCCeEEEEecCCcEEeeEEEEEcCCCCeEEEEEcCCCCCCcceecCCCCCCCCCC
Q 013804          151 QGSGSGFVWDSKG--HVVTNYHVIRGASDIRVTFADQSAYDAKIVGFDQDKDVAVLRIDAPKDKLRPIPIGVSADLLVGQ  228 (436)
Q Consensus       151 ~~~GSGfiI~~~G--~ILT~aHvv~~~~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkv~~~~~~~~~~~l~~~~~~~~G~  228 (436)
                      ...|||=+...+|  .|+|+.||+. .+...|.. .+....   ..+...-|+|.-.++.-...+|.++++..   ..|.
T Consensus       111 ss~Gsggvft~~~~~vvvTAtHVlg-~~~a~v~~-~g~~~~---~tF~~~GDfA~~~~~~~~G~~P~~k~a~~---~~Gr  182 (297)
T PF05579_consen  111 SSVGSGGVFTIGGNTVVVTATHVLG-GNTARVSG-VGTRRM---LTFKKNGDFAEADITNWPGAAPKYKFAQN---YTGR  182 (297)
T ss_dssp             SSEEEEEEEECTTEEEEEEEHHHCB-TTEEEEEE-TTEEEE---EEEEEETTEEEEEETTS-S---B--B-TT----SEE
T ss_pred             ecccccceEEECCeEEEEEEEEEcC-CCeEEEEe-cceEEE---EEEeccCcEEEEECCCCCCCCCceeecCC---cccc
Confidence            4455555555444  4999999998 44444444 333222   23445679999999543346777777522   2333


Q ss_pred             EEEEEecCCCCCCceeEeEEeeeeeeeccCCCCCCcccEEEEccccCCCCCCCeEECCCCcEEEEEeeee
Q 013804          229 KVYAIGNPFGLDHTLTTGVISGLRREISSAATGRPIQDVIQTDAAINPGNSGGPLLDSSGSLIGINTAIY  298 (436)
Q Consensus       229 ~V~~vG~p~g~~~~~~~G~vs~~~~~~~~~~~~~~~~~~i~~~~~i~~G~SGGPlvd~~G~VVGI~s~~~  298 (436)
                      --+.-      ...+..|.|..-              ..+.   -..+|+||+|+++.+|.+||||+...
T Consensus       183 AyW~t------~tGvE~G~ig~~--------------~~~~---fT~~GDSGSPVVt~dg~liGVHTGSn  229 (297)
T PF05579_consen  183 AYWLT------STGVEPGFIGGG--------------GAVC---FTGPGDSGSPVVTEDGDLIGVHTGSN  229 (297)
T ss_dssp             EEEEE------TTEEEEEEEETT--------------EEEE---SS-GGCTT-EEEETTC-EEEEEEEEE
T ss_pred             eEEEc------ccCcccceecCc--------------eEEE---EcCCCCCCCccCcCCCCEEEEEecCC
Confidence            22221      223445555311              1122   23479999999999999999999764


No 50 
>PF04495 GRASP55_65:  GRASP55/65 PDZ-like domain ;  InterPro: IPR007583 GRASP55 (Golgi reassembly stacking protein of 55 kDa) and GRASP65 (a 65 kDa protein) are highly homologous. GRASP55 is a component of the Golgi stacking machinery. GRASP65 is an N-ethylmaleimide-sensitive membrane protein required for the stacking of Golgi cisternae in a cell-free system [].; PDB: 3RLE_A 4EDJ_A.
Probab=97.78  E-value=5.9e-05  Score=65.49  Aligned_cols=87  Identities=24%  Similarity=0.389  Sum_probs=59.0

Q ss_pred             eecceeeecchhhhhcCccceEEEecCCCCcccccCcccccccccCcccCCcEEEEECCEEeCCHHHHHHHHhcCCCCCE
Q 013804          332 PILGIKFAPDQSVEQLGVSGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQCKVGDE  411 (436)
Q Consensus       332 ~~lGv~~~~~~~~~~~g~~gv~V~~v~~~spa~~agl~~~~~~~~~~l~~GDiIl~vnG~~V~s~~dl~~~l~~~~~g~~  411 (436)
                      +.||+.++-.... .-...+.-|.+|.++|||++|||++          ..|.|+.+|+....+.++|.+.++. ..++.
T Consensus        26 g~LG~sv~~~~~~-~~~~~~~~Vl~V~p~SPA~~AGL~p----------~~DyIig~~~~~l~~~~~l~~~v~~-~~~~~   93 (138)
T PF04495_consen   26 GLLGISVRFESFE-GAEEEGWHVLRVAPNSPAAKAGLEP----------FFDYIIGIDGGLLDDEDDLFELVEA-NENKP   93 (138)
T ss_dssp             SSS-EEEEEEE-T-TGCCCEEEEEEE-TTSHHHHTT--T----------TTEEEEEETTCE--STCHHHHHHHH-TTTS-
T ss_pred             CCCcEEEEEeccc-ccccceEEEeEecCCCHHHHCCccc----------cccEEEEccceecCCHHHHHHHHHH-cCCCc
Confidence            6788877643211 0112578899999999999999998          2599999999999999999999988 57889


Q ss_pred             EEEEEEEC--CEEEEEEEEee
Q 013804          412 VIVEVLRG--DQKEKIPVKLE  430 (436)
Q Consensus       412 v~l~v~R~--g~~~~~~v~~~  430 (436)
                      +.+.|...  +..+++++++.
T Consensus        94 l~L~Vyns~~~~vR~V~i~P~  114 (138)
T PF04495_consen   94 LQLYVYNSKTDSVREVTITPS  114 (138)
T ss_dssp             EEEEEEETTTTCEEEEEE---
T ss_pred             EEEEEEECCCCeEEEEEEEcC
Confidence            99999873  45566666664


No 51 
>COG3031 PulC Type II secretory pathway, component PulC [Intracellular trafficking and secretion]
Probab=97.27  E-value=0.00056  Score=63.41  Aligned_cols=59  Identities=24%  Similarity=0.424  Sum_probs=51.9

Q ss_pred             CCCcccccCcccccccccCcccCCcEEEEECCEEeCCHHHHHHHHhcCCCCCEEEEEEEECCEEEEEEEE
Q 013804          359 PNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQCKVGDEVIVEVLRGDQKEKIPVK  428 (436)
Q Consensus       359 ~~spa~~agl~~~~~~~~~~l~~GDiIl~vnG~~V~s~~dl~~~l~~~~~g~~v~l~v~R~g~~~~~~v~  428 (436)
                      +.+--+..||++           ||+.+++|+..+++.+++..+++....-+.++++|+|+|+.+.+.|.
T Consensus       216 d~slF~~sglq~-----------GDIavaiNnldltdp~~m~~llq~l~~m~s~qlTv~R~G~rhdInV~  274 (275)
T COG3031         216 DGSLFYKSGLQR-----------GDIAVAINNLDLTDPEDMFRLLQMLRNMPSLQLTVIRRGKRHDINVR  274 (275)
T ss_pred             CcchhhhhcCCC-----------cceEEEecCcccCCHHHHHHHHHhhhcCcceEEEEEecCccceeeec
Confidence            445566778998           99999999999999999999999877778899999999999988875


No 52 
>KOG3553 consensus Tax interaction protein TIP1 [Cell wall/membrane/envelope biogenesis]
Probab=96.88  E-value=0.0008  Score=53.85  Aligned_cols=34  Identities=35%  Similarity=0.442  Sum_probs=31.7

Q ss_pred             cceEEEecCCCCcccccCcccccccccCcccCCcEEEEECCEEeC
Q 013804          350 SGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVS  394 (436)
Q Consensus       350 ~gv~V~~v~~~spa~~agl~~~~~~~~~~l~~GDiIl~vnG~~V~  394 (436)
                      .|++|+.|.++|||+.|||+.           +|.|+.+||...+
T Consensus        59 ~GiYvT~V~eGsPA~~AGLri-----------hDKIlQvNG~DfT   92 (124)
T KOG3553|consen   59 KGIYVTRVSEGSPAEIAGLRI-----------HDKILQVNGWDFT   92 (124)
T ss_pred             ccEEEEEeccCChhhhhccee-----------cceEEEecCceeE
Confidence            799999999999999999999           9999999996544


No 53 
>PF00548 Peptidase_C3:  3C cysteine protease (picornain 3C);  InterPro: IPR000199 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. 
Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad [].  This signature defines cysteine peptidases that belong to MEROPS peptidase family C3 (picornain, clan PA(C)), subfamilies C3A and C3B. The protein fold of this peptidase domain for members of this family resembles that of the serine peptidase, chymotrypsin [], the type example for clan PA. Picornaviral proteins are expressed as a single polyprotein which is cleaved by the viral C3 cysteine protease. The poliovirus polyprotein is selectively cleaved at the Gln-|-Gly bond. In other picornavirus reactions Glu may be substituted for Gln, and Ser or Thr for Gly. ; GO: 0004197 cysteine-type endopeptidase activity, 0006508 proteolysis; PDB: 3SJO_E 2H6M_A 1QA7_C 1HAV_B 2HAL_A 2H9H_A 3QZQ_B 3QZR_A 3R0F_B 3SJ9_A ....
Probab=96.68  E-value=0.044  Score=49.48  Aligned_cols=137  Identities=18%  Similarity=0.271  Sum_probs=77.2

Q ss_pred             CeEEEEEEEcCCCEEEecccccCCCCeEEEEecCCcEEeeE--EEEEcC---CCCeEEEEEcCCCCCCccee-cCCCCCC
Q 013804          151 QGSGSGFVWDSKGHVVTNYHVIRGASDIRVTFADQSAYDAK--IVGFDQ---DKDVAVLRIDAPKDKLRPIP-IGVSADL  224 (436)
Q Consensus       151 ~~~GSGfiI~~~G~ILT~aHvv~~~~~i~V~~~dg~~~~a~--vv~~d~---~~DlAlLkv~~~~~~~~~~~-l~~~~~~  224 (436)
                      ...++++.|-.+ ++|-..| -.....+.+   +|+.++..  +...+.   ..|+++++++... +++-+. +-.....
T Consensus        24 ~~t~l~~gi~~~-~~lvp~H-~~~~~~i~i---~g~~~~~~d~~~lv~~~~~~~Dl~~v~l~~~~-kfrDIrk~~~~~~~   97 (172)
T PF00548_consen   24 EFTMLALGIYDR-YFLVPTH-EEPEDTIYI---DGVEYKVDDSVVLVDRDGVDTDLTLVKLPRNP-KFRDIRKFFPESIP   97 (172)
T ss_dssp             EEEEEEEEEEBT-EEEEEGG-GGGCSEEEE---TTEEEEEEEEEEEEETTSSEEEEEEEEEESSS--B--GGGGSBSSGG
T ss_pred             eEEEecceEeee-EEEEECc-CCCcEEEEE---CCEEEEeeeeEEEecCCCcceeEEEEEccCCc-ccCchhhhhccccc
Confidence            557888888765 9999999 222233333   45555433  222333   4599999997643 332221 1111112


Q ss_pred             CCCCEEEEEecCCCCCC-ceeEeEEeeeeeeeccCCCCCCcccEEEEccccCCCCCCCeEEC---CCCcEEEEEeee
Q 013804          225 LVGQKVYAIGNPFGLDH-TLTTGVISGLRREISSAATGRPIQDVIQTDAAINPGNSGGPLLD---SSGSLIGINTAI  297 (436)
Q Consensus       225 ~~G~~V~~vG~p~g~~~-~~~~G~vs~~~~~~~~~~~~~~~~~~i~~~~~i~~G~SGGPlvd---~~G~VVGI~s~~  297 (436)
                      ...+...++-.. .... ....+.+...... ..  .+......+.++++..+|+-||||+.   ..++++|||.++
T Consensus        98 ~~~~~~l~v~~~-~~~~~~~~v~~v~~~~~i-~~--~g~~~~~~~~Y~~~t~~G~CG~~l~~~~~~~~~i~GiHvaG  170 (172)
T PF00548_consen   98 EYPECVLLVNST-KFPRMIVEVGFVTNFGFI-NL--SGTTTPRSLKYKAPTKPGMCGSPLVSRIGGQGKIIGIHVAG  170 (172)
T ss_dssp             TEEEEEEEEESS-SSTCEEEEEEEEEEEEEE-EE--TTEEEEEEEEEESEEETTGTTEEEEESCGGTTEEEEEEEEE
T ss_pred             cCCCcEEEEECC-CCccEEEEEEEEeecCcc-cc--CCCEeeEEEEEccCCCCCccCCeEEEeeccCccEEEEEecc
Confidence            334444444322 2222 3334444433332 11  23334577888999999999999994   267999999986


No 54 
>PF12812 PDZ_1:  PDZ-like domain
Probab=96.66  E-value=0.0045  Score=48.29  Aligned_cols=64  Identities=28%  Similarity=0.390  Sum_probs=50.1

Q ss_pred             eecceeeec--chhhhhcCc-cceEEEecCCCCcccccCcccccccccCcccCCcEEEEECCEEeCCHHHHHHHHhcC
Q 013804          332 PILGIKFAP--DQSVEQLGV-SGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQC  406 (436)
Q Consensus       332 ~~lGv~~~~--~~~~~~~g~-~gv~V~~v~~~spa~~agl~~~~~~~~~~l~~GDiIl~vnG~~V~s~~dl~~~l~~~  406 (436)
                      -|.|-.+.+  ...++.+++ -|+++.....++++..-|+..           |-+|++|||+++.+.++|.+.+++.
T Consensus         9 ~~~Ga~f~~Ls~q~aR~~~~~~~gv~v~~~~g~~~~~~~i~~-----------g~iI~~Vn~kpt~~Ld~f~~vvk~i   75 (78)
T PF12812_consen    9 EVCGAVFHDLSYQQARQYGIPVGGVYVAVSGGSLAFAGGISK-----------GFIITSVNGKPTPDLDDFIKVVKKI   75 (78)
T ss_pred             EEcCeecccCCHHHHHHhCCCCCEEEEEecCCChhhhCCCCC-----------CeEEEeECCcCCcCHHHHHHHHHhC
Confidence            367777776  345777775 345555667888887766888           9999999999999999999999874


No 55 
>PF05580 Peptidase_S55:  SpoIVB peptidase S55;  InterPro: IPR008763 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This group of serine peptidases belongs to the MEROPS peptidase family S55 (SpoIVB peptidase family, clan PA(S)). The protein SpoIVB plays a key role in signalling in the final sigma-K checkpoint of Bacillus subtilis [, ].
Probab=96.65  E-value=0.031  Score=51.66  Aligned_cols=166  Identities=18%  Similarity=0.249  Sum_probs=86.5

Q ss_pred             ccccCcCeEEEEEEEcCC-CEEEecccccCCCCe-EEEEecCCcEEeeEEEEEcCC----------------CCeEEEEE
Q 013804          145 DVLEVPQGSGSGFVWDSK-GHVVTNYHVIRGASD-IRVTFADQSAYDAKIVGFDQD----------------KDVAVLRI  206 (436)
Q Consensus       145 ~~~~~~~~~GSGfiI~~~-G~ILT~aHvv~~~~~-i~V~~~dg~~~~a~vv~~d~~----------------~DlAlLkv  206 (436)
                      |......+.||=.+++++ +..--=.|.+.+.+. ..+.+.+|+.+++++....+.                .-+.-+.-
T Consensus        13 wVRD~~aGiGTlTf~dp~~~~fgALGH~I~D~dt~~~~~i~~G~I~~a~I~~I~kg~~G~PGe~~G~~~~~~~~~G~I~~   92 (218)
T PF05580_consen   13 WVRDSTAGIGTLTFYDPETGTFGALGHGISDVDTGQLIPIKNGEIYEASITSIKKGKKGQPGEKIGVFDNESNILGTIEK   92 (218)
T ss_pred             EEEeCCcCeEEEEEEECCCCcEEecCCeEEcCCCCceeEecCCEEEEEEEEEEecCCCcCCceEEEEECCCCceEEEEEe
Confidence            334445788999999874 566666899887664 456667888888777655421                11222222


Q ss_pred             cCC--------------CCCCcceecCCCCCCCCCCEEEEEecCCCCCCceeEeEEeeeeeeeccCCCCCC----cccEE
Q 013804          207 DAP--------------KDKLRPIPIGVSADLLVGQKVYAIGNPFGLDHTLTTGVISGLRREISSAATGRP----IQDVI  268 (436)
Q Consensus       207 ~~~--------------~~~~~~~~l~~~~~~~~G~~V~~vG~p~g~~~~~~~G~vs~~~~~~~~~~~~~~----~~~~i  268 (436)
                      +..              ....++++++...+++.|..-+.--. .+.....-.-.|..+.+.......+..    ...++
T Consensus        93 Nt~~GI~G~~~~~~~~~~~~~~~~pva~~~evk~G~A~i~Tv~-~G~~ie~f~ieI~~v~~~~~~~~k~~vi~vtd~~Ll  171 (218)
T PF05580_consen   93 NTQFGIYGTLDQDDISNPSYNEPIPVAPKQEVKPGPAYILTVI-DGTKIEEFDIEIEKVLPQSSPSGKGMVIKVTDPRLL  171 (218)
T ss_pred             ccccceeEEeccccccccccCceeEEEEHHHceEccEEEEEEE-cCCeEEEeEEEEEEEccCCCCCCCcEEEEECCcchh
Confidence            111              01234455555555666653221100 111111111112222221110000000    01222


Q ss_pred             EEccccCCCCCCCeEECCCCcEEEEEeeeecCCCCCCcceeeeeeec
Q 013804          269 QTDAAINPGNSGGPLLDSSGSLIGINTAIYSPSGASSGVGFSIPVDT  315 (436)
Q Consensus       269 ~~~~~i~~G~SGGPlvd~~G~VVGI~s~~~~~~~~~~~~~~aIP~~~  315 (436)
                      .....+..||||+|++ .+|++||=++..+.+   +...+|.++++.
T Consensus       172 ~~TGGIvqGMSGSPI~-qdGKLiGAVthvf~~---dp~~Gygi~ie~  214 (218)
T PF05580_consen  172 EKTGGIVQGMSGSPII-QDGKLIGAVTHVFVN---DPTKGYGIFIEW  214 (218)
T ss_pred             hhhCCEEecccCCCEE-ECCEEEEEEEEEEec---CCCceeeecHHH
Confidence            3344577899999999 799999998877653   456788887654


No 56 
>PF03761 DUF316:  Domain of unknown function (DUF316) ;  InterPro: IPR005514 This is a family of uncharacterised proteins from Caenorhabditis elegans.
Probab=96.59  E-value=0.09  Score=51.12  Aligned_cols=91  Identities=20%  Similarity=0.253  Sum_probs=55.1

Q ss_pred             CCCCeEEEEEcCC-CCCCcceecCCCC-CCCCCCEEEEEecCCCCCCceeEeEEeeeeeeeccCCCCCCcccEEEEcccc
Q 013804          197 QDKDVAVLRIDAP-KDKLRPIPIGVSA-DLLVGQKVYAIGNPFGLDHTLTTGVISGLRREISSAATGRPIQDVIQTDAAI  274 (436)
Q Consensus       197 ~~~DlAlLkv~~~-~~~~~~~~l~~~~-~~~~G~~V~~vG~p~g~~~~~~~G~vs~~~~~~~~~~~~~~~~~~i~~~~~i  274 (436)
                      ..++++||+++.+ .....++=|+++. ....|+.+.+.|+..  ........+.-....        .....+......
T Consensus       159 ~~~~~mIlEl~~~~~~~~~~~Cl~~~~~~~~~~~~~~~yg~~~--~~~~~~~~~~i~~~~--------~~~~~~~~~~~~  228 (282)
T PF03761_consen  159 RPYSPMILELEEDFSKNVSPPCLADSSTNWEKGDEVDVYGFNS--TGKLKHRKLKITNCT--------KCAYSICTKQYS  228 (282)
T ss_pred             cccceEEEEEcccccccCCCEEeCCCccccccCceEEEeecCC--CCeEEEEEEEEEEee--------ccceeEeccccc
Confidence            4569999999876 2356677776543 356789999998721  112222222211110        012345556677


Q ss_pred             CCCCCCCeEEC-CCC--cEEEEEeee
Q 013804          275 NPGNSGGPLLD-SSG--SLIGINTAI  297 (436)
Q Consensus       275 ~~G~SGGPlvd-~~G--~VVGI~s~~  297 (436)
                      +.|++|||++. .+|  -||||.+..
T Consensus       229 ~~~d~Gg~lv~~~~gr~tlIGv~~~~  254 (282)
T PF03761_consen  229 CKGDRGGPLVKNINGRWTLIGVGASG  254 (282)
T ss_pred             CCCCccCeEEEEECCCEEEEEEEccC
Confidence            89999999983 344  589987654


No 57 
>KOG3580 consensus Tight junction proteins [Signal transduction mechanisms]
Probab=96.02  E-value=0.0069  Score=62.87  Aligned_cols=58  Identities=28%  Similarity=0.353  Sum_probs=49.7

Q ss_pred             cceEEEecCCCCcccccCcccccccccCcccCCcEEEEECCEEeCCH--HHHHHHHhcCCCCCEEEEEEEE
Q 013804          350 SGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNG--SDLYRILDQCKVGDEVIVEVLR  418 (436)
Q Consensus       350 ~gv~V~~v~~~spa~~agl~~~~~~~~~~l~~GDiIl~vnG~~V~s~--~dl~~~l~~~~~g~~v~l~v~R  418 (436)
                      -|++|.+|.+++||++.||++           ||.|+.||.++..+.  +|...+|....+|+.|+|.-++
T Consensus       429 VGIFVaGvqegspA~~eGlqE-----------GDQIL~VN~vdF~nl~REeAVlfLL~lPkGEevtilaQ~  488 (1027)
T KOG3580|consen  429 VGIFVAGVQEGSPAEQEGLQE-----------GDQILKVNTVDFRNLVREEAVLFLLELPKGEEVTILAQS  488 (1027)
T ss_pred             eeEEEeecccCCchhhccccc-----------cceeEEeccccchhhhHHHHHHHHhcCCCCcEEeehhhh
Confidence            489999999999999999999           999999999988875  4556667778899999886543


No 58 
>PF10459 Peptidase_S46:  Peptidase S46;  InterPro: IPR019500 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This entry represents S46 peptidases, where dipeptidyl-peptidase 7 (DPP-7) is the best-characterised member of this family. It is a serine peptidase that is located on the cell surface and is predicted to have two N-terminal transmembrane domains. 
Probab=95.87  E-value=0.033  Score=60.80  Aligned_cols=22  Identities=36%  Similarity=0.359  Sum_probs=20.3

Q ss_pred             eEEEEEEEcCCCEEEecccccC
Q 013804          152 GSGSGFVWDSKGHVVTNYHVIR  173 (436)
Q Consensus       152 ~~GSGfiI~~~G~ILT~aHvv~  173 (436)
                      +-|||-+|+++|.||||-||..
T Consensus        47 gGCSgsfVS~~GLvlTNHHC~~   68 (698)
T PF10459_consen   47 GGCSGSFVSPDGLVLTNHHCGY   68 (698)
T ss_pred             CceeEEEEcCCceEEecchhhh
Confidence            4699999999999999999975


No 59 
>PF08192 Peptidase_S64:  Peptidase family S64;  InterPro: IPR012985 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This family of fungal proteins is involved in the processing of membrane bound transcription factor Stp1 [] and belongs to MEROPS peptidase family S64 (clan PA). The processing causes the signalling domain of Stp1 to be passed to the nucleus where several permease genes are induced. The permeases are important for uptake of amino acids, and processing of Stp1 only occurs in an amino acid-rich environment. This family is predicted to be distantly related to the trypsin family (MEROPS peptidase family S1) and to have a typical trypsin-like catalytic triad [].
Probab=95.83  E-value=0.056  Score=57.43  Aligned_cols=117  Identities=18%  Similarity=0.397  Sum_probs=71.7

Q ss_pred             CCCCeEEEEEcCCC-------CCC------cceecCC------CCCCCCCCEEEEEecCCCCCCceeEeEEeeeeeeecc
Q 013804          197 QDKDVAVLRIDAPK-------DKL------RPIPIGV------SADLLVGQKVYAIGNPFGLDHTLTTGVISGLRREISS  257 (436)
Q Consensus       197 ~~~DlAlLkv~~~~-------~~~------~~~~l~~------~~~~~~G~~V~~vG~p~g~~~~~~~G~vs~~~~~~~~  257 (436)
                      .-.|+||++++...       +.+      |.+.+.+      ...+..|.+|+-+|.-.+    .+.|.+.++.-....
T Consensus       541 ~LsD~AIIkV~~~~~~~N~LGddi~f~~~dP~l~f~NlyV~~~~~~~~~G~~VfK~GrTTg----yT~G~lNg~klvyw~  616 (695)
T PF08192_consen  541 RLSDWAIIKVNKERKCQNYLGDDIQFNEPDPTLMFQNLYVREVVSNLVPGMEVFKVGRTTG----YTTGILNGIKLVYWA  616 (695)
T ss_pred             cccceEEEEeCCCceecCCCCccccccCCCccccccccchhhhhhccCCCCeEEEecccCC----ccceEecceEEEEec
Confidence            34599999997532       111      1222221      123567999999986644    567777766432211


Q ss_pred             CCCCCC-cccEEEEc----cccCCCCCCCeEECCCCc------EEEEEeeeecCCCCCCcceeeeeeeccchhhhh
Q 013804          258 AATGRP-IQDVIQTD----AAINPGNSGGPLLDSSGS------LIGINTAIYSPSGASSGVGFSIPVDTVNGIVDQ  322 (436)
Q Consensus       258 ~~~~~~-~~~~i~~~----~~i~~G~SGGPlvd~~G~------VVGI~s~~~~~~~~~~~~~~aIP~~~i~~~l~~  322 (436)
                        .+.. ..+++...    .-...|+||+=|++.-+.      |+||.++..+   ....++.+.|+..|.+=+++
T Consensus       617 --dG~i~s~efvV~s~~~~~Fa~~GDSGS~VLtk~~d~~~gLgvvGMlhsydg---e~kqfglftPi~~il~rl~~  687 (695)
T PF08192_consen  617 --DGKIQSSEFVVSSDNNPAFASGGDSGSWVLTKLEDNNKGLGVVGMLHSYDG---EQKQFGLFTPINEILDRLEE  687 (695)
T ss_pred             --CCCeEEEEEEEecCCCccccCCCCcccEEEecccccccCceeeEEeeecCC---ccceeeccCcHHHHHHHHHH
Confidence              1211 13334443    334579999999985344      9999887643   35578999999887766655


No 60 
>KOG3580 consensus Tight junction proteins [Signal transduction mechanisms]
Probab=95.49  E-value=0.026  Score=58.76  Aligned_cols=86  Identities=24%  Similarity=0.388  Sum_probs=61.4

Q ss_pred             eecceeeecchhhhhcCc---cceEEEecCCCCcccccCcccccccccCcccCCcEEEEECCEEeCC--HHHHHHHHhcC
Q 013804          332 PILGIKFAPDQSVEQLGV---SGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSN--GSDLYRILDQC  406 (436)
Q Consensus       332 ~~lGv~~~~~~~~~~~g~---~gv~V~~v~~~spa~~agl~~~~~~~~~~l~~GDiIl~vnG~~V~s--~~dl~~~l~~~  406 (436)
                      +-+++.+.-....++||+   ..++|..+...+-|++          +|.|+.||+|++|||....|  ..|..+++.+.
T Consensus       198 ~p~kv~LvKsR~nEEyGlrLgSqIFvKeit~~gLAar----------dgnlqEGDiiLkINGtvteNmSLtDar~LIEkS  267 (1027)
T KOG3580|consen  198 GPIKVLLVKSRANEEYGLRLGSQIFVKEITRTGLAAR----------DGNLQEGDIILKINGTVTENMSLTDARKLIEKS  267 (1027)
T ss_pred             CcceEEEEeeccchhhcccccchhhhhhhcccchhhc----------cCCcccccEEEEECcEeeccccchhHHHHHHhc
Confidence            345666655555677886   6788888877666654          45555599999999987765  45888888763


Q ss_pred             CCCCEEEEEEEECCEEEEEEEEe
Q 013804          407 KVGDEVIVEVLRGDQKEKIPVKL  429 (436)
Q Consensus       407 ~~g~~v~l~v~R~g~~~~~~v~~  429 (436)
                       . .++++.|+||.+..-+.+..
T Consensus       268 -~-GKL~lvVlRD~~qtLiNiP~  288 (1027)
T KOG3580|consen  268 -R-GKLQLVVLRDSQQTLINIPS  288 (1027)
T ss_pred             -c-CceEEEEEecCCceeeecCC
Confidence             3 45899999997765565543


No 61 
>KOG3532 consensus Predicted protein kinase [General function prediction only]
Probab=95.38  E-value=0.032  Score=58.91  Aligned_cols=58  Identities=22%  Similarity=0.357  Sum_probs=49.3

Q ss_pred             eecceeeecchhhhhcCccceEEEecCCCCcccccCcccccccccCcccCCcEEEEECCEEeCCHHHHHHHHhcC
Q 013804          332 PILGIKFAPDQSVEQLGVSGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQC  406 (436)
Q Consensus       332 ~~lGv~~~~~~~~~~~g~~gv~V~~v~~~spa~~agl~~~~~~~~~~l~~GDiIl~vnG~~V~s~~dl~~~l~~~  406 (436)
                      --+|+.|....      -.-|.|-.|.++++|.++.+++           ||++++|||.+|++..++.+.++..
T Consensus       386 ~~ig~vf~~~~------~~~v~v~tv~~ns~a~k~~~~~-----------gdvlvai~~~pi~s~~q~~~~~~s~  443 (1051)
T KOG3532|consen  386 SPIGLVFDKNT------NRAVKVCTVEDNSLADKAAFKP-----------GDVLVAINNVPIRSERQATRFLQST  443 (1051)
T ss_pred             CceeEEEecCC------ceEEEEEEecCCChhhHhcCCC-----------cceEEEecCccchhHHHHHHHHHhc
Confidence            35777775432      1567899999999999999999           9999999999999999999999884


No 62 
>PF00949 Peptidase_S7:  Peptidase S7, Flavivirus NS3 serine protease ;  InterPro: IPR001850 This signature identifies serine peptidases belonging to MEROPS peptidase family S7 (flavivirin family, clan PA(S)). The protein fold of the peptidase domain for members of this family resembles that of chymotrypsin, the type example for clan PA.  Flaviviruses produce a polyprotein from the ssRNA genome. The N terminus of the NS3 protein (approx. 180 aa) is required for the processing of the polyprotein. NS3 also has conserved homology with NTP-binding proteins and the DEAD family of RNA helicases [, , ].; GO: 0003723 RNA binding, 0003724 RNA helicase activity, 0005524 ATP binding; PDB: 2IJO_B 3E90_D 2GGV_B 2FP7_B 2WV9_A 3U1I_B 3U1J_B 2WZQ_A 2WHX_A 3L6P_A ....
Probab=95.33  E-value=0.025  Score=48.53  Aligned_cols=33  Identities=21%  Similarity=0.424  Sum_probs=23.0

Q ss_pred             EEEccccCCCCCCCeEECCCCcEEEEEeeeecC
Q 013804          268 IQTDAAINPGNSGGPLLDSSGSLIGINTAIYSP  300 (436)
Q Consensus       268 i~~~~~i~~G~SGGPlvd~~G~VVGI~s~~~~~  300 (436)
                      ...+..+.+|.||+|+||.+|++|||.......
T Consensus        88 ~~~~~d~~~GsSGSpi~n~~g~ivGlYg~g~~~  120 (132)
T PF00949_consen   88 GAIDLDFPKGSSGSPIFNQNGEIVGLYGNGVEV  120 (132)
T ss_dssp             EEE---S-TTGTT-EEEETTSCEEEEEEEEEE-
T ss_pred             EeeecccCCCCCCCceEcCCCcEEEEEccceee
Confidence            344555778999999999999999998876553


No 63 
>KOG3209 consensus WW domain-containing protein [General function prediction only]
Probab=95.05  E-value=0.03  Score=59.41  Aligned_cols=56  Identities=27%  Similarity=0.481  Sum_probs=44.0

Q ss_pred             EEecCCCCcccccCcccccccccCcccCCcEEEEECCEEeCCHH--HHHHHHhcCCCCCEEEEEEEECCE
Q 013804          354 VLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGS--DLYRILDQCKVGDEVIVEVLRGDQ  421 (436)
Q Consensus       354 V~~v~~~spa~~agl~~~~~~~~~~l~~GDiIl~vnG~~V~s~~--dl~~~l~~~~~g~~v~l~v~R~g~  421 (436)
                      |..|.++|||++.|          +|++||.|++|||+.|.+..  |+..+++.  .|-+|+|+|.-.++
T Consensus       782 iGrIieGSPAdRCg----------kLkVGDrilAVNG~sI~~lsHadiv~LIKd--aGlsVtLtIip~ee  839 (984)
T KOG3209|consen  782 IGRIIEGSPADRCG----------KLKVGDRILAVNGQSILNLSHADIVSLIKD--AGLSVTLTIIPPEE  839 (984)
T ss_pred             ccccccCChhHhhc----------cccccceEEEecCeeeeccCchhHHHHHHh--cCceEEEEEcChhc
Confidence            66778888888764          44459999999999999764  77777776  68889999876443


No 64 
>PF10459 Peptidase_S46:  Peptidase S46;  InterPro: IPR019500 This entry represents S46 peptidases, where dipeptidyl-peptidase 7 (DPP-7) is the best-characterised member of this family. It is a serine peptidase that is located on the cell surface and is predicted to have two N-terminal transmembrane domains. 
Probab=95.02  E-value=0.018  Score=62.84  Aligned_cols=59  Identities=22%  Similarity=0.273  Sum_probs=39.5

Q ss_pred             ccEEEEccccCCCCCCCeEECCCCcEEEEEeeeecCCC-------CCCcceeeeeeeccchhhhhc
Q 013804          265 QDVIQTDAAINPGNSGGPLLDSSGSLIGINTAIYSPSG-------ASSGVGFSIPVDTVNGIVDQL  323 (436)
Q Consensus       265 ~~~i~~~~~i~~G~SGGPlvd~~G~VVGI~s~~~~~~~-------~~~~~~~aIP~~~i~~~l~~l  323 (436)
                      .-.+.++..|..||||+|++|.+|+|||++.=+.-.+-       ....-+..|-+..|..+++++
T Consensus       621 pv~FlstnDitGGNSGSPvlN~~GeLVGl~FDgn~Esl~~D~~fdp~~~R~I~VDiRyvL~~ldkv  686 (698)
T PF10459_consen  621 PVNFLSTNDITGGNSGSPVLNAKGELVGLAFDGNWESLSGDIAFDPELNRTIHVDIRYVLWALDKV  686 (698)
T ss_pred             eeEEEeccCcCCCCCCCccCCCCceEEEEeecCchhhcccccccccccceeEEEEHHHHHHHHHHH
Confidence            34567888999999999999999999999763211110       111234455566666666654


No 65 
>PF09342 DUF1986:  Domain of unknown function (DUF1986);  InterPro: IPR015420 This domain is found in serine endopeptidases belonging to MEROPS peptidase family S1A (clan PA). It is found in unusual mosaic proteins, which are encoded by the Drosophila nudel gene (see P98159 from SWISSPROT). Nudel is involved in defining embryonic dorsoventral polarity. Three proteases, ndl, gd and snk, process easter to create active easter. Active easter defines cell identities along the dorsal-ventral continuum by activating the spz ligand for the Tl receptor in the ventral region of the embryo. Nudel, pipe and windbeutel together trigger the protease cascade within the extraembryonic perivitelline compartment which induces dorsoventral polarity of the Drosophila embryo [].
Probab=94.66  E-value=0.26  Score=46.37  Aligned_cols=90  Identities=18%  Similarity=0.241  Sum_probs=62.0

Q ss_pred             ccCcCeEEEEEEEcCCCEEEecccccCCCC----eEEEEecCCcEEe------eEEEEEc-----CCCCeEEEEEcCCC-
Q 013804          147 LEVPQGSGSGFVWDSKGHVVTNYHVIRGAS----DIRVTFADQSAYD------AKIVGFD-----QDKDVAVLRIDAPK-  210 (436)
Q Consensus       147 ~~~~~~~GSGfiI~~~G~ILT~aHvv~~~~----~i~V~~~dg~~~~------a~vv~~d-----~~~DlAlLkv~~~~-  210 (436)
                      ..++...|+|++|+++ |+|++..|+.+-.    -+.+.++.++.+.      -++..+|     ++.+++||.++.+. 
T Consensus        23 YvdG~~~CsgvLlD~~-WlLvsssCl~~I~L~~~YvsallG~~Kt~~~v~Gp~EQI~rVD~~~~V~~S~v~LLHL~~~~~  101 (267)
T PF09342_consen   23 YVDGRYWCSGVLLDPH-WLLVSSSCLRGISLSHHYVSALLGGGKTYLSVDGPHEQISRVDCFKDVPESNVLLLHLEQPAN  101 (267)
T ss_pred             EEcCeEEEEEEEeccc-eEEEeccccCCcccccceEEEEecCcceecccCCChheEEEeeeeeeccccceeeeeecCccc
Confidence            3456789999999987 9999999998743    3677777777543      1233333     67899999998764 


Q ss_pred             --CCCcceecCC-CCCCCCCCEEEEEecCC
Q 013804          211 --DKLRPIPIGV-SADLLVGQKVYAIGNPF  237 (436)
Q Consensus       211 --~~~~~~~l~~-~~~~~~G~~V~~vG~p~  237 (436)
                        ..+.|.-+.. ..+....+.++++|.-.
T Consensus       102 fTr~VlP~flp~~~~~~~~~~~CVAVg~d~  131 (267)
T PF09342_consen  102 FTRYVLPTFLPETSNENESDDECVAVGHDD  131 (267)
T ss_pred             ceeeecccccccccCCCCCCCceEEEEccc
Confidence              2344554533 23445566899999653


No 66 
>COG0750 Predicted membrane-associated Zn-dependent proteases 1 [Cell envelope biogenesis, outer membrane]
Probab=94.44  E-value=0.14  Score=52.00  Aligned_cols=56  Identities=34%  Similarity=0.555  Sum_probs=48.6

Q ss_pred             ecCCCCcccccCcccccccccCcccCCcEEEEECCEEeCCHHHHHHHHhcCCCCCE---EEEEEEE-CCEEE
Q 013804          356 DAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQCKVGDE---VIVEVLR-GDQKE  423 (436)
Q Consensus       356 ~v~~~spa~~agl~~~~~~~~~~l~~GDiIl~vnG~~V~s~~dl~~~l~~~~~g~~---v~l~v~R-~g~~~  423 (436)
                      ++..++++..+|+++           ||.|+++|++++.+++++.+.+.. ..+..   +.+.+.| +++.+
T Consensus       135 ~v~~~s~a~~a~l~~-----------Gd~iv~~~~~~i~~~~~~~~~~~~-~~~~~~~~~~i~~~~~~~~~~  194 (375)
T COG0750         135 EVAPKSAAALAGLRP-----------GDRIVAVDGEKVASWDDVRRLLVA-AAGDVFNLLTILVIRLDGEAH  194 (375)
T ss_pred             ecCCCCHHHHcCCCC-----------CCEEEeECCEEccCHHHHHHHHHh-ccCCcccceEEEEEeccceee
Confidence            688999999999999           999999999999999999988876 34555   8899999 77663


No 67 
>KOG3552 consensus FERM domain protein FRM-8 [General function prediction only]
Probab=94.37  E-value=0.061  Score=58.77  Aligned_cols=65  Identities=29%  Similarity=0.491  Sum_probs=49.5

Q ss_pred             eecceeeecchhhhhcCccceEEEecCCCCcccccCcccccccccCcccCCcEEEEECCEEeCC--HHHHHHHHhcCCCC
Q 013804          332 PILGIKFAPDQSVEQLGVSGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSN--GSDLYRILDQCKVG  409 (436)
Q Consensus       332 ~~lGv~~~~~~~~~~~g~~gv~V~~v~~~spa~~agl~~~~~~~~~~l~~GDiIl~vnG~~V~s--~~dl~~~l~~~~~g  409 (436)
                      +.||+-|...        ..|+|..|.+|+|+            .|+|++||.|++|||++|.+  ++-+.+++...  .
T Consensus        65 ~~lGFgfvag--------rPviVr~VT~GGps------------~GKL~PGDQIl~vN~Epv~daprervIdlvRac--e  122 (1298)
T KOG3552|consen   65 ASLGFGFVAG--------RPVIVRFVTEGGPS------------IGKLQPGDQILAVNGEPVKDAPRERVIDLVRAC--E  122 (1298)
T ss_pred             ccccceeecC--------CceEEEEecCCCCc------------cccccCCCeEEEecCcccccccHHHHHHHHHHH--h
Confidence            5666666432        57899999999995            56777799999999999985  56667777663  3


Q ss_pred             CEEEEEEEE
Q 013804          410 DEVIVEVLR  418 (436)
Q Consensus       410 ~~v~l~v~R  418 (436)
                      +.|.|+|.+
T Consensus       123 ~sv~ltV~q  131 (1298)
T KOG3552|consen  123 SSVNLTVCQ  131 (1298)
T ss_pred             hhcceEEec
Confidence            568888876


No 68 
>TIGR02860 spore_IV_B stage IV sporulation protein B. SpoIVB, the stage IV sporulation protein B of endospore-forming bacteria such as Bacillus subtilis, is a serine proteinase, expressed in the spore (rather than mother cell) compartment, that participates in a proteolytic activation cascade for Sigma-K. It appears to be universal among endospore-forming bacteria and occurs nowhere else.
Probab=94.28  E-value=0.44  Score=48.58  Aligned_cols=41  Identities=24%  Similarity=0.552  Sum_probs=30.9

Q ss_pred             ccccCCCCCCCeEECCCCcEEEEEeeeecCCCCCCcceeeeeeec
Q 013804          271 DAAINPGNSGGPLLDSSGSLIGINTAIYSPSGASSGVGFSIPVDT  315 (436)
Q Consensus       271 ~~~i~~G~SGGPlvd~~G~VVGI~s~~~~~~~~~~~~~~aIP~~~  315 (436)
                      ...+..||||+|++ .||++||=++..+-+   ++.-+|+|-++.
T Consensus       354 tgGivqGMSGSPi~-q~gkliGAvtHVfvn---dpt~GYGi~ie~  394 (402)
T TIGR02860       354 TGGIVQGMSGSPII-QNGKVIGAVTHVFVN---DPTSGYGVYIEW  394 (402)
T ss_pred             hCCEEecccCCCEE-ECCEEEEEEEEEEec---CCCcceeehHHH
Confidence            34567899999999 799999988877664   345678875543


No 69 
>KOG3606 consensus Cell polarity protein PAR6 [Signal transduction mechanisms]
Probab=93.92  E-value=0.15  Score=48.32  Aligned_cols=59  Identities=25%  Similarity=0.430  Sum_probs=46.9

Q ss_pred             ccceEEEecCCCCcccccCcccccccccCcccCCcEEEEECCEEeC--CHHHHHHHHhcCCCCCEEEEEEEEC
Q 013804          349 VSGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVS--NGSDLYRILDQCKVGDEVIVEVLRG  419 (436)
Q Consensus       349 ~~gv~V~~v~~~spa~~agl~~~~~~~~~~l~~GDiIl~vnG~~V~--s~~dl~~~l~~~~~g~~v~l~v~R~  419 (436)
                      +.|++|+...+++-|+..||-.          +.|.|++|||.+|.  +.+++..+|-..  ...+-++|.-.
T Consensus       193 vpGIFISRlVpGGLAeSTGLLa----------VnDEVlEVNGIEVaGKTLDQVTDMMvAN--shNLIiTVkPA  253 (358)
T KOG3606|consen  193 VPGIFISRLVPGGLAESTGLLA----------VNDEVLEVNGIEVAGKTLDQVTDMMVAN--SHNLIITVKPA  253 (358)
T ss_pred             cCceEEEeecCCccccccceee----------ecceeEEEcCEEeccccHHHHHHHHhhc--ccceEEEeccc
Confidence            3799999999999999999865          49999999999997  677888877652  24466666543


No 70 
>KOG3209 consensus WW domain-containing protein [General function prediction only]
Probab=93.36  E-value=0.21  Score=53.33  Aligned_cols=58  Identities=29%  Similarity=0.476  Sum_probs=46.6

Q ss_pred             eEEEecCCCCcccccCcccccccccCcccCCcEEEEECCEEeCC--HHHHHHHHhcCCCCCEEEEEEEEC
Q 013804          352 VLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSN--GSDLYRILDQCKVGDEVIVEVLRG  419 (436)
Q Consensus       352 v~V~~v~~~spa~~agl~~~~~~~~~~l~~GDiIl~vnG~~V~s--~~dl~~~l~~~~~g~~v~l~v~R~  419 (436)
                      +.|.+|.+++||++.          |+|+.||+|+.|||.-+.-  -.|.-+.+.....|+.|.|++-|.
T Consensus       373 LqVKsvl~DGPAa~d----------Gkle~GDviV~INg~cvlGhTHAqaV~~fqaiPvg~~V~L~lcRg  432 (984)
T KOG3209|consen  373 LQVKSVLKDGPAAQD----------GKLETGDVIVHINGECVLGHTHAQAVKRFQAIPVGQSVDLVLCRG  432 (984)
T ss_pred             eeeeecccCCchhhc----------CccccCcEEEEECCceeccccHHHHHHHhhccccCCeeeEEEecC
Confidence            458888999999875          4555699999999998874  456777777767899999999884


No 71 
>PF02122 Peptidase_S39:  Peptidase S39;  InterPro: IPR000382 ORF2 of Potato leafroll virus (PLrV) encodes a polyprotein which is translated following a -1 frameshift. The polyprotein has a putative linear arrangement of membrane anchor-VPg-peptidase-polymerase domains. The serine peptidase domain which is found in this group of sequences belongs to MEROPS peptidase family S39 (clan PA(S)). It is likely that the peptidase domain is involved in the cleavage of the polyprotein []. The nucleotide sequence for the RNA of PLrV has been determined [, ]. The sequence contains six large open reading frames (ORFs). The 5' coding region encodes two polypeptides of 28K and 70K, which overlap in different reading frames; it is suggested that the third ORF in the 5' block is translated by frameshift readthrough near the end of the 70K protein, yielding a 118K polypeptide []. Segments of the predicted amino acid sequences of these ORFs resemble those of known viral RNA polymerases, ATP-binding proteins and viral genome-linked proteins. The nucleotide sequence of the genomic RNA of Beet western yellows virus (BWYV) has been determined []. The sequence contains six long ORFs. A cluster of three of these ORFs, including the coat protein cistron, displays extensive amino acid sequence similarity to corresponding ORFs of a second luteovirus: Barley yellow dwarf virus [].; GO: 0004252 serine-type endopeptidase activity, 0022415 viral reproductive process, 0016021 integral to membrane; PDB: 1ZYO_A.
Probab=93.16  E-value=0.59  Score=43.31  Aligned_cols=134  Identities=15%  Similarity=0.181  Sum_probs=48.1

Q ss_pred             EEEecccccCCCCeEEEEecCCcEEee---EEEEEcCCCCeEEEEEcCCC---CCCcceecCCCCCCCCCCEEEEEecCC
Q 013804          164 HVVTNYHVIRGASDIRVTFADQSAYDA---KIVGFDQDKDVAVLRIDAPK---DKLRPIPIGVSADLLVGQKVYAIGNPF  237 (436)
Q Consensus       164 ~ILT~aHvv~~~~~i~V~~~dg~~~~a---~vv~~d~~~DlAlLkv~~~~---~~~~~~~l~~~~~~~~G~~V~~vG~p~  237 (436)
                      .++|+.||..+...+. .+.+|+.++-   +.+..+...|++||++....   .....+.+.....+..|    .+..  
T Consensus        43 ~L~ta~Hv~~~~~~~~-~~k~g~kipl~~f~~~~~~~~~D~~il~~P~n~~s~Lg~k~~~~~~~~~~~~g----~~~~--  115 (203)
T PF02122_consen   43 ALLTARHVWSRPSKVT-SLKTGEKIPLAEFTDLLESRIADFVILRGPPNWESKLGVKAAQLSQNSQLAKG----PVSF--  115 (203)
T ss_dssp             EEEE-HHHHTSSS----EEETTEEEE--S-EEEEE-TTT-EEEEE--HHHHHHHT-----B----SEEEE----ESST--
T ss_pred             ceecccccCCCcccee-EcCCCCcccchhChhhhCCCccCEEEEecCcCHHHHhCcccccccchhhhCCC----Ceee--
Confidence            5999999999855543 3445555442   35556788899999997321   12222333211111000    0100  


Q ss_pred             CCCCceeEeEEeeeeeeeccCCCCCCcccEEEEccccCCCCCCCeEECCCCcEEEEEeeeecCCCCCCcceeeeeee
Q 013804          238 GLDHTLTTGVISGLRREISSAATGRPIQDVIQTDAAINPGNSGGPLLDSSGSLIGINTAIYSPSGASSGVGFSIPVD  314 (436)
Q Consensus       238 g~~~~~~~G~vs~~~~~~~~~~~~~~~~~~i~~~~~i~~G~SGGPlvd~~G~VVGI~s~~~~~~~~~~~~~~aIP~~  314 (436)
                         .....+........+..  .   ...+...-+...+|.||.|+++.+ ++||++... .........++..|+.
T Consensus       116 ---y~~~~~~~~~~sa~i~g--~---~~~~~~vls~T~~G~SGtp~y~g~-~vvGvH~G~-~~~~~~~n~n~~spip  182 (203)
T PF02122_consen  116 ---YGFSSGEWPCSSAKIPG--T---EGKFASVLSNTSPGWSGTPYYSGK-NVVGVHTGS-PSGSNRENNNRMSPIP  182 (203)
T ss_dssp             ---TSEEEEEEEEEE-S---------STTEEEE-----TT-TT-EEE-SS--EEEEEEEE-----------------
T ss_pred             ---eeecCCCceeccCcccc--c---cCcCCceEcCCCCCCCCCCeEECC-CceEeecCc-cccccccccccccccc
Confidence               11122111111111111  1   123556667788999999999887 999999875 2222233445544443


No 72 
>KOG3542 consensus cAMP-regulated guanine nucleotide exchange factor [Signal transduction mechanisms]
Probab=91.94  E-value=0.12  Score=54.71  Aligned_cols=57  Identities=26%  Similarity=0.366  Sum_probs=43.7

Q ss_pred             cceEEEecCCCCcccccCcccccccccCcccCCcEEEEECCEEeCCHHHHHHHHhcCCCCCEEEEEEEE
Q 013804          350 SGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQCKVGDEVIVEVLR  418 (436)
Q Consensus       350 ~gv~V~~v~~~spa~~agl~~~~~~~~~~l~~GDiIl~vnG~~V~s~~dl~~~l~~~~~g~~v~l~v~R  418 (436)
                      -|++|.+|.+++.|++.|++-           ||.|++|||+...+.. +.++..-+..+..+.+++.-
T Consensus       562 fgifV~~V~pgskAa~~GlKR-----------gDqilEVNgQnfenis-~~KA~eiLrnnthLtltvKt  618 (1283)
T KOG3542|consen  562 FGIFVAEVFPGSKAAREGLKR-----------GDQILEVNGQNFENIS-AKKAEEILRNNTHLTLTVKT  618 (1283)
T ss_pred             ceeEEeeecCCchHHHhhhhh-----------hhhhhhccccchhhhh-HHHHHHHhcCCceEEEEEec
Confidence            689999999999999999999           9999999999877654 33333333344566666654


No 73 
>KOG3834 consensus Golgi reassembly stacking protein GRASP65, contains PDZ domain [Intracellular trafficking, secretion, and vesicular transport]
Probab=91.58  E-value=0.38  Score=48.72  Aligned_cols=69  Identities=26%  Similarity=0.399  Sum_probs=51.4

Q ss_pred             cceEEEecCCCCcccccCcccccccccCcccCCcEEEEECCEEeCCHHHHHHHHhcCCCCCEEEEEEEEC--CEEEEEEE
Q 013804          350 SGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQCKVGDEVIVEVLRG--DQKEKIPV  427 (436)
Q Consensus       350 ~gv~V~~v~~~spa~~agl~~~~~~~~~~l~~GDiIl~vnG~~V~s~~dl~~~l~~~~~g~~v~l~v~R~--g~~~~~~v  427 (436)
                      .|.-|.+|.++|++.++||.+          --|-|++|||..+..-+|..+.+.+... ++|+++|.-.  -+.+.++|
T Consensus        15 eg~hvlkVqedSpa~~aglep----------ffdFIvSI~g~rL~~dnd~Lk~llk~~s-ekVkltv~n~kt~~~R~v~I   83 (462)
T KOG3834|consen   15 EGYHVLKVQEDSPAHKAGLEP----------FFDFIVSINGIRLNKDNDTLKALLKANS-EKVKLTVYNSKTQEVRIVEI   83 (462)
T ss_pred             eeEEEEEeecCChHHhcCcch----------hhhhhheeCcccccCchHHHHHHHHhcc-cceEEEEEecccceeEEEEe
Confidence            577788999999999999998          3899999999999987776666665333 3499988753  23344444


Q ss_pred             Ee
Q 013804          428 KL  429 (436)
Q Consensus       428 ~~  429 (436)
                      +.
T Consensus        84 ~p   85 (462)
T KOG3834|consen   84 VP   85 (462)
T ss_pred             cc
Confidence            43


No 74 
>KOG3651 consensus Protein kinase C, alpha binding protein [Signal transduction mechanisms]
Probab=91.38  E-value=0.4  Score=46.27  Aligned_cols=56  Identities=25%  Similarity=0.396  Sum_probs=41.9

Q ss_pred             ceEEEecCCCCcccccCcccccccccCcccCCcEEEEECCEEeCC--HHHHHHHHhcCCCCCEEEEEEEE
Q 013804          351 GVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSN--GSDLYRILDQCKVGDEVIVEVLR  418 (436)
Q Consensus       351 gv~V~~v~~~spa~~agl~~~~~~~~~~l~~GDiIl~vnG~~V~s--~~dl~~~l~~~~~g~~v~l~v~R  418 (436)
                      -++|..|..++||++.|          .++-||.|++|||..|+.  --++-++++..  -.+|++++..
T Consensus        31 ClYiVQvFD~tPAa~dG----------~i~~GDEi~avNg~svKGktKveVAkmIQ~~--~~eV~IhyNK   88 (429)
T KOG3651|consen   31 CLYIVQVFDKTPAAKDG----------RIRCGDEIVAVNGISVKGKTKVEVAKMIQVS--LNEVKIHYNK   88 (429)
T ss_pred             eEEEEEeccCCchhccC----------ccccCCeeEEecceeecCccHHHHHHHHHHh--ccceEEEehh
Confidence            47899999999998755          344499999999999985  44677777763  2457777653


No 75 
>KOG3549 consensus Syntrophins (type gamma) [Extracellular structures]
Probab=91.35  E-value=0.34  Score=47.60  Aligned_cols=56  Identities=32%  Similarity=0.404  Sum_probs=45.9

Q ss_pred             cceEEEecCCCCcccccCcccccccccCcccCCcEEEEECCEEeCCH--HHHHHHHhcCCCCCEEEEEEE
Q 013804          350 SGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNG--SDLYRILDQCKVGDEVIVEVL  417 (436)
Q Consensus       350 ~gv~V~~v~~~spa~~agl~~~~~~~~~~l~~GDiIl~vnG~~V~s~--~dl~~~l~~~~~g~~v~l~v~  417 (436)
                      -.|+|+++.++-.|+..|+-=          .||-|+.|||..|+..  +|+..+|.+  .|+.|+++|.
T Consensus        80 ~PvviSkI~kdQaAd~tG~LF----------vGDAilqvNGi~v~~c~HeevV~iLRN--AGdeVtlTV~  137 (505)
T KOG3549|consen   80 LPVVISKIYKDQAADITGQLF----------VGDAILQVNGIYVTACPHEEVVNILRN--AGDEVTLTVK  137 (505)
T ss_pred             ccEEeehhhhhhhhhhcCceE----------eeeeeEEeccEEeecCChHHHHHHHHh--cCCEEEEEeH
Confidence            368899999998888887542          3999999999999864  578888876  7899999885


No 76 
>PF00944 Peptidase_S3:  Alphavirus core protein ;  InterPro: IPR000930 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. Togavirin, also known as Sindbis virus core endopeptidase, is a serine protease resident at the N terminus of the p130 polyprotein of togaviruses []. The endopeptidase signature identifies the peptidase as belonging to the MEROPS peptidase family S3 (togavirin family, clan PA(S)). The polyprotein also includes structural proteins for the nucleocapsid core and for the glycoprotein spikes []. Togavirin is only active while part of the polyprotein, cleavage at a Trp-Ser bond resulting in total lack of activity []. Mutagenesis studies have identified the location of the His-Asp-Ser catalytic triad, and X-ray studies have revealed the protein fold to be similar to that of chymotrypsin [, ].; GO: 0004252 serine-type endopeptidase activity, 0006508 proteolysis, 0016020 membrane; PDB: 2YEW_D 1EP5_A 3J0C_F 1EP6_C 1WYK_D 1DYL_A 1VCQ_B 1VCP_B 1LD4_D 1KXA_A ....
Probab=90.85  E-value=0.47  Score=40.44  Aligned_cols=29  Identities=31%  Similarity=0.556  Sum_probs=23.8

Q ss_pred             ccccCCCCCCCeEECCCCcEEEEEeeeec
Q 013804          271 DAAINPGNSGGPLLDSSGSLIGINTAIYS  299 (436)
Q Consensus       271 ~~~i~~G~SGGPlvd~~G~VVGI~s~~~~  299 (436)
                      ...-.+|+||-|++|..|+||||+-.+..
T Consensus       100 ~g~g~~GDSGRpi~DNsGrVVaIVLGG~n  128 (158)
T PF00944_consen  100 TGVGKPGDSGRPIFDNSGRVVAIVLGGAN  128 (158)
T ss_dssp             TTS-STTSTTEEEESTTSBEEEEEEEEEE
T ss_pred             cCCCCCCCCCCccCcCCCCEEEEEecCCC
Confidence            44567999999999999999999887643


No 77 
>KOG3550 consensus Receptor targeting protein Lin-7 [Extracellular structures]
Probab=90.82  E-value=0.8  Score=39.78  Aligned_cols=55  Identities=31%  Similarity=0.398  Sum_probs=37.7

Q ss_pred             cceEEEecCCCCcccccCcccccccccCcccCCcEEEEECCEEeCCHH--HHHHHHhcCCCCCEEEEEE
Q 013804          350 SGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGS--DLYRILDQCKVGDEVIVEV  416 (436)
Q Consensus       350 ~gv~V~~v~~~spa~~agl~~~~~~~~~~l~~GDiIl~vnG~~V~s~~--dl~~~l~~~~~g~~v~l~v  416 (436)
                      ..++|+.+.|++.|++-          |.|+.||.+++|||..|..-.  ...++|+. .. ..|++.|
T Consensus       115 spiyisriipggvadrh----------gglkrgdqllsvngvsvege~hekavellka-a~-gsvklvv  171 (207)
T KOG3550|consen  115 SPIYISRIIPGGVADRH----------GGLKRGDQLLSVNGVSVEGEHHEKAVELLKA-AV-GSVKLVV  171 (207)
T ss_pred             CceEEEeecCCcccccc----------CcccccceeEeecceeecchhhHHHHHHHHH-hc-CcEEEEE
Confidence            67999999999998764          334449999999999987532  23334444 23 3466654


No 78 
>PF02395 Peptidase_S6:  Immunoglobulin A1 protease Serine protease Prosite pattern;  InterPro: IPR000710 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. 
They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This group of serine peptidases belongs to the MEROPS peptidase family S6 (clan PA(S)). The type example is the IgA1-specific serine endopeptidase from Neisseria gonorrhoeae []. These peptidases cleave prolyl bonds in the hinge regions of immunoglobulin A heavy chains. Similar specificity is shown by the unrelated family of M26 metalloendopeptidases.; GO: 0004252 serine-type endopeptidase activity, 0006508 proteolysis; PDB: 3SZE_A 3H09_B 3SYJ_A 1WXR_A 3AK5_B.
Probab=90.61  E-value=0.98  Score=50.06  Aligned_cols=65  Identities=17%  Similarity=0.248  Sum_probs=37.2

Q ss_pred             CeEEEEEEEcCCCEEEecccccCCCCeEEEEecC--CcEEeeEEEEEcCCCCeEEEEEcCCCCCCcceec
Q 013804          151 QGSGSGFVWDSKGHVVTNYHVIRGASDIRVTFAD--QSAYDAKIVGFDQDKDVAVLRIDAPKDKLRPIPI  218 (436)
Q Consensus       151 ~~~GSGfiI~~~G~ILT~aHvv~~~~~i~V~~~d--g~~~~a~vv~~d~~~DlAlLkv~~~~~~~~~~~l  218 (436)
                      ...|...+|++. ||+|.+|+..+...  |.|.+  +..|...----++..|+.+-|++.-..+..|+.+
T Consensus        64 ~~~G~aTLigpq-YiVSV~HN~~gy~~--v~FG~~g~~~Y~iV~RNn~~~~Df~~pRLnK~VTEvaP~~~  130 (769)
T PF02395_consen   64 RNKGVATLIGPQ-YIVSVKHNGKGYNS--VSFGNEGQNTYKIVDRNNYPSGDFHMPRLNKFVTEVAPAEM  130 (769)
T ss_dssp             TTTSS-EEEETT-EEEBETTG-TSCCE--ECESCSSTCEEEEEEEEBETTSTEBEEEESS---SS----B
T ss_pred             cCCceEEEecCC-eEEEEEccCCCcCc--eeecccCCceEEEEEccCCCCcccceeecCceEEEEecccc
Confidence            344789999986 99999999966554  44443  4455433223334579999999864334444444


No 79 
>KOG3551 consensus Syntrophins (type beta) [Extracellular structures]
Probab=90.15  E-value=0.42  Score=47.65  Aligned_cols=73  Identities=27%  Similarity=0.388  Sum_probs=51.0

Q ss_pred             ceecceeeecchhhhhcCccceEEEecCCCCcccccC-cccccccccCcccCCcEEEEECCEEeCCH--HHHHHHHhcCC
Q 013804          331 RPILGIKFAPDQSVEQLGVSGVLVLDAPPNGPAGKAG-LLSTKRDAYGRLILGDIITSVNGKKVSNG--SDLYRILDQCK  407 (436)
Q Consensus       331 ~~~lGv~~~~~~~~~~~g~~gv~V~~v~~~spa~~ag-l~~~~~~~~~~l~~GDiIl~vnG~~V~s~--~dl~~~l~~~~  407 (436)
                      -+-|||.+.-....    .-.++|+++.++-.|++++ |..           ||.|++|||....+.  ++..++|+.  
T Consensus        95 ~gGLGISIKGGreN----kMPIlISKIFkGlAADQt~aL~~-----------gDaIlSVNG~dL~~AtHdeAVqaLKr--  157 (506)
T KOG3551|consen   95 AGGLGISIKGGREN----KMPILISKIFKGLAADQTGALFL-----------GDAILSVNGEDLRDATHDEAVQALKR--  157 (506)
T ss_pred             CCcceEEeecCccc----CCceehhHhccccccccccceee-----------ccEEEEecchhhhhcchHHHHHHHHh--
Confidence            46788887743211    1478999999999888875 445           999999999988754  455566665  


Q ss_pred             CCCEEEEE--EEECC
Q 013804          408 VGDEVIVE--VLRGD  420 (436)
Q Consensus       408 ~g~~v~l~--v~R~g  420 (436)
                      .|+.|.++  +.|+-
T Consensus       158 aGkeV~levKy~REv  172 (506)
T KOG3551|consen  158 AGKEVLLEVKYMREV  172 (506)
T ss_pred             hCceeeeeeeeehhc
Confidence            67776555  45643


No 80 
>KOG3571 consensus Dishevelled 3 and related proteins [General function prediction only]
Probab=89.58  E-value=0.63  Score=47.98  Aligned_cols=72  Identities=26%  Similarity=0.377  Sum_probs=46.5

Q ss_pred             eecceeeecchhhhhcCccceEEEecCCCCcccccCcccccccccCcccCCcEEEEECCEEeCCHH--H----HHHHHhc
Q 013804          332 PILGIKFAPDQSVEQLGVSGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGS--D----LYRILDQ  405 (436)
Q Consensus       332 ~~lGv~~~~~~~~~~~g~~gv~V~~v~~~spa~~agl~~~~~~~~~~l~~GDiIl~vnG~~V~s~~--d----l~~~l~~  405 (436)
                      ++||+.+.-..+++  |-.|++|.+|.+++..+.          +|.+.+||.|+.||.....++.  |    |++++.+
T Consensus       261 nfLGiSivgqsn~r--gDggIYVgsImkgGAVA~----------DGRIe~GDMiLQVNevsFENmSNd~AVrvLREaV~~  328 (626)
T KOG3571|consen  261 NFLGISIVGQSNAR--GDGGIYVGSIMKGGAVAL----------DGRIEPGDMILQVNEVSFENMSNDQAVRVLREAVSR  328 (626)
T ss_pred             ccceeEeecccCcC--CCCceEEeeeccCceeec----------cCccCccceEEEeeecchhhcCchHHHHHHHHHhcc
Confidence            67777765422211  337999999999887654          5556669999999998766543  3    3334433


Q ss_pred             CCCCCEEEEEEEE
Q 013804          406 CKVGDEVIVEVLR  418 (436)
Q Consensus       406 ~~~g~~v~l~v~R  418 (436)
                        +| .++++|-.
T Consensus       329 --~g-Pi~ltvAk  338 (626)
T KOG3571|consen  329 --PG-PIKLTVAK  338 (626)
T ss_pred             --CC-CeEEEEee
Confidence              32 36777654


No 81 
>KOG3605 consensus Beta amyloid precursor-binding protein [General function prediction only]
Probab=88.69  E-value=0.64  Score=49.35  Aligned_cols=112  Identities=23%  Similarity=0.391  Sum_probs=74.2

Q ss_pred             CCCCCCeEE-----CCCCcEEEEEeeeecCCCCCCcceeeeeeeccchhhhhccccceec------ceecceeeecchhh
Q 013804          276 PGNSGGPLL-----DSSGSLIGINTAIYSPSGASSGVGFSIPVDTVNGIVDQLVKFGKVT------RPILGIKFAPDQSV  344 (436)
Q Consensus       276 ~G~SGGPlv-----d~~G~VVGI~s~~~~~~~~~~~~~~aIP~~~i~~~l~~l~~~g~v~------~~~lGv~~~~~~~~  344 (436)
                      .-|+|||.-     |...+++.|+-..          -..+|.+..+..++.+++.-.|+      .|..-+.+...+..
T Consensus       679 nmm~~GpAarsgkLnIGDQiiaING~S----------LVGLPLstcQs~Ik~~KnQT~VkltiV~cpPV~~V~I~RPd~k  748 (829)
T KOG3605|consen  679 NMMHGGPAARSGKLNIGDQIMSINGTS----------LVGLPLSTCQSIIKGLKNQTAVKLNIVSCPPVTTVLIRRPDLR  748 (829)
T ss_pred             hcccCChhhhcCCccccceeEeecCce----------eccccHHHHHHHHhcccccceEEEEEecCCCceEEEeecccch
Confidence            457888873     4444677764322          13489999999999988766553      24444444444445


Q ss_pred             hhcCc---cceEEEecCCCCcccccCcccccccccCcccCCcEEEEECCEEeCC-H-HHHHHHHhcCCCCC
Q 013804          345 EQLGV---SGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSN-G-SDLYRILDQCKVGD  410 (436)
Q Consensus       345 ~~~g~---~gv~V~~v~~~spa~~agl~~~~~~~~~~l~~GDiIl~vnG~~V~s-~-~dl~~~l~~~~~g~  410 (436)
                      .++|.   .|| |-.+..++.|++-|+++           |-.|++|||+.|-- . +-+.++|.. ..|+
T Consensus       749 yQLGFSVQNGi-ICSLlRGGIAERGGVRV-----------GHRIIEINgQSVVA~pHekIV~lLs~-aVGE  806 (829)
T KOG3605|consen  749 YQLGFSVQNGI-ICSLLRGGIAERGGVRV-----------GHRIIEINGQSVVATPHEKIVQLLSN-AVGE  806 (829)
T ss_pred             hhccceeeCcE-eehhhcccchhccCcee-----------eeeEEEECCceEEeccHHHHHHHHHH-hhhh
Confidence            55664   564 55678999999999999           99999999997753 2 235555554 3453


No 82 
>KOG0609 consensus Calcium/calmodulin-dependent serine protein kinase/membrane-associated guanylate kinase [Signal transduction mechanisms]
Probab=87.86  E-value=1  Score=47.03  Aligned_cols=68  Identities=28%  Similarity=0.454  Sum_probs=52.5

Q ss_pred             ecceeeecchhhhhcCccceEEEecCCCCcccccCcccccccccCcccCCcEEEEECCEEeCC--HHHHHHHHhcCCCCC
Q 013804          333 ILGIKFAPDQSVEQLGVSGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSN--GSDLYRILDQCKVGD  410 (436)
Q Consensus       333 ~lGv~~~~~~~~~~~g~~gv~V~~v~~~spa~~agl~~~~~~~~~~l~~GDiIl~vnG~~V~s--~~dl~~~l~~~~~g~  410 (436)
                      .||+.+.....      .-++|..+..|+.+.+.|+-          +.||.|++|||..|.+  ..++..++.+..  .
T Consensus       135 plG~Tik~~e~------~~~~vARI~~GG~~~r~glL----------~~GD~i~EvNGi~v~~~~~~e~q~~l~~~~--G  196 (542)
T KOG0609|consen  135 PLGATIRVEED------TKVVVARIMHGGMADRQGLL----------HVGDEILEVNGISVANKSPEELQELLRNSR--G  196 (542)
T ss_pred             ccceEEEeccC------CccEEeeeccCCcchhccce----------eeccchheecCeecccCCHHHHHHHHHhCC--C
Confidence            58888875431      25899999999999988853          2499999999999986  578999998854  4


Q ss_pred             EEEEEEEE
Q 013804          411 EVIVEVLR  418 (436)
Q Consensus       411 ~v~l~v~R  418 (436)
                      .++++|.-
T Consensus       197 ~itfkiiP  204 (542)
T KOG0609|consen  197 SITFKIIP  204 (542)
T ss_pred             cEEEEEcc
Confidence            57777654


No 83 
>KOG1892 consensus Actin filament-binding protein Afadin [Cytoskeleton]
Probab=87.79  E-value=0.66  Score=51.25  Aligned_cols=60  Identities=30%  Similarity=0.428  Sum_probs=44.9

Q ss_pred             cceEEEecCCCCcccccCcccccccccCcccCCcEEEEECCEEeCCHH--HHHHHHhcCCCCCEEEEEEEECCE
Q 013804          350 SGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGS--DLYRILDQCKVGDEVIVEVLRGDQ  421 (436)
Q Consensus       350 ~gv~V~~v~~~spa~~agl~~~~~~~~~~l~~GDiIl~vnG~~V~s~~--dl~~~l~~~~~g~~v~l~v~R~g~  421 (436)
                      -|+||.+|.+|++|+.          +|.|+.||.+++|||+..-...  +.-++|-  .-|..|.++|...|.
T Consensus       960 lGIYvKsVV~GgaAd~----------DGRL~aGDQLLsVdG~SLiGisQErAA~lmt--rtg~vV~leVaKqgA 1021 (1629)
T KOG1892|consen  960 LGIYVKSVVEGGAADH----------DGRLEAGDQLLSVDGHSLIGISQERAARLMT--RTGNVVHLEVAKQGA 1021 (1629)
T ss_pred             cceEEEEeccCCcccc----------ccccccCceeeeecCcccccccHHHHHHHHh--ccCCeEEEehhhhhh
Confidence            5899999999999865          5566669999999999776544  3334443  467888998876553


No 84 
>PF02907 Peptidase_S29:  Hepatitis C virus NS3 protease;  InterPro: IPR004109 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This signature identifies the Hepatitis C virus NS3 protein as a serine protease which belongs to MEROPS peptidase family S29 (hepacivirin family, clan PA(S)), which has a trypsin-like fold. The non-structural (NS) protein NS3 is one of the NS proteins involved in replication of the HCV genome. The NS2 proteinase (IPR002518 from INTERPRO), a zinc-dependent enzyme, performs a single proteolytic cut to release the N terminus of NS3. The action of NS3 proteinase (NS3P), which resides in the N-terminal one-third of the NS3 protein, then yields all remaining non-structural proteins. The C-terminal two-thirds of the NS3 protein contain a helicase. The functional relationship between the proteinase and helicase domains is unknown. NS3 has a structural zinc-binding site and requires cofactor NS4. 
It has been suggested that the NS3 serine protease of hepatitis C is involved in cell transformation and that the ability to transform requires an active enzyme [].; GO: 0008236 serine-type peptidase activity, 0006508 proteolysis, 0019087 transformation of host cell by virus; PDB: 2QV1_B 3LOX_C 2OBQ_C 2OC1_C 2OC0_A 3LON_A 3KNX_A 2O8M_A 2OBO_A 2OC8_A ....
Probab=86.67  E-value=0.44  Score=40.69  Aligned_cols=117  Identities=21%  Similarity=0.242  Sum_probs=56.5

Q ss_pred             EEEEEEcCCCEEEecccccCCCCeEEEEecCCcEEeeEEEEEcCCCCeEEEEEcCCCCCCcceecCCCCCCCCCCEEEEE
Q 013804          154 GSGFVWDSKGHVVTNYHVIRGASDIRVTFADQSAYDAKIVGFDQDKDVAVLRIDAPKDKLRPIPIGVSADLLVGQKVYAI  233 (436)
Q Consensus       154 GSGfiI~~~G~ILT~aHvv~~~~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkv~~~~~~~~~~~l~~~~~~~~G~~V~~v  233 (436)
                      --|+.|+  |-.-|.+|--....-   -   |..-+....+.+...|+..-....-...+.|-.-+.       +.+|++
T Consensus        14 fmgt~vn--GV~wT~~HGagsrtl---A---gp~Gpv~q~~~s~~~Dlv~~p~P~Ga~SL~pCtCg~-------~dlylV   78 (148)
T PF02907_consen   14 FMGTCVN--GVMWTVYHGAGSRTL---A---GPKGPVNQMYTSVDDDLVGWPAPPGARSLTPCTCGS-------SDLYLV   78 (148)
T ss_dssp             EEEEEET--TEEEEEHHHHTTSEE---E---BTTSEB-ESEEETTTTEEEEE-STTB--BBB-SSSS-------SEEEEE
T ss_pred             eehhEEc--cEEEEEEecCCcccc---c---CCCCcceEeEEcCCCCCcccccccccccCCccccCC-------ccEEEE
Confidence            3477774  788888885433110   0   111123344566778888877754333444433321       346666


Q ss_pred             ecCCCCCCceeEeEEeeeeeeeccCCCCCCcccEE-EEccccCCCCCCCeEECCCCcEEEEEeeeecC
Q 013804          234 GNPFGLDHTLTTGVISGLRREISSAATGRPIQDVI-QTDAAINPGNSGGPLLDSSGSLIGINTAIYSP  300 (436)
Q Consensus       234 G~p~g~~~~~~~G~vs~~~~~~~~~~~~~~~~~~i-~~~~~i~~G~SGGPlvd~~G~VVGI~s~~~~~  300 (436)
                      -+-    ..+-.+     ++.      ++....++ -.......|.||||++-.+|.+|||..+....
T Consensus        79 tr~----~~v~p~-----rr~------gd~~~~L~sp~pis~lkGSSGgPiLC~~GH~vG~f~aa~~t  131 (148)
T PF02907_consen   79 TRD----ADVIPV-----RRR------GDSRASLLSPRPISDLKGSSGGPILCPSGHAVGMFRAAVCT  131 (148)
T ss_dssp             -TT----S-EEEE-----EEE------STTEEEEEEEEEHHHHTT-TT-EEEETTSEEEEEEEEEEEE
T ss_pred             ecc----CcEeee-----EEc------CCCceEecCCceeEEEecCCCCcccCCCCCEEEEEEEEEEc
Confidence            322    111111     111      01001111 11122347999999999999999998766554


No 85 
>KOG2921 consensus Intramembrane metalloprotease (sterol-regulatory element-binding protein (SREBP) protease) [Posttranslational modification, protein turnover, chaperones]
Probab=86.39  E-value=1.1  Score=44.93  Aligned_cols=45  Identities=38%  Similarity=0.505  Sum_probs=37.7

Q ss_pred             cceEEEecCCCCccccc-CcccccccccCcccCCcEEEEECCEEeCCHHHHHHHHhc
Q 013804          350 SGVLVLDAPPNGPAGKA-GLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQ  405 (436)
Q Consensus       350 ~gv~V~~v~~~spa~~a-gl~~~~~~~~~~l~~GDiIl~vnG~~V~s~~dl~~~l~~  405 (436)
                      .||.|++|...||...- ||.+           ||+|+++||-+|.+.+|..+.++.
T Consensus       220 ~gV~Vtev~~~Spl~gprGL~v-----------gdvitsldgcpV~~v~dW~ecl~t  265 (484)
T KOG2921|consen  220 EGVTVTEVPSVSPLFGPRGLSV-----------GDVITSLDGCPVHKVSDWLECLAT  265 (484)
T ss_pred             ceEEEEeccccCCCcCcccCCc-----------cceEEecCCcccCCHHHHHHHHHh
Confidence            79999999998887432 5666           999999999999999888777664


No 86 
>PF00947 Pico_P2A:  Picornavirus core protein 2A;  InterPro: IPR000081 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. 
Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad [].  This domain defines cysteine peptidases that belong to MEROPS peptidase family C3 (picornain, clan PA(C)), subfamilies 3CA and 3CB. The protein fold of this peptidase domain for members of this family resembles that of the serine peptidase, chymotrypsin [], the type example for clan PA. Picornaviral proteins are expressed as a single polyprotein which is cleaved by the viral 3C cysteine protease []. The poliovirus polyprotein is selectively cleaved between the Gln-|-Gly bond. In other picornavirus reactions Glu may be substituted for Gln, and Ser or Thr for Gly. ; GO: 0008233 peptidase activity, 0006508 proteolysis, 0016032 viral reproduction; PDB: 2HRV_B 1Z8R_A.
Probab=81.90  E-value=7  Score=33.20  Aligned_cols=32  Identities=31%  Similarity=0.471  Sum_probs=24.3

Q ss_pred             ccEEEEccccCCCCCCCeEECCCCcEEEEEeee
Q 013804          265 QDVIQTDAAINPGNSGGPLLDSSGSLIGINTAI  297 (436)
Q Consensus       265 ~~~i~~~~~i~~G~SGGPlvd~~G~VVGI~s~~  297 (436)
                      .+++....+..||+-||+|+- +--||||++++
T Consensus        78 ~~~l~g~Gp~~PGdCGg~L~C-~HGViGi~Tag  109 (127)
T PF00947_consen   78 YNLLIGEGPAEPGDCGGILRC-KHGVIGIVTAG  109 (127)
T ss_dssp             ECEEEEE-SSSTT-TCSEEEE-TTCEEEEEEEE
T ss_pred             cCceeecccCCCCCCCceeEe-CCCeEEEEEeC
Confidence            455666778899999999994 45599999986


No 87 
>KOG0606 consensus Microtubule-associated serine/threonine kinase and related proteins [Signal transduction mechanisms; General function prediction only]
Probab=81.65  E-value=2.2  Score=48.25  Aligned_cols=50  Identities=34%  Similarity=0.493  Sum_probs=39.1

Q ss_pred             EEEecCCCCcccccCcccccccccCcccCCcEEEEECCEEeCCH--HHHHHHHhcCCCCCEEEEE
Q 013804          353 LVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNG--SDLYRILDQCKVGDEVIVE  415 (436)
Q Consensus       353 ~V~~v~~~spa~~agl~~~~~~~~~~l~~GDiIl~vnG~~V~s~--~dl~~~l~~~~~g~~v~l~  415 (436)
                      .|..|.+++||..+|+++           ||.|+.|||+.|...  .++.+.+.+  .|..+.+.
T Consensus       661 ~v~sv~egsPA~~agls~-----------~DlIthvnge~v~gl~H~ev~~Lll~--~gn~v~~~  712 (1205)
T KOG0606|consen  661 SVGSVEEGSPAFEAGLSA-----------GDLITHVNGEPVHGLVHTEVMELLLK--SGNKVTLR  712 (1205)
T ss_pred             eeeeecCCCCccccCCCc-----------cceeEeccCcccchhhHHHHHHHHHh--cCCeeEEE
Confidence            577899999999999999           999999999999864  366666654  34445443


No 88 
>PF03510 Peptidase_C24:  2C endopeptidase (C24) cysteine protease family;  InterPro: IPR000317 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. 
These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad [].  The two signatures that define this group of calicivirus polyproteins identify a cysteine peptidase signature that belongs to MEROPS peptidase family C24 (clan PA(C)). Caliciviruses are positive-stranded ssRNA viruses that cause gastroenteritis. The calicivirus genome contains two open reading frames, ORF1 and ORF2. ORF2 encodes a structural protein []; while ORF1 encodes a non-structural polypeptide, which has RNA helicase, cysteine protease and RNA polymerase activity. The regions of the polyprotein in which these activities lie are similar to proteins produced by the picornaviruses. Two different families of caliciviruses can be distinguished on the basis of sequence similarity, namely those classified as small round structured viruses (SRSVs) and those classed as non-SRSVs. Calicivirus proteases from the non-SRSV group, which are members of the PA protease clan, constitute family C24 of the cysteine proteases (proteases from SRSVs belong to the C37 family). As mentioned above, the protease activity resides within a polyprotein. The enzyme cleaves the polyprotein at sites N-terminal to itself, liberating the polyprotein helicase.; GO: 0004197 cysteine-type endopeptidase activity, 0006508 proteolysis
Probab=80.39  E-value=6.4  Score=32.37  Aligned_cols=53  Identities=25%  Similarity=0.350  Sum_probs=34.7

Q ss_pred             EEEEEcCCCEEEecccccCCCCeEEEEecCCcEEeeEEEEEcCCCCeEEEEEcCCCCCCcceecC
Q 013804          155 SGFVWDSKGHVVTNYHVIRGASDIRVTFADQSAYDAKIVGFDQDKDVAVLRIDAPKDKLRPIPIG  219 (436)
Q Consensus       155 SGfiI~~~G~ILT~aHvv~~~~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkv~~~~~~~~~~~l~  219 (436)
                      -++-|. +|..+|+.||.+..+.+     +|..+  +++.  ...|+|+++.+..  .++.++++
T Consensus         2 ~avHIG-nG~~vt~tHva~~~~~v-----~g~~f--~~~~--~~ge~~~v~~~~~--~~p~~~ig   54 (105)
T PF03510_consen    2 WAVHIG-NGRYVTVTHVAKSSDSV-----DGQPF--KIVK--TDGELCWVQSPLV--HLPAAQIG   54 (105)
T ss_pred             ceEEeC-CCEEEEEEEEeccCceE-----cCcCc--EEEE--eccCEEEEECCCC--CCCeeEec
Confidence            356675 68999999999887654     22222  2222  3559999999753  35666664


No 89 
>KOG3834 consensus Golgi reassembly stacking protein GRASP65, contains PDZ domain [Intracellular trafficking, secretion, and vesicular transport]
Probab=79.35  E-value=2.6  Score=42.91  Aligned_cols=65  Identities=25%  Similarity=0.419  Sum_probs=49.2

Q ss_pred             EEecCCCCcccccCcccccccccCcccCCcEEEEECCEEeCCHHHHHHHHhcCCCCCEEEEEEEECCE--EEEEEEEe
Q 013804          354 VLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQCKVGDEVIVEVLRGDQ--KEKIPVKL  429 (436)
Q Consensus       354 V~~v~~~spa~~agl~~~~~~~~~~l~~GDiIl~vnG~~V~s~~dl~~~l~~~~~g~~v~l~v~R~g~--~~~~~v~~  429 (436)
                      |-+|.+++||+.|||.+          .+|.|+-+-...-...+|+...|.. ..++.+++-|.--+.  .++++++.
T Consensus       113 vl~V~p~SPaalAgl~~----------~~DYivG~~~~~~~~~eDl~~lIes-he~kpLklyVYN~D~d~~ReVti~p  179 (462)
T KOG3834|consen  113 VLSVEPNSPAALAGLRP----------YTDYIVGIWDAVMHEEEDLFTLIES-HEGKPLKLYVYNHDTDSCREVTITP  179 (462)
T ss_pred             eeecCCCCHHHhccccc----------ccceEecchhhhccchHHHHHHHHh-ccCCCcceeEeecCCCccceEEeec
Confidence            66788999999999996          3999999944445677899999988 468889998876433  34455443


No 90 
>PF01732 DUF31:  Putative peptidase (DUF31);  InterPro: IPR022382  This domain has no known function. It is found in various hypothetical proteins and putative lipoproteins from mycoplasmas. 
Probab=72.72  E-value=2.8  Score=42.59  Aligned_cols=24  Identities=25%  Similarity=0.504  Sum_probs=21.0

Q ss_pred             cccCCCCCCCeEECCCCcEEEEEe
Q 013804          272 AAINPGNSGGPLLDSSGSLIGINT  295 (436)
Q Consensus       272 ~~i~~G~SGGPlvd~~G~VVGI~s  295 (436)
                      ..+..|.||+.|+|.+|++|||..
T Consensus       350 ~~l~gGaSGS~V~n~~~~lvGIy~  373 (374)
T PF01732_consen  350 YSLGGGASGSMVINQNNELVGIYF  373 (374)
T ss_pred             cCCCCCCCcCeEECCCCCEEEEeC
Confidence            355689999999999999999964


No 91 
>KOG3605 consensus Beta amyloid precursor-binding protein [General function prediction only]
Probab=68.88  E-value=5.2  Score=42.82  Aligned_cols=58  Identities=28%  Similarity=0.440  Sum_probs=38.6

Q ss_pred             ceEEEecCCCCcccccCcccccccccCcccCCcEEEEECCEEeCCH--HHHHHHHhcCCCCCEEEEEEEE
Q 013804          351 GVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNG--SDLYRILDQCKVGDEVIVEVLR  418 (436)
Q Consensus       351 gv~V~~v~~~spa~~agl~~~~~~~~~~l~~GDiIl~vnG~~V~s~--~dl~~~l~~~~~g~~v~l~v~R  418 (436)
                      -|+|.....++||++.|          +|-.||.|++|||...--.  ..-+.+++..+.-..|+++|.+
T Consensus       674 TVViAnmm~~GpAarsg----------kLnIGDQiiaING~SLVGLPLstcQs~Ik~~KnQT~VkltiV~  733 (829)
T KOG3605|consen  674 TVVIANMMHGGPAARSG----------KLNIGDQIMSINGTSLVGLPLSTCQSIIKGLKNQTAVKLNIVS  733 (829)
T ss_pred             HHHHHhcccCChhhhcC----------CccccceeEeecCceeccccHHHHHHHHhcccccceEEEEEec
Confidence            34556677888988765          3445999999999866542  3445566665544556666654


No 92 
>PF05416 Peptidase_C37:  Southampton virus-type processing peptidase;  InterPro: IPR001665 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. 
These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad [].  This group of cysteine peptidases belongs to the MEROPS peptidase family C37 (clan PA(C)). The type example is calicivirin from Southampton virus, an endopeptidase that cleaves the polyprotein at sites N-terminal to itself, liberating the polyprotein helicase. Southampton virus is a positive-stranded ssRNA virus belonging to the Caliciviruses, which are viruses that cause gastroenteritis. The calicivirus genome contains two open reading frames, ORF1 and ORF2. ORF1 encodes a non-structural polypeptide, which has RNA helicase, cysteine protease and RNA polymerase activity []. The regions of the polyprotein in which these activities lie are similar to proteins produced by the picornaviruses []. ORF2 encodes a structural, capsid protein. Two different families of caliciviruses can be distinguished on the basis of sequence similarity, namely the Norwalk-like viruses or small round structured viruses (SRSVs), and those classed as non-SRSVs.; GO: 0004197 cysteine-type endopeptidase activity, 0006508 proteolysis; PDB: 2FYQ_A 2FYR_A 1WQS_D 4ASH_A 2IPH_B.
Probab=68.10  E-value=19  Score=36.88  Aligned_cols=137  Identities=20%  Similarity=0.304  Sum_probs=67.0

Q ss_pred             cCeEEEEEEEcCCCEEEecccccCCCCe-EEEEecCCcEEeeEEEEEcCCCCeEEEEEcCCC-CCCcceecCCCCCCCCC
Q 013804          150 PQGSGSGFVWDSKGHVVTNYHVIRGASD-IRVTFADQSAYDAKIVGFDQDKDVAVLRIDAPK-DKLRPIPIGVSADLLVG  227 (436)
Q Consensus       150 ~~~~GSGfiI~~~G~ILT~aHvv~~~~~-i~V~~~dg~~~~a~vv~~d~~~DlAlLkv~~~~-~~~~~~~l~~~~~~~~G  227 (436)
                      .-+.|-||-|+++ ..+|+-||+..... +   |  |  .+..-+.++..-+++-+++..+. .+++-+-|.  +-...|
T Consensus       377 ~fGsGWGfWVS~~-lfITttHViP~g~~E~---F--G--v~i~~i~vh~sGeF~~~rFpk~iRPDvtgmiLE--eGapEG  446 (535)
T PF05416_consen  377 KFGSGWGFWVSPT-LFITTTHVIPPGAKEA---F--G--VPISQIQVHKSGEFCRFRFPKPIRPDVTGMILE--EGAPEG  446 (535)
T ss_dssp             EETTEEEEESSSS-EEEEEGGGS-STTSEE---T--T--EECGGEEEEEETTEEEEEESS-SSTTS---EE---SS--TT
T ss_pred             ecCCceeeeecce-EEEEeeeecCCcchhh---h--C--CChhHeEEeeccceEEEecCCCCCCCccceeec--cCCCCc
Confidence            3467999999987 99999999975322 1   0  0  11122344556678888887653 245545552  223445


Q ss_pred             CEEEE-EecCCCCC--CceeEeEEeeeeeeeccCCCCCCcccEEEE-------ccccCCCCCCCeEECCCC---cEEEEE
Q 013804          228 QKVYA-IGNPFGLD--HTLTTGVISGLRREISSAATGRPIQDVIQT-------DAAINPGNSGGPLLDSSG---SLIGIN  294 (436)
Q Consensus       228 ~~V~~-vG~p~g~~--~~~~~G~vs~~~~~~~~~~~~~~~~~~i~~-------~~~i~~G~SGGPlvd~~G---~VVGI~  294 (436)
                      .-+.+ +=.+.|.-  ..+..|...+..-.-..- .+  ...++.+       |-...||+-|.|-|-..|   -|+|||
T Consensus       447 tV~siLiKR~sGEllpLAvRMgt~AsmkIqgr~v-~G--Q~GMLLTGaNAK~mDLGT~PGDCGcPYvyKrgNd~VV~GVH  523 (535)
T PF05416_consen  447 TVCSILIKRPSGELLPLAVRMGTHASMKIQGRTV-HG--QMGMLLTGANAKGMDLGTIPGDCGCPYVYKRGNDWVVIGVH  523 (535)
T ss_dssp             -EEEEEEE-TTSBEEEEEEEEEEEEEEEETTEEE-EE--EEEEETTSTT-SSTTTS--TTGTT-EEEEEETTEEEEEEEE
T ss_pred             eEEEEEEEcCCccchhhhhhhccceeEEEcceee-cc--eeeeeeecCCccccccCCCCCCCCCceeeecCCcEEEEEEE
Confidence            54443 34554432  234444443222110000 00  0112222       234558999999996555   499999


Q ss_pred             eeeec
Q 013804          295 TAIYS  299 (436)
Q Consensus       295 s~~~~  299 (436)
                      ++...
T Consensus       524 ~AAtr  528 (535)
T PF05416_consen  524 AAATR  528 (535)
T ss_dssp             EEE-S
T ss_pred             ehhcc
Confidence            87643


No 93 
>PF11874 DUF3394:  Domain of unknown function (DUF3394);  InterPro: IPR021814  This domain is functionally uncharacterised. This domain is found in bacteria. This presumed domain is about 190 amino acids in length. This domain is found associated with PF06808 from PFAM. 
Probab=55.78  E-value=42  Score=30.55  Aligned_cols=38  Identities=29%  Similarity=0.299  Sum_probs=31.4

Q ss_pred             ecceeeecchhhhhcCccceEEEecCCCCcccccCcccccccccCcccCCcEEEEE
Q 013804          333 ILGIKFAPDQSVEQLGVSGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSV  388 (436)
Q Consensus       333 ~lGv~~~~~~~~~~~g~~gv~V~~v~~~spa~~agl~~~~~~~~~~l~~GDiIl~v  388 (436)
                      ..|+.+.++.       +.++|..+.-+|||+++|+.-           |+.|++|
T Consensus       112 ~~GL~l~~e~-------~~~~Vd~v~fgS~A~~~g~d~-----------d~~I~~v  149 (183)
T PF11874_consen  112 AAGLTLMEEG-------GKVIVDEVEFGSPAEKAGIDF-----------DWEITEV  149 (183)
T ss_pred             hCCCEEEeeC-------CEEEEEecCCCCHHHHcCCCC-----------CcEEEEE
Confidence            3477776544       568999999999999999998           8988887


No 94 
>KOG3938 consensus RGS-GAIP interacting protein GIPC, contains PDZ domain [Signal transduction mechanisms; Intracellular trafficking, secretion, and vesicular transport]
Probab=54.26  E-value=14  Score=35.32  Aligned_cols=59  Identities=14%  Similarity=0.325  Sum_probs=43.5

Q ss_pred             cceEEEecCCCCcccccCcccccccccCcccCCcEEEEECCEEeCCHH--HHHHHHhcCCCCCEEEEEEEE
Q 013804          350 SGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGS--DLYRILDQCKVGDEVIVEVLR  418 (436)
Q Consensus       350 ~gv~V~~v~~~spa~~agl~~~~~~~~~~l~~GDiIl~vnG~~V~s~~--dl~~~l~~~~~g~~v~l~v~R  418 (436)
                      .-++|..+.++|.-++.-          .+++||.|-+|||+.|-.+.  ++-++|+....|++.++.+..
T Consensus       149 GyAFIKrIkegsvidri~----------~i~VGd~IEaiNge~ivG~RHYeVArmLKel~rge~ftlrLie  209 (334)
T KOG3938|consen  149 GYAFIKRIKEGSVIDRIE----------AICVGDHIEAINGESIVGKRHYEVARMLKELPRGETFTLRLIE  209 (334)
T ss_pred             ceeeeEeecCCchhhhhh----------heeHHhHHHhhcCccccchhHHHHHHHHHhcccCCeeEEEeec
Confidence            345666676766655432          24459999999999998775  677889998889988887653


No 95 
>cd01735 LSm12_N LSm12 belongs to a family of Sm-like proteins that associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation.  Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet that associates with other Sm proteins to form hexameric and heptameric ring structures.   In addition to the N-terminal Sm-like domain, LSm12 has a novel methyltransferase domain.
Probab=48.51  E-value=65  Score=23.80  Aligned_cols=34  Identities=15%  Similarity=0.300  Sum_probs=29.1

Q ss_pred             CCeEEEEecCCcEEeeEEEEEcCCCCeEEEEEcC
Q 013804          175 ASDIRVTFADQSAYDAKIVGFDQDKDVAVLRIDA  208 (436)
Q Consensus       175 ~~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkv~~  208 (436)
                      ...+.+.+..|..++++++.+|....+.+|+...
T Consensus         6 Gs~V~~kTc~g~~ieGEV~afD~~tk~lIlk~~s   39 (61)
T cd01735           6 GSQVSCRTCFEQRLQGEVVAFDYPSKMLILKCPS   39 (61)
T ss_pred             ccEEEEEecCCceEEEEEEEecCCCcEEEEECcc
Confidence            3456778888999999999999999999998654


No 96 
>PF00571 CBS:  CBS domain CBS domain web page. Mutations in the CBS domain of Swiss:P35520 lead to homocystinuria.;  InterPro: IPR000644 CBS (cystathionine-beta-synthase) domains are small intracellular modules, mostly found in two or four copies within a protein, that occur in a variety of proteins in bacteria, archaea, and eukaryotes [, ]. Tandem pairs of CBS domains can act as binding domains for adenosine derivatives and may regulate the activity of attached enzymatic or other domains []. In some cases, CBS domains may act as sensors of cellular energy status by being activated by AMP and inhibited by ATP []. In chloride ion channels, the CBS domains have been implicated in intracellular targeting and trafficking, as well as in protein-protein interactions, but results vary with different channels: in the CLC-5 channel, the CBS domain was shown to be required for trafficking [], while in the CLC-1 channel, the CBS domain was shown to be critical for channel function, but not necessary for trafficking []. Recent experiments revealing that CBS domains can bind adenosine-containing ligands such as ATP, AMP, or S-adenosylmethionine have led to the hypothesis that CBS domains function as sensors of intracellular metabolites [, ]. Crystallographic studies of CBS domains have shown that pairs of CBS sequences form a globular domain where each CBS unit adopts a beta-alpha-beta-beta-alpha pattern []. The crystal structure of the CBS domains of the AMP-activated protein kinase in complexes with AMP and ATP shows that the phosphate groups of AMP/ATP lie in a surface pocket at the interface of two CBS domains, which is lined with basic residues, many of which are associated with disease-causing mutations [].  
In humans, mutations in conserved residues within CBS domains cause a variety of hereditary diseases, including (with the gene mutated in parentheses): homocystinuria (cystathionine beta-synthase); Wolff-Parkinson-White syndrome (gamma 2 subunit of AMP-activated protein kinase); retinitis pigmentosa (IMP dehydrogenase-1); congenital myotonia, idiopathic generalized epilepsy, hypercalciuric nephrolithiasis, and classic Bartter syndrome (CLC chloride channel family members).; GO: 0005515 protein binding; PDB: 3JTF_A 3TE5_C 3TDH_C 3T4N_C 2QLV_C 3OI8_A 3LV9_A 2QH1_B 1PVM_B 3LQN_A ....
Probab=47.70  E-value=18  Score=25.16  Aligned_cols=21  Identities=38%  Similarity=0.616  Sum_probs=17.8

Q ss_pred             CCCCCCeEECCCCcEEEEEee
Q 013804          276 PGNSGGPLLDSSGSLIGINTA  296 (436)
Q Consensus       276 ~G~SGGPlvd~~G~VVGI~s~  296 (436)
                      .+.+.-|++|.+|+++|+++.
T Consensus        28 ~~~~~~~V~d~~~~~~G~is~   48 (57)
T PF00571_consen   28 NGISRLPVVDEDGKLVGIISR   48 (57)
T ss_dssp             HTSSEEEEESTTSBEEEEEEH
T ss_pred             cCCcEEEEEecCCEEEEEEEH
Confidence            456788999999999999874


No 97 
>cd00600 Sm_like The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation.  Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet.  Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=45.17  E-value=61  Score=23.32  Aligned_cols=33  Identities=18%  Similarity=0.415  Sum_probs=28.5

Q ss_pred             CeEEEEecCCcEEeeEEEEEcCCCCeEEEEEcC
Q 013804          176 SDIRVTFADQSAYDAKIVGFDQDKDVAVLRIDA  208 (436)
Q Consensus       176 ~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkv~~  208 (436)
                      ..+.|.+.||+.+.+.+..+|...++.+-....
T Consensus         7 ~~V~V~l~~g~~~~G~L~~~D~~~Ni~L~~~~~   39 (63)
T cd00600           7 KTVRVELKDGRVLEGVLVAFDKYMNLVLDDVEE   39 (63)
T ss_pred             CEEEEEECCCcEEEEEEEEECCCCCEEECCEEE
Confidence            468899999999999999999999888776643


No 98 
>PRK14864 putative biofilm stress and motility protein A; Provisional
Probab=44.36  E-value=40  Score=27.72  Aligned_cols=55  Identities=11%  Similarity=0.058  Sum_probs=25.4

Q ss_pred             HHHHHHHHHHHHHHhhccccCcccccccCCccccCccchhHHHHHHHhCCceEEEE
Q 013804           78 SLFVFCGSVVLSFTLLFSNVDSASAFVVTPQRKLQTDELATVRLFQENTPSVVNIT  133 (436)
Q Consensus        78 ~~~~~~~~l~l~~~l~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~sVV~I~  133 (436)
                      +.++.+..+++++.|.+++..+..+. ++|++...+++....+....-+=.+|.|.
T Consensus         3 ~~mk~~~~l~~~l~LS~~s~~~~~p~-~~p~~~~~A~eI~~~qa~~lq~iGtVSvs   57 (104)
T PRK14864          3 MVMRRFASLLLTLLLSACSALQGTPQ-PAPPPADHAQEIRRAQTQGLQKMGTVSAL   57 (104)
T ss_pred             hHHHHHHHHHHHHHHhhhhhcccCCC-CCCCccccceecCHHHhhCCceeeEEEEe
Confidence            34555555555555555655554433 33333445555554433222222355554


No 99 
>cd01720 Sm_D2 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation.  Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. Sm subunit D2 heterodimerizes with subunit D1 and three such heterodimers form a hexameric ring structure with alternating D1 and D2 subunits. The D1 - D2 heterodimer also assembles into a heptameric ring containing D2, D3, E, F, and G subunits. Sm-like proteins exist in archaea as well as prokaryotes which form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=43.48  E-value=44  Score=26.50  Aligned_cols=37  Identities=5%  Similarity=0.297  Sum_probs=30.9

Q ss_pred             ccCCCCeEEEEecCCcEEeeEEEEEcCCCCeEEEEEc
Q 013804          171 VIRGASDIRVTFADQSAYDAKIVGFDQDKDVAVLRID  207 (436)
Q Consensus       171 vv~~~~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkv~  207 (436)
                      ++.....+.|.+.+++.+.+++.++|...++.|=...
T Consensus        10 ~~~~~~~V~V~lr~~r~~~G~L~~fD~hmNlvL~d~~   46 (87)
T cd01720          10 AVKNNTQVLINCRNNKKLLGRVKAFDRHCNMVLENVK   46 (87)
T ss_pred             HHcCCCEEEEEEcCCCEEEEEEEEecCccEEEEcceE
Confidence            3445578999999999999999999999998876653


No 100
>cd01731 archaeal_Sm1 The archaeal sm1 proteins: The Sm proteins are conserved in all three domains of life and are always associated with U-rich RNA sequences. They function to mediate RNA-RNA interactions and RNA biogenesis.  All Sm proteins contain a common sequence motif in two segments, Sm1 and Sm2, separated by a short variable linker. Eukaryotic Sm proteins form part of specific small nuclear ribonucleoproteins (snRNPs) that are involved in the processing of pre-mRNAs to mature mRNAs, and are a major component of the eukaryotic spliceosome. Most snRNPs consist of seven Sm proteins (B/B', D1, D2, D3, E, F and G) arranged in a ring on a uridine-rich sequence (Sm site), plus a small nuclear RNA (snRNA) (either U1, U2, U5 or U4/6). Since archaebacteria do not have any splicing apparatus, Sm proteins of archaebacteria may play a more general role. Archaeal Lsm proteins are likely to represent the ancestral Sm domain.
Probab=39.32  E-value=72  Score=23.72  Aligned_cols=33  Identities=9%  Similarity=0.245  Sum_probs=29.3

Q ss_pred             CeEEEEecCCcEEeeEEEEEcCCCCeEEEEEcC
Q 013804          176 SDIRVTFADQSAYDAKIVGFDQDKDVAVLRIDA  208 (436)
Q Consensus       176 ~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkv~~  208 (436)
                      ..+.|.+.+|+.+.+++.++|+..++.+-....
T Consensus        11 ~~V~V~l~~g~~~~G~L~~~D~~mNlvL~~~~e   43 (68)
T cd01731          11 KPVLVKLKGGKEVRGRLKSYDQHMNLVLEDAEE   43 (68)
T ss_pred             CEEEEEECCCCEEEEEEEEECCcceEEEeeEEE
Confidence            468899999999999999999999998887753


No 101
>PRK00737 small nuclear ribonucleoprotein; Provisional
Probab=39.15  E-value=75  Score=24.02  Aligned_cols=32  Identities=13%  Similarity=0.330  Sum_probs=28.4

Q ss_pred             CeEEEEecCCcEEeeEEEEEcCCCCeEEEEEc
Q 013804          176 SDIRVTFADQSAYDAKIVGFDQDKDVAVLRID  207 (436)
Q Consensus       176 ~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkv~  207 (436)
                      ..+.|.+.+|+.+.+++.++|+..++.+=...
T Consensus        15 k~V~V~lk~g~~~~G~L~~~D~~mNlvL~d~~   46 (72)
T PRK00737         15 SPVLVRLKGGREFRGELQGYDIHMNLVLDNAE   46 (72)
T ss_pred             CEEEEEECCCCEEEEEEEEEcccceeEEeeEE
Confidence            46889999999999999999999998887764


No 102
>cd01722 Sm_F The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation.  Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. Sm subunit F is capable of forming both homo- and hetero-heptamer ring structures.  To form the hetero-heptamer, Sm subunit F initially binds subunits E and G to form a trimer which then assembles onto snRNA along with the D3/B and D1/D2 heterodimers.
Probab=37.95  E-value=67  Score=23.96  Aligned_cols=32  Identities=13%  Similarity=0.217  Sum_probs=27.8

Q ss_pred             CeEEEEecCCcEEeeEEEEEcCCCCeEEEEEc
Q 013804          176 SDIRVTFADQSAYDAKIVGFDQDKDVAVLRID  207 (436)
Q Consensus       176 ~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkv~  207 (436)
                      ..+.|.+.+|+.+.+++..+|...++.+=.+.
T Consensus        12 ~~V~V~Lk~g~~~~G~L~~~D~~mNi~L~~~~   43 (68)
T cd01722          12 KPVIVKLKWGMEYKGTLVSVDSYMNLQLANTE   43 (68)
T ss_pred             CEEEEEECCCcEEEEEEEEECCCEEEEEeeEE
Confidence            46889999999999999999999888876653


No 103
>cd01726 LSm6 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation.  Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm6 is one of at least seven subunits that assemble onto U6 snRNA to form a seven-membered ring structure.  Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=37.72  E-value=76  Score=23.56  Aligned_cols=32  Identities=13%  Similarity=0.233  Sum_probs=28.0

Q ss_pred             CeEEEEecCCcEEeeEEEEEcCCCCeEEEEEc
Q 013804          176 SDIRVTFADQSAYDAKIVGFDQDKDVAVLRID  207 (436)
Q Consensus       176 ~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkv~  207 (436)
                      ..+.|.+.+|+.|.+++.++|+..++.+=...
T Consensus        11 ~~V~V~Lk~g~~~~G~L~~~D~~mNlvL~~~~   42 (67)
T cd01726          11 RPVVVKLNSGVDYRGILACLDGYMNIALEQTE   42 (67)
T ss_pred             CeEEEEECCCCEEEEEEEEEccceeeEEeeEE
Confidence            46889999999999999999999988886654


No 104
>cd06168 LSm9 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation.  Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm9 proteins have a single Sm-like domain structure.  Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=36.51  E-value=87  Score=24.04  Aligned_cols=32  Identities=13%  Similarity=0.215  Sum_probs=27.7

Q ss_pred             CeEEEEecCCcEEeeEEEEEcCCCCeEEEEEc
Q 013804          176 SDIRVTFADQSAYDAKIVGFDQDKDVAVLRID  207 (436)
Q Consensus       176 ~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkv~  207 (436)
                      ..+.|.+.||+.+.+.+..+|...+|.+=...
T Consensus        11 ~~v~V~l~dgR~~~G~l~~~D~~~NivL~~~~   42 (75)
T cd06168          11 RTMRIHMTDGRTLVGVFLCTDRDCNIILGSAQ   42 (75)
T ss_pred             CeEEEEEcCCeEEEEEEEEEcCCCcEEecCcE
Confidence            46889999999999999999999998775553


No 105
>PF12381 Peptidase_C3G:  Tungro spherical virus-type peptidase;  InterPro: IPR024387 This entry represents a rice tungro spherical waikavirus-type peptidase that belongs to MEROPS peptidase family C3G. It is a picornain 3C-type protease, and is responsible for the self-cleavage of the positive single-stranded polyproteins of a number of plant viral genomes. The location of the protease activity of the polyprotein is at the C-terminal end, adjacent and N-terminal to the putative RNA polymerase [, ].
Probab=36.50  E-value=26  Score=32.69  Aligned_cols=55  Identities=15%  Similarity=0.412  Sum_probs=37.2

Q ss_pred             ccEEEEccccCCCCCCCeEECC----CCcEEEEEeeeecCCCCCCcceeeeee--eccchhhhhc
Q 013804          265 QDVIQTDAAINPGNSGGPLLDS----SGSLIGINTAIYSPSGASSGVGFSIPV--DTVNGIVDQL  323 (436)
Q Consensus       265 ~~~i~~~~~i~~G~SGGPlvd~----~G~VVGI~s~~~~~~~~~~~~~~aIP~--~~i~~~l~~l  323 (436)
                      ...+++..+...|+=|||++-.    --+++||+.++..    ..+.+||-++  +.+++-+++|
T Consensus       168 r~gleY~~~t~~GdCGs~i~~~~t~~~RKIvGiHVAG~~----~~~~gYAe~itQEDL~~A~~~l  228 (231)
T PF12381_consen  168 RQGLEYQMPTMNGDCGSPIVRNNTQMVRKIVGIHVAGSA----NHAMGYAESITQEDLMRAINKL  228 (231)
T ss_pred             eeeeeEECCCcCCCccceeeEcchhhhhhhheeeecccc----cccceehhhhhHHHHHHHHHhh
Confidence            3456777888899999999732    3589999998753    3456777665  3444444433


No 106
>COG4956 Integral membrane protein (PIN domain superfamily) [General function prediction only]
Probab=36.45  E-value=29  Score=34.10  Aligned_cols=40  Identities=20%  Similarity=0.398  Sum_probs=33.8

Q ss_pred             EEEECCEEeCCHHHHHHHHhc-CCCCCEEEEEEEECCEEEE
Q 013804          385 ITSVNGKKVSNGSDLYRILDQ-CKVGDEVIVEVLRGDQKEK  424 (436)
Q Consensus       385 Il~vnG~~V~s~~dl~~~l~~-~~~g~~v~l~v~R~g~~~~  424 (436)
                      +-++.|.+|-|.+|+..+++- ..+||++++++.++|++..
T Consensus       269 Vae~qgV~vLNINDLAnAVkP~vlpGe~l~v~iiK~GkE~~  309 (356)
T COG4956         269 VAELQGVQVLNINDLANAVKPVVLPGEELTVQIIKDGKEPG  309 (356)
T ss_pred             HHhhcCCceecHHHHHHHhCCcccCCCeeEEEEeecCcccC
Confidence            456788899999999999983 5789999999999998753


No 107
>cd01730 LSm3 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation.  Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm3 is one of at least seven subunits that assemble onto U6 snRNA to form a seven-membered ring structure.  Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=34.96  E-value=69  Score=24.89  Aligned_cols=31  Identities=10%  Similarity=0.271  Sum_probs=27.0

Q ss_pred             CeEEEEecCCcEEeeEEEEEcCCCCeEEEEE
Q 013804          176 SDIRVTFADQSAYDAKIVGFDQDKDVAVLRI  206 (436)
Q Consensus       176 ~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkv  206 (436)
                      ..+.|.+.+|+.+.+++.++|...+|.|=..
T Consensus        12 k~V~V~l~~gr~~~G~L~~fD~~mNlvL~d~   42 (82)
T cd01730          12 ERVYVKLRGDRELRGRLHAYDQHLNMILGDV   42 (82)
T ss_pred             CEEEEEECCCCEEEEEEEEEccceEEeccce
Confidence            4788999999999999999999998886544


No 108
>cd01717 Sm_B The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation.  Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. Sm subunit B heterodimerizes with subunit D3 and three such heterodimers form a hexameric ring structure with alternating B and D3 subunits.  The D3 - B heterodimer also assembles into a heptameric ring containing D1, D2, E, F, and G subunits.  Sm-like proteins exist in archaea as well as prokaryotes which form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=34.95  E-value=86  Score=24.13  Aligned_cols=32  Identities=19%  Similarity=0.429  Sum_probs=27.7

Q ss_pred             CeEEEEecCCcEEeeEEEEEcCCCCeEEEEEc
Q 013804          176 SDIRVTFADQSAYDAKIVGFDQDKDVAVLRID  207 (436)
Q Consensus       176 ~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkv~  207 (436)
                      ..+.|.+.||+.+.+.+.++|...++.|=...
T Consensus        11 ~~V~V~l~dgR~~~G~L~~~D~~~NlVL~~~~   42 (79)
T cd01717          11 YRLRVTLQDGRQFVGQFLAFDKHMNLVLSDCE   42 (79)
T ss_pred             CEEEEEECCCcEEEEEEEEEcCccCEEcCCEE
Confidence            46889999999999999999999998876553


No 109
>cd01729 LSm7 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation.  Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm7 is one of at least seven subunits that assemble onto U6 snRNA to form a seven-membered ring structure.  Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=34.61  E-value=88  Score=24.32  Aligned_cols=31  Identities=23%  Similarity=0.334  Sum_probs=26.9

Q ss_pred             CeEEEEecCCcEEeeEEEEEcCCCCeEEEEE
Q 013804          176 SDIRVTFADQSAYDAKIVGFDQDKDVAVLRI  206 (436)
Q Consensus       176 ~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkv  206 (436)
                      ..+.|.+.+|+.+.+++.++|...+|.|=..
T Consensus        13 k~V~V~l~~gr~~~G~L~~~D~~mNlvL~~~   43 (81)
T cd01729          13 KKIRVKFQGGREVTGILKGYDQLLNLVLDDT   43 (81)
T ss_pred             CeEEEEECCCcEEEEEEEEEcCcccEEecCE
Confidence            4688999999999999999999998877554


No 110
>PF04225 OapA:  Opacity-associated protein A LysM-like domain;  InterPro: IPR007340 This entry includes the Haemophilus influenzae opacity-associated protein. This protein is required for efficient nasopharyngeal mucosal colonization, and its expression is associated with a distinctive transparent colony phenotype. OapA is thought to be a secreted protein, and its expression exhibits high-frequency phase variation [].; PDB: 2GU1_A.
Probab=33.33  E-value=21  Score=28.15  Aligned_cols=53  Identities=17%  Similarity=0.246  Sum_probs=27.4

Q ss_pred             ccCCcEEEEE---CCEEeCCHHHHHH------HHhcCCCCCEEEEEEEECCEEEEEEEEeec
Q 013804          379 LILGDIITSV---NGKKVSNGSDLYR------ILDQCKVGDEVIVEVLRGDQKEKIPVKLEP  431 (436)
Q Consensus       379 l~~GDiIl~v---nG~~V~s~~dl~~------~l~~~~~g~~v~l~v~R~g~~~~~~v~~~~  431 (436)
                      ++.||-+-.|   .|.+..+...+.+      .|...+||+++.+.+..+|+...+.+....
T Consensus         7 V~~GDtLs~iF~~~gls~~dl~~v~~~~~~~k~L~~L~pGq~l~f~~d~~g~L~~L~~~~~~   68 (85)
T PF04225_consen    7 VKSGDTLSTIFRRAGLSASDLYAVLEADGEAKPLTRLKPGQTLEFQLDEDGQLTALRYERSP   68 (85)
T ss_dssp             --TT--HHHHHHHTT--HHHHHHHHHHGGGT--GGG--TT-EEEEEE-TTS-EEEEEEEEET
T ss_pred             ECCCCcHHHHHHHcCCCHHHHHHHHhccCccchHhhCCCCCEEEEEECCCCCEEEEEEEcCC
Confidence            3447766555   4655544444433      455678999999999999998887776543


No 111
>TIGR03000 plancto_dom_1 Planctomycetes uncharacterized domain TIGR03000. Domains described by this model are found, so far, only in the Planctomycetes (Pirellula sp. strain 1 and Gemmata obscuriglobus), in up to six proteins per genome, and may be duplicated within a protein. The function is unknown.
Probab=32.13  E-value=64  Score=24.85  Aligned_cols=47  Identities=17%  Similarity=0.167  Sum_probs=33.0

Q ss_pred             CcEEEEECCEEeCCHHHHHHHHh-cCCCCC----EEEEEEEECCEEEEEEEE
Q 013804          382 GDIITSVNGKKVSNGSDLYRILD-QCKVGD----EVIVEVLRGDQKEKIPVK  428 (436)
Q Consensus       382 GDiIl~vnG~~V~s~~dl~~~l~-~~~~g~----~v~l~v~R~g~~~~~~v~  428 (436)
                      -|-.+.+||++.++....+.... .+..|.    ++..++.|||+..+.+-+
T Consensus        11 adAkl~v~G~~t~~~G~~R~F~T~~L~~G~~y~Y~v~a~~~~dG~~~t~~~~   62 (75)
T TIGR03000        11 ADAKLKVDGKETNGTGTVRTFTTPPLEAGKEYEYTVTAEYDRDGRILTRTRT   62 (75)
T ss_pred             CCCEEEECCeEcccCccEEEEECCCCCCCCEEEEEEEEEEecCCcEEEEEEE
Confidence            58889999999998776555443 244564    477778899987655433


No 112
>cd01732 LSm5 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation.  Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm5 is one of at least seven subunits that assemble onto U6 snRNA to form a seven-membered ring structure.  Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=31.99  E-value=93  Score=23.92  Aligned_cols=31  Identities=16%  Similarity=0.409  Sum_probs=27.2

Q ss_pred             CeEEEEecCCcEEeeEEEEEcCCCCeEEEEE
Q 013804          176 SDIRVTFADQSAYDAKIVGFDQDKDVAVLRI  206 (436)
Q Consensus       176 ~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkv  206 (436)
                      ..+.|.+.+|+.+.+++.++|...++.+=..
T Consensus        14 ~~V~V~l~~gr~~~G~L~g~D~~mNlvL~da   44 (76)
T cd01732          14 SRIWIVMKSDKEFVGTLLGFDDYVNMVLEDV   44 (76)
T ss_pred             CEEEEEECCCeEEEEEEEEeccceEEEEccE
Confidence            5788999999999999999999998886554


No 113
>PF14827 Cache_3:  Sensory domain of two-component sensor kinase; PDB: 1OJG_A 3BY8_A 1P0Z_I 2V9A_A 2J80_B.
Probab=31.67  E-value=47  Score=27.40  Aligned_cols=18  Identities=33%  Similarity=0.617  Sum_probs=13.3

Q ss_pred             CeEECCCCcEEEEEeeee
Q 013804          281 GPLLDSSGSLIGINTAIY  298 (436)
Q Consensus       281 GPlvd~~G~VVGI~s~~~  298 (436)
                      .|++|.+|++||+++.+.
T Consensus        94 ~PV~d~~g~viG~V~VG~  111 (116)
T PF14827_consen   94 APVYDSDGKVIGVVSVGV  111 (116)
T ss_dssp             EEEE-TTS-EEEEEEEEE
T ss_pred             EeeECCCCcEEEEEEEEE
Confidence            588889999999988654


No 114
>cd01719 Sm_G The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation.  Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet.  Sm subunit G binds subunits E and F to form a trimer which then assembles onto snRNA along with the D1/D2 and D3/B heterodimers forming a seven-membered ring structure.  Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=31.23  E-value=1.2e+02  Score=22.97  Aligned_cols=32  Identities=9%  Similarity=0.165  Sum_probs=27.4

Q ss_pred             CeEEEEecCCcEEeeEEEEEcCCCCeEEEEEc
Q 013804          176 SDIRVTFADQSAYDAKIVGFDQDKDVAVLRID  207 (436)
Q Consensus       176 ~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkv~  207 (436)
                      ..+.|.+.+|+.+.+++.++|...+|.+=...
T Consensus        11 k~V~V~L~~g~~~~G~L~~~D~~mNlvL~~~~   42 (72)
T cd01719          11 KKLSLKLNGNRKVSGILRGFDPFMNLVLDDAV   42 (72)
T ss_pred             CeEEEEECCCeEEEEEEEEEcccccEEeccEE
Confidence            46889999999999999999999888875553


No 115
>cd01728 LSm1 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation.  Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm1 is one of at least seven subunits that assemble onto U6 snRNA to form a seven-membered ring structure.  Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=30.81  E-value=1.1e+02  Score=23.35  Aligned_cols=31  Identities=16%  Similarity=0.185  Sum_probs=27.1

Q ss_pred             CeEEEEecCCcEEeeEEEEEcCCCCeEEEEE
Q 013804          176 SDIRVTFADQSAYDAKIVGFDQDKDVAVLRI  206 (436)
Q Consensus       176 ~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkv  206 (436)
                      ..+.|.+.||+.+.+.+.++|+..++.+=..
T Consensus        13 k~v~V~l~~gr~~~G~L~~fD~~~NlvL~d~   43 (74)
T cd01728          13 KKVVVLLRDGRKLIGILRSFDQFANLVLQDT   43 (74)
T ss_pred             CEEEEEEcCCeEEEEEEEEECCcccEEecce
Confidence            4688999999999999999999988887554


No 116
>COG0298 HypC Hydrogenase maturation factor [Posttranslational modification, protein turnover, chaperones]
Probab=30.79  E-value=1.1e+02  Score=23.83  Aligned_cols=47  Identities=19%  Similarity=0.376  Sum_probs=31.3

Q ss_pred             EeeEEEEEcCCCCeEEEEEcCCCCCCcceecCCCCCCCCCCEEEE-EecC
Q 013804          188 YDAKIVGFDQDKDVAVLRIDAPKDKLRPIPIGVSADLLVGQKVYA-IGNP  236 (436)
Q Consensus       188 ~~a~vv~~d~~~DlAlLkv~~~~~~~~~~~l~~~~~~~~G~~V~~-vG~p  236 (436)
                      ++++++..+...++|++.+-.-...+ -+.|-. .+++.|++|++ +||.
T Consensus         5 iPgqI~~I~~~~~~A~Vd~gGvkreV-~l~Lv~-~~v~~GdyVLVHvGfA   52 (82)
T COG0298           5 IPGQIVEIDDNNHLAIVDVGGVKREV-NLDLVG-EEVKVGDYVLVHVGFA   52 (82)
T ss_pred             cccEEEEEeCCCceEEEEeccEeEEE-Eeeeec-CccccCCEEEEEeeEE
Confidence            57888999988889999986422111 222322 26789999886 6764


No 117
>smart00651 Sm snRNP Sm proteins. small nuclear ribonucleoprotein particles (snRNPs) involved in pre-mRNA splicing
Probab=29.78  E-value=1.2e+02  Score=21.99  Aligned_cols=32  Identities=19%  Similarity=0.413  Sum_probs=27.7

Q ss_pred             CeEEEEecCCcEEeeEEEEEcCCCCeEEEEEc
Q 013804          176 SDIRVTFADQSAYDAKIVGFDQDKDVAVLRID  207 (436)
Q Consensus       176 ~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkv~  207 (436)
                      ..+.|.+.||+.+.+.+..+|...++-+=...
T Consensus         9 ~~V~V~l~~g~~~~G~L~~~D~~~NlvL~~~~   40 (67)
T smart00651        9 KRVLVELKNGREYRGTLKGFDQFMNLVLEDVE   40 (67)
T ss_pred             cEEEEEECCCcEEEEEEEEECccccEEEccEE
Confidence            46889999999999999999999888876654


No 118
>PF02743 Cache_1:  Cache domain;  InterPro: IPR004010 Cache is an extracellular domain that is predicted to have a role in small-molecule recognition in a wide range of proteins, including the animal dihydropyridine-sensitive voltage-gated Ca2+ channel; alpha-2delta subunit, and various bacterial chemotaxis receptors. The name Cache comes from CAlcium channels and CHEmotaxis receptors. This domain consists of an N-terminal part with three predicted strands and an alpha-helix, and a C-terminal part with a strand dyad followed by a relatively unstructured region. The N-terminal portion of the (unpermuted) Cache domain contains three predicted strands that could form a sheet analogous to that present in the core of the PAS domain structure. Cache domains are particularly widespread in bacteria, with Vibrio cholerae. The animal calcium channel alpha-2delta subunits might have acquired a part of their extracellular domains from a bacterial source []. The Cache domain appears to have arisen from the GAF-PAS fold despite their divergent functions [].; GO: 0016020 membrane; PDB: 3C8C_A 3LIB_D 3LIA_A 3LI8_A 3LI9_A.
Probab=28.86  E-value=53  Score=24.89  Aligned_cols=31  Identities=26%  Similarity=0.588  Sum_probs=22.9

Q ss_pred             CeEECCCCcEEEEEeeeecCCCCCCcceeeeeeeccchhhhhcc
Q 013804          281 GPLLDSSGSLIGINTAIYSPSGASSGVGFSIPVDTVNGIVDQLV  324 (436)
Q Consensus       281 GPlvd~~G~VVGI~s~~~~~~~~~~~~~~aIP~~~i~~~l~~l~  324 (436)
                      -|+.+.+|+++|++..             .+.++.+.++++++.
T Consensus        19 ~pi~~~~g~~~Gvv~~-------------di~l~~l~~~i~~~~   49 (81)
T PF02743_consen   19 VPIYDDDGKIIGVVGI-------------DISLDQLSEIISNIK   49 (81)
T ss_dssp             EEEEETTTEEEEEEEE-------------EEEHHHHHHHHTTSB
T ss_pred             EEEECCCCCEEEEEEE-------------EeccceeeeEEEeeE
Confidence            5788889999999654             366777777777653


No 119
>PF09122 DUF1930:  Domain of unknown function (DUF1930);  InterPro: IPR015206 This entry represents a domain found in 3-mercaptopyruvate sulphurtransferase which has no known function. This domain adopts a structure consisting of a four-stranded antiparallel beta-sheet and an alpha-helix, arranged in a beta(2)-alpha-beta(2) fashion, and bearing a remarkable structural similarity to the FK506-binding protein class of peptidylprolyl cis/trans-isomerase []. ; PDB: 1OKG_A.
Probab=28.61  E-value=1.4e+02  Score=22.10  Aligned_cols=45  Identities=22%  Similarity=0.264  Sum_probs=27.6

Q ss_pred             CcEEEEECCEEeCCHH-HHHHHHhcCCCCCEEEEEEEECCEEEEEEE
Q 013804          382 GDIITSVNGKKVSNGS-DLYRILDQCKVGDEVIVEVLRGDQKEKIPV  427 (436)
Q Consensus       382 GDiIl~vnG~~V~s~~-dl~~~l~~~~~g~~v~l~v~R~g~~~~~~v  427 (436)
                      .-.-+.+||..|.+.+ |+..++.....|+..++.+.. +....+++
T Consensus        19 ~~~tl~vDg~~v~~PD~El~sA~~HlH~GEkA~V~FkS-~Rv~~iEv   64 (68)
T PF09122_consen   19 DNATLIVDGEIVENPDAELKSALVHLHIGEKAQVFFKS-QRVAVIEV   64 (68)
T ss_dssp             TT--EEETTEEESS--HHHHHHHTT-BTT-EEEEEETT-S-EEEEE-
T ss_pred             cceEEEEcCeEcCCCCHHHHHHHHHhhcCceeEEEEec-CcEEEEEc
Confidence            4566789999999975 788888877899988876543 33444444


No 120
>PF09465 LBR_tudor:  Lamin-B receptor of TUDOR domain;  InterPro: IPR019023  The Lamin-B receptor is a chromatin and lamin binding protein in the inner nuclear membrane. It is one of the integral inner nuclear envelope membrane proteins responsible for targeting nuclear membranes to chromatin, being a downstream effector of Ran, a small Ras-like nuclear GTPase which regulates NE assembly. Lamin-B receptor interacts with importin beta, a Ran-binding protein, thereby directly contributing to the fusion of membrane vesicles and the formation of the nuclear envelope []. ; PDB: 2L8D_A 2DIG_A.
Probab=28.30  E-value=2.3e+02  Score=20.46  Aligned_cols=35  Identities=17%  Similarity=0.335  Sum_probs=27.6

Q ss_pred             CCCeEEEEecCCcEE-eeEEEEEcCCCCeEEEEEcC
Q 013804          174 GASDIRVTFADQSAY-DAKIVGFDQDKDVAVLRIDA  208 (436)
Q Consensus       174 ~~~~i~V~~~dg~~~-~a~vv~~d~~~DlAlLkv~~  208 (436)
                      ..+.+.++.++...| ++++..+|...++.-++.+.
T Consensus         8 ~Ge~V~~rWP~s~lYYe~kV~~~d~~~~~y~V~Y~D   43 (55)
T PF09465_consen    8 IGEVVMVRWPGSSLYYEGKVLSYDSKSDRYTVLYED   43 (55)
T ss_dssp             SS-EEEEE-TTTS-EEEEEEEEEETTTTEEEEEETT
T ss_pred             CCCEEEEECCCCCcEEEEEEEEecccCceEEEEEcC
Confidence            446788899887765 99999999999999999976


No 121
>PF05578 Peptidase_S31:  Pestivirus NS3 polyprotein peptidase S31;  InterPro: IPR000280 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold. Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This group of serine peptidases belong to MEROPS peptidase family S31 (clan PA(S)). The type example is pestivirus NS3 polyprotein peptidase from bovine viral diarrhea virus, which is Type 1 pestivirus. The pestiviruses are single-stranded RNA viruses whose genomes encode one large polyprotein []. The p80 endopeptidase resides towards the middle of the polyprotein and is responsible for processing all non-structural pestivirus proteins [, ]. The p80 enzyme is similar to other proteases in the PA(S) clan and is predicted to have a fold similar to that of chymotrypsin [, ]. An HDS catalytic triad has been identified [].; GO: 0004252 serine-type endopeptidase activity, 0006508 proteolysis
Probab=28.07  E-value=1.1e+02  Score=26.94  Aligned_cols=73  Identities=21%  Similarity=0.210  Sum_probs=39.1

Q ss_pred             CCCCCEEEEEecCCCCCCceeEeEEeeeeeeeccCCC-CCCcccEEEEccccCCCCCCCeEECC-CCcEEEEEeeee
Q 013804          224 LLVGQKVYAIGNPFGLDHTLTTGVISGLRREISSAAT-GRPIQDVIQTDAAINPGNSGGPLLDS-SGSLIGINTAIY  298 (436)
Q Consensus       224 ~~~G~~V~~vG~p~g~~~~~~~G~vs~~~~~~~~~~~-~~~~~~~i~~~~~i~~G~SGGPlvd~-~G~VVGI~s~~~  298 (436)
                      ...|..+|++ +|...+.+-+.|.+-...+.-..... ...... -.+|..-..|-||=|+|.. .|++||=.-.+.
T Consensus       109 cp~garcyv~-npea~nisgtkga~vhlqk~ggef~cvta~gtp-af~~~knlkg~s~~pifeassgr~vgr~k~gk  183 (211)
T PF05578_consen  109 CPDGARCYVL-NPEATNISGTKGAMVHLQKTGGEFTCVTASGTP-AFFDLKNLKGWSGLPIFEASSGRVVGRVKVGK  183 (211)
T ss_pred             CCCCcEEEEe-CCcccccccCcceEEEEeccCCceEEEeccCCc-ceeeccccCCCCCCceeeccCCcEEEEEEecC
Confidence            4567888888 56554444455544333322110000 000001 1233344579999999974 899999866543


No 122
>cd01727 LSm8 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation.  Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm8 is one of at least seven subunits that assemble onto U6 snRNA to form a seven-membered ring structure.  Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=27.39  E-value=1.3e+02  Score=22.75  Aligned_cols=32  Identities=19%  Similarity=0.267  Sum_probs=27.6

Q ss_pred             CeEEEEecCCcEEeeEEEEEcCCCCeEEEEEc
Q 013804          176 SDIRVTFADQSAYDAKIVGFDQDKDVAVLRID  207 (436)
Q Consensus       176 ~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkv~  207 (436)
                      ..+.|.+.||+.+.++..++|...++.+=...
T Consensus        10 ~~V~V~l~dgr~~~G~L~~~D~~~NlvL~~~~   41 (74)
T cd01727          10 KTVSVITVDGRVIVGTLKGFDQATNLILDDSH   41 (74)
T ss_pred             CEEEEEECCCcEEEEEEEEEccccCEEccceE
Confidence            46889999999999999999999888776653


No 123
>COG1958 LSM1 Small nuclear ribonucleoprotein (snRNP) homolog [Transcription]
Probab=27.07  E-value=1.2e+02  Score=23.20  Aligned_cols=33  Identities=21%  Similarity=0.459  Sum_probs=28.8

Q ss_pred             CeEEEEecCCcEEeeEEEEEcCCCCeEEEEEcC
Q 013804          176 SDIRVTFADQSAYDAKIVGFDQDKDVAVLRIDA  208 (436)
Q Consensus       176 ~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkv~~  208 (436)
                      ..+.|.+.+|+.+.+++.++|...++.+--+..
T Consensus        18 ~~V~V~lk~g~~~~G~L~~~D~~mNlvL~d~~e   50 (79)
T COG1958          18 KRVLVKLKNGREYRGTLVGFDQYMNLVLDDVEE   50 (79)
T ss_pred             CEEEEEECCCCEEEEEEEEEccceeEEEeceEE
Confidence            678999999999999999999999888776643


No 124
>cd01721 Sm_D3 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation.  Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. Sm subunit D3 heterodimerizes with subunit B and three such heterodimers form a hexameric ring structure with alternating B and D3 subunits. The D3 - B heterodimer also assembles into a heptameric ring containing D1, D2, E, F, and G subunits. Sm-like proteins exist in archaea as well as prokaryotes which form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=26.83  E-value=1.6e+02  Score=22.12  Aligned_cols=32  Identities=9%  Similarity=0.272  Sum_probs=28.8

Q ss_pred             CeEEEEecCCcEEeeEEEEEcCCCCeEEEEEc
Q 013804          176 SDIRVTFADQSAYDAKIVGFDQDKDVAVLRID  207 (436)
Q Consensus       176 ~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkv~  207 (436)
                      ..+.|.+.+|..|.+++..+|...++.+-...
T Consensus        11 ~~V~VeLk~g~~~~G~L~~~D~~MNl~L~~~~   42 (70)
T cd01721          11 HIVTVELKTGEVYRGKLIEAEDNMNCQLKDVT   42 (70)
T ss_pred             CEEEEEECCCcEEEEEEEEEcCCceeEEEEEE
Confidence            46889999999999999999999999888774


No 125
>PF14275 DUF4362:  Domain of unknown function (DUF4362)
Probab=26.28  E-value=2e+02  Score=23.45  Aligned_cols=47  Identities=17%  Similarity=0.312  Sum_probs=27.8

Q ss_pred             CCcEEEEECCEEeCCHHHHHHHHhcCC--------------CCCEEEEEEEECCEEEEEEEEe
Q 013804          381 LGDIITSVNGKKVSNGSDLYRILDQCK--------------VGDEVIVEVLRGDQKEKIPVKL  429 (436)
Q Consensus       381 ~GDiIl~vnG~~V~s~~dl~~~l~~~~--------------~g~~v~l~v~R~g~~~~~~v~~  429 (436)
                      .||+|.+ .|+ |.+.+.|...+....              .|+++-..+.-+|+...+.+.-
T Consensus         2 ~~DVi~~-~~~-i~Nl~kl~~Fi~nv~~~k~d~IrIv~yT~EGdPI~~~L~~~G~~I~y~~Dn   62 (98)
T PF14275_consen    2 NNDVINK-HGE-IENLDKLDQFIENVEQGKPDKIRIVQYTIEGDPIFQDLEYDGNQIKYTSDN   62 (98)
T ss_pred             CCCEEEe-CCe-EEeHHHHHHHHHHHhcCCCCEEEEEEecCCCCCEEEEEEECCCEEEEEECC
Confidence            4999988 444 777777776665432              3344444455566655555544


No 126
>PF01423 LSM:  LSM domain ;  InterPro: IPR001163 This family is found in Lsm (like-Sm) proteins and in bacterial Lsm-related Hfq proteins. In each case, the domain adopts a core structure consisting of an open beta-barrel with an SH3-like topology. Lsm (like-Sm) proteins have diverse functions, and are thought to be important modulators of RNA biogenesis and function [, ]. The Sm proteins form part of specific small nuclear ribonucleoproteins (snRNPs) that are involved in the processing of pre-mRNAs to mature mRNAs, and are a major component of the eukaryotic spliceosome. Most snRNPs consist of seven Sm proteins (B/B', D1, D2, D3, E, F and G) arranged in a ring on a uridine-rich sequence (Sm site), plus a small nuclear RNA (snRNA) (either U1, U2, U5 or U4/6) []. All Sm proteins contain a common sequence motif in two segments, Sm1 and Sm2, separated by a short variable linker []. In other snRNPs, certain Sm proteins are replaced with different Lsm proteins, such as with U7 snRNPs, in which the D1 and D2 Sm proteins are replaced with U7-specific Lsm10 and Lsm11 proteins, where Lsm11 plays a role in histone U7-specific RNA processing []. Lsm proteins are also found in archaebacteria, which do not have any splicing apparatus suggesting a more general role for Lsm proteins. The pleiotropic translational regulator Hfq (host factor Q) is a bacterial Lsm-like protein, which modulates the structure of numerous RNA molecules by binding preferentially to A/U-rich sequences in RNA []. Hfq forms an Lsm-like fold, however, unlike the heptameric Sm proteins, Hfq forms a homo-hexameric ring.; PDB: 1D3B_K 2Y9D_D 2Y9A_D 2Y9C_R 3VRI_C 2Y9B_K 3QUI_D 3M4G_H 3INZ_E 1U1S_C ....
Probab=25.54  E-value=1.5e+02  Score=21.63  Aligned_cols=33  Identities=21%  Similarity=0.444  Sum_probs=29.3

Q ss_pred             CeEEEEecCCcEEeeEEEEEcCCCCeEEEEEcC
Q 013804          176 SDIRVTFADQSAYDAKIVGFDQDKDVAVLRIDA  208 (436)
Q Consensus       176 ~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkv~~  208 (436)
                      ..+.|.+.+|+.+.+.+..+|...++.+-....
T Consensus         9 ~~V~V~l~~g~~~~G~L~~~D~~~Nl~L~~~~~   41 (67)
T PF01423_consen    9 KRVRVELKNGRTYRGTLVSFDQFMNLVLSDVTE   41 (67)
T ss_dssp             SEEEEEETTSEEEEEEEEEEETTEEEEEEEEEE
T ss_pred             cEEEEEEeCCEEEEEEEEEeechheEEeeeEEE
Confidence            568899999999999999999999988887754


No 127
>cd04627 CBS_pair_14 The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria.  The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair.  The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here.  It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic gener
Probab=25.21  E-value=58  Score=26.41  Aligned_cols=22  Identities=32%  Similarity=0.453  Sum_probs=17.4

Q ss_pred             CCCCCCeEECCCCcEEEEEeee
Q 013804          276 PGNSGGPLLDSSGSLIGINTAI  297 (436)
Q Consensus       276 ~G~SGGPlvd~~G~VVGI~s~~  297 (436)
                      .+.+.=|++|.+|+++|+++..
T Consensus        97 ~~~~~lpVvd~~~~~vGiit~~  118 (123)
T cd04627          97 EGISSVAVVDNQGNLIGNISVT  118 (123)
T ss_pred             cCCceEEEECCCCcEEEEEeHH
Confidence            3455679999999999998753


No 128
>PF01455 HupF_HypC:  HupF/HypC family;  InterPro: IPR001109 The large subunit of [NiFe]-hydrogenase, as well as other nickel metalloenzymes, is synthesised as a precursor devoid of the metalloenzyme active site. This precursor then undergoes a complex post-translational maturation process that requires a number of accessory proteins. The hydrogenase expression/formation proteins (HupF/HypC) form a family of small proteins that are hydrogenase precursor-specific chaperones required for this maturation process []. They are believed to keep the hydrogenase precursor in a conformation accessible for metal incorporation [, ].; PDB: 3D3R_A 2Z1C_C 2OT2_A.
Probab=24.18  E-value=2.3e+02  Score=21.23  Aligned_cols=43  Identities=23%  Similarity=0.396  Sum_probs=29.3

Q ss_pred             EeeEEEEEcCCCCeEEEEEcCCCCCCcceecCCCCCCCCCCEEEEE
Q 013804          188 YDAKIVGFDQDKDVAVLRIDAPKDKLRPIPIGVSADLLVGQKVYAI  233 (436)
Q Consensus       188 ~~a~vv~~d~~~DlAlLkv~~~~~~~~~~~l~~~~~~~~G~~V~~v  233 (436)
                      ++++++..+.....|++....   ....+.+.--.++++|++|++-
T Consensus         5 iP~~Vv~v~~~~~~A~v~~~G---~~~~V~~~lv~~v~~Gd~VLVH   47 (68)
T PF01455_consen    5 IPGRVVEVDEDGGMAVVDFGG---VRREVSLALVPDVKVGDYVLVH   47 (68)
T ss_dssp             EEEEEEEEETTTTEEEEEETT---EEEEEEGTTCTSB-TT-EEEEE
T ss_pred             ccEEEEEEeCCCCEEEEEcCC---cEEEEEEEEeCCCCCCCEEEEe
Confidence            678899998888999998864   3344444333458899999875


No 129
>PF02601 Exonuc_VII_L:  Exonuclease VII, large subunit;  InterPro: IPR020579 Exonuclease VII (EC 3.1.11.6) is composed of two nonidentical subunits: one large subunit and 4 small ones []. Exonuclease VII catalyses exonucleolytic cleavage in either 5'-3' or 3'-5' direction to yield 5'-phosphomononucleotides. The large subunit also contains the OB-fold domains (InterPro: IPR004365) that bind to nucleic acids at the N terminus.  This entry represents Exonuclease VII, large subunit, C-terminal. ; GO: 0008855 exodeoxyribonuclease VII activity
Probab=23.81  E-value=87  Score=30.86  Aligned_cols=35  Identities=29%  Similarity=0.499  Sum_probs=30.7

Q ss_pred             eEEEEEEEcCCCEEEecccccCCCCeEEEEecCCc
Q 013804          152 GSGSGFVWDSKGHVVTNYHVIRGASDIRVTFADQS  186 (436)
Q Consensus       152 ~~GSGfiI~~~G~ILT~aHvv~~~~~i~V~~~dg~  186 (436)
                      ..|-+++-+++|.++|+..-+...+.+++.+.||.
T Consensus       280 ~RGYaiv~~~~g~vI~s~~~l~~gd~i~i~l~DG~  314 (319)
T PF02601_consen  280 KRGYAIVRDKDGKVITSVKQLKPGDEIEIRLADGS  314 (319)
T ss_pred             hCceEEEECCCCCEECCHHHCCCCCEEEEEEcceE
Confidence            35667788788999999999999999999999995


No 130
>cd04603 CBS_pair_KefB_assoc This cd contains two tandem repeats of the cystathionine beta-synthase (CBS pair) domains associated with the KefB (Kef-type K+ transport systems) domain which is involved in inorganic ion transport and metabolism. CBS is a small domain originally identified in cystathionine beta-synthase and subsequently found in a wide range of different proteins. CBS domains usually come in tandem repeats, which associate to form a so-called Bateman domain or a CBS pair which is reflected in this model. The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains.  It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown.
Probab=22.81  E-value=69  Score=25.46  Aligned_cols=20  Identities=20%  Similarity=0.270  Sum_probs=15.9

Q ss_pred             CCCCCeEECCCCcEEEEEee
Q 013804          277 GNSGGPLLDSSGSLIGINTA  296 (436)
Q Consensus       277 G~SGGPlvd~~G~VVGI~s~  296 (436)
                      +.+--|++|.+|+++|+++.
T Consensus        86 ~~~~lpVvd~~~~~~Giit~  105 (111)
T cd04603          86 EPPVVAVVDKEGKLVGTIYE  105 (111)
T ss_pred             CCCeEEEEcCCCeEEEEEEh
Confidence            44456899988999999874


No 131
>cd04620 CBS_pair_7 The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria.  The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair.  The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here.  It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic genera
Probab=22.35  E-value=71  Score=25.32  Aligned_cols=20  Identities=45%  Similarity=0.599  Sum_probs=16.3

Q ss_pred             CCCCCeEECCCCcEEEEEee
Q 013804          277 GNSGGPLLDSSGSLIGINTA  296 (436)
Q Consensus       277 G~SGGPlvd~~G~VVGI~s~  296 (436)
                      +...-|++|.+|+++||++.
T Consensus        90 ~~~~~pVvd~~~~~~Gvit~  109 (115)
T cd04620          90 QIRHLPVLDDQGQLIGLVTA  109 (115)
T ss_pred             CCceEEEEcCCCCEEEEEEh
Confidence            44567899989999999875


No 132
>PF10049 DUF2283:  Protein of unknown function (DUF2283);  InterPro: IPR019270  Members of this family of hypothetical proteins have no known function. 
Probab=20.09  E-value=73  Score=22.27  Aligned_cols=11  Identities=36%  Similarity=0.866  Sum_probs=8.4

Q ss_pred             CCCCcEEEEEe
Q 013804          285 DSSGSLIGINT  295 (436)
Q Consensus       285 d~~G~VVGI~s  295 (436)
                      |.+|++|||--
T Consensus        36 d~~G~ivGIEI   46 (50)
T PF10049_consen   36 DEDGRIVGIEI   46 (50)
T ss_pred             CCCCCEEEEEE
Confidence            57899999843


Done!