Query         013444
Match_columns 443
No_of_seqs    403 out of 2973
Neff          7.8 
Searched_HMMs 46136
Date          Fri Mar 29 03:55:04 2013
Command       hhsearch -i /work/01045/syshi/csienesis_hhblits_a3m/013444.a3m -d /work/01045/syshi/HHdatabase/Cdd.hhm -o /work/01045/syshi/hhsearch_cdd/013444hhsearch_cdd -cpu 12 -v 0 

 No Hit                             Prob E-value P-value  Score    SS Cols Query HMM  Template HMM
  1 PRK10139 serine endoprotease;  100.0 3.6E-50 7.8E-55  415.6  38.8  296  126-439    41-361 (455)
  2 TIGR02038 protease_degS peripl 100.0 3.8E-49 8.3E-54  396.7  39.3  297  125-440    45-350 (351)
  3 PRK10898 serine endoprotease;  100.0 6.9E-49 1.5E-53  394.6  39.1  296  126-440    46-351 (353)
  4 PRK10942 serine endoprotease;  100.0 2.4E-47 5.3E-52  396.3  38.0  295  126-438    39-381 (473)
  5 TIGR02037 degP_htrA_DO peripla 100.0 3.2E-47 6.9E-52  393.5  38.3  295  126-438     2-327 (428)
  6 COG0265 DegQ Trypsin-like seri 100.0 4.3E-37 9.4E-42  309.8  33.3  295  125-438    33-340 (347)
  7 KOG1320 Serine protease [Postt 100.0 2.1E-29 4.5E-34  254.9  24.0  318  124-442   127-472 (473)
  8 KOG1421 Predicted signaling-as  99.9 1.5E-25 3.3E-30  228.7  22.7  306  124-440    51-373 (955)
  9 PF13365 Trypsin_2:  Trypsin-li  99.7 1.9E-16   4E-21  134.0  11.7  117  157-303     1-120 (120)
 10 KOG1421 Predicted signaling-as  99.6   2E-14 4.3E-19  148.0  21.9  290  131-439   524-832 (955)
 11 PF13180 PDZ_2:  PDZ domain; PD  99.6   4E-14 8.7E-19  112.6  11.1   81  339-436     1-82  (82)
 12 PF00089 Trypsin:  Trypsin;  In  99.5 3.4E-12 7.4E-17  118.8  17.9  177  135-327    12-220 (220)
 13 cd00190 Tryp_SPc Trypsin-like   99.4 2.3E-11   5E-16  114.0  18.9  181  134-328    11-230 (232)
 14 cd00987 PDZ_serine_protease PD  99.4 5.5E-12 1.2E-16  101.5  11.2   88  339-433     1-89  (90)
 15 cd00991 PDZ_archaeal_metallopr  99.3 4.7E-11   1E-15   94.3  10.6   68  367-435     9-77  (79)
 16 cd00990 PDZ_glycyl_aminopeptid  99.2 7.3E-11 1.6E-15   93.0  10.3   68  367-437    11-78  (80)
 17 smart00020 Tryp_SPc Trypsin-li  99.2 2.6E-10 5.6E-15  107.2  16.0  161  134-308    12-208 (229)
 18 TIGR01713 typeII_sec_gspC gene  99.2 5.9E-11 1.3E-15  114.4  11.5   99  321-435   159-258 (259)
 19 KOG1320 Serine protease [Postt  99.2   5E-11 1.1E-15  121.7  10.8  273  132-424    57-349 (473)
 20 cd00989 PDZ_metalloprotease PD  99.2 1.1E-10 2.4E-15   91.7   9.9   67  368-435    12-78  (79)
 21 cd00986 PDZ_LON_protease PDZ d  99.2   2E-10 4.4E-15   90.4  10.8   70  368-439     8-78  (79)
 22 cd00988 PDZ_CTP_protease PDZ d  99.1 4.5E-10 9.7E-15   89.5   9.9   71  367-437    12-84  (85)
 23 TIGR02037 degP_htrA_DO peripla  99.0 2.7E-09 5.9E-14  110.8  11.5   90  338-433   337-427 (428)
 24 cd00136 PDZ PDZ domain, also c  99.0 2.2E-09 4.8E-14   82.2   7.8   56  368-423    13-70  (70)
 25 TIGR00054 RIP metalloprotease   98.8 1.6E-08 3.5E-13  104.6  10.0   70  368-438   203-272 (420)
 26 COG3591 V8-like Glu-specific e  98.7 1.9E-07   4E-12   88.7  13.6  160  155-331    64-250 (251)
 27 PRK10779 zinc metallopeptidase  98.7 4.9E-08 1.1E-12  101.9  10.3   69  368-437   221-289 (449)
 28 PRK10779 zinc metallopeptidase  98.7 2.7E-08 5.9E-13  103.8   7.9   67  370-437   128-195 (449)
 29 TIGR00225 prc C-terminal pepti  98.6 1.4E-07   3E-12   94.9   9.8   67  368-435    62-130 (334)
 30 cd00992 PDZ_signaling PDZ doma  98.6 2.5E-07 5.4E-12   72.8   8.6   54  368-422    26-81  (82)
 31 PRK10942 serine endoprotease;   98.6   2E-07 4.2E-12   97.8  10.2   65  368-434   408-472 (473)
 32 PRK10139 serine endoprotease;   98.6   2E-07 4.4E-12   97.2   9.8   65  368-434   390-454 (455)
 33 smart00228 PDZ Domain present   98.6 3.2E-07 6.8E-12   72.5   8.2   57  368-424    26-83  (85)
 34 TIGR02860 spore_IV_B stage IV   98.5 3.5E-07 7.6E-12   92.5  10.2   70  367-437   104-181 (402)
 35 PLN00049 carboxyl-terminal pro  98.5 4.5E-07 9.7E-12   93.0  10.0   68  368-436   102-171 (389)
 36 PF00595 PDZ:  PDZ domain (Also  98.5 2.8E-07 6.2E-12   72.7   6.4   70  339-423    10-81  (81)
 37 TIGR03279 cyano_FeS_chp putati  98.5   3E-07 6.6E-12   93.6   7.7   63  372-438     2-65  (433)
 38 KOG3627 Trypsin [Amino acid tr  98.3 2.9E-05 6.4E-10   74.5  17.6  180  136-329    25-252 (256)
 39 COG0793 Prc Periplasmic protea  98.3 3.5E-06 7.5E-11   86.6  10.6   70  368-437   112-184 (406)
 40 PF14685 Tricorn_PDZ:  Tricorn   98.3 4.7E-06   1E-10   66.8   8.8   67  367-433    11-87  (88)
 41 PF00863 Peptidase_C4:  Peptida  98.3 3.4E-05 7.4E-10   72.8  15.7  165  133-321    15-185 (235)
 42 TIGR00054 RIP metalloprotease   98.2 1.8E-06 3.8E-11   89.5   6.6   63  368-432   128-190 (420)
 43 KOG3129 26S proteasome regulat  98.1 9.3E-06   2E-10   73.9   7.5   72  369-441   140-214 (231)
 44 PF04495 GRASP55_65:  GRASP55/6  98.0 1.6E-05 3.6E-10   69.3   7.6   72  367-438    42-115 (138)
 45 PRK11186 carboxy-terminal prot  98.0 3.2E-05 6.9E-10   83.8  10.5   70  368-437   255-334 (667)
 46 COG3480 SdrC Predicted secrete  97.9 3.8E-05 8.3E-10   74.3   8.8   68  368-436   130-198 (342)
 47 PRK09681 putative type II secr  97.8 4.6E-05   1E-09   73.6   7.5   61  375-436   211-275 (276)
 48 COG3975 Predicted protease wit  97.8 4.3E-05 9.2E-10   78.7   6.5   64  366-438   460-524 (558)
 49 PF12812 PDZ_1:  PDZ-like domai  97.7 0.00018 3.9E-09   56.4   7.7   68  339-414     9-76  (78)
 50 COG5640 Secreted trypsin-like   97.5  0.0017 3.7E-08   64.1  13.5   51  282-332   223-279 (413)
 51 PF03761 DUF316:  Domain of unk  97.5   0.009 1.9E-07   58.4  18.2  177  134-325    52-273 (282)
 52 PF05579 Peptidase_S32:  Equine  97.4 0.00097 2.1E-08   63.2  10.0  116  153-307   110-228 (297)
 53 KOG3553 Tax interaction protei  97.4 0.00015 3.4E-09   58.2   3.7   36  365-400    56-91  (124)
 54 COG3031 PulC Type II secretory  97.4  0.0004 8.6E-09   64.7   6.2   67  368-435   207-274 (275)
 55 PF05580 Peptidase_S55:  SpoIVB  96.8    0.02 4.2E-07   53.2  12.1  160  154-322    19-214 (218)
 56 KOG3580 Tight junction protein  96.7  0.0022 4.7E-08   66.8   5.1   59  366-424   427-488 (1027)
 57 KOG3532 Predicted protein kina  96.5   0.006 1.3E-07   64.5   6.8   57  367-424   397-453 (1051)
 58 PF10459 Peptidase_S46:  Peptid  96.4   0.015 3.3E-07   63.6   9.6   24  155-178    47-70  (698)
 59 PF08192 Peptidase_S64:  Peptid  96.3   0.042   9E-07   58.6  11.7  117  208-330   541-688 (695)
 60 PF10459 Peptidase_S46:  Peptid  96.2  0.0075 1.6E-07   66.0   6.0   56  276-331   622-687 (698)
 61 PF00548 Peptidase_C3:  3C cyst  96.1    0.32 6.8E-06   44.2  15.4  149  135-307    12-170 (172)
 62 KOG3550 Receptor targeting pro  95.9    0.03 6.5E-07   48.7   6.9   58  365-423   112-172 (207)
 63 KOG3209 WW domain-containing p  95.5   0.023 4.9E-07   60.6   5.6   52  372-424   782-836 (984)
 64 COG0750 Predicted membrane-ass  95.4   0.056 1.2E-06   55.0   8.4   56  374-429   135-193 (375)
 65 PF00949 Peptidase_S7:  Peptida  95.4   0.047   1E-06   47.1   6.4   33  279-311    89-121 (132)
 66 KOG3209 WW domain-containing p  95.0   0.067 1.5E-06   57.2   7.4   60  365-424   671-734 (984)
 67 KOG3552 FERM domain protein FR  94.6   0.061 1.3E-06   59.0   5.9   55  368-424    75-131 (1298)
 68 TIGR02860 spore_IV_B stage IV   94.4    0.34 7.3E-06   49.7  10.6   46  278-324   351-396 (402)
 69 KOG3571 Dishevelled 3 and rela  94.4   0.079 1.7E-06   54.6   6.0   58  366-424   275-338 (626)
 70 PF09342 DUF1986:  Domain of un  94.3    0.57 1.2E-05   44.4  10.9  100  135-247    16-131 (267)
 71 KOG3542 cAMP-regulated guanine  94.1   0.043 9.4E-07   58.2   3.5   56  367-424   561-618 (1283)
 72 KOG3580 Tight junction protein  94.0    0.14   3E-06   53.9   6.9   63  361-424    33-96  (1027)
 73 KOG3605 Beta amyloid precursor  94.0   0.096 2.1E-06   55.5   5.7  116  287-416   680-806 (829)
 74 PF00944 Peptidase_S3:  Alphavi  94.0    0.17 3.6E-06   43.4   6.1   33  281-313   100-132 (158)
 75 PF02122 Peptidase_S39:  Peptid  93.9    0.31 6.7E-06   45.3   8.5  117  167-307    43-166 (203)
 76 KOG3834 Golgi reassembly stack  92.3    0.22 4.7E-06   50.6   5.2   70  367-437    14-86  (462)
 77 KOG3606 Cell polarity protein   91.5    0.51 1.1E-05   45.1   6.4   57  367-424   193-252 (358)
 78 KOG3549 Syntrophins (type gamm  91.1    0.23   5E-06   49.0   3.8   56  367-423    79-137 (505)
 79 KOG3651 Protein kinase C, alph  90.5    0.54 1.2E-05   45.6   5.7   55  369-424    31-88  (429)
 80 KOG3834 Golgi reassembly stack  90.5    0.45 9.7E-06   48.4   5.3   68  372-439   113-182 (462)
 81 KOG2921 Intramembrane metallop  89.8    0.46 9.9E-06   47.8   4.6   50  362-411   214-264 (484)
 82 KOG1892 Actin filament-binding  89.3    0.55 1.2E-05   52.1   5.1   59  365-424   957-1018(1629)
 83 KOG3551 Syntrophins (type beta  88.8    0.48   1E-05   47.5   4.0   58  365-423   107-167 (506)
 84 KOG0609 Calcium/calmodulin-dep  85.9     1.7 3.6E-05   45.6   6.1   55  369-424   147-204 (542)
 85 KOG0606 Microtubule-associated  85.6     1.3 2.7E-05   50.4   5.4   52  370-422   660-713 (1205)
 86 PF02907 Peptidase_S29:  Hepati  79.0     4.2 9.2E-05   35.0   5.0   39  284-323   105-146 (148)
 87 PF03510 Peptidase_C24:  2C end  73.1      11 0.00023   31.2   5.7   53  159-230     3-55  (105)
 88 PF02395 Peptidase_S6:  Immunog  71.4      41 0.00088   37.8  11.5   52  157-219    67-120 (769)
 89 KOG3605 Beta amyloid precursor  71.1     3.6 7.8E-05   44.1   3.1   50  375-424   680-733 (829)
 90 KOG3938 RGS-GAIP interacting p  66.4     6.8 0.00015   37.6   3.6   55  370-424   151-209 (334)
 91 PF00947 Pico_P2A:  Picornaviru  65.1      26 0.00057   29.9   6.6   33  275-308    78-110 (127)
 92 PF01732 DUF31:  Putative pepti  62.4     5.4 0.00012   40.7   2.4   24  282-305   350-373 (374)
 93 cd00600 Sm_like The eukaryotic  59.4      26 0.00055   25.5   5.1   33  186-218     6-38  (63)
 94 PRK00737 small nuclear ribonuc  54.7      30 0.00066   26.4   4.9   33  186-218    14-46  (72)
 95 cd01731 archaeal_Sm1 The archa  54.5      31 0.00068   25.9   4.9   33  186-218    10-42  (68)
 96 cd01726 LSm6 The eukaryotic Sm  54.4      29 0.00063   26.0   4.7   33  186-218    10-42  (67)
 97 PF11874 DUF3394:  Domain of un  54.0      19 0.00041   32.9   4.2   29  367-395   121-149 (183)
 98 cd06168 LSm9 The eukaryotic Sm  53.8      32 0.00069   26.6   4.9   33  186-218    10-42  (75)
 99 cd01722 Sm_F The eukaryotic Sm  53.4      29 0.00062   26.1   4.5   33  186-218    11-43  (68)
100 cd01732 LSm5 The eukaryotic Sm  53.4      29 0.00063   26.9   4.6   32  186-217    13-44  (76)
101 cd01730 LSm3 The eukaryotic Sm  52.1      27 0.00059   27.3   4.4   32  186-217    11-42  (82)
102 cd01717 Sm_B The eukaryotic Sm  52.0      32 0.00068   26.7   4.7   33  186-218    10-42  (79)
103 cd01729 LSm7 The eukaryotic Sm  49.6      39 0.00084   26.5   4.8   33  186-218    12-44  (81)
104 cd01719 Sm_G The eukaryotic Sm  48.3      44 0.00095   25.5   4.8   33  186-218    10-42  (72)
105 cd01721 Sm_D3 The eukaryotic S  47.8      49  0.0011   25.0   5.0   33  186-218    10-42  (70)
106 cd01728 LSm1 The eukaryotic Sm  47.0      46   0.001   25.6   4.8   32  186-217    12-43  (74)
107 cd01720 Sm_D2 The eukaryotic S  46.8      44 0.00096   26.7   4.8   33  186-218    14-46  (87)
108 smart00651 Sm snRNP Sm protein  46.4      49  0.0011   24.4   4.8   33  186-218     8-40  (67)
109 cd01735 LSm12_N LSm12 belongs   45.9      72  0.0016   23.7   5.4   34  186-219     6-39  (61)
110 PF01423 LSM:  LSM domain ;  In  45.2      41 0.00089   24.8   4.2   34  186-219     8-41  (67)
111 cd01727 LSm8 The eukaryotic Sm  41.6      59  0.0013   24.9   4.7   33  186-218     9-41  (74)
112 PF05416 Peptidase_C37:  Southa  39.1 2.7E+02  0.0059   29.0  10.0  134  155-309   379-528 (535)
113 PF00571 CBS:  CBS domain CBS d  36.9      34 0.00075   23.8   2.6   19  287-305    29-47  (57)
114 cd01723 LSm4 The eukaryotic Sm  36.1      91   0.002   24.0   5.0   33  186-218    11-43  (76)
115 PF12381 Peptidase_C3G:  Tungro  35.7      42 0.00092   31.4   3.5   56  275-331   168-229 (231)
116 COG1958 LSM1 Small nuclear rib  35.7      76  0.0017   24.5   4.5   33  186-218    17-49  (79)
117 KOG1738 Membrane-associated gu  32.5      59  0.0013   35.2   4.4   37  368-404   225-262 (638)
118 PF02743 Cache_1:  Cache domain  32.1      58  0.0013   24.8   3.3   32  290-331    18-49  (81)
119 COG0260 PepB Leucyl aminopepti  31.2      45 0.00097   35.3   3.3   32  369-401   299-330 (485)
120 PF14438 SM-ATX:  Ataxin 2 SM d  30.9 1.3E+02  0.0028   23.0   5.1   29  186-214    12-43  (77)
121 cd01725 LSm2 The eukaryotic Sm  30.7 1.2E+02  0.0026   23.6   4.9   33  186-218    11-43  (81)
122 cd01733 LSm10 The eukaryotic S  30.1 1.4E+02   0.003   23.2   5.1   33  186-218    19-51  (78)
123 PF14827 Cache_3:  Sensory doma  29.7      57  0.0012   27.0   3.1   18  291-308    94-111 (116)
124 cd05701 S1_Rrp5_repeat_hs10 S1  29.3      38 0.00081   25.4   1.6   33  210-242    13-54  (69)
125 cd01724 Sm_D1 The eukaryotic S  29.2 1.3E+02  0.0028   24.1   4.9   33  186-218    11-43  (90)
126 PF05578 Peptidase_S31:  Pestiv  25.9 1.7E+02  0.0036   26.0   5.3   73  233-307   108-182 (211)
127 PF09465 LBR_tudor:  Lamin-B re  23.9 2.9E+02  0.0063   20.1   5.7   35  186-220     9-44  (55)
128 PF09122 DUF1930:  Domain of un  22.2 2.3E+02  0.0049   21.2   4.5   44  389-434    19-64  (68)
129 PRK05015 aminopeptidase B; Pro  22.2      90  0.0019   32.4   3.5   29  372-401   240-268 (424)
130 PRK00913 multifunctional amino  21.1      91   0.002   33.1   3.3   29  372-401   303-331 (483)
131 cd00433 Peptidase_M17 Cytosol   20.9      89  0.0019   33.1   3.2   29  372-401   289-317 (468)
132 cd04627 CBS_pair_14 The CBS do  20.6      72  0.0016   25.9   2.1   20  287-306    98-117 (123)

No 1  
>PRK10139 serine endoprotease; Provisional
Probab=100.00  E-value=3.6e-50  Score=415.62  Aligned_cols=296  Identities=33%  Similarity=0.563  Sum_probs=259.8

Q ss_pred             hHHHHHHHhCCceEEEEeccc----------ccccc----------cCCcEEEEEEEeC-CCEEEeccccccCCCCCCCC
Q 013444          126 TIANAAARVCPAVVNLSAPRE----------FLGIL----------SGRGIGSGAIVDA-DGTILTCAHVVVDFHGSRAL  184 (443)
Q Consensus       126 ~~~~~~~~~~pSVV~I~~~~~----------~~~~~----------~~~~~GSGfiI~~-~G~ILTaaHvv~~~~~~~~~  184 (443)
                      ++.++++++.||||.|.+...          +..++          ...+.||||||++ +||||||+|||.++      
T Consensus        41 ~~~~~~~~~~pavV~i~~~~~~~~~~~~~~~~~~~f~~~~~~~~~~~~~~~GSG~ii~~~~g~IlTn~HVv~~a------  114 (455)
T PRK10139         41 SLAPMLEKVLPAVVSVRVEGTASQGQKIPEEFKKFFGDDLPDQPAQPFEGLGSGVIIDAAKGYVLTNNHVINQA------  114 (455)
T ss_pred             cHHHHHHHhCCcEEEEEEEEeecccccCchhHHHhccccCCccccccccceEEEEEEECCCCEEEeChHHhCCC------
Confidence            499999999999999987421          01111          1236899999985 69999999999985      


Q ss_pred             CCceEEEEeCCCcEEEEEEEeecCCCCEEEEEEcCCCCCCccccCCCCCCCCCCEEEEEecCCCCCCceEEEEEEeeecC
Q 013444          185 PKGKVDVTLQDGRTFEGTVLNADFHSDIAIVKINSKTPLPAAKLGTSSKLCPGDWVVAMGCPHSLQNTVTAGIVSCVDRK  264 (443)
Q Consensus       185 ~~~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkl~~~~~~~~~~l~~s~~~~~G~~V~~iG~p~~~~~~~t~G~Vs~~~~~  264 (443)
                        ..+.|++.|++.++|++++.|+.+||||||++.+..+++++|+++..+++|++|+++|+|+++..+++.|+|++..+.
T Consensus       115 --~~i~V~~~dg~~~~a~vvg~D~~~DlAvlkv~~~~~l~~~~lg~s~~~~~G~~V~aiG~P~g~~~tvt~GivS~~~r~  192 (455)
T PRK10139        115 --QKISIQLNDGREFDAKLIGSDDQSDIALLQIQNPSKLTQIAIADSDKLRVGDFAVAVGNPFGLGQTATSGIISALGRS  192 (455)
T ss_pred             --CEEEEEECCCCEEEEEEEEEcCCCCEEEEEecCCCCCceeEecCccccCCCCEEEEEecCCCCCCceEEEEEcccccc
Confidence              589999999999999999999999999999986678999999999999999999999999999999999999988775


Q ss_pred             ccCCCCCCccceEEEEcccCCCCCccceeeecCCCEEEEEEEEeec---CCCeEEEEeHHHHHHHHHHHHHcCceeeeec
Q 013444          265 SSDLGLGGMRREYLQTDCAINAGNSGGPLVNIDGEIVGINIMKVAA---ADGLSFAVPIDSAAKIIEQFKKNGRVVRPWL  341 (443)
Q Consensus       265 ~~~~~~~~~~~~~i~~d~~i~~G~SGGPlvd~~G~VVGI~s~~~~~---~~g~~~aIPi~~i~~~l~~l~~~g~v~rp~l  341 (443)
                      ....   ..+..+|++|+.+++|+|||||||.+|+||||+++....   ..+++|+||++.+++++++|+++|++.|+||
T Consensus       193 ~~~~---~~~~~~iqtda~in~GnSGGpl~n~~G~vIGi~~~~~~~~~~~~gigfaIP~~~~~~v~~~l~~~g~v~r~~L  269 (455)
T PRK10139        193 GLNL---EGLENFIQTDASINRGNSGGALLNLNGELIGINTAILAPGGGSVGIGFAIPSNMARTLAQQLIDFGEIKRGLL  269 (455)
T ss_pred             ccCC---CCcceEEEECCccCCCCCcceEECCCCeEEEEEEEEEcCCCCccceEEEEEhHHHHHHHHHHhhcCcccccce
Confidence            3211   123578999999999999999999999999999987643   3579999999999999999999999999999


Q ss_pred             CceeecccHHHHHHhhcCCCCCCCCCCceEEeEECCCCccccCCCCCCCEEEEECCEeeCCHHHHHHHHhc-CCCCeEEE
Q 013444          342 GLKMLDLNDMIIAQLKERDPSFPNVKSGVLVPVVTPGSPAHLAGFLPSDVVIKFDGKPVQSITEIIEIMGD-RVGEPLKV  420 (443)
Q Consensus       342 Gi~~~~~~~~~~~~l~~~~~~~~~~~~g~~V~~v~~~spA~~aGl~~GDiI~~vng~~V~s~~dl~~~l~~-~~g~~v~l  420 (443)
                      |+.+++++++.++.+++.      ...|++|.+|.++|||+++||++||+|++|||++|.+|+|+...+.. ..|+++.+
T Consensus       270 Gv~~~~l~~~~~~~lgl~------~~~Gv~V~~V~~~SpA~~AGL~~GDvIl~InG~~V~s~~dl~~~l~~~~~g~~v~l  343 (455)
T PRK10139        270 GIKGTEMSADIAKAFNLD------VQRGAFVSEVLPNSGSAKAGVKAGDIITSLNGKPLNSFAELRSRIATTEPGTKVKL  343 (455)
T ss_pred             eEEEEECCHHHHHhcCCC------CCCceEEEEECCCChHHHCCCCCCCEEEEECCEECCCHHHHHHHHHhcCCCCEEEE
Confidence            999999999998887652      35699999999999999999999999999999999999999998876 78899999


Q ss_pred             EEEECCCeEEEEEEEecCC
Q 013444          421 VVQRANDQLVTLTVIPEEA  439 (443)
Q Consensus       421 ~v~R~~g~~~~l~v~~~~~  439 (443)
                      +|.| +|+.+++++++.+.
T Consensus       344 ~V~R-~G~~~~l~v~~~~~  361 (455)
T PRK10139        344 GLLR-NGKPLEVEVTLDTS  361 (455)
T ss_pred             EEEE-CCEEEEEEEEECCC
Confidence            9999 88988888887543


No 2  
>TIGR02038 protease_degS periplasmic serine peptidase DegS. This family consists of the periplasmic serine protease DegS (HhoB), a shorter paralog of protease DO (HtrA, DegP) and DegQ (HhoA). It is found in E. coli and several other Proteobacteria of the gamma subdivision. It contains a trypsin domain and a single copy of the PDZ domain (in contrast to DegP with two copies). A critical role of DegS is to sense stress in the periplasm and partially degrade an inhibitor of sigma(E).
Probab=100.00  E-value=3.8e-49  Score=396.69  Aligned_cols=297  Identities=37%  Similarity=0.591  Sum_probs=258.5

Q ss_pred             hhHHHHHHHhCCceEEEEeccccc---ccccCCcEEEEEEEeCCCEEEeccccccCCCCCCCCCCceEEEEeCCCcEEEE
Q 013444          125 DTIANAAARVCPAVVNLSAPREFL---GILSGRGIGSGAIVDADGTILTCAHVVVDFHGSRALPKGKVDVTLQDGRTFEG  201 (443)
Q Consensus       125 ~~~~~~~~~~~pSVV~I~~~~~~~---~~~~~~~~GSGfiI~~~G~ILTaaHvv~~~~~~~~~~~~~i~V~~~dg~~~~a  201 (443)
                      .++.++++++.||||.|+......   ......+.||||+|+++||||||+|||.++        ..+.|.+.||+.++|
T Consensus        45 ~~~~~~~~~~~psVV~I~~~~~~~~~~~~~~~~~~GSG~vi~~~G~IlTn~HVV~~~--------~~i~V~~~dg~~~~a  116 (351)
T TIGR02038        45 ISFNKAVRRAAPAVVNIYNRSISQNSLNQLSIQGLGSGVIMSKEGYILTNYHVIKKA--------DQIVVALQDGRKFEA  116 (351)
T ss_pred             hhHHHHHHhcCCcEEEEEeEeccccccccccccceEEEEEEeCCeEEEecccEeCCC--------CEEEEEECCCCEEEE
Confidence            469999999999999998754221   111234679999999999999999999885        579999999999999


Q ss_pred             EEEeecCCCCEEEEEEcCCCCCCccccCCCCCCCCCCEEEEEecCCCCCCceEEEEEEeeecCccCCCCCCccceEEEEc
Q 013444          202 TVLNADFHSDIAIVKINSKTPLPAAKLGTSSKLCPGDWVVAMGCPHSLQNTVTAGIVSCVDRKSSDLGLGGMRREYLQTD  281 (443)
Q Consensus       202 ~vv~~d~~~DlAlLkl~~~~~~~~~~l~~s~~~~~G~~V~~iG~p~~~~~~~t~G~Vs~~~~~~~~~~~~~~~~~~i~~d  281 (443)
                      ++++.|+.+||||||++.. .+++++++++..+++|++|+++|+|++...+++.|+|+...+....   ......++++|
T Consensus       117 ~vv~~d~~~DlAvlkv~~~-~~~~~~l~~s~~~~~G~~V~aiG~P~~~~~s~t~GiIs~~~r~~~~---~~~~~~~iqtd  192 (351)
T TIGR02038       117 ELVGSDPLTDLAVLKIEGD-NLPTIPVNLDRPPHVGDVVLAIGNPYNLGQTITQGIISATGRNGLS---SVGRQNFIQTD  192 (351)
T ss_pred             EEEEecCCCCEEEEEecCC-CCceEeccCcCccCCCCEEEEEeCCCCCCCcEEEEEEEeccCcccC---CCCcceEEEEC
Confidence            9999999999999999854 5788899888889999999999999999899999999988765321   11235789999


Q ss_pred             ccCCCCCccceeeecCCCEEEEEEEEeec-----CCCeEEEEeHHHHHHHHHHHHHcCceeeeecCceeecccHHHHHHh
Q 013444          282 CAINAGNSGGPLVNIDGEIVGINIMKVAA-----ADGLSFAVPIDSAAKIIEQFKKNGRVVRPWLGLKMLDLNDMIIAQL  356 (443)
Q Consensus       282 ~~i~~G~SGGPlvd~~G~VVGI~s~~~~~-----~~g~~~aIPi~~i~~~l~~l~~~g~v~rp~lGi~~~~~~~~~~~~l  356 (443)
                      +.+++|+|||||||.+|+||||+++....     ..+++|+||++.+++++++++++|++.|||||+.++++++...+.+
T Consensus       193 a~i~~GnSGGpl~n~~G~vIGI~~~~~~~~~~~~~~g~~faIP~~~~~~vl~~l~~~g~~~r~~lGv~~~~~~~~~~~~l  272 (351)
T TIGR02038       193 AAINAGNSGGALINTNGELVGINTASFQKGGDEGGEGINFAIPIKLAHKIMGKIIRDGRVIRGYIGVSGEDINSVVAQGL  272 (351)
T ss_pred             CccCCCCCcceEECCCCeEEEEEeeeecccCCCCccceEEEecHHHHHHHHHHHhhcCcccceEeeeEEEECCHHHHHhc
Confidence            99999999999999999999999876532     2578999999999999999999999999999999999998888777


Q ss_pred             hcCCCCCCCCCCceEEeEECCCCccccCCCCCCCEEEEECCEeeCCHHHHHHHHhc-CCCCeEEEEEEECCCeEEEEEEE
Q 013444          357 KERDPSFPNVKSGVLVPVVTPGSPAHLAGFLPSDVVIKFDGKPVQSITEIIEIMGD-RVGEPLKVVVQRANDQLVTLTVI  435 (443)
Q Consensus       357 ~~~~~~~~~~~~g~~V~~v~~~spA~~aGl~~GDiI~~vng~~V~s~~dl~~~l~~-~~g~~v~l~v~R~~g~~~~l~v~  435 (443)
                      +..      ...|++|.+|.++|||+++||++||+|++|||++|.+++|+.+.+.. +.|+++.++|.| +|+.+++.++
T Consensus       273 gl~------~~~Gv~V~~V~~~spA~~aGL~~GDvI~~Ing~~V~s~~dl~~~l~~~~~g~~v~l~v~R-~g~~~~~~v~  345 (351)
T TIGR02038       273 GLP------DLRGIVITGVDPNGPAARAGILVRDVILKYDGKDVIGAEELMDRIAETRPGSKVMVTVLR-QGKQLELPVT  345 (351)
T ss_pred             CCC------ccccceEeecCCCChHHHCCCCCCCEEEEECCEEcCCHHHHHHHHHhcCCCCEEEEEEEE-CCEEEEEEEE
Confidence            652      23699999999999999999999999999999999999999998876 788999999999 8898899998


Q ss_pred             ecCCC
Q 013444          436 PEEAN  440 (443)
Q Consensus       436 ~~~~~  440 (443)
                      +.+.+
T Consensus       346 l~~~p  350 (351)
T TIGR02038       346 IDEKP  350 (351)
T ss_pred             ecCCC
Confidence            87643


No 3  
>PRK10898 serine endoprotease; Provisional
Probab=100.00  E-value=6.9e-49  Score=394.61  Aligned_cols=296  Identities=36%  Similarity=0.562  Sum_probs=255.5

Q ss_pred             hHHHHHHHhCCceEEEEeccccc---ccccCCcEEEEEEEeCCCEEEeccccccCCCCCCCCCCceEEEEeCCCcEEEEE
Q 013444          126 TIANAAARVCPAVVNLSAPREFL---GILSGRGIGSGAIVDADGTILTCAHVVVDFHGSRALPKGKVDVTLQDGRTFEGT  202 (443)
Q Consensus       126 ~~~~~~~~~~pSVV~I~~~~~~~---~~~~~~~~GSGfiI~~~G~ILTaaHvv~~~~~~~~~~~~~i~V~~~dg~~~~a~  202 (443)
                      ++.++++++.||||.|.......   ......+.||||+|+++|+||||+|+|.++        ..+.|++.||+.++|+
T Consensus        46 ~~~~~~~~~~psvV~v~~~~~~~~~~~~~~~~~~GSGfvi~~~G~IlTn~HVv~~a--------~~i~V~~~dg~~~~a~  117 (353)
T PRK10898         46 SYNQAVRRAAPAVVNVYNRSLNSTSHNQLEIRTLGSGVIMDQRGYILTNKHVINDA--------DQIIVALQDGRVFEAL  117 (353)
T ss_pred             hHHHHHHHhCCcEEEEEeEeccccCcccccccceeeEEEEeCCeEEEecccEeCCC--------CEEEEEeCCCCEEEEE
Confidence            58999999999999999864321   111223689999999999999999999984        5899999999999999


Q ss_pred             EEeecCCCCEEEEEEcCCCCCCccccCCCCCCCCCCEEEEEecCCCCCCceEEEEEEeeecCccCCCCCCccceEEEEcc
Q 013444          203 VLNADFHSDIAIVKINSKTPLPAAKLGTSSKLCPGDWVVAMGCPHSLQNTVTAGIVSCVDRKSSDLGLGGMRREYLQTDC  282 (443)
Q Consensus       203 vv~~d~~~DlAlLkl~~~~~~~~~~l~~s~~~~~G~~V~~iG~p~~~~~~~t~G~Vs~~~~~~~~~~~~~~~~~~i~~d~  282 (443)
                      +++.|+.+||||||++. ..+++++++++..+++|++|+++|||++...+++.|+|++..+....   ......++++|+
T Consensus       118 vv~~d~~~DlAvl~v~~-~~l~~~~l~~~~~~~~G~~V~aiG~P~g~~~~~t~Giis~~~r~~~~---~~~~~~~iqtda  193 (353)
T PRK10898        118 LVGSDSLTDLAVLKINA-TNLPVIPINPKRVPHIGDVVLAIGNPYNLGQTITQGIISATGRIGLS---PTGRQNFLQTDA  193 (353)
T ss_pred             EEEEcCCCCEEEEEEcC-CCCCeeeccCcCcCCCCCEEEEEeCCCCcCCCcceeEEEeccccccC---CccccceEEecc
Confidence            99999999999999985 46888999888889999999999999998889999999987765321   112246899999


Q ss_pred             cCCCCCccceeeecCCCEEEEEEEEeecC------CCeEEEEeHHHHHHHHHHHHHcCceeeeecCceeecccHHHHHHh
Q 013444          283 AINAGNSGGPLVNIDGEIVGINIMKVAAA------DGLSFAVPIDSAAKIIEQFKKNGRVVRPWLGLKMLDLNDMIIAQL  356 (443)
Q Consensus       283 ~i~~G~SGGPlvd~~G~VVGI~s~~~~~~------~g~~~aIPi~~i~~~l~~l~~~g~v~rp~lGi~~~~~~~~~~~~l  356 (443)
                      .+++|+|||||+|.+|+||||+++.....      .+++|+||++.+++++++|+++|++.|+|||+.++++++.....+
T Consensus       194 ~i~~GnSGGPl~n~~G~vvGI~~~~~~~~~~~~~~~g~~faIP~~~~~~~~~~l~~~G~~~~~~lGi~~~~~~~~~~~~~  273 (353)
T PRK10898        194 SINHGNSGGALVNSLGELMGINTLSFDKSNDGETPEGIGFAIPTQLATKIMDKLIRDGRVIRGYIGIGGREIAPLHAQGG  273 (353)
T ss_pred             ccCCCCCcceEECCCCeEEEEEEEEecccCCCCcccceEEEEchHHHHHHHHHHhhcCcccccccceEEEECCHHHHHhc
Confidence            99999999999999999999999866432      478999999999999999999999999999999999877655443


Q ss_pred             hcCCCCCCCCCCceEEeEECCCCccccCCCCCCCEEEEECCEeeCCHHHHHHHHhc-CCCCeEEEEEEECCCeEEEEEEE
Q 013444          357 KERDPSFPNVKSGVLVPVVTPGSPAHLAGFLPSDVVIKFDGKPVQSITEIIEIMGD-RVGEPLKVVVQRANDQLVTLTVI  435 (443)
Q Consensus       357 ~~~~~~~~~~~~g~~V~~v~~~spA~~aGl~~GDiI~~vng~~V~s~~dl~~~l~~-~~g~~v~l~v~R~~g~~~~l~v~  435 (443)
                      +.      ....|++|.+|.++|||+++||++||+|++|||++|.++.++.+.+.. ..|+++.++|+| +++..+++++
T Consensus       274 ~~------~~~~Gv~V~~V~~~spA~~aGL~~GDvI~~Ing~~V~s~~~l~~~l~~~~~g~~v~l~v~R-~g~~~~~~v~  346 (353)
T PRK10898        274 GI------DQLQGIVVNEVSPDGPAAKAGIQVNDLIISVNNKPAISALETMDQVAEIRPGSVIPVVVMR-DDKQLTLQVT  346 (353)
T ss_pred             CC------CCCCeEEEEEECCCChHHHcCCCCCCEEEEECCEEcCCHHHHHHHHHhcCCCCEEEEEEEE-CCEEEEEEEE
Confidence            32      224799999999999999999999999999999999999999988876 788999999999 8898899998


Q ss_pred             ecCCC
Q 013444          436 PEEAN  440 (443)
Q Consensus       436 ~~~~~  440 (443)
                      +.+.+
T Consensus       347 l~~~p  351 (353)
T PRK10898        347 IQEYP  351 (353)
T ss_pred             eccCC
Confidence            87654


No 4  
>PRK10942 serine endoprotease; Provisional
Probab=100.00  E-value=2.4e-47  Score=396.31  Aligned_cols=295  Identities=35%  Similarity=0.559  Sum_probs=257.2

Q ss_pred             hHHHHHHHhCCceEEEEecccc-----------cccc--------------------------------cCCcEEEEEEE
Q 013444          126 TIANAAARVCPAVVNLSAPREF-----------LGIL--------------------------------SGRGIGSGAIV  162 (443)
Q Consensus       126 ~~~~~~~~~~pSVV~I~~~~~~-----------~~~~--------------------------------~~~~~GSGfiI  162 (443)
                      +++++++++.|+||.|.+....           +.++                                ...+.||||||
T Consensus        39 ~~~~~~~~~~pavv~i~~~~~~~~~~~~~~~~~~~ff~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~GSG~ii  118 (473)
T PRK10942         39 SLAPMLEKVMPSVVSINVEGSTTVNTPRMPRQFQQFFGDNSPFCQEGSPFQSSPFCQGGQGGNGGGQQQKFMALGSGVII  118 (473)
T ss_pred             cHHHHHHHhCCceEEEEEEEeccccCCCCChhHHHhhcccccccccccccccccccccccccccccccccccceEEEEEE
Confidence            4999999999999999864310           0011                                11357999999


Q ss_pred             eC-CCEEEeccccccCCCCCCCCCCceEEEEeCCCcEEEEEEEeecCCCCEEEEEEcCCCCCCccccCCCCCCCCCCEEE
Q 013444          163 DA-DGTILTCAHVVVDFHGSRALPKGKVDVTLQDGRTFEGTVLNADFHSDIAIVKINSKTPLPAAKLGTSSKLCPGDWVV  241 (443)
Q Consensus       163 ~~-~G~ILTaaHvv~~~~~~~~~~~~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkl~~~~~~~~~~l~~s~~~~~G~~V~  241 (443)
                      ++ +||||||+|||.+.        ..+.|++.|++.|+|++++.|+.+||||||++...++++++|+++..+++|++|+
T Consensus       119 ~~~~G~IlTn~HVv~~a--------~~i~V~~~dg~~~~a~vv~~D~~~DlAvlki~~~~~l~~~~lg~s~~l~~G~~V~  190 (473)
T PRK10942        119 DADKGYVVTNNHVVDNA--------TKIKVQLSDGRKFDAKVVGKDPRSDIALIQLQNPKNLTAIKMADSDALRVGDYTV  190 (473)
T ss_pred             ECCCCEEEeChhhcCCC--------CEEEEEECCCCEEEEEEEEecCCCCEEEEEecCCCCCceeEecCccccCCCCEEE
Confidence            96 59999999999985        5899999999999999999999999999999866789999999999999999999


Q ss_pred             EEecCCCCCCceEEEEEEeeecCccCCCCCCccceEEEEcccCCCCCccceeeecCCCEEEEEEEEeec---CCCeEEEE
Q 013444          242 AMGCPHSLQNTVTAGIVSCVDRKSSDLGLGGMRREYLQTDCAINAGNSGGPLVNIDGEIVGINIMKVAA---ADGLSFAV  318 (443)
Q Consensus       242 ~iG~p~~~~~~~t~G~Vs~~~~~~~~~~~~~~~~~~i~~d~~i~~G~SGGPlvd~~G~VVGI~s~~~~~---~~g~~~aI  318 (443)
                      ++|+|+++..+++.|+|+...+....  . ..+..+|++|+.+++|+|||||+|.+|+||||++.....   ..+++|+|
T Consensus       191 aiG~P~g~~~tvt~GiVs~~~r~~~~--~-~~~~~~iqtda~i~~GnSGGpL~n~~GeviGI~t~~~~~~g~~~g~gfaI  267 (473)
T PRK10942        191 AIGNPYGLGETVTSGIVSALGRSGLN--V-ENYENFIQTDAAINRGNSGGALVNLNGELIGINTAILAPDGGNIGIGFAI  267 (473)
T ss_pred             EEcCCCCCCcceeEEEEEEeecccCC--c-ccccceEEeccccCCCCCcCccCCCCCeEEEEEEEEEcCCCCcccEEEEE
Confidence            99999999999999999988765211  1 123578999999999999999999999999999987643   25689999


Q ss_pred             eHHHHHHHHHHHHHcCceeeeecCceeecccHHHHHHhhcCCCCCCCCCCceEEeEECCCCccccCCCCCCCEEEEECCE
Q 013444          319 PIDSAAKIIEQFKKNGRVVRPWLGLKMLDLNDMIIAQLKERDPSFPNVKSGVLVPVVTPGSPAHLAGFLPSDVVIKFDGK  398 (443)
Q Consensus       319 Pi~~i~~~l~~l~~~g~v~rp~lGi~~~~~~~~~~~~l~~~~~~~~~~~~g~~V~~v~~~spA~~aGl~~GDiI~~vng~  398 (443)
                      |++.+++++++|+++|++.|+|||+.++++++++++.+++.      ...|++|.+|.++|||+++||++||+|++|||+
T Consensus       268 P~~~~~~v~~~l~~~g~v~rg~lGv~~~~l~~~~a~~~~l~------~~~GvlV~~V~~~SpA~~AGL~~GDvIl~InG~  341 (473)
T PRK10942        268 PSNMVKNLTSQMVEYGQVKRGELGIMGTELNSELAKAMKVD------AQRGAFVSQVLPNSSAAKAGIKAGDVITSLNGK  341 (473)
T ss_pred             EHHHHHHHHHHHHhccccccceeeeEeeecCHHHHHhcCCC------CCCceEEEEECCCChHHHcCCCCCCEEEEECCE
Confidence            99999999999999999999999999999999988887653      357999999999999999999999999999999


Q ss_pred             eeCCHHHHHHHHhc-CCCCeEEEEEEECCCeEEEEEEEecC
Q 013444          399 PVQSITEIIEIMGD-RVGEPLKVVVQRANDQLVTLTVIPEE  438 (443)
Q Consensus       399 ~V~s~~dl~~~l~~-~~g~~v~l~v~R~~g~~~~l~v~~~~  438 (443)
                      +|.+|+++...+.. ..|+++.++|.| +|+.+++.+++..
T Consensus       342 ~V~s~~dl~~~l~~~~~g~~v~l~v~R-~G~~~~v~v~l~~  381 (473)
T PRK10942        342 PISSFAALRAQVGTMPVGSKLTLGLLR-DGKPVNVNVELQQ  381 (473)
T ss_pred             ECCCHHHHHHHHHhcCCCCEEEEEEEE-CCeEEEEEEEeCc
Confidence            99999999988866 678899999999 8888888887654


No 5  
>TIGR02037 degP_htrA_DO periplasmic serine protease, Do/DegQ family. This family consists of a set of proteins variously designated DegP, heat shock protein HtrA, and protease DO. The ortholog in Pseudomonas aeruginosa is designated MucD and is found in an operon that controls mucoid phenotype. This family also includes the DegQ (HhoA) paralog in E. coli, which can rescue a DegP mutant, but not the smaller DegS paralog, which cannot. Members of this family are located in the periplasm and have separable functions as both protease and chaperone. Members have a trypsin domain and two copies of a PDZ domain. This protein protects bacteria from thermal and other stresses and may be important for the survival of bacterial pathogens. The chaperone function is dominant at low temperatures, whereas the proteolytic activity is turned on at elevated temperatures.
Probab=100.00  E-value=3.2e-47  Score=393.45  Aligned_cols=295  Identities=40%  Similarity=0.674  Sum_probs=258.0

Q ss_pred             hHHHHHHHhCCceEEEEecccc-------------cccc--------------cCCcEEEEEEEeCCCEEEeccccccCC
Q 013444          126 TIANAAARVCPAVVNLSAPREF-------------LGIL--------------SGRGIGSGAIVDADGTILTCAHVVVDF  178 (443)
Q Consensus       126 ~~~~~~~~~~pSVV~I~~~~~~-------------~~~~--------------~~~~~GSGfiI~~~G~ILTaaHvv~~~  178 (443)
                      ++.++++++.||||.|.+....             ..++              ...+.||||+|+++|+||||+||+.++
T Consensus         2 ~~~~~~~~~~p~vv~i~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~GSGfii~~~G~IlTn~Hvv~~~   81 (428)
T TIGR02037         2 SFAPLVEKVAPAVVNISVEGTVKRRNRPPALPPFFRQFFGDDMPNFPRQQRERKVRGLGSGVIISADGYILTNNHVVDGA   81 (428)
T ss_pred             cHHHHHHHhCCceEEEEEEEEecccCCCcccchhHHHhhcccccCcccccccccccceeeEEEECCCCEEEEcHHHcCCC
Confidence            3789999999999999874210             0011              124679999999999999999999985


Q ss_pred             CCCCCCCCceEEEEeCCCcEEEEEEEeecCCCCEEEEEEcCCCCCCccccCCCCCCCCCCEEEEEecCCCCCCceEEEEE
Q 013444          179 HGSRALPKGKVDVTLQDGRTFEGTVLNADFHSDIAIVKINSKTPLPAAKLGTSSKLCPGDWVVAMGCPHSLQNTVTAGIV  258 (443)
Q Consensus       179 ~~~~~~~~~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkl~~~~~~~~~~l~~s~~~~~G~~V~~iG~p~~~~~~~t~G~V  258 (443)
                              ..+.|++.|++.++|++++.|+.+|||||+++....++++.|+++..+++|++|+++|||++...+++.|+|
T Consensus        82 --------~~i~V~~~~~~~~~a~vv~~d~~~DlAllkv~~~~~~~~~~l~~~~~~~~G~~v~aiG~p~g~~~~~t~G~v  153 (428)
T TIGR02037        82 --------DEITVTLSDGREFKAKLVGKDPRTDIAVLKIDAKKNLPVIKLGDSDKLRVGDWVLAIGNPFGLGQTVTSGIV  153 (428)
T ss_pred             --------CeEEEEeCCCCEEEEEEEEecCCCCEEEEEecCCCCceEEEccCCCCCCCCCEEEEEECCCcCCCcEEEEEE
Confidence                    589999999999999999999999999999987667999999988899999999999999999999999999


Q ss_pred             EeeecCccCCCCCCccceEEEEcccCCCCCccceeeecCCCEEEEEEEEeec---CCCeEEEEeHHHHHHHHHHHHHcCc
Q 013444          259 SCVDRKSSDLGLGGMRREYLQTDCAINAGNSGGPLVNIDGEIVGINIMKVAA---ADGLSFAVPIDSAAKIIEQFKKNGR  335 (443)
Q Consensus       259 s~~~~~~~~~~~~~~~~~~i~~d~~i~~G~SGGPlvd~~G~VVGI~s~~~~~---~~g~~~aIPi~~i~~~l~~l~~~g~  335 (443)
                      +...+....   ...+..++++|+.+++|+|||||||.+|+||||++.....   ..+++|+||++.+++++++|+++++
T Consensus       154 s~~~~~~~~---~~~~~~~i~tda~i~~GnSGGpl~n~~G~viGI~~~~~~~~g~~~g~~faiP~~~~~~~~~~l~~~g~  230 (428)
T TIGR02037       154 SALGRSGLG---IGDYENFIQTDAAINPGNSGGPLVNLRGEVIGINTAIYSPSGGNVGIGFAIPSNMAKNVVDQLIEGGK  230 (428)
T ss_pred             EecccCccC---CCCccceEEECCCCCCCCCCCceECCCCeEEEEEeEEEcCCCCccceEEEEEhHHHHHHHHHHHhcCc
Confidence            988765311   1223568999999999999999999999999999886652   3578999999999999999999999


Q ss_pred             eeeeecCceeecccHHHHHHhhcCCCCCCCCCCceEEeEECCCCccccCCCCCCCEEEEECCEeeCCHHHHHHHHhc-CC
Q 013444          336 VVRPWLGLKMLDLNDMIIAQLKERDPSFPNVKSGVLVPVVTPGSPAHLAGFLPSDVVIKFDGKPVQSITEIIEIMGD-RV  414 (443)
Q Consensus       336 v~rp~lGi~~~~~~~~~~~~l~~~~~~~~~~~~g~~V~~v~~~spA~~aGl~~GDiI~~vng~~V~s~~dl~~~l~~-~~  414 (443)
                      +.|||||+.+++++++.++.++..      ...|++|.+|.++|||+++||++||+|++|||++|.++.++..++.. ..
T Consensus       231 ~~~~~lGi~~~~~~~~~~~~lgl~------~~~Gv~V~~V~~~spA~~aGL~~GDvI~~Vng~~i~~~~~~~~~l~~~~~  304 (428)
T TIGR02037       231 VQRGWLGVTIQEVTSDLAKSLGLE------KQRGALVAQVLPGSPAEKAGLKAGDVILSVNGKPISSFADLRRAIGTLKP  304 (428)
T ss_pred             CcCCcCceEeecCCHHHHHHcCCC------CCCceEEEEccCCCChHHcCCCCCCEEEEECCEEcCCHHHHHHHHHhcCC
Confidence            999999999999999998888763      24799999999999999999999999999999999999999998876 67


Q ss_pred             CCeEEEEEEECCCeEEEEEEEecC
Q 013444          415 GEPLKVVVQRANDQLVTLTVIPEE  438 (443)
Q Consensus       415 g~~v~l~v~R~~g~~~~l~v~~~~  438 (443)
                      |++++++|.| +++.+++++++..
T Consensus       305 g~~v~l~v~R-~g~~~~~~v~l~~  327 (428)
T TIGR02037       305 GKKVTLGILR-KGKEKTITVTLGA  327 (428)
T ss_pred             CCEEEEEEEE-CCEEEEEEEEECc
Confidence            8999999999 8888888887654


No 6  
>COG0265 DegQ Trypsin-like serine proteases, typically periplasmic, contain C-terminal PDZ domain [Posttranslational modification, protein turnover, chaperones]
Probab=100.00  E-value=4.3e-37  Score=309.82  Aligned_cols=295  Identities=39%  Similarity=0.627  Sum_probs=257.0

Q ss_pred             hhHHHHHHHhCCceEEEEecccccc---------cccCCcEEEEEEEeCCCEEEeccccccCCCCCCCCCCceEEEEeCC
Q 013444          125 DTIANAAARVCPAVVNLSAPREFLG---------ILSGRGIGSGAIVDADGTILTCAHVVVDFHGSRALPKGKVDVTLQD  195 (443)
Q Consensus       125 ~~~~~~~~~~~pSVV~I~~~~~~~~---------~~~~~~~GSGfiI~~~G~ILTaaHvv~~~~~~~~~~~~~i~V~~~d  195 (443)
                      ..+..+++++.|+||.|........         .....+.||||+++.+|||+|+.|++.++        ..+.+.+.|
T Consensus        33 ~~~~~~~~~~~~~vV~~~~~~~~~~~~~~~~~~~~~~~~~~gSg~i~~~~g~ivTn~hVi~~a--------~~i~v~l~d  104 (347)
T COG0265          33 LSFATAVEKVAPAVVSIATGLTAKLRSFFPSDPPLRSAEGLGSGFIISSDGYIVTNNHVIAGA--------EEITVTLAD  104 (347)
T ss_pred             cCHHHHHHhcCCcEEEEEeeeeecchhcccCCcccccccccccEEEEcCCeEEEecceecCCc--------ceEEEEeCC
Confidence            4689999999999999997542211         00014789999999899999999999984        588899999


Q ss_pred             CcEEEEEEEeecCCCCEEEEEEcCCCCCCccccCCCCCCCCCCEEEEEecCCCCCCceEEEEEEeeecCccCCCCCCccc
Q 013444          196 GRTFEGTVLNADFHSDIAIVKINSKTPLPAAKLGTSSKLCPGDWVVAMGCPHSLQNTVTAGIVSCVDRKSSDLGLGGMRR  275 (443)
Q Consensus       196 g~~~~a~vv~~d~~~DlAlLkl~~~~~~~~~~l~~s~~~~~G~~V~~iG~p~~~~~~~t~G~Vs~~~~~~~~~~~~~~~~  275 (443)
                      |+.+++++++.|+..|+|+||++....++.+.++++..++.|++++++|+|+++..+++.|+++...+.  .........
T Consensus       105 g~~~~a~~vg~d~~~dlavlki~~~~~~~~~~~~~s~~l~vg~~v~aiGnp~g~~~tvt~Givs~~~r~--~v~~~~~~~  182 (347)
T COG0265         105 GREVPAKLVGKDPISDLAVLKIDGAGGLPVIALGDSDKLRVGDVVVAIGNPFGLGQTVTSGIVSALGRT--GVGSAGGYV  182 (347)
T ss_pred             CCEEEEEEEecCCccCEEEEEeccCCCCceeeccCCCCcccCCEEEEecCCCCcccceeccEEeccccc--cccCccccc
Confidence            999999999999999999999997544888899999999999999999999999999999999998886  222222256


Q ss_pred             eEEEEcccCCCCCccceeeecCCCEEEEEEEEeecCC---CeEEEEeHHHHHHHHHHHHHcCceeeeecCceeecccHHH
Q 013444          276 EYLQTDCAINAGNSGGPLVNIDGEIVGINIMKVAAAD---GLSFAVPIDSAAKIIEQFKKNGRVVRPWLGLKMLDLNDMI  352 (443)
Q Consensus       276 ~~i~~d~~i~~G~SGGPlvd~~G~VVGI~s~~~~~~~---g~~~aIPi~~i~~~l~~l~~~g~v~rp~lGi~~~~~~~~~  352 (443)
                      .+|++|+.+++|+||||++|.+|++|||++.......   +++|+||++.++.+++.+.+.|++.|+|+|+.+.+++...
T Consensus       183 ~~IqtdAain~gnsGgpl~n~~g~~iGint~~~~~~~~~~gigfaiP~~~~~~v~~~l~~~G~v~~~~lgv~~~~~~~~~  262 (347)
T COG0265         183 NFIQTDAAINPGNSGGPLVNIDGEVVGINTAIIAPSGGSSGIGFAIPVNLVAPVLDELISKGKVVRGYLGVIGEPLTADI  262 (347)
T ss_pred             chhhcccccCCCCCCCceEcCCCcEEEEEEEEecCCCCcceeEEEecHHHHHHHHHHHHHcCCccccccceEEEEccccc
Confidence            7899999999999999999999999999999877543   5899999999999999999988999999999998887665


Q ss_pred             HHHhhcCCCCCCCCCCceEEeEECCCCccccCCCCCCCEEEEECCEeeCCHHHHHHHHhc-CCCCeEEEEEEECCCeEEE
Q 013444          353 IAQLKERDPSFPNVKSGVLVPVVTPGSPAHLAGFLPSDVVIKFDGKPVQSITEIIEIMGD-RVGEPLKVVVQRANDQLVT  431 (443)
Q Consensus       353 ~~~l~~~~~~~~~~~~g~~V~~v~~~spA~~aGl~~GDiI~~vng~~V~s~~dl~~~l~~-~~g~~v~l~v~R~~g~~~~  431 (443)
                      .  ++     + ....|++|.+|.+++||+++|++.||+|+++||+++.+..++...+.. ..|+++.+++.| +|+.++
T Consensus       263 ~--~g-----~-~~~~G~~V~~v~~~spa~~agi~~Gdii~~vng~~v~~~~~l~~~v~~~~~g~~v~~~~~r-~g~~~~  333 (347)
T COG0265         263 A--LG-----L-PVAAGAVVLGVLPGSPAAKAGIKAGDIITAVNGKPVASLSDLVAAVASNRPGDEVALKLLR-GGKERE  333 (347)
T ss_pred             c--cC-----C-CCCCceEEEecCCCChHHHcCCCCCCEEEEECCEEccCHHHHHHHHhccCCCCEEEEEEEE-CCEEEE
Confidence            4  22     2 245789999999999999999999999999999999999999988876 679999999999 799999


Q ss_pred             EEEEecC
Q 013444          432 LTVIPEE  438 (443)
Q Consensus       432 l~v~~~~  438 (443)
                      +.+++.+
T Consensus       334 ~~v~l~~  340 (347)
T COG0265         334 LAVTLGD  340 (347)
T ss_pred             EEEEecC
Confidence            9998876


No 7  
>KOG1320 consensus Serine protease [Posttranslational modification, protein turnover, chaperones]
Probab=99.97  E-value=2.1e-29  Score=254.91  Aligned_cols=318  Identities=33%  Similarity=0.421  Sum_probs=257.2

Q ss_pred             chhHHHHHHHhCCceEEEEecccc------cccccCCcEEEEEEEeCCCEEEeccccccCCCCCCC---CCCceEEEEeC
Q 013444          124 RDTIANAAARVCPAVVNLSAPREF------LGILSGRGIGSGAIVDADGTILTCAHVVVDFHGSRA---LPKGKVDVTLQ  194 (443)
Q Consensus       124 ~~~~~~~~~~~~pSVV~I~~~~~~------~~~~~~~~~GSGfiI~~~G~ILTaaHvv~~~~~~~~---~~~~~i~V~~~  194 (443)
                      ...+.++.++-.+++|.|+...-+      ....-....||||+++.+|+++||+||+........   ..-..+.+...
T Consensus       127 ~~~v~~~~~~cd~Avv~Ie~~~f~~~~~~~e~~~ip~l~~S~~Vv~gd~i~VTnghV~~~~~~~y~~~~~~l~~vqi~aa  206 (473)
T KOG1320|consen  127 KAFVAAVFEECDLAVVYIESEEFWKGMNPFELGDIPSLNGSGFVVGGDGIIVTNGHVVRVEPRIYAHSSTVLLRVQIDAA  206 (473)
T ss_pred             hhhHHHhhhcccceEEEEeeccccCCCcccccCCCcccCccEEEEcCCcEEEEeeEEEEEEeccccCCCcceeeEEEEEe
Confidence            455778889999999999963211      111134467999999999999999999986432211   11124666666


Q ss_pred             CC--cEEEEEEEeecCCCCEEEEEEcCC-CCCCccccCCCCCCCCCCEEEEEecCCCCCCceEEEEEEeeecCccCCCCC
Q 013444          195 DG--RTFEGTVLNADFHSDIAIVKINSK-TPLPAAKLGTSSKLCPGDWVVAMGCPHSLQNTVTAGIVSCVDRKSSDLGLG  271 (443)
Q Consensus       195 dg--~~~~a~vv~~d~~~DlAlLkl~~~-~~~~~~~l~~s~~~~~G~~V~~iG~p~~~~~~~t~G~Vs~~~~~~~~~~~~  271 (443)
                      ++  ..+++.+.+.|+..|+|+++++.+ ...++++++.+..+..|+++..+|.|++..++.+.|+++...|...+++..
T Consensus       207 ~~~~~s~ep~i~g~d~~~gvA~l~ik~~~~i~~~i~~~~~~~~~~G~~~~a~~~~f~~~nt~t~g~vs~~~R~~~~lg~~  286 (473)
T KOG1320|consen  207 IGPGNSGEPVIVGVDKVAGVAFLKIKTPENILYVIPLGVSSHFRTGVEVSAIGNGFGLLNTLTQGMVSGQLRKSFKLGLE  286 (473)
T ss_pred             ecCCccCCCeEEccccccceEEEEEecCCcccceeecceeeeecccceeeccccCceeeeeeeecccccccccccccCcc
Confidence            65  899999999999999999999755 337888898899999999999999999999999999999998887665544


Q ss_pred             --CccceEEEEcccCCCCCccceeeecCCCEEEEEEEEeec---CCCeEEEEeHHHHHHHHHHHHH---cCce------e
Q 013444          272 --GMRREYLQTDCAINAGNSGGPLVNIDGEIVGINIMKVAA---ADGLSFAVPIDSAAKIIEQFKK---NGRV------V  337 (443)
Q Consensus       272 --~~~~~~i~~d~~i~~G~SGGPlvd~~G~VVGI~s~~~~~---~~g~~~aIPi~~i~~~l~~l~~---~g~v------~  337 (443)
                        ....+++++|+.++.|+||||++|.+|++||++++....   ..+++|++|.+.+..++.+..+   ..+.      .
T Consensus       287 ~g~~i~~~~qtd~ai~~~nsg~~ll~~DG~~IgVn~~~~~ri~~~~~iSf~~p~d~vl~~v~r~~e~~~~lr~~~~~~p~  366 (473)
T KOG1320|consen  287 TGVLISKINQTDAAINPGNSGGPLLNLDGEVIGVNTRKVTRIGFSHGISFKIPIDTVLVIVLRLGEFQISLRPVKPLVPV  366 (473)
T ss_pred             cceeeeeecccchhhhcccCCCcEEEecCcEeeeeeeeeEEeeccccceeccCchHhhhhhhhhhhhceeeccccCcccc
Confidence              556789999999999999999999999999998886542   3678999999998888777632   2222      2


Q ss_pred             eeecCceeecccHHHHHHhhcCCCCCC-CCCCceEEeEECCCCccccCCCCCCCEEEEECCEeeCCHHHHHHHHhc-CCC
Q 013444          338 RPWLGLKMLDLNDMIIAQLKERDPSFP-NVKSGVLVPVVTPGSPAHLAGFLPSDVVIKFDGKPVQSITEIIEIMGD-RVG  415 (443)
Q Consensus       338 rp~lGi~~~~~~~~~~~~l~~~~~~~~-~~~~g~~V~~v~~~spA~~aGl~~GDiI~~vng~~V~s~~dl~~~l~~-~~g  415 (443)
                      +.|+|..+..+...+..++..+.+.++ ....+++|.+|.+++++...++++||+|++|||++|.+..++.++++. ..+
T Consensus       367 ~~~~g~~s~~i~~g~vf~~~~~~~~~~~~~~q~v~is~Vlp~~~~~~~~~~~g~~V~~vng~~V~n~~~l~~~i~~~~~~  446 (473)
T KOG1320|consen  367 HQYIGLPSYYIFAGLVFVPLTKSYIFPSGVVQLVLVSQVLPGSINGGYGLKPGDQVVKVNGKPVKNLKHLYELIEECSTE  446 (473)
T ss_pred             cccCCceeEEEecceEEeecCCCccccccceeEEEEEEeccCCCcccccccCCCEEEEECCEEeechHHHHHHHHhcCcC
Confidence            459999988888877777766666666 344689999999999999999999999999999999999999999987 455


Q ss_pred             CeEEEEEEECCCeEEEEEEEecCCCCC
Q 013444          416 EPLKVVVQRANDQLVTLTVIPEEANPD  442 (443)
Q Consensus       416 ~~v~l~v~R~~g~~~~l~v~~~~~~~~  442 (443)
                      +++.+..+| +.+..++.+.+++..+.
T Consensus       447 ~~v~vl~~~-~~e~~tl~Il~~~~~p~  472 (473)
T KOG1320|consen  447 DKVAVLDRR-SAEDATLEILPEHKIPS  472 (473)
T ss_pred             ceEEEEEec-CccceeEEecccccCCC
Confidence            678888777 77888999988876654


No 8  
>KOG1421 consensus Predicted signaling-associated protein (contains a PDZ domain) [General function prediction only]
Probab=99.94  E-value=1.5e-25  Score=228.72  Aligned_cols=306  Identities=22%  Similarity=0.320  Sum_probs=252.8

Q ss_pred             chhHHHHHHHhCCceEEEEeccc--ccccccCCcEEEEEEEeCC-CEEEeccccccCCCCCCCCCCceEEEEeCCCcEEE
Q 013444          124 RDTIANAAARVCPAVVNLSAPRE--FLGILSGRGIGSGAIVDAD-GTILTCAHVVVDFHGSRALPKGKVDVTLQDGRTFE  200 (443)
Q Consensus       124 ~~~~~~~~~~~~pSVV~I~~~~~--~~~~~~~~~~GSGfiI~~~-G~ILTaaHvv~~~~~~~~~~~~~i~V~~~dg~~~~  200 (443)
                      ...+...+..+-++||.|...+-  ++..+.+.+.+|||++++. |+||||+|++...       ...-.+.|.+..+.+
T Consensus        51 ~e~w~~~ia~VvksvVsI~~S~v~~fdtesag~~~atgfvvd~~~gyiLtnrhvv~pg-------P~va~avf~n~ee~e  123 (955)
T KOG1421|consen   51 SEDWRNTIANVVKSVVSIRFSAVRAFDTESAGESEATGFVVDKKLGYILTNRHVVAPG-------PFVASAVFDNHEEIE  123 (955)
T ss_pred             hhhhhhhhhhhcccEEEEEehheeecccccccccceeEEEEecccceEEEeccccCCC-------CceeEEEecccccCC
Confidence            34788999999999999998643  3444566788999999986 8999999999863       245667788888888


Q ss_pred             EEEEeecCCCCEEEEEEcCC----CCCCccccCCCCCCCCCCEEEEEecCCCCCCceEEEEEEeeecCccCCCC---CCc
Q 013444          201 GTVLNADFHSDIAIVKINSK----TPLPAAKLGTSSKLCPGDWVVAMGCPHSLQNTVTAGIVSCVDRKSSDLGL---GGM  273 (443)
Q Consensus       201 a~vv~~d~~~DlAlLkl~~~----~~~~~~~l~~s~~~~~G~~V~~iG~p~~~~~~~t~G~Vs~~~~~~~~~~~---~~~  273 (443)
                      .-.++.|+-+|+.+++.+..    ..+..+.+.. +-.++|.+++++|+..+.-.++..|.++..++...+++.   +..
T Consensus       124 i~pvyrDpVhdfGf~r~dps~ir~s~vt~i~lap-~~akvgseirvvgNDagEklsIlagflSrldr~apdyg~~~yndf  202 (955)
T KOG1421|consen  124 IYPVYRDPVHDFGFFRYDPSTIRFSIVTEICLAP-ELAKVGSEIRVVGNDAGEKLSILAGFLSRLDRNAPDYGEDTYNDF  202 (955)
T ss_pred             cccccCCchhhcceeecChhhcceeeeeccccCc-cccccCCceEEecCCccceEEeehhhhhhccCCCccccccccccc
Confidence            88899999999999999854    2333444533 456899999999998777778889999999998877643   334


Q ss_pred             cceEEEEcccCCCCCccceeeecCCCEEEEEEEEeecCCCeEEEEeHHHHHHHHHHHHHcCceeeeecCceeecccHHHH
Q 013444          274 RREYLQTDCAINAGNSGGPLVNIDGEIVGINIMKVAAADGLSFAVPIDSAAKIIEQFKKNGRVVRPWLGLKMLDLNDMII  353 (443)
Q Consensus       274 ~~~~i~~d~~i~~G~SGGPlvd~~G~VVGI~s~~~~~~~g~~~aIPi~~i~~~l~~l~~~g~v~rp~lGi~~~~~~~~~~  353 (443)
                      ...++|.......|.||+|++|.+|..|.++..+... .+.+|++|++.+.+.+.-++++.-++|+.|.+++..-.-+..
T Consensus       203 nTfy~QaasstsggssgspVv~i~gyAVAl~agg~~s-sas~ffLpLdrV~RaL~clq~n~PItRGtLqvefl~k~~de~  281 (955)
T KOG1421|consen  203 NTFYIQAASSTSGGSSGSPVVDIPGYAVALNAGGSIS-SASDFFLPLDRVVRALRCLQNNTPITRGTLQVEFLHKLFDEC  281 (955)
T ss_pred             cceeeeehhcCCCCCCCCceecccceEEeeecCCccc-ccccceeeccchhhhhhhhhcCCCcccceEEEEEehhhhHHH
Confidence            4568899999999999999999999999998876543 456789999999999999998888999999999887666667


Q ss_pred             HHhhcCC-------CCCCCCCCceEEeEECCCCccccCCCCCCCEEEEECCEeeCCHHHHHHHHhcCCCCeEEEEEEECC
Q 013444          354 AQLKERD-------PSFPNVKSGVLVPVVTPGSPAHLAGFLPSDVVIKFDGKPVQSITEIIEIMGDRVGEPLKVVVQRAN  426 (443)
Q Consensus       354 ~~l~~~~-------~~~~~~~~g~~V~~v~~~spA~~aGl~~GDiI~~vng~~V~s~~dl~~~l~~~~g~~v~l~v~R~~  426 (443)
                      +++++..       ..+|....-++|..|.+++||++. |++||++++||+.-+.++.++.+.|.+..|+.+.|+|+| +
T Consensus       282 rrlGL~sE~eqv~r~k~P~~tgmLvV~~vL~~gpa~k~-Le~GDillavN~t~l~df~~l~~iLDegvgk~l~LtI~R-g  359 (955)
T KOG1421|consen  282 RRLGLSSEWEQVVRTKFPERTGMLVVETVLPEGPAEKK-LEPGDILLAVNSTCLNDFEALEQILDEGVGKNLELTIQR-G  359 (955)
T ss_pred             HhcCCcHHHHHHHHhcCcccceeEEEEEeccCCchhhc-cCCCcEEEEEcceehHHHHHHHHHHhhccCceEEEEEEe-C
Confidence            7776644       356766556778899999999998 999999999999999999999999999999999999999 8


Q ss_pred             CeEEEEEEEecCCC
Q 013444          427 DQLVTLTVIPEEAN  440 (443)
Q Consensus       427 g~~~~l~v~~~~~~  440 (443)
                      |++.++++...+.+
T Consensus       360 gqelel~vtvqdlh  373 (955)
T KOG1421|consen  360 GQELELTVTVQDLH  373 (955)
T ss_pred             CEEEEEEEEecccc
Confidence            89888888876543


No 9  
>PF13365 Trypsin_2:  Trypsin-like peptidase domain; PDB: 1Y8T_A 2Z9I_A 3QO6_A 1L1J_A 1QY6_A 2O8L_A 3OTP_E 2ZLE_I 1KY9_A 3CS0_A ....
Probab=99.69  E-value=1.9e-16  Score=134.02  Aligned_cols=117  Identities=35%  Similarity=0.570  Sum_probs=77.6

Q ss_pred             EEEEEEeCCCEEEeccccccCCCCCCCCCCceEEEEeCCCcEEE--EEEEeecCC-CCEEEEEEcCCCCCCccccCCCCC
Q 013444          157 GSGAIVDADGTILTCAHVVVDFHGSRALPKGKVDVTLQDGRTFE--GTVLNADFH-SDIAIVKINSKTPLPAAKLGTSSK  233 (443)
Q Consensus       157 GSGfiI~~~G~ILTaaHvv~~~~~~~~~~~~~i~V~~~dg~~~~--a~vv~~d~~-~DlAlLkl~~~~~~~~~~l~~s~~  233 (443)
                      ||||+|+++|+||||+||+.+...........+.+.+.++....  +++++.++. +|+|||+++               
T Consensus         1 GTGf~i~~~g~ilT~~Hvv~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~D~All~v~---------------   65 (120)
T PF13365_consen    1 GTGFLIGPDGYILTAAHVVEDWNDGKQPDNSSVEVVFPDGRRVPPVAEVVYFDPDDYDLALLKVD---------------   65 (120)
T ss_dssp             EEEEEEETTTEEEEEHHHHTCCTT--G-TCSEEEEEETTSCEEETEEEEEEEETT-TTEEEEEES---------------
T ss_pred             CEEEEEcCCceEEEchhheecccccccCCCCEEEEEecCCCEEeeeEEEEEECCccccEEEEEEe---------------
Confidence            89999999999999999999754332223568888999988888  999999999 999999998               


Q ss_pred             CCCCCEEEEEecCCCCCCceEEEEEEeeecCccCCCCCCccceEEEEcccCCCCCccceeeecCCCEEEE
Q 013444          234 LCPGDWVVAMGCPHSLQNTVTAGIVSCVDRKSSDLGLGGMRREYLQTDCAINAGNSGGPLVNIDGEIVGI  303 (443)
Q Consensus       234 ~~~G~~V~~iG~p~~~~~~~t~G~Vs~~~~~~~~~~~~~~~~~~i~~d~~i~~G~SGGPlvd~~G~VVGI  303 (443)
                           .....+...     ...............    ......+ +++.+.+|+|||||||.+|+||||
T Consensus        66 -----~~~~~~~~~-----~~~~~~~~~~~~~~~----~~~~~~~-~~~~~~~G~SGgpv~~~~G~vvGi  120 (120)
T PF13365_consen   66 -----PWTGVGGGV-----RVPGSTSGVSPTSTN----DNRMLYI-TDADTRPGSSGGPVFDSDGRVVGI  120 (120)
T ss_dssp             -----CEEEEEEEE-----EEEEEEEEEEEEEEE----ETEEEEE-ESSS-STTTTTSEEEETTSEEEEE
T ss_pred             -----cccceeeee-----EeeeeccccccccCc----ccceeEe-eecccCCCcEeHhEECCCCEEEeC
Confidence                 000000000     000000000000000    0001124 799999999999999999999997


No 10 
>KOG1421 consensus Predicted signaling-associated protein (contains a PDZ domain) [General function prediction only]
Probab=99.64  E-value=2e-14  Score=147.99  Aligned_cols=290  Identities=15%  Similarity=0.144  Sum_probs=204.4

Q ss_pred             HHHhCCceEEEEecccc--cccccCCcEEEEEEEeCC-CEEEeccccccCCCCCCCCCCceEEEEeCCCcEEEEEEEeec
Q 013444          131 AARVCPAVVNLSAPREF--LGILSGRGIGSGAIVDAD-GTILTCAHVVVDFHGSRALPKGKVDVTLQDGRTFEGTVLNAD  207 (443)
Q Consensus       131 ~~~~~pSVV~I~~~~~~--~~~~~~~~~GSGfiI~~~-G~ILTaaHvv~~~~~~~~~~~~~i~V~~~dg~~~~a~vv~~d  207 (443)
                      .+++..+.|.++...+.  ++.......|||.|++.+ |++++++.++.-.       ..+.+|++.|...++|.+.+.|
T Consensus       524 ~~~i~~~~~~v~~~~~~~l~g~s~~i~kgt~~i~d~~~g~~vvsr~~vp~d-------~~d~~vt~~dS~~i~a~~~fL~  596 (955)
T KOG1421|consen  524 SADISNCLVDVEPMMPVNLDGVSSDIYKGTALIMDTSKGLGVVSRSVVPSD-------AKDQRVTEADSDGIPANVSFLH  596 (955)
T ss_pred             hhHHhhhhhhheeceeeccccchhhhhcCceEEEEccCCceeEecccCCch-------hhceEEeecccccccceeeEec
Confidence            45667777777765443  444444456999999976 8999999999753       3678899999999999999999


Q ss_pred             CCCCEEEEEEcCCCCCCccccCCCCCCCCCCEEEEEecCCCCCCceEEEEEEee---ecC-ccCCCCCCccceEEEEccc
Q 013444          208 FHSDIAIVKINSKTPLPAAKLGTSSKLCPGDWVVAMGCPHSLQNTVTAGIVSCV---DRK-SSDLGLGGMRREYLQTDCA  283 (443)
Q Consensus       208 ~~~DlAlLkl~~~~~~~~~~l~~s~~~~~G~~V~~iG~p~~~~~~~t~G~Vs~~---~~~-~~~~~~~~~~~~~i~~d~~  283 (443)
                      +..++|.+|.+.. -...++|.+ ..+..||++...|+............|+.+   ... ..-..+.....+.|..++.
T Consensus       597 ~t~n~a~~kydp~-~~~~~kl~~-~~v~~gD~~~f~g~~~~~r~ltaktsv~dvs~~~~ps~~~pr~r~~n~e~Is~~~n  674 (955)
T KOG1421|consen  597 PTENVASFKYDPA-LEVQLKLTD-TTVLRGDECTFEGFTEDLRALTAKTSVTDVSVVIIPSSVMPRFRATNLEVISFMDN  674 (955)
T ss_pred             CccceeEeccChh-Hhhhhccce-eeEecCCceeEecccccchhhcccceeeeeEEEEecCCCCcceeecceEEEEEecc
Confidence            9999999999853 234455644 567889999999998765432222222222   111 1112233444567877777


Q ss_pred             CCCCCccceeeecCCCEEEEEEEEeecCC-----CeEEEEeHHHHHHHHHHHHHcCceeeeecCceeecccHHHHHHhhc
Q 013444          284 INAGNSGGPLVNIDGEIVGINIMKVAAAD-----GLSFAVPIDSAAKIIEQFKKNGRVVRPWLGLKMLDLNDMIIAQLKE  358 (443)
Q Consensus       284 i~~G~SGGPlvd~~G~VVGI~s~~~~~~~-----g~~~aIPi~~i~~~l~~l~~~g~v~rp~lGi~~~~~~~~~~~~l~~  358 (443)
                      +.-+.--|-+.|.+|+|+|++-..+.+.-     ..-|.+.+..++..|+.|+.++..+...+|+++..++-..++.+++
T Consensus       675 lsT~c~sg~ltdddg~vvalwl~~~ge~~~~kd~~y~~gl~~~~~l~vl~rlk~g~~~rp~i~~vef~~i~laqar~lgl  754 (955)
T KOG1421|consen  675 LSTSCLSGRLTDDDGEVVALWLSVVGEDVGGKDYTYKYGLSMSYILPVLERLKLGPSARPTIAGVEFSHITLAQARTLGL  754 (955)
T ss_pred             ccccccceEEECCCCeEEEEEeeeeccccCCceeEEEeccchHHHHHHHHHHhcCCCCCceeeccceeeEEeehhhccCC
Confidence            65555456788999999999876654321     2457789999999999999888776667788776665544444433


Q ss_pred             CCC-------CCCCCCCceEEeEECCCCccccCCCCCCCEEEEECCEeeCCHHHHHHHHhcCCCCeEEEEEEECCCeEEE
Q 013444          359 RDP-------SFPNVKSGVLVPVVTPGSPAHLAGFLPSDVVIKFDGKPVQSITEIIEIMGDRVGEPLKVVVQRANDQLVT  431 (443)
Q Consensus       359 ~~~-------~~~~~~~g~~V~~v~~~spA~~aGl~~GDiI~~vng~~V~s~~dl~~~l~~~~g~~v~l~v~R~~g~~~~  431 (443)
                      ..-       .-.....-++|+.|.+..+  +. |..||+|+++||+.|+...|+.+..      .+...|.| +|..++
T Consensus       755 p~e~imk~e~es~~~~ql~~ishv~~~~~--ki-l~~gdiilsvngk~itr~~dl~d~~------eid~~ilr-dg~~~~  824 (955)
T KOG1421|consen  755 PSEFIMKSEEESTIPRQLYVISHVRPLLH--KI-LGVGDIILSVNGKMITRLSDLHDFE------EIDAVILR-DGIEME  824 (955)
T ss_pred             CHHHHhhhhhcCCCcceEEEEEeeccCcc--cc-cccccEEEEecCeEEeeehhhhhhh------hhheeeee-cCcEEE
Confidence            210       0012334577888877544  34 9999999999999999999998733      47889999 899988


Q ss_pred             EEEEecCC
Q 013444          432 LTVIPEEA  439 (443)
Q Consensus       432 l~v~~~~~  439 (443)
                      +++...+.
T Consensus       825 ikipt~p~  832 (955)
T KOG1421|consen  825 IKIPTYPE  832 (955)
T ss_pred             EEeccccc
Confidence            88876543


No 11 
>PF13180 PDZ_2:  PDZ domain; PDB: 2L97_A 1Y8T_A 2Z9I_A 1LCY_A 2PZD_B 2P3W_A 1VCW_C 1TE0_B 1SOZ_C 1SOT_C ....
Probab=99.55  E-value=4e-14  Score=112.62  Aligned_cols=81  Identities=35%  Similarity=0.655  Sum_probs=69.8

Q ss_pred             eecCceeecccHHHHHHhhcCCCCCCCCCCceEEeEECCCCccccCCCCCCCEEEEECCEeeCCHHHHHHHHhc-CCCCe
Q 013444          339 PWLGLKMLDLNDMIIAQLKERDPSFPNVKSGVLVPVVTPGSPAHLAGFLPSDVVIKFDGKPVQSITEIIEIMGD-RVGEP  417 (443)
Q Consensus       339 p~lGi~~~~~~~~~~~~l~~~~~~~~~~~~g~~V~~v~~~spA~~aGl~~GDiI~~vng~~V~s~~dl~~~l~~-~~g~~  417 (443)
                      ||||+.+...+.                ..|++|.+|.++|||+++||++||+|++|||++|+++.++..++.. ..|++
T Consensus         1 ~~lGv~~~~~~~----------------~~g~~V~~V~~~spA~~aGl~~GD~I~~ing~~v~~~~~~~~~l~~~~~g~~   64 (82)
T PF13180_consen    1 GGLGVTVQNLSD----------------TGGVVVVSVIPGSPAAKAGLQPGDIILAINGKPVNSSEDLVNILSKGKPGDT   64 (82)
T ss_dssp             -E-SEEEEECSC----------------SSSEEEEEESTTSHHHHTTS-TTEEEEEETTEESSSHHHHHHHHHCSSTTSE
T ss_pred             CEECeEEEEccC----------------CCeEEEEEeCCCCcHHHCCCCCCcEEEEECCEEcCCHHHHHHHHHhCCCCCE
Confidence            689999876532                4699999999999999999999999999999999999999998854 88999


Q ss_pred             EEEEEEECCCeEEEEEEEe
Q 013444          418 LKVVVQRANDQLVTLTVIP  436 (443)
Q Consensus       418 v~l~v~R~~g~~~~l~v~~  436 (443)
                      ++|+|+| +++.+++++++
T Consensus        65 v~l~v~R-~g~~~~~~v~l   82 (82)
T PF13180_consen   65 VTLTVLR-DGEELTVEVTL   82 (82)
T ss_dssp             EEEEEEE-TTEEEEEEEE-
T ss_pred             EEEEEEE-CCEEEEEEEEC
Confidence            9999999 89998888864


No 12 
>PF00089 Trypsin:  Trypsin;  InterPro: IPR001254 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold. Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity.
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base. The geometric orientations of the catalytic residues are similar between families, despite different protein folds. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC). This group of serine proteases belongs to the MEROPS peptidase family S1 (chymotrypsin family, clan PA(S)) and to peptidase family S6 (Hap serine peptidases). The chymotrypsin family is almost totally confined to animals, although trypsin-like enzymes are found in actinomycetes of the genera Streptomyces and Saccharopolyspora, and in the fungus Fusarium oxysporum. The enzymes are inherently secreted, being synthesised with a signal peptide that targets them to the secretory pathway. Animal enzymes are either secreted directly, packaged into vesicles for regulated secretion, or are retained in leukocyte granules. The Hap family, 'Haemophilus adhesion and penetration', are proteins that play a role in the interaction with human epithelial cells. The serine protease activity is localized at the N-terminal domain, whereas the binding domain is in the C-terminal region.
; GO: 0004252 serine-type endopeptidase activity, 0006508 proteolysis; PDB: 1SPJ_A 1A5I_A 2ZGH_A 2ZKS_A 2ZGJ_A 2ZGC_A 2ODP_A 2I6Q_A 2I6S_A 2ODQ_A ....
Probab=99.46  E-value=3.4e-12  Score=118.81  Aligned_cols=177  Identities=21%  Similarity=0.291  Sum_probs=116.9

Q ss_pred             CCceEEEEecccccccccCCcEEEEEEEeCCCEEEeccccccCCCCCCCCCCceEEEEeC-------CC--cEEEEEEEe
Q 013444          135 CPAVVNLSAPREFLGILSGRGIGSGAIVDADGTILTCAHVVVDFHGSRALPKGKVDVTLQ-------DG--RTFEGTVLN  205 (443)
Q Consensus       135 ~pSVV~I~~~~~~~~~~~~~~~GSGfiI~~~G~ILTaaHvv~~~~~~~~~~~~~i~V~~~-------dg--~~~~a~vv~  205 (443)
                      .|.+|.|.....       ...|+|++|+++ +|||++||+...        ..+.+.+.       ++  ..+...-+.
T Consensus        12 ~p~~v~i~~~~~-------~~~C~G~li~~~-~vLTaahC~~~~--------~~~~v~~g~~~~~~~~~~~~~~~v~~~~   75 (220)
T PF00089_consen   12 FPWVVSIRYSNG-------RFFCTGTLISPR-WVLTAAHCVDGA--------SDIKVRLGTYSIRNSDGSEQTIKVSKII   75 (220)
T ss_dssp             STTEEEEEETTT-------EEEEEEEEEETT-EEEEEGGGHTSG--------GSEEEEESESBTTSTTTTSEEEEEEEEE
T ss_pred             CCeEEEEeeCCC-------CeeEeEEecccc-cccccccccccc--------cccccccccccccccccccccccccccc
Confidence            478888876553       367999999988 999999999871        34444332       22  345544443


Q ss_pred             ecC-------CCCEEEEEEcCC----CCCCccccCC-CCCCCCCCEEEEEecCCCCC----CceEEEEEEeeecCccCCC
Q 013444          206 ADF-------HSDIAIVKINSK----TPLPAAKLGT-SSKLCPGDWVVAMGCPHSLQ----NTVTAGIVSCVDRKSSDLG  269 (443)
Q Consensus       206 ~d~-------~~DlAlLkl~~~----~~~~~~~l~~-s~~~~~G~~V~~iG~p~~~~----~~~t~G~Vs~~~~~~~~~~  269 (443)
                      .++       .+|||||+++.+    ..+.++.+.. ...++.|+.+.++||+....    ..+....+.......+...
T Consensus        76 ~h~~~~~~~~~~DiAll~L~~~~~~~~~~~~~~l~~~~~~~~~~~~~~~~G~~~~~~~~~~~~~~~~~~~~~~~~~c~~~  155 (220)
T PF00089_consen   76 IHPKYDPSTYDNDIALLKLDRPITFGDNIQPICLPSAGSDPNVGTSCIVVGWGRTSDNGYSSNLQSVTVPVVSRKTCRSS  155 (220)
T ss_dssp             EETTSBTTTTTTSEEEEEESSSSEHBSSBEESBBTSTTHTTTTTSEEEEEESSBSSTTSBTSBEEEEEEEEEEHHHHHHH
T ss_pred             cccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccc
Confidence            432       469999999976    3455667755 23457899999999998633    2455555544444332211


Q ss_pred             CCC-ccceEEEEcc----cCCCCCccceeeecCCCEEEEEEEEeecC-C-CeEEEEeHHHHHHHH
Q 013444          270 LGG-MRREYLQTDC----AINAGNSGGPLVNIDGEIVGINIMKVAAA-D-GLSFAVPIDSAAKII  327 (443)
Q Consensus       270 ~~~-~~~~~i~~d~----~i~~G~SGGPlvd~~G~VVGI~s~~~~~~-~-g~~~aIPi~~i~~~l  327 (443)
                      +.. .....++...    ..|.|+|||||++.++.|+||++.+..-. . ...++.+++...++|
T Consensus       156 ~~~~~~~~~~c~~~~~~~~~~~g~sG~pl~~~~~~lvGI~s~~~~c~~~~~~~v~~~v~~~~~WI  220 (220)
T PF00089_consen  156 YNDNLTPNMICAGSSGSGDACQGDSGGPLICNNNYLVGIVSFGENCGSPNYPGVYTRVSSYLDWI  220 (220)
T ss_dssp             TTTTSTTTEEEEETTSSSBGGTTTTTSEEEETTEEEEEEEEEESSSSBTTSEEEEEEGGGGHHHH
T ss_pred             cccccccccccccccccccccccccccccccceeeecceeeecCCCCCCCcCEEEEEHHHhhccC
Confidence            111 2234566555    78999999999987777999999873322 2 247788888777664


No 13 
>cd00190 Tryp_SPc Trypsin-like serine protease; Many of these are synthesized as inactive precursor zymogens that are cleaved during limited proteolysis to generate their active forms. Alignment contains also inactive enzymes that have substitutions of the catalytic triad residues.
Probab=99.40  E-value=2.3e-11  Score=114.04  Aligned_cols=181  Identities=23%  Similarity=0.267  Sum_probs=108.9

Q ss_pred             hCCceEEEEecccccccccCCcEEEEEEEeCCCEEEeccccccCCCCCCCCCCceEEEEeCC---------CcEEEEEEE
Q 013444          134 VCPAVVNLSAPREFLGILSGRGIGSGAIVDADGTILTCAHVVVDFHGSRALPKGKVDVTLQD---------GRTFEGTVL  204 (443)
Q Consensus       134 ~~pSVV~I~~~~~~~~~~~~~~~GSGfiI~~~G~ILTaaHvv~~~~~~~~~~~~~i~V~~~d---------g~~~~a~vv  204 (443)
                      ..|.+|.|....       ....|+|++|+++ +|||+|||+.+..      ...+.|.+..         ...+..+-+
T Consensus        11 ~~Pw~v~i~~~~-------~~~~C~GtlIs~~-~VLTaAhC~~~~~------~~~~~v~~g~~~~~~~~~~~~~~~v~~~   76 (232)
T cd00190          11 SFPWQVSLQYTG-------GRHFCGGSLISPR-WVLTAAHCVYSSA------PSNYTVRLGSHDLSSNEGGGQVIKVKKV   76 (232)
T ss_pred             CCCCEEEEEccC-------CcEEEEEEEeeCC-EEEECHHhcCCCC------CccEEEEeCcccccCCCCceEEEEEEEE
Confidence            357888887543       2367999999987 9999999998632      1344454432         223344444


Q ss_pred             eecC-------CCCEEEEEEcCCC----CCCccccCCCC-CCCCCCEEEEEecCCCCCC-----ceEEEEEEeeecCccC
Q 013444          205 NADF-------HSDIAIVKINSKT----PLPAAKLGTSS-KLCPGDWVVAMGCPHSLQN-----TVTAGIVSCVDRKSSD  267 (443)
Q Consensus       205 ~~d~-------~~DlAlLkl~~~~----~~~~~~l~~s~-~~~~G~~V~~iG~p~~~~~-----~~t~G~Vs~~~~~~~~  267 (443)
                      ..++       .+|||||+|+.+.    .+.++.|.... .+..|+.+.+.||+.....     ......+..+....+.
T Consensus        77 ~~hp~y~~~~~~~DiAll~L~~~~~~~~~v~picl~~~~~~~~~~~~~~~~G~g~~~~~~~~~~~~~~~~~~~~~~~~C~  156 (232)
T cd00190          77 IVHPNYNPSTYDNDIALLKLKRPVTLSDNVRPICLPSSGYNLPAGTTCTVSGWGRTSEGGPLPDVLQEVNVPIVSNAECK  156 (232)
T ss_pred             EECCCCCCCCCcCCEEEEEECCcccCCCcccceECCCccccCCCCCEEEEEeCCcCCCCCCCCceeeEEEeeeECHHHhh
Confidence            4443       5799999998652    25677775543 5778999999999765332     2233333322222221


Q ss_pred             CCCC---CccceEEEE-----cccCCCCCccceeeecC---CCEEEEEEEEeecC--CCeEEEEeHHHHHHHHH
Q 013444          268 LGLG---GMRREYLQT-----DCAINAGNSGGPLVNID---GEIVGINIMKVAAA--DGLSFAVPIDSAAKIIE  328 (443)
Q Consensus       268 ~~~~---~~~~~~i~~-----d~~i~~G~SGGPlvd~~---G~VVGI~s~~~~~~--~g~~~aIPi~~i~~~l~  328 (443)
                      ....   ......+..     ....|.|+|||||+...   +.++||.+++..-.  .....+..+....++|+
T Consensus       157 ~~~~~~~~~~~~~~C~~~~~~~~~~c~gdsGgpl~~~~~~~~~lvGI~s~g~~c~~~~~~~~~t~v~~~~~WI~  230 (232)
T cd00190         157 RAYSYGGTITDNMLCAGGLEGGKDACQGDSGGPLVCNDNGRGVLVGIVSWGSGCARPNYPGVYTRVSSYLDWIQ  230 (232)
T ss_pred             hhccCcccCCCceEeeCCCCCCCccccCCCCCcEEEEeCCEEEEEEEEehhhccCCCCCCCEEEEcHHhhHHhh
Confidence            1111   111222322     34578999999999664   78999999865311  12233455565666654


No 14 
>cd00987 PDZ_serine_protease PDZ domain of trypsin-like serine proteases, such as DegP/HtrA, which are oligomeric proteins involved in heat-shock response, chaperone function, and apoptosis. May be responsible for substrate recognition and/or binding, as most PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of protease-associated PDZ domains a C-terminal beta-strand forms the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins.
Probab=99.37  E-value=5.5e-12  Score=101.53  Aligned_cols=88  Identities=36%  Similarity=0.692  Sum_probs=74.4

Q ss_pred             eecCceeecccHHHHHHhhcCCCCCCCCCCceEEeEECCCCccccCCCCCCCEEEEECCEeeCCHHHHHHHHhc-CCCCe
Q 013444          339 PWLGLKMLDLNDMIIAQLKERDPSFPNVKSGVLVPVVTPGSPAHLAGFLPSDVVIKFDGKPVQSITEIIEIMGD-RVGEP  417 (443)
Q Consensus       339 p~lGi~~~~~~~~~~~~l~~~~~~~~~~~~g~~V~~v~~~spA~~aGl~~GDiI~~vng~~V~s~~dl~~~l~~-~~g~~  417 (443)
                      ||+|+.++++++.....+..      ....|++|.+|.++|||+++||++||+|++|||+++.++.++..++.. ..++.
T Consensus         1 ~~~G~~~~~~~~~~~~~~~~------~~~~g~~V~~v~~~s~a~~~gl~~GD~I~~Ing~~i~~~~~~~~~l~~~~~~~~   74 (90)
T cd00987           1 PWLGVTVQDLTPDLAEELGL------KDTKGVLVASVDPGSPAAKAGLKPGDVILAVNGKPVKSVADLRRALAELKPGDK   74 (90)
T ss_pred             CccceEEeECCHHHHHHcCC------CCCCEEEEEEECCCCHHHHcCCCcCCEEEEECCEECCCHHHHHHHHHhcCCCCE
Confidence            68999999999876655332      335699999999999999999999999999999999999999988876 45889


Q ss_pred             EEEEEEECCCeEEEEE
Q 013444          418 LKVVVQRANDQLVTLT  433 (443)
Q Consensus       418 v~l~v~R~~g~~~~l~  433 (443)
                      +.+++.| +|+..++.
T Consensus        75 i~l~v~r-~g~~~~~~   89 (90)
T cd00987          75 VTLTVLR-GGKELTVT   89 (90)
T ss_pred             EEEEEEE-CCEEEEee
Confidence            9999999 77765543


No 15 
>cd00991 PDZ_archaeal_metalloprotease PDZ domain of archaeal zinc metalloproteases, presumably membrane-associated or integral membrane proteases, which may be involved in signalling and regulatory mechanisms. May be responsible for substrate recognition and/or binding, as most PDZ domains bind C-terminal polypeptides, and binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of protease-associated PDZ domains a C-terminal beta-strand forms the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins.
Probab=99.27  E-value=4.7e-11  Score=94.26  Aligned_cols=68  Identities=26%  Similarity=0.425  Sum_probs=62.1

Q ss_pred             CCceEEeEECCCCccccCCCCCCCEEEEECCEeeCCHHHHHHHHhc-CCCCeEEEEEEECCCeEEEEEEE
Q 013444          367 KSGVLVPVVTPGSPAHLAGFLPSDVVIKFDGKPVQSITEIIEIMGD-RVGEPLKVVVQRANDQLVTLTVI  435 (443)
Q Consensus       367 ~~g~~V~~v~~~spA~~aGl~~GDiI~~vng~~V~s~~dl~~~l~~-~~g~~v~l~v~R~~g~~~~l~v~  435 (443)
                      ..|++|.+|.++|||+++||++||+|++|||+++.+|+++..++.. ..|+++.+++.| +++..+++++
T Consensus         9 ~~Gv~V~~V~~~spa~~aGL~~GDiI~~Ing~~v~~~~d~~~~l~~~~~g~~v~l~v~r-~g~~~~~~~~   77 (79)
T cd00991           9 VAGVVIVGVIVGSPAENAVLHTGDVIYSINGTPITTLEDFMEALKPTKPGEVITVTVLP-STTKLTNVST   77 (79)
T ss_pred             CCcEEEEEECCCChHHhcCCCCCCEEEEECCEEcCCHHHHHHHHhcCCCCCEEEEEEEE-CCEEEEEEEE
Confidence            4699999999999999999999999999999999999999999887 468899999999 8888777765


No 16 
>cd00990 PDZ_glycyl_aminopeptidase PDZ domain associated with archaeal and bacterial M61 glycyl-aminopeptidases. May be responsible for substrate recognition and/or binding, as most PDZ domains bind C-terminal polypeptides, and binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of protease-associated PDZ domains a C-terminal beta-strand is presumed to form the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins.
Probab=99.24  E-value=7.3e-11  Score=93.03  Aligned_cols=68  Identities=24%  Similarity=0.434  Sum_probs=57.9

Q ss_pred             CCceEEeEECCCCccccCCCCCCCEEEEECCEeeCCHHHHHHHHhcCCCCeEEEEEEECCCeEEEEEEEec
Q 013444          367 KSGVLVPVVTPGSPAHLAGFLPSDVVIKFDGKPVQSITEIIEIMGDRVGEPLKVVVQRANDQLVTLTVIPE  437 (443)
Q Consensus       367 ~~g~~V~~v~~~spA~~aGl~~GDiI~~vng~~V~s~~dl~~~l~~~~g~~v~l~v~R~~g~~~~l~v~~~  437 (443)
                      ..+++|.+|.++|||+++||++||+|++|||+++.+|.++...+  ..++.+.+++.| +++..++.+++.
T Consensus        11 ~~~~~V~~V~~~s~a~~aGl~~GD~I~~Ing~~v~~~~~~l~~~--~~~~~v~l~v~r-~g~~~~~~v~~~   78 (80)
T cd00990          11 EGLGKVTFVRDDSPADKAGLVAGDELVAVNGWRVDALQDRLKEY--QAGDPVELTVFR-DDRLIEVPLTLA   78 (80)
T ss_pred             CCcEEEEEECCCChHHHhCCCCCCEEEEECCEEhHHHHHHHHhc--CCCCEEEEEEEE-CCEEEEEEEEec
Confidence            35799999999999999999999999999999999876654433  467789999999 788888888765


No 17 
>smart00020 Tryp_SPc Trypsin-like serine protease. Many of these are synthesised as inactive precursor zymogens that are cleaved during limited proteolysis to generate their active forms. A few, however, are active as single chain molecules, and others are inactive due to substitutions of the catalytic triad residues.
Probab=99.24  E-value=2.6e-10  Score=107.16  Aligned_cols=161  Identities=24%  Similarity=0.312  Sum_probs=98.7

Q ss_pred             hCCceEEEEecccccccccCCcEEEEEEEeCCCEEEeccccccCCCCCCCCCCceEEEEeCCC--------cEEEEEEEe
Q 013444          134 VCPAVVNLSAPREFLGILSGRGIGSGAIVDADGTILTCAHVVVDFHGSRALPKGKVDVTLQDG--------RTFEGTVLN  205 (443)
Q Consensus       134 ~~pSVV~I~~~~~~~~~~~~~~~GSGfiI~~~G~ILTaaHvv~~~~~~~~~~~~~i~V~~~dg--------~~~~a~vv~  205 (443)
                      ..|.+|.|....       ....|+|++|+++ +|||+|||+.+..      ...+.|.+...        ..+.+.-+.
T Consensus        12 ~~Pw~~~i~~~~-------~~~~C~GtlIs~~-~VLTaahC~~~~~------~~~~~v~~g~~~~~~~~~~~~~~v~~~~   77 (229)
T smart00020       12 SFPWQVSLQYRG-------GRHFCGGSLISPR-WVLTAAHCVYGSD------PSNIRVRLGSHDLSSGEEGQVIKVSKVI   77 (229)
T ss_pred             CCCcEEEEEEcC-------CCcEEEEEEecCC-EEEECHHHcCCCC------CcceEEEeCcccCCCCCCceEEeeEEEE
Confidence            457788886432       2367999999987 9999999998642      13455655432        334444444


Q ss_pred             ec-------CCCCEEEEEEcCC----CCCCccccCCC-CCCCCCCEEEEEecCCCCC------CceEEEEEEeeecCccC
Q 013444          206 AD-------FHSDIAIVKINSK----TPLPAAKLGTS-SKLCPGDWVVAMGCPHSLQ------NTVTAGIVSCVDRKSSD  267 (443)
Q Consensus       206 ~d-------~~~DlAlLkl~~~----~~~~~~~l~~s-~~~~~G~~V~~iG~p~~~~------~~~t~G~Vs~~~~~~~~  267 (443)
                      .+       ..+|||||+|+.+    ..+.++.|... ..+..++.+.+.||+....      .......+.......+.
T Consensus        78 ~~p~~~~~~~~~DiAll~L~~~i~~~~~~~pi~l~~~~~~~~~~~~~~~~g~g~~~~~~~~~~~~~~~~~~~~~~~~~C~  157 (229)
T smart00020       78 IHPNYNPSTYDNDIALLKLKSPVTLSDNVRPICLPSSNYNVPAGTTCTVSGWGRTSEGAGSLPDTLQEVNVPIVSNATCR  157 (229)
T ss_pred             ECCCCCCCCCcCCEEEEEECcccCCCCceeeccCCCcccccCCCCEEEEEeCCCCCCCCCcCCCEeeEEEEEEeCHHHhh
Confidence            33       3579999999875    23456666543 3567789999999987542      12223333322222221


Q ss_pred             CCCC---CccceEEEE-----cccCCCCCccceeeecCC--CEEEEEEEEe
Q 013444          268 LGLG---GMRREYLQT-----DCAINAGNSGGPLVNIDG--EIVGINIMKV  308 (443)
Q Consensus       268 ~~~~---~~~~~~i~~-----d~~i~~G~SGGPlvd~~G--~VVGI~s~~~  308 (443)
                      ....   ......+..     ....|+|+||||++...+  .++||++++.
T Consensus       158 ~~~~~~~~~~~~~~C~~~~~~~~~~c~gdsG~pl~~~~~~~~l~Gi~s~g~  208 (229)
T smart00020      158 RAYSGGGAITDNMLCAGGLEGGKDACQGDSGGPLVCNDGRWVLVGIVSWGS  208 (229)
T ss_pred             hhhccccccCCCcEeecCCCCCCcccCCCCCCeeEEECCCEEEEEEEEECC
Confidence            1110   011112222     355789999999996543  8999999865


No 18 
>TIGR01713 typeII_sec_gspC general secretion pathway protein C. This model represents GspC, protein C of the main terminal branch of the general secretion pathway, also called type II secretion. This system transports folded proteins across the bacterial outer membrane and is widely distributed in Gram-negative pathogens.
Probab=99.23  E-value=5.9e-11  Score=114.44  Aligned_cols=99  Identities=17%  Similarity=0.209  Sum_probs=86.6

Q ss_pred             HHHHHHHHHHHHcCceeeeecCceeecccHHHHHHhhcCCCCCCCCCCceEEeEECCCCccccCCCCCCCEEEEECCEee
Q 013444          321 DSAAKIIEQFKKNGRVVRPWLGLKMLDLNDMIIAQLKERDPSFPNVKSGVLVPVVTPGSPAHLAGFLPSDVVIKFDGKPV  400 (443)
Q Consensus       321 ~~i~~~l~~l~~~g~v~rp~lGi~~~~~~~~~~~~l~~~~~~~~~~~~g~~V~~v~~~spA~~aGl~~GDiI~~vng~~V  400 (443)
                      ..++++++++.+++++.+.|+|+.....+               +...|+.|..+.++++|+++||++||+|++|||+++
T Consensus       159 ~~~~~v~~~l~~~g~~~~~~lgi~p~~~~---------------g~~~G~~v~~v~~~s~a~~aGLr~GDvIv~ING~~i  223 (259)
T TIGR01713       159 VVSRRIIEELTKDPQKMFDYIRLSPVMKN---------------DKLEGYRLNPGKDPSLFYKSGLQDGDIAVALNGLDL  223 (259)
T ss_pred             hhHHHHHHHHHHCHHhhhheEeEEEEEeC---------------CceeEEEEEecCCCCHHHHcCCCCCCEEEEECCEEc
Confidence            45788899999999999999999874322               224699999999999999999999999999999999


Q ss_pred             CCHHHHHHHHhc-CCCCeEEEEEEECCCeEEEEEEE
Q 013444          401 QSITEIIEIMGD-RVGEPLKVVVQRANDQLVTLTVI  435 (443)
Q Consensus       401 ~s~~dl~~~l~~-~~g~~v~l~v~R~~g~~~~l~v~  435 (443)
                      .+++++.+++.+ ..++.++|+|+| +|+.+++.+.
T Consensus       224 ~~~~~~~~~l~~~~~~~~v~l~V~R-~G~~~~i~v~  258 (259)
T TIGR01713       224 RDPEQAFQALQMLREETNLTLTVER-DGQREDIYVR  258 (259)
T ss_pred             CCHHHHHHHHHhcCCCCeEEEEEEE-CCEEEEEEEE
Confidence            999999998887 677899999999 8888887765


No 19 
>KOG1320 consensus Serine protease [Posttranslational modification, protein turnover, chaperones]
Probab=99.22  E-value=5e-11  Score=121.70  Aligned_cols=273  Identities=19%  Similarity=0.196  Sum_probs=181.1

Q ss_pred             HHhCCceEEEEeccc-------ccccccCCcEEEEEEEeCCCEEEeccccccCCCCCCCCCCceEEEE-eCCCcEEEEEE
Q 013444          132 ARVCPAVVNLSAPRE-------FLGILSGRGIGSGAIVDADGTILTCAHVVVDFHGSRALPKGKVDVT-LQDGRTFEGTV  203 (443)
Q Consensus       132 ~~~~pSVV~I~~~~~-------~~~~~~~~~~GSGfiI~~~G~ILTaaHvv~~~~~~~~~~~~~i~V~-~~dg~~~~a~v  203 (443)
                      +....+++.+.....       +.........|+||.+... .++|++|++.....     ...+.+. ...-+.|.+++
T Consensus        57 ~~~~~s~~~v~~~~~~~~~~~pw~~~~q~~~~~s~f~i~~~-~lltn~~~v~~~~~-----~~~v~v~~~gs~~k~~~~v  130 (473)
T KOG1320|consen   57 DLALQSVVKVFSVSTEPSSVLPWQRTRQFSSGGSGFAIYGK-KLLTNAHVVAPNND-----HKFVTVKKHGSPRKYKAFV  130 (473)
T ss_pred             cccccceeEEEeecccccccCcceeeehhcccccchhhccc-ceeecCcccccccc-----ccccccccCCCchhhhhhH
Confidence            455566777765322       1111134467999999865 99999999984321     1223332 23347778888


Q ss_pred             EeecCCCCEEEEEEcCC---CCCCccccCCCCCCCCCCEEEEEecCCCCCCceEEEEEEeeecCccCCCCCCccceEEEE
Q 013444          204 LNADFHSDIAIVKINSK---TPLPAAKLGTSSKLCPGDWVVAMGCPHSLQNTVTAGIVSCVDRKSSDLGLGGMRREYLQT  280 (443)
Q Consensus       204 v~~d~~~DlAlLkl~~~---~~~~~~~l~~s~~~~~G~~V~~iG~p~~~~~~~t~G~Vs~~~~~~~~~~~~~~~~~~i~~  280 (443)
                      ...-.+.|+|++.++..   ....++.+++  -+...+.++++|   +....+|.|.|........  .........+++
T Consensus       131 ~~~~~~cd~Avv~Ie~~~f~~~~~~~e~~~--ip~l~~S~~Vv~---gd~i~VTnghV~~~~~~~y--~~~~~~l~~vqi  203 (473)
T KOG1320|consen  131 AAVFEECDLAVVYIESEEFWKGMNPFELGD--IPSLNGSGFVVG---GDGIIVTNGHVVRVEPRIY--AHSSTVLLRVQI  203 (473)
T ss_pred             HHhhhcccceEEEEeeccccCCCcccccCC--CcccCccEEEEc---CCcEEEEeeEEEEEEeccc--cCCCcceeeEEE
Confidence            88888999999999853   2223344433  345567899998   5667899999998876532  222334557899


Q ss_pred             cccCCCCCccceeeecCCCEEEEEEEEeecCCCeEEEEeHHHHHHHHHHHHHcCce-eeeecCceeeccc-HHHHHHhhc
Q 013444          281 DCAINAGNSGGPLVNIDGEIVGINIMKVAAADGLSFAVPIDSAAKIIEQFKKNGRV-VRPWLGLKMLDLN-DMIIAQLKE  358 (443)
Q Consensus       281 d~~i~~G~SGGPlvd~~G~VVGI~s~~~~~~~g~~~aIPi~~i~~~l~~l~~~g~v-~rp~lGi~~~~~~-~~~~~~l~~  358 (443)
                      ++...+|+||+|.+...+++.|+++...+....+.+.+|.-.+.++.......+.. .+++++...+.+- ...++.++ 
T Consensus       204 ~aa~~~~~s~ep~i~g~d~~~gvA~l~ik~~~~i~~~i~~~~~~~~~~G~~~~a~~~~f~~~nt~t~g~vs~~~R~~~~-  282 (473)
T KOG1320|consen  204 DAAIGPGNSGEPVIVGVDKVAGVAFLKIKTPENILYVIPLGVSSHFRTGVEVSAIGNGFGLLNTLTQGMVSGQLRKSFK-  282 (473)
T ss_pred             EEeecCCccCCCeEEccccccceEEEEEecCCcccceeecceeeeecccceeeccccCceeeeeeeecccccccccccc-
Confidence            99999999999999877999999999886544678889998888877665544322 3444544444432 22222211 


Q ss_pred             CCCCCCCCCCceEEeEECCCCccccCCCCCCCEEEEECCEeeC-CHHH-----HHHHHhc-CCCCeEEEEEEE
Q 013444          359 RDPSFPNVKSGVLVPVVTPGSPAHLAGFLPSDVVIKFDGKPVQ-SITE-----IIEIMGD-RVGEPLKVVVQR  424 (443)
Q Consensus       359 ~~~~~~~~~~g~~V~~v~~~spA~~aGl~~GDiI~~vng~~V~-s~~d-----l~~~l~~-~~g~~v~l~v~R  424 (443)
                           ++...|+.+.++.+-+.|.+. ++.||.|+.+||..|. ++..     +...+.. .+++++.+.+.|
T Consensus       283 -----lg~~~g~~i~~~~qtd~ai~~-~nsg~~ll~~DG~~IgVn~~~~~ri~~~~~iSf~~p~d~vl~~v~r  349 (473)
T KOG1320|consen  283 -----LGLETGVLISKINQTDAAINP-GNSGGPLLNLDGEVIGVNTRKVTRIGFSHGISFKIPIDTVLVIVLR  349 (473)
T ss_pred             -----cCcccceeeeeecccchhhhc-ccCCCcEEEecCcEeeeeeeeeEEeeccccceeccCchHhhhhhhh
Confidence                 122378999999999999888 9999999999999883 1111     1122222 455666666666


No 20 
>cd00989 PDZ_metalloprotease PDZ domain of bacterial and plant zinc metalloproteases, presumably membrane-associated or integral membrane proteases, which may be involved in signalling and regulatory mechanisms. May be responsible for substrate recognition and/or binding, as most PDZ domains bind C-terminal polypeptides, and binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of protease-associated PDZ domains a C-terminal beta-strand forms the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins.
Probab=99.21  E-value=1.1e-10  Score=91.67  Aligned_cols=67  Identities=30%  Similarity=0.589  Sum_probs=59.8

Q ss_pred             CceEEeEECCCCccccCCCCCCCEEEEECCEeeCCHHHHHHHHhcCCCCeEEEEEEECCCeEEEEEEE
Q 013444          368 SGVLVPVVTPGSPAHLAGFLPSDVVIKFDGKPVQSITEIIEIMGDRVGEPLKVVVQRANDQLVTLTVI  435 (443)
Q Consensus       368 ~g~~V~~v~~~spA~~aGl~~GDiI~~vng~~V~s~~dl~~~l~~~~g~~v~l~v~R~~g~~~~l~v~  435 (443)
                      ..++|.+|.++|||+++||++||+|++|||+++.+|+++..++....++.+.+++.| +++..++.++
T Consensus        12 ~~~~V~~v~~~s~a~~~gl~~GD~I~~ing~~i~~~~~~~~~l~~~~~~~~~l~v~r-~~~~~~~~l~   78 (79)
T cd00989          12 IEPVIGEVVPGSPAAKAGLKAGDRILAINGQKIKSWEDLVDAVQENPGKPLTLTVER-NGETITLTLT   78 (79)
T ss_pred             cCcEEEeECCCCHHHHcCCCCCCEEEEECCEECCCHHHHHHHHHHCCCceEEEEEEE-CCEEEEEEec
Confidence            458899999999999999999999999999999999999998877667889999999 7777777664


No 21 
>cd00986 PDZ_LON_protease PDZ domain of ATP-dependent LON serine proteases. Most PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this bacterial subfamily of protease-associated PDZ domains a C-terminal beta-strand  is thought to form the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins.
Probab=99.19  E-value=2e-10  Score=90.45  Aligned_cols=70  Identities=29%  Similarity=0.478  Sum_probs=63.5

Q ss_pred             CceEEeEECCCCccccCCCCCCCEEEEECCEeeCCHHHHHHHHhc-CCCCeEEEEEEECCCeEEEEEEEecCC
Q 013444          368 SGVLVPVVTPGSPAHLAGFLPSDVVIKFDGKPVQSITEIIEIMGD-RVGEPLKVVVQRANDQLVTLTVIPEEA  439 (443)
Q Consensus       368 ~g~~V~~v~~~spA~~aGl~~GDiI~~vng~~V~s~~dl~~~l~~-~~g~~v~l~v~R~~g~~~~l~v~~~~~  439 (443)
                      .|++|.+|.++|||++ ||++||+|++|||+++.+|+++..++.. ..++.+.+++.| +|+..++++++.+.
T Consensus         8 ~Gv~V~~V~~~s~A~~-gL~~GD~I~~Ing~~v~~~~~~~~~l~~~~~~~~v~l~v~r-~g~~~~~~v~l~~~   78 (79)
T cd00986           8 HGVYVTSVVEGMPAAG-KLKAGDHIIAVDGKPFKEAEELIDYIQSKKEGDTVKLKVKR-EEKELPEDLILKTF   78 (79)
T ss_pred             cCEEEEEECCCCchhh-CCCCCCEEEEECCEECCCHHHHHHHHHhCCCCCEEEEEEEE-CCEEEEEEEEEecc
Confidence            5899999999999997 7999999999999999999999998875 678899999999 88888888888753


No 22 
>cd00988 PDZ_CTP_protease PDZ domain of C-terminal processing-, tail-specific-, and tricorn proteases, which function in posttranslational protein processing, maturation, and disassembly or degradation, in Bacteria, Archaea, and plant chloroplasts. May be responsible for substrate recognition and/or binding, as most PDZ domains bind C-terminal polypeptides, and binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of protease-associated PDZ domains a C-terminal beta-strand forms the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins.
Probab=99.12  E-value=4.5e-10  Score=89.51  Aligned_cols=71  Identities=24%  Similarity=0.608  Sum_probs=62.8

Q ss_pred             CCceEEeEECCCCccccCCCCCCCEEEEECCEeeCCH--HHHHHHHhcCCCCeEEEEEEECCCeEEEEEEEec
Q 013444          367 KSGVLVPVVTPGSPAHLAGFLPSDVVIKFDGKPVQSI--TEIIEIMGDRVGEPLKVVVQRANDQLVTLTVIPE  437 (443)
Q Consensus       367 ~~g~~V~~v~~~spA~~aGl~~GDiI~~vng~~V~s~--~dl~~~l~~~~g~~v~l~v~R~~g~~~~l~v~~~  437 (443)
                      ..+++|..|.+++||+++||++||+|++|||+++.+|  .++..++....++.+.+++.|.+++..++++++.
T Consensus        12 ~~~~~V~~v~~~s~a~~~gl~~GD~I~~vng~~i~~~~~~~~~~~l~~~~~~~i~l~v~r~~~~~~~~~~~~~   84 (85)
T cd00988          12 DGGLVITSVLPGSPAAKAGIKAGDIIVAIDGEPVDGLSLEDVVKLLRGKAGTKVRLTLKRGDGEPREVTLTRL   84 (85)
T ss_pred             CCeEEEEEecCCCCHHHcCCCCCCEEEEECCEEcCCCCHHHHHHHhcCCCCCEEEEEEEcCCCCEEEEEEEEC
Confidence            3689999999999999999999999999999999999  9999888777788999999993277788877764


No 23 
>TIGR02037 degP_htrA_DO periplasmic serine protease, Do/DegQ family. This family consists of a set of proteins variously designated DegP, heat shock protein HtrA, and protease DO. The ortholog in Pseudomonas aeruginosa is designated MucD and is found in an operon that controls mucoid phenotype. This family also includes the DegQ (HhoA) paralog in E. coli which can rescue a DegP mutant, but not the smaller DegS paralog, which cannot. Members of this family are located in the periplasm and have separable functions as both protease and chaperone. Members have a trypsin domain and two copies of a PDZ domain. This protein protects bacteria from thermal and other stresses and may be important for the survival of bacterial pathogens. The chaperone function is dominant at low temperatures, whereas the proteolytic activity is turned on at elevated temperatures.
Probab=98.98  E-value=2.7e-09  Score=110.76  Aligned_cols=90  Identities=30%  Similarity=0.568  Sum_probs=77.4

Q ss_pred             eeecCceeecccHHHHHHhhcCCCCCCCCCCceEEeEECCCCccccCCCCCCCEEEEECCEeeCCHHHHHHHHhc-CCCC
Q 013444          338 RPWLGLKMLDLNDMIIAQLKERDPSFPNVKSGVLVPVVTPGSPAHLAGFLPSDVVIKFDGKPVQSITEIIEIMGD-RVGE  416 (443)
Q Consensus       338 rp~lGi~~~~~~~~~~~~l~~~~~~~~~~~~g~~V~~v~~~spA~~aGl~~GDiI~~vng~~V~s~~dl~~~l~~-~~g~  416 (443)
                      +.|+|+.+..++.....++++.     ....|++|.+|.++|||+++||++||+|++|||++|.+++++.+++.. +.++
T Consensus       337 ~~~lGi~~~~l~~~~~~~~~l~-----~~~~Gv~V~~V~~~SpA~~aGL~~GDvI~~Ing~~V~s~~d~~~~l~~~~~g~  411 (428)
T TIGR02037       337 NPFLGLTVANLSPEIRKELRLK-----GDVKGVVVTKVVSGSPAARAGLQPGDVILSVNQQPVSSVAELRKVLDRAKKGG  411 (428)
T ss_pred             ccccceEEecCCHHHHHHcCCC-----cCcCceEEEEeCCCCHHHHcCCCCCCEEEEECCEEcCCHHHHHHHHHhcCCCC
Confidence            4689999999998887766542     224699999999999999999999999999999999999999999976 5788


Q ss_pred             eEEEEEEECCCeEEEEE
Q 013444          417 PLKVVVQRANDQLVTLT  433 (443)
Q Consensus       417 ~v~l~v~R~~g~~~~l~  433 (443)
                      ++.++|+| +++...+.
T Consensus       412 ~v~l~v~R-~g~~~~~~  427 (428)
T TIGR02037       412 RVALLILR-GGATIFVT  427 (428)
T ss_pred             EEEEEEEE-CCEEEEEE
Confidence            99999999 77766553


No 24 
>cd00136 PDZ PDZ domain, also called DHR (Dlg homologous region) or GLGF (after a conserved sequence motif). Many PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. Heterodimerization through PDZ-PDZ domain interactions adds to the domain's versatility, and PDZ domain-mediated interactions may be modulated dynamically through target phosphorylation. Some PDZ domains play a role in scaffolding supramolecular complexes. PDZ domains are found in diverse signaling proteins in bacteria, archaebacteria, and eukaryotes. This CD contains two distinct structural subgroups with either a N- or C-terminal beta-strand forming the peptide-binding groove base. The circular permutation placing the strand on the N-terminus appears to be found in Eumetazoa only, while the C-terminal variant is found in all three kingdoms of life, and seems to co-occur with protease domains. PDZ domains have been named after PSD95(pos
Probab=98.97  E-value=2.2e-09  Score=82.19  Aligned_cols=56  Identities=36%  Similarity=0.692  Sum_probs=51.6

Q ss_pred             CceEEeEECCCCccccCCCCCCCEEEEECCEeeCCH--HHHHHHHhcCCCCeEEEEEE
Q 013444          368 SGVLVPVVTPGSPAHLAGFLPSDVVIKFDGKPVQSI--TEIIEIMGDRVGEPLKVVVQ  423 (443)
Q Consensus       368 ~g~~V~~v~~~spA~~aGl~~GDiI~~vng~~V~s~--~dl~~~l~~~~g~~v~l~v~  423 (443)
                      .+++|.+|.+++||+++||++||+|++|||+++.++  +++.+++....|++++|+++
T Consensus        13 ~~~~V~~v~~~s~a~~~gl~~GD~I~~Ing~~v~~~~~~~~~~~l~~~~g~~v~l~v~   70 (70)
T cd00136          13 GGVVVLSVEPGSPAERAGLQAGDVILAVNGTDVKNLTLEDVAELLKKEVGEKVTLTVR   70 (70)
T ss_pred             CCEEEEEeCCCCHHHHcCCCCCCEEEEECCEECCCCCHHHHHHHHhhCCCCeEEEEEC
Confidence            489999999999999999999999999999999999  99999998877888888763


No 25 
>TIGR00054 RIP metalloprotease RseP. A model that detects fragments as well matches a number of members of the PEPTIDASE FAMILY S2C. The region of match appears not to overlap the active site domain.
Probab=98.81  E-value=1.6e-08  Score=104.56  Aligned_cols=70  Identities=26%  Similarity=0.507  Sum_probs=64.8

Q ss_pred             CceEEeEECCCCccccCCCCCCCEEEEECCEeeCCHHHHHHHHhcCCCCeEEEEEEECCCeEEEEEEEecC
Q 013444          368 SGVLVPVVTPGSPAHLAGFLPSDVVIKFDGKPVQSITEIIEIMGDRVGEPLKVVVQRANDQLVTLTVIPEE  438 (443)
Q Consensus       368 ~g~~V~~v~~~spA~~aGl~~GDiI~~vng~~V~s~~dl~~~l~~~~g~~v~l~v~R~~g~~~~l~v~~~~  438 (443)
                      .+++|.+|.++|||+++||++||+|++|||++|++|+|+.+.+....++++.+++.| +|+..+++++++.
T Consensus       203 ~g~vV~~V~~~SpA~~aGL~~GD~Iv~Vng~~V~s~~dl~~~l~~~~~~~v~l~v~R-~g~~~~~~v~~~~  272 (420)
T TIGR00054       203 IEPVLSDVTPNSPAEKAGLKEGDYIQSINGEKLRSWTDFVSAVKENPGKSMDIKVER-NGETLSISLTPEA  272 (420)
T ss_pred             cCcEEEEECCCCHHHHcCCCCCCEEEEECCEECCCHHHHHHHHHhCCCCceEEEEEE-CCEEEEEEEEEcC
Confidence            489999999999999999999999999999999999999999988888899999999 8888888888753


No 26 
>COG3591 V8-like Glu-specific endopeptidase [Amino acid transport and metabolism]
Probab=98.73  E-value=1.9e-07  Score=88.69  Aligned_cols=160  Identities=18%  Similarity=0.220  Sum_probs=93.7

Q ss_pred             cEEEEEEEeCCCEEEeccccccCCCCCCCCCCceEEEEe----CCCc-E--EEEEEEeec-C---CCCEEEEEEcCC---
Q 013444          155 GIGSGAIVDADGTILTCAHVVVDFHGSRALPKGKVDVTL----QDGR-T--FEGTVLNAD-F---HSDIAIVKINSK---  220 (443)
Q Consensus       155 ~~GSGfiI~~~G~ILTaaHvv~~~~~~~~~~~~~i~V~~----~dg~-~--~~a~vv~~d-~---~~DlAlLkl~~~---  220 (443)
                      ..|++|+|+++ .+||++||+......    ...+.+..    .++. .  +........ .   ..|.+...+...   
T Consensus        64 ~~~~~~lI~pn-tvLTa~Hc~~s~~~G----~~~~~~~p~g~~~~~~~~~~~~~~~~~~~~g~~~~~d~~~~~v~~~~~~  138 (251)
T COG3591          64 LCTAATLIGPN-TVLTAGHCIYSPDYG----EDDIAAAPPGVNSDGGPFYGITKIEIRVYPGELYKEDGASYDVGEAALE  138 (251)
T ss_pred             ceeeEEEEcCc-eEEEeeeEEecCCCC----hhhhhhcCCcccCCCCCCCceeeEEEEecCCceeccCCceeeccHHHhc
Confidence            45677999998 999999999764321    11221111    1111 1  111112111 2   345555555421   


Q ss_pred             ------CCCCccccCCCCCCCCCCEEEEEecCCCCCCce----EEEEEEeeecCccCCCCCCccceEEEEcccCCCCCcc
Q 013444          221 ------TPLPAAKLGTSSKLCPGDWVVAMGCPHSLQNTV----TAGIVSCVDRKSSDLGLGGMRREYLQTDCAINAGNSG  290 (443)
Q Consensus       221 ------~~~~~~~l~~s~~~~~G~~V~~iG~p~~~~~~~----t~G~Vs~~~~~~~~~~~~~~~~~~i~~d~~i~~G~SG  290 (443)
                            .-.....+......+.++.+.++|||.+..+..    ..+.+...            ....+..+|.+++|+||
T Consensus       139 ~g~~~~~~~~~~~~~~~~~~~~~d~i~v~GYP~dk~~~~~~~e~t~~v~~~------------~~~~l~y~~dT~pG~SG  206 (251)
T COG3591         139 SGINIGDVVNYLKRNTASEAKANDRITVIGYPGDKPNIGTMWESTGKVNSI------------KGNKLFYDADTLPGSSG  206 (251)
T ss_pred             cCCCccccccccccccccccccCceeEEEeccCCCCcceeEeeecceeEEE------------ecceEEEEecccCCCCC
Confidence                  111223344456778899999999998765332    22222211            12368899999999999


Q ss_pred             ceeeecCCCEEEEEEEEeecCCC--eEE-EEeHHHHHHHHHHHH
Q 013444          291 GPLVNIDGEIVGINIMKVAAADG--LSF-AVPIDSAAKIIEQFK  331 (443)
Q Consensus       291 GPlvd~~G~VVGI~s~~~~~~~g--~~~-aIPi~~i~~~l~~l~  331 (443)
                      +|+++.+.+|||++..+....++  .++ ..-...++++|+++.
T Consensus       207 Spv~~~~~~vigv~~~g~~~~~~~~~n~~vr~t~~~~~~I~~~~  250 (251)
T COG3591         207 SPVLISKDEVIGVHYNGPGANGGSLANNAVRLTPEILNFIQQNI  250 (251)
T ss_pred             CceEecCceEEEEEecCCCcccccccCcceEecHHHHHHHHHhh
Confidence            99999999999999887653221  222 233445777766543


No 27 
>PRK10779 zinc metallopeptidase RseP; Provisional
Probab=98.72  E-value=4.9e-08  Score=101.94  Aligned_cols=69  Identities=28%  Similarity=0.549  Sum_probs=63.4

Q ss_pred             CceEEeEECCCCccccCCCCCCCEEEEECCEeeCCHHHHHHHHhcCCCCeEEEEEEECCCeEEEEEEEec
Q 013444          368 SGVLVPVVTPGSPAHLAGFLPSDVVIKFDGKPVQSITEIIEIMGDRVGEPLKVVVQRANDQLVTLTVIPE  437 (443)
Q Consensus       368 ~g~~V~~v~~~spA~~aGl~~GDiI~~vng~~V~s~~dl~~~l~~~~g~~v~l~v~R~~g~~~~l~v~~~  437 (443)
                      .+++|.+|.++|||+++||++||+|++|||++|++|+|+.+.+....++.+.+++.| +|+..++++++.
T Consensus       221 ~~~vV~~V~~~SpA~~AGL~~GDvIl~Ing~~V~s~~dl~~~l~~~~~~~v~l~v~R-~g~~~~~~v~~~  289 (449)
T PRK10779        221 IEPVLAEVQPNSAASKAGLQAGDRIVKVDGQPLTQWQTFVTLVRDNPGKPLALEIER-QGSPLSLTLTPD  289 (449)
T ss_pred             cCcEEEeeCCCCHHHHcCCCCCCEEEEECCEEcCCHHHHHHHHHhCCCCEEEEEEEE-CCEEEEEEEEee
Confidence            367899999999999999999999999999999999999999888788899999999 888888888775


No 28 
>PRK10779 zinc metallopeptidase RseP; Provisional
Probab=98.71  E-value=2.7e-08  Score=103.83  Aligned_cols=67  Identities=15%  Similarity=0.063  Sum_probs=58.6

Q ss_pred             eEEeEECCCCccccCCCCCCCEEEEECCEeeCCHHHHHHHHhc-CCCCeEEEEEEECCCeEEEEEEEec
Q 013444          370 VLVPVVTPGSPAHLAGFLPSDVVIKFDGKPVQSITEIIEIMGD-RVGEPLKVVVQRANDQLVTLTVIPE  437 (443)
Q Consensus       370 ~~V~~v~~~spA~~aGl~~GDiI~~vng~~V~s~~dl~~~l~~-~~g~~v~l~v~R~~g~~~~l~v~~~  437 (443)
                      .+|.+|.++|||++||||+||+|++|||++|++|+|+...+.. ..+++++++|.| +|+++++++++.
T Consensus       128 ~lV~~V~~~SpA~kAGLk~GDvI~~vnG~~V~~~~~l~~~v~~~~~g~~v~v~v~R-~gk~~~~~v~l~  195 (449)
T PRK10779        128 PVVGEIAPNSIAAQAQIAPGTELKAVDGIETPDWDAVRLALVSKIGDESTTITVAP-FGSDQRRDKTLD  195 (449)
T ss_pred             ccccccCCCCHHHHcCCCCCCEEEEECCEEcCCHHHHHHHHHhhccCCceEEEEEe-CCccceEEEEec
Confidence            3689999999999999999999999999999999999987765 667889999999 787766666553


No 29 
>TIGR00225 prc C-terminal peptidase (prc). A C-terminal peptidase with different substrates in different species including processing of D1 protein of the photosystem II reaction center in higher plants and cleavage of a peptide of 11 residues from the precursor form of penicillin-binding protein in E. coli. E. coli and H. influenzae have the most distal branch of the tree and their proteins have an N-terminal 200 amino acids that show no homology to other proteins in the database.
Probab=98.63  E-value=1.4e-07  Score=94.89  Aligned_cols=67  Identities=25%  Similarity=0.469  Sum_probs=56.4

Q ss_pred             CceEEeEECCCCccccCCCCCCCEEEEECCEeeCCH--HHHHHHHhcCCCCeEEEEEEECCCeEEEEEEE
Q 013444          368 SGVLVPVVTPGSPAHLAGFLPSDVVIKFDGKPVQSI--TEIIEIMGDRVGEPLKVVVQRANDQLVTLTVI  435 (443)
Q Consensus       368 ~g~~V~~v~~~spA~~aGl~~GDiI~~vng~~V~s~--~dl~~~l~~~~g~~v~l~v~R~~g~~~~l~v~  435 (443)
                      .+++|..|.++|||+++||++||+|++|||++|.+|  .++...+....|+++.+++.| +++..+++++
T Consensus        62 ~~~~V~~V~~~spA~~aGL~~GD~I~~Ing~~v~~~~~~~~~~~l~~~~g~~v~l~v~R-~g~~~~~~v~  130 (334)
T TIGR00225        62 GEIVIVSPFEGSPAEKAGIKPGDKIIKINGKSVAGMSLDDAVALIRGKKGTKVSLEILR-AGKSKPLTFT  130 (334)
T ss_pred             CEEEEEEeCCCChHHHcCCCCCCEEEEECCEECCCCCHHHHHHhccCCCCCEEEEEEEe-CCCCceEEEE
Confidence            589999999999999999999999999999999987  577777777788899999999 6554443333


No 30 
>cd00992 PDZ_signaling PDZ domain found in a variety of Eumetazoan signaling molecules, often in tandem arrangements. May be responsible for specific protein-protein interactions, as most PDZ domains bind C-terminal polypeptides, and binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of PDZ domains an N-terminal beta-strand forms the peptide-binding groove base, a circular permutation with respect to PDZ domains found in proteases.
Probab=98.60  E-value=2.5e-07  Score=72.85  Aligned_cols=54  Identities=26%  Similarity=0.535  Sum_probs=47.6

Q ss_pred             CceEEeEECCCCccccCCCCCCCEEEEECCEeeC--CHHHHHHHHhcCCCCeEEEEE
Q 013444          368 SGVLVPVVTPGSPAHLAGFLPSDVVIKFDGKPVQ--SITEIIEIMGDRVGEPLKVVV  422 (443)
Q Consensus       368 ~g~~V~~v~~~spA~~aGl~~GDiI~~vng~~V~--s~~dl~~~l~~~~g~~v~l~v  422 (443)
                      .|++|.+|.++|||+++||++||+|++|||+++.  +++++.+++....+ .+.+++
T Consensus        26 ~~~~V~~v~~~s~a~~~gl~~GD~I~~ing~~i~~~~~~~~~~~l~~~~~-~v~l~v   81 (82)
T cd00992          26 GGIFVSRVEPGGPAERGGLRVGDRILEVNGVSVEGLTHEEAVELLKNSGD-EVTLTV   81 (82)
T ss_pred             CCeEEEEECCCChHHhCCCCCCCEEEEECCEEcCccCHHHHHHHHHhCCC-eEEEEE
Confidence            5899999999999999999999999999999999  89999998876443 566654


No 31 
>PRK10942 serine endoprotease; Provisional
Probab=98.59  E-value=2e-07  Score=97.83  Aligned_cols=65  Identities=34%  Similarity=0.572  Sum_probs=58.4

Q ss_pred             CceEEeEECCCCccccCCCCCCCEEEEECCEeeCCHHHHHHHHhcCCCCeEEEEEEECCCeEEEEEE
Q 013444          368 SGVLVPVVTPGSPAHLAGFLPSDVVIKFDGKPVQSITEIIEIMGDRVGEPLKVVVQRANDQLVTLTV  434 (443)
Q Consensus       368 ~g~~V~~v~~~spA~~aGl~~GDiI~~vng~~V~s~~dl~~~l~~~~g~~v~l~v~R~~g~~~~l~v  434 (443)
                      .|++|.+|.++|+|+++||++||+|++|||++|.+++++.+++.... +.+.|+|.| +|+...+.+
T Consensus       408 ~gvvV~~V~~~S~A~~aGL~~GDvIv~VNg~~V~s~~dl~~~l~~~~-~~v~l~V~R-~g~~~~v~~  472 (473)
T PRK10942        408 KGVVVDNVKPGTPAAQIGLKKGDVIIGANQQPVKNIAELRKILDSKP-SVLALNIQR-GDSSIYLLM  472 (473)
T ss_pred             CCeEEEEeCCCChHHHcCCCCCCEEEEECCEEcCCHHHHHHHHHhCC-CeEEEEEEE-CCEEEEEEe
Confidence            58999999999999999999999999999999999999999988744 689999999 777766554


No 32 
>PRK10139 serine endoprotease; Provisional
Probab=98.58  E-value=2e-07  Score=97.25  Aligned_cols=65  Identities=26%  Similarity=0.456  Sum_probs=58.7

Q ss_pred             CceEEeEECCCCccccCCCCCCCEEEEECCEeeCCHHHHHHHHhcCCCCeEEEEEEECCCeEEEEEE
Q 013444          368 SGVLVPVVTPGSPAHLAGFLPSDVVIKFDGKPVQSITEIIEIMGDRVGEPLKVVVQRANDQLVTLTV  434 (443)
Q Consensus       368 ~g~~V~~v~~~spA~~aGl~~GDiI~~vng~~V~s~~dl~~~l~~~~g~~v~l~v~R~~g~~~~l~v  434 (443)
                      .|++|.+|.++|||+++||++||+|++|||++|.+|+++.+++..+. +++.++|+| +|+...+.+
T Consensus       390 ~Gv~V~~V~~~spA~~aGL~~GD~I~~Ing~~v~~~~~~~~~l~~~~-~~v~l~v~R-~g~~~~~~~  454 (455)
T PRK10139        390 KGIKIDEVVKGSPAAQAGLQKDDVIIGVNRDRVNSIAEMRKVLAAKP-AIIALQIVR-GNESIYLLL  454 (455)
T ss_pred             CceEEEEeCCCChHHHcCCCCCCEEEEECCEEcCCHHHHHHHHHhCC-CeEEEEEEE-CCEEEEEEe
Confidence            58999999999999999999999999999999999999999997754 689999999 787766654


No 33 
>smart00228 PDZ Domain present in PSD-95, Dlg, and ZO-1/2. Also called DHR (Dlg homologous region) or GLGF (relatively well conserved tetrapeptide in these domains). Some PDZs have been shown to bind C-terminal polypeptides; others appear to bind internal (non-C-terminal) polypeptides. Different PDZs possess different binding specificities.
Probab=98.55  E-value=3.2e-07  Score=72.48  Aligned_cols=57  Identities=33%  Similarity=0.584  Sum_probs=47.6

Q ss_pred             CceEEeEECCCCccccCCCCCCCEEEEECCEeeCCHHHHHHHHh-cCCCCeEEEEEEE
Q 013444          368 SGVLVPVVTPGSPAHLAGFLPSDVVIKFDGKPVQSITEIIEIMG-DRVGEPLKVVVQR  424 (443)
Q Consensus       368 ~g~~V~~v~~~spA~~aGl~~GDiI~~vng~~V~s~~dl~~~l~-~~~g~~v~l~v~R  424 (443)
                      .|++|..|.+++||+++||++||+|++|||+++.++.+...... ...++.+.+++.|
T Consensus        26 ~~~~i~~v~~~s~a~~~gl~~GD~I~~In~~~v~~~~~~~~~~~~~~~~~~~~l~i~r   83 (85)
T smart00228       26 GGVVVSSVVPGSPAAKAGLKVGDVILEVNGTSVEGLTHLEAVDLLKKAGGKVTLTVLR   83 (85)
T ss_pred             CCEEEEEECCCCHHHHcCCCCCCEEEEECCEECCCCCHHHHHHHHHhCCCeEEEEEEe
Confidence            68999999999999999999999999999999998766544322 2335689999988


No 34 
>TIGR02860 spore_IV_B stage IV sporulation protein B. SpoIVB, the stage IV sporulation protein B of endospore-forming bacteria such as Bacillus subtilis, is a serine proteinase, expressed in the spore (rather than mother cell) compartment, that participates in a proteolytic activation cascade for Sigma-K. It appears to be universal among endospore-forming bacteria and occurs nowhere else.
Probab=98.55  E-value=3.5e-07  Score=92.55  Aligned_cols=70  Identities=23%  Similarity=0.480  Sum_probs=60.7

Q ss_pred             CCceEEeEEC--------CCCccccCCCCCCCEEEEECCEeeCCHHHHHHHHhcCCCCeEEEEEEECCCeEEEEEEEec
Q 013444          367 KSGVLVPVVT--------PGSPAHLAGFLPSDVVIKFDGKPVQSITEIIEIMGDRVGEPLKVVVQRANDQLVTLTVIPE  437 (443)
Q Consensus       367 ~~g~~V~~v~--------~~spA~~aGl~~GDiI~~vng~~V~s~~dl~~~l~~~~g~~v~l~v~R~~g~~~~l~v~~~  437 (443)
                      ..|++|....        .+|||+++||++||+|++|||++|++|+|+.+++....++++.++|.| +++..++.++|.
T Consensus       104 t~GVlVvg~~~v~~~~g~~~SPAa~AGLq~GDiIvsING~~V~s~~DL~~iL~~~~g~~V~LtV~R-~Ge~~tv~V~Pv  181 (402)
T TIGR02860       104 TKGVLVVGFSDIETEKGKIHSPGEEAGIQIGDRILKINGEKIKNMDDLANLINKAGGEKLTLTIER-GGKIIETVIKPV  181 (402)
T ss_pred             cCEEEEEEEEcccccCCCCCCHHHHcCCCCCCEEEEECCEECCCHHHHHHHHHhCCCCeEEEEEEE-CCEEEEEEEEEe
Confidence            3688885542        369999999999999999999999999999999988778899999999 888888888765


No 35 
>PLN00049 carboxyl-terminal processing protease; Provisional
Probab=98.51  E-value=4.5e-07  Score=92.96  Aligned_cols=68  Identities=25%  Similarity=0.498  Sum_probs=59.0

Q ss_pred             CceEEeEECCCCccccCCCCCCCEEEEECCEeeCC--HHHHHHHHhcCCCCeEEEEEEECCCeEEEEEEEe
Q 013444          368 SGVLVPVVTPGSPAHLAGFLPSDVVIKFDGKPVQS--ITEIIEIMGDRVGEPLKVVVQRANDQLVTLTVIP  436 (443)
Q Consensus       368 ~g~~V~~v~~~spA~~aGl~~GDiI~~vng~~V~s--~~dl~~~l~~~~g~~v~l~v~R~~g~~~~l~v~~  436 (443)
                      .+++|..|.++|||+++||++||+|++|||++|.+  +.++...+....|+.+.|+|.| +++..+++++.
T Consensus       102 ~g~~V~~V~~~SPA~~aGl~~GD~Iv~InG~~v~~~~~~~~~~~l~g~~g~~v~ltv~r-~g~~~~~~l~r  171 (389)
T PLN00049        102 AGLVVVAPAPGGPAARAGIRPGDVILAIDGTSTEGLSLYEAADRLQGPEGSSVELTLRR-GPETRLVTLTR  171 (389)
T ss_pred             CcEEEEEeCCCChHHHcCCCCCCEEEEECCEECCCCCHHHHHHHHhcCCCCEEEEEEEE-CCEEEEEEEEe
Confidence            48999999999999999999999999999999985  4777777777788899999999 77776666654


No 36 
>PF00595 PDZ:  PDZ domain (Also known as DHR or GLGF) Coordinates are not yet available;  InterPro: IPR001478 PDZ domains are found in diverse signalling proteins in bacteria, yeasts, plants, insects and vertebrates. PDZ domains can occur in one or multiple copies and are nearly always found in cytoplasmic proteins. They bind either the carboxyl-terminal sequences of proteins or internal peptide sequences. In most cases, interaction between a PDZ domain and its target is constitutive, with a binding affinity of 1 to 10 micromolar. However, agonist-dependent activation of cell surface receptors is sometimes required to promote interaction with a PDZ protein. PDZ domain proteins are frequently associated with the plasma membrane, a compartment where high concentrations of phosphatidylinositol 4,5-bisphosphate (PIP2) are found. Direct interaction between PIP2 and a subset of class II PDZ domains (syntenin, CASK, Tiam-1) has been demonstrated.  PDZ domains consist of 80 to 90 amino acids comprising six beta-strands (beta-A to beta-F) and two alpha-helices, A and B, compactly arranged in a globular structure. Peptide binding of the ligand takes place in an elongated surface groove as an anti-parallel beta-strand interacts with the beta-B strand and the B helix. The structure of PDZ domains allows binding to a free carboxylate group at the end of a peptide through a carboxylate-binding loop between the beta-A and beta-B strands.; GO: 0005515 protein binding; PDB: 3AXA_A 1WF8_A 1QAV_B 1QAU_A 1B8Q_A 1MC7_A 2KAW_A 1I16_A 1VB7_A 1WI4_A ....
Probab=98.49  E-value=2.8e-07  Score=72.72  Aligned_cols=70  Identities=31%  Similarity=0.584  Sum_probs=54.6

Q ss_pred             eecCceeecccHHHHHHhhcCCCCCCCCCCceEEeEECCCCccccCCCCCCCEEEEECCEeeCCH--HHHHHHHhcCCCC
Q 013444          339 PWLGLKMLDLNDMIIAQLKERDPSFPNVKSGVLVPVVTPGSPAHLAGFLPSDVVIKFDGKPVQSI--TEIIEIMGDRVGE  416 (443)
Q Consensus       339 p~lGi~~~~~~~~~~~~l~~~~~~~~~~~~g~~V~~v~~~spA~~aGl~~GDiI~~vng~~V~s~--~dl~~~l~~~~g~  416 (443)
                      ..|||++..-..              ....+++|.+|.++++|+++||++||+|++|||+.+.++  .++..++....+ 
T Consensus        10 ~~lG~~l~~~~~--------------~~~~~~~V~~v~~~~~a~~~gl~~GD~Il~INg~~v~~~~~~~~~~~l~~~~~-   74 (81)
T PF00595_consen   10 GPLGFTLRGGSD--------------NDEKGVFVSSVVPGSPAERAGLKVGDRILEINGQSVRGMSHDEVVQLLKSASN-   74 (81)
T ss_dssp             SBSSEEEEEEST--------------SSSEEEEEEEECTTSHHHHHTSSTTEEEEEETTEESTTSBHHHHHHHHHHSTS-
T ss_pred             CCcCEEEEecCC--------------CCcCCEEEEEEeCCChHHhcccchhhhhheeCCEeCCCCCHHHHHHHHHCCCC-
Confidence            468888765322              113599999999999999999999999999999999876  455666666544 


Q ss_pred             eEEEEEE
Q 013444          417 PLKVVVQ  423 (443)
Q Consensus       417 ~v~l~v~  423 (443)
                      +++|+|+
T Consensus        75 ~v~L~V~   81 (81)
T PF00595_consen   75 PVTLTVQ   81 (81)
T ss_dssp             EEEEEEE
T ss_pred             cEEEEEC
Confidence            7888764


No 37 
>TIGR03279 cyano_FeS_chp putative FeS-containing Cyanobacterial-specific oxidoreductase. Members of this protein family are predicted FeS-containing oxidoreductases of unknown function, apparently restricted to and universal across the Cyanobacteria. The high trusted cutoff score for this model, 700 bits, excludes homologs from other lineages. This exclusion seems justified because a significant number of sequence positions are simultaneously unique to and invariant across the Cyanobacteria, suggesting a specialized, conserved function, perhaps related to photosynthesis. A distantly related protein family, TIGR03278, is universal in and restricted to archaeal methanogens, and may be linked to methanogenesis.
Probab=98.48  E-value=3e-07  Score=93.60  Aligned_cols=63  Identities=22%  Similarity=0.370  Sum_probs=54.9

Q ss_pred             EeEECCCCccccCCCCCCCEEEEECCEeeCCHHHHHHHHhcCCCCeEEEEEE-ECCCeEEEEEEEecC
Q 013444          372 VPVVTPGSPAHLAGFLPSDVVIKFDGKPVQSITEIIEIMGDRVGEPLKVVVQ-RANDQLVTLTVIPEE  438 (443)
Q Consensus       372 V~~v~~~spA~~aGl~~GDiI~~vng~~V~s~~dl~~~l~~~~g~~v~l~v~-R~~g~~~~l~v~~~~  438 (443)
                      |..|.|+|+|+++||++||+|++|||++|.+|.|+...+.   ++.+.++|. | +|+..++++.+++
T Consensus         2 I~~V~pgSpAe~AGLe~GD~IlsING~~V~Dw~D~~~~l~---~e~l~L~V~~r-dGe~~~l~Ie~~~   65 (433)
T TIGR03279         2 ISAVLPGSIAEELGFEPGDALVSINGVAPRDLIDYQFLCA---DEELELEVLDA-NGESHQIEIEKDL   65 (433)
T ss_pred             cCCcCCCCHHHHcCCCCCCEEEEECCEECCCHHHHHHHhc---CCcEEEEEEcC-CCeEEEEEEecCC
Confidence            6679999999999999999999999999999999987774   356889997 6 7888888888753


No 38 
>KOG3627 consensus Trypsin [Amino acid transport and metabolism]
Probab=98.35  E-value=2.9e-05  Score=74.53  Aligned_cols=180  Identities=22%  Similarity=0.265  Sum_probs=97.5

Q ss_pred             CceEEEEecccccccccCCcEEEEEEEeCCCEEEeccccccCCCCCCCCCCceEEEEeCC---------C---cEEEE-E
Q 013444          136 PAVVNLSAPREFLGILSGRGIGSGAIVDADGTILTCAHVVVDFHGSRALPKGKVDVTLQD---------G---RTFEG-T  202 (443)
Q Consensus       136 pSVV~I~~~~~~~~~~~~~~~GSGfiI~~~G~ILTaaHvv~~~~~~~~~~~~~i~V~~~d---------g---~~~~a-~  202 (443)
                      |-.|.+.....      ....|.|.+|+++ ||||++||+....     .. .+.|.+..         +   ..... +
T Consensus        25 Pw~~~l~~~~~------~~~~Cggsli~~~-~vltaaHC~~~~~-----~~-~~~V~~G~~~~~~~~~~~~~~~~~~v~~   91 (256)
T KOG3627|consen   25 PWQVSLQYGGN------GRHLCGGSLISPR-WVLTAAHCVKGAS-----AS-LYTVRLGEHDINLSVSEGEEQLVGDVEK   91 (256)
T ss_pred             CCEEEEEECCC------cceeeeeEEeeCC-EEEEChhhCCCCC-----Cc-ceEEEECccccccccccCchhhhceeeE
Confidence            34666655432      1136778888766 9999999998742     01 33444321         1   11111 2


Q ss_pred             EEeecC-------C-CCEEEEEEcCC----CCCCccccCCCCC---CCCCCEEEEEecCCCC------CCceEEEEEEee
Q 013444          203 VLNADF-------H-SDIAIVKINSK----TPLPAAKLGTSSK---LCPGDWVVAMGCPHSL------QNTVTAGIVSCV  261 (443)
Q Consensus       203 vv~~d~-------~-~DlAlLkl~~~----~~~~~~~l~~s~~---~~~G~~V~~iG~p~~~------~~~~t~G~Vs~~  261 (443)
                      ++ .|+       . +|||||+++.+    ..+.++.|.....   ...+..+++.||+...      ...+....+..+
T Consensus        92 ~i-~H~~y~~~~~~~nDiall~l~~~v~~~~~i~piclp~~~~~~~~~~~~~~~v~GWG~~~~~~~~~~~~L~~~~v~i~  170 (256)
T KOG3627|consen   92 II-VHPNYNPRTLENNDIALLRLSEPVTFSSHIQPICLPSSADPYFPPGGTTCLVSGWGRTESGGGPLPDTLQEVDVPII  170 (256)
T ss_pred             EE-ECCCCCCCCCCCCCEEEEEECCCcccCCcccccCCCCCcccCCCCCCCEEEEEeCCCcCCCCCCCCceeEEEEEeEc
Confidence            22 232       3 79999999874    3345666643222   3445888899997532      122332333333


Q ss_pred             ecCccCCCCCC---ccceEEEEc-----ccCCCCCccceeeecC---CCEEEEEEEEeec--CC-CeEEEEeHHHHHHHH
Q 013444          262 DRKSSDLGLGG---MRREYLQTD-----CAINAGNSGGPLVNID---GEIVGINIMKVAA--AD-GLSFAVPIDSAAKII  327 (443)
Q Consensus       262 ~~~~~~~~~~~---~~~~~i~~d-----~~i~~G~SGGPlvd~~---G~VVGI~s~~~~~--~~-g~~~aIPi~~i~~~l  327 (443)
                      ....+...+..   .....+...     ...|.|+|||||+-.+   ..++||++++...  .. .-+....+....+++
T Consensus       171 ~~~~C~~~~~~~~~~~~~~~Ca~~~~~~~~~C~GDSGGPLv~~~~~~~~~~GivS~G~~~C~~~~~P~vyt~V~~y~~WI  250 (256)
T KOG3627|consen  171 SNSECRRAYGGLGTITDTMLCAGGPEGGKDACQGDSGGPLVCEDNGRWVLVGIVSWGSGGCGQPNYPGVYTRVSSYLDWI  250 (256)
T ss_pred             ChhHhcccccCccccCCCEEeeCccCCCCccccCCCCCeEEEeeCCcEEEEEEEEecCCCCCCCCCCeEEeEhHHhHHHH
Confidence            33323222211   111234443     3368999999999654   5999999998642  11 122245555555555


Q ss_pred             HH
Q 013444          328 EQ  329 (443)
Q Consensus       328 ~~  329 (443)
                      ++
T Consensus       251 ~~  252 (256)
T KOG3627|consen  251 KE  252 (256)
T ss_pred             HH
Confidence            44


No 39 
>COG0793 Prc Periplasmic protease [Cell envelope biogenesis, outer membrane]
Probab=98.30  E-value=3.5e-06  Score=86.60  Aligned_cols=70  Identities=36%  Similarity=0.567  Sum_probs=60.2

Q ss_pred             CceEEeEECCCCccccCCCCCCCEEEEECCEeeCCH--HHHHHHHhcCCCCeEEEEEEECC-CeEEEEEEEec
Q 013444          368 SGVLVPVVTPGSPAHLAGFLPSDVVIKFDGKPVQSI--TEIIEIMGDRVGEPLKVVVQRAN-DQLVTLTVIPE  437 (443)
Q Consensus       368 ~g~~V~~v~~~spA~~aGl~~GDiI~~vng~~V~s~--~dl~~~l~~~~g~~v~l~v~R~~-g~~~~l~v~~~  437 (443)
                      .++.|.++.+++||+++||++||+|++|||+++...  +++.+.+....|..++|++.|.+ ++.++++++.+
T Consensus       112 ~~~~V~s~~~~~PA~kagi~~GD~I~~IdG~~~~~~~~~~av~~irG~~Gt~V~L~i~r~~~~k~~~v~l~Re  184 (406)
T COG0793         112 GGVKVVSPIDGSPAAKAGIKPGDVIIKIDGKSVGGVSLDEAVKLIRGKPGTKVTLTILRAGGGKPFTVTLTRE  184 (406)
T ss_pred             CCcEEEecCCCChHHHcCCCCCCEEEEECCEEccCCCHHHHHHHhCCCCCCeEEEEEEEcCCCceeEEEEEEE
Confidence            688999999999999999999999999999999765  56778888899999999999953 45666666654


No 40 
>PF14685 Tricorn_PDZ:  Tricorn protease PDZ domain; PDB: 1N6F_D 1N6D_C 1N6E_C 1K32_A.
Probab=98.29  E-value=4.7e-06  Score=66.75  Aligned_cols=67  Identities=21%  Similarity=0.419  Sum_probs=49.5

Q ss_pred             CCceEEeEECCC--------CccccCCC--CCCCEEEEECCEeeCCHHHHHHHHhcCCCCeEEEEEEECCCeEEEEE
Q 013444          367 KSGVLVPVVTPG--------SPAHLAGF--LPSDVVIKFDGKPVQSITEIIEIMGDRVGEPLKVVVQRANDQLVTLT  433 (443)
Q Consensus       367 ~~g~~V~~v~~~--------spA~~aGl--~~GDiI~~vng~~V~s~~dl~~~l~~~~g~~v~l~v~R~~g~~~~l~  433 (443)
                      ..+..|..|.++        ||..+.|+  ++||+|++|||+++..-.++..+|..+.|+.+.|+|.+.+++.+++.
T Consensus        11 ~~~y~I~~I~~gd~~~~~~~sPL~~pGv~v~~GD~I~aInG~~v~~~~~~~~lL~~~agk~V~Ltv~~~~~~~R~v~   87 (88)
T PF14685_consen   11 NGGYRIARIYPGDPWNPNARSPLAQPGVDVREGDYILAINGQPVTADANPYRLLEGKAGKQVLLTVNRKPGGARTVV   87 (88)
T ss_dssp             TTEEEEEEE-BS-TTSSS-B-GGGGGS----TT-EEEEETTEE-BTTB-HHHHHHTTTTSEEEEEEE-STT-EEEEE
T ss_pred             CCEEEEEEEeCCCCCCccccCCccCCCCCCCCCCEEEEECCEECCCCCCHHHHhcccCCCEEEEEEecCCCCceEEE
Confidence            368889999875        77777765  59999999999999999999999999999999999999555555554


No 41 
>PF00863 Peptidase_C4:  Peptidase family C4 This family belongs to family C4 of the peptidase classification.;  InterPro: IPR001730 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. 
These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad.  Nuclear inclusion A (NIA) proteases from potyviruses are cysteine peptidases belonging to the MEROPS peptidase family C4 (NIa protease family, clan PA(C)).  Potyviruses include plant viruses in which the single-stranded RNA encodes a polyprotein with NIA protease activity, where proteolytic cleavage is specific for Gln+Gly sites. The NIA protease acts on the polyprotein, releasing itself by Gln+Gly cleavage at both the N- and C-termini. It further processes the polyprotein by cleavage at five similar sites in the C-terminal half of the sequence. In addition to its C-terminal protease activity, the NIA protease contains an N-terminal domain that has been implicated in the transcription process. This peptidase is present in the nuclear inclusion protein of potyviruses.; GO: 0008234 cysteine-type peptidase activity, 0006508 proteolysis; PDB: 3MMG_B 1Q31_B 1LVB_A 1LVM_A.
Probab=98.28  E-value=3.4e-05  Score=72.79  Aligned_cols=165  Identities=18%  Similarity=0.219  Sum_probs=84.4

Q ss_pred             HhCCceEEEEecccccccccCCcEEEEEEEeCCCEEEeccccccCCCCCCCCCCceEEEEeCCCcEEEE----EEEeecC
Q 013444          133 RVCPAVVNLSAPREFLGILSGRGIGSGAIVDADGTILTCAHVVVDFHGSRALPKGKVDVTLQDGRTFEG----TVLNADF  208 (443)
Q Consensus       133 ~~~pSVV~I~~~~~~~~~~~~~~~GSGfiI~~~G~ILTaaHvv~~~~~~~~~~~~~i~V~~~dg~~~~a----~vv~~d~  208 (443)
                      -+...|.+|.-..+...     ..--|+.++ + +|+|++|..+..       ++.++|...-|.-.-.    --+..=+
T Consensus        15 ~Ia~~ic~l~n~s~~~~-----~~l~gigyG-~-~iItn~HLf~~n-------ng~L~i~s~hG~f~v~nt~~lkv~~i~   80 (235)
T PF00863_consen   15 PIASNICRLTNESDGGT-----RSLYGIGYG-S-YIITNAHLFKRN-------NGELTIKSQHGEFTVPNTTQLKVHPIE   80 (235)
T ss_dssp             HHHTTEEEEEEEETTEE-----EEEEEEEET-T-EEEEEGGGGSST-------TCEEEEEETTEEEEECEGGGSEEEE-T
T ss_pred             hhhheEEEEEEEeCCCe-----EEEEEEeEC-C-EEEEChhhhccC-------CCeEEEEeCceEEEcCCccccceEEeC
Confidence            44566777764332211     223356665 3 999999999764       3567777665532111    1223335


Q ss_pred             CCCEEEEEEcCCCCCCccccC-CCCCCCCCCEEEEEecCCCCCCceEEEEEEeeecCccCCCCCCccceEEEEcccCCCC
Q 013444          209 HSDIAIVKINSKTPLPAAKLG-TSSKLCPGDWVVAMGCPHSLQNTVTAGIVSCVDRKSSDLGLGGMRREYLQTDCAINAG  287 (443)
Q Consensus       209 ~~DlAlLkl~~~~~~~~~~l~-~s~~~~~G~~V~~iG~p~~~~~~~t~G~Vs~~~~~~~~~~~~~~~~~~i~~d~~i~~G  287 (443)
                      ..||.++++..+  +||.+-. .-..++.++.|+++|.-+....  ....|+.......     .....++.+......|
T Consensus        81 ~~DiviirmPkD--fpPf~~kl~FR~P~~~e~v~mVg~~fq~k~--~~s~vSesS~i~p-----~~~~~fWkHwIsTk~G  151 (235)
T PF00863_consen   81 GRDIVIIRMPKD--FPPFPQKLKFRAPKEGERVCMVGSNFQEKS--ISSTVSESSWIYP-----EENSHFWKHWISTKDG  151 (235)
T ss_dssp             CSSEEEEE--TT--S----S---B----TT-EEEEEEEECSSCC--CEEEEEEEEEEEE-----ETTTTEEEE-C---TT
T ss_pred             CccEEEEeCCcc--cCCcchhhhccCCCCCCEEEEEEEEEEcCC--eeEEECCceEEee-----cCCCCeeEEEecCCCC
Confidence            789999999753  4433221 1246789999999998554221  1222222221111     1123466677777889


Q ss_pred             Cccceeeec-CCCEEEEEEEEeecCCCeEEEEeHH
Q 013444          288 NSGGPLVNI-DGEIVGINIMKVAAADGLSFAVPID  321 (443)
Q Consensus       288 ~SGGPlvd~-~G~VVGI~s~~~~~~~g~~~aIPi~  321 (443)
                      +=|.|+|+. ||.+|||++..... ...+|+.|+.
T Consensus       152 ~CG~PlVs~~Dg~IVGiHsl~~~~-~~~N~F~~f~  185 (235)
T PF00863_consen  152 DCGLPLVSTKDGKIVGIHSLTSNT-SSRNYFTPFP  185 (235)
T ss_dssp             -TT-EEEETTT--EEEEEEEEETT-TSSEEEEE--
T ss_pred             ccCCcEEEcCCCcEEEEEcCccCC-CCeEEEEcCC
Confidence            999999976 59999999986544 4677888775


No 42 
>TIGR00054 RIP metalloprotease RseP. A model that also detects fragments matches a number of members of the peptidase family S2C. The region of match appears not to overlap the active site domain.
Probab=98.23  E-value=1.8e-06  Score=89.50  Aligned_cols=63  Identities=17%  Similarity=0.285  Sum_probs=55.3

Q ss_pred             CceEEeEECCCCccccCCCCCCCEEEEECCEeeCCHHHHHHHHhcCCCCeEEEEEEECCCeEEEE
Q 013444          368 SGVLVPVVTPGSPAHLAGFLPSDVVIKFDGKPVQSITEIIEIMGDRVGEPLKVVVQRANDQLVTL  432 (443)
Q Consensus       368 ~g~~V~~v~~~spA~~aGl~~GDiI~~vng~~V~s~~dl~~~l~~~~g~~v~l~v~R~~g~~~~l  432 (443)
                      .|.+|.+|.++|||++|||++||+|++|||+++.+++++...+.... +++.+++.| +++..++
T Consensus       128 ~g~~V~~V~~~SpA~~AGL~~GDvI~~vng~~v~~~~dl~~~ia~~~-~~v~~~I~r-~g~~~~l  190 (420)
T TIGR00054       128 VGPVIELLDKNSIALEAGIEPGDEILSVNGNKIPGFKDVRQQIADIA-GEPMVEILA-ERENWTF  190 (420)
T ss_pred             CCceeeccCCCCHHHHcCCCCCCEEEEECCEEcCCHHHHHHHHHhhc-ccceEEEEE-ecCceEe
Confidence            68889999999999999999999999999999999999998887655 678899999 5555443


No 43 
>KOG3129 consensus 26S proteasome regulatory complex, subunit PSMD9 [Posttranslational modification, protein turnover, chaperones]
Probab=98.10  E-value=9.3e-06  Score=73.88  Aligned_cols=72  Identities=26%  Similarity=0.376  Sum_probs=60.8

Q ss_pred             ceEEeEECCCCccccCCCCCCCEEEEECCEeeCCHHHHHH---HHhcCCCCeEEEEEEECCCeEEEEEEEecCCCC
Q 013444          369 GVLVPVVTPGSPAHLAGFLPSDVVIKFDGKPVQSITEIIE---IMGDRVGEPLKVVVQRANDQLVTLTVIPEEANP  441 (443)
Q Consensus       369 g~~V~~v~~~spA~~aGl~~GDiI~~vng~~V~s~~dl~~---~l~~~~g~~v~l~v~R~~g~~~~l~v~~~~~~~  441 (443)
                      -++|.+|.|+|||+++||+.||.|++++...--++..|+.   ......++.+.++|.| .|+.+.+.++|....+
T Consensus       140 Fa~V~sV~~~SPA~~aGl~~gD~il~fGnV~sgn~~~lq~i~~~v~~~e~~~v~v~v~R-~g~~v~L~ltP~~W~G  214 (231)
T KOG3129|consen  140 FAVVDSVVPGSPADEAGLCVGDEILKFGNVHSGNFLPLQNIAAVVQSNEDQIVSVTVIR-EGQKVVLSLTPKKWQG  214 (231)
T ss_pred             eEEEeecCCCChhhhhCcccCceEEEecccccccchhHHHHHHHHHhccCcceeEEEec-CCCEEEEEeCcccccC
Confidence            4678999999999999999999999999887777665553   3345778899999999 8999999999987654


No 44 
>PF04495 GRASP55_65:  GRASP55/65 PDZ-like domain ;  InterPro: IPR007583 GRASP55 (Golgi reassembly stacking protein of 55 kDa) and GRASP65 (a 65 kDa protein) are highly homologous. GRASP55 is a component of the Golgi stacking machinery. GRASP65 is an N-ethylmaleimide-sensitive membrane protein required for the stacking of Golgi cisternae in a cell-free system.; PDB: 3RLE_A 4EDJ_A.
Probab=98.03  E-value=1.6e-05  Score=69.27  Aligned_cols=72  Identities=28%  Similarity=0.473  Sum_probs=55.0

Q ss_pred             CCceEEeEECCCCccccCCCCC-CCEEEEECCEeeCCHHHHHHHHhcCCCCeEEEEEEECCC-eEEEEEEEecC
Q 013444          367 KSGVLVPVVTPGSPAHLAGFLP-SDVVIKFDGKPVQSITEIIEIMGDRVGEPLKVVVQRAND-QLVTLTVIPEE  438 (443)
Q Consensus       367 ~~g~~V~~v~~~spA~~aGl~~-GDiI~~vng~~V~s~~dl~~~l~~~~g~~v~l~v~R~~g-~~~~l~v~~~~  438 (443)
                      ..++-|.+|.|+|||++|||++ .|.|+.+|+....+.++|.+.+..+.++++.|.|++... ....+.++|..
T Consensus        42 ~~~~~Vl~V~p~SPA~~AGL~p~~DyIig~~~~~l~~~~~l~~~v~~~~~~~l~L~Vyns~~~~vR~V~i~P~~  115 (138)
T PF04495_consen   42 EEGWHVLRVAPNSPAAKAGLEPFFDYIIGIDGGLLDDEDDLFELVEANENKPLQLYVYNSKTDSVREVTITPSR  115 (138)
T ss_dssp             CCEEEEEEE-TTSHHHHTT--TTTEEEEEETTCE--STCHHHHHHHHTTTS-EEEEEEETTTTCEEEEEE---T
T ss_pred             cceEEEeEecCCCHHHHCCccccccEEEEccceecCCHHHHHHHHHHcCCCcEEEEEEECCCCeEEEEEEEcCC
Confidence            4588899999999999999999 599999999999999999999999999999999997433 45678888764


No 45 
>PRK11186 carboxy-terminal protease; Provisional
Probab=97.99  E-value=3.2e-05  Score=83.79  Aligned_cols=70  Identities=14%  Similarity=0.352  Sum_probs=55.9

Q ss_pred             CceEEeEECCCCccccC-CCCCCCEEEEEC--CEeeC-----CHHHHHHHHhcCCCCeEEEEEEEC--CCeEEEEEEEec
Q 013444          368 SGVLVPVVTPGSPAHLA-GFLPSDVVIKFD--GKPVQ-----SITEIIEIMGDRVGEPLKVVVQRA--NDQLVTLTVIPE  437 (443)
Q Consensus       368 ~g~~V~~v~~~spA~~a-Gl~~GDiI~~vn--g~~V~-----s~~dl~~~l~~~~g~~v~l~v~R~--~g~~~~l~v~~~  437 (443)
                      .+++|.+|.+|+||+++ ||++||+|++||  |+++.     +.+++..+|....|.+|.|+|.|.  +++..+++++..
T Consensus       255 ~~~~V~~vipGsPA~ka~gLk~GD~IlaVn~~g~~~~dv~g~~~~~vv~lirG~~Gt~V~LtV~r~~~~~~~~~vtl~R~  334 (667)
T PRK11186        255 DYTVINSLVAGGPAAKSKKLSVGDKIVGVGQDGKPIVDVIGWRLDDVVALIKGPKGSKVRLEILPAGKGTKTRIVTLTRD  334 (667)
T ss_pred             CeEEEEEccCCChHHHhCCCCCCCEEEEECCCCCcccccccCCHHHHHHHhcCCCCCEEEEEEEeCCCCCceEEEEEEee
Confidence            46899999999999998 999999999999  55443     245788888888999999999983  245566666543


No 46 
>COG3480 SdrC Predicted secreted protein containing a PDZ domain [Signal transduction mechanisms]
Probab=97.93  E-value=3.8e-05  Score=74.33  Aligned_cols=68  Identities=26%  Similarity=0.425  Sum_probs=59.1

Q ss_pred             CceEEeEECCCCccccCCCCCCCEEEEECCEeeCCHHHHHHHHhc-CCCCeEEEEEEECCCeEEEEEEEe
Q 013444          368 SGVLVPVVTPGSPAHLAGFLPSDVVIKFDGKPVQSITEIIEIMGD-RVGEPLKVVVQRANDQLVTLTVIP  436 (443)
Q Consensus       368 ~g~~V~~v~~~spA~~aGl~~GDiI~~vng~~V~s~~dl~~~l~~-~~g~~v~l~v~R~~g~~~~l~v~~  436 (443)
                      .|+++..+..++|+... |+.||.|++|||+++.+.+|+.+++.. +.|++++|++.|.++++...++++
T Consensus       130 ~gvyv~~v~~~~~~~gk-l~~gD~i~avdg~~f~s~~e~i~~v~~~k~Gd~VtI~~~r~~~~~~~~~~tl  198 (342)
T COG3480         130 AGVYVLSVIDNSPFKGK-LEAGDTIIAVDGEPFTSSDELIDYVSSKKPGDEVTIDYERHNETPEIVTITL  198 (342)
T ss_pred             eeEEEEEccCCcchhce-eccCCeEEeeCCeecCCHHHHHHHHhccCCCCeEEEEEEeccCCCceEEEEE
Confidence            59999999999999887 999999999999999999999998876 899999999998666655444443


No 47 
>PRK09681 putative type II secretion protein GspC; Provisional
Probab=97.83  E-value=4.6e-05  Score=73.60  Aligned_cols=61  Identities=18%  Similarity=0.350  Sum_probs=52.0

Q ss_pred             ECCCC---ccccCCCCCCCEEEEECCEeeCCHHHHHHHHhc-CCCCeEEEEEEECCCeEEEEEEEe
Q 013444          375 VTPGS---PAHLAGFLPSDVVIKFDGKPVQSITEIIEIMGD-RVGEPLKVVVQRANDQLVTLTVIP  436 (443)
Q Consensus       375 v~~~s---pA~~aGl~~GDiI~~vng~~V~s~~dl~~~l~~-~~g~~v~l~v~R~~g~~~~l~v~~  436 (443)
                      +.|+.   -.+++|||+||++++|||.++.+.++..+++.+ .....++|+|+| +|+.+++.+.+
T Consensus       211 l~Pgkd~~lF~~~GLq~GDva~sING~dL~D~~qa~~l~~~L~~~tei~ltVeR-dGq~~~i~i~l  275 (276)
T PRK09681        211 VKPGADRSLFDASGFKEGDIAIALNQQDFTDPRAMIALMRQLPSMDSIQLTVLR-KGARHDISIAL  275 (276)
T ss_pred             ECCCCcHHHHHHcCCCCCCEEEEeCCeeCCCHHHHHHHHHHhccCCeEEEEEEE-CCEEEEEEEEc
Confidence            55653   457889999999999999999999998888876 667789999999 99998887754


No 48 
>COG3975 Predicted protease with the C-terminal PDZ domain [General function prediction only]
Probab=97.77  E-value=4.3e-05  Score=78.65  Aligned_cols=64  Identities=25%  Similarity=0.387  Sum_probs=53.6

Q ss_pred             CCCceEEeEECCCCccccCCCCCCCEEEEECCEeeCCHHHHHHHHhc-CCCCeEEEEEEECCCeEEEEEEEecC
Q 013444          366 VKSGVLVPVVTPGSPAHLAGFLPSDVVIKFDGKPVQSITEIIEIMGD-RVGEPLKVVVQRANDQLVTLTVIPEE  438 (443)
Q Consensus       366 ~~~g~~V~~v~~~spA~~aGl~~GDiI~~vng~~V~s~~dl~~~l~~-~~g~~v~l~v~R~~g~~~~l~v~~~~  438 (443)
                      ...+.+|..|.++|||++|||.+||.|++|||.        .+.+.. +.++.+++++.| .+..+++.+++..
T Consensus       460 ~~g~~~i~~V~~~gPA~~AGl~~Gd~ivai~G~--------s~~l~~~~~~d~i~v~~~~-~~~L~e~~v~~~~  524 (558)
T COG3975         460 EGGHEKITFVFPGGPAYKAGLSPGDKIVAINGI--------SDQLDRYKVNDKIQVHVFR-EGRLREFLVKLGG  524 (558)
T ss_pred             cCCeeEEEecCCCChhHhccCCCccEEEEEcCc--------cccccccccccceEEEEcc-CCceEEeecccCC
Confidence            346789999999999999999999999999998        333443 788899999999 7888888777654


No 49 
>PF12812 PDZ_1:  PDZ-like domain
Probab=97.70  E-value=0.00018  Score=56.39  Aligned_cols=68  Identities=21%  Similarity=0.249  Sum_probs=57.5

Q ss_pred             eecCceeecccHHHHHHhhcCCCCCCCCCCceEEeEECCCCccccCCCCCCCEEEEECCEeeCCHHHHHHHHhcCC
Q 013444          339 PWLGLKMLDLNDMIIAQLKERDPSFPNVKSGVLVPVVTPGSPAHLAGFLPSDVVIKFDGKPVQSITEIIEIMGDRV  414 (443)
Q Consensus       339 p~lGi~~~~~~~~~~~~l~~~~~~~~~~~~g~~V~~v~~~spA~~aGl~~GDiI~~vng~~V~s~~dl~~~l~~~~  414 (443)
                      -|.|..+.+|+-..++++..        .-|.++.....++++.+-|+..|-+|.+|||+++.+.++|.+.+++-+
T Consensus         9 ~~~Ga~f~~Ls~q~aR~~~~--------~~~gv~v~~~~g~~~~~~~i~~g~iI~~Vn~kpt~~Ld~f~~vvk~ip   76 (78)
T PF12812_consen    9 EVCGAVFHDLSYQQARQYGI--------PVGGVYVAVSGGSLAFAGGISKGFIITSVNGKPTPDLDDFIKVVKKIP   76 (78)
T ss_pred             EEcCeecccCCHHHHHHhCC--------CCCEEEEEecCCChhhhCCCCCCeEEEeECCcCCcCHHHHHHHHHhCC
Confidence            58899999999999998855        334566667888988877799999999999999999999999887643


No 50 
>COG5640 Secreted trypsin-like serine protease [Posttranslational modification, protein turnover, chaperones]
Probab=97.54  E-value=0.0017  Score=64.06  Aligned_cols=51  Identities=18%  Similarity=0.294  Sum_probs=34.0

Q ss_pred             ccCCCCCccceeeec--CCC-EEEEEEEEeecCCC---eEEEEeHHHHHHHHHHHHH
Q 013444          282 CAINAGNSGGPLVNI--DGE-IVGINIMKVAAADG---LSFAVPIDSAAKIIEQFKK  332 (443)
Q Consensus       282 ~~i~~G~SGGPlvd~--~G~-VVGI~s~~~~~~~g---~~~aIPi~~i~~~l~~l~~  332 (443)
                      ...|.|+||||+|-.  +|+ -+||++|+.....+   -+..--++....+|+...+
T Consensus       223 ~daCqGDSGGPi~~~g~~G~vQ~GVvSwG~~~Cg~t~~~gVyT~vsny~~WI~a~~~  279 (413)
T COG5640         223 KDACQGDSGGPIFHKGEEGRVQRGVVSWGDGGCGGTLIPGVYTNVSNYQDWIAAMTN  279 (413)
T ss_pred             cccccCCCCCceEEeCCCccEEEeEEEecCCCCCCCCcceeEEehhHHHHHHHHHhc
Confidence            457899999999932  365 46999998764322   1223446677788877554


No 51 
>PF03761 DUF316:  Domain of unknown function (DUF316) ;  InterPro: IPR005514 This is a family of uncharacterised proteins from Caenorhabditis elegans.
Probab=97.49  E-value=0.009  Score=58.42  Aligned_cols=177  Identities=20%  Similarity=0.277  Sum_probs=100.2

Q ss_pred             hCCceEEEEecccccccccCCcEEEEEEEeCCCEEEeccccccCCCCCC----C-----CCC------------ceEEEE
Q 013444          134 VCPAVVNLSAPREFLGILSGRGIGSGAIVDADGTILTCAHVVVDFHGSR----A-----LPK------------GKVDVT  192 (443)
Q Consensus       134 ~~pSVV~I~~~~~~~~~~~~~~~GSGfiI~~~G~ILTaaHvv~~~~~~~----~-----~~~------------~~i~V~  192 (443)
                      -.|-.|.+.......    .....+|++|+++ ||||++|++-......    .     -..            ..+.+.
T Consensus        52 ~~pW~v~v~~~~~~~----~~~~~~gtlIS~R-HiLtss~~~~~~~~~W~~~~~~~~~~C~~~~~~l~vP~~~l~~~~v~  126 (282)
T PF03761_consen   52 EAPWAVSVYTKNHNE----GNYFSTGTLISPR-HILTSSHCVMNDKSKWLNGEEFDNKKCEGNNNHLIVPEEVLSKIDVR  126 (282)
T ss_pred             CCCCEEEEEeccCcc----cceecceEEeccC-eEEEeeeEEEecccccccCcccccceeeCCCceEEeCHHHhccEEEE
Confidence            456788887665322    1133499999998 9999999997322200    0     001            112220


Q ss_pred             ----eCCC-----cEEEEEEEe-e-------cCCCCEEEEEEcCC--CCCCccccCCCC-CCCCCCEEEEEecCCCCCCc
Q 013444          193 ----LQDG-----RTFEGTVLN-A-------DFHSDIAIVKINSK--TPLPAAKLGTSS-KLCPGDWVVAMGCPHSLQNT  252 (443)
Q Consensus       193 ----~~dg-----~~~~a~vv~-~-------d~~~DlAlLkl~~~--~~~~~~~l~~s~-~~~~G~~V~~iG~p~~~~~~  252 (443)
                          ...+     +...|.++. +       ...++++||+++.+  ....++.|+++. .+..++.+.+.|+...  ..
T Consensus       127 ~~~~~~~~~~~~~~v~ka~il~~C~~~~~~~~~~~~~mIlEl~~~~~~~~~~~Cl~~~~~~~~~~~~~~~yg~~~~--~~  204 (282)
T PF03761_consen  127 CCNCFSNGKCFSIKVKKAYILNGCKKIKKNFNRPYSPMILELEEDFSKNVSPPCLADSSTNWEKGDEVDVYGFNST--GK  204 (282)
T ss_pred             eecccccCCcccceeEEEEEEecCCCcccccccccceEEEEEcccccccCCCEEeCCCccccccCceEEEeecCCC--Ce
Confidence                0111     122344442 2       23579999999987  677888887643 4567899999888211  12


Q ss_pred             eEEEEEEeeecCccCCCCCCccceEEEEcccCCCCCccceee-ecCC--CEEEEEEEEeecC-CCeEEEEeHHHHHH
Q 013444          253 VTAGIVSCVDRKSSDLGLGGMRREYLQTDCAINAGNSGGPLV-NIDG--EIVGINIMKVAAA-DGLSFAVPIDSAAK  325 (443)
Q Consensus       253 ~t~G~Vs~~~~~~~~~~~~~~~~~~i~~d~~i~~G~SGGPlv-d~~G--~VVGI~s~~~~~~-~g~~~aIPi~~i~~  325 (443)
                      +....+.-.....        ....+......+.|++|||++ +.+|  .||||.+.+.... ....+++.+..+++
T Consensus       205 ~~~~~~~i~~~~~--------~~~~~~~~~~~~~~d~Gg~lv~~~~gr~tlIGv~~~~~~~~~~~~~~f~~v~~~~~  273 (282)
T PF03761_consen  205 LKHRKLKITNCTK--------CAYSICTKQYSCKGDRGGPLVKNINGRWTLIGVGASGNYECNKNNSYFFNVSWYQD  273 (282)
T ss_pred             EEEEEEEEEEeec--------cceeEecccccCCCCccCeEEEEECCCEEEEEEEccCCCcccccccEEEEHHHhhh
Confidence            2222222111110        233455566778999999998 3334  5888876543221 12456677665544


No 52 
>PF05579 Peptidase_S32:  Equine arteritis virus serine endopeptidase S32;  InterPro: IPR008760 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. 
They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This group of serine peptidases belongs to MEROPS peptidase family S32 (clan PA(S)). The type example is equine arteritis virus serine endopeptidase (equine arteritis virus), which is involved in processing of nidovirus polyproteins [].; GO: 0004252 serine-type endopeptidase activity, 0016032 viral reproduction, 0019082 viral protein processing; PDB: 3FAN_A 3FAO_A 1MBM_A.
Probab=97.44  E-value=0.00097  Score=63.24  Aligned_cols=116  Identities=23%  Similarity=0.386  Sum_probs=60.7

Q ss_pred             CCcEEEEEEEeCCC--EEEeccccccCCCCCCCCCCceEEEEeCCCcEEEEEEEeecCCCCEEEEEEc-CCCCCCccccC
Q 013444          153 GRGIGSGAIVDADG--TILTCAHVVVDFHGSRALPKGKVDVTLQDGRTFEGTVLNADFHSDIAIVKIN-SKTPLPAAKLG  229 (443)
Q Consensus       153 ~~~~GSGfiI~~~G--~ILTaaHvv~~~~~~~~~~~~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkl~-~~~~~~~~~l~  229 (443)
                      +...|||=++..+|  .|+|+.||+.+         ....|...+ ....   ..++..-|+|.-.++ -+...|..++.
T Consensus       110 Gss~Gsggvft~~~~~vvvTAtHVlg~---------~~a~v~~~g-~~~~---~tF~~~GDfA~~~~~~~~G~~P~~k~a  176 (297)
T PF05579_consen  110 GSSVGSGGVFTIGGNTVVVTATHVLGG---------NTARVSGVG-TRRM---LTFKKNGDFAEADITNWPGAAPKYKFA  176 (297)
T ss_dssp             SSSEEEEEEEECTTEEEEEEEHHHCBT---------TEEEEEETT-EEEE---EEEEEETTEEEEEETTS-S---B--B-
T ss_pred             eecccccceEEECCeEEEEEEEEEcCC---------CeEEEEecc-eEEE---EEEeccCcEEEEECCCCCCCCCceeec
Confidence            34556666665554  89999999985         344454433 2222   234455699999994 34566777665


Q ss_pred             CCCCCCCCCEEEEEecCCCCCCceEEEEEEeeecCccCCCCCCccceEEEEcccCCCCCccceeeecCCCEEEEEEEE
Q 013444          230 TSSKLCPGDWVVAMGCPHSLQNTVTAGIVSCVDRKSSDLGLGGMRREYLQTDCAINAGNSGGPLVNIDGEIVGINIMK  307 (443)
Q Consensus       230 ~s~~~~~G~~V~~iG~p~~~~~~~t~G~Vs~~~~~~~~~~~~~~~~~~i~~d~~i~~G~SGGPlvd~~G~VVGI~s~~  307 (443)
                      ..   ..|---+.-      +..+..|.|..              ...+   |-..+||||+|+++.+|.+|||++..
T Consensus       177 ~~---~~GrAyW~t------~tGvE~G~ig~--------------~~~~---~fT~~GDSGSPVVt~dg~liGVHTGS  228 (297)
T PF05579_consen  177 QN---YTGRAYWLT------STGVEPGFIGG--------------GGAV---CFTGPGDSGSPVVTEDGDLIGVHTGS  228 (297)
T ss_dssp             TT----SEEEEEEE------TTEEEEEEEET--------------TEEE---ESS-GGCTT-EEEETTC-EEEEEEEE
T ss_pred             CC---cccceEEEc------ccCcccceecC--------------ceEE---EEcCCCCCCCccCcCCCCEEEEEecC
Confidence            21   112211110      11233444321              0111   33457999999999999999999874


No 53 
>KOG3553 consensus Tax interaction protein TIP1 [Cell wall/membrane/envelope biogenesis]
Probab=97.41  E-value=0.00015  Score=58.18  Aligned_cols=36  Identities=31%  Similarity=0.538  Sum_probs=32.8

Q ss_pred             CCCCceEEeEECCCCccccCCCCCCCEEEEECCEee
Q 013444          365 NVKSGVLVPVVTPGSPAHLAGFLPSDVVIKFDGKPV  400 (443)
Q Consensus       365 ~~~~g~~V~~v~~~spA~~aGl~~GDiI~~vng~~V  400 (443)
                      -+..|++|++|.++|||+.|||+.+|.|+.+||...
T Consensus        56 ytD~GiYvT~V~eGsPA~~AGLrihDKIlQvNG~Df   91 (124)
T KOG3553|consen   56 YTDKGIYVTRVSEGSPAEIAGLRIHDKILQVNGWDF   91 (124)
T ss_pred             cCCccEEEEEeccCChhhhhcceecceEEEecCcee
Confidence            345799999999999999999999999999999765


No 54 
>COG3031 PulC Type II secretory pathway, component PulC [Intracellular trafficking and secretion]
Probab=97.35  E-value=0.0004  Score=64.70  Aligned_cols=67  Identities=15%  Similarity=0.193  Sum_probs=56.5

Q ss_pred             CceEEeEECCCCccccCCCCCCCEEEEECCEeeCCHHHHHHHHhc-CCCCeEEEEEEECCCeEEEEEEE
Q 013444          368 SGVLVPVVTPGSPAHLAGFLPSDVVIKFDGKPVQSITEIIEIMGD-RVGEPLKVVVQRANDQLVTLTVI  435 (443)
Q Consensus       368 ~g~~V~~v~~~spA~~aGl~~GDiI~~vng~~V~s~~dl~~~l~~-~~g~~v~l~v~R~~g~~~~l~v~  435 (443)
                      .|..+.-..+.+..+..|||.||+.+++|+..+++.+++..+|+. ...+.++++|+| +|+...+.+.
T Consensus       207 ~Gyr~~pgkd~slF~~sglq~GDIavaiNnldltdp~~m~~llq~l~~m~s~qlTv~R-~G~rhdInV~  274 (275)
T COG3031         207 EGYRFEPGKDGSLFYKSGLQRGDIAVAINNLDLTDPEDMFRLLQMLRNMPSLQLTVIR-RGKRHDINVR  274 (275)
T ss_pred             EEEEecCCCCcchhhhhcCCCcceEEEecCcccCCHHHHHHHHHhhhcCcceEEEEEe-cCccceeeec
Confidence            355555566678888899999999999999999999999998877 556789999999 8998887764


No 55 
>PF05580 Peptidase_S55:  SpoIVB peptidase S55;  InterPro: IPR008763 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This group of serine peptidases belongs to the MEROPS peptidase family S55 (SpoIVB peptidase family, clan PA(S)). The protein SpoIVB plays a key role in signalling in the final sigma-K checkpoint of Bacillus subtilis [, ].
Probab=96.84  E-value=0.02  Score=53.18  Aligned_cols=160  Identities=18%  Similarity=0.235  Sum_probs=84.8

Q ss_pred             CcEEEEEEEeCC-CEEEeccccccCCCCCCCCCCceEEEEeCCCcEEEEEEEeecCC----------------CCEEEEE
Q 013444          154 RGIGSGAIVDAD-GTILTCAHVVVDFHGSRALPKGKVDVTLQDGRTFEGTVLNADFH----------------SDIAIVK  216 (443)
Q Consensus       154 ~~~GSGfiI~~~-G~ILTaaHvv~~~~~~~~~~~~~i~V~~~dg~~~~a~vv~~d~~----------------~DlAlLk  216 (443)
                      .+.||=.+++++ +..--=.|.+.+.+       ....+.+.+|+.+++.+....+.                .-+.-+.
T Consensus        19 aGiGTlTf~dp~~~~fgALGH~I~D~d-------t~~~~~i~~G~I~~a~I~~I~kg~~G~PGe~~G~~~~~~~~~G~I~   91 (218)
T PF05580_consen   19 AGIGTLTFYDPETGTFGALGHGISDVD-------TGQLIPIKNGEIYEASITSIKKGKKGQPGEKIGVFDNESNILGTIE   91 (218)
T ss_pred             cCeEEEEEEECCCCcEEecCCeEEcCC-------CCceeEecCCEEEEEEEEEEecCCCcCCceEEEEECCCCceEEEEE
Confidence            367888999874 55555678887754       23345667788887776654321                1111222


Q ss_pred             EcC----------C-----CCCCccccCCCCCCCCCCEEEEEecCCCCCCceEEEEEEeeecCccCCCCC----CccceE
Q 013444          217 INS----------K-----TPLPAAKLGTSSKLCPGDWVVAMGCPHSLQNTVTAGIVSCVDRKSSDLGLG----GMRREY  277 (443)
Q Consensus       217 l~~----------~-----~~~~~~~l~~s~~~~~G~~V~~iG~p~~~~~~~t~G~Vs~~~~~~~~~~~~----~~~~~~  277 (443)
                      -+.          .     ...++++++...++++|..-+..=.. +.....-.=.|..+.........+    -...++
T Consensus        92 ~Nt~~GI~G~~~~~~~~~~~~~~~~pva~~~evk~G~A~i~Tv~~-G~~ie~f~ieI~~v~~~~~~~~k~~vi~vtd~~L  170 (218)
T PF05580_consen   92 KNTQFGIYGTLDQDDISNPSYNEPIPVAPKQEVKPGPAYILTVID-GTKIEEFDIEIEKVLPQSSPSGKGMVIKVTDPRL  170 (218)
T ss_pred             eccccceeEEeccccccccccCceeEEEEHHHceEccEEEEEEEc-CCeEEEeEEEEEEEccCCCCCCCcEEEEECCcch
Confidence            111          1     12234444445566666532211010 100000011111222211100000    011123


Q ss_pred             EEEcccCCCCCccceeeecCCCEEEEEEEEeecCCCeEEEEeHHH
Q 013444          278 LQTDCAINAGNSGGPLVNIDGEIVGINIMKVAAADGLSFAVPIDS  322 (443)
Q Consensus       278 i~~d~~i~~G~SGGPlvd~~G~VVGI~s~~~~~~~g~~~aIPi~~  322 (443)
                      +.....+-+|+||+|++ .+|++||=++..+.+....+|.++++.
T Consensus       171 l~~TGGIvqGMSGSPI~-qdGKLiGAVthvf~~dp~~Gygi~ie~  214 (218)
T PF05580_consen  171 LEKTGGIVQGMSGSPII-QDGKLIGAVTHVFVNDPTKGYGIFIEW  214 (218)
T ss_pred             hhhhCCEEecccCCCEE-ECCEEEEEEEEEEecCCCceeeecHHH
Confidence            33445577899999999 699999999988877788999999765


No 56 
>KOG3580 consensus Tight junction proteins [Signal transduction mechanisms]
Probab=96.68  E-value=0.0022  Score=66.76  Aligned_cols=59  Identities=24%  Similarity=0.430  Sum_probs=47.3

Q ss_pred             CCCceEEeEECCCCccccCCCCCCCEEEEECCEeeCCH--HH-HHHHHhcCCCCeEEEEEEE
Q 013444          366 VKSGVLVPVVTPGSPAHLAGFLPSDVVIKFDGKPVQSI--TE-IIEIMGDRVGEPLKVVVQR  424 (443)
Q Consensus       366 ~~~g~~V~~v~~~spA~~aGl~~GDiI~~vng~~V~s~--~d-l~~~l~~~~g~~v~l~v~R  424 (443)
                      ...|++|..|.++|||++-||+.||+|++||.++..+.  ++ +.-+|.-..|+.++|.-.+
T Consensus       427 NDVGIFVaGvqegspA~~eGlqEGDQIL~VN~vdF~nl~REeAVlfLL~lPkGEevtilaQ~  488 (1027)
T KOG3580|consen  427 NDVGIFVAGVQEGSPAEQEGLQEGDQILKVNTVDFRNLVREEAVLFLLELPKGEEVTILAQS  488 (1027)
T ss_pred             CceeEEEeecccCCchhhccccccceeEEeccccchhhhHHHHHHHHhcCCCCcEEeehhhh
Confidence            35699999999999999999999999999999998775  23 3334455788888875543


No 57 
>KOG3532 consensus Predicted protein kinase [General function prediction only]
Probab=96.48  E-value=0.006  Score=64.46  Aligned_cols=57  Identities=30%  Similarity=0.471  Sum_probs=47.9

Q ss_pred             CCceEEeEECCCCccccCCCCCCCEEEEECCEeeCCHHHHHHHHhcCCCCeEEEEEEE
Q 013444          367 KSGVLVPVVTPGSPAHLAGFLPSDVVIKFDGKPVQSITEIIEIMGDRVGEPLKVVVQR  424 (443)
Q Consensus       367 ~~g~~V~~v~~~spA~~aGl~~GDiI~~vng~~V~s~~dl~~~l~~~~g~~v~l~v~R  424 (443)
                      ..-+.|..|.++++|.++.|++||++++|||.+|.+..+..+.++...|+ +...++|
T Consensus       397 ~~~v~v~tv~~ns~a~k~~~~~gdvlvai~~~pi~s~~q~~~~~~s~~~~-~~~l~~~  453 (1051)
T KOG3532|consen  397 NRAVKVCTVEDNSLADKAAFKPGDVLVAINNVPIRSERQATRFLQSTTGD-LTVLVER  453 (1051)
T ss_pred             ceEEEEEEecCCChhhHhcCCCcceEEEecCccchhHHHHHHHHHhcccc-eEEEEee
Confidence            45677899999999999999999999999999999999999998876665 3333333


No 58 
>PF10459 Peptidase_S46:  Peptidase S46;  InterPro: IPR019500 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This entry represents S46 peptidases, where dipeptidyl-peptidase 7 (DPP-7) is the best-characterised member of this family. It is a serine peptidase that is located on the cell surface and is predicted to have two N-terminal transmembrane domains.
Probab=96.39  E-value=0.015  Score=63.62  Aligned_cols=24  Identities=38%  Similarity=0.440  Sum_probs=21.1

Q ss_pred             cEEEEEEEeCCCEEEeccccccCC
Q 013444          155 GIGSGAIVDADGTILTCAHVVVDF  178 (443)
Q Consensus       155 ~~GSGfiI~~~G~ILTaaHvv~~~  178 (443)
                      +-|||-||+++|+||||.||..++
T Consensus        47 gGCSgsfVS~~GLvlTNHHC~~~~   70 (698)
T PF10459_consen   47 GGCSGSFVSPDGLVLTNHHCGYGA   70 (698)
T ss_pred             CceeEEEEcCCceEEecchhhhhH
Confidence            349999999999999999998653


No 59 
>PF08192 Peptidase_S64:  Peptidase family S64;  InterPro: IPR012985 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This family of fungal proteins is involved in the processing of membrane-bound transcription factor Stp1 [] and belongs to MEROPS peptidase family S64 (clan PA). The processing causes the signalling domain of Stp1 to be passed to the nucleus where several permease genes are induced. The permeases are important for uptake of amino acids, and processing of Stp1 only occurs in an amino acid-rich environment. This family is predicted to be distantly related to the trypsin family (MEROPS peptidase family S1) and to have a typical trypsin-like catalytic triad [].
Probab=96.28  E-value=0.042  Score=58.65  Aligned_cols=117  Identities=22%  Similarity=0.325  Sum_probs=72.9

Q ss_pred             CCCCEEEEEEcCC--------CCC------CccccCC------CCCCCCCCEEEEEecCCCCCCceEEEEEEeeecCccC
Q 013444          208 FHSDIAIVKINSK--------TPL------PAAKLGT------SSKLCPGDWVVAMGCPHSLQNTVTAGIVSCVDRKSSD  267 (443)
Q Consensus       208 ~~~DlAlLkl~~~--------~~~------~~~~l~~------s~~~~~G~~V~~iG~p~~~~~~~t~G~Vs~~~~~~~~  267 (443)
                      .-.|+|||+++..        +++      |.+.+.+      ...+.+|..|+-+|...+    .|.|++.+.....  
T Consensus       541 ~LsD~AIIkV~~~~~~~N~LGddi~f~~~dP~l~f~NlyV~~~~~~~~~G~~VfK~GrTTg----yT~G~lNg~klvy--  614 (695)
T PF08192_consen  541 RLSDWAIIKVNKERKCQNYLGDDIQFNEPDPTLMFQNLYVREVVSNLVPGMEVFKVGRTTG----YTTGILNGIKLVY--  614 (695)
T ss_pred             cccceEEEEeCCCceecCCCCccccccCCCccccccccchhhhhhccCCCCeEEEecccCC----ccceEecceEEEE--
Confidence            3469999999853        111      2222321      134677999999998766    4566665543221  


Q ss_pred             CCCCCc-cceEEEEc----ccCCCCCccceeeecCCC------EEEEEEEEeecCCCeEEEEeHHHHHHHHHHH
Q 013444          268 LGLGGM-RREYLQTD----CAINAGNSGGPLVNIDGE------IVGINIMKVAAADGLSFAVPIDSAAKIIEQF  330 (443)
Q Consensus       268 ~~~~~~-~~~~i~~d----~~i~~G~SGGPlvd~~G~------VVGI~s~~~~~~~g~~~aIPi~~i~~~l~~l  330 (443)
                      +..+.. ..+++...    .-...|+||+=|++.-+.      |+||.+..-....+++++.|+..|.+-|++.
T Consensus       615 w~dG~i~s~efvV~s~~~~~Fa~~GDSGS~VLtk~~d~~~gLgvvGMlhsydge~kqfglftPi~~il~rl~~v  688 (695)
T PF08192_consen  615 WADGKIQSSEFVVSSDNNPAFASGGDSGSWVLTKLEDNNKGLGVVGMLHSYDGEQKQFGLFTPINEILDRLEEV  688 (695)
T ss_pred             ecCCCeEEEEEEEecCCCccccCCCCcccEEEecccccccCceeeEEeeecCCccceeeccCcHHHHHHHHHHh
Confidence            111111 12333333    223679999999986444      9999988666667899999998877766654


No 60 
>PF10459 Peptidase_S46:  Peptidase S46;  InterPro: IPR019500 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This entry represents S46 peptidases, where dipeptidyl-peptidase 7 (DPP-7) is the best-characterised member of this family. It is a serine peptidase that is located on the cell surface and is predicted to have two N-terminal transmembrane domains.
Probab=96.20  E-value=0.0075  Score=65.98  Aligned_cols=56  Identities=23%  Similarity=0.315  Sum_probs=43.4

Q ss_pred             eEEEEcccCCCCCccceeeecCCCEEEEEEEEee----------cCCCeEEEEeHHHHHHHHHHHH
Q 013444          276 EYLQTDCAINAGNSGGPLVNIDGEIVGINIMKVA----------AADGLSFAVPIDSAAKIIEQFK  331 (443)
Q Consensus       276 ~~i~~d~~i~~G~SGGPlvd~~G~VVGI~s~~~~----------~~~g~~~aIPi~~i~~~l~~l~  331 (443)
                      -.+.++..+..||||+|++|.+|+|||+++-+..          .....+..|-+..|.-+|+++-
T Consensus       622 v~FlstnDitGGNSGSPvlN~~GeLVGl~FDgn~Esl~~D~~fdp~~~R~I~VDiRyvL~~ldkv~  687 (698)
T PF10459_consen  622 VNFLSTNDITGGNSGSPVLNAKGELVGLAFDGNWESLSGDIAFDPELNRTIHVDIRYVLWALDKVY  687 (698)
T ss_pred             eEEEeccCcCCCCCCCccCCCCceEEEEeecCchhhcccccccccccceeEEEEHHHHHHHHHHHh
Confidence            3567888999999999999999999999986532          1224566777777888887764


No 61 
>PF00548 Peptidase_C3:  3C cysteine protease (picornain 3C);  InterPro: IPR000199 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. 
Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad [].  This signature defines cysteine peptidases that belong to MEROPS peptidase family C3 (picornain, clan PA(C)), subfamilies C3A and C3B. The protein fold of this peptidase domain for members of this family resembles that of the serine peptidase, chymotrypsin [], the type example for clan PA. Picornaviral proteins are expressed as a single polyprotein which is cleaved by the viral C3 cysteine protease. The poliovirus polyprotein is selectively cleaved between the Gln-|-Gly bond. In other picornavirus reactions Glu may be substituted for Gln, and Ser or Thr for Gly.; GO: 0004197 cysteine-type endopeptidase activity, 0006508 proteolysis; PDB: 3SJO_E 2H6M_A 1QA7_C 1HAV_B 2HAL_A 2H9H_A 3QZQ_B 3QZR_A 3R0F_B 3SJ9_A ....
Probab=96.14  E-value=0.32  Score=44.15  Aligned_cols=149  Identities=18%  Similarity=0.230  Sum_probs=82.4

Q ss_pred             CCceEEEEecccccccccCCcEEEEEEEeCCCEEEeccccccCCCCCCCCCCceEEEEeCCCcEEEE--EEEeecC---C
Q 013444          135 CPAVVNLSAPREFLGILSGRGIGSGAIVDADGTILTCAHVVVDFHGSRALPKGKVDVTLQDGRTFEG--TVLNADF---H  209 (443)
Q Consensus       135 ~pSVV~I~~~~~~~~~~~~~~~GSGfiI~~~G~ILTaaHvv~~~~~~~~~~~~~i~V~~~dg~~~~a--~vv~~d~---~  209 (443)
                      ..-++.|.+.       ++...++++.|..+ ++|...|.-..         ..+.+   +|+.++.  .+...+.   .
T Consensus        12 ~~N~~~v~~~-------~g~~t~l~~gi~~~-~~lvp~H~~~~---------~~i~i---~g~~~~~~d~~~lv~~~~~~   71 (172)
T PF00548_consen   12 KKNVVPVTTG-------KGEFTMLALGIYDR-YFLVPTHEEPE---------DTIYI---DGVEYKVDDSVVLVDRDGVD   71 (172)
T ss_dssp             HHHEEEEEET-------TEEEEEEEEEEEBT-EEEEEGGGGGC---------SEEEE---TTEEEEEEEEEEEEETTSSE
T ss_pred             hccEEEEEeC-------CceEEEecceEeee-EEEEECcCCCc---------EEEEE---CCEEEEeeeeEEEecCCCcc
Confidence            3456666662       23466888889876 99999992221         23433   3555543  2223343   4


Q ss_pred             CCEEEEEEcCCCCCCccccCCCCCC-CCCCEEEEEecCCCCC-CceEEEEEEeeecCccCCCCCCccceEEEEcccCCCC
Q 013444          210 SDIAIVKINSKTPLPAAKLGTSSKL-CPGDWVVAMGCPHSLQ-NTVTAGIVSCVDRKSSDLGLGGMRREYLQTDCAINAG  287 (443)
Q Consensus       210 ~DlAlLkl~~~~~~~~~~l~~s~~~-~~G~~V~~iG~p~~~~-~~~t~G~Vs~~~~~~~~~~~~~~~~~~i~~d~~i~~G  287 (443)
                      .||++++++....++-+.---...+ ...+...++ +..... .....+.++.......   .+......+..+++..+|
T Consensus        72 ~Dl~~v~l~~~~kfrDIrk~~~~~~~~~~~~~l~v-~~~~~~~~~~~v~~v~~~~~i~~---~g~~~~~~~~Y~~~t~~G  147 (172)
T PF00548_consen   72 TDLTLVKLPRNPKFRDIRKFFPESIPEYPECVLLV-NSTKFPRMIVEVGFVTNFGFINL---SGTTTPRSLKYKAPTKPG  147 (172)
T ss_dssp             EEEEEEEEESSS-B--GGGGSBSSGGTEEEEEEEE-ESSSSTCEEEEEEEEEEEEEEEE---TTEEEEEEEEEESEEETT
T ss_pred             eeEEEEEccCCcccCchhhhhccccccCCCcEEEE-ECCCCccEEEEEEEEeecCcccc---CCCEeeEEEEEccCCCCC
Confidence            6999999987544432221011122 223333333 333333 2334444443333211   112234578888889999


Q ss_pred             Cccceeeec---CCCEEEEEEEE
Q 013444          288 NSGGPLVNI---DGEIVGINIMK  307 (443)
Q Consensus       288 ~SGGPlvd~---~G~VVGI~s~~  307 (443)
                      +-||||+..   .++++||+.++
T Consensus       148 ~CG~~l~~~~~~~~~i~GiHvaG  170 (172)
T PF00548_consen  148 MCGSPLVSRIGGQGKIIGIHVAG  170 (172)
T ss_dssp             GTTEEEEESCGGTTEEEEEEEEE
T ss_pred             ccCCeEEEeeccCccEEEEEecc
Confidence            999999942   47999999875


No 62 
>KOG3550 consensus Receptor targeting protein Lin-7 [Extracellular structures]
Probab=95.86  E-value=0.03  Score=48.66  Aligned_cols=58  Identities=24%  Similarity=0.466  Sum_probs=44.5

Q ss_pred             CCCCceEEeEECCCCccccC-CCCCCCEEEEECCEeeCCH--HHHHHHHhcCCCCeEEEEEE
Q 013444          365 NVKSGVLVPVVTPGSPAHLA-GFLPSDVVIKFDGKPVQSI--TEIIEIMGDRVGEPLKVVVQ  423 (443)
Q Consensus       365 ~~~~g~~V~~v~~~spA~~a-Gl~~GDiI~~vng~~V~s~--~dl~~~l~~~~g~~v~l~v~  423 (443)
                      .....++|+.|.|++.|++- ||+.||.+++|||..|..-  +...++|+...| .+++.|.
T Consensus       112 eqnspiyisriipggvadrhgglkrgdqllsvngvsvege~hekavellkaa~g-svklvvr  172 (207)
T KOG3550|consen  112 EQNSPIYISRIIPGGVADRHGGLKRGDQLLSVNGVSVEGEHHEKAVELLKAAVG-SVKLVVR  172 (207)
T ss_pred             ccCCceEEEeecCCccccccCcccccceeEeecceeecchhhHHHHHHHHHhcC-cEEEEEe
Confidence            34568999999999999986 7999999999999998643  344566665555 4666654


No 63 
>KOG3209 consensus WW domain-containing protein [General function prediction only]
Probab=95.45  E-value=0.023  Score=60.62  Aligned_cols=52  Identities=19%  Similarity=0.489  Sum_probs=43.1

Q ss_pred             EeEECCCCccccCC-CCCCCEEEEECCEeeCCH--HHHHHHHhcCCCCeEEEEEEE
Q 013444          372 VPVVTPGSPAHLAG-FLPSDVVIKFDGKPVQSI--TEIIEIMGDRVGEPLKVVVQR  424 (443)
Q Consensus       372 V~~v~~~spA~~aG-l~~GDiI~~vng~~V~s~--~dl~~~l~~~~g~~v~l~v~R  424 (443)
                      |..|.++|||++.| |+.||.|++|||+.|.+.  .|+..++++ .|-.|+|+|.-
T Consensus       782 iGrIieGSPAdRCgkLkVGDrilAVNG~sI~~lsHadiv~LIKd-aGlsVtLtIip  836 (984)
T KOG3209|consen  782 IGRIIEGSPADRCGKLKVGDRILAVNGQSILNLSHADIVSLIKD-AGLSVTLTIIP  836 (984)
T ss_pred             ccccccCChhHhhccccccceEEEecCeeeeccCchhHHHHHHh-cCceEEEEEcC
Confidence            67789999999975 999999999999999754  566776665 57789998875


No 64 
>COG0750 Predicted membrane-associated Zn-dependent proteases 1 [Cell envelope biogenesis, outer membrane]
Probab=95.43  E-value=0.056  Score=55.05  Aligned_cols=56  Identities=30%  Similarity=0.520  Sum_probs=47.9

Q ss_pred             EECCCCccccCCCCCCCEEEEECCEeeCCHHHHHHHHhcCCCCe---EEEEEEECCCeE
Q 013444          374 VVTPGSPAHLAGFLPSDVVIKFDGKPVQSITEIIEIMGDRVGEP---LKVVVQRANDQL  429 (443)
Q Consensus       374 ~v~~~spA~~aGl~~GDiI~~vng~~V~s~~dl~~~l~~~~g~~---v~l~v~R~~g~~  429 (443)
                      .+..++++..+|+++||.|+++|++++.+|+++...+....+..   +.+.+.|-++..
T Consensus       135 ~v~~~s~a~~a~l~~Gd~iv~~~~~~i~~~~~~~~~~~~~~~~~~~~~~i~~~~~~~~~  193 (375)
T COG0750         135 EVAPKSAAALAGLRPGDRIVAVDGEKVASWDDVRRLLVAAAGDVFNLLTILVIRLDGEA  193 (375)
T ss_pred             ecCCCCHHHHcCCCCCCEEEeECCEEccCHHHHHHHHHhccCCcccceEEEEEecccee
Confidence            78999999999999999999999999999999998887766665   788888833333


No 65 
>PF00949 Peptidase_S7:  Peptidase S7, Flavivirus NS3 serine protease ;  InterPro: IPR001850 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. 
They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This signature identifies serine peptidases belonging to MEROPS peptidase family S7 (flavivirin family, clan PA(S)). The protein fold of the peptidase domain for members of this family resembles that of chymotrypsin, the type example for clan PA.  Flaviviruses produce a polyprotein from the ssRNA genome. The N terminus of the NS3 protein (approx. 180 aa) is required for the processing of the polyprotein. NS3 also has conserved homology with NTP-binding proteins and DEAD family of RNA helicase [, , ].; GO: 0003723 RNA binding, 0003724 RNA helicase activity, 0005524 ATP binding; PDB: 2IJO_B 3E90_D 2GGV_B 2FP7_B 2WV9_A 3U1I_B 3U1J_B 2WZQ_A 2WHX_A 3L6P_A ....
Probab=95.38  E-value=0.047  Score=47.13  Aligned_cols=33  Identities=36%  Similarity=0.555  Sum_probs=23.3

Q ss_pred             EEcccCCCCCccceeeecCCCEEEEEEEEeecC
Q 013444          279 QTDCAINAGNSGGPLVNIDGEIVGINIMKVAAA  311 (443)
Q Consensus       279 ~~d~~i~~G~SGGPlvd~~G~VVGI~s~~~~~~  311 (443)
                      ..+..+.+|.||+|+||.+|++|||...+..-.
T Consensus        89 ~~~~d~~~GsSGSpi~n~~g~ivGlYg~g~~~~  121 (132)
T PF00949_consen   89 AIDLDFPKGSSGSPIFNQNGEIVGLYGNGVEVG  121 (132)
T ss_dssp             EE---S-TTGTT-EEEETTSCEEEEEEEEEE-T
T ss_pred             eeecccCCCCCCCceEcCCCcEEEEEccceeec
Confidence            344557799999999999999999998876543


No 66 
>KOG3209 consensus WW domain-containing protein [General function prediction only]
Probab=95.01  E-value=0.067  Score=57.19  Aligned_cols=60  Identities=22%  Similarity=0.401  Sum_probs=49.6

Q ss_pred             CCCCceEEeEECCCCccccCC-CCCCCEEEEECCEeeC--CHHHHHHHHhc-CCCCeEEEEEEE
Q 013444          365 NVKSGVLVPVVTPGSPAHLAG-FLPSDVVIKFDGKPVQ--SITEIIEIMGD-RVGEPLKVVVQR  424 (443)
Q Consensus       365 ~~~~g~~V~~v~~~spA~~aG-l~~GDiI~~vng~~V~--s~~dl~~~l~~-~~g~~v~l~v~R  424 (443)
                      .....++|..|.+.+.|++.| |++||.|+.|||.+|.  +-.++..+|.. ..+..|.|+|.|
T Consensus       671 ep~qpi~iG~Iv~lGaAe~DGRL~~gDElv~iDG~pV~GksH~~vv~Lm~~AArnghV~LtVRR  734 (984)
T KOG3209|consen  671 EPGQPIYIGAIVPLGAAEEDGRLREGDELVCIDGIPVEGKSHSEVVDLMEAAARNGHVNLTVRR  734 (984)
T ss_pred             CCCCeeEEeeeeecccccccCcccCCCeEEEecCeeccCccHHHHHHHHHHHHhcCceEEEEee
Confidence            456789999999999999986 9999999999999995  66777777765 334468999987


No 67 
>KOG3552 consensus FERM domain protein FRM-8 [General function prediction only]
Probab=94.57  E-value=0.061  Score=58.99  Aligned_cols=55  Identities=24%  Similarity=0.465  Sum_probs=45.3

Q ss_pred             CceEEeEECCCCccccCCCCCCCEEEEECCEeeCC--HHHHHHHHhcCCCCeEEEEEEE
Q 013444          368 SGVLVPVVTPGSPAHLAGFLPSDVVIKFDGKPVQS--ITEIIEIMGDRVGEPLKVVVQR  424 (443)
Q Consensus       368 ~g~~V~~v~~~spA~~aGl~~GDiI~~vng~~V~s--~~dl~~~l~~~~g~~v~l~v~R  424 (443)
                      ..++|..|.+|+|+.-. |++||+|++|||++|++  |+.+.++++.. .+.|.|+|.+
T Consensus        75 rPviVr~VT~GGps~GK-L~PGDQIl~vN~Epv~daprervIdlvRac-e~sv~ltV~q  131 (1298)
T KOG3552|consen   75 RPVIVRFVTEGGPSIGK-LQPGDQILAVNGEPVKDAPRERVIDLVRAC-ESSVNLTVCQ  131 (1298)
T ss_pred             CceEEEEecCCCCcccc-ccCCCeEEEecCcccccccHHHHHHHHHHH-hhhcceEEec
Confidence            68999999999999865 99999999999999974  67777777653 2457788877


No 68 
>TIGR02860 spore_IV_B stage IV sporulation protein B. SpoIVB, the stage IV sporulation protein B of endospore-forming bacteria such as Bacillus subtilis, is a serine proteinase, expressed in the spore (rather than mother cell) compartment, that participates in a proteolytic activation cascade for Sigma-K. It appears to be universal among endospore-forming bacteria and occurs nowhere else.
Probab=94.40  E-value=0.34  Score=49.66  Aligned_cols=46  Identities=22%  Similarity=0.369  Sum_probs=37.2

Q ss_pred             EEEcccCCCCCccceeeecCCCEEEEEEEEeecCCCeEEEEeHHHHH
Q 013444          278 LQTDCAINAGNSGGPLVNIDGEIVGINIMKVAAADGLSFAVPIDSAA  324 (443)
Q Consensus       278 i~~d~~i~~G~SGGPlvd~~G~VVGI~s~~~~~~~g~~~aIPi~~i~  324 (443)
                      +.....+-+|+||+|++ .+|++||=++-.+-+.+..+|+|-++.-.
T Consensus       351 l~~tgGivqGMSGSPi~-q~gkliGAvtHVfvndpt~GYGi~ie~Ml  396 (402)
T TIGR02860       351 LEKTGGIVQGMSGSPII-QNGKVIGAVTHVFVNDPTSGYGVYIEWML  396 (402)
T ss_pred             hhHhCCEEecccCCCEE-ECCEEEEEEEEEEecCCCcceeehHHHHH
Confidence            33345677899999999 79999998888888888899999776543


No 69 
>KOG3571 consensus Dishevelled 3 and related proteins [General function prediction only]
Probab=94.40  E-value=0.079  Score=54.58  Aligned_cols=58  Identities=21%  Similarity=0.547  Sum_probs=43.5

Q ss_pred             CCCceEEeEECCCCccccCC-CCCCCEEEEECCEeeCCH--HHHHHHHhc---CCCCeEEEEEEE
Q 013444          366 VKSGVLVPVVTPGSPAHLAG-FLPSDVVIKFDGKPVQSI--TEIIEIMGD---RVGEPLKVVVQR  424 (443)
Q Consensus       366 ~~~g~~V~~v~~~spA~~aG-l~~GDiI~~vng~~V~s~--~dl~~~l~~---~~g~~v~l~v~R  424 (443)
                      ...|++|.+|.+++.-+..| |.+||.|++||.....++  +|....|.+   +.| +++++|-.
T Consensus       275 gDggIYVgsImkgGAVA~DGRIe~GDMiLQVNevsFENmSNd~AVrvLREaV~~~g-Pi~ltvAk  338 (626)
T KOG3571|consen  275 GDGGIYVGSIMKGGAVALDGRIEPGDMILQVNEVSFENMSNDQAVRVLREAVSRPG-PIKLTVAK  338 (626)
T ss_pred             CCCceEEeeeccCceeeccCccCccceEEEeeecchhhcCchHHHHHHHHHhccCC-CeEEEEee
Confidence            35799999999988776665 999999999999887654  344555554   333 58888876


No 70 
>PF09342 DUF1986:  Domain of unknown function (DUF1986);  InterPro: IPR015420 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This domain is found in serine endopeptidases belonging to MEROPS peptidase family S1A (clan PA). It is found in unusual mosaic proteins, which are encoded by the Drosophila nudel gene (see P98159 from SWISSPROT). Nudel is involved in defining embryonic dorsoventral polarity. Three proteases, ndl, gd and snk, process easter to create active easter. Active easter defines cell identities along the dorsal-ventral continuum by activating the spz ligand for the Tl receptor in the ventral region of the embryo. Nudel, pipe and windbeutel together trigger the protease cascade within the extraembryonic perivitelline compartment which induces dorsoventral polarity of the Drosophila embryo [].
Probab=94.25  E-value=0.57  Score=44.39  Aligned_cols=100  Identities=19%  Similarity=0.256  Sum_probs=66.3

Q ss_pred             CCceEEEEecccccccccCCcEEEEEEEeCCCEEEeccccccCCCCCCCCCCceEEEEeCCCcEEE------EEEEeec-
Q 013444          135 CPAVVNLSAPREFLGILSGRGIGSGAIVDADGTILTCAHVVVDFHGSRALPKGKVDVTLQDGRTFE------GTVLNAD-  207 (443)
Q Consensus       135 ~pSVV~I~~~~~~~~~~~~~~~GSGfiI~~~G~ILTaaHvv~~~~~~~~~~~~~i~V~~~dg~~~~------a~vv~~d-  207 (443)
                      .|-.+.|..        .|...|||++|+++ |||++..|+.+..-    ....+.+.+..++.+.      -++..+| 
T Consensus        16 WPWlA~IYv--------dG~~~CsgvLlD~~-WlLvsssCl~~I~L----~~~YvsallG~~Kt~~~v~Gp~EQI~rVD~   82 (267)
T PF09342_consen   16 WPWLADIYV--------DGRYWCSGVLLDPH-WLLVSSSCLRGISL----SHHYVSALLGGGKTYLSVDGPHEQISRVDC   82 (267)
T ss_pred             CcceeeEEE--------cCeEEEEEEEeccc-eEEEeccccCCccc----ccceEEEEecCcceecccCCChheEEEeee
Confidence            466666654        34578999999988 99999999987431    1256777787776544      1233333 


Q ss_pred             ----CCCCEEEEEEcCCCC----CCccccCC-CCCCCCCCEEEEEecCC
Q 013444          208 ----FHSDIAIVKINSKTP----LPAAKLGT-SSKLCPGDWVVAMGCPH  247 (443)
Q Consensus       208 ----~~~DlAlLkl~~~~~----~~~~~l~~-s~~~~~G~~V~~iG~p~  247 (443)
                          ++.+++||.++.+..    +.|.-+.. .......+.++++|.-.
T Consensus        83 ~~~V~~S~v~LLHL~~~~~fTr~VlP~flp~~~~~~~~~~~CVAVg~d~  131 (267)
T PF09342_consen   83 FKDVPESNVLLLHLEQPANFTRYVLPTFLPETSNENESDDECVAVGHDD  131 (267)
T ss_pred             eeeccccceeeeeecCcccceeeecccccccccCCCCCCCceEEEEccc
Confidence                578999999997632    23444432 23444556899999765


No 71 
>KOG3542 consensus cAMP-regulated guanine nucleotide exchange factor [Signal transduction mechanisms]
Probab=94.09  E-value=0.043  Score=58.19  Aligned_cols=56  Identities=27%  Similarity=0.457  Sum_probs=43.4

Q ss_pred             CCceEEeEECCCCccccCCCCCCCEEEEECCEeeCCHHHH--HHHHhcCCCCeEEEEEEE
Q 013444          367 KSGVLVPVVTPGSPAHLAGFLPSDVVIKFDGKPVQSITEI--IEIMGDRVGEPLKVVVQR  424 (443)
Q Consensus       367 ~~g~~V~~v~~~spA~~aGl~~GDiI~~vng~~V~s~~dl--~~~l~~~~g~~v~l~v~R  424 (443)
                      ..|++|.+|.|++.|++.|++.||.|++|||+...++.-.  .++|.+  ...+.|+|+.
T Consensus       561 GfgifV~~V~pgskAa~~GlKRgDqilEVNgQnfenis~~KA~eiLrn--nthLtltvKt  618 (1283)
T KOG3542|consen  561 GFGIFVAEVFPGSKAAREGLKRGDQILEVNGQNFENISAKKAEEILRN--NTHLTLTVKT  618 (1283)
T ss_pred             cceeEEeeecCCchHHHhhhhhhhhhhhccccchhhhhHHHHHHHhcC--CceEEEEEec
Confidence            4589999999999999999999999999999998776432  234443  2346666653


No 72 
>KOG3580 consensus Tight junction proteins [Signal transduction mechanisms]
Probab=94.01  E-value=0.14  Score=53.86  Aligned_cols=63  Identities=22%  Similarity=0.443  Sum_probs=48.0

Q ss_pred             CCCCCCCCceEEeEECCCCccccCCCCCCCEEEEECCEeeCCHHHHHHH-HhcCCCCeEEEEEEE
Q 013444          361 PSFPNVKSGVLVPVVTPGSPAHLAGFLPSDVVIKFDGKPVQSITEIIEI-MGDRVGEPLKVVVQR  424 (443)
Q Consensus       361 ~~~~~~~~g~~V~~v~~~spA~~aGl~~GDiI~~vng~~V~s~~dl~~~-l~~~~g~~v~l~v~R  424 (443)
                      +.|.+-...++|..|.|++||+-- |+.||.|+-|||....+......+ .-.+.|+...|+|+|
T Consensus        33 Phf~~getSiViSDVlpGGPAeG~-LQenDrvvMVNGvsMenv~haFAvQqLrksgK~A~ItvkR   96 (1027)
T KOG3580|consen   33 PHFENGETSIVISDVLPGGPAEGL-LQENDRVVMVNGVSMENVLHAFAVQQLRKSGKVAAITVKR   96 (1027)
T ss_pred             CCccCCceeEEEeeccCCCCcccc-cccCCeEEEEcCcchhhhHHHHHHHHHHhhccceeEEecc
Confidence            344444567999999999999976 999999999999988776544331 122567778899988


No 73 
>KOG3605 consensus Beta amyloid precursor-binding protein [General function prediction only]
Probab=93.98  E-value=0.096  Score=55.53  Aligned_cols=116  Identities=24%  Similarity=0.373  Sum_probs=72.2

Q ss_pred             CCccceee-----ecCCCEEEEEEEEeecCCCeEEEEeHHHHHHHHHHHHHcCceeeee---cCc-eeecccHHHHHHhh
Q 013444          287 GNSGGPLV-----NIDGEIVGINIMKVAAADGLSFAVPIDSAAKIIEQFKKNGRVVRPW---LGL-KMLDLNDMIIAQLK  357 (443)
Q Consensus       287 G~SGGPlv-----d~~G~VVGI~s~~~~~~~g~~~aIPi~~i~~~l~~l~~~g~v~rp~---lGi-~~~~~~~~~~~~l~  357 (443)
                      =++|||.-     |...+++.|+-...       ..+|.+..+.+++.+|+.-.++.-.   --+ ++.-.-++..-+|+
T Consensus       680 mm~~GpAarsgkLnIGDQiiaING~SL-------VGLPLstcQs~Ik~~KnQT~VkltiV~cpPV~~V~I~RPd~kyQLG  752 (829)
T KOG3605|consen  680 MMHGGPAARSGKLNIGDQIMSINGTSL-------VGLPLSTCQSIIKGLKNQTAVKLNIVSCPPVTTVLIRRPDLRYQLG  752 (829)
T ss_pred             cccCChhhhcCCccccceeEeecCcee-------ccccHHHHHHHHhcccccceEEEEEecCCCceEEEeecccchhhcc
Confidence            45666663     44445666653221       2499999999999998766553111   111 11112233333332


Q ss_pred             cCCCCCCCCCCceEEeEECCCCccccCCCCCCCEEEEECCEeeC--CHHHHHHHHhcCCCC
Q 013444          358 ERDPSFPNVKSGVLVPVVTPGSPAHLAGFLPSDVVIKFDGKPVQ--SITEIIEIMGDRVGE  416 (443)
Q Consensus       358 ~~~~~~~~~~~g~~V~~v~~~spA~~aGl~~GDiI~~vng~~V~--s~~dl~~~l~~~~g~  416 (443)
                      -      ...+|++- +...++.|++-|++.|-.|++|||+.|.  --+.+..+|...+|+
T Consensus       753 F------SVQNGiIC-SLlRGGIAERGGVRVGHRIIEINgQSVVA~pHekIV~lLs~aVGE  806 (829)
T KOG3605|consen  753 F------SVQNGIIC-SLLRGGIAERGGVRVGHRIIEINGQSVVATPHEKIVQLLSNAVGE  806 (829)
T ss_pred             c------eeeCcEee-hhhcccchhccCceeeeeEEEECCceEEeccHHHHHHHHHHhhhh
Confidence            2      34567754 4778999999999999999999999883  334566667666654


No 74 
>PF00944 Peptidase_S3:  Alphavirus core protein ;  InterPro: IPR000930 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. Togavirin, also known as Sindbis virus core endopeptidase, is a serine protease resident at the N terminus of the p130 polyprotein of togaviruses []. The endopeptidase signature identifies the peptidase as belonging to the MEROPS peptidase family S3 (togavirin family, clan PA(S)). The polyprotein also includes structural proteins for the nucleocapsid core and for the glycoprotein spikes []. Togavirin is only active while part of the polyprotein, cleavage at a Trp-Ser bond resulting in total lack of activity []. Mutagenesis studies have identified the location of the His-Asp-Ser catalytic triad, and X-ray studies have revealed the protein fold to be similar to that of chymotrypsin [, ].; GO: 0004252 serine-type endopeptidase activity, 0006508 proteolysis, 0016020 membrane; PDB: 2YEW_D 1EP5_A 3J0C_F 1EP6_C 1WYK_D 1DYL_A 1VCQ_B 1VCP_B 1LD4_D 1KXA_A ....
Probab=93.95  E-value=0.17  Score=43.38  Aligned_cols=33  Identities=21%  Similarity=0.352  Sum_probs=26.9

Q ss_pred             cccCCCCCccceeeecCCCEEEEEEEEeecCCC
Q 013444          281 DCAINAGNSGGPLVNIDGEIVGINIMKVAAADG  313 (443)
Q Consensus       281 d~~i~~G~SGGPlvd~~G~VVGI~s~~~~~~~g  313 (443)
                      ...-.+|+||-|++|..|+||||+..+..+...
T Consensus       100 ~g~g~~GDSGRpi~DNsGrVVaIVLGG~neG~R  132 (158)
T PF00944_consen  100 TGVGKPGDSGRPIFDNSGRVVAIVLGGANEGRR  132 (158)
T ss_dssp             TTS-STTSTTEEEESTTSBEEEEEEEEEEETTE
T ss_pred             cCCCCCCCCCCccCcCCCCEEEEEecCCCCCCc
Confidence            445679999999999999999999988766443


No 75 
>PF02122 Peptidase_S39:  Peptidase S39;  InterPro: IPR000382 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. ORF2 of Potato leafroll virus (PLrV) encodes a polyprotein which is translated following a -1 frameshift. The polyprotein has a putative linear arrangement of membrane anchor-VPg-peptidase-polymerase domains. The serine peptidase domain which is found in this group of sequences belongs to MEROPS peptidase family S39 (clan PA(S)). It is likely that the peptidase domain is involved in the cleavage of the polyprotein []. The nucleotide sequence for the RNA of PLrV has been determined [, ]. The sequence contains six large open reading frames (ORFs). The 5' coding region encodes two polypeptides of 28K and 70K, which overlap in different reading frames; it is suggested that the third ORF in the 5' block is translated by frameshift readthrough near the end of the 70K protein, yielding a 118K polypeptide []. 
Segments of the predicted amino acid sequences of these ORFs resemble those of known viral RNA polymerases, ATP-binding proteins and viral genome-linked proteins. The nucleotide sequence of the genomic RNA of Beet western yellows virus (BWYV) has been determined []. The sequence contains six long ORFs. A cluster of three of these ORFs, including the coat protein cistron, display extensive amino acid sequence similarity to corresponding ORFs of a second luteovirus: Barley yellow dwarf virus [].; GO: 0004252 serine-type endopeptidase activity, 0022415 viral reproductive process, 0016021 integral to membrane; PDB: 1ZYO_A.
Probab=93.95  E-value=0.31  Score=45.35  Aligned_cols=117  Identities=26%  Similarity=0.334  Sum_probs=47.4

Q ss_pred             EEEeccccccCCCCCCCCCCceEEEEeCCCcEEE---EEEEeecCCCCEEEEEEcCC----CCCCccccCCCCCCCCCCE
Q 013444          167 TILTCAHVVVDFHGSRALPKGKVDVTLQDGRTFE---GTVLNADFHSDIAIVKINSK----TPLPAAKLGTSSKLCPGDW  239 (443)
Q Consensus       167 ~ILTaaHvv~~~~~~~~~~~~~i~V~~~dg~~~~---a~vv~~d~~~DlAlLkl~~~----~~~~~~~l~~s~~~~~G~~  239 (443)
                      .++|+.||..+..        .+ ..+.+|+.++   -+.+..+...|++||+....    ...+.+.+.....+.    
T Consensus        43 ~L~ta~Hv~~~~~--------~~-~~~k~g~kipl~~f~~~~~~~~~D~~il~~P~n~~s~Lg~k~~~~~~~~~~~----  109 (203)
T PF02122_consen   43 ALLTARHVWSRPS--------KV-TSLKTGEKIPLAEFTDLLESRIADFVILRGPPNWESKLGVKAAQLSQNSQLA----  109 (203)
T ss_dssp             EEEE-HHHHTSSS------------EEETTEEEE--S-EEEEE-TTT-EEEEE--HHHHHHHT-----B----SEE----
T ss_pred             ceecccccCCCcc--------ce-eEcCCCCcccchhChhhhCCCccCEEEEecCcCHHHHhCcccccccchhhhC----
Confidence            8999999998732        22 2334444444   24555678899999999732    222333332221111    


Q ss_pred             EEEEecCCCCCCceEEEEEEeeecCccCCCCCCccceEEEEcccCCCCCccceeeecCCCEEEEEEEE
Q 013444          240 VVAMGCPHSLQNTVTAGIVSCVDRKSSDLGLGGMRREYLQTDCAINAGNSGGPLVNIDGEIVGINIMK  307 (443)
Q Consensus       240 V~~iG~p~~~~~~~t~G~Vs~~~~~~~~~~~~~~~~~~i~~d~~i~~G~SGGPlvd~~G~VVGI~s~~  307 (443)
                         -| +... .....+...+......     +....+...-+...+|.||.|+++.+ +++|++...
T Consensus       110 ---~g-~~~~-y~~~~~~~~~~sa~i~-----g~~~~~~~vls~T~~G~SGtp~y~g~-~vvGvH~G~  166 (203)
T PF02122_consen  110 ---KG-PVSF-YGFSSGEWPCSSAKIP-----GTEGKFASVLSNTSPGWSGTPYYSGK-NVVGVHTGS  166 (203)
T ss_dssp             ---EE-ESST-TSEEEEEEEEEE-S---------STTEEEE-----TT-TT-EEE-SS--EEEEEEEE
T ss_pred             ---CC-Ceee-eeecCCCceeccCccc-----cccCcCCceEcCCCCCCCCCCeEECC-CceEeecCc
Confidence               01 0000 1111211111111110     11123566778889999999999877 999999874


No 76 
>KOG3834 consensus Golgi reassembly stacking protein GRASP65, contains PDZ domain [Intracellular trafficking, secretion, and vesicular transport]
Probab=92.28  E-value=0.22  Score=50.61  Aligned_cols=70  Identities=23%  Similarity=0.434  Sum_probs=52.0

Q ss_pred             CCceEEeEECCCCccccCCCCCC-CEEEEECCEeeCCHHHHHH-HHhcCCCCeEEEEEEECCCe-EEEEEEEec
Q 013444          367 KSGVLVPVVTPGSPAHLAGFLPS-DVVIKFDGKPVQSITEIIE-IMGDRVGEPLKVVVQRANDQ-LVTLTVIPE  437 (443)
Q Consensus       367 ~~g~~V~~v~~~spA~~aGl~~G-DiI~~vng~~V~s~~dl~~-~l~~~~g~~v~l~v~R~~g~-~~~l~v~~~  437 (443)
                      ..|.-|.+|.++|+|.++||.+= |-|++|||..++..+|..+ .|+....+ |+++|...... ...+.|++.
T Consensus        14 teg~hvlkVqedSpa~~aglepffdFIvSI~g~rL~~dnd~Lk~llk~~sek-Vkltv~n~kt~~~R~v~I~ps   86 (462)
T KOG3834|consen   14 TEGYHVLKVQEDSPAHKAGLEPFFDFIVSINGIRLNKDNDTLKALLKANSEK-VKLTVYNSKTQEVRIVEIVPS   86 (462)
T ss_pred             ceeEEEEEeecCChHHhcCcchhhhhhheeCcccccCchHHHHHHHHhcccc-eEEEEEecccceeEEEEeccc
Confidence            46888999999999999999997 8999999999986666554 44545444 99999863222 344555554


No 77 
>KOG3606 consensus Cell polarity protein PAR6 [Signal transduction mechanisms]
Probab=91.52  E-value=0.51  Score=45.13  Aligned_cols=57  Identities=21%  Similarity=0.454  Sum_probs=45.7

Q ss_pred             CCceEEeEECCCCccccCC-CCCCCEEEEECCEee--CCHHHHHHHHhcCCCCeEEEEEEE
Q 013444          367 KSGVLVPVVTPGSPAHLAG-FLPSDVVIKFDGKPV--QSITEIIEIMGDRVGEPLKVVVQR  424 (443)
Q Consensus       367 ~~g~~V~~v~~~spA~~aG-l~~GDiI~~vng~~V--~s~~dl~~~l~~~~g~~v~l~v~R  424 (443)
                      ..|++|....|++-|+..| |...|.|++|||.+|  ++.+++.++|-.+. ..+.++|+-
T Consensus       193 vpGIFISRlVpGGLAeSTGLLaVnDEVlEVNGIEVaGKTLDQVTDMMvANs-hNLIiTVkP  252 (358)
T KOG3606|consen  193 VPGIFISRLVPGGLAESTGLLAVNDEVLEVNGIEVAGKTLDQVTDMMVANS-HNLIITVKP  252 (358)
T ss_pred             cCceEEEeecCCccccccceeeecceeEEEcCEEeccccHHHHHHHHhhcc-cceEEEecc
Confidence            3599999999999999998 567899999999999  58889988775422 236666654


No 78 
>KOG3549 consensus Syntrophins (type gamma) [Extracellular structures]
Probab=91.15  E-value=0.23  Score=49.00  Aligned_cols=56  Identities=18%  Similarity=0.436  Sum_probs=46.8

Q ss_pred             CCceEEeEECCCCccccCC-CCCCCEEEEECCEeeCC--HHHHHHHHhcCCCCeEEEEEE
Q 013444          367 KSGVLVPVVTPGSPAHLAG-FLPSDVVIKFDGKPVQS--ITEIIEIMGDRVGEPLKVVVQ  423 (443)
Q Consensus       367 ~~g~~V~~v~~~spA~~aG-l~~GDiI~~vng~~V~s--~~dl~~~l~~~~g~~v~l~v~  423 (443)
                      .-+++|..|.++-.|+..| |-.||-|++|||..|+.  -+|+..+|+ +.|+.|+|+|.
T Consensus        79 n~PvviSkI~kdQaAd~tG~LFvGDAilqvNGi~v~~c~HeevV~iLR-NAGdeVtlTV~  137 (505)
T KOG3549|consen   79 NLPVVISKIYKDQAADITGQLFVGDAILQVNGIYVTACPHEEVVNILR-NAGDEVTLTVK  137 (505)
T ss_pred             CccEEeehhhhhhhhhhcCceEeeeeeEEeccEEeecCChHHHHHHHH-hcCCEEEEEeH
Confidence            4589999999999999887 67999999999999974  467777666 46888888886


No 79 
>KOG3651 consensus Protein kinase C, alpha binding protein [Signal transduction mechanisms]
Probab=90.53  E-value=0.54  Score=45.63  Aligned_cols=55  Identities=16%  Similarity=0.287  Sum_probs=42.8

Q ss_pred             ceEEeEECCCCccccCC-CCCCCEEEEECCEeeCCH--HHHHHHHhcCCCCeEEEEEEE
Q 013444          369 GVLVPVVTPGSPAHLAG-FLPSDVVIKFDGKPVQSI--TEIIEIMGDRVGEPLKVVVQR  424 (443)
Q Consensus       369 g~~V~~v~~~spA~~aG-l~~GDiI~~vng~~V~s~--~dl~~~l~~~~g~~v~l~v~R  424 (443)
                      -++|..|..++||++.| ++.||.|++|||..|+.-  -++.++++...+ +|+|.+..
T Consensus        31 ClYiVQvFD~tPAa~dG~i~~GDEi~avNg~svKGktKveVAkmIQ~~~~-eV~IhyNK   88 (429)
T KOG3651|consen   31 CLYIVQVFDKTPAAKDGRIRCGDEIVAVNGISVKGKTKVEVAKMIQVSLN-EVKIHYNK   88 (429)
T ss_pred             eEEEEEeccCCchhccCccccCCeeEEecceeecCccHHHHHHHHHHhcc-ceEEEehh
Confidence            57888999999999986 999999999999999754  455666665443 57777643


No 80 
>KOG3834 consensus Golgi reassembly stacking protein GRASP65, contains PDZ domain [Intracellular trafficking, secretion, and vesicular transport]
Probab=90.47  E-value=0.45  Score=48.43  Aligned_cols=68  Identities=26%  Similarity=0.432  Sum_probs=54.4

Q ss_pred             EeEECCCCccccCCCC-CCCEEEEECCEeeCCHHHHHHHHhcCCCCeEEEEEEECCC-eEEEEEEEecCC
Q 013444          372 VPVVTPGSPAHLAGFL-PSDVVIKFDGKPVQSITEIIEIMGDRVGEPLKVVVQRAND-QLVTLTVIPEEA  439 (443)
Q Consensus       372 V~~v~~~spA~~aGl~-~GDiI~~vng~~V~s~~dl~~~l~~~~g~~v~l~v~R~~g-~~~~l~v~~~~~  439 (443)
                      |-+|.++|||+.|||+ -+|-|+-+-+..-...+|+..+|..+.++.+++.|+.-+. ...++++++..+
T Consensus       113 vl~V~p~SPaalAgl~~~~DYivG~~~~~~~~~eDl~~lIeshe~kpLklyVYN~D~d~~ReVti~pn~a  182 (462)
T KOG3834|consen  113 VLSVEPNSPAALAGLRPYTDYIVGIWDAVMHEEEDLFTLIESHEGKPLKLYVYNHDTDSCREVTITPNSA  182 (462)
T ss_pred             eeecCCCCHHHhcccccccceEecchhhhccchHHHHHHHHhccCCCcceeEeecCCCccceEEeecccc
Confidence            5679999999999999 5699999955556778899999998999999999987333 346777776643


No 81 
>KOG2921 consensus Intramembrane metalloprotease (sterol-regulatory element-binding protein (SREBP) protease) [Posttranslational modification, protein turnover, chaperones]
Probab=89.76  E-value=0.46  Score=47.85  Aligned_cols=50  Identities=30%  Similarity=0.462  Sum_probs=41.3

Q ss_pred             CCCCCCCceEEeEECCCCccccC-CCCCCCEEEEECCEeeCCHHHHHHHHh
Q 013444          362 SFPNVKSGVLVPVVTPGSPAHLA-GFLPSDVVIKFDGKPVQSITEIIEIMG  411 (443)
Q Consensus       362 ~~~~~~~g~~V~~v~~~spA~~a-Gl~~GDiI~~vng~~V~s~~dl~~~l~  411 (443)
                      .|+-...|+.|.+|...||+..- ||.+||+|+++||-+|++.+|-.+.++
T Consensus       214 Pfya~g~gV~Vtev~~~Spl~gprGL~vgdvitsldgcpV~~v~dW~ecl~  264 (484)
T KOG2921|consen  214 PFYAHGEGVTVTEVPSVSPLFGPRGLSVGDVITSLDGCPVHKVSDWLECLA  264 (484)
T ss_pred             hhhhcCceEEEEeccccCCCcCcccCCccceEEecCCcccCCHHHHHHHHH
Confidence            34455789999999999998653 999999999999999998877766554


No 82 
>KOG1892 consensus Actin filament-binding protein Afadin [Cytoskeleton]
Probab=89.29  E-value=0.55  Score=52.09  Aligned_cols=59  Identities=24%  Similarity=0.414  Sum_probs=46.2

Q ss_pred             CCCCceEEeEECCCCccccCC-CCCCCEEEEECCEeeCCHHH--HHHHHhcCCCCeEEEEEEE
Q 013444          365 NVKSGVLVPVVTPGSPAHLAG-FLPSDVVIKFDGKPVQSITE--IIEIMGDRVGEPLKVVVQR  424 (443)
Q Consensus       365 ~~~~g~~V~~v~~~spA~~aG-l~~GDiI~~vng~~V~s~~d--l~~~l~~~~g~~v~l~v~R  424 (443)
                      ..+-|++|.+|.+|++|+..| |+.||.+++|||+..-.+.+  ..+ +....|..|.+.|.+
T Consensus       957 q~klGIYvKsVV~GgaAd~DGRL~aGDQLLsVdG~SLiGisQErAA~-lmtrtg~vV~leVaK 1018 (1629)
T KOG1892|consen  957 QRKLGIYVKSVVEGGAADHDGRLEAGDQLLSVDGHSLIGISQERAAR-LMTRTGNVVHLEVAK 1018 (1629)
T ss_pred             ccccceEEEEeccCCccccccccccCceeeeecCcccccccHHHHHH-HHhccCCeEEEehhh
Confidence            345699999999999999876 99999999999998865543  333 333567788888865


No 83 
>KOG3551 consensus Syntrophins (type beta) [Extracellular structures]
Probab=88.84  E-value=0.48  Score=47.49  Aligned_cols=58  Identities=19%  Similarity=0.413  Sum_probs=44.3

Q ss_pred             CCCCceEEeEECCCCccccCC-CCCCCEEEEECCEeeCCH--HHHHHHHhcCCCCeEEEEEE
Q 013444          365 NVKSGVLVPVVTPGSPAHLAG-FLPSDVVIKFDGKPVQSI--TEIIEIMGDRVGEPLKVVVQ  423 (443)
Q Consensus       365 ~~~~g~~V~~v~~~spA~~aG-l~~GDiI~~vng~~V~s~--~dl~~~l~~~~g~~v~l~v~  423 (443)
                      +.+..++|++|.++-.|++.+ |..||.|++|||....+.  ++...+|+ +.|+.|.++|+
T Consensus       107 eNkMPIlISKIFkGlAADQt~aL~~gDaIlSVNG~dL~~AtHdeAVqaLK-raGkeV~levK  167 (506)
T KOG3551|consen  107 ENKMPILISKIFKGLAADQTGALFLGDAILSVNGEDLRDATHDEAVQALK-RAGKEVLLEVK  167 (506)
T ss_pred             ccCCceehhHhccccccccccceeeccEEEEecchhhhhcchHHHHHHHH-hhCceeeeeee
Confidence            446799999999999999985 999999999999988643  44455555 45676665553


No 84 
>KOG0609 consensus Calcium/calmodulin-dependent serine protein kinase/membrane-associated guanylate kinase [Signal transduction mechanisms]
Probab=85.85  E-value=1.7  Score=45.64  Aligned_cols=55  Identities=20%  Similarity=0.349  Sum_probs=46.1

Q ss_pred             ceEEeEECCCCccccCC-CCCCCEEEEECCEeeCC--HHHHHHHHhcCCCCeEEEEEEE
Q 013444          369 GVLVPVVTPGSPAHLAG-FLPSDVVIKFDGKPVQS--ITEIIEIMGDRVGEPLKVVVQR  424 (443)
Q Consensus       369 g~~V~~v~~~spA~~aG-l~~GDiI~~vng~~V~s--~~dl~~~l~~~~g~~v~l~v~R  424 (443)
                      -++|..|..|+-+.+.| |..||.|.+|||..|.+  ..+++++|.+..| .+++.+.-
T Consensus       147 ~~~vARI~~GG~~~r~glL~~GD~i~EvNGi~v~~~~~~e~q~~l~~~~G-~itfkiiP  204 (542)
T KOG0609|consen  147 KVVVARIMHGGMADRQGLLHVGDEILEVNGISVANKSPEELQELLRNSRG-SITFKIIP  204 (542)
T ss_pred             ccEEeeeccCCcchhccceeeccchheecCeecccCCHHHHHHHHHhCCC-cEEEEEcc
Confidence            68899999999998887 89999999999999964  6889998888665 57777754


No 85 
>KOG0606 consensus Microtubule-associated serine/threonine kinase and related proteins [Signal transduction mechanisms; General function prediction only]
Probab=85.63  E-value=1.3  Score=50.35  Aligned_cols=52  Identities=31%  Similarity=0.544  Sum_probs=39.2

Q ss_pred             eEEeEECCCCccccCCCCCCCEEEEECCEeeCCH--HHHHHHHhcCCCCeEEEEE
Q 013444          370 VLVPVVTPGSPAHLAGFLPSDVVIKFDGKPVQSI--TEIIEIMGDRVGEPLKVVV  422 (443)
Q Consensus       370 ~~V~~v~~~spA~~aGl~~GDiI~~vng~~V~s~--~dl~~~l~~~~g~~v~l~v  422 (443)
                      =+|..|.++|||..+|++.||.|+.+||++|...  .++.++|.+ .|.++.+.+
T Consensus       660 h~v~sv~egsPA~~agls~~DlIthvnge~v~gl~H~ev~~Lll~-~gn~v~~~t  713 (1205)
T KOG0606|consen  660 HSVGSVEEGSPAFEAGLSAGDLITHVNGEPVHGLVHTEVMELLLK-SGNKVTLRT  713 (1205)
T ss_pred             eeeeeecCCCCccccCCCccceeEeccCcccchhhHHHHHHHHHh-cCCeeEEEe
Confidence            4577899999999999999999999999999754  455565543 344444433


No 86 
>PF02907 Peptidase_S29:  Hepatitis C virus NS3 protease;  InterPro: IPR004109 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This signature identifies the Hepatitis C virus NS3 protein as a serine protease which belongs to MEROPS peptidase family S29 (hepacivirin family, clan PA(S)), which has a trypsin-like fold. The non-structural (NS) protein NS3 is one of the NS proteins involved in replication of the HCV genome. The NS2 proteinase (IPR002518 from INTERPRO), a zinc-dependent enzyme, performs a single proteolytic cut to release the N terminus of NS3. The action of NS3 proteinase (NS3P), which resides in the N-terminal one-third of the NS3 protein, then yields all remaining non-structural proteins. The C-terminal two-thirds of the NS3 protein contain a helicase. The functional relationship between the proteinase and helicase domains is unknown. NS3 has a structural zinc-binding site and requires cofactor NS4. 
It has been suggested that the NS3 serine protease of hepatitis C is involved in cell transformation and that the ability to transform requires an active enzyme [].; GO: 0008236 serine-type peptidase activity, 0006508 proteolysis, 0019087 transformation of host cell by virus; PDB: 2QV1_B 3LOX_C 2OBQ_C 2OC1_C 2OC0_A 3LON_A 3KNX_A 2O8M_A 2OBO_A 2OC8_A ....
Probab=79.03  E-value=4.2  Score=35.02  Aligned_cols=39  Identities=28%  Similarity=0.583  Sum_probs=25.4

Q ss_pred             CCCCCccceeeecCCCEEEEEEEEeecC---CCeEEEEeHHHH
Q 013444          284 INAGNSGGPLVNIDGEIVGINIMKVAAA---DGLSFAVPIDSA  323 (443)
Q Consensus       284 i~~G~SGGPlvd~~G~VVGI~s~~~~~~---~g~~~aIPi~~i  323 (443)
                      ...|.||||++-.+|.+|||-....-..   ..+-| +|++.+
T Consensus       105 ~lkGSSGgPiLC~~GH~vG~f~aa~~trgvak~i~f-~P~e~l  146 (148)
T PF02907_consen  105 DLKGSSGGPILCPSGHAVGMFRAAVCTRGVAKAIDF-IPVETL  146 (148)
T ss_dssp             HHTT-TT-EEEETTSEEEEEEEEEEEETTEEEEEEE-EEHHHH
T ss_pred             EEecCCCCcccCCCCCEEEEEEEEEEcCCceeeEEE-Eeeeec
Confidence            3579999999999999999976644321   12334 587654


No 87 
>PF03510 Peptidase_C24:  2C endopeptidase (C24) cysteine protease family;  InterPro: IPR000317 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. 
These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad [].  The two signatures that define this group of calicivirus polyproteins identify a cysteine peptidase signature that belongs to MEROPS peptidase family C24 (clan PA(C)). Caliciviruses are positive-stranded ssRNA viruses that cause gastroenteritis. The calicivirus genome contains two open reading frames, ORF1 and ORF2. ORF2 encodes a structural protein []; while ORF1 encodes a non-structural polypeptide, which has RNA helicase, cysteine protease and RNA polymerase activity. The regions of the polyprotein in which these activities lie are similar to proteins produced by the picornaviruses. Two different families of caliciviruses can be distinguished on the basis of sequence similarity, namely those classified as small round structured viruses (SRSVs) and those classed as non-SRSVs. Calicivirus proteases from the non-SRSV group, which are members of the PA protease clan, constitute family C24 of the cysteine proteases (proteases from SRSVs belong to the C37 family). As mentioned above, the protease activity resides within a polyprotein. The enzyme cleaves the polyprotein at sites N-terminal to itself, liberating the polyprotein helicase.; GO: 0004197 cysteine-type endopeptidase activity, 0006508 proteolysis
Probab=73.05  E-value=11  Score=31.24  Aligned_cols=53  Identities=28%  Similarity=0.500  Sum_probs=32.8

Q ss_pred             EEEEeCCCEEEeccccccCCCCCCCCCCceEEEEeCCCcEEEEEEEeecCCCCEEEEEEcCCCCCCccccCC
Q 013444          159 GAIVDADGTILTCAHVVVDFHGSRALPKGKVDVTLQDGRTFEGTVLNADFHSDIAIVKINSKTPLPAAKLGT  230 (443)
Q Consensus       159 GfiI~~~G~ILTaaHvv~~~~~~~~~~~~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkl~~~~~~~~~~l~~  230 (443)
                      ++-|. +|.++|+.|+.+...        .|     +|..+  +++.  ..-|+++++.+.. .++.+++++
T Consensus         3 avHIG-nG~~vt~tHva~~~~--------~v-----~g~~f--~~~~--~~ge~~~v~~~~~-~~p~~~ig~   55 (105)
T PF03510_consen    3 AVHIG-NGRYVTVTHVAKSSD--------SV-----DGQPF--KIVK--TDGELCWVQSPLV-HLPAAQIGT   55 (105)
T ss_pred             eEEeC-CCEEEEEEEEeccCc--------eE-----cCcCc--EEEE--eccCEEEEECCCC-CCCeeEecc
Confidence            55665 689999999998742        21     12222  1222  3448999999854 356666654


No 88 
>PF02395 Peptidase_S6:  Immunoglobulin A1 protease Serine protease Prosite pattern;  InterPro: IPR000710 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. 
They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This group of serine peptidases belongs to the MEROPS peptidase family S6 (clan PA(S)). The type example is the IgA1-specific serine endopeptidase from Neisseria gonorrhoeae []. These cleave prolyl bonds in the hinge regions of immunoglobulin A heavy chains. Similar specificity is shown by the unrelated family of M26 metalloendopeptidases.; GO: 0004252 serine-type endopeptidase activity, 0006508 proteolysis; PDB: 3SZE_A 3H09_B 3SYJ_A 1WXR_A 3AK5_B.
Probab=71.35  E-value=41  Score=37.80  Aligned_cols=52  Identities=15%  Similarity=0.257  Sum_probs=33.2

Q ss_pred             EEEEEEeCCCEEEeccccccCCCCCCCCCCceEEEEeCC--CcEEEEEEEeecCCCCEEEEEEcC
Q 013444          157 GSGAIVDADGTILTCAHVVVDFHGSRALPKGKVDVTLQD--GRTFEGTVLNADFHSDIAIVKINS  219 (443)
Q Consensus       157 GSGfiI~~~G~ILTaaHvv~~~~~~~~~~~~~i~V~~~d--g~~~~a~vv~~d~~~DlAlLkl~~  219 (443)
                      |...+|++. ||+|.+|...+.          -.|.|.+  ...|...--.-++..|+.+-|++.
T Consensus        67 G~aTLigpq-YiVSV~HN~~gy----------~~v~FG~~g~~~Y~iV~RNn~~~~Df~~pRLnK  120 (769)
T PF02395_consen   67 GVATLIGPQ-YIVSVKHNGKGY----------NSVSFGNEGQNTYKIVDRNNYPSGDFHMPRLNK  120 (769)
T ss_dssp             SS-EEEETT-EEEBETTG-TSC----------CEECESCSSTCEEEEEEEEBETTSTEBEEEESS
T ss_pred             ceEEEecCC-eEEEEEccCCCc----------CceeecccCCceEEEEEccCCCCcccceeecCc
Confidence            778999987 999999998442          2355654  344532222223447999999985


No 89 
>KOG3605 consensus Beta amyloid precursor-binding protein [General function prediction only]
Probab=71.10  E-value=3.6  Score=44.13  Aligned_cols=50  Identities=16%  Similarity=0.341  Sum_probs=34.2

Q ss_pred             ECCCCccccCC-CCCCCEEEEECCEeeCCH--HHHHHHHhc-CCCCeEEEEEEE
Q 013444          375 VTPGSPAHLAG-FLPSDVVIKFDGKPVQSI--TEIIEIMGD-RVGEPLKVVVQR  424 (443)
Q Consensus       375 v~~~spA~~aG-l~~GDiI~~vng~~V~s~--~dl~~~l~~-~~g~~v~l~v~R  424 (443)
                      ...++||++.| |-.||+|++|||......  ..-+.+++. +....|+++|.+
T Consensus       680 mm~~GpAarsgkLnIGDQiiaING~SLVGLPLstcQs~Ik~~KnQT~VkltiV~  733 (829)
T KOG3605|consen  680 MMHGGPAARSGKLNIGDQIMSINGTSLVGLPLSTCQSIIKGLKNQTAVKLNIVS  733 (829)
T ss_pred             cccCChhhhcCCccccceeEeecCceeccccHHHHHHHHhcccccceEEEEEec
Confidence            45689999986 999999999999876432  333455555 333356666655


No 90 
>KOG3938 consensus RGS-GAIP interacting protein GIPC, contains PDZ domain [Signal transduction mechanisms; Intracellular trafficking, secretion, and vesicular transport]
Probab=66.38  E-value=6.8  Score=37.65  Aligned_cols=55  Identities=11%  Similarity=0.250  Sum_probs=42.9

Q ss_pred             eEEeEECCCCccccC-CCCCCCEEEEECCEeeCCHHH--HHHHHhc-CCCCeEEEEEEE
Q 013444          370 VLVPVVTPGSPAHLA-GFLPSDVVIKFDGKPVQSITE--IIEIMGD-RVGEPLKVVVQR  424 (443)
Q Consensus       370 ~~V~~v~~~spA~~a-Gl~~GDiI~~vng~~V~s~~d--l~~~l~~-~~g~~v~l~v~R  424 (443)
                      ..|..+.++|.-.+. -++.||.|-+|||+.+-.+..  +.++|+. ..|++.++.+..
T Consensus       151 AFIKrIkegsvidri~~i~VGd~IEaiNge~ivG~RHYeVArmLKel~rge~ftlrLie  209 (334)
T KOG3938|consen  151 AFIKRIKEGSVIDRIEAICVGDHIEAINGESIVGKRHYEVARMLKELPRGETFTLRLIE  209 (334)
T ss_pred             eeeEeecCCchhhhhhheeHHhHHHhhcCccccchhHHHHHHHHHhcccCCeeEEEeec
Confidence            557778888887765 489999999999999987754  5577877 678888877653


No 91 
>PF00947 Pico_P2A:  Picornavirus core protein 2A;  InterPro: IPR000081 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. 
Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad [].  This domain defines cysteine peptidases belonging to MEROPS peptidase family C3 (picornain, clan PA(C)), subfamilies 3CA and 3CB. The protein fold of this peptidase domain for members of this family resembles that of the serine peptidase, chymotrypsin [], the type example for clan PA. Picornaviral proteins are expressed as a single polyprotein which is cleaved by the viral 3C cysteine protease []. The poliovirus polyprotein is selectively cleaved between the Gln-|-Gly bond. In other picornavirus reactions Glu may be substituted for Gln, and Ser or Thr for Gly. ; GO: 0008233 peptidase activity, 0006508 proteolysis, 0016032 viral reproduction; PDB: 2HRV_B 1Z8R_A.
Probab=65.13  E-value=26  Score=29.94  Aligned_cols=33  Identities=21%  Similarity=0.296  Sum_probs=25.2

Q ss_pred             ceEEEEcccCCCCCccceeeecCCCEEEEEEEEe
Q 013444          275 REYLQTDCAINAGNSGGPLVNIDGEIVGINIMKV  308 (443)
Q Consensus       275 ~~~i~~d~~i~~G~SGGPlvd~~G~VVGI~s~~~  308 (443)
                      ..++....+..||+-||+|+ .+--||||++.+-
T Consensus        78 ~~~l~g~Gp~~PGdCGg~L~-C~HGViGi~Tagg  110 (127)
T PF00947_consen   78 YNLLIGEGPAEPGDCGGILR-CKHGVIGIVTAGG  110 (127)
T ss_dssp             ECEEEEE-SSSTT-TCSEEE-ETTCEEEEEEEEE
T ss_pred             cCceeecccCCCCCCCceeE-eCCCeEEEEEeCC
Confidence            34566678899999999999 6778999998763


No 92 
>PF01732 DUF31:  Putative peptidase (DUF31);  InterPro: IPR022382  This domain has no known function. It is found in various hypothetical proteins and putative lipoproteins from mycoplasmas. 
Probab=62.38  E-value=5.4  Score=40.74  Aligned_cols=24  Identities=33%  Similarity=0.650  Sum_probs=21.1

Q ss_pred             ccCCCCCccceeeecCCCEEEEEE
Q 013444          282 CAINAGNSGGPLVNIDGEIVGINI  305 (443)
Q Consensus       282 ~~i~~G~SGGPlvd~~G~VVGI~s  305 (443)
                      ..+..|.||+.|+|.+|++|||.+
T Consensus       350 ~~l~gGaSGS~V~n~~~~lvGIy~  373 (374)
T PF01732_consen  350 YSLGGGASGSMVINQNNELVGIYF  373 (374)
T ss_pred             cCCCCCCCcCeEECCCCCEEEEeC
Confidence            356789999999999999999975


No 93 
>cd00600 Sm_like The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation.  Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet.  Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=59.39  E-value=26  Score=25.51  Aligned_cols=33  Identities=27%  Similarity=0.469  Sum_probs=29.1

Q ss_pred             CceEEEEeCCCcEEEEEEEeecCCCCEEEEEEc
Q 013444          186 KGKVDVTLQDGRTFEGTVLNADFHSDIAIVKIN  218 (443)
Q Consensus       186 ~~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkl~  218 (443)
                      ...+.|.+.||+.+.+.+..+|...++.|-...
T Consensus         6 g~~V~V~l~~g~~~~G~L~~~D~~~Ni~L~~~~   38 (63)
T cd00600           6 GKTVRVELKDGRVLEGVLVAFDKYMNLVLDDVE   38 (63)
T ss_pred             CCEEEEEECCCcEEEEEEEEECCCCCEEECCEE
Confidence            468999999999999999999999998886664


No 94 
>PRK00737 small nuclear ribonucleoprotein; Provisional
Probab=54.67  E-value=30  Score=26.36  Aligned_cols=33  Identities=27%  Similarity=0.491  Sum_probs=29.6

Q ss_pred             CceEEEEeCCCcEEEEEEEeecCCCCEEEEEEc
Q 013444          186 KGKVDVTLQDGRTFEGTVLNADFHSDIAIVKIN  218 (443)
Q Consensus       186 ~~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkl~  218 (443)
                      ...+.|.+.+|+.|.+++..+|...++.|-...
T Consensus        14 ~k~V~V~lk~g~~~~G~L~~~D~~mNlvL~d~~   46 (72)
T PRK00737         14 NSPVLVRLKGGREFRGELQGYDIHMNLVLDNAE   46 (72)
T ss_pred             CCEEEEEECCCCEEEEEEEEEcccceeEEeeEE
Confidence            467999999999999999999999999887764


No 95 
>cd01731 archaeal_Sm1 The archaeal sm1 proteins: The Sm proteins are conserved in all three domains of life and are always associated with U-rich RNA sequences. They function to mediate RNA-RNA interactions and RNA biogenesis.  All Sm proteins contain a common sequence motif in two segments, Sm1 and Sm2, separated by a short variable linker. Eukaryotic Sm proteins form part of specific small nuclear ribonucleoproteins (snRNPs) that are involved in the processing of pre-mRNAs to mature mRNAs, and are a major component of the eukaryotic spliceosome. Most snRNPs consist of seven Sm proteins (B/B', D1, D2, D3, E, F and G) arranged in a ring on a uridine-rich sequence (Sm site), plus a small nuclear RNA (snRNA) (either U1, U2, U5 or U4/6). Since archaebacteria do not have any splicing apparatus, Sm proteins of archaebacteria may play a more general role. Archaeal Lsm proteins are likely to represent the ancestral Sm domain.
Probab=54.48  E-value=31  Score=25.86  Aligned_cols=33  Identities=21%  Similarity=0.359  Sum_probs=29.8

Q ss_pred             CceEEEEeCCCcEEEEEEEeecCCCCEEEEEEc
Q 013444          186 KGKVDVTLQDGRTFEGTVLNADFHSDIAIVKIN  218 (443)
Q Consensus       186 ~~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkl~  218 (443)
                      ...+.|.+.+|+.+.+++..+|...+|.|-...
T Consensus        10 ~~~V~V~l~~g~~~~G~L~~~D~~mNlvL~~~~   42 (68)
T cd01731          10 NKPVLVKLKGGKEVRGRLKSYDQHMNLVLEDAE   42 (68)
T ss_pred             CCEEEEEECCCCEEEEEEEEECCcceEEEeeEE
Confidence            468999999999999999999999999987765


No 96 
>cd01726 LSm6 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation.  Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm6 is one of at least seven subunits that assemble onto U6 snRNA to form a seven-membered ring structure.  Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=54.44  E-value=29  Score=26.01  Aligned_cols=33  Identities=24%  Similarity=0.278  Sum_probs=29.3

Q ss_pred             CceEEEEeCCCcEEEEEEEeecCCCCEEEEEEc
Q 013444          186 KGKVDVTLQDGRTFEGTVLNADFHSDIAIVKIN  218 (443)
Q Consensus       186 ~~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkl~  218 (443)
                      ...+.|.+.+|+.|.+++..+|...+|.|-...
T Consensus        10 ~~~V~V~Lk~g~~~~G~L~~~D~~mNlvL~~~~   42 (67)
T cd01726          10 GRPVVVKLNSGVDYRGILACLDGYMNIALEQTE   42 (67)
T ss_pred             CCeEEEEECCCCEEEEEEEEEccceeeEEeeEE
Confidence            468999999999999999999999999886664


No 97 
>PF11874 DUF3394:  Domain of unknown function (DUF3394);  InterPro: IPR021814  This domain is functionally uncharacterised. This domain is found in bacteria. This presumed domain is about 190 amino acids in length. This domain is found associated with PF06808 from PFAM. 
Probab=53.95  E-value=19  Score=32.92  Aligned_cols=29  Identities=31%  Similarity=0.245  Sum_probs=26.5

Q ss_pred             CCceEEeEECCCCccccCCCCCCCEEEEE
Q 013444          367 KSGVLVPVVTPGSPAHLAGFLPSDVVIKF  395 (443)
Q Consensus       367 ~~g~~V~~v~~~spA~~aGl~~GDiI~~v  395 (443)
                      .+.+.|.+|..+|||+++|+.-|+.|+++
T Consensus       121 ~~~~~Vd~v~fgS~A~~~g~d~d~~I~~v  149 (183)
T PF11874_consen  121 GGKVIVDEVEFGSPAEKAGIDFDWEITEV  149 (183)
T ss_pred             CCEEEEEecCCCCHHHHcCCCCCcEEEEE
Confidence            45788999999999999999999999887


No 98 
>cd06168 LSm9 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation.  Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm9 proteins have a single Sm-like domain structure.  Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=53.80  E-value=32  Score=26.62  Aligned_cols=33  Identities=24%  Similarity=0.404  Sum_probs=29.1

Q ss_pred             CceEEEEeCCCcEEEEEEEeecCCCCEEEEEEc
Q 013444          186 KGKVDVTLQDGRTFEGTVLNADFHSDIAIVKIN  218 (443)
Q Consensus       186 ~~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkl~  218 (443)
                      ...+.|.+.||+.+.+++..+|.+.+|.|=...
T Consensus        10 ~~~v~V~l~dgR~~~G~l~~~D~~~NivL~~~~   42 (75)
T cd06168          10 GRTMRIHMTDGRTLVGVFLCTDRDCNIILGSAQ   42 (75)
T ss_pred             CCeEEEEEcCCeEEEEEEEEEcCCCcEEecCcE
Confidence            468999999999999999999999999876554


No 99 
>cd01722 Sm_F The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation.  Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. Sm subunit F is capable of forming both homo- and hetero-heptamer ring structures.  To form the hetero-heptamer, Sm subunit F initially binds subunits E and G to form a trimer which then assembles onto snRNA along with the D3/B and D1/D2 heterodimers.
Probab=53.41  E-value=29  Score=26.14  Aligned_cols=33  Identities=21%  Similarity=0.330  Sum_probs=29.2

Q ss_pred             CceEEEEeCCCcEEEEEEEeecCCCCEEEEEEc
Q 013444          186 KGKVDVTLQDGRTFEGTVLNADFHSDIAIVKIN  218 (443)
Q Consensus       186 ~~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkl~  218 (443)
                      ...+.|.+.+|+.+.+++..+|...+|.+=.+.
T Consensus        11 g~~V~V~Lk~g~~~~G~L~~~D~~mNi~L~~~~   43 (68)
T cd01722          11 GKPVIVKLKWGMEYKGTLVSVDSYMNLQLANTE   43 (68)
T ss_pred             CCEEEEEECCCcEEEEEEEEECCCEEEEEeeEE
Confidence            468999999999999999999999999886654


No 100
>cd01732 LSm5 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation.  Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm5 is one of at least seven subunits that assemble onto U6 snRNA to form a seven-membered ring structure.  Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=53.40  E-value=29  Score=26.89  Aligned_cols=32  Identities=16%  Similarity=0.365  Sum_probs=28.4

Q ss_pred             CceEEEEeCCCcEEEEEEEeecCCCCEEEEEE
Q 013444          186 KGKVDVTLQDGRTFEGTVLNADFHSDIAIVKI  217 (443)
Q Consensus       186 ~~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkl  217 (443)
                      +..+.|.+.+|+.+.+++.++|...++.|=..
T Consensus        13 ~~~V~V~l~~gr~~~G~L~g~D~~mNlvL~da   44 (76)
T cd01732          13 GSRIWIVMKSDKEFVGTLLGFDDYVNMVLEDV   44 (76)
T ss_pred             CCEEEEEECCCeEEEEEEEEeccceEEEEccE
Confidence            46899999999999999999999999987554


No 101
>cd01730 LSm3 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation.  Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm3 is one of at least seven subunits that assemble onto U6 snRNA to form a seven-membered ring structure.  Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=52.12  E-value=27  Score=27.34  Aligned_cols=32  Identities=22%  Similarity=0.376  Sum_probs=28.2

Q ss_pred             CceEEEEeCCCcEEEEEEEeecCCCCEEEEEE
Q 013444          186 KGKVDVTLQDGRTFEGTVLNADFHSDIAIVKI  217 (443)
Q Consensus       186 ~~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkl  217 (443)
                      ...+.|.+.+|+.+.+++.++|.+.+|.|=..
T Consensus        11 ~k~V~V~l~~gr~~~G~L~~fD~~mNlvL~d~   42 (82)
T cd01730          11 DERVYVKLRGDRELRGRLHAYDQHLNMILGDV   42 (82)
T ss_pred             CCEEEEEECCCCEEEEEEEEEccceEEeccce
Confidence            35899999999999999999999999887544


No 102
>cd01717 Sm_B The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation.  Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. Sm subunit B heterodimerizes with subunit D3 and three such heterodimers form a hexameric ring structure with alternating B and D3 subunits.  The D3 - B heterodimer also assembles into a heptameric ring containing D1, D2, E, F, and G subunits.  Sm-like proteins exist in archaea as well as prokaryotes which form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=51.97  E-value=32  Score=26.74  Aligned_cols=33  Identities=36%  Similarity=0.566  Sum_probs=29.1

Q ss_pred             CceEEEEeCCCcEEEEEEEeecCCCCEEEEEEc
Q 013444          186 KGKVDVTLQDGRTFEGTVLNADFHSDIAIVKIN  218 (443)
Q Consensus       186 ~~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkl~  218 (443)
                      ...+.|.+.||+.+.+.+.++|.+.+|.|=...
T Consensus        10 ~~~V~V~l~dgR~~~G~L~~~D~~~NlVL~~~~   42 (79)
T cd01717          10 NYRLRVTLQDGRQFVGQFLAFDKHMNLVLSDCE   42 (79)
T ss_pred             CCEEEEEECCCcEEEEEEEEEcCccCEEcCCEE
Confidence            468999999999999999999999999876554


No 103
>cd01729 LSm7 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation.  Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm7 is one of at least seven subunits that assemble onto U6 snRNA to form a seven-membered ring structure.  Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=49.63  E-value=39  Score=26.50  Aligned_cols=33  Identities=21%  Similarity=0.301  Sum_probs=28.8

Q ss_pred             CceEEEEeCCCcEEEEEEEeecCCCCEEEEEEc
Q 013444          186 KGKVDVTLQDGRTFEGTVLNADFHSDIAIVKIN  218 (443)
Q Consensus       186 ~~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkl~  218 (443)
                      ...+.|.+.+|+.+.+++..+|...+|.|=...
T Consensus        12 ~k~V~V~l~~gr~~~G~L~~~D~~mNlvL~~~~   44 (81)
T cd01729          12 DKKIRVKFQGGREVTGILKGYDQLLNLVLDDTV   44 (81)
T ss_pred             CCeEEEEECCCcEEEEEEEEEcCcccEEecCEE
Confidence            468999999999999999999999999875543


No 104
>cd01719 Sm_G The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation.  Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet.  Sm subunit G binds subunits E and F to form a trimer which then assembles onto snRNA along with the D1/D2 and D3/B heterodimers forming a seven-membered ring structure.  Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=48.34  E-value=44  Score=25.54  Aligned_cols=33  Identities=15%  Similarity=0.231  Sum_probs=28.7

Q ss_pred             CceEEEEeCCCcEEEEEEEeecCCCCEEEEEEc
Q 013444          186 KGKVDVTLQDGRTFEGTVLNADFHSDIAIVKIN  218 (443)
Q Consensus       186 ~~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkl~  218 (443)
                      +..+.|.+.+|+.+.+++.++|...+|.|=...
T Consensus        10 ~k~V~V~L~~g~~~~G~L~~~D~~mNlvL~~~~   42 (72)
T cd01719          10 DKKLSLKLNGNRKVSGILRGFDPFMNLVLDDAV   42 (72)
T ss_pred             CCeEEEEECCCeEEEEEEEEEcccccEEeccEE
Confidence            468999999999999999999999998885553


No 105
>cd01721 Sm_D3 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation.  Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. Sm subunit D3 heterodimerizes with subunit B and three such heterodimers form a hexameric ring structure with alternating B and D3 subunits. The D3 - B heterodimer also assembles into a heptameric ring containing D1, D2, E, F, and G subunits. Sm-like proteins exist in archaea as well as prokaryotes which form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=47.83  E-value=49  Score=25.04  Aligned_cols=33  Identities=18%  Similarity=0.393  Sum_probs=29.8

Q ss_pred             CceEEEEeCCCcEEEEEEEeecCCCCEEEEEEc
Q 013444          186 KGKVDVTLQDGRTFEGTVLNADFHSDIAIVKIN  218 (443)
Q Consensus       186 ~~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkl~  218 (443)
                      ...+.|.+.+|..|.+++..+|...++.+-...
T Consensus        10 g~~V~VeLk~g~~~~G~L~~~D~~MNl~L~~~~   42 (70)
T cd01721          10 GHIVTVELKTGEVYRGKLIEAEDNMNCQLKDVT   42 (70)
T ss_pred             CCEEEEEECCCcEEEEEEEEEcCCceeEEEEEE
Confidence            468999999999999999999999999988774


No 106
>cd01728 LSm1 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation.  Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm1 is one of at least seven subunits that assemble onto U6 snRNA to form a seven-membered ring structure.  Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=47.05  E-value=46  Score=25.64  Aligned_cols=32  Identities=28%  Similarity=0.415  Sum_probs=28.5

Q ss_pred             CceEEEEeCCCcEEEEEEEeecCCCCEEEEEE
Q 013444          186 KGKVDVTLQDGRTFEGTVLNADFHSDIAIVKI  217 (443)
Q Consensus       186 ~~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkl  217 (443)
                      ...+.|.+.+|+.+.+.+..+|++.++.|=..
T Consensus        12 ~k~v~V~l~~gr~~~G~L~~fD~~~NlvL~d~   43 (74)
T cd01728          12 DKKVVVLLRDGRKLIGILRSFDQFANLVLQDT   43 (74)
T ss_pred             CCEEEEEEcCCeEEEEEEEEECCcccEEecce
Confidence            46899999999999999999999999888554


No 107
>cd01720 Sm_D2 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation.  Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. Sm subunit D2 heterodimerizes with subunit D1 and three such heterodimers form a hexameric ring structure with alternating D1 and D2 subunits. The D1 - D2 heterodimer also assembles into a heptameric ring containing D2, D3, E, F, and G subunits. Sm-like proteins exist in archaea as well as prokaryotes which form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=46.79  E-value=44  Score=26.66  Aligned_cols=33  Identities=15%  Similarity=0.324  Sum_probs=29.0

Q ss_pred             CceEEEEeCCCcEEEEEEEeecCCCCEEEEEEc
Q 013444          186 KGKVDVTLQDGRTFEGTVLNADFHSDIAIVKIN  218 (443)
Q Consensus       186 ~~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkl~  218 (443)
                      ...+.|.+.+++.+.+++.++|.+.+|.|=...
T Consensus        14 ~~~V~V~lr~~r~~~G~L~~fD~hmNlvL~d~~   46 (87)
T cd01720          14 NTQVLINCRNNKKLLGRVKAFDRHCNMVLENVK   46 (87)
T ss_pred             CCEEEEEEcCCCEEEEEEEEecCccEEEEcceE
Confidence            358999999999999999999999999876554


No 108
>smart00651 Sm snRNP Sm proteins. small nuclear ribonucleoprotein particles (snRNPs) involved in pre-mRNA splicing
Probab=46.37  E-value=49  Score=24.37  Aligned_cols=33  Identities=24%  Similarity=0.462  Sum_probs=28.8

Q ss_pred             CceEEEEeCCCcEEEEEEEeecCCCCEEEEEEc
Q 013444          186 KGKVDVTLQDGRTFEGTVLNADFHSDIAIVKIN  218 (443)
Q Consensus       186 ~~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkl~  218 (443)
                      ...+.|.+.||+.+.+++..+|...++-|=...
T Consensus         8 ~~~V~V~l~~g~~~~G~L~~~D~~~NlvL~~~~   40 (67)
T smart00651        8 GKRVLVELKNGREYRGTLKGFDQFMNLVLEDVE   40 (67)
T ss_pred             CcEEEEEECCCcEEEEEEEEECccccEEEccEE
Confidence            458999999999999999999999998876554


No 109
>cd01735 LSm12_N LSm12 belongs to a family of Sm-like proteins that associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation.  Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet that associates with other Sm proteins to form hexameric and heptameric ring structures.   In addition to the N-terminal Sm-like domain, LSm12 has a novel methyltransferase domain.
Probab=45.92  E-value=72  Score=23.70  Aligned_cols=34  Identities=24%  Similarity=0.310  Sum_probs=29.3

Q ss_pred             CceEEEEeCCCcEEEEEEEeecCCCCEEEEEEcC
Q 013444          186 KGKVDVTLQDGRTFEGTVLNADFHSDIAIVKINS  219 (443)
Q Consensus       186 ~~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkl~~  219 (443)
                      ...+.+..-.|..++++++.+|....+.+|+.+.
T Consensus         6 Gs~V~~kTc~g~~ieGEV~afD~~tk~lIlk~~s   39 (61)
T cd01735           6 GSQVSCRTCFEQRLQGEVVAFDYPSKMLILKCPS   39 (61)
T ss_pred             ccEEEEEecCCceEEEEEEEecCCCcEEEEECcc
Confidence            3567777888999999999999999999998764


No 110
>PF01423 LSM:  LSM domain ;  InterPro: IPR001163 This family is found in Lsm (like-Sm) proteins and in bacterial Lsm-related Hfq proteins. In each case, the domain adopts a core structure consisting of an open beta-barrel with an SH3-like topology. Lsm (like-Sm) proteins have diverse functions, and are thought to be important modulators of RNA biogenesis and function [, ]. The Sm proteins form part of specific small nuclear ribonucleoproteins (snRNPs) that are involved in the processing of pre-mRNAs to mature mRNAs, and are a major component of the eukaryotic spliceosome. Most snRNPs consist of seven Sm proteins (B/B', D1, D2, D3, E, F and G) arranged in a ring on a uridine-rich sequence (Sm site), plus a small nuclear RNA (snRNA) (either U1, U2, U5 or U4/6) []. All Sm proteins contain a common sequence motif in two segments, Sm1 and Sm2, separated by a short variable linker []. In other snRNPs, certain Sm proteins are replaced with different Lsm proteins, such as with U7 snRNPs, in which the D1 and D2 Sm proteins are replaced with U7-specific Lsm10 and Lsm11 proteins, where Lsm11 plays a role in histone U7-specific RNA processing []. Lsm proteins are also found in archaebacteria, which do not have any splicing apparatus suggesting a more general role for Lsm proteins. The pleiotropic translational regulator Hfq (host factor Q) is a bacterial Lsm-like protein, which modulates the structure of numerous RNA molecules by binding preferentially to A/U-rich sequences in RNA []. Hfq forms an Lsm-like fold, however, unlike the heptameric Sm proteins, Hfq forms a homo-hexameric ring.; PDB: 1D3B_K 2Y9D_D 2Y9A_D 2Y9C_R 3VRI_C 2Y9B_K 3QUI_D 3M4G_H 3INZ_E 1U1S_C ....
Probab=45.19  E-value=41  Score=24.84  Aligned_cols=34  Identities=26%  Similarity=0.560  Sum_probs=30.2

Q ss_pred             CceEEEEeCCCcEEEEEEEeecCCCCEEEEEEcC
Q 013444          186 KGKVDVTLQDGRTFEGTVLNADFHSDIAIVKINS  219 (443)
Q Consensus       186 ~~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkl~~  219 (443)
                      ...+.|.+.+|+.+.+.+..+|...++.|-....
T Consensus         8 g~~V~V~l~~g~~~~G~L~~~D~~~Nl~L~~~~~   41 (67)
T PF01423_consen    8 GKRVRVELKNGRTYRGTLVSFDQFMNLVLSDVTE   41 (67)
T ss_dssp             TSEEEEEETTSEEEEEEEEEEETTEEEEEEEEEE
T ss_pred             CcEEEEEEeCCEEEEEEEEEeechheEEeeeEEE
Confidence            4689999999999999999999999998877753


No 111
>cd01727 LSm8 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation.  Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm8 is one of at least seven subunits that assemble onto U6 snRNA to form a seven-membered ring structure.  Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=41.56  E-value=59  Score=24.87  Aligned_cols=33  Identities=24%  Similarity=0.330  Sum_probs=29.1

Q ss_pred             CceEEEEeCCCcEEEEEEEeecCCCCEEEEEEc
Q 013444          186 KGKVDVTLQDGRTFEGTVLNADFHSDIAIVKIN  218 (443)
Q Consensus       186 ~~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkl~  218 (443)
                      +..+.|.+.+++.+.+++.++|...++.|=...
T Consensus         9 ~~~V~V~l~dgr~~~G~L~~~D~~~NlvL~~~~   41 (74)
T cd01727           9 NKTVSVITVDGRVIVGTLKGFDQATNLILDDSH   41 (74)
T ss_pred             CCEEEEEECCCcEEEEEEEEEccccCEEccceE
Confidence            468999999999999999999999998887654


No 112
>PF05416 Peptidase_C37:  Southampton virus-type processing peptidase;  InterPro: IPR001665 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. 
These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionary related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad [].  This group of cysteine peptidases belong to the MEROPS peptidase family C37, (clan PA(C)). The type example is calicivirin from Southampton virus, an endopeptidase that cleaves the polyprotein at sites N-terminal to itself, liberating the polyprotein helicase. Southampton virus is a positive-stranded ssRNA virus belonging to the Caliciviruses, which are viruses that cause gastroenteritis. The calicivirus genome contains two open reading frames, ORF1 and ORF2. ORF1 encodes a non-structural polypeptide, which has RNA helicase, cysteine protease and RNA polymerase activity []. The regions of the polyprotein in which these activities lie are similar to proteins produced by the picornaviruses []. ORF2 encodes a structural, capsid protein. Two different families of caliciviruses can be distinguished on the basis of sequence similarity, namely the Norwalk-like viruses or small round structured viruses (SRSVs), and those classed as non-SRSVs.; GO: 0004197 cysteine-type endopeptidase activity, 0006508 proteolysis; PDB: 2FYQ_A 2FYR_A 1WQS_D 4ASH_A 2IPH_B.
Probab=39.09  E-value=2.7e+02  Score=28.96  Aligned_cols=134  Identities=16%  Similarity=0.245  Sum_probs=61.3

Q ss_pred             cEEEEEEEeCCCEEEeccccccCCCCCC-CCCCceEEEEeCCCcEEEEEEEeecCCCCEEEEEEcCC--CCCCccccCCC
Q 013444          155 GIGSGAIVDADGTILTCAHVVVDFHGSR-ALPKGKVDVTLQDGRTFEGTVLNADFHSDIAIVKINSK--TPLPAAKLGTS  231 (443)
Q Consensus       155 ~~GSGfiI~~~G~ILTaaHvv~~~~~~~-~~~~~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkl~~~--~~~~~~~l~~s  231 (443)
                      +.|=||.|+++ .++|+-||+......- ..+..+|               .++..-+++-+++..+  .+++-+-|.  
T Consensus       379 GsGWGfWVS~~-lfITttHViP~g~~E~FGv~i~~i---------------~vh~sGeF~~~rFpk~iRPDvtgmiLE--  440 (535)
T PF05416_consen  379 GSGWGFWVSPT-LFITTTHVIPPGAKEAFGVPISQI---------------QVHKSGEFCRFRFPKPIRPDVTGMILE--  440 (535)
T ss_dssp             TTEEEEESSSS-EEEEEGGGS-STTSEETTEECGGE---------------EEEEETTEEEEEESS-SSTTS---EE---
T ss_pred             CCceeeeecce-EEEEeeeecCCcchhhhCCChhHe---------------EEeeccceEEEecCCCCCCCccceeec--
Confidence            55779999998 9999999998632100 0001122               2233346677777653  334444442  


Q ss_pred             CCCCCCCEEEE-EecCCCC--CCceEEEEEEeeecCccCCCCCCccceEEE-------EcccCCCCCccceeeecCCC--
Q 013444          232 SKLCPGDWVVA-MGCPHSL--QNTVTAGIVSCVDRKSSDLGLGGMRREYLQ-------TDCAINAGNSGGPLVNIDGE--  299 (443)
Q Consensus       232 ~~~~~G~~V~~-iG~p~~~--~~~~t~G~Vs~~~~~~~~~~~~~~~~~~i~-------~d~~i~~G~SGGPlvd~~G~--  299 (443)
                      +..+.|.-+.+ +-.+.+.  ...+.-|........-.-  .++ ...++.       .|-...||+-|.|-|-..|+  
T Consensus       441 eGapEGtV~siLiKR~sGEllpLAvRMgt~AsmkIqgr~--v~G-Q~GMLLTGaNAK~mDLGT~PGDCGcPYvyKrgNd~  517 (535)
T PF05416_consen  441 EGAPEGTVCSILIKRPSGELLPLAVRMGTHASMKIQGRT--VHG-QMGMLLTGANAKGMDLGTIPGDCGCPYVYKRGNDW  517 (535)
T ss_dssp             SS--TT-EEEEEEE-TTSBEEEEEEEEEEEEEEEETTEE--EEE-EEEEETTSTT-SSTTTS--TTGTT-EEEEEETTEE
T ss_pred             cCCCCceEEEEEEEcCCccchhhhhhhccceeEEEccee--ecc-eeeeeeecCCccccccCCCCCCCCCceeeecCCcE
Confidence            23344554433 2233331  123344443332211000  000 011222       23346789999999976665  


Q ss_pred             -EEEEEEEEee
Q 013444          300 -IVGINIMKVA  309 (443)
Q Consensus       300 -VVGI~s~~~~  309 (443)
                       |+|++.....
T Consensus       518 VV~GVH~AAtr  528 (535)
T PF05416_consen  518 VVIGVHAAATR  528 (535)
T ss_dssp             EEEEEEEEE-S
T ss_pred             EEEEEEehhcc
Confidence             8899987543


No 113
>PF00571 CBS:  CBS domain CBS domain web page. Mutations in the CBS domain of Swiss:P35520 lead to homocystinuria.;  InterPro: IPR000644 CBS (cystathionine-beta-synthase) domains are small intracellular modules, mostly found in two or four copies within a protein, that occur in a variety of proteins in bacteria, archaea, and eukaryotes [, ]. Tandem pairs of CBS domains can act as binding domains for adenosine derivatives and may regulate the activity of attached enzymatic or other domains []. In some cases, CBS domains may act as sensors of cellular energy status by being activated by AMP and inhibited by ATP []. In chloride ion channels, the CBS domains have been implicated in intracellular targeting and trafficking, as well as in protein-protein interactions, but results vary with different channels: in the CLC-5 channel, the CBS domain was shown to be required for trafficking [], while in the CLC-1 channel, the CBS domain was shown to be critical for channel function, but not necessary for trafficking []. Recent experiments revealing that CBS domains can bind adenosine-containing ligands such as ATP, AMP, or S-adenosylmethionine have led to the hypothesis that CBS domains function as sensors of intracellular metabolites [, ]. Crystallographic studies of CBS domains have shown that pairs of CBS sequences form a globular domain where each CBS unit adopts a beta-alpha-beta-beta-alpha pattern []. Crystal structure of the CBS domains of the AMP-activated protein kinase in complexes with AMP and ATP shows that the phosphate groups of AMP/ATP lie in a surface pocket at the interface of two CBS domains, which is lined with basic residues, many of which are associated with disease-causing mutations [].  
In humans, mutations in conserved residues within CBS domains cause a variety of human hereditary diseases, including (with the gene mutated in parentheses): homocystinuria (cystathionine beta-synthase); Wolff-Parkinson-White syndrome (gamma 2 subunit of AMP-activated protein kinase); retinitis pigmentosa (IMP dehydrogenase-1); congenital myotonia, idiopathic generalized epilepsy, hypercalciuric nephrolithiasis, and classic Bartter syndrome (CLC chloride channel family members).; GO: 0005515 protein binding; PDB: 3JTF_A 3TE5_C 3TDH_C 3T4N_C 2QLV_C 3OI8_A 3LV9_A 2QH1_B 1PVM_B 3LQN_A ....
Probab=36.90  E-value=34  Score=23.82  Aligned_cols=19  Identities=47%  Similarity=0.660  Sum_probs=16.2

Q ss_pred             CCccceeeecCCCEEEEEE
Q 013444          287 GNSGGPLVNIDGEIVGINI  305 (443)
Q Consensus       287 G~SGGPlvd~~G~VVGI~s  305 (443)
                      +.+.-|++|.+|+++|+.+
T Consensus        29 ~~~~~~V~d~~~~~~G~is   47 (57)
T PF00571_consen   29 GISRLPVVDEDGKLVGIIS   47 (57)
T ss_dssp             TSSEEEEESTTSBEEEEEE
T ss_pred             CCcEEEEEecCCEEEEEEE
Confidence            5567899999999999975


No 114
>cd01723 LSm4 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation.  Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm4 is one of at least seven subunits that assemble onto U6 snRNA to form a seven-membered ring structure.  Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=36.06  E-value=91  Score=23.97  Aligned_cols=33  Identities=24%  Similarity=0.448  Sum_probs=29.6

Q ss_pred             CceEEEEeCCCcEEEEEEEeecCCCCEEEEEEc
Q 013444          186 KGKVDVTLQDGRTFEGTVLNADFHSDIAIVKIN  218 (443)
Q Consensus       186 ~~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkl~  218 (443)
                      ...+.|.+.+|+.+.+++..+|...++.+-.+.
T Consensus        11 g~~V~VeLkng~~~~G~L~~~D~~mNi~L~~~~   43 (76)
T cd01723          11 NHPMLVELKNGETYNGHLVNCDNWMNIHLREVI   43 (76)
T ss_pred             CCEEEEEECCCCEEEEEEEEEcCCCceEEEeEE
Confidence            468999999999999999999999999987663


No 115
>PF12381 Peptidase_C3G:  Tungro spherical virus-type peptidase;  InterPro: IPR024387 This entry represents a rice tungro spherical waikavirus-type peptidase that belongs to MEROPS peptidase family C3G. It is a picornain 3C-type protease, and is responsible for the self-cleavage of the positive single-stranded polyproteins of a number of plant viral genomes. The location of the protease activity of the polyprotein is at the C-terminal end, adjacent and N-terminal to the putative RNA polymerase [, ].
Probab=35.67  E-value=42  Score=31.44  Aligned_cols=56  Identities=25%  Similarity=0.486  Sum_probs=39.1

Q ss_pred             ceEEEEcccCCCCCccceeeecC----CCEEEEEEEEeecCCCeEEEEeH--HHHHHHHHHHH
Q 013444          275 REYLQTDCAINAGNSGGPLVNID----GEIVGINIMKVAAADGLSFAVPI--DSAAKIIEQFK  331 (443)
Q Consensus       275 ~~~i~~d~~i~~G~SGGPlvd~~----G~VVGI~s~~~~~~~g~~~aIPi--~~i~~~l~~l~  331 (443)
                      +..++...+...|+=|||++-.+    -+++||+..+... .+.+||-++  +.+++.++.|+
T Consensus       168 r~gleY~~~t~~GdCGs~i~~~~t~~~RKIvGiHVAG~~~-~~~gYAe~itQEDL~~A~~~l~  229 (231)
T PF12381_consen  168 RQGLEYQMPTMNGDCGSPIVRNNTQMVRKIVGIHVAGSAN-HAMGYAESITQEDLMRAINKLE  229 (231)
T ss_pred             eeeeeEECCCcCCCccceeeEcchhhhhhhheeeeccccc-ccceehhhhhHHHHHHHHHhhc
Confidence            34567788889999999998332    6899999987643 356777544  35666666554


No 116
>COG1958 LSM1 Small nuclear ribonucleoprotein (snRNP) homolog [Transcription]
Probab=35.65  E-value=76  Score=24.49  Aligned_cols=33  Identities=24%  Similarity=0.491  Sum_probs=29.0

Q ss_pred             CceEEEEeCCCcEEEEEEEeecCCCCEEEEEEc
Q 013444          186 KGKVDVTLQDGRTFEGTVLNADFHSDIAIVKIN  218 (443)
Q Consensus       186 ~~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkl~  218 (443)
                      ...+.|.+.+|+.+.+++..+|...++.|--..
T Consensus        17 ~~~V~V~lk~g~~~~G~L~~~D~~mNlvL~d~~   49 (79)
T COG1958          17 NKRVLVKLKNGREYRGTLVGFDQYMNLVLDDVE   49 (79)
T ss_pred             CCEEEEEECCCCEEEEEEEEEccceeEEEeceE
Confidence            368999999999999999999999998887554


No 117
>KOG1738 consensus Membrane-associated guanylate kinase-interacting protein/connector enhancer of KSR-like [Nucleotide transport and metabolism]
Probab=32.46  E-value=59  Score=35.17  Aligned_cols=37  Identities=19%  Similarity=0.248  Sum_probs=32.2

Q ss_pred             CceEEeEECCCCccccC-CCCCCCEEEEECCEeeCCHH
Q 013444          368 SGVLVPVVTPGSPAHLA-GFLPSDVVIKFDGKPVQSIT  404 (443)
Q Consensus       368 ~g~~V~~v~~~spA~~a-Gl~~GDiI~~vng~~V~s~~  404 (443)
                      +-.+|.++.++|||... .|..||.|+.||++.|..|+
T Consensus       225 g~h~~s~~~e~Spad~~~kI~dgdEv~qiN~qtvVgwq  262 (638)
T KOG1738|consen  225 GPHVTSKIFEQSPADYRQKILDGDEVLQINEQTVVGWQ  262 (638)
T ss_pred             CceeccccccCChHHHhhcccCccceeeecccccccch
Confidence            45667889999999876 59999999999999998885


No 118
>PF02743 Cache_1:  Cache domain;  InterPro: IPR004010 Cache is an extracellular domain that is predicted to have a role in small-molecule recognition in a wide range of proteins, including the animal dihydropyridine-sensitive voltage-gated Ca2+ channel; alpha-2delta subunit, and various bacterial chemotaxis receptors. The name Cache comes from CAlcium channels and CHEmotaxis receptors. This domain consists of an N-terminal part with three predicted strands and an alpha-helix, and a C-terminal part with a strand dyad followed by a relatively unstructured region. The N-terminal portion of the (unpermuted) Cache domain contains three predicted strands that could form a sheet analogous to that present in the core of the PAS domain structure. Cache domains are particularly widespread in bacteria, with Vibrio cholerae. The animal calcium channel alpha-2delta subunits might have acquired a part of their extracellular domains from a bacterial source []. The Cache domain appears to have arisen from the GAF-PAS fold despite their divergent functions [].; GO: 0016020 membrane; PDB: 3C8C_A 3LIB_D 3LIA_A 3LI8_A 3LI9_A.
Probab=32.09  E-value=58  Score=24.79  Aligned_cols=32  Identities=28%  Similarity=0.574  Sum_probs=24.6

Q ss_pred             cceeeecCCCEEEEEEEEeecCCCeEEEEeHHHHHHHHHHHH
Q 013444          290 GGPLVNIDGEIVGINIMKVAAADGLSFAVPIDSAAKIIEQFK  331 (443)
Q Consensus       290 GGPlvd~~G~VVGI~s~~~~~~~g~~~aIPi~~i~~~l~~l~  331 (443)
                      .-|+.+.+|+++|++..          .+.++.+.++++++.
T Consensus        18 s~pi~~~~g~~~Gvv~~----------di~l~~l~~~i~~~~   49 (81)
T PF02743_consen   18 SVPIYDDDGKIIGVVGI----------DISLDQLSEIISNIK   49 (81)
T ss_dssp             EEEEEETTTEEEEEEEE----------EEEHHHHHHHHTTSB
T ss_pred             EEEEECCCCCEEEEEEE----------EeccceeeeEEEeeE
Confidence            35888889999999754          478888888776653


No 119
>COG0260 PepB Leucyl aminopeptidase [Amino acid transport and metabolism]
Probab=31.21  E-value=45  Score=35.34  Aligned_cols=32  Identities=31%  Similarity=0.425  Sum_probs=25.1

Q ss_pred             ceEEeEECCCCccccCCCCCCCEEEEECCEeeC
Q 013444          369 GVLVPVVTPGSPAHLAGFLPSDVVIKFDGKPVQ  401 (443)
Q Consensus       369 g~~V~~v~~~spA~~aGl~~GDiI~~vng~~V~  401 (443)
                      -+.|.-..+|.|.-.| .+|||||++.||+.|+
T Consensus       299 v~~vl~~~ENm~~g~A-~rPGDVits~~GkTVE  330 (485)
T COG0260         299 VVGVLPAVENMPSGNA-YRPGDVITSMNGKTVE  330 (485)
T ss_pred             EEEEEeeeccCCCCCC-CCCCCeEEecCCcEEE
Confidence            3344456678888877 9999999999999874


No 120
>PF14438 SM-ATX:  Ataxin 2 SM domain; PDB: 1M5Q_1.
Probab=30.92  E-value=1.3e+02  Score=22.98  Aligned_cols=29  Identities=28%  Similarity=0.461  Sum_probs=21.5

Q ss_pred             CceEEEEeCCCcEEEEEEEeecC---CCCEEE
Q 013444          186 KGKVDVTLQDGRTFEGTVLNADF---HSDIAI  214 (443)
Q Consensus       186 ~~~i~V~~~dg~~~~a~vv~~d~---~~DlAl  214 (443)
                      ...++|++.||..|++.+..+++   +.+++|
T Consensus        12 G~~V~V~~~~G~~yeGif~s~s~~~~~~~vvL   43 (77)
T PF14438_consen   12 GQTVEVTTKNGSVYEGIFHSASPESNEFDVVL   43 (77)
T ss_dssp             TSEEEEEETTS-EEEEEEEEE-T---T--EEE
T ss_pred             CCEEEEEECCCCEEEEEEEeCCCcccceeEEE
Confidence            46899999999999999999988   556655


No 121
>cd01725 LSm2 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation.  Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm2 is one of at least seven subunits that assemble onto U6 snRNA to form a seven-membered ring structure.  Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=30.71  E-value=1.2e+02  Score=23.64  Aligned_cols=33  Identities=24%  Similarity=0.368  Sum_probs=29.6

Q ss_pred             CceEEEEeCCCcEEEEEEEeecCCCCEEEEEEc
Q 013444          186 KGKVDVTLQDGRTFEGTVLNADFHSDIAIVKIN  218 (443)
Q Consensus       186 ~~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkl~  218 (443)
                      ...+.|.+.+|..+.+++..+|...++-+-.+.
T Consensus        11 g~~V~VeLKng~~~~G~L~~vD~~MNi~L~n~~   43 (81)
T cd01725          11 GKEVTVELKNDLSIRGTLHSVDQYLNIKLTNIS   43 (81)
T ss_pred             CCEEEEEECCCcEEEEEEEEECCCcccEEEEEE
Confidence            468999999999999999999999999887764


No 122
>cd01733 LSm10 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation.  Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet.  LSm10 is an SmD1-like protein which is thought to bind U7 snRNA along with LSm11 and five other Sm subunits to form a 7-member ring structure. LSm10 and the U7 snRNP of which it is a part are thought to play an important role in histone mRNA 3' processing.
Probab=30.13  E-value=1.4e+02  Score=23.22  Aligned_cols=33  Identities=24%  Similarity=0.380  Sum_probs=29.3

Q ss_pred             CceEEEEeCCCcEEEEEEEeecCCCCEEEEEEc
Q 013444          186 KGKVDVTLQDGRTFEGTVLNADFHSDIAIVKIN  218 (443)
Q Consensus       186 ~~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkl~  218 (443)
                      ...+.|.+.+|..|.+++..+|...++-+-...
T Consensus        19 g~~V~VeLKng~~~~G~L~~vD~~MNl~L~~~~   51 (78)
T cd01733          19 GKVVTVELRNETTVTGRIASVDAFMNIRLAKVT   51 (78)
T ss_pred             CCEEEEEECCCCEEEEEEEEEcCCceeEEEEEE
Confidence            468999999999999999999999998887664


No 123
>PF14827 Cache_3:  Sensory domain of two-component sensor kinase; PDB: 1OJG_A 3BY8_A 1P0Z_I 2V9A_A 2J80_B.
Probab=29.66  E-value=57  Score=27.04  Aligned_cols=18  Identities=28%  Similarity=0.695  Sum_probs=13.6

Q ss_pred             ceeeecCCCEEEEEEEEe
Q 013444          291 GPLVNIDGEIVGINIMKV  308 (443)
Q Consensus       291 GPlvd~~G~VVGI~s~~~  308 (443)
                      .|++|.+|++||++..+.
T Consensus        94 ~PV~d~~g~viG~V~VG~  111 (116)
T PF14827_consen   94 APVYDSDGKVIGVVSVGV  111 (116)
T ss_dssp             EEEE-TTS-EEEEEEEEE
T ss_pred             EeeECCCCcEEEEEEEEE
Confidence            589999999999998754


No 124
>cd05701 S1_Rrp5_repeat_hs10 S1_Rrp5_repeat_hs10: Rrp5 is a trans-acting factor important for biogenesis of both the 40S and 60S eukaryotic ribosomal subunits. Rrp5 has two distinct regions, an N-terminal region containing tandemly repeated S1 RNA-binding domains (12 S1 repeats in Saccharomyces cerevisiae Rrp5 and 14 S1 repeats in Homo sapiens Rrp5) and a C-terminal region containing tetratricopeptide repeat (TPR) motifs thought to be involved in protein-protein interactions. Mutational studies have shown that each region represents a specific functional domain. Deletions within the S1-containing region inhibit pre-rRNA processing at either site A3 or A2, whereas deletions within the TPR region confer an inability to support cleavage of A0-A2. This CD includes H. sapiens S1 repeat 10 (hs10). Rrp5 is found in eukaryotes but not in prokaryotes or archaea.
Probab=29.26  E-value=38  Score=25.39  Aligned_cols=33  Identities=30%  Similarity=0.282  Sum_probs=17.5

Q ss_pred             CCEEEEEEcCCCCCCccc---------cCCCCCCCCCCEEEE
Q 013444          210 SDIAIVKINSKTPLPAAK---------LGTSSKLCPGDWVVA  242 (443)
Q Consensus       210 ~DlAlLkl~~~~~~~~~~---------l~~s~~~~~G~~V~~  242 (443)
                      .|+||+.+.....+..++         ..+++++++|+.+.+
T Consensus        13 kdfAvvSL~~t~~L~a~p~~sHLNdtfrf~seklkvG~~l~v   54 (69)
T cd05701          13 KDFAIVSLATTGDLAAFPTRSHLNDTFRFDSEKLSVGQCLDV   54 (69)
T ss_pred             hceEEEEeeccccEEEEEchhhccccccccceeeeccceEEE
Confidence            467777776543332222         224566677776554


No 125
>cd01724 Sm_D1 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation.  Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. Sm subunit D1 heterodimerizes with subunit D2 and three such heterodimers form a hexameric ring structure with alternating D1 and D2 subunits. The D1 - D2 heterodimer also assembles into a heptameric ring containing DB, D3, E, F, and G subunits. Sm-like proteins exist in archaea as well as prokaryotes which form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=29.18  E-value=1.3e+02  Score=24.10  Aligned_cols=33  Identities=18%  Similarity=0.383  Sum_probs=29.7

Q ss_pred             CceEEEEeCCCcEEEEEEEeecCCCCEEEEEEc
Q 013444          186 KGKVDVTLQDGRTFEGTVLNADFHSDIAIVKIN  218 (443)
Q Consensus       186 ~~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkl~  218 (443)
                      ...+.|.+.+|..|.+++..+|...++.+-.+.
T Consensus        11 g~~V~VeLKng~~~~G~L~~vD~~MNl~L~~a~   43 (90)
T cd01724          11 NETVTIELKNGTIVHGTITGVDPSMNTHLKNVK   43 (90)
T ss_pred             CCEEEEEECCCCEEEEEEEEEcCceeEEEEEEE
Confidence            468999999999999999999999999887764


No 126
>PF05578 Peptidase_S31:  Pestivirus NS3 polyprotein peptidase S31;  InterPro: IPR000280 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This group of serine peptidases belongs to MEROPS peptidase family S31 (clan PA(S)). The type example is pestivirus NS3 polyprotein peptidase from bovine viral diarrhea virus, which is a Type 1 pestivirus. The pestiviruses are single-stranded RNA viruses whose genomes encode one large polyprotein []. The p80 endopeptidase resides towards the middle of the polyprotein and is responsible for processing all non-structural pestivirus proteins [, ]. The p80 enzyme is similar to other proteases in the PA(S) clan and is predicted to have a fold similar to that of chymotrypsin [, ]. An HDS catalytic triad has been identified [].; GO: 0004252 serine-type endopeptidase activity, 0006508 proteolysis
Probab=25.95  E-value=1.7e+02  Score=25.97  Aligned_cols=73  Identities=16%  Similarity=0.161  Sum_probs=36.4

Q ss_pred             CCCCCCEEEEEecCCCCCCceEEEEEEeeecCccCCCCC-CccceEEEEcccCCCCCccceeeecC-CCEEEEEEEE
Q 013444          233 KLCPGDWVVAMGCPHSLQNTVTAGIVSCVDRKSSDLGLG-GMRREYLQTDCAINAGNSGGPLVNID-GEIVGINIMK  307 (443)
Q Consensus       233 ~~~~G~~V~~iG~p~~~~~~~t~G~Vs~~~~~~~~~~~~-~~~~~~i~~d~~i~~G~SGGPlvd~~-G~VVGI~s~~  307 (443)
                      ....|..+|++ +|.....+.+.|.+-.+...-.++..- ..... -..|..-..|.||=|+|... |++||-.-.+
T Consensus       108 gcp~garcyv~-npea~nisgtkga~vhlqk~ggef~cvta~gtp-af~~~knlkg~s~~pifeassgr~vgr~k~g  182 (211)
T PF05578_consen  108 GCPDGARCYVL-NPEATNISGTKGAMVHLQKTGGEFTCVTASGTP-AFFDLKNLKGWSGLPIFEASSGRVVGRVKVG  182 (211)
T ss_pred             CCCCCcEEEEe-CCcccccccCcceEEEEeccCCceEEEeccCCc-ceeeccccCCCCCCceeeccCCcEEEEEEec
Confidence            44567777777 555444344444333222211111000 00001 11233445689999999654 9999976543


No 127
>PF09465 LBR_tudor:  Lamin-B receptor of TUDOR domain;  InterPro: IPR019023  The Lamin-B receptor is a chromatin and lamin binding protein in the inner nuclear membrane. It is one of the integral inner nuclear envelope membrane proteins responsible for targeting nuclear membranes to chromatin, being a downstream effector of Ran, a small Ras-like nuclear GTPase which regulates NE assembly. Lamin-B receptor interacts with importin beta, a Ran-binding protein, thereby directly contributing to the fusion of membrane vesicles and the formation of the nuclear envelope []. ; PDB: 2L8D_A 2DIG_A.
Probab=23.87  E-value=2.9e+02  Score=20.08  Aligned_cols=35  Identities=29%  Similarity=0.237  Sum_probs=27.6

Q ss_pred             CceEEEEeCCCcEE-EEEEEeecCCCCEEEEEEcCC
Q 013444          186 KGKVDVTLQDGRTF-EGTVLNADFHSDIAIVKINSK  220 (443)
Q Consensus       186 ~~~i~V~~~dg~~~-~a~vv~~d~~~DlAlLkl~~~  220 (443)
                      ...+.+.+++...| ++++..+|...++.-++.++.
T Consensus         9 Ge~V~~rWP~s~lYYe~kV~~~d~~~~~y~V~Y~DG   44 (55)
T PF09465_consen    9 GEVVMVRWPGSSLYYEGKVLSYDSKSDRYTVLYEDG   44 (55)
T ss_dssp             S-EEEEE-TTTS-EEEEEEEEEETTTTEEEEEETTS
T ss_pred             CCEEEEECCCCCcEEEEEEEEecccCceEEEEEcCC
Confidence            46889999987655 999999999999999998754


No 128
>PF09122 DUF1930:  Domain of unknown function (DUF1930);  InterPro: IPR015206 This entry represents a domain found in 3-mercaptopyruvate sulphurtransferase which has no known function. This domain adopts a structure consisting of a four-stranded antiparallel beta-sheet and an alpha-helix, arranged in a beta(2)-alpha-beta(2) fashion, and bearing a remarkable structural similarity to the FK506-binding protein class of peptidylprolyl cis/trans-isomerase []. ; PDB: 1OKG_A.
Probab=22.19  E-value=2.3e+02  Score=21.19  Aligned_cols=44  Identities=18%  Similarity=0.376  Sum_probs=27.0

Q ss_pred             CCEEEEECCEeeCCHH-HHHHHHhc-CCCCeEEEEEEECCCeEEEEEE
Q 013444          389 SDVVIKFDGKPVQSIT-EIIEIMGD-RVGEPLKVVVQRANDQLVTLTV  434 (443)
Q Consensus       389 GDiI~~vng~~V~s~~-dl~~~l~~-~~g~~v~l~v~R~~g~~~~l~v  434 (443)
                      .-.-+.+||..|++.+ ++..++.. +.|++-++.++.  ++...+++
T Consensus        19 ~~~tl~vDg~~v~~PD~El~sA~~HlH~GEkA~V~FkS--~Rv~~iEv   64 (68)
T PF09122_consen   19 DNATLIVDGEIVENPDAELKSALVHLHIGEKAQVFFKS--QRVAVIEV   64 (68)
T ss_dssp             TT--EEETTEEESS--HHHHHHHTT-BTT-EEEEEETT--S-EEEEE-
T ss_pred             cceEEEEcCeEcCCCCHHHHHHHHHhhcCceeEEEEec--CcEEEEEc
Confidence            4567889999999875 57777765 899987777653  45555555


No 129
>PRK05015 aminopeptidase B; Provisional
Probab=22.16  E-value=90  Score=32.43  Aligned_cols=29  Identities=21%  Similarity=0.208  Sum_probs=23.6

Q ss_pred             EeEECCCCccccCCCCCCCEEEEECCEeeC
Q 013444          372 VPVVTPGSPAHLAGFLPSDVVIKFDGKPVQ  401 (443)
Q Consensus       372 V~~v~~~spA~~aGl~~GDiI~~vng~~V~  401 (443)
                      |.-+.+|.+...+ .++||||..-||+.|.
T Consensus       240 il~~aENmisg~A-~kpgDVIt~~nGkTVE  268 (424)
T PRK05015        240 FLCCAENLISGNA-FKLGDIITYRNGKTVE  268 (424)
T ss_pred             EEEecccCCCCCC-CCCCCEEEecCCcEEe
Confidence            3445677777777 9999999999999874


No 130
>PRK00913 multifunctional aminopeptidase A; Provisional
Probab=21.05  E-value=91  Score=33.11  Aligned_cols=29  Identities=28%  Similarity=0.502  Sum_probs=23.9

Q ss_pred             EeEECCCCccccCCCCCCCEEEEECCEeeC
Q 013444          372 VPVVTPGSPAHLAGFLPSDVVIKFDGKPVQ  401 (443)
Q Consensus       372 V~~v~~~spA~~aGl~~GDiI~~vng~~V~  401 (443)
                      |.-..+|.|...+ .++||+|..-||+.|.
T Consensus       303 v~~l~ENm~~~~A-~rPgDVi~~~~GkTVE  331 (483)
T PRK00913        303 VVAACENMPSGNA-YRPGDVLTSMSGKTIE  331 (483)
T ss_pred             EEEeeccCCCCCC-CCCCCEEEECCCcEEE
Confidence            3345678888887 9999999999999874


No 131
>cd00433 Peptidase_M17 Cytosol aminopeptidase family, N-terminal and catalytic domains.  Family M17 contains zinc- and manganese-dependent exopeptidases ( EC  3.4.11.1), including leucine aminopeptidase. They catalyze removal of amino acids from the N-terminus of a protein and play a key role in protein degradation and in the metabolism of biologically active peptides. They do not contain HEXXH motif (which is used as one of the signature patterns to group the peptidase families) in the metal-binding site. The two associated zinc ions and the active site are entirely enclosed within the C-terminal catalytic domain in leucine aminopeptidase. The enzyme is a hexamer, with the catalytic domains clustered around the three-fold axis, and the two trimers related to one another by a two-fold rotation. The N-terminal domain is structurally similar to the ADP-ribose binding Macro domain. This family includes proteins from bacteria, archaea, animals and plants.
Probab=20.87  E-value=89  Score=33.05  Aligned_cols=29  Identities=28%  Similarity=0.353  Sum_probs=23.7

Q ss_pred             EeEECCCCccccCCCCCCCEEEEECCEeeC
Q 013444          372 VPVVTPGSPAHLAGFLPSDVVIKFDGKPVQ  401 (443)
Q Consensus       372 V~~v~~~spA~~aGl~~GDiI~~vng~~V~  401 (443)
                      +.-..+|.+...+ .+|||||..-||+.|+
T Consensus       289 i~~~~EN~is~~A-~rPgDVi~s~~GkTVE  317 (468)
T cd00433         289 VLPLAENMISGNA-YRPGDVITSRSGKTVE  317 (468)
T ss_pred             EEEeeecCCCCCC-CCCCCEeEeCCCcEEE
Confidence            3345678888887 9999999999999874


No 132
>cd04627 CBS_pair_14 The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria.  The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair.  The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here.  It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic gener
Probab=20.61  E-value=72  Score=25.94  Aligned_cols=20  Identities=25%  Similarity=0.386  Sum_probs=16.0

Q ss_pred             CCccceeeecCCCEEEEEEE
Q 013444          287 GNSGGPLVNIDGEIVGINIM  306 (443)
Q Consensus       287 G~SGGPlvd~~G~VVGI~s~  306 (443)
                      +.+.=|++|.+|+++|+++.
T Consensus        98 ~~~~lpVvd~~~~~vGiit~  117 (123)
T cd04627          98 GISSVAVVDNQGNLIGNISV  117 (123)
T ss_pred             CCceEEEECCCCcEEEEEeH
Confidence            34456999989999999875


Done!