Query         012318
Match_columns 466
No_of_seqs    455 out of 3342
Neff          7.9 
Searched_HMMs 46136
Date          Fri Mar 29 01:22:19 2013
Command       hhsearch -i /work/01045/syshi/csienesis_hhblits_a3m/012318.a3m -d /work/01045/syshi/HHdatabase/Cdd.hhm -o /work/01045/syshi/hhsearch_cdd/012318hhsearch_cdd -cpu 12 -v 0 

 No Hit                             Prob E-value P-value  Score    SS Cols Query HMM  Template HMM
  1 PRK10139 serine endoprotease;  100.0 3.2E-49   7E-54  411.3  38.3  296  126-462    41-361 (455)
  2 TIGR02038 protease_degS peripl 100.0 1.2E-48 2.7E-53  395.7  38.9  297  125-463    45-350 (351)
  3 PRK10898 serine endoprotease;  100.0 2.4E-48 5.3E-53  393.3  38.6  296  126-463    46-351 (353)
  4 TIGR02037 degP_htrA_DO peripla 100.0 1.7E-46 3.6E-51  390.9  37.9  295  127-462     3-328 (428)
  5 PRK10942 serine endoprotease;  100.0 2.3E-46 4.9E-51  391.7  37.5  295  126-461    39-381 (473)
  6 COG0265 DegQ Trypsin-like seri 100.0   3E-36 6.5E-41  306.0  33.1  295  125-461    33-340 (347)
  7 KOG1320 Serine protease [Postt 100.0 5.1E-28 1.1E-32  246.2  23.0  327  124-465   127-472 (473)
  8 KOG1421 Predicted signaling-as  99.9 3.6E-25 7.8E-30  227.1  21.9  304  125-462    52-372 (955)
  9 PF13365 Trypsin_2:  Trypsin-li  99.7 2.4E-16 5.3E-21  134.3  12.7  117  157-303     1-120 (120)
 10 KOG1421 Predicted signaling-as  99.6   5E-14 1.1E-18  145.8  18.1  295  132-461   525-831 (955)
 11 PF13180 PDZ_2:  PDZ domain; PD  99.5 4.9E-14 1.1E-18  112.8  10.1   81  362-459     1-82  (82)
 12 PF00089 Trypsin:  Trypsin;  In  99.5 2.6E-12 5.7E-17  120.6  18.0  177  135-327    12-220 (220)
 13 cd00190 Tryp_SPc Trypsin-like   99.4 2.5E-11 5.5E-16  114.7  18.9  181  135-329    12-231 (232)
 14 cd00987 PDZ_serine_protease PD  99.3 1.1E-11 2.3E-16  100.5  10.9   88  362-456     1-89  (90)
 15 cd00991 PDZ_archaeal_metallopr  99.3 3.6E-11 7.7E-16   95.5  10.2   69  389-458     8-77  (79)
 16 smart00020 Tryp_SPc Trypsin-li  99.2 2.7E-10 5.9E-15  107.9  15.6  161  134-308    12-208 (229)
 17 cd00990 PDZ_glycyl_aminopeptid  99.2   1E-10 2.2E-15   92.8   9.7   68  390-460    11-78  (80)
 18 cd00989 PDZ_metalloprotease PD  99.2 1.2E-10 2.7E-15   92.0   9.7   67  391-458    12-78  (79)
 19 TIGR01713 typeII_sec_gspC gene  99.2 1.2E-10 2.5E-15  113.1  11.5   99  321-458   159-258 (259)
 20 cd00986 PDZ_LON_protease PDZ d  99.2 1.7E-10 3.7E-15   91.5  10.1   71  390-462     7-78  (79)
 21 KOG1320 Serine protease [Postt  99.2 7.2E-11 1.6E-15  121.3   9.3  277  131-450    56-351 (473)
 22 cd00988 PDZ_CTP_protease PDZ d  99.1 5.2E-10 1.1E-14   89.7   9.5   71  390-460    12-84  (85)
 23 cd00136 PDZ PDZ domain, also c  99.0 2.1E-09 4.6E-14   82.8   7.7   55  391-445    13-69  (70)
 24 TIGR02037 degP_htrA_DO peripla  99.0 3.5E-09 7.6E-14  110.8  11.2   90  361-456   337-427 (428)
 25 COG3591 V8-like Glu-specific e  98.8 6.3E-08 1.4E-12   92.4  13.8  160  155-331    64-250 (251)
 26 TIGR00054 RIP metalloprotease   98.8 1.5E-08 3.3E-13  105.6   9.8   70  391-461   203-272 (420)
 27 PRK10779 zinc metallopeptidase  98.7 4.1E-08 8.9E-13  103.3  10.2   69  391-460   221-289 (449)
 28 PRK10779 zinc metallopeptidase  98.7 3.3E-08 7.1E-13  104.0   8.7   67  393-460   128-195 (449)
 29 cd00992 PDZ_signaling PDZ doma  98.6   3E-07 6.4E-12   72.9   9.0   69  361-445    11-81  (82)
 30 smart00228 PDZ Domain present   98.6 2.9E-07 6.2E-12   73.2   8.9   71  362-447    12-83  (85)
 31 PRK10139 serine endoprotease;   98.6 1.6E-07 3.4E-12   98.7   9.5   66  390-457   389-454 (455)
 32 TIGR02860 spore_IV_B stage IV   98.6 2.4E-07 5.2E-12   94.3  10.1   69  391-460   105-181 (402)
 33 PF00595 PDZ:  PDZ domain (Also  98.5 2.7E-07   6E-12   73.3   7.4   72  360-446     8-81  (81)
 34 TIGR00225 prc C-terminal pepti  98.5 2.6E-07 5.7E-12   93.5   8.3   68  391-459    62-131 (334)
 35 PLN00049 carboxyl-terminal pro  98.5   4E-07 8.7E-12   94.0   9.6   68  391-459   102-171 (389)
 36 TIGR03279 cyano_FeS_chp putati  98.5 3.6E-07 7.8E-12   93.6   8.9   63  395-461     2-65  (433)
 37 PRK10942 serine endoprotease;   98.5 3.8E-07 8.3E-12   96.3   9.2   65  391-457   408-472 (473)
 38 KOG3627 Trypsin [Amino acid tr  98.4 1.9E-05   4E-10   76.5  17.7  169  155-331    38-254 (256)
 39 PF14685 Tricorn_PDZ:  Tricorn   98.3 3.2E-06   7E-11   68.0   8.0   68  390-457    11-88  (88)
 40 PF00863 Peptidase_C4:  Peptida  98.3 8.9E-05 1.9E-09   70.4  18.6  174  133-333    15-195 (235)
 41 TIGR00054 RIP metalloprotease   98.3 1.7E-06 3.6E-11   90.3   7.0   64  391-456   128-191 (420)
 42 KOG3129 26S proteasome regulat  98.2 3.1E-06 6.7E-11   77.3   7.5   73  392-465   140-215 (231)
 43 COG0793 Prc Periplasmic protea  98.2 5.1E-06 1.1E-10   85.9   9.5   70  391-460   112-184 (406)
 44 PF04495 GRASP55_65:  GRASP55/6  98.0 1.5E-05 3.2E-10   70.0   7.7   72  390-461    42-115 (138)
 45 COG3975 Predicted protease wit  97.9 1.9E-05   4E-10   81.7   7.1   65  389-462   460-525 (558)
 46 PRK11186 carboxy-terminal prot  97.9 3.5E-05 7.6E-10   84.0   9.0   70  391-460   255-334 (667)
 47 COG3480 SdrC Predicted secrete  97.9 4.4E-05 9.6E-10   74.3   8.6   70  389-459   128-198 (342)
 48 PRK09681 putative type II secr  97.9 4.8E-05   1E-09   73.9   8.5   61  398-459   211-275 (276)
 49 PF12812 PDZ_1:  PDZ-like domai  97.7 0.00017 3.6E-09   56.9   7.0   69  361-437     8-76  (78)
 50 COG5640 Secreted trypsin-like   97.6 0.00059 1.3E-08   67.6  12.0   51  282-332   223-279 (413)
 51 PF05579 Peptidase_S32:  Equine  97.6  0.0013 2.7E-08   62.8  13.2  116  154-308   111-229 (297)
 52 KOG3553 Tax interaction protei  97.5 7.3E-05 1.6E-09   60.2   3.3   37  388-424    56-92  (124)
 53 PF08192 Peptidase_S64:  Peptid  97.4  0.0015 3.3E-08   69.6  11.6  118  208-331   541-689 (695)
 54 PF03761 DUF316:  Domain of unk  97.3   0.018 3.9E-07   56.7  17.8  176  135-325    53-273 (282)
 55 COG3031 PulC Type II secretory  97.3 0.00067 1.4E-08   63.5   6.9   67  391-458   207-274 (275)
 56 PF05580 Peptidase_S55:  SpoIVB  97.2  0.0082 1.8E-07   56.0  13.2  161  154-323    19-215 (218)
 57 KOG3580 Tight junction protein  96.7  0.0031 6.7E-08   66.0   6.2   59  388-446   426-487 (1027)
 58 KOG3532 Predicted protein kina  96.6  0.0036 7.7E-08   66.5   6.4   51  389-439   396-446 (1051)
 59 PF10459 Peptidase_S46:  Peptid  96.6  0.0097 2.1E-07   65.5   9.4   24  155-178    47-70  (698)
 60 PF10459 Peptidase_S46:  Peptid  96.4  0.0055 1.2E-07   67.4   6.2   55  277-331   623-687 (698)
 61 PF00548 Peptidase_C3:  3C cyst  96.2   0.049 1.1E-06   49.7  10.6  148  136-307    13-170 (172)
 62 KOG3209 WW domain-containing p  95.9   0.011 2.5E-07   63.1   5.6   52  395-447   782-836 (984)
 63 PF02122 Peptidase_S39:  Peptid  95.8   0.048   1E-06   51.0   8.5  117  167-307    43-166 (203)
 64 KOG3552 FERM domain protein FR  95.7   0.013 2.9E-07   64.3   5.0   56  391-448    75-132 (1298)
 65 KOG3550 Receptor targeting pro  95.6   0.033 7.1E-07   48.6   6.1   57  389-446   113-172 (207)
 66 TIGR02860 spore_IV_B stage IV   95.6    0.18 3.9E-06   52.0  12.4   45  278-323   351-395 (402)
 67 PF00949 Peptidase_S7:  Peptida  95.3   0.036 7.7E-07   48.1   5.5   33  278-310    88-120 (132)
 68 COG0750 Predicted membrane-ass  95.3   0.064 1.4E-06   55.0   8.5   56  397-452   135-193 (375)
 69 PF09342 DUF1986:  Domain of un  95.3    0.56 1.2E-05   44.7  13.6   99  136-247    17-131 (267)
 70 PF00944 Peptidase_S3:  Alphavi  94.7   0.086 1.9E-06   45.3   6.0   39  279-317    98-136 (158)
 71 KOG3580 Tight junction protein  94.4   0.088 1.9E-06   55.5   6.5   73  381-457    30-104 (1027)
 72 KOG3542 cAMP-regulated guanine  94.1   0.041 8.8E-07   58.7   3.4   57  389-447   560-618 (1283)
 73 KOG3571 Dishevelled 3 and rela  93.0    0.15 3.2E-06   53.0   5.1   75  360-447   259-338 (626)
 74 KOG3209 WW domain-containing p  92.4    0.18   4E-06   54.3   5.1   56  390-447   922-980 (984)
 75 KOG3605 Beta amyloid precursor  92.1    0.32 6.8E-06   52.1   6.3  123  284-439   677-806 (829)
 76 KOG2921 Intramembrane metallop  92.0    0.17 3.7E-06   51.1   4.0   51  384-434   213-264 (484)
 77 KOG3834 Golgi reassembly stack  91.9    0.29 6.3E-06   50.0   5.6   71  389-460    13-86  (462)
 78 KOG3651 Protein kinase C, alph  90.6     0.5 1.1E-05   46.1   5.6   55  391-446    30-87  (429)
 79 KOG3606 Cell polarity protein   90.0    0.65 1.4E-05   44.6   5.7   57  390-447   193-252 (358)
 80 KOG3834 Golgi reassembly stack  89.7    0.57 1.2E-05   47.9   5.4   67  395-461   113-181 (462)
 81 KOG3549 Syntrophins (type gamm  88.7     0.6 1.3E-05   46.4   4.5   56  390-446    79-137 (505)
 82 KOG3551 Syntrophins (type beta  88.1     0.6 1.3E-05   47.1   4.2   58  389-447   108-170 (506)
 83 KOG0609 Calcium/calmodulin-dep  87.4    0.97 2.1E-05   47.6   5.4   55  392-447   147-204 (542)
 84 KOG0606 Microtubule-associated  86.5    0.93   2E-05   51.7   5.0   52  393-445   660-713 (1205)
 85 KOG1892 Actin filament-binding  86.4    0.85 1.8E-05   50.9   4.4   57  390-447   959-1018(1629)
 86 KOG3605 Beta amyloid precursor  83.0     2.3   5E-05   45.8   5.7   55  393-447   675-733 (829)
 87 PF03510 Peptidase_C24:  2C end  80.9     5.8 0.00013   33.0   6.2   53  159-230     3-55  (105)
 88 PF02395 Peptidase_S6:  Immunog  79.7      27 0.00059   39.4  13.0  163  157-330    67-266 (769)
 89 PF00947 Pico_P2A:  Picornaviru  78.3      10 0.00022   32.6   7.0   35  273-308    76-110 (127)
 90 PF02907 Peptidase_S29:  Hepati  71.0       9  0.0002   33.2   5.0  128  157-323    14-146 (148)
 91 PF01732 DUF31:  Putative pepti  69.1     3.4 7.4E-05   42.5   2.5   24  282-305   350-373 (374)
 92 PF11874 DUF3394:  Domain of un  64.3     7.3 0.00016   35.8   3.4   32  388-419   119-150 (183)
 93 KOG3938 RGS-GAIP interacting p  63.8      10 0.00022   36.6   4.3   55  393-447   151-209 (334)
 94 cd00600 Sm_like The eukaryotic  59.8      24 0.00053   25.8   5.1   33  186-218     6-38  (63)
 95 cd01726 LSm6 The eukaryotic Sm  57.2      24 0.00053   26.6   4.7   33  186-218    10-42  (67)
 96 PRK00737 small nuclear ribonuc  56.9      26 0.00057   26.9   4.9   33  186-218    14-46  (72)
 97 cd01731 archaeal_Sm1 The archa  56.5      27 0.00059   26.3   4.9   33  186-218    10-42  (68)
 98 cd01722 Sm_F The eukaryotic Sm  55.7      25 0.00054   26.6   4.5   33  186-218    11-43  (68)
 99 cd01732 LSm5 The eukaryotic Sm  55.5      25 0.00055   27.4   4.6   32  186-217    13-44  (76)
100 cd01730 LSm3 The eukaryotic Sm  54.9      23  0.0005   27.9   4.4   32  186-217    11-42  (82)
101 cd06168 LSm9 The eukaryotic Sm  52.4      34 0.00074   26.6   4.9   33  186-218    10-42  (75)
102 cd01719 Sm_G The eukaryotic Sm  51.9      36 0.00077   26.2   4.9   33  186-218    10-42  (72)
103 cd01717 Sm_B The eukaryotic Sm  51.6      32 0.00069   26.9   4.7   33  186-218    10-42  (79)
104 cd01729 LSm7 The eukaryotic Sm  51.5      34 0.00075   26.9   4.8   33  186-218    12-44  (81)
105 cd01728 LSm1 The eukaryotic Sm  50.4      38 0.00082   26.3   4.8   32  186-217    12-43  (74)
106 cd01720 Sm_D2 The eukaryotic S  49.2      36 0.00079   27.3   4.7   32  187-218    15-46  (87)
107 cd01735 LSm12_N LSm12 belongs   48.8      60  0.0013   24.2   5.4   33  187-219     7-39  (61)
108 cd01721 Sm_D3 The eukaryotic S  47.0      47   0.001   25.3   4.8   33  186-218    10-42  (70)
109 smart00651 Sm snRNP Sm protein  46.3      49  0.0011   24.5   4.8   33  186-218     8-40  (67)
110 PF05416 Peptidase_C37:  Southa  45.4 1.4E+02  0.0031   31.0   9.2  136  155-310   379-529 (535)
111 cd01727 LSm8 The eukaryotic Sm  44.2      50  0.0011   25.4   4.7   33  186-218     9-41  (74)
112 PF01423 LSM:  LSM domain ;  In  43.5      61  0.0013   24.0   5.0   34  186-219     8-41  (67)
113 COG1958 LSM1 Small nuclear rib  40.8      57  0.0012   25.4   4.6   33  187-219    18-50  (79)
114 PF00571 CBS:  CBS domain CBS d  40.8      29 0.00063   24.4   2.7   21  286-306    28-48  (57)
115 PF12381 Peptidase_C3G:  Tungro  40.6      31 0.00068   32.5   3.4   55  275-330   168-228 (231)
116 cd01723 LSm4 The eukaryotic Sm  37.2      84  0.0018   24.3   5.0   33  186-218    11-43  (76)
117 KOG1738 Membrane-associated gu  36.4      54  0.0012   35.7   4.9   37  391-427   225-262 (638)
118 PF02743 Cache_1:  Cache domain  36.0      47   0.001   25.5   3.4   32  291-332    19-50  (81)
119 cd01725 LSm2 The eukaryotic Sm  30.3 1.2E+02  0.0025   23.9   4.8   33  186-218    11-43  (81)
120 COG5233 GRH1 Peripheral Golgi   30.0      30 0.00066   34.4   1.7   30  395-424    67-96  (417)
121 cd01733 LSm10 The eukaryotic S  30.0 1.4E+02   0.003   23.3   5.1   32  187-218    20-51  (78)
122 PF14827 Cache_3:  Sensory doma  29.3      55  0.0012   27.3   3.0   18  291-308    94-111 (116)
123 cd01724 Sm_D1 The eukaryotic S  27.9 1.4E+02   0.003   24.0   4.9   33  186-218    11-43  (90)
124 PF14438 SM-ATX:  Ataxin 2 SM d  27.3 1.5E+02  0.0032   22.8   4.9   28  187-214    13-43  (77)
125 COG0260 PepB Leucyl aminopepti  23.6      55  0.0012   34.9   2.3   30  394-424   301-330 (485)

No 1  
>PRK10139 serine endoprotease; Provisional
Probab=100.00  E-value=3.2e-49  Score=411.28  Aligned_cols=296  Identities=33%  Similarity=0.567  Sum_probs=258.8

Q ss_pred             hHHHHHHHhCCceEEEEcccc-------------cccc-------ccCCceeEEEEEeC-CCeEEeccccccCCCCCCCC
Q 012318          126 TIANAAARVCPAVVNLSAPRE-------------FLGI-------LSGRGIGSGAIVDA-DGTILTCAHVVVDFHGSRAL  184 (466)
Q Consensus       126 ~~~~~~~~~~~SVV~I~~~~~-------------~~~~-------~~~~~~GSGfiI~~-~G~ILT~aHvv~~~~~~~~~  184 (466)
                      +++++++++.||||.|.+...             +++.       ....+.||||+|++ +||||||+|||.+.      
T Consensus        41 ~~~~~~~~~~pavV~i~~~~~~~~~~~~~~~~~~~f~~~~~~~~~~~~~~~GSG~ii~~~~g~IlTn~HVv~~a------  114 (455)
T PRK10139         41 SLAPMLEKVLPAVVSVRVEGTASQGQKIPEEFKKFFGDDLPDQPAQPFEGLGSGVIIDAAKGYVLTNNHVINQA------  114 (455)
T ss_pred             cHHHHHHHhCCcEEEEEEEEeecccccCchhHHHhccccCCccccccccceEEEEEEECCCCEEEeChHHhCCC------
Confidence            499999999999999986421             0110       01236899999985 69999999999986      


Q ss_pred             CCceEEEEeCCCcEEEEEEEEecCCCCEEEEEeCCCCCCCccccCCCCCCCCCCEEEEEecCCCCCCceEEeEEeeeecC
Q 012318          185 PKGKVDVTLQDGRTFEGTVLNADFHSDIAIVKINSKTPLPAAKLGTSSKLCPGDWVVAMGCPHSLQNTVTAGIVSCVDRK  264 (466)
Q Consensus       185 ~~~~i~V~~~~g~~~~a~vv~~d~~~DlAlLkv~~~~~~~~~~l~~s~~~~~G~~V~~iG~p~~~~~~~t~G~Vs~~~~~  264 (466)
                        ..+.|++.||+.|+|++++.|+.+||||||++....+++++|+++..++.|++|+++|+|.+...+++.|+|++..+.
T Consensus       115 --~~i~V~~~dg~~~~a~vvg~D~~~DlAvlkv~~~~~l~~~~lg~s~~~~~G~~V~aiG~P~g~~~tvt~GivS~~~r~  192 (455)
T PRK10139        115 --QKISIQLNDGREFDAKLIGSDDQSDIALLQIQNPSKLTQIAIADSDKLRVGDFAVAVGNPFGLGQTATSGIISALGRS  192 (455)
T ss_pred             --CEEEEEECCCCEEEEEEEEEcCCCCEEEEEecCCCCCceeEecCccccCCCCEEEEEecCCCCCCceEEEEEcccccc
Confidence              689999999999999999999999999999986678999999999999999999999999999999999999988765


Q ss_pred             ccCCCCCCccccEEEEcccCCCCCccceeecCCCeEEEEEEeEecC---CCeeeEEEeHHHHHHHHHHHHHcCCcccccc
Q 012318          265 SSDLGLGGMRREYLQTDCAINAGNSGGPLVNIDGEIVGINIMKVAA---ADGLSFAVPIDSAAKIIEQFKKNGWMHVEQK  341 (466)
Q Consensus       265 ~~~~~~~~~~~~~i~~~~~i~~G~SGGPlvd~~G~VVGI~~~~~~~---~~g~~~aip~~~i~~~l~~l~~~g~~~~~~~  341 (466)
                      ...  . .....++++|+.+++|||||||||.+|+||||+++....   ..+++|+||++.+++++++|+++|       
T Consensus       193 ~~~--~-~~~~~~iqtda~in~GnSGGpl~n~~G~vIGi~~~~~~~~~~~~gigfaIP~~~~~~v~~~l~~~g-------  262 (455)
T PRK10139        193 GLN--L-EGLENFIQTDASINRGNSGGALLNLNGELIGINTAILAPGGGSVGIGFAIPSNMARTLAQQLIDFG-------  262 (455)
T ss_pred             ccC--C-CCcceEEEECCccCCCCCcceEECCCCeEEEEEEEEEcCCCCccceEEEEEhHHHHHHHHHHhhcC-------
Confidence            221  1 123568999999999999999999999999999987642   357999999999999999999999       


Q ss_pred             ccccccccceeeeeeeeeecccccceeecCCHHHHHHhhccCCCCCCCCcceeecccCCCChhhhCCCCCCCEEEEECCE
Q 012318          342 VPLLWSTCKQVVILCRRVVRPWLGLKMLDLNDMIIAQLKERDPSFPNVKSGVLVPVVTPGSPAHLAGFLPSDVVIKFDGK  421 (466)
Q Consensus       342 ~~~~~~~~~~~~~~~~~~~~~~lG~~~~~~~~~~~~~~~~~~~~~~~~~~g~~V~~V~~~spA~~aGl~~GD~I~~ing~  421 (466)
                                      ++.++|||+.+.+++++..+.++..      ...|++|.+|.++|||+++|||+||+|++|||+
T Consensus       263 ----------------~v~r~~LGv~~~~l~~~~~~~lgl~------~~~Gv~V~~V~~~SpA~~AGL~~GDvIl~InG~  320 (455)
T PRK10139        263 ----------------EIKRGLLGIKGTEMSADIAKAFNLD------VQRGAFVSEVLPNSGSAKAGVKAGDIITSLNGK  320 (455)
T ss_pred             ----------------cccccceeEEEEECCHHHHHhcCCC------CCCceEEEEECCCChHHHCCCCCCCEEEEECCE
Confidence                            8889999999999999988877642      356999999999999999999999999999999


Q ss_pred             ecCCHHHHHHHHhc-CCCCeEEEEEEECCCeEEEEEEEecCC
Q 012318          422 PVQSITEIIEIMGD-RVGEPLKVVVQRANDQLVTLTVIPEEA  462 (466)
Q Consensus       422 ~v~~~~~~~~~l~~-~~g~~v~l~v~R~~g~~~~l~v~~~~~  462 (466)
                      +|.+|+++...+.. +.|+++.++|.| +|+.+++++++...
T Consensus       321 ~V~s~~dl~~~l~~~~~g~~v~l~V~R-~G~~~~l~v~~~~~  361 (455)
T PRK10139        321 PLNSFAELRSRIATTEPGTKVKLGLLR-NGKPLEVEVTLDTS  361 (455)
T ss_pred             ECCCHHHHHHHHHhcCCCCEEEEEEEE-CCEEEEEEEEECCC
Confidence            99999999988876 788999999999 89998888887543


No 2  
>TIGR02038 protease_degS periplasmic serine peptidase DegS. This family consists of the periplasmic serine protease DegS (HhoB), a shorter paralog of protease DO (HtrA, DegP) and DegQ (HhoA). It is found in E. coli and several other Proteobacteria of the gamma subdivision. It contains a trypsin domain and a single copy of the PDZ domain (in contrast to DegP with two copies). A critical role of DegS is to sense stress in the periplasm and partially degrade an inhibitor of sigma(E).
Probab=100.00  E-value=1.2e-48  Score=395.71  Aligned_cols=297  Identities=37%  Similarity=0.590  Sum_probs=258.1

Q ss_pred             hhHHHHHHHhCCceEEEEcccccc---ccccCCceeEEEEEeCCCeEEeccccccCCCCCCCCCCceEEEEeCCCcEEEE
Q 012318          125 DTIANAAARVCPAVVNLSAPREFL---GILSGRGIGSGAIVDADGTILTCAHVVVDFHGSRALPKGKVDVTLQDGRTFEG  201 (466)
Q Consensus       125 ~~~~~~~~~~~~SVV~I~~~~~~~---~~~~~~~~GSGfiI~~~G~ILT~aHvv~~~~~~~~~~~~~i~V~~~~g~~~~a  201 (466)
                      .++.++++++.||||.|.+.....   ......+.||||+|+++||||||+|||.++        ..+.|.+.||+.++|
T Consensus        45 ~~~~~~~~~~~psVV~I~~~~~~~~~~~~~~~~~~GSG~vi~~~G~IlTn~HVV~~~--------~~i~V~~~dg~~~~a  116 (351)
T TIGR02038        45 ISFNKAVRRAAPAVVNIYNRSISQNSLNQLSIQGLGSGVIMSKEGYILTNYHVIKKA--------DQIVVALQDGRKFEA  116 (351)
T ss_pred             hhHHHHHHhcCCcEEEEEeEeccccccccccccceEEEEEEeCCeEEEecccEeCCC--------CEEEEEECCCCEEEE
Confidence            369999999999999998753221   111234679999999999999999999885        689999999999999


Q ss_pred             EEEEecCCCCEEEEEeCCCCCCCccccCCCCCCCCCCEEEEEecCCCCCCceEEeEEeeeecCccCCCCCCccccEEEEc
Q 012318          202 TVLNADFHSDIAIVKINSKTPLPAAKLGTSSKLCPGDWVVAMGCPHSLQNTVTAGIVSCVDRKSSDLGLGGMRREYLQTD  281 (466)
Q Consensus       202 ~vv~~d~~~DlAlLkv~~~~~~~~~~l~~s~~~~~G~~V~~iG~p~~~~~~~t~G~Vs~~~~~~~~~~~~~~~~~~i~~~  281 (466)
                      +++++|+.+||||||++. ..+++++++++..++.|++|+++|||.+...+++.|+|+...+....   ......++++|
T Consensus       117 ~vv~~d~~~DlAvlkv~~-~~~~~~~l~~s~~~~~G~~V~aiG~P~~~~~s~t~GiIs~~~r~~~~---~~~~~~~iqtd  192 (351)
T TIGR02038       117 ELVGSDPLTDLAVLKIEG-DNLPTIPVNLDRPPHVGDVVLAIGNPYNLGQTITQGIISATGRNGLS---SVGRQNFIQTD  192 (351)
T ss_pred             EEEEecCCCCEEEEEecC-CCCceEeccCcCccCCCCEEEEEeCCCCCCCcEEEEEEEeccCcccC---CCCcceEEEEC
Confidence            999999999999999996 45888899888889999999999999999999999999988765321   11235689999


Q ss_pred             ccCCCCCccceeecCCCeEEEEEEeEecC-----CCeeeEEEeHHHHHHHHHHHHHcCCccccccccccccccceeeeee
Q 012318          282 CAINAGNSGGPLVNIDGEIVGINIMKVAA-----ADGLSFAVPIDSAAKIIEQFKKNGWMHVEQKVPLLWSTCKQVVILC  356 (466)
Q Consensus       282 ~~i~~G~SGGPlvd~~G~VVGI~~~~~~~-----~~g~~~aip~~~i~~~l~~l~~~g~~~~~~~~~~~~~~~~~~~~~~  356 (466)
                      +.+++|||||||||.+|+||||+++....     ..+++|+||++.+++++++++++|                      
T Consensus       193 a~i~~GnSGGpl~n~~G~vIGI~~~~~~~~~~~~~~g~~faIP~~~~~~vl~~l~~~g----------------------  250 (351)
T TIGR02038       193 AAINAGNSGGALINTNGELVGINTASFQKGGDEGGEGINFAIPIKLAHKIMGKIIRDG----------------------  250 (351)
T ss_pred             CccCCCCCcceEECCCCeEEEEEeeeecccCCCCccceEEEecHHHHHHHHHHHhhcC----------------------
Confidence            99999999999999999999999876432     257899999999999999999999                      


Q ss_pred             eeeecccccceeecCCHHHHHHhhccCCCCCCCCcceeecccCCCChhhhCCCCCCCEEEEECCEecCCHHHHHHHHhc-
Q 012318          357 RRVVRPWLGLKMLDLNDMIIAQLKERDPSFPNVKSGVLVPVVTPGSPAHLAGFLPSDVVIKFDGKPVQSITEIIEIMGD-  435 (466)
Q Consensus       357 ~~~~~~~lG~~~~~~~~~~~~~~~~~~~~~~~~~~g~~V~~V~~~spA~~aGl~~GD~I~~ing~~v~~~~~~~~~l~~-  435 (466)
                       ++.++|||+.+.++++...+.++..      ...|++|.+|.++|||+++||++||+|++|||++|.+++++.+++.. 
T Consensus       251 -~~~r~~lGv~~~~~~~~~~~~lgl~------~~~Gv~V~~V~~~spA~~aGL~~GDvI~~Ing~~V~s~~dl~~~l~~~  323 (351)
T TIGR02038       251 -RVIRGYIGVSGEDINSVVAQGLGLP------DLRGIVITGVDPNGPAARAGILVRDVILKYDGKDVIGAEELMDRIAET  323 (351)
T ss_pred             -cccceEeeeEEEECCHHHHHhcCCC------ccccceEeecCCCChHHHCCCCCCCEEEEECCEEcCCHHHHHHHHHhc
Confidence             7889999999999998887776542      23699999999999999999999999999999999999999998876 


Q ss_pred             CCCCeEEEEEEECCCeEEEEEEEecCCC
Q 012318          436 RVGEPLKVVVQRANDQLVTLTVIPEEAN  463 (466)
Q Consensus       436 ~~g~~v~l~v~R~~g~~~~l~v~~~~~~  463 (466)
                      +.|+++.++|.| +|+.+++++++.+.+
T Consensus       324 ~~g~~v~l~v~R-~g~~~~~~v~l~~~p  350 (351)
T TIGR02038       324 RPGSKVMVTVLR-QGKQLELPVTIDEKP  350 (351)
T ss_pred             CCCCEEEEEEEE-CCEEEEEEEEecCCC
Confidence            788999999999 899999999887654


No 3  
>PRK10898 serine endoprotease; Provisional
Probab=100.00  E-value=2.4e-48  Score=393.30  Aligned_cols=296  Identities=36%  Similarity=0.561  Sum_probs=255.4

Q ss_pred             hHHHHHHHhCCceEEEEcccccc---ccccCCceeEEEEEeCCCeEEeccccccCCCCCCCCCCceEEEEeCCCcEEEEE
Q 012318          126 TIANAAARVCPAVVNLSAPREFL---GILSGRGIGSGAIVDADGTILTCAHVVVDFHGSRALPKGKVDVTLQDGRTFEGT  202 (466)
Q Consensus       126 ~~~~~~~~~~~SVV~I~~~~~~~---~~~~~~~~GSGfiI~~~G~ILT~aHvv~~~~~~~~~~~~~i~V~~~~g~~~~a~  202 (466)
                      ++.++++++.||||.|.+.....   +.....+.||||+|+++||||||+|||.++        ..+.|++.||+.|+|+
T Consensus        46 ~~~~~~~~~~psvV~v~~~~~~~~~~~~~~~~~~GSGfvi~~~G~IlTn~HVv~~a--------~~i~V~~~dg~~~~a~  117 (353)
T PRK10898         46 SYNQAVRRAAPAVVNVYNRSLNSTSHNQLEIRTLGSGVIMDQRGYILTNKHVINDA--------DQIIVALQDGRVFEAL  117 (353)
T ss_pred             hHHHHHHHhCCcEEEEEeEeccccCcccccccceeeEEEEeCCeEEEecccEeCCC--------CEEEEEeCCCCEEEEE
Confidence            59999999999999999854321   111234689999999999999999999985        6899999999999999


Q ss_pred             EEEecCCCCEEEEEeCCCCCCCccccCCCCCCCCCCEEEEEecCCCCCCceEEeEEeeeecCccCCCCCCccccEEEEcc
Q 012318          203 VLNADFHSDIAIVKINSKTPLPAAKLGTSSKLCPGDWVVAMGCPHSLQNTVTAGIVSCVDRKSSDLGLGGMRREYLQTDC  282 (466)
Q Consensus       203 vv~~d~~~DlAlLkv~~~~~~~~~~l~~s~~~~~G~~V~~iG~p~~~~~~~t~G~Vs~~~~~~~~~~~~~~~~~~i~~~~  282 (466)
                      ++++|+.+||||||++. ..+++++++++..++.|++|+++|||.+...+++.|+|++..+....   ......++++|+
T Consensus       118 vv~~d~~~DlAvl~v~~-~~l~~~~l~~~~~~~~G~~V~aiG~P~g~~~~~t~Giis~~~r~~~~---~~~~~~~iqtda  193 (353)
T PRK10898        118 LVGSDSLTDLAVLKINA-TNLPVIPINPKRVPHIGDVVLAIGNPYNLGQTITQGIISATGRIGLS---PTGRQNFLQTDA  193 (353)
T ss_pred             EEEEcCCCCEEEEEEcC-CCCCeeeccCcCcCCCCCEEEEEeCCCCcCCCcceeEEEeccccccC---CccccceEEecc
Confidence            99999999999999985 46888999888889999999999999998889999999987764321   112246899999


Q ss_pred             cCCCCCccceeecCCCeEEEEEEeEecC------CCeeeEEEeHHHHHHHHHHHHHcCCccccccccccccccceeeeee
Q 012318          283 AINAGNSGGPLVNIDGEIVGINIMKVAA------ADGLSFAVPIDSAAKIIEQFKKNGWMHVEQKVPLLWSTCKQVVILC  356 (466)
Q Consensus       283 ~i~~G~SGGPlvd~~G~VVGI~~~~~~~------~~g~~~aip~~~i~~~l~~l~~~g~~~~~~~~~~~~~~~~~~~~~~  356 (466)
                      .+++|||||||+|.+|+||||+++....      ..+++|+||++.+++++++++++|                      
T Consensus       194 ~i~~GnSGGPl~n~~G~vvGI~~~~~~~~~~~~~~~g~~faIP~~~~~~~~~~l~~~G----------------------  251 (353)
T PRK10898        194 SINHGNSGGALVNSLGELMGINTLSFDKSNDGETPEGIGFAIPTQLATKIMDKLIRDG----------------------  251 (353)
T ss_pred             ccCCCCCcceEECCCCeEEEEEEEEecccCCCCcccceEEEEchHHHHHHHHHHhhcC----------------------
Confidence            9999999999999999999999976542      247899999999999999999999                      


Q ss_pred             eeeecccccceeecCCHHHHHHhhccCCCCCCCCcceeecccCCCChhhhCCCCCCCEEEEECCEecCCHHHHHHHHhc-
Q 012318          357 RRVVRPWLGLKMLDLNDMIIAQLKERDPSFPNVKSGVLVPVVTPGSPAHLAGFLPSDVVIKFDGKPVQSITEIIEIMGD-  435 (466)
Q Consensus       357 ~~~~~~~lG~~~~~~~~~~~~~~~~~~~~~~~~~~g~~V~~V~~~spA~~aGl~~GD~I~~ing~~v~~~~~~~~~l~~-  435 (466)
                       ++.++|||+.+.++++.....++.      ....|++|.+|.++|||+++||++||+|++|||++|.+++++.+.+.. 
T Consensus       252 -~~~~~~lGi~~~~~~~~~~~~~~~------~~~~Gv~V~~V~~~spA~~aGL~~GDvI~~Ing~~V~s~~~l~~~l~~~  324 (353)
T PRK10898        252 -RVIRGYIGIGGREIAPLHAQGGGI------DQLQGIVVNEVSPDGPAAKAGIQVNDLIISVNNKPAISALETMDQVAEI  324 (353)
T ss_pred             -cccccccceEEEECCHHHHHhcCC------CCCCeEEEEEECCCChHHHcCCCCCCEEEEECCEEcCCHHHHHHHHHhc
Confidence             888999999999887765544322      234799999999999999999999999999999999999999888876 


Q ss_pred             CCCCeEEEEEEECCCeEEEEEEEecCCC
Q 012318          436 RVGEPLKVVVQRANDQLVTLTVIPEEAN  463 (466)
Q Consensus       436 ~~g~~v~l~v~R~~g~~~~l~v~~~~~~  463 (466)
                      +.|+++.++|.| +|+.+++.+++.+.+
T Consensus       325 ~~g~~v~l~v~R-~g~~~~~~v~l~~~p  351 (353)
T PRK10898        325 RPGSVIPVVVMR-DDKQLTLQVTIQEYP  351 (353)
T ss_pred             CCCCEEEEEEEE-CCEEEEEEEEeccCC
Confidence            788999999999 899999999887665


No 4  
>TIGR02037 degP_htrA_DO periplasmic serine protease, Do/DegQ family. This family consists of a set of proteins variously designated DegP, heat shock protein HtrA, and protease DO. The ortholog in Pseudomonas aeruginosa is designated MucD and is found in an operon that controls mucoid phenotype. This family also includes the DegQ (HhoA) paralog in E. coli which can rescue a DegP mutant, but not the smaller DegS paralog, which cannot. Members of this family are located in the periplasm and have separable functions as both protease and chaperone. Members have a trypsin domain and two copies of a PDZ domain. This protein protects bacteria from thermal and other stresses and may be important for the survival of bacterial pathogens. The chaperone function is dominant at low temperatures, whereas the proteolytic activity is turned on at elevated temperatures.
Probab=100.00  E-value=1.7e-46  Score=390.91  Aligned_cols=295  Identities=41%  Similarity=0.685  Sum_probs=257.3

Q ss_pred             HHHHHHHhCCceEEEEcccc----------------cccc-----------ccCCceeEEEEEeCCCeEEeccccccCCC
Q 012318          127 IANAAARVCPAVVNLSAPRE----------------FLGI-----------LSGRGIGSGAIVDADGTILTCAHVVVDFH  179 (466)
Q Consensus       127 ~~~~~~~~~~SVV~I~~~~~----------------~~~~-----------~~~~~~GSGfiI~~~G~ILT~aHvv~~~~  179 (466)
                      +.++++++.||||.|.+...                +++.           ....+.||||+|+++|+||||+||+.++ 
T Consensus         3 ~~~~~~~~~p~vv~i~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~GSGfii~~~G~IlTn~Hvv~~~-   81 (428)
T TIGR02037         3 FAPLVEKVAPAVVNISVEGTVKRRNRPPALPPFFRQFFGDDMPNFPRQQRERKVRGLGSGVIISADGYILTNNHVVDGA-   81 (428)
T ss_pred             HHHHHHHhCCceEEEEEEEEecccCCCcccchhHHHhhcccccCcccccccccccceeeEEEECCCCEEEEcHHHcCCC-
Confidence            78999999999999986420                0000           0124679999999999999999999986 


Q ss_pred             CCCCCCCceEEEEeCCCcEEEEEEEEecCCCCEEEEEeCCCCCCCccccCCCCCCCCCCEEEEEecCCCCCCceEEeEEe
Q 012318          180 GSRALPKGKVDVTLQDGRTFEGTVLNADFHSDIAIVKINSKTPLPAAKLGTSSKLCPGDWVVAMGCPHSLQNTVTAGIVS  259 (466)
Q Consensus       180 ~~~~~~~~~i~V~~~~g~~~~a~vv~~d~~~DlAlLkv~~~~~~~~~~l~~s~~~~~G~~V~~iG~p~~~~~~~t~G~Vs  259 (466)
                             ..+.|.+.+++.++|++++.|+.+|||||+++....++++.|+++..++.|++|+++|||.+...+++.|+|+
T Consensus        82 -------~~i~V~~~~~~~~~a~vv~~d~~~DlAllkv~~~~~~~~~~l~~~~~~~~G~~v~aiG~p~g~~~~~t~G~vs  154 (428)
T TIGR02037        82 -------DEITVTLSDGREFKAKLVGKDPRTDIAVLKIDAKKNLPVIKLGDSDKLRVGDWVLAIGNPFGLGQTVTSGIVS  154 (428)
T ss_pred             -------CeEEEEeCCCCEEEEEEEEecCCCCEEEEEecCCCCceEEEccCCCCCCCCCEEEEEECCCcCCCcEEEEEEE
Confidence                   6899999999999999999999999999999976679999999888899999999999999999999999999


Q ss_pred             eeecCccCCCCCCccccEEEEcccCCCCCccceeecCCCeEEEEEEeEec---CCCeeeEEEeHHHHHHHHHHHHHcCCc
Q 012318          260 CVDRKSSDLGLGGMRREYLQTDCAINAGNSGGPLVNIDGEIVGINIMKVA---AADGLSFAVPIDSAAKIIEQFKKNGWM  336 (466)
Q Consensus       260 ~~~~~~~~~~~~~~~~~~i~~~~~i~~G~SGGPlvd~~G~VVGI~~~~~~---~~~g~~~aip~~~i~~~l~~l~~~g~~  336 (466)
                      ...+....   ...+..++++|+.+++|+|||||||.+|+||||++....   ...+++|+||++.+++++++|+++|  
T Consensus       155 ~~~~~~~~---~~~~~~~i~tda~i~~GnSGGpl~n~~G~viGI~~~~~~~~g~~~g~~faiP~~~~~~~~~~l~~~g--  229 (428)
T TIGR02037       155 ALGRSGLG---IGDYENFIQTDAAINPGNSGGPLVNLRGEVIGINTAIYSPSGGNVGIGFAIPSNMAKNVVDQLIEGG--  229 (428)
T ss_pred             ecccCccC---CCCccceEEECCCCCCCCCCCceECCCCeEEEEEeEEEcCCCCccceEEEEEhHHHHHHHHHHHhcC--
Confidence            87765311   122346899999999999999999999999999988654   2457899999999999999999999  


Q ss_pred             cccccccccccccceeeeeeeeeecccccceeecCCHHHHHHhhccCCCCCCCCcceeecccCCCChhhhCCCCCCCEEE
Q 012318          337 HVEQKVPLLWSTCKQVVILCRRVVRPWLGLKMLDLNDMIIAQLKERDPSFPNVKSGVLVPVVTPGSPAHLAGFLPSDVVI  416 (466)
Q Consensus       337 ~~~~~~~~~~~~~~~~~~~~~~~~~~~lG~~~~~~~~~~~~~~~~~~~~~~~~~~g~~V~~V~~~spA~~aGl~~GD~I~  416 (466)
                                           ++.++|||+.+..++++.++.++..      ...|++|.+|.++|||+++||++||+|+
T Consensus       230 ---------------------~~~~~~lGi~~~~~~~~~~~~lgl~------~~~Gv~V~~V~~~spA~~aGL~~GDvI~  282 (428)
T TIGR02037       230 ---------------------KVQRGWLGVTIQEVTSDLAKSLGLE------KQRGALVAQVLPGSPAEKAGLKAGDVIL  282 (428)
T ss_pred             ---------------------cCcCCcCceEeecCCHHHHHHcCCC------CCCceEEEEccCCCChHHcCCCCCCEEE
Confidence                                 7889999999999999988888642      2479999999999999999999999999


Q ss_pred             EECCEecCCHHHHHHHHhc-CCCCeEEEEEEECCCeEEEEEEEecCC
Q 012318          417 KFDGKPVQSITEIIEIMGD-RVGEPLKVVVQRANDQLVTLTVIPEEA  462 (466)
Q Consensus       417 ~ing~~v~~~~~~~~~l~~-~~g~~v~l~v~R~~g~~~~l~v~~~~~  462 (466)
                      +|||++|.++.++..++.. ..|+.++++|.| +|+.+++++++...
T Consensus       283 ~Vng~~i~~~~~~~~~l~~~~~g~~v~l~v~R-~g~~~~~~v~l~~~  328 (428)
T TIGR02037       283 SVNGKPISSFADLRRAIGTLKPGKKVTLGILR-KGKEKTITVTLGAS  328 (428)
T ss_pred             EECCEEcCCHHHHHHHHHhcCCCCEEEEEEEE-CCEEEEEEEEECcC
Confidence            9999999999999988876 678999999999 89998888887543


No 5  
>PRK10942 serine endoprotease; Provisional
Probab=100.00  E-value=2.3e-46  Score=391.71  Aligned_cols=295  Identities=35%  Similarity=0.566  Sum_probs=256.1

Q ss_pred             hHHHHHHHhCCceEEEEccccc--------------ccc---c--------------------------cCCceeEEEEE
Q 012318          126 TIANAAARVCPAVVNLSAPREF--------------LGI---L--------------------------SGRGIGSGAIV  162 (466)
Q Consensus       126 ~~~~~~~~~~~SVV~I~~~~~~--------------~~~---~--------------------------~~~~~GSGfiI  162 (466)
                      +++++++++.||||.|.+....              ++.   +                          ...+.||||+|
T Consensus        39 ~~~~~~~~~~pavv~i~~~~~~~~~~~~~~~~~~~ff~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~GSG~ii  118 (473)
T PRK10942         39 SLAPMLEKVMPSVVSINVEGSTTVNTPRMPRQFQQFFGDNSPFCQEGSPFQSSPFCQGGQGGNGGGQQQKFMALGSGVII  118 (473)
T ss_pred             cHHHHHHHhCCceEEEEEEEeccccCCCCChhHHHhhcccccccccccccccccccccccccccccccccccceEEEEEE
Confidence            4999999999999999863310              110   0                          01357999999


Q ss_pred             eC-CCeEEeccccccCCCCCCCCCCceEEEEeCCCcEEEEEEEEecCCCCEEEEEeCCCCCCCccccCCCCCCCCCCEEE
Q 012318          163 DA-DGTILTCAHVVVDFHGSRALPKGKVDVTLQDGRTFEGTVLNADFHSDIAIVKINSKTPLPAAKLGTSSKLCPGDWVV  241 (466)
Q Consensus       163 ~~-~G~ILT~aHvv~~~~~~~~~~~~~i~V~~~~g~~~~a~vv~~d~~~DlAlLkv~~~~~~~~~~l~~s~~~~~G~~V~  241 (466)
                      ++ +||||||+||+.+.        ..+.|++.||+.|+|++++.|+.+||||||++...++++++|+++..+++|++|+
T Consensus       119 ~~~~G~IlTn~HVv~~a--------~~i~V~~~dg~~~~a~vv~~D~~~DlAvlki~~~~~l~~~~lg~s~~l~~G~~V~  190 (473)
T PRK10942        119 DADKGYVVTNNHVVDNA--------TKIKVQLSDGRKFDAKVVGKDPRSDIALIQLQNPKNLTAIKMADSDALRVGDYTV  190 (473)
T ss_pred             ECCCCEEEeChhhcCCC--------CEEEEEECCCCEEEEEEEEecCCCCEEEEEecCCCCCceeEecCccccCCCCEEE
Confidence            96 59999999999986        6899999999999999999999999999999866789999999999999999999


Q ss_pred             EEecCCCCCCceEEeEEeeeecCccCCCCCCccccEEEEcccCCCCCccceeecCCCeEEEEEEeEecC---CCeeeEEE
Q 012318          242 AMGCPHSLQNTVTAGIVSCVDRKSSDLGLGGMRREYLQTDCAINAGNSGGPLVNIDGEIVGINIMKVAA---ADGLSFAV  318 (466)
Q Consensus       242 ~iG~p~~~~~~~t~G~Vs~~~~~~~~~~~~~~~~~~i~~~~~i~~G~SGGPlvd~~G~VVGI~~~~~~~---~~g~~~ai  318 (466)
                      ++|+|.+...+++.|+|++..+....  . ..+..++++|+.+++|+|||||+|.+|+||||+++....   ..+.+|+|
T Consensus       191 aiG~P~g~~~tvt~GiVs~~~r~~~~--~-~~~~~~iqtda~i~~GnSGGpL~n~~GeviGI~t~~~~~~g~~~g~gfaI  267 (473)
T PRK10942        191 AIGNPYGLGETVTSGIVSALGRSGLN--V-ENYENFIQTDAAINRGNSGGALVNLNGELIGINTAILAPDGGNIGIGFAI  267 (473)
T ss_pred             EEcCCCCCCcceeEEEEEEeecccCC--c-ccccceEEeccccCCCCCcCccCCCCCeEEEEEEEEEcCCCCcccEEEEE
Confidence            99999999999999999988765211  1 123468999999999999999999999999999987642   34689999


Q ss_pred             eHHHHHHHHHHHHHcCCccccccccccccccceeeeeeeeeecccccceeecCCHHHHHHhhccCCCCCCCCcceeeccc
Q 012318          319 PIDSAAKIIEQFKKNGWMHVEQKVPLLWSTCKQVVILCRRVVRPWLGLKMLDLNDMIIAQLKERDPSFPNVKSGVLVPVV  398 (466)
Q Consensus       319 p~~~i~~~l~~l~~~g~~~~~~~~~~~~~~~~~~~~~~~~~~~~~lG~~~~~~~~~~~~~~~~~~~~~~~~~~g~~V~~V  398 (466)
                      |++.+++++++|+++|                       ++.++|||+.+.++++++.+.++..      ...|++|.+|
T Consensus       268 P~~~~~~v~~~l~~~g-----------------------~v~rg~lGv~~~~l~~~~a~~~~l~------~~~GvlV~~V  318 (473)
T PRK10942        268 PSNMVKNLTSQMVEYG-----------------------QVKRGELGIMGTELNSELAKAMKVD------AQRGAFVSQV  318 (473)
T ss_pred             EHHHHHHHHHHHHhcc-----------------------ccccceeeeEeeecCHHHHHhcCCC------CCCceEEEEE
Confidence            9999999999999999                       8889999999999999988777542      3579999999


Q ss_pred             CCCChhhhCCCCCCCEEEEECCEecCCHHHHHHHHhc-CCCCeEEEEEEECCCeEEEEEEEecC
Q 012318          399 TPGSPAHLAGFLPSDVVIKFDGKPVQSITEIIEIMGD-RVGEPLKVVVQRANDQLVTLTVIPEE  461 (466)
Q Consensus       399 ~~~spA~~aGl~~GD~I~~ing~~v~~~~~~~~~l~~-~~g~~v~l~v~R~~g~~~~l~v~~~~  461 (466)
                      .++|||+++||++||+|++|||++|.+++++...+.. ..|++++++|.| +|+.+++.+++..
T Consensus       319 ~~~SpA~~AGL~~GDvIl~InG~~V~s~~dl~~~l~~~~~g~~v~l~v~R-~G~~~~v~v~l~~  381 (473)
T PRK10942        319 LPNSSAAKAGIKAGDVITSLNGKPISSFAALRAQVGTMPVGSKLTLGLLR-DGKPVNVNVELQQ  381 (473)
T ss_pred             CCCChHHHcCCCCCCEEEEECCEECCCHHHHHHHHHhcCCCCEEEEEEEE-CCeEEEEEEEeCc
Confidence            9999999999999999999999999999999988876 678899999999 8998888887754


No 6  
>COG0265 DegQ Trypsin-like serine proteases, typically periplasmic, contain C-terminal PDZ domain [Posttranslational modification, protein turnover, chaperones]
Probab=100.00  E-value=3e-36  Score=305.96  Aligned_cols=295  Identities=39%  Similarity=0.627  Sum_probs=256.8

Q ss_pred             hhHHHHHHHhCCceEEEEccccccc---------cccCCceeEEEEEeCCCeEEeccccccCCCCCCCCCCceEEEEeCC
Q 012318          125 DTIANAAARVCPAVVNLSAPREFLG---------ILSGRGIGSGAIVDADGTILTCAHVVVDFHGSRALPKGKVDVTLQD  195 (466)
Q Consensus       125 ~~~~~~~~~~~~SVV~I~~~~~~~~---------~~~~~~~GSGfiI~~~G~ILT~aHvv~~~~~~~~~~~~~i~V~~~~  195 (466)
                      ..+..+++++.|+||.+........         .....+.||||+++.+|||+|+.|++.++        .++.+.+.|
T Consensus        33 ~~~~~~~~~~~~~vV~~~~~~~~~~~~~~~~~~~~~~~~~~gSg~i~~~~g~ivTn~hVi~~a--------~~i~v~l~d  104 (347)
T COG0265          33 LSFATAVEKVAPAVVSIATGLTAKLRSFFPSDPPLRSAEGLGSGFIISSDGYIVTNNHVIAGA--------EEITVTLAD  104 (347)
T ss_pred             cCHHHHHHhcCCcEEEEEeeeeecchhcccCCcccccccccccEEEEcCCeEEEecceecCCc--------ceEEEEeCC
Confidence            4699999999999999987542211         00014789999999999999999999985        689999999


Q ss_pred             CcEEEEEEEEecCCCCEEEEEeCCCCCCCccccCCCCCCCCCCEEEEEecCCCCCCceEEeEEeeeecCccCCCCCCccc
Q 012318          196 GRTFEGTVLNADFHSDIAIVKINSKTPLPAAKLGTSSKLCPGDWVVAMGCPHSLQNTVTAGIVSCVDRKSSDLGLGGMRR  275 (466)
Q Consensus       196 g~~~~a~vv~~d~~~DlAlLkv~~~~~~~~~~l~~s~~~~~G~~V~~iG~p~~~~~~~t~G~Vs~~~~~~~~~~~~~~~~  275 (466)
                      |+.+++++++.|+..|+|+|+++....++.+.++++..++.|++++++|+|+++..+++.|+++...+.  ..+......
T Consensus       105 g~~~~a~~vg~d~~~dlavlki~~~~~~~~~~~~~s~~l~vg~~v~aiGnp~g~~~tvt~Givs~~~r~--~v~~~~~~~  182 (347)
T COG0265         105 GREVPAKLVGKDPISDLAVLKIDGAGGLPVIALGDSDKLRVGDVVVAIGNPFGLGQTVTSGIVSALGRT--GVGSAGGYV  182 (347)
T ss_pred             CCEEEEEEEecCCccCEEEEEeccCCCCceeeccCCCCcccCCEEEEecCCCCcccceeccEEeccccc--cccCccccc
Confidence            999999999999999999999997544888899999999999999999999999999999999998886  222212256


Q ss_pred             cEEEEcccCCCCCccceeecCCCeEEEEEEeEecCCC---eeeEEEeHHHHHHHHHHHHHcCCcccccccccccccccee
Q 012318          276 EYLQTDCAINAGNSGGPLVNIDGEIVGINIMKVAAAD---GLSFAVPIDSAAKIIEQFKKNGWMHVEQKVPLLWSTCKQV  352 (466)
Q Consensus       276 ~~i~~~~~i~~G~SGGPlvd~~G~VVGI~~~~~~~~~---g~~~aip~~~i~~~l~~l~~~g~~~~~~~~~~~~~~~~~~  352 (466)
                      .++|+|+.+++|+||||++|.+|++|||++.......   +++|++|++.+..++.+++++|                  
T Consensus       183 ~~IqtdAain~gnsGgpl~n~~g~~iGint~~~~~~~~~~gigfaiP~~~~~~v~~~l~~~G------------------  244 (347)
T COG0265         183 NFIQTDAAINPGNSGGPLVNIDGEVVGINTAIIAPSGGSSGIGFAIPVNLVAPVLDELISKG------------------  244 (347)
T ss_pred             chhhcccccCCCCCCCceEcCCCcEEEEEEEEecCCCCcceeEEEecHHHHHHHHHHHHHcC------------------
Confidence            7899999999999999999999999999999876543   5899999999999999999999                  


Q ss_pred             eeeeeeeecccccceeecCCHHHHHHhhccCCCCCCCCcceeecccCCCChhhhCCCCCCCEEEEECCEecCCHHHHHHH
Q 012318          353 VILCRRVVRPWLGLKMLDLNDMIIAQLKERDPSFPNVKSGVLVPVVTPGSPAHLAGFLPSDVVIKFDGKPVQSITEIIEI  432 (466)
Q Consensus       353 ~~~~~~~~~~~lG~~~~~~~~~~~~~~~~~~~~~~~~~~g~~V~~V~~~spA~~aGl~~GD~I~~ing~~v~~~~~~~~~  432 (466)
                           ++.++|+|+.+..++.+..  ++     + ....|++|..|.+++||+++|++.||+|+++||+++.+..++...
T Consensus       245 -----~v~~~~lgv~~~~~~~~~~--~g-----~-~~~~G~~V~~v~~~spa~~agi~~Gdii~~vng~~v~~~~~l~~~  311 (347)
T COG0265         245 -----KVVRGYLGVIGEPLTADIA--LG-----L-PVAAGAVVLGVLPGSPAAKAGIKAGDIITAVNGKPVASLSDLVAA  311 (347)
T ss_pred             -----CccccccceEEEEcccccc--cC-----C-CCCCceEEEecCCCChHHHcCCCCCCEEEEECCEEccCHHHHHHH
Confidence                 8899999999988776554  22     2 246789999999999999999999999999999999999999988


Q ss_pred             Hhc-CCCCeEEEEEEECCCeEEEEEEEecC
Q 012318          433 MGD-RVGEPLKVVVQRANDQLVTLTVIPEE  461 (466)
Q Consensus       433 l~~-~~g~~v~l~v~R~~g~~~~l~v~~~~  461 (466)
                      +.. .+|+.+.+++.| +|+..++.++..+
T Consensus       312 v~~~~~g~~v~~~~~r-~g~~~~~~v~l~~  340 (347)
T COG0265         312 VASNRPGDEVALKLLR-GGKERELAVTLGD  340 (347)
T ss_pred             HhccCCCCEEEEEEEE-CCEEEEEEEEecC
Confidence            876 679999999999 7999999998876


No 7  
>KOG1320 consensus Serine protease [Posttranslational modification, protein turnover, chaperones]
Probab=99.96  E-value=5.1e-28  Score=246.23  Aligned_cols=327  Identities=32%  Similarity=0.429  Sum_probs=258.2

Q ss_pred             chhHHHHHHHhCCceEEEEcccccccc------ccCCceeEEEEEeCCCeEEeccccccCCCCCCCCC---CceEEEEeC
Q 012318          124 RDTIANAAARVCPAVVNLSAPREFLGI------LSGRGIGSGAIVDADGTILTCAHVVVDFHGSRALP---KGKVDVTLQ  194 (466)
Q Consensus       124 ~~~~~~~~~~~~~SVV~I~~~~~~~~~------~~~~~~GSGfiI~~~G~ILT~aHvv~~~~~~~~~~---~~~i~V~~~  194 (466)
                      ..-++.+.++-.+++|.|....-..+.      .-....||||+++.+|+++||+||+..........   -..+.+...
T Consensus       127 ~~~v~~~~~~cd~Avv~Ie~~~f~~~~~~~e~~~ip~l~~S~~Vv~gd~i~VTnghV~~~~~~~y~~~~~~l~~vqi~aa  206 (473)
T KOG1320|consen  127 KAFVAAVFEECDLAVVYIESEEFWKGMNPFELGDIPSLNGSGFVVGGDGIIVTNGHVVRVEPRIYAHSSTVLLRVQIDAA  206 (473)
T ss_pred             hhhHHHhhhcccceEEEEeeccccCCCcccccCCCcccCccEEEEcCCcEEEEeeEEEEEEeccccCCCcceeeEEEEEe
Confidence            456888999999999999963322111      13456799999999999999999997643221111   123566665


Q ss_pred             CC--cEEEEEEEEecCCCCEEEEEeCCCC-CCCccccCCCCCCCCCCEEEEEecCCCCCCceEEeEEeeeecCccCCCCC
Q 012318          195 DG--RTFEGTVLNADFHSDIAIVKINSKT-PLPAAKLGTSSKLCPGDWVVAMGCPHSLQNTVTAGIVSCVDRKSSDLGLG  271 (466)
Q Consensus       195 ~g--~~~~a~vv~~d~~~DlAlLkv~~~~-~~~~~~l~~s~~~~~G~~V~~iG~p~~~~~~~t~G~Vs~~~~~~~~~~~~  271 (466)
                      +|  ..+++.+.+.|+..|+|+++++.+. .+++++++.+..+..|+++..+|.|++..++.+.|+++...|.....+..
T Consensus       207 ~~~~~s~ep~i~g~d~~~gvA~l~ik~~~~i~~~i~~~~~~~~~~G~~~~a~~~~f~~~nt~t~g~vs~~~R~~~~lg~~  286 (473)
T KOG1320|consen  207 IGPGNSGEPVIVGVDKVAGVAFLKIKTPENILYVIPLGVSSHFRTGVEVSAIGNGFGLLNTLTQGMVSGQLRKSFKLGLE  286 (473)
T ss_pred             ecCCccCCCeEEccccccceEEEEEecCCcccceeecceeeeecccceeeccccCceeeeeeeecccccccccccccCcc
Confidence            55  8999999999999999999997653 37888898889999999999999999999999999999998887665554


Q ss_pred             --CccccEEEEcccCCCCCccceeecCCCeEEEEEEeEecC---CCeeeEEEeHHHHHHHHHHHHHcCCccccccccccc
Q 012318          272 --GMRREYLQTDCAINAGNSGGPLVNIDGEIVGINIMKVAA---ADGLSFAVPIDSAAKIIEQFKKNGWMHVEQKVPLLW  346 (466)
Q Consensus       272 --~~~~~~i~~~~~i~~G~SGGPlvd~~G~VVGI~~~~~~~---~~g~~~aip~~~i~~~l~~l~~~g~~~~~~~~~~~~  346 (466)
                        ....+++++++.++.|+||||++|.+|++||+++.....   ..+.+|++|.+.++.++.+..+.. +.. ...    
T Consensus       287 ~g~~i~~~~qtd~ai~~~nsg~~ll~~DG~~IgVn~~~~~ri~~~~~iSf~~p~d~vl~~v~r~~e~~-~~l-r~~----  360 (473)
T KOG1320|consen  287 TGVLISKINQTDAAINPGNSGGPLLNLDGEVIGVNTRKVTRIGFSHGISFKIPIDTVLVIVLRLGEFQ-ISL-RPV----  360 (473)
T ss_pred             cceeeeeecccchhhhcccCCCcEEEecCcEeeeeeeeeEEeeccccceeccCchHhhhhhhhhhhhc-eee-ccc----
Confidence              566789999999999999999999999999998887542   357899999999999988886544 000 000    


Q ss_pred             cccceeeeeeeeeecccccceeecCCHHHHHHhhccCCCCC-CCCcceeecccCCCChhhhCCCCCCCEEEEECCEecCC
Q 012318          347 STCKQVVILCRRVVRPWLGLKMLDLNDMIIAQLKERDPSFP-NVKSGVLVPVVTPGSPAHLAGFLPSDVVIKFDGKPVQS  425 (466)
Q Consensus       347 ~~~~~~~~~~~~~~~~~lG~~~~~~~~~~~~~~~~~~~~~~-~~~~g~~V~~V~~~spA~~aGl~~GD~I~~ing~~v~~  425 (466)
                      +.        ..+.+.|+|..+..+...+......+.+.++ ....+++|..|.+++++...++++||+|++|||++|++
T Consensus       361 ~~--------~~p~~~~~g~~s~~i~~g~vf~~~~~~~~~~~~~~q~v~is~Vlp~~~~~~~~~~~g~~V~~vng~~V~n  432 (473)
T KOG1320|consen  361 KP--------LVPVHQYIGLPSYYIFAGLVFVPLTKSYIFPSGVVQLVLVSQVLPGSINGGYGLKPGDQVVKVNGKPVKN  432 (473)
T ss_pred             cC--------cccccccCCceeEEEecceEEeecCCCccccccceeEEEEEEeccCCCcccccccCCCEEEEECCEEeec
Confidence            00        0234679999888888777666655555554 34468999999999999999999999999999999999


Q ss_pred             HHHHHHHHhc-CCCCeEEEEEEECCCeEEEEEEEecCCCCC
Q 012318          426 ITEIIEIMGD-RVGEPLKVVVQRANDQLVTLTVIPEEANPD  465 (466)
Q Consensus       426 ~~~~~~~l~~-~~g~~v~l~v~R~~g~~~~l~v~~~~~~~~  465 (466)
                      ..++.+++.. ..++++.+...| +.+..++.+.++...+.
T Consensus       433 ~~~l~~~i~~~~~~~~v~vl~~~-~~e~~tl~Il~~~~~p~  472 (473)
T KOG1320|consen  433 LKHLYELIEECSTEDKVAVLDRR-SAEDATLEILPEHKIPS  472 (473)
T ss_pred             hHHHHHHHHhcCcCceEEEEEec-CccceeEEecccccCCC
Confidence            9999999987 556677777777 77889999988876554


No 8  
>KOG1421 consensus Predicted signaling-associated protein (contains a PDZ domain) [General function prediction only]
Probab=99.94  E-value=3.6e-25  Score=227.13  Aligned_cols=304  Identities=22%  Similarity=0.323  Sum_probs=243.6

Q ss_pred             hhHHHHHHHhCCceEEEEccc--cccccccCCceeEEEEEeCC-CeEEeccccccCCCCCCCCCCceEEEEeCCCcEEEE
Q 012318          125 DTIANAAARVCPAVVNLSAPR--EFLGILSGRGIGSGAIVDAD-GTILTCAHVVVDFHGSRALPKGKVDVTLQDGRTFEG  201 (466)
Q Consensus       125 ~~~~~~~~~~~~SVV~I~~~~--~~~~~~~~~~~GSGfiI~~~-G~ILT~aHvv~~~~~~~~~~~~~i~V~~~~g~~~~a  201 (466)
                      .+|...+.++-+|||.|....  .++-...+...||||++++. |+||||+|++...       .-.-.+.|.+....+.
T Consensus        52 e~w~~~ia~VvksvVsI~~S~v~~fdtesag~~~atgfvvd~~~gyiLtnrhvv~pg-------P~va~avf~n~ee~ei  124 (955)
T KOG1421|consen   52 EDWRNTIANVVKSVVSIRFSAVRAFDTESAGESEATGFVVDKKLGYILTNRHVVAPG-------PFVASAVFDNHEEIEI  124 (955)
T ss_pred             hhhhhhhhhhcccEEEEEehheeecccccccccceeEEEEecccceEEEeccccCCC-------CceeEEEecccccCCc
Confidence            379999999999999999754  23334466788999999987 8999999999753       1345666777777777


Q ss_pred             EEEEecCCCCEEEEEeCCC----CCCCccccCCCCCCCCCCEEEEEecCCCCCCceEEeEEeeeecCccCCCC---CCcc
Q 012318          202 TVLNADFHSDIAIVKINSK----TPLPAAKLGTSSKLCPGDWVVAMGCPHSLQNTVTAGIVSCVDRKSSDLGL---GGMR  274 (466)
Q Consensus       202 ~vv~~d~~~DlAlLkv~~~----~~~~~~~l~~s~~~~~G~~V~~iG~p~~~~~~~t~G~Vs~~~~~~~~~~~---~~~~  274 (466)
                      -.++.|+-||+.+++.+..    ..+..+.+.. .-.+.|.+++++|+..+...++..|.++.+++...+++.   +...
T Consensus       125 ~pvyrDpVhdfGf~r~dps~ir~s~vt~i~lap-~~akvgseirvvgNDagEklsIlagflSrldr~apdyg~~~yndfn  203 (955)
T KOG1421|consen  125 YPVYRDPVHDFGFFRYDPSTIRFSIVTEICLAP-ELAKVGSEIRVVGNDAGEKLSILAGFLSRLDRNAPDYGEDTYNDFN  203 (955)
T ss_pred             ccccCCchhhcceeecChhhcceeeeeccccCc-cccccCCceEEecCCccceEEeehhhhhhccCCCcccccccccccc
Confidence            7889999999999999864    2233333432 346789999999998888888889999999998877643   3334


Q ss_pred             ccEEEEcccCCCCCccceeecCCCeEEEEEEeEecCCCeeeEEEeHHHHHHHHHHHHHcCCccccccccccccccceeee
Q 012318          275 REYLQTDCAINAGNSGGPLVNIDGEIVGINIMKVAAADGLSFAVPIDSAAKIIEQFKKNGWMHVEQKVPLLWSTCKQVVI  354 (466)
Q Consensus       275 ~~~i~~~~~i~~G~SGGPlvd~~G~VVGI~~~~~~~~~g~~~aip~~~i~~~l~~l~~~g~~~~~~~~~~~~~~~~~~~~  354 (466)
                      ..++|.......|.||+|++|.+|..|.++..+... .+.+|++|++.+.+-|.-++++.                    
T Consensus       204 Tfy~QaasstsggssgspVv~i~gyAVAl~agg~~s-sas~ffLpLdrV~RaL~clq~n~--------------------  262 (955)
T KOG1421|consen  204 TFYIQAASSTSGGSSGSPVVDIPGYAVALNAGGSIS-SASDFFLPLDRVVRALRCLQNNT--------------------  262 (955)
T ss_pred             ceeeeehhcCCCCCCCCceecccceEEeeecCCccc-ccccceeeccchhhhhhhhhcCC--------------------
Confidence            468899999999999999999999999999886543 45789999999999999999888                    


Q ss_pred             eeeeeecccccceeecCCHHHHHHhhccC-------CCCCCCCcceeecccCCCChhhhCCCCCCCEEEEECCEecCCHH
Q 012318          355 LCRRVVRPWLGLKMLDLNDMIIAQLKERD-------PSFPNVKSGVLVPVVTPGSPAHLAGFLPSDVVIKFDGKPVQSIT  427 (466)
Q Consensus       355 ~~~~~~~~~lG~~~~~~~~~~~~~~~~~~-------~~~~~~~~g~~V~~V~~~spA~~aGl~~GD~I~~ing~~v~~~~  427 (466)
                         .+.|+.|-+++..-.-+..+++++.+       ..++....=++|..|.+++||++. |++||++++||+.-+.++.
T Consensus       263 ---PItRGtLqvefl~k~~de~rrlGL~sE~eqv~r~k~P~~tgmLvV~~vL~~gpa~k~-Le~GDillavN~t~l~df~  338 (955)
T KOG1421|consen  263 ---PITRGTLQVEFLHKLFDECRRLGLSSEWEQVVRTKFPERTGMLVVETVLPEGPAEKK-LEPGDILLAVNSTCLNDFE  338 (955)
T ss_pred             ---CcccceEEEEEehhhhHHHHhcCCcHHHHHHHHhcCcccceeEEEEEeccCCchhhc-cCCCcEEEEEcceehHHHH
Confidence               55677777776665555555555444       245543334456789999999998 9999999999999999999


Q ss_pred             HHHHHHhcCCCCeEEEEEEECCCeEEEEEEEecCC
Q 012318          428 EIIEIMGDRVGEPLKVVVQRANDQLVTLTVIPEEA  462 (466)
Q Consensus       428 ~~~~~l~~~~g~~v~l~v~R~~g~~~~l~v~~~~~  462 (466)
                      ++.++|.+..|+.+.|+|+| +|++.+++++.++.
T Consensus       339 ~l~~iLDegvgk~l~LtI~R-ggqelel~vtvqdl  372 (955)
T KOG1421|consen  339 ALEQILDEGVGKNLELTIQR-GGQELELTVTVQDL  372 (955)
T ss_pred             HHHHHHhhccCceEEEEEEe-CCEEEEEEEEeccc
Confidence            99999999999999999999 89999998887654


No 9  
>PF13365 Trypsin_2:  Trypsin-like peptidase domain; PDB: 1Y8T_A 2Z9I_A 3QO6_A 1L1J_A 1QY6_A 2O8L_A 3OTP_E 2ZLE_I 1KY9_A 3CS0_A ....
Probab=99.69  E-value=2.4e-16  Score=134.29  Aligned_cols=117  Identities=35%  Similarity=0.570  Sum_probs=77.9

Q ss_pred             eEEEEEeCCCeEEeccccccCCCCCCCCCCceEEEEeCCCcEEE--EEEEEecCC-CCEEEEEeCCCCCCCccccCCCCC
Q 012318          157 GSGAIVDADGTILTCAHVVVDFHGSRALPKGKVDVTLQDGRTFE--GTVLNADFH-SDIAIVKINSKTPLPAAKLGTSSK  233 (466)
Q Consensus       157 GSGfiI~~~G~ILT~aHvv~~~~~~~~~~~~~i~V~~~~g~~~~--a~vv~~d~~-~DlAlLkv~~~~~~~~~~l~~s~~  233 (466)
                      ||||+|+++|+||||+||+.+...........+.+...++..+.  +++++.++. +|+|||+++               
T Consensus         1 GTGf~i~~~g~ilT~~Hvv~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~D~All~v~---------------   65 (120)
T PF13365_consen    1 GTGFLIGPDGYILTAAHVVEDWNDGKQPDNSSVEVVFPDGRRVPPVAEVVYFDPDDYDLALLKVD---------------   65 (120)
T ss_dssp             EEEEEEETTTEEEEEHHHHTCCTT--G-TCSEEEEEETTSCEEETEEEEEEEETT-TTEEEEEES---------------
T ss_pred             CEEEEEcCCceEEEchhheecccccccCCCCEEEEEecCCCEEeeeEEEEEECCccccEEEEEEe---------------
Confidence            89999999999999999998754333234578999999998888  999999999 999999999               


Q ss_pred             CCCCCEEEEEecCCCCCCceEEeEEeeeecCccCCCCCCccccEEEEcccCCCCCccceeecCCCeEEEE
Q 012318          234 LCPGDWVVAMGCPHSLQNTVTAGIVSCVDRKSSDLGLGGMRREYLQTDCAINAGNSGGPLVNIDGEIVGI  303 (466)
Q Consensus       234 ~~~G~~V~~iG~p~~~~~~~t~G~Vs~~~~~~~~~~~~~~~~~~i~~~~~i~~G~SGGPlvd~~G~VVGI  303 (466)
                           .....+...     ...............    ......+ +++.+.+|+|||||||.+|+||||
T Consensus        66 -----~~~~~~~~~-----~~~~~~~~~~~~~~~----~~~~~~~-~~~~~~~G~SGgpv~~~~G~vvGi  120 (120)
T PF13365_consen   66 -----PWTGVGGGV-----RVPGSTSGVSPTSTN----DNRMLYI-TDADTRPGSSGGPVFDSDGRVVGI  120 (120)
T ss_dssp             -----CEEEEEEEE-----EEEEEEEEEEEEEEE----ETEEEEE-ESSS-STTTTTSEEEETTSEEEEE
T ss_pred             -----cccceeeee-----EeeeeccccccccCc----ccceeEe-eecccCCCcEeHhEECCCCEEEeC
Confidence                 000000000     000000000000000    0001123 799999999999999999999997


No 10 
>KOG1421 consensus Predicted signaling-associated protein (contains a PDZ domain) [General function prediction only]
Probab=99.59  E-value=5e-14  Score=145.81  Aligned_cols=295  Identities=17%  Similarity=0.168  Sum_probs=196.5

Q ss_pred             HHhCCceEEEEccccc--cccccCCceeEEEEEeCC-CeEEeccccccCCCCCCCCCCceEEEEeCCCcEEEEEEEEecC
Q 012318          132 ARVCPAVVNLSAPREF--LGILSGRGIGSGAIVDAD-GTILTCAHVVVDFHGSRALPKGKVDVTLQDGRTFEGTVLNADF  208 (466)
Q Consensus       132 ~~~~~SVV~I~~~~~~--~~~~~~~~~GSGfiI~~~-G~ILT~aHvv~~~~~~~~~~~~~i~V~~~~g~~~~a~vv~~d~  208 (466)
                      +++..+.|.+....+.  ++.......|||.|++.. |+++++..++....       ...+|++.|...++|.+.+.|+
T Consensus       525 ~~i~~~~~~v~~~~~~~l~g~s~~i~kgt~~i~d~~~g~~vvsr~~vp~d~-------~d~~vt~~dS~~i~a~~~fL~~  597 (955)
T KOG1421|consen  525 ADISNCLVDVEPMMPVNLDGVSSDIYKGTALIMDTSKGLGVVSRSVVPSDA-------KDQRVTEADSDGIPANVSFLHP  597 (955)
T ss_pred             hHHhhhhhhheeceeeccccchhhhhcCceEEEEccCCceeEecccCCchh-------hceEEeecccccccceeeEecC
Confidence            4555666666654332  233334467999999977 89999999997542       6788999988889999999999


Q ss_pred             CCCEEEEEeCCCCCCCccccCCCCCCCCCCEEEEEecCCCCCCceEEeEEee---ee-cCccCCCCCCccccEEEEcccC
Q 012318          209 HSDIAIVKINSKTPLPAAKLGTSSKLCPGDWVVAMGCPHSLQNTVTAGIVSC---VD-RKSSDLGLGGMRREYLQTDCAI  284 (466)
Q Consensus       209 ~~DlAlLkv~~~~~~~~~~l~~s~~~~~G~~V~~iG~p~~~~~~~t~G~Vs~---~~-~~~~~~~~~~~~~~~i~~~~~i  284 (466)
                      ..++|.++.+... ...++|.+ ..+..|+++...|+............++.   +. .......+.....+.|..++..
T Consensus       598 t~n~a~~kydp~~-~~~~kl~~-~~v~~gD~~~f~g~~~~~r~ltaktsv~dvs~~~~ps~~~pr~r~~n~e~Is~~~nl  675 (955)
T KOG1421|consen  598 TENVASFKYDPAL-EVQLKLTD-TTVLRGDECTFEGFTEDLRALTAKTSVTDVSVVIIPSSVMPRFRATNLEVISFMDNL  675 (955)
T ss_pred             ccceeEeccChhH-hhhhccce-eeEecCCceeEecccccchhhcccceeeeeEEEEecCCCCcceeecceEEEEEeccc
Confidence            9999999999543 34455644 46788999999999876543222222221   11 1111122333345677777777


Q ss_pred             CCCCccceeecCCCeEEEEEEeEecCC-C----eeeEEEeHHHHHHHHHHHHHcCCccccccccccccccceeeeeeeee
Q 012318          285 NAGNSGGPLVNIDGEIVGINIMKVAAA-D----GLSFAVPIDSAAKIIEQFKKNGWMHVEQKVPLLWSTCKQVVILCRRV  359 (466)
Q Consensus       285 ~~G~SGGPlvd~~G~VVGI~~~~~~~~-~----g~~~aip~~~i~~~l~~l~~~g~~~~~~~~~~~~~~~~~~~~~~~~~  359 (466)
                      ..++--|-+.|.+|+|+|+.-....+. .    ..-|.+.+..++++|++|+.++...     +    +..+++-    .
T Consensus       676 sT~c~sg~ltdddg~vvalwl~~~ge~~~~kd~~y~~gl~~~~~l~vl~rlk~g~~~r-----p----~i~~vef----~  742 (955)
T KOG1421|consen  676 STSCLSGRLTDDDGEVVALWLSVVGEDVGGKDYTYKYGLSMSYILPVLERLKLGPSAR-----P----TIAGVEF----S  742 (955)
T ss_pred             cccccceEEECCCCeEEEEEeeeeccccCCceeEEEeccchHHHHHHHHHHhcCCCCC-----c----eeeccce----e
Confidence            666666788899999999987765422 1    2457789999999999999888110     0    0000000    0


Q ss_pred             ecccccceeecCCHHHHHHhhccCCCCCCCCcceeecccCCCChhhhCCCCCCCEEEEECCEecCCHHHHHHHHhcCCCC
Q 012318          360 VRPWLGLKMLDLNDMIIAQLKERDPSFPNVKSGVLVPVVTPGSPAHLAGFLPSDVVIKFDGKPVQSITEIIEIMGDRVGE  439 (466)
Q Consensus       360 ~~~~lG~~~~~~~~~~~~~~~~~~~~~~~~~~g~~V~~V~~~spA~~aGl~~GD~I~~ing~~v~~~~~~~~~l~~~~g~  439 (466)
                      .-...+.+...++.+++.+++...   -....-++|+.|.+..+  +. |..||+|+++||+-|+.+.|+.+..      
T Consensus       743 ~i~laqar~lglp~e~imk~e~es---~~~~ql~~ishv~~~~~--ki-l~~gdiilsvngk~itr~~dl~d~~------  810 (955)
T KOG1421|consen  743 HITLAQARTLGLPSEFIMKSEEES---TIPRQLYVISHVRPLLH--KI-LGVGDIILSVNGKMITRLSDLHDFE------  810 (955)
T ss_pred             eEEeehhhccCCCHHHHhhhhhcC---CCcceEEEEEeeccCcc--cc-cccccEEEEecCeEEeeehhhhhhh------
Confidence            011223344445666655553321   12344567888877543  44 9999999999999999999998733      


Q ss_pred             eEEEEEEECCCeEEEEEEEecC
Q 012318          440 PLKVVVQRANDQLVTLTVIPEE  461 (466)
Q Consensus       440 ~v~l~v~R~~g~~~~l~v~~~~  461 (466)
                      .+...|.| +|..+++.+...+
T Consensus       811 eid~~ilr-dg~~~~ikipt~p  831 (955)
T KOG1421|consen  811 EIDAVILR-DGIEMEIKIPTYP  831 (955)
T ss_pred             hhheeeee-cCcEEEEEecccc
Confidence            47789999 8998888876644


No 11 
>PF13180 PDZ_2:  PDZ domain; PDB: 2L97_A 1Y8T_A 2Z9I_A 1LCY_A 2PZD_B 2P3W_A 1VCW_C 1TE0_B 1SOZ_C 1SOT_C ....
Probab=99.53  E-value=4.9e-14  Score=112.81  Aligned_cols=81  Identities=35%  Similarity=0.655  Sum_probs=69.5

Q ss_pred             ccccceeecCCHHHHHHhhccCCCCCCCCcceeecccCCCChhhhCCCCCCCEEEEECCEecCCHHHHHHHHhc-CCCCe
Q 012318          362 PWLGLKMLDLNDMIIAQLKERDPSFPNVKSGVLVPVVTPGSPAHLAGFLPSDVVIKFDGKPVQSITEIIEIMGD-RVGEP  440 (466)
Q Consensus       362 ~~lG~~~~~~~~~~~~~~~~~~~~~~~~~~g~~V~~V~~~spA~~aGl~~GD~I~~ing~~v~~~~~~~~~l~~-~~g~~  440 (466)
                      ||||+.+....+                ..|++|.+|.++|||+++||++||+|++|||++|.++.++..++.. .+|++
T Consensus         1 ~~lGv~~~~~~~----------------~~g~~V~~V~~~spA~~aGl~~GD~I~~ing~~v~~~~~~~~~l~~~~~g~~   64 (82)
T PF13180_consen    1 GGLGVTVQNLSD----------------TGGVVVVSVIPGSPAAKAGLQPGDIILAINGKPVNSSEDLVNILSKGKPGDT   64 (82)
T ss_dssp             -E-SEEEEECSC----------------SSSEEEEEESTTSHHHHTTS-TTEEEEEETTEESSSHHHHHHHHHCSSTTSE
T ss_pred             CEECeEEEEccC----------------CCeEEEEEeCCCCcHHHCCCCCCcEEEEECCEEcCCHHHHHHHHHhCCCCCE
Confidence            589998876431                4699999999999999999999999999999999999999998854 89999


Q ss_pred             EEEEEEECCCeEEEEEEEe
Q 012318          441 LKVVVQRANDQLVTLTVIP  459 (466)
Q Consensus       441 v~l~v~R~~g~~~~l~v~~  459 (466)
                      ++|+|.| +|+.+++++++
T Consensus        65 v~l~v~R-~g~~~~~~v~l   82 (82)
T PF13180_consen   65 VTLTVLR-DGEELTVEVTL   82 (82)
T ss_dssp             EEEEEEE-TTEEEEEEEE-
T ss_pred             EEEEEEE-CCEEEEEEEEC
Confidence            9999999 99999988864


No 12 
>PF00089 Trypsin:  Trypsin;  InterPro: IPR001254 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold. Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity.
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base. The geometric orientations of the catalytic residues are similar between families, despite different protein folds. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC). This group of serine proteases belongs to the MEROPS peptidase family S1 (chymotrypsin family, clan PA(S)) and to peptidase family S6 (Hap serine peptidases). The chymotrypsin family is almost totally confined to animals, although trypsin-like enzymes are found in actinomycetes of the genera Streptomyces and Saccharopolyspora, and in the fungus Fusarium oxysporum. The enzymes are inherently secreted, being synthesised with a signal peptide that targets them to the secretory pathway. Animal enzymes are either secreted directly, packaged into vesicles for regulated secretion, or are retained in leukocyte granules. The Hap family, 'Haemophilus adhesion and penetration', are proteins that play a role in the interaction with human epithelial cells. The serine protease activity is localized at the N-terminal domain, whereas the binding domain is in the C-terminal region.
; GO: 0004252 serine-type endopeptidase activity, 0006508 proteolysis; PDB: 1SPJ_A 1A5I_A 2ZGH_A 2ZKS_A 2ZGJ_A 2ZGC_A 2ODP_A 2I6Q_A 2I6S_A 2ODQ_A ....
Probab=99.47  E-value=2.6e-12  Score=120.56  Aligned_cols=177  Identities=21%  Similarity=0.294  Sum_probs=117.1

Q ss_pred             CCceEEEEccccccccccCCceeEEEEEeCCCeEEeccccccCCCCCCCCCCceEEEEeC-------CC--cEEEEEEEE
Q 012318          135 CPAVVNLSAPREFLGILSGRGIGSGAIVDADGTILTCAHVVVDFHGSRALPKGKVDVTLQ-------DG--RTFEGTVLN  205 (466)
Q Consensus       135 ~~SVV~I~~~~~~~~~~~~~~~GSGfiI~~~G~ILT~aHvv~~~~~~~~~~~~~i~V~~~-------~g--~~~~a~vv~  205 (466)
                      .|.+|.|.....       ...|+|++|+++ +|||++||+...        ..+.+.+.       ++  ..+..+-+.
T Consensus        12 ~p~~v~i~~~~~-------~~~C~G~li~~~-~vLTaahC~~~~--------~~~~v~~g~~~~~~~~~~~~~~~v~~~~   75 (220)
T PF00089_consen   12 FPWVVSIRYSNG-------RFFCTGTLISPR-WVLTAAHCVDGA--------SDIKVRLGTYSIRNSDGSEQTIKVSKII   75 (220)
T ss_dssp             STTEEEEEETTT-------EEEEEEEEEETT-EEEEEGGGHTSG--------GSEEEEESESBTTSTTTTSEEEEEEEEE
T ss_pred             CCeEEEEeeCCC-------CeeEeEEecccc-cccccccccccc--------cccccccccccccccccccccccccccc
Confidence            367788876442       467999999988 999999999872        34444332       22  344444443


Q ss_pred             ec----C---CCCEEEEEeCCC----CCCCccccCC-CCCCCCCCEEEEEecCCCCC----CceEEeEEeeeecCccCCC
Q 012318          206 AD----F---HSDIAIVKINSK----TPLPAAKLGT-SSKLCPGDWVVAMGCPHSLQ----NTVTAGIVSCVDRKSSDLG  269 (466)
Q Consensus       206 ~d----~---~~DlAlLkv~~~----~~~~~~~l~~-s~~~~~G~~V~~iG~p~~~~----~~~t~G~Vs~~~~~~~~~~  269 (466)
                      .+    .   .+|||||+|+.+    ..+.++.+.. ...+..|+.+.++||+....    ..+....+.......+...
T Consensus        76 ~h~~~~~~~~~~DiAll~L~~~~~~~~~~~~~~l~~~~~~~~~~~~~~~~G~~~~~~~~~~~~~~~~~~~~~~~~~c~~~  155 (220)
T PF00089_consen   76 IHPKYDPSTYDNDIALLKLDRPITFGDNIQPICLPSAGSDPNVGTSCIVVGWGRTSDNGYSSNLQSVTVPVVSRKTCRSS  155 (220)
T ss_dssp             EETTSBTTTTTTSEEEEEESSSSEHBSSBEESBBTSTTHTTTTTSEEEEEESSBSSTTSBTSBEEEEEEEEEEHHHHHHH
T ss_pred             cccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccc
Confidence            32    2   579999999976    3456666655 23457899999999997533    2455555544444332211


Q ss_pred             CCC-ccccEEEEcc----cCCCCCccceeecCCCeEEEEEEeEecCCC--eeeEEEeHHHHHHHH
Q 012318          270 LGG-MRREYLQTDC----AINAGNSGGPLVNIDGEIVGINIMKVAAAD--GLSFAVPIDSAAKII  327 (466)
Q Consensus       270 ~~~-~~~~~i~~~~----~i~~G~SGGPlvd~~G~VVGI~~~~~~~~~--g~~~aip~~~i~~~l  327 (466)
                      +.. .....+....    ..|.|+|||||+..++.|+||++.+.....  ...++.+++.+.+|+
T Consensus       156 ~~~~~~~~~~c~~~~~~~~~~~g~sG~pl~~~~~~lvGI~s~~~~c~~~~~~~v~~~v~~~~~WI  220 (220)
T PF00089_consen  156 YNDNLTPNMICAGSSGSGDACQGDSGGPLICNNNYLVGIVSFGENCGSPNYPGVYTRVSSYLDWI  220 (220)
T ss_dssp             TTTTSTTTEEEEETTSSSBGGTTTTTSEEEETTEEEEEEEEEESSSSBTTSEEEEEEGGGGHHHH
T ss_pred             cccccccccccccccccccccccccccccccceeeecceeeecCCCCCCCcCEEEEEHHHhhccC
Confidence            111 2234555554    789999999999877779999999843222  247889988888775


No 13 
>cd00190 Tryp_SPc Trypsin-like serine protease; Many of these are synthesized as inactive precursor zymogens that are cleaved during limited proteolysis to generate their active forms. Alignment contains also inactive enzymes that have substitutions of the catalytic triad residues.
Probab=99.39  E-value=2.5e-11  Score=114.69  Aligned_cols=181  Identities=23%  Similarity=0.288  Sum_probs=111.1

Q ss_pred             CCceEEEEccccccccccCCceeEEEEEeCCCeEEeccccccCCCCCCCCCCceEEEEeCC---------CcEEEEEEEE
Q 012318          135 CPAVVNLSAPREFLGILSGRGIGSGAIVDADGTILTCAHVVVDFHGSRALPKGKVDVTLQD---------GRTFEGTVLN  205 (466)
Q Consensus       135 ~~SVV~I~~~~~~~~~~~~~~~GSGfiI~~~G~ILT~aHvv~~~~~~~~~~~~~i~V~~~~---------g~~~~a~vv~  205 (466)
                      .|.+|.|....       ....|+|++|+++ +|||+|||+.+..      ...+.|.+..         ...+..+-+.
T Consensus        12 ~Pw~v~i~~~~-------~~~~C~GtlIs~~-~VLTaAhC~~~~~------~~~~~v~~g~~~~~~~~~~~~~~~v~~~~   77 (232)
T cd00190          12 FPWQVSLQYTG-------GRHFCGGSLISPR-WVLTAAHCVYSSA------PSNYTVRLGSHDLSSNEGGGQVIKVKKVI   77 (232)
T ss_pred             CCCEEEEEccC-------CcEEEEEEEeeCC-EEEECHHhcCCCC------CccEEEEeCcccccCCCCceEEEEEEEEE
Confidence            57788886542       2367999999988 9999999997632      1244444421         2233344444


Q ss_pred             ec-------CCCCEEEEEeCCCC----CCCccccCCCC-CCCCCCEEEEEecCCCCCC-----ceEEeEEeeeecCccCC
Q 012318          206 AD-------FHSDIAIVKINSKT----PLPAAKLGTSS-KLCPGDWVVAMGCPHSLQN-----TVTAGIVSCVDRKSSDL  268 (466)
Q Consensus       206 ~d-------~~~DlAlLkv~~~~----~~~~~~l~~s~-~~~~G~~V~~iG~p~~~~~-----~~t~G~Vs~~~~~~~~~  268 (466)
                      .+       ..+|||||+|+.+.    .+.++.|.... .+..|+.+.+.||......     .+....+..+....+..
T Consensus        78 ~hp~y~~~~~~~DiAll~L~~~~~~~~~v~picl~~~~~~~~~~~~~~~~G~g~~~~~~~~~~~~~~~~~~~~~~~~C~~  157 (232)
T cd00190          78 VHPNYNPSTYDNDIALLKLKRPVTLSDNVRPICLPSSGYNLPAGTTCTVSGWGRTSEGGPLPDVLQEVNVPIVSNAECKR  157 (232)
T ss_pred             ECCCCCCCCCcCCEEEEEECCcccCCCcccceECCCccccCCCCCEEEEEeCCcCCCCCCCCceeeEEEeeeECHHHhhh
Confidence            44       35799999999752    25666675543 5678899999999765332     23333333332222221


Q ss_pred             CCC---CccccEEEE-----cccCCCCCccceeecCC---CeEEEEEEeEecCC--CeeeEEEeHHHHHHHHHH
Q 012318          269 GLG---GMRREYLQT-----DCAINAGNSGGPLVNID---GEIVGINIMKVAAA--DGLSFAVPIDSAAKIIEQ  329 (466)
Q Consensus       269 ~~~---~~~~~~i~~-----~~~i~~G~SGGPlvd~~---G~VVGI~~~~~~~~--~g~~~aip~~~i~~~l~~  329 (466)
                      ...   ......+..     ....|.|+|||||+...   +.++||.+++..-.  .....+..+....+|+++
T Consensus       158 ~~~~~~~~~~~~~C~~~~~~~~~~c~gdsGgpl~~~~~~~~~lvGI~s~g~~c~~~~~~~~~t~v~~~~~WI~~  231 (232)
T cd00190         158 AYSYGGTITDNMLCAGGLEGGKDACQGDSGGPLVCNDNGRGVLVGIVSWGSGCARPNYPGVYTRVSSYLDWIQK  231 (232)
T ss_pred             hccCcccCCCceEeeCCCCCCCccccCCCCCcEEEEeCCEEEEEEEEehhhccCCCCCCCEEEEcHHhhHHhhc
Confidence            111   011122222     34578999999999654   89999999875421  233455667777777754


No 14 
>cd00987 PDZ_serine_protease PDZ domain of trypsin-like serine proteases, such as DegP/HtrA, which are oligomeric proteins involved in heat-shock response, chaperone function, and apoptosis. May be responsible for substrate recognition and/or binding, as most PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of protease-associated PDZ domains a C-terminal beta-strand forms the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins.
Probab=99.34  E-value=1.1e-11  Score=100.54  Aligned_cols=88  Identities=36%  Similarity=0.692  Sum_probs=74.3

Q ss_pred             ccccceeecCCHHHHHHhhccCCCCCCCCcceeecccCCCChhhhCCCCCCCEEEEECCEecCCHHHHHHHHhc-CCCCe
Q 012318          362 PWLGLKMLDLNDMIIAQLKERDPSFPNVKSGVLVPVVTPGSPAHLAGFLPSDVVIKFDGKPVQSITEIIEIMGD-RVGEP  440 (466)
Q Consensus       362 ~~lG~~~~~~~~~~~~~~~~~~~~~~~~~~g~~V~~V~~~spA~~aGl~~GD~I~~ing~~v~~~~~~~~~l~~-~~g~~  440 (466)
                      +|+|+.+.++++.....+..      ....|++|.+|.++|||+++||++||+|++|||+++.++.++.+++.. ..++.
T Consensus         1 ~~~G~~~~~~~~~~~~~~~~------~~~~g~~V~~v~~~s~a~~~gl~~GD~I~~Ing~~i~~~~~~~~~l~~~~~~~~   74 (90)
T cd00987           1 PWLGVTVQDLTPDLAEELGL------KDTKGVLVASVDPGSPAAKAGLKPGDVILAVNGKPVKSVADLRRALAELKPGDK   74 (90)
T ss_pred             CccceEEeECCHHHHHHcCC------CCCCEEEEEEECCCCHHHHcCCCcCCEEEEECCEECCCHHHHHHHHHhcCCCCE
Confidence            58999999999876554321      234699999999999999999999999999999999999999988876 45788


Q ss_pred             EEEEEEECCCeEEEEE
Q 012318          441 LKVVVQRANDQLVTLT  456 (466)
Q Consensus       441 v~l~v~R~~g~~~~l~  456 (466)
                      +.+++.| +|+..++.
T Consensus        75 i~l~v~r-~g~~~~~~   89 (90)
T cd00987          75 VTLTVLR-GGKELTVT   89 (90)
T ss_pred             EEEEEEE-CCEEEEee
Confidence            9999999 78776554


No 15 
>cd00991 PDZ_archaeal_metalloprotease PDZ domain of archaeal zinc metalloproteases, presumably membrane-associated or integral membrane proteases, which may be involved in signalling and regulatory mechanisms. May be responsible for substrate recognition and/or binding, as most PDZ domains bind C-terminal polypeptides, and binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of protease-associated PDZ domains a C-terminal beta-strand forms the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins.
Probab=99.27  E-value=3.6e-11  Score=95.49  Aligned_cols=69  Identities=26%  Similarity=0.419  Sum_probs=62.7

Q ss_pred             CCcceeecccCCCChhhhCCCCCCCEEEEECCEecCCHHHHHHHHhc-CCCCeEEEEEEECCCeEEEEEEE
Q 012318          389 VKSGVLVPVVTPGSPAHLAGFLPSDVVIKFDGKPVQSITEIIEIMGD-RVGEPLKVVVQRANDQLVTLTVI  458 (466)
Q Consensus       389 ~~~g~~V~~V~~~spA~~aGl~~GD~I~~ing~~v~~~~~~~~~l~~-~~g~~v~l~v~R~~g~~~~l~v~  458 (466)
                      ...|++|..|.++|||+++||++||+|++|||+++.+|+++..++.. ..|+.+.+++.| +|+.++++++
T Consensus         8 ~~~Gv~V~~V~~~spa~~aGL~~GDiI~~Ing~~v~~~~d~~~~l~~~~~g~~v~l~v~r-~g~~~~~~~~   77 (79)
T cd00991           8 AVAGVVIVGVIVGSPAENAVLHTGDVIYSINGTPITTLEDFMEALKPTKPGEVITVTVLP-STTKLTNVST   77 (79)
T ss_pred             cCCcEEEEEECCCChHHhcCCCCCCEEEEECCEEcCCHHHHHHHHhcCCCCCEEEEEEEE-CCEEEEEEEE
Confidence            35799999999999999999999999999999999999999998876 468899999999 8888887765


No 16 
>smart00020 Tryp_SPc Trypsin-like serine protease. Many of these are synthesised as inactive precursor zymogens that are cleaved during limited proteolysis to generate their active forms. A few, however, are active as single chain molecules, and others are inactive due to substitutions of the catalytic triad residues.
Probab=99.22  E-value=2.7e-10  Score=107.87  Aligned_cols=161  Identities=24%  Similarity=0.324  Sum_probs=97.7

Q ss_pred             hCCceEEEEccccccccccCCceeEEEEEeCCCeEEeccccccCCCCCCCCCCceEEEEeCCC--------cEEEEEEEE
Q 012318          134 VCPAVVNLSAPREFLGILSGRGIGSGAIVDADGTILTCAHVVVDFHGSRALPKGKVDVTLQDG--------RTFEGTVLN  205 (466)
Q Consensus       134 ~~~SVV~I~~~~~~~~~~~~~~~GSGfiI~~~G~ILT~aHvv~~~~~~~~~~~~~i~V~~~~g--------~~~~a~vv~  205 (466)
                      ..|-+|.|....       ....|+|++|+++ +|||+|||+.+..      ...+.|.+...        ..+...-+.
T Consensus        12 ~~Pw~~~i~~~~-------~~~~C~GtlIs~~-~VLTaahC~~~~~------~~~~~v~~g~~~~~~~~~~~~~~v~~~~   77 (229)
T smart00020       12 SFPWQVSLQYRG-------GRHFCGGSLISPR-WVLTAAHCVYGSD------PSNIRVRLGSHDLSSGEEGQVIKVSKVI   77 (229)
T ss_pred             CCCcEEEEEEcC-------CCcEEEEEEecCC-EEEECHHHcCCCC------CcceEEEeCcccCCCCCCceEEeeEEEE
Confidence            356677776432       2467999999988 9999999998642      12455555432        233444444


Q ss_pred             ec-------CCCCEEEEEeCCC----CCCCccccCCC-CCCCCCCEEEEEecCCCCC------CceEEeEEeeeecCccC
Q 012318          206 AD-------FHSDIAIVKINSK----TPLPAAKLGTS-SKLCPGDWVVAMGCPHSLQ------NTVTAGIVSCVDRKSSD  267 (466)
Q Consensus       206 ~d-------~~~DlAlLkv~~~----~~~~~~~l~~s-~~~~~G~~V~~iG~p~~~~------~~~t~G~Vs~~~~~~~~  267 (466)
                      .+       ..+|||||+|+.+    ..+.++.+... ..+..++.+.+.||+....      .......+.......+.
T Consensus        78 ~~p~~~~~~~~~DiAll~L~~~i~~~~~~~pi~l~~~~~~~~~~~~~~~~g~g~~~~~~~~~~~~~~~~~~~~~~~~~C~  157 (229)
T smart00020       78 IHPNYNPSTYDNDIALLKLKSPVTLSDNVRPICLPSSNYNVPAGTTCTVSGWGRTSEGAGSLPDTLQEVNVPIVSNATCR  157 (229)
T ss_pred             ECCCCCCCCCcCCEEEEEECcccCCCCceeeccCCCcccccCCCCEEEEEeCCCCCCCCCcCCCEeeEEEEEEeCHHHhh
Confidence            33       4679999999875    23456666543 3566789999999987542      12223333322222222


Q ss_pred             CCCCC---ccccEEEE-----cccCCCCCccceeecCCC--eEEEEEEeEe
Q 012318          268 LGLGG---MRREYLQT-----DCAINAGNSGGPLVNIDG--EIVGINIMKV  308 (466)
Q Consensus       268 ~~~~~---~~~~~i~~-----~~~i~~G~SGGPlvd~~G--~VVGI~~~~~  308 (466)
                      ..+..   .....+..     ....|.|+||||++...+  .++||++.+.
T Consensus       158 ~~~~~~~~~~~~~~C~~~~~~~~~~c~gdsG~pl~~~~~~~~l~Gi~s~g~  208 (229)
T smart00020      158 RAYSGGGAITDNMLCAGGLEGGKDACQGDSGGPLVCNDGRWVLVGIVSWGS  208 (229)
T ss_pred             hhhccccccCCCcEeecCCCCCCcccCCCCCCeeEEECCCEEEEEEEEECC
Confidence            11100   01111221     356789999999996543  9999999875


No 17 
>cd00990 PDZ_glycyl_aminopeptidase PDZ domain associated with archaeal and bacterial M61 glycyl-aminopeptidases. May be responsible for substrate recognition and/or binding, as most PDZ domains bind C-terminal polypeptides, and binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of protease-associated PDZ domains a C-terminal beta-strand is presumed to form the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins.
Probab=99.21  E-value=1e-10  Score=92.82  Aligned_cols=68  Identities=24%  Similarity=0.434  Sum_probs=58.0

Q ss_pred             CcceeecccCCCChhhhCCCCCCCEEEEECCEecCCHHHHHHHHhcCCCCeEEEEEEECCCeEEEEEEEec
Q 012318          390 KSGVLVPVVTPGSPAHLAGFLPSDVVIKFDGKPVQSITEIIEIMGDRVGEPLKVVVQRANDQLVTLTVIPE  460 (466)
Q Consensus       390 ~~g~~V~~V~~~spA~~aGl~~GD~I~~ing~~v~~~~~~~~~l~~~~g~~v~l~v~R~~g~~~~l~v~~~  460 (466)
                      ..+++|.+|.++|||+++||++||+|++|||+++.+|.++.+.+  ..++.+.+++.| +|+..++.+++.
T Consensus        11 ~~~~~V~~V~~~s~a~~aGl~~GD~I~~Ing~~v~~~~~~l~~~--~~~~~v~l~v~r-~g~~~~~~v~~~   78 (80)
T cd00990          11 EGLGKVTFVRDDSPADKAGLVAGDELVAVNGWRVDALQDRLKEY--QAGDPVELTVFR-DDRLIEVPLTLA   78 (80)
T ss_pred             CCcEEEEEECCCChHHHhCCCCCCEEEEECCEEhHHHHHHHHhc--CCCCEEEEEEEE-CCEEEEEEEEec
Confidence            35799999999999999999999999999999999876654332  467889999999 888888888775


No 18 
>cd00989 PDZ_metalloprotease PDZ domain of bacterial and plant zinc metalloproteases, presumably membrane-associated or integral membrane proteases, which may be involved in signalling and regulatory mechanisms. May be responsible for substrate recognition and/or binding, as most PDZ domains bind C-terminal polypeptides, and binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of protease-associated PDZ domains a C-terminal beta-strand forms the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins.
Probab=99.19  E-value=1.2e-10  Score=91.97  Aligned_cols=67  Identities=30%  Similarity=0.589  Sum_probs=59.9

Q ss_pred             cceeecccCCCChhhhCCCCCCCEEEEECCEecCCHHHHHHHHhcCCCCeEEEEEEECCCeEEEEEEE
Q 012318          391 SGVLVPVVTPGSPAHLAGFLPSDVVIKFDGKPVQSITEIIEIMGDRVGEPLKVVVQRANDQLVTLTVI  458 (466)
Q Consensus       391 ~g~~V~~V~~~spA~~aGl~~GD~I~~ing~~v~~~~~~~~~l~~~~g~~v~l~v~R~~g~~~~l~v~  458 (466)
                      ..++|..|.++|||+++||++||+|++|||+++.+++++..++....++.+.+++.| +|+..++.++
T Consensus        12 ~~~~V~~v~~~s~a~~~gl~~GD~I~~ing~~i~~~~~~~~~l~~~~~~~~~l~v~r-~~~~~~~~l~   78 (79)
T cd00989          12 IEPVIGEVVPGSPAAKAGLKAGDRILAINGQKIKSWEDLVDAVQENPGKPLTLTVER-NGETITLTLT   78 (79)
T ss_pred             cCcEEEeECCCCHHHHcCCCCCCEEEEECCEECCCHHHHHHHHHHCCCceEEEEEEE-CCEEEEEEec
Confidence            457899999999999999999999999999999999999998877667889999999 7877777765


No 19 
>TIGR01713 typeII_sec_gspC general secretion pathway protein C. This model represents GspC, protein C of the main terminal branch of the general secretion pathway, also called type II secretion. This system transports folded proteins across the bacterial outer membrane and is widely distributed in Gram-negative pathogens.
Probab=99.19  E-value=1.2e-10  Score=113.07  Aligned_cols=99  Identities=17%  Similarity=0.209  Sum_probs=86.3

Q ss_pred             HHHHHHHHHHHHcCCccccccccccccccceeeeeeeeeecccccceeecCCHHHHHHhhccCCCCCCCCcceeecccCC
Q 012318          321 DSAAKIIEQFKKNGWMHVEQKVPLLWSTCKQVVILCRRVVRPWLGLKMLDLNDMIIAQLKERDPSFPNVKSGVLVPVVTP  400 (466)
Q Consensus       321 ~~i~~~l~~l~~~g~~~~~~~~~~~~~~~~~~~~~~~~~~~~~lG~~~~~~~~~~~~~~~~~~~~~~~~~~g~~V~~V~~  400 (466)
                      ..++++++++++++                       +..+.|+|+......               +...|+.|..+.+
T Consensus       159 ~~~~~v~~~l~~~g-----------------------~~~~~~lgi~p~~~~---------------g~~~G~~v~~v~~  200 (259)
T TIGR01713       159 VVSRRIIEELTKDP-----------------------QKMFDYIRLSPVMKN---------------DKLEGYRLNPGKD  200 (259)
T ss_pred             hhHHHHHHHHHHCH-----------------------HhhhheEeEEEEEeC---------------CceeEEEEEecCC
Confidence            46788999999999                       788999999875321               2346999999999


Q ss_pred             CChhhhCCCCCCCEEEEECCEecCCHHHHHHHHhc-CCCCeEEEEEEECCCeEEEEEEE
Q 012318          401 GSPAHLAGFLPSDVVIKFDGKPVQSITEIIEIMGD-RVGEPLKVVVQRANDQLVTLTVI  458 (466)
Q Consensus       401 ~spA~~aGl~~GD~I~~ing~~v~~~~~~~~~l~~-~~g~~v~l~v~R~~g~~~~l~v~  458 (466)
                      +++|+++|||+||+|++|||+++.+++++.+++.+ +.++.++|+|+| +|+.+++.+.
T Consensus       201 ~s~a~~aGLr~GDvIv~ING~~i~~~~~~~~~l~~~~~~~~v~l~V~R-~G~~~~i~v~  258 (259)
T TIGR01713       201 PSLFYKSGLQDGDIAVALNGLDLRDPEQAFQALQMLREETNLTLTVER-DGQREDIYVR  258 (259)
T ss_pred             CCHHHHcCCCCCCEEEEECCEEcCCHHHHHHHHHhcCCCCeEEEEEEE-CCEEEEEEEE
Confidence            99999999999999999999999999999998887 677899999999 8998888764


No 20 
>cd00986 PDZ_LON_protease PDZ domain of ATP-dependent LON serine proteases. Most PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this bacterial subfamily of protease-associated PDZ domains a C-terminal beta-strand is thought to form the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins.
Probab=99.19  E-value=1.7e-10  Score=91.48  Aligned_cols=71  Identities=28%  Similarity=0.462  Sum_probs=64.0

Q ss_pred             CcceeecccCCCChhhhCCCCCCCEEEEECCEecCCHHHHHHHHhc-CCCCeEEEEEEECCCeEEEEEEEecCC
Q 012318          390 KSGVLVPVVTPGSPAHLAGFLPSDVVIKFDGKPVQSITEIIEIMGD-RVGEPLKVVVQRANDQLVTLTVIPEEA  462 (466)
Q Consensus       390 ~~g~~V~~V~~~spA~~aGl~~GD~I~~ing~~v~~~~~~~~~l~~-~~g~~v~l~v~R~~g~~~~l~v~~~~~  462 (466)
                      ..|++|..|.++|||+. ||++||+|++|||+++.+|+++.+++.. ..|+.+.+++.| +|+..++++++.+.
T Consensus         7 ~~Gv~V~~V~~~s~A~~-gL~~GD~I~~Ing~~v~~~~~~~~~l~~~~~~~~v~l~v~r-~g~~~~~~v~l~~~   78 (79)
T cd00986           7 YHGVYVTSVVEGMPAAG-KLKAGDHIIAVDGKPFKEAEELIDYIQSKKEGDTVKLKVKR-EEKELPEDLILKTF   78 (79)
T ss_pred             ecCEEEEEECCCCchhh-CCCCCCEEEEECCEECCCHHHHHHHHHhCCCCCEEEEEEEE-CCEEEEEEEEEecc
Confidence            35899999999999997 7999999999999999999999998875 678899999999 89999998888754


No 21 
>KOG1320 consensus Serine protease [Posttranslational modification, protein turnover, chaperones]
Probab=99.17  E-value=7.2e-11  Score=121.28  Aligned_cols=277  Identities=19%  Similarity=0.195  Sum_probs=181.3

Q ss_pred             HHHhCCceEEEEccccc-------cccccCCceeEEEEEeCCCeEEeccccccCCCCCCCCCCceEEEE-eCCCcEEEEE
Q 012318          131 AARVCPAVVNLSAPREF-------LGILSGRGIGSGAIVDADGTILTCAHVVVDFHGSRALPKGKVDVT-LQDGRTFEGT  202 (466)
Q Consensus       131 ~~~~~~SVV~I~~~~~~-------~~~~~~~~~GSGfiI~~~G~ILT~aHvv~~~~~~~~~~~~~i~V~-~~~g~~~~a~  202 (466)
                      .+....|++.+......       ....+....|+||.+... .++|++|++.....     ...+.+. ...-+.|.++
T Consensus        56 ~~~~~~s~~~v~~~~~~~~~~~pw~~~~q~~~~~s~f~i~~~-~lltn~~~v~~~~~-----~~~v~v~~~gs~~k~~~~  129 (473)
T KOG1320|consen   56 VDLALQSVVKVFSVSTEPSSVLPWQRTRQFSSGGSGFAIYGK-KLLTNAHVVAPNND-----HKFVTVKKHGSPRKYKAF  129 (473)
T ss_pred             ccccccceeEEEeecccccccCcceeeehhcccccchhhccc-ceeecCcccccccc-----ccccccccCCCchhhhhh
Confidence            34455566776643221       111244567999999866 99999999984311     1233333 1223668888


Q ss_pred             EEEecCCCCEEEEEeCCC---CCCCccccCCCCCCCCCCEEEEEecCCCCCCceEEeEEeeeecCccCCCCCCccccEEE
Q 012318          203 VLNADFHSDIAIVKINSK---TPLPAAKLGTSSKLCPGDWVVAMGCPHSLQNTVTAGIVSCVDRKSSDLGLGGMRREYLQ  279 (466)
Q Consensus       203 vv~~d~~~DlAlLkv~~~---~~~~~~~l~~s~~~~~G~~V~~iG~p~~~~~~~t~G~Vs~~~~~~~~~~~~~~~~~~i~  279 (466)
                      +...-.+.|+|++.++..   ....++.+++  -+...+.++++|   +....+|.|.|.......  +..+......++
T Consensus       130 v~~~~~~cd~Avv~Ie~~~f~~~~~~~e~~~--ip~l~~S~~Vv~---gd~i~VTnghV~~~~~~~--y~~~~~~l~~vq  202 (473)
T KOG1320|consen  130 VAAVFEECDLAVVYIESEEFWKGMNPFELGD--IPSLNGSGFVVG---GDGIIVTNGHVVRVEPRI--YAHSSTVLLRVQ  202 (473)
T ss_pred             HHHhhhcccceEEEEeeccccCCCcccccCC--CcccCccEEEEc---CCcEEEEeeEEEEEEecc--ccCCCcceeeEE
Confidence            888889999999999863   2233344433  344557899998   667789999998876653  223334455789


Q ss_pred             EcccCCCCCccceeecCCCeEEEEEEeEecCCCeeeEEEeHHHHHHHHHHHHHcCCccccccccccccccceeeeeeeee
Q 012318          280 TDCAINAGNSGGPLVNIDGEIVGINIMKVAAADGLSFAVPIDSAAKIIEQFKKNGWMHVEQKVPLLWSTCKQVVILCRRV  359 (466)
Q Consensus       280 ~~~~i~~G~SGGPlvd~~G~VVGI~~~~~~~~~g~~~aip~~~i~~~l~~l~~~g~~~~~~~~~~~~~~~~~~~~~~~~~  359 (466)
                      +++.+.+|+||+|.+.-.+++.|+.+..........+.+|.-.+..+.......+                      ...
T Consensus       203 i~aa~~~~~s~ep~i~g~d~~~gvA~l~ik~~~~i~~~i~~~~~~~~~~G~~~~a----------------------~~~  260 (473)
T KOG1320|consen  203 IDAAIGPGNSGEPVIVGVDKVAGVAFLKIKTPENILYVIPLGVSSHFRTGVEVSA----------------------IGN  260 (473)
T ss_pred             EEEeecCCccCCCeEEccccccceEEEEEecCCcccceeecceeeeecccceeec----------------------ccc
Confidence            9999999999999997779999999998755446788898877766665544444                      012


Q ss_pred             ecccccceeecC-CHHHHHHhhccCCCCCCCCcceeecccCCCChhhhCCCCCCCEEEEECCEecCC----HH--HHHHH
Q 012318          360 VRPWLGLKMLDL-NDMIIAQLKERDPSFPNVKSGVLVPVVTPGSPAHLAGFLPSDVVIKFDGKPVQS----IT--EIIEI  432 (466)
Q Consensus       360 ~~~~lG~~~~~~-~~~~~~~~~~~~~~~~~~~~g~~V~~V~~~spA~~aGl~~GD~I~~ing~~v~~----~~--~~~~~  432 (466)
                      ..++++...+.+ +.+.++.++     + +...|+.+.++.+-+.|.+. ++.||.|+.+||..|.-    ..  .+...
T Consensus       261 ~f~~~nt~t~g~vs~~~R~~~~-----l-g~~~g~~i~~~~qtd~ai~~-~nsg~~ll~~DG~~IgVn~~~~~ri~~~~~  333 (473)
T KOG1320|consen  261 GFGLLNTLTQGMVSGQLRKSFK-----L-GLETGVLISKINQTDAAINP-GNSGGPLLNLDGEVIGVNTRKVTRIGFSHG  333 (473)
T ss_pred             Cceeeeeeeecccccccccccc-----c-Ccccceeeeeecccchhhhc-ccCCCcEEEecCcEeeeeeeeeEEeecccc
Confidence            233444433222 122222221     1 22378999999999999988 99999999999988831    11  11122


Q ss_pred             Hhc-CCCCeEEEEEEECCC
Q 012318          433 MGD-RVGEPLKVVVQRAND  450 (466)
Q Consensus       433 l~~-~~g~~v~l~v~R~~g  450 (466)
                      +.. .+++++.+.+.| .+
T Consensus       334 iSf~~p~d~vl~~v~r-~~  351 (473)
T KOG1320|consen  334 ISFKIPIDTVLVIVLR-LG  351 (473)
T ss_pred             ceeccCchHhhhhhhh-hh
Confidence            222 566777777777 44


No 22 
>cd00988 PDZ_CTP_protease PDZ domain of C-terminal processing-, tail-specific-, and tricorn proteases, which function in posttranslational protein processing, maturation, and disassembly or degradation, in Bacteria, Archaea, and plant chloroplasts. May be responsible for substrate recognition and/or binding, as most PDZ domains bind C-terminal polypeptides, and binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of protease-associated PDZ domains a C-terminal beta-strand forms the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins.
Probab=99.10  E-value=5.2e-10  Score=89.73  Aligned_cols=71  Identities=24%  Similarity=0.608  Sum_probs=62.8

Q ss_pred             CcceeecccCCCChhhhCCCCCCCEEEEECCEecCCH--HHHHHHHhcCCCCeEEEEEEECCCeEEEEEEEec
Q 012318          390 KSGVLVPVVTPGSPAHLAGFLPSDVVIKFDGKPVQSI--TEIIEIMGDRVGEPLKVVVQRANDQLVTLTVIPE  460 (466)
Q Consensus       390 ~~g~~V~~V~~~spA~~aGl~~GD~I~~ing~~v~~~--~~~~~~l~~~~g~~v~l~v~R~~g~~~~l~v~~~  460 (466)
                      ..+++|..|.++|||+++||++||+|++|||+++.+|  +++..++....++.+.+++.|.+|+..++++++.
T Consensus        12 ~~~~~V~~v~~~s~a~~~gl~~GD~I~~vng~~i~~~~~~~~~~~l~~~~~~~i~l~v~r~~~~~~~~~~~~~   84 (85)
T cd00988          12 DGGLVITSVLPGSPAAKAGIKAGDIIVAIDGEPVDGLSLEDVVKLLRGKAGTKVRLTLKRGDGEPREVTLTRL   84 (85)
T ss_pred             CCeEEEEEecCCCCHHHcCCCCCCEEEEECCEEcCCCCHHHHHHHhcCCCCCEEEEEEEcCCCCEEEEEEEEC
Confidence            3689999999999999999999999999999999999  9999888777788999999993288888887764


No 23 
>cd00136 PDZ PDZ domain, also called DHR (Dlg homologous region) or GLGF (after a conserved sequence motif). Many PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. Heterodimerization through PDZ-PDZ domain interactions adds to the domain's versatility, and PDZ domain-mediated interactions may be modulated dynamically through target phosphorylation. Some PDZ domains play a role in scaffolding supramolecular complexes. PDZ domains are found in diverse signaling proteins in bacteria, archaebacteria, and eukaryotes. This CD contains two distinct structural subgroups with either a N- or C-terminal beta-strand forming the peptide-binding groove base. The circular permutation placing the strand on the N-terminus appears to be found in Eumetazoa only, while the C-terminal variant is found in all three kingdoms of life, and seems to co-occur with protease domains. PDZ domains have been named after PSD95(pos
Probab=98.97  E-value=2.1e-09  Score=82.81  Aligned_cols=55  Identities=36%  Similarity=0.696  Sum_probs=51.0

Q ss_pred             cceeecccCCCChhhhCCCCCCCEEEEECCEecCCH--HHHHHHHhcCCCCeEEEEE
Q 012318          391 SGVLVPVVTPGSPAHLAGFLPSDVVIKFDGKPVQSI--TEIIEIMGDRVGEPLKVVV  445 (466)
Q Consensus       391 ~g~~V~~V~~~spA~~aGl~~GD~I~~ing~~v~~~--~~~~~~l~~~~g~~v~l~v  445 (466)
                      .+++|..|.++|||+++||++||+|++|||+++.++  +++.+++....|+.++|++
T Consensus        13 ~~~~V~~v~~~s~a~~~gl~~GD~I~~Ing~~v~~~~~~~~~~~l~~~~g~~v~l~v   69 (70)
T cd00136          13 GGVVVLSVEPGSPAERAGLQAGDVILAVNGTDVKNLTLEDVAELLKKEVGEKVTLTV   69 (70)
T ss_pred             CCEEEEEeCCCCHHHHcCCCCCCEEEEECCEECCCCCHHHHHHHHhhCCCCeEEEEE
Confidence            489999999999999999999999999999999999  9999999887788888876


No 24 
>TIGR02037 degP_htrA_DO periplasmic serine protease, Do/DegQ family. This family consists of a set of proteins variously designated DegP, heat shock protein HtrA, and protease DO. The ortholog in Pseudomonas aeruginosa is designated MucD and is found in an operon that controls mucoid phenotype. This family also includes the DegQ (HhoA) paralog in E. coli which can rescue a DegP mutant, but not the smaller DegS paralog, which cannot. Members of this family are located in the periplasm and have separable functions as both protease and chaperone. Members have a trypsin domain and two copies of a PDZ domain. This protein protects bacteria from thermal and other stresses and may be important for the survival of bacterial pathogens. The chaperone function is dominant at low temperatures, whereas the proteolytic activity is turned on at elevated temperatures.
Probab=98.96  E-value=3.5e-09  Score=110.79  Aligned_cols=90  Identities=30%  Similarity=0.568  Sum_probs=77.8

Q ss_pred             cccccceeecCCHHHHHHhhccCCCCCCCCcceeecccCCCChhhhCCCCCCCEEEEECCEecCCHHHHHHHHhc-CCCC
Q 012318          361 RPWLGLKMLDLNDMIIAQLKERDPSFPNVKSGVLVPVVTPGSPAHLAGFLPSDVVIKFDGKPVQSITEIIEIMGD-RVGE  439 (466)
Q Consensus       361 ~~~lG~~~~~~~~~~~~~~~~~~~~~~~~~~g~~V~~V~~~spA~~aGl~~GD~I~~ing~~v~~~~~~~~~l~~-~~g~  439 (466)
                      ..|+|+.+..+++...++++..     ....|++|.+|.++|||+++||++||+|++|||++|.+++++.+++.. +.++
T Consensus       337 ~~~lGi~~~~l~~~~~~~~~l~-----~~~~Gv~V~~V~~~SpA~~aGL~~GDvI~~Ing~~V~s~~d~~~~l~~~~~g~  411 (428)
T TIGR02037       337 NPFLGLTVANLSPEIRKELRLK-----GDVKGVVVTKVVSGSPAARAGLQPGDVILSVNQQPVSSVAELRKVLDRAKKGG  411 (428)
T ss_pred             ccccceEEecCCHHHHHHcCCC-----cCcCceEEEEeCCCCHHHHcCCCCCCEEEEECCEEcCCHHHHHHHHHhcCCCC
Confidence            5689999999998887766432     223699999999999999999999999999999999999999999876 5788


Q ss_pred             eEEEEEEECCCeEEEEE
Q 012318          440 PLKVVVQRANDQLVTLT  456 (466)
Q Consensus       440 ~v~l~v~R~~g~~~~l~  456 (466)
                      .+.|+|.| +|+...+.
T Consensus       412 ~v~l~v~R-~g~~~~~~  427 (428)
T TIGR02037       412 RVALLILR-GGATIFVT  427 (428)
T ss_pred             EEEEEEEE-CCEEEEEE
Confidence            99999999 78876654


No 25 
>COG3591 V8-like Glu-specific endopeptidase [Amino acid transport and metabolism]
Probab=98.83  E-value=6.3e-08  Score=92.36  Aligned_cols=160  Identities=18%  Similarity=0.196  Sum_probs=95.2

Q ss_pred             ceeEEEEEeCCCeEEeccccccCCCCCCCCCCceEEEEe----CCCc-EE--EEEEE-EecC---CCCEEEEEeCCC---
Q 012318          155 GIGSGAIVDADGTILTCAHVVVDFHGSRALPKGKVDVTL----QDGR-TF--EGTVL-NADF---HSDIAIVKINSK---  220 (466)
Q Consensus       155 ~~GSGfiI~~~G~ILT~aHvv~~~~~~~~~~~~~i~V~~----~~g~-~~--~a~vv-~~d~---~~DlAlLkv~~~---  220 (466)
                      ..|++|+|.++ .+||++||+......+    ..+.+..    .++. .+  ..... .+..   ..|.+...+...   
T Consensus        64 ~~~~~~lI~pn-tvLTa~Hc~~s~~~G~----~~~~~~p~g~~~~~~~~~~~~~~~~~~~~g~~~~~d~~~~~v~~~~~~  138 (251)
T COG3591          64 LCTAATLIGPN-TVLTAGHCIYSPDYGE----DDIAAAPPGVNSDGGPFYGITKIEIRVYPGELYKEDGASYDVGEAALE  138 (251)
T ss_pred             ceeeEEEEcCc-eEEEeeeEEecCCCCh----hhhhhcCCcccCCCCCCCceeeEEEEecCCceeccCCceeeccHHHhc
Confidence            44566999998 9999999997643211    1222211    1111 11  11111 1112   345555555421   


Q ss_pred             ------CCCCccccCCCCCCCCCCEEEEEecCCCCCCce----EEeEEeeeecCccCCCCCCccccEEEEcccCCCCCcc
Q 012318          221 ------TPLPAAKLGTSSKLCPGDWVVAMGCPHSLQNTV----TAGIVSCVDRKSSDLGLGGMRREYLQTDCAINAGNSG  290 (466)
Q Consensus       221 ------~~~~~~~l~~s~~~~~G~~V~~iG~p~~~~~~~----t~G~Vs~~~~~~~~~~~~~~~~~~i~~~~~i~~G~SG  290 (466)
                            .......+......+.++.+.++|||.+..+..    ..+.+....            ...+.++|.+++|+||
T Consensus       139 ~g~~~~~~~~~~~~~~~~~~~~~d~i~v~GYP~dk~~~~~~~e~t~~v~~~~------------~~~l~y~~dT~pG~SG  206 (251)
T COG3591         139 SGINIGDVVNYLKRNTASEAKANDRITVIGYPGDKPNIGTMWESTGKVNSIK------------GNKLFYDADTLPGSSG  206 (251)
T ss_pred             cCCCccccccccccccccccccCceeEEEeccCCCCcceeEeeecceeEEEe------------cceEEEEecccCCCCC
Confidence                  112223333445678899999999998765332    223332221            2368999999999999


Q ss_pred             ceeecCCCeEEEEEEeEecCCC--eee-EEEeHHHHHHHHHHHH
Q 012318          291 GPLVNIDGEIVGINIMKVAAAD--GLS-FAVPIDSAAKIIEQFK  331 (466)
Q Consensus       291 GPlvd~~G~VVGI~~~~~~~~~--g~~-~aip~~~i~~~l~~l~  331 (466)
                      +|+++.+.++||++..+....+  ..+ ...-...++++++++.
T Consensus       207 Spv~~~~~~vigv~~~g~~~~~~~~~n~~vr~t~~~~~~I~~~~  250 (251)
T COG3591         207 SPVLISKDEVIGVHYNGPGANGGSLANNAVRLTPEILNFIQQNI  250 (251)
T ss_pred             CceEecCceEEEEEecCCCcccccccCcceEecHHHHHHHHHhh
Confidence            9999999999999998765221  122 2234556777777764


No 26 
>TIGR00054 RIP metalloprotease RseP. A model that also detects fragments matches a number of members of the PEPTIDASE FAMILY S2C. The region of match appears not to overlap the active site domain.
Probab=98.81  E-value=1.5e-08  Score=105.55  Aligned_cols=70  Identities=26%  Similarity=0.507  Sum_probs=65.0

Q ss_pred             cceeecccCCCChhhhCCCCCCCEEEEECCEecCCHHHHHHHHhcCCCCeEEEEEEECCCeEEEEEEEecC
Q 012318          391 SGVLVPVVTPGSPAHLAGFLPSDVVIKFDGKPVQSITEIIEIMGDRVGEPLKVVVQRANDQLVTLTVIPEE  461 (466)
Q Consensus       391 ~g~~V~~V~~~spA~~aGl~~GD~I~~ing~~v~~~~~~~~~l~~~~g~~v~l~v~R~~g~~~~l~v~~~~  461 (466)
                      .+++|.+|.++|||+++||++||+|++|||++|++|+|+.+.+....++.+.+++.| +|+..+++++++.
T Consensus       203 ~g~vV~~V~~~SpA~~aGL~~GD~Iv~Vng~~V~s~~dl~~~l~~~~~~~v~l~v~R-~g~~~~~~v~~~~  272 (420)
T TIGR00054       203 IEPVLSDVTPNSPAEKAGLKEGDYIQSINGEKLRSWTDFVSAVKENPGKSMDIKVER-NGETLSISLTPEA  272 (420)
T ss_pred             cCcEEEEECCCCHHHHcCCCCCCEEEEECCEECCCHHHHHHHHHhCCCCceEEEEEE-CCEEEEEEEEEcC
Confidence            478999999999999999999999999999999999999999988788889999999 8899888888854


No 27 
>PRK10779 zinc metallopeptidase RseP; Provisional
Probab=98.73  E-value=4.1e-08  Score=103.26  Aligned_cols=69  Identities=28%  Similarity=0.549  Sum_probs=63.6

Q ss_pred             cceeecccCCCChhhhCCCCCCCEEEEECCEecCCHHHHHHHHhcCCCCeEEEEEEECCCeEEEEEEEec
Q 012318          391 SGVLVPVVTPGSPAHLAGFLPSDVVIKFDGKPVQSITEIIEIMGDRVGEPLKVVVQRANDQLVTLTVIPE  460 (466)
Q Consensus       391 ~g~~V~~V~~~spA~~aGl~~GD~I~~ing~~v~~~~~~~~~l~~~~g~~v~l~v~R~~g~~~~l~v~~~  460 (466)
                      .+++|.+|.++|||+++||++||+|++|||++|.+|+|+.+++....++.+.++|.| +|+..++++++.
T Consensus       221 ~~~vV~~V~~~SpA~~AGL~~GDvIl~Ing~~V~s~~dl~~~l~~~~~~~v~l~v~R-~g~~~~~~v~~~  289 (449)
T PRK10779        221 IEPVLAEVQPNSAASKAGLQAGDRIVKVDGQPLTQWQTFVTLVRDNPGKPLALEIER-QGSPLSLTLTPD  289 (449)
T ss_pred             cCcEEEeeCCCCHHHHcCCCCCCEEEEECCEEcCCHHHHHHHHHhCCCCEEEEEEEE-CCEEEEEEEEee
Confidence            357899999999999999999999999999999999999999887778899999999 889888888875


No 28 
>PRK10779 zinc metallopeptidase RseP; Provisional
Probab=98.71  E-value=3.3e-08  Score=104.00  Aligned_cols=67  Identities=15%  Similarity=0.063  Sum_probs=58.8

Q ss_pred             eeecccCCCChhhhCCCCCCCEEEEECCEecCCHHHHHHHHhc-CCCCeEEEEEEECCCeEEEEEEEec
Q 012318          393 VLVPVVTPGSPAHLAGFLPSDVVIKFDGKPVQSITEIIEIMGD-RVGEPLKVVVQRANDQLVTLTVIPE  460 (466)
Q Consensus       393 ~~V~~V~~~spA~~aGl~~GD~I~~ing~~v~~~~~~~~~l~~-~~g~~v~l~v~R~~g~~~~l~v~~~  460 (466)
                      .+|.+|.++|||++||||+||+|++|||++|.+|+++...+.. ..+++++++|.| +|+.++++++..
T Consensus       128 ~lV~~V~~~SpA~kAGLk~GDvI~~vnG~~V~~~~~l~~~v~~~~~g~~v~v~v~R-~gk~~~~~v~l~  195 (449)
T PRK10779        128 PVVGEIAPNSIAAQAQIAPGTELKAVDGIETPDWDAVRLALVSKIGDESTTITVAP-FGSDQRRDKTLD  195 (449)
T ss_pred             ccccccCCCCHHHHcCCCCCCEEEEECCEEcCCHHHHHHHHHhhccCCceEEEEEe-CCccceEEEEec
Confidence            3689999999999999999999999999999999999887765 667889999999 888776666553


No 29 
>cd00992 PDZ_signaling PDZ domain found in a variety of Eumetazoan signaling molecules, often in tandem arrangements. May be responsible for specific protein-protein interactions, as most PDZ domains bind C-terminal polypeptides, and binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of PDZ domains an N-terminal beta-strand forms the peptide-binding groove base, a circular permutation with respect to PDZ domains found in proteases.
Probab=98.59  E-value=3e-07  Score=72.88  Aligned_cols=69  Identities=25%  Similarity=0.447  Sum_probs=55.9

Q ss_pred             cccccceeecCCHHHHHHhhccCCCCCCCCcceeecccCCCChhhhCCCCCCCEEEEECCEecC--CHHHHHHHHhcCCC
Q 012318          361 RPWLGLKMLDLNDMIIAQLKERDPSFPNVKSGVLVPVVTPGSPAHLAGFLPSDVVIKFDGKPVQ--SITEIIEIMGDRVG  438 (466)
Q Consensus       361 ~~~lG~~~~~~~~~~~~~~~~~~~~~~~~~~g~~V~~V~~~spA~~aGl~~GD~I~~ing~~v~--~~~~~~~~l~~~~g  438 (466)
                      ...+|+.+.....               ...|++|..|.++|||+++||++||+|++|||+++.  +++++.+++....+
T Consensus        11 ~~~~G~~~~~~~~---------------~~~~~~V~~v~~~s~a~~~gl~~GD~I~~ing~~i~~~~~~~~~~~l~~~~~   75 (82)
T cd00992          11 GGGLGFSLRGGKD---------------SGGGIFVSRVEPGGPAERGGLRVGDRILEVNGVSVEGLTHEEAVELLKNSGD   75 (82)
T ss_pred             CCCcCEEEeCccc---------------CCCCeEEEEECCCChHHhCCCCCCCEEEEECCEEcCccCHHHHHHHHHhCCC
Confidence            5678998765321               135899999999999999999999999999999999  88999988876433


Q ss_pred             CeEEEEE
Q 012318          439 EPLKVVV  445 (466)
Q Consensus       439 ~~v~l~v  445 (466)
                       .+.+++
T Consensus        76 -~v~l~v   81 (82)
T cd00992          76 -EVTLTV   81 (82)
T ss_pred             -eEEEEE
Confidence             566654


No 30 
>smart00228 PDZ Domain present in PSD-95, Dlg, and ZO-1/2. Also called DHR (Dlg homologous region) or GLGF (relatively well conserved tetrapeptide in these domains). Some PDZs have been shown to bind C-terminal polypeptides; others appear to bind internal (non-C-terminal) polypeptides. Different PDZs possess different binding specificities.
Probab=98.59  E-value=2.9e-07  Score=73.24  Aligned_cols=71  Identities=30%  Similarity=0.504  Sum_probs=55.5

Q ss_pred             ccccceeecCCHHHHHHhhccCCCCCCCCcceeecccCCCChhhhCCCCCCCEEEEECCEecCCHHHHHHHHh-cCCCCe
Q 012318          362 PWLGLKMLDLNDMIIAQLKERDPSFPNVKSGVLVPVVTPGSPAHLAGFLPSDVVIKFDGKPVQSITEIIEIMG-DRVGEP  440 (466)
Q Consensus       362 ~~lG~~~~~~~~~~~~~~~~~~~~~~~~~~g~~V~~V~~~spA~~aGl~~GD~I~~ing~~v~~~~~~~~~l~-~~~g~~  440 (466)
                      ..+|+.+....               ....|++|..|.++|||+++||++||+|++|||+++.++.+...... ...+..
T Consensus        12 ~~~G~~~~~~~---------------~~~~~~~i~~v~~~s~a~~~gl~~GD~I~~In~~~v~~~~~~~~~~~~~~~~~~   76 (85)
T smart00228       12 GGLGFSLVGGK---------------DEGGGVVVSSVVPGSPAAKAGLKVGDVILEVNGTSVEGLTHLEAVDLLKKAGGK   76 (85)
T ss_pred             CcccEEEECCC---------------CCCCCEEEEEECCCCHHHHcCCCCCCEEEEECCEECCCCCHHHHHHHHHhCCCe
Confidence            78898876421               11168999999999999999999999999999999998766544332 234568


Q ss_pred             EEEEEEE
Q 012318          441 LKVVVQR  447 (466)
Q Consensus       441 v~l~v~R  447 (466)
                      +.+.+.|
T Consensus        77 ~~l~i~r   83 (85)
T smart00228       77 VTLTVLR   83 (85)
T ss_pred             EEEEEEe
Confidence            8999988


No 31 
>PRK10139 serine endoprotease; Provisional
Probab=98.59  E-value=1.6e-07  Score=98.71  Aligned_cols=66  Identities=26%  Similarity=0.450  Sum_probs=59.6

Q ss_pred             CcceeecccCCCChhhhCCCCCCCEEEEECCEecCCHHHHHHHHhcCCCCeEEEEEEECCCeEEEEEE
Q 012318          390 KSGVLVPVVTPGSPAHLAGFLPSDVVIKFDGKPVQSITEIIEIMGDRVGEPLKVVVQRANDQLVTLTV  457 (466)
Q Consensus       390 ~~g~~V~~V~~~spA~~aGl~~GD~I~~ing~~v~~~~~~~~~l~~~~g~~v~l~v~R~~g~~~~l~v  457 (466)
                      ..|++|.+|.++|||+++||++||+|++|||++|.+|+++.+++..+. +.+.|+|+| +|+.+.+.+
T Consensus       389 ~~Gv~V~~V~~~spA~~aGL~~GD~I~~Ing~~v~~~~~~~~~l~~~~-~~v~l~v~R-~g~~~~~~~  454 (455)
T PRK10139        389 TKGIKIDEVVKGSPAAQAGLQKDDVIIGVNRDRVNSIAEMRKVLAAKP-AIIALQIVR-GNESIYLLL  454 (455)
T ss_pred             CCceEEEEeCCCChHHHcCCCCCCEEEEECCEEcCCHHHHHHHHHhCC-CeEEEEEEE-CCEEEEEEe
Confidence            368999999999999999999999999999999999999999998754 689999999 888777665


No 32 
>TIGR02860 spore_IV_B stage IV sporulation protein B. SpoIVB, the stage IV sporulation protein B of endospore-forming bacteria such as Bacillus subtilis, is a serine proteinase, expressed in the spore (rather than mother cell) compartment, that participates in a proteolytic activation cascade for Sigma-K. It appears to be universal among endospore-forming bacteria and occurs nowhere else.
Probab=98.58  E-value=2.4e-07  Score=94.31  Aligned_cols=69  Identities=23%  Similarity=0.486  Sum_probs=60.0

Q ss_pred             cceeecccC--------CCChhhhCCCCCCCEEEEECCEecCCHHHHHHHHhcCCCCeEEEEEEECCCeEEEEEEEec
Q 012318          391 SGVLVPVVT--------PGSPAHLAGFLPSDVVIKFDGKPVQSITEIIEIMGDRVGEPLKVVVQRANDQLVTLTVIPE  460 (466)
Q Consensus       391 ~g~~V~~V~--------~~spA~~aGl~~GD~I~~ing~~v~~~~~~~~~l~~~~g~~v~l~v~R~~g~~~~l~v~~~  460 (466)
                      .|++|....        .+|||+++|||+||+|++|||++|++|+|+.+++....++.+.++|.| +|+..++.++|.
T Consensus       105 ~GVlVvg~~~v~~~~g~~~SPAa~AGLq~GDiIvsING~~V~s~~DL~~iL~~~~g~~V~LtV~R-~Ge~~tv~V~Pv  181 (402)
T TIGR02860       105 KGVLVVGFSDIETEKGKIHSPGEEAGIQIGDRILKINGEKIKNMDDLANLINKAGGEKLTLTIER-GGKIIETVIKPV  181 (402)
T ss_pred             CEEEEEEEEcccccCCCCCCHHHHcCCCCCCEEEEECCEECCCHHHHHHHHHhCCCCeEEEEEEE-CCEEEEEEEEEe
Confidence            688875432        369999999999999999999999999999999988678899999999 889888888764


No 33 
>PF00595 PDZ:  PDZ domain (Also known as DHR or GLGF) Coordinates are not yet available;  InterPro: IPR001478 PDZ domains are found in diverse signalling proteins in bacteria, yeasts, plants, insects and vertebrates. PDZ domains can occur in one or multiple copies and are nearly always found in cytoplasmic proteins. They bind either the carboxyl-terminal sequences of proteins or internal peptide sequences. In most cases, interaction between a PDZ domain and its target is constitutive, with a binding affinity of 1 to 10 micromolar. However, agonist-dependent activation of cell surface receptors is sometimes required to promote interaction with a PDZ protein. PDZ domain proteins are frequently associated with the plasma membrane, a compartment where high concentrations of phosphatidylinositol 4,5-bisphosphate (PIP2) are found. Direct interaction between PIP2 and a subset of class II PDZ domains (syntenin, CASK, Tiam-1) has been demonstrated.  PDZ domains consist of 80 to 90 amino acids comprising six beta-strands (beta-A to beta-F) and two alpha-helices, A and B, compactly arranged in a globular structure. Peptide binding of the ligand takes place in an elongated surface groove, where the ligand binds as an anti-parallel beta-strand that interacts with the beta-B strand and the B helix. The structure of PDZ domains allows binding to a free carboxylate group at the end of a peptide through a carboxylate-binding loop between the beta-A and beta-B strands.; GO: 0005515 protein binding; PDB: 3AXA_A 1WF8_A 1QAV_B 1QAU_A 1B8Q_A 1MC7_A 2KAW_A 1I16_A 1VB7_A 1WI4_A ....
Probab=98.54  E-value=2.7e-07  Score=73.27  Aligned_cols=72  Identities=31%  Similarity=0.554  Sum_probs=56.5

Q ss_pred             ecccccceeecCCHHHHHHhhccCCCCCCCCcceeecccCCCChhhhCCCCCCCEEEEECCEecCCH--HHHHHHHhcCC
Q 012318          360 VRPWLGLKMLDLNDMIIAQLKERDPSFPNVKSGVLVPVVTPGSPAHLAGFLPSDVVIKFDGKPVQSI--TEIIEIMGDRV  437 (466)
Q Consensus       360 ~~~~lG~~~~~~~~~~~~~~~~~~~~~~~~~~g~~V~~V~~~spA~~aGl~~GD~I~~ing~~v~~~--~~~~~~l~~~~  437 (466)
                      ....||+.+..-.+              ....+++|.+|.++|+|+++||++||+|++|||+++.++  .++..++....
T Consensus         8 ~~~~lG~~l~~~~~--------------~~~~~~~V~~v~~~~~a~~~gl~~GD~Il~INg~~v~~~~~~~~~~~l~~~~   73 (81)
T PF00595_consen    8 GNGPLGFTLRGGSD--------------NDEKGVFVSSVVPGSPAERAGLKVGDRILEINGQSVRGMSHDEVVQLLKSAS   73 (81)
T ss_dssp             TTSBSSEEEEEEST--------------SSSEEEEEEEECTTSHHHHHTSSTTEEEEEETTEESTTSBHHHHHHHHHHST
T ss_pred             CCCCcCEEEEecCC--------------CCcCCEEEEEEeCCChHHhcccchhhhhheeCCEeCCCCCHHHHHHHHHCCC
Confidence            56789998875321              012699999999999999999999999999999999977  45566666644


Q ss_pred             CCeEEEEEE
Q 012318          438 GEPLKVVVQ  446 (466)
Q Consensus       438 g~~v~l~v~  446 (466)
                      + .++|+|+
T Consensus        74 ~-~v~L~V~   81 (81)
T PF00595_consen   74 N-PVTLTVQ   81 (81)
T ss_dssp             S-EEEEEEE
T ss_pred             C-cEEEEEC
Confidence            4 7888774


No 34 
>TIGR00225 prc C-terminal peptidase (prc). A C-terminal peptidase with different substrates in different species, including processing of D1 protein of the photosystem II reaction center in higher plants and cleavage of a peptide of 11 residues from the precursor form of penicillin-binding protein in E. coli. E. coli and H. influenzae have the most distal branch of the tree, and their proteins have an N-terminal 200 amino acids that show no homology to other proteins in the database.
Probab=98.51  E-value=2.6e-07  Score=93.50  Aligned_cols=68  Identities=25%  Similarity=0.451  Sum_probs=56.9

Q ss_pred             cceeecccCCCChhhhCCCCCCCEEEEECCEecCCH--HHHHHHHhcCCCCeEEEEEEECCCeEEEEEEEe
Q 012318          391 SGVLVPVVTPGSPAHLAGFLPSDVVIKFDGKPVQSI--TEIIEIMGDRVGEPLKVVVQRANDQLVTLTVIP  459 (466)
Q Consensus       391 ~g~~V~~V~~~spA~~aGl~~GD~I~~ing~~v~~~--~~~~~~l~~~~g~~v~l~v~R~~g~~~~l~v~~  459 (466)
                      .+++|..|.++|||+++||++||+|++|||+++.+|  .++...+....|+.+.++|.| +|+..++++++
T Consensus        62 ~~~~V~~V~~~spA~~aGL~~GD~I~~Ing~~v~~~~~~~~~~~l~~~~g~~v~l~v~R-~g~~~~~~v~l  131 (334)
T TIGR00225        62 GEIVIVSPFEGSPAEKAGIKPGDKIIKINGKSVAGMSLDDAVALIRGKKGTKVSLEILR-AGKSKPLTFTL  131 (334)
T ss_pred             CEEEEEEeCCCChHHHcCCCCCCEEEEECCEECCCCCHHHHHHhccCCCCCEEEEEEEe-CCCCceEEEEE
Confidence            578999999999999999999999999999999987  577777766778899999999 66544444433


No 35 
>PLN00049 carboxyl-terminal processing protease; Provisional
Probab=98.51  E-value=4e-07  Score=93.99  Aligned_cols=68  Identities=25%  Similarity=0.498  Sum_probs=59.1

Q ss_pred             cceeecccCCCChhhhCCCCCCCEEEEECCEecCCH--HHHHHHHhcCCCCeEEEEEEECCCeEEEEEEEe
Q 012318          391 SGVLVPVVTPGSPAHLAGFLPSDVVIKFDGKPVQSI--TEIIEIMGDRVGEPLKVVVQRANDQLVTLTVIP  459 (466)
Q Consensus       391 ~g~~V~~V~~~spA~~aGl~~GD~I~~ing~~v~~~--~~~~~~l~~~~g~~v~l~v~R~~g~~~~l~v~~  459 (466)
                      .|++|..|.++|||+++||++||+|++|||++|.++  .++...+....|..+.|+|.| +|+..+++++.
T Consensus       102 ~g~~V~~V~~~SPA~~aGl~~GD~Iv~InG~~v~~~~~~~~~~~l~g~~g~~v~ltv~r-~g~~~~~~l~r  171 (389)
T PLN00049        102 AGLVVVAPAPGGPAARAGIRPGDVILAIDGTSTEGLSLYEAADRLQGPEGSSVELTLRR-GPETRLVTLTR  171 (389)
T ss_pred             CcEEEEEeCCCChHHHcCCCCCCEEEEECCEECCCCCHHHHHHHHhcCCCCEEEEEEEE-CCEEEEEEEEe
Confidence            489999999999999999999999999999999864  677777776778899999999 78877777654


No 36 
>TIGR03279 cyano_FeS_chp putative FeS-containing Cyanobacterial-specific oxidoreductase. Members of this protein family are predicted FeS-containing oxidoreductases of unknown function, apparently restricted to and universal across the Cyanobacteria. The high trusted cutoff score for this model, 700 bits, excludes homologs from other lineages. This exclusion seems justified because a significant number of sequence positions are simultaneously unique to and invariant across the Cyanobacteria, suggesting a specialized, conserved function, perhaps related to photosynthesis. A distantly related protein family, TIGR03278, is universal in and restricted to archaeal methanogens, and may be linked to methanogenesis.
Probab=98.50  E-value=3.6e-07  Score=93.64  Aligned_cols=63  Identities=22%  Similarity=0.370  Sum_probs=55.3

Q ss_pred             ecccCCCChhhhCCCCCCCEEEEECCEecCCHHHHHHHHhcCCCCeEEEEEE-ECCCeEEEEEEEecC
Q 012318          395 VPVVTPGSPAHLAGFLPSDVVIKFDGKPVQSITEIIEIMGDRVGEPLKVVVQ-RANDQLVTLTVIPEE  461 (466)
Q Consensus       395 V~~V~~~spA~~aGl~~GD~I~~ing~~v~~~~~~~~~l~~~~g~~v~l~v~-R~~g~~~~l~v~~~~  461 (466)
                      |..|.++|+|+++||++||+|++|||+++.+|.|+..++.   ++.+.++|. | +|+..++++.+++
T Consensus         2 I~~V~pgSpAe~AGLe~GD~IlsING~~V~Dw~D~~~~l~---~e~l~L~V~~r-dGe~~~l~Ie~~~   65 (433)
T TIGR03279         2 ISAVLPGSIAEELGFEPGDALVSINGVAPRDLIDYQFLCA---DEELELEVLDA-NGESHQIEIEKDL   65 (433)
T ss_pred             cCCcCCCCHHHHcCCCCCCEEEEECCEECCCHHHHHHHhc---CCcEEEEEEcC-CCeEEEEEEecCC
Confidence            6789999999999999999999999999999999887774   356889997 6 8888888888754


No 37 
>PRK10942 serine endoprotease; Provisional
Probab=98.49  E-value=3.8e-07  Score=96.28  Aligned_cols=65  Identities=34%  Similarity=0.572  Sum_probs=58.9

Q ss_pred             cceeecccCCCChhhhCCCCCCCEEEEECCEecCCHHHHHHHHhcCCCCeEEEEEEECCCeEEEEEE
Q 012318          391 SGVLVPVVTPGSPAHLAGFLPSDVVIKFDGKPVQSITEIIEIMGDRVGEPLKVVVQRANDQLVTLTV  457 (466)
Q Consensus       391 ~g~~V~~V~~~spA~~aGl~~GD~I~~ing~~v~~~~~~~~~l~~~~g~~v~l~v~R~~g~~~~l~v  457 (466)
                      .|++|.+|.++|+|+++||++||+|++|||++|.+++++.+++..+. +.+.|+|.| +|..+.+.+
T Consensus       408 ~gvvV~~V~~~S~A~~aGL~~GDvIv~VNg~~V~s~~dl~~~l~~~~-~~v~l~V~R-~g~~~~v~~  472 (473)
T PRK10942        408 KGVVVDNVKPGTPAAQIGLKKGDVIIGANQQPVKNIAELRKILDSKP-SVLALNIQR-GDSSIYLLM  472 (473)
T ss_pred             CCeEEEEeCCCChHHHcCCCCCCEEEEECCEEcCCHHHHHHHHHhCC-CeEEEEEEE-CCEEEEEEe
Confidence            58999999999999999999999999999999999999999988744 689999999 888776654


No 38 
>KOG3627 consensus Trypsin [Amino acid transport and metabolism]
Probab=98.39  E-value=1.9e-05  Score=76.50  Aligned_cols=169  Identities=20%  Similarity=0.240  Sum_probs=95.3

Q ss_pred             ceeEEEEEeCCCeEEeccccccCCCCCCCCCCceEEEEeCC---------C---cEEEE-EEEEecC-------C-CCEE
Q 012318          155 GIGSGAIVDADGTILTCAHVVVDFHGSRALPKGKVDVTLQD---------G---RTFEG-TVLNADF-------H-SDIA  213 (466)
Q Consensus       155 ~~GSGfiI~~~G~ILT~aHvv~~~~~~~~~~~~~i~V~~~~---------g---~~~~a-~vv~~d~-------~-~DlA  213 (466)
                      ..|-|.+|+++ ||||++||+....     .. .+.|.+..         +   ..... +++ .|+       . +|||
T Consensus        38 ~~Cggsli~~~-~vltaaHC~~~~~-----~~-~~~V~~G~~~~~~~~~~~~~~~~~~v~~~i-~H~~y~~~~~~~nDia  109 (256)
T KOG3627|consen   38 HLCGGSLISPR-WVLTAAHCVKGAS-----AS-LYTVRLGEHDINLSVSEGEEQLVGDVEKII-VHPNYNPRTLENNDIA  109 (256)
T ss_pred             eeeeeEEeeCC-EEEEChhhCCCCC-----Cc-ceEEEECccccccccccCchhhhceeeEEE-ECCCCCCCCCCCCCEE
Confidence            36788788776 9999999998742     00 33343321         1   11111 222 222       3 8999


Q ss_pred             EEEeCCC----CCCCccccCCCCC---CCCCCEEEEEecCCCC------CCceEEeEEeeeecCccCCCCCC---ccccE
Q 012318          214 IVKINSK----TPLPAAKLGTSSK---LCPGDWVVAMGCPHSL------QNTVTAGIVSCVDRKSSDLGLGG---MRREY  277 (466)
Q Consensus       214 lLkv~~~----~~~~~~~l~~s~~---~~~G~~V~~iG~p~~~------~~~~t~G~Vs~~~~~~~~~~~~~---~~~~~  277 (466)
                      ||+++.+    ..+.++.+.....   ...+..+++.||+...      ...+....+..+....|...+..   .....
T Consensus       110 ll~l~~~v~~~~~i~piclp~~~~~~~~~~~~~~~v~GWG~~~~~~~~~~~~L~~~~v~i~~~~~C~~~~~~~~~~~~~~  189 (256)
T KOG3627|consen  110 LLRLSEPVTFSSHIQPICLPSSADPYFPPGGTTCLVSGWGRTESGGGPLPDTLQEVDVPIISNSECRRAYGGLGTITDTM  189 (256)
T ss_pred             EEEECCCcccCCcccccCCCCCcccCCCCCCCEEEEEeCCCcCCCCCCCCceeEEEEEeEcChhHhcccccCccccCCCE
Confidence            9999975    3345556643222   3444888899997532      22333333333333333322211   11122


Q ss_pred             EEEc-----ccCCCCCccceeecCC---CeEEEEEEeEecC-CC--eeeEEEeHHHHHHHHHHHH
Q 012318          278 LQTD-----CAINAGNSGGPLVNID---GEIVGINIMKVAA-AD--GLSFAVPIDSAAKIIEQFK  331 (466)
Q Consensus       278 i~~~-----~~i~~G~SGGPlvd~~---G~VVGI~~~~~~~-~~--g~~~aip~~~i~~~l~~l~  331 (466)
                      +...     ..+|.|+|||||+-.+   ..++||++++... ..  .-+.+..+....+|+++..
T Consensus       190 ~Ca~~~~~~~~~C~GDSGGPLv~~~~~~~~~~GivS~G~~~C~~~~~P~vyt~V~~y~~WI~~~~  254 (256)
T KOG3627|consen  190 LCAGGPEGGKDACQGDSGGPLVCEDNGRWVLVGIVSWGSGGCGQPNYPGVYTRVSSYLDWIKENI  254 (256)
T ss_pred             EeeCccCCCCccccCCCCCeEEEeeCCcEEEEEEEEecCCCCCCCCCCeEEeEhHHhHHHHHHHh
Confidence            3332     3368999999999554   6999999998641 11  1233566777777777643


No 39 
>PF14685 Tricorn_PDZ:  Tricorn protease PDZ domain; PDB: 1N6F_D 1N6D_C 1N6E_C 1K32_A.
Probab=98.30  E-value=3.2e-06  Score=68.03  Aligned_cols=68  Identities=22%  Similarity=0.430  Sum_probs=49.4

Q ss_pred             CcceeecccCCC--------ChhhhCCC--CCCCEEEEECCEecCCHHHHHHHHhcCCCCeEEEEEEECCCeEEEEEE
Q 012318          390 KSGVLVPVVTPG--------SPAHLAGF--LPSDVVIKFDGKPVQSITEIIEIMGDRVGEPLKVVVQRANDQLVTLTV  457 (466)
Q Consensus       390 ~~g~~V~~V~~~--------spA~~aGl--~~GD~I~~ing~~v~~~~~~~~~l~~~~g~~v~l~v~R~~g~~~~l~v  457 (466)
                      ..+..|..|.++        ||..+.|+  ++||+|++|||+++..-.++..+|..+.|+.+.|+|.+.+++.+++.|
T Consensus        11 ~~~y~I~~I~~gd~~~~~~~sPL~~pGv~v~~GD~I~aInG~~v~~~~~~~~lL~~~agk~V~Ltv~~~~~~~R~v~V   88 (88)
T PF14685_consen   11 NGGYRIARIYPGDPWNPNARSPLAQPGVDVREGDYILAINGQPVTADANPYRLLEGKAGKQVLLTVNRKPGGARTVVV   88 (88)
T ss_dssp             TTEEEEEEE-BS-TTSSS-B-GGGGGS----TT-EEEEETTEE-BTTB-HHHHHHTTTTSEEEEEEE-STT-EEEEEE
T ss_pred             CCEEEEEEEeCCCCCCccccCCccCCCCCCCCCCEEEEECCEECCCCCCHHHHhcccCCCEEEEEEecCCCCceEEEC
Confidence            356778887664        78888875  599999999999999999999999999999999999995555666543


No 40 
>PF00863 Peptidase_C4:  Peptidase family C4 This family belongs to family C4 of the peptidase classification.;  InterPro: IPR001730 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. 
These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad.  Nuclear inclusion A (NIA) proteases from potyviruses are cysteine peptidases that belong to the MEROPS peptidase family C4 (NIa protease family, clan PA(C)).  Potyviruses include plant viruses in which the single-stranded RNA encodes a polyprotein with NIA protease activity, where proteolytic cleavage is specific for Gln+Gly sites. The NIA protease acts on the polyprotein, releasing itself by Gln+Gly cleavage at both the N- and C-termini. It further processes the polyprotein by cleavage at five similar sites in the C-terminal half of the sequence. In addition to its C-terminal protease activity, the NIA protease contains an N-terminal domain that has been implicated in the transcription process. This peptidase is present in the nuclear inclusion protein of potyviruses.; GO: 0008234 cysteine-type peptidase activity, 0006508 proteolysis; PDB: 3MMG_B 1Q31_B 1LVB_A 1LVM_A.
Probab=98.28  E-value=8.9e-05  Score=70.41  Aligned_cols=174  Identities=19%  Similarity=0.252  Sum_probs=91.4

Q ss_pred             HhCCceEEEEccccccccccCCceeEEEEEeCCCeEEeccccccCCCCCCCCCCceEEEEeCCCcEEEEE-----EEEec
Q 012318          133 RVCPAVVNLSAPREFLGILSGRGIGSGAIVDADGTILTCAHVVVDFHGSRALPKGKVDVTLQDGRTFEGT-----VLNAD  207 (466)
Q Consensus       133 ~~~~SVV~I~~~~~~~~~~~~~~~GSGfiI~~~G~ILT~aHvv~~~~~~~~~~~~~i~V~~~~g~~~~a~-----vv~~d  207 (466)
                      -++..|++|.-..+.     ....--|+.+. + +|+|++|..+...       ..+++....|.- ...     -+..=
T Consensus        15 ~Ia~~ic~l~n~s~~-----~~~~l~gigyG-~-~iItn~HLf~~nn-------g~L~i~s~hG~f-~v~nt~~lkv~~i   79 (235)
T PF00863_consen   15 PIASNICRLTNESDG-----GTRSLYGIGYG-S-YIITNAHLFKRNN-------GELTIKSQHGEF-TVPNTTQLKVHPI   79 (235)
T ss_dssp             HHHTTEEEEEEEETT-----EEEEEEEEEET-T-EEEEEGGGGSSTT-------CEEEEEETTEEE-EECEGGGSEEEE-
T ss_pred             hhhheEEEEEEEeCC-----CeEEEEEEeEC-C-EEEEChhhhccCC-------CeEEEEeCceEE-EcCCccccceEEe
Confidence            345678888743321     11223456665 3 9999999997652       568887776632 211     23344


Q ss_pred             CCCCEEEEEeCCCCCCCccccC-CCCCCCCCCEEEEEecCCCCCCceEEeEEeeeecCccCCCCCCccccEEEEcccCCC
Q 012318          208 FHSDIAIVKINSKTPLPAAKLG-TSSKLCPGDWVVAMGCPHSLQNTVTAGIVSCVDRKSSDLGLGGMRREYLQTDCAINA  286 (466)
Q Consensus       208 ~~~DlAlLkv~~~~~~~~~~l~-~s~~~~~G~~V~~iG~p~~~~~~~t~G~Vs~~~~~~~~~~~~~~~~~~i~~~~~i~~  286 (466)
                      +..||.++|++.  ++||.+-. .-..++.++.|+++|.-+.....  ...|+.......     .....++.+...+..
T Consensus        80 ~~~DiviirmPk--DfpPf~~kl~FR~P~~~e~v~mVg~~fq~k~~--~s~vSesS~i~p-----~~~~~fWkHwIsTk~  150 (235)
T PF00863_consen   80 EGRDIVIIRMPK--DFPPFPQKLKFRAPKEGERVCMVGSNFQEKSI--SSTVSESSWIYP-----EENSHFWKHWISTKD  150 (235)
T ss_dssp             TCSSEEEEE--T--TS----S---B----TT-EEEEEEEECSSCCC--EEEEEEEEEEEE-----ETTTTEEEE-C---T
T ss_pred             CCccEEEEeCCc--ccCCcchhhhccCCCCCCEEEEEEEEEEcCCe--eEEECCceEEee-----cCCCCeeEEEecCCC
Confidence            688999999995  44543321 22467899999999985543321  112222211111     012457888889999


Q ss_pred             CCccceeecC-CCeEEEEEEeEecCCCeeeEEEeHHHHHHHHHHHHHc
Q 012318          287 GNSGGPLVNI-DGEIVGINIMKVAAADGLSFAVPIDSAAKIIEQFKKN  333 (466)
Q Consensus       287 G~SGGPlvd~-~G~VVGI~~~~~~~~~g~~~aip~~~i~~~l~~l~~~  333 (466)
                      |+=|.|||+. ||.+||+++..... ...+|+.|+..  ++++.+.+.
T Consensus       151 G~CG~PlVs~~Dg~IVGiHsl~~~~-~~~N~F~~f~~--~f~~~~l~~  195 (235)
T PF00863_consen  151 GDCGLPLVSTKDGKIVGIHSLTSNT-SSRNYFTPFPD--DFEEFYLEN  195 (235)
T ss_dssp             T-TT-EEEETTT--EEEEEEEEETT-TSSEEEEE--T--THHHHHCC-
T ss_pred             CccCCcEEEcCCCcEEEEEcCccCC-CCeEEEEcCCH--HHHHHHhcc
Confidence            9999999976 49999999987644 45778888653  444444433


No 41 
>TIGR00054 RIP metalloprotease RseP. A model that detects fragments as well matches a number of members of the PEPTIDASE FAMILY S2C. The region of match appears not to overlap the active site domain.
Probab=98.25  E-value=1.7e-06  Score=90.27  Aligned_cols=64  Identities=17%  Similarity=0.280  Sum_probs=55.8

Q ss_pred             cceeecccCCCChhhhCCCCCCCEEEEECCEecCCHHHHHHHHhcCCCCeEEEEEEECCCeEEEEE
Q 012318          391 SGVLVPVVTPGSPAHLAGFLPSDVVIKFDGKPVQSITEIIEIMGDRVGEPLKVVVQRANDQLVTLT  456 (466)
Q Consensus       391 ~g~~V~~V~~~spA~~aGl~~GD~I~~ing~~v~~~~~~~~~l~~~~g~~v~l~v~R~~g~~~~l~  456 (466)
                      .|.+|.+|.++|||+++||++||+|+++||+++.+++++...+.... .++.+++.| +++..++.
T Consensus       128 ~g~~V~~V~~~SpA~~AGL~~GDvI~~vng~~v~~~~dl~~~ia~~~-~~v~~~I~r-~g~~~~l~  191 (420)
T TIGR00054       128 VGPVIELLDKNSIALEAGIEPGDEILSVNGNKIPGFKDVRQQIADIA-GEPMVEILA-ERENWTFE  191 (420)
T ss_pred             CCceeeccCCCCHHHHcCCCCCCEEEEECCEEcCCHHHHHHHHHhhc-ccceEEEEE-ecCceEec
Confidence            67889999999999999999999999999999999999998887755 678899999 66655443


No 42 
>KOG3129 consensus 26S proteasome regulatory complex, subunit PSMD9 [Posttranslational modification, protein turnover, chaperones]
Probab=98.24  E-value=3.1e-06  Score=77.30  Aligned_cols=73  Identities=26%  Similarity=0.369  Sum_probs=61.9

Q ss_pred             ceeecccCCCChhhhCCCCCCCEEEEECCEecCCHHHHHH---HHhcCCCCeEEEEEEECCCeEEEEEEEecCCCCC
Q 012318          392 GVLVPVVTPGSPAHLAGFLPSDVVIKFDGKPVQSITEIIE---IMGDRVGEPLKVVVQRANDQLVTLTVIPEEANPD  465 (466)
Q Consensus       392 g~~V~~V~~~spA~~aGl~~GD~I~~ing~~v~~~~~~~~---~l~~~~g~~v~l~v~R~~g~~~~l~v~~~~~~~~  465 (466)
                      -++|++|.++|||+++||+.||.|+++....-.++..++.   ......++.+.++|.| .|+.+.+.++|..|.++
T Consensus       140 Fa~V~sV~~~SPA~~aGl~~gD~il~fGnV~sgn~~~lq~i~~~v~~~e~~~v~v~v~R-~g~~v~L~ltP~~W~Gr  215 (231)
T KOG3129|consen  140 FAVVDSVVPGSPADEAGLCVGDEILKFGNVHSGNFLPLQNIAAVVQSNEDQIVSVTVIR-EGQKVVLSLTPKKWQGR  215 (231)
T ss_pred             eEEEeecCCCChhhhhCcccCceEEEecccccccchhHHHHHHHHHhccCcceeEEEec-CCCEEEEEeCcccccCC
Confidence            4679999999999999999999999999887766655543   3345778899999999 89999999999988764


No 43 
>COG0793 Prc Periplasmic protease [Cell envelope biogenesis, outer membrane]
Probab=98.21  E-value=5.1e-06  Score=85.93  Aligned_cols=70  Identities=36%  Similarity=0.567  Sum_probs=59.9

Q ss_pred             cceeecccCCCChhhhCCCCCCCEEEEECCEecCCH--HHHHHHHhcCCCCeEEEEEEECC-CeEEEEEEEec
Q 012318          391 SGVLVPVVTPGSPAHLAGFLPSDVVIKFDGKPVQSI--TEIIEIMGDRVGEPLKVVVQRAN-DQLVTLTVIPE  460 (466)
Q Consensus       391 ~g~~V~~V~~~spA~~aGl~~GD~I~~ing~~v~~~--~~~~~~l~~~~g~~v~l~v~R~~-g~~~~l~v~~~  460 (466)
                      .++.|.++.+++||+++||++||+|++|||+++...  +++.+.++..+|..++|++.|.+ ++.++++++.+
T Consensus       112 ~~~~V~s~~~~~PA~kagi~~GD~I~~IdG~~~~~~~~~~av~~irG~~Gt~V~L~i~r~~~~k~~~v~l~Re  184 (406)
T COG0793         112 GGVKVVSPIDGSPAAKAGIKPGDVIIKIDGKSVGGVSLDEAVKLIRGKPGTKVTLTILRAGGGKPFTVTLTRE  184 (406)
T ss_pred             CCcEEEecCCCChHHHcCCCCCCEEEEECCEEccCCCHHHHHHHhCCCCCCeEEEEEEEcCCCceeEEEEEEE
Confidence            688899999999999999999999999999999876  56777788889999999999942 45666666654


No 44 
>PF04495 GRASP55_65:  GRASP55/65 PDZ-like domain ;  InterPro: IPR007583 GRASP55 (Golgi reassembly stacking protein of 55 kDa) and GRASP65 (a 65 kDa) protein are highly homologous. GRASP55 is a component of the Golgi stacking machinery. GRASP65, an N-ethylmaleimide-sensitive membrane protein required for the stacking of Golgi cisternae in a cell-free system [].; PDB: 3RLE_A 4EDJ_A.
Probab=98.05  E-value=1.5e-05  Score=69.98  Aligned_cols=72  Identities=29%  Similarity=0.494  Sum_probs=55.2

Q ss_pred             CcceeecccCCCChhhhCCCCC-CCEEEEECCEecCCHHHHHHHHhcCCCCeEEEEEEECC-CeEEEEEEEecC
Q 012318          390 KSGVLVPVVTPGSPAHLAGFLP-SDVVIKFDGKPVQSITEIIEIMGDRVGEPLKVVVQRAN-DQLVTLTVIPEE  461 (466)
Q Consensus       390 ~~g~~V~~V~~~spA~~aGl~~-GD~I~~ing~~v~~~~~~~~~l~~~~g~~v~l~v~R~~-g~~~~l~v~~~~  461 (466)
                      ..+.-|.+|.|+|||++|||++ .|.|+.+|+....+.++|.+.+..+.++.+.|.|...+ ...+.++++|..
T Consensus        42 ~~~~~Vl~V~p~SPA~~AGL~p~~DyIig~~~~~l~~~~~l~~~v~~~~~~~l~L~Vyns~~~~vR~V~i~P~~  115 (138)
T PF04495_consen   42 EEGWHVLRVAPNSPAAKAGLEPFFDYIIGIDGGLLDDEDDLFELVEANENKPLQLYVYNSKTDSVREVTITPSR  115 (138)
T ss_dssp             CCEEEEEEE-TTSHHHHTT--TTTEEEEEETTCE--STCHHHHHHHHTTTS-EEEEEEETTTTCEEEEEE---T
T ss_pred             cceEEEeEecCCCHHHHCCccccccEEEEccceecCCHHHHHHHHHHcCCCcEEEEEEECCCCeEEEEEEEcCC
Confidence            4688899999999999999999 59999999999999999999999888999999999743 356788988863


No 45 
>COG3975 Predicted protease with the C-terminal PDZ domain [General function prediction only]
Probab=97.94  E-value=1.9e-05  Score=81.67  Aligned_cols=65  Identities=25%  Similarity=0.380  Sum_probs=54.6

Q ss_pred             CCcceeecccCCCChhhhCCCCCCCEEEEECCEecCCHHHHHHHHhc-CCCCeEEEEEEECCCeEEEEEEEecCC
Q 012318          389 VKSGVLVPVVTPGSPAHLAGFLPSDVVIKFDGKPVQSITEIIEIMGD-RVGEPLKVVVQRANDQLVTLTVIPEEA  462 (466)
Q Consensus       389 ~~~g~~V~~V~~~spA~~aGl~~GD~I~~ing~~v~~~~~~~~~l~~-~~g~~v~l~v~R~~g~~~~l~v~~~~~  462 (466)
                      ...+.+|..|.++|||++|||.+||.|++|||.        .+.+.. +.++++++.+.| .|..+++.++....
T Consensus       460 ~~g~~~i~~V~~~gPA~~AGl~~Gd~ivai~G~--------s~~l~~~~~~d~i~v~~~~-~~~L~e~~v~~~~~  525 (558)
T COG3975         460 EGGHEKITFVFPGGPAYKAGLSPGDKIVAINGI--------SDQLDRYKVNDKIQVHVFR-EGRLREFLVKLGGD  525 (558)
T ss_pred             cCCeeEEEecCCCChhHhccCCCccEEEEEcCc--------cccccccccccceEEEEcc-CCceEEeecccCCC
Confidence            356788999999999999999999999999998        233443 788899999999 88998888877543


No 46 
>PRK11186 carboxy-terminal protease; Provisional
Probab=97.90  E-value=3.5e-05  Score=84.00  Aligned_cols=70  Identities=14%  Similarity=0.355  Sum_probs=55.7

Q ss_pred             cceeecccCCCChhhhC-CCCCCCEEEEEC--CEecC---C--HHHHHHHHhcCCCCeEEEEEEEC--CCeEEEEEEEec
Q 012318          391 SGVLVPVVTPGSPAHLA-GFLPSDVVIKFD--GKPVQ---S--ITEIIEIMGDRVGEPLKVVVQRA--NDQLVTLTVIPE  460 (466)
Q Consensus       391 ~g~~V~~V~~~spA~~a-Gl~~GD~I~~in--g~~v~---~--~~~~~~~l~~~~g~~v~l~v~R~--~g~~~~l~v~~~  460 (466)
                      .+++|.+|.+||||+++ ||++||+|++||  |+++.   .  .+++..+|....|.+|.|+|.|.  +++.++++++.+
T Consensus       255 ~~~~V~~vipGsPA~ka~gLk~GD~IlaVn~~g~~~~dv~g~~~~~vv~lirG~~Gt~V~LtV~r~~~~~~~~~vtl~R~  334 (667)
T PRK11186        255 DYTVINSLVAGGPAAKSKKLSVGDKIVGVGQDGKPIVDVIGWRLDDVVALIKGPKGSKVRLEILPAGKGTKTRIVTLTRD  334 (667)
T ss_pred             CeEEEEEccCCChHHHhCCCCCCCEEEEECCCCCcccccccCCHHHHHHHhcCCCCCEEEEEEEeCCCCCceEEEEEEee
Confidence            46889999999999998 999999999999  55443   2  35788888888899999999983  345666666543


No 47 
>COG3480 SdrC Predicted secreted protein containing a PDZ domain [Signal transduction mechanisms]
Probab=97.90  E-value=4.4e-05  Score=74.27  Aligned_cols=70  Identities=26%  Similarity=0.403  Sum_probs=60.6

Q ss_pred             CCcceeecccCCCChhhhCCCCCCCEEEEECCEecCCHHHHHHHHhc-CCCCeEEEEEEECCCeEEEEEEEe
Q 012318          389 VKSGVLVPVVTPGSPAHLAGFLPSDVVIKFDGKPVQSITEIIEIMGD-RVGEPLKVVVQRANDQLVTLTVIP  459 (466)
Q Consensus       389 ~~~g~~V~~V~~~spA~~aGl~~GD~I~~ing~~v~~~~~~~~~l~~-~~g~~v~l~v~R~~g~~~~l~v~~  459 (466)
                      .-.|+++..|..++|+... |+.||.|++|||+++.+.+++..++.. ++|+.+++++.|.+++....+++.
T Consensus       128 ~y~gvyv~~v~~~~~~~gk-l~~gD~i~avdg~~f~s~~e~i~~v~~~k~Gd~VtI~~~r~~~~~~~~~~tl  198 (342)
T COG3480         128 TYAGVYVLSVIDNSPFKGK-LEAGDTIIAVDGEPFTSSDELIDYVSSKKPGDEVTIDYERHNETPEIVTITL  198 (342)
T ss_pred             EEeeEEEEEccCCcchhce-eccCCeEEeeCCeecCCHHHHHHHHhccCCCCeEEEEEEeccCCCceEEEEE
Confidence            3469999999999999987 999999999999999999999998876 899999999998666655544444


No 48 
>PRK09681 putative type II secretion protein GspC; Provisional
Probab=97.88  E-value=4.8e-05  Score=73.92  Aligned_cols=61  Identities=20%  Similarity=0.356  Sum_probs=51.7

Q ss_pred             cCCC---ChhhhCCCCCCCEEEEECCEecCCHHHHHHHHhc-CCCCeEEEEEEECCCeEEEEEEEe
Q 012318          398 VTPG---SPAHLAGFLPSDVVIKFDGKPVQSITEIIEIMGD-RVGEPLKVVVQRANDQLVTLTVIP  459 (466)
Q Consensus       398 V~~~---spA~~aGl~~GD~I~~ing~~v~~~~~~~~~l~~-~~g~~v~l~v~R~~g~~~~l~v~~  459 (466)
                      +.|+   .--+++|||+||++++|||.++++.++..+++.. .....++|+|+| +|+..++.+..
T Consensus       211 l~Pgkd~~lF~~~GLq~GDva~sING~dL~D~~qa~~l~~~L~~~tei~ltVeR-dGq~~~i~i~l  275 (276)
T PRK09681        211 VKPGADRSLFDASGFKEGDIAIALNQQDFTDPRAMIALMRQLPSMDSIQLTVLR-KGARHDISIAL  275 (276)
T ss_pred             ECCCCcHHHHHHcCCCCCCEEEEeCCeeCCCHHHHHHHHHHhccCCeEEEEEEE-CCEEEEEEEEc
Confidence            4555   3458899999999999999999999988888876 667789999999 99998887754


No 49 
>PF12812 PDZ_1:  PDZ-like domain
Probab=97.66  E-value=0.00017  Score=56.91  Aligned_cols=69  Identities=20%  Similarity=0.236  Sum_probs=58.0

Q ss_pred             cccccceeecCCHHHHHHhhccCCCCCCCCcceeecccCCCChhhhCCCCCCCEEEEECCEecCCHHHHHHHHhcCC
Q 012318          361 RPWLGLKMLDLNDMIIAQLKERDPSFPNVKSGVLVPVVTPGSPAHLAGFLPSDVVIKFDGKPVQSITEIIEIMGDRV  437 (466)
Q Consensus       361 ~~~lG~~~~~~~~~~~~~~~~~~~~~~~~~~g~~V~~V~~~spA~~aGl~~GD~I~~ing~~v~~~~~~~~~l~~~~  437 (466)
                      .-|.|..+.+++-..++++..        .-|.++.....++++...|+..|-+|.+|||+++.+.++|.+++++-+
T Consensus         8 v~~~Ga~f~~Ls~q~aR~~~~--------~~~gv~v~~~~g~~~~~~~i~~g~iI~~Vn~kpt~~Ld~f~~vvk~ip   76 (78)
T PF12812_consen    8 VEVCGAVFHDLSYQQARQYGI--------PVGGVYVAVSGGSLAFAGGISKGFIITSVNGKPTPDLDDFIKVVKKIP   76 (78)
T ss_pred             EEEcCeecccCCHHHHHHhCC--------CCCEEEEEecCCChhhhCCCCCCeEEEeECCcCCcCHHHHHHHHHhCC
Confidence            358899999999999888853        234555567889999987899999999999999999999999987644


No 50 
>COG5640 Secreted trypsin-like serine protease [Posttranslational modification, protein turnover, chaperones]
Probab=97.65  E-value=0.00059  Score=67.59  Aligned_cols=51  Identities=18%  Similarity=0.279  Sum_probs=35.3

Q ss_pred             ccCCCCCccceeecC--CC-eEEEEEEeEecCCCe---eeEEEeHHHHHHHHHHHHH
Q 012318          282 CAINAGNSGGPLVNI--DG-EIVGINIMKVAAADG---LSFAVPIDSAAKIIEQFKK  332 (466)
Q Consensus       282 ~~i~~G~SGGPlvd~--~G-~VVGI~~~~~~~~~g---~~~aip~~~i~~~l~~l~~  332 (466)
                      ...|.|+||||+|-.  +| .-+||++++.....+   -+...-++....|+++..+
T Consensus       223 ~daCqGDSGGPi~~~g~~G~vQ~GVvSwG~~~Cg~t~~~gVyT~vsny~~WI~a~~~  279 (413)
T COG5640         223 KDACQGDSGGPIFHKGEEGRVQRGVVSWGDGGCGGTLIPGVYTNVSNYQDWIAAMTN  279 (413)
T ss_pred             cccccCCCCCceEEeCCCccEEEeEEEecCCCCCCCCcceeEEehhHHHHHHHHHhc
Confidence            456999999999943  25 457999998663322   1244557888888888544


No 51 
>PF05579 Peptidase_S32:  Equine arteritis virus serine endopeptidase S32;  InterPro: IPR008760 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. 
They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This group of serine peptidases belongs to MEROPS peptidase family S32 (clan PA(S)). The type example is equine arteritis virus serine endopeptidase (equine arteritis virus), which is involved in processing of nidovirus polyproteins [].; GO: 0004252 serine-type endopeptidase activity, 0016032 viral reproduction, 0019082 viral protein processing; PDB: 3FAN_A 3FAO_A 1MBM_A.
Probab=97.61  E-value=0.0013  Score=62.80  Aligned_cols=116  Identities=22%  Similarity=0.373  Sum_probs=62.2

Q ss_pred             CceeEEEEEeCCC--eEEeccccccCCCCCCCCCCceEEEEeCCCcEEEEEEEEecCCCCEEEEEeC-CCCCCCccccCC
Q 012318          154 RGIGSGAIVDADG--TILTCAHVVVDFHGSRALPKGKVDVTLQDGRTFEGTVLNADFHSDIAIVKIN-SKTPLPAAKLGT  230 (466)
Q Consensus       154 ~~~GSGfiI~~~G--~ILT~aHvv~~~~~~~~~~~~~i~V~~~~g~~~~a~vv~~d~~~DlAlLkv~-~~~~~~~~~l~~  230 (466)
                      ...|||=+...+|  .|+|+.||+.+         +...|...+ ..   +...++..-|+|.-.++ -+..+|.+++..
T Consensus       111 ss~Gsggvft~~~~~vvvTAtHVlg~---------~~a~v~~~g-~~---~~~tF~~~GDfA~~~~~~~~G~~P~~k~a~  177 (297)
T PF05579_consen  111 SSVGSGGVFTIGGNTVVVTATHVLGG---------NTARVSGVG-TR---RMLTFKKNGDFAEADITNWPGAAPKYKFAQ  177 (297)
T ss_dssp             SSEEEEEEEECTTEEEEEEEHHHCBT---------TEEEEEETT-EE---EEEEEEEETTEEEEEETTS-S---B--B-T
T ss_pred             ecccccceEEECCeEEEEEEEEEcCC---------CeEEEEecc-eE---EEEEEeccCcEEEEECCCCCCCCCceeecC
Confidence            3556666665555  89999999985         455555443 22   23345566799999994 345667777642


Q ss_pred             CCCCCCCCEEEEEecCCCCCCceEEeEEeeeecCccCCCCCCccccEEEEcccCCCCCccceeecCCCeEEEEEEeEe
Q 012318          231 SSKLCPGDWVVAMGCPHSLQNTVTAGIVSCVDRKSSDLGLGGMRREYLQTDCAINAGNSGGPLVNIDGEIVGINIMKV  308 (466)
Q Consensus       231 s~~~~~G~~V~~iG~p~~~~~~~t~G~Vs~~~~~~~~~~~~~~~~~~i~~~~~i~~G~SGGPlvd~~G~VVGI~~~~~  308 (466)
                      .   ..|.--+..      ...+..|.|...              ..+   |-+.+|+||+|+++.+|.+||+|+...
T Consensus       178 ~---~~GrAyW~t------~tGvE~G~ig~~--------------~~~---~fT~~GDSGSPVVt~dg~liGVHTGSn  229 (297)
T PF05579_consen  178 N---YTGRAYWLT------STGVEPGFIGGG--------------GAV---CFTGPGDSGSPVVTEDGDLIGVHTGSN  229 (297)
T ss_dssp             T----SEEEEEEE------TTEEEEEEEETT--------------EEE---ESS-GGCTT-EEEETTC-EEEEEEEEE
T ss_pred             C---cccceEEEc------ccCcccceecCc--------------eEE---EEcCCCCCCCccCcCCCCEEEEEecCC
Confidence            1   122211111      112333433210              011   334689999999999999999999863


No 52 
>KOG3553 consensus Tax interaction protein TIP1 [Cell wall/membrane/envelope biogenesis]
Probab=97.54  E-value=7.3e-05  Score=60.24  Aligned_cols=37  Identities=30%  Similarity=0.523  Sum_probs=33.7

Q ss_pred             CCCcceeecccCCCChhhhCCCCCCCEEEEECCEecC
Q 012318          388 NVKSGVLVPVVTPGSPAHLAGFLPSDVVIKFDGKPVQ  424 (466)
Q Consensus       388 ~~~~g~~V~~V~~~spA~~aGl~~GD~I~~ing~~v~  424 (466)
                      -...|++|++|.++|||+.|||+.+|.|+.+||...+
T Consensus        56 ytD~GiYvT~V~eGsPA~~AGLrihDKIlQvNG~DfT   92 (124)
T KOG3553|consen   56 YTDKGIYVTRVSEGSPAEIAGLRIHDKILQVNGWDFT   92 (124)
T ss_pred             cCCccEEEEEeccCChhhhhcceecceEEEecCceeE
Confidence            3568999999999999999999999999999998753


No 53 
>PF08192 Peptidase_S64:  Peptidase family S64;  InterPro: IPR012985 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This family of fungal proteins is involved in the processing of membrane bound transcription factor Stp1 [] and belongs to MEROPS peptidase family S64 (clan PA). The processing causes the signalling domain of Stp1 to be passed to the nucleus where several permease genes are induced. The permeases are important for uptake of amino acids, and processing of Stp1 only occurs in an amino acid-rich environment. This family is predicted to be distantly related to the trypsin family (MEROPS peptidase family S1) and to have a typical trypsin-like catalytic triad [].
Probab=97.37  E-value=0.0015  Score=69.59  Aligned_cols=118  Identities=22%  Similarity=0.322  Sum_probs=73.9

Q ss_pred             CCCCEEEEEeCCCC--------CC------CccccCC------CCCCCCCCEEEEEecCCCCCCceEEeEEeeeecCccC
Q 012318          208 FHSDIAIVKINSKT--------PL------PAAKLGT------SSKLCPGDWVVAMGCPHSLQNTVTAGIVSCVDRKSSD  267 (466)
Q Consensus       208 ~~~DlAlLkv~~~~--------~~------~~~~l~~------s~~~~~G~~V~~iG~p~~~~~~~t~G~Vs~~~~~~~~  267 (466)
                      .-.|+|||+++...        ++      |.+.+.+      ...+..|.+|+-+|.-.+    .|.|.+.+..-..  
T Consensus       541 ~LsD~AIIkV~~~~~~~N~LGddi~f~~~dP~l~f~NlyV~~~~~~~~~G~~VfK~GrTTg----yT~G~lNg~klvy--  614 (695)
T PF08192_consen  541 RLSDWAIIKVNKERKCQNYLGDDIQFNEPDPTLMFQNLYVREVVSNLVPGMEVFKVGRTTG----YTTGILNGIKLVY--  614 (695)
T ss_pred             cccceEEEEeCCCceecCCCCccccccCCCccccccccchhhhhhccCCCCeEEEecccCC----ccceEecceEEEE--
Confidence            34599999998531        11      2233322      134678999999998765    3455555442211  


Q ss_pred             CCCCCc-cccEEEEc----ccCCCCCccceeecCCCe------EEEEEEeEecCCCeeeEEEeHHHHHHHHHHHH
Q 012318          268 LGLGGM-RREYLQTD----CAINAGNSGGPLVNIDGE------IVGINIMKVAAADGLSFAVPIDSAAKIIEQFK  331 (466)
Q Consensus       268 ~~~~~~-~~~~i~~~----~~i~~G~SGGPlvd~~G~------VVGI~~~~~~~~~g~~~aip~~~i~~~l~~l~  331 (466)
                      +..+.. ..+++...    .-...|+||+-|++.-+.      |+||.+..-.+..+++++.|+..|.+-|++.-
T Consensus       615 w~dG~i~s~efvV~s~~~~~Fa~~GDSGS~VLtk~~d~~~gLgvvGMlhsydge~kqfglftPi~~il~rl~~vT  689 (695)
T PF08192_consen  615 WADGKIQSSEFVVSSDNNPAFASGGDSGSWVLTKLEDNNKGLGVVGMLHSYDGEQKQFGLFTPINEILDRLEEVT  689 (695)
T ss_pred             ecCCCeEEEEEEEecCCCccccCCCCcccEEEecccccccCceeeEEeeecCCccceeeccCcHHHHHHHHHHhh
Confidence            111111 12333333    224689999999986444      99999987666678999999988877777653


No 54 
>PF03761 DUF316:  Domain of unknown function (DUF316) ;  InterPro: IPR005514 This is a family of uncharacterised proteins from Caenorhabditis elegans.
Probab=97.29  E-value=0.018  Score=56.71  Aligned_cols=176  Identities=19%  Similarity=0.247  Sum_probs=99.1

Q ss_pred             CCceEEEEccccccccccCCceeEEEEEeCCCeEEeccccccCCCCCC----C-----CCCc------------eEEEE-
Q 012318          135 CPAVVNLSAPREFLGILSGRGIGSGAIVDADGTILTCAHVVVDFHGSR----A-----LPKG------------KVDVT-  192 (466)
Q Consensus       135 ~~SVV~I~~~~~~~~~~~~~~~GSGfiI~~~G~ILT~aHvv~~~~~~~----~-----~~~~------------~i~V~-  192 (466)
                      .|-.|.+......    ......+|++|+++ ||||+.||+-......    .     -...            .+.+. 
T Consensus        53 ~pW~v~v~~~~~~----~~~~~~~gtlIS~R-HiLtss~~~~~~~~~W~~~~~~~~~~C~~~~~~l~vP~~~l~~~~v~~  127 (282)
T PF03761_consen   53 APWAVSVYTKNHN----EGNYFSTGTLISPR-HILTSSHCVMNDKSKWLNGEEFDNKKCEGNNNHLIVPEEVLSKIDVRC  127 (282)
T ss_pred             CCCEEEEEeccCc----ccceecceEEeccC-eEEEeeeEEEecccccccCcccccceeeCCCceEEeCHHHhccEEEEe
Confidence            4667777765432    11233499999999 9999999996322100    0     0001            12220 


Q ss_pred             ---eCCC-----cEEEEEEEEe--------cCCCCEEEEEeCCC--CCCCccccCCCC-CCCCCCEEEEEecCCCCCCce
Q 012318          193 ---LQDG-----RTFEGTVLNA--------DFHSDIAIVKINSK--TPLPAAKLGTSS-KLCPGDWVVAMGCPHSLQNTV  253 (466)
Q Consensus       193 ---~~~g-----~~~~a~vv~~--------d~~~DlAlLkv~~~--~~~~~~~l~~s~-~~~~G~~V~~iG~p~~~~~~~  253 (466)
                         ...+     +...|.++..        ....+++||+++.+  ....++=|+++. ....++.+.+.|+...  ..+
T Consensus       128 ~~~~~~~~~~~~~v~ka~il~~C~~~~~~~~~~~~~mIlEl~~~~~~~~~~~Cl~~~~~~~~~~~~~~~yg~~~~--~~~  205 (282)
T PF03761_consen  128 CNCFSNGKCFSIKVKKAYILNGCKKIKKNFNRPYSPMILELEEDFSKNVSPPCLADSSTNWEKGDEVDVYGFNST--GKL  205 (282)
T ss_pred             ecccccCCcccceeEEEEEEecCCCcccccccccceEEEEEcccccccCCCEEeCCCccccccCceEEEeecCCC--CeE
Confidence               0111     1222444322        34679999999987  666777776643 3667899999998222  122


Q ss_pred             EEeEEeeeecCccCCCCCCccccEEEEcccCCCCCccceee---cCCCeEEEEEEeEecCC-CeeeEEEeHHHHHH
Q 012318          254 TAGIVSCVDRKSSDLGLGGMRREYLQTDCAINAGNSGGPLV---NIDGEIVGINIMKVAAA-DGLSFAVPIDSAAK  325 (466)
Q Consensus       254 t~G~Vs~~~~~~~~~~~~~~~~~~i~~~~~i~~G~SGGPlv---d~~G~VVGI~~~~~~~~-~g~~~aip~~~i~~  325 (466)
                      ....+.-.....        ....+......+.|++|||++   |....||||.+...... ....+++.+..+.+
T Consensus       206 ~~~~~~i~~~~~--------~~~~~~~~~~~~~~d~Gg~lv~~~~gr~tlIGv~~~~~~~~~~~~~~f~~v~~~~~  273 (282)
T PF03761_consen  206 KHRKLKITNCTK--------CAYSICTKQYSCKGDRGGPLVKNINGRWTLIGVGASGNYECNKNNSYFFNVSWYQD  273 (282)
T ss_pred             EEEEEEEEEeec--------cceeEecccccCCCCccCeEEEEECCCEEEEEEEccCCCcccccccEEEEHHHhhh
Confidence            222222111110        123445566778999999998   33357899987654222 12556677666544


No 55 
>COG3031 PulC Type II secretory pathway, component PulC [Intracellular trafficking and secretion]
Probab=97.29  E-value=0.00067  Score=63.53  Aligned_cols=67  Identities=15%  Similarity=0.193  Sum_probs=56.1

Q ss_pred             cceeecccCCCChhhhCCCCCCCEEEEECCEecCCHHHHHHHHhc-CCCCeEEEEEEECCCeEEEEEEE
Q 012318          391 SGVLVPVVTPGSPAHLAGFLPSDVVIKFDGKPVQSITEIIEIMGD-RVGEPLKVVVQRANDQLVTLTVI  458 (466)
Q Consensus       391 ~g~~V~~V~~~spA~~aGl~~GD~I~~ing~~v~~~~~~~~~l~~-~~g~~v~l~v~R~~g~~~~l~v~  458 (466)
                      .|..+.=..+++.-++.|||.||+.+++|+..+++.+++..+|.. ..-+.++++|+| +|+...+.+.
T Consensus       207 ~Gyr~~pgkd~slF~~sglq~GDIavaiNnldltdp~~m~~llq~l~~m~s~qlTv~R-~G~rhdInV~  274 (275)
T COG3031         207 EGYRFEPGKDGSLFYKSGLQRGDIAVAINNLDLTDPEDMFRLLQMLRNMPSLQLTVIR-RGKRHDINVR  274 (275)
T ss_pred             EEEEecCCCCcchhhhhcCCCcceEEEecCcccCCHHHHHHHHHhhhcCcceEEEEEe-cCccceeeec
Confidence            355555566778889999999999999999999999999888876 455679999999 8998887764


No 56 
>PF05580 Peptidase_S55:  SpoIVB peptidase S55;  InterPro: IPR008763 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This group of serine peptidases belongs to the MEROPS peptidase family S55 (SpoIVB peptidase family, clan PA(S)). The protein SpoIVB plays a key role in signalling in the final sigma-K checkpoint of Bacillus subtilis [, ].
Probab=97.21  E-value=0.0082  Score=55.95  Aligned_cols=161  Identities=17%  Similarity=0.244  Sum_probs=86.7

Q ss_pred             CceeEEEEEeCC-CeEEeccccccCCCCCCCCCCceEEEEeCCCcEEEEEEEEecC----------------CCCEEEEE
Q 012318          154 RGIGSGAIVDAD-GTILTCAHVVVDFHGSRALPKGKVDVTLQDGRTFEGTVLNADF----------------HSDIAIVK  216 (466)
Q Consensus       154 ~~~GSGfiI~~~-G~ILT~aHvv~~~~~~~~~~~~~i~V~~~~g~~~~a~vv~~d~----------------~~DlAlLk  216 (466)
                      .+.||=.+++++ +..--=.|.+.+.+       ....+.+.+|+.|++++....+                ..-+.-+.
T Consensus        19 aGiGTlTf~dp~~~~fgALGH~I~D~d-------t~~~~~i~~G~I~~a~I~~I~kg~~G~PGe~~G~~~~~~~~~G~I~   91 (218)
T PF05580_consen   19 AGIGTLTFYDPETGTFGALGHGISDVD-------TGQLIPIKNGEIYEASITSIKKGKKGQPGEKIGVFDNESNILGTIE   91 (218)
T ss_pred             cCeEEEEEEECCCCcEEecCCeEEcCC-------CCceeEecCCEEEEEEEEEEecCCCcCCceEEEEECCCCceEEEEE
Confidence            477899999974 56666689888753       3445666778888877666532                11122222


Q ss_pred             eCCC---------------CCCCccccCCCCCCCCCCEEEEEecCCCCCCceEEeEEeeeecCccCCCCC----CccccE
Q 012318          217 INSK---------------TPLPAAKLGTSSKLCPGDWVVAMGCPHSLQNTVTAGIVSCVDRKSSDLGLG----GMRREY  277 (466)
Q Consensus       217 v~~~---------------~~~~~~~l~~s~~~~~G~~V~~iG~p~~~~~~~t~G~Vs~~~~~~~~~~~~----~~~~~~  277 (466)
                      -+..               ...++++++...++++|..-+..=.......... -.|..+.+.......+    ....++
T Consensus        92 ~Nt~~GI~G~~~~~~~~~~~~~~~~pva~~~evk~G~A~i~Tv~~G~~ie~f~-ieI~~v~~~~~~~~k~~vi~vtd~~L  170 (218)
T PF05580_consen   92 KNTQFGIYGTLDQDDISNPSYNEPIPVAPKQEVKPGPAYILTVIDGTKIEEFD-IEIEKVLPQSSPSGKGMVIKVTDPRL  170 (218)
T ss_pred             eccccceeEEeccccccccccCceeEEEEHHHceEccEEEEEEEcCCeEEEeE-EEEEEEccCCCCCCCcEEEEECCcch
Confidence            2211               1223344444455666643211101100000111 1112222211100000    001123


Q ss_pred             EEEcccCCCCCccceeecCCCeEEEEEEeEecCCCeeeEEEeHHHH
Q 012318          278 LQTDCAINAGNSGGPLVNIDGEIVGINIMKVAAADGLSFAVPIDSA  323 (466)
Q Consensus       278 i~~~~~i~~G~SGGPlvd~~G~VVGI~~~~~~~~~g~~~aip~~~i  323 (466)
                      +..+..+..|+||+|++ .+|++||=++..+.+....+|.++++..
T Consensus       171 l~~TGGIvqGMSGSPI~-qdGKLiGAVthvf~~dp~~Gygi~ie~M  215 (218)
T PF05580_consen  171 LEKTGGIVQGMSGSPII-QDGKLIGAVTHVFVNDPTKGYGIFIEWM  215 (218)
T ss_pred             hhhhCCEEecccCCCEE-ECCEEEEEEEEEEecCCCceeeecHHHH
Confidence            33445678899999999 7999999999998877889999987643


No 57 
>KOG3580 consensus Tight junction proteins [Signal transduction mechanisms]
Probab=96.68  E-value=0.0031  Score=65.95  Aligned_cols=59  Identities=22%  Similarity=0.414  Sum_probs=47.3

Q ss_pred             CCCcceeecccCCCChhhhCCCCCCCEEEEECCEecCCH---HHHHHHHhcCCCCeEEEEEE
Q 012318          388 NVKSGVLVPVVTPGSPAHLAGFLPSDVVIKFDGKPVQSI---TEIIEIMGDRVGEPLKVVVQ  446 (466)
Q Consensus       388 ~~~~g~~V~~V~~~spA~~aGl~~GD~I~~ing~~v~~~---~~~~~~l~~~~g~~v~l~v~  446 (466)
                      +...|++|..|.++|||++-||++||+|+.||.++..+.   +.+..+|.--+|+.++|.-.
T Consensus       426 GNDVGIFVaGvqegspA~~eGlqEGDQIL~VN~vdF~nl~REeAVlfLL~lPkGEevtilaQ  487 (1027)
T KOG3580|consen  426 GNDVGIFVAGVQEGSPAEQEGLQEGDQILKVNTVDFRNLVREEAVLFLLELPKGEEVTILAQ  487 (1027)
T ss_pred             CCceeEEEeecccCCchhhccccccceeEEeccccchhhhHHHHHHHHhcCCCCcEEeehhh
Confidence            345799999999999999999999999999999998775   23444455577887777544


No 58 
>KOG3532 consensus Predicted protein kinase [General function prediction only]
Probab=96.64  E-value=0.0036  Score=66.46  Aligned_cols=51  Identities=25%  Similarity=0.401  Sum_probs=45.4

Q ss_pred             CCcceeecccCCCChhhhCCCCCCCEEEEECCEecCCHHHHHHHHhcCCCC
Q 012318          389 VKSGVLVPVVTPGSPAHLAGFLPSDVVIKFDGKPVQSITEIIEIMGDRVGE  439 (466)
Q Consensus       389 ~~~g~~V~~V~~~spA~~aGl~~GD~I~~ing~~v~~~~~~~~~l~~~~g~  439 (466)
                      ...-+.|..|.+++||.++.|++||++++|||.+|++.++..+.++.-.|+
T Consensus       396 ~~~~v~v~tv~~ns~a~k~~~~~gdvlvai~~~pi~s~~q~~~~~~s~~~~  446 (1051)
T KOG3532|consen  396 TNRAVKVCTVEDNSLADKAAFKPGDVLVAINNVPIRSERQATRFLQSTTGD  446 (1051)
T ss_pred             CceEEEEEEecCCChhhHhcCCCcceEEEecCccchhHHHHHHHHHhcccc
Confidence            345677899999999999999999999999999999999999988875554


No 59 
>PF10459 Peptidase_S46:  Peptidase S46;  InterPro: IPR019500 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This entry represents S46 peptidases, where dipeptidyl-peptidase 7 (DPP-7) is the best-characterised member of this family. It is a serine peptidase that is located on the cell surface and is predicted to have two N-terminal transmembrane domains. 
Probab=96.55  E-value=0.0097  Score=65.54  Aligned_cols=24  Identities=38%  Similarity=0.440  Sum_probs=21.2

Q ss_pred             ceeEEEEEeCCCeEEeccccccCC
Q 012318          155 GIGSGAIVDADGTILTCAHVVVDF  178 (466)
Q Consensus       155 ~~GSGfiI~~~G~ILT~aHvv~~~  178 (466)
                      +-|||-||+++|+||||.||.-++
T Consensus        47 gGCSgsfVS~~GLvlTNHHC~~~~   70 (698)
T PF10459_consen   47 GGCSGSFVSPDGLVLTNHHCGYGA   70 (698)
T ss_pred             CceeEEEEcCCceEEecchhhhhH
Confidence            349999999999999999999653


No 60 
>PF10459 Peptidase_S46:  Peptidase S46;  InterPro: IPR019500 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This entry represents S46 peptidases, where dipeptidyl-peptidase 7 (DPP-7) is the best-characterised member of this family. It is a serine peptidase that is located on the cell surface and is predicted to have two N-terminal transmembrane domains. 
Probab=96.38  E-value=0.0055  Score=67.42  Aligned_cols=55  Identities=24%  Similarity=0.332  Sum_probs=43.4

Q ss_pred             EEEEcccCCCCCccceeecCCCeEEEEEEeEec----------CCCeeeEEEeHHHHHHHHHHHH
Q 012318          277 YLQTDCAINAGNSGGPLVNIDGEIVGINIMKVA----------AADGLSFAVPIDSAAKIIEQFK  331 (466)
Q Consensus       277 ~i~~~~~i~~G~SGGPlvd~~G~VVGI~~~~~~----------~~~g~~~aip~~~i~~~l~~l~  331 (466)
                      .+..+..+.+||||+|++|.+|+|||+++-+.-          ..-+.+..|-+..|+.+|+++-
T Consensus       623 ~FlstnDitGGNSGSPvlN~~GeLVGl~FDgn~Esl~~D~~fdp~~~R~I~VDiRyvL~~ldkv~  687 (698)
T PF10459_consen  623 NFLSTNDITGGNSGSPVLNAKGELVGLAFDGNWESLSGDIAFDPELNRTIHVDIRYVLWALDKVY  687 (698)
T ss_pred             EEEeccCcCCCCCCCccCCCCceEEEEeecCchhhcccccccccccceeEEEEHHHHHHHHHHHh
Confidence            577888999999999999999999999987642          1223567777777888877654


No 61 
>PF00548 Peptidase_C3:  3C cysteine protease (picornain 3C);  InterPro: IPR000199 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. 
Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad [].  This signature defines cysteine peptidases belonging to MEROPS peptidase family C3 (picornain, clan PA(C)), subfamilies C3A and C3B. The protein fold of this peptidase domain for members of this family resembles that of the serine peptidase, chymotrypsin [], the type example for clan PA. Picornaviral proteins are expressed as a single polyprotein which is cleaved by the viral C3 cysteine protease. The poliovirus polyprotein is selectively cleaved between the Gln-|-Gly bond. In other picornavirus reactions Glu may be substituted for Gln, and Ser or Thr for Gly. ; GO: 0004197 cysteine-type endopeptidase activity, 0006508 proteolysis; PDB: 3SJO_E 2H6M_A 1QA7_C 1HAV_B 2HAL_A 2H9H_A 3QZQ_B 3QZR_A 3R0F_B 3SJ9_A ....
Probab=96.21  E-value=0.049  Score=49.74  Aligned_cols=148  Identities=18%  Similarity=0.239  Sum_probs=82.0

Q ss_pred             CceEEEEccccccccccCCceeEEEEEeCCCeEEeccccccCCCCCCCCCCceEEEEeCCCcEEEE--EEEEecC---CC
Q 012318          136 PAVVNLSAPREFLGILSGRGIGSGAIVDADGTILTCAHVVVDFHGSRALPKGKVDVTLQDGRTFEG--TVLNADF---HS  210 (466)
Q Consensus       136 ~SVV~I~~~~~~~~~~~~~~~GSGfiI~~~G~ILT~aHvv~~~~~~~~~~~~~i~V~~~~g~~~~a--~vv~~d~---~~  210 (466)
                      ..++.|...       .+...++++.|..+ ++|...|.-..         ..+.+   +|+.++.  .+...+.   ..
T Consensus        13 ~N~~~v~~~-------~g~~t~l~~gi~~~-~~lvp~H~~~~---------~~i~i---~g~~~~~~d~~~lv~~~~~~~   72 (172)
T PF00548_consen   13 KNVVPVTTG-------KGEFTMLALGIYDR-YFLVPTHEEPE---------DTIYI---DGVEYKVDDSVVLVDRDGVDT   72 (172)
T ss_dssp             HHEEEEEET-------TEEEEEEEEEEEBT-EEEEEGGGGGC---------SEEEE---TTEEEEEEEEEEEEETTSSEE
T ss_pred             ccEEEEEeC-------CceEEEecceEeee-EEEEECcCCCc---------EEEEE---CCEEEEeeeeEEEecCCCcce
Confidence            456666652       23456888889877 99999992221         34433   3555443  2223443   46


Q ss_pred             CEEEEEeCCCCCCCccccCCCCCC-CCCCEEEEEecCCCCCC-ceEEeEEeeeecCccCCCCCCccccEEEEcccCCCCC
Q 012318          211 DIAIVKINSKTPLPAAKLGTSSKL-CPGDWVVAMGCPHSLQN-TVTAGIVSCVDRKSSDLGLGGMRREYLQTDCAINAGN  288 (466)
Q Consensus       211 DlAlLkv~~~~~~~~~~l~~s~~~-~~G~~V~~iG~p~~~~~-~~t~G~Vs~~~~~~~~~~~~~~~~~~i~~~~~i~~G~  288 (466)
                      ||++++++....++-++---.... ...+...++ +...... ....+.++..... ..  .+......+.+++++..|+
T Consensus        73 Dl~~v~l~~~~kfrDIrk~~~~~~~~~~~~~l~v-~~~~~~~~~~~v~~v~~~~~i-~~--~g~~~~~~~~Y~~~t~~G~  148 (172)
T PF00548_consen   73 DLTLVKLPRNPKFRDIRKFFPESIPEYPECVLLV-NSTKFPRMIVEVGFVTNFGFI-NL--SGTTTPRSLKYKAPTKPGM  148 (172)
T ss_dssp             EEEEEEEESSS-B--GGGGSBSSGGTEEEEEEEE-ESSSSTCEEEEEEEEEEEEEE-EE--TTEEEEEEEEEESEEETTG
T ss_pred             eEEEEEccCCcccCchhhhhccccccCCCcEEEE-ECCCCccEEEEEEEEeecCcc-cc--CCCEeeEEEEEccCCCCCc
Confidence            999999986543332211000112 223333333 3333332 3334444333322 10  1123345788899999999


Q ss_pred             ccceeecC---CCeEEEEEEeE
Q 012318          289 SGGPLVNI---DGEIVGINIMK  307 (466)
Q Consensus       289 SGGPlvd~---~G~VVGI~~~~  307 (466)
                      -||||+..   .++++|||.++
T Consensus       149 CG~~l~~~~~~~~~i~GiHvaG  170 (172)
T PF00548_consen  149 CGSPLVSRIGGQGKIIGIHVAG  170 (172)
T ss_dssp             TTEEEEESCGGTTEEEEEEEEE
T ss_pred             cCCeEEEeeccCccEEEEEecc
Confidence            99999942   48999999885


No 62 
>KOG3209 consensus WW domain-containing protein [General function prediction only]
Probab=95.93  E-value=0.011  Score=63.09  Aligned_cols=52  Identities=19%  Similarity=0.489  Sum_probs=44.7

Q ss_pred             ecccCCCChhhhCC-CCCCCEEEEECCEecCCH--HHHHHHHhcCCCCeEEEEEEE
Q 012318          395 VPVVTPGSPAHLAG-FLPSDVVIKFDGKPVQSI--TEIIEIMGDRVGEPLKVVVQR  447 (466)
Q Consensus       395 V~~V~~~spA~~aG-l~~GD~I~~ing~~v~~~--~~~~~~l~~~~g~~v~l~v~R  447 (466)
                      |..|.++|||++.| |+.||.|++|||+.|.+.  .|+..++++ .|.+|+|+|.-
T Consensus       782 iGrIieGSPAdRCgkLkVGDrilAVNG~sI~~lsHadiv~LIKd-aGlsVtLtIip  836 (984)
T KOG3209|consen  782 IGRIIEGSPADRCGKLKVGDRILAVNGQSILNLSHADIVSLIKD-AGLSVTLTIIP  836 (984)
T ss_pred             ccccccCChhHhhccccccceEEEecCeeeeccCchhHHHHHHh-cCceEEEEEcC
Confidence            78899999999986 999999999999999876  466676665 68889999875


No 63 
>PF02122 Peptidase_S39:  Peptidase S39;  InterPro: IPR000382 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. ORF2 of Potato leafroll virus (PLrV) encodes a polyprotein which is translated following a -1 frameshift. The polyprotein has a putative linear arrangement of membrane anchor-VPg-peptidase-polymerase domains. The serine peptidase domain which is found in this group of sequences belongs to MEROPS peptidase family S39 (clan PA(S)). It is likely that the peptidase domain is involved in the cleavage of the polyprotein []. The nucleotide sequence for the RNA of PLrV has been determined [, ]. The sequence contains six large open reading frames (ORFs). The 5' coding region encodes two polypeptides of 28K and 70K, which overlap in different reading frames; it is suggested that the third ORF in the 5' block is translated by frameshift readthrough near the end of the 70K protein, yielding a 118K polypeptide []. 
Segments of the predicted amino acid sequences of these ORFs resemble those of known viral RNA polymerases, ATP-binding proteins and viral genome-linked proteins. The nucleotide sequence of the genomic RNA of Beet western yellows virus (BWYV) has been determined []. The sequence contains six long ORFs. A cluster of three of these ORFs, including the coat protein cistron, display extensive amino acid sequence similarity to corresponding ORFs of a second luteovirus: Barley yellow dwarf virus [].; GO: 0004252 serine-type endopeptidase activity, 0022415 viral reproductive process, 0016021 integral to membrane; PDB: 1ZYO_A.
Probab=95.78  E-value=0.048  Score=50.96  Aligned_cols=117  Identities=24%  Similarity=0.315  Sum_probs=48.9

Q ss_pred             eEEeccccccCCCCCCCCCCceEEEEeCCCcEEE---EEEEEecCCCCEEEEEeCCC----CCCCccccCCCCCCCCCCE
Q 012318          167 TILTCAHVVVDFHGSRALPKGKVDVTLQDGRTFE---GTVLNADFHSDIAIVKINSK----TPLPAAKLGTSSKLCPGDW  239 (466)
Q Consensus       167 ~ILT~aHvv~~~~~~~~~~~~~i~V~~~~g~~~~---a~vv~~d~~~DlAlLkv~~~----~~~~~~~l~~s~~~~~G~~  239 (466)
                      .++|+.||....        ..+ ..+.+|+.++   -+.+..+...|++||+....    ..++.+.+.....+..|  
T Consensus        43 ~L~ta~Hv~~~~--------~~~-~~~k~g~kipl~~f~~~~~~~~~D~~il~~P~n~~s~Lg~k~~~~~~~~~~~~g--  111 (203)
T PF02122_consen   43 ALLTARHVWSRP--------SKV-TSLKTGEKIPLAEFTDLLESRIADFVILRGPPNWESKLGVKAAQLSQNSQLAKG--  111 (203)
T ss_dssp             EEEE-HHHHTSS--------S----EEETTEEEE--S-EEEEE-TTT-EEEEE--HHHHHHHT-----B----SEEEE--
T ss_pred             ceecccccCCCc--------cce-eEcCCCCcccchhChhhhCCCccCEEEEecCcCHHHHhCcccccccchhhhCCC--
Confidence            899999999873        222 2334454444   24556678999999999842    23344444322211100  


Q ss_pred             EEEEecCCCCCCceEEeEEeeeecCccCCCCCCccccEEEEcccCCCCCccceeecCCCeEEEEEEeE
Q 012318          240 VVAMGCPHSLQNTVTAGIVSCVDRKSSDLGLGGMRREYLQTDCAINAGNSGGPLVNIDGEIVGINIMK  307 (466)
Q Consensus       240 V~~iG~p~~~~~~~t~G~Vs~~~~~~~~~~~~~~~~~~i~~~~~i~~G~SGGPlvd~~G~VVGI~~~~  307 (466)
                        .+..     +....+...+......     .....+..+-+.+.+|.||.|+++.. +++|++...
T Consensus       112 --~~~~-----y~~~~~~~~~~sa~i~-----g~~~~~~~vls~T~~G~SGtp~y~g~-~vvGvH~G~  166 (203)
T PF02122_consen  112 --PVSF-----YGFSSGEWPCSSAKIP-----GTEGKFASVLSNTSPGWSGTPYYSGK-NVVGVHTGS  166 (203)
T ss_dssp             --ESST-----TSEEEEEEEEEE-S---------STTEEEE-----TT-TT-EEE-SS--EEEEEEEE
T ss_pred             --Ceee-----eeecCCCceeccCccc-----cccCcCCceEcCCCCCCCCCCeEECC-CceEeecCc
Confidence              1111     1111211111111110     11133667778999999999999877 999999985


No 64 
>KOG3552 consensus FERM domain protein FRM-8 [General function prediction only]
Probab=95.70  E-value=0.013  Score=64.28  Aligned_cols=56  Identities=23%  Similarity=0.472  Sum_probs=47.0

Q ss_pred             cceeecccCCCChhhhCCCCCCCEEEEECCEecCC--HHHHHHHHhcCCCCeEEEEEEEC
Q 012318          391 SGVLVPVVTPGSPAHLAGFLPSDVVIKFDGKPVQS--ITEIIEIMGDRVGEPLKVVVQRA  448 (466)
Q Consensus       391 ~g~~V~~V~~~spA~~aGl~~GD~I~~ing~~v~~--~~~~~~~l~~~~g~~v~l~v~R~  448 (466)
                      ..++|..|.+|+|+.-. |++||+|++|||.+|++  |+.+.++++. ....|.|+|.++
T Consensus        75 rPviVr~VT~GGps~GK-L~PGDQIl~vN~Epv~daprervIdlvRa-ce~sv~ltV~qP  132 (1298)
T KOG3552|consen   75 RPVIVRFVTEGGPSIGK-LQPGDQILAVNGEPVKDAPRERVIDLVRA-CESSVNLTVCQP  132 (1298)
T ss_pred             CceEEEEecCCCCcccc-ccCCCeEEEecCcccccccHHHHHHHHHH-HhhhcceEEecc
Confidence            68899999999999866 99999999999999976  6778887765 345688888873


No 65 
>KOG3550 consensus Receptor targeting protein Lin-7 [Extracellular structures]
Probab=95.60  E-value=0.033  Score=48.61  Aligned_cols=57  Identities=25%  Similarity=0.469  Sum_probs=43.4

Q ss_pred             CCcceeecccCCCChhhhC-CCCCCCEEEEECCEecCCH--HHHHHHHhcCCCCeEEEEEE
Q 012318          389 VKSGVLVPVVTPGSPAHLA-GFLPSDVVIKFDGKPVQSI--TEIIEIMGDRVGEPLKVVVQ  446 (466)
Q Consensus       389 ~~~g~~V~~V~~~spA~~a-Gl~~GD~I~~ing~~v~~~--~~~~~~l~~~~g~~v~l~v~  446 (466)
                      .+.+++|+.+.|++.|++- ||+.||++++|||..|..-  +...++|+...| .++|.|.
T Consensus       113 qnspiyisriipggvadrhgglkrgdqllsvngvsvege~hekavellkaa~g-svklvvr  172 (207)
T KOG3550|consen  113 QNSPIYISRIIPGGVADRHGGLKRGDQLLSVNGVSVEGEHHEKAVELLKAAVG-SVKLVVR  172 (207)
T ss_pred             cCCceEEEeecCCccccccCcccccceeEeecceeecchhhHHHHHHHHHhcC-cEEEEEe
Confidence            3568999999999999987 7999999999999998754  334455555444 4666654


No 66 
>TIGR02860 spore_IV_B stage IV sporulation protein B. SpoIVB, the stage IV sporulation protein B of endospore-forming bacteria such as Bacillus subtilis, is a serine proteinase, expressed in the spore (rather than mother cell) compartment, that participates in a proteolytic activation cascade for Sigma-K. It appears to be universal among endospore-forming bacteria and occurs nowhere else.
Probab=95.57  E-value=0.18  Score=51.97  Aligned_cols=45  Identities=22%  Similarity=0.386  Sum_probs=36.5

Q ss_pred             EEEcccCCCCCccceeecCCCeEEEEEEeEecCCCeeeEEEeHHHH
Q 012318          278 LQTDCAINAGNSGGPLVNIDGEIVGINIMKVAAADGLSFAVPIDSA  323 (466)
Q Consensus       278 i~~~~~i~~G~SGGPlvd~~G~VVGI~~~~~~~~~g~~~aip~~~i  323 (466)
                      +..+..+..|+||+|++ .+|++||=++-...+.+..+|+|-++..
T Consensus       351 l~~tgGivqGMSGSPi~-q~gkliGAvtHVfvndpt~GYGi~ie~M  395 (402)
T TIGR02860       351 LEKTGGIVQGMSGSPII-QNGKVIGAVTHVFVNDPTSGYGVYIEWM  395 (402)
T ss_pred             hhHhCCEEecccCCCEE-ECCEEEEEEEEEEecCCCcceeehHHHH
Confidence            33345677899999999 7999999998888888889999965543


No 67 
>PF00949 Peptidase_S7:  Peptidase S7, Flavivirus NS3 serine protease ;  InterPro: IPR001850 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. 
They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This signature identifies serine peptidases belonging to MEROPS peptidase family S7 (flavivirin family, clan PA(S)). The protein fold of the peptidase domain for members of this family resembles that of chymotrypsin, the type example for clan PA.  Flaviviruses produce a polyprotein from the ssRNA genome. The N terminus of the NS3 protein (approx. 180 aa) is required for the processing of the polyprotein. NS3 also has conserved homology with NTP-binding proteins and the DEAD family of RNA helicases [, , ].; GO: 0003723 RNA binding, 0003724 RNA helicase activity, 0005524 ATP binding; PDB: 2IJO_B 3E90_D 2GGV_B 2FP7_B 2WV9_A 3U1I_B 3U1J_B 2WZQ_A 2WHX_A 3L6P_A ....
Probab=95.34  E-value=0.036  Score=48.13  Aligned_cols=33  Identities=36%  Similarity=0.506  Sum_probs=23.4

Q ss_pred             EEEcccCCCCCccceeecCCCeEEEEEEeEecC
Q 012318          278 LQTDCAINAGNSGGPLVNIDGEIVGINIMKVAA  310 (466)
Q Consensus       278 i~~~~~i~~G~SGGPlvd~~G~VVGI~~~~~~~  310 (466)
                      ...+..+.+|.||+|+||.+|++|||...+..-
T Consensus        88 ~~~~~d~~~GsSGSpi~n~~g~ivGlYg~g~~~  120 (132)
T PF00949_consen   88 GAIDLDFPKGSSGSPIFNQNGEIVGLYGNGVEV  120 (132)
T ss_dssp             EEE---S-TTGTT-EEEETTSCEEEEEEEEEE-
T ss_pred             EeeecccCCCCCCCceEcCCCcEEEEEccceee
Confidence            344555779999999999999999999887653


No 68 
>COG0750 Predicted membrane-associated Zn-dependent proteases 1 [Cell envelope biogenesis, outer membrane]
Probab=95.32  E-value=0.064  Score=55.01  Aligned_cols=56  Identities=30%  Similarity=0.520  Sum_probs=47.6

Q ss_pred             ccCCCChhhhCCCCCCCEEEEECCEecCCHHHHHHHHhcCCCCe---EEEEEEECCCeE
Q 012318          397 VVTPGSPAHLAGFLPSDVVIKFDGKPVQSITEIIEIMGDRVGEP---LKVVVQRANDQL  452 (466)
Q Consensus       397 ~V~~~spA~~aGl~~GD~I~~ing~~v~~~~~~~~~l~~~~g~~---v~l~v~R~~g~~  452 (466)
                      .+..+|+|..+|+++||.|+++|++++.+|+++.+.+....+..   +.+.+.|.++..
T Consensus       135 ~v~~~s~a~~a~l~~Gd~iv~~~~~~i~~~~~~~~~~~~~~~~~~~~~~i~~~~~~~~~  193 (375)
T COG0750         135 EVAPKSAAALAGLRPGDRIVAVDGEKVASWDDVRRLLVAAAGDVFNLLTILVIRLDGEA  193 (375)
T ss_pred             ecCCCCHHHHcCCCCCCEEEeECCEEccCHHHHHHHHHhccCCcccceEEEEEecccee
Confidence            78899999999999999999999999999999988877655655   788888833443


No 69 
>PF09342 DUF1986:  Domain of unknown function (DUF1986);  InterPro: IPR015420 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This domain is found in serine endopeptidases belonging to MEROPS peptidase family S1A (clan PA). It is found in unusual mosaic proteins, which are encoded by the Drosophila nudel gene (see P98159 from SWISSPROT). Nudel is involved in defining embryonic dorsoventral polarity. Three proteases, ndl, gd and snk, process easter to create active easter. Active easter defines cell identities along the dorsal-ventral continuum by activating the spz ligand for the Tl receptor in the ventral region of the embryo. Nudel, pipe and windbeutel together trigger the protease cascade within the extraembryonic perivitelline compartment which induces dorsoventral polarity of the Drosophila embryo [].
Probab=95.27  E-value=0.56  Score=44.70  Aligned_cols=99  Identities=20%  Similarity=0.283  Sum_probs=67.4

Q ss_pred             CceEEEEccccccccccCCceeEEEEEeCCCeEEeccccccCCCCCCCCCCceEEEEeCCCcEEE------EEEEEec--
Q 012318          136 PAVVNLSAPREFLGILSGRGIGSGAIVDADGTILTCAHVVVDFHGSRALPKGKVDVTLQDGRTFE------GTVLNAD--  207 (466)
Q Consensus       136 ~SVV~I~~~~~~~~~~~~~~~GSGfiI~~~G~ILT~aHvv~~~~~~~~~~~~~i~V~~~~g~~~~------a~vv~~d--  207 (466)
                      |-.+.|..        .|...|||++|+++ |||++-.|+.+..    .....+.+.+..++.+.      -++...|  
T Consensus        17 PWlA~IYv--------dG~~~CsgvLlD~~-WlLvsssCl~~I~----L~~~YvsallG~~Kt~~~v~Gp~EQI~rVD~~   83 (267)
T PF09342_consen   17 PWLADIYV--------DGRYWCSGVLLDPH-WLLVSSSCLRGIS----LSHHYVSALLGGGKTYLSVDGPHEQISRVDCF   83 (267)
T ss_pred             cceeeEEE--------cCeEEEEEEEeccc-eEEEeccccCCcc----cccceEEEEecCcceecccCCChheEEEeeee
Confidence            66666665        45578999999998 9999999998743    11256777777776543      1233333  


Q ss_pred             ---CCCCEEEEEeCCCC----CCCccccCC-CCCCCCCCEEEEEecCC
Q 012318          208 ---FHSDIAIVKINSKT----PLPAAKLGT-SSKLCPGDWVVAMGCPH  247 (466)
Q Consensus       208 ---~~~DlAlLkv~~~~----~~~~~~l~~-s~~~~~G~~V~~iG~p~  247 (466)
                         ++.+++||.++.+.    .+.|+-+.+ .......+.++++|...
T Consensus        84 ~~V~~S~v~LLHL~~~~~fTr~VlP~flp~~~~~~~~~~~CVAVg~d~  131 (267)
T PF09342_consen   84 KDVPESNVLLLHLEQPANFTRYVLPTFLPETSNENESDDECVAVGHDD  131 (267)
T ss_pred             eeccccceeeeeecCcccceeeecccccccccCCCCCCCceEEEEccc
Confidence               68899999999763    234444433 23444556899999876


No 70 
>PF00944 Peptidase_S3:  Alphavirus core protein ;  InterPro: IPR000930 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. Togavirin, also known as Sindbis virus core endopeptidase, is a serine protease resident at the N terminus of the p130 polyprotein of togaviruses []. The endopeptidase signature identifies the peptidase as belonging to the MEROPS peptidase family S3 (togavirin family, clan PA(S)). The polyprotein also includes structural proteins for the nucleocapsid core and for the glycoprotein spikes []. Togavirin is only active while part of the polyprotein: cleavage at a Trp-Ser bond results in total lack of activity []. Mutagenesis studies have identified the location of the His-Asp-Ser catalytic triad, and X-ray studies have revealed the protein fold to be similar to that of chymotrypsin [, ].; GO: 0004252 serine-type endopeptidase activity, 0006508 proteolysis, 0016020 membrane; PDB: 2YEW_D 1EP5_A 3J0C_F 1EP6_C 1WYK_D 1DYL_A 1VCQ_B 1VCP_B 1LD4_D 1KXA_A ....
Probab=94.73  E-value=0.086  Score=45.26  Aligned_cols=39  Identities=18%  Similarity=0.308  Sum_probs=30.5

Q ss_pred             EEcccCCCCCccceeecCCCeEEEEEEeEecCCCeeeEE
Q 012318          279 QTDCAINAGNSGGPLVNIDGEIVGINIMKVAAADGLSFA  317 (466)
Q Consensus       279 ~~~~~i~~G~SGGPlvd~~G~VVGI~~~~~~~~~g~~~a  317 (466)
                      .-...-.+|+||-|++|-.|+||||+..+..+.......
T Consensus        98 ip~g~g~~GDSGRpi~DNsGrVVaIVLGG~neG~RTaLS  136 (158)
T PF00944_consen   98 IPTGVGKPGDSGRPIFDNSGRVVAIVLGGANEGRRTALS  136 (158)
T ss_dssp             EETTS-STTSTTEEEESTTSBEEEEEEEEEEETTEEEEE
T ss_pred             eccCCCCCCCCCCccCcCCCCEEEEEecCCCCCCceEEE
Confidence            335566899999999999999999999988766555443


No 71 
>KOG3580 consensus Tight junction proteins [Signal transduction mechanisms]
Probab=94.40  E-value=0.088  Score=55.50  Aligned_cols=73  Identities=21%  Similarity=0.428  Sum_probs=54.1

Q ss_pred             ccCCCCCCCCcceeecccCCCChhhhCCCCCCCEEEEECCEecCCHHHHHH--HHhcCCCCeEEEEEEECCCeEEEEEE
Q 012318          381 ERDPSFPNVKSGVLVPVVTPGSPAHLAGFLPSDVVIKFDGKPVQSITEIIE--IMGDRVGEPLKVVVQRANDQLVTLTV  457 (466)
Q Consensus       381 ~~~~~~~~~~~g~~V~~V~~~spA~~aGl~~GD~I~~ing~~v~~~~~~~~--~l~~~~g~~v~l~v~R~~g~~~~l~v  457 (466)
                      ..++.|.+-...++|..|.+++||+-. ||.||.|+-|||....+......  +|+ +.|+...|+|.|.  +++++..
T Consensus        30 RDnPhf~~getSiViSDVlpGGPAeG~-LQenDrvvMVNGvsMenv~haFAvQqLr-ksgK~A~ItvkRp--rkvqvpa  104 (1027)
T KOG3580|consen   30 RDNPHFENGETSIVISDVLPGGPAEGL-LQENDRVVMVNGVSMENVLHAFAVQQLR-KSGKVAAITVKRP--RKVQVPA  104 (1027)
T ss_pred             CCCCCccCCceeEEEeeccCCCCcccc-cccCCeEEEEcCcchhhhHHHHHHHHHH-hhccceeEEeccc--ceeeccc
Confidence            444566666678999999999999976 99999999999999877654332  333 4677788999883  3444433


No 72 
>KOG3542 consensus cAMP-regulated guanine nucleotide exchange factor [Signal transduction mechanisms]
Probab=94.14  E-value=0.041  Score=58.68  Aligned_cols=57  Identities=26%  Similarity=0.447  Sum_probs=43.9

Q ss_pred             CCcceeecccCCCChhhhCCCCCCCEEEEECCEecCCHHH--HHHHHhcCCCCeEEEEEEE
Q 012318          389 VKSGVLVPVVTPGSPAHLAGFLPSDVVIKFDGKPVQSITE--IIEIMGDRVGEPLKVVVQR  447 (466)
Q Consensus       389 ~~~g~~V~~V~~~spA~~aGl~~GD~I~~ing~~v~~~~~--~~~~l~~~~g~~v~l~v~R  447 (466)
                      ...|++|.+|.|+|.|++.|++.||.|++|||+...+..-  ..++|..  +..++|+|.-
T Consensus       560 kGfgifV~~V~pgskAa~~GlKRgDqilEVNgQnfenis~~KA~eiLrn--nthLtltvKt  618 (1283)
T KOG3542|consen  560 KGFGIFVAEVFPGSKAAREGLKRGDQILEVNGQNFENISAKKAEEILRN--NTHLTLTVKT  618 (1283)
T ss_pred             ccceeEEeeecCCchHHHhhhhhhhhhhhccccchhhhhHHHHHHHhcC--CceEEEEEec
Confidence            4569999999999999999999999999999999877643  2334433  3456666653


No 73 
>KOG3571 consensus Dishevelled 3 and related proteins [General function prediction only]
Probab=92.98  E-value=0.15  Score=52.96  Aligned_cols=75  Identities=15%  Similarity=0.421  Sum_probs=50.9

Q ss_pred             ecccccceeecCCHHHHHHhhccCCCCCCCCcceeecccCCCChhhh-CCCCCCCEEEEECCEecCCH--HHHHHHHhc-
Q 012318          360 VRPWLGLKMLDLNDMIIAQLKERDPSFPNVKSGVLVPVVTPGSPAHL-AGFLPSDVVIKFDGKPVQSI--TEIIEIMGD-  435 (466)
Q Consensus       360 ~~~~lG~~~~~~~~~~~~~~~~~~~~~~~~~~g~~V~~V~~~spA~~-aGl~~GD~I~~ing~~v~~~--~~~~~~l~~-  435 (466)
                      ..+|||+.....+.       .      ....|++|.+|++++.-+. .-|.+||.|+.||.....++  ++....|.+ 
T Consensus       259 ~vnfLGiSivgqsn-------~------rgDggIYVgsImkgGAVA~DGRIe~GDMiLQVNevsFENmSNd~AVrvLREa  325 (626)
T KOG3571|consen  259 TVNFLGISIVGQSN-------A------RGDGGIYVGSIMKGGAVALDGRIEPGDMILQVNEVSFENMSNDQAVRVLREA  325 (626)
T ss_pred             ccccceeEeecccC-------c------CCCCceEEeeeccCceeeccCccCccceEEEeeecchhhcCchHHHHHHHHH
Confidence            46778887654221       0      2357999999999875444 45999999999999988765  455555554 


Q ss_pred             -CCCCeEEEEEEE
Q 012318          436 -RVGEPLKVVVQR  447 (466)
Q Consensus       436 -~~g~~v~l~v~R  447 (466)
                       ....+++|+|-.
T Consensus       326 V~~~gPi~ltvAk  338 (626)
T KOG3571|consen  326 VSRPGPIKLTVAK  338 (626)
T ss_pred             hccCCCeEEEEee
Confidence             222347888765


No 74 
>KOG3209 consensus WW domain-containing protein [General function prediction only]
Probab=92.43  E-value=0.18  Score=54.30  Aligned_cols=56  Identities=20%  Similarity=0.369  Sum_probs=43.3

Q ss_pred             CcceeecccCCCChhhhCC-CCCCCEEEEECCEecCCHHH--HHHHHhcCCCCeEEEEEEE
Q 012318          390 KSGVLVPVVTPGSPAHLAG-FLPSDVVIKFDGKPVQSITE--IIEIMGDRVGEPLKVVVQR  447 (466)
Q Consensus       390 ~~g~~V~~V~~~spA~~aG-l~~GD~I~~ing~~v~~~~~--~~~~l~~~~g~~v~l~v~R  447 (466)
                      ..+++|..+.+++||.+-| ++.||+|++|||+..+.+..  ..+++++  |....+.++|
T Consensus       922 nM~LfVLRlAeDGPA~rdGrm~VGDqi~eINGesTkgmtH~rAIelIk~--gg~~vll~Lr  980 (984)
T KOG3209|consen  922 NMDLFVLRLAEDGPAIRDGRMRVGDQITEINGESTKGMTHDRAIELIKQ--GGRRVLLLLR  980 (984)
T ss_pred             ccceEEEEeccCCCccccCceeecceEEEecCcccCCCcHHHHHHHHHh--CCeEEEEEec
Confidence            4578999999999999887 99999999999999988754  3444443  4444555555


No 75 
>KOG3605 consensus Beta amyloid precursor-binding protein [General function prediction only]
Probab=92.07  E-value=0.32  Score=52.06  Aligned_cols=123  Identities=26%  Similarity=0.403  Sum_probs=73.3

Q ss_pred             CCCCCccceee-----cCCCeEEEEEEeEecCCCeeeEEEeHHHHHHHHHHHHHcCCccccccccccccccceeeeeeee
Q 012318          284 INAGNSGGPLV-----NIDGEIVGINIMKVAAADGLSFAVPIDSAAKIIEQFKKNGWMHVEQKVPLLWSTCKQVVILCRR  358 (466)
Q Consensus       284 i~~G~SGGPlv-----d~~G~VVGI~~~~~~~~~g~~~aip~~~i~~~l~~l~~~g~~~~~~~~~~~~~~~~~~~~~~~~  358 (466)
                      +-.-++|||.-     |...+++.|+-..       =..+|.+....+++.+|+.-      .+.+..-.|.-|+.-   
T Consensus       677 iAnmm~~GpAarsgkLnIGDQiiaING~S-------LVGLPLstcQs~Ik~~KnQT------~VkltiV~cpPV~~V---  740 (829)
T KOG3605|consen  677 IANMMHGGPAARSGKLNIGDQIMSINGTS-------LVGLPLSTCQSIIKGLKNQT------AVKLNIVSCPPVTTV---  740 (829)
T ss_pred             HHhcccCChhhhcCCccccceeEeecCce-------eccccHHHHHHHHhcccccc------eEEEEEecCCCceEE---
Confidence            33456777775     3334666665322       23489999999999888765      111111122222211   


Q ss_pred             eecccccceeecCCHHHHHHhhccCCCCCCCCcceeecccCCCChhhhCCCCCCCEEEEECCEecCC--HHHHHHHHhcC
Q 012318          359 VVRPWLGLKMLDLNDMIIAQLKERDPSFPNVKSGVLVPVVTPGSPAHLAGFLPSDVVIKFDGKPVQS--ITEIIEIMGDR  436 (466)
Q Consensus       359 ~~~~~lG~~~~~~~~~~~~~~~~~~~~~~~~~~g~~V~~V~~~spA~~aGl~~GD~I~~ing~~v~~--~~~~~~~l~~~  436 (466)
                                +-.-++.+.+|+.      .+.+|+ |-+..-++-|++.|++.|-.|++|||+.|.-  -+.+..+|...
T Consensus       741 ----------~I~RPd~kyQLGF------SVQNGi-ICSLlRGGIAERGGVRVGHRIIEINgQSVVA~pHekIV~lLs~a  803 (829)
T KOG3605|consen  741 ----------LIRRPDLRYQLGF------SVQNGI-ICSLLRGGIAERGGVRVGHRIIEINGQSVVATPHEKIVQLLSNA  803 (829)
T ss_pred             ----------Eeecccchhhccc------eeeCcE-eehhhcccchhccCceeeeeEEEECCceEEeccHHHHHHHHHHh
Confidence                      1112233333321      334675 6678899999999999999999999998743  24455666555


Q ss_pred             CCC
Q 012318          437 VGE  439 (466)
Q Consensus       437 ~g~  439 (466)
                      .|+
T Consensus       804 VGE  806 (829)
T KOG3605|consen  804 VGE  806 (829)
T ss_pred             hhh
Confidence            553


No 76 
>KOG2921 consensus Intramembrane metalloprotease (sterol-regulatory element-binding protein (SREBP) protease) [Posttranslational modification, protein turnover, chaperones]
Probab=92.02  E-value=0.17  Score=51.06  Aligned_cols=51  Identities=29%  Similarity=0.455  Sum_probs=42.8

Q ss_pred             CCCCCCCcceeecccCCCChhhhC-CCCCCCEEEEECCEecCCHHHHHHHHh
Q 012318          384 PSFPNVKSGVLVPVVTPGSPAHLA-GFLPSDVVIKFDGKPVQSITEIIEIMG  434 (466)
Q Consensus       384 ~~~~~~~~g~~V~~V~~~spA~~a-Gl~~GD~I~~ing~~v~~~~~~~~~l~  434 (466)
                      ..|+....|+.|++|...||+..- ||++||+|+++||.+|++.+|..+-+.
T Consensus       213 sPfya~g~gV~Vtev~~~Spl~gprGL~vgdvitsldgcpV~~v~dW~ecl~  264 (484)
T KOG2921|consen  213 SPFYAHGEGVTVTEVPSVSPLFGPRGLSVGDVITSLDGCPVHKVSDWLECLA  264 (484)
T ss_pred             chhhhcCceEEEEeccccCCCcCcccCCccceEEecCCcccCCHHHHHHHHH
Confidence            355667889999999999998543 899999999999999999888766554


No 77 
>KOG3834 consensus Golgi reassembly stacking protein GRASP65, contains PDZ domain [Intracellular trafficking, secretion, and vesicular transport]
Probab=91.91  E-value=0.29  Score=49.99  Aligned_cols=71  Identities=21%  Similarity=0.379  Sum_probs=52.9

Q ss_pred             CCcceeecccCCCChhhhCCCCCC-CEEEEECCEecCCHHHH-HHHHhcCCCCeEEEEEEECCC-eEEEEEEEec
Q 012318          389 VKSGVLVPVVTPGSPAHLAGFLPS-DVVIKFDGKPVQSITEI-IEIMGDRVGEPLKVVVQRAND-QLVTLTVIPE  460 (466)
Q Consensus       389 ~~~g~~V~~V~~~spA~~aGl~~G-D~I~~ing~~v~~~~~~-~~~l~~~~g~~v~l~v~R~~g-~~~~l~v~~~  460 (466)
                      ...|.-|.+|.++|+|.++||.+= |-|++|||..+..-.|. ...|+.+..+ ++++|..... ..+.++|++.
T Consensus        13 gteg~hvlkVqedSpa~~aglepffdFIvSI~g~rL~~dnd~Lk~llk~~sek-Vkltv~n~kt~~~R~v~I~ps   86 (462)
T KOG3834|consen   13 GTEGYHVLKVQEDSPAHKAGLEPFFDFIVSINGIRLNKDNDTLKALLKANSEK-VKLTVYNSKTQEVRIVEIVPS   86 (462)
T ss_pred             CceeEEEEEeecCChHHhcCcchhhhhhheeCcccccCchHHHHHHHHhcccc-eEEEEEecccceeEEEEeccc
Confidence            346888999999999999999997 89999999999865554 4455555444 9999986333 3455666654


No 78 
>KOG3651 consensus Protein kinase C, alpha binding protein [Signal transduction mechanisms]
Probab=90.64  E-value=0.5  Score=46.05  Aligned_cols=55  Identities=16%  Similarity=0.273  Sum_probs=43.1

Q ss_pred             cceeecccCCCChhhhCC-CCCCCEEEEECCEecCCH--HHHHHHHhcCCCCeEEEEEE
Q 012318          391 SGVLVPVVTPGSPAHLAG-FLPSDVVIKFDGKPVQSI--TEIIEIMGDRVGEPLKVVVQ  446 (466)
Q Consensus       391 ~g~~V~~V~~~spA~~aG-l~~GD~I~~ing~~v~~~--~~~~~~l~~~~g~~v~l~v~  446 (466)
                      .=++|..|..++||++-| ++.||.|++|||..|+.-  .++.+++....+ .+++.+.
T Consensus        30 PClYiVQvFD~tPAa~dG~i~~GDEi~avNg~svKGktKveVAkmIQ~~~~-eV~IhyN   87 (429)
T KOG3651|consen   30 PCLYIVQVFDKTPAAKDGRIRCGDEIVAVNGISVKGKTKVEVAKMIQVSLN-EVKIHYN   87 (429)
T ss_pred             CeEEEEEeccCCchhccCccccCCeeEEecceeecCccHHHHHHHHHHhcc-ceEEEeh
Confidence            357888999999999887 999999999999999754  566777766443 3666664


No 79 
>KOG3606 consensus Cell polarity protein PAR6 [Signal transduction mechanisms]
Probab=90.03  E-value=0.65  Score=44.64  Aligned_cols=57  Identities=21%  Similarity=0.454  Sum_probs=45.5

Q ss_pred             CcceeecccCCCChhhhCC-CCCCCEEEEECCEec--CCHHHHHHHHhcCCCCeEEEEEEE
Q 012318          390 KSGVLVPVVTPGSPAHLAG-FLPSDVVIKFDGKPV--QSITEIIEIMGDRVGEPLKVVVQR  447 (466)
Q Consensus       390 ~~g~~V~~V~~~spA~~aG-l~~GD~I~~ing~~v--~~~~~~~~~l~~~~g~~v~l~v~R  447 (466)
                      ..|+.|....+++.|+..| |...|.|++|||.+|  ++.+++.+++-.+. ..+.++|+-
T Consensus       193 vpGIFISRlVpGGLAeSTGLLaVnDEVlEVNGIEVaGKTLDQVTDMMvANs-hNLIiTVkP  252 (358)
T KOG3606|consen  193 VPGIFISRLVPGGLAESTGLLAVNDEVLEVNGIEVAGKTLDQVTDMMVANS-HNLIITVKP  252 (358)
T ss_pred             cCceEEEeecCCccccccceeeecceeEEEcCEEeccccHHHHHHHHhhcc-cceEEEecc
Confidence            4699999999999999999 567899999999998  56788988776532 235666654


No 80 
>KOG3834 consensus Golgi reassembly stacking protein GRASP65, contains PDZ domain [Intracellular trafficking, secretion, and vesicular transport]
Probab=89.72  E-value=0.57  Score=47.94  Aligned_cols=67  Identities=28%  Similarity=0.503  Sum_probs=54.0

Q ss_pred             ecccCCCChhhhCCCCC-CCEEEEECCEecCCHHHHHHHHhcCCCCeEEEEEEECC-CeEEEEEEEecC
Q 012318          395 VPVVTPGSPAHLAGFLP-SDVVIKFDGKPVQSITEIIEIMGDRVGEPLKVVVQRAN-DQLVTLTVIPEE  461 (466)
Q Consensus       395 V~~V~~~spA~~aGl~~-GD~I~~ing~~v~~~~~~~~~l~~~~g~~v~l~v~R~~-g~~~~l~v~~~~  461 (466)
                      |-+|.++|||+.|||++ +|.|+.+-+...+..+|+...+..+.++.+++-|+.-+ ...+++++++..
T Consensus       113 vl~V~p~SPaalAgl~~~~DYivG~~~~~~~~~eDl~~lIeshe~kpLklyVYN~D~d~~ReVti~pn~  181 (462)
T KOG3834|consen  113 VLSVEPNSPAALAGLRPYTDYIVGIWDAVMHEEEDLFTLIESHEGKPLKLYVYNHDTDSCREVTITPNS  181 (462)
T ss_pred             eeecCCCCHHHhcccccccceEecchhhhccchHHHHHHHHhccCCCcceeEeecCCCccceEEeeccc
Confidence            67899999999999994 59999995556677889998888888899999888522 345778888754


No 81 
>KOG3549 consensus Syntrophins (type gamma) [Extracellular structures]
Probab=88.67  E-value=0.6  Score=46.40  Aligned_cols=56  Identities=18%  Similarity=0.454  Sum_probs=47.3

Q ss_pred             CcceeecccCCCChhhhCC-CCCCCEEEEECCEecCCH--HHHHHHHhcCCCCeEEEEEE
Q 012318          390 KSGVLVPVVTPGSPAHLAG-FLPSDVVIKFDGKPVQSI--TEIIEIMGDRVGEPLKVVVQ  446 (466)
Q Consensus       390 ~~g~~V~~V~~~spA~~aG-l~~GD~I~~ing~~v~~~--~~~~~~l~~~~g~~v~l~v~  446 (466)
                      .-+++|+.+.++-.|+..| |-.||-|+.|||..|+..  +++..+|+. .|+.++|+|.
T Consensus        79 n~PvviSkI~kdQaAd~tG~LFvGDAilqvNGi~v~~c~HeevV~iLRN-AGdeVtlTV~  137 (505)
T KOG3549|consen   79 NLPVVISKIYKDQAADITGQLFVGDAILQVNGIYVTACPHEEVVNILRN-AGDEVTLTVK  137 (505)
T ss_pred             CccEEeehhhhhhhhhhcCceEeeeeeEEeccEEeecCChHHHHHHHHh-cCCEEEEEeH
Confidence            4589999999999999988 778999999999998764  677777764 7888888886


No 82 
>KOG3551 consensus Syntrophins (type beta) [Extracellular structures]
Probab=88.12  E-value=0.6  Score=47.07  Aligned_cols=58  Identities=17%  Similarity=0.367  Sum_probs=44.4

Q ss_pred             CCcceeecccCCCChhhhCC-CCCCCEEEEECCEecCCH--HHHHHHHhcCCCCeEEE--EEEE
Q 012318          389 VKSGVLVPVVTPGSPAHLAG-FLPSDVVIKFDGKPVQSI--TEIIEIMGDRVGEPLKV--VVQR  447 (466)
Q Consensus       389 ~~~g~~V~~V~~~spA~~aG-l~~GD~I~~ing~~v~~~--~~~~~~l~~~~g~~v~l--~v~R  447 (466)
                      ....++|+.+.++-.|++.+ |..||.|++|||....+.  ++..++|+. .|+.|.+  ++.|
T Consensus       108 NkMPIlISKIFkGlAADQt~aL~~gDaIlSVNG~dL~~AtHdeAVqaLKr-aGkeV~levKy~R  170 (506)
T KOG3551|consen  108 NKMPILISKIFKGLAADQTGALFLGDAILSVNGEDLRDATHDEAVQALKR-AGKEVLLEVKYMR  170 (506)
T ss_pred             cCCceehhHhccccccccccceeeccEEEEecchhhhhcchHHHHHHHHh-hCceeeeeeeeeh
Confidence            45689999999999999986 999999999999987654  455556654 5666555  4456


No 83 
>KOG0609 consensus Calcium/calmodulin-dependent serine protein kinase/membrane-associated guanylate kinase [Signal transduction mechanisms]
Probab=87.45  E-value=0.97  Score=47.63  Aligned_cols=55  Identities=20%  Similarity=0.349  Sum_probs=46.2

Q ss_pred             ceeecccCCCChhhhCC-CCCCCEEEEECCEecCC--HHHHHHHHhcCCCCeEEEEEEE
Q 012318          392 GVLVPVVTPGSPAHLAG-FLPSDVVIKFDGKPVQS--ITEIIEIMGDRVGEPLKVVVQR  447 (466)
Q Consensus       392 g~~V~~V~~~spA~~aG-l~~GD~I~~ing~~v~~--~~~~~~~l~~~~g~~v~l~v~R  447 (466)
                      -++|..+..|+-+.+.| |+.||.|.+|||..+.+  ..++.++|....| .+++.+.-
T Consensus       147 ~~~vARI~~GG~~~r~glL~~GD~i~EvNGi~v~~~~~~e~q~~l~~~~G-~itfkiiP  204 (542)
T KOG0609|consen  147 KVVVARIMHGGMADRQGLLHVGDEILEVNGISVANKSPEELQELLRNSRG-SITFKIIP  204 (542)
T ss_pred             ccEEeeeccCCcchhccceeeccchheecCeecccCCHHHHHHHHHhCCC-cEEEEEcc
Confidence            58899999999999987 89999999999999976  5789998887554 47777764


No 84 
>KOG0606 consensus Microtubule-associated serine/threonine kinase and related proteins [Signal transduction mechanisms; General function prediction only]
Probab=86.50  E-value=0.93  Score=51.68  Aligned_cols=52  Identities=31%  Similarity=0.544  Sum_probs=39.6

Q ss_pred             eeecccCCCChhhhCCCCCCCEEEEECCEecCCH--HHHHHHHhcCCCCeEEEEE
Q 012318          393 VLVPVVTPGSPAHLAGFLPSDVVIKFDGKPVQSI--TEIIEIMGDRVGEPLKVVV  445 (466)
Q Consensus       393 ~~V~~V~~~spA~~aGl~~GD~I~~ing~~v~~~--~~~~~~l~~~~g~~v~l~v  445 (466)
                      =+|..|.++|||..+|+++||.|+.+||++|...  .++.+.|.. .|..+.+.+
T Consensus       660 h~v~sv~egsPA~~agls~~DlIthvnge~v~gl~H~ev~~Lll~-~gn~v~~~t  713 (1205)
T KOG0606|consen  660 HSVGSVEEGSPAFEAGLSAGDLITHVNGEPVHGLVHTEVMELLLK-SGNKVTLRT  713 (1205)
T ss_pred             eeeeeecCCCCccccCCCccceeEeccCcccchhhHHHHHHHHHh-cCCeeEEEe
Confidence            3577899999999999999999999999999765  355555443 355555543


No 85 
>KOG1892 consensus Actin filament-binding protein Afadin [Cytoskeleton]
Probab=86.37  E-value=0.85  Score=50.95  Aligned_cols=57  Identities=25%  Similarity=0.436  Sum_probs=45.5

Q ss_pred             CcceeecccCCCChhhhCC-CCCCCEEEEECCEecCCHHH--HHHHHhcCCCCeEEEEEEE
Q 012318          390 KSGVLVPVVTPGSPAHLAG-FLPSDVVIKFDGKPVQSITE--IIEIMGDRVGEPLKVVVQR  447 (466)
Q Consensus       390 ~~g~~V~~V~~~spA~~aG-l~~GD~I~~ing~~v~~~~~--~~~~l~~~~g~~v~l~v~R  447 (466)
                      .-|++|.+|.+|++|+.-| |+.||++++|||+..-.+.+  ..+ |....|..|.+.|..
T Consensus       959 klGIYvKsVV~GgaAd~DGRL~aGDQLLsVdG~SLiGisQErAA~-lmtrtg~vV~leVaK 1018 (1629)
T KOG1892|consen  959 KLGIYVKSVVEGGAADHDGRLEAGDQLLSVDGHSLIGISQERAAR-LMTRTGNVVHLEVAK 1018 (1629)
T ss_pred             ccceEEEEeccCCccccccccccCceeeeecCcccccccHHHHHH-HHhccCCeEEEehhh
Confidence            4599999999999998776 99999999999998766533  333 344578888898875


No 86 
>KOG3605 consensus Beta amyloid precursor-binding protein [General function prediction only]
Probab=83.04  E-value=2.3  Score=45.81  Aligned_cols=55  Identities=16%  Similarity=0.349  Sum_probs=40.1

Q ss_pred             eeecccCCCChhhhCC-CCCCCEEEEECCEecCCH--HHHHHHHhc-CCCCeEEEEEEE
Q 012318          393 VLVPVVTPGSPAHLAG-FLPSDVVIKFDGKPVQSI--TEIIEIMGD-RVGEPLKVVVQR  447 (466)
Q Consensus       393 ~~V~~V~~~spA~~aG-l~~GD~I~~ing~~v~~~--~~~~~~l~~-~~g~~v~l~v~R  447 (466)
                      ++|.....++||++.| |--||+|++|||......  ...+.+++. +.-..|+++|.+
T Consensus       675 VViAnmm~~GpAarsgkLnIGDQiiaING~SLVGLPLstcQs~Ik~~KnQT~VkltiV~  733 (829)
T KOG3605|consen  675 VVIANMMHGGPAARSGKLNIGDQIMSINGTSLVGLPLSTCQSIIKGLKNQTAVKLNIVS  733 (829)
T ss_pred             HHHHhcccCChhhhcCCccccceeEeecCceeccccHHHHHHHHhcccccceEEEEEec
Confidence            5566778899999997 999999999999876542  455566665 333456666665


No 87 
>PF03510 Peptidase_C24:  2C endopeptidase (C24) cysteine protease family;  InterPro: IPR000317 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. 
These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad [].  The two signatures that define this group of calicivirus polyproteins identify a cysteine peptidase signature that belongs to MEROPS peptidase family C24 (clan PA(C)). Caliciviruses are positive-stranded ssRNA viruses that cause gastroenteritis. The calicivirus genome contains two open reading frames, ORF1 and ORF2. ORF2 encodes a structural protein []; while ORF1 encodes a non-structural polypeptide, which has RNA helicase, cysteine protease and RNA polymerase activity. The regions of the polyprotein in which these activities lie are similar to proteins produced by the picornaviruses. Two different families of caliciviruses can be distinguished on the basis of sequence similarity, namely those classified as small round structured viruses (SRSVs) and those classed as non-SRSVs. Calicivirus proteases from the non-SRSV group, which are members of the PA protease clan, constitute family C24 of the cysteine proteases (proteases from SRSVs belong to the C37 family). As mentioned above, the protease activity resides within a polyprotein. The enzyme cleaves the polyprotein at sites N-terminal to itself, liberating the polyprotein helicase.; GO: 0004197 cysteine-type endopeptidase activity, 0006508 proteolysis
Probab=80.85  E-value=5.8  Score=32.95  Aligned_cols=53  Identities=28%  Similarity=0.505  Sum_probs=32.6

Q ss_pred             EEEEeCCCeEEeccccccCCCCCCCCCCceEEEEeCCCcEEEEEEEEecCCCCEEEEEeCCCCCCCccccCC
Q 012318          159 GAIVDADGTILTCAHVVVDFHGSRALPKGKVDVTLQDGRTFEGTVLNADFHSDIAIVKINSKTPLPAAKLGT  230 (466)
Q Consensus       159 GfiI~~~G~ILT~aHvv~~~~~~~~~~~~~i~V~~~~g~~~~a~vv~~d~~~DlAlLkv~~~~~~~~~~l~~  230 (466)
                      ++-|. +|.++|+.|+++...        .+     +|..+  +++.  ..-|+++++.+... ++.+++++
T Consensus         3 avHIG-nG~~vt~tHva~~~~--------~v-----~g~~f--~~~~--~~ge~~~v~~~~~~-~p~~~ig~   55 (105)
T PF03510_consen    3 AVHIG-NGRYVTVTHVAKSSD--------SV-----DGQPF--KIVK--TDGELCWVQSPLVH-LPAAQIGT   55 (105)
T ss_pred             eEEeC-CCEEEEEEEEeccCc--------eE-----cCcCc--EEEE--eccCEEEEECCCCC-CCeeEecc
Confidence            45565 679999999998742        21     12222  2222  34499999998543 56666654


No 88 
>PF02395 Peptidase_S6:  Immunoglobulin A1 protease Serine protease Prosite pattern;  InterPro: IPR000710 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. 
They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This group of serine peptidases belongs to the MEROPS peptidase family S6 (clan PA(S)). The type example is the IgA1-specific serine endopeptidase from Neisseria gonorrhoeae []. These cleave prolyl bonds in the hinge regions of immunoglobulin A heavy chains. Similar specificity is shown by the unrelated family of M26 metalloendopeptidases.; GO: 0004252 serine-type endopeptidase activity, 0006508 proteolysis; PDB: 3SZE_A 3H09_B 3SYJ_A 1WXR_A 3AK5_B.
Probab=79.73  E-value=27  Score=39.44  Aligned_cols=163  Identities=18%  Similarity=0.183  Sum_probs=73.4

Q ss_pred             eEEEEEeCCCeEEeccccccCCCCCCCCCCceEEEEeCC--CcEEEEEEEEecCCCCEEEEEeCCC-CCCCccccCCC--
Q 012318          157 GSGAIVDADGTILTCAHVVVDFHGSRALPKGKVDVTLQD--GRTFEGTVLNADFHSDIAIVKINSK-TPLPAAKLGTS--  231 (466)
Q Consensus       157 GSGfiI~~~G~ILT~aHvv~~~~~~~~~~~~~i~V~~~~--g~~~~a~vv~~d~~~DlAlLkv~~~-~~~~~~~l~~s--  231 (466)
                      |...+|++. ||+|.+|...+.          -.|.|..  ...|...--.-++..|+.+-||..- ....|+.+...  
T Consensus        67 G~aTLigpq-YiVSV~HN~~gy----------~~v~FG~~g~~~Y~iV~RNn~~~~Df~~pRLnK~VTEvaP~~~t~~~~  135 (769)
T PF02395_consen   67 GVATLIGPQ-YIVSVKHNGKGY----------NSVSFGNEGQNTYKIVDRNNYPSGDFHMPRLNKFVTEVAPAEMTTAGS  135 (769)
T ss_dssp             SS-EEEETT-EEEBETTG-TSC----------CEECESCSSTCEEEEEEEEBETTSTEBEEEESS---SS----BBSSTT
T ss_pred             ceEEEecCC-eEEEEEccCCCc----------CceeecccCCceEEEEEccCCCCcccceeecCceEEEEeccccccccc
Confidence            789999987 999999998442          2344544  3455332222334579999999852 22233332211  


Q ss_pred             -----CCCCCCCEEEEEecCC-------C--------CCCceEEeEEeeeecCccC-CCCCCccccEEEE----cccCCC
Q 012318          232 -----SKLCPGDWVVAMGCPH-------S--------LQNTVTAGIVSCVDRKSSD-LGLGGMRREYLQT----DCAINA  286 (466)
Q Consensus       232 -----~~~~~G~~V~~iG~p~-------~--------~~~~~t~G~Vs~~~~~~~~-~~~~~~~~~~i~~----~~~i~~  286 (466)
                           .....-...+-+|...       +        .....+.|.+......... .............    .....+
T Consensus       136 ~~~~y~d~~rY~~f~R~GsG~Q~i~~~~g~~~~~~~~ay~yltgGt~~~~~~~~n~~~~~~~~~~~~~~~~~pL~n~~~~  215 (769)
T PF02395_consen  136 DSNTYNDKERYPAFVRVGSGTQYIKDRNGNGTTILGGAYNYLTGGTVYNLPGYGNGSMILSGDLKKFNSYNGPLPNYGSP  215 (769)
T ss_dssp             STTGGGHTTTC-EEEEEESSSEEEEECCEEEEEEEEETTSCEEEEEESSEEEEECTCEEEEESTTTCCCCCSSSBEB--T
T ss_pred             cccccccchhchheeecCCceEEEEcCCCCeeEEEEeccceecCCccccccccccceEEEecccccccccCCcccccccc
Confidence                 0111222334444221       1        1112333443221100000 0000000000111    223478


Q ss_pred             CCccceee--cCC---CeEEEEEEeEecC--CCeeeEEEeHHHHHHHHHHH
Q 012318          287 GNSGGPLV--NID---GEIVGINIMKVAA--ADGLSFAVPIDSAAKIIEQF  330 (466)
Q Consensus       287 G~SGGPlv--d~~---G~VVGI~~~~~~~--~~g~~~aip~~~i~~~l~~l  330 (466)
                      |+||+|||  |..   --++|+.+.....  ..+....+|.+.+.+++++-
T Consensus       216 GDSGSPlF~YD~~~kKWvl~Gv~~~~~~~~g~~~~~~~~~~~f~~~~~~~d  266 (769)
T PF02395_consen  216 GDSGSPLFAYDKEKKKWVLVGVLSGGNGYNGKGNWWNVIPPDFINQIKQND  266 (769)
T ss_dssp             T-TT-EEEEEETTTTEEEEEEEEEEECCCCHSEEEEEEECHHHHHHHHHHC
T ss_pred             CcCCCceEEEEccCCeEEEEEEEccccccCCccceeEEecHHHHHHHHhhh
Confidence            99999999  322   4699999876542  12345567777776666653


No 89 
>PF00947 Pico_P2A:  Picornavirus core protein 2A;  InterPro: IPR000081 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. 
Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad [].  This domain defines cysteine peptidases belonging to MEROPS peptidase family C3 (picornain, clan PA(C)), subfamilies 3CA and 3CB. The protein fold of this peptidase domain for members of this family resembles that of the serine peptidase, chymotrypsin [], the type example for clan PA. Picornaviral proteins are expressed as a single polyprotein which is cleaved by the viral 3C cysteine protease []. The poliovirus polyprotein is selectively cleaved at the Gln-|-Gly bond. In other picornavirus reactions Glu may be substituted for Gln, and Ser or Thr for Gly. ; GO: 0008233 peptidase activity, 0006508 proteolysis, 0016032 viral reproduction; PDB: 2HRV_B 1Z8R_A.
Probab=78.34  E-value=10  Score=32.62  Aligned_cols=35  Identities=20%  Similarity=0.291  Sum_probs=27.5

Q ss_pred             ccccEEEEcccCCCCCccceeecCCCeEEEEEEeEe
Q 012318          273 MRREYLQTDCAINAGNSGGPLVNIDGEIVGINIMKV  308 (466)
Q Consensus       273 ~~~~~i~~~~~i~~G~SGGPlvd~~G~VVGI~~~~~  308 (466)
                      ....++....+..||+-||+|+ .+--||||++++-
T Consensus        76 ~Q~~~l~g~Gp~~PGdCGg~L~-C~HGViGi~Tagg  110 (127)
T PF00947_consen   76 YQYNLLIGEGPAEPGDCGGILR-CKHGVIGIVTAGG  110 (127)
T ss_dssp             EEECEEEEE-SSSTT-TCSEEE-ETTCEEEEEEEEE
T ss_pred             eecCceeecccCCCCCCCceeE-eCCCeEEEEEeCC
Confidence            3445777788899999999999 8888999999863


No 90 
>PF02907 Peptidase_S29:  Hepatitis C virus NS3 protease;  InterPro: IPR004109 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This signature identifies the Hepatitis C virus NS3 protein as a serine protease which belongs to MEROPS peptidase family S29 (hepacivirin family, clan PA(S)), which has a trypsin-like fold. The non-structural (NS) protein NS3 is one of the NS proteins involved in replication of the HCV genome. The NS2 proteinase (IPR002518 from INTERPRO), a zinc-dependent enzyme, performs a single proteolytic cut to release the N terminus of NS3. The action of NS3 proteinase (NS3P), which resides in the N-terminal one-third of the NS3 protein, then yields all remaining non-structural proteins. The C-terminal two-thirds of the NS3 protein contain a helicase. The functional relationship between the proteinase and helicase domains is unknown. NS3 has a structural zinc-binding site and requires cofactor NS4. 
It has been suggested that the NS3 serine protease of hepatitis C is involved in cell transformation and that the ability to transform requires an active enzyme [].; GO: 0008236 serine-type peptidase activity, 0006508 proteolysis, 0019087 transformation of host cell by virus; PDB: 2QV1_B 3LOX_C 2OBQ_C 2OC1_C 2OC0_A 3LON_A 3KNX_A 2O8M_A 2OBO_A 2OC8_A ....
Probab=71.02  E-value=9  Score=33.23  Aligned_cols=128  Identities=18%  Similarity=0.215  Sum_probs=63.4

Q ss_pred             eEEEEEeCCCeEEeccccccCCCCCCCCCCceEEEEeCCCcEEEEEEEEecCCCCEEEEEeCCC-CCCCccccCCCCCCC
Q 012318          157 GSGAIVDADGTILTCAHVVVDFHGSRALPKGKVDVTLQDGRTFEGTVLNADFHSDIAIVKINSK-TPLPAAKLGTSSKLC  235 (466)
Q Consensus       157 GSGfiI~~~G~ILT~aHvv~~~~~~~~~~~~~i~V~~~~g~~~~a~vv~~d~~~DlAlLkv~~~-~~~~~~~l~~s~~~~  235 (466)
                      --|+.|+  |.+-|.+|=-...           ++.-+.|   +..-.+.+...|+..-..... ..+.+-.-+      
T Consensus        14 fmgt~vn--GV~wT~~HGagsr-----------tlAgp~G---pv~q~~~s~~~Dlv~~p~P~Ga~SL~pCtCg------   71 (148)
T PF02907_consen   14 FMGTCVN--GVMWTVYHGAGSR-----------TLAGPKG---PVNQMYTSVDDDLVGWPAPPGARSLTPCTCG------   71 (148)
T ss_dssp             EEEEEET--TEEEEEHHHHTTS-----------EEEBTTS---EB-ESEEETTTTEEEEE-STTB--BBB-SSS------
T ss_pred             eehhEEc--cEEEEEEecCCcc-----------cccCCCC---cceEeEEcCCCCCcccccccccccCCccccC------
Confidence            4577775  7888999965431           1222222   122345677888888877642 223333222      


Q ss_pred             CCCEEEEEecCCCCCCceEEeEEeeeecCccCCCCCCccccEEE-EcccCCCCCccceeecCCCeEEEEEEeEecCC---
Q 012318          236 PGDWVVAMGCPHSLQNTVTAGIVSCVDRKSSDLGLGGMRREYLQ-TDCAINAGNSGGPLVNIDGEIVGINIMKVAAA---  311 (466)
Q Consensus       236 ~G~~V~~iG~p~~~~~~~t~G~Vs~~~~~~~~~~~~~~~~~~i~-~~~~i~~G~SGGPlvd~~G~VVGI~~~~~~~~---  311 (466)
                       -..++++-....    +..++-  ...         ....++. .-.....|.||||++=.+|.+|||-....-..   
T Consensus        72 -~~dlylVtr~~~----v~p~rr--~gd---------~~~~L~sp~pis~lkGSSGgPiLC~~GH~vG~f~aa~~trgva  135 (148)
T PF02907_consen   72 -SSDLYLVTRDAD----VIPVRR--RGD---------SRASLLSPRPISDLKGSSGGPILCPSGHAVGMFRAAVCTRGVA  135 (148)
T ss_dssp             -SSEEEEE-TTS-----EEEEEE--EST---------TEEEEEEEEEHHHHTT-TT-EEEETTSEEEEEEEEEEEETTEE
T ss_pred             -CccEEEEeccCc----EeeeEE--cCC---------CceEecCCceeEEEecCCCCcccCCCCCEEEEEEEEEEcCCce
Confidence             146677744221    222211  000         0111111 11234689999999988899999987754321   


Q ss_pred             CeeeEEEeHHHH
Q 012318          312 DGLSFAVPIDSA  323 (466)
Q Consensus       312 ~g~~~aip~~~i  323 (466)
                      ..+-| +|++.+
T Consensus       136 k~i~f-~P~e~l  146 (148)
T PF02907_consen  136 KAIDF-IPVETL  146 (148)
T ss_dssp             EEEEE-EEHHHH
T ss_pred             eeEEE-Eeeeec
Confidence            22344 587764


No 91 
>PF01732 DUF31:  Putative peptidase (DUF31);  InterPro: IPR022382  This domain has no known function. It is found in various hypothetical proteins and putative lipoproteins from mycoplasmas. 
Probab=69.10  E-value=3.4  Score=42.48  Aligned_cols=24  Identities=33%  Similarity=0.650  Sum_probs=21.3

Q ss_pred             ccCCCCCccceeecCCCeEEEEEE
Q 012318          282 CAINAGNSGGPLVNIDGEIVGINI  305 (466)
Q Consensus       282 ~~i~~G~SGGPlvd~~G~VVGI~~  305 (466)
                      .....|.||+.|+|.+|++|||.+
T Consensus       350 ~~l~gGaSGS~V~n~~~~lvGIy~  373 (374)
T PF01732_consen  350 YSLGGGASGSMVINQNNELVGIYF  373 (374)
T ss_pred             cCCCCCCCcCeEECCCCCEEEEeC
Confidence            356789999999999999999975


No 92 
>PF11874 DUF3394:  Domain of unknown function (DUF3394);  InterPro: IPR021814  This domain is functionally uncharacterised. This domain is found in bacteria. This presumed domain is about 190 amino acids in length. This domain is found associated with PF06808 from PFAM. 
Probab=64.31  E-value=7.3  Score=35.79  Aligned_cols=32  Identities=28%  Similarity=0.240  Sum_probs=28.1

Q ss_pred             CCCcceeecccCCCChhhhCCCCCCCEEEEEC
Q 012318          388 NVKSGVLVPVVTPGSPAHLAGFLPSDVVIKFD  419 (466)
Q Consensus       388 ~~~~g~~V~~V~~~spA~~aGl~~GD~I~~in  419 (466)
                      .....++|++|..+|||+++|+.-|+.|+++-
T Consensus       119 ~e~~~~~Vd~v~fgS~A~~~g~d~d~~I~~v~  150 (183)
T PF11874_consen  119 EEGGKVIVDEVEFGSPAEKAGIDFDWEITEVE  150 (183)
T ss_pred             eeCCEEEEEecCCCCHHHHcCCCCCcEEEEEE
Confidence            44567899999999999999999999998873


No 93 
>KOG3938 consensus RGS-GAIP interacting protein GIPC, contains PDZ domain [Signal transduction mechanisms; Intracellular trafficking, secretion, and vesicular transport]
Probab=63.75  E-value=10  Score=36.65  Aligned_cols=55  Identities=13%  Similarity=0.269  Sum_probs=42.9

Q ss_pred             eeecccCCCChhhhC-CCCCCCEEEEECCEecCCHH--HHHHHHhc-CCCCeEEEEEEE
Q 012318          393 VLVPVVTPGSPAHLA-GFLPSDVVIKFDGKPVQSIT--EIIEIMGD-RVGEPLKVVVQR  447 (466)
Q Consensus       393 ~~V~~V~~~spA~~a-Gl~~GD~I~~ing~~v~~~~--~~~~~l~~-~~g~~v~l~v~R  447 (466)
                      ..|..+.++|.-++. -++.||.|-+|||+.+-.+.  ++.++|+. ..|++.+|.+..
T Consensus       151 AFIKrIkegsvidri~~i~VGd~IEaiNge~ivG~RHYeVArmLKel~rge~ftlrLie  209 (334)
T KOG3938|consen  151 AFIKRIKEGSVIDRIEAICVGDHIEAINGESIVGKRHYEVARMLKELPRGETFTLRLIE  209 (334)
T ss_pred             eeeEeecCCchhhhhhheeHHhHHHhhcCccccchhHHHHHHHHHhcccCCeeEEEeec
Confidence            457777888877665 48999999999999998875  55678877 678888777664


No 94 
>cd00600 Sm_like The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation.  Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet.  Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=59.84  E-value=24  Score=25.77  Aligned_cols=33  Identities=27%  Similarity=0.469  Sum_probs=29.2

Q ss_pred             CceEEEEeCCCcEEEEEEEEecCCCCEEEEEeC
Q 012318          186 KGKVDVTLQDGRTFEGTVLNADFHSDIAIVKIN  218 (466)
Q Consensus       186 ~~~i~V~~~~g~~~~a~vv~~d~~~DlAlLkv~  218 (466)
                      ...+.|.+.+|+.|.+.+..+|...++.|-...
T Consensus         6 g~~V~V~l~~g~~~~G~L~~~D~~~Ni~L~~~~   38 (63)
T cd00600           6 GKTVRVELKDGRVLEGVLVAFDKYMNLVLDDVE   38 (63)
T ss_pred             CCEEEEEECCCcEEEEEEEEECCCCCEEECCEE
Confidence            368999999999999999999999999886664


No 95 
>cd01726 LSm6 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation.  Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm6 is one of at least seven subunits that assemble onto U6 snRNA to form a seven-membered ring structure.  Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=57.25  E-value=24  Score=26.58  Aligned_cols=33  Identities=24%  Similarity=0.278  Sum_probs=29.5

Q ss_pred             CceEEEEeCCCcEEEEEEEEecCCCCEEEEEeC
Q 012318          186 KGKVDVTLQDGRTFEGTVLNADFHSDIAIVKIN  218 (466)
Q Consensus       186 ~~~i~V~~~~g~~~~a~vv~~d~~~DlAlLkv~  218 (466)
                      ...+.|.+.+|+.|.+++.++|...+|.|-...
T Consensus        10 ~~~V~V~Lk~g~~~~G~L~~~D~~mNlvL~~~~   42 (67)
T cd01726          10 GRPVVVKLNSGVDYRGILACLDGYMNIALEQTE   42 (67)
T ss_pred             CCeEEEEECCCCEEEEEEEEEccceeeEEeeEE
Confidence            368999999999999999999999999886664


No 96 
>PRK00737 small nuclear ribonucleoprotein; Provisional
Probab=56.93  E-value=26  Score=26.86  Aligned_cols=33  Identities=27%  Similarity=0.491  Sum_probs=29.7

Q ss_pred             CceEEEEeCCCcEEEEEEEEecCCCCEEEEEeC
Q 012318          186 KGKVDVTLQDGRTFEGTVLNADFHSDIAIVKIN  218 (466)
Q Consensus       186 ~~~i~V~~~~g~~~~a~vv~~d~~~DlAlLkv~  218 (466)
                      ...+.|.+.+|+.|.+.+.++|...++.|-...
T Consensus        14 ~k~V~V~lk~g~~~~G~L~~~D~~mNlvL~d~~   46 (72)
T PRK00737         14 NSPVLVRLKGGREFRGELQGYDIHMNLVLDNAE   46 (72)
T ss_pred             CCEEEEEECCCCEEEEEEEEEcccceeEEeeEE
Confidence            368999999999999999999999999887765


No 97 
>cd01731 archaeal_Sm1 The archaeal sm1 proteins: The Sm proteins are conserved in all three domains of life and are always associated with U-rich RNA sequences. They function to mediate RNA-RNA interactions and RNA biogenesis.  All Sm proteins contain a common sequence motif in two segments, Sm1 and Sm2, separated by a short variable linker. Eukaryotic Sm proteins form part of specific small nuclear ribonucleoproteins (snRNPs) that are involved in the processing of pre-mRNAs to mature mRNAs, and are a major component of the eukaryotic spliceosome. Most snRNPs consist of seven Sm proteins (B/B', D1, D2, D3, E, F and G) arranged in a ring on a uridine-rich sequence (Sm site), plus a small nuclear RNA (snRNA) (either U1, U2, U5 or U4/6). Since archaebacteria do not have any splicing apparatus, Sm proteins of archaebacteria may play a more general role. Archaeal Lsm proteins are likely to represent the ancestral Sm domain.
Probab=56.49  E-value=27  Score=26.32  Aligned_cols=33  Identities=21%  Similarity=0.359  Sum_probs=30.0

Q ss_pred             CceEEEEeCCCcEEEEEEEEecCCCCEEEEEeC
Q 012318          186 KGKVDVTLQDGRTFEGTVLNADFHSDIAIVKIN  218 (466)
Q Consensus       186 ~~~i~V~~~~g~~~~a~vv~~d~~~DlAlLkv~  218 (466)
                      ..++.|.+.+|+.+.+.+..+|...+|.|-...
T Consensus        10 ~~~V~V~l~~g~~~~G~L~~~D~~mNlvL~~~~   42 (68)
T cd01731          10 NKPVLVKLKGGKEVRGRLKSYDQHMNLVLEDAE   42 (68)
T ss_pred             CCEEEEEECCCCEEEEEEEEECCcceEEEeeEE
Confidence            368999999999999999999999999988775


No 98 
>cd01722 Sm_F The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation.  Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. Sm subunit F is capable of forming both homo- and hetero-heptamer ring structures.  To form the hetero-heptamer, Sm subunit F initially binds subunits E and G to form a trimer which then assembles onto snRNA along with the D3/B and D1/D2 heterodimers.
Probab=55.66  E-value=25  Score=26.64  Aligned_cols=33  Identities=21%  Similarity=0.330  Sum_probs=29.3

Q ss_pred             CceEEEEeCCCcEEEEEEEEecCCCCEEEEEeC
Q 012318          186 KGKVDVTLQDGRTFEGTVLNADFHSDIAIVKIN  218 (466)
Q Consensus       186 ~~~i~V~~~~g~~~~a~vv~~d~~~DlAlLkv~  218 (466)
                      ...+.|.+.+|+.|.+++..+|...+|.|=.+.
T Consensus        11 g~~V~V~Lk~g~~~~G~L~~~D~~mNi~L~~~~   43 (68)
T cd01722          11 GKPVIVKLKWGMEYKGTLVSVDSYMNLQLANTE   43 (68)
T ss_pred             CCEEEEEECCCcEEEEEEEEECCCEEEEEeeEE
Confidence            368999999999999999999999999886654


No 99 
>cd01732 LSm5 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation.  Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm5 is one of at least seven subunits that assemble onto U6 snRNA to form a seven-membered ring structure.  Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=55.53  E-value=25  Score=27.37  Aligned_cols=32  Identities=16%  Similarity=0.365  Sum_probs=28.5

Q ss_pred             CceEEEEeCCCcEEEEEEEEecCCCCEEEEEe
Q 012318          186 KGKVDVTLQDGRTFEGTVLNADFHSDIAIVKI  217 (466)
Q Consensus       186 ~~~i~V~~~~g~~~~a~vv~~d~~~DlAlLkv  217 (466)
                      +..+.|.+.+|+.+.+++.++|...++.|=..
T Consensus        13 ~~~V~V~l~~gr~~~G~L~g~D~~mNlvL~da   44 (76)
T cd01732          13 GSRIWIVMKSDKEFVGTLLGFDDYVNMVLEDV   44 (76)
T ss_pred             CCEEEEEECCCeEEEEEEEEeccceEEEEccE
Confidence            36899999999999999999999999987554


No 100
>cd01730 LSm3 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation.  Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm3 is one of at least seven subunits that assemble onto U6 snRNA to form a seven-membered ring structure.  Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=54.93  E-value=23  Score=27.93  Aligned_cols=32  Identities=22%  Similarity=0.376  Sum_probs=28.3

Q ss_pred             CceEEEEeCCCcEEEEEEEEecCCCCEEEEEe
Q 012318          186 KGKVDVTLQDGRTFEGTVLNADFHSDIAIVKI  217 (466)
Q Consensus       186 ~~~i~V~~~~g~~~~a~vv~~d~~~DlAlLkv  217 (466)
                      ...+.|.+.+|+.+.+.+.++|.+.+|.|=..
T Consensus        11 ~k~V~V~l~~gr~~~G~L~~fD~~mNlvL~d~   42 (82)
T cd01730          11 DERVYVKLRGDRELRGRLHAYDQHLNMILGDV   42 (82)
T ss_pred             CCEEEEEECCCCEEEEEEEEEccceEEeccce
Confidence            36899999999999999999999999987544


No 101
>cd06168 LSm9 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation.  Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm9 proteins have a single Sm-like domain structure.  Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=52.44  E-value=34  Score=26.58  Aligned_cols=33  Identities=24%  Similarity=0.404  Sum_probs=29.0

Q ss_pred             CceEEEEeCCCcEEEEEEEEecCCCCEEEEEeC
Q 012318          186 KGKVDVTLQDGRTFEGTVLNADFHSDIAIVKIN  218 (466)
Q Consensus       186 ~~~i~V~~~~g~~~~a~vv~~d~~~DlAlLkv~  218 (466)
                      ...+.|.+.||+.+.+.+..+|...+|.|=...
T Consensus        10 ~~~v~V~l~dgR~~~G~l~~~D~~~NivL~~~~   42 (75)
T cd06168          10 GRTMRIHMTDGRTLVGVFLCTDRDCNIILGSAQ   42 (75)
T ss_pred             CCeEEEEEcCCeEEEEEEEEEcCCCcEEecCcE
Confidence            368999999999999999999999999875554


No 102
>cd01719 Sm_G The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation.  Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet.  Sm subunit G binds subunits E and F to form a trimer which then assembles onto snRNA along with the D1/D2 and D3/B heterodimers forming a seven-membered ring structure.  Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=51.86  E-value=36  Score=26.19  Aligned_cols=33  Identities=15%  Similarity=0.231  Sum_probs=28.9

Q ss_pred             CceEEEEeCCCcEEEEEEEEecCCCCEEEEEeC
Q 012318          186 KGKVDVTLQDGRTFEGTVLNADFHSDIAIVKIN  218 (466)
Q Consensus       186 ~~~i~V~~~~g~~~~a~vv~~d~~~DlAlLkv~  218 (466)
                      +.++.|.+.+|+.+.+++.++|...+|.|=...
T Consensus        10 ~k~V~V~L~~g~~~~G~L~~~D~~mNlvL~~~~   42 (72)
T cd01719          10 DKKLSLKLNGNRKVSGILRGFDPFMNLVLDDAV   42 (72)
T ss_pred             CCeEEEEECCCeEEEEEEEEEcccccEEeccEE
Confidence            468999999999999999999999999885553


No 103
>cd01717 Sm_B The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation.  Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. Sm subunit B heterodimerizes with subunit D3 and three such heterodimers form a hexameric ring structure with alternating B and D3 subunits.  The D3 - B heterodimer also assembles into a heptameric ring containing D1, D2, E, F, and G subunits.  Sm-like proteins exist in archaea as well as prokaryotes which form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=51.61  E-value=32  Score=26.86  Aligned_cols=33  Identities=36%  Similarity=0.566  Sum_probs=29.1

Q ss_pred             CceEEEEeCCCcEEEEEEEEecCCCCEEEEEeC
Q 012318          186 KGKVDVTLQDGRTFEGTVLNADFHSDIAIVKIN  218 (466)
Q Consensus       186 ~~~i~V~~~~g~~~~a~vv~~d~~~DlAlLkv~  218 (466)
                      ...+.|.+.+|+.+.+.+.++|...+|.|=...
T Consensus        10 ~~~V~V~l~dgR~~~G~L~~~D~~~NlVL~~~~   42 (79)
T cd01717          10 NYRLRVTLQDGRQFVGQFLAFDKHMNLVLSDCE   42 (79)
T ss_pred             CCEEEEEECCCcEEEEEEEEEcCccCEEcCCEE
Confidence            468999999999999999999999999875554


No 104
>cd01729 LSm7 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation.  Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm7 is one of at least seven subunits that assemble onto U6 snRNA to form a seven-membered ring structure.  Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=51.51  E-value=34  Score=26.94  Aligned_cols=33  Identities=21%  Similarity=0.301  Sum_probs=28.8

Q ss_pred             CceEEEEeCCCcEEEEEEEEecCCCCEEEEEeC
Q 012318          186 KGKVDVTLQDGRTFEGTVLNADFHSDIAIVKIN  218 (466)
Q Consensus       186 ~~~i~V~~~~g~~~~a~vv~~d~~~DlAlLkv~  218 (466)
                      +.++.|.+.+|+.+.+.+.++|...+|.|=...
T Consensus        12 ~k~V~V~l~~gr~~~G~L~~~D~~mNlvL~~~~   44 (81)
T cd01729          12 DKKIRVKFQGGREVTGILKGYDQLLNLVLDDTV   44 (81)
T ss_pred             CCeEEEEECCCcEEEEEEEEEcCcccEEecCEE
Confidence            468999999999999999999999999875543


No 105
>cd01728 LSm1 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation.  Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm1 is one of at least seven subunits that assemble onto U6 snRNA to form a seven-membered ring structure.  Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=50.38  E-value=38  Score=26.27  Aligned_cols=32  Identities=28%  Similarity=0.415  Sum_probs=28.6

Q ss_pred             CceEEEEeCCCcEEEEEEEEecCCCCEEEEEe
Q 012318          186 KGKVDVTLQDGRTFEGTVLNADFHSDIAIVKI  217 (466)
Q Consensus       186 ~~~i~V~~~~g~~~~a~vv~~d~~~DlAlLkv  217 (466)
                      +.++.|.+.+|+.+.+.+.++|+..+|.|=..
T Consensus        12 ~k~v~V~l~~gr~~~G~L~~fD~~~NlvL~d~   43 (74)
T cd01728          12 DKKVVVLLRDGRKLIGILRSFDQFANLVLQDT   43 (74)
T ss_pred             CCEEEEEEcCCeEEEEEEEEECCcccEEecce
Confidence            46899999999999999999999999988554


No 106
>cd01720 Sm_D2 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation.  Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. Sm subunit D2 heterodimerizes with subunit D1 and three such heterodimers form a hexameric ring structure with alternating D1 and D2 subunits. The D1 - D2 heterodimer also assembles into a heptameric ring containing D2, D3, E, F, and G subunits. Sm-like proteins exist in archaea as well as prokaryotes which form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=49.18  E-value=36  Score=27.28  Aligned_cols=32  Identities=16%  Similarity=0.325  Sum_probs=28.8

Q ss_pred             ceEEEEeCCCcEEEEEEEEecCCCCEEEEEeC
Q 012318          187 GKVDVTLQDGRTFEGTVLNADFHSDIAIVKIN  218 (466)
Q Consensus       187 ~~i~V~~~~g~~~~a~vv~~d~~~DlAlLkv~  218 (466)
                      ..+.|.+.+++.+.+++.++|.+.+|.|=.+.
T Consensus        15 ~~V~V~lr~~r~~~G~L~~fD~hmNlvL~d~~   46 (87)
T cd01720          15 TQVLINCRNNKKLLGRVKAFDRHCNMVLENVK   46 (87)
T ss_pred             CEEEEEEcCCCEEEEEEEEecCccEEEEcceE
Confidence            68999999999999999999999999976554


No 107
>cd01735 LSm12_N LSm12 belongs to a family of Sm-like proteins that associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation.  Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet that associates with other Sm proteins to form hexameric and heptameric ring structures.   In addition to the N-terminal Sm-like domain, LSm12 has a novel methyltransferase domain.
Probab=48.81  E-value=60  Score=24.24  Aligned_cols=33  Identities=24%  Similarity=0.331  Sum_probs=29.4

Q ss_pred             ceEEEEeCCCcEEEEEEEEecCCCCEEEEEeCC
Q 012318          187 GKVDVTLQDGRTFEGTVLNADFHSDIAIVKINS  219 (466)
Q Consensus       187 ~~i~V~~~~g~~~~a~vv~~d~~~DlAlLkv~~  219 (466)
                      ..+.+..-.|..++++++.+|....+.+|+.++
T Consensus         7 s~V~~kTc~g~~ieGEV~afD~~tk~lIlk~~s   39 (61)
T cd01735           7 SQVSCRTCFEQRLQGEVVAFDYPSKMLILKCPS   39 (61)
T ss_pred             cEEEEEecCCceEEEEEEEecCCCcEEEEECcc
Confidence            577788888999999999999999999999765


No 108
>cd01721 Sm_D3 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation.  Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. Sm subunit D3 heterodimerizes with subunit B and three such heterodimers form a hexameric ring structure with alternating B and D3 subunits. The D3 - B heterodimer also assembles into a heptameric ring containing D1, D2, E, F, and G subunits. Sm-like proteins exist in archaea as well as prokaryotes which form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=46.99  E-value=47  Score=25.32  Aligned_cols=33  Identities=18%  Similarity=0.393  Sum_probs=29.9

Q ss_pred             CceEEEEeCCCcEEEEEEEEecCCCCEEEEEeC
Q 012318          186 KGKVDVTLQDGRTFEGTVLNADFHSDIAIVKIN  218 (466)
Q Consensus       186 ~~~i~V~~~~g~~~~a~vv~~d~~~DlAlLkv~  218 (466)
                      ...+.|.+.+|..|.+++..+|...++.|-.+.
T Consensus        10 g~~V~VeLk~g~~~~G~L~~~D~~MNl~L~~~~   42 (70)
T cd01721          10 GHIVTVELKTGEVYRGKLIEAEDNMNCQLKDVT   42 (70)
T ss_pred             CCEEEEEECCCcEEEEEEEEEcCCceeEEEEEE
Confidence            368999999999999999999999999988774


No 109
>smart00651 Sm snRNP Sm proteins. small nuclear ribonucleoprotein particles (snRNPs) involved in pre-mRNA splicing
Probab=46.26  E-value=49  Score=24.54  Aligned_cols=33  Identities=24%  Similarity=0.462  Sum_probs=28.9

Q ss_pred             CceEEEEeCCCcEEEEEEEEecCCCCEEEEEeC
Q 012318          186 KGKVDVTLQDGRTFEGTVLNADFHSDIAIVKIN  218 (466)
Q Consensus       186 ~~~i~V~~~~g~~~~a~vv~~d~~~DlAlLkv~  218 (466)
                      ...+.|.+.+|+.+.+.+..+|...++-|=...
T Consensus         8 ~~~V~V~l~~g~~~~G~L~~~D~~~NlvL~~~~   40 (67)
T smart00651        8 GKRVLVELKNGREYRGTLKGFDQFMNLVLEDVE   40 (67)
T ss_pred             CcEEEEEECCCcEEEEEEEEECccccEEEccEE
Confidence            368999999999999999999999999876554


No 110
>PF05416 Peptidase_C37:  Southampton virus-type processing peptidase;  InterPro: IPR001665 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. 
These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad [].  This group of cysteine peptidases belongs to the MEROPS peptidase family C37 (clan PA(C)). The type example is calicivirin from Southampton virus, an endopeptidase that cleaves the polyprotein at sites N-terminal to itself, liberating the polyprotein helicase. Southampton virus is a positive-stranded ssRNA virus belonging to the Caliciviruses, which are viruses that cause gastroenteritis. The calicivirus genome contains two open reading frames, ORF1 and ORF2. ORF1 encodes a non-structural polypeptide, which has RNA helicase, cysteine protease and RNA polymerase activity []. The regions of the polyprotein in which these activities lie are similar to proteins produced by the picornaviruses []. ORF2 encodes a structural, capsid protein. Two different families of caliciviruses can be distinguished on the basis of sequence similarity, namely the Norwalk-like viruses or small round structured viruses (SRSVs), and those classed as non-SRSVs.; GO: 0004197 cysteine-type endopeptidase activity, 0006508 proteolysis; PDB: 2FYQ_A 2FYR_A 1WQS_D 4ASH_A 2IPH_B.
Probab=45.41  E-value=1.4e+02  Score=31.03  Aligned_cols=136  Identities=16%  Similarity=0.209  Sum_probs=62.9

Q ss_pred             ceeEEEEEeCCCeEEeccccccCCCCCCCCCCceEEEEeCCCcEEEEEEEEecCCCCEEEEEeCCC--CCCCccccCCCC
Q 012318          155 GIGSGAIVDADGTILTCAHVVVDFHGSRALPKGKVDVTLQDGRTFEGTVLNADFHSDIAIVKINSK--TPLPAAKLGTSS  232 (466)
Q Consensus       155 ~~GSGfiI~~~G~ILT~aHvv~~~~~~~~~~~~~i~V~~~~g~~~~a~vv~~d~~~DlAlLkv~~~--~~~~~~~l~~s~  232 (466)
                      +.|=||.|+++ +++|+-||+....       .++   |  |  .+-.-+..+..-++.-+++..+  .+++-+-|.  .
T Consensus       379 GsGWGfWVS~~-lfITttHViP~g~-------~E~---F--G--v~i~~i~vh~sGeF~~~rFpk~iRPDvtgmiLE--e  441 (535)
T PF05416_consen  379 GSGWGFWVSPT-LFITTTHVIPPGA-------KEA---F--G--VPISQIQVHKSGEFCRFRFPKPIRPDVTGMILE--E  441 (535)
T ss_dssp             TTEEEEESSSS-EEEEEGGGS-STT-------SEE---T--T--EECGGEEEEEETTEEEEEESS-SSTTS---EE---S
T ss_pred             CCceeeeecce-EEEEeeeecCCcc-------hhh---h--C--CChhHeEEeeccceEEEecCCCCCCCccceeec--c
Confidence            55789999998 9999999997632       111   0  0  0111123344446777777754  234444442  2


Q ss_pred             CCCCCCEEEE-EecCCCC--CCceEEeEEeeeecCccCCCCCCccccEEEE-------cccCCCCCccceeecCCC---e
Q 012318          233 KLCPGDWVVA-MGCPHSL--QNTVTAGIVSCVDRKSSDLGLGGMRREYLQT-------DCAINAGNSGGPLVNIDG---E  299 (466)
Q Consensus       233 ~~~~G~~V~~-iG~p~~~--~~~~t~G~Vs~~~~~~~~~~~~~~~~~~i~~-------~~~i~~G~SGGPlvd~~G---~  299 (466)
                      ....|.-+.+ |=.+.|.  +..+..|........-...   .....++.+       |-.+.||+-|-|-+-..|   -
T Consensus       442 GapEGtV~siLiKR~sGEllpLAvRMgt~AsmkIqgr~v---~GQ~GMLLTGaNAK~mDLGT~PGDCGcPYvyKrgNd~V  518 (535)
T PF05416_consen  442 GAPEGTVCSILIKRPSGELLPLAVRMGTHASMKIQGRTV---HGQMGMLLTGANAKGMDLGTIPGDCGCPYVYKRGNDWV  518 (535)
T ss_dssp             S--TT-EEEEEEE-TTSBEEEEEEEEEEEEEEEETTEEE---EEEEEEETTSTT-SSTTTS--TTGTT-EEEEEETTEEE
T ss_pred             CCCCceEEEEEEEcCCccchhhhhhhccceeEEEcceee---cceeeeeeecCCccccccCCCCCCCCCceeeecCCcEE
Confidence            3344544332 3334332  2234444433322110000   001112222       334678999999995555   5


Q ss_pred             EEEEEEeEecC
Q 012318          300 IVGINIMKVAA  310 (466)
Q Consensus       300 VVGI~~~~~~~  310 (466)
                      |+|+|.....+
T Consensus       519 V~GVH~AAtr~  529 (535)
T PF05416_consen  519 VIGVHAAATRS  529 (535)
T ss_dssp             EEEEEEEE-SS
T ss_pred             EEEEEehhccC
Confidence            89999886543


No 111
>cd01727 LSm8 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation.  Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm8 is one of at least seven subunits that assemble onto U6 snRNA to form a seven-membered ring structure.  Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=44.22  E-value=50  Score=25.40  Aligned_cols=33  Identities=24%  Similarity=0.330  Sum_probs=29.2

Q ss_pred             CceEEEEeCCCcEEEEEEEEecCCCCEEEEEeC
Q 012318          186 KGKVDVTLQDGRTFEGTVLNADFHSDIAIVKIN  218 (466)
Q Consensus       186 ~~~i~V~~~~g~~~~a~vv~~d~~~DlAlLkv~  218 (466)
                      +.++.|.+.+|+.+.+.+.++|...++.|=...
T Consensus         9 ~~~V~V~l~dgr~~~G~L~~~D~~~NlvL~~~~   41 (74)
T cd01727           9 NKTVSVITVDGRVIVGTLKGFDQATNLILDDSH   41 (74)
T ss_pred             CCEEEEEECCCcEEEEEEEEEccccCEEccceE
Confidence            368999999999999999999999999887654


No 112
>PF01423 LSM:  LSM domain ;  InterPro: IPR001163 This family is found in Lsm (like-Sm) proteins and in bacterial Lsm-related Hfq proteins. In each case, the domain adopts a core structure consisting of an open beta-barrel with an SH3-like topology. Lsm (like-Sm) proteins have diverse functions, and are thought to be important modulators of RNA biogenesis and function [, ]. The Sm proteins form part of specific small nuclear ribonucleoproteins (snRNPs) that are involved in the processing of pre-mRNAs to mature mRNAs, and are a major component of the eukaryotic spliceosome. Most snRNPs consist of seven Sm proteins (B/B', D1, D2, D3, E, F and G) arranged in a ring on a uridine-rich sequence (Sm site), plus a small nuclear RNA (snRNA) (either U1, U2, U5 or U4/6) []. All Sm proteins contain a common sequence motif in two segments, Sm1 and Sm2, separated by a short variable linker []. In other snRNPs, certain Sm proteins are replaced with different Lsm proteins, such as with U7 snRNPs, in which the D1 and D2 Sm proteins are replaced with U7-specific Lsm10 and Lsm11 proteins, where Lsm11 plays a role in histone U7-specific RNA processing []. Lsm proteins are also found in archaebacteria, which do not have any splicing apparatus suggesting a more general role for Lsm proteins. The pleiotropic translational regulator Hfq (host factor Q) is a bacterial Lsm-like protein, which modulates the structure of numerous RNA molecules by binding preferentially to A/U-rich sequences in RNA []. Hfq forms an Lsm-like fold, however, unlike the heptameric Sm proteins, Hfq forms a homo-hexameric ring.; PDB: 1D3B_K 2Y9D_D 2Y9A_D 2Y9C_R 3VRI_C 2Y9B_K 3QUI_D 3M4G_H 3INZ_E 1U1S_C ....
Probab=43.47  E-value=61  Score=24.04  Aligned_cols=34  Identities=26%  Similarity=0.560  Sum_probs=30.5

Q ss_pred             CceEEEEeCCCcEEEEEEEEecCCCCEEEEEeCC
Q 012318          186 KGKVDVTLQDGRTFEGTVLNADFHSDIAIVKINS  219 (466)
Q Consensus       186 ~~~i~V~~~~g~~~~a~vv~~d~~~DlAlLkv~~  219 (466)
                      ...++|.+.+|+.+.+.+..+|...++.|-....
T Consensus         8 g~~V~V~l~~g~~~~G~L~~~D~~~Nl~L~~~~~   41 (67)
T PF01423_consen    8 GKRVRVELKNGRTYRGTLVSFDQFMNLVLSDVTE   41 (67)
T ss_dssp             TSEEEEEETTSEEEEEEEEEEETTEEEEEEEEEE
T ss_pred             CcEEEEEEeCCEEEEEEEEEeechheEEeeeEEE
Confidence            3689999999999999999999999998887764


No 113
>COG1958 LSM1 Small nuclear ribonucleoprotein (snRNP) homolog [Transcription]
Probab=40.78  E-value=57  Score=25.39  Aligned_cols=33  Identities=24%  Similarity=0.485  Sum_probs=29.4

Q ss_pred             ceEEEEeCCCcEEEEEEEEecCCCCEEEEEeCC
Q 012318          187 GKVDVTLQDGRTFEGTVLNADFHSDIAIVKINS  219 (466)
Q Consensus       187 ~~i~V~~~~g~~~~a~vv~~d~~~DlAlLkv~~  219 (466)
                      ..+.|.+.+|+.|.+++.++|...++.|--+..
T Consensus        18 ~~V~V~lk~g~~~~G~L~~~D~~mNlvL~d~~e   50 (79)
T COG1958          18 KRVLVKLKNGREYRGTLVGFDQYMNLVLDDVEE   50 (79)
T ss_pred             CEEEEEECCCCEEEEEEEEEccceeEEEeceEE
Confidence            689999999999999999999999998876553


No 114
>PF00571 CBS:  CBS domain CBS domain web page. Mutations in the CBS domain of Swiss:P35520 lead to homocystinuria.;  InterPro: IPR000644 CBS (cystathionine-beta-synthase) domains are small intracellular modules, mostly found in two or four copies within a protein, that occur in a variety of proteins in bacteria, archaea, and eukaryotes. Tandem pairs of CBS domains can act as binding domains for adenosine derivatives and may regulate the activity of attached enzymatic or other domains. In some cases, CBS domains may act as sensors of cellular energy status by being activated by AMP and inhibited by ATP. In chloride ion channels, the CBS domains have been implicated in intracellular targeting and trafficking, as well as in protein-protein interactions, but results vary with different channels: in the CLC-5 channel, the CBS domain was shown to be required for trafficking, while in the CLC-1 channel, the CBS domain was shown to be critical for channel function, but not necessary for trafficking. Recent experiments revealing that CBS domains can bind adenosine-containing ligands such as ATP, AMP, or S-adenosylmethionine have led to the hypothesis that CBS domains function as sensors of intracellular metabolites. Crystallographic studies of CBS domains have shown that pairs of CBS sequences form a globular domain where each CBS unit adopts a beta-alpha-beta-beta-alpha pattern. Crystal structure of the CBS domains of the AMP-activated protein kinase in complexes with AMP and ATP shows that the phosphate groups of AMP/ATP lie in a surface pocket at the interface of two CBS domains, which is lined with basic residues, many of which are associated with disease-causing mutations.  
In humans, mutations in conserved residues within CBS domains cause a variety of human hereditary diseases, including (with the gene mutated in parentheses): homocystinuria (cystathionine beta-synthase); Wolff-Parkinson-White syndrome (gamma 2 subunit of AMP-activated protein kinase); retinitis pigmentosa (IMP dehydrogenase-1); congenital myotonia, idiopathic generalized epilepsy, hypercalciuric nephrolithiasis, and classic Bartter syndrome (CLC chloride channel family members).; GO: 0005515 protein binding; PDB: 3JTF_A 3TE5_C 3TDH_C 3T4N_C 2QLV_C 3OI8_A 3LV9_A 2QH1_B 1PVM_B 3LQN_A ....
Probab=40.76  E-value=29  Score=24.39  Aligned_cols=21  Identities=43%  Similarity=0.566  Sum_probs=17.5

Q ss_pred             CCCccceeecCCCeEEEEEEe
Q 012318          286 AGNSGGPLVNIDGEIVGINIM  306 (466)
Q Consensus       286 ~G~SGGPlvd~~G~VVGI~~~  306 (466)
                      .+.+.-|++|.+|+++|+++.
T Consensus        28 ~~~~~~~V~d~~~~~~G~is~   48 (57)
T PF00571_consen   28 NGISRLPVVDEDGKLVGIISR   48 (57)
T ss_dssp             HTSSEEEEESTTSBEEEEEEH
T ss_pred             cCCcEEEEEecCCEEEEEEEH
Confidence            456778999999999999764


No 115
>PF12381 Peptidase_C3G:  Tungro spherical virus-type peptidase;  InterPro: IPR024387 This entry represents a rice tungro spherical waikavirus-type peptidase that belongs to MEROPS peptidase family C3G. It is a picornain 3C-type protease, and is responsible for the self-cleavage of the positive single-stranded polyproteins of a number of plant viral genomes. The location of the protease activity of the polyprotein is at the C-terminal end, adjacent and N-terminal to the putative RNA polymerase.
Probab=40.55  E-value=31  Score=32.46  Aligned_cols=55  Identities=25%  Similarity=0.488  Sum_probs=37.9

Q ss_pred             ccEEEEcccCCCCCccceeecCC----CeEEEEEEeEecCCCeeeEEEeH--HHHHHHHHHH
Q 012318          275 REYLQTDCAINAGNSGGPLVNID----GEIVGINIMKVAAADGLSFAVPI--DSAAKIIEQF  330 (466)
Q Consensus       275 ~~~i~~~~~i~~G~SGGPlvd~~----G~VVGI~~~~~~~~~g~~~aip~--~~i~~~l~~l  330 (466)
                      +..++...+...|+=|||++-.+    -+++||+..+.. ..+.+||-++  +.+.+-+..|
T Consensus       168 r~gleY~~~t~~GdCGs~i~~~~t~~~RKIvGiHVAG~~-~~~~gYAe~itQEDL~~A~~~l  228 (231)
T PF12381_consen  168 RQGLEYQMPTMNGDCGSPIVRNNTQMVRKIVGIHVAGSA-NHAMGYAESITQEDLMRAINKL  228 (231)
T ss_pred             eeeeeEECCCcCCCccceeeEcchhhhhhhheeeecccc-cccceehhhhhHHHHHHHHHhh
Confidence            34678888999999999999333    699999998753 2356676544  3344444444


No 116
>cd01723 LSm4 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation.  Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm4 is one of at least seven subunits that assemble onto U6 snRNA to form a seven-membered ring structure.  Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=37.19  E-value=84  Score=24.30  Aligned_cols=33  Identities=24%  Similarity=0.448  Sum_probs=29.7

Q ss_pred             CceEEEEeCCCcEEEEEEEEecCCCCEEEEEeC
Q 012318          186 KGKVDVTLQDGRTFEGTVLNADFHSDIAIVKIN  218 (466)
Q Consensus       186 ~~~i~V~~~~g~~~~a~vv~~d~~~DlAlLkv~  218 (466)
                      ...+.|.+.+|+.+.+.+..+|...++.+-.+.
T Consensus        11 g~~V~VeLkng~~~~G~L~~~D~~mNi~L~~~~   43 (76)
T cd01723          11 NHPMLVELKNGETYNGHLVNCDNWMNIHLREVI   43 (76)
T ss_pred             CCEEEEEECCCCEEEEEEEEEcCCCceEEEeEE
Confidence            368999999999999999999999999987664


No 117
>KOG1738 consensus Membrane-associated guanylate kinase-interacting protein/connector enhancer of KSR-like [Nucleotide transport and metabolism]
Probab=36.43  E-value=54  Score=35.66  Aligned_cols=37  Identities=19%  Similarity=0.248  Sum_probs=32.0

Q ss_pred             cceeecccCCCChhhhC-CCCCCCEEEEECCEecCCHH
Q 012318          391 SGVLVPVVTPGSPAHLA-GFLPSDVVIKFDGKPVQSIT  427 (466)
Q Consensus       391 ~g~~V~~V~~~spA~~a-Gl~~GD~I~~ing~~v~~~~  427 (466)
                      +-.+|+++.++|||+.. -|..||.|+.||++.+..|+
T Consensus       225 g~h~~s~~~e~Spad~~~kI~dgdEv~qiN~qtvVgwq  262 (638)
T KOG1738|consen  225 GPHVTSKIFEQSPADYRQKILDGDEVLQINEQTVVGWQ  262 (638)
T ss_pred             CceeccccccCChHHHhhcccCccceeeecccccccch
Confidence            44567889999999877 49999999999999998884


No 118
>PF02743 Cache_1:  Cache domain;  InterPro: IPR004010 Cache is an extracellular domain that is predicted to have a role in small-molecule recognition in a wide range of proteins, including the animal dihydropyridine-sensitive voltage-gated Ca2+ channel alpha-2delta subunit, and various bacterial chemotaxis receptors. The name Cache comes from CAlcium channels and CHEmotaxis receptors. This domain consists of an N-terminal part with three predicted strands and an alpha-helix, and a C-terminal part with a strand dyad followed by a relatively unstructured region. The N-terminal portion of the (unpermuted) Cache domain contains three predicted strands that could form a sheet analogous to that present in the core of the PAS domain structure. Cache domains are particularly widespread in bacteria, with Vibrio cholerae. The animal calcium channel alpha-2delta subunits might have acquired a part of their extracellular domains from a bacterial source. The Cache domain appears to have arisen from the GAF-PAS fold despite their divergent functions.; GO: 0016020 membrane; PDB: 3C8C_A 3LIB_D 3LIA_A 3LI8_A 3LI9_A.
Probab=35.95  E-value=47  Score=25.51  Aligned_cols=32  Identities=28%  Similarity=0.535  Sum_probs=25.0

Q ss_pred             ceeecCCCeEEEEEEeEecCCCeeeEEEeHHHHHHHHHHHHH
Q 012318          291 GPLVNIDGEIVGINIMKVAAADGLSFAVPIDSAAKIIEQFKK  332 (466)
Q Consensus       291 GPlvd~~G~VVGI~~~~~~~~~g~~~aip~~~i~~~l~~l~~  332 (466)
                      -|+++.+|+++|++..          .+.++.+.++++++.-
T Consensus        19 ~pi~~~~g~~~Gvv~~----------di~l~~l~~~i~~~~~   50 (81)
T PF02743_consen   19 VPIYDDDGKIIGVVGI----------DISLDQLSEIISNIKF   50 (81)
T ss_dssp             EEEEETTTEEEEEEEE----------EEEHHHHHHHHTTSBB
T ss_pred             EEEECCCCCEEEEEEE----------EeccceeeeEEEeeEE
Confidence            5888889999999864          4788888887776543


No 119
>cd01725 LSm2 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation.  Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm2 is one of at least seven subunits that assemble onto U6 snRNA to form a seven-membered ring structure.  Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=30.32  E-value=1.2e+02  Score=23.88  Aligned_cols=33  Identities=24%  Similarity=0.368  Sum_probs=29.7

Q ss_pred             CceEEEEeCCCcEEEEEEEEecCCCCEEEEEeC
Q 012318          186 KGKVDVTLQDGRTFEGTVLNADFHSDIAIVKIN  218 (466)
Q Consensus       186 ~~~i~V~~~~g~~~~a~vv~~d~~~DlAlLkv~  218 (466)
                      ...+.|.+.+|..|.+++..+|...++-+-.+.
T Consensus        11 g~~V~VeLKng~~~~G~L~~vD~~MNi~L~n~~   43 (81)
T cd01725          11 GKEVTVELKNDLSIRGTLHSVDQYLNIKLTNIS   43 (81)
T ss_pred             CCEEEEEECCCcEEEEEEEEECCCcccEEEEEE
Confidence            368999999999999999999999999887765


No 120
>COG5233 GRH1 Peripheral Golgi membrane protein [Intracellular trafficking and secretion]
Probab=29.99  E-value=30  Score=34.37  Aligned_cols=30  Identities=33%  Similarity=0.595  Sum_probs=26.7

Q ss_pred             ecccCCCChhhhCCCCCCCEEEEECCEecC
Q 012318          395 VPVVTPGSPAHLAGFLPSDVVIKFDGKPVQ  424 (466)
Q Consensus       395 V~~V~~~spA~~aGl~~GD~I~~ing~~v~  424 (466)
                      +..|.+.+||+++|.-.||.|+.+|+-++.
T Consensus        67 ~lrv~~~~~~e~~~~~~~dyilg~n~Dp~~   96 (417)
T COG5233          67 VLRVNPESPAEKAGMVVGDYILGINEDPLR   96 (417)
T ss_pred             heeccccChhHhhccccceeEEeecCCcHH
Confidence            556788999999999999999999987764


No 121
>cd01733 LSm10 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation.  Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet.  LSm10 is an SmD1-like protein which is thought to bind U7 snRNA along with LSm11 and five other Sm subunits to form a 7-member ring structure. LSm10 and the U7 snRNP of which it is a part are thought to play an important role in histone mRNA 3' processing.
Probab=29.96  E-value=1.4e+02  Score=23.35  Aligned_cols=32  Identities=25%  Similarity=0.403  Sum_probs=29.2

Q ss_pred             ceEEEEeCCCcEEEEEEEEecCCCCEEEEEeC
Q 012318          187 GKVDVTLQDGRTFEGTVLNADFHSDIAIVKIN  218 (466)
Q Consensus       187 ~~i~V~~~~g~~~~a~vv~~d~~~DlAlLkv~  218 (466)
                      ..+.|.+.+|..|.+++..+|...++-|-.+.
T Consensus        20 ~~V~VeLKng~~~~G~L~~vD~~MNl~L~~~~   51 (78)
T cd01733          20 KVVTVELRNETTVTGRIASVDAFMNIRLAKVT   51 (78)
T ss_pred             CEEEEEECCCCEEEEEEEEEcCCceeEEEEEE
Confidence            68999999999999999999999999887765


No 122
>PF14827 Cache_3:  Sensory domain of two-component sensor kinase; PDB: 1OJG_A 3BY8_A 1P0Z_I 2V9A_A 2J80_B.
Probab=29.32  E-value=55  Score=27.31  Aligned_cols=18  Identities=28%  Similarity=0.695  Sum_probs=13.6

Q ss_pred             ceeecCCCeEEEEEEeEe
Q 012318          291 GPLVNIDGEIVGINIMKV  308 (466)
Q Consensus       291 GPlvd~~G~VVGI~~~~~  308 (466)
                      .|++|.+|++||++..+.
T Consensus        94 ~PV~d~~g~viG~V~VG~  111 (116)
T PF14827_consen   94 APVYDSDGKVIGVVSVGV  111 (116)
T ss_dssp             EEEE-TTS-EEEEEEEEE
T ss_pred             EeeECCCCcEEEEEEEEE
Confidence            599999999999998753


No 123
>cd01724 Sm_D1 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation.  Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. Sm subunit D1 heterodimerizes with subunit D2 and three such heterodimers form a hexameric ring structure with alternating D1 and D2 subunits. The D1 - D2 heterodimer also assembles into a heptameric ring containing DB, D3, E, F, and G subunits. Sm-like proteins exist in archaea as well as prokaryotes which form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=27.90  E-value=1.4e+02  Score=24.04  Aligned_cols=33  Identities=18%  Similarity=0.383  Sum_probs=29.9

Q ss_pred             CceEEEEeCCCcEEEEEEEEecCCCCEEEEEeC
Q 012318          186 KGKVDVTLQDGRTFEGTVLNADFHSDIAIVKIN  218 (466)
Q Consensus       186 ~~~i~V~~~~g~~~~a~vv~~d~~~DlAlLkv~  218 (466)
                      ...+.|.+.+|..|.+.+..+|...++.|-.+.
T Consensus        11 g~~V~VeLKng~~~~G~L~~vD~~MNl~L~~a~   43 (90)
T cd01724          11 NETVTIELKNGTIVHGTITGVDPSMNTHLKNVK   43 (90)
T ss_pred             CCEEEEEECCCCEEEEEEEEEcCceeEEEEEEE
Confidence            368999999999999999999999999988765


No 124
>PF14438 SM-ATX:  Ataxin 2 SM domain; PDB: 1M5Q_1.
Probab=27.31  E-value=1.5e+02  Score=22.79  Aligned_cols=28  Identities=29%  Similarity=0.490  Sum_probs=21.1

Q ss_pred             ceEEEEeCCCcEEEEEEEEecC---CCCEEE
Q 012318          187 GKVDVTLQDGRTFEGTVLNADF---HSDIAI  214 (466)
Q Consensus       187 ~~i~V~~~~g~~~~a~vv~~d~---~~DlAl  214 (466)
                      ..+.|+..||..|++-+...++   +.+++|
T Consensus        13 ~~V~V~~~~G~~yeGif~s~s~~~~~~~vvL   43 (77)
T PF14438_consen   13 QTVEVTTKNGSVYEGIFHSASPESNEFDVVL   43 (77)
T ss_dssp             SEEEEEETTS-EEEEEEEEE-T---T--EEE
T ss_pred             CEEEEEECCCCEEEEEEEeCCCcccceeEEE
Confidence            6899999999999999999988   566665


No 125
>COG0260 PepB Leucyl aminopeptidase [Amino acid transport and metabolism]
Probab=23.63  E-value=55  Score=34.93  Aligned_cols=30  Identities=30%  Similarity=0.452  Sum_probs=23.9

Q ss_pred             eecccCCCChhhhCCCCCCCEEEEECCEecC
Q 012318          394 LVPVVTPGSPAHLAGFLPSDVVIKFDGKPVQ  424 (466)
Q Consensus       394 ~V~~V~~~spA~~aGl~~GD~I~~ing~~v~  424 (466)
                      .|.-..+|.|.-.| .+|||||++.||+.|+
T Consensus       301 ~vl~~~ENm~~g~A-~rPGDVits~~GkTVE  330 (485)
T COG0260         301 GVLPAVENMPSGNA-YRPGDVITSMNGKTVE  330 (485)
T ss_pred             EEEeeeccCCCCCC-CCCCCeEEecCCcEEE
Confidence            34455677787777 9999999999998763


Done!