Query         019504
Match_columns 340
No_of_seqs    313 out of 2547
Neff          8.9 
Searched_HMMs 46136
Date          Fri Mar 29 10:00:04 2013
Command       hhsearch -i /work/01045/syshi/csienesis_hhblits_a3m/019504.a3m -d /work/01045/syshi/HHdatabase/Cdd.hhm -o /work/01045/syshi/hhsearch_cdd/019504hhsearch_cdd -cpu 12 -v 0 

 No Hit                             Prob E-value P-value  Score    SS Cols Query HMM  Template HMM
  1 PRK10139 serine endoprotease;  100.0 6.1E-52 1.3E-56  397.6  32.6  289   22-340    41-349 (455)
  2 TIGR02038 protease_degS peripl 100.0 3.9E-51 8.5E-56  381.5  32.7  291   16-340    40-337 (351)
  3 PRK10898 serine endoprotease;  100.0 2.8E-50   6E-55  375.5  31.9  287   20-340    44-338 (353)
  4 PRK10942 serine endoprotease;  100.0 2.9E-49 6.2E-54  380.9  30.9  289   22-340    39-370 (473)
  5 TIGR02037 degP_htrA_DO peripla 100.0 1.1E-48 2.3E-53  375.7  30.9  288   23-340     3-316 (428)
  6 COG0265 DegQ Trypsin-like seri 100.0 1.1E-38 2.5E-43  298.7  26.5  290   21-339    33-328 (347)
  7 KOG1320 Serine protease [Postt 100.0 5.7E-29 1.2E-33  233.0  19.0  290   17-317   124-436 (473)
  8 KOG1421 Predicted signaling-as  99.9 6.3E-24 1.4E-28  201.4  18.0  287   18-340    49-360 (955)
  9 KOG1421 Predicted signaling-as  99.8   1E-17 2.2E-22  159.6  14.3  274   27-320   524-809 (955)
 10 PF13365 Trypsin_2:  Trypsin-li  99.7 2.6E-17 5.7E-22  130.1  11.2  117   59-210     1-120 (120)
 11 KOG1320 Serine protease [Postt  99.5 3.7E-13   8E-18  127.1  11.4  256   26-312    55-319 (473)
 12 PF00089 Trypsin:  Trypsin;  In  99.4 1.3E-11 2.8E-16  107.5  16.1  167   58-237    26-220 (220)
 13 cd00190 Tryp_SPc Trypsin-like   99.3 1.3E-10 2.8E-15  101.9  15.0  172   57-239    25-231 (232)
 14 PF13180 PDZ_2:  PDZ domain; PD  99.2 1.2E-11 2.7E-16   91.1   3.1   72  249-340     1-73  (82)
 15 smart00020 Tryp_SPc Trypsin-li  99.2 1.2E-09 2.5E-14   96.0  15.2  164   57-231    26-223 (229)
 16 COG3591 V8-like Glu-specific e  98.9 6.7E-08 1.4E-12   84.8  14.3  195   27-241    33-250 (251)
 17 cd00991 PDZ_archaeal_metallopr  98.8 3.7E-09 8.1E-14   77.2   3.8   60  267-340     9-69  (79)
 18 PF00863 Peptidase_C4:  Peptida  98.8 7.8E-08 1.7E-12   83.7  12.0  167   27-230    13-184 (235)
 19 TIGR01713 typeII_sec_gspC gene  98.8 5.5E-09 1.2E-13   93.4   4.6   92  230-340   158-250 (259)
 20 cd00987 PDZ_serine_protease PD  98.8 1.2E-08 2.6E-13   76.2   5.3   77  250-340     2-83  (90)
 21 cd00989 PDZ_metalloprotease PD  98.7 1.1E-08 2.5E-13   74.4   3.8   58  268-339    12-69  (79)
 22 cd00990 PDZ_glycyl_aminopeptid  98.7 1.6E-08 3.5E-13   73.8   4.4   67  249-339     1-67  (80)
 23 cd00988 PDZ_CTP_protease PDZ d  98.7 2.3E-08   5E-13   73.9   5.0   69  249-339     2-72  (85)
 24 cd00136 PDZ PDZ domain, also c  98.6 2.6E-08 5.7E-13   70.7   3.5   55  268-336    13-69  (70)
 25 cd00986 PDZ_LON_protease PDZ d  98.6   5E-08 1.1E-12   71.1   3.2   58  268-340     8-66  (79)
 26 TIGR02037 degP_htrA_DO peripla  98.4 2.8E-07   6E-12   89.1   5.2   77  250-340   339-421 (428)
 27 smart00228 PDZ Domain present   98.3   8E-07 1.7E-11   65.3   3.7   60  268-340    26-85  (85)
 28 TIGR00225 prc C-terminal pepti  98.2 2.5E-06 5.4E-11   79.7   6.7   70  248-339    50-121 (334)
 29 cd00992 PDZ_signaling PDZ doma  98.2 1.1E-06 2.4E-11   64.2   2.6   38  268-316    26-65  (82)
 30 PLN00049 carboxyl-terminal pro  98.2 3.6E-06 7.9E-11   80.1   6.4   79  247-339    83-161 (389)
 31 PRK10779 zinc metallopeptidase  98.1 8.4E-07 1.8E-11   86.1   2.0   57  270-340   128-185 (449)
 32 TIGR00054 RIP metalloprotease   98.1 1.7E-06 3.6E-11   83.2   3.4   59  268-340   203-261 (420)
 33 KOG3627 Trypsin [Amino acid tr  98.1 0.00023 5.1E-09   63.6  16.1  124  117-240   106-253 (256)
 34 PF00595 PDZ:  PDZ domain (Also  98.0 5.9E-06 1.3E-10   60.4   4.4   38  268-316    25-62  (81)
 35 PRK10139 serine endoprotease;   98.0 2.7E-06 5.8E-11   82.5   3.1   58  268-340   390-447 (455)
 36 PRK10779 zinc metallopeptidase  98.0   3E-06 6.5E-11   82.3   3.2   58  269-340   222-279 (449)
 37 PF14685 Tricorn_PDZ:  Tricorn   98.0 3.6E-06 7.8E-11   62.1   2.7   60  268-339    12-79  (88)
 38 PRK10942 serine endoprotease;   98.0 4.2E-06 9.1E-11   81.6   3.1   58  268-340   408-465 (473)
 39 TIGR00054 RIP metalloprotease   97.9 4.9E-06 1.1E-10   80.0   2.7   57  268-339   128-184 (420)
 40 COG0793 Prc Periplasmic protea  97.9 1.6E-05 3.4E-10   75.9   5.4   74  247-339    98-171 (406)
 41 TIGR02860 spore_IV_B stage IV   97.8 2.3E-05 5.1E-10   73.6   3.9   66  251-340    98-171 (402)
 42 PF05579 Peptidase_S32:  Equine  97.7 0.00028   6E-09   61.8   9.6  117   57-216   112-230 (297)
 43 KOG3553 Tax interaction protei  97.6 4.5E-05 9.7E-10   56.3   3.2   34  268-312    59-92  (124)
 44 PRK11186 carboxy-terminal prot  97.6 7.4E-05 1.6E-09   75.2   5.5   73  247-338   242-319 (667)
 45 PF03761 DUF316:  Domain of unk  97.4  0.0048   1E-07   56.1  14.4  111  114-235   158-273 (282)
 46 PF00548 Peptidase_C3:  3C cyst  97.2   0.015 3.2E-07   48.9  13.5  135   58-214    26-170 (172)
 47 PF10459 Peptidase_S46:  Peptid  97.1  0.0032   7E-08   63.9  10.4   23   58-80     48-70  (698)
 48 PF04495 GRASP55_65:  GRASP55/6  97.1 0.00025 5.5E-09   57.1   1.9   57  268-338    43-100 (138)
 49 PF12812 PDZ_1:  PDZ-like domai  97.0 0.00049 1.1E-08   49.7   2.7   62  250-325    10-73  (78)
 50 COG5640 Secreted trypsin-like   97.0   0.017 3.6E-07   53.0  12.5   55  189-243   223-280 (413)
 51 PF05580 Peptidase_S55:  SpoIVB  96.9   0.032 6.9E-07   47.9  12.6   44  185-232   171-214 (218)
 52 COG3480 SdrC Predicted secrete  96.8 0.00064 1.4E-08   61.1   2.0   54  268-337   130-183 (342)
 53 PF08192 Peptidase_S64:  Peptid  96.7   0.014 2.9E-07   57.8  10.6  118  115-240   541-688 (695)
 54 PF10459 Peptidase_S46:  Peptid  96.7  0.0028   6E-08   64.3   5.8   57  184-240   623-686 (698)
 55 COG3975 Predicted protease wit  96.6  0.0036 7.8E-08   60.1   5.2   31  268-309   462-492 (558)
 56 PF02122 Peptidase_S39:  Peptid  96.5   0.032 6.9E-07   48.0  10.1  148   57-231    30-182 (203)
 57 KOG3209 WW domain-containing p  96.4  0.0016 3.4E-08   64.2   2.1   54  272-337   782-835 (984)
 58 PF00949 Peptidase_S7:  Peptida  96.4  0.0069 1.5E-07   48.2   5.2   33  185-217    88-120 (132)
 59 KOG3129 26S proteasome regulat  96.4  0.0018   4E-08   54.7   2.0   60  269-339   140-199 (231)
 60 PRK09681 putative type II secr  96.1   0.003 6.6E-08   56.6   2.1   53  275-340   211-266 (276)
 61 KOG3532 Predicted protein kina  95.9  0.0029 6.3E-08   62.1   1.2   54  268-338   398-451 (1051)
 62 KOG3580 Tight junction protein  94.9   0.017 3.6E-07   56.2   2.4   78  251-340   200-279 (1027)
 63 TIGR02860 spore_IV_B stage IV   94.8    0.26 5.6E-06   46.8  10.0   96  132-232   294-394 (402)
 64 KOG3580 Tight junction protein  94.5   0.014   3E-07   56.8   0.8   60  266-338   427-486 (1027)
 65 COG3031 PulC Type II secretory  94.4   0.016 3.5E-07   50.1   1.0   59  269-340   208-266 (275)
 66 PF02907 Peptidase_S29:  Hepati  93.4    0.15 3.2E-06   40.3   4.5  131   59-233    14-146 (148)
 67 KOG3550 Receptor targeting pro  93.3   0.096 2.1E-06   42.0   3.3   38  268-315   115-152 (207)
 68 PF00944 Peptidase_S3:  Alphavi  92.8    0.19 4.1E-06   39.6   4.3   30  188-217   100-129 (158)
 69 KOG2921 Intramembrane metallop  92.0   0.095   2E-06   48.7   2.0   43  263-316   215-258 (484)
 70 PF03510 Peptidase_C24:  2C end  91.9       1 2.2E-05   34.3   7.2   56   59-139     1-56  (105)
 71 KOG3552 FERM domain protein FR  91.9    0.14 2.9E-06   52.6   3.1   36  268-315    75-110 (1298)
 72 KOG3209 WW domain-containing p  91.8   0.092   2E-06   52.3   1.8   60  268-340   923-982 (984)
 73 KOG3605 Beta amyloid precursor  91.1    0.38 8.3E-06   47.6   5.3  101  191-313   677-790 (829)
 74 KOG3542 cAMP-regulated guanine  90.9   0.091   2E-06   52.0   0.9   37  268-315   562-598 (1283)
 75 PF02395 Peptidase_S6:  Immunog  90.6     3.7   8E-05   42.7  12.0   49  191-240   213-266 (769)
 76 PF05416 Peptidase_C37:  Southa  89.3    0.84 1.8E-05   43.1   5.7  140   53-216   375-528 (535)
 77 KOG1892 Actin filament-binding  88.7    0.19   4E-06   51.8   1.1   59  268-338   960-1018(1629)
 78 KOG3606 Cell polarity protein   88.5    0.45 9.7E-06   42.1   3.1   61  268-338   194-261 (358)
 79 PF09342 DUF1986:  Domain of un  88.1     7.8 0.00017   34.2  10.4   92   56-155    27-131 (267)
 80 KOG3651 Protein kinase C, alph  86.7    0.75 1.6E-05   41.4   3.5   38  268-315    30-67  (429)
 81 KOG0606 Microtubule-associated  84.5     0.3 6.6E-06   51.3   0.1   35  270-315   660-694 (1205)
 82 KOG3549 Syntrophins (type gamm  84.3    0.36 7.8E-06   44.2   0.4   38  269-316    81-118 (505)
 83 KOG3834 Golgi reassembly stack  81.3    0.59 1.3E-05   44.1   0.6   58  268-338    15-72  (462)
 84 PF00947 Pico_P2A:  Picornaviru  81.2     3.3 7.2E-05   32.5   4.6   32  183-215    79-110 (127)
 85 PF01732 DUF31:  Putative pepti  81.0     1.2 2.5E-05   42.4   2.5   24  189-212   350-373 (374)
 86 COG0750 Predicted membrane-ass  80.8     1.2 2.7E-05   42.0   2.7   32  274-316   135-166 (375)
 87 KOG3571 Dishevelled 3 and rela  79.3     1.7 3.8E-05   41.9   3.0   39  268-316   277-315 (626)
 88 KOG0609 Calcium/calmodulin-dep  79.3     1.3 2.9E-05   43.1   2.3   37  269-315   147-183 (542)
 89 KOG3551 Syntrophins (type beta  77.2    0.36 7.9E-06   44.8  -2.0   38  269-316   111-148 (506)
 90 PF12381 Peptidase_C3G:  Tungro  63.3     9.3  0.0002   33.0   3.5   56  182-241   168-229 (231)
 91 cd00600 Sm_like The eukaryotic  61.2      26 0.00056   23.5   5.0   32   93-126     8-39  (63)
 92 PF11874 DUF3394:  Domain of un  61.0      33 0.00071   29.0   6.4   28  268-306   122-149 (183)
 93 cd01731 archaeal_Sm1 The archa  59.4      27 0.00058   24.1   4.8   32   93-126    12-43  (68)
 94 cd01726 LSm6 The eukaryotic Sm  58.2      26 0.00057   24.1   4.6   32   93-126    12-43  (67)
 95 cd01730 LSm3 The eukaryotic Sm  58.2      22 0.00047   25.7   4.3   30   93-124    13-42  (82)
 96 COG0298 HypC Hydrogenase matur  57.5      23 0.00049   25.5   4.1   47  106-154     5-52  (82)
 97 PRK00737 small nuclear ribonuc  57.1      30 0.00065   24.3   4.8   32   93-126    16-47  (72)
 98 cd01722 Sm_F The eukaryotic Sm  56.8      27 0.00059   24.1   4.5   31   93-125    13-43  (68)
 99 cd01732 LSm5 The eukaryotic Sm  56.6      27 0.00059   24.9   4.5   30   93-124    15-44  (76)
100 cd01720 Sm_D2 The eukaryotic S  56.1      29 0.00062   25.5   4.7   32   93-126    16-47  (87)
101 cd01717 Sm_B The eukaryotic Sm  56.1      28  0.0006   24.9   4.6   31   93-125    12-42  (79)
102 PF00571 CBS:  CBS domain CBS d  55.6      14 0.00031   23.8   2.9   21  193-213    28-48  (57)
103 cd06168 LSm9 The eukaryotic Sm  55.4      32  0.0007   24.4   4.7   31   93-125    12-42  (75)
104 cd01729 LSm7 The eukaryotic Sm  54.4      33 0.00071   24.8   4.7   31   93-125    14-44  (81)
105 cd01735 LSm12_N LSm12 belongs   53.0      54  0.0012   22.3   5.2   32   93-126     8-39  (61)
106 KOG3605 Beta amyloid precursor  51.9     6.9 0.00015   39.2   1.1   35  270-314   675-709 (829)
107 cd01719 Sm_G The eukaryotic Sm  51.2      42  0.0009   23.6   4.7   31   93-125    12-42  (72)
108 cd01728 LSm1 The eukaryotic Sm  50.4      42 0.00092   23.8   4.7   31   93-125    14-44  (74)
109 smart00651 Sm snRNP Sm protein  49.6      46   0.001   22.6   4.8   32   93-126    10-41  (67)
110 COG1868 FliM Flagellar motor s  48.7      65  0.0014   30.1   6.8   40  204-243   191-230 (332)
111 cd01727 LSm8 The eukaryotic Sm  47.1      48   0.001   23.3   4.6   31   93-125    11-41  (74)
112 PF01423 LSM:  LSM domain ;  In  44.0      66  0.0014   21.8   4.8   33   93-127    10-42  (67)
113 COG1958 LSM1 Small nuclear rib  43.5      55  0.0012   23.3   4.5   32   93-126    19-50  (79)
114 PF08669 GCV_T_C:  Glycine clea  42.8      44 0.00096   24.5   4.1   33  195-227    34-66  (95)
115 KOG3834 Golgi reassembly stack  41.2      10 0.00022   36.2   0.3   35  272-316   113-147 (462)
116 PF02743 Cache_1:  Cache domain  39.2      34 0.00074   24.1   2.9   32  197-241    18-49  (81)
117 PF01455 HupF_HypC:  HupF/HypC   38.2 1.2E+02  0.0026   21.1   5.3   43  106-151     5-47  (68)
118 PF02601 Exonuc_VII_L:  Exonucl  38.1      45 0.00097   30.8   4.2   38   58-109   281-318 (319)
119 PF14827 Cache_3:  Sensory doma  35.9      40 0.00086   25.9   3.0   18  198-215    94-111 (116)
120 COG4820 EutJ Ethanolamine util  35.5 1.1E+02  0.0024   26.5   5.6  103  198-307    43-167 (277)
121 COG0260 PepB Leucyl aminopepti  33.6      43 0.00093   33.0   3.3   15  298-312   316-330 (485)
122 PTZ00138 small nuclear ribonuc  32.2   1E+02  0.0022   22.7   4.4   33   93-125    28-60  (89)
123 COG5233 GRH1 Peripheral Golgi   32.0      30 0.00064   31.8   1.8   33  271-314    66-98  (417)
124 PF01732 DUF31:  Putative pepti  30.7      29 0.00063   32.9   1.7   24   56-79     35-68  (374)
125 KOG1738 Membrane-associated gu  30.7      53  0.0012   33.0   3.4   63  236-315   200-262 (638)
126 COG2524 Predicted transcriptio  30.3 2.9E+02  0.0062   24.9   7.5   20  193-213   201-220 (294)
127 cd01739 LSm11_C The eukaryotic  30.2      94   0.002   21.4   3.5   35   93-127    12-46  (66)
128 PRK06437 hypothetical protein;  29.0 1.4E+02   0.003   20.5   4.4   30  298-338    33-62  (67)
129 PF10049 DUF2283:  Protein of u  28.1      43 0.00094   21.6   1.7   11  202-212    36-46  (50)
130 cd04627 CBS_pair_14 The CBS do  26.9      53  0.0011   24.8   2.3   21  193-213    97-117 (123)
131 cd01721 Sm_D3 The eukaryotic S  26.9 2.2E+02  0.0047   19.7   7.4   32   93-126    12-43  (70)
132 PF08605 Rad9_Rad53_bind:  Fung  26.6 1.4E+02   0.003   23.8   4.6   56   93-152    15-70  (131)
133 cd01733 LSm10 The eukaryotic S  26.6 2.4E+02  0.0052   20.1   7.5   32   93-126    21-52  (78)
134 COG2104 ThiS Sulfur transfer p  25.4 1.3E+02  0.0028   21.0   3.7   34  298-338    30-63  (68)
135 cd04603 CBS_pair_KefB_assoc Th  25.2      60  0.0013   24.1   2.3   21  193-213    85-105 (111)
136 cd04620 CBS_pair_7 The CBS dom  24.7      62  0.0013   23.9   2.3   20  194-213    90-109 (115)
137 TIGR00739 yajC preprotein tran  24.3 1.3E+02  0.0027   21.9   3.7   40  296-340    37-78  (84)
138 cd01724 Sm_D1 The eukaryotic S  22.6 3.2E+02  0.0069   20.1   7.2   60   93-156    13-72  (90)
139 TIGR03279 cyano_FeS_chp putati  21.8      41 0.00089   32.5   0.9   21  271-291     1-21  (433)
140 cd04597 CBS_pair_DRTGG_assoc2   21.6      91   0.002   23.4   2.7   21  193-213    87-107 (113)
141 cd01723 LSm4 The eukaryotic Sm  21.0 3.1E+02  0.0066   19.3   7.6   32   93-126    13-44  (76)
142 PRK10413 hydrogenase 2 accesso  20.3 2.1E+02  0.0047   20.7   4.1   48  106-153     5-55  (82)
143 PF05578 Peptidase_S31:  Pestiv  20.3 2.9E+02  0.0062   22.6   5.2  127   56-214    50-182 (211)
144 cd04592 CBS_pair_EriC_assoc_eu  20.2      99  0.0021   24.2   2.7   20  194-213    23-42  (133)
145 TIGR00074 hypC_hupF hydrogenas  20.0 2.5E+02  0.0055   20.0   4.4   41  106-151     5-45  (76)
146 smart00116 CBS Domain in cysta  20.0      97  0.0021   17.9   2.2   20  194-213    22-41  (49)

No 1  
>PRK10139 serine endoprotease; Provisional
Probab=100.00  E-value=6.1e-52  Score=397.64  Aligned_cols=289  Identities=39%  Similarity=0.564  Sum_probs=244.0

Q ss_pred             HHHHHHHHhCCCeEEEEeeeecccc--------ccCCc------ccccCCceEEEEEcC-CCEEEeCccccCCCCCCCCC
Q 019504           22 RIAQLFEKNTYSVVNIFDVTLRPTL--------NVTGL------VEIPEGNGSGVVWDG-KGHIVTNFHVIGSALSRKPA   86 (340)
Q Consensus        22 ~~~~~~~~~~~svV~I~~~~~~~~~--------~~~~~------~~~~~~~GsGfiI~~-~G~IlT~~Hvv~~~~~~~~~   86 (340)
                      .+.++++++.+|||.|.+.......        .+++.      .....+.||||+|++ +||||||+|||.++.     
T Consensus        41 ~~~~~~~~~~pavV~i~~~~~~~~~~~~~~~~~~~f~~~~~~~~~~~~~~~GSG~ii~~~~g~IlTn~HVv~~a~-----  115 (455)
T PRK10139         41 SLAPMLEKVLPAVVSVRVEGTASQGQKIPEEFKKFFGDDLPDQPAQPFEGLGSGVIIDAAKGYVLTNNHVINQAQ-----  115 (455)
T ss_pred             cHHHHHHHhCCcEEEEEEEEeecccccCchhHHHhccccCCccccccccceEEEEEEECCCCEEEeChHHhCCCC-----
Confidence            6899999999999999875432110        01111      112347899999985 799999999999886     


Q ss_pred             CCccEEEEEEEecCCceeEEEEEEEEeCCCCcEEEEEEecCCCCccceeecCCCCCCCCCEEEEEecCCCCCCceeEEEE
Q 019504           87 EGQVVARVNILASDGVQKNFEGKLVGADRAKDLAVLKIEASEDLLKPINVGQSSFLKVGQQCLAIGNPFGFDHTLTVGVI  166 (340)
Q Consensus        87 ~~~~~~~i~v~~~~g~~~~~~a~v~~~d~~~DlAlL~v~~~~~~~~~~~l~~~~~~~~G~~v~~iG~p~g~~~~~~~G~v  166 (340)
                            .+.|++.||+  .++|++++.|+.+||||||++.+. .++++.|+++..+++|++|+++|+|+++..+++.|+|
T Consensus       116 ------~i~V~~~dg~--~~~a~vvg~D~~~DlAvlkv~~~~-~l~~~~lg~s~~~~~G~~V~aiG~P~g~~~tvt~Giv  186 (455)
T PRK10139        116 ------KISIQLNDGR--EFDAKLIGSDDQSDIALLQIQNPS-KLTQIAIADSDKLRVGDFAVAVGNPFGLGQTATSGII  186 (455)
T ss_pred             ------EEEEEECCCC--EEEEEEEEEcCCCCEEEEEecCCC-CCceeEecCccccCCCCEEEEEecCCCCCCceEEEEE
Confidence                  8999999997  899999999999999999998643 4789999999999999999999999999999999999


Q ss_pred             eeeccccccCCCceecceEEEeeccCCCCccceeecCCCcEEEEEeeeeeCCCCcCceEEEEehHhHHHHHHHHHHcCce
Q 019504          167 SGLNRDIFSQAGVTIGGGIQTDAAINPGNSGGPLLDSKGNLIGINTAIITQTGTSAGVGFAIPSSTVLKIVPQLIQYGKV  246 (340)
Q Consensus       167 s~~~~~~~~~~~~~~~~~i~~d~~i~~G~SGGPl~d~~G~VVGi~~~~~~~~~~~~~~~~aip~~~i~~~l~~l~~~~~~  246 (340)
                      ++..+......  .+.++|++|+.+++|+|||||||.+|+||||+++.+...++..+++|+||++.+++++++|+++|++
T Consensus       187 S~~~r~~~~~~--~~~~~iqtda~in~GnSGGpl~n~~G~vIGi~~~~~~~~~~~~gigfaIP~~~~~~v~~~l~~~g~v  264 (455)
T PRK10139        187 SALGRSGLNLE--GLENFIQTDASINRGNSGGALLNLNGELIGINTAILAPGGGSVGIGFAIPSNMARTLAQQLIDFGEI  264 (455)
T ss_pred             ccccccccCCC--CcceEEEECCccCCCCCcceEECCCCeEEEEEEEEEcCCCCccceEEEEEhHHHHHHHHHHhhcCcc
Confidence            99877533222  2346899999999999999999999999999999877666678999999999999999999999999


Q ss_pred             eeeeeeEEec--cHHHHhhcCCC--CCcEEEeeCCCChhhhcCCCccccCCCCCCcCCcEEEEECCEEccCCCCCCCcee
Q 019504          247 VRAGLNVDIA--PDLVASQLNVG--NGALVLQVPGNSLAAKAGILPTTRGFAGNIILGDIIVAVNNKPVSFSCLSIPSRI  322 (340)
Q Consensus       247 ~~~~lg~~~~--~~~~~~~~~~~--~g~~V~~v~~~spa~~~gl~~~~~~~~~~l~~GDvi~~i~g~~v~~~~d~~~~~~  322 (340)
                      .++|||+.+.  +..+++.++++  .|++|.+|.++|||+++||           ++||+|++|||++|.+..|   +..
T Consensus       265 ~r~~LGv~~~~l~~~~~~~lgl~~~~Gv~V~~V~~~SpA~~AGL-----------~~GDvIl~InG~~V~s~~d---l~~  330 (455)
T PRK10139        265 KRGLLGIKGTEMSADIAKAFNLDVQRGAFVSEVLPNSGSAKAGV-----------KAGDIITSLNGKPLNSFAE---LRS  330 (455)
T ss_pred             cccceeEEEEECCHHHHHhcCCCCCCceEEEEECCCChHHHCCC-----------CCCCEEEEECCEECCCHHH---HHH
Confidence            9999999764  57788888876  7999999999999999999           9999999999999999998   555


Q ss_pred             EEEe-eCCCCceEEEEeCC
Q 019504          323 YLIC-AEPNQDHLTCLKSS  340 (340)
Q Consensus       323 ~~~~-~~~~~~~~~~~~~~  340 (340)
                      .+.. ......+++++|++
T Consensus       331 ~l~~~~~g~~v~l~V~R~G  349 (455)
T PRK10139        331 RIATTEPGTKVKLGLLRNG  349 (455)
T ss_pred             HHHhcCCCCEEEEEEEECC
Confidence            5554 33344678888864


No 2  
>TIGR02038 protease_degS periplasmic serine peptidase DegS. This family consists of the periplasmic serine protease DegS (HhoB), a shorter paralog of protease DO (HtrA, DegP) and DegQ (HhoA). It is found in E. coli and several other Proteobacteria of the gamma subdivision. It contains a trypsin domain and a single copy of the PDZ domain (in contrast to DegP with two copies). A critical role of DegS is to sense stress in the periplasm and partially degrade an inhibitor of sigma(E).
Probab=100.00  E-value=3.9e-51  Score=381.49  Aligned_cols=291  Identities=36%  Similarity=0.496  Sum_probs=244.4

Q ss_pred             CCcchhHHHHHHHHhCCCeEEEEeeeeccccccCCcccccCCceEEEEEcCCCEEEeCccccCCCCCCCCCCCccEEEEE
Q 019504           16 LLPNEERIAQLFEKNTYSVVNIFDVTLRPTLNVTGLVEIPEGNGSGVVWDGKGHIVTNFHVIGSALSRKPAEGQVVARVN   95 (340)
Q Consensus        16 ~~~~~~~~~~~~~~~~~svV~I~~~~~~~~~~~~~~~~~~~~~GsGfiI~~~G~IlT~~Hvv~~~~~~~~~~~~~~~~i~   95 (340)
                      ..+.+..+.++++++.+|||.|+......+   .+......+.||||+|+++||||||+|||.++.           .+.
T Consensus        40 ~~~~~~~~~~~~~~~~psVV~I~~~~~~~~---~~~~~~~~~~GSG~vi~~~G~IlTn~HVV~~~~-----------~i~  105 (351)
T TIGR02038        40 NNTVEISFNKAVRRAAPAVVNIYNRSISQN---SLNQLSIQGLGSGVIMSKEGYILTNYHVIKKAD-----------QIV  105 (351)
T ss_pred             ccccchhHHHHHHhcCCcEEEEEeEecccc---ccccccccceEEEEEEeCCeEEEecccEeCCCC-----------EEE
Confidence            345556889999999999999987544332   122233457899999999999999999999876           799


Q ss_pred             EEecCCceeEEEEEEEEeCCCCcEEEEEEecCCCCccceeecCCCCCCCCCEEEEEecCCCCCCceeEEEEeeecccccc
Q 019504           96 ILASDGVQKNFEGKLVGADRAKDLAVLKIEASEDLLKPINVGQSSFLKVGQQCLAIGNPFGFDHTLTVGVISGLNRDIFS  175 (340)
Q Consensus        96 v~~~~g~~~~~~a~v~~~d~~~DlAlL~v~~~~~~~~~~~l~~~~~~~~G~~v~~iG~p~g~~~~~~~G~vs~~~~~~~~  175 (340)
                      |++.||+  .++|++++.|+.+||||||++...  +++++++++..+++|++|+++|||++...+++.|+|+...+....
T Consensus       106 V~~~dg~--~~~a~vv~~d~~~DlAvlkv~~~~--~~~~~l~~s~~~~~G~~V~aiG~P~~~~~s~t~GiIs~~~r~~~~  181 (351)
T TIGR02038       106 VALQDGR--KFEAELVGSDPLTDLAVLKIEGDN--LPTIPVNLDRPPHVGDVVLAIGNPYNLGQTITQGIISATGRNGLS  181 (351)
T ss_pred             EEECCCC--EEEEEEEEecCCCCEEEEEecCCC--CceEeccCcCccCCCCEEEEEeCCCCCCCcEEEEEEEeccCcccC
Confidence            9999997  799999999999999999999754  788899888889999999999999999999999999998775432


Q ss_pred             CCCceecceEEEeeccCCCCccceeecCCCcEEEEEeeeeeCCC--CcCceEEEEehHhHHHHHHHHHHcCceeeeeeeE
Q 019504          176 QAGVTIGGGIQTDAAINPGNSGGPLLDSKGNLIGINTAIITQTG--TSAGVGFAIPSSTVLKIVPQLIQYGKVVRAGLNV  253 (340)
Q Consensus       176 ~~~~~~~~~i~~d~~i~~G~SGGPl~d~~G~VVGi~~~~~~~~~--~~~~~~~aip~~~i~~~l~~l~~~~~~~~~~lg~  253 (340)
                      ..+  ..++|++|+.+++|+|||||+|.+|+||||+++.+....  ...+++|+||++.+++++++|+++|++.++|||+
T Consensus       182 ~~~--~~~~iqtda~i~~GnSGGpl~n~~G~vIGI~~~~~~~~~~~~~~g~~faIP~~~~~~vl~~l~~~g~~~r~~lGv  259 (351)
T TIGR02038       182 SVG--RQNFIQTDAAINAGNSGGALINTNGELVGINTASFQKGGDEGGEGINFAIPIKLAHKIMGKIIRDGRVIRGYIGV  259 (351)
T ss_pred             CCC--cceEEEECCccCCCCCcceEECCCCeEEEEEeeeecccCCCCccceEEEecHHHHHHHHHHHhhcCcccceEeee
Confidence            221  236799999999999999999999999999998765422  2368999999999999999999999999999999


Q ss_pred             Eec--cHHHHhhcCCC--CCcEEEeeCCCChhhhcCCCccccCCCCCCcCCcEEEEECCEEccCCCCCCCceeEEEe-eC
Q 019504          254 DIA--PDLVASQLNVG--NGALVLQVPGNSLAAKAGILPTTRGFAGNIILGDIIVAVNNKPVSFSCLSIPSRIYLIC-AE  328 (340)
Q Consensus       254 ~~~--~~~~~~~~~~~--~g~~V~~v~~~spa~~~gl~~~~~~~~~~l~~GDvi~~i~g~~v~~~~d~~~~~~~~~~-~~  328 (340)
                      .+.  +...++.++++  .|++|.+|.++|||+++||           ++||+|++|||++|.+..|   +...+.. ..
T Consensus       260 ~~~~~~~~~~~~lgl~~~~Gv~V~~V~~~spA~~aGL-----------~~GDvI~~Ing~~V~s~~d---l~~~l~~~~~  325 (351)
T TIGR02038       260 SGEDINSVVAQGLGLPDLRGIVITGVDPNGPAARAGI-----------LVRDVILKYDGKDVIGAEE---LMDRIAETRP  325 (351)
T ss_pred             EEEECCHHHHHhcCCCccccceEeecCCCChHHHCCC-----------CCCCEEEEECCEEcCCHHH---HHHHHHhcCC
Confidence            875  46667788887  6999999999999999999           9999999999999999988   5555544 33


Q ss_pred             CCCceEEEEeCC
Q 019504          329 PNQDHLTCLKSS  340 (340)
Q Consensus       329 ~~~~~~~~~~~~  340 (340)
                      ....+++++|++
T Consensus       326 g~~v~l~v~R~g  337 (351)
T TIGR02038       326 GSKVMVTVLRQG  337 (351)
T ss_pred             CCEEEEEEEECC
Confidence            445678888863


No 3  
>PRK10898 serine endoprotease; Provisional
Probab=100.00  E-value=2.8e-50  Score=375.46  Aligned_cols=287  Identities=36%  Similarity=0.500  Sum_probs=237.4

Q ss_pred             hhHHHHHHHHhCCCeEEEEeeeeccccccCCcccccCCceEEEEEcCCCEEEeCccccCCCCCCCCCCCccEEEEEEEec
Q 019504           20 EERIAQLFEKNTYSVVNIFDVTLRPTLNVTGLVEIPEGNGSGVVWDGKGHIVTNFHVIGSALSRKPAEGQVVARVNILAS   99 (340)
Q Consensus        20 ~~~~~~~~~~~~~svV~I~~~~~~~~~~~~~~~~~~~~~GsGfiI~~~G~IlT~~Hvv~~~~~~~~~~~~~~~~i~v~~~   99 (340)
                      ...+.++++++.+|||.|........   ........+.||||+|+++||||||+|||.++.           .+.|++.
T Consensus        44 ~~~~~~~~~~~~psvV~v~~~~~~~~---~~~~~~~~~~GSGfvi~~~G~IlTn~HVv~~a~-----------~i~V~~~  109 (353)
T PRK10898         44 PASYNQAVRRAAPAVVNVYNRSLNST---SHNQLEIRTLGSGVIMDQRGYILTNKHVINDAD-----------QIIVALQ  109 (353)
T ss_pred             cchHHHHHHHhCCcEEEEEeEecccc---CcccccccceeeEEEEeCCeEEEecccEeCCCC-----------EEEEEeC
Confidence            34788999999999999987543221   112233447899999999999999999999876           8999999


Q ss_pred             CCceeEEEEEEEEeCCCCcEEEEEEecCCCCccceeecCCCCCCCCCEEEEEecCCCCCCceeEEEEeeeccccccCCCc
Q 019504          100 DGVQKNFEGKLVGADRAKDLAVLKIEASEDLLKPINVGQSSFLKVGQQCLAIGNPFGFDHTLTVGVISGLNRDIFSQAGV  179 (340)
Q Consensus       100 ~g~~~~~~a~v~~~d~~~DlAlL~v~~~~~~~~~~~l~~~~~~~~G~~v~~iG~p~g~~~~~~~G~vs~~~~~~~~~~~~  179 (340)
                      ||+  .++|++++.|+.+||||||++...  +++++|+++..+++|+.|+++|||++...+++.|+|+...+......+ 
T Consensus       110 dg~--~~~a~vv~~d~~~DlAvl~v~~~~--l~~~~l~~~~~~~~G~~V~aiG~P~g~~~~~t~Giis~~~r~~~~~~~-  184 (353)
T PRK10898        110 DGR--VFEALLVGSDSLTDLAVLKINATN--LPVIPINPKRVPHIGDVVLAIGNPYNLGQTITQGIISATGRIGLSPTG-  184 (353)
T ss_pred             CCC--EEEEEEEEEcCCCCEEEEEEcCCC--CCeeeccCcCcCCCCCEEEEEeCCCCcCCCcceeEEEeccccccCCcc-
Confidence            997  799999999999999999998753  788999888889999999999999998899999999988775432222 


Q ss_pred             eecceEEEeeccCCCCccceeecCCCcEEEEEeeeeeCCC---CcCceEEEEehHhHHHHHHHHHHcCceeeeeeeEEec
Q 019504          180 TIGGGIQTDAAINPGNSGGPLLDSKGNLIGINTAIITQTG---TSAGVGFAIPSSTVLKIVPQLIQYGKVVRAGLNVDIA  256 (340)
Q Consensus       180 ~~~~~i~~d~~i~~G~SGGPl~d~~G~VVGi~~~~~~~~~---~~~~~~~aip~~~i~~~l~~l~~~~~~~~~~lg~~~~  256 (340)
                       ..++|++|+.+++|+|||||+|.+|+||||+++.+...+   ...+++|+||++.+++++++|+++|++.++|||+...
T Consensus       185 -~~~~iqtda~i~~GnSGGPl~n~~G~vvGI~~~~~~~~~~~~~~~g~~faIP~~~~~~~~~~l~~~G~~~~~~lGi~~~  263 (353)
T PRK10898        185 -RQNFLQTDASINHGNSGGALVNSLGELMGINTLSFDKSNDGETPEGIGFAIPTQLATKIMDKLIRDGRVIRGYIGIGGR  263 (353)
T ss_pred             -ccceEEeccccCCCCCcceEECCCCeEEEEEEEEecccCCCCcccceEEEEchHHHHHHHHHHhhcCcccccccceEEE
Confidence             235799999999999999999999999999998775432   2368999999999999999999999999999999764


Q ss_pred             c--HHHHhhcCCC--CCcEEEeeCCCChhhhcCCCccccCCCCCCcCCcEEEEECCEEccCCCCCCCceeEEEe-eCCCC
Q 019504          257 P--DLVASQLNVG--NGALVLQVPGNSLAAKAGILPTTRGFAGNIILGDIIVAVNNKPVSFSCLSIPSRIYLIC-AEPNQ  331 (340)
Q Consensus       257 ~--~~~~~~~~~~--~g~~V~~v~~~spa~~~gl~~~~~~~~~~l~~GDvi~~i~g~~v~~~~d~~~~~~~~~~-~~~~~  331 (340)
                      +  ...+..++++  .|++|.+|.++|||+++||           ++||+|++|||++|.+..|   +...+.. .....
T Consensus       264 ~~~~~~~~~~~~~~~~Gv~V~~V~~~spA~~aGL-----------~~GDvI~~Ing~~V~s~~~---l~~~l~~~~~g~~  329 (353)
T PRK10898        264 EIAPLHAQGGGIDQLQGIVVNEVSPDGPAAKAGI-----------QVNDLIISVNNKPAISALE---TMDQVAEIRPGSV  329 (353)
T ss_pred             ECCHHHHHhcCCCCCCeEEEEEECCCChHHHcCC-----------CCCCEEEEECCEEcCCHHH---HHHHHHhcCCCCE
Confidence            3  4444555554  8999999999999999999           9999999999999999887   5544444 33344


Q ss_pred             ceEEEEeCC
Q 019504          332 DHLTCLKSS  340 (340)
Q Consensus       332 ~~~~~~~~~  340 (340)
                      .+++++|++
T Consensus       330 v~l~v~R~g  338 (353)
T PRK10898        330 IPVVVMRDD  338 (353)
T ss_pred             EEEEEEECC
Confidence            678888863


No 4  
>PRK10942 serine endoprotease; Provisional
Probab=100.00  E-value=2.9e-49  Score=380.91  Aligned_cols=289  Identities=39%  Similarity=0.552  Sum_probs=242.4

Q ss_pred             HHHHHHHHhCCCeEEEEeeeeccc---------cccCCc----------------------------ccccCCceEEEEE
Q 019504           22 RIAQLFEKNTYSVVNIFDVTLRPT---------LNVTGL----------------------------VEIPEGNGSGVVW   64 (340)
Q Consensus        22 ~~~~~~~~~~~svV~I~~~~~~~~---------~~~~~~----------------------------~~~~~~~GsGfiI   64 (340)
                      .+.++++++.+|||.|++......         ..+++.                            .....+.||||+|
T Consensus        39 ~~~~~~~~~~pavv~i~~~~~~~~~~~~~~~~~~~ff~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~GSG~ii  118 (473)
T PRK10942         39 SLAPMLEKVMPSVVSINVEGSTTVNTPRMPRQFQQFFGDNSPFCQEGSPFQSSPFCQGGQGGNGGGQQQKFMALGSGVII  118 (473)
T ss_pred             cHHHHHHHhCCceEEEEEEEeccccCCCCChhHHHhhcccccccccccccccccccccccccccccccccccceEEEEEE
Confidence            599999999999999987553211         001110                            0112468999999


Q ss_pred             cC-CCEEEeCccccCCCCCCCCCCCccEEEEEEEecCCceeEEEEEEEEeCCCCcEEEEEEecCCCCccceeecCCCCCC
Q 019504           65 DG-KGHIVTNFHVIGSALSRKPAEGQVVARVNILASDGVQKNFEGKLVGADRAKDLAVLKIEASEDLLKPINVGQSSFLK  143 (340)
Q Consensus        65 ~~-~G~IlT~~Hvv~~~~~~~~~~~~~~~~i~v~~~~g~~~~~~a~v~~~d~~~DlAlL~v~~~~~~~~~~~l~~~~~~~  143 (340)
                      ++ +||||||+||+.++.           ++.|++.||+  .++|++++.|+.+||||||++... .++++.|+++..++
T Consensus       119 ~~~~G~IlTn~HVv~~a~-----------~i~V~~~dg~--~~~a~vv~~D~~~DlAvlki~~~~-~l~~~~lg~s~~l~  184 (473)
T PRK10942        119 DADKGYVVTNNHVVDNAT-----------KIKVQLSDGR--KFDAKVVGKDPRSDIALIQLQNPK-NLTAIKMADSDALR  184 (473)
T ss_pred             ECCCCEEEeChhhcCCCC-----------EEEEEECCCC--EEEEEEEEecCCCCEEEEEecCCC-CCceeEecCccccC
Confidence            86 599999999999886           8999999997  899999999999999999997543 47899999999999


Q ss_pred             CCCEEEEEecCCCCCCceeEEEEeeeccccccCCCceecceEEEeeccCCCCccceeecCCCcEEEEEeeeeeCCCCcCc
Q 019504          144 VGQQCLAIGNPFGFDHTLTVGVISGLNRDIFSQAGVTIGGGIQTDAAINPGNSGGPLLDSKGNLIGINTAIITQTGTSAG  223 (340)
Q Consensus       144 ~G~~v~~iG~p~g~~~~~~~G~vs~~~~~~~~~~~~~~~~~i~~d~~i~~G~SGGPl~d~~G~VVGi~~~~~~~~~~~~~  223 (340)
                      +|++|+++|+|+++..+++.|+|+...+.....  ..+.++|++|+.+++|+|||||+|.+|+||||+++.+...++..+
T Consensus       185 ~G~~V~aiG~P~g~~~tvt~GiVs~~~r~~~~~--~~~~~~iqtda~i~~GnSGGpL~n~~GeviGI~t~~~~~~g~~~g  262 (473)
T PRK10942        185 VGDYTVAIGNPYGLGETVTSGIVSALGRSGLNV--ENYENFIQTDAAINRGNSGGALVNLNGELIGINTAILAPDGGNIG  262 (473)
T ss_pred             CCCEEEEEcCCCCCCcceeEEEEEEeecccCCc--ccccceEEeccccCCCCCcCccCCCCCeEEEEEEEEEcCCCCccc
Confidence            999999999999999999999999987653221  123467999999999999999999999999999998877666678


Q ss_pred             eEEEEehHhHHHHHHHHHHcCceeeeeeeEEec--cHHHHhhcCCC--CCcEEEeeCCCChhhhcCCCccccCCCCCCcC
Q 019504          224 VGFAIPSSTVLKIVPQLIQYGKVVRAGLNVDIA--PDLVASQLNVG--NGALVLQVPGNSLAAKAGILPTTRGFAGNIIL  299 (340)
Q Consensus       224 ~~~aip~~~i~~~l~~l~~~~~~~~~~lg~~~~--~~~~~~~~~~~--~g~~V~~v~~~spa~~~gl~~~~~~~~~~l~~  299 (340)
                      ++|+||++.+++++++|++++++.++|||+.+.  +..+++.++++  .|++|.+|.++|||+++||           ++
T Consensus       263 ~gfaIP~~~~~~v~~~l~~~g~v~rg~lGv~~~~l~~~~a~~~~l~~~~GvlV~~V~~~SpA~~AGL-----------~~  331 (473)
T PRK10942        263 IGFAIPSNMVKNLTSQMVEYGQVKRGELGIMGTELNSELAKAMKVDAQRGAFVSQVLPNSSAAKAGI-----------KA  331 (473)
T ss_pred             EEEEEEHHHHHHHHHHHHhccccccceeeeEeeecCHHHHHhcCCCCCCceEEEEECCCChHHHcCC-----------CC
Confidence            999999999999999999999999999998774  57788888886  7999999999999999999           99


Q ss_pred             CcEEEEECCEEccCCCCCCCceeEEEee-CCCCceEEEEeCC
Q 019504          300 GDIIVAVNNKPVSFSCLSIPSRIYLICA-EPNQDHLTCLKSS  340 (340)
Q Consensus       300 GDvi~~i~g~~v~~~~d~~~~~~~~~~~-~~~~~~~~~~~~~  340 (340)
                      ||+|++|||++|.+.+|   ++..+... .....+++|+|++
T Consensus       332 GDvIl~InG~~V~s~~d---l~~~l~~~~~g~~v~l~v~R~G  370 (473)
T PRK10942        332 GDVITSLNGKPISSFAA---LRAQVGTMPVGSKLTLGLLRDG  370 (473)
T ss_pred             CCEEEEECCEECCCHHH---HHHHHHhcCCCCEEEEEEEECC
Confidence            99999999999999998   55555443 3334678888764


No 5  
>TIGR02037 degP_htrA_DO periplasmic serine protease, Do/DeqQ family. This family consists of a set of proteins variously designated DegP, heat shock protein HtrA, and protease DO. The ortholog in Pseudomonas aeruginosa is designated MucD and is found in an operon that controls mucoid phenotype. This family also includes the DegQ (HhoA) paralog in E. coli, which can rescue a DegP mutant, but not the smaller DegS paralog, which cannot. Members of this family are located in the periplasm and have separable functions as both protease and chaperone. Members have a trypsin domain and two copies of a PDZ domain. This protein protects bacteria from thermal and other stresses and may be important for the survival of bacterial pathogens. The chaperone function is dominant at low temperatures, whereas the proteolytic activity is turned on at elevated temperatures.
Probab=100.00  E-value=1.1e-48  Score=375.71  Aligned_cols=288  Identities=43%  Similarity=0.571  Sum_probs=242.9

Q ss_pred             HHHHHHHhCCCeEEEEeeeecccc-----------ccCCc----------ccccCCceEEEEEcCCCEEEeCccccCCCC
Q 019504           23 IAQLFEKNTYSVVNIFDVTLRPTL-----------NVTGL----------VEIPEGNGSGVVWDGKGHIVTNFHVIGSAL   81 (340)
Q Consensus        23 ~~~~~~~~~~svV~I~~~~~~~~~-----------~~~~~----------~~~~~~~GsGfiI~~~G~IlT~~Hvv~~~~   81 (340)
                      +.++++++.+|||.|.+.......           .+++.          .....+.||||+|+++||||||+||+.++.
T Consensus         3 ~~~~~~~~~p~vv~i~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~GSGfii~~~G~IlTn~Hvv~~~~   82 (428)
T TIGR02037         3 FAPLVEKVAPAVVNISVEGTVKRRNRPPALPPFFRQFFGDDMPNFPRQQRERKVRGLGSGVIISADGYILTNNHVVDGAD   82 (428)
T ss_pred             HHHHHHHhCCceEEEEEEEEecccCCCcccchhHHHhhcccccCcccccccccccceeeEEEECCCCEEEEcHHHcCCCC
Confidence            778999999999999875422110           11111          112457899999999999999999999876


Q ss_pred             CCCCCCCccEEEEEEEecCCceeEEEEEEEEeCCCCcEEEEEEecCCCCccceeecCCCCCCCCCEEEEEecCCCCCCce
Q 019504           82 SRKPAEGQVVARVNILASDGVQKNFEGKLVGADRAKDLAVLKIEASEDLLKPINVGQSSFLKVGQQCLAIGNPFGFDHTL  161 (340)
Q Consensus        82 ~~~~~~~~~~~~i~v~~~~g~~~~~~a~v~~~d~~~DlAlL~v~~~~~~~~~~~l~~~~~~~~G~~v~~iG~p~g~~~~~  161 (340)
                                 .+.|++.|++  .++|++++.|+.+|||||+++.. ..++++.|+++..+++|++|+++|||++...++
T Consensus        83 -----------~i~V~~~~~~--~~~a~vv~~d~~~DlAllkv~~~-~~~~~~~l~~~~~~~~G~~v~aiG~p~g~~~~~  148 (428)
T TIGR02037        83 -----------EITVTLSDGR--EFKAKLVGKDPRTDIAVLKIDAK-KNLPVIKLGDSDKLRVGDWVLAIGNPFGLGQTV  148 (428)
T ss_pred             -----------eEEEEeCCCC--EEEEEEEEecCCCCEEEEEecCC-CCceEEEccCCCCCCCCCEEEEEECCCcCCCcE
Confidence                       8999999997  79999999999999999999875 348999999888899999999999999999999


Q ss_pred             eEEEEeeeccccccCCCceecceEEEeeccCCCCccceeecCCCcEEEEEeeeeeCCCCcCceEEEEehHhHHHHHHHHH
Q 019504          162 TVGVISGLNRDIFSQAGVTIGGGIQTDAAINPGNSGGPLLDSKGNLIGINTAIITQTGTSAGVGFAIPSSTVLKIVPQLI  241 (340)
Q Consensus       162 ~~G~vs~~~~~~~~~~~~~~~~~i~~d~~i~~G~SGGPl~d~~G~VVGi~~~~~~~~~~~~~~~~aip~~~i~~~l~~l~  241 (340)
                      +.|+|+...+....  ...+..++++|+.+++|+|||||||.+|+||||+++.+...++..+++|+||++.+++++++|+
T Consensus       149 t~G~vs~~~~~~~~--~~~~~~~i~tda~i~~GnSGGpl~n~~G~viGI~~~~~~~~g~~~g~~faiP~~~~~~~~~~l~  226 (428)
T TIGR02037       149 TSGIVSALGRSGLG--IGDYENFIQTDAAINPGNSGGPLVNLRGEVIGINTAIYSPSGGNVGIGFAIPSNMAKNVVDQLI  226 (428)
T ss_pred             EEEEEEecccCccC--CCCccceEEECCCCCCCCCCCceECCCCeEEEEEeEEEcCCCCccceEEEEEhHHHHHHHHHHH
Confidence            99999988765321  1223467999999999999999999999999999988776555678999999999999999999


Q ss_pred             HcCceeeeeeeEEec--cHHHHhhcCCC--CCcEEEeeCCCChhhhcCCCccccCCCCCCcCCcEEEEECCEEccCCCCC
Q 019504          242 QYGKVVRAGLNVDIA--PDLVASQLNVG--NGALVLQVPGNSLAAKAGILPTTRGFAGNIILGDIIVAVNNKPVSFSCLS  317 (340)
Q Consensus       242 ~~~~~~~~~lg~~~~--~~~~~~~~~~~--~g~~V~~v~~~spa~~~gl~~~~~~~~~~l~~GDvi~~i~g~~v~~~~d~  317 (340)
                      +++++.++|||+.+.  +..+++.++++  .|++|.+|.++|||+++||           ++||+|++|||++|.+..| 
T Consensus       227 ~~g~~~~~~lGi~~~~~~~~~~~~lgl~~~~Gv~V~~V~~~spA~~aGL-----------~~GDvI~~Vng~~i~~~~~-  294 (428)
T TIGR02037       227 EGGKVQRGWLGVTIQEVTSDLAKSLGLEKQRGALVAQVLPGSPAEKAGL-----------KAGDVILSVNGKPISSFAD-  294 (428)
T ss_pred             hcCcCcCCcCceEeecCCHHHHHHcCCCCCCceEEEEccCCCChHHcCC-----------CCCCEEEEECCEEcCCHHH-
Confidence            999999999999875  47788889986  8999999999999999999           9999999999999999987 


Q ss_pred             CCceeEEEee-CCCCceEEEEeCC
Q 019504          318 IPSRIYLICA-EPNQDHLTCLKSS  340 (340)
Q Consensus       318 ~~~~~~~~~~-~~~~~~~~~~~~~  340 (340)
                        +...+... .....+++++|++
T Consensus       295 --~~~~l~~~~~g~~v~l~v~R~g  316 (428)
T TIGR02037       295 --LRRAIGTLKPGKKVTLGILRKG  316 (428)
T ss_pred             --HHHHHHhcCCCCEEEEEEEECC
Confidence              55555443 3345678888863


No 6  
>COG0265 DegQ Trypsin-like serine proteases, typically periplasmic, contain C-terminal PDZ domain [Posttranslational modification, protein turnover, chaperones]
Probab=100.00  E-value=1.1e-38  Score=298.68  Aligned_cols=290  Identities=42%  Similarity=0.538  Sum_probs=238.4

Q ss_pred             hHHHHHHHHhCCCeEEEEeeeeccccccC-Ccc-cc-cCCceEEEEEcCCCEEEeCccccCCCCCCCCCCCccEEEEEEE
Q 019504           21 ERIAQLFEKNTYSVVNIFDVTLRPTLNVT-GLV-EI-PEGNGSGVVWDGKGHIVTNFHVIGSALSRKPAEGQVVARVNIL   97 (340)
Q Consensus        21 ~~~~~~~~~~~~svV~I~~~~~~~~~~~~-~~~-~~-~~~~GsGfiI~~~G~IlT~~Hvv~~~~~~~~~~~~~~~~i~v~   97 (340)
                      ..+..+++++.++||.++.........++ ... .. ..+.||||+++++|||+|+.|++..+.           ++.+.
T Consensus        33 ~~~~~~~~~~~~~vV~~~~~~~~~~~~~~~~~~~~~~~~~~gSg~i~~~~g~ivTn~hVi~~a~-----------~i~v~  101 (347)
T COG0265          33 LSFATAVEKVAPAVVSIATGLTAKLRSFFPSDPPLRSAEGLGSGFIISSDGYIVTNNHVIAGAE-----------EITVT  101 (347)
T ss_pred             cCHHHHHHhcCCcEEEEEeeeeecchhcccCCcccccccccccEEEEcCCeEEEecceecCCcc-----------eEEEE
Confidence            57889999999999999875433320000 000 00 147899999998999999999999965           88999


Q ss_pred             ecCCceeEEEEEEEEeCCCCcEEEEEEecCCCCccceeecCCCCCCCCCEEEEEecCCCCCCceeEEEEeeeccccccCC
Q 019504           98 ASDGVQKNFEGKLVGADRAKDLAVLKIEASEDLLKPINVGQSSFLKVGQQCLAIGNPFGFDHTLTVGVISGLNRDIFSQA  177 (340)
Q Consensus        98 ~~~g~~~~~~a~v~~~d~~~DlAlL~v~~~~~~~~~~~l~~~~~~~~G~~v~~iG~p~g~~~~~~~G~vs~~~~~~~~~~  177 (340)
                      +.||+  .+++++++.|+..|+|+|+++.... ++.+.++++..++.|+.++++|+|+++..+++.|+++...+..... 
T Consensus       102 l~dg~--~~~a~~vg~d~~~dlavlki~~~~~-~~~~~~~~s~~l~vg~~v~aiGnp~g~~~tvt~Givs~~~r~~v~~-  177 (347)
T COG0265         102 LADGR--EVPAKLVGKDPISDLAVLKIDGAGG-LPVIALGDSDKLRVGDVVVAIGNPFGLGQTVTSGIVSALGRTGVGS-  177 (347)
T ss_pred             eCCCC--EEEEEEEecCCccCEEEEEeccCCC-CceeeccCCCCcccCCEEEEecCCCCcccceeccEEeccccccccC-
Confidence            99997  8999999999999999999998654 7888999999999999999999999999999999999998862211 


Q ss_pred             CceecceEEEeeccCCCCccceeecCCCcEEEEEeeeeeCCCCcCceEEEEehHhHHHHHHHHHHcCceeeeeeeEEecc
Q 019504          178 GVTIGGGIQTDAAINPGNSGGPLLDSKGNLIGINTAIITQTGTSAGVGFAIPSSTVLKIVPQLIQYGKVVRAGLNVDIAP  257 (340)
Q Consensus       178 ~~~~~~~i~~d~~i~~G~SGGPl~d~~G~VVGi~~~~~~~~~~~~~~~~aip~~~i~~~l~~l~~~~~~~~~~lg~~~~~  257 (340)
                      ...+.++||+|+.+++|+||||++|.+|++|||++..+...++..+++|+||++.+..++.++...|++.++++|+.+.+
T Consensus       178 ~~~~~~~IqtdAain~gnsGgpl~n~~g~~iGint~~~~~~~~~~gigfaiP~~~~~~v~~~l~~~G~v~~~~lgv~~~~  257 (347)
T COG0265         178 AGGYVNFIQTDAAINPGNSGGPLVNIDGEVVGINTAIIAPSGGSSGIGFAIPVNLVAPVLDELISKGKVVRGYLGVIGEP  257 (347)
T ss_pred             cccccchhhcccccCCCCCCCceEcCCCcEEEEEEEEecCCCCcceeEEEecHHHHHHHHHHHHHcCCccccccceEEEE
Confidence            11145789999999999999999999999999999988876655679999999999999999999889999999988754


Q ss_pred             HHHHhhcCC--CCCcEEEeeCCCChhhhcCCCccccCCCCCCcCCcEEEEECCEEccCCCCCCCceeEEEeeC-CCCceE
Q 019504          258 DLVASQLNV--GNGALVLQVPGNSLAAKAGILPTTRGFAGNIILGDIIVAVNNKPVSFSCLSIPSRIYLICAE-PNQDHL  334 (340)
Q Consensus       258 ~~~~~~~~~--~~g~~V~~v~~~spa~~~gl~~~~~~~~~~l~~GDvi~~i~g~~v~~~~d~~~~~~~~~~~~-~~~~~~  334 (340)
                      ......+|+  ..|++|.+|.+++||+++|+           +.||+|+++||+++.+..|   +...+.... .....+
T Consensus       258 ~~~~~~~g~~~~~G~~V~~v~~~spa~~agi-----------~~Gdii~~vng~~v~~~~~---l~~~v~~~~~g~~v~~  323 (347)
T COG0265         258 LTADIALGLPVAAGAVVLGVLPGSPAAKAGI-----------KAGDIITAVNGKPVASLSD---LVAAVASNRPGDEVAL  323 (347)
T ss_pred             cccccccCCCCCCceEEEecCCCChHHHcCC-----------CCCCEEEEECCEEccCHHH---HHHHHhccCCCCEEEE
Confidence            222122443  48999999999999999999           8999999999999999998   454444433 445677


Q ss_pred             EEEeC
Q 019504          335 TCLKS  339 (340)
Q Consensus       335 ~~~~~  339 (340)
                      +++|.
T Consensus       324 ~~~r~  328 (347)
T COG0265         324 KLLRG  328 (347)
T ss_pred             EEEEC
Confidence            77775


No 7  
>KOG1320 consensus Serine protease [Posttranslational modification, protein turnover, chaperones]
Probab=99.96  E-value=5.7e-29  Score=233.05  Aligned_cols=290  Identities=38%  Similarity=0.464  Sum_probs=224.7

Q ss_pred             CcchhHHHHHHHHhCCCeEEEEeeeeccccccCCcccccCCceEEEEEcCCCEEEeCccccCCCCCCCCCCCccEEEEEE
Q 019504           17 LPNEERIAQLFEKNTYSVVNIFDVTLRPTLNVTGLVEIPEGNGSGVVWDGKGHIVTNFHVIGSALSRKPAEGQVVARVNI   96 (340)
Q Consensus        17 ~~~~~~~~~~~~~~~~svV~I~~~~~~~~~~~~~~~~~~~~~GsGfiI~~~G~IlT~~Hvv~~~~~~~~~~~~~~~~i~v   96 (340)
                      .......+++.++...|+|.|..........++.....+...||||+++.+|+++||+||+................+.+
T Consensus       124 ~k~~~~v~~~~~~cd~Avv~Ie~~~f~~~~~~~e~~~ip~l~~S~~Vv~gd~i~VTnghV~~~~~~~y~~~~~~l~~vqi  203 (473)
T KOG1320|consen  124 RKYKAFVAAVFEECDLAVVYIESEEFWKGMNPFELGDIPSLNGSGFVVGGDGIIVTNGHVVRVEPRIYAHSSTVLLRVQI  203 (473)
T ss_pred             hhhhhhHHHhhhcccceEEEEeeccccCCCcccccCCCcccCccEEEEcCCcEEEEeeEEEEEEeccccCCCcceeeEEE
Confidence            33456778899999999999987544333333444456677899999999999999999998654322222223346888


Q ss_pred             EecCCceeEEEEEEEEeCCCCcEEEEEEecCCCCccceeecCCCCCCCCCEEEEEecCCCCCCceeEEEEeeeccccccC
Q 019504           97 LASDGVQKNFEGKLVGADRAKDLAVLKIEASEDLLKPINVGQSSFLKVGQQCLAIGNPFGFDHTLTVGVISGLNRDIFSQ  176 (340)
Q Consensus        97 ~~~~g~~~~~~a~v~~~d~~~DlAlL~v~~~~~~~~~~~l~~~~~~~~G~~v~~iG~p~g~~~~~~~G~vs~~~~~~~~~  176 (340)
                      ..++|.....++.+.+.|+..|+|+++++.+....++++++.+..++.|+++..+|.|++..++.+.|+++...|....-
T Consensus       204 ~aa~~~~~s~ep~i~g~d~~~gvA~l~ik~~~~i~~~i~~~~~~~~~~G~~~~a~~~~f~~~nt~t~g~vs~~~R~~~~l  283 (473)
T KOG1320|consen  204 DAAIGPGNSGEPVIVGVDKVAGVAFLKIKTPENILYVIPLGVSSHFRTGVEVSAIGNGFGLLNTLTQGMVSGQLRKSFKL  283 (473)
T ss_pred             EEeecCCccCCCeEEccccccceEEEEEecCCcccceeecceeeeecccceeeccccCceeeeeeeeccccccccccccc
Confidence            88887445889999999999999999997765447888888888999999999999999999999999999988866542


Q ss_pred             C---CceecceEEEeeccCCCCccceeecCCCcEEEEEeeeeeCCCCcCceEEEEehHhHHHHHHHHHHcCcee------
Q 019504          177 A---GVTIGGGIQTDAAINPGNSGGPLLDSKGNLIGINTAIITQTGTSAGVGFAIPSSTVLKIVPQLIQYGKVV------  247 (340)
Q Consensus       177 ~---~~~~~~~i~~d~~i~~G~SGGPl~d~~G~VVGi~~~~~~~~~~~~~~~~aip~~~i~~~l~~l~~~~~~~------  247 (340)
                      .   +....+++++|+.++.|+||+|++|.+|++||+++......+-..+++|++|.+.++.++.+..++...-      
T Consensus       284 g~~~g~~i~~~~qtd~ai~~~nsg~~ll~~DG~~IgVn~~~~~ri~~~~~iSf~~p~d~vl~~v~r~~e~~~~lr~~~~~  363 (473)
T KOG1320|consen  284 GLETGVLISKINQTDAAINPGNSGGPLLNLDGEVIGVNTRKVTRIGFSHGISFKIPIDTVLVIVLRLGEFQISLRPVKPL  363 (473)
T ss_pred             CcccceeeeeecccchhhhcccCCCcEEEecCcEeeeeeeeeEEeeccccceeccCchHhhhhhhhhhhhceeeccccCc
Confidence            2   2355688999999999999999999999999999987776555678999999999999888874443211      


Q ss_pred             ---eeeeeEEe-------ccHHHHhhcC----CCCCcEEEeeCCCChhhhcCCCccccCCCCCCcCCcEEEEECCEEccC
Q 019504          248 ---RAGLNVDI-------APDLVASQLN----VGNGALVLQVPGNSLAAKAGILPTTRGFAGNIILGDIIVAVNNKPVSF  313 (340)
Q Consensus       248 ---~~~lg~~~-------~~~~~~~~~~----~~~g~~V~~v~~~spa~~~gl~~~~~~~~~~l~~GDvi~~i~g~~v~~  313 (340)
                         +.++|...       .-..+.+.+.    ...+++|.+|.+++++...++           ++||+|++|||++|.+
T Consensus       364 ~p~~~~~g~~s~~i~~g~vf~~~~~~~~~~~~~~q~v~is~Vlp~~~~~~~~~-----------~~g~~V~~vng~~V~n  432 (473)
T KOG1320|consen  364 VPVHQYIGLPSYYIFAGLVFVPLTKSYIFPSGVVQLVLVSQVLPGSINGGYGL-----------KPGDQVVKVNGKPVKN  432 (473)
T ss_pred             ccccccCCceeEEEecceEEeecCCCccccccceeEEEEEEeccCCCcccccc-----------cCCCEEEEECCEEeec
Confidence               12333221       1111122232    226899999999999999998           8999999999999999


Q ss_pred             CCCC
Q 019504          314 SCLS  317 (340)
Q Consensus       314 ~~d~  317 (340)
                      ..|+
T Consensus       433 ~~~l  436 (473)
T KOG1320|consen  433 LKHL  436 (473)
T ss_pred             hHHH
Confidence            9873


No 8  
>KOG1421 consensus Predicted signaling-associated protein (contains a PDZ domain) [General function prediction only]
Probab=99.92  E-value=6.3e-24  Score=201.44  Aligned_cols=287  Identities=23%  Similarity=0.272  Sum_probs=221.5

Q ss_pred             cchhHHHHHHHHhCCCeEEEEeeeeccccccCCcccccCCceEEEEEcCC-CEEEeCccccCCCCCCCCCCCccEEEEEE
Q 019504           18 PNEERIAQLFEKNTYSVVNIFDVTLRPTLNVTGLVEIPEGNGSGVVWDGK-GHIVTNFHVIGSALSRKPAEGQVVARVNI   96 (340)
Q Consensus        18 ~~~~~~~~~~~~~~~svV~I~~~~~~~~~~~~~~~~~~~~~GsGfiI~~~-G~IlT~~Hvv~~~~~~~~~~~~~~~~i~v   96 (340)
                      +....+...+.++-+|||.|.......-    .......+.||||++++. ||||||+|++....          +.-.+
T Consensus        49 ~~~e~w~~~ia~VvksvVsI~~S~v~~f----dtesag~~~atgfvvd~~~gyiLtnrhvv~pgP----------~va~a  114 (955)
T KOG1421|consen   49 ATSEDWRNTIANVVKSVVSIRFSAVRAF----DTESAGESEATGFVVDKKLGYILTNRHVVAPGP----------FVASA  114 (955)
T ss_pred             chhhhhhhhhhhhcccEEEEEehheeec----ccccccccceeEEEEecccceEEEeccccCCCC----------ceeEE
Confidence            3344888999999999999986543331    111223456999999976 99999999998765          24455


Q ss_pred             EecCCceeEEEEEEEEeCCCCcEEEEEEecCC---CCccceeecCCCCCCCCCEEEEEecCCCCCCceeEEEEeeecccc
Q 019504           97 LASDGVQKNFEGKLVGADRAKDLAVLKIEASE---DLLKPINVGQSSFLKVGQQCLAIGNPFGFDHTLTVGVISGLNRDI  173 (340)
Q Consensus        97 ~~~~g~~~~~~a~v~~~d~~~DlAlL~v~~~~---~~~~~~~l~~~~~~~~G~~v~~iG~p~g~~~~~~~G~vs~~~~~~  173 (340)
                      .|.+-.  +.+--.++.|+.+|+.+++.++..   ..+..+.++ ....++|.+++++|+..+...++..|.++.+.+..
T Consensus       115 vf~n~e--e~ei~pvyrDpVhdfGf~r~dps~ir~s~vt~i~la-p~~akvgseirvvgNDagEklsIlagflSrldr~a  191 (955)
T KOG1421|consen  115 VFDNHE--EIEIYPVYRDPVHDFGFFRYDPSTIRFSIVTEICLA-PELAKVGSEIRVVGNDAGEKLSILAGFLSRLDRNA  191 (955)
T ss_pred             Eecccc--cCCcccccCCchhhcceeecChhhcceeeeeccccC-ccccccCCceEEecCCccceEEeehhhhhhccCCC
Confidence            565554  667778899999999999998763   123444554 34458999999999988888899999999998876


Q ss_pred             ccCCCceec----ceEEEeeccCCCCccceeecCCCcEEEEEeeeeeCCCCcCceEEEEehHhHHHHHHHHHHcCceeee
Q 019504          174 FSQAGVTIG----GGIQTDAAINPGNSGGPLLDSKGNLIGINTAIITQTGTSAGVGFAIPSSTVLKIVPQLIQYGKVVRA  249 (340)
Q Consensus       174 ~~~~~~~~~----~~i~~d~~i~~G~SGGPl~d~~G~VVGi~~~~~~~~~~~~~~~~aip~~~i~~~l~~l~~~~~~~~~  249 (340)
                      +......+.    .++|.-+....|.||+|++|.+|..|.++..+...    ....|++|++.+++.+.-++++..++|+
T Consensus       192 pdyg~~~yndfnTfy~QaasstsggssgspVv~i~gyAVAl~agg~~s----sas~ffLpLdrV~RaL~clq~n~PItRG  267 (955)
T KOG1421|consen  192 PDYGEDTYNDFNTFYIQAASSTSGGSSGSPVVDIPGYAVALNAGGSIS----SASDFFLPLDRVVRALRCLQNNTPITRG  267 (955)
T ss_pred             ccccccccccccceeeeehhcCCCCCCCCceecccceEEeeecCCccc----ccccceeeccchhhhhhhhhcCCCcccc
Confidence            544333222    35788888889999999999999999998876543    4457999999999999999989999999


Q ss_pred             eeeEEeccHHH--HhhcCCC--------------CCcEEE-eeCCCChhhhcCCCccccCCCCCCcCCcEEEEECCEEcc
Q 019504          250 GLNVDIAPDLV--ASQLNVG--------------NGALVL-QVPGNSLAAKAGILPTTRGFAGNIILGDIIVAVNNKPVS  312 (340)
Q Consensus       250 ~lg~~~~~~~~--~~~~~~~--------------~g~~V~-~v~~~spa~~~gl~~~~~~~~~~l~~GDvi~~i~g~~v~  312 (340)
                      .|-++|.....  ++.+|++              .|++|. .|.+++||++.            |++||++++||+.-+.
T Consensus       268 tLqvefl~k~~de~rrlGL~sE~eqv~r~k~P~~tgmLvV~~vL~~gpa~k~------------Le~GDillavN~t~l~  335 (955)
T KOG1421|consen  268 TLQVEFLHKLFDECRRLGLSSEWEQVVRTKFPERTGMLVVETVLPEGPAEKK------------LEPGDILLAVNSTCLN  335 (955)
T ss_pred             eEEEEEehhhhHHHHhcCCcHHHHHHHHhcCcccceeEEEEEeccCCchhhc------------cCCCcEEEEEcceehH
Confidence            99999976433  4556653              566654 59999999976            5999999999998888


Q ss_pred             CCCCCCCceeEEEeeCCCCceEEEEeCC
Q 019504          313 FSCLSIPSRIYLICAEPNQDHLTCLKSS  340 (340)
Q Consensus       313 ~~~d~~~~~~~~~~~~~~~~~~~~~~~~  340 (340)
                      ++..   ....|...-.+..++||.|-+
T Consensus       336 df~~---l~~iLDegvgk~l~LtI~Rgg  360 (955)
T KOG1421|consen  336 DFEA---LEQILDEGVGKNLELTIQRGG  360 (955)
T ss_pred             HHHH---HHHHHhhccCceEEEEEEeCC
Confidence            8876   777787777777899998854


No 9  
>KOG1421 consensus Predicted signaling-associated protein (contains a PDZ domain) [General function prediction only]
Probab=99.76  E-value=1e-17  Score=159.58  Aligned_cols=274  Identities=13%  Similarity=0.125  Sum_probs=193.2

Q ss_pred             HHHhCCCeEEEEeeeeccccccCCcccccCCceEEEEEcCC-CEEEeCccccCCCCCCCCCCCccEEEEEEEecCCceeE
Q 019504           27 FEKNTYSVVNIFDVTLRPTLNVTGLVEIPEGNGSGVVWDGK-GHIVTNFHVIGSALSRKPAEGQVVARVNILASDGVQKN  105 (340)
Q Consensus        27 ~~~~~~svV~I~~~~~~~~~~~~~~~~~~~~~GsGfiI~~~-G~IlT~~Hvv~~~~~~~~~~~~~~~~i~v~~~~g~~~~  105 (340)
                      .+++..+.|.+........++...+.    ..|||.|++.+ |++++++.++....          .+.++++.|..  .
T Consensus       524 ~~~i~~~~~~v~~~~~~~l~g~s~~i----~kgt~~i~d~~~g~~vvsr~~vp~d~----------~d~~vt~~dS~--~  587 (955)
T KOG1421|consen  524 SADISNCLVDVEPMMPVNLDGVSSDI----YKGTALIMDTSKGLGVVSRSVVPSDA----------KDQRVTEADSD--G  587 (955)
T ss_pred             hhHHhhhhhhheeceeeccccchhhh----hcCceEEEEccCCceeEecccCCchh----------hceEEeecccc--c
Confidence            57788888888776555544433322    34999999855 99999999998654          37888888775  7


Q ss_pred             EEEEEEEeCCCCcEEEEEEecCCCCccceeecCCCCCCCCCEEEEEecCCCCCCceeEEEEeee-----ccccccCCCce
Q 019504          106 FEGKLVGADRAKDLAVLKIEASEDLLKPINVGQSSFLKVGQQCLAIGNPFGFDHTLTVGVISGL-----NRDIFSQAGVT  180 (340)
Q Consensus       106 ~~a~v~~~d~~~DlAlL~v~~~~~~~~~~~l~~~~~~~~G~~v~~iG~p~g~~~~~~~G~vs~~-----~~~~~~~~~~~  180 (340)
                      ++|.+.+.++..++|.++.++..  ...+.|. ...+..||++...|+............++.+     .+.........
T Consensus       588 i~a~~~fL~~t~n~a~~kydp~~--~~~~kl~-~~~v~~gD~~~f~g~~~~~r~ltaktsv~dvs~~~~ps~~~pr~r~~  664 (955)
T KOG1421|consen  588 IPANVSFLHPTENVASFKYDPAL--EVQLKLT-DTTVLRGDECTFEGFTEDLRALTAKTSVTDVSVVIIPSSVMPRFRAT  664 (955)
T ss_pred             ccceeeEecCccceeEeccChhH--hhhhccc-eeeEecCCceeEecccccchhhcccceeeeeEEEEecCCCCcceeec
Confidence            89999999999999999999864  3455664 3567889999999988654322222222222     22222222222


Q ss_pred             ecceEEEeeccCCCCccceeecCCCcEEEEEeeeeeCC--CCcCceEEEEehHhHHHHHHHHHHcCceeeeeeeEEeccH
Q 019504          181 IGGGIQTDAAINPGNSGGPLLDSKGNLIGINTAIITQT--GTSAGVGFAIPSSTVLKIVPQLIQYGKVVRAGLNVDIAPD  258 (340)
Q Consensus       181 ~~~~i~~d~~i~~G~SGGPl~d~~G~VVGi~~~~~~~~--~~~~~~~~aip~~~i~~~l~~l~~~~~~~~~~lg~~~~~~  258 (340)
                      ..+.|.+++.+..++--|-+.|.+|+|+|++-....+.  +....+-|.+.+..+++.++.|+.+.......+|++|...
T Consensus       665 n~e~Is~~~nlsT~c~sg~ltdddg~vvalwl~~~ge~~~~kd~~y~~gl~~~~~l~vl~rlk~g~~~rp~i~~vef~~i  744 (955)
T KOG1421|consen  665 NLEVISFMDNLSTSCLSGRLTDDDGEVVALWLSVVGEDVGGKDYTYKYGLSMSYILPVLERLKLGPSARPTIAGVEFSHI  744 (955)
T ss_pred             ceEEEEEeccccccccceEEECCCCeEEEEEeeeeccccCCceeEEEeccchHHHHHHHHHHhcCCCCCceeeccceeeE
Confidence            23567787777766666788999999999988766542  2234567788999999999999988777666678888765


Q ss_pred             HHHh--hcCCCCCcEEEeeCCCChhhhcCCCccccCC--CCCCcCCcEEEEECCEEccCCCCCCCc
Q 019504          259 LVAS--QLNVGNGALVLQVPGNSLAAKAGILPTTRGF--AGNIILGDIIVAVNNKPVSFSCLSIPS  320 (340)
Q Consensus       259 ~~~~--~~~~~~g~~V~~v~~~spa~~~gl~~~~~~~--~~~l~~GDvi~~i~g~~v~~~~d~~~~  320 (340)
                      ++++  .+|++ ..++.+.+.+|.-.++-+..+|...  -..|..||||+++|||.|+.+.|+-++
T Consensus       745 ~laqar~lglp-~e~imk~e~es~~~~ql~~ishv~~~~~kil~~gdiilsvngk~itr~~dl~d~  809 (955)
T KOG1421|consen  745 TLAQARTLGLP-SEFIMKSEEESTIPRQLYVISHVRPLLHKILGVGDIILSVNGKMITRLSDLHDF  809 (955)
T ss_pred             EeehhhccCCC-HHHHhhhhhcCCCcceEEEEEeeccCcccccccccEEEEecCeEEeeehhhhhh
Confidence            5554  45665 5566666666666666665555532  234789999999999999999984333


No 10 
>PF13365 Trypsin_2:  Trypsin-like peptidase domain; PDB: 1Y8T_A 2Z9I_A 3QO6_A 1L1J_A 1QY6_A 2O8L_A 3OTP_E 2ZLE_I 1KY9_A 3CS0_A ....
Probab=99.73  E-value=2.6e-17  Score=130.07  Aligned_cols=117  Identities=32%  Similarity=0.471  Sum_probs=74.8

Q ss_pred             eEEEEEcCCCEEEeCccccCCCCCCCCCCCccEEEEEEEecCCceeEEE--EEEEEeCCC-CcEEEEEEecCCCCcccee
Q 019504           59 GSGVVWDGKGHIVTNFHVIGSALSRKPAEGQVVARVNILASDGVQKNFE--GKLVGADRA-KDLAVLKIEASEDLLKPIN  135 (340)
Q Consensus        59 GsGfiI~~~G~IlT~~Hvv~~~~~~~~~~~~~~~~i~v~~~~g~~~~~~--a~v~~~d~~-~DlAlL~v~~~~~~~~~~~  135 (340)
                      ||||+|+++|+||||+||+..........   ...+.+.+.++.  ...  +++++.|+. .|+|||+++.         
T Consensus         1 GTGf~i~~~g~ilT~~Hvv~~~~~~~~~~---~~~~~~~~~~~~--~~~~~~~~~~~~~~~~D~All~v~~---------   66 (120)
T PF13365_consen    1 GTGFLIGPDGYILTAAHVVEDWNDGKQPD---NSSVEVVFPDGR--RVPPVAEVVYFDPDDYDLALLKVDP---------   66 (120)
T ss_dssp             EEEEEEETTTEEEEEHHHHTCCTT--G-T---CSEEEEEETTSC--EEETEEEEEEEETT-TTEEEEEESC---------
T ss_pred             CEEEEEcCCceEEEchhheecccccccCC---CCEEEEEecCCC--EEeeeEEEEEECCccccEEEEEEec---------
Confidence            79999999899999999999754321100   127888888887  456  999999999 9999999990         


Q ss_pred             ecCCCCCCCCCEEEEEecCCCCCCceeEEEEeeeccccccCCCceecceEEEeeccCCCCccceeecCCCcEEEE
Q 019504          136 VGQSSFLKVGQQCLAIGNPFGFDHTLTVGVISGLNRDIFSQAGVTIGGGIQTDAAINPGNSGGPLLDSKGNLIGI  210 (340)
Q Consensus       136 l~~~~~~~~G~~v~~iG~p~g~~~~~~~G~vs~~~~~~~~~~~~~~~~~i~~d~~i~~G~SGGPl~d~~G~VVGi  210 (340)
                          .. ..+......+            ...............   .++ +++.+.+|+||||+||.+|+||||
T Consensus        67 ----~~-~~~~~~~~~~------------~~~~~~~~~~~~~~~---~~~-~~~~~~~G~SGgpv~~~~G~vvGi  120 (120)
T PF13365_consen   67 ----WT-GVGGGVRVPG------------STSGVSPTSTNDNRM---LYI-TDADTRPGSSGGPVFDSDGRVVGI  120 (120)
T ss_dssp             ----EE-EEEEEEEEEE------------EEEEEEEEEEEETEE---EEE-ESSS-STTTTTSEEEETTSEEEEE
T ss_pred             ----cc-ceeeeeEeee------------eccccccccCcccce---eEe-eecccCCCcEeHhEECCCCEEEeC
Confidence                00 0000001000            000000000000000   014 799999999999999999999997


No 11 
>KOG1320 consensus Serine protease [Posttranslational modification, protein turnover, chaperones]
Probab=99.46  E-value=3.7e-13  Score=127.06  Aligned_cols=256  Identities=21%  Similarity=0.326  Sum_probs=184.0

Q ss_pred             HHHHhCCCeEEEEeeeeccccccCCccc-ccCCceEEEEEcCCCEEEeCccccCCCCCCCCCCCccEEEEEEEecCCcee
Q 019504           26 LFEKNTYSVVNIFDVTLRPTLNVTGLVE-IPEGNGSGVVWDGKGHIVTNFHVIGSALSRKPAEGQVVARVNILASDGVQK  104 (340)
Q Consensus        26 ~~~~~~~svV~I~~~~~~~~~~~~~~~~-~~~~~GsGfiI~~~G~IlT~~Hvv~~~~~~~~~~~~~~~~i~v~~~~g~~~  104 (340)
                      ..+....|++.+.+....+....+|+.. +....|+||.+... .++|++|++....+.        ..+.+. ..|...
T Consensus        55 ~~~~~~~s~~~v~~~~~~~~~~~pw~~~~q~~~~~s~f~i~~~-~lltn~~~v~~~~~~--------~~v~v~-~~gs~~  124 (473)
T KOG1320|consen   55 VVDLALQSVVKVFSVSTEPSSVLPWQRTRQFSSGGSGFAIYGK-KLLTNAHVVAPNNDH--------KFVTVK-KHGSPR  124 (473)
T ss_pred             CccccccceeEEEeecccccccCcceeeehhcccccchhhccc-ceeecCccccccccc--------cccccc-cCCCch
Confidence            4566777899998877777766667654 44567999999854 899999999854321        144444 555555


Q ss_pred             EEEEEEEEeCCCCcEEEEEEecCC--CCccceeecCCCCCCCCCEEEEEecCCCCCCceeEEEEeeeccccccCCCceec
Q 019504          105 NFEGKLVGADRAKDLAVLKIEASE--DLLKPINVGQSSFLKVGQQCLAIGNPFGFDHTLTVGVISGLNRDIFSQAGVTIG  182 (340)
Q Consensus       105 ~~~a~v~~~d~~~DlAlL~v~~~~--~~~~~~~l~~~~~~~~G~~v~~iG~p~g~~~~~~~G~vs~~~~~~~~~~~~~~~  182 (340)
                      .+.+++...-.+.|+|++.++..+  ....|+.+.+  -+...+-++++|   +....++.|.|+......+......+ 
T Consensus       125 k~~~~v~~~~~~cd~Avv~Ie~~~f~~~~~~~e~~~--ip~l~~S~~Vv~---gd~i~VTnghV~~~~~~~y~~~~~~l-  198 (473)
T KOG1320|consen  125 KYKAFVAAVFEECDLAVVYIESEEFWKGMNPFELGD--IPSLNGSGFVVG---GDGIIVTNGHVVRVEPRIYAHSSTVL-  198 (473)
T ss_pred             hhhhhHHHhhhcccceEEEEeeccccCCCcccccCC--CcccCccEEEEc---CCcEEEEeeEEEEEEeccccCCCcce-
Confidence            788888888899999999999743  2334455543  345568899998   77789999999988765544432221 


Q ss_pred             ceEEEeeccCCCCccceeecCCCcEEEEEeeeeeCCCCcCceEEEEehHhHHHHHHHHHHcCce-eeeeeeEE---eccH
Q 019504          183 GGIQTDAAINPGNSGGPLLDSKGNLIGINTAIITQTGTSAGVGFAIPSSTVLKIVPQLIQYGKV-VRAGLNVD---IAPD  258 (340)
Q Consensus       183 ~~i~~d~~i~~G~SGGPl~d~~G~VVGi~~~~~~~~~~~~~~~~aip~~~i~~~l~~l~~~~~~-~~~~lg~~---~~~~  258 (340)
                      ..+++|+...+|+||+|.+...+++.|+.....+..   ..+.+.+|.-.+.++.......+.. .++.++..   +.+.
T Consensus       199 ~~vqi~aa~~~~~s~ep~i~g~d~~~gvA~l~ik~~---~~i~~~i~~~~~~~~~~G~~~~a~~~~f~~~nt~t~g~vs~  275 (473)
T KOG1320|consen  199 LRVQIDAAIGPGNSGEPVIVGVDKVAGVAFLKIKTP---ENILYVIPLGVSSHFRTGVEVSAIGNGFGLLNTLTQGMVSG  275 (473)
T ss_pred             eeEEEEEeecCCccCCCeEEccccccceEEEEEecC---CcccceeecceeeeecccceeeccccCceeeeeeeeccccc
Confidence            348999999999999999987799999998877543   2678889988887776554443322 12222222   2234


Q ss_pred             HHHhhcCCC--CCcEEEeeCCCChhhhcCCCccccCCCCCCcCCcEEEEECCEEcc
Q 019504          259 LVASQLNVG--NGALVLQVPGNSLAAKAGILPTTRGFAGNIILGDIIVAVNNKPVS  312 (340)
Q Consensus       259 ~~~~~~~~~--~g~~V~~v~~~spa~~~gl~~~~~~~~~~l~~GDvi~~i~g~~v~  312 (340)
                      .+++.+.+.  .|+.+.++.+-+.|-+.|            +.||+|+++||..|.
T Consensus       276 ~~R~~~~lg~~~g~~i~~~~qtd~ai~~~------------nsg~~ll~~DG~~Ig  319 (473)
T KOG1320|consen  276 QLRKSFKLGLETGVLISKINQTDAAINPG------------NSGGPLLNLDGEVIG  319 (473)
T ss_pred             ccccccccCcccceeeeeecccchhhhcc------------cCCCcEEEecCcEee
Confidence            555555555  569999999988777664            899999999999996


No 12 
>PF00089 Trypsin:  Trypsin;  InterPro: IPR001254 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base. The geometric orientations of the catalytic residues are similar between families, despite different protein folds. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC). This group of serine proteases belongs to the MEROPS peptidase family S1 (chymotrypsin family, clan PA(S)) and to peptidase family S6 (Hap serine peptidases). The chymotrypsin family is almost totally confined to animals, although trypsin-like enzymes are found in actinomycetes of the genera Streptomyces and Saccharopolyspora, and in the fungus Fusarium oxysporum. The enzymes are inherently secreted, being synthesised with a signal peptide that targets them to the secretory pathway. Animal enzymes are either secreted directly, packaged into vesicles for regulated secretion, or are retained in leukocyte granules. The Hap family, 'Haemophilus adhesion and penetration', are proteins that play a role in the interaction with human epithelial cells. The serine protease activity is localized at the N-terminal domain, whereas the binding domain is in the C-terminal region. 
; GO: 0004252 serine-type endopeptidase activity, 0006508 proteolysis; PDB: 1SPJ_A 1A5I_A 2ZGH_A 2ZKS_A 2ZGJ_A 2ZGC_A 2ODP_A 2I6Q_A 2I6S_A 2ODQ_A ....
Probab=99.40  E-value=1.3e-11  Score=107.51  Aligned_cols=167  Identities=23%  Similarity=0.315  Sum_probs=105.3

Q ss_pred             ceEEEEEcCCCEEEeCccccCCCCCCCCCCCccEEEEEEEe-------cCCceeEEEEEEEEe----CC---CCcEEEEE
Q 019504           58 NGSGVVWDGKGHIVTNFHVIGSALSRKPAEGQVVARVNILA-------SDGVQKNFEGKLVGA----DR---AKDLAVLK  123 (340)
Q Consensus        58 ~GsGfiI~~~G~IlT~~Hvv~~~~~~~~~~~~~~~~i~v~~-------~~g~~~~~~a~v~~~----d~---~~DlAlL~  123 (340)
                      .++|++|+++ +|||++||+....           .+.+.+       .++....+...-+..    +.   ..|+|||+
T Consensus        26 ~C~G~li~~~-~vLTaahC~~~~~-----------~~~v~~g~~~~~~~~~~~~~~~v~~~~~h~~~~~~~~~~DiAll~   93 (220)
T PF00089_consen   26 FCTGTLISPR-WVLTAAHCVDGAS-----------DIKVRLGTYSIRNSDGSEQTIKVSKIIIHPKYDPSTYDNDIALLK   93 (220)
T ss_dssp             EEEEEEEETT-EEEEEGGGHTSGG-----------SEEEEESESBTTSTTTTSEEEEEEEEEEETTSBTTTTTTSEEEEE
T ss_pred             eEeEEecccc-ccccccccccccc-----------ccccccccccccccccccccccccccccccccccccccccccccc
Confidence            4999999987 9999999999821           222222       222112333333322    22   56999999


Q ss_pred             EecC---CCCccceeecCC-CCCCCCCEEEEEecCCCCCC----ceeEEEEeeecccc-cc-CCCceecceEEEee----
Q 019504          124 IEAS---EDLLKPINVGQS-SFLKVGQQCLAIGNPFGFDH----TLTVGVISGLNRDI-FS-QAGVTIGGGIQTDA----  189 (340)
Q Consensus       124 v~~~---~~~~~~~~l~~~-~~~~~G~~v~~iG~p~g~~~----~~~~G~vs~~~~~~-~~-~~~~~~~~~i~~d~----  189 (340)
                      ++.+   ...+.++.+... ..++.|+.+.++||+.....    .+....+..+.... .. .........+....    
T Consensus        94 L~~~~~~~~~~~~~~l~~~~~~~~~~~~~~~~G~~~~~~~~~~~~~~~~~~~~~~~~~c~~~~~~~~~~~~~c~~~~~~~  173 (220)
T PF00089_consen   94 LDRPITFGDNIQPICLPSAGSDPNVGTSCIVVGWGRTSDNGYSSNLQSVTVPVVSRKTCRSSYNDNLTPNMICAGSSGSG  173 (220)
T ss_dssp             ESSSSEHBSSBEESBBTSTTHTTTTTSEEEEEESSBSSTTSBTSBEEEEEEEEEEHHHHHHHTTTTSTTTEEEEETTSSS
T ss_pred             cccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccc
Confidence            9987   345677777652 34578999999999875322    34444443332211 00 00001113344444    


Q ss_pred             ccCCCCccceeecCCCcEEEEEeeeeeCCCCcCceEEEEehHhHHHHH
Q 019504          190 AINPGNSGGPLLDSKGNLIGINTAIITQTGTSAGVGFAIPSSTVLKIV  237 (340)
Q Consensus       190 ~i~~G~SGGPl~d~~G~VVGi~~~~~~~~~~~~~~~~aip~~~i~~~l  237 (340)
                      ..+.|+|||||++.++.|+||++.. ..++......++.+++.+++|+
T Consensus       174 ~~~~g~sG~pl~~~~~~lvGI~s~~-~~c~~~~~~~v~~~v~~~~~WI  220 (220)
T PF00089_consen  174 DACQGDSGGPLICNNNYLVGIVSFG-ENCGSPNYPGVYTRVSSYLDWI  220 (220)
T ss_dssp             BGGTTTTTSEEEETTEEEEEEEEEE-SSSSBTTSEEEEEEGGGGHHHH
T ss_pred             cccccccccccccceeeecceeeec-CCCCCCCcCEEEEEHHHhhccC
Confidence            7889999999998766799999987 3343333468889999888875


No 13 
>cd00190 Tryp_SPc Trypsin-like serine protease; Many of these are synthesized as inactive precursor zymogens that are cleaved during limited proteolysis to generate their active forms. The alignment also contains inactive enzymes that have substitutions of the catalytic triad residues.
Probab=99.27  E-value=1.3e-10  Score=101.94  Aligned_cols=172  Identities=19%  Similarity=0.189  Sum_probs=98.7

Q ss_pred             CceEEEEEcCCCEEEeCccccCCCCCCCCCCCccEEEEEEEecCC-------ceeEEEEEEEEeC-------CCCcEEEE
Q 019504           57 GNGSGVVWDGKGHIVTNFHVIGSALSRKPAEGQVVARVNILASDG-------VQKNFEGKLVGAD-------RAKDLAVL  122 (340)
Q Consensus        57 ~~GsGfiI~~~G~IlT~~Hvv~~~~~~~~~~~~~~~~i~v~~~~g-------~~~~~~a~v~~~d-------~~~DlAlL  122 (340)
                      ...+|.+|+++ +|||+|||+.....         ..+.+.+...       ....+..+-+..+       ...|||||
T Consensus        25 ~~C~GtlIs~~-~VLTaAhC~~~~~~---------~~~~v~~g~~~~~~~~~~~~~~~v~~~~~hp~y~~~~~~~DiAll   94 (232)
T cd00190          25 HFCGGSLISPR-WVLTAAHCVYSSAP---------SNYTVRLGSHDLSSNEGGGQVIKVKKVIVHPNYNPSTYDNDIALL   94 (232)
T ss_pred             EEEEEEEeeCC-EEEECHHhcCCCCC---------ccEEEEeCcccccCCCCceEEEEEEEEEECCCCCCCCCcCCEEEE
Confidence            35899999977 99999999987421         1333333211       1112333333333       35799999


Q ss_pred             EEecCC---CCccceeecCCC-CCCCCCEEEEEecCCCCCC-----ceeEEEEeeecccc---ccCC-CceecceEE---
Q 019504          123 KIEASE---DLLKPINVGQSS-FLKVGQQCLAIGNPFGFDH-----TLTVGVISGLNRDI---FSQA-GVTIGGGIQ---  186 (340)
Q Consensus       123 ~v~~~~---~~~~~~~l~~~~-~~~~G~~v~~iG~p~g~~~-----~~~~G~vs~~~~~~---~~~~-~~~~~~~i~---  186 (340)
                      +++.+.   ..+.|+.|.... .+..|+.+.+.||......     ......+.-+....   .... .......+-   
T Consensus        95 ~L~~~~~~~~~v~picl~~~~~~~~~~~~~~~~G~g~~~~~~~~~~~~~~~~~~~~~~~~C~~~~~~~~~~~~~~~C~~~  174 (232)
T cd00190          95 KLKRPVTLSDNVRPICLPSSGYNLPAGTTCTVSGWGRTSEGGPLPDVLQEVNVPIVSNAECKRAYSYGGTITDNMLCAGG  174 (232)
T ss_pred             EECCcccCCCcccceECCCccccCCCCCEEEEEeCCcCCCCCCCCceeeEEEeeeECHHHhhhhccCcccCCCceEeeCC
Confidence            998763   236788886553 5678999999998654321     12222222221100   0000 000001111   


Q ss_pred             --EeeccCCCCccceeecCC---CcEEEEEeeeeeCCCCcCceEEEEehHhHHHHHHH
Q 019504          187 --TDAAINPGNSGGPLLDSK---GNLIGINTAIITQTGTSAGVGFAIPSSTVLKIVPQ  239 (340)
Q Consensus       187 --~d~~i~~G~SGGPl~d~~---G~VVGi~~~~~~~~~~~~~~~~aip~~~i~~~l~~  239 (340)
                        .+...+.|+|||||+...   +.++||.++... ++.......+..+...++|+++
T Consensus       175 ~~~~~~~c~gdsGgpl~~~~~~~~~lvGI~s~g~~-c~~~~~~~~~t~v~~~~~WI~~  231 (232)
T cd00190         175 LEGGKDACQGDSGGPLVCNDNGRGVLVGIVSWGSG-CARPNYPGVYTRVSSYLDWIQK  231 (232)
T ss_pred             CCCCCccccCCCCCcEEEEeCCEEEEEEEEehhhc-cCCCCCCCEEEEcHHhhHHhhc
Confidence              134577899999999764   789999998654 3321233445666777777653


No 14 
>PF13180 PDZ_2:  PDZ domain; PDB: 2L97_A 1Y8T_A 2Z9I_A 1LCY_A 2PZD_B 2P3W_A 1VCW_C 1TE0_B 1SOZ_C 1SOT_C ....
Probab=99.18  E-value=1.2e-11  Score=91.06  Aligned_cols=72  Identities=36%  Similarity=0.360  Sum_probs=55.8

Q ss_pred             eeeeEEeccHHHHhhcCCCCCcEEEeeCCCChhhhcCCCccccCCCCCCcCCcEEEEECCEEccCCCCCCCceeEE-Eee
Q 019504          249 AGLNVDIAPDLVASQLNVGNGALVLQVPGNSLAAKAGILPTTRGFAGNIILGDIIVAVNNKPVSFSCLSIPSRIYL-ICA  327 (340)
Q Consensus       249 ~~lg~~~~~~~~~~~~~~~~g~~V~~v~~~spa~~~gl~~~~~~~~~~l~~GDvi~~i~g~~v~~~~d~~~~~~~~-~~~  327 (340)
                      +|||+.+....-      ..|++|.+|.++|||+++||           ++||+|++|||++|++..|   +..++ ...
T Consensus         1 ~~lGv~~~~~~~------~~g~~V~~V~~~spA~~aGl-----------~~GD~I~~ing~~v~~~~~---~~~~l~~~~   60 (82)
T PF13180_consen    1 GGLGVTVQNLSD------TGGVVVVSVIPGSPAAKAGL-----------QPGDIILAINGKPVNSSED---LVNILSKGK   60 (82)
T ss_dssp             -E-SEEEEECSC------SSSEEEEEESTTSHHHHTTS------------TTEEEEEETTEESSSHHH---HHHHHHCSS
T ss_pred             CEECeEEEEccC------CCeEEEEEeCCCCcHHHCCC-----------CCCcEEEEECCEEcCCHHH---HHHHHHhCC
Confidence            477777754321      36999999999999999999           9999999999999999998   66666 345


Q ss_pred             CCCCceEEEEeCC
Q 019504          328 EPNQDHLTCLKSS  340 (340)
Q Consensus       328 ~~~~~~~~~~~~~  340 (340)
                      .....+++++|.+
T Consensus        61 ~g~~v~l~v~R~g   73 (82)
T PF13180_consen   61 PGDTVTLTVLRDG   73 (82)
T ss_dssp             TTSEEEEEEEETT
T ss_pred             CCCEEEEEEEECC
Confidence            5566788888864


No 15 
>smart00020 Tryp_SPc Trypsin-like serine protease. Many of these are synthesised as inactive precursor zymogens that are cleaved during limited proteolysis to generate their active forms. A few, however, are active as single chain molecules, and others are inactive due to substitutions of the catalytic triad residues.
Probab=99.16  E-value=1.2e-09  Score=95.98  Aligned_cols=164  Identities=19%  Similarity=0.202  Sum_probs=92.3

Q ss_pred             CceEEEEEcCCCEEEeCccccCCCCCCCCCCCccEEEEEEEecCCce------eEEEEEEEEe-------CCCCcEEEEE
Q 019504           57 GNGSGVVWDGKGHIVTNFHVIGSALSRKPAEGQVVARVNILASDGVQ------KNFEGKLVGA-------DRAKDLAVLK  123 (340)
Q Consensus        57 ~~GsGfiI~~~G~IlT~~Hvv~~~~~~~~~~~~~~~~i~v~~~~g~~------~~~~a~v~~~-------d~~~DlAlL~  123 (340)
                      ...+|.+|+++ +|||+|||+.....         ..+.|.+.....      ......-+..       ....|+|||+
T Consensus        26 ~~C~GtlIs~~-~VLTaahC~~~~~~---------~~~~v~~g~~~~~~~~~~~~~~v~~~~~~p~~~~~~~~~DiAll~   95 (229)
T smart00020       26 HFCGGSLISPR-WVLTAAHCVYGSDP---------SNIRVRLGSHDLSSGEEGQVIKVSKVIIHPNYNPSTYDNDIALLK   95 (229)
T ss_pred             cEEEEEEecCC-EEEECHHHcCCCCC---------cceEEEeCcccCCCCCCceEEeeEEEEECCCCCCCCCcCCEEEEE
Confidence            35899999977 99999999987530         134444432210      1233333332       2457999999


Q ss_pred             EecCC---CCccceeecCC-CCCCCCCEEEEEecCCCCC------CceeEEEEeeecccccc---CCCceec-ceEE---
Q 019504          124 IEASE---DLLKPINVGQS-SFLKVGQQCLAIGNPFGFD------HTLTVGVISGLNRDIFS---QAGVTIG-GGIQ---  186 (340)
Q Consensus       124 v~~~~---~~~~~~~l~~~-~~~~~G~~v~~iG~p~g~~------~~~~~G~vs~~~~~~~~---~~~~~~~-~~i~---  186 (340)
                      ++.+.   ..+.|+.|... ..+..+..+.+.||+....      .......+..+....-.   .....+. ..+-   
T Consensus        96 L~~~i~~~~~~~pi~l~~~~~~~~~~~~~~~~g~g~~~~~~~~~~~~~~~~~~~~~~~~~C~~~~~~~~~~~~~~~C~~~  175 (229)
T smart00020       96 LKSPVTLSDNVRPICLPSSNYNVPAGTTCTVSGWGRTSEGAGSLPDTLQEVNVPIVSNATCRRAYSGGGAITDNMLCAGG  175 (229)
T ss_pred             ECcccCCCCceeeccCCCcccccCCCCEEEEEeCCCCCCCCCcCCCEeeEEEEEEeCHHHhhhhhccccccCCCcEeecC
Confidence            98762   34677777653 2566789999999875532      12222222222210000   0000000 0111   


Q ss_pred             --EeeccCCCCccceeecCCC--cEEEEEeeeeeCCCCcCceEEEEehH
Q 019504          187 --TDAAINPGNSGGPLLDSKG--NLIGINTAIITQTGTSAGVGFAIPSS  231 (340)
Q Consensus       187 --~d~~i~~G~SGGPl~d~~G--~VVGi~~~~~~~~~~~~~~~~aip~~  231 (340)
                        .....++|+||||++...+  .++||.+... .+........+..+.
T Consensus       176 ~~~~~~~c~gdsG~pl~~~~~~~~l~Gi~s~g~-~C~~~~~~~~~~~i~  223 (229)
T smart00020      176 LEGGKDACQGDSGGPLVCNDGRWVLVGIVSWGS-GCARPGKPGVYTRVS  223 (229)
T ss_pred             CCCCCcccCCCCCCeeEEECCCEEEEEEEEECC-CCCCCCCCCEEEEec
Confidence              1355788999999997543  8999999875 333222333444444


No 16 
>COG3591 V8-like Glu-specific endopeptidase [Amino acid transport and metabolism]
Probab=98.87  E-value=6.7e-08  Score=84.80  Aligned_cols=195  Identities=19%  Similarity=0.185  Sum_probs=105.2

Q ss_pred             HHHhCCCeEEEEeeeeccccccCCcccc-cCCceEEEEEcCCCEEEeCccccCCCCCCCCCCCccEEEEEEE----ecCC
Q 019504           27 FEKNTYSVVNIFDVTLRPTLNVTGLVEI-PEGNGSGVVWDGKGHIVTNFHVIGSALSRKPAEGQVVARVNIL----ASDG  101 (340)
Q Consensus        27 ~~~~~~svV~I~~~~~~~~~~~~~~~~~-~~~~GsGfiI~~~G~IlT~~Hvv~~~~~~~~~~~~~~~~i~v~----~~~g  101 (340)
                      .......+.+|......+-......... .....++|+|+++ .+||++||+....... .      ++.+.    -.++
T Consensus        33 ~~~~~d~r~~V~dt~~~Py~av~~~~~~tG~~~~~~~lI~pn-tvLTa~Hc~~s~~~G~-~------~~~~~p~g~~~~~  104 (251)
T COG3591          33 TASAEDDRTQVTDTTQFPYSAVVQFEAATGRLCTAATLIGPN-TVLTAGHCIYSPDYGE-D------DIAAAPPGVNSDG  104 (251)
T ss_pred             hhcCCCCeeecccCCCCCcceeEEeecCCCcceeeEEEEcCc-eEEEeeeEEecCCCCh-h------hhhhcCCcccCCC
Confidence            3344466777754443332221111000 0111245999987 9999999998765321 0      11111    1111


Q ss_pred             c-eeEEEEEEEEeC-C---CCcEEEEEEecCC--------CCccceeecCCCCCCCCCEEEEEecCCCCCCce----eEE
Q 019504          102 V-QKNFEGKLVGAD-R---AKDLAVLKIEASE--------DLLKPINVGQSSFLKVGQQCLAIGNPFGFDHTL----TVG  164 (340)
Q Consensus       102 ~-~~~~~a~v~~~d-~---~~DlAlL~v~~~~--------~~~~~~~l~~~~~~~~G~~v~~iG~p~g~~~~~----~~G  164 (340)
                      . ...+........ .   ..|.+...+....        .......+......+.++.+.++|||.+.....    ..+
T Consensus       105 ~~~~~~~~~~~~~~~g~~~~~d~~~~~v~~~~~~~g~~~~~~~~~~~~~~~~~~~~~d~i~v~GYP~dk~~~~~~~e~t~  184 (251)
T COG3591         105 GPFYGITKIEIRVYPGELYKEDGASYDVGEAALESGINIGDVVNYLKRNTASEAKANDRITVIGYPGDKPNIGTMWESTG  184 (251)
T ss_pred             CCCCceeeEEEEecCCceeccCCceeeccHHHhccCCCccccccccccccccccccCceeEEEeccCCCCcceeEeeecc
Confidence            1 111221111111 2   3455555543221        112223333445668899999999997754222    222


Q ss_pred             EEeeeccccccCCCceecceEEEeeccCCCCccceeecCCCcEEEEEeeeeeCCCCcCceEEEE-ehHhHHHHHHHHH
Q 019504          165 VISGLNRDIFSQAGVTIGGGIQTDAAINPGNSGGPLLDSKGNLIGINTAIITQTGTSAGVGFAI-PSSTVLKIVPQLI  241 (340)
Q Consensus       165 ~vs~~~~~~~~~~~~~~~~~i~~d~~i~~G~SGGPl~d~~G~VVGi~~~~~~~~~~~~~~~~ai-p~~~i~~~l~~l~  241 (340)
                      .|..+.           ...+++++.+++|+||+|+++.+.+|||++.......++ ...++++ -...++++++++.
T Consensus       185 ~v~~~~-----------~~~l~y~~dT~pG~SGSpv~~~~~~vigv~~~g~~~~~~-~~~n~~vr~t~~~~~~I~~~~  250 (251)
T COG3591         185 KVNSIK-----------GNKLFYDADTLPGSSGSPVLISKDEVIGVHYNGPGANGG-SLANNAVRLTPEILNFIQQNI  250 (251)
T ss_pred             eeEEEe-----------cceEEEEecccCCCCCCceEecCceEEEEEecCCCcccc-cccCcceEecHHHHHHHHHhh
Confidence            222221           124899999999999999999988999999987654332 3344433 3466788877764


No 17 
>cd00991 PDZ_archaeal_metalloprotease PDZ domain of archaeal zinc metalloproteases, presumably membrane-associated or integral membrane proteases, which may be involved in signalling and regulatory mechanisms. May be responsible for substrate recognition and/or binding, as most PDZ domains bind C-terminal polypeptides, and binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of protease-associated PDZ domains a C-terminal beta-strand forms the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins.
Probab=98.81  E-value=3.7e-09  Score=77.17  Aligned_cols=60  Identities=23%  Similarity=0.175  Sum_probs=49.6

Q ss_pred             CCCcEEEeeCCCChhhhcCCCccccCCCCCCcCCcEEEEECCEEccCCCCCCCceeEEEeeC-CCCceEEEEeCC
Q 019504          267 GNGALVLQVPGNSLAAKAGILPTTRGFAGNIILGDIIVAVNNKPVSFSCLSIPSRIYLICAE-PNQDHLTCLKSS  340 (340)
Q Consensus       267 ~~g~~V~~v~~~spa~~~gl~~~~~~~~~~l~~GDvi~~i~g~~v~~~~d~~~~~~~~~~~~-~~~~~~~~~~~~  340 (340)
                      ..|++|.+|.++|||+++||           ++||+|++|||++|.+..|   +...+.... .....+++.|++
T Consensus         9 ~~Gv~V~~V~~~spa~~aGL-----------~~GDiI~~Ing~~v~~~~d---~~~~l~~~~~g~~v~l~v~r~g   69 (79)
T cd00991           9 VAGVVIVGVIVGSPAENAVL-----------HTGDVIYSINGTPITTLED---FMEALKPTKPGEVITVTVLPST   69 (79)
T ss_pred             CCcEEEEEECCCChHHhcCC-----------CCCCEEEEECCEEcCCHHH---HHHHHhcCCCCCEEEEEEEECC
Confidence            37999999999999999999           9999999999999999888   666665542 344677777753


No 18 
>PF00863 Peptidase_C4:  Peptidase family C4 This family belongs to family C4 of the peptidase classification.;  InterPro: IPR001730 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. 
These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad.  Nuclear inclusion A (NIA) proteases from potyviruses are cysteine peptidases belonging to the MEROPS peptidase family C4 (NIa protease family, clan PA(C)).  Potyviruses include plant viruses in which the single-stranded RNA encodes a polyprotein with NIA protease activity, where proteolytic cleavage is specific for Gln+Gly sites. The NIA protease acts on the polyprotein, releasing itself by Gln+Gly cleavage at both the N- and C-termini. It further processes the polyprotein by cleavage at five similar sites in the C-terminal half of the sequence. In addition to its C-terminal protease activity, the NIA protease contains an N-terminal domain that has been implicated in the transcription process. This peptidase is present in the nuclear inclusion protein of potyviruses.; GO: 0008234 cysteine-type peptidase activity, 0006508 proteolysis; PDB: 3MMG_B 1Q31_B 1LVB_A 1LVM_A.
Probab=98.80  E-value=7.8e-08  Score=83.67  Aligned_cols=167  Identities=14%  Similarity=0.195  Sum_probs=89.1

Q ss_pred             HHHhCCCeEEEEeeeeccccccCCcccccCCceEEEEEcCCCEEEeCccccCCCCCCCCCCCccEEEEEEEecCCceeEE
Q 019504           27 FEKNTYSVVNIFDVTLRPTLNVTGLVEIPEGNGSGVVWDGKGHIVTNFHVIGSALSRKPAEGQVVARVNILASDGVQKNF  106 (340)
Q Consensus        27 ~~~~~~svV~I~~~~~~~~~~~~~~~~~~~~~GsGfiI~~~G~IlT~~Hvv~~~~~~~~~~~~~~~~i~v~~~~g~~~~~  106 (340)
                      +.-+...|++|.......           ...=-||... + ||+|++|......          ..+.+....|. ..+
T Consensus        13 yn~Ia~~ic~l~n~s~~~-----------~~~l~gigyG-~-~iItn~HLf~~nn----------g~L~i~s~hG~-f~v   68 (235)
T PF00863_consen   13 YNPIASNICRLTNESDGG-----------TRSLYGIGYG-S-YIITNAHLFKRNN----------GELTIKSQHGE-FTV   68 (235)
T ss_dssp             -HHHHTTEEEEEEEETTE-----------EEEEEEEEET-T-EEEEEGGGGSSTT----------CEEEEEETTEE-EEE
T ss_pred             cchhhheEEEEEEEeCCC-----------eEEEEEEeEC-C-EEEEChhhhccCC----------CeEEEEeCceE-EEc
Confidence            445667788886321111           1123477776 3 9999999998766          37888888774 222


Q ss_pred             E---EEEEEeCCCCcEEEEEEecCCCCccceee-cCCCCCCCCCEEEEEecCCCCCCceeEEEEeeeccccccCCCceec
Q 019504          107 E---GKLVGADRAKDLAVLKIEASEDLLKPINV-GQSSFLKVGQQCLAIGNPFGFDHTLTVGVISGLNRDIFSQAGVTIG  182 (340)
Q Consensus       107 ~---a~v~~~d~~~DlAlL~v~~~~~~~~~~~l-~~~~~~~~G~~v~~iG~p~g~~~~~~~G~vs~~~~~~~~~~~~~~~  182 (340)
                      +   .--+..=+..||.++|++.+   ++|.+- .....++.+|+|.++|.-+..  ......||...........    
T Consensus        69 ~nt~~lkv~~i~~~DiviirmPkD---fpPf~~kl~FR~P~~~e~v~mVg~~fq~--k~~~s~vSesS~i~p~~~~----  139 (235)
T PF00863_consen   69 PNTTQLKVHPIEGRDIVIIRMPKD---FPPFPQKLKFRAPKEGERVCMVGSNFQE--KSISSTVSESSWIYPEENS----  139 (235)
T ss_dssp             CEGGGSEEEE-TCSSEEEEE--TT---S----S---B----TT-EEEEEEEECSS--CCCEEEEEEEEEEEEETTT----
T ss_pred             CCccccceEEeCCccEEEEeCCcc---cCCcchhhhccCCCCCCEEEEEEEEEEc--CCeeEEECCceEEeecCCC----
Confidence            1   11233346789999999875   333321 123467899999999964432  2233344443332322222    


Q ss_pred             ceEEEeeccCCCCccceeecC-CCcEEEEEeeeeeCCCCcCceEEEEeh
Q 019504          183 GGIQTDAAINPGNSGGPLLDS-KGNLIGINTAIITQTGTSAGVGFAIPS  230 (340)
Q Consensus       183 ~~i~~d~~i~~G~SGGPl~d~-~G~VVGi~~~~~~~~~~~~~~~~aip~  230 (340)
                      .+..+-.....|+=|.|+++. +|++|||++.....    ...||+.|+
T Consensus       140 ~fWkHwIsTk~G~CG~PlVs~~Dg~IVGiHsl~~~~----~~~N~F~~f  184 (235)
T PF00863_consen  140 HFWKHWISTKDGDCGLPLVSTKDGKIVGIHSLTSNT----SSRNYFTPF  184 (235)
T ss_dssp             TEEEE-C---TT-TT-EEEETTT--EEEEEEEEETT----TSSEEEEE-
T ss_pred             CeeEEEecCCCCccCCcEEEcCCCcEEEEEcCccCC----CCeEEEEcC
Confidence            345666666789999999987 89999999976543    345677776


No 19 
>TIGR01713 typeII_sec_gspC general secretion pathway protein C. This model represents GspC, protein C of the main terminal branch of the general secretion pathway, also called type II secretion. This system transports folded proteins across the bacterial outer membrane and is widely distributed in Gram-negative pathogens.
Probab=98.78  E-value=5.5e-09  Score=93.38  Aligned_cols=92  Identities=17%  Similarity=0.054  Sum_probs=74.4

Q ss_pred             hHhHHHHHHHHHHcCceeeeeeeEEeccHHHHhhcCCCCCcEEEeeCCCChhhhcCCCccccCCCCCCcCCcEEEEECCE
Q 019504          230 SSTVLKIVPQLIQYGKVVRAGLNVDIAPDLVASQLNVGNGALVLQVPGNSLAAKAGILPTTRGFAGNIILGDIIVAVNNK  309 (340)
Q Consensus       230 ~~~i~~~l~~l~~~~~~~~~~lg~~~~~~~~~~~~~~~~g~~V~~v~~~spa~~~gl~~~~~~~~~~l~~GDvi~~i~g~  309 (340)
                      ...++++++++++.++..+.++|+......     +-..|++|..+.+++||+++||           ++||+|++|||+
T Consensus       158 ~~~~~~v~~~l~~~g~~~~~~lgi~p~~~~-----g~~~G~~v~~v~~~s~a~~aGL-----------r~GDvIv~ING~  221 (259)
T TIGR01713       158 IVVSRRIIEELTKDPQKMFDYIRLSPVMKN-----DKLEGYRLNPGKDPSLFYKSGL-----------QDGDIAVALNGL  221 (259)
T ss_pred             hhhHHHHHHHHHHCHHhhhheEeEEEEEeC-----CceeEEEEEecCCCCHHHHcCC-----------CCCCEEEEECCE
Confidence            346788999999999998999998864321     3337999999999999999999           999999999999


Q ss_pred             EccCCCCCCCceeEEEeeCC-CCceEEEEeCC
Q 019504          310 PVSFSCLSIPSRIYLICAEP-NQDHLTCLKSS  340 (340)
Q Consensus       310 ~v~~~~d~~~~~~~~~~~~~-~~~~~~~~~~~  340 (340)
                      ++++.++   +...+....+ ...+++|.|++
T Consensus       222 ~i~~~~~---~~~~l~~~~~~~~v~l~V~R~G  250 (259)
T TIGR01713       222 DLRDPEQ---AFQALQMLREETNLTLTVERDG  250 (259)
T ss_pred             EcCCHHH---HHHHHHhcCCCCeEEEEEEECC
Confidence            9999988   5555555433 46789999874


No 20 
>cd00987 PDZ_serine_protease PDZ domain of trypsin-like serine proteases, such as DegP/HtrA, which are oligomeric proteins involved in heat-shock response, chaperone function, and apoptosis. May be responsible for substrate recognition and/or binding, as most PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of protease-associated PDZ domains a C-terminal beta-strand forms the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins.
Probab=98.76  E-value=1.2e-08  Score=76.16  Aligned_cols=77  Identities=36%  Similarity=0.401  Sum_probs=55.7

Q ss_pred             eeeEEec--cHHHHhhcCCC--CCcEEEeeCCCChhhhcCCCccccCCCCCCcCCcEEEEECCEEccCCCCCCCceeEEE
Q 019504          250 GLNVDIA--PDLVASQLNVG--NGALVLQVPGNSLAAKAGILPTTRGFAGNIILGDIIVAVNNKPVSFSCLSIPSRIYLI  325 (340)
Q Consensus       250 ~lg~~~~--~~~~~~~~~~~--~g~~V~~v~~~spa~~~gl~~~~~~~~~~l~~GDvi~~i~g~~v~~~~d~~~~~~~~~  325 (340)
                      |+|+.+.  +....+.++++  .|++|.+|.++|||+++||           ++||+|++|||+++.+..|   +...+.
T Consensus         2 ~~G~~~~~~~~~~~~~~~~~~~~g~~V~~v~~~s~a~~~gl-----------~~GD~I~~Ing~~i~~~~~---~~~~l~   67 (90)
T cd00987           2 WLGVTVQDLTPDLAEELGLKDTKGVLVASVDPGSPAAKAGL-----------KPGDVILAVNGKPVKSVAD---LRRALA   67 (90)
T ss_pred             ccceEEeECCHHHHHHcCCCCCCEEEEEEECCCCHHHHcCC-----------CcCCEEEEECCEECCCHHH---HHHHHH
Confidence            4565553  44444444443  6999999999999999999           9999999999999999887   444444


Q ss_pred             eeC-CCCceEEEEeCC
Q 019504          326 CAE-PNQDHLTCLKSS  340 (340)
Q Consensus       326 ~~~-~~~~~~~~~~~~  340 (340)
                      ... .....+++.|++
T Consensus        68 ~~~~~~~i~l~v~r~g   83 (90)
T cd00987          68 ELKPGDKVTLTVLRGG   83 (90)
T ss_pred             hcCCCCEEEEEEEECC
Confidence            332 344667777653


No 21 
>cd00989 PDZ_metalloprotease PDZ domain of bacterial and plant zinc metalloproteases, presumably membrane-associated or integral membrane proteases, which may be involved in signalling and regulatory mechanisms. May be responsible for substrate recognition and/or binding, as most PDZ domains bind C-terminal polypeptides, and binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of protease-associated PDZ domains a C-terminal beta-strand forms the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins.
Probab=98.72  E-value=1.1e-08  Score=74.42  Aligned_cols=58  Identities=24%  Similarity=0.112  Sum_probs=47.1

Q ss_pred             CCcEEEeeCCCChhhhcCCCccccCCCCCCcCCcEEEEECCEEccCCCCCCCceeEEEeeCCCCceEEEEeC
Q 019504          268 NGALVLQVPGNSLAAKAGILPTTRGFAGNIILGDIIVAVNNKPVSFSCLSIPSRIYLICAEPNQDHLTCLKS  339 (340)
Q Consensus       268 ~g~~V~~v~~~spa~~~gl~~~~~~~~~~l~~GDvi~~i~g~~v~~~~d~~~~~~~~~~~~~~~~~~~~~~~  339 (340)
                      ..+.|.+|.++|||+++||           ++||+|++|||+++.+..|   +...+.........+++.|+
T Consensus        12 ~~~~V~~v~~~s~a~~~gl-----------~~GD~I~~ing~~i~~~~~---~~~~l~~~~~~~~~l~v~r~   69 (79)
T cd00989          12 IEPVIGEVVPGSPAAKAGL-----------KAGDRILAINGQKIKSWED---LVDAVQENPGKPLTLTVERN   69 (79)
T ss_pred             cCcEEEeECCCCHHHHcCC-----------CCCCEEEEECCEECCCHHH---HHHHHHHCCCceEEEEEEEC
Confidence            3589999999999999999           9999999999999999887   55555444334567777775


No 22 
>cd00990 PDZ_glycyl_aminopeptidase PDZ domain associated with archaeal and bacterial M61 glycyl-aminopeptidases. May be responsible for substrate recognition and/or binding, as most PDZ domains bind C-terminal polypeptides, and binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of protease-associated PDZ domains a C-terminal beta-strand is presumed to form the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins.
Probab=98.71  E-value=1.6e-08  Score=73.81  Aligned_cols=67  Identities=25%  Similarity=0.224  Sum_probs=48.7

Q ss_pred             eeeeEEeccHHHHhhcCCCCCcEEEeeCCCChhhhcCCCccccCCCCCCcCCcEEEEECCEEccCCCCCCCceeEEEeeC
Q 019504          249 AGLNVDIAPDLVASQLNVGNGALVLQVPGNSLAAKAGILPTTRGFAGNIILGDIIVAVNNKPVSFSCLSIPSRIYLICAE  328 (340)
Q Consensus       249 ~~lg~~~~~~~~~~~~~~~~g~~V~~v~~~spa~~~gl~~~~~~~~~~l~~GDvi~~i~g~~v~~~~d~~~~~~~~~~~~  328 (340)
                      +++|+.+...        ..|+.|.+|.++|||+++||           ++||+|++|||+++.++.+.+   ..+  ..
T Consensus         1 ~~~G~~~~~~--------~~~~~V~~V~~~s~a~~aGl-----------~~GD~I~~Ing~~v~~~~~~l---~~~--~~   56 (80)
T cd00990           1 PYLGLTLDKE--------EGLGKVTFVRDDSPADKAGL-----------VAGDELVAVNGWRVDALQDRL---KEY--QA   56 (80)
T ss_pred             CcccEEEEcc--------CCcEEEEEECCCChHHHhCC-----------CCCCEEEEECCEEhHHHHHHH---Hhc--CC
Confidence            3567766432        25799999999999999999           999999999999999865421   111  12


Q ss_pred             CCCceEEEEeC
Q 019504          329 PNQDHLTCLKS  339 (340)
Q Consensus       329 ~~~~~~~~~~~  339 (340)
                      .....+++.|+
T Consensus        57 ~~~v~l~v~r~   67 (80)
T cd00990          57 GDPVELTVFRD   67 (80)
T ss_pred             CCEEEEEEEEC
Confidence            23466677665


No 23 
>cd00988 PDZ_CTP_protease PDZ domain of C-terminal processing-, tail-specific-, and tricorn proteases, which function in posttranslational protein processing, maturation, and disassembly or degradation, in Bacteria, Archaea, and plant chloroplasts. May be responsible for substrate recognition and/or binding, as most PDZ domains bind C-terminal polypeptides, and binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of protease-associated PDZ domains a C-terminal beta-strand forms the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins.
Probab=98.70  E-value=2.3e-08  Score=73.90  Aligned_cols=69  Identities=32%  Similarity=0.336  Sum_probs=52.8

Q ss_pred             eeeeEEeccHHHHhhcCCCCCcEEEeeCCCChhhhcCCCccccCCCCCCcCCcEEEEECCEEccCC--CCCCCceeEEEe
Q 019504          249 AGLNVDIAPDLVASQLNVGNGALVLQVPGNSLAAKAGILPTTRGFAGNIILGDIIVAVNNKPVSFS--CLSIPSRIYLIC  326 (340)
Q Consensus       249 ~~lg~~~~~~~~~~~~~~~~g~~V~~v~~~spa~~~gl~~~~~~~~~~l~~GDvi~~i~g~~v~~~--~d~~~~~~~~~~  326 (340)
                      .+||+.+....        .++.|..|.++|||+++||           ++||+|++|||+++.+.  .|   +..++..
T Consensus         2 ~~lG~~~~~~~--------~~~~V~~v~~~s~a~~~gl-----------~~GD~I~~vng~~i~~~~~~~---~~~~l~~   59 (85)
T cd00988           2 GGIGLELKYDD--------GGLVITSVLPGSPAAKAGI-----------KAGDIIVAIDGEPVDGLSLED---VVKLLRG   59 (85)
T ss_pred             eEEEEEEEEcC--------CeEEEEEecCCCCHHHcCC-----------CCCCEEEEECCEEcCCCCHHH---HHHHhcC
Confidence            35777764322        6899999999999999999           99999999999999998  65   4445544


Q ss_pred             eCCCCceEEEEeC
Q 019504          327 AEPNQDHLTCLKS  339 (340)
Q Consensus       327 ~~~~~~~~~~~~~  339 (340)
                      .......+++.|+
T Consensus        60 ~~~~~i~l~v~r~   72 (85)
T cd00988          60 KAGTKVRLTLKRG   72 (85)
T ss_pred             CCCCEEEEEEEcC
Confidence            4434466777765


No 24 
>cd00136 PDZ PDZ domain, also called DHR (Dlg homologous region) or GLGF (after a conserved sequence motif). Many PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. Heterodimerization through PDZ-PDZ domain interactions adds to the domain's versatility, and PDZ domain-mediated interactions may be modulated dynamically through target phosphorylation. Some PDZ domains play a role in scaffolding supramolecular complexes. PDZ domains are found in diverse signaling proteins in bacteria, archaebacteria, and eukaryotes. This CD contains two distinct structural subgroups with either an N- or C-terminal beta-strand forming the peptide-binding groove base. The circular permutation placing the strand on the N-terminus appears to be found in Eumetazoa only, while the C-terminal variant is found in all three kingdoms of life, and seems to co-occur with protease domains. PDZ domains have been named after PSD95(pos
Probab=98.63  E-value=2.6e-08  Score=70.67  Aligned_cols=55  Identities=33%  Similarity=0.303  Sum_probs=43.6

Q ss_pred             CCcEEEeeCCCChhhhcCCCccccCCCCCCcCCcEEEEECCEEccCC--CCCCCceeEEEeeCCCCceEEE
Q 019504          268 NGALVLQVPGNSLAAKAGILPTTRGFAGNIILGDIIVAVNNKPVSFS--CLSIPSRIYLICAEPNQDHLTC  336 (340)
Q Consensus       268 ~g~~V~~v~~~spa~~~gl~~~~~~~~~~l~~GDvi~~i~g~~v~~~--~d~~~~~~~~~~~~~~~~~~~~  336 (340)
                      .+++|.+|.+++||+++||           ++||+|++|||+++.+.  .+   ...++........++++
T Consensus        13 ~~~~V~~v~~~s~a~~~gl-----------~~GD~I~~Ing~~v~~~~~~~---~~~~l~~~~g~~v~l~v   69 (70)
T cd00136          13 GGVVVLSVEPGSPAERAGL-----------QAGDVILAVNGTDVKNLTLED---VAELLKKEVGEKVTLTV   69 (70)
T ss_pred             CCEEEEEeCCCCHHHHcCC-----------CCCCEEEEECCEECCCCCHHH---HHHHHhhCCCCeEEEEE
Confidence            4999999999999999999           99999999999999998  65   55555554423344443


No 25 
>cd00986 PDZ_LON_protease PDZ domain of ATP-dependent LON serine proteases. Most PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this bacterial subfamily of protease-associated PDZ domains a C-terminal beta-strand  is thought to form the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins.
Probab=98.55  E-value=5e-08  Score=71.15  Aligned_cols=58  Identities=26%  Similarity=0.254  Sum_probs=46.3

Q ss_pred             CCcEEEeeCCCChhhhcCCCccccCCCCCCcCCcEEEEECCEEccCCCCCCCceeEEEe-eCCCCceEEEEeCC
Q 019504          268 NGALVLQVPGNSLAAKAGILPTTRGFAGNIILGDIIVAVNNKPVSFSCLSIPSRIYLIC-AEPNQDHLTCLKSS  340 (340)
Q Consensus       268 ~g~~V~~v~~~spa~~~gl~~~~~~~~~~l~~GDvi~~i~g~~v~~~~d~~~~~~~~~~-~~~~~~~~~~~~~~  340 (340)
                      .|++|.+|.++|||+. ||           ++||+|++|||+++.+.+|   +...+.. ......++++.|++
T Consensus         8 ~Gv~V~~V~~~s~A~~-gL-----------~~GD~I~~Ing~~v~~~~~---~~~~l~~~~~~~~v~l~v~r~g   66 (79)
T cd00986           8 HGVYVTSVVEGMPAAG-KL-----------KAGDHIIAVDGKPFKEAEE---LIDYIQSKKEGDTVKLKVKREE   66 (79)
T ss_pred             cCEEEEEECCCCchhh-CC-----------CCCCEEEEECCEECCCHHH---HHHHHHhCCCCCEEEEEEEECC
Confidence            5899999999999986 78           9999999999999999887   5555543 23334678887763


No 26 
>TIGR02037 degP_htrA_DO periplasmic serine protease, Do/DegQ family. This family consists of a set of proteins variously designated DegP, heat shock protein HtrA, and protease DO. The ortholog in Pseudomonas aeruginosa is designated MucD and is found in an operon that controls mucoid phenotype. This family also includes the DegQ (HhoA) paralog in E. coli which can rescue a DegP mutant, but not the smaller DegS paralog, which cannot. Members of this family are located in the periplasm and have separable functions as both protease and chaperone. Members have a trypsin domain and two copies of a PDZ domain. This protein protects bacteria from thermal and other stresses and may be important for the survival of bacterial pathogens. The chaperone function is dominant at low temperatures, whereas the proteolytic activity is turned on at elevated temperatures.
Probab=98.41  E-value=2.8e-07  Score=89.07  Aligned_cols=77  Identities=31%  Similarity=0.380  Sum_probs=60.7

Q ss_pred             eeeEEec--cHHHHhhcCCC---CCcEEEeeCCCChhhhcCCCccccCCCCCCcCCcEEEEECCEEccCCCCCCCceeEE
Q 019504          250 GLNVDIA--PDLVASQLNVG---NGALVLQVPGNSLAAKAGILPTTRGFAGNIILGDIIVAVNNKPVSFSCLSIPSRIYL  324 (340)
Q Consensus       250 ~lg~~~~--~~~~~~~~~~~---~g~~V~~v~~~spa~~~gl~~~~~~~~~~l~~GDvi~~i~g~~v~~~~d~~~~~~~~  324 (340)
                      ++|+.+.  +...++.++++   .|++|.+|.++|||+++||           ++||+|++|||++|.+..|   +...+
T Consensus       339 ~lGi~~~~l~~~~~~~~~l~~~~~Gv~V~~V~~~SpA~~aGL-----------~~GDvI~~Ing~~V~s~~d---~~~~l  404 (428)
T TIGR02037       339 FLGLTVANLSPEIRKELRLKGDVKGVVVTKVVSGSPAARAGL-----------QPGDVILSVNQQPVSSVAE---LRKVL  404 (428)
T ss_pred             ccceEEecCCHHHHHHcCCCcCcCceEEEEeCCCCHHHHcCC-----------CCCCEEEEECCEEcCCHHH---HHHHH
Confidence            3555443  35556667765   6999999999999999999           9999999999999999998   66666


Q ss_pred             Eee-CCCCceEEEEeCC
Q 019504          325 ICA-EPNQDHLTCLKSS  340 (340)
Q Consensus       325 ~~~-~~~~~~~~~~~~~  340 (340)
                      ... ......++|.|++
T Consensus       405 ~~~~~g~~v~l~v~R~g  421 (428)
T TIGR02037       405 DRAKKGGRVALLILRGG  421 (428)
T ss_pred             HhcCCCCEEEEEEEECC
Confidence            553 3455788888864


No 27 
>smart00228 PDZ Domain present in PSD-95, Dlg, and ZO-1/2. Also called DHR (Dlg homologous region) or GLGF (relatively well conserved tetrapeptide in these domains). Some PDZs have been shown to bind C-terminal polypeptides; others appear to bind internal (non-C-terminal) polypeptides. Different PDZs possess different binding specificities.
Probab=98.27  E-value=8e-07  Score=65.27  Aligned_cols=60  Identities=32%  Similarity=0.336  Sum_probs=45.3

Q ss_pred             CCcEEEeeCCCChhhhcCCCccccCCCCCCcCCcEEEEECCEEccCCCCCCCceeEEEeeCCCCceEEEEeCC
Q 019504          268 NGALVLQVPGNSLAAKAGILPTTRGFAGNIILGDIIVAVNNKPVSFSCLSIPSRIYLICAEPNQDHLTCLKSS  340 (340)
Q Consensus       268 ~g~~V~~v~~~spa~~~gl~~~~~~~~~~l~~GDvi~~i~g~~v~~~~d~~~~~~~~~~~~~~~~~~~~~~~~  340 (340)
                      .|++|..|.+++||+++||           ++||+|++|||+++.+..+ ......+.. .+...++++.|.+
T Consensus        26 ~~~~i~~v~~~s~a~~~gl-----------~~GD~I~~In~~~v~~~~~-~~~~~~~~~-~~~~~~l~i~r~~   85 (85)
T smart00228       26 GGVVVSSVVPGSPAAKAGL-----------KVGDVILEVNGTSVEGLTH-LEAVDLLKK-AGGKVTLTVLRGG   85 (85)
T ss_pred             CCEEEEEECCCCHHHHcCC-----------CCCCEEEEECCEECCCCCH-HHHHHHHHh-CCCeEEEEEEeCC
Confidence            5999999999999999999           9999999999999998764 111112222 2235788887763


No 28 
>TIGR00225 prc C-terminal peptidase (prc). A C-terminal peptidase with different substrates in different species, including processing of D1 protein of the photosystem II reaction center in higher plants and cleavage of a peptide of 11 residues from the precursor form of penicillin-binding protein in E. coli. E. coli and H. influenzae have the most distal branch of the tree, and their proteins have an N-terminal 200 amino acids that show no homology to other proteins in the database.
Probab=98.22  E-value=2.5e-06  Score=79.70  Aligned_cols=70  Identities=23%  Similarity=0.256  Sum_probs=51.5

Q ss_pred             eeeeeEEeccHHHHhhcCCCCCcEEEeeCCCChhhhcCCCccccCCCCCCcCCcEEEEECCEEccCCC--CCCCceeEEE
Q 019504          248 RAGLNVDIAPDLVASQLNVGNGALVLQVPGNSLAAKAGILPTTRGFAGNIILGDIIVAVNNKPVSFSC--LSIPSRIYLI  325 (340)
Q Consensus       248 ~~~lg~~~~~~~~~~~~~~~~g~~V~~v~~~spa~~~gl~~~~~~~~~~l~~GDvi~~i~g~~v~~~~--d~~~~~~~~~  325 (340)
                      ..++|+.+....        .+++|.+|.++|||+++||           ++||+|++|||++|.+.+  +   +...+.
T Consensus        50 ~~~lG~~~~~~~--------~~~~V~~V~~~spA~~aGL-----------~~GD~I~~Ing~~v~~~~~~~---~~~~l~  107 (334)
T TIGR00225        50 LEGIGIQVGMDD--------GEIVIVSPFEGSPAEKAGI-----------KPGDKIIKINGKSVAGMSLDD---AVALIR  107 (334)
T ss_pred             eEEEEEEEEEEC--------CEEEEEEeCCCChHHHcCC-----------CCCCEEEEECCEECCCCCHHH---HHHhcc
Confidence            456777764322        5899999999999999999           999999999999998763  2   333443


Q ss_pred             eeCCCCceEEEEeC
Q 019504          326 CAEPNQDHLTCLKS  339 (340)
Q Consensus       326 ~~~~~~~~~~~~~~  339 (340)
                      .......++++.|.
T Consensus       108 ~~~g~~v~l~v~R~  121 (334)
T TIGR00225       108 GKKGTKVSLEILRA  121 (334)
T ss_pred             CCCCCEEEEEEEeC
Confidence            33444566777664


No 29 
>cd00992 PDZ_signaling PDZ domain found in a variety of Eumetazoan signaling molecules, often in tandem arrangements. May be responsible for specific protein-protein interactions, as most PDZ domains bind C-terminal polypeptides, and binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of PDZ domains an N-terminal beta-strand forms the peptide-binding groove base, a circular permutation with respect to PDZ domains found in proteases.
Probab=98.17  E-value=1.1e-06  Score=64.20  Aligned_cols=38  Identities=29%  Similarity=0.366  Sum_probs=35.0

Q ss_pred             CCcEEEeeCCCChhhhcCCCccccCCCCCCcCCcEEEEECCEEcc--CCCC
Q 019504          268 NGALVLQVPGNSLAAKAGILPTTRGFAGNIILGDIIVAVNNKPVS--FSCL  316 (340)
Q Consensus       268 ~g~~V~~v~~~spa~~~gl~~~~~~~~~~l~~GDvi~~i~g~~v~--~~~d  316 (340)
                      .|++|.+|.++|||+++||           ++||+|++|||+++.  +..+
T Consensus        26 ~~~~V~~v~~~s~a~~~gl-----------~~GD~I~~ing~~i~~~~~~~   65 (82)
T cd00992          26 GGIFVSRVEPGGPAERGGL-----------RVGDRILEVNGVSVEGLTHEE   65 (82)
T ss_pred             CCeEEEEECCCChHHhCCC-----------CCCCEEEEECCEEcCccCHHH
Confidence            5899999999999999999           999999999999999  5554


No 30 
>PLN00049 carboxyl-terminal processing protease; Provisional
Probab=98.15  E-value=3.6e-06  Score=80.13  Aligned_cols=79  Identities=22%  Similarity=0.212  Sum_probs=51.9

Q ss_pred             eeeeeeEEeccHHHHhhcCCCCCcEEEeeCCCChhhhcCCCccccCCCCCCcCCcEEEEECCEEccCCCCCCCceeEEEe
Q 019504          247 VRAGLNVDIAPDLVASQLNVGNGALVLQVPGNSLAAKAGILPTTRGFAGNIILGDIIVAVNNKPVSFSCLSIPSRIYLIC  326 (340)
Q Consensus       247 ~~~~lg~~~~~~~~~~~~~~~~g~~V~~v~~~spa~~~gl~~~~~~~~~~l~~GDvi~~i~g~~v~~~~d~~~~~~~~~~  326 (340)
                      ...++|+.+.....  ..+...|++|..|.++|||+++||           ++||+|++|||++|.+... ..+...+..
T Consensus        83 ~~~GiG~~~~~~~~--~~~~~~g~~V~~V~~~SPA~~aGl-----------~~GD~Iv~InG~~v~~~~~-~~~~~~l~g  148 (389)
T PLN00049         83 AVTGVGLEVGYPTG--SDGPPAGLVVVAPAPGGPAARAGI-----------RPGDVILAIDGTSTEGLSL-YEAADRLQG  148 (389)
T ss_pred             CceEEEEEEEEccC--CCCccCcEEEEEeCCCChHHHcCC-----------CCCCEEEEECCEECCCCCH-HHHHHHHhc
Confidence            35678887643210  001124899999999999999999           9999999999999986521 012333333


Q ss_pred             eCCCCceEEEEeC
Q 019504          327 AEPNQDHLTCLKS  339 (340)
Q Consensus       327 ~~~~~~~~~~~~~  339 (340)
                      .....++++|.|+
T Consensus       149 ~~g~~v~ltv~r~  161 (389)
T PLN00049        149 PEGSSVELTLRRG  161 (389)
T ss_pred             CCCCEEEEEEEEC
Confidence            3334456666654


No 31 
>PRK10779 zinc metallopeptidase RseP; Provisional
Probab=98.15  E-value=8.4e-07  Score=86.11  Aligned_cols=57  Identities=23%  Similarity=0.176  Sum_probs=45.5

Q ss_pred             cEEEeeCCCChhhhcCCCccccCCCCCCcCCcEEEEECCEEccCCCCCCCceeEEEeeCC-CCceEEEEeCC
Q 019504          270 ALVLQVPGNSLAAKAGILPTTRGFAGNIILGDIIVAVNNKPVSFSCLSIPSRIYLICAEP-NQDHLTCLKSS  340 (340)
Q Consensus       270 ~~V~~v~~~spa~~~gl~~~~~~~~~~l~~GDvi~~i~g~~v~~~~d~~~~~~~~~~~~~-~~~~~~~~~~~  340 (340)
                      .+|.+|.++|||++|||           |+||+|++|||++|.+.+|   ++..+....+ ...++|++|++
T Consensus       128 ~lV~~V~~~SpA~kAGL-----------k~GDvI~~vnG~~V~~~~~---l~~~v~~~~~g~~v~v~v~R~g  185 (449)
T PRK10779        128 PVVGEIAPNSIAAQAQI-----------APGTELKAVDGIETPDWDA---VRLALVSKIGDESTTITVAPFG  185 (449)
T ss_pred             ccccccCCCCHHHHcCC-----------CCCCEEEEECCEEcCCHHH---HHHHHHhhccCCceEEEEEeCC
Confidence            47899999999999999           9999999999999999998   4444443332 34677887763


No 32 
>TIGR00054 RIP metalloprotease RseP. A model that detects fragments as well matches a number of members of the PEPTIDASE FAMILY S2C. The region of match appears not to overlap the active site domain.
Probab=98.12  E-value=1.7e-06  Score=83.24  Aligned_cols=59  Identities=19%  Similarity=0.106  Sum_probs=49.3

Q ss_pred             CCcEEEeeCCCChhhhcCCCccccCCCCCCcCCcEEEEECCEEccCCCCCCCceeEEEeeCCCCceEEEEeCC
Q 019504          268 NGALVLQVPGNSLAAKAGILPTTRGFAGNIILGDIIVAVNNKPVSFSCLSIPSRIYLICAEPNQDHLTCLKSS  340 (340)
Q Consensus       268 ~g~~V~~v~~~spa~~~gl~~~~~~~~~~l~~GDvi~~i~g~~v~~~~d~~~~~~~~~~~~~~~~~~~~~~~~  340 (340)
                      .|+.|.+|.++|||+++||           ++||+|++|||++|.+.+|   +...+........++++.|++
T Consensus       203 ~g~vV~~V~~~SpA~~aGL-----------~~GD~Iv~Vng~~V~s~~d---l~~~l~~~~~~~v~l~v~R~g  261 (420)
T TIGR00054       203 IEPVLSDVTPNSPAEKAGL-----------KEGDYIQSINGEKLRSWTD---FVSAVKENPGKSMDIKVERNG  261 (420)
T ss_pred             cCcEEEEECCCCHHHHcCC-----------CCCCEEEEECCEECCCHHH---HHHHHHhCCCCceEEEEEECC
Confidence            4899999999999999999           9999999999999999998   666665544444678887763


No 33 
>KOG3627 consensus Trypsin [Amino acid transport and metabolism]
Probab=98.07  E-value=0.00023  Score=63.59  Aligned_cols=124  Identities=23%  Similarity=0.209  Sum_probs=67.2

Q ss_pred             CcEEEEEEecCC---CCccceeecCCCC---CCCCCEEEEEecCCCC------CCceeEEEEeeecccc-ccCCCc--ee
Q 019504          117 KDLAVLKIEASE---DLLKPINVGQSSF---LKVGQQCLAIGNPFGF------DHTLTVGVISGLNRDI-FSQAGV--TI  181 (340)
Q Consensus       117 ~DlAlL~v~~~~---~~~~~~~l~~~~~---~~~G~~v~~iG~p~g~------~~~~~~G~vs~~~~~~-~~~~~~--~~  181 (340)
                      .|||||+++.+.   ..+.|+.|.....   ...+..+++.||+...      ........+.-+.... ......  ..
T Consensus       106 nDiall~l~~~v~~~~~i~piclp~~~~~~~~~~~~~~~v~GWG~~~~~~~~~~~~L~~~~v~i~~~~~C~~~~~~~~~~  185 (256)
T KOG3627|consen  106 NDIALLRLSEPVTFSSHIQPICLPSSADPYFPPGGTTCLVSGWGRTESGGGPLPDTLQEVDVPIISNSECRRAYGGLGTI  185 (256)
T ss_pred             CCEEEEEECCCcccCCcccccCCCCCcccCCCCCCCEEEEEeCCCcCCCCCCCCceeEEEEEeEcChhHhcccccCcccc
Confidence            799999999752   4466777743332   3445888889975421      1122222222221100 000000  00


Q ss_pred             -cceEE-----EeeccCCCCccceeecCC---CcEEEEEeeeeeCCCCcCceEEEEehHhHHHHHHHH
Q 019504          182 -GGGIQ-----TDAAINPGNSGGPLLDSK---GNLIGINTAIITQTGTSAGVGFAIPSSTVLKIVPQL  240 (340)
Q Consensus       182 -~~~i~-----~d~~i~~G~SGGPl~d~~---G~VVGi~~~~~~~~~~~~~~~~aip~~~i~~~l~~l  240 (340)
                       ...+-     .....|.|+|||||+-.+   ..++||++++...++....-+....+....+++++.
T Consensus       186 ~~~~~Ca~~~~~~~~~C~GDSGGPLv~~~~~~~~~~GivS~G~~~C~~~~~P~vyt~V~~y~~WI~~~  253 (256)
T KOG3627|consen  186 TDTMLCAGGPEGGKDACQGDSGGPLVCEDNGRWVLVGIVSWGSGGCGQPNYPGVYTRVSSYLDWIKEN  253 (256)
T ss_pred             CCCEEeeCccCCCCccccCCCCCeEEEeeCCcEEEEEEEEecCCCCCCCCCCeEEeEhHHhHHHHHHH
Confidence             01121     223368899999999654   699999999876333321223355566677776654


No 34 
>PF00595 PDZ:  PDZ domain (Also known as DHR or GLGF) Coordinates are not yet available;  InterPro: IPR001478 PDZ domains are found in diverse signalling proteins in bacteria, yeasts, plants, insects and vertebrates. PDZ domains can occur in one or multiple copies and are nearly always found in cytoplasmic proteins. They bind either the carboxyl-terminal sequences of proteins or internal peptide sequences. In most cases, interaction between a PDZ domain and its target is constitutive, with a binding affinity of 1 to 10 micromolar. However, agonist-dependent activation of cell surface receptors is sometimes required to promote interaction with a PDZ protein. PDZ domain proteins are frequently associated with the plasma membrane, a compartment where high concentrations of phosphatidylinositol 4,5-bisphosphate (PIP2) are found. Direct interaction between PIP2 and a subset of class II PDZ domains (syntenin, CASK, Tiam-1) has been demonstrated.  PDZ domains consist of 80 to 90 amino acids comprising six beta-strands (beta-A to beta-F) and two alpha-helices, A and B, compactly arranged in a globular structure. Peptide binding of the ligand takes place in an elongated surface groove as an anti-parallel beta-strand interacts with the beta-B strand and the B helix. The structure of PDZ domains allows binding to a free carboxylate group at the end of a peptide through a carboxylate-binding loop between the beta-A and beta-B strands.; GO: 0005515 protein binding; PDB: 3AXA_A 1WF8_A 1QAV_B 1QAU_A 1B8Q_A 1MC7_A 2KAW_A 1I16_A 1VB7_A 1WI4_A ....
Probab=98.05  E-value=5.9e-06  Score=60.38  Aligned_cols=38  Identities=32%  Similarity=0.410  Sum_probs=35.8

Q ss_pred             CCcEEEeeCCCChhhhcCCCccccCCCCCCcCCcEEEEECCEEccCCCC
Q 019504          268 NGALVLQVPGNSLAAKAGILPTTRGFAGNIILGDIIVAVNNKPVSFSCL  316 (340)
Q Consensus       268 ~g~~V~~v~~~spa~~~gl~~~~~~~~~~l~~GDvi~~i~g~~v~~~~d  316 (340)
                      .+++|.+|.+++||+++||           ++||.|++|||+++.+...
T Consensus        25 ~~~~V~~v~~~~~a~~~gl-----------~~GD~Il~INg~~v~~~~~   62 (81)
T PF00595_consen   25 KGVFVSSVVPGSPAERAGL-----------KVGDRILEINGQSVRGMSH   62 (81)
T ss_dssp             EEEEEEEECTTSHHHHHTS-----------STTEEEEEETTEESTTSBH
T ss_pred             CCEEEEEEeCCChHHhccc-----------chhhhhheeCCEeCCCCCH
Confidence            5999999999999999998           9999999999999998864


No 35 
>PRK10139 serine endoprotease; Provisional
Probab=98.04  E-value=2.7e-06  Score=82.51  Aligned_cols=58  Identities=26%  Similarity=0.260  Sum_probs=50.1

Q ss_pred             CCcEEEeeCCCChhhhcCCCccccCCCCCCcCCcEEEEECCEEccCCCCCCCceeEEEeeCCCCceEEEEeCC
Q 019504          268 NGALVLQVPGNSLAAKAGILPTTRGFAGNIILGDIIVAVNNKPVSFSCLSIPSRIYLICAEPNQDHLTCLKSS  340 (340)
Q Consensus       268 ~g~~V~~v~~~spa~~~gl~~~~~~~~~~l~~GDvi~~i~g~~v~~~~d~~~~~~~~~~~~~~~~~~~~~~~~  340 (340)
                      .|++|.+|.++|||+++||           ++||+|++|||++|.+.+|   +...+..+. +...++|+|++
T Consensus       390 ~Gv~V~~V~~~spA~~aGL-----------~~GD~I~~Ing~~v~~~~~---~~~~l~~~~-~~v~l~v~R~g  447 (455)
T PRK10139        390 KGIKIDEVVKGSPAAQAGL-----------QKDDVIIGVNRDRVNSIAE---MRKVLAAKP-AIIALQIVRGN  447 (455)
T ss_pred             CceEEEEeCCCChHHHcCC-----------CCCCEEEEECCEEcCCHHH---HHHHHHhCC-CeEEEEEEECC
Confidence            5899999999999999999           9999999999999999998   666665543 56788888864


No 36 
>PRK10779 zinc metallopeptidase RseP; Provisional
Probab=98.03  E-value=3e-06  Score=82.28  Aligned_cols=58  Identities=24%  Similarity=0.218  Sum_probs=47.6

Q ss_pred             CcEEEeeCCCChhhhcCCCccccCCCCCCcCCcEEEEECCEEccCCCCCCCceeEEEeeCCCCceEEEEeCC
Q 019504          269 GALVLQVPGNSLAAKAGILPTTRGFAGNIILGDIIVAVNNKPVSFSCLSIPSRIYLICAEPNQDHLTCLKSS  340 (340)
Q Consensus       269 g~~V~~v~~~spa~~~gl~~~~~~~~~~l~~GDvi~~i~g~~v~~~~d~~~~~~~~~~~~~~~~~~~~~~~~  340 (340)
                      +++|.+|.++|||+++||           ++||+|++|||++|.+.+|   +...+.........+++.|++
T Consensus       222 ~~vV~~V~~~SpA~~AGL-----------~~GDvIl~Ing~~V~s~~d---l~~~l~~~~~~~v~l~v~R~g  279 (449)
T PRK10779        222 EPVLAEVQPNSAASKAGL-----------QAGDRIVKVDGQPLTQWQT---FVTLVRDNPGKPLALEIERQG  279 (449)
T ss_pred             CcEEEeeCCCCHHHHcCC-----------CCCCEEEEECCEEcCCHHH---HHHHHHhCCCCEEEEEEEECC
Confidence            689999999999999999           9999999999999999888   555554434344677777763


No 37 
>PF14685 Tricorn_PDZ:  Tricorn protease PDZ domain; PDB: 1N6F_D 1N6D_C 1N6E_C 1K32_A.
Probab=98.02  E-value=3.6e-06  Score=62.11  Aligned_cols=60  Identities=25%  Similarity=0.289  Sum_probs=42.0

Q ss_pred             CCcEEEeeCCC--------ChhhhcCCCccccCCCCCCcCCcEEEEECCEEccCCCCCCCceeEEEeeCCCCceEEEEeC
Q 019504          268 NGALVLQVPGN--------SLAAKAGILPTTRGFAGNIILGDIIVAVNNKPVSFSCLSIPSRIYLICAEPNQDHLTCLKS  339 (340)
Q Consensus       268 ~g~~V~~v~~~--------spa~~~gl~~~~~~~~~~l~~GDvi~~i~g~~v~~~~d~~~~~~~~~~~~~~~~~~~~~~~  339 (340)
                      .+..|.++.++        ||-.+.|+         ++++||+|++|||+++....+   +..+|..+....+.||+.+.
T Consensus        12 ~~y~I~~I~~gd~~~~~~~sPL~~pGv---------~v~~GD~I~aInG~~v~~~~~---~~~lL~~~agk~V~Ltv~~~   79 (88)
T PF14685_consen   12 GGYRIARIYPGDPWNPNARSPLAQPGV---------DVREGDYILAINGQPVTADAN---PYRLLEGKAGKQVLLTVNRK   79 (88)
T ss_dssp             TEEEEEEE-BS-TTSSS-B-GGGGGS-------------TT-EEEEETTEE-BTTB----HHHHHHTTTTSEEEEEEE-S
T ss_pred             CEEEEEEEeCCCCCCccccCCccCCCC---------CCCCCCEEEEECCEECCCCCC---HHHHhcccCCCEEEEEEecC
Confidence            57778888775        66667777         348999999999999999887   78888888878888888775


No 38 
>PRK10942 serine endoprotease; Provisional
Probab=97.97  E-value=4.2e-06  Score=81.57  Aligned_cols=58  Identities=26%  Similarity=0.301  Sum_probs=50.2

Q ss_pred             CCcEEEeeCCCChhhhcCCCccccCCCCCCcCCcEEEEECCEEccCCCCCCCceeEEEeeCCCCceEEEEeCC
Q 019504          268 NGALVLQVPGNSLAAKAGILPTTRGFAGNIILGDIIVAVNNKPVSFSCLSIPSRIYLICAEPNQDHLTCLKSS  340 (340)
Q Consensus       268 ~g~~V~~v~~~spa~~~gl~~~~~~~~~~l~~GDvi~~i~g~~v~~~~d~~~~~~~~~~~~~~~~~~~~~~~~  340 (340)
                      .|++|.+|.++|||+++||           ++||+|++|||++|.+.+|   +...+.... ....++|.|.+
T Consensus       408 ~gvvV~~V~~~S~A~~aGL-----------~~GDvIv~VNg~~V~s~~d---l~~~l~~~~-~~v~l~V~R~g  465 (473)
T PRK10942        408 KGVVVDNVKPGTPAAQIGL-----------KKGDVIIGANQQPVKNIAE---LRKILDSKP-SVLALNIQRGD  465 (473)
T ss_pred             CCeEEEEeCCCChHHHcCC-----------CCCCEEEEECCEEcCCHHH---HHHHHHhCC-CeEEEEEEECC
Confidence            4899999999999999999           9999999999999999998   666665533 56788888864


No 39 
>TIGR00054 RIP metalloprotease RseP. A model that detects fragments as well matches a number of members of the PEPTIDASE FAMILY S2C. The region of match appears not to overlap the active site domain.
Probab=97.93  E-value=4.9e-06  Score=80.02  Aligned_cols=57  Identities=25%  Similarity=0.230  Sum_probs=47.1

Q ss_pred             CCcEEEeeCCCChhhhcCCCccccCCCCCCcCCcEEEEECCEEccCCCCCCCceeEEEeeCCCCceEEEEeC
Q 019504          268 NGALVLQVPGNSLAAKAGILPTTRGFAGNIILGDIIVAVNNKPVSFSCLSIPSRIYLICAEPNQDHLTCLKS  339 (340)
Q Consensus       268 ~g~~V~~v~~~spa~~~gl~~~~~~~~~~l~~GDvi~~i~g~~v~~~~d~~~~~~~~~~~~~~~~~~~~~~~  339 (340)
                      .|.+|.+|.++|||++|||           ++||+|+++||+++.+..|   +...+.... +...++++|+
T Consensus       128 ~g~~V~~V~~~SpA~~AGL-----------~~GDvI~~vng~~v~~~~d---l~~~ia~~~-~~v~~~I~r~  184 (420)
T TIGR00054       128 VGPVIELLDKNSIALEAGI-----------EPGDEILSVNGNKIPGFKD---VRQQIADIA-GEPMVEILAE  184 (420)
T ss_pred             CCceeeccCCCCHHHHcCC-----------CCCCEEEEECCEEcCCHHH---HHHHHHhhc-ccceEEEEEe
Confidence            6899999999999999999           9999999999999999998   444444433 4567777763


No 40 
>COG0793 Prc Periplasmic protease [Cell envelope biogenesis, outer membrane]
Probab=97.90  E-value=1.6e-05  Score=75.91  Aligned_cols=74  Identities=26%  Similarity=0.311  Sum_probs=57.7

Q ss_pred             eeeeeeEEeccHHHHhhcCCCCCcEEEeeCCCChhhhcCCCccccCCCCCCcCCcEEEEECCEEccCCCCCCCceeEEEe
Q 019504          247 VRAGLNVDIAPDLVASQLNVGNGALVLQVPGNSLAAKAGILPTTRGFAGNIILGDIIVAVNNKPVSFSCLSIPSRIYLIC  326 (340)
Q Consensus       247 ~~~~lg~~~~~~~~~~~~~~~~g~~V~~v~~~spa~~~gl~~~~~~~~~~l~~GDvi~~i~g~~v~~~~d~~~~~~~~~~  326 (340)
                      .+.|+|+++.-...       .++.|.++.+++||+++||           ++||+|++|||+++....- -+....+..
T Consensus        98 ~~~GiG~~i~~~~~-------~~~~V~s~~~~~PA~kagi-----------~~GD~I~~IdG~~~~~~~~-~~av~~irG  158 (406)
T COG0793          98 EFGGIGIELQMEDI-------GGVKVVSPIDGSPAAKAGI-----------KPGDVIIKIDGKSVGGVSL-DEAVKLIRG  158 (406)
T ss_pred             cccceeEEEEEecC-------CCcEEEecCCCChHHHcCC-----------CCCCEEEEECCEEccCCCH-HHHHHHhCC
Confidence            56788888754221       6899999999999999999           9999999999999998861 112345666


Q ss_pred             eCCCCceEEEEeC
Q 019504          327 AEPNQDHLTCLKS  339 (340)
Q Consensus       327 ~~~~~~~~~~~~~  339 (340)
                      ++...+++||.|+
T Consensus       159 ~~Gt~V~L~i~r~  171 (406)
T COG0793         159 KPGTKVTLTILRA  171 (406)
T ss_pred             CCCCeEEEEEEEc
Confidence            6667788888885


No 41 
>TIGR02860 spore_IV_B stage IV sporulation protein B. SpoIVB, the stage IV sporulation protein B of endospore-forming bacteria such as Bacillus subtilis, is a serine proteinase, expressed in the spore (rather than mother cell) compartment, that participates in a proteolytic activation cascade for Sigma-K. It appears to be universal among endospore-forming bacteria and occurs nowhere else.
Probab=97.75  E-value=2.3e-05  Score=73.64  Aligned_cols=66  Identities=23%  Similarity=0.243  Sum_probs=51.1

Q ss_pred             eeEEeccHHHHhhcCCCCCcEEEeeC--------CCChhhhcCCCccccCCCCCCcCCcEEEEECCEEccCCCCCCCcee
Q 019504          251 LNVDIAPDLVASQLNVGNGALVLQVP--------GNSLAAKAGILPTTRGFAGNIILGDIIVAVNNKPVSFSCLSIPSRI  322 (340)
Q Consensus       251 lg~~~~~~~~~~~~~~~~g~~V~~v~--------~~spa~~~gl~~~~~~~~~~l~~GDvi~~i~g~~v~~~~d~~~~~~  322 (340)
                      +|+.+.+          .|++|....        .+|||+++||           ++||+|++|||++|.+.+|   +..
T Consensus        98 iGI~l~t----------~GVlVvg~~~v~~~~g~~~SPAa~AGL-----------q~GDiIvsING~~V~s~~D---L~~  153 (402)
T TIGR02860        98 IGVKLNT----------KGVLVVGFSDIETEKGKIHSPGEEAGI-----------QIGDRILKINGEKIKNMDD---LAN  153 (402)
T ss_pred             EEEEEec----------CEEEEEEEEcccccCCCCCCHHHHcCC-----------CCCCEEEEECCEECCCHHH---HHH
Confidence            6776644          589986642        2699999999           9999999999999999998   665


Q ss_pred             EEEeeCCCCceEEEEeCC
Q 019504          323 YLICAEPNQDHLTCLKSS  340 (340)
Q Consensus       323 ~~~~~~~~~~~~~~~~~~  340 (340)
                      .+.........+++.|++
T Consensus       154 iL~~~~g~~V~LtV~R~G  171 (402)
T TIGR02860       154 LINKAGGEKLTLTIERGG  171 (402)
T ss_pred             HHHhCCCCeEEEEEEECC
Confidence            665555555778888763


No 42 
>PF05579 Peptidase_S32:  Equine arteritis virus serine endopeptidase S32;  InterPro: IPR008760 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. 
They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base. The geometric orientations of the catalytic residues are similar between families, despite different protein folds. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC). This group of serine peptidases belongs to MEROPS peptidase family S32 (clan PA(S)). The type example is equine arteritis virus serine endopeptidase (equine arteritis virus), which is involved in processing of nidovirus polyproteins.; GO: 0004252 serine-type endopeptidase activity, 0016032 viral reproduction, 0019082 viral protein processing; PDB: 3FAN_A 3FAO_A 1MBM_A.
Probab=97.71  E-value=0.00028  Score=61.79  Aligned_cols=117  Identities=24%  Similarity=0.383  Sum_probs=61.9

Q ss_pred             CceEEEEEcCC--CEEEeCccccCCCCCCCCCCCccEEEEEEEecCCceeEEEEEEEEeCCCCcEEEEEEecCCCCccce
Q 019504           57 GNGSGVVWDGK--GHIVTNFHVIGSALSRKPAEGQVVARVNILASDGVQKNFEGKLVGADRAKDLAVLKIEASEDLLKPI  134 (340)
Q Consensus        57 ~~GsGfiI~~~--G~IlT~~Hvv~~~~~~~~~~~~~~~~i~v~~~~g~~~~~~a~v~~~d~~~DlAlL~v~~~~~~~~~~  134 (340)
                      ..|||=+...+  -.|+|+.||+...            ...+... +.     -+...++..-|+|.-.++.-+...|.+
T Consensus       112 s~Gsggvft~~~~~vvvTAtHVlg~~------------~a~v~~~-g~-----~~~~tF~~~GDfA~~~~~~~~G~~P~~  173 (297)
T PF05579_consen  112 SVGSGGVFTIGGNTVVVTATHVLGGN------------TARVSGV-GT-----RRMLTFKKNGDFAEADITNWPGAAPKY  173 (297)
T ss_dssp             SEEEEEEEECTTEEEEEEEHHHCBTT------------EEEEEET-TE-----EEEEEEEEETTEEEEEETTS-S---B-
T ss_pred             cccccceEEECCeEEEEEEEEEcCCC------------eEEEEec-ce-----EEEEEEeccCcEEEEECCCCCCCCCce
Confidence            44555555444  4799999999853            3333332 22     123345566799999995444446777


Q ss_pred             eecCCCCCCCCCEEEEEecCCCCCCceeEEEEeeeccccccCCCceecceEEEeeccCCCCccceeecCCCcEEEEEeee
Q 019504          135 NVGQSSFLKVGQQCLAIGNPFGFDHTLTVGVISGLNRDIFSQAGVTIGGGIQTDAAINPGNSGGPLLDSKGNLIGINTAI  214 (340)
Q Consensus       135 ~l~~~~~~~~G~~v~~iG~p~g~~~~~~~G~vs~~~~~~~~~~~~~~~~~i~~d~~i~~G~SGGPl~d~~G~VVGi~~~~  214 (340)
                      ++++.   ..|.--+..      ..-+..|.|..-.             ++   +-..+|+||+|++..+|.+||+++..
T Consensus       174 k~a~~---~~GrAyW~t------~tGvE~G~ig~~~-------------~~---~fT~~GDSGSPVVt~dg~liGVHTGS  228 (297)
T PF05579_consen  174 KFAQN---YTGRAYWLT------STGVEPGFIGGGG-------------AV---CFTGPGDSGSPVVTEDGDLIGVHTGS  228 (297)
T ss_dssp             -B-TT----SEEEEEEE------TTEEEEEEEETTE-------------EE---ESS-GGCTT-EEEETTC-EEEEEEEE
T ss_pred             eecCC---cccceEEEc------ccCcccceecCce-------------EE---EEcCCCCCCCccCcCCCCEEEEEecC
Confidence            77522   223322222      2234455543211             12   22346999999999999999999975


Q ss_pred             ee
Q 019504          215 IT  216 (340)
Q Consensus       215 ~~  216 (340)
                      -+
T Consensus       229 n~  230 (297)
T PF05579_consen  229 NK  230 (297)
T ss_dssp             ET
T ss_pred             CC
Confidence            43


No 43 
>KOG3553 consensus Tax interaction protein TIP1 [Cell wall/membrane/envelope biogenesis]
Probab=97.64  E-value=4.5e-05  Score=56.28  Aligned_cols=34  Identities=32%  Similarity=0.372  Sum_probs=31.7

Q ss_pred             CCcEEEeeCCCChhhhcCCCccccCCCCCCcCCcEEEEECCEEcc
Q 019504          268 NGALVLQVPGNSLAAKAGILPTTRGFAGNIILGDIIVAVNNKPVS  312 (340)
Q Consensus       268 ~g~~V~~v~~~spa~~~gl~~~~~~~~~~l~~GDvi~~i~g~~v~  312 (340)
                      .|+||++|.+||||+.|||           +.+|-|+.+||-..+
T Consensus        59 ~GiYvT~V~eGsPA~~AGL-----------rihDKIlQvNG~DfT   92 (124)
T KOG3553|consen   59 KGIYVTRVSEGSPAEIAGL-----------RIHDKILQVNGWDFT   92 (124)
T ss_pred             ccEEEEEeccCChhhhhcc-----------eecceEEEecCceeE
Confidence            8999999999999999999           899999999996654


No 44 
>PRK11186 carboxy-terminal protease; Provisional
Probab=97.62  E-value=7.4e-05  Score=75.21  Aligned_cols=73  Identities=19%  Similarity=0.153  Sum_probs=49.2

Q ss_pred             eeeeeeEEeccHHHHhhcCCCCCcEEEeeCCCChhhhc-CCCccccCCCCCCcCCcEEEEEC--CEEccCCCC--CCCce
Q 019504          247 VRAGLNVDIAPDLVASQLNVGNGALVLQVPGNSLAAKA-GILPTTRGFAGNIILGDIIVAVN--NKPVSFSCL--SIPSR  321 (340)
Q Consensus       247 ~~~~lg~~~~~~~~~~~~~~~~g~~V~~v~~~spa~~~-gl~~~~~~~~~~l~~GDvi~~i~--g~~v~~~~d--~~~~~  321 (340)
                      ...|+|+.+....        .++.|.+|.+|+||+++ ||           ++||+|++||  |+++.+...  +-...
T Consensus       242 ~~~GIGa~l~~~~--------~~~~V~~vipGsPA~ka~gL-----------k~GD~IlaVn~~g~~~~dv~g~~~~~vv  302 (667)
T PRK11186        242 SLEGIGAVLQMDD--------DYTVINSLVAGGPAAKSKKL-----------SVGDKIVGVGQDGKPIVDVIGWRLDDVV  302 (667)
T ss_pred             ceeEEEEEEEEeC--------CeEEEEEccCCChHHHhCCC-----------CCCCEEEEECCCCCcccccccCCHHHHH
Confidence            4567888875432        47899999999999998 88           9999999999  665544321  00123


Q ss_pred             eEEEeeCCCCceEEEEe
Q 019504          322 IYLICAEPNQDHLTCLK  338 (340)
Q Consensus       322 ~~~~~~~~~~~~~~~~~  338 (340)
                      .++...+...++|||.|
T Consensus       303 ~lirG~~Gt~V~LtV~r  319 (667)
T PRK11186        303 ALIKGPKGSKVRLEILP  319 (667)
T ss_pred             HHhcCCCCCEEEEEEEe
Confidence            34444454556666665


No 45 
>PF03761 DUF316:  Domain of unknown function (DUF316) ;  InterPro: IPR005514 This is a family of uncharacterised proteins from Caenorhabditis elegans.
Probab=97.43  E-value=0.0048  Score=56.14  Aligned_cols=111  Identities=19%  Similarity=0.179  Sum_probs=64.9

Q ss_pred             CCCCcEEEEEEecC-CCCccceeecCCC-CCCCCCEEEEEecCCCCCCceeEEEEeeeccccccCCCceecceEEEeecc
Q 019504          114 DRAKDLAVLKIEAS-EDLLKPINVGQSS-FLKVGQQCLAIGNPFGFDHTLTVGVISGLNRDIFSQAGVTIGGGIQTDAAI  191 (340)
Q Consensus       114 d~~~DlAlL~v~~~-~~~~~~~~l~~~~-~~~~G~~v~~iG~p~g~~~~~~~G~vs~~~~~~~~~~~~~~~~~i~~d~~i  191 (340)
                      ....+++||+++.+ .....|+-|+++. .+..++.+.+.|+...  ..+....+.-.....       ....+..+...
T Consensus       158 ~~~~~~mIlEl~~~~~~~~~~~Cl~~~~~~~~~~~~~~~yg~~~~--~~~~~~~~~i~~~~~-------~~~~~~~~~~~  228 (282)
T PF03761_consen  158 NRPYSPMILELEEDFSKNVSPPCLADSSTNWEKGDEVDVYGFNST--GKLKHRKLKITNCTK-------CAYSICTKQYS  228 (282)
T ss_pred             ccccceEEEEEcccccccCCCEEeCCCccccccCceEEEeecCCC--CeEEEEEEEEEEeec-------cceeEeccccc
Confidence            35579999999887 3346788887653 4678999999997211  112222221111100       11225555667


Q ss_pred             CCCCccceeecC-C--CcEEEEEeeeeeCCCCcCceEEEEehHhHHH
Q 019504          192 NPGNSGGPLLDS-K--GNLIGINTAIITQTGTSAGVGFAIPSSTVLK  235 (340)
Q Consensus       192 ~~G~SGGPl~d~-~--G~VVGi~~~~~~~~~~~~~~~~aip~~~i~~  235 (340)
                      +.|++|||++.. +  ..|||+.+......  .....+++.+..+++
T Consensus       229 ~~~d~Gg~lv~~~~gr~tlIGv~~~~~~~~--~~~~~~f~~v~~~~~  273 (282)
T PF03761_consen  229 CKGDRGGPLVKNINGRWTLIGVGASGNYEC--NKNNSYFFNVSWYQD  273 (282)
T ss_pred             CCCCccCeEEEEECCCEEEEEEEccCCCcc--cccccEEEEHHHhhh
Confidence            789999999833 4  45999976543221  112456666665543


No 46 
>PF00548 Peptidase_C3:  3C cysteine protease (picornain 3C);  InterPro: IPR000199 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. 
Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad [].  This signature defines cysteine peptidases belonging to MEROPS peptidase family C3 (picornain, clan PA(C)), subfamilies C3A and C3B. The protein fold of this peptidase domain for members of this family resembles that of the serine peptidase, chymotrypsin [], the type example for clan PA. Picornaviral proteins are expressed as a single polyprotein which is cleaved by the viral C3 cysteine protease. The poliovirus polyprotein is selectively cleaved at the Gln-|-Gly bond. In other picornavirus reactions Glu may be substituted for Gln, and Ser or Thr for Gly. ; GO: 0004197 cysteine-type endopeptidase activity, 0006508 proteolysis; PDB: 3SJO_E 2H6M_A 1QA7_C 1HAV_B 2HAL_A 2H9H_A 3QZQ_B 3QZR_A 3R0F_B 3SJ9_A ....
Probab=97.19  E-value=0.015  Score=48.92  Aligned_cols=135  Identities=22%  Similarity=0.323  Sum_probs=77.2

Q ss_pred             ceEEEEEcCCCEEEeCccccCCCCCCCCCCCccEEEEEEEecCCceeEEEEEEEEeCC---CCcEEEEEEecCCCC---c
Q 019504           58 NGSGVVWDGKGHIVTNFHVIGSALSRKPAEGQVVARVNILASDGVQKNFEGKLVGADR---AKDLAVLKIEASEDL---L  131 (340)
Q Consensus        58 ~GsGfiI~~~G~IlT~~Hvv~~~~~~~~~~~~~~~~i~v~~~~g~~~~~~a~v~~~d~---~~DlAlL~v~~~~~~---~  131 (340)
                      .++++.|..+ ++|...|.-. ..           .+.+   +|........+...+.   ..|+++++++...+.   .
T Consensus        26 t~l~~gi~~~-~~lvp~H~~~-~~-----------~i~i---~g~~~~~~d~~~lv~~~~~~~Dl~~v~l~~~~kfrDIr   89 (172)
T PF00548_consen   26 TMLALGIYDR-YFLVPTHEEP-ED-----------TIYI---DGVEYKVDDSVVLVDRDGVDTDLTLVKLPRNPKFRDIR   89 (172)
T ss_dssp             EEEEEEEEBT-EEEEEGGGGG-CS-----------EEEE---TTEEEEEEEEEEEEETTSSEEEEEEEEEESSS-B--GG
T ss_pred             EEecceEeee-EEEEECcCCC-cE-----------EEEE---CCEEEEeeeeEEEecCCCcceeEEEEEccCCcccCchh
Confidence            4778888755 9999999211 11           3443   3442233334333444   459999999775421   1


Q ss_pred             cceeecCCCCCCCCCEEEEEecCCCCCC-ceeEEEEeeeccccccCCCceecceEEEeeccCCCCccceeecC---CCcE
Q 019504          132 KPINVGQSSFLKVGQQCLAIGNPFGFDH-TLTVGVISGLNRDIFSQAGVTIGGGIQTDAAINPGNSGGPLLDS---KGNL  207 (340)
Q Consensus       132 ~~~~l~~~~~~~~G~~v~~iG~p~g~~~-~~~~G~vs~~~~~~~~~~~~~~~~~i~~d~~i~~G~SGGPl~d~---~G~V  207 (340)
                      +.+.  +.. ....+..+++=.+ .... ....+.++....-  ..++......+.++++..+|+-||||+..   .+++
T Consensus        90 k~~~--~~~-~~~~~~~l~v~~~-~~~~~~~~v~~v~~~~~i--~~~g~~~~~~~~Y~~~t~~G~CG~~l~~~~~~~~~i  163 (172)
T PF00548_consen   90 KFFP--ESI-PEYPECVLLVNST-KFPRMIVEVGFVTNFGFI--NLSGTTTPRSLKYKAPTKPGMCGSPLVSRIGGQGKI  163 (172)
T ss_dssp             GGSB--SSG-GTEEEEEEEEESS-SSTCEEEEEEEEEEEEEE--EETTEEEEEEEEEESEEETTGTTEEEEESCGGTTEE
T ss_pred             hhhc--ccc-ccCCCcEEEEECC-CCccEEEEEEEEeecCcc--ccCCCEeeEEEEEccCCCCCccCCeEEEeeccCccE
Confidence            2222  111 1344555555333 2222 2333444433321  22334445678999999999999999953   6899


Q ss_pred             EEEEeee
Q 019504          208 IGINTAI  214 (340)
Q Consensus       208 VGi~~~~  214 (340)
                      +||+.++
T Consensus       164 ~GiHvaG  170 (172)
T PF00548_consen  164 IGIHVAG  170 (172)
T ss_dssp             EEEEEEE
T ss_pred             EEEEecc
Confidence            9999875


No 47 
>PF10459 Peptidase_S46:  Peptidase S46;  InterPro: IPR019500 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Not withstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This entry represents S46 peptidases, where dipeptidyl-peptidase 7 (DPP-7) is the best-characterised member of this family. It is a serine peptidase that is located on the cell surface and is predicted to have two N-terminal transmembrane domains. 
Probab=97.12  E-value=0.0032  Score=63.88  Aligned_cols=23  Identities=30%  Similarity=0.280  Sum_probs=20.5

Q ss_pred             ceEEEEEcCCCEEEeCccccCCC
Q 019504           58 NGSGVVWDGKGHIVTNFHVIGSA   80 (340)
Q Consensus        58 ~GsGfiI~~~G~IlT~~Hvv~~~   80 (340)
                      -+||-||+++|+|+||.||.-++
T Consensus        48 GCSgsfVS~~GLvlTNHHC~~~~   70 (698)
T PF10459_consen   48 GCSGSFVSPDGLVLTNHHCGYGA   70 (698)
T ss_pred             ceeEEEEcCCceEEecchhhhhH
Confidence            38999999999999999998653


No 48 
>PF04495 GRASP55_65:  GRASP55/65 PDZ-like domain ;  InterPro: IPR007583 GRASP55 (Golgi reassembly stacking protein of 55 kDa) and GRASP65 (a 65 kDa protein) are highly homologous. GRASP55 is a component of the Golgi stacking machinery. GRASP65 is an N-ethylmaleimide-sensitive membrane protein required for the stacking of Golgi cisternae in a cell-free system [].; PDB: 3RLE_A 4EDJ_A.
Probab=97.10  E-value=0.00025  Score=57.12  Aligned_cols=57  Identities=26%  Similarity=0.152  Sum_probs=40.5

Q ss_pred             CCcEEEeeCCCChhhhcCCCccccCCCCCCcC-CcEEEEECCEEccCCCCCCCceeEEEeeCCCCceEEEEe
Q 019504          268 NGALVLQVPGNSLAAKAGILPTTRGFAGNIIL-GDIIVAVNNKPVSFSCLSIPSRIYLICAEPNQDHLTCLK  338 (340)
Q Consensus       268 ~g~~V~~v~~~spa~~~gl~~~~~~~~~~l~~-GDvi~~i~g~~v~~~~d~~~~~~~~~~~~~~~~~~~~~~  338 (340)
                      .+.-|.+|.++|||+.|||           ++ .|.|+.+|+..+.+.++   |..++.........+.|+.
T Consensus        43 ~~~~Vl~V~p~SPA~~AGL-----------~p~~DyIig~~~~~l~~~~~---l~~~v~~~~~~~l~L~Vyn  100 (138)
T PF04495_consen   43 EGWHVLRVAPNSPAAKAGL-----------EPFFDYIIGIDGGLLDDEDD---LFELVEANENKPLQLYVYN  100 (138)
T ss_dssp             CEEEEEEE-TTSHHHHTT-------------TTTEEEEEETTCE--STCH---HHHHHHHTTTS-EEEEEEE
T ss_pred             ceEEEeEecCCCHHHHCCc-----------cccccEEEEccceecCCHHH---HHHHHHHcCCCcEEEEEEE
Confidence            6889999999999999999           76 69999999999987776   6666665555555665553


No 49 
>PF12812 PDZ_1:  PDZ-like domain
Probab=97.04  E-value=0.00049  Score=49.73  Aligned_cols=62  Identities=27%  Similarity=0.177  Sum_probs=50.0

Q ss_pred             eeeEEe--ccHHHHhhcCCCCCcEEEeeCCCChhhhcCCCccccCCCCCCcCCcEEEEECCEEccCCCCCCCceeEEE
Q 019504          250 GLNVDI--APDLVASQLNVGNGALVLQVPGNSLAAKAGILPTTRGFAGNIILGDIIVAVNNKPVSFSCLSIPSRIYLI  325 (340)
Q Consensus       250 ~lg~~~--~~~~~~~~~~~~~g~~V~~v~~~spa~~~gl~~~~~~~~~~l~~GDvi~~i~g~~v~~~~d~~~~~~~~~  325 (340)
                      +.|..|  ++.+.++.++++-|.++.....++++..-|+           ..|-+|.+|||+++.+.++   |...+.
T Consensus        10 ~~Ga~f~~Ls~q~aR~~~~~~~gv~v~~~~g~~~~~~~i-----------~~g~iI~~Vn~kpt~~Ld~---f~~vvk   73 (78)
T PF12812_consen   10 VCGAVFHDLSYQQARQYGIPVGGVYVAVSGGSLAFAGGI-----------SKGFIITSVNGKPTPDLDD---FIKVVK   73 (78)
T ss_pred             EcCeecccCCHHHHHHhCCCCCEEEEEecCCChhhhCCC-----------CCCeEEEeECCcCCcCHHH---HHHHHH
Confidence            456655  4688899999997777778888999887668           8899999999999999997   554443


No 50 
>COG5640 Secreted trypsin-like serine protease [Posttranslational modification, protein turnover, chaperones]
Probab=96.98  E-value=0.017  Score=53.05  Aligned_cols=55  Identities=18%  Similarity=0.211  Sum_probs=38.9

Q ss_pred             eccCCCCccceeecC--CCc-EEEEEeeeeeCCCCcCceEEEEehHhHHHHHHHHHHc
Q 019504          189 AAINPGNSGGPLLDS--KGN-LIGINTAIITQTGTSAGVGFAIPSSTVLKIVPQLIQY  243 (340)
Q Consensus       189 ~~i~~G~SGGPl~d~--~G~-VVGi~~~~~~~~~~~~~~~~aip~~~i~~~l~~l~~~  243 (340)
                      ...|.|+||||+|-.  +|+ -+||++|+-..++...-.+..--++....|++..++.
T Consensus       223 ~daCqGDSGGPi~~~g~~G~vQ~GVvSwG~~~Cg~t~~~gVyT~vsny~~WI~a~~~~  280 (413)
T COG5640         223 KDACQGDSGGPIFHKGEEGRVQRGVVSWGDGGCGGTLIPGVYTNVSNYQDWIAAMTNG  280 (413)
T ss_pred             cccccCCCCCceEEeCCCccEEEeEEEecCCCCCCCCcceeEEehhHHHHHHHHHhcC
Confidence            346789999999954  454 6899999887666433333444577888888886553


No 51 
>PF05580 Peptidase_S55:  SpoIVB peptidase S55;  InterPro: IPR008763 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Not withstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This group of serine peptidases belong to the MEROPS peptidase family S55 (SpoIVB peptidase family, clan PA(S)). The protein SpoIVB plays a key role in signalling in the final sigma-K checkpoint of Bacillus subtilis [, ].
Probab=96.87  E-value=0.032  Score=47.88  Aligned_cols=44  Identities=27%  Similarity=0.497  Sum_probs=33.5

Q ss_pred             EEEeeccCCCCccceeecCCCcEEEEEeeeeeCCCCcCceEEEEehHh
Q 019504          185 IQTDAAINPGNSGGPLLDSKGNLIGINTAIITQTGTSAGVGFAIPSST  232 (340)
Q Consensus       185 i~~d~~i~~G~SGGPl~d~~G~VVGi~~~~~~~~~~~~~~~~aip~~~  232 (340)
                      +..+.-+..||||+|++ .+|++||=++..+.+   .+..+|.++++.
T Consensus       171 l~~TGGIvqGMSGSPI~-qdGKLiGAVthvf~~---dp~~Gygi~ie~  214 (218)
T PF05580_consen  171 LEKTGGIVQGMSGSPII-QDGKLIGAVTHVFVN---DPTKGYGIFIEW  214 (218)
T ss_pred             hhhhCCEEecccCCCEE-ECCEEEEEEEEEEec---CCCceeeecHHH
Confidence            33344566899999999 699999998877754   366788888754


No 52 
>COG3480 SdrC Predicted secreted protein containing a PDZ domain [Signal transduction mechanisms]
Probab=96.80  E-value=0.00064  Score=61.09  Aligned_cols=54  Identities=33%  Similarity=0.384  Sum_probs=46.5

Q ss_pred             CCcEEEeeCCCChhhhcCCCccccCCCCCCcCCcEEEEECCEEccCCCCCCCceeEEEeeCCCCceEEEE
Q 019504          268 NGALVLQVPGNSLAAKAGILPTTRGFAGNIILGDIIVAVNNKPVSFSCLSIPSRIYLICAEPNQDHLTCL  337 (340)
Q Consensus       268 ~g~~V~~v~~~spa~~~gl~~~~~~~~~~l~~GDvi~~i~g~~v~~~~d~~~~~~~~~~~~~~~~~~~~~  337 (340)
                      .|+||..|..++|+.  |.          |+.||-|++|||+++.+.+|   +-.++..++++ ++|||-
T Consensus       130 ~gvyv~~v~~~~~~~--gk----------l~~gD~i~avdg~~f~s~~e---~i~~v~~~k~G-d~VtI~  183 (342)
T COG3480         130 AGVYVLSVIDNSPFK--GK----------LEAGDTIIAVDGEPFTSSDE---LIDYVSSKKPG-DEVTID  183 (342)
T ss_pred             eeEEEEEccCCcchh--ce----------eccCCeEEeeCCeecCCHHH---HHHHHhccCCC-CeEEEE
Confidence            799999999999986  43          69999999999999999999   77888887765 777763


No 53 
>PF08192 Peptidase_S64:  Peptidase family S64;  InterPro: IPR012985 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Not withstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This family of fungal proteins is involved in the processing of membrane bound transcription factor Stp1 [] and belongs to MEROPS peptidase family S64 (clan PA). The processing causes the signalling domain of Stp1 to be passed to the nucleus where several permease genes are induced. The permeases are important for uptake of amino acids, and processing of Stp1 only occurs in an amino acid-rich environment. This family is predicted to be distantly related to the trypsin family (MEROPS peptidase family S1) and to have a typical trypsin-like catalytic triad [].
Probab=96.73  E-value=0.014  Score=57.79  Aligned_cols=118  Identities=19%  Similarity=0.323  Sum_probs=72.7

Q ss_pred             CCCcEEEEEEecCC-------C------CccceeecC------CCCCCCCCEEEEEecCCCCCCceeEEEEeeecccccc
Q 019504          115 RAKDLAVLKIEASE-------D------LLKPINVGQ------SSFLKVGQQCLAIGNPFGFDHTLTVGVISGLNRDIFS  175 (340)
Q Consensus       115 ~~~DlAlL~v~~~~-------~------~~~~~~l~~------~~~~~~G~~v~~iG~p~g~~~~~~~G~vs~~~~~~~~  175 (340)
                      +-.|+||++++...       +      .-|.+.+.+      -..+.+|..|+=+|...+    .+.|++.++.-... 
T Consensus       541 ~LsD~AIIkV~~~~~~~N~LGddi~f~~~dP~l~f~NlyV~~~~~~~~~G~~VfK~GrTTg----yT~G~lNg~klvyw-  615 (695)
T PF08192_consen  541 RLSDWAIIKVNKERKCQNYLGDDIQFNEPDPTLMFQNLYVREVVSNLVPGMEVFKVGRTTG----YTTGILNGIKLVYW-  615 (695)
T ss_pred             cccceEEEEeCCCceecCCCCccccccCCCccccccccchhhhhhccCCCCeEEEecccCC----ccceEecceEEEEe-
Confidence            44599999997542       1      112233321      124567999999997665    46677766543222 


Q ss_pred             CCCc-eecceEEEe----eccCCCCccceeecCCCc------EEEEEeeeeeCCCCcCceEEEEehHhHHHHHHHH
Q 019504          176 QAGV-TIGGGIQTD----AAINPGNSGGPLLDSKGN------LIGINTAIITQTGTSAGVGFAIPSSTVLKIVPQL  240 (340)
Q Consensus       176 ~~~~-~~~~~i~~d----~~i~~G~SGGPl~d~~G~------VVGi~~~~~~~~~~~~~~~~aip~~~i~~~l~~l  240 (340)
                      .++. ...+++...    .-...|+||+-+++.-+.      |+||.+++-..   ...++++.|+..|..=|++.
T Consensus       616 ~dG~i~s~efvV~s~~~~~Fa~~GDSGS~VLtk~~d~~~gLgvvGMlhsydge---~kqfglftPi~~il~rl~~v  688 (695)
T PF08192_consen  616 ADGKIQSSEFVVSSDNNPAFASGGDSGSWVLTKLEDNNKGLGVVGMLHSYDGE---QKQFGLFTPINEILDRLEEV  688 (695)
T ss_pred             cCCCeEEEEEEEecCCCccccCCCCcccEEEecccccccCceeeEEeeecCCc---cceeeccCcHHHHHHHHHHh
Confidence            2222 112333333    224579999999986444      99998875332   45789999998877766654


No 54 
>PF10459 Peptidase_S46:  Peptidase S46;  InterPro: IPR019500 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Not withstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This entry represents S46 peptidases, where dipeptidyl-peptidase 7 (DPP-7) is the best-characterised member of this family. It is a serine peptidase that is located on the cell surface and is predicted to have two N-terminal transmembrane domains. 
Probab=96.68  E-value=0.0028  Score=64.34  Aligned_cols=57  Identities=23%  Similarity=0.351  Sum_probs=41.6

Q ss_pred             eEEEeeccCCCCccceeecCCCcEEEEEeeeeeCC-C------CcCceEEEEehHhHHHHHHHH
Q 019504          184 GIQTDAAINPGNSGGPLLDSKGNLIGINTAIITQT-G------TSAGVGFAIPSSTVLKIVPQL  240 (340)
Q Consensus       184 ~i~~d~~i~~G~SGGPl~d~~G~VVGi~~~~~~~~-~------~~~~~~~aip~~~i~~~l~~l  240 (340)
                      ++.++..+..||||+|++|.+|+|||+++-..-.. .      ....-+..+.+..++.+++++
T Consensus       623 ~FlstnDitGGNSGSPvlN~~GeLVGl~FDgn~Esl~~D~~fdp~~~R~I~VDiRyvL~~ldkv  686 (698)
T PF10459_consen  623 NFLSTNDITGGNSGSPVLNAKGELVGLAFDGNWESLSGDIAFDPELNRTIHVDIRYVLWALDKV  686 (698)
T ss_pred             EEEeccCcCCCCCCCccCCCCceEEEEeecCchhhcccccccccccceeEEEEHHHHHHHHHHH
Confidence            47888899999999999999999999987432111 0      012335567778888888765


No 55 
>COG3975 Predicted protease with the C-terminal PDZ domain [General function prediction only]
Probab=96.55  E-value=0.0036  Score=60.14  Aligned_cols=31  Identities=35%  Similarity=0.302  Sum_probs=29.7

Q ss_pred             CCcEEEeeCCCChhhhcCCCccccCCCCCCcCCcEEEEECCE
Q 019504          268 NGALVLQVPGNSLAAKAGILPTTRGFAGNIILGDIIVAVNNK  309 (340)
Q Consensus       268 ~g~~V~~v~~~spa~~~gl~~~~~~~~~~l~~GDvi~~i~g~  309 (340)
                      .+.+|..|.++|||++|||           .+||-|++|||.
T Consensus       462 g~~~i~~V~~~gPA~~AGl-----------~~Gd~ivai~G~  492 (558)
T COG3975         462 GHEKITFVFPGGPAYKAGL-----------SPGDKIVAINGI  492 (558)
T ss_pred             CeeEEEecCCCChhHhccC-----------CCccEEEEEcCc
Confidence            6789999999999999999           899999999999


No 56 
>PF02122 Peptidase_S39:  Peptidase S39;  InterPro: IPR000382 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Not withstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. ORF2 of Potato leafroll virus (PLrV) encodes a polyprotein which is translated following a -1 frameshift. The polyprotein has a putative linear arrangement of membrane anchor-VPg-peptidase-polymerase domains. The serine peptidase domain which is found in this group of sequences belongs to MEROPS peptidase family S39 (clan PA(S)). It is likely that the peptidase domain is involved in the cleavage of the polyprotein []. The nucleotide sequence for the RNA of PLrV has been determined [, ]. The sequence contains six large open reading frames (ORFs). The 5' coding region encodes two polypeptides of 28K and 70K, which overlap in different reading frames; it is suggested that the third ORF in the 5' block is translated by frameshift readthrough near the end of the 70K protein, yielding a 118K polypeptide []. 
Segments of the predicted amino acid sequences of these ORFs resemble those of known viral RNA polymerases, ATP-binding proteins and viral genome-linked proteins. The nucleotide sequence of the genomic RNA of Beet western yellows virus (BWYV) has been determined []. The sequence contains six long ORFs. A cluster of three of these ORFs, including the coat protein cistron, display extensive amino acid sequence similarity to corresponding ORFs of a second luteovirus: Barley yellow dwarf virus [].; GO: 0004252 serine-type endopeptidase activity, 0022415 viral reproductive process, 0016021 integral to membrane; PDB: 1ZYO_A.
Probab=96.47  E-value=0.032  Score=47.97  Aligned_cols=148  Identities=21%  Similarity=0.208  Sum_probs=49.4

Q ss_pred             CceEEEEE-cCCCEEEeCccccCCCCCCCCCCCccEEEEEEEecCCcee-EEEEEEEEeCCCCcEEEEEEecCC---CCc
Q 019504           57 GNGSGVVW-DGKGHIVTNFHVIGSALSRKPAEGQVVARVNILASDGVQK-NFEGKLVGADRAKDLAVLKIEASE---DLL  131 (340)
Q Consensus        57 ~~GsGfiI-~~~G~IlT~~Hvv~~~~~~~~~~~~~~~~i~v~~~~g~~~-~~~a~v~~~d~~~DlAlL~v~~~~---~~~  131 (340)
                      +.++.+-. +-+-.++|++||.....           .+. .+.+|... .-..+.+..+...|++||+....-   ..+
T Consensus        30 Gya~cv~l~~g~~~L~ta~Hv~~~~~-----------~~~-~~k~g~kipl~~f~~~~~~~~~D~~il~~P~n~~s~Lg~   97 (203)
T PF02122_consen   30 GYATCVRLFDGEDALLTARHVWSRPS-----------KVT-SLKTGEKIPLAEFTDLLESRIADFVILRGPPNWESKLGV   97 (203)
T ss_dssp             ----EEEE----EEEEE-HHHHTSSS---------------EEETTEEEE--S-EEEEE-TTT-EEEEE--HHHHHHHT-
T ss_pred             ccceEEECcCCccceecccccCCCcc-----------cee-EcCCCCcccchhChhhhCCCccCEEEEecCcCHHHHhCc
Confidence            34555442 22237999999999843           222 23344211 112345567889999999998431   113


Q ss_pred             cceeecCCCCCCCCCEEEEEecCCCCCCceeEEEEeeeccccccCCCceecceEEEeeccCCCCccceeecCCCcEEEEE
Q 019504          132 KPINVGQSSFLKVGQQCLAIGNPFGFDHTLTVGVISGLNRDIFSQAGVTIGGGIQTDAAINPGNSGGPLLDSKGNLIGIN  211 (340)
Q Consensus       132 ~~~~l~~~~~~~~G~~v~~iG~p~g~~~~~~~G~vs~~~~~~~~~~~~~~~~~i~~d~~i~~G~SGGPl~d~~G~VVGi~  211 (340)
                      +.+.+.....+..|    .+..     +....+.............+    .+..+-+...+|.||.|+|+.+ +++|++
T Consensus        98 k~~~~~~~~~~~~g----~~~~-----y~~~~~~~~~~sa~i~g~~~----~~~~vls~T~~G~SGtp~y~g~-~vvGvH  163 (203)
T PF02122_consen   98 KAAQLSQNSQLAKG----PVSF-----YGFSSGEWPCSSAKIPGTEG----KFASVLSNTSPGWSGTPYYSGK-NVVGVH  163 (203)
T ss_dssp             ----B----SEEEE----ESST-----TSEEEEEEEEEE-S----ST----TEEEE-----TT-TT-EEE-SS--EEEEE
T ss_pred             ccccccchhhhCCC----Ceee-----eeecCCCceeccCccccccC----cCCceEcCCCCCCCCCCeEECC-CceEee
Confidence            44444322221100    1111     11122111111111222221    2467777888999999999977 999999


Q ss_pred             eeeeeCCCCcCceEEEEehH
Q 019504          212 TAIITQTGTSAGVGFAIPSS  231 (340)
Q Consensus       212 ~~~~~~~~~~~~~~~aip~~  231 (340)
                      ... .......+.++..|+.
T Consensus       164 ~G~-~~~~~~~n~n~~spip  182 (203)
T PF02122_consen  164 TGS-PSGSNRENNNRMSPIP  182 (203)
T ss_dssp             EEE-----------------
T ss_pred             cCc-cccccccccccccccc
Confidence            975 2222234555555543


No 57 
>KOG3209 consensus WW domain-containing protein [General function prediction only]
Probab=96.44  E-value=0.0016  Score=64.20  Aligned_cols=54  Identities=24%  Similarity=0.238  Sum_probs=40.5

Q ss_pred             EEeeCCCChhhhcCCCccccCCCCCCcCCcEEEEECCEEccCCCCCCCceeEEEeeCCCCceEEEE
Q 019504          272 VLQVPGNSLAAKAGILPTTRGFAGNIILGDIIVAVNNKPVSFSCLSIPSRIYLICAEPNQDHLTCL  337 (340)
Q Consensus       272 V~~v~~~spa~~~gl~~~~~~~~~~l~~GDvi~~i~g~~v~~~~d~~~~~~~~~~~~~~~~~~~~~  337 (340)
                      |-+|.+||||+|.|-          |++||-|++|||+.|.+.+.  +--..|++.....++|||+
T Consensus       782 iGrIieGSPAdRCgk----------LkVGDrilAVNG~sI~~lsH--adiv~LIKdaGlsVtLtIi  835 (984)
T KOG3209|consen  782 IGRIIEGSPADRCGK----------LKVGDRILAVNGQSILNLSH--ADIVSLIKDAGLSVTLTII  835 (984)
T ss_pred             ccccccCChhHhhcc----------ccccceEEEecCeeeeccCc--hhHHHHHHhcCceEEEEEc
Confidence            888999999999974          69999999999999998875  1122344444455666664


No 58 
>PF00949 Peptidase_S7:  Peptidase S7, Flavivirus NS3 serine protease ;  InterPro: IPR001850 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. 
They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This signature identifies serine peptidases belonging to MEROPS peptidase family S7 (flavivirin family, clan PA(S)). The protein fold of the peptidase domain for members of this family resembles that of chymotrypsin, the type example for clan PA.  Flaviviruses produce a polyprotein from the ssRNA genome. The N terminus of the NS3 protein (approx. 180 aa) is required for the processing of the polyprotein. NS3 also has conserved homology with NTP-binding proteins and the DEAD family of RNA helicases [, , ].; GO: 0003723 RNA binding, 0003724 RNA helicase activity, 0005524 ATP binding; PDB: 2IJO_B 3E90_D 2GGV_B 2FP7_B 2WV9_A 3U1I_B 3U1J_B 2WZQ_A 2WHX_A 3L6P_A ....
Probab=96.40  E-value=0.0069  Score=48.20  Aligned_cols=33  Identities=21%  Similarity=0.473  Sum_probs=23.2

Q ss_pred             EEEeeccCCCCccceeecCCCcEEEEEeeeeeC
Q 019504          185 IQTDAAINPGNSGGPLLDSKGNLIGINTAIITQ  217 (340)
Q Consensus       185 i~~d~~i~~G~SGGPl~d~~G~VVGi~~~~~~~  217 (340)
                      ...+....+|.||+|+||.+|++|||.......
T Consensus        88 ~~~~~d~~~GsSGSpi~n~~g~ivGlYg~g~~~  120 (132)
T PF00949_consen   88 GAIDLDFPKGSSGSPIFNQNGEIVGLYGNGVEV  120 (132)
T ss_dssp             EEE---S-TTGTT-EEEETTSCEEEEEEEEEE-
T ss_pred             EeeecccCCCCCCCceEcCCCcEEEEEccceee
Confidence            444555778999999999999999998876654


No 59 
>KOG3129 consensus 26S proteasome regulatory complex, subunit PSMD9 [Posttranslational modification, protein turnover, chaperones]
Probab=96.39  E-value=0.0018  Score=54.67  Aligned_cols=60  Identities=22%  Similarity=0.108  Sum_probs=42.5

Q ss_pred             CcEEEeeCCCChhhhcCCCccccCCCCCCcCCcEEEEECCEEccCCCCCCCceeEEEeeCCCCceEEEEeC
Q 019504          269 GALVLQVPGNSLAAKAGILPTTRGFAGNIILGDIIVAVNNKPVSFSCLSIPSRIYLICAEPNQDHLTCLKS  339 (340)
Q Consensus       269 g~~V~~v~~~spa~~~gl~~~~~~~~~~l~~GDvi~~i~g~~v~~~~d~~~~~~~~~~~~~~~~~~~~~~~  339 (340)
                      =+.|.+|.++|||+++||           +.||-|+++....-.++.++.............-..||++|.
T Consensus       140 Fa~V~sV~~~SPA~~aGl-----------~~gD~il~fGnV~sgn~~~lq~i~~~v~~~e~~~v~v~v~R~  199 (231)
T KOG3129|consen  140 FAVVDSVVPGSPADEAGL-----------CVGDEILKFGNVHSGNFLPLQNIAAVVQSNEDQIVSVTVIRE  199 (231)
T ss_pred             eEEEeecCCCChhhhhCc-----------ccCceEEEecccccccchhHHHHHHHHHhccCcceeEEEecC
Confidence            467999999999999999           999999999988777776421111222223334466777775


No 60 
>PRK09681 putative type II secretion protein GspC; Provisional
Probab=96.12  E-value=0.003  Score=56.60  Aligned_cols=53  Identities=23%  Similarity=0.191  Sum_probs=37.5

Q ss_pred             eCCCChh---hhcCCCccccCCCCCCcCCcEEEEECCEEccCCCCCCCceeEEEeeCCCCceEEEEeCC
Q 019504          275 VPGNSLA---AKAGILPTTRGFAGNIILGDIIVAVNNKPVSFSCLSIPSRIYLICAEPNQDHLTCLKSS  340 (340)
Q Consensus       275 v~~~spa---~~~gl~~~~~~~~~~l~~GDvi~~i~g~~v~~~~d~~~~~~~~~~~~~~~~~~~~~~~~  340 (340)
                      |.|+..+   .++||           |+|||+++|||.+++++++.+.....|.  +....++||.|++
T Consensus       211 l~Pgkd~~lF~~~GL-----------q~GDva~sING~dL~D~~qa~~l~~~L~--~~tei~ltVeRdG  266 (276)
T PRK09681        211 VKPGADRSLFDASGF-----------KEGDIAIALNQQDFTDPRAMIALMRQLP--SMDSIQLTVLRKG  266 (276)
T ss_pred             ECCCCcHHHHHHcCC-----------CCCCEEEEeCCeeCCCHHHHHHHHHHhc--cCCeEEEEEEECC
Confidence            4565433   47798           9999999999999999997433333333  3345788888864


No 61 
>KOG3532 consensus Predicted protein kinase [General function prediction only]
Probab=95.94  E-value=0.0029  Score=62.09  Aligned_cols=54  Identities=35%  Similarity=0.434  Sum_probs=41.9

Q ss_pred             CCcEEEeeCCCChhhhcCCCccccCCCCCCcCCcEEEEECCEEccCCCCCCCceeEEEeeCCCCceEEEEe
Q 019504          268 NGALVLQVPGNSLAAKAGILPTTRGFAGNIILGDIIVAVNNKPVSFSCLSIPSRIYLICAEPNQDHLTCLK  338 (340)
Q Consensus       268 ~g~~V~~v~~~spa~~~gl~~~~~~~~~~l~~GDvi~~i~g~~v~~~~d~~~~~~~~~~~~~~~~~~~~~~  338 (340)
                      .-+.|..|++++||.++.+           ++|||+++|||.||++..+   ...++....   -+|+++|
T Consensus       398 ~~v~v~tv~~ns~a~k~~~-----------~~gdvlvai~~~pi~s~~q---~~~~~~s~~---~~~~~l~  451 (1051)
T KOG3532|consen  398 RAVKVCTVEDNSLADKAAF-----------KPGDVLVAINNVPIRSERQ---ATRFLQSTT---GDLTVLV  451 (1051)
T ss_pred             eEEEEEEecCCChhhHhcC-----------CCcceEEEecCccchhHHH---HHHHHHhcc---cceEEEE
Confidence            4677888999999999998           8999999999999999987   444444333   3455554


No 62 
>KOG3580 consensus Tight junction proteins [Signal transduction mechanisms]
Probab=94.88  E-value=0.017  Score=56.20  Aligned_cols=78  Identities=23%  Similarity=0.314  Sum_probs=54.1

Q ss_pred             eeEEeccHHHHhhcCC--CCCcEEEeeCCCChhhhcCCCccccCCCCCCcCCcEEEEECCEEccCCCCCCCceeEEEeeC
Q 019504          251 LNVDIAPDLVASQLNV--GNGALVLQVPGNSLAAKAGILPTTRGFAGNIILGDIIVAVNNKPVSFSCLSIPSRIYLICAE  328 (340)
Q Consensus       251 lg~~~~~~~~~~~~~~--~~g~~V~~v~~~spa~~~gl~~~~~~~~~~l~~GDvi~~i~g~~v~~~~d~~~~~~~~~~~~  328 (340)
                      +++.+....-.+.||+  ..-+.|+++...+.|++-|=          |+.|||||+|||....+++ +...+ .|+.+-
T Consensus       200 ~kv~LvKsR~nEEyGlrLgSqIFvKeit~~gLAardgn----------lqEGDiiLkINGtvteNmS-LtDar-~LIEkS  267 (1027)
T KOG3580|consen  200 IKVLLVKSRANEEYGLRLGSQIFVKEITRTGLAARDGN----------LQEGDIILKINGTVTENMS-LTDAR-KLIEKS  267 (1027)
T ss_pred             ceEEEEeeccchhhcccccchhhhhhhcccchhhccCC----------cccccEEEEECcEeecccc-chhHH-HHHHhc
Confidence            3444433333344554  47889999998888888753          5999999999999999887 33333 444555


Q ss_pred             CCCceEEEEeCC
Q 019504          329 PNQDHLTCLKSS  340 (340)
Q Consensus       329 ~~~~~~~~~~~~  340 (340)
                      .++..+-|+|.+
T Consensus       268 ~GKL~lvVlRD~  279 (1027)
T KOG3580|consen  268 RGKLQLVVLRDS  279 (1027)
T ss_pred             cCceEEEEEecC
Confidence            566888888864


No 63 
>TIGR02860 spore_IV_B stage IV sporulation protein B. SpoIVB, the stage IV sporulation protein B of endospore-forming bacteria such as Bacillus subtilis, is a serine proteinase, expressed in the spore (rather than mother cell) compartment, that participates in a proteolytic activation cascade for Sigma-K. It appears to be universal among endospore-forming bacteria and occurs nowhere else.
Probab=94.79  E-value=0.26  Score=46.83  Aligned_cols=96  Identities=19%  Similarity=0.174  Sum_probs=52.0

Q ss_pred             cceeecCCCCCCCCCEEEEEecCCCCCCceeEEEEeeeccccccCCCcee-----cceEEEeeccCCCCccceeecCCCc
Q 019504          132 KPINVGQSSFLKVGQQCLAIGNPFGFDHTLTVGVISGLNRDIFSQAGVTI-----GGGIQTDAAINPGNSGGPLLDSKGN  206 (340)
Q Consensus       132 ~~~~l~~~~~~~~G~~v~~iG~p~g~~~~~~~G~vs~~~~~~~~~~~~~~-----~~~i~~d~~i~~G~SGGPl~d~~G~  206 (340)
                      .+++++....+++|......-. .+.......-.|..+............     .+++..+..+..||||+|++ .+|+
T Consensus       294 ~~~~va~~~ev~~G~a~i~t~~-~g~~~~~~~iei~~v~~~~~~~~k~~~i~~td~~ll~~tgGivqGMSGSPi~-q~gk  371 (402)
T TIGR02860       294 KPMPVALRDEVKEGPAKILTVI-DGEKVEKFDIEIVKLVPQNSPATKGMVIKITDPRLLEKTGGIVQGMSGSPII-QNGK  371 (402)
T ss_pred             cEEeEEEHHHcccccEEEEEEE-cCCEEEEEEEEEEEEccCCCCCCceEEEEEcCccHhhHhCCEEecccCCCEE-ECCE
Confidence            5667776778888887532221 222111112222233222111111110     11233344566899999999 6999


Q ss_pred             EEEEEeeeeeCCCCcCceEEEEehHh
Q 019504          207 LIGINTAIITQTGTSAGVGFAIPSST  232 (340)
Q Consensus       207 VVGi~~~~~~~~~~~~~~~~aip~~~  232 (340)
                      +||=++-.+.+   .+..+|+|-++.
T Consensus       372 liGAvtHVfvn---dpt~GYGi~ie~  394 (402)
T TIGR02860       372 VIGAVTHVFVN---DPTSGYGVYIEW  394 (402)
T ss_pred             EEEEEEEEEec---CCCcceeehHHH
Confidence            99987776664   355678886644


No 64 
>KOG3580 consensus Tight junction proteins [Signal transduction mechanisms]
Probab=94.51  E-value=0.014  Score=56.79  Aligned_cols=60  Identities=23%  Similarity=0.263  Sum_probs=45.8

Q ss_pred             CCCCcEEEeeCCCChhhhcCCCccccCCCCCCcCCcEEEEECCEEccCCCCCCCceeEEEeeCCCCceEEEEe
Q 019504          266 VGNGALVLQVPGNSLAAKAGILPTTRGFAGNIILGDIIVAVNNKPVSFSCLSIPSRIYLICAEPNQDHLTCLK  338 (340)
Q Consensus       266 ~~~g~~V~~v~~~spa~~~gl~~~~~~~~~~l~~GDvi~~i~g~~v~~~~d~~~~~~~~~~~~~~~~~~~~~~  338 (340)
                      ..-|+.|..|..+|||+..||           +.||-||.||..+.++.--. +.-.+|.... ++.++||+-
T Consensus       427 NDVGIFVaGvqegspA~~eGl-----------qEGDQIL~VN~vdF~nl~RE-eAVlfLL~lP-kGEevtila  486 (1027)
T KOG3580|consen  427 NDVGIFVAGVQEGSPAEQEGL-----------QEGDQILKVNTVDFRNLVRE-EAVLFLLELP-KGEEVTILA  486 (1027)
T ss_pred             CceeEEEeecccCCchhhccc-----------cccceeEEeccccchhhhHH-HHHHHHhcCC-CCcEEeehh
Confidence            346999999999999999999           99999999999988775310 1223444444 568999873


No 65 
>COG3031 PulC Type II secretory pathway, component PulC [Intracellular trafficking and secretion]
Probab=94.41  E-value=0.016  Score=50.15  Aligned_cols=59  Identities=25%  Similarity=0.161  Sum_probs=43.9

Q ss_pred             CcEEEeeCCCChhhhcCCCccccCCCCCCcCCcEEEEECCEEccCCCCCCCceeEEEeeCCCCceEEEEeCC
Q 019504          269 GALVLQVPGNSLAAKAGILPTTRGFAGNIILGDIIVAVNNKPVSFSCLSIPSRIYLICAEPNQDHLTCLKSS  340 (340)
Q Consensus       269 g~~V~~v~~~spa~~~gl~~~~~~~~~~l~~GDvi~~i~g~~v~~~~d~~~~~~~~~~~~~~~~~~~~~~~~  340 (340)
                      |..+.=..+++.-+..||           |.|||.+++|+..+++++|.+.....+....  .-.+||+|.+
T Consensus       208 Gyr~~pgkd~slF~~sgl-----------q~GDIavaiNnldltdp~~m~~llq~l~~m~--s~qlTv~R~G  266 (275)
T COG3031         208 GYRFEPGKDGSLFYKSGL-----------QRGDIAVAINNLDLTDPEDMFRLLQMLRNMP--SLQLTVIRRG  266 (275)
T ss_pred             EEEecCCCCcchhhhhcC-----------CCcceEEEecCcccCCHHHHHHHHHhhhcCc--ceEEEEEecC
Confidence            333333455677778888           9999999999999999998655555555443  4789999975


No 66 
>PF02907 Peptidase_S29:  Hepatitis C virus NS3 protease;  InterPro: IPR004109 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This signature identifies the Hepatitis C virus NS3 protein as a serine protease which belongs to MEROPS peptidase family S29 (hepacivirin family, clan PA(S)), which has a trypsin-like fold. The non-structural (NS) protein NS3 is one of the NS proteins involved in replication of the HCV genome. The NS2 proteinase (IPR002518 from INTERPRO), a zinc-dependent enzyme, performs a single proteolytic cut to release the N terminus of NS3. The action of NS3 proteinase (NS3P), which resides in the N-terminal one-third of the NS3 protein, then yields all remaining non-structural proteins. The C-terminal two-thirds of the NS3 protein contain a helicase. The functional relationship between the proteinase and helicase domains is unknown. NS3 has a structural zinc-binding site and requires cofactor NS4. 
It has been suggested that the NS3 serine protease of hepatitis C is involved in cell transformation and that the ability to transform requires an active enzyme [].; GO: 0008236 serine-type peptidase activity, 0006508 proteolysis, 0019087 transformation of host cell by virus; PDB: 2QV1_B 3LOX_C 2OBQ_C 2OC1_C 2OC0_A 3LON_A 3KNX_A 2O8M_A 2OBO_A 2OC8_A ....
Probab=93.43  E-value=0.15  Score=40.28  Aligned_cols=131  Identities=21%  Similarity=0.291  Sum_probs=65.4

Q ss_pred             eEEEEEcCCCEEEeCccccCCCCCCCCCCCccEEEEEEEecCCceeEEEEEEEEeCCCCcEEEEEEecCCCCccceeecC
Q 019504           59 GSGVVWDGKGHIVTNFHVIGSALSRKPAEGQVVARVNILASDGVQKNFEGKLVGADRAKDLAVLKIEASEDLLKPINVGQ  138 (340)
Q Consensus        59 GsGfiI~~~G~IlT~~Hvv~~~~~~~~~~~~~~~~i~v~~~~g~~~~~~a~v~~~d~~~DlAlL~v~~~~~~~~~~~l~~  138 (340)
                      --|+.|+  |.+-|.+|--....            +  --+-|     +..-.+.+...|+..-....-...+.|-.-+.
T Consensus        14 fmgt~vn--GV~wT~~HGagsrt------------l--Agp~G-----pv~q~~~s~~~Dlv~~p~P~Ga~SL~pCtCg~   72 (148)
T PF02907_consen   14 FMGTCVN--GVMWTVYHGAGSRT------------L--AGPKG-----PVNQMYTSVDDDLVGWPAPPGARSLTPCTCGS   72 (148)
T ss_dssp             EEEEEET--TEEEEEHHHHTTSE------------E--EBTTS-----EB-ESEEETTTTEEEEE-STTB--BBB-SSSS
T ss_pred             eehhEEc--cEEEEEEecCCccc------------c--cCCCC-----cceEeEEcCCCCCcccccccccccCCccccCC
Confidence            3577786  78999999654321            0  01111     22334567778888777765544444444431


Q ss_pred             CCCCCCCCEEEEEecCCCCCCceeEEEEeeeccccccCCCceecceEEEe--eccCCCCccceeecCCCcEEEEEeeeee
Q 019504          139 SSFLKVGQQCLAIGNPFGFDHTLTVGVISGLNRDIFSQAGVTIGGGIQTD--AAINPGNSGGPLLDSKGNLIGINTAIIT  216 (340)
Q Consensus       139 ~~~~~~G~~v~~iG~p~g~~~~~~~G~vs~~~~~~~~~~~~~~~~~i~~d--~~i~~G~SGGPl~d~~G~VVGi~~~~~~  216 (340)
                             ..+|++-+-..    +..+.     +..  ....    .+..-  .....|.||||++-.+|.+|||..+...
T Consensus        73 -------~dlylVtr~~~----v~p~r-----r~g--d~~~----~L~sp~pis~lkGSSGgPiLC~~GH~vG~f~aa~~  130 (148)
T PF02907_consen   73 -------SDLYLVTRDAD----VIPVR-----RRG--DSRA----SLLSPRPISDLKGSSGGPILCPSGHAVGMFRAAVC  130 (148)
T ss_dssp             -------SEEEEE-TTS-----EEEEE-----EES--TTEE----EEEEEEEHHHHTT-TT-EEEETTSEEEEEEEEEEE
T ss_pred             -------ccEEEEeccCc----EeeeE-----EcC--CCce----EecCCceeEEEecCCCCcccCCCCCEEEEEEEEEE
Confidence                   35566643211    11111     000  0000    01111  1234799999999889999999887665


Q ss_pred             CCCCcCceEEEEehHhH
Q 019504          217 QTGTSAGVGFAIPSSTV  233 (340)
Q Consensus       217 ~~~~~~~~~~aip~~~i  233 (340)
                      ..+....+-| +|++.+
T Consensus       131 trgvak~i~f-~P~e~l  146 (148)
T PF02907_consen  131 TRGVAKAIDF-IPVETL  146 (148)
T ss_dssp             ETTEEEEEEE-EEHHHH
T ss_pred             cCCceeeEEE-Eeeeec
Confidence            4433344555 487654


No 67 
>KOG3550 consensus Receptor targeting protein Lin-7 [Extracellular structures]
Probab=93.25  E-value=0.096  Score=42.02  Aligned_cols=38  Identities=18%  Similarity=0.228  Sum_probs=32.7

Q ss_pred             CCcEEEeeCCCChhhhcCCCccccCCCCCCcCCcEEEEECCEEccCCC
Q 019504          268 NGALVLQVPGNSLAAKAGILPTTRGFAGNIILGDIIVAVNNKPVSFSC  315 (340)
Q Consensus       268 ~g~~V~~v~~~spa~~~gl~~~~~~~~~~l~~GDvi~~i~g~~v~~~~  315 (340)
                      +-+||+++.||+-|+|-|=          |+.||-++++||..|..-.
T Consensus       115 spiyisriipggvadrhgg----------lkrgdqllsvngvsvege~  152 (207)
T KOG3550|consen  115 SPIYISRIIPGGVADRHGG----------LKRGDQLLSVNGVSVEGEH  152 (207)
T ss_pred             CceEEEeecCCccccccCc----------ccccceeEeecceeecchh
Confidence            6799999999999998753          4999999999999887544


No 68 
>PF00944 Peptidase_S3:  Alphavirus core protein ;  InterPro: IPR000930 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. Togavirin, also known as Sindbis virus core endopeptidase, is a serine protease resident at the N terminus of the p130 polyprotein of togaviruses []. The endopeptidase signature identifies the peptidase as belonging to the MEROPS peptidase family S3 (togavirin family, clan PA(S)). The polyprotein also includes structural proteins for the nucleocapsid core and for the glycoprotein spikes []. Togavirin is only active while part of the polyprotein, cleavage at a Trp-Ser bond resulting in total lack of activity []. Mutagenesis studies have identified the location of the His-Asp-Ser catalytic triad, and X-ray studies have revealed the protein fold to be similar to that of chymotrypsin [, ].; GO: 0004252 serine-type endopeptidase activity, 0006508 proteolysis, 0016020 membrane; PDB: 2YEW_D 1EP5_A 3J0C_F 1EP6_C 1WYK_D 1DYL_A 1VCQ_B 1VCP_B 1LD4_D 1KXA_A ....
Probab=92.78  E-value=0.19  Score=39.61  Aligned_cols=30  Identities=27%  Similarity=0.550  Sum_probs=24.5

Q ss_pred             eeccCCCCccceeecCCCcEEEEEeeeeeC
Q 019504          188 DAAINPGNSGGPLLDSKGNLIGINTAIITQ  217 (340)
Q Consensus       188 d~~i~~G~SGGPl~d~~G~VVGi~~~~~~~  217 (340)
                      ...-.+|+||-|++|..|+||||+..+..+
T Consensus       100 ~g~g~~GDSGRpi~DNsGrVVaIVLGG~ne  129 (158)
T PF00944_consen  100 TGVGKPGDSGRPIFDNSGRVVAIVLGGANE  129 (158)
T ss_dssp             TTS-STTSTTEEEESTTSBEEEEEEEEEEE
T ss_pred             cCCCCCCCCCCccCcCCCCEEEEEecCCCC
Confidence            344578999999999999999999876653


No 69 
>KOG2921 consensus Intramembrane metalloprotease (sterol-regulatory element-binding protein (SREBP) protease) [Posttranslational modification, protein turnover, chaperones]
Probab=91.96  E-value=0.095  Score=48.74  Aligned_cols=43  Identities=28%  Similarity=0.387  Sum_probs=36.0

Q ss_pred             hcCCCCCcEEEeeCCCChhhhc-CCCccccCCCCCCcCCcEEEEECCEEccCCCC
Q 019504          263 QLNVGNGALVLQVPGNSLAAKA-GILPTTRGFAGNIILGDIIVAVNNKPVSFSCL  316 (340)
Q Consensus       263 ~~~~~~g~~V~~v~~~spa~~~-gl~~~~~~~~~~l~~GDvi~~i~g~~v~~~~d  316 (340)
                      .|-...|+.|++|...||+..- ||           .+||+|+++||.+|++.+|
T Consensus       215 fya~g~gV~Vtev~~~Spl~gprGL-----------~vgdvitsldgcpV~~v~d  258 (484)
T KOG2921|consen  215 FYAHGEGVTVTEVPSVSPLFGPRGL-----------SVGDVITSLDGCPVHKVSD  258 (484)
T ss_pred             hhhcCceEEEEeccccCCCcCcccC-----------CccceEEecCCcccCCHHH
Confidence            3444589999999999987622 55           9999999999999999998


No 70 
>PF03510 Peptidase_C24:  2C endopeptidase (C24) cysteine protease family;  InterPro: IPR000317 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. 
These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad [].  The two signatures that define this group of calicivirus polyproteins identify a cysteine peptidase signature that belongs to MEROPS peptidase family C24 (clan PA(C)). Caliciviruses are positive-stranded ssRNA viruses that cause gastroenteritis. The calicivirus genome contains two open reading frames, ORF1 and ORF2. ORF2 encodes a structural protein []; while ORF1 encodes a non-structural polypeptide, which has RNA helicase, cysteine protease and RNA polymerase activity. The regions of the polyprotein in which these activities lie are similar to proteins produced by the picornaviruses. Two different families of caliciviruses can be distinguished on the basis of sequence similarity, namely those classified as small round structured viruses (SRSVs) and those classed as non-SRSVs. Calicivirus proteases from the non-SRSV group, which are members of the PA protease clan, constitute family C24 of the cysteine proteases (proteases from SRSVs belong to the C37 family). As mentioned above, the protease activity resides within a polyprotein. The enzyme cleaves the polyprotein at sites N-terminal to itself, liberating the polyprotein helicase.; GO: 0004197 cysteine-type endopeptidase activity, 0006508 proteolysis
Probab=91.94  E-value=1  Score=34.30  Aligned_cols=56  Identities=29%  Similarity=0.303  Sum_probs=36.3

Q ss_pred             eEEEEEcCCCEEEeCccccCCCCCCCCCCCccEEEEEEEecCCceeEEEEEEEEeCCCCcEEEEEEecCCCCccceeecC
Q 019504           59 GSGVVWDGKGHIVTNFHVIGSALSRKPAEGQVVARVNILASDGVQKNFEGKLVGADRAKDLAVLKIEASEDLLKPINVGQ  138 (340)
Q Consensus        59 GsGfiI~~~G~IlT~~Hvv~~~~~~~~~~~~~~~~i~v~~~~g~~~~~~a~v~~~d~~~DlAlL~v~~~~~~~~~~~l~~  138 (340)
                      |-++-|. +|..+|+.||.+..+           .|     +|.    +-+++.  ...|+|+++.+...  ++.+++++
T Consensus         1 G~avHIG-nG~~vt~tHva~~~~-----------~v-----~g~----~f~~~~--~~ge~~~v~~~~~~--~p~~~ig~   55 (105)
T PF03510_consen    1 GWAVHIG-NGRYVTVTHVAKSSD-----------SV-----DGQ----PFKIVK--TDGELCWVQSPLVH--LPAAQIGT   55 (105)
T ss_pred             CceEEeC-CCEEEEEEEEeccCc-----------eE-----cCc----CcEEEE--eccCEEEEECCCCC--CCeeEecc
Confidence            3467776 689999999998764           22     122    223333  34599999998764  56666754


Q ss_pred             C
Q 019504          139 S  139 (340)
Q Consensus       139 ~  139 (340)
                      .
T Consensus        56 g   56 (105)
T PF03510_consen   56 G   56 (105)
T ss_pred             C
Confidence            3


No 71 
>KOG3552 consensus FERM domain protein FRM-8 [General function prediction only]
Probab=91.87  E-value=0.14  Score=52.59  Aligned_cols=36  Identities=31%  Similarity=0.360  Sum_probs=31.2

Q ss_pred             CCcEEEeeCCCChhhhcCCCccccCCCCCCcCCcEEEEECCEEccCCC
Q 019504          268 NGALVLQVPGNSLAAKAGILPTTRGFAGNIILGDIIVAVNNKPVSFSC  315 (340)
Q Consensus       268 ~g~~V~~v~~~spa~~~gl~~~~~~~~~~l~~GDvi~~i~g~~v~~~~  315 (340)
                      .-++|..|.+|+|+.            |+|++||-|++|||++|.+..
T Consensus        75 rPviVr~VT~GGps~------------GKL~PGDQIl~vN~Epv~dap  110 (1298)
T KOG3552|consen   75 RPVIVRFVTEGGPSI------------GKLQPGDQILAVNGEPVKDAP  110 (1298)
T ss_pred             CceEEEEecCCCCcc------------ccccCCCeEEEecCccccccc
Confidence            368899999999986            457999999999999998764


No 72 
>KOG3209 consensus WW domain-containing protein [General function prediction only]
Probab=91.78  E-value=0.092  Score=52.26  Aligned_cols=60  Identities=18%  Similarity=0.239  Sum_probs=47.2

Q ss_pred             CCcEEEeeCCCChhhhcCCCccccCCCCCCcCCcEEEEECCEEccCCCCCCCceeEEEeeCCCCceEEEEeCC
Q 019504          268 NGALVLQVPGNSLAAKAGILPTTRGFAGNIILGDIIVAVNNKPVSFSCLSIPSRIYLICAEPNQDHLTCLKSS  340 (340)
Q Consensus       268 ~g~~V~~v~~~spa~~~gl~~~~~~~~~~l~~GDvi~~i~g~~v~~~~d~~~~~~~~~~~~~~~~~~~~~~~~  340 (340)
                      -+++|-++..++||.+.|=          +++||-|++|||+....+..   .+.+-.-+..+...+++||.+
T Consensus       923 M~LfVLRlAeDGPA~rdGr----------m~VGDqi~eINGesTkgmtH---~rAIelIk~gg~~vll~Lr~g  982 (984)
T KOG3209|consen  923 MDLFVLRLAEDGPAIRDGR----------MRVGDQITEINGESTKGMTH---DRAIELIKQGGRRVLLLLRRG  982 (984)
T ss_pred             cceEEEEeccCCCccccCc----------eeecceEEEecCcccCCCcH---HHHHHHHHhCCeEEEEEeccC
Confidence            4799999999999999974          59999999999999999886   344333344556777888763


No 73 
>KOG3605 consensus Beta amyloid precursor-binding protein [General function prediction only]
Probab=91.15  E-value=0.38  Score=47.59  Aligned_cols=101  Identities=24%  Similarity=0.287  Sum_probs=69.6

Q ss_pred             cCCCCccceee-----cCCCcEEEEEeeeeeCCCCcCceEEEEehHhHHHHHHHHHHcCceeee------eeeEEeccHH
Q 019504          191 INPGNSGGPLL-----DSKGNLIGINTAIITQTGTSAGVGFAIPSSTVLKIVPQLIQYGKVVRA------GLNVDIAPDL  259 (340)
Q Consensus       191 i~~G~SGGPl~-----d~~G~VVGi~~~~~~~~~~~~~~~~aip~~~i~~~l~~l~~~~~~~~~------~lg~~~~~~~  259 (340)
                      +..=++|||.-     |.--+++.|+-..          -..+|.+.....++.+++.-.+...      -.-+...-+.
T Consensus       677 iAnmm~~GpAarsgkLnIGDQiiaING~S----------LVGLPLstcQs~Ik~~KnQT~VkltiV~cpPV~~V~I~RPd  746 (829)
T KOG3605|consen  677 IANMMHGGPAARSGKLNIGDQIMSINGTS----------LVGLPLSTCQSIIKGLKNQTAVKLNIVSCPPVTTVLIRRPD  746 (829)
T ss_pred             HHhcccCChhhhcCCccccceeEeecCce----------eccccHHHHHHHHhcccccceEEEEEecCCCceEEEeeccc
Confidence            33457788874     3334566664332          2368999999999988776655432      2334444566


Q ss_pred             HHhhcCCC--CCcEEEeeCCCChhhhcCCCccccCCCCCCcCCcEEEEECCEEccC
Q 019504          260 VASQLNVG--NGALVLQVPGNSLAAKAGILPTTRGFAGNIILGDIIVAVNNKPVSF  313 (340)
Q Consensus       260 ~~~~~~~~--~g~~V~~v~~~spa~~~gl~~~~~~~~~~l~~GDvi~~i~g~~v~~  313 (340)
                      +..++|..  +|++-+ ...|+=|+|-|+           ++|--|++|||+.|.-
T Consensus       747 ~kyQLGFSVQNGiICS-LlRGGIAERGGV-----------RVGHRIIEINgQSVVA  790 (829)
T KOG3605|consen  747 LRYQLGFSVQNGIICS-LLRGGIAERGGV-----------RVGHRIIEINGQSVVA  790 (829)
T ss_pred             chhhccceeeCcEeeh-hhcccchhccCc-----------eeeeeEEEECCceEEe
Confidence            66677766  777544 567888999999           8999999999998853


No 74 
>KOG3542 consensus cAMP-regulated guanine nucleotide exchange factor [Signal transduction mechanisms]
Probab=90.92  E-value=0.091  Score=52.04  Aligned_cols=37  Identities=32%  Similarity=0.349  Sum_probs=34.4

Q ss_pred             CCcEEEeeCCCChhhhcCCCccccCCCCCCcCCcEEEEECCEEccCCC
Q 019504          268 NGALVLQVPGNSLAAKAGILPTTRGFAGNIILGDIIVAVNNKPVSFSC  315 (340)
Q Consensus       268 ~g~~V~~v~~~spa~~~gl~~~~~~~~~~l~~GDvi~~i~g~~v~~~~  315 (340)
                      .|++|.+|.|++.|++.||           +.||-|++|||+...+..
T Consensus       562 fgifV~~V~pgskAa~~Gl-----------KRgDqilEVNgQnfenis  598 (1283)
T KOG3542|consen  562 FGIFVAEVFPGSKAAREGL-----------KRGDQILEVNGQNFENIS  598 (1283)
T ss_pred             ceeEEeeecCCchHHHhhh-----------hhhhhhhhccccchhhhh
Confidence            7999999999999999999           899999999999877665


No 75 
>PF02395 Peptidase_S6:  Immunoglobulin A1 protease Serine protease Prosite pattern;  InterPro: IPR000710 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. 
They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. Over 20 families (denoted S1 - S66) of serine proteases have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base. The geometric orientations of the catalytic residues are similar between families, despite different protein folds. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC). This group of serine peptidases belongs to the MEROPS peptidase family S6 (clan PA(S)). The type example is the IgA1-specific serine endopeptidase from Neisseria gonorrhoeae. These peptidases cleave prolyl bonds in the hinge regions of immunoglobulin A heavy chains. Similar specificity is shown by the unrelated family of M26 metalloendopeptidases.; GO: 0004252 serine-type endopeptidase activity, 0006508 proteolysis; PDB: 3SZE_A 3H09_B 3SYJ_A 1WXR_A 3AK5_B.
Probab=90.56  E-value=3.7  Score=42.66  Aligned_cols=49  Identities=29%  Similarity=0.357  Sum_probs=30.0

Q ss_pred             cCCCCccceee--cC---CCcEEEEEeeeeeCCCCcCceEEEEehHhHHHHHHHH
Q 019504          191 INPGNSGGPLL--DS---KGNLIGINTAIITQTGTSAGVGFAIPSSTVLKIVPQL  240 (340)
Q Consensus       191 i~~G~SGGPl~--d~---~G~VVGi~~~~~~~~~~~~~~~~aip~~~i~~~l~~l  240 (340)
                      ..+|+||+|||  |.   +.-++|+.+......+ .......+|.+++.++.++.
T Consensus       213 ~~~GDSGSPlF~YD~~~kKWvl~Gv~~~~~~~~g-~~~~~~~~~~~f~~~~~~~d  266 (769)
T PF02395_consen  213 GSPGDSGSPLFAYDKEKKKWVLVGVLSGGNGYNG-KGNWWNVIPPDFINQIKQND  266 (769)
T ss_dssp             --TT-TT-EEEEEETTTTEEEEEEEEEEECCCCH-SEEEEEEECHHHHHHHHHHC
T ss_pred             cccCcCCCceEEEEccCCeEEEEEEEccccccCC-ccceeEEecHHHHHHHHhhh
Confidence            45899999998  32   3459999876543322 22455678888887777663


No 76 
>PF05416 Peptidase_C37:  Southampton virus-type processing peptidase;  InterPro: IPR001665 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. 
These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad. This group of cysteine peptidases belongs to the MEROPS peptidase family C37 (clan PA(C)). The type example is calicivirin from Southampton virus, an endopeptidase that cleaves the polyprotein at sites N-terminal to itself, liberating the polyprotein helicase. Southampton virus is a positive-stranded ssRNA virus belonging to the caliciviruses, which are viruses that cause gastroenteritis. The calicivirus genome contains two open reading frames, ORF1 and ORF2. ORF1 encodes a non-structural polypeptide, which has RNA helicase, cysteine protease and RNA polymerase activity. The regions of the polyprotein in which these activities lie are similar to proteins produced by the picornaviruses. ORF2 encodes a structural capsid protein. Two different families of caliciviruses can be distinguished on the basis of sequence similarity, namely the Norwalk-like viruses or small round structured viruses (SRSVs), and those classed as non-SRSVs.; GO: 0004197 cysteine-type endopeptidase activity, 0006508 proteolysis; PDB: 2FYQ_A 2FYR_A 1WQS_D 4ASH_A 2IPH_B.
Probab=89.29  E-value=0.84  Score=43.06  Aligned_cols=140  Identities=19%  Similarity=0.252  Sum_probs=68.9

Q ss_pred             cccCCceEEEEEcCCCEEEeCccccCCCCCCCCCCCccEEEEEEEecCCceeEEEEEEEEeCCCCcEEEEEEecCC-CCc
Q 019504           53 EIPEGNGSGVVWDGKGHIVTNFHVIGSALSRKPAEGQVVARVNILASDGVQKNFEGKLVGADRAKDLAVLKIEASE-DLL  131 (340)
Q Consensus        53 ~~~~~~GsGfiI~~~G~IlT~~Hvv~~~~~~~~~~~~~~~~i~v~~~~g~~~~~~a~v~~~d~~~DlAlL~v~~~~-~~~  131 (340)
                      -..-+.|-||.++++ +++|+.||+.....          ++.     |    .+-.-+..+..-+++-+++..+- ..+
T Consensus       375 iv~fGsGWGfWVS~~-lfITttHViP~g~~----------E~F-----G----v~i~~i~vh~sGeF~~~rFpk~iRPDv  434 (535)
T PF05416_consen  375 IVKFGSGWGFWVSPT-LFITTTHVIPPGAK----------EAF-----G----VPISQIQVHKSGEFCRFRFPKPIRPDV  434 (535)
T ss_dssp             EEEETTEEEEESSSS-EEEEEGGGS-STTS----------EET-----T----EECGGEEEEEETTEEEEEESS-SSTTS
T ss_pred             heecCCceeeeecce-EEEEeeeecCCcch----------hhh-----C----CChhHeEEeeccceEEEecCCCCCCCc
Confidence            344577999999987 99999999987542          111     1    11111233444577777777652 234


Q ss_pred             cceeecCCCCCCCCCEEEE-EecCCCC--CCceeEEEEeeeccccccCCCceecceEEE-------eeccCCCCccceee
Q 019504          132 KPINVGQSSFLKVGQQCLA-IGNPFGF--DHTLTVGVISGLNRDIFSQAGVTIGGGIQT-------DAAINPGNSGGPLL  201 (340)
Q Consensus       132 ~~~~l~~~~~~~~G~~v~~-iG~p~g~--~~~~~~G~vs~~~~~~~~~~~~~~~~~i~~-------d~~i~~G~SGGPl~  201 (340)
                      +-+.|.  .-...|.-+.+ +=.|.|.  +..+..|......-.-..-.+.  ..++.+       |....||+-|.|-+
T Consensus       435 tgmiLE--eGapEGtV~siLiKR~sGEllpLAvRMgt~AsmkIqgr~v~GQ--~GMLLTGaNAK~mDLGT~PGDCGcPYv  510 (535)
T PF05416_consen  435 TGMILE--EGAPEGTVCSILIKRPSGELLPLAVRMGTHASMKIQGRTVHGQ--MGMLLTGANAKGMDLGTIPGDCGCPYV  510 (535)
T ss_dssp             ---EE---SS--TT-EEEEEEE-TTSBEEEEEEEEEEEEEEEETTEEEEEE--EEEETTSTT-SSTTTS--TTGTT-EEE
T ss_pred             cceeec--cCCCCceEEEEEEEcCCccchhhhhhhccceeEEEcceeecce--eeeeeecCCccccccCCCCCCCCCcee
Confidence            555553  22344655533 4455442  2344555544332110000000  122322       33456899999999


Q ss_pred             cCCC---cEEEEEeeeee
Q 019504          202 DSKG---NLIGINTAIIT  216 (340)
Q Consensus       202 d~~G---~VVGi~~~~~~  216 (340)
                      -..|   -|+|++++...
T Consensus       511 yKrgNd~VV~GVH~AAtr  528 (535)
T PF05416_consen  511 YKRGNDWVVIGVHAAATR  528 (535)
T ss_dssp             EEETTEEEEEEEEEEE-S
T ss_pred             eecCCcEEEEEEEehhcc
Confidence            6555   49999998654


No 77 
>KOG1892 consensus Actin filament-binding protein Afadin [Cytoskeleton]
Probab=88.70  E-value=0.19  Score=51.82  Aligned_cols=59  Identities=20%  Similarity=0.177  Sum_probs=42.2

Q ss_pred             CCcEEEeeCCCChhhhcCCCccccCCCCCCcCCcEEEEECCEEccCCCCCCCceeEEEeeCCCCceEEEEe
Q 019504          268 NGALVLQVPGNSLAAKAGILPTTRGFAGNIILGDIIVAVNNKPVSFSCLSIPSRIYLICAEPNQDHLTCLK  338 (340)
Q Consensus       268 ~g~~V~~v~~~spa~~~gl~~~~~~~~~~l~~GDvi~~i~g~~v~~~~d~~~~~~~~~~~~~~~~~~~~~~  338 (340)
                      -|+||++|.+|++|+.-|=          |+.||-+|+|||+.+--+.++- ...++. ...+-+++.|.|
T Consensus       960 lGIYvKsVV~GgaAd~DGR----------L~aGDQLLsVdG~SLiGisQEr-AA~lmt-rtg~vV~leVaK 1018 (1629)
T KOG1892|consen  960 LGIYVKSVVEGGAADHDGR----------LEAGDQLLSVDGHSLIGISQER-AARLMT-RTGNVVHLEVAK 1018 (1629)
T ss_pred             cceEEEEeccCCccccccc----------cccCceeeeecCcccccccHHH-HHHHHh-ccCCeEEEehhh
Confidence            5999999999999987764          5999999999999998877621 112332 233336665544


No 78 
>KOG3606 consensus Cell polarity protein PAR6 [Signal transduction mechanisms]
Probab=88.49  E-value=0.45  Score=42.13  Aligned_cols=61  Identities=18%  Similarity=0.304  Sum_probs=43.4

Q ss_pred             CCcEEEeeCCCChhhhcCCCccccCCCCCCcCCcEEEEECCEEccCC-----CCCC--CceeEEEeeCCCCceEEEEe
Q 019504          268 NGALVLQVPGNSLAAKAGILPTTRGFAGNIILGDIIVAVNNKPVSFS-----CLSI--PSRIYLICAEPNQDHLTCLK  338 (340)
Q Consensus       268 ~g~~V~~v~~~spa~~~gl~~~~~~~~~~l~~GDvi~~i~g~~v~~~-----~d~~--~~~~~~~~~~~~~~~~~~~~  338 (340)
                      .|+.|.+..+|+.|+..||          |.+.|-|++|||..|.--     .|+|  ....++...+|.++.=.+.|
T Consensus       194 pGIFISRlVpGGLAeSTGL----------LaVnDEVlEVNGIEVaGKTLDQVTDMMvANshNLIiTVkPANQRnnvvr  261 (358)
T KOG3606|consen  194 PGIFISRLVPGGLAESTGL----------LAVNDEVLEVNGIEVAGKTLDQVTDMMVANSHNLIITVKPANQRNNVVR  261 (358)
T ss_pred             CceEEEeecCCccccccce----------eeecceeEEEcCEEeccccHHHHHHHHhhcccceEEEecccccccceee
Confidence            7999999999999999999          689999999999988633     2211  12234555555555544444


No 79 
>PF09342 DUF1986:  Domain of unknown function (DUF1986);  InterPro: IPR015420 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine proteases have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base. The geometric orientations of the catalytic residues are similar between families, despite different protein folds. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC). This domain is found in serine endopeptidases belonging to MEROPS peptidase family S1A (clan PA). It is found in unusual mosaic proteins, which are encoded by the Drosophila nudel gene (see P98159 from SWISSPROT). Nudel is involved in defining embryonic dorsoventral polarity. Three proteases, ndl, gd and snk, process easter to create active easter. Active easter defines cell identities along the dorsal-ventral continuum by activating the spz ligand for the Tl receptor in the ventral region of the embryo. Nudel, pipe and windbeutel together trigger the protease cascade within the extraembryonic perivitelline compartment which induces dorsoventral polarity of the Drosophila embryo.
Probab=88.09  E-value=7.8  Score=34.19  Aligned_cols=92  Identities=16%  Similarity=0.212  Sum_probs=58.4

Q ss_pred             CCceEEEEEcCCCEEEeCccccCCCCCCCCCCCccEEEEEEEecCCceeEE-E---EEEEEeC-----CCCcEEEEEEec
Q 019504           56 EGNGSGVVWDGKGHIVTNFHVIGSALSRKPAEGQVVARVNILASDGVQKNF-E---GKLVGAD-----RAKDLAVLKIEA  126 (340)
Q Consensus        56 ~~~GsGfiI~~~G~IlT~~Hvv~~~~~~~~~~~~~~~~i~v~~~~g~~~~~-~---a~v~~~d-----~~~DlAlL~v~~  126 (340)
                      .-+.||++|+++ |||++..|+.+..-..       -++.+.+..++.+.. .   -++...|     ++.+++||.++.
T Consensus        27 ~~~CsgvLlD~~-WlLvsssCl~~I~L~~-------~YvsallG~~Kt~~~v~Gp~EQI~rVD~~~~V~~S~v~LLHL~~   98 (267)
T PF09342_consen   27 RYWCSGVLLDPH-WLLVSSSCLRGISLSH-------HYVSALLGGGKTYLSVDGPHEQISRVDCFKDVPESNVLLLHLEQ   98 (267)
T ss_pred             eEEEEEEEeccc-eEEEeccccCCccccc-------ceEEEEecCcceecccCCChheEEEeeeeeeccccceeeeeecC
Confidence            346899999987 9999999998743100       156666655541110 0   1333333     678999999998


Q ss_pred             CC---CCccceeecC-CCCCCCCCEEEEEecCC
Q 019504          127 SE---DLLKPINVGQ-SSFLKVGQQCLAIGNPF  155 (340)
Q Consensus       127 ~~---~~~~~~~l~~-~~~~~~G~~v~~iG~p~  155 (340)
                      +.   ..+.|.-+.+ .......+.++++|.-.
T Consensus        99 ~~~fTr~VlP~flp~~~~~~~~~~~CVAVg~d~  131 (267)
T PF09342_consen   99 PANFTRYVLPTFLPETSNENESDDECVAVGHDD  131 (267)
T ss_pred             cccceeeecccccccccCCCCCCCceEEEEccc
Confidence            74   3345555543 23444566899999654


No 80 
>KOG3651 consensus Protein kinase C, alpha binding protein [Signal transduction mechanisms]
Probab=86.68  E-value=0.75  Score=41.39  Aligned_cols=38  Identities=39%  Similarity=0.329  Sum_probs=33.1

Q ss_pred             CCcEEEeeCCCChhhhcCCCccccCCCCCCcCCcEEEEECCEEccCCC
Q 019504          268 NGALVLQVPGNSLAAKAGILPTTRGFAGNIILGDIIVAVNNKPVSFSC  315 (340)
Q Consensus       268 ~g~~V~~v~~~spa~~~gl~~~~~~~~~~l~~GDvi~~i~g~~v~~~~  315 (340)
                      .-+||..|..++||++-|-          ++.||-|++|||..|..-.
T Consensus        30 PClYiVQvFD~tPAa~dG~----------i~~GDEi~avNg~svKGkt   67 (429)
T KOG3651|consen   30 PCLYIVQVFDKTPAAKDGR----------IRCGDEIVAVNGISVKGKT   67 (429)
T ss_pred             CeEEEEEeccCCchhccCc----------cccCCeeEEecceeecCcc
Confidence            5789999999999999974          4999999999999987543


No 81 
>KOG0606 consensus Microtubule-associated serine/threonine kinase and related proteins [Signal transduction mechanisms; General function prediction only]
Probab=84.55  E-value=0.3  Score=51.31  Aligned_cols=35  Identities=37%  Similarity=0.367  Sum_probs=31.9

Q ss_pred             cEEEeeCCCChhhhcCCCccccCCCCCCcCCcEEEEECCEEccCCC
Q 019504          270 ALVLQVPGNSLAAKAGILPTTRGFAGNIILGDIIVAVNNKPVSFSC  315 (340)
Q Consensus       270 ~~V~~v~~~spa~~~gl~~~~~~~~~~l~~GDvi~~i~g~~v~~~~  315 (340)
                      -.|..|.++|||..+|+           +.||.|+.+||++|....
T Consensus       660 h~v~sv~egsPA~~agl-----------s~~DlIthvnge~v~gl~  694 (1205)
T KOG0606|consen  660 HSVGSVEEGSPAFEAGL-----------SAGDLITHVNGEPVHGLV  694 (1205)
T ss_pred             eeeeeecCCCCccccCC-----------CccceeEeccCcccchhh
Confidence            56888999999999999           999999999999998765


No 82 
>KOG3549 consensus Syntrophins (type gamma) [Extracellular structures]
Probab=84.30  E-value=0.36  Score=44.21  Aligned_cols=38  Identities=21%  Similarity=0.265  Sum_probs=32.4

Q ss_pred             CcEEEeeCCCChhhhcCCCccccCCCCCCcCCcEEEEECCEEccCCCC
Q 019504          269 GALVLQVPGNSLAAKAGILPTTRGFAGNIILGDIIVAVNNKPVSFSCL  316 (340)
Q Consensus       269 g~~V~~v~~~spa~~~gl~~~~~~~~~~l~~GDvi~~i~g~~v~~~~d  316 (340)
                      -++|+++..+-.|+..|+          |-.||-|++|||..|+.-..
T Consensus        81 PvviSkI~kdQaAd~tG~----------LFvGDAilqvNGi~v~~c~H  118 (505)
T KOG3549|consen   81 PVVISKIYKDQAADITGQ----------LFVGDAILQVNGIYVTACPH  118 (505)
T ss_pred             cEEeehhhhhhhhhhcCc----------eEeeeeeEEeccEEeecCCh
Confidence            578899988888888876          58999999999999987653


No 83 
>KOG3834 consensus Golgi reassembly stacking protein GRASP65, contains PDZ domain [Intracellular trafficking, secretion, and vesicular transport]
Probab=81.32  E-value=0.59  Score=44.11  Aligned_cols=58  Identities=29%  Similarity=0.275  Sum_probs=43.5

Q ss_pred             CCcEEEeeCCCChhhhcCCCccccCCCCCCcCCcEEEEECCEEccCCCCCCCceeEEEeeCCCCceEEEEe
Q 019504          268 NGALVLQVPGNSLAAKAGILPTTRGFAGNIILGDIIVAVNNKPVSFSCLSIPSRIYLICAEPNQDHLTCLK  338 (340)
Q Consensus       268 ~g~~V~~v~~~spa~~~gl~~~~~~~~~~l~~GDvi~~i~g~~v~~~~d~~~~~~~~~~~~~~~~~~~~~~  338 (340)
                      .|.-|.+|..+|||.+|||.          -==|-|++|||..+....|.   ...+.++...++++|++-
T Consensus        15 eg~hvlkVqedSpa~~agle----------pffdFIvSI~g~rL~~dnd~---Lk~llk~~sekVkltv~n   72 (462)
T KOG3834|consen   15 EGYHVLKVQEDSPAHKAGLE----------PFFDFIVSINGIRLNKDNDT---LKALLKANSEKVKLTVYN   72 (462)
T ss_pred             eeEEEEEeecCChHHhcCcc----------hhhhhhheeCcccccCchHH---HHHHHHhcccceEEEEEe
Confidence            57779999999999999993          24799999999999988773   333333333447887763


No 84 
>PF00947 Pico_P2A:  Picornavirus core protein 2A;  InterPro: IPR000081 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. 
Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad. This domain defines cysteine peptidases belonging to MEROPS peptidase family C3 (picornain, clan PA(C)), subfamilies 3CA and 3CB. The protein fold of this peptidase domain for members of this family resembles that of the serine peptidase, chymotrypsin, the type example for clan PA. Picornaviral proteins are expressed as a single polyprotein which is cleaved by the viral 3C cysteine protease. The poliovirus polyprotein is selectively cleaved at the Gln-|-Gly bond. In other picornavirus reactions Glu may be substituted for Gln, and Ser or Thr for Gly.; GO: 0008233 peptidase activity, 0006508 proteolysis, 0016032 viral reproduction; PDB: 2HRV_B 1Z8R_A.
Probab=81.18  E-value=3.3  Score=32.55  Aligned_cols=32  Identities=34%  Similarity=0.425  Sum_probs=24.4

Q ss_pred             ceEEEeeccCCCCccceeecCCCcEEEEEeeee
Q 019504          183 GGIQTDAAINPGNSGGPLLDSKGNLIGINTAII  215 (340)
Q Consensus       183 ~~i~~d~~i~~G~SGGPl~d~~G~VVGi~~~~~  215 (340)
                      +++....+..||+-||+|+ .+--||||++++-
T Consensus        79 ~~l~g~Gp~~PGdCGg~L~-C~HGViGi~Tagg  110 (127)
T PF00947_consen   79 NLLIGEGPAEPGDCGGILR-CKHGVIGIVTAGG  110 (127)
T ss_dssp             CEEEEE-SSSTT-TCSEEE-ETTCEEEEEEEEE
T ss_pred             CceeecccCCCCCCCceeE-eCCCeEEEEEeCC
Confidence            4566677889999999999 5666999999863


No 85 
>PF01732 DUF31:  Putative peptidase (DUF31);  InterPro: IPR022382  This domain has no known function. It is found in various hypothetical proteins and putative lipoproteins from mycoplasmas. 
Probab=80.98  E-value=1.2  Score=42.42  Aligned_cols=24  Identities=25%  Similarity=0.512  Sum_probs=21.0

Q ss_pred             eccCCCCccceeecCCCcEEEEEe
Q 019504          189 AAINPGNSGGPLLDSKGNLIGINT  212 (340)
Q Consensus       189 ~~i~~G~SGGPl~d~~G~VVGi~~  212 (340)
                      ..+..|.||+.++|.+|++|||..
T Consensus       350 ~~l~gGaSGS~V~n~~~~lvGIy~  373 (374)
T PF01732_consen  350 YSLGGGASGSMVINQNNELVGIYF  373 (374)
T ss_pred             cCCCCCCCcCeEECCCCCEEEEeC
Confidence            355689999999999999999965


No 86 
>COG0750 Predicted membrane-associated Zn-dependent proteases 1 [Cell envelope biogenesis, outer membrane]
Probab=80.79  E-value=1.2  Score=42.02  Aligned_cols=32  Identities=41%  Similarity=0.334  Sum_probs=30.0

Q ss_pred             eeCCCChhhhcCCCccccCCCCCCcCCcEEEEECCEEccCCCC
Q 019504          274 QVPGNSLAAKAGILPTTRGFAGNIILGDIIVAVNNKPVSFSCL  316 (340)
Q Consensus       274 ~v~~~spa~~~gl~~~~~~~~~~l~~GDvi~~i~g~~v~~~~d  316 (340)
                      .+..+++|..+|+           ++||.|+++|++++.+.+|
T Consensus       135 ~v~~~s~a~~a~l-----------~~Gd~iv~~~~~~i~~~~~  166 (375)
T COG0750         135 EVAPKSAAALAGL-----------RPGDRIVAVDGEKVASWDD  166 (375)
T ss_pred             ecCCCCHHHHcCC-----------CCCCEEEeECCEEccCHHH
Confidence            6788999999999           9999999999999999986


No 87 
>KOG3571 consensus Dishevelled 3 and related proteins [General function prediction only]
Probab=79.33  E-value=1.7  Score=41.92  Aligned_cols=39  Identities=26%  Similarity=0.256  Sum_probs=32.2

Q ss_pred             CCcEEEeeCCCChhhhcCCCccccCCCCCCcCCcEEEEECCEEccCCCC
Q 019504          268 NGALVLQVPGNSLAAKAGILPTTRGFAGNIILGDIIVAVNNKPVSFSCL  316 (340)
Q Consensus       268 ~g~~V~~v~~~spa~~~gl~~~~~~~~~~l~~GDvi~~i~g~~v~~~~d  316 (340)
                      .|+||.++.+++.-+.-|          .+++||.||.||.....++..
T Consensus       277 ggIYVgsImkgGAVA~DG----------RIe~GDMiLQVNevsFENmSN  315 (626)
T KOG3571|consen  277 GGIYVGSIMKGGAVALDG----------RIEPGDMILQVNEVSFENMSN  315 (626)
T ss_pred             CceEEeeeccCceeeccC----------ccCccceEEEeeecchhhcCc
Confidence            799999999988666555          249999999999988877764


No 88 
>KOG0609 consensus Calcium/calmodulin-dependent serine protein kinase/membrane-associated guanylate kinase [Signal transduction mechanisms]
Probab=79.25  E-value=1.3  Score=43.05  Aligned_cols=37  Identities=24%  Similarity=0.413  Sum_probs=33.9

Q ss_pred             CcEEEeeCCCChhhhcCCCccccCCCCCCcCCcEEEEECCEEccCCC
Q 019504          269 GALVLQVPGNSLAAKAGILPTTRGFAGNIILGDIIVAVNNKPVSFSC  315 (340)
Q Consensus       269 g~~V~~v~~~spa~~~gl~~~~~~~~~~l~~GDvi~~i~g~~v~~~~  315 (340)
                      -++|.++..|+-+.+.|+          |..||.|+++||..|.+..
T Consensus       147 ~~~vARI~~GG~~~r~gl----------L~~GD~i~EvNGi~v~~~~  183 (542)
T KOG0609|consen  147 KVVVARIMHGGMADRQGL----------LHVGDEILEVNGISVANKS  183 (542)
T ss_pred             ccEEeeeccCCcchhccc----------eeeccchheecCeecccCC
Confidence            589999999999999998          6899999999999999773


No 89 
>KOG3551 consensus Syntrophins (type beta) [Extracellular structures]
Probab=77.17  E-value=0.36  Score=44.83  Aligned_cols=38  Identities=24%  Similarity=0.339  Sum_probs=32.2

Q ss_pred             CcEEEeeCCCChhhhcCCCccccCCCCCCcCCcEEEEECCEEccCCCC
Q 019504          269 GALVLQVPGNSLAAKAGILPTTRGFAGNIILGDIIVAVNNKPVSFSCL  316 (340)
Q Consensus       269 g~~V~~v~~~spa~~~gl~~~~~~~~~~l~~GDvi~~i~g~~v~~~~d  316 (340)
                      -++|+++.+|-.|+..+-          |..||.|++|||....+...
T Consensus       111 PIlISKIFkGlAADQt~a----------L~~gDaIlSVNG~dL~~AtH  148 (506)
T KOG3551|consen  111 PILISKIFKGLAADQTGA----------LFLGDAILSVNGEDLRDATH  148 (506)
T ss_pred             ceehhHhccccccccccc----------eeeccEEEEecchhhhhcch
Confidence            588999999888887764          58999999999999987764


No 90 
>PF12381 Peptidase_C3G:  Tungro spherical virus-type peptidase;  InterPro: IPR024387 This entry represents a rice tungro spherical waikavirus-type peptidase that belongs to MEROPS peptidase family C3G. It is a picornain 3C-type protease, and is responsible for the self-cleavage of the positive single-stranded polyproteins of a number of plant viral genomes. The location of the protease activity of the polyprotein is at the C-terminal end, adjacent and N-terminal to the putative RNA polymerase.
Probab=63.30  E-value=9.3  Score=33.00  Aligned_cols=56  Identities=18%  Similarity=0.411  Sum_probs=40.7

Q ss_pred             cceEEEeeccCCCCccceeecC----CCcEEEEEeeeeeCCCCcCceEEEEeh--HhHHHHHHHHH
Q 019504          182 GGGIQTDAAINPGNSGGPLLDS----KGNLIGINTAIITQTGTSAGVGFAIPS--STVLKIVPQLI  241 (340)
Q Consensus       182 ~~~i~~d~~i~~G~SGGPl~d~----~G~VVGi~~~~~~~~~~~~~~~~aip~--~~i~~~l~~l~  241 (340)
                      ...+++..+...|+=|||++-.    .-+++||+.++...    .+.+||-++  +.+++.+..|.
T Consensus       168 r~gleY~~~t~~GdCGs~i~~~~t~~~RKIvGiHVAG~~~----~~~gYAe~itQEDL~~A~~~l~  229 (231)
T PF12381_consen  168 RQGLEYQMPTMNGDCGSPIVRNNTQMVRKIVGIHVAGSAN----HAMGYAESITQEDLMRAINKLE  229 (231)
T ss_pred             eeeeeEECCCcCCCccceeeEcchhhhhhhheeeeccccc----ccceehhhhhHHHHHHHHHhhc
Confidence            3457888888999999999843    35899999987653    356777666  55666666553


No 91 
>cd00600 Sm_like The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation.  Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet.  Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=61.16  E-value=26  Score=23.45  Aligned_cols=32  Identities=31%  Similarity=0.428  Sum_probs=28.2

Q ss_pred             EEEEEecCCceeEEEEEEEEeCCCCcEEEEEEec
Q 019504           93 RVNILASDGVQKNFEGKLVGADRAKDLAVLKIEA  126 (340)
Q Consensus        93 ~i~v~~~~g~~~~~~a~v~~~d~~~DlAlL~v~~  126 (340)
                      .+.|.+.||+  .+.+++..+|...++.|-....
T Consensus         8 ~V~V~l~~g~--~~~G~L~~~D~~~Ni~L~~~~~   39 (63)
T cd00600           8 TVRVELKDGR--VLEGVLVAFDKYMNLVLDDVEE   39 (63)
T ss_pred             EEEEEECCCc--EEEEEEEEECCCCCEEECCEEE
Confidence            7899999997  8999999999999998877654


No 92 
>PF11874 DUF3394:  Domain of unknown function (DUF3394);  InterPro: IPR021814  This domain is functionally uncharacterised. This domain is found in bacteria. This presumed domain is about 190 amino acids in length. This domain is found associated with PF06808 from PFAM. 
Probab=60.99  E-value=33  Score=29.02  Aligned_cols=28  Identities=36%  Similarity=0.213  Sum_probs=25.0

Q ss_pred             CCcEEEeeCCCChhhhcCCCccccCCCCCCcCCcEEEEE
Q 019504          268 NGALVLQVPGNSLAAKAGILPTTRGFAGNIILGDIIVAV  306 (340)
Q Consensus       268 ~g~~V~~v~~~spa~~~gl~~~~~~~~~~l~~GDvi~~i  306 (340)
                      ..++|+.|..||||+++|+           .-++.|+++
T Consensus       122 ~~~~Vd~v~fgS~A~~~g~-----------d~d~~I~~v  149 (183)
T PF11874_consen  122 GKVIVDEVEFGSPAEKAGI-----------DFDWEITEV  149 (183)
T ss_pred             CEEEEEecCCCCHHHHcCC-----------CCCcEEEEE
Confidence            6789999999999999999           888877776


No 93 
>cd01731 archaeal_Sm1 The archaeal sm1 proteins: The Sm proteins are conserved in all three domains of life and are always associated with U-rich RNA sequences. They function to mediate RNA-RNA interactions and RNA biogenesis.  All Sm proteins contain a common sequence motif in two segments, Sm1 and Sm2, separated by a short variable linker. Eukaryotic Sm proteins form part of specific small nuclear ribonucleoproteins (snRNPs) that are involved in the processing of pre-mRNAs to mature mRNAs, and are a major component of the eukaryotic spliceosome. Most snRNPs consist of seven Sm proteins (B/B', D1, D2, D3, E, F and G) arranged in a ring on a uridine-rich sequence (Sm site), plus a small nuclear RNA (snRNA) (either U1, U2, U5 or U4/6). Since archaebacteria do not have any splicing apparatus, Sm proteins of archaebacteria may play a more general role. Archaeal Lsm proteins are likely to represent the ancestral Sm domain.
Probab=59.44  E-value=27  Score=24.15  Aligned_cols=32  Identities=22%  Similarity=0.295  Sum_probs=28.8

Q ss_pred             EEEEEecCCceeEEEEEEEEeCCCCcEEEEEEec
Q 019504           93 RVNILASDGVQKNFEGKLVGADRAKDLAVLKIEA  126 (340)
Q Consensus        93 ~i~v~~~~g~~~~~~a~v~~~d~~~DlAlL~v~~  126 (340)
                      ++.|.+.+|+  .+.+++.++|+..++.|-....
T Consensus        12 ~V~V~l~~g~--~~~G~L~~~D~~mNlvL~~~~e   43 (68)
T cd01731          12 PVLVKLKGGK--EVRGRLKSYDQHMNLVLEDAEE   43 (68)
T ss_pred             EEEEEECCCC--EEEEEEEEECCcceEEEeeEEE
Confidence            7999999997  8999999999999999887754


No 94 
>cd01726 LSm6 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation.  Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm6 is one of at least seven subunits that assemble onto U6 snRNA to form a seven-membered ring structure.  Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=58.18  E-value=26  Score=24.14  Aligned_cols=32  Identities=25%  Similarity=0.372  Sum_probs=28.1

Q ss_pred             EEEEEecCCceeEEEEEEEEeCCCCcEEEEEEec
Q 019504           93 RVNILASDGVQKNFEGKLVGADRAKDLAVLKIEA  126 (340)
Q Consensus        93 ~i~v~~~~g~~~~~~a~v~~~d~~~DlAlL~v~~  126 (340)
                      .+.|.+.+|+  .+.+++.++|+..++.|=....
T Consensus        12 ~V~V~Lk~g~--~~~G~L~~~D~~mNlvL~~~~~   43 (67)
T cd01726          12 PVVVKLNSGV--DYRGILACLDGYMNIALEQTEE   43 (67)
T ss_pred             eEEEEECCCC--EEEEEEEEEccceeeEEeeEEE
Confidence            7899999997  8999999999999998876643


No 95 
>cd01730 LSm3 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation.  Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm3 is one of at least seven subunits that assemble onto U6 snRNA to form a seven-membered ring structure.  Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=58.16  E-value=22  Score=25.71  Aligned_cols=30  Identities=17%  Similarity=0.289  Sum_probs=26.6

Q ss_pred             EEEEEecCCceeEEEEEEEEeCCCCcEEEEEE
Q 019504           93 RVNILASDGVQKNFEGKLVGADRAKDLAVLKI  124 (340)
Q Consensus        93 ~i~v~~~~g~~~~~~a~v~~~d~~~DlAlL~v  124 (340)
                      .+.|.+.+|+  .+.+++.++|.+.+|.|=..
T Consensus        13 ~V~V~l~~gr--~~~G~L~~fD~~mNlvL~d~   42 (82)
T cd01730          13 RVYVKLRGDR--ELRGRLHAYDQHLNMILGDV   42 (82)
T ss_pred             EEEEEECCCC--EEEEEEEEEccceEEeccce
Confidence            7899999997  89999999999999987544


No 96 
>COG0298 HypC Hydrogenase maturation factor [Posttranslational modification, protein turnover, chaperones]
Probab=57.49  E-value=23  Score=25.45  Aligned_cols=47  Identities=26%  Similarity=0.393  Sum_probs=32.1

Q ss_pred             EEEEEEEeCCCCcEEEEEEecCCCCccceeecCCCCCCCCCEEEE-EecC
Q 019504          106 FEGKLVGADRAKDLAVLKIEASEDLLKPINVGQSSFLKVGQQCLA-IGNP  154 (340)
Q Consensus       106 ~~a~v~~~d~~~DlAlL~v~~~~~~~~~~~l~~~~~~~~G~~v~~-iG~p  154 (340)
                      ++++++..|...++|++.+-.-...+.---+.  ..++.|+.|++ +||.
T Consensus         5 iPgqI~~I~~~~~~A~Vd~gGvkreV~l~Lv~--~~v~~GdyVLVHvGfA   52 (82)
T COG0298           5 IPGQIVEIDDNNHLAIVDVGGVKREVNLDLVG--EEVKVGDYVLVHVGFA   52 (82)
T ss_pred             cccEEEEEeCCCceEEEEeccEeEEEEeeeec--CccccCCEEEEEeeEE
Confidence            57888899988789999987653322222222  26789999987 5653


No 97 
>PRK00737 small nuclear ribonucleoprotein; Provisional
Probab=57.05  E-value=30  Score=24.26  Aligned_cols=32  Identities=28%  Similarity=0.364  Sum_probs=28.4

Q ss_pred             EEEEEecCCceeEEEEEEEEeCCCCcEEEEEEec
Q 019504           93 RVNILASDGVQKNFEGKLVGADRAKDLAVLKIEA  126 (340)
Q Consensus        93 ~i~v~~~~g~~~~~~a~v~~~d~~~DlAlL~v~~  126 (340)
                      .+.|.+.+|+  .+.+++.++|+..++.|-....
T Consensus        16 ~V~V~lk~g~--~~~G~L~~~D~~mNlvL~d~~e   47 (72)
T PRK00737         16 PVLVRLKGGR--EFRGELQGYDIHMNLVLDNAEE   47 (72)
T ss_pred             EEEEEECCCC--EEEEEEEEEcccceeEEeeEEE
Confidence            7899999997  8999999999999999887654


No 98 
>cd01722 Sm_F The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation.  Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. Sm subunit F is capable of forming both homo- and hetero-heptamer ring structures.  To form the hetero-heptamer, Sm subunit F initially binds subunits E and G to form a trimer which then assembles onto snRNA along with the D3/B and D1/D2 heterodimers.
Probab=56.78  E-value=27  Score=24.14  Aligned_cols=31  Identities=26%  Similarity=0.385  Sum_probs=27.6

Q ss_pred             EEEEEecCCceeEEEEEEEEeCCCCcEEEEEEe
Q 019504           93 RVNILASDGVQKNFEGKLVGADRAKDLAVLKIE  125 (340)
Q Consensus        93 ~i~v~~~~g~~~~~~a~v~~~d~~~DlAlL~v~  125 (340)
                      .+.|.+.+|+  .+.+++.++|...++.|=...
T Consensus        13 ~V~V~Lk~g~--~~~G~L~~~D~~mNi~L~~~~   43 (68)
T cd01722          13 PVIVKLKWGM--EYKGTLVSVDSYMNLQLANTE   43 (68)
T ss_pred             EEEEEECCCc--EEEEEEEEECCCEEEEEeeEE
Confidence            7899999997  899999999999999886654


No 99 
>cd01732 LSm5 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation.  Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm5 is one of at least seven subunits that assemble onto U6 snRNA to form a seven-membered ring structure.  Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=56.60  E-value=27  Score=24.87  Aligned_cols=30  Identities=23%  Similarity=0.438  Sum_probs=26.8

Q ss_pred             EEEEEecCCceeEEEEEEEEeCCCCcEEEEEE
Q 019504           93 RVNILASDGVQKNFEGKLVGADRAKDLAVLKI  124 (340)
Q Consensus        93 ~i~v~~~~g~~~~~~a~v~~~d~~~DlAlL~v  124 (340)
                      .+.|.+.+|+  .+.+++.++|.+.++.|=..
T Consensus        15 ~V~V~l~~gr--~~~G~L~g~D~~mNlvL~da   44 (76)
T cd01732          15 RIWIVMKSDK--EFVGTLLGFDDYVNMVLEDV   44 (76)
T ss_pred             EEEEEECCCe--EEEEEEEEeccceEEEEccE
Confidence            7899999997  89999999999999987654


No 100
>cd01720 Sm_D2 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation.  Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. Sm subunit D2 heterodimerizes with subunit D1 and three such heterodimers form a hexameric ring structure with alternating D1 and D2 subunits. The D1 - D2 heterodimer also assembles into a heptameric ring containing D2, D3, E, F, and G subunits. Sm-like proteins exist in archaea as well as prokaryotes which form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=56.09  E-value=29  Score=25.51  Aligned_cols=32  Identities=16%  Similarity=0.280  Sum_probs=28.0

Q ss_pred             EEEEEecCCceeEEEEEEEEeCCCCcEEEEEEec
Q 019504           93 RVNILASDGVQKNFEGKLVGADRAKDLAVLKIEA  126 (340)
Q Consensus        93 ~i~v~~~~g~~~~~~a~v~~~d~~~DlAlL~v~~  126 (340)
                      .+.|.+.+|+  .+.+++.++|.+.++.|=....
T Consensus        16 ~V~V~lr~~r--~~~G~L~~fD~hmNlvL~d~~E   47 (87)
T cd01720          16 QVLINCRNNK--KLLGRVKAFDRHCNMVLENVKE   47 (87)
T ss_pred             EEEEEEcCCC--EEEEEEEEecCccEEEEcceEE
Confidence            7999999997  8999999999999999866543


No 101
>cd01717 Sm_B The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation.  Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. Sm subunit B heterodimerizes with subunit D3 and three such heterodimers form a hexameric ring structure with alternating B and D3 subunits.  The D3 - B heterodimer also assembles into a heptameric ring containing D1, D2, E, F, and G subunits.  Sm-like proteins exist in archaea as well as prokaryotes which form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=56.06  E-value=28  Score=24.92  Aligned_cols=31  Identities=26%  Similarity=0.506  Sum_probs=27.3

Q ss_pred             EEEEEecCCceeEEEEEEEEeCCCCcEEEEEEe
Q 019504           93 RVNILASDGVQKNFEGKLVGADRAKDLAVLKIE  125 (340)
Q Consensus        93 ~i~v~~~~g~~~~~~a~v~~~d~~~DlAlL~v~  125 (340)
                      .+.|.+.+|+  .+.+++.++|.+.++.|=...
T Consensus        12 ~V~V~l~dgR--~~~G~L~~~D~~~NlVL~~~~   42 (79)
T cd01717          12 RLRVTLQDGR--QFVGQFLAFDKHMNLVLSDCE   42 (79)
T ss_pred             EEEEEECCCc--EEEEEEEEEcCccCEEcCCEE
Confidence            7899999997  899999999999999876554


No 102
>PF00571 CBS:  CBS domain CBS domain web page. Mutations in the CBS domain of Swiss:P35520 lead to homocystinuria.;  InterPro: IPR000644 CBS (cystathionine-beta-synthase) domains are small intracellular modules, mostly found in two or four copies within a protein, that occur in a variety of proteins in bacteria, archaea, and eukaryotes [, ]. Tandem pairs of CBS domains can act as binding domains for adenosine derivatives and may regulate the activity of attached enzymatic or other domains []. In some cases, CBS domains may act as sensors of cellular energy status by being activated by AMP and inhibited by ATP []. In chloride ion channels, the CBS domains have been implicated in intracellular targeting and trafficking, as well as in protein-protein interactions, but results vary with different channels: in the CLC-5 channel, the CBS domain was shown to be required for trafficking [], while in the CLC-1 channel, the CBS domain was shown to be critical for channel function, but not necessary for trafficking []. Recent experiments revealing that CBS domains can bind adenosine-containing ligands such ATP, AMP, or S-adenosylmethionine have led to the hypothesis that CBS domains function as sensors of intracellular metabolites [, ]. Crystallographic studies of CBS domains have shown that pairs of CBS sequences form a globular domain where each CBS unit adopts a beta-alpha-beta-beta-alpha pattern []. Crystal structure of the CBS domains of the AMP-activated protein kinase in complexes with AMP and ATP shows that the phosphate groups of AMP/ATP lie in a surface pocket at the interface of two CBS domains, which is lined with basic residues, many of which are associated with disease-causing mutations [].  
In humans, mutations in conserved residues within CBS domains cause a variety of human hereditary diseases, including (with the gene mutated in parentheses): homocystinuria (cystathionine beta-synthase); Wolff-Parkinson-White syndrome (gamma 2 subunit of AMP-activated protein kinase); retinitis pigmentosa (IMP dehydrogenase-1); congenital myotonia, idiopathic generalized epilepsy, hypercalciuric nephrolithiasis, and classic Bartter syndrome (CLC chloride channel family members).; GO: 0005515 protein binding; PDB: 3JTF_A 3TE5_C 3TDH_C 3T4N_C 2QLV_C 3OI8_A 3LV9_A 2QH1_B 1PVM_B 3LQN_A ....
Probab=55.58  E-value=14  Score=23.83  Aligned_cols=21  Identities=38%  Similarity=0.627  Sum_probs=17.6

Q ss_pred             CCCccceeecCCCcEEEEEee
Q 019504          193 PGNSGGPLLDSKGNLIGINTA  213 (340)
Q Consensus       193 ~G~SGGPl~d~~G~VVGi~~~  213 (340)
                      .+.+.-|++|.+|+++|+.+.
T Consensus        28 ~~~~~~~V~d~~~~~~G~is~   48 (57)
T PF00571_consen   28 NGISRLPVVDEDGKLVGIISR   48 (57)
T ss_dssp             HTSSEEEEESTTSBEEEEEEH
T ss_pred             cCCcEEEEEecCCEEEEEEEH
Confidence            467789999999999999763


No 103
>cd06168 LSm9 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation.  Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm9 proteins have a single Sm-like domain structure.  Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=55.42  E-value=32  Score=24.44  Aligned_cols=31  Identities=19%  Similarity=0.332  Sum_probs=27.2

Q ss_pred             EEEEEecCCceeEEEEEEEEeCCCCcEEEEEEe
Q 019504           93 RVNILASDGVQKNFEGKLVGADRAKDLAVLKIE  125 (340)
Q Consensus        93 ~i~v~~~~g~~~~~~a~v~~~d~~~DlAlL~v~  125 (340)
                      .+.|.+.||+  .+.+++.++|...+|.|=...
T Consensus        12 ~v~V~l~dgR--~~~G~l~~~D~~~NivL~~~~   42 (75)
T cd06168          12 TMRIHMTDGR--TLVGVFLCTDRDCNIILGSAQ   42 (75)
T ss_pred             eEEEEEcCCe--EEEEEEEEEcCCCcEEecCcE
Confidence            7899999997  899999999999999876553


No 104
>cd01729 LSm7 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation.  Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm7 is one of at least seven subunits that assemble onto U6 snRNA to form a seven-membered ring structure.  Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=54.35  E-value=33  Score=24.76  Aligned_cols=31  Identities=19%  Similarity=0.290  Sum_probs=27.1

Q ss_pred             EEEEEecCCceeEEEEEEEEeCCCCcEEEEEEe
Q 019504           93 RVNILASDGVQKNFEGKLVGADRAKDLAVLKIE  125 (340)
Q Consensus        93 ~i~v~~~~g~~~~~~a~v~~~d~~~DlAlL~v~  125 (340)
                      ++.|.+.+|+  .+.+++.++|...+|.|=...
T Consensus        14 ~V~V~l~~gr--~~~G~L~~~D~~mNlvL~~~~   44 (81)
T cd01729          14 KIRVKFQGGR--EVTGILKGYDQLLNLVLDDTV   44 (81)
T ss_pred             eEEEEECCCc--EEEEEEEEEcCcccEEecCEE
Confidence            7899999997  899999999999999876553


No 105
>cd01735 LSm12_N LSm12 belongs to a family of Sm-like proteins that associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation.  Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet that associates with other Sm proteins to form hexameric and heptameric ring structures.   In addition to the N-terminal Sm-like domain, LSm12 has a novel methyltransferase domain.
Probab=52.98  E-value=54  Score=22.33  Aligned_cols=32  Identities=19%  Similarity=0.281  Sum_probs=27.4

Q ss_pred             EEEEEecCCceeEEEEEEEEeCCCCcEEEEEEec
Q 019504           93 RVNILASDGVQKNFEGKLVGADRAKDLAVLKIEA  126 (340)
Q Consensus        93 ~i~v~~~~g~~~~~~a~v~~~d~~~DlAlL~v~~  126 (340)
                      .+.+....|.  .++++++.+|....+.+|+-+.
T Consensus         8 ~V~~kTc~g~--~ieGEV~afD~~tk~lIlk~~s   39 (61)
T cd01735           8 QVSCRTCFEQ--RLQGEVVAFDYPSKMLILKCPS   39 (61)
T ss_pred             EEEEEecCCc--eEEEEEEEecCCCcEEEEECcc
Confidence            6777777786  8999999999999999998655


No 106
>KOG3605 consensus Beta amyloid precursor-binding protein [General function prediction only]
Probab=51.87  E-value=6.9  Score=39.17  Aligned_cols=35  Identities=20%  Similarity=0.326  Sum_probs=27.2

Q ss_pred             cEEEeeCCCChhhhcCCCccccCCCCCCcCCcEEEEECCEEccCC
Q 019504          270 ALVLQVPGNSLAAKAGILPTTRGFAGNIILGDIIVAVNNKPVSFS  314 (340)
Q Consensus       270 ~~V~~v~~~spa~~~gl~~~~~~~~~~l~~GDvi~~i~g~~v~~~  314 (340)
                      ++|..+..++||+|.|-          |..||-|++|||..+.-.
T Consensus       675 VViAnmm~~GpAarsgk----------LnIGDQiiaING~SLVGL  709 (829)
T KOG3605|consen  675 VVIANMMHGGPAARSGK----------LNIGDQIMSINGTSLVGL  709 (829)
T ss_pred             HHHHhcccCChhhhcCC----------ccccceeEeecCceeccc
Confidence            33555667899999974          689999999999877543


No 107
>cd01719 Sm_G The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation.  Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet.  Sm subunit G binds subunits E and F to form a trimer which then assembles onto snRNA along with the D1/D2 and D3/B heterodimers forming a seven-membered ring structure.  Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=51.16  E-value=42  Score=23.59  Aligned_cols=31  Identities=16%  Similarity=0.198  Sum_probs=27.1

Q ss_pred             EEEEEecCCceeEEEEEEEEeCCCCcEEEEEEe
Q 019504           93 RVNILASDGVQKNFEGKLVGADRAKDLAVLKIE  125 (340)
Q Consensus        93 ~i~v~~~~g~~~~~~a~v~~~d~~~DlAlL~v~  125 (340)
                      ++.|.+.+|+  .+.+++.++|...+|.|=...
T Consensus        12 ~V~V~L~~g~--~~~G~L~~~D~~mNlvL~~~~   42 (72)
T cd01719          12 KLSLKLNGNR--KVSGILRGFDPFMNLVLDDAV   42 (72)
T ss_pred             eEEEEECCCe--EEEEEEEEEcccccEEeccEE
Confidence            7899999997  899999999999999886553


No 108
>cd01728 LSm1 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation.  Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm1 is one of at least seven subunits that assemble onto U6 snRNA to form a seven-membered ring structure.  Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=50.45  E-value=42  Score=23.75  Aligned_cols=31  Identities=26%  Similarity=0.341  Sum_probs=27.3

Q ss_pred             EEEEEecCCceeEEEEEEEEeCCCCcEEEEEEe
Q 019504           93 RVNILASDGVQKNFEGKLVGADRAKDLAVLKIE  125 (340)
Q Consensus        93 ~i~v~~~~g~~~~~~a~v~~~d~~~DlAlL~v~  125 (340)
                      ++.|.+.+|+  .+.+.+.++|++.++.|=...
T Consensus        14 ~v~V~l~~gr--~~~G~L~~fD~~~NlvL~d~~   44 (74)
T cd01728          14 KVVVLLRDGR--KLIGILRSFDQFANLVLQDTV   44 (74)
T ss_pred             EEEEEEcCCe--EEEEEEEEECCcccEEecceE
Confidence            7999999997  899999999999999886553


No 109
>smart00651 Sm snRNP Sm proteins. small nuclear ribonucleoprotein particles (snRNPs) involved in pre-mRNA splicing
Probab=49.57  E-value=46  Score=22.58  Aligned_cols=32  Identities=28%  Similarity=0.478  Sum_probs=27.9

Q ss_pred             EEEEEecCCceeEEEEEEEEeCCCCcEEEEEEec
Q 019504           93 RVNILASDGVQKNFEGKLVGADRAKDLAVLKIEA  126 (340)
Q Consensus        93 ~i~v~~~~g~~~~~~a~v~~~d~~~DlAlL~v~~  126 (340)
                      .+.|.+.||+  .+.+++..+|...++-|=....
T Consensus        10 ~V~V~l~~g~--~~~G~L~~~D~~~NlvL~~~~e   41 (67)
T smart00651       10 RVLVELKNGR--EYRGTLKGFDQFMNLVLEDVEE   41 (67)
T ss_pred             EEEEEECCCc--EEEEEEEEECccccEEEccEEE
Confidence            7899999997  8999999999999998876644


No 110
>COG1868 FliM Flagellar motor switch protein [Cell motility and secretion]
Probab=48.68  E-value=65  Score=30.05  Aligned_cols=40  Identities=18%  Similarity=0.131  Sum_probs=29.9

Q ss_pred             CCcEEEEEeeeeeCCCCcCceEEEEehHhHHHHHHHHHHc
Q 019504          204 KGNLIGINTAIITQTGTSAGVGFAIPSSTVLKIVPQLIQY  243 (340)
Q Consensus       204 ~G~VVGi~~~~~~~~~~~~~~~~aip~~~i~~~l~~l~~~  243 (340)
                      -.+++-+++....-.....-+++++|...++++.+.+...
T Consensus       191 pne~vv~i~~~i~ig~~~g~~niciP~~~le~i~~kl~~~  230 (332)
T COG1868         191 PNEIVVLITLEVEIGNLSGMFNICIPYSMLEPIREKLSSR  230 (332)
T ss_pred             CCceEEEEEEEEEECCcceEEEEEeeHHHHHHHHHHHhhh
Confidence            4566666666555444456799999999999999988764


No 111
>cd01727 LSm8 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation.  Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm8 is one of at least seven subunits that assemble onto U6 snRNA to form a seven-membered ring structure.  Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=47.13  E-value=48  Score=23.35  Aligned_cols=31  Identities=29%  Similarity=0.432  Sum_probs=27.4

Q ss_pred             EEEEEecCCceeEEEEEEEEeCCCCcEEEEEEe
Q 019504           93 RVNILASDGVQKNFEGKLVGADRAKDLAVLKIE  125 (340)
Q Consensus        93 ~i~v~~~~g~~~~~~a~v~~~d~~~DlAlL~v~  125 (340)
                      ++.|.+.+|+  .+.+++.++|.+.++.|=...
T Consensus        11 ~V~V~l~dgr--~~~G~L~~~D~~~NlvL~~~~   41 (74)
T cd01727          11 TVSVITVDGR--VIVGTLKGFDQATNLILDDSH   41 (74)
T ss_pred             EEEEEECCCc--EEEEEEEEEccccCEEccceE
Confidence            7899999997  899999999999999887653


No 112
>PF01423 LSM:  LSM domain ;  InterPro: IPR001163 This family is found in Lsm (like-Sm) proteins and in bacterial Lsm-related Hfq proteins. In each case, the domain adopts a core structure consisting of an open beta-barrel with an SH3-like topology. Lsm (like-Sm) proteins have diverse functions, and are thought to be important modulators of RNA biogenesis and function [, ]. The Sm proteins form part of specific small nuclear ribonucleoproteins (snRNPs) that are involved in the processing of pre-mRNAs to mature mRNAs, and are a major component of the eukaryotic spliceosome. Most snRNPs consist of seven Sm proteins (B/B', D1, D2, D3, E, F and G) arranged in a ring on a uridine-rich sequence (Sm site), plus a small nuclear RNA (snRNA) (either U1, U2, U5 or U4/6) []. All Sm proteins contain a common sequence motif in two segments, Sm1 and Sm2, separated by a short variable linker []. In other snRNPs, certain Sm proteins are replaced with different Lsm proteins, such as with U7 snRNPs, in which the D1 and D2 Sm proteins are replaced with U7-specific Lsm10 and Lsm11 proteins, where Lsm11 plays a role in histone U7-specific RNA processing []. Lsm proteins are also found in archaebacteria, which do not have any splicing apparatus suggesting a more general role for Lsm proteins. The pleiotropic translational regulator Hfq (host factor Q) is a bacterial Lsm-like protein, which modulates the structure of numerous RNA molecules by binding preferentially to A/U-rich sequences in RNA []. Hfq forms an Lsm-like fold, however, unlike the heptameric Sm proteins, Hfq forms a homo-hexameric ring.; PDB: 1D3B_K 2Y9D_D 2Y9A_D 2Y9C_R 3VRI_C 2Y9B_K 3QUI_D 3M4G_H 3INZ_E 1U1S_C ....
Probab=43.99  E-value=66  Score=21.81  Aligned_cols=33  Identities=24%  Similarity=0.467  Sum_probs=29.5

Q ss_pred             EEEEEecCCceeEEEEEEEEeCCCCcEEEEEEecC
Q 019504           93 RVNILASDGVQKNFEGKLVGADRAKDLAVLKIEAS  127 (340)
Q Consensus        93 ~i~v~~~~g~~~~~~a~v~~~d~~~DlAlL~v~~~  127 (340)
                      .+.|.+.+|.  .+.+++..+|...++.|-.....
T Consensus        10 ~V~V~l~~g~--~~~G~L~~~D~~~Nl~L~~~~~~   42 (67)
T PF01423_consen   10 RVRVELKNGR--TYRGTLVSFDQFMNLVLSDVTET   42 (67)
T ss_dssp             EEEEEETTSE--EEEEEEEEEETTEEEEEEEEEEE
T ss_pred             EEEEEEeCCE--EEEEEEEEeechheEEeeeEEEE
Confidence            7999999997  89999999999999998887654


No 113
>COG1958 LSM1 Small nuclear ribonucleoprotein (snRNP) homolog [Transcription]
Probab=43.52  E-value=55  Score=23.31  Aligned_cols=32  Identities=31%  Similarity=0.527  Sum_probs=28.3

Q ss_pred             EEEEEecCCceeEEEEEEEEeCCCCcEEEEEEec
Q 019504           93 RVNILASDGVQKNFEGKLVGADRAKDLAVLKIEA  126 (340)
Q Consensus        93 ~i~v~~~~g~~~~~~a~v~~~d~~~DlAlL~v~~  126 (340)
                      .+.|.+.+|+  .+.+++.++|...++.|--...
T Consensus        19 ~V~V~lk~g~--~~~G~L~~~D~~mNlvL~d~~e   50 (79)
T COG1958          19 RVLVKLKNGR--EYRGTLVGFDQYMNLVLDDVEE   50 (79)
T ss_pred             EEEEEECCCC--EEEEEEEEEccceeEEEeceEE
Confidence            7899999997  8999999999999998876654


No 114
>PF08669 GCV_T_C:  Glycine cleavage T-protein C-terminal barrel domain;  InterPro: IPR013977  This entry shows glycine cleavage T-proteins, part of the glycine cleavage multienzyme complex (GCV) found in bacteria and the mitochondria of eukaryotes. GCV catalyses the catabolism of glycine in eukaryotes. The T-protein is an aminomethyl transferase. ; PDB: 3ADA_A 1VRQ_A 1X31_A 3AD9_A 3AD8_A 3AD7_A 3GIR_A 1WOO_A 1WOS_A 1WOR_A ....
Probab=42.79  E-value=44  Score=24.47  Aligned_cols=33  Identities=21%  Similarity=0.347  Sum_probs=22.1

Q ss_pred             CccceeecCCCcEEEEEeeeeeCCCCcCceEEE
Q 019504          195 NSGGPLLDSKGNLIGINTAIITQTGTSAGVGFA  227 (340)
Q Consensus       195 ~SGGPl~d~~G~VVGi~~~~~~~~~~~~~~~~a  227 (340)
                      ..|.|+++.+|+.||.+++......-...++++
T Consensus        34 ~~g~~v~~~~g~~vG~vTS~~~sp~~~~~Iala   66 (95)
T PF08669_consen   34 RGGEPVYDEDGKPVGRVTSGAYSPTLGKNIALA   66 (95)
T ss_dssp             STTCEEEETTTEEEEEEEEEEEETTTTEEEEEE
T ss_pred             CCCCEEEECCCcEEeEEEEEeECCCCCceEEEE
Confidence            457899987999999988764432223444444


No 115
>KOG3834 consensus Golgi reassembly stacking protein GRASP65, contains PDZ domain [Intracellular trafficking, secretion, and vesicular transport]
Probab=41.18  E-value=10  Score=36.18  Aligned_cols=35  Identities=34%  Similarity=0.224  Sum_probs=26.4

Q ss_pred             EEeeCCCChhhhcCCCccccCCCCCCcCCcEEEEECCEEccCCCC
Q 019504          272 VLQVPGNSLAAKAGILPTTRGFAGNIILGDIIVAVNNKPVSFSCL  316 (340)
Q Consensus       272 V~~v~~~spa~~~gl~~~~~~~~~~l~~GDvi~~i~g~~v~~~~d  316 (340)
                      |-+|.++|||++|||+          --+|-|+-+-.......+|
T Consensus       113 vl~V~p~SPaalAgl~----------~~~DYivG~~~~~~~~~eD  147 (462)
T KOG3834|consen  113 VLSVEPNSPAALAGLR----------PYTDYIVGIWDAVMHEEED  147 (462)
T ss_pred             eeecCCCCHHHhcccc----------cccceEecchhhhccchHH
Confidence            5578899999999993          3789999994444455555


No 116
>PF02743 Cache_1:  Cache domain;  InterPro: IPR004010 Cache is an extracellular domain that is predicted to have a role in small-molecule recognition in a wide range of proteins, including the animal dihydropyridine-sensitive voltage-gated Ca2+ channel; alpha-2delta subunit, and various bacterial chemotaxis receptors. The name Cache comes from CAlcium channels and CHEmotaxis receptors. This domain consists of an N-terminal part with three predicted strands and an alpha-helix, and a C-terminal part with a strand dyad followed by a relatively unstructured region. The N-terminal portion of the (unpermuted) Cache domain contains three predicted strands that could form a sheet analogous to that present in the core of the PAS domain structure. Cache domains are particularly widespread in bacteria, with Vibrio cholerae. The animal calcium channel alpha-2delta subunits might have acquired a part of their extracellular domains from a bacterial source []. The Cache domain appears to have arisen from the GAF-PAS fold despite their divergent functions [].; GO: 0016020 membrane; PDB: 3C8C_A 3LIB_D 3LIA_A 3LI8_A 3LI9_A.
Probab=39.24  E-value=34  Score=24.09  Aligned_cols=32  Identities=22%  Similarity=0.473  Sum_probs=24.1

Q ss_pred             cceeecCCCcEEEEEeeeeeCCCCcCceEEEEehHhHHHHHHHHH
Q 019504          197 GGPLLDSKGNLIGINTAIITQTGTSAGVGFAIPSSTVLKIVPQLI  241 (340)
Q Consensus       197 GGPl~d~~G~VVGi~~~~~~~~~~~~~~~~aip~~~i~~~l~~l~  241 (340)
                      .-|+.+.+|+++|+...             .+..+.+.++++++.
T Consensus        18 s~pi~~~~g~~~Gvv~~-------------di~l~~l~~~i~~~~   49 (81)
T PF02743_consen   18 SVPIYDDDGKIIGVVGI-------------DISLDQLSEIISNIK   49 (81)
T ss_dssp             EEEEEETTTEEEEEEEE-------------EEEHHHHHHHHTTSB
T ss_pred             EEEEECCCCCEEEEEEE-------------EeccceeeeEEEeeE
Confidence            36888889999999654             577888887776653


No 117
>PF01455 HupF_HypC:  HupF/HypC family;  InterPro: IPR001109 The large subunit of [NiFe]-hydrogenase, as well as other nickel metalloenzymes, is synthesised as a precursor devoid of the metalloenzyme active site. This precursor then undergoes a complex post-translational maturation process that requires a number of accessory proteins. The hydrogenase expression/formation proteins (HupF/HypC) form a family of small proteins that are hydrogenase precursor-specific chaperones required for this maturation process []. They are believed to keep the hydrogenase precursor in a conformation accessible for metal incorporation [, ].; PDB: 3D3R_A 2Z1C_C 2OT2_A.
Probab=38.17  E-value=1.2e+02  Score=21.07  Aligned_cols=43  Identities=21%  Similarity=0.346  Sum_probs=29.8

Q ss_pred             EEEEEEEeCCCCcEEEEEEecCCCCccceeecCCCCCCCCCEEEEE
Q 019504          106 FEGKLVGADRAKDLAVLKIEASEDLLKPINVGQSSFLKVGQQCLAI  151 (340)
Q Consensus       106 ~~a~v~~~d~~~DlAlL~v~~~~~~~~~~~l~~~~~~~~G~~v~~i  151 (340)
                      ++++++..+.....|++.....   ...+.+.--.++++||+|++-
T Consensus         5 iP~~Vv~v~~~~~~A~v~~~G~---~~~V~~~lv~~v~~Gd~VLVH   47 (68)
T PF01455_consen    5 IPGRVVEVDEDGGMAVVDFGGV---RREVSLALVPDVKVGDYVLVH   47 (68)
T ss_dssp             EEEEEEEEETTTTEEEEEETTE---EEEEEGTTCTSB-TT-EEEEE
T ss_pred             ccEEEEEEeCCCCEEEEEcCCc---EEEEEEEEeCCCCCCCEEEEe
Confidence            6888999988889999987753   244444333458999999886


No 118
>PF02601 Exonuc_VII_L:  Exonuclease VII, large subunit;  InterPro: IPR020579 Exonuclease VII (EC 3.1.11.6) is composed of two nonidentical subunits: one large subunit and 4 small ones []. Exonuclease VII catalyses exonucleolytic cleavage in either 5'-3' or 3'-5' direction to yield 5'-phosphomononucleotides. The large subunit also contains the OB-fold domains (IPR004365 from INTERPRO) that bind to nucleic acids at the N terminus.  This entry represents Exonuclease VII, large subunit, C-terminal. ; GO: 0008855 exodeoxyribonuclease VII activity
Probab=38.10  E-value=45  Score=30.77  Aligned_cols=38  Identities=21%  Similarity=0.435  Sum_probs=31.4

Q ss_pred             ceEEEEEcCCCEEEeCccccCCCCCCCCCCCccEEEEEEEecCCceeEEEEE
Q 019504           58 NGSGVVWDGKGHIVTNFHVIGSALSRKPAEGQVVARVNILASDGVQKNFEGK  109 (340)
Q Consensus        58 ~GsGfiI~~~G~IlT~~Hvv~~~~~~~~~~~~~~~~i~v~~~~g~~~~~~a~  109 (340)
                      .|-.++.+++|.++|+..-+...+           .+.+.+.||.   +.++
T Consensus       281 RGYaiv~~~~g~vI~s~~~l~~gd-----------~i~i~l~DG~---~~a~  318 (319)
T PF02601_consen  281 RGYAIVRDKDGKVITSVKQLKPGD-----------EIEIRLADGS---IKAE  318 (319)
T ss_pred             CceEEEECCCCCEECCHHHCCCCC-----------EEEEEEcceE---EEEE
Confidence            377788888899999999998876           8999999994   5554


No 119
>PF14827 Cache_3:  Sensory domain of two-component sensor kinase; PDB: 1OJG_A 3BY8_A 1P0Z_I 2V9A_A 2J80_B.
Probab=35.89  E-value=40  Score=25.87  Aligned_cols=18  Identities=33%  Similarity=0.708  Sum_probs=13.3

Q ss_pred             ceeecCCCcEEEEEeeee
Q 019504          198 GPLLDSKGNLIGINTAII  215 (340)
Q Consensus       198 GPl~d~~G~VVGi~~~~~  215 (340)
                      .|++|.+|++||++.-++
T Consensus        94 ~PV~d~~g~viG~V~VG~  111 (116)
T PF14827_consen   94 APVYDSDGKVIGVVSVGV  111 (116)
T ss_dssp             EEEE-TTS-EEEEEEEEE
T ss_pred             EeeECCCCcEEEEEEEEE
Confidence            688999999999987654


No 120
>COG4820 EutJ Ethanolamine utilization protein, possible chaperonin [Amino acid transport and metabolism]
Probab=35.48  E-value=1.1e+02  Score=26.47  Aligned_cols=103  Identities=19%  Similarity=0.182  Sum_probs=52.4

Q ss_pred             ceeecCCCcEEEEEeeeeeCCCCcCceEEEEehHhHHHHHHHHHHcCceeeeeeeEEeccHHHHhh-----------cC-
Q 019504          198 GPLLDSKGNLIGINTAIITQTGTSAGVGFAIPSSTVLKIVPQLIQYGKVVRAGLNVDIAPDLVASQ-----------LN-  265 (340)
Q Consensus       198 GPl~d~~G~VVGi~~~~~~~~~~~~~~~~aip~~~i~~~l~~l~~~~~~~~~~lg~~~~~~~~~~~-----------~~-  265 (340)
                      .-++|.+|+-+..+.-...--.+.--..|.=.++.++++.+.+.+.       ||+++....-+=-           .+ 
T Consensus        43 ~~vlD~d~~Pvag~~~~advVRDGiVvdf~eaveiVrrlkd~lEk~-------lGi~~tha~taiPPGt~~~~~ri~iNV  115 (277)
T COG4820          43 SMVLDRDGQPVAGCLDWADVVRDGIVVDFFEAVEIVRRLKDTLEKQ-------LGIRFTHAATAIPPGTEQGDPRISINV  115 (277)
T ss_pred             EEEEcCCCCeEEEEehhhhhhccceEEehhhHHHHHHHHHHHHHHh-------hCeEeeeccccCCCCccCCCceEEEEe
Confidence            3456777776665443211111112345666778888888887653       3333321110000           00 


Q ss_pred             -CCCCcEEEeeCCCChhhhc--CCCcccc-----CCCC--CCcCCcEEEEEC
Q 019504          266 -VGNGALVLQVPGNSLAAKA--GILPTTR-----GFAG--NIILGDIIVAVN  307 (340)
Q Consensus       266 -~~~g~~V~~v~~~spa~~~--gl~~~~~-----~~~~--~l~~GDvi~~i~  307 (340)
                       ...|+-|..|..+..|+..  +|+-+-.     ++.|  .++.|+||...|
T Consensus       116 iESAGlevl~vlDEPTAaa~vL~l~dg~VVDiGGGTTGIsi~kkGkViy~AD  167 (277)
T COG4820         116 IESAGLEVLHVLDEPTAAADVLQLDDGGVVDIGGGTTGISIVKKGKVIYSAD  167 (277)
T ss_pred             ecccCceeeeecCCchhHHHHhccCCCcEEEeCCCcceeEEEEcCcEEEecc
Confidence             1257777777655555443  3322211     1222  368999999876


No 121
>COG0260 PepB Leucyl aminopeptidase [Amino acid transport and metabolism]
Probab=33.60  E-value=43  Score=32.96  Aligned_cols=15  Identities=40%  Similarity=0.612  Sum_probs=14.1

Q ss_pred             cCCcEEEEECCEEcc
Q 019504          298 ILGDIIVAVNNKPVS  312 (340)
Q Consensus       298 ~~GDvi~~i~g~~v~  312 (340)
                      +|||||+++||+.|.
T Consensus       316 rPGDVits~~GkTVE  330 (485)
T COG0260         316 RPGDVITSMNGKTVE  330 (485)
T ss_pred             CCCCeEEecCCcEEE
Confidence            899999999999885


No 122
>PTZ00138 small nuclear ribonucleoprotein; Provisional
Probab=32.17  E-value=1e+02  Score=22.74  Aligned_cols=33  Identities=30%  Similarity=0.409  Sum_probs=26.4

Q ss_pred             EEEEEecCCceeEEEEEEEEeCCCCcEEEEEEe
Q 019504           93 RVNILASDGVQKNFEGKLVGADRAKDLAVLKIE  125 (340)
Q Consensus        93 ~i~v~~~~g~~~~~~a~v~~~d~~~DlAlL~v~  125 (340)
                      .+.+.+.++....+.+++.++|...++.|=...
T Consensus        28 ~V~i~l~~~~~r~~~G~L~gfD~~mNlVL~d~~   60 (89)
T PTZ00138         28 RVQIWLYDHPNLRIEGKILGFDEYMNMVLDDAE   60 (89)
T ss_pred             EEEEEEEeCCCcEEEEEEEEEcccceEEEccEE
Confidence            677777776545899999999999998876654


No 123
>COG5233 GRH1 Peripheral Golgi membrane protein [Intracellular trafficking and secretion]
Probab=32.00  E-value=30  Score=31.78  Aligned_cols=33  Identities=42%  Similarity=0.698  Sum_probs=27.9

Q ss_pred             EEEeeCCCChhhhcCCCccccCCCCCCcCCcEEEEECCEEccCC
Q 019504          271 LVLQVPGNSLAAKAGILPTTRGFAGNIILGDIIVAVNNKPVSFS  314 (340)
Q Consensus       271 ~V~~v~~~spa~~~gl~~~~~~~~~~l~~GDvi~~i~g~~v~~~  314 (340)
                      -|-+|.+.+||+++|.           ..||-|+-+|+-++.-+
T Consensus        66 ~~lrv~~~~~~e~~~~-----------~~~dyilg~n~Dp~~fl   98 (417)
T COG5233          66 EVLRVNPESPAEKAGM-----------VVGDYILGINEDPLRFL   98 (417)
T ss_pred             hheeccccChhHhhcc-----------ccceeEEeecCCcHHHH
Confidence            3567789999999998           78999999998887643


No 124
>PF01732 DUF31:  Putative peptidase (DUF31);  InterPro: IPR022382  This domain has no known function. It is found in various hypothetical proteins and putative lipoproteins from mycoplasmas. 
Probab=30.73  E-value=29  Score=32.89  Aligned_cols=24  Identities=29%  Similarity=0.432  Sum_probs=19.0

Q ss_pred             CCceEEEEEcC----CC------EEEeCccccCC
Q 019504           56 EGNGSGVVWDG----KG------HIVTNFHVIGS   79 (340)
Q Consensus        56 ~~~GsGfiI~~----~G------~IlT~~Hvv~~   79 (340)
                      ...|||+|+|-    ++      |+.||.||+..
T Consensus        35 ~~~GT~WIlDy~~~~~~~~p~k~y~ATNlHVa~~   68 (374)
T PF01732_consen   35 SVSGTGWILDYKKPEDNKYPTKWYFATNLHVASN   68 (374)
T ss_pred             cCcceEEEEEEeccCCCCCCeEEEEEechhhhcc
Confidence            35799999972    22      79999999984


No 125
>KOG1738 consensus Membrane-associated guanylate kinase-interacting protein/connector enhancer of KSR-like [Nucleotide transport and metabolism]
Probab=30.66  E-value=53  Score=33.00  Aligned_cols=63  Identities=16%  Similarity=0.095  Sum_probs=41.0

Q ss_pred             HHHHHHHcCceeeeeeeEEeccHHHHhhcCCCCCcEEEeeCCCChhhhcCCCccccCCCCCCcCCcEEEEECCEEccCCC
Q 019504          236 IVPQLIQYGKVVRAGLNVDIAPDLVASQLNVGNGALVLQVPGNSLAAKAGILPTTRGFAGNIILGDIIVAVNNKPVSFSC  315 (340)
Q Consensus       236 ~l~~l~~~~~~~~~~lg~~~~~~~~~~~~~~~~g~~V~~v~~~spa~~~gl~~~~~~~~~~l~~GDvi~~i~g~~v~~~~  315 (340)
                      .++.......-+.-++|+...+..       .+-.+|.++.++|||.+...          |..||-|+.||++.|....
T Consensus       200 ~Le~vqls~~kp~eglg~~I~Ssy-------dg~h~~s~~~e~Spad~~~k----------I~dgdEv~qiN~qtvVgwq  262 (638)
T KOG1738|consen  200 SLERVQLSTLSPSEGLGLYIDSSY-------DGPHVTSKIFEQSPADYRQK----------ILDGDEVLQINEQTVVGWQ  262 (638)
T ss_pred             HHHHHHhccCCcccCCceEEeeec-------CCceeccccccCChHHHhhc----------ccCccceeeecccccccch
Confidence            444443333333445555543221       24556778899999998864          5899999999999977443


No 126
>COG2524 Predicted transcriptional regulator, contains C-terminal CBS domains [Transcription]
Probab=30.29  E-value=2.9e+02  Score=24.92  Aligned_cols=20  Identities=35%  Similarity=0.713  Sum_probs=17.1

Q ss_pred             CCCccceeecCCCcEEEEEee
Q 019504          193 PGNSGGPLLDSKGNLIGINTA  213 (340)
Q Consensus       193 ~G~SGGPl~d~~G~VVGi~~~  213 (340)
                      .|..|.|++|.+ ++||+.+.
T Consensus       201 ~~i~GaPVvd~d-k~vGiit~  220 (294)
T COG2524         201 KGIRGAPVVDDD-KIVGIITL  220 (294)
T ss_pred             cCccCCceecCC-ceEEEEEH
Confidence            799999999855 99999764


No 127
>cd01739 LSm11_C The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation.  Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm11 is an SmD2 - like subunit which binds U7 snRNA along with LSm10 and five other Sm subunits to form a 7-member ring structure. LSm11 and the U7 snRNP of which it is a part are thought to play an important role in histone mRNA 3' processing.
Probab=30.17  E-value=94  Score=21.44  Aligned_cols=35  Identities=26%  Similarity=0.427  Sum_probs=27.1

Q ss_pred             EEEEEecCCceeEEEEEEEEeCCCCcEEEEEEecC
Q 019504           93 RVNILASDGVQKNFEGKLVGADRAKDLAVLKIEAS  127 (340)
Q Consensus        93 ~i~v~~~~g~~~~~~a~v~~~d~~~DlAlL~v~~~  127 (340)
                      .+.++-.+|-.-.+.+.++++|.+.+++|.-++..
T Consensus        12 rV~iR~~~gvrG~~~G~lvAFDK~wNm~L~DV~E~   46 (66)
T cd01739          12 RVHIRTFKGLRGVCSGFLVAFDKFWNMALVDVDET   46 (66)
T ss_pred             EEEEecccCcccEEEEEEEeeeeehhheehhhhhh
Confidence            45555555554478999999999999999988765


No 128
>PRK06437 hypothetical protein; Provisional
Probab=28.98  E-value=1.4e+02  Score=20.54  Aligned_cols=30  Identities=23%  Similarity=0.241  Sum_probs=22.0

Q ss_pred             cCCcEEEEECCEEccCCCCCCCceeEEEeeCCCCceEEEEe
Q 019504          298 ILGDIIVAVNNKPVSFSCLSIPSRIYLICAEPNQDHLTCLK  338 (340)
Q Consensus       298 ~~GDvi~~i~g~~v~~~~d~~~~~~~~~~~~~~~~~~~~~~  338 (340)
                      .+..+.+++||+.+.  .|     ..+   + .+|+|.|++
T Consensus        33 ~~~~vaV~vNg~iv~--~~-----~~L---~-dgD~Veiv~   62 (67)
T PRK06437         33 DEEEYVVIVNGSPVL--ED-----HNV---K-KEDDVLILE   62 (67)
T ss_pred             CCccEEEEECCEECC--Cc-----eEc---C-CCCEEEEEe
Confidence            678999999999997  32     111   2 458888886


No 129
>PF10049 DUF2283:  Protein of unknown function (DUF2283);  InterPro: IPR019270  Members of this family of hypothetical proteins have no known function. 
Probab=28.09  E-value=43  Score=21.58  Aligned_cols=11  Identities=36%  Similarity=0.881  Sum_probs=7.9

Q ss_pred             cCCCcEEEEEe
Q 019504          202 DSKGNLIGINT  212 (340)
Q Consensus       202 d~~G~VVGi~~  212 (340)
                      |.+|++|||-.
T Consensus        36 d~~G~ivGIEI   46 (50)
T PF10049_consen   36 DEDGRIVGIEI   46 (50)
T ss_pred             CCCCCEEEEEE
Confidence            45788998843


No 130
>cd04627 CBS_pair_14 The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria.  The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair.  The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here.  It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic gener
Probab=26.93  E-value=53  Score=24.82  Aligned_cols=21  Identities=38%  Similarity=0.550  Sum_probs=16.9

Q ss_pred             CCCccceeecCCCcEEEEEee
Q 019504          193 PGNSGGPLLDSKGNLIGINTA  213 (340)
Q Consensus       193 ~G~SGGPl~d~~G~VVGi~~~  213 (340)
                      .+.+.=|++|.+|+++|+++.
T Consensus        97 ~~~~~lpVvd~~~~~vGiit~  117 (123)
T cd04627          97 EGISSVAVVDNQGNLIGNISV  117 (123)
T ss_pred             cCCceEEEECCCCcEEEEEeH
Confidence            345567999989999999875


No 131
>cd01721 Sm_D3 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation.  Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. Sm subunit D3 heterodimerizes with subunit B and three such heterodimers form a hexameric ring structure with alternating B and D3 subunits. The D3 - B heterodimer also assembles into a heptameric ring containing D1, D2, E, F, and G subunits. Sm-like proteins exist in archaea as well as prokaryotes which form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=26.89  E-value=2.2e+02  Score=19.69  Aligned_cols=32  Identities=19%  Similarity=0.332  Sum_probs=28.8

Q ss_pred             EEEEEecCCceeEEEEEEEEeCCCCcEEEEEEec
Q 019504           93 RVNILASDGVQKNFEGKLVGADRAKDLAVLKIEA  126 (340)
Q Consensus        93 ~i~v~~~~g~~~~~~a~v~~~d~~~DlAlL~v~~  126 (340)
                      .+.|.+.+|.  .+.+++..+|...++.|-....
T Consensus        12 ~V~VeLk~g~--~~~G~L~~~D~~MNl~L~~~~~   43 (70)
T cd01721          12 IVTVELKTGE--VYRGKLIEAEDNMNCQLKDVTV   43 (70)
T ss_pred             EEEEEECCCc--EEEEEEEEEcCCceeEEEEEEE
Confidence            7899999997  8999999999999999988753


No 132
>PF08605 Rad9_Rad53_bind:  Fungal Rad9-like Rad53-binding;  InterPro: IPR013914  In Saccharomyces cerevisiae (Baker's yeast), Rad9 is a key adaptor protein in DNA damage checkpoint pathways. DNA damage induces Rad9 phosphorylation, and Rad53 specifically associates with this region of Rad9, when phosphorylated, via the Rad53 IPR000253 from INTERPRO domain []. There is no clear higher eukaryotic ortholog to Rad9. 
Probab=26.57  E-value=1.4e+02  Score=23.81  Aligned_cols=56  Identities=20%  Similarity=0.208  Sum_probs=39.1

Q ss_pred             EEEEEecCCceeEEEEEEEEeCCCCcEEEEEEecCCCCccceeecCCCCCCCCCEEEEEe
Q 019504           93 RVNILASDGVQKNFEGKLVGADRAKDLAVLKIEASEDLLKPINVGQSSFLKVGQQCLAIG  152 (340)
Q Consensus        93 ~i~v~~~~g~~~~~~a~v~~~d~~~DlAlL~v~~~~~~~~~~~l~~~~~~~~G~~v~~iG  152 (340)
                      .++..+ +-  +.|+|+++..+...+-.+++++.....++.-.+ ..-+++.||.|-+-+
T Consensus        15 avW~~~-~~--~yYPa~~~~~~~~~~~~~V~Fedg~~~i~~~dv-~~LDlRIGD~Vkv~~   70 (131)
T PF08605_consen   15 AVWAGY-NL--KYYPATCVGSGVDRDRSLVRFEDGTYEIKNEDV-KYLDLRIGDTVKVDG   70 (131)
T ss_pred             ceeecC-CC--eEeeEEEEeecCCCCeEEEEEecCceEeCcccE-eeeeeecCCEEEECC
Confidence            455544 22  478999999988888899999876433333333 234689999998876


No 133
>cd01733 LSm10 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation.  Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet.  LSm10 is an SmD1-like protein which is thought to bind U7 snRNA along with LSm11 and five other Sm subunits to form a 7-member ring structure. LSm10 and the U7 snRNP of which it is a part are thought to play an important role in histone mRNA 3' processing.
Probab=26.57  E-value=2.4e+02  Score=20.07  Aligned_cols=32  Identities=13%  Similarity=0.275  Sum_probs=28.5

Q ss_pred             EEEEEecCCceeEEEEEEEEeCCCCcEEEEEEec
Q 019504           93 RVNILASDGVQKNFEGKLVGADRAKDLAVLKIEA  126 (340)
Q Consensus        93 ~i~v~~~~g~~~~~~a~v~~~d~~~DlAlL~v~~  126 (340)
                      .+.|.+.+|.  .+.+++..+|...++-|-....
T Consensus        21 ~V~VeLKng~--~~~G~L~~vD~~MNl~L~~~~~   52 (78)
T cd01733          21 VVTVELRNET--TVTGRIASVDAFMNIRLAKVTI   52 (78)
T ss_pred             EEEEEECCCC--EEEEEEEEEcCCceeEEEEEEE
Confidence            6899999997  8999999999999998887754


No 134
>COG2104 ThiS Sulfur transfer protein involved in thiamine biosynthesis [Coenzyme metabolism]
Probab=25.39  E-value=1.3e+02  Score=20.95  Aligned_cols=34  Identities=21%  Similarity=0.166  Sum_probs=22.4

Q ss_pred             cCCcEEEEECCEEccCCCCCCCceeEEEeeCCCCceEEEEe
Q 019504          298 ILGDIIVAVNNKPVSFSCLSIPSRIYLICAEPNQDHLTCLK  338 (340)
Q Consensus       298 ~~GDvi~~i~g~~v~~~~d~~~~~~~~~~~~~~~~~~~~~~  338 (340)
                      .+-=+++++||+.|....    +.....   ..+|.++|+|
T Consensus        30 ~~~~vav~vNg~iVpr~~----~~~~~l---~~gD~ievv~   63 (68)
T COG2104          30 NPEGVAVAVNGEIVPRSQ----WADTIL---KEGDRIEVVR   63 (68)
T ss_pred             CCceEEEEECCEEccchh----hhhccc---cCCCEEEEEE
Confidence            667789999999997543    222222   2348888876


No 135
>cd04603 CBS_pair_KefB_assoc This cd contains two tandem repeats of the cystathionine beta-synthase (CBS pair) domains associated with the KefB (Kef-type K+ transport systems) domain which is involved in inorganic ion transport and metabolism. CBS is a small domain originally identified in cystathionine beta-synthase and subsequently found in a wide range of different proteins. CBS domains usually come in tandem repeats, which associate to form a so-called Bateman domain or a CBS pair which is reflected in this model. The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains.  It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown.
Probab=25.22  E-value=60  Score=24.06  Aligned_cols=21  Identities=19%  Similarity=0.286  Sum_probs=16.4

Q ss_pred             CCCccceeecCCCcEEEEEee
Q 019504          193 PGNSGGPLLDSKGNLIGINTA  213 (340)
Q Consensus       193 ~G~SGGPl~d~~G~VVGi~~~  213 (340)
                      .+.+--|++|.+|+++|+++.
T Consensus        85 ~~~~~lpVvd~~~~~~Giit~  105 (111)
T cd04603          85 TEPPVVAVVDKEGKLVGTIYE  105 (111)
T ss_pred             cCCCeEEEEcCCCeEEEEEEh
Confidence            344456999988999999874


No 136
>cd04620 CBS_pair_7 The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria.  The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair.  The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here.  It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic genera
Probab=24.72  E-value=62  Score=23.93  Aligned_cols=20  Identities=45%  Similarity=0.629  Sum_probs=16.3

Q ss_pred             CCccceeecCCCcEEEEEee
Q 019504          194 GNSGGPLLDSKGNLIGINTA  213 (340)
Q Consensus       194 G~SGGPl~d~~G~VVGi~~~  213 (340)
                      +...-|++|.+|+++|+.+.
T Consensus        90 ~~~~~pVvd~~~~~~Gvit~  109 (115)
T cd04620          90 QIRHLPVLDDQGQLIGLVTA  109 (115)
T ss_pred             CCceEEEEcCCCCEEEEEEh
Confidence            44567999988999999875


No 137
>TIGR00739 yajC preprotein translocase, YajC subunit. While this protein is part of the preprotein translocase in Escherichia coli, it is not essential for viability or protein secretion. The N-terminus region contains a predicted membrane-spanning region followed by a region consisting almost entirely of residues with charged (acidic, basic, or zwitterionic) side chains. This small protein is about 100 residues in length, and is restricted to bacteria; however, this protein is absent from some lineages, including spirochetes and Mycoplasmas.
Probab=24.34  E-value=1.3e+02  Score=21.91  Aligned_cols=40  Identities=15%  Similarity=0.196  Sum_probs=25.2

Q ss_pred             CCcCCcEEEEECCEE--ccCCCCCCCceeEEEeeCCCCceEEEEeCC
Q 019504          296 NIILGDIIVAVNNKP--VSFSCLSIPSRIYLICAEPNQDHLTCLKSS  340 (340)
Q Consensus       296 ~l~~GDvi~~i~g~~--v~~~~d~~~~~~~~~~~~~~~~~~~~~~~~  340 (340)
                      .|++||-|+-..|--  |.+.+|    ........ .+..+++.|++
T Consensus        37 ~L~~Gd~VvT~gGi~G~V~~i~d----~~v~vei~-~g~~i~~~r~a   78 (84)
T TIGR00739        37 SLKKGDKVLTIGGIIGTVTKIAE----NTIVIELN-DNTEITFSKNA   78 (84)
T ss_pred             hCCCCCEEEECCCeEEEEEEEeC----CEEEEEEC-CCeEEEEEhHH
Confidence            469999999998853  344544    22232333 34889988864


No 138
>cd01724 Sm_D1 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation.  Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. Sm subunit D1 heterodimerizes with subunit D2 and three such heterodimers form a hexameric ring structure with alternating D1 and D2 subunits. The D1 - D2 heterodimer also assembles into a heptameric ring containing DB, D3, E, F, and G subunits. Sm-like proteins exist in archaea as well as prokaryotes which form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=22.59  E-value=3.2e+02  Score=20.07  Aligned_cols=60  Identities=15%  Similarity=0.200  Sum_probs=39.8

Q ss_pred             EEEEEecCCceeEEEEEEEEeCCCCcEEEEEEecCCCCccceeecCCCCCCCCCEEEEEecCCC
Q 019504           93 RVNILASDGVQKNFEGKLVGADRAKDLAVLKIEASEDLLKPINVGQSSFLKVGQQCLAIGNPFG  156 (340)
Q Consensus        93 ~i~v~~~~g~~~~~~a~v~~~d~~~DlAlL~v~~~~~~~~~~~l~~~~~~~~G~~v~~iG~p~g  156 (340)
                      .+.|.+.+|.  .+.+++..+|...++.|-.+......-.+..++  .-.-.|..|..+-.|..
T Consensus        13 ~V~VeLKng~--~~~G~L~~vD~~MNl~L~~a~~~~~~~~~~~~~--~v~IRG~nI~yi~lPd~   72 (90)
T cd01724          13 TVTIELKNGT--IVHGTITGVDPSMNTHLKNVKLTLKGRNPVPLD--TLSIRGNNIRYFILPDS   72 (90)
T ss_pred             EEEEEECCCC--EEEEEEEEEcCceeEEEEEEEEEcCCCceeEcc--eEEEeCCEEEEEEcCCc
Confidence            7899999997  899999999999999998875432111223332  11234666666555543


No 139
>TIGR03279 cyano_FeS_chp putative FeS-containing Cyanobacterial-specific oxidoreductase. Members of this protein family are predicted FeS-containing oxidoreductases of unknown function, apparently restricted to and universal across the Cyanobacteria. The high trusted cutoff score for this model, 700 bits, excludes homologs from other lineages. This exclusion seems justified because a significant number of sequence positions are simultaneously unique to and invariant across the Cyanobacteria, suggesting a specialized, conserved function, perhaps related to photosynthesis. A distantly related protein family, TIGR03278, is universal in and restricted to archaeal methanogens, and may be linked to methanogenesis.
Probab=21.84  E-value=41  Score=32.52  Aligned_cols=21  Identities=29%  Similarity=0.363  Sum_probs=17.6

Q ss_pred             EEEeeCCCChhhhcCCCcccc
Q 019504          271 LVLQVPGNSLAAKAGILPTTR  291 (340)
Q Consensus       271 ~V~~v~~~spa~~~gl~~~~~  291 (340)
                      +|..|.++|||+++||+++..
T Consensus         1 ~I~~V~pgSpAe~AGLe~GD~   21 (433)
T TIGR03279         1 LISAVLPGSIAEELGFEPGDA   21 (433)
T ss_pred             CcCCcCCCCHHHHcCCCCCCE
Confidence            467899999999999977655


No 140
>cd04597 CBS_pair_DRTGG_assoc2 This cd contains two tandem repeats of the cystathionine beta-synthase (CBS pair) domains associated with a DRTGG domain upstream. The function of the DRTGG domain, named after its conserved residues, is unknown. CBS is a small domain originally identified in cystathionine beta-synthase and subsequently found in a wide range of different proteins. CBS domains usually come in tandem repeats, which associate to form a so-called Bateman domain or a CBS pair which is reflected in this model. The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains. It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown.
Probab=21.58  E-value=91  Score=23.44  Aligned_cols=21  Identities=29%  Similarity=0.332  Sum_probs=17.4

Q ss_pred             CCCccceeecCCCcEEEEEee
Q 019504          193 PGNSGGPLLDSKGNLIGINTA  213 (340)
Q Consensus       193 ~G~SGGPl~d~~G~VVGi~~~  213 (340)
                      .+...-|++|.+|+++|+++.
T Consensus        87 ~~~~~lpVvd~~~~l~Givt~  107 (113)
T cd04597          87 HNIRTLPVVDDDGTPAGIITL  107 (113)
T ss_pred             cCCCEEEEECCCCeEEEEEEH
Confidence            456678999999999999764


No 141
>cd01723 LSm4 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation.  Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm4 is one of at least seven subunits that assemble onto U6 snRNA to form a seven-membered ring structure.  Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=21.04  E-value=3.1e+02  Score=19.28  Aligned_cols=32  Identities=16%  Similarity=0.332  Sum_probs=28.5

Q ss_pred             EEEEEecCCceeEEEEEEEEeCCCCcEEEEEEec
Q 019504           93 RVNILASDGVQKNFEGKLVGADRAKDLAVLKIEA  126 (340)
Q Consensus        93 ~i~v~~~~g~~~~~~a~v~~~d~~~DlAlL~v~~  126 (340)
                      .+.|.+.+|.  .+.+++..+|...++.+-.+..
T Consensus        13 ~V~VeLkng~--~~~G~L~~~D~~mNi~L~~~~~   44 (76)
T cd01723          13 PMLVELKNGE--TYNGHLVNCDNWMNIHLREVIC   44 (76)
T ss_pred             EEEEEECCCC--EEEEEEEEEcCCCceEEEeEEE
Confidence            7899999997  8999999999999999987643


No 142
>PRK10413 hydrogenase 2 accessory protein HypG; Provisional
Probab=20.27  E-value=2.1e+02  Score=20.68  Aligned_cols=48  Identities=15%  Similarity=0.139  Sum_probs=28.1

Q ss_pred             EEEEEEEeCCCC-cEEEEEEecCCCCccceeecCC-CCCCCCCEEEEE-ec
Q 019504          106 FEGKLVGADRAK-DLAVLKIEASEDLLKPINVGQS-SFLKVGQQCLAI-GN  153 (340)
Q Consensus       106 ~~a~v~~~d~~~-DlAlL~v~~~~~~~~~~~l~~~-~~~~~G~~v~~i-G~  153 (340)
                      ++++++..+... .+|++.+......+...-+++. ..+++||+|++- ||
T Consensus         5 iP~kVi~i~~~~~~~A~vd~~Gv~r~V~l~Lv~~~~~~~~vGDyVLVHaGf   55 (82)
T PRK10413          5 VPGQVLAVGEDIHQLAQVEVCGIKRDVNIALICEGNPADLLGQWVLVHVGF   55 (82)
T ss_pred             cceEEEEECCCCCcEEEEEcCCeEEEEEeeeeccCCcccccCCEEEEecch
Confidence            577788777653 6788777654322221122221 246899999884 54


No 143
>PF05578 Peptidase_S31:  Pestivirus NS3 polyprotein peptidase S31;  InterPro: IPR000280 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base. The geometric orientations of the catalytic residues are similar between families, despite different protein folds. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC). This group of serine peptidases belongs to MEROPS peptidase family S31 (clan PA(S)). The type example is pestivirus NS3 polyprotein peptidase from bovine viral diarrhea virus, which is a Type 1 pestivirus. The pestiviruses are single-stranded RNA viruses whose genomes encode one large polyprotein. The p80 endopeptidase resides towards the middle of the polyprotein and is responsible for processing all non-structural pestivirus proteins. The p80 enzyme is similar to other proteases in the PA(S) clan and is predicted to have a fold similar to that of chymotrypsin. An HDS catalytic triad has been identified.; GO: 0004252 serine-type endopeptidase activity, 0006508 proteolysis
Probab=20.27  E-value=2.9e+02  Score=22.63  Aligned_cols=127  Identities=20%  Similarity=0.263  Sum_probs=64.1

Q ss_pred             CCceEEEEEcCCCEEEeCccccCCCCCCCCCCCccEEEEEEEecCCceeEEEEEEEEeCCCCcEEEEEEecCCCCcccee
Q 019504           56 EGNGSGVVWDGKGHIVTNFHVIGSALSRKPAEGQVVARVNILASDGVQKNFEGKLVGADRAKDLAVLKIEASEDLLKPIN  135 (340)
Q Consensus        56 ~~~GsGfiI~~~G~IlT~~Hvv~~~~~~~~~~~~~~~~i~v~~~~g~~~~~~a~v~~~d~~~DlAlL~v~~~~~~~~~~~  135 (340)
                      .+.-+|+.+..+|-|-.--||-.+.            ++.|.-+-|+     .+++..+...      +....+ + -+.
T Consensus        50 rgletgwaythqggissvdhvt~gk------------d~lvcdsmgr-----trvvcqsnnk------~tde~e-y-gvk  104 (211)
T PF05578_consen   50 RGLETGWAYTHQGGISSVDHVTAGK------------DLLVCDSMGR-----TRVVCQSNNK------MTDETE-Y-GVK  104 (211)
T ss_pred             hcccccceeeccCCcccceeeecCC------------ceEEecCCCc-----eEEEEccCCc------ccchhh-c-ccc
Confidence            3456788887777777777887664            3444444443     1233322110      000000 0 011


Q ss_pred             ecCCCCCCCCCEEEEEecCCCCCCceeEEEEeeeccccc-----cCCCceecceEEEeeccCCCCccceeecC-CCcEEE
Q 019504          136 VGQSSFLKVGQQCLAIGNPFGFDHTLTVGVISGLNRDIF-----SQAGVTIGGGIQTDAAINPGNSGGPLLDS-KGNLIG  209 (340)
Q Consensus       136 l~~~~~~~~G~~v~~iG~p~g~~~~~~~G~vs~~~~~~~-----~~~~~~~~~~i~~d~~i~~G~SGGPl~d~-~G~VVG  209 (340)
                      - ++ ....|..+|++ +|.....+-+.|.+-.+...--     ...+.    --.+|..-..|.||=|+|.. .|++||
T Consensus       105 t-ds-gcp~garcyv~-npea~nisgtkga~vhlqk~ggef~cvta~gt----paf~~~knlkg~s~~pifeassgr~vg  177 (211)
T PF05578_consen  105 T-DS-GCPDGARCYVL-NPEATNISGTKGAMVHLQKTGGEFTCVTASGT----PAFFDLKNLKGWSGLPIFEASSGRVVG  177 (211)
T ss_pred             c-CC-CCCCCcEEEEe-CCcccccccCcceEEEEeccCCceEEEeccCC----cceeeccccCCCCCCceeeccCCcEEE
Confidence            1 22 23567788877 5654444444443332221100     00010    02344444579999999976 799999


Q ss_pred             EEeee
Q 019504          210 INTAI  214 (340)
Q Consensus       210 i~~~~  214 (340)
                      =.-.+
T Consensus       178 r~k~g  182 (211)
T PF05578_consen  178 RVKVG  182 (211)
T ss_pred             EEEec
Confidence            76544


No 144
>cd04592 CBS_pair_EriC_assoc_euk This cd contains two tandem repeats of the cystathionine beta-synthase (CBS pair) domains in the EriC ClC-type chloride channels in eukaryotes. These ion channels are proteins with a seemingly simple task of allowing the passive flow of chloride ions across biological membranes. ClC-type chloride channels come from all kingdoms of life, have several gene families, and can be gated by voltage. The members of the ClC-type chloride channel are double-barreled: two proteins forming homodimers at a broad interface formed by four helices from each protein. The two pores are not found at this interface, but are completely contained within each subunit, as deduced from the mutational analyses, unlike many other channels, in which four or five identical or structurally related subunits jointly form one pore. CBS is a small domain originally identified in cystathionine beta-synthase and subsequently found in a wide range of different proteins. CBS domains usually 
Probab=20.21  E-value=99  Score=24.23  Aligned_cols=20  Identities=35%  Similarity=0.209  Sum_probs=16.9

Q ss_pred             CCccceeecCCCcEEEEEee
Q 019504          194 GNSGGPLLDSKGNLIGINTA  213 (340)
Q Consensus       194 G~SGGPl~d~~G~VVGi~~~  213 (340)
                      +.++-|++|.+|+++|+++.
T Consensus        23 ~~~~~~VvD~~g~l~Givt~   42 (133)
T cd04592          23 KQSCVLVVDSDDFLEGILTL   42 (133)
T ss_pred             CCCEEEEECCCCeEEEEEEH
Confidence            45678999999999999874


No 145
>TIGR00074 hypC_hupF hydrogenase assembly chaperone HypC/HupF. An additional proposed function is to shuttle the iron atom that has been liganded at the HypC/HypD complex to the precursor of the large hydrogenase (HycE) subunit. PubMed:12441107.
Probab=20.03  E-value=2.5e+02  Score=19.97  Aligned_cols=41  Identities=20%  Similarity=0.370  Sum_probs=26.1

Q ss_pred             EEEEEEEeCCCCcEEEEEEecCCCCccceeecCCCCCCCCCEEEEE
Q 019504          106 FEGKLVGADRAKDLAVLKIEASEDLLKPINVGQSSFLKVGQQCLAI  151 (340)
Q Consensus       106 ~~a~v~~~d~~~DlAlL~v~~~~~~~~~~~l~~~~~~~~G~~v~~i  151 (340)
                      ++++++..+.  +.|++.+....   ..+.+.--.++++||+|++-
T Consensus         5 iP~~V~~i~~--~~A~v~~~G~~---~~v~l~lv~~~~vGD~VLVH   45 (76)
T TIGR00074         5 IPGQVVEIDE--NIALVEFCGIK---RDVSLDLVGEVKVGDYVLVH   45 (76)
T ss_pred             cceEEEEEcC--CEEEEEcCCeE---EEEEEEeeCCCCCCCEEEEe
Confidence            5677777765  46888776432   23333222467899999874


No 146
>smart00116 CBS Domain in cystathionine beta-synthase and other proteins. Domain present in all 3 forms of cellular life. Present in two copies in inosine monophosphate dehydrogenase, of which one is disordered in the crystal structure [3]. A number of disease states are associated with CBS-containing proteins including homocystinuria, Becker's and Thomsen disease.
Probab=20.02  E-value=97  Score=17.93  Aligned_cols=20  Identities=40%  Similarity=0.664  Sum_probs=15.1

Q ss_pred             CCccceeecCCCcEEEEEee
Q 019504          194 GNSGGPLLDSKGNLIGINTA  213 (340)
Q Consensus       194 G~SGGPl~d~~G~VVGi~~~  213 (340)
                      +.+.-|+++.+++++|+.+.
T Consensus        22 ~~~~~~v~~~~~~~~g~i~~   41 (49)
T smart00116       22 GIRRLPVVDEEGRLVGIVTR   41 (49)
T ss_pred             CCCcccEECCCCeEEEEEEH
Confidence            44566888888999998763


Done!