Query         psy18070
Match_columns 169
No_of_seqs    192 out of 1639
Neff          7.0 
Searched_HMMs 46136
Date          Fri Aug 16 21:52:17 2013
Command       hhsearch -i /work/01045/syshi/Psyhhblits/psy18070.a3m -d /work/01045/syshi/HHdatabase/Cdd.hhm -o /work/01045/syshi/hhsearch_cdd/18070hhsearch_cdd -cpu 12 -v 0 

 No Hit                             Prob E-value P-value  Score    SS Cols Query HMM  Template HMM
  1 PRK10139 serine endoprotease;  100.0 3.6E-31 7.7E-36  233.5  15.7  140    1-156   202-348 (455)
  2 TIGR02038 protease_degS peripl 100.0 8.4E-30 1.8E-34  218.3  16.4  140    1-156   188-336 (351)
  3 PRK10898 serine endoprotease;  100.0 1.2E-29 2.6E-34  217.5  16.8  140    1-156   188-337 (353)
  4 PRK10942 serine endoprotease;  100.0 1.9E-29 4.2E-34  223.4  15.1  140    1-156   223-369 (473)
  5 TIGR02037 degP_htrA_DO peripla 100.0 8.3E-29 1.8E-33  216.6  16.6  140    1-156   169-315 (428)
  6 COG0265 DegQ Trypsin-like seri  99.9   2E-23 4.4E-28  177.9  14.1  138    1-156   184-328 (347)
  7 KOG1320|consensus               99.7 2.8E-16 6.1E-21  138.3   9.4  150    1-156   294-456 (473)
  8 cd00987 PDZ_serine_protease PD  99.2 1.2E-10 2.6E-15   80.1   8.8   80   65-156     1-82  (90)
  9 PF13180 PDZ_2:  PDZ domain; PD  99.2   7E-11 1.5E-15   80.9   7.5   76   65-162     1-78  (82)
 10 KOG1421|consensus               99.1 4.7E-10   1E-14  102.0  11.7  143    1-156   206-359 (955)
 11 cd00990 PDZ_glycyl_aminopeptid  98.9 1.3E-08 2.8E-13   68.7   8.2   67   65-156     1-67  (80)
 12 TIGR02037 degP_htrA_DO peripla  98.8 3.4E-08 7.4E-13   86.7   8.9   83   64-157   337-421 (428)
 13 cd00991 PDZ_archaeal_metallopr  98.6 2.2E-07 4.8E-12   63.2   8.1   61   92-158     8-70  (79)
 14 TIGR01713 typeII_sec_gspC gene  98.6 2.6E-07 5.6E-12   76.6   9.3   92   43-159   159-252 (259)
 15 KOG1320|consensus               98.6 5.2E-08 1.1E-12   86.4   5.1  116    1-132   200-318 (473)
 16 cd00136 PDZ PDZ domain, also c  98.5   4E-07 8.8E-12   59.7   7.0   36   94-135    13-48  (70)
 17 smart00228 PDZ Domain present   98.5 9.2E-07   2E-11   59.5   7.8   71   65-156    12-84  (85)
 18 cd00992 PDZ_signaling PDZ doma  98.4 1.1E-06 2.3E-11   59.2   7.4   49   64-133    11-59  (82)
 19 cd00988 PDZ_CTP_protease PDZ d  98.4 1.5E-06 3.3E-11   59.0   7.9   57   94-156    13-72  (85)
 20 cd00989 PDZ_metalloprotease PD  98.4 1.5E-06 3.2E-11   58.2   7.6   57   94-156    12-69  (79)
 21 cd00986 PDZ_LON_protease PDZ d  98.4 2.8E-06   6E-11   57.4   7.8   56   94-156     8-65  (79)
 22 PF00595 PDZ:  PDZ domain (Also  98.3 1.5E-06 3.2E-11   58.9   6.2   71   63-153     8-80  (81)
 23 PRK10942 serine endoprotease;   98.3 3.8E-06 8.1E-11   75.1   8.2   57   94-156   408-464 (473)
 24 PRK10139 serine endoprotease;   98.0 1.9E-05 4.1E-10   70.3   6.8   57   94-156   390-446 (455)
 25 TIGR00225 prc C-terminal pepti  97.9 5.8E-05 1.3E-09   64.4   8.5   57   94-156    62-121 (334)
 26 TIGR00054 RIP metalloprotease   97.8   5E-05 1.1E-09   66.9   6.6   57   94-156   203-260 (420)
 27 TIGR00054 RIP metalloprotease   97.8 3.6E-05 7.8E-10   67.8   5.4   57   93-155   127-183 (420)
 28 PRK10779 zinc metallopeptidase  97.7 2.9E-05 6.2E-10   68.9   3.4   55   96-156   128-184 (449)
 29 PRK10779 zinc metallopeptidase  97.7  0.0001 2.3E-09   65.4   6.7   57   94-156   221-278 (449)
 30 PLN00049 carboxyl-terminal pro  97.6 0.00013 2.8E-09   63.7   6.7   35   94-134   102-136 (389)
 31 COG0793 Prc Periplasmic protea  97.5 0.00044 9.6E-09   60.8   7.9   57   94-156   112-171 (406)
 32 PF12812 PDZ_1:  PDZ-like domai  97.5 0.00052 1.1E-08   47.0   6.4   64   64-141     8-71  (78)
 33 KOG3553|consensus               97.5 9.9E-05 2.1E-09   53.0   2.8   32   94-131    59-90  (124)
 34 COG3975 Predicted protease wit  97.2  0.0007 1.5E-08   60.9   5.6   54   67-130   439-492 (558)
 35 PF04495 GRASP55_65:  GRASP55/6  97.2 0.00057 1.2E-08   51.7   4.3   73   65-155    26-100 (138)
 36 TIGR02860 spore_IV_B stage IV   97.1  0.0023 4.9E-08   56.3   7.9   58   93-156   104-170 (402)
 37 PRK11186 carboxy-terminal prot  97.1  0.0017 3.8E-08   60.4   7.2   29   94-128   255-284 (667)
 38 TIGR03279 cyano_FeS_chp putati  97.0 0.00091   2E-08   59.2   4.5   50   97-154     1-50  (433)
 39 PF10459 Peptidase_S46:  Peptid  96.7  0.0027 5.9E-08   59.4   5.4   50    3-52    625-686 (698)
 40 KOG3129|consensus               96.6  0.0045 9.7E-08   49.9   5.4   61   95-161   140-204 (231)
 41 PF14685 Tricorn_PDZ:  Tricorn   96.1   0.044 9.6E-07   38.3   7.6   57   94-156    12-79  (88)
 42 KOG3532|consensus               96.1   0.017 3.8E-07   53.7   6.6   59   94-158   398-456 (1051)
 43 PF00949 Peptidase_S7:  Peptida  96.0  0.0075 1.6E-07   45.3   3.4   29    4-32     90-118 (132)
 44 PF00944 Peptidase_S3:  Alphavi  95.4   0.022 4.8E-07   43.1   3.9   36    6-41    101-136 (158)
 45 KOG1421|consensus               95.2    0.27 5.8E-06   46.2  10.8  115    9-137   677-805 (955)
 46 PF05579 Peptidase_S32:  Equine  94.9   0.034 7.3E-07   46.5   3.8   33    8-40    205-237 (297)
 47 PRK09681 putative type II secr  94.8   0.052 1.1E-06   45.6   5.0   60   94-156   204-265 (276)
 48 KOG3571|consensus               94.4    0.12 2.5E-06   46.8   6.2   59   92-156   275-339 (626)
 49 KOG3580|consensus               93.8   0.082 1.8E-06   48.9   4.2   36   94-135   429-464 (1027)
 50 KOG2921|consensus               93.0   0.099 2.2E-06   46.0   3.4   41   92-137   218-258 (484)
 51 KOG3605|consensus               92.3    0.26 5.7E-06   45.8   5.2   82   38-136   707-792 (829)
 52 KOG3550|consensus               92.0    0.31 6.8E-06   37.7   4.6   36   94-135   115-151 (207)
 53 KOG3580|consensus               91.7    0.37   8E-06   44.7   5.4   64   91-161    37-101 (1027)
 54 KOG3209|consensus               91.6    0.23 4.9E-06   46.7   4.0   51   98-154   782-835 (984)
 55 COG3591 V8-like Glu-specific e  90.9    0.24 5.2E-06   41.1   3.2   32    2-33    194-225 (251)
 56 KOG3542|consensus               90.6     0.2 4.4E-06   47.0   2.7   57   93-155   561-618 (1283)
 57 KOG3834|consensus               90.1    0.39 8.5E-06   42.6   4.0   63   96-164   111-175 (462)
 58 COG3480 SdrC Predicted secrete  89.8    0.95   2E-05   38.9   5.9   55   94-155   130-186 (342)
 59 PF02907 Peptidase_S29:  Hepati  88.5    0.49 1.1E-05   35.8   3.0   32    9-40    106-138 (148)
 60 PF00947 Pico_P2A:  Picornaviru  88.1    0.95 2.1E-05   33.8   4.3   35    4-39     83-117 (127)
 61 KOG3209|consensus               84.8     2.1 4.5E-05   40.6   5.5   59   92-156   921-981 (984)
 62 KOG0606|consensus               83.4    0.95 2.1E-05   44.6   2.8   34   96-135   660-693 (1205)
 63 KOG3651|consensus               81.0     2.1 4.5E-05   36.8   3.7   42   94-141    30-72  (429)
 64 COG3031 PulC Type II secretory  80.8     2.2 4.8E-05   35.3   3.7   56   95-156   208-265 (275)
 65 PF00863 Peptidase_C4:  Peptida  79.5     3.4 7.3E-05   34.0   4.4   40    4-43    144-185 (235)
 66 KOG3552|consensus               76.0       4 8.7E-05   39.9   4.4   55   94-155    75-131 (1298)
 67 KOG1892|consensus               75.7     3.1 6.8E-05   40.8   3.6   38   91-134   957-995 (1629)
 68 PF08192 Peptidase_S64:  Peptid  74.1     5.1 0.00011   37.6   4.5   44    8-51    636-687 (695)
 69 COG0750 Predicted membrane-ass  73.8     7.1 0.00015   33.3   5.1   50  100-155   135-188 (375)
 70 KOG3627|consensus               72.9     2.9 6.3E-05   33.3   2.4   26    8-33    201-229 (256)
 71 PF11874 DUF3394:  Domain of un  70.9     5.7 0.00012   31.5   3.5   28   94-127   122-149 (183)
 72 PF00571 CBS:  CBS domain CBS d  67.1     7.4 0.00016   23.4   2.8   20   11-30     29-48  (57)
 73 KOG3606|consensus               65.2     5.6 0.00012   33.7   2.5   38   92-135   192-230 (358)
 74 KOG0609|consensus               62.6      18 0.00039   33.2   5.4   35   95-135   147-182 (542)
 75 PF02743 Cache_1:  Cache domain  56.6      25 0.00053   23.0   4.1   33   15-55     19-51  (81)
 76 cd00218 GlcAT-I Beta1,3-glucur  54.7      13 0.00028   30.4   2.9   31   14-45    136-172 (223)
 77 KOG3605|consensus               53.1     9.8 0.00021   35.9   2.2   52   98-155   677-733 (829)
 78 PF02122 Peptidase_S39:  Peptid  46.3      22 0.00048   28.5   3.0   40    3-43    139-182 (203)
 79 KOG3834|consensus               46.0      30 0.00066   31.0   4.0   58   91-155    12-72  (462)
 80 KOG3549|consensus               40.8      39 0.00085   29.8   3.8   53   95-153    81-136 (505)
 81 COG0260 PepB Leucyl aminopepti  37.5      42 0.00091   30.6   3.7   31   98-135   302-334 (485)
 82 cd04582 CBS_pair_ABC_OpuCA_ass  36.7      33 0.00071   22.7   2.3   22   10-31     80-101 (106)
 83 cd04596 CBS_pair_DRTGG_assoc T  35.9      33 0.00072   22.9   2.3   21   10-30     82-102 (108)
 84 KOG1728|consensus               35.6      13 0.00029   28.2   0.2   31  104-141   111-141 (156)
 85 PF08669 GCV_T_C:  Glycine clea  35.2      53  0.0012   22.2   3.2   31   10-40     32-67  (95)
 86 KOG1476|consensus               35.1      27 0.00059   30.1   2.0   32   15-47    223-260 (330)
 87 PF10049 DUF2283:  Protein of u  34.8      43 0.00093   20.4   2.4   13   18-30     35-47  (50)
 88 cd04618 CBS_pair_5 The CBS dom  33.5      31 0.00068   23.3   1.8   21   11-31     72-93  (98)
 89 cd04606 CBS_pair_Mg_transporte  33.3      39 0.00085   22.6   2.3   21   10-30     82-102 (109)
 90 smart00116 CBS Domain in cysta  33.2      42 0.00092   18.0   2.1   20   11-30     22-41  (49)
 91 cd04610 CBS_pair_ParBc_assoc T  32.5      41 0.00088   22.2   2.3   18   13-30     84-101 (107)
 92 PRK09570 rpoH DNA-directed RNA  31.6      29 0.00063   23.7   1.3   16  104-125    43-59  (79)
 93 cd04592 CBS_pair_EriC_assoc_eu  30.7      51  0.0011   23.8   2.6   22   10-31     22-43  (133)
 94 cd04641 CBS_pair_28 The CBS do  30.2      53  0.0011   22.4   2.6   22    9-30     21-42  (120)
 95 cd00433 Peptidase_M17 Cytosol   29.8      69  0.0015   29.0   3.8   28  101-135   292-321 (468)
 96 TIGR00612 ispG_gcpE 1-hydroxy-  29.2      60  0.0013   28.2   3.1   37   42-84    107-143 (346)
 97 PRK00913 multifunctional amino  28.8      76  0.0016   28.9   3.9   28  101-135   306-335 (483)
 98 TIGR02913 HAF_rpt probable ext  28.4      62  0.0013   18.9   2.2   12   18-29      4-15  (39)
 99 cd04614 CBS_pair_1 The CBS dom  28.3      66  0.0014   21.4   2.7   47   10-56     22-71  (96)
100 PF00883 Peptidase_M17:  Cytoso  28.3      61  0.0013   27.8   3.0   29  100-135   136-166 (311)
101 PF03761 DUF316:  Domain of unk  28.3      61  0.0013   26.4   3.0   28    4-31    224-254 (282)
102 cd00190 Tryp_SPc Trypsin-like   28.2      30 0.00065   26.3   1.1   17    6-22    179-195 (232)
103 PF08275 Toprim_N:  DNA primase  27.6      47   0.001   24.3   2.0   17   16-32     82-98  (128)
104 KOG3551|consensus               27.3      60  0.0013   29.0   2.8   32   94-131   110-142 (506)
105 PRK05015 aminopeptidase B; Pro  25.2      96  0.0021   27.8   3.8   30   99-135   241-272 (424)
106 cd04603 CBS_pair_KefB_assoc Th  23.4      91   0.002   21.0   2.7   19   11-29     23-41  (111)
107 cd04801 CBS_pair_M50_like This  23.1      77  0.0017   21.2   2.3   20   12-31     25-44  (114)
108 PRK03760 hypothetical protein;  22.9      98  0.0021   22.5   2.9   25   94-125    89-113 (117)
109 cd04643 CBS_pair_30 The CBS do  22.8      73  0.0016   21.2   2.2   20   12-31     24-43  (116)
110 COG1792 MreC Cell shape-determ  22.6 3.3E+02  0.0071   22.8   6.4   41    2-44    134-175 (284)
111 PF12120 Arr-ms:  Rifampin ADP-  22.6      43 0.00094   23.8   0.9   30  113-142     5-45  (100)
112 COG4043 Preprotein translocase  22.3      52  0.0011   23.7   1.3   27  115-141    31-65  (111)
113 PF01191 RNA_pol_Rpb5_C:  RNA p  22.0      44 0.00094   22.6   0.8   18  103-126    39-57  (74)
114 COG5428 Uncharacterized conser  21.7      64  0.0014   21.5   1.5   14   18-31     36-49  (69)
115 cd04594 CBS_pair_EriC_assoc_ar  21.3      82  0.0018   20.9   2.1   21    9-30     78-98  (104)
116 cd04459 Rho_CSD Rho_CSD: Rho p  21.1      60  0.0013   21.4   1.3   12  115-126    38-49  (68)
117 cd04617 CBS_pair_4 The CBS dom  21.0   1E+02  0.0023   20.8   2.7   21   10-30     22-42  (118)
118 cd04619 CBS_pair_6 The CBS dom  20.8 1.1E+02  0.0024   20.6   2.8   22   10-31     22-43  (114)
119 cd04623 CBS_pair_10 The CBS do  20.6      95  0.0021   20.4   2.3   21   11-31     23-43  (113)
120 cd04621 CBS_pair_8 The CBS dom  20.0 1.1E+02  0.0023   21.8   2.6   21   11-31     23-43  (135)

No 1  
>PRK10139 serine endoprotease; Provisional
Probab=99.97  E-value=3.6e-31  Score=233.52  Aligned_cols=140  Identities=26%  Similarity=0.323  Sum_probs=122.9

Q ss_pred             CEeeccccCCCCccceEEcCCccEEEEEeeecc-----CCeEEEEehhhHHHHHHhhhhcCCccceeecceecEEEEECC
Q psy18070          1 MSLTGIMVKFGNSGGPLVNLDGEVIGINSMKVT-----AGISFAIPIDYAIEFLTNYKRKDIDRTITHKKYIGITMLTLN   75 (169)
Q Consensus         1 ~iq~da~in~GnSGGplvn~~G~vvGi~~~~~~-----~~~~faiP~~~i~~~l~~l~~~g~~~~~~~~~~lGi~~~~l~   75 (169)
                      +|||||+||||||||||+|.+||||||+++..+     +|++||||++.+++++++|+++|    ++.++|||+++++++
T Consensus       202 ~iqtda~in~GnSGGpl~n~~G~vIGi~~~~~~~~~~~~gigfaIP~~~~~~v~~~l~~~g----~v~r~~LGv~~~~l~  277 (455)
T PRK10139        202 FIQTDASINRGNSGGALLNLNGELIGINTAILAPGGGSVGIGFAIPSNMARTLAQQLIDFG----EIKRGLLGIKGTEMS  277 (455)
T ss_pred             EEEECCccCCCCCcceEECCCCeEEEEEEEEEcCCCCccceEEEEEhHHHHHHHHHHhhcC----cccccceeEEEEECC
Confidence            589999999999999999999999999999764     57999999999999999999999    899999999999999


Q ss_pred             HHHHHHhhcccCCCCCCCCcEEEEEEccCCcccccccccccCCCCCcEEEecCceeecchhhhhhh--hcCCCCeeEEEE
Q psy18070         76 EKLIEQLRRDRHIPYDLTHGVLIWRVMYNSPAYFIKFRTSAGIKPTSSRLLGECLAQYTTSKLVVW--SINHPSITCHIL  153 (169)
Q Consensus        76 ~~~~~~~~~~~~~~~~~~~Gv~V~~V~~~spA~~~~~~~~aGL~~GDvI~~~~~v~~~~~~~~~~~--~~~~~~~~~~~i  153 (169)
                      +++++.++++      ...|++|.+|.++|||+      ++|||+||+|+++++.++.+..++...  .......+.+.+
T Consensus       278 ~~~~~~lgl~------~~~Gv~V~~V~~~SpA~------~AGL~~GDvIl~InG~~V~s~~dl~~~l~~~~~g~~v~l~V  345 (455)
T PRK10139        278 ADIAKAFNLD------VQRGAFVSEVLPNSGSA------KAGVKAGDIITSLNGKPLNSFAELRSRIATTEPGTKVKLGL  345 (455)
T ss_pred             HHHHHhcCCC------CCCceEEEEECCCChHH------HCCCCCCCEEEEECCEECCCHHHHHHHHHhcCCCCEEEEEE
Confidence            9999999884      23799999999999999      999999999999988887776655433  223445668888


Q ss_pred             EEe
Q psy18070        154 LRL  156 (169)
Q Consensus       154 ~r~  156 (169)
                      .|.
T Consensus       346 ~R~  348 (455)
T PRK10139        346 LRN  348 (455)
T ss_pred             EEC
Confidence            885


No 2  
>TIGR02038 protease_degS periplasmic serine peptidase DegS. This family consists of the periplasmic serine protease DegS (HhoB), a shorter paralog of protease DO (HtrA, DegP) and DegQ (HhoA). It is found in E. coli and several other Proteobacteria of the gamma subdivision. It contains a trypsin domain and a single copy of the PDZ domain (in contrast to DegP with two copies). A critical role of DegS is to sense stress in the periplasm and partially degrade an inhibitor of sigma(E).
Probab=99.97  E-value=8.4e-30  Score=218.26  Aligned_cols=140  Identities=29%  Similarity=0.378  Sum_probs=121.9

Q ss_pred             CEeeccccCCCCccceEEcCCccEEEEEeeecc-------CCeEEEEehhhHHHHHHhhhhcCCccceeecceecEEEEE
Q psy18070          1 MSLTGIMVKFGNSGGPLVNLDGEVIGINSMKVT-------AGISFAIPIDYAIEFLTNYKRKDIDRTITHKKYIGITMLT   73 (169)
Q Consensus         1 ~iq~da~in~GnSGGplvn~~G~vvGi~~~~~~-------~~~~faiP~~~i~~~l~~l~~~g~~~~~~~~~~lGi~~~~   73 (169)
                      ++||||++|||||||||+|.+||||||+++.+.       ++++||||++.+++++++++++|    ++.++|||+++++
T Consensus       188 ~iqtda~i~~GnSGGpl~n~~G~vIGI~~~~~~~~~~~~~~g~~faIP~~~~~~vl~~l~~~g----~~~r~~lGv~~~~  263 (351)
T TIGR02038       188 FIQTDAAINAGNSGGALINTNGELVGINTASFQKGGDEGGEGINFAIPIKLAHKIMGKIIRDG----RVIRGYIGVSGED  263 (351)
T ss_pred             EEEECCccCCCCCcceEECCCCeEEEEEeeeecccCCCCccceEEEecHHHHHHHHHHHhhcC----cccceEeeeEEEE
Confidence            589999999999999999999999999997652       57999999999999999999999    8899999999999


Q ss_pred             CCHHHHHHhhcccCCCCCCCCcEEEEEEccCCcccccccccccCCCCCcEEEecCceeecchhhhh--hhhcCCCCeeEE
Q psy18070         74 LNEKLIEQLRRDRHIPYDLTHGVLIWRVMYNSPAYFIKFRTSAGIKPTSSRLLGECLAQYTTSKLV--VWSINHPSITCH  151 (169)
Q Consensus        74 l~~~~~~~~~~~~~~~~~~~~Gv~V~~V~~~spA~~~~~~~~aGL~~GDvI~~~~~v~~~~~~~~~--~~~~~~~~~~~~  151 (169)
                      +++..++.++++      ...|++|.+|.++|||+      ++||++||+|+++++.++.+..++.  +...+....+.+
T Consensus       264 ~~~~~~~~lgl~------~~~Gv~V~~V~~~spA~------~aGL~~GDvI~~Ing~~V~s~~dl~~~l~~~~~g~~v~l  331 (351)
T TIGR02038       264 INSVVAQGLGLP------DLRGIVITGVDPNGPAA------RAGILVRDVILKYDGKDVIGAEELMDRIAETRPGSKVMV  331 (351)
T ss_pred             CCHHHHHhcCCC------ccccceEeecCCCChHH------HCCCCCCCEEEEECCEEcCCHHHHHHHHHhcCCCCEEEE
Confidence            999999998884      23799999999999999      9999999999999988887765543  333344556788


Q ss_pred             EEEEe
Q psy18070        152 ILLRL  156 (169)
Q Consensus       152 ~i~r~  156 (169)
                      .++|+
T Consensus       332 ~v~R~  336 (351)
T TIGR02038       332 TVLRQ  336 (351)
T ss_pred             EEEEC
Confidence            88885


No 3  
>PRK10898 serine endoprotease; Provisional
Probab=99.97  E-value=1.2e-29  Score=217.49  Aligned_cols=140  Identities=27%  Similarity=0.352  Sum_probs=120.2

Q ss_pred             CEeeccccCCCCccceEEcCCccEEEEEeeecc--------CCeEEEEehhhHHHHHHhhhhcCCccceeecceecEEEE
Q psy18070          1 MSLTGIMVKFGNSGGPLVNLDGEVIGINSMKVT--------AGISFAIPIDYAIEFLTNYKRKDIDRTITHKKYIGITML   72 (169)
Q Consensus         1 ~iq~da~in~GnSGGplvn~~G~vvGi~~~~~~--------~~~~faiP~~~i~~~l~~l~~~g~~~~~~~~~~lGi~~~   72 (169)
                      +|||||++|||||||||+|.+||||||+++.+.        ++++||||++.+++++++|+++|    ++.++|||+.++
T Consensus       188 ~iqtda~i~~GnSGGPl~n~~G~vvGI~~~~~~~~~~~~~~~g~~faIP~~~~~~~~~~l~~~G----~~~~~~lGi~~~  263 (353)
T PRK10898        188 FLQTDASINHGNSGGALVNSLGELMGINTLSFDKSNDGETPEGIGFAIPTQLATKIMDKLIRDG----RVIRGYIGIGGR  263 (353)
T ss_pred             eEEeccccCCCCCcceEECCCCeEEEEEEEEecccCCCCcccceEEEEchHHHHHHHHHHhhcC----cccccccceEEE
Confidence            589999999999999999999999999998753        47899999999999999999999    889999999999


Q ss_pred             ECCHHHHHHhhcccCCCCCCCCcEEEEEEccCCcccccccccccCCCCCcEEEecCceeecchhhh--hhhhcCCCCeeE
Q psy18070         73 TLNEKLIEQLRRDRHIPYDLTHGVLIWRVMYNSPAYFIKFRTSAGIKPTSSRLLGECLAQYTTSKL--VVWSINHPSITC  150 (169)
Q Consensus        73 ~l~~~~~~~~~~~~~~~~~~~~Gv~V~~V~~~spA~~~~~~~~aGL~~GDvI~~~~~v~~~~~~~~--~~~~~~~~~~~~  150 (169)
                      ++++...+.++++      ...|++|.+|.++|||+      ++||++||+|+++++.++.+..++  .+........+.
T Consensus       264 ~~~~~~~~~~~~~------~~~Gv~V~~V~~~spA~------~aGL~~GDvI~~Ing~~V~s~~~l~~~l~~~~~g~~v~  331 (353)
T PRK10898        264 EIAPLHAQGGGID------QLQGIVVNEVSPDGPAA------KAGIQVNDLIISVNNKPAISALETMDQVAEIRPGSVIP  331 (353)
T ss_pred             ECCHHHHHhcCCC------CCCeEEEEEECCCChHH------HcCCCCCCEEEEECCEEcCCHHHHHHHHHhcCCCCEEE
Confidence            9998877776653      23799999999999999      999999999999999988765443  333334455668


Q ss_pred             EEEEEe
Q psy18070        151 HILLRL  156 (169)
Q Consensus       151 ~~i~r~  156 (169)
                      +.++|.
T Consensus       332 l~v~R~  337 (353)
T PRK10898        332 VVVMRD  337 (353)
T ss_pred             EEEEEC
Confidence            888885


No 4  
>PRK10942 serine endoprotease; Provisional
Probab=99.96  E-value=1.9e-29  Score=223.45  Aligned_cols=140  Identities=31%  Similarity=0.361  Sum_probs=122.6

Q ss_pred             CEeeccccCCCCccceEEcCCccEEEEEeeecc-----CCeEEEEehhhHHHHHHhhhhcCCccceeecceecEEEEECC
Q psy18070          1 MSLTGIMVKFGNSGGPLVNLDGEVIGINSMKVT-----AGISFAIPIDYAIEFLTNYKRKDIDRTITHKKYIGITMLTLN   75 (169)
Q Consensus         1 ~iq~da~in~GnSGGplvn~~G~vvGi~~~~~~-----~~~~faiP~~~i~~~l~~l~~~g~~~~~~~~~~lGi~~~~l~   75 (169)
                      ||||||++|||||||||+|.+||||||+++..+     .+++||||++.+++++++|++++    .+.++|+|+.+++++
T Consensus       223 ~iqtda~i~~GnSGGpL~n~~GeviGI~t~~~~~~g~~~g~gfaIP~~~~~~v~~~l~~~g----~v~rg~lGv~~~~l~  298 (473)
T PRK10942        223 FIQTDAAINRGNSGGALVNLNGELIGINTAILAPDGGNIGIGFAIPSNMVKNLTSQMVEYG----QVKRGELGIMGTELN  298 (473)
T ss_pred             eEEeccccCCCCCcCccCCCCCeEEEEEEEEEcCCCCcccEEEEEEHHHHHHHHHHHHhcc----ccccceeeeEeeecC
Confidence            589999999999999999999999999998764     46999999999999999999999    899999999999999


Q ss_pred             HHHHHHhhcccCCCCCCCCcEEEEEEccCCcccccccccccCCCCCcEEEecCceeecchhhhh--hhhcCCCCeeEEEE
Q psy18070         76 EKLIEQLRRDRHIPYDLTHGVLIWRVMYNSPAYFIKFRTSAGIKPTSSRLLGECLAQYTTSKLV--VWSINHPSITCHIL  153 (169)
Q Consensus        76 ~~~~~~~~~~~~~~~~~~~Gv~V~~V~~~spA~~~~~~~~aGL~~GDvI~~~~~v~~~~~~~~~--~~~~~~~~~~~~~i  153 (169)
                      +++++.++++.      ..|++|.+|.++|||+      ++||++||+|+++++.++.+..++.  +........+.+.+
T Consensus       299 ~~~a~~~~l~~------~~GvlV~~V~~~SpA~------~AGL~~GDvIl~InG~~V~s~~dl~~~l~~~~~g~~v~l~v  366 (473)
T PRK10942        299 SELAKAMKVDA------QRGAFVSQVLPNSSAA------KAGIKAGDVITSLNGKPISSFAALRAQVGTMPVGSKLTLGL  366 (473)
T ss_pred             HHHHHhcCCCC------CCceEEEEECCCChHH------HcCCCCCCEEEEECCEECCCHHHHHHHHHhcCCCCEEEEEE
Confidence            99999998852      3799999999999999      9999999999999888877765544  33334455668888


Q ss_pred             EEe
Q psy18070        154 LRL  156 (169)
Q Consensus       154 ~r~  156 (169)
                      +|+
T Consensus       367 ~R~  369 (473)
T PRK10942        367 LRD  369 (473)
T ss_pred             EEC
Confidence            774


No 5  
>TIGR02037 degP_htrA_DO periplasmic serine protease, Do/DegQ family. This family consists of a set of proteins variously designated DegP, heat shock protein HtrA, and protease DO. The ortholog in Pseudomonas aeruginosa is designated MucD and is found in an operon that controls mucoid phenotype. This family also includes the DegQ (HhoA) paralog in E. coli, which can rescue a DegP mutant, but not the smaller DegS paralog, which cannot. Members of this family are located in the periplasm and have separable functions as both protease and chaperone. Members have a trypsin domain and two copies of a PDZ domain. This protein protects bacteria from thermal and other stresses and may be important for the survival of bacterial pathogens. The chaperone function is dominant at low temperatures, whereas the proteolytic activity is turned on at elevated temperatures.
Probab=99.96  E-value=8.3e-29  Score=216.60  Aligned_cols=140  Identities=30%  Similarity=0.418  Sum_probs=123.1

Q ss_pred             CEeeccccCCCCccceEEcCCccEEEEEeeecc-----CCeEEEEehhhHHHHHHhhhhcCCccceeecceecEEEEECC
Q psy18070          1 MSLTGIMVKFGNSGGPLVNLDGEVIGINSMKVT-----AGISFAIPIDYAIEFLTNYKRKDIDRTITHKKYIGITMLTLN   75 (169)
Q Consensus         1 ~iq~da~in~GnSGGplvn~~G~vvGi~~~~~~-----~~~~faiP~~~i~~~l~~l~~~g~~~~~~~~~~lGi~~~~l~   75 (169)
                      ++||||++|||||||||+|.+||||||+++..+     .+++||||++.+++++++|++++    .+.++|||+++++++
T Consensus       169 ~i~tda~i~~GnSGGpl~n~~G~viGI~~~~~~~~g~~~g~~faiP~~~~~~~~~~l~~~g----~~~~~~lGi~~~~~~  244 (428)
T TIGR02037       169 FIQTDAAINPGNSGGPLVNLRGEVIGINTAIYSPSGGNVGIGFAIPSNMAKNVVDQLIEGG----KVQRGWLGVTIQEVT  244 (428)
T ss_pred             eEEECCCCCCCCCCCceECCCCeEEEEEeEEEcCCCCccceEEEEEhHHHHHHHHHHHhcC----cCcCCcCceEeecCC
Confidence            589999999999999999999999999998764     57899999999999999999999    889999999999999


Q ss_pred             HHHHHHhhcccCCCCCCCCcEEEEEEccCCcccccccccccCCCCCcEEEecCceeecchhhhh--hhhcCCCCeeEEEE
Q psy18070         76 EKLIEQLRRDRHIPYDLTHGVLIWRVMYNSPAYFIKFRTSAGIKPTSSRLLGECLAQYTTSKLV--VWSINHPSITCHIL  153 (169)
Q Consensus        76 ~~~~~~~~~~~~~~~~~~~Gv~V~~V~~~spA~~~~~~~~aGL~~GDvI~~~~~v~~~~~~~~~--~~~~~~~~~~~~~i  153 (169)
                      +++++.++++      ...|++|.+|.++|||+      ++||++||+|+++++.++.+..++.  +........+.+.+
T Consensus       245 ~~~~~~lgl~------~~~Gv~V~~V~~~spA~------~aGL~~GDvI~~Vng~~i~~~~~~~~~l~~~~~g~~v~l~v  312 (428)
T TIGR02037       245 SDLAKSLGLE------KQRGALVAQVLPGSPAE------KAGLKAGDVILSVNGKPISSFADLRRAIGTLKPGKKVTLGI  312 (428)
T ss_pred             HHHHHHcCCC------CCCceEEEEccCCCChH------HcCCCCCCEEEEECCEEcCCHHHHHHHHHhcCCCCEEEEEE
Confidence            9999999984      23799999999999999      9999999999999998887665544  33334455678888


Q ss_pred             EEe
Q psy18070        154 LRL  156 (169)
Q Consensus       154 ~r~  156 (169)
                      +|+
T Consensus       313 ~R~  315 (428)
T TIGR02037       313 LRK  315 (428)
T ss_pred             EEC
Confidence            885


No 6  
>COG0265 DegQ Trypsin-like serine proteases, typically periplasmic, contain C-terminal PDZ domain [Posttranslational modification, protein turnover, chaperones]
Probab=99.90  E-value=2e-23  Score=177.92  Aligned_cols=138  Identities=32%  Similarity=0.421  Sum_probs=119.8

Q ss_pred             CEeeccccCCCCccceEEcCCccEEEEEeeecc-----CCeEEEEehhhHHHHHHhhhhcCCccceeecceecEEEEECC
Q psy18070          1 MSLTGIMVKFGNSGGPLVNLDGEVIGINSMKVT-----AGISFAIPIDYAIEFLTNYKRKDIDRTITHKKYIGITMLTLN   75 (169)
Q Consensus         1 ~iq~da~in~GnSGGplvn~~G~vvGi~~~~~~-----~~~~faiP~~~i~~~l~~l~~~g~~~~~~~~~~lGi~~~~l~   75 (169)
                      +|||||++||||||||++|.+|++|||+++...     ++++||||++.++.++.++...|    ++.++|+|+.+.+++
T Consensus       184 ~IqtdAain~gnsGgpl~n~~g~~iGint~~~~~~~~~~gigfaiP~~~~~~v~~~l~~~G----~v~~~~lgv~~~~~~  259 (347)
T COG0265         184 FIQTDAAINPGNSGGPLVNIDGEVVGINTAIIAPSGGSSGIGFAIPVNLVAPVLDELISKG----KVVRGYLGVIGEPLT  259 (347)
T ss_pred             hhhcccccCCCCCCCceEcCCCcEEEEEEEEecCCCCcceeEEEecHHHHHHHHHHHHHcC----CccccccceEEEEcc
Confidence            589999999999999999999999999999986     24899999999999999999988    899999999999998


Q ss_pred             HHHHHHhhcccCCCCCCCCcEEEEEEccCCcccccccccccCCCCCcEEEecCceeecchhhhhhhh--cCCCCeeEEEE
Q psy18070         76 EKLIEQLRRDRHIPYDLTHGVLIWRVMYNSPAYFIKFRTSAGIKPTSSRLLGECLAQYTTSKLVVWS--INHPSITCHIL  153 (169)
Q Consensus        76 ~~~~~~~~~~~~~~~~~~~Gv~V~~V~~~spA~~~~~~~~aGL~~GDvI~~~~~v~~~~~~~~~~~~--~~~~~~~~~~i  153 (169)
                      +...  +++      ....|++|.+|.++|||+      ++|+++||+|++.++..+.+..++....  ......+.+.+
T Consensus       260 ~~~~--~g~------~~~~G~~V~~v~~~spa~------~agi~~Gdii~~vng~~v~~~~~l~~~v~~~~~g~~v~~~~  325 (347)
T COG0265         260 ADIA--LGL------PVAAGAVVLGVLPGSPAA------KAGIKAGDIITAVNGKPVASLSDLVAAVASNRPGDEVALKL  325 (347)
T ss_pred             cccc--cCC------CCCCceEEEecCCCChHH------HcCCCCCCEEEEECCEEccCHHHHHHHHhccCCCCEEEEEE
Confidence            8776  554      244799999999999999      9999999999999888887776666443  33344668888


Q ss_pred             EEe
Q psy18070        154 LRL  156 (169)
Q Consensus       154 ~r~  156 (169)
                      +|.
T Consensus       326 ~r~  328 (347)
T COG0265         326 LRG  328 (347)
T ss_pred             EEC
Confidence            886


No 7  
>KOG1320|consensus
Probab=99.66  E-value=2.8e-16  Score=138.30  Aligned_cols=150  Identities=29%  Similarity=0.338  Sum_probs=113.1

Q ss_pred             CEeeccccCCCCccceEEcCCccEEEEEeeecc-----CCeEEEEehhhHHHHHHhhhhcCC-----ccceeecceecEE
Q psy18070          1 MSLTGIMVKFGNSGGPLVNLDGEVIGINSMKVT-----AGISFAIPIDYAIEFLTNYKRKDI-----DRTITHKKYIGIT   70 (169)
Q Consensus         1 ~iq~da~in~GnSGGplvn~~G~vvGi~~~~~~-----~~~~faiP~~~i~~~l~~l~~~g~-----~~~~~~~~~lGi~   70 (169)
                      ++|||+++|+||||||++|.+|++||+++++..     .+++|++|.+.+..++.+..+..+     -.....+.|+|+.
T Consensus       294 ~~qtd~ai~~~nsg~~ll~~DG~~IgVn~~~~~ri~~~~~iSf~~p~d~vl~~v~r~~e~~~~lr~~~~~~p~~~~~g~~  373 (473)
T KOG1320|consen  294 INQTDAAINPGNSGGPLLNLDGEVIGVNTRKVTRIGFSHGISFKIPIDTVLVIVLRLGEFQISLRPVKPLVPVHQYIGLP  373 (473)
T ss_pred             ecccchhhhcccCCCcEEEecCcEeeeeeeeeEEeeccccceeccCchHhhhhhhhhhhhceeeccccCcccccccCCce
Confidence            579999999999999999999999999999986     789999999999999988743321     1112235688888


Q ss_pred             EEECCHHHHHHhhcccCCCC-CCCCcEEEEEEccCCcccccccccccCCCCCcEEEecCceeecchhhhhhhh--cCCCC
Q psy18070         71 MLTLNEKLIEQLRRDRHIPY-DLTHGVLIWRVMYNSPAYFIKFRTSAGIKPTSSRLLGECLAQYTTSKLVVWS--INHPS  147 (169)
Q Consensus        71 ~~~l~~~~~~~~~~~~~~~~-~~~~Gv~V~~V~~~spA~~~~~~~~aGL~~GDvI~~~~~v~~~~~~~~~~~~--~~~~~  147 (169)
                      +-.++..+..+.--..+.+| ...+||+|++|.+++++.      ..++++||+|+++|++++.+..++.-..  ....+
T Consensus       374 s~~i~~g~vf~~~~~~~~~~~~~~q~v~is~Vlp~~~~~------~~~~~~g~~V~~vng~~V~n~~~l~~~i~~~~~~~  447 (473)
T KOG1320|consen  374 SYYIFAGLVFVPLTKSYIFPSGVVQLVLVSQVLPGSING------GYGLKPGDQVVKVNGKPVKNLKHLYELIEECSTED  447 (473)
T ss_pred             eEEEecceEEeecCCCccccccceeEEEEEEeccCCCcc------cccccCCCEEEEECCEEeechHHHHHHHHhcCcCc
Confidence            77776555544444444444 333699999999999999      9999999999999999888886666432  22233


Q ss_pred             eeEEEEEEe
Q psy18070        148 ITCHILLRL  156 (169)
Q Consensus       148 ~~~~~i~r~  156 (169)
                      .+....+|.
T Consensus       448 ~v~vl~~~~  456 (473)
T KOG1320|consen  448 KVAVLDRRS  456 (473)
T ss_pred             eEEEEEecC
Confidence            445555554


No 8  
>cd00987 PDZ_serine_protease PDZ domain of trypsin-like serine proteases, such as DegP/HtrA, which are oligomeric proteins involved in heat-shock response, chaperone function, and apoptosis. May be responsible for substrate recognition and/or binding, as most PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of protease-associated PDZ domains a C-terminal beta-strand forms the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins.
Probab=99.21  E-value=1.2e-10  Score=80.07  Aligned_cols=80  Identities=28%  Similarity=0.270  Sum_probs=62.2

Q ss_pred             ceecEEEEECCHHHHHHhhcccCCCCCCCCcEEEEEEccCCcccccccccccCCCCCcEEEecCceeecchhhhhhhh--
Q psy18070         65 KYIGITMLTLNEKLIEQLRRDRHIPYDLTHGVLIWRVMYNSPAYFIKFRTSAGIKPTSSRLLGECLAQYTTSKLVVWS--  142 (169)
Q Consensus        65 ~~lGi~~~~l~~~~~~~~~~~~~~~~~~~~Gv~V~~V~~~spA~~~~~~~~aGL~~GDvI~~~~~v~~~~~~~~~~~~--  142 (169)
                      +|+|+.+++++++.+..+.+      ....|++|.+|.++|||+      ++||++||+|+++++.++.+..++....  
T Consensus         1 ~~~G~~~~~~~~~~~~~~~~------~~~~g~~V~~v~~~s~a~------~~gl~~GD~I~~Ing~~i~~~~~~~~~l~~   68 (90)
T cd00987           1 PWLGVTVQDLTPDLAEELGL------KDTKGVLVASVDPGSPAA------KAGLKPGDVILAVNGKPVKSVADLRRALAE   68 (90)
T ss_pred             CccceEEeECCHHHHHHcCC------CCCCEEEEEEECCCCHHH------HcCCCcCCEEEEECCEECCCHHHHHHHHHh
Confidence            58999999999887776554      234699999999999999      9999999999999999887654444332  


Q ss_pred             cCCCCeeEEEEEEe
Q psy18070        143 INHPSITCHILLRL  156 (169)
Q Consensus       143 ~~~~~~~~~~i~r~  156 (169)
                      ......+.+.+.|+
T Consensus        69 ~~~~~~i~l~v~r~   82 (90)
T cd00987          69 LKPGDKVTLTVLRG   82 (90)
T ss_pred             cCCCCEEEEEEEEC
Confidence            22245667777774


No 9  
>PF13180 PDZ_2:  PDZ domain; PDB: 2L97_A 1Y8T_A 2Z9I_A 1LCY_A 2PZD_B 2P3W_A 1VCW_C 1TE0_B 1SOZ_C 1SOT_C ....
Probab=99.20  E-value=7e-11  Score=80.91  Aligned_cols=76  Identities=24%  Similarity=0.212  Sum_probs=58.4

Q ss_pred             ceecEEEEECCHHHHHHhhcccCCCCCCCCcEEEEEEccCCcccccccccccCCCCCcEEEecCceeecchhhhhhh--h
Q psy18070         65 KYIGITMLTLNEKLIEQLRRDRHIPYDLTHGVLIWRVMYNSPAYFIKFRTSAGIKPTSSRLLGECLAQYTTSKLVVW--S  142 (169)
Q Consensus        65 ~~lGi~~~~l~~~~~~~~~~~~~~~~~~~~Gv~V~~V~~~spA~~~~~~~~aGL~~GDvI~~~~~v~~~~~~~~~~~--~  142 (169)
                      ||||+++...++                ..|++|.+|.++|||+      ++||++||+|+++++.++.+..++...  .
T Consensus         1 ~~lGv~~~~~~~----------------~~g~~V~~V~~~spA~------~aGl~~GD~I~~ing~~v~~~~~~~~~l~~   58 (82)
T PF13180_consen    1 GGLGVTVQNLSD----------------TGGVVVVSVIPGSPAA------KAGLQPGDIILAINGKPVNSSEDLVNILSK   58 (82)
T ss_dssp             -E-SEEEEECSC----------------SSSEEEEEESTTSHHH------HTTS-TTEEEEEETTEESSSHHHHHHHHHC
T ss_pred             CEECeEEEEccC----------------CCeEEEEEeCCCCcHH------HCCCCCCcEEEEECCEEcCCHHHHHHHHHh
Confidence            689999998753                2699999999999999      999999999999999988776665533  3


Q ss_pred             cCCCCeeEEEEEEeeEEeec
Q psy18070        143 INHPSITCHILLRLYLLVCS  162 (169)
Q Consensus       143 ~~~~~~~~~~i~r~~~~v~~  162 (169)
                      ......+.+.++|+--..+.
T Consensus        59 ~~~g~~v~l~v~R~g~~~~~   78 (82)
T PF13180_consen   59 GKPGDTVTLTVLRDGEELTV   78 (82)
T ss_dssp             SSTTSEEEEEEEETTEEEEE
T ss_pred             CCCCCEEEEEEEECCEEEEE
Confidence            45566779999996554443


No 10 
>KOG1421|consensus
Probab=99.14  E-value=4.7e-10  Score=102.03  Aligned_cols=143  Identities=15%  Similarity=0.142  Sum_probs=113.1

Q ss_pred             CEeeccccCCCCccceEEcCCccEEEEEeeecc-CCeEEEEehhhHHHHHHhhhhcCCccceeecceecEEEEECCHHHH
Q psy18070          1 MSLTGIMVKFGNSGGPLVNLDGEVIGINSMKVT-AGISFAIPIDYAIEFLTNYKRKDIDRTITHKKYIGITMLTLNEKLI   79 (169)
Q Consensus         1 ~iq~da~in~GnSGGplvn~~G~vvGi~~~~~~-~~~~faiP~~~i~~~l~~l~~~g~~~~~~~~~~lGi~~~~l~~~~~   79 (169)
                      ++|.-+....|.||.|++|.+|..|..+..+.. ++.+|++|++.+.+.+..+++..    ...|+.|-+++.+-.-+..
T Consensus       206 y~QaasstsggssgspVv~i~gyAVAl~agg~~ssas~ffLpLdrV~RaL~clq~n~----PItRGtLqvefl~k~~de~  281 (955)
T KOG1421|consen  206 YIQAASSTSGGSSGSPVVDIPGYAVALNAGGSISSASDFFLPLDRVVRALRCLQNNT----PITRGTLQVEFLHKLFDEC  281 (955)
T ss_pred             eeeehhcCCCCCCCCceecccceEEeeecCCcccccccceeeccchhhhhhhhhcCC----CcccceEEEEEehhhhHHH
Confidence            579999999999999999999999999887764 56899999999999999999888    7888999999887777777


Q ss_pred             HHhhcccC-------CCCCCCCcEE-EEEEccCCcccccccccccCCCCCcEEEecCceeecchhhhhhhhcC--CCCee
Q psy18070         80 EQLRRDRH-------IPYDLTHGVL-IWRVMYNSPAYFIKFRTSAGIKPTSSRLLGECLAQYTTSKLVVWSIN--HPSIT  149 (169)
Q Consensus        80 ~~~~~~~~-------~~~~~~~Gv~-V~~V~~~spA~~~~~~~~aGL~~GDvI~~~~~v~~~~~~~~~~~~~~--~~~~~  149 (169)
                      +++|++..       .+|.. .|++ |..|.++|||+      +. |++||+++++|.- ..+++..+....+  -+..+
T Consensus       282 rrlGL~sE~eqv~r~k~P~~-tgmLvV~~vL~~gpa~------k~-Le~GDillavN~t-~l~df~~l~~iLDegvgk~l  352 (955)
T KOG1421|consen  282 RRLGLSSEWEQVVRTKFPER-TGMLVVETVLPEGPAE------KK-LEPGDILLAVNST-CLNDFEALEQILDEGVGKNL  352 (955)
T ss_pred             HhcCCcHHHHHHHHhcCccc-ceeEEEEEeccCCchh------hc-cCCCcEEEEEcce-ehHHHHHHHHHHhhccCceE
Confidence            78877543       44543 4655 68899999999      66 9999999998843 3344333332222  34566


Q ss_pred             EEEEEEe
Q psy18070        150 CHILLRL  156 (169)
Q Consensus       150 ~~~i~r~  156 (169)
                      .+.++|.
T Consensus       353 ~LtI~Rg  359 (955)
T KOG1421|consen  353 ELTIQRG  359 (955)
T ss_pred             EEEEEeC
Confidence            9999995


No 11 
>cd00990 PDZ_glycyl_aminopeptidase PDZ domain associated with archaeal and bacterial M61 glycyl-aminopeptidases. May be responsible for substrate recognition and/or binding, as most PDZ domains bind C-terminal polypeptides, and binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of protease-associated PDZ domains a C-terminal beta-strand is presumed to form the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins.
Probab=98.88  E-value=1.3e-08  Score=68.69  Aligned_cols=67  Identities=15%  Similarity=-0.002  Sum_probs=50.1

Q ss_pred             ceecEEEEECCHHHHHHhhcccCCCCCCCCcEEEEEEccCCcccccccccccCCCCCcEEEecCceeecchhhhhhhhcC
Q psy18070         65 KYIGITMLTLNEKLIEQLRRDRHIPYDLTHGVLIWRVMYNSPAYFIKFRTSAGIKPTSSRLLGECLAQYTTSKLVVWSIN  144 (169)
Q Consensus        65 ~~lGi~~~~l~~~~~~~~~~~~~~~~~~~~Gv~V~~V~~~spA~~~~~~~~aGL~~GDvI~~~~~v~~~~~~~~~~~~~~  144 (169)
                      +|+|+.+.+-                  ..|++|.+|.++|||+      ++||++||+|+++++.++.+ ....+....
T Consensus         1 ~~~G~~~~~~------------------~~~~~V~~V~~~s~a~------~aGl~~GD~I~~Ing~~v~~-~~~~l~~~~   55 (80)
T cd00990           1 PYLGLTLDKE------------------EGLGKVTFVRDDSPAD------KAGLVAGDELVAVNGWRVDA-LQDRLKEYQ   55 (80)
T ss_pred             CcccEEEEcc------------------CCcEEEEEECCCChHH------HhCCCCCCEEEEECCEEhHH-HHHHHHhcC
Confidence            5788888641                  2589999999999999      99999999999999998776 222333333


Q ss_pred             CCCeeEEEEEEe
Q psy18070        145 HPSITCHILLRL  156 (169)
Q Consensus       145 ~~~~~~~~i~r~  156 (169)
                      ....+.+.+.|.
T Consensus        56 ~~~~v~l~v~r~   67 (80)
T cd00990          56 AGDPVELTVFRD   67 (80)
T ss_pred             CCCEEEEEEEEC
Confidence            344567777764


No 12 
>TIGR02037 degP_htrA_DO periplasmic serine protease, Do/DeqQ family. This family consists of a set of proteins variously designated DegP, heat shock protein HtrA, and protease DO. The ortholog in Pseudomonas aeruginosa is designated MucD and is found in an operon that controls mucoid phenotype. This family also includes the DegQ (HhoA) paralog in E. coli which can rescue a DegP mutant, but not the smaller DegS paralog, which cannot. Members of this family are located in the periplasm and have separable functions as both protease and chaperone. Members have a trypsin domain and two copies of a PDZ domain. This protein protects bacteria from thermal and other stresses and may be important for the survival of bacterial pathogens. The chaperone function is dominant at low temperatures, whereas the proteolytic activity is turned on at elevated temperatures.
Probab=98.76  E-value=3.4e-08  Score=86.72  Aligned_cols=83  Identities=22%  Similarity=0.295  Sum_probs=66.7

Q ss_pred             cceecEEEEECCHHHHHHhhcccCCCCCCCCcEEEEEEccCCcccccccccccCCCCCcEEEecCceeecchhhhhhh--
Q psy18070         64 KKYIGITMLTLNEKLIEQLRRDRHIPYDLTHGVLIWRVMYNSPAYFIKFRTSAGIKPTSSRLLGECLAQYTTSKLVVW--  141 (169)
Q Consensus        64 ~~~lGi~~~~l~~~~~~~~~~~~~~~~~~~~Gv~V~~V~~~spA~~~~~~~~aGL~~GDvI~~~~~v~~~~~~~~~~~--  141 (169)
                      ..|+|+++.+++++.+++++++     ....|++|.+|.++|||+      ++||++||+|+++++.++.+..++...  
T Consensus       337 ~~~lGi~~~~l~~~~~~~~~l~-----~~~~Gv~V~~V~~~SpA~------~aGL~~GDvI~~Ing~~V~s~~d~~~~l~  405 (428)
T TIGR02037       337 NPFLGLTVANLSPEIRKELRLK-----GDVKGVVVTKVVSGSPAA------RAGLQPGDVILSVNQQPVSSVAELRKVLD  405 (428)
T ss_pred             ccccceEEecCCHHHHHHcCCC-----cCcCceEEEEeCCCCHHH------HcCCCCCCEEEEECCEEcCCHHHHHHHHH
Confidence            4689999999999999988774     334799999999999999      999999999999998887766554433  


Q ss_pred             hcCCCCeeEEEEEEee
Q psy18070        142 SINHPSITCHILLRLY  157 (169)
Q Consensus       142 ~~~~~~~~~~~i~r~~  157 (169)
                      ..+....+.+.++|+-
T Consensus       406 ~~~~g~~v~l~v~R~g  421 (428)
T TIGR02037       406 RAKKGGRVALLILRGG  421 (428)
T ss_pred             hcCCCCEEEEEEEECC
Confidence            2234566788888864


No 13 
>cd00991 PDZ_archaeal_metalloprotease PDZ domain of archaeal zinc metalloproteases, presumably membrane-associated or integral membrane proteases, which may be involved in signalling and regulatory mechanisms. May be responsible for substrate recognition and/or binding, as most PDZ domains bind C-terminal polypeptides, and binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of protease-associated PDZ domains a C-terminal beta-strand forms the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins.
Probab=98.64  E-value=2.2e-07  Score=63.19  Aligned_cols=61  Identities=16%  Similarity=0.021  Sum_probs=46.4

Q ss_pred             CCCcEEEEEEccCCcccccccccccCCCCCcEEEecCceeecchhhhhhhhcC--CCCeeEEEEEEeeE
Q psy18070         92 LTHGVLIWRVMYNSPAYFIKFRTSAGIKPTSSRLLGECLAQYTTSKLVVWSIN--HPSITCHILLRLYL  158 (169)
Q Consensus        92 ~~~Gv~V~~V~~~spA~~~~~~~~aGL~~GDvI~~~~~v~~~~~~~~~~~~~~--~~~~~~~~i~r~~~  158 (169)
                      ...|++|.+|.++|||+      ++||++||+|+++++.++.+..++......  ....+.+.+.|+-.
T Consensus         8 ~~~Gv~V~~V~~~spa~------~aGL~~GDiI~~Ing~~v~~~~d~~~~l~~~~~g~~v~l~v~r~g~   70 (79)
T cd00991           8 AVAGVVIVGVIVGSPAE------NAVLHTGDVIYSINGTPITTLEDFMEALKPTKPGEVITVTVLPSTT   70 (79)
T ss_pred             cCCcEEEEEECCCChHH------hcCCCCCCEEEEECCEEcCCHHHHHHHHhcCCCCCEEEEEEEECCE
Confidence            34799999999999999      999999999999998887765444433222  24456788887543


No 14 
>TIGR01713 typeII_sec_gspC general secretion pathway protein C. This model represents GspC, protein C of the main terminal branch of the general secretion pathway, also called type II secretion. This system transports folded proteins across the bacterial outer membrane and is widely distributed in Gram-negative pathogens.
Probab=98.61  E-value=2.6e-07  Score=76.62  Aligned_cols=92  Identities=11%  Similarity=0.016  Sum_probs=72.9

Q ss_pred             hhHHHHHHhhhhcCCccceeecceecEEEEECCHHHHHHhhcccCCCCCCCCcEEEEEEccCCcccccccccccCCCCCc
Q psy18070         43 DYAIEFLTNYKRKDIDRTITHKKYIGITMLTLNEKLIEQLRRDRHIPYDLTHGVLIWRVMYNSPAYFIKFRTSAGIKPTS  122 (169)
Q Consensus        43 ~~i~~~l~~l~~~g~~~~~~~~~~lGi~~~~l~~~~~~~~~~~~~~~~~~~~Gv~V~~V~~~spA~~~~~~~~aGL~~GD  122 (169)
                      ..++++++++++.+    ..-+.|+|+.....+               ....|+.|..+.++|||+      ++|||+||
T Consensus       159 ~~~~~v~~~l~~~g----~~~~~~lgi~p~~~~---------------g~~~G~~v~~v~~~s~a~------~aGLr~GD  213 (259)
T TIGR01713       159 VVSRRIIEELTKDP----QKMFDYIRLSPVMKN---------------DKLEGYRLNPGKDPSLFY------KSGLQDGD  213 (259)
T ss_pred             hhHHHHHHHHHHCH----HhhhheEeEEEEEeC---------------CceeEEEEEecCCCCHHH------HcCCCCCC
Confidence            46788999999988    888999999986543               112699999999999999      99999999


Q ss_pred             EEEecCceeecchhhhh--hhhcCCCCeeEEEEEEeeEE
Q psy18070        123 SRLLGECLAQYTTSKLV--VWSINHPSITCHILLRLYLL  159 (169)
Q Consensus       123 vI~~~~~v~~~~~~~~~--~~~~~~~~~~~~~i~r~~~~  159 (169)
                      +|+++|+.++.+..+..  +........+.+.+.|+--.
T Consensus       214 vIv~ING~~i~~~~~~~~~l~~~~~~~~v~l~V~R~G~~  252 (259)
T TIGR01713       214 IAVALNGLDLRDPEQAFQALQMLREETNLTLTVERDGQR  252 (259)
T ss_pred             EEEEECCEEcCCHHHHHHHHHhcCCCCeEEEEEEECCEE
Confidence            99999999887765544  33334445778999986543


No 15 
>KOG1320|consensus
Probab=98.60  E-value=5.2e-08  Score=86.38  Aligned_cols=116  Identities=18%  Similarity=0.180  Sum_probs=92.7

Q ss_pred             CEeeccccCCCCccceEEcCCccEEEEEeeec--cCCeEEEEehhhHHHHHHhhhhcCCccceeecceecEEEEEC-CHH
Q psy18070          1 MSLTGIMVKFGNSGGPLVNLDGEVIGINSMKV--TAGISFAIPIDYAIEFLTNYKRKDIDRTITHKKYIGITMLTL-NEK   77 (169)
Q Consensus         1 ~iq~da~in~GnSGGplvn~~G~vvGi~~~~~--~~~~~faiP~~~i~~~l~~l~~~g~~~~~~~~~~lGi~~~~l-~~~   77 (169)
                      .+|+||+++|||||+|.+.-.++++|+++.++  .+++++.+|.-...++.......++   ....++++...+.+ +.+
T Consensus       200 ~vqi~aa~~~~~s~ep~i~g~d~~~gvA~l~ik~~~~i~~~i~~~~~~~~~~G~~~~a~---~~~f~~~nt~t~g~vs~~  276 (473)
T KOG1320|consen  200 RVQIDAAIGPGNSGEPVIVGVDKVAGVAFLKIKTPENILYVIPLGVSSHFRTGVEVSAI---GNGFGLLNTLTQGMVSGQ  276 (473)
T ss_pred             eEEEEEeecCCccCCCeEEccccccceEEEEEecCCcccceeecceeeeecccceeecc---ccCceeeeeeeecccccc
Confidence            38999999999999999988899999999998  4578999999999998887766653   34556666666555 466


Q ss_pred             HHHHhhcccCCCCCCCCcEEEEEEccCCcccccccccccCCCCCcEEEecCceee
Q psy18070         78 LIEQLRRDRHIPYDLTHGVLIWRVMYNSPAYFIKFRTSAGIKPTSSRLLGECLAQ  132 (169)
Q Consensus        78 ~~~~~~~~~~~~~~~~~Gv~V~~V~~~spA~~~~~~~~aGL~~GDvI~~~~~v~~  132 (169)
                      .++.+.+.     .. .|+.+.++.+-+.|.      .. ++.||+|+..+++.+
T Consensus       277 ~R~~~~lg-----~~-~g~~i~~~~qtd~ai------~~-~nsg~~ll~~DG~~I  318 (473)
T KOG1320|consen  277 LRKSFKLG-----LE-TGVLISKINQTDAAI------NP-GNSGGPLLNLDGEVI  318 (473)
T ss_pred             cccccccC-----cc-cceeeeeecccchhh------hc-ccCCCcEEEecCcEe
Confidence            66666553     33 789999999999888      55 999999999766655


No 16 
>cd00136 PDZ PDZ domain, also called DHR (Dlg homologous region) or GLGF (after a conserved sequence motif). Many PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. Heterodimerization through PDZ-PDZ domain interactions adds to the domain's versatility, and PDZ domain-mediated interactions may be modulated dynamically through target phosphorylation. Some PDZ domains play a role in scaffolding supramolecular complexes. PDZ domains are found in diverse signaling proteins in bacteria, archaebacteria, and eukaryotes. This CD contains two distinct structural subgroups with either an N- or C-terminal beta-strand forming the peptide-binding groove base. The circular permutation placing the strand on the N-terminus appears to be found in Eumetazoa only, while the C-terminal variant is found in all three kingdoms of life, and seems to co-occur with protease domains. PDZ domains have been named after PSD95(pos
Probab=98.54  E-value=4e-07  Score=59.67  Aligned_cols=36  Identities=25%  Similarity=0.211  Sum_probs=33.0

Q ss_pred             CcEEEEEEccCCcccccccccccCCCCCcEEEecCceeecch
Q psy18070         94 HGVLIWRVMYNSPAYFIKFRTSAGIKPTSSRLLGECLAQYTT  135 (169)
Q Consensus        94 ~Gv~V~~V~~~spA~~~~~~~~aGL~~GDvI~~~~~v~~~~~  135 (169)
                      .|++|.+|.++|||+      ++||++||+|+++++.++.+.
T Consensus        13 ~~~~V~~v~~~s~a~------~~gl~~GD~I~~Ing~~v~~~   48 (70)
T cd00136          13 GGVVVLSVEPGSPAE------RAGLQAGDVILAVNGTDVKNL   48 (70)
T ss_pred             CCEEEEEeCCCCHHH------HcCCCCCCEEEEECCEECCCC
Confidence            489999999999999      999999999999998887665


No 17 
>smart00228 PDZ Domain present in PSD-95, Dlg, and ZO-1/2. Also called DHR (Dlg homologous region) or GLGF (relatively well conserved tetrapeptide in these domains). Some PDZs have been shown to bind C-terminal polypeptides; others appear to bind internal (non-C-terminal) polypeptides. Different PDZs possess different binding specificities.
Probab=98.48  E-value=9.2e-07  Score=59.46  Aligned_cols=71  Identities=20%  Similarity=0.098  Sum_probs=50.0

Q ss_pred             ceecEEEEECCHHHHHHhhcccCCCCCCCCcEEEEEEccCCcccccccccccCCCCCcEEEecCceeecch--hhhhhhh
Q psy18070         65 KYIGITMLTLNEKLIEQLRRDRHIPYDLTHGVLIWRVMYNSPAYFIKFRTSAGIKPTSSRLLGECLAQYTT--SKLVVWS  142 (169)
Q Consensus        65 ~~lGi~~~~l~~~~~~~~~~~~~~~~~~~~Gv~V~~V~~~spA~~~~~~~~aGL~~GDvI~~~~~v~~~~~--~~~~~~~  142 (169)
                      ..+|+.+.....               ...|++|..|.++|||+      ++||++||+|+++++....+.  .+.....
T Consensus        12 ~~~G~~~~~~~~---------------~~~~~~i~~v~~~s~a~------~~gl~~GD~I~~In~~~v~~~~~~~~~~~~   70 (85)
T smart00228       12 GGLGFSLVGGKD---------------EGGGVVVSSVVPGSPAA------KAGLKVGDVILEVNGTSVEGLTHLEAVDLL   70 (85)
T ss_pred             CcccEEEECCCC---------------CCCCEEEEEECCCCHHH------HcCCCCCCEEEEECCEECCCCCHHHHHHHH
Confidence            678888875321               11599999999999999      999999999999998877643  3333332


Q ss_pred             cCCCCeeEEEEEEe
Q psy18070        143 INHPSITCHILLRL  156 (169)
Q Consensus       143 ~~~~~~~~~~i~r~  156 (169)
                      ......+.+.+.|.
T Consensus        71 ~~~~~~~~l~i~r~   84 (85)
T smart00228       71 KKAGGKVTLTVLRG   84 (85)
T ss_pred             HhCCCeEEEEEEeC
Confidence            33334556666663


No 18 
>cd00992 PDZ_signaling PDZ domain found in a variety of Eumetazoan signaling molecules, often in tandem arrangements. May be responsible for specific protein-protein interactions, as most PDZ domains bind C-terminal polypeptides, and binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of PDZ domains an N-terminal beta-strand forms the peptide-binding groove base, a circular permutation with respect to PDZ domains found in proteases.
Probab=98.45  E-value=1.1e-06  Score=59.16  Aligned_cols=49  Identities=16%  Similarity=0.188  Sum_probs=41.2

Q ss_pred             cceecEEEEECCHHHHHHhhcccCCCCCCCCcEEEEEEccCCcccccccccccCCCCCcEEEecCceeec
Q psy18070         64 KKYIGITMLTLNEKLIEQLRRDRHIPYDLTHGVLIWRVMYNSPAYFIKFRTSAGIKPTSSRLLGECLAQY  133 (169)
Q Consensus        64 ~~~lGi~~~~l~~~~~~~~~~~~~~~~~~~~Gv~V~~V~~~spA~~~~~~~~aGL~~GDvI~~~~~v~~~  133 (169)
                      ...+|+++......               ..|++|.+|.++|||+      ++||++||+|+++++....
T Consensus        11 ~~~~G~~~~~~~~~---------------~~~~~V~~v~~~s~a~------~~gl~~GD~I~~ing~~i~   59 (82)
T cd00992          11 GGGLGFSLRGGKDS---------------GGGIFVSRVEPGGPAE------RGGLRVGDRILEVNGVSVE   59 (82)
T ss_pred             CCCcCEEEeCcccC---------------CCCeEEEEECCCChHH------hCCCCCCCEEEEECCEEcC
Confidence            45689988764321               2699999999999999      9999999999999998877


No 19 
>cd00988 PDZ_CTP_protease PDZ domain of C-terminal processing-, tail-specific-, and tricorn proteases, which function in posttranslational protein processing, maturation, and disassembly or degradation, in Bacteria, Archaea, and plant chloroplasts. May be responsible for substrate recognition and/or binding, as most PDZ domains bind C-terminal polypeptides, and binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of protease-associated PDZ domains a C-terminal beta-strand forms the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins.
Probab=98.43  E-value=1.5e-06  Score=59.03  Aligned_cols=57  Identities=23%  Similarity=0.095  Sum_probs=43.8

Q ss_pred             CcEEEEEEccCCcccccccccccCCCCCcEEEecCceeecch--hhhhhhhcC-CCCeeEEEEEEe
Q psy18070         94 HGVLIWRVMYNSPAYFIKFRTSAGIKPTSSRLLGECLAQYTT--SKLVVWSIN-HPSITCHILLRL  156 (169)
Q Consensus        94 ~Gv~V~~V~~~spA~~~~~~~~aGL~~GDvI~~~~~v~~~~~--~~~~~~~~~-~~~~~~~~i~r~  156 (169)
                      .+++|..|.++|||+      ++||++||+|+++++.+..+.  .+....... ....+.+.+.|.
T Consensus        13 ~~~~V~~v~~~s~a~------~~gl~~GD~I~~vng~~i~~~~~~~~~~~l~~~~~~~i~l~v~r~   72 (85)
T cd00988          13 GGLVITSVLPGSPAA------KAGIKAGDIIVAIDGEPVDGLSLEDVVKLLRGKAGTKVRLTLKRG   72 (85)
T ss_pred             CeEEEEEecCCCCHH------HcCCCCCCEEEEECCEEcCCCCHHHHHHHhcCCCCCEEEEEEEcC
Confidence            689999999999999      999999999999999887764  444333222 344567777775


No 20 
>cd00989 PDZ_metalloprotease PDZ domain of bacterial and plant zinc metalloproteases, presumably membrane-associated or integral membrane proteases, which may be involved in signalling and regulatory mechanisms. May be responsible for substrate recognition and/or binding, as most PDZ domains bind C-terminal polypeptides, and binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of protease-associated PDZ domains a C-terminal beta-strand forms the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins.
Probab=98.43  E-value=1.5e-06  Score=58.23  Aligned_cols=57  Identities=21%  Similarity=0.027  Sum_probs=43.1

Q ss_pred             CcEEEEEEccCCcccccccccccCCCCCcEEEecCceeecchhhhhhhhcCC-CCeeEEEEEEe
Q psy18070         94 HGVLIWRVMYNSPAYFIKFRTSAGIKPTSSRLLGECLAQYTTSKLVVWSINH-PSITCHILLRL  156 (169)
Q Consensus        94 ~Gv~V~~V~~~spA~~~~~~~~aGL~~GDvI~~~~~v~~~~~~~~~~~~~~~-~~~~~~~i~r~  156 (169)
                      ..++|.+|.++|||+      ++||++||+|+++++.+..+..+........ ...+.+.+.|.
T Consensus        12 ~~~~V~~v~~~s~a~------~~gl~~GD~I~~ing~~i~~~~~~~~~l~~~~~~~~~l~v~r~   69 (79)
T cd00989          12 IEPVIGEVVPGSPAA------KAGLKAGDRILAINGQKIKSWEDLVDAVQENPGKPLTLTVERN   69 (79)
T ss_pred             cCcEEEeECCCCHHH------HcCCCCCCEEEEECCEECCCHHHHHHHHHHCCCceEEEEEEEC
Confidence            358999999999999      9999999999999999877654443332222 34557777764


No 21 
>cd00986 PDZ_LON_protease PDZ domain of ATP-dependent LON serine proteases. Most PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this bacterial subfamily of protease-associated PDZ domains a C-terminal beta-strand is thought to form the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins.
Probab=98.36  E-value=2.8e-06  Score=57.39  Aligned_cols=56  Identities=16%  Similarity=0.121  Sum_probs=42.8

Q ss_pred             CcEEEEEEccCCcccccccccccCCCCCcEEEecCceeecchhhhhhhh--cCCCCeeEEEEEEe
Q psy18070         94 HGVLIWRVMYNSPAYFIKFRTSAGIKPTSSRLLGECLAQYTTSKLVVWS--INHPSITCHILLRL  156 (169)
Q Consensus        94 ~Gv~V~~V~~~spA~~~~~~~~aGL~~GDvI~~~~~v~~~~~~~~~~~~--~~~~~~~~~~i~r~  156 (169)
                      .|++|.+|.++|||+      . ||++||+|+++++.++.+..++....  ......+.+.+.|.
T Consensus         8 ~Gv~V~~V~~~s~A~------~-gL~~GD~I~~Ing~~v~~~~~~~~~l~~~~~~~~v~l~v~r~   65 (79)
T cd00986           8 HGVYVTSVVEGMPAA------G-KLKAGDHIIAVDGKPFKEAEELIDYIQSKKEGDTVKLKVKRE   65 (79)
T ss_pred             cCEEEEEECCCCchh------h-CCCCCCEEEEECCEECCCHHHHHHHHHhCCCCCEEEEEEEEC
Confidence            699999999999999      6 79999999999998877654443222  23344567888774


No 22 
>PF00595 PDZ:  PDZ domain (Also known as DHR or GLGF) Coordinates are not yet available;  InterPro: IPR001478 PDZ domains are found in diverse signalling proteins in bacteria, yeasts, plants, insects and vertebrates. PDZ domains can occur in one or multiple copies and are nearly always found in cytoplasmic proteins. They bind either the carboxyl-terminal sequences of proteins or internal peptide sequences. In most cases, interaction between a PDZ domain and its target is constitutive, with a binding affinity of 1 to 10 micromolar. However, agonist-dependent activation of cell surface receptors is sometimes required to promote interaction with a PDZ protein. PDZ domain proteins are frequently associated with the plasma membrane, a compartment where high concentrations of phosphatidylinositol 4,5-bisphosphate (PIP2) are found. Direct interaction between PIP2 and a subset of class II PDZ domains (syntenin, CASK, Tiam-1) has been demonstrated.  PDZ domains consist of 80 to 90 amino acids comprising six beta-strands (beta-A to beta-F) and two alpha-helices, A and B, compactly arranged in a globular structure. Peptide binding of the ligand takes place in an elongated surface groove as an anti-parallel beta-strand interacts with the beta-B strand and the B helix. The structure of PDZ domains allows binding to a free carboxylate group at the end of a peptide through a carboxylate-binding loop between the beta-A and beta-B strands.; GO: 0005515 protein binding; PDB: 3AXA_A 1WF8_A 1QAV_B 1QAU_A 1B8Q_A 1MC7_A 2KAW_A 1I16_A 1VB7_A 1WI4_A ....
Probab=98.35  E-value=1.5e-06  Score=58.92  Aligned_cols=71  Identities=18%  Similarity=0.122  Sum_probs=50.6

Q ss_pred             ecceecEEEEECCHHHHHHhhcccCCCCCCCCcEEEEEEccCCcccccccccccCCCCCcEEEecCceeecch--hhhhh
Q psy18070         63 HKKYIGITMLTLNEKLIEQLRRDRHIPYDLTHGVLIWRVMYNSPAYFIKFRTSAGIKPTSSRLLGECLAQYTT--SKLVV  140 (169)
Q Consensus        63 ~~~~lGi~~~~l~~~~~~~~~~~~~~~~~~~~Gv~V~~V~~~spA~~~~~~~~aGL~~GDvI~~~~~v~~~~~--~~~~~  140 (169)
                      ....+|+++.......              ..|++|.+|.++|||+      ++||++||.|+++|+..+.+.  .+...
T Consensus         8 ~~~~lG~~l~~~~~~~--------------~~~~~V~~v~~~~~a~------~~gl~~GD~Il~INg~~v~~~~~~~~~~   67 (81)
T PF00595_consen    8 GNGPLGFTLRGGSDND--------------EKGVFVSSVVPGSPAE------RAGLKVGDRILEINGQSVRGMSHDEVVQ   67 (81)
T ss_dssp             TTSBSSEEEEEESTSS--------------SEEEEEEEECTTSHHH------HHTSSTTEEEEEETTEESTTSBHHHHHH
T ss_pred             CCCCcCEEEEecCCCC--------------cCCEEEEEEeCCChHH------hcccchhhhhheeCCEeCCCCCHHHHHH
Confidence            4567999998653210              2599999999999999      999999999999998887654  33333


Q ss_pred             hhcCCCCeeEEEE
Q psy18070        141 WSINHPSITCHIL  153 (169)
Q Consensus       141 ~~~~~~~~~~~~i  153 (169)
                      .....+..+.+.+
T Consensus        68 ~l~~~~~~v~L~V   80 (81)
T PF00595_consen   68 LLKSASNPVTLTV   80 (81)
T ss_dssp             HHHHSTSEEEEEE
T ss_pred             HHHCCCCcEEEEE
Confidence            3333444555544


No 23 
>PRK10942 serine endoprotease; Provisional
Probab=98.25  E-value=3.8e-06  Score=75.10  Aligned_cols=57  Identities=19%  Similarity=0.144  Sum_probs=47.3

Q ss_pred             CcEEEEEEccCCcccccccccccCCCCCcEEEecCceeecchhhhhhhhcCCCCeeEEEEEEe
Q psy18070         94 HGVLIWRVMYNSPAYFIKFRTSAGIKPTSSRLLGECLAQYTTSKLVVWSINHPSITCHILLRL  156 (169)
Q Consensus        94 ~Gv~V~~V~~~spA~~~~~~~~aGL~~GDvI~~~~~v~~~~~~~~~~~~~~~~~~~~~~i~r~  156 (169)
                      .|++|.+|.++|||+      ++||++||+|+++|+.++.+..++.....+.+..+.+.+.|.
T Consensus       408 ~gvvV~~V~~~S~A~------~aGL~~GDvIv~VNg~~V~s~~dl~~~l~~~~~~v~l~V~R~  464 (473)
T PRK10942        408 KGVVVDNVKPGTPAA------QIGLKKGDVIIGANQQPVKNIAELRKILDSKPSVLALNIQRG  464 (473)
T ss_pred             CCeEEEEeCCCChHH------HcCCCCCCEEEEECCEEcCCHHHHHHHHHhCCCeEEEEEEEC
Confidence            589999999999999      999999999999999988887666544444456678888885


No 24 
>PRK10139 serine endoprotease; Provisional
Probab=97.96  E-value=1.9e-05  Score=70.34  Aligned_cols=57  Identities=19%  Similarity=0.165  Sum_probs=46.0

Q ss_pred             CcEEEEEEccCCcccccccccccCCCCCcEEEecCceeecchhhhhhhhcCCCCeeEEEEEEe
Q psy18070         94 HGVLIWRVMYNSPAYFIKFRTSAGIKPTSSRLLGECLAQYTTSKLVVWSINHPSITCHILLRL  156 (169)
Q Consensus        94 ~Gv~V~~V~~~spA~~~~~~~~aGL~~GDvI~~~~~v~~~~~~~~~~~~~~~~~~~~~~i~r~  156 (169)
                      .|++|.+|.++|||+      ++||++||+|+++|+.++.+..++.....+++..+.+.++|+
T Consensus       390 ~Gv~V~~V~~~spA~------~aGL~~GD~I~~Ing~~v~~~~~~~~~l~~~~~~v~l~v~R~  446 (455)
T PRK10139        390 KGIKIDEVVKGSPAA------QAGLQKDDVIIGVNRDRVNSIAEMRKVLAAKPAIIALQIVRG  446 (455)
T ss_pred             CceEEEEeCCCChHH------HcCCCCCCEEEEECCEEcCCHHHHHHHHHhCCCeEEEEEEEC
Confidence            599999999999999      999999999999998888776555544333445667888885


No 25 
>TIGR00225 prc C-terminal peptidase (prc). A C-terminal peptidase with different substrates in different species including processing of D1 protein of the photosystem II reaction center in higher plants and cleavage of a peptide of 11 residues from the precursor form of penicillin-binding protein in E. coli. E. coli and H. influenzae have the most distal branch of the tree and their proteins have an N-terminal 200 amino acids that show no homology to other proteins in the database.
Probab=97.90  E-value=5.8e-05  Score=64.35  Aligned_cols=57  Identities=21%  Similarity=0.130  Sum_probs=41.9

Q ss_pred             CcEEEEEEccCCcccccccccccCCCCCcEEEecCceeecch--hhhhhhh-cCCCCeeEEEEEEe
Q psy18070         94 HGVLIWRVMYNSPAYFIKFRTSAGIKPTSSRLLGECLAQYTT--SKLVVWS-INHPSITCHILLRL  156 (169)
Q Consensus        94 ~Gv~V~~V~~~spA~~~~~~~~aGL~~GDvI~~~~~v~~~~~--~~~~~~~-~~~~~~~~~~i~r~  156 (169)
                      .+++|.+|.++|||+      ++||++||+|+++++.++.+-  .+..... ......+.+.+.|.
T Consensus        62 ~~~~V~~V~~~spA~------~aGL~~GD~I~~Ing~~v~~~~~~~~~~~l~~~~g~~v~l~v~R~  121 (334)
T TIGR00225        62 GEIVIVSPFEGSPAE------KAGIKPGDKIIKINGKSVAGMSLDDAVALIRGKKGTKVSLEILRA  121 (334)
T ss_pred             CEEEEEEeCCCChHH------HcCCCCCCEEEEECCEECCCCCHHHHHHhccCCCCCEEEEEEEeC
Confidence            589999999999999      999999999999999887652  2222121 12344557777774


No 26 
>TIGR00054 RIP metalloprotease RseP. A model that detects fragments as well matches a number of members of the PEPTIDASE FAMILY S2C. The region of match appears not to overlap the active site domain.
Probab=97.80  E-value=5e-05  Score=66.90  Aligned_cols=57  Identities=18%  Similarity=-0.057  Sum_probs=44.0

Q ss_pred             CcEEEEEEccCCcccccccccccCCCCCcEEEecCceeecchhhhhhhhcC-CCCeeEEEEEEe
Q psy18070         94 HGVLIWRVMYNSPAYFIKFRTSAGIKPTSSRLLGECLAQYTTSKLVVWSIN-HPSITCHILLRL  156 (169)
Q Consensus        94 ~Gv~V~~V~~~spA~~~~~~~~aGL~~GDvI~~~~~v~~~~~~~~~~~~~~-~~~~~~~~i~r~  156 (169)
                      .|++|.+|.++|||+      ++||++||+|+++|+.++.+-.+....... ....+.+.+.|+
T Consensus       203 ~g~vV~~V~~~SpA~------~aGL~~GD~Iv~Vng~~V~s~~dl~~~l~~~~~~~v~l~v~R~  260 (420)
T TIGR00054       203 IEPVLSDVTPNSPAE------KAGLKEGDYIQSINGEKLRSWTDFVSAVKENPGKSMDIKVERN  260 (420)
T ss_pred             cCcEEEEECCCCHHH------HcCCCCCCEEEEECCEECCCHHHHHHHHHhCCCCceEEEEEEC
Confidence            489999999999999      999999999999998887765554433322 333457777775


No 27 
>TIGR00054 RIP metalloprotease RseP. A model that detects fragments as well matches a number of members of the PEPTIDASE FAMILY S2C. The region of match appears not to overlap the active site domain.
Probab=97.79  E-value=3.6e-05  Score=67.81  Aligned_cols=57  Identities=19%  Similarity=0.013  Sum_probs=46.2

Q ss_pred             CCcEEEEEEccCCcccccccccccCCCCCcEEEecCceeecchhhhhhhhcCCCCeeEEEEEE
Q psy18070         93 THGVLIWRVMYNSPAYFIKFRTSAGIKPTSSRLLGECLAQYTTSKLVVWSINHPSITCHILLR  155 (169)
Q Consensus        93 ~~Gv~V~~V~~~spA~~~~~~~~aGL~~GDvI~~~~~v~~~~~~~~~~~~~~~~~~~~~~i~r  155 (169)
                      ..|++|.+|.++|||+      +|||++||+|+++|+.++.+..++........+...+.+.|
T Consensus       127 ~~g~~V~~V~~~SpA~------~AGL~~GDvI~~vng~~v~~~~dl~~~ia~~~~~v~~~I~r  183 (420)
T TIGR00054       127 EVGPVIELLDKNSIAL------EAGIEPGDEILSVNGNKIPGFKDVRQQIADIAGEPMVEILA  183 (420)
T ss_pred             CCCceeeccCCCCHHH------HcCCCCCCEEEEECCEEcCCHHHHHHHHHhhcccceEEEEE
Confidence            3699999999999999      99999999999999888887766665544444566677766


No 28 
>PRK10779 zinc metallopeptidase RseP; Provisional
Probab=97.69  E-value=2.9e-05  Score=68.90  Aligned_cols=55  Identities=15%  Similarity=-0.019  Sum_probs=42.7

Q ss_pred             EEEEEEccCCcccccccccccCCCCCcEEEecCceeecchhhhhhh--hcCCCCeeEEEEEEe
Q psy18070         96 VLIWRVMYNSPAYFIKFRTSAGIKPTSSRLLGECLAQYTTSKLVVW--SINHPSITCHILLRL  156 (169)
Q Consensus        96 v~V~~V~~~spA~~~~~~~~aGL~~GDvI~~~~~v~~~~~~~~~~~--~~~~~~~~~~~i~r~  156 (169)
                      .+|.+|.++|||+      +||||+||+|+++|+.++.+-.++...  .......+.+++.|.
T Consensus       128 ~lV~~V~~~SpA~------kAGLk~GDvI~~vnG~~V~~~~~l~~~v~~~~~g~~v~v~v~R~  184 (449)
T PRK10779        128 PVVGEIAPNSIAA------QAQIAPGTELKAVDGIETPDWDAVRLALVSKIGDESTTITVAPF  184 (449)
T ss_pred             ccccccCCCCHHH------HcCCCCCCEEEEECCEEcCCHHHHHHHHHhhccCCceEEEEEeC
Confidence            4789999999999      999999999999888887776555433  333334568888875


No 29 
>PRK10779 zinc metallopeptidase RseP; Provisional
Probab=97.67  E-value=0.0001  Score=65.36  Aligned_cols=57  Identities=16%  Similarity=0.002  Sum_probs=43.1

Q ss_pred             CcEEEEEEccCCcccccccccccCCCCCcEEEecCceeecchhhhhhhhc-CCCCeeEEEEEEe
Q psy18070         94 HGVLIWRVMYNSPAYFIKFRTSAGIKPTSSRLLGECLAQYTTSKLVVWSI-NHPSITCHILLRL  156 (169)
Q Consensus        94 ~Gv~V~~V~~~spA~~~~~~~~aGL~~GDvI~~~~~v~~~~~~~~~~~~~-~~~~~~~~~i~r~  156 (169)
                      .+++|.+|.++|||+      ++||++||+|+++|+.++.+-.+...... .....+.+.+.|+
T Consensus       221 ~~~vV~~V~~~SpA~------~AGL~~GDvIl~Ing~~V~s~~dl~~~l~~~~~~~v~l~v~R~  278 (449)
T PRK10779        221 IEPVLAEVQPNSAAS------KAGLQAGDRIVKVDGQPLTQWQTFVTLVRDNPGKPLALEIERQ  278 (449)
T ss_pred             cCcEEEeeCCCCHHH------HcCCCCCCEEEEECCEEcCCHHHHHHHHHhCCCCEEEEEEEEC
Confidence            368999999999999      99999999999999888766544433222 2334567777774


No 30 
>PLN00049 carboxyl-terminal processing protease; Provisional
Probab=97.64  E-value=0.00013  Score=63.70  Aligned_cols=35  Identities=23%  Similarity=0.253  Sum_probs=32.4

Q ss_pred             CcEEEEEEccCCcccccccccccCCCCCcEEEecCceeecc
Q psy18070         94 HGVLIWRVMYNSPAYFIKFRTSAGIKPTSSRLLGECLAQYT  134 (169)
Q Consensus        94 ~Gv~V~~V~~~spA~~~~~~~~aGL~~GDvI~~~~~v~~~~  134 (169)
                      .|++|..|.++|||+      ++||++||+|+++++.++.+
T Consensus       102 ~g~~V~~V~~~SPA~------~aGl~~GD~Iv~InG~~v~~  136 (389)
T PLN00049        102 AGLVVVAPAPGGPAA------RAGIRPGDVILAIDGTSTEG  136 (389)
T ss_pred             CcEEEEEeCCCChHH------HcCCCCCCEEEEECCEECCC
Confidence            489999999999999      99999999999999988764


No 31 
>COG0793 Prc Periplasmic protease [Cell envelope biogenesis, outer membrane]
Probab=97.49  E-value=0.00044  Score=60.83  Aligned_cols=57  Identities=26%  Similarity=0.183  Sum_probs=44.5

Q ss_pred             CcEEEEEEccCCcccccccccccCCCCCcEEEecCceeecchh-hhhhhhc--CCCCeeEEEEEEe
Q psy18070         94 HGVLIWRVMYNSPAYFIKFRTSAGIKPTSSRLLGECLAQYTTS-KLVVWSI--NHPSITCHILLRL  156 (169)
Q Consensus        94 ~Gv~V~~V~~~spA~~~~~~~~aGL~~GDvI~~~~~v~~~~~~-~~~~~~~--~~~~~~~~~i~r~  156 (169)
                      .++.|.++.+++||+      ++||++||+|+++++.++.... +.++..+  +....+.+.+.|.
T Consensus       112 ~~~~V~s~~~~~PA~------kagi~~GD~I~~IdG~~~~~~~~~~av~~irG~~Gt~V~L~i~r~  171 (406)
T COG0793         112 GGVKVVSPIDGSPAA------KAGIKPGDVIIKIDGKSVGGVSLDEAVKLIRGKPGTKVTLTILRA  171 (406)
T ss_pred             CCcEEEecCCCChHH------HcCCCCCCEEEEECCEEccCCCHHHHHHHhCCCCCCeEEEEEEEc
Confidence            589999999999999      9999999999999988776663 3333333  3344668888884


No 32 
>PF12812 PDZ_1:  PDZ-like domain
Probab=97.46  E-value=0.00052  Score=46.98  Aligned_cols=64  Identities=14%  Similarity=-0.000  Sum_probs=52.2

Q ss_pred             cceecEEEEECCHHHHHHhhcccCCCCCCCCcEEEEEEccCCcccccccccccCCCCCcEEEecCceeecchhhhhhh
Q psy18070         64 KKYIGITMLTLNEKLIEQLRRDRHIPYDLTHGVLIWRVMYNSPAYFIKFRTSAGIKPTSSRLLGECLAQYTTSKLVVW  141 (169)
Q Consensus        64 ~~~lGi~~~~l~~~~~~~~~~~~~~~~~~~~Gv~V~~V~~~spA~~~~~~~~aGL~~GDvI~~~~~v~~~~~~~~~~~  141 (169)
                      -.|+|..+++|+-+.++++++.        -|+++.....++++.      ..|+.+|-+|+++|+.++.+-.++...
T Consensus         8 v~~~Ga~f~~Ls~q~aR~~~~~--------~~gv~v~~~~g~~~~------~~~i~~g~iI~~Vn~kpt~~Ld~f~~v   71 (78)
T PF12812_consen    8 VEVCGAVFHDLSYQQARQYGIP--------VGGVYVAVSGGSLAF------AGGISKGFIITSVNGKPTPDLDDFIKV   71 (78)
T ss_pred             EEEcCeecccCCHHHHHHhCCC--------CCEEEEEecCCChhh------hCCCCCCeEEEeECCcCCcCHHHHHHH
Confidence            3679999999999999999875        456666778899998      666999999999998887776655543


No 33 
>KOG3553|consensus
Probab=97.46  E-value=9.9e-05  Score=52.98  Aligned_cols=32  Identities=28%  Similarity=0.302  Sum_probs=30.1

Q ss_pred             CcEEEEEEccCCcccccccccccCCCCCcEEEecCcee
Q psy18070         94 HGVLIWRVMYNSPAYFIKFRTSAGIKPTSSRLLGECLA  131 (169)
Q Consensus        94 ~Gv~V~~V~~~spA~~~~~~~~aGL~~GDvI~~~~~v~  131 (169)
                      .|++|++|.+||||+      .||||.+|-|+.+|+-.
T Consensus        59 ~GiYvT~V~eGsPA~------~AGLrihDKIlQvNG~D   90 (124)
T KOG3553|consen   59 KGIYVTRVSEGSPAE------IAGLRIHDKILQVNGWD   90 (124)
T ss_pred             ccEEEEEeccCChhh------hhcceecceEEEecCce
Confidence            699999999999999      99999999999988764


No 34 
>COG3975 Predicted protease with the C-terminal PDZ domain [General function prediction only]
Probab=97.18  E-value=0.0007  Score=60.90  Aligned_cols=54  Identities=20%  Similarity=0.179  Sum_probs=39.2

Q ss_pred             ecEEEEECCHHHHHHhhcccCCCCCCCCcEEEEEEccCCcccccccccccCCCCCcEEEecCce
Q psy18070         67 IGITMLTLNEKLIEQLRRDRHIPYDLTHGVLIWRVMYNSPAYFIKFRTSAGIKPTSSRLLGECL  130 (169)
Q Consensus        67 lGi~~~~l~~~~~~~~~~~~~~~~~~~~Gv~V~~V~~~spA~~~~~~~~aGL~~GDvI~~~~~v  130 (169)
                      .|+++.++.++ .-.+|++-.   +...+.+|+.|.++|||+      +|||.+||.|++++++
T Consensus       439 ~gL~~~~~~~~-~~~LGl~v~---~~~g~~~i~~V~~~gPA~------~AGl~~Gd~ivai~G~  492 (558)
T COG3975         439 FGLTFTPKPRE-AYYLGLKVK---SEGGHEKITFVFPGGPAY------KAGLSPGDKIVAINGI  492 (558)
T ss_pred             cceEEEecCCC-CcccceEec---ccCCeeEEEecCCCChhH------hccCCCccEEEEEcCc
Confidence            56666555443 223333211   333678999999999999      9999999999999998


No 35 
>PF04495 GRASP55_65:  GRASP55/65 PDZ-like domain ;  InterPro: IPR007583 GRASP55 (Golgi reassembly stacking protein of 55 kDa) and GRASP65 (a 65 kDa protein) are highly homologous. GRASP55 is a component of the Golgi stacking machinery. GRASP65 is an N-ethylmaleimide-sensitive membrane protein required for the stacking of Golgi cisternae in a cell-free system [].; PDB: 3RLE_A 4EDJ_A.
Probab=97.17  E-value=0.00057  Score=51.68  Aligned_cols=73  Identities=19%  Similarity=0.099  Sum_probs=44.5

Q ss_pred             ceecEEEEECCHHHHHHhhcccCCCCCCCCcEEEEEEccCCcccccccccccCCCC-CcEEEecCceeecchhhhh-hhh
Q psy18070         65 KYIGITMLTLNEKLIEQLRRDRHIPYDLTHGVLIWRVMYNSPAYFIKFRTSAGIKP-TSSRLLGECLAQYTTSKLV-VWS  142 (169)
Q Consensus        65 ~~lGi~~~~l~~~~~~~~~~~~~~~~~~~~Gv~V~~V~~~spA~~~~~~~~aGL~~-GDvI~~~~~v~~~~~~~~~-~~~  142 (169)
                      +.||++++--..+-            ..+.+.-|.+|.|+|||+      +|||+| .|-|+..+.....+..++. ...
T Consensus        26 g~LG~sv~~~~~~~------------~~~~~~~Vl~V~p~SPA~------~AGL~p~~DyIig~~~~~l~~~~~l~~~v~   87 (138)
T PF04495_consen   26 GLLGISVRFESFEG------------AEEEGWHVLRVAPNSPAA------KAGLEPFFDYIIGIDGGLLDDEDDLFELVE   87 (138)
T ss_dssp             SSS-EEEEEEE-TT------------GCCCEEEEEEE-TTSHHH------HTT--TTTEEEEEETTCE--STCHHHHHHH
T ss_pred             CCCcEEEEEecccc------------cccceEEEeEecCCCHHH------HCCccccccEEEEccceecCCHHHHHHHHH
Confidence            67999987543210            223689999999999999      999999 6999996654433333333 223


Q ss_pred             cCCCCeeEEEEEE
Q psy18070        143 INHPSITCHILLR  155 (169)
Q Consensus       143 ~~~~~~~~~~i~r  155 (169)
                      ......+.+.+++
T Consensus        88 ~~~~~~l~L~Vyn  100 (138)
T PF04495_consen   88 ANENKPLQLYVYN  100 (138)
T ss_dssp             HTTTS-EEEEEEE
T ss_pred             HcCCCcEEEEEEE
Confidence            4455667787776


No 36 
>TIGR02860 spore_IV_B stage IV sporulation protein B. SpoIVB, the stage IV sporulation protein B of endospore-forming bacteria such as Bacillus subtilis, is a serine proteinase, expressed in the spore (rather than mother cell) compartment, that participates in a proteolytic activation cascade for Sigma-K. It appears to be universal among endospore-forming bacteria and occurs nowhere else.
Probab=97.10  E-value=0.0023  Score=56.30  Aligned_cols=58  Identities=21%  Similarity=0.064  Sum_probs=42.3

Q ss_pred             CCcEEEEEE--------ccCCcccccccccccCCCCCcEEEecCceeecchhhhhhhhcC-CCCeeEEEEEEe
Q psy18070         93 THGVLIWRV--------MYNSPAYFIKFRTSAGIKPTSSRLLGECLAQYTTSKLVVWSIN-HPSITCHILLRL  156 (169)
Q Consensus        93 ~~Gv~V~~V--------~~~spA~~~~~~~~aGL~~GDvI~~~~~v~~~~~~~~~~~~~~-~~~~~~~~i~r~  156 (169)
                      +.||+|...        ..+|||+      ++|||+||+|+++|+.++.+..++...... ....+.+.+.|.
T Consensus       104 t~GVlVvg~~~v~~~~g~~~SPAa------~AGLq~GDiIvsING~~V~s~~DL~~iL~~~~g~~V~LtV~R~  170 (402)
T TIGR02860       104 TKGVLVVGFSDIETEKGKIHSPGE------EAGIQIGDRILKINGEKIKNMDDLANLINKAGGEKLTLTIERG  170 (402)
T ss_pred             cCEEEEEEEEcccccCCCCCCHHH------HcCCCCCCEEEEECCEECCCHHHHHHHHHhCCCCeEEEEEEEC
Confidence            379999665        2369999      999999999999998887776555433222 245567777775


No 37 
>PRK11186 carboxy-terminal protease; Provisional
Probab=97.06  E-value=0.0017  Score=60.42  Aligned_cols=29  Identities=10%  Similarity=0.000  Sum_probs=26.9

Q ss_pred             CcEEEEEEccCCccccccccccc-CCCCCcEEEecC
Q psy18070         94 HGVLIWRVMYNSPAYFIKFRTSA-GIKPTSSRLLGE  128 (169)
Q Consensus        94 ~Gv~V~~V~~~spA~~~~~~~~a-GL~~GDvI~~~~  128 (169)
                      .+++|.+|.|||||+      ++ ||++||+|++++
T Consensus       255 ~~~~V~~vipGsPA~------ka~gLk~GD~IlaVn  284 (667)
T PRK11186        255 DYTVINSLVAGGPAA------KSKKLSVGDKIVGVG  284 (667)
T ss_pred             CeEEEEEccCCChHH------HhCCCCCCCEEEEEC
Confidence            468999999999999      98 999999999976


No 38 
>TIGR03279 cyano_FeS_chp putative FeS-containing Cyanobacterial-specific oxidoreductase. Members of this protein family are predicted FeS-containing oxidoreductases of unknown function, apparently restricted to and universal across the Cyanobacteria. The high trusted cutoff score for this model, 700 bits, excludes homologs from other lineages. This exclusion seems justified because a significant number of sequence positions are simultaneously unique to and invariant across the Cyanobacteria, suggesting a specialized, conserved function, perhaps related to photosynthesis. A distantly related protein family, TIGR03278, is universal in and restricted to archaeal methanogens, and may be linked to methanogenesis.
Probab=96.99  E-value=0.00091  Score=59.21  Aligned_cols=50  Identities=18%  Similarity=0.056  Sum_probs=37.1

Q ss_pred             EEEEEccCCcccccccccccCCCCCcEEEecCceeecchhhhhhhhcCCCCeeEEEEE
Q psy18070         97 LIWRVMYNSPAYFIKFRTSAGIKPTSSRLLGECLAQYTTSKLVVWSINHPSITCHILL  154 (169)
Q Consensus        97 ~V~~V~~~spA~~~~~~~~aGL~~GDvI~~~~~v~~~~~~~~~~~~~~~~~~~~~~i~  154 (169)
                      +|.+|.|+|||+      ++||++||.|+++|+.++.+-.+.......  ..+.+.+.
T Consensus         1 ~I~~V~pgSpAe------~AGLe~GD~IlsING~~V~Dw~D~~~~l~~--e~l~L~V~   50 (433)
T TIGR03279         1 LISAVLPGSIAE------ELGFEPGDALVSINGVAPRDLIDYQFLCAD--EELELEVL   50 (433)
T ss_pred             CcCCcCCCCHHH------HcCCCCCCEEEEECCEECCCHHHHHHHhcC--CcEEEEEE
Confidence            367899999999      999999999999999988775554433322  33455554


No 39 
>PF10459 Peptidase_S46:  Peptidase S46;  InterPro: IPR019500 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This entry represents S46 peptidases, where dipeptidyl-peptidase 7 (DPP-7) is the best-characterised member of this family. It is a serine peptidase that is located on the cell surface and is predicted to have two N-terminal transmembrane domains. 
Probab=96.70  E-value=0.0027  Score=59.43  Aligned_cols=50  Identities=26%  Similarity=0.401  Sum_probs=35.8

Q ss_pred             eeccccCCCCccceEEcCCccEEEEEeeecc----------CCe--EEEEehhhHHHHHHhh
Q psy18070          3 LTGIMVKFGNSGGPLVNLDGEVIGINSMKVT----------AGI--SFAIPIDYAIEFLTNY   52 (169)
Q Consensus         3 q~da~in~GnSGGplvn~~G~vvGi~~~~~~----------~~~--~faiP~~~i~~~l~~l   52 (169)
                      -++.-|..||||+|++|.+||+||+++-..-          ...  +..|=+..+..+++.+
T Consensus       625 lstnDitGGNSGSPvlN~~GeLVGl~FDgn~Esl~~D~~fdp~~~R~I~VDiRyvL~~ldkv  686 (698)
T PF10459_consen  625 LSTNDITGGNSGSPVLNAKGELVGLAFDGNWESLSGDIAFDPELNRTIHVDIRYVLWALDKV  686 (698)
T ss_pred             EeccCcCCCCCCCccCCCCceEEEEeecCchhhcccccccccccceeEEEEHHHHHHHHHHH
Confidence            3567788999999999999999999997652          122  3444445555665554


No 40 
>KOG3129|consensus
Probab=96.62  E-value=0.0045  Score=49.89  Aligned_cols=61  Identities=20%  Similarity=0.009  Sum_probs=43.0

Q ss_pred             cEEEEEEccCCcccccccccccCCCCCcEEEecCceeecchhhhhh----hhcCCCCeeEEEEEEeeEEee
Q psy18070         95 GVLIWRVMYNSPAYFIKFRTSAGIKPTSSRLLGECLAQYTTSKLVV----WSINHPSITCHILLRLYLLVC  161 (169)
Q Consensus        95 Gv~V~~V~~~spA~~~~~~~~aGL~~GDvI~~~~~v~~~~~~~~~~----~~~~~~~~~~~~i~r~~~~v~  161 (169)
                      =++|.+|.|+|||+      +|||+.||-|++...+...+-..+..    ........+.+++.|.--.|+
T Consensus       140 Fa~V~sV~~~SPA~------~aGl~~gD~il~fGnV~sgn~~~lq~i~~~v~~~e~~~v~v~v~R~g~~v~  204 (231)
T KOG3129|consen  140 FAVVDSVVPGSPAD------EAGLCVGDEILKFGNVHSGNFLPLQNIAAVVQSNEDQIVSVTVIREGQKVV  204 (231)
T ss_pred             eEEEeecCCCChhh------hhCcccCceEEEecccccccchhHHHHHHHHHhccCcceeEEEecCCCEEE
Confidence            36899999999999      99999999999966665555443221    122345566888888644443


No 41 
>PF14685 Tricorn_PDZ:  Tricorn protease PDZ domain; PDB: 1N6F_D 1N6D_C 1N6E_C 1K32_A.
Probab=96.13  E-value=0.044  Score=38.34  Aligned_cols=57  Identities=14%  Similarity=0.004  Sum_probs=36.4

Q ss_pred             CcEEEEEEccC--------CcccccccccccCC--CCCcEEEecCceeecchhhhhhhhcCC-CCeeEEEEEEe
Q psy18070         94 HGVLIWRVMYN--------SPAYFIKFRTSAGI--KPTSSRLLGECLAQYTTSKLVVWSINH-PSITCHILLRL  156 (169)
Q Consensus        94 ~Gv~V~~V~~~--------spA~~~~~~~~aGL--~~GDvI~~~~~v~~~~~~~~~~~~~~~-~~~~~~~i~r~  156 (169)
                      .+..|.++.++        ||-.      +.|+  ++||+|+++|+.++..+.....+...+ ...+.+++.+.
T Consensus        12 ~~y~I~~I~~gd~~~~~~~sPL~------~pGv~v~~GD~I~aInG~~v~~~~~~~~lL~~~agk~V~Ltv~~~   79 (88)
T PF14685_consen   12 GGYRIARIYPGDPWNPNARSPLA------QPGVDVREGDYILAINGQPVTADANPYRLLEGKAGKQVLLTVNRK   79 (88)
T ss_dssp             TEEEEEEE-BS-TTSSS-B-GGG------GGS----TT-EEEEETTEE-BTTB-HHHHHHTTTTSEEEEEEE-S
T ss_pred             CEEEEEEEeCCCCCCccccCCcc------CCCCCCCCCCEEEEECCEECCCCCCHHHHhcccCCCEEEEEEecC
Confidence            67889999776        5555      5554  599999999999999886666555554 44667777653


No 42 
>KOG3532|consensus
Probab=96.05  E-value=0.017  Score=53.73  Aligned_cols=59  Identities=15%  Similarity=0.024  Sum_probs=46.8

Q ss_pred             CcEEEEEEccCCcccccccccccCCCCCcEEEecCceeecchhhhhhhhcCCCCeeEEEEEEeeE
Q psy18070         94 HGVLIWRVMYNSPAYFIKFRTSAGIKPTSSRLLGECLAQYTTSKLVVWSINHPSITCHILLRLYL  158 (169)
Q Consensus        94 ~Gv~V~~V~~~spA~~~~~~~~aGL~~GDvI~~~~~v~~~~~~~~~~~~~~~~~~~~~~i~r~~~  158 (169)
                      .-|-|..|.+++||.      ++-+++|||+++++++++.+..+.......-.+.+.....|...
T Consensus       398 ~~v~v~tv~~ns~a~------k~~~~~gdvlvai~~~pi~s~~q~~~~~~s~~~~~~~l~~~~~~  456 (1051)
T KOG3532|consen  398 RAVKVCTVEDNSLAD------KAAFKPGDVLVAINNVPIRSERQATRFLQSTTGDLTVLVERSLD  456 (1051)
T ss_pred             eEEEEEEecCCChhh------HhcCCCcceEEEecCccchhHHHHHHHHHhcccceEEEEeeccc
Confidence            457799999999999      99999999999999999998888776655556665444444433


No 43 
>PF00949 Peptidase_S7:  Peptidase S7, Flavivirus NS3 serine protease ;  InterPro: IPR001850 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. 
They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This signature identifies serine peptidases belonging to MEROPS peptidase family S7 (flavivirin family, clan PA(S)). The protein fold of the peptidase domain for members of this family resembles that of chymotrypsin, the type example for clan PA.  Flaviviruses produce a polyprotein from the ssRNA genome. The N terminus of the NS3 protein (approx. 180 aa) is required for the processing of the polyprotein. NS3 also has conserved homology with NTP-binding proteins and the DEAD family of RNA helicases [, , ].; GO: 0003723 RNA binding, 0003724 RNA helicase activity, 0005524 ATP binding; PDB: 2IJO_B 3E90_D 2GGV_B 2FP7_B 2WV9_A 3U1I_B 3U1J_B 2WZQ_A 2WHX_A 3L6P_A ....
Probab=96.01  E-value=0.0075  Score=45.34  Aligned_cols=29  Identities=31%  Similarity=0.598  Sum_probs=21.1

Q ss_pred             eccccCCCCccceEEcCCccEEEEEeeec
Q psy18070          4 TGIMVKFGNSGGPLVNLDGEVIGINSMKV   32 (169)
Q Consensus         4 ~da~in~GnSGGplvn~~G~vvGi~~~~~   32 (169)
                      .|.-+.+|.||.|++|.+|++||+-....
T Consensus        90 ~~~d~~~GsSGSpi~n~~g~ivGlYg~g~  118 (132)
T PF00949_consen   90 IDLDFPKGSSGSPIFNQNGEIVGLYGNGV  118 (132)
T ss_dssp             E---S-TTGTT-EEEETTSCEEEEEEEEE
T ss_pred             eecccCCCCCCCceEcCCCcEEEEEccce
Confidence            34558999999999999999999976554


No 44 
>PF00944 Peptidase_S3:  Alphavirus core protein ;  InterPro: IPR000930 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. Togavirin, also known as Sindbis virus core endopeptidase, is a serine protease resident at the N terminus of the p130 polyprotein of togaviruses []. The endopeptidase signature identifies the peptidase as belonging to the MEROPS peptidase family S3 (togavirin family, clan PA(S)). The polyprotein also includes structural proteins for the nucleocapsid core and for the glycoprotein spikes []. Togavirin is only active while part of the polyprotein; cleavage at a Trp-Ser bond results in total lack of activity []. Mutagenesis studies have identified the location of the His-Asp-Ser catalytic triad, and X-ray studies have revealed the protein fold to be similar to that of chymotrypsin [, ].; GO: 0004252 serine-type endopeptidase activity, 0006508 proteolysis, 0016020 membrane; PDB: 2YEW_D 1EP5_A 3J0C_F 1EP6_C 1WYK_D 1DYL_A 1VCQ_B 1VCP_B 1LD4_D 1KXA_A ....
Probab=95.41  E-value=0.022  Score=43.05  Aligned_cols=36  Identities=28%  Similarity=0.345  Sum_probs=29.2

Q ss_pred             cccCCCCccceEEcCCccEEEEEeeeccCCeEEEEe
Q psy18070          6 IMVKFGNSGGPLVNLDGEVIGINSMKVTAGISFAIP   41 (169)
Q Consensus         6 a~in~GnSGGplvn~~G~vvGi~~~~~~~~~~faiP   41 (169)
                      ..-+||.||-|++|-.|+||||+-...++|--.++.
T Consensus       101 g~g~~GDSGRpi~DNsGrVVaIVLGG~neG~RTaLS  136 (158)
T PF00944_consen  101 GVGKPGDSGRPIFDNSGRVVAIVLGGANEGRRTALS  136 (158)
T ss_dssp             TS-STTSTTEEEESTTSBEEEEEEEEEEETTEEEEE
T ss_pred             CCCCCCCCCCccCcCCCCEEEEEecCCCCCCceEEE
Confidence            345899999999999999999998887766555554


No 45 
>KOG1421|consensus
Probab=95.18  E-value=0.27  Score=46.16  Aligned_cols=115  Identities=16%  Similarity=0.093  Sum_probs=73.2

Q ss_pred             CCCCccceEEcCCccEEEEEeeecc---CC----eEEEEehhhHHHHHHhhhhcCCccceeecceecEEEEECCHHHHHH
Q psy18070          9 KFGNSGGPLVNLDGEVIGINSMKVT---AG----ISFAIPIDYAIEFLTNYKRKDIDRTITHKKYIGITMLTLNEKLIEQ   81 (169)
Q Consensus         9 n~GnSGGplvn~~G~vvGi~~~~~~---~~----~~faiP~~~i~~~l~~l~~~g~~~~~~~~~~lGi~~~~l~~~~~~~   81 (169)
                      ..++| |-+.|-+|+|+++=-.-..   ++    +-|-+.+..+..+++.|+.++    ......+|+.+..++-..++.
T Consensus       677 T~c~s-g~ltdddg~vvalwl~~~ge~~~~kd~~y~~gl~~~~~l~vl~rlk~g~----~~rp~i~~vef~~i~laqar~  751 (955)
T KOG1421|consen  677 TSCLS-GRLTDDDGEVVALWLSVVGEDVGGKDYTYKYGLSMSYILPVLERLKLGP----SARPTIAGVEFSHITLAQART  751 (955)
T ss_pred             ccccc-eEEECCCCeEEEEEeeeeccccCCceeEEEeccchHHHHHHHHHHhcCC----CCCceeeccceeeEEeehhhc
Confidence            34455 4899999999995332222   22    346667778999999999887    444555677776666555555


Q ss_pred             hhcccC------CCCCCC-CcEEEEEEccCCcccccccccccCCCCCcEEEecCceeecchhh
Q psy18070         82 LRRDRH------IPYDLT-HGVLIWRVMYNSPAYFIKFRTSAGIKPTSSRLLGECLAQYTTSK  137 (169)
Q Consensus        82 ~~~~~~------~~~~~~-~Gv~V~~V~~~spA~~~~~~~~aGL~~GDvI~~~~~v~~~~~~~  137 (169)
                      +|++..      -..... +=.+|+.|.+.-+..         |..||||+++|+.-+..-.+
T Consensus       752 lglp~e~imk~e~es~~~~ql~~ishv~~~~~ki---------l~~gdiilsvngk~itr~~d  805 (955)
T KOG1421|consen  752 LGLPSEFIMKSEEESTIPRQLYVISHVRPLLHKI---------LGVGDIILSVNGKMITRLSD  805 (955)
T ss_pred             cCCCHHHHhhhhhcCCCcceEEEEEeeccCcccc---------cccccEEEEecCeEEeeehh
Confidence            554311      000111 235678898766544         99999999988776554433


No 46 
>PF05579 Peptidase_S32:  Equine arteritis virus serine endopeptidase S32;  InterPro: IPR008760 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. 
They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This group of serine peptidases belongs to MEROPS peptidase family S32 (clan PA(S)). The type example is equine arteritis virus serine endopeptidase (equine arteritis virus), which is involved in processing of nidovirus polyproteins [].; GO: 0004252 serine-type endopeptidase activity, 0016032 viral reproduction, 0019082 viral protein processing; PDB: 3FAN_A 3FAO_A 1MBM_A.
Probab=94.86  E-value=0.034  Score=46.51  Aligned_cols=33  Identities=30%  Similarity=0.527  Sum_probs=25.4

Q ss_pred             cCCCCccceEEcCCccEEEEEeeeccCCeEEEE
Q psy18070          8 VKFGNSGGPLVNLDGEVIGINSMKVTAGISFAI   40 (169)
Q Consensus         8 in~GnSGGplvn~~G~vvGi~~~~~~~~~~fai   40 (169)
                      .+||+||.|++..+|.+|||.+.+...|.++.-
T Consensus       205 T~~GDSGSPVVt~dg~liGVHTGSn~~G~g~vT  237 (297)
T PF05579_consen  205 TGPGDSGSPVVTEDGDLIGVHTGSNKRGSGAVT  237 (297)
T ss_dssp             S-GGCTT-EEEETTC-EEEEEEEEETTTEEEEE
T ss_pred             cCCCCCCCccCcCCCCEEEEEecCCCcCceEEE
Confidence            479999999999999999999988776666543


No 47 
>PRK09681 putative type II secretion protein GspC; Provisional
Probab=94.84  E-value=0.052  Score=45.57  Aligned_cols=60  Identities=10%  Similarity=0.063  Sum_probs=41.9

Q ss_pred             CcEEEEEEccCCcccccccccccCCCCCcEEEecCceeecchhhh--hhhhcCCCCeeEEEEEEe
Q psy18070         94 HGVLIWRVMYNSPAYFIKFRTSAGIKPTSSRLLGECLAQYTTSKL--VVWSINHPSITCHILLRL  156 (169)
Q Consensus        94 ~Gv~V~~V~~~spA~~~~~~~~aGL~~GDvI~~~~~v~~~~~~~~--~~~~~~~~~~~~~~i~r~  156 (169)
                      .|+.=-.|.|+.+++   .-.++|||+|||++++|++...+..+.  +...+.....+.+++.|.
T Consensus       204 ~Gl~GYrl~Pgkd~~---lF~~~GLq~GDva~sING~dL~D~~qa~~l~~~L~~~tei~ltVeRd  265 (276)
T PRK09681        204 EGIVGYAVKPGADRS---LFDASGFKEGDIAIALNQQDFTDPRAMIALMRQLPSMDSIQLTVLRK  265 (276)
T ss_pred             CCceEEEECCCCcHH---HHHHcCCCCCCEEEEeCCeeCCCHHHHHHHHHHhccCCeEEEEEEEC
Confidence            462224567775543   123799999999999999987766543  344566777889999995


No 48 
>KOG3571|consensus
Probab=94.36  E-value=0.12  Score=46.79  Aligned_cols=59  Identities=10%  Similarity=0.074  Sum_probs=41.7

Q ss_pred             CCCcEEEEEEccCCccccccccccc-CCCCCcEEEecCceeecch-----hhhhhhhcCCCCeeEEEEEEe
Q psy18070         92 LTHGVLIWRVMYNSPAYFIKFRTSA-GIKPTSSRLLGECLAQYTT-----SKLVVWSINHPSITCHILLRL  156 (169)
Q Consensus        92 ~~~Gv~V~~V~~~spA~~~~~~~~a-GL~~GDvI~~~~~v~~~~~-----~~~~~~~~~~~~~~~~~i~r~  156 (169)
                      ...|++|.++.+++.-+      .- -|.+||.|+.+|.+...+-     .+.+.....+++.+.+++-+-
T Consensus       275 gDggIYVgsImkgGAVA------~DGRIe~GDMiLQVNevsFENmSNd~AVrvLREaV~~~gPi~ltvAk~  339 (626)
T KOG3571|consen  275 GDGGIYVGSIMKGGAVA------LDGRIEPGDMILQVNEVSFENMSNDQAVRVLREAVSRPGPIKLTVAKC  339 (626)
T ss_pred             CCCceEEeeeccCceee------ccCccCccceEEEeeecchhhcCchHHHHHHHHHhccCCCeEEEEeec
Confidence            34799999999999777      44 4999999999887754332     233333456777777776553


No 49 
>KOG3580|consensus
Probab=93.77  E-value=0.082  Score=48.87  Aligned_cols=36  Identities=19%  Similarity=0.195  Sum_probs=32.6

Q ss_pred             CcEEEEEEccCCcccccccccccCCCCCcEEEecCceeecch
Q psy18070         94 HGVLIWRVMYNSPAYFIKFRTSAGIKPTSSRLLGECLAQYTT  135 (169)
Q Consensus        94 ~Gv~V~~V~~~spA~~~~~~~~aGL~~GDvI~~~~~v~~~~~  135 (169)
                      -|+.|..|..+|||+      +-||+.||-|+.+|.+...+-
T Consensus       429 VGIFVaGvqegspA~------~eGlqEGDQIL~VN~vdF~nl  464 (1027)
T KOG3580|consen  429 VGIFVAGVQEGSPAE------QEGLQEGDQILKVNTVDFRNL  464 (1027)
T ss_pred             eeEEEeecccCCchh------hccccccceeEEeccccchhh
Confidence            599999999999999      999999999999988875554


No 50 
>KOG2921|consensus
Probab=93.03  E-value=0.099  Score=45.99  Aligned_cols=41  Identities=17%  Similarity=0.084  Sum_probs=34.0

Q ss_pred             CCCcEEEEEEccCCcccccccccccCCCCCcEEEecCceeecchhh
Q psy18070         92 LTHGVLIWRVMYNSPAYFIKFRTSAGIKPTSSRLLGECLAQYTTSK  137 (169)
Q Consensus        92 ~~~Gv~V~~V~~~spA~~~~~~~~aGL~~GDvI~~~~~v~~~~~~~  137 (169)
                      ...||.|++|...||+.    +.+ ||.+||+|++.++.++.+..+
T Consensus       218 ~g~gV~Vtev~~~Spl~----gpr-GL~vgdvitsldgcpV~~v~d  258 (484)
T KOG2921|consen  218 HGEGVTVTEVPSVSPLF----GPR-GLSVGDVITSLDGCPVHKVSD  258 (484)
T ss_pred             cCceEEEEeccccCCCc----Ccc-cCCccceEEecCCcccCCHHH
Confidence            34699999999999998    334 999999999988887777644


No 51 
>KOG3605|consensus
Probab=92.29  E-value=0.26  Score=45.84  Aligned_cols=82  Identities=13%  Similarity=0.091  Sum_probs=59.0

Q ss_pred             EEEehhhHHHHHHhhhhcCCccceeec----ceecEEEEECCHHHHHHhhcccCCCCCCCCcEEEEEEccCCcccccccc
Q psy18070         38 FAIPIDYAIEFLTNYKRKDIDRTITHK----KYIGITMLTLNEKLIEQLRRDRHIPYDLTHGVLIWRVMYNSPAYFIKFR  113 (169)
Q Consensus        38 faiP~~~i~~~l~~l~~~g~~~~~~~~----~~lGi~~~~l~~~~~~~~~~~~~~~~~~~~Gv~V~~V~~~spA~~~~~~  113 (169)
                      --+|.+..+.+++.++++-    .++.    .--=.++.-.-|+++.+||.      ....||+++=. .|+-|+     
T Consensus       707 VGLPLstcQs~Ik~~KnQT----~VkltiV~cpPV~~V~I~RPd~kyQLGF------SVQNGiICSLl-RGGIAE-----  770 (829)
T KOG3605|consen  707 VGLPLSTCQSIIKGLKNQT----AVKLNIVSCPPVTTVLIRRPDLRYQLGF------SVQNGIICSLL-RGGIAE-----  770 (829)
T ss_pred             ccccHHHHHHHHhcccccc----eEEEEEecCCCceEEEeecccchhhccc------eeeCcEeehhh-cccchh-----
Confidence            3478888999998888775    3222    11112233334778888887      45589988755 589999     


Q ss_pred             cccCCCCCcEEEecCceeecchh
Q psy18070        114 TSAGIKPTSSRLLGECLAQYTTS  136 (169)
Q Consensus       114 ~~aGL~~GDvI~~~~~v~~~~~~  136 (169)
                       |.|+|.|-.|+++|+..+..+-
T Consensus       771 -RGGVRVGHRIIEINgQSVVA~p  792 (829)
T KOG3605|consen  771 -RGGVRVGHRIIEINGQSVVATP  792 (829)
T ss_pred             -ccCceeeeeEEEECCceEEecc
Confidence             9999999999999988877663


No 52 
>KOG3550|consensus
Probab=92.02  E-value=0.31  Score=37.70  Aligned_cols=36  Identities=17%  Similarity=0.115  Sum_probs=32.0

Q ss_pred             CcEEEEEEccCCcccccccccc-cCCCCCcEEEecCceeecch
Q psy18070         94 HGVLIWRVMYNSPAYFIKFRTS-AGIKPTSSRLLGECLAQYTT  135 (169)
Q Consensus        94 ~Gv~V~~V~~~spA~~~~~~~~-aGL~~GDvI~~~~~v~~~~~  135 (169)
                      +-++|+.+.||+-|+      + .||+.||-++++|++.+...
T Consensus       115 spiyisriipggvad------rhgglkrgdqllsvngvsvege  151 (207)
T KOG3550|consen  115 SPIYISRIIPGGVAD------RHGGLKRGDQLLSVNGVSVEGE  151 (207)
T ss_pred             CceEEEeecCCcccc------ccCcccccceeEeecceeecch
Confidence            579999999999999      6 58999999999999887655


No 53 
>KOG3580|consensus
Probab=91.70  E-value=0.37  Score=44.73  Aligned_cols=64  Identities=11%  Similarity=0.081  Sum_probs=42.8

Q ss_pred             CCCCcEEEEEEccCCcccccccccccCCCCCcEEEecCceeecchhh-hhhhhcCCCCeeEEEEEEeeEEee
Q psy18070         91 DLTHGVLIWRVMYNSPAYFIKFRTSAGIKPTSSRLLGECLAQYTTSK-LVVWSINHPSITCHILLRLYLLVC  161 (169)
Q Consensus        91 ~~~~Gv~V~~V~~~spA~~~~~~~~aGL~~GDvI~~~~~v~~~~~~~-~~~~~~~~~~~~~~~i~r~~~~v~  161 (169)
                      +..+-++|++|.||+||+       .-||.||-|+.+|++...+... +++...++.+...-+..++-..|+
T Consensus        37 ~getSiViSDVlpGGPAe-------G~LQenDrvvMVNGvsMenv~haFAvQqLrksgK~A~ItvkRprkvq  101 (1027)
T KOG3580|consen   37 NGETSIVISDVLPGGPAE-------GLLQENDRVVMVNGVSMENVLHAFAVQQLRKSGKVAAITVKRPRKVQ  101 (1027)
T ss_pred             CCceeEEEeeccCCCCcc-------cccccCCeEEEEcCcchhhhHHHHHHHHHHhhccceeEEecccceee
Confidence            445679999999999999       4599999999999997666533 233344444444333333333333


No 54 
>KOG3209|consensus
Probab=91.56  E-value=0.23  Score=46.74  Aligned_cols=51  Identities=18%  Similarity=0.081  Sum_probs=36.8

Q ss_pred             EEEEccCCcccccccccccC-CCCCcEEEecCceeecchhhhh--hhhcCCCCeeEEEEE
Q psy18070         98 IWRVMYNSPAYFIKFRTSAG-IKPTSSRLLGECLAQYTTSKLV--VWSINHPSITCHILL  154 (169)
Q Consensus        98 V~~V~~~spA~~~~~~~~aG-L~~GDvI~~~~~v~~~~~~~~~--~~~~~~~~~~~~~i~  154 (169)
                      |-+|.+||||+      +.| |+.||.|+++|+..+.+-....  .+..+..-.+.++|.
T Consensus       782 iGrIieGSPAd------RCgkLkVGDrilAVNG~sI~~lsHadiv~LIKdaGlsVtLtIi  835 (984)
T KOG3209|consen  782 IGRIIEGSPAD------RCGKLKVGDRILAVNGQSILNLSHADIVSLIKDAGLSVTLTII  835 (984)
T ss_pred             ccccccCChhH------hhccccccceEEEecCeeeeccCchhHHHHHHhcCceEEEEEc
Confidence            77889999999      876 9999999999998887764433  233333444455553


No 55 
>COG3591 V8-like Glu-specific endopeptidase [Amino acid transport and metabolism]
Probab=90.88  E-value=0.24  Score=41.07  Aligned_cols=32  Identities=25%  Similarity=0.270  Sum_probs=28.5

Q ss_pred             EeeccccCCCCccceEEcCCccEEEEEeeecc
Q psy18070          2 SLTGIMVKFGNSGGPLVNLDGEVIGINSMKVT   33 (169)
Q Consensus         2 iq~da~in~GnSGGplvn~~G~vvGi~~~~~~   33 (169)
                      ++-|+-+-||+||.|+++.+.+|+|+......
T Consensus       194 l~y~~dT~pG~SGSpv~~~~~~vigv~~~g~~  225 (251)
T COG3591         194 LFYDADTLPGSSGSPVLISKDEVIGVHYNGPG  225 (251)
T ss_pred             EEEEecccCCCCCCceEecCceEEEEEecCCC
Confidence            67788999999999999999999999887654


No 56 
>KOG3542|consensus
Probab=90.61  E-value=0.2  Score=47.04  Aligned_cols=57  Identities=14%  Similarity=0.031  Sum_probs=39.1

Q ss_pred             CCcEEEEEEccCCcccccccccccCCCCCcEEEecCceeecch-hhhhhhhcCCCCeeEEEEEE
Q psy18070         93 THGVLIWRVMYNSPAYFIKFRTSAGIKPTSSRLLGECLAQYTT-SKLVVWSINHPSITCHILLR  155 (169)
Q Consensus        93 ~~Gv~V~~V~~~spA~~~~~~~~aGL~~GDvI~~~~~v~~~~~-~~~~~~~~~~~~~~~~~i~r  155 (169)
                      ..|++|.+|.|+|-|+      +.||+.||-|+++|+....+- +..+........-..+++.-
T Consensus       561 GfgifV~~V~pgskAa------~~GlKRgDqilEVNgQnfenis~~KA~eiLrnnthLtltvKt  618 (1283)
T KOG3542|consen  561 GFGIFVAEVFPGSKAA------REGLKRGDQILEVNGQNFENISAKKAEEILRNNTHLTLTVKT  618 (1283)
T ss_pred             cceeEEeeecCCchHH------HhhhhhhhhhhhccccchhhhhHHHHHHHhcCCceEEEEEec
Confidence            3589999999999999      999999999999887654433 22233333333333444433


No 57 
>KOG3834|consensus
Probab=90.15  E-value=0.39  Score=42.61  Aligned_cols=63  Identities=16%  Similarity=0.059  Sum_probs=47.3

Q ss_pred             EEEEEEccCCcccccccccccCCC-CCcEEEec-CceeecchhhhhhhhcCCCCeeEEEEEEeeEEeeccc
Q psy18070         96 VLIWRVMYNSPAYFIKFRTSAGIK-PTSSRLLG-ECLAQYTTSKLVVWSINHPSITCHILLRLYLLVCSEL  164 (169)
Q Consensus        96 v~V~~V~~~spA~~~~~~~~aGL~-~GDvI~~~-~~v~~~~~~~~~~~~~~~~~~~~~~i~r~~~~v~~~~  164 (169)
                      .-|-+|.++|||+      .|||+ -+|-|+.+ +.+-..++..+.+...+....+++.+|.-+.-.|.++
T Consensus       111 wHvl~V~p~SPaa------lAgl~~~~DYivG~~~~~~~~~eDl~~lIeshe~kpLklyVYN~D~d~~ReV  175 (462)
T KOG3834|consen  111 WHVLSVEPNSPAA------LAGLRPYTDYIVGIWDAVMHEEEDLFTLIESHEGKPLKLYVYNHDTDSCREV  175 (462)
T ss_pred             eeeeecCCCCHHH------hcccccccceEecchhhhccchHHHHHHHHhccCCCcceeEeecCCCccceE
Confidence            4467889999999      99999 68999987 6665555555555666677788888888777666554


No 58 
>COG3480 SdrC Predicted secreted protein containing a PDZ domain [Signal transduction mechanisms]
Probab=89.83  E-value=0.95  Score=38.87  Aligned_cols=55  Identities=16%  Similarity=0.137  Sum_probs=41.8

Q ss_pred             CcEEEEEEccCCcccccccccccCCCCCcEEEecCceeecchhhhhhh--hcCCCCeeEEEEEE
Q psy18070         94 HGVLIWRVMYNSPAYFIKFRTSAGIKPTSSRLLGECLAQYTTSKLVVW--SINHPSITCHILLR  155 (169)
Q Consensus        94 ~Gv~V~~V~~~spA~~~~~~~~aGL~~GDvI~~~~~v~~~~~~~~~~~--~~~~~~~~~~~i~r  155 (169)
                      .||++..|..+|||.       .-|+.||-|+++++.+..+..++.-.  +.+....+.+...|
T Consensus       130 ~gvyv~~v~~~~~~~-------gkl~~gD~i~avdg~~f~s~~e~i~~v~~~k~Gd~VtI~~~r  186 (342)
T COG3480         130 AGVYVLSVIDNSPFK-------GKLEAGDTIIAVDGEPFTSSDELIDYVSSKKPGDEVTIDYER  186 (342)
T ss_pred             eeEEEEEccCCcchh-------ceeccCCeEEeeCCeecCCHHHHHHHHhccCCCCeEEEEEEe
Confidence            699999999999998       45999999999988887777666633  33334455666665


No 59 
>PF02907 Peptidase_S29:  Hepatitis C virus NS3 protease;  InterPro: IPR004109 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base. The geometric orientations of the catalytic residues are similar between families, despite different protein folds. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC). This signature identifies the Hepatitis C virus NS3 protein as a serine protease which belongs to MEROPS peptidase family S29 (hepacivirin family, clan PA(S)), which has a trypsin-like fold. The non-structural (NS) protein NS3 is one of the NS proteins involved in replication of the HCV genome. The NS2 proteinase (IPR002518 from INTERPRO), a zinc-dependent enzyme, performs a single proteolytic cut to release the N terminus of NS3. The action of NS3 proteinase (NS3P), which resides in the N-terminal one-third of the NS3 protein, then yields all remaining non-structural proteins. The C-terminal two-thirds of the NS3 protein contain a helicase. The functional relationship between the proteinase and helicase domains is unknown. NS3 has a structural zinc-binding site and requires cofactor NS4.
It has been suggested that the NS3 serine protease of hepatitis C is involved in cell transformation and that the ability to transform requires an active enzyme.; GO: 0008236 serine-type peptidase activity, 0006508 proteolysis, 0019087 transformation of host cell by virus; PDB: 2QV1_B 3LOX_C 2OBQ_C 2OC1_C 2OC0_A 3LON_A 3KNX_A 2O8M_A 2OBO_A 2OC8_A ....
Probab=88.47  E-value=0.49  Score=35.81  Aligned_cols=32  Identities=34%  Similarity=0.596  Sum_probs=22.3

Q ss_pred             CCCCccceEEcCCccEEEEEeeecc-CCeEEEE
Q psy18070          9 KFGNSGGPLVNLDGEVIGINSMKVT-AGISFAI   40 (169)
Q Consensus         9 n~GnSGGplvn~~G~vvGi~~~~~~-~~~~fai   40 (169)
                      -.|.||||++-.+|.+|||-.+... .+..-+|
T Consensus       106 lkGSSGgPiLC~~GH~vG~f~aa~~trgvak~i  138 (148)
T PF02907_consen  106 LKGSSGGPILCPSGHAVGMFRAAVCTRGVAKAI  138 (148)
T ss_dssp             HTT-TT-EEEETTSEEEEEEEEEEEETTEEEEE
T ss_pred             EecCCCCcccCCCCCEEEEEEEEEEcCCceeeE
Confidence            3699999999999999999876653 3444333


No 60 
>PF00947 Pico_P2A:  Picornavirus core protein 2A;  InterPro: IPR000081 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. 
Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad. This domain defines cysteine peptidases belonging to MEROPS peptidase family C3 (picornain, clan PA(C)), subfamilies 3CA and 3CB. The protein fold of this peptidase domain for members of this family resembles that of the serine peptidase, chymotrypsin, the type example for clan PA. Picornaviral proteins are expressed as a single polyprotein which is cleaved by the viral 3C cysteine protease. The poliovirus polyprotein is selectively cleaved at the Gln-|-Gly bond. In other picornavirus reactions Glu may be substituted for Gln, and Ser or Thr for Gly.; GO: 0008233 peptidase activity, 0006508 proteolysis, 0016032 viral reproduction; PDB: 2HRV_B 1Z8R_A.
Probab=88.13  E-value=0.95  Score=33.76  Aligned_cols=35  Identities=29%  Similarity=0.304  Sum_probs=25.2

Q ss_pred             eccccCCCCccceEEcCCccEEEEEeeeccCCeEEE
Q psy18070          4 TGIMVKFGNSGGPLVNLDGEVIGINSMKVTAGISFA   39 (169)
Q Consensus         4 ~da~in~GnSGGplvn~~G~vvGi~~~~~~~~~~fa   39 (169)
                      ...+..||.-||+|+ .+--|+||.+++...-.+|+
T Consensus        83 g~Gp~~PGdCGg~L~-C~HGViGi~Tagg~g~VaF~  117 (127)
T PF00947_consen   83 GEGPAEPGDCGGILR-CKHGVIGIVTAGGEGHVAFA  117 (127)
T ss_dssp             EE-SSSTT-TCSEEE-ETTCEEEEEEEEETTEEEEE
T ss_pred             ecccCCCCCCCceeE-eCCCeEEEEEeCCCceEEEE
Confidence            345789999999999 55569999999875434443


No 61 
>KOG3209|consensus
Probab=84.78  E-value=2.1  Score=40.62  Aligned_cols=59  Identities=14%  Similarity=0.118  Sum_probs=44.9

Q ss_pred             CCCcEEEEEEccCCcccccccccccC-CCCCcEEEecCceeecch-hhhhhhhcCCCCeeEEEEEEe
Q psy18070         92 LTHGVLIWRVMYNSPAYFIKFRTSAG-IKPTSSRLLGECLAQYTT-SKLVVWSINHPSITCHILLRL  156 (169)
Q Consensus        92 ~~~Gv~V~~V~~~spA~~~~~~~~aG-L~~GDvI~~~~~v~~~~~-~~~~~~~~~~~~~~~~~i~r~  156 (169)
                      +.-+++|.+..+++||.      +.| ++.||-|+++|+...... -..++..++..+...++++|+
T Consensus       921 ynM~LfVLRlAeDGPA~------rdGrm~VGDqi~eINGesTkgmtH~rAIelIk~gg~~vll~Lr~  981 (984)
T KOG3209|consen  921 YNMDLFVLRLAEDGPAI------RDGRMRVGDQITEINGESTKGMTHDRAIELIKQGGRRVLLLLRR  981 (984)
T ss_pred             cccceEEEEeccCCCcc------ccCceeecceEEEecCcccCCCcHHHHHHHHHhCCeEEEEEecc
Confidence            34579999999999999      876 999999999988765554 334556677777666666664


No 62 
>KOG0606|consensus
Probab=83.37  E-value=0.95  Score=44.56  Aligned_cols=34  Identities=18%  Similarity=0.025  Sum_probs=28.9

Q ss_pred             EEEEEEccCCcccccccccccCCCCCcEEEecCceeecch
Q psy18070         96 VLIWRVMYNSPAYFIKFRTSAGIKPTSSRLLGECLAQYTT  135 (169)
Q Consensus        96 v~V~~V~~~spA~~~~~~~~aGL~~GDvI~~~~~v~~~~~  135 (169)
                      -.|+.|.++|||.      .+|+++||.|+.+++..+...
T Consensus       660 h~v~sv~egsPA~------~agls~~DlIthvnge~v~gl  693 (1205)
T KOG0606|consen  660 HSVGSVEEGSPAF------EAGLSAGDLITHVNGEPVHGL  693 (1205)
T ss_pred             eeeeeecCCCCcc------ccCCCccceeEeccCcccchh
Confidence            5789999999999      899999999999886554443


No 63 
>KOG3651|consensus
Probab=81.01  E-value=2.1  Score=36.76  Aligned_cols=42  Identities=19%  Similarity=0.169  Sum_probs=35.3

Q ss_pred             CcEEEEEEccCCcccccccccccC-CCCCcEEEecCceeecchhhhhhh
Q psy18070         94 HGVLIWRVMYNSPAYFIKFRTSAG-IKPTSSRLLGECLAQYTTSKLVVW  141 (169)
Q Consensus        94 ~Gv~V~~V~~~spA~~~~~~~~aG-L~~GDvI~~~~~v~~~~~~~~~~~  141 (169)
                      .=++|..|-.++||+      +-| ++.||-|+++|++.+...-+..+.
T Consensus        30 PClYiVQvFD~tPAa------~dG~i~~GDEi~avNg~svKGktKveVA   72 (429)
T KOG3651|consen   30 PCLYIVQVFDKTPAA------KDGRIRCGDEIVAVNGISVKGKTKVEVA   72 (429)
T ss_pred             CeEEEEEeccCCchh------ccCccccCCeeEEecceeecCccHHHHH
Confidence            468999999999999      765 999999999999988776555544


No 64 
>COG3031 PulC Type II secretory pathway, component PulC [Intracellular trafficking and secretion]
Probab=80.77  E-value=2.2  Score=35.35  Aligned_cols=56  Identities=11%  Similarity=0.006  Sum_probs=37.6

Q ss_pred             cEEEEEEccCCcccccccccccCCCCCcEEEecCceeecchhh--hhhhhcCCCCeeEEEEEEe
Q psy18070         95 GVLIWRVMYNSPAYFIKFRTSAGIKPTSSRLLGECLAQYTTSK--LVVWSINHPSITCHILLRL  156 (169)
Q Consensus        95 Gv~V~~V~~~spA~~~~~~~~aGL~~GDvI~~~~~v~~~~~~~--~~~~~~~~~~~~~~~i~r~  156 (169)
                      |-.+.=..++|.-+      ..|||+||+-+++|+....+..+  .++..+......++++.|+
T Consensus       208 Gyr~~pgkd~slF~------~sglq~GDIavaiNnldltdp~~m~~llq~l~~m~s~qlTv~R~  265 (275)
T COG3031         208 GYRFEPGKDGSLFY------KSGLQRGDIAVAINNLDLTDPEDMFRLLQMLRNMPSLQLTVIRR  265 (275)
T ss_pred             EEEecCCCCcchhh------hhcCCCcceEEEecCcccCCHHHHHHHHHhhhcCcceEEEEEec
Confidence            33344444555555      89999999999988876555433  3344556667778888885


No 65 
>PF00863 Peptidase_C4:  Peptidase family C4 This family belongs to family C4 of the peptidase classification.;  InterPro: IPR001730 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. 
These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad. Nuclear inclusion A (NIA) proteases from potyviruses are cysteine peptidases belonging to the MEROPS peptidase family C4 (NIa protease family, clan PA(C)). Potyviruses include plant viruses in which the single-stranded RNA encodes a polyprotein with NIA protease activity, where proteolytic cleavage is specific for Gln+Gly sites. The NIA protease acts on the polyprotein, releasing itself by Gln+Gly cleavage at both the N- and C-termini. It further processes the polyprotein by cleavage at five similar sites in the C-terminal half of the sequence. In addition to its C-terminal protease activity, the NIA protease contains an N-terminal domain that has been implicated in the transcription process. This peptidase is present in the nuclear inclusion protein of potyviruses.; GO: 0008234 cysteine-type peptidase activity, 0006508 proteolysis; PDB: 3MMG_B 1Q31_B 1LVB_A 1LVM_A.
Probab=79.48  E-value=3.4  Score=34.01  Aligned_cols=40  Identities=30%  Similarity=0.532  Sum_probs=23.4

Q ss_pred             eccccCCCCccceEEcCC-ccEEEEEeeecc-CCeEEEEehh
Q psy18070          4 TGIMVKFGNSGGPLVNLD-GEVIGINSMKVT-AGISFAIPID   43 (169)
Q Consensus         4 ~da~in~GnSGGplvn~~-G~vvGi~~~~~~-~~~~faiP~~   43 (169)
                      +-.+...|.=|.|+|+.. |.+||+-++... ...+|+.|+.
T Consensus       144 HwIsTk~G~CG~PlVs~~Dg~IVGiHsl~~~~~~~N~F~~f~  185 (235)
T PF00863_consen  144 HWISTKDGDCGLPLVSTKDGKIVGIHSLTSNTSSRNYFTPFP  185 (235)
T ss_dssp             E-C---TT-TT-EEEETTT--EEEEEEEEETTTSSEEEEE--
T ss_pred             EEecCCCCccCCcEEEcCCCcEEEEEcCccCCCCeEEEEcCC
Confidence            345678999999999975 999999998764 4456766653


No 66 
>KOG3552|consensus
Probab=75.98  E-value=4  Score=39.86  Aligned_cols=55  Identities=11%  Similarity=0.044  Sum_probs=37.4

Q ss_pred             CcEEEEEEccCCcccccccccccCCCCCcEEEecCceeecchhhhh-hhhc-CCCCeeEEEEEE
Q psy18070         94 HGVLIWRVMYNSPAYFIKFRTSAGIKPTSSRLLGECLAQYTTSKLV-VWSI-NHPSITCHILLR  155 (169)
Q Consensus        94 ~Gv~V~~V~~~spA~~~~~~~~aGL~~GDvI~~~~~v~~~~~~~~~-~~~~-~~~~~~~~~i~r  155 (169)
                      .-|+|..|.+|+|+.       ..|+|||-|+++|+-.+...-+.. +... .-...+.+++.+
T Consensus        75 rPviVr~VT~GGps~-------GKL~PGDQIl~vN~Epv~daprervIdlvRace~sv~ltV~q  131 (1298)
T KOG3552|consen   75 RPVIVRFVTEGGPSI-------GKLQPGDQILAVNGEPVKDAPRERVIDLVRACESSVNLTVCQ  131 (1298)
T ss_pred             CceEEEEecCCCCcc-------ccccCCCeEEEecCcccccccHHHHHHHHHHHhhhcceEEec
Confidence            358999999999999       569999999998877665553222 2222 223445666555


No 67 
>KOG1892|consensus
Probab=75.67  E-value=3.1  Score=40.82  Aligned_cols=38  Identities=13%  Similarity=0.107  Sum_probs=30.9

Q ss_pred             CCCCcEEEEEEccCCcccccccccccC-CCCCcEEEecCceeecc
Q psy18070         91 DLTHGVLIWRVMYNSPAYFIKFRTSAG-IKPTSSRLLGECLAQYT  134 (169)
Q Consensus        91 ~~~~Gv~V~~V~~~spA~~~~~~~~aG-L~~GDvI~~~~~v~~~~  134 (169)
                      +..-|++|.+|.+|++|+      .-| |+.||-++++++.....
T Consensus       957 q~klGIYvKsVV~GgaAd------~DGRL~aGDQLLsVdG~SLiG  995 (1629)
T KOG1892|consen  957 QRKLGIYVKSVVEGGAAD------HDGRLEAGDQLLSVDGHSLIG  995 (1629)
T ss_pred             ccccceEEEEeccCCccc------cccccccCceeeeecCccccc
Confidence            334599999999999999      544 99999999988776443


No 68 
>PF08192 Peptidase_S64:  Peptidase family S64;  InterPro: IPR012985 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base. The geometric orientations of the catalytic residues are similar between families, despite different protein folds. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC). This family of fungal proteins is involved in the processing of the membrane-bound transcription factor Stp1 and belongs to MEROPS peptidase family S64 (clan PA). The processing causes the signalling domain of Stp1 to be passed to the nucleus where several permease genes are induced. The permeases are important for uptake of amino acids, and processing of Stp1 only occurs in an amino acid-rich environment. This family is predicted to be distantly related to the trypsin family (MEROPS peptidase family S1) and to have a typical trypsin-like catalytic triad.
Probab=74.07  E-value=5.1  Score=37.62  Aligned_cols=44  Identities=18%  Similarity=0.224  Sum_probs=34.8

Q ss_pred             cCCCCccceEEcCCcc------EEEEEeeecc--CCeEEEEehhhHHHHHHh
Q psy18070          8 VKFGNSGGPLVNLDGE------VIGINSMKVT--AGISFAIPIDYAIEFLTN   51 (169)
Q Consensus         8 in~GnSGGplvn~~G~------vvGi~~~~~~--~~~~faiP~~~i~~~l~~   51 (169)
                      -.+|.||.-+++.-+.      |+||.++...  -.+|++.|++.+..-|++
T Consensus       636 a~~GDSGS~VLtk~~d~~~gLgvvGMlhsydge~kqfglftPi~~il~rl~~  687 (695)
T PF08192_consen  636 ASGGDSGSWVLTKLEDNNKGLGVVGMLHSYDGEQKQFGLFTPINEILDRLEE  687 (695)
T ss_pred             cCCCCcccEEEecccccccCceeeEEeeecCCccceeeccCcHHHHHHHHHH
Confidence            3579999999998766      9999998654  258999998877666654


No 69 
>COG0750 Predicted membrane-associated Zn-dependent proteases 1 [Cell envelope biogenesis, outer membrane]
Probab=73.83  E-value=7.1  Score=33.29  Aligned_cols=50  Identities=14%  Similarity=0.117  Sum_probs=34.1

Q ss_pred             EEccCCcccccccccccCCCCCcEEEecCceeecchhhhh--hhhcCCCC--eeEEEEEE
Q psy18070        100 RVMYNSPAYFIKFRTSAGIKPTSSRLLGECLAQYTTSKLV--VWSINHPS--ITCHILLR  155 (169)
Q Consensus       100 ~V~~~spA~~~~~~~~aGL~~GDvI~~~~~v~~~~~~~~~--~~~~~~~~--~~~~~i~r  155 (169)
                      .+..+|+|+      .+|+++||.|++.++.+..+-.+..  ........  ...+.+.|
T Consensus       135 ~v~~~s~a~------~a~l~~Gd~iv~~~~~~i~~~~~~~~~~~~~~~~~~~~~~i~~~~  188 (375)
T COG0750         135 EVAPKSAAA------LAGLRPGDRIVAVDGEKVASWDDVRRLLVAAAGDVFNLLTILVIR  188 (375)
T ss_pred             ecCCCCHHH------HcCCCCCCEEEeECCEEccCHHHHHHHHHhccCCcccceEEEEEe
Confidence            688999999      9999999999997777666554332  22222222  25666666


No 70 
>KOG3627|consensus
Probab=72.93  E-value=2.9  Score=33.34  Aligned_cols=26  Identities=42%  Similarity=0.506  Sum_probs=21.5

Q ss_pred             cCCCCccceEEcCC---ccEEEEEeeecc
Q psy18070          8 VKFGNSGGPLVNLD---GEVIGINSMKVT   33 (169)
Q Consensus         8 in~GnSGGplvn~~---G~vvGi~~~~~~   33 (169)
                      ...|+|||||+-.+   ..++||++....
T Consensus       201 ~C~GDSGGPLv~~~~~~~~~~GivS~G~~  229 (256)
T KOG3627|consen  201 ACQGDSGGPLVCEDNGRWVLVGIVSWGSG  229 (256)
T ss_pred             cccCCCCCeEEEeeCCcEEEEEEEEecCC
Confidence            56799999999776   699999987654


No 71 
>PF11874 DUF3394:  Domain of unknown function (DUF3394);  InterPro: IPR021814  This domain is functionally uncharacterised. This domain is found in bacteria. This presumed domain is about 190 amino acids in length. This domain is found associated with PF06808 from PFAM. 
Probab=70.87  E-value=5.7  Score=31.49  Aligned_cols=28  Identities=29%  Similarity=0.184  Sum_probs=25.6

Q ss_pred             CcEEEEEEccCCcccccccccccCCCCCcEEEec
Q psy18070         94 HGVLIWRVMYNSPAYFIKFRTSAGIKPTSSRLLG  127 (169)
Q Consensus        94 ~Gv~V~~V~~~spA~~~~~~~~aGL~~GDvI~~~  127 (169)
                      ..+.|..|..||||+      ++|+.-++.|+++
T Consensus       122 ~~~~Vd~v~fgS~A~------~~g~d~d~~I~~v  149 (183)
T PF11874_consen  122 GKVIVDEVEFGSPAE------KAGIDFDWEITEV  149 (183)
T ss_pred             CEEEEEecCCCCHHH------HcCCCCCcEEEEE
Confidence            568999999999999      9999999999883


No 72 
>PF00571 CBS:  CBS domain CBS domain web page. Mutations in the CBS domain of Swiss:P35520 lead to homocystinuria.;  InterPro: IPR000644 CBS (cystathionine-beta-synthase) domains are small intracellular modules, mostly found in two or four copies within a protein, that occur in a variety of proteins in bacteria, archaea, and eukaryotes [, ]. Tandem pairs of CBS domains can act as binding domains for adenosine derivatives and may regulate the activity of attached enzymatic or other domains []. In some cases, CBS domains may act as sensors of cellular energy status by being activated by AMP and inhibited by ATP []. In chloride ion channels, the CBS domains have been implicated in intracellular targeting and trafficking, as well as in protein-protein interactions, but results vary with different channels: in the CLC-5 channel, the CBS domain was shown to be required for trafficking [], while in the CLC-1 channel, the CBS domain was shown to be critical for channel function, but not necessary for trafficking []. Recent experiments revealing that CBS domains can bind adenosine-containing ligands such as ATP, AMP, or S-adenosylmethionine have led to the hypothesis that CBS domains function as sensors of intracellular metabolites [, ]. Crystallographic studies of CBS domains have shown that pairs of CBS sequences form a globular domain where each CBS unit adopts a beta-alpha-beta-beta-alpha pattern []. Crystal structure of the CBS domains of the AMP-activated protein kinase in complexes with AMP and ATP shows that the phosphate groups of AMP/ATP lie in a surface pocket at the interface of two CBS domains, which is lined with basic residues, many of which are associated with disease-causing mutations [].  
In humans, mutations in conserved residues within CBS domains cause a variety of human hereditary diseases, including (with the gene mutated in parentheses): homocystinuria (cystathionine beta-synthase); Wolff-Parkinson-White syndrome (gamma 2 subunit of AMP-activated protein kinase); retinitis pigmentosa (IMP dehydrogenase-1); congenital myotonia, idiopathic generalized epilepsy, hypercalciuric nephrolithiasis, and classic Bartter syndrome (CLC chloride channel family members).; GO: 0005515 protein binding; PDB: 3JTF_A 3TE5_C 3TDH_C 3T4N_C 2QLV_C 3OI8_A 3LV9_A 2QH1_B 1PVM_B 3LQN_A ....
Probab=67.13  E-value=7.4  Score=23.43  Aligned_cols=20  Identities=45%  Similarity=0.642  Sum_probs=16.8

Q ss_pred             CCccceEEcCCccEEEEEee
Q psy18070         11 GNSGGPLVNLDGEVIGINSM   30 (169)
Q Consensus        11 GnSGGplvn~~G~vvGi~~~   30 (169)
                      +-+.-|++|.+|+++|+.+.
T Consensus        29 ~~~~~~V~d~~~~~~G~is~   48 (57)
T PF00571_consen   29 GISRLPVVDEDGKLVGIISR   48 (57)
T ss_dssp             TSSEEEEESTTSBEEEEEEH
T ss_pred             CCcEEEEEecCCEEEEEEEH
Confidence            45678999999999999774


No 73 
>KOG3606|consensus
Probab=65.20  E-value=5.6  Score=33.70  Aligned_cols=38  Identities=18%  Similarity=0.147  Sum_probs=32.3

Q ss_pred             CCCcEEEEEEccCCcccccccccccC-CCCCcEEEecCceeecch
Q psy18070         92 LTHGVLIWRVMYNSPAYFIKFRTSAG-IKPTSSRLLGECLAQYTT  135 (169)
Q Consensus        92 ~~~Gv~V~~V~~~spA~~~~~~~~aG-L~~GDvI~~~~~v~~~~~  135 (169)
                      ...|+.|++..||+-|+      .-| |..+|-++++|++++..-
T Consensus       192 kvpGIFISRlVpGGLAe------STGLLaVnDEVlEVNGIEVaGK  230 (358)
T KOG3606|consen  192 KVPGIFISRLVPGGLAE------STGLLAVNDEVLEVNGIEVAGK  230 (358)
T ss_pred             ccCceEEEeecCCcccc------ccceeeecceeEEEcCEEeccc
Confidence            34799999999999999      777 568999999999987543


No 74 
>KOG0609|consensus
Probab=62.63  E-value=18  Score=33.17  Aligned_cols=35  Identities=17%  Similarity=0.136  Sum_probs=31.3

Q ss_pred             cEEEEEEccCCcccccccccccC-CCCCcEEEecCceeecch
Q psy18070         95 GVLIWRVMYNSPAYFIKFRTSAG-IKPTSSRLLGECLAQYTT  135 (169)
Q Consensus        95 Gv~V~~V~~~spA~~~~~~~~aG-L~~GDvI~~~~~v~~~~~  135 (169)
                      -++|..+..|+-|.      +.| |+.||.|.++|++.+.+.
T Consensus       147 ~~~vARI~~GG~~~------r~glL~~GD~i~EvNGi~v~~~  182 (542)
T KOG0609|consen  147 KVVVARIMHGGMAD------RQGLLHVGDEILEVNGISVANK  182 (542)
T ss_pred             ccEEeeeccCCcch------hccceeeccchheecCeecccC
Confidence            59999999999999      887 789999999999987776


No 75 
>PF02743 Cache_1:  Cache domain;  InterPro: IPR004010 Cache is an extracellular domain that is predicted to have a role in small-molecule recognition in a wide range of proteins, including the animal dihydropyridine-sensitive voltage-gated Ca2+ channel; alpha-2delta subunit, and various bacterial chemotaxis receptors. The name Cache comes from CAlcium channels and CHEmotaxis receptors. This domain consists of an N-terminal part with three predicted strands and an alpha-helix, and a C-terminal part with a strand dyad followed by a relatively unstructured region. The N-terminal portion of the (unpermuted) Cache domain contains three predicted strands that could form a sheet analogous to that present in the core of the PAS domain structure. Cache domains are particularly widespread in bacteria, with Vibrio cholerae. The animal calcium channel alpha-2delta subunits might have acquired a part of their extracellular domains from a bacterial source []. The Cache domain appears to have arisen from the GAF-PAS fold despite their divergent functions [].; GO: 0016020 membrane; PDB: 3C8C_A 3LIB_D 3LIA_A 3LI8_A 3LI9_A.
Probab=56.56  E-value=25  Score=22.96  Aligned_cols=33  Identities=30%  Similarity=0.511  Sum_probs=25.7

Q ss_pred             ceEEcCCccEEEEEeeeccCCeEEEEehhhHHHHHHhhhhc
Q psy18070         15 GPLVNLDGEVIGINSMKVTAGISFAIPIDYAIEFLTNYKRK   55 (169)
Q Consensus        15 Gplvn~~G~vvGi~~~~~~~~~~faiP~~~i~~~l~~l~~~   55 (169)
                      -|+.+.+|+++|+..        ..+..+.+.++++++.-+
T Consensus        19 ~pi~~~~g~~~Gvv~--------~di~l~~l~~~i~~~~~~   51 (81)
T PF02743_consen   19 VPIYDDDGKIIGVVG--------IDISLDQLSEIISNIKFG   51 (81)
T ss_dssp             EEEEETTTEEEEEEE--------EEEEHHHHHHHHTTSBBT
T ss_pred             EEEECCCCCEEEEEE--------EEeccceeeeEEEeeEEC
Confidence            488888999999865        457788888888776543


No 76 
>cd00218 GlcAT-I Beta1,3-glucuronyltransferase I (GlcAT-I) is involved in the initial steps of proteoglycan synthesis. Beta1,3-glucuronyltransferase I (GlcAT-I) domain; GlcAT-I is a key enzyme involved in the initial steps of proteoglycan synthesis. GlcAT-I catalyzes the transfer of a glucuronic acid moiety from the uridine diphosphate-glucuronic acid (UDP-GlcUA) to the common linkage region of trisaccharide Gal-beta-(1-3)-Gal-beta-(1-4)-Xyl of proteoglycans. The enzyme has two subdomains that bind the donor and acceptor substrate separately. The active site is located at the cleft between both subdomains in which the trisaccharide molecule is oriented perpendicular to the UDP. This family has been classified as Glycosyltransferase family 43 (GT-43).
Probab=54.68  E-value=13  Score=30.38  Aligned_cols=31  Identities=23%  Similarity=0.400  Sum_probs=23.7

Q ss_pred             cceEEcCCccEEEEEeeecc------CCeEEEEehhhH
Q psy18070         14 GGPLVNLDGEVIGINSMKVT------AGISFAIPIDYA   45 (169)
Q Consensus        14 GGplvn~~G~vvGi~~~~~~------~~~~faiP~~~i   45 (169)
                      -||+++ +|+|+|+.+.-..      +-.|||+-+..+
T Consensus       136 egP~c~-~gkV~gw~~~w~~~R~f~idmAGFA~n~~ll  172 (223)
T cd00218         136 EGPVCE-NGKVVGWHTAWKPERPFPIDMAGFAFNSKLL  172 (223)
T ss_pred             eccEee-CCeEeEEecCCCCCCCCcceeeeEEEehhhh
Confidence            379998 9999999987543      235899987654


No 77 
>KOG3605|consensus
Probab=53.11  E-value=9.8  Score=35.87  Aligned_cols=52  Identities=12%  Similarity=0.112  Sum_probs=34.5

Q ss_pred             EEEEccCCcccccccccccC-CCCCcEEEecCceeecc----hhhhhhhhcCCCCeeEEEEEE
Q psy18070         98 IWRVMYNSPAYFIKFRTSAG-IKPTSSRLLGECLAQYT----TSKLVVWSINHPSITCHILLR  155 (169)
Q Consensus        98 V~~V~~~spA~~~~~~~~aG-L~~GDvI~~~~~v~~~~----~~~~~~~~~~~~~~~~~~i~r  155 (169)
                      |...+.++||+      +.| |..||-|+++|+....-    +-+..+...+....+++++.+
T Consensus       677 iAnmm~~GpAa------rsgkLnIGDQiiaING~SLVGLPLstcQs~Ik~~KnQT~VkltiV~  733 (829)
T KOG3605|consen  677 IANMMHGGPAA------RSGKLNIGDQIMSINGTSLVGLPLSTCQSIIKGLKNQTAVKLNIVS  733 (829)
T ss_pred             HHhcccCChhh------hcCCccccceeEeecCceeccccHHHHHHHHhcccccceEEEEEec
Confidence            33458899999      887 99999999999876543    234444444555555555443


No 78 
>PF02122 Peptidase_S39:  Peptidase S39;  InterPro: IPR000382 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. ORF2 of Potato leafroll virus (PLrV) encodes a polyprotein which is translated following a -1 frameshift. The polyprotein has a putative linear arrangement of membrane anchor-VPg-peptidase-polymerase domains. The serine peptidase domain which is found in this group of sequences belongs to MEROPS peptidase family S39 (clan PA(S)). It is likely that the peptidase domain is involved in the cleavage of the polyprotein []. The nucleotide sequence for the RNA of PLrV has been determined [, ]. The sequence contains six large open reading frames (ORFs). The 5' coding region encodes two polypeptides of 28K and 70K, which overlap in different reading frames; it is suggested that the third ORF in the 5' block is translated by frameshift readthrough near the end of the 70K protein, yielding a 118K polypeptide []. 
Segments of the predicted amino acid sequences of these ORFs resemble those of known viral RNA polymerases, ATP-binding proteins and viral genome-linked proteins. The nucleotide sequence of the genomic RNA of Beet western yellows virus (BWYV) has been determined []. The sequence contains six long ORFs. A cluster of three of these ORFs, including the coat protein cistron, display extensive amino acid sequence similarity to corresponding ORFs of a second luteovirus: Barley yellow dwarf virus [].; GO: 0004252 serine-type endopeptidase activity, 0022415 viral reproductive process, 0016021 integral to membrane; PDB: 1ZYO_A.
Probab=46.29  E-value=22  Score=28.52  Aligned_cols=40  Identities=20%  Similarity=0.153  Sum_probs=16.2

Q ss_pred             eeccccCCCCccceEEcCCccEEEEEeeec----cCCeEEEEehh
Q psy18070          3 LTGIMVKFGNSGGPLVNLDGEVIGINSMKV----TAGISFAIPID   43 (169)
Q Consensus         3 q~da~in~GnSGGplvn~~G~vvGi~~~~~----~~~~~faiP~~   43 (169)
                      +.-....+|.||.|+++.. ++||+-....    .++.++.-|+.
T Consensus       139 ~vls~T~~G~SGtp~y~g~-~vvGvH~G~~~~~~~~n~n~~spip  182 (203)
T PF02122_consen  139 SVLSNTSPGWSGTPYYSGK-NVVGVHTGSPSGSNRENNNRMSPIP  182 (203)
T ss_dssp             EE-----TT-TT-EEE-SS--EEEEEEEE----------------
T ss_pred             ceEcCCCCCCCCCCeEECC-CceEeecCccccccccccccccccc
Confidence            3445667999999999988 9999988741    25566665553


No 79 
>KOG3834|consensus
Probab=46.04  E-value=30  Score=31.02  Aligned_cols=58  Identities=17%  Similarity=0.157  Sum_probs=40.9

Q ss_pred             CCCCcEEEEEEccCCcccccccccccCCCC-CcEEEecCceeecchhhh--hhhhcCCCCeeEEEEEE
Q psy18070         91 DLTHGVLIWRVMYNSPAYFIKFRTSAGIKP-TSSRLLGECLAQYTTSKL--VVWSINHPSITCHILLR  155 (169)
Q Consensus        91 ~~~~Gv~V~~V~~~spA~~~~~~~~aGL~~-GDvI~~~~~v~~~~~~~~--~~~~~~~~~~~~~~i~r  155 (169)
                      ..+.|--|.+|.++|||.      ++||.+ -|-|++++++....+-+.  .++....+. ++++++.
T Consensus        12 ggteg~hvlkVqedSpa~------~aglepffdFIvSI~g~rL~~dnd~Lk~llk~~sek-Vkltv~n   72 (462)
T KOG3834|consen   12 GGTEGYHVLKVQEDSPAH------KAGLEPFFDFIVSINGIRLNKDNDTLKALLKANSEK-VKLTVYN   72 (462)
T ss_pred             CCceeEEEEEeecCChHH------hcCcchhhhhhheeCcccccCchHHHHHHHHhcccc-eEEEEEe
Confidence            445688999999999999      999998 688999888877644222  233333333 6666654


No 80 
>KOG3549|consensus
Probab=40.79  E-value=39  Score=29.77  Aligned_cols=53  Identities=13%  Similarity=-0.010  Sum_probs=36.8

Q ss_pred             cEEEEEEccCCcccccccccccC-CCCCcEEEecCceeecchhhhh-h-hhcCCCCeeEEEE
Q psy18070         95 GVLIWRVMYNSPAYFIKFRTSAG-IKPTSSRLLGECLAQYTTSKLV-V-WSINHPSITCHIL  153 (169)
Q Consensus        95 Gv~V~~V~~~spA~~~~~~~~aG-L~~GDvI~~~~~v~~~~~~~~~-~-~~~~~~~~~~~~i  153 (169)
                      -|+|+.+-++-.|+      ..| |-.||.|+.+|++.+...-... + ..-+.++.+.+++
T Consensus        81 PvviSkI~kdQaAd------~tG~LFvGDAilqvNGi~v~~c~HeevV~iLRNAGdeVtlTV  136 (505)
T KOG3549|consen   81 PVVISKIYKDQAAD------ITGQLFVGDAILQVNGIYVTACPHEEVVNILRNAGDEVTLTV  136 (505)
T ss_pred             cEEeehhhhhhhhh------hcCceEeeeeeEEeccEEeecCChHHHHHHHHhcCCEEEEEe
Confidence            48999999888777      555 7799999999999877663222 2 2334455555543


No 81 
>COG0260 PepB Leucyl aminopeptidase [Amino acid transport and metabolism]
Probab=37.45  E-value=42  Score=30.57  Aligned_cols=31  Identities=19%  Similarity=0.257  Sum_probs=21.5

Q ss_pred             EEEEccCCcccccccccccCCCCCcEEEe--cCceeecch
Q psy18070         98 IWRVMYNSPAYFIKFRTSAGIKPTSSRLL--GECLAQYTT  135 (169)
Q Consensus        98 V~~V~~~spA~~~~~~~~aGL~~GDvI~~--~~~v~~~~~  135 (169)
                      |.-..+|-|..      .| .||||||++  +.-|++.++
T Consensus       302 vl~~~ENm~~g------~A-~rPGDVits~~GkTVEV~NT  334 (485)
T COG0260         302 VLPAVENMPSG------NA-YRPGDVITSMNGKTVEVLNT  334 (485)
T ss_pred             EEeeeccCCCC------CC-CCCCCeEEecCCcEEEEccc
Confidence            44445677777      55 899999999  555565555


No 82 
>cd04582 CBS_pair_ABC_OpuCA_assoc This cd contains two tandem repeats of the cystathionine beta-synthase (CBS pair) domains in association with the ABC transporter OpuCA. OpuCA is the ATP binding component of a bacterial solute transporter that serves a protective role to cells growing in a hyperosmolar environment but the function of the CBS domains in OpuCA remains unknown.  In the related ABC transporter, OpuA, the tandem CBS domains have been shown to function as sensors for ionic strength, whereby they control the transport activity through an electronic switching mechanism. ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds, like sugars, ions, peptides, and more complex organic molecules. They are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzi
Probab=36.69  E-value=33  Score=22.69  Aligned_cols=22  Identities=27%  Similarity=0.291  Sum_probs=17.0

Q ss_pred             CCCccceEEcCCccEEEEEeee
Q psy18070         10 FGNSGGPLVNLDGEVIGINSMK   31 (169)
Q Consensus        10 ~GnSGGplvn~~G~vvGi~~~~   31 (169)
                      .+-+--|++|.+|+++|+.+..
T Consensus        80 ~~~~~~~Vv~~~~~~~Gvi~~~  101 (106)
T cd04582          80 HDMSWLPCVDEDGRYVGEVTQR  101 (106)
T ss_pred             CCCCeeeEECCCCcEEEEEEHH
Confidence            3445578999999999998753


No 83 
>cd04596 CBS_pair_DRTGG_assoc This cd contains two tandem repeats of the cystathionine beta-synthase (CBS pair) domains associated with a DRTGG domain upstream. The function of the DRTGG domain, named after its conserved residues, is unknown. CBS is a small domain originally identified in cystathionine beta-synthase and subsequently found in a wide range of different proteins. CBS domains usually come in tandem repeats, which associate to form a so-called Bateman domain or a CBS pair which is reflected in this model.  The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains. It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown.
Probab=35.93  E-value=33  Score=22.92  Aligned_cols=21  Identities=29%  Similarity=0.294  Sum_probs=17.3

Q ss_pred             CCCccceEEcCCccEEEEEee
Q psy18070         10 FGNSGGPLVNLDGEVIGINSM   30 (169)
Q Consensus        10 ~GnSGGplvn~~G~vvGi~~~   30 (169)
                      .+-..-|++|.+|+++|+.+.
T Consensus        82 ~~~~~~~Vv~~~~~~~G~it~  102 (108)
T cd04596          82 EGIEMLPVVDDNKKLLGIISR  102 (108)
T ss_pred             cCCCeeeEEcCCCCEEEEEEH
Confidence            455667999999999999875


No 84 
>KOG1728|consensus
Probab=35.57  E-value=13  Score=28.17  Aligned_cols=31  Identities=26%  Similarity=0.402  Sum_probs=24.5

Q ss_pred             CCcccccccccccCCCCCcEEEecCceeecchhhhhhh
Q psy18070        104 NSPAYFIKFRTSAGIKPTSSRLLGECLAQYTTSKLVVW  141 (169)
Q Consensus       104 ~spA~~~~~~~~aGL~~GDvI~~~~~v~~~~~~~~~~~  141 (169)
                      =||+.      + .+++||+++.+++-+...+..+.++
T Consensus       111 ~SPcF------r-di~~gDiVtvGecrPLSKtvrfnVL  141 (156)
T KOG1728|consen  111 VSPCF------R-DIQEGDIVTVGECRPLSKTVRFNVL  141 (156)
T ss_pred             cchhh------h-ccccCCEEEEeecccccceEEEEEE
Confidence            38898      7 7999999999998877776555544


No 85 
>PF08669 GCV_T_C:  Glycine cleavage T-protein C-terminal barrel domain;  InterPro: IPR013977  This entry shows glycine cleavage T-proteins, part of the glycine cleavage multienzyme complex (GCV) found in bacteria and the mitochondria of eukaryotes. GCV catalyses the catabolism of glycine in eukaryotes. The T-protein is an aminomethyl transferase. ; PDB: 3ADA_A 1VRQ_A 1X31_A 3AD9_A 3AD8_A 3AD7_A 3GIR_A 1WOO_A 1WOS_A 1WOR_A ....
Probab=35.22  E-value=53  Score=22.19  Aligned_cols=31  Identities=23%  Similarity=0.367  Sum_probs=21.1

Q ss_pred             CCCccceEEcCCccEEEEEeeec-c----CCeEEEE
Q psy18070         10 FGNSGGPLVNLDGEVIGINSMKV-T----AGISFAI   40 (169)
Q Consensus        10 ~GnSGGplvn~~G~vvGi~~~~~-~----~~~~fai   40 (169)
                      +=..|.|++..+|+.||..+... +    .+++++.
T Consensus        32 ~~~~g~~v~~~~g~~vG~vTS~~~sp~~~~~Iala~   67 (95)
T PF08669_consen   32 PPRGGEPVYDEDGKPVGRVTSGAYSPTLGKNIALAY   67 (95)
T ss_dssp             --STTCEEEETTTEEEEEEEEEEEETTTTEEEEEEE
T ss_pred             CCCCCCEEEECCCcEEeEEEEEeECCCCCceEEEEE
Confidence            34567899988999999877654 2    3455554


No 86 
>KOG1476|consensus
Probab=35.10  E-value=27  Score=30.06  Aligned_cols=32  Identities=28%  Similarity=0.576  Sum_probs=24.3

Q ss_pred             ceEEcCCccEEEEEeeecc------CCeEEEEehhhHHH
Q psy18070         15 GPLVNLDGEVIGINSMKVT------AGISFAIPIDYAIE   47 (169)
Q Consensus        15 Gplvn~~G~vvGi~~~~~~------~~~~faiP~~~i~~   47 (169)
                      ||.++ +|+|+|++..-..      +-.|||+-..++..
T Consensus       223 ~P~v~-~~kvvg~~~~w~~~r~f~vdmaGFAvNl~lll~  260 (330)
T KOG1476|consen  223 GPVVN-NGKVVGWHTRWEPERPFAVDMAGFAVNLKLLLD  260 (330)
T ss_pred             cceec-cCeeEEEEeccccCCCCccchhhheehhhhhcc
Confidence            69998 9999999987553      33589998766544


No 87 
>PF10049 DUF2283:  Protein of unknown function (DUF2283);  InterPro: IPR019270  Members of this family of hypothetical proteins have no known function. 
Probab=34.85  E-value=43  Score=20.45  Aligned_cols=13  Identities=31%  Similarity=0.736  Sum_probs=9.8

Q ss_pred             EcCCccEEEEEee
Q psy18070         18 VNLDGEVIGINSM   30 (169)
Q Consensus        18 vn~~G~vvGi~~~   30 (169)
                      +|.+|++|||-..
T Consensus        35 ~d~~G~ivGIEIl   47 (50)
T PF10049_consen   35 YDEDGRIVGIEIL   47 (50)
T ss_pred             ECCCCCEEEEEEE
Confidence            4667999998654


No 88 
>cd04618 CBS_pair_5 The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria.  The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair.  The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here.  It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic genera
Probab=33.51  E-value=31  Score=23.26  Aligned_cols=21  Identities=14%  Similarity=0.148  Sum_probs=16.4

Q ss_pred             CCccceEEcCC-ccEEEEEeee
Q psy18070         11 GNSGGPLVNLD-GEVIGINSMK   31 (169)
Q Consensus        11 GnSGGplvn~~-G~vvGi~~~~   31 (169)
                      +-.-=|++|.+ |+++|+.+..
T Consensus        72 ~~~~lpVvd~~~~~~~giit~~   93 (98)
T cd04618          72 KIHRLPVIDPSTGTGLYILTSR   93 (98)
T ss_pred             CCCEeeEEECCCCCceEEeehh
Confidence            44456999987 9999998754


No 89 
>cd04606 CBS_pair_Mg_transporter This cd contains two tandem repeats of the cystathionine beta-synthase (CBS pair) domain in the magnesium transporter, MgtE.  MgtE and its homologs are found in eubacteria, archaebacteria, and eukaryota. Members of this family transport Mg2+ or other divalent cations into the cell via two highly conserved aspartates. CBS is a small domain originally identified in cystathionine beta-synthase and subsequently found in a wide range of different proteins. CBS domains usually come in tandem repeats, which associate to form a so-called Bateman domain or a CBS pair which is reflected in this model. The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains. It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown.
Probab=33.30  E-value=39  Score=22.59  Aligned_cols=21  Identities=24%  Similarity=0.539  Sum_probs=16.6

Q ss_pred             CCCccceEEcCCccEEEEEee
Q psy18070         10 FGNSGGPLVNLDGEVIGINSM   30 (169)
Q Consensus        10 ~GnSGGplvn~~G~vvGi~~~   30 (169)
                      .+-...|++|.+|+++|+.+.
T Consensus        82 ~~~~~~~Vv~~~~~~~Gvit~  102 (109)
T cd04606          82 YDLLALPVVDEEGRLVGIITV  102 (109)
T ss_pred             cCCceeeeECCCCcEEEEEEh
Confidence            334567999999999999875


No 90 
>smart00116 CBS Domain in cystathionine beta-synthase and other proteins. Domain present in all 3 forms of cellular life. Present in two copies in inosine monophosphate dehydrogenase, of which one is disordered in the crystal structure [3]. A number of disease states are associated with CBS-containing proteins including homocystinuria, Becker's and Thomsen disease.
Probab=33.22  E-value=42  Score=17.97  Aligned_cols=20  Identities=30%  Similarity=0.554  Sum_probs=15.3

Q ss_pred             CCccceEEcCCccEEEEEee
Q psy18070         11 GNSGGPLVNLDGEVIGINSM   30 (169)
Q Consensus        11 GnSGGplvn~~G~vvGi~~~   30 (169)
                      +-+.-|+++.+++++|+.+.
T Consensus        22 ~~~~~~v~~~~~~~~g~i~~   41 (49)
T smart00116       22 GIRRLPVVDEEGRLVGIVTR   41 (49)
T ss_pred             CCCcccEECCCCeEEEEEEH
Confidence            34456889988999998764


No 91 
>cd04610 CBS_pair_ParBc_assoc This cd contains two tandem repeats of the cystathionine beta-synthase (CBS pair) domains associated with a ParBc (ParB-like nuclease) domain downstream. CBS is a small domain originally identified in cystathionine beta-synthase and subsequently found in a wide range of different proteins. CBS domains usually come in tandem repeats, which associate to form a so-called Bateman domain or a CBS pair which is reflected in this model. The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains.  It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown.
Probab=32.50  E-value=41  Score=22.19  Aligned_cols=18  Identities=28%  Similarity=0.453  Sum_probs=14.7

Q ss_pred             ccceEEcCCccEEEEEee
Q psy18070         13 SGGPLVNLDGEVIGINSM   30 (169)
Q Consensus        13 SGGplvn~~G~vvGi~~~   30 (169)
                      +--|++|.+|+++|+.+.
T Consensus        84 ~~~~Vv~~~g~~~Gvi~~  101 (107)
T cd04610          84 SKLPVVDENNNLVGIITN  101 (107)
T ss_pred             CeEeEECCCCeEEEEEEH
Confidence            346889999999999774


No 92 
>PRK09570 rpoH DNA-directed RNA polymerase subunit H; Reviewed
Probab=31.65  E-value=29  Score=23.74  Aligned_cols=16  Identities=31%  Similarity=0.439  Sum_probs=12.0

Q ss_pred             CCccccccccccc-CCCCCcEEE
Q psy18070        104 NSPAYFIKFRTSA-GIKPTSSRL  125 (169)
Q Consensus       104 ~spA~~~~~~~~a-GL~~GDvI~  125 (169)
                      ..|++      +. |+++||||-
T Consensus        43 ~DPv~------r~~g~k~GdVvk   59 (79)
T PRK09570         43 SDPVV------KAIGAKPGDVIK   59 (79)
T ss_pred             cChhh------hhcCCCCCCEEE
Confidence            45666      54 999999974


No 93 
>cd04592 CBS_pair_EriC_assoc_euk This cd contains two tandem repeats of the cystathionine beta-synthase (CBS pair) domains in the EriC CIC-type chloride channels in eukaryotes. These ion channels are proteins with a seemingly simple task of allowing the passive flow of chloride ions across biological membranes. CIC-type chloride channels come from all kingdoms of life, have several gene families, and can be gated by voltage. The members of the CIC-type chloride channel are double-barreled: two proteins forming homodimers at a broad interface formed by four helices from each protein. The two pores are not found at this interface, but are completely contained within each subunit, as deduced from the mutational analyses, unlike many other channels, in which four or five identical or structurally related subunits jointly form one pore. CBS is a small domain originally identified in cystathionine beta-synthase and subsequently found in a wide range of different proteins. CBS domains usually 
Probab=30.69  E-value=51  Score=23.82  Aligned_cols=22  Identities=23%  Similarity=0.099  Sum_probs=17.7

Q ss_pred             CCCccceEEcCCccEEEEEeee
Q psy18070         10 FGNSGGPLVNLDGEVIGINSMK   31 (169)
Q Consensus        10 ~GnSGGplvn~~G~vvGi~~~~   31 (169)
                      .+-++-|++|.+|+++|+.+..
T Consensus        22 ~~~~~~~VvD~~g~l~Givt~~   43 (133)
T cd04592          22 EKQSCVLVVDSDDFLEGILTLG   43 (133)
T ss_pred             cCCCEEEEECCCCeEEEEEEHH
Confidence            3456789999999999998743


No 94 
>cd04641 CBS_pair_28 The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria.  The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair.  The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here.  It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic gener
Probab=30.22  E-value=53  Score=22.44  Aligned_cols=22  Identities=27%  Similarity=0.382  Sum_probs=17.8

Q ss_pred             CCCCccceEEcCCccEEEEEee
Q psy18070          9 KFGNSGGPLVNLDGEVIGINSM   30 (169)
Q Consensus         9 n~GnSGGplvn~~G~vvGi~~~   30 (169)
                      ..+-+.-|++|.+|+++|+.+.
T Consensus        21 ~~~~~~~pVv~~~~~~~Giv~~   42 (120)
T cd04641          21 ERRVSALPIVDENGKVVDVYSR   42 (120)
T ss_pred             HcCCCeeeEECCCCeEEEEEeH
Confidence            3455678999999999999874


No 95 
>cd00433 Peptidase_M17 Cytosol aminopeptidase family, N-terminal and catalytic domains.  Family M17 contains zinc- and manganese-dependent exopeptidases ( EC  3.4.11.1), including leucine aminopeptidase. They catalyze removal of amino acids from the N-terminus of a protein and play a key role in protein degradation and in the metabolism of biologically active peptides. They do not contain HEXXH motif (which is used as one of the signature patterns to group the peptidase families) in the metal-binding site. The two associated zinc ions and the active site are entirely enclosed within the C-terminal catalytic domain in leucine aminopeptidase. The enzyme is a hexamer, with the catalytic domains clustered around the three-fold axis, and the two trimers related to one another by a two-fold rotation. The N-terminal domain is structurally similar to the ADP-ribose binding Macro domain. This family includes proteins from bacteria, archaea, animals and plants.
Probab=29.80  E-value=69  Score=28.95  Aligned_cols=28  Identities=18%  Similarity=0.133  Sum_probs=20.3

Q ss_pred             EccCCcccccccccccCCCCCcEEEe--cCceeecch
Q psy18070        101 VMYNSPAYFIKFRTSAGIKPTSSRLL--GECLAQYTT  135 (169)
Q Consensus       101 V~~~spA~~~~~~~~aGL~~GDvI~~--~~~v~~~~~  135 (169)
                      ..+|.|..      .+ .||||||++  +.-|++.++
T Consensus       292 ~~EN~is~------~A-~rPgDVi~s~~GkTVEI~NT  321 (468)
T cd00433         292 LAENMISG------NA-YRPGDVITSRSGKTVEILNT  321 (468)
T ss_pred             eeecCCCC------CC-CCCCCEeEeCCCcEEEEecC
Confidence            34677777      55 899999999  555665554


No 96 
>TIGR00612 ispG_gcpE 1-hydroxy-2-methyl-2-(E)-butenyl 4-diphosphate synthase. Chlamydial members of the family have a long insert. The family is largely restricted to Bacteria, where it is widely but not universally distributed. No homology can be detected between the GcpE family and other proteins.
Probab=29.16  E-value=60  Score=28.24  Aligned_cols=37  Identities=14%  Similarity=0.254  Sum_probs=29.0

Q ss_pred             hhhHHHHHHhhhhcCCccceeecceecEEEEECCHHHHHHhhc
Q psy18070         42 IDYAIEFLTNYKRKDIDRTITHKKYIGITMLTLNEKLIEQLRR   84 (169)
Q Consensus        42 ~~~i~~~l~~l~~~g~~~~~~~~~~lGi~~~~l~~~~~~~~~~   84 (169)
                      -+.++++++..++.+      ..-++|+.-..|++++.+.++.
T Consensus       107 ~e~v~~vv~~ak~~~------ipIRIGVN~GSL~~~~~~kyg~  143 (346)
T TIGR00612       107 RERVRDVVEKARDHG------KAMRIGVNHGSLERRLLEKYGD  143 (346)
T ss_pred             HHHHHHHHHHHHHCC------CCEEEecCCCCCcHHHHHHcCC
Confidence            467788888888777      5567999999999988887753


No 97 
>PRK00913 multifunctional aminopeptidase A; Provisional
Probab=28.82  E-value=76  Score=28.88  Aligned_cols=28  Identities=21%  Similarity=0.250  Sum_probs=19.8

Q ss_pred             EccCCcccccccccccCCCCCcEEEe--cCceeecch
Q psy18070        101 VMYNSPAYFIKFRTSAGIKPTSSRLL--GECLAQYTT  135 (169)
Q Consensus       101 V~~~spA~~~~~~~~aGL~~GDvI~~--~~~v~~~~~  135 (169)
                      ..+|.|..      .| .||||||++  +.-|++.++
T Consensus       306 l~ENm~~~------~A-~rPgDVi~~~~GkTVEV~NT  335 (483)
T PRK00913        306 ACENMPSG------NA-YRPGDVLTSMSGKTIEVLNT  335 (483)
T ss_pred             eeccCCCC------CC-CCCCCEEEECCCcEEEeecC
Confidence            34677777      66 999999999  445555444


No 98 
>TIGR02913 HAF_rpt probable extracellular repeat, HAF family. The model for this family detects a homology domain of about 40 amino acids. Member proteins always have at least two tandem copies and as many as seven. The spacing between repeats as defined here usually is four residues exactly. This repeat is named for a tripeptide motif HAF found in most members. Some member proteins are found in species with no outer membrane (archaea and Gram-positive bacteria) while others have C-terminal autotransporter domains that suggest that the repeat region is transported across the outer membrane. This domain seems likely to be an extracellular protein repeat.
Probab=28.38  E-value=62  Score=18.89  Aligned_cols=12  Identities=42%  Similarity=0.880  Sum_probs=9.8

Q ss_pred             EcCCccEEEEEe
Q psy18070         18 VNLDGEVIGINS   29 (169)
Q Consensus        18 vn~~G~vvGi~~   29 (169)
                      +|.+|+|||...
T Consensus         4 In~~G~VvG~s~   15 (39)
T TIGR02913         4 INNDGQVVGYST   15 (39)
T ss_pred             CCCCCcEEEEEE
Confidence            688999999754


No 99 
>cd04614 CBS_pair_1 The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria.  The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair.  The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here.  It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic genera
Probab=28.32  E-value=66  Score=21.36  Aligned_cols=47  Identities=19%  Similarity=0.198  Sum_probs=31.2

Q ss_pred             CCCccceEEcCCccEEEEEeeec--c-CCeEEEEehhhHHHHHHhhhhcC
Q psy18070         10 FGNSGGPLVNLDGEVIGINSMKV--T-AGISFAIPIDYAIEFLTNYKRKD   56 (169)
Q Consensus        10 ~GnSGGplvn~~G~vvGi~~~~~--~-~~~~faiP~~~i~~~l~~l~~~g   56 (169)
                      .+-+.-|++|.+|+++|+.+..-  . ..+.+.=|-+.+.+.++.+.+++
T Consensus        22 ~~~~~~~V~d~~~~~~Giv~~~dl~~~~~~~~v~~~~~l~~a~~~m~~~~   71 (96)
T cd04614          22 ANVKALPVLDDDGKLSGIITERDLIAKSEVVTATKRTTVSECAQKMKRNR   71 (96)
T ss_pred             cCCCeEEEECCCCCEEEEEEHHHHhcCCCcEEecCCCCHHHHHHHHHHhC
Confidence            45577899999999999987443  1 12344445556677777776665


No 100
>PF00883 Peptidase_M17:  Cytosol aminopeptidase family, catalytic domain;  InterPro: IPR000819 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Metalloproteases are the most diverse of the four main types of protease, with more than 50 families identified to date. In these enzymes, a divalent cation, usually zinc, activates the water molecule. 
The metal ion is held in place by amino acid ligands, usually three in number. The known metal ligands are His, Glu, Asp or Lys and at least one other residue is required for catalysis, which may play an electrophilic role. Of the known metalloproteases, around half contain an HEXXH motif, which has been shown in crystallographic studies to form part of the metal-binding site. The HEXXH motif is relatively common, but can be more stringently defined for metalloproteases as 'abXHEbbHbc', where 'a' is most often valine or threonine and forms part of the S1' subsite in thermolysin and neprilysin, 'b' is an uncharged residue, and 'c' a hydrophobic residue. Proline is never found in this site, possibly because it would break the helical structure adopted by this motif in metalloproteases. This group of metallopeptidases belongs to the MEROPS peptidase family M17 (leucyl aminopeptidase family, clan MF), the type example being leucyl aminopeptidase from Bos taurus (Bovine). Aminopeptidases are exopeptidases involved in the processing and regular turnover of intracellular proteins, although their precise role in cellular metabolism is unclear. Leucine aminopeptidases cleave leucine residues from the N-terminus of polypeptide chains, but substantial rates are evident for all amino acids. The enzymes exist as homo-hexamers, comprising 2 trimers stacked on top of one another. Each monomer binds 2 zinc ions and folds into 2 alpha/beta-type quasi-spherical globular domains, producing a comma-like shape. The N-terminal 150 residues form a 5-stranded beta-sheet with 4 parallel and 1 anti-parallel strand sandwiched between 4 alpha-helices. An alpha-helix extends into the C-terminal domain, which comprises a central 8-stranded saddle-shaped beta-sheet sandwiched between groups of helices, forming the monomer hydrophobic core. A 3-stranded beta-sheet resides on the surface of the monomer, where it interacts with other members of the hexamer. 
The 2 zinc ions and the active site are entirely located in the C-terminal catalytic domain.; GO: 0004177 aminopeptidase activity, 0006508 proteolysis, 0005622 intracellular; PDB: 3KZW_L 3KQX_C 3KQZ_L 3KR4_I 3KR5_J 3T8W_C 3H8F_D 3H8G_A 3H8E_B 3IJ3_A ....
Probab=28.27  E-value=61  Score=27.79  Aligned_cols=29  Identities=17%  Similarity=0.072  Sum_probs=17.0

Q ss_pred             EEccCCcccccccccccCCCCCcEEEe--cCceeecch
Q psy18070        100 RVMYNSPAYFIKFRTSAGIKPTSSRLL--GECLAQYTT  135 (169)
Q Consensus       100 ~V~~~spA~~~~~~~~aGL~~GDvI~~--~~~v~~~~~  135 (169)
                      -..+|.|..      .+ .+|||||++  +.-|++.++
T Consensus       136 ~~~EN~i~~------~a-~~pgDVi~s~~GkTVEI~NT  166 (311)
T PF00883_consen  136 PLAENMISG------NA-YRPGDVITSMNGKTVEIGNT  166 (311)
T ss_dssp             EEEEE--ST------TS-TTTTEEEE-TTS-EEEES-T
T ss_pred             EcccccCCC------CC-CCCCCEEEeCCCCEEEEEee
Confidence            334677777      55 899999999  555665555


No 101
>PF03761 DUF316:  Domain of unknown function (DUF316) ;  InterPro: IPR005514 This is a family of uncharacterised proteins from Caenorhabditis elegans.
Probab=28.25  E-value=61  Score=26.39  Aligned_cols=28  Identities=36%  Similarity=0.459  Sum_probs=19.6

Q ss_pred             eccccCCCCccceEE---cCCccEEEEEeee
Q psy18070          4 TGIMVKFGNSGGPLV---NLDGEVIGINSMK   31 (169)
Q Consensus         4 ~da~in~GnSGGplv---n~~G~vvGi~~~~   31 (169)
                      ++.-..+|.+||||+   |-+-.|||+.+..
T Consensus       224 ~~~~~~~~d~Gg~lv~~~~gr~tlIGv~~~~  254 (282)
T PF03761_consen  224 TKQYSCKGDRGGPLVKNINGRWTLIGVGASG  254 (282)
T ss_pred             cccccCCCCccCeEEEEECCCEEEEEEEccC
Confidence            344567899999998   4445678876644


No 102
>cd00190 Tryp_SPc Trypsin-like serine protease; Many of these are synthesized as inactive precursor zymogens that are cleaved during limited proteolysis to generate their active forms. Alignment contains also inactive enzymes that have substitutions of the catalytic triad residues.
Probab=28.23  E-value=30  Score=26.27  Aligned_cols=17  Identities=47%  Similarity=0.571  Sum_probs=13.4

Q ss_pred             cccCCCCccceEEcCCc
Q psy18070          6 IMVKFGNSGGPLVNLDG   22 (169)
Q Consensus         6 a~in~GnSGGplvn~~G   22 (169)
                      ....+|+|||||+...+
T Consensus       179 ~~~c~gdsGgpl~~~~~  195 (232)
T cd00190         179 KDACQGDSGGPLVCNDN  195 (232)
T ss_pred             CccccCCCCCcEEEEeC
Confidence            44567999999998665


No 103
>PF08275 Toprim_N:  DNA primase catalytic core, N-terminal domain;  InterPro: IPR013264 This is the N-terminal, catalytic core domain of DNA primases. DNA primase (2.7.7 from EC) is a nucleotidyltransferase which synthesizes the oligoribonucleotide primers required for DNA replication on the lagging strand of the replication fork. It can also prime the leading strand and has been implicated in cell division. ; PDB: 1EQN_E 1DD9_A 3B39_B 1DDE_A 2AU3_A.
Probab=27.63  E-value=47  Score=24.33  Aligned_cols=17  Identities=24%  Similarity=0.632  Sum_probs=12.8

Q ss_pred             eEEcCCccEEEEEeeec
Q psy18070         16 PLVNLDGEVIGINSMKV   32 (169)
Q Consensus        16 plvn~~G~vvGi~~~~~   32 (169)
                      |+.|.+|+|||+..-..
T Consensus        82 PI~d~~G~vvgF~gR~l   98 (128)
T PF08275_consen   82 PIRDERGRVVGFGGRRL   98 (128)
T ss_dssp             EEE-TTS-EEEEEEEES
T ss_pred             EEEcCCCCEEEEecccC
Confidence            89999999999977655


No 104
>KOG3551|consensus
Probab=27.27  E-value=60  Score=29.03  Aligned_cols=32  Identities=16%  Similarity=0.060  Sum_probs=25.8

Q ss_pred             CcEEEEEEccCCcccccccccccC-CCCCcEEEecCcee
Q psy18070         94 HGVLIWRVMYNSPAYFIKFRTSAG-IKPTSSRLLGECLA  131 (169)
Q Consensus        94 ~Gv~V~~V~~~spA~~~~~~~~aG-L~~GDvI~~~~~v~  131 (169)
                      --++|+.+-+|=.|+      +.+ |..||.|+++|+-.
T Consensus       110 MPIlISKIFkGlAAD------Qt~aL~~gDaIlSVNG~d  142 (506)
T KOG3551|consen  110 MPILISKIFKGLAAD------QTGALFLGDAILSVNGED  142 (506)
T ss_pred             CceehhHhccccccc------cccceeeccEEEEecchh
Confidence            469999999988777      543 99999999976654


No 105
>PRK05015 aminopeptidase B; Provisional
Probab=25.16  E-value=96  Score=27.82  Aligned_cols=30  Identities=17%  Similarity=-0.004  Sum_probs=20.2

Q ss_pred             EEEccCCcccccccccccCCCCCcEEEe--cCceeecch
Q psy18070         99 WRVMYNSPAYFIKFRTSAGIKPTSSRLL--GECLAQYTT  135 (169)
Q Consensus        99 ~~V~~~spA~~~~~~~~aGL~~GDvI~~--~~~v~~~~~  135 (169)
                      .-..+|.+..      .+ .||||||+.  +.-|++.++
T Consensus       241 l~~aENmisg------~A-~kpgDVIt~~nGkTVEI~NT  272 (424)
T PRK05015        241 LCCAENLISG------NA-FKLGDIITYRNGKTVEVMNT  272 (424)
T ss_pred             EEecccCCCC------CC-CCCCCEEEecCCcEEeeecc
Confidence            3344677777      55 999999999  444555444


No 106
>cd04603 CBS_pair_KefB_assoc This cd contains two tandem repeats of the cystathionine beta-synthase (CBS pair) domains associated with the KefB (Kef-type K+ transport systems) domain which is involved in inorganic ion transport and metabolism. CBS is a small domain originally identified in cystathionine beta-synthase and subsequently found in a wide range of different proteins. CBS domains usually come in tandem repeats, which associate to form a so-called Bateman domain or a CBS pair which is reflected in this model. The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains.  It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown.
Probab=23.36  E-value=91  Score=20.95  Aligned_cols=19  Identities=21%  Similarity=0.363  Sum_probs=15.4

Q ss_pred             CCccceEEcCCccEEEEEe
Q psy18070         11 GNSGGPLVNLDGEVIGINS   29 (169)
Q Consensus        11 GnSGGplvn~~G~vvGi~~   29 (169)
                      +-+.-|++|.+|+++|+.+
T Consensus        23 ~~~~~~V~d~~~~~~G~v~   41 (111)
T cd04603          23 GARAVVVVDEENKVLGQVT   41 (111)
T ss_pred             CCCEEEEEcCCCCEEEEEE
Confidence            3456688999999999987


No 107
>cd04801 CBS_pair_M50_like This cd contains two tandem repeats of the cystathionine beta-synthase (CBS pair) domains in association with the metalloprotease peptidase M50.  CBS is a small domain originally identified in cystathionine beta-synthase and subsequently found in a wide range of different proteins. CBS domains usually come in tandem repeats, which associate to form a so-called Bateman domain or a CBS pair which is reflected in this model. The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains.  It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown.
Probab=23.13  E-value=77  Score=21.23  Aligned_cols=20  Identities=25%  Similarity=0.351  Sum_probs=15.9

Q ss_pred             CccceEEcCCccEEEEEeee
Q psy18070         12 NSGGPLVNLDGEVIGINSMK   31 (169)
Q Consensus        12 nSGGplvn~~G~vvGi~~~~   31 (169)
                      .+.=|++|.+|+++|+.+..
T Consensus        25 ~~~~~V~d~~~~~~G~v~~~   44 (114)
T cd04801          25 QRRFVVVDNEGRYVGIISLA   44 (114)
T ss_pred             ceeEEEEcCCCcEEEEEEHH
Confidence            45668899999999998743


No 108
>PRK03760 hypothetical protein; Provisional
Probab=22.93  E-value=98  Score=22.46  Aligned_cols=25  Identities=4%  Similarity=-0.201  Sum_probs=18.7

Q ss_pred             CcEEEEEEccCCcccccccccccCCCCCcEEE
Q psy18070         94 HGVLIWRVMYNSPAYFIKFRTSAGIKPTSSRL  125 (169)
Q Consensus        94 ~Gv~V~~V~~~spA~~~~~~~~aGL~~GDvI~  125 (169)
                      .-.+|-++..|. ++      +.||++||.|.
T Consensus        89 ~a~~VLEl~aG~-~~------~~gi~~Gd~v~  113 (117)
T PRK03760         89 PARYIIEGPVGK-IR------VLKVEVGDEIE  113 (117)
T ss_pred             cceEEEEeCCCh-HH------HcCCCCCCEEE
Confidence            356788887655 55      58999999974


No 109
>cd04643 CBS_pair_30 The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria.  The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair.  The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here.  It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic gener
Probab=22.84  E-value=73  Score=21.24  Aligned_cols=20  Identities=25%  Similarity=0.483  Sum_probs=15.9

Q ss_pred             CccceEEcCCccEEEEEeee
Q psy18070         12 NSGGPLVNLDGEVIGINSMK   31 (169)
Q Consensus        12 nSGGplvn~~G~vvGi~~~~   31 (169)
                      -+.-|++|.+|+++|+.+..
T Consensus        24 ~~~~~V~d~~~~~~Giv~~~   43 (116)
T cd04643          24 YSAIPVLDKEGKYVGTISLT   43 (116)
T ss_pred             CceeeeECCCCcEEEEEeHH
Confidence            34568999999999998753


No 110
>COG1792 MreC Cell shape-determining protein [Cell envelope biogenesis, outer membrane]
Probab=22.63  E-value=3.3e+02  Score=22.81  Aligned_cols=41  Identities=17%  Similarity=0.148  Sum_probs=25.9

Q ss_pred             EeeccccCCCCccc-eEEcCCccEEEEEeeeccCCeEEEEehhh
Q psy18070          2 SLTGIMVKFGNSGG-PLVNLDGEVIGINSMKVTAGISFAIPIDY   44 (169)
Q Consensus         2 iq~da~in~GnSGG-plvn~~G~vvGi~~~~~~~~~~faiP~~~   44 (169)
                      +-+|+--|.|=.=| |+++..| |||-+..- +.+....+.+..
T Consensus       134 ivId~Gs~~GV~~~~~Vi~~~G-LVG~V~~V-~~~tS~V~Lltd  175 (284)
T COG1792         134 IVIDKGSNDGIKKGMPVVAEGG-LVGKVVEV-SKNTSRVLLLTD  175 (284)
T ss_pred             EEEecCcccCccCCCeEEECCc-eEEEEEEE-cCceeEEEEeec
Confidence            34666777776644 9999999 99965543 334445554443


No 111
>PF12120 Arr-ms:  Rifampin ADP-ribosyl transferase;  InterPro: IPR021975 This domain is part of the beta subunit of bacterial DNA dependent RNA polymerase. This domain is the binding site for the antibacterial drug rifampin (and its analogues) which blocks the DNA/RNA tunnel and prevents initiation of transcription. ; PDB: 2HW2_A.
Probab=22.55  E-value=43  Score=23.82  Aligned_cols=30  Identities=17%  Similarity=0.208  Sum_probs=14.8

Q ss_pred             ccccCCCCCcEEEec-----------Cceeecchhhhhhhh
Q psy18070        113 RTSAGIKPTSSRLLG-----------ECLAQYTTSKLVVWS  142 (169)
Q Consensus       113 ~~~aGL~~GDvI~~~-----------~~v~~~~~~~~~~~~  142 (169)
                      +|+|-|++||.|+.+           +-|.....++.+.|.
T Consensus         5 GTkAdL~~GDll~pG~~SNy~~~~~~n~iY~Ta~ld~A~w~   45 (100)
T PF12120_consen    5 GTKADLQVGDLLTPGFRSNYGPGRVMNHIYFTATLDAAIWG   45 (100)
T ss_dssp             EESS---TT-EE-S--B-SSSTT-B-S-EEEESBHHHHHHH
T ss_pred             cccccCCCCcEecCCCccccCCCceeeEEEEeeccchhHHH
Confidence            578999999999973           234444556666553


No 112
>COG4043 Preprotein translocase subunit Sec61beta [Intracellular trafficking, secretion, and vesicular transport]
Probab=22.31  E-value=52  Score=23.74  Aligned_cols=27  Identities=22%  Similarity=0.352  Sum_probs=17.5

Q ss_pred             ccCCCCCcEEEe-cC-------ceeecchhhhhhh
Q psy18070        115 SAGIKPTSSRLL-GE-------CLAQYTTSKLVVW  141 (169)
Q Consensus       115 ~aGL~~GDvI~~-~~-------~v~~~~~~~~~~~  141 (169)
                      +.+++|||.|+- ++       .+....+++..+.
T Consensus        31 rr~ik~GD~IiF~~~~l~v~V~~vr~Y~tF~~mlr   65 (111)
T COG4043          31 RRQIKPGDKIIFNGDKLKVEVIDVRVYDTFEEMLR   65 (111)
T ss_pred             hcCCCCCCEEEEcCCeeEEEEEEEeehhHHHHHHH
Confidence            689999999986 22       3444555554443


No 113
>PF01191 RNA_pol_Rpb5_C:  RNA polymerase Rpb5, C-terminal domain;  InterPro: IPR000783  Prokaryotes contain a single DNA-dependent RNA polymerase (RNAP; 2.7.7.6 from EC) that is responsible for the transcription of all genes, while eukaryotes have three classes of RNAPs (I-III) that transcribe different sets of genes. Each class of RNA polymerase is an assemblage of ten to twelve different polypeptides. Certain subunits of RNAPs, including RPB5 (POLR2E in mammals), are common to all three eukaryotic polymerases. RPB5 plays a role in the transcription activation process. Eukaryotic RPB5 has a bipartite structure consisting of a unique N-terminal region (IPR005571 from INTERPRO), plus a C-terminal region that is structurally homologous to the prokaryotic RPB5 homologue, subunit H (gene rpoH). This entry represents prokaryotic subunit H and the C-terminal domain of eukaryotic RPB5, which share a two-layer alpha/beta fold, with a core structure of beta/alpha/beta/alpha/beta(2). ; GO: 0003677 DNA binding, 0003899 DNA-directed RNA polymerase activity, 0006351 transcription, DNA-dependent; PDB: 1EIK_A 2Y0S_Z 1DZF_A 3GTG_E 2VUM_E 3GTP_E 3GTO_E 3S17_E 3S1R_E 1I3Q_E ....
Probab=22.04  E-value=44  Score=22.55  Aligned_cols=18  Identities=22%  Similarity=0.322  Sum_probs=10.9

Q ss_pred             cCCcccccccccc-cCCCCCcEEEe
Q psy18070        103 YNSPAYFIKFRTS-AGIKPTSSRLL  126 (169)
Q Consensus       103 ~~spA~~~~~~~~-aGL~~GDvI~~  126 (169)
                      ...|.+      + .|+++||||--
T Consensus        39 ~~DPv~------r~~g~k~GdVvkI   57 (74)
T PF01191_consen   39 SSDPVA------RYLGAKPGDVVKI   57 (74)
T ss_dssp             TTSHHH------HHTT--TTSEEEE
T ss_pred             ccChhh------hhcCCCCCCEEEE
Confidence            355666      4 59999999743


No 114
>COG5428 Uncharacterized conserved small protein [Function unknown]
Probab=21.73  E-value=64  Score=21.53  Aligned_cols=14  Identities=36%  Similarity=0.700  Sum_probs=10.9

Q ss_pred             EcCCccEEEEEeee
Q psy18070         18 VNLDGEVIGINSMK   31 (169)
Q Consensus        18 vn~~G~vvGi~~~~   31 (169)
                      +|.+|+|+||-.-.
T Consensus        36 ide~GkV~GiEi~~   49 (69)
T COG5428          36 IDENGKVIGIEIWN   49 (69)
T ss_pred             ecCCCcEEEEEEEc
Confidence            47889999997643


No 115
>cd04594 CBS_pair_EriC_assoc_archaea This cd contains two tandem repeats of the cystathionine beta-synthase (CBS pair) domains associated with the EriC CIC-type chloride channels in archaea. These ion channels are proteins with a seemingly simple task of allowing the passive flow of chloride ions across biological membranes. CIC-type chloride channels come from all kingdoms of life, have several gene families, and can be gated by voltage. The members of the CIC-type chloride channel are double-barreled: two proteins forming homodimers at a broad interface formed by four helices from each protein. The two pores are not found at this interface, but are completely contained within each subunit, as deduced from the mutational analyses, unlike many other channels, in which four or five identical or structurally related subunits jointly form one pore. CBS is a small domain originally identified in cystathionine beta-synthase and subsequently found in a wide range of different proteins. CBS do
Probab=21.33  E-value=82  Score=20.86  Aligned_cols=21  Identities=33%  Similarity=0.501  Sum_probs=15.8

Q ss_pred             CCCCccceEEcCCccEEEEEee
Q psy18070          9 KFGNSGGPLVNLDGEVIGINSM   30 (169)
Q Consensus         9 n~GnSGGplvn~~G~vvGi~~~   30 (169)
                      ..+.+--|+++ +|+++|+.+.
T Consensus        78 ~~~~~~~~Vv~-~~~~iGvit~   98 (104)
T cd04594          78 KNKTRWCPVVD-DGKFKGIVTL   98 (104)
T ss_pred             HcCcceEEEEE-CCEEEEEEEH
Confidence            34455678998 7999999874


No 116
>cd04459 Rho_CSD Rho_CSD: Rho protein cold-shock domain (CSD). Rho protein is a transcription termination factor in most bacteria. In bacteria, there are two distinct mechanisms for mRNA transcription termination. In intrinsic termination, RNA polymerase and nascent mRNA are released from DNA template by an mRNA stem loop structure, which resembles the transcription termination mechanism used by eukaryotic pol III. The second mechanism is mediated by Rho factor. Rho factor terminates transcription by using energy from ATP hydrolysis to forcibly dissociate the transcripts from RNA polymerase. Rho protein contains an N-terminal S1-like domain, which binds single-stranded RNA. Rho has a C-terminal ATPase domain which hydrolyzes ATP to provide energy to strip RNA polymerase and mRNA from the DNA template. Rho functions as a homohexamer.
Probab=21.14  E-value=60  Score=21.44  Aligned_cols=12  Identities=0%  Similarity=-0.003  Sum_probs=10.9

Q ss_pred             ccCCCCCcEEEe
Q psy18070        115 SAGIKPTSSRLL  126 (169)
Q Consensus       115 ~aGL~~GDvI~~  126 (169)
                      +.|||.||.|..
T Consensus        38 r~~LR~GD~V~G   49 (68)
T cd04459          38 RFNLRTGDTVVG   49 (68)
T ss_pred             HhCCCCCCEEEE
Confidence            789999999987


No 117
>cd04617 CBS_pair_4 The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria.  The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair.  The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here.  It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic genera
Probab=21.01  E-value=1e+02  Score=20.85  Aligned_cols=21  Identities=24%  Similarity=0.289  Sum_probs=16.5

Q ss_pred             CCCccceEEcCCccEEEEEee
Q psy18070         10 FGNSGGPLVNLDGEVIGINSM   30 (169)
Q Consensus        10 ~GnSGGplvn~~G~vvGi~~~   30 (169)
                      .+-+..|++|.+|+++|+.+.
T Consensus        22 ~~~~~~~V~d~~~~~~Givt~   42 (118)
T cd04617          22 EDVGSLFVVDEDGDLVGVVSR   42 (118)
T ss_pred             cCCCEEEEEcCCCCEEEEEEH
Confidence            344577899999999999774


No 118
>cd04619 CBS_pair_6 The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria.  The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair.  The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here.  It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic genera
Probab=20.76  E-value=1.1e+02  Score=20.57  Aligned_cols=22  Identities=14%  Similarity=0.135  Sum_probs=17.1

Q ss_pred             CCCccceEEcCCccEEEEEeee
Q psy18070         10 FGNSGGPLVNLDGEVIGINSMK   31 (169)
Q Consensus        10 ~GnSGGplvn~~G~vvGi~~~~   31 (169)
                      .+...-|++|.+|+++|+.+..
T Consensus        22 ~~~~~~~Vvd~~g~~~G~vt~~   43 (114)
T cd04619          22 PGIDLVVVCDPHGKLAGVLTKT   43 (114)
T ss_pred             cCCCEEEEECCCCCEEEEEehH
Confidence            3455668999999999998743


No 119
>cd04623 CBS_pair_10 The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria.  The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair.  The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here.  It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic gener
Probab=20.60  E-value=95  Score=20.43  Aligned_cols=21  Identities=24%  Similarity=0.306  Sum_probs=16.1

Q ss_pred             CCccceEEcCCccEEEEEeee
Q psy18070         11 GNSGGPLVNLDGEVIGINSMK   31 (169)
Q Consensus        11 GnSGGplvn~~G~vvGi~~~~   31 (169)
                      +-+.=|++|.+|+++|+.+..
T Consensus        23 ~~~~~~V~~~~~~~~Giv~~~   43 (113)
T cd04623          23 NIGAVVVVDDGGRLVGIFSER   43 (113)
T ss_pred             CCCeEEEECCCCCEEEEEehH
Confidence            345568999889999997743


No 120
>cd04621 CBS_pair_8 The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria.  The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair.  The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here.  It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic genera
Probab=20.01  E-value=1.1e+02  Score=21.77  Aligned_cols=21  Identities=19%  Similarity=0.374  Sum_probs=17.2

Q ss_pred             CCccceEEcCCccEEEEEeee
Q psy18070         11 GNSGGPLVNLDGEVIGINSMK   31 (169)
Q Consensus        11 GnSGGplvn~~G~vvGi~~~~   31 (169)
                      +-+.-|++|.+|+++|+.+..
T Consensus        23 ~~~~l~V~d~~~~~~Giv~~~   43 (135)
T cd04621          23 GVGRVIVVDDNGKPVGVITYR   43 (135)
T ss_pred             CCCcceEECCCCCEEEEEeHH
Confidence            456779999999999998743


Done!