Query         043345
Match_columns 80
No_of_seqs    126 out of 574
Neff          7.7 
Searched_HMMs 46136
Date          Fri Mar 29 04:07:20 2013
Command       hhsearch -i /work/01045/syshi/csienesis_hhblits_a3m/043345.a3m -d /work/01045/syshi/HHdatabase/Cdd.hhm -o /work/01045/syshi/hhsearch_cdd/043345hhsearch_cdd -cpu 12 -v 0 

 No Hit                             Prob E-value P-value  Score    SS Cols Query HMM  Template HMM
  1 PF01627 Hpt:  Hpt domain;  Int  99.5 1.4E-13   3E-18   78.4   7.7   60    4-63     26-88  (90)
  2 KOG4747 Two-component phosphor  99.5 1.3E-13 2.8E-18   86.8   7.8   75    4-78     69-143 (150)
  3 COG2198 ArcB FOG: HPt domain [  99.4 1.9E-12 4.2E-17   78.8   8.3   67    4-70     49-118 (122)
  4 smart00073 HPT Histidine Phosp  99.4 1.3E-12 2.8E-17   74.6   4.6   64    3-67     23-86  (87)
  5 cd00088 HPT Histidine Phosphot  99.2 4.9E-11 1.1E-15   69.3   5.8   41    4-44     29-69  (94)
  6 TIGR02956 TMAO_torS TMAO reduc  99.0 5.9E-10 1.3E-14   85.1   6.6   64    4-69    905-968 (968)
  7 PRK10618 phosphotransfer inter  99.0   2E-09 4.4E-14   83.3   7.6   61    3-63    832-892 (894)
  8 PRK11091 aerobic respiration c  98.9 8.9E-09 1.9E-13   77.6   7.3   70    3-72    706-775 (779)
  9 PRK11107 hybrid sensory histid  98.2 7.8E-06 1.7E-10   62.3   8.7   67    4-70    850-917 (919)
 10 PRK11466 hybrid sensory histid  98.2 3.6E-06 7.8E-11   64.4   5.3   38    4-41    849-886 (914)
 11 COG0643 CheA Chemotaxis protei  97.9 6.2E-05 1.4E-09   57.8   6.9   63    6-68     40-105 (716)
 12 PRK10547 chemotaxis protein Ch  97.8 8.7E-05 1.9E-09   56.6   7.2   59    6-64     37-98  (670)
 13 PRK09959 hybrid sensory histid  97.1  0.0049 1.1E-07   48.9   8.5   67    4-70   1126-1193(1197)
 14 PRK15347 two component system   96.7  0.0049 1.1E-07   47.3   5.9   54    9-63    862-915 (921)
 15 PF08900 DUF1845:  Domain of un  83.9      10 0.00022   25.4   7.2   54   19-72     32-91  (217)
 16 PF13779 DUF4175:  Domain of un  81.7     4.6 9.9E-05   32.2   5.5   45   19-63    530-574 (820)
 17 PF14276 DUF4363:  Domain of un  78.2     5.3 0.00012   23.9   3.9   49   13-66     17-65  (121)
 18 TIGR02302 aProt_lowcomp conser  77.2       9 0.00019   30.8   5.8   44   21-64    562-605 (851)
 19 PLN02956 PSII-Q subunit         72.9      24 0.00053   23.3   6.3   39   27-65    146-184 (185)
 20 smart00388 HisKA His Kinase A   72.3      10 0.00022   18.6   6.9   58    9-74      5-62  (66)
 21 COG2991 Uncharacterized protei  70.8     0.8 1.7E-05   25.9  -0.9   26   16-42     28-53  (77)
 22 TIGR03761 ICE_PFL4669 integrat  69.4      32 0.00069   23.2   7.3   51   21-71     32-88  (216)
 23 PF03670 UPF0184:  Uncharacteri  69.1      20 0.00043   20.7   5.2   37   41-77     31-67  (83)
 24 PF03993 DUF349:  Domain of Unk  68.7      13 0.00028   20.1   3.8   32   30-61     37-68  (77)
 25 PF07361 Cytochrom_B562:  Cytoc  65.2      16 0.00036   21.5   3.9   33   24-56     60-92  (103)
 26 PF00512 HisKA:  His Kinase A (  65.1      18 0.00039   18.7   8.7   58    9-74      5-64  (68)
 27 KOG2424 Protein involved in tr  64.6      15 0.00033   24.3   3.9   24   21-46    147-170 (195)
 28 PF04837 MbeB_N:  MbeB-like, N-  64.4      19 0.00042   18.9   7.1   48   28-79      3-50  (52)
 29 KOG3232 Vacuolar assembly/sort  63.5      41 0.00088   22.3   6.2   40   28-67     94-133 (203)
 30 PF04722 Ssu72:  Ssu72-like pro  58.6      23 0.00049   23.6   4.0   36   21-57    145-181 (195)
 31 PRK13858 type IV secretion sys  56.8      49  0.0011   21.1   7.6   62    8-77     73-136 (147)
 32 PF05757 PsbQ:  Oxygen evolving  55.8      42 0.00092   22.4   4.9   38   27-64    162-199 (202)
 33 PF12854 PPR_1:  PPR repeat      55.7      15 0.00033   16.9   2.2   22   34-55     12-33  (34)
 34 PF09577 Spore_YpjB:  Sporulati  53.9      58  0.0013   22.2   5.5   40   28-67    102-141 (232)
 35 PF04400 DUF539:  Protein of un  52.2     1.1 2.4E-05   23.0  -2.3   19   16-34      7-25  (45)
 36 PF07870 DUF1657:  Protein of u  52.2      33 0.00071   17.7   5.0   36   34-69     14-49  (50)
 37 PF14077 WD40_alt:  Alternative  49.9      29 0.00063   17.9   2.7   36   43-78     11-46  (48)
 38 COG2178 Predicted RNA-binding   48.3      83  0.0018   21.2   6.5   43   27-69     27-69  (204)
 39 cd00082 HisKA Histidine Kinase  46.1      35 0.00076   16.2   7.5   55    9-71      7-62  (65)
 40 PF09403 FadA:  Adhesion protei  45.1      74  0.0016   19.7   5.0    9   20-28     13-21  (126)
 41 PF01535 PPR:  PPR repeat;  Int  44.7      28  0.0006   14.7   2.7   22   36-57      7-28  (31)
 42 PF00435 Spectrin:  Spectrin re  44.4      52  0.0011   17.7   7.2   45   26-71     57-101 (105)
 43 PRK10548 flagellar biosynthesi  43.5      76  0.0016   19.4   7.8   23   27-49     12-34  (121)
 44 PF13812 PPR_3:  Pentatricopept  42.4      33 0.00071   14.8   3.3   22   36-57      8-29  (34)
 45 PF15300 INT_SG_DDX_CT_C:  INTS  41.8      20 0.00044   19.7   1.5   20    2-21     17-36  (65)
 46 PF11173 DUF2960:  Protein of u  41.6      22 0.00047   20.4   1.6   15   66-80     38-52  (79)
 47 PF10180 DUF2373:  Uncharacteri  37.9      67  0.0014   17.5   3.2   29    3-35     35-63  (65)
 48 PRK10987 regulatory protein Am  37.9      63  0.0014   22.3   3.8   35   21-55     79-113 (284)
 49 PF08738 Gon7:  Gon7 family;  I  37.4      92   0.002   18.6   4.0   26   47-72     51-76  (103)
 50 PRK00068 hypothetical protein;  37.4 1.2E+02  0.0027   25.0   5.7   39   26-64    930-968 (970)
 51 PRK10265 chaperone-modulator p  37.3      58  0.0013   19.0   3.1   23   49-71     77-99  (101)
 52 TIGR00756 PPR pentatricopeptid  37.3      39 0.00085   14.3   3.2   22   36-57      7-28  (35)
 53 KOG4182 Uncharacterized conser  37.1   2E+02  0.0044   22.5   7.8   58    4-61     73-166 (828)
 54 PF01044 Vinculin:  Vinculin fa  37.0 1.6E+02  0.0035   24.1   6.3   61    8-68    411-480 (968)
 55 PF05190 MutS_IV:  MutS family   36.6      66  0.0014   17.5   3.2   26   50-75      4-29  (92)
 56 PF14756 Pdase_C33_assoc:  Pept  35.9 1.1E+02  0.0024   19.0   4.6   21   24-44     70-90  (147)
 57 PRK15058 cytochrome b562; Prov  35.0 1.1E+02  0.0025   19.0   4.6   32   25-56     86-117 (128)
 58 TIGR00444 mazG MazG family pro  34.0 1.1E+02  0.0023   21.1   4.3   43   25-67    129-172 (248)
 59 PRK11107 hybrid sensory histid  33.4 2.3E+02  0.0051   22.1   6.7   55   10-72    297-351 (919)
 60 COG1270 CbiB Cobalamin biosynt  32.9   1E+02  0.0022   22.1   4.2   35   21-55     94-128 (320)
 61 PF09686 Plasmid_RAQPRD:  Plasm  32.7      53  0.0012   18.8   2.3   23   52-74     46-68  (81)
 62 PLN02999 photosystem II oxygen  31.9 1.6E+02  0.0034   19.6   5.7   38   27-64    150-187 (190)
 63 PRK03170 dihydrodipicolinate s  31.8 1.7E+02  0.0036   19.9   5.2   41   18-58    197-239 (292)
 64 PF13041 PPR_2:  PPR repeat fam  31.8      69  0.0015   15.5   3.2   24   34-57      8-31  (50)
 65 PF09660 DUF2397:  Protein of u  31.5 2.3E+02  0.0049   21.3   9.2   64   14-77    105-174 (486)
 66 PF07891 DUF1666:  Protein of u  31.2 1.8E+02  0.0039   20.2   5.4   47   27-74      7-62  (247)
 67 PLN02729 PSII-Q subunit         31.2 1.7E+02  0.0037   19.9   5.4   37   27-63    180-216 (220)
 68 COG4354 Predicted bile acid be  31.1 1.9E+02  0.0041   22.9   5.6   56   14-73    501-556 (721)
 69 PRK01209 cobD cobalamin biosyn  30.3 1.1E+02  0.0024   21.4   4.1   33   23-55     92-124 (312)
 70 TIGR00674 dapA dihydrodipicoli  29.7 1.8E+02   0.004   19.7   5.1   42   17-58    193-236 (285)
 71 TIGR02878 spore_ypjB sporulati  27.7 1.5E+02  0.0033   20.4   4.2   33   29-61    104-136 (233)
 72 PRK10722 hypothetical protein;  27.5 2.1E+02  0.0047   19.8   4.9   32   45-76    178-209 (247)
 73 cd00408 DHDPS-like Dihydrodipi  27.5 1.8E+02  0.0039   19.5   4.6   43   18-60    193-237 (281)
 74 PF13047 DUF3907:  Protein of u  27.3 1.2E+02  0.0026   19.4   3.5   26   44-69    117-142 (148)
 75 PF13747 DUF4164:  Domain of un  27.1 1.3E+02  0.0029   17.2   4.2   23   47-69     36-58  (89)
 76 PHA02585 16 small terminase pr  27.0 1.8E+02  0.0039   18.8   7.6   64    4-76     45-109 (161)
 77 PF00701 DHDPS:  Dihydrodipicol  26.5 1.6E+02  0.0035   19.9   4.3   42   19-60    198-241 (289)
 78 PF08657 DASH_Spc34:  DASH comp  25.0 2.4E+02  0.0052   19.5   6.6   62    8-69    134-199 (259)
 79 cd00950 DHDPS Dihydrodipicolin  24.9 2.3E+02  0.0049   19.2   5.2   41   18-58    196-238 (284)
 80 TIGR01690 ICE_RAQPRD integrati  24.9      61  0.0013   19.1   1.7   23   52-74     59-81  (94)
 81 PF08581 Tup_N:  Tup N-terminal  24.7 1.4E+02  0.0031   16.8   3.8   17   50-66      4-20  (79)
 82 cd05136 RasGAP_DAB2IP The DAB2  24.2 2.3E+02  0.0049   20.0   4.7   46   28-73     13-58  (309)
 83 COG0783 Dps DNA-binding ferrit  24.0   2E+02  0.0044   18.3   5.2   66    7-72     57-124 (156)
 84 PF02870 Methyltransf_1N:  6-O-  23.9 1.1E+02  0.0024   16.4   2.6   17   59-75     52-68  (77)
 85 PRK10841 hybrid sensory kinase  23.9   4E+02  0.0086   21.6   8.0   56   10-73    451-506 (924)
 86 TIGR02956 TMAO_torS TMAO reduc  23.9 3.7E+02  0.0079   21.2   7.8   55   10-72    468-522 (968)
 87 PF01231 IDO:  Indoleamine 2,3-  23.7 2.8E+02  0.0061   20.4   5.3   34   38-71    181-214 (422)
 88 PF13942 Lipoprotein_20:  YfhG   23.4 2.3E+02   0.005   18.7   4.6   32   45-76    132-163 (179)
 89 TIGR03042 PS_II_psbQ_bact phot  23.4   2E+02  0.0044   18.1   6.7   37   29-72    104-140 (142)
 90 PF01322 Cytochrom_C_2:  Cytoch  23.4 1.7E+02  0.0037   17.2   6.4   36   28-63     83-118 (122)
 91 PF11855 DUF3375:  Protein of u  23.4 3.2E+02   0.007   20.4   7.4   64    5-69    100-163 (478)
 92 cd00089 HR1 Protein kinase C-r  23.4 1.4E+02  0.0029   16.1   4.1   29   43-71     42-70  (72)
 93 COG1220 HslU ATP-dependent pro  22.9 1.2E+02  0.0026   22.6   3.2   31    7-37    374-404 (444)
 94 PRK06342 transcription elongat  22.9 2.1E+02  0.0046   18.2   6.2   46    7-67     36-81  (160)
 95 KOG3612 PHD Zn-finger protein   22.9 2.6E+02  0.0056   21.8   5.0   36   29-64    491-526 (588)
 96 PF10481 CENP-F_N:  Cenp-F N-te  22.6 2.9E+02  0.0064   19.7   5.6   42   27-68     70-127 (307)
 97 COG5214 POL12 DNA polymerase a  22.5 1.3E+02  0.0028   22.9   3.3   45   27-80     23-68  (581)
 98 COG5613 Uncharacterized conser  22.1 3.3E+02  0.0072   20.1   5.7   43   25-67    319-361 (400)
 99 TIGR00465 ilvC ketol-acid redu  22.1 2.7E+02  0.0058   19.6   4.8   38    9-46    210-247 (314)
100 PF10191 COG7:  Golgi complex c  22.0 3.1E+02  0.0067   21.9   5.5   38   25-62    126-163 (766)
101 PRK04778 septation ring format  22.0 3.6E+02  0.0079   20.5   6.5   42   28-69    176-217 (569)
102 PF07743 HSCB_C:  HSCB C-termin  21.6 1.5E+02  0.0033   16.0   6.6   25   31-55     42-66  (78)
103 PF10069 DICT:  Sensory domain   21.5      56  0.0012   19.9   1.2   50   19-72     22-72  (129)
104 PF03858 Crust_neuro_H:  Crusta  21.5      93   0.002   15.6   1.7   22    6-27      5-26  (41)
105 COG1598 Predicted nuclease of   21.5      66  0.0014   17.6   1.3   28   46-73     24-51  (73)
106 PF04428 Choline_kin_N:  Cholin  21.4      97  0.0021   16.3   1.9   11    8-18     30-40  (53)
107 PF05465 Halo_GVPC:  Halobacter  21.0 1.1E+02  0.0024   14.2   3.7   23   48-70      4-26  (32)
108 PF06160 EzrA:  Septation ring   20.9 3.9E+02  0.0084   20.4   6.7   42   27-68    171-212 (560)
109 PRK14158 heat shock protein Gr  20.8 2.6E+02  0.0057   18.5   5.4   15   28-42     95-109 (194)
110 cd04751 Commd3 COMM_Domain con  20.8      96  0.0021   17.9   2.0   16    6-21     72-87  (95)
111 PF14989 CCDC32:  Coiled-coil d  20.8      99  0.0021   19.7   2.2   71    9-79     60-133 (148)
112 PF13779 DUF4175:  Domain of un  20.8 4.7E+02    0.01   21.3   6.4   21   53-73    492-512 (820)
113 COG2841 Uncharacterized protei  20.8 1.7E+02  0.0038   16.4   6.7   65    7-73      2-69  (72)
114 PF04782 DUF632:  Protein of un  20.5 1.7E+02  0.0037   20.8   3.5   36   21-56    271-306 (312)
115 PRK08655 prephenate dehydrogen  20.3 3.6E+02  0.0078   19.8   5.7   35   28-62    240-274 (437)

No 1  
>PF01627 Hpt:  Hpt domain;  InterPro: IPR008207 Two-component signal transduction systems enable bacteria to sense, respond, and adapt to a wide range of environments, stressors, and growth conditions. Some bacteria can contain as many as 200 two-component systems that need tight regulation to prevent unwanted cross-talk. These pathways have been adapted to respond to a wide variety of stimuli, including nutrients, cellular redox state, changes in osmolarity, quorum signals, antibiotics, and more. Two-component systems comprise a sensor histidine kinase (HK) and its cognate response regulator (RR). The HK catalyses its own auto-phosphorylation followed by the transfer of the phosphoryl group to the receiver domain on RR; phosphorylation of the RR usually activates an attached output domain, which can then effect changes in cellular physiology, often by regulating gene expression. Some HK are bifunctional, catalysing both the phosphorylation and dephosphorylation of their cognate RR. The input stimuli can regulate either the kinase or phosphatase activity of the bifunctional HK. A variant of the two-component system is the phospho-relay system. Here a hybrid HK auto-phosphorylates and then transfers the phosphoryl group to an internal receiver domain, rather than to a separate RR protein. The phosphoryl group is then shuttled to histidine phosphotransferase (HPT) and subsequently to a terminal RR, which can evoke the desired response. Signal transducing histidine kinases are the key elements in two-component signal transduction systems, which control complex processes such as the initiation of development in microorganisms. Examples of histidine kinases are EnvZ, which plays a central role in osmoregulation, and CheA, which plays a central role in the chemotaxis system. Histidine kinases usually have an N-terminal ligand-binding domain and a C-terminal kinase domain, but other domains may also be present.
The kinase domain is responsible for the autophosphorylation of the histidine with ATP, the phosphotransfer from the kinase to an aspartate of the response regulator, and (with bifunctional enzymes) the phosphotransfer from aspartyl phosphate back to ADP or to water. The kinase core has a unique fold, distinct from that of the Ser/Thr/Tyr kinase superfamily. HKs can be roughly divided into two classes: orthodox and hybrid kinases. Most orthodox HKs, typified by the Escherichia coli EnvZ protein, function as periplasmic membrane receptors and have a signal peptide and transmembrane segment(s) that separate the protein into a periplasmic N-terminal sensing domain and a highly conserved cytoplasmic C-terminal kinase core. Members of this family, however, have an integral membrane sensor domain. Not all orthodox kinases are membrane bound, e.g., the nitrogen regulatory kinase NtrB (GlnL) is a soluble cytoplasmic HK. Hybrid kinases contain multiple phosphodonor and phosphoacceptor sites and use multi-step phospho-relay schemes instead of promoting a single phosphoryl transfer. In addition to the sensor domain and kinase core, they contain a CheY-like receiver domain and a His-containing phosphotransfer (HPt) domain. This entry represents a domain present at the N terminus in proteins which undergo autophosphorylation. The group includes the gliding motility regulatory protein from Myxococcus xanthus and a number of bacterial chemotaxis proteins.; GO: 0004871 signal transducer activity, 0000160 two-component signal transduction system (phosphorelay); PDB: 3KYJ_A 3KYI_A 3IQT_A 1Y6D_A 2LD6_A 1TQG_A 2R25_A 1OXB_A 1QSP_B 1C03_B ....
Probab=99.51  E-value=1.4e-13  Score=78.44  Aligned_cols=60  Identities=20%  Similarity=0.347  Sum_probs=50.7

Q ss_pred             chhHHHHHHHHHHhhhhhhccChHHHHHHHHHHHHHHhhcCHHH---HHHHHHHHHHHHHHHH
Q 043345            4 NVDFTKVGGHVHQLKGSSSSIGAQRVNNVCTAFRSFCEERNIEG---CQQYLQHLKQEYYLVK   63 (80)
Q Consensus         4 ~~D~~~~~~laH~LKGss~nlGa~~L~~~c~~lE~~~~~~~~~~---~~~~~~~l~~~~~~l~   63 (80)
                      ..|++.+.+.+|+|||+++++|+..+..+|..+|..++.++.+.   +...++.|...++.+.
T Consensus        26 ~~d~~~l~~~~H~lkG~a~~~g~~~l~~~~~~lE~~~~~~~~~~~~~~~~~~~~l~~~l~~l~   88 (90)
T PF01627_consen   26 QEDWEELRRLAHRLKGSAGNLGAPRLAELAEQLEQALKSGDKPEAEELEQLLDELEAMLEQLR   88 (90)
T ss_dssp             HCHHHHHHHHHHHHHHHHHHTTCHHHHHHHHHHHHHHHTTHHHHSHHHHHHHHHHHHHHHHHH
T ss_pred             HhhHHHHHHHHHHHhhhHHhcCHHHHHHHHHHHHHHHHcCCccchhHHHHHHHHHHHHHHHHh
Confidence            57999999999999999999999999999999999999988877   5555555555555544


No 2  
>KOG4747 consensus Two-component phosphorelay intermediate involved in MAP kinase cascade regulation [Signal transduction mechanisms]
Probab=99.50  E-value=1.3e-13  Score=86.76  Aligned_cols=75  Identities=57%  Similarity=0.936  Sum_probs=71.9

Q ss_pred             chhHHHHHHHHHHhhhhhhccChHHHHHHHHHHHHHHhhcCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhh
Q 043345            4 NVDFTKVGGHVHQLKGSSSSIGAQRVNNVCTAFRSFCEERNIEGCQQYLQHLKQEYYLVKNKLQTLFQVCLKSLA   78 (80)
Q Consensus         4 ~~D~~~~~~laH~LKGss~nlGa~~L~~~c~~lE~~~~~~~~~~~~~~~~~l~~~~~~l~~~L~~~l~~~~~~~~   78 (80)
                      ++|+..++...|.+||||.+|||.++...|..+...|+.+|.+++...+++++.+|..++.+|++++++|||.+.
T Consensus        69 ~~d~k~~~~~~hqlkgssssIGa~kvk~~c~~~~~~~~~~n~egcvr~l~~v~ie~~~lkkkL~~~f~L~rq~i~  143 (150)
T KOG4747|consen   69 ERDFKKLGSHVHQLKGSSSSIGALKVKKVCVGFNEFCEAGNIEGCVRCLQQVKIEYSLLKKKLETLFQLERQEIL  143 (150)
T ss_pred             HhHHHHHHHHHHHccCchhhhhHHHHHHHHHHHHHHHhhccchhHhhchHHHHHHHHHHHHHHHHHHHHHHHHHH
Confidence            349999999999999999999999999999999999999999999999999999999999999999999999554


No 3  
>COG2198 ArcB FOG: HPt domain [Signal transduction mechanisms]
Probab=99.41  E-value=1.9e-12  Score=78.76  Aligned_cols=67  Identities=22%  Similarity=0.380  Sum_probs=58.2

Q ss_pred             chhHHHHHHHHHHhhhhhhccChHHHHHHHHHHHHHHhh-cCHHHHHHHHHHHHHH--HHHHHHHHHHHH
Q 043345            4 NVDFTKVGGHVHQLKGSSSSIGAQRVNNVCTAFRSFCEE-RNIEGCQQYLQHLKQE--YYLVKNKLQTLF   70 (80)
Q Consensus         4 ~~D~~~~~~laH~LKGss~nlGa~~L~~~c~~lE~~~~~-~~~~~~~~~~~~l~~~--~~~l~~~L~~~l   70 (80)
                      ..|+..+.+++|+|||+++++|+..|...|.+||..++. ...+....++.++..+  +..+...+.++.
T Consensus        49 ~~d~~~~~~~aH~lkg~a~~lg~~~L~~~~~~lE~~~~~~~~~~~~~~~i~~l~~~~~~~~~~~~~~~~~  118 (122)
T COG2198          49 AEDNDGLARLAHRLKGSAASLGLPALAQLCQQLEDALRSGASLEELEELIAELKDELQLDVLALELLTYL  118 (122)
T ss_pred             cCCcHHHHHHHHHHHhHHHhccHHHHHHHHHHHHHHHHcCCcHHHHHHHHHHHHHHhcchHHHHHHHHHh
Confidence            568899999999999999999999999999999999988 5677889999999999  666666665554


No 4  
>smart00073 HPT Histidine Phosphotransfer domain. Contains an active histidine residue that mediates phosphotransfer reactions. Domain detected only in eubacteria. This alignment is an extension to that shown in the Cell structure paper.
Probab=99.36  E-value=1.3e-12  Score=74.63  Aligned_cols=64  Identities=20%  Similarity=0.352  Sum_probs=52.1

Q ss_pred             CchhHHHHHHHHHHhhhhhhccChHHHHHHHHHHHHHHhhcCHHHHHHHHHHHHHHHHHHHHHHH
Q 043345            3 QNVDFTKVGGHVHQLKGSSSSIGAQRVNNVCTAFRSFCEERNIEGCQQYLQHLKQEYYLVKNKLQ   67 (80)
Q Consensus         3 ~~~D~~~~~~laH~LKGss~nlGa~~L~~~c~~lE~~~~~~~~~~~~~~~~~l~~~~~~l~~~L~   67 (80)
                      ...|+..+.+.+|+|||+++++|+..|..+|..+|..++... ++...+++.+...|..+...|.
T Consensus        23 ~~~~~~~l~~~~H~LKG~a~~~g~~~l~~~~~~lE~~~~~~~-~~~~~~~~~l~~~~~~~~~~l~   86 (87)
T smart00073       23 DAQDVNEIFRAAHTLKGSAGSLGLQQLAQLCHQLENLLDAAR-SGEVELTPDLLDLLLELVDVLK   86 (87)
T ss_pred             CHhHHHHHHHHHHhhhhhHHhcCHHHHHHHHHHHHHHHHHHH-cCCCCCCHHHHHHHHHHHHHHc
Confidence            357899999999999999999999999999999999987644 3334566777777777766553


No 5  
>cd00088 HPT Histidine Phosphotransfer domain, involved in signalling through two-component systems in which an autophosphorylating histidine protein kinase serves as a phosphoryl donor to a response regulator protein; the response regulator protein is modulated by phosphorylation and dephosphorylation of a conserved aspartic acid residue; two-component proteins are abundant in most eubacteria; in E. coli there are 62 two-component proteins involved in a variety of processes such as chemotaxis, osmoregulation, metabolism and transport; also present in both Gram positive and Gram negative pathogenic bacteria where they regulate basic housekeeping functions and control expression of toxins and other proteins important for pathogenesis; in archaea and eukaryotes, two-component pathways constitute a very small number of all signaling systems; in fungi they mediate environmental stress responses and, in pathogenic yeast, hyphal development. In Dictyostelium and in plants, they are i
Probab=99.21  E-value=4.9e-11  Score=69.25  Aligned_cols=41  Identities=24%  Similarity=0.429  Sum_probs=38.9

Q ss_pred             chhHHHHHHHHHHhhhhhhccChHHHHHHHHHHHHHHhhcC
Q 043345            4 NVDFTKVGGHVHQLKGSSSSIGAQRVNNVCTAFRSFCEERN   44 (80)
Q Consensus         4 ~~D~~~~~~laH~LKGss~nlGa~~L~~~c~~lE~~~~~~~   44 (80)
                      +.|+..+...+|+|||+++++|+..|...|..+|..++.+.
T Consensus        29 ~~d~~~l~~~~H~LkGsa~~~G~~~l~~~~~~lE~~~~~~~   69 (94)
T cd00088          29 AEDLNEIFRAAHTLKGSAASLGLQRLAQLAHQLEDLLDALR   69 (94)
T ss_pred             HHHHHHHHHHHHhhhhHHhcCChHHHHHHHHHHHHHHHHHH
Confidence            68999999999999999999999999999999999998754


No 6  
>TIGR02956 TMAO_torS TMAO reductase system sensor TorS. This protein, TorS, is part of a regulatory system for the torCAD operon that encodes the pterin molybdenum cofactor-containing enzyme trimethylamine-N-oxide (TMAO) reductase (TorA), a cognate chaperone (TorD), and a penta-haem cytochrome (TorC). TorS works together with the inducer-binding protein TorT and the response regulator TorR. TorS contains histidine kinase ATPase (pfam02518), HAMP (pfam00672), phosphoacceptor (pfam00512), and phosphotransfer (pfam01627) domains and a response regulator receiver domain (pfam00072).
Probab=99.03  E-value=5.9e-10  Score=85.10  Aligned_cols=64  Identities=20%  Similarity=0.330  Sum_probs=58.3

Q ss_pred             chhHHHHHHHHHHhhhhhhccChHHHHHHHHHHHHHHhhcCHHHHHHHHHHHHHHHHHHHHHHHHH
Q 043345            4 NVDFTKVGGHVHQLKGSSSSIGAQRVNNVCTAFRSFCEERNIEGCQQYLQHLKQEYYLVKNKLQTL   69 (80)
Q Consensus         4 ~~D~~~~~~laH~LKGss~nlGa~~L~~~c~~lE~~~~~~~~~~~~~~~~~l~~~~~~l~~~L~~~   69 (80)
                      .+|+..++..+|+|||+++++|+..|...|.+||..++.+++  ....++.|...|..+...|+.|
T Consensus       905 ~~d~~~~~~~~H~lkg~~~~~g~~~l~~~~~~le~~~~~~~~--~~~~~~~l~~~~~~~~~~l~~~  968 (968)
T TIGR02956       905 VDDDAQIKKLAHKLKGSAGSLGLTQLTQLCQQLEKQGKTGAL--ELSDIDEIKQAWQASKTALDQW  968 (968)
T ss_pred             CCCHHHHHHHHHHHHHHHHHhCHHHHHHHHHHHHHhcccCCc--chhHHHHHHHHHHHHHHHHHhC
Confidence            579999999999999999999999999999999999999887  4567899999999999988764


No 7  
>PRK10618 phosphotransfer intermediate protein in two-component regulatory system with RcsBC; Provisional
Probab=98.98  E-value=2e-09  Score=83.32  Aligned_cols=61  Identities=16%  Similarity=0.348  Sum_probs=55.8

Q ss_pred             CchhHHHHHHHHHHhhhhhhccChHHHHHHHHHHHHHHhhcCHHHHHHHHHHHHHHHHHHH
Q 043345            3 QNVDFTKVGGHVHQLKGSSSSIGAQRVNNVCTAFRSFCEERNIEGCQQYLQHLKQEYYLVK   63 (80)
Q Consensus         3 ~~~D~~~~~~laH~LKGss~nlGa~~L~~~c~~lE~~~~~~~~~~~~~~~~~l~~~~~~l~   63 (80)
                      ..+||..+...||+|||+++++|+..+.++|..||+.++.++..++...+.+|...+..+.
T Consensus       832 ~~~D~~~l~~~aHrLKG~~aml~l~~l~~~~~~LE~~i~~~~~~~i~~~i~~id~~v~~ll  892 (894)
T PRK10618        832 ATSDFASLAQTAHRLKGVFAMLNLVPGKQLCETLEHLIREKDEPGIENYISDIDSFVKSLL  892 (894)
T ss_pred             hccCHHHHHHHHHHHHHHHHHcChHHHHHHHHHHHHHHhhCChHHHHHHHHHHHHHHHHHh
Confidence            3579999999999999999999999999999999999999999999888888888776654


No 8  
>PRK11091 aerobic respiration control sensor protein ArcB; Provisional
Probab=98.86  E-value=8.9e-09  Score=77.65  Aligned_cols=70  Identities=17%  Similarity=0.261  Sum_probs=65.7

Q ss_pred             CchhHHHHHHHHHHhhhhhhccChHHHHHHHHHHHHHHhhcCHHHHHHHHHHHHHHHHHHHHHHHHHHHH
Q 043345            3 QNVDFTKVGGHVHQLKGSSSSIGAQRVNNVCTAFRSFCEERNIEGCQQYLQHLKQEYYLVKNKLQTLFQV   72 (80)
Q Consensus         3 ~~~D~~~~~~laH~LKGss~nlGa~~L~~~c~~lE~~~~~~~~~~~~~~~~~l~~~~~~l~~~L~~~l~~   72 (80)
                      +..|+..+...+|+|||+++++|+..|..+|..+|.....+.++....++++|..+|......|+.|+..
T Consensus       706 ~~~d~~~~~~~ah~l~g~~~~~g~~~l~~~~~~le~~~~~~~~~~~~~~~~~l~~~~~~~~~~~~~~~~~  775 (779)
T PRK11091        706 TARDQKGIVEEAHKIKGAAGSVGLRHLQQLAQQIQSPDLPAWWDNVQDWVEELKNEWRHDVEVLKAWLAQ  775 (779)
T ss_pred             HCCCHHHHHHHHHHHHHHHHHhhHHHHHHHHHHHhCcCccccHHHHHHHHHHHHHHHHHHHHHHHHHHHH
Confidence            3578899999999999999999999999999999999888889999999999999999999999999875


No 9  
>PRK11107 hybrid sensory histidine kinase BarA; Provisional
Probab=98.25  E-value=7.8e-06  Score=62.30  Aligned_cols=67  Identities=21%  Similarity=0.276  Sum_probs=55.4

Q ss_pred             chhHHHHHHHHHHhhhhhhccChHHHHHHHHHHHHHHhhcC-HHHHHHHHHHHHHHHHHHHHHHHHHH
Q 043345            4 NVDFTKVGGHVHQLKGSSSSIGAQRVNNVCTAFRSFCEERN-IEGCQQYLQHLKQEYYLVKNKLQTLF   70 (80)
Q Consensus         4 ~~D~~~~~~laH~LKGss~nlGa~~L~~~c~~lE~~~~~~~-~~~~~~~~~~l~~~~~~l~~~L~~~l   70 (80)
                      ..|+..+..++|++||+++++|+..+...|..+|..++.+. .+.+...+..+..++..+...+..++
T Consensus       850 ~~~~~~~~~~~h~l~g~~~~~g~~~l~~~~~~le~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~  917 (919)
T PRK11107        850 GEDPEGLLDLIHKLHGSCSYSGVPRLKKLCQLIEQQLRSGTSVEDLEPELLELLDEMENVARAAKKVL  917 (919)
T ss_pred             CCCHHHHHHHHHHHHHHHHHhhHHHHHHHHHHHHHHHHcCCChhhHHHHHHHHHHHHHHHHHHHHHHh
Confidence            46788999999999999999999999999999999998764 45666667777777777777776665


No 10 
>PRK11466 hybrid sensory histidine kinase TorS; Provisional
Probab=98.17  E-value=3.6e-06  Score=64.40  Aligned_cols=38  Identities=29%  Similarity=0.415  Sum_probs=35.4

Q ss_pred             chhHHHHHHHHHHhhhhhhccChHHHHHHHHHHHHHHh
Q 043345            4 NVDFTKVGGHVHQLKGSSSSIGAQRVNNVCTAFRSFCE   41 (80)
Q Consensus         4 ~~D~~~~~~laH~LKGss~nlGa~~L~~~c~~lE~~~~   41 (80)
                      .+|+..+...+|+|||+++++|+..+...|.++|..+.
T Consensus       849 ~~~~~~~~~~ah~lkg~~~~lg~~~l~~~~~~le~~~~  886 (914)
T PRK11466        849 SQDSEKIKRAAHQLKSSCSSLGMRQASQACAQLEQQPL  886 (914)
T ss_pred             CCCHHHHHHHHHHHHHHHHHhHHHHHHHHHHHHhCCCC
Confidence            47899999999999999999999999999999999764


No 11 
>COG0643 CheA Chemotaxis protein histidine kinase and related kinases [Cell motility and secretion / Signal transduction mechanisms]
Probab=97.85  E-value=6.2e-05  Score=57.76  Aligned_cols=63  Identities=16%  Similarity=0.269  Sum_probs=46.6

Q ss_pred             hHHHHHHHHHHhhhhhhccChHHHHHHHHHHHHH---HhhcCHHHHHHHHHHHHHHHHHHHHHHHH
Q 043345            6 DFTKVGGHVHQLKGSSSSIGAQRVNNVCTAFRSF---CEERNIEGCQQYLQHLKQEYYLVKNKLQT   68 (80)
Q Consensus         6 D~~~~~~laH~LKGss~nlGa~~L~~~c~~lE~~---~~~~~~~~~~~~~~~l~~~~~~l~~~L~~   68 (80)
                      ....+.+.+|+|||+++.+|+..+..+|..+|..   .+++...--..++..+-.+.+.+...++.
T Consensus        40 ~ln~ifRaaHTlKG~a~~~g~~~l~~l~H~~E~~ld~~r~g~~~~~~~l~d~~l~~~D~l~~~~~~  105 (716)
T COG0643          40 LLNAIFRAAHTLKGGAGTLGLTTLAELAHAMEDLLDALRNGELELTSELLDLLLEALDALEEMLDA  105 (716)
T ss_pred             HHHHHHHHHHhhhhhhhhcChhHHHHHHHHHHHHHHHHhcCCccCcHHHHHHHhhhhHHHHHHHHh
Confidence            3567899999999999999999999999999975   46666554455555555555555554443


No 12 
>PRK10547 chemotaxis protein CheA; Provisional
Probab=97.82  E-value=8.7e-05  Score=56.61  Aligned_cols=59  Identities=7%  Similarity=0.191  Sum_probs=41.8

Q ss_pred             hHHHHHHHHHHhhhhhhccChHHHHHHHHHHHHHH---hhcCHHHHHHHHHHHHHHHHHHHH
Q 043345            6 DFTKVGGHVHQLKGSSSSIGAQRVNNVCTAFRSFC---EERNIEGCQQYLQHLKQEYYLVKN   64 (80)
Q Consensus         6 D~~~~~~laH~LKGss~nlGa~~L~~~c~~lE~~~---~~~~~~~~~~~~~~l~~~~~~l~~   64 (80)
                      ....+.+.+|+|||+|+.+|+..|..+|..+|...   |+|...-...++.-+-.+++.+..
T Consensus        37 ~in~lFRa~HTiKG~a~~~g~~~i~~l~H~~E~lld~vR~g~l~~~~~~~dlll~~~D~l~~   98 (670)
T PRK10547         37 QLNAIFRAAHSIKGGAGTFGFTVLQETTHLMENLLDEARRGEMQLNTDIINLFLETKDIMQE   98 (670)
T ss_pred             HHHHHHHHHHhhhhHHhhcCchHHHHHHHHHHHHHHHHHCCCCCCCHHHHHHHHHHHHHHHH
Confidence            35678899999999999999999999999999766   555433223344444444444433


No 13 
>PRK09959 hybrid sensory histidine kinase in two-component regulatory system with EvgA; Provisional
Probab=97.06  E-value=0.0049  Score=48.90  Aligned_cols=67  Identities=12%  Similarity=0.177  Sum_probs=53.6

Q ss_pred             chhHHHHHHHHHHhhhhhhccChHHHHHHHHHHHHHHhhcC-HHHHHHHHHHHHHHHHHHHHHHHHHH
Q 043345            4 NVDFTKVGGHVHQLKGSSSSIGAQRVNNVCTAFRSFCEERN-IEGCQQYLQHLKQEYYLVKNKLQTLF   70 (80)
Q Consensus         4 ~~D~~~~~~laH~LKGss~nlGa~~L~~~c~~lE~~~~~~~-~~~~~~~~~~l~~~~~~l~~~L~~~l   70 (80)
                      .+|...+..++|+++|++..+|+..|...|.++|......+ .+.+...++.+...+..+...+..|+
T Consensus      1126 ~~~~~~~~~~~h~~~g~~~~l~~~~l~~~~~~~e~~~~~~~~~~~l~~~~~~~~~~~~~~~~~~~~~~ 1193 (1197)
T PRK09959       1126 AGDNRTFHQCIHRIHGAANILNLQKLINISHQLEITPVSDDSKPEILQLLNSVKEHIAELDQEIAVFC 1193 (1197)
T ss_pred             cCCHHHHHHHHHHHHHHHHHcCHHHHHHHHHHHHHhhhcCCchHHHHHHHHHHHHHHHHHHHHHHHhc
Confidence            35667889999999999999999999999999998886655 45667777777777766666666654


No 14 
>PRK15347 two component system sensor kinase SsrA; Provisional
Probab=96.71  E-value=0.0049  Score=47.32  Aligned_cols=54  Identities=17%  Similarity=0.201  Sum_probs=41.5

Q ss_pred             HHHHHHHHhhhhhhccChHHHHHHHHHHHHHHhhcCHHHHHHHHHHHHHHHHHHH
Q 043345            9 KVGGHVHQLKGSSSSIGAQRVNNVCTAFRSFCEERNIEGCQQYLQHLKQEYYLVK   63 (80)
Q Consensus         9 ~~~~laH~LKGss~nlGa~~L~~~c~~lE~~~~~~~~~~~~~~~~~l~~~~~~l~   63 (80)
                      .+..++|.+||+++.+|+..+...|.++|..++.+...... .+..++..+..+.
T Consensus       862 ~l~~~~h~i~~~~~~~g~~~l~~~~~~~e~~~~~~~~~~~~-~~~~~~~~~~~~~  915 (921)
T PRK15347        862 VLSQLLHTLKGCAGQAGLTELQCAVIDLENALETGEILSLE-ELTDLRELIHALF  915 (921)
T ss_pred             HHHHHHHHHHHHHHHcCHHHHHHHHHHHHHHHhcCCCCCHH-HHHHHHHHHHHHh
Confidence            78899999999999999999999999999999876644322 2455555444433


No 15 
>PF08900 DUF1845:  Domain of unknown function (DUF1845);  InterPro: IPR014996  Members of this protein family, such as PFL4669, are found in integrating conjugative elements (ICE) of the PFGI-1 class as in Pseudomonas fluorescens. 
Probab=83.90  E-value=10  Score=25.36  Aligned_cols=54  Identities=13%  Similarity=0.159  Sum_probs=36.6

Q ss_pred             hhhhccChHHHHHHHHHHHHHHhhcCHHH------HHHHHHHHHHHHHHHHHHHHHHHHH
Q 043345           19 GSSSSIGAQRVNNVCTAFRSFCEERNIEG------CQQYLQHLKQEYYLVKNKLQTLFQV   72 (80)
Q Consensus        19 Gss~nlGa~~L~~~c~~lE~~~~~~~~~~------~~~~~~~l~~~~~~l~~~L~~~l~~   72 (80)
                      |..+.+|.......+..|.+.+.+.|+-.      +.+.++.+...+......|+..+..
T Consensus        32 ~~~~I~Gm~~~~~~~~~i~~~a~~DdPyAD~~L~~iEe~i~~~~~~l~~~~~~l~~~l~~   91 (217)
T PF08900_consen   32 GKPAIIGMPGFASRLNRIWRDARQDDPYADWWLLRIEEKINEARQELQELIARLDALLAE   91 (217)
T ss_pred             CCCCCcCHHHHHHHHHHHHHHHhcCCcHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHh
Confidence            34488999999999999999998887432      3444555555555555555544443


No 16 
>PF13779 DUF4175:  Domain of unknown function (DUF4175)
Probab=81.71  E-value=4.6  Score=32.19  Aligned_cols=45  Identities=16%  Similarity=0.297  Sum_probs=39.9

Q ss_pred             hhhhccChHHHHHHHHHHHHHHhhcCHHHHHHHHHHHHHHHHHHH
Q 043345           19 GSSSSIGAQRVNNVCTAFRSFCEERNIEGCQQYLQHLKQEYYLVK   63 (80)
Q Consensus        19 Gss~nlGa~~L~~~c~~lE~~~~~~~~~~~~~~~~~l~~~~~~l~   63 (80)
                      +.+.+++-..|.++..+||+.+++|+.+++..++++|++-++.++
T Consensus       530 ~~~~~~~~~dL~~mmd~ie~la~~G~~~~A~q~L~qlq~mmenmq  574 (820)
T PF13779_consen  530 GNSQMMSQQDLQRMMDRIEELARSGRMDEARQLLEQLQQMMENMQ  574 (820)
T ss_pred             hhhhccCHHHHHHHHHHHHHHHHcCCHHHHHHHHHHHHHHHHhcc
Confidence            456689999999999999999999999999999999988877654


No 17 
>PF14276 DUF4363:  Domain of unknown function (DUF4363)
Probab=78.16  E-value=5.3  Score=23.87  Aligned_cols=49  Identities=14%  Similarity=0.179  Sum_probs=39.2

Q ss_pred             HHHHhhhhhhccChHHHHHHHHHHHHHHhhcCHHHHHHHHHHHHHHHHHHHHHH
Q 043345           13 HVHQLKGSSSSIGAQRVNNVCTAFRSFCEERNIEGCQQYLQHLKQEYYLVKNKL   66 (80)
Q Consensus        13 laH~LKGss~nlGa~~L~~~c~~lE~~~~~~~~~~~~~~~~~l~~~~~~l~~~L   66 (80)
                      ..|.+++++     ..+...+..+|+...++|++.+...++.+...+......+
T Consensus        17 ~~~~l~~~~-----~~i~~~l~~i~~~i~~~dW~~A~~~~~~l~~~W~k~~~~~   65 (121)
T PF14276_consen   17 SNNYLNNST-----DSIEEQLEQIEEAIENEDWEKAYKETEELEKEWDKNKKRW   65 (121)
T ss_pred             HHhhhhhHH-----HHHHHHHHHHHHHHHhCCHHHHHHHHHHHHHHHHhhchhe
Confidence            346666664     5667889999999999999999999999999988765543


No 18 
>TIGR02302 aProt_lowcomp conserved hypothetical protein TIGR02302. Members of this family are long (~850 residue) bacterial proteins from the alpha Proteobacteria. Each has 2-3 predicted transmembrane helices near the N-terminus and a long C-terminal region that includes stretches of Gln/Gly-rich low complexity sequence, predicted by TMHMM to be outside the membrane. In Bradyrhizobium japonicum, two tandem reading frames are together homologous to the single members found in other species; the cutoff scores are set low enough that the longer scores above the trusted cutoff and the shorter above the noise cutoff for this model.
Probab=77.21  E-value=9  Score=30.78  Aligned_cols=44  Identities=14%  Similarity=0.230  Sum_probs=39.7

Q ss_pred             hhccChHHHHHHHHHHHHHHhhcCHHHHHHHHHHHHHHHHHHHH
Q 043345           21 SSSIGAQRVNNVCTAFRSFCEERNIEGCQQYLQHLKQEYYLVKN   64 (80)
Q Consensus        21 s~nlGa~~L~~~c~~lE~~~~~~~~~~~~~~~~~l~~~~~~l~~   64 (80)
                      +.+|+-..|-++..+||+.+++|+.+++.+++++|++-++.++.
T Consensus       562 ~~~l~~~dLq~Mmd~ieela~~G~~~~A~qlL~qlq~mmenlq~  605 (851)
T TIGR02302       562 TKVLRQQDLQNMMDQIENLARSGDRDQAKQLLSQLQQMMNNLQM  605 (851)
T ss_pred             ccccCHHHHHHHHHHHHHHHHcCCHHHHHHHHHHHHHHHHHHhc
Confidence            45688999999999999999999999999999999998888763


No 19 
>PLN02956 PSII-Q subunit
Probab=72.92  E-value=24  Score=23.32  Aligned_cols=39  Identities=8%  Similarity=0.114  Sum_probs=27.1

Q ss_pred             HHHHHHHHHHHHHHhhcCHHHHHHHHHHHHHHHHHHHHH
Q 043345           27 QRVNNVCTAFRSFCEERNIEGCQQYLQHLKQEYYLVKNK   65 (80)
Q Consensus        27 ~~L~~~c~~lE~~~~~~~~~~~~~~~~~l~~~~~~l~~~   65 (80)
                      ..|.+...+|..+++..|.......+......|+.+..+
T Consensus       146 ~~LFd~l~~LD~AAR~kd~~~a~k~Y~~tva~lD~Vl~~  184 (185)
T PLN02956        146 SDLFNSVTKLDYAARDKDETRVWEYYENIVASLDDIFSR  184 (185)
T ss_pred             HHHHHHHHHHHHHHhcCCHHHHHHHHHHHHHHHHHHHhc
Confidence            445566678888888888777777777666666666543


No 20 
>smart00388 HisKA His Kinase A (phosphoacceptor) domain. Dimerisation and phosphoacceptor domain of histidine kinases.
Probab=72.34  E-value=10  Score=18.61  Aligned_cols=58  Identities=12%  Similarity=0.174  Sum_probs=31.8

Q ss_pred             HHHHHHHHhhhhhhccChHHHHHHHHHHHHHHhhcCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHH
Q 043345            9 KVGGHVHQLKGSSSSIGAQRVNNVCTAFRSFCEERNIEGCQQYLQHLKQEYYLVKNKLQTLFQVCL   74 (80)
Q Consensus         9 ~~~~laH~LKGss~nlGa~~L~~~c~~lE~~~~~~~~~~~~~~~~~l~~~~~~l~~~L~~~l~~~~   74 (80)
                      -+..++|.||..-+.+     ...+..+..   ....+.....+..+......+..-+..++...+
T Consensus         5 ~~~~i~Hel~~pl~~i-----~~~~~~l~~---~~~~~~~~~~~~~~~~~~~~~~~~v~~l~~~~~   62 (66)
T smart00388        5 FLANLSHELRTPLTAI-----RGYLELLED---TELSEEQREYLETILRSAERLLRLINDLLDLSR   62 (66)
T ss_pred             HHHHHHHhccCcHHHH-----HHHHHHHHh---CCCChHHHHHHHHHHHHHHHHHHHHHHHHHHHH
Confidence            3566889999544322     122222322   122222266677777777777777766665543


No 21 
>COG2991 Uncharacterized protein conserved in bacteria [Function unknown]
Probab=70.81  E-value=0.8  Score=25.91  Aligned_cols=26  Identities=31%  Similarity=0.632  Sum_probs=20.9

Q ss_pred             HhhhhhhccChHHHHHHHHHHHHHHhh
Q 043345           16 QLKGSSSSIGAQRVNNVCTAFRSFCEE   42 (80)
Q Consensus        16 ~LKGss~nlGa~~L~~~c~~lE~~~~~   42 (80)
                      .||||+|-|++..+...|. -++-|.+
T Consensus        28 ~I~GSCGGi~alGi~K~Cd-C~~pCDt   53 (77)
T COG2991          28 SIKGSCGGIAALGIEKVCD-CDEPCDT   53 (77)
T ss_pred             ccccccccHHhhccchhcC-CCCchHH
Confidence            6899999999999999887 4555544


No 22 
>TIGR03761 ICE_PFL4669 integrating conjugative element protein, PFL_4669 family. Members of this protein family, such as PFL4669, are found in integrating conjugative elements (ICE) of the PFGI-1 class as in Pseudomonas fluorescens.
Probab=69.37  E-value=32  Score=23.21  Aligned_cols=51  Identities=16%  Similarity=0.165  Sum_probs=35.6

Q ss_pred             hhccChHHHHHHHHHHHHHHhhcCHHH------HHHHHHHHHHHHHHHHHHHHHHHH
Q 043345           21 SSSIGAQRVNNVCTAFRSFCEERNIEG------CQQYLQHLKQEYYLVKNKLQTLFQ   71 (80)
Q Consensus        21 s~nlGa~~L~~~c~~lE~~~~~~~~~~------~~~~~~~l~~~~~~l~~~L~~~l~   71 (80)
                      .+.+|.+.....+..|.+.+.+.|+-.      +...++.+...++.+...++..+.
T Consensus        32 ~~IiGl~~f~s~~~~i~~~a~~DdPyAD~~Ll~~E~~l~~~~~~l~~~~~~l~~~l~   88 (216)
T TIGR03761        32 PGIIGMPGFISRLNRINQASEQDDPYADWALLRIEEKLLSARQEMQALLQRLDDLLA   88 (216)
T ss_pred             CCCcCcHHHHHHHHHHHHHHHcCCcHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHH
Confidence            468999999999999999999888532      344455555555555555555444


No 23 
>PF03670 UPF0184:  Uncharacterised protein family (UPF0184);  InterPro: IPR022788  This family of proteins has no known function. 
Probab=69.06  E-value=20  Score=20.72  Aligned_cols=37  Identities=19%  Similarity=0.206  Sum_probs=29.1

Q ss_pred             hhcCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhh
Q 043345           41 EERNIEGCQQYLQHLKQEYYLVKNKLQTLFQVCLKSL   77 (80)
Q Consensus        41 ~~~~~~~~~~~~~~l~~~~~~l~~~L~~~l~~~~~~~   77 (80)
                      -+...+.+..+++.|++--..|...|..+|+..||+-
T Consensus        31 ins~LD~Lns~LD~LE~rnD~l~~~L~~LLesnrq~R   67 (83)
T PF03670_consen   31 INSMLDQLNSCLDHLEQRNDHLHAQLQELLESNRQIR   67 (83)
T ss_pred             HHHHHHHHHHHHHHHHHhhhHHHHHHHHHHHHHHHHH
Confidence            3344667777888888888889999999999888864


No 24 
>PF03993 DUF349:  Domain of Unknown Function (DUF349);  InterPro: IPR007139 This motif is found singly or as up to five tandem repeats in a small set of bacterial proteins. There are two or three alpha-helices, and possibly a beta-strand.
Probab=68.75  E-value=13  Score=20.05  Aligned_cols=32  Identities=13%  Similarity=0.240  Sum_probs=18.3

Q ss_pred             HHHHHHHHHHHhhcCHHHHHHHHHHHHHHHHH
Q 043345           30 NNVCTAFRSFCEERNIEGCQQYLQHLKQEYYL   61 (80)
Q Consensus        30 ~~~c~~lE~~~~~~~~~~~~~~~~~l~~~~~~   61 (80)
                      -.+|.+++......++......+..|...|..
T Consensus        37 ~~Li~~~~~l~~~~d~~~~~~~~k~l~~~Wk~   68 (77)
T PF03993_consen   37 EALIEEAEALAESEDWKEAAEEIKELQQEWKE   68 (77)
T ss_pred             HHHHHHHHHhcccccHHHHHHHHHHHHHHHHH
Confidence            35666666666666655555555555555543


No 25 
>PF07361 Cytochrom_B562:  Cytochrome b562;  InterPro: IPR009155 Cytochrome b562 is a haem-containing protein that is expressed in the periplasm of Escherichia coli. In b-type cytochromes, the haem atom is not covalently attached to the polypeptide. Cytochrome b562 has a four-helical bundle structure that is structurally similar to that found in members of the cytochrome c family (IPR002321 from INTERPRO). Cytochrome b562 has a reduction potential of 167 mV, which sets the energy yield possible in metabolism and is also a key determinant of the rate at which redox reactions proceed [].; GO: 0005506 iron ion binding, 0009055 electron carrier activity, 0020037 heme binding, 0042597 periplasmic space; PDB: 4ER9_A 3IQ6_G 2QLA_B 3FOO_A 3M79_C 256B_A 3NMI_F 3HNK_A 3NMK_D 2BC5_A ....
Probab=65.15  E-value=16  Score=21.53  Aligned_cols=33  Identities=12%  Similarity=0.249  Sum_probs=24.5

Q ss_pred             cChHHHHHHHHHHHHHHhhcCHHHHHHHHHHHH
Q 043345           24 IGAQRVNNVCTAFRSFCEERNIEGCQQYLQHLK   56 (80)
Q Consensus        24 lGa~~L~~~c~~lE~~~~~~~~~~~~~~~~~l~   56 (80)
                      =|...|......++..+.+|+.+++...+..|.
T Consensus        60 ~Gl~~li~~id~a~~~~~~G~l~~AK~~l~~l~   92 (103)
T PF07361_consen   60 EGLDKLIDQIDKAEALAEAGKLDEAKAALKKLD   92 (103)
T ss_dssp             HHHHHHHHHHHHHHHHHHTTHHHHHHHHHHHHH
T ss_pred             HHHHHHHHHHHHHHHHHHCCCHHHHHHHHHHHH
Confidence            467778888888888888888887766654443


No 26 
>PF00512 HisKA:  His Kinase A (phospho-acceptor) domain;  InterPro: IPR003661 Two-component signal transduction systems enable bacteria to sense, respond, and adapt to a wide range of environments, stressors, and growth conditions []. Some bacteria can contain up to as many as 200 two-component systems that need tight regulation to prevent unwanted cross-talk []. These pathways have been adapted to respond to a wide variety of stimuli, including nutrients, cellular redox state, changes in osmolarity, quorum signals, antibiotics, and more []. Two-component systems are comprised of a sensor histidine kinase (HK) and its cognate response regulator (RR) []. The HK catalyses its own auto-phosphorylation followed by the transfer of the phosphoryl group to the receiver domain on RR; phosphorylation of the RR usually activates an attached output domain, which can then effect changes in cellular physiology, often by regulating gene expression. Some HK are bifunctional, catalysing both the phosphorylation and dephosphorylation of their cognate RR. The input stimuli can regulate either the kinase or phosphatase activity of the bifunctional HK. A variant of the two-component system is the phospho-relay system. Here a hybrid HK auto-phosphorylates and then transfers the phosphoryl group to an internal receiver domain, rather than to a separate RR protein. The phosphoryl group is then shuttled to histidine phosphotransferase (HPT) and subsequently to a terminal RR, which can evoke the desired response [, ]. Signal transducing histidine kinases are the key elements in two-component signal transduction systems, which control complex processes such as the initiation of development in microorganisms [, ]. Examples of histidine kinases are EnvZ, which plays a central role in osmoregulation [], and CheA, which plays a central role in the chemotaxis system []. 
Histidine kinases usually have an N-terminal ligand-binding domain and a C-terminal kinase domain, but other domains may also be present. The kinase domain is responsible for the autophosphorylation of the histidine with ATP, the phosphotransfer from the kinase to an aspartate of the response regulator, and (with bifunctional enzymes) the phosphotransfer from aspartyl phosphate back to ADP or to water []. The kinase core has a unique fold, distinct from that of the Ser/Thr/Tyr kinase superfamily.  HKs can be roughly divided into two classes: orthodox and hybrid kinases [, ]. Most orthodox HKs, typified by the Escherichia coli EnvZ protein, function as periplasmic membrane receptors and have a signal peptide and transmembrane segment(s) that separate the protein into a periplasmic N-terminal sensing domain and a highly conserved cytoplasmic C-terminal kinase core. Members of this family, however, have an integral membrane sensor domain. Not all orthodox kinases are membrane bound, e.g., the nitrogen regulatory kinase NtrB (GlnL) is a soluble cytoplasmic HK []. Hybrid kinases contain multiple phosphodonor and phosphoacceptor sites and use multi-step phospho-relay schemes instead of promoting a single phosphoryl transfer. In addition to the sensor domain and kinase core, they contain a CheY-like receiver domain and a His-containing phosphotransfer (HPt) domain. This entry represents the dimerisation and phosphoacceptor domain found in histidine kinases. It has been found in bacterial sensor protein/histidine kinases. Signal transducing histidine kinases are the key elements in two-component signal transduction systems, which control complex processes such as the initiation of development in microorganisms []. Examples of histidine kinases are EnvZ, which plays a central role in osmoregulation [], and CheA, which plays a central role in the chemotaxis system []. 
Histidine kinases usually have an N-terminal ligand-binding domain and a C-terminal kinase domain, but other domains may also be present. The kinase domain is responsible for the autophosphorylation of the histidine with ATP, the phosphotransfer from the kinase to an aspartate of the response regulator, and the phosphotransfer from aspartyl phosphate back to ADP or to water []. The homodimeric domain includes the site of histidine autophosphorylation and phosphate transfer reactions. The structure of the homodimeric domain comprises a closed, four-helical bundle with a left-handed twist, formed by two identical alpha-hairpin subunits.; GO: 0000155 two-component sensor activity, 0007165 signal transduction, 0016020 membrane; PDB: 3DGE_A 2C2A_A 3A0R_A 4EW8_A 2LFS_B 2LFR_B 3JZ3_A 1JOY_B 3ZRW_C 3ZRV_A ....
Probab=65.13  E-value=18  Score=18.73  Aligned_cols=58  Identities=9%  Similarity=0.143  Sum_probs=39.6

Q ss_pred             HHHHHHHHhhhhhhccChHHHHHHHHHHHHHHh-hcCHHH-HHHHHHHHHHHHHHHHHHHHHHHHHHH
Q 043345            9 KVGGHVHQLKGSSSSIGAQRVNNVCTAFRSFCE-ERNIEG-CQQYLQHLKQEYYLVKNKLQTLFQVCL   74 (80)
Q Consensus         9 ~~~~laH~LKGss~nlGa~~L~~~c~~lE~~~~-~~~~~~-~~~~~~~l~~~~~~l~~~L~~~l~~~~   74 (80)
                      -+..++|-||.        +|..+...++.... ....++ ....+..+..+..++..-++.++.+-|
T Consensus         5 ~~~~isHelr~--------PL~~i~~~~~~l~~~~~~~~~~~~~~l~~i~~~~~~l~~li~~ll~~sr   64 (68)
T PF00512_consen    5 FLASISHELRN--------PLTAIRGYLELLERDSDLDPEQLREYLDRIRSAADRLNELINDLLDFSR   64 (68)
T ss_dssp             HHHHHHHHHHH--------HHHHHHHHHHHHHCSSCC-HHHCHHHHHHHHHHHHHHHHHHHHHHHHHH
T ss_pred             HHHHHhHHHHH--------HHHHHHHHHHHHHHccCCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHh
Confidence            45678888885        44444455555444 333333 488899999999999998888887654


No 27 
>KOG2424 consensus Protein involved in transcription start site selection [Transcription]
Probab=64.56  E-value=15  Score=24.34  Aligned_cols=24  Identities=17%  Similarity=0.552  Sum_probs=18.7

Q ss_pred             hhccChHHHHHHHHHHHHHHhhcCHH
Q 043345           21 SSSIGAQRVNNVCTAFRSFCEERNIE   46 (80)
Q Consensus        21 s~nlGa~~L~~~c~~lE~~~~~~~~~   46 (80)
                      -+.+||.-+.++|..|+.  ++.+++
T Consensus       147 dA~~Gaf~I~elcq~l~~--~s~d~E  170 (195)
T KOG2424|consen  147 DATLGAFLILELCQCLQA--QSDDLE  170 (195)
T ss_pred             hhhhhHHHHHHHHHHHHh--ccccHH
Confidence            467999999999999987  444433


No 28 
>PF04837 MbeB_N:  MbeB-like, N-term conserved region;  InterPro: IPR006922 This family consists of Mbe/Mob proteins defined by an N-terminal conserved region. These proteins are essential for specific plasmid transfer.
Probab=64.36  E-value=19  Score=18.91  Aligned_cols=48  Identities=13%  Similarity=0.202  Sum_probs=36.3

Q ss_pred             HHHHHHHHHHHHHhhcCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhhc
Q 043345           28 RVNNVCTAFRSFCEERNIEGCQQYLQHLKQEYYLVKNKLQTLFQVCLKSLAS   79 (80)
Q Consensus        28 ~L~~~c~~lE~~~~~~~~~~~~~~~~~l~~~~~~l~~~L~~~l~~~~~~~~~   79 (80)
                      .+..++..+|+..+.    .....=..|+.+|.+....|..-|....+.+.+
T Consensus         3 ~il~LA~~feqkske----qa~ste~~vk~af~~~E~~l~~~L~~s~~~is~   50 (52)
T PF04837_consen    3 EILNLAKDFEQKSKE----QAESTEQMVKNAFEQHEKSLSAALKESEQKISD   50 (52)
T ss_pred             HHHHHHHHHHHHHHH----HHHHHHHHHHHHHHHHHHHHHHHHHHhHHHhhh
Confidence            467788888887765    455566778889999999888888876666653


No 29 
>KOG3232 consensus Vacuolar assembly/sorting protein DID2 [Intracellular trafficking, secretion, and vesicular transport]
Probab=63.50  E-value=41  Score=22.31  Aligned_cols=40  Identities=13%  Similarity=0.228  Sum_probs=33.9

Q ss_pred             HHHHHHHHHHHHHhhcCHHHHHHHHHHHHHHHHHHHHHHH
Q 043345           28 RVNNVCTAFRSFCEERNIEGCQQYLQHLKQEYYLVKNKLQ   67 (80)
Q Consensus        28 ~L~~~c~~lE~~~~~~~~~~~~~~~~~l~~~~~~l~~~L~   67 (80)
                      .|...|..++.+.+..|.+.+..+++..+..|..+...-.
T Consensus        94 sM~gVvK~md~alktmNLekis~~MDkFE~qFedldvqt~  133 (203)
T KOG3232|consen   94 SMAGVVKSMDSALKTMNLEKISQLMDKFEKQFEDLDVQTE  133 (203)
T ss_pred             HHHHHHHHHHHHHHhCCHHHHHHHHHHHHHHhhhhhhHHH
Confidence            3677899999999999999999999999999988765433


No 30 
>PF04722 Ssu72:  Ssu72-like protein;  InterPro: IPR006811 The highly conserved and essential protein Ssu72 has intrinsic phosphatase activity and plays an essential role in the transcription cycle. Ssu72 was originally identified in a yeast genetic screen as enhancer of a defect caused by a mutation in the transcription initiation factor TFIIB []. It binds to TFIIB and is also involved in mRNA elongation. Ssu72 is further involved in both poly(A) dependent and independent termination. It is a subunit of the yeast cleavage and polyadenylation factor (CPF), which is part of the machinery for mRNA 3'-end formation. Ssu72 is also essential for transcription termination of snRNAs [].; GO: 0004721 phosphoprotein phosphatase activity, 0006397 mRNA processing, 0005634 nucleus; PDB: 3O2S_B 3O2Q_E 3FMV_H 3OMW_D 3P9Y_B 3FDF_A 3OMX_A.
Probab=58.61  E-value=23  Score=23.64  Aligned_cols=36  Identities=19%  Similarity=0.422  Sum_probs=22.7

Q ss_pred             hhccChHHHHHHHHHHHHHHhhcCHH-HHHHHHHHHHH
Q 043345           21 SSSIGAQRVNNVCTAFRSFCEERNIE-GCQQYLQHLKQ   57 (80)
Q Consensus        21 s~nlGa~~L~~~c~~lE~~~~~~~~~-~~~~~~~~l~~   57 (80)
                      -+.+||..+.++|..|+. ....+++ .+..++.+.+.
T Consensus       145 eA~~Ga~~ileLc~~l~~-~~~~d~e~~i~~il~~fe~  181 (195)
T PF04722_consen  145 EATIGAFLILELCQMLEE-EASEDLEDEIDEILQEFEE  181 (195)
T ss_dssp             HHHHHHHHHHHHHHHHH---TSSSHHHHHHHHHHHHHH
T ss_pred             HHHHHHHHHHHHHHHHHh-hccccHHHHHHHHHHHHHH
Confidence            367899999999999997 3344544 34444444443


No 31 
>PRK13858 type IV secretion system T-DNA border endonuclease VirD1; Provisional
Probab=56.82  E-value=49  Score=21.10  Aligned_cols=62  Identities=13%  Similarity=0.205  Sum_probs=40.8

Q ss_pred             HHHHHHHHHhhhhhhccChHHHHHHHHHHHHHHhhc--CHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhh
Q 043345            8 TKVGGHVHQLKGSSSSIGAQRVNNVCTAFRSFCEER--NIEGCQQYLQHLKQEYYLVKNKLQTLFQVCLKSL   77 (80)
Q Consensus         8 ~~~~~laH~LKGss~nlGa~~L~~~c~~lE~~~~~~--~~~~~~~~~~~l~~~~~~l~~~L~~~l~~~~~~~   77 (80)
                      +.+..+...|.|-++||  ++|+..|.+      .+  +.+.+..--..+-.+|..+..-|...|...|..+
T Consensus        73 e~~~~lir~l~gianNL--NQLAr~aN~------~~~~~~~~l~~er~~~g~~~~~l~~~l~~~~~vsrrr~  136 (147)
T PRK13858         73 EKMEAILQSIGTLSSNI--AALLSAYAE------NPRPDLEALRAERIAFGKEFADLDGLLRSILSVSRRRI  136 (147)
T ss_pred             HHHHHHHHHHHHHHHHH--HHHHHHHhc------CCCCcHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhh
Confidence            34556777777777775  444444444      22  2555666667778888888888888888877654


No 32 
>PF05757 PsbQ:  Oxygen evolving enhancer protein 3 (PsbQ);  InterPro: IPR008797 Oxygenic photosynthesis uses two multi-subunit photosystems (I and II) located in the cell membranes of cyanobacteria and in the thylakoid membranes of chloroplasts in plants and algae. Photosystem II (PSII) has a P680 reaction centre containing chlorophyll 'a' that uses light energy to carry out the oxidation (splitting) of water molecules, and to produce ATP via a proton pump. Photosystem I (PSI) has a P700 reaction centre containing chlorophyll that takes the electron and associated hydrogen donated from PSII to reduce NADP+ to NADPH. Both ATP and NADPH are subsequently used in the light-independent reactions to convert carbon dioxide to glucose using the hydrogen atom extracted from water by PSII, releasing oxygen as a by-product. PSII is a multisubunit protein-pigment complex containing polypeptides both intrinsic and extrinsic to the photosynthetic membrane [, ]. Within the core of the complex, the chlorophyll and beta-carotene pigments are mainly bound to the antenna proteins CP43 (PsbC) and CP47 (PsbB), which pass the excitation energy on to the reaction centre proteins D1 (Qb, PsbA) and D2 (Qa, PsbD) that bind all the redox-active cofactors involved in the energy conversion process. The PSII oxygen-evolving complex (OEC) oxidises water to provide protons for use by PSI, and consists of OEE1 (PsbO), OEE2 (PsbP) and OEE3 (PsbQ). The remaining subunits in PSII are of low molecular weight (less than 10 kDa), and are involved in PSII assembly, stabilisation, dimerisation, and photo-protection [].  In PSII, the oxygen-evolving complex (OEC) is responsible for catalysing the splitting of water to O(2) and 4H+. The OEC is composed of a cluster of manganese, calcium and chloride ions bound to extrinsic proteins. 
In cyanobacteria there are five extrinsic proteins in OEC (PsbO, PsbP-like, PsbQ-like, PsbU and PsbV), while in plants there are only three (PsbO, PsbP and PsbQ), PsbU and PsbV having been lost during the evolution of green plants []. This family represents the PSII OEC protein PsbQ. Both PsbQ and PsbP (IPR002683 from INTERPRO) are regulators that are necessary for the biogenesis of optically active PSII. The crystal structure of PsbQ from spinach revealed a 4-helical bundle polypeptide. The distribution of positive and negative charges on the protein surface might explain the ability of PsbQ to increase the binding of chloride and calcium ions and make them available to PSII [].; GO: 0005509 calcium ion binding, 0015979 photosynthesis, 0009523 photosystem II, 0009654 oxygen evolving complex, 0019898 extrinsic to membrane; PDB: 1VYK_A 1NZE_A 3LS1_A 3LS0_A.
Probab=55.77  E-value=42  Score=22.42  Aligned_cols=38  Identities=5%  Similarity=0.106  Sum_probs=27.5

Q ss_pred             HHHHHHHHHHHHHHhhcCHHHHHHHHHHHHHHHHHHHH
Q 043345           27 QRVNNVCTAFRSFCEERNIEGCQQYLQHLKQEYYLVKN   64 (80)
Q Consensus        27 ~~L~~~c~~lE~~~~~~~~~~~~~~~~~l~~~~~~l~~   64 (80)
                      ..|.....+|..+++..|...+...++.....++.+..
T Consensus       162 ~~lf~~ie~LD~Aar~K~~~~a~~~Y~~t~~~Ldevla  199 (202)
T PF05757_consen  162 NKLFDNIEELDYAARSKDVPEAEKYYADTVKALDEVLA  199 (202)
T ss_dssp             HHHHHHHHHHHHHHHTT-HHHHHHHHHHHHHHHHHHHC
T ss_pred             HHHHHHHHHHHHHHHCCCHHHHHHHHHHHHHHHHHHHH
Confidence            56777788888888998888777777766666666544


No 33 
>PF12854 PPR_1:  PPR repeat
Probab=55.66  E-value=15  Score=16.91  Aligned_cols=22  Identities=9%  Similarity=0.459  Sum_probs=17.2

Q ss_pred             HHHHHHHhhcCHHHHHHHHHHH
Q 043345           34 TAFRSFCEERNIEGCQQYLQHL   55 (80)
Q Consensus        34 ~~lE~~~~~~~~~~~~~~~~~l   55 (80)
                      .-|.-.|+.|..+++..+++++
T Consensus        12 ~lI~~~Ck~G~~~~A~~l~~~M   33 (34)
T PF12854_consen   12 TLIDGYCKAGRVDEAFELFDEM   33 (34)
T ss_pred             HHHHHHHHCCCHHHHHHHHHhC
Confidence            3456688999999988888764


No 34 
>PF09577 Spore_YpjB:  Sporulation protein YpjB (SpoYpjB);  InterPro: IPR014231 Proteins in this entry, typified by YpjB, are restricted to a subset of the endospore-forming bacteria which includes Bacillus species, but not species. In Bacillus subtilis, ypjB was found to be part of the sigma-E regulon []. Sigma-E is a sporulation sigma factor that regulates expression in the mother cell compartment. Null mutants of ypjB show a sporulation defect, but this gene is not, however, a part of the endospore formation minimal gene set.
Probab=53.85  E-value=58  Score=22.20  Aligned_cols=40  Identities=13%  Similarity=0.233  Sum_probs=31.6

Q ss_pred             HHHHHHHHHHHHHhhcCHHHHHHHHHHHHHHHHHHHHHHH
Q 043345           28 RVNNVCTAFRSFCEERNIEGCQQYLQHLKQEYYLVKNKLQ   67 (80)
Q Consensus        28 ~L~~~c~~lE~~~~~~~~~~~~~~~~~l~~~~~~l~~~L~   67 (80)
                      .|.+....+++++..|+.......++.+...|..+...+.
T Consensus       102 ~i~~~~~~mk~a~~~~~~~~f~~~~n~f~~~y~~I~Psl~  141 (232)
T PF09577_consen  102 PIMEDFQRMKQAAQKGDKEAFRASLNEFLSHYELIRPSLT  141 (232)
T ss_pred             HHHHHHHHHHHHHHhCCHHHHHHHHHHHHHHHHHhcchhh
Confidence            4667778888989888988888888888888877765543


No 35 
>PF04400 DUF539:  Protein of unknown function (DUF539);  InterPro: IPR007495 This is a family of putative periplasmic proteins.
Probab=52.23  E-value=1.1  Score=23.00  Aligned_cols=19  Identities=32%  Similarity=0.709  Sum_probs=15.5

Q ss_pred             HhhhhhhccChHHHHHHHH
Q 043345           16 QLKGSSSSIGAQRVNNVCT   34 (80)
Q Consensus        16 ~LKGss~nlGa~~L~~~c~   34 (80)
                      .|+||+|-|++..+-..|.
T Consensus         7 ~I~GSCGGl~~lGi~~~C~   25 (45)
T PF04400_consen    7 PIKGSCGGLGALGIDKECD   25 (45)
T ss_pred             cccccchhhhhcCCCccCC
Confidence            5899999999887766666


No 36 
>PF07870 DUF1657:  Protein of unknown function (DUF1657);  InterPro: IPR012452 This domain appears to be restricted to the Bacillales. 
Probab=52.20  E-value=33  Score=17.66  Aligned_cols=36  Identities=11%  Similarity=0.175  Sum_probs=27.7

Q ss_pred             HHHHHHHhhcCHHHHHHHHHHHHHHHHHHHHHHHHH
Q 043345           34 TAFRSFCEERNIEGCQQYLQHLKQEYYLVKNKLQTL   69 (80)
Q Consensus        34 ~~lE~~~~~~~~~~~~~~~~~l~~~~~~l~~~L~~~   69 (80)
                      ..+|.++-.-+-+.+..++.+....+..+...|+.+
T Consensus        14 A~Le~fal~T~d~~AK~~y~~~a~~l~~ii~~L~~r   49 (50)
T PF07870_consen   14 ADLETFALQTQDQEAKQMYEQAAQQLEEIIQDLEPR   49 (50)
T ss_pred             hhHHHHHhhcCCHHHHHHHHHHHHHHHHHHHHhHcc
Confidence            667888866666677888999888888888877644


No 37 
>PF14077 WD40_alt:  Alternative WD40 repeat motif
Probab=49.94  E-value=29  Score=17.89  Aligned_cols=36  Identities=14%  Similarity=0.225  Sum_probs=28.4

Q ss_pred             cCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhh
Q 043345           43 RNIEGCQQYLQHLKQEYYLVKNKLQTLFQVCLKSLA   78 (80)
Q Consensus        43 ~~~~~~~~~~~~l~~~~~~l~~~L~~~l~~~~~~~~   78 (80)
                      |+-+.....+..|+.++..++..=...+.+.-+|++
T Consensus        11 G~~e~l~vrv~eLEeEV~~LrKINrdLfdFSt~iiT   46 (48)
T PF14077_consen   11 GDQEQLRVRVSELEEEVRTLRKINRDLFDFSTRIIT   46 (48)
T ss_pred             CCcchheeeHHHHHHHHHHHHHHhHHHHhhhhhhcc
Confidence            455566677889999999988866788888888876


No 38 
>COG2178 Predicted RNA-binding protein of the translin family [Translation, ribosomal structure and biogenesis]
Probab=48.33  E-value=83  Score=21.18  Aligned_cols=43  Identities=9%  Similarity=0.103  Sum_probs=34.1

Q ss_pred             HHHHHHHHHHHHHHhhcCHHHHHHHHHHHHHHHHHHHHHHHHH
Q 043345           27 QRVNNVCTAFRSFCEERNIEGCQQYLQHLKQEYYLVKNKLQTL   69 (80)
Q Consensus        27 ~~L~~~c~~lE~~~~~~~~~~~~~~~~~l~~~~~~l~~~L~~~   69 (80)
                      ..+...|...--+.+.++.+.+...+......+..++..|..|
T Consensus        27 Rei~r~s~~aI~~~H~~~~eeA~~~l~~a~~~v~~Lk~~l~~~   69 (204)
T COG2178          27 REIVRLSGEAIFLLHRGDFEEAEKKLKKASEAVEKLKRLLAGF   69 (204)
T ss_pred             HHHHHHHHHHHHHHHhccHHHHHHHHHHHHHHHHHHHHHHhhh
Confidence            4566777777788888998888888888888888888777655


No 39 
>cd00082 HisKA Histidine Kinase A (dimerization/phosphoacceptor) domain; Histidine Kinase A dimers are formed through parallel association of 2 domains creating 4-helix bundles; usually these domains contain a conserved His residue and are activated via trans-autophosphorylation by the catalytic domain of the histidine kinase. They subsequently transfer the phosphoryl group to the Asp acceptor residue of a response regulator protein. Two-component signalling systems, consisting of a histidine protein kinase that senses a signal input and a response regulator that mediates the output, are ancient and evolutionarily conserved signaling mechanisms in prokaryotes and eukaryotes.
Probab=46.13  E-value=35  Score=16.25  Aligned_cols=55  Identities=20%  Similarity=0.250  Sum_probs=28.4

Q ss_pred             HHHHHHHHhhhhhhccChHHHHHHHHHHHHHHhhcC-HHHHHHHHHHHHHHHHHHHHHHHHHHH
Q 043345            9 KVGGHVHQLKGSSSSIGAQRVNNVCTAFRSFCEERN-IEGCQQYLQHLKQEYYLVKNKLQTLFQ   71 (80)
Q Consensus         9 ~~~~laH~LKGss~nlGa~~L~~~c~~lE~~~~~~~-~~~~~~~~~~l~~~~~~l~~~L~~~l~   71 (80)
                      -+..++|.||..-+.+        -..++....... .+.....+..+......+..-++.++.
T Consensus         7 ~~~~~~hel~~pl~~i--------~~~~~~l~~~~~~~~~~~~~~~~~~~~~~~~~~l~~~l~~   62 (65)
T cd00082           7 FLANVSHELRTPLTAI--------RGALELLEEELLDDEEQREYLERIREEAERLLRLINDLLD   62 (65)
T ss_pred             HHHHHhHHhcchHHHH--------HHHHHHHHhcccCcHHHHHHHHHHHHHHHHHHHHHHHHHH
Confidence            3566889998543322        222222221111 344556666666666666665555544


No 40 
>PF09403 FadA:  Adhesion protein FadA;  InterPro: IPR018543  FadA (Fusobacterium adhesin A) is an adhesin which forms two alpha helices. ; PDB: 3ETZ_B 3ETY_A 2GL2_B 3ETX_C 3ETW_A.
Probab=45.08  E-value=74  Score=19.68  Aligned_cols=9  Identities=44%  Similarity=0.302  Sum_probs=0.0

Q ss_pred             hhhccChHH
Q 043345           20 SSSSIGAQR   28 (80)
Q Consensus        20 ss~nlGa~~   28 (80)
                      ||.+++|+.
T Consensus        13 ss~sfaA~~   21 (126)
T PF09403_consen   13 SSISFAATA   21 (126)
T ss_dssp             ---------
T ss_pred             HHHHHHccc
Confidence            455666666


No 41 
>PF01535 PPR:  PPR repeat;  InterPro: IPR002885 This entry represents the PPR repeat. Pentatricopeptide repeat (PPR) proteins are characterised by tandem repeats of a degenerate 35 amino acid motif []. Most PPR proteins have roles in mitochondria or plastids []. PPR repeats were discovered while screening Arabidopsis proteins for those predicted to be targeted to mitochondria or chloroplast [, ]. Some of these proteins have been shown to play a role in post-transcriptional processes within organelles and they are thought to be sequence-specific RNA-binding proteins [, , ]. Plant genomes have between one hundred and five hundred PPR genes per genome whereas non-plant genomes encode two to six PPR proteins. Although no PPR structures are yet known, the motif is predicted to fold into a helix-turn-helix structure similar to those found in the tetratricopeptide repeat (TPR) family (see PDOC50005 from PROSITEDOC) [].  The plant PPR protein family has been divided in two subfamilies on the basis of their motif content and organisation [, ]. Examples of PPR repeat-containing proteins include PET309 P32522 from SWISSPROT, which may be involved in RNA stabilisation [], and crp1, which is involved in RNA processing []. The repeat is associated with a predicted plant protein O49549 from SWISSPROT that has a domain organisation similar to the human BRCA1 protein.
Probab=44.72  E-value=28  Score=14.67  Aligned_cols=22  Identities=9%  Similarity=0.463  Sum_probs=16.1

Q ss_pred             HHHHHhhcCHHHHHHHHHHHHH
Q 043345           36 FRSFCEERNIEGCQQYLQHLKQ   57 (80)
Q Consensus        36 lE~~~~~~~~~~~~~~~~~l~~   57 (80)
                      |.-.++.++.+++..++.++.+
T Consensus         7 i~~~~~~~~~~~a~~~~~~M~~   28 (31)
T PF01535_consen    7 ISGYCKMGQFEEALEVFDEMRE   28 (31)
T ss_pred             HHHHHccchHHHHHHHHHHHhH
Confidence            5566788888888888777654


No 42 
>PF00435 Spectrin:  Spectrin repeat;  InterPro: IPR002017 Spectrin repeats [] are found in several proteins involved in cytoskeletal structure. These include spectrin alpha and beta subunits [, ], alpha-actinin [] and dystrophin. The spectrin repeat forms a three-helix bundle. The second helix is interrupted by proline in some sequences. The repeats are defined by a characteristic tryptophan (W) residue at position 17 in helix A and a leucine (L) at 2 residues from the carboxyl end of helix C.; GO: 0005515 protein binding; PDB: 1HCI_A 1QUU_A 3FB2_B 1S35_A 1U5P_A 1U4Q_A 1CUN_B 1YDI_B 3EDV_A 1AJ3_A ....
Probab=44.42  E-value=52  Score=17.74  Aligned_cols=45  Identities=13%  Similarity=0.126  Sum_probs=30.9

Q ss_pred             hHHHHHHHHHHHHHHhhcCHHHHHHHHHHHHHHHHHHHHHHHHHHH
Q 043345           26 AQRVNNVCTAFRSFCEERNIEGCQQYLQHLKQEYYLVKNKLQTLFQ   71 (80)
Q Consensus        26 a~~L~~~c~~lE~~~~~~~~~~~~~~~~~l~~~~~~l~~~L~~~l~   71 (80)
                      ...|...+..| ......+.+.+...+..|...|..+...+..+..
T Consensus        57 l~~l~~~~~~L-~~~~~~~~~~i~~~~~~l~~~w~~l~~~~~~r~~  101 (105)
T PF00435_consen   57 LESLNEQAQQL-IDSGPEDSDEIQEKLEELNQRWEALCELVEERRQ  101 (105)
T ss_dssp             HHHHHHHHHHH-HHTTHTTHHHHHHHHHHHHHHHHHHHHHHHHHHH
T ss_pred             HHHHHHHHHHH-HHcCCCcHHHHHHHHHHHHHHHHHHHHHHHHHHH
Confidence            34455666666 3333456677888889999999998888866544


No 43 
>PRK10548 flagellar biosynthesis protein FliT; Provisional
Probab=43.46  E-value=76  Score=19.38  Aligned_cols=23  Identities=9%  Similarity=0.138  Sum_probs=17.5

Q ss_pred             HHHHHHHHHHHHHHhhcCHHHHH
Q 043345           27 QRVNNVCTAFRSFCEERNIEGCQ   49 (80)
Q Consensus        27 ~~L~~~c~~lE~~~~~~~~~~~~   49 (80)
                      ..|..+..++=.+++.|+|+.+.
T Consensus        12 q~I~~lS~~ML~aA~~g~Wd~Li   34 (121)
T PRK10548         12 QQILTLSQSMLRLATEGQWDELI   34 (121)
T ss_pred             HHHHHHHHHHHHHHHHCCHHHHH
Confidence            35667778888888999988753


No 44 
>PF13812 PPR_3:  Pentatricopeptide repeat domain
Probab=42.37  E-value=33  Score=14.84  Aligned_cols=22  Identities=14%  Similarity=0.351  Sum_probs=16.1

Q ss_pred             HHHHHhhcCHHHHHHHHHHHHH
Q 043345           36 FRSFCEERNIEGCQQYLQHLKQ   57 (80)
Q Consensus        36 lE~~~~~~~~~~~~~~~~~l~~   57 (80)
                      |...++.|+.+.+..++..++.
T Consensus         8 l~a~~~~g~~~~a~~~~~~M~~   29 (34)
T PF13812_consen    8 LRACAKAGDPDAALQLFDEMKE   29 (34)
T ss_pred             HHHHHHCCCHHHHHHHHHHHHH
Confidence            4455688888888888777764


No 45 
>PF15300 INT_SG_DDX_CT_C:  INTS6/SAGE1/DDX26B/CT45 C-terminus
Probab=41.84  E-value=20  Score=19.70  Aligned_cols=20  Identities=15%  Similarity=0.303  Sum_probs=16.6

Q ss_pred             cCchhHHHHHHHHHHhhhhh
Q 043345            2 QQNVDFTKVGGHVHQLKGSS   21 (80)
Q Consensus         2 ~~~~D~~~~~~laH~LKGss   21 (80)
                      .|.+||+.+-.+...++|+-
T Consensus        17 rpGr~ye~iF~lL~~vqG~~   36 (65)
T PF15300_consen   17 RPGRNYEKIFKLLEQVQGPL   36 (65)
T ss_pred             ccCCcHHHHHHHHHHccCCH
Confidence            37789999999888888875


No 46 
>PF11173 DUF2960:  Protein of unknown function (DUF2960);  InterPro: IPR021343  This family of proteins with unknown function appears to be restricted to Gammaproteobacteria. 
Probab=41.56  E-value=22  Score=20.35  Aligned_cols=15  Identities=13%  Similarity=0.160  Sum_probs=11.8

Q ss_pred             HHHHHHHHHhhhhcC
Q 043345           66 LQTLFQVCLKSLASS   80 (80)
Q Consensus        66 L~~~l~~~~~~~~~~   80 (80)
                      |..|+.+|+|+...|
T Consensus        38 lt~fl~ME~Qv~~~s   52 (79)
T PF11173_consen   38 LTEFLKMEQQVEMTS   52 (79)
T ss_pred             HHHHHHHHHHHHHHh
Confidence            678999999986543


No 47 
>PF10180 DUF2373:  Uncharacterised conserved protein (DUF2373);  InterPro: IPR019327  This is a conserved family of proteins found from fungi to humans. The function is not known. 
Probab=37.92  E-value=67  Score=17.47  Aligned_cols=29  Identities=17%  Similarity=0.232  Sum_probs=20.7

Q ss_pred             CchhHHHHHHHHHHhhhhhhccChHHHHHHHHH
Q 043345            3 QNVDFTKVGGHVHQLKGSSSSIGAQRVNNVCTA   35 (80)
Q Consensus         3 ~~~D~~~~~~laH~LKGss~nlGa~~L~~~c~~   35 (80)
                      |..+++.+..+.-.|||++.    .+|.+.|.+
T Consensus        35 P~~~~~~ll~Yl~glkG~aR----~rl~~~a~~   63 (65)
T PF10180_consen   35 PSEYFPILLEYLKGLKGGAR----ERLREEAKE   63 (65)
T ss_pred             CHHHHHHHHHHHHhCcchHH----HHHHHHHHh
Confidence            56778889999999999664    345555543


No 48 
>PRK10987 regulatory protein AmpE; Provisional
Probab=37.87  E-value=63  Score=22.34  Aligned_cols=35  Identities=17%  Similarity=0.215  Sum_probs=30.1

Q ss_pred             hhccChHHHHHHHHHHHHHHhhcCHHHHHHHHHHH
Q 043345           21 SSSIGAQRVNNVCTAFRSFCEERNIEGCQQYLQHL   55 (80)
Q Consensus        21 s~nlGa~~L~~~c~~lE~~~~~~~~~~~~~~~~~l   55 (80)
                      ...+|...+.+...++.++.++||.+.+...+.++
T Consensus        79 ~~~lg~r~L~~~~~~v~~AL~~gDl~aAR~~l~~l  113 (284)
T PRK10987         79 LLCIGAGKQRLHYKAYLQAACRGDSQACYHMAEEL  113 (284)
T ss_pred             HHHhCCchHHHHHHHHHHHHHCCCHHHHHHHHHHh
Confidence            34589999999999999999999999888876666


No 49 
>PF08738 Gon7:  Gon7 family;  InterPro: IPR014849 In Saccharomyces cerevisiae, Gon7 is a member of the KEOPS protein complex, which is proposed to be involved in transcription and in promoting telomere uncapping and telomere elongation.
Probab=37.43  E-value=92  Score=18.61  Aligned_cols=26  Identities=12%  Similarity=0.349  Sum_probs=21.8

Q ss_pred             HHHHHHHHHHHHHHHHHHHHHHHHHH
Q 043345           47 GCQQYLQHLKQEYYLVKNKLQTLFQV   72 (80)
Q Consensus        47 ~~~~~~~~l~~~~~~l~~~L~~~l~~   72 (80)
                      .-...|..|+..+..++..|+.||.-
T Consensus        51 ~K~t~L~~LR~~lt~lQddIN~fLTe   76 (103)
T PF08738_consen   51 DKDTYLSELRAQLTTLQDDINEFLTE   76 (103)
T ss_pred             HHHHHHHHHHHHHHHHHHHHHHHHHH
Confidence            33467899999999999999999864


No 50 
>PRK00068 hypothetical protein; Validated
Probab=37.41  E-value=1.2e+02  Score=25.01  Aligned_cols=39  Identities=5%  Similarity=0.031  Sum_probs=30.8

Q ss_pred             hHHHHHHHHHHHHHHhhcCHHHHHHHHHHHHHHHHHHHH
Q 043345           26 AQRVNNVCTAFRSFCEERNIEGCQQYLQHLKQEYYLVKN   64 (80)
Q Consensus        26 a~~L~~~c~~lE~~~~~~~~~~~~~~~~~l~~~~~~l~~   64 (80)
                      +....++..+.+++.++||+.+..+.+++|++.++++..
T Consensus       930 l~~a~~a~~~a~~Alk~GDw~~yG~a~~~L~~al~~~~~  968 (970)
T PRK00068        930 LKEAQDAYNKAIEAQKSGDFAEYGEALKELDDALNKYNK  968 (970)
T ss_pred             HHHHHHHHHHHHHHHHcCCHHHHHHHHHHHHHHHHHHHh
Confidence            445667777788888999999999988888888776643


No 51 
>PRK10265 chaperone-modulator protein CbpM; Provisional
Probab=37.33  E-value=58  Score=18.99  Aligned_cols=23  Identities=30%  Similarity=0.483  Sum_probs=15.2

Q ss_pred             HHHHHHHHHHHHHHHHHHHHHHH
Q 043345           49 QQYLQHLKQEYYLVKNKLQTLFQ   71 (80)
Q Consensus        49 ~~~~~~l~~~~~~l~~~L~~~l~   71 (80)
                      .+.+++|++++..++..|..|++
T Consensus        77 Ld~i~~Lr~el~~L~~~l~~~~~   99 (101)
T PRK10265         77 LDEIAHLKQENRLLRQRLSRFVA   99 (101)
T ss_pred             HHHHHHHHHHHHHHHHHHHHHhc
Confidence            34456667777777777777764


No 52 
>TIGR00756 PPR pentatricopeptide repeat domain (PPR motif). This family has a similar consensus to the TPR domain (tetratricopeptide), Pfam model pfam00515, a 33-residue repeat. It is predicted to form a pair of antiparallel helices similar to that of TPR.
Probab=37.29  E-value=39  Score=14.32  Aligned_cols=22  Identities=9%  Similarity=0.424  Sum_probs=16.5

Q ss_pred             HHHHHhhcCHHHHHHHHHHHHH
Q 043345           36 FRSFCEERNIEGCQQYLQHLKQ   57 (80)
Q Consensus        36 lE~~~~~~~~~~~~~~~~~l~~   57 (80)
                      |...++.++.+++..++..+..
T Consensus         7 i~~~~~~~~~~~a~~~~~~M~~   28 (35)
T TIGR00756         7 IDGLCKAGRVEEALELFKEMLE   28 (35)
T ss_pred             HHHHHHCCCHHHHHHHHHHHHH
Confidence            4556788888888888877754


No 53 
>KOG4182 consensus Uncharacterized conserved protein [Function unknown]
Probab=37.09  E-value=2e+02  Score=22.46  Aligned_cols=58  Identities=16%  Similarity=0.200  Sum_probs=35.5

Q ss_pred             chhHHHHHHHHHHhhhhhhccC------------------------------------hHHHHHHHHHHHHHHhhcCHHH
Q 043345            4 NVDFTKVGGHVHQLKGSSSSIG------------------------------------AQRVNNVCTAFRSFCEERNIEG   47 (80)
Q Consensus         4 ~~D~~~~~~laH~LKGss~nlG------------------------------------a~~L~~~c~~lE~~~~~~~~~~   47 (80)
                      ..|...++.=+|+|.|+.+.|-                                    +..+..+...+|.-...||..+
T Consensus        73 akd~~~Lq~Da~~Lq~kma~il~el~~aegesadCiAaLaRldn~kQkleaA~esLQdaaGl~nL~a~lED~Fa~gDL~~  152 (828)
T KOG4182|consen   73 AKDSAALQADAHRLQEKMAAILLELAAAEGESADCIAALARLDNKKQKLEAAKESLQDAAGLGNLLAELEDGFARGDLKG  152 (828)
T ss_pred             hhHHHHHHHHHHHHHHHHHHHHHHHHHHhCChHHHHHHHHHhccHHHHHHHHHHHHHhhccHHHHHHHHHHHhhcCCchh
Confidence            3566777888888888765431                                    2233444555666666666666


Q ss_pred             HHHHHHHHHHHHHH
Q 043345           48 CQQYLQHLKQEYYL   61 (80)
Q Consensus        48 ~~~~~~~l~~~~~~   61 (80)
                      +...+..++.++..
T Consensus       153 aadkLaalqkcL~A  166 (828)
T KOG4182|consen  153 AADKLAALQKCLHA  166 (828)
T ss_pred             HHHHHHHHHHHHHH
Confidence            66666666666543


No 54 
>PF01044 Vinculin:  Vinculin family;  InterPro: IPR006077 Vinculin is a eukaryotic protein that seems to be involved in the attachment of the actin-based microfilaments to the plasma membrane. Vinculin is located at the cytoplasmic side of focal contacts or adhesion plaques. In addition to actin, vinculin interacts with other structural proteins such as talin and alpha-actinins. Vinculin is a large protein of 116 kDa (about 1000 residues). Structurally the protein consists of an acidic N-terminal domain of about 90 kDa separated from a basic C-terminal domain of about 25 kDa by a proline-rich region of about 50 residues. The central part of the N-terminal domain consists of a variable number (3 in vertebrates, 2 in Caenorhabditis elegans) of repeats of a 110-amino-acid domain. Alpha-catenins are evolutionarily related to vinculin IPR001033 from INTERPRO. Catenins are proteins that associate with the cytoplasmic domain of a variety of cadherins. The association of catenins with cadherins produces a complex which is linked to the actin filament network, and which seems to be of primary importance for the cell-adhesion properties of cadherins. Three different types of catenins seem to exist: alpha, beta, and gamma. Alpha-catenins are proteins of about 100 kDa which are evolutionarily related to vinculin. In terms of their structure the most significant differences are the absence, in alpha-catenin, of the repeated domain and of the proline-rich segment.; GO: 0005198 structural molecule activity, 0007155 cell adhesion, 0015629 actin cytoskeleton; PDB: 3S90_B 1TR2_B 2IBF_A 1RKC_A 3TJ5_A 3RF3_B 4DJ9_A 2GWW_A 2HSQ_A 3TJ6_A ....
Probab=36.97  E-value=1.6e+02  Score=24.11  Aligned_cols=61  Identities=15%  Similarity=0.231  Sum_probs=38.2

Q ss_pred             HHHHHHHHHhhhhhhccChHHHHHHHHHHHH---------HHhhcCHHHHHHHHHHHHHHHHHHHHHHHH
Q 043345            8 TKVGGHVHQLKGSSSSIGAQRVNNVCTAFRS---------FCEERNIEGCQQYLQHLKQEYYLVKNKLQT   68 (80)
Q Consensus         8 ~~~~~laH~LKGss~nlGa~~L~~~c~~lE~---------~~~~~~~~~~~~~~~~l~~~~~~l~~~L~~   68 (80)
                      ..+..-+..+-+++..=--..+..+|.++++         ....++.++......+|...+..|+..|..
T Consensus       411 ~~lv~e~~~~A~~~~~~~R~~Il~lc~~i~~l~~qL~dL~~~~~~~spea~~la~~L~~~l~~L~~~l~~  480 (968)
T PF01044_consen  411 RDLVEEARKLADSSDPEEREEILELCDEIEQLTNQLADLEMRGEGDSPEAKALAEQLSQKLDDLRQQLQK  480 (968)
T ss_dssp             HHHHHHHHHHHHTSSHHHHHHHHHHHHHHHHHHHHHHHHCHCSCCSSHHHHHHHHHHHHHHHHHHHHHHH
T ss_pred             HHHHHHHHHHHhccccchHHhHHHHHHHHHHhcchhhhhhhccCCCcccccccccchhhhHHHHHHHHHH
Confidence            3445555566666654445577888888888         333445456666667777777777666653


No 55 
>PF05190 MutS_IV:  MutS family domain IV C-terminus.;  InterPro: IPR007861 Mismatch repair contributes to the overall fidelity of DNA replication and is essential for combating the adverse effects of damage to the genome. It involves the correction of mismatched base pairs that have been missed by the proofreading element of the DNA polymerase complex. The post-replicative Mismatch Repair System (MMRS) of Escherichia coli involves MutS (Mutator S), MutL and MutH proteins, and acts to correct point mutations or small insertion/deletion loops produced during DNA replication. MutS and MutL are involved in preventing recombination between partially homologous DNA sequences. The assembly of MMRS is initiated by MutS, which recognises and binds to mispaired nucleotides and allows further action of MutL and MutH to eliminate a portion of newly synthesized DNA strand containing the mispaired base. MutS can also collaborate with methyltransferases in the repair of O(6)-methylguanine damage, which would otherwise pair with thymine during replication to create an O(6)mG:T mismatch. MutS exists as a dimer, where the two monomers have different conformations and form a heterodimer at the structural level. Only one monomer recognises the mismatch specifically and has ADP bound. Non-specific major groove DNA-binding domains from both monomers embrace the DNA in a clamp-like structure. Mismatch binding induces ATP uptake and a conformational change in the MutS protein, resulting in a clamp that translocates on DNA. MutS is a modular protein with a complex structure, and is composed of: N-terminal mismatch-recognition domain, which is similar in structure to tRNA endonuclease. Connector domain, which is similar in structure to Holliday junction resolvase ruvC. Core domain, which is composed of two separate subdomains that join together to form a helical bundle; from within the core domain, two helices act as levers that extend towards (but do not touch) the DNA. Clamp domain, which is inserted between the two subdomains of the core domain at the top of the lever helices; the clamp domain has a beta-sheet structure. ATPase domain (connected to the core domain), which has a classical Walker A motif. HTH (helix-turn-helix) domain, which is involved in dimer contacts. The MutS family of proteins is named after the Salmonella typhimurium MutS protein involved in mismatch repair. Homologues of MutS have been found in many species including eukaryotes (MSH 1, 2, 3, 4, 5, and 6 proteins), archaea and bacteria, and together these proteins have been grouped into the MutS family. Although many of these proteins have similar activities to the E. coli MutS, there is significant diversity of function among the MutS family members. Human MSH has been implicated in non-polyposis colorectal carcinoma (HNPCC) and is a mismatch binding protein. This diversity is even seen within species, where many species encode multiple MutS homologues with distinct functions. Inter-species homologues may have arisen through frequent ancient horizontal gene transfer of MutS (and MutL) from bacteria to archaea and eukaryotes via endosymbiotic ancestors of mitochondria and chloroplasts. This entry represents the clamp domain (domain 4) found in proteins of the MutS family. The clamp domain is inserted within the core domain at the top of the lever helices. It has a beta-sheet structure.; GO: 0005524 ATP binding, 0030983 mismatched DNA binding, 0006298 mismatch repair; PDB: 2WTU_A 1OH7_A 1OH5_B 1W7A_B 1NG9_A 1OH8_B 1WBD_A 1WB9_A 3K0S_A 1OH6_A ....
Probab=36.59  E-value=66  Score=17.51  Aligned_cols=26  Identities=27%  Similarity=0.466  Sum_probs=15.6

Q ss_pred             HHHHHHHHHHHHHHHHHHHHHHHHHh
Q 043345           50 QYLQHLKQEYYLVKNKLQTLFQVCLK   75 (80)
Q Consensus        50 ~~~~~l~~~~~~l~~~L~~~l~~~~~   75 (80)
                      ..++.+...+..+...|..++.-.++
T Consensus         4 ~~Ld~~~~~~~~~~~~l~~~~~~~~~   29 (92)
T PF05190_consen    4 EELDELREEYEEIEEELEELLEEIRK   29 (92)
T ss_dssp             HHHHHHHHHHHHHHHHHHHHHHHHHH
T ss_pred             HHHHHHHHHHHHHHHHHHHHHHHHHH
Confidence            34566666666666666666655544


No 56 
>PF14756 Pdase_C33_assoc:  Peptidase_C33-associated domain
Probab=35.87  E-value=1.1e+02  Score=19.05  Aligned_cols=21  Identities=14%  Similarity=0.493  Sum_probs=16.3

Q ss_pred             cChHHHHHHHHHHHHHHhhcC
Q 043345           24 IGAQRVNNVCTAFRSFCEERN   44 (80)
Q Consensus        24 lGa~~L~~~c~~lE~~~~~~~   44 (80)
                      +-.-++..+|+-||..|-..|
T Consensus        70 ~~lgkiislcqvie~ccc~qn   90 (147)
T PF14756_consen   70 VCLGKIISLCQVIEECCCSQN   90 (147)
T ss_pred             hHHHHHHHHHHHHHHHHcccC
Confidence            345578999999999995554


No 57 
>PRK15058 cytochrome b562; Provisional
Probab=35.04  E-value=1.1e+02  Score=18.97  Aligned_cols=32  Identities=16%  Similarity=0.172  Sum_probs=25.2

Q ss_pred             ChHHHHHHHHHHHHHHhhcCHHHHHHHHHHHH
Q 043345           25 GAQRVNNVCTAFRSFCEERNIEGCQQYLQHLK   56 (80)
Q Consensus        25 Ga~~L~~~c~~lE~~~~~~~~~~~~~~~~~l~   56 (80)
                      |...|...-...+..+.+|+.+++.....+|.
T Consensus        86 G~d~Li~qID~a~~la~~GkL~eAK~~a~~l~  117 (128)
T PRK15058         86 GFDILVGQIDGALKLANEGKVKEAQAAAEQLK  117 (128)
T ss_pred             HHHHHHHHHHHHHHHHhCCCHHHHHHHHHHHH
Confidence            67888888888889999999998776654443


No 58 
>TIGR00444 mazG MazG family protein. This family of prokaryotic proteins has no known function. It includes the uncharacterized protein MazG in E. coli.
Probab=33.98  E-value=1.1e+02  Score=21.09  Aligned_cols=43  Identities=7%  Similarity=0.002  Sum_probs=30.5

Q ss_pred             ChHHHHHHHHHHHHHHhhc-CHHHHHHHHHHHHHHHHHHHHHHH
Q 043345           25 GAQRVNNVCTAFRSFCEER-NIEGCQQYLQHLKQEYYLVKNKLQ   67 (80)
Q Consensus        25 Ga~~L~~~c~~lE~~~~~~-~~~~~~~~~~~l~~~~~~l~~~L~   67 (80)
                      +.+.|.....-...+++.| +++.....+..+..++.++..++.
T Consensus       129 ~lPaL~~A~ki~~raa~~Gfdw~~~~~~~~k~~EE~~El~~a~~  172 (248)
T TIGR00444       129 TLPALMRAAKIQKRCAKVGFDWEDVSPVWDKVYEELDEVMYEAR  172 (248)
T ss_pred             cCCHHHHHHHHHHHHHHcCCCCCCcHHHHHHHHHHHHHHHHHHh
Confidence            4456666666677777777 566777788888888877777663


No 59 
>PRK11107 hybrid sensory histidine kinase BarA; Provisional
Probab=33.36  E-value=2.3e+02  Score=22.06  Aligned_cols=55  Identities=15%  Similarity=0.201  Sum_probs=26.4

Q ss_pred             HHHHHHHhhhhhhccChHHHHHHHHHHHHHHhhcCHHHHHHHHHHHHHHHHHHHHHHHHHHHH
Q 043345           10 VGGHVHQLKGSSSSIGAQRVNNVCTAFRSFCEERNIEGCQQYLQHLKQEYYLVKNKLQTLFQV   72 (80)
Q Consensus        10 ~~~laH~LKGss~nlGa~~L~~~c~~lE~~~~~~~~~~~~~~~~~l~~~~~~l~~~L~~~l~~   72 (80)
                      +..++|-||--        |..+-..++...+....+.....+..+......+..-++..+.+
T Consensus       297 l~~isHelrtP--------L~~i~~~~~~l~~~~~~~~~~~~l~~i~~~~~~l~~li~~ll~~  351 (919)
T PRK11107        297 LANMSHELRTP--------LNGVIGFTRQTLKTPLTPTQRDYLQTIERSANNLLAIINDILDF  351 (919)
T ss_pred             HHHhhHhhccc--------HHHHHHHHHHHhcCCCCHHHHHHHHHHHHHHHHHHHHHHHHHHH
Confidence            45677887743        22222223322322222334445566666665555555555443


No 60 
>COG1270 CbiB Cobalamin biosynthesis protein CobD/CbiB [Coenzyme metabolism]
Probab=32.90  E-value=1e+02  Score=22.15  Aligned_cols=35  Identities=9%  Similarity=0.249  Sum_probs=29.1

Q ss_pred             hhccChHHHHHHHHHHHHHHhhcCHHHHHHHHHHH
Q 043345           21 SSSIGAQRVNNVCTAFRSFCEERNIEGCQQYLQHL   55 (80)
Q Consensus        21 s~nlGa~~L~~~c~~lE~~~~~~~~~~~~~~~~~l   55 (80)
                      ...++...|.+.+.++.+..+++|.++....+..+
T Consensus        94 ~~tla~rsL~~~~~~v~~~L~~gdl~~aR~~ls~i  128 (320)
T COG1270          94 KTTLAIRSLADHARKVARALRRGDLEGARRALSMI  128 (320)
T ss_pred             HHHHHHHHHHHHHHHHHHHHHhCCHHHHHHHHHHH
Confidence            45688999999999999999999988877666543


No 61 
>PF09686 Plasmid_RAQPRD:  Plasmid protein of unknown function (Plasmid_RAQPRD);  InterPro: IPR019110  This entry identifies a family of proteins, around 100 amino acids in length, that include a predicted signal sequence and a perfectly conserved motif, RAQPRD, towards the C terminus. They are found in the Pseudomonas putida TOL plasmid pWW0 and in cryptic plasmid regions of Salmonella enterica subsp. enterica serovar Typhi and Pseudomonas syringae pv. tomato str. DC3000. The function of these proteins is unknown. 
Probab=32.66  E-value=53  Score=18.76  Aligned_cols=23  Identities=9%  Similarity=0.196  Sum_probs=20.5

Q ss_pred             HHHHHHHHHHHHHHHHHHHHHHH
Q 043345           52 LQHLKQEYYLVKNKLQTLFQVCL   74 (80)
Q Consensus        52 ~~~l~~~~~~l~~~L~~~l~~~~   74 (80)
                      ..+|...+..++.-|+.||...|
T Consensus        46 Y~rl~~Dl~~ir~GI~~YL~psR   68 (81)
T PF09686_consen   46 YPRLRADLERIRAGIQDYLNPSR   68 (81)
T ss_pred             HHHHHHHHHHHHHHHHHHcCccc
Confidence            68899999999999999998766


No 62 
>PLN02999 photosystem II oxygen-evolving enhancer 3 protein (PsbQ)
Probab=31.86  E-value=1.6e+02  Score=19.62  Aligned_cols=38  Identities=5%  Similarity=0.055  Sum_probs=26.7

Q ss_pred             HHHHHHHHHHHHHHhhcCHHHHHHHHHHHHHHHHHHHH
Q 043345           27 QRVNNVCTAFRSFCEERNIEGCQQYLQHLKQEYYLVKN   64 (80)
Q Consensus        27 ~~L~~~c~~lE~~~~~~~~~~~~~~~~~l~~~~~~l~~   64 (80)
                      ..|.+-...|+.+++..+..+.....+..-..++.+..
T Consensus       150 nkLFdnvt~LDyAAR~K~~~eae~yY~~Tv~slddVl~  187 (190)
T PLN02999        150 NELVENMSELDYYVRTPKVYESYLYYEKTLKSIDNVVE  187 (190)
T ss_pred             HHHhhhHHHHHHHHhcCChHHHHHHHHHHHHHHHHHHH
Confidence            56777778888899888877766666665555555544


No 63 
>PRK03170 dihydrodipicolinate synthase; Provisional
Probab=31.84  E-value=1.7e+02  Score=19.94  Aligned_cols=41  Identities=7%  Similarity=0.032  Sum_probs=30.6

Q ss_pred             hhhhhccChHH--HHHHHHHHHHHHhhcCHHHHHHHHHHHHHH
Q 043345           18 KGSSSSIGAQR--VNNVCTAFRSFCEERNIEGCQQYLQHLKQE   58 (80)
Q Consensus        18 KGss~nlGa~~--L~~~c~~lE~~~~~~~~~~~~~~~~~l~~~   58 (80)
                      -|+.|.++...  +.+.+.++-++.++||.+.+.++..++...
T Consensus       197 ~G~~G~is~~~n~~P~~~~~l~~~~~~gd~~~a~~l~~~l~~~  239 (292)
T PRK03170        197 LGGVGVISVAANVAPKEMAEMCDAALAGDFAEAREIHRRLLPL  239 (292)
T ss_pred             cCCCEEEEhHHhhhHHHHHHHHHHHHCCCHHHHHHHHHHHHHH
Confidence            46677666544  779999999999999998877766555543


No 64 
>PF13041 PPR_2:  PPR repeat family 
Probab=31.83  E-value=69  Score=15.49  Aligned_cols=24  Identities=17%  Similarity=0.497  Sum_probs=18.7

Q ss_pred             HHHHHHHhhcCHHHHHHHHHHHHH
Q 043345           34 TAFRSFCEERNIEGCQQYLQHLKQ   57 (80)
Q Consensus        34 ~~lE~~~~~~~~~~~~~~~~~l~~   57 (80)
                      .-|...++.|+.+++..++.++..
T Consensus         8 ~li~~~~~~~~~~~a~~l~~~M~~   31 (50)
T PF13041_consen    8 TLISGYCKAGKFEEALKLFKEMKK   31 (50)
T ss_pred             HHHHHHHHCcCHHHHHHHHHHHHH
Confidence            345667888999998888888775


No 65 
>PF09660 DUF2397:  Protein of unknown function (DUF2397);  InterPro: IPR013493  Proteins in this family are encoded within a conserved four-gene neighbourhood found sporadically in a phylogenetically broad range of bacteria including: Nocardia farcinica, Symbiobacterium thermophilum, Streptomyces avermitilis (Actinobacteria), Geobacillus kaustophilus (Firmicutes), Azoarcus sp. (strain EbN1) (Aromatoleum aromaticum (strain EbN1)) and Ralstonia solanacearum (Betaproteobacteria). 
Probab=31.46  E-value=2.3e+02  Score=21.35  Aligned_cols=64  Identities=13%  Similarity=0.256  Sum_probs=48.1

Q ss_pred             HHHhhhhhhccChHHHHHHHHHHHHHHh------hcCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhh
Q 043345           14 VHQLKGSSSSIGAQRVNNVCTAFRSFCE------ERNIEGCQQYLQHLKQEYYLVKNKLQTLFQVCLKSL   77 (80)
Q Consensus        14 aH~LKGss~nlGa~~L~~~c~~lE~~~~------~~~~~~~~~~~~~l~~~~~~l~~~L~~~l~~~~~~~   77 (80)
                      ...+.|.++++--..|-.+...|.....      .++.+.+...+..|...|..+...-..|+......+
T Consensus       105 le~~~~~~gsL~~~~l~~I~~~L~~L~~~~~~~~~~d~~~~~~~w~~L~~~f~~L~~na~df~~~L~~~~  174 (486)
T PF09660_consen  105 LENLLGERGSLQRTLLERILERLRALAELAESPREGDAAEVYEWWRDLFEDFERLAQNAQDFYASLQSVK  174 (486)
T ss_pred             HHhhcccccccchhHHHHHHHHHHHHHHHHhccCCCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhh
Confidence            3444578888887777777777765553      457778888999999999999999998887755443


No 66 
>PF07891 DUF1666:  Protein of unknown function (DUF1666);  InterPro: IPR012870 These sequences are derived from hypothetical plant proteins of unknown function. The region in question is approximately 250 residues long. 
Probab=31.20  E-value=1.8e+02  Score=20.15  Aligned_cols=47  Identities=13%  Similarity=0.224  Sum_probs=32.0

Q ss_pred             HHHHHHHHHHHHHHhh---------cCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHH
Q 043345           27 QRVNNVCTAFRSFCEE---------RNIEGCQQYLQHLKQEYYLVKNKLQTLFQVCL   74 (80)
Q Consensus        27 ~~L~~~c~~lE~~~~~---------~~~~~~~~~~~~l~~~~~~l~~~L~~~l~~~~   74 (80)
                      .=++.+|-..|-..-+         .+.. -....+.+.++|+.++.-|++|++-|+
T Consensus         7 vYVaQiCLSWEaL~wqY~k~~~l~~~~~~-~~~~yn~VA~eFQqFQVLLQRFiENEP   62 (247)
T PF07891_consen    7 VYVAQICLSWEALHWQYKKASELWESDPQ-NPHCYNHVAGEFQQFQVLLQRFIENEP   62 (247)
T ss_pred             HHHHHHHhhHHHHHhHHHHHHHHHhcCCC-CCCChHHHHHHHHHHHHHHHHHHhCCC
Confidence            3467778877743311         1111 123578999999999999999998764


No 67 
>PLN02729 PSII-Q subunit
Probab=31.18  E-value=1.7e+02  Score=19.88  Aligned_cols=37  Identities=11%  Similarity=0.108  Sum_probs=24.4

Q ss_pred             HHHHHHHHHHHHHHhhcCHHHHHHHHHHHHHHHHHHH
Q 043345           27 QRVNNVCTAFRSFCEERNIEGCQQYLQHLKQEYYLVK   63 (80)
Q Consensus        27 ~~L~~~c~~lE~~~~~~~~~~~~~~~~~l~~~~~~l~   63 (80)
                      ..|.+-..+|..+++..+..+....+......++.|.
T Consensus       180 nkLFdn~~eLD~AaR~Ks~~eae~yY~~Tv~aLdeVl  216 (220)
T PLN02729        180 NRLFDNFEKLEDASKRKNLSETESSYKDTKTLLQEVM  216 (220)
T ss_pred             HHHHhhHHHHHHHHhCCChHHHHHHHHHHHHHHHHHH
Confidence            5677777888888888876666665555555544443


No 68 
>COG4354 Predicted bile acid beta-glucosidase [Carbohydrate transport and metabolism]
Probab=31.10  E-value=1.9e+02  Score=22.93  Aligned_cols=56  Identities=14%  Similarity=0.076  Sum_probs=33.1

Q ss_pred             HHHhhhhhhccChHHHHHHHHHHHHHHhhcCHHHHHHHHHHHHHHHHHHHHHHHHHHHHH
Q 043345           14 VHQLKGSSSSIGAQRVNNVCTAFRSFCEERNIEGCQQYLQHLKQEYYLVKNKLQTLFQVC   73 (80)
Q Consensus        14 aH~LKGss~nlGa~~L~~~c~~lE~~~~~~~~~~~~~~~~~l~~~~~~l~~~L~~~l~~~   73 (80)
                      +|+|+|.++-.|-..++.+..-++-.--=++.    ..++.+..+........+++|..-
T Consensus       501 a~~i~G~ssy~~sl~iaal~A~l~is~~l~~~----~~~~a~~e~~~~~~~~y~~~L~~~  556 (721)
T COG4354         501 ATRIQGHSSYCGSLFIAALIAALEISKYLLDN----AQLEALNEASKNYVDFYNTWLKEA  556 (721)
T ss_pred             cceeechhhhhhHHHHHHHHHHHHHHHHHhhh----hhhhhhHHHHHHHHHHHHHHHHHH
Confidence            68999999999998888888877743322111    233334444444444444444433


No 69 
>PRK01209 cobD cobalamin biosynthesis protein; Provisional
Probab=30.34  E-value=1.1e+02  Score=21.42  Aligned_cols=33  Identities=6%  Similarity=0.185  Sum_probs=27.9

Q ss_pred             ccChHHHHHHHHHHHHHHhhcCHHHHHHHHHHH
Q 043345           23 SIGAQRVNNVCTAFRSFCEERNIEGCQQYLQHL   55 (80)
Q Consensus        23 nlGa~~L~~~c~~lE~~~~~~~~~~~~~~~~~l   55 (80)
                      .+|...+.+.+.++.++.+++|.+.+...+..+
T Consensus        92 ~l~~~~l~~~~~~v~~al~~gd~~~AR~~l~~~  124 (312)
T PRK01209         92 ALAGRSLADHARAVARALRAGDLEEARRAVSMI  124 (312)
T ss_pred             HHhhhhHHHHHHHHHHHHHcCCHHHHHHHHHHH
Confidence            478889999999999999999988887777665


No 70 
>TIGR00674 dapA dihydrodipicolinate synthase. Dihydrodipicolinate synthase is a homotetrameric enzyme of lysine biosynthesis. E. coli has several paralogs closely related to dihydrodipicolinate synthase (DapA), as well as the more distant N-acetylneuraminate lyase. In Pyrococcus horikoshii, the bidirectional best hit with E. coli is to an uncharacterized paralog of DapA, not DapA itself, and it is omitted from the seed. The putative members from the Chlamydias (pathogens with a parasitic metabolism) are easily the most divergent members of the multiple alignment.
Probab=29.72  E-value=1.8e+02  Score=19.73  Aligned_cols=42  Identities=12%  Similarity=0.099  Sum_probs=30.8

Q ss_pred             hhhhhhccChHH--HHHHHHHHHHHHhhcCHHHHHHHHHHHHHH
Q 043345           17 LKGSSSSIGAQR--VNNVCTAFRSFCEERNIEGCQQYLQHLKQE   58 (80)
Q Consensus        17 LKGss~nlGa~~--L~~~c~~lE~~~~~~~~~~~~~~~~~l~~~   58 (80)
                      .-|..|.++...  +.+.+.++=++..+||.+.+.++..++..-
T Consensus       193 ~~G~~G~i~~~~~~~P~~~~~l~~a~~~gd~~~A~~lq~~l~~l  236 (285)
T TIGR00674       193 ALGGKGVISVTANVAPKLMKEMVNNALEGDFAEAREIHQKLMPL  236 (285)
T ss_pred             HcCCCEEEehHHHhhHHHHHHHHHHHHcCCHHHHHHHHHHHHHH
Confidence            446777765444  668999999999999998877765555543


No 71 
>TIGR02878 spore_ypjB sporulation protein YpjB. Members of this protein family, YpjB, are restricted to a subset of endospore-forming bacteria, including Bacillus species but not Clostridium or some others. In Bacillus subtilis, ypjB was found to be part of the sigma-E regulon, where sigma-E is a sporulation sigma factor that regulates expression in the mother cell compartment. Null mutants of ypjB show a sporulation defect. This protein family is not, however, a part of the endospore formation minimal gene set.
Probab=27.67  E-value=1.5e+02  Score=20.35  Aligned_cols=33  Identities=18%  Similarity=0.238  Sum_probs=17.9

Q ss_pred             HHHHHHHHHHHHhhcCHHHHHHHHHHHHHHHHH
Q 043345           29 VNNVCTAFRSFCEERNIEGCQQYLQHLKQEYYL   61 (80)
Q Consensus        29 L~~~c~~lE~~~~~~~~~~~~~~~~~l~~~~~~   61 (80)
                      |......+++++..++.......++..-..|.-
T Consensus       104 im~~f~~mk~a~~~~~~~~f~~~ln~Fl~~Y~~  136 (233)
T TIGR02878       104 VMEAFTELEKAAQKEDSQAFQEKLNEFLSLYDL  136 (233)
T ss_pred             HHHHHHHHHHHHHcCCHHHHHHHHHHHHHHhhh
Confidence            444555566666666655555555555554443


No 72 
>PRK10722 hypothetical protein; Provisional
Probab=27.49  E-value=2.1e+02  Score=19.81  Aligned_cols=32  Identities=22%  Similarity=0.265  Sum_probs=26.9

Q ss_pred             HHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhh
Q 043345           45 IEGCQQYLQHLKQEYYLVKNKLQTLFQVCLKS   76 (80)
Q Consensus        45 ~~~~~~~~~~l~~~~~~l~~~L~~~l~~~~~~   76 (80)
                      .+...+....|+..+..+..+|+..-.+|||.
T Consensus       178 lD~lrqq~~~Lq~~L~~t~rKLEnLTdIERqL  209 (247)
T PRK10722        178 LDALRQQQQRLQYQLELTTRKLENLTDIERQL  209 (247)
T ss_pred             HHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHh
Confidence            45566777888889999999999999999986


No 73 
>cd00408 DHDPS-like Dihydrodipicolinate synthase family. A member of the class I aldolases, which use an active-site lysine that stabilizes a reaction intermediate via Schiff base formation, and have a TIM beta/alpha barrel fold. The dihydrodipicolinate synthase family comprises several pyruvate-dependent class I aldolases that use the same catalytic step to catalyze different reactions in different pathways and includes such proteins as N-acetylneuraminate lyase, MosA protein, 5-keto-4-deoxy-glucarate dehydratase, trans-o-hydroxybenzylidenepyruvate hydratase-aldolase, trans-2'-carboxybenzalpyruvate hydratase-aldolase, and 2-keto-3-deoxy-gluconate aldolase. The family is also referred to as the N-acetylneuraminate lyase (NAL) family.
Probab=27.46  E-value=1.8e+02  Score=19.55  Aligned_cols=43  Identities=12%  Similarity=0.149  Sum_probs=31.1

Q ss_pred             hhhhhccCh--HHHHHHHHHHHHHHhhcCHHHHHHHHHHHHHHHH
Q 043345           18 KGSSSSIGA--QRVNNVCTAFRSFCEERNIEGCQQYLQHLKQEYY   60 (80)
Q Consensus        18 KGss~nlGa--~~L~~~c~~lE~~~~~~~~~~~~~~~~~l~~~~~   60 (80)
                      -|..|.++.  +-+.+.+..+-++.++|+.+.+.++...+..-..
T Consensus       193 ~G~~G~i~~~~n~~p~~~~~~~~~~~~g~~~~a~~~~~~~~~~~~  237 (281)
T cd00408         193 LGADGAISGAANVAPKLAVALYEAARAGDLEEARALQDRLLPLIE  237 (281)
T ss_pred             cCCCEEEehHHhhCHHHHHHHHHHHHcCCHHHHHHHHHHHHHHHH
Confidence            355555555  6678999999999999998887776665555433


No 74 
>PF13047 DUF3907:  Protein of unknown function (DUF3907)
Probab=27.35  E-value=1.2e+02  Score=19.36  Aligned_cols=26  Identities=8%  Similarity=0.181  Sum_probs=21.7

Q ss_pred             CHHHHHHHHHHHHHHHHHHHHHHHHH
Q 043345           44 NIEGCQQYLQHLKQEYYLVKNKLQTL   69 (80)
Q Consensus        44 ~~~~~~~~~~~l~~~~~~l~~~L~~~   69 (80)
                      -.+++..++..|...|..++..|+=|
T Consensus       117 ~~~~l~~l~~~le~~Fq~mREEL~YY  142 (148)
T PF13047_consen  117 PPRSLKDLMLSLEKIFQEMREELEYY  142 (148)
T ss_pred             CChHHHHHHHHHHHHHHHHHHHHHHH
Confidence            34678889999999999999998754


No 75 
>PF13747 DUF4164:  Domain of unknown function (DUF4164)
Probab=27.06  E-value=1.3e+02  Score=17.23  Aligned_cols=23  Identities=13%  Similarity=0.185  Sum_probs=10.7

Q ss_pred             HHHHHHHHHHHHHHHHHHHHHHH
Q 043345           47 GCQQYLQHLKQEYYLVKNKLQTL   69 (80)
Q Consensus        47 ~~~~~~~~l~~~~~~l~~~L~~~   69 (80)
                      ....-++.|.....++-..|...
T Consensus        36 ~~e~ei~~l~~dr~rLa~eLD~~   58 (89)
T PF13747_consen   36 ELEEEIQRLDADRSRLAQELDQA   58 (89)
T ss_pred             hHHHHHHHHHhhHHHHHHHHHhH
Confidence            33444455555545555555433


No 76 
>PHA02585 16 small terminase protein; Provisional
Probab=27.02  E-value=1.8e+02  Score=18.80  Aligned_cols=64  Identities=9%  Similarity=0.125  Sum_probs=40.2

Q ss_pred             chhHHHHHHHHHHhhhhhhccChHHHHHHHH-HHHHHHhhcCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhh
Q 043345            4 NVDFTKVGGHVHQLKGSSSSIGAQRVNNVCT-AFRSFCEERNIEGCQQYLQHLKQEYYLVKNKLQTLFQVCLKS   76 (80)
Q Consensus         4 ~~D~~~~~~laH~LKGss~nlGa~~L~~~c~-~lE~~~~~~~~~~~~~~~~~l~~~~~~l~~~L~~~l~~~~~~   76 (80)
                      ..||+.+|+-.|..-        ..+.+... .||. +++.+.+..-+++..|-..+..+...|-..-.-+|.+
T Consensus        45 e~DY~~vR~~~h~q~--------Qm~mda~~iaLE~-AknSesPR~~EVf~~Lm~qm~~~nk~Ll~lhK~MK~i  109 (161)
T PHA02585         45 EDDYSLVRRNMHFQQ--------QMLMDAAKIALEN-AKNSESPRHVEVFATLMGQMTNTNKEILKIHKEMKDI  109 (161)
T ss_pred             HHHHHHHHHHHHHHH--------HHHHHHHHHHHHh-ccccCCchHHHHHHHHHHHHHhhHHHHHHHHHHHHHh
Confidence            368999999999821        11222223 3444 5555666677778888888877777776555544443


No 77 
>PF00701 DHDPS:  Dihydrodipicolinate synthetase family;  InterPro: IPR002220 Dihydropicolinate synthase (DHDPS) is the key enzyme in lysine biosynthesis via the diaminopimelate pathway of prokaryotes, some phycomycetes and higher plants. The enzyme catalyses the condensation of L-aspartate-beta- semialdehyde and pyruvate to dihydropicolinic acid via a ping-pong mechanism in which pyruvate binds to the enzyme by forming a Schiff-base with a lysine residue []. Three other proteins are structurally related to DHDPS and probably also act via a similar catalytic mechanism. These are Escherichia coli N-acetylneuraminate lyase (4.1.3.3 from EC) (gene nanA), which catalyzes the condensation of N-acetyl-D-mannosamine and pyruvate to form N-acetylneuraminate; Rhizobium meliloti (Sinorhizobium meliloti) protein mosA [], which is involved in the biosynthesis of the rhizopine 3-o-methyl-scyllo-inosamine; and E. coli hypothetical protein yjhH. The sequences of DHDPS from different sources are well-conserved. The structure takes the form of a homotetramer, in which 2 monomers are related by an approximate 2-fold symmetry []. Each monomer comprises 2 domains: an 8-fold alpha-/beta-barrel, and a C-terminal alpha-helical domain. The fold resembles that of N-acetylneuraminate lyase. The active site lysine is located in the barrel domain, and has access via 2 channels on the C-terminal side of the barrel.; GO: 0016829 lyase activity, 0008152 metabolic process; PDB: 3B4U_B 3S8H_A 3QZE_B 1XXX_F 3L21_F 3IRD_A 3A5F_B 3G0S_B 3DAQ_C 3UQN_A ....
Probab=26.54  E-value=1.6e+02  Score=19.93  Aligned_cols=42  Identities=14%  Similarity=0.063  Sum_probs=29.0

Q ss_pred             hhhhcc--ChHHHHHHHHHHHHHHhhcCHHHHHHHHHHHHHHHH
Q 043345           19 GSSSSI--GAQRVNNVCTAFRSFCEERNIEGCQQYLQHLKQEYY   60 (80)
Q Consensus        19 Gss~nl--Ga~~L~~~c~~lE~~~~~~~~~~~~~~~~~l~~~~~   60 (80)
                      |+.|.+  .++-+.+.+.++-+++.+|+.+.+..+.+++..-+.
T Consensus       198 G~~G~is~~~n~~P~~~~~i~~~~~~Gd~~~A~~l~~~l~~~~~  241 (289)
T PF00701_consen  198 GADGFISGLANVFPELIVEIYDAFQAGDWEEARELQQRLLPLRE  241 (289)
T ss_dssp             TSSEEEESGGGTHHHHHHHHHHHHHTTCHHHHHHHHHHHHHHHH
T ss_pred             cCCEEEEcccccChHHHHHHHHHHHcCcHHHHHHHHHHHhHHHH
Confidence            444433  233477899999999999999887766665555433


No 78 
>PF08657 DASH_Spc34:  DASH complex subunit Spc34 ;  InterPro: IPR013966  The DASH complex is a ~10 subunit microtubule-binding complex that is transferred to the kinetochore prior to mitosis []. In Saccharomyces cerevisiae (Baker's yeast) DASH forms both rings and spiral structures on microtubules in vitro [, ]. Components of the DASH complex, including Dam1, Duo1, Spc34, Dad1 and Ask1, are essential and connect the centromere to the plus end of spindle microtubules []. 
Probab=25.04  E-value=2.4e+02  Score=19.51  Aligned_cols=62  Identities=15%  Similarity=0.195  Sum_probs=42.7

Q ss_pred             HHHHHHHHHhhhhhhcc----ChHHHHHHHHHHHHHHhhcCHHHHHHHHHHHHHHHHHHHHHHHHH
Q 043345            8 TKVGGHVHQLKGSSSSI----GAQRVNNVCTAFRSFCEERNIEGCQQYLQHLKQEYYLVKNKLQTL   69 (80)
Q Consensus         8 ~~~~~laH~LKGss~nl----Ga~~L~~~c~~lE~~~~~~~~~~~~~~~~~l~~~~~~l~~~L~~~   69 (80)
                      ...++.++.+..++-.+    |=..+-.+|..+|..|..=...++..-+..|...|..+...|..|
T Consensus       134 ~~~~~~avA~vlG~~m~~e~~~d~dvevLL~~ae~L~~vYP~~ga~eki~~Lr~~y~~l~~~i~~l  199 (259)
T PF08657_consen  134 KQRRNTAVALVLGGVMHEEIVEDVDVEVLLRGAEKLCNVYPLPGAREKIAALRQRYNQLSNSIAYL  199 (259)
T ss_pred             HHHHHHHHHHhccCcccccccccCCHHHHHHHHHHHHHhCCChHHHHHHHHHHHHHHHHHHHHHHH
Confidence            34455665555554322    444566778888888876566688888888888888888877655


No 79 
>cd00950 DHDPS Dihydrodipicolinate synthase (DHDPS) is a key enzyme in lysine biosynthesis. It catalyzes the aldol condensation of L-aspartate-beta-semialdehyde and pyruvate to dihydropicolinic acid via a Schiff base formation between pyruvate and a lysine residue. The functional enzyme is a homotetramer consisting of a dimer of dimers. DHDPS is a member of the dihydrodipicolinate synthase family that comprises several pyruvate-dependent class I aldolases that use the same catalytic step to catalyze different reactions in different pathways.
Probab=24.90  E-value=2.3e+02  Score=19.15  Aligned_cols=41  Identities=10%  Similarity=0.105  Sum_probs=30.4

Q ss_pred             hhhhhccChHH--HHHHHHHHHHHHhhcCHHHHHHHHHHHHHH
Q 043345           18 KGSSSSIGAQR--VNNVCTAFRSFCEERNIEGCQQYLQHLKQE   58 (80)
Q Consensus        18 KGss~nlGa~~--L~~~c~~lE~~~~~~~~~~~~~~~~~l~~~   58 (80)
                      -|..|.+....  +.+.+.++=++.++|+.+++.++..++..-
T Consensus       196 ~G~~G~~s~~~n~~p~~~~~~~~~~~~g~~~~a~~l~~~l~~~  238 (284)
T cd00950         196 LGGVGVISVAANVAPKLMAEMVRAALAGDLEKARELHRKLLPL  238 (284)
T ss_pred             CCCCEEEehHHHhhHHHHHHHHHHHHCCCHHHHHHHHHHHHHH
Confidence            37777765554  778999999999999998877766555543


No 80 
>TIGR01690 ICE_RAQPRD integrative conjugative element protein, RAQPRD family. This model represents a small family of proteins about 100 amino acids in length, including a predicted signal sequence and a perfectly conserved motif RAQPRD towards the C-terminus. Members are found in the Pseudomonas putida TOL plasmid pWW0 and in cryptic plasmid regions of Salmonella enterica subsp. enterica serovar Typhi and Pseudomonas syringae DC3000. The function is unknown.
Probab=24.88  E-value=61  Score=19.14  Aligned_cols=23  Identities=9%  Similarity=0.209  Sum_probs=19.6

Q ss_pred             HHHHHHHHHHHHHHHHHHHHHHH
Q 043345           52 LQHLKQEYYLVKNKLQTLFQVCL   74 (80)
Q Consensus        52 ~~~l~~~~~~l~~~L~~~l~~~~   74 (80)
                      +.++...+.+++.-|+.||...|
T Consensus        59 Y~rl~~Dl~~ir~GI~~YL~PsR   81 (94)
T TIGR01690        59 YPRLRADLKRIRQGIQQYLTPSR   81 (94)
T ss_pred             HHHHHHHHHHHHHHHHHhCCCcc
Confidence            46699999999999999998765


No 81 
>PF08581 Tup_N:  Tup N-terminal;  InterPro: IPR013890  The N-terminal region of the Tup protein has been shown to interact with the Ssn6 transcriptional co-repressor []. ; PDB: 3VP9_B 3VP8_B.
Probab=24.66  E-value=1.4e+02  Score=16.83  Aligned_cols=17  Identities=18%  Similarity=0.386  Sum_probs=9.2

Q ss_pred             HHHHHHHHHHHHHHHHH
Q 043345           50 QYLQHLKQEYYLVKNKL   66 (80)
Q Consensus        50 ~~~~~l~~~~~~l~~~L   66 (80)
                      ++++.|+.+|..+...+
T Consensus         4 elLd~ir~Ef~~~~~e~   20 (79)
T PF08581_consen    4 ELLDAIRQEFENLSQEA   20 (79)
T ss_dssp             HHHHHHHHHHHHHHHHH
T ss_pred             HHHHHHHHHHHHHHHHH
Confidence            45555666665555433


No 82 
>cd05136 RasGAP_DAB2IP The DAB2IP family of Ras GTPase-activating proteins includes DAB2IP, nGAP, and Syn GAP. Disabled 2 interactive protein, (DAB2IP; also known as ASK-interacting protein 1 (AIP1)), is a member of the GTPase-activating proteins, down-regulates Ras-mediated signal pathways, and mediates TNF-induced activation of ASK1-JNK signaling pathways. The mechanism by which TNF signaling is coupled to DAB2IP is not known.
Probab=24.15  E-value=2.3e+02  Score=20.05  Aligned_cols=46  Identities=20%  Similarity=0.151  Sum_probs=33.4

Q ss_pred             HHHHHHHHHHHHHhhcCHHHHHHHHHHHHHHHHHHHHHHHHHHHHH
Q 043345           28 RVNNVCTAFRSFCEERNIEGCQQYLQHLKQEYYLVKNKLQTLFQVC   73 (80)
Q Consensus        28 ~L~~~c~~lE~~~~~~~~~~~~~~~~~l~~~~~~l~~~L~~~l~~~   73 (80)
                      ....+|..||......+.+++...+-.+-+........|...+..|
T Consensus        13 ~~~~l~~~l~~~~~~~~~~ela~~Lv~if~~~~~~~~~l~~Li~~E   58 (309)
T cd05136          13 NYARLCEVLEPVLSVRAKEELACALVHVLQSTGKAKDFLTDLVMAE   58 (309)
T ss_pred             hHHHHHHHHHhhCCchhHHHHHHHHHHHHHhcChHHHHHHHHHHHH
Confidence            4568899999988888888877777777666666666665555443


No 83 
>COG0783 Dps DNA-binding ferritin-like protein (oxidative damage protectant) [Inorganic ion transport and metabolism]
Probab=24.04  E-value=2e+02  Score=18.32  Aligned_cols=66  Identities=12%  Similarity=0.274  Sum_probs=43.2

Q ss_pred             HHHHHHHHHHhhhhhhccChHHHHHHHHHHHHHH-hhcCH-HHHHHHHHHHHHHHHHHHHHHHHHHHH
Q 043345            7 FTKVGGHVHQLKGSSSSIGAQRVNNVCTAFRSFC-EERNI-EGCQQYLQHLKQEYYLVKNKLQTLFQV   72 (80)
Q Consensus         7 ~~~~~~laH~LKGss~nlGa~~L~~~c~~lE~~~-~~~~~-~~~~~~~~~l~~~~~~l~~~L~~~l~~   72 (80)
                      |..+......+---...+|..++.....-++... +..+. ....+.+..|...|..+...++..+..
T Consensus        57 y~el~~~~DeiAERi~~LGg~p~~t~~~~~~~s~ike~~~~~~~~~~l~~l~~~~~~l~~~~r~~~~~  124 (156)
T COG0783          57 YEELAEHVDEIAERIRALGGVPLGTLSEYLKLSSIKEEPGDYTAREMLKELVEDYEYLIKELRKGIEL  124 (156)
T ss_pred             HHHHHHHHHHHHHHHHHcCCCCcccHHHHHHhCCCcccCCCCCHHHHHHHHHHHHHHHHHHHHHHHHh
Confidence            3444445555555556678888888888777766 22221 357778888888888887777766554


No 84 
>PF02870 Methyltransf_1N:  6-O-methylguanine DNA methyltransferase, ribonuclease-like domain;  InterPro: IPR008332 Synonym(s): 6-O-methylguanine-DNA methyltransferase, O-6-methylguanine-DNA-alkyltransferase The repair of DNA containing O6-alkylated guanine is carried out by DNA-[protein]-cysteine S-methyltransferase (2.1.1.63 from EC). The major mutagenic and carcinogenic effect of methylating agents in DNA is the formation of O6-alkylguanine. The alkyl group at the O-6 position is transferred to a cysteine residue in the enzyme []. This is a suicide reaction since the enzyme is irreversibly inactivated and the methylated protein accumulates as a dead-end product. Most, but not all of the methyltransferases are also able to repair O-4-methylthymine. DNA-[protein]-cysteine S-methyltransferases are widely distributed and are found in various prokaryotic and eukaryotic sources []. This group of proteins are characterised by having an N-terminal ribonuclease-like domain associated with 6-O-methylguanine DNA methyltransferase activity (IPR001497 from INTERPRO).; GO: 0003908 methylated-DNA-[protein]-cysteine S-methyltransferase activity, 0006281 DNA repair; PDB: 1SFE_A 1T39_B 1T38_A 1EH7_A 1EH6_A 1YFH_C 1EH8_A 1QNT_A.
Probab=23.94  E-value=1.1e+02  Score=16.35  Aligned_cols=17  Identities=18%  Similarity=0.075  Sum_probs=12.5

Q ss_pred             HHHHHHHHHHHHHHHHh
Q 043345           59 YYLVKNKLQTLFQVCLK   75 (80)
Q Consensus        59 ~~~l~~~L~~~l~~~~~   75 (80)
                      +..+...|+.|+..+|+
T Consensus        52 ~~~~~~qL~eYF~G~r~   68 (77)
T PF02870_consen   52 LAEAKQQLDEYFAGERT   68 (77)
T ss_dssp             HHHHHHHHHHHHHHHHS
T ss_pred             HHHHHHHHHHHHcCCCC
Confidence            45566778888888776


No 85 
>PRK10841 hybrid sensory kinase in two-component regulatory system with RcsB and YojN; Provisional
Probab=23.93  E-value=4e+02  Score=21.64  Aligned_cols=56  Identities=7%  Similarity=0.182  Sum_probs=30.5

Q ss_pred             HHHHHHHhhhhhhccChHHHHHHHHHHHHHHhhcCHHHHHHHHHHHHHHHHHHHHHHHHHHHHH
Q 043345           10 VGGHVHQLKGSSSSIGAQRVNNVCTAFRSFCEERNIEGCQQYLQHLKQEYYLVKNKLQTLFQVC   73 (80)
Q Consensus        10 ~~~laH~LKGss~nlGa~~L~~~c~~lE~~~~~~~~~~~~~~~~~l~~~~~~l~~~L~~~l~~~   73 (80)
                      +...+|-||-        +|..+...+|........++....+..+...-..+..-++..+.+-
T Consensus       451 la~iSHELRT--------PL~~I~g~lelL~~~~~~~~~~~~l~~i~~~~~~L~~lI~dlLd~s  506 (924)
T PRK10841        451 LATVSHELRT--------PLYGIIGNLDLLQTKELPKGVDRLVTAMNNSSSLLLKIISDILDFS  506 (924)
T ss_pred             HHHhHHHHHH--------HHHHHHHHHHHHhCCCCCHHHHHHHHHHHHHHHHHHHHHHHHHHHH
Confidence            4667888874        3333334444433333334555666666666666666555555543


No 86 
>TIGR02956 TMAO_torS TMAO reductase system sensor TorS. This protein, TorS, is part of a regulatory system for the torCAD operon that encodes the pterin molybdenum cofactor-containing enzyme trimethylamine-N-oxide (TMAO) reductase (TorA), a cognate chaperone (TorD), and a penta-haem cytochrome (TorC). TorS works together with the inducer-binding protein TorT and the response regulator TorR. TorS contains histidine kinase ATPase (pfam02518), HAMP (pfam00672), phosphoacceptor (pfam00512), and phosphotransfer (pfam01627) domains and a response regulator receiver domain (pfam00072).
Probab=23.91  E-value=3.7e+02  Score=21.24  Aligned_cols=55  Identities=15%  Similarity=0.219  Sum_probs=29.2

Q ss_pred             HHHHHHHhhhhhhccChHHHHHHHHHHHHHHhhcCHHHHHHHHHHHHHHHHHHHHHHHHHHHH
Q 043345           10 VGGHVHQLKGSSSSIGAQRVNNVCTAFRSFCEERNIEGCQQYLQHLKQEYYLVKNKLQTLFQV   72 (80)
Q Consensus        10 ~~~laH~LKGss~nlGa~~L~~~c~~lE~~~~~~~~~~~~~~~~~l~~~~~~l~~~L~~~l~~   72 (80)
                      +..++|-||.        +|..+...++........+.....++.+......+...++..+..
T Consensus       468 ~~~~sHelrt--------PL~~i~~~~~ll~~~~~~~~~~~~l~~i~~~~~~l~~~i~~ll~~  522 (968)
T TIGR02956       468 LATMSHEIRT--------PLNGILGTLELLGDTGLTSQQQQYLQVINRSGESLLDILNDILDY  522 (968)
T ss_pred             HHHhHHHhhh--------HHHHHHHHHHHHhCCCCCHHHHHHHHHHHHHHHHHHHHHHHHHHH
Confidence            3566777774        333333344433333333444556666666666666666555544


No 87 
>PF01231 IDO:  Indoleamine 2,3-dioxygenase;  InterPro: IPR000898 Indoleamine 2,3-dioxygenase (IDO, 1.13.11.42 from EC) [] is a cytosolic haem protein which, together with the hepatic enzyme tryptophan 2,3-dioxygenase, catalyzes the conversion of tryptophan and other indole derivatives to kynurenines. The physiological role of IDO is not fully understood but is of great interest, because IDO is widely distributed in human tissues, can be up-regulated via cytokines such as interferon-gamma, and can thereby modulate the levels of tryptophan, which is vital for cell growth. The degradative action of IDO on tryptophan leads to cell death by starvation of this essential and relatively scarce amino acid. IDO is a haem-containing enzyme of about 400 amino acids. Site-directed mutagenesis showed His346 (P14902 from SWISSPROT) to be essential for haem binding, indicating that this histidine residue may be the proximal ligand. Mutation of Asp274 also compromised the ability of IDO to bind haem, suggesting that Asp274 may coordinate to haem directly as the distal ligand or is essential in maintaining the conformation of the haem pocket []. Other proteins that are evolutionarily related to IDO include yeast hypothetical protein YJR078w; and myoglobin from the red muscle of the archaeogastropodic molluscs, Nordotis madaka (Giant abalone) and Sulculus diversicolor [, ]. These unusual globins lack enzymatic activity but have kept the haem group.; GO: 0020037 heme binding; PDB: 2D0U_A 2D0T_A.
Probab=23.66  E-value=2.8e+02  Score=20.43  Aligned_cols=34  Identities=6%  Similarity=0.192  Sum_probs=25.1

Q ss_pred             HHHhhcCHHHHHHHHHHHHHHHHHHHHHHHHHHH
Q 043345           38 SFCEERNIEGCQQYLQHLKQEYYLVKNKLQTLFQ   71 (80)
Q Consensus        38 ~~~~~~~~~~~~~~~~~l~~~~~~l~~~L~~~l~   71 (80)
                      .++..+|...+...+..|...+..+...|.++-+
T Consensus       181 ~a~~~~d~~~i~~~L~~i~~~i~~i~~~l~rm~e  214 (422)
T PF01231_consen  181 DAVKAGDSDRITEALRRIAEAIERITALLERMYE  214 (422)
T ss_dssp             HHHHTT-HHHHHHHHHHHHHHHHHHHHHHTTHHH
T ss_pred             HHHHhcCHHHHHHHHHHHHHHHHHHHHHHHHHhc
Confidence            4456788888888888888888888877765544


No 88 
>PF13942 Lipoprotein_20:  YfhG lipoprotein
Probab=23.40  E-value=2.3e+02  Score=18.70  Aligned_cols=32  Identities=19%  Similarity=0.209  Sum_probs=26.0

Q ss_pred             HHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhh
Q 043345           45 IEGCQQYLQHLKQEYYLVKNKLQTLFQVCLKS   76 (80)
Q Consensus        45 ~~~~~~~~~~l~~~~~~l~~~L~~~l~~~~~~   76 (80)
                      .+.+...-..|+.++..+..+|+..-.+|||.
T Consensus       132 lD~Lr~qq~~Lq~qL~~T~RKLEnLTDIERQL  163 (179)
T PF13942_consen  132 LDALRQQQQRLQYQLDTTTRKLENLTDIERQL  163 (179)
T ss_pred             HHHHHHHHHHHHHHHHHHHHHHhhhhHHHHHH
Confidence            34566667788888899999999999999985


No 89 
>TIGR03042 PS_II_psbQ_bact photosystem II protein PsbQ. This protein, through the member sll1638 from Synechocystis sp. PCC 6803, was shown to be part of the cyanobacterial photosystem II. It is homologous to (but quite diverged from) the chloroplast PsbQ protein, called oxygen-evolving enhancer protein 3 (OEE3). We designate this cyanobacterial protein PsbQ by homology.
Probab=23.38  E-value=2e+02  Score=18.14  Aligned_cols=37  Identities=3%  Similarity=0.085  Sum_probs=20.1

Q ss_pred             HHHHHHHHHHHHhhcCHHHHHHHHHHHHHHHHHHHHHHHHHHHH
Q 043345           29 VNNVCTAFRSFCEERNIEGCQQYLQHLKQEYYLVKNKLQTLFQV   72 (80)
Q Consensus        29 L~~~c~~lE~~~~~~~~~~~~~~~~~l~~~~~~l~~~L~~~l~~   72 (80)
                      |.+....|-.+++.+|...       ...+|..+...++.|+++
T Consensus       104 Lf~~L~~LD~AA~~kd~~~-------a~k~Y~~av~~~dafl~~  140 (142)
T TIGR03042       104 LKDDLEKLDEAARLQDGPQ-------AQKAYQKAAADFDAYLDL  140 (142)
T ss_pred             HHHHHHHHHHHHHhcCHHH-------HHHHHHHHHHHHHHHHhh
Confidence            3344455555666655443       445566666666666653


No 90 
>PF01322 Cytochrom_C_2:  Cytochrome C';  InterPro: IPR002321 Cytochromes c (cytC) can be defined as electron-transfer proteins having one or several haem c groups, bound to the protein by one or, more generally, two thioether bonds involving sulphydryl groups of cysteine residues. The fifth haem iron ligand is always provided by a histidine residue. CytC possess a wide range of properties and function in a large number of different redox processes. Ambler [] recognised four classes of cytC.  Class II includes the high-spin cytC' and a number of low-spin cytochromes, e.g. cyt c-556. The haem-attachment site is close to the C terminus. The cytC' are capable of binding such ligands as CO, NO or CN(-), albeit with rate and equilibrium constants 100 to 1,000,000-fold smaller than other high-spin haemoproteins []. This, coupled with its relatively low redox potential, makes it unlikely that cytC' is a terminal oxidase. Thus cytC' probably functions as an electron transfer protein [].  The 3D structures of a number of cytC' have been determined. The molecule usually exists as a dimer, each monomer folding as a four-alpha-helix bundle incorporating a covalently-bound haem group at the core []. The Chromatium vinosum cytC' exhibits dimer dissociation upon ligand binding [].; GO: 0005506 iron ion binding, 0009055 electron carrier activity, 0020037 heme binding, 0005746 mitochondrial respiratory chain; PDB: 1BBH_A 2J9B_B 2J8W_A 1JAF_B 3ZTM_A 2XLD_A 2XL6_A 1E86_A 2YLD_A 2YKZ_A ....
Probab=23.38  E-value=1.7e+02  Score=17.17  Aligned_cols=36  Identities=3%  Similarity=0.093  Sum_probs=25.0

Q ss_pred             HHHHHHHHHHHHHhhcCHHHHHHHHHHHHHHHHHHH
Q 043345           28 RVNNVCTAFRSFCEERNIEGCQQYLQHLKQEYYLVK   63 (80)
Q Consensus        28 ~L~~~c~~lE~~~~~~~~~~~~~~~~~l~~~~~~l~   63 (80)
                      .+......|..+++.+|.+.+...+..|...-....
T Consensus        83 ~~~~aa~~L~~aa~~~d~~~~~~a~~~v~~~C~aCH  118 (122)
T PF01322_consen   83 AFQKAAAALAAAAKSGDLAAIKAAFGEVGKSCKACH  118 (122)
T ss_dssp             HHHHHHHHHHHHHHHTSHHHHHHHHHHHHHHHHHHH
T ss_pred             HHHHHHHHHHHHHhcCCHHHHHHHHHHHHHHHHHHH
Confidence            344555778888888888888877777766554443


No 91 
>PF11855 DUF3375:  Protein of unknown function (DUF3375);  InterPro: IPR021804  This family of proteins are functionally uncharacterised. This protein is found in bacteria. Proteins in this family are typically between 479 to 499 amino acids in length. 
Probab=23.37  E-value=3.2e+02  Score=20.42  Aligned_cols=64  Identities=17%  Similarity=0.217  Sum_probs=48.3

Q ss_pred             hhHHHHHHHHHHhhhhhhccChHHHHHHHHHHHHHHhhcCHHHHHHHHHHHHHHHHHHHHHHHHH
Q 043345            5 VDFTKVGGHVHQLKGSSSSIGAQRVNNVCTAFRSFCEERNIEGCQQYLQHLKQEYYLVKNKLQTL   69 (80)
Q Consensus         5 ~D~~~~~~laH~LKGss~nlGa~~L~~~c~~lE~~~~~~~~~~~~~~~~~l~~~~~~l~~~L~~~   69 (80)
                      ......-+.+.+|.+.....-..+|..+...|++.+..-+.+. ..-+..|+.+...+..+|.+.
T Consensus       100 ~~a~~Al~~l~~L~~~~~~~TeSRl~tv~~~l~~la~~~~~Dp-~~Ri~~Le~e~~~i~~EI~~l  163 (478)
T PF11855_consen  100 PAAEKALRFLERLEERRFVGTESRLNTVFDALRQLAEGTDPDP-ERRIAELEREIAEIDAEIDRL  163 (478)
T ss_pred             HHHHHHHHHHHHcCCCcccccHHHHHHHHHHHHHHHHhcCCCH-HHHHHHHHHHHHHHHHHHHHH
Confidence            3455677788899999999999999999999999997766554 344566666666666666544


No 92 
>cd00089 HR1 Protein kinase C-related kinase homology region 1 domain; also known as the ACC (antiparallel coiled-coil) finger domain or Rho-binding domain. Found in vertebrate PRK1 and yeast PKC1 protein kinases C; those found in rhophilin bind RhoGTP; those in PRK1 bind RhoA and RhoB. Rho family members function as molecular switches, cycling between inactive and active forms, controlling a variety of cellular processes. HR1 repeats often occur in tandem repeat arrangements, separated by a short linker region.
Probab=23.36  E-value=1.4e+02  Score=16.08  Aligned_cols=29  Identities=31%  Similarity=0.370  Sum_probs=20.7

Q ss_pred             cCHHHHHHHHHHHHHHHHHHHHHHHHHHH
Q 043345           43 RNIEGCQQYLQHLKQEYYLVKNKLQTLFQ   71 (80)
Q Consensus        43 ~~~~~~~~~~~~l~~~~~~l~~~L~~~l~   71 (80)
                      ++...+...+......++.++..|..|..
T Consensus        42 ~~~~~~~~~l~es~~ki~~Lr~~L~k~~~   70 (72)
T cd00089          42 KLLAEAEQMLRESKQKLELLKMQLEKLKQ   70 (72)
T ss_pred             cCHHHHHHHHHHHHHHHHHHHHHHHHHHh
Confidence            44566777777777777888888877753


No 93 
>COG1220 HslU ATP-dependent protease HslVU (ClpYQ), ATPase subunit [Posttranslational modification, protein turnover, chaperones]
Probab=22.91  E-value=1.2e+02  Score=22.59  Aligned_cols=31  Identities=19%  Similarity=0.358  Sum_probs=26.1

Q ss_pred             HHHHHHHHHHhhhhhhccChHHHHHHHHHHH
Q 043345            7 FTKVGGHVHQLKGSSSSIGAQRVNNVCTAFR   37 (80)
Q Consensus         7 ~~~~~~laH~LKGss~nlGa~~L~~~c~~lE   37 (80)
                      .+.+...|..+-...-||||.+|......+=
T Consensus       374 I~~iAeiA~~vN~~~ENIGARRLhTvlErlL  404 (444)
T COG1220         374 IKRIAEIAYQVNEKTENIGARRLHTVLERLL  404 (444)
T ss_pred             HHHHHHHHHHhcccccchhHHHHHHHHHHHH
Confidence            3577888999999999999999998876653


No 94 
>PRK06342 transcription elongation factor regulatory protein; Validated
Probab=22.89  E-value=2.1e+02  Score=18.18  Aligned_cols=46  Identities=4%  Similarity=0.151  Sum_probs=29.4

Q ss_pred             HHHHHHHHHHhhhhhhccChHHHHHHHHHHHHHHhhcCHHHHHHHHHHHHHHHHHHHHHHH
Q 043345            7 FTKVGGHVHQLKGSSSSIGAQRVNNVCTAFRSFCEERNIEGCQQYLQHLKQEYYLVKNKLQ   67 (80)
Q Consensus         7 ~~~~~~laH~LKGss~nlGa~~L~~~c~~lE~~~~~~~~~~~~~~~~~l~~~~~~l~~~L~   67 (80)
                      |..+..-.+.||               ..+..+...||..+.......+...+..+...|.
T Consensus        36 ~~~L~~El~~L~---------------~~i~~Ar~~GDlsEak~~~~~~e~rI~~L~~~L~   81 (160)
T PRK06342         36 LKALEDQLAQAR---------------AAYEAAQAIEDVNERRRQMARPLRDLRYLAARRR   81 (160)
T ss_pred             HHHHHHHHHHHH---------------HHHHHHHHCCChhHHHHHHHHHHHHHHHHHHHHc
Confidence            556666666666               3566777788888766566666666665555543


No 95 
>KOG3612 consensus PHD Zn-finger protein [General function prediction only]
Probab=22.88  E-value=2.6e+02  Score=21.83  Aligned_cols=36  Identities=8%  Similarity=0.049  Sum_probs=22.0

Q ss_pred             HHHHHHHHHHHHhhcCHHHHHHHHHHHHHHHHHHHH
Q 043345           29 VNNVCTAFRSFCEERNIEGCQQYLQHLKQEYYLVKN   64 (80)
Q Consensus        29 L~~~c~~lE~~~~~~~~~~~~~~~~~l~~~~~~l~~   64 (80)
                      |.+.+..+|+..+....+.+..+..+.+.++..++.
T Consensus       491 m~~~r~tlE~k~~~n~~e~~kkl~~~~qr~l~etKk  526 (588)
T KOG3612|consen  491 MAEMRKTLEQKHAENIKEEIKKLAEEHQRALAETKK  526 (588)
T ss_pred             HHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHH
Confidence            566677777777666666666655555555555443


No 96 
>PF10481 CENP-F_N:  Cenp-F N-terminal domain;  InterPro: IPR018463 Mitosin or centromere-associated protein-F (Cenp-F) is found bound across the centromere as one of the proteins of the outer layer of the kinetochore []. Most of the kinetochore/centromere functions appear to depend upon binding of the C-terminal part of the molecule, whereas the N-terminal part, here, may be a cytoplasmic player in controlling the function of microtubules and dynein [].
Probab=22.59  E-value=2.9e+02  Score=19.67  Aligned_cols=42  Identities=21%  Similarity=0.411  Sum_probs=24.7

Q ss_pred             HHHHHHHHHHHHHH----------------hhcCHHHHHHHHHHHHHHHHHHHHHHHH
Q 043345           27 QRVNNVCTAFRSFC----------------EERNIEGCQQYLQHLKQEYYLVKNKLQT   68 (80)
Q Consensus        27 ~~L~~~c~~lE~~~----------------~~~~~~~~~~~~~~l~~~~~~l~~~L~~   68 (80)
                      ..|.+.|..+|..-                -.|....+...++.|..++.++...|++
T Consensus        70 q~l~e~c~~lek~rqKlshdlq~Ke~qv~~lEgQl~s~Kkqie~Leqelkr~KsELEr  127 (307)
T PF10481_consen   70 QSLMESCENLEKTRQKLSHDLQVKESQVNFLEGQLNSCKKQIEKLEQELKRCKSELER  127 (307)
T ss_pred             hhHHHHHHHHHHHHHHhhHHHhhhHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHH
Confidence            45677777777533                2233344555666677777777666654


No 97 
>COG5214 POL12 DNA polymerase alpha-primase complex, polymerase-associated subunit B [DNA replication, recombination, and repair]
Probab=22.53  E-value=1.3e+02  Score=22.87  Aligned_cols=45  Identities=16%  Similarity=0.200  Sum_probs=23.9

Q ss_pred             HHHHHHHHHHHHHH-hhcCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhhcC
Q 043345           27 QRVNNVCTAFRSFC-EERNIEGCQQYLQHLKQEYYLVKNKLQTLFQVCLKSLASS   80 (80)
Q Consensus        27 ~~L~~~c~~lE~~~-~~~~~~~~~~~~~~l~~~~~~l~~~L~~~l~~~~~~~~~~   80 (80)
                      .....+..++|..| +.|++.--   +..    |..+...  ..+++|||+++++
T Consensus        23 m~~q~mf~kwes~~~qr~~t~~d---l~t----~~~f~k~--mk~qmerqv~at~   68 (581)
T COG5214          23 MDEQTMFYKWESWCLQRGNTKLD---LDT----FKAFAKD--MKFQMERQVKATL   68 (581)
T ss_pred             ccHHHHHHHHHHHHHhcCCcccc---cHH----HHHHHHH--HHHHHHHHHHHHh
Confidence            34456667889988 45554211   111    2222221  2367888988753


No 98 
>COG5613 Uncharacterized conserved protein [Function unknown]
Probab=22.11  E-value=3.3e+02  Score=20.13  Aligned_cols=43  Identities=12%  Similarity=0.107  Sum_probs=17.1

Q ss_pred             ChHHHHHHHHHHHHHHhhcCHHHHHHHHHHHHHHHHHHHHHHH
Q 043345           25 GAQRVNNVCTAFRSFCEERNIEGCQQYLQHLKQEYYLVKNKLQ   67 (80)
Q Consensus        25 Ga~~L~~~c~~lE~~~~~~~~~~~~~~~~~l~~~~~~l~~~L~   67 (80)
                      |+.+...-.-+-|.+-.+.+.+....-+..|+.-++++.+.+.
T Consensus       319 Gi~Qa~t~~~nae~a~~qad~q~~~ad~~~Lq~iierlkeelk  361 (400)
T COG5613         319 GIRQAGTTALNAEAAQLQADSQLAAADVQNLQRIIERLKEELK  361 (400)
T ss_pred             hHHHhcchhhhhHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHH
Confidence            3333333333444443444443333333444444444444333


No 99 
>TIGR00465 ilvC ketol-acid reductoisomerase. This is the second enzyme in the parallel isoleucine-valine biosynthetic pathway
Probab=22.05  E-value=2.7e+02  Score=19.61  Aligned_cols=38  Identities=16%  Similarity=0.044  Sum_probs=25.3

Q ss_pred             HHHHHHHHhhhhhhccChHHHHHHHHHHHHHHhhcCHH
Q 043345            9 KVGGHVHQLKGSSSSIGAQRVNNVCTAFRSFCEERNIE   46 (80)
Q Consensus         9 ~~~~laH~LKGss~nlGa~~L~~~c~~lE~~~~~~~~~   46 (80)
                      .....+|++||++-.+.-..+..+...+=.-++=|+..
T Consensus       210 A~~~~~~~~~g~~~l~~e~g~~~l~~~Vsstaeyg~~~  247 (314)
T TIGR00465       210 AYFETVHELKLIVDLIYEGGITGMRDRISNTAEYGALT  247 (314)
T ss_pred             HHHHHHHHHHHHHHHHHHhcHHHHHHHcCCHHHcCcch
Confidence            34566899999999886665655555555555556543


No 100
>PF10191 COG7:  Golgi complex component 7 (COG7);  InterPro: IPR019335 The conserved oligomeric Golgi (COG) complex is an eight-subunit (Cog1-8) peripheral Golgi protein involved in membrane trafficking and glycoconjugate synthesis []. COG7 is required for normal Golgi morphology and trafficking. Mutation in COG7 causes a congenital disorder of glycosylation []. 
Probab=22.01  E-value=3.1e+02  Score=21.89  Aligned_cols=38  Identities=11%  Similarity=0.203  Sum_probs=31.7

Q ss_pred             ChHHHHHHHHHHHHHHhhcCHHHHHHHHHHHHHHHHHH
Q 043345           25 GAQRVNNVCTAFRSFCEERNIEGCQQYLQHLKQEYYLV   62 (80)
Q Consensus        25 Ga~~L~~~c~~lE~~~~~~~~~~~~~~~~~l~~~~~~l   62 (80)
                      .|..+..++.+++.....+|.+.+...+..++..+..+
T Consensus       126 EA~~w~~l~~~v~~~~~~~d~~~~a~~l~~m~~sL~~l  163 (766)
T PF10191_consen  126 EADNWSTLSAEVDDLFESGDIAKIADRLAEMQRSLAVL  163 (766)
T ss_pred             HHHhHHHHHHHHHHHHhcCCHHHHHHHHHHHHHHHHHH
Confidence            46678889999999999999999888888888877654


No 101
>PRK04778 septation ring formation regulator EzrA; Provisional
Probab=21.99  E-value=3.6e+02  Score=20.52  Aligned_cols=42  Identities=12%  Similarity=0.286  Sum_probs=32.7

Q ss_pred             HHHHHHHHHHHHHhhcCHHHHHHHHHHHHHHHHHHHHHHHHH
Q 043345           28 RVNNVCTAFRSFCEERNIEGCQQYLQHLKQEYYLVKNKLQTL   69 (80)
Q Consensus        28 ~L~~~c~~lE~~~~~~~~~~~~~~~~~l~~~~~~l~~~L~~~   69 (80)
                      .+-....+++....+||...+...+..++.....+...++..
T Consensus       176 ~~e~~f~~f~~l~~~Gd~~~A~e~l~~l~~~~~~l~~~~~~i  217 (569)
T PRK04778        176 NLEEEFSQFVELTESGDYVEAREILDQLEEELAALEQIMEEI  217 (569)
T ss_pred             HHHHHHHHHHHHhcCCCHHHHHHHHHHHHHHHHHHHHHHHHH
Confidence            455556677788889999999999999998888887766544


No 102
>PF07743 HSCB_C:  HSCB C-terminal oligomerisation domain;  InterPro: IPR009073 This entry represents the C-terminal oligomerisation domain found in HscB (heat shock cognate protein B), which is also known as HSC20 (20K heat shock cognate protein). HscB acts as a co-chaperone to regulate the ATPase activity and peptide-binding specificity of the molecular chaperone HscA, also known as HSC66 (HSP70 class). HscB proteins contain two domains, an N-terminal J-domain, which is involved in interactions with HscA, connected by a short loop to the C-terminal oligomerisation domain; the two domains make contact through a hydrophobic interface. The core of the oligomerisation domain is thought to bind and target proteins to HscA and consists of an open, three-helical bundle []. HscB, along with HscA, has been shown to play a role in the biogenesis of iron-sulphur proteins.; GO: 0006457 protein folding; PDB: 1FPO_C 3BVO_B 3HHO_A 3UO2_B 3UO3_B.
Probab=21.55  E-value=1.5e+02  Score=15.96  Aligned_cols=25  Identities=12%  Similarity=0.274  Sum_probs=12.5

Q ss_pred             HHHHHHHHHHhhcCHHHHHHHHHHH
Q 043345           31 NVCTAFRSFCEERNIEGCQQYLQHL   55 (80)
Q Consensus        31 ~~c~~lE~~~~~~~~~~~~~~~~~l   55 (80)
                      .....|..+...++++.+...+.+|
T Consensus        42 ~~~~~l~~~f~~~d~~~A~~~~~kL   66 (78)
T PF07743_consen   42 ELIKELAEAFDAKDWEEAKEALRKL   66 (78)
T ss_dssp             HHHHHHHHHHHTT-HHHHHHHHHHH
T ss_pred             HHHHHHHHHHccCcHHHHHHHHHHH
Confidence            3344444555555555555555555


No 103
>PF10069 DICT:  Sensory domain found in DIguanylate Cyclases & Two-component systems;  InterPro: IPR019278  This entry, found in various cyanobacterial sensor proteins that catalyse the reaction [ATP + protein L-histidine = ADP + protein N-phospho-L-histidine], has no known function.
Probab=21.54  E-value=56  Score=19.86  Aligned_cols=50  Identities=10%  Similarity=0.046  Sum_probs=31.4

Q ss_pred             hhhhccChHHHHHHHHHHHHHHhhcC-HHHHHHHHHHHHHHHHHHHHHHHHHHHH
Q 043345           19 GSSSSIGAQRVNNVCTAFRSFCEERN-IEGCQQYLQHLKQEYYLVKNKLQTLFQV   72 (80)
Q Consensus        19 Gss~nlGa~~L~~~c~~lE~~~~~~~-~~~~~~~~~~l~~~~~~l~~~L~~~l~~   72 (80)
                      +...++--..|..+|..||..+-... ..-+..-|++...    ....+.+|.++
T Consensus        22 ~~~~~~~k~~L~alsr~iEd~a~~~~~~~~v~a~FQ~~s~----~~~e~~rY~~l   72 (129)
T PF10069_consen   22 PPFTSYSKRLLVALSRAIEDRAWRAGISGTVWAGFQRLSR----FRQEIDRYRQL   72 (129)
T ss_pred             CCcceecHHHHHHHHHHHHHHHHhcCCCCEEEEeCCChhh----hHHHHHHHHHH
Confidence            47788899999999999999985544 2233333333333    33333555544


No 104
>PF03858 Crust_neuro_H:  Crustacean neurohormone H;  InterPro: IPR005558 Arthropods express a family of neuropeptides which so far consists of the following types of neurohormones:  Crustacean hyperglycemic hormone (CHH). CHH is primarily involved in blood sugar regulation, but also plays a role in the control of molting and reproduction. Molt-inhibiting hormone (MIH). MIH inhibits Y-organs where molting hormone (ecdysteroid) is secreted. A molting cycle is initiated when MIH secretion diminishes or stops. Gonad-inhibiting hormone (GIH), also known as vitellogenesis-inhibiting hormone (VIH) because of its role in inhibiting vitellogenesis in female animals. Mandibular organ-inhibiting hormone (MOIH). MOIH represses the synthesis of methyl farnesoate, the precursor of insect juvenile hormone III in the mandibular organ. Ion transport peptide (ITP) from locust. ITP stimulates salt and water reabsorption and inhibits acid secretion in the ileum of the locust.  Caenorhabditis elegans hypothetical protein ZC168.2.  These neurohormones are peptides of 70 to 80 residues which are processed from larger size precursors. They contain six conserved cysteines that are involved in disulphide bonds, as shown in the following schematic representation.  Crustacean neurohormone H proteins are referred to as precursor-related peptides as they are typically co-transcribed and translated with the CHH neurohormone (IPR001166 from INTERPRO). However, in some species this neuropeptide is synthesized as a separate protein. Furthermore, neurohormone H can undergo proteolysis to give rise to 5 different neuropeptides.
Probab=21.53  E-value=93  Score=15.56  Aligned_cols=22  Identities=23%  Similarity=0.348  Sum_probs=17.2

Q ss_pred             hHHHHHHHHHHhhhhhhccChH
Q 043345            6 DFTKVGGHVHQLKGSSSSIGAQ   27 (80)
Q Consensus         6 D~~~~~~laH~LKGss~nlGa~   27 (80)
                      -|-.+.++.-+|||++-+.+..
T Consensus         5 G~GRMerLLaSlrg~~~s~~pl   26 (41)
T PF03858_consen    5 GFGRMERLLASLRGSADSSTPL   26 (41)
T ss_pred             chhhHHHHHHHHhccCCCCcch
Confidence            4677888999999998876543


No 105
>COG1598 Predicted nuclease of the RNAse H fold, HicB family [General function prediction only]
Probab=21.52  E-value=66  Score=17.59  Aligned_cols=28  Identities=14%  Similarity=0.286  Sum_probs=18.3

Q ss_pred             HHHHHHHHHHHHHHHHHHHHHHHHHHHH
Q 043345           46 EGCQQYLQHLKQEYYLVKNKLQTLFQVC   73 (80)
Q Consensus        46 ~~~~~~~~~l~~~~~~l~~~L~~~l~~~   73 (80)
                      ++|...-..++..+..+..+|+-|+..+
T Consensus        24 pgc~s~G~T~eea~~n~~eai~l~~e~~   51 (73)
T COG1598          24 PGCHSQGETLEEALQNAKEAIELHLEAL   51 (73)
T ss_pred             CCccccCCCHHHHHHHHHHHHHHHHHHH
Confidence            3444555666777777777777777653


No 106
>PF04428 Choline_kin_N:  Choline kinase N terminus;  InterPro: IPR007521 This domain is found N-terminal to choline/ethanolamine kinase regions (IPR002573 from INTERPRO) in some plant and fungal choline kinase enzymes (2.7.1.32 from EC). This region is only found in some members of the choline kinase family, and is therefore unlikely to contribute to catalysis.; GO: 0016773 phosphotransferase activity, alcohol group as acceptor
Probab=21.38  E-value=97  Score=16.28  Aligned_cols=11  Identities=27%  Similarity=0.383  Sum_probs=9.2

Q ss_pred             HHHHHHHHHhh
Q 043345            8 TKVGGHVHQLK   18 (80)
Q Consensus         8 ~~~~~laH~LK   18 (80)
                      ..+.+++|+||
T Consensus        30 ~di~~l~htL~   40 (53)
T PF04428_consen   30 QDILRLIHTLK   40 (53)
T ss_pred             HHHHHHHHHhc
Confidence            57888999987


No 107
>PF05465 Halo_GVPC:  Halobacterial gas vesicle protein C (GVPC) repeat;  InterPro: IPR008639 This family consists of Halobacterium gas vesicle protein C sequences which are thought to confer stability to the gas vesicle membranes.; GO: 0031412 gas vesicle organization, 0031411 gas vesicle
Probab=20.98  E-value=1.1e+02  Score=14.23  Aligned_cols=23  Identities=4%  Similarity=0.163  Sum_probs=17.1

Q ss_pred             HHHHHHHHHHHHHHHHHHHHHHH
Q 043345           48 CQQYLQHLKQEYYLVKNKLQTLF   70 (80)
Q Consensus        48 ~~~~~~~l~~~~~~l~~~L~~~l   70 (80)
                      +...+..++.+|..++.....|-
T Consensus         4 l~a~I~~~r~~f~~~~~aF~aY~   26 (32)
T PF05465_consen    4 LLAAIAEFREEFDDTQDAFEAYA   26 (32)
T ss_pred             HHHHHHHHHHHHHHHHHHHHHHH
Confidence            44567778888888888887774


No 108
>PF06160 EzrA:  Septation ring formation regulator, EzrA;  InterPro: IPR010379 During the bacterial cell cycle, the tubulin-like cell-division protein FtsZ polymerises into a ring structure that establishes the location of the nascent division site. EzrA modulates the frequency and position of FtsZ ring formation.; GO: 0000921 septin ring assembly, 0005940 septin ring, 0016021 integral to membrane
Probab=20.88  E-value=3.9e+02  Score=20.41  Aligned_cols=42  Identities=14%  Similarity=0.294  Sum_probs=32.8

Q ss_pred             HHHHHHHHHHHHHHhhcCHHHHHHHHHHHHHHHHHHHHHHHH
Q 043345           27 QRVNNVCTAFRSFCEERNIEGCQQYLQHLKQEYYLVKNKLQT   68 (80)
Q Consensus        27 ~~L~~~c~~lE~~~~~~~~~~~~~~~~~l~~~~~~l~~~L~~   68 (80)
                      ..+-.....++....+||...+...+..++.....+...++.
T Consensus       171 ~~ie~~F~~f~~lt~~GD~~~A~eil~~l~~~~~~l~~~~e~  212 (560)
T PF06160_consen  171 ENIEEEFSEFEELTENGDYLEAREILEKLKEETDELEEIMED  212 (560)
T ss_pred             HHHHHHHHHHHHHHHCCCHHHHHHHHHHHHHHHHHHHHHHHH
Confidence            345556677778889999999999999999888887776653


No 109
>PRK14158 heat shock protein GrpE; Provisional
Probab=20.82  E-value=2.6e+02  Score=18.48  Aligned_cols=15  Identities=0%  Similarity=0.014  Sum_probs=7.8

Q ss_pred             HHHHHHHHHHHHHhh
Q 043345           28 RVNNVCTAFRSFCEE   42 (80)
Q Consensus        28 ~L~~~c~~lE~~~~~   42 (80)
                      .|...+..||.+...
T Consensus        95 ~lLpV~DnLerAl~~  109 (194)
T PRK14158         95 EILPAVDNMERALDH  109 (194)
T ss_pred             HHHhHHhHHHHHHhc
Confidence            344555556655543


No 110
>cd04751 Commd3 COMM_Domain containing protein 3. The COMM Domain is found at the C-terminus of a variety of proteins; presumably all COMM_Domain containing proteins are located in the nucleus and the COMM domain plays a role in protein-protein interactions. Several family members have been shown to bind and inhibit NF-kappaB.
Probab=20.81  E-value=96  Score=17.89  Aligned_cols=16  Identities=19%  Similarity=0.376  Sum_probs=8.5

Q ss_pred             hHHHHHHHHHHhhhhh
Q 043345            6 DFTKVGGHVHQLKGSS   21 (80)
Q Consensus         6 D~~~~~~laH~LKGss   21 (80)
                      |.+++..+.+.||...
T Consensus        72 ~~e~L~~Li~~Lk~A~   87 (95)
T cd04751          72 TLEQLQDLVNKLKDAA   87 (95)
T ss_pred             CHHHHHHHHHHHHHHH
Confidence            4455555555555444


No 111
>PF14989 CCDC32:  Coiled-coil domain containing 32
Probab=20.79  E-value=99  Score=19.74  Aligned_cols=71  Identities=10%  Similarity=0.251  Sum_probs=43.3

Q ss_pred             HHHHHHHHhhhhhhccChHHHHHHHHHHHHHHhhcCHHH--HHHHH-HHHHHHHHHHHHHHHHHHHHHHhhhhc
Q 043345            9 KVGGHVHQLKGSSSSIGAQRVNNVCTAFRSFCEERNIEG--CQQYL-QHLKQEYYLVKNKLQTLFQVCLKSLAS   79 (80)
Q Consensus         9 ~~~~laH~LKGss~nlGa~~L~~~c~~lE~~~~~~~~~~--~~~~~-~~l~~~~~~l~~~L~~~l~~~~~~~~~   79 (80)
                      .+.+-.-+|||.+..|-...|...-.+.-+.|-..=..+  ....+ ..+...-..+...|.++|..|+|++++
T Consensus        60 sLE~KL~rik~~~~~vtsKemL~sL~~aK~d~~~rlL~~~~~~~~~~~~~~~D~p~~~~~l~R~L~Pe~qAls~  133 (148)
T PF14989_consen   60 SLERKLKRIKGKNREVTSKEMLRSLSQAKEDCWDRLLSSGNPSEFFEDDLDSDEPILEHYLKRWLAPEKQALSK  133 (148)
T ss_pred             HHHHHHHHHhCCCccCCHHHHHHHHHHHHHHHHHHHHcCCCcchhccCccccccchhhhhHHhhhCchhhcccH
Confidence            455557889999999999888887777666553221111  11111 112222233455588999999998764


No 112
>PF13779 DUF4175:  Domain of unknown function (DUF4175)
Probab=20.79  E-value=4.7e+02  Score=21.32  Aligned_cols=21  Identities=10%  Similarity=0.191  Sum_probs=12.2

Q ss_pred             HHHHHHHHHHHHHHHHHHHHH
Q 043345           53 QHLKQEYYLVKNKLQTLFQVC   73 (80)
Q Consensus        53 ~~l~~~~~~l~~~L~~~l~~~   73 (80)
                      ++|+.-++.++.+++.||+..
T Consensus       492 eEI~rLm~eLR~A~~~ym~~L  512 (820)
T PF13779_consen  492 EEIARLMQELREAMQDYMQAL  512 (820)
T ss_pred             HHHHHHHHHHHHHHHHHHHHH
Confidence            445555556666666666653


No 113
>COG2841 Uncharacterized protein conserved in bacteria [Function unknown]
Probab=20.79  E-value=1.7e+02  Score=16.37  Aligned_cols=65  Identities=17%  Similarity=0.276  Sum_probs=46.7

Q ss_pred             HHHHHHHHHHhhhhhhccChHHHHHHHHHHHHHHh--hcC-HHHHHHHHHHHHHHHHHHHHHHHHHHHHH
Q 043345            7 FTKVGGHVHQLKGSSSSIGAQRVNNVCTAFRSFCE--ERN-IEGCQQYLQHLKQEYYLVKNKLQTLFQVC   73 (80)
Q Consensus         7 ~~~~~~laH~LKGss~nlGa~~L~~~c~~lE~~~~--~~~-~~~~~~~~~~l~~~~~~l~~~L~~~l~~~   73 (80)
                      +..++.+.|.|||--  -.+.+|.+--.+|-....  .++ ......-+..|+.+=-.+..+|-.+|+-.
T Consensus         2 ~~Efr~~is~Lk~~d--ahF~rLfd~hn~LDd~I~~~E~n~~~~s~~ev~~LKKqkL~LKDEi~~~L~~a   69 (72)
T COG2841           2 FHEFRDLISKLKAND--AHFARLFDKHNELDDRIKRAEGNRQPGSDAEVSNLKKQKLQLKDEIASILQKA   69 (72)
T ss_pred             chhHHHHHHHHhccc--hHHHHHHHHHhHHHHHHHHHhcCCCCCcHHHHHHHHHHHHHhHHHHHHHHHHH
Confidence            456789999999875  467888888888876552  222 33455567888888888888888877643


No 114
>PF04782 DUF632:  Protein of unknown function (DUF632);  InterPro: IPR006867 This conserved region contains a leucine zipper-like domain. The proteins are found only in plants and their functions are unknown.
Probab=20.46  E-value=1.7e+02  Score=20.80  Aligned_cols=36  Identities=8%  Similarity=0.180  Sum_probs=25.6

Q ss_pred             hhccChHHHHHHHHHHHHHHhhcCHHHHHHHHHHHH
Q 043345           21 SSSIGAQRVNNVCTAFRSFCEERNIEGCQQYLQHLK   56 (80)
Q Consensus        21 s~nlGa~~L~~~c~~lE~~~~~~~~~~~~~~~~~l~   56 (80)
                      -+.+|+++++.+|.+.-++...-....+...+..+.
T Consensus       271 p~~~~aPpIf~lC~~W~~aLd~lp~k~v~~AIk~f~  306 (312)
T PF04782_consen  271 PRRSGAPPIFVLCNDWSQALDRLPDKEVSEAIKSFA  306 (312)
T ss_pred             ccccCCCcHHHHHHHHHHHHHcCChHHHHHHHHHHH
Confidence            456899999999999999887655445444444333


No 115
>PRK08655 prephenate dehydrogenase; Provisional
Probab=20.28  E-value=3.6e+02  Score=19.80  Aligned_cols=35  Identities=6%  Similarity=0.143  Sum_probs=20.3

Q ss_pred             HHHHHHHHHHHHHhhcCHHHHHHHHHHHHHHHHHH
Q 043345           28 RVNNVCTAFRSFCEERNIEGCQQYLQHLKQEYYLV   62 (80)
Q Consensus        28 ~L~~~c~~lE~~~~~~~~~~~~~~~~~l~~~~~~l   62 (80)
                      .+...+.++.....++|.+.+...+.+-...+..+
T Consensus       240 ~~~~~l~~l~~~l~~~D~~~l~~~~~~a~~~~~~~  274 (437)
T PRK08655        240 TFIKECEELSELVKNGDREEFVERMKEAAKHFGDT  274 (437)
T ss_pred             HHHHHHHHHHHHHHcCCHHHHHHHHHHHHHHhccc
Confidence            34445566666666677666666666555544433


Done!