Query         026103
Match_columns 243
No_of_seqs    261 out of 1368
Neff          5.4 
Searched_HMMs 46136
Date          Fri Mar 29 03:48:19 2013
Command       hhsearch -i /work/01045/syshi/csienesis_hhblits_a3m/026103.a3m -d /work/01045/syshi/HHdatabase/Cdd.hhm -o /work/01045/syshi/hhsearch_cdd/026103hhsearch_cdd -cpu 12 -v 0 

 No Hit                             Prob E-value P-value  Score    SS Cols Query HMM  Template HMM
  1 cd00018 AP2 DNA-binding domain  99.8 7.6E-21 1.6E-25  135.8   7.5   61   65-125     1-61  (61)
  2 smart00380 AP2 DNA-binding dom  99.8 4.5E-20 9.7E-25  133.4   8.5   63   66-128     1-63  (64)
  3 PHA00280 putative NHN endonucl  99.5 1.4E-13   3E-18  111.8   7.1   72   45-119    47-119 (121)
  4 PF00847 AP2:  AP2 domain;  Int  99.1 2.2E-10 4.8E-15   79.8   6.2   52   65-116     1-56  (56)
  5 PF14657 Integrase_AP2:  AP2-li  69.8      19 0.00041   23.9   5.5   37   77-113     1-41  (46)
  6 PHA02601 int integrase; Provis  57.0      17 0.00037   32.7   4.5   44   69-113     2-46  (333)
  7 PF13356 DUF4102:  Domain of un  56.0      37 0.00081   25.3   5.5   37   71-107    28-68  (89)
  8 cd00801 INT_P4 Bacteriophage P  55.0      29 0.00063   30.9   5.6   39   75-113     9-49  (357)
  9 PF08846 DUF1816:  Domain of un  49.6      33 0.00072   25.4   4.1   38   77-114     9-46  (68)
 10 PF10729 CedA:  Cell division a  48.0      32  0.0007   25.8   3.8   38   64-104    30-67  (80)
 11 PRK09692 integrase; Provisiona  41.1      75  0.0016   29.9   6.2   36   70-105    33-74  (413)
 12 PHA03308 transcriptional regul  33.8      42 0.00091   35.6   3.4    9  111-119  1333-1341(1463)
 13 PRK10113 cell division modulat  31.1      39 0.00085   25.3   2.0   37   65-104    31-67  (80)
 14 PF05036 SPOR:  Sporulation rel  29.5      58  0.0012   22.4   2.7   24   87-110    42-65  (76)
 15 PF14112 DUF4284:  Domain of un  24.9      49  0.0011   26.6   1.8   19   89-107     2-20  (122)
 16 PF12404 DUF3663:  Peptidase ;   23.7      37 0.00079   25.8   0.8   38  110-147    10-47  (77)
 17 cd07998 WGR_DNA_ligase WGR dom  21.9   3E+02  0.0064   20.8   5.4   51   60-110     8-64  (77)
 18 PF07494 Reg_prop:  Two compone  21.2      72  0.0016   18.3   1.5   11   87-97     14-24  (24)

No 1  
>cd00018 AP2 DNA-binding domain found in transcription regulators in plants such as APETALA2 and EREBP (ethylene responsive element binding protein). In EREBPs the domain specifically binds to the 11bp GCC box of the ethylene response element (ERE), a promoter element essential for ethylene responsiveness. EREBPs and the C-repeat binding factor CBF1, which is involved in stress response, contain a single copy of the AP2 domain. APETALA2-like proteins, which play a role in plant development, contain two copies.
Probab=99.84  E-value=7.6e-21  Score=135.84  Aligned_cols=61  Identities=59%  Similarity=0.930  Sum_probs=57.2

Q ss_pred             CceeEeEECCCCeEEEEEeecCCCeEEeccCCCCHHHHHHHHHHHHHHhcCCCccCCCCCC
Q 026103           65 PIFRGVRMRNNNKWVCELREPNKQTRIWLGTYPSPEMAARAHDVAALALRGKSACLNFADS  125 (243)
Q Consensus        65 s~yrGVr~r~~GkW~AeIr~~~~~kri~LGtf~t~EeAA~AyD~Aa~~~~G~~a~~NFp~s  125 (243)
                      |+||||+++++|||+|+|+++..++++|||+|+|+|||++|||.|+++++|..+.+|||++
T Consensus         1 s~~~GV~~~~~gkw~A~I~~~~~gk~~~lG~f~t~eeAa~Ayd~a~~~~~g~~a~~Nf~~~   61 (61)
T cd00018           1 SKYRGVRQRPWGKWVAEIRDPSGGRRIWLGTFDTAEEAARAYDRAALKLRGSSAVLNFPDS   61 (61)
T ss_pred             CCccCEEECCCCcEEEEEEeCCCCceEccCCCCCHHHHHHHHHHHHHHhcCCccccCCCCC
Confidence            6899998777799999999966699999999999999999999999999999999999975


No 2  
>smart00380 AP2 DNA-binding domain in plant proteins such as APETALA2 and EREBPs.
Probab=99.82  E-value=4.5e-20  Score=133.37  Aligned_cols=63  Identities=52%  Similarity=0.915  Sum_probs=59.3

Q ss_pred             ceeEeEECCCCeEEEEEeecCCCeEEeccCCCCHHHHHHHHHHHHHHhcCCCccCCCCCCCCC
Q 026103           66 IFRGVRMRNNNKWVCELREPNKQTRIWLGTYPSPEMAARAHDVAALALRGKSACLNFADSVWR  128 (243)
Q Consensus        66 ~yrGVr~r~~GkW~AeIr~~~~~kri~LGtf~t~EeAA~AyD~Aa~~~~G~~a~~NFp~s~~~  128 (243)
                      .|+||+++++|||+|+|+++.+++++|||+|+|+||||+|||.|+++++|..+.+|||.+.|.
T Consensus         1 ~~kGV~~~~~gkw~A~I~~~~~~k~~~lG~f~t~eeAa~Ayd~a~~~~~g~~a~~Nf~~~~y~   63 (64)
T smart00380        1 KYRGVRQRPWGKWVAEIRDPSKGKRVWLGTFDTAEEAARAYDRAAFKFRGRSARLNFPNSLYD   63 (64)
T ss_pred             CEeeEEeCCCCeEEEEEEecCCCcEEecCCCCCHHHHHHHHHHHHHHhcCCccccCCCCccCC
Confidence            499997787899999999999999999999999999999999999999999999999998764


No 3  
>PHA00280 putative NHN endonuclease
Probab=99.46  E-value=1.4e-13  Score=111.83  Aligned_cols=72  Identities=14%  Similarity=0.144  Sum_probs=60.2

Q ss_pred             hccCCccccCCCCCCCCCCCCceeEe-EECCCCeEEEEEeecCCCeEEeccCCCCHHHHHHHHHHHHHHhcCCCcc
Q 026103           45 LATSRPKKRAGRRVFKETRHPIFRGV-RMRNNNKWVCELREPNKQTRIWLGTYPSPEMAARAHDVAALALRGKSAC  119 (243)
Q Consensus        45 ~~s~~~k~~~~r~k~~~~~~s~yrGV-r~r~~GkW~AeIr~~~~~kri~LGtf~t~EeAA~AyD~Aa~~~~G~~a~  119 (243)
                      +.-..+.....+++.+.+++|+|+|| ++...|||+|+|+.  +||+++||.|+++|+|+.||+ ++.+|+|.+|+
T Consensus        47 Lr~~T~~eN~~N~~~~~~N~SG~kGV~~~k~~~kw~A~I~~--~gK~~~lG~f~~~e~A~~a~~-~~~~lhGeFa~  119 (121)
T PHA00280         47 LRLALPKENSWNMKTPKSNTSGLKGLSWSKEREMWRGTVTA--EGKQHNFRSRDLLEVVAWIYR-TRRELHGQFAR  119 (121)
T ss_pred             hhhcCHHHHhcccCCCCCCCCCCCeeEEecCCCeEEEEEEE--CCEEEEcCCCCCHHHHHHHHH-HHHHHhhcccc
Confidence            33344555566666677899999999 56677999999999  999999999999999999997 77889998875


No 4  
>PF00847 AP2:  AP2 domain;  InterPro: IPR001471 Pathogenesis-related genes transcriptional activator binds to the GCC-box pathogenesis-related promoter element and activates the plant's defence genes. Ethylene, chemically the simplest plant hormone, participates in a number of stress responses and developmental processes: e.g., fruit ripening, inhibition of stem and root elongation, promotion of seed germination and flowering, senescence of leaves and flowers, and sex determination []. DNA sequence elements that confer ethylene responsiveness have been shown to contain two 11bp GCC boxes, which are necessary and sufficient for transcriptional control by ethylene. Ethylene responsive element binding proteins (EREBPs) have now been identified in a variety of plants. The proteins share a similar domain of around 59 amino acids, which interacts directly with the GCC box in the ERE.; GO: 0003700 sequence-specific DNA binding transcription factor activity, 0006355 regulation of transcription, DNA-dependent; PDB: 3IGM_A 3GCC_A 1GCC_A 2GCC_A.
Probab=99.10  E-value=2.2e-10  Score=79.82  Aligned_cols=52  Identities=25%  Similarity=0.395  Sum_probs=44.9

Q ss_pred             CceeEe-EECCCCeEEEEEeecCC---CeEEeccCCCCHHHHHHHHHHHHHHhcCC
Q 026103           65 PIFRGV-RMRNNNKWVCELREPNK---QTRIWLGTYPSPEMAARAHDVAALALRGK  116 (243)
Q Consensus        65 s~yrGV-r~r~~GkW~AeIr~~~~---~kri~LGtf~t~EeAA~AyD~Aa~~~~G~  116 (243)
                      |+|+|| +.+..++|+|+|+++..   +++++||.|+++++|+++++.+++.++|.
T Consensus         1 s~~~GV~~~~~~~~W~a~i~~~~~~g~~k~f~~g~fg~~~eA~~~a~~~r~~~~~e   56 (56)
T PF00847_consen    1 SGYKGVSWDKRRGRWRAQIRVWSENGKRKRFSVGKFGFEEEAKRAAIEARKELEGE   56 (56)
T ss_dssp             SSSTTEEEETTTTEEEEEEEECCCTTEEEEEEECCCCCHHHHHHHHHHHHHHCTS-
T ss_pred             CCcEEEEEcCCCCEEEEEEEEcccCcccEEEeCccCCCHHHHHHHHHHHHHHhcCC
Confidence            689999 55667999999999432   49999999999999999999999999874


No 5  
>PF14657 Integrase_AP2:  AP2-like DNA-binding integrase domain
Probab=69.78  E-value=19  Score=23.86  Aligned_cols=37  Identities=14%  Similarity=0.202  Sum_probs=28.4

Q ss_pred             eEEEEEe--ec--CCCeEEeccCCCCHHHHHHHHHHHHHHh
Q 026103           77 KWVCELR--EP--NKQTRIWLGTYPSPEMAARAHDVAALAL  113 (243)
Q Consensus        77 kW~AeIr--~~--~~~kri~LGtf~t~EeAA~AyD~Aa~~~  113 (243)
                      +|...|.  .+  ++.++++-+-|.|..||-.+.......+
T Consensus         1 ~w~~~v~g~~~~~Gkrk~~~k~GF~TkkeA~~~~~~~~~~~   41 (46)
T PF14657_consen    1 TWYYRVYGYDDETGKRKQKTKRGFKTKKEAEKALAKIEAEL   41 (46)
T ss_pred             CEEEEEEEEECCCCCEEEEEcCCCCcHHHHHHHHHHHHHHH
Confidence            5788883  33  3457889999999999999988766655


No 6  
>PHA02601 int integrase; Provisional
Probab=56.99  E-value=17  Score=32.75  Aligned_cols=44  Identities=20%  Similarity=0.386  Sum_probs=29.5

Q ss_pred             EeEECCCCeEEEEEeec-CCCeEEeccCCCCHHHHHHHHHHHHHHh
Q 026103           69 GVRMRNNNKWVCELREP-NKQTRIWLGTYPSPEMAARAHDVAALAL  113 (243)
Q Consensus        69 GVr~r~~GkW~AeIr~~-~~~kri~LGtf~t~EeAA~AyD~Aa~~~  113 (243)
                      +|++.++|+|.++++.. ..|+++.. +|.|..||..........+
T Consensus         2 ~~~~~~~g~w~~~~~~~~~~g~r~~~-~f~tk~eA~~~~~~~~~~~   46 (333)
T PHA02601          2 AVRKLKDGKWLCEIYPNGRDGKRIRK-RFATKGEALAFENYTMAEV   46 (333)
T ss_pred             ceEEcCCCCEEEEEEECCCCCchhhh-hhcCHHHHHHHHHHHHHhc
Confidence            46667778999999863 23666653 6999988876555443333


No 7  
>PF13356 DUF4102:  Domain of unknown function (DUF4102); PDB: 3JU0_A 3RMP_A 3JTZ_A 2KJ8_A.
Probab=56.00  E-value=37  Score=25.27  Aligned_cols=37  Identities=27%  Similarity=0.382  Sum_probs=24.5

Q ss_pred             EECCCC--eEEEEEeecCCCeEEeccCCCC--HHHHHHHHH
Q 026103           71 RMRNNN--KWVCELREPNKQTRIWLGTYPS--PEMAARAHD  107 (243)
Q Consensus        71 r~r~~G--kW~AeIr~~~~~kri~LGtf~t--~EeAA~AyD  107 (243)
                      +-.+.|  .|.-+.+..++.+++-||.|+.  ..+|.....
T Consensus        28 ~v~~~G~kt~~~r~~~~gk~~~~~lG~~p~~sl~~AR~~a~   68 (89)
T PF13356_consen   28 RVTPSGSKTFYFRYRINGKRRRITLGRYPELSLAEAREKAR   68 (89)
T ss_dssp             EE-TTS-EEEEEEEEETTEEEEEEEEECTTS-HHHHHHHHH
T ss_pred             EEEeCCCeEEEEEEEecceEEEeccCCCccCCHHHHHHHHH
Confidence            444444  4998888866677899999976  444444443


No 8  
>cd00801 INT_P4 Bacteriophage P4 integrase. P4-like integrases are found in temperate bacteriophages, integrative plasmids, pathogenicity and symbiosis islands, and other mobile genetic elements.  They share the same fold in their catalytic domain and the overall reaction mechanism with the superfamily of DNA breaking-rejoining enzymes. The P4 integrase mediates integrative and excisive site-specific recombination between two sites, called attachment sites, located on the phage genome and the bacterial chromosome. The phage attachment site is often found adjacent to the integrase gene, while the host attachment sites are typically situated near tRNA genes.
Probab=55.04  E-value=29  Score=30.89  Aligned_cols=39  Identities=33%  Similarity=0.493  Sum_probs=27.6

Q ss_pred             CCeEEEEEeecCCCeEEeccCCC--CHHHHHHHHHHHHHHh
Q 026103           75 NNKWVCELREPNKQTRIWLGTYP--SPEMAARAHDVAALAL  113 (243)
Q Consensus        75 ~GkW~AeIr~~~~~kri~LGtf~--t~EeAA~AyD~Aa~~~  113 (243)
                      .+.|..+++..++.+++.||+|+  +.++|..........+
T Consensus         9 ~~~~~~~~~~~g~~~~~~~g~~~~~~~~~A~~~~~~~~~~~   49 (357)
T cd00801           9 SKSWRFRYRLAGKRKRLTLGSYPAVSLAEAREKADEARALL   49 (357)
T ss_pred             CEEEEEEeccCCceeEEeCcCCCCCCHHHHHHHHHHHHHHH
Confidence            35699999997777788899996  6666666555543333


No 9  
>PF08846 DUF1816:  Domain of unknown function (DUF1816);  InterPro: IPR014945  Q4C9H3 from SWISSPROT is associated with the IPR008213 from INTERPRO domain suggesting this protein could have a role in phycobilisomes. 
Probab=49.56  E-value=33  Score=25.38  Aligned_cols=38  Identities=24%  Similarity=0.325  Sum_probs=28.6

Q ss_pred             eEEEEEeecCCCeEEeccCCCCHHHHHHHHHHHHHHhc
Q 026103           77 KWVCELREPNKQTRIWLGTYPSPEMAARAHDVAALALR  114 (243)
Q Consensus        77 kW~AeIr~~~~~kri~LGtf~t~EeAA~AyD~Aa~~~~  114 (243)
                      .|-++|.-..-.-..|.|-|.|.+||..+.---...+.
T Consensus         9 aWWveI~T~~P~ctYyFGPF~s~~eA~~~~~gyieDL~   46 (68)
T PF08846_consen    9 AWWVEIETQNPNCTYYFGPFDSREEAEAALPGYIEDLE   46 (68)
T ss_pred             cEEEEEEcCCCCEEEEeCCcCCHHHHHHHhccHHHHHH
Confidence            47788887555677889999999999988655444443


No 10 
>PF10729 CedA:  Cell division activator CedA;  InterPro: IPR019666  CedA is made up of four antiparallel beta-strands and an alpha-helix. It activates cell division by inhibiting chromosome over-replication. This is mediated by binding to dsDNA via the beta-sheet [, ]. ; GO: 0003677 DNA binding, 0051301 cell division; PDB: 2BN8_A 2D35_A.
Probab=47.97  E-value=32  Score=25.78  Aligned_cols=38  Identities=24%  Similarity=0.184  Sum_probs=24.9

Q ss_pred             CCceeEeEECCCCeEEEEEeecCCCeEEeccCCCCHHHHHH
Q 026103           64 HPIFRGVRMRNNNKWVCELREPNKQTRIWLGTYPSPEMAAR  104 (243)
Q Consensus        64 ~s~yrGVr~r~~GkW~AeIr~~~~~kri~LGtf~t~EeAA~  104 (243)
                      --+||-|+.-+ |||+|.+..  +..-..--.|..+|.|-|
T Consensus        30 ~dgfrdvw~lr-gkyvafvl~--ge~f~rsp~fs~pesaqr   67 (80)
T PF10729_consen   30 MDGFRDVWQLR-GKYVAFVLM--GEHFRRSPAFSVPESAQR   67 (80)
T ss_dssp             TTTECCECCCC-CEEEEEEES--SS-EEE---BSSHHHHHH
T ss_pred             cccccceeeec-cceEEEEEe--cchhccCCCcCCcHHHHH
Confidence            45799996543 999999987  443444457888887765


No 11 
>PRK09692 integrase; Provisional
Probab=41.06  E-value=75  Score=29.86  Aligned_cols=36  Identities=17%  Similarity=0.372  Sum_probs=22.8

Q ss_pred             eEECCCC--eEEEEEeecC--CCeEEeccCCC--CHHHHHHH
Q 026103           70 VRMRNNN--KWVCELREPN--KQTRIWLGTYP--SPEMAARA  105 (243)
Q Consensus        70 Vr~r~~G--kW~AeIr~~~--~~kri~LGtf~--t~EeAA~A  105 (243)
                      |+-+..|  .|+.+.+.+.  +.+++-||.|+  |..+|..+
T Consensus        33 l~v~~~G~k~~~~rY~~~~~gk~~~~~lG~yp~~sl~~AR~~   74 (413)
T PRK09692         33 LLIKSSGSKIWQFRYYRPLTKTRAKKSFGPYPSVTLADARNY   74 (413)
T ss_pred             EEEECCCcEEEEEEEecCCCCceeeeeCCCCCCCCHHHHHHH
Confidence            4445555  4998887543  34447899999  56555443


No 12 
>PHA03308 transcriptional regulator ICP4; Provisional
Probab=33.77  E-value=42  Score=35.63  Aligned_cols=9  Identities=22%  Similarity=0.438  Sum_probs=5.4

Q ss_pred             HHhcCCCcc
Q 026103          111 LALRGKSAC  119 (243)
Q Consensus       111 ~~~~G~~a~  119 (243)
                      ..-||+...
T Consensus      1333 vaawgrdtv 1341 (1463)
T PHA03308       1333 VAAWGRDTV 1341 (1463)
T ss_pred             HHHhccccc
Confidence            345787763


No 13 
>PRK10113 cell division modulator; Provisional
Probab=31.07  E-value=39  Score=25.26  Aligned_cols=37  Identities=27%  Similarity=0.335  Sum_probs=24.3

Q ss_pred             CceeEeEECCCCeEEEEEeecCCCeEEeccCCCCHHHHHH
Q 026103           65 PIFRGVRMRNNNKWVCELREPNKQTRIWLGTYPSPEMAAR  104 (243)
Q Consensus        65 s~yrGVr~r~~GkW~AeIr~~~~~kri~LGtf~t~EeAA~  104 (243)
                      -+||-|+.-+ |||+|.+..  ...-..--.|..+|.|-|
T Consensus        31 d~frDVW~Lr-GKYVAFvl~--ge~FrRSPaFs~PEsAQR   67 (80)
T PRK10113         31 DSFRDVWMLR-GKYVAFVLM--GESFLRSPAFSVPESAQR   67 (80)
T ss_pred             cchhhhheec-cceEEEEEe--chhhccCCccCCcHHHHH
Confidence            4688896543 899999877  222222346777777655


No 14 
>PF05036 SPOR:  Sporulation related domain;  InterPro: IPR007730 This 70 residue domain is composed of two 35 residue repeats that are found in bacterial proteins involved in sporulation and cell division, such as FtsN, CwlM and RlpA. This repeat might be involved in binding peptidoglycan. FtsN is an essential cell division protein with a simple bitopic topology: a short N-terminal cytoplasmic segment fused to a large carboxy periplasmic domain through a single transmembrane domain. The repeats lie at the periplasmic C terminus, which has an RNP-like fold []. FtsN localises to the septum ring complex. The CwlM protein is a cell wall hydrolase, where the C-terminal region, including the repeats, determines substrate specificity []. RlpA is a rare lipoprotein A protein that may be important for cell division. Its N-terminal cysteine may be attached to thioglyceride and N-fatty acyl residues [].; PDB: 1X60_A 1UTA_A.
Probab=29.47  E-value=58  Score=22.44  Aligned_cols=24  Identities=25%  Similarity=0.232  Sum_probs=17.5

Q ss_pred             CCeEEeccCCCCHHHHHHHHHHHH
Q 026103           87 KQTRIWLGTYPSPEMAARAHDVAA  110 (243)
Q Consensus        87 ~~kri~LGtf~t~EeAA~AyD~Aa  110 (243)
                      ..-+|.+|.|++.++|..+-....
T Consensus        42 ~~yrV~~G~f~~~~~A~~~~~~l~   65 (76)
T PF05036_consen   42 PWYRVRVGPFSSREEAEAALRKLK   65 (76)
T ss_dssp             TCEEEEECCECTCCHHHHHHHHHH
T ss_pred             ceEEEEECCCCCHHHHHHHHHHHh
Confidence            445788888999888877765433


No 15 
>PF14112 DUF4284:  Domain of unknown function (DUF4284)
Probab=24.92  E-value=49  Score=26.64  Aligned_cols=19  Identities=21%  Similarity=0.604  Sum_probs=14.2

Q ss_pred             eEEeccCCCCHHHHHHHHH
Q 026103           89 TRIWLGTYPSPEMAARAHD  107 (243)
Q Consensus        89 kri~LGtf~t~EeAA~AyD  107 (243)
                      ..||||+|.+.++--.=.+
T Consensus         2 VsiWiG~f~s~~el~~Y~e   20 (122)
T PF14112_consen    2 VSIWIGNFKSEDELEEYFE   20 (122)
T ss_pred             eEEEEecCCCHHHHHHHhC
Confidence            3699999999877655443


No 16 
>PF12404 DUF3663:  Peptidase ;  InterPro: IPR008330 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Metalloproteases are the most diverse of the four main types of protease, with more than 50 families identified to date. In these enzymes, a divalent cation, usually zinc, activates the water molecule. The metal ion is held in place by amino acid ligands, usually three in number. The known metal ligands are His, Glu, Asp or Lys and at least one other residue is required for catalysis, which may play an electrophilic role. Of the known metalloproteases, around half contain an HEXXH motif, which has been shown in crystallographic studies to form part of the metal-binding site []. The HEXXH motif is relatively common, but can be more stringently defined for metalloproteases as 'abXHEbbHbc', where 'a' is most often valine or threonine and forms part of the S1' subsite in thermolysin and neprilysin, 'b' is an uncharged residue, and 'c' a hydrophobic residue. Proline is never found in this site, possibly because it would break the helical structure adopted by this motif in metalloproteases []. This family represents the peptidase B group of leucyl aminopeptidases, which are restricted to the gammaproteobacteria. They contain a C-terminal aminopeptidase catalytic domain and an N-terminal domain of unknown function. They are zinc-dependent exopeptidases (3.4.11.1 from EC) and belong to MEROPS peptidase family M17 (leucyl aminopeptidase family, clan MF). They selectively release N-terminal amino acid residues from polypeptides and proteins and are involved in the processing, catabolism and degradation of intracellular proteins [, , ]. Leucyl aminopeptidase forms a homohexamer containing two trimers stacked on top of one another []. Each monomer binds two zinc ions. The zinc-binding and catalytic sites are located within the C-terminal catalytic domain []. The same catalytic aminopeptidase domain is found in the other M17 peptidases IPR011356 from INTERPRO. These two groups of aminopeptidases differ by their N-terminal domains. The N-terminal domain in members of IPR011356 from INTERPRO has been implicated in DNA binding [, ] and it is not associated with members of this family which have a different N-terminal domain and therefore are not expected to bind DNA or be involved in transcriptional regulation. In addition, there are related proteins with the same catalytic domain and unique N-terminal sequences unrelated to any of the two N-terminal domains discussed above. For additional information please see [, , , ]. ; GO: 0004177 aminopeptidase activity, 0008235 metalloexopeptidase activity, 0030145 manganese ion binding, 0005737 cytoplasm
Probab=23.68  E-value=37  Score=25.80  Aligned_cols=38  Identities=26%  Similarity=0.343  Sum_probs=26.8

Q ss_pred             HHHhcCCCccCCCCCCCCCCCCCCCCChhHHHHHHHHH
Q 026103          110 ALALRGKSACLNFADSVWRLPVPASTDAKDIRKAAAEA  147 (243)
Q Consensus       110 a~~~~G~~a~~NFp~s~~~lp~p~~~~~~di~~aa~~A  147 (243)
                      +-..||.+|.+-|-+.--.+.+-.......||+||.+-
T Consensus        10 A~a~WG~~AllSf~~~ga~IHl~~~~~l~~IQrAaRkL   47 (77)
T PF12404_consen   10 AAAHWGEKALLSFNEQGATIHLSEGDDLRAIQRAARKL   47 (77)
T ss_pred             ChhHhCcCcEEEEcCCCEEEEECCCcchHHHHHHHHHH
Confidence            34578999988888766555554455677888887763


No 17 
>cd07998 WGR_DNA_ligase WGR domain of bacterial DNA ligases. The WGR domain is found in a small family of predicted bacterial DNA ligases. It has been called WGR after the most conserved central motif of the domain. The domain typically occurs together with an ATP-dependent DNA ligase domain, and is between 70 and 80 residues in length. It has been proposed to function as a nucleic acid binding domain.
Probab=21.93  E-value=3e+02  Score=20.80  Aligned_cols=51  Identities=14%  Similarity=0.080  Sum_probs=33.7

Q ss_pred             CCCCCCceeEe--EECCCCeEEEEEeecCCCeEEec----cCCCCHHHHHHHHHHHH
Q 026103           60 KETRHPIFRGV--RMRNNNKWVCELREPNKQTRIWL----GTYPSPEMAARAHDVAA  110 (243)
Q Consensus        60 ~~~~~s~yrGV--r~r~~GkW~AeIr~~~~~kri~L----Gtf~t~EeAA~AyD~Aa  110 (243)
                      +..++..|-=|  .....+.|...|+..+.|...-.    .+|.++++|.+++++-.
T Consensus         8 ~dg~S~Kfyev~~~~~~d~g~~v~~~yGR~Gt~gq~~tkt~~~~~~~~A~k~~~Klv   64 (77)
T cd07998           8 QEGNSDKVYEVDLFEVSDDGYVVNFRYGRRGSALREGTKTVAPVTLEAAEKIFDKLV   64 (77)
T ss_pred             ecCCCceEEEEEEEeccCCceEEEEEEccccCCcccccccCCCCCHHHHHHHHHHHH
Confidence            34455555555  33445778888888665654444    35589999999999743


No 18 
>PF07494 Reg_prop:  Two component regulator propeller;  InterPro: IPR011110 A large group of two component regulator proteins appear to have the same N-terminal structure of 14 tandem repeats. These repeats show homology to members of IPR002372 from INTERPRO and IPR001680 from INTERPRO indicating that they are likely to form a beta-propeller. This family has been built with artificially high cut-offs in order to avoid overlaps with other beta-propeller families. The fourteen repeats are likely to form two propellers; it is not clear if these structures are likely to recruit other proteins or interact with DNA.; PDB: 3V9F_D 3VA6_B 3OTT_B 4A2M_D 4A2L_B.
Probab=21.18  E-value=72  Score=18.27  Aligned_cols=11  Identities=36%  Similarity=1.090  Sum_probs=8.4

Q ss_pred             CCeEEeccCCC
Q 026103           87 KQTRIWLGTYP   97 (243)
Q Consensus        87 ~~kri~LGtf~   97 (243)
                      +..+||+||+.
T Consensus        14 ~~G~lWigT~~   24 (24)
T PF07494_consen   14 SDGNLWIGTYN   24 (24)
T ss_dssp             TTSCEEEEETS
T ss_pred             CCcCEEEEeCC
Confidence            56689999873


Done!