Query 026103
Match_columns 243
No_of_seqs 261 out of 1368
Neff 5.4
Searched_HMMs 46136
Date Fri Mar 29 03:48:19 2013
Command hhsearch -i /work/01045/syshi/csienesis_hhblits_a3m/026103.a3m -d /work/01045/syshi/HHdatabase/Cdd.hhm -o /work/01045/syshi/hhsearch_cdd/026103hhsearch_cdd -cpu 12 -v 0
No Hit Prob E-value P-value Score SS Cols Query HMM Template HMM
1 cd00018 AP2 DNA-binding domain 99.8 7.6E-21 1.6E-25 135.8 7.5 61 65-125 1-61 (61)
2 smart00380 AP2 DNA-binding dom 99.8 4.5E-20 9.7E-25 133.4 8.5 63 66-128 1-63 (64)
3 PHA00280 putative NHN endonucl 99.5 1.4E-13 3E-18 111.8 7.1 72 45-119 47-119 (121)
4 PF00847 AP2: AP2 domain; Int 99.1 2.2E-10 4.8E-15 79.8 6.2 52 65-116 1-56 (56)
5 PF14657 Integrase_AP2: AP2-li 69.8 19 0.00041 23.9 5.5 37 77-113 1-41 (46)
6 PHA02601 int integrase; Provis 57.0 17 0.00037 32.7 4.5 44 69-113 2-46 (333)
7 PF13356 DUF4102: Domain of un 56.0 37 0.00081 25.3 5.5 37 71-107 28-68 (89)
8 cd00801 INT_P4 Bacteriophage P 55.0 29 0.00063 30.9 5.6 39 75-113 9-49 (357)
9 PF08846 DUF1816: Domain of un 49.6 33 0.00072 25.4 4.1 38 77-114 9-46 (68)
10 PF10729 CedA: Cell division a 48.0 32 0.0007 25.8 3.8 38 64-104 30-67 (80)
11 PRK09692 integrase; Provisiona 41.1 75 0.0016 29.9 6.2 36 70-105 33-74 (413)
12 PHA03308 transcriptional regul 33.8 42 0.00091 35.6 3.4 9 111-119 1333-1341(1463)
13 PRK10113 cell division modulat 31.1 39 0.00085 25.3 2.0 37 65-104 31-67 (80)
14 PF05036 SPOR: Sporulation rel 29.5 58 0.0012 22.4 2.7 24 87-110 42-65 (76)
15 PF14112 DUF4284: Domain of un 24.9 49 0.0011 26.6 1.8 19 89-107 2-20 (122)
16 PF12404 DUF3663: Peptidase ; 23.7 37 0.00079 25.8 0.8 38 110-147 10-47 (77)
17 cd07998 WGR_DNA_ligase WGR dom 21.9 3E+02 0.0064 20.8 5.4 51 60-110 8-64 (77)
18 PF07494 Reg_prop: Two compone 21.2 72 0.0016 18.3 1.5 11 87-97 14-24 (24)
No 1
>cd00018 AP2 DNA-binding domain found in transcription regulators in plants such as APETALA2 and EREBP (ethylene responsive element binding protein). In EREBPs the domain specifically binds to the 11bp GCC box of the ethylene response element (ERE), a promoter element essential for ethylene responsiveness. EREBPs and the C-repeat binding factor CBF1, which is involved in stress response, contain a single copy of the AP2 domain. APETALA2-like proteins, which play a role in plant development, contain two copies.
Probab=99.84 E-value=7.6e-21 Score=135.84 Aligned_cols=61 Identities=59% Similarity=0.930 Sum_probs=57.2
Q ss_pred CceeEeEECCCCeEEEEEeecCCCeEEeccCCCCHHHHHHHHHHHHHHhcCCCccCCCCCC
Q 026103 65 PIFRGVRMRNNNKWVCELREPNKQTRIWLGTYPSPEMAARAHDVAALALRGKSACLNFADS 125 (243)
Q Consensus 65 s~yrGVr~r~~GkW~AeIr~~~~~kri~LGtf~t~EeAA~AyD~Aa~~~~G~~a~~NFp~s 125 (243)
|+||||+++++|||+|+|+++..++++|||+|+|+|||++|||.|+++++|..+.+|||++
T Consensus 1 s~~~GV~~~~~gkw~A~I~~~~~gk~~~lG~f~t~eeAa~Ayd~a~~~~~g~~a~~Nf~~~ 61 (61)
T cd00018 1 SKYRGVRQRPWGKWVAEIRDPSGGRRIWLGTFDTAEEAARAYDRAALKLRGSSAVLNFPDS 61 (61)
T ss_pred CCccCEEECCCCcEEEEEEeCCCCceEccCCCCCHHHHHHHHHHHHHHhcCCccccCCCCC
Confidence 6899998777799999999966699999999999999999999999999999999999975
No 2
>smart00380 AP2 DNA-binding domain in plant proteins such as APETALA2 and EREBPs.
Probab=99.82 E-value=4.5e-20 Score=133.37 Aligned_cols=63 Identities=52% Similarity=0.915 Sum_probs=59.3
Q ss_pred ceeEeEECCCCeEEEEEeecCCCeEEeccCCCCHHHHHHHHHHHHHHhcCCCccCCCCCCCCC
Q 026103 66 IFRGVRMRNNNKWVCELREPNKQTRIWLGTYPSPEMAARAHDVAALALRGKSACLNFADSVWR 128 (243)
Q Consensus 66 ~yrGVr~r~~GkW~AeIr~~~~~kri~LGtf~t~EeAA~AyD~Aa~~~~G~~a~~NFp~s~~~ 128 (243)
.|+||+++++|||+|+|+++.+++++|||+|+|+||||+|||.|+++++|..+.+|||.+.|.
T Consensus 1 ~~kGV~~~~~gkw~A~I~~~~~~k~~~lG~f~t~eeAa~Ayd~a~~~~~g~~a~~Nf~~~~y~ 63 (64)
T smart00380 1 KYRGVRQRPWGKWVAEIRDPSKGKRVWLGTFDTAEEAARAYDRAAFKFRGRSARLNFPNSLYD 63 (64)
T ss_pred CEeeEEeCCCCeEEEEEEecCCCcEEecCCCCCHHHHHHHHHHHHHHhcCCccccCCCCccCC
Confidence 499997787899999999999999999999999999999999999999999999999998764
No 3
>PHA00280 putative NHN endonuclease
Probab=99.46 E-value=1.4e-13 Score=111.83 Aligned_cols=72 Identities=14% Similarity=0.144 Sum_probs=60.2
Q ss_pred hccCCccccCCCCCCCCCCCCceeEe-EECCCCeEEEEEeecCCCeEEeccCCCCHHHHHHHHHHHHHHhcCCCcc
Q 026103 45 LATSRPKKRAGRRVFKETRHPIFRGV-RMRNNNKWVCELREPNKQTRIWLGTYPSPEMAARAHDVAALALRGKSAC 119 (243)
Q Consensus 45 ~~s~~~k~~~~r~k~~~~~~s~yrGV-r~r~~GkW~AeIr~~~~~kri~LGtf~t~EeAA~AyD~Aa~~~~G~~a~ 119 (243)
+.-..+.....+++.+.+++|+|+|| ++...|||+|+|+. +||+++||.|+++|+|+.||+ ++.+|+|.+|+
T Consensus 47 Lr~~T~~eN~~N~~~~~~N~SG~kGV~~~k~~~kw~A~I~~--~gK~~~lG~f~~~e~A~~a~~-~~~~lhGeFa~ 119 (121)
T PHA00280 47 LRLALPKENSWNMKTPKSNTSGLKGLSWSKEREMWRGTVTA--EGKQHNFRSRDLLEVVAWIYR-TRRELHGQFAR 119 (121)
T ss_pred hhhcCHHHHhcccCCCCCCCCCCCeeEEecCCCeEEEEEEE--CCEEEEcCCCCCHHHHHHHHH-HHHHHhhcccc
Confidence 33344555566666677899999999 56677999999999 999999999999999999997 77889998875
No 4
>PF00847 AP2: AP2 domain; InterPro: IPR001471 Pathogenesis-related genes transcriptional activator binds to the GCC-box pathogenesis-related promoter element and activates the plant's defence genes. Ethylene, chemically the simplest plant hormone, participates in a number of stress responses and developmental processes: e.g., fruit ripening, inhibition of stem and root elongation, promotion of seed germination and flowering, senescence of leaves and flowers, and sex determination []. DNA sequence elements that confer ethylene responsiveness have been shown to contain two 11bp GCC boxes, which are necessary and sufficient for transcriptional control by ethylene. Ethylene responsive element binding proteins (EREBPs) have now been identified in a variety of plants. The proteins share a similar domain of around 59 amino acids, which interacts directly with the GCC box in the ERE.; GO: 0003700 sequence-specific DNA binding transcription factor activity, 0006355 regulation of transcription, DNA-dependent; PDB: 3IGM_A 3GCC_A 1GCC_A 2GCC_A.
Probab=99.10 E-value=2.2e-10 Score=79.82 Aligned_cols=52 Identities=25% Similarity=0.395 Sum_probs=44.9
Q ss_pred CceeEe-EECCCCeEEEEEeecCC---CeEEeccCCCCHHHHHHHHHHHHHHhcCC
Q 026103 65 PIFRGV-RMRNNNKWVCELREPNK---QTRIWLGTYPSPEMAARAHDVAALALRGK 116 (243)
Q Consensus 65 s~yrGV-r~r~~GkW~AeIr~~~~---~kri~LGtf~t~EeAA~AyD~Aa~~~~G~ 116 (243)
|+|+|| +.+..++|+|+|+++.. +++++||.|+++++|+++++.+++.++|.
T Consensus 1 s~~~GV~~~~~~~~W~a~i~~~~~~g~~k~f~~g~fg~~~eA~~~a~~~r~~~~~e 56 (56)
T PF00847_consen 1 SGYKGVSWDKRRGRWRAQIRVWSENGKRKRFSVGKFGFEEEAKRAAIEARKELEGE 56 (56)
T ss_dssp SSSTTEEEETTTTEEEEEEEECCCTTEEEEEEECCCCCHHHHHHHHHHHHHHCTS-
T ss_pred CCcEEEEEcCCCCEEEEEEEEcccCcccEEEeCccCCCHHHHHHHHHHHHHHhcCC
Confidence 689999 55667999999999432 49999999999999999999999999874
No 5
>PF14657 Integrase_AP2: AP2-like DNA-binding integrase domain
Probab=69.78 E-value=19 Score=23.86 Aligned_cols=37 Identities=14% Similarity=0.202 Sum_probs=28.4
Q ss_pred eEEEEEe--ec--CCCeEEeccCCCCHHHHHHHHHHHHHHh
Q 026103 77 KWVCELR--EP--NKQTRIWLGTYPSPEMAARAHDVAALAL 113 (243)
Q Consensus 77 kW~AeIr--~~--~~~kri~LGtf~t~EeAA~AyD~Aa~~~ 113 (243)
+|...|. .+ ++.++++-+-|.|..||-.+.......+
T Consensus 1 ~w~~~v~g~~~~~Gkrk~~~k~GF~TkkeA~~~~~~~~~~~ 41 (46)
T PF14657_consen 1 TWYYRVYGYDDETGKRKQKTKRGFKTKKEAEKALAKIEAEL 41 (46)
T ss_pred CEEEEEEEEECCCCCEEEEEcCCCCcHHHHHHHHHHHHHHH
Confidence 5788883 33 3457889999999999999988766655
No 6
>PHA02601 int integrase; Provisional
Probab=56.99 E-value=17 Score=32.75 Aligned_cols=44 Identities=20% Similarity=0.386 Sum_probs=29.5
Q ss_pred EeEECCCCeEEEEEeec-CCCeEEeccCCCCHHHHHHHHHHHHHHh
Q 026103 69 GVRMRNNNKWVCELREP-NKQTRIWLGTYPSPEMAARAHDVAALAL 113 (243)
Q Consensus 69 GVr~r~~GkW~AeIr~~-~~~kri~LGtf~t~EeAA~AyD~Aa~~~ 113 (243)
+|++.++|+|.++++.. ..|+++.. +|.|..||..........+
T Consensus 2 ~~~~~~~g~w~~~~~~~~~~g~r~~~-~f~tk~eA~~~~~~~~~~~ 46 (333)
T PHA02601 2 AVRKLKDGKWLCEIYPNGRDGKRIRK-RFATKGEALAFENYTMAEV 46 (333)
T ss_pred ceEEcCCCCEEEEEEECCCCCchhhh-hhcCHHHHHHHHHHHHHhc
Confidence 46667778999999863 23666653 6999988876555443333
No 7
>PF13356 DUF4102: Domain of unknown function (DUF4102); PDB: 3JU0_A 3RMP_A 3JTZ_A 2KJ8_A.
Probab=56.00 E-value=37 Score=25.27 Aligned_cols=37 Identities=27% Similarity=0.382 Sum_probs=24.5
Q ss_pred EECCCC--eEEEEEeecCCCeEEeccCCCC--HHHHHHHHH
Q 026103 71 RMRNNN--KWVCELREPNKQTRIWLGTYPS--PEMAARAHD 107 (243)
Q Consensus 71 r~r~~G--kW~AeIr~~~~~kri~LGtf~t--~EeAA~AyD 107 (243)
+-.+.| .|.-+.+..++.+++-||.|+. ..+|.....
T Consensus 28 ~v~~~G~kt~~~r~~~~gk~~~~~lG~~p~~sl~~AR~~a~ 68 (89)
T PF13356_consen 28 RVTPSGSKTFYFRYRINGKRRRITLGRYPELSLAEAREKAR 68 (89)
T ss_dssp EE-TTS-EEEEEEEEETTEEEEEEEEECTTS-HHHHHHHHH
T ss_pred EEEeCCCeEEEEEEEecceEEEeccCCCccCCHHHHHHHHH
Confidence 444444 4998888866677899999976 444444443
No 8
>cd00801 INT_P4 Bacteriophage P4 integrase. P4-like integrases are found in temperate bacteriophages, integrative plasmids, pathogenicity and symbiosis islands, and other mobile genetic elements. They share the same fold in their catalytic domain and the overall reaction mechanism with the superfamily of DNA breaking-rejoining enzymes. The P4 integrase mediates integrative and excisive site-specific recombination between two sites, called attachment sites, located on the phage genome and the bacterial chromosome. The phage attachment site is often found adjacent to the integrase gene, while the host attachment sites are typically situated near tRNA genes.
Probab=55.04 E-value=29 Score=30.89 Aligned_cols=39 Identities=33% Similarity=0.493 Sum_probs=27.6
Q ss_pred CCeEEEEEeecCCCeEEeccCCC--CHHHHHHHHHHHHHHh
Q 026103 75 NNKWVCELREPNKQTRIWLGTYP--SPEMAARAHDVAALAL 113 (243)
Q Consensus 75 ~GkW~AeIr~~~~~kri~LGtf~--t~EeAA~AyD~Aa~~~ 113 (243)
.+.|..+++..++.+++.||+|+ +.++|..........+
T Consensus 9 ~~~~~~~~~~~g~~~~~~~g~~~~~~~~~A~~~~~~~~~~~ 49 (357)
T cd00801 9 SKSWRFRYRLAGKRKRLTLGSYPAVSLAEAREKADEARALL 49 (357)
T ss_pred CEEEEEEeccCCceeEEeCcCCCCCCHHHHHHHHHHHHHHH
Confidence 35699999997777788899996 6666666555543333
No 9
>PF08846 DUF1816: Domain of unknown function (DUF1816); InterPro: IPR014945 Q4C9H3 from SWISSPROT is associated with the IPR008213 from INTERPRO domain suggesting this protein could have a role in phycobilisomes.
Probab=49.56 E-value=33 Score=25.38 Aligned_cols=38 Identities=24% Similarity=0.325 Sum_probs=28.6
Q ss_pred eEEEEEeecCCCeEEeccCCCCHHHHHHHHHHHHHHhc
Q 026103 77 KWVCELREPNKQTRIWLGTYPSPEMAARAHDVAALALR 114 (243)
Q Consensus 77 kW~AeIr~~~~~kri~LGtf~t~EeAA~AyD~Aa~~~~ 114 (243)
.|-++|.-..-.-..|.|-|.|.+||..+.---...+.
T Consensus 9 aWWveI~T~~P~ctYyFGPF~s~~eA~~~~~gyieDL~ 46 (68)
T PF08846_consen 9 AWWVEIETQNPNCTYYFGPFDSREEAEAALPGYIEDLE 46 (68)
T ss_pred cEEEEEEcCCCCEEEEeCCcCCHHHHHHHhccHHHHHH
Confidence 47788887555677889999999999988655444443
No 10
>PF10729 CedA: Cell division activator CedA; InterPro: IPR019666 CedA is made up of four antiparallel beta-strands and an alpha-helix. It activates cell division by inhibiting chromosome over-replication. This is mediated by binding to dsDNA via the beta-sheet [, ]. ; GO: 0003677 DNA binding, 0051301 cell division; PDB: 2BN8_A 2D35_A.
Probab=47.97 E-value=32 Score=25.78 Aligned_cols=38 Identities=24% Similarity=0.184 Sum_probs=24.9
Q ss_pred CCceeEeEECCCCeEEEEEeecCCCeEEeccCCCCHHHHHH
Q 026103 64 HPIFRGVRMRNNNKWVCELREPNKQTRIWLGTYPSPEMAAR 104 (243)
Q Consensus 64 ~s~yrGVr~r~~GkW~AeIr~~~~~kri~LGtf~t~EeAA~ 104 (243)
--+||-|+.-+ |||+|.+.. +..-..--.|..+|.|-|
T Consensus 30 ~dgfrdvw~lr-gkyvafvl~--ge~f~rsp~fs~pesaqr 67 (80)
T PF10729_consen 30 MDGFRDVWQLR-GKYVAFVLM--GEHFRRSPAFSVPESAQR 67 (80)
T ss_dssp TTTECCECCCC-CEEEEEEES--SS-EEE---BSSHHHHHH
T ss_pred cccccceeeec-cceEEEEEe--cchhccCCCcCCcHHHHH
Confidence 45799996543 999999987 443444457888887765
No 11
>PRK09692 integrase; Provisional
Probab=41.06 E-value=75 Score=29.86 Aligned_cols=36 Identities=17% Similarity=0.372 Sum_probs=22.8
Q ss_pred eEECCCC--eEEEEEeecC--CCeEEeccCCC--CHHHHHHH
Q 026103 70 VRMRNNN--KWVCELREPN--KQTRIWLGTYP--SPEMAARA 105 (243)
Q Consensus 70 Vr~r~~G--kW~AeIr~~~--~~kri~LGtf~--t~EeAA~A 105 (243)
|+-+..| .|+.+.+.+. +.+++-||.|+ |..+|..+
T Consensus 33 l~v~~~G~k~~~~rY~~~~~gk~~~~~lG~yp~~sl~~AR~~ 74 (413)
T PRK09692 33 LLIKSSGSKIWQFRYYRPLTKTRAKKSFGPYPSVTLADARNY 74 (413)
T ss_pred EEEECCCcEEEEEEEecCCCCceeeeeCCCCCCCCHHHHHHH
Confidence 4445555 4998887543 34447899999 56555443
No 12
>PHA03308 transcriptional regulator ICP4; Provisional
Probab=33.77 E-value=42 Score=35.63 Aligned_cols=9 Identities=22% Similarity=0.438 Sum_probs=5.4
Q ss_pred HHhcCCCcc
Q 026103 111 LALRGKSAC 119 (243)
Q Consensus 111 ~~~~G~~a~ 119 (243)
..-||+...
T Consensus 1333 vaawgrdtv 1341 (1463)
T PHA03308 1333 VAAWGRDTV 1341 (1463)
T ss_pred HHHhccccc
Confidence 345787763
No 13
>PRK10113 cell division modulator; Provisional
Probab=31.07 E-value=39 Score=25.26 Aligned_cols=37 Identities=27% Similarity=0.335 Sum_probs=24.3
Q ss_pred CceeEeEECCCCeEEEEEeecCCCeEEeccCCCCHHHHHH
Q 026103 65 PIFRGVRMRNNNKWVCELREPNKQTRIWLGTYPSPEMAAR 104 (243)
Q Consensus 65 s~yrGVr~r~~GkW~AeIr~~~~~kri~LGtf~t~EeAA~ 104 (243)
-+||-|+.-+ |||+|.+.. ...-..--.|..+|.|-|
T Consensus 31 d~frDVW~Lr-GKYVAFvl~--ge~FrRSPaFs~PEsAQR 67 (80)
T PRK10113 31 DSFRDVWMLR-GKYVAFVLM--GESFLRSPAFSVPESAQR 67 (80)
T ss_pred cchhhhheec-cceEEEEEe--chhhccCCccCCcHHHHH
Confidence 4688896543 899999877 222222346777777655
No 14
>PF05036 SPOR: Sporulation related domain; InterPro: IPR007730 This 70 residue domain is composed of two 35 residue repeats that are found in bacterial proteins involved in sporulation and cell division, such as FtsN, CwlM and RlpA. This repeat might be involved in binding peptidoglycan. FtsN is an essential cell division protein with a simple bitopic topology: a short N-terminal cytoplasmic segment fused to a large carboxy periplasmic domain through a single transmembrane domain. The repeats lie at the periplasmic C terminus, which has an RNP-like fold []. FtsN localises to the septum ring complex. The CwlM protein is a cell wall hydrolase, where the C-terminal region, including the repeats, determines substrate specificity []. RlpA is a rare lipoprotein A protein that may be important for cell division. Its N-terminal cysteine may be attached to thioglyceride and N-fatty acyl residues [].; PDB: 1X60_A 1UTA_A.
Probab=29.47 E-value=58 Score=22.44 Aligned_cols=24 Identities=25% Similarity=0.232 Sum_probs=17.5
Q ss_pred CCeEEeccCCCCHHHHHHHHHHHH
Q 026103 87 KQTRIWLGTYPSPEMAARAHDVAA 110 (243)
Q Consensus 87 ~~kri~LGtf~t~EeAA~AyD~Aa 110 (243)
..-+|.+|.|++.++|..+-....
T Consensus 42 ~~yrV~~G~f~~~~~A~~~~~~l~ 65 (76)
T PF05036_consen 42 PWYRVRVGPFSSREEAEAALRKLK 65 (76)
T ss_dssp TCEEEEECCECTCCHHHHHHHHHH
T ss_pred ceEEEEECCCCCHHHHHHHHHHHh
Confidence 445788888999888877765433
No 15
>PF14112 DUF4284: Domain of unknown function (DUF4284)
Probab=24.92 E-value=49 Score=26.64 Aligned_cols=19 Identities=21% Similarity=0.604 Sum_probs=14.2
Q ss_pred eEEeccCCCCHHHHHHHHH
Q 026103 89 TRIWLGTYPSPEMAARAHD 107 (243)
Q Consensus 89 kri~LGtf~t~EeAA~AyD 107 (243)
..||||+|.+.++--.=.+
T Consensus 2 VsiWiG~f~s~~el~~Y~e 20 (122)
T PF14112_consen 2 VSIWIGNFKSEDELEEYFE 20 (122)
T ss_pred eEEEEecCCCHHHHHHHhC
Confidence 3699999999877655443
No 16
>PF12404 DUF3663: Peptidase ; InterPro: IPR008330 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Metalloproteases are the most diverse of the four main types of protease, with more than 50 families identified to date. In these enzymes, a divalent cation, usually zinc, activates the water molecule. The metal ion is held in place by amino acid ligands, usually three in number. 
The known metal ligands are His, Glu, Asp or Lys and at least one other residue is required for catalysis, which may play an electrophilic role. Of the known metalloproteases, around half contain an HEXXH motif, which has been shown in crystallographic studies to form part of the metal-binding site []. The HEXXH motif is relatively common, but can be more stringently defined for metalloproteases as 'abXHEbbHbc', where 'a' is most often valine or threonine and forms part of the S1' subsite in thermolysin and neprilysin, 'b' is an uncharged residue, and 'c' a hydrophobic residue. Proline is never found in this site, possibly because it would break the helical structure adopted by this motif in metalloproteases []. This family represents the peptidase B group of leucyl aminopeptidases, which are restricted to the gammaproteobacteria. They contain a C-terminal aminopeptidase catalytic domain and an N-terminal domain of unknown function. They are zinc-dependent exopeptidases (3.4.11.1 from EC) and belong to MEROPS peptidase family M17 (leucyl aminopeptidase family, clan MF). They selectively release N-terminal amino acid residues from polypeptides and proteins and are involved in the processing, catabolism and degradation of intracellular proteins [, , ]. Leucyl aminopeptidase forms a homohexamer containing two trimers stacked on top of one another []. Each monomer binds two zinc ions. The zinc-binding and catalytic sites are located within the C-terminal catalytic domain []. The same catalytic aminopeptidase domain is found in the other M17 peptidases IPR011356 from INTERPRO. These two groups of aminopeptidases differ by their N-terminal domains. The N-terminal domain in members of IPR011356 from INTERPRO has been implicated in DNA binding [, ] and it is not associated with members of this family which have a different N-terminal domain and therefore are not expected to bind DNA or be involved in transcriptional regulation. 
In addition, there are related proteins with the same catalytic domain and unique N-terminal sequences unrelated to any of the two N-terminal domains discussed above. For additional information please see [, , , ]. ; GO: 0004177 aminopeptidase activity, 0008235 metalloexopeptidase activity, 0030145 manganese ion binding, 0005737 cytoplasm
Probab=23.68 E-value=37 Score=25.80 Aligned_cols=38 Identities=26% Similarity=0.343 Sum_probs=26.8
Q ss_pred HHHhcCCCccCCCCCCCCCCCCCCCCChhHHHHHHHHH
Q 026103 110 ALALRGKSACLNFADSVWRLPVPASTDAKDIRKAAAEA 147 (243)
Q Consensus 110 a~~~~G~~a~~NFp~s~~~lp~p~~~~~~di~~aa~~A 147 (243)
+-..||.+|.+-|-+.--.+.+-.......||+||.+-
T Consensus 10 A~a~WG~~AllSf~~~ga~IHl~~~~~l~~IQrAaRkL 47 (77)
T PF12404_consen 10 AAAHWGEKALLSFNEQGATIHLSEGDDLRAIQRAARKL 47 (77)
T ss_pred ChhHhCcCcEEEEcCCCEEEEECCCcchHHHHHHHHHH
Confidence 34578999988888766555554455677888887763
No 17
>cd07998 WGR_DNA_ligase WGR domain of bacterial DNA ligases. The WGR domain is found in a small family of predicted bacterial DNA ligases. It has been called WGR after the most conserved central motif of the domain. The domain typically occurs together with an ATP-dependent DNA ligase domain, and is between 70 and 80 residues in length. It has been proposed to function as a nucleic acid binding domain.
Probab=21.93 E-value=3e+02 Score=20.80 Aligned_cols=51 Identities=14% Similarity=0.080 Sum_probs=33.7
Q ss_pred CCCCCCceeEe--EECCCCeEEEEEeecCCCeEEec----cCCCCHHHHHHHHHHHH
Q 026103 60 KETRHPIFRGV--RMRNNNKWVCELREPNKQTRIWL----GTYPSPEMAARAHDVAA 110 (243)
Q Consensus 60 ~~~~~s~yrGV--r~r~~GkW~AeIr~~~~~kri~L----Gtf~t~EeAA~AyD~Aa 110 (243)
+..++..|-=| .....+.|...|+..+.|...-. .+|.++++|.+++++-.
T Consensus 8 ~dg~S~Kfyev~~~~~~d~g~~v~~~yGR~Gt~gq~~tkt~~~~~~~~A~k~~~Klv 64 (77)
T cd07998 8 QEGNSDKVYEVDLFEVSDDGYVVNFRYGRRGSALREGTKTVAPVTLEAAEKIFDKLV 64 (77)
T ss_pred ecCCCceEEEEEEEeccCCceEEEEEEccccCCcccccccCCCCCHHHHHHHHHHHH
Confidence 34455555555 33445778888888665654444 35589999999999743
No 18
>PF07494 Reg_prop: Two component regulator propeller; InterPro: IPR011110 A large group of two component regulator proteins appear to have the same N-terminal structure of 14 tandem repeats. These repeats show homology to members of IPR002372 from INTERPRO and IPR001680 from INTERPRO indicating that they are likely to form a beta-propeller. This family has been built with artificially high cut-offs in order to avoid overlaps with other beta-propeller families. The fourteen repeats are likely to form two propellers; it is not clear if these structures are likely to recruit other proteins or interact with DNA.; PDB: 3V9F_D 3VA6_B 3OTT_B 4A2M_D 4A2L_B.
Probab=21.18 E-value=72 Score=18.27 Aligned_cols=11 Identities=36% Similarity=1.090 Sum_probs=8.4
Q ss_pred CCeEEeccCCC
Q 026103 87 KQTRIWLGTYP 97 (243)
Q Consensus 87 ~~kri~LGtf~ 97 (243)
+..+||+||+.
T Consensus 14 ~~G~lWigT~~ 24 (24)
T PF07494_consen 14 SDGNLWIGTYN 24 (24)
T ss_dssp TTSCEEEEETS
T ss_pred CCcCEEEEeCC
Confidence 56689999873
Done!