Query         T0530 NP_391841.1, Bacillus subtilis, 115 residues
Match_columns 115
No_of_seqs    104 out of 166
Neff          6.2
Searched_HMMs 11830
Date          Fri May 21 18:06:52 2010
Command       /home/syshi_2/2008/ferredoxin/manualcheck/update/HHsearch/bin/hhsearch -i /home/syshi_3/CASP9/HHsearch4Targetseq/pfamAsearch/hhm/T0530.hhm -d /home/syshi_2/2008/ferredoxin/manualcheck/update/HHsearch/database/pfamA_24_hhmdb -o /home/syshi_3/CASP9/HHsearch4Targetseq/pfamAsearch/hhm/T0530.hhr

 No Hit                             Prob E-value P-value  Score    SS Cols Query HMM  Template HMM
  1 PF06486 DUF1093: Protein of u   99.8   3E-21 2.5E-25  141.6   6.2   74    32-105  1-78    (78)
  2 PF11337 DUF3139: Protein of u   69.9     1.3 0.00011   20.9   2.7   23      1-23  1-26    (85)
  3 PF03748 FliL: Flagellar basal   60.8     2.6 0.00022   19.1   2.9   24      1-24  1-25    (149)
  4 PF07901 DUF1672: Protein of u   59.0     1.6 0.00013   20.5   1.5   19      1-19  1-19    (304)
  5 PF05170 AsmA: AsmA family; I    54.7     2.1 0.00018   19.6   1.6   17      1-17  1-17    (604)
  6 PF07423 DUF1510: Protein of u   47.0     6.1 0.00052   17.0   2.9   20      6-25  19-38   (217)
  7 PF06097 DUF945: Bacterial pro   44.7     6.1 0.00052   17.0   2.6   29     56-84  62-99   (459)
  8 PF05404 TRAP-delta: Transloco   43.4     8.2 0.00069   16.2   4.5   14     55-68  93-106  (167)
  9 PF08085 Entericidin: Enterici   43.3     6.6 0.00056   16.8   2.6   21      1-21  1-21    (42)
 10 PF06510 DUF1102: Protein of u   42.7     8.4 0.00071   16.2   5.0   15      1-15  1-15    (185)
 11 PF02402 Lysis_col: Lysis prot   37.4     4.5 0.00038   17.8   1.0   11      1-11  1-11    (46)
 12 PF11153 DUF2931: Protein of u   35.4      11 0.00093   15.5   6.5   33    76-114  91-126  (216)
 13 PF11839 DUF3359: Protein of u   34.8      10 0.00084   15.7   2.4   22      1-24  1-22    (96)
 14 PF10614 Tafi-CsgF: Curli prod   32.6     9.5  0.0008   15.9   2.0   34      1-36  1-34    (142)
 15 PF10969 DUF2771: Protein of u   32.3      12   0.001   15.2   3.7   15      2-16  1-15    (161)
 16 PF06291 Lambda_Bor: Bor prote   30.8       7 0.00059   16.6   1.1   18      1-19  1-18    (97)
 17 PF05079 DUF680: Protein of un   29.5      14  0.0012   14.9   2.4   24      1-27  1-24    (75)
 18 PF01203 GSPII_N: Bacterial ty   27.8      15  0.0013   14.7   2.9   20      1-20  1-20    (252)
 19 PF07009 DUF1312: Protein of u   27.2      15  0.0013   14.7   3.5   16     36-51  23-38   (113)
 20 PF06316 Ail_Lom: Enterobacter   27.0     6.9 0.00058   16.7   0.5   20     71-90  96-115  (199)
 21 PF10828 DUF2570: Protein of u   26.4      16  0.0013   14.6   3.0   31      1-32  1-31    (110)
 22 PF06551 DUF1120: Protein of u   26.2      16  0.0013   14.6   2.4   17      1-17  1-17    (145)
 23 PF10855 DUF2648: Protein of u   23.6      12   0.001   15.2   1.3   10      1-10  1-10    (33)
 24 PF10694 DUF2500: Protein of u   23.3      18  0.0015   14.2   7.7   45     54-98  62-108  (110)
 25 PF09403 FadA: Adhesion protei   22.6     6.9 0.00058   16.7  -0.1   19      1-21  1-19    (126)
 26 PF02920 Integrase_DNA: DNA bi   21.4      14  0.0012   14.9   1.2   35    79-113  17-55   (67)
 27 PF05342 Peptidase_M26_N: M26    20.6      20  0.0017   13.9   3.4   46    59-105  49-100  (250)

No 1
>PF06486 DUF1093: Protein of unknown function (DUF1093); InterPro: IPR006542 These are a family of small (about 115 amino acids) uncharacterised proteins with N-terminal signal sequences, found exclusively in Gram-positive organisms. Most genomes that have any members of this family have at least two members.; PDB: 2k5w_A 2k5q_A.
Probab=99.82  E-value=3e-21  Score=141.62  Aligned_cols=74  Identities=50%  Similarity=0.835  Sum_probs=70.2

Q ss_pred             CCCCCCCEEEEEECCCCCCCCC---CCEEEEEEEECCCCCEEEEEECCCCCCCCCCEEEEEECCCE-ECCCEEECHHH
Q ss_conf             5432884258998388722267---74378775375788568999703887765663899865823-11420331530
Q T0530            32 NPFIHQQDVYVQIDRDGRHLSP---GGTEYTLDGYNASGKKEEVTFFAGKELRKNAYLKVKAKGKY-VETWEEVKFED  105 (115)
Q Consensus        32 np~~~~~~~Y~~i~~~g~~~~~---~~y~Y~l~~yd~~G~~k~i~f~~~~~L~~~aYlki~~k~~~-V~s~eeV~~~~  105 (115)
                      ||++++++|||+|+++|+....   .+|.|+|+|||++|++++|+|+++++|++||||||+++++. |+||+||+++|
T Consensus         1 N~~~~~~~yYv~i~~~~~~~~~~~~~~y~Y~l~~yde~G~~k~v~f~~~~~L~~~aylki~~~~~~~V~s~eeV~~~e  78 (78)
T PF06486_consen    1 NPFYGGEDYYVKITEDGKKEGKNKDKRYEYTLKGYDEDGKEKEVTFTASKPLKKGAYLKIYVKGKYGVTSYEEVQKDE  78 (78)
T ss_dssp             -TT---EEEEEE------EECGGG-EEEEEEEEEEE----EEEEEEEESS---TTSEEEEEE-TTS-EEEEEEE-CCC
T ss_pred             CCCCCCEEEEEEECCCCEEECCCCCCEEEEEEEEECCCCCEEEEEEECCCCCCCCCEEEEEECCCCCEEEEEEECCCC
T ss_conf             984685199999887887801268743999889998899999999981787789978999983874285328975579

No 2
>PF11337 DUF3139: Protein of unknown function (DUF3139)
Probab=69.87  E-value=1.3  Score=20.94  Aligned_cols=23  Identities=26%  Similarity=0.485  Sum_probs=9.4

Q ss_pred             CHH---HHHHHHHHHHHHHHHHHHHC
Q ss_conf             935---77789999999999887531
Q T0530             1 MKK---AMAILAVLAAAAVICGLLFF   23 (115)
Q Consensus         1 MKK---~l~ii~i~~v~~~~~~~~~~   23 (115)
                      |||   ++.+++++++.++++..+++
T Consensus         1 MKK~k~~~~ii~ii~I~l~~~~~~~~   26 (85)
T PF11337_consen    1 MKKKKKIFIIIIIIIISLFVGIVYFF   26 (85)
T ss_pred             CCCCEEHHHHHHHHHHHHHHHHHHHH
T ss_conf             98400158999999999999999986

No 3
>PF03748 FliL: Flagellar basal body-associated protein FliL; InterPro: IPR005503 This FliL protein controls the rotational direction of the flagella during chemotaxis. FliL is a cytoplasmic membrane protein associated with the basal body.; GO: 0001539 ciliary or flagellar motility, 0006935 chemotaxis, 0009425 flagellin-based flagellum basal body
Probab=60.78  E-value=2.6  Score=19.14  Aligned_cols=24  Identities=33%  Similarity=0.384  Sum_probs=10.9

Q ss_pred             CHHHHHHHHHHHHHHHHH-HHHHCC
Q ss_conf             935777899999999998-875315
Q T0530             1 MKKAMAILAVLAAAAVIC-GLLFFH   24 (115)
Q Consensus         1 MKK~l~ii~i~~v~~~~~-~~~~~~   24 (115)
                      |||++.+++++++++++. +++++.
T Consensus         1 kK~li~~i~~~ll~~~~~~~~~~~~   25 (149)
T PF03748_consen    1 KKKLIIIIVALLLLIVGAGGGYFFF   25 (149)
T ss_pred             CCHHHHHHHHHHHHHHHHHHHHHHH
T ss_conf             9437999999999999999999987

No 4
>PF07901 DUF1672: Protein of unknown function (DUF1672); InterPro: IPR012873 This family is composed of hypothetical bacterial proteins of unknown function.
Probab=58.96  E-value=1.6  Score=20.46  Aligned_cols=19  Identities=26%  Similarity=0.226  Sum_probs=12.3

Q ss_pred             CHHHHHHHHHHHHHHHHHH
Q ss_conf             9357778999999999988
Q T0530             1 MKKAMAILAVLAAAAVICG   19 (115)
Q Consensus         1 MKK~l~ii~i~~v~~~~~~   19 (115)
                      |+|++.+|++.++++.+|+
T Consensus         1 M~K~i~~ll~~~lLLgGCs   19 (304)
T PF07901_consen    1 MKKRIISLLAATLLLGGCS   19 (304)
T ss_pred             CHHHHHHHHHHHHHHCCCC
T ss_conf             9148999999999974444

No 5
>PF05170 AsmA: AsmA family; InterPro: IPR007844 The AsmA protein is involved in the assembly of outer membrane proteins in Escherichia coli. AsmA mutations were isolated as extragenic suppressors of an OmpF assembly mutant. AsmA may have a role in LPS biogenesis.
Probab=54.69  E-value=2.1  Score=19.65  Aligned_cols=17  Identities=35%  Similarity=0.544  Sum_probs=9.2

Q ss_pred             CHHHHHHHHHHHHHHHH
Q ss_conf             93577789999999999
Q T0530             1 MKKAMAILAVLAAAAVI   17 (115)
Q Consensus         1 MKK~l~ii~i~~v~~~~   17 (115)
                      |||++.+++++++++++
T Consensus         1 Mkk~lkil~~~l~~lvl   17 (604)
T PF05170_consen    1 MKKLLKILLIILIVLVL   17 (604)
T ss_pred             CCHHHHHHHHHHHHHHH
T ss_conf             94799999999999999

No 6
>PF07423 DUF1510: Protein of unknown function (DUF1510); InterPro: IPR009988 This family consists of several hypothetical bacterial proteins of around 200 residues in length. The function of this family is unknown.
Probab=47.00  E-value=6.1  Score=16.97  Aligned_cols=20  Identities=40%  Similarity=0.555  Sum_probs=7.4

Q ss_pred             HHHHHHHHHHHHHHHHHCCC
Q ss_conf             78999999999988753156
Q T0530             6 AILAVLAAAAVICGLLFFHN   25 (115)
Q Consensus         6 ~ii~i~~v~~~~~~~~~~~~   25 (115)
                      +|++++++++++++.+|+.+
T Consensus        19 aI~IV~lLIiiv~~~lff~~   38 (217)
T PF07423_consen   19 AIIIVLLLIIIVAYQLFFGN   38 (217)
T ss_pred             HHHHHHHHHHHHHHHHHCCC
T ss_conf             99999999999988851347

No 7
>PF06097 DUF945: Bacterial protein of unknown function (DUF945); InterPro: IPR010352 This family consists of several hypothetical bacterial proteins of unknown function.
Probab=44.67  E-value=6.1  Score=16.98  Aligned_cols=29  Identities=10%  Similarity=0.140  Sum_probs=13.6

Q ss_pred             EEEEEEEECCCC------CEEEEEECC---CCCCCCCC
Q ss_conf             378775375788------568999703---88776566
Q T0530            56 TEYTLDGYNASG------KKEEVTFFA---GKELRKNA   84 (115)
Q Consensus        56 y~Y~l~~yd~~G------~~k~i~f~~---~~~L~~~a   84 (115)
                      -.|.+..-+...      ...++.|..   +++++-++
T Consensus        62 a~~~l~~~~~~~~~~~~~~~~~i~~~~~I~HGP~p~~~   99 (459)
T PF06097_consen   62 ATYRLTVDDPALGNQLKAPPQSIVFNSDISHGPFPLSS   99 (459)
T ss_pred             EEEEEEECCCCCCCCCCCCCCEEEEEEEEECCCCCCCC
T ss_conf             99999982644333113788079998788817866311

No 8
>PF05404 TRAP-delta: Translocon-associated protein, delta subunit precursor (TRAP-delta); InterPro: IPR008855 This family consists of several eukaryotic translocon-associated protein, delta subunit precursors (TRAP-delta or SSR-delta). The exact function of this protein is unknown.; GO: 0005783 endoplasmic reticulum, 0016021 integral to membrane
Probab=43.45  E-value=8.2  Score=16.24  Aligned_cols=14  Identities=14%  Similarity=0.366  Sum_probs=10.2

Q ss_pred             CEEEEEEEECCCCC
Q ss_conf             43787753757885
Q T0530            55 GTEYTLDGYNASGK   68 (115)
Q Consensus        55 ~y~Y~l~~yd~~G~   68 (115)
                      .-+|.+..|||+|=
T Consensus        93 sG~y~V~~fDEegy  106 (167)
T PF05404_consen   93 SGTYQVRFFDEEGY  106 (167)
T ss_pred             CCCEEEEEECHHHH
T ss_conf             88579999672889

No 9
>PF08085 Entericidin: Entericidin EcnA/B family; InterPro: IPR012556 This family consists of the entericidin antidote/toxin peptides. The entericidin locus is activated in stationary phase under high osmolarity conditions by rho-S and simultaneously repressed by the osmoregulatory EnvZ/OmpR signal transduction pathway. The entericidin locus encodes tandem paralogous genes (ecnAB) and directs the synthesis of two small cell-envelope lipoproteins which can maintain plasmids in bacterial population by means of post-segregational killing.; GO: 0009636 response to toxin, 0016020 membrane
Probab=43.29  E-value=6.6  Score=16.79  Aligned_cols=21  Identities=33%  Similarity=0.284  Sum_probs=11.1

Q ss_pred             CHHHHHHHHHHHHHHHHHHHH
Q ss_conf             935777899999999998875
Q T0530             1 MKKAMAILAVLAAAAVICGLL   21 (115)
Q Consensus         1 MKK~l~ii~i~~v~~~~~~~~   21 (115)
                      |||.+.++.+++.......+|
T Consensus         1 Mkk~~~~~~~~~~~~~~l~gC   21 (42)
T PF08085_consen    1 MKKKILIILALLALALALAGC   21 (42)
T ss_pred             CCHHHHHHHHHHHHHHHHHHH
T ss_conf             955899999999999988002

No 10
>PF06510 DUF1102: Protein of unknown function (DUF1102); InterPro: IPR009482 This family consists of several hypothetical archaeal proteins of unknown function.
Probab=42.69  E-value=8.4  Score=16.17  Aligned_cols=15  Identities=40%  Similarity=0.492  Sum_probs=9.1

Q ss_pred             CHHHHHHHHHHHHHH
Q ss_conf             935777899999999
Q T0530             1 MKKAMAILAVLAAAA   15 (115)
Q Consensus         1 MKK~l~ii~i~~v~~   15 (115)
                      |||+++++++++.+.
T Consensus         1 M~kligi~~l~~g~~   15 (185)
T PF06510_consen    1 MKKLIGILLLAAGLG   15 (185)
T ss_pred             CHHHHHHHHHHHHHH
T ss_conf             926899999999877

No 11
>PF02402 Lysis_col: Lysis protein; InterPro: IPR003059 The DNA sequence of the entire colicin E2 operon has been determined. The operon comprises the colicin activity gene (ceaB), the colicin immunity gene (ceiB) and the lysis gene (celB), which is essential for colicin release from producing cells. A putative LexA binding site is located upstream from ceaB, and a rho-independent terminator structure is located downstream from celB. Comparison of the amino acid sequences of colicin E2 and cloacin DF13 reveals extensive similarity. These colicins have different modes of action and recognise different cell surface receptors; the two major regions of heterology, at the C-terminus and in the C-terminal end of the central region, are thought to correspond to the catalytic and receptor-recognition domains, respectively. Sequence similarities between colicins E2, A and E1 are less striking. The colicin E2 (pyocin) immunity protein does not share similarity with either the colicin E3 or cloacin DF13 immunity proteins. By contrast, the lysis proteins of the ColE2, ColE1 and CloDF13 plasmids are almost identical except in the N-terminal regions, which themselves are similar to lipoprotein signal peptides. Processing of the ColE2 prolysis protein to the mature form is prevented by globomycin, a specific inhibitor of the lipoprotein signal peptidase. The mature ColE2 lysis protein is located in the cell envelope.; GO: 0009405 pathogenesis, 0019835 cytolysis, 0019867 outer membrane
Probab=37.36  E-value=4.5  Score=17.77  Aligned_cols=11  Identities=45%  Similarity=0.455  Sum_probs=6.7

Q ss_pred             CHHHHHHHHHH
Q ss_conf             93577789999
Q T0530             1 MKKAMAILAVL   11 (115)
Q Consensus         1 MKK~l~ii~i~   11 (115)
                      |||++.++.++
T Consensus         1 MkKi~~~~i~~   11 (46)
T PF02402_consen    1 MKKILFIGILL   11 (46)
T ss_pred             CCEEEEEHHHH
T ss_conf             94787738999

No 12
>PF11153 DUF2931: Protein of unknown function (DUF2931)
Probab=35.37  E-value=11  Score=15.49  Aligned_cols=33  Identities=24%  Similarity=0.468  Sum_probs=20.2

Q ss_pred             CCCCCCCCCEEEEEE---CCCEECCCEEECHHHCCHHHHHHC
Q ss_conf             388776566389986---582311420331530356989962
Q T0530            76 AGKELRKNAYLKVKA---KGKYVETWEEVKFEDMPDSVQSKL  114 (115)
Q Consensus        76 ~~~~L~~~aYlki~~---k~~~V~s~eeV~~~~iP~ka~~kL  114 (115)
                      -.++|+.-.++.=.+   +..+.+.+      +||+..++++
T Consensus        91 ~~~~lPd~i~i~W~Sl~DkK~Y~~~i------~ip~~~~~~M  126 (216)
T PF11153_consen   91 KGKPLPDRIYICWVSLIDKKFYETKI------DIPEELRQRM  126 (216)
T ss_pred             CCCCCCCEEEEEEEECCCCCEEEEEE------ECCHHHHHHH
T ss_conf             66789967999999813551789999------8799999975

No 13
>PF11839 DUF3359: Protein of unknown function (DUF3359)
Probab=34.79  E-value=10  Score=15.74  Aligned_cols=22  Identities=32%  Similarity=0.313  Sum_probs=12.1

Q ss_pred             CHHHHHHHHHHHHHHHHHHHHHCC
Q ss_conf             935777899999999998875315
Q T0530             1 MKKAMAILAVLAAAAVICGLLFFH   24 (115)
Q Consensus         1 MKK~l~ii~i~~v~~~~~~~~~~~   24 (115)
                      |||++  +..++..+++.+||-..
T Consensus         1 M~~~l--~s~~~~~~~L~~GCAs~   22 (96)
T PF11839_consen    1 MKKLL--ISALALAALLAAGCAST   22 (96)
T ss_pred             CCHHH--HHHHHHHHHHHHHCCCC
T ss_conf             90599--99999999998572688

No 14
>PF10614 Tafi-CsgF: Curli production assembly/transport component CsgF
Probab=32.64  E-value=9.5  Score=15.87  Aligned_cols=34  Identities=26%  Similarity=0.156  Sum_probs=13.5

Q ss_pred             CHHHHHHHHHHHHHHHHHHHHHCCCCCCCCCCCCCC
Q ss_conf             935777899999999998875315654443254328
Q T0530             1 MKKAMAILAVLAAAAVICGLLFFHNDVTDRFNPFIH   36 (115)
Q Consensus         1 MKK~l~ii~i~~v~~~~~~~~~~~~~~~d~~np~~~   36 (115)
                      ||+++.+.+++++++  ...++...-.+...||.-+
T Consensus         1 mk~~~l~a~l~~~~~--a~~a~A~~LVY~PvNPsFG   34 (142)
T PF10614_consen    1 MKYRGLLAVLLLLAA--AGPAQAQELVYTPVNPSFG   34 (142)
T ss_pred             CCCHHHHHHHHHHHH--CCCCCHHHEEECCCCCCCC
T ss_conf             916399999999982--5500042225514699989

No 15
>PF10969 DUF2771: Protein of unknown function (DUF2771)
Probab=32.34  E-value=12  Score=15.19  Aligned_cols=15  Identities=13%  Similarity=0.419  Sum_probs=8.5

Q ss_pred             HHHHHHHHHHHHHHH
Q ss_conf             357778999999999
Q T0530             2 KKAMAILAVLAAAAV   16 (115)
Q Consensus         2 KK~l~ii~i~~v~~~   16 (115)
                      ||++++|++++++++
T Consensus         1 ~~~~a~i~a~vvV~~   15 (161)
T PF10969_consen    1 KRILALIVAVVVVLL   15 (161)
T ss_pred             CCCHHHHHHHHHHHH
T ss_conf             900146477666533

No 16
>PF06291 Lambda_Bor: Bor protein; InterPro: IPR010438 This family consists of several Bacteriophage lambda Bor and Escherichia coli Iss proteins. Expression of bor significantly increases the survival of the E. coli host cell in animal serum. This property is a well known bacterial virulence determinant; indeed, bor and its adjacent sequences are highly homologous to the iss serum resistance locus of the plasmid ColV2-K94, which confers virulence in animals. It has been suggested that lysogeny may generally have a role in bacterial survival in animal hosts, and perhaps in pathogenesis.
Probab=30.81  E-value=7  Score=16.65  Aligned_cols=18  Identities=33%  Similarity=0.372  Sum_probs=10.7

Q ss_pred             CHHHHHHHHHHHHHHHHHH
Q ss_conf             9357778999999999988
Q T0530             1 MKKAMAILAVLAAAAVICG   19 (115)
Q Consensus         1 MKK~l~ii~i~~v~~~~~~   19 (115)
                      |||++.++++ ++++.+|.
T Consensus         1 mkk~ll~~~l-~llltgCa   18 (97)
T PF06291_consen    1 MKKILLAAAL-ALLLTGCA   18 (97)
T ss_pred             CHHHHHHHHH-HHHHCCCC
T ss_conf             9004999999-99964566

No 17
>PF05079 DUF680: Protein of unknown function (DUF680); InterPro: IPR007771 This family contains uncharacterised proteins which seem to be found exclusively in Mesorhizobium loti.
Probab=29.55  E-value=14  Score=14.93  Aligned_cols=24  Identities=38%  Similarity=0.384  Sum_probs=11.3

Q ss_pred             CHHHHHHHHHHHHHHHHHHHHHCCCCC
Q ss_conf             935777899999999998875315654
Q T0530             1 MKKAMAILAVLAAAAVICGLLFFHNDV   27 (115)
Q Consensus         1 MKK~l~ii~i~~v~~~~~~~~~~~~~~   27 (115)
                      |||++...+.+   +++.+..|-+++.
T Consensus         1 MkKi~L~aaA~---l~~sgsAFAgs~~   24 (75)
T PF05079_consen    1 MKKIALTAAAL---LLISGSAFAGSDH   24 (75)
T ss_pred             CHHHHHHHHHH---HHHHHHHHCCCCC
T ss_conf             91699999999---9965376518988

No 18
>PF01203 GSPII_N: Bacterial type II secretion system protein N; InterPro: IPR000645 The secretion pathway (GSP) for the export of proteins (also called the type II pathway) requires a number of protein components. One of them is known as the 'N' protein and has been sequenced in a variety of bacteria such as Aeromonas hydrophila (gene exeN); Erwinia carotovora (gene outN); Klebsiella pneumoniae (gene pulN); or Vibrio cholerae (gene epsN). The size of the 'N' protein is around 250 amino acids. It apparently contains a single transmembrane domain located in the N-terminal section. The short N-terminal domain is predicted to be cytoplasmic and the large C-terminal domain periplasmic.; GO: 0008565 protein transporter activity, 0015628 protein secretion by the type II secretion system, 0015627 type II protein secretion system complex
Probab=27.78  E-value=15  Score=14.73  Aligned_cols=20  Identities=15%  Similarity=0.232  Sum_probs=10.7

Q ss_pred             CHHHHHHHHHHHHHHHHHHH
Q ss_conf             93577789999999999887
Q T0530             1 MKKAMAILAVLAAAAVICGL   20 (115)
Q Consensus         1 MKK~l~ii~i~~v~~~~~~~   20 (115)
                      ||+.+..+++++++.+++..
T Consensus         1 Mkr~~~~~~~~~~~~l~~Lv   20 (252)
T PF01203_consen    1 MKRRILWILLFLLAYLVFLV   20 (252)
T ss_pred             CCHHHHHHHHHHHHHHHHHH
T ss_conf             93249999999999999999

No 19
>PF07009 DUF1312: Protein of unknown function (DUF1312); InterPro: IPR010739 This family consists of several bacterial proteins of around 120 residues in length. The function of this family is unknown.
Probab=27.21  E-value=15  Score=14.67  Aligned_cols=16  Identities=25%  Similarity=0.397  Sum_probs=11.6

Q ss_pred             CCCEEEEEECCCCCCC
Q ss_conf             8842589983887222
Q T0530            36 HQQDVYVQIDRDGRHL   51 (115)
Q Consensus        36 ~~~~~Y~~i~~~g~~~   51 (115)
                      .....++.|.-+|+..
T Consensus        23 ~~~~~~avI~~~gk~~   38 (113)
T PF07009_consen   23 SGGGKYAVIYVDGKEI   38 (113)
T ss_pred             CCCCEEEEEEECCEEE
T ss_conf             8997499999999999

No 20
>PF06316 Ail_Lom: Enterobacterial Ail/Lom protein; InterPro: IPR000758 Virulence-related outer membrane proteins are expressed in Gram-negative bacteria and are essential to bacterial survival within macrophages and for eukaryotic cell invasion. Members of this group include: PagC, required by Salmonella typhimurium for survival in macrophages and for virulence in mice; Rck outer membrane protein of the S. typhimurium virulence plasmid; Ail, a product of the Yersinia enterocolitica chromosome capable of mediating bacterial adherence to and invasion of epithelial cell lines; OmpX from Escherichia coli, which promotes adhesion to and entry into mammalian cells and also has a role in the resistance against attack by the human complement system; and a Bacteriophage lambda outer membrane protein, Lom. The crystal structure of OmpX from E. coli reveals that OmpX consists of an eight-stranded antiparallel all-next-neighbour beta barrel. The structure shows two girdles of aromatic amino acid residues and a ribbon of nonpolar residues that attach to the membrane interior. The core of the barrel consists of an extended hydrogen-bonding network of highly conserved residues. OmpX thus resembles an inverse micelle. The OmpX structure shows that the membrane-spanning part of the protein is much better conserved than the extracellular loops. Moreover, these loops form a protruding beta sheet, the edge of which presumably binds to external proteins. It is suggested that this type of binding promotes cell adhesion and invasion and helps defend against the complement system. Although OmpX has the same beta-sheet topology as the structurally related outer membrane protein A (OmpA; IPR000498 from INTERPRO), their barrels differ with respect to the shear numbers and internal hydrogen-bonding networks.; GO: 0009279 cell outer membrane; PDB: 1q9f_A 1qj9_A 1q9g_A 1orm_A 1qj8_A.
Probab=27.01  E-value=6.9  Score=16.68  Aligned_cols=20  Identities=25%  Similarity=0.413  Sum_probs=14.8

Q ss_pred             EEEECCCCCCCCCCEEEEEE
Q ss_conf             99970388776566389986
Q T0530            71 EVTFFAGKELRKNAYLKVKA   90 (115)
Q Consensus        71 ~i~f~~~~~L~~~aYlki~~   90 (115)
                      =..++++.-+|-|.|+-+|.
T Consensus        96 YySlmaGPsyR~NdyvS~Yg  115 (199)
T PF06316_consen   96 YYSLMAGPSYRFNDYVSLYG  115 (199)
T ss_dssp             E----B---EECCTTEEE--
T ss_pred             EEEEEECCEEEECCEEEHHE
T ss_conf             89996266088531040014

No 21
>PF10828 DUF2570: Protein of unknown function (DUF2570)
Probab=26.42  E-value=16  Score=14.58  Aligned_cols=31  Identities=13%  Similarity=0.310  Sum_probs=15.7

Q ss_pred             CHHHHHHHHHHHHHHHHHHHHHCCCCCCCCCC
Q ss_conf             93577789999999999887531565444325
Q T0530             1 MKKAMAILAVLAAAAVICGLLFFHNDVTDRFN   32 (115)
Q Consensus         1 MKK~l~ii~i~~v~~~~~~~~~~~~~~~d~~n   32 (115)
                      |++++.+.+.++++++++ ..++-+..+|+++
T Consensus         1 ~~~~~~~~l~~liv~l~~-~~~~qs~~id~L~   31 (110)
T PF10828_consen    1 MTRYIYIALAFLIVGLCG-WIWYQSSRIDRLR   31 (110)
T ss_pred             CCHHHHHHHHHHHHHHHH-HHHHHHHHHHHHH
T ss_conf             917999999999999999-9999999999999

No 22
>PF06551 DUF1120: Protein of unknown function (DUF1120); InterPro: IPR010546 This family consists of several bacterial proteins, at least one of which is involved in enzyme induction following nitrogen deprivation. The exact function of this family is unknown.
Probab=26.25  E-value=16  Score=14.56  Aligned_cols=17  Identities=29%  Similarity=0.397  Sum_probs=9.6

Q ss_pred             CHHHHHHHHHHHHHHHH
Q ss_conf             93577789999999999
Q T0530             1 MKKAMAILAVLAAAAVI   17 (115)
Q Consensus         1 MKK~l~ii~i~~v~~~~   17 (115)
                      |||.++..++++.++++
T Consensus         1 MKK~l~~~~l~a~l~~~   17 (145)
T PF06551_consen    1 MKKNLLATLLLASLLLL   17 (145)
T ss_pred             CCHHHHHHHHHHHHHHH
T ss_conf             93679999999999986

No 23
>PF10855 DUF2648: Protein of unknown function (DUF2648)
Probab=23.60  E-value=12  Score=15.19  Aligned_cols=10  Identities=50%  Similarity=0.610  Sum_probs=5.8

Q ss_pred             CHHHHHHHHH
Q ss_conf             9357778999
Q T0530             1 MKKAMAILAV   10 (115)
Q Consensus         1 MKK~l~ii~i   10 (115)
                      |||...++++
T Consensus         1 MkkL~iiL~i   10 (33)
T PF10855_consen    1 MKKLAIILII   10 (33)
T ss_pred             CCCEEEHHHH
T ss_conf             9634425988

No 24
>PF10694 DUF2500: Protein of unknown function (DUF2500)
Probab=23.34  E-value=18  Score=14.23  Aligned_cols=45  Identities=16%  Similarity=0.096  Sum_probs=33.5

Q ss_pred             CCEEEEEEEECCCCCEEEEEECCC--CCCCCCCEEEEEECCCEECCC
Q ss_conf             743787753757885689997038--877656638998658231142
Q T0530            54 GGTEYTLDGYNASGKKEEVTFFAG--KELRKNAYLKVKAKGKYVETW   98 (115)
Q Consensus        54 ~~y~Y~l~~yd~~G~~k~i~f~~~--~~L~~~aYlki~~k~~~V~s~   98 (115)
                      .+-.|-+..=-++|.++++..+..  +.|.+|.-=.|..+|..-.+.
T Consensus        62 ~~~~YyitF~~~~G~~~ef~V~~~~Y~~l~eGd~G~Lt~QGtrf~~F  108 (110)
T PF10694_consen   62 ESTRYYITFEFESGGRIEFRVSGHEYGQLAEGDKGTLTYQGTRFLGF  108 (110)
T ss_pred             CCEEEEEEEEECCCCEEEEEECHHHHCCCCCCCEEEEEECCCEEEEE
T ss_conf             40699999996789779999898995778999888899836788553

No 25
>PF09403 FadA: Adhesion protein FadA; PDB: 3etz_A 3etx_C 2gl2_A 3ety_A 3etw_A.
Probab=22.61  E-value=6.9  Score=16.68  Aligned_cols=19  Identities=21%  Similarity=0.151  Sum_probs=8.8

Q ss_pred             CHHHHHHHHHHHHHHHHHHHH
Q ss_conf             935777899999999998875
Q T0530             1 MKKAMAILAVLAAAAVICGLL   21 (115)
Q Consensus         1 MKK~l~ii~i~~v~~~~~~~~   21 (115)
                      |||++  |+.++++..+++..
T Consensus         1 MKKi~--L~~ml~lss~sfAa   19 (126)
T PF09403_consen    1 MKKIL--LCGMLLLSSLSFAA   19 (126)
T ss_dssp             ---------------------
T ss_pred             CCHHH--HHHHHHHHHHHHHH
T ss_conf             90589--99999999999762

No 26
>PF02920 Integrase_DNA: DNA binding domain of tn916 integrase; InterPro: IPR004191 The integrase family of site-specific recombinases catalyzes a diverse array of DNA rearrangements in archaebacteria, eubacteria and yeast. The structure of the DNA binding domain of the conjugative transposon Tn916 integrase protein was determined using NMR spectroscopy. The N-terminal domain was found to be structurally similar to the double stranded RNA binding domain (dsRBD). Experimental evidence suggests that the integrase protein interacts with DNA using residues located on the face of its three stranded beta-sheet.; GO: 0003677 DNA binding, 0008907 integrase activity, 0015074 DNA integration, 0005634 nucleus; PDB: 1tn9_A 1b69_A 2bb8_A 1bb8_A.
Probab=21.41  E-value=14  Score=14.94  Aligned_cols=35  Identities=20%  Similarity=0.387  Sum_probs=25.9

Q ss_pred             CCCCCCEEEEEE----CCCEECCCEEECHHHCCHHHHHH
Q ss_conf             776566389986----58231142033153035698996
Q T0530            79 ELRKNAYLKVKA----KGKYVETWEEVKFEDMPDSVQSK  113 (115)
Q Consensus        79 ~L~~~aYlki~~----k~~~V~s~eeV~~~~iP~ka~~k  113 (115)
                      ..+.|.|+.-++    +.+.|-||--|..+..|..-++.
T Consensus        17 q~kdgRy~ykyvd~~g~~r~vYsWKL~~td~~~~gk~~~   55 (67)
T PF02920_consen   17 QRKDGRYLYKYVDELGKERFVYSWKLVPTDKEPAGKRDC   55 (67)
T ss_dssp             E-----EEEEEE-----EEEEEES-SSTTS---------
T ss_pred             HHHHHHHHHHHHHHCCCEEEEEEEEEECCCCCHHHHHHH
T ss_conf             776574999976114880158899864177326566664

No 27
>PF05342 Peptidase_M26_N: M26 IgA1-specific Metallo-endopeptidase N-terminal region; InterPro: IPR008006 Metalloproteases are the most diverse of the four main types of protease, with more than 50 families identified to date. In these enzymes, a divalent cation, usually zinc, activates the water molecule. The metal ion is held in place by amino acid ligands, usually three in number. The known metal ligands are His, Glu, Asp or Lys, and at least one other residue, which may play an electrophilic role, is required for catalysis. Of the known metalloproteases, around half contain an HEXXH motif, which has been shown in crystallographic studies to form part of the metal-binding site. The HEXXH motif is relatively common, but can be more stringently defined for metalloproteases as 'abXHEbbHbc', where 'a' is most often valine or threonine and forms part of the S1' subsite in thermolysin and neprilysin, 'b' is an uncharged residue, and 'c' a hydrophobic residue. Proline is never found in this site, possibly because it would break the helical structure adopted by this motif in metalloproteases. Peptidases are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry. Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. This group of metallopeptidases corresponds to MEROPS peptidase family M26 (clan MA(E)). The active site residues for members of this family and family M4 occur in the motif HEXXH. The type example is IgA1-specific metalloendopeptidase from Streptococcus sanguinis (Q59986 from SWISSPROT).
Probab=20.63  E-value=20  Score=13.91  Aligned_cols=46  Identities=30%  Similarity=0.300  Sum_probs=33.5

Q ss_pred             EEEEECCCCCEEEEEECCCCC-CCCCCEEEEEECC-C----EECCCEEECHHH
Q ss_conf             775375788568999703887-7656638998658-2----311420331530
Q T0530            59 TLDGYNASGKKEEVTFFAGKE-LRKNAYLKVKAKG-K----YVETWEEVKFED  105 (115)
Q Consensus        59 ~l~~yd~~G~~k~i~f~~~~~-L~~~aYlki~~k~-~----~V~s~eeV~~~~  105 (115)
                      .|-. .++|.++.+.+-+..| ...+=|||+.+.. +    -|.|+||+.++.
T Consensus        49 ~Ly~-~eng~~~~~~~L~~~P~d~~~yylKV~S~~~Kd~~LpV~sIee~t~dg  100 (250)
T PF05342_consen   49 ELYK-QENGTYKQVTSLSEVPSDLSNYYLKVTSSDFKDVYLPVSSIEEVTKDG  100 (250)
T ss_pred             EEEE-CCCCCEEEEEEHHCCCCCHHHCEEEEECCCCCCEEEEEEEEEEEEECC
T ss_conf             7789-259928866733138888568089998066861898765568887659

Done!