Query         T0571 ZP_02066433.1, Bacteroides ovatus ATCC 8483, 344 residues
Match_columns 344
No_of_seqs    153 out of 205
Neff          7.2
Searched_HMMs 11830
Date          Fri Jun 4 14:34:21 2010
Command       /home/syshi_2/2008/ferredoxin/manualcheck/update/HHsearch/bin/hhsearch -i /home/syshi_3/CASP9/HHsearch4Targetseq/seq/T0571.hhm -d /home/syshi_2/2008/ferredoxin/manualcheck/update/HHsearch/database/pfamA_24_hhmdb -o /home/syshi_3/CASP9/HHsearch4Targetseq/pfamAsearch/T0571.hhr

 No Hit                             Prob E-value P-value Score   SS Cols Query HMM Template HMM
  1 PF08522 DUF1735: Domain of un   99.0   1E-10 8.6E-15  89.8  5.4   59    91-151    1-59 (59)
  2 PF12262 Lipase_bact_N: Bacter   83.4    0.31 2.6E-05  23.4  2.7   25      1-26    1-25 (268)
  3 PF08085 Entericidin: Enterici   79.9    0.37 3.2E-05  22.8  2.1   21      1-21    1-22 (42)
  4 PF02402 Lysis_col: Lysis prot   73.9    0.28 2.3E-05  23.7  0.1   21      1-22    1-21 (46)
  5 PF01441 Lipoprotein_6: Lipopr   73.6    0.28 2.3E-05  23.7  0.0   24      1-25    1-24 (210)
  6 PF07901 DUF1672: Protein of u   73.4    0.73 6.1E-05  20.8  2.1   23      1-25    1-23 (304)
  7 PF12245 DUF3607: Protein of u   68.6     2.2 0.00019  17.4  6.8   80    52-134  51-134 (724)
  8 PF03160 Calx-beta: Calx-beta    65.0     2.6 0.00022  17.0  7.8   75    60-152  26-100 (101)
  9 PF03304 Mlp: Mlp lipoprotein    64.6     1.2  0.0001  19.3  1.7   22      1-24    1-22 (181)
 10 PF06085 Rz1: Lipoprotein Rz1    61.6     1.8 0.00015  18.1  2.2   23      1-23    1-23 (59)
 11 PF06291 Lambda_Bor: Bor prote   61.4     1.5 0.00013  18.5  1.8   19      1-22    1-19 (97)
 12 PF07273 DUF1439: Protein of u   58.0     0.9 7.6E-05  20.2  0.1   20      1-22    1-20 (177)
 13 PF10566 Glyco_hydro_97: Glyco   57.4     3.5 0.00029  16.1  3.3   22   128-153 121-142 (643)
 14 PF10671 TcpQ: Toxin co-regula   55.7     1.3 0.00011  19.0  0.6   19      1-22    1-19 (169)
 15 PF01298 Lipoprotein_5: Transf   54.6     1.8 0.00015  18.1  1.2   21      1-21    1-22 (593)
 16 PF02030 Lipoprotein_8: Hypoth   50.4     3.9 0.00033  15.7  2.4   24      1-24    1-27 (493)
 17 PF08139 LPAM_1: Prokaryotic m   49.4     3.2 0.00027  16.3  1.8   18      2-22    9-26 (26)
 18 PF11153 DUF2931: Protein of u   48.7     4.7  0.0004  15.1  5.4   16   141-156 105-120 (216)
 19 PF07172 GRP: Glycine rich pro   48.1     3.6  0.0003  15.9  1.9   21      1-21    1-22 (95)
 20 PF06474 MLTD_N: MLTD_N; Inte    47.7     3.6  0.0003  16.0  1.8   21      1-25    1-21 (95)
 21 PF03207 OspD: Borrelia outer    45.9     4.6 0.00039  15.2  2.2   24      1-24    1-24 (254)
 22 PF11810 DUF3332: Domain of un   45.1     4.8  0.0004  15.1  2.1   19      2-20    2-20 (176)
 23 PF09160 FimH_man-bind: FimH,    45.0     5.3 0.00045  14.7  2.7   33   100-132  95-130 (150)
 24 PF12099 DUF3575: Protein of u   44.9     5.3 0.00045  14.7  6.7   19      1-20    1-19 (189)
 25 PF11839 DUF3359: Protein of u   44.7     4.6 0.00039  15.2  2.0   24      1-25    1-24 (96)
 26 PF12092 DUF3568: Protein of u   42.9     5.2 0.00044  14.8  2.1   20      2-21    1-20 (131)
 27 PF03978 Borrelia_REV: Borreli   41.3       4 0.00034  15.6  1.3   23      1-23    1-23 (160)
 28 PF06551 DUF1120: Protein of u   40.7     6.2 0.00052  14.3  2.2   19      1-20    1-19 (145)
 29 PF06788 UPF0257: Uncharacteri   40.1       5 0.00042  14.9  1.6   23      1-26    1-23 (236)
 30 PF09403 FadA: Adhesion protei   37.3     2.9 0.00024  16.6  0.0   20      1-20    1-20 (126)
 31 PF04449 Fimbrial_CS1: CS1 typ   35.8     7.3 0.00062  13.8  4.7   14     77-90   54-67 (143)
 32 PF03032 Brevenin: Brevenin/es   34.5     6.1 0.00051  14.3  1.3   27      1-28    3-29 (46)
 33 PF01203 GSPII_N: Bacterial ty   33.9     7.8 0.00066  13.6  2.1   15      1-15    1-15 (252)
 34 PF06649 DUF1161: Protein of u   31.9     7.6 0.00064  13.7  1.5   27      1-27    1-31 (75)
 35 PF04507 DUF576: Protein of un   31.8     8.4 0.00071  13.3  2.1   21      2-22    5-25 (257)
 36 PF11254 DUF3053: Protein of u   30.8     8.7 0.00074  13.2  2.1   18      8-25    7-24 (229)
 37 PF09716 ETRAMP: Malarial earl   28.8     9.4  0.0008  13.0  2.5   22      1-22    1-24 (84)
 38 PF05628 Borrelia_P13: Borreli   28.5     8.3  0.0007  13.4  1.2   21      1-21    1-21 (167)
 39 PF10626 TraO: Conjugative tra   26.9     7.6 0.00064  13.7  0.7   31   143-173 105-135 (193)
 40 PF07867 DUF1654: Protein of u   26.9      10 0.00086  12.8  2.1   19   257-275   49-67 (73)
 41 PF06646 Mycoplasma_p37: High    26.8     5.4 0.00046  14.7  0.0   26      1-26    1-30 (386)
 42 PF09710 Trep_dent_lipo: Trepo   26.6     5.8 0.00049  14.5  0.1   30   181-213 206-235 (394)
 43 PF05540 Serpulina_VSP: Serpul   26.1      10 0.00089  12.7  1.5   18   142-159 128-145 (377)
 44 PF06030 DUF916: Bacterial pro   25.4      11 0.00091  12.6  6.1   82    52-135  17-106 (122)
 45 PF03082 MAGSP: Male accessory   24.3      11 0.00096  12.5  1.6   22      1-22    1-22 (264)
 46 PF07996 T4SS: Type IV secreti   23.4     6.9 0.00058  14.0  0.0   20      1-20    1-20 (217)
 47 PF05643 DUF799: Putative bact   22.6      12   0.001  12.2  2.4   18      1-21    1-18 (215)
 48 PF00820 Lipoprotein_1: Borrel   22.6      12   0.001  12.2  3.0  104     1-127   1-122 (273)
 49 PF11873 DUF3393: Domain of un   22.6      10 0.00087  12.7  0.8   16      8-23    4-19 (204)
 50 PF06280 DUF1034: Fn3-like dom   22.6      12   0.001  12.2  3.9   24   110-134   56-79 (112)
 51 PF06873 SerH: Cell surface im   22.3      12   0.001  12.2  1.9   12      1-12    1-12 (403)
 52 PF01514 YscJ_FliF: Secretory    22.1     7.6 0.00064  13.7  0.0   22      1-22    1-27 (206)

No 1
>PF08522 DUF1735: Domain of unknown function (DUF1735); InterPro: IPR013728 This domain of unknown function is found in a number of Bacteroidetes proteins including acylhydrolases.
Probab=99.04 E-value=1e-10 Score=89.76 Aligned_cols=59 Identities=24% Similarity=0.418 Sum_probs=50.7

Q ss_pred             HHHHHHHHCCCEEEECCCCCEEECCCCEEECCCCEEEEEEEEEEECCCCCCCCCCCEEEEE
Q ss_conf             8765211118666885864166367507974776257667998412156666778725889
Q T0571            91 CDNLYFKDTDQPLVPMPASYYTLASDRIAIPKGQIMAGVEVQLTDDFFADEKSISENYVIP  151 (344)
Q Consensus        91 L~~~YN~~~~t~Y~~LP~~~Ysl~~~~v~I~aGe~~~~v~i~~~~~~~~~~l~~~~~YvLP  151 (344)
                      |+ .||++|+++|++||+++|+|++.+++|+||+..+.++|+|+...+. .|+.+.+||||
T Consensus         1 l~-~YN~~~~t~y~~LP~~~Y~l~~~~v~i~aG~~~~~~~i~v~~~~~~-~l~~~~~Y~LP   59 (59)
T PF08522_consen    1 LD-AYNAANGTDYKLLPEDCYSLPSKTVTIPAGESSSSVPITVKFKGLE-ELDPDKTYVLP   59 (59)
T ss_pred             CH-HHHHHCCCCCEECCHHHEEECCCEEEEECCCEEEEEEEEEEECCCC-CCCCCCEEECC
T ss_conf             97-6786509861888846889659879990999888666999948842-27889738369

No 2
>PF12262 Lipase_bact_N: Bacterial virulence factor lipase N-terminal
Probab=83.41 E-value=0.31 Score=23.38 Aligned_cols=25 Identities=24% Similarity=0.429 Sum_probs=16.8

Q ss_pred             CCHHHHHHHHHHHHHHHCCCCCCCCC
Q ss_conf             91358899999987763014776556
Q T0571             1 MKKNLAYIGLVLLILTWTSCESSDNE   26 (344)
Q Consensus         1 MKk~~~~l~l~~l~~l~tSC~n~e~~   26 (344)
                      |||+++.++++. ++.+++|-++...
T Consensus         1 Mkk~~l~~~ias-al~LaGCg~ds~~   25 (268)
T PF12262_consen    1 MKKKLLSLAIAS-ALGLAGCGGDSES   25 (268)
T ss_pred             CCHHHHHHHHHH-HHHCCCCCCCCCC
T ss_conf             943589999999-9751114799767

No 3
>PF08085 Entericidin: Entericidin EcnA/B family; InterPro: IPR012556 This family consists of the entericidin antidote/toxin peptides. The entericidin locus is activated in stationary phase under high osmolarity conditions by rho-S and simultaneously repressed by the osmoregulatory EnvZ/OmpR signal transduction pathway. The entericidin locus encodes tandem paralogous genes (ecnAB) and directs the synthesis of two small cell-envelope lipoproteins which can maintain plasmids in bacterial population by means of post-segregational killing .; GO: 0009636 response to toxin, 0016020 membrane
Probab=79.93 E-value=0.37 Score=22.83 Aligned_cols=21 Identities=38% Similarity=0.673 Sum_probs=14.1

Q ss_pred             CCHH-HHHHHHHHHHHHHCCCC
Q ss_conf             9135-88999999877630147
Q T0571             1 MKKN-LAYIGLVLLILTWTSCE   21 (344)
Q Consensus         1 MKk~-~~~l~l~~l~~l~tSC~   21 (344)
                      |||. +..++++++++.+++|+
T Consensus         1 Mkk~~~~~~~~~~~~~~l~gCn   22 (42)
T PF08085_consen    1 MKKKILIILALLALALALAGCN   22 (42)
T ss_pred             CCHHHHHHHHHHHHHHHHHHHH
T ss_conf             9558999999999999880026

No 4
>PF02402 Lysis_col: Lysis protein; InterPro: IPR003059 The DNA sequence of the entire colicin E2 operon has been determined . The operon comprises the colicin activity gene (ceaB), the colicin immunity gene (ceiB) and the lysis gene (celB), which is essential for colicin release from producing cells . A putative LexA binding site is located upstream from ceaB, and a rho-independent terminator structure is located downstream from celB . Comparison of the amino acid sequences of colicin E2 and cloacin DF13 reveal extensive similarity. These colicins have different modes of action and recognise different cell surface receptors; the two major regions of heterology at the C-terminus, and in the C-terminal end of the central region are thought to correspond to the catalytic and receptor-recognition domains, respectively . Sequence similarities between colicins E2, A and E1 are less striking. The colicin E2 (pyocin) immunity protein does not share similarity with either the colicin E3 or cloacin DF13 immunity proteins. By contrast, the lysis proteins of the ColE2, ColE1 and CloDF13 plasmids are almost identical except in the N-terminal regions, which themselves are similar to lipoprotein signal peptides . Processing of the ColE2 prolysis protein to the mature form is prevented by globomycin, a specific inhibitor of the lipoprotein signal peptidase . The mature ColE2 lysis protein is located in the cell envelope . ; GO: 0009405 pathogenesis, 0019835 cytolysis, 0019867 outer membrane
Probab=73.92 E-value=0.28 Score=23.74 Aligned_cols=21 Identities=43% Similarity=0.895 Sum_probs=14.4

Q ss_pred             CCHHHHHHHHHHHHHHHCCCCC
Q ss_conf             9135889999998776301477
Q T0571             1 MKKNLAYIGLVLLILTWTSCES   22 (344)
Q Consensus         1 MKk~~~~l~l~~l~~l~tSC~n   22 (344)
                      |||++ ++.++.+.+++++|+-
T Consensus         1 MkKi~-~~~i~~~~~~L~aCQa   21 (46)
T PF02402_consen    1 MKKIL-FIGILLLTMLLAACQA   21 (46)
T ss_pred             CCEEE-EEHHHHHHHHHHHHHH
T ss_conf             94787-7389999999987201

No 5
>PF01441 Lipoprotein_6: Lipoprotein This Pfam family is a subset of the Prosite family.; InterPro: IPR001800 Members of this family are lipoproteins that are probably involved in evasion of the host immune system by pathogens . They are predominantly found in the Spirochaetaceae.; GO: 0006952 defense response, 0009279 cell outer membrane; PDB: 1f1m_C 1g5z_A 2ga0_E 1yjg_E 1ggq_C.
Probab=73.58 E-value=0.28 Score=23.75 Aligned_cols=24 Identities=29% Similarity=0.563 Sum_probs=15.6

Q ss_pred             CCHHHHHHHHHHHHHHHCCCCCCCC
Q ss_conf             9135889999998776301477655
Q T0571             1 MKKNLAYIGLVLLILTWTSCESSDN   25 (344)
Q Consensus         1 MKk~~~~l~l~~l~~l~tSC~n~e~   25 (344)
                      |||+-+-. +++.+|+|.||+|.-.
T Consensus         1 Mkk~tlSa-IlMtLflfisCNNsG~   24 (210)
T PF01441_consen    1 MKKNTLSA-ILMTLFLFISCNNSGK   24 (210)
T ss_dssp             -------------------------
T ss_pred             CCHHHHHH-HHHHHHHHHHCCCCCC
T ss_conf             95137999-9999999996378887

No 6
>PF07901 DUF1672: Protein of unknown function (DUF1672); InterPro: IPR012873 This family is composed of hypothetical bacterial proteins of unknown function.
Probab=73.40 E-value=0.73 Score=20.80 Aligned_cols=23 Identities=26% Similarity=0.462 Sum_probs=15.6

Q ss_pred             CCHHHHHHHHHHHHHHHCCCCCCCC
Q ss_conf             9135889999998776301477655
Q T0571             1 MKKNLAYIGLVLLILTWTSCESSDN   25 (344)
Q Consensus         1 MKk~~~~l~l~~l~~l~tSC~n~e~   25 (344)
                      |||++.  .|+++++++++|.+-..
T Consensus         1 M~K~i~--~ll~~~lLLgGCs~m~~   23 (304)
T PF07901_consen    1 MKKRII--SLLAATLLLGGCSNMNE   23 (304)
T ss_pred             CHHHHH--HHHHHHHHHCCCCCCCC
T ss_conf             914899--99999999744446760

No 7
>PF12245 DUF3607: Protein of unknown function (DUF3607)
Probab=68.59 E-value=2.2 Score=17.43 Aligned_cols=80 Identities=18% Similarity=0.181 Sum_probs=34.8

Q ss_pred             EEEEECCCCCEEEEEEEEEEECCCCCCCEEEEEEECHHHHH----HHHHHHCCCEEEECCCCCEEECCCCEEECCCCEEE
Q ss_conf             25541057734799999985166777547999987658987----65211118666885864166367507974776257
Q T0571            52 EFVDNTLDNQHKMVIKAAWGGGYTNRNNVVINFKVDESLCD----NLYFKDTDQPLVPMPASYYTLASDRIAIPKGQIMA  127 (344)
Q Consensus        52 ~~~~~~~~~~~~~~i~v~vsgs~~~~~dv~V~i~vD~slL~----~~YN~~~~t~Y~~LP~~~Ysl~~~~v~I~aGe~~~  127 (344)
                      +.+.....-+....++|+.++....=..++-++.++...+.    ++|-+  +..--.|+++.|++-.+.+.+ +|..++
T Consensus        51 i~~~lssGLDRk~kisV~rs~g~~~~st~ts~~~~a~d~it~~G~eyYGk--~ltlPal~eG~ytl~~eiLd~-~g~~V~  127 (724)
T PF12245_consen   51 ITFALSSGLDRKVKISVTRSSGTLMVSTVTSHVLVADDRITADGSEYYGK--ELTLPALGEGTYTLKAEILDS-DGNVVQ  127 (724)
T ss_pred             EEEEEECCCCCEEEEEEEECCCEEEEEECCCEEEEEEEEEEECCCCCCCC--EEECCCCCCCCEEEEEEEECC-CCCEEE
T ss_conf             69999615553179999955985999851330787613785078420051--640244688727999988616-887787

Q ss_pred             EEEEEEE
Q ss_conf             6679984
Q T0571           128 GVEVQLT  134 (344)
Q Consensus       128 ~v~i~~~  134 (344)
                      .-+..|.
T Consensus       128 t~~ypl~  134 (724)
T PF12245_consen  128 TYSYPLT  134 (724)
T ss_pred             EEEEEEE
T ss_conf             76567689

No 8
>PF03160 Calx-beta: Calx-beta domain; InterPro: IPR003644 The calx-beta motif is present as a tandem repeat in the cytoplasmic domains of Calx Na-Ca exchangers, which are used to expel calcium from cells. This motif overlaps domains used for calcium binding and regulation. The calx-beta motif is also present in the cytoplasmic tail of mammalian integrin-beta4, which mediates the bi-directional transfer of signals across the plasma membrane, as well as in some cyanobacterial proteins. This motif contains a series of beta-strands and turns that form a self-contained beta-sheet , .; GO: 0007154 cell communication, 0016021 integral to membrane; PDB: 2fws_A 3gin_B 2dpk_A 3e9u_A 2fwu_A 2qvm_A 2qvk_A 3fq4_B 3fso_A 3h6a_B ....
Probab=65.00 E-value=2.6 Score=16.95 Aligned_cols=75 Identities=17% Similarity=0.318 Sum_probs=45.6

Q ss_pred             CCEEEEEEEEEEECCCCCCCEEEEEEECHHHHHHHHHHHCCCEEEECCCCCEEECCCCEEECCCCEEEEEEEEEEECCCC
Q ss_conf             73479999998516677754799998765898765211118666885864166367507974776257667998412156
Q T0571            60 NQHKMVIKAAWGGGYTNRNNVVINFKVDESLCDNLYFKDTDQPLVPMPASYYTLASDRIAIPKGQIMAGVEVQLTDDFFA  139 (344)
Q Consensus        60 ~~~~~~i~v~vsgs~~~~~dv~V~i~vD~slL~~~YN~~~~t~Y~~LP~~~Ysl~~~~v~I~aGe~~~~v~i~~~~~~~~  139 (344)
                      +.....+.+..+|. .....++|.+..-+.      .+..+.+|        ...+.++++++|+..+.+.|.+.+..
T Consensus        26 ~~~~~~~~V~r~~~-~~~~~v~V~~~t~~g------tA~~g~Dy--------~~~~~~v~F~~ge~~~~i~v~i~dD~--   88 (101)
T PF03160_consen   26 EDGTVTVTVVRTGG-TTSGPVTVNYSTSDG------TATAGSDY--------TPVSGTVTFAPGETSKTITVPIIDDN--   88 (101)
T ss_dssp             T--EEEEEEEEES---TTSEEEEEEEEE-S------SS-TTTTB--------E-----EEE-TT-EEEEEEEEB---S--
T ss_pred             CCEEEEEEEEEEEE-CCCEEEEEEEEEECC------CCCCCCCC--------CCCCEEEEECCCCCEEEEEEEEECCC--
T ss_conf             98099999999411-588699999999788------53021676--------50340899939982989999995899--

Q ss_pred             CCCCCCCEEEEEE
Q ss_conf             6667787258899
Q T0571           140 DEKSISENYVIPL  152 (344)
Q Consensus       140 ~~l~~~~~YvLPl  152 (344)
                       ..+.++.+.|=|
T Consensus        89 -~~E~~E~f~v~L  100 (101)
T PF03160_consen   89 -VPEGDETFTVQL  100 (101)
T ss_dssp             -STTSSEEEEEEE
T ss_pred             -CCCCCEEEEEEE
T ss_conf             -855865899998

No 9
>PF03304 Mlp: Mlp lipoprotein family; InterPro: IPR004983 The Mlp (for Multicopy Lipoprotein) family of lipoproteins is found in Borrelia species . This family were previously known as 2.9 lipoprotein genes . These surface expressed genes may represent new candidate vaccinogens for Lyme disease . Members of this family generally are downstream of four ORFs called A,B,C and D that are involved in hemolytic activity.
Probab=64.55 E-value=1.2 Score=19.27 Aligned_cols=22 Identities=36% Similarity=0.523 Sum_probs=12.3

Q ss_pred             CCHHHHHHHHHHHHHHHCCCCCCC
Q ss_conf             913588999999877630147765
Q T0571             1 MKKNLAYIGLVLLILTWTSCESSD   24 (344)
Q Consensus         1 MKk~~~~l~l~~l~~l~tSC~n~e   24 (344)
                      ||.+.++  ++.+++++.||+..+
T Consensus         1 mKiinil--fcl~lllL~~Cn~nd   22 (181)
T PF03304_consen    1 MKIINIL--FCLFLLLLNSCNSND   22 (181)
T ss_pred             CCEEHHH--HHHHHHHHHCCCCCC
T ss_conf             9440589--999999994767688

No 10
>PF06085 Rz1: Lipoprotein Rz1 precursor; InterPro: IPR010346 This family consists of several bacteria and phage lipoprotein Rz1 precursors. Rz1 is a proline-rich lipoprotein from bacteriophage lambda, which is known to have fusogenic properties. Rz1-induced liposome fusion is thought to be mediated primarily by the generation of local perturbation in the bilayer lipid membrane and to a lesser extent by electrostatic forces .; GO: 0019064 viral envelope fusion with host membrane, 0019867 outer membrane
Probab=61.65 E-value=1.8 Score=18.07 Aligned_cols=23 Identities=26% Similarity=0.385 Sum_probs=18.4

Q ss_pred             CCHHHHHHHHHHHHHHHCCCCCC
Q ss_conf             91358899999987763014776
Q T0571             1 MKKNLAYIGLVLLILTWTSCESS   23 (344)
Q Consensus         1 MKk~~~~l~l~~l~~l~tSC~n~   23 (344)
                      |+++...+..+++.+.++||...
T Consensus         1 Mr~l~~~l~~~~~~L~lsaC~S~   23 (59)
T PF06085_consen    1 MRKLKMLLCALALPLALSACSSK   23 (59)
T ss_pred             CHHHHHHHHHHHHHHHHHHHCCC
T ss_conf             90389999999999999871589

No 11
>PF06291 Lambda_Bor: Bor protein; InterPro: IPR010438 This family consists of several Bacteriophage lambda Bor and Escherichia coli Iss proteins. Expression of bor significantly increases the survival of the E. coli host cell in animal serum. This property is a well known bacterial virulence determinant indeed, bor and its adjacent sequences are highly homologous to the iss serum resistance locus of the plasmid ColV2-K94, which confers virulence in animals. It has been suggested that lysogeny may generally have a role in bacterial survival in animal hosts, and perhaps in pathogenesis .
Probab=61.40 E-value=1.5 Score=18.54 Aligned_cols=19 Identities=42% Similarity=0.492 Sum_probs=11.9

Q ss_pred             CCHHHHHHHHHHHHHHHCCCCC
Q ss_conf             9135889999998776301477
Q T0571             1 MKKNLAYIGLVLLILTWTSCES   22 (344)
Q Consensus         1 MKk~~~~l~l~~l~~l~tSC~n   22 (344)
                      |||.++.+++   ++++++|..
T Consensus         1 mkk~ll~~~l---~llltgCa~   19 (97)
T PF06291_consen    1 MKKILLAAAL---ALLLTGCAQ   19 (97)
T ss_pred             CHHHHHHHHH---HHHHCCCCE
T ss_conf             9004999999---999645663

No 12
>PF07273 DUF1439: Protein of unknown function (DUF1439); InterPro: IPR010835 This family consists of several hypothetical bacterial proteins of around 190 residues in length. Several members of this family are annotated as being putative lipoproteins and are often known as YceB. The function of this family is unknown.; PDB: 3eyr_B.
Probab=58.01 E-value=0.9 Score=20.16 Aligned_cols=20 Identities=45% Similarity=0.522 Sum_probs=14.6

Q ss_pred             CCHHHHHHHHHHHHHHHCCCCC
Q ss_conf             9135889999998776301477
Q T0571             1 MKKNLAYIGLVLLILTWTSCES   22 (344)
Q Consensus         1 MKk~~~~l~l~~l~~l~tSC~n   22 (344)
                      ||+.+.  +++++++++++|+.
T Consensus         1 Mk~~~~--~~l~~~~~L~gC~~   20 (177)
T PF07273_consen    1 MKKLLL--LALILALLLTGCAS   20 (177)
T ss_dssp             --------------------CH
T ss_pred             CCHHHH--HHHHHHHHHHCCCC
T ss_conf             926999--99999999861255

No 13
>PF10566 Glyco_hydro_97: Glycoside hydrolase 97 ; PDB: 2jkp_A 2zq0_A 2jke_B 2d73_B 2jka_B.
Probab=57.37 E-value=3.5 Score=16.05 Aligned_cols=22 Identities=14% Similarity=0.348 Sum_probs=11.9

Q ss_pred             EEEEEEEECCCCCCCCCCCEEEEEEE
Q ss_conf             66799841215666677872588999
Q T0571           128 GVEVQLTDDFFADEKSISENYVIPLL  153 (344)
Q Consensus       128 ~v~i~~~~~~~~~~l~~~~~YvLPl~  153 (344)
                      .+.|.|.  .+.|++.  =.|.+|-.
T Consensus       121 ~~~l~fR--vyddGvA--fRY~~p~~  142 (643)
T PF10566_consen  121 RLNLEFR--VYDDGVA--FRYEFPQQ  142 (643)
T ss_dssp             EEEEEEE--E---------EEEE--B
T ss_pred             EEEEEEE--ECCCCEE--EEEEECCC
T ss_conf             3899999--9069879--99997798

No 14
>PF10671 TcpQ: Toxin co-regulated pilus biosynthesis protein Q
Probab=55.70 E-value=1.3 Score=19.01 Aligned_cols=19 Identities=42% Similarity=0.681 Sum_probs=11.9

Q ss_pred             CCHHHHHHHHHHHHHHHCCCCC
Q ss_conf             9135889999998776301477
Q T0571             1 MKKNLAYIGLVLLILTWTSCES   22 (344)
Q Consensus         1 MKk~~~~l~l~~l~~l~tSC~n   22 (344)
                      |||+++   ++.+++++++|.-
T Consensus         1 ~kkn~i---~~~~~i~lsGcs~   19 (169)
T PF10671_consen    1 MKKNLI---AITLAIMLSGCSS   19 (169)
T ss_pred             CCCCEE---HHHHHHHHCCCCC
T ss_conf             974030---3778988426333

No 15
>PF01298 Lipoprotein_5: Transferrin binding protein-like solute binding protein; InterPro: IPR001677 Bacterial transferrin binding proteins act as transferrin receptors and are required for transferrin utilisation. Transferrins are iron-binding glycoproteins that control the level of free iron in biological fluids. ; GO: 0004998 transferrin receptor activity, 0016020 membrane
Probab=54.60 E-value=1.8 Score=18.07 Aligned_cols=21 Identities=19% Similarity=0.396 Sum_probs=14.2

Q ss_pred             CCHH-HHHHHHHHHHHHHCCCC
Q ss_conf             9135-88999999877630147
Q T0571             1 MKKN-LAYIGLVLLILTWTSCE   21 (344)
Q Consensus         1 MKk~-~~~l~l~~l~~l~tSC~   21 (344)
                      |++. +...+|+++++||+||.
T Consensus         1 M~~~~~~~~~~~l~~~lLsACs   22 (593)
T PF01298_consen    1 MNNPPLNQSAIALAAFLLSACS   22 (593)
T ss_pred             CCCCCCCHHHHHHHHHHHHHHC
T ss_conf             9865552558999999998734

No 16
>PF02030 Lipoprotein_8: Hypothetical lipoprotein (MG045 family); InterPro: IPR000044 Mycoplasma genitalium has the smallest known genome of any free-living organism. Its complete genome sequence has been determined by whole-genome random sequencing and assembly . Only 470 putative coding regions were identified, including genes for DNA replication, transcription and translation, DNA repair, cellular transport and energy metabolism . A hypothetical protein from the MG045 gene has a homologue of similarly unknown function in Mycoplasma pneumoniae .; GO: 0016020 membrane
Probab=50.39 E-value=3.9 Score=15.68 Aligned_cols=24 Identities=21% Similarity=0.321 Sum_probs=14.0

Q ss_pred             CCHHHHHHHHHHH---HHHHCCCCCCC
Q ss_conf             9135889999998---77630147765
Q T0571             1 MKKNLAYIGLVLL---ILTWTSCESSD   24 (344)
Q Consensus         1 MKk~~~~l~l~~l---~~l~tSC~n~e   24 (344)
                      |||+..++.+++.   ..+++||.+..
T Consensus         1 mk~~~k~~~~~~~l~~~~~ltac~~~~   27 (493)
T PF02030_consen    1 MKKQKKFLFSLIGLTFSSILTACSKNN   27 (493)
T ss_pred             CCHHHHHHHHHHHHHHHHHHHHCCCCC
T ss_conf             940467899999999988876445586

No 17
>PF08139 LPAM_1: Prokaryotic membrane lipoprotein lipid attachment site; InterPro: IPR012640 This family consists of the homologues of the VirB proteins of type IV secretion systems (T4SS). Conjugal transfer across the cell envelope of Gram-negative bacteria is mediated by a supramolecular structure termed mating pair formation (Mpf) complex. Collectively, secretion pathways ancestrally related to bacterial conjugation systems are now known as T4SS. T4SS are involved in the delivery of effector molecules to eukaryotic target cells; each of these systems exports distinct DNA or protein substrates to effect a myriad of changes in host cell physiology during infection .
Probab=49.42 E-value=3.2 Score=16.26 Aligned_cols=18 Identities=33% Similarity=0.626 Sum_probs=9.8

Q ss_pred             CHHHHHHHHHHHHHHHCCCCC
Q ss_conf             135889999998776301477
Q T0571             2 KKNLAYIGLVLLILTWTSCES   22 (344)
Q Consensus         2 Kk~~~~l~l~~l~~l~tSC~n   22 (344)
                      || +  +.++.+++.+++|.+
T Consensus         9 Kk-i--l~~~~a~~~LaGCss   26 (26)
T PF08139_consen    9 KK-I--LFLLLALFMLAGCSS   26 (26)
T ss_pred             HH-H--HHHHHHHHHHHHCCC
T ss_conf             99-9--999999999833149

No 18
>PF11153 DUF2931: Protein of unknown function (DUF2931)
Probab=48.68 E-value=4.7 Score=15.12 Aligned_cols=16 Identities=6% Similarity=-0.045 Sum_probs=10.0

Q ss_pred             CCCCCCEEEEEEEEEC
Q ss_conf             6677872588999841
Q T0571           141 EKSISENYVIPLLMTN  156 (344)
Q Consensus       141 ~l~~~~~YvLPl~I~~  156 (344)
                      ++...+.|..=+.|-.
T Consensus       105 Sl~DkK~Y~~~i~ip~  120 (216)
T PF11153_consen  105 SLIDKKFYETKIDIPE  120 (216)
T ss_pred             ECCCCCEEEEEEECCH
T ss_conf             8135517899998799

No 19
>PF07172 GRP: Glycine rich protein family; InterPro: IPR010800 This family of proteins includes several glycine rich proteins as well as two nodulins 16 and 24. The family also contains proteins that are induced in response to various stresses.
Probab=48.07 E-value=3.6 Score=15.94 Aligned_cols=21 Identities=33% Similarity=0.475 Sum_probs=10.2

Q ss_pred             CC-HHHHHHHHHHHHHHHCCCC
Q ss_conf             91-3588999999877630147
Q T0571             1 MK-KNLAYIGLVLLILTWTSCE   21 (344)
Q Consensus         1 MK-k~~~~l~l~~l~~l~tSC~   21 (344)
                      |- |.+++|+|+++++++.|++
T Consensus         1 M~sk~~llL~lllA~vlliss~   22 (95)
T PF07172_consen    1 MASKAFLLLGLLLAAVLLISSE   22 (95)
T ss_pred             CCHHHHHHHHHHHHHHHHHHHH
T ss_conf             9247999999999999999874

No 20
>PF06474 MLTD_N: MLTD_N; InterPro: IPR010511 This entry comprises the N-terminal domain of membrane-bound lytic murein transglycosylase D.
Probab=47.75 E-value=3.6 Score=15.97 Aligned_cols=21 Identities=24% Similarity=0.425 Sum_probs=11.8

Q ss_pred             CCHHHHHHHHHHHHHHHCCCCCCCC
Q ss_conf             9135889999998776301477655
Q T0571             1 MKKNLAYIGLVLLILTWTSCESSDN   25 (344)
Q Consensus         1 MKk~~~~l~l~~l~~l~tSC~n~e~   25 (344)
                      ||-.    ++++++++++||+....
T Consensus         1 m~~~----~~l~~~llLaGCqs~~~   21 (95)
T PF06474_consen    1 MRFL----AVLALALLLAGCQSTPQ   21 (95)
T ss_pred             CHHH----HHHHHHHHHHHCCCCCC
T ss_conf             9299----99999999984679999

No 21
>PF03207 OspD: Borrelia outer surface protein D (OspD); InterPro: IPR004894 This is a family of outer surface proteins from Borrelia. The function of these proteins is unknown.
Probab=45.94 E-value=4.6 Score=15.17 Aligned_cols=24 Identities=33% Similarity=0.431 Sum_probs=16.4

Q ss_pred             CCHHHHHHHHHHHHHHHCCCCCCC
Q ss_conf             913588999999877630147765
Q T0571             1 MKKNLAYIGLVLLILTWTSCESSD   24 (344)
Q Consensus         1 MKk~~~~l~l~~l~~l~tSC~n~e   24 (344)
                      |||.+.++.+-+++++-.||.-+.
T Consensus         1 mkklikill~slflllsisc~hdk   24 (254)
T PF03207_consen    1 MKKLIKILLLSLFLLLSISCVHDK   24 (254)
T ss_pred             CHHHHHHHHHHHHHHHHHHHCCCH
T ss_conf             916999999999999863212423

No 22
>PF11810 DUF3332: Domain of unknown function (DUF3332)
Probab=45.09 E-value=4.8 Score=15.07 Aligned_cols=19 Identities=32% Similarity=0.524 Sum_probs=12.1

Q ss_pred             CHHHHHHHHHHHHHHHCCC
Q ss_conf             1358899999987763014
Q T0571             2 KKNLAYIGLVLLILTWTSC   20 (344)
Q Consensus         2 Kk~~~~l~l~~l~~l~tSC   20 (344)
                      |+....+++++++++|+||
T Consensus         2 k~~~~~~~~~~~~~~lsgC   20 (176)
T PF11810_consen    2 KKILAAVAILLGSVSLSGC   20 (176)
T ss_pred             CHHHHHHHHHHHHHHHCCC
T ss_conf             1369999999999985234

No 23
>PF09160 FimH_man-bind: FimH, mannose binding; InterPro: IPR015243 This domain adopts a secondary structure consisting of a beta sandwich, with nine strands arranged in two sheets in a Greek key topology. It is predominantly found in bacterial mannose-specific adhesins, and is capable of binding to D-mannose . ; PDB: 2vco_B 1klf_L 1kiu_P 1uwf_A 1qun_L 1tr7_A.
Probab=44.95 E-value=5.3 Score=14.74 Aligned_cols=33 Identities=27% Similarity=0.518 Sum_probs=21.6

Q ss_pred             CCEEEECCCCCEEECCC---CEEECCCCEEEEEEEE
Q ss_conf             86668858641663675---0797477625766799
Q T0571           100 DQPLVPMPASYYTLASD---RIAIPKGQIMAGVEVQ  132 (344)
Q Consensus       100 ~t~Y~~LP~~~Ysl~~~---~v~I~aGe~~~~v~i~  132 (344)
                      +..|++||...|=-+-.   -+.|++|+..|.+.+.
T Consensus        95 ~~~~~p~p~klYLtp~~~agGv~I~~G~~iAtl~m~  130 (150)
T PF09160_consen   95 DGNYKPWPAKLYLTPISAAGGVVIKKGELIATLNMH  130 (150)
T ss_dssp             SSS-EE--EEEEEEESTT-----B----EEEEEEEE
T ss_pred             CCCCCCCCEEEEEEECCCCCCEEEECCCEEEEEEEE
T ss_conf             798466546899975487885898379889999999

No 24
>PF12099 DUF3575: Protein of unknown function (DUF3575)
Probab=44.91 E-value=5.3 Score=14.73 Aligned_cols=19 Identities=37% Similarity=0.345 Sum_probs=9.5

Q ss_pred             CCHHHHHHHHHHHHHHHCCC
Q ss_conf             91358899999987763014
Q T0571             1 MKKNLAYIGLVLLILTWTSC   20 (344)
Q Consensus         1 MKk~~~~l~l~~l~~l~tSC   20 (344)
                      |||+.+++ +++++++.+++
T Consensus         1 ~~~~~~~~-~~~~~~~~~~~   19 (189)
T PF12099_consen    1 MKKIRFLF-LLLLLFCTSSP   19 (189)
T ss_pred             CCEEEHHH-HHHHHHHHHCC
T ss_conf             93530459-99999997535

No 25
>PF11839 DUF3359: Protein of unknown function (DUF3359)
Probab=44.71 E-value=4.6 Score=15.16 Aligned_cols=24 Identities=38% Similarity=0.470 Sum_probs=15.1

Q ss_pred             CCHHHHHHHHHHHHHHHCCCCCCCC
Q ss_conf             9135889999998776301477655
Q T0571             1 MKKNLAYIGLVLLILTWTSCESSDN   25 (344)
Q Consensus         1 MKk~~~~l~l~~l~~l~tSC~n~e~   25 (344)
                      ||+.+ +-++.+.++++.+|-...+
T Consensus         1 M~~~l-~s~~~~~~~L~~GCAs~s~   24 (96)
T PF11839_consen    1 MKKLL-ISALALAALLAAGCASTSD   24 (96)
T ss_pred             CCHHH-HHHHHHHHHHHHHCCCCCH
T ss_conf             90599-9999999999857268858

No 26
>PF12092 DUF3568: Protein of unknown function (DUF3568)
Probab=42.90 E-value=5.2 Score=14.82 Aligned_cols=20 Identities=30% Similarity=0.285 Sum_probs=13.2

Q ss_pred             CHHHHHHHHHHHHHHHCCCC
Q ss_conf             13588999999877630147
Q T0571             2 KKNLAYIGLVLLILTWTSCE   21 (344)
Q Consensus         2 Kk~~~~l~l~~l~~l~tSC~   21 (344)
                      ||.+..+.+.++++.++||.
T Consensus         1 kk~~~~~l~~~~~l~l~sC~   20 (131)
T PF12092_consen    1 KKLLLATLIAASALSLSSCG   20 (131)
T ss_pred             CCHHHHHHHHHHHHHHCCCH
T ss_conf             91289999999999871430

No 27
>PF03978 Borrelia_REV: Borrelia burgdorferi REV protein; InterPro: IPR007126 This family consists of several REV proteins from Borrelia burgdorferi (Lyme disease spirochete) and Borrelia garinii. The function of REV is unknown although it has been shown that the gene is induced during the ingesting of host blood suggesting a role in the metabolic activation of borreliae to adapt to physiological stimuli .
Probab=41.31 E-value=4 Score=15.61 Aligned_cols=23 Identities=17% Similarity=0.288 Sum_probs=13.4

Q ss_pred             CCHHHHHHHHHHHHHHHCCCCCC
Q ss_conf             91358899999987763014776
Q T0571             1 MKKNLAYIGLVLLILTWTSCESS   23 (344)
Q Consensus         1 MKk~~~~l~l~~l~~l~tSC~n~   23 (344)
                      ||++.++-.++++++...||+..
T Consensus         1 MknkNI~KLfFvsmlfvmaCk~y   23 (160)
T PF03978_consen    1 MKNKNIFKLFFVSMLFVMACKAY   23 (160)
T ss_pred             CCCCHHHHHHHHHHHHHHHHHHH
T ss_conf             97400999999999999999999

No 28
>PF06551 DUF1120: Protein of unknown function (DUF1120); InterPro: IPR010546 This family consists of several bacterial proteins, at least one of which is involved in enzyme induction following nitrogen deprivation. The exact function of this family is unknown
Probab=40.70 E-value=6.2 Score=14.30 Aligned_cols=19 Identities=32% Similarity=0.235 Sum_probs=10.3

Q ss_pred             CCHHHHHHHHHHHHHHHCCC
Q ss_conf             91358899999987763014
Q T0571             1 MKKNLAYIGLVLLILTWTSC   20 (344)
Q Consensus         1 MKk~~~~l~l~~l~~l~tSC   20 (344)
                      |||.++...+++.+ ++.++
T Consensus         1 MKK~l~~~~l~a~l-~~~~~   19 (145)
T PF06551_consen    1 MKKNLLATLLLASL-LLLAS   19 (145)
T ss_pred             CCHHHHHHHHHHHH-HHHHC
T ss_conf             93679999999999-98602

No 29
>PF06788 UPF0257: Uncharacterised protein family (UPF0257); InterPro: IPR010646 This is a group of proteins of unknown function.
Probab=40.09 E-value=5 Score=14.94 Aligned_cols=23 Identities=35% Similarity=0.579 Sum_probs=14.7

Q ss_pred             CCHHHHHHHHHHHHHHHCCCCCCCCC
Q ss_conf             91358899999987763014776556
Q T0571             1 MKKNLAYIGLVLLILTWTSCESSDNE   26 (344)
Q Consensus         1 MKk~~~~l~l~~l~~l~tSC~n~e~~   26 (344)
                      ||+.   +.+.++++++++|++....
T Consensus         1 ~~~~---~~~~~l~~~l~~cd~~~~~   23 (236)
T PF06788_consen    1 MKKQ---LLLCLLALLLAGCDNASAP   23 (236)
T ss_pred             CCEE---EHHHHHHHHHHHCCCCCCH
T ss_conf             9605---4589999977641254511

No 30
>PF09403 FadA: Adhesion protein FadA; PDB: 3etz_A 3etx_C 2gl2_A 3ety_A 3etw_A.
Probab=37.29 E-value=2.9 Score=16.64 Aligned_cols=20 Identities=35% Similarity=0.308 Sum_probs=10.0

Q ss_pred             CCHHHHHHHHHHHHHHHCCC
Q ss_conf             91358899999987763014
Q T0571             1 MKKNLAYIGLVLLILTWTSC   20 (344)
Q Consensus         1 MKk~~~~l~l~~l~~l~tSC   20 (344)
                      |||+++..++++.+++|++-
T Consensus         1 MKKi~L~~ml~lss~sfAa~   20 (126)
T PF09403_consen    1 MKKILLCGMLLLSSLSFAAT   20 (126)
T ss_dssp             --------------------
T ss_pred             CCHHHHHHHHHHHHHHHHHH
T ss_conf             90589999999999997622

No 31
>PF04449 Fimbrial_CS1: CS1 type fimbrial major subunit; InterPro: IPR007540 Fimbriae, also known as pili, form filaments radiating from the surface of the bacterium to a length of 0.5-1.5 micrometres. They enable the cell to colonise host epithelia. This family constitutes the major subunits of CS1 like pili, including CS2 and CFA1 from Escherichia coli, and also the Cable type II pilin major subunit from Burkholderia cepacia . The major subunit of CS1 pili is called CooA. Periplasmic CooA is mostly complexed with the assembly protein CooB. In addition, a small pool of CooA multimers, and CooA-CooD complexes exists, but the functional significance is unknown . A member of this family has also been identified in Salmonella typhi and Salmonella enterica .; GO: 0009289 fimbrium
Probab=35.76 E-value=7.3 Score=13.78 Aligned_cols=14 Identities=14% Similarity=0.244 Sum_probs=6.9

Q ss_pred             CCCEEEEEEECHHH
Q ss_conf             75479999876589
Q T0571            77 RNNVVINFKVDESL   90 (344)
Q Consensus        77 ~~dv~V~i~vD~sl   90 (344)
                      .+++.|++.-++.|
T Consensus        54 ~k~v~v~L~~~~~L   67 (143)
T PF04449_consen   54 SKDVNVKLANPPKL   67 (143)
T ss_pred             CCCEEEEEECCHHH
T ss_conf             76579998089656

No 32
>PF03032 Brevenin: Brevenin/esculentin/gaegurin/rugosin family; InterPro: IPR004275 In addition to the highly specific cell-mediated immune system, vertebrates possess an efficient host-defense mechanism against invading microorganisms which involves the synthesis of highly potent antimicrobial peptides with a large spectrum of activity. This entry represents a number of these defence peptides secreted from the skin of amphibians, including the opiate-like dermorphins and deltorphins, and the antimicrobial dermoseptins and temporins.; GO: 0006952 defense response, 0042742 defense response to bacterium, 0005576 extracellular region
Probab=34.53 E-value=6.1 Score=14.35 Aligned_cols=27 Identities=22% Similarity=0.333 Sum_probs=14.3

Q ss_pred             CCHHHHHHHHHHHHHHHCCCCCCCCCCC
Q ss_conf             9135889999998776301477655656
Q T0571             1 MKKNLAYIGLVLLILTWTSCESSDNEFP   28 (344)
Q Consensus         1 MKk~~~~l~l~~l~~l~tSC~n~e~~~e   28 (344)
                      |||-+.++.+ +-++.++-|+.....++
T Consensus         3 lKKSLlLlfF-LG~islSlCeeer~adE   29 (46)
T PF03032_consen    3 LKKSLLLLFF-LGTISLSLCEEERDADE   29 (46)
T ss_pred             HHHHHHHHHH-HHHHHHHHHHHHCCCCH
T ss_conf             2588999999-98721167777604641

No 33
>PF01203 GSPII_N: Bacterial type II secretion system protein N; InterPro: IPR000645 The secretion pathway (GSP) for the export of proteins (also called the type II pathway) requires a number of protein components. One of them is known as the 'N' protein and has been sequenced in a variety of bacteria such as Aeromonas hydrophila (gene exeN); Erwinia carotovora (gene outN); Klebsiella pneumoniae (gene pulN); or Vibrio cholerae (gene epsN). The size of the 'N' protein is around 250 amino acids. It apparently contains a single transmembrane domain located in the N-terminal section. The short N-terminal domain is predicted to be cytoplasmic and the large C-terminal domain periplasmic.; GO: 0008565 protein transporter activity, 0015628 protein secretion by the type II secretion system, 0015627 type II protein secretion system complex
Probab=33.94 E-value=7.8 Score=13.59 Aligned_cols=15 Identities=40% Similarity=0.603 Sum_probs=9.6

Q ss_pred             CCHHHHHHHHHHHHH
Q ss_conf             913588999999877
Q T0571             1 MKKNLAYIGLVLLIL   15 (344)
Q Consensus         1 MKk~~~~l~l~~l~~   15 (344)
                      ||+++.++.++++++
T Consensus         1 Mkr~~~~~~~~~~~~   15 (252)
T PF01203_consen    1 MKRRILWILLFLLAY   15 (252)
T ss_pred             CCHHHHHHHHHHHHH
T ss_conf             932499999999999

No 34
>PF06649 DUF1161: Protein of unknown function (DUF1161); InterPro: IPR010595 This family consists of several short, hypothetical bacterial proteins of unknown function.
Probab=31.94 E-value=7.6 Score=13.67 Aligned_cols=27 Identities=41% Similarity=0.541 Sum_probs=15.4

Q ss_pred             CCHHHHHHHHHHHHH-HH---CCCCCCCCCC
Q ss_conf             913588999999877-63---0147765565
Q T0571             1 MKKNLAYIGLVLLIL-TW---TSCESSDNEF   27 (344)
Q Consensus         1 MKk~~~~l~l~~l~~-l~---tSC~n~e~~~   27 (344)
                      |||.++..+|++++. .+   .||+.=..+.
T Consensus         1 Mkk~~l~~~l~~la~~alAA~~sCE~lk~eI   31 (75)
T PF06649_consen    1 MKKFLLAVALLLLAAPALAAPKSCEELKAEI   31 (75)
T ss_pred             CCHHHHHHHHHHHHHHHHHCCCCHHHHHHHH
T ss_conf             9356999999998564541558889999999

No 35
>PF04507 DUF576: Protein of unknown function, DUF576; InterPro: IPR007595 This family contains several uncharacterised staphylococcal proteins.
Probab=31.78 E-value=8.4 Score=13.35 Aligned_cols=21 Identities=38% Similarity=0.585 Sum_probs=13.9 Q ss_pred CHHHHHHHHHHHHHHHCCCCC Q ss_conf 135889999998776301477 Q T0571 2 KKNLAYIGLVLLILTWTSCES 22 (344) Q Consensus 2 Kk~~~~l~l~~l~~l~tSC~n 22 (344) ||..++++++.+++++++|.. T Consensus 5 kkl~l~is~liLii~I~Gcg~ 25 (257) T PF04507_consen 5 KKLALYISLLILIIFIGGCGI 25 (257) T ss_pred HHHHHHHHHHHHHHHHHCCCC T ss_conf 667999999999998831345 No 36 >PF11254 DUF3053: Protein of unknown function (DUF3053) Probab=30.80 E-value=8.7 Score=13.24 Aligned_cols=18 Identities=17% Similarity=0.488 Sum_probs=11.4 Q ss_pred HHHHHHHHHHCCCCCCCC Q ss_conf 999998776301477655 Q T0571 8 IGLVLLILTWTSCESSDN 25 (344) Q Consensus 8 l~l~~l~~l~tSC~n~e~ 25 (344) ++.+++.+.+++|.|.|. T Consensus 7 l~al~~vl~LaGC~dkE~ 24 (229) T PF11254_consen 7 LLALLMVLQLAGCGDKEP 24 (229) T ss_pred HHHHHHHHHHHHCCCCCH T ss_conf 999999999831479977 No 37 >PF09716 ETRAMP: Malarial early transcribed membrane protein (ETRAMP) Probab=28.80 E-value=9.4 Score=13.01 Aligned_cols=22 Identities=32% Similarity=0.489 Sum_probs=11.3 Q ss_pred CC--HHHHHHHHHHHHHHHCCCCC Q ss_conf 91--35889999998776301477 Q T0571 1 MK--KNLAYIGLVLLILTWTSCES 22 (344) Q Consensus 1 MK--k~~~~l~l~~l~~l~tSC~n 22 (344) || |.+++++++.+.-+|+-|-+ T Consensus 1 MKi~kv~~ff~~Ll~i~~l~p~~~ 24 (84) T PF09716_consen 1 MKISKVFYFFAFLLAINLLTPCLC 24 (84) T ss_pred CCHHHHHHHHHHHHHHHHCCCCCC T ss_conf 937999999999999997537555 No 38 >PF05628 Borrelia_P13: Borrelia membrane protein P13; InterPro: IPR008420 This family consists of P13 proteins from Borrelia species. P13 is a 13 kDa integral membrane protein which is post-translationally processed at both ends and modified by an unknown mechanism . 
Probab=28.49 E-value=8.3 Score=13.39 Aligned_cols=21 Identities=24% Similarity=0.251 Sum_probs=15.0 Q ss_pred CCHHHHHHHHHHHHHHHCCCC Q ss_conf 913588999999877630147 Q T0571 1 MKKNLAYIGLVLLILTWTSCE 21 (344) Q Consensus 1 MKk~~~~l~l~~l~~l~tSC~ 21 (344) |||+++.+.++.+.+-..|-+ T Consensus 1 MkKi~~lilif~~t~qiFA~~ 21 (167) T PF05628_consen 1 MKKIFILILIFFLTIQIFAQK 21 (167) T ss_pred CCEEEHHHHHHHHHHHHHHCC T ss_conf 962421478887787634124 No 39 >PF10626 TraO: Conjugative transposon protein TraO Probab=26.91 E-value=7.6 Score=13.68 Aligned_cols=31 Identities=13% Similarity=0.287 Sum_probs=24.6 Q ss_pred CCCCEEEEEEEEECCCCCCCCCCCCCEEECC Q ss_conf 7787258899984156742124564023124 Q T0571 143 SISENYVIPLLMTNVQGADSILQGKPVVENP 173 (344) Q Consensus 143 ~~~~~YvLPl~I~~~sg~~~i~~~~~~~~~p 173 (344) |..++-.+-+=+...-|++++..|..++.+- T Consensus 105 D~~K~vfl~~G~SaL~GYEtvN~g~~lL~DG 135 (193) T PF10626_consen 105 DAGKNVFLSLGLSALAGYETVNWGDKLLYDG 135 (193) T ss_pred CCCCEEEEEECCCEEEEEEEECCCCCCCCCC T ss_conf 6983799994320175556743786124684 No 40 >PF07867 DUF1654: Protein of unknown function (DUF1654); InterPro: IPR012449 This family consists of proteins from the Pseudomonadaceae. Probab=26.85 E-value=10 Score=12.78 Aligned_cols=19 Identities=26% Similarity=0.462 Sum_probs=15.2 Q ss_pred CCEEEEEEECCCCCEEEEE Q ss_conf 5408999997997389986 Q T0571 257 ISYTVRLSFAEDGSCTVHS 275 (344) Q Consensus 257 ~~~~~~l~f~~~~~~ti~~ 275 (344) ....+.|+|++||.|+|.= T Consensus 49 etdgi~l~~~dDGsV~i~W 67 (73) T PF07867_consen 49 ETDGIDLTFNDDGSVRIRW 67 (73) T ss_pred CCCCCEEEECCCCEEEEEE T ss_conf 4778057755898499998 No 41 >PF06646 Mycoplasma_p37: High affinity transport system protein p37; InterPro: IPR010592 This family consists of several high affinity transport system protein p37 sequences, which are specific to Mycoplasma species. 
The p37 gene is part of an operon encoding two additional proteins, which are highly similar to components of the periplasmic binding-protein-dependent transport systems of Gram-negative bacteria. It has been suggested that p37 is part of a homologous, high-affinity transport system in Mycoplasma hyorhinis, a Gram-positive bacterium .; GO: 0005215 transporter activity, 0006810 transport, 0016020 membrane; PDB: 3e78_A 3eki_A 3e79_A. Probab=26.83 E-value=5.4 Score=14.68 Aligned_cols=26 Identities=23% Similarity=0.133 Sum_probs=14.0 Q ss_pred CCHHHHHHHHHH----HHHHHCCCCCCCCC Q ss_conf 913588999999----87763014776556 Q T0571 1 MKKNLAYIGLVL----LILTWTSCESSDNE 26 (344) Q Consensus 1 MKk~~~~l~l~~----l~~l~tSC~n~e~~ 26 (344) ||+++..+.++. ..++++||....+. T Consensus 1 ~k~k~~~~~~~~~i~~~~~~~~SC~~~~~~ 30 (386) T PF06646_consen 1 MKKKKKKLLSFSSIFSSSSIAISCGTQSNN 30 (386) T ss_dssp ------------------------------ T ss_pred CCCHHHHHHHHHHHHHHHHHHHHHCCCCCC T ss_conf 950035799999999889997741467676 No 42 >PF09710 Trep_dent_lipo: Treponema clustered lipoprotein (Trep_dent_lipo) Probab=26.63 E-value=5.8 Score=14.48 Aligned_cols=30 Identities=20% Similarity=0.336 Sum_probs=15.5 Q ss_pred CCCCCCEEEEEEEECCCCCCCEEEECCEEEEEC Q ss_conf 122311147898512577664155311023312 Q T0571 181 WSILPQNFVLYAVKYVNPWHGEYLRRGIDHATV 213 (344) Q Consensus 181 ~~~~~~~~~l~~v~~~N~ysg~Y~~~~~~~~~~ 213 (344) -+..++.|-.+++-|+ -|.++..|.+.+.+ T Consensus 206 ~~~~~~~y~~~vlDYV---kGNFTnSGyDEYiV 235 (394) T PF09710_consen 206 KTHKYKQYDYKVLDYV---KGNFTNSGYDEYIV 235 (394) T ss_pred CCCCCCCCCEEEEEEE---CCCCCCCCCCEEEE T ss_conf 0055576755742431---25656788643899 No 43 >PF05540 Serpulina_VSP: Serpulina hyodysenteriae variable surface protein; InterPro: IPR008838 This family consists of several variable surface proteins from Brachyspira hyodysenteriae. 
Probab=26.07 E-value=10 Score=12.69 Aligned_cols=18 Identities=6% Similarity=0.032 Sum_probs=10.5 Q ss_pred CCCCCEEEEEEEEECCCC Q ss_conf 677872588999841567 Q T0571 142 KSISENYVIPLLMTNVQG 159 (344) Q Consensus 142 l~~~~~YvLPl~I~~~sg 159 (344) |.-.-.=++|++|.-.++ T Consensus 128 LNdnLRI~vPVqIaV~~~ 145 (377) T PF05540_consen 128 LNDNLRIAVPVQIAVGSD 145 (377) T ss_pred CCCCEEEEEEEEEEECCC T ss_conf 056528998799997367 No 44 >PF06030 DUF916: Bacterial protein of unknown function (DUF916); InterPro: IPR010317 This family consists of putative cell surface proteins, from Firmicutes, of unknown function. Probab=25.43 E-value=11 Score=12.61 Aligned_cols=82 Identities=5% Similarity=0.197 Sum_probs=38.5 Q ss_pred EEEEECCCCCEEEEEEEEEEECCCCCCCEEEEEEECHHHHHH----HHHHHCCCEEEECC---CCCEEECCCCEEECCCC Q ss_conf 255410577347999999851667775479999876589876----52111186668858---64166367507974776 Q T0571 52 EFVDNTLDNQHKMVIKAAWGGGYTNRNNVVINFKVDESLCDN----LYFKDTDQPLVPMP---ASYYTLASDRIAIPKGQ 124 (344) Q Consensus 52 ~~~~~~~~~~~~~~i~v~vsgs~~~~~dv~V~i~vD~slL~~----~YN~~~~t~Y~~LP---~~~Ysl~~~~v~I~aGe 124 (344) .|++....-..+..+.+.+ ....++.+++.+.+..+.-.. .|.......=.-|+ .++-+++..++++|+++ T Consensus 17 ~YFdL~~~P~q~qtl~v~v--~N~t~~~itv~v~~~~A~Tn~nG~idY~~~~~~~d~sl~~~~~~~v~~~~~~Vtl~~~~ 94 (122) T PF06030_consen 17 SYFDLKVKPGQTQTLQVRV--TNNTDKPITVKVSANNATTNDNGVIDYSPSTKKKDSSLKYPFSDLVKIPKEEVTLPANS 94 (122) T ss_pred CCEEEEECCCCEEEEEEEE--ECCCCCCEEEEEEEEEEEECCCEEEEECCCCCCCCCCCCCCHHHHCCCCCCEEEECCCC T ss_conf 7489996899959999999--92899968999997165756887899667887746434846799612688769989998 Q ss_pred EEE-EEEEEEEE Q ss_conf 257-66799841 Q T0571 125 IMA-GVEVQLTD 135 (344) Q Consensus 125 ~~~-~v~i~~~~ 135 (344) ..- .+.|++.. 
T Consensus 95 sk~V~~~lk~P~ 106 (122) T PF06030_consen 95 SKTVTFTLKMPK 106 (122) T ss_pred EEEEEEEEECCC T ss_conf 799999998688 No 45 >PF03082 MAGSP: Male accessory gland secretory protein; InterPro: IPR004315 The accessory gland of male insects is a genital tissue that secretes many components of the ejaculatory fluid, some of which affect the female's receptivity to courtship and her rate of oviposition. The protein is expressed exclusively in the male accessory glands of adult Drosophila melanogaster. During copulation it is transferred to the female genital tract where it is rapidly altered .; GO: 0007618 mating, 0005576 extracellular region Probab=24.26 E-value=11 Score=12.46 Aligned_cols=22 Identities=23% Similarity=0.551 Sum_probs=17.8 Q ss_pred CCHHHHHHHHHHHHHHHCCCCC Q ss_conf 9135889999998776301477 Q T0571 1 MKKNLAYIGLVLLILTWTSCES 22 (344) Q Consensus 1 MKk~~~~l~l~~l~~l~tSC~n 22 (344) |..++..-++++++|..++|+. T Consensus 1 MNQILLCS~iLLllfaVAnC~~ 22 (264) T PF03082_consen 1 MNQILLCSAILLLLFAVANCDG 22 (264) T ss_pred CCEEHHHHHHHHHHHHHHHCCC T ss_conf 9622006889999988761145 No 46 >PF07996 T4SS: Type IV secretion system proteins; InterPro: IPR012991 Members of this family are components of the type IV secretion system. They mediate intracellular transfer of macromolecules via a mechanism ancestrally related to that of bacterial conjugation machineries.; PDB: 1r8i_A. Probab=23.41 E-value=6.9 Score=13.96 Aligned_cols=20 Identities=30% Similarity=0.373 Sum_probs=10.7 Q ss_pred CCHHHHHHHHHHHHHHHCCC Q ss_conf 91358899999987763014 Q T0571 1 MKKNLAYIGLVLLILTWTSC 20 (344) Q Consensus 1 MKk~~~~l~l~~l~~l~tSC 20 (344) |||+++.+++++++++.+.+ T Consensus 1 MKk~~~~~~~~~~l~~~~~a 20 (217) T PF07996_consen 1 MKKKTLALALALALLMSSPA 20 (217) T ss_dssp -------------------- T ss_pred CHHHHHHHHHHHHHHHCCHH T ss_conf 91579999999999803355 No 47 >PF05643 DUF799: Putative bacterial lipoprotein (DUF799); InterPro: IPR008517 This family consists of several bacterial proteins of unknown function. 
Some of the family members are described as putative lipoproteins. Probab=22.61 E-value=12 Score=12.25 Aligned_cols=18 Identities=28% Similarity=0.370 Sum_probs=10.1 Q ss_pred CCHHHHHHHHHHHHHHHCCCC Q ss_conf 913588999999877630147 Q T0571 1 MKKNLAYIGLVLLILTWTSCE 21 (344) Q Consensus 1 MKk~~~~l~l~~l~~l~tSC~ 21 (344) ||+.++ .++++++|++|. T Consensus 1 ~k~~~~---~l~~~l~LsgCa 18 (215) T PF05643_consen 1 MKPLLL---GLAALLLLSGCA 18 (215) T ss_pred CHHHHH---HHHHHHHHHHCC T ss_conf 900599---999999996076 No 48 >PF00820 Lipoprotein_1: Borrelia lipoprotein The Pfam entry is a subset of this entry.; InterPro: IPR001809 The ospA and ospB genes encode the major outer membrane proteins of the Lyme disease spirochaete Borrelia burgdorferi . The deduced gene products OspA and OspB, contain 273 and 296 residues respectively . The two Osp proteins show a high degree of sequence similarity, indicating a recent evolutionary event. Molecular analysis and sequence comparison of OspA and OspB with other proteins has revealed similarity to the signal peptides of prokaryotic lipoproteins , .; GO: 0009279 cell outer membrane; PDB: 3ckg_A 3cka_B 2fkg_A 2fkj_A 2pi3_O 2ol6_O 3ckf_A 2i5z_O 2hkd_A 1p4p_A .... Probab=22.57 E-value=12 Score=12.24 Aligned_cols=104 Identities=16% Similarity=0.187 Sum_probs=46.8 Q ss_pred CCHHHHHHHHHHHHHHHCCCCCCCCCCCCCCCCEEEEEECCCCCCCCCCCEEEEEECCCCCEEEEEEEEE-----E---- Q ss_conf 9135889999998776301477655656511114888632332000122102554105773479999998-----5---- Q T0571 1 MKKNLAYIGLVLLILTWTSCESSDNEFPDFDYQTVYFANQYGLRTIELGESEFVDNTLDNQHKMVIKAAW-----G---- 71 (344) Q Consensus 1 MKk~~~~l~l~~l~~l~tSC~n~e~~~e~~~~~~vy~~~~~~~~~~~~~~~~~~~~~~~~~~~~~i~v~v-----s---- 71 (344) ||+++ .-+++.+.+.+|....-...+. +.+- ..-|.. ....+...-+.+..|.+.++| . 
T Consensus 1 MkkYL---lG~~LilAliaC~Q~~ss~d~k--~s~~--~dLp~~-----~~V~vSKEKnkdGKY~L~AtVDklELKGtSD 68 (273) T PF00820_consen 1 MKKYL---LGIGLILALIACKQNVSSLDEK--NSSS--VDLPGE-----MKVFVSKEKNKDGKYDLRATVDKLELKGTSD 68 (273) T ss_dssp ----------------------------SS--SEEE--EEE--------EEEEEESSE-----BEEEEEETTEEEB--BS T ss_pred CCEEH---HHHHHHHHHHHHHCCCCCCCCC--CCCC--CCCCCC-----EEEEEEECCCCCCCEEEEEEEEEEEEECCCC T ss_conf 93301---4799999998760224465645--5621--048884-----3899982147787368999864478833224 Q ss_pred ---------ECCCCCCCEEEEEEECHHHHHHHHHHHCCCEEEECCCCCEEECCCCEEECCCCEEE Q ss_conf ---------16677754799998765898765211118666885864166367507974776257 Q T0571 72 ---------GGYTNRNNVVINFKVDESLCDNLYFKDTDQPLVPMPASYYTLASDRIAIPKGQIMA 127 (344) Q Consensus 72 ---------gs~~~~~dv~V~i~vD~slL~~~YN~~~~t~Y~~LP~~~Ysl~~~~v~I~aGe~~~ 127 (344) |.++ ++ -.|++.+..+|- .+.++.+.++. .+-+..++-+.|+... T Consensus 69 KnnGSG~LEG~K~-Dk-SKvkltisdDL~--------~tt~E~fkedg-t~Vs~KV~~KdkS~TE 122 (273) T PF00820_consen 69 KNNGSGTLEGVKA-DK-SKVKLTISDDLS--------TTTFETFKEDG-TLVSRKVTSKDKSSTE 122 (273) T ss_dssp S------BB---T-TS--EEEEEE-TT-----------EEEEEE------EEEEEEEETTSSEEE T ss_pred CCCCCCEEECCCC-CC-CEEEEEEECCCC--------CEEEEEECCCC-CEEEEEEEECCCCCCH T ss_conf 7788530312117-76-567899922667--------30699983799-4986555605676303 No 49 >PF11873 DUF3393: Domain of unknown function (DUF3393) Probab=22.56 E-value=10 Score=12.73 Aligned_cols=16 Identities=44% Similarity=0.723 Sum_probs=9.3 Q ss_pred HHHHHHHHHHCCCCCC Q ss_conf 9999987763014776 Q T0571 8 IGLVLLILTWTSCESS 23 (344) Q Consensus 8 l~l~~l~~l~tSC~n~ 23 (344) +++++++++++||... T Consensus 4 l~~~~~~llL~~Cs~~ 19 (204) T PF11873_consen 4 LILLLIILLLSSCSSE 19 (204) T ss_pred HHHHHHHHHHHHHCCC T ss_conf 1999999999985775 No 50 >PF06280 DUF1034: Fn3-like domain (DUF1034); InterPro: IPR010435 Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes . 
They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base. The geometric orientations of the catalytic residues are similar between families, despite different protein folds. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC). Peptidases are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry. Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins.
Families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. This domain of unknown function is present in bacterial and plant peptidases belonging to MEROPS peptidase family S8 (subfamily S8A subtilisin, clan SB). It is C-terminal to and adjacent to the S8 peptidase domain and can be found in conjunction with the PA (Protease associated) domain (IPR003137 from INTERPRO) and additionally in Gram-positive bacteria with the surface protein anchor domain (IPR001899 from INTERPRO).; GO: 0004252 serine-type endopeptidase activity, 0005618 cell wall, 0016020 membrane; PDB: 1xf1_A 3eif_A. Probab=22.56 E-value=12 Score=12.24 Aligned_cols=24 Identities=25% Similarity=0.462 Sum_probs=18.1 Q ss_pred CEEECCCCEEECCCCEEEEEEEEEE Q ss_conf 1663675079747762576679984 Q T0571 110 YYTLASDRIAIPKGQIMAGVEVQLT 134 (344) Q Consensus 110 ~Ysl~~~~v~I~aGe~~~~v~i~~~ 134 (344) ..+++..++++|||+... +.|+|+ T Consensus 56 ~~~~~~~~vTV~ag~s~~-v~vt~~ 79 (112) T PF06280_consen 56 SVTFSPNTVTVPAGGSKT-VTVTFT 79 (112) T ss_dssp EEE---EEEEE-TTEEEE-EEEEEE T ss_pred EEEECCCEEEECCCCEEE-EEEEEE T ss_conf 666379849999999899-999997 No 51 >PF06873 SerH: Cell surface immobilisation antigen SerH; InterPro: IPR009670 This family consists of several cell surface immobilisation antigen SerH proteins which seem to be specific to Tetrahymena thermophila. The SerH locus of T. thermophila is one of several paralogous loci with genes encoding variants of the major cell surface protein known as the immobilisation antigen (i-ag) . 
Probab=22.33 E-value=12 Score=12.21 Aligned_cols=12 Identities=25% Similarity=0.562 Sum_probs=7.8 Q ss_pred CCHHHHHHHHHH Q ss_conf 913588999999 Q T0571 1 MKKNLAYIGLVL 12 (344) Q Consensus 1 MKk~~~~l~l~~ 12 (344) ||.+++++.|+. T Consensus 1 M~~k~lIi~Lii 12 (403) T PF06873_consen 1 MQNKILIICLII 12 (403) T ss_pred CCCHHHHHHHHH T ss_conf 960046899999 No 52 >PF01514 YscJ_FliF: Secretory protein of YscJ/FliF family; InterPro: IPR006182 This domain is found in proteins that are related to the YscJ lipoprotein, where it covers most of the sequence, and the flagellar M-ring protein FliF, where it covers the N-terminal region. The members of the YscJ family are thought to be involved in secretion of several proteins. The FliF protein ring is thought to be part of the export apparatus for flagellar proteins, based on the similarity to YscJ proteins .; PDB: 1yj7_D. Probab=22.10 E-value=7.6 Score=13.67 Aligned_cols=22 Identities=32% Similarity=0.566 Sum_probs=12.2 Q ss_pred CCHHHHH-----HHHHHHHHHHCCCCC Q ss_conf 9135889-----999998776301477 Q T0571 1 MKKNLAY-----IGLVLLILTWTSCES 22 (344) Q Consensus 1 MKk~~~~-----l~l~~l~~l~tSC~n 22 (344) |+|+..+ ++++++++++++|.+ T Consensus 1 ~~k~~ki~~~~vi~~~~~~~~~~~~~~ 27 (206) T PF01514_consen 1 MNKKQKIMAVAVIALLVALLLLSSCPD 27 (206) T ss_dssp -------------------------E- T ss_pred CCHHHHHHHHHHHHHHHHHHHHHCCCC T ss_conf 951478878999999999999966997 Done!