Query         T0550 ZP_02066326.1, Bacteroides ovatus, 339 residues
Match_columns 339
No_of_seqs    142 out of 195
Neff          6.5
Searched_HMMs 11830
Date          Fri May 21 18:08:25 2010
Command       /home/syshi_2/2008/ferredoxin/manualcheck/update/HHsearch/bin/hhsearch -i /home/syshi_3/CASP9/HHsearch4Targetseq/pfamAsearch/hhm/T0550.hhm -d /home/syshi_2/2008/ferredoxin/manualcheck/update/HHsearch/database/pfamA_24_hhmdb -o /home/syshi_3/CASP9/HHsearch4Targetseq/pfamAsearch/hhm/T0550.hhr

 No Hit                             Prob E-value P-value  Score    SS Cols Query HMM  Template HMM
  1 PF08522 DUF1735: Domain of un   99.3 7.3E-13 6.2E-17   93.2   5.1   55   88-148     1-59  (59)
  2 PF08085 Entericidin: Enterici   83.8    0.24   2E-05   23.7   2.3   22    1-22      1-23  (42)
  3 PF12099 DUF3575: Protein of u   82.9    0.94   8E-05   20.1   6.7   62    1-68      1-62  (189)
  4 PF02402 Lysis_col: Lysis prot   75.7    0.24   2E-05   23.7   0.1   22    1-23      1-22  (46)
  5 PF12262 Lipase_bact_N: Bacter   75.4    0.76 6.4E-05   20.6   2.6   26    1-27      1-26  (268)
  6 PF03304 Mlp: Mlp lipoprotein    71.9    0.79 6.7E-05   20.5   2.0   21    1-23      1-21  (181)
  7 PF12245 DUF3607: Protein of u   71.1     2.1 0.00017   18.0   6.4   83   50-135    52-136 (724)
  8 PF01441 Lipoprotein_6: Lipopr   69.4    0.39 3.3E-05   22.4   0.0   24    1-25      1-24  (210)
  9 PF07901 DUF1672: Protein of u   62.4     1.6 0.00014   18.6   2.1   20    1-22      1-20  (304)
 10 PF03207 OspD: Borrelia outer    59.4     2.4  0.0002   17.6   2.5   23    1-23      1-23  (254)
 11 PF06085 Rz1: Lipoprotein Rz1    58.0     2.6 0.00022   17.3   2.5   22    1-22      1-22  (59)
 12 PF03978 Borrelia_REV: Borreli   56.1     1.9 0.00016   18.2   1.5   22    1-22      1-22  (160)
 13 PF07273 DUF1439: Protein of u   56.0       1 8.6E-05   19.9   0.1   20    1-22      1-20  (177)
 14 PF06291 Lambda_Bor: Bor prote   54.9     2.4  0.0002   17.6   1.9   19    1-22      1-19  (97)
 15 PF00907 T-box: T-box; InterP    50.8     4.4 0.00037   16.0   2.7   42  117-159    19-60  (183)
 16 PF10566 Glyco_hydro_97: Glyco   50.1     4.9 0.00042   15.7   3.3   21    1-21      1-22  (643)
 17 PF06474 MLTD_N: MLTD_N; Inte    47.8     3.9 0.00033   16.3   2.0   18    1-22      1-18  (95)
 18 PF01298 Lipoprotein_5: Transf   45.7     3.2 0.00027   16.8   1.3   21    1-21      1-22  (593)
 19 PF02030 Lipoprotein_8: Hypoth   44.7     5.8 0.00049   15.3   2.5   23    1-23      1-26  (493)
 20 PF11839 DUF3359: Protein of u   42.3     5.8 0.00049   15.3   2.2   21    1-22      1-21  (96)
 21 PF06788 UPF0257: Uncharacteri   41.1     6.8 0.00058   14.9   2.7   27    1-30      1-27  (236)
 22 PF03160 Calx-beta: Calx-beta    40.1     7.1  0.0006   14.8   7.5   72   58-149    27-100 (101)
 23 PF04507 DUF576: Protein of un   37.8     7.6 0.00064   14.6   2.2   27    3-29      6-36  (257)
 24 PF12092 DUF3568: Protein of u   37.8     7.3 0.00061   14.7   2.1   21    2-22      1-21  (131)
 25 PF03082 MAGSP: Male accessory   37.6     6.4 0.00054   15.0   1.8   23    1-23      1-23  (264)
 26 PF12034 DUF3520: Domain of un   37.2     7.9 0.00066   14.5   2.3   58   71-135     4-64  (182)
 27 PF11810 DUF3332: Domain of un   37.2     7.8 0.00066   14.5   2.2   20    2-21      2-21  (176)
 28 PF10671 TcpQ: Toxin co-regula   36.1     4.1 0.00035   16.2   0.7   19    1-22      1-19  (169)
 29 PF09403 FadA: Adhesion protei   35.4     3.2 0.00027   16.8   0.0   20    1-20      1-20  (126)
 30 PF05643 DUF799: Putative bact   34.4       8 0.00068   14.4   1.9   18    1-21      1-18  (215)
 31 PF11777 DUF3316: Protein of u   33.5       9 0.00076   14.1   2.5   22    1-22      1-22  (114)
 32 PF08139 LPAM_1: Prokaryotic m   33.5     8.6 0.00073   14.2   1.9   17    2-21      9-25  (26)
 33 PF06280 DUF1034: Fn3-like dom   26.9      12 0.00099   13.4   5.4   45  112-157    56-104 (112)
 34 PF11153 DUF2931: Protein of u   25.7      12   0.001   13.3   5.0   35    1-39      1-38  (216)
 35 PF07172 GRP: Glycine rich pro   25.4      13  0.0011   13.3   2.1   18    4-21      5-22  (95)
 36 PF06135 DUF965: Bacterial pro   25.3     6.3 0.00053   15.1   0.1   31  249-280    27-58  (79)
 37 PF05628 Borrelia_P13: Borreli   25.2      11 0.00092   13.6   1.3   22    1-22      1-22  (167)
 38 PF09160 FimH_man-bind: FimH,    22.8      14  0.0012   13.0   2.3   34  102-135    95-133 (150)
 39 PF06030 DUF916: Bacterial pro   22.7      14  0.0012   13.0   6.1   68   61-134    29-104 (122)
 40 PF09710 Trep_dent_lipo: Trepo   20.6     9.1 0.00077   14.1   0.1   12   12-23     10-21  (394)

No 1
>PF08522 DUF1735: Domain of unknown function (DUF1735); InterPro: IPR013728 This domain of unknown function is found in a number of Bacteroidetes proteins including acylhydrolases.
Probab=99.31 E-value=7.3e-13 Score=93.18 Aligned_cols=55 Identities=33% Similarity=0.524 Sum_probs=50.7 Q ss_pred HHHHHHHHCCCCCCCCEEEECCCCCCCCCC-CEEEECCCEEEEEEEEEEECC---CCCCCEEEEE Q ss_conf 988877643653257356886610100575-258716965788999985156---7876613768 Q T0550 88 LKTLNIERFSLYRPELWYTEMEEDKYEFPE-TVHIPAGSCVELLNIDFNLQD---IDMLEKWVLP 148 (339) Q Consensus 88 L~~YN~~~~~~~~~~t~y~~LP~~~Ysl~~-tv~I~AGe~~s~i~I~f~~~~---Ld~~~~YvLP 148 (339) |++||++| ++.|++||+++|+|++ +++|+||+..+.++|+|++++ |+.+++|||| T Consensus 1 l~~YN~~~------~t~y~~LP~~~Y~l~~~~v~i~aG~~~~~~~i~v~~~~~~~l~~~~~Y~LP 59 (59) T PF08522_consen 1 LDAYNAAN------GTDYKLLPEDCYSLPSKTVTIPAGESSSSVPITVKFKGLEELDPDKTYVLP 59 (59) T ss_pred CHHHHHHC------CCCCEECCHHHEEECCCEEEEECCCEEEEEEEEEEECCCCCCCCCCEEECC T ss_conf 97678650------986188884688965987999099988866699994884227889738369 No 2 >PF08085 Entericidin: Entericidin EcnA/B family; InterPro: IPR012556 This family consists of the entericidin antidote/toxin peptides. The entericidin locus is activated in stationary phase under high osmolarity conditions by rho-S and simultaneously repressed by the osmoregulatory EnvZ/OmpR signal transduction pathway. 
The entericidin locus encodes tandem paralogous genes (ecnAB) and directs the synthesis of two small cell-envelope lipoproteins which can maintain plasmids in bacterial population by means of post-segregational killing .; GO: 0009636 response to toxin, 0016020 membrane Probab=83.80 E-value=0.24 Score=23.67 Aligned_cols=22 Identities=45% Similarity=0.685 Sum_probs=14.9 Q ss_pred CCH-HHHHHHHHHHHHHHHCCCC Q ss_conf 912-7899999999887510477 Q T0550 1 MKN-IYIYLSLLAVIVLGTACNN 22 (339) Q Consensus 1 MKk-i~~~i~ll~ll~~~ssC~d 22 (339) ||| +..+++++++++.+++||- T Consensus 1 Mkk~~~~~~~~~~~~~~l~gCnT 23 (42) T PF08085_consen 1 MKKKILIILALLALALALAGCNT 23 (42) T ss_pred CCHHHHHHHHHHHHHHHHHHHHH T ss_conf 95589999999999998800263 No 3 >PF12099 DUF3575: Protein of unknown function (DUF3575) Probab=82.86 E-value=0.94 Score=20.05 Aligned_cols=62 Identities=18% Similarity=0.230 Sum_probs=27.6 Q ss_pred CCHHHHHHHHHHHHHHHHCCCCCCCCCCCCCEEEEECCCCCCCCEEEEEEEEEECCCCEEEEEEEEEE Q ss_conf 91278999999998875104777766535403665201334432024678886078725799999972 Q T0550 1 MKNIYIYLSLLAVIVLGTACNNEWEDEQYEQYVSFKAPIASGSDGVTTIYVRYKDNGKVTYQLPIIVS 68 (339) Q Consensus 1 MKki~~~i~ll~ll~~~ssC~dd~e~e~y~~~vy~~~~~~~~~~~~~~~~v~~~~~~~~t~~l~v~vs 68 (339) ||||..+++++++.+ ..++.. ...+.+.+++-........-++-+....+...|..+++..+ T Consensus 1 ~~~~~~~~~~~~~~~-~~~~~~-----~~~q~~alKtNlLy~a~~tpNlg~E~~l~~~~sl~l~~~yn 62 (189) T PF12099_consen 1 MKKIRFLFLLLLLFC-TSSPAK-----ASAQQVALKTNLLYDATGTPNLGVEFRLGKRWSLDLSGSYN 62 (189) T ss_pred CCEEEHHHHHHHHHH-HHCCCC-----CCCEEEEEEEEHHHHHHHCCCEEEEEEECCCEEEEEEEEEC T ss_conf 935304599999999-753556-----65438999970666886498649999967987999678926 No 4 >PF02402 Lysis_col: Lysis protein; InterPro: IPR003059 The DNA sequence of the entire colicin E2 operon has been determined . The operon comprises the colicin activity gene (ceaB), the colicin immunity gene (ceiB) and the lysis gene (celB), which is essential for colicin release from producing cells . 
A putative LexA binding site is located upstream from ceaB, and a rho-independent terminator structure is located downstream from celB. Comparison of the amino acid sequences of colicin E2 and cloacin DF13 reveals extensive similarity. These colicins have different modes of action and recognise different cell surface receptors; the two major regions of heterology, at the C-terminus and in the C-terminal end of the central region, are thought to correspond to the catalytic and receptor-recognition domains, respectively. Sequence similarities between colicins E2, A and E1 are less striking. The colicin E2 (pyocin) immunity protein does not share similarity with either the colicin E3 or cloacin DF13 immunity proteins. By contrast, the lysis proteins of the ColE2, ColE1 and CloDF13 plasmids are almost identical except in the N-terminal regions, which themselves are similar to lipoprotein signal peptides. Processing of the ColE2 prolysis protein to the mature form is prevented by globomycin, a specific inhibitor of the lipoprotein signal peptidase. The mature ColE2 lysis protein is located in the cell envelope.
; GO: 0009405 pathogenesis, 0019835 cytolysis, 0019867 outer membrane Probab=75.68 E-value=0.24 Score=23.67 Aligned_cols=22 Identities=36% Similarity=0.449 Sum_probs=14.9 Q ss_pred CCHHHHHHHHHHHHHHHHCCCCC Q ss_conf 91278999999998875104777 Q T0550 1 MKNIYIYLSLLAVIVLGTACNNE 23 (339) Q Consensus 1 MKki~~~i~ll~ll~~~ssC~dd 23 (339) ||||..++++++ .+++++|+-+ T Consensus 1 MkKi~~~~i~~~-~~~L~aCQaN 22 (46) T PF02402_consen 1 MKKILFIGILLL-TMLLAACQAN 22 (46) T ss_pred CCEEEEEHHHHH-HHHHHHHHHC T ss_conf 947877389999-9999872012 No 5 >PF12262 Lipase_bact_N: Bacterial virulence factor lipase N-terminal Probab=75.36 E-value=0.76 Score=20.61 Aligned_cols=26 Identities=23% Similarity=0.331 Sum_probs=16.8 Q ss_pred CCHHHHHHHHHHHHHHHHCCCCCCCCC Q ss_conf 912789999999988751047777665 Q T0550 1 MKNIYIYLSLLAVIVLGTACNNEWEDE 27 (339) Q Consensus 1 MKki~~~i~ll~ll~~~ssC~dd~e~e 27 (339) |||.++.+++ +..+.+++|.++-+.+ T Consensus 1 Mkk~~l~~~i-asal~LaGCg~ds~~~ 26 (268) T PF12262_consen 1 MKKKLLSLAI-ASALGLAGCGGDSESS 26 (268) T ss_pred CCHHHHHHHH-HHHHHCCCCCCCCCCC T ss_conf 9435899999-9997511147997676 No 6 >PF03304 Mlp: Mlp lipoprotein family; InterPro: IPR004983 The Mlp (for Multicopy Lipoprotein) family of lipoproteins is found in Borrelia species . This family were previously known as 2.9 lipoprotein genes . These surface expressed genes may represent new candidate vaccinogens for Lyme disease . Members of this family generally are downstream of four ORFs called A,B,C and D that are involved in hemolytic activity. 
Probab=71.86 E-value=0.79 Score=20.52 Aligned_cols=21 Identities=38% Similarity=0.672 Sum_probs=13.2 Q ss_pred CCHHHHHHHHHHHHHHHHCCCCC Q ss_conf 91278999999998875104777 Q T0550 1 MKNIYIYLSLLAVIVLGTACNNE 23 (339) Q Consensus 1 MKki~~~i~ll~ll~~~ssC~dd 23 (339) ||.|.++++++ ++++.||+.+ T Consensus 1 mKiinilfcl~--lllL~~Cn~n 21 (181) T PF03304_consen 1 MKIINILFCLF--LLLLNSCNSN 21 (181) T ss_pred CCEEHHHHHHH--HHHHHCCCCC T ss_conf 94405899999--9999476768 No 7 >PF12245 DUF3607: Protein of unknown function (DUF3607) Probab=71.14 E-value=2.1 Score=18.01 Aligned_cols=83 Identities=14% Similarity=0.201 Sum_probs=39.9 Q ss_pred EEEEECCCCEEEEEEEEEEECCCCCCCEEEEEEECHHHHHHHHHHHCCCCCCCCEEEECCCCCCCCCCCEEEECCCEEEE Q ss_conf 88860787257999999720766776479999877699988877643653257356886610100575258716965788 Q T0550 50 YVRYKDNGKVTYQLPIIVSGSTVNSQDRDIHIAVDKDTLKTLNIERFSLYRPELWYTEMEEDKYEFPETVHIPAGSCVEL 129 (339) Q Consensus 50 ~v~~~~~~~~t~~l~v~vsgs~~~~~ditVti~vD~slL~~YN~~~~~~~~~~t~y~~LP~~~Ysl~~tv~I~AGe~~s~ 129 (339) .+.+..+=+....|.|..++.+..=..++=++.++...+..=-.++|+. +..--.|+++.|++-.++.=-+|..++. T Consensus 52 ~~~lssGLDRk~kisV~rs~g~~~~st~ts~~~~a~d~it~~G~eyYGk---~ltlPal~eG~ytl~~eiLd~~g~~V~t 128 (724) T PF12245_consen 52 TFALSSGLDRKVKISVTRSSGTLMVSTVTSHVLVADDRITADGSEYYGK---ELTLPALGEGTYTLKAEILDSDGNVVQT 128 (724) T ss_pred EEEEECCCCCEEEEEEEECCCEEEEEECCCEEEEEEEEEEECCCCCCCC---EEECCCCCCCCEEEEEEEECCCCCEEEE T ss_conf 9999615553179999955985999851330787613785078420051---6402446887279999886168877876 Q ss_pred --EEEEEE Q ss_conf --999985 Q T0550 130 --LNIDFN 135 (339) Q Consensus 130 --i~I~f~ 135 (339) .|+.++ T Consensus 129 ~~ypl~ID 136 (724) T PF12245_consen 129 YSYPLTID 136 (724) T ss_pred EEEEEEEE T ss_conf 56768995 No 8 >PF01441 Lipoprotein_6: Lipoprotein This Pfam family is a subset of the Prosite family.; InterPro: IPR001800 Members of this family are lipoproteins that are probably involved in evasion of the host immune system by pathogens . 
They are predominantly found in the Spirochaetaceae.; GO: 0006952 defense response, 0009279 cell outer membrane; PDB: 1f1m_C 1g5z_A 2ga0_E 1yjg_E 1ggq_C. Probab=69.35 E-value=0.39 Score=22.36 Aligned_cols=24 Identities=29% Similarity=0.382 Sum_probs=15.1 Q ss_pred CCHHHHHHHHHHHHHHHHCCCCCCC Q ss_conf 9127899999999887510477776 Q T0550 1 MKNIYIYLSLLAVIVLGTACNNEWE 25 (339) Q Consensus 1 MKki~~~i~ll~ll~~~ssC~dd~e 25 (339) |||.- +..+++.+++|.|||+-.. T Consensus 1 Mkk~t-lSaIlMtLflfisCNNsG~ 24 (210) T PF01441_consen 1 MKKNT-LSAILMTLFLFISCNNSGK 24 (210) T ss_dssp ------------------------- T ss_pred CCHHH-HHHHHHHHHHHHHCCCCCC T ss_conf 95137-9999999999996378887 No 9 >PF07901 DUF1672: Protein of unknown function (DUF1672); InterPro: IPR012873 This family is composed of hypothetical bacterial proteins of unknown function. Probab=62.41 E-value=1.6 Score=18.59 Aligned_cols=20 Identities=50% Similarity=0.639 Sum_probs=13.7 Q ss_pred CCHHHHHHHHHHHHHHHHCCCC Q ss_conf 9127899999999887510477 Q T0550 1 MKNIYIYLSLLAVIVLGTACNN 22 (339) Q Consensus 1 MKki~~~i~ll~ll~~~ssC~d 22 (339) |||++. ++++++++++||.. T Consensus 1 M~K~i~--~ll~~~lLLgGCs~ 20 (304) T PF07901_consen 1 MKKRII--SLLAATLLLGGCSN 20 (304) T ss_pred CHHHHH--HHHHHHHHHCCCCC T ss_conf 914899--99999999744446 No 10 >PF03207 OspD: Borrelia outer surface protein D (OspD); InterPro: IPR004894 This is a family of outer surface proteins from Borrelia. The function of these proteins is unknown. 
Probab=59.39 E-value=2.4 Score=17.58 Aligned_cols=23 Identities=26% Similarity=0.479 Sum_probs=17.8 Q ss_pred CCHHHHHHHHHHHHHHHHCCCCC Q ss_conf 91278999999998875104777 Q T0550 1 MKNIYIYLSLLAVIVLGTACNNE 23 (339) Q Consensus 1 MKki~~~i~ll~ll~~~ssC~dd 23 (339) |||.+.++++.+++++..||--| T Consensus 1 mkklikill~slflllsisc~hd 23 (254) T PF03207_consen 1 MKKLIKILLLSLFLLLSISCVHD 23 (254) T ss_pred CHHHHHHHHHHHHHHHHHHHCCC T ss_conf 91699999999999986321242 No 11 >PF06085 Rz1: Lipoprotein Rz1 precursor; InterPro: IPR010346 This family consists of several bacteria and phage lipoprotein Rz1 precursors. Rz1 is a proline-rich lipoprotein from bacteriophage lambda, which is known to have fusogenic properties. Rz1-induced liposome fusion is thought to be mediated primarily by the generation of local perturbation in the bilayer lipid membrane and to a lesser extent by electrostatic forces .; GO: 0019064 viral envelope fusion with host membrane, 0019867 outer membrane Probab=57.95 E-value=2.6 Score=17.34 Aligned_cols=22 Identities=27% Similarity=0.499 Sum_probs=16.9 Q ss_pred CCHHHHHHHHHHHHHHHHCCCC Q ss_conf 9127899999999887510477 Q T0550 1 MKNIYIYLSLLAVIVLGTACNN 22 (339) Q Consensus 1 MKki~~~i~ll~ll~~~ssC~d 22 (339) ||++...++.+++.+.++||.. T Consensus 1 Mr~l~~~l~~~~~~L~lsaC~S 22 (59) T PF06085_consen 1 MRKLKMLLCALALPLALSACSS 22 (59) T ss_pred CHHHHHHHHHHHHHHHHHHHCC T ss_conf 9038999999999999987158 No 12 >PF03978 Borrelia_REV: Borrelia burgdorferi REV protein; InterPro: IPR007126 This family consists of several REV proteins from Borrelia burgdorferi (Lyme disease spirochete) and Borrelia garinii. The function of REV is unknown although it has been shown that the gene is induced during the ingesting of host blood suggesting a role in the metabolic activation of borreliae to adapt to physiological stimuli . 
Probab=56.09 E-value=1.9 Score=18.20 Aligned_cols=22 Identities=27% Similarity=0.495 Sum_probs=14.4 Q ss_pred CCHHHHHHHHHHHHHHHHCCCC Q ss_conf 9127899999999887510477 Q T0550 1 MKNIYIYLSLLAVIVLGTACNN 22 (339) Q Consensus 1 MKki~~~i~ll~ll~~~ssC~d 22 (339) ||++.++=++++++++..||+. T Consensus 1 MknkNI~KLfFvsmlfvmaCk~ 22 (160) T PF03978_consen 1 MKNKNIFKLFFVSMLFVMACKA 22 (160) T ss_pred CCCCHHHHHHHHHHHHHHHHHH T ss_conf 9740099999999999999999 No 13 >PF07273 DUF1439: Protein of unknown function (DUF1439); InterPro: IPR010835 This family consists of several hypothetical bacterial proteins of around 190 residues in length. Several members of this family are annotated as being putative lipoproteins and are often known as YceB. The function of this family is unknown.; PDB: 3eyr_B. Probab=55.99 E-value=1 Score=19.85 Aligned_cols=20 Identities=30% Similarity=0.522 Sum_probs=14.2 Q ss_pred CCHHHHHHHHHHHHHHHHCCCC Q ss_conf 9127899999999887510477 Q T0550 1 MKNIYIYLSLLAVIVLGTACNN 22 (339) Q Consensus 1 MKki~~~i~ll~ll~~~ssC~d 22 (339) ||++.. ++++++++++||+. T Consensus 1 Mk~~~~--~~l~~~~~L~gC~~ 20 (177) T PF07273_consen 1 MKKLLL--LALILALLLTGCAS 20 (177) T ss_dssp --------------------CH T ss_pred CCHHHH--HHHHHHHHHHCCCC T ss_conf 926999--99999999861255 No 14 >PF06291 Lambda_Bor: Bor protein; InterPro: IPR010438 This family consists of several Bacteriophage lambda Bor and Escherichia coli Iss proteins. Expression of bor significantly increases the survival of the E. coli host cell in animal serum. This property is a well known bacterial virulence determinant indeed, bor and its adjacent sequences are highly homologous to the iss serum resistance locus of the plasmid ColV2-K94, which confers virulence in animals. It has been suggested that lysogeny may generally have a role in bacterial survival in animal hosts, and perhaps in pathogenesis . 
Probab=54.93 E-value=2.4 Score=17.63 Aligned_cols=19 Identities=37% Similarity=0.551 Sum_probs=11.8 Q ss_pred CCHHHHHHHHHHHHHHHHCCCC Q ss_conf 9127899999999887510477 Q T0550 1 MKNIYIYLSLLAVIVLGTACNN 22 (339) Q Consensus 1 MKki~~~i~ll~ll~~~ssC~d 22 (339) |||+... +++.++++||.. T Consensus 1 mkk~ll~---~~l~llltgCa~ 19 (97) T PF06291_consen 1 MKKILLA---AALALLLTGCAQ 19 (97) T ss_pred CHHHHHH---HHHHHHHCCCCE T ss_conf 9004999---999999645663 No 15 >PF00907 T-box: T-box; InterPro: IPR001699 Transcription factors of the T-box family are required both for early cell-fate decisions, such as those necessary for formation of the basic vertebrate body plan, and for differentiation and organogenesis . The T-box is defined as the minimal region within the T-box protein that is both necessary and sufficient for sequence-specific DNA binding, all members of the family so far examined bind to the DNA consensus sequence TCACACCT. The T-box is a relatively large DNA-binding domain, generally comprising about a third of the entire protein (17-26 kDa). These genes were uncovered on the basis of similarity to the DNA binding domain of Mus musculus (Mouse) Brachyury (T) gene product, which similarity is the defining feature of the family. The Brachyury gene is named for its phenotype, which was identified 70 years ago as a mutant mouse strain with a short blunted tail. The gene, and its paralogues, have become a well-studied model for the family, and hence much of what is known about the T-box family is derived from the murine Brachyury gene. Consistent with its nuclear location, Brachyury protein has a sequence-specific DNA-binding activity and can act as a transcriptional regulator . Homozygous mutants for the gene undergo extensive developmental anomalies, thus rendering the mutation lethal . The postulated role of Brachyury is as a transcription factor, regulating the specification and differentiation of posterior mesoderm during gastrulation in a dose-dependent manner . 
T-box proteins tend to be expressed in specific organs or cell types, especially during development, and they are generally required for the development of those tissues, for example, Brachyury is expressed in posterior mesoderm and in the developing notochord, and it is required for the formation of these cells in mice . ; GO: 0003700 transcription factor activity, 0006355 regulation of transcription, DNA-dependent, 0005634 nucleus; PDB: 1xbr_A 1h6f_B. Probab=50.76 E-value=4.4 Score=16.01 Aligned_cols=42 Identities=19% Similarity=0.295 Sum_probs=31.0 Q ss_pred CCEEEECCCEEEEEEEEEEECCCCCCCEEEEEEEEEECCCCCC Q ss_conf 5258716965788999985156787661376899972688740 Q T0550 117 ETVHIPAGSCVELLNIDFNLQDIDMLEKWVLPLTIVDDGSYAY 159 (339) Q Consensus 117 ~tv~I~AGe~~s~i~I~f~~~~Ld~~~~YvLPltI~~~s~~~~ 159 (339) +.++=++|-+.-. .++|+++||++...|.+=|.++-....-| T Consensus 19 EMIItk~GRrmFP-~l~~~vsGLdp~~~Y~v~l~~~~~d~~ry 60 (183) T PF00907_consen 19 EMIITKSGRRMFP-TLKFSVSGLDPNAKYSVMLDMVPVDDKRY 60 (183) T ss_dssp EEE-B----B-SS---EEEEE---TTSEEEEEEEEEECCCEEE T ss_pred EEEEECCCCCCCC-CEEEEEECCCCCCEEEEEEEEEECCCCEE T ss_conf 7999679971388-66999968897720479999998688142 No 16 >PF10566 Glyco_hydro_97: Glycoside hydrolase 97 ; PDB: 2jkp_A 2zq0_A 2jke_B 2d73_B 2jka_B. Probab=50.08 E-value=4.9 Score=15.70 Aligned_cols=21 Identities=24% Similarity=0.395 Sum_probs=10.6 Q ss_pred CCHHHHHHHHHHHHHHH-HCCC Q ss_conf 91278999999998875-1047 Q T0550 1 MKNIYIYLSLLAVIVLG-TACN 21 (339) Q Consensus 1 MKki~~~i~ll~ll~~~-ssC~ 21 (339) |||+.++++++++++++ ++|. T Consensus 1 MKk~~i~~l~~~l~~~~~~~~~ 22 (643) T PF10566_consen 1 MKKLIIILLALLLLLSASSSAA 22 (643) T ss_dssp ---------------------- T ss_pred CCHHHHHHHHHHHHHHHHHHHC T ss_conf 9437999999999987411201 No 17 >PF06474 MLTD_N: MLTD_N; InterPro: IPR010511 This entry comprises the N-terminal domain of membrane-bound lytic murein transglycosylase D. 
Probab=47.75 E-value=3.9 Score=16.33 Aligned_cols=18 Identities=28% Similarity=0.604 Sum_probs=9.6 Q ss_pred CCHHHHHHHHHHHHHHHHCCCC Q ss_conf 9127899999999887510477 Q T0550 1 MKNIYIYLSLLAVIVLGTACNN 22 (339) Q Consensus 1 MKki~~~i~ll~ll~~~ssC~d 22 (339) ||-+ ++++++++++||.. T Consensus 1 m~~~----~~l~~~llLaGCqs 18 (95) T PF06474_consen 1 MRFL----AVLALALLLAGCQS 18 (95) T ss_pred CHHH----HHHHHHHHHHHCCC T ss_conf 9299----99999999984679 No 18 >PF01298 Lipoprotein_5: Transferrin binding protein-like solute binding protein; InterPro: IPR001677 Bacterial transferrin binding proteins act as transferrin receptors and are required for transferrin utilisation. Transferrins are iron-binding glycoproteins that control the level of free iron in biological fluids. ; GO: 0004998 transferrin receptor activity, 0016020 membrane Probab=45.68 E-value=3.2 Score=16.85 Aligned_cols=21 Identities=24% Similarity=0.263 Sum_probs=12.6 Q ss_pred CCHH-HHHHHHHHHHHHHHCCC Q ss_conf 9127-89999999988751047 Q T0550 1 MKNI-YIYLSLLAVIVLGTACN 21 (339) Q Consensus 1 MKki-~~~i~ll~ll~~~ssC~ 21 (339) |++. ....++++++++|+||- T Consensus 1 M~~~~~~~~~~~l~~~lLsACs 22 (593) T PF01298_consen 1 MNNPPLNQSAIALAAFLLSACS 22 (593) T ss_pred CCCCCCCHHHHHHHHHHHHHHC T ss_conf 9865552558999999998734 No 19 >PF02030 Lipoprotein_8: Hypothetical lipoprotein (MG045 family); InterPro: IPR000044 Mycoplasma genitalium has the smallest known genome of any free-living organism. Its complete genome sequence has been determined by whole-genome random sequencing and assembly . Only 470 putative coding regions were identified, including genes for DNA replication, transcription and translation, DNA repair, cellular transport and energy metabolism . 
A hypothetical protein from the MG045 gene has a homologue of similarly unknown function in Mycoplasma pneumoniae .; GO: 0016020 membrane Probab=44.66 E-value=5.8 Score=15.30 Aligned_cols=23 Identities=30% Similarity=0.353 Sum_probs=14.1 Q ss_pred CCHHHHHHHHHH---HHHHHHCCCCC Q ss_conf 912789999999---98875104777 Q T0550 1 MKNIYIYLSLLA---VIVLGTACNNE 23 (339) Q Consensus 1 MKki~~~i~ll~---ll~~~ssC~dd 23 (339) ||++.+++.+++ +..+++||.++ T Consensus 1 mk~~~k~~~~~~~l~~~~~ltac~~~ 26 (493) T PF02030_consen 1 MKKQKKFLFSLIGLTFSSILTACSKN 26 (493) T ss_pred CCHHHHHHHHHHHHHHHHHHHHCCCC T ss_conf 94046789999999998887644558 No 20 >PF11839 DUF3359: Protein of unknown function (DUF3359) Probab=42.27 E-value=5.8 Score=15.27 Aligned_cols=21 Identities=24% Similarity=0.474 Sum_probs=13.3 Q ss_pred CCHHHHHHHHHHHHHHHHCCCC Q ss_conf 9127899999999887510477 Q T0550 1 MKNIYIYLSLLAVIVLGTACNN 22 (339) Q Consensus 1 MKki~~~i~ll~ll~~~ssC~d 22 (339) ||++. +..+.+.+++..||-. T Consensus 1 M~~~l-~s~~~~~~~L~~GCAs 21 (96) T PF11839_consen 1 MKKLL-ISALALAALLAAGCAS 21 (96) T ss_pred CCHHH-HHHHHHHHHHHHHCCC T ss_conf 90599-9999999999857268 No 21 >PF06788 UPF0257: Uncharacterised protein family (UPF0257); InterPro: IPR010646 This is a group of proteins of unknown function. Probab=41.06 E-value=6.8 Score=14.85 Aligned_cols=27 Identities=26% Similarity=0.408 Sum_probs=15.0 Q ss_pred CCHHHHHHHHHHHHHHHHCCCCCCCCCCCC Q ss_conf 912789999999988751047777665354 Q T0550 1 MKNIYIYLSLLAVIVLGTACNNEWEDEQYE 30 (339) Q Consensus 1 MKki~~~i~ll~ll~~~ssC~dd~e~e~y~ 30 (339) ||+. +++.++++++++|++.-.-..|. T Consensus 1 ~~~~---~~~~~l~~~l~~cd~~~~~~~f~ 27 (236) T PF06788_consen 1 MKKQ---LLLCLLALLLAGCDNASAPKSFT 27 (236) T ss_pred CCEE---EHHHHHHHHHHHCCCCCCHHCCC T ss_conf 9605---45899999776412545110179 No 22 >PF03160 Calx-beta: Calx-beta domain; InterPro: IPR003644 The calx-beta motif is present as a tandem repeat in the cytoplasmic domains of Calx Na-Ca exchangers, which are used to expel calcium from cells. 
This motif overlaps domains used for calcium binding and regulation. The calx-beta motif is also present in the cytoplasmic tail of mammalian integrin-beta4, which mediates the bi-directional transfer of signals across the plasma membrane, as well as in some cyanobacterial proteins. This motif contains a series of beta-strands and turns that form a self-contained beta-sheet , .; GO: 0007154 cell communication, 0016021 integral to membrane; PDB: 2fws_A 3gin_B 2dpk_A 3e9u_A 2fwu_A 2qvm_A 2qvk_A 3fq4_B 3fso_A 3h6a_B .... Probab=40.12 E-value=7.1 Score=14.76 Aligned_cols=72 Identities=14% Similarity=0.208 Sum_probs=43.6 Q ss_pred CEEEEEEEEEEECCCCCCCEEEEEEECHHHHHHHHHHHCCCCCCCCEEEECCCCCCCC-CCCEEEECCCEEEEEEEEEEE Q ss_conf 2579999997207667764799998776999888776436532573568866101005-752587169657889999851 Q T0550 58 KVTYQLPIIVSGSTVNSQDRDIHIAVDKDTLKTLNIERFSLYRPELWYTEMEEDKYEF-PETVHIPAGSCVELLNIDFNL 136 (339) Q Consensus 58 ~~t~~l~v~vsgs~~~~~ditVti~vD~slL~~YN~~~~~~~~~~t~y~~LP~~~Ysl-~~tv~I~AGe~~s~i~I~f~~ 136 (339) .....+.|..+|. .....+.|.+...+.. .-+..-|.. +.++++++|+....+.|.+.- T Consensus 27 ~~~~~~~V~r~~~-~~~~~v~V~~~t~~gt-------------------A~~g~Dy~~~~~~v~F~~ge~~~~i~v~i~d 86 (101) T PF03160_consen 27 DGTVTVTVVRTGG-TTSGPVTVNYSTSDGT-------------------ATAGSDYTPVSGTVTFAPGETSKTITVPIID 86 (101) T ss_dssp --EEEEEEEEES---TTSEEEEEEEEE-SS-------------------S-TTTTBE-----EEE-TT-EEEEEEEEB-- T ss_pred CEEEEEEEEEEEE-CCCEEEEEEEEEECCC-------------------CCCCCCCCCCCEEEEECCCCCEEEEEEEEEC T ss_conf 8099999999411-5886999999997885-------------------3021676503408999399829899999958 Q ss_pred CC-CCCCCEEEEEE Q ss_conf 56-78766137689 Q T0550 137 QD-IDMLEKWVLPL 149 (339) Q Consensus 137 ~~-Ld~~~~YvLPl 149 (339) +. 
.+.++...|-| T Consensus 87 D~~~E~~E~f~v~L 100 (101) T PF03160_consen 87 DNVPEGDETFTVQL 100 (101) T ss_dssp -SSTTSSEEEEEEE T ss_pred CCCCCCCEEEEEEE T ss_conf 99855865899998 No 23 >PF04507 DUF576: Protein of unknown function, DUF576; InterPro: IPR007595 This family contains several uncharacterised staphylococcal proteins. Probab=37.77 E-value=7.6 Score=14.59 Aligned_cols=27 Identities=26% Similarity=0.549 Sum_probs=16.2 Q ss_pred HHHHHHHHHHHHHHHHCCCC----CCCCCCC Q ss_conf 27899999999887510477----7766535 Q T0550 3 NIYIYLSLLAVIVLGTACNN----EWEDEQY 29 (339) Q Consensus 3 ki~~~i~ll~ll~~~ssC~d----d~e~e~y 29 (339) ++..++++++|.++.+||.. ++.++|. T Consensus 6 kl~l~is~liLii~I~Gcg~~~k~~sKe~qI 36 (257) T PF04507_consen 6 KLALYISLLILIIFIGGCGIMNKEDSKEAQI 36 (257) T ss_pred HHHHHHHHHHHHHHHHCCCCCCCCCHHHHHH T ss_conf 6799999999999883134556654078999 No 24 >PF12092 DUF3568: Protein of unknown function (DUF3568) Probab=37.76 E-value=7.3 Score=14.69 Aligned_cols=21 Identities=14% Similarity=0.221 Sum_probs=14.4 Q ss_pred CHHHHHHHHHHHHHHHHCCCC Q ss_conf 127899999999887510477 Q T0550 2 KNIYIYLSLLAVIVLGTACNN 22 (339) Q Consensus 2 Kki~~~i~ll~ll~~~ssC~d 22 (339) ||+...+++.++++.++||.- T Consensus 1 kk~~~~~l~~~~~l~l~sC~~ 21 (131) T PF12092_consen 1 KKLLLATLIAASALSLSSCGV 21 (131) T ss_pred CCHHHHHHHHHHHHHHCCCHH T ss_conf 912899999999998714302 No 25 >PF03082 MAGSP: Male accessory gland secretory protein; InterPro: IPR004315 The accessory gland of male insects is a genital tissue that secretes many components of the ejaculatory fluid, some of which affect the female's receptivity to courtship and her rate of oviposition. The protein is expressed exclusively in the male accessory glands of adult Drosophila melanogaster. 
During copulation it is transferred to the female genital tract where it is rapidly altered .; GO: 0007618 mating, 0005576 extracellular region Probab=37.62 E-value=6.4 Score=15.01 Aligned_cols=23 Identities=17% Similarity=0.414 Sum_probs=19.5 Q ss_pred CCHHHHHHHHHHHHHHHHCCCCC Q ss_conf 91278999999998875104777 Q T0550 1 MKNIYIYLSLLAVIVLGTACNNE 23 (339) Q Consensus 1 MKki~~~i~ll~ll~~~ssC~dd 23 (339) |..|..|..++++++.+++|... T Consensus 1 MNQILLCS~iLLllfaVAnC~~~ 23 (264) T PF03082_consen 1 MNQILLCSAILLLLFAVANCDGL 23 (264) T ss_pred CCEEHHHHHHHHHHHHHHHCCCC T ss_conf 96220068899999887611454 No 26 >PF12034 DUF3520: Domain of unknown function (DUF3520) Probab=37.24 E-value=7.9 Score=14.49 Aligned_cols=58 Identities=14% Similarity=0.171 Sum_probs=40.2 Q ss_pred CCCCCCEEEEEEECHHHHHHHHHHHCCCCCCCCEEEECCCCCCCCCC-C-EEEECCCEE-EEEEEEEE Q ss_conf 66776479999877699988877643653257356886610100575-2-587169657-88999985 Q T0550 71 TVNSQDRDIHIAVDKDTLKTLNIERFSLYRPELWYTEMEEDKYEFPE-T-VHIPAGSCV-ELLNIDFN 135 (339) Q Consensus 71 ~~~~~ditVti~vD~slL~~YN~~~~~~~~~~t~y~~LP~~~Ysl~~-t-v~I~AGe~~-s~i~I~f~ 135 (339) .+..+|+.++++.+|..+.+|-.- |-.-.+|-.+-+.=.. . .-|-||.++ +...|... 
T Consensus 4 ~tiAkDVKiQVEFNPa~V~~YRLI-------GYEnR~L~~eDF~nD~vDAGEIGAGHsVTALYEi~p~ 64 (182) T PF12034_consen 4 FTIAKDVKIQVEFNPAQVAEYRLI-------GYENRALADEDFNNDKVDAGEIGAGHSVTALYEIVPV 64 (182) T ss_pred CHHHHHCEEEEEECHHHHHHHHHH-------HHHHCCCCHHHCCCCCCCCCCCCCCCEEEEEEEEEEC T ss_conf 021022147788888997588652-------2321246522245876551002688678999999986 No 27 >PF11810 DUF3332: Domain of unknown function (DUF3332) Probab=37.21 E-value=7.8 Score=14.51 Aligned_cols=20 Identities=25% Similarity=0.268 Sum_probs=13.2 Q ss_pred CHHHHHHHHHHHHHHHHCCC Q ss_conf 12789999999988751047 Q T0550 2 KNIYIYLSLLAVIVLGTACN 21 (339) Q Consensus 2 Kki~~~i~ll~ll~~~ssC~ 21 (339) |++...+++++++++|+||= T Consensus 2 k~~~~~~~~~~~~~~lsgC~ 21 (176) T PF11810_consen 2 KKILAAVAILLGSVSLSGCI 21 (176) T ss_pred CHHHHHHHHHHHHHHHCCCC T ss_conf 13699999999999852342 No 28 >PF10671 TcpQ: Toxin co-regulated pilus biosynthesis protein Q Probab=36.09 E-value=4.1 Score=16.18 Aligned_cols=19 Identities=21% Similarity=0.501 Sum_probs=10.8 Q ss_pred CCHHHHHHHHHHHHHHHHCCCC Q ss_conf 9127899999999887510477 Q T0550 1 MKNIYIYLSLLAVIVLGTACNN 22 (339) Q Consensus 1 MKki~~~i~ll~ll~~~ssC~d 22 (339) |||.++. +.+++++++|-- T Consensus 1 ~kkn~i~---~~~~i~lsGcs~ 19 (169) T PF10671_consen 1 MKKNLIA---ITLAIMLSGCSS 19 (169) T ss_pred CCCCEEH---HHHHHHHCCCCC T ss_conf 9740303---778988426333 No 29 >PF09403 FadA: Adhesion protein FadA; PDB: 3etz_A 3etx_C 2gl2_A 3ety_A 3etw_A. 
Probab=35.37 E-value=3.2 Score=16.85 Aligned_cols=20 Identities=30% Similarity=0.165 Sum_probs=10.8 Q ss_pred CCHHHHHHHHHHHHHHHHCC Q ss_conf 91278999999998875104 Q T0550 1 MKNIYIYLSLLAVIVLGTAC 20 (339) Q Consensus 1 MKki~~~i~ll~ll~~~ssC 20 (339) |||+.++.++++..++|++- T Consensus 1 MKKi~L~~ml~lss~sfAa~ 20 (126) T PF09403_consen 1 MKKILLCGMLLLSSLSFAAT 20 (126) T ss_dssp -------------------- T ss_pred CCHHHHHHHHHHHHHHHHHH T ss_conf 90589999999999997622 No 30 >PF05643 DUF799: Putative bacterial lipoprotein (DUF799); InterPro: IPR008517 This family consists of several bacterial proteins of unknown function. Some of the family members are described as putative lipoproteins. Probab=34.39 E-value=8 Score=14.43 Aligned_cols=18 Identities=33% Similarity=0.595 Sum_probs=9.7 Q ss_pred CCHHHHHHHHHHHHHHHHCCC Q ss_conf 912789999999988751047 Q T0550 1 MKNIYIYLSLLAVIVLGTACN 21 (339) Q Consensus 1 MKki~~~i~ll~ll~~~ssC~ 21 (339) ||++... +++++++++|. T Consensus 1 ~k~~~~~---l~~~l~LsgCa 18 (215) T PF05643_consen 1 MKPLLLG---LAALLLLSGCA 18 (215) T ss_pred CHHHHHH---HHHHHHHHHCC T ss_conf 9005999---99999996076 No 31 >PF11777 DUF3316: Protein of unknown function (DUF3316) Probab=33.54 E-value=9 Score=14.13 Aligned_cols=22 Identities=32% Similarity=0.289 Sum_probs=11.9 Q ss_pred CCHHHHHHHHHHHHHHHHCCCC Q ss_conf 9127899999999887510477 Q T0550 1 MKNIYIYLSLLAVIVLGTACNN 22 (339) Q Consensus 1 MKki~~~i~ll~ll~~~ssC~d 22 (339) |||+..+.+++++.+...+|.- T Consensus 1 MKkl~ll~~~l~~s~~a~A~~~ 22 (114) T PF11777_consen 1 MKKLILLASLLLLSSSAFAGNY 22 (114) T ss_pred CHHHHHHHHHHHHHHHHHHHHC T ss_conf 9049999999998667763113 No 32 >PF08139 LPAM_1: Prokaryotic membrane lipoprotein lipid attachment site; InterPro: IPR012640 This family consists of the homologues of the VirB proteins of type IV secretion systems (T4SS). Conjugal transfer across the cell envelope of Gram-negative bacteria is mediated by a supramolecular structure termed mating pair formation (Mpf) complex. 
Collectively, secretion pathways ancestrally related to bacterial conjugation systems are now known as T4SS. T4SS are involved in the delivery of effector molecules to eukaryotic target cells; each of these systems exports distinct DNA or protein substrates to effect a myriad of changes in host cell physiology during infection.
Probab=33.47  E-value=8.6  Score=14.25  Aligned_cols=17  Identities=29%  Similarity=0.528  Sum_probs=9.5

Q ss_pred             CHHHHHHHHHHHHHHHHCCC
Q ss_conf             12789999999988751047
Q T0550             2 KNIYIYLSLLAVIVLGTACN   21 (339)
Q Consensus         2 Kki~~~i~ll~ll~~~ssC~   21 (339)
                      |||.   .+++++++++||.
T Consensus         9 Kkil---~~~~a~~~LaGCs   25 (26)
T PF08139_consen    9 KKIL---FLLLALFMLAGCS   25 (26)
T ss_pred             HHHH---HHHHHHHHHHHCC
T ss_conf             9999---9999999983314

No 33
>PF06280 DUF1034: Fn3-like domain (DUF1034); InterPro: IPR010435 Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base. The geometric orientations of the catalytic residues are similar between families, despite different protein folds. The linear arrangements of the catalytic residues commonly reflect clan relationships.
For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC). Peptidases are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry. Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. This domain of unknown function is present in bacterial and plant peptidases belonging to MEROPS peptidase family S8 (subfamily S8A subtilisin, clan SB). It is C-terminal to and adjacent to the S8 peptidase domain and can be found in conjunction with the PA (Protease associated) domain (IPR003137 from INTERPRO) and additionally in Gram-positive bacteria with the surface protein anchor domain (IPR001899 from INTERPRO).; GO: 0004252 serine-type endopeptidase activity, 0005618 cell wall, 0016020 membrane; PDB: 1xf1_A 3eif_A.
Probab=26.88  E-value=12  Score=13.43  Aligned_cols=45  Identities=18%  Similarity=0.318  Sum_probs=28.7

Q ss_pred             CCCCC-CCEEEECCCEEEEEEEEEEE-CCCCCCCEEEEE--EEEEECCCC
Q ss_conf             10057-52587169657889999851-567876613768--999726887
Q T0550           112 KYEFP-ETVHIPAGSCVELLNIDFNL-QDIDMLEKWVLP--LTIVDDGSY  157 (339)
Q Consensus       112 ~Ysl~-~tv~I~AGe~~s~i~I~f~~-~~Ld~~~~YvLP--ltI~~~s~~  157 (339)
                      .-+++-+++++|||+... +.++|++ .+++....+++-  |++.++.+.
T Consensus        56 ~~~~~~~~vTV~ag~s~~-v~vt~~~p~~~~~~~~~~~eG~V~~~~~~~~  104 (112)
T PF06280_consen   56 SVTFSPNTVTVPAGGSKT-VTVTFTPPSGFDASNNPFYEGFVRFTSSDGE  104 (112)
T ss_dssp             EEE---EEEEE-TTEEEE-EEEEEE--GGGHH------E-EEEEESSTTS
T ss_pred             EEEECCCEEEECCCCEEE-EEEEEEECCCCCCCCCCEEEEEEEEECCCCC
T ss_conf             666379849999999899-9999976314664458899999999808998

No 34
>PF11153 DUF2931: Protein of unknown function (DUF2931)
Probab=25.75  E-value=12  Score=13.31  Aligned_cols=35  Identities=23%  Similarity=0.318  Sum_probs=16.7

Q ss_pred             CCHHHHHHHHHHHHHHHHCCCCCCCC---CCCCCEEEEECCC
Q ss_conf             91278999999998875104777766---5354036652013
Q T0550             1 MKNIYIYLSLLAVIVLGTACNNEWED---EQYEQYVSFKAPI   39 (339)
Q Consensus         1 MKki~~~i~ll~ll~~~ssC~dd~e~---e~y~~~vy~~~~~   39 (339)
                      ||++..+    +++++++||...-..   +.++=.+.+.+|.
T Consensus         1 m~~i~~l----ll~lll~~Cs~~~~~~~~~~~~W~~~~~~P~   38 (216)
T PF11153_consen    1 MKKILLL----LLLLLLAGCSTSPTEPSQPYDEWRFGVGAPK   38 (216)
T ss_pred             CCCHHHH----HHHHHHHHCCCCCCCCCCCCCCEEEEECCCC
T ss_conf             9007999----9999997536886445688883489861787

No 35
>PF07172 GRP: Glycine rich protein family; InterPro: IPR010800 This family of proteins includes several glycine rich proteins as well as two nodulins 16 and 24. The family also contains proteins that are induced in response to various stresses.
Probab=25.38  E-value=13  Score=13.26  Aligned_cols=18  Identities=28%  Similarity=0.410  Sum_probs=6.7

Q ss_pred             HHHHHHHHHHHHHHHCCC
Q ss_conf             789999999988751047
Q T0550             4 IYIYLSLLAVIVLGTACN   21 (339)
Q Consensus         4 i~~~i~ll~ll~~~ssC~   21 (339)
                      .++++.|+++++++.+++
T Consensus         5 ~~llL~lllA~vlliss~   22 (95)
T PF07172_consen    5 AFLLLGLLLAAVLLISSE   22 (95)
T ss_pred             HHHHHHHHHHHHHHHHHH
T ss_conf             999999999999999874

No 36
>PF06135 DUF965: Bacterial protein of unknown function (DUF965); InterPro: IPR009309 This family consists of several hypothetical bacterial proteins. The function of the family is unknown.
Probab=25.32  E-value=6.3  Score=15.08  Aligned_cols=31  Identities=35%  Similarity=0.530  Sum_probs=22.4

Q ss_pred             EEEEEECCCCCC-EEEEECCCCCEEEEECCCCC
Q ss_conf             499831688762-14788687427883034323
Q T0550           249 TLDMKQDDPSNE-MEFELIGTPTYSSTSVMDAT  280 (339)
Q Consensus       249 ~~t~~~~~~~~~-~~f~~~~~~~y~~~~~~~~~  280 (339)
                      .+..+..+|.|+ +++-++|.|+| |++-++|-
T Consensus        27 AL~EKGYNPiNQiVGYllSGDPaY-Itsh~~AR   58 (79)
T PF06135_consen   27 ALEEKGYNPINQIVGYLLSGDPAY-ITSHNNAR   58 (79)
T ss_pred             HHHHCCCCHHHHHHHHEECCCCCC-CCCCCHHH
T ss_conf             999858881866773202489761-15631099

No 37
>PF05628 Borrelia_P13: Borrelia membrane protein P13; InterPro: IPR008420 This family consists of P13 proteins from Borrelia species. P13 is a 13 kDa integral membrane protein which is post-translationally processed at both ends and modified by an unknown mechanism.
Probab=25.18  E-value=11  Score=13.64  Aligned_cols=22  Identities=23%  Similarity=0.359  Sum_probs=15.1

Q ss_pred             CCHHHHHHHHHHHHHHHHCCCC
Q ss_conf             9127899999999887510477
Q T0550             1 MKNIYIYLSLLAVIVLGTACNN   22 (339)
Q Consensus         1 MKki~~~i~ll~ll~~~ssC~d   22 (339)
                      |||+.++++++.+++-.-|-+|
T Consensus         1 MkKi~~lilif~~t~qiFA~~d   22 (167)
T PF05628_consen    1 MKKIFILILIFFLTIQIFAQKD   22 (167)
T ss_pred             CCEEEHHHHHHHHHHHHHHCCC
T ss_conf             9624214788877876341246

No 38
>PF09160 FimH_man-bind: FimH, mannose binding; InterPro: IPR015243 This domain adopts a secondary structure consisting of a beta sandwich, with nine strands arranged in two sheets in a Greek key topology. It is predominantly found in bacterial mannose-specific adhesins, and is capable of binding to D-mannose.; PDB: 2vco_B 1klf_L 1kiu_P 1uwf_A 1qun_L 1tr7_A.
Probab=22.82  E-value=14  Score=12.97  Aligned_cols=34  Identities=24%  Similarity=0.307  Sum_probs=24.5

Q ss_pred             CCEEEECCCCCCCCC--C--CEEEECCCEEEEEEEE-EE
Q ss_conf             735688661010057--5--2587169657889999-85
Q T0550           102 ELWYTEMEEDKYEFP--E--TVHIPAGSCVELLNID-FN  135 (339)
Q Consensus       102 ~t~y~~LP~~~Ysl~--~--tv~I~AGe~~s~i~I~-f~  135 (339)
                      +..|+++|..-|=-|  .  -|.|+||+..+.+..+ ++
T Consensus        95 ~~~~~p~p~klYLtp~~~agGv~I~~G~~iAtl~m~k~~  133 (150)
T PF09160_consen   95 DGNYKPWPAKLYLTPISAAGGVVIKKGELIATLNMHKIA  133 (150)
T ss_dssp             SSS-EE--EEEEEEESTT-----B----EEEEEEEEEEE
T ss_pred             CCCCCCCCEEEEEEECCCCCCEEEECCCEEEEEEEEEEC
T ss_conf             798466546899975487885898379889999999732

No 39
>PF06030 DUF916: Bacterial protein of unknown function (DUF916); InterPro: IPR010317 This family consists of putative cell surface proteins, from Firmicutes, of unknown function.
Probab=22.74  E-value=14  Score=12.96  Aligned_cols=68  Identities=13%  Similarity=0.216  Sum_probs=31.9

Q ss_pred             EEEEEEEEECCCCCCCEEEEEEECHHHHHHHHHHHCCCCCCCCEE---EECCC---CCCCCCCC-EEEECCCEEE-EEEE
Q ss_conf             999999720766776479999877699988877643653257356---88661---01005752-5871696578-8999
Q T0550            61 YQLPIIVSGSTVNSQDRDIHIAVDKDTLKTLNIERFSLYRPELWY---TEMEE---DKYEFPET-VHIPAGSCVE-LLNI  132 (339)
Q Consensus        61 ~~l~v~vsgs~~~~~ditVti~vD~slL~~YN~~~~~~~~~~t~y---~~LP~---~~Ysl~~t-v~I~AGe~~s-~i~I  132 (339)
                      ..|.+.+.  ...++.++|.+.+..+.-.    .+-..+.....-   .-|+-   +.-+++.. |++|++++.- .+.|
T Consensus        29 qtl~v~v~--N~t~~~itv~v~~~~A~Tn----~nG~idY~~~~~~~d~sl~~~~~~~v~~~~~~Vtl~~~~sk~V~~~l  102 (122)
T PF06030_consen   29 QTLQVRVT--NNTDKPITVKVSANNATTN----DNGVIDYSPSTKKKDSSLKYPFSDLVKIPKEEVTLPANSSKTVTFTL  102 (122)
T ss_pred             EEEEEEEE--CCCCCCEEEEEEEEEEEEC----CCEEEEECCCCCCCCCCCCCCHHHHCCCCCCEEEECCCCEEEEEEEE
T ss_conf             99999999--2899968999997165756----88789966788774643484679961268876998999879999999

Q ss_pred             EE
Q ss_conf             98
Q T0550           133 DF  134 (339)
Q Consensus       133 ~f  134 (339)
                      ++
T Consensus       103 k~  104 (122)
T PF06030_consen  103 KM  104 (122)
T ss_pred             EC
T ss_conf             86

No 40
>PF09710 Trep_dent_lipo: Treponema clustered lipoprotein (Trep_dent_lipo)
Probab=20.63  E-value=9.1  Score=14.11  Aligned_cols=12  Identities=25%  Similarity=0.434  Sum_probs=6.5

Q ss_pred             HHHHHHHCCCCC
Q ss_conf             998875104777
Q T0550            12 AVIVLGTACNNE   23 (339)
Q Consensus        12 ~ll~~~ssC~dd   23 (339)
                      +|++++.||.+|
T Consensus        10 iLA~lLFSCSKE   21 (394)
T PF09710_consen   10 ILAALLFSCSKE   21 (394)
T ss_pred             HHHHHHHHHHHH
T ss_conf             999999641365

Done!