Query         T0612 YP_001093860.1, Shewanella loihica PV-4, 129 residues
Match_columns 129
No_of_seqs    106 out of 143
Neff          6.7
Searched_HMMs 11830
Date          Mon Jul 5 08:59:26 2010
Command       /home/syshi_2/2008/ferredoxin/manualcheck/update/HHsearch/bin/hhsearch -i /home/syshi_3/CASP9/HHsearch4Targetseq/seq/T0612.hhm -d /home/syshi_2/2008/ferredoxin/manualcheck/update/HHsearch/database/pfamA_24_hhmdb -o /home/syshi_3/CASP9/HHsearch4Targetseq/pfamAsearch/T0612.hhr

 No Hit                            Prob E-value P-value  Score    SS Cols Query HMM Template HMM
  1 PF07233 DUF1425: Protein of u 100.0   2E-32 1.7E-36  204.3  13.0   94  32-125     1-94 (94)
  2 PF00345 Pili_assembly_N: Gram  95.6   0.037 3.2E-06   29.5   8.6   84  42-126     1-91 (122)
  3 PF06291 Lambda_Bor: Bor prote  94.9  0.0041 3.4E-07   35.1   1.7   21    1-21     1-21 (97)
  4 PF08139 LPAM_1: Prokaryotic m  94.7   0.006 5.1E-07   34.1   2.2   19    1-19     8-26 (26)
  5 PF07273 DUF1439: Protein of u  93.8   0.041 3.5E-06   29.3   4.9   59    1-60     1-75 (177)
  6 PF11906 DUF3426: Protein of u  93.6    0.17 1.5E-05   25.7   9.3   71  53-123   66-148 (149)
  7 PF05643 DUF799: Putative bact  93.4   0.018 1.5E-06   31.4   2.5   19    1-19     1-19 (215)
  8 PF06474 MLTD_N: MLTD_N; Inte   92.9   0.032 2.7E-06   29.9   3.2   21    1-22     1-21 (95)
  9 PF11153 DUF2931: Protein of u  92.8   0.052 4.4E-06   28.7   4.2   21    1-22     1-21 (216)
 10 PF12262 Lipase_bact_N: Bacter  91.8   0.042 3.6E-06   29.2   2.7   21    1-21     1-23 (268)
 11 PF10671 TcpQ: Toxin co-regula  90.0   0.033 2.8E-06   29.8   0.8   19    1-19     1-19 (169)
 12 PF07901 DUF1672: Protein of u  88.6   0.099 8.3E-06   27.1   2.4   19    1-19     1-20 (304)
 13 PF03627 PapG_N: PapG carbohyd  87.4   0.045 3.8E-06   29.1   0.0   18    1-19     1-18 (226)
 14 PF05590 DUF769: Xylella fasti  86.9    0.13 1.1E-05   26.3   2.2   17    5-21     9-25 (284)
 15 PF11839 DUF3359: Protein of u  85.7    0.17 1.5E-05   25.7   2.3   21    1-21     1-23 (96)
 16 PF09619 YscW: Type III secret  83.6     1.1 9.1E-05   21.1   6.9   35  68-106   80-114 (124)
 17 PF06085 Rz1: Lipoprotein Rz1   82.0    0.32 2.7E-05   24.1   2.3   23    1-23     1-26 (59)
 18 PF11611 TRF2: Telomeric repea  80.3    0.52 4.4E-05   22.9   3.0   75  52-126   33-115 (123)
 19 PF06788 UPF0257: Uncharacteri  79.9    0.47   4E-05   23.2   2.6   19    1-19     1-19 (236)
 20 PF09476 Pilus_CpaD: Pilus bio  78.5    0.74 6.3E-05   22.0   3.3   16    7-22     2-17 (203)
 21 PF11322 DUF3124: Protein of u  77.9     1.7 0.00014   20.0   5.9   52  56-109    24-75 (125)
 22 PF11873 DUF3393: Domain of un  77.9     0.3 2.5E-05   24.3   1.1   17    2-18     1-17 (204)
 23 PF00942 CBM_3: Cellulose bind  77.7     1.7 0.00015   19.9   5.7   71  52-123    10-85 (86)
 24 PF10566 Glyco_hydro_97: Glyco  76.6     1.3 0.00011   20.5   4.2   16    1-16     1-16 (643)
 25 PF09533 DUF2380: Predicted li  75.5    0.69 5.8E-05   22.2   2.4   22    1-22     1-24 (217)
 26 PF00820 Lipoprotein_1: Borrel  74.2     1.4 0.00012   20.4   3.7   22    1-22     1-22 (273)
 27 PF08085 Entericidin: Enterici  74.0    0.62 5.3E-05   22.5   1.9   18    1-18     1-22 (42)
 28 PF06572 DUF1131: Protein of u  71.1    0.34 2.9E-05   24.0   0.0   19    1-19     1-20 (192)
 29 PF12099 DUF3575: Protein of u  70.5     2.1 0.00018   19.4   3.9   18    1-18     1-18 (189)
 30 PF04076 BOF: Bacterial OB fol  70.0    0.62 5.2E-05   22.5   1.1   49  76-125   74-123 (126)
 31 PF11355 DUF3157: Protein of u  67.7       3 0.00025   18.5   4.9   86  41-126  102-201 (203)
 32 PF11659 DUF3261: Protein of u  66.8       3 0.00026   18.5   4.1   45   10-54     2-50 (154)
 33 PF11353 DUF3153: Protein of u  64.6     1.1 9.6E-05   21.0   1.6   30    5-34     2-31 (209)
 34 PF11254 DUF3053: Protein of u  62.6     1.5 0.00013   20.2   1.9   17    5-21     7-23 (229)
 35 PF07919 DUF1683: Protein of u  62.1     3.9 0.00033   17.9   8.1   74  31-110   31-104 (125)
 36 PF03304 Mlp: Mlp lipoprotein   61.4     1.5 0.00013   20.2   1.8   20    1-20     1-21 (181)
 37 PF02402 Lysis_col: Lysis prot  58.9    0.68 5.7E-05   22.3  -0.4   20    1-20     1-22 (46)
 38 PF12079 DUF3558: Protein of u  58.0     2.8 0.00023   18.7   2.6   41  72-112   77-122 (168)
 39 PF05079 DUF680: Protein of un  57.8     1.9 0.00016   19.7   1.7   18    1-18     1-18 (75)
 40 PF05211 NLBH: Neuraminyllacto  55.0       1 8.7E-05   21.2   0.0   21    2-22     1-21 (258)
 41 PF03843 Slp: Outer membrane l  54.3     5.3 0.00045   17.1   5.1   14   14-27     1-14 (160)
 42 PF06848 Disaggr_repeat: Disag  52.1     3.5  0.0003   18.1   2.4   24  78-103  101-124 (182)
 43 PF01289 Thiol_cytolysin: Thio  50.8       6 0.00051   16.8   7.8   50  72-126  371-427 (467)
 44 PF07996 T4SS: Type IV secreti  46.8     1.7 0.00014   20.0   0.0   27    1-27     1-28 (217)
 45 PF07383 DUF1496: Protein of u  46.7       7 0.00059   16.4   4.1   20    1-20     1-20 (88)
 46 PF05481 Myco_19_kDa: Mycobact  43.4     1.2  0.0001   20.8  -1.1  120   1-124    1-158 (160)
 47 PF11769 DUF3313: Protein of u  42.5     1.9 0.00016   19.6  -0.2   11   12-22     1-11 (201)
 48 PF01297 SBP_bac_9: Periplasmi  40.4     2.6 0.00022   18.8   0.2   18    5-22     1-18 (303)
 49 PF12276 DUF3617: Protein of u  38.5     6.6 0.00056   16.6   2.0   24   61-84   93-116 (162)
 50 PF07119 DUF1375: Protein of u  38.2     3.8 0.00033   17.9   0.8   13    8-20     2-14 (76)
 51 PF10368 YkyA: Putative cell-w  37.7     2.8 0.00024   18.7   0.0   12    7-18     1-12 (204)
 52 PF10828 DUF2570: Protein of u  37.1     5.1 0.00043   17.2   1.3   18    1-18     1-18 (110)
 53 PF06316 Ail_Lom: Enterobacter  36.1       9 0.00076   15.8   2.4   18   69-86    49-66 (199)
 54 PF01298 Lipoprotein_5: Transf  35.7     5.9  0.0005   16.8   1.4   14    5-18     9-22 (593)
 55 PF06649 DUF1161: Protein of u  34.1      11 0.00089   15.4   2.5   13    1-13     1-13 (75)
 56 PF09580 Spore_YhcN_YlaJ: Spor  32.2     2.6 0.00022   18.8  -0.9   12   10-21     2-13 (174)
 57 PF10023 DUF2265: Predicted am  30.6       6 0.00051   16.8   0.8   14    6-19     1-14 (337)
 58 PF03748 FliL: Flagellar basal  30.6     7.4 0.00063   16.3   1.2   18    1-18     1-18 (149)
 59 PF05540 Serpulina_VSP: Serpul  30.4     6.9 0.00059   16.4   1.0   18    1-18     1-18 (377)
 60 PF07148 MalM: Maltose operon   29.0      10 0.00088   15.4   1.8   32   58-89   81-113 (281)
 61 PF07705 CARDB: CARDB; InterP   28.5      14  0.0012   14.7   9.5   71  48-125    12-83 (101)
 62 PF09676 TraV: Type IV conjuga  27.7     5.6 0.00048   17.0   0.2   11    8-18     1-11 (135)
 63 PF09926 DUF2158: Uncharacteri  27.6      15  0.0012   14.6   3.8   20   68-87    25-44 (53)
 64 PF11810 DUF3332: Domain of un  27.1      12   0.001   15.1   1.8   17    3-19     6-23 (176)
 65 PF06486 DUF1093: Protein of u  24.3      13  0.0011   14.9   1.5   17   71-87    26-42 (78)
 66 PF07424 TrbM: TrbM; InterPro   23.1      18  0.0015   14.1   2.2   14    1-14     1-14 (186)
 67 PF05753 TRAP_beta: Translocon  23.1      18  0.0015   14.1   9.7   55  53-110    36-94 (181)
 68 PF06280 DUF1034: Fn3-like dom  22.9      18  0.0015   14.1   6.9   62  54-115     7-82 (112)
 69 PF06387 Calcyon: D1 dopamine   22.8     8.6 0.00072   15.9   0.4   16   72-87  101-119 (186)
 70 PF05272 VirE: Virulence-assoc  22.6      16  0.0014   14.3   1.7   29    1-29    33-61 (198)
 71 PF07219 HemY_N: HemY protein   22.5      14  0.0011   14.7   1.4   18    1-18     1-18 (134)
 72 PF05968 Bacillus_PapR: Bacill  22.3      17  0.0014   14.2   1.8   18    1-18     1-19 (48)
 73 PF06551 DUF1120: Protein of u  21.2      18  0.0015   14.1   1.7   18    1-18     1-20 (145)
 74 PF04744 Monooxygenase_B: Mone  21.1      19  0.0016   13.8   6.1   65  42-112  250-334 (381)
 75 PF01847 VHL: von Hippel-Linda  20.8      20  0.0017   13.8   4.7   34   50-87     6-39 (156)
 76 PF11777 DUF3316: Protein of u  20.6      19  0.0016   13.9   1.8   12    1-12     1-12 (114)
 77 PF05404 TRAP-delta: Transloco  20.4      20  0.0017   13.7   5.9   31   55-87   78-108 (167)
 78 PF01552 Pico_P2B: Picornaviru  20.2      20  0.0017   13.7   1.9   17    5-21    62-78 (99)
 79 PF06518 DUF1104: Protein of u  20.1      19  0.0016   13.8   1.7   16    1-16     1-16 (142)

No 1
>PF07233 DUF1425: Protein of unknown function (DUF1425); InterPro: IPR010824 This family consists of several hypothetical bacterial proteins of around 125 residues in length. Several members of this family are described as putative lipoproteins and are often known as YcfL. The function of this family is unknown.
Probab=99.98 E-value=2e-32 Score=204.27 Aligned_cols=94 Identities=38% Similarity=0.639 Sum_probs=92.0 Q ss_pred EEEECCCCCCEEEEEEEEEEECCCEEEEEEEEEECCCCCEEEEEEEEEECCCCCEECCCCCCCEEEEECCCCEEEEEEEC Q ss_conf 68607521250477421022048748999999846687628999999976889683888876038997799508999863 Q T0612 32 VRVDNGSFHSDVDVSAVTTQAEAGFLRARGTIISKSPKDQRLQYKFTWYDINGATVEDEGVSWKSLKLHGKQQMQVTALS 111 (129) Q Consensus 32 vv~~~s~l~~~i~v~~~~~~~~~g~~~~~v~l~N~~~~~~~l~Yrf~WyD~~Gl~v~~~~~~W~~l~l~~~~~~~i~~va 111 (129) |||++++|+++|.+++|+++..||+++++++++|++++|++|+|||||||+|||++++..++|++++|+|+++.+|+++| T Consensus 1 vv~~~s~l~~~i~v~~~~~~~~~g~~~~~~~l~N~~~~~~~l~Yrf~WyD~~G~~v~~~~~~w~~~~l~~~~~~~i~~va 80 (94) T PF07233_consen 1 VVMDNSVLAAGISVSQPRIRTSNGLLEAQVTLSNKSSKPLTLQYRFYWYDAQGFEVDPEQEPWQSLILPGGQTVTIQAVA 80 (94) T ss_pred CEECCHHHCCCEEEECCEEEEECCEEEEEEEEEECCCCCEEEEEEEEEECCCCCCCCCCCCCCEEEEECCCCEEEEEEEC T ss_conf 99848155186799712895139809999999979899789999999988999992898889899998799779999886 Q ss_pred CCCCEEEEEEEEEE Q ss_conf 79840589999997 Q T0612 112 PNATAVRCELYVRE 125 (129) Q Consensus 112 p~~~a~~~RlylRe 125 (129) |||+|++||||||| T Consensus 81 p~~~A~~~Rlylre 94 (94) T PF07233_consen 81 PNPEAKDFRLYLRE 94 (94) T ss_pred CCCCCEEEEEEEEC T ss_conf 99983899999979 No 2 >PF00345 Pili_assembly_N: Gram-negative pili assembly chaperone, N-terminal domain; InterPro: IPR016147 Most Gram-negative bacteria possess a supramolecular structure - the pili - on their surface, which mediates attachment to specific receptors. Many interactive subunits are required to assemble pili, but their assembly only takes place after translocation across the cytoplasmic membrane. Periplasmic chaperones assist pili assembly by binding to the subunits, thereby preventing premature aggregation , . Pili chaperones are structurally, and possibly evolutionarily, related to the immunoglobulin superfamily , : they contain two globular domains, with a topology identical to an immunoglobulin fold. 
This entry represents the N-terminal domain of pili assembly chaperone, and has a beta-sandwich fold consisting of seven strands in two sheets with a Greek key topology.; GO: 0005515 protein binding, 0007047 cell wall organization and biogenesis, 0030288 outer membrane-bounded periplasmic space; PDB: 1l4i_A 1kiu_I 1klf_G 1qun_K 3bwu_C 1bf8_A 1ze3_C 3f65_H 3f6l_A 3f6i_A .... Probab=95.64 E-value=0.037 Score=29.53 Aligned_cols=84 Identities=14% Similarity=0.052 Sum_probs=57.0 Q ss_pred EEEEEEEEEEECCCEEEEEEEEEECCCCCEEEEEEEEEECC-CCCEEC-C-CCCCCEEEEECCCCEEEEEEEC----CCC Q ss_conf 04774210220487489999998466876289999999768-896838-8-8876038997799508999863----798 Q T0612 42 DVDVSAVTTQAEAGFLRARGTIISKSPKDQRLQYKFTWYDI-NGATVE-D-EGVSWKSLKLHGKQQMQVTALS----PNA 114 (129) Q Consensus 42 ~i~v~~~~~~~~~g~~~~~v~l~N~~~~~~~l~Yrf~WyD~-~Gl~v~-~-~~~~W~~l~l~~~~~~~i~~va----p~~ 114 (129) +|.+...+.....+...++++|.|+++.+..++-+++.-|+ ++.+-. + .-.| -.+.|.++++.++.-.. |.. T Consensus 1 gi~i~~trii~~~~~~~~~~~v~N~~~~~~~vq~~v~~~~~~~~~~~~~~fiv~P-p~~~L~p~~~q~vRi~~~~~lp~d 79 (122) T PF00345_consen 1 GITISPTRIIYDEDQRSASVTVTNNSDEPYLVQVWVDDGDEEDEDEPTDPFIVTP-PLFRLEPGESQTVRIYRGNPLPQD 79 (122) T ss_dssp EEEESCSEEEEETT-SEEEEEEEESSSS-EEEEEEEEETTSTTCECSS-SEEEES-SEEEE-TTEEEEEEEEEGGGS-SS T ss_pred CEEECCEEEEEECCCCEEEEEEEECCCCCEEEEEEEEECCCCCCCCCCCCEEEEC-CCEEECCCCCEEEEEEECCCCCCC T ss_conf 9076567999938997789999949899499999997325676766645389829-607858998189999818999988 Q ss_pred CEEEEEEEEEEE Q ss_conf 405899999972 Q T0612 115 TAVRCELYVREA 126 (129) Q Consensus 115 ~a~~~RlylRe~ 126 (129) +-.-|||.+++. T Consensus 80 ~E~~y~l~~~~I 91 (122) T PF00345_consen 80 RESLYRLNFREI 91 (122) T ss_dssp S-EEEEEEEEEE T ss_pred CCEEEEEEEEEC T ss_conf 128999999963 No 3 >PF06291 Lambda_Bor: Bor protein; InterPro: IPR010438 This family consists of several Bacteriophage lambda Bor and Escherichia coli Iss proteins. Expression of bor significantly increases the survival of the E. coli host cell in animal serum. 
This property is a well known bacterial virulence determinant indeed, bor and its adjacent sequences are highly homologous to the iss serum resistance locus of the plasmid ColV2-K94, which confers virulence in animals. It has been suggested that lysogeny may generally have a role in bacterial survival in animal hosts, and perhaps in pathogenesis . Probab=94.87 E-value=0.0041 Score=35.08 Aligned_cols=21 Identities=48% Similarity=0.589 Sum_probs=17.9 Q ss_pred CHHHHHHHHHHHHHHCCCCCC Q ss_conf 924889999988871457886 Q T0612 1 MNKGLVLACLLLGLSACAPHT 21 (129) Q Consensus 1 Mkk~l~~~~~~l~L~GCas~~ 21 (129) |||.++.+.++++|+||++.+ T Consensus 1 mkk~ll~~~l~llltgCa~qt 21 (97) T PF06291_consen 1 MKKILLAAALALLLTGCAQQT 21 (97) T ss_pred CHHHHHHHHHHHHHCCCCEEE T ss_conf 900499999999964566389 No 4 >PF08139 LPAM_1: Prokaryotic membrane lipoprotein lipid attachment site; InterPro: IPR012640 This family consists of the homologues of the VirB proteins of type IV secretion systems (T4SS). Conjugal transfer across the cell envelope of Gram-negative bacteria is mediated by a supramolecular structure termed mating pair formation (Mpf) complex. Collectively, secretion pathways ancestrally related to bacterial conjugation systems are now known as T4SS. T4SS are involved in the delivery of effector molecules to eukaryotic target cells; each of these systems exports distinct DNA or protein substrates to effect a myriad of changes in host cell physiology during infection . 
Probab=94.70 E-value=0.006 Score=34.09 Aligned_cols=19 Identities=37% Similarity=0.511 Sum_probs=17.1 Q ss_pred CHHHHHHHHHHHHHHCCCC Q ss_conf 9248899999888714578 Q T0612 1 MNKGLVLACLLLGLSACAP 19 (129) Q Consensus 1 Mkk~l~~~~~~l~L~GCas 19 (129) |||.+.+++++++|+||++ T Consensus 8 ~Kkil~~~~a~~~LaGCss 26 (26) T PF08139_consen 8 MKKILFLLLALFMLAGCSS 26 (26) T ss_pred HHHHHHHHHHHHHHHHCCC T ss_conf 9999999999999833149 No 5 >PF07273 DUF1439: Protein of unknown function (DUF1439); InterPro: IPR010835 This family consists of several hypothetical bacterial proteins of around 190 residues in length. Several members of this family are annotated as being putative lipoproteins and are often known as YceB. The function of this family is unknown.; PDB: 3eyr_B. Probab=93.79 E-value=0.041 Score=29.29 Aligned_cols=59 Identities=27% Similarity=0.295 Sum_probs=33.3 Q ss_pred CHHHH-HHHHHHHHHHCCCCCCCCCCCCCCC-------CEEEE-----CCCCCCEEEEEEEEEEEC---CCEEEEE Q ss_conf 92488-9999988871457886673026876-------16860-----752125047742102204---8748999 Q T0612 1 MNKGL-VLACLLLGLSACAPHTGGIMISSTG-------EVRVD-----NGSFHSDVDVSAVTTQAE---AGFLRAR 60 (129) Q Consensus 1 Mkk~l-~~~~~~l~L~GCas~~~~~~~~~~~-------~vv~~-----~s~l~~~i~v~~~~~~~~---~g~~~~~ 60 (129) ||+.+ ++++++++|+||++- +.+.+.+++ +..++ +.++...+.+.++.+.-. 
.+....+ T Consensus 1 Mk~~~~~~l~~~~~L~gC~~~-~~ysise~eiq~~L~k~~~~~k~~g~~gl~~~~v~l~n~~v~lg~~~~nrv~l~ 75 (177) T PF07273_consen 1 MKKLLLLALILALLLTGCASL-SQYSISEQEIQQYLAKKFPFQKKIGIPGLFDADVSLSNPQVQLGREDPNRVALS 75 (177) T ss_dssp ------------------CHC-CEEEE-HHHHHHHHHHC---EEEE------EEEEEEEEEEEE----STT-EEEE T ss_pred CCHHHHHHHHHHHHHHCCCCC-CCEEECHHHHHHHHHHHCCHHHHCCCCCEEEEEEEECCCEEECCCCCCCEEEEE T ss_conf 926999999999998612556-616687999999998657852310457638899997584566478899889999 No 6 >PF11906 DUF3426: Protein of unknown function (DUF3426) Probab=93.60 E-value=0.17 Score=25.69 Aligned_cols=71 Identities=17% Similarity=0.166 Sum_probs=53.7 Q ss_pred CCCEEEEEEEEEECCCCCEE-EEEEEEEECCCCCEECCCCC-C--------CEEEEECCCCEEEEEEE--CCCCCEEEEE Q ss_conf 48748999999846687628-99999997688968388887-6--------03899779950899986--3798405899 Q T0612 53 EAGFLRARGTIISKSPKDQR-LQYKFTWYDINGATVEDEGV-S--------WKSLKLHGKQQMQVTAL--SPNATAVRCE 120 (129) Q Consensus 53 ~~g~~~~~v~l~N~~~~~~~-l~Yrf~WyD~~Gl~v~~~~~-~--------W~~l~l~~~~~~~i~~v--ap~~~a~~~R 120 (129) .++..+.+..+.|..++++. .+-++.-||.+|-.+....- | =....+++++++.+... .|.+++++|| T Consensus 66 ~~~~~~i~g~l~N~~~~~~~~P~l~l~L~D~~g~~v~~r~~~P~eyl~~~~~~~~~l~pg~~~~f~~~~~~~~~~a~~y~ 145 (149) T PF11906_consen 66 GGDVLVISGTLRNRADFPQAWPALELTLTDAQGQPVARRVFTPAEYLPPALANQAGLPPGQSVPFRVVFEDPPPNAAGYR 145 (149) T ss_pred CCCEEEEEEEEEECCCCCCCCCEEEEEEECCCCCEEEEEEECHHHHCCCCCCCCCCCCCCCEEEEEEEECCCCCCCCEEE T ss_conf 89679999999938987534745999999899999999997757844643332344499986899999407998631589 Q ss_pred EEE Q ss_conf 999 Q T0612 121 LYV 123 (129) Q Consensus 121 lyl 123 (129) +++ T Consensus 146 v~~ 148 (149) T PF11906_consen 146 VEF 148 (149) T ss_pred EEE T ss_conf 997 No 7 >PF05643 DUF799: Putative bacterial lipoprotein (DUF799); InterPro: IPR008517 This family consists of several bacterial proteins of unknown function. Some of the family members are described as putative lipoproteins. 
Probab=93.38 E-value=0.018 Score=31.41 Aligned_cols=19 Identities=42% Similarity=0.387 Sum_probs=17.7 Q ss_pred CHHHHHHHHHHHHHHCCCC Q ss_conf 9248899999888714578 Q T0612 1 MNKGLVLACLLLGLSACAP 19 (129) Q Consensus 1 Mkk~l~~~~~~l~L~GCas 19 (129) ||++++.++++++|+||+. T Consensus 1 ~k~~~~~l~~~l~LsgCa~ 19 (215) T PF05643_consen 1 MKPLLLGLAALLLLSGCAV 19 (215) T ss_pred CHHHHHHHHHHHHHHHCCC T ss_conf 9005999999999960768 No 8 >PF06474 MLTD_N: MLTD_N; InterPro: IPR010511 This entry comprises the N-terminal domain of membrane-bound lytic murein transglycosylase D. Probab=92.88 E-value=0.032 Score=29.93 Aligned_cols=21 Identities=33% Similarity=0.295 Sum_probs=13.2 Q ss_pred CHHHHHHHHHHHHHHCCCCCCC Q ss_conf 9248899999888714578866 Q T0612 1 MNKGLVLACLLLGLSACAPHTG 22 (129) Q Consensus 1 Mkk~l~~~~~~l~L~GCas~~~ 22 (129) ||-.++++++ ++|+||++.++ T Consensus 1 m~~~~~l~~~-llLaGCqs~~~ 21 (95) T PF06474_consen 1 MRFLAVLALA-LLLAGCQSTPQ 21 (95) T ss_pred CHHHHHHHHH-HHHHHCCCCCC T ss_conf 9299999999-99984679999 No 9 >PF11153 DUF2931: Protein of unknown function (DUF2931) Probab=92.85 E-value=0.052 Score=28.72 Aligned_cols=21 Identities=43% Similarity=0.589 Sum_probs=14.9 Q ss_pred CHHHHHHHHHHHHHHCCCCCCC Q ss_conf 9248899999888714578866 Q T0612 1 MNKGLVLACLLLGLSACAPHTG 22 (129) Q Consensus 1 Mkk~l~~~~~~l~L~GCas~~~ 22 (129) |||.++++ +.++|+||++.+. 
T Consensus 1 m~~i~~ll-l~lll~~Cs~~~~ 21 (216) T PF11153_consen 1 MKKILLLL-LLLLLAGCSTSPT 21 (216) T ss_pred CCCHHHHH-HHHHHHHCCCCCC T ss_conf 90079999-9999975368864 No 10 >PF12262 Lipase_bact_N: Bacterial virulence factor lipase N-terminal Probab=91.77 E-value=0.042 Score=29.21 Aligned_cols=21 Identities=33% Similarity=0.583 Sum_probs=16.0 Q ss_pred CHHHHH--HHHHHHHHHCCCCCC Q ss_conf 924889--999988871457886 Q T0612 1 MNKGLV--LACLLLGLSACAPHT 21 (129) Q Consensus 1 Mkk~l~--~~~~~l~L~GCas~~ 21 (129) |||.++ +++.+++|+||+..+ T Consensus 1 Mkk~~l~~~iasal~LaGCg~ds 23 (268) T PF12262_consen 1 MKKKLLSLAIASALGLAGCGGDS 23 (268) T ss_pred CCHHHHHHHHHHHHHCCCCCCCC T ss_conf 94358999999997511147997 No 11 >PF10671 TcpQ: Toxin co-regulated pilus biosynthesis protein Q Probab=90.01 E-value=0.033 Score=29.81 Aligned_cols=19 Identities=37% Similarity=0.616 Sum_probs=17.9 Q ss_pred CHHHHHHHHHHHHHHCCCC Q ss_conf 9248899999888714578 Q T0612 1 MNKGLVLACLLLGLSACAP 19 (129) Q Consensus 1 Mkk~l~~~~~~l~L~GCas 19 (129) |||-+|+++..+.|.||++ T Consensus 1 ~kkn~i~~~~~i~lsGcs~ 19 (169) T PF10671_consen 1 MKKNLIAITLAIMLSGCSS 19 (169) T ss_pred CCCCEEHHHHHHHHCCCCC T ss_conf 9740303778988426333 No 12 >PF07901 DUF1672: Protein of unknown function (DUF1672); InterPro: IPR012873 This family is composed of hypothetical bacterial proteins of unknown function. Probab=88.58 E-value=0.099 Score=27.09 Aligned_cols=19 Identities=32% Similarity=0.459 Sum_probs=13.8 Q ss_pred CHHH-HHHHHHHHHHHCCCC Q ss_conf 9248-899999888714578 Q T0612 1 MNKG-LVLACLLLGLSACAP 19 (129) Q Consensus 1 Mkk~-l~~~~~~l~L~GCas 19 (129) |+|. .++++++|+|+||+. T Consensus 1 M~K~i~~ll~~~lLLgGCs~ 20 (304) T PF07901_consen 1 MKKRIISLLAATLLLGGCSN 20 (304) T ss_pred CHHHHHHHHHHHHHHCCCCC T ss_conf 91489999999999744446 No 13 >PF03627 PapG_N: PapG carbohydrate binding domain; InterPro: IPR005310 PapG, the adhesin of the P-pili, is situated at the tip and is only a minor component of the whole pilus structure. 
A two-domain structure has been postulated for PapG; a carbohydrate binding N-terminus (this domain) and chaperone binding C-terminus. The carbohydrate-binding domain interacts with the receptor glycan , .; GO: 0030246 carbohydrate binding, 0007155 cell adhesion; PDB: 1j8r_A 1j8s_A. Probab=87.38 E-value=0.045 Score=29.06 Aligned_cols=18 Identities=33% Similarity=0.429 Sum_probs=16.1 Q ss_pred CHHHHHHHHHHHHHHCCCC Q ss_conf 9248899999888714578 Q T0612 1 MNKGLVLACLLLGLSACAP 19 (129) Q Consensus 1 Mkk~l~~~~~~l~L~GCas 19 (129) ||||+.+++. |.|+||.+ T Consensus 1 MKKWfPAfLF-LslSG~nd 18 (226) T PF03627_consen 1 MKKWFPAFLF-LSLSGCND 18 (226) T ss_dssp ------------------- T ss_pred CCCCHHEHEE-EEECCCCC T ss_conf 9630011203-56237775 No 14 >PF05590 DUF769: Xylella fastidiosa protein of unknown function (DUF769); InterPro: IPR008487 This family consists of several uncharacterised hypothetical proteins of unknown function from Xylella fastidiosa, the organism that causes Pierce's disease in plants. Probab=86.89 E-value=0.13 Score=26.33 Aligned_cols=17 Identities=24% Similarity=0.284 Sum_probs=13.6 Q ss_pred HHHHHHHHHHHCCCCCC Q ss_conf 89999988871457886 Q T0612 5 LVLACLLLGLSACAPHT 21 (129) Q Consensus 5 l~~~~~~l~L~GCas~~ 21 (129) +.++.+.|+|+||+|-| T Consensus 9 ~sllaaslllagcss~p 25 (284) T PF05590_consen 9 CSLLAASLLLAGCSSGP 25 (284) T ss_pred HHHHHHHHHHHCCCCCC T ss_conf 79999999972378899 No 15 >PF11839 DUF3359: Protein of unknown function (DUF3359) Probab=85.74 E-value=0.17 Score=25.70 Aligned_cols=21 Identities=33% Similarity=0.365 Sum_probs=14.4 Q ss_pred CHHHHHHHHH--HHHHHCCCCCC Q ss_conf 9248899999--88871457886 Q T0612 1 MNKGLVLACL--LLGLSACAPHT 21 (129) Q Consensus 1 Mkk~l~~~~~--~l~L~GCas~~ 21 (129) |||.|+..+. 
++|++||++.+ T Consensus 1 M~~~l~s~~~~~~~L~~GCAs~s 23 (96) T PF11839_consen 1 MKKLLISALALAALLAAGCASTS 23 (96) T ss_pred CCHHHHHHHHHHHHHHHHCCCCC T ss_conf 90599999999999985726885 No 16 >PF09619 YscW: Type III secretion system lipoprotein chaperone (YscW) Probab=83.57 E-value=1.1 Score=21.10 Aligned_cols=35 Identities=14% Similarity=0.120 Sum_probs=17.3 Q ss_pred CCCEEEEEEEEEECCCCCEECCCCCCCEEEEECCCCEEE Q ss_conf 876289999999768896838888760389977995089 Q T0612 68 PKDQRLQYKFTWYDINGATVEDEGVSWKSLKLHGKQQMQ 106 (129) Q Consensus 68 ~~~~~l~Yrf~WyD~~Gl~v~~~~~~W~~l~l~~~~~~~ 106 (129) ++...|.-|++|-++-++.. ..|+.+.=.++..+. T Consensus 80 ~G~~Ylra~L~~~g~~~vqa----~~qq~v~~~~~~~v~ 114 (124) T PF09619_consen 80 EGELYLRARLRFQGKRAVQA----SSQQKVFKGGKYVVQ 114 (124) T ss_pred CCCEEEEEEEEECCHHHHHH----HHHHHHHCCCCEEEE T ss_conf 87259999999856788657----777743359828999 No 17 >PF06085 Rz1: Lipoprotein Rz1 precursor; InterPro: IPR010346 This family consists of several bacteria and phage lipoprotein Rz1 precursors. Rz1 is a proline-rich lipoprotein from bacteriophage lambda, which is known to have fusogenic properties. Rz1-induced liposome fusion is thought to be mediated primarily by the generation of local perturbation in the bilayer lipid membrane and to a lesser extent by electrostatic forces .; GO: 0019064 viral envelope fusion with host membrane, 0019867 outer membrane Probab=81.97 E-value=0.32 Score=24.15 Aligned_cols=23 Identities=39% Similarity=0.433 Sum_probs=14.3 Q ss_pred CHHHHHHH---HHHHHHHCCCCCCCC Q ss_conf 92488999---998887145788667 Q T0612 1 MNKGLVLA---CLLLGLSACAPHTGG 23 (129) Q Consensus 1 Mkk~l~~~---~~~l~L~GCas~~~~ 23 (129) ||+...++ .+.|+|+||+|.++. T Consensus 1 Mr~l~~~l~~~~~~L~lsaC~S~p~~ 26 (59) T PF06085_consen 1 MRKLKMLLCALALPLALSACSSKPPV 26 (59) T ss_pred CHHHHHHHHHHHHHHHHHHHCCCCCC T ss_conf 90389999999999999871589976 No 18 >PF11611 TRF2: Telomeric repeat-binding factor 2; PDB: 3cfu_A. 
Probab=80.32 E-value=0.52 Score=22.91 Aligned_cols=75 Identities=13% Similarity=0.092 Sum_probs=53.0 Q ss_pred ECCCEEEEEEEEEECCCCCEEE-EEEEEEECCCCCEECCCCCC--C----EEEEECCCCEEEEEEECCCCCEEE-EEEEE Q ss_conf 0487489999998466876289-99999976889683888876--0----389977995089998637984058-99999 Q T0612 52 AEAGFLRARGTIISKSPKDQRL-QYKFTWYDINGATVEDEGVS--W----KSLKLHGKQQMQVTALSPNATAVR-CELYV 123 (129) Q Consensus 52 ~~~g~~~~~v~l~N~~~~~~~l-~Yrf~WyD~~Gl~v~~~~~~--W----~~l~l~~~~~~~i~~vap~~~a~~-~Rlyl 123 (129) ..+.++.+.+.+.|+++.++.+ .+.|..+|.+|-...+.... . ..-.|.++++.+...+=--|+..+ |+|.+ T Consensus 33 ~g~~fvvV~v~v~N~~~e~~~~~~~~f~L~d~~g~~y~~~~~~~~~~~~~~~~~l~pG~~~~g~ivF~vp~~~~~~~L~~ 112 (123) T PF11611_consen 33 EGGKFVVVDVTVKNNGDEPISFSPSDFKLYDDDGKEYDPDFSASSDPDNFFSGELKPGESVEGKIVFEVPKDSQPYELEY 112 (123) T ss_dssp --SEEEEEEEEEEE-----B-B-----EEE-TT--B--EEE-CCC---------B----EE---EEEEE----GG-EEEE T ss_pred CCCEEEEEEEEEEECCCCCEEECCCCEEEEECCCCEECCCCCCCCCCCCCCCEEECCCCEEEEEEEEEECCCCCCEEEEE T ss_conf 99989999999999999957757571999949997981443310011554534999999899999999899994579999 Q ss_pred EEE Q ss_conf 972 Q T0612 124 REA 126 (129) Q Consensus 124 Re~ 126 (129) ..- T Consensus 113 ~~~ 115 (123) T PF11611_consen 113 DPD 115 (123) T ss_dssp -H- T ss_pred ECC T ss_conf 267 No 19 >PF06788 UPF0257: Uncharacterised protein family (UPF0257); InterPro: IPR010646 This is a group of proteins of unknown function. Probab=79.92 E-value=0.47 Score=23.19 Aligned_cols=19 Identities=42% Similarity=0.604 Sum_probs=17.3 Q ss_pred CHHHHHHHHHHHHHHCCCC Q ss_conf 9248899999888714578 Q T0612 1 MNKGLVLACLLLGLSACAP 19 (129) Q Consensus 1 Mkk~l~~~~~~l~L~GCas 19 (129) |||.+.+.+++++|+||.. 
T Consensus 1 ~~~~~~~~~l~~~l~~cd~ 19 (236) T PF06788_consen 1 MKKQLLLCLLALLLAGCDN 19 (236) T ss_pred CCEEEHHHHHHHHHHHCCC T ss_conf 9605458999997764125 No 20 >PF09476 Pilus_CpaD: Pilus biogenesis CpaD protein (pilus_cpaD) Probab=78.52 E-value=0.74 Score=22.03 Aligned_cols=16 Identities=44% Similarity=0.683 Sum_probs=11.2 Q ss_pred HHHHHHHHHCCCCCCC Q ss_conf 9999888714578866 Q T0612 7 LACLLLGLSACAPHTG 22 (129) Q Consensus 7 ~~~~~l~L~GCas~~~ 22 (129) ++.++++|+||++..+ T Consensus 2 l~~~~~~LaaC~~~~~ 17 (203) T PF09476_consen 2 LLALALALAACASTAD 17 (203) T ss_pred HHHHHHHHHHCCCCCC T ss_conf 7899999752169876 No 21 >PF11322 DUF3124: Protein of unknown function (DUF3124) Probab=77.92 E-value=1.7 Score=19.96 Aligned_cols=52 Identities=17% Similarity=0.196 Sum_probs=41.4 Q ss_pred EEEEEEEEEECCCCCEEEEEEEEEECCCCCEECCCCCCCEEEEECCCCEEEEEE Q ss_conf 489999998466876289999999768896838888760389977995089998 Q T0612 56 FLRARGTIISKSPKDQRLQYKFTWYDINGATVEDEGVSWKSLKLHGKQQMQVTA 109 (129) Q Consensus 56 ~~~~~v~l~N~~~~~~~l~Yrf~WyD~~Gl~v~~~~~~W~~l~l~~~~~~~i~~ 109 (129) .+.+.+.++|+....=-.-=+...||.+|=.|..-.. 
+++.|.|.++..+-- T Consensus 24 ~Lt~tLSiRNtd~~~~i~i~~v~Yydt~Gklvr~yl~--~Pi~L~Pl~s~~~~V 75 (125) T PF11322_consen 24 NLTATLSIRNTDPTHPITITSVDYYDTDGKLVRSYLD--APIELKPLASTEIVV 75 (125) T ss_pred EEEEEEEEECCCCCCCEEEEEEEEECCCCEEHHHHHC--CCEECCCCCEEEEEE T ss_conf 7899999974999998899998898599909687636--984538862389997 No 22 >PF11873 DUF3393: Domain of unknown function (DUF3393) Probab=77.86 E-value=0.3 Score=24.30 Aligned_cols=17 Identities=41% Similarity=0.634 Sum_probs=13.7 Q ss_pred HHHHHHHHHHHHHHCCC Q ss_conf 24889999988871457 Q T0612 2 NKGLVLACLLLGLSACA 18 (129) Q Consensus 2 kk~l~~~~~~l~L~GCa 18 (129) ||.|++++++++|+||+ T Consensus 1 ~k~l~~~~~~llL~~Cs 17 (204) T PF11873_consen 1 KKFLILLLIILLLSSCS 17 (204) T ss_pred CCCHHHHHHHHHHHHHC T ss_conf 94819999999999857 No 23 >PF00942 CBM_3: Cellulose binding domain; InterPro: IPR001956 This domain is involved in cellulose binding and is found associated with a wide range of bacterial glycosyl hydrolases. The structure for this domain is known ; it forms a beta sandwich.; GO: 0030246 carbohydrate binding, 0005975 carbohydrate metabolic process; PDB: 1g43_A 1nbc_B 1k72_B 1g87_B 1kfg_A 1ga2_B 3tf4_A 1js4_B 1tf4_A 4tf4_A .... Probab=77.74 E-value=1.7 Score=19.93 Aligned_cols=71 Identities=11% Similarity=0.198 Sum_probs=40.7 Q ss_pred ECCCEEEEEEEEEECCCCCE---EEEEEEEEECCCCCEECCCCCCCEEEEECCCCEE--EEEEECCCCCEEEEEEEE Q ss_conf 04874899999984668762---8999999976889683888876038997799508--999863798405899999 Q T0612 52 AEAGFLRARGTIISKSPKDQ---RLQYKFTWYDINGATVEDEGVSWKSLKLHGKQQM--QVTALSPNATAVRCELYV 123 (129) Q Consensus 52 ~~~g~~~~~v~l~N~~~~~~---~l~Yrf~WyD~~Gl~v~~~~~~W~~l~l~~~~~~--~i~~vap~~~a~~~Rlyl 123 (129) ......+..+.|.|+...++ .|..| ||||.+|..-....-.|-++.....+.. 
++....|-....++=|+| T Consensus 10 ~~~n~i~~~~~i~Ntg~pa~~l~~l~~R-Yyft~d~~~~~~~~~d~~~v~~~~~~~~~~~~~~~~~~~~~a~yYvEi 85 (86) T PF00942_consen 10 ASTNFIEPKFKIYNTGWPARDLSDLKIR-YYFTLDGSKAAGFWCDDATVGTNYSSNVTGTFSGLSPPDAGADYYVEI 85 (86) T ss_dssp SEESEEEEEEEEEE-----B-CGGEEEE-EEEE-CCHHHEEEECCCCEEEECEGGGEE-EEEEEEEEETTEEEEEEE T ss_pred CCCCEEEEEEEEEECCCCCEECCCEEEE-EEEECCCCCCCCCCCCEEEECCCCCCCCCEEECCCCCCCCCCCEEEEC T ss_conf 9866588999999789998763777999-999167662456441117855644787427886667677897489973 No 24 >PF10566 Glyco_hydro_97: Glycoside hydrolase 97 ; PDB: 2jkp_A 2zq0_A 2jke_B 2d73_B 2jka_B. Probab=76.65 E-value=1.3 Score=20.54 Aligned_cols=16 Identities=31% Similarity=0.320 Sum_probs=9.0 Q ss_pred CHHHHHHHHHHHHHHC Q ss_conf 9248899999888714 Q T0612 1 MNKGLVLACLLLGLSA 16 (129) Q Consensus 1 Mkk~l~~~~~~l~L~G 16 (129) |||.++++++++++.+ T Consensus 1 MKk~~i~~l~~~l~~~ 16 (643) T PF10566_consen 1 MKKLIIILLALLLLLS 16 (643) T ss_dssp ---------------- T ss_pred CCHHHHHHHHHHHHHH T ss_conf 9437999999999987 No 25 >PF09533 DUF2380: Predicted lipoprotein of unknown function (DUF2380) Probab=75.53 E-value=0.69 Score=22.22 Aligned_cols=22 Identities=36% Similarity=0.422 Sum_probs=15.3 Q ss_pred CHHHHHHHH--HHHHHHCCCCCCC Q ss_conf 924889999--9888714578866 Q T0612 1 MNKGLVLAC--LLLGLSACAPHTG 22 (129) Q Consensus 1 Mkk~l~~~~--~~l~L~GCas~~~ 22 (129) |+..+.+.+ +++++.|||+..+ T Consensus 1 m~~~~~~~l~~l~~~~~gCa~~~~ 24 (217) T PF09533_consen 1 MRRALVLWLLVLALLWVGCASAAP 24 (217) T ss_pred CCHHHHHHHHHHHHHHHHHCCCCC T ss_conf 913799999999999865115789 No 26 >PF00820 Lipoprotein_1: Borrelia lipoprotein The Pfam entry is a subset of this entry.; InterPro: IPR001809 The ospA and ospB genes encode the major outer membrane proteins of the Lyme disease spirochaete Borrelia burgdorferi . The deduced gene products OspA and OspB, contain 273 and 296 residues respectively . The two Osp proteins show a high degree of sequence similarity, indicating a recent evolutionary event. 
Molecular analysis and sequence comparison of OspA and OspB with other proteins has revealed similarity to the signal peptides of prokaryotic lipoproteins , .; GO: 0009279 cell outer membrane; PDB: 3ckg_A 3cka_B 2fkg_A 2fkj_A 2pi3_O 2ol6_O 3ckf_A 2i5z_O 2hkd_A 1p4p_A .... Probab=74.19 E-value=1.4 Score=20.44 Aligned_cols=22 Identities=36% Similarity=0.475 Sum_probs=19.4 Q ss_pred CHHHHHHHHHHHHHHCCCCCCC Q ss_conf 9248899999888714578866 Q T0612 1 MNKGLVLACLLLGLSACAPHTG 22 (129) Q Consensus 1 Mkk~l~~~~~~l~L~GCas~~~ 22 (129) |||+|+-+.++|.|-||.-+.+ T Consensus 1 MkkYLlG~~LilAliaC~Q~~s 22 (273) T PF00820_consen 1 MKKYLLGIGLILALIACKQNVS 22 (273) T ss_dssp ---------------------- T ss_pred CCEEHHHHHHHHHHHHHHCCCC T ss_conf 9330147999999987602244 No 27 >PF08085 Entericidin: Entericidin EcnA/B family; InterPro: IPR012556 This family consists of the entericidin antidote/toxin peptides. The entericidin locus is activated in stationary phase under high osmolarity conditions by rho-S and simultaneously repressed by the osmoregulatory EnvZ/OmpR signal transduction pathway. The entericidin locus encodes tandem paralogous genes (ecnAB) and directs the synthesis of two small cell-envelope lipoproteins which can maintain plasmids in bacterial population by means of post-segregational killing .; GO: 0009636 response to toxin, 0016020 membrane Probab=74.02 E-value=0.62 Score=22.47 Aligned_cols=18 Identities=44% Similarity=0.654 Sum_probs=11.0 Q ss_pred CHH----HHHHHHHHHHHHCCC Q ss_conf 924----889999988871457 Q T0612 1 MNK----GLVLACLLLGLSACA 18 (129) Q Consensus 1 Mkk----~l~~~~~~l~L~GCa 18 (129) ||| .+.++.+++.|+||. T Consensus 1 Mkk~~~~~~~~~~~~~~l~gCn 22 (42) T PF08085_consen 1 MKKKILIILALLALALALAGCN 22 (42) T ss_pred CCHHHHHHHHHHHHHHHHHHHH T ss_conf 9558999999999999880026 No 28 >PF06572 DUF1131: Protein of unknown function (DUF1131); InterPro: IPR010938 This family consists of several hypothetical bacterial proteins of unknown function.; PDB: 2qzb_B. 
Probab=71.08 E-value=0.34 Score=23.98 Aligned_cols=19 Identities=32% Similarity=0.335 Sum_probs=14.0 Q ss_pred CHHH-HHHHHHHHHHHCCCC Q ss_conf 9248-899999888714578 Q T0612 1 MNKG-LVLACLLLGLSACAP 19 (129) Q Consensus 1 Mkk~-l~~~~~~l~L~GCas 19 (129) ||+. +.++...|+|+||++ T Consensus 1 ~~~~r~~ll~~~lll~gCs~ 20 (192) T PF06572_consen 1 MKSLRLLLLGGPLLLTGCST 20 (192) T ss_dssp -------------------- T ss_pred CCCCHHHHHHHHHHHHCCCC T ss_conf 97500099875898744555 No 29 >PF12099 DUF3575: Protein of unknown function (DUF3575) Probab=70.47 E-value=2.1 Score=19.45 Aligned_cols=18 Identities=33% Similarity=0.250 Sum_probs=12.9 Q ss_pred CHHHHHHHHHHHHHHCCC Q ss_conf 924889999988871457 Q T0612 1 MNKGLVLACLLLGLSACA 18 (129) Q Consensus 1 Mkk~l~~~~~~l~L~GCa 18 (129) |||+..+++++++.+.|+ T Consensus 1 ~~~~~~~~~~~~~~~~~~ 18 (189) T PF12099_consen 1 MKKIRFLFLLLLLFCTSS 18 (189) T ss_pred CCEEEHHHHHHHHHHHHC T ss_conf 935304599999999753 No 30 >PF04076 BOF: Bacterial OB fold (BOF) protein; InterPro: IPR005220 This family includes putative periplasmic proteins.; PDB: 1nnx_A. Probab=69.96 E-value=0.62 Score=22.49 Aligned_cols=49 Identities=16% Similarity=0.234 Sum_probs=30.0 Q ss_pred EEEEECCCCCE-ECCCCCCCEEEEECCCCEEEEEEECCCCCEEEEEEEEEE Q ss_conf 99997688968-388887603899779950899986379840589999997 Q T0612 76 KFTWYDINGAT-VEDEGVSWKSLKLHGKQQMQVTALSPNATAVRCELYVRE 125 (129) Q Consensus 76 rf~WyD~~Gl~-v~~~~~~W~~l~l~~~~~~~i~~vap~~~a~~~RlylRe 125 (129) +|.+=|..|=- |+=....|....+.++..+.|.+--- .+-....|++.. 
T Consensus 74 ~Y~F~D~tG~I~VeId~~~w~g~~v~p~~kV~i~GevD-k~~~~~~IdV~~ 123 (126) T PF04076_consen 74 KYIFRDGTGEIQVEIDDDVWNGQPVTPDDKVRIFGEVD-KDWNPPEIDVKR 123 (126) T ss_dssp EEEE------EEEE--GGG-------TTSEEEEE-EEE-EETTEEEEEEEE T ss_pred EEEEECCCCCEEEEECCCEECCCCCCCCCEEEEEEEEC-CCCCCCEEEEEE T ss_conf 69998899509999881200796069999899999991-799872899999 No 31 >PF11355 DUF3157: Protein of unknown function (DUF3157) Probab=67.71 E-value=3 Score=18.52 Aligned_cols=86 Identities=24% Similarity=0.271 Sum_probs=56.1 Q ss_pred CEEEEEEEEEEECCCEEEEEEEEEECCCC-CEEEEEEEEEECCCCCEECCCCC-CCEEEE------ECCCCEEE---EEE Q ss_conf 50477421022048748999999846687-62899999997688968388887-603899------77995089---998 Q T0612 41 SDVDVSAVTTQAEAGFLRARGTIISKSPK-DQRLQYKFTWYDINGATVEDEGV-SWKSLK------LHGKQQMQ---VTA 109 (129) Q Consensus 41 ~~i~v~~~~~~~~~g~~~~~v~l~N~~~~-~~~l~Yrf~WyD~~Gl~v~~~~~-~W~~l~------l~~~~~~~---i~~ 109 (129) .||.|.-...+.++|.+-....+.|+++. -+.+.-....||++|-.+..+.. -|+.+. |+++|+.. |-. T Consensus 102 sGV~V~l~~s~~e~~~L~L~f~ltn~Sse~vv~Vevev~lf~d~G~~L~~e~v~vWqaI~RmpdTYLRkgqqr~s~~i~i 181 (203) T PF11355_consen 102 SGVKVSLGASQWEDDRLGLPFELTNQSSEHVVLVEVEVSLFDDSGALLKTETVKVWQAIFRMPDTYLRKGQQRQSKVIWI 181 (203) T ss_pred CCEEEEEECCCCCCCEEEEEEEECCCCCCEEEEEEEEEEEECCCCCCHHCCHHHHHHHHHHCHHHHCCCCCCCCCCEEEE T ss_conf 77049972251128736758885158960599999999998688871201130687877538153157665466852899 Q ss_pred ECCCCCEEE---EEEEEEEE Q ss_conf 637984058---99999972 Q T0612 110 LSPNATAVR---CELYVREA 126 (129) Q Consensus 110 vap~~~a~~---~RlylRe~ 126 (129) .-|...--+ +++.|-|. 
T Consensus 182 ~~~d~~~~~k~lis~kI~Ev 201 (203) T PF11355_consen 182 EGPDKSQWQKQLISLKIIEV 201 (203) T ss_pred ECCCHHHHHHHCEEEEEEEC T ss_conf 45666663000057899971 No 32 >PF11659 DUF3261: Protein of unknown function (DUF3261) Probab=66.81 E-value=3 Score=18.51 Aligned_cols=45 Identities=16% Similarity=0.250 Sum_probs=19.6 Q ss_pred HHHHHHCCCCCCC---CCCCCCCCCEEEEC-CCCCCEEEEEEEEEEECC Q ss_conf 9888714578866---73026876168607-521250477421022048 Q T0612 10 LLLGLSACAPHTG---GIMISSTGEVRVDN-GSFHSDVDVSAVTTQAEA 54 (129) Q Consensus 10 ~~l~L~GCas~~~---~~~~~~~~~vv~~~-s~l~~~i~v~~~~~~~~~ 54 (129) ++++|+|||+.++ ...++....+-... +..+..+...+..+...+ T Consensus 2 l~llL~gCs~~~~~~~~v~la~~~~~~Lp~~~~~~~~~~~~Qlvt~~~~ 50 (154) T PF11659_consen 2 LALLLSGCSSQPQRQTCVALAPGVSVTLPPPAQLGPSLSLQQLVTATWG 50 (154) T ss_pred EEEEHHHHHCCCCCCCCEEECCCCEEEECCCCCCCCCCCEEEEEEEEEC T ss_conf 2861015436878888668589952340786535777347999999989 No 33 >PF11353 DUF3153: Protein of unknown function (DUF3153) Probab=64.55 E-value=1.1 Score=20.97 Aligned_cols=30 Identities=30% Similarity=0.383 Sum_probs=21.9 Q ss_pred HHHHHHHHHHHCCCCCCCCCCCCCCCCEEE Q ss_conf 899999888714578866730268761686 Q T0612 5 LVLACLLLGLSACAPHTGGIMISSTGEVRV 34 (129) Q Consensus 5 l~~~~~~l~L~GCas~~~~~~~~~~~~vv~ 34 (129) +++++++++|+||-.-..++.+.+++++-+ T Consensus 2 ~vllll~lLLsGCVr~~~~i~~~~~d~I~l 31 (209) T PF11353_consen 2 AVLLLLTLLLSGCVRYDADIDFSGPDRIKL 31 (209) T ss_pred CHHHHHHHHHCCEEEEEEEEEECCCCEEEE T ss_conf 779999987276378887788789995987 No 34 >PF11254 DUF3053: Protein of unknown function (DUF3053) Probab=62.61 E-value=1.5 Score=20.23 Aligned_cols=17 Identities=24% Similarity=0.528 Sum_probs=12.4 Q ss_pred HHHHHHHHHHHCCCCCC Q ss_conf 89999988871457886 Q T0612 5 LVLACLLLGLSACAPHT 21 (129) Q Consensus 5 l~~~~~~l~L~GCas~~ 21 (129) ++++++.+.|+||+.+. 
T Consensus 7 l~al~~vl~LaGC~dkE 23 (229) T PF11254_consen 7 LLALLMVLQLAGCGDKE 23 (229) T ss_pred HHHHHHHHHHHHCCCCC T ss_conf 99999999983147997 No 35 >PF07919 DUF1683: Protein of unknown function (DUF1683); InterPro: IPR012880 The proteins featured in this family are all hypothetical eukaryotic proteins of unknown function. The region in question is approximately 150 residues long. Probab=62.09 E-value=3.9 Score=17.89 Aligned_cols=74 Identities=11% Similarity=0.004 Sum_probs=52.8 Q ss_pred CEEEECCCCCCEEEEEEEEEEECCCEEEEEEEEEECCCCCEEEEEEEEEECCCCCEECCCCCCCEEEEECCCCEEEEEEE Q ss_conf 16860752125047742102204874899999984668762899999997688968388887603899779950899986 Q T0612 31 EVRVDNGSFHSDVDVSAVTTQAEAGFLRARGTIISKSPKDQRLQYKFTWYDINGATVEDEGVSWKSLKLHGKQQMQVTAL 110 (129) Q Consensus 31 ~vv~~~s~l~~~i~v~~~~~~~~~g~~~~~v~l~N~~~~~~~l~Yrf~WyD~~Gl~v~~~~~~W~~l~l~~~~~~~i~~v 110 (129) .+.+.++. -.|.++-+.....++.....+.|.|.+...+++..-+ =|.++|-..+... ..+.+.++++.++.-. T Consensus 31 ~i~v~~~~--l~V~~~~p~~~~~~~~~~l~~~l~N~T~~~~~~~~~l--~~s~~F~fSG~k~--~~~~vlP~s~~~v~~~ 104 (125) T PF07919_consen 31 EITVPSSP--LRVLAEAPSSAIVGEPFTLDYTLENPTMHFQEFELSL--EPSDNFMFSGPKQ--LTLQVLPGSRHTVRYN 104 (125) T ss_pred CEECCCCC--CEEEEECCCCCCCCCCEEEEEEEECCCCCCEEEEEEE--CCCCCEEEECCCC--CEEEECCCCCEEEEEE T ss_conf 63733898--4999984874405986999999995999749999996--7679789968873--4279789975799999 No 36 >PF03304 Mlp: Mlp lipoprotein family; InterPro: IPR004983 The Mlp (for Multicopy Lipoprotein) family of lipoproteins is found in Borrelia species . This family were previously known as 2.9 lipoprotein genes . These surface expressed genes may represent new candidate vaccinogens for Lyme disease . Members of this family generally are downstream of four ORFs called A,B,C and D that are involved in hemolytic activity. 
Probab=61.37 E-value=1.5 Score=20.19 Aligned_cols=20 Identities=40% Similarity=0.637 Sum_probs=15.4 Q ss_pred CHHHHHHHHH-HHHHHCCCCC Q ss_conf 9248899999-8887145788 Q T0612 1 MNKGLVLACL-LLGLSACAPH 20 (129) Q Consensus 1 Mkk~l~~~~~-~l~L~GCas~ 20 (129) ||..-|+++. +|+|.||-++ T Consensus 1 mKiinilfcl~lllL~~Cn~n 21 (181) T PF03304_consen 1 MKIINILFCLFLLLLNSCNSN 21 (181) T ss_pred CCEEHHHHHHHHHHHHCCCCC T ss_conf 944058999999999476768 No 37 >PF02402 Lysis_col: Lysis protein; InterPro: IPR003059 The DNA sequence of the entire colicin E2 operon has been determined . The operon comprises the colicin activity gene (ceaB), the colicin immunity gene (ceiB) and the lysis gene (celB), which is essential for colicin release from producing cells . A putative LexA binding site is located upstream from ceaB, and a rho-independent terminator structure is located downstream from celB . Comparison of the amino acid sequences of colicin E2 and cloacin DF13 reveal extensive similarity. These colicins have different modes of action and recognise different cell surface receptors; the two major regions of heterology at the C-terminus, and in the C-terminal end of the central region are thought to correspond to the catalytic and receptor-recognition domains, respectively . Sequence similarities between colicins E2, A and E1 are less striking. The colicin E2 (pyocin) immunity protein does not share similarity with either the colicin E3 or cloacin DF13 immunity proteins. By contrast, the lysis proteins of the ColE2, ColE1 and CloDF13 plasmids are almost identical except in the N-terminal regions, which themselves are similar to lipoprotein signal peptides . Processing of the ColE2 prolysis protein to the mature form is prevented by globomycin, a specific inhibitor of the lipoprotein signal peptidase . The mature ColE2 lysis protein is located in the cell envelope . 
; GO: 0009405 pathogenesis, 0019835 cytolysis, 0019867 outer membrane Probab=58.91 E-value=0.68 Score=22.27 Aligned_cols=20 Identities=35% Similarity=0.491 Sum_probs=12.6 Q ss_pred CHHHHHH--HHHHHHHHCCCCC Q ss_conf 9248899--9998887145788 Q T0612 1 MNKGLVL--ACLLLGLSACAPH 20 (129) Q Consensus 1 Mkk~l~~--~~~~l~L~GCas~ 20 (129) |||.+.. +++.++|+||-.| T Consensus 1 MkKi~~~~i~~~~~~L~aCQaN 22 (46) T PF02402_consen 1 MKKILFIGILLLTMLLAACQAN 22 (46) T ss_pred CCEEEEEHHHHHHHHHHHHHHC T ss_conf 9478773899999999872012 No 38 >PF12079 DUF3558: Protein of unknown function (DUF3558) Probab=58.01 E-value=2.8 Score=18.74 Aligned_cols=41 Identities=15% Similarity=0.422 Sum_probs=26.0 Q ss_pred EEEEEEEEECCCCCEEC---CCCCCC--EEEEECCCCEEEEEEECC Q ss_conf 89999999768896838---888760--389977995089998637 Q T0612 72 RLQYKFTWYDINGATVE---DEGVSW--KSLKLHGKQQMQVTALSP 112 (129) Q Consensus 72 ~l~Yrf~WyD~~Gl~v~---~~~~~W--~~l~l~~~~~~~i~~vap 112 (129) -+.+-|+||+...+.-+ .+..++ ..+.+.|..-...+.-.+ T Consensus 77 ~~~vs~~~~~~~~l~~er~~~~~~~~~~~~~~I~G~~a~~~~~~~~ 122 (168) T PF12079_consen 77 GMDVSFSWYRGSSLDRERALAENLGYEVEDITIAGRPAFVARDPGD 122 (168) T ss_pred CCEEEEEEECCCCHHHHHHHHHCCCCCEEEEEECCEEEEEEECCCC T ss_conf 4048998613885788655421268723665566835899865899 No 39 >PF05079 DUF680: Protein of unknown function (DUF680); InterPro: IPR007771 This family contains uncharacterised proteins which seem to be found exclusively in Mesorhizobium loti. 
Probab=57.77 E-value=1.9 Score=19.69 Aligned_cols=18 Identities=39% Similarity=0.305 Sum_probs=15.8 Q ss_pred CHHHHHHHHHHHHHHCCC Q ss_conf 924889999988871457 Q T0612 1 MNKGLVLACLLLGLSACA 18 (129) Q Consensus 1 Mkk~l~~~~~~l~L~GCa 18 (129) |||.++.+.++|+++|-+ T Consensus 1 MkKi~L~aaA~l~~sgsA 18 (75) T PF05079_consen 1 MKKIALTAAALLLISGSA 18 (75) T ss_pred CHHHHHHHHHHHHHHHHH T ss_conf 916999999999965376 No 40 >PF05211 NLBH: Neuraminyllactose-binding hemagglutinin precursor (NLBH); InterPro: IPR007876 This family is comprised of several flagellar sheath adhesin proteins also called neuraminyllactose-binding haemagglutinin precursor (NLBH or HpaA) or N-acetylneuraminyllactose-binding fibrillar haemagglutinin receptor-binding subunits. NLBH is found exclusively in Helicobacter which are gut colonising bacteria and bind to sialic acid rich macromolecules present on the gastric epithelium . The sialic acid-sensitive agglutination of erythrocytes by certain strains of Helicobacter pylori has been attributed to the NLBH protein .; GO: 0009279 cell outer membrane, 0019861 flagellum; PDB: 2i9i_A 3bgh_A. Probab=55.02 E-value=1 Score=21.22 Aligned_cols=21 Identities=29% Similarity=0.137 Sum_probs=17.1 Q ss_pred HHHHHHHHHHHHHHCCCCCCC Q ss_conf 248899999888714578866 Q T0612 2 NKGLVLACLLLGLSACAPHTG 22 (129) Q Consensus 2 kk~l~~~~~~l~L~GCas~~~ 22 (129) ||.++.+.+..+|.|||.+|. T Consensus 1 kK~ll~~~l~slLv~ca~~~~ 21 (258) T PF05211_consen 1 KKCLLALGLGSLLVGCAFYPA 21 (258) T ss_dssp --------------------- T ss_pred CCEEEEEHHHHHHHCCCCCHH T ss_conf 952530036778644676635 No 41 >PF03843 Slp: Outer membrane lipoprotein Slp family; InterPro: IPR004658 Slp superfamily members are present in the Gram-negative gamma proteobacteria Escherichia coli (which also contains a close paralog), Haemophilus influenzae and Pasteurella multocida and Vibrio cholerae. 
The known members of the family to date share a motif LX[GA]C near the N-terminus, which is compatible with the possibility that the protein is modified into a lipoprotein with Cys as the new N-terminus. Slp from E. coli is known to be a lipoprotein of the outer membrane and to be expressed in response to carbon starvation.; GO: 0019867 outer membrane Probab=54.33 E-value=5.3 Score=17.11 Aligned_cols=14 Identities=29% Similarity=0.444 Sum_probs=9.9 Q ss_pred HHCCCCCCCCCCCC Q ss_conf 71457886673026 Q T0612 14 LSACAPHTGGIMIS 27 (129) Q Consensus 14 L~GCas~~~~~~~~ 27 (129) |+||||-+..+..+ T Consensus 1 L~gCasvP~~~~~~ 14 (160) T PF03843_consen 1 LSGCASVPSELKRN 14 (160) T ss_pred CCCCCCCCHHHCCC T ss_conf 97441798445246 No 42 >PF06848 Disaggr_repeat: Disaggregatase related repeat; InterPro: IPR010671 This entry describes several repeats which seem to be specific to the Methanosarcina archaea species and are often found in multiple copies in disaggregatase proteins. Members of this family are also found in single copies in several hypothetical proteins. Probab=52.09 E-value=3.5 Score=18.13 Aligned_cols=24 Identities=25% Similarity=0.715 Sum_probs=18.5 Q ss_pred EEECCCCCEECCCCCCCEEEEECCCC Q ss_conf 99768896838888760389977995 Q T0612 78 TWYDINGATVEDEGVSWKSLKLHGKQ 103 (129) Q Consensus 78 ~WyD~~Gl~v~~~~~~W~~l~l~~~~ 103 (129) .|||++|..+.. .|+-+++|.|.. T Consensus 101 DWyDkngvlQG~--TpyAtit~k~s~ 124 (182) T PF06848_consen 101 DWYDKNGVLQGS--TPYATITIKGSD 124 (182) T ss_pred CCCCCCCCEECC--CCEEEEEECCCC T ss_conf 431257821157--644899865888 No 43 >PF01289 Thiol_cytolysin: Thiol-activated cytolysin; InterPro: IPR001869 Thiol-activated cytolysins , are toxins produced by a variety of Gram-positive bacteria and are characterised by their ability to lyse cholesterol-containing membranes, their reversible inactivation by oxidation and their capacity to bind to cholesterol. 
All these proteins contain a single cysteine residue, located in their C-terminal section, which has been shown to be essential for the binding to cholesterol.; GO: 0015485 cholesterol binding, 0009405 pathogenesis; PDB: 1s3r_B 3cqf_B 1m3i_C 1m3j_B 1pfo_A. Probab=50.78 E-value=6 Score=16.78 Aligned_cols=50 Identities=26% Similarity=0.346 Sum_probs=33.0 Q ss_pred EEEEEEEE----ECCCCCEECCC---CCCCEEEEECCCCEEEEEEECCCCCEEEEEEEEEEE Q ss_conf 89999999----76889683888---876038997799508999863798405899999972 Q T0612 72 RLQYKFTW----YDINGATVEDE---GVSWKSLKLHGKQQMQVTALSPNATAVRCELYVREA 126 (129) Q Consensus 72 ~l~Yrf~W----yD~~Gl~v~~~---~~~W~~l~l~~~~~~~i~~vap~~~a~~~RlylRe~ 126 (129) --+|..+| ||++|-|+-.. +..|+..+.|=..+..+ -++|++.|+++||. T Consensus 371 va~~~i~wde~~~d~~g~e~~~~k~w~~n~~~~ta~f~~~i~~-----~~n~rni~v~~~e~ 427 (467) T PF01289_consen 371 VAQFNITWDEVSYDENGNEVVTHKAWEGNGKDRTAHFSTTIPL-----PGNARNIRVKAREC 427 (467) T ss_dssp -EEEEEEEEEEEE-----EEEEEEE------EB-SSEEEEEEE------TTEEEEEEEEEEE T ss_pred EEEEEEEEEECCCCCCCCEEEEECCCCCCCCCCCCCCEEEEEC-----CCCCCEEEEEEEEC T ss_conf 9999977310230899988743024467886455562388755-----98864148998751 No 44 >PF07996 T4SS: Type IV secretion system proteins; InterPro: IPR012991 Members of this family are components of the type IV secretion system. They mediate intracellular transfer of macromolecules via a mechanism ancestrally related to that of bacterial conjugation machineries.; PDB: 1r8i_A. 
Probab=46.80 E-value=1.7 Score=20.02 Aligned_cols=27 Identities=30% Similarity=0.407 Sum_probs=17.4 Q ss_pred CHHHHHHHHHHHHHHCCCC-CCCCCCCC Q ss_conf 9248899999888714578-86673026 Q T0612 1 MNKGLVLACLLLGLSACAP-HTGGIMIS 27 (129) Q Consensus 1 Mkk~l~~~~~~l~L~GCas-~~~~~~~~ 27 (129) |||.++.+.++++|+.|++ ...||++- T Consensus 1 MKk~~~~~~~~~~l~~~~~a~AaGIPV~ 28 (217) T PF07996_consen 1 MKKKTLALALALALLMSSPAAAAGIPVI 28 (217) T ss_dssp ---------------------------- T ss_pred CHHHHHHHHHHHHHHHCCHHHHCCCCCC T ss_conf 9157999999999980335542899823 No 45 >PF07383 DUF1496: Protein of unknown function (DUF1496); InterPro: IPR009971 This family consists of several bacterial proteins of around 90 residues in length. Members of this family seem to be found exclusively in the Orders Vibrionales and Enterobacteriales. The function of this family is unknown. Probab=46.72 E-value=7 Score=16.40 Aligned_cols=20 Identities=20% Similarity=0.184 Sum_probs=13.6 Q ss_pred CHHHHHHHHHHHHHHCCCCC Q ss_conf 92488999998887145788 Q T0612 1 MNKGLVLACLLLGLSACAPH 20 (129) Q Consensus 1 Mkk~l~~~~~~l~L~GCas~ 20 (129) ||+++++++.+++...|.++ T Consensus 1 M~~~~~~~~~~~~~~~~~~~ 20 (88) T PF07383_consen 1 MKRLLILCFALSASLLALSN 20 (88) T ss_pred CCCHHHHHHHHHHHHHHHCC T ss_conf 93203569999999987422 No 46 >PF05481 Myco_19_kDa: Mycobacterium 19 kDa lipoprotein antigen; InterPro: IPR008691 Most of the antigens of Mycobacterium leprae and Mycobacterium tuberculosis that have been identified are members of stress protein families, which are highly conserved throughout many diverse species. Of the M. leprae and M. tuberculosis antigens identified by monoclonal antibodies, all except the 18 kDa M. leprae antigen and the 19 kDa M. 
tuberculosis antigen are strongly cross-reactive between these two species and are coded within very similar genes , .; GO: 0016020 membrane Probab=43.43 E-value=1.2 Score=20.75 Aligned_cols=120 Identities=18% Similarity=0.226 Sum_probs=50.6 Q ss_pred CHHHHHH-----HHHHHHHHCCCCCC-C----C-----------CCCCCCCCEEEECCCCC--CEEEEEEEE----EEEC Q ss_conf 9248899-----99988871457886-6----7-----------30268761686075212--504774210----2204 Q T0612 1 MNKGLVL-----ACLLLGLSACAPHT-G----G-----------IMISSTGEVRVDNGSFH--SDVDVSAVT----TQAE 53 (129) Q Consensus 1 Mkk~l~~-----~~~~l~L~GCas~~-~----~-----------~~~~~~~~vv~~~s~l~--~~i~v~~~~----~~~~ 53 (129) |||.+.. +++++.|+||++.. . + .......++.++...+. ..+.+.+.- +..- T Consensus 1 m~~~~~~av~g~A~~aa~~~GCS~~~~~~~~~~~t~~~~~~~~sp~~a~g~~V~VdG~~~~~~~~V~C~~~g~~~~I~ig 80 (160) T PF05481_consen 1 MKRGLTVAVAGAAALAAGLSGCSSGDKSAASSSPTSSSSSPSASPGAAAGTQVTVDGKDQDVTGSVTCSQAGGNTNIAIG 80 (160) T ss_pred CCCEEEEEEHHHHHHHHHHCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCEEEECCCCCCCCCCEEEECCCCEEEEECC T ss_conf 96157765325899987541136898526566765577777767776776479988831468753799804999999717 Q ss_pred CCEEEEEEEEEECCCCCEEEEEEEEEECCCCCEE---CCCCCCCEEEEECCCCEEEEEEECC-----CCC---EEEEEEE Q ss_conf 8748999999846687628999999976889683---8888760389977995089998637-----984---0589999 Q T0612 54 AGFLRARGTIISKSPKDQRLQYKFTWYDINGATV---EDEGVSWKSLKLHGKQQMQVTALSP-----NAT---AVRCELY 122 (129) Q Consensus 54 ~g~~~~~v~l~N~~~~~~~l~Yrf~WyD~~Gl~v---~~~~~~W~~l~l~~~~~~~i~~vap-----~~~---a~~~Rly 122 (129) ++..-..+.|. .-+.|. ++- +--=+-+||.. ++....--.+.. -+.+.+|.+.+. ||. -+.|+|. 
T Consensus 81 ~~~~G~~a~vt-~g~~p~-V~s-Vgig~v~G~tl~y~~G~g~G~A~vtk-dG~tYtItGtA~G~D~~NP~~~~t~~FeI~ 156 (160) T PF05481_consen 81 DDTQGIAAVVT-DGDPPT-VES-VGIGNVDGFTLGYSEGTGGGSAKVTK-DGKTYTITGTATGADMANPMAPVTKPFEIE 156 (160) T ss_pred CCCCCEEEEEC-CCCCCE-EEE-EEEEECCCEEEEEECCCCCCCEEEEE-CCCEEEEEEEEECCCCCCCCCCCCCCEEEE T ss_conf 88774799972-699953-689-85660078178872699888756996-599899987787115789876525666899 Q ss_pred EE Q ss_conf 99 Q T0612 123 VR 124 (129) Q Consensus 123 lR 124 (129) +. T Consensus 157 vt 158 (160) T PF05481_consen 157 VT 158 (160) T ss_pred EE T ss_conf 87 No 47 >PF11769 DUF3313: Protein of unknown function (DUF3313) Probab=42.53 E-value=1.9 Score=19.62 Aligned_cols=11 Identities=36% Similarity=0.486 Sum_probs=8.2 Q ss_pred HHHHCCCCCCC Q ss_conf 88714578866 Q T0612 12 LGLSACAPHTG 22 (129) Q Consensus 12 l~L~GCas~~~ 22 (129) |+|+|||+.++ T Consensus 1 l~lagCas~~~ 11 (201) T PF11769_consen 1 LLLAGCASVPP 11 (201) T ss_pred CEEEECCCCCC T ss_conf 92733689998 No 48 >PF01297 SBP_bac_9: Periplasmic solute binding protein family; InterPro: IPR006127 This is a family of periplasmic solute binding proteins such as TroA P96116 from SWISSPROT that interacts with an ATP-binding cassette transport system in Treponema pallidum and plays a role in the transport of zinc across the cytoplasmic membrane of the bacterium. ; GO: 0005488 binding, 0030288 outer membrane-bounded periplasmic space; PDB: 1toa_A 1k0f_A 3hh8_A 1psz_A 1xvl_B 3cx3_B 2o1e_A 2ov3_A 1pq4_A 2ov1_A .... Probab=40.44 E-value=2.6 Score=18.85 Aligned_cols=18 Identities=44% Similarity=0.582 Sum_probs=13.1 Q ss_pred HHHHHHHHHHHCCCCCCC Q ss_conf 899999888714578866 Q T0612 5 LVLACLLLGLSACAPHTG 22 (129) Q Consensus 5 l~~~~~~l~L~GCas~~~ 22 (129) +++++++++|+||++... 
T Consensus 1 ~~~l~~~~~l~~c~~~~~ 18 (303) T PF01297_consen 1 LLALLLLLLLSACSSAAA 18 (303) T ss_dssp ------------------ T ss_pred CHHHHHHHHHHHHCCCCC T ss_conf 989999999998457732 No 49 >PF12276 DUF3617: Protein of unknown function (DUF3617) Probab=38.50 E-value=6.6 Score=16.56 Aligned_cols=24 Identities=13% Similarity=0.236 Sum_probs=14.9 Q ss_pred EEEEECCCCCEEEEEEEEEECCCC Q ss_conf 999846687628999999976889 Q T0612 61 GTIISKSPKDQRLQYKFTWYDING 84 (129) Q Consensus 61 v~l~N~~~~~~~l~Yrf~WyD~~G 84 (129) ....+.....-.+.+++.+=+++| T Consensus 93 C~~~~~~~~g~~~~~~~~C~~~~~ 116 (162) T PF12276_consen 93 CTYTNVDRSGNTVKFDMSCTDPGG 116 (162) T ss_pred CCEEEEEEECCEEEEEEEECCCCC T ss_conf 977258986998999999579997 No 50 >PF07119 DUF1375: Protein of unknown function (DUF1375); InterPro: IPR010780 This family consists of several hypothetical, putative lipoproteins of around 80 residues in length. Members of this family seem to be specific to the class Gammaproteobacteria. The function of this family is unknown. Probab=38.20 E-value=3.8 Score=17.91 Aligned_cols=13 Identities=46% Similarity=0.603 Sum_probs=8.9 Q ss_pred HHHHHHHHCCCCC Q ss_conf 9998887145788 Q T0612 8 ACLLLGLSACAPH 20 (129) Q Consensus 8 ~~~~l~L~GCas~ 20 (129) .+++++|+||++- T Consensus 2 l~~~~~l~GCgTi 14 (76) T PF07119_consen 2 LALLLLLSGCGTI 14 (76) T ss_pred HHHHHHHCCCCCC T ss_conf 8899874667130 No 51 >PF10368 YkyA: Putative cell-wall binding lipoprotein; PDB: 2ap3_A. Probab=37.67 E-value=2.8 Score=18.71 Aligned_cols=12 Identities=25% Similarity=0.312 Sum_probs=9.0 Q ss_pred HHHHHHHHHCCC Q ss_conf 999988871457 Q T0612 7 LACLLLGLSACA 18 (129) Q Consensus 7 ~~~~~l~L~GCa 18 (129) +++++++|+||. 
T Consensus 1 ~~~s~~lLtGC~ 12 (204) T PF10368_consen 1 FILSALLLTGCF 12 (204) T ss_dssp -----------H T ss_pred CHHHHHHHHHCC T ss_conf 948999998438 No 52 >PF10828 DUF2570: Protein of unknown function (DUF2570) Probab=37.14 E-value=5.1 Score=17.19 Aligned_cols=18 Identities=22% Similarity=0.412 Sum_probs=15.7 Q ss_pred CHHHHHHHHHHHHHHCCC Q ss_conf 924889999988871457 Q T0612 1 MNKGLVLACLLLGLSACA 18 (129) Q Consensus 1 Mkk~l~~~~~~l~L~GCa 18 (129) |+||...++.+++++.|+ T Consensus 1 ~~~~~~~~l~~liv~l~~ 18 (110) T PF10828_consen 1 MTRYIYIALAFLIVGLCG 18 (110) T ss_pred CCHHHHHHHHHHHHHHHH T ss_conf 917999999999999999 No 53 >PF06316 Ail_Lom: Enterobacterial Ail/Lom protein; InterPro: IPR000758 Virulence-related outer membrane proteins are expressed in Gram-negative bacteria and are essential to bacterial survival within macrophages and for eukaryotic cell invasion. Members of this group include: PagC, required by Salmonella typhimurium for survival in macrophages and for virulence in mice Rck outer membrane protein of the S. typhimurium virulence plasmid Ail, a product of the Yersinia enterocolitica chromosome capable of mediating bacterial adherence to and invasion of epithelial cell lines OmpX from Escherichia coli that promotes adhesion to and entry into mammalian cells. It also has a role in the resistance against attack by the human complement system a Bacteriophage lambda outer membrane protein, Lom The crystal structure of OmpX from E. coli reveals that OmpX consists of an eight-stranded antiparallel all-next-neighbour beta barrel . The structure shows two girdles of aromatic amino acid residues and a ribbon of nonpolar residues that attach to the membrane interior. The core of the barrel consists of an extended hydrogen-bonding network of highly conserved residues. OmpX thus resembles an inverse micelle. The OmpX structure shows that the membrane-spanning part of the protein is much better conserved than the extracellular loops. 
Moreover, these loops form a protruding beta sheet, the edge of which presumably binds to external proteins. It is suggested that this type of binding promotes cell adhesion and invasion and helps defend against the complement system. Although OmpX has the same beta-sheet topology as the structurally related outer membrane protein A (OmpA) IPR000498 from INTERPRO, their barrels differ with respect to the shear numbers and internal hydrogen-bonding networks.; GO: 0009279 cell outer membrane; PDB: 1q9f_A 1qj9_A 1q9g_A 1orm_A 1qj8_A. Probab=36.15 E-value=9 Score=15.77 Aligned_cols=18 Identities=28% Similarity=0.519 Sum_probs=12.3 Q ss_pred CCEEEEEEEEEECCCCCE Q ss_conf 762899999997688968 Q T0612 69 KDQRLQYKFTWYDINGAT 86 (129) Q Consensus 69 ~~~~l~Yrf~WyD~~Gl~ 86 (129) +=++|.||..|=|.-|+- T Consensus 49 ~G~NlKYRYE~d~~lGvi 66 (199) T PF06316_consen 49 KGFNLKYRYEFDDPLGVI 66 (199) T ss_dssp ---EEEEEEECTT----- T ss_pred CCEEEEEEEECCCCCEEE T ss_conf 835998541438981748 No 54 >PF01298 Lipoprotein_5: Transferrin binding protein-like solute binding protein; InterPro: IPR001677 Bacterial transferrin binding proteins act as transferrin receptors and are required for transferrin utilisation. Transferrins are iron-binding glycoproteins that control the level of free iron in biological fluids. ; GO: 0004998 transferrin receptor activity, 0016020 membrane Probab=35.68 E-value=5.9 Score=16.82 Aligned_cols=14 Identities=36% Similarity=0.432 Sum_probs=9.2 Q ss_pred HHHHHHHHHHHCCC Q ss_conf 89999988871457 Q T0612 5 LVLACLLLGLSACA 18 (129) Q Consensus 5 l~~~~~~l~L~GCa 18 (129) ..++++++||++|+ T Consensus 9 ~~~~l~~~lLsACs 22 (593) T PF01298_consen 9 SAIALAAFLLSACS 22 (593) T ss_pred HHHHHHHHHHHHHC T ss_conf 58999999998734 No 55 >PF06649 DUF1161: Protein of unknown function (DUF1161); InterPro: IPR010595 This family consists of several short, hypothetical bacterial proteins of unknown function. 
Probab=34.13 E-value=11 Score=15.38 Aligned_cols=13 Identities=46% Similarity=0.409 Sum_probs=8.8 Q ss_pred CHHHHHHHHHHHH Q ss_conf 9248899999888 Q T0612 1 MNKGLVLACLLLG 13 (129) Q Consensus 1 Mkk~l~~~~~~l~ 13 (129) ||||++.+.++++ T Consensus 1 Mkk~~l~~~l~~l 13 (75) T PF06649_consen 1 MKKFLLAVALLLL 13 (75) T ss_pred CCHHHHHHHHHHH T ss_conf 9356999999998 No 56 >PF09580 Spore_YhcN_YlaJ: Sporulation lipoprotein YhcN/YlaJ (Spore_YhcN_YlaJ) Probab=32.16 E-value=2.6 Score=18.85 Aligned_cols=12 Identities=33% Similarity=0.553 Sum_probs=8.7 Q ss_pred HHHHHHCCCCCC Q ss_conf 988871457886 Q T0612 10 LLLGLSACAPHT 21 (129) Q Consensus 10 ~~l~L~GCas~~ 21 (129) ++++|+||+... T Consensus 2 ~~~~LaGC~~~~ 13 (174) T PF09580_consen 2 LLSLLAGCGNNN 13 (174) T ss_pred CEEEEECCCCCC T ss_conf 304660548898 No 57 >PF10023 DUF2265: Predicted aminopeptidase (DUF2265) Probab=30.61 E-value=6 Score=16.78 Aligned_cols=14 Identities=36% Similarity=0.574 Sum_probs=10.6 Q ss_pred HHHHHHHHHHCCCC Q ss_conf 99999888714578 Q T0612 6 VLACLLLGLSACAP 19 (129) Q Consensus 6 ~~~~~~l~L~GCas 19 (129) ++++++++|+||++ T Consensus 1 ~~~~~~l~l~GC~~ 14 (337) T PF10023_consen 1 LLLLLALLLAGCSS 14 (337) T ss_pred CHHHHHHHHCCCCH T ss_conf 93899999635515 No 58 >PF03748 FliL: Flagellar basal body-associated protein FliL; InterPro: IPR005503 This FliL protein controls the rotational direction of the flagella during chemotaxis . 
FliL is a cytoplasmic membrane protein associated with the basal body .; GO: 0001539 ciliary or flagellar motility, 0006935 chemotaxis, 0009425 flagellin-based flagellum basal body Probab=30.57 E-value=7.4 Score=16.26 Aligned_cols=18 Identities=28% Similarity=0.209 Sum_probs=12.0 Q ss_pred CHHHHHHHHHHHHHHCCC Q ss_conf 924889999988871457 Q T0612 1 MNKGLVLACLLLGLSACA 18 (129) Q Consensus 1 Mkk~l~~~~~~l~L~GCa 18 (129) ||||++.+++++++.+|+ T Consensus 1 kK~li~~i~~~ll~~~~~ 18 (149) T PF03748_consen 1 KKKLIIIIVALLLLIVGA 18 (149) T ss_pred CCHHHHHHHHHHHHHHHH T ss_conf 943799999999999999 No 59 >PF05540 Serpulina_VSP: Serpulina hyodysenteriae variable surface protein; InterPro: IPR008838 This family consists of several variable surface proteins from Brachyspira hyodysenteriae. Probab=30.36 E-value=6.9 Score=16.43 Aligned_cols=18 Identities=33% Similarity=0.394 Sum_probs=13.9 Q ss_pred CHHHHHHHHHHHHHHCCC Q ss_conf 924889999988871457 Q T0612 1 MNKGLVLACLLLGLSACA 18 (129) Q Consensus 1 Mkk~l~~~~~~l~L~GCa 18 (129) |||.|+.+.+++.++-|+ T Consensus 1 MKK~lL~~~allti~~~S 18 (377) T PF05540_consen 1 MKKFLLTAIALLTIASAS 18 (377) T ss_pred CCEEHHHHHHHHHHHHHH T ss_conf 925268899999998766 No 60 >PF07148 MalM: Maltose operon periplasmic protein precursor (MalM); InterPro: IPR010794 This family consists of several maltose operon periplasmic protein precursor (MalM) sequences. The function of this family is unknown .; GO: 0008643 carbohydrate transport, 0042597 periplasmic space Probab=28.98 E-value=10 Score=15.40 Aligned_cols=32 Identities=16% Similarity=-0.062 Sum_probs=17.0 Q ss_pred EEEEEEEECCCCCE-EEEEEEEEECCCCCEECC Q ss_conf 99999984668762-899999997688968388 Q T0612 58 RARGTIISKSPKDQ-RLQYKFTWYDINGATVED 89 (129) Q Consensus 58 ~~~v~l~N~~~~~~-~l~Yrf~WyD~~Gl~v~~ 89 (129) +..+.|++.-.+.. -+.=..--+|+++=++.. 
T Consensus 81 ~~~i~LsS~v~~~~~VfaP~VlvLD~~f~~~~~ 113 (281) T PF07148_consen 81 SLSITLSSLVIDDSQVFAPNVLVLDEQFQPVAT 113 (281) T ss_pred CEEEEEEEEECCCCEEEEEEEEEECCCCCEEEE T ss_conf 279999976518841775208997277774454 No 61 >PF07705 CARDB: CARDB; InterPro: IPR011635 The APHP (acidic peptide-dependent hydrolases/peptidase) domain is found in a variety of different proteins. Probab=28.53 E-value=14 Score=14.68 Aligned_cols=71 Identities=13% Similarity=0.032 Sum_probs=46.4 Q ss_pred EEEEECCCEEEEEEEEEECCCCCEEEEEEEEEECCCCCEECCCCCCCEEE-EECCCCEEEEEEECCCCCEEEEEEEEEE Q ss_conf 10220487489999998466876289999999768896838888760389-9779950899986379840589999997 Q T0612 48 VTTQAEAGFLRARGTIISKSPKDQRLQYKFTWYDINGATVEDEGVSWKSL-KLHGKQQMQVTALSPNATAVRCELYVRE 125 (129) Q Consensus 48 ~~~~~~~g~~~~~v~l~N~~~~~~~l~Yrf~WyD~~Gl~v~~~~~~W~~l-~l~~~~~~~i~~vap~~~a~~~RlylRe 125 (129) +.....+.-.+..+.|+|....+. =..+..+|....+. .+..+ .|.++++.++...-+-+.+-+|.|.+.- T Consensus 12 ~~~~~~g~~~~i~~~V~N~G~~~a-~~~~v~~~~~~~~~------~~~~i~~L~~g~~~~v~~~~~~~~~G~~~l~~~i 83 (101) T PF07705_consen 12 PSSPTPGESVTITVTVKNQGTADA-ENVTVSFYLDGDLV------STVTIPSLAPGESATVTFTWTPPTSGNYTLTAVI 83 (101) T ss_pred CCCCCCCCEEEEEEEEEECCCCCC-CCEEEEEEECCCCC------CCEEECCCCCCCEEEEEEEEEECCCCEEEEEEEE T ss_conf 885568988999999997787766-65899999899820------6779444889968999999871789819999999 No 62 >PF09676 TraV: Type IV conjugative transfer system lipoprotein (TraV) Probab=27.74 E-value=5.6 Score=16.96 Aligned_cols=11 Identities=55% Similarity=0.685 Sum_probs=8.1 Q ss_pred HHHHHHHHCCC Q ss_conf 99988871457 Q T0612 8 ACLLLGLSACA 18 (129) Q Consensus 8 ~~~~l~L~GCa 18 (129) ++++++|+||+ T Consensus 1 ~~~~l~LsGCs 11 (135) T PF09676_consen 1 ALALLLLSGCS 11 (135) T ss_pred CHHHHHHCCCC T ss_conf 90353210255 No 63 >PF09926 DUF2158: Uncharacterized small protein (DUF2158) Probab=27.60 E-value=15 Score=14.58 Aligned_cols=20 Identities=15% Similarity=0.275 Sum_probs=15.2 Q ss_pred CCCEEEEEEEEEECCCCCEE Q ss_conf 87628999999976889683 Q T0612 68 PKDQRLQYKFTWYDINGATV 87 
(129) Q Consensus 68 ~~~~~l~Yrf~WyD~~Gl~v 87 (129) ...-.-.|++.|||..|... T Consensus 25 ~~~~~~~v~C~Wf~~~~~~~ 44 (53) T PF09926_consen 25 AGASSGWVECQWFDGEGERQ 44 (53) T ss_pred CCCCCCEEEEEEECCCCCCC T ss_conf 67788659999816998322 No 64 >PF11810 DUF3332: Domain of unknown function (DUF3332) Probab=27.13 E-value=12 Score=15.08 Aligned_cols=17 Identities=18% Similarity=0.333 Sum_probs=11.5 Q ss_pred HHHHHHHHHHHHHCC-CC Q ss_conf 488999998887145-78 Q T0612 3 KGLVLACLLLGLSAC-AP 19 (129) Q Consensus 3 k~l~~~~~~l~L~GC-as 19 (129) +.+.+++.++.|+|| ++ T Consensus 6 ~~~~~~~~~~~lsgC~Gs 23 (176) T PF11810_consen 6 AAVAILLGSVSLSGCIGS 23 (176) T ss_pred HHHHHHHHHHHHCCCCCC T ss_conf 999999999985234242 No 65 >PF06486 DUF1093: Protein of unknown function (DUF1093); InterPro: IPR006542 These are a family of small (about 115 amino acids) uncharacterised proteins with N-terminal signal sequences, found exclusively in Gram-positive organisms. Most genomes that have any members of this family have at least two members.; PDB: 2k5w_A 2k5q_A. Probab=24.25 E-value=13 Score=14.85 Aligned_cols=17 Identities=29% Similarity=0.595 Sum_probs=15.1 Q ss_pred EEEEEEEEEECCCCCEE Q ss_conf 28999999976889683 Q T0612 71 QRLQYKFTWYDINGATV 87 (129) Q Consensus 71 ~~l~Yrf~WyD~~Gl~v 87 (129) ..-.|++-+||++|=+. T Consensus 26 ~~y~Y~l~~yde~G~~k 42 (78) T PF06486_consen 26 KRYEYTLKGYDEDGKEK 42 (78) T ss_dssp EEEEEEEEEEE----EE T ss_pred CEEEEEEEEECCCCCEE T ss_conf 43999889998899999 No 66 >PF07424 TrbM: TrbM; InterPro: IPR009989 This family contains the bacterial protein TrbM (approximately 180 residues long). In Comamonas testosteroni T-2, TrbM is derived from the IncP1beta plasmid pTSA, which encodes the widespread genes for p-toluenesulphonate (TSA) degradation . 
Probab=23.13 E-value=18 Score=14.09 Aligned_cols=14 Identities=29% Similarity=0.394 Sum_probs=8.7 Q ss_pred CHHHHHHHHHHHHH Q ss_conf 92488999998887 Q T0612 1 MNKGLVLACLLLGL 14 (129) Q Consensus 1 Mkk~l~~~~~~l~L 14 (129) |||.++++++++++ T Consensus 1 MKK~~la~~l~~~~ 14 (186) T PF07424_consen 1 MKKKLLAVALAFAA 14 (186) T ss_pred CCHHHHHHHHHHHH T ss_conf 93489999999998 No 67 >PF05753 TRAP_beta: Translocon-associated protein beta (TRAPB); InterPro: IPR008856 This family consists of several eukaryotic translocon-associated protein beta (TRAPB) or signal sequence receptor beta subunit (SSR-beta) proteins. The normal translocation of nascent polypeptides into the lumen of the endoplasmic reticulum (ER) is thought to be aided in part by a translocon-associated protein (TRAP) complex consisting of 4 protein subunits. The association of mature proteins with the ER and Golgi, or other intracellular locales, such as lysosomes, depends on the initial targeting of the nascent polypeptide to the ER membrane. 
A similar scenario must also exist for proteins destined for secretion.; GO: 0005783 endoplasmic reticulum, 0016021 integral to membrane Probab=23.08 E-value=18 Score=14.08 Aligned_cols=55 Identities=13% Similarity=0.054 Sum_probs=37.8 Q ss_pred CCCEEEEEEEEEECCCCCEEEEEEEEEEC----CCCCEECCCCCCCEEEEECCCCEEEEEEE Q ss_conf 48748999999846687628999999976----88968388887603899779950899986 Q T0612 53 EAGFLRARGTIISKSPKDQRLQYKFTWYD----INGATVEDEGVSWKSLKLHGKQQMQVTAL 110 (129) Q Consensus 53 ~~g~~~~~v~l~N~~~~~~~l~Yrf~WyD----~~Gl~v~~~~~~W~~l~l~~~~~~~i~~v 110 (129) ++.-+.+...|.|..+.+ .|...-.| .+.|++-.....|+=-.|++++.++..-+ T Consensus 36 ~g~~v~V~~~iyN~G~s~---A~dV~i~D~~~p~~~F~lvsG~~s~~~~~l~pgs~vsh~~v 94 (181) T PF05753_consen 36 EGEDVTVSYTIYNVGSSP---AYDVSITDDSFPPDDFELVSGSLSASWERLPPGSNVSHSYV 94 (181) T ss_pred CCCEEEEEEEEEECCCCC---EEEEEEECCCCCCCCEEEECCCEEEEEEEECCCCEEEEEEE T ss_conf 785799999999779871---68889978999944309974842568998589973789999 No 68 >PF06280 DUF1034: Fn3-like domain (DUF1034); InterPro: IPR010435 Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base.
The geometric orientations of the catalytic residues are similar between families, despite different protein folds. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC). Peptidases are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry. Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. This domain of unknown function is present in bacterial and plant peptidases belonging to MEROPS peptidase family S8 (subfamily S8A subtilisin, clan SB).
It is C-terminal to and adjacent to the S8 peptidase domain and can be found in conjunction with the PA (Protease associated) domain (IPR003137 from INTERPRO) and additionally in Gram-positive bacteria with the surface protein anchor domain (IPR001899 from INTERPRO).; GO: 0004252 serine-type endopeptidase activity, 0005618 cell wall, 0016020 membrane; PDB: 1xf1_A 3eif_A. Probab=22.93 E-value=18 Score=14.06 Aligned_cols=62 Identities=13% Similarity=0.073 Sum_probs=40.7 Q ss_pred CCEEEEEEEEEECCCCCEEEEEEEE-EE-----CCCCCEEC--------CCCCCCEEEEECCCCEEEEEEECCCCC Q ss_conf 8748999999846687628999999-97-----68896838--------888760389977995089998637984 Q T0612 54 AGFLRARGTIISKSPKDQRLQYKFT-WY-----DINGATVE--------DEGVSWKSLKLHGKQQMQVTALSPNAT 115 (129) Q Consensus 54 ~g~~~~~v~l~N~~~~~~~l~Yrf~-Wy-----D~~Gl~v~--------~~~~~W~~l~l~~~~~~~i~~vap~~~ 115 (129) ++.....+.|+|..+++++..+.-. .+ ..+|.... .....|..++|+++++.+|...-.-|. T Consensus 7 ~~~~~~tvtl~N~g~~~~tY~~~~~~~~T~~~~~~~g~~~~~~~~~~~~~~~~~~~~vTV~ag~s~~v~vt~~~p~ 82 (112) T PF06280_consen 7 GNFFTFTVTLHNTGNKDKTYTLSHVGVLTDQTDKNDGYFTLPPIAPGAASVTFSPNTVTVPAGGSKTVTVTFTPPS 82 (112) T ss_dssp -SEEEEEEEEEE-SSS-EEEEEEEE-EEEEEE-----BEEEEEEE----EEE---EEEEE-TTEEEEEEEEEE--G T ss_pred CCCEEEEEEEEECCCCCEEEEEEEEEEEEEEEECCCCCCCCCCCCCEEEEEEECCCEEEECCCCEEEEEEEEEECC T ss_conf 7848999999958999889999406887789722577113565542025666379849999999899999997631 No 69 >PF06387 Calcyon: D1 dopamine receptor-interacting protein (calcyon); InterPro: IPR009431 This family consists of several D1 dopamine receptor-interacting (calcyon) proteins. D1/D5 dopamine receptors in the basal ganglia, hippocampus, and cerebral cortex modulate motor, reward, and cognitive behaviour. D1-like dopamine receptors likely modulate neocortical and hippocampal neuronal excitability and synaptic function via Ca^2+ as well as cAMP-dependent signaling . 
Defective calcyon proteins have been implicated in both attention-deficit/hyperactivity disorder (ADHD) and schizophrenia.; GO: 0050780 dopamine receptor binding, 0007212 dopamine receptor signaling pathway, 0016021 integral to membrane Probab=22.76 E-value=8.6 Score=15.90 Aligned_cols=16 Identities=38% Similarity=0.791 Sum_probs=12.6 Q ss_pred EEEEEEEEECC---CCCEE Q ss_conf 89999999768---89683 Q T0612 72 RLQYKFTWYDI---NGATV 87 (129) Q Consensus 72 ~l~Yrf~WyD~---~Gl~v 87 (129) -+-|+-||||. +||.. T Consensus 101 LVvYKa~~YDq~CPdGFv~ 119 (186) T PF06387_consen 101 LVVYKAYWYDQTCPDGFVL 119 (186) T ss_pred HHHHHEEECCCCCCCCEEE T ss_conf 9863111305779974156 No 70 >PF05272 VirE: Virulence-associated protein E; InterPro: IPR007936 This family contains several bacterial virulence-associated protein E like proteins. Probab=22.58 E-value=16 Score=14.34 Aligned_cols=29 Identities=21% Similarity=0.091 Sum_probs=20.2 Q ss_pred CHHHHHHHHHHHHHHCCCCCCCCCCCCCC Q ss_conf 92488999998887145788667302687 Q T0612 1 MNKGLVLACLLLGLSACAPHTGGIMISST 29 (129) Q Consensus 1 Mkk~l~~~~~~l~L~GCas~~~~~~~~~~ 29 (129) |||||+.+.+-.+=-||..++.-+....| T Consensus 33 ~~~wlig~Var~~~pg~k~d~vliL~G~Q 61 (198) T PF05272_consen 33 FKKWLIGAVARAFNPGCKFDTVLILVGKQ 61 (198) T ss_pred HHHHHHHHHHHHHCCCCCCCEEEEEECCC T ss_conf 99999999999978997788688988899 No 71 >PF07219 HemY_N: HemY protein N-terminus; InterPro: IPR010817 This entry represents the N terminus (approximately 150 residues) of bacterial HemY porphyrin biosynthesis proteins. These are membrane proteins involved in a late step of protoheme IX synthesis. Probab=22.48 E-value=14 Score=14.74 Aligned_cols=18 Identities=28% Similarity=0.431 Sum_probs=14.2 Q ss_pred CHHHHHHHHHHHHHHCCC Q ss_conf 924889999988871457 Q T0612 1 MNKGLVLACLLLGLSACA 18 (129) Q Consensus 1 Mkk~l~~~~~~l~L~GCa 18 (129) |+|.+++++.+++++.|.
T Consensus 1 M~R~l~~~li~l~la~~~ 18 (134) T PF07219_consen 1 MIRILIFLLIVLALAAVG 18 (134) T ss_pred CHHHHHHHHHHHHHHHHH T ss_conf 989999999999999999 No 72 >PF05968 Bacillus_PapR: Bacillus PapR protein; InterPro: IPR009239 This family consists of the Bacillus species-specific PapR protein. The papR gene belongs to the PlcR regulon and is located 70 bp downstream from plcR. It encodes a 48-amino-acid peptide. Disruption of the papR gene abolishes expression of the PlcR regulon, resulting in a large decrease in haemolysis and virulence in insect larvae. A processed form of PapR activates the PlcR regulon by allowing PlcR to bind to its DNA target. This activating mechanism is strain-specific. Probab=22.34 E-value=17 Score=14.19 Aligned_cols=18 Identities=28% Similarity=0.176 Sum_probs=12.9 Q ss_pred CHHHHHHHHHHHH-HHCCC Q ss_conf 9248899999888-71457 Q T0612 1 MNKGLVLACLLLG-LSACA 18 (129) Q Consensus 1 Mkk~l~~~~~~l~-L~GCa 18 (129) |||.|+..++++. +.|-+ T Consensus 1 mkk~l~~sll~lam~~gis 19 (48) T PF05968_consen 1 MKKLLIGSLLTLAMAWGIS 19 (48) T ss_pred CCHHHHHHHHHHHHHHHHH T ss_conf 9047885899999996000 No 73 >PF06551 DUF1120: Protein of unknown function (DUF1120); InterPro: IPR010546 This family consists of several bacterial proteins, at least one of which is involved in enzyme induction following nitrogen deprivation.
The exact function of this family is unknown Probab=21.18 E-value=18 Score=14.10 Aligned_cols=18 Identities=39% Similarity=0.360 Sum_probs=9.6 Q ss_pred CHHHHHHHHH--HHHHHCCC Q ss_conf 9248899999--88871457 Q T0612 1 MNKGLVLACL--LLGLSACA 18 (129) Q Consensus 1 Mkk~l~~~~~--~l~L~GCa 18 (129) |||.|+..++ .+++.+++ T Consensus 1 MKK~l~~~~l~a~l~~~~~s 20 (145) T PF06551_consen 1 MKKNLLATLLLASLLLLASS 20 (145) T ss_pred CCHHHHHHHHHHHHHHHHCH T ss_conf 93679999999999986021 No 74 >PF04744 Monooxygenase_B: Monellin Monooxygenase subunit B protein; InterPro: IPR006833 Ammonia monooxygenase and the particulate methane monooxygenase are both integral membrane proteins, occurring in ammonia oxidisers and methanotrophs respectively, which are thought to be evolutionarily related . These enzymes have a relatively wide substrate specificity and can catalyse the oxidation of a range of substrates including ammonia, methane, halogenated hydrocarbons and aromatic molecules . These enzymes are composed of 3 subunits - A (IPR003393 from INTERPRO), B (IPR006833 from INTERPRO) and C (IPR006980 from INTERPRO) - and contain various metal centres, including copper. Particulate methane monooxygenase from Methylococcus capsulatus (Bath) is an ABC homotrimer, which contains mononuclear and dinuclear copper metal centres, and a third metal centre containing a metal ion whose identity in vivo is not certain. The soluble regions of particulate methane monooxygenase from Methylococcus capsulatus (Bath) derive primarily from the B subunit. This subunit forms two antiparallel beta sheets and contains the mono- and di- nuclear copper metal centres .; PDB: 4mon_D 1fa3_A 3mon_F 1krl_D 2o9u_X 1iv7_B 1mol_B 1iv9_A 1fuw_A 1m9g_A .... 
Probab=21.14 E-value=19 Score=13.85 Aligned_cols=65 Identities=15% Similarity=0.203 Sum_probs=39.4 Q ss_pred EEEEEEEEEEECCCEEEEEEEEEECCCCCEEEEE----EEEEECCC----------------CCEECCCCCCCEEEEECC Q ss_conf 0477421022048748999999846687628999----99997688----------------968388887603899779 Q T0612 42 DVDVSAVTTQAEAGFLRARGTIISKSPKDQRLQY----KFTWYDIN----------------GATVEDEGVSWKSLKLHG 101 (129) Q Consensus 42 ~i~v~~~~~~~~~g~~~~~v~l~N~~~~~~~l~Y----rf~WyD~~----------------Gl~v~~~~~~W~~l~l~~ 101 (129) .+.+.+-.=+..+--++....++|+.+.++.|-= -.-+.+.+ ||++++.+ .|++ T Consensus 250 ~~kv~~atY~VPGR~l~~~~~VTN~g~~pv~lgEF~tA~vRFln~~v~~~~~~yp~~lla~~GL~v~~~~------pI~P 323 (381) T PF04744_consen 250 KAKVTDATYRVPGRALRMTLKVTNNGDEPVRLGEFNTANVRFLNPDVPTDDPNYPDELLAERGLSVSDNS------PIAP 323 (381) T ss_dssp EEEEEEEEEE----EEEEEEEEEE-----EE---EE-SS-EE--TTT--------GCCEE----EES--S-------B-- T ss_pred EEEEECCEEECCCCEEEEEEEEECCCCCCEEEEEEEECCEEEECCCCCCCCCCCCHHHCCCCCCCCCCCC------CCCC T ss_conf 9998255763488179999999748986468876650436775786666888994455156771518988------7699 Q ss_pred CCEEEEEEECC Q ss_conf 95089998637 Q T0612 102 KQQMQVTALSP 112 (129) Q Consensus 102 ~~~~~i~~vap 112 (129) +|+.++.-.+- T Consensus 324 GETk~v~v~aq 334 (381) T PF04744_consen 324 GETKTVEVEAQ 334 (381) T ss_dssp --EEEEEEEEE T ss_pred CCCEEEEEEEE T ss_conf 96258999961 No 75 >PF01847 VHL: von Hippel-Lindau disease tumour suppressor protein; InterPro: IPR002714 This family of proteins is involved in the ubiquitylation and subsequent proteasomal degradation of proteins via the von Hippel-Lindau ubiquitylation complex. They appear to act as the target recruitment subunit in the E3 ubiquitin ligase complex and recruit hydroxylated hypoxia-inducible factor (HIF) under normoxic conditions. They are also involved in transcriptional repression through interaction with HIF1A, HIF1AN and histone deacetylases. Human VHL has been demonstrated to form a ternary complex with elonginB O44226 from SWISSPROT and elonginC O13292 from SWISSPROT proteins . 
This complex binds Cul2, which then is involved in regulation of vascular endothelial growth factor P15692 from SWISSPROT mRNA.; GO: 0016567 protein ubiquitination, 0005634 nucleus; PDB: 1lm8_V 1lqb_C 1vcb_I. Probab=20.81 E-value=20 Score=13.81 Aligned_cols=34 Identities=9% Similarity=0.198 Sum_probs=25.4 Q ss_pred EEECCCEEEEEEEEEECCCCCEEEEEEEEEECCCCCEE Q ss_conf 22048748999999846687628999999976889683 Q T0612 50 TQAEAGFLRARGTIISKSPKDQRLQYKFTWYDINGATV 87 (129) Q Consensus 50 ~~~~~g~~~~~v~l~N~~~~~~~l~Yrf~WyD~~Gl~v 87 (129) .+..+......+...|++..++ +.||.|-+|=++ T Consensus 6 lRS~~S~~~s~V~FvN~s~r~V----d~~Wlny~G~~~ 39 (156) T PF01847_consen 6 LRSVNSREPSYVVFVNRSNRTV----DVYWLNYDGKEQ 39 (156) T ss_dssp S-------EEEEEEEE-SSS-E----EEEEE-----EE T ss_pred CCCCCCCCCEEEEEEECCCCEE----EEEEECCCCCEE T ss_conf 0214788836999995899848----899986799886 No 76 >PF11777 DUF3316: Protein of unknown function (DUF3316) Probab=20.57 E-value=19 Score=13.88 Aligned_cols=12 Identities=58% Similarity=0.750 Sum_probs=8.2 Q ss_pred CHHHHHHHHHHH Q ss_conf 924889999988 Q T0612 1 MNKGLVLACLLL 12 (129) Q Consensus 1 Mkk~l~~~~~~l 12 (129) |||.++++++++ T Consensus 1 MKkl~ll~~~l~ 12 (114) T PF11777_consen 1 MKKLILLASLLL 12 (114) T ss_pred CHHHHHHHHHHH T ss_conf 904999999999 No 77 >PF05404 TRAP-delta: Translocon-associated protein, delta subunit precursor (TRAP-delta); InterPro: IPR008855 This family consists of several eukaryotic translocon-associated protein, delta subunit precursors (TRAP-delta or SSR-delta). The exact function of this protein is unknown .; GO: 0005783 endoplasmic reticulum, 0016021 integral to membrane Probab=20.36 E-value=20 Score=13.75 Aligned_cols=31 Identities=13% Similarity=0.175 Sum_probs=18.5 Q ss_pred CEEEEEEEEEECCCCCEEEEEEEEEECCCCCEE Q ss_conf 748999999846687628999999976889683 Q T0612 55 GFLRARGTIISKSPKDQRLQYKFTWYDINGATV 87 (129) Q Consensus 55 g~~~~~v~l~N~~~~~~~l~Yrf~WyD~~Gl~v 87 (129) +.-|++=....+.... =.|....||+.|+.. 
T Consensus 78 ~kyQVSW~~e~K~a~s--G~y~V~~fDEegy~a 108 (167) T PF05404_consen 78 NKYQVSWSEEHKKASS--GTYQVRFFDEEGYAA 108 (167) T ss_pred CEEEEEEEEECCCCCC--CCEEEEEECHHHHHH T ss_conf 7079999830041668--857999967288999 No 78 >PF01552 Pico_P2B: Picornavirus 2B protein; InterPro: IPR002527 Poliovirus infection leads to drastic alterations in membrane permeability late during infection. Proteins 2B and 2BC enhance membrane permeability , .; GO: 0000166 nucleotide binding, 0003968 RNA-directed RNA polymerase activity, 0005198 structural molecule activity, 0008233 peptidase activity, 0008234 cysteine-type peptidase activity, 0016740 transferase activity, 0016779 nucleotidyltransferase activity, 0016787 hydrolase activity, 0006410 transcription, RNA-dependent, 0018144 RNA-protein covalent cross-linking, 0019012 virion Probab=20.18 E-value=20 Score=13.74 Aligned_cols=17 Identities=29% Similarity=0.366 Sum_probs=11.0 Q ss_pred HHHHHHHHHHHCCCCCC Q ss_conf 89999988871457886 Q T0612 5 LVLACLLLGLSACAPHT 21 (129) Q Consensus 5 l~~~~~~l~L~GCas~~ 21 (129) +...++++.|-||..+| T Consensus 62 ~~Tv~ATlaLLGCd~SP 78 (99) T PF01552_consen 62 LVTVLATLALLGCDGSP 78 (99) T ss_pred HHHHHHHHHHHCCCCCH T ss_conf 59999999997848988 No 79 >PF06518 DUF1104: Protein of unknown function (DUF1104); InterPro: IPR009488 This family consists of several hypothetical proteins of unknown function which appear to be found exclusively in Helicobacter pylori. Probab=20.05 E-value=19 Score=13.84 Aligned_cols=16 Identities=25% Similarity=0.318 Sum_probs=10.9 Q ss_pred CHHHHHHHHHHHHHHC Q ss_conf 9248899999888714 Q T0612 1 MNKGLVLACLLLGLSA 16 (129) Q Consensus 1 Mkk~l~~~~~~l~L~G 16 (129) |||.+.++++.+|+++ T Consensus 1 mKk~~~i~l~~~L~~~ 16 (142) T PF06518_consen 1 MKKAVSILLVSLLLAS 16 (142) T ss_pred CCHHHHHHHHHHHHHH T ss_conf 9147999999999999 Done!