Query 029811
Match_columns 187
No_of_seqs 151 out of 494
Neff 6.5
Searched_HMMs 46136
Date Fri Mar 29 04:02:05 2013
Command hhsearch -i /work/01045/syshi/csienesis_hhblits_a3m/029811.a3m -d /work/01045/syshi/HHdatabase/Cdd.hhm -o /work/01045/syshi/hhsearch_cdd/029811hhsearch_cdd -cpu 12 -v 0
No Hit Prob E-value P-value Score SS Cols Query HMM Template HMM
1 PF04525 Tub_2: Tubby C 2; In 100.0 3.9E-31 8.4E-36 214.4 13.9 147 25-174 2-155 (187)
2 COG4894 Uncharacterized conser 99.8 3E-19 6.5E-24 138.8 4.3 126 36-180 5-131 (159)
3 PF03803 Scramblase: Scramblas 98.7 8E-07 1.7E-11 73.4 15.1 124 38-170 23-173 (221)
4 COG4894 Uncharacterized conser 98.5 2E-07 4.4E-12 73.0 6.5 69 33-103 26-94 (159)
5 PF04525 Tub_2: Tubby C 2; In 98.4 4.2E-06 9.2E-11 67.7 11.1 96 34-133 36-142 (187)
6 PF03803 Scramblase: Scramblas 96.9 0.022 4.8E-07 46.9 12.2 65 53-121 78-148 (221)
7 KOG0621 Phospholipid scramblas 93.7 1.1 2.3E-05 39.3 11.0 111 51-174 99-236 (292)
8 PF01690 PLRV_ORF5: Potato lea 85.7 1.6 3.4E-05 40.5 5.2 61 8-72 10-72 (465)
9 KOG0621 Phospholipid scramblas 79.6 25 0.00055 30.8 10.1 65 53-121 134-206 (292)
10 PF09008 Head_binding: Head bi 75.6 8.9 0.00019 28.9 5.3 42 44-91 63-104 (114)
11 KOG3950 Gamma/delta sarcoglyca 69.2 5.3 0.00011 34.4 3.2 51 39-89 127-177 (292)
12 PF13860 FlgD_ig: FlgD Ig-like 63.4 16 0.00034 25.3 4.2 17 52-68 28-44 (81)
13 PF09000 Cytotoxic: Cytotoxic; 62.5 17 0.00038 26.1 4.3 50 40-93 18-69 (85)
14 PF04790 Sarcoglycan_1: Sarcog 58.8 19 0.00041 31.1 4.8 48 38-85 104-154 (264)
15 PRK12816 flgG flagellar basal 52.9 20 0.00042 30.7 3.9 42 47-89 96-138 (264)
16 PF15119 APOC4: Apolipoprotein 50.3 11 0.00025 27.3 1.7 19 3-21 2-20 (99)
17 PF15529 Toxin_49: Putative to 49.7 47 0.001 24.1 4.8 19 51-69 30-48 (89)
18 TIGR02488 flgG_G_neg flagellar 47.0 25 0.00055 29.8 3.7 42 47-89 94-136 (259)
19 PRK15393 NUDIX hydrolase YfcD; 45.8 65 0.0014 25.6 5.7 55 52-107 11-73 (180)
20 PRK12694 flgG flagellar basal 44.5 28 0.00062 29.6 3.6 42 47-89 96-138 (260)
21 PF02974 Inh: Protease inhibit 44.1 1.2E+02 0.0025 21.9 6.4 31 74-107 61-91 (99)
22 PRK12693 flgG flagellar basal 41.8 39 0.00085 28.6 4.0 42 47-89 96-138 (261)
23 cd05828 Sortase_D_4 Sortase D 40.7 48 0.001 24.8 4.0 40 50-89 65-104 (127)
24 smart00634 BID_1 Bacterial Ig- 40.5 79 0.0017 22.1 4.9 11 53-63 24-34 (92)
25 TIGR03784 marine_sortase sorta 39.4 57 0.0012 26.2 4.4 22 72-93 110-132 (174)
26 cd06166 Sortase_D_5 Sortase D 39.1 58 0.0013 24.3 4.2 21 50-70 68-88 (126)
27 PF12396 DUF3659: Protein of u 37.4 87 0.0019 21.2 4.4 38 54-91 14-57 (64)
28 PF12312 NeA_P2: Nepovirus sub 36.8 22 0.00047 29.8 1.6 27 8-34 104-133 (258)
29 PRK06655 flgD flagellar basal 33.9 60 0.0013 27.2 3.9 18 74-91 127-144 (225)
30 PF12690 BsuPI: Intracellular 33.6 33 0.00072 24.1 1.9 17 52-68 27-43 (82)
31 PF03574 Peptidase_S48: Peptid 33.5 7.6 0.00017 30.1 -1.4 23 165-187 127-149 (149)
32 PRK12691 flgG flagellar basal 33.1 61 0.0013 27.5 3.9 41 48-89 97-138 (262)
33 cd03676 Nudix_hydrolase_3 Memb 32.8 1.9E+02 0.0041 22.6 6.5 60 48-107 2-72 (180)
34 KOG2675 Adenylate cyclase-asso 31.4 30 0.00065 32.1 1.8 9 60-68 292-300 (480)
35 PRK12640 flgF flagellar basal 29.7 50 0.0011 28.0 2.8 37 52-89 87-123 (246)
36 PRK12817 flgG flagellar basal 29.5 61 0.0013 27.6 3.3 37 52-89 98-134 (260)
37 PRK12813 flgD flagellar basal 28.2 64 0.0014 27.2 3.1 15 76-90 127-141 (223)
38 PRK12819 flgG flagellar basal 28.0 75 0.0016 27.0 3.6 38 51-89 99-136 (257)
39 PRK10523 lipoprotein involved 27.3 99 0.0021 26.3 4.1 36 46-87 91-127 (234)
40 KOG4375 Scaffold protein Shank 26.6 50 0.0011 28.6 2.2 25 1-25 52-76 (272)
41 PRK15244 virulence protein Spv 26.5 40 0.00087 32.4 1.7 19 3-21 359-377 (591)
42 TIGR02150 IPP_isom_1 isopenten 26.4 1.7E+02 0.0036 22.5 5.0 53 53-105 1-61 (158)
43 COG5436 Predicted integral mem 26.1 1.8E+02 0.004 23.5 5.1 16 77-92 93-108 (182)
44 PRK12818 flgG flagellar basal 25.8 76 0.0016 27.0 3.2 37 52-89 102-138 (256)
45 PF09475 Dot_icm_IcmQ: Dot/Icm 25.1 24 0.00052 28.7 0.0 63 28-99 95-158 (179)
46 COG3111 Periplasmic protein wi 25.0 1.5E+02 0.0032 23.0 4.3 51 36-95 43-93 (128)
47 COG2849 Uncharacterized protei 24.9 2.3E+02 0.0049 23.6 5.9 56 50-105 158-213 (230)
48 PF15324 TALPID3: Hedgehog sig 24.8 49 0.0011 34.1 2.0 28 4-31 945-973 (1252)
49 cd00028 B_lectin Bulb-type man 24.7 2.5E+02 0.0054 20.3 5.5 12 52-63 66-77 (116)
50 PF07680 DoxA: TQO small subun 24.7 1.3E+02 0.0027 23.4 4.0 28 72-99 46-73 (133)
51 PF06788 UPF0257: Uncharacteri 24.6 1.7E+02 0.0037 24.9 5.0 38 54-91 52-91 (236)
52 PRK13245 hetR heterocyst diffe 24.4 22 0.00047 30.4 -0.4 35 153-187 165-202 (299)
53 smart00108 B_lectin Bulb-type 24.3 2.7E+02 0.0059 19.9 5.6 12 52-63 65-76 (114)
54 PF13511 DUF4124: Domain of un 24.1 62 0.0013 20.7 1.9 18 52-69 15-32 (60)
55 COG1021 EntE Peptide arylation 23.9 45 0.00098 31.2 1.5 38 52-89 345-382 (542)
56 PF08269 Cache_2: Cache domain 23.7 15 0.00033 25.8 -1.3 42 47-88 52-94 (95)
57 cd00004 Sortase Sortases are c 23.0 1.6E+02 0.0035 21.7 4.2 18 52-69 70-87 (128)
58 TIGR00156 conserved hypothetic 22.8 77 0.0017 24.4 2.4 48 39-95 46-93 (126)
59 PF04076 BOF: Bacterial OB fol 22.5 67 0.0014 23.7 2.0 22 74-95 49-70 (103)
60 TIGR02527 dot_icm_IcmQ Dot/Icm 22.5 43 0.00094 27.3 1.0 37 62-98 120-157 (182)
61 PF02974 Inh: Protease inhibit 22.4 3E+02 0.0066 19.7 6.5 55 28-87 41-95 (99)
62 PF11398 DUF2813: Protein of u 22.2 1.7E+02 0.0037 26.5 4.9 80 19-102 69-158 (373)
63 PF09629 YorP: YorP protein; 21.9 1.6E+02 0.0035 20.0 3.5 27 41-68 33-59 (71)
64 cd05830 Sortase_D_5 Sortase D 21.5 1.7E+02 0.0036 22.1 4.1 21 50-70 69-89 (137)
65 COG5033 TFG3 Transcription ini 21.0 1.6E+02 0.0034 24.9 4.0 65 73-140 35-101 (225)
66 PF04170 NlpE: NlpE N-terminal 20.7 2.1E+02 0.0045 20.0 4.2 19 74-92 51-70 (87)
67 cd03676 Nudix_hydrolase_3 Memb 20.7 4.1E+02 0.0089 20.6 6.9 59 77-136 7-72 (180)
68 PF08829 AlphaC_N: Alpha C pro 20.5 31 0.00068 28.1 -0.2 32 52-83 92-123 (194)
69 PF08495 FIST: FIST N domain; 20.4 1.5E+02 0.0032 23.1 3.7 17 52-68 180-196 (198)
70 PF13585 CHU_C: C-terminal dom 20.3 91 0.002 21.8 2.2 19 74-92 28-46 (87)
No 1
>PF04525 Tub_2: Tubby C 2; InterPro: IPR007612 This is a family of plant and bacterial uncharacterised proteins.; PDB: 1ZXU_A 2Q4M_A.
Probab=99.97 E-value=3.9e-31 Score=214.44 Aligned_cols=147 Identities=34% Similarity=0.558 Sum_probs=88.1
Q ss_pred EEecCccccCcCEEEEEEeeeceecCCCeEEEcCCCCEEEEEEe-ecCCCCCeEEEECCCCCeEEEEEeccccccccEEE
Q 029811 25 SIIGPQYCLPYPVDLAVVRKFMTLADGSFTVTDINDNIMFKVKE-KHFSLHDKRTLLDPAGNPVVTITEKLFSAHEKHSV 103 (187)
Q Consensus 25 ~vV~~~~~a~~~~~L~vk~K~~sls~~~F~I~D~~G~~vf~V~g-k~~s~~~~~~l~D~~G~~L~~Ir~K~lsl~~~w~v 103 (187)
+||+++||++++++|+||+|.+++++++|+|+|++|+++|+|+| +.+++++++.|+|++|++|++|++|.++++++|++
T Consensus 2 ~vv~~~~~~~~~~~l~v~~k~~~~~~~~f~V~D~~G~~vf~V~g~~~~s~~~~~~l~D~~G~~L~~i~~k~~~l~~~w~i 81 (187)
T PF04525_consen 2 VVVDAQYCSPQPVTLTVKKKSLSFSGDDFTVYDENGNVVFRVDGGKFFSIGKKRTLMDASGNPLFTIRRKLFSLRPTWEI 81 (187)
T ss_dssp -SS-GGGB-SS-EEEEEE----------EEEEETTS-EEEEEE--SCTTBTTEEEEE-TTS-EEEEEE--------EEEE
T ss_pred cEECHHHcCCCceEEEEEEEEeeecCCCEEEEcCCCCEEEEEEEecccCCCCEEEEECCCCCEEEEEEeeecccceEEEE
Confidence 45699999999999999999998888899999999999999999 89999999999999999999999999999999999
Q ss_pred EECCCCCCCCeEEEEEecccccCcceEEEEEecCC-----CCceEEEEEeecC-CCCceEEecCCCceeeeeeeccc
Q 029811 104 FRGASTDAKDLLFTVGASSVLQLKTTLNVFWQVIP-----NKRFVILRSKAAG-QNDPVLFMQESPTQLLPRCTKRK 174 (187)
Q Consensus 104 ~~g~~~~~~~~lftvk~~~~~~~k~k~~V~~~~~~-----~g~~~~~~i~g~~-~~~~~i~~~~s~~~~v~~i~~rk 174 (187)
|.+++.+..+++|+||+++....++++.+|+.... .++..+|+|+|+| .++|.|+. +.+.+||+| +||
T Consensus 82 ~~~~~~~~~~~i~tvkk~~~~~~~~~~~~f~~~~~~~~~~~~~~~~~~i~G~~~~~~~~I~~--~~g~~VA~i-~rk 155 (187)
T PF04525_consen 82 YRGGGSEGKKPIFTVKKKSMLQNKDSFDVFLPPKSNISIDDSEGPDFEIKGNFWDRSFTIYD--SGGRVVAEI-SRK 155 (187)
T ss_dssp EETT---GGGEEEEEE----------EEEEET--T----------SEEEES-TTTT--EEEE--CC--EEEEE-EE-
T ss_pred EECCCCccCceEEEEEEecccCCCcceeEEEecccceeecCCCCceEEEEEEecCcEEEEEE--cCCCEEEEE-ecc
Confidence 99986555678999998877777899999987532 3567789999998 88999996 668999999 533
No 2
>COG4894 Uncharacterized conserved protein [Function unknown]
Probab=99.76 E-value=3e-19 Score=138.84 Aligned_cols=126 Identities=20% Similarity=0.256 Sum_probs=101.4
Q ss_pred CEEEEEEeeeceecCCCeEEEcCCCCEEEEEEeecCCCCCeEEEECCCCCeEEEEEeccccccccEEEEECCCCCCCCeE
Q 029811 36 PVDLAVVRKFMTLADGSFTVTDINDNIMFKVKEKHFSLHDKRTLLDPAGNPVVTITEKLFSAHEKHSVFRGASTDAKDLL 115 (187)
Q Consensus 36 ~~~L~vk~K~~sls~~~F~I~D~~G~~vf~V~gk~~s~~~~~~l~D~~G~~L~~Ir~K~lsl~~~w~v~~g~~~~~~~~l 115 (187)
+.+|.|+||++++++ +|.|+|.+|+.+|+|+|+.|++++.+.+.|++|.+|.+|++|++++.++|++-.|+ |. +
T Consensus 5 ~~tl~mkQk~~~~gd-~f~I~d~dgE~af~VeGs~f~i~dtlti~Da~G~~l~~i~~kll~l~~~yeI~d~~----g~-~ 78 (159)
T COG4894 5 MITLFMKQKMFSFGD-AFHIYDRDGEEAFKVEGSFFSIGDTLTITDASGKTLVSIEQKLLSLLPRYEISDGG----GT-V 78 (159)
T ss_pred hHhHhhhhhhhhccc-ceEEECCCCcEEEEEeeeEEeeCceEEEEecCCCChHHHHHHHhhccceeEEEcCC----CC-E
Confidence 467999999999988 99999999999999999999999999999999999999999999999999997665 34 8
Q ss_pred EEEEecccccCcceEEEEEecCCCCceEEEEEeecC-CCCceEEecCCCceeeeeeeccccccccc
Q 029811 116 FTVGASSVLQLKTTLNVFWQVIPNKRFVILRSKAAG-QNDPVLFMQESPTQLLPRCTKRKPLEATS 180 (187)
Q Consensus 116 ftvk~~~~~~~k~k~~V~~~~~~~g~~~~~~i~g~~-~~~~~i~~~~s~~~~v~~i~~rk~~~~~~ 180 (187)
|.++ +++..+++++++...+ +|+.|+. .....+-+|+. ++|.+ .||.+..+|
T Consensus 79 ~~vr-KK~tf~Rdk~e~d~~~--------~eihGNi~d~efkl~dg~~---~~aeV-sKkwf~~rd 131 (159)
T COG4894 79 CEVR-KKVTFSRDKFEIDGLN--------WEIHGNIWDDEFKLTDGEN---VRAEV-SKKWFSWRD 131 (159)
T ss_pred EEEE-EEEEEEeeeEEEcCCC--------eEEecceeceEEEEecCCc---eehhh-eeeeEeccc
Confidence 8888 5666668888885443 4555552 33344555554 77777 677666554
No 3
>PF03803 Scramblase: Scramblase ; InterPro: IPR005552 Scramblase is palmitoylated and contains a potential protein kinase C phosphorylation site. Scramblase exhibits Ca2+-activated phospholipid scrambling activity in vitro. There are also possible SH3 and WW binding motifs. Scramblase is involved in the redistribution of phospholipids after cell activation or injury.
Probab=98.71 E-value=8e-07 Score=73.40 Aligned_cols=124 Identities=14% Similarity=0.126 Sum_probs=84.1
Q ss_pred EEEEEeeecee-------cCCCeEEEcCCCCEEEEEEeecC---------CCCCeEEEECCCCCeEEEEEecccc-----
Q 029811 38 DLAVVRKFMTL-------ADGSFTVTDINDNIMFKVKEKHF---------SLHDKRTLLDPAGNPVVTITEKLFS----- 96 (187)
Q Consensus 38 ~L~vk~K~~sl-------s~~~F~I~D~~G~~vf~V~gk~~---------s~~~~~~l~D~~G~~L~~Ir~K~ls----- 96 (187)
.+.|+|+...+ ..+.|.|+|.+|+.+|.+....- .-.-+..++|..|+++++|+|..--
T Consensus 23 ~l~I~Q~~e~~e~~~~~e~~N~Y~I~n~~g~~i~~~~E~s~~~~R~~~~~~R~f~~~i~D~~g~~vl~i~Rp~~c~~C~~ 102 (221)
T PF03803_consen 23 QLLIKQQIEPLEIFTGFETPNRYDIKNPNGQQIYYAVEESDCCSRQCCGSHRPFKMHIYDNYGREVLTIERPFKCCSCCP 102 (221)
T ss_pred EEEEEEEEEEeceecccccCceEEEECCCCCEEEEEEEeCcceeeeecCCCCCEEEEEEecCCCEEEEEEcCCcceeccc
Confidence 45566654432 23699999999999999976521 1133568899999999999986421
Q ss_pred -ccccEEEEECCCCCCCCeEEEEEecccccCcceEEEEEecCCCCceEEEEEeecCC-----CCceEEecCCCceeeeee
Q 029811 97 -AHEKHSVFRGASTDAKDLLFTVGASSVLQLKTTLNVFWQVIPNKRFVILRSKAAGQ-----NDPVLFMQESPTQLLPRC 170 (187)
Q Consensus 97 -l~~~w~v~~g~~~~~~~~lftvk~~~~~~~k~k~~V~~~~~~~g~~~~~~i~g~~~-----~~~~i~~~~s~~~~v~~i 170 (187)
.....+|+. +.++.+.+|+ ..+..++++++|+-.+ ....+.|+|... .+..-.+-+.++..|++|
T Consensus 103 ~~~~~~~V~~----p~g~~iG~I~-q~~~~~~~~f~I~d~~----~~~~~~I~gp~~~~~~~~~~~F~I~~~~~~~vg~I 173 (221)
T PF03803_consen 103 CCLQEMEVES----PPGNLIGSIR-QPFSCCRPNFDIFDAN----GNPIFTIKGPCCCCSCCCDWEFEIKDPNGQEVGSI 173 (221)
T ss_pred ccceeEEEec----CCCcEEEEEE-EcCcccceEEEEEECC----CceEEEEeCCcceeccccceeeeeecccCcEEEEE
Confidence 124455533 4589999999 6677789999986554 355688888752 222222234467999999
No 4
>COG4894 Uncharacterized conserved protein [Function unknown]
Probab=98.55 E-value=2e-07 Score=73.03 Aligned_cols=69 Identities=20% Similarity=0.357 Sum_probs=63.5
Q ss_pred cCcCEEEEEEeeeceecCCCeEEEcCCCCEEEEEEeecCCCCCeEEEECCCCCeEEEEEeccccccccEEE
Q 029811 33 LPYPVDLAVVRKFMTLADGSFTVTDINDNIMFKVKEKHFSLHDKRTLLDPAGNPVVTITEKLFSAHEKHSV 103 (187)
Q Consensus 33 a~~~~~L~vk~K~~sls~~~F~I~D~~G~~vf~V~gk~~s~~~~~~l~D~~G~~L~~Ir~K~lsl~~~w~v 103 (187)
..++.++.|+.+.|++.+ .|+|+|+.|..++.++.+++++..++.|.|++|+ ++.+++|..-++++|++
T Consensus 26 ~dgE~af~VeGs~f~i~d-tlti~Da~G~~l~~i~~kll~l~~~yeI~d~~g~-~~~vrKK~tf~Rdk~e~ 94 (159)
T COG4894 26 RDGEEAFKVEGSFFSIGD-TLTITDASGKTLVSIEQKLLSLLPRYEISDGGGT-VCEVRKKVTFSRDKFEI 94 (159)
T ss_pred CCCcEEEEEeeeEEeeCc-eEEEEecCCCChHHHHHHHhhccceeEEEcCCCC-EEEEEEEEEEEeeeEEE
Confidence 367899999999999988 8999999999999999999999999999999999 88888887666888887
No 5
>PF04525 Tub_2: Tubby C 2; InterPro: IPR007612 This is a family of plant and bacterial uncharacterised proteins.; PDB: 1ZXU_A 2Q4M_A.
Probab=98.40 E-value=4.2e-06 Score=67.72 Aligned_cols=96 Identities=20% Similarity=0.368 Sum_probs=50.5
Q ss_pred CcCEEEEEEe-eeceecCCCeEEEcCCCCEEEEEEeecCCCCCeEEEECCCCCe----EEEEEec-cccccccEEEEECC
Q 029811 34 PYPVDLAVVR-KFMTLADGSFTVTDINDNIMFKVKEKHFSLHDKRTLLDPAGNP----VVTITEK-LFSAHEKHSVFRGA 107 (187)
Q Consensus 34 ~~~~~L~vk~-K~~sls~~~F~I~D~~G~~vf~V~gk~~s~~~~~~l~D~~G~~----L~~Ir~K-~lsl~~~w~v~~g~ 107 (187)
.+...|.++. +.+++++ .+.|+|++|+++++++.+.+++..+..+++++|.. |++|+++ .+...+...+|...
T Consensus 36 ~G~~vf~V~g~~~~s~~~-~~~l~D~~G~~L~~i~~k~~~l~~~w~i~~~~~~~~~~~i~tvkk~~~~~~~~~~~~f~~~ 114 (187)
T PF04525_consen 36 NGNVVFRVDGGKFFSIGK-KRTLMDASGNPLFTIRRKLFSLRPTWEIYRGGGSEGKKPIFTVKKKSMLQNKDSFDVFLPP 114 (187)
T ss_dssp TS-EEEEEE--SCTTBTT-EEEEE-TTS-EEEEEE--------EEEEEETT---GGGEEEEEE----------EEEEET-
T ss_pred CCCEEEEEEEecccCCCC-EEEEECCCCCEEEEEEeeecccceEEEEEECCCCccCceEEEEEEecccCCCcceeEEEec
Confidence 5678999999 8999988 99999999999999999999999999999999884 9999998 44445556666652
Q ss_pred C-----CCCCCeEEEEEecccccCcceEEEE
Q 029811 108 S-----TDAKDLLFTVGASSVLQLKTTLNVF 133 (187)
Q Consensus 108 ~-----~~~~~~lftvk~~~~~~~k~k~~V~ 133 (187)
. ...+..-++|+ -. .+..+++|+
T Consensus 115 ~~~~~~~~~~~~~~~i~-G~--~~~~~~~I~ 142 (187)
T PF04525_consen 115 KSNISIDDSEGPDFEIK-GN--FWDRSFTIY 142 (187)
T ss_dssp -T----------SEEEE-S---TTTT--EEE
T ss_pred ccceeecCCCCceEEEE-EE--ecCcEEEEE
Confidence 1 12344456666 22 334566664
No 6
>PF03803 Scramblase: Scramblase ; InterPro: IPR005552 Scramblase is palmitoylated and contains a potential protein kinase C phosphorylation site. Scramblase exhibits Ca2+-activated phospholipid scrambling activity in vitro. There are also possible SH3 and WW binding motifs. Scramblase is involved in the redistribution of phospholipids after cell activation or injury.
Probab=96.95 E-value=0.022 Score=46.87 Aligned_cols=65 Identities=12% Similarity=0.252 Sum_probs=52.7
Q ss_pred eEEEcCCCCEEEEEEeecCC------CCCeEEEECCCCCeEEEEEeccccccccEEEEECCCCCCCCeEEEEEec
Q 029811 53 FTVTDINDNIMFKVKEKHFS------LHDKRTLLDPAGNPVVTITEKLFSAHEKHSVFRGASTDAKDLLFTVGAS 121 (187)
Q Consensus 53 F~I~D~~G~~vf~V~gk~~s------~~~~~~l~D~~G~~L~~Ir~K~lsl~~~w~v~~g~~~~~~~~lftvk~~ 121 (187)
..|+|..|+.|++++-...- ...+..+.++.|+.|++|+++...+.+.+.|+..+ ++.+++|+-+
T Consensus 78 ~~i~D~~g~~vl~i~Rp~~c~~C~~~~~~~~~V~~p~g~~iG~I~q~~~~~~~~f~I~d~~----~~~~~~I~gp 148 (221)
T PF03803_consen 78 MHIYDNYGREVLTIERPFKCCSCCPCCLQEMEVESPPGNLIGSIRQPFSCCRPNFDIFDAN----GNPIFTIKGP 148 (221)
T ss_pred EEEEecCCCEEEEEEcCCcceecccccceeEEEecCCCcEEEEEEEcCcccceEEEEEECC----CceEEEEeCC
Confidence 36799999999999986421 23678888999999999999876688999998765 5789999844
No 7
>KOG0621 consensus Phospholipid scramblase [Cell wall/membrane/envelope biogenesis]
Probab=93.75 E-value=1.1 Score=39.29 Aligned_cols=111 Identities=9% Similarity=0.077 Sum_probs=63.7
Q ss_pred CCeEEEcCCCCEEEEEEeec---------CCCCCeEEEECCCCCeEEEEEeccccccccEEEEECCCCCCCCeEEEEEec
Q 029811 51 GSFTVTDINDNIMFKVKEKH---------FSLHDKRTLLDPAGNPVVTITEKLFSAHEKHSVFRGASTDAKDLLFTVGAS 121 (187)
Q Consensus 51 ~~F~I~D~~G~~vf~V~gk~---------~s~~~~~~l~D~~G~~L~~Ir~K~lsl~~~w~v~~g~~~~~~~~lftvk~~ 121 (187)
+.|.|.|.+|+.+|.+-... ..-.-...++|.-|+++++++|...-... .+.+ ....-+++...
T Consensus 99 NRY~v~~~~g~~v~~~~E~S~~~~Rq~~g~~RpF~~~i~D~~g~eVl~~~R~~~c~~~---~c~~----~~~~~~~v~~p 171 (292)
T KOG0621|consen 99 NRYVVHDMYGQPLYYAMERSNVFARQYLGTHRPFAMRIMDNFGQEVLTCKRPFPCCSS---ACAL----CLAQEIEIQSP 171 (292)
T ss_pred cEEEEEcCCcChhHHHHhhchHHHHHhhccCCcceeEeecccCcEEEEEecccccccc---cccc----ccccEEEEEcC
Confidence 78999999999998654432 11244578889999999999998542221 1111 12223444433
Q ss_pred ccccC----------cceEEEEEecCCCCceEEEEEeecC-------C-CCceEEecCCCceeeeeeeccc
Q 029811 122 SVLQL----------KTTLNVFWQVIPNKRFVILRSKAAG-------Q-NDPVLFMQESPTQLLPRCTKRK 174 (187)
Q Consensus 122 ~~~~~----------k~k~~V~~~~~~~g~~~~~~i~g~~-------~-~~~~i~~~~s~~~~v~~i~~rk 174 (187)
....+ .++++| .+ .+..+.|.|+|.. . ..--+...+ .+.+|.+| .||
T Consensus 172 ~~~~lG~v~q~~~~~~~~f~i--~~--~~~~~v~~v~gp~~~~~~~~~d~~f~~~~~d-~~~~vg~I-~k~ 236 (292)
T KOG0621|consen 172 PMGLLGKVLQTWGCVNPNFHL--WD--RDGNLVFLVEGPRCCTFACCDDTVFFPKTTD-NGRIVGSI-SRK 236 (292)
T ss_pred CCceEEEEEEeeccccceEEE--Ec--ccceeEEEEEcCceeEEEeecCcceeEEEcC-CCeEEEEE-eec
Confidence 33222 233333 22 4457778888872 1 111233333 58889999 454
No 8
>PF01690 PLRV_ORF5: Potato leaf roll virus readthrough protein; InterPro: IPR002929 This family consists mainly of the Potato leafroll virus (PLrV) read through protein otherwise known as the minor capsid protein. This is generated via a readthrough of open reading frame 3, the coat protein, allowing transcription of open reading frame 5 to give an extended coat protein with a large C-terminal addition or read through domain. The read through protein is essential for the circulative aphid transmission of PLrV and Beet western yellows virus. The N-terminal region of the luteovirus readthrough domain determines virus binding to Buchnera GroEL and is essential for virus persistence in the aphid.; GO: 0019028 viral capsid
Probab=85.72 E-value=1.6 Score=40.51 Aligned_cols=61 Identities=26% Similarity=0.467 Sum_probs=36.1
Q ss_pred CCCCCCCCCCCCCCCceEEecCcccc--CcCEEEEEEeeeceecCCCeEEEcCCCCEEEEEEeecCC
Q 029811 8 VPAPTPPPNPAMYSNPVSIIGPQYCL--PYPVDLAVVRKFMTLADGSFTVTDINDNIMFKVKEKHFS 72 (187)
Q Consensus 8 ~~~~~~~~~~~~~~~pv~vV~~~~~a--~~~~~L~vk~K~~sls~~~F~I~D~~G~~vf~V~gk~~s 72 (187)
-|.|||||.|.+..+|.+-.-..|+. .-+.+.+..+. +++...|++.+++..++++...+.
T Consensus 10 ~P~P~P~P~P~P~PePtP~~~~RF~~Y~G~p~~~I~tr~----n~d~I~v~~l~~q~~~yiEdE~~~ 72 (465)
T PF01690_consen 10 GPSPTPPPPPAPTPEPTPAKHERFIGYEGVPQTKISTRE----NDDSISVRSLNSQRMRYIEDENWN 72 (465)
T ss_pred CCCCCCCCCCcccCCCcccCccceEEEecccceeeeccc----cccceEeeccCceeEEEEecccce
Confidence 34455555555555555544667775 22333333222 345778888899999988886543
No 9
>KOG0621 consensus Phospholipid scramblase [Cell wall/membrane/envelope biogenesis]
Probab=79.56 E-value=25 Score=30.82 Aligned_cols=65 Identities=11% Similarity=0.147 Sum_probs=53.1
Q ss_pred eEEEcCCCCEEEEEEeecCC--------CCCeEEEECCCCCeEEEEEeccccccccEEEEECCCCCCCCeEEEEEec
Q 029811 53 FTVTDINDNIMFKVKEKHFS--------LHDKRTLLDPAGNPVVTITEKLFSAHEKHSVFRGASTDAKDLLFTVGAS 121 (187)
Q Consensus 53 F~I~D~~G~~vf~V~gk~~s--------~~~~~~l~D~~G~~L~~Ir~K~lsl~~~w~v~~g~~~~~~~~lftvk~~ 121 (187)
-.|.|..|+.|++++..+.- ..++..+..+.|..|.++.+-..-+.+.|.+-.++ ++.+|.|+..
T Consensus 134 ~~i~D~~g~eVl~~~R~~~c~~~~c~~~~~~~~~v~~p~~~~lG~v~q~~~~~~~~f~i~~~~----~~~v~~v~gp 206 (292)
T KOG0621|consen 134 MRIMDNFGQEVLTCKRPFPCCSSACALCLAQEIEIQSPPMGLLGKVLQTWGCVNPNFHLWDRD----GNLVFLVEGP 206 (292)
T ss_pred eEeecccCcEEEEEeccccccccccccccccEEEEEcCCCceEEEEEEeeccccceEEEEccc----ceeEEEEEcC
Confidence 46899999999999998532 25788899999999999998876788999997643 6788999844
No 10
>PF09008 Head_binding: Head binding; InterPro: IPR009093 This entry represents the N-terminal domain of the Bacteriophage P22, Gp9, tailspike protein (TSP). The characteristics of the protein distribution suggest prophage matches in addition to the phage matches. The tailspike protein of Salmonella bacteriophage P22 is a viral adhesion protein that mediates attachment of the viral protein to host cell-surface lipopolysaccharide. The tailspike protein displays both receptor binding and destroying properties, inactivating the receptor by endoglycosidase activity. The N-terminal, head-binding domain mediates the non-covalent attachment of the six homotrimeric tailspike molecules to the DNA injection apparatus. The N-terminal domain of the P22 tailspike protein shows significant sequence similarity to the N-terminal domain of the Shigella phage Sf6 tailspike protein.; GO: 0009405 pathogenesis; PDB: 2XC1_C 1LKT_D 2VFQ_A 2VFO_A 2VFN_A 2VFP_A 2VKY_B 2VFM_A 2VNL_A 2VBK_A ....
Probab=75.61 E-value=8.9 Score=28.91 Aligned_cols=42 Identities=19% Similarity=0.239 Sum_probs=25.6
Q ss_pred eeceecCCCeEEEcCCCCEEEEEEeecCCCCCeEEEECCCCCeEEEEE
Q 029811 44 KFMTLADGSFTVTDINDNIMFKVKEKHFSLHDKRTLLDPAGNPVVTIT 91 (187)
Q Consensus 44 K~~sls~~~F~I~D~~G~~vf~V~gk~~s~~~~~~l~D~~G~~L~~Ir 91 (187)
....+..++|.++ +|+.+..|... ++-..++|++|..+|.+-
T Consensus 63 QPi~iN~gg~~~y--~gq~a~~vt~~----~hSMAv~d~~g~q~Fy~p 104 (114)
T PF09008_consen 63 QPIIINKGGFPVY--NGQIAKFVTVP----GHSMAVYDANGQQQFYFP 104 (114)
T ss_dssp SSEEE-TTS-EEE--TTEE--EEESS----SEEEEEE-TTS-EEEEES
T ss_pred CCEEEccCCceEE--ccceeEEEEcc----CceEEEEeCCCcEEEeec
Confidence 4445655588888 45566666665 456889999999999874
No 11
>KOG3950 consensus Gamma/delta sarcoglycan [Cytoskeleton]
Probab=69.16 E-value=5.3 Score=34.36 Aligned_cols=51 Identities=18% Similarity=0.246 Sum_probs=33.7
Q ss_pred EEEEeeeceecCCCeEEEcCCCCEEEEEEeecCCCCCeEEEECCCCCeEEE
Q 029811 39 LAVVRKFMTLADGSFTVTDINDNIMFKVKEKHFSLHDKRTLLDPAGNPVVT 89 (187)
Q Consensus 39 L~vk~K~~sls~~~F~I~D~~G~~vf~V~gk~~s~~~~~~l~D~~G~~L~~ 89 (187)
|.+=.+..-..+..|.|.|.+|.++|.+|.+-..++.+..=.+..|..+|.
T Consensus 127 l~lgp~~ve~~~~~Fev~~~dgk~LFsad~dEv~vgae~LRv~g~~GavF~ 177 (292)
T KOG3950|consen 127 LILGPKKVEAQCKRFEVNDVDGKLLFSADEDEVVVGAEKLRVLGAEGAVFE 177 (292)
T ss_pred EEechHHHhhhhceeEEecCCCcEEEEeccceeEeeeeeeEeccCCccccc
Confidence 444333333344589999999999999998866666655555666666664
No 12
>PF13860 FlgD_ig: FlgD Ig-like domain; PDB: 3C12_A 3OSV_A.
Probab=63.43 E-value=16 Score=25.33 Aligned_cols=17 Identities=18% Similarity=0.292 Sum_probs=8.0
Q ss_pred CeEEEcCCCCEEEEEEe
Q 029811 52 SFTVTDINDNIMFKVKE 68 (187)
Q Consensus 52 ~F~I~D~~G~~vf~V~g 68 (187)
.+.|+|++|++|.++..
T Consensus 28 ~v~I~d~~G~~V~t~~~ 44 (81)
T PF13860_consen 28 TVTIYDSNGQVVRTISL 44 (81)
T ss_dssp EEEEEETTS-EEEEEEE
T ss_pred EEEEEcCCCCEEEEEEc
Confidence 34455555555555443
No 13
>PF09000 Cytotoxic: Cytotoxic; InterPro: IPR009105 Colicins are plasmid-encoded protein antibiotics, or bacteriocins, produced by strains of Escherichia coli that kill closely related bacteria. Colicins are classified according to the cell-surface receptor they bind to, colicin E3 binding to the BtuB receptor involved in vitamin B12 uptake. The lethal action of colicin E3 arises from its ability to inactivate the ribosome by site-specific RNase cleavage of the 16S ribosomal RNA, which is carried out by the catalytic, or ribonuclease domain. Colicin E3 is comprised of three domains, each domain being involved in a different stage of infection: receptor binding, translocation and cytotoxicity. Colicin E3 is a Y-shaped molecule with the receptor-binding middle domain forming the stalk, the N-terminal translocation domain forming the two globular heads (IPR003058 from INTERPRO), and the C-terminal catalytic domain forming the two globular arms. To neutralise the toxic effects of colicin E3, the host cell produces an immunity protein, which binds to the C-terminal end of the ribonuclease domain and effectively suppresses its activity. This entry represents the ribonuclease domain (also called catalytic or cytotoxic domain) found in various colicins. This domain confers cytotoxic activity to proteins, enabling the formation of nucleolytic breaks in 16S ribosomal RNA. The structure of the domain reveals a highly twisted central beta-sheet elaborated with a short N-terminal alpha-helix.; GO: 0003723 RNA binding, 0016788 hydrolase activity, acting on ester bonds, 0043022 ribosome binding, 0009405 pathogenesis; PDB: 2B5U_C 1JCH_A 1E44_B 2XFZ_Y.
Probab=62.53 E-value=17 Score=26.11 Aligned_cols=50 Identities=12% Similarity=0.149 Sum_probs=31.5
Q ss_pred EEEeeeceecCCCe--EEEcCCCCEEEEEEeecCCCCCeEEEECCCCCeEEEEEec
Q 029811 40 AVVRKFMTLADGSF--TVTDINDNIMFKVKEKHFSLHDKRTLLDPAGNPVVTITEK 93 (187)
Q Consensus 40 ~vk~K~~sls~~~F--~I~D~~G~~vf~V~gk~~s~~~~~~l~D~~G~~L~~Ir~K 93 (187)
..++|.-.++++.- .=+|..|..+|.-|.+ +++++++|..|++|..+-..
T Consensus 18 ~~k~ktp~~gg~~~r~rw~~~kG~kiYewDsq----HG~lEvy~~~GkHLGe~Dp~ 69 (85)
T PF09000_consen 18 KAKPKTPVQGGGGKRKRWKDKKGRKIYEWDSQ----HGELEVYNKRGKHLGEFDPK 69 (85)
T ss_dssp EE---SB-SSSSSB--EEEETTTTEEEEEETT----TTEEEEEETT-BEEEEE-TT
T ss_pred hccccCccccCCccccceEcCCCCEEEEEcCC----CCeEEEEcCCCcCcccccCC
Confidence 34444444443222 2367889999988875 78999999999999987643
No 14
>PF04790 Sarcoglycan_1: Sarcoglycan complex subunit protein; InterPro: IPR006875 The dystrophin glycoprotein complex (DGC) is a membrane-spanning complex that links the interior cytoskeleton to the extracellular matrix in muscle. The sarcoglycan complex is a subcomplex within the DGC and is composed of several muscle-specific, transmembrane proteins (alpha-, beta-, gamma-, delta- and zeta-sarcoglycan). The sarcoglycans are asparagine-linked glycosylated proteins with single transmembrane domains. This family contains beta, gamma and delta members.; GO: 0007010 cytoskeleton organization, 0016012 sarcoglycan complex, 0016021 integral to membrane
Probab=58.82 E-value=19 Score=31.06 Aligned_cols=48 Identities=19% Similarity=0.295 Sum_probs=24.3
Q ss_pred EEEEEeee-ceecCCCeEEEcC-CCCEEEEEEeecCCCC-CeEEEECCCCC
Q 029811 38 DLAVVRKF-MTLADGSFTVTDI-NDNIMFKVKEKHFSLH-DKRTLLDPAGN 85 (187)
Q Consensus 38 ~L~vk~K~-~sls~~~F~I~D~-~G~~vf~V~gk~~s~~-~~~~l~D~~G~ 85 (187)
.|.+-+.. .-...+.|.|+|. +|+++|.++..-..++ +++.+..+.|-
T Consensus 104 ~l~v~~~~~v~~~~~~F~V~d~~~g~~lFsad~~~v~v~~~~lrv~~~~G~ 154 (264)
T PF04790_consen 104 RLVVGPDGTVEAQSNRFEVKDPRDGKTLFSADRPEVVVGAEKLRVTGPEGA 154 (264)
T ss_pred eEEECCCccEEEecCeEEEEcCCCCceEEEecCCceEEeeeeEEecCCccE
Confidence 44443333 3333446777776 6777777776643332 33333344444
No 15
>PRK12816 flgG flagellar basal body rod protein FlgG; Reviewed
Probab=52.88 E-value=20 Score=30.75 Aligned_cols=42 Identities=10% Similarity=0.275 Sum_probs=31.3
Q ss_pred eecCCC-eEEEcCCCCEEEEEEeecCCCCCeEEEECCCCCeEEE
Q 029811 47 TLADGS-FTVTDINDNIMFKVKEKHFSLHDKRTLLDPAGNPVVT 89 (187)
Q Consensus 47 sls~~~-F~I~D~~G~~vf~V~gk~~s~~~~~~l~D~~G~~L~~ 89 (187)
.+.|++ |.|.+.+|+.+|+=+|. |.+...-.|.+.+|.+|+.
T Consensus 96 AI~G~GFF~V~~~~G~~~YTR~G~-F~~d~~G~Lvt~~G~~vl~ 138 (264)
T PRK12816 96 AIEGEGFFKILMPDGTYAYTRDGS-FKIDANGQLVTSNGYRLLP 138 (264)
T ss_pred EECCCcEEEEEcCCCCeEEeeCCC-eeECCCCCEECCCCCEecc
Confidence 334434 56777889888997777 5666777799999999985
No 16
>PF15119 APOC4: Apolipoprotein C4
Probab=50.30 E-value=11 Score=27.31 Aligned_cols=19 Identities=37% Similarity=0.688 Sum_probs=16.2
Q ss_pred CCCccCCCCCCCCCCCCCC
Q 029811 3 QQPVNVPAPTPPPNPAMYS 21 (187)
Q Consensus 3 ~~~~~~~~~~~~~~~~~~~ 21 (187)
||....+.|+|||.|+|-.
T Consensus 2 q~~~~~~~psP~p~~~~S~ 20 (99)
T PF15119_consen 2 QQEAPEESPSPPPGPESSR 20 (99)
T ss_pred cccCCCCCCCCCCCcccCc
Confidence 7888889999999998754
No 17
>PF15529 Toxin_49: Putative toxin 49
Probab=49.68 E-value=47 Score=24.12 Aligned_cols=19 Identities=21% Similarity=0.261 Sum_probs=14.2
Q ss_pred CCeEEEcCCCCEEEEEEee
Q 029811 51 GSFTVTDINDNIMFKVKEK 69 (187)
Q Consensus 51 ~~F~I~D~~G~~vf~V~gk 69 (187)
.+|+++|++|.++-++++.
T Consensus 30 t~Y~tY~~~G~~~kr~r~~ 48 (89)
T PF15529_consen 30 TSYTTYDEDGMIVKRYRGS 48 (89)
T ss_pred cceeEEcCCCcEeEEeecc
Confidence 4899999999955555443
No 18
>TIGR02488 flgG_G_neg flagellar basal-body rod protein FlgG, Gram-negative bacteria. This family consists of the FlgG protein of the flagellar apparatus in the Proteobacteria and spirochetes.
Probab=47.03 E-value=25 Score=29.83 Aligned_cols=42 Identities=14% Similarity=0.324 Sum_probs=30.7
Q ss_pred eecCCC-eEEEcCCCCEEEEEEeecCCCCCeEEEECCCCCeEEE
Q 029811 47 TLADGS-FTVTDINDNIMFKVKEKHFSLHDKRTLLDPAGNPVVT 89 (187)
Q Consensus 47 sls~~~-F~I~D~~G~~vf~V~gk~~s~~~~~~l~D~~G~~L~~ 89 (187)
.+.|++ |.|.+.+|+.+|+=+|. |.+...-.|.+.+|.+|+.
T Consensus 94 AI~G~GfF~V~~~~g~~~yTR~G~-F~~d~~G~Lvt~~G~~Vl~ 136 (259)
T TIGR02488 94 AIEGEGFFQVLMPDGTTAYTRDGA-FKINAEGQLVTSNGYPLQP 136 (259)
T ss_pred EEcCCcEEEEEcCCCCeEEeeCCc-eEECCCCCEECCCCCEecC
Confidence 334444 46667788888887776 5666777799999999885
No 19
>PRK15393 NUDIX hydrolase YfcD; Provisional
Probab=45.78 E-value=65 Score=25.59 Aligned_cols=55 Identities=9% Similarity=-0.001 Sum_probs=34.6
Q ss_pred CeEEEcCCCCEEEEEEe------ecCCCCCeEEEECCCCCeEEEEEeccc--cccccEEEEECC
Q 029811 52 SFTVTDINDNIMFKVKE------KHFSLHDKRTLLDPAGNPVVTITEKLF--SAHEKHSVFRGA 107 (187)
Q Consensus 52 ~F~I~D~~G~~vf~V~g------k~~s~~~~~~l~D~~G~~L~~Ir~K~l--sl~~~w~v~~g~ 107 (187)
-+.|+|++|+++-.+.- +++...-...++|.+|+.|+. +|... .+...|..+-||
T Consensus 11 ~~~~~d~~~~~~g~~~~~~~~~~~~~h~~~~v~v~~~~g~iLL~-~R~~~~~~~pg~~~~~pGG 73 (180)
T PRK15393 11 WVDIVNENNEVIAQASREQMRAQCLRHRATYIVVHDGMGKILVQ-RRTETKDFLPGMLDATAGG 73 (180)
T ss_pred EEEEECCCCCEeeEEEHHHHhhCCCceEEEEEEEECCCCeEEEE-EeCCCCCCCCCcccccCCC
Confidence 47899999999998721 223345566778988887763 44321 134556666554
No 20
>PRK12694 flgG flagellar basal body rod protein FlgG; Reviewed
Probab=44.54 E-value=28 Score=29.59 Aligned_cols=42 Identities=14% Similarity=0.306 Sum_probs=30.9
Q ss_pred eecCCCe-EEEcCCCCEEEEEEeecCCCCCeEEEECCCCCeEEE
Q 029811 47 TLADGSF-TVTDINDNIMFKVKEKHFSLHDKRTLLDPAGNPVVT 89 (187)
Q Consensus 47 sls~~~F-~I~D~~G~~vf~V~gk~~s~~~~~~l~D~~G~~L~~ 89 (187)
.+.|++| .|.+.+|...|+=+|. |.+...-.|.+.+|.+|+.
T Consensus 96 AI~G~GfF~V~~~~G~~~yTR~G~-F~~d~~G~Lvt~~G~~Vl~ 138 (260)
T PRK12694 96 AINGQGFFQVLMPDGTTAYTRDGS-FQTNAQGQLVTSSGYPLQP 138 (260)
T ss_pred EEcCCcEEEEEcCCCCeEEeeCCC-ceECCCCCEECCCCCEecc
Confidence 3444444 6777889888887777 5666777799999999885
No 21
>PF02974 Inh: Protease inhibitor Inh; InterPro: IPR021140 This entry represents the metalloprotease inhibitor I38, as well as the outer membrane lipoprotein Omp19. Peptide proteinase inhibitors can be found as single domain proteins or as single or multiple domains within proteins; these are referred to as either simple or compound inhibitors, respectively. In many cases they are synthesised as part of a larger precursor protein, either as a prepropeptide or as an N-terminal domain associated with an inactive peptidase or zymogen. This domain prevents access of the substrate to the active site. Removal of the N-terminal inhibitor domain either by interaction with a second peptidase or by autocatalytic cleavage activates the zymogen. Other inhibitors interact direct with proteinases using a simple noncovalent lock and key mechanism; while yet others use a conformational change-based trapping mechanism that depends on their structural and thermodynamic properties. This family of proteins represent monomeric serralysin inhibitors of about 125 residues, which interact with specific metalloprotease which are synthesised by serralysin secretors and characterised by being plant, insect and animal pathogens. It is probable that the serralysin inhibitors protect the host from proteolysis during export of the protease. The members of this family belong to MEROPS proteinase inhibitor family I38, clan IK. X-ray crystallography of a complex between the Serratia marcescens protease, SmaPI, and the inhibitor of Erwinia chrysanthemi, Inh, reveals that Inh is folded into an eight-stranded β-barrel with an N-terminal trunk of 10 residues. Residues 1-5 occupy part of the extended active site of the proteinase, thereby preventing access of the substrate. Residues 6-10 form a linker that connects the N-terminal proteinase-binding peptide to the body of the β-barrel. The backbone carbonyl of Ser-1 interacts with the catalytic zinc; the Ser-2 side chain occupies the S1'-binding site and also forms a hydrogen bond to the carboxyl end of the catalytic Glu, whereas Leu-3 occupies the S2' recognition site. Penetration of the trunk region further than 5 residues into the substrate binding cleft appears to be prevented by the β-barrel, which itself interacts with the proteinase near its Met turn (19). Peptide mimetics of the trunk at concentrations up to about 100 mM do not inhibit the protease, demonstrating that the barrel is essential for inhibitory activity. Structurally and functionally these inhibitors are closely related to the lipocalins, fatty acid-binding proteins, avidins and the enigmatic triabin. Together these five protein families constitute the calycin superfamily. The proteins are characterised by their high specificity for small hydrophobic molecules and by their ability to form complexes with soluble macromolecules either through intramolecular disulphides or protein-protein interactions.; PDB: 1JIW_I 2RN4_A 1SMP_I.
Probab=44.05 E-value=1.2e+02 Score=21.92 Aligned_cols=31 Identities=16% Similarity=0.045 Sum_probs=21.9
Q ss_pred CCeEEEECCCCCeEEEEEeccccccccEEEEECC
Q 029811 74 HDKRTLLDPAGNPVVTITEKLFSAHEKHSVFRGA 107 (187)
Q Consensus 74 ~~~~~l~D~~G~~L~~Ir~K~lsl~~~w~v~~g~ 107 (187)
++.+.|+|++|+.|..+.+.. ...|+....+
T Consensus 61 gd~l~L~d~~G~~v~~f~~~~---~g~~~g~~~~ 91 (99)
T PF02974_consen 61 GDGLVLTDADGSVVAFFYRSG---DGRFEGQTPD 91 (99)
T ss_dssp TTEEEEE-TTS-EEEEEEEEC---TTEEEEEECC
T ss_pred CCEEEEECCCCCEEEEEEccC---CeeEEeEcCC
Confidence 567999999999999988763 3567776654
No 22
>PRK12693 flgG flagellar basal body rod protein FlgG; Provisional
Probab=41.80 E-value=39 Score=28.61 Aligned_cols=42 Identities=19% Similarity=0.379 Sum_probs=30.9
Q ss_pred eecCCCe-EEEcCCCCEEEEEEeecCCCCCeEEEECCCCCeEEE
Q 029811 47 TLADGSF-TVTDINDNIMFKVKEKHFSLHDKRTLLDPAGNPVVT 89 (187)
Q Consensus 47 sls~~~F-~I~D~~G~~vf~V~gk~~s~~~~~~l~D~~G~~L~~ 89 (187)
.+.|++| .|.+.+|++.|+=+|. |.+...-.|.+.+|.+|+.
T Consensus 96 Ai~G~GfF~v~~~~G~~~yTR~G~-F~~d~~G~Lvt~~G~~vl~ 138 (261)
T PRK12693 96 AIEGQGFFQVQLPDGTIAYTRDGS-FKLDQDGQLVTSGGYPLQP 138 (261)
T ss_pred EECCCcEEEEEcCCCCeEEeeCCC-eeECCCCCEECCCCCEEee
Confidence 3445455 5677789888997777 5566667799999999885
No 23
>cd05828 Sortase_D_4 Sortase D (SrtD) is a membrane transpeptidase found in gram-positive bacteria that anchors surface proteins to peptidoglycans of the bacterial cell wall envelope. This involves a transpeptidation reaction in which the surface protein substrate is cleaved at the cell wall sorting signal and covalently linked to peptidoglycan for display on the bacterial surface. Sortases are grouped into different classes and subfamilies based on sequence, membrane topology, genomic positioning, and cleavage site preference. Class D sortases are further classified into subfamilies 4 and 5. This group contains a subset of Class D sortases belonging to subfamily-4. These sortases recognize a unique sorting signal (LPXTA) and they constitute a specialized sorting pathway found in bacilli. Their substrates are predicted to be predominantly enzymes such as 5'-nucleotidases, glycosyl hydrolase, and subtilase.
Probab=40.65 E-value=48 Score=24.80 Aligned_cols=40 Identities=18% Similarity=0.405 Sum_probs=22.4
Q ss_pred CCCeEEEcCCCCEEEEEEeecCCCCCeEEEECCCCCeEEE
Q 029811 50 DGSFTVTDINDNIMFKVKEKHFSLHDKRTLLDPAGNPVVT 89 (187)
Q Consensus 50 ~~~F~I~D~~G~~vf~V~gk~~s~~~~~~l~D~~G~~L~~ 89 (187)
|+.+.|++.++...|+|.....--..+..+++..|.+.++
T Consensus 65 Gd~i~v~~~~~~~~Y~V~~~~~v~~~~~~~~~~~~~~~Lt 104 (127)
T cd05828 65 GDIITLQTLGGTYTYRVTSTRIVDADDTSVLAPSDDPTLT 104 (127)
T ss_pred CCEEEEEECCEEEEEEEeeEEEECccccEEccCCCCCEEE
Confidence 4467777777777777777643233344455544444433
No 24
>smart00634 BID_1 Bacterial Ig-like domain (group 1).
Probab=40.46 E-value=79 Score=22.08 Aligned_cols=11 Identities=55% Similarity=0.507 Sum_probs=6.8
Q ss_pred eEEEcCCCCEE
Q 029811 53 FTVTDINDNIM 63 (187)
Q Consensus 53 F~I~D~~G~~v 63 (187)
.+|.|++|+++
T Consensus 24 v~v~D~~Gnpv 34 (92)
T smart00634 24 ATVTDANGNPV 34 (92)
T ss_pred EEEECCCCCCc
Confidence 45677777643
No 25
>TIGR03784 marine_sortase sortase, marine proteobacterial type. Members of this protein family are sortase enzymes, cysteine transpeptidases involved in protein sorting activities. Members of this family tend to be found in proteobacteria, rather than in Gram-positive bacteria where sortases attach proteins to the Gram-positive cell wall or participate in pilin cross-linking. Many species with this sortase appear to contain a signal target sequence, a protein with a Vault protein inter-alpha-trypsin domain (pfam08487) and a von Willebrand factor type A domain (pfam00092), encoded by an adjacent gene. These sortases are designated subfamily 6 according to Comfort and Clubb (2004).
Probab=39.39 E-value=57 Score=26.21 Aligned_cols=22 Identities=23% Similarity=0.185 Sum_probs=14.8
Q ss_pred CCCCeEEEECCCCCe-EEEEEec
Q 029811 72 SLHDKRTLLDPAGNP-VVTITEK 93 (187)
Q Consensus 72 s~~~~~~l~D~~G~~-L~~Ir~K 93 (187)
..++++.|.|.+|+. .+++...
T Consensus 110 ~~GD~I~v~~~~g~~~~Y~V~~~ 132 (174)
T TIGR03784 110 RPGDVIRLQTPDGQWQSYQVTAT 132 (174)
T ss_pred CCCCEEEEEECCCeEEEEEEeEE
Confidence 457777777777775 4777554
No 26
>cd06166 Sortase_D_5 Sortase D (SrtD) is a membrane transpeptidase found in gram-positive bacteria that anchors surface proteins to peptidoglycans of the bacterial cell wall envelope. This involves a transpeptidation reaction in which the surface protein substrate is cleaved at the cell wall sorting signal and covalently linked to peptidoglycan for display on the bacterial surface. Sortases are grouped into different classes and subfamilies based on sequence, membrane topology, genomic positioning, and cleavage site preference. Class D sortases are further classified into subfamilies 4 and 5. This group contains a subset of Class D sortases belonging to subfamily-5, represented by Clostridium perfringens CPE2315. Subfamily-5 sortases recognize a nonstandard sorting signal (LAXTG) and have replaced Sortase A in some gram-positive bacteria. They may play a housekeeping role in the cell.
Probab=39.08 E-value=58 Score=24.32 Aligned_cols=21 Identities=19% Similarity=0.273 Sum_probs=15.3
Q ss_pred CCCeEEEcCCCCEEEEEEeec
Q 029811 50 DGSFTVTDINDNIMFKVKEKH 70 (187)
Q Consensus 50 ~~~F~I~D~~G~~vf~V~gk~ 70 (187)
|+.+.|+|.++.-.|+|.+..
T Consensus 68 Gd~v~v~~~~~~~~Y~V~~~~ 88 (126)
T cd06166 68 GDEIKVTTKNGTYKYKITSIF 88 (126)
T ss_pred CCEEEEEECCEEEEEEEEEEE
Confidence 457777777777788887764
No 27
>PF12396 DUF3659: Protein of unknown function (DUF3659) ; InterPro: IPR022124 This domain family is found in bacteria and eukaryotes, and is approximately 70 amino acids in length.
Probab=37.38 E-value=87 Score=21.17 Aligned_cols=38 Identities=21% Similarity=0.315 Sum_probs=26.4
Q ss_pred EEEcCCCCEEEE-EEeecC-----CCCCeEEEECCCCCeEEEEE
Q 029811 54 TVTDINDNIMFK-VKEKHF-----SLHDKRTLLDPAGNPVVTIT 91 (187)
Q Consensus 54 ~I~D~~G~~vf~-V~gk~~-----s~~~~~~l~D~~G~~L~~Ir 91 (187)
.|.|.+|+++=+ |+|.+- .+-.+=.|.|.+|+.|....
T Consensus 14 ~V~d~~G~~vG~vveGd~k~L~G~~vd~~G~I~d~~G~viGkae 57 (64)
T PF12396_consen 14 NVVDDDGNVVGRVVEGDPKKLVGKKVDEDGDILDKDGNVIGKAE 57 (64)
T ss_pred eEECCCCCEEEEEecCCHHHhcCCcCCCCCCEECCCCCEEEEEE
Confidence 688999999999 555432 33444567777888877654
No 28
>PF12312 NeA_P2: Nepovirus subgroup A polyprotein ; InterPro: IPR021081 Proteins in this entry are typically between 259 and 1110 amino acids in length. They are found in association with PF03688 from PFAM, PF03689 from PFAM and PF03391 from PFAM. This entry includes RNA2 polyprotein (Protein 2A) which is implicated in RNA2 replication.
Probab=36.76 E-value=22 Score=29.85 Aligned_cols=27 Identities=33% Similarity=0.629 Sum_probs=20.2
Q ss_pred CCCCCCCCCCCCCCCceEE---ecCccccC
Q 029811 8 VPAPTPPPNPAMYSNPVSI---IGPQYCLP 34 (187)
Q Consensus 8 ~~~~~~~~~~~~~~~pv~v---V~~~~~a~ 34 (187)
+-||.|||.|.++-+||+- -+..||..
T Consensus 104 v~ipspPp~P~pyfR~vGAFAPTRSgfIRa 133 (258)
T PF12312_consen 104 VVIPSPPPMPRPYFRPVGAFAPTRSGFIRA 133 (258)
T ss_pred cccCCCcCCCCcccccccccCcccchHHHH
Confidence 5789999999999999963 23455543
No 29
>PRK06655 flgD flagellar basal body rod modification protein; Reviewed
Probab=33.95 E-value=60 Score=27.22 Aligned_cols=18 Identities=39% Similarity=0.396 Sum_probs=11.6
Q ss_pred CCeEEEECCCCCeEEEEE
Q 029811 74 HDKRTLLDPAGNPVVTIT 91 (187)
Q Consensus 74 ~~~~~l~D~~G~~L~~Ir 91 (187)
.-.+.|+|++|+.+.++.
T Consensus 127 ~vti~I~D~~G~~Vrt~~ 144 (225)
T PRK06655 127 NVTVTITDSAGQVVRTID 144 (225)
T ss_pred EEEEEEEcCCCCEEEEEe
Confidence 345677777777776653
No 30
>PF12690 BsuPI: Intracellular proteinase inhibitor; InterPro: IPR020481 BsuPI is an intracellular proteinase inhibitor that directly regulates the major intracellular proteinase (ISP-1) activity in vivo. It inhibits ISP-1 in the early stages of sporulation and then may be inactivated by a membrane-bound proteinase.; PDB: 3ISY_A.
Probab=33.56 E-value=33 Score=24.10 Aligned_cols=17 Identities=18% Similarity=0.409 Sum_probs=11.0
Q ss_pred CeEEEcCCCCEEEEEEe
Q 029811 52 SFTVTDINDNIMFKVKE 68 (187)
Q Consensus 52 ~F~I~D~~G~~vf~V~g 68 (187)
+|.|+|.+|+.|++=..
T Consensus 27 D~~v~d~~g~~vwrwS~ 43 (82)
T PF12690_consen 27 DFVVKDKEGKEVWRWSD 43 (82)
T ss_dssp EEEEE-TT--EEEETTT
T ss_pred EEEEECCCCCEEEEecC
Confidence 78889999999987543
No 31
>PF03574 Peptidase_S48: Peptidase family S48; InterPro: IPR005319 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity.
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base. The geometric orientations of the catalytic residues are similar between families, despite different protein folds. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC). This group of serine peptidases, which includes HetR, is associated with heterocystous cyanobacteria and belongs to MEROPS peptidase family S48 (clan S-). HetR is a DNA-binding serine-type protease required for heterocyst differentiation in heterocystous cyanobacteria under conditions of nitrogen deprivation. Mutation of HetR from Anabaena sp. (strain PCC 7120) by site-specific mutagenesis of Ser-152 showed that this residue was one of the peptidase active site residues. It was suggested that peptidase activity might be needed for repression of HetR overproduction under conditions of nitrogen deprivation. Modification of Cys-48 prevented disulphide-bond formation and homodimerisation of HetR and DNA-binding. The homodimer of HetR binds the promoter regions of hetR, hepA, and patS, suggesting a direct control of the expression of these genes by HetR.
The pentapeptide RGSGR, which is present at the C terminus of PatS, blocks heterocyst formation, inhibits the DNA binding of HetR and prevents hetR up-regulation.; GO: 0003677 DNA binding, 0004252 serine-type endopeptidase activity, 0043158 heterocyst differentiation; PDB: 3QOE_A 3QOD_A.
Probab=33.51 E-value=7.6 Score=30.05 Aligned_cols=23 Identities=43% Similarity=0.608 Sum_probs=18.3
Q ss_pred eeeeeeecccccccccceeccCC
Q 029811 165 QLLPRCTKRKPLEATSLTRIDSP 187 (187)
Q Consensus 165 ~~v~~i~~rk~~~~~~~~~~~~~ 187 (187)
+.+|.-.+|+++-++..||||+|
T Consensus 127 eAlAeHIkRRLlysgTVtriD~p 149 (149)
T PF03574_consen 127 EALAEHIKRRLLYSGTVTRIDSP 149 (149)
T ss_dssp HHHHHHHHHHHHHTTSEEEEEES
T ss_pred HHHHHHHHHHHhhccceEecCCC
Confidence 34444349999999999999997
No 32
>PRK12691 flgG flagellar basal body rod protein FlgG; Reviewed
Probab=33.13 E-value=61 Score=27.52 Aligned_cols=41 Identities=12% Similarity=0.272 Sum_probs=30.4
Q ss_pred ecCCC-eEEEcCCCCEEEEEEeecCCCCCeEEEECCCCCeEEE
Q 029811 48 LADGS-FTVTDINDNIMFKVKEKHFSLHDKRTLLDPAGNPVVT 89 (187)
Q Consensus 48 ls~~~-F~I~D~~G~~vf~V~gk~~s~~~~~~l~D~~G~~L~~ 89 (187)
+.|++ |.|.+.+|+.+|+=+|. |.+...-.|.+.+|.+|+.
T Consensus 97 I~G~GfF~V~~~~G~~~yTR~G~-F~~d~~G~Lvt~~G~~vl~ 138 (262)
T PRK12691 97 IQGRGYFQIQLPDGETAYTRAGA-FNRSADGQIVTSDGYPVQP 138 (262)
T ss_pred EcCCcEEEEEcCCCCEEEeeCCC-eeECCCCCEECCCCCEeEe
Confidence 34434 56666789888997777 5566777799999999985
No 33
>cd03676 Nudix_hydrolase_3 Members of the Nudix hydrolase superfamily catalyze the hydrolysis of NUcleoside DIphosphates linked to other moieties, X. Enzymes belonging to this superfamily require a divalent cation, such as Mg2+ or Mn2+, for their activity and contain a highly conserved 23-residue nudix motif (GX5EX7REUXEEXGU, where U = I, L or V), which functions as a metal binding and catalytic site. Substrates of nudix hydrolases include intact and oxidatively damaged nucleoside triphosphates, dinucleoside polyphosphates, nucleotide-sugars and dinucleotide enzymes. These substrates are metabolites or cell signaling molecules that require regulation during different stages of the cell cycle or during periods of stress. In general, the role of the nudix hydrolase is to sanitize the nucleotide pools and to maintain cell viability, thereby serving as surveillance & "house-cleaning" enzymes. Substrate specificity is used to define families within the superfamily. Differences in substrate spe
Probab=32.76 E-value=1.9e+02 Score=22.55 Aligned_cols=60 Identities=7% Similarity=-0.055 Sum_probs=33.5
Q ss_pred ecCCCeEEEcCCCCEEEEEEeecC---CC-CCeEE----EECCC--CCeEEEEEec-cccccccEEEEECC
Q 029811 48 LADGSFTVTDINDNIMFKVKEKHF---SL-HDKRT----LLDPA--GNPVVTITEK-LFSAHEKHSVFRGA 107 (187)
Q Consensus 48 ls~~~F~I~D~~G~~vf~V~gk~~---s~-~~~~~----l~D~~--G~~L~~Ir~K-~lsl~~~w~v~~g~ 107 (187)
|.+.-|.|+|++|+++..+.-... .+ +.... +.|.+ |..+++-|.. +.++-..|...-+|
T Consensus 2 ~~~E~~~v~d~~~~~~~~~~r~~~~~~g~~h~~v~~~~~~~~~~~~~~l~lqrRs~~K~~~Pg~wd~~~~G 72 (180)
T cd03676 2 WRNELYAVYGPFGEPLFEIERAASRLFGLVTYGVHLNGYVRDEDGGLRIWIPRRSPTKATWPGMLDNLVAG 72 (180)
T ss_pred CcCcceeeECCCCCEeEEEEecccccCCceEEEEEEEEEEEcCCCCeEEEEEeccCCCCCCCCceeeeccc
Confidence 556678899999999987765432 22 22333 23554 4333333322 12345778666554
No 34
>KOG2675 consensus Adenylate cyclase-associated protein (CAP/Srv2p) [Cytoskeleton; Signal transduction mechanisms]
Probab=31.37 E-value=30 Score=32.15 Aligned_cols=9 Identities=11% Similarity=0.172 Sum_probs=4.4
Q ss_pred CCEEEEEEe
Q 029811 60 DNIMFKVKE 68 (187)
Q Consensus 60 G~~vf~V~g 68 (187)
-|+..|-.+
T Consensus 292 KNP~LR~~~ 300 (480)
T KOG2675|consen 292 KNPNLRATS 300 (480)
T ss_pred cChhhhccC
Confidence 455555444
No 35
>PRK12640 flgF flagellar basal body rod protein FlgF; Reviewed
Probab=29.67 E-value=50 Score=28.01 Aligned_cols=37 Identities=14% Similarity=0.217 Sum_probs=27.6
Q ss_pred CeEEEcCCCCEEEEEEeecCCCCCeEEEECCCCCeEEE
Q 029811 52 SFTVTDINDNIMFKVKEKHFSLHDKRTLLDPAGNPVVT 89 (187)
Q Consensus 52 ~F~I~D~~G~~vf~V~gk~~s~~~~~~l~D~~G~~L~~ 89 (187)
=|.|.+.+|+..|+=+|. |.+...-.|.+.+|.+|+.
T Consensus 87 FF~V~~~~G~~~yTR~G~-F~~d~~G~Lvt~~G~~vlg 123 (246)
T PRK12640 87 WLAVQAPDGSEAYTRNGS-LQVDANGQLRTANGLPVLG 123 (246)
T ss_pred EEEEEcCCCCEEEEeCCC-eeECCCCCEEcCCCCCccC
Confidence 466677888888887776 5566666788888888774
No 36
>PRK12817 flgG flagellar basal body rod protein FlgG; Reviewed
Probab=29.50 E-value=61 Score=27.57 Aligned_cols=37 Identities=19% Similarity=0.328 Sum_probs=28.3
Q ss_pred CeEEEcCCCCEEEEEEeecCCCCCeEEEECCCCCeEEE
Q 029811 52 SFTVTDINDNIMFKVKEKHFSLHDKRTLLDPAGNPVVT 89 (187)
Q Consensus 52 ~F~I~D~~G~~vf~V~gk~~s~~~~~~l~D~~G~~L~~ 89 (187)
=|.|.+.+|..+|+=+|. |.+...-.|.+++|.+|+.
T Consensus 98 fF~V~~~~G~~~yTR~G~-F~~d~~G~Lvt~~G~~vl~ 134 (260)
T PRK12817 98 FFRVIMADGTYAYTRAGN-FNIDSNGMLVDDNGNRLEI 134 (260)
T ss_pred EEEEEcCCCCeEEEeCCc-eeECCCCCEEcCCCCEEEe
Confidence 456677889888987777 4566666788889999885
No 37
>PRK12813 flgD flagellar basal body rod modification protein; Reviewed
Probab=28.23 E-value=64 Score=27.15 Aligned_cols=15 Identities=27% Similarity=0.277 Sum_probs=6.9
Q ss_pred eEEEECCCCCeEEEE
Q 029811 76 KRTLLDPAGNPVVTI 90 (187)
Q Consensus 76 ~~~l~D~~G~~L~~I 90 (187)
.+.|+|++|+.+.++
T Consensus 127 ~v~I~D~~G~vV~t~ 141 (223)
T PRK12813 127 ELVVRDAAGAEVARE 141 (223)
T ss_pred EEEEEcCCCCEEEEE
Confidence 344444444444443
No 38
>PRK12819 flgG flagellar basal body rod protein FlgG; Reviewed
Probab=27.96 E-value=75 Score=26.99 Aligned_cols=38 Identities=29% Similarity=0.397 Sum_probs=27.8
Q ss_pred CCeEEEcCCCCEEEEEEeecCCCCCeEEEECCCCCeEEE
Q 029811 51 GSFTVTDINDNIMFKVKEKHFSLHDKRTLLDPAGNPVVT 89 (187)
Q Consensus 51 ~~F~I~D~~G~~vf~V~gk~~s~~~~~~l~D~~G~~L~~ 89 (187)
+.|-+.+.+|...|+=+|. |.+...-.|.+.+|.+|+.
T Consensus 99 ~gFf~v~~~G~~~yTR~G~-F~~d~~G~Lvt~~G~~vlg 136 (257)
T PRK12819 99 SSFFVTSKNGETFLTRDGS-FTLNSDRYLQTASGAFVMG 136 (257)
T ss_pred CEEEEEcCCCCeeEeeCCC-eeECCCCCEEcCCCCEEec
Confidence 4677777788878887776 4566666788888888773
No 39
>PRK10523 lipoprotein involved with copper homeostasis and adhesion; Provisional
Probab=27.33 E-value=99 Score=26.30 Aligned_cols=36 Identities=22% Similarity=0.427 Sum_probs=0.0
Q ss_pred ceecCCCeEEEcCCCCE-EEEEEeecCCCCCeEEEECCCCCeE
Q 029811 46 MTLADGSFTVTDINDNI-MFKVKEKHFSLHDKRTLLDPAGNPV 87 (187)
Q Consensus 46 ~sls~~~F~I~D~~G~~-vf~V~gk~~s~~~~~~l~D~~G~~L 87 (187)
|...++.++..|++|+. -|+|.+.. +++.|.+|+++
T Consensus 91 w~~~~~~i~L~~~~g~~~yF~v~e~~------L~mLD~~G~~i 127 (234)
T PRK10523 91 WARTADKLVLTDSKGEKSYYRAKGDA------LEMLDREGNPI 127 (234)
T ss_pred EEecCCEEEEecCCCCEeEEEECCCE------EEEecCCCCcc
No 40
>KOG4375 consensus Scaffold protein Shank and related SAM domain proteins [Signal transduction mechanisms]
Probab=26.62 E-value=50 Score=28.64 Aligned_cols=25 Identities=12% Similarity=0.001 Sum_probs=20.3
Q ss_pred CCCCCccCCCCCCCCCCCCCCCceE
Q 029811 1 MAQQPVNVPAPTPPPNPAMYSNPVS 25 (187)
Q Consensus 1 ~~~~~~~~~~~~~~~~~~~~~~pv~ 25 (187)
|+||++-.++|.|++.|.|-.++..
T Consensus 52 ~~ie~~~~~~p~~sa~~~~~k~~gp 76 (272)
T KOG4375|consen 52 MGIESVGHRIDILSAIQSMKKQQGK 76 (272)
T ss_pred ccccCCCCCCCCCCCCCCccccCCC
Confidence 7899999888888888888777643
No 41
>PRK15244 virulence protein SpvB; Provisional
Probab=26.48 E-value=40 Score=32.37 Aligned_cols=19 Identities=47% Similarity=0.842 Sum_probs=14.9
Q ss_pred CCCccCCCCCCCCCCCCCC
Q 029811 3 QQPVNVPAPTPPPNPAMYS 21 (187)
Q Consensus 3 ~~~~~~~~~~~~~~~~~~~ 21 (187)
..|||.-.|+|||+|.|..
T Consensus 359 ~~~~~~~~~~~~~~~~~~~ 377 (591)
T PRK15244 359 RAPVNNMMPPPPPPPMMGG 377 (591)
T ss_pred cccCCCCCCCcccCcccCC
Confidence 3689988888888887755
No 42
>TIGR02150 IPP_isom_1 isopentenyl-diphosphate delta-isomerase, type 1. This model represents type 1 of two non-homologous families of the enzyme isopentenyl-diphosphate delta-isomerase (IPP isomerase). IPP is an essential building block for many compounds, including enzyme cofactors, sterols, and prenyl groups. This enzyme interconverts isopentenyl diphosphate and dimethylallyl diphosphate.
Probab=26.36 E-value=1.7e+02 Score=22.55 Aligned_cols=53 Identities=13% Similarity=0.031 Sum_probs=35.2
Q ss_pred eEEEcCCCCEEEEEEeecCCC-------CCeEEEECCCCCeEEEEEec-cccccccEEEEE
Q 029811 53 FTVTDINDNIMFKVKEKHFSL-------HDKRTLLDPAGNPVVTITEK-LFSAHEKHSVFR 105 (187)
Q Consensus 53 F~I~D~~G~~vf~V~gk~~s~-------~~~~~l~D~~G~~L~~Ir~K-~lsl~~~w~v~~ 105 (187)
+.|+|++|+++=++....... .-...|+|.+|+.|+.-|.. ...+-..|.+--
T Consensus 1 ~~~~d~~~~~~g~~~r~~~~~~~g~~h~~v~v~v~~~~g~vLl~kR~~~k~~~PG~W~~~~ 61 (158)
T TIGR02150 1 VILVDENDNPIGTASKAEVHLQETPLHRAFSVFLFNEEGQLLLQRRALSKITWPGVWTNSC 61 (158)
T ss_pred CEEECCCCCEeeeeeHHHhhhcCCCeEEEEEEEEEcCCCeEEEEeccCCCcCCCCCccccc
Confidence 358999999998887654321 22366788889888764433 234568888643
No 43
>COG5436 Predicted integral membrane protein [Function unknown]
Probab=26.06 E-value=1.8e+02 Score=23.48 Aligned_cols=16 Identities=25% Similarity=0.532 Sum_probs=9.4
Q ss_pred EEEECCCCCeEEEEEe
Q 029811 77 RTLLDPAGNPVVTITE 92 (187)
Q Consensus 77 ~~l~D~~G~~L~~Ir~ 92 (187)
+.++|.+|+.+|+|..
T Consensus 93 vsiyds~~nn~fS~ND 108 (182)
T COG5436 93 VSIYDSNGNNFFSIND 108 (182)
T ss_pred EEEEcCCCCceEEecc
Confidence 5566666666666543
No 44
>PRK12818 flgG flagellar basal body rod protein FlgG; Reviewed
Probab=25.85 E-value=76 Score=26.97 Aligned_cols=37 Identities=22% Similarity=0.301 Sum_probs=27.2
Q ss_pred CeEEEcCCCCEEEEEEeecCCCCCeEEEECCCCCeEEE
Q 029811 52 SFTVTDINDNIMFKVKEKHFSLHDKRTLLDPAGNPVVT 89 (187)
Q Consensus 52 ~F~I~D~~G~~vf~V~gk~~s~~~~~~l~D~~G~~L~~ 89 (187)
=|.|.+.+|+..|+=+|. |.+...-.|.+.+|.+|+-
T Consensus 102 FF~V~~~~G~~~YTR~G~-F~~d~~G~Lvt~~G~~vlg 138 (256)
T PRK12818 102 FFTVERNAGNNYYTRDGH-FHVDTQGYLVNDSGYYVLG 138 (256)
T ss_pred eEEEEcCCCCeEEeeCCC-eeECCCCCEEcCCCCEEec
Confidence 466677788878887777 4555666688888988874
No 45
>PF09475 Dot_icm_IcmQ: Dot/Icm secretion system protein (dot_icm_IcmQ); InterPro: IPR013365 Proteins in this entry are the IcmQ component of Dot/Icm secretion systems, as found in the obligate intracellular pathogens Legionella pneumophila and Coxiella burnetii. While this system resembles type IV secretion systems and has been called a form of type IV, the literature now seems to favor calling this the Dot/Icm system. This protein was shown to be essential for translocation.; PDB: 3FXE_A 3FXD_C.
Probab=25.07 E-value=24 Score=28.68 Aligned_cols=63 Identities=19% Similarity=0.355 Sum_probs=0.0
Q ss_pred cCccccCcCEEEEEEeeeceecCCCeEEEcCCCCEEEEEEe-ecCCCCCeEEEECCCCCeEEEEEeccccccc
Q 029811 28 GPQYCLPYPVDLAVVRKFMTLADGSFTVTDINDNIMFKVKE-KHFSLHDKRTLLDPAGNPVVTITEKLFSAHE 99 (187)
Q Consensus 28 ~~~~~a~~~~~L~vk~K~~sls~~~F~I~D~~G~~vf~V~g-k~~s~~~~~~l~D~~G~~L~~Ir~K~lsl~~ 99 (187)
+|-|-...-+.-.||.|--.++. .| ++.+|+. ..++++..+...|.-|+||++++.+-+.+.+
T Consensus 95 RPIY~nE~dvk~~IksKenk~NE-AY--------VaiyInq~dIl~~~~dk~~~Dk~GkpLltLkdrai~leN 158 (179)
T PF09475_consen 95 RPIYANEEDVKAAIKSKENKLNE-AY--------VAIYINQSDILSLSPDKIPTDKLGKPLLTLKDRAINLEN 158 (179)
T ss_dssp -------------------------------------------------------------------------
T ss_pred CCCcCCHHHHHHHHHhhhcccce-eE--------EEEEEchHhcccCCcccccccccCCcccccchhhcchhh
Confidence 55565566666777766444433 33 3344444 4688899999999999999999988655443
No 46
>COG3111 Periplasmic protein with OB-fold [Function unknown]
Probab=24.97 E-value=1.5e+02 Score=22.97 Aligned_cols=51 Identities=25% Similarity=0.344 Sum_probs=32.0
Q ss_pred CEEEEEEeeeceecCCCeEEEcCCCCEEEEEEeecCCCCCeEEEECCCCCeEEEEEeccc
Q 029811 36 PVDLAVVRKFMTLADGSFTVTDINDNIMFKVKEKHFSLHDKRTLLDPAGNPVVTITEKLF 95 (187)
Q Consensus 36 ~~~L~vk~K~~sls~~~F~I~D~~G~~vf~V~gk~~s~~~~~~l~D~~G~~L~~Ir~K~l 95 (187)
...-+|++ ..++.+ +=.|. ..|+++=++.+ +++.|+|.+|+--..|..+.|
T Consensus 43 ~~~~TV~~-Ak~~~D-da~V~-l~GnIv~qi~~------D~y~FrD~sGeI~VeIdd~~w 93 (128)
T COG3111 43 AKVTTVDQ-AKTLHD-DAWVS-LEGNIVRQIGD------DRYVFRDASGEINVDIDDKVW 93 (128)
T ss_pred cceeEHHH-hhcccc-CCeEE-EEeeEEEeeCC------ceEEEEcCCccEEEEeccccc
Confidence 44445543 334445 33343 36777666554 568899999988888887764
No 47
>COG2849 Uncharacterized protein conserved in bacteria [Function unknown]
Probab=24.90 E-value=2.3e+02 Score=23.62 Aligned_cols=56 Identities=14% Similarity=0.020 Sum_probs=42.1
Q ss_pred CCCeEEEcCCCCEEEEEEeecCCCCCeEEEECCCCCeEEEEEeccccccccEEEEE
Q 029811 50 DGSFTVTDINDNIMFKVKEKHFSLHDKRTLLDPAGNPVVTITEKLFSAHEKHSVFR 105 (187)
Q Consensus 50 ~~~F~I~D~~G~~vf~V~gk~~s~~~~~~l~D~~G~~L~~Ir~K~lsl~~~w~v~~ 105 (187)
++....+.++|++...|.=+--...+....+|.+|....++..+--.....+..|.
T Consensus 158 ~g~~k~yy~nGkl~~e~~~knG~~~G~~k~Y~enGkl~~e~~~kng~~~G~~~~yd 213 (230)
T COG2849 158 EGIAKTYYENGKLLSEVPYKNGKKNGVVKIYYENGKLVEEVTYKNGKLDGVVKEYD 213 (230)
T ss_pred cccEEEEcCCCcEEEeecccCCcccceEEEEccCCCEeEEEEecCCcccccEEEEe
Confidence 34778888999999988877655677888999999999999877633445555553
No 48
>PF15324 TALPID3: Hedgehog signalling target
Probab=24.77 E-value=49 Score=34.09 Aligned_cols=28 Identities=39% Similarity=0.753 Sum_probs=21.0
Q ss_pred CCccCCCCCCCCCCCCC-CCceEEecCcc
Q 029811 4 QPVNVPAPTPPPNPAMY-SNPVSIIGPQY 31 (187)
Q Consensus 4 ~~~~~~~~~~~~~~~~~-~~pv~vV~~~~ 31 (187)
-||..|.||||++|..+ .-++.|.-|+-
T Consensus 945 TPvpTPqpTPP~SP~s~~ke~~~vkTPds 973 (1252)
T PF15324_consen 945 TPVPTPQPTPPQSPPSPPKEPVLVKTPDS 973 (1252)
T ss_pred CCCCCCCCCCCCCCCCccccCCcccCCCC
Confidence 47888999999999664 66777766664
No 49
>cd00028 B_lectin Bulb-type mannose-specific lectin. The domain contains a three-fold internal repeat (beta-prism architecture). The consensus sequence motif QXDXNXVXY is involved in alpha-D-mannose recognition. Lectins are carbohydrate-binding proteins which specifically recognize diverse carbohydrates and mediate a wide variety of biological processes, such as cell-cell and host-pathogen interactions, serum glycoprotein turnover, and innate immune responses.
Probab=24.70 E-value=2.5e+02 Score=20.29 Aligned_cols=12 Identities=8% Similarity=0.293 Sum_probs=5.5
Q ss_pred CeEEEcCCCCEE
Q 029811 52 SFTVTDINDNIM 63 (187)
Q Consensus 52 ~F~I~D~~G~~v 63 (187)
.+.++|.+|.++
T Consensus 66 nLvl~~~~g~~v 77 (116)
T cd00028 66 NLVIYDGSGTVV 77 (116)
T ss_pred CeEEEcCCCcEE
Confidence 444444444443
No 50
>PF07680 DoxA: TQO small subunit DoxA; InterPro: IPR011636 Thiosulphate:quinone oxidoreductase (TQO) catalyses one of the early steps in elemental sulphur oxidation. A novel TQO enzyme was purified from the thermo-acidophilic archaeon Acidianus ambivalens and shown to consist of a large subunit (DoxD) and a smaller subunit (DoxA). The DoxD- and DoxA-like two subunits are fused together in a single polypeptide in Q8AAF0 from SWISSPROT.
Probab=24.70 E-value=1.3e+02 Score=23.43 Aligned_cols=28 Identities=21% Similarity=0.195 Sum_probs=22.2
Q ss_pred CCCCeEEEECCCCCeEEEEEeccccccc
Q 029811 72 SLHDKRTLLDPAGNPVVTITEKLFSAHE 99 (187)
Q Consensus 72 s~~~~~~l~D~~G~~L~~Ir~K~lsl~~ 99 (187)
++--...|.|.+|+.+++...+.++-.+
T Consensus 46 sfl~~i~l~d~~g~vv~~~~~~~L~~lP 73 (133)
T PF07680_consen 46 SFLIGIQLKDSTGHVVLNWDQEKLSSLP 73 (133)
T ss_pred ceeeEEEEECCCCCEEEEeCHHHhhhCC
Confidence 5566789999999999999887665444
No 51
>PF06788 UPF0257: Uncharacterised protein family (UPF0257); InterPro: IPR010646 This is a group of proteins of unknown function.; GO: 0005886 plasma membrane
Probab=24.56 E-value=1.7e+02 Score=24.93 Aligned_cols=38 Identities=18% Similarity=0.199 Sum_probs=25.2
Q ss_pred EEEcCCCCEEEEEEeecC--CCCCeEEEECCCCCeEEEEE
Q 029811 54 TVTDINDNIMFKVKEKHF--SLHDKRTLLDPAGNPVVTIT 91 (187)
Q Consensus 54 ~I~D~~G~~vf~V~gk~~--s~~~~~~l~D~~G~~L~~Ir 91 (187)
+++|++|++.++|++.+- +.=..+.+.|..-+.-+.|.
T Consensus 52 t~~de~g~v~~~v~~~l~~eGCfd~l~~~~~~~n~~~~Lv 91 (236)
T PF06788_consen 52 TLYDEDGEVTKRVSLTLSREGCFDTLELYDKENNTHLALV 91 (236)
T ss_pred EEEcCCCcEEEEEEEEECCccceeeeeecccccccceEEE
Confidence 789999999999999852 23344556665444444443
No 52
>PRK13245 hetR heterocyst differentiation control protein; Reviewed
Probab=24.41 E-value=22 Score=30.41 Aligned_cols=35 Identities=31% Similarity=0.369 Sum_probs=23.8
Q ss_pred CCceEEecCC---CceeeeeeecccccccccceeccCC
Q 029811 153 NDPVLFMQES---PTQLLPRCTKRKPLEATSLTRIDSP 187 (187)
Q Consensus 153 ~~~~i~~~~s---~~~~v~~i~~rk~~~~~~~~~~~~~ 187 (187)
+++...-+++ ..+.+|.-.+|++|-++..||||||
T Consensus 165 rsQed~p~~rrmpLSeAlaEHIkRRLlysgTVtrid~p 202 (299)
T PRK13245 165 RSQEDLPPEHRMPLSEALAEHIKRRLLYSGTVTRIDSP 202 (299)
T ss_pred hhhhcCChhccCchHHHHHHHHHHHHhhccceeeccCC
Confidence 4444444444 2344555449999999999999998
No 53
>smart00108 B_lectin Bulb-type mannose-specific lectin.
Probab=24.29 E-value=2.7e+02 Score=19.93 Aligned_cols=12 Identities=8% Similarity=0.288 Sum_probs=5.0
Q ss_pred CeEEEcCCCCEE
Q 029811 52 SFTVTDINDNIM 63 (187)
Q Consensus 52 ~F~I~D~~G~~v 63 (187)
.+.|+|.+|.++
T Consensus 65 nLvl~~~~g~~v 76 (114)
T smart00108 65 NLVLYDGDGRVV 76 (114)
T ss_pred CEEEEeCCCCEE
Confidence 444444444333
No 54
>PF13511 DUF4124: Domain of unknown function (DUF4124)
Probab=24.13 E-value=62 Score=20.69 Aligned_cols=18 Identities=11% Similarity=0.157 Sum_probs=12.9
Q ss_pred CeEEEcCCCCEEEEEEee
Q 029811 52 SFTVTDINDNIMFKVKEK 69 (187)
Q Consensus 52 ~F~I~D~~G~~vf~V~gk 69 (187)
=|.=.|++|+++|.=.-.
T Consensus 15 vYk~~D~~G~v~ysd~P~ 32 (60)
T PF13511_consen 15 VYKWVDENGVVHYSDTPP 32 (60)
T ss_pred EEEEECCCCCEEECccCC
Confidence 466678899988875533
No 55
>COG1021 EntE Peptide arylation enzymes [Secondary metabolites biosynthesis, transport, and catabolism]
Probab=23.93 E-value=45 Score=31.15 Aligned_cols=38 Identities=26% Similarity=0.455 Sum_probs=34.2
Q ss_pred CeEEEcCCCCEEEEEEeecCCCCCeEEEECCCCCeEEE
Q 029811 52 SFTVTDINDNIMFKVKEKHFSLHDKRTLLDPAGNPVVT 89 (187)
Q Consensus 52 ~F~I~D~~G~~vf~V~gk~~s~~~~~~l~D~~G~~L~~ 89 (187)
.||=.|..-+.|+.-+|+.+|-.+++.+.|++|+|+..
T Consensus 345 nyTRLDDp~E~i~~TQGrPlsP~DEvrvvD~dg~pv~p 382 (542)
T COG1021 345 NYTRLDDPPEIIIHTQGRPLSPDDEVRVVDADGNPVAP 382 (542)
T ss_pred cccccCCchHheeecCCCcCCCcceeEEecCCCCCCCC
Confidence 57778888899999999999999999999999998754
No 56
>PF08269 Cache_2: Cache domain; InterPro: IPR013163 Cache is an extracellular domain that is predicted to have a role in small-molecule recognition in a wide range of proteins, including the animal dihydropyridine-sensitive voltage-gated Ca2+ channel alpha-2delta subunit, and various bacterial chemotaxis receptors. The name Cache comes from CAlcium channels and CHEmotaxis receptors. This domain consists of an N-terminal part with three predicted strands and an alpha-helix, and a C-terminal part with a strand dyad followed by a relatively unstructured region. The N-terminal portion of the (unpermuted) Cache domain contains three predicted strands that could form a sheet analogous to that present in the core of the PAS domain structure. Cache domains are particularly widespread in bacteria, with Vibrio cholerae. The animal calcium channel alpha-2delta subunits might have acquired a part of their extracellular domains from a bacterial source. The Cache domain appears to have arisen from the GAF-PAS fold despite their divergent functions. This entry is composed of the type 2 Cache domain.; PDB: 2QHK_A 4EXO_A.
Probab=23.65 E-value=15 Score=25.80 Aligned_cols=42 Identities=17% Similarity=0.343 Sum_probs=17.7
Q ss_pred eecC-CCeEEEcCCCCEEEEEEeecCCCCCeEEEECCCCCeEE
Q 029811 47 TLAD-GSFTVTDINDNIMFKVKEKHFSLHDKRTLLDPAGNPVV 88 (187)
Q Consensus 47 sls~-~~F~I~D~~G~~vf~V~gk~~s~~~~~~l~D~~G~~L~ 88 (187)
.+.+ +=|.|+|.+|..++--...-+--.+-.-+.|.+|.+++
T Consensus 52 r~~~~gY~fi~d~~g~~l~hp~~p~~~G~n~~~~~D~~G~~~i 94 (95)
T PF08269_consen 52 RYGGDGYFFIYDMDGVVLAHPSNPELEGKNLSDLKDPNGKYLI 94 (95)
T ss_dssp -SBTTB--EEE-TTSBEEEESS-GGGTT-B-TT-B-TT--BHH
T ss_pred ccCCCCeEEEEeCCCeEEEcCCCcccCCcccccCCCCCCCEEe
Confidence 3443 34778888887766533222222344457788887764
No 57
>cd00004 Sortase Sortases are cysteine transpeptidases, found in gram-positive bacteria, that anchor surface proteins to peptidoglycans of the bacterial cell wall envelope. They do so by catalyzing a transpeptidation reaction in which the surface protein substrate is cleaved at a conserved cell wall sorting signal and covalently linked to peptidoglycan for display on the bacterial surface. Sortases are grouped into different classes and subfamilies based on sequence, membrane topology, genomic positioning, and cleavage site preference. The different classes are called Sortase A or SrtA (subfamily 1), B or SrtB (subfamily 2), C or SrtC (subfamily 3), D or SrtD (subfamilies 4 and 5), and E or SrtE. In two different sortase subfamilies, the N-terminus either functions as both a signal peptide for secretion and a stop-transfer signal for membrane anchoring, or it contains a signal peptide only and the C-terminus serves as a membrane anchor. Most gram-positive bacteria contain more than one s
Probab=23.04 E-value=1.6e+02 Score=21.66 Aligned_cols=18 Identities=22% Similarity=0.296 Sum_probs=8.9
Q ss_pred CeEEEcCCCCEEEEEEee
Q 029811 52 SFTVTDINDNIMFKVKEK 69 (187)
Q Consensus 52 ~F~I~D~~G~~vf~V~gk 69 (187)
.+.|++.++.-.|+|...
T Consensus 70 ~v~v~~~~~~~~Y~V~~~ 87 (128)
T cd00004 70 KIYLTDGGKTYVYKVTSI 87 (128)
T ss_pred EEEEEECCEEEEEEEEEE
Confidence 444455544455555544
No 58
>TIGR00156 conserved hypothetical protein TIGR00156. As of the last revision, this family consists only of two proteins from Escherichia coli and one from the related species Haemophilus influenzae.
Probab=22.85 E-value=77 Score=24.39 Aligned_cols=48 Identities=21% Similarity=0.317 Sum_probs=28.9
Q ss_pred EEEEeeeceecCCCeEEEcCCCCEEEEEEeecCCCCCeEEEECCCCCeEEEEEeccc
Q 029811 39 LAVVRKFMTLADGSFTVTDINDNIMFKVKEKHFSLHDKRTLLDPAGNPVVTITEKLF 95 (187)
Q Consensus 39 L~vk~K~~sls~~~F~I~D~~G~~vf~V~gk~~s~~~~~~l~D~~G~~L~~Ir~K~l 95 (187)
-++++ ...+.+|.+.+. .|+++=++. +.++.|.|.+|....+|.++.|
T Consensus 46 ~tV~~-a~~~~Ddt~V~L--~G~Iv~~l~------~d~Y~F~D~TG~I~VeId~~~w 93 (126)
T TIGR00156 46 MTVDF-AKSMHDGASVTL--RGNIISHIG------DDRYVFRDKSGEINVVIPAAVW 93 (126)
T ss_pred EeHHH-HhhCCCCCEEEE--EEEEEEEeC------CceEEEECCCCCEEEEECHHHc
Confidence 34443 334455444332 455444442 3558889999999999988765
No 59
>PF04076 BOF: Bacterial OB fold (BOF) protein; InterPro: IPR005220 Proteins in this entry have an OB-fold (oligonucleotide/oligosaccharide binding motif). Analysis of the predicted nucleotide-binding site of the OB-fold suggests that they lack nucleic acid-binding properties. They contain a predicted N-terminal signal peptide which indicates that they localise to the periplasm where they may function to bind proteins, small molecules, or other typical OB-fold ligands. As hypothesised for the distantly related OB-fold containing bacterial enterotoxins, the loss of nucleotide-binding function and the rapid evolution of the OB-fold ligand-binding site may be associated with the presence of members in mobile genetic elements and their potential role in bacterial pathogenicity [].; PDB: 1NNX_A.
Probab=22.54 E-value=67 Score=23.72 Aligned_cols=22 Identities=27% Similarity=0.448 Sum_probs=12.8
Q ss_pred CCeEEEECCCCCeEEEEEeccc
Q 029811 74 HDKRTLLDPAGNPVVTITEKLF 95 (187)
Q Consensus 74 ~~~~~l~D~~G~~L~~Ir~K~l 95 (187)
+.++.|.|.+|...+.|-.+.|
T Consensus 49 ~d~Y~F~D~TG~I~VeId~~~w 70 (103)
T PF04076_consen 49 DDKYLFRDATGEIEVEIDDDVW 70 (103)
T ss_dssp TTEEEEEETTEEEEEE--GGGS
T ss_pred CCEEEEECCCCcEEEEEChhhc
Confidence 4556666777766677666654
No 60
>TIGR02527 dot_icm_IcmQ Dot/Icm secretion system protein IcmQ. Members of this protein family are the IcmQ component of Dot/Icm secretion systems, as found in obligate intracellular pathogens Legionella pneumophila and Coxiella burnetii. While this system resembles type IV secretion systems and has been called a form of type IV, the literature now seems to favor calling this the Dot/Icm system. This protein was shown to be essential for translocation (PubMed:15661013).
Probab=22.50 E-value=43 Score=27.26 Aligned_cols=37 Identities=14% Similarity=0.289 Sum_probs=28.7
Q ss_pred EEEEEEee-cCCCCCeEEEECCCCCeEEEEEecccccc
Q 029811 62 IMFKVKEK-HFSLHDKRTLLDPAGNPVVTITEKLFSAH 98 (187)
Q Consensus 62 ~vf~V~gk-~~s~~~~~~l~D~~G~~L~~Ir~K~lsl~ 98 (187)
++.+|+.. .++++..+.-.|.-|+||++++.+-+.+.
T Consensus 120 VaiyI~q~dIl~~~~dk~p~Dk~GkpLltLKdkai~Le 157 (182)
T TIGR02527 120 VAIAIDQSDIIHLSADKAPKDKLGKLLLTLKDKAIKLE 157 (182)
T ss_pred EEEEEchHhcccCCcccCcccccCCcccccchhhhchh
Confidence 34455554 68899999999999999999998765443
No 61
>PF02974 Inh: Protease inhibitor Inh; InterPro: IPR021140 This entry represents the metalloprotease inhibitor I38, as well as the outer membrane lipoprotein Omp19. Peptide proteinase inhibitors can be found as single domain proteins or as single or multiple domains within proteins; these are referred to as either simple or compound inhibitors, respectively. In many cases they are synthesised as part of a larger precursor protein, either as a prepropeptide or as an N-terminal domain associated with an inactive peptidase or zymogen. This domain prevents access of the substrate to the active site. Removal of the N-terminal inhibitor domain either by interaction with a second peptidase or by autocatalytic cleavage activates the zymogen. Other inhibitors interact directly with proteinases using a simple noncovalent lock and key mechanism; while yet others use a conformational change-based trapping mechanism that depends on their structural and thermodynamic properties. This family of proteins represents monomeric serralysin inhibitors of about 125 residues, which interact with specific metalloproteases which are synthesised by serralysin secretors and characterised by being plant, insect and animal pathogens. It is probable that the serralysin inhibitors protect the host from proteolysis during export of the protease. The members of this family belong to MEROPS proteinase inhibitor family I38, clan IK. X-ray crystallography of a complex between the Serratia marcescens protease, SmaPI, and the inhibitor of Erwinia chrysanthemi, Inh, reveals that Inh is folded into an eight-stranded b-barrel with an N-terminal trunk of 10 residues. Residues 1-5 occupy part of the extended active site of the proteinase, thereby preventing access of the substrate. Residues 6-10 form a linker that connects the N-terminal proteinase-binding peptide to the body of the b-barrel.
The backbone carbonyl of Ser-1 interacts with the catalytic zinc; the Ser-2 side chain occupies the S1'-binding site and also forms a hydrogen bond to the carboxyl end of the catalytic Glu, whereas Leu-3 occupies the S2' recognition site. Penetration of the trunk region further than 5 residues into the substrate binding cleft appears to be prevented by the b-barrel, which itself interacts with the proteinase near its Met turn (19). Peptide mimetics of the trunk at concentrations up to about 100 mM do not inhibit the protease, demonstrating that the barrel is essential for inhibitory activity [, ]. Structurally and functionally these inhibitors are closely related to the lipocalins, fatty acid-binding proteins, avidins and the enigmatic triabin. Together these five protein families constitute the calycin superfamily []. The proteins are characterised by their high specificity for small hydrophobic molecules and by their ability to form complexes with soluble macromolecules either through intramolecular disulphides or protein-protein interactions []. ; PDB: 1JIW_I 2RN4_A 1SMP_I.
Probab=22.35 E-value=3e+02 Score=19.72 Aligned_cols=55 Identities=18% Similarity=0.210 Sum_probs=31.4
Q ss_pred cCccccCcCEEEEEEeeeceecCCCeEEEcCCCCEEEEEEeecCCCCCeEEEECCCCCeE
Q 029811 28 GPQYCLPYPVDLAVVRKFMTLADGSFTVTDINDNIMFKVKEKHFSLHDKRTLLDPAGNPV 87 (187)
Q Consensus 28 ~~~~~a~~~~~L~vk~K~~sls~~~F~I~D~~G~~vf~V~gk~~s~~~~~~l~D~~G~~L 87 (187)
.+.-|+... |.-....|...+++..++|++|+.|.+....- .+.+.-.-.+|.+|
T Consensus 41 ~~~~C~~~~--l~~~~~~W~~~gd~l~L~d~~G~~v~~f~~~~---~g~~~g~~~~g~~~ 95 (99)
T PF02974_consen 41 GDRGCAGKL--LAEVPAGWRPTGDGLVLTDADGSVVAFFYRSG---DGRFEGQTPDGQPL 95 (99)
T ss_dssp ESHHHHCCC--SSS--SEEEEETTEEEEE-TTS-EEEEEEEEC---TTEEEEEECCCEEE
T ss_pred CCCCcchhH--HhhCccceeEcCCEEEEECCCCCEEEEEEccC---CeeEEeEcCCCCEE
Confidence 344555543 21122346677889999999999998776653 33455555566444
No 62
>PF11398 DUF2813: Protein of unknown function (DUF2813); InterPro: IPR022602 This entry contains YbjD from Escherichia coli (strain K12), which is a conserved protein with a nucleotide triphosphate binding domain.
Probab=22.21 E-value=1.7e+02 Score=26.49 Aligned_cols=80 Identities=20% Similarity=0.331 Sum_probs=46.2
Q ss_pred CCCCceEEecCccccCcCEEEEEEeeeceecCCCeEEEcCCC--CEEEEEEeecC---CCCCeEEEECCCCCeEEEEE--
Q 029811 19 MYSNPVSIIGPQYCLPYPVDLAVVRKFMTLADGSFTVTDIND--NIMFKVKEKHF---SLHDKRTLLDPAGNPVVTIT-- 91 (187)
Q Consensus 19 ~~~~pv~vV~~~~~a~~~~~L~vk~K~~sls~~~F~I~D~~G--~~vf~V~gk~~---s~~~~~~l~D~~G~~L~~Ir-- 91 (187)
...++|.| .-.||...+-...-. ..-.+ . .+.+.|++| .+-|+|+|..- .+.-.+.+.|.+|++|-.=.
T Consensus 69 ~~~~~i~i-~~~F~e~~~~e~~~~-~~~~l-~-~~~~~~~dg~~rI~yRv~a~~~~~g~v~t~~~FLd~~G~~l~~~~~~ 144 (373)
T PF11398_consen 69 DQERHIQI-VLTFCESDPGEHNAR-RYRHL-S-PVWVPDDDGLQRIYYRVSAELDEDGDVETRRSFLDSDGNPLPLHNID 144 (373)
T ss_pred ccCceEEE-EEEecCCCCCccccc-cchhc-c-cceeECCCCCeeEEEEEEEEEcCCCCEEEEEeccCCCCCCcccCCHH
Confidence 34566777 445665444322111 11112 2 455666655 67789999865 56778889999999843322
Q ss_pred ---eccccccccEE
Q 029811 92 ---EKLFSAHEKHS 102 (187)
Q Consensus 92 ---~K~lsl~~~w~ 102 (187)
++++.+++-..
T Consensus 145 ~l~~~li~l~PVlR 158 (373)
T PF11398_consen 145 KLVRELIRLHPVLR 158 (373)
T ss_pred HHHHHHHhcCceEE
Confidence 23455666433
No 63
>PF09629 YorP: YorP protein; InterPro: IPR018591 This entry is represented by Bacteriophage SP-beta, YorP. The characteristics of the protein distribution suggest prophage matches in addition to the phage matches. YorP is a 71 residue protein. The structure is of an alpha helix between two of five beta strands. The function is unknown. ; PDB: 2HEQ_A.
Probab=21.85 E-value=1.6e+02 Score=20.01 Aligned_cols=27 Identities=15% Similarity=0.264 Sum_probs=16.6
Q ss_pred EEeeeceecCCCeEEEcCCCCEEEEEEe
Q 029811 41 VVRKFMTLADGSFTVTDINDNIMFKVKE 68 (187)
Q Consensus 41 vk~K~~sls~~~F~I~D~~G~~vf~V~g 68 (187)
|-+++-|+.= ||.|.|++|.+-|.-+.
T Consensus 33 IIe~l~S~~Y-DY~V~~~~GdI~~fKE~ 59 (71)
T PF09629_consen 33 IIEKLHSATY-DYAVSDETGDITRFKEH 59 (71)
T ss_dssp EEEE---SS--SEEEEETTS-EEEE-GG
T ss_pred hhhhhhhhee-eeeeecccCceeeeeec
Confidence 4457778877 99999999998876543
No 64
>cd05830 Sortase_D_5 Sortase D (SrtD) is a membrane transpeptidase found in gram-positive bacteria that anchors surface proteins to peptidoglycans of the bacterial cell wall envelope. This involves a transpeptidation reaction in which the surface protein substrate is cleaved at the cell wall sorting signal and covalently linked to peptidoglycan for display on the bacterial surface. Sortases are grouped into different classes and subfamilies based on sequence, membrane topology, genomic positioning, and cleavage site preference. Class D sortases are further classified into subfamilies 4 and 5. This group contains a subset of Class D sortases belonging to subfamily-5 represented by Streptomyces avermitilis SAV4337. Subfamily-5 sortases recognize a nonstandard sorting signal (LAXTG) and have replaced Sortase A in some gram-positive bacteria. They may play a housekeeping role in the cell.
Probab=21.51 E-value=1.7e+02 Score=22.14 Aligned_cols=21 Identities=10% Similarity=0.183 Sum_probs=15.2
Q ss_pred CCCeEEEcCCCCEEEEEEeec
Q 029811 50 DGSFTVTDINDNIMFKVKEKH 70 (187)
Q Consensus 50 ~~~F~I~D~~G~~vf~V~gk~ 70 (187)
|+.+.|+|.+|...|+|....
T Consensus 69 Gd~i~v~~~~~~~~Y~V~~~~ 89 (137)
T cd05830 69 GDKIVVETADGWYTYVVRSSE 89 (137)
T ss_pred CCEEEEEECCeEEEEEEeEEE
Confidence 457777777777778887763
No 65
>COG5033 TFG3 Transcription initiation factor IIF, auxiliary subunit [Transcription]
Probab=21.02 E-value=1.6e+02 Score=24.93 Aligned_cols=65 Identities=15% Similarity=0.295 Sum_probs=42.1
Q ss_pred CCCeEEEECCCCC-eEEEEEec-cccccccEEEEECCCCCCCCeEEEEEecccccCcceEEEEEecCCCC
Q 029811 73 LHDKRTLLDPAGN-PVVTITEK-LFSAHEKHSVFRGASTDAKDLLFTVGASSVLQLKTTLNVFWQVIPNK 140 (187)
Q Consensus 73 ~~~~~~l~D~~G~-~L~~Ir~K-~lsl~~~w~v~~g~~~~~~~~lftvk~~~~~~~k~k~~V~~~~~~~g 140 (187)
++-.+.+.|+.|+ .+.+|-+| .+.||++|. .-.-.-.++-|+|+...|--|--.+.|||.+....
T Consensus 35 th~w~v~v~~~g~E~~~~iv~KVifkLH~Tf~---NP~Rti~~pPFeI~EtGWGEF~i~I~iff~~~age 101 (225)
T COG5033 35 THIWLVFVRAPGKEDIATIVKKVIFKLHPTFS---NPTRTIESPPFEIKETGWGEFDIQIKIFFAEKAGE 101 (225)
T ss_pred hEEEEEEEeCCCCcchhhhhheeeEEeccccC---CCcccccCCCcEEEecccccceEEEEEEEecCCCc
Confidence 3445666676654 46777555 467888843 21111245679999888877778888888875543
No 66
>PF04170 NlpE: NlpE N-terminal domain; InterPro: IPR007298 This family represents a bacterial outer membrane lipoprotein that is necessary for signalling by the Cpx pathway []. This pathway responds to cell envelope disturbances and increases the expression of periplasmic protein folding and degradation factors. While the molecular function of the NlpE protein is unknown, it may be involved in detecting bacterial adhesion to abiotic surfaces. NlpE from Escherichia coli and Salmonella typhi is also known to confer copper tolerance in copper-sensitive strains of E. coli, and may be involved in copper efflux and delivery of copper to copper-dependent enzymes [].; PDB: 3LHN_A 2Z4I_B 2Z4H_A.
Probab=20.73 E-value=2.1e+02 Score=20.02 Aligned_cols=19 Identities=26% Similarity=0.310 Sum_probs=8.6
Q ss_pred CCeEEEECCCCC-eEEEEEe
Q 029811 74 HDKRTLLDPAGN-PVVTITE 92 (187)
Q Consensus 74 ~~~~~l~D~~G~-~L~~Ir~ 92 (187)
+..++|.+.+|. ..|.+..
T Consensus 51 ~~~i~L~~~~~~~~~f~v~~ 70 (87)
T PF04170_consen 51 GNIITLTDNNGDKRYFKVGE 70 (87)
T ss_dssp SSEEEEEETTTTCEEEEEET
T ss_pred CCEEEEecCCCCEEEEEECC
Confidence 445555454433 3444433
No 67
>cd03676 Nudix_hydrolase_3 Members of the Nudix hydrolase superfamily catalyze the hydrolysis of NUcleoside DIphosphates linked to other moieties, X. Enzymes belonging to this superfamily require a divalent cation, such as Mg2+ or Mn2+, for their activity and contain a highly conserved 23-residue nudix motif (GX5EX7REUXEEXGU, where U = I, L or V), which functions as a metal binding and catalytic site. Substrates of nudix hydrolases include intact and oxidatively damaged nucleoside triphosphates, dinucleoside polyphosphates, nucleotide-sugars and dinucleotide enzymes. These substrates are metabolites or cell signaling molecules that require regulation during different stages of the cell cycle or during periods of stress. In general, the role of the nudix hydrolase is to sanitize the nucleotide pools and to maintain cell viability, thereby serving as surveillance & "house-cleaning" enzymes. Substrate specificity is used to define families within the superfamily. Differences in substrate spe
Probab=20.72 E-value=4.1e+02 Score=20.62 Aligned_cols=59 Identities=10% Similarity=0.076 Sum_probs=31.2
Q ss_pred EEEECCCCCeEEEEEeccccccc----cEE--EEECCCCCCCCeEEEEEecccc-cCcceEEEEEec
Q 029811 77 RTLLDPAGNPVVTITEKLFSAHE----KHS--VFRGASTDAKDLLFTVGASSVL-QLKTTLNVFWQV 136 (187)
Q Consensus 77 ~~l~D~~G~~L~~Ir~K~lsl~~----~w~--v~~g~~~~~~~~lftvk~~~~~-~~k~k~~V~~~~ 136 (187)
+-++|.+|++++.+.|......+ .-. +|..+++..+..+++ ||+... .+-..++.+..|
T Consensus 7 ~~v~d~~~~~~~~~~r~~~~~~g~~h~~v~~~~~~~~~~~~~~l~lq-rRs~~K~~~Pg~wd~~~~G 72 (180)
T cd03676 7 YAVYGPFGEPLFEIERAASRLFGLVTYGVHLNGYVRDEDGGLRIWIP-RRSPTKATWPGMLDNLVAG 72 (180)
T ss_pred eeeECCCCCEeEEEEecccccCCceEEEEEEEEEEEcCCCCeEEEEE-eccCCCCCCCCceeeeccc
Confidence 45899999999988765443333 222 233332112344444 434432 234566666655
No 68
>PF08829 AlphaC_N: Alpha C protein N terminal; InterPro: IPR014933 The alpha C protein (ACP) is found in Streptococcus and acts as an invasin which plays a role in the internalisation and translocation of the organism across human epithelial surfaces. Group B Streptococcus is the leading cause of diseases including bacterial pneumonia, sepsis and meningitis. The N-terminal of ACP is associated with virulence and forms a beta sandwich and a three helix bundle [, , ]. ; PDB: 1YWM_A 2O0I_1.
Probab=20.54 E-value=31 Score=28.11 Aligned_cols=32 Identities=16% Similarity=0.167 Sum_probs=22.5
Q ss_pred CeEEEcCCCCEEEEEEeecCCCCCeEEEECCC
Q 029811 52 SFTVTDINDNIMFKVKEKHFSLHDKRTLLDPA 83 (187)
Q Consensus 52 ~F~I~D~~G~~vf~V~gk~~s~~~~~~l~D~~ 83 (187)
.|+|.|++|++++.-||+.-..+=.+.++|.+
T Consensus 92 tY~ild~~G~P~~k~DGQvdIvsvnlt~Ydst 123 (194)
T PF08829_consen 92 TYNILDEDGNPHVKSDGQVDIVSVNLTFYDST 123 (194)
T ss_dssp EEEEEETTSSB-B-TTSSB-EEEEEEEEE--H
T ss_pred EEEeecCCCCcccCCCCcEEEEEEEEEEeCcH
Confidence 58889999999999999976667778888853
No 69
>PF08495 FIST: FIST N domain; InterPro: IPR013702 The FIST N domain is a novel sensory domain, which is present in signal transduction proteins from Bacteria, Archaea and Eukarya. Chromosomal proximity of FIST-encoding genes to those coding for proteins involved in amino acid metabolism and transport suggest that FIST domains bind small ligands, such as amino acids [].
Probab=20.35 E-value=1.5e+02 Score=23.06 Aligned_cols=17 Identities=29% Similarity=0.573 Sum_probs=8.1
Q ss_pred CeEEEcCCCCEEEEEEe
Q 029811 52 SFTVTDINDNIMFKVKE 68 (187)
Q Consensus 52 ~F~I~D~~G~~vf~V~g 68 (187)
.|+|+.++|+.|+..++
T Consensus 180 ~~~VT~a~~~~I~eld~ 196 (198)
T PF08495_consen 180 PMTVTKAEGNIIYELDG 196 (198)
T ss_pred CEEEEEecCCEEEEECC
Confidence 44444444444444444
No 70
>PF13585 CHU_C: C-terminal domain of CHU protein family; PDB: 3EIF_A 1XF1_B.
Probab=20.26 E-value=91 Score=21.80 Aligned_cols=19 Identities=16% Similarity=0.165 Sum_probs=8.7
Q ss_pred CCeEEEECCCCCeEEEEEe
Q 029811 74 HDKRTLLDPAGNPVVTITE 92 (187)
Q Consensus 74 ~~~~~l~D~~G~~L~~Ir~ 92 (187)
.-++.|+|.-|+.+|+...
T Consensus 28 ~~~~~IynrwG~~Vf~~~~ 46 (87)
T PF13585_consen 28 NYSLTIYNRWGELVFESND 46 (87)
T ss_dssp EEEEEEE-SSS-EEEE---
T ss_pred eeEEEEEeCCCcEEEEECC
Confidence 3556666666666666553
Done!