Query 047026
Match_columns 596
No_of_seqs 184 out of 295
Neff 5.6
Searched_HMMs 46136
Date Fri Mar 29 06:52:55 2013
Command hhsearch -i /work/01045/syshi/csienesis_hhblits_a3m/047026.a3m -d /work/01045/syshi/HHdatabase/Cdd.hhm -o /work/01045/syshi/hhsearch_cdd/047026hhsearch_cdd -cpu 12 -v 0
No Hit Prob E-value P-value Score SS Cols Query HMM Template HMM
1 PF06045 Rhamnogal_lyase: Rham 100.0 1.1E-49 2.4E-54 388.2 13.5 160 1-160 44-203 (203)
2 PF14683 CBM-like: Polysacchar 100.0 5.6E-46 1.2E-50 356.5 12.0 164 399-587 1-167 (167)
3 PF09284 RhgB_N: Rhamnogalactu 100.0 3.1E-38 6.6E-43 312.6 17.6 190 35-284 53-245 (249)
4 PF14686 fn3_3: Polysaccharide 99.9 6.6E-22 1.4E-26 173.5 8.7 93 291-390 1-95 (95)
5 PF13620 CarboxypepD_reg: Carb 98.8 1.5E-08 3.3E-13 85.0 8.6 81 294-393 1-82 (82)
6 PF13715 DUF4480: Domain of un 98.5 1.3E-06 2.8E-11 74.6 11.2 88 294-406 1-88 (88)
7 cd03865 M14_CPE_H Peptidase M1 98.0 1.8E-05 3.9E-10 86.3 9.2 101 253-392 301-401 (402)
8 cd03864 M14_CPN Peptidase M14 98.0 3E-05 6.5E-10 84.5 10.6 101 253-392 291-391 (392)
9 cd03863 M14_CPD_II The second 98.0 2.5E-05 5.3E-10 84.7 9.9 79 292-393 296-374 (375)
10 cd06245 M14_CPD_III The third 97.8 9.7E-05 2.1E-09 79.8 9.7 76 293-393 287-362 (363)
11 cd03868 M14_CPD_I The first ca 97.6 0.00015 3.2E-09 78.5 8.5 76 292-391 295-371 (372)
12 cd03858 M14_CP_N-E_like Carbox 97.6 0.00024 5.2E-09 76.9 9.4 72 293-387 298-370 (374)
13 cd03867 M14_CPZ Peptidase M14- 97.3 0.00092 2E-08 73.1 8.9 72 293-387 318-391 (395)
14 cd03866 M14_CPM Peptidase M14 96.3 0.012 2.6E-07 64.1 8.5 70 292-382 294-363 (376)
15 PF08400 phage_tail_N: Prophag 95.4 0.075 1.6E-06 49.9 8.5 78 294-383 4-81 (134)
16 PRK15036 hydroxyisourate hydro 95.3 0.04 8.6E-07 52.0 6.2 66 291-367 25-94 (137)
17 PF03422 CBM_6: Carbohydrate b 94.6 0.26 5.6E-06 44.4 9.5 92 483-588 31-124 (125)
18 PF08308 PEGA: PEGA domain; I 94.6 0.25 5.5E-06 40.5 8.5 45 346-394 25-69 (71)
19 cd00421 intradiol_dioxygenase 93.6 0.16 3.5E-06 48.1 6.2 64 291-360 10-80 (146)
20 cd03869 M14_CPX_like Peptidase 93.4 0.19 4.1E-06 55.4 7.1 67 292-382 329-395 (405)
21 PF09430 DUF2012: Protein of u 93.1 0.52 1.1E-05 43.4 8.5 39 339-380 23-61 (123)
22 KOG1948 Metalloproteinase-rela 92.6 0.37 8E-06 56.8 8.1 55 293-366 316-371 (1165)
23 PF05738 Cna_B: Cna protein B- 92.3 0.58 1.3E-05 38.0 6.9 44 340-384 21-66 (70)
24 cd03463 3,4-PCD_alpha Protocat 92.0 0.32 6.9E-06 48.2 5.9 63 291-359 35-106 (185)
25 cd03459 3,4-PCD Protocatechuat 91.1 0.46 9.9E-06 45.9 5.9 64 291-360 14-87 (158)
26 PF07210 DUF1416: Protein of u 90.9 2.6 5.6E-05 36.6 9.5 61 291-368 6-66 (85)
27 COG3485 PcaH Protocatechuate 3 89.5 0.64 1.4E-05 47.5 5.6 65 291-361 71-144 (226)
28 TIGR02465 chlorocat_1_2 chloro 89.3 0.77 1.7E-05 47.5 6.1 64 291-360 97-165 (246)
29 TIGR02423 protocat_alph protoc 89.1 0.78 1.7E-05 45.7 5.8 64 291-360 38-111 (193)
30 PF07495 Y_Y_Y: Y_Y_Y domain; 88.9 0.58 1.3E-05 37.4 3.9 28 339-366 21-49 (66)
31 smart00606 CBD_IV Cellulose Bi 88.8 5 0.00011 36.4 10.5 90 482-587 38-129 (129)
32 PF00775 Dioxygenase_C: Dioxyg 88.8 0.94 2E-05 44.8 6.1 64 291-360 28-98 (183)
33 PF03170 BcsB: Bacterial cellu 88.1 1.4 3.1E-05 50.8 7.9 77 483-573 29-111 (605)
34 cd03464 3,4-PCD_beta Protocate 87.4 1.3 2.9E-05 45.0 6.3 65 290-360 63-137 (220)
35 TIGR02422 protocat_beta protoc 87.2 1.4 3.1E-05 44.8 6.4 67 288-360 56-132 (220)
36 cd03462 1,2-CCD chlorocatechol 86.0 1.2 2.6E-05 46.1 5.2 65 290-360 97-166 (247)
37 cd03458 Catechol_intradiol_dio 85.7 3 6.4E-05 43.5 7.9 65 290-360 102-171 (256)
38 KOG1948 Metalloproteinase-rela 84.6 2.4 5.2E-05 50.4 7.2 58 293-366 119-176 (1165)
39 cd03460 1,2-CTD Catechol 1,2 d 83.8 2 4.3E-05 45.3 5.7 65 290-360 122-191 (282)
40 TIGR02438 catachol_actin catec 83.1 2.3 5E-05 44.8 5.9 64 291-360 131-199 (281)
41 TIGR02439 catechol_proteo cate 82.6 2.5 5.5E-05 44.6 5.9 64 291-360 127-195 (285)
42 PF13364 BetaGal_dom4_5: Beta- 81.7 4.9 0.00011 36.3 6.7 54 499-569 50-104 (111)
43 PF10670 DUF4198: Domain of un 81.5 4.3 9.4E-05 39.7 6.9 62 292-364 150-211 (215)
44 PRK11114 cellulose synthase re 81.5 2.5 5.5E-05 50.3 6.1 74 486-572 84-163 (756)
45 cd03461 1,2-HQD Hydroxyquinol 81.5 3 6.5E-05 43.9 5.9 65 290-360 118-187 (277)
46 PF02837 Glyco_hydro_2_N: Glyc 79.6 4.8 0.0001 37.9 6.3 66 486-573 73-140 (167)
47 KOG2649 Zinc carboxypeptidase 77.9 7.1 0.00015 43.9 7.7 77 293-394 378-455 (500)
48 TIGR02962 hdxy_isourate hydrox 77.3 5.4 0.00012 36.5 5.5 51 309-365 12-67 (112)
49 PF03170 BcsB: Bacterial cellu 75.3 6.3 0.00014 45.6 6.8 78 482-572 323-409 (605)
50 PF00576 Transthyretin: HIUase 73.4 2.9 6.3E-05 38.2 2.8 52 309-365 12-68 (112)
51 PLN03059 beta-galactosidase; P 71.3 4.1 8.9E-05 48.8 4.1 84 485-574 623-716 (840)
52 cd05469 Transthyretin_like Tra 70.3 7 0.00015 35.9 4.5 52 309-365 12-67 (113)
53 cd05821 TLP_Transthyretin Tran 69.1 12 0.00026 34.8 5.8 64 292-365 6-73 (121)
54 cd03457 intradiol_dioxygenase_ 65.4 16 0.00035 36.3 6.3 62 293-359 27-100 (188)
55 cd05822 TLP_HIUase HIUase (5-h 63.6 17 0.00037 33.3 5.6 51 309-365 12-67 (112)
56 PF01060 DUF290: Transthyretin 61.4 17 0.00036 30.9 4.9 45 296-352 1-45 (80)
57 PF03944 Endotoxin_C: delta en 59.9 48 0.001 31.1 8.2 98 483-588 36-140 (143)
58 COG2351 Transthyretin-like pro 58.5 33 0.00072 31.9 6.4 66 293-373 9-79 (124)
59 PF14900 DUF4493: Domain of un 53.7 2.5E+02 0.0054 28.3 15.1 41 353-394 62-108 (235)
60 PF02369 Big_1: Bacterial Ig-l 53.3 77 0.0017 27.9 7.9 69 290-366 20-90 (100)
61 PF13754 Big_3_4: Bacterial Ig 51.2 32 0.00069 27.0 4.6 30 336-365 2-33 (54)
62 PF08531 Bac_rhamnosid_N: Alph 51.1 20 0.00044 34.7 4.2 53 509-573 12-66 (172)
63 smart00095 TR_THY Transthyreti 51.0 39 0.00084 31.5 5.7 62 293-364 4-69 (121)
64 PF09912 DUF2141: Uncharacteri 45.3 53 0.0012 29.8 5.7 48 315-365 12-62 (112)
65 PRK10340 ebgA cryptic beta-D-g 41.2 46 0.001 41.3 6.1 64 487-572 115-179 (1021)
66 PF11008 DUF2846: Protein of u 40.8 32 0.00069 31.1 3.6 44 344-387 56-99 (117)
67 TIGR03000 plancto_dom_1 Planct 39.1 2.3E+02 0.0049 24.4 8.0 39 347-387 30-73 (75)
68 PF07550 DUF1533: Protein of u 38.7 31 0.00067 28.2 2.8 19 552-570 36-55 (65)
69 PF12866 DUF3823: Protein of u 38.2 1.4E+02 0.0029 30.6 8.0 91 292-394 21-113 (222)
70 PRK09525 lacZ beta-D-galactosi 33.1 80 0.0017 39.3 6.3 64 487-572 126-191 (1027)
71 PF11797 DUF3324: Protein of u 32.8 52 0.0011 30.9 3.7 30 352-381 102-131 (140)
72 PF01190 Pollen_Ole_e_I: Polle 30.5 53 0.0011 28.6 3.1 37 310-351 18-54 (97)
73 PRK10150 beta-D-glucuronidase; 27.1 1.1E+02 0.0025 35.4 6.0 64 487-572 71-136 (604)
74 KOG0496 Beta-galactosidase [Ca 25.8 1.3E+02 0.0028 35.4 5.9 71 481-573 556-626 (649)
75 PRK13211 N-acetylglucosamine-b 23.8 3.7E+02 0.0081 30.7 9.0 67 284-365 320-388 (478)
76 PF13750 Big_3_3: Bacterial Ig 23.8 6.6E+02 0.014 24.1 10.1 27 484-510 2-29 (158)
77 PF13954 PapC_N: PapC N-termin 22.3 1.2E+02 0.0025 28.6 4.0 26 352-380 26-51 (146)
78 PF14200 RicinB_lectin_2: Rici 22.1 1.2E+02 0.0027 26.2 3.9 38 311-356 34-72 (105)
79 PF04571 Lipin_N: lipin, N-ter 21.6 1.2E+02 0.0025 28.0 3.7 38 475-523 35-72 (110)
80 COG4676 Uncharacterized protei 21.5 1.4E+02 0.0031 30.6 4.6 46 186-253 167-212 (268)
No 1
>PF06045 Rhamnogal_lyase: Rhamnogalacturonate lyase family; InterPro: IPR010325 Rhamnogalacturonate lyase degrades the rhamnogalacturonan I (RG-I) backbone of pectin. This family contains mainly members from plants, but also contains the plant pathogen Erwinia chrysanthemi.
Probab=100.00 E-value=1.1e-49 Score=388.21 Aligned_cols=160 Identities=53% Similarity=1.014 Sum_probs=157.7
Q ss_pred CCCccCCCCCCCCccEEEEEecCCCCccceeeccCceEEEEeccCCEEEEEEEeecCCCCCCCccccceeEEEEEecCcc
Q 047026 1 MDNLLDLKSSESSRGYWDINWNLPEGQDRYQLLNGGEYSVINMSNDSVEVSFRSSYDPSIQSTKLPLSVDIRYILRSGVS 80 (596)
Q Consensus 1 ~~~~l~~~~~~~~~gY~d~~w~~~~~~~~~~~~~gt~~~vi~~~~~~i~vs~~~~~~~~~~g~~~~l~l~~~~v~r~G~s 80 (596)
|+|||+..|++.+|||||++|+.+|.+++|++++||+|+||++++++|||||+++|+||++++.+||+||+||||++|+|
T Consensus 44 i~NLle~~n~e~nrGYwD~~W~~~G~~~~~~~~~gt~f~Vi~~te~qVevSF~r~w~~s~~~~~~plnIDkryVm~rG~S 123 (203)
T PF06045_consen 44 IDNLLEVANKENNRGYWDLVWNEPGSKGKFDRIKGTEFSVIEQTEEQVEVSFSRTWDPSLDGKSVPLNIDKRYVMLRGSS 123 (203)
T ss_pred EehhhcccCcccCCceEEEecccCCccccccccCCcEEEEEEcCCCeEEEEEEcccCcCCCCCcceeEeeEEEEEecCCc
Confidence 58999999999999999999999999899999999999999999999999999999999999999999999999999999
Q ss_pred eeEEEEeecCCCCCCCCCCCceEEEEEcCCCCCccceecccccccCCCCCCCCCCCcceeeeeceEEeecCCCCCCCceE
Q 047026 81 GFHCYSIYERPPGCRAFDLAQTRLAFKLRRDKFHYMAITDAKQRIMPLPEDLLPGRGKQLIVPESVLLVNPINPDLKGEV 160 (596)
Q Consensus 81 giY~y~~~~~~~~~p~~~lge~R~v~Rl~~~~f~~~~~~d~r~~~~P~~~d~~~~~~~~l~~~eav~l~~~~~~~~~G~~ 160 (596)
|||+|+|++|+++||+++|+|+|+||||++++|++||++|+||+.||+|+||++++|++|+|||||+|++|+||+++|||
T Consensus 124 GfY~YAI~e~~~~~Pa~~l~q~R~vfKl~~d~F~ymai~d~rqr~mP~~~D~~~~~~~~l~y~eav~l~~p~~~~~~gev 203 (203)
T PF06045_consen 124 GFYSYAIFEHPAGWPAFDLGQTRIVFKLNKDKFHYMAISDDRQRIMPSPDDRDPARGQPLAYPEAVLLVNPINPQFRGEV 203 (203)
T ss_pred eEEEEEEEecCCCCCCcccceeEEEEECCccccceEEecccccccCCChHHccccCCCcccCchhhhcCCCCCccccccC
Confidence 99999999999999999999999999999999999999999999999999999999999999999999999999999986
No 2
>PF14683 CBM-like: Polysaccharide lyase family 4, domain III; PDB: 1NKG_A 2XHN_B 3NJX_A 3NJV_A.
Probab=100.00 E-value=5.6e-46 Score=356.48 Aligned_cols=164 Identities=44% Similarity=0.784 Sum_probs=115.7
Q ss_pred CceEEEeccCCCCcceecCCCCcccccccccCCchhhhcccccccccccCCCCCeeEEeeccCCCCCeeEEEEeecCCCC
Q 047026 399 PTVWEIGFPDRTALGCYVPDVNPMYVNKLFLNSPEKYRQYGLWDRYTDVHPESDQFFTVGVNDPKKDWFFAHVDRRGPDN 478 (596)
Q Consensus 399 ~~LweIG~~Drta~~F~~~d~~~~~~n~~~~~hp~~~R~yglW~~y~~~~P~~dl~ytVG~S~~~~Dw~ya~~~~~~~~~ 478 (596)
++|||||+|||+|.||+++| |+++|||+ |++|+++||++|++|+||+| +++||||||++++
T Consensus 1 ~~iW~IG~~Drta~eF~~~~-------------~~~~r~~~-~~d~~~~~p~~~~~ytVG~S-~~~Dw~y~~~~~~---- 61 (167)
T PF14683_consen 1 PTIWQIGTPDRTAAEFRNGD-------------PDKYRQYG-WSDYSRDFPWEDLTYTVGSS-PAKDWPYAQWGRV---- 61 (167)
T ss_dssp SEEEEEE-SSSS-TTSBTHH--------------HHTTS---TT--TTS----S-EEETTTS--GGGSBSEEETTT----
T ss_pred CcceEeCCCCCCchhhccCC-------------hhhhhhcC-cccchhhCCCCCCEEEEccC-cccCCcEEEEecc----
Confidence 58999999999999999873 25699998 99999999998999999999 8899999999984
Q ss_pred CCCCccEEEEEEeCCCc-cceEEEEEEEecc-CCCeeEEEEcCccCCcccccccccCCCCeeeeeeEE-EeeEEEEEEee
Q 047026 479 KYLPTTWTIKFHLDSII-KGTYNLRLAIASA-TRSDLEIFVNYIDQGHLVYQEMNLGMDNTVCRHGIH-GLYQLFSIHVS 555 (596)
Q Consensus 479 ~~~~~~w~I~F~L~~~~-~~~~tLriala~a-~~~~~~V~vN~~~~~~p~~~~~~~~~d~~i~R~~~~-G~~~~~~~~ip 555 (596)
+++|+|+|+|++++ .+++||||+||+| ++++++|+|||+....| ...+++|++++|+|+| |+|++++|+||
T Consensus 62 ---~~~w~I~F~l~~~~~~~~~tL~i~la~a~~~~~~~V~vNg~~~~~~---~~~~~~d~~~~r~g~~~G~~~~~~~~ip 135 (167)
T PF14683_consen 62 ---NGTWTIKFDLDAVQLAGTYTLRIALAGASAGGRLQVSVNGWSGPFP---SAPFGNDNAIYRSGIHRGNYRLYEFDIP 135 (167)
T ss_dssp ---S--EEEEEEE-GGG-S--EEEEEEEEEEETT-EEEEEETTEE--------------S--GGGT---S---EEEEEE-
T ss_pred ---CCCEEEEEECCCCccCCcEEEEEEeccccCCCCEEEEEcCccCCcc---ccccCCCCceeeCceecccEEEEEEEEc
Confidence 59999999999999 5699999999999 89999999999666422 3467899999999998 99999999999
Q ss_pred cCceeeeccEEEEEEeecCCCCceEEEEEEEE
Q 047026 556 SLLLIKGDNSMFLVQSRSGDPVCGVLYDYLRL 587 (596)
Q Consensus 556 a~~L~~G~NtI~l~~~~g~s~~~~vmyD~IrL 587 (596)
+++|++|+|+|+|++++|++.+.|||||||||
T Consensus 136 a~~L~~G~Nti~lt~~~gs~~~~gvmyD~I~L 167 (167)
T PF14683_consen 136 ASLLKAGENTITLTVPSGSGLSPGVMYDYIRL 167 (167)
T ss_dssp TTSS-SEEEEEEEEEE-S-GGSSEEEEEEEEE
T ss_pred HHHEEeccEEEEEEEccCCCccCeEEEEEEEC
Confidence 99999999999999999987777999999998
No 3
>PF09284 RhgB_N: Rhamnogalacturonase B, N-terminal; InterPro: IPR015364 This domain is found in the prokaryotic enzyme rhamnogalacturonase B. It adopts a structure consisting of a beta supersandwich, with eighteen strands in two beta-sheets. The exact function of the domain is unknown, but a putative role is carbohydrate binding; GO: 0016837 carbon-oxygen lyase activity, acting on polysaccharides, 0030246 carbohydrate binding, 0005975 carbohydrate metabolic process; PDB: 1NKG_A 2XHN_B 3NJX_A 3NJV_A.
Probab=100.00 E-value=3.1e-38 Score=312.57 Aligned_cols=190 Identities=19% Similarity=0.308 Sum_probs=120.7
Q ss_pred CceEEEEeccCCEEEEEEEeecCCCCCCCccccceeEEEEEecCcceeEEEEeecCCCCCCCCCCCceEEEEEcCCCCCc
Q 047026 35 GGEYSVINMSNDSVEVSFRSSYDPSIQSTKLPLSVDIRYILRSGVSGFHCYSIYERPPGCRAFDLAQTRLAFKLRRDKFH 114 (596)
Q Consensus 35 gt~~~vi~~~~~~i~vs~~~~~~~~~~g~~~~l~l~~~~v~r~G~sgiY~y~~~~~~~~~p~~~lge~R~v~Rl~~~~f~ 114 (596)
|+...-+.+..++|+|+|+.. +|+||||+|+|++.|||+++.+. +.+|+|||+|+||++++||
T Consensus 53 GsatVs~~~~~~~IkVt~~~~------------tLthyyv~r~g~~~IYmaT~~~~-----e~~igelRfIaRL~~~~lp 115 (249)
T PF09284_consen 53 GSATVSITTSGDYIKVTCKTG------------TLTHYYVARPGENNIYMATYITA-----EPSIGELRFIARLNRSILP 115 (249)
T ss_dssp SS-EEEEEEETTEEEEEEE-S------------SEEEEEEEETT--EEEEEEEESS-------TTS-EEEEEEE-TTTS-
T ss_pred CccEEEEEeeCCEEEEEEEcC------------CeEEEEEEecCCceEEEEeccCC-----CCCccceEEEEEcccccCC
Confidence 444445666677999999984 79999999999999999999877 5589999999999999999
Q ss_pred cceecccccccCCCCCCCCCCCcceeeeeceEEeecCCCCCCCceEeeceecccccccCceEEEEeeCCceEEEEEcCCC
Q 047026 115 YMAITDAKQRIMPLPEDLLPGRGKQLIVPESVLLVNPINPDLKGEVDDKYQYSMDNKDGGLHGWISSGPIIGFWIIFPSH 194 (596)
Q Consensus 115 ~~~~~d~r~~~~P~~~d~~~~~~~~l~~~eav~l~~~~~~~~~G~~~sKY~~s~~~~D~~vhG~~s~g~~vG~W~I~~s~ 194 (596)
+-. .. .. .++ ..|...++.+.|++.. +|+++||||++.+++|+++||+ +|+++|+|||++++
T Consensus 116 n~~-~~---~~---~~~---~~g~taIEgsDVf~~~------~G~TrSKfYSs~r~IDd~~hgv--~g~~vgv~mi~~~~ 177 (249)
T PF09284_consen 116 NEY-PY---GD---VST---TDGGTAIEGSDVFLVS------DGQTRSKFYSSQRFIDDDVHGV--SGSAVGVYMIMSNY 177 (249)
T ss_dssp EEE-TT---GG---GG-----TT-EEEETTTEEEE-------TTEEEEGGGG--BGGG-SEEEE--E-SS-EEEEE----
T ss_pred CCC-Cc---cc---ccc---cCCceEEeeccEEEec------CceEeeeeccccceeccceEEE--ecCCeEEEEEeCCc
Confidence 821 11 11 111 1333344555576653 6999999999999999999997 78899999999999
Q ss_pred CcccCCcceecccccCCc---cEEEEEeecccccCceeeccccCcccceeeceEEEEEcCCCCCcccchhHHHHHHHHHh
Q 047026 195 EFRNGGPTKQNLTVHTGP---TCLAMFHGTHYIGNEILAHFQEGEAWRKVFGPIFVYLNSTSDASKAYNLWIDAKKQRLL 271 (596)
Q Consensus 195 E~~sGGPlkqdL~~h~g~---~~l~y~~s~Hy~g~~~~~~~~~Ge~w~kv~GP~~~y~N~g~~~~~~~~l~~DA~~~~~~ 271 (596)
|.+|||||+|||++|.++ .||+||+|+|.++|+ +|.| +||||+|+|++|++|+. .
T Consensus 178 E~SSGGPFfRDI~~~~~~~~~~Ly~ymnSgH~qTE~----~R~G-----LhGPYaL~FT~g~~Ps~-----~-------- 235 (249)
T PF09284_consen 178 EKSSGGPFFRDINTNNGGDGNELYNYMNSGHTQTEP----YRMG-----LHGPYALAFTDGGAPSA-----S-------- 235 (249)
T ss_dssp TT-SS-TT-B---EEE-SS-EEEEEEEE-STT--S--------E-----EEEEEEEEEESS----S--------------
T ss_pred cccCCCCchhhhhhccCCccceeeeeEecCcccCch----hccc-----cCCceEEEEcCCCCCCC-----c--------
Confidence 999999999999999765 599999999999987 5685 99999999999999863 1
Q ss_pred hhccCCCcCCCCC
Q 047026 272 QEAAWPYDFVSSP 284 (596)
Q Consensus 272 E~~~wpysf~~s~ 284 (596)
+-+++|+++-
T Consensus 236 ---~~D~sff~~L 245 (249)
T PF09284_consen 236 ---DLDTSFFDDL 245 (249)
T ss_dssp -------GGGGGT
T ss_pred ---cccccchhhc
Confidence 2478999863
No 4
>PF14686 fn3_3: Polysaccharide lyase family 4, domain II; PDB: 1NKG_A 2XHN_B 3NJX_A 3NJV_A.
Probab=99.86 E-value=6.6e-22 Score=173.55 Aligned_cols=93 Identities=43% Similarity=0.794 Sum_probs=54.0
Q ss_pred CceeEEEEEEeeecccccCCCCc-ceeEEEEcCCCCCCCcccccccceeEEEECCccceEeCCccCCeeEEEEEEcceee
Q 047026 291 ERGSATGRFFVQDKFVSSSLIPA-KYAYIGLSSARTEGGWQTESKDYQFWVQTDSKGNFTVKNVVPGVYGLHGWVPGFIG 369 (596)
Q Consensus 291 ~RGtVsG~v~~~D~~~~~~~~pa-~~a~V~L~~~~~~g~wq~~~~~yqywt~td~~G~FtI~nVrpGtY~L~a~~~G~~G 369 (596)
+||+|+|+|+++|.+. ..++ ..++|+|+.+++ ++ ++++||||++||++|+|+|+|||||+|+|+||++|++|
T Consensus 1 ~RG~VsG~l~l~dg~~---~~~~~~~~~Vgl~~~~d---~~-q~~~yqYwt~td~~G~Fti~~V~pGtY~L~ay~~g~~g 73 (95)
T PF14686_consen 1 QRGSVSGRLTLSDGVT---NPPAGANAVVGLAPPGD---FQ-QNKGYQYWTRTDSDGNFTIPNVRPGTYRLYAYADGIFG 73 (95)
T ss_dssp G-BEEEEEEE---SS-----TT--S-EEEEEE------------SS-EEEEE--TTSEEE---B-SEEEEEEEEE----T
T ss_pred CCCEEEEEEEEccCcc---cCccceeEEEEeeeccc---cc-cCCCCcEEEEeCCCCcEEeCCeeCcEeEEEEEEecccC
Confidence 5999999999988543 2444 678999998886 44 59999999999999999999999999999999999999
Q ss_pred eEee-eeEEEEeCCCeeeecce
Q 047026 370 DYLD-KALVTISAGSQTELGNL 390 (596)
Q Consensus 370 ~~~~-~~~VtV~aG~t~~l~~l 390 (596)
++.. +.+|+|++|++++|++|
T Consensus 74 ~~~~~~~~ItV~~g~~~~lg~~ 95 (95)
T PF14686_consen 74 DYKVASDSITVSGGTTTDLGDL 95 (95)
T ss_dssp TEEEEEEEEEE-T-EEE-----
T ss_pred ceEEecceEEEcCCcEeccccC
Confidence 9986 78899999999988764
No 5
>PF13620 CarboxypepD_reg: Carboxypeptidase regulatory-like domain; PDB: 3MN8_D 3P0D_I 3KCP_A 2B59_B 1UWY_A 1H8L_A 1QMU_A 2NSM_A.
Probab=98.83 E-value=1.5e-08 Score=85.00 Aligned_cols=81 Identities=26% Similarity=0.388 Sum_probs=59.5
Q ss_pred eEEEEEEeeecccccCCCCcceeEEEEcCCCCCCCcccccccceeEEEECCccceEeCCccCCeeEEEEEEcceeeeEee
Q 047026 294 SATGRFFVQDKFVSSSLIPAKYAYIGLSSARTEGGWQTESKDYQFWVQTDSKGNFTVKNVVPGVYGLHGWVPGFIGDYLD 373 (596)
Q Consensus 294 tVsG~v~~~D~~~~~~~~pa~~a~V~L~~~~~~g~wq~~~~~yqywt~td~~G~FtI~nVrpGtY~L~a~~~G~~G~~~~ 373 (596)
+|+|+|...+ +.|..+|.|.|..... +..+-+.||++|+|.|++++||+|+|.+...|+. ..
T Consensus 1 tI~G~V~d~~------g~pv~~a~V~l~~~~~---------~~~~~~~Td~~G~f~~~~l~~g~Y~l~v~~~g~~---~~ 62 (82)
T PF13620_consen 1 TISGTVTDAT------GQPVPGATVTLTDQDG---------GTVYTTTTDSDGRFSFEGLPPGTYTLRVSAPGYQ---PQ 62 (82)
T ss_dssp -EEEEEEETT------SCBHTT-EEEET--TT---------TECCEEE--TTSEEEEEEE-SEEEEEEEEBTTEE----E
T ss_pred CEEEEEEcCC------CCCcCCEEEEEEEeeC---------CCEEEEEECCCceEEEEccCCEeEEEEEEECCcc---eE
Confidence 6899999765 8999999999975432 4567899999999999999999999999999865 33
Q ss_pred e-eEEEEeCCCeeeecceEEe
Q 047026 374 K-ALVTISAGSQTELGNLTYV 393 (596)
Q Consensus 374 ~-~~VtV~aG~t~~l~~l~~~ 393 (596)
. ..|+|.+|++..+ +|+++
T Consensus 63 ~~~~v~v~~~~~~~~-~i~L~ 82 (82)
T PF13620_consen 63 TQENVTVTAGQTTTV-DITLE 82 (82)
T ss_dssp EEEEEEESSSSEEE---EEEE
T ss_pred EEEEEEEeCCCEEEE-EEEEC
Confidence 3 3599999998887 57663
No 6
>PF13715 DUF4480: Domain of unknown function (DUF4480)
Probab=98.51 E-value=1.3e-06 Score=74.57 Aligned_cols=88 Identities=22% Similarity=0.294 Sum_probs=68.8
Q ss_pred eEEEEEEeeecccccCCCCcceeEEEEcCCCCCCCcccccccceeEEEECCccceEeCCccCCeeEEEEEEcceeeeEee
Q 047026 294 SATGRFFVQDKFVSSSLIPAKYAYIGLSSARTEGGWQTESKDYQFWVQTDSKGNFTVKNVVPGVYGLHGWVPGFIGDYLD 373 (596)
Q Consensus 294 tVsG~v~~~D~~~~~~~~pa~~a~V~L~~~~~~g~wq~~~~~yqywt~td~~G~FtI~nVrpGtY~L~a~~~G~~G~~~~ 373 (596)
+|+|+|...+ ++.|..+|.|.+.... ..+.||++|.|+|. +++|+|+|.++..|+. ..
T Consensus 1 ti~G~V~d~~-----t~~pl~~a~V~~~~~~-------------~~~~Td~~G~F~i~-~~~g~~~l~is~~Gy~---~~ 58 (88)
T PF13715_consen 1 TISGKVVDSD-----TGEPLPGATVYLKNTK-------------KGTVTDENGRFSIK-LPEGDYTLKISYIGYE---TK 58 (88)
T ss_pred CEEEEEEECC-----CCCCccCeEEEEeCCc-------------ceEEECCCeEEEEE-EcCCCeEEEEEEeCEE---EE
Confidence 5899998764 4799999999997443 37889999999999 9999999999999765 55
Q ss_pred eeEEEEeCCCeeeecceEEeeCCCCCceEEEec
Q 047026 374 KALVTISAGSQTELGNLTYVPLRNGPTVWEIGF 406 (596)
Q Consensus 374 ~~~VtV~aG~t~~l~~l~~~~~~~g~~LweIG~ 406 (596)
...|.+..+....+ ++.+.+ +..+|-||.+
T Consensus 59 ~~~i~~~~~~~~~~-~i~L~~--~~~~L~eVvV 88 (88)
T PF13715_consen 59 TITISVNSNKNTNL-NIYLEP--KSNQLDEVVV 88 (88)
T ss_pred EEEEEecCCCEEEE-EEEEee--CcccCCeEEC
Confidence 55677766655566 577766 4667877753
No 7
>cd03865 M14_CPE_H Peptidase M14 Carboxypeptidase (CP) E (CPE, also known as carboxypeptidase H, and enkephalin convertase; EC 3.4.17.10) belongs to the N/E subfamily of the M14 family of metallocarboxypeptidases (MCPs). The M14 family are zinc-binding CPs which hydrolyze single, C-terminal amino acids from polypeptide chains, and have a recognition site for the free C-terminal carboxyl group, which is a key determinant of specificity. CPE is an important enzyme responsible for the proteolytic processing of prohormone intermediates (such as pro-insulin, pro-opiomelanocortin, or pro-gonadotropin-releasing hormone) by specifically removing C-terminal basic residues. In addition, it has been proposed that the regulated secretory pathway (RSP) of the nervous and endocrine systems utilizes membrane-bound CPE as a sorting receptor. A naturally occurring point mutation in CPE reduces the stability of the enzyme and causes its degradation, leading to an accumulation of numerous neuroendocrine pe
Probab=98.01 E-value=1.8e-05 Score=86.32 Aligned_cols=101 Identities=18% Similarity=0.290 Sum_probs=75.6
Q ss_pred CCCcccchhHHHHHHHHHhhhccCCCcCCCCCCCCCCCCceeEEEEEEeeecccccCCCCcceeEEEEcCCCCCCCcccc
Q 047026 253 SDASKAYNLWIDAKKQRLLQEAAWPYDFVSSPYYLTANERGSATGRFFVQDKFVSSSLIPAKYAYIGLSSARTEGGWQTE 332 (596)
Q Consensus 253 ~~~~~~~~l~~DA~~~~~~E~~~wpysf~~s~~y~~~~~RGtVsG~v~~~D~~~~~~~~pa~~a~V~L~~~~~~g~wq~~ 332 (596)
+..+....+|.+-|.-+.+ |.....|| |+|+|+... +.|..+|.|.+....
T Consensus 301 P~~~~L~~~W~~n~~all~--------------~~~q~~~g-I~G~V~D~~------g~pI~~AtV~V~g~~-------- 351 (402)
T cd03865 301 PPEETLKQYWEDNKNSLVN--------------YIEQVHRG-VKGFVKDLQ------GNPIANATISVEGID-------- 351 (402)
T ss_pred CCHHHHHHHHHHHHHHHHH--------------HHHHhccc-eEEEEECCC------CCcCCCeEEEEEcCc--------
Confidence 3334566688887765432 22223477 999998753 688899999997433
Q ss_pred cccceeEEEECCccceEeCCccCCeeEEEEEEcceeeeEeeeeEEEEeCCCeeeecceEE
Q 047026 333 SKDYQFWVQTDSKGNFTVKNVVPGVYGLHGWVPGFIGDYLDKALVTISAGSQTELGNLTY 392 (596)
Q Consensus 333 ~~~yqywt~td~~G~FtI~nVrpGtY~L~a~~~G~~G~~~~~~~VtV~aG~t~~l~~l~~ 392 (596)
..+.||.+|.|.+ .++||+|+|+|.+.|+. .....|+|.+|+++.+ ++++
T Consensus 352 -----~~~~T~~~G~Y~~-~L~pG~Ytv~vsa~Gy~---~~~~~V~V~~~~~~~v-df~L 401 (402)
T cd03865 352 -----HDITSAKDGDYWR-LLAPGNYKLTASAPGYL---AVVKKVAVPYSPAVRV-DFEL 401 (402)
T ss_pred -----cccEECCCeeEEE-CCCCEEEEEEEEecCcc---cEEEEEEEcCCCcEEE-eEEe
Confidence 2568999999998 89999999999999876 4557799999988777 4665
No 8
>cd03864 M14_CPN Peptidase M14 Carboxypeptidase N (CPN, also known as kininase I, creatine kinase conversion factor, plasma carboxypeptidase B, arginine carboxypeptidase, and protaminase; EC 3.4.17.3) is an extracellular glycoprotein synthesized in the liver and released into the blood, where it is present in high concentrations. CPN belongs to the N/E subfamily of the M14 family of metallocarboxypeptidases (MCPs). The M14 family are zinc-binding carboxypeptidases (CPs) which hydrolyze single, C-terminal amino acids from polypeptide chains, and have a recognition site for the free C-terminal carboxyl group, which is a key determinant of specificity. CPN plays an important role in protecting the body from excessive buildup of potentially deleterious peptides that normally act as local autocrine or paracrine hormones. It specifically removes C-terminal basic residues. As CPN can cleave lysine more avidly than arginine residues it is also called lysine carboxypeptidase. CPN substrates inclu
Probab=98.00 E-value=3e-05 Score=84.49 Aligned_cols=101 Identities=17% Similarity=0.308 Sum_probs=75.0
Q ss_pred CCCcccchhHHHHHHHHHhhhccCCCcCCCCCCCCCCCCceeEEEEEEeeecccccCCCCcceeEEEEcCCCCCCCcccc
Q 047026 253 SDASKAYNLWIDAKKQRLLQEAAWPYDFVSSPYYLTANERGSATGRFFVQDKFVSSSLIPAKYAYIGLSSARTEGGWQTE 332 (596)
Q Consensus 253 ~~~~~~~~l~~DA~~~~~~E~~~wpysf~~s~~y~~~~~RGtVsG~v~~~D~~~~~~~~pa~~a~V~L~~~~~~g~wq~~ 332 (596)
+..+....+|.+-|.-+.+ |... --..|+|+|+..+ +.|..+|.|.+...
T Consensus 291 p~~~~l~~~w~~n~~all~--------------~~~~-~~~gI~G~V~D~~------g~pi~~A~V~v~g~--------- 340 (392)
T cd03864 291 PPEEELEREWLGNREALIS--------------YIEQ-VHQGIKGMVTDEN------NNGIANAVISVSGI--------- 340 (392)
T ss_pred CCHHHHHHHHHHHHHHHHH--------------HHHH-hcCeEEEEEECCC------CCccCCeEEEEECC---------
Confidence 4445566788887665433 1111 1248999998764 78999999999633
Q ss_pred cccceeEEEECCccceEeCCccCCeeEEEEEEcceeeeEeeeeEEEEeCCCeeeecceEE
Q 047026 333 SKDYQFWVQTDSKGNFTVKNVVPGVYGLHGWVPGFIGDYLDKALVTISAGSQTELGNLTY 392 (596)
Q Consensus 333 ~~~yqywt~td~~G~FtI~nVrpGtY~L~a~~~G~~G~~~~~~~VtV~aG~t~~l~~l~~ 392 (596)
..-+.||++|.| +.+++||+|+|.|+..|+. .++.+|+|.+++++.+ ++++
T Consensus 341 ----~~~~~T~~~G~y-~r~l~pG~Y~l~vs~~Gy~---~~t~~v~V~~~~~~~~-df~L 391 (392)
T cd03864 341 ----SHDVTSGTLGDY-FRLLLPGTYTVTASAPGYQ---PSTVTVTVGPAEATLV-NFQL 391 (392)
T ss_pred ----ccceEECCCCcE-EecCCCeeEEEEEEEcCce---eEEEEEEEcCCCcEEE-eeEe
Confidence 336789999999 9999999999999999875 5666799999887766 4654
No 9
>cd03863 M14_CPD_II The second carboxypeptidase (CP)-like domain of Carboxypeptidase D (CPD; EC 3.4.17.22), domain II. CPD differs from all other metallocarboxypeptidases in that it contains multiple CP-like domains. CPD belongs to the N/E-like subfamily of the M14 family of metallocarboxypeptidases (MCPs). The M14 family are zinc-binding CPs which hydrolyze single, C-terminal amino acids from polypeptide chains, and have a recognition site for the free C-terminal carboxyl group, which is a key determinant of specificity. CPD is a single-chain protein containing a signal peptide, three tandem repeats of CP-like domains separated by short bridge regions, followed by a transmembrane domain, and a C-terminal cytosolic tail. The first two CP-like domains of CPD contain all of the essential active site and substrate-binding residues, while the third CP-like domain lacks critical residues necessary for enzymatic activity and is inactive towards standard CP substrates. Domain I is optimally ac
Probab=98.00 E-value=2.5e-05 Score=84.72 Aligned_cols=79 Identities=18% Similarity=0.191 Sum_probs=64.2
Q ss_pred ceeEEEEEEeeecccccCCCCcceeEEEEcCCCCCCCcccccccceeEEEECCccceEeCCccCCeeEEEEEEcceeeeE
Q 047026 292 RGSATGRFFVQDKFVSSSLIPAKYAYIGLSSARTEGGWQTESKDYQFWVQTDSKGNFTVKNVVPGVYGLHGWVPGFIGDY 371 (596)
Q Consensus 292 RGtVsG~v~~~D~~~~~~~~pa~~a~V~L~~~~~~g~wq~~~~~yqywt~td~~G~FtI~nVrpGtY~L~a~~~G~~G~~ 371 (596)
...|+|+|.... .+.|..+|.|.+.... ..+.||.+|.|.+ .|+||+|+|+|+..|+.
T Consensus 296 ~~gI~G~V~D~~-----~g~pl~~AtV~V~g~~-------------~~~~Td~~G~f~~-~l~pG~ytl~vs~~GY~--- 353 (375)
T cd03863 296 HRGVRGFVLDAT-----DGRGILNATISVADIN-------------HPVTTYKDGDYWR-LLVPGTYKVTASARGYD--- 353 (375)
T ss_pred cCeEEEEEEeCC-----CCCCCCCeEEEEecCc-------------CceEECCCccEEE-ccCCeeEEEEEEEcCcc---
Confidence 478999998752 3689999999997433 3688999999999 69999999999999865
Q ss_pred eeeeEEEEeCCCeeeecceEEe
Q 047026 372 LDKALVTISAGSQTELGNLTYV 393 (596)
Q Consensus 372 ~~~~~VtV~aG~t~~l~~l~~~ 393 (596)
..+.+|+|.+|+++.+ ++.++
T Consensus 354 ~~~~~v~V~~~~~~~~-~~~L~ 374 (375)
T cd03863 354 PVTKTVEVDSKGAVQV-NFTLS 374 (375)
T ss_pred cEEEEEEEcCCCcEEE-EEEec
Confidence 4555799999999887 57664
No 10
>cd06245 M14_CPD_III The third carboxypeptidase (CP)-like domain of Carboxypeptidase D (CPD; EC 3.4.17.22), domain III. CPD differs from all other metallocarboxypeptidases in that it contains multiple CP-like domains. CPD belongs to the N/E-like subfamily of the M14 family of metallocarboxypeptidases (MCPs). The M14 family are zinc-binding CPs which hydrolyze single, C-terminal amino acids from polypeptide chains, and have a recognition site for the free C-terminal carboxyl group, which is a key determinant of specificity. CPD is a single-chain protein containing a signal peptide, three tandem repeats of CP-like domains separated by short bridge regions, followed by a transmembrane domain, and a C-terminal cytosolic tail. The first two CP-like domains of CPD contain all of the essential active site and substrate-binding residues, while the third CP-like domain lacks critical residues necessary for enzymatic activity and is inactive towards standard CP substrates. Domain I is optimally active a
Probab=97.77 E-value=9.7e-05 Score=79.82 Aligned_cols=76 Identities=20% Similarity=0.227 Sum_probs=62.5
Q ss_pred eeEEEEEEeeecccccCCCCcceeEEEEcCCCCCCCcccccccceeEEEECCccceEeCCccCCeeEEEEEEcceeeeEe
Q 047026 293 GSATGRFFVQDKFVSSSLIPAKYAYIGLSSARTEGGWQTESKDYQFWVQTDSKGNFTVKNVVPGVYGLHGWVPGFIGDYL 372 (596)
Q Consensus 293 GtVsG~v~~~D~~~~~~~~pa~~a~V~L~~~~~~g~wq~~~~~yqywt~td~~G~FtI~nVrpGtY~L~a~~~G~~G~~~ 372 (596)
-.|+|+|+..+ +.|..+|.|.+... . .+.||.+|.|.+. ++||+|+|.+...|+. .
T Consensus 287 ~gI~G~V~d~~------g~pi~~A~V~v~g~-------------~-~~~T~~~G~y~~~-L~pG~y~v~vs~~Gy~---~ 342 (363)
T cd06245 287 KGVHGVVTDKA------GKPISGATIVLNGG-------------H-RVYTKEGGYFHVL-LAPGQHNINVIAEGYQ---Q 342 (363)
T ss_pred cEEEEEEEcCC------CCCccceEEEEeCC-------------C-ceEeCCCcEEEEe-cCCceEEEEEEEeCce---e
Confidence 56999998754 78999999999632 1 5679999999997 9999999999999865 5
Q ss_pred eeeEEEEeCCCeeeecceEEe
Q 047026 373 DKALVTISAGSQTELGNLTYV 393 (596)
Q Consensus 373 ~~~~VtV~aG~t~~l~~l~~~ 393 (596)
.+.+|+|.+++++.+ ++++.
T Consensus 343 ~~~~V~v~~~~~~~~-~f~L~ 362 (363)
T cd06245 343 EHLPVVVSHDEASSV-KIVLD 362 (363)
T ss_pred EEEEEEEcCCCeEEE-EEEec
Confidence 666799999988777 57664
No 11
>cd03868 M14_CPD_I The first carboxypeptidase (CP)-like domain of Carboxypeptidase D (CPD; EC 3.4.17.22), domain I. CPD differs from all other metallocarboxypeptidases in that it contains multiple CP-like domains. CPD belongs to the N/E-like subfamily of the M14 family of metallocarboxypeptidases (MCPs). The M14 family are zinc-binding CPs which hydrolyze single, C-terminal amino acids from polypeptide chains, and have a recognition site for the free C-terminal carboxyl group, which is a key determinant of specificity. CPD is a single-chain protein containing a signal peptide, three tandem repeats of CP-like domains separated by short bridge regions, followed by a transmembrane domain, and a C-terminal cytosolic tail. The first two CP-like domains of CPD contain all of the essential active site and substrate-binding residues, while the third CP-like domain lacks critical residues necessary for enzymatic activity and is inactive towards standard CP substrates. Domain I is optimally active at p
Probab=97.63 E-value=0.00015 Score=78.54 Aligned_cols=76 Identities=22% Similarity=0.267 Sum_probs=60.3
Q ss_pred ceeEEEEEEeeecccccCCCCcceeEEEEcCCCCCCCcccccccceeEEEECCccceEeCCccCCeeEEEEEEcceeeeE
Q 047026 292 RGSATGRFFVQDKFVSSSLIPAKYAYIGLSSARTEGGWQTESKDYQFWVQTDSKGNFTVKNVVPGVYGLHGWVPGFIGDY 371 (596)
Q Consensus 292 RGtVsG~v~~~D~~~~~~~~pa~~a~V~L~~~~~~g~wq~~~~~yqywt~td~~G~FtI~nVrpGtY~L~a~~~G~~G~~ 371 (596)
.+.|+|+|+..+ +.|..+|.|.+.... ..+.||++|.|.+ +++||+|+|.+...|+.
T Consensus 295 ~~~i~G~V~d~~------g~pv~~A~V~v~~~~-------------~~~~td~~G~y~~-~l~~G~Y~l~vs~~Gf~--- 351 (372)
T cd03868 295 HIGVKGFVRDAS------GNPIEDATIMVAGID-------------HNVTTAKFGDYWR-LLLPGTYTITAVAPGYE--- 351 (372)
T ss_pred CCceEEEEEcCC------CCcCCCcEEEEEecc-------------cceEeCCCceEEe-cCCCEEEEEEEEecCCC---
Confidence 478999998764 789999999997433 3689999999984 79999999999999875
Q ss_pred ee-eeEEEEeCCCeeeecceE
Q 047026 372 LD-KALVTISAGSQTELGNLT 391 (596)
Q Consensus 372 ~~-~~~VtV~aG~t~~l~~l~ 391 (596)
.+ ...|+|.+|+++.+ +++
T Consensus 352 ~~~~~~v~v~~g~~~~~-~~~ 371 (372)
T cd03868 352 PSTVTDVVVKEGEATSV-NFT 371 (372)
T ss_pred ceEEeeEEEcCCCeEEE-eeE
Confidence 32 23477999998776 353
No 12
>cd03858 M14_CP_N-E_like Carboxypeptidase (CP) N/E-like subfamily of the M14 family of metallocarboxypeptidases (MCPs). The M14 family are zinc-binding CPs which hydrolyze single, C-terminal amino acids from polypeptide chains, and have a recognition site for the free C-terminal carboxyl group, which is a key determinant of specificity. The N/E subfamily includes eight members, of which five (CPN, CPE, CPM, CPD, CPZ) are considered enzymatically active, while the other three are non-active (CPX1, CPX2, ACLP/AEBP1) and lack the critical active site and substrate-binding residues considered necessary for CP activity. These non-active members may function as binding proteins or display catalytic activity towards other substrates. Unlike the A/B CP subfamily, enzymes belonging to the N/E subfamily are not produced as inactive precursors that require proteolysis to produce the active form; rather, they rely on their substrate specificity and subcellular compartmentalization to prevent inappr
Probab=97.59 E-value=0.00024 Score=76.86 Aligned_cols=72 Identities=19% Similarity=0.264 Sum_probs=59.0
Q ss_pred eeEEEEEEeeecccccCCCCcceeEEEEcCCCCCCCcccccccceeEEEECCccceEeCCccCCeeEEEEEEcceeeeEe
Q 047026 293 GSATGRFFVQDKFVSSSLIPAKYAYIGLSSARTEGGWQTESKDYQFWVQTDSKGNFTVKNVVPGVYGLHGWVPGFIGDYL 372 (596)
Q Consensus 293 GtVsG~v~~~D~~~~~~~~pa~~a~V~L~~~~~~g~wq~~~~~yqywt~td~~G~FtI~nVrpGtY~L~a~~~G~~G~~~ 372 (596)
.+|+|+|+..+ +.|..+|.|.+. +....+.||.+|.|.+. ++||+|+|.+...|+. .
T Consensus 298 ~~i~G~V~d~~------g~pl~~A~V~i~-------------~~~~~~~Td~~G~f~~~-l~~G~y~l~vs~~Gy~---~ 354 (374)
T cd03858 298 RGIKGFVRDAN------GNPIANATISVE-------------GINHDVTTAEDGDYWRL-LLPGTYNVTASAPGYE---P 354 (374)
T ss_pred CceEEEEECCC------CCccCCeEEEEe-------------cceeeeEECCCceEEEe-cCCEeEEEEEEEcCcc---e
Confidence 48999998764 689999999995 44568999999999986 7999999999999764 5
Q ss_pred eeeEEEEeC-CCeeee
Q 047026 373 DKALVTISA-GSQTEL 387 (596)
Q Consensus 373 ~~~~VtV~a-G~t~~l 387 (596)
++.+|+|.+ |+++.+
T Consensus 355 ~~~~v~v~~~g~~~~~ 370 (374)
T cd03858 355 QTKSVVVPNDNSAVVV 370 (374)
T ss_pred EEEEEEEecCCceEEE
Confidence 555677777 887766
No 13
>cd03867 M14_CPZ Peptidase M14-like domain of carboxypeptidase (CP) Z (CPZ), CPZ belongs to the N/E subfamily of the M14 family of metallocarboxypeptidases (MCPs). The M14 family are zinc-binding CPs which hydrolyze single, C-terminal amino acids from polypeptide chains, and have a recognition site for the free C-terminal carboxyl group, which is a key determinant of specificity. CPZ is a secreted Zn-dependent enzyme whose biological function is largely unknown. Unlike other members of the N/E subfamily, CPZ has a bipartite structure, which consists of an N-terminal cysteine-rich domain (CRD) whose sequence is similar to Wnt-binding proteins, and a C-terminal CP catalytic domain that removes C-terminal Arg residues from substrates. CPZ is enriched in the extracellular matrix and is widely distributed during early embryogenesis. That the CRD of CPZ can bind to Wnt4 suggests that CPZ plays a role in Wnt signaling.
Probab=97.26 E-value=0.00092 Score=73.11 Aligned_cols=72 Identities=21% Similarity=0.236 Sum_probs=56.9
Q ss_pred eeEEEEEEeeecccccCCCCcceeEEEEcCCCCCCCcccccccceeEEEECCccceEeCCccCCeeEEEEEEcceeeeEe
Q 047026 293 GSATGRFFVQDKFVSSSLIPAKYAYIGLSSARTEGGWQTESKDYQFWVQTDSKGNFTVKNVVPGVYGLHGWVPGFIGDYL 372 (596)
Q Consensus 293 GtVsG~v~~~D~~~~~~~~pa~~a~V~L~~~~~~g~wq~~~~~yqywt~td~~G~FtI~nVrpGtY~L~a~~~G~~G~~~ 372 (596)
-.|+|+|+..+ +.|..+|.|.+.. ....+.||++|.|. .+++||+|+|.+...|+. .
T Consensus 318 ~~i~G~V~D~~------g~pi~~A~V~v~g-------------~~~~~~Td~~G~y~-~~l~~G~y~l~vs~~Gy~---~ 374 (395)
T cd03867 318 RGIKGFVKDKD------GNPIKGARISVRG-------------IRHDITTAEDGDYW-RLLPPGIHIVSAQAPGYT---K 374 (395)
T ss_pred ceeEEEEEcCC------CCccCCeEEEEec-------------cccceEECCCceEE-EecCCCcEEEEEEecCee---e
Confidence 36999999764 7899999999973 34478899999997 689999999999999875 5
Q ss_pred eeeEEEEeC--CCeeee
Q 047026 373 DKALVTISA--GSQTEL 387 (596)
Q Consensus 373 ~~~~VtV~a--G~t~~l 387 (596)
...+|+|.+ ++...+
T Consensus 375 ~~~~v~v~~~~~~~~~~ 391 (395)
T cd03867 375 VMKRVTLPARMKRAGRV 391 (395)
T ss_pred EEEEEEeCCcCCCceEe
Confidence 556688865 444444
No 14
>cd03866 M14_CPM Peptidase M14 Carboxypeptidase (CP) M (CPM) belongs to the N/E subfamily of the M14 family of metallocarboxypeptidases (MCPs). The M14 family are zinc-binding CPs which hydrolyze single, C-terminal amino acids from polypeptide chains, and have a recognition site for the free C-terminal carboxyl group, which is a key determinant of specificity. CPM is an extracellular glycoprotein, bound to cell membranes via a glycosyl-phosphatidylinositol on the C-terminus of the protein. It specifically removes C-terminal basic residues such as lysine and arginine from peptides and proteins. The highest levels of CPM have been found in human lung and placenta, but significant amounts are present in kidney, blood vessels, intestine, brain, and peripheral nerves. CPM has also been found in soluble form in various body fluids, including amniotic fluid, seminal plasma and urine. Due to its wide distribution in a variety of tissues, it is believed that it plays an important role in the cont
Probab=96.35 E-value=0.012 Score=64.11 Aligned_cols=70 Identities=20% Similarity=0.269 Sum_probs=53.4
Q ss_pred ceeEEEEEEeeecccccCCCCcceeEEEEcCCCCCCCcccccccceeEEEECCccceEeCCccCCeeEEEEEEcceeeeE
Q 047026 292 RGSATGRFFVQDKFVSSSLIPAKYAYIGLSSARTEGGWQTESKDYQFWVQTDSKGNFTVKNVVPGVYGLHGWVPGFIGDY 371 (596)
Q Consensus 292 RGtVsG~v~~~D~~~~~~~~pa~~a~V~L~~~~~~g~wq~~~~~yqywt~td~~G~FtI~nVrpGtY~L~a~~~G~~G~~ 371 (596)
.+.|+|+|+..+ +.|..+|.|.+...+ ...-+.||++|.|.+. ++||+|+|.+.+.|+.
T Consensus 294 ~~gI~G~V~D~~------g~pi~~A~V~v~g~~-----------~~~~~~T~~~G~y~~~-l~pG~Y~v~vsa~Gy~--- 352 (376)
T cd03866 294 HLGVKGQVFDSN------GNPIPNAIVEVKGRK-----------HICPYRTNVNGEYFLL-LLPGKYMINVTAPGFK--- 352 (376)
T ss_pred cCceEEEEECCC------CCccCCeEEEEEcCC-----------ceeEEEECCCceEEEe-cCCeeEEEEEEeCCcc---
Confidence 467999998643 789999999997432 1123469999999775 9999999999999875
Q ss_pred eeeeEEEEeCC
Q 047026 372 LDKALVTISAG 382 (596)
Q Consensus 372 ~~~~~VtV~aG 382 (596)
....+|.|.+.
T Consensus 353 ~~~~~v~v~~~ 363 (376)
T cd03866 353 TVITNVIIPYN 363 (376)
T ss_pred eEEEEEEeCCC
Confidence 45556777653
No 15
>PF08400 phage_tail_N: Prophage tail fibre N-terminal; InterPro: IPR013609 This entry represents the N terminus of phage 933W tail fibre protein. The characteristics of the protein distribution suggest prophage matches.
Probab=95.45 E-value=0.075 Score=49.92 Aligned_cols=78 Identities=24% Similarity=0.178 Sum_probs=53.8
Q ss_pred eEEEEEEeeecccccCCCCcceeEEEEcCCCCCCCcccccccceeEEEECCccceEeCCccCCeeEEEEEEcceeeeEee
Q 047026 294 SATGRFFVQDKFVSSSLIPAKYAYIGLSSARTEGGWQTESKDYQFWVQTDSKGNFTVKNVVPGVYGLHGWVPGFIGDYLD 373 (596)
Q Consensus 294 tVsG~v~~~D~~~~~~~~pa~~a~V~L~~~~~~g~wq~~~~~yqywt~td~~G~FtI~nVrpGtY~L~a~~~G~~G~~~~ 373 (596)
.|||.+.... |+|..++.+.|+.-... ..==.+.-=+..|++.|.|.|. +.||.|.+++...|.. ..+
T Consensus 4 ~ISGvL~dg~------G~pv~g~~I~L~A~~tS---~~Vv~~t~as~~t~~~G~Ys~~-~epG~Y~V~l~~~g~~--~~~ 71 (134)
T PF08400_consen 4 KISGVLKDGA------GKPVPGCTITLKARRTS---STVVVGTVASVVTGEAGEYSFD-VEPGVYRVTLKVEGRP--PVY 71 (134)
T ss_pred EEEEEEeCCC------CCcCCCCEEEEEEccCc---hheEEEEEEEEEcCCCceEEEE-ecCCeEEEEEEECCCC--cee
Confidence 5888887654 89999999999743210 0000123336788999999995 9999999999999754 223
Q ss_pred eeEEEEeCCC
Q 047026 374 KALVTISAGS 383 (596)
Q Consensus 374 ~~~VtV~aG~ 383 (596)
-..|+|.+.+
T Consensus 72 vG~I~V~~dS 81 (134)
T PF08400_consen 72 VGDITVYEDS 81 (134)
T ss_pred EEEEEEecCC
Confidence 2457776443
No 16
>PRK15036 hydroxyisourate hydrolase; Provisional
Probab=95.29 E-value=0.04 Score=51.99 Aligned_cols=66 Identities=17% Similarity=0.222 Sum_probs=49.8
Q ss_pred CceeEEEEEEeeecccccCCCCcceeEEEEcCCCCCCCcccccccceeEEEECCccceEe---CC-ccCCeeEEEEEEcc
Q 047026 291 ERGSATGRFFVQDKFVSSSLIPAKYAYIGLSSARTEGGWQTESKDYQFWVQTDSKGNFTV---KN-VVPGVYGLHGWVPG 366 (596)
Q Consensus 291 ~RGtVsG~v~~~D~~~~~~~~pa~~a~V~L~~~~~~g~wq~~~~~yqywt~td~~G~FtI---~n-VrpGtY~L~a~~~G 366 (596)
+.+.|++.|+... .++||.++.|-|..... ++|+ .-.-+.||++|+|.. .+ +.||.|+|.....+
T Consensus 25 ~~~~Is~HVLDt~-----~G~PA~gV~V~L~~~~~-~~w~-----~l~~~~Td~dGR~~~l~~~~~~~~G~Y~L~F~t~~ 93 (137)
T PRK15036 25 QQNILSVHILNQQ-----TGKPAADVTVTLEKKAD-NGWL-----QLNTAKTDKDGRIKALWPEQTATTGDYRVVFKTGD 93 (137)
T ss_pred cCCCeEEEEEeCC-----CCcCCCCCEEEEEEccC-CceE-----EEEEEEECCCCCCccccCcccCCCeeEEEEEEcch
Confidence 4467999998764 48999999999975432 2342 235578999999986 34 88999999998765
Q ss_pred e
Q 047026 367 F 367 (596)
Q Consensus 367 ~ 367 (596)
+
T Consensus 94 Y 94 (137)
T PRK15036 94 Y 94 (137)
T ss_pred h
Confidence 3
No 17
>PF03422 CBM_6: Carbohydrate binding module (family 6); InterPro: IPR005084 A carbohydrate-binding module (CBM) is defined as a contiguous amino acid sequence within a carbohydrate-active enzyme with a discrete fold having carbohydrate-binding activity. A few exceptions are CBMs in cellulosomal scaffolding proteins and rare instances of independent putative CBMs. The requirement of CBMs existing as modules within larger enzymes sets this class of carbohydrate-binding protein apart from other non-catalytic sugar binding proteins such as lectins and sugar transport proteins. CBMs were previously classified as cellulose-binding domains (CBDs) based on the initial discovery of several modules that bound cellulose. However, additional modules in carbohydrate-active enzymes are continually being found that bind carbohydrates other than cellulose yet otherwise meet the CBM criteria, hence the need to reclassify these polypeptides using more inclusive terminology. Previous classifications of cellulose-binding domains were based on amino acid similarity. Groupings of CBDs were called "Types" and numbered with Roman numerals (e.g. Type I or Type II CBDs). In keeping with the glycoside hydrolase classification, these groupings are now called families and numbered with Arabic numerals. Families 1 to 13 are the same as Types I to XIII. For a detailed review on the structure and binding modes of CBMs see []. This entry represents CBM6 from CAZY which was previously known as cellulose-binding domain family VI (CBD VI). CBM6 binds to amorphous cellulose, xylan, mixed beta-(1,3)(1,4)glucan and beta-1,3-glucan. CBM6 adopts a classic lectin-like beta-jelly roll fold, predominantly consisting of five antiparallel beta-strands on one face and four antiparallel beta-strands on the other face. It contains two potential ligand binding sites, named respectively cleft A and B. These clefts include aromatic residues which are probably involved in the substrate binding. Cleft B is located on the concave surface of one beta-sheet, and cleft A on one edge of the protein between the loop that connects the inner and outer beta-sheets of the jellyroll fold. The multiple binding clefts confer the extensive range of specificities displayed by the domain.; GO: 0030246 carbohydrate binding; PDB: 1UY1_A 1UY3_A 1UY4_A 1UY2_A 1UYY_A 1UXZ_B 1UYZ_A 1UY0_B 1UYX_A 1UZ0_A ....
Probab=94.63 E-value=0.26 Score=44.35 Aligned_cols=92 Identities=23% Similarity=0.394 Sum_probs=55.2
Q ss_pred ccEEEEEE-eCCCccceEEEEEEEeccCC-CeeEEEEcCccCCcccccccccCCCCeeeeeeEEEeeEEEEEEeecCcee
Q 047026 483 TTWTIKFH-LDSIIKGTYNLRLAIASATR-SDLEIFVNYIDQGHLVYQEMNLGMDNTVCRHGIHGLYQLFSIHVSSLLLI 560 (596)
Q Consensus 483 ~~w~I~F~-L~~~~~~~~tLriala~a~~-~~~~V~vN~~~~~~p~~~~~~~~~d~~i~R~~~~G~~~~~~~~ipa~~L~ 560 (596)
+.| |.|+ ++-...+.|+|++..|.... ++++|+||+.... ...+..++. .+---.|...+..| .|.
T Consensus 31 G~~-~~~~~Vd~~~~g~y~~~~~~a~~~~~~~~~l~id~~~g~--~~~~~~~~~------tg~w~~~~~~~~~v---~l~ 98 (125)
T PF03422_consen 31 GDW-IEYNNVDVPEAGTYTLTIRYANGGGGGTIELRIDGPDGT--LIGTVSLPP------TGGWDTWQTVSVSV---KLP 98 (125)
T ss_dssp TTE-EEEEEEEESSSEEEEEEEEEEESSSSEEEEEEETTTTSE--EEEEEEEE-------ESSTTEEEEEEEEE---EEE
T ss_pred CCE-EEEEEEeeCCCceEEEEEEEECCCCCcEEEEEECCCCCc--EEEEEEEcC------CCCccccEEEEEEE---eeC
Confidence 444 4455 44334688899988888654 6999999993221 112222221 11011234444444 466
Q ss_pred eeccEEEEEEeecCCCCceEEEEEEEEe
Q 047026 561 KGDNSMFLVQSRSGDPVCGVLYDYLRLE 588 (596)
Q Consensus 561 ~G~NtI~l~~~~g~s~~~~vmyD~IrLe 588 (596)
+|.|+|+|....+.+ ..+-.|+|+|+
T Consensus 99 ~G~h~i~l~~~~~~~--~~~niD~~~f~ 124 (125)
T PF03422_consen 99 AGKHTIYLVFNGGDG--WAFNIDYFQFT 124 (125)
T ss_dssp SEEEEEEEEESSSSS--B-EEEEEEEEE
T ss_pred CCeeEEEEEEECCCC--ceEEeEEEEEE
Confidence 799999999876543 35889999886
No 18
>PF08308 PEGA: PEGA domain; InterPro: IPR013229 This domain is found in both archaea and bacteria and has similarity to S-layer (surface layer) proteins. It is named after the characteristic PEGA sequence motif found in this domain. The secondary structure of this domain is predicted to be beta-strands.
Probab=94.59 E-value=0.25 Score=40.48 Aligned_cols=45 Identities=29% Similarity=0.409 Sum_probs=37.0
Q ss_pred cceEeCCccCCeeEEEEEEcceeeeEeeeeEEEEeCCCeeeecceEEee
Q 047026 346 GNFTVKNVVPGVYGLHGWVPGFIGDYLDKALVTISAGSQTELGNLTYVP 394 (596)
Q Consensus 346 G~FtI~nVrpGtY~L~a~~~G~~G~~~~~~~VtV~aG~t~~l~~l~~~~ 394 (596)
...++..+++|.|+|.+..+|+. ..+..|.|.+|++..+ ++.+++
T Consensus 25 tp~~~~~l~~G~~~v~v~~~Gy~---~~~~~v~v~~~~~~~v-~~~L~~ 69 (71)
T PF08308_consen 25 TPLTLKDLPPGEHTVTVEKPGYE---PYTKTVTVKPGETTTV-NVTLEP 69 (71)
T ss_pred CcceeeecCCccEEEEEEECCCe---eEEEEEEECCCCEEEE-EEEEEE
Confidence 34578789999999999999765 5667799999999888 577765
No 19
>cd00421 intradiol_dioxygenase Intradiol dioxygenases catalyze the critical ring-cleavage step in the conversion of catecholate derivatives to citric acid cycle intermediates. This family contains catechol 1,2-dioxygenases and protocatechuate 3,4-dioxygenases which are mononuclear non-heme iron enzymes that catalyze the oxygenation of catecholates to aliphatic acids via the cleavage of aromatic rings. The members are intradiol-cleaving enzymes which break the catechol C1-C2 bond and utilize Fe3+, as opposed to the extradiol-cleaving enzymes which break the C2-C3 or C1-C6 bond and utilize Fe2+ and Mn+. Catechol 1,2-dioxygenases are mostly homodimers with one catalytic ferric ion per monomer. Protocatechuate 3,4-dioxygenases form more diverse oligomers.
Probab=93.59 E-value=0.16 Score=48.11 Aligned_cols=64 Identities=22% Similarity=0.335 Sum_probs=47.8
Q ss_pred CceeEEEEEEeeecccccCCCCcceeEEEEcCCCCCCCcccccc-------cceeEEEECCccceEeCCccCCeeEE
Q 047026 291 ERGSATGRFFVQDKFVSSSLIPAKYAYIGLSSARTEGGWQTESK-------DYQFWVQTDSKGNFTVKNVVPGVYGL 360 (596)
Q Consensus 291 ~RGtVsG~v~~~D~~~~~~~~pa~~a~V~L~~~~~~g~wq~~~~-------~yqywt~td~~G~FtI~nVrpGtY~L 360 (596)
..=+|.|+|+..+ +.|..+|.|-++.....|....+.. ..--...||++|.|.+.-|+||.|.+
T Consensus 10 ~~l~l~G~V~D~~------g~pv~~A~VeiW~~d~~G~Y~~~~~~~~~~~~~~rg~~~Td~~G~y~f~ti~Pg~Y~~ 80 (146)
T cd00421 10 EPLTLTGTVLDGD------GCPVPDALVEIWQADADGRYSGQDDSGLDPEFFLRGRQITDADGRYRFRTIKPGPYPI 80 (146)
T ss_pred CEEEEEEEEECCC------CCCCCCcEEEEEecCCCCccCCcCccccCCCCCCEEEEEECCCcCEEEEEEcCCCCCC
Confidence 3458999999776 7888899999977666554432211 22234779999999999999999995
No 20
>cd03869 M14_CPX_like Peptidase M14-like domain of carboxypeptidase (CP)-like protein X (CPX), CPX forms a distinct subgroup of the N/E subfamily of the M14 family of metallocarboxypeptidases (MCPs). The M14 family are zinc-binding CPs which hydrolyze single, C-terminal amino acids from polypeptide chains, and have a recognition site for the free C-terminal carboxyl group, which is a key determinant of specificity. Proteins belonging to this subgroup include CP-like protein X1 (CPX1), CP-like protein X2 (CPX2), and aortic CP-like protein (ACLP) and its isoform adipocyte enhancer binding protein-1 (AEBP1). AEBP1 is a truncated form of ACLP, which may arise from alternative splicing of the gene. These proteins are inactive towards standard CP substrates because they lack one or more critical active site and substrate-binding residues that are necessary for activity. They may function as binding proteins rather than as active CPs or display catalytic activity toward other substrates. Pro
Probab=93.41 E-value=0.19 Score=55.43 Aligned_cols=67 Identities=18% Similarity=0.223 Sum_probs=49.2
Q ss_pred ceeEEEEEEeeecccccCCCCcceeEEEEcCCCCCCCcccccccceeEEEECCccceEeCCccCCeeEEEEEEcceeeeE
Q 047026 292 RGSATGRFFVQDKFVSSSLIPAKYAYIGLSSARTEGGWQTESKDYQFWVQTDSKGNFTVKNVVPGVYGLHGWVPGFIGDY 371 (596)
Q Consensus 292 RGtVsG~v~~~D~~~~~~~~pa~~a~V~L~~~~~~g~wq~~~~~yqywt~td~~G~FtI~nVrpGtY~L~a~~~G~~G~~ 371 (596)
|| |+|.|+... +.|..+|.|.+..-. ....|.++|.|--- +.||+|+|+|.++|+.
T Consensus 329 ~G-ikG~V~d~~------g~~i~~a~i~v~g~~-------------~~v~t~~~GdywRl-l~pG~y~v~~~a~gy~--- 384 (405)
T cd03869 329 RG-IKGVVRDKT------GKGIPNAIISVEGIN-------------HDIRTASDGDYWRL-LNPGEYRVTAHAEGYT--- 384 (405)
T ss_pred cC-ceEEEECCC------CCcCCCcEEEEecCc-------------cceeeCCCCceEEe-cCCceEEEEEEecCCC---
Confidence 44 899987653 788889999887432 24567788876542 8999999999999764
Q ss_pred eeeeEEEEeCC
Q 047026 372 LDKALVTISAG 382 (596)
Q Consensus 372 ~~~~~VtV~aG 382 (596)
....+|+|..+
T Consensus 385 ~~~~~~~v~~~ 395 (405)
T cd03869 385 SSTKNCEVGYE 395 (405)
T ss_pred cccEEEEEcCC
Confidence 55566777754
No 21
>PF09430 DUF2012: Protein of unknown function (DUF2012); InterPro: IPR019008 This domain is found in different proteins, including uncharacterised protein family UPF0480 and nodal modulators. A nodal modulator has been identified as part of a protein complex that participates in the nodal signaling pathway during vertebrate development.
Probab=93.08 E-value=0.52 Score=43.41 Aligned_cols=39 Identities=31% Similarity=0.474 Sum_probs=31.3
Q ss_pred EEEECCccceEeCCccCCeeEEEEEEcceeeeEeeeeEEEEe
Q 047026 339 WVQTDSKGNFTVKNVVPGVYGLHGWVPGFIGDYLDKALVTIS 380 (596)
Q Consensus 339 wt~td~~G~FtI~nVrpGtY~L~a~~~G~~G~~~~~~~VtV~ 380 (596)
-+...++|+|.|.||++|+|.|.+-...+. |.. -.|.|.
T Consensus 23 ~~~v~~dG~F~f~~Vp~GsY~L~V~s~~~~--F~~-~RVdV~ 61 (123)
T PF09430_consen 23 SAFVRSDGSFVFHNVPPGSYLLEVHSPDYV--FPP-YRVDVS 61 (123)
T ss_pred EEEecCCCEEEeCCCCCceEEEEEECCCcc--ccC-EEEEEe
Confidence 678899999999999999999999987643 222 346676
No 22
>KOG1948 consensus Metalloproteinase-related collagenase pM5 [Posttranslational modification, protein turnover, chaperones]
Probab=92.56 E-value=0.37 Score=56.78 Aligned_cols=55 Identities=22% Similarity=0.244 Sum_probs=45.1
Q ss_pred eeEEEEEEeeecccccCCCCcceeEEEEcCCCCCCCcccccccceeEEEECCccceEeCC-ccCCeeEEEEEEcc
Q 047026 293 GSATGRFFVQDKFVSSSLIPAKYAYIGLSSARTEGGWQTESKDYQFWVQTDSKGNFTVKN-VVPGVYGLHGWVPG 366 (596)
Q Consensus 293 GtVsG~v~~~D~~~~~~~~pa~~a~V~L~~~~~~g~wq~~~~~yqywt~td~~G~FtI~n-VrpGtY~L~a~~~G 366 (596)
-+|+|||++.. .+.|.+++.|.+. -+-..+||++|+|++.| +..|+||+.|-+..
T Consensus 316 fSvtGRVl~g~-----~g~~l~gvvvlvn--------------gk~~~kTdaqGyykLen~~t~gtytI~a~keh 371 (1165)
T KOG1948|consen 316 FSVTGRVLVGS-----KGLPLSGVVVLVN--------------GKSGGKTDAQGYYKLENLKTDGTYTITAKKEH 371 (1165)
T ss_pred EEeeeeEEeCC-----CCCCccceEEEEc--------------CcccceEcccceEEeeeeeccCcEEEEEeccc
Confidence 47899988752 3688888888883 33467899999999999 99999999998764
No 23
>PF05738 Cna_B: Cna protein B-type domain; InterPro: IPR008454 This entry represents a repeated B region domain found in the collagen-binding surface protein Cna in Staphylococcus aureus, as well as other related domains. The B region domain of Cna has a prealbumin-like beta-sandwich fold of seven strands in two sheets with a Greek key topology. However, this domain does not mediate collagen binding, the IPR008456 from INTERPRO region carries out that function; instead it appears to form a stalk that presents the ligand binding domain away from the bacterial cell surface. Cna is a collagen-binding MSCRAMM (Microbial Surface Component Recognizing Adhesive Matrix Molecules), and is necessary and sufficient for S. aureus cells to adhere to cartilage.; PDB: 2X5P_A 3RKP_A 3KPT_A 1VLF_T 1TI2_F 1TI6_D 1TI4_J 1VLE_V 1VLD_X 3PF2_A ....
Probab=92.30 E-value=0.58 Score=37.96 Aligned_cols=44 Identities=32% Similarity=0.418 Sum_probs=31.0
Q ss_pred EEECCccceEeCCccCCeeEEEEEE--cceeeeEeeeeEEEEeCCCe
Q 047026 340 VQTDSKGNFTVKNVVPGVYGLHGWV--PGFIGDYLDKALVTISAGSQ 384 (596)
Q Consensus 340 t~td~~G~FtI~nVrpGtY~L~a~~--~G~~G~~~~~~~VtV~aG~t 384 (596)
..+|++|.|.|.+++||+|.|.--. .|+.- -.....++|..++.
T Consensus 21 ~~Td~~G~~~f~~L~~G~Y~l~E~~aP~GY~~-~~~~~~~~i~~~~~ 66 (70)
T PF05738_consen 21 VTTDENGKYTFKNLPPGTYTLKETKAPDGYQL-DDTPYEFTITEDGD 66 (70)
T ss_dssp EEGGTTSEEEEEEEESEEEEEEEEETTTTEEE-EECEEEEEECTTSC
T ss_pred EEECCCCEEEEeecCCeEEEEEEEECCCCCEE-CCCceEEEEecCCE
Confidence 5689999999999999999999886 44320 01223356666554
No 24
>cd03463 3,4-PCD_alpha Protocatechuate 3,4-dioxygenase (3,4-PCD), alpha subunit. 3,4-PCD catalyzes the oxidative ring cleavage of 3,4-dihydroxybenzoate to produce beta-carboxy-cis,cis-muconate. 3,4-PCDs are large aggregates of 12 protomers, each composed of an alpha- and beta-subunit and an Fe3+ ion bound in the beta-subunit at the alpha-subunit-beta-subunit interface. 3,4-PCD is a member of the aromatic dioxygenases which are non-heme iron intradiol-cleaving enzymes that break the C1-C2 bond and utilize Fe3+.
Probab=91.99 E-value=0.32 Score=48.19 Aligned_cols=63 Identities=27% Similarity=0.439 Sum_probs=48.7
Q ss_pred CceeEEEEEEeeecccccCCCCcceeEEEEcCCCCCCCccccc-------ccceeE--EEECCccceEeCCccCCeeE
Q 047026 291 ERGSATGRFFVQDKFVSSSLIPAKYAYIGLSSARTEGGWQTES-------KDYQFW--VQTDSKGNFTVKNVVPGVYG 359 (596)
Q Consensus 291 ~RGtVsG~v~~~D~~~~~~~~pa~~a~V~L~~~~~~g~wq~~~-------~~yqyw--t~td~~G~FtI~nVrpGtY~ 359 (596)
..=.|+|+|...+ ++|..+|.|=++.....|....+. .+++.| ..||++|.|++.-|+||-|.
T Consensus 35 ~~l~l~G~V~D~~------g~Pi~gA~VeiWqad~~G~Y~~~~~~~~~~~~~f~~rGr~~TD~~G~y~F~Ti~Pg~Y~ 106 (185)
T cd03463 35 ERITLEGRVYDGD------GAPVPDAMLEIWQADAAGRYAHPADSRRRLDPGFRGFGRVATDADGRFSFTTVKPGAVP 106 (185)
T ss_pred CEEEEEEEEECCC------CCCCCCCEEEEEcCCCCCccCCcCCcccccCCCCCcEEEEEECCCCCEEEEEEcCCCcC
Confidence 4568999998655 899999999998777666443321 344455 45999999999999999986
No 25
>cd03459 3,4-PCD Protocatechuate 3,4-dioxygenase (3,4-PCD) catalyzes the oxidative ring cleavage of 3,4-dihydroxybenzoate to produce beta-carboxy-cis,cis-muconate. 3,4-PCDs are large aggregates of 12 protomers, each composed of an alpha- and beta-subunit and an Fe3+ ion bound in the beta-subunit at the alpha-beta-subunit interface. 3,4-PCD is a member of the aromatic dioxygenases which are non-heme iron intradiol-cleaving enzymes that break the C1-C2 bond and utilize Fe3+.
Probab=91.11 E-value=0.46 Score=45.89 Aligned_cols=64 Identities=25% Similarity=0.427 Sum_probs=48.8
Q ss_pred CceeEEEEEEeeecccccCCCCcceeEEEEcCCCCCCCccccc--------ccceeE--EEECCccceEeCCccCCeeEE
Q 047026 291 ERGSATGRFFVQDKFVSSSLIPAKYAYIGLSSARTEGGWQTES--------KDYQFW--VQTDSKGNFTVKNVVPGVYGL 360 (596)
Q Consensus 291 ~RGtVsG~v~~~D~~~~~~~~pa~~a~V~L~~~~~~g~wq~~~--------~~yqyw--t~td~~G~FtI~nVrpGtY~L 360 (596)
++=.|+|+|+..+ +.|..+|.|=++.....|....+. .++..| ..||++|.|++.-|+||-|.+
T Consensus 14 ~~l~l~g~V~D~~------g~Pv~~A~veiWqad~~G~Y~~~~~~~~~~~~~~f~~rG~~~Td~~G~~~f~Ti~Pg~Y~~ 87 (158)
T cd03459 14 ERIILEGRVLDGD------GRPVPDALVEIWQADAAGRYRHPRDSHRAPLDPNFTGFGRVLTDADGRYRFRTIKPGAYPW 87 (158)
T ss_pred cEEEEEEEEECCC------CCCCCCCEEEEEccCCCCccCCccCCcccccCCCCCceeEEEECCCCcEEEEEECCCCcCC
Confidence 3457899998654 899999999998777666443322 345545 458999999999999999984
No 26
>PF07210 DUF1416: Protein of unknown function (DUF1416); InterPro: IPR010814 This family consists of several hypothetical bacterial proteins of around 100 residues in length. Members of this family appear to be Actinomycete specific. The function of this family is unknown.
Probab=90.89 E-value=2.6 Score=36.64 Aligned_cols=61 Identities=25% Similarity=0.311 Sum_probs=47.0
Q ss_pred CceeEEEEEEeeecccccCCCCcceeEEEEcCCCCCCCcccccccceeEEEECCccceEeCCccCCeeEEEEEEccee
Q 047026 291 ERGSATGRFFVQDKFVSSSLIPAKYAYIGLSSARTEGGWQTESKDYQFWVQTDSKGNFTVKNVVPGVYGLHGWVPGFI 368 (596)
Q Consensus 291 ~RGtVsG~v~~~D~~~~~~~~pa~~a~V~L~~~~~~g~wq~~~~~yqywt~td~~G~FtI~nVrpGtY~L~a~~~G~~ 368 (596)
....|+|+|+ .+ +.|..+++|-|-+... .| -=-..|+++|.|.+- ..||+.+|.+-..+-.
T Consensus 6 ke~VItG~V~-~~------G~Pv~gAyVRLLD~sg--EF-------taEvvts~~G~FRFf-aapG~WtvRal~~~g~ 66 (85)
T PF07210_consen 6 KETVITGRVT-RD------GEPVGGAYVRLLDSSG--EF-------TAEVVTSATGDFRFF-AAPGSWTVRALSRGGN 66 (85)
T ss_pred ceEEEEEEEe-cC------CcCCCCeEEEEEcCCC--Ce-------EEEEEecCCccEEEE-eCCCceEEEEEccCCC
Confidence 3578999999 44 7999999999975542 12 112457899999995 8999999999987643
No 27
>COG3485 PcaH Protocatechuate 3,4-dioxygenase beta subunit [Secondary metabolites biosynthesis, transport, and catabolism]
Probab=89.54 E-value=0.64 Score=47.47 Aligned_cols=65 Identities=22% Similarity=0.314 Sum_probs=50.2
Q ss_pred CceeEEEEEEeeecccccCCCCcceeEEEEcCCCCCCCcccccccce-------eE--EEECCccceEeCCccCCeeEEE
Q 047026 291 ERGSATGRFFVQDKFVSSSLIPAKYAYIGLSSARTEGGWQTESKDYQ-------FW--VQTDSKGNFTVKNVVPGVYGLH 361 (596)
Q Consensus 291 ~RGtVsG~v~~~D~~~~~~~~pa~~a~V~L~~~~~~g~wq~~~~~yq-------yw--t~td~~G~FtI~nVrpGtY~L~ 361 (596)
+|=.|+|+|+..+ ++|..+|.|=+++.+..|-.....+.+. =| +.||++|.|.+.-|+||.|--.
T Consensus 71 e~i~l~G~VlD~~------G~Pv~~A~VEiWQAda~GrY~~~~d~~~~~~~~f~g~Gr~~Td~~G~y~F~Ti~Pg~yp~~ 144 (226)
T COG3485 71 ERILLEGRVLDGN------GRPVPDALVEIWQADADGRYSHPKDSRLAPLPNFNGRGRTITDEDGEYRFRTIKPGPYPWR 144 (226)
T ss_pred ceEEEEEEEECCC------CCCCCCCEEEEEEcCCCCcccCccccccCcCccccceEEEEeCCCceEEEEEeecccccCC
Confidence 7889999999876 8999999999987776665542222222 23 5689999999999999998543
No 28
>TIGR02465 chlorocat_1_2 chlorocatechol 1,2-dioxygenase. Members of this protein family are chlorocatechol 1,2-dioxygenase. This protein is closely related to catechol 1,2-dioxygenase, TIGR02439, EC 1.13.11.1. Note that annotated database entries have appeared for the present protein family with the EC number that refers to that of family TIGR02439. This protein acts in pathways of the biodegradation of chlorinated aromatic compounds.
Probab=89.30 E-value=0.77 Score=47.46 Aligned_cols=64 Identities=17% Similarity=0.204 Sum_probs=48.6
Q ss_pred CceeEEEEEEeeecccccCCCCcceeEEEEcCCCCCCCcccc---ccccee--EEEECCccceEeCCccCCeeEE
Q 047026 291 ERGSATGRFFVQDKFVSSSLIPAKYAYIGLSSARTEGGWQTE---SKDYQF--WVQTDSKGNFTVKNVVPGVYGL 360 (596)
Q Consensus 291 ~RGtVsG~v~~~D~~~~~~~~pa~~a~V~L~~~~~~g~wq~~---~~~yqy--wt~td~~G~FtI~nVrpGtY~L 360 (596)
+.=.|+|+|...+ ++|..+|.|=++....+|....+ .....+ +..||++|.|.+.-|+||-|-+
T Consensus 97 ~~l~v~G~V~D~~------G~Pv~~A~VeiWqad~~G~Y~~~~~~~~~~~lRG~~~Td~~G~y~F~Ti~P~~Ypi 165 (246)
T TIGR02465 97 KPLLIRGTVRDLS------GTPVAGAVIDVWHSTPDGKYSGFHDNIPDDYYRGKLVTAADGSYEVRTTMPVPYQI 165 (246)
T ss_pred cEEEEEEEEEcCC------CCCcCCcEEEEECCCCCCCCCCCCCCCCCCCCeEEEEECCCCCEEEEEECCCCCCC
Confidence 4578999998655 89999999999877776644322 122333 5778999999999999999853
No 29
>TIGR02423 protocat_alph protocatechuate 3,4-dioxygenase, alpha subunit. This model represents the alpha chain of protocatechuate 3,4-dioxygenase. The most closely related family outside this family is that of the beta chain (TIGR02422), typically encoded in an adjacent locus. This enzyme acts in the degradation of aromatic compounds by way of p-hydroxybenzoate to succinate and acetyl-CoA.
Probab=89.09 E-value=0.78 Score=45.74 Aligned_cols=64 Identities=30% Similarity=0.441 Sum_probs=48.3
Q ss_pred CceeEEEEEEeeecccccCCCCcceeEEEEcCCCCCCCccccc--------ccceeE--EEECCccceEeCCccCCeeEE
Q 047026 291 ERGSATGRFFVQDKFVSSSLIPAKYAYIGLSSARTEGGWQTES--------KDYQFW--VQTDSKGNFTVKNVVPGVYGL 360 (596)
Q Consensus 291 ~RGtVsG~v~~~D~~~~~~~~pa~~a~V~L~~~~~~g~wq~~~--------~~yqyw--t~td~~G~FtI~nVrpGtY~L 360 (596)
++=.|+|+|+..+ ++|..+|.|=++.....|-...+. .+++-| ..||++|.|.+.-|+||.|..
T Consensus 38 ~~l~l~G~V~D~~------g~Pv~~A~VeiWqad~~G~Y~~~~~~~~~~~~~~f~grGr~~Td~~G~y~f~TI~Pg~Yp~ 111 (193)
T TIGR02423 38 ERIRLEGRVLDGD------GHPVPDALIEIWQADAAGRYNSPADLRAPATDPGFRGWGRTGTDESGEFTFETVKPGAVPD 111 (193)
T ss_pred CEEEEEEEEECCC------CCCCCCCEEEEEccCCCCccCCccCCcccccCCCCCCeEEEEECCCCCEEEEEEcCCCcCC
Confidence 4568999998654 899999999998777666443321 244444 458999999999999998864
No 30
>PF07495 Y_Y_Y: Y_Y_Y domain; InterPro: IPR011123 This region is mostly found at the end of the beta propellers (IPR011110 from INTERPRO) in a family of two component regulators. However they are also found tandemly repeated in Q891H4 from SWISSPROT without other signal conduction domains being present. It is named after the conserved tyrosines found in the alignment. The exact function is not known.; PDB: 3V9F_D 3VA6_B 3OTT_B 4A2M_D 4A2L_B.
Probab=88.86 E-value=0.58 Score=37.44 Aligned_cols=28 Identities=25% Similarity=0.490 Sum_probs=22.5
Q ss_pred EEEECCcc-ceEeCCccCCeeEEEEEEcc
Q 047026 339 WVQTDSKG-NFTVKNVVPGVYGLHGWVPG 366 (596)
Q Consensus 339 wt~td~~G-~FtI~nVrpGtY~L~a~~~G 366 (596)
|....... .+++.+++||+|+|.|.+..
T Consensus 21 W~~~~~~~~~~~~~~L~~G~Y~l~V~a~~ 49 (66)
T PF07495_consen 21 WITLGSYSNSISYTNLPPGKYTLEVRAKD 49 (66)
T ss_dssp EEEESSTS-EEEEES--SEEEEEEEEEEE
T ss_pred EEECCCCcEEEEEEeCCCEEEEEEEEEEC
Confidence 77787777 99999999999999999743
No 31
>smart00606 CBD_IV Cellulose Binding Domain Type IV.
Probab=88.81 E-value=5 Score=36.37 Aligned_cols=90 Identities=20% Similarity=0.328 Sum_probs=50.6
Q ss_pred CccEEEEEE-eCCCccceEEEEEEEecc-CCCeeEEEEcCccCCcccccccccCCCCeeeeeeEEEeeEEEEEEeecCce
Q 047026 482 PTTWTIKFH-LDSIIKGTYNLRLAIASA-TRSDLEIFVNYIDQGHLVYQEMNLGMDNTVCRHGIHGLYQLFSIHVSSLLL 559 (596)
Q Consensus 482 ~~~w~I~F~-L~~~~~~~~tLriala~a-~~~~~~V~vN~~~~~~p~~~~~~~~~d~~i~R~~~~G~~~~~~~~ipa~~L 559 (596)
++.| |.|+ ++-.+.+.+++.|..+.. ..+.++|++++.+.. ...+..++.... . -.+...+.+|+ |
T Consensus 38 ~g~w-~~y~~vd~~~~g~~~i~~~~as~~~~~~i~v~~d~~~G~--~~~~~~~p~tg~-----~-~~~~~~~~~v~---~ 105 (129)
T smart00606 38 DGDW-IAYKDVDFGSSGAYTFTARVASGNAGGSIELRLDSPTGT--LVGTVDVPSTGG-----W-QTYQTVSATVT---L 105 (129)
T ss_pred CCCE-EEEEeEecCCCCceEEEEEEeCCCCCceEEEEECCCCCc--EEEEEEeCCCCC-----C-ccCEEEEEEEc---c
Confidence 3455 5566 543335778888877775 345899999974432 112223332111 0 12333444443 4
Q ss_pred eeeccEEEEEEeecCCCCceEEEEEEEE
Q 047026 560 IKGDNSMFLVQSRSGDPVCGVLYDYLRL 587 (596)
Q Consensus 560 ~~G~NtI~l~~~~g~s~~~~vmyD~IrL 587 (596)
.+|.++|+|....++ . +..|.+++
T Consensus 106 ~~G~~~l~~~~~~~~-~---~~ld~~~F 129 (129)
T smart00606 106 PAGVHDVYLVFKGGN-Y---FNIDWFRF 129 (129)
T ss_pred CCceEEEEEEEECCC-c---EEEEEEEC
Confidence 489999999875543 2 77777653
No 32
>PF00775 Dioxygenase_C: Dioxygenase; InterPro: IPR000627 This entry represents the C-terminal domain common to several intradiol ring-cleavage dioxygenases. Dioxygenases catalyse the incorporation of both atoms of molecular oxygen into substrates using a variety of reaction mechanisms. Cleavage of aromatic rings is one of the most important functions of dioxygenases, which play key roles in the degradation of aromatic compounds. The substrates of ring-cleavage dioxygenases can be classified into two groups according to the mode of scission of the aromatic ring. Intradiol enzymes use a non-haem Fe(III) to cleave the aromatic ring between two hydroxyl groups (ortho-cleavage), whereas extradiol enzymes (IPR000486 from INTERPRO) use a non-haem Fe(II) to cleave the aromatic ring between a hydroxylated carbon and an adjacent non-hydroxylated carbon (meta-cleavage). These two subfamilies differ in sequence, structural fold, iron ligands, and the orientation of second sphere active site amino acid residues. Enzymes that belong to the intradiol family include catechol 1,2-dioxygenase (1,2-CTD) (1.13.11.1 from EC); protocatechuate 3,4-dioxygenase (3,4-PCD) (1.13.11.3 from EC); and chlorocatechol 1,2-dioxygenase (1.13.11.1 from EC).; GO: 0003824 catalytic activity, 0008199 ferric iron binding, 0006725 cellular aromatic compound metabolic process, 0055114 oxidation-reduction process; PDB: 2BUV_A 2BUX_A 2BUU_A 2BUR_A 1EO9_A 2BUZ_A 2BV0_A 1EO2_A 1EOC_A 1EOA_A ....
Probab=88.78 E-value=0.94 Score=44.76 Aligned_cols=64 Identities=22% Similarity=0.371 Sum_probs=40.4
Q ss_pred CceeEEEEEEeeecccccCCCCcceeEEEEcCCCCCCCccccc-------ccceeEEEECCccceEeCCccCCeeEE
Q 047026 291 ERGSATGRFFVQDKFVSSSLIPAKYAYIGLSSARTEGGWQTES-------KDYQFWVQTDSKGNFTVKNVVPGVYGL 360 (596)
Q Consensus 291 ~RGtVsG~v~~~D~~~~~~~~pa~~a~V~L~~~~~~g~wq~~~-------~~yqywt~td~~G~FtI~nVrpGtY~L 360 (596)
+.=.|.|+|...+ ++|..+|.|=++.....|....+. ....=+..||++|.|++.-|+||.|.+
T Consensus 28 ~~l~l~G~V~D~~------g~Pv~~A~veiWqada~G~Ys~~~~~~~~~~~~~rG~~~Td~~G~y~f~Ti~Pg~Y~~ 98 (183)
T PF00775_consen 28 EPLVLHGRVIDTD------GKPVPGALVEIWQADADGRYSGQDPGSDQPDFNLRGRFRTDADGRYSFRTIKPGPYPI 98 (183)
T ss_dssp -EEEEEEEEEETT------SSB-TTEEEEEEE--TTS--TTTBTTSSSSTTTTEEEEEECTTSEEEEEEE----EEE
T ss_pred CEEEEEEEEECCC------CCCCCCcEEEEEecCCCCccccccccccccCCCcceEEecCCCCEEEEEeeCCCCCCC
Confidence 3558999999765 899999999998766666333221 122234678999999999999999975
No 33
>PF03170 BcsB: Bacterial cellulose synthase subunit; InterPro: IPR018513 An operon encoding 4 proteins required for bacterial cellulose biosynthesis (bcs) in Acetobacter xylinus (Gluconacetobacter xylinus) has been isolated via genetic complementation with strains lacking cellulose synthase activity. Nucleotide sequence analysis showed the cellulose synthase operon to consist of 4 genes, designated bcsA, bcsB, bcsC and bcsD, all of which are required for maximal bacterial cellulose synthesis in A. xylinum. The calculated molecular mass of the protein encoded by bcsB is 85.3 kDa. BcsB encodes the catalytic subunit of cellulose synthase. The protein polymerises uridine 5'-diphosphate glucose to cellulose: UDP-glucose + (1,4-beta-D-glucosyl)(N) = UDP + (1,4-beta-D-glucosyl)(N+1). The enzyme is specifically activated by the nucleotide cyclic diguanylic acid. Sequence analysis suggests that BcsB contains several transmembrane (TM) domains, and shares a high degree of similarity with Escherichia coli YhjN.; GO: 0006011 UDP-glucose metabolic process, 0016020 membrane
Probab=88.09 E-value=1.4 Score=50.79 Aligned_cols=77 Identities=23% Similarity=0.229 Sum_probs=53.6
Q ss_pred ccEEEEEEeCCCc-cceEEEEEEEecc-----CCCeeEEEEcCccCCcccccccccCCCCeeeeeeEEEeeEEEEEEeec
Q 047026 483 TTWTIKFHLDSII-KGTYNLRLAIASA-----TRSDLEIFVNYIDQGHLVYQEMNLGMDNTVCRHGIHGLYQLFSIHVSS 556 (596)
Q Consensus 483 ~~w~I~F~L~~~~-~~~~tLriala~a-----~~~~~~V~vN~~~~~~p~~~~~~~~~d~~i~R~~~~G~~~~~~~~ipa 556 (596)
..-.|.|.+...+ ...++|+|.+.-+ ..+.++|.|||..+.. ..+..++. .....+|+||+
T Consensus 29 ~~~~~~f~v~~~~~v~~a~L~L~~~~S~~l~~~~S~L~V~lNg~~v~s-----~~l~~~~~--------~~~~~~i~Ip~ 95 (605)
T PF03170_consen 29 ASRTIYFPVPADWVVTKATLNLSYTYSPSLLPERSQLTVSLNGQPVGS-----IPLDAESA--------QPQTVTIPIPP 95 (605)
T ss_pred CceEEEEEcCCCccccceEEEEEEEECcccCCCcceEEEEECCEEeEE-----EecCcCCC--------CceEEEEecCh
Confidence 4556777777766 4556666666654 2368999999987652 22222222 24678999999
Q ss_pred CceeeeccEEEEEEeec
Q 047026 557 LLLIKGDNSMFLVQSRS 573 (596)
Q Consensus 557 ~~L~~G~NtI~l~~~~g 573 (596)
. |++|.|.|.|.....
T Consensus 96 ~-l~~g~N~l~~~~~~~ 111 (605)
T PF03170_consen 96 A-LIKGFNRLTFEFIGH 111 (605)
T ss_pred h-hcCCceEEEEEEEec
Confidence 9 999999999987543
No 34
>cd03464 3,4-PCD_beta Protocatechuate 3,4-dioxygenase (3,4-PCD), beta subunit. 3,4-PCD catalyzes the oxidative ring cleavage of 3,4-dihydroxybenzoate to produce beta-carboxy-cis,cis-muconate. 3,4-PCDs are large aggregates of 12 protomers, each composed of an alpha- and beta-subunit and an Fe3+ ion bound in the beta-subunit at the alpha-subunit-beta-subunit interface. 3,4-PCD is a member of the aromatic dioxygenases which are non-heme iron intradiol-cleaving enzymes that break the C1-C2 bond and utilize Fe3+.
Probab=87.40 E-value=1.3 Score=45.00 Aligned_cols=65 Identities=22% Similarity=0.384 Sum_probs=48.7
Q ss_pred CCceeEEEEEEeeecccccCCCCcceeEEEEcCCCCCCCcccc--------cccceeE--EEECCccceEeCCccCCeeE
Q 047026 290 NERGSATGRFFVQDKFVSSSLIPAKYAYIGLSSARTEGGWQTE--------SKDYQFW--VQTDSKGNFTVKNVVPGVYG 359 (596)
Q Consensus 290 ~~RGtVsG~v~~~D~~~~~~~~pa~~a~V~L~~~~~~g~wq~~--------~~~yqyw--t~td~~G~FtI~nVrpGtY~ 359 (596)
.++=.|+|+|...+ ++|..+|.|=++.....|-...+ ..+++.+ ..||++|.|.|.-|+||.|.
T Consensus 63 G~~i~l~G~V~D~~------G~PV~~A~VEIWQad~~G~Y~~~~d~~~~~~~~~f~grGr~~TD~~G~y~F~TI~Pg~Yp 136 (220)
T cd03464 63 GERIIVHGRVLDED------GRPVPNTLVEIWQANAAGRYRHKRDQHDAPLDPNFGGAGRTLTDDDGYYRFRTIKPGAYP 136 (220)
T ss_pred CCEEEEEEEEECCC------CCCCCCCEEEEEecCCCCcccCccCCcccccCCCCCCEEEEEECCCccEEEEEECCCCcc
Confidence 45678899998655 89999999999877666644322 1234434 46899999999999999995
Q ss_pred E
Q 047026 360 L 360 (596)
Q Consensus 360 L 360 (596)
.
T Consensus 137 ~ 137 (220)
T cd03464 137 W 137 (220)
T ss_pred C
Confidence 4
No 35
>TIGR02422 protocat_beta protocatechuate 3,4-dioxygenase, beta subunit. This model represents the beta chain of protocatechuate 3,4-dioxygenase. The most closely related family outside this family is that of the alpha chain (TIGR02423), typically encoded in an adjacent locus. This enzyme acts in the degradation of aromatic compounds by way of p-hydroxybenzoate to succinate and acetyl-CoA.
Probab=87.16 E-value=1.4 Score=44.80 Aligned_cols=67 Identities=22% Similarity=0.368 Sum_probs=49.0
Q ss_pred CCCCceeEEEEEEeeecccccCCCCcceeEEEEcCCCCCCCcccc--------cccceeE--EEECCccceEeCCccCCe
Q 047026 288 TANERGSATGRFFVQDKFVSSSLIPAKYAYIGLSSARTEGGWQTE--------SKDYQFW--VQTDSKGNFTVKNVVPGV 357 (596)
Q Consensus 288 ~~~~RGtVsG~v~~~D~~~~~~~~pa~~a~V~L~~~~~~g~wq~~--------~~~yqyw--t~td~~G~FtI~nVrpGt 357 (596)
+..++=.|+|+|+..+ ++|..+|.|=++.....|....+ ..+++.+ ..||++|.|.|.-|+||-
T Consensus 56 ~~G~~i~l~G~V~D~~------g~PV~~A~VEIWQada~G~Y~~~~d~~~~~~~~~f~grGr~~TD~~G~y~F~TI~PG~ 129 (220)
T TIGR02422 56 PIGERIIVHGRVLDED------GRPVPNTLVEVWQANAAGRYRHKNDQYLAPLDPNFGGVGRTLTDSDGYYRFRTIKPGP 129 (220)
T ss_pred CCCCEEEEEEEEECCC------CCCCCCCEEEEEecCCCCcccCccCccccccCCCCCCEEEEEECCCccEEEEEECCCC
Confidence 3345778999998765 89999999999876666644322 1233323 458999999999999999
Q ss_pred eEE
Q 047026 358 YGL 360 (596)
Q Consensus 358 Y~L 360 (596)
|..
T Consensus 130 Y~~ 132 (220)
T TIGR02422 130 YPW 132 (220)
T ss_pred ccC
Confidence 854
No 36
>cd03462 1,2-CCD chlorocatechol 1,2-dioxygenases (1,2-CCDs) (type II enzymes) are homodimeric intradiol dioxygenases that degrade chlorocatechols via the addition of molecular oxygen and the subsequent cleavage between two adjacent hydroxyl groups. This reaction is part of the modified ortho-cleavage pathway which is a central oxidative bacterial pathway that channels chlorocatechols, derived from the degradation of chlorinated benzoic acids, phenoxyacetic acids, phenols, benzenes, and other aromatics into the energy-generating tricarboxylic acid pathway.
Probab=85.97 E-value=1.2 Score=46.07 Aligned_cols=65 Identities=18% Similarity=0.241 Sum_probs=47.3
Q ss_pred CCceeEEEEEEeeecccccCCCCcceeEEEEcCCCCCCCccccc---cccee--EEEECCccceEeCCccCCeeEE
Q 047026 290 NERGSATGRFFVQDKFVSSSLIPAKYAYIGLSSARTEGGWQTES---KDYQF--WVQTDSKGNFTVKNVVPGVYGL 360 (596)
Q Consensus 290 ~~RGtVsG~v~~~D~~~~~~~~pa~~a~V~L~~~~~~g~wq~~~---~~yqy--wt~td~~G~FtI~nVrpGtY~L 360 (596)
.++=.|+|+|...+ ++|..+|.|=++.....|....+. ....+ ...||++|.|.+.-|+||.|-+
T Consensus 97 G~~l~l~G~V~D~~------G~Pv~~A~VeiWqad~~G~Y~~~~~~~~~~~~RG~~~Td~~G~y~F~Ti~P~~Ypi 166 (247)
T cd03462 97 HKPLLFRGTVKDLA------GAPVAGAVIDVWHSTPDGKYSGFHPNIPEDYYRGKIRTDEDGRYEVRTTVPVPYQI 166 (247)
T ss_pred CCEEEEEEEEEcCC------CCCcCCcEEEEECCCCCCCcCCCCCCCCCCCCEEEEEeCCCCCEEEEEECCCCcCC
Confidence 34568999998665 899999999998776666433211 11222 3568999999999999999843
No 37
>cd03458 Catechol_intradiol_dioxygenases Catechol intradiol dioxygenases can be divided into several subgroups according to their substrate specificity for catechol, chlorocatechols and hydroxyquinols. Almost all members of this family are homodimers containing one ferric ion (Fe3+) per monomer. They belong to the intradiol dioxygenase family, a family of mononuclear non-heme iron intradiol-cleaving enzymes that catalyze the oxygenation of catecholates to aliphatic acids via the cleavage of aromatic rings.
Probab=85.68 E-value=3 Score=43.46 Aligned_cols=65 Identities=18% Similarity=0.295 Sum_probs=48.6
Q ss_pred CCceeEEEEEEeeecccccCCCCcceeEEEEcCCCCCCCcccc---cccceeE--EEECCccceEeCCccCCeeEE
Q 047026 290 NERGSATGRFFVQDKFVSSSLIPAKYAYIGLSSARTEGGWQTE---SKDYQFW--VQTDSKGNFTVKNVVPGVYGL 360 (596)
Q Consensus 290 ~~RGtVsG~v~~~D~~~~~~~~pa~~a~V~L~~~~~~g~wq~~---~~~yqyw--t~td~~G~FtI~nVrpGtY~L 360 (596)
.++=.|+|+|...+ ++|..+|.|=++.....|....+ ......+ ..||++|.|.+.-|+||-|-+
T Consensus 102 G~~l~l~G~V~D~~------G~Pv~~A~VeiWqad~~G~Y~~~~~~~~~~~lRG~~~Td~~G~y~f~Ti~P~~Ypi 171 (256)
T cd03458 102 GEPLFVHGTVTDTD------GKPLAGATVDVWHADPDGFYSQQDPDQPEFNLRGKFRTDEDGRYRFRTIRPVPYPI 171 (256)
T ss_pred CcEEEEEEEEEcCC------CCCCCCcEEEEEccCCCCCcCCCCCCCCCCCCEEEEEeCCCCCEEEEEECCCCccC
Confidence 34567999999765 89999999999877666644321 2233333 568999999999999999954
No 38
>KOG1948 consensus Metalloproteinase-related collagenase pM5 [Posttranslational modification, protein turnover, chaperones]
Probab=84.62 E-value=2.4 Score=50.38 Aligned_cols=58 Identities=21% Similarity=0.279 Sum_probs=41.8
Q ss_pred eeEEEEEEeeecccccCCCCcceeEEEEcCCCCCCCcccccccceeEEEECCccceEeCCccCCeeEEEEEEcc
Q 047026 293 GSATGRFFVQDKFVSSSLIPAKYAYIGLSSARTEGGWQTESKDYQFWVQTDSKGNFTVKNVVPGVYGLHGWVPG 366 (596)
Q Consensus 293 GtVsG~v~~~D~~~~~~~~pa~~a~V~L~~~~~~g~wq~~~~~yqywt~td~~G~FtI~nVrpGtY~L~a~~~G 366 (596)
-+|+|+|.... +-.+.++.|-|-... +--=-|.|+++|.|.+.||.||+|.+.|..+.
T Consensus 119 Fsv~GkVlgaa------ggGpagV~velrs~e----------~~iast~T~~~Gky~f~~iiPG~Yev~ashp~ 176 (1165)
T KOG1948|consen 119 FSVRGKVLGAA------GGGPAGVLVELRSQE----------DPIASTKTEDGGKYEFRNIIPGKYEVSASHPA 176 (1165)
T ss_pred eeEeeEEeecc------CCCcccceeeccccc----------CcceeeEecCCCeEEEEecCCCceEEeccCcc
Confidence 36788887753 234455666664331 22336889999999999999999999998765
No 39
>cd03460 1,2-CTD Catechol 1,2 dioxygenase (1,2-CTD) catalyzes an intradiol cleavage reaction of catechol to form cis,cis-muconate. 1,2-CTDs are homodimers with one catalytic non-heme ferric ion per monomer. They belong to the aromatic dioxygenase family, a family of mononuclear non-heme iron intradiol-cleaving enzymes that catalyze the oxygenation of catecholates to aliphatic acids via the cleavage of aromatic rings.
Probab=83.82 E-value=2 Score=45.30 Aligned_cols=65 Identities=20% Similarity=0.353 Sum_probs=49.2
Q ss_pred CCceeEEEEEEeeecccccCCCCcceeEEEEcCCCCCCCccc---ccccceeE--EEECCccceEeCCccCCeeEE
Q 047026 290 NERGSATGRFFVQDKFVSSSLIPAKYAYIGLSSARTEGGWQT---ESKDYQFW--VQTDSKGNFTVKNVVPGVYGL 360 (596)
Q Consensus 290 ~~RGtVsG~v~~~D~~~~~~~~pa~~a~V~L~~~~~~g~wq~---~~~~yqyw--t~td~~G~FtI~nVrpGtY~L 360 (596)
.++=.|+|+|...+ ++|..+|.|=++.....|.... ...+++.+ ..||++|.|.+.-|+||-|-+
T Consensus 122 Gepl~l~G~V~D~~------G~PI~~A~VeiWqad~~G~Ys~~~~~~~~f~~RGr~~TD~~G~y~F~TI~P~~Ypi 191 (282)
T cd03460 122 GETLVMHGTVTDTD------GKPVPGAKVEVWHANSKGFYSHFDPTQSPFNLRRSIITDADGRYRFRSIMPSGYGV 191 (282)
T ss_pred CCEEEEEEEEECCC------CCCcCCcEEEEECCCCCCCcCCCCCCCCCCCCceEEEeCCCCCEEEEEECCCCCcC
Confidence 45678999998765 8999999999987777664432 12233333 568999999999999999953
No 40
>TIGR02438 catachol_actin catechol 1,2-dioxygenase, Actinobacterial. Members of this family are catechol 1,2-dioxygenases of the Actinobacteria. They are more closely related to actinobacterial chlorocatechol 1,2-dioxygenases than to proteobacterial catechol 1,2-dioxygenases, and so are built in this separate model. The member from Rhodococcus rhodochrous NCIMB 13259 (GB|AAC33003.1) is described as a homodimer with bound Fe, similarly active on catechol, 3-methylcatechol and 4-methylcatechol.
Probab=83.12 E-value=2.3 Score=44.81 Aligned_cols=64 Identities=17% Similarity=0.218 Sum_probs=47.0
Q ss_pred CceeEEEEEEeeecccccCCCCcceeEEEEcCCCCCCCcccc---cccceeE--EEECCccceEeCCccCCeeEE
Q 047026 291 ERGSATGRFFVQDKFVSSSLIPAKYAYIGLSSARTEGGWQTE---SKDYQFW--VQTDSKGNFTVKNVVPGVYGL 360 (596)
Q Consensus 291 ~RGtVsG~v~~~D~~~~~~~~pa~~a~V~L~~~~~~g~wq~~---~~~yqyw--t~td~~G~FtI~nVrpGtY~L 360 (596)
++=.|+|+|...+ ++|..+|.|=++.....|....+ ....++. ..||++|.|.+.-|+||-|-+
T Consensus 131 ~pl~v~G~V~D~~------G~Pv~gA~VdiWqada~G~Ys~~~~~~~~~~lRGr~~TDadG~y~F~TI~Pg~Ypi 199 (281)
T TIGR02438 131 TPLVFSGQVTDLD------GNGLAGAKVELWHADDDGFYSQFAPGIPEWNLRGTIIADDEGRFEITTMQPAPYQI 199 (281)
T ss_pred CEEEEEEEEEcCC------CCCcCCCEEEEEecCCCCCcCCCCCCCCCCCCeEEEEeCCCCCEEEEEECCCCcCC
Confidence 4568999998655 89999999999766666643221 1222233 568999999999999999964
No 41
>TIGR02439 catechol_proteo catechol 1,2-dioxygenase, proteobacterial. Members of this family known so far are catechol 1,2-dioxygenases of the Proteobacteria. They are distinct from catechol 1,2-dioxygenases and chlorocatechol 1,2-dioxygenases of the Actinobacteria, which are quite similar to each other and resolved by separate models. This enzyme catalyzes intradiol cleavage in which catechol + O2 becomes cis,cis-muconate. Catechol is an intermediate in the catabolism of many different aromatic compounds, as is the alternative intermediate protocatechuate. In Acinetobacter lwoffii, two isozymes are present with abilities, differing somewhat, to act on catechol analogs 3-methylcatechol, 4-methylcatechol, 4-methoxycatechol, and 4-chlorocatechol.
Probab=82.57 E-value=2.5 Score=44.60 Aligned_cols=64 Identities=25% Similarity=0.409 Sum_probs=48.4
Q ss_pred CceeEEEEEEeeecccccCCCCcceeEEEEcCCCCCCCccc---ccccceeE--EEECCccceEeCCccCCeeEE
Q 047026 291 ERGSATGRFFVQDKFVSSSLIPAKYAYIGLSSARTEGGWQT---ESKDYQFW--VQTDSKGNFTVKNVVPGVYGL 360 (596)
Q Consensus 291 ~RGtVsG~v~~~D~~~~~~~~pa~~a~V~L~~~~~~g~wq~---~~~~yqyw--t~td~~G~FtI~nVrpGtY~L 360 (596)
++=.|+|+|...+ ++|..+|.|=++.....|.... ...+++.+ ..||++|.|.+.-|+||-|-+
T Consensus 127 ~pl~v~G~V~D~~------G~PI~gA~VeIWqad~~G~Ys~~~~~~~~~~lRG~~~TD~~G~y~F~TI~P~~Ypi 195 (285)
T TIGR02439 127 ETLFLHGQVTDAD------GKPIAGAKVELWHANTKGNYSHFDKSQSEFNLRRTIITDAEGRYRARSIVPSGYGC 195 (285)
T ss_pred cEEEEEEEEECCC------CCCcCCcEEEEEccCCCCCcCCCCCCCCCCCceEEEEECCCCCEEEEEECCCCCcC
Confidence 4568999998665 8999999999987776664432 12334444 568999999999999999953
No 42
>PF13364 BetaGal_dom4_5: Beta-galactosidase jelly roll domain; PDB: 1TG7_A 1XC6_A 3OGS_A 3OGV_A 3OGR_A 3OG2_A.
Probab=81.67 E-value=4.9 Score=36.31 Aligned_cols=54 Identities=19% Similarity=0.182 Sum_probs=34.1
Q ss_pred EEEE-EEEeccCCCeeEEEEcCccCCcccccccccCCCCeeeeeeEEEeeEEEEEEeecCceeeeccEEEEE
Q 047026 499 YNLR-LAIASATRSDLEIFVNYIDQGHLVYQEMNLGMDNTVCRHGIHGLYQLFSIHVSSLLLIKGDNSMFLV 569 (596)
Q Consensus 499 ~tLr-iala~a~~~~~~V~vN~~~~~~p~~~~~~~~~d~~i~R~~~~G~~~~~~~~ipa~~L~~G~NtI~l~ 569 (596)
..|+ +.+......+.+|.|||+.++.=. +.+ ....+|.||++.|+.++|.|.+-
T Consensus 50 ~~~~~l~~~~g~~~~~~vwVNG~~~G~~~---~~~--------------g~q~tf~~p~~il~~~n~v~~vl 104 (111)
T PF13364_consen 50 TSLTPLNIQGGNAFRASVWVNGWFLGSYW---PGI--------------GPQTTFSVPAGILKYGNNVLVVL 104 (111)
T ss_dssp EEEE-EEECSSTTEEEEEEETTEEEEEEE---TTT--------------ECCEEEEE-BTTBTTCEEEEEEE
T ss_pred eeEEEEeccCCCceEEEEEECCEEeeeec---CCC--------------CccEEEEeCceeecCCCEEEEEE
Confidence 4455 555556667899999999775200 011 11289999999999985554443
No 43
>PF10670 DUF4198: Domain of unknown function (DUF4198)
Probab=81.52 E-value=4.3 Score=39.67 Aligned_cols=62 Identities=16% Similarity=0.136 Sum_probs=47.9
Q ss_pred ceeEEEEEEeeecccccCCCCcceeEEEEcCCCCCCCcccccccceeEEEECCccceEeCCccCCeeEEEEEE
Q 047026 292 RGSATGRFFVQDKFVSSSLIPAKYAYIGLSSARTEGGWQTESKDYQFWVQTDSKGNFTVKNVVPGVYGLHGWV 364 (596)
Q Consensus 292 RGtVsG~v~~~D~~~~~~~~pa~~a~V~L~~~~~~g~wq~~~~~yqywt~td~~G~FtI~nVrpGtY~L~a~~ 364 (596)
-..++.+|+-. |+|..++.|.+...+. |. +........+||++|.++|+=-+||.|-|.+..
T Consensus 150 g~~~~~~vl~~-------GkPl~~a~V~~~~~~~---~~-~~~~~~~~~~TD~~G~~~~~~~~~G~wli~a~~ 211 (215)
T PF10670_consen 150 GDPLPFQVLFD-------GKPLAGAEVEAFSPGG---WY-DVEHEAKTLKTDANGRATFTLPRPGLWLIRASH 211 (215)
T ss_pred CCEEEEEEEEC-------CeEcccEEEEEEECCC---cc-ccccceEEEEECCCCEEEEecCCCEEEEEEEEE
Confidence 45788898864 7999999999876553 21 112227789999999999998899999998854
No 44
>PRK11114 cellulose synthase regulator protein; Provisional
Probab=81.51 E-value=2.5 Score=50.28 Aligned_cols=74 Identities=20% Similarity=0.172 Sum_probs=48.5
Q ss_pred EEEEEeCCCc-cceEEEEEEEecc-----CCCeeEEEEcCccCCcccccccccCCCCeeeeeeEEEeeEEEEEEeecCce
Q 047026 486 TIKFHLDSII-KGTYNLRLAIASA-----TRSDLEIFVNYIDQGHLVYQEMNLGMDNTVCRHGIHGLYQLFSIHVSSLLL 559 (596)
Q Consensus 486 ~I~F~L~~~~-~~~~tLriala~a-----~~~~~~V~vN~~~~~~p~~~~~~~~~d~~i~R~~~~G~~~~~~~~ipa~~L 559 (596)
.|.|.++..+ ...++|+|...-+ ..++++|.|||..+.. ..+..++ .|.....+|+||+ .|
T Consensus 84 ~i~f~vp~d~~v~~A~L~L~y~~Sp~l~~~~S~L~V~lNg~~v~s-----~pL~~~~-------~~~~~~~~i~IP~-~l 150 (756)
T PRK11114 84 GIEFGVRSDEVVTKARLNLEYTYSPALLPDLSHLKVYLNGELMGT-----LPLDKEQ-------LGKKVLAQLPIDP-RF 150 (756)
T ss_pred eeEeecCccccccCcEEEEEEEECCCCCCCCCeEEEEECCEEeEE-----EecCccc-------CCCcceeEEecCH-HH
Confidence 6777777666 3445555554443 3478999999986641 2222111 2445788999999 56
Q ss_pred eeeccEEEEEEee
Q 047026 560 IKGDNSMFLVQSR 572 (596)
Q Consensus 560 ~~G~NtI~l~~~~ 572 (596)
..|.|.|.|....
T Consensus 151 ~~g~N~L~~~~~~ 163 (756)
T PRK11114 151 ITDFNRLRLEFIG 163 (756)
T ss_pred cCCCceEEEEEec
Confidence 6899999998643
No 45
>cd03461 1,2-HQD Hydroxyquinol 1,2-dioxygenase (1,2-HQD) catalyzes the ring cleavage of hydroxyquinol (1,2,4-trihydroxybenzene), an intermediate in the degradation of a large variety of aromatic compounds including some polychloro- and nitroaromatic pollutants, to form 3-hydroxy-cis,cis-muconates. 1,2-HQD belongs to the aromatic dioxygenase family, a family of mononuclear non-heme intradiol-cleaving enzymes.
Probab=81.47 E-value=3 Score=43.92 Aligned_cols=65 Identities=20% Similarity=0.341 Sum_probs=48.3
Q ss_pred CCceeEEEEEEeeecccccCCCCcceeEEEEcCCCCCCCcccc---cccceeE--EEECCccceEeCCccCCeeEE
Q 047026 290 NERGSATGRFFVQDKFVSSSLIPAKYAYIGLSSARTEGGWQTE---SKDYQFW--VQTDSKGNFTVKNVVPGVYGL 360 (596)
Q Consensus 290 ~~RGtVsG~v~~~D~~~~~~~~pa~~a~V~L~~~~~~g~wq~~---~~~yqyw--t~td~~G~FtI~nVrpGtY~L 360 (596)
.++=.|+|+|...+ ++|..+|.|=++.....|....+ ..+...+ ..||++|.|.+.-|+||-|-+
T Consensus 118 G~~l~v~G~V~D~~------G~Pv~gA~VeiWqad~~G~Y~~~~~~~~~~~lRGr~~Td~~G~y~F~Ti~Pg~Ypi 187 (277)
T cd03461 118 GEPCFVHGRVTDTD------GKPLPGATVDVWQADPNGLYDVQDPDQPEFNLRGKFRTDEDGRYAFRTLRPTPYPI 187 (277)
T ss_pred CCEEEEEEEEEcCC------CCCcCCcEEEEECcCCCCCcCCCCCCCCCCCCeEEEEeCCCCCEEEEEECCCCcCC
Confidence 34678999999765 89999999999876666643321 1223333 568999999999999999975
No 46
>PF02837 Glyco_hydro_2_N: Glycosyl hydrolases family 2, sugar binding domain; InterPro: IPR006104 O-Glycosyl hydrolases 3.2.1. from EC are a widespread group of enzymes that hydrolyse the glycosidic bond between two or more carbohydrates, or between a carbohydrate and a non-carbohydrate moiety. A classification system for glycosyl hydrolases, based on sequence similarity, has led to the definition of 85 different families [, ]. This classification is available on the CAZy (CArbohydrate-Active EnZymes) web site. Glycoside hydrolase family 2 GH2 from CAZY comprises enzymes with several known activities; beta-galactosidase (3.2.1.23 from EC); beta-mannosidase (3.2.1.25 from EC); beta-glucuronidase (3.2.1.31 from EC). These enzymes contain a conserved glutamic acid residue which has been shown [], in Escherichia coli lacZ (P00722 from SWISSPROT), to be the general acid/base catalyst in the active site of the enzyme. This domain has a jelly-roll fold [].; GO: 0004553 hydrolase activity, hydrolyzing O-glycosyl compounds, 0005975 carbohydrate metabolic process; PDB: 3DEC_A 3OB8_A 3OBA_A 3CMG_A 3FN9_C 2VZU_A 2X09_A 2VZO_A 2X05_A 2VZV_B ....
Probab=79.63 E-value=4.8 Score=37.93 Aligned_cols=66 Identities=23% Similarity=0.253 Sum_probs=44.3
Q ss_pred EEEEEeCCCcc-ceEEEEEEEeccCCCeeEEEEcCccCCcccccccccCCCCeeeeeeEEEeeEEEEEEeecCceeeec-
Q 047026 486 TIKFHLDSIIK-GTYNLRLAIASATRSDLEIFVNYIDQGHLVYQEMNLGMDNTVCRHGIHGLYQLFSIHVSSLLLIKGD- 563 (596)
Q Consensus 486 ~I~F~L~~~~~-~~~tLriala~a~~~~~~V~vN~~~~~~p~~~~~~~~~d~~i~R~~~~G~~~~~~~~ipa~~L~~G~- 563 (596)
+=+|++++... ..+.|++.-.. ....|.|||..++. ..+.+..++++|+. .|+.|.
T Consensus 73 r~~f~lp~~~~~~~~~L~f~gv~---~~a~v~vNG~~vg~------------------~~~~~~~~~~dIt~-~l~~g~~ 130 (167)
T PF02837_consen 73 RRTFTLPADWKGKRVFLRFEGVD---YAAEVYVNGKLVGS------------------HEGGYTPFEFDITD-YLKPGEE 130 (167)
T ss_dssp EEEEEESGGGTTSEEEEEESEEE---SEEEEEETTEEEEE------------------EESTTS-EEEECGG-GSSSEEE
T ss_pred EEEEEeCchhcCceEEEEeccce---EeeEEEeCCeEEee------------------eCCCcCCeEEeChh-hccCCCC
Confidence 33588876552 34555554433 45789999985531 12446789999975 789998
Q ss_pred cEEEEEEeec
Q 047026 564 NSMFLVQSRS 573 (596)
Q Consensus 564 NtI~l~~~~g 573 (596)
|+|.+.+.+.
T Consensus 131 N~l~V~v~~~ 140 (167)
T PF02837_consen 131 NTLAVRVDNW 140 (167)
T ss_dssp EEEEEEEESS
T ss_pred EEEEEEEeec
Confidence 9999999764
No 47
>KOG2649 consensus Zinc carboxypeptidase [General function prediction only]
Probab=77.92 E-value=7.1 Score=43.90 Aligned_cols=77 Identities=21% Similarity=0.280 Sum_probs=53.1
Q ss_pred eeEEEEEEeeecccccCCCCcceeEEEEcCCCCCCCcccccccceeEEEECCccce-EeCCccCCeeEEEEEEcceeeeE
Q 047026 293 GSATGRFFVQDKFVSSSLIPAKYAYIGLSSARTEGGWQTESKDYQFWVQTDSKGNF-TVKNVVPGVYGLHGWVPGFIGDY 371 (596)
Q Consensus 293 GtVsG~v~~~D~~~~~~~~pa~~a~V~L~~~~~~g~wq~~~~~yqywt~td~~G~F-tI~nVrpGtY~L~a~~~G~~G~~ 371 (596)
--|+|-|.... ++|..+|+|-+..-.. =.+|...|.| .+ ..||.|.|+|.+.|+.
T Consensus 378 ~GIkG~V~D~~------G~~I~NA~IsV~ginH-------------dv~T~~~GDYWRL--L~PG~y~vta~A~Gy~--- 433 (500)
T KOG2649|consen 378 RGIKGLVFDDT------GNPIANATISVDGINH-------------DVTTAKEGDYWRL--LPPGKYIITASAEGYD--- 433 (500)
T ss_pred hccceeEEcCC------CCccCceEEEEecCcC-------------ceeecCCCceEEe--eCCcceEEEEecCCCc---
Confidence 34889888643 8999999999975443 1234455543 44 7899999999999754
Q ss_pred eeeeEEEEeCCCeeeecceEEee
Q 047026 372 LDKALVTISAGSQTELGNLTYVP 394 (596)
Q Consensus 372 ~~~~~VtV~aG~t~~l~~l~~~~ 394 (596)
....+|+|..-..+.. ++++..
T Consensus 434 ~~tk~v~V~~~~a~~~-df~L~~ 455 (500)
T KOG2649|consen 434 PVTKTVTVPPDRAARV-NFTLQR 455 (500)
T ss_pred ceeeEEEeCCCCccce-eEEEec
Confidence 4556788886333333 677765
No 48
>TIGR02962 hdxy_isourate hydroxyisourate hydrolase. Members of this family, hydroxyisourate hydrolase, represent a distinct clade of transthyretin-related proteins. Bacterial members typically are encoded next to ureidoglycolate hydrolase and often near either xanthine dehydrogenase or xanthine/uracil permease genes and have been demonstrated to have hydroxyisourate hydrolase activity. In eukaryotes, a clade separate from the transthyretins (a family of thyroid-hormone binding proteins) has also been shown to have HIU hydrolase activity in urate catabolizing organisms. Transthyretin, then, would appear to be the recently diverged paralog of the more ancient HIUH family.
Probab=77.33 E-value=5.4 Score=36.50 Aligned_cols=51 Identities=22% Similarity=0.177 Sum_probs=37.0
Q ss_pred CCCCcceeEEEEcCCCCCCCcccccccceeEEEECCccceE-----eCCccCCeeEEEEEEc
Q 047026 309 SLIPAKYAYIGLSSARTEGGWQTESKDYQFWVQTDSKGNFT-----VKNVVPGVYGLHGWVP 365 (596)
Q Consensus 309 ~~~pa~~a~V~L~~~~~~g~wq~~~~~yqywt~td~~G~Ft-----I~nVrpGtY~L~a~~~ 365 (596)
.|+||.++.|-|..... +.|+ .---++||++|+.. ...+.||+|+|..-..
T Consensus 12 ~G~PAagv~V~L~~~~~-~~~~-----~i~~~~Tn~DGR~~~~l~~~~~~~~G~Y~l~F~~g 67 (112)
T TIGR02962 12 SGKPAAGVPVTLYRLDG-SGWT-----PLAEGVTNADGRCPDLLPEGETLAAGIYKLRFDTG 67 (112)
T ss_pred CCccCCCCEEEEEEecC-CCeE-----EEEEEEECCCCCCcCcccCcccCCCeeEEEEEEhh
Confidence 48999999999974321 1242 23346799999987 4567899999998764
No 49
>PF03170 BcsB: Bacterial cellulose synthase subunit; InterPro: IPR018513 An operon encoding 4 proteins required for bacterial cellulose biosynthesis (bcs) in Acetobacter xylinus (Gluconacetobacter xylinus) has been isolated via genetic complementation with strains lacking cellulose synthase activity []. Nucleotide sequence analysis showed the cellulose synthase operon to consist of 4 genes, designated bcsA, bcsB, bcsC and bcsD, all of which are required for maximal bacterial cellulose synthesis in A. xylinum. The calculated molecular mass of the protein encoded by bcsB is 85.3kDa []. BcsB encodes the catalytic subunit of cellulose synthase. The protein polymerises uridine 5'-diphosphate glucose to cellulose: UDP-glucose + (1,4-beta-D-glucosyl)(N) = UDP + (1,4-beta-D-glucosyl)(N+1). The enzyme is specifically activated by the nucleotide cyclic diguanylic acid. Sequence analysis suggests that BcsB contains several transmembrane (TM) domains, and shares a high degree of similarity with Escherichia coli YhjN.; GO: 0006011 UDP-glucose metabolic process, 0016020 membrane
Probab=75.28 E-value=6.3 Score=45.57 Aligned_cols=78 Identities=19% Similarity=0.261 Sum_probs=53.5
Q ss_pred CccEEEEEEeCCCc----cceEEEEEEEecc-----CCCeeEEEEcCccCCcccccccccCCCCeeeeeeEEEeeEEEEE
Q 047026 482 PTTWTIKFHLDSII----KGTYNLRLAIASA-----TRSDLEIFVNYIDQGHLVYQEMNLGMDNTVCRHGIHGLYQLFSI 552 (596)
Q Consensus 482 ~~~w~I~F~L~~~~----~~~~tLriala~a-----~~~~~~V~vN~~~~~~p~~~~~~~~~d~~i~R~~~~G~~~~~~~ 552 (596)
..+..+.|.|+..- .....|.+..+.+ ..+++.|.|||.-+.. ..+.+ +-.+....+++
T Consensus 323 ~~~~~~~f~lP~dl~~~~~~~i~l~L~y~y~~~~~~~~S~l~V~vNg~~i~s-----~~L~~-------~~~~~~~~~~v 390 (605)
T PF03170_consen 323 PQPISFNFRLPPDLFAWDGSGIPLHLRYRYTPGLDFDGSRLTVYVNGQFIGS-----LPLTP-------ADGAGFDRYTV 390 (605)
T ss_pred CCcceeEeeCCccccccCCCceEEEEEEecCCCCCCCCcEEEEEECCEEEEe-----EECCC-------CCCCccceeEE
Confidence 46778888887743 3445555555554 3568999999986641 22221 23345678999
Q ss_pred EeecCceeeeccEEEEEEee
Q 047026 553 HVSSLLLIKGDNSMFLVQSR 572 (596)
Q Consensus 553 ~ipa~~L~~G~NtI~l~~~~ 572 (596)
.|| ..++.|.|+|.|...-
T Consensus 391 ~iP-~~~~~~~N~l~~~f~l 409 (605)
T PF03170_consen 391 SIP-RLLLPGRNQLQFEFDL 409 (605)
T ss_pred ecC-chhcCCCcEEEEEEEe
Confidence 999 9999999999887643
No 50
>PF00576 Transthyretin: HIUase/Transthyretin family; InterPro: IPR023416 This family includes transthyretin that is a thyroid hormone-binding protein that transports thyroxine from the bloodstream to the brain. However, most of the sequences listed in this family do not bind thyroid hormones. They are actually enzymes of the purine catabolism that catalyse the conversion of 5-hydroxyisourate (HIU) to OHCU [, ]. HIU hydrolysis is the original function of the family and is conserved from bacteria to mammals; transthyretins arose by gene duplications in the vertebrate lineage [, ]. HIUases are distinguished in the alignment from the conserved C-terminal YRGS sequence. Transthyretin (formerly prealbumin) is one of 3 thyroid hormone-binding proteins found in the blood of vertebrates []. It is produced in the liver and circulates in the bloodstream, where it binds retinol and thyroxine (T4) []. It differs from the other 2 hormone-binding proteins (T4-binding globulin and albumin) in 3 distinct ways: (1) the gene is expressed at a high rate in the brain choroid plexus; (2) it is enriched in cerebrospinal fluid; and (3) no genetically caused absence has been observed, suggesting an essential role in brain function, distinct from that played in the bloodstream []. The protein consists of around 130 amino acids, which assemble as a homotetramer that contains an internal channel in which T4 is bound. Within this complex, T4 appears to be transported across the blood-brain barrier, where, in the choroid plexus, the hormone stimulates further synthesis of transthyretin. The protein then diffuses back into the bloodstream, where it binds T4 for transport back to the brain [].; PDB: 1TFP_B 1KGJ_D 1IE4_C 1GKE_C 1KGI_D 2H0J_B 2H0E_B 2H0F_B 1ZD6_A 3DGD_D ....
Probab=73.39 E-value=2.9 Score=38.21 Aligned_cols=52 Identities=23% Similarity=0.269 Sum_probs=36.4
Q ss_pred CCCCcceeEEEEcCCCCCCCcccccccceeEEEECCccce-----EeCCccCCeeEEEEEEc
Q 047026 309 SLIPAKYAYIGLSSARTEGGWQTESKDYQFWVQTDSKGNF-----TVKNVVPGVYGLHGWVP 365 (596)
Q Consensus 309 ~~~pa~~a~V~L~~~~~~g~wq~~~~~yqywt~td~~G~F-----tI~nVrpGtY~L~a~~~ 365 (596)
.|+||.++.|-|......++|+ .-.-+.||++|+. .-..+.+|.|+|..-..
T Consensus 12 ~G~PA~gv~V~L~~~~~~~~~~-----~l~~~~Td~DGR~~~~~~~~~~~~~G~Y~l~F~~~ 68 (112)
T PF00576_consen 12 TGKPAAGVPVTLYRLDSDGSWT-----LLAEGVTDADGRIKQPLLEGESLEPGIYKLVFDTG 68 (112)
T ss_dssp TTEE-TT-EEEEEEEETTSCEE-----EEEEEEBETTSEESSTSSETTTS-SEEEEEEEEHH
T ss_pred CCCCccCCEEEEEEecCCCCcE-----EEEEEEECCCCcccccccccccccceEEEEEEEHH
Confidence 4899999999997554344563 3345779999988 44678899999998754
No 51
>PLN03059 beta-galactosidase; Provisional
Probab=71.28 E-value=4.1 Score=48.80 Aligned_cols=84 Identities=14% Similarity=0.139 Sum_probs=52.8
Q ss_pred EEEEEEeCCCccceEEEEEEEeccCCCeeEEEEcCccCCcccccc--cccCCCCeeeeeeE--------EEeeEEEEEEe
Q 047026 485 WTIKFHLDSIIKGTYNLRLAIASATRSDLEIFVNYIDQGHLVYQE--MNLGMDNTVCRHGI--------HGLYQLFSIHV 554 (596)
Q Consensus 485 w~I~F~L~~~~~~~~tLriala~a~~~~~~V~vN~~~~~~p~~~~--~~~~~d~~i~R~~~--------~G~~~~~~~~i 554 (596)
.+-.|++++. .. - ++|-....+.=+|.|||..+++ .+.. ..-+=+.|-+|+++ .|.-...-+.|
T Consensus 623 YK~~Fd~p~g--~D-p--v~LDm~gmGKG~aWVNG~nIGR-YW~~~a~~~gC~~c~y~g~~~~~kc~~~cggP~q~lYHV 696 (840)
T PLN03059 623 YKTTFDAPGG--ND-P--LALDMSSMGKGQIWINGQSIGR-HWPAYTAHGSCNGCNYAGTFDDKKCRTNCGEPSQRWYHV 696 (840)
T ss_pred EEEEEeCCCC--CC-C--EEEecccCCCeeEEECCccccc-ccccccccCCCccccccccccchhhhccCCCceeEEEeC
Confidence 4567777542 11 1 2233334566689999999885 3211 11122456777776 24566777899
Q ss_pred ecCceeeeccEEEEEEeecC
Q 047026 555 SSLLLIKGDNSMFLVQSRSG 574 (596)
Q Consensus 555 pa~~L~~G~NtI~l~~~~g~ 574 (596)
|+++|++|.|+|.|==..+.
T Consensus 697 Pr~~Lk~g~N~lViFEe~gg 716 (840)
T PLN03059 697 PRSWLKPSGNLLIVFEEWGG 716 (840)
T ss_pred cHHHhccCCceEEEEEecCC
Confidence 99999999999887544443
No 52
>cd05469 Transthyretin_like Transthyretin_like. This domain is present in the transthyretin-like protein (TLP) family which includes transthyretin (TTR) and a transthyretin-related protein called 5-hydroxyisourate hydrolase (HIUase). TTR and HIUase are homotetrameric proteins with each subunit consisting of eight beta-strands arranged in two sheets and a short alpha-helix. The central channel of the tetramer contains two independent binding sites, each located between a pair of subunits. TTR transports thyroid hormones and retinol in the blood serum of vertebrates while HIUase catalyzes the second step in a three-step ureide pathway. TTRs are highly conserved and found only in vertebrates while the HIUases are found in a wide range of bacterial, plant, fungal, slime mold and vertebrate organisms.
Probab=70.33 E-value=7 Score=35.88 Aligned_cols=52 Identities=19% Similarity=0.246 Sum_probs=37.0
Q ss_pred CCCCcceeEEEEcCCCCCCCcccccccceeEEEECCccceEe----CCccCCeeEEEEEEc
Q 047026 309 SLIPAKYAYIGLSSARTEGGWQTESKDYQFWVQTDSKGNFTV----KNVVPGVYGLHGWVP 365 (596)
Q Consensus 309 ~~~pa~~a~V~L~~~~~~g~wq~~~~~yqywt~td~~G~FtI----~nVrpGtY~L~a~~~ 365 (596)
.|+||.++.|-|......++|+ .---++||+||+..- ..+.+|.|+|..-..
T Consensus 12 ~G~PAagv~V~L~~~~~~~~w~-----~l~~~~Tn~DGR~~~~l~~~~~~~G~Y~l~F~t~ 67 (113)
T cd05469 12 RGSPAANVAIKVFRKTADGSWE-----IFATGKTNEDGELHGLITEEEFXAGVYRVEFDTK 67 (113)
T ss_pred CCccCCCCEEEEEEecCCCceE-----EEEEEEECCCCCccCccccccccceEEEEEEehH
Confidence 4899999999997532222353 233467999999852 357899999998664
No 53
>cd05821 TLP_Transthyretin Transthyretin (TTR) is a 55 kDa protein responsible for the transport of thyroid hormones and retinol in vertebrates. TTR distributes the two thyroid hormones T3 (3,5,3'-triiodo-L-thyronine) and T4 (Thyroxin, or 3,5,3',5'-tetraiodo-L-thyronine), as well as retinol (vitamin A) through the formation of a macromolecular complex that includes each of these as well as retinol-binding protein. Misfolded forms of TTR are implicated in the amyloid diseases familial amyloidotic polyneuropathy and senile systemic amyloidosis. TTR forms a homotetramer with each subunit consisting of eight beta-strands arranged in two sheets and a short alpha-helix. The central channel of the tetramer contains two independent binding sites, each located between a pair of subunits, which differ in their ligand binding affinity. A negative cooperativity has been observed for the binding of T4 and other TTR ligands. A fraction of plasma TTR is carried in high density lipoproteins by bindi
Probab=69.14 E-value=12 Score=34.78 Aligned_cols=64 Identities=14% Similarity=0.135 Sum_probs=43.9
Q ss_pred ceeEEEEEEeeecccccCCCCcceeEEEEcCCCCCCCcccccccceeEEEECCccceE----eCCccCCeeEEEEEEc
Q 047026 292 RGSATGRFFVQDKFVSSSLIPAKYAYIGLSSARTEGGWQTESKDYQFWVQTDSKGNFT----VKNVVPGVYGLHGWVP 365 (596)
Q Consensus 292 RGtVsG~v~~~D~~~~~~~~pa~~a~V~L~~~~~~g~wq~~~~~yqywt~td~~G~Ft----I~nVrpGtY~L~a~~~ 365 (596)
|-.|+=.|+... .|+||.++.|-|......++|+ ..--++||++|+.. -..+.+|.|+|..-..
T Consensus 6 ~~~ittHVLDt~-----~G~PAaGV~V~L~~~~~~~~w~-----~l~~~~Tn~DGR~~~ll~~~~~~~G~Y~l~F~tg 73 (121)
T cd05821 6 KCPLMVKVLDAV-----RGSPAANVAVKVFKKTADGSWE-----PFASGKTTETGEIHGLTTDEQFTEGVYKVEFDTK 73 (121)
T ss_pred CCCcEEEEEECC-----CCccCCCCEEEEEEecCCCceE-----EEEEEEECCCCCCCCccCccccCCeeEEEEEehh
Confidence 556777766553 4899999999996432112353 33457799999885 2346789999998653
No 54
>cd03457 intradiol_dioxygenase_like Intradiol dioxygenase subgroup. Intradiol dioxygenases catalyze the critical ring-cleavage step in the conversion of catecholate derivatives to citric acid cycle intermediates. They break the catechol C1-C2 bond and utilize Fe3+, as opposed to the extradiol-cleaving enzymes which break the C2-C3 or C1-C6 bond and utilize Fe2+ and Mn+. The family contains catechol 1,2-dioxygenases and protocatechuate 3,4-dioxygenases. The specific function of this subgroup is unknown.
Probab=65.43 E-value=16 Score=36.27 Aligned_cols=62 Identities=18% Similarity=0.209 Sum_probs=43.5
Q ss_pred eeEEEEEEeeecccccCCCCcceeEEEEcCCCCCCCcccc---c------ccce-e--EEEECCccceEeCCccCCeeE
Q 047026 293 GSATGRFFVQDKFVSSSLIPAKYAYIGLSSARTEGGWQTE---S------KDYQ-F--WVQTDSKGNFTVKNVVPGVYG 359 (596)
Q Consensus 293 GtVsG~v~~~D~~~~~~~~pa~~a~V~L~~~~~~g~wq~~---~------~~yq-y--wt~td~~G~FtI~nVrpGtY~ 359 (596)
=.+.|+|...+ .++|..+|.|=++.....|..... . .+.. . +..||++|.|++.-|.||-|.
T Consensus 27 l~l~g~V~D~~-----~c~Pv~~a~VdiWh~da~G~Ys~~~~~~~~~~~~~~~~flRG~~~TD~~G~~~F~TI~PG~Y~ 100 (188)
T cd03457 27 LTLDLQVVDVA-----TCCPPPNAAVDIWHCDATGVYSGYSAGGGGGEDTDDETFLRGVQPTDADGVVTFTTIFPGWYP 100 (188)
T ss_pred EEEEEEEEeCC-----CCccCCCeEEEEecCCCCCCCCCccCCccccccccCCCcCEEEEEECCCccEEEEEECCCCCC
Confidence 47888887643 268999999999876655533321 1 1111 2 356899999999999999985
No 55
>cd05822 TLP_HIUase HIUase (5-hydroxyisourate hydrolase) catalyzes the second step in a three-step ureide pathway in which 5-hydroxyisourate (HIU), a product of the uricase (urate oxidase) reaction, is hydrolyzed to 2-oxo-4-hydroxy-4-carboxy-5-ureidoimidazoline (OHCU). HIUase has high sequence similarity with transthyretins and is a member of the transthyretin-like protein (TLP) family. HIUase is distinguished from transthyretins by a conserved signature motif at its C-terminus that forms part of the active site. In HIUase, this motif is YRGS, while transthyretins have a conserved TAVV sequence in the same location. Most HIUases are cytosolic but in plants and slime molds, they are peroxisomal based on the presence of N-terminal periplasmic localization sequences. HIUase forms a homotetramer with each subunit consisting of eight beta-strands arranged in two sheets and a short alpha-helix. The central channel of the tetramer contains two independent binding sites, each located betw
Probab=63.59 E-value=17 Score=33.30 Aligned_cols=51 Identities=20% Similarity=0.171 Sum_probs=36.9
Q ss_pred CCCCcceeEEEEcCCCCCCCcccccccceeEEEECCccceEe-----CCccCCeeEEEEEEc
Q 047026 309 SLIPAKYAYIGLSSARTEGGWQTESKDYQFWVQTDSKGNFTV-----KNVVPGVYGLHGWVP 365 (596)
Q Consensus 309 ~~~pa~~a~V~L~~~~~~g~wq~~~~~yqywt~td~~G~FtI-----~nVrpGtY~L~a~~~ 365 (596)
.|+||.++-|-|..... ++|+ .-.-++||++|+..- ..+.+|+|+|..-..
T Consensus 12 ~G~PAagv~V~L~~~~~-~~~~-----~i~~~~Td~DGR~~~~~~~~~~~~~G~Y~l~F~~~ 67 (112)
T cd05822 12 TGKPAAGVAVTLYRLDG-NGWT-----LLATGVTNADGRCDDLLPPGAQLAAGTYKLTFDTG 67 (112)
T ss_pred CCcccCCCEEEEEEecC-CCeE-----EEEEEEECCCCCccCcccccccCCCeeEEEEEEhh
Confidence 48999999999975432 1242 233477999998753 468899999998764
No 56
>PF01060 DUF290: Transthyretin-like family; InterPro: IPR001534 This new apparently nematode-specific protein family has been called family 2. The proteins show weak similarity to transthyretin (formerly called prealbumin) which transports thyroid hormones. The specific function of this protein is unknown.; GO: 0005615 extracellular space
Probab=61.35 E-value=17 Score=30.87 Aligned_cols=45 Identities=27% Similarity=0.294 Sum_probs=29.2
Q ss_pred EEEEEeeecccccCCCCcceeEEEEcCCCCCCCcccccccceeEEEECCccceEeCC
Q 047026 296 TGRFFVQDKFVSSSLIPAKYAYIGLSSARTEGGWQTESKDYQFWVQTDSKGNFTVKN 352 (596)
Q Consensus 296 sG~v~~~D~~~~~~~~pa~~a~V~L~~~~~~g~wq~~~~~yqywt~td~~G~FtI~n 352 (596)
+|+|.=. ++|+.++.|-|...... .....-=-+.||++|+|+|..
T Consensus 1 ~G~L~C~-------~~P~~~~~V~L~e~d~~-----~~Ddll~~~~Td~~G~F~l~G 45 (80)
T PF01060_consen 1 KGQLMCG-------GKPAKNVKVKLWEDDYF-----DPDDLLDETKTDSDGNFELSG 45 (80)
T ss_pred CeEEEeC-------CccCCCCEEEEEECCCC-----CCCceeEEEEECCCceEEEEE
Confidence 4677654 58999999999644310 111221237789999999865
No 57
>PF03944 Endotoxin_C: delta endotoxin; InterPro: IPR005638 This family contains insecticidal toxins produced by Bacillus species of bacteria. During spore formation the bacteria produce crystals of this protein. When an insect ingests these proteins, they are activated by proteolytic cleavage. The N terminus is cleaved in all of the proteins and a C-terminal extension is cleaved in some members. Once activated, the endotoxin binds to the gut epithelium and causes cell lysis by the formation of cation-selective channels, which leads to death. The activated region of the delta toxin is composed of three distinct structural domains: an N-terminal helical bundle domain (IPR005639 from INTERPRO) involved in membrane insertion and pore formation; a beta-sheet central domain (IPR001178 from INTERPRO) involved in receptor binding; and a C-terminal beta-sandwich domain that interacts with the N-terminal domain to form a channel. This entry represents the conserved C-terminal domain.; PDB: 1DLC_A 1JI6_A 1W99_A 1CIY_A 1I5P_A 2C9K_A 3EB7_A.
Probab=59.88 E-value=48 Score=31.11 Aligned_cols=98 Identities=14% Similarity=0.216 Sum_probs=51.6
Q ss_pred ccE--EEEEEeCCCccceEEEEEEEeccCCCeeEEEEcCccCC--cccccccccCCCCeeeeeeEEEeeEEEE-EEeecC
Q 047026 483 TTW--TIKFHLDSIIKGTYNLRLAIASATRSDLEIFVNYIDQG--HLVYQEMNLGMDNTVCRHGIHGLYQLFS-IHVSSL 557 (596)
Q Consensus 483 ~~w--~I~F~L~~~~~~~~tLriala~a~~~~~~V~vN~~~~~--~p~~~~~~~~~d~~i~R~~~~G~~~~~~-~~ipa~ 557 (596)
... ++++..+......|.+||-.|+.+.+.+.|.+++.... .++..+..-. .. ..++|..|. ++++..
T Consensus 36 ~~~~~~~~v~~~~~~~~~YrIRiRYAs~~~~~~~i~~~~~~~~~~~~~~~T~~~~--~~-----~~~~y~~F~y~~~~~~ 108 (143)
T PF03944_consen 36 GSLSIKIRVTINNSSSQKYRIRIRYASNSNGTLSISINNSSGNLSFNFPSTMSNG--DN-----LTLNYESFQYVEFPTP 108 (143)
T ss_dssp CEECEEEEEEESSSSTEEEEEEEEEEESS-EEEEEEETTEEEECEEEE--SSSTT--GG-----CCETGGG-EEEEESSE
T ss_pred CceEEEEEEEecCCCCceEEEEEEEEECCCcEEEEEECCccceeeeeccccccCC--Cc-----cccccceeEeeecCce
Confidence 444 45555444336789999999999889999999987542 1121222211 11 223233232 222221
Q ss_pred -ceeeec-cEEEEEEeecCCCCceEEEEEEEEe
Q 047026 558 -LLIKGD-NSMFLVQSRSGDPVCGVLYDYLRLE 588 (596)
Q Consensus 558 -~L~~G~-NtI~l~~~~g~s~~~~vmyD~IrLe 588 (596)
.+..+. .+|.|.+...++. ..|..|-|++.
T Consensus 109 ~~~~~~~~~~~~i~i~~~~~~-~~v~IDkIEFI 140 (143)
T PF03944_consen 109 FTFSSNQSITITISIQNISSN-GNVYIDKIEFI 140 (143)
T ss_dssp EEESTSEEEEEEEEEESSTTT-S-EEEEEEEEE
T ss_pred EEecCCCceEEEEEEEecCCC-CeEEEEeEEEE
Confidence 122222 5677765543332 47999999885
No 58
>COG2351 Transthyretin-like protein [General function prediction only]
Probab=58.47 E-value=33 Score=31.89 Aligned_cols=66 Identities=24% Similarity=0.334 Sum_probs=44.4
Q ss_pred eeEEEEEEeeecccccCCCCcceeEEEEcCCCCCCCcccccccceeEEEECCccceE-----eCCccCCeeEEEEEEcce
Q 047026 293 GSATGRFFVQDKFVSSSLIPAKYAYIGLSSARTEGGWQTESKDYQFWVQTDSKGNFT-----VKNVVPGVYGLHGWVPGF 367 (596)
Q Consensus 293 GtVsG~v~~~D~~~~~~~~pa~~a~V~L~~~~~~g~wq~~~~~yqywt~td~~G~Ft-----I~nVrpGtY~L~a~~~G~ 367 (596)
|.++=.|+... .|+||.++.|.|..-..+ .|+ ----+.||.||+-. -..+++|.|+|..-+
T Consensus 9 G~LTTHVLDta-----~GkPAagv~V~L~rl~~~-~~~-----~l~t~~Tn~DGR~d~pll~g~~~~~G~Y~l~F~~--- 74 (124)
T COG2351 9 GRLTTHVLDTA-----SGKPAAGVKVELYRLEGN-QWE-----LLKTVVTNADGRIDAPLLAGETLATGIYELVFHT--- 74 (124)
T ss_pred ceeeeeeeecc-----cCCcCCCCEEEEEEecCC-cce-----eeeEEEecCCCcccccccCccccccceEEEEEEc---
Confidence 45555655543 489999999999754432 332 22246789999876 346789999999865
Q ss_pred eeeEee
Q 047026 368 IGDYLD 373 (596)
Q Consensus 368 ~G~~~~ 373 (596)
|||-.
T Consensus 75 -gdYf~ 79 (124)
T COG2351 75 -GDYFK 79 (124)
T ss_pred -chhhh
Confidence 55543
No 59
>PF14900 DUF4493: Domain of unknown function (DUF4493)
Probab=53.70 E-value=2.5e+02 Score=28.33 Aligned_cols=41 Identities=22% Similarity=0.248 Sum_probs=28.5
Q ss_pred ccCCeeEEEEEEcc--eee----eEeeeeEEEEeCCCeeeecceEEee
Q 047026 353 VVPGVYGLHGWVPG--FIG----DYLDKALVTISAGSQTELGNLTYVP 394 (596)
Q Consensus 353 VrpGtY~L~a~~~G--~~G----~~~~~~~VtV~aG~t~~l~~l~~~~ 394 (596)
+++|+|+|.|+... ..| .|.-++.++|.+|+++.+. |+-..
T Consensus 62 L~~G~Ytv~A~~g~~~~~~~d~pyy~G~~~f~I~~g~~t~v~-v~C~l 108 (235)
T PF14900_consen 62 LPVGSYTVKASYGDNVAAGFDKPYYEGSTTFTIEKGETTTVS-VTCKL 108 (235)
T ss_pred ecCCcEEEEEEcCCCccccccCceeecceeEEEecCCcEEEE-EEEEe
Confidence 67899999999422 112 2455668999999998773 65433
No 60
>PF02369 Big_1: Bacterial Ig-like domain (group 1); InterPro: IPR003344 Proteins that contain this domain are found in a variety of bacterial and phage surface proteins such as intimins. Intimin is a bacterial cell-adhesion molecule that mediates the intimate bacterial host-cell interaction. It contains three domains: two immunoglobulin-like domains and a C-type lectin-like module, implying that carbohydrate recognition may be important in intimin-mediated cell adhesion.; PDB: 1CWV_A 4E9L_A 1F02_I 1F00_I.
Probab=53.32 E-value=77 Score=27.89 Aligned_cols=69 Identities=23% Similarity=0.237 Sum_probs=36.3
Q ss_pred CCceeEEEEEEeeecccccCCCCcceeEEEEcCCCCCCCcccccccceeEEEECCccc--eEeCCccCCeeEEEEEEcc
Q 047026 290 NERGSATGRFFVQDKFVSSSLIPAKYAYIGLSSARTEGGWQTESKDYQFWVQTDSKGN--FTVKNVVPGVYGLHGWVPG 366 (596)
Q Consensus 290 ~~RGtVsG~v~~~D~~~~~~~~pa~~a~V~L~~~~~~g~wq~~~~~yqywt~td~~G~--FtI~nVrpGtY~L~a~~~G 366 (596)
.....+.=++.+.|. .+.|..+..|-+......+... .... -+.||++|. +++..-++|+|++.|...|
T Consensus 20 ~g~~~~tltatV~D~----~gnpv~g~~V~f~~~~~~~~l~--~~~~--~~~Td~~G~a~~tltst~aG~~~VtA~~~~ 90 (100)
T PF02369_consen 20 DGSDTNTLTATVTDA----NGNPVPGQPVTFSSSSSGGTLS--PTNT--SATTDSNGIATVTLTSTKAGTYTVTATVDG 90 (100)
T ss_dssp SSSS-EEEEEEEEET----TSEB-TS-EEEE--EESSSEES---CEE---EEE-TTSEEEEEEE-SS-EEEEEEEEETT
T ss_pred CCcCcEEEEEEEEcC----CCCCCCCCEEEEEEcCCCcEEe--cCcc--ccEECCCEEEEEEEEecCceEEEEEEEECC
Confidence 344445555555563 3788888888872111111121 0000 357899996 5566779999999999986
No 61
>PF13754 Big_3_4: Bacterial Ig-like domain (group 3)
Probab=51.17 E-value=32 Score=27.04 Aligned_cols=30 Identities=20% Similarity=0.368 Sum_probs=21.0
Q ss_pred ceeEEEECCccceEe--CCccCCeeEEEEEEc
Q 047026 336 YQFWVQTDSKGNFTV--KNVVPGVYGLHGWVP 365 (596)
Q Consensus 336 yqywt~td~~G~FtI--~nVrpGtY~L~a~~~ 365 (596)
-.|.+.+|++|.+++ +....|+|++.+.+.
T Consensus 2 ~~~~~t~~~~G~Ws~t~~~~~dG~y~itv~a~ 33 (54)
T PF13754_consen 2 VTYTTTVDSDGNWSFTVPALADGTYTITVTAT 33 (54)
T ss_pred eEEEEEECCCCcEEEeCCCCCCccEEEEEEEE
Confidence 356777888898776 444558888877754
No 62
>PF08531 Bac_rhamnosid_N: Alpha-L-rhamnosidase N-terminal domain; InterPro: IPR013737 This domain is found in bacterial rhamnosidase A and B enzymes and is probably involved in substrate recognition.; PDB: 2OKX_B.
Probab=51.09 E-value=20 Score=34.66 Aligned_cols=53 Identities=19% Similarity=0.150 Sum_probs=29.0
Q ss_pred CCCeeEEEEcCccCCcccccccccCCCCeeeeeeE--EEeeEEEEEEeecCceeeeccEEEEEEeec
Q 047026 509 TRSDLEIFVNYIDQGHLVYQEMNLGMDNTVCRHGI--HGLYQLFSIHVSSLLLIKGDNSMFLVQSRS 573 (596)
Q Consensus 509 ~~~~~~V~vN~~~~~~p~~~~~~~~~d~~i~R~~~--~G~~~~~~~~ipa~~L~~G~NtI~l~~~~g 573 (596)
+.+..++.|||+.+..-.+.+. +..+ +-+|.. ++| ..+|++|+|+|-+.+..|
T Consensus 12 a~g~Y~l~vNG~~V~~~~l~P~---------~t~y~~~~~Y~t--yDV-t~~L~~G~N~iav~lg~g 66 (172)
T PF08531_consen 12 ALGRYELYVNGERVGDGPLAPG---------WTDYDKRVYYQT--YDV-TPYLRPGENVIAVWLGNG 66 (172)
T ss_dssp EESEEEEEETTEEEEEE-----------------BTTEEEEEE--EE--TTT--TTEEEEEEEEEE-
T ss_pred eCeeEEEEECCEEeeCCccccc---------cccCCCceEEEE--EeC-hHHhCCCCCEEEEEEeCC
Confidence 5579999999987752111100 1112 224444 444 468899999999998654
No 63
>smart00095 TR_THY Transthyretin.
Probab=50.96 E-value=39 Score=31.47 Aligned_cols=62 Identities=16% Similarity=0.152 Sum_probs=40.5
Q ss_pred eeEEEEEEeeecccccCCCCcceeEEEEcCCCCCCCcccccccceeEEEECCccceE--e--CCccCCeeEEEEEE
Q 047026 293 GSATGRFFVQDKFVSSSLIPAKYAYIGLSSARTEGGWQTESKDYQFWVQTDSKGNFT--V--KNVVPGVYGLHGWV 364 (596)
Q Consensus 293 GtVsG~v~~~D~~~~~~~~pa~~a~V~L~~~~~~g~wq~~~~~yqywt~td~~G~Ft--I--~nVrpGtY~L~a~~ 364 (596)
-.|+=.|+... .|+||.++.|-|......+.|+ .---.+||.+|+.. + ..+.+|.|+|..-.
T Consensus 4 ~plTtHVLDt~-----~G~PAagv~V~L~~~~~~~~w~-----~la~~~Tn~DGR~~~ll~~~~~~~G~Y~l~F~t 69 (121)
T smart00095 4 CPLMVKVLDAV-----RGSPAVNVAVKVFKKTEEGTWE-----PFASGKTNESGEIHELTTDEKFVEGLYKVEFDT 69 (121)
T ss_pred CCeEEEEEECC-----CCccCCCCEEEEEEeCCCCceE-----EEEEEecCCCccccCccCcccccceEEEEEEeh
Confidence 34555555442 4899999999996432112342 22236789999884 1 35779999999865
No 64
>PF09912 DUF2141: Uncharacterized protein conserved in bacteria (DUF2141); InterPro: IPR018673 This family of conserved hypothetical proteins has no known function.
Probab=45.25 E-value=53 Score=29.75 Aligned_cols=48 Identities=17% Similarity=0.302 Sum_probs=30.4
Q ss_pred eeEEEEcCCCCCCCcccccccceeEEEE---CCccceEeCCccCCeeEEEEEEc
Q 047026 315 YAYIGLSSARTEGGWQTESKDYQFWVQT---DSKGNFTVKNVVPGVYGLHGWVP 365 (596)
Q Consensus 315 ~a~V~L~~~~~~g~wq~~~~~yqywt~t---d~~G~FtI~nVrpGtY~L~a~~~ 365 (596)
.+.|.|....+ +|. ..+..-....+ +.+-.++|++++||+|.+.++.+
T Consensus 12 ~v~v~ly~~~~--~f~-~~~~~~~~~~~~~~~~~~~~~f~~lp~G~YAi~v~hD 62 (112)
T PF09912_consen 12 QVRVALYNSAE--GFE-NKKKALKRVKVPAKGGTVTITFEDLPPGTYAIAVFHD 62 (112)
T ss_pred EEEEEEEcChh--chh-hcccceeEEEEEcCCCcEEEEECCCCCccEEEEEEEe
Confidence 45666665533 352 22223333333 23458999999999999999976
No 65
>PRK10340 ebgA cryptic beta-D-galactosidase subunit alpha; Reviewed
Probab=41.24 E-value=46 Score=41.27 Aligned_cols=64 Identities=19% Similarity=0.255 Sum_probs=44.3
Q ss_pred EEEEeCCCccc-eEEEEEEEeccCCCeeEEEEcCccCCcccccccccCCCCeeeeeeEEEeeEEEEEEeecCceeeeccE
Q 047026 487 IKFHLDSIIKG-TYNLRLAIASATRSDLEIFVNYIDQGHLVYQEMNLGMDNTVCRHGIHGLYQLFSIHVSSLLLIKGDNS 565 (596)
Q Consensus 487 I~F~L~~~~~~-~~tLriala~a~~~~~~V~vN~~~~~~p~~~~~~~~~d~~i~R~~~~G~~~~~~~~ipa~~L~~G~Nt 565 (596)
=.|.+++...+ ...|++..+ .....|.|||+.++. . .|-+..++|+|.. .|+.|+|+
T Consensus 115 r~F~lp~~~~gkrv~L~FeGV---~s~a~VwvNG~~VG~------~------------~g~~~pfefDIT~-~l~~G~N~ 172 (1021)
T PRK10340 115 RTFTLSDGWQGKQTIIKFDGV---ETYFEVYVNGQYVGF------S------------KGSRLTAEFDISA-MVKTGDNL 172 (1021)
T ss_pred EEEEeCcccccCcEEEEECcc---ceEEEEEECCEEecc------c------------cCCCccEEEEcch-hhCCCccE
Confidence 35888765433 344555433 456899999987641 0 1446778899987 67899999
Q ss_pred EEEEEee
Q 047026 566 MFLVQSR 572 (596)
Q Consensus 566 I~l~~~~ 572 (596)
|.+.+.+
T Consensus 173 LaV~V~~ 179 (1021)
T PRK10340 173 LCVRVMQ 179 (1021)
T ss_pred EEEEEEe
Confidence 9999854
No 66
>PF11008 DUF2846: Protein of unknown function (DUF2846); InterPro: IPR022548 Some members in this group of proteins with unknown function are annotated as lipoproteins. However this cannot be confirmed.
Probab=40.83 E-value=32 Score=31.09 Aligned_cols=44 Identities=18% Similarity=0.117 Sum_probs=29.7
Q ss_pred CccceEeCCccCCeeEEEEEEcceeeeEeeeeEEEEeCCCeeee
Q 047026 344 SKGNFTVKNVVPGVYGLHGWVPGFIGDYLDKALVTISAGSQTEL 387 (596)
Q Consensus 344 ~~G~FtI~nVrpGtY~L~a~~~G~~G~~~~~~~VtV~aG~t~~l 387 (596)
..|.|..-.|+||+|++.+......+.-..+.+|+|.+|++--+
T Consensus 56 ~~g~y~~~~v~pG~h~i~~~~~~~~~~~~~~l~~~~~~G~~yy~ 99 (117)
T PF11008_consen 56 KNGGYFYVEVPPGKHTISAKSEFSSSPGANSLDVTVEAGKTYYV 99 (117)
T ss_pred CCCeEEEEEECCCcEEEEEecCccCCCCccEEEEEEcCCCEEEE
Confidence 56777777899999999995431110011446689999998554
No 67
>TIGR03000 plancto_dom_1 Planctomycetes uncharacterized domain TIGR03000. Domains described by this model are found, so far, only in the Planctomycetes (Pirellula sp. strain 1 and Gemmata obscuriglobus), in up to six proteins per genome, and may be duplicated within a protein. The function is unknown.
Probab=39.06 E-value=2.3e+02 Score=24.42 Aligned_cols=39 Identities=21% Similarity=0.257 Sum_probs=29.2
Q ss_pred ceEeCCccCCe---eEEEEEE--cceeeeEeeeeEEEEeCCCeeee
Q 047026 347 NFTVKNVVPGV---YGLHGWV--PGFIGDYLDKALVTISAGSQTEL 387 (596)
Q Consensus 347 ~FtI~nVrpGt---Y~L~a~~--~G~~G~~~~~~~VtV~aG~t~~l 387 (596)
.|.-+++.+|. |++.+-. +|- ....+.+|.|.||.+.++
T Consensus 30 ~F~T~~L~~G~~y~Y~v~a~~~~dG~--~~t~~~~V~vrAGd~~~v 73 (75)
T TIGR03000 30 TFTTPPLEAGKEYEYTVTAEYDRDGR--ILTRTRTVVVRAGDTVTV 73 (75)
T ss_pred EEECCCCCCCCEEEEEEEEEEecCCc--EEEEEEEEEEcCCceEEe
Confidence 69999999997 6666642 542 246777899999998766
No 68
>PF07550 DUF1533: Protein of unknown function (DUF1533); InterPro: IPR011432 This domain is found duplicated in proteins of unknown function. The proteins typically also contain leucine-rich repeats.
Probab=38.71 E-value=31 Score=28.20 Aligned_cols=19 Identities=11% Similarity=0.403 Sum_probs=16.9
Q ss_pred EEeecCce-eeeccEEEEEE
Q 047026 552 IHVSSLLL-IKGDNSMFLVQ 570 (596)
Q Consensus 552 ~~ipa~~L-~~G~NtI~l~~ 570 (596)
+.|.+++| +.|+|+|+|.-
T Consensus 36 l~i~~~~f~~~G~~~I~I~A 55 (65)
T PF07550_consen 36 LKIKASAFNKDGENTIVIKA 55 (65)
T ss_pred EEEcHHHcCcCCceEEEEEe
Confidence 88899999 78999999984
No 69
>PF12866 DUF3823: Protein of unknown function (DUF3823); InterPro: IPR024278 This is a family of uncharacterised proteins from Bacteroidetes. These proteins have characteristic DN and DR sequence-motifs but their function is not known.; PDB: 3HN5_B 4EIU_A.
Probab=38.20 E-value=1.4e+02 Score=30.63 Aligned_cols=91 Identities=16% Similarity=0.102 Sum_probs=45.9
Q ss_pred ceeEEEEEEeeecccccCCCCcceeEEEEcCCCCCCCcccccccceeEEEECCccceEeCCccCCeeEEEEE-Ecc-eee
Q 047026 292 RGSATGRFFVQDKFVSSSLIPAKYAYIGLSSARTEGGWQTESKDYQFWVQTDSKGNFTVKNVVPGVYGLHGW-VPG-FIG 369 (596)
Q Consensus 292 RGtVsG~v~~~D~~~~~~~~pa~~a~V~L~~~~~~g~wq~~~~~yqywt~td~~G~FtI~nVrpGtY~L~a~-~~G-~~G 369 (596)
-++++|+|....--.+. +....++-+-|.+.+ |..- +.|.+ ....+|.|.-..+=+|+|+|..- .++ ..
T Consensus 21 ~s~l~G~iiD~~tgE~i-~~~~~gv~i~l~e~g----y~~~--~~~~~-~v~qDGtf~n~~lF~G~Yki~~~~G~fp~~- 91 (222)
T PF12866_consen 21 DSTLTGRIIDVYTGEPI-QTDIGGVRIQLYELG----YGDN--TPQDV-YVKQDGTFRNTKLFDGDYKIVPKNGNFPWV- 91 (222)
T ss_dssp -EEEEEEEEECCTTEE-----STSSEEEEECS-----CCG----SEEE-EB-TTSEEEEEEE-SEEEEEEE-CTSCSBS-
T ss_pred CceEEEEEEEeecCCee-eecCCceEEEEEecc----cccC--CCcce-EEccCCceeeeeEeccceEEEEcCCCCccc-
Confidence 58999999542210000 111246677776555 5422 44444 36789999888999999999982 232 00
Q ss_pred eEeeeeEEEEeCCCeeeecceEEee
Q 047026 370 DYLDKALVTISAGSQTELGNLTYVP 394 (596)
Q Consensus 370 ~~~~~~~VtV~aG~t~~l~~l~~~~ 394 (596)
.-..+..|.|+ |.+ ++ ++..+|
T Consensus 92 ~~~dti~v~i~-G~t-~~-d~eVtP 113 (222)
T PF12866_consen 92 VPVDTIEVDIK-GNT-TQ-DFEVTP 113 (222)
T ss_dssp CCE--EEEEES-SCE-EE-EEEE-B
T ss_pred CCCccEEEEec-Cce-EE-eEEeee
Confidence 11233346666 554 33 455443
No 70
>PRK09525 lacZ beta-D-galactosidase; Reviewed
Probab=33.12 E-value=80 Score=39.28 Aligned_cols=64 Identities=16% Similarity=0.135 Sum_probs=43.5
Q ss_pred EEEEeCCCccc--eEEEEEEEeccCCCeeEEEEcCccCCcccccccccCCCCeeeeeeEEEeeEEEEEEeecCceeeecc
Q 047026 487 IKFHLDSIIKG--TYNLRLAIASATRSDLEIFVNYIDQGHLVYQEMNLGMDNTVCRHGIHGLYQLFSIHVSSLLLIKGDN 564 (596)
Q Consensus 487 I~F~L~~~~~~--~~tLriala~a~~~~~~V~vN~~~~~~p~~~~~~~~~d~~i~R~~~~G~~~~~~~~ipa~~L~~G~N 564 (596)
=.|++++...+ ...|++.. ......|.|||+.++. -.|-+.-++|+|. ..|+.|+|
T Consensus 126 r~F~vp~~w~~~~rv~L~FeG---V~~~a~VwvNG~~VG~------------------~~g~~~pfefDIT-~~l~~G~N 183 (1027)
T PRK09525 126 LTFTVDESWLQSGQTRIIFDG---VNSAFHLWCNGRWVGY------------------SQDSRLPAEFDLS-PFLRAGEN 183 (1027)
T ss_pred EEEEeChhhcCCCeEEEEECe---eccEEEEEECCEEEEe------------------ecCCCceEEEECh-hhhcCCcc
Confidence 35888765322 34455442 3467899999986531 1245677899996 57789999
Q ss_pred EEEEEEee
Q 047026 565 SMFLVQSR 572 (596)
Q Consensus 565 tI~l~~~~ 572 (596)
+|.+.+.+
T Consensus 184 ~L~V~V~~ 191 (1027)
T PRK09525 184 RLAVMVLR 191 (1027)
T ss_pred EEEEEEEe
Confidence 99999955
No 71
>PF11797 DUF3324: Protein of unknown function C-terminal (DUF3324); InterPro: IPR021759 This family consists of several hypothetical bacterial proteins of unknown function.
Probab=32.76 E-value=52 Score=30.90 Aligned_cols=30 Identities=23% Similarity=0.143 Sum_probs=20.7
Q ss_pred CccCCeeEEEEEEcceeeeEeeeeEEEEeC
Q 047026 352 NVVPGVYGLHGWVPGFIGDYLDKALVTISA 381 (596)
Q Consensus 352 nVrpGtY~L~a~~~G~~G~~~~~~~VtV~a 381 (596)
.++||+|+|.+-+..-.+.-..+..++|++
T Consensus 102 ~lk~G~Y~l~~~~~~~~~~W~f~k~F~It~ 131 (140)
T PF11797_consen 102 KLKPGKYTLKITAKSGKKTWTFTKDFTITA 131 (140)
T ss_pred CccCCEEEEEEEEEcCCcEEEEEEEEEECH
Confidence 699999999987654333344556677763
No 72
>PF01190 Pollen_Ole_e_I: Pollen proteins Ole e I like; InterPro: IPR006041 Allergies are hypersensitivity reactions of the immune system to specific substances called allergens (such as pollen, stings, drugs, or food) that, in most people, result in no symptoms. A nomenclature system has been established for antigens (allergens) that cause IgE-mediated atopic allergies in humans [WHO/IUIS Allergen Nomenclature Subcommittee King T.P., Hoffmann D., Loewenstein H., Marsh D.G., Platts-Mills T.A.E., Thomas W. Bull. World Health Organ. 72:797-806(1994)]. This nomenclature system is defined by a designation that is composed of the first three letters of the genus; a space; the first letter of the species name; a space and an arabic number. In the event that two species names have identical designations, they are discriminated from one another by adding one or more letters (as necessary) to each species designation. The allergens in this family include allergens with the following designations: Ole e 1. A number of plant pollen proteins, whose biological function is not yet known, are structurally related. These proteins are most probably secreted and consist of about 145 residues. There are six cysteines which are conserved in the sequence of these proteins. They seem to be involved in disulphide bonds.
Probab=30.48 E-value=53 Score=28.60 Aligned_cols=37 Identities=19% Similarity=0.293 Sum_probs=24.6
Q ss_pred CCCcceeEEEEcCCCCCCCcccccccceeEEEECCccceEeC
Q 047026 310 LIPAKYAYIGLSSARTEGGWQTESKDYQFWVQTDSKGNFTVK 351 (596)
Q Consensus 310 ~~pa~~a~V~L~~~~~~g~wq~~~~~yqywt~td~~G~FtI~ 351 (596)
..|..+|.|.|.=.+..+ .......+.||++|.|.|.
T Consensus 18 ~~~l~GA~V~v~C~~~~~-----~~~~~~~~~Td~~G~F~i~ 54 (97)
T PF01190_consen 18 AKPLPGAKVSVECKDGNG-----GVVFSAEAKTDENGYFSIE 54 (97)
T ss_pred CccCCCCEEEEECCCCCC-----CcEEEEEEEeCCCCEEEEE
Confidence 367788999885332110 0135556889999999995
No 73
>PRK10150 beta-D-glucuronidase; Provisional
Probab=27.14 E-value=1.1e+02 Score=35.35 Aligned_cols=64 Identities=14% Similarity=0.201 Sum_probs=41.3
Q ss_pred EEEEeCCCccc-eEEEEEEEeccCCCeeEEEEcCccCCcccccccccCCCCeeeeeeEEEeeEEEEEEeecCceeeecc-
Q 047026 487 IKFHLDSIIKG-TYNLRLAIASATRSDLEIFVNYIDQGHLVYQEMNLGMDNTVCRHGIHGLYQLFSIHVSSLLLIKGDN- 564 (596)
Q Consensus 487 I~F~L~~~~~~-~~tLriala~a~~~~~~V~vN~~~~~~p~~~~~~~~~d~~i~R~~~~G~~~~~~~~ipa~~L~~G~N- 564 (596)
=.|++++...+ ...|++.. .....+|.|||+.++. -.|-+..++|+|.. .|+.|+|
T Consensus 71 r~f~lp~~~~gk~v~L~Feg---v~~~a~V~lNG~~vg~------------------~~~~~~~f~~DIT~-~l~~G~~n 128 (604)
T PRK10150 71 REVFIPKGWAGQRIVLRFGS---VTHYAKVWVNGQEVME------------------HKGGYTPFEADITP-YVYAGKSV 128 (604)
T ss_pred EEEECCcccCCCEEEEEECc---ccceEEEEECCEEeee------------------EcCCccceEEeCch-hccCCCce
Confidence 35888754433 34455543 2345799999986541 12456778999975 5678864
Q ss_pred EEEEEEee
Q 047026 565 SMFLVQSR 572 (596)
Q Consensus 565 tI~l~~~~ 572 (596)
+|.+.+.+
T Consensus 129 ~L~V~v~n 136 (604)
T PRK10150 129 RITVCVNN 136 (604)
T ss_pred EEEEEEec
Confidence 99999843
No 74
>KOG0496 consensus Beta-galactosidase [Carbohydrate transport and metabolism]
Probab=25.82 E-value=1.3e+02 Score=35.45 Aligned_cols=71 Identities=20% Similarity=0.205 Sum_probs=47.2
Q ss_pred CCccEEEEEEeCCCccceEEEEEEEeccCCCeeEEEEcCccCCcccccccccCCCCeeeeeeEEEeeEEEEEEeecCcee
Q 047026 481 LPTTWTIKFHLDSIIKGTYNLRLAIASATRSDLEIFVNYIDQGHLVYQEMNLGMDNTVCRHGIHGLYQLFSIHVSSLLLI 560 (596)
Q Consensus 481 ~~~~w~I~F~L~~~~~~~~tLriala~a~~~~~~V~vN~~~~~~p~~~~~~~~~d~~i~R~~~~G~~~~~~~~ipa~~L~ 560 (596)
++-+|-=.|+.++.... ++|-...-+.=+|.|||.++++ . +..- |. ..++.||++.|+
T Consensus 556 ~P~~w~k~f~~p~g~~~-----t~Ldm~g~GKG~vwVNG~niGR-Y---W~~~-----------G~--Q~~yhvPr~~Lk 613 (649)
T KOG0496|consen 556 QPLTWYKTFDIPSGSEP-----TALDMNGWGKGQVWVNGQNIGR-Y---WPSF-----------GP--QRTYHVPRSWLK 613 (649)
T ss_pred CCeEEEEEecCCCCCCC-----eEEecCCCcceEEEECCccccc-c---cCCC-----------CC--ceEEECcHHHhC
Confidence 56777767777664331 1222234567799999999874 2 2111 33 567889999999
Q ss_pred eeccEEEEEEeec
Q 047026 561 KGDNSMFLVQSRS 573 (596)
Q Consensus 561 ~G~NtI~l~~~~g 573 (596)
.+.|.|.+---.+
T Consensus 614 ~~~N~lvvfEee~ 626 (649)
T KOG0496|consen 614 PSGNLLVVFEEEG 626 (649)
T ss_pred cCCceEEEEEecc
Confidence 9999988765444
No 75
>PRK13211 N-acetylglucosamine-binding protein A; Reviewed
Probab=23.76 E-value=3.7e+02 Score=30.71 Aligned_cols=67 Identities=7% Similarity=-0.069 Sum_probs=39.1
Q ss_pred CCCCCCCCceeEEEEEEeeecccccCCCCcceeEEEEcCCCCCCCcccccccceeEEEECCccceEeC--CccCCeeEEE
Q 047026 284 PYYLTANERGSATGRFFVQDKFVSSSLIPAKYAYIGLSSARTEGGWQTESKDYQFWVQTDSKGNFTVK--NVVPGVYGLH 361 (596)
Q Consensus 284 ~~y~~~~~RGtVsG~v~~~D~~~~~~~~pa~~a~V~L~~~~~~g~wq~~~~~yqywt~td~~G~FtI~--nVrpGtY~L~ 361 (596)
++|.-.++..+|.=+|... ....+.+-|.+... +.++++=-+..|..-.|+|. ++.+|.|+|.
T Consensus 320 ~eY~I~dG~~~i~ftv~a~---------g~~~vta~V~d~~g------~~~~~~~~~v~d~s~~vtL~Ls~~~AG~y~Lv 384 (478)
T PRK13211 320 KEYKIGDGAATLDFTVTAT---------GDMNVEATVYNHDG------EALGSKSQTVNDGSQSVSLDLSKLKAGHHMLV 384 (478)
T ss_pred ceeEEcCCcEEEEEEEEec---------cceEEEEEEEcCCC------CeeeeeeEEecCCceeEEEecccCCCceEEEE
Confidence 6677766666666555543 23355555543221 23344333444555567655 9999999999
Q ss_pred EEEc
Q 047026 362 GWVP 365 (596)
Q Consensus 362 a~~~ 365 (596)
+.+.
T Consensus 385 v~~t 388 (478)
T PRK13211 385 VKAK 388 (478)
T ss_pred EEEE
Confidence 8753
No 76
>PF13750 Big_3_3: Bacterial Ig-like domain (group 3)
Probab=23.75 E-value=6.6e+02 Score=24.13 Aligned_cols=27 Identities=26% Similarity=0.562 Sum_probs=21.1
Q ss_pred cEEEEEEeCCCccceEEEEE-EEeccCC
Q 047026 484 TWTIKFHLDSIIKGTYNLRL-AIASATR 510 (596)
Q Consensus 484 ~w~I~F~L~~~~~~~~tLri-ala~a~~ 510 (596)
+|+..|++...+.|.|+|.+ .-.-+++
T Consensus 2 ~~~~~fd~~~l~dG~Y~l~~~~a~D~ag 29 (158)
T PF13750_consen 2 NYTYTFDLSTLPDGSYTLTVVTATDAAG 29 (158)
T ss_pred cEEEEEEeCcCCCccEEEEEEEEEecCC
Confidence 69999999888899999998 3444443
No 77
>PF13954 PapC_N: PapC N-terminal domain; PDB: 2VQI_B 3FIP_A 3RFZ_E 3OHN_A 1ZDV_A 1ZE3_D 3BWU_D 1ZDX_A.
Probab=22.34 E-value=1.2e+02 Score=28.58 Aligned_cols=26 Identities=19% Similarity=0.524 Sum_probs=18.2
Q ss_pred CccCCeeEEEEEEcceeeeEeeeeEEEEe
Q 047026 352 NVVPGVYGLHGWVPGFIGDYLDKALVTIS 380 (596)
Q Consensus 352 nVrpGtY~L~a~~~G~~G~~~~~~~VtV~ 380 (596)
.+.||+|.+-+|++| .+.....|++.
T Consensus 26 ~~~pG~Y~vdv~vN~---~~~~~~~i~f~ 51 (146)
T PF13954_consen 26 AIPPGEYSVDVYVNG---KFIGRYDIEFI 51 (146)
T ss_dssp SS-SEEEEEEEEETT---EEEEEEEEEEE
T ss_pred CCCCeEEEEEEEECC---eeeeeEEEEEE
Confidence 699999999999996 33444445555
No 78
>PF14200 RicinB_lectin_2: Ricin-type beta-trefoil lectin domain-like; PDB: 2X2S_C 2X2T_A 2VSE_B 2VSA_A 3EF2_A 2IHO_A 3HZB_H 1YBI_B 3PHZ_A 3NBE_A ....
Probab=22.09 E-value=1.2e+02 Score=26.24 Aligned_cols=38 Identities=18% Similarity=0.308 Sum_probs=25.4
Q ss_pred CCcceeEEEEcCCCCCCCcccccccceeEEEEC-CccceEeCCccCC
Q 047026 311 IPAKYAYIGLSSARTEGGWQTESKDYQFWVQTD-SKGNFTVKNVVPG 356 (596)
Q Consensus 311 ~pa~~a~V~L~~~~~~g~wq~~~~~yqywt~td-~~G~FtI~nVrpG 356 (596)
..+.++.|.+... .....|.|.... .+|.|+|.++..|
T Consensus 34 ~~~~g~~v~~~~~--------~~~~~Q~W~i~~~~~g~y~I~n~~s~ 72 (105)
T PF14200_consen 34 STANGTNVQQWTC--------NGNDNQQWKIEPVGDGYYRIRNKNSG 72 (105)
T ss_dssp TCSTTEBEEEEES--------SSSGGGEEEEEESTTSEEEEEETSTT
T ss_pred CcCCCcEEEEecC--------CCCcCcEEEEEEecCCeEEEEECCCC
Confidence 3456677777433 234788886654 6788999888665
No 79
>PF04571 Lipin_N: lipin, N-terminal conserved region; InterPro: IPR007651 Mutations in the lipin gene lead to fatty liver dystrophy in mice. The protein has been shown to be phosphorylated by the TOR Ser/Thr protein kinases in response to insulin stimulation. This entry represents a conserved domain found at the N terminus of the member proteins.
Probab=21.63 E-value=1.2e+02 Score=27.96 Aligned_cols=38 Identities=18% Similarity=0.281 Sum_probs=28.0
Q ss_pred CCCCCCCCccEEEEEEeCCCccceEEEEEEEeccCCCeeEEEEcCccCC
Q 047026 475 GPDNKYLPTTWTIKFHLDSIIKGTYNLRLAIASATRSDLEIFVNYIDQG 523 (596)
Q Consensus 475 ~~~~~~~~~~w~I~F~L~~~~~~~~tLriala~a~~~~~~V~vN~~~~~ 523 (596)
..||+++-++|.++|-- | .+..+....+.|.|||..+.
T Consensus 35 q~DGs~~sSPFhVRFGk---------~--~vl~~~ek~V~I~VNG~~~~ 72 (110)
T PF04571_consen 35 QPDGSLKSSPFHVRFGK---------L--GVLRPREKVVDIEVNGKPVD 72 (110)
T ss_pred cCCCCEecCccEEEEcc---------e--eeecccCcEEEEEECCEEcc
Confidence 46788899999999972 1 33444556789999998764
No 80
>COG4676 Uncharacterized protein conserved in bacteria [Function unknown]
Probab=21.49 E-value=1.4e+02 Score=30.61 Aligned_cols=46 Identities=33% Similarity=0.521 Sum_probs=33.0
Q ss_pred EEEEEcCCCCcccCCcceecccccCCccEEEEEeecccccCceeeccccCcccceeeceEEEEEcCCC
Q 047026 186 GFWIIFPSHEFRNGGPTKQNLTVHTGPTCLAMFHGTHYIGNEILAHFQEGEAWRKVFGPIFVYLNSTS 253 (596)
Q Consensus 186 G~W~I~~s~E~~sGGPlkqdL~~h~g~~~l~y~~s~Hy~g~~~~~~~~~Ge~w~kv~GP~~~y~N~g~ 253 (596)
++|.=.+ -.++||-|--|.+.-.||+++.| .. -++|+|++|+|==+
T Consensus 167 Hawygn~--~lsngg~LDvDvttGyGPEifa~-------pa-------------P~~G~ylvYVNY~G 212 (268)
T COG4676 167 HAWYGNP--VLSNGGALDVDVTTGYGPEIFAM-------PA-------------PVHGTYLVYVNYYG 212 (268)
T ss_pred eeeecCc--eecCCcccCcccccCCCcceecc-------CC-------------CCCccEEEEEEeec
Confidence 4444333 46789999888888888887755 22 27899999999633
Done!