Query 038979
Match_columns 606
No_of_seqs 172 out of 293
Neff 5.8
Searched_HMMs 46136
Date Fri Mar 29 05:11:41 2013
Command hhsearch -i /work/01045/syshi/csienesis_hhblits_a3m/038979.a3m -d /work/01045/syshi/HHdatabase/Cdd.hhm -o /work/01045/syshi/hhsearch_cdd/038979hhsearch_cdd -cpu 12 -v 0
No Hit Prob E-value P-value Score SS Cols Query HMM Template HMM
1 PF09284 RhgB_N: Rhamnogalactu 100.0 1.8E-53 3.9E-58 420.7 24.7 233 7-278 3-244 (249)
2 PF06045 Rhamnogal_lyase: Rham 100.0 8E-48 1.7E-52 375.4 18.9 170 1-170 5-193 (203)
3 PF14683 CBM-like: Polysacchar 100.0 7.2E-46 1.6E-50 356.3 10.9 165 412-599 1-167 (167)
4 PF14686 fn3_3: Polysaccharide 99.9 9.4E-22 2E-26 172.7 9.2 93 304-403 1-95 (95)
5 PF13620 CarboxypepD_reg: Carb 98.8 2.4E-08 5.2E-13 84.0 9.2 81 307-406 1-82 (82)
6 PF13715 DUF4480: Domain of un 98.5 1.8E-06 4E-11 73.8 10.9 88 307-419 1-88 (88)
7 cd03863 M14_CPD_II The second 98.0 2.7E-05 5.9E-10 84.5 9.6 100 269-406 275-374 (375)
8 cd03865 M14_CPE_H Peptidase M1 97.9 2.2E-05 4.7E-10 85.8 8.1 98 269-405 304-401 (402)
9 cd03864 M14_CPN Peptidase M14 97.9 3.6E-05 7.8E-10 84.1 9.0 98 269-405 294-391 (392)
10 cd06245 M14_CPD_III The third 97.7 0.00013 2.8E-09 79.0 9.6 76 306-406 287-362 (363)
11 cd03868 M14_CPD_I The first ca 97.5 0.00022 4.7E-09 77.5 8.3 73 305-400 295-368 (372)
12 cd03858 M14_CP_N-E_like Carbox 97.5 0.00033 7.2E-09 76.0 9.4 114 249-400 251-370 (374)
13 cd03867 M14_CPZ Peptidase M14- 97.2 0.001 2.2E-08 73.0 8.8 71 307-400 319-391 (395)
14 cd03866 M14_CPM Peptidase M14 96.4 0.01 2.2E-07 64.8 8.6 70 305-395 294-363 (376)
15 PRK15036 hydroxyisourate hydro 95.4 0.037 8E-07 52.2 6.4 64 303-377 24-91 (137)
16 PF08400 phage_tail_N: Prophag 95.2 0.11 2.4E-06 48.8 8.7 78 307-396 4-81 (134)
17 PF08308 PEGA: PEGA domain; I 94.8 0.21 4.6E-06 41.0 8.7 44 360-407 26-69 (71)
18 PF03422 CBM_6: Carbohydrate b 94.2 0.31 6.8E-06 43.9 9.1 80 508-600 44-124 (125)
19 cd03869 M14_CPX_like Peptidase 92.9 0.23 5.1E-06 54.8 6.9 67 305-395 329-395 (405)
20 cd00421 intradiol_dioxygenase 92.6 0.27 5.9E-06 46.7 6.2 64 304-373 10-80 (146)
21 PF05738 Cna_B: Cna protein B- 92.6 0.66 1.4E-05 37.7 7.7 45 353-398 21-67 (70)
22 PF09430 DUF2012: Protein of u 92.6 0.58 1.3E-05 43.1 8.1 40 351-393 22-61 (123)
23 PF07210 DUF1416: Protein of u 92.3 0.79 1.7E-05 39.7 7.8 62 304-382 6-67 (85)
24 KOG1948 Metalloproteinase-rela 91.3 0.52 1.1E-05 55.6 7.4 53 306-377 316-369 (1165)
25 cd03463 3,4-PCD_alpha Protocat 89.6 0.68 1.5E-05 45.9 5.8 63 304-372 35-106 (185)
26 PF07495 Y_Y_Y: Y_Y_Y domain; 89.4 0.52 1.1E-05 37.8 4.0 29 351-379 20-49 (66)
27 cd03459 3,4-PCD Protocatechuat 89.1 0.78 1.7E-05 44.3 5.7 64 304-373 14-87 (158)
28 cd03462 1,2-CCD chlorocatechol 87.5 1.6 3.4E-05 45.2 7.0 64 303-372 97-165 (247)
29 PF00775 Dioxygenase_C: Dioxyg 87.4 1.5 3.2E-05 43.5 6.5 64 304-373 28-98 (183)
30 TIGR02465 chlorocat_1_2 chloro 87.2 1.2 2.7E-05 46.0 6.0 64 304-373 97-165 (246)
31 PF03170 BcsB: Bacterial cellu 86.7 1.8 3.8E-05 50.2 7.7 77 494-584 29-110 (605)
32 TIGR02423 protocat_alph protoc 86.2 1.4 3.1E-05 43.9 5.7 64 304-373 38-111 (193)
33 TIGR02422 protocat_beta protoc 85.8 2.7 5.8E-05 42.9 7.5 67 301-373 56-132 (220)
34 cd03464 3,4-PCD_beta Protocate 85.7 2.7 5.8E-05 42.9 7.4 65 303-373 63-137 (220)
35 KOG1948 Metalloproteinase-rela 85.3 2.2 4.7E-05 50.7 7.3 57 306-378 119-175 (1165)
36 COG3485 PcaH Protocatechuate 3 84.8 1.7 3.6E-05 44.5 5.5 64 304-373 71-143 (226)
37 TIGR02962 hdxy_isourate hydrox 84.4 2.1 4.5E-05 39.2 5.4 55 322-386 12-71 (112)
38 PRK10340 ebgA cryptic beta-D-g 83.4 37 0.0008 42.2 17.2 43 4-46 716-758 (1021)
39 PF02837 Glyco_hydro_2_N: Glyc 83.0 2.5 5.4E-05 39.9 5.7 68 496-585 72-140 (167)
40 PF02929 Bgal_small_N: Beta ga 82.7 14 0.00029 38.8 11.5 30 15-44 1-30 (276)
41 PF10670 DUF4198: Domain of un 82.3 4 8.7E-05 40.0 7.0 62 305-377 150-211 (215)
42 TIGR02438 catachol_actin catec 81.2 2.3 5.1E-05 44.8 5.1 65 303-373 130-199 (281)
43 cd05822 TLP_HIUase HIUase (5-h 80.6 3.5 7.7E-05 37.7 5.4 55 322-386 12-71 (112)
44 cd03460 1,2-CTD Catechol 1,2 d 80.6 2.9 6.4E-05 44.1 5.6 65 303-373 122-191 (282)
45 cd03458 Catechol_intradiol_dio 80.6 9.6 0.00021 39.8 9.2 65 303-373 102-171 (256)
46 TIGR02439 catechol_proteo cate 79.6 3.3 7.2E-05 43.8 5.6 65 303-373 126-195 (285)
47 smart00606 CBD_IV Cellulose Bi 79.5 18 0.00039 32.7 9.9 88 495-599 40-129 (129)
48 PRK09525 lacZ beta-D-galactosi 79.5 61 0.0013 40.3 17.1 43 4-46 741-783 (1027)
49 PF00576 Transthyretin: HIUase 79.1 2.7 5.8E-05 38.4 4.1 50 322-376 12-66 (112)
50 PRK11114 cellulose synthase re 78.3 3.5 7.5E-05 49.3 5.9 75 497-584 84-163 (756)
51 cd03461 1,2-HQD Hydroxyquinol 77.6 4.6 9.9E-05 42.6 5.9 65 303-373 118-187 (277)
52 PF13364 BetaGal_dom4_5: Beta- 77.2 12 0.00025 33.9 7.7 53 511-581 50-104 (111)
53 cd05821 TLP_Transthyretin Tran 76.9 3.5 7.7E-05 38.2 4.3 68 305-386 6-77 (121)
54 cd05469 Transthyretin_like Tra 74.4 4.6 9.9E-05 37.1 4.3 56 322-386 12-71 (113)
55 KOG2649 Zinc carboxypeptidase 73.9 10 0.00022 42.8 7.5 78 306-407 378-455 (500)
56 PF02369 Big_1: Bacterial Ig-l 72.9 30 0.00065 30.5 9.1 68 304-379 21-90 (100)
57 COG2351 Transthyretin-like pro 70.8 15 0.00032 34.1 6.6 67 306-387 9-80 (124)
58 smart00095 TR_THY Transthyreti 69.8 7.2 0.00016 36.2 4.5 67 306-386 4-74 (121)
59 PF14315 DUF4380: Domain of un 69.8 74 0.0016 33.2 12.6 33 12-44 5-38 (274)
60 PF03170 BcsB: Bacterial cellu 69.7 8.5 0.00018 44.6 6.2 79 493-584 323-409 (605)
61 PF01060 DUF290: Transthyretin 65.1 13 0.00027 31.6 4.8 55 309-375 1-55 (80)
62 PLN03059 beta-galactosidase; P 63.1 7.5 0.00016 46.7 4.0 85 494-585 621-715 (840)
63 KOG4342 Alpha-mannosidase [Car 61.4 12 0.00027 43.3 5.1 39 4-43 706-746 (1078)
64 cd03457 intradiol_dioxygenase_ 55.3 30 0.00065 34.5 6.2 62 306-372 27-100 (188)
65 PRK10340 ebgA cryptic beta-D-g 54.4 17 0.00037 45.0 5.2 67 496-584 113-179 (1021)
66 PF08531 Bac_rhamnosid_N: Alph 50.2 13 0.00028 36.0 2.7 61 511-585 6-66 (172)
67 cd09024 Aldose_epim_lacX Aldos 47.3 2.1E+02 0.0045 29.8 11.3 30 14-43 1-32 (288)
68 PF11008 DUF2846: Protein of u 46.9 25 0.00055 31.8 3.9 44 357-400 56-99 (117)
69 PF13754 Big_3_4: Bacterial Ig 46.1 42 0.00091 26.3 4.5 28 350-377 3-32 (54)
70 PF03944 Endotoxin_C: delta en 44.7 89 0.0019 29.3 7.4 95 497-600 41-140 (143)
71 PRK09525 lacZ beta-D-galactosi 38.0 50 0.0011 41.1 5.7 66 496-584 124-191 (1027)
72 PF12866 DUF3823: Protein of u 37.8 1.1E+02 0.0024 31.3 7.2 63 305-376 21-84 (222)
73 PF07748 Glyco_hydro_38C: Glyc 37.5 2.3E+02 0.005 31.0 10.4 116 14-132 89-226 (457)
74 PF07550 DUF1533: Protein of u 37.3 29 0.00063 28.4 2.4 19 564-582 36-55 (65)
75 PF01190 Pollen_Ole_e_I: Polle 36.2 68 0.0015 27.9 4.8 37 323-364 18-54 (97)
76 smart00634 BID_1 Bacterial Ig- 35.6 1.3E+02 0.0029 25.7 6.5 66 305-381 19-86 (92)
77 PF11797 DUF3324: Protein of u 33.2 49 0.0011 31.1 3.6 30 365-394 102-131 (140)
78 PRK10150 beta-D-glucuronidase; 33.1 77 0.0017 36.8 6.0 65 497-583 70-135 (604)
79 PF01690 PLRV_ORF5: Potato lea 33.1 1.3E+02 0.0027 34.2 7.2 108 345-474 74-192 (465)
80 PF14900 DUF4493: Domain of un 32.6 66 0.0014 32.6 4.7 53 350-407 48-108 (235)
81 PF10794 DUF2606: Protein of u 32.0 1.8E+02 0.0039 27.1 6.8 54 321-376 51-105 (131)
82 PF10989 DUF2808: Protein of u 29.5 3.2E+02 0.007 25.8 8.5 101 19-126 40-144 (146)
83 PF14849 YidC_periplas: YidC p 28.7 55 0.0012 33.4 3.5 26 13-38 1-26 (270)
84 PF09912 DUF2141: Uncharacteri 28.4 1.3E+02 0.0028 27.3 5.4 20 358-377 42-61 (112)
85 PF03785 Peptidase_C25_C: Pept 27.6 1.9E+02 0.0042 25.1 5.9 39 327-377 26-69 (81)
86 PRK13211 N-acetylglucosamine-b 24.1 3.7E+02 0.008 30.8 9.1 67 297-378 320-388 (478)
87 PRK15172 putative aldose-1-epi 23.4 1.4E+02 0.0031 31.4 5.5 36 10-45 9-44 (300)
88 PF14200 RicinB_lectin_2: Rici 23.0 1.2E+02 0.0027 26.2 4.2 39 323-369 33-72 (105)
89 PF01263 Aldose_epim: Aldose 1 22.8 1.2E+02 0.0025 31.2 4.6 36 13-48 2-40 (300)
90 KOG0496 Beta-galactosidase [Ca 21.2 1.5E+02 0.0032 35.0 5.3 69 492-585 556-626 (649)
91 TIGR03769 P_ac_wall_RPT actino 20.1 1E+02 0.0023 23.1 2.6 11 367-377 11-21 (41)
No 1
>PF09284 RhgB_N: Rhamnogalacturonase B, N-terminal; InterPro: IPR015364 This domain is found in the prokaryotic enzyme rhamnogalacturonase B. It adopts a structure consisting of a beta supersandwich, with eighteen strands in two beta-sheets. The exact function of the domain is unknown, but a putative role is carbohydrate binding; GO: 0016837 carbon-oxygen lyase activity, acting on polysaccharides, 0030246 carbohydrate binding, 0005975 carbohydrate metabolic process; PDB: 1NKG_A 2XHN_B 3NJX_A 3NJV_A.
Probab=100.00 E-value=1.8e-53 Score=420.65 Aligned_cols=233 Identities=19% Similarity=0.305 Sum_probs=162.7
Q ss_pred EEEeCCEEEEeCC-eEEEEEeCCceeEEEEEEcCEEeccccCCCccCcCccceEEEEEEccCcEEEEEEEEeecCCCCCC
Q 038979 7 LHEQNNHVVMNNG-ILQVSISTPQGFVIGIQYKGNKNLLNVQNEEDNRGIEATNYKVIMRTKEQVELSFTRMWQPYTNGT 85 (606)
Q Consensus 7 ~~~~g~~vv~~Ng-~l~vtv~k~~g~itsl~y~G~e~l~~~~~~~~~~Gl~~~~~~v~~~~~~~i~vs~~~~~~~~~~g~ 85 (606)
++++|+.+|+|.| .|+|+|+|++|||+||+|+|+|+|.+.+++|+++||+.+++++.+.++ +|+|+|+. ++
T Consensus 3 ~t~sg~~~viDtga~Lvf~V~~s~gDitSi~y~g~ElQ~~~k~ShI~SGLGsatVs~~~~~~-~IkVt~~~-------~t 74 (249)
T PF09284_consen 3 YTDSGSNYVIDTGAGLVFKVSKSNGDITSIKYNGTELQYSSKNSHINSGLGSATVSITTSGD-YIKVTCKT-------GT 74 (249)
T ss_dssp EEE-SSEEEEE---TEEEEEETTT--EEEEEETTEE-B-SSS-BEETT--SS-EEEEEEETT-EEEEEEE--------SS
T ss_pred eEecCCcEEEECCCCEEEEEecCCCCeEEEEECCEeeecCCccceeccCCCccEEEEEeeCC-EEEEEEEc-------CC
Confidence 5788877777777 499999999999999999999999999999999999999999998775 99999998 43
Q ss_pred ccceeeeEEEEEEcCCceEEEEEEeecccCCCCCcccccEEEEeCCCCCCccee-eccccccccCCcccccccceeeeec
Q 038979 86 IAPVNIDKRFLMLRGSSGFYSYAIYKRLKGWPGFQLFNNRMVFKPNPDKFHYMI-ISGNRQREMPLQQDRERGHKLAYEE 164 (606)
Q Consensus 86 ~~~~~l~~~~v~r~g~sgiY~y~i~~~~~~~p~~~lge~R~v~Rl~~~~f~~~~-~~d~r~~~~P~~~d~~~g~~l~~~e 164 (606)
|+||||+|+|++.|||++ + .+ .+++|||||||+||++++||+-. ..+ .++ ..|..++|.+
T Consensus 75 -----Lthyyv~r~g~~~IYmaT-~--~~--~e~~igelRfIaRL~~~~lpn~~~~~~--------~~~-~~g~taIEgs 135 (249)
T PF09284_consen 75 -----LTHYYVARPGENNIYMAT-Y--IT--AEPSIGELRFIARLNRSILPNEYPYGD--------VST-TDGGTAIEGS 135 (249)
T ss_dssp -----EEEEEEEETT--EEEEEE-E--ES--S--TTS-EEEEEEE-TTTS-EEETTGG--------GG---TT-EEEETT
T ss_pred -----eEEEEEEecCCceEEEEe-c--cC--CCCCccceEEEEEcccccCCCCCCccc--------ccc-cCCceEEeec
Confidence 899999999999999999 4 24 58899999999999999999932 111 122 4678889999
Q ss_pred eEEcC-CCEeecceecccccccceeEEEEeCCCceEEEEEcCCCCcccCCCceecccccCCCc---eEEEEeeeccccCc
Q 038979 165 AVLLP-NGEVDDKYQYSMDAKDIRVHGWISTDSTVGFWQILPSSESRSFGPLKQFLTSHTGPI---SINTFHSTHYVGEN 240 (606)
Q Consensus 165 ~v~l~-~G~~~sKY~~s~~~~d~~vhG~~s~g~~vG~W~I~~s~E~~sGGP~kqdL~~h~g~~---~ln~~~s~H~~g~~ 240 (606)
||++. ||+++||||++.+++|+++|||+ |+++|+|||++++|.+|||||+|||++|.++. +++||+|+|.+
T Consensus 136 DVf~~~~G~TrSKfYSs~r~IDd~~hgv~--g~~vgv~mi~~~~E~SSGGPFfRDI~~~~~~~~~~Ly~ymnSgH~q--- 210 (249)
T PF09284_consen 136 DVFLVSDGQTRSKFYSSQRFIDDDVHGVS--GSAVGVYMIMSNYEKSSGGPFFRDINTNNGGDGNELYNYMNSGHTQ--- 210 (249)
T ss_dssp TEEEE-TTEEEEGGGG--BGGG-SEEEEE---SS-EEEEE----TT-SS-TT-B---EEE-SS-EEEEEEEE-STT----
T ss_pred cEEEecCceEeeeeccccceeccceEEEe--cCCeEEEEEeCCccccCCCCchhhhhhccCCccceeeeeEecCccc---
Confidence 99977 99999999999999999999998 88999999999999999999999999998765 89999999987
Q ss_pred ccccccCCcccc-eeeceEEEEEcCCCCC--CCchhhHHHH
Q 038979 241 FGMKFKDGEAWK-KIFGPFLVYVNSVAGK--GDRQMLWRDA 278 (606)
Q Consensus 241 ~~~~~~~ge~w~-k~~GP~~~y~N~g~~~--~~~~~l~~Da 278 (606)
+|+|| +|||||+|+|++|.++ .++|..|+|.
T Consensus 211 -------TE~~R~GLhGPYaL~FT~g~~Ps~~~~D~sff~~ 244 (249)
T PF09284_consen 211 -------TEPYRMGLHGPYALAFTDGGAPSASDLDTSFFDD 244 (249)
T ss_dssp --------S----EEEEEEEEEEESS----S-----GGGGG
T ss_pred -------CchhccccCCceEEEEcCCCCCCCccccccchhh
Confidence 89999 9999999999997764 3589999987
No 2
>PF06045 Rhamnogal_lyase: Rhamnogalacturonate lyase family; InterPro: IPR010325 Rhamnogalacturonate lyase degrades the rhamnogalacturonan I (RG-I) backbone of pectin. This family contains mainly members from plants, but also includes members from the plant pathogen Erwinia chrysanthemi.
Probab=100.00 E-value=8e-48 Score=375.35 Aligned_cols=170 Identities=58% Similarity=0.985 Sum_probs=164.1
Q ss_pred CCCceEEEEeCCEEEEeCCeEEEEEeCCceeEEEEEEcCEEeccccCCCccCcC-----------------ccceEEEEE
Q 038979 1 SAAGVQLHEQNNHVVMNNGILQVSISTPQGFVIGIQYKGNKNLLNVQNEEDNRG-----------------IEATNYKVI 63 (606)
Q Consensus 1 ~~~~v~~~~~g~~vv~~Ng~l~vtv~k~~g~itsl~y~G~e~l~~~~~~~~~~G-----------------l~~~~~~v~ 63 (606)
+..+|+|++++.+|+|+||+|+|||+||+|+||+|+|+|++||++..+++.+|| +.+|+|+|+
T Consensus 5 ~~~~V~L~~~~~~VvldNGiVqVtls~p~G~VtgIkYnGi~NLle~~n~e~nrGYwD~~W~~~G~~~~~~~~~gt~f~Vi 84 (203)
T PF06045_consen 5 SSSGVTLTVQGRQVVLDNGIVQVTLSKPGGIVTGIKYNGIDNLLEVANKENNRGYWDLVWNEPGSKGKFDRIKGTEFSVI 84 (203)
T ss_pred cCCCeEEEEcCCEEEEECCEEEEEEcCCCceEEEEEECCEehhhcccCcccCCceEEEecccCCccccccccCCcEEEEE
Confidence 467899999999999999999999999999999999999999999888887777 678999999
Q ss_pred EccCcEEEEEEEEeecCCCCCCccceeeeEEEEEEcCCceEEEEEEeecccCCCCCcccccEEEEeCCCCCCcceeeccc
Q 038979 64 MRTKEQVELSFTRMWQPYTNGTIAPVNIDKRFLMLRGSSGFYSYAIYKRLKGWPGFQLFNNRMVFKPNPDKFHYMIISGN 143 (606)
Q Consensus 64 ~~~~~~i~vs~~~~~~~~~~g~~~~~~l~~~~v~r~g~sgiY~y~i~~~~~~~p~~~lge~R~v~Rl~~~~f~~~~~~d~ 143 (606)
.+++++|||||+++|+||++++.+||+||+||||++|+||||+|+||+|+++||+++|+|+|+||||++++|++|+++|+
T Consensus 85 ~~te~qVevSF~r~w~~s~~~~~~plnIDkryVm~rG~SGfY~YAI~e~~~~~Pa~~l~q~R~vfKl~~d~F~ymai~d~ 164 (203)
T PF06045_consen 85 EQTEEQVEVSFSRTWDPSLDGKSVPLNIDKRYVMLRGSSGFYSYAIFEHPAGWPAFDLGQTRIVFKLNKDKFHYMAISDD 164 (203)
T ss_pred EcCCCeEEEEEEcccCcCCCCCcceeEeeEEEEEecCCceEEEEEEEecCCCCCCcccceeEEEEECCccccceEEeccc
Confidence 99999999999999999999999999999999999999999999999999999999999999999999999999999999
Q ss_pred cccccCCcccc--cccceeeeeceEEcCC
Q 038979 144 RQREMPLQQDR--ERGHKLAYEEAVLLPN 170 (606)
Q Consensus 144 r~~~~P~~~d~--~~g~~l~~~e~v~l~~ 170 (606)
|||.||+|+|+ .+|++|+|+|||+|+|
T Consensus 165 rqr~mP~~~D~~~~~~~~l~y~eav~l~~ 193 (203)
T PF06045_consen 165 RQRIMPSPDDRDPARGQPLAYPEAVLLVN 193 (203)
T ss_pred ccccCCChHHccccCCCcccCchhhhcCC
Confidence 99999999999 5789999999999987
No 3
>PF14683 CBM-like: Polysaccharide lyase family 4, domain III; PDB: 1NKG_A 2XHN_B 3NJX_A 3NJV_A.
Probab=100.00 E-value=7.2e-46 Score=356.31 Aligned_cols=165 Identities=47% Similarity=0.842 Sum_probs=115.8
Q ss_pred CceeEEeccCCCccceecCCCCCcccccccccccccccchhHhhhhhhcCCCCeeEEEeccCCCCceeeEEeeeecCCcc
Q 038979 412 PTLWEIGIPDRSAAEFYIPNPNPKYINKLYVKHDRFRQYGLWERYAELHRKRDLVYEVWANNYRKDWYFAQNTRKKGNKY 491 (606)
Q Consensus 412 ~~LweIG~~Drta~eF~~~d~~~~~~~~~~~~~d~~r~yglW~r~~~~~P~~dl~ytVG~S~~~~Dw~ya~~~~~~~~~~ 491 (606)
++|||||+|||+|.||+++|+ ++||||+ |+||+++||++|++|+||+| +++||||||+.+..
T Consensus 1 ~~iW~IG~~Drta~eF~~~~~------------~~~r~~~-~~d~~~~~p~~~~~ytVG~S-~~~Dw~y~~~~~~~---- 62 (167)
T PF14683_consen 1 PTIWQIGTPDRTAAEFRNGDP------------DKYRQYG-WSDYSRDFPWEDLTYTVGSS-PAKDWPYAQWGRVN---- 62 (167)
T ss_dssp SEEEEEE-SSSS-TTSBTHH-------------HHTTS---TT--TTS----S-EEETTTS--GGGSBSEEETTTS----
T ss_pred CcceEeCCCCCCchhhccCCh------------hhhhhcC-cccchhhCCCCCCEEEEccC-cccCCcEEEEeccC----
Confidence 589999999999999999974 4699999 99999999998999999999 88899999996432
Q ss_pred cceeEEEEEEecCccccccEEEEEEEeec-cCCeEEEEEcCccCCCCCccccccCCCCeeeeeEEE-eecEEEEEEeecC
Q 038979 492 EGSTWQIQFKLEGVVKKATYKLRVAVAAA-HGAELQVRVNSRSARRPLFSSGSVGRENAIARHGIH-GVYKLFNVDVPGK 569 (606)
Q Consensus 492 ~~~~w~I~F~L~~~~~~~~~tLriala~a-~~~~~~V~vNg~~~~~p~~~t~~~~~~~~i~R~~~~-G~~~~~~~~ipa~ 569 (606)
++|+|+|+|++++..+.+||||+||+| ++++++|+|||+....|. ..+++|++++|+|+| |+|++++|+||++
T Consensus 63 --~~w~I~F~l~~~~~~~~~tL~i~la~a~~~~~~~V~vNg~~~~~~~---~~~~~d~~~~r~g~~~G~~~~~~~~ipa~ 137 (167)
T PF14683_consen 63 --GTWTIKFDLDAVQLAGTYTLRIALAGASAGGRLQVSVNGWSGPFPS---APFGNDNAIYRSGIHRGNYRLYEFDIPAS 137 (167)
T ss_dssp ----EEEEEEE-GGG-S--EEEEEEEEEEETT-EEEEEETTEE--------------S--GGGT---S---EEEEEE-TT
T ss_pred --CCEEEEEECCCCccCCcEEEEEEeccccCCCCEEEEEcCccCCccc---cccCCCCceeeCceecccEEEEEEEEcHH
Confidence 779999999999976799999999999 999999999997766332 347899999999998 9999999999999
Q ss_pred ceeeeecEEEEEeecCCCCCceEEEEEEEE
Q 038979 570 VLRKGNNTIYLSQPRKLDAFTGIMYDYLRF 599 (606)
Q Consensus 570 ~L~~G~NtI~l~~~~g~~~~~~vmyD~IrL 599 (606)
+|++|+|+|+|++++|++.++|||||||||
T Consensus 138 ~L~~G~Nti~lt~~~gs~~~~gvmyD~I~L 167 (167)
T PF14683_consen 138 LLKAGENTITLTVPSGSGLSPGVMYDYIRL 167 (167)
T ss_dssp SS-SEEEEEEEEEE-S-GGSSEEEEEEEEE
T ss_pred HEEeccEEEEEEEccCCCccCeEEEEEEEC
Confidence 999999999999999987889999999998
No 4
>PF14686 fn3_3: Polysaccharide lyase family 4, domain II; PDB: 1NKG_A 2XHN_B 3NJX_A 3NJV_A.
Probab=99.86 E-value=9.4e-22 Score=172.70 Aligned_cols=93 Identities=44% Similarity=0.822 Sum_probs=54.1
Q ss_pred CCeEEEEEEEEeeccccccCccc-CceEEEecCCCCCCCcccccccceeEEEeCCCcceEeCcccCceeEEEEEECceec
Q 038979 304 KRGSISGRLIVKDRYVSRAGIAA-KGAYVGLAKPGRAGSWQTECKGYQFWTVANEGGNFSIKNVLIGNYNLYAWIPGFIG 382 (606)
Q Consensus 304 qRGtVsG~v~~sd~~~~~~~~pa-~~a~V~La~~~~~g~~q~~~~~yqywt~td~~G~F~I~nV~pGtY~L~a~~~G~~G 382 (606)
+||+|+|+|.+.|.+. ..++ ..++|+|+.+++.+ |+++||||+++|++|+|+|++||||+|+|++|.+|+.|
T Consensus 1 ~RG~VsG~l~l~dg~~---~~~~~~~~~Vgl~~~~d~~----q~~~yqYwt~td~~G~Fti~~V~pGtY~L~ay~~g~~g 73 (95)
T PF14686_consen 1 QRGSVSGRLTLSDGVT---NPPAGANAVVGLAPPGDFQ----QNKGYQYWTRTDSDGNFTIPNVRPGTYRLYAYADGIFG 73 (95)
T ss_dssp G-BEEEEEEE---SS-----TT--S-EEEEEE------------SS-EEEEE--TTSEEE---B-SEEEEEEEEE----T
T ss_pred CCCEEEEEEEEccCcc---cCccceeEEEEeeeccccc----cCCCCcEEEEeCCCCcEEeCCeeCcEeEEEEEEecccC
Confidence 5999999999988432 2445 67899999887754 49999999999999999999999999999999999999
Q ss_pred eeee-eeEEEEeCCceeeecce
Q 038979 383 DFKY-HAAIRITAGSAKQIGNL 403 (606)
Q Consensus 383 ~~~~-~~~VtV~aG~t~~l~~l 403 (606)
|+.. +.+|+|++|++++|++|
T Consensus 74 ~~~~~~~~ItV~~g~~~~lg~~ 95 (95)
T PF14686_consen 74 DYKVASDSITVSGGTTTDLGDL 95 (95)
T ss_dssp TEEEEEEEEEE-T-EEE-----
T ss_pred ceEEecceEEEcCCcEeccccC
Confidence 9995 88899999999988764
No 5
>PF13620 CarboxypepD_reg: Carboxypeptidase regulatory-like domain; PDB: 3MN8_D 3P0D_I 3KCP_A 2B59_B 1UWY_A 1H8L_A 1QMU_A 2NSM_A.
Probab=98.81 E-value=2.4e-08 Score=84.00 Aligned_cols=81 Identities=26% Similarity=0.446 Sum_probs=58.7
Q ss_pred EEEEEEEEeeccccccCcccCceEEEecCCCCCCCcccccccceeEEEeCCCcceEeCcccCceeEEEEEECceeceeee
Q 038979 307 SISGRLIVKDRYVSRAGIAAKGAYVGLAKPGRAGSWQTECKGYQFWTVANEGGNFSIKNVLIGNYNLYAWIPGFIGDFKY 386 (606)
Q Consensus 307 tVsG~v~~sd~~~~~~~~pa~~a~V~La~~~~~g~~q~~~~~yqywt~td~~G~F~I~nV~pGtY~L~a~~~G~~G~~~~ 386 (606)
+|+|+|.-.+ +.|..+|.|.|... ..+..+-+.||++|+|.|++++||+|+|.+..+|+. ..
T Consensus 1 tI~G~V~d~~------g~pv~~a~V~l~~~---------~~~~~~~~~Td~~G~f~~~~l~~g~Y~l~v~~~g~~---~~ 62 (82)
T PF13620_consen 1 TISGTVTDAT------GQPVPGATVTLTDQ---------DGGTVYTTTTDSDGRFSFEGLPPGTYTLRVSAPGYQ---PQ 62 (82)
T ss_dssp -EEEEEEETT------SCBHTT-EEEET-----------TTTECCEEE--TTSEEEEEEE-SEEEEEEEEBTTEE----E
T ss_pred CEEEEEEcCC------CCCcCCEEEEEEEe---------eCCCEEEEEECCCceEEEEccCCEeEEEEEEECCcc---eE
Confidence 6899996554 89999999999643 234578899999999999999999999999987765 33
Q ss_pred ee-EEEEeCCceeeecceEEe
Q 038979 387 HA-AIRITAGSAKQIGNLVYK 406 (606)
Q Consensus 387 ~~-~VtV~aG~t~~l~~l~~~ 406 (606)
.. .|+|.+|++..+ ++.++
T Consensus 63 ~~~~v~v~~~~~~~~-~i~L~ 82 (82)
T PF13620_consen 63 TQENVTVTAGQTTTV-DITLE 82 (82)
T ss_dssp EEEEEEESSSSEEE---EEEE
T ss_pred EEEEEEEeCCCEEEE-EEEEC
Confidence 33 599999998887 57663
No 6
>PF13715 DUF4480: Domain of unknown function (DUF4480)
Probab=98.46 E-value=1.8e-06 Score=73.77 Aligned_cols=88 Identities=28% Similarity=0.409 Sum_probs=68.4
Q ss_pred EEEEEEEEeeccccccCcccCceEEEecCCCCCCCcccccccceeEEEeCCCcceEeCcccCceeEEEEEECceeceeee
Q 038979 307 SISGRLIVKDRYVSRAGIAAKGAYVGLAKPGRAGSWQTECKGYQFWTVANEGGNFSIKNVLIGNYNLYAWIPGFIGDFKY 386 (606)
Q Consensus 307 tVsG~v~~sd~~~~~~~~pa~~a~V~La~~~~~g~~q~~~~~yqywt~td~~G~F~I~nV~pGtY~L~a~~~G~~G~~~~ 386 (606)
+|+|+|.-++ .+.|..+|.|.+... ...+.||++|+|+|+ +++|+|+|.++..|+. ..
T Consensus 1 ti~G~V~d~~-----t~~pl~~a~V~~~~~-------------~~~~~Td~~G~F~i~-~~~g~~~l~is~~Gy~---~~ 58 (88)
T PF13715_consen 1 TISGKVVDSD-----TGEPLPGATVYLKNT-------------KKGTVTDENGRFSIK-LPEGDYTLKISYIGYE---TK 58 (88)
T ss_pred CEEEEEEECC-----CCCCccCeEEEEeCC-------------cceEEECCCeEEEEE-EcCCCeEEEEEEeCEE---EE
Confidence 5899996544 379999999998544 367889999999999 9999999999986655 66
Q ss_pred eeEEEEeCCceeeecceEEecCCCCCceeEEec
Q 038979 387 HAAIRITAGSAKQIGNLVYKAPRNGPTLWEIGI 419 (606)
Q Consensus 387 ~~~VtV~aG~t~~l~~l~~~~p~~~~~LweIG~ 419 (606)
...|.+..++...+ .+.+.+ ...+|-||.+
T Consensus 59 ~~~i~~~~~~~~~~-~i~L~~--~~~~L~eVvV 88 (88)
T PF13715_consen 59 TITISVNSNKNTNL-NIYLEP--KSNQLDEVVV 88 (88)
T ss_pred EEEEEecCCCEEEE-EEEEee--CcccCCeEEC
Confidence 66677777665566 577774 5668888753
No 7
>cd03863 M14_CPD_II The second carboxypeptidase (CP)-like domain of Carboxypeptidase D (CPD; EC 3.4.17.22), domain II. CPD differs from all other metallocarboxypeptidases in that it contains multiple CP-like domains. CPD belongs to the N/E-like subfamily of the M14 family of metallocarboxypeptidases (MCPs). The M14 family are zinc-binding CPs which hydrolyze single, C-terminal amino acids from polypeptide chains, and have a recognition site for the free C-terminal carboxyl group, which is a key determinant of specificity. CPD is a single-chain protein containing a signal peptide, three tandem repeats of CP-like domains separated by short bridge regions, followed by a transmembrane domain, and a C-terminal cytosolic tail. The first two CP-like domains of CPD contain all of the essential active site and substrate-binding residues, while the third CP-like domain lacks critical residues necessary for enzymatic activity and is inactive towards standard CP substrates. Domain I is optimally ac
Probab=97.97 E-value=2.7e-05 Score=84.55 Aligned_cols=100 Identities=15% Similarity=0.297 Sum_probs=73.8
Q ss_pred CCchhhHHHHHHHhhhhcccCCCCCCCCCCCCCCCCCeEEEEEEEEeeccccccCcccCceEEEecCCCCCCCccccccc
Q 038979 269 GDRQMLWRDANRQFMNEVKSWPYKFPASKDFARSNKRGSISGRLIVKDRYVSRAGIAAKGAYVGLAKPGRAGSWQTECKG 348 (606)
Q Consensus 269 ~~~~~l~~Da~~~~~~E~~~WPy~f~~~~~y~~~~qRGtVsG~v~~sd~~~~~~~~pa~~a~V~La~~~~~g~~q~~~~~ 348 (606)
+.+...|.+-|.-+. .|.. +-...|+|+|+-.. .+.|..+|.|.+...
T Consensus 275 ~~l~~~w~~n~~all--------------~~~~-~~~~gI~G~V~D~~-----~g~pl~~AtV~V~g~------------ 322 (375)
T cd03863 275 EELPKYWEQNRRSLL--------------QFMK-QVHRGVRGFVLDAT-----DGRGILNATISVADI------------ 322 (375)
T ss_pred HHHHHHHHHHHHHHH--------------HHHH-HhcCeEEEEEEeCC-----CCCCCCCeEEEEecC------------
Confidence 456777877665433 1211 12479999985432 378999999998533
Q ss_pred ceeEEEeCCCcceEeCcccCceeEEEEEECceeceeeeeeEEEEeCCceeeecceEEe
Q 038979 349 YQFWTVANEGGNFSIKNVLIGNYNLYAWIPGFIGDFKYHAAIRITAGSAKQIGNLVYK 406 (606)
Q Consensus 349 yqywt~td~~G~F~I~nV~pGtY~L~a~~~G~~G~~~~~~~VtV~aG~t~~l~~l~~~ 406 (606)
...+.||.+|.|.+ .|+||+|+|+|++.|+. ..+.+|+|.+|+++.+ ++.++
T Consensus 323 -~~~~~Td~~G~f~~-~l~pG~ytl~vs~~GY~---~~~~~v~V~~~~~~~~-~~~L~ 374 (375)
T cd03863 323 -NHPVTTYKDGDYWR-LLVPGTYKVTASARGYD---PVTKTVEVDSKGAVQV-NFTLS 374 (375)
T ss_pred -cCceEECCCccEEE-ccCCeeEEEEEEEcCcc---cEEEEEEEcCCCcEEE-EEEec
Confidence 45678999999999 69999999999997766 5555799999999887 47665
No 8
>cd03865 M14_CPE_H Peptidase M14 Carboxypeptidase (CP) E (CPE, also known as carboxypeptidase H, and enkephalin convertase; EC 3.4.17.10) belongs to the N/E subfamily of the M14 family of metallocarboxypeptidases (MCPs). The M14 family are zinc-binding CPs which hydrolyze single, C-terminal amino acids from polypeptide chains, and have a recognition site for the free C-terminal carboxyl group, which is a key determinant of specificity. CPE is an important enzyme responsible for the proteolytic processing of prohormone intermediates (such as pro-insulin, pro-opiomelanocortin, or pro-gonadotropin-releasing hormone) by specifically removing C-terminal basic residues. In addition, it has been proposed that the regulated secretory pathway (RSP) of the nervous and endocrine systems utilizes membrane-bound CPE as a sorting receptor. A naturally occurring point mutation in CPE reduces the stability of the enzyme and causes its degradation, leading to an accumulation of numerous neuroendocrine pe
Probab=97.93 E-value=2.2e-05 Score=85.81 Aligned_cols=98 Identities=16% Similarity=0.325 Sum_probs=73.8
Q ss_pred CCchhhHHHHHHHhhhhcccCCCCCCCCCCCCCCCCCeEEEEEEEEeeccccccCcccCceEEEecCCCCCCCccccccc
Q 038979 269 GDRQMLWRDANRQFMNEVKSWPYKFPASKDFARSNKRGSISGRLIVKDRYVSRAGIAAKGAYVGLAKPGRAGSWQTECKG 348 (606)
Q Consensus 269 ~~~~~l~~Da~~~~~~E~~~WPy~f~~~~~y~~~~qRGtVsG~v~~sd~~~~~~~~pa~~a~V~La~~~~~g~~q~~~~~ 348 (606)
+.+...|.+-|.-+. .|.....|| |+|+|+-.. +.|..+|.|.+...
T Consensus 304 ~~L~~~W~~n~~all--------------~~~~q~~~g-I~G~V~D~~------g~pI~~AtV~V~g~------------ 350 (402)
T cd03865 304 ETLKQYWEDNKNSLV--------------NYIEQVHRG-VKGFVKDLQ------GNPIANATISVEGI------------ 350 (402)
T ss_pred HHHHHHHHHHHHHHH--------------HHHHHhccc-eEEEEECCC------CCcCCCeEEEEEcC------------
Confidence 457888988875433 232233477 999984332 78999999998533
Q ss_pred ceeEEEeCCCcceEeCcccCceeEEEEEECceeceeeeeeEEEEeCCceeeecceEE
Q 038979 349 YQFWTVANEGGNFSIKNVLIGNYNLYAWIPGFIGDFKYHAAIRITAGSAKQIGNLVY 405 (606)
Q Consensus 349 yqywt~td~~G~F~I~nV~pGtY~L~a~~~G~~G~~~~~~~VtV~aG~t~~l~~l~~ 405 (606)
...+.||.+|.|.+ .++||+|+|+|.+.|+. .....|+|.+++++.+ ++++
T Consensus 351 -~~~~~T~~~G~Y~~-~L~pG~Ytv~vsa~Gy~---~~~~~V~V~~~~~~~v-df~L 401 (402)
T cd03865 351 -DHDITSAKDGDYWR-LLAPGNYKLTASAPGYL---AVVKKVAVPYSPAVRV-DFEL 401 (402)
T ss_pred -ccccEECCCeeEEE-CCCCEEEEEEEEecCcc---cEEEEEEEcCCCcEEE-eEEe
Confidence 34568999999998 89999999999998877 5557799999998777 4655
No 9
>cd03864 M14_CPN Peptidase M14 Carboxypeptidase N (CPN, also known as kininase I, creatine kinase conversion factor, plasma carboxypeptidase B, arginine carboxypeptidase, and protaminase; EC 3.4.17.3) is an extracellular glycoprotein synthesized in the liver and released into the blood, where it is present in high concentrations. CPN belongs to the N/E subfamily of the M14 family of metallocarboxypeptidases (MCPs). The M14 family are zinc-binding carboxypeptidases (CPs) which hydrolyze single, C-terminal amino acids from polypeptide chains, and have a recognition site for the free C-terminal carboxyl group, which is a key determinant of specificity. CPN plays an important role in protecting the body from excessive buildup of potentially deleterious peptides that normally act as local autocrine or paracrine hormones. It specifically removes C-terminal basic residues. As CPN can cleave lysine more avidly than arginine residues, it is also called lysine carboxypeptidase. CPN substrates inclu
Probab=97.90 E-value=3.6e-05 Score=84.06 Aligned_cols=98 Identities=13% Similarity=0.283 Sum_probs=72.5
Q ss_pred CCchhhHHHHHHHhhhhcccCCCCCCCCCCCCCCCCCeEEEEEEEEeeccccccCcccCceEEEecCCCCCCCccccccc
Q 038979 269 GDRQMLWRDANRQFMNEVKSWPYKFPASKDFARSNKRGSISGRLIVKDRYVSRAGIAAKGAYVGLAKPGRAGSWQTECKG 348 (606)
Q Consensus 269 ~~~~~l~~Da~~~~~~E~~~WPy~f~~~~~y~~~~qRGtVsG~v~~sd~~~~~~~~pa~~a~V~La~~~~~g~~q~~~~~ 348 (606)
+.+...|.+-|.-+.+ |... --..|+|+|+..+ +.|..+|.|.+...
T Consensus 294 ~~l~~~w~~n~~all~--------------~~~~-~~~gI~G~V~D~~------g~pi~~A~V~v~g~------------ 340 (392)
T cd03864 294 EELEREWLGNREALIS--------------YIEQ-VHQGIKGMVTDEN------NNGIANAVISVSGI------------ 340 (392)
T ss_pred HHHHHHHHHHHHHHHH--------------HHHH-hcCeEEEEEECCC------CCccCCeEEEEECC------------
Confidence 4567778777654331 1111 1247999996544 79999999998533
Q ss_pred ceeEEEeCCCcceEeCcccCceeEEEEEECceeceeeeeeEEEEeCCceeeecceEE
Q 038979 349 YQFWTVANEGGNFSIKNVLIGNYNLYAWIPGFIGDFKYHAAIRITAGSAKQIGNLVY 405 (606)
Q Consensus 349 yqywt~td~~G~F~I~nV~pGtY~L~a~~~G~~G~~~~~~~VtV~aG~t~~l~~l~~ 405 (606)
...+.||++|.| +++++||+|+|.+++.|+. ..+.+|+|.+++++.+ ++++
T Consensus 341 -~~~~~T~~~G~y-~r~l~pG~Y~l~vs~~Gy~---~~t~~v~V~~~~~~~~-df~L 391 (392)
T cd03864 341 -SHDVTSGTLGDY-FRLLLPGTYTVTASAPGYQ---PSTVTVTVGPAEATLV-NFQL 391 (392)
T ss_pred -ccceEECCCCcE-EecCCCeeEEEEEEEcCce---eEEEEEEEcCCCcEEE-eeEe
Confidence 456789999999 9999999999999997776 6667799999987766 3554
No 10
>cd06245 M14_CPD_III The third carboxypeptidase (CP)-like domain of Carboxypeptidase D (CPD; EC 3.4.17.22), domain III. CPD differs from all other metallocarboxypeptidases in that it contains multiple CP-like domains. CPD belongs to the N/E-like subfamily of the M14 family of metallocarboxypeptidases (MCPs). The M14 family are zinc-binding CPs which hydrolyze single, C-terminal amino acids from polypeptide chains, and have a recognition site for the free C-terminal carboxyl group, which is a key determinant of specificity. CPD is a single-chain protein containing a signal peptide, three tandem repeats of CP-like domains separated by short bridge regions, followed by a transmembrane domain, and a C-terminal cytosolic tail. The first two CP-like domains of CPD contain all of the essential active site and substrate-binding residues, while the third CP-like domain lacks critical residues necessary for enzymatic activity and is inactive towards standard CP substrates. Domain I is optimally active a
Probab=97.72 E-value=0.00013 Score=79.02 Aligned_cols=76 Identities=21% Similarity=0.355 Sum_probs=61.7
Q ss_pred eEEEEEEEEeeccccccCcccCceEEEecCCCCCCCcccccccceeEEEeCCCcceEeCcccCceeEEEEEECceeceee
Q 038979 306 GSISGRLIVKDRYVSRAGIAAKGAYVGLAKPGRAGSWQTECKGYQFWTVANEGGNFSIKNVLIGNYNLYAWIPGFIGDFK 385 (606)
Q Consensus 306 GtVsG~v~~sd~~~~~~~~pa~~a~V~La~~~~~g~~q~~~~~yqywt~td~~G~F~I~nV~pGtY~L~a~~~G~~G~~~ 385 (606)
-.|+|+|+..+ |.|..+|.|.+... . .+.||.+|.|.+. ++||+|+|++.+.|+. .
T Consensus 287 ~gI~G~V~d~~------g~pi~~A~V~v~g~-------------~-~~~T~~~G~y~~~-L~pG~y~v~vs~~Gy~---~ 342 (363)
T cd06245 287 KGVHGVVTDKA------GKPISGATIVLNGG-------------H-RVYTKEGGYFHVL-LAPGQHNINVIAEGYQ---Q 342 (363)
T ss_pred cEEEEEEEcCC------CCCccceEEEEeCC-------------C-ceEeCCCcEEEEe-cCCceEEEEEEEeCce---e
Confidence 56999996543 89999999998532 2 5779999999997 9999999999997766 6
Q ss_pred eeeEEEEeCCceeeecceEEe
Q 038979 386 YHAAIRITAGSAKQIGNLVYK 406 (606)
Q Consensus 386 ~~~~VtV~aG~t~~l~~l~~~ 406 (606)
.+.+|+|.+++++.+ ++++.
T Consensus 343 ~~~~V~v~~~~~~~~-~f~L~ 362 (363)
T cd06245 343 EHLPVVVSHDEASSV-KIVLD 362 (363)
T ss_pred EEEEEEEcCCCeEEE-EEEec
Confidence 667799999988777 47664
No 11
>cd03868 M14_CPD_I The first carboxypeptidase (CP)-like domain of Carboxypeptidase D (CPD; EC 3.4.17.22), domain I. CPD differs from all other metallocarboxypeptidases in that it contains multiple CP-like domains. CPD belongs to the N/E-like subfamily of the M14 family of metallocarboxypeptidases (MCPs). The M14 family are zinc-binding CPs which hydrolyze single, C-terminal amino acids from polypeptide chains, and have a recognition site for the free C-terminal carboxyl group, which is a key determinant of specificity. CPD is a single-chain protein containing a signal peptide, three tandem repeats of CP-like domains separated by short bridge regions, followed by a transmembrane domain, and a C-terminal cytosolic tail. The first two CP-like domains of CPD contain all of the essential active site and substrate-binding residues, while the third CP-like domain lacks critical residues necessary for enzymatic activity and is inactive towards standard CP substrates. Domain I is optimally active at p
Probab=97.54 E-value=0.00022 Score=77.50 Aligned_cols=73 Identities=18% Similarity=0.236 Sum_probs=58.4
Q ss_pred CeEEEEEEEEeeccccccCcccCceEEEecCCCCCCCcccccccceeEEEeCCCcceEeCcccCceeEEEEEECceecee
Q 038979 305 RGSISGRLIVKDRYVSRAGIAAKGAYVGLAKPGRAGSWQTECKGYQFWTVANEGGNFSIKNVLIGNYNLYAWIPGFIGDF 384 (606)
Q Consensus 305 RGtVsG~v~~sd~~~~~~~~pa~~a~V~La~~~~~g~~q~~~~~yqywt~td~~G~F~I~nV~pGtY~L~a~~~G~~G~~ 384 (606)
.+.|+|+|+-.. +.|..+|.|.|... ...+.||++|.|.+ +++||+|+|++.+.|+.
T Consensus 295 ~~~i~G~V~d~~------g~pv~~A~V~v~~~-------------~~~~~td~~G~y~~-~l~~G~Y~l~vs~~Gf~--- 351 (372)
T cd03868 295 HIGVKGFVRDAS------GNPIEDATIMVAGI-------------DHNVTTAKFGDYWR-LLLPGTYTITAVAPGYE--- 351 (372)
T ss_pred CCceEEEEEcCC------CCcCCCcEEEEEec-------------ccceEeCCCceEEe-cCCCEEEEEEEEecCCC---
Confidence 478999985543 79999999999543 35689999999984 79999999999998876
Q ss_pred e-eeeEEEEeCCceeee
Q 038979 385 K-YHAAIRITAGSAKQI 400 (606)
Q Consensus 385 ~-~~~~VtV~aG~t~~l 400 (606)
. ....|+|.+|+++.+
T Consensus 352 ~~~~~~v~v~~g~~~~~ 368 (372)
T cd03868 352 PSTVTDVVVKEGEATSV 368 (372)
T ss_pred ceEEeeEEEcCCCeEEE
Confidence 4 333477999998776
No 12
>cd03858 M14_CP_N-E_like Carboxypeptidase (CP) N/E-like subfamily of the M14 family of metallocarboxypeptidases (MCPs). The M14 family are zinc-binding CPs which hydrolyze single, C-terminal amino acids from polypeptide chains, and have a recognition site for the free C-terminal carboxyl group, which is a key determinant of specificity. The N/E subfamily includes eight members, of which five (CPN, CPE, CPM, CPD, CPZ) are considered enzymatically active, while the other three are non-active (CPX1, CPX2, ACLP/AEBP1) and lack the critical active site and substrate-binding residues considered necessary for CP activity. These non-active members may function as binding proteins or display catalytic activity towards other substrates. Unlike the A/B CP subfamily, enzymes belonging to the N/E subfamily are not produced as inactive precursors that require proteolysis to produce the active form; rather, they rely on their substrate specificity and subcellular compartmentalization to prevent inappr
Probab=97.52 E-value=0.00033 Score=76.00 Aligned_cols=114 Identities=17% Similarity=0.309 Sum_probs=77.1
Q ss_pred cccc-eeeceEEEEEcCCCC----CCCchhhHHHHHHHhhhhcccCCCCCCCCCCCCCCCCCeEEEEEEEEeeccccccC
Q 038979 249 EAWK-KIFGPFLVYVNSVAG----KGDRQMLWRDANRQFMNEVKSWPYKFPASKDFARSNKRGSISGRLIVKDRYVSRAG 323 (606)
Q Consensus 249 e~w~-k~~GP~~~y~N~g~~----~~~~~~l~~Da~~~~~~E~~~WPy~f~~~~~y~~~~qRGtVsG~v~~sd~~~~~~~ 323 (606)
++|- ...+|+.+-|=-+.. .+.+..+|.+...-+.. +. .+...+|+|+|+-.+ +
T Consensus 251 ~Dw~y~~~~~~~~t~El~~~~~p~~~~i~~i~~en~~all~--------------l~-~~a~~~i~G~V~d~~------g 309 (374)
T cd03858 251 QDWNYLHTNCFEITLELSCCKFPPASELPKYWEENREALLA--------------YI-EQVHRGIKGFVRDAN------G 309 (374)
T ss_pred hhhhhhccCceEEEEeccCCCCCChhHhHHHHHHHHHHHHH--------------HH-hhcCCceEEEEECCC------C
Confidence 4566 666777666643222 13455666655432221 00 112348999985443 7
Q ss_pred cccCceEEEecCCCCCCCcccccccceeEEEeCCCcceEeCcccCceeEEEEEECceeceeeeeeEEEEeC-Cceeee
Q 038979 324 IAAKGAYVGLAKPGRAGSWQTECKGYQFWTVANEGGNFSIKNVLIGNYNLYAWIPGFIGDFKYHAAIRITA-GSAKQI 400 (606)
Q Consensus 324 ~pa~~a~V~La~~~~~g~~q~~~~~yqywt~td~~G~F~I~nV~pGtY~L~a~~~G~~G~~~~~~~VtV~a-G~t~~l 400 (606)
.|..+|.|.|. +....+.||.+|.|.+. ++||+|+|.+...|+. .++.+|+|.. |+++.+
T Consensus 310 ~pl~~A~V~i~-------------~~~~~~~Td~~G~f~~~-l~~G~y~l~vs~~Gy~---~~~~~v~v~~~g~~~~~ 370 (374)
T cd03858 310 NPIANATISVE-------------GINHDVTTAEDGDYWRL-LLPGTYNVTASAPGYE---PQTKSVVVPNDNSAVVV 370 (374)
T ss_pred CccCCeEEEEe-------------cceeeeEECCCceEEEe-cCCEeEEEEEEEcCcc---eEEEEEEEecCCceEEE
Confidence 89999999984 34678999999999986 7999999999997765 5566788887 887766
No 13
>cd03867 M14_CPZ Peptidase M14-like domain of carboxypeptidase (CP) Z (CPZ), CPZ belongs to the N/E subfamily of the M14 family of metallocarboxypeptidases (MCPs). The M14 family are zinc-binding CPs which hydrolyze single, C-terminal amino acids from polypeptide chains, and have a recognition site for the free C-terminal carboxyl group, which is a key determinant of specificity. CPZ is a secreted Zn-dependent enzyme whose biological function is largely unknown. Unlike other members of the N/E subfamily, CPZ has a bipartite structure, which consists of an N-terminal cysteine-rich domain (CRD) whose sequence is similar to Wnt-binding proteins, and a C-terminal CP catalytic domain that removes C-terminal Arg residues from substrates. CPZ is enriched in the extracellular matrix and is widely distributed during early embryogenesis. That the CRD of CPZ can bind to Wnt4 suggests that CPZ plays a role in Wnt signaling.
Probab=97.23 E-value=0.001 Score=73.00 Aligned_cols=71 Identities=24% Similarity=0.306 Sum_probs=56.2
Q ss_pred EEEEEEEEeeccccccCcccCceEEEecCCCCCCCcccccccceeEEEeCCCcceEeCcccCceeEEEEEECceeceeee
Q 038979 307 SISGRLIVKDRYVSRAGIAAKGAYVGLAKPGRAGSWQTECKGYQFWTVANEGGNFSIKNVLIGNYNLYAWIPGFIGDFKY 386 (606)
Q Consensus 307 tVsG~v~~sd~~~~~~~~pa~~a~V~La~~~~~g~~q~~~~~yqywt~td~~G~F~I~nV~pGtY~L~a~~~G~~G~~~~ 386 (606)
.|+|+|+..+ +.|..+|.|.|.. ....+.||++|.|. .+++||+|+|.+.+.|++ ..
T Consensus 319 ~i~G~V~D~~------g~pi~~A~V~v~g-------------~~~~~~Td~~G~y~-~~l~~G~y~l~vs~~Gy~---~~ 375 (395)
T cd03867 319 GIKGFVKDKD------GNPIKGARISVRG-------------IRHDITTAEDGDYW-RLLPPGIHIVSAQAPGYT---KV 375 (395)
T ss_pred eeEEEEEcCC------CCccCCeEEEEec-------------cccceEECCCceEE-EecCCCcEEEEEEecCee---eE
Confidence 5999996543 7999999999843 36678999999997 689999999999997776 56
Q ss_pred eeEEEEeC--Cceeee
Q 038979 387 HAAIRITA--GSAKQI 400 (606)
Q Consensus 387 ~~~VtV~a--G~t~~l 400 (606)
..+|+|.+ ++...+
T Consensus 376 ~~~v~v~~~~~~~~~~ 391 (395)
T cd03867 376 MKRVTLPARMKRAGRV 391 (395)
T ss_pred EEEEEeCCcCCCceEe
Confidence 66688865 444444
No 14
>cd03866 M14_CPM Peptidase M14 Carboxypeptidase (CP) M (CPM) belongs to the N/E subfamily of the M14 family of metallocarboxypeptidases (MCPs). The M14 family are zinc-binding CPs which hydrolyze single, C-terminal amino acids from polypeptide chains, and have a recognition site for the free C-terminal carboxyl group, which is a key determinant of specificity. CPM is an extracellular glycoprotein, bound to cell membranes via a glycosyl-phosphatidylinositol on the C-terminus of the protein. It specifically removes C-terminal basic residues such as lysine and arginine from peptides and proteins. The highest levels of CPM have been found in human lung and placenta, but significant amounts are present in kidney, blood vessels, intestine, brain, and peripheral nerves. CPM has also been found in soluble form in various body fluids, including amniotic fluid, seminal plasma and urine. Due to its wide distribution in a variety of tissues, it is believed that it plays an important role in the cont
Probab=96.43 E-value=0.01 Score=64.84 Aligned_cols=70 Identities=19% Similarity=0.218 Sum_probs=52.9
Q ss_pred CeEEEEEEEEeeccccccCcccCceEEEecCCCCCCCcccccccceeEEEeCCCcceEeCcccCceeEEEEEECceecee
Q 038979 305 RGSISGRLIVKDRYVSRAGIAAKGAYVGLAKPGRAGSWQTECKGYQFWTVANEGGNFSIKNVLIGNYNLYAWIPGFIGDF 384 (606)
Q Consensus 305 RGtVsG~v~~sd~~~~~~~~pa~~a~V~La~~~~~g~~q~~~~~yqywt~td~~G~F~I~nV~pGtY~L~a~~~G~~G~~ 384 (606)
.+.|+|+|+-. .+.|..+|.|.|... +...-+.||++|.|.+. ++||+|+|.+.++|+.
T Consensus 294 ~~gI~G~V~D~------~g~pi~~A~V~v~g~-----------~~~~~~~T~~~G~y~~~-l~pG~Y~v~vsa~Gy~--- 352 (376)
T cd03866 294 HLGVKGQVFDS------NGNPIPNAIVEVKGR-----------KHICPYRTNVNGEYFLL-LLPGKYMINVTAPGFK--- 352 (376)
T ss_pred cCceEEEEECC------CCCccCCeEEEEEcC-----------CceeEEEECCCceEEEe-cCCeeEEEEEEeCCcc---
Confidence 56799999533 278999999998533 11223469999999775 9999999999998876
Q ss_pred eeeeEEEEeCC
Q 038979 385 KYHAAIRITAG 395 (606)
Q Consensus 385 ~~~~~VtV~aG 395 (606)
....+|.|.+.
T Consensus 353 ~~~~~v~v~~~ 363 (376)
T cd03866 353 TVITNVIIPYN 363 (376)
T ss_pred eEEEEEEeCCC
Confidence 55666877754
No 15
>PRK15036 hydroxyisourate hydrolase; Provisional
Probab=95.41 E-value=0.037 Score=52.21 Aligned_cols=64 Identities=16% Similarity=0.267 Sum_probs=48.1
Q ss_pred CCCeEEEEEEEEeeccccccCcccCceEEEecCCCCCCCcccccccceeEEEeCCCcceEe---Cc-ccCceeEEEEEE
Q 038979 303 NKRGSISGRLIVKDRYVSRAGIAAKGAYVGLAKPGRAGSWQTECKGYQFWTVANEGGNFSI---KN-VLIGNYNLYAWI 377 (606)
Q Consensus 303 ~qRGtVsG~v~~sd~~~~~~~~pa~~a~V~La~~~~~g~~q~~~~~yqywt~td~~G~F~I---~n-V~pGtY~L~a~~ 377 (606)
++.+.|+++|+-.. .|+||.++.|-|..... ++|+. -.-+.||++|+|.. .+ +.||.|.|....
T Consensus 24 a~~~~Is~HVLDt~-----~G~PA~gV~V~L~~~~~-~~w~~-----l~~~~Td~dGR~~~l~~~~~~~~G~Y~L~F~t 91 (137)
T PRK15036 24 AQQNILSVHILNQQ-----TGKPAADVTVTLEKKAD-NGWLQ-----LNTAKTDKDGRIKALWPEQTATTGDYRVVFKT 91 (137)
T ss_pred ccCCCeEEEEEeCC-----CCcCCCCCEEEEEEccC-CceEE-----EEEEEECCCCCCccccCcccCCCeeEEEEEEc
Confidence 34467999997654 39999999999976522 24541 34578999999986 34 889999999984
No 16
>PF08400 phage_tail_N: Prophage tail fibre N-terminal; InterPro: IPR013609 This entry represents the N terminus of phage 933W tail fibre protein. The characteristics of the protein distribution suggest prophage matches.
Probab=95.20 E-value=0.11 Score=48.85 Aligned_cols=78 Identities=29% Similarity=0.328 Sum_probs=53.7
Q ss_pred EEEEEEEEeeccccccCcccCceEEEecCCCCCCCcccccccceeEEEeCCCcceEeCcccCceeEEEEEECceeceeee
Q 038979 307 SISGRLIVKDRYVSRAGIAAKGAYVGLAKPGRAGSWQTECKGYQFWTVANEGGNFSIKNVLIGNYNLYAWIPGFIGDFKY 386 (606)
Q Consensus 307 tVsG~v~~sd~~~~~~~~pa~~a~V~La~~~~~g~~q~~~~~yqywt~td~~G~F~I~nV~pGtY~L~a~~~G~~G~~~~ 386 (606)
.|||.| .| ..|+|..++.+.|+.-...- +-=.+.-=+..|+++|+|+|. +.||.|.++++..|.. ..+
T Consensus 4 ~ISGvL--~d----g~G~pv~g~~I~L~A~~tS~---~Vv~~t~as~~t~~~G~Ys~~-~epG~Y~V~l~~~g~~--~~~ 71 (134)
T PF08400_consen 4 KISGVL--KD----GAGKPVPGCTITLKARRTSS---TVVVGTVASVVTGEAGEYSFD-VEPGVYRVTLKVEGRP--PVY 71 (134)
T ss_pred EEEEEE--eC----CCCCcCCCCEEEEEEccCch---heEEEEEEEEEcCCCceEEEE-ecCCeEEEEEEECCCC--cee
Confidence 477776 33 45999999999998542210 001244556789999999995 9999999999987754 223
Q ss_pred eeEEEEeCCc
Q 038979 387 HAAIRITAGS 396 (606)
Q Consensus 387 ~~~VtV~aG~ 396 (606)
-..|+|.+.+
T Consensus 72 vG~I~V~~dS 81 (134)
T PF08400_consen 72 VGDITVYEDS 81 (134)
T ss_pred EEEEEEecCC
Confidence 3457776543
No 17
>PF08308 PEGA: PEGA domain; InterPro: IPR013229 This domain is found in both archaea and bacteria and has similarity to S-layer (surface layer) proteins. It is named after the characteristic PEGA sequence motif found in this domain. The secondary structure of this domain is predicted to be beta-strands.
Probab=94.84 E-value=0.21 Score=40.97 Aligned_cols=44 Identities=16% Similarity=0.417 Sum_probs=36.5
Q ss_pred ceEeCcccCceeEEEEEECceeceeeeeeEEEEeCCceeeecceEEec
Q 038979 360 NFSIKNVLIGNYNLYAWIPGFIGDFKYHAAIRITAGSAKQIGNLVYKA 407 (606)
Q Consensus 360 ~F~I~nV~pGtY~L~a~~~G~~G~~~~~~~VtV~aG~t~~l~~l~~~~ 407 (606)
..++..+++|.|+|.+..+|+. ..+..|.|.+|++..+ .+.+++
T Consensus 26 p~~~~~l~~G~~~v~v~~~Gy~---~~~~~v~v~~~~~~~v-~~~L~~ 69 (71)
T PF08308_consen 26 PLTLKDLPPGEHTVTVEKPGYE---PYTKTVTVKPGETTTV-NVTLEP 69 (71)
T ss_pred cceeeecCCccEEEEEEECCCe---eEEEEEEECCCCEEEE-EEEEEE
Confidence 4577789999999999987765 6677899999999888 477764
No 18
>PF03422 CBM_6: Carbohydrate binding module (family 6); InterPro: IPR005084 A carbohydrate-binding module (CBM) is defined as a contiguous amino acid sequence within a carbohydrate-active enzyme with a discrete fold having carbohydrate-binding activity. A few exceptions are CBMs in cellulosomal scaffolding proteins and rare instances of independent putative CBMs. The requirement of CBMs existing as modules within larger enzymes sets this class of carbohydrate-binding protein apart from other non-catalytic sugar binding proteins such as lectins and sugar transport proteins. CBMs were previously classified as cellulose-binding domains (CBDs) based on the initial discovery of several modules that bound cellulose. However, additional modules in carbohydrate-active enzymes are continually being found that bind carbohydrates other than cellulose yet otherwise meet the CBM criteria, hence the need to reclassify these polypeptides using more inclusive terminology. Previous classifications of cellulose-binding domains were based on amino acid similarity. Groupings of CBDs were called "Types" and numbered with roman numerals (e.g. Type I or Type II CBDs). In keeping with the glycoside hydrolase classification, these groupings are now called families and numbered with Arabic numerals. Families 1 to 13 are the same as Types I to XIII. This entry represents CBM6 from CAZY which was previously known as cellulose-binding domain family VI (CBD VI). CBM6 binds to amorphous cellulose, xylan, mixed beta-(1,3)(1,4)glucan and beta-1,3-glucan. CBM6 adopts a classic lectin-like beta-jelly roll fold, predominantly consisting of five antiparallel beta-strands on one face and four antiparallel beta-strands on the other face. It contains two potential ligand binding sites, named respectively cleft A and B. These clefts include aromatic residues which are probably involved in the substrate binding.
Cleft B is located on the concave surface of one beta-sheet, and cleft A on one edge of the protein between the loop that connects the inner and outer beta-sheets of the jellyroll fold. The multiple binding clefts confer the extensive range of specificities displayed by the domain.; GO: 0030246 carbohydrate binding; PDB: 1UY1_A 1UY3_A 1UY4_A 1UY2_A 1UYY_A 1UXZ_B 1UYZ_A 1UY0_B 1UYX_A 1UZ0_A ....
Probab=94.23 E-value=0.31 Score=43.94 Aligned_cols=80 Identities=25% Similarity=0.397 Sum_probs=48.5
Q ss_pred cccEEEEEEEeeccC-CeEEEEEcCccCCCCCccccccCCCCeeeeeEEEeecEEEEEEeecCceeeeecEEEEEeecCC
Q 038979 508 KATYKLRVAVAAAHG-AELQVRVNSRSARRPLFSSGSVGRENAIARHGIHGVYKLFNVDVPGKVLRKGNNTIYLSQPRKL 586 (606)
Q Consensus 508 ~~~~tLriala~a~~-~~~~V~vNg~~~~~p~~~t~~~~~~~~i~R~~~~G~~~~~~~~ipa~~L~~G~NtI~l~~~~g~ 586 (606)
.+.|+|++..|.... ++++|+||+.+.. ...+..++.. +---.|...+..| .|.+|.|+|+|....+.
T Consensus 44 ~g~y~~~~~~a~~~~~~~~~l~id~~~g~--~~~~~~~~~t------g~w~~~~~~~~~v---~l~~G~h~i~l~~~~~~ 112 (125)
T PF03422_consen 44 AGTYTLTIRYANGGGGGTIELRIDGPDGT--LIGTVSLPPT------GGWDTWQTVSVSV---KLPAGKHTIYLVFNGGD 112 (125)
T ss_dssp SEEEEEEEEEEESSSSEEEEEEETTTTSE--EEEEEEEE-E------SSTTEEEEEEEEE---EEESEEEEEEEEESSSS
T ss_pred CceEEEEEEEECCCCCcEEEEEECCCCCc--EEEEEEEcCC------CCccccEEEEEEE---eeCCCeeEEEEEEECCC
Confidence 478899988888644 7999999993221 1111111110 0001233444444 45679999999987654
Q ss_pred CCCceEEEEEEEEe
Q 038979 587 DAFTGIMYDYLRFE 600 (606)
Q Consensus 587 ~~~~~vmyD~IrLe 600 (606)
+ ..+-.|+|+|+
T Consensus 113 ~--~~~niD~~~f~ 124 (125)
T PF03422_consen 113 G--WAFNIDYFQFT 124 (125)
T ss_dssp S--B-EEEEEEEEE
T ss_pred C--ceEEeEEEEEE
Confidence 3 36889999986
No 19
>cd03869 M14_CPX_like Peptidase M14-like domain of carboxypeptidase (CP)-like protein X (CPX), CPX forms a distinct subgroup of the N/E subfamily of the M14 family of metallocarboxypeptidases (MCPs). The M14 family are zinc-binding CPs which hydrolyze single, C-terminal amino acids from polypeptide chains, and have a recognition site for the free C-terminal carboxyl group, which is a key determinant of specificity. Proteins belonging to this subgroup include CP-like protein X1 (CPX1), CP-like protein X2 (CPX2), and aortic CP-like protein (ACLP) and its isoform adipocyte enhancer binding protein-1 (AEBP1). AEBP1 is a truncated form of ACLP, which may arise from alternative splicing of the gene. These proteins are inactive towards standard CP substrates because they lack one or more critical active site and substrate-binding residues that are necessary for activity. They may function as binding proteins rather than as active CPs or display catalytic activity toward other substrates. Pro
Probab=92.88 E-value=0.23 Score=54.77 Aligned_cols=67 Identities=18% Similarity=0.231 Sum_probs=48.9
Q ss_pred CeEEEEEEEEeeccccccCcccCceEEEecCCCCCCCcccccccceeEEEeCCCcceEeCcccCceeEEEEEECceecee
Q 038979 305 RGSISGRLIVKDRYVSRAGIAAKGAYVGLAKPGRAGSWQTECKGYQFWTVANEGGNFSIKNVLIGNYNLYAWIPGFIGDF 384 (606)
Q Consensus 305 RGtVsG~v~~sd~~~~~~~~pa~~a~V~La~~~~~g~~q~~~~~yqywt~td~~G~F~I~nV~pGtY~L~a~~~G~~G~~ 384 (606)
|| |+|.|+-.. +.|..+|.|.+.. ......|.++|.|-- =+.||+|+++|.++|+.
T Consensus 329 ~G-ikG~V~d~~------g~~i~~a~i~v~g-------------~~~~v~t~~~GdywR-ll~pG~y~v~~~a~gy~--- 384 (405)
T cd03869 329 RG-IKGVVRDKT------GKGIPNAIISVEG-------------INHDIRTASDGDYWR-LLNPGEYRVTAHAEGYT--- 384 (405)
T ss_pred cC-ceEEEECCC------CCcCCCcEEEEec-------------CccceeeCCCCceEE-ecCCceEEEEEEecCCC---
Confidence 54 899884332 7889999998843 344556778887654 38999999999997765
Q ss_pred eeeeEEEEeCC
Q 038979 385 KYHAAIRITAG 395 (606)
Q Consensus 385 ~~~~~VtV~aG 395 (606)
....+|+|..+
T Consensus 385 ~~~~~~~v~~~ 395 (405)
T cd03869 385 SSTKNCEVGYE 395 (405)
T ss_pred cccEEEEEcCC
Confidence 55566777754
No 20
>cd00421 intradiol_dioxygenase Intradiol dioxygenases catalyze the critical ring-cleavage step in the conversion of catecholate derivatives to citric acid cycle intermediates. This family contains catechol 1,2-dioxygenases and protocatechuate 3,4-dioxygenases which are mononuclear non-heme iron enzymes that catalyze the oxygenation of catecholates to aliphatic acids via the cleavage of aromatic rings. The members are intradiol-cleaving enzymes which break the catechol C1-C2 bond and utilize Fe3+, as opposed to the extradiol-cleaving enzymes which break the C2-C3 or C1-C6 bond and utilize Fe2+ and Mn+. Catechol 1,2-dioxygenases are mostly homodimers with one catalytic ferric ion per monomer. Protocatechuate 3,4-dioxygenases form more diverse oligomers.
Probab=92.64 E-value=0.27 Score=46.66 Aligned_cols=64 Identities=14% Similarity=0.303 Sum_probs=46.6
Q ss_pred CCeEEEEEEEEeeccccccCcccCceEEEecCCCCCCCcccccc-------cceeEEEeCCCcceEeCcccCceeEE
Q 038979 304 KRGSISGRLIVKDRYVSRAGIAAKGAYVGLAKPGRAGSWQTECK-------GYQFWTVANEGGNFSIKNVLIGNYNL 373 (606)
Q Consensus 304 qRGtVsG~v~~sd~~~~~~~~pa~~a~V~La~~~~~g~~q~~~~-------~yqywt~td~~G~F~I~nV~pGtY~L 373 (606)
..=.+.|+|+-.+ +.|..+|.|-|-.....|....+.. ...-...||++|.|.|.-|+||.|.+
T Consensus 10 ~~l~l~G~V~D~~------g~pv~~A~VeiW~~d~~G~Y~~~~~~~~~~~~~~rg~~~Td~~G~y~f~ti~Pg~Y~~ 80 (146)
T cd00421 10 EPLTLTGTVLDGD------GCPVPDALVEIWQADADGRYSGQDDSGLDPEFFLRGRQITDADGRYRFRTIKPGPYPI 80 (146)
T ss_pred CEEEEEEEEECCC------CCCCCCcEEEEEecCCCCccCCcCccccCCCCCCEEEEEECCCcCEEEEEEcCCCCCC
Confidence 3458999996555 8899999999866655553332211 22335789999999999999999994
No 21
>PF05738 Cna_B: Cna protein B-type domain; InterPro: IPR008454 This entry represents a repeated B region domain found in the collagen-binding surface protein Cna in Staphylococcus aureus, as well as other related domains. The B region domain of Cna has a prealbumin-like beta-sandwich fold of seven strands in two sheets with a Greek key topology. However, this domain does not mediate collagen binding; the region IPR008456 from INTERPRO carries out that function. Instead, it appears to form a stalk that presents the ligand binding domain away from the bacterial cell surface. Cna is a collagen-binding MSCRAMM (Microbial Surface Component Recognizing Adhesive Matrix Molecules), and is necessary and sufficient for S. aureus cells to adhere to cartilage.; PDB: 2X5P_A 3RKP_A 3KPT_A 1VLF_T 1TI2_F 1TI6_D 1TI4_J 1VLE_V 1VLD_X 3PF2_A ....
Probab=92.62 E-value=0.66 Score=37.67 Aligned_cols=45 Identities=24% Similarity=0.356 Sum_probs=31.5
Q ss_pred EEeCCCcceEeCcccCceeEEEEEE--CceeceeeeeeEEEEeCCcee
Q 038979 353 TVANEGGNFSIKNVLIGNYNLYAWI--PGFIGDFKYHAAIRITAGSAK 398 (606)
Q Consensus 353 t~td~~G~F~I~nV~pGtY~L~a~~--~G~~G~~~~~~~VtV~aG~t~ 398 (606)
..+|++|.|.|.+++||+|.|.--. .|+.-. .....++|..++..
T Consensus 21 ~~Td~~G~~~f~~L~~G~Y~l~E~~aP~GY~~~-~~~~~~~i~~~~~~ 67 (70)
T PF05738_consen 21 VTTDENGKYTFKNLPPGTYTLKETKAPDGYQLD-DTPYEFTITEDGDV 67 (70)
T ss_dssp EEGGTTSEEEEEEEESEEEEEEEEETTTTEEEE-ECEEEEEECTTSCE
T ss_pred EEECCCCEEEEeecCCeEEEEEEEECCCCCEEC-CCceEEEEecCCEE
Confidence 5699999999999999999999875 333310 12233666666543
No 22
>PF09430 DUF2012: Protein of unknown function (DUF2012); InterPro: IPR019008 This domain is found in different proteins, including uncharacterised protein family UPF0480 and nodal modulators. A nodal modulator has been identified as part of a protein complex that participates in the nodal signaling pathway during vertebrate development.
Probab=92.60 E-value=0.58 Score=43.10 Aligned_cols=40 Identities=23% Similarity=0.425 Sum_probs=31.1
Q ss_pred eEEEeCCCcceEeCcccCceeEEEEEECceeceeeeeeEEEEe
Q 038979 351 FWTVANEGGNFSIKNVLIGNYNLYAWIPGFIGDFKYHAAIRIT 393 (606)
Q Consensus 351 ywt~td~~G~F~I~nV~pGtY~L~a~~~G~~G~~~~~~~VtV~ 393 (606)
+-+...++|+|.|.||++|+|.|.+-...+. | ..-.|.|.
T Consensus 22 ~~~~v~~dG~F~f~~Vp~GsY~L~V~s~~~~--F-~~~RVdV~ 61 (123)
T PF09430_consen 22 ISAFVRSDGSFVFHNVPPGSYLLEVHSPDYV--F-PPYRVDVS 61 (123)
T ss_pred eEEEecCCCEEEeCCCCCceEEEEEECCCcc--c-cCEEEEEe
Confidence 3778999999999999999999999865433 2 22347777
No 23
>PF07210 DUF1416: Protein of unknown function (DUF1416); InterPro: IPR010814 This family consists of several hypothetical bacterial proteins of around 100 residues in length. Members of this family appear to be Actinomycete specific. The function of this family is unknown.
Probab=92.32 E-value=0.79 Score=39.66 Aligned_cols=62 Identities=29% Similarity=0.430 Sum_probs=47.6
Q ss_pred CCeEEEEEEEEeeccccccCcccCceEEEecCCCCCCCcccccccceeEEEeCCCcceEeCcccCceeEEEEEECceec
Q 038979 304 KRGSISGRLIVKDRYVSRAGIAAKGAYVGLAKPGRAGSWQTECKGYQFWTVANEGGNFSIKNVLIGNYNLYAWIPGFIG 382 (606)
Q Consensus 304 qRGtVsG~v~~sd~~~~~~~~pa~~a~V~La~~~~~g~~q~~~~~yqywt~td~~G~F~I~nV~pGtY~L~a~~~G~~G 382 (606)
....|+|+|+ .+ +.|..+++|-|-++.. ..-=-..++++|.|.+- ..||+.+|.+-+++-.|
T Consensus 6 ke~VItG~V~-~~------G~Pv~gAyVRLLD~sg---------EFtaEvvts~~G~FRFf-aapG~WtvRal~~~g~~ 67 (85)
T PF07210_consen 6 KETVITGRVT-RD------GEPVGGAYVRLLDSSG---------EFTAEVVTSATGDFRFF-AAPGSWTVRALSRGGNG 67 (85)
T ss_pred ceEEEEEEEe-cC------CcCCCCeEEEEEcCCC---------CeEEEEEecCCccEEEE-eCCCceEEEEEccCCCC
Confidence 3578999997 56 8999999999975522 11222457899999995 89999999999876554
No 24
>KOG1948 consensus Metalloproteinase-related collagenase pM5 [Posttranslational modification, protein turnover, chaperones]
Probab=91.27 E-value=0.52 Score=55.61 Aligned_cols=53 Identities=23% Similarity=0.308 Sum_probs=43.7
Q ss_pred eEEEEEEEEeeccccccCcccCceEEEecCCCCCCCcccccccceeEEEeCCCcceEeCc-ccCceeEEEEEE
Q 038979 306 GSISGRLIVKDRYVSRAGIAAKGAYVGLAKPGRAGSWQTECKGYQFWTVANEGGNFSIKN-VLIGNYNLYAWI 377 (606)
Q Consensus 306 GtVsG~v~~sd~~~~~~~~pa~~a~V~La~~~~~g~~q~~~~~yqywt~td~~G~F~I~n-V~pGtY~L~a~~ 377 (606)
-+|+|||++.- .+.|.+++.|.+. -+-..+||++|+|++.| +..|+|++.|-.
T Consensus 316 fSvtGRVl~g~-----~g~~l~gvvvlvn--------------gk~~~kTdaqGyykLen~~t~gtytI~a~k 369 (1165)
T KOG1948|consen 316 FSVTGRVLVGS-----KGLPLSGVVVLVN--------------GKSGGKTDAQGYYKLENLKTDGTYTITAKK 369 (1165)
T ss_pred EEeeeeEEeCC-----CCCCccceEEEEc--------------CcccceEcccceEEeeeeeccCcEEEEEec
Confidence 48999997753 3789999988873 24566899999999999 999999999963
No 25
>cd03463 3,4-PCD_alpha Protocatechuate 3,4-dioxygenase (3,4-PCD), alpha subunit. 3,4-PCD catalyzes the oxidative ring cleavage of 3,4-dihydroxybenzoate to produce beta-carboxy-cis,cis-muconate. 3,4-PCDs are large aggregates of 12 protomers, each composed of an alpha- and beta-subunit and an Fe3+ ion bound in the beta-subunit at the alpha-subunit-beta-subunit interface. 3,4-PCD is a member of the aromatic dioxygenases which are non-heme iron intradiol-cleaving enzymes that break the C1-C2 bond and utilize Fe3+.
Probab=89.64 E-value=0.68 Score=45.90 Aligned_cols=63 Identities=22% Similarity=0.401 Sum_probs=47.3
Q ss_pred CCeEEEEEEEEeeccccccCcccCceEEEecCCCCCCCccccc-------ccceeE--EEeCCCcceEeCcccCceeE
Q 038979 304 KRGSISGRLIVKDRYVSRAGIAAKGAYVGLAKPGRAGSWQTEC-------KGYQFW--TVANEGGNFSIKNVLIGNYN 372 (606)
Q Consensus 304 qRGtVsG~v~~sd~~~~~~~~pa~~a~V~La~~~~~g~~q~~~-------~~yqyw--t~td~~G~F~I~nV~pGtY~ 372 (606)
..=.|+|+|+-.+ ++|..+|.|=+-.....|....+. .+++.| ..||++|+|++.-|+||-|.
T Consensus 35 ~~l~l~G~V~D~~------g~Pi~gA~VeiWqad~~G~Y~~~~~~~~~~~~~f~~rGr~~TD~~G~y~F~Ti~Pg~Y~ 106 (185)
T cd03463 35 ERITLEGRVYDGD------GAPVPDAMLEIWQADAAGRYAHPADSRRRLDPGFRGFGRVATDADGRFSFTTVKPGAVP 106 (185)
T ss_pred CEEEEEEEEECCC------CCCCCCCEEEEEcCCCCCccCCcCCcccccCCCCCcEEEEEECCCCCEEEEEEcCCCcC
Confidence 4568999986444 899999999997766666333211 344455 56999999999999999986
No 26
>PF07495 Y_Y_Y: Y_Y_Y domain; InterPro: IPR011123 This region is mostly found at the end of the beta propellers (IPR011110 from INTERPRO) in a family of two component regulators. However, it is also found tandemly repeated in Q891H4 from SWISSPROT without other signal conduction domains being present. It is named after the conserved tyrosines found in the alignment. The exact function is not known.; PDB: 3V9F_D 3VA6_B 3OTT_B 4A2M_D 4A2L_B.
Probab=89.42 E-value=0.52 Score=37.79 Aligned_cols=29 Identities=21% Similarity=0.276 Sum_probs=22.4
Q ss_pred eEEEeCCCc-ceEeCcccCceeEEEEEECc
Q 038979 351 FWTVANEGG-NFSIKNVLIGNYNLYAWIPG 379 (606)
Q Consensus 351 ywt~td~~G-~F~I~nV~pGtY~L~a~~~G 379 (606)
=|....... .+++.+.+||+|+|.|.+..
T Consensus 20 ~W~~~~~~~~~~~~~~L~~G~Y~l~V~a~~ 49 (66)
T PF07495_consen 20 EWITLGSYSNSISYTNLPPGKYTLEVRAKD 49 (66)
T ss_dssp SEEEESSTS-EEEEES--SEEEEEEEEEEE
T ss_pred eEEECCCCcEEEEEEeCCCEEEEEEEEEEC
Confidence 366777777 99999999999999999743
No 27
>cd03459 3,4-PCD Protocatechuate 3,4-dioxygenase (3,4-PCD) catalyzes the oxidative ring cleavage of 3,4-dihydroxybenzoate to produce beta-carboxy-cis,cis-muconate. 3,4-PCDs are large aggregates of 12 protomers, each composed of an alpha- and beta-subunit and an Fe3+ ion bound in the beta-subunit at the alpha-beta-subunit interface. 3,4-PCD is a member of the aromatic dioxygenases which are non-heme iron intradiol-cleaving enzymes that break the C1-C2 bond and utilize Fe3+.
Probab=89.15 E-value=0.78 Score=44.33 Aligned_cols=64 Identities=19% Similarity=0.392 Sum_probs=47.5
Q ss_pred CCeEEEEEEEEeeccccccCcccCceEEEecCCCCCCCccccc--------ccceeE--EEeCCCcceEeCcccCceeEE
Q 038979 304 KRGSISGRLIVKDRYVSRAGIAAKGAYVGLAKPGRAGSWQTEC--------KGYQFW--TVANEGGNFSIKNVLIGNYNL 373 (606)
Q Consensus 304 qRGtVsG~v~~sd~~~~~~~~pa~~a~V~La~~~~~g~~q~~~--------~~yqyw--t~td~~G~F~I~nV~pGtY~L 373 (606)
++=.|+|+|+-.+ +.|..+|.|=+-.....|....+. .++..| ..||++|.|.+.-|+||-|.+
T Consensus 14 ~~l~l~g~V~D~~------g~Pv~~A~veiWqad~~G~Y~~~~~~~~~~~~~~f~~rG~~~Td~~G~~~f~Ti~Pg~Y~~ 87 (158)
T cd03459 14 ERIILEGRVLDGD------GRPVPDALVEIWQADAAGRYRHPRDSHRAPLDPNFTGFGRVLTDADGRYRFRTIKPGAYPW 87 (158)
T ss_pred cEEEEEEEEECCC------CCCCCCCEEEEEccCCCCccCCccCCcccccCCCCCceeEEEECCCCcEEEEEECCCCcCC
Confidence 3457889986444 899999999997766665433322 345544 568999999999999999983
No 28
>cd03462 1,2-CCD chlorocatechol 1,2-dioxygenases (1,2-CCDs) (type II enzymes) are homodimeric intradiol dioxygenases that degrade chlorocatechols via the addition of molecular oxygen and the subsequent cleavage between two adjacent hydroxyl groups. This reaction is part of the modified ortho-cleavage pathway which is a central oxidative bacterial pathway that channels chlorocatechols, derived from the degradation of chlorinated benzoic acids, phenoxyacetic acids, phenols, benzenes, and other aromatics into the energy-generating tricarboxylic acid pathway.
Probab=87.48 E-value=1.6 Score=45.23 Aligned_cols=64 Identities=14% Similarity=0.203 Sum_probs=45.5
Q ss_pred CCCeEEEEEEEEeeccccccCcccCceEEEecCCCCCCCccccc---cccee--EEEeCCCcceEeCcccCceeE
Q 038979 303 NKRGSISGRLIVKDRYVSRAGIAAKGAYVGLAKPGRAGSWQTEC---KGYQF--WTVANEGGNFSIKNVLIGNYN 372 (606)
Q Consensus 303 ~qRGtVsG~v~~sd~~~~~~~~pa~~a~V~La~~~~~g~~q~~~---~~yqy--wt~td~~G~F~I~nV~pGtY~ 372 (606)
.++=.|+|+|+-.+ |.|..+|.|=+-.....|....+. ....+ ...||++|.|.+.-|+||.|-
T Consensus 97 G~~l~l~G~V~D~~------G~Pv~~A~VeiWqad~~G~Y~~~~~~~~~~~~RG~~~Td~~G~y~F~Ti~P~~Yp 165 (247)
T cd03462 97 HKPLLFRGTVKDLA------GAPVAGAVIDVWHSTPDGKYSGFHPNIPEDYYRGKIRTDEDGRYEVRTTVPVPYQ 165 (247)
T ss_pred CCEEEEEEEEEcCC------CCCcCCcEEEEECCCCCCCcCCCCCCCCCCCCEEEEEeCCCCCEEEEEECCCCcC
Confidence 34568999996544 899999999986665555332211 11111 467899999999999999994
No 29
>PF00775 Dioxygenase_C: Dioxygenase; InterPro: IPR000627 This entry represents the C-terminal domain common to several intradiol ring-cleavage dioxygenases. Dioxygenases catalyse the incorporation of both atoms of molecular oxygen into substrates using a variety of reaction mechanisms. Cleavage of aromatic rings is one of the most important functions of dioxygenases, which play key roles in the degradation of aromatic compounds. The substrates of ring-cleavage dioxygenases can be classified into two groups according to the mode of scission of the aromatic ring. Intradiol enzymes use a non-haem Fe(III) to cleave the aromatic ring between two hydroxyl groups (ortho-cleavage), whereas extradiol enzymes (IPR000486 from INTERPRO) use a non-haem Fe(II) to cleave the aromatic ring between a hydroxylated carbon and an adjacent non-hydroxylated carbon (meta-cleavage). These two subfamilies differ in sequence, structural fold, iron ligands, and the orientation of second sphere active site amino acid residues. Enzymes that belong to the intradiol family include catechol 1,2-dioxygenase (1,2-CTD) (1.13.11.1 from EC); protocatechuate 3,4-dioxygenase (3,4-PCD) (1.13.11.3 from EC); and chlorocatechol 1,2-dioxygenase (1.13.11.1 from EC).; GO: 0003824 catalytic activity, 0008199 ferric iron binding, 0006725 cellular aromatic compound metabolic process, 0055114 oxidation-reduction process; PDB: 2BUV_A 2BUX_A 2BUU_A 2BUR_A 1EO9_A 2BUZ_A 2BV0_A 1EO2_A 1EOC_A 1EOA_A ....
Probab=87.42 E-value=1.5 Score=43.46 Aligned_cols=64 Identities=20% Similarity=0.332 Sum_probs=40.0
Q ss_pred CCeEEEEEEEEeeccccccCcccCceEEEecCCCCCCCccccc-------ccceeEEEeCCCcceEeCcccCceeEE
Q 038979 304 KRGSISGRLIVKDRYVSRAGIAAKGAYVGLAKPGRAGSWQTEC-------KGYQFWTVANEGGNFSIKNVLIGNYNL 373 (606)
Q Consensus 304 qRGtVsG~v~~sd~~~~~~~~pa~~a~V~La~~~~~g~~q~~~-------~~yqywt~td~~G~F~I~nV~pGtY~L 373 (606)
+.=.|.|+|+-.+ +.|..+|.|=+-.....|....+. ....=+..||++|.|++.-|+||.|.+
T Consensus 28 ~~l~l~G~V~D~~------g~Pv~~A~veiWqada~G~Ys~~~~~~~~~~~~~rG~~~Td~~G~y~f~Ti~Pg~Y~~ 98 (183)
T PF00775_consen 28 EPLVLHGRVIDTD------GKPVPGALVEIWQADADGRYSGQDPGSDQPDFNLRGRFRTDADGRYSFRTIKPGPYPI 98 (183)
T ss_dssp -EEEEEEEEEETT------SSB-TTEEEEEEE--TTS--TTTBTTSSSSTTTTEEEEEECTTSEEEEEEE----EEE
T ss_pred CEEEEEEEEECCC------CCCCCCcEEEEEecCCCCccccccccccccCCCcceEEecCCCCEEEEEeeCCCCCCC
Confidence 3458999996554 899999999997665555333221 123344778999999999999999974
No 30
>TIGR02465 chlorocat_1_2 chlorocatechol 1,2-dioxygenase. Members of this protein family are chlorocatechol 1,2-dioxygenase. This protein is closely related to catechol 1,2-dioxygenase, TIGR02439, EC 1.13.11.1. Note that annotated database entries have appeared for the present protein family with the EC number that refers to that of family TIGR02439. This protein acts in pathways of the biodegradation of chlorinated aromatic compounds.
Probab=87.19 E-value=1.2 Score=46.01 Aligned_cols=64 Identities=16% Similarity=0.237 Sum_probs=46.9
Q ss_pred CCeEEEEEEEEeeccccccCcccCceEEEecCCCCCCCccccc---cccee--EEEeCCCcceEeCcccCceeEE
Q 038979 304 KRGSISGRLIVKDRYVSRAGIAAKGAYVGLAKPGRAGSWQTEC---KGYQF--WTVANEGGNFSIKNVLIGNYNL 373 (606)
Q Consensus 304 qRGtVsG~v~~sd~~~~~~~~pa~~a~V~La~~~~~g~~q~~~---~~yqy--wt~td~~G~F~I~nV~pGtY~L 373 (606)
+.=.|+|+|+-.+ |.|..+|.|=+-....+|....+. ....+ +..||++|.|.+.-|+||-|-+
T Consensus 97 ~~l~v~G~V~D~~------G~Pv~~A~VeiWqad~~G~Y~~~~~~~~~~~lRG~~~Td~~G~y~F~Ti~P~~Ypi 165 (246)
T TIGR02465 97 KPLLIRGTVRDLS------GTPVAGAVIDVWHSTPDGKYSGFHDNIPDDYYRGKLVTAADGSYEVRTTMPVPYQI 165 (246)
T ss_pred cEEEEEEEEEcCC------CCCcCCcEEEEECCCCCCCCCCCCCCCCCCCCeEEEEECCCCCEEEEEECCCCCCC
Confidence 4578999996544 899999999997666665333211 12233 5788999999999999999853
No 31
>PF03170 BcsB: Bacterial cellulose synthase subunit; InterPro: IPR018513 An operon encoding 4 proteins required for bacterial cellulose biosynthesis (bcs) in Acetobacter xylinus (Gluconacetobacter xylinus) has been isolated via genetic complementation with strains lacking cellulose synthase activity. Nucleotide sequence analysis showed the cellulose synthase operon to consist of 4 genes, designated bcsA, bcsB, bcsC and bcsD, all of which are required for maximal bacterial cellulose synthesis in A. xylinum. The calculated molecular mass of the protein encoded by bcsB is 85.3 kDa. BcsB encodes the catalytic subunit of cellulose synthase. The protein polymerises uridine 5'-diphosphate glucose to cellulose: UDP-glucose + (1,4-beta-D-glucosyl)(N) = UDP + (1,4-beta-D-glucosyl)(N+1). The enzyme is specifically activated by the nucleotide cyclic diguanylic acid. Sequence analysis suggests that BcsB contains several transmembrane (TM) domains, and shares a high degree of similarity with Escherichia coli YhjN.; GO: 0006011 UDP-glucose metabolic process, 0016020 membrane
Probab=86.69 E-value=1.8 Score=50.19 Aligned_cols=77 Identities=17% Similarity=0.213 Sum_probs=54.3
Q ss_pred eeEEEEEEecCccccccEEEEEEEeec-----cCCeEEEEEcCccCCCCCccccccCCCCeeeeeEEEeecEEEEEEeec
Q 038979 494 STWQIQFKLEGVVKKATYKLRVAVAAA-----HGAELQVRVNSRSARRPLFSSGSVGRENAIARHGIHGVYKLFNVDVPG 568 (606)
Q Consensus 494 ~~w~I~F~L~~~~~~~~~tLriala~a-----~~~~~~V~vNg~~~~~p~~~t~~~~~~~~i~R~~~~G~~~~~~~~ipa 568 (606)
..-.|.|.+...+....++|+|.+.-+ ..+.++|.|||..+.. + .+..++. .....+|+||.
T Consensus 29 ~~~~~~f~v~~~~~v~~a~L~L~~~~S~~l~~~~S~L~V~lNg~~v~s--~---~l~~~~~--------~~~~~~i~Ip~ 95 (605)
T PF03170_consen 29 ASRTIYFPVPADWVVTKATLNLSYTYSPSLLPERSQLTVSLNGQPVGS--I---PLDAESA--------QPQTVTIPIPP 95 (605)
T ss_pred CceEEEEEcCCCccccceEEEEEEEECcccCCCcceEEEEECCEEeEE--E---ecCcCCC--------CceEEEEecCh
Confidence 345777888777654567777777765 2368999999986542 1 1222221 24778999999
Q ss_pred CceeeeecEEEEEeec
Q 038979 569 KVLRKGNNTIYLSQPR 584 (606)
Q Consensus 569 ~~L~~G~NtI~l~~~~ 584 (606)
. |..|.|.|.|....
T Consensus 96 ~-l~~g~N~l~~~~~~ 110 (605)
T PF03170_consen 96 A-LIKGFNRLTFEFIG 110 (605)
T ss_pred h-hcCCceEEEEEEEe
Confidence 9 99999999998754
No 32
>TIGR02423 protocat_alph protocatechuate 3,4-dioxygenase, alpha subunit. This model represents the alpha chain of protocatechuate 3,4-dioxygenase. The most closely related family outside this family is that of the beta chain (TIGR02422), typically encoded in an adjacent locus. This enzyme acts in the degradation of aromatic compounds by way of p-hydroxybenzoate to succinate and acetyl-CoA.
Probab=86.19 E-value=1.4 Score=43.93 Aligned_cols=64 Identities=25% Similarity=0.473 Sum_probs=46.8
Q ss_pred CCeEEEEEEEEeeccccccCcccCceEEEecCCCCCCCccccc--------ccceeE--EEeCCCcceEeCcccCceeEE
Q 038979 304 KRGSISGRLIVKDRYVSRAGIAAKGAYVGLAKPGRAGSWQTEC--------KGYQFW--TVANEGGNFSIKNVLIGNYNL 373 (606)
Q Consensus 304 qRGtVsG~v~~sd~~~~~~~~pa~~a~V~La~~~~~g~~q~~~--------~~yqyw--t~td~~G~F~I~nV~pGtY~L 373 (606)
++=.++|+|+-.+ +.|..+|.|=+-+...+|....+. .+++-| ..||++|+|.+.-|+||.|..
T Consensus 38 ~~l~l~G~V~D~~------g~Pv~~A~VeiWqad~~G~Y~~~~~~~~~~~~~~f~grGr~~Td~~G~y~f~TI~Pg~Yp~ 111 (193)
T TIGR02423 38 ERIRLEGRVLDGD------GHPVPDALIEIWQADAAGRYNSPADLRAPATDPGFRGWGRTGTDESGEFTFETVKPGAVPD 111 (193)
T ss_pred CEEEEEEEEECCC------CCCCCCCEEEEEccCCCCccCCccCCcccccCCCCCCeEEEEECCCCCEEEEEEcCCCcCC
Confidence 4568999996443 899999999997766665333221 244444 468999999999999998853
No 33
>TIGR02422 protocat_beta protocatechuate 3,4-dioxygenase, beta subunit. This model represents the beta chain of protocatechuate 3,4-dioxygenase. The most closely related family outside this family is that of the alpha chain (TIGR02423), typically encoded in an adjacent locus. This enzyme acts in the degradation of aromatic compounds by way of p-hydroxybenzoate to succinate and acetyl-CoA.
Probab=85.78 E-value=2.7 Score=42.89 Aligned_cols=67 Identities=18% Similarity=0.333 Sum_probs=47.3
Q ss_pred CCCCCeEEEEEEEEeeccccccCcccCceEEEecCCCCCCCcccc--------cccceeE--EEeCCCcceEeCcccCce
Q 038979 301 RSNKRGSISGRLIVKDRYVSRAGIAAKGAYVGLAKPGRAGSWQTE--------CKGYQFW--TVANEGGNFSIKNVLIGN 370 (606)
Q Consensus 301 ~~~qRGtVsG~v~~sd~~~~~~~~pa~~a~V~La~~~~~g~~q~~--------~~~yqyw--t~td~~G~F~I~nV~pGt 370 (606)
+..++=.|+|+|+-.+ +.|..+|.|=+-+...+|....+ ..++..+ ..||++|.|.|.-|+||-
T Consensus 56 ~~G~~i~l~G~V~D~~------g~PV~~A~VEIWQada~G~Y~~~~d~~~~~~~~~f~grGr~~TD~~G~y~F~TI~PG~ 129 (220)
T TIGR02422 56 PIGERIIVHGRVLDED------GRPVPNTLVEVWQANAAGRYRHKNDQYLAPLDPNFGGVGRTLTDSDGYYRFRTIKPGP 129 (220)
T ss_pred CCCCEEEEEEEEECCC------CCCCCCCEEEEEecCCCCcccCccCccccccCCCCCCEEEEEECCCccEEEEEECCCC
Confidence 3346778999996544 89999999999766555533321 1223323 458999999999999999
Q ss_pred eEE
Q 038979 371 YNL 373 (606)
Q Consensus 371 Y~L 373 (606)
|..
T Consensus 130 Y~~ 132 (220)
T TIGR02422 130 YPW 132 (220)
T ss_pred ccC
Confidence 843
No 34
>cd03464 3,4-PCD_beta Protocatechuate 3,4-dioxygenase (3,4-PCD), beta subunit. 3,4-PCD catalyzes the oxidative ring cleavage of 3,4-dihydroxybenzoate to produce beta-carboxy-cis,cis-muconate. 3,4-PCDs are large aggregates of 12 protomers, each composed of an alpha- and beta-subunit and an Fe3+ ion bound in the beta-subunit at the alpha-subunit-beta-subunit interface. 3,4-PCD is a member of the aromatic dioxygenases which are non-heme iron intradiol-cleaving enzymes that break the C1-C2 bond and utilize Fe3+.
Probab=85.69 E-value=2.7 Score=42.89 Aligned_cols=65 Identities=18% Similarity=0.373 Sum_probs=47.1
Q ss_pred CCCeEEEEEEEEeeccccccCcccCceEEEecCCCCCCCcccc--------cccceeE--EEeCCCcceEeCcccCceeE
Q 038979 303 NKRGSISGRLIVKDRYVSRAGIAAKGAYVGLAKPGRAGSWQTE--------CKGYQFW--TVANEGGNFSIKNVLIGNYN 372 (606)
Q Consensus 303 ~qRGtVsG~v~~sd~~~~~~~~pa~~a~V~La~~~~~g~~q~~--------~~~yqyw--t~td~~G~F~I~nV~pGtY~ 372 (606)
.++=.|+|+|+-.+ +.|..+|.|=+-+....|....+ ..+++.+ ..||++|.|.|.-|+||.|.
T Consensus 63 G~~i~l~G~V~D~~------G~PV~~A~VEIWQad~~G~Y~~~~d~~~~~~~~~f~grGr~~TD~~G~y~F~TI~Pg~Yp 136 (220)
T cd03464 63 GERIIVHGRVLDED------GRPVPNTLVEIWQANAAGRYRHKRDQHDAPLDPNFGGAGRTLTDDDGYYRFRTIKPGAYP 136 (220)
T ss_pred CCEEEEEEEEECCC------CCCCCCCEEEEEecCCCCcccCccCCcccccCCCCCCEEEEEECCCccEEEEEECCCCcc
Confidence 45678899996544 89999999999766555533321 1234433 56899999999999999994
Q ss_pred E
Q 038979 373 L 373 (606)
Q Consensus 373 L 373 (606)
.
T Consensus 137 ~ 137 (220)
T cd03464 137 W 137 (220)
T ss_pred C
Confidence 3
No 35
>KOG1948 consensus Metalloproteinase-related collagenase pM5 [Posttranslational modification, protein turnover, chaperones]
Probab=85.33 E-value=2.2 Score=50.70 Aligned_cols=57 Identities=25% Similarity=0.417 Sum_probs=40.0
Q ss_pred eEEEEEEEEeeccccccCcccCceEEEecCCCCCCCcccccccceeEEEeCCCcceEeCcccCceeEEEEEEC
Q 038979 306 GSISGRLIVKDRYVSRAGIAAKGAYVGLAKPGRAGSWQTECKGYQFWTVANEGGNFSIKNVLIGNYNLYAWIP 378 (606)
Q Consensus 306 GtVsG~v~~sd~~~~~~~~pa~~a~V~La~~~~~g~~q~~~~~yqywt~td~~G~F~I~nV~pGtY~L~a~~~ 378 (606)
-+|+|+|.... +-.+.++.|-|-.. .+----|.|+++|.|.+.+|.||+|.+.|..+
T Consensus 119 Fsv~GkVlgaa------ggGpagV~velrs~----------e~~iast~T~~~Gky~f~~iiPG~Yev~ashp 175 (1165)
T KOG1948|consen 119 FSVRGKVLGAA------GGGPAGVLVELRSQ----------EDPIASTKTEDGGKYEFRNIIPGKYEVSASHP 175 (1165)
T ss_pred eeEeeEEeecc------CCCcccceeecccc----------cCcceeeEecCCCeEEEEecCCCceEEeccCc
Confidence 36777775433 33445566665432 22344588999999999999999999999853
No 36
>COG3485 PcaH Protocatechuate 3,4-dioxygenase beta subunit [Secondary metabolites biosynthesis, transport, and catabolism]
Probab=84.85 E-value=1.7 Score=44.50 Aligned_cols=64 Identities=19% Similarity=0.327 Sum_probs=47.6
Q ss_pred CCeEEEEEEEEeeccccccCcccCceEEEecCCCCCCCcccccccce-------eE--EEeCCCcceEeCcccCceeEE
Q 038979 304 KRGSISGRLIVKDRYVSRAGIAAKGAYVGLAKPGRAGSWQTECKGYQ-------FW--TVANEGGNFSIKNVLIGNYNL 373 (606)
Q Consensus 304 qRGtVsG~v~~sd~~~~~~~~pa~~a~V~La~~~~~g~~q~~~~~yq-------yw--t~td~~G~F~I~nV~pGtY~L 373 (606)
+|=.|+|+|+-.+ |.|..+|.|=+-+-...|-.....+.+. =| +.||++|.|.+.-|+||.|--
T Consensus 71 e~i~l~G~VlD~~------G~Pv~~A~VEiWQAda~GrY~~~~d~~~~~~~~f~g~Gr~~Td~~G~y~F~Ti~Pg~yp~ 143 (226)
T COG3485 71 ERILLEGRVLDGN------GRPVPDALVEIWQADADGRYSHPKDSRLAPLPNFNGRGRTITDEDGEYRFRTIKPGPYPW 143 (226)
T ss_pred ceEEEEEEEECCC------CCCCCCCEEEEEEcCCCCcccCccccccCcCccccceEEEEeCCCceEEEEEeecccccC
Confidence 7889999996655 9999999999976655553332222222 23 668999999999999999843
No 37
>TIGR02962 hdxy_isourate hydroxyisourate hydrolase. Members of this family, hydroxyisourate hydrolase, represent a distinct clade of transthyretin-related proteins. Bacterial members typically are encoded next to ureidoglycolate hydrolase and often near either xanthine dehydrogenase or xanthine/uracil permease genes and have been demonstrated to have hydroxyisourate hydrolase activity. In eukaryotes, a clade separate from the transthyretins (a family of thyroid-hormone binding proteins) has also been shown to have HIU hydrolase activity in urate catabolizing organisms. Transthyretin, then, would appear to be the recently diverged paralog of the more ancient HIUH family.
Probab=84.43 E-value=2.1 Score=39.17 Aligned_cols=55 Identities=25% Similarity=0.303 Sum_probs=39.7
Q ss_pred cCcccCceEEEecCCCCCCCcccccccceeEEEeCCCcceE-----eCcccCceeEEEEEECceeceeee
Q 038979 322 AGIAAKGAYVGLAKPGRAGSWQTECKGYQFWTVANEGGNFS-----IKNVLIGNYNLYAWIPGFIGDFKY 386 (606)
Q Consensus 322 ~~~pa~~a~V~La~~~~~g~~q~~~~~yqywt~td~~G~F~-----I~nV~pGtY~L~a~~~G~~G~~~~ 386 (606)
.|+||.++.|-|..... +.|+. -.-++||+||+.. ...+.||.|+|..- .|+|..
T Consensus 12 ~G~PAagv~V~L~~~~~-~~~~~-----i~~~~Tn~DGR~~~~l~~~~~~~~G~Y~l~F~----~g~Yf~ 71 (112)
T TIGR02962 12 SGKPAAGVPVTLYRLDG-SGWTP-----LAEGVTNADGRCPDLLPEGETLAAGIYKLRFD----TGDYFA 71 (112)
T ss_pred CCccCCCCEEEEEEecC-CCeEE-----EEEEEECCCCCCcCcccCcccCCCeeEEEEEE----hhhhhh
Confidence 39999999999964322 12441 2346799999987 45678999999997 666554
No 38
>PRK10340 ebgA cryptic beta-D-galactosidase subunit alpha; Reviewed
Probab=83.35 E-value=37 Score=42.16 Aligned_cols=43 Identities=16% Similarity=0.165 Sum_probs=38.4
Q ss_pred ceEEEEeCCEEEEeCCeEEEEEeCCceeEEEEEEcCEEecccc
Q 038979 4 GVQLHEQNNHVVMNNGILQVSISTPQGFVIGIQYKGNKNLLNV 46 (606)
Q Consensus 4 ~v~~~~~g~~vv~~Ng~l~vtv~k~~g~itsl~y~G~e~l~~~ 46 (606)
.+++.+++..+++.++.++++|+|.+|.|+|++++|++.+.+.
T Consensus 716 ~~~~~~~~~~~~i~~~~~~~~fdk~tG~l~s~~~~g~~ll~~~ 758 (1021)
T PRK10340 716 PLTLEEDRLSCTVRGYNFAITFSKVSGKLTSWQVNGESLLTRE 758 (1021)
T ss_pred CeeEEecCCEEEEEeCCEEEEEECCcceEEEEEeCCeeeecCC
Confidence 4678888999999999999999999999999999999887543
No 39
>PF02837 Glyco_hydro_2_N: Glycosyl hydrolases family 2, sugar binding domain; InterPro: IPR006104 O-Glycosyl hydrolases 3.2.1. from EC are a widespread group of enzymes that hydrolyse the glycosidic bond between two or more carbohydrates, or between a carbohydrate and a non-carbohydrate moiety. A classification system for glycosyl hydrolases, based on sequence similarity, has led to the definition of 85 different families [, ]. This classification is available on the CAZy (CArbohydrate-Active EnZymes) web site. Glycoside hydrolase family 2 GH2 from CAZY comprises enzymes with several known activities; beta-galactosidase (3.2.1.23 from EC); beta-mannosidase (3.2.1.25 from EC); beta-glucuronidase (3.2.1.31 from EC). These enzymes contain a conserved glutamic acid residue which has been shown [], in Escherichia coli lacZ (P00722 from SWISSPROT), to be the general acid/base catalyst in the active site of the enzyme. This domain has a jelly-roll fold [].; GO: 0004553 hydrolase activity, hydrolyzing O-glycosyl compounds, 0005975 carbohydrate metabolic process; PDB: 3DEC_A 3OB8_A 3OBA_A 3CMG_A 3FN9_C 2VZU_A 2X09_A 2VZO_A 2X05_A 2VZV_B ....
Probab=83.02 E-value=2.5 Score=39.95 Aligned_cols=68 Identities=24% Similarity=0.285 Sum_probs=46.3
Q ss_pred EEEEEEecCccccccEEEEEEEeeccCCeEEEEEcCccCCCCCccccccCCCCeeeeeEEEeecEEEEEEeecCceeeee
Q 038979 496 WQIQFKLEGVVKKATYKLRVAVAAAHGAELQVRVNSRSARRPLFSSGSVGRENAIARHGIHGVYKLFNVDVPGKVLRKGN 575 (606)
Q Consensus 496 w~I~F~L~~~~~~~~~tLriala~a~~~~~~V~vNg~~~~~p~~~t~~~~~~~~i~R~~~~G~~~~~~~~ipa~~L~~G~ 575 (606)
.+=+|+|++......+.|++.-. ...-.|.|||..+. ...+.+..++++|+. .|+.|.
T Consensus 72 Yr~~f~lp~~~~~~~~~L~f~gv---~~~a~v~vNG~~vg------------------~~~~~~~~~~~dIt~-~l~~g~ 129 (167)
T PF02837_consen 72 YRRTFTLPADWKGKRVFLRFEGV---DYAAEVYVNGKLVG------------------SHEGGYTPFEFDITD-YLKPGE 129 (167)
T ss_dssp EEEEEEESGGGTTSEEEEEESEE---ESEEEEEETTEEEE------------------EEESTTS-EEEECGG-GSSSEE
T ss_pred EEEEEEeCchhcCceEEEEeccc---eEeeEEEeCCeEEe------------------eeCCCcCCeEEeChh-hccCCC
Confidence 45568887765433355555433 36678999997442 123557789999975 788998
Q ss_pred -cEEEEEeecC
Q 038979 576 -NTIYLSQPRK 585 (606)
Q Consensus 576 -NtI~l~~~~g 585 (606)
|+|.+.+.+.
T Consensus 130 ~N~l~V~v~~~ 140 (167)
T PF02837_consen 130 ENTLAVRVDNW 140 (167)
T ss_dssp EEEEEEEEESS
T ss_pred CEEEEEEEeec
Confidence 9999999764
No 40
>PF02929 Bgal_small_N: Beta galactosidase small chain; InterPro: IPR004199 O-Glycosyl hydrolases 3.2.1. from EC are a widespread group of enzymes that hydrolyse the glycosidic bond between two or more carbohydrates, or between a carbohydrate and a non-carbohydrate moiety. A classification system for glycosyl hydrolases, based on sequence similarity, has led to the definition of 85 different families [, ]. This classification is available on the CAZy (CArbohydrate-Active EnZymes) web site. Beta-galactosidase enzymes (3.2.1.23 from EC) belong to several glycoside hydrolase families: GH1 from CAZY, GH2 from CAZY, GH35 from CAZY and GH42 from CAZY. Beta-galactosidase is the product of the lac operon Z gene of Escherichia coli. This enzyme catalyses the hydrolysis of the disaccharide lactose to galactose and glucose, and can also convert lactose to allolactose, the inducer of the lac operon. This domain is found in single chain beta-galactosidases, which are comprised of five domains. The active site is located in a deep pocket built around the central alpha-beta barrel, with the other domains conferring specificity for a disaccharide substrate. This entry represents domain 5 of glycoside hydrolase family 2, which contains an N-terminal loop that swings towards the active site upon the deep binding of a ligand to produce a closed conformation []. This domain is also found in the amino-terminal portion of the small chain of dimeric beta-galactosidases.; GO: 0004565 beta-galactosidase activity, 0005975 carbohydrate metabolic process, 0009341 beta-galactosidase complex; PDB: 1JZ3_D 1JYY_H 1GHO_P 3VD9_B 3I3E_B 3T0B_A 3T09_C 1F4A_D 3VDC_C 3VDB_D ....
Probab=82.74 E-value=14 Score=38.77 Aligned_cols=30 Identities=13% Similarity=0.246 Sum_probs=26.3
Q ss_pred EEeCCeEEEEEeCCceeEEEEEEcCEEecc
Q 038979 15 VMNNGILQVSISTPQGFVIGIQYKGNKNLL 44 (606)
Q Consensus 15 v~~Ng~l~vtv~k~~g~itsl~y~G~e~l~ 44 (606)
+|.++.++++|+|.+|.++|++|+|++.+.
T Consensus 1 tV~g~~f~~~Fdk~~G~l~s~~~~g~~ll~ 30 (276)
T PF02929_consen 1 TVSGKDFSYVFDKKTGTLTSYKYNGKELLK 30 (276)
T ss_dssp -EEETTEEEEEETTTTCEEEEEETTEEEEC
T ss_pred CCccCCEEEEEECCCCeEEEEEECCEEeec
Confidence 367778999999999999999999998774
No 41
>PF10670 DUF4198: Domain of unknown function (DUF4198)
Probab=82.30 E-value=4 Score=40.03 Aligned_cols=62 Identities=18% Similarity=0.195 Sum_probs=47.1
Q ss_pred CeEEEEEEEEeeccccccCcccCceEEEecCCCCCCCcccccccceeEEEeCCCcceEeCcccCceeEEEEEE
Q 038979 305 RGSISGRLIVKDRYVSRAGIAAKGAYVGLAKPGRAGSWQTECKGYQFWTVANEGGNFSIKNVLIGNYNLYAWI 377 (606)
Q Consensus 305 RGtVsG~v~~sd~~~~~~~~pa~~a~V~La~~~~~g~~q~~~~~yqywt~td~~G~F~I~nV~pGtY~L~a~~ 377 (606)
-..++.+|+. + |+|..++.|.+...+. |. +........+||++|.++|+=-+||.|-|.+..
T Consensus 150 g~~~~~~vl~-~------GkPl~~a~V~~~~~~~---~~-~~~~~~~~~~TD~~G~~~~~~~~~G~wli~a~~ 211 (215)
T PF10670_consen 150 GDPLPFQVLF-D------GKPLAGAEVEAFSPGG---WY-DVEHEAKTLKTDANGRATFTLPRPGLWLIRASH 211 (215)
T ss_pred CCEEEEEEEE-C------CeEcccEEEEEEECCC---cc-ccccceEEEEECCCCEEEEecCCCEEEEEEEEE
Confidence 4578899864 4 8999999999865522 11 122227889999999999998899999999863
No 42
>TIGR02438 catachol_actin catechol 1,2-dioxygenase, Actinobacterial. Members of this family are catechol 1,2-dioxygenases of the Actinobacteria. They are more closely related to actinobacterial chlorocatechol 1,2-dioxygenases than to proteobacterial catechol 1,2-dioxygenases, and so are built in this separate model. The member from Rhodococcus rhodochrous NCIMB 13259 (GB|AAC33003.1) is described as a homodimer with bound Fe, similarly active on catechol, 3-methylcatechol and 4-methylcatechol.
Probab=81.19 E-value=2.3 Score=44.81 Aligned_cols=65 Identities=22% Similarity=0.321 Sum_probs=45.2
Q ss_pred CCCeEEEEEEEEeeccccccCcccCceEEEecCCCCCCCcccc---cccceeE--EEeCCCcceEeCcccCceeEE
Q 038979 303 NKRGSISGRLIVKDRYVSRAGIAAKGAYVGLAKPGRAGSWQTE---CKGYQFW--TVANEGGNFSIKNVLIGNYNL 373 (606)
Q Consensus 303 ~qRGtVsG~v~~sd~~~~~~~~pa~~a~V~La~~~~~g~~q~~---~~~yqyw--t~td~~G~F~I~nV~pGtY~L 373 (606)
.++=.|+|+|+-.+ |+|..+|.|-+-.....|....+ ....++. ..||++|.|.+.-|+||-|-+
T Consensus 130 G~pl~v~G~V~D~~------G~Pv~gA~VdiWqada~G~Ys~~~~~~~~~~lRGr~~TDadG~y~F~TI~Pg~Ypi 199 (281)
T TIGR02438 130 GTPLVFSGQVTDLD------GNGLAGAKVELWHADDDGFYSQFAPGIPEWNLRGTIIADDEGRFEITTMQPAPYQI 199 (281)
T ss_pred CCEEEEEEEEEcCC------CCCcCCCEEEEEecCCCCCcCCCCCCCCCCCCeEEEEeCCCCCEEEEEECCCCcCC
Confidence 34568999996444 89999999999555445432211 1122223 568999999999999999964
No 43
>cd05822 TLP_HIUase HIUase (5-hydroxyisourate hydrolase) catalyzes the second step in a three-step ureide pathway in which 5-hydroxyisourate (HIU), a product of the uricase (urate oxidase) reaction, is hydrolyzed to 2-oxo-4-hydroxy-4-carboxy-5-ureidoimidazoline (OHCU). HIUase has high sequence similarity with transthyretins and is a member of the transthyretin-like protein (TLP) family. HIUase is distinguished from transthyretins by a conserved signature motif at its C-terminus that forms part of the active site. In HIUase, this motif is YRGS, while transthyretins have a conserved TAVV sequence in the same location. Most HIUases are cytosolic but in plants and slime molds, they are peroxisomal based on the presence of N-terminal periplasmic localization sequences. HIUase forms a homotetramer with each subunit consisting of eight beta-strands arranged in two sheets and a short alpha-helix. The central channel of the tetramer contains two independent binding sites, each located betw
Probab=80.62 E-value=3.5 Score=37.67 Aligned_cols=55 Identities=24% Similarity=0.291 Sum_probs=39.5
Q ss_pred cCcccCceEEEecCCCCCCCcccccccceeEEEeCCCcceEe-----CcccCceeEEEEEECceeceeee
Q 038979 322 AGIAAKGAYVGLAKPGRAGSWQTECKGYQFWTVANEGGNFSI-----KNVLIGNYNLYAWIPGFIGDFKY 386 (606)
Q Consensus 322 ~~~pa~~a~V~La~~~~~g~~q~~~~~yqywt~td~~G~F~I-----~nV~pGtY~L~a~~~G~~G~~~~ 386 (606)
.|+||.++-|-|...... .|+ .-.-++||+||+..- ..+.+|+|.|..- .|+|..
T Consensus 12 ~G~PAagv~V~L~~~~~~-~~~-----~i~~~~Td~DGR~~~~~~~~~~~~~G~Y~l~F~----~~~Yf~ 71 (112)
T cd05822 12 TGKPAAGVAVTLYRLDGN-GWT-----LLATGVTNADGRCDDLLPPGAQLAAGTYKLTFD----TGAYFA 71 (112)
T ss_pred CCcccCCCEEEEEEecCC-CeE-----EEEEEEECCCCCccCcccccccCCCeeEEEEEE----hhhhhh
Confidence 399999999999754332 344 122377999999753 4588999999997 565543
No 44
>cd03460 1,2-CTD Catechol 1,2 dioxygenase (1,2-CTD) catalyzes an intradiol cleavage reaction of catechol to form cis,cis-muconate. 1,2-CTDs are homodimers with one catalytic non-heme ferric ion per monomer. They belong to the aromatic dioxygenase family, a family of mononuclear non-heme iron intradiol-cleaving enzymes that catalyze the oxygenation of catecholates to aliphatic acids via the cleavage of aromatic rings.
Probab=80.60 E-value=2.9 Score=44.10 Aligned_cols=65 Identities=14% Similarity=0.333 Sum_probs=47.6
Q ss_pred CCCeEEEEEEEEeeccccccCcccCceEEEecCCCCCCCccc---cccccee--EEEeCCCcceEeCcccCceeEE
Q 038979 303 NKRGSISGRLIVKDRYVSRAGIAAKGAYVGLAKPGRAGSWQT---ECKGYQF--WTVANEGGNFSIKNVLIGNYNL 373 (606)
Q Consensus 303 ~qRGtVsG~v~~sd~~~~~~~~pa~~a~V~La~~~~~g~~q~---~~~~yqy--wt~td~~G~F~I~nV~pGtY~L 373 (606)
.++=.|+|+|+-.+ |.|..+|.|=+-.....|.... ...+++. ...||++|.|.+.-|+||-|-+
T Consensus 122 Gepl~l~G~V~D~~------G~PI~~A~VeiWqad~~G~Ys~~~~~~~~f~~RGr~~TD~~G~y~F~TI~P~~Ypi 191 (282)
T cd03460 122 GETLVMHGTVTDTD------GKPVPGAKVEVWHANSKGFYSHFDPTQSPFNLRRSIITDADGRYRFRSIMPSGYGV 191 (282)
T ss_pred CCEEEEEEEEECCC------CCCcCCcEEEEECCCCCCCcCCCCCCCCCCCCceEEEeCCCCCEEEEEECCCCCcC
Confidence 45678999996544 8999999999977666654332 1123333 3678999999999999999853
No 45
>cd03458 Catechol_intradiol_dioxygenases Catechol intradiol dioxygenases can be divided into several subgroups according to their substrate specificity for catechol, chlorocatechols and hydroxyquinols. Almost all members of this family are homodimers containing one ferric ion (Fe3+) per monomer. They belong to the intradiol dioxygenase family, a family of mononuclear non-heme iron intradiol-cleaving enzymes that catalyze the oxygenation of catecholates to aliphatic acids via the cleavage of aromatic rings.
Probab=80.58 E-value=9.6 Score=39.76 Aligned_cols=65 Identities=15% Similarity=0.268 Sum_probs=46.9
Q ss_pred CCCeEEEEEEEEeeccccccCcccCceEEEecCCCCCCCcccc---cccceeE--EEeCCCcceEeCcccCceeEE
Q 038979 303 NKRGSISGRLIVKDRYVSRAGIAAKGAYVGLAKPGRAGSWQTE---CKGYQFW--TVANEGGNFSIKNVLIGNYNL 373 (606)
Q Consensus 303 ~qRGtVsG~v~~sd~~~~~~~~pa~~a~V~La~~~~~g~~q~~---~~~yqyw--t~td~~G~F~I~nV~pGtY~L 373 (606)
.++=.++|+|+-.+ +.|..+|.|=+-.....|....+ ......+ ..||++|.|.+.-|+||-|-+
T Consensus 102 G~~l~l~G~V~D~~------G~Pv~~A~VeiWqad~~G~Y~~~~~~~~~~~lRG~~~Td~~G~y~f~Ti~P~~Ypi 171 (256)
T cd03458 102 GEPLFVHGTVTDTD------GKPLAGATVDVWHADPDGFYSQQDPDQPEFNLRGKFRTDEDGRYRFRTIRPVPYPI 171 (256)
T ss_pred CcEEEEEEEEEcCC------CCCCCCcEEEEEccCCCCCcCCCCCCCCCCCCEEEEEeCCCCCEEEEEECCCCccC
Confidence 34567999996544 89999999999766555533321 1233333 668999999999999999964
No 46
>TIGR02439 catechol_proteo catechol 1,2-dioxygenase, proteobacterial. Members of this family known so far are catechol 1,2-dioxygenases of the Proteobacteria. They are distinct from catechol 1,2-dioxygenases and chlorocatechol 1,2-dioxygenases of the Actinobacteria, which are quite similar to each other and resolved by separate models. This enzyme catalyzes intradiol cleavage in which catechol + O2 becomes cis,cis-muconate. Catechol is an intermediate in the catabolism of many different aromatic compounds, as is the alternative intermediate protocatechuate. In Acinetobacter lwoffii, two isozymes are present with abilities, differing somewhat, to act on catechol analogs 3-methylcatechol, 4-methylcatechol, 4-methoxycatechol, and 4-chlorocatechol.
Probab=79.56 E-value=3.3 Score=43.78 Aligned_cols=65 Identities=15% Similarity=0.337 Sum_probs=47.1
Q ss_pred CCCeEEEEEEEEeeccccccCcccCceEEEecCCCCCCCccc---ccccceeE--EEeCCCcceEeCcccCceeEE
Q 038979 303 NKRGSISGRLIVKDRYVSRAGIAAKGAYVGLAKPGRAGSWQT---ECKGYQFW--TVANEGGNFSIKNVLIGNYNL 373 (606)
Q Consensus 303 ~qRGtVsG~v~~sd~~~~~~~~pa~~a~V~La~~~~~g~~q~---~~~~yqyw--t~td~~G~F~I~nV~pGtY~L 373 (606)
.++=.|+|+|+-.+ |+|..+|.|=+-.....|.... ...+++.+ ..||++|.|.+.-|+||-|-+
T Consensus 126 G~pl~v~G~V~D~~------G~PI~gA~VeIWqad~~G~Ys~~~~~~~~~~lRG~~~TD~~G~y~F~TI~P~~Ypi 195 (285)
T TIGR02439 126 GETLFLHGQVTDAD------GKPIAGAKVELWHANTKGNYSHFDKSQSEFNLRRTIITDAEGRYRARSIVPSGYGC 195 (285)
T ss_pred CcEEEEEEEEECCC------CCCcCCcEEEEEccCCCCCcCCCCCCCCCCCceEEEEECCCCCEEEEEECCCCCcC
Confidence 34568999996544 8999999999977666653332 11233333 568999999999999999863
No 47
>smart00606 CBD_IV Cellulose Binding Domain Type IV.
Probab=79.54 E-value=18 Score=32.74 Aligned_cols=88 Identities=23% Similarity=0.420 Sum_probs=48.5
Q ss_pred eEEEEEE-ecCccccccEEEEEEEeec-cCCeEEEEEcCccCCCCCccccccCCCCeeeeeEEEeecEEEEEEeecCcee
Q 038979 495 TWQIQFK-LEGVVKKATYKLRVAVAAA-HGAELQVRVNSRSARRPLFSSGSVGRENAIARHGIHGVYKLFNVDVPGKVLR 572 (606)
Q Consensus 495 ~w~I~F~-L~~~~~~~~~tLriala~a-~~~~~~V~vNg~~~~~p~~~t~~~~~~~~i~R~~~~G~~~~~~~~ipa~~L~ 572 (606)
.| |.|+ ++-. ..+.+++.|..+.. ..+.++|++++.+.. ...+..++.... . -.+...+.+|+ |.
T Consensus 40 ~w-~~y~~vd~~-~~g~~~i~~~~as~~~~~~i~v~~d~~~G~--~~~~~~~p~tg~-----~-~~~~~~~~~v~---~~ 106 (129)
T smart00606 40 DW-IAYKDVDFG-SSGAYTFTARVASGNAGGSIELRLDSPTGT--LVGTVDVPSTGG-----W-QTYQTVSATVT---LP 106 (129)
T ss_pred CE-EEEEeEecC-CCCceEEEEEEeCCCCCceEEEEECCCCCc--EEEEEEeCCCCC-----C-ccCEEEEEEEc---cC
Confidence 35 5555 4432 23677787777765 446899999975432 111212222211 0 12333444443 44
Q ss_pred eeecEEEEEeecCCCCCceEEEEEEEE
Q 038979 573 KGNNTIYLSQPRKLDAFTGIMYDYLRF 599 (606)
Q Consensus 573 ~G~NtI~l~~~~g~~~~~~vmyD~IrL 599 (606)
+|.++|+|....++ .+..|.+++
T Consensus 107 ~G~~~l~~~~~~~~----~~~ld~~~F 129 (129)
T smart00606 107 AGVHDVYLVFKGGN----YFNIDWFRF 129 (129)
T ss_pred CceEEEEEEEECCC----cEEEEEEEC
Confidence 89999999876543 277777764
No 48
>PRK09525 lacZ beta-D-galactosidase; Reviewed
Probab=79.51 E-value=61 Score=40.34 Aligned_cols=43 Identities=16% Similarity=0.235 Sum_probs=38.2
Q ss_pred ceEEEEeCCEEEEeCCeEEEEEeCCceeEEEEEEcCEEecccc
Q 038979 4 GVQLHEQNNHVVMNNGILQVSISTPQGFVIGIQYKGNKNLLNV 46 (606)
Q Consensus 4 ~v~~~~~g~~vv~~Ng~l~vtv~k~~g~itsl~y~G~e~l~~~ 46 (606)
.+++.+.+..+++.++.++++|+|.+|.++|++++|+|.+.+.
T Consensus 741 ~~~~~~~~~~~~i~~~~~~~~f~~~~G~l~s~~~~g~~~l~~~ 783 (1027)
T PRK09525 741 APQLTQDEQDFCIELGNQRWQFNRQSGLLSQWWVGGKEQLLTP 783 (1027)
T ss_pred CceEEEcCCeEEEEECCEEEEEECCCceEEEEEECCEEeeccC
Confidence 3467888899999999999999999999999999999887643
No 49
>PF00576 Transthyretin: HIUase/Transthyretin family; InterPro: IPR023416 This family includes transthyretin that is a thyroid hormone-binding protein that transports thyroxine from the bloodstream to the brain. However, most of the sequences listed in this family do not bind thyroid hormones. They are actually enzymes of the purine catabolism that catalyse the conversion of 5-hydroxyisourate (HIU) to OHCU [, ]. HIU hydrolysis is the original function of the family and is conserved from bacteria to mammals; transthyretins arose by gene duplications in the vertebrate lineage [, ]. HIUases are distinguished in the alignment from the conserved C-terminal YRGS sequence. Transthyretin (formerly prealbumin) is one of 3 thyroid hormone-binding proteins found in the blood of vertebrates []. It is produced in the liver and circulates in the bloodstream, where it binds retinol and thyroxine (T4) []. It differs from the other 2 hormone-binding proteins (T4-binding globulin and albumin) in 3 distinct ways: (1) the gene is expressed at a high rate in the brain choroid plexus; (2) it is enriched in cerebrospinal fluid; and (3) no genetically caused absence has been observed, suggesting an essential role in brain function, distinct from that played in the bloodstream []. The protein consists of around 130 amino acids, which assemble as a homotetramer that contains an internal channel in which T4 is bound. Within this complex, T4 appears to be transported across the blood-brain barrier, where, in the choroid plexus, the hormone stimulates further synthesis of transthyretin. The protein then diffuses back into the bloodstream, where it binds T4 for transport back to the brain [].; PDB: 1TFP_B 1KGJ_D 1IE4_C 1GKE_C 1KGI_D 2H0J_B 2H0E_B 2H0F_B 1ZD6_A 3DGD_D ....
Probab=79.09 E-value=2.7 Score=38.43 Aligned_cols=50 Identities=26% Similarity=0.293 Sum_probs=35.8
Q ss_pred cCcccCceEEEecCCCCCCCcccccccceeEEEeCCCcce-----EeCcccCceeEEEEE
Q 038979 322 AGIAAKGAYVGLAKPGRAGSWQTECKGYQFWTVANEGGNF-----SIKNVLIGNYNLYAW 376 (606)
Q Consensus 322 ~~~pa~~a~V~La~~~~~g~~q~~~~~yqywt~td~~G~F-----~I~nV~pGtY~L~a~ 376 (606)
.|+||.++.|-|....++++|+. -.-+.||+||+. .-..+.+|.|.|..-
T Consensus 12 ~G~PA~gv~V~L~~~~~~~~~~~-----l~~~~Td~DGR~~~~~~~~~~~~~G~Y~l~F~ 66 (112)
T PF00576_consen 12 TGKPAAGVPVTLYRLDSDGSWTL-----LAEGVTDADGRIKQPLLEGESLEPGIYKLVFD 66 (112)
T ss_dssp TTEE-TT-EEEEEEEETTSCEEE-----EEEEEBETTSEESSTSSETTTS-SEEEEEEEE
T ss_pred CCCCccCCEEEEEEecCCCCcEE-----EEEEEECCCCcccccccccccccceEEEEEEE
Confidence 39999999999976654455662 334679999988 345788999999987
No 50
>PRK11114 cellulose synthase regulator protein; Provisional
Probab=78.26 E-value=3.5 Score=49.28 Aligned_cols=75 Identities=15% Similarity=0.135 Sum_probs=49.2
Q ss_pred EEEEEecCccccccEEEEEEEeec-----cCCeEEEEEcCccCCCCCccccccCCCCeeeeeEEEeecEEEEEEeecCce
Q 038979 497 QIQFKLEGVVKKATYKLRVAVAAA-----HGAELQVRVNSRSARRPLFSSGSVGRENAIARHGIHGVYKLFNVDVPGKVL 571 (606)
Q Consensus 497 ~I~F~L~~~~~~~~~tLriala~a-----~~~~~~V~vNg~~~~~p~~~t~~~~~~~~i~R~~~~G~~~~~~~~ipa~~L 571 (606)
.|.|.+...+....++|+|...-+ ..++++|.|||+.+.. + .+..++ .|.....+|+||+ .|
T Consensus 84 ~i~f~vp~d~~v~~A~L~L~y~~Sp~l~~~~S~L~V~lNg~~v~s--~---pL~~~~-------~~~~~~~~i~IP~-~l 150 (756)
T PRK11114 84 GIEFGVRSDEVVTKARLNLEYTYSPALLPDLSHLKVYLNGELMGT--L---PLDKEQ-------LGKKVLAQLPIDP-RF 150 (756)
T ss_pred eeEeecCccccccCcEEEEEEEECCCCCCCCCeEEEEECCEEeEE--E---ecCccc-------CCCcceeEEecCH-HH
Confidence 666777666543456666665553 3478999999986541 1 111111 2445788999999 46
Q ss_pred eeeecEEEEEeec
Q 038979 572 RKGNNTIYLSQPR 584 (606)
Q Consensus 572 ~~G~NtI~l~~~~ 584 (606)
..|.|.|.|....
T Consensus 151 ~~g~N~L~~~~~~ 163 (756)
T PRK11114 151 ITDFNRLRLEFIG 163 (756)
T ss_pred cCCCceEEEEEec
Confidence 6899999998543
No 51
>cd03461 1,2-HQD Hydroxyquinol 1,2-dioxygenase (1,2-HQD) catalyzes the ring cleavage of hydroxyquinol (1,2,4-trihydroxybenzene), an intermediate in the degradation of a large variety of aromatic compounds including some polychloro- and nitroaromatic pollutants, to form 3-hydroxy-cis,cis-muconates. 1,2-HQD belongs to the aromatic dioxygenase family, a family of mononuclear non-heme intradiol-cleaving enzymes.
Probab=77.60 E-value=4.6 Score=42.60 Aligned_cols=65 Identities=17% Similarity=0.318 Sum_probs=46.6
Q ss_pred CCCeEEEEEEEEeeccccccCcccCceEEEecCCCCCCCcccc---ccccee--EEEeCCCcceEeCcccCceeEE
Q 038979 303 NKRGSISGRLIVKDRYVSRAGIAAKGAYVGLAKPGRAGSWQTE---CKGYQF--WTVANEGGNFSIKNVLIGNYNL 373 (606)
Q Consensus 303 ~qRGtVsG~v~~sd~~~~~~~~pa~~a~V~La~~~~~g~~q~~---~~~yqy--wt~td~~G~F~I~nV~pGtY~L 373 (606)
.++=.|+|+|+-.+ |.|..+|.|=+-.....|....+ ...... ...||++|.|.+.-|+||-|-+
T Consensus 118 G~~l~v~G~V~D~~------G~Pv~gA~VeiWqad~~G~Y~~~~~~~~~~~lRGr~~Td~~G~y~F~Ti~Pg~Ypi 187 (277)
T cd03461 118 GEPCFVHGRVTDTD------GKPLPGATVDVWQADPNGLYDVQDPDQPEFNLRGKFRTDEDGRYAFRTLRPTPYPI 187 (277)
T ss_pred CCEEEEEEEEEcCC------CCCcCCcEEEEECcCCCCCcCCCCCCCCCCCCeEEEEeCCCCCEEEEEECCCCcCC
Confidence 35678999996544 89999999998666555533321 122222 2568999999999999999975
No 52
>PF13364 BetaGal_dom4_5: Beta-galactosidase jelly roll domain; PDB: 1TG7_A 1XC6_A 3OGS_A 3OGV_A 3OGR_A 3OG2_A.
Probab=77.16 E-value=12 Score=33.86 Aligned_cols=53 Identities=21% Similarity=0.275 Sum_probs=34.3
Q ss_pred EEEE-EEEeeccCCeEEEEEcCccCCCCCccccccCCCCeeeeeEEE-eecEEEEEEeecCceeeeecEEEEE
Q 038979 511 YKLR-VAVAAAHGAELQVRVNSRSARRPLFSSGSVGRENAIARHGIH-GVYKLFNVDVPGKVLRKGNNTIYLS 581 (606)
Q Consensus 511 ~tLr-iala~a~~~~~~V~vNg~~~~~p~~~t~~~~~~~~i~R~~~~-G~~~~~~~~ipa~~L~~G~NtI~l~ 581 (606)
..|+ +.+......+.+|.|||+.+.. ++ +..-..+|.||+++|+.++|.|.+-
T Consensus 50 ~~~~~l~~~~g~~~~~~vwVNG~~~G~------------------~~~~~g~q~tf~~p~~il~~~n~v~~vl 104 (111)
T PF13364_consen 50 TSLTPLNIQGGNAFRASVWVNGWFLGS------------------YWPGIGPQTTFSVPAGILKYGNNVLVVL 104 (111)
T ss_dssp EEEE-EEECSSTTEEEEEEETTEEEEE------------------EETTTECCEEEEE-BTTBTTCEEEEEEE
T ss_pred eeEEEEeccCCCceEEEEEECCEEeee------------------ecCCCCccEEEEeCceeecCCCEEEEEE
Confidence 4455 5555567789999999986541 11 1111289999999999985555443
No 53
>cd05821 TLP_Transthyretin Transthyretin (TTR) is a 55 kDa protein responsible for the transport of thyroid hormones and retinol in vertebrates. TTR distributes the two thyroid hormones T3 (3,5,3'-triiodo-L-thyronine) and T4 (Thyroxin, or 3,5,3',5'-tetraiodo-L-thyronine), as well as retinol (vitamin A) through the formation of a macromolecular complex that includes each of these as well as retinol-binding protein. Misfolded forms of TTR are implicated in the amyloid diseases familial amyloidotic polyneuropathy and senile systemic amyloidosis. TTR forms a homotetramer with each subunit consisting of eight beta-strands arranged in two sheets and a short alpha-helix. The central channel of the tetramer contains two independent binding sites, each located between a pair of subunits, which differ in their ligand binding affinity. A negative cooperativity has been observed for the binding of T4 and other TTR ligands. A fraction of plasma TTR is carried in high density lipoproteins by bindi
Probab=76.91 E-value=3.5 Score=38.22 Aligned_cols=68 Identities=16% Similarity=0.147 Sum_probs=46.5
Q ss_pred CeEEEEEEEEeeccccccCcccCceEEEecCCCCCCCcccccccceeEEEeCCCcceE----eCcccCceeEEEEEECce
Q 038979 305 RGSISGRLIVKDRYVSRAGIAAKGAYVGLAKPGRAGSWQTECKGYQFWTVANEGGNFS----IKNVLIGNYNLYAWIPGF 380 (606)
Q Consensus 305 RGtVsG~v~~sd~~~~~~~~pa~~a~V~La~~~~~g~~q~~~~~yqywt~td~~G~F~----I~nV~pGtY~L~a~~~G~ 380 (606)
|-.++=.|+-.. .|+||.++.|-|......+.|+. .--++||+||+.. -..+.+|.|+|..-
T Consensus 6 ~~~ittHVLDt~-----~G~PAaGV~V~L~~~~~~~~w~~-----l~~~~Tn~DGR~~~ll~~~~~~~G~Y~l~F~---- 71 (121)
T cd05821 6 KCPLMVKVLDAV-----RGSPAANVAVKVFKKTADGSWEP-----FASGKTTETGEIHGLTTDEQFTEGVYKVEFD---- 71 (121)
T ss_pred CCCcEEEEEECC-----CCccCCCCEEEEEEecCCCceEE-----EEEEEECCCCCCCCccCccccCCeeEEEEEe----
Confidence 566777765544 39999999999964321234542 3347799999875 23467899999997
Q ss_pred eceeee
Q 038979 381 IGDFKY 386 (606)
Q Consensus 381 ~G~~~~ 386 (606)
.|+|..
T Consensus 72 tg~Yf~ 77 (121)
T cd05821 72 TKAYWK 77 (121)
T ss_pred hhHhhh
Confidence 666553
No 54
>cd05469 Transthyretin_like Transthyretin_like. This domain is present in the transthyretin-like protein (TLP) family which includes transthyretin (TTR) and a transthyretin-related protein called 5-hydroxyisourate hydrolase (HIUase). TTR and HIUase are homotetrameric proteins with each subunit consisting of eight beta-strands arranged in two sheets and a short alpha-helix. The central channel of the tetramer contains two independent binding sites, each located between a pair of subunits. TTR transports thyroid hormones and retinol in the blood serum of vertebrates while HIUase catalyzes the second step in a three-step ureide pathway. TTRs are highly conserved and found only in vertebrates while the HIUases are found in a wide range of bacterial, plant, fungal, slime mold and vertebrate organisms.
Probab=74.40 E-value=4.6 Score=37.06 Aligned_cols=56 Identities=18% Similarity=0.214 Sum_probs=39.5
Q ss_pred cCcccCceEEEecCCCCCCCcccccccceeEEEeCCCcceEe----CcccCceeEEEEEECceeceeee
Q 038979 322 AGIAAKGAYVGLAKPGRAGSWQTECKGYQFWTVANEGGNFSI----KNVLIGNYNLYAWIPGFIGDFKY 386 (606)
Q Consensus 322 ~~~pa~~a~V~La~~~~~g~~q~~~~~yqywt~td~~G~F~I----~nV~pGtY~L~a~~~G~~G~~~~ 386 (606)
.|+||.++.|-|.....++.|+. ---++||+||+..- ..+.+|.|+|..- .|+|..
T Consensus 12 ~G~PAagv~V~L~~~~~~~~w~~-----l~~~~Tn~DGR~~~~l~~~~~~~G~Y~l~F~----t~~Yf~ 71 (113)
T cd05469 12 RGSPAANVAIKVFRKTADGSWEI-----FATGKTNEDGELHGLITEEEFXAGVYRVEFD----TKSYWK 71 (113)
T ss_pred CCccCCCCEEEEEEecCCCceEE-----EEEEEECCCCCccCccccccccceEEEEEEe----hHHhHh
Confidence 39999999999975422234541 23467999999852 3568999999997 565543
No 55
>KOG2649 consensus Zinc carboxypeptidase [General function prediction only]
Probab=73.90 E-value=10 Score=42.77 Aligned_cols=78 Identities=14% Similarity=0.210 Sum_probs=52.9
Q ss_pred eEEEEEEEEeeccccccCcccCceEEEecCCCCCCCcccccccceeEEEeCCCcceEeCcccCceeEEEEEECceeceee
Q 038979 306 GSISGRLIVKDRYVSRAGIAAKGAYVGLAKPGRAGSWQTECKGYQFWTVANEGGNFSIKNVLIGNYNLYAWIPGFIGDFK 385 (606)
Q Consensus 306 GtVsG~v~~sd~~~~~~~~pa~~a~V~La~~~~~g~~q~~~~~yqywt~td~~G~F~I~nV~pGtY~L~a~~~G~~G~~~ 385 (606)
--|+|-|+-.. |.|..+|.|-+..-.. =.+|..+|.|=- =..||.|.++|.+.|+. .
T Consensus 378 ~GIkG~V~D~~------G~~I~NA~IsV~ginH-------------dv~T~~~GDYWR-LL~PG~y~vta~A~Gy~---~ 434 (500)
T KOG2649|consen 378 RGIKGLVFDDT------GNPIANATISVDGINH-------------DVTTAKEGDYWR-LLPPGKYIITASAEGYD---P 434 (500)
T ss_pred hccceeEEcCC------CCccCceEEEEecCcC-------------ceeecCCCceEE-eeCCcceEEEEecCCCc---c
Confidence 34889885433 9999999999865421 234555664432 27899999999997755 6
Q ss_pred eeeEEEEeCCceeeecceEEec
Q 038979 386 YHAAIRITAGSAKQIGNLVYKA 407 (606)
Q Consensus 386 ~~~~VtV~aG~t~~l~~l~~~~ 407 (606)
...+|+|..-..+.. ++++.+
T Consensus 435 ~tk~v~V~~~~a~~~-df~L~~ 455 (500)
T KOG2649|consen 435 VTKTVTVPPDRAARV-NFTLQR 455 (500)
T ss_pred eeeEEEeCCCCccce-eEEEec
Confidence 666789987333344 577773
No 56
>PF02369 Big_1: Bacterial Ig-like domain (group 1); InterPro: IPR003344 Proteins that contain this domain are found in a variety of bacterial and phage surface proteins such as intimins. Intimin is a bacterial cell-adhesion molecule that mediates the intimate bacterial host-cell interaction. It contains three domains; two immunoglobulin-like domains and a C-type lectin-like module implying that carbohydrate recognition may be important in intimin-mediated cell adhesion [].; PDB: 1CWV_A 4E9L_A 1F02_I 1F00_I.
Probab=72.92 E-value=30 Score=30.52 Aligned_cols=68 Identities=16% Similarity=0.225 Sum_probs=36.3
Q ss_pred CCeEEEEEEEEeeccccccCcccCceEEEecCCCCCCCcccccccceeEEEeCCCcc--eEeCcccCceeEEEEEECc
Q 038979 304 KRGSISGRLIVKDRYVSRAGIAAKGAYVGLAKPGRAGSWQTECKGYQFWTVANEGGN--FSIKNVLIGNYNLYAWIPG 379 (606)
Q Consensus 304 qRGtVsG~v~~sd~~~~~~~~pa~~a~V~La~~~~~g~~q~~~~~yqywt~td~~G~--F~I~nV~pGtY~L~a~~~G 379 (606)
....+.=++++.|.+ |.|..+..|-+......+... .... -+.||++|. +++..-++|+|+++|.+.|
T Consensus 21 g~~~~tltatV~D~~----gnpv~g~~V~f~~~~~~~~l~--~~~~--~~~Td~~G~a~~tltst~aG~~~VtA~~~~ 90 (100)
T PF02369_consen 21 GSDTNTLTATVTDAN----GNPVPGQPVTFSSSSSGGTLS--PTNT--SATTDSNGIATVTLTSTKAGTYTVTATVDG 90 (100)
T ss_dssp SSS-EEEEEEEEETT----SEB-TS-EEEE--EESSSEES---CEE---EEE-TTSEEEEEEE-SS-EEEEEEEEETT
T ss_pred CcCcEEEEEEEEcCC----CCCCCCCEEEEEEcCCCcEEe--cCcc--ccEECCCEEEEEEEEecCceEEEEEEEECC
Confidence 344444455557743 899999988872111111121 1100 357899996 5667779999999999764
No 57
>COG2351 Transthyretin-like protein [General function prediction only]
Probab=70.83 E-value=15 Score=34.08 Aligned_cols=67 Identities=22% Similarity=0.365 Sum_probs=47.0
Q ss_pred eEEEEEEEEeeccccccCcccCceEEEecCCCCCCCcccccccceeEEEeCCCcceEe-----CcccCceeEEEEEECce
Q 038979 306 GSISGRLIVKDRYVSRAGIAAKGAYVGLAKPGRAGSWQTECKGYQFWTVANEGGNFSI-----KNVLIGNYNLYAWIPGF 380 (606)
Q Consensus 306 GtVsG~v~~sd~~~~~~~~pa~~a~V~La~~~~~g~~q~~~~~yqywt~td~~G~F~I-----~nV~pGtY~L~a~~~G~ 380 (606)
|.++=.|+-+. .|+||.++.|.|..-..+ .|+. ---+.||+||+-.- ..+++|.|+|..-
T Consensus 9 G~LTTHVLDta-----~GkPAagv~V~L~rl~~~-~~~~-----l~t~~Tn~DGR~d~pll~g~~~~~G~Y~l~F~---- 73 (124)
T COG2351 9 GRLTTHVLDTA-----SGKPAAGVKVELYRLEGN-QWEL-----LKTVVTNADGRIDAPLLAGETLATGIYELVFH---- 73 (124)
T ss_pred ceeeeeeeecc-----cCCcCCCCEEEEEEecCC-ccee-----eeEEEecCCCcccccccCccccccceEEEEEE----
Confidence 56666665444 499999999999755333 3441 23467899998763 4678999999987
Q ss_pred eceeeee
Q 038979 381 IGDFKYH 387 (606)
Q Consensus 381 ~G~~~~~ 387 (606)
.|||...
T Consensus 74 ~gdYf~~ 80 (124)
T COG2351 74 TGDYFKS 80 (124)
T ss_pred cchhhhc
Confidence 7776643
No 58
>smart00095 TR_THY Transthyretin.
Probab=69.81 E-value=7.2 Score=36.23 Aligned_cols=67 Identities=16% Similarity=0.188 Sum_probs=44.2
Q ss_pred eEEEEEEEEeeccccccCcccCceEEEecCCCCCCCcccccccceeEEEeCCCcceE--e--CcccCceeEEEEEECcee
Q 038979 306 GSISGRLIVKDRYVSRAGIAAKGAYVGLAKPGRAGSWQTECKGYQFWTVANEGGNFS--I--KNVLIGNYNLYAWIPGFI 381 (606)
Q Consensus 306 GtVsG~v~~sd~~~~~~~~pa~~a~V~La~~~~~g~~q~~~~~yqywt~td~~G~F~--I--~nV~pGtY~L~a~~~G~~ 381 (606)
-.++=.|+-.. .|+||.++.|-|......+.|+. ---.+||+||+.. + ..+.+|.|+|..- .
T Consensus 4 ~plTtHVLDt~-----~G~PAagv~V~L~~~~~~~~w~~-----la~~~Tn~DGR~~~ll~~~~~~~G~Y~l~F~----t 69 (121)
T smart00095 4 CPLMVKVLDAV-----RGSPAVNVAVKVFKKTEEGTWEP-----FASGKTNESGEIHELTTDEKFVEGLYKVEFD----T 69 (121)
T ss_pred CCeEEEEEECC-----CCccCCCCEEEEEEeCCCCceEE-----EEEEecCCCccccCccCcccccceEEEEEEe----h
Confidence 34555654444 39999999999964321223541 2236789999874 1 3567999999997 6
Q ss_pred ceeee
Q 038979 382 GDFKY 386 (606)
Q Consensus 382 G~~~~ 386 (606)
|+|..
T Consensus 70 g~Yf~ 74 (121)
T smart00095 70 KSYWK 74 (121)
T ss_pred hHhHh
Confidence 66654
No 59
>PF14315 DUF4380: Domain of unknown function (DUF4380)
Probab=69.77 E-value=74 Score=33.24 Aligned_cols=33 Identities=21% Similarity=0.487 Sum_probs=29.1
Q ss_pred CEEEEeCCeEEEEEeCCce-eEEEEEEcCEEecc
Q 038979 12 NHVVMNNGILQVSISTPQG-FVIGIQYKGNKNLL 44 (606)
Q Consensus 12 ~~vv~~Ng~l~vtv~k~~g-~itsl~y~G~e~l~ 44 (606)
+.+.|+|+.++++|+-.-| .|.++.++|.+|++
T Consensus 5 ~~~~l~N~~i~l~Vtp~~GgRIl~~~~~g~~N~~ 38 (274)
T PF14315_consen 5 NCLRLSNGDIELIVTPDVGGRILSFGLNGGENLF 38 (274)
T ss_pred eEEEEECCCEEEEEecCCCCEEEEEEeCCCceEE
Confidence 5689999999999886666 99999999988888
No 60
>PF03170 BcsB: Bacterial cellulose synthase subunit; InterPro: IPR018513 An operon encoding 4 proteins required for bacterial cellulose biosynthesis (bcs) in Acetobacter xylinus (Gluconacetobacter xylinus) has been isolated via genetic complementation with strains lacking cellulose synthase activity []. Nucleotide sequence analysis showed the cellulose synthase operon to consist of 4 genes, designated bcsA, bcsB, bcsC and bcsD, all of which are required for maximal bacterial cellulose synthesis in A. xylinum. The calculated molecular mass of the protein encoded by bcsB is 85.3kDa []. BcsB encodes the catalytic subunit of cellulose synthase. The protein polymerises uridine 5'-diphosphate glucose to cellulose: UDP-glucose + (1,4-beta-D-glucosyl)(N) = UDP + (1,4-beta-D-glucosyl)(N+1). The enzyme is specifically activated by the nucleotide cyclic diguanylic acid. Sequence analysis suggests that BcsB contains several transmembrane (TM) domains, and shares a high degree of similarity with Escherichia coli YhjN.; GO: 0006011 UDP-glucose metabolic process, 0016020 membrane
Probab=69.71 E-value=8.5 Score=44.60 Aligned_cols=79 Identities=16% Similarity=0.231 Sum_probs=52.4
Q ss_pred ceeEEEEEEecCcc---ccccEEEEEEEeec-----cCCeEEEEEcCccCCCCCccccccCCCCeeeeeEEEeecEEEEE
Q 038979 493 GSTWQIQFKLEGVV---KKATYKLRVAVAAA-----HGAELQVRVNSRSARRPLFSSGSVGRENAIARHGIHGVYKLFNV 564 (606)
Q Consensus 493 ~~~w~I~F~L~~~~---~~~~~tLriala~a-----~~~~~~V~vNg~~~~~p~~~t~~~~~~~~i~R~~~~G~~~~~~~ 564 (606)
+.+..+.|.|+.+- ......|.+..+.+ ..+++.|.|||.-+.. ..+.+ +-.+....+++
T Consensus 323 ~~~~~~~f~lP~dl~~~~~~~i~l~L~y~y~~~~~~~~S~l~V~vNg~~i~s-----~~L~~-------~~~~~~~~~~v 390 (605)
T PF03170_consen 323 PQPISFNFRLPPDLFAWDGSGIPLHLRYRYTPGLDFDGSRLTVYVNGQFIGS-----LPLTP-------ADGAGFDRYTV 390 (605)
T ss_pred CCcceeEeeCCccccccCCCceEEEEEEecCCCCCCCCcEEEEEECCEEEEe-----EECCC-------CCCCccceeEE
Confidence 34577888887642 22345555555554 3578999999985531 11111 22355688999
Q ss_pred EeecCceeeeecEEEEEeec
Q 038979 565 DVPGKVLRKGNNTIYLSQPR 584 (606)
Q Consensus 565 ~ipa~~L~~G~NtI~l~~~~ 584 (606)
.|| ..++.|.|.|.|...-
T Consensus 391 ~iP-~~~~~~~N~l~~~f~l 409 (605)
T PF03170_consen 391 SIP-RLLLPGRNQLQFEFDL 409 (605)
T ss_pred ecC-chhcCCCcEEEEEEEe
Confidence 999 9999999999887543
No 61
>PF01060 DUF290: Transthyretin-like family; InterPro: IPR001534 This new apparently nematode-specific protein family has been called family 2 []. The proteins show weak similarity to transthyretin (formerly called prealbumin) which transports thyroid hormones. The specific function of this protein is unknown.; GO: 0005615 extracellular space
Probab=65.08 E-value=13 Score=31.63 Aligned_cols=55 Identities=20% Similarity=0.169 Sum_probs=33.0
Q ss_pred EEEEEEeeccccccCcccCceEEEecCCCCCCCcccccccceeEEEeCCCcceEeCcccCceeEEEE
Q 038979 309 SGRLIVKDRYVSRAGIAAKGAYVGLAKPGRAGSWQTECKGYQFWTVANEGGNFSIKNVLIGNYNLYA 375 (606)
Q Consensus 309 sG~v~~sd~~~~~~~~pa~~a~V~La~~~~~g~~q~~~~~yqywt~td~~G~F~I~nV~pGtY~L~a 375 (606)
+|+|.= + +.|+.++.|-|-+... ......-=-+.||++|+|+|..-...-..+.-
T Consensus 1 ~G~L~C-~------~~P~~~~~V~L~e~d~-----~~~Ddll~~~~Td~~G~F~l~G~~~e~~~i~P 55 (80)
T PF01060_consen 1 KGQLMC-G------GKPAKNVKVKLWEDDY-----FDPDDLLDETKTDSDGNFELSGSTNEFTTIEP 55 (80)
T ss_pred CeEEEe-C------CccCCCCEEEEEECCC-----CCCCceeEEEEECCCceEEEEEEccCCccccE
Confidence 477753 3 6999999999943311 01122222277899999999754444333333
No 62
>PLN03059 beta-galactosidase; Provisional
Probab=63.07 E-value=7.5 Score=46.75 Aligned_cols=85 Identities=15% Similarity=0.143 Sum_probs=54.0
Q ss_pred eeEEEEEEecCccccccEEEEEEEeeccCCeEEEEEcCccCCC--CCccccccCCCCeeeeeEE--------EeecEEEE
Q 038979 494 STWQIQFKLEGVVKKATYKLRVAVAAAHGAELQVRVNSRSARR--PLFSSGSVGRENAIARHGI--------HGVYKLFN 563 (606)
Q Consensus 494 ~~w~I~F~L~~~~~~~~~tLriala~a~~~~~~V~vNg~~~~~--p~~~t~~~~~~~~i~R~~~--------~G~~~~~~ 563 (606)
+=.+-.|++++.. +- ++|-....+.=+|.|||.++.. |.+.+ .-+=+.|-+|+++ .|.....-
T Consensus 621 twYK~~Fd~p~g~---Dp---v~LDm~gmGKG~aWVNG~nIGRYW~~~a~-~~gC~~c~y~g~~~~~kc~~~cggP~q~l 693 (840)
T PLN03059 621 TWYKTTFDAPGGN---DP---LALDMSSMGKGQIWINGQSIGRHWPAYTA-HGSCNGCNYAGTFDDKKCRTNCGEPSQRW 693 (840)
T ss_pred eEEEEEEeCCCCC---CC---EEEecccCCCeeEEECCcccccccccccc-cCCCccccccccccchhhhccCCCceeEE
Confidence 3346778874422 11 2233345666789999998875 33211 1122457788887 24567777
Q ss_pred EEeecCceeeeecEEEEEeecC
Q 038979 564 VDVPGKVLRKGNNTIYLSQPRK 585 (606)
Q Consensus 564 ~~ipa~~L~~G~NtI~l~~~~g 585 (606)
+.||+++|++|.|+|.|=-..|
T Consensus 694 YHVPr~~Lk~g~N~lViFEe~g 715 (840)
T PLN03059 694 YHVPRSWLKPSGNLLIVFEEWG 715 (840)
T ss_pred EeCcHHHhccCCceEEEEEecC
Confidence 8899999999999988754444
No 63
>KOG4342 consensus Alpha-mannosidase [Carbohydrate transport and metabolism]
Probab=61.38 E-value=12 Score=43.33 Aligned_cols=39 Identities=31% Similarity=0.543 Sum_probs=30.3
Q ss_pred ceEEEEeCCEEEEeCCeEEEEEeCCceeEEEEEEc--CEEec
Q 038979 4 GVQLHEQNNHVVMNNGILQVSISTPQGFVIGIQYK--GNKNL 43 (606)
Q Consensus 4 ~v~~~~~g~~vv~~Ng~l~vtv~k~~g~itsl~y~--G~e~l 43 (606)
+|-.-...++|+++||+|.|+|. |+|.|+||.-- |.|.+
T Consensus 706 p~~~yq~Dd~~~L~Ng~lrV~i~-p~G~itSl~d~~~grE~l 746 (1078)
T KOG4342|consen 706 PVFVYQTDDSVTLDNGILRVKID-PTGRITSLVDVASGREAL 746 (1078)
T ss_pred ceeEEecCCeEEEECCEEEEEEC-CCCceeeeeehhcccchh
Confidence 44566678999999999999998 57999999543 55444
No 64
>cd03457 intradiol_dioxygenase_like Intradiol dioxygenase subgroup. Intradiol dioxygenases catalyze the critical ring-cleavage step in the conversion of catecholate derivatives to citric acid cycle intermediates. They break the catechol C1-C2 bond and utilize Fe3+, as opposed to the extradiol-cleaving enzymes which break the C2-C3 or C1-C6 bond and utilize Fe2+ and Mn+. The family contains catechol 1,2-dioxygenases and protocatechuate 3,4-dioxygenases. The specific function of this subgroup is unknown.
Probab=55.27 E-value=30 Score=34.46 Aligned_cols=62 Identities=10% Similarity=0.104 Sum_probs=42.2
Q ss_pred eEEEEEEEEeeccccccCcccCceEEEecCCCCCCCcccc---c--------cccee-EEEeCCCcceEeCcccCceeE
Q 038979 306 GSISGRLIVKDRYVSRAGIAAKGAYVGLAKPGRAGSWQTE---C--------KGYQF-WTVANEGGNFSIKNVLIGNYN 372 (606)
Q Consensus 306 GtVsG~v~~sd~~~~~~~~pa~~a~V~La~~~~~g~~q~~---~--------~~yqy-wt~td~~G~F~I~nV~pGtY~ 372 (606)
=.+.|+|+-.+ .+.|..+|.|=|-.....|..... . ..+-. +..||++|.|++.-|.||-|.
T Consensus 27 l~l~g~V~D~~-----~c~Pv~~a~VdiWh~da~G~Ys~~~~~~~~~~~~~~~~flRG~~~TD~~G~~~F~TI~PG~Y~ 100 (188)
T cd03457 27 LTLDLQVVDVA-----TCCPPPNAAVDIWHCDATGVYSGYSAGGGGGEDTDDETFLRGVQPTDADGVVTFTTIFPGWYP 100 (188)
T ss_pred EEEEEEEEeCC-----CCccCCCeEEEEecCCCCCCCCCccCCccccccccCCCcCEEEEEECCCccEEEEEECCCCCC
Confidence 47888885422 268999999999765555533221 1 11111 367899999999999999885
No 65
>PRK10340 ebgA cryptic beta-D-galactosidase subunit alpha; Reviewed
Probab=54.35 E-value=17 Score=45.01 Aligned_cols=67 Identities=15% Similarity=0.186 Sum_probs=46.4
Q ss_pred EEEEEEecCccccccEEEEEEEeeccCCeEEEEEcCccCCCCCccccccCCCCeeeeeEEEeecEEEEEEeecCceeeee
Q 038979 496 WQIQFKLEGVVKKATYKLRVAVAAAHGAELQVRVNSRSARRPLFSSGSVGRENAIARHGIHGVYKLFNVDVPGKVLRKGN 575 (606)
Q Consensus 496 w~I~F~L~~~~~~~~~tLriala~a~~~~~~V~vNg~~~~~p~~~t~~~~~~~~i~R~~~~G~~~~~~~~ipa~~L~~G~ 575 (606)
.+=.|.+++........|++.. ......|.|||+.+.. -.|-+..++|+|.. .|+.|+
T Consensus 113 Yrr~F~lp~~~~gkrv~L~FeG---V~s~a~VwvNG~~VG~------------------~~g~~~pfefDIT~-~l~~G~ 170 (1021)
T PRK10340 113 YQRTFTLSDGWQGKQTIIKFDG---VETYFEVYVNGQYVGF------------------SKGSRLTAEFDISA-MVKTGD 170 (1021)
T ss_pred EEEEEEeCcccccCcEEEEECc---cceEEEEEECCEEecc------------------ccCCCccEEEEcch-hhCCCc
Confidence 4446888765542334455542 3567899999986541 11556788999987 678999
Q ss_pred cEEEEEeec
Q 038979 576 NTIYLSQPR 584 (606)
Q Consensus 576 NtI~l~~~~ 584 (606)
|+|.+.+.+
T Consensus 171 N~LaV~V~~ 179 (1021)
T PRK10340 171 NLLCVRVMQ 179 (1021)
T ss_pred cEEEEEEEe
Confidence 999999854
No 66
>PF08531 Bac_rhamnosid_N: Alpha-L-rhamnosidase N-terminal domain; InterPro: IPR013737 This domain is found in bacterial rhamnosidase A and B enzymes and is probably involved in substrate recognition. ; PDB: 2OKX_B.
Probab=50.18 E-value=13 Score=36.05 Aligned_cols=61 Identities=16% Similarity=0.114 Sum_probs=31.1
Q ss_pred EEEEEEEeeccCCeEEEEEcCccCCCCCccccccCCCCeeeeeEEEeecEEEEEEeecCceeeeecEEEEEeecC
Q 038979 511 YKLRVAVAAAHGAELQVRVNSRSARRPLFSSGSVGRENAIARHGIHGVYKLFNVDVPGKVLRKGNNTIYLSQPRK 585 (606)
Q Consensus 511 ~tLriala~a~~~~~~V~vNg~~~~~p~~~t~~~~~~~~i~R~~~~G~~~~~~~~ipa~~L~~G~NtI~l~~~~g 585 (606)
++|.| ++.+.+++.|||+.+..-.+.+ .+..+.-.....+++| ..+|++|+|+|-+.+..|
T Consensus 6 A~l~i----sa~g~Y~l~vNG~~V~~~~l~P---------~~t~y~~~~~Y~tyDV-t~~L~~G~N~iav~lg~g 66 (172)
T PF08531_consen 6 ARLYI----SALGRYELYVNGERVGDGPLAP---------GWTDYDKRVYYQTYDV-TPYLRPGENVIAVWLGNG 66 (172)
T ss_dssp -EEEE----EEESEEEEEETTEEEEEE-----------------BTTEEEEEEEE--TTT--TTEEEEEEEEEE-
T ss_pred EEEEE----EeCeeEEEEECCEEeeCCcccc---------ccccCCCceEEEEEeC-hHHhCCCCCEEEEEEeCC
Confidence 44554 3567999999998664211111 0111111112224555 468899999999998654
No 67
>cd09024 Aldose_epim_lacX Aldose 1-epimerase, similar to Lactococcus lactis lacX. Proteins similar to Lactococcus lactis lacX are uncharacterized members of aldose-1-epimerase superfamily. Aldose 1-epimerases or mutarotases are key enzymes of carbohydrate metabolism, catalyzing the interconversion of the alpha- and beta-anomers of hexose sugars such as glucose and galactose. This interconversion is an important step that allows anomer specific metabolic conversion of sugars. Studies of the catalytic mechanism of the best known member of the family, galactose mutarotase, have shown a glutamate and a histidine residue to be critical for catalysis; the glutamate serves as the active site base to initiate the reaction by removing the proton from the C-1 hydroxyl group of the sugar substrate, and the histidine as the active site acid to protonate the C-5 ring oxygen.
Probab=47.32 E-value=2.1e+02 Score=29.85 Aligned_cols=30 Identities=30% Similarity=0.429 Sum_probs=27.0
Q ss_pred EEEeCCeEEEEEeCCceeEEEEEEc--CEEec
Q 038979 14 VVMNNGILQVSISTPQGFVIGIQYK--GNKNL 43 (606)
Q Consensus 14 vv~~Ng~l~vtv~k~~g~itsl~y~--G~e~l 43 (606)
++|.|+.++++|..-+|.|+|++.+ |.|.+
T Consensus 1 ~~l~n~~~~a~v~~~Ga~l~s~~~~~~g~e~l 32 (288)
T cd09024 1 ITLENEFLTVTISEHGAELTSIKDKKTGREYL 32 (288)
T ss_pred CEEECCcEEEEEeccCcEEEEEEeCCCCCEEE
Confidence 4689999999999999999999998 88877
No 68
>PF11008 DUF2846: Protein of unknown function (DUF2846); InterPro: IPR022548 Some members in this group of proteins with unknown function are annotated as lipoproteins. However this cannot be confirmed.
Probab=46.94 E-value=25 Score=31.80 Aligned_cols=44 Identities=14% Similarity=0.057 Sum_probs=30.2
Q ss_pred CCcceEeCcccCceeEEEEEECceeceeeeeeEEEEeCCceeee
Q 038979 357 EGGNFSIKNVLIGNYNLYAWIPGFIGDFKYHAAIRITAGSAKQI 400 (606)
Q Consensus 357 ~~G~F~I~nV~pGtY~L~a~~~G~~G~~~~~~~VtV~aG~t~~l 400 (606)
..|.|..-.|+||+|++.+......+.-....+|+|.+|++--+
T Consensus 56 ~~g~y~~~~v~pG~h~i~~~~~~~~~~~~~~l~~~~~~G~~yy~ 99 (117)
T PF11008_consen 56 KNGGYFYVEVPPGKHTISAKSEFSSSPGANSLDVTVEAGKTYYV 99 (117)
T ss_pred CCCeEEEEEECCCcEEEEEecCccCCCCccEEEEEEcCCCEEEE
Confidence 56777778899999999995321111112566799999998544
No 69
>PF13754 Big_3_4: Bacterial Ig-like domain (group 3)
Probab=46.06 E-value=42 Score=26.34 Aligned_cols=28 Identities=21% Similarity=0.381 Sum_probs=21.0
Q ss_pred eeEEEeCCCcceEe--CcccCceeEEEEEE
Q 038979 350 QFWTVANEGGNFSI--KNVLIGNYNLYAWI 377 (606)
Q Consensus 350 qywt~td~~G~F~I--~nV~pGtY~L~a~~ 377 (606)
.|.+.+|++|.+++ +....|+|++++.+
T Consensus 3 ~~~~t~~~~G~Ws~t~~~~~dG~y~itv~a 32 (54)
T PF13754_consen 3 TYTTTVDSDGNWSFTVPALADGTYTITVTA 32 (54)
T ss_pred EEEEEECCCCcEEEeCCCCCCccEEEEEEE
Confidence 56777889997766 55556888888875
No 70
>PF03944 Endotoxin_C: delta endotoxin; InterPro: IPR005638 This family contains insecticidal toxins produced by Bacillus species of bacteria. During spore formation the bacteria produce crystals of this protein. When an insect ingests these proteins, they are activated by proteolytic cleavage. The N terminus is cleaved in all of the proteins and a C-terminal extension is cleaved in some members. Once activated, the endotoxin binds to the gut epithelium and causes cell lysis by the formation of cation-selective channels, which leads to death. The activated region of the delta toxin is composed of three distinct structural domains: an N-terminal helical bundle domain (IPR005639 from INTERPRO) involved in membrane insertion and pore formation; a beta-sheet central domain (IPR001178 from INTERPRO) involved in receptor binding; and a C-terminal beta-sandwich domain that interacts with the N-terminal domain to form a channel [, ]. This entry represents the conserved C-terminal domain.; PDB: 1DLC_A 1JI6_A 1W99_A 1CIY_A 1I5P_A 2C9K_A 3EB7_A.
Probab=44.71 E-value=89 Score=29.35 Aligned_cols=95 Identities=18% Similarity=0.302 Sum_probs=49.1
Q ss_pred EEEEEecCccccccEEEEEEEeeccCCeEEEEEcCccCC--CCCccccccCCCCeeeeeEEEeecEEEE-EEeecC-cee
Q 038979 497 QIQFKLEGVVKKATYKLRVAVAAAHGAELQVRVNSRSAR--RPLFSSGSVGRENAIARHGIHGVYKLFN-VDVPGK-VLR 572 (606)
Q Consensus 497 ~I~F~L~~~~~~~~~tLriala~a~~~~~~V~vNg~~~~--~p~~~t~~~~~~~~i~R~~~~G~~~~~~-~~ipa~-~L~ 572 (606)
++++..+. .....|.+||-.|+.+.+.+.|.+++.... .+.-.|. .+... ..++|..|. ++++.. .+.
T Consensus 41 ~~~v~~~~-~~~~~YrIRiRYAs~~~~~~~i~~~~~~~~~~~~~~~T~--~~~~~-----~~~~y~~F~y~~~~~~~~~~ 112 (143)
T PF03944_consen 41 KIRVTINN-SSSQKYRIRIRYASNSNGTLSISINNSSGNLSFNFPSTM--SNGDN-----LTLNYESFQYVEFPTPFTFS 112 (143)
T ss_dssp EEEEEESS-SSTEEEEEEEEEEESS-EEEEEEETTEEEECEEEE--SS--STTGG-----CCETGGG-EEEEESSEEEES
T ss_pred EEEEEecC-CCCceEEEEEEEEECCCcEEEEEECCccceeeeeccccc--cCCCc-----cccccceeEeeecCceEEec
Confidence 44444332 233679999999998889999999987542 1111121 11111 333333332 222321 122
Q ss_pred eee-cEEEEEeecCCCCCceEEEEEEEEe
Q 038979 573 KGN-NTIYLSQPRKLDAFTGIMYDYLRFE 600 (606)
Q Consensus 573 ~G~-NtI~l~~~~g~~~~~~vmyD~IrLe 600 (606)
.+. .+|.|.+...++. ..|.-|-|++.
T Consensus 113 ~~~~~~~~i~i~~~~~~-~~v~IDkIEFI 140 (143)
T PF03944_consen 113 SNQSITITISIQNISSN-GNVYIDKIEFI 140 (143)
T ss_dssp TSEEEEEEEEEESSTTT-S-EEEEEEEEE
T ss_pred CCCceEEEEEEEecCCC-CeEEEEeEEEE
Confidence 222 5677765543332 47999999985
No 71
>PRK09525 lacZ beta-D-galactosidase; Reviewed
Probab=37.98 E-value=50 Score=41.06 Aligned_cols=66 Identities=14% Similarity=0.215 Sum_probs=45.6
Q ss_pred EEEEEEecCccccc-cEEEEEEEeeccCCeEEEEEcCccCCCCCccccccCCCCeeeeeEEE-eecEEEEEEeecCceee
Q 038979 496 WQIQFKLEGVVKKA-TYKLRVAVAAAHGAELQVRVNSRSARRPLFSSGSVGRENAIARHGIH-GVYKLFNVDVPGKVLRK 573 (606)
Q Consensus 496 w~I~F~L~~~~~~~-~~tLriala~a~~~~~~V~vNg~~~~~p~~~t~~~~~~~~i~R~~~~-G~~~~~~~~ipa~~L~~ 573 (606)
.+=.|.+++..... ...|++. +......|.|||+.+. +| |-+.-++|+|. ..|+.
T Consensus 124 Yrr~F~vp~~w~~~~rv~L~Fe---GV~~~a~VwvNG~~VG-------------------~~~g~~~pfefDIT-~~l~~ 180 (1027)
T PRK09525 124 YSLTFTVDESWLQSGQTRIIFD---GVNSAFHLWCNGRWVG-------------------YSQDSRLPAEFDLS-PFLRA 180 (1027)
T ss_pred EEEEEEeChhhcCCCeEEEEEC---eeccEEEEEECCEEEE-------------------eecCCCceEEEECh-hhhcC
Confidence 34468887654322 3445544 3567889999998553 23 55677899996 67789
Q ss_pred eecEEEEEeec
Q 038979 574 GNNTIYLSQPR 584 (606)
Q Consensus 574 G~NtI~l~~~~ 584 (606)
|+|+|.+.+.+
T Consensus 181 G~N~L~V~V~~ 191 (1027)
T PRK09525 181 GENRLAVMVLR 191 (1027)
T ss_pred CccEEEEEEEe
Confidence 99999999854
No 72
>PF12866 DUF3823: Protein of unknown function (DUF3823); InterPro: IPR024278 This is a family of uncharacterised proteins from Bacteroidetes. These proteins have characteristic DN and DR sequence-motifs but their function is not known.; PDB: 3HN5_B 4EIU_A.
Probab=37.76 E-value=1.1e+02 Score=31.30 Aligned_cols=63 Identities=17% Similarity=0.312 Sum_probs=36.4
Q ss_pred CeEEEEEEEEeecccccc-CcccCceEEEecCCCCCCCcccccccceeEEEeCCCcceEeCcccCceeEEEEE
Q 038979 305 RGSISGRLIVKDRYVSRA-GIAAKGAYVGLAKPGRAGSWQTECKGYQFWTVANEGGNFSIKNVLIGNYNLYAW 376 (606)
Q Consensus 305 RGtVsG~v~~sd~~~~~~-~~pa~~a~V~La~~~~~g~~q~~~~~yqywt~td~~G~F~I~nV~pGtY~L~a~ 376 (606)
-++++|+| -|.|-+.+ +....++-+-|-+. +|.. .+-|.+ ....||.|.-..+=+|+|.|..-
T Consensus 21 ~s~l~G~i--iD~~tgE~i~~~~~gv~i~l~e~----gy~~--~~~~~~-~v~qDGtf~n~~lF~G~Yki~~~ 84 (222)
T PF12866_consen 21 DSTLTGRI--IDVYTGEPIQTDIGGVRIQLYEL----GYGD--NTPQDV-YVKQDGTFRNTKLFDGDYKIVPK 84 (222)
T ss_dssp -EEEEEEE--EECCTTEE----STSSEEEEECS-----CCG----SEEE-EB-TTSEEEEEEE-SEEEEEEE-
T ss_pred CceEEEEE--EEeecCCeeeecCCceEEEEEec----cccc--CCCcce-EEccCCceeeeeEeccceEEEEc
Confidence 58999998 34332111 12224667777555 4552 233433 37789999888999999999983
No 73
>PF07748 Glyco_hydro_38C: Glycosyl hydrolases family 38 C-terminal domain; InterPro: IPR011682 O-Glycosyl hydrolases 3.2.1. from EC are a widespread group of enzymes that hydrolyse the glycosidic bond between two or more carbohydrates, or between a carbohydrate and a non-carbohydrate moiety. A classification system for glycosyl hydrolases, based on sequence similarity, has led to the definition of 85 different families [, ]. This classification is available on the CAZy (CArbohydrate-Active EnZymes) web site. Glycoside hydrolase family 38 GH38 from CAZY comprises enzymes with only one known activity; alpha-mannosidase (3.2.1.24 from EC) (3.2.1.114 from EC). This domain is found at the C terminus of glycosyl hydrolases from family 38.; GO: 0015923 mannosidase activity, 0006013 mannose metabolic process; PDB: 2WYI_A 2WYH_A 3LVT_A 3CZN_A 2FYV_A 3D50_A 3EJU_A 3EJS_A 3DX3_A 3BVX_A ....
Probab=37.47 E-value=2.3e+02 Score=30.99 Aligned_cols=116 Identities=11% Similarity=0.147 Sum_probs=65.5
Q ss_pred EEEeCCeEEEEEeCCceeEEEE--EEcCEEec---------cccCCCccC----cCccceEEEEEEccCcEEEEEEEEee
Q 038979 14 VVMNNGILQVSISTPQGFVIGI--QYKGNKNL---------LNVQNEEDN----RGIEATNYKVIMRTKEQVELSFTRMW 78 (606)
Q Consensus 14 vv~~Ng~l~vtv~k~~g~itsl--~y~G~e~l---------~~~~~~~~~----~Gl~~~~~~v~~~~~~~i~vs~~~~~ 78 (606)
.+|.|+.+.|+|+..+|.|++| +-+|.+.. ++....... ..-....+.++.+++-...|.+...+
T Consensus 89 ~~leN~~~~v~~~~~tG~i~sl~dk~~g~~~~~~~~~~~~~y~~~~~~~~~~~~~~~~~~~~~~~~~g~~~~~i~~~~~~ 168 (457)
T PF07748_consen 89 NVLENEFYKVTFDPNTGGIKSLYDKKTGREYVDQDGNDFYIYEDIDGDYDISPLELQRSGAYLFVEDGPLRSSIRVEYKF 168 (457)
T ss_dssp TEEETSSEEEEE-TTTSSEEEEEETTTS-EEEEEECEEEEEEEBTTSCTTE-SCCCGSEECCEEEEESSSEEEEEEEEEE
T ss_pred EEEEccEEEEEEeCCCCeEEEEEEccCCeEEEeecCCceeEeccccccccccccccccCceEEEEecCCceEEEEEEEEE
Confidence 5789999999999776999999 44455544 111111100 01134455566666656655555544
Q ss_pred c-CC------CCCCccceeeeEEEEEEcCCceEEEEEEeecccCCCCCcccccEEEEeCCC
Q 038979 79 Q-PY------TNGTIAPVNIDKRFLMLRGSSGFYSYAIYKRLKGWPGFQLFNNRMVFKPNP 132 (606)
Q Consensus 79 ~-~~------~~g~~~~~~l~~~~v~r~g~sgiY~y~i~~~~~~~p~~~lge~R~v~Rl~~ 132 (606)
. |. .........+.+.+.|.+|.+-|..=+...+... +.. -++|+.|..+-
T Consensus 169 ~~p~~~~~~~~~~~~~~~~i~~~i~L~~~~~~ie~~~~vdn~~~--~~~-~~l~~~f~t~i 226 (457)
T PF07748_consen 169 ELPKNLSLVKRSEQTGSSRITQTIRLYKGSPRIEFETEVDNWAE--DHR-KELRVRFPTNI 226 (457)
T ss_dssp EEESCBECEEEESCEEEEEEEEEEEEETTESSEEEEEEEEE-TT--SCE-EEEEEEEEES-
T ss_pred eccCCcEEEEEEEeccceEEEEEEEEecCceEEEEEEEeccccc--CCc-eeEEEEeecCC
Confidence 1 00 0011133468999999999999988776642221 111 36777777653
No 74
>PF07550 DUF1533: Protein of unknown function (DUF1533); InterPro: IPR011432 This domain is found duplicated in proteins of unknown function. The proteins typically also contain leucine-rich repeats.
Probab=37.29 E-value=29 Score=28.35 Aligned_cols=19 Identities=21% Similarity=0.529 Sum_probs=16.9
Q ss_pred EEeecCce-eeeecEEEEEe
Q 038979 564 VDVPGKVL-RKGNNTIYLSQ 582 (606)
Q Consensus 564 ~~ipa~~L-~~G~NtI~l~~ 582 (606)
+.|.+++| ++|+|+|+|.-
T Consensus 36 l~i~~~~f~~~G~~~I~I~A 55 (65)
T PF07550_consen 36 LKIKASAFNKDGENTIVIKA 55 (65)
T ss_pred EEEcHHHcCcCCceEEEEEe
Confidence 88899999 78999999983
No 75
>PF01190 Pollen_Ole_e_I: Pollen proteins Ole e I like; InterPro: IPR006041 Allergies are hypersensitivity reactions of the immune system to specific substances called allergens (such as pollen, stings, drugs, or food) that, in most people, result in no symptoms. A nomenclature system has been established for antigens (allergens) that cause IgE-mediated atopic allergies in humans [WHO/IUIS Allergen Nomenclature Subcommittee King T.P., Hoffmann D., Loewenstein H., Marsh D.G., Platts-Mills T.A.E., Thomas W. Bull. World Health Organ. 72:797-806(1994)]. This nomenclature system is defined by a designation that is composed of the first three letters of the genus; a space; the first letter of the species name; a space and an arabic number. In the event that two species names have identical designations, they are discriminated from one another by adding one or more letters (as necessary) to each species designation. The allergens in this family include allergens with the following designations: Ole e 1. A number of plant pollen proteins, whose biological function is not yet known, are structurally related []. These proteins are most probably secreted and consist of about 145 residues. There are six cysteines which are conserved in the sequence of these proteins. They seem to be involved in disulphide bonds.
Probab=36.20 E-value=68 Score=27.92 Aligned_cols=37 Identities=24% Similarity=0.258 Sum_probs=25.1
Q ss_pred CcccCceEEEecCCCCCCCcccccccceeEEEeCCCcceEeC
Q 038979 323 GIAAKGAYVGLAKPGRAGSWQTECKGYQFWTVANEGGNFSIK 364 (606)
Q Consensus 323 ~~pa~~a~V~La~~~~~g~~q~~~~~yqywt~td~~G~F~I~ 364 (606)
..|..+|.|.|.=.+.++ .......+.||++|.|.|.
T Consensus 18 ~~~l~GA~V~v~C~~~~~-----~~~~~~~~~Td~~G~F~i~ 54 (97)
T PF01190_consen 18 AKPLPGAKVSVECKDGNG-----GVVFSAEAKTDENGYFSIE 54 (97)
T ss_pred CccCCCCEEEEECCCCCC-----CcEEEEEEEeCCCCEEEEE
Confidence 467888999886332210 0234566889999999996
No 76
>smart00634 BID_1 Bacterial Ig-like domain (group 1).
Probab=35.63 E-value=1.3e+02 Score=25.74 Aligned_cols=66 Identities=14% Similarity=0.168 Sum_probs=39.4
Q ss_pred CeEEEEEEEEeeccccccCcccCceEEEecCCCCCCCcccccccceeEEEeCCCcc--eEeCcccCceeEEEEEECcee
Q 038979 305 RGSISGRLIVKDRYVSRAGIAAKGAYVGLAKPGRAGSWQTECKGYQFWTVANEGGN--FSIKNVLIGNYNLYAWIPGFI 381 (606)
Q Consensus 305 RGtVsG~v~~sd~~~~~~~~pa~~a~V~La~~~~~g~~q~~~~~yqywt~td~~G~--F~I~nV~pGtY~L~a~~~G~~ 381 (606)
...|+-+| .|. .|.|..+..|-+.-.+.. .....+ -...+|++|. +.+..-.+|++++++.+.|..
T Consensus 19 ~~~i~v~v--~D~----~Gnpv~~~~V~f~~~~~~-~~~~~~----~~~~Td~~G~a~~~l~~~~~G~~~vta~~~~~~ 86 (92)
T smart00634 19 AITLTATV--TDA----NGNPVAGQEVTFTTPSGG-ALTLSK----GTATTDANGIATVTLTSTTAGVYTVTASLENGS 86 (92)
T ss_pred cEEEEEEE--ECC----CCCCcCCCEEEEEECCCc-eeeccC----CeeeeCCCCEEEEEEECCCCcEEEEEEEECCCc
Confidence 34555555 554 378888887776544322 111111 1236888996 445566789999999876543
No 77
>PF11797 DUF3324: Protein of unknown function C-terminal (DUF3324); InterPro: IPR021759 This family consists of several hypothetical bacterial proteins of unknown function.
Probab=33.24 E-value=49 Score=31.09 Aligned_cols=30 Identities=20% Similarity=0.216 Sum_probs=21.4
Q ss_pred cccCceeEEEEEECceeceeeeeeEEEEeC
Q 038979 365 NVLIGNYNLYAWIPGFIGDFKYHAAIRITA 394 (606)
Q Consensus 365 nV~pGtY~L~a~~~G~~G~~~~~~~VtV~a 394 (606)
.++||+|+|.+-+..-.+.-..+.+++|++
T Consensus 102 ~lk~G~Y~l~~~~~~~~~~W~f~k~F~It~ 131 (140)
T PF11797_consen 102 KLKPGKYTLKITAKSGKKTWTFTKDFTITA 131 (140)
T ss_pred CccCCEEEEEEEEEcCCcEEEEEEEEEECH
Confidence 799999999988754444444556677764
No 78
>PRK10150 beta-D-glucuronidase; Provisional
Probab=33.12 E-value=77 Score=36.79 Aligned_cols=65 Identities=18% Similarity=0.158 Sum_probs=42.4
Q ss_pred EEEEEecCccccccEEEEEEEeeccCCeEEEEEcCccCCCCCccccccCCCCeeeeeEEEeecEEEEEEeecCceeeee-
Q 038979 497 QIQFKLEGVVKKATYKLRVAVAAAHGAELQVRVNSRSARRPLFSSGSVGRENAIARHGIHGVYKLFNVDVPGKVLRKGN- 575 (606)
Q Consensus 497 ~I~F~L~~~~~~~~~tLriala~a~~~~~~V~vNg~~~~~p~~~t~~~~~~~~i~R~~~~G~~~~~~~~ipa~~L~~G~- 575 (606)
+=.|.+++........|++. +....-.|.|||+.+.. -.|-+..++|+|.. .|+.|.
T Consensus 70 rr~f~lp~~~~gk~v~L~Fe---gv~~~a~V~lNG~~vg~------------------~~~~~~~f~~DIT~-~l~~G~~ 127 (604)
T PRK10150 70 QREVFIPKGWAGQRIVLRFG---SVTHYAKVWVNGQEVME------------------HKGGYTPFEADITP-YVYAGKS 127 (604)
T ss_pred EEEEECCcccCCCEEEEEEC---cccceEEEEECCEEeee------------------EcCCccceEEeCch-hccCCCc
Confidence 44588876443223445553 24456799999985531 12557788999975 567886
Q ss_pred cEEEEEee
Q 038979 576 NTIYLSQP 583 (606)
Q Consensus 576 NtI~l~~~ 583 (606)
|+|.+.+.
T Consensus 128 n~L~V~v~ 135 (604)
T PRK10150 128 VRITVCVN 135 (604)
T ss_pred eEEEEEEe
Confidence 49999984
No 79
>PF01690 PLRV_ORF5: Potato leaf roll virus readthrough protein; InterPro: IPR002929 This family consists mainly of the Potato leafroll virus (PLrV) read through protein otherwise known as the minor capsid protein. This is generated via a readthrough of open reading frame 3, the coat protein, allowing transcription of open reading frame 5 to give an extended coat protein with a large C-terminal addition or read through domain []. The read through protein is essential for the circulative aphid transmission of PLrV [] and Beet western yellows virus []. The N-terminal region of the luteovirus readthrough domain determines virus binding to Buchnera GroEL and is essential for virus persistence in the aphid [].; GO: 0019028 viral capsid
Probab=33.08 E-value=1.3e+02 Score=34.17 Aligned_cols=108 Identities=15% Similarity=0.249 Sum_probs=67.4
Q ss_pred ccccceeEEEeC--CCcceEeCcccCceeEEEEEECceeceeeeeeEEEEeCCc--eeeecceEEecCCCCCceeEEecc
Q 038979 345 ECKGYQFWTVAN--EGGNFSIKNVLIGNYNLYAWIPGFIGDFKYHAAIRITAGS--AKQIGNLVYKAPRNGPTLWEIGIP 420 (606)
Q Consensus 345 ~~~~yqywt~td--~~G~F~I~nV~pGtY~L~a~~~G~~G~~~~~~~VtV~aG~--t~~l~~l~~~~p~~~~~LweIG~~ 420 (606)
-+-.++|+...+ +.=.|.|+ |..|+|.+++...|++ . |.=+.|. ---+|.|.+.. +....|.||..
T Consensus 74 ~~i~a~w~snn~~~A~p~f~~P-vp~G~~sV~isceG~q---~----v~~~gg~~dg~~~GlIAY~~--~~~~~WnvG~y 143 (465)
T PF01690_consen 74 VNIDAGWYSNNSVKAIPMFVFP-VPKGKWSVEISCEGYQ---A----VSSIGGPNDGKWDGLIAYDN--SSSDGWNVGNY 143 (465)
T ss_pred EEecceeEecCcceeeeEEEEe-cCCceEEEEEEeccee---c----ccccCCCCCCceeeeEEecC--ccccccccccc
Confidence 456667776655 45578887 9999999999987765 1 2222221 12367677773 44499999986
Q ss_pred C-------CCccceecCCCCCcccccccccccccccchhHhhhhhhcCCCCeeEEEeccCC
Q 038979 421 D-------RSAAEFYIPNPNPKYINKLYVKHDRFRQYGLWERYAELHRKRDLVYEVWANNY 474 (606)
Q Consensus 421 D-------rta~eF~~~d~~~~~~~~~~~~~d~~r~yglW~r~~~~~P~~dl~ytVG~S~~ 474 (606)
. +..+-|+.++++ +-||-++|.+=.+-|| + +.+.|+|=..+.
T Consensus 144 ~g~~ItN~~~~nt~~~GHpD------~e~N~c~F~~~q~vEr--D----~~~SFhl~~~~~ 192 (465)
T PF01690_consen 144 NGCTITNYKADNTWKYGHPD------LELNGCHFNDGQVVER--D----GTISFHLEATGD 192 (465)
T ss_pred cCcEEecccccCcccCCCCC------ceecCcccccCceEEe--e----eeEEEEEEecCC
Confidence 5 456677777654 4455455655433332 3 347888866643
No 80
>PF14900 DUF4493: Domain of unknown function (DUF4493)
Probab=32.63 E-value=66 Score=32.63 Aligned_cols=53 Identities=17% Similarity=0.382 Sum_probs=34.9
Q ss_pred eeEEEeCCC-cceEeCcccCceeEEEEEECce---ec----eeeeeeEEEEeCCceeeecceEEec
Q 038979 350 QFWTVANEG-GNFSIKNVLIGNYNLYAWIPGF---IG----DFKYHAAIRITAGSAKQIGNLVYKA 407 (606)
Q Consensus 350 qywt~td~~-G~F~I~nV~pGtY~L~a~~~G~---~G----~~~~~~~VtV~aG~t~~l~~l~~~~ 407 (606)
.+|..++-. +.+. .++|+|+|.|+ .|- .| .|..+++++|.+|+++.+. ++...
T Consensus 48 ~~~~~~~~~~~~i~---L~~G~Ytv~A~-~g~~~~~~~d~pyy~G~~~f~I~~g~~t~v~-v~C~l 108 (235)
T PF14900_consen 48 KYWKYSEMPGESIE---LPVGSYTVKAS-YGDNVAAGFDKPYYEGSTTFTIEKGETTTVS-VTCKL 108 (235)
T ss_pred EecchhccccceEe---ecCCcEEEEEE-cCCCccccccCceeecceeEEEecCCcEEEE-EEEEe
Confidence 444444444 3333 57999999999 551 12 2445678999999998884 76653
No 81
>PF10794 DUF2606: Protein of unknown function (DUF2606); InterPro: IPR019730 This entry represents bacterial proteins with unknown function.
Probab=32.05 E-value=1.8e+02 Score=27.13 Aligned_cols=54 Identities=24% Similarity=0.159 Sum_probs=35.9
Q ss_pred ccCcccCceEEEecCCCCCCCcccccccc-eeEEEeCCCcceEeCcccCceeEEEEE
Q 038979 321 RAGIAAKGAYVGLAKPGRAGSWQTECKGY-QFWTVANEGGNFSIKNVLIGNYNLYAW 376 (606)
Q Consensus 321 ~~~~pa~~a~V~La~~~~~g~~q~~~~~y-qywt~td~~G~F~I~nV~pGtY~L~a~ 376 (606)
.++.|+....|.|-...+. +-| .++.- .-=.+||.+|.+.=+++|-|+|.+.+-
T Consensus 51 ~e~~pi~~~ev~lmKa~ds-~~q-Ps~eig~~IGKTD~~Gki~Wk~~~kG~Y~v~l~ 105 (131)
T PF10794_consen 51 AEGQPIKDFEVTLMKAADS-DPQ-PSKEIGISIGKTDEEGKIIWKNGRKGKYIVFLP 105 (131)
T ss_pred CCCCcccceEEEEEecccc-CCC-CchhhceeecccCCCCcEEEecCCcceEEEEEc
Confidence 4589999998877541111 111 11111 112469999999999999999998875
No 82
>PF10989 DUF2808: Protein of unknown function (DUF2808); InterPro: IPR021256 This family of proteins with unknown function appears to be restricted to Cyanobacteria.
Probab=29.46 E-value=3.2e+02 Score=25.77 Aligned_cols=101 Identities=21% Similarity=0.228 Sum_probs=57.3
Q ss_pred CeEEEEEeCCceeEEEEEEcCEEeccccCCCccCcC--ccceEEEEEEccCcEEEEEEEEeecCCCCCCccceeeeEEEE
Q 038979 19 GILQVSISTPQGFVIGIQYKGNKNLLNVQNEEDNRG--IEATNYKVIMRTKEQVELSFTRMWQPYTNGTIAPVNIDKRFL 96 (606)
Q Consensus 19 g~l~vtv~k~~g~itsl~y~G~e~l~~~~~~~~~~G--l~~~~~~v~~~~~~~i~vs~~~~~~~~~~g~~~~~~l~~~~v 96 (606)
++..++|+.+.+.=.+++++.++......+ ...+| +.=+++++-. +...|.|.|...=.|. + .+.|..+=|
T Consensus 40 ~L~~l~I~~p~~~~~~~~~~~i~v~~~~~g-~~~~g~~ipl~~v~~~~-~~~~i~I~f~~PV~pG---~--tv~V~l~~v 112 (146)
T PF10989_consen 40 ALQKLTISQPDGFDGSIDFDKIQVRAFSLG-PRRRGESIPLAEVEWDE-DGRTITITFDEPVPPG---T--TVTVVLSPV 112 (146)
T ss_pred cceeEEEEccccccccccCCcceEEEEecc-CcccCCccCceEEEEcC-CCCEEEEEeCCCCCCC---C--EEEEEEEee
Confidence 357788887876555677777754322112 22233 2212333333 3479999999865444 3 234444445
Q ss_pred EEcCCceEEEEEEeecccCC-C-CCcccccEE
Q 038979 97 MLRGSSGFYSYAIYKRLKGW-P-GFQLFNNRM 126 (606)
Q Consensus 97 ~r~g~sgiY~y~i~~~~~~~-p-~~~lge~R~ 126 (606)
.-+...|.|.|.+...+.+. | ..-||-+|+
T Consensus 113 ~NP~~~G~Y~f~v~a~p~G~~p~~~ylG~~rl 144 (146)
T PF10989_consen 113 RNPRSGGTYQFNVTAFPPGDNPIGQYLGTWRL 144 (146)
T ss_pred eCCCCCCeEEEEEEEECCCCCcccceeeEEEE
Confidence 55667799999987665532 2 444555554
No 83
>PF14849 YidC_periplas: YidC periplasmic domain; PDB: 3BS6_B 3BLC_B.
Probab=28.74 E-value=55 Score=33.44 Aligned_cols=26 Identities=23% Similarity=0.482 Sum_probs=21.6
Q ss_pred EEEEeCCeEEEEEeCCceeEEEEEEc
Q 038979 13 HVVMNNGILQVSISTPQGFVIGIQYK 38 (606)
Q Consensus 13 ~vv~~Ng~l~vtv~k~~g~itsl~y~ 38 (606)
+|+++|+.+.++|+..+|.|.++..+
T Consensus 1 ~v~ven~~~~~~~s~~GG~i~~~~Lk 26 (270)
T PF14849_consen 1 RVTVENDLFKVTFSSKGGRIKSVELK 26 (270)
T ss_dssp -EEEE-SS-EEEEETBTTEEEEEEEE
T ss_pred CEEEECCCEEEEEECCCCeEEEEEcC
Confidence 48999999999999999999999875
No 84
>PF09912 DUF2141: Uncharacterized protein conserved in bacteria (DUF2141); InterPro: IPR018673 This family of conserved hypothetical proteins has no known function.
Probab=28.42 E-value=1.3e+02 Score=27.26 Aligned_cols=20 Identities=10% Similarity=0.278 Sum_probs=17.8
Q ss_pred CcceEeCcccCceeEEEEEE
Q 038979 358 GGNFSIKNVLIGNYNLYAWI 377 (606)
Q Consensus 358 ~G~F~I~nV~pGtY~L~a~~ 377 (606)
.-.++|++++||+|.+.++.
T Consensus 42 ~~~~~f~~lp~G~YAi~v~h 61 (112)
T PF09912_consen 42 TVTITFEDLPPGTYAIAVFH 61 (112)
T ss_pred cEEEEECCCCCccEEEEEEE
Confidence 44899999999999999983
No 85
>PF03785 Peptidase_C25_C: Peptidase family C25, C terminal ig-like domain; InterPro: IPR005536 This domain is found in almost all members of MEROPS peptidase family C25 (clan CD). Peptidase family C25 is a protein family found in the bacterium Porphyromonas gingivalis (Bacteroides gingivalis), a Gram-negative anaerobic bacterial species strongly associated with adult periodontitis. One of its distinguishing characteristics and putative virulence properties is the ability to agglutinate erythrocytes. It is a highly proteolytic organism which metabolises small peptides and amino acids. Indirect evidence suggests that the proteases produced by this microorganism constitute an important virulence factor. Protease-encoding genes have been shown to contain multiple copies of repeated nucleotide sequences. These conserved sequences have also been found in haemagglutinin genes.; GO: 0008233 peptidase activity, 0006508 proteolysis; PDB: 1CVR_A.
Probab=27.65 E-value=1.9e+02 Score=25.14 Aligned_cols=39 Identities=26% Similarity=0.371 Sum_probs=23.8
Q ss_pred CceEEEecCCCCCCCcccccccceeEEEeCCCcceEeCccc-----CceeEEEEEE
Q 038979 327 KGAYVGLAKPGRAGSWQTECKGYQFWTVANEGGNFSIKNVL-----IGNYNLYAWI 377 (606)
Q Consensus 327 ~~a~V~La~~~~~g~~q~~~~~yqywt~td~~G~F~I~nV~-----pGtY~L~a~~ 377 (606)
.++.|-|+.. |--|..-.-++|+++|+ +. +|+|+|++-.
T Consensus 26 ~gs~ValS~d-----------g~l~G~ai~~sG~ati~-l~~~it~~~~~tlTit~ 69 (81)
T PF03785_consen 26 PGSYVALSQD-----------GDLYGKAIVNSGNATIN-LTNPITDEGTLTLTITA 69 (81)
T ss_dssp TT-EEEEEET-----------TEEEEEEE-BTTEEEEE--SS--TT-SEEEEEEE-
T ss_pred CCcEEEEecC-----------CEEEEEEEecCceEEEE-CCcccCCCceEEEEEEE
Confidence 3567777543 44677554449999985 44 6889998863
No 86
>PRK13211 N-acetylglucosamine-binding protein A; Reviewed
Probab=24.13 E-value=3.7e+02 Score=30.78 Aligned_cols=67 Identities=12% Similarity=0.106 Sum_probs=39.6
Q ss_pred CCCCCCCCCeEEEEEEEEeeccccccCcccCceEEEecCCCCCCCcccccccceeEEEeCCCcceEeC--cccCceeEEE
Q 038979 297 KDFARSNKRGSISGRLIVKDRYVSRAGIAAKGAYVGLAKPGRAGSWQTECKGYQFWTVANEGGNFSIK--NVLIGNYNLY 374 (606)
Q Consensus 297 ~~y~~~~qRGtVsG~v~~sd~~~~~~~~pa~~a~V~La~~~~~g~~q~~~~~yqywt~td~~G~F~I~--nV~pGtY~L~ 374 (606)
++|.-..+..+|.=+|.... ...+.+-|-.... +.++++--+..|..-.|+|. ++.+|.|+|.
T Consensus 320 ~eY~I~dG~~~i~ftv~a~g---------~~~vta~V~d~~g------~~~~~~~~~v~d~s~~vtL~Ls~~~AG~y~Lv 384 (478)
T PRK13211 320 KEYKIGDGAATLDFTVTATG---------DMNVEATVYNHDG------EALGSKSQTVNDGSQSVSLDLSKLKAGHHMLV 384 (478)
T ss_pred ceeEEcCCcEEEEEEEEecc---------ceEEEEEEEcCCC------CeeeeeeEEecCCceeEEEecccCCCceEEEE
Confidence 67888766666666665432 2345555532211 13444333444555567665 9999999999
Q ss_pred EEEC
Q 038979 375 AWIP 378 (606)
Q Consensus 375 a~~~ 378 (606)
+.+.
T Consensus 385 v~~t 388 (478)
T PRK13211 385 VKAK 388 (478)
T ss_pred EEEE
Confidence 9853
No 87
>PRK15172 putative aldose-1-epimerase; Provisional
Probab=23.38 E-value=1.4e+02 Score=31.40 Aligned_cols=36 Identities=14% Similarity=0.263 Sum_probs=31.1
Q ss_pred eCCEEEEeCCeEEEEEeCCceeEEEEEEcCEEeccc
Q 038979 10 QNNHVVMNNGILQVSISTPQGFVIGIQYKGNKNLLN 45 (606)
Q Consensus 10 ~g~~vv~~Ng~l~vtv~k~~g~itsl~y~G~e~l~~ 45 (606)
.+..++|.|+.+.++|..=+|.|++|+++|.+.++.
T Consensus 9 ~~~~~~l~~~~~~v~i~~~Ga~i~~l~~~~~~vv~~ 44 (300)
T PRK15172 9 SGQTISLAAGDYQATIVTVGAGLAELTFQGRHLVIP 44 (300)
T ss_pred CcCEEEEeCCCEEEEEecCCcEEEEEEECCEEEEec
Confidence 567899999999999999999999999988765554
No 88
>PF14200 RicinB_lectin_2: Ricin-type beta-trefoil lectin domain-like; PDB: 2X2S_C 2X2T_A 2VSE_B 2VSA_A 3EF2_A 2IHO_A 3HZB_H 1YBI_B 3PHZ_A 3NBE_A ....
Probab=22.98 E-value=1.2e+02 Score=26.24 Aligned_cols=39 Identities=28% Similarity=0.547 Sum_probs=25.9
Q ss_pred CcccCceEEEecCCCCCCCcccccccceeEEEeC-CCcceEeCcccCc
Q 038979 323 GIAAKGAYVGLAKPGRAGSWQTECKGYQFWTVAN-EGGNFSIKNVLIG 369 (606)
Q Consensus 323 ~~pa~~a~V~La~~~~~g~~q~~~~~yqywt~td-~~G~F~I~nV~pG 369 (606)
+..+.++.|.+ |.......|.|.... .+|.|.|.++..|
T Consensus 33 ~~~~~g~~v~~--------~~~~~~~~Q~W~i~~~~~g~y~I~n~~s~ 72 (105)
T PF14200_consen 33 GSTANGTNVQQ--------WTCNGNDNQQWKIEPVGDGYYRIRNKNSG 72 (105)
T ss_dssp TTCSTTEBEEE--------EESSSSGGGEEEEEESTTSEEEEEETSTT
T ss_pred CCcCCCcEEEE--------ecCCCCcCcEEEEEEecCCeEEEEECCCC
Confidence 34456677776 333346778886654 6788999888665
No 89
>PF01263 Aldose_epim: Aldose 1-epimerase; InterPro: IPR008183 Aldose 1-epimerase (EC 5.1.3.3) (mutarotase) is the enzyme responsible for the anomeric interconversion of D-glucose and other aldoses between their alpha- and beta-forms. The sequence of mutarotase from two bacteria, Acinetobacter calcoaceticus and Streptococcus thermophilus, is available. It has also been shown that, on the basis of extensive sequence similarities, a mutarotase domain seems to be present in the C-terminal half of the fungal GAL10 protein which encodes, in the N-terminal part, UDP-glucose 4-epimerase.; GO: 0016853 isomerase activity, 0005975 carbohydrate metabolic process; PDB: 1YGA_A 3DCD_A 2CIQ_A 2CIS_A 2CIR_A 2HTB_C 2HTA_B 3Q1N_A 1NSZ_B 1NSR_B ....
Probab=22.80 E-value=1.2e+02 Score=31.24 Aligned_cols=36 Identities=22% Similarity=0.263 Sum_probs=30.0
Q ss_pred EEEEeCC-eEEEEEeCCceeEEEEEEcC--EEeccccCC
Q 038979 13 HVVMNNG-ILQVSISTPQGFVIGIQYKG--NKNLLNVQN 48 (606)
Q Consensus 13 ~vv~~Ng-~l~vtv~k~~g~itsl~y~G--~e~l~~~~~ 48 (606)
.|+|.|+ .+.+.|..-+|.|+|++.+| .|.+.....
T Consensus 2 ~itL~n~~~~~~~i~~~Ga~l~s~~~~~~~~~~l~~~~~ 40 (300)
T PF01263_consen 2 LITLENGNGLSAVIPEYGAELTSLQVKGNGREVLWQPDP 40 (300)
T ss_dssp EEEEEETTSEEEEEETBTTEEEEEEETTTTEESB-B-ST
T ss_pred EEEEECCCceEEEEeccCcEEEEEEECCCCeEEecCCCC
Confidence 5899999 89999999999999999999 777755443
No 90
>KOG0496 consensus Beta-galactosidase [Carbohydrate transport and metabolism]
Probab=21.19 E-value=1.5e+02 Score=35.00 Aligned_cols=69 Identities=23% Similarity=0.295 Sum_probs=47.1
Q ss_pred cceeEEEEEEecCccccccEEEEEEEeeccCCeEEEEEcCccCCC--CCccccccCCCCeeeeeEEEeecEEEEEEeecC
Q 038979 492 EGSTWQIQFKLEGVVKKATYKLRVAVAAAHGAELQVRVNSRSARR--PLFSSGSVGRENAIARHGIHGVYKLFNVDVPGK 569 (606)
Q Consensus 492 ~~~~w~I~F~L~~~~~~~~~tLriala~a~~~~~~V~vNg~~~~~--p~~~t~~~~~~~~i~R~~~~G~~~~~~~~ipa~ 569 (606)
+|-+|-=.|+.++... . ++|-....+.=+|.|||.++.. |.+ |. ..++-||++
T Consensus 556 ~P~~w~k~f~~p~g~~--~----t~Ldm~g~GKG~vwVNG~niGRYW~~~-----------------G~--Q~~yhvPr~ 610 (649)
T KOG0496|consen 556 QPLTWYKTFDIPSGSE--P----TALDMNGWGKGQVWVNGQNIGRYWPSF-----------------GP--QRTYHVPRS 610 (649)
T ss_pred CCeEEEEEecCCCCCC--C----eEEecCCCcceEEEECCcccccccCCC-----------------CC--ceEEECcHH
Confidence 4677766777655442 1 1222235577899999998873 322 33 567889999
Q ss_pred ceeeeecEEEEEeecC
Q 038979 570 VLRKGNNTIYLSQPRK 585 (606)
Q Consensus 570 ~L~~G~NtI~l~~~~g 585 (606)
.||.+.|.|.+---.+
T Consensus 611 ~Lk~~~N~lvvfEee~ 626 (649)
T KOG0496|consen 611 WLKPSGNLLVVFEEEG 626 (649)
T ss_pred HhCcCCceEEEEEecc
Confidence 9999999988765554
No 91
>TIGR03769 P_ac_wall_RPT actinobacterial surface-anchored protein domain. This model describes a repeat domain that occurs one to three times in Actinobacterial proteins, some of which have LPXTG-type sortase recognition motifs for covalent attachment to the Gram-positive cell wall. Where it occurs with duplication in an LPXTG-anchored protein, it tends to be adjacent to the substrate-binding protein of the gene trio of an ABC transporter system, where that substrate-binding protein has a single copy of this same domain. This arrangement suggests a substrate-binding relay system, with the LPXTG protein acting as a substrate receptor.
Probab=20.14 E-value=1e+02 Score=23.12 Aligned_cols=11 Identities=27% Similarity=0.286 Sum_probs=10.0
Q ss_pred cCceeEEEEEE
Q 038979 367 LIGNYNLYAWI 377 (606)
Q Consensus 367 ~pGtY~L~a~~ 377 (606)
+||.|+|.+-+
T Consensus 11 ~PG~Y~l~~~a 21 (41)
T TIGR03769 11 KPGTYTLTVQA 21 (41)
T ss_pred CCeEEEEEEEE
Confidence 79999999986
Done!