Query 011825
Match_columns 476
No_of_seqs 186 out of 301
Neff 5.7
Searched_HMMs 46136
Date Fri Mar 29 05:38:40 2013
Command hhsearch -i /work/01045/syshi/csienesis_hhblits_a3m/011825.a3m -d /work/01045/syshi/HHdatabase/Cdd.hhm -o /work/01045/syshi/hhsearch_cdd/011825hhsearch_cdd -cpu 12 -v 0
No Hit Prob E-value P-value Score SS Cols Query HMM Template HMM
1 PF14683 CBM-like: Polysacchar 100.0 8.5E-47 1.8E-51 351.9 10.7 165 281-470 1-167 (167)
2 PF09284 RhgB_N: Rhamnogalactu 99.9 4.6E-25 1E-29 213.2 6.0 100 42-168 142-244 (249)
3 PF14686 fn3_3: Polysaccharide 99.9 1E-22 2.2E-27 173.7 8.7 93 176-272 1-95 (95)
4 PF13620 CarboxypepD_reg: Carb 98.9 8.8E-09 1.9E-13 83.8 8.8 80 179-274 1-81 (82)
5 PF13715 DUF4480: Domain of un 98.6 5.1E-07 1.1E-11 74.7 10.6 88 179-288 1-88 (88)
6 cd03863 M14_CPD_II The second 98.2 7.8E-06 1.7E-10 85.9 9.7 78 177-274 296-373 (375)
7 cd03865 M14_CPE_H Peptidase M1 98.1 5.9E-06 1.3E-10 87.4 7.7 96 143-274 306-401 (402)
8 cd03864 M14_CPN Peptidase M14 98.1 1.1E-05 2.4E-10 85.3 9.2 76 178-274 316-391 (392)
9 cd06245 M14_CPD_III The third 97.9 4.2E-05 9E-10 80.2 9.7 75 178-274 287-361 (363)
10 cd03858 M14_CP_N-E_like Carbox 97.8 9.9E-05 2.2E-09 77.4 9.4 72 178-269 298-370 (374)
11 cd03868 M14_CPD_I The first ca 97.8 7.6E-05 1.6E-09 78.3 8.4 77 177-273 295-371 (372)
12 cd03867 M14_CPZ Peptidase M14- 97.5 0.00036 7.7E-09 74.0 8.9 72 178-269 318-391 (395)
13 cd03866 M14_CPM Peptidase M14 96.9 0.0038 8.2E-08 65.9 9.3 70 177-264 294-363 (376)
14 PRK15036 hydroxyisourate hydro 96.2 0.011 2.3E-07 54.1 6.2 65 176-248 25-93 (137)
15 PF08400 phage_tail_N: Prophag 95.8 0.049 1.1E-06 49.6 8.5 77 179-264 4-80 (134)
16 PF08308 PEGA: PEGA domain; I 95.8 0.061 1.3E-06 42.7 8.1 45 228-276 25-69 (71)
17 PF03422 CBM_6: Carbohydrate b 95.2 0.16 3.4E-06 44.3 9.4 81 378-471 43-124 (125)
18 cd03869 M14_CPX_like Peptidase 94.7 0.077 1.7E-06 56.6 7.1 67 177-264 329-395 (405)
19 PF05738 Cna_B: Cna protein B- 94.4 0.18 3.8E-06 39.7 7.0 45 222-267 21-67 (70)
20 PF09430 DUF2012: Protein of u 94.0 0.21 4.5E-06 44.5 7.3 40 220-262 22-61 (123)
21 KOG1948 Metalloproteinase-rela 94.0 0.14 3E-06 58.5 7.5 55 178-248 316-371 (1165)
22 cd00421 intradiol_dioxygenase 93.9 0.14 3.1E-06 47.0 6.2 64 176-242 10-80 (146)
23 COG3485 PcaH Protocatechuate 3 92.9 0.24 5.2E-06 48.9 6.4 65 176-243 71-144 (226)
24 cd03463 3,4-PCD_alpha Protocat 91.7 0.39 8.5E-06 46.1 6.1 63 176-241 35-106 (185)
25 cd03459 3,4-PCD Protocatechuat 91.1 0.5 1.1E-05 44.2 6.0 64 176-242 14-87 (158)
26 PF03170 BcsB: Bacterial cellu 91.1 0.64 1.4E-05 51.9 8.0 78 365-456 29-111 (605)
27 cd03462 1,2-CCD chlorocatechol 90.9 0.72 1.6E-05 46.2 7.2 63 176-241 98-165 (247)
28 TIGR02465 chlorocat_1_2 chloro 90.7 0.41 8.9E-06 47.9 5.3 64 176-242 97-165 (246)
29 PF07210 DUF1416: Protein of u 90.5 2.9 6.2E-05 35.2 9.2 60 176-251 6-67 (85)
30 PF07495 Y_Y_Y: Y_Y_Y domain; 90.4 0.36 7.8E-06 37.4 3.7 27 220-246 20-47 (66)
31 KOG1948 Metalloproteinase-rela 90.4 0.71 1.5E-05 53.1 7.2 57 179-248 120-176 (1165)
32 PF00775 Dioxygenase_C: Dioxyg 89.5 0.86 1.9E-05 43.6 6.2 64 176-242 28-98 (183)
33 cd03464 3,4-PCD_beta Protocate 89.0 0.79 1.7E-05 45.2 5.7 65 175-242 63-137 (220)
34 TIGR02423 protocat_alph protoc 88.7 0.94 2E-05 43.8 5.9 64 176-242 38-111 (193)
35 cd03458 Catechol_intradiol_dio 88.3 1.5 3.3E-05 44.2 7.3 65 175-242 102-171 (256)
36 TIGR02438 catachol_actin catec 86.9 1 2.3E-05 45.9 5.2 64 176-242 131-199 (281)
37 TIGR02422 protocat_beta protoc 86.8 1.7 3.6E-05 43.0 6.4 67 173-242 56-132 (220)
38 smart00606 CBD_IV Cellulose Bi 85.9 10 0.00022 33.2 10.4 89 365-470 39-129 (129)
39 PF02837 Glyco_hydro_2_N: Glyc 85.4 1.3 2.9E-05 40.3 4.8 69 366-456 71-140 (167)
40 PRK11114 cellulose synthase re 85.1 1.5 3.2E-05 50.6 5.9 74 368-454 84-162 (756)
41 TIGR02962 hdxy_isourate hydrox 84.6 2 4.4E-05 38.0 5.3 59 180-246 3-66 (112)
42 cd03461 1,2-HQD Hydroxyquinol 84.4 1.6 3.5E-05 44.5 5.2 65 175-242 118-187 (277)
43 cd03460 1,2-CTD Catechol 1,2 d 84.0 2.1 4.6E-05 43.7 5.8 65 175-242 122-191 (282)
44 PF00576 Transthyretin: HIUase 83.3 1.9 4E-05 38.2 4.5 51 191-246 12-67 (112)
45 PF13364 BetaGal_dom4_5: Beta- 82.8 3.4 7.3E-05 36.1 5.9 54 382-452 50-104 (111)
46 TIGR02439 catechol_proteo cate 82.6 2.8 6.1E-05 42.9 6.1 64 176-242 127-195 (285)
47 cd05821 TLP_Transthyretin Tran 81.1 3.7 8.1E-05 36.9 5.6 63 177-246 6-72 (121)
48 cd05469 Transthyretin_like Tra 80.0 4.5 9.7E-05 36.0 5.7 51 191-246 12-66 (113)
49 KOG2649 Zinc carboxypeptidase 79.8 5.4 0.00012 43.5 7.3 77 178-276 378-455 (500)
50 COG2351 Transthyretin-like pro 79.0 6.3 0.00014 35.4 6.2 66 178-255 9-79 (124)
51 PF14900 DUF4493: Domain of un 78.2 68 0.0015 31.4 14.1 53 218-274 47-106 (235)
52 PF03170 BcsB: Bacterial cellu 76.6 4.8 0.0001 45.0 6.1 78 364-454 323-408 (605)
53 PF10670 DUF4198: Domain of un 75.2 9.5 0.00021 36.0 7.0 62 177-246 150-211 (215)
54 cd05822 TLP_HIUase HIUase (5-h 75.0 6.5 0.00014 34.8 5.3 50 191-246 12-66 (112)
55 PLN03059 beta-galactosidase; P 74.4 3.3 7.1E-05 48.2 4.1 86 365-457 621-716 (840)
56 PF02369 Big_1: Bacterial Ig-l 70.2 22 0.00047 30.4 7.3 68 176-248 21-90 (100)
57 cd03457 intradiol_dioxygenase_ 69.0 11 0.00023 36.3 5.7 62 178-241 27-100 (188)
58 smart00095 TR_THY Transthyreti 69.0 12 0.00027 33.6 5.7 62 178-246 4-69 (121)
59 PF09912 DUF2141: Uncharacteri 66.1 14 0.0003 32.5 5.4 49 197-248 12-63 (112)
60 PRK10340 ebgA cryptic beta-D-g 62.9 9.9 0.00022 45.5 5.0 68 366-455 112-179 (1021)
61 PF08531 Bac_rhamnosid_N: Alph 57.7 8.3 0.00018 36.1 2.6 62 381-456 5-66 (172)
62 PF01060 DUF290: Transthyretin 57.0 23 0.00051 29.0 4.9 49 181-238 1-49 (80)
63 PF01190 Pollen_Ole_e_I: Polle 55.9 22 0.00048 29.9 4.7 37 192-233 18-54 (97)
64 PF13754 Big_3_4: Bacterial Ig 55.6 23 0.00051 26.9 4.3 28 219-246 3-32 (54)
65 PF11008 DUF2846: Protein of u 51.2 19 0.00041 31.5 3.7 43 226-269 56-99 (117)
66 PF03944 Endotoxin_C: delta en 47.5 56 0.0012 29.6 6.3 95 368-471 41-140 (143)
67 PF07550 DUF1533: Protein of u 46.5 16 0.00035 28.9 2.2 19 435-453 36-55 (65)
68 PRK09525 lacZ beta-D-galactosi 46.1 36 0.00079 40.9 6.0 68 366-455 123-191 (1027)
69 TIGR03000 plancto_dom_1 Planct 44.9 1.2E+02 0.0025 25.3 7.0 39 229-269 30-73 (75)
70 PRK10150 beta-D-glucuronidase; 42.3 51 0.0011 36.9 6.2 66 367-454 69-135 (604)
71 PRK13211 N-acetylglucosamine-b 36.2 1.6E+02 0.0035 32.5 8.6 66 169-246 320-387 (478)
72 PF11797 DUF3324: Protein of u 36.0 41 0.00089 30.5 3.5 30 234-263 102-131 (140)
73 smart00634 BID_1 Bacterial Ig- 32.9 2.1E+02 0.0047 23.6 7.1 64 178-249 20-85 (92)
74 PF03785 Peptidase_C25_C: Pept 32.8 1.2E+02 0.0026 25.5 5.4 39 196-246 26-69 (81)
75 PF14200 RicinB_lectin_2: Rici 31.1 53 0.0011 27.6 3.2 48 181-238 24-72 (105)
76 PF12866 DUF3823: Protein of u 29.0 1.5E+02 0.0033 29.2 6.4 61 177-244 21-83 (222)
77 PF10794 DUF2606: Protein of u 28.1 1.8E+02 0.0039 26.4 5.9 64 180-248 44-108 (131)
78 TIGR03769 P_ac_wall_RPT actino 26.2 1.2E+02 0.0026 22.1 3.8 11 236-246 11-21 (41)
79 PF04571 Lipin_N: lipin, N-ter 24.9 1E+02 0.0023 27.3 3.9 38 356-405 34-71 (110)
80 PF14344 DUF4397: Domain of un 24.2 1.2E+02 0.0026 26.2 4.3 36 234-269 39-77 (122)
81 KOG3006 Transthyretin and rela 22.2 2.3E+02 0.005 25.7 5.5 60 178-245 21-85 (132)
82 PF11589 DUF3244: Domain of un 21.9 2.8E+02 0.006 23.7 6.0 65 178-254 32-104 (106)
83 cd04970 Ig6_Contactin_like Six 21.6 3.8E+02 0.0081 21.2 6.5 36 225-262 44-82 (85)
84 KOG0496 Beta-galactosidase [Ca 21.5 1.4E+02 0.0031 34.0 5.1 71 363-456 556-626 (649)
85 PF00041 fn3: Fibronectin type 20.3 3.1E+02 0.0067 20.9 5.6 21 226-246 54-75 (85)
No 1
>PF14683 CBM-like: Polysaccharide lyase family 4, domain III; PDB: 1NKG_A 2XHN_B 3NJX_A 3NJV_A.
Probab=100.00 E-value=8.5e-47 Score=351.91 Aligned_cols=165 Identities=52% Similarity=0.885 Sum_probs=114.2
Q ss_pred CCeEEEeccCCCccccccCCCCccccccccccCCchhhcccchhcccccCCCCCeeEEeeccCCCCCceeEEEEEecCCc
Q 011825 281 PTLWEIGIPDRSAREFNVPDPDPKYVNRLFVNHPDRFRQYGLWSRYTELYPNEDLVYTIGVSDYSKDWFFAQVVREMDNK 360 (476)
Q Consensus 281 ~~LweIG~~Drt~~~F~~~d~~~~~~~k~~~~hp~~~R~yglW~~y~~~~P~~dl~ytVG~S~~~~Dw~ya~~~~~~~~~ 360 (476)
++|||||+|||++.||+++|+ +++|||| |++|+++||++|++||||+| +++||||||+++ .+
T Consensus 1 ~~iW~IG~~Drta~eF~~~~~-------------~~~r~~~-~~d~~~~~p~~~~~ytVG~S-~~~Dw~y~~~~~--~~- 62 (167)
T PF14683_consen 1 PTIWQIGTPDRTAAEFRNGDP-------------DKYRQYG-WSDYSRDFPWEDLTYTVGSS-PAKDWPYAQWGR--VN- 62 (167)
T ss_dssp SEEEEEE-SSSS-TTSBTHH--------------HHTTS---TT--TTS----S-EEETTTS--GGGSBSEEETT--TS-
T ss_pred CcceEeCCCCCCchhhccCCh-------------hhhhhcC-cccchhhCCCCCCEEEEccC-cccCCcEEEEec--cC-
Confidence 579999999999999998642 5699998 99999999998999999999 889999999964 33
Q ss_pred ccCcccEEEEEEeCCCCCCCcEEEEEEEecc-CCCeEEEEEcCCCCCCCcccccccCCCCeeeceeee-eecEEEEEEee
Q 011825 361 TYQGTTWQIKFKLDHVDRNSSYKLRVAIASA-TLAELQVRVNDPNANRPLFTTGLIGRDNAIARHGIH-GLYLLYHVNIP 438 (476)
Q Consensus 361 ~~~~~~w~I~F~L~~~~~~~~~tLriala~a-~~~~~~V~vN~~~~~~~~~~~~~~~~~~~i~R~~~~-G~~~~~~~~ip 438 (476)
++|+|+|+|++++..+.+||||+||++ ++++++|+|||+.. +++ ...+++|++++|++++ |+|++++|+||
T Consensus 63 ----~~w~I~F~l~~~~~~~~~tL~i~la~a~~~~~~~V~vNg~~~--~~~-~~~~~~d~~~~r~g~~~G~~~~~~~~ip 135 (167)
T PF14683_consen 63 ----GTWTIKFDLDAVQLAGTYTLRIALAGASAGGRLQVSVNGWSG--PFP-SAPFGNDNAIYRSGIHRGNYRLYEFDIP 135 (167)
T ss_dssp ------EEEEEEE-GGG-S--EEEEEEEEEEETT-EEEEEETTEE--------------S--GGGT---S---EEEEEE-
T ss_pred ----CCEEEEEECCCCccCCcEEEEEEeccccCCCCEEEEEcCccC--Ccc-ccccCCCCceeeCceecccEEEEEEEEc
Confidence 899999999999877799999999999 79999999999544 233 2467899999999998 99999999999
Q ss_pred CCCeeeeecEEEEEeecCCCCCceEEEEEEEE
Q 011825 439 GTRFIEGENTIFLKQPRCTSPFQGIMYDYIRL 470 (476)
Q Consensus 439 a~~L~~G~NtI~l~~~~gss~~~~vmyD~IrL 470 (476)
+++|++|+|+|+|++++|++.+.|||||||||
T Consensus 136 a~~L~~G~Nti~lt~~~gs~~~~gvmyD~I~L 167 (167)
T PF14683_consen 136 ASLLKAGENTITLTVPSGSGLSPGVMYDYIRL 167 (167)
T ss_dssp TTSS-SEEEEEEEEEE-S-GGSSEEEEEEEEE
T ss_pred HHHEEeccEEEEEEEccCCCccCeEEEEEEEC
Confidence 99999999999999999987788999999998
No 2
>PF09284 RhgB_N: Rhamnogalacturonase B, N-terminal; InterPro: IPR015364 This domain is found in the prokaryotic enzyme rhamnogalacturonase B. It adopts a beta supersandwich structure, with eighteen strands in two beta-sheets. The exact function of the domain is unknown, but a putative role is carbohydrate binding; GO: 0016837 carbon-oxygen lyase activity, acting on polysaccharides, 0030246 carbohydrate binding, 0005975 carbohydrate metabolic process; PDB: 1NKG_A 2XHN_B 3NJX_A 3NJV_A.
Probab=99.91 E-value=4.6e-25 Score=213.19 Aligned_cols=100 Identities=22% Similarity=0.340 Sum_probs=63.7
Q ss_pred cceEeeeeeeecccccccEEEEEecCCCeEEEEEcCCCccccCCCceeccccccCCc---EEEEEeeccccCCccccccC
Q 011825 42 QGEVDDKYQYSCENKDLKVHGWICRTTPVGFWLIIPSDEFRSGGPLKQNLTSHVGPT---TLAVFLSGHYAGKYMETHIG 118 (476)
Q Consensus 42 ~G~~~sKY~~s~~~~d~~vhG~~~~g~~vG~W~I~~s~E~~sGGP~kqdL~~h~g~~---~l~y~~s~H~~g~~~~~~~~ 118 (476)
+|+++||||++.+++|+++||+ +++++|+|||++++|.+|||||+|||++|.++. ||+||+|+|. |||++|
T Consensus 142 ~G~TrSKfYSs~r~IDd~~hgv--~g~~vgv~mi~~~~E~SSGGPFfRDI~~~~~~~~~~Ly~ymnSgH~----qTE~~R 215 (249)
T PF09284_consen 142 DGQTRSKFYSSQRFIDDDVHGV--SGSAVGVYMIMSNYEKSSGGPFFRDINTNNGGDGNELYNYMNSGHT----QTEPYR 215 (249)
T ss_dssp TTEEEEGGGG--BGGG-SEEEE--E-SS-EEEEE----TT-SS-TT-B---EEE-SS-EEEEEEEE-STT------S---
T ss_pred CceEeeeeccccceeccceEEE--ecCCeEEEEEeCCccccCCCCchhhhhhccCCccceeeeeEecCcc----cCchhc
Confidence 7999999999999999999999 688999999999999999999999999998764 9999999998 699999
Q ss_pred CCCCCceeeceEEEEEcCCCCCCCcchhHHHHHHHHhhhhcCCCCCCCCC
Q 011825 119 QDEPWKKVFGPVFIYLNSAADGDDPLWLWEDAKIKMMSEVQSWPYNFPAS 168 (476)
Q Consensus 119 ~Ge~w~k~~GP~~~y~n~g~~~~~~~~l~~Da~~~~~~E~~~wpy~f~~s 168 (476)
.| |||||+|+|++|++|+.. +.+++|+++
T Consensus 216 ~G-----LhGPYaL~FT~g~~Ps~~----------------~~D~sff~~ 244 (249)
T PF09284_consen 216 MG-----LHGPYALAFTDGGAPSAS----------------DLDTSFFDD 244 (249)
T ss_dssp -E-----EEEEEEEEEESS----S---------------------GGGGG
T ss_pred cc-----cCCceEEEEcCCCCCCCc----------------cccccchhh
Confidence 99 999999999999997431 247888876
No 3
>PF14686 fn3_3: Polysaccharide lyase family 4, domain II; PDB: 1NKG_A 2XHN_B 3NJX_A 3NJV_A.
Probab=99.88 E-value=1e-22 Score=173.66 Aligned_cols=93 Identities=47% Similarity=0.894 Sum_probs=53.9
Q ss_pred CCeEEEEEEEEecCCCcccc-CceEEEecCCCCCCCccccccccceEEEeCCCcceEeCcccCcceEEEEEECceeeeee
Q 011825 176 ERGCVSGRLLVQDSNDVISA-NGAYVGLAPPGDVGSWQTECKDYQFWTTADEDGCFSIKNIRTGNYNLYAWVPGFVGDYR 254 (476)
Q Consensus 176 ~RGtVsG~v~~~d~~~~~pa-~~a~V~L~~~~~~g~~q~~~~~yqYwt~td~~G~FtI~nV~pGtY~L~a~~~G~~G~~~ 254 (476)
+||+|+|+|++.|.....++ ..++|+|+++++.+ |+++||||++||++|+|+|+|||||+|+|++|.+|++|+|.
T Consensus 1 ~RG~VsG~l~l~dg~~~~~~~~~~~Vgl~~~~d~~----q~~~yqYwt~td~~G~Fti~~V~pGtY~L~ay~~g~~g~~~ 76 (95)
T PF14686_consen 1 QRGSVSGRLTLSDGVTNPPAGANAVVGLAPPGDFQ----QNKGYQYWTRTDSDGNFTIPNVRPGTYRLYAYADGIFGDYK 76 (95)
T ss_dssp G-BEEEEEEE---SS--TT--S-EEEEEE------------SS-EEEEE--TTSEEE---B-SEEEEEEEEE----TTEE
T ss_pred CCCEEEEEEEEccCcccCccceeEEEEeeeccccc----cCCCCcEEEEeCCCCcEEeCCeeCcEeEEEEEEecccCceE
Confidence 59999999999883222444 67999999998764 49999999999999999999999999999999999999998
Q ss_pred e-eeEEEEeCCceeeecce
Q 011825 255 S-DALVTITSGSNIKMGDL 272 (476)
Q Consensus 255 ~-~~~VtV~aG~t~~lg~l 272 (476)
. +.+|+|++|++++|++|
T Consensus 77 ~~~~~ItV~~g~~~~lg~~ 95 (95)
T PF14686_consen 77 VASDSITVSGGTTTDLGDL 95 (95)
T ss_dssp EEEEEEEE-T-EEE-----
T ss_pred EecceEEEcCCcEeccccC
Confidence 7 77899999999988754
No 4
>PF13620 CarboxypepD_reg: Carboxypeptidase regulatory-like domain; PDB: 3MN8_D 3P0D_I 3KCP_A 2B59_B 1UWY_A 1H8L_A 1QMU_A 2NSM_A.
Probab=98.88 E-value=8.8e-09 Score=83.81 Aligned_cols=80 Identities=29% Similarity=0.434 Sum_probs=59.2
Q ss_pred EEEEEEEEecCCCccccCceEEEecCCCCCCCccccccccceEEEeCCCcceEeCcccCcceEEEEEECceeeeeeee-e
Q 011825 179 CVSGRLLVQDSNDVISANGAYVGLAPPGDVGSWQTECKDYQFWTTADEDGCFSIKNIRTGNYNLYAWVPGFVGDYRSD-A 257 (476)
Q Consensus 179 tVsG~v~~~d~~~~~pa~~a~V~L~~~~~~g~~q~~~~~yqYwt~td~~G~FtI~nV~pGtY~L~a~~~G~~G~~~~~-~ 257 (476)
+|+|+|...+ +.|..+|.|.|.... .+..+-+.||++|+|.|++++||+|+|.+...|+. ... .
T Consensus 1 tI~G~V~d~~---g~pv~~a~V~l~~~~---------~~~~~~~~Td~~G~f~~~~l~~g~Y~l~v~~~g~~---~~~~~ 65 (82)
T PF13620_consen 1 TISGTVTDAT---GQPVPGATVTLTDQD---------GGTVYTTTTDSDGRFSFEGLPPGTYTLRVSAPGYQ---PQTQE 65 (82)
T ss_dssp -EEEEEEETT---SCBHTT-EEEET--T---------TTECCEEE--TTSEEEEEEE-SEEEEEEEEBTTEE----EEEE
T ss_pred CEEEEEEcCC---CCCcCCEEEEEEEee---------CCCEEEEEECCCceEEEEccCCEeEEEEEEECCcc---eEEEE
Confidence 6899999865 899999999997431 34578899999999999999999999999998865 333 2
Q ss_pred EEEEeCCceeeecceEE
Q 011825 258 LVTITSGSNIKMGDLVY 274 (476)
Q Consensus 258 ~VtV~aG~t~~lg~l~~ 274 (476)
.|+|.+|++..+ +|++
T Consensus 66 ~v~v~~~~~~~~-~i~L 81 (82)
T PF13620_consen 66 NVTVTAGQTTTV-DITL 81 (82)
T ss_dssp EEEESSSSEEE---EEE
T ss_pred EEEEeCCCEEEE-EEEE
Confidence 599999999888 5766
No 5
>PF13715 DUF4480: Domain of unknown function (DUF4480)
Probab=98.59 E-value=5.1e-07 Score=74.69 Aligned_cols=88 Identities=28% Similarity=0.408 Sum_probs=67.7
Q ss_pred EEEEEEEEecCCCccccCceEEEecCCCCCCCccccccccceEEEeCCCcceEeCcccCcceEEEEEECceeeeeeeeeE
Q 011825 179 CVSGRLLVQDSNDVISANGAYVGLAPPGDVGSWQTECKDYQFWTTADEDGCFSIKNIRTGNYNLYAWVPGFVGDYRSDAL 258 (476)
Q Consensus 179 tVsG~v~~~d~~~~~pa~~a~V~L~~~~~~g~~q~~~~~yqYwt~td~~G~FtI~nV~pGtY~L~a~~~G~~G~~~~~~~ 258 (476)
+|+|+|...+ ++.|..+|.|.+.+. ...+.||++|.|+|+ +++|+|+|.+...|+. ..+..
T Consensus 1 ti~G~V~d~~--t~~pl~~a~V~~~~~-------------~~~~~Td~~G~F~i~-~~~g~~~l~is~~Gy~---~~~~~ 61 (88)
T PF13715_consen 1 TISGKVVDSD--TGEPLPGATVYLKNT-------------KKGTVTDENGRFSIK-LPEGDYTLKISYIGYE---TKTIT 61 (88)
T ss_pred CEEEEEEECC--CCCCccCeEEEEeCC-------------cceEEECCCeEEEEE-EcCCCeEEEEEEeCEE---EEEEE
Confidence 5899998765 479999999999643 367889999999999 9999999999997755 55556
Q ss_pred EEEeCCceeeecceEEcCCCCCCCeEEEec
Q 011825 259 VTITSGSNIKMGDLVYEPPRDGPTLWEIGI 288 (476)
Q Consensus 259 VtV~aG~t~~lg~l~~~~~~~~~~LweIG~ 288 (476)
|.+..++...+ .+.+.+. ..+|-||.+
T Consensus 62 i~~~~~~~~~~-~i~L~~~--~~~L~eVvV 88 (88)
T PF13715_consen 62 ISVNSNKNTNL-NIYLEPK--SNQLDEVVV 88 (88)
T ss_pred EEecCCCEEEE-EEEEeeC--cccCCeEEC
Confidence 77766655566 5777553 567877753
No 6
>cd03863 M14_CPD_II The second carboxypeptidase (CP)-like domain of Carboxypeptidase D (CPD; EC 3.4.17.22), domain II. CPD differs from all other metallocarboxypeptidases in that it contains multiple CP-like domains. CPD belongs to the N/E-like subfamily of the M14 family of metallocarboxypeptidases (MCPs). The M14 family are zinc-binding CPs which hydrolyze single, C-terminal amino acids from polypeptide chains, and have a recognition site for the free C-terminal carboxyl group, which is a key determinant of specificity. CPD is a single-chain protein containing a signal peptide, three tandem repeats of CP-like domains separated by short bridge regions, followed by a transmembrane domain, and a C-terminal cytosolic tail. The first two CP-like domains of CPD contain all of the essential active site and substrate-binding residues, while the third CP-like domain lacks critical residues necessary for enzymatic activity and is inactive towards standard CP substrates. Domain I is optimally ac
Probab=98.16 E-value=7.8e-06 Score=85.93 Aligned_cols=78 Identities=19% Similarity=0.212 Sum_probs=64.0
Q ss_pred CeEEEEEEEEecCCCccccCceEEEecCCCCCCCccccccccceEEEeCCCcceEeCcccCcceEEEEEECceeeeeeee
Q 011825 177 RGCVSGRLLVQDSNDVISANGAYVGLAPPGDVGSWQTECKDYQFWTTADEDGCFSIKNIRTGNYNLYAWVPGFVGDYRSD 256 (476)
Q Consensus 177 RGtVsG~v~~~d~~~~~pa~~a~V~L~~~~~~g~~q~~~~~yqYwt~td~~G~FtI~nV~pGtY~L~a~~~G~~G~~~~~ 256 (476)
...|+|+|+... .+.|..+|+|.+.+ ....+.||.+|.|.+ .|+||+|+|+|++.|+. ..+
T Consensus 296 ~~gI~G~V~D~~--~g~pl~~AtV~V~g-------------~~~~~~Td~~G~f~~-~l~pG~ytl~vs~~GY~---~~~ 356 (375)
T cd03863 296 HRGVRGFVLDAT--DGRGILNATISVAD-------------INHPVTTYKDGDYWR-LLVPGTYKVTASARGYD---PVT 356 (375)
T ss_pred cCeEEEEEEeCC--CCCCCCCeEEEEec-------------CcCceEECCCccEEE-ccCCeeEEEEEEEcCcc---cEE
Confidence 478999998653 37889999999953 345688999999999 69999999999998865 455
Q ss_pred eEEEEeCCceeeecceEE
Q 011825 257 ALVTITSGSNIKMGDLVY 274 (476)
Q Consensus 257 ~~VtV~aG~t~~lg~l~~ 274 (476)
.+|+|.+|+++.+ ++.+
T Consensus 357 ~~v~V~~~~~~~~-~~~L 373 (375)
T cd03863 357 KTVEVDSKGAVQV-NFTL 373 (375)
T ss_pred EEEEEcCCCcEEE-EEEe
Confidence 5799999999888 5766
No 7
>cd03865 M14_CPE_H Peptidase M14 Carboxypeptidase (CP) E (CPE, also known as carboxypeptidase H, and enkephalin convertase; EC 3.4.17.10) belongs to the N/E subfamily of the M14 family of metallocarboxypeptidases (MCPs). The M14 family are zinc-binding CPs which hydrolyze single, C-terminal amino acids from polypeptide chains, and have a recognition site for the free C-terminal carboxyl group, which is a key determinant of specificity. CPE is an important enzyme responsible for the proteolytic processing of prohormone intermediates (such as pro-insulin, pro-opiomelanocortin, or pro-gonadotropin-releasing hormone) by specifically removing C-terminal basic residues. In addition, it has been proposed that the regulated secretory pathway (RSP) of the nervous and endocrine systems utilizes membrane-bound CPE as a sorting receptor. A naturally occurring point mutation in CPE reduces the stability of the enzyme and causes its degradation, leading to an accumulation of numerous neuroendocrine pe
Probab=98.11 E-value=5.9e-06 Score=87.44 Aligned_cols=96 Identities=22% Similarity=0.326 Sum_probs=73.1
Q ss_pred cchhHHHHHHHHhhhhcCCCCCCCCCCCCCCCCCCeEEEEEEEEecCCCccccCceEEEecCCCCCCCccccccccceEE
Q 011825 143 PLWLWEDAKIKMMSEVQSWPYNFPASEDFQKSEERGCVSGRLLVQDSNDVISANGAYVGLAPPGDVGSWQTECKDYQFWT 222 (476)
Q Consensus 143 ~~~l~~Da~~~~~~E~~~wpy~f~~s~~y~~~s~RGtVsG~v~~~d~~~~~pa~~a~V~L~~~~~~g~~q~~~~~yqYwt 222 (476)
...+|++-|.-+. .|.....|| |+|+|+... +.|..+|.|.+.+. ...+
T Consensus 306 L~~~W~~n~~all--------------~~~~q~~~g-I~G~V~D~~---g~pI~~AtV~V~g~-------------~~~~ 354 (402)
T cd03865 306 LKQYWEDNKNSLV--------------NYIEQVHRG-VKGFVKDLQ---GNPIANATISVEGI-------------DHDI 354 (402)
T ss_pred HHHHHHHHHHHHH--------------HHHHHhccc-eEEEEECCC---CCcCCCeEEEEEcC-------------cccc
Confidence 4567877766543 233334577 999998654 68889999999542 3456
Q ss_pred EeCCCcceEeCcccCcceEEEEEECceeeeeeeeeEEEEeCCceeeecceEE
Q 011825 223 TADEDGCFSIKNIRTGNYNLYAWVPGFVGDYRSDALVTITSGSNIKMGDLVY 274 (476)
Q Consensus 223 ~td~~G~FtI~nV~pGtY~L~a~~~G~~G~~~~~~~VtV~aG~t~~lg~l~~ 274 (476)
.||.+|.|.+ +++||+|+|+|.+.|+. .....|+|.+++++.+ ++++
T Consensus 355 ~T~~~G~Y~~-~L~pG~Ytv~vsa~Gy~---~~~~~V~V~~~~~~~v-df~L 401 (402)
T cd03865 355 TSAKDGDYWR-LLAPGNYKLTASAPGYL---AVVKKVAVPYSPAVRV-DFEL 401 (402)
T ss_pred EECCCeeEEE-CCCCEEEEEEEEecCcc---cEEEEEEEcCCCcEEE-eEEe
Confidence 8999999998 89999999999998876 4456799999998887 4765
No 8
>cd03864 M14_CPN Peptidase M14 Carboxypeptidase N (CPN, also known as kininase I, creatine kinase conversion factor, plasma carboxypeptidase B, arginine carboxypeptidase, and protaminase; EC 3.4.17.3) is an extracellular glycoprotein synthesized in the liver and released into the blood, where it is present in high concentrations. CPN belongs to the N/E subfamily of the M14 family of metallocarboxypeptidases (MCPs). The M14 family are zinc-binding carboxypeptidases (CPs) which hydrolyze single, C-terminal amino acids from polypeptide chains, and have a recognition site for the free C-terminal carboxyl group, which is a key determinant of specificity. CPN plays an important role in protecting the body from excessive buildup of potentially deleterious peptides that normally act as local autocrine or paracrine hormones. It specifically removes C-terminal basic residues. As CPN can cleave lysine more avidly than arginine residues, it is also called lysine carboxypeptidase. CPN substrates inclu
Probab=98.09 E-value=1.1e-05 Score=85.28 Aligned_cols=76 Identities=16% Similarity=0.238 Sum_probs=62.8
Q ss_pred eEEEEEEEEecCCCccccCceEEEecCCCCCCCccccccccceEEEeCCCcceEeCcccCcceEEEEEECceeeeeeeee
Q 011825 178 GCVSGRLLVQDSNDVISANGAYVGLAPPGDVGSWQTECKDYQFWTTADEDGCFSIKNIRTGNYNLYAWVPGFVGDYRSDA 257 (476)
Q Consensus 178 GtVsG~v~~~d~~~~~pa~~a~V~L~~~~~~g~~q~~~~~yqYwt~td~~G~FtI~nV~pGtY~L~a~~~G~~G~~~~~~ 257 (476)
..|+|+|+..+ +.|..+|+|.+.+ ....+.||++|.| +.+++||+|+|.+++.|+. .++.
T Consensus 316 ~gI~G~V~D~~---g~pi~~A~V~v~g-------------~~~~~~T~~~G~y-~r~l~pG~Y~l~vs~~Gy~---~~t~ 375 (392)
T cd03864 316 QGIKGMVTDEN---NNGIANAVISVSG-------------ISHDVTSGTLGDY-FRLLLPGTYTVTASAPGYQ---PSTV 375 (392)
T ss_pred CeEEEEEECCC---CCccCCeEEEEEC-------------CccceEECCCCcE-EecCCCeeEEEEEEEcCce---eEEE
Confidence 48999998765 6899999999954 3456889999999 9999999999999998866 5666
Q ss_pred EEEEeCCceeeecceEE
Q 011825 258 LVTITSGSNIKMGDLVY 274 (476)
Q Consensus 258 ~VtV~aG~t~~lg~l~~ 274 (476)
+|+|.+++++.+ ++++
T Consensus 376 ~v~V~~~~~~~~-df~L 391 (392)
T cd03864 376 TVTVGPAEATLV-NFQL 391 (392)
T ss_pred EEEEcCCCcEEE-eeEe
Confidence 799999988776 4655
No 9
>cd06245 M14_CPD_III The third carboxypeptidase (CP)-like domain of Carboxypeptidase D (CPD; EC 3.4.17.22), domain III. CPD differs from all other metallocarboxypeptidases in that it contains multiple CP-like domains. CPD belongs to the N/E-like subfamily of the M14 family of metallocarboxypeptidases (MCPs). The M14 family are zinc-binding CPs which hydrolyze single, C-terminal amino acids from polypeptide chains, and have a recognition site for the free C-terminal carboxyl group, which is a key determinant of specificity. CPD is a single-chain protein containing a signal peptide, three tandem repeats of CP-like domains separated by short bridge regions, followed by a transmembrane domain, and a C-terminal cytosolic tail. The first two CP-like domains of CPD contain all of the essential active site and substrate-binding residues, while the third CP-like domain lacks critical residues necessary for enzymatic activity and is inactive towards standard CP substrates. Domain I is optimally active a
Probab=97.92 E-value=4.2e-05 Score=80.16 Aligned_cols=75 Identities=17% Similarity=0.251 Sum_probs=62.2
Q ss_pred eEEEEEEEEecCCCccccCceEEEecCCCCCCCccccccccceEEEeCCCcceEeCcccCcceEEEEEECceeeeeeeee
Q 011825 178 GCVSGRLLVQDSNDVISANGAYVGLAPPGDVGSWQTECKDYQFWTTADEDGCFSIKNIRTGNYNLYAWVPGFVGDYRSDA 257 (476)
Q Consensus 178 GtVsG~v~~~d~~~~~pa~~a~V~L~~~~~~g~~q~~~~~yqYwt~td~~G~FtI~nV~pGtY~L~a~~~G~~G~~~~~~ 257 (476)
-.|+|+|+..+ +.|..+|+|.+.+. . .+.||.+|.|.+. ++||+|+|.+...|+. ..+.
T Consensus 287 ~gI~G~V~d~~---g~pi~~A~V~v~g~-------------~-~~~T~~~G~y~~~-L~pG~y~v~vs~~Gy~---~~~~ 345 (363)
T cd06245 287 KGVHGVVTDKA---GKPISGATIVLNGG-------------H-RVYTKEGGYFHVL-LAPGQHNINVIAEGYQ---QEHL 345 (363)
T ss_pred cEEEEEEEcCC---CCCccceEEEEeCC-------------C-ceEeCCCcEEEEe-cCCceEEEEEEEeCce---eEEE
Confidence 57999998754 78999999999631 2 5779999999997 9999999999998765 5566
Q ss_pred EEEEeCCceeeecceEE
Q 011825 258 LVTITSGSNIKMGDLVY 274 (476)
Q Consensus 258 ~VtV~aG~t~~lg~l~~ 274 (476)
+|+|.+++++.+ ++++
T Consensus 346 ~V~v~~~~~~~~-~f~L 361 (363)
T cd06245 346 PVVVSHDEASSV-KIVL 361 (363)
T ss_pred EEEEcCCCeEEE-EEEe
Confidence 799999998877 5766
No 10
>cd03858 M14_CP_N-E_like Carboxypeptidase (CP) N/E-like subfamily of the M14 family of metallocarboxypeptidases (MCPs). The M14 family are zinc-binding CPs which hydrolyze single, C-terminal amino acids from polypeptide chains, and have a recognition site for the free C-terminal carboxyl group, which is a key determinant of specificity. The N/E subfamily includes eight members, of which five (CPN, CPE, CPM, CPD, CPZ) are considered enzymatically active, while the other three are non-active (CPX1, CPX2, ACLP/AEBP1) and lack the critical active site and substrate-binding residues considered necessary for CP activity. These non-active members may function as binding proteins or display catalytic activity towards other substrates. Unlike the A/B CP subfamily, enzymes belonging to the N/E subfamily are not produced as inactive precursors that require proteolysis to produce the active form; rather, they rely on their substrate specificity and subcellular compartmentalization to prevent inappr
Probab=97.76 E-value=9.9e-05 Score=77.36 Aligned_cols=72 Identities=19% Similarity=0.251 Sum_probs=59.1
Q ss_pred eEEEEEEEEecCCCccccCceEEEecCCCCCCCccccccccceEEEeCCCcceEeCcccCcceEEEEEECceeeeeeeee
Q 011825 178 GCVSGRLLVQDSNDVISANGAYVGLAPPGDVGSWQTECKDYQFWTTADEDGCFSIKNIRTGNYNLYAWVPGFVGDYRSDA 257 (476)
Q Consensus 178 GtVsG~v~~~d~~~~~pa~~a~V~L~~~~~~g~~q~~~~~yqYwt~td~~G~FtI~nV~pGtY~L~a~~~G~~G~~~~~~ 257 (476)
.+|+|+|+..+ +.|..+|.|.+. +....+.||.+|.|.+. ++||+|+|.+...|+. .++.
T Consensus 298 ~~i~G~V~d~~---g~pl~~A~V~i~-------------~~~~~~~Td~~G~f~~~-l~~G~y~l~vs~~Gy~---~~~~ 357 (374)
T cd03858 298 RGIKGFVRDAN---GNPIANATISVE-------------GINHDVTTAEDGDYWRL-LLPGTYNVTASAPGYE---PQTK 357 (374)
T ss_pred CceEEEEECCC---CCccCCeEEEEe-------------cceeeeEECCCceEEEe-cCCEeEEEEEEEcCcc---eEEE
Confidence 48999998764 678899999994 34678999999999986 7999999999998755 4555
Q ss_pred EEEEeC-Cceeee
Q 011825 258 LVTITS-GSNIKM 269 (476)
Q Consensus 258 ~VtV~a-G~t~~l 269 (476)
+|.|.. |+++.+
T Consensus 358 ~v~v~~~g~~~~~ 370 (374)
T cd03858 358 SVVVPNDNSAVVV 370 (374)
T ss_pred EEEEecCCceEEE
Confidence 688877 888776
No 11
>cd03868 M14_CPD_I The first carboxypeptidase (CP)-like domain of Carboxypeptidase D (CPD; EC 3.4.17.22), domain I. CPD differs from all other metallocarboxypeptidases in that it contains multiple CP-like domains. CPD belongs to the N/E-like subfamily of the M14 family of metallocarboxypeptidases (MCPs). The M14 family are zinc-binding CPs which hydrolyze single, C-terminal amino acids from polypeptide chains, and have a recognition site for the free C-terminal carboxyl group, which is a key determinant of specificity. CPD is a single-chain protein containing a signal peptide, three tandem repeats of CP-like domains separated by short bridge regions, followed by a transmembrane domain, and a C-terminal cytosolic tail. The first two CP-like domains of CPD contain all of the essential active site and substrate-binding residues, while the third CP-like domain lacks critical residues necessary for enzymatic activity and is inactive towards standard CP substrates. Domain I is optimally active at p
Probab=97.76 E-value=7.6e-05 Score=78.34 Aligned_cols=77 Identities=17% Similarity=0.155 Sum_probs=60.3
Q ss_pred CeEEEEEEEEecCCCccccCceEEEecCCCCCCCccccccccceEEEeCCCcceEeCcccCcceEEEEEECceeeeeeee
Q 011825 177 RGCVSGRLLVQDSNDVISANGAYVGLAPPGDVGSWQTECKDYQFWTTADEDGCFSIKNIRTGNYNLYAWVPGFVGDYRSD 256 (476)
Q Consensus 177 RGtVsG~v~~~d~~~~~pa~~a~V~L~~~~~~g~~q~~~~~yqYwt~td~~G~FtI~nV~pGtY~L~a~~~G~~G~~~~~ 256 (476)
.+.|+|+|+..+ +.|..+|.|.|.+. ...+.||++|.|.+ +++||+|+|.+...|+.- ...
T Consensus 295 ~~~i~G~V~d~~---g~pv~~A~V~v~~~-------------~~~~~td~~G~y~~-~l~~G~Y~l~vs~~Gf~~--~~~ 355 (372)
T cd03868 295 HIGVKGFVRDAS---GNPIEDATIMVAGI-------------DHNVTTAKFGDYWR-LLLPGTYTITAVAPGYEP--STV 355 (372)
T ss_pred CCceEEEEEcCC---CCcCCCcEEEEEec-------------ccceEeCCCceEEe-cCCCEEEEEEEEecCCCc--eEE
Confidence 478999998764 78999999999642 35689999999984 799999999999988662 122
Q ss_pred eEEEEeCCceeeecceE
Q 011825 257 ALVTITSGSNIKMGDLV 273 (476)
Q Consensus 257 ~~VtV~aG~t~~lg~l~ 273 (476)
..|+|.+|+++.+ +++
T Consensus 356 ~~v~v~~g~~~~~-~~~ 371 (372)
T cd03868 356 TDVVVKEGEATSV-NFT 371 (372)
T ss_pred eeEEEcCCCeEEE-eeE
Confidence 3477999998877 353
No 12
>cd03867 M14_CPZ Peptidase M14-like domain of carboxypeptidase (CP) Z (CPZ). CPZ belongs to the N/E subfamily of the M14 family of metallocarboxypeptidases (MCPs). The M14 family are zinc-binding CPs which hydrolyze single, C-terminal amino acids from polypeptide chains, and have a recognition site for the free C-terminal carboxyl group, which is a key determinant of specificity. CPZ is a secreted Zn-dependent enzyme whose biological function is largely unknown. Unlike other members of the N/E subfamily, CPZ has a bipartite structure, which consists of an N-terminal cysteine-rich domain (CRD) whose sequence is similar to Wnt-binding proteins, and a C-terminal CP catalytic domain that removes C-terminal Arg residues from substrates. CPZ is enriched in the extracellular matrix and is widely distributed during early embryogenesis. That the CRD of CPZ can bind to Wnt4 suggests that CPZ plays a role in Wnt signaling.
Probab=97.49 E-value=0.00036 Score=74.00 Aligned_cols=72 Identities=19% Similarity=0.215 Sum_probs=56.9
Q ss_pred eEEEEEEEEecCCCccccCceEEEecCCCCCCCccccccccceEEEeCCCcceEeCcccCcceEEEEEECceeeeeeeee
Q 011825 178 GCVSGRLLVQDSNDVISANGAYVGLAPPGDVGSWQTECKDYQFWTTADEDGCFSIKNIRTGNYNLYAWVPGFVGDYRSDA 257 (476)
Q Consensus 178 GtVsG~v~~~d~~~~~pa~~a~V~L~~~~~~g~~q~~~~~yqYwt~td~~G~FtI~nV~pGtY~L~a~~~G~~G~~~~~~ 257 (476)
-.|+|+|+..+ +.|..+|.|.|.+ ....+.||++|.|. .+++||+|+|.+...|+. ....
T Consensus 318 ~~i~G~V~D~~---g~pi~~A~V~v~g-------------~~~~~~Td~~G~y~-~~l~~G~y~l~vs~~Gy~---~~~~ 377 (395)
T cd03867 318 RGIKGFVKDKD---GNPIKGARISVRG-------------IRHDITTAEDGDYW-RLLPPGIHIVSAQAPGYT---KVMK 377 (395)
T ss_pred ceeEEEEEcCC---CCccCCeEEEEec-------------cccceEECCCceEE-EecCCCcEEEEEEecCee---eEEE
Confidence 36999999765 6899999999953 36678999999997 689999999999998765 5556
Q ss_pred EEEEeC--Cceeee
Q 011825 258 LVTITS--GSNIKM 269 (476)
Q Consensus 258 ~VtV~a--G~t~~l 269 (476)
+|+|.. ++...+
T Consensus 378 ~v~v~~~~~~~~~~ 391 (395)
T cd03867 378 RVTLPARMKRAGRV 391 (395)
T ss_pred EEEeCCcCCCceEe
Confidence 688865 444444
No 13
>cd03866 M14_CPM Peptidase M14 Carboxypeptidase (CP) M (CPM) belongs to the N/E subfamily of the M14 family of metallocarboxypeptidases (MCPs). The M14 family are zinc-binding CPs which hydrolyze single, C-terminal amino acids from polypeptide chains, and have a recognition site for the free C-terminal carboxyl group, which is a key determinant of specificity. CPM is an extracellular glycoprotein, bound to cell membranes via a glycosyl-phosphatidylinositol on the C-terminus of the protein. It specifically removes C-terminal basic residues such as lysine and arginine from peptides and proteins. The highest levels of CPM have been found in human lung and placenta, but significant amounts are present in kidney, blood vessels, intestine, brain, and peripheral nerves. CPM has also been found in soluble form in various body fluids, including amniotic fluid, seminal plasma and urine. Due to its wide distribution in a variety of tissues, it is believed that it plays an important role in the cont
Probab=96.92 E-value=0.0038 Score=65.89 Aligned_cols=70 Identities=17% Similarity=0.158 Sum_probs=53.0
Q ss_pred CeEEEEEEEEecCCCccccCceEEEecCCCCCCCccccccccceEEEeCCCcceEeCcccCcceEEEEEECceeeeeeee
Q 011825 177 RGCVSGRLLVQDSNDVISANGAYVGLAPPGDVGSWQTECKDYQFWTTADEDGCFSIKNIRTGNYNLYAWVPGFVGDYRSD 256 (476)
Q Consensus 177 RGtVsG~v~~~d~~~~~pa~~a~V~L~~~~~~g~~q~~~~~yqYwt~td~~G~FtI~nV~pGtY~L~a~~~G~~G~~~~~ 256 (476)
.+.|+|+|+..+ +.|..+|.|.+.+. +...-+.||++|.|.+. ++||+|+|.+.+.|+. ...
T Consensus 294 ~~gI~G~V~D~~---g~pi~~A~V~v~g~-----------~~~~~~~T~~~G~y~~~-l~pG~Y~v~vsa~Gy~---~~~ 355 (376)
T cd03866 294 HLGVKGQVFDSN---GNPIPNAIVEVKGR-----------KHICPYRTNVNGEYFLL-LLPGKYMINVTAPGFK---TVI 355 (376)
T ss_pred cCceEEEEECCC---CCccCCeEEEEEcC-----------CceeEEEECCCceEEEe-cCCeeEEEEEEeCCcc---eEE
Confidence 467999998543 68999999999642 11233469999999775 9999999999998865 445
Q ss_pred eEEEEeCC
Q 011825 257 ALVTITSG 264 (476)
Q Consensus 257 ~~VtV~aG 264 (476)
.+|.|.+.
T Consensus 356 ~~v~v~~~ 363 (376)
T cd03866 356 TNVIIPYN 363 (376)
T ss_pred EEEEeCCC
Confidence 56777753
No 14
>PRK15036 hydroxyisourate hydrolase; Provisional
Probab=96.21 E-value=0.011 Score=54.08 Aligned_cols=65 Identities=22% Similarity=0.275 Sum_probs=48.7
Q ss_pred CCeEEEEEEEEecCCCccccCceEEEecCCCCCCCccccccccceEEEeCCCcceEe---Cc-ccCcceEEEEEECc
Q 011825 176 ERGCVSGRLLVQDSNDVISANGAYVGLAPPGDVGSWQTECKDYQFWTTADEDGCFSI---KN-IRTGNYNLYAWVPG 248 (476)
Q Consensus 176 ~RGtVsG~v~~~d~~~~~pa~~a~V~L~~~~~~g~~q~~~~~yqYwt~td~~G~FtI---~n-V~pGtY~L~a~~~G 248 (476)
+.+.|+++|+... .++||.++.|-|....+ +.|.. -.-+.||++|+|.. .+ +.||.|+|.....+
T Consensus 25 ~~~~Is~HVLDt~--~G~PA~gV~V~L~~~~~-~~w~~-----l~~~~Td~dGR~~~l~~~~~~~~G~Y~L~F~t~~ 93 (137)
T PRK15036 25 QQNILSVHILNQQ--TGKPAADVTVTLEKKAD-NGWLQ-----LNTAKTDKDGRIKALWPEQTATTGDYRVVFKTGD 93 (137)
T ss_pred cCCCeEEEEEeCC--CCcCCCCCEEEEEEccC-CceEE-----EEEEEECCCCCCccccCcccCCCeeEEEEEEcch
Confidence 4467999998655 48999999999975422 12322 35578999999986 34 88999999998644
No 15
>PF08400 phage_tail_N: Prophage tail fibre N-terminal; InterPro: IPR013609 This entry represents the N terminus of phage 933W tail fibre protein. The characteristics of the protein distribution suggest prophage matches.
Probab=95.83 E-value=0.049 Score=49.62 Aligned_cols=77 Identities=18% Similarity=0.110 Sum_probs=54.1
Q ss_pred EEEEEEEEecCCCccccCceEEEecCCCCCCCccccccccceEEEeCCCcceEeCcccCcceEEEEEECceeeeeeeeeE
Q 011825 179 CVSGRLLVQDSNDVISANGAYVGLAPPGDVGSWQTECKDYQFWTTADEDGCFSIKNIRTGNYNLYAWVPGFVGDYRSDAL 258 (476)
Q Consensus 179 tVsG~v~~~d~~~~~pa~~a~V~L~~~~~~g~~q~~~~~yqYwt~td~~G~FtI~nV~pGtY~L~a~~~G~~G~~~~~~~ 258 (476)
.|||.++.-. ++|..+..+.|..-... ..==.+.-=+..|++.|.|+|+ +.||.|.+++...|..- .+-..
T Consensus 4 ~ISGvL~dg~---G~pv~g~~I~L~A~~tS---~~Vv~~t~as~~t~~~G~Ys~~-~epG~Y~V~l~~~g~~~--~~vG~ 74 (134)
T PF08400_consen 4 KISGVLKDGA---GKPVPGCTITLKARRTS---STVVVGTVASVVTGEAGEYSFD-VEPGVYRVTLKVEGRPP--VYVGD 74 (134)
T ss_pred EEEEEEeCCC---CCcCCCCEEEEEEccCc---hheEEEEEEEEEcCCCceEEEE-ecCCeEEEEEEECCCCc--eeEEE
Confidence 6899888644 89999999999642110 0001244556788999999996 99999999999988542 23245
Q ss_pred EEEeCC
Q 011825 259 VTITSG 264 (476)
Q Consensus 259 VtV~aG 264 (476)
|+|.+.
T Consensus 75 I~V~~d 80 (134)
T PF08400_consen 75 ITVYED 80 (134)
T ss_pred EEEecC
Confidence 777754
No 16
>PF08308 PEGA: PEGA domain; InterPro: IPR013229 This domain is found in both archaea and bacteria and has similarity to S-layer (surface layer) proteins. It is named after the characteristic PEGA sequence motif found in this domain. The secondary structure of this domain is predicted to be beta-strands.
Probab=95.77 E-value=0.061 Score=42.74 Aligned_cols=45 Identities=20% Similarity=0.411 Sum_probs=36.7
Q ss_pred cceEeCcccCcceEEEEEECceeeeeeeeeEEEEeCCceeeecceEEcC
Q 011825 228 GCFSIKNIRTGNYNLYAWVPGFVGDYRSDALVTITSGSNIKMGDLVYEP 276 (476)
Q Consensus 228 G~FtI~nV~pGtY~L~a~~~G~~G~~~~~~~VtV~aG~t~~lg~l~~~~ 276 (476)
...++..+++|.|+|.+..+|+. ..+..|.|.+|++..+ ++.+++
T Consensus 25 tp~~~~~l~~G~~~v~v~~~Gy~---~~~~~v~v~~~~~~~v-~~~L~~ 69 (71)
T PF08308_consen 25 TPLTLKDLPPGEHTVTVEKPGYE---PYTKTVTVKPGETTTV-NVTLEP 69 (71)
T ss_pred CcceeeecCCccEEEEEEECCCe---eEEEEEEECCCCEEEE-EEEEEE
Confidence 34578889999999999998865 5556799999999888 477744
No 17
>PF03422 CBM_6: Carbohydrate binding module (family 6); InterPro: IPR005084 A carbohydrate-binding module (CBM) is defined as a contiguous amino acid sequence within a carbohydrate-active enzyme with a discrete fold having carbohydrate-binding activity. A few exceptions are CBMs in cellulosomal scaffolding proteins and rare instances of independent putative CBMs. The requirement of CBMs existing as modules within larger enzymes sets this class of carbohydrate-binding protein apart from other non-catalytic sugar binding proteins such as lectins and sugar transport proteins. CBMs were previously classified as cellulose-binding domains (CBDs) based on the initial discovery of several modules that bound cellulose [, ]. However, additional modules in carbohydrate-active enzymes are continually being found that bind carbohydrates other than cellulose yet otherwise meet the CBM criteria, hence the need to reclassify these polypeptides using more inclusive terminology. Previous classifications of cellulose-binding domains were based on amino acid similarity. Groupings of CBDs were called "Types" and numbered with Roman numerals (e.g. Type I or Type II CBDs). In keeping with the glycoside hydrolase classification, these groupings are now called families and numbered with Arabic numerals. Families 1 to 13 are the same as Types I to XIII. For a detailed review on the structure and binding modes of CBMs see []. This entry represents CBM6 from CAZY which was previously known as cellulose-binding domain family VI (CBD VI). CBM6 binds to amorphous cellulose, xylan, mixed beta-(1,3)(1,4)glucan and beta-1,3-glucan[, , ]. CBM6 adopts a classic lectin-like beta-jelly roll fold, predominantly consisting of five antiparallel beta-strands on one face and four antiparallel beta-strands on the other face. It contains two potential ligand binding sites, named respectively cleft A and B. These clefts include aromatic residues which are probably involved in the substrate binding. 
The cleft B is located on the concave surface of one beta-sheet, and the cleft A on one edge of the protein between the loop that connects the inner and outer beta-sheets of the jellyroll fold []. The multiple binding clefts confer the extensive range of specificities displayed by the domain [, , ].; GO: 0030246 carbohydrate binding; PDB: 1UY1_A 1UY3_A 1UY4_A 1UY2_A 1UYY_A 1UXZ_B 1UYZ_A 1UY0_B 1UYX_A 1UZ0_A ....
Probab=95.21 E-value=0.16 Score=44.29 Aligned_cols=81 Identities=17% Similarity=0.331 Sum_probs=50.3
Q ss_pred CCCcEEEEEEEeccCC-CeEEEEEcCCCCCCCcccccccCCCCeeeceeeeeecEEEEEEeeCCCeeeeecEEEEEeecC
Q 011825 378 RNSSYKLRVAIASATL-AELQVRVNDPNANRPLFTTGLIGRDNAIARHGIHGLYLLYHVNIPGTRFIEGENTIFLKQPRC 456 (476)
Q Consensus 378 ~~~~~tLriala~a~~-~~~~V~vN~~~~~~~~~~~~~~~~~~~i~R~~~~G~~~~~~~~ipa~~L~~G~NtI~l~~~~g 456 (476)
..+.|+|++..|.... ++++|+||+... ++..+..++.... --.|...+..| .|.+|.|+|+|....+
T Consensus 43 ~~g~y~~~~~~a~~~~~~~~~l~id~~~g--~~~~~~~~~~tg~------w~~~~~~~~~v---~l~~G~h~i~l~~~~~ 111 (125)
T PF03422_consen 43 EAGTYTLTIRYANGGGGGTIELRIDGPDG--TLIGTVSLPPTGG------WDTWQTVSVSV---KLPAGKHTIYLVFNGG 111 (125)
T ss_dssp SSEEEEEEEEEEESSSSEEEEEEETTTTS--EEEEEEEEE-ESS------TTEEEEEEEEE---EEESEEEEEEEEESSS
T ss_pred CCceEEEEEEEECCCCCcEEEEEECCCCC--cEEEEEEEcCCCC------ccccEEEEEEE---eeCCCeeEEEEEEECC
Confidence 3678889888888664 799999999322 1222222221110 01234444444 4667999999998765
Q ss_pred CCCCceEEEEEEEEe
Q 011825 457 TSPFQGIMYDYIRLE 471 (476)
Q Consensus 457 ss~~~~vmyD~IrLe 471 (476)
.+ ..+-.|+|+|+
T Consensus 112 ~~--~~~niD~~~f~ 124 (125)
T PF03422_consen 112 DG--WAFNIDYFQFT 124 (125)
T ss_dssp SS--B-EEEEEEEEE
T ss_pred CC--ceEEeEEEEEE
Confidence 43 35889999886
No 18
>cd03869 M14_CPX_like Peptidase M14-like domain of carboxypeptidase (CP)-like protein X (CPX). CPX forms a distinct subgroup of the N/E subfamily of the M14 family of metallocarboxypeptidases (MCPs). The M14 family are zinc-binding CPs which hydrolyze single, C-terminal amino acids from polypeptide chains, and have a recognition site for the free C-terminal carboxyl group, which is a key determinant of specificity. Proteins belonging to this subgroup include CP-like protein X1 (CPX1), CP-like protein X2 (CPX2), and aortic CP-like protein (ACLP) and its isoform adipocyte enhancer binding protein-1 (AEBP1). AEBP1 is a truncated form of ACLP, which may arise from alternative splicing of the gene. These proteins are inactive towards standard CP substrates because they lack one or more critical active site and substrate-binding residues that are necessary for activity. They may function as binding proteins rather than as active CPs or display catalytic activity toward other substrates. Pro
Probab=94.69 E-value=0.077 Score=56.64 Aligned_cols=67 Identities=16% Similarity=0.224 Sum_probs=48.9
Q ss_pred CeEEEEEEEEecCCCccccCceEEEecCCCCCCCccccccccceEEEeCCCcceEeCcccCcceEEEEEECceeeeeeee
Q 011825 177 RGCVSGRLLVQDSNDVISANGAYVGLAPPGDVGSWQTECKDYQFWTTADEDGCFSIKNIRTGNYNLYAWVPGFVGDYRSD 256 (476)
Q Consensus 177 RGtVsG~v~~~d~~~~~pa~~a~V~L~~~~~~g~~q~~~~~yqYwt~td~~G~FtI~nV~pGtY~L~a~~~G~~G~~~~~ 256 (476)
|| |+|.|+... +.|..+|+|.+.. ......|.++|.|-- =+.||+|+++|.++|+. ...
T Consensus 329 ~G-ikG~V~d~~---g~~i~~a~i~v~g-------------~~~~v~t~~~GdywR-ll~pG~y~v~~~a~gy~---~~~ 387 (405)
T cd03869 329 RG-IKGVVRDKT---GKGIPNAIISVEG-------------INHDIRTASDGDYWR-LLNPGEYRVTAHAEGYT---SST 387 (405)
T ss_pred cC-ceEEEECCC---CCcCCCcEEEEec-------------CccceeeCCCCceEE-ecCCceEEEEEEecCCC---ccc
Confidence 44 899998654 7788899999853 234456778886543 28999999999998865 455
Q ss_pred eEEEEeCC
Q 011825 257 ALVTITSG 264 (476)
Q Consensus 257 ~~VtV~aG 264 (476)
.+|+|..+
T Consensus 388 ~~~~v~~~ 395 (405)
T cd03869 388 KNCEVGYE 395 (405)
T ss_pred EEEEEcCC
Confidence 56777754
No 19
>PF05738 Cna_B: Cna protein B-type domain; InterPro: IPR008454 This entry represents a repeated B region domain found in the collagen-binding surface protein Cna in Staphylococcus aureus, as well as other related domains. The B region domain of Cna has a prealbumin-like beta-sandwich fold of seven strands in two sheets with a Greek key topology []. However, this domain does not mediate collagen binding (the region IPR008456 from INTERPRO carries out that function); instead it appears to form a stalk that presents the ligand binding domain away from the bacterial cell surface. Cna is a collagen-binding MSCRAMM (Microbial Surface Component Recognizing Adhesive Matrix Molecules), and is necessary and sufficient for S. aureus cells to adhere to cartilage.; PDB: 2X5P_A 3RKP_A 3KPT_A 1VLF_T 1TI2_F 1TI6_D 1TI4_J 1VLE_V 1VLD_X 3PF2_A ....
Probab=94.44 E-value=0.18 Score=39.69 Aligned_cols=45 Identities=31% Similarity=0.494 Sum_probs=31.4
Q ss_pred EEeCCCcceEeCcccCcceEEEEEE--CceeeeeeeeeEEEEeCCcee
Q 011825 222 TTADEDGCFSIKNIRTGNYNLYAWV--PGFVGDYRSDALVTITSGSNI 267 (476)
Q Consensus 222 t~td~~G~FtI~nV~pGtY~L~a~~--~G~~G~~~~~~~VtV~aG~t~ 267 (476)
..+|++|.|.|++++||+|+|.--. .|+.-. .....++|..++..
T Consensus 21 ~~Td~~G~~~f~~L~~G~Y~l~E~~aP~GY~~~-~~~~~~~i~~~~~~ 67 (70)
T PF05738_consen 21 VTTDENGKYTFKNLPPGTYTLKETKAPDGYQLD-DTPYEFTITEDGDV 67 (70)
T ss_dssp EEGGTTSEEEEEEEESEEEEEEEEETTTTEEEE-ECEEEEEECTTSCE
T ss_pred EEECCCCEEEEeecCCeEEEEEEEECCCCCEEC-CCceEEEEecCCEE
Confidence 5689999999999999999999876 443311 11223666665543
No 20
>PF09430 DUF2012: Protein of unknown function (DUF2012); InterPro: IPR019008 This domain is found in different proteins, including uncharacterised protein family UPF0480 and nodal modulators. A nodal modulator has been identified as part of a protein complex that participates in the nodal signaling pathway during vertebrate development [].
Probab=94.01 E-value=0.21 Score=44.53 Aligned_cols=40 Identities=25% Similarity=0.480 Sum_probs=31.5
Q ss_pred eEEEeCCCcceEeCcccCcceEEEEEECceeeeeeeeeEEEEe
Q 011825 220 FWTTADEDGCFSIKNIRTGNYNLYAWVPGFVGDYRSDALVTIT 262 (476)
Q Consensus 220 Ywt~td~~G~FtI~nV~pGtY~L~a~~~G~~G~~~~~~~VtV~ 262 (476)
+-+...++|.|.|.||++|+|.|.+-...+. |.. -.|.|.
T Consensus 22 ~~~~v~~dG~F~f~~Vp~GsY~L~V~s~~~~--F~~-~RVdV~ 61 (123)
T PF09430_consen 22 ISAFVRSDGSFVFHNVPPGSYLLEVHSPDYV--FPP-YRVDVS 61 (123)
T ss_pred eEEEecCCCEEEeCCCCCceEEEEEECCCcc--ccC-EEEEEe
Confidence 3788999999999999999999999876543 222 347776
No 21
>KOG1948 consensus Metalloproteinase-related collagenase pM5 [Posttranslational modification, protein turnover, chaperones]
Probab=94.00 E-value=0.14 Score=58.52 Aligned_cols=55 Identities=24% Similarity=0.321 Sum_probs=45.3
Q ss_pred eEEEEEEEEecCCCccccCceEEEecCCCCCCCccccccccceEEEeCCCcceEeCc-ccCcceEEEEEECc
Q 011825 178 GCVSGRLLVQDSNDVISANGAYVGLAPPGDVGSWQTECKDYQFWTTADEDGCFSIKN-IRTGNYNLYAWVPG 248 (476)
Q Consensus 178 GtVsG~v~~~d~~~~~pa~~a~V~L~~~~~~g~~q~~~~~yqYwt~td~~G~FtI~n-V~pGtY~L~a~~~G 248 (476)
-+|+|||.+.. .+.|..++.|-+.+ +-..+||++|+|+++| +..|+|++.|-...
T Consensus 316 fSvtGRVl~g~--~g~~l~gvvvlvng--------------k~~~kTdaqGyykLen~~t~gtytI~a~keh 371 (1165)
T KOG1948|consen 316 FSVTGRVLVGS--KGLPLSGVVVLVNG--------------KSGGKTDAQGYYKLENLKTDGTYTITAKKEH 371 (1165)
T ss_pred EEeeeeEEeCC--CCCCccceEEEEcC--------------cccceEcccceEEeeeeeccCcEEEEEeccc
Confidence 47889988753 36888999998865 5667899999999999 99999999997543
No 22
>cd00421 intradiol_dioxygenase Intradiol dioxygenases catalyze the critical ring-cleavage step in the conversion of catecholate derivatives to citric acid cycle intermediates. This family contains catechol 1,2-dioxygenases and protocatechuate 3,4-dioxygenases which are mononuclear non-heme iron enzymes that catalyze the oxygenation of catecholates to aliphatic acids via the cleavage of aromatic rings. The members are intradiol-cleaving enzymes which break the catechol C1-C2 bond and utilize Fe3+, as opposed to the extradiol-cleaving enzymes which break the C2-C3 or C1-C6 bond and utilize Fe2+ and Mn+. Catechol 1,2-dioxygenases are mostly homodimers with one catalytic ferric ion per monomer. Protocatechuate 3,4-dioxygenases form more diverse oligomers.
Probab=93.91 E-value=0.14 Score=46.98 Aligned_cols=64 Identities=20% Similarity=0.285 Sum_probs=46.0
Q ss_pred CCeEEEEEEEEecCCCccccCceEEEecCCCCCCCcccccc-------ccceEEEeCCCcceEeCcccCcceEE
Q 011825 176 ERGCVSGRLLVQDSNDVISANGAYVGLAPPGDVGSWQTECK-------DYQFWTTADEDGCFSIKNIRTGNYNL 242 (476)
Q Consensus 176 ~RGtVsG~v~~~d~~~~~pa~~a~V~L~~~~~~g~~q~~~~-------~yqYwt~td~~G~FtI~nV~pGtY~L 242 (476)
..-.|.|+|+..+ +.|..+|.|-+-.....|....+.. ...-...||++|.|+|.-|+||.|.+
T Consensus 10 ~~l~l~G~V~D~~---g~pv~~A~VeiW~~d~~G~Y~~~~~~~~~~~~~~rg~~~Td~~G~y~f~ti~Pg~Y~~ 80 (146)
T cd00421 10 EPLTLTGTVLDGD---GCPVPDALVEIWQADADGRYSGQDDSGLDPEFFLRGRQITDADGRYRFRTIKPGPYPI 80 (146)
T ss_pred CEEEEEEEEECCC---CCCCCCcEEEEEecCCCCccCCcCccccCCCCCCEEEEEECCCcCEEEEEEcCCCCCC
Confidence 3458999999776 6788889988855444443332211 22334789999999999999999994
No 23
>COG3485 PcaH Protocatechuate 3,4-dioxygenase beta subunit [Secondary metabolites biosynthesis, transport, and catabolism]
Probab=92.94 E-value=0.24 Score=48.95 Aligned_cols=65 Identities=25% Similarity=0.345 Sum_probs=46.6
Q ss_pred CCeEEEEEEEEecCCCccccCceEEEecCCCCCCCccccccc-------cceE--EEeCCCcceEeCcccCcceEEE
Q 011825 176 ERGCVSGRLLVQDSNDVISANGAYVGLAPPGDVGSWQTECKD-------YQFW--TTADEDGCFSIKNIRTGNYNLY 243 (476)
Q Consensus 176 ~RGtVsG~v~~~d~~~~~pa~~a~V~L~~~~~~g~~q~~~~~-------yqYw--t~td~~G~FtI~nV~pGtY~L~ 243 (476)
+|=.|+|+|+..+ ++|..++.|=+-+-...|-.....+. ..=| +.||++|.|.|.-|+||.|--.
T Consensus 71 e~i~l~G~VlD~~---G~Pv~~A~VEiWQAda~GrY~~~~d~~~~~~~~f~g~Gr~~Td~~G~y~F~Ti~Pg~yp~~ 144 (226)
T COG3485 71 ERILLEGRVLDGN---GRPVPDALVEIWQADADGRYSHPKDSRLAPLPNFNGRGRTITDEDGEYRFRTIKPGPYPWR 144 (226)
T ss_pred ceEEEEEEEECCC---CCCCCCCEEEEEEcCCCCcccCccccccCcCccccceEEEEeCCCceEEEEEeecccccCC
Confidence 7899999999877 89999999988443333433311111 1123 6689999999999999998543
No 24
>cd03463 3,4-PCD_alpha Protocatechuate 3,4-dioxygenase (3,4-PCD), alpha subunit. 3,4-PCD catalyzes the oxidative ring cleavage of 3,4-dihydroxybenzoate to produce beta-carboxy-cis,cis-muconate. 3,4-PCDs are large aggregates of 12 protomers, each composed of an alpha- and beta-subunit and an Fe3+ ion bound in the beta-subunit at the alpha-subunit-beta-subunit interface. 3,4-PCD is a member of the aromatic dioxygenases which are non-heme iron intradiol-cleaving enzymes that break the C1-C2 bond and utilize Fe3+.
Probab=91.74 E-value=0.39 Score=46.09 Aligned_cols=63 Identities=21% Similarity=0.360 Sum_probs=46.9
Q ss_pred CCeEEEEEEEEecCCCccccCceEEEecCCCCCCCccccc-------cccceE--EEeCCCcceEeCcccCcceE
Q 011825 176 ERGCVSGRLLVQDSNDVISANGAYVGLAPPGDVGSWQTEC-------KDYQFW--TTADEDGCFSIKNIRTGNYN 241 (476)
Q Consensus 176 ~RGtVsG~v~~~d~~~~~pa~~a~V~L~~~~~~g~~q~~~-------~~yqYw--t~td~~G~FtI~nV~pGtY~ 241 (476)
..=.|+|+|+..+ ++|..+|.|-+-.....|....+. ..++.| ..||++|.|++.-|+||-|.
T Consensus 35 ~~l~l~G~V~D~~---g~Pi~gA~VeiWqad~~G~Y~~~~~~~~~~~~~f~~rGr~~TD~~G~y~F~Ti~Pg~Y~ 106 (185)
T cd03463 35 ERITLEGRVYDGD---GAPVPDAMLEIWQADAAGRYAHPADSRRRLDPGFRGFGRVATDADGRFSFTTVKPGAVP 106 (185)
T ss_pred CEEEEEEEEECCC---CCCCCCCEEEEEcCCCCCccCCcCCcccccCCCCCcEEEEEECCCCCEEEEEEcCCCcC
Confidence 4578999999655 889999999886554455443221 344445 56999999999999999986
No 25
>cd03459 3,4-PCD Protocatechuate 3,4-dioxygenase (3,4-PCD) catalyzes the oxidative ring cleavage of 3,4-dihydroxybenzoate to produce beta-carboxy-cis,cis-muconate. 3,4-PCDs are large aggregates of 12 protomers, each composed of an alpha- and beta-subunit and an Fe3+ ion bound in the beta-subunit at the alpha-beta-subunit interface. 3,4-PCD is a member of the aromatic dioxygenases which are non-heme iron intradiol-cleaving enzymes that break the C1-C2 bond and utilize Fe3+.
Probab=91.14 E-value=0.5 Score=44.21 Aligned_cols=64 Identities=23% Similarity=0.399 Sum_probs=46.9
Q ss_pred CCeEEEEEEEEecCCCccccCceEEEecCCCCCCCccccc--------cccceE--EEeCCCcceEeCcccCcceEE
Q 011825 176 ERGCVSGRLLVQDSNDVISANGAYVGLAPPGDVGSWQTEC--------KDYQFW--TTADEDGCFSIKNIRTGNYNL 242 (476)
Q Consensus 176 ~RGtVsG~v~~~d~~~~~pa~~a~V~L~~~~~~g~~q~~~--------~~yqYw--t~td~~G~FtI~nV~pGtY~L 242 (476)
.+=.|+|+|...+ +.|..+|.|-+-.....|....+. .++..| ..||++|.|+|.-|+||-|.+
T Consensus 14 ~~l~l~g~V~D~~---g~Pv~~A~veiWqad~~G~Y~~~~~~~~~~~~~~f~~rG~~~Td~~G~~~f~Ti~Pg~Y~~ 87 (158)
T cd03459 14 ERIILEGRVLDGD---GRPVPDALVEIWQADAAGRYRHPRDSHRAPLDPNFTGFGRVLTDADGRYRFRTIKPGAYPW 87 (158)
T ss_pred cEEEEEEEEECCC---CCCCCCCEEEEEccCCCCccCCccCCcccccCCCCCceeEEEECCCCcEEEEEECCCCcCC
Confidence 3467899998655 889999999886555455333322 344444 468999999999999999983
No 26
>PF03170 BcsB: Bacterial cellulose synthase subunit; InterPro: IPR018513 An operon encoding 4 proteins required for bacterial cellulose biosynthesis (bcs) in Acetobacter xylinus (Gluconacetobacter xylinus) has been isolated via genetic complementation with strains lacking cellulose synthase activity []. Nucleotide sequence analysis showed the cellulose synthase operon to consist of 4 genes, designated bcsA, bcsB, bcsC and bcsD, all of which are required for maximal bacterial cellulose synthesis in A. xylinum. The calculated molecular mass of the protein encoded by bcsB is 85.3kDa []. BcsB encodes the catalytic subunit of cellulose synthase. The protein polymerises uridine 5'-diphosphate glucose to cellulose: UDP-glucose + (1,4-beta-D-glucosyl)(N) = UDP + (1,4-beta-D-glucosyl)(N+1). The enzyme is specifically activated by the nucleotide cyclic diguanylic acid. Sequence analysis suggests that BcsB contains several transmembrane (TM) domains, and shares a high degree of similarity with Escherichia coli YhjN.; GO: 0006011 UDP-glucose metabolic process, 0016020 membrane
Probab=91.13 E-value=0.64 Score=51.89 Aligned_cols=78 Identities=15% Similarity=0.140 Sum_probs=56.7
Q ss_pred ccEEEEEEeCCCCCCCcEEEEEEEeccC-----CCeEEEEEcCCCCCCCcccccccCCCCeeeceeeeeecEEEEEEeeC
Q 011825 365 TTWQIKFKLDHVDRNSSYKLRVAIASAT-----LAELQVRVNDPNANRPLFTTGLIGRDNAIARHGIHGLYLLYHVNIPG 439 (476)
Q Consensus 365 ~~w~I~F~L~~~~~~~~~tLriala~a~-----~~~~~V~vN~~~~~~~~~~~~~~~~~~~i~R~~~~G~~~~~~~~ipa 439 (476)
..-.|.|.+...+....++|+|.+.-+. .+.++|.|||..+. +..+...+. .....+|+||.
T Consensus 29 ~~~~~~f~v~~~~~v~~a~L~L~~~~S~~l~~~~S~L~V~lNg~~v~-----s~~l~~~~~--------~~~~~~i~Ip~ 95 (605)
T PF03170_consen 29 ASRTIYFPVPADWVVTKATLNLSYTYSPSLLPERSQLTVSLNGQPVG-----SIPLDAESA--------QPQTVTIPIPP 95 (605)
T ss_pred CceEEEEEcCCCccccceEEEEEEEECcccCCCcceEEEEECCEEeE-----EEecCcCCC--------CceEEEEecCh
Confidence 5567788887776666788888887652 36899999998654 222322221 24678999999
Q ss_pred CCeeeeecEEEEEeecC
Q 011825 440 TRFIEGENTIFLKQPRC 456 (476)
Q Consensus 440 ~~L~~G~NtI~l~~~~g 456 (476)
. |..|.|.|.|.....
T Consensus 96 ~-l~~g~N~l~~~~~~~ 111 (605)
T PF03170_consen 96 A-LIKGFNRLTFEFIGH 111 (605)
T ss_pred h-hcCCceEEEEEEEec
Confidence 9 999999999987543
No 27
>cd03462 1,2-CCD chlorocatechol 1,2-dioxygenases (1,2-CCDs) (type II enzymes) are homodimeric intradiol dioxygenases that degrade chlorocatechols via the addition of molecular oxygen and the subsequent cleavage between two adjacent hydroxyl groups. This reaction is part of the modified ortho-cleavage pathway which is a central oxidative bacterial pathway that channels chlorocatechols, derived from the degradation of chlorinated benzoic acids, phenoxyacetic acids, phenols, benzenes, and other aromatics into the energy-generating tricarboxylic acid pathway.
Probab=90.87 E-value=0.72 Score=46.22 Aligned_cols=63 Identities=14% Similarity=0.161 Sum_probs=44.6
Q ss_pred CCeEEEEEEEEecCCCccccCceEEEecCCCCCCCccccc---cccce--EEEeCCCcceEeCcccCcceE
Q 011825 176 ERGCVSGRLLVQDSNDVISANGAYVGLAPPGDVGSWQTEC---KDYQF--WTTADEDGCFSIKNIRTGNYN 241 (476)
Q Consensus 176 ~RGtVsG~v~~~d~~~~~pa~~a~V~L~~~~~~g~~q~~~---~~yqY--wt~td~~G~FtI~nV~pGtY~ 241 (476)
++=.|+|+|...+ ++|..+|.|=+-.....|....+. ....+ ...||++|.|.+.-|+||-|-
T Consensus 98 ~~l~l~G~V~D~~---G~Pv~~A~VeiWqad~~G~Y~~~~~~~~~~~~RG~~~Td~~G~y~F~Ti~P~~Yp 165 (247)
T cd03462 98 KPLLFRGTVKDLA---GAPVAGAVIDVWHSTPDGKYSGFHPNIPEDYYRGKIRTDEDGRYEVRTTVPVPYQ 165 (247)
T ss_pred CEEEEEEEEEcCC---CCCcCCcEEEEECCCCCCCcCCCCCCCCCCCCEEEEEeCCCCCEEEEEECCCCcC
Confidence 4578999998665 789999999885444444332211 11111 457899999999999999994
No 28
>TIGR02465 chlorocat_1_2 chlorocatechol 1,2-dioxygenase. Members of this protein family are chlorocatechol 1,2-dioxygenase. This protein is closely related to catechol 1,2-dioxygenase, TIGR02439, EC 1.13.11.1. Note that annotated database entries have appeared for the present protein family with the EC number that refers to that of family TIGR02439. This protein acts in pathways of the biodegradation of chlorinated aromatic compounds.
Probab=90.67 E-value=0.41 Score=47.92 Aligned_cols=64 Identities=13% Similarity=0.171 Sum_probs=46.6
Q ss_pred CCeEEEEEEEEecCCCccccCceEEEecCCCCCCCcccc---ccccce--EEEeCCCcceEeCcccCcceEE
Q 011825 176 ERGCVSGRLLVQDSNDVISANGAYVGLAPPGDVGSWQTE---CKDYQF--WTTADEDGCFSIKNIRTGNYNL 242 (476)
Q Consensus 176 ~RGtVsG~v~~~d~~~~~pa~~a~V~L~~~~~~g~~q~~---~~~yqY--wt~td~~G~FtI~nV~pGtY~L 242 (476)
+.=.|+|+|...+ ++|..+|.|=+-....+|....+ ....++ +..||++|.|.+.-|+||-|-+
T Consensus 97 ~~l~v~G~V~D~~---G~Pv~~A~VeiWqad~~G~Y~~~~~~~~~~~lRG~~~Td~~G~y~F~Ti~P~~Ypi 165 (246)
T TIGR02465 97 KPLLIRGTVRDLS---GTPVAGAVIDVWHSTPDGKYSGFHDNIPDDYYRGKLVTAADGSYEVRTTMPVPYQI 165 (246)
T ss_pred cEEEEEEEEEcCC---CCCcCCcEEEEECCCCCCCCCCCCCCCCCCCCeEEEEECCCCCEEEEEECCCCCCC
Confidence 4578999998655 89999999988555445533321 122333 5778999999999999999853
No 29
>PF07210 DUF1416: Protein of unknown function (DUF1416); InterPro: IPR010814 This family consists of several hypothetical bacterial proteins of around 100 residues in length. Members of this family appear to be Actinomycete specific. The function of this family is unknown.
Probab=90.52 E-value=2.9 Score=35.24 Aligned_cols=60 Identities=25% Similarity=0.395 Sum_probs=46.3
Q ss_pred CCeEEEEEEEEecCCCccccCceEEEecCCCCCCCccccccccceE--EEeCCCcceEeCcccCcceEEEEEECceee
Q 011825 176 ERGCVSGRLLVQDSNDVISANGAYVGLAPPGDVGSWQTECKDYQFW--TTADEDGCFSIKNIRTGNYNLYAWVPGFVG 251 (476)
Q Consensus 176 ~RGtVsG~v~~~d~~~~~pa~~a~V~L~~~~~~g~~q~~~~~yqYw--t~td~~G~FtI~nV~pGtY~L~a~~~G~~G 251 (476)
....|+|+|+ .+ +.|..+++|-|-.... .|- ..|+.+|.|.+- ..||+.+|.+...+-.|
T Consensus 6 ke~VItG~V~-~~---G~Pv~gAyVRLLD~sg-----------EFtaEvvts~~G~FRFf-aapG~WtvRal~~~g~~ 67 (85)
T PF07210_consen 6 KETVITGRVT-RD---GEPVGGAYVRLLDSSG-----------EFTAEVVTSATGDFRFF-AAPGSWTVRALSRGGNG 67 (85)
T ss_pred ceEEEEEEEe-cC---CcCCCCeEEEEEcCCC-----------CeEEEEEecCCccEEEE-eCCCceEEEEEccCCCC
Confidence 4578999999 55 7899999999865422 233 457899999995 89999999998765443
No 30
>PF07495 Y_Y_Y: Y_Y_Y domain; InterPro: IPR011123 This region is mostly found at the end of the beta propellers (IPR011110 from INTERPRO) in a family of two component regulators. However they are also found tandemly repeated in Q891H4 from SWISSPROT without other signal conduction domains being present. It is named after the conserved tyrosines found in the alignment. The exact function is not known.; PDB: 3V9F_D 3VA6_B 3OTT_B 4A2M_D 4A2L_B.
Probab=90.42 E-value=0.36 Score=37.38 Aligned_cols=27 Identities=26% Similarity=0.409 Sum_probs=22.0
Q ss_pred eEEEeCCCc-ceEeCcccCcceEEEEEE
Q 011825 220 FWTTADEDG-CFSIKNIRTGNYNLYAWV 246 (476)
Q Consensus 220 Ywt~td~~G-~FtI~nV~pGtY~L~a~~ 246 (476)
=|....... .+++.+++||+|+|.|.+
T Consensus 20 ~W~~~~~~~~~~~~~~L~~G~Y~l~V~a 47 (66)
T PF07495_consen 20 EWITLGSYSNSISYTNLPPGKYTLEVRA 47 (66)
T ss_dssp SEEEESSTS-EEEEES--SEEEEEEEEE
T ss_pred eEEECCCCcEEEEEEeCCCEEEEEEEEE
Confidence 377777777 999999999999999997
No 31
>KOG1948 consensus Metalloproteinase-related collagenase pM5 [Posttranslational modification, protein turnover, chaperones]
Probab=90.35 E-value=0.71 Score=53.08 Aligned_cols=57 Identities=26% Similarity=0.373 Sum_probs=42.6
Q ss_pred EEEEEEEEecCCCccccCceEEEecCCCCCCCccccccccceEEEeCCCcceEeCcccCcceEEEEEECc
Q 011825 179 CVSGRLLVQDSNDVISANGAYVGLAPPGDVGSWQTECKDYQFWTTADEDGCFSIKNIRTGNYNLYAWVPG 248 (476)
Q Consensus 179 tVsG~v~~~d~~~~~pa~~a~V~L~~~~~~g~~q~~~~~yqYwt~td~~G~FtI~nV~pGtY~L~a~~~G 248 (476)
+|+|+|.+.. +-.+.++.|-|-.. .+.---|.|+++|.|.+.||.||+|.+.|....
T Consensus 120 sv~GkVlgaa---ggGpagV~velrs~----------e~~iast~T~~~Gky~f~~iiPG~Yev~ashp~ 176 (1165)
T KOG1948|consen 120 SVRGKVLGAA---GGGPAGVLVELRSQ----------EDPIASTKTEDGGKYEFRNIIPGKYEVSASHPA 176 (1165)
T ss_pred eEeeEEeecc---CCCcccceeecccc----------cCcceeeEecCCCeEEEEecCCCceEEeccCcc
Confidence 6777777654 23456677777532 334556889999999999999999999998654
No 32
>PF00775 Dioxygenase_C: Dioxygenase; InterPro: IPR000627 This entry represents the C-terminal domain common to several intradiol ring-cleavage dioxygenases. Dioxygenases catalyse the incorporation of both atoms of molecular oxygen into substrates using a variety of reaction mechanisms. Cleavage of aromatic rings is one of the most important functions of dioxygenases, which play key roles in the degradation of aromatic compounds. The substrates of ring-cleavage dioxygenases can be classified into two groups according to the mode of scission of the aromatic ring. Intradiol enzymes use a non-haem Fe(III) to cleave the aromatic ring between two hydroxyl groups (ortho-cleavage), whereas extradiol enzymes (IPR000486 from INTERPRO) use a non-haem Fe(II) to cleave the aromatic ring between a hydroxylated carbon and an adjacent non-hydroxylated carbon (meta-cleavage) []. These two subfamilies differ in sequence, structural fold, iron ligands, and the orientation of second sphere active site amino acid residues. Enzymes that belong to the intradiol family include catechol 1,2-dioxygenase (1,2-CTD) (1.13.11.1 from EC); protocatechuate 3,4-dioxygenase (3,4-PCD) (1.13.11.3 from EC); and chlorocatechol 1,2-dioxygenase (1.13.11.1 from EC) [].; GO: 0003824 catalytic activity, 0008199 ferric iron binding, 0006725 cellular aromatic compound metabolic process, 0055114 oxidation-reduction process; PDB: 2BUV_A 2BUX_A 2BUU_A 2BUR_A 1EO9_A 2BUZ_A 2BV0_A 1EO2_A 1EOC_A 1EOA_A ....
Probab=89.51 E-value=0.86 Score=43.60 Aligned_cols=64 Identities=23% Similarity=0.350 Sum_probs=39.3
Q ss_pred CCeEEEEEEEEecCCCccccCceEEEecCCCCCCCccccc-------cccceEEEeCCCcceEeCcccCcceEE
Q 011825 176 ERGCVSGRLLVQDSNDVISANGAYVGLAPPGDVGSWQTEC-------KDYQFWTTADEDGCFSIKNIRTGNYNL 242 (476)
Q Consensus 176 ~RGtVsG~v~~~d~~~~~pa~~a~V~L~~~~~~g~~q~~~-------~~yqYwt~td~~G~FtI~nV~pGtY~L 242 (476)
+.=.|.|+|...+ ++|..+|.|=+-.....|....+. ....=+..||++|.|++.-|+||.|.+
T Consensus 28 ~~l~l~G~V~D~~---g~Pv~~A~veiWqada~G~Ys~~~~~~~~~~~~~rG~~~Td~~G~y~f~Ti~Pg~Y~~ 98 (183)
T PF00775_consen 28 EPLVLHGRVIDTD---GKPVPGALVEIWQADADGRYSGQDPGSDQPDFNLRGRFRTDADGRYSFRTIKPGPYPI 98 (183)
T ss_dssp -EEEEEEEEEETT---SSB-TTEEEEEEE--TTS--TTTBTTSSSSTTTTEEEEEECTTSEEEEEEE----EEE
T ss_pred CEEEEEEEEECCC---CCCCCCcEEEEEecCCCCccccccccccccCCCcceEEecCCCCEEEEEeeCCCCCCC
Confidence 3568999999866 899999999885444444332221 122334678999999999999999975
No 33
>cd03464 3,4-PCD_beta Protocatechuate 3,4-dioxygenase (3,4-PCD), beta subunit. 3,4-PCD catalyzes the oxidative ring cleavage of 3,4-dihydroxybenzoate to produce beta-carboxy-cis,cis-muconate. 3,4-PCDs are large aggregates of 12 protomers, each composed of an alpha- and beta-subunit and an Fe3+ ion bound in the beta-subunit at the alpha-subunit-beta-subunit interface. 3,4-PCD is a member of the aromatic dioxygenases which are non-heme iron intradiol-cleaving enzymes that break the C1-C2 bond and utilize Fe3+.
Probab=88.97 E-value=0.79 Score=45.19 Aligned_cols=65 Identities=25% Similarity=0.400 Sum_probs=46.5
Q ss_pred CCCeEEEEEEEEecCCCccccCceEEEecCCCCCCCcccc--------ccccceE--EEeCCCcceEeCcccCcceEE
Q 011825 175 EERGCVSGRLLVQDSNDVISANGAYVGLAPPGDVGSWQTE--------CKDYQFW--TTADEDGCFSIKNIRTGNYNL 242 (476)
Q Consensus 175 s~RGtVsG~v~~~d~~~~~pa~~a~V~L~~~~~~g~~q~~--------~~~yqYw--t~td~~G~FtI~nV~pGtY~L 242 (476)
.++=.|+|+|...+ ++|..+|.|=+-.....|....+ ..+++.+ ..||++|.|.|.-|+||.|.+
T Consensus 63 G~~i~l~G~V~D~~---G~PV~~A~VEIWQad~~G~Y~~~~d~~~~~~~~~f~grGr~~TD~~G~y~F~TI~Pg~Yp~ 137 (220)
T cd03464 63 GERIIVHGRVLDED---GRPVPNTLVEIWQANAAGRYRHKRDQHDAPLDPNFGGAGRTLTDDDGYYRFRTIKPGAYPW 137 (220)
T ss_pred CCEEEEEEEEECCC---CCCCCCCEEEEEecCCCCcccCccCCcccccCCCCCCEEEEEECCCccEEEEEECCCCccC
Confidence 45678999999655 89999999988554444433321 1233433 468999999999999999943
No 34
>TIGR02423 protocat_alph protocatechuate 3,4-dioxygenase, alpha subunit. This model represents the alpha chain of protocatechuate 3,4-dioxygenase. The most closely related family outside this family is that of the beta chain (TIGR02422), typically encoded in an adjacent locus. This enzyme acts in the degradation of aromatic compounds by way of p-hydroxybenzoate to succinate and acetyl-CoA.
Probab=88.74 E-value=0.94 Score=43.77 Aligned_cols=64 Identities=23% Similarity=0.414 Sum_probs=46.1
Q ss_pred CCeEEEEEEEEecCCCccccCceEEEecCCCCCCCccccc--------cccceE--EEeCCCcceEeCcccCcceEE
Q 011825 176 ERGCVSGRLLVQDSNDVISANGAYVGLAPPGDVGSWQTEC--------KDYQFW--TTADEDGCFSIKNIRTGNYNL 242 (476)
Q Consensus 176 ~RGtVsG~v~~~d~~~~~pa~~a~V~L~~~~~~g~~q~~~--------~~yqYw--t~td~~G~FtI~nV~pGtY~L 242 (476)
++=.|+|+|+..+ ++|..+|.|=+-.....|....+. .+++-| ..||++|+|++.-|+||.|..
T Consensus 38 ~~l~l~G~V~D~~---g~Pv~~A~VeiWqad~~G~Y~~~~~~~~~~~~~~f~grGr~~Td~~G~y~f~TI~Pg~Yp~ 111 (193)
T TIGR02423 38 ERIRLEGRVLDGD---GHPVPDALIEIWQADAAGRYNSPADLRAPATDPGFRGWGRTGTDESGEFTFETVKPGAVPD 111 (193)
T ss_pred CEEEEEEEEECCC---CCCCCCCEEEEEccCCCCccCCccCCcccccCCCCCCeEEEEECCCCCEEEEEEcCCCcCC
Confidence 4578999998654 899999999885544445333221 133434 468999999999999998864
No 35
>cd03458 Catechol_intradiol_dioxygenases Catechol intradiol dioxygenases can be divided into several subgroups according to their substrate specificity for catechol, chlorocatechols and hydroxyquinols. Almost all members of this family are homodimers containing one ferric ion (Fe3+) per monomer. They belong to the intradiol dioxygenase family, a family of mononuclear non-heme iron intradiol-cleaving enzymes that catalyze the oxygenation of catecholates to aliphatic acids via the cleavage of aromatic rings.
Probab=88.32 E-value=1.5 Score=44.19 Aligned_cols=65 Identities=23% Similarity=0.316 Sum_probs=46.2
Q ss_pred CCCeEEEEEEEEecCCCccccCceEEEecCCCCCCCcccc---ccccceE--EEeCCCcceEeCcccCcceEE
Q 011825 175 EERGCVSGRLLVQDSNDVISANGAYVGLAPPGDVGSWQTE---CKDYQFW--TTADEDGCFSIKNIRTGNYNL 242 (476)
Q Consensus 175 s~RGtVsG~v~~~d~~~~~pa~~a~V~L~~~~~~g~~q~~---~~~yqYw--t~td~~G~FtI~nV~pGtY~L 242 (476)
-++=.|+|+|...+ ++|..+|.|=+-.....|....+ ......+ ..||++|.|.+.-|+||-|-+
T Consensus 102 G~~l~l~G~V~D~~---G~Pv~~A~VeiWqad~~G~Y~~~~~~~~~~~lRG~~~Td~~G~y~f~Ti~P~~Ypi 171 (256)
T cd03458 102 GEPLFVHGTVTDTD---GKPLAGATVDVWHADPDGFYSQQDPDQPEFNLRGKFRTDEDGRYRFRTIRPVPYPI 171 (256)
T ss_pred CcEEEEEEEEEcCC---CCCCCCcEEEEEccCCCCCcCCCCCCCCCCCCEEEEEeCCCCCEEEEEECCCCccC
Confidence 34578999999766 78999999988544444433321 1233333 568999999999999999954
No 36
>TIGR02438 catachol_actin catechol 1,2-dioxygenase, Actinobacterial. Members of this family are catechol 1,2-dioxygenases of the Actinobacteria. They are more closely related to actinobacterial chlorocatechol 1,2-dioxygenases than to proteobacterial catechol 1,2-dioxygenases, and so are built in this separate model. The member from Rhodococcus rhodochrous NCIMB 13259 (GB|AAC33003.1) is described as a homodimer with bound Fe, similarly active on catechol, 3-methylcatechol and 4-methylcatechol.
Probab=86.89 E-value=1 Score=45.93 Aligned_cols=64 Identities=23% Similarity=0.317 Sum_probs=44.8
Q ss_pred CCeEEEEEEEEecCCCccccCceEEEecCCCCCCCcccc---ccccce--EEEeCCCcceEeCcccCcceEE
Q 011825 176 ERGCVSGRLLVQDSNDVISANGAYVGLAPPGDVGSWQTE---CKDYQF--WTTADEDGCFSIKNIRTGNYNL 242 (476)
Q Consensus 176 ~RGtVsG~v~~~d~~~~~pa~~a~V~L~~~~~~g~~q~~---~~~yqY--wt~td~~G~FtI~nV~pGtY~L 242 (476)
++=.|+|+|...+ ++|..+|.|-+-.....|....+ ....++ ...||++|.|.+.-|+||-|-+
T Consensus 131 ~pl~v~G~V~D~~---G~Pv~gA~VdiWqada~G~Ys~~~~~~~~~~lRGr~~TDadG~y~F~TI~Pg~Ypi 199 (281)
T TIGR02438 131 TPLVFSGQVTDLD---GNGLAGAKVELWHADDDGFYSQFAPGIPEWNLRGTIIADDEGRFEITTMQPAPYQI 199 (281)
T ss_pred CEEEEEEEEEcCC---CCCcCCCEEEEEecCCCCCcCCCCCCCCCCCCeEEEEeCCCCCEEEEEECCCCcCC
Confidence 4578999998655 79999999988443334433221 112222 2568999999999999999964
No 37
>TIGR02422 protocat_beta protocatechuate 3,4-dioxygenase, beta subunit. This model represents the beta chain of protocatechuate 3,4-dioxygenase. The most closely related family outside this family is that of the alpha chain (TIGR02423), typically encoded in an adjacent locus. This enzyme acts in the degradation of aromatic compounds by way of p-hydroxybenzoate to succinate and acetyl-CoA.
Probab=86.78 E-value=1.7 Score=42.96 Aligned_cols=67 Identities=24% Similarity=0.361 Sum_probs=46.6
Q ss_pred CCCCCeEEEEEEEEecCCCccccCceEEEecCCCCCCCcccc--------ccccceE--EEeCCCcceEeCcccCcceEE
Q 011825 173 KSEERGCVSGRLLVQDSNDVISANGAYVGLAPPGDVGSWQTE--------CKDYQFW--TTADEDGCFSIKNIRTGNYNL 242 (476)
Q Consensus 173 ~~s~RGtVsG~v~~~d~~~~~pa~~a~V~L~~~~~~g~~q~~--------~~~yqYw--t~td~~G~FtI~nV~pGtY~L 242 (476)
+..++=.|+|+|...+ ++|..+|.|=+-.....|....+ ..++.-+ ..||++|.|+|.-|+||-|..
T Consensus 56 ~~G~~i~l~G~V~D~~---g~PV~~A~VEIWQada~G~Y~~~~d~~~~~~~~~f~grGr~~TD~~G~y~F~TI~PG~Y~~ 132 (220)
T TIGR02422 56 PIGERIIVHGRVLDED---GRPVPNTLVEVWQANAAGRYRHKNDQYLAPLDPNFGGVGRTLTDSDGYYRFRTIKPGPYPW 132 (220)
T ss_pred CCCCEEEEEEEEECCC---CCCCCCCEEEEEecCCCCcccCccCccccccCCCCCCEEEEEECCCccEEEEEECCCCccC
Confidence 3346788999999755 78999999988544444433321 1223323 458999999999999999943
No 38
>smart00606 CBD_IV Cellulose Binding Domain Type IV.
Probab=85.86 E-value=10 Score=33.21 Aligned_cols=89 Identities=18% Similarity=0.350 Sum_probs=50.2
Q ss_pred ccEEEEEE-eCCCCCCCcEEEEEEEeccC-CCeEEEEEcCCCCCCCcccccccCCCCeeeceeeeeecEEEEEEeeCCCe
Q 011825 365 TTWQIKFK-LDHVDRNSSYKLRVAIASAT-LAELQVRVNDPNANRPLFTTGLIGRDNAIARHGIHGLYLLYHVNIPGTRF 442 (476)
Q Consensus 365 ~~w~I~F~-L~~~~~~~~~tLriala~a~-~~~~~V~vN~~~~~~~~~~~~~~~~~~~i~R~~~~G~~~~~~~~ipa~~L 442 (476)
+.| |.|+ ++-. ..+.+++.|-.+... .+.++|+|++... ++..+..++.... . -.++..+.+|+ |
T Consensus 39 g~w-~~y~~vd~~-~~g~~~i~~~~as~~~~~~i~v~~d~~~G--~~~~~~~~p~tg~-----~-~~~~~~~~~v~---~ 105 (129)
T smart00606 39 GDW-IAYKDVDFG-SSGAYTFTARVASGNAGGSIELRLDSPTG--TLVGTVDVPSTGG-----W-QTYQTVSATVT---L 105 (129)
T ss_pred CCE-EEEEeEecC-CCCceEEEEEEeCCCCCceEEEEECCCCC--cEEEEEEeCCCCC-----C-ccCEEEEEEEc---c
Confidence 344 5555 4432 247788887777654 4689999997543 1222223332211 0 12344444553 4
Q ss_pred eeeecEEEEEeecCCCCCceEEEEEEEE
Q 011825 443 IEGENTIFLKQPRCTSPFQGIMYDYIRL 470 (476)
Q Consensus 443 ~~G~NtI~l~~~~gss~~~~vmyD~IrL 470 (476)
.+|.++|+|....++ .+..|.+++
T Consensus 106 ~~G~~~l~~~~~~~~----~~~ld~~~F 129 (129)
T smart00606 106 PAGVHDVYLVFKGGN----YFNIDWFRF 129 (129)
T ss_pred CCceEEEEEEEECCC----cEEEEEEEC
Confidence 489999999876543 277777654
No 39
>PF02837 Glyco_hydro_2_N: Glycosyl hydrolases family 2, sugar binding domain; InterPro: IPR006104 O-Glycosyl hydrolases 3.2.1. from EC are a widespread group of enzymes that hydrolyse the glycosidic bond between two or more carbohydrates, or between a carbohydrate and a non-carbohydrate moiety. A classification system for glycosyl hydrolases, based on sequence similarity, has led to the definition of 85 different families [, ]. This classification is available on the CAZy (CArbohydrate-Active EnZymes) web site. Glycoside hydrolase family 2 GH2 from CAZY comprises enzymes with several known activities; beta-galactosidase (3.2.1.23 from EC); beta-mannosidase (3.2.1.25 from EC); beta-glucuronidase (3.2.1.31 from EC). These enzymes contain a conserved glutamic acid residue which has been shown [], in Escherichia coli lacZ (P00722 from SWISSPROT), to be the general acid/base catalyst in the active site of the enzyme. This domain has a jelly-roll fold [].; GO: 0004553 hydrolase activity, hydrolyzing O-glycosyl compounds, 0005975 carbohydrate metabolic process; PDB: 3DEC_A 3OB8_A 3OBA_A 3CMG_A 3FN9_C 2VZU_A 2X09_A 2VZO_A 2X05_A 2VZV_B ....
Probab=85.39 E-value=1.3 Score=40.32 Aligned_cols=69 Identities=20% Similarity=0.184 Sum_probs=46.1
Q ss_pred cEEEEEEeCCCCCCCcEEEEEEEeccCCCeEEEEEcCCCCCCCcccccccCCCCeeeceeeeeecEEEEEEeeCCCeeee
Q 011825 366 TWQIKFKLDHVDRNSSYKLRVAIASATLAELQVRVNDPNANRPLFTTGLIGRDNAIARHGIHGLYLLYHVNIPGTRFIEG 445 (476)
Q Consensus 366 ~w~I~F~L~~~~~~~~~tLriala~a~~~~~~V~vN~~~~~~~~~~~~~~~~~~~i~R~~~~G~~~~~~~~ipa~~L~~G 445 (476)
-.+=+|+|++......+.|++... ...-.|.|||+.++ ...|.+..++++|+. .|+.|
T Consensus 71 wYr~~f~lp~~~~~~~~~L~f~gv---~~~a~v~vNG~~vg------------------~~~~~~~~~~~dIt~-~l~~g 128 (167)
T PF02837_consen 71 WYRRTFTLPADWKGKRVFLRFEGV---DYAAEVYVNGKLVG------------------SHEGGYTPFEFDITD-YLKPG 128 (167)
T ss_dssp EEEEEEEESGGGTTSEEEEEESEE---ESEEEEEETTEEEE------------------EEESTTS-EEEECGG-GSSSE
T ss_pred EEEEEEEeCchhcCceEEEEeccc---eEeeEEEeCCeEEe------------------eeCCCcCCeEEeChh-hccCC
Confidence 345568887755333455555433 36778999997442 123457789999975 78899
Q ss_pred e-cEEEEEeecC
Q 011825 446 E-NTIFLKQPRC 456 (476)
Q Consensus 446 ~-NtI~l~~~~g 456 (476)
. |+|.+.+.+.
T Consensus 129 ~~N~l~V~v~~~ 140 (167)
T PF02837_consen 129 EENTLAVRVDNW 140 (167)
T ss_dssp EEEEEEEEEESS
T ss_pred CCEEEEEEEeec
Confidence 8 9999999753
No 40
>PRK11114 cellulose synthase regulator protein; Provisional
Probab=85.05 E-value=1.5 Score=50.58 Aligned_cols=74 Identities=22% Similarity=0.256 Sum_probs=49.9
Q ss_pred EEEEEeCCCCCCCcEEEEEEEecc-----CCCeEEEEEcCCCCCCCcccccccCCCCeeeceeeeeecEEEEEEeeCCCe
Q 011825 368 QIKFKLDHVDRNSSYKLRVAIASA-----TLAELQVRVNDPNANRPLFTTGLIGRDNAIARHGIHGLYLLYHVNIPGTRF 442 (476)
Q Consensus 368 ~I~F~L~~~~~~~~~tLriala~a-----~~~~~~V~vN~~~~~~~~~~~~~~~~~~~i~R~~~~G~~~~~~~~ipa~~L 442 (476)
.|.|.+...+....++|+|...-+ ..++++|.|||+.+. +..+...+ .|.....+|+||+ .|
T Consensus 84 ~i~f~vp~d~~v~~A~L~L~y~~Sp~l~~~~S~L~V~lNg~~v~-----s~pL~~~~-------~~~~~~~~i~IP~-~l 150 (756)
T PRK11114 84 GIEFGVRSDEVVTKARLNLEYTYSPALLPDLSHLKVYLNGELMG-----TLPLDKEQ-------LGKKVLAQLPIDP-RF 150 (756)
T ss_pred eeEeecCccccccCcEEEEEEEECCCCCCCCCeEEEEECCEEeE-----EEecCccc-------CCCcceeEEecCH-HH
Confidence 677777766545556666666554 247899999998653 12222111 2445788999999 56
Q ss_pred eeeecEEEEEee
Q 011825 443 IEGENTIFLKQP 454 (476)
Q Consensus 443 ~~G~NtI~l~~~ 454 (476)
..|.|.|.|...
T Consensus 151 ~~g~N~L~~~~~ 162 (756)
T PRK11114 151 ITDFNRLRLEFI 162 (756)
T ss_pred cCCCceEEEEEe
Confidence 689999998864
No 41
>TIGR02962 hdxy_isourate hydroxyisourate hydrolase. Members of this family, hydroxyisourate hydrolase, represent a distinct clade of transthyretin-related proteins. Bacterial members typically are encoded next to ureidoglycolate hydrolase and often near either xanthine dehydrogenase or xanthine/uracil permease genes and have been demonstrated to have hydroxyisourate hydrolase activity. In eukaryotes, a clade separate from the transthyretins (a family of thyroid-hormone binding proteins) has also been shown to have HIU hydrolase activity in urate catabolizing organisms. Transthyretin, then, would appear to be the recently diverged paralog of the more ancient HIUH family.
Probab=84.56 E-value=2 Score=38.00 Aligned_cols=59 Identities=20% Similarity=0.189 Sum_probs=40.1
Q ss_pred EEEEEEEecCCCccccCceEEEecCCCCCCCccccccccceEEEeCCCcceE-----eCcccCcceEEEEEE
Q 011825 180 VSGRLLVQDSNDVISANGAYVGLAPPGDVGSWQTECKDYQFWTTADEDGCFS-----IKNIRTGNYNLYAWV 246 (476)
Q Consensus 180 VsG~v~~~d~~~~~pa~~a~V~L~~~~~~g~~q~~~~~yqYwt~td~~G~Ft-----I~nV~pGtY~L~a~~ 246 (476)
|+=+|+... .++||.++.|.|..... +.|+. -.-+.||++|+.. ...+.||.|+|..-.
T Consensus 3 lstHVLDt~--~G~PAagv~V~L~~~~~-~~~~~-----i~~~~Tn~DGR~~~~l~~~~~~~~G~Y~l~F~~ 66 (112)
T TIGR02962 3 LSTHVLDTT--SGKPAAGVPVTLYRLDG-SGWTP-----LAEGVTNADGRCPDLLPEGETLAAGIYKLRFDT 66 (112)
T ss_pred ceEEEEeCC--CCccCCCCEEEEEEecC-CCeEE-----EEEEEECCCCCCcCcccCcccCCCeeEEEEEEh
Confidence 444444333 48999999999974221 11322 2346799999987 456789999999875
No 42
>cd03461 1,2-HQD Hydroxyquinol 1,2-dioxygenase (1,2-HQD) catalyzes the ring cleavage of hydroxyquinol (1,2,4-trihydroxybenzene), an intermediate in the degradation of a large variety of aromatic compounds including some polychloro- and nitroaromatic pollutants, to form 3-hydroxy-cis,cis-muconates. 1,2-HQD belongs to the aromatic dioxygenase family, a family of mononuclear non-heme intradiol-cleaving enzymes.
Probab=84.36 E-value=1.6 Score=44.46 Aligned_cols=65 Identities=23% Similarity=0.368 Sum_probs=46.1
Q ss_pred CCCeEEEEEEEEecCCCccccCceEEEecCCCCCCCcccc---ccccce--EEEeCCCcceEeCcccCcceEE
Q 011825 175 EERGCVSGRLLVQDSNDVISANGAYVGLAPPGDVGSWQTE---CKDYQF--WTTADEDGCFSIKNIRTGNYNL 242 (476)
Q Consensus 175 s~RGtVsG~v~~~d~~~~~pa~~a~V~L~~~~~~g~~q~~---~~~yqY--wt~td~~G~FtI~nV~pGtY~L 242 (476)
-++=.|+|+|...+ ++|..+|.|=+-.....|....+ ...... ...||++|.|.+.-|+||-|-+
T Consensus 118 G~~l~v~G~V~D~~---G~Pv~gA~VeiWqad~~G~Y~~~~~~~~~~~lRGr~~Td~~G~y~F~Ti~Pg~Ypi 187 (277)
T cd03461 118 GEPCFVHGRVTDTD---GKPLPGATVDVWQADPNGLYDVQDPDQPEFNLRGKFRTDEDGRYAFRTLRPTPYPI 187 (277)
T ss_pred CCEEEEEEEEEcCC---CCCcCCcEEEEECcCCCCCcCCCCCCCCCCCCeEEEEeCCCCCEEEEEECCCCcCC
Confidence 34678999999766 78999999988544444433321 122222 2568999999999999999975
No 43
>cd03460 1,2-CTD Catechol 1,2-dioxygenase (1,2-CTD) catalyzes an intradiol cleavage reaction of catechol to form cis,cis-muconate. 1,2-CTDs are homodimers with one catalytic non-heme ferric ion per monomer. They belong to the aromatic dioxygenase family, a family of mononuclear non-heme iron intradiol-cleaving enzymes that catalyze the oxygenation of catecholates to aliphatic acids via the cleavage of aromatic rings.
Probab=84.02 E-value=2.1 Score=43.75 Aligned_cols=65 Identities=18% Similarity=0.258 Sum_probs=47.2
Q ss_pred CCCeEEEEEEEEecCCCccccCceEEEecCCCCCCCcccc---ccccce--EEEeCCCcceEeCcccCcceEE
Q 011825 175 EERGCVSGRLLVQDSNDVISANGAYVGLAPPGDVGSWQTE---CKDYQF--WTTADEDGCFSIKNIRTGNYNL 242 (476)
Q Consensus 175 s~RGtVsG~v~~~d~~~~~pa~~a~V~L~~~~~~g~~q~~---~~~yqY--wt~td~~G~FtI~nV~pGtY~L 242 (476)
-++=.|+|+|+..+ ++|..+|.|=+-.....|....+ ..+++. ...||++|.|.+.-|+||-|-+
T Consensus 122 Gepl~l~G~V~D~~---G~PI~~A~VeiWqad~~G~Ys~~~~~~~~f~~RGr~~TD~~G~y~F~TI~P~~Ypi 191 (282)
T cd03460 122 GETLVMHGTVTDTD---GKPVPGAKVEVWHANSKGFYSHFDPTQSPFNLRRSIITDADGRYRFRSIMPSGYGV 191 (282)
T ss_pred CCEEEEEEEEECCC---CCCcCCcEEEEECCCCCCCcCCCCCCCCCCCCceEEEeCCCCCEEEEEECCCCCcC
Confidence 45678999999665 78999999988555555544321 123333 3678999999999999999954
No 44
>PF00576 Transthyretin: HIUase/Transthyretin family; InterPro: IPR023416 This family includes transthyretin that is a thyroid hormone-binding protein that transports thyroxine from the bloodstream to the brain. However, most of the sequences listed in this family do not bind thyroid hormones. They are actually enzymes of the purine catabolism that catalyse the conversion of 5-hydroxyisourate (HIU) to OHCU [, ]. HIU hydrolysis is the original function of the family and is conserved from bacteria to mammals; transthyretins arose by gene duplications in the vertebrate lineage [, ]. HIUases are distinguished in the alignment from the conserved C-terminal YRGS sequence. Transthyretin (formerly prealbumin) is one of 3 thyroid hormone-binding proteins found in the blood of vertebrates []. It is produced in the liver and circulates in the bloodstream, where it binds retinol and thyroxine (T4) []. It differs from the other 2 hormone-binding proteins (T4-binding globulin and albumin) in 3 distinct ways: (1) the gene is expressed at a high rate in the brain choroid plexus; (2) it is enriched in cerebrospinal fluid; and (3) no genetically caused absence has been observed, suggesting an essential role in brain function, distinct from that played in the bloodstream []. The protein consists of around 130 amino acids, which assemble as a homotetramer that contains an internal channel in which T4 is bound. Within this complex, T4 appears to be transported across the blood-brain barrier, where, in the choroid plexus, the hormone stimulates further synthesis of transthyretin. The protein then diffuses back into the bloodstream, where it binds T4 for transport back to the brain [].; PDB: 1TFP_B 1KGJ_D 1IE4_C 1GKE_C 1KGI_D 2H0J_B 2H0E_B 2H0F_B 1ZD6_A 3DGD_D ....
Probab=83.27 E-value=1.9 Score=38.22 Aligned_cols=51 Identities=25% Similarity=0.237 Sum_probs=35.6
Q ss_pred CccccCceEEEecCCCCCCCccccccccceEEEeCCCcce-----EeCcccCcceEEEEEE
Q 011825 191 DVISANGAYVGLAPPGDVGSWQTECKDYQFWTTADEDGCF-----SIKNIRTGNYNLYAWV 246 (476)
Q Consensus 191 ~~~pa~~a~V~L~~~~~~g~~q~~~~~yqYwt~td~~G~F-----tI~nV~pGtY~L~a~~ 246 (476)
.++||.++.|.|....+.++|+. -.-+.||++|+. .-..+.+|.|+|..-.
T Consensus 12 ~G~PA~gv~V~L~~~~~~~~~~~-----l~~~~Td~DGR~~~~~~~~~~~~~G~Y~l~F~~ 67 (112)
T PF00576_consen 12 TGKPAAGVPVTLYRLDSDGSWTL-----LAEGVTDADGRIKQPLLEGESLEPGIYKLVFDT 67 (112)
T ss_dssp TTEE-TT-EEEEEEEETTSCEEE-----EEEEEBETTSEESSTSSETTTS-SEEEEEEEEH
T ss_pred CCCCccCCEEEEEEecCCCCcEE-----EEEEEECCCCcccccccccccccceEEEEEEEH
Confidence 58999999999975443444554 344679999988 4467889999999864
No 45
>PF13364 BetaGal_dom4_5: Beta-galactosidase jelly roll domain; PDB: 1TG7_A 1XC6_A 3OGS_A 3OGV_A 3OGR_A 3OG2_A.
Probab=82.78 E-value=3.4 Score=36.13 Aligned_cols=54 Identities=20% Similarity=0.191 Sum_probs=34.6
Q ss_pred EEEE-EEEeccCCCeEEEEEcCCCCCCCcccccccCCCCeeeceeeeeecEEEEEEeeCCCeeeeecEEEEE
Q 011825 382 YKLR-VAIASATLAELQVRVNDPNANRPLFTTGLIGRDNAIARHGIHGLYLLYHVNIPGTRFIEGENTIFLK 452 (476)
Q Consensus 382 ~tLr-iala~a~~~~~~V~vN~~~~~~~~~~~~~~~~~~~i~R~~~~G~~~~~~~~ipa~~L~~G~NtI~l~ 452 (476)
..|+ |.+......+.+|.|||+.++. +.+++ | ...+|.||++.|+.++|.|++-
T Consensus 50 ~~~~~l~~~~g~~~~~~vwVNG~~~G~--------------~~~~~-g--~q~tf~~p~~il~~~n~v~~vl 104 (111)
T PF13364_consen 50 TSLTPLNIQGGNAFRASVWVNGWFLGS--------------YWPGI-G--PQTTFSVPAGILKYGNNVLVVL 104 (111)
T ss_dssp EEEE-EEECSSTTEEEEEEETTEEEEE--------------EETTT-E--CCEEEEE-BTTBTTCEEEEEEE
T ss_pred eeEEEEeccCCCceEEEEEECCEEeee--------------ecCCC-C--ccEEEEeCceeecCCCEEEEEE
Confidence 4555 5555567789999999986641 01111 1 1289999999999985555443
No 46
>TIGR02439 catechol_proteo catechol 1,2-dioxygenase, proteobacterial. Members of this family known so far are catechol 1,2-dioxygenases of the Proteobacteria. They are distinct from catechol 1,2-dioxygenases and chlorocatechol 1,2-dioxygenases of the Actinobacteria, which are quite similar to each other and resolved by separate models. This enzyme catalyzes intradiol cleavage in which catechol + O2 becomes cis,cis-muconate. Catechol is an intermediate in the catabolism of many different aromatic compounds, as is the alternative intermediate protocatechuate. In Acinetobacter lwoffii, two isozymes are present with abilities, differing somewhat, to act on catechol analogs 3-methylcatechol, 4-methylcatechol, 4-methoxycatechol, and 4-chlorocatechol.
Probab=82.57 E-value=2.8 Score=42.93 Aligned_cols=64 Identities=19% Similarity=0.285 Sum_probs=46.4
Q ss_pred CCeEEEEEEEEecCCCccccCceEEEecCCCCCCCcccc---ccccceE--EEeCCCcceEeCcccCcceEE
Q 011825 176 ERGCVSGRLLVQDSNDVISANGAYVGLAPPGDVGSWQTE---CKDYQFW--TTADEDGCFSIKNIRTGNYNL 242 (476)
Q Consensus 176 ~RGtVsG~v~~~d~~~~~pa~~a~V~L~~~~~~g~~q~~---~~~yqYw--t~td~~G~FtI~nV~pGtY~L 242 (476)
++=.|+|+|+..+ ++|..+|.|=+-.....|....+ ...++.+ ..||++|.|.+.-|+||-|-+
T Consensus 127 ~pl~v~G~V~D~~---G~PI~gA~VeIWqad~~G~Ys~~~~~~~~~~lRG~~~TD~~G~y~F~TI~P~~Ypi 195 (285)
T TIGR02439 127 ETLFLHGQVTDAD---GKPIAGAKVELWHANTKGNYSHFDKSQSEFNLRRTIITDAEGRYRARSIVPSGYGC 195 (285)
T ss_pred cEEEEEEEEECCC---CCCcCCcEEEEEccCCCCCcCCCCCCCCCCCceEEEEECCCCCEEEEEECCCCCcC
Confidence 4578999998655 78999999988555444543321 2233333 568999999999999999964
No 47
>cd05821 TLP_Transthyretin Transthyretin (TTR) is a 55 kDa protein responsible for the transport of thyroid hormones and retinol in vertebrates. TTR distributes the two thyroid hormones T3 (3,5,3'-triiodo-L-thyronine) and T4 (Thyroxin, or 3,5,3',5'-tetraiodo-L-thyronine), as well as retinol (vitamin A) through the formation of a macromolecular complex that includes each of these as well as retinol-binding protein. Misfolded forms of TTR are implicated in the amyloid diseases familial amyloidotic polyneuropathy and senile systemic amyloidosis. TTR forms a homotetramer with each subunit consisting of eight beta-strands arranged in two sheets and a short alpha-helix. The central channel of the tetramer contains two independent binding sites, each located between a pair of subunits, which differ in their ligand binding affinity. A negative cooperativity has been observed for the binding of T4 and other TTR ligands. A fraction of plasma TTR is carried in high density lipoproteins by bindi
Probab=81.06 E-value=3.7 Score=36.90 Aligned_cols=63 Identities=16% Similarity=0.090 Sum_probs=43.4
Q ss_pred CeEEEEEEEEecCCCccccCceEEEecCCCCCCCccccccccceEEEeCCCcceE----eCcccCcceEEEEEE
Q 011825 177 RGCVSGRLLVQDSNDVISANGAYVGLAPPGDVGSWQTECKDYQFWTTADEDGCFS----IKNIRTGNYNLYAWV 246 (476)
Q Consensus 177 RGtVsG~v~~~d~~~~~pa~~a~V~L~~~~~~g~~q~~~~~yqYwt~td~~G~Ft----I~nV~pGtY~L~a~~ 246 (476)
|-.|+=+|+... .++||.++.|.|......+.|+. .--+.||++|+.. -..+.+|.|+|..-.
T Consensus 6 ~~~ittHVLDt~--~G~PAaGV~V~L~~~~~~~~w~~-----l~~~~Tn~DGR~~~ll~~~~~~~G~Y~l~F~t 72 (121)
T cd05821 6 KCPLMVKVLDAV--RGSPAANVAVKVFKKTADGSWEP-----FASGKTTETGEIHGLTTDEQFTEGVYKVEFDT 72 (121)
T ss_pred CCCcEEEEEECC--CCccCCCCEEEEEEecCCCceEE-----EEEEEECCCCCCCCccCccccCCeeEEEEEeh
Confidence 566777776544 48999999999964321123433 3457799999875 235678999999864
No 48
>cd05469 Transthyretin_like Transthyretin_like. This domain is present in the transthyretin-like protein (TLP) family which includes transthyretin (TTR) and a transthyretin-related protein called 5-hydroxyisourate hydrolase (HIUase). TTR and HIUase are homotetrameric proteins with each subunit consisting of eight beta-strands arranged in two sheets and a short alpha-helix. The central channel of the tetramer contains two independent binding sites, each located between a pair of subunits. TTR transports thyroid hormones and retinol in the blood serum of vertebrates while HIUase catalyzes the second step in a three-step ureide pathway. TTRs are highly conserved and found only in vertebrates while the HIUases are found in a wide range of bacterial, plant, fungal, slime mold and vertebrate organisms.
Probab=79.99 E-value=4.5 Score=35.96 Aligned_cols=51 Identities=18% Similarity=0.167 Sum_probs=36.3
Q ss_pred CccccCceEEEecCCCCCCCccccccccceEEEeCCCcceEe----CcccCcceEEEEEE
Q 011825 191 DVISANGAYVGLAPPGDVGSWQTECKDYQFWTTADEDGCFSI----KNIRTGNYNLYAWV 246 (476)
Q Consensus 191 ~~~pa~~a~V~L~~~~~~g~~q~~~~~yqYwt~td~~G~FtI----~nV~pGtY~L~a~~ 246 (476)
.++||.++.|.|......++|+. ---+.||+||+..- ..+.+|.|+|..-.
T Consensus 12 ~G~PAagv~V~L~~~~~~~~w~~-----l~~~~Tn~DGR~~~~l~~~~~~~G~Y~l~F~t 66 (113)
T cd05469 12 RGSPAANVAIKVFRKTADGSWEI-----FATGKTNEDGELHGLITEEEFXAGVYRVEFDT 66 (113)
T ss_pred CCccCCCCEEEEEEecCCCceEE-----EEEEEECCCCCccCccccccccceEEEEEEeh
Confidence 48999999999975321123433 33467999999852 45789999999864
No 49
>KOG2649 consensus Zinc carboxypeptidase [General function prediction only]
Probab=79.82 E-value=5.4 Score=43.48 Aligned_cols=77 Identities=13% Similarity=0.216 Sum_probs=52.7
Q ss_pred eEEEEEEEEecCCCccccCceEEEecCCCCCCCccccccccceEEEeCCCcc-eEeCcccCcceEEEEEECceeeeeeee
Q 011825 178 GCVSGRLLVQDSNDVISANGAYVGLAPPGDVGSWQTECKDYQFWTTADEDGC-FSIKNIRTGNYNLYAWVPGFVGDYRSD 256 (476)
Q Consensus 178 GtVsG~v~~~d~~~~~pa~~a~V~L~~~~~~g~~q~~~~~yqYwt~td~~G~-FtI~nV~pGtY~L~a~~~G~~G~~~~~ 256 (476)
--|+|-|.... ++|..+|+|.+.+-. -=.+|...|. +.+ ..||.|.|+|.+.|+. ...
T Consensus 378 ~GIkG~V~D~~---G~~I~NA~IsV~gin-------------Hdv~T~~~GDYWRL--L~PG~y~vta~A~Gy~---~~t 436 (500)
T KOG2649|consen 378 RGIKGLVFDDT---GNPIANATISVDGIN-------------HDVTTAKEGDYWRL--LPPGKYIITASAEGYD---PVT 436 (500)
T ss_pred hccceeEEcCC---CCccCceEEEEecCc-------------CceeecCCCceEEe--eCCcceEEEEecCCCc---cee
Confidence 34899888644 899999999996532 1124455564 444 7899999999998855 455
Q ss_pred eEEEEeCCceeeecceEEcC
Q 011825 257 ALVTITSGSNIKMGDLVYEP 276 (476)
Q Consensus 257 ~~VtV~aG~t~~lg~l~~~~ 276 (476)
.+|+|..-..+.. ++++..
T Consensus 437 k~v~V~~~~a~~~-df~L~~ 455 (500)
T KOG2649|consen 437 KTVTVPPDRAARV-NFTLQR 455 (500)
T ss_pred eEEEeCCCCccce-eEEEec
Confidence 6788886333344 577754
No 50
>COG2351 Transthyretin-like protein [General function prediction only]
Probab=79.00 E-value=6.3 Score=35.37 Aligned_cols=66 Identities=24% Similarity=0.306 Sum_probs=45.2
Q ss_pred eEEEEEEEEecCCCccccCceEEEecCCCCCCCccccccccceEEEeCCCcceE-----eCcccCcceEEEEEECceeee
Q 011825 178 GCVSGRLLVQDSNDVISANGAYVGLAPPGDVGSWQTECKDYQFWTTADEDGCFS-----IKNIRTGNYNLYAWVPGFVGD 252 (476)
Q Consensus 178 GtVsG~v~~~d~~~~~pa~~a~V~L~~~~~~g~~q~~~~~yqYwt~td~~G~Ft-----I~nV~pGtY~L~a~~~G~~G~ 252 (476)
|.++=.|+... .++||.++.|.|+.-... .|+. ---+.||.||+-. -..+++|.|+|..-+ ||
T Consensus 9 G~LTTHVLDta--~GkPAagv~V~L~rl~~~-~~~~-----l~t~~Tn~DGR~d~pll~g~~~~~G~Y~l~F~~----gd 76 (124)
T COG2351 9 GRLTTHVLDTA--SGKPAAGVKVELYRLEGN-QWEL-----LKTVVTNADGRIDAPLLAGETLATGIYELVFHT----GD 76 (124)
T ss_pred ceeeeeeeecc--cCCcCCCCEEEEEEecCC-ccee-----eeEEEecCCCcccccccCccccccceEEEEEEc----ch
Confidence 45666666544 489999999999743322 2322 3346789999876 356789999999864 66
Q ss_pred eee
Q 011825 253 YRS 255 (476)
Q Consensus 253 ~~~ 255 (476)
|..
T Consensus 77 Yf~ 79 (124)
T COG2351 77 YFK 79 (124)
T ss_pred hhh
Confidence 654
No 51
>PF14900 DUF4493: Domain of unknown function (DUF4493)
Probab=78.21 E-value=68 Score=31.38 Aligned_cols=53 Identities=17% Similarity=0.379 Sum_probs=33.5
Q ss_pred cceEEEeCCC-cceEeCcccCcceEEEEEECc--eee----eeeeeeEEEEeCCceeeecceEE
Q 011825 218 YQFWTTADED-GCFSIKNIRTGNYNLYAWVPG--FVG----DYRSDALVTITSGSNIKMGDLVY 274 (476)
Q Consensus 218 yqYwt~td~~-G~FtI~nV~pGtY~L~a~~~G--~~G----~~~~~~~VtV~aG~t~~lg~l~~ 274 (476)
..++..++-. +.+. +++|+|+|.|+... ..| .|.-+++++|.+|+++++. ++-
T Consensus 47 ~~~~~~~~~~~~~i~---L~~G~Ytv~A~~g~~~~~~~d~pyy~G~~~f~I~~g~~t~v~-v~C 106 (235)
T PF14900_consen 47 VKYWKYSEMPGESIE---LPVGSYTVKASYGDNVAAGFDKPYYEGSTTFTIEKGETTTVS-VTC 106 (235)
T ss_pred EEecchhccccceEe---ecCCcEEEEEEcCCCccccccCceeecceeEEEecCCcEEEE-EEE
Confidence 3444444444 3333 57999999999422 112 1444668999999998874 654
No 52
>PF03170 BcsB: Bacterial cellulose synthase subunit; InterPro: IPR018513 An operon encoding 4 proteins required for bacterial cellulose biosynthesis (bcs) in Acetobacter xylinus (Gluconacetobacter xylinus) has been isolated via genetic complementation with strains lacking cellulose synthase activity []. Nucleotide sequence analysis showed the cellulose synthase operon to consist of 4 genes, designated bcsA, bcsB, bcsC and bcsD, all of which are required for maximal bacterial cellulose synthesis in A. xylinum. The calculated molecular mass of the protein encoded by bcsB is 85.3kDa []. BcsB encodes the catalytic subunit of cellulose synthase. The protein polymerises uridine 5'-diphosphate glucose to cellulose: UDP-glucose + (1,4-beta-D-glucosyl)(N) = UDP + (1,4-beta-D-glucosyl)(N+1). The enzyme is specifically activated by the nucleotide cyclic diguanylic acid. Sequence analysis suggests that BcsB contains several transmembrane (TM) domains, and shares a high degree of similarity with Escherichia coli YhjN.; GO: 0006011 UDP-glucose metabolic process, 0016020 membrane
Probab=76.65 E-value=4.8 Score=44.99 Aligned_cols=78 Identities=18% Similarity=0.159 Sum_probs=52.9
Q ss_pred cccEEEEEEeCCCC---CCCcEEEEEEEecc-----CCCeEEEEEcCCCCCCCcccccccCCCCeeeceeeeeecEEEEE
Q 011825 364 GTTWQIKFKLDHVD---RNSSYKLRVAIASA-----TLAELQVRVNDPNANRPLFTTGLIGRDNAIARHGIHGLYLLYHV 435 (476)
Q Consensus 364 ~~~w~I~F~L~~~~---~~~~~tLriala~a-----~~~~~~V~vN~~~~~~~~~~~~~~~~~~~i~R~~~~G~~~~~~~ 435 (476)
..+..+.|.|+..- ......|.|..+.+ ..+++.|.|||.-+. +..+.+ +-.+....+++
T Consensus 323 ~~~~~~~f~lP~dl~~~~~~~i~l~L~y~y~~~~~~~~S~l~V~vNg~~i~-----s~~L~~-------~~~~~~~~~~v 390 (605)
T PF03170_consen 323 PQPISFNFRLPPDLFAWDGSGIPLHLRYRYTPGLDFDGSRLTVYVNGQFIG-----SLPLTP-------ADGAGFDRYTV 390 (605)
T ss_pred CCcceeEeeCCccccccCCCceEEEEEEecCCCCCCCCcEEEEEECCEEEE-----eEECCC-------CCCCccceeEE
Confidence 35778888887642 23456666666654 257899999998654 112211 22335678999
Q ss_pred EeeCCCeeeeecEEEEEee
Q 011825 436 NIPGTRFIEGENTIFLKQP 454 (476)
Q Consensus 436 ~ipa~~L~~G~NtI~l~~~ 454 (476)
.|| ..++.|.|.|.|...
T Consensus 391 ~iP-~~~~~~~N~l~~~f~ 408 (605)
T PF03170_consen 391 SIP-RLLLPGRNQLQFEFD 408 (605)
T ss_pred ecC-chhcCCCcEEEEEEE
Confidence 999 999999999888754
No 53
>PF10670 DUF4198: Domain of unknown function (DUF4198)
Probab=75.22 E-value=9.5 Score=36.01 Aligned_cols=62 Identities=21% Similarity=0.210 Sum_probs=46.3
Q ss_pred CeEEEEEEEEecCCCccccCceEEEecCCCCCCCccccccccceEEEeCCCcceEeCcccCcceEEEEEE
Q 011825 177 RGCVSGRLLVQDSNDVISANGAYVGLAPPGDVGSWQTECKDYQFWTTADEDGCFSIKNIRTGNYNLYAWV 246 (476)
Q Consensus 177 RGtVsG~v~~~d~~~~~pa~~a~V~L~~~~~~g~~q~~~~~yqYwt~td~~G~FtI~nV~pGtY~L~a~~ 246 (476)
-..++.+|+- + ++|..++.|.+...+. |. +........+||++|.++|+=-+||.|-|.+..
T Consensus 150 g~~~~~~vl~-~---GkPl~~a~V~~~~~~~---~~-~~~~~~~~~~TD~~G~~~~~~~~~G~wli~a~~ 211 (215)
T PF10670_consen 150 GDPLPFQVLF-D---GKPLAGAEVEAFSPGG---WY-DVEHEAKTLKTDANGRATFTLPRPGLWLIRASH 211 (215)
T ss_pred CCEEEEEEEE-C---CeEcccEEEEEEECCC---cc-ccccceEEEEECCCCEEEEecCCCEEEEEEEEE
Confidence 4578888884 4 7999999998864422 11 112227888999999999998899999998854
No 54
>cd05822 TLP_HIUase HIUase (5-hydroxyisourate hydrolase) catalyzes the second step in a three-step ureide pathway in which 5-hydroxyisourate (HIU), a product of the uricase (urate oxidase) reaction, is hydrolyzed to 2-oxo-4-hydroxy-4-carboxy-5-ureidoimidazoline (OHCU). HIUase has high sequence similarity with transthyretins and is a member of the transthyretin-like protein (TLP) family. HIUase is distinguished from transthyretins by a conserved signature motif at its C-terminus that forms part of the active site. In HIUase, this motif is YRGS, while transthyretins have a conserved TAVV sequence in the same location. Most HIUases are cytosolic but in plants and slime molds, they are peroxisomal based on the presence of N-terminal periplasmic localization sequences. HIUase forms a homotetramer with each subunit consisting of eight beta-strands arranged in two sheets and a short alpha-helix. The central channel of the tetramer contains two independent binding sites, each located betw
Probab=74.97 E-value=6.5 Score=34.84 Aligned_cols=50 Identities=20% Similarity=0.179 Sum_probs=36.0
Q ss_pred CccccCceEEEecCCCCCCCccccccccceEEEeCCCcceEe-----CcccCcceEEEEEE
Q 011825 191 DVISANGAYVGLAPPGDVGSWQTECKDYQFWTTADEDGCFSI-----KNIRTGNYNLYAWV 246 (476)
Q Consensus 191 ~~~pa~~a~V~L~~~~~~g~~q~~~~~yqYwt~td~~G~FtI-----~nV~pGtY~L~a~~ 246 (476)
.++||.++-|-|....+. .|+. -.-+.||++|+..- ..+.+|+|+|..-.
T Consensus 12 ~G~PAagv~V~L~~~~~~-~~~~-----i~~~~Td~DGR~~~~~~~~~~~~~G~Y~l~F~~ 66 (112)
T cd05822 12 TGKPAAGVAVTLYRLDGN-GWTL-----LATGVTNADGRCDDLLPPGAQLAAGTYKLTFDT 66 (112)
T ss_pred CCcccCCCEEEEEEecCC-CeEE-----EEEEEECCCCCccCcccccccCCCeeEEEEEEh
Confidence 489999999999743221 1222 33477999998753 46889999999865
No 55
>PLN03059 beta-galactosidase; Provisional
Probab=74.43 E-value=3.3 Score=48.15 Aligned_cols=86 Identities=10% Similarity=0.097 Sum_probs=53.2
Q ss_pred ccEEEEEEeCCCCCCCcEEEEEEEeccCCCeEEEEEcCCCCCCCcccc--cccCCCCeeeceee--------eeecEEEE
Q 011825 365 TTWQIKFKLDHVDRNSSYKLRVAIASATLAELQVRVNDPNANRPLFTT--GLIGRDNAIARHGI--------HGLYLLYH 434 (476)
Q Consensus 365 ~~w~I~F~L~~~~~~~~~tLriala~a~~~~~~V~vN~~~~~~~~~~~--~~~~~~~~i~R~~~--------~G~~~~~~ 434 (476)
+=.+-.|++++.. ... +|-...-+.=.|.|||.+++. -+.. ..-+=+.|-+|+++ .|.-...-
T Consensus 621 twYK~~Fd~p~g~--Dpv----~LDm~gmGKG~aWVNG~nIGR-YW~~~a~~~gC~~c~y~g~~~~~kc~~~cggP~q~l 693 (840)
T PLN03059 621 TWYKTTFDAPGGN--DPL----ALDMSSMGKGQIWINGQSIGR-HWPAYTAHGSCNGCNYAGTFDDKKCRTNCGEPSQRW 693 (840)
T ss_pred eEEEEEEeCCCCC--CCE----EEecccCCCeeEEECCccccc-ccccccccCCCccccccccccchhhhccCCCceeEE
Confidence 3446778764421 112 222334566689999999974 3322 11122456777776 24556667
Q ss_pred EEeeCCCeeeeecEEEEEeecCC
Q 011825 435 VNIPGTRFIEGENTIFLKQPRCT 457 (476)
Q Consensus 435 ~~ipa~~L~~G~NtI~l~~~~gs 457 (476)
+.||+++|++|.|+|.|==..|.
T Consensus 694 YHVPr~~Lk~g~N~lViFEe~gg 716 (840)
T PLN03059 694 YHVPRSWLKPSGNLLIVFEEWGG 716 (840)
T ss_pred EeCcHHHhccCCceEEEEEecCC
Confidence 78999999999999877544443
No 56
>PF02369 Big_1: Bacterial Ig-like domain (group 1); InterPro: IPR003344 Proteins that contain this domain are found in a variety of bacterial and phage surface proteins such as intimins. Intimin is a bacterial cell-adhesion molecule that mediates the intimate bacterial host-cell interaction. It contains three domains: two immunoglobulin-like domains and a C-type lectin-like module, implying that carbohydrate recognition may be important in intimin-mediated cell adhesion.; PDB: 1CWV_A 4E9L_A 1F02_I 1F00_I.
Probab=70.19 E-value=22 Score=30.38 Aligned_cols=68 Identities=19% Similarity=0.267 Sum_probs=35.1
Q ss_pred CCeEEEEEEEEecCCCccccCceEEEecCCCCCCCccccccccceEEEeCCCcc--eEeCcccCcceEEEEEECc
Q 011825 176 ERGCVSGRLLVQDSNDVISANGAYVGLAPPGDVGSWQTECKDYQFWTTADEDGC--FSIKNIRTGNYNLYAWVPG 248 (476)
Q Consensus 176 ~RGtVsG~v~~~d~~~~~pa~~a~V~L~~~~~~g~~q~~~~~yqYwt~td~~G~--FtI~nV~pGtY~L~a~~~G 248 (476)
....+.=++++.| ..+.|..+..|-+......+.... ... -+.||++|. +++..-++|+|++.|...|
T Consensus 21 g~~~~tltatV~D-~~gnpv~g~~V~f~~~~~~~~l~~--~~~--~~~Td~~G~a~~tltst~aG~~~VtA~~~~ 90 (100)
T PF02369_consen 21 GSDTNTLTATVTD-ANGNPVPGQPVTFSSSSSGGTLSP--TNT--SATTDSNGIATVTLTSTKAGTYTVTATVDG 90 (100)
T ss_dssp SSS-EEEEEEEEE-TTSEB-TS-EEEE--EESSSEES---CEE---EEE-TTSEEEEEEE-SS-EEEEEEEEETT
T ss_pred CcCcEEEEEEEEc-CCCCCCCCCEEEEEEcCCCcEEec--Ccc--ccEECCCEEEEEEEEecCceEEEEEEEECC
Confidence 3344444455555 237788888888821111111111 000 357999995 5667789999999999875
No 57
>cd03457 intradiol_dioxygenase_like Intradiol dioxygenase subgroup. Intradiol dioxygenases catalyze the critical ring-cleavage step in the conversion of catecholate derivatives to citric acid cycle intermediates. They break the catechol C1-C2 bond and utilize Fe3+, as opposed to the extradiol-cleaving enzymes which break the C2-C3 or C1-C6 bond and utilize Fe2+ and Mn+. The family contains catechol 1,2-dioxygenases and protocatechuate 3,4-dioxygenases. The specific function of this subgroup is unknown.
Probab=68.99 E-value=11 Score=36.33 Aligned_cols=62 Identities=15% Similarity=0.128 Sum_probs=41.4
Q ss_pred eEEEEEEEEecCCCccccCceEEEecCCCCCCCcccc-----------ccccce-EEEeCCCcceEeCcccCcceE
Q 011825 178 GCVSGRLLVQDSNDVISANGAYVGLAPPGDVGSWQTE-----------CKDYQF-WTTADEDGCFSIKNIRTGNYN 241 (476)
Q Consensus 178 GtVsG~v~~~d~~~~~pa~~a~V~L~~~~~~g~~q~~-----------~~~yqY-wt~td~~G~FtI~nV~pGtY~ 241 (476)
=.+.|+|...+ .++|..+|.|=+-.....|..... ...+-. +..||++|.|+|.-|.||-|.
T Consensus 27 l~l~g~V~D~~--~c~Pv~~a~VdiWh~da~G~Ys~~~~~~~~~~~~~~~~flRG~~~TD~~G~~~F~TI~PG~Y~ 100 (188)
T cd03457 27 LTLDLQVVDVA--TCCPPPNAAVDIWHCDATGVYSGYSAGGGGGEDTDDETFLRGVQPTDADGVVTFTTIFPGWYP 100 (188)
T ss_pred EEEEEEEEeCC--CCccCCCeEEEEecCCCCCCCCCccCCccccccccCCCcCEEEEEECCCccEEEEEECCCCCC
Confidence 47889887543 368999999987443333422221 111111 366899999999999999985
No 58
>smart00095 TR_THY Transthyretin.
Probab=68.95 E-value=12 Score=33.59 Aligned_cols=62 Identities=15% Similarity=0.085 Sum_probs=40.6
Q ss_pred eEEEEEEEEecCCCccccCceEEEecCCCCCCCccccccccceEEEeCCCcceE----eCcccCcceEEEEEE
Q 011825 178 GCVSGRLLVQDSNDVISANGAYVGLAPPGDVGSWQTECKDYQFWTTADEDGCFS----IKNIRTGNYNLYAWV 246 (476)
Q Consensus 178 GtVsG~v~~~d~~~~~pa~~a~V~L~~~~~~g~~q~~~~~yqYwt~td~~G~Ft----I~nV~pGtY~L~a~~ 246 (476)
-.|+=+|+... .++||.++.|.|......+.|+. ---..||.+|+.. -..+.+|.|+|..-.
T Consensus 4 ~plTtHVLDt~--~G~PAagv~V~L~~~~~~~~w~~-----la~~~Tn~DGR~~~ll~~~~~~~G~Y~l~F~t 69 (121)
T smart00095 4 CPLMVKVLDAV--RGSPAVNVAVKVFKKTEEGTWEP-----FASGKTNESGEIHELTTDEKFVEGLYKVEFDT 69 (121)
T ss_pred CCeEEEEEECC--CCccCCCCEEEEEEeCCCCceEE-----EEEEecCCCccccCccCcccccceEEEEEEeh
Confidence 34566665443 48999999999964211122332 2236689999874 245779999999864
No 59
>PF09912 DUF2141: Uncharacterized protein conserved in bacteria (DUF2141); InterPro: IPR018673 This family of conserved hypothetical proteins has no known function.
Probab=66.13 E-value=14 Score=32.49 Aligned_cols=49 Identities=10% Similarity=0.161 Sum_probs=30.9
Q ss_pred ceEEEecCCCCCCCccccccccceEEEeC---CCcceEeCcccCcceEEEEEECc
Q 011825 197 GAYVGLAPPGDVGSWQTECKDYQFWTTAD---EDGCFSIKNIRTGNYNLYAWVPG 248 (476)
Q Consensus 197 ~a~V~L~~~~~~g~~q~~~~~yqYwt~td---~~G~FtI~nV~pGtY~L~a~~~G 248 (476)
.+.|.|+...+ +|.. ...+-....+. .+-.++|++++||+|.+.++.|.
T Consensus 12 ~v~v~ly~~~~--~f~~-~~~~~~~~~~~~~~~~~~~~f~~lp~G~YAi~v~hD~ 63 (112)
T PF09912_consen 12 QVRVALYNSAE--GFEN-KKKALKRVKVPAKGGTVTITFEDLPPGTYAIAVFHDE 63 (112)
T ss_pred EEEEEEEcChh--chhh-cccceeEEEEEcCCCcEEEEECCCCCccEEEEEEEeC
Confidence 36677776533 3522 23333333332 34489999999999999999753
No 60
>PRK10340 ebgA cryptic beta-D-galactosidase subunit alpha; Reviewed
Probab=62.92 E-value=9.9 Score=45.47 Aligned_cols=68 Identities=16% Similarity=0.174 Sum_probs=46.2
Q ss_pred cEEEEEEeCCCCCCCcEEEEEEEeccCCCeEEEEEcCCCCCCCcccccccCCCCeeeceeeeeecEEEEEEeeCCCeeee
Q 011825 366 TWQIKFKLDHVDRNSSYKLRVAIASATLAELQVRVNDPNANRPLFTTGLIGRDNAIARHGIHGLYLLYHVNIPGTRFIEG 445 (476)
Q Consensus 366 ~w~I~F~L~~~~~~~~~tLriala~a~~~~~~V~vN~~~~~~~~~~~~~~~~~~~i~R~~~~G~~~~~~~~ipa~~L~~G 445 (476)
-.+=.|.+++........|++..+ .....|.|||+.++ .. .|-+.-++|+|.. .|+.|
T Consensus 112 ~Yrr~F~lp~~~~gkrv~L~FeGV---~s~a~VwvNG~~VG----------~~--------~g~~~pfefDIT~-~l~~G 169 (1021)
T PRK10340 112 AYQRTFTLSDGWQGKQTIIKFDGV---ETYFEVYVNGQYVG----------FS--------KGSRLTAEFDISA-MVKTG 169 (1021)
T ss_pred EEEEEEEeCcccccCcEEEEECcc---ceEEEEEECCEEec----------cc--------cCCCccEEEEcch-hhCCC
Confidence 344568887654333345555432 56789999998663 11 1446778899987 67899
Q ss_pred ecEEEEEeec
Q 011825 446 ENTIFLKQPR 455 (476)
Q Consensus 446 ~NtI~l~~~~ 455 (476)
+|+|.+.|.+
T Consensus 170 ~N~LaV~V~~ 179 (1021)
T PRK10340 170 DNLLCVRVMQ 179 (1021)
T ss_pred ccEEEEEEEe
Confidence 9999999853
No 61
>PF08531 Bac_rhamnosid_N: Alpha-L-rhamnosidase N-terminal domain; InterPro: IPR013737 This domain is found in bacterial rhamnosidase A and B enzymes and is probably involved in substrate recognition. ; PDB: 2OKX_B.
Probab=57.71 E-value=8.3 Score=36.15 Aligned_cols=62 Identities=18% Similarity=0.146 Sum_probs=31.3
Q ss_pred cEEEEEEEeccCCCeEEEEEcCCCCCCCcccccccCCCCeeeceeeeeecEEEEEEeeCCCeeeeecEEEEEeecC
Q 011825 381 SYKLRVAIASATLAELQVRVNDPNANRPLFTTGLIGRDNAIARHGIHGLYLLYHVNIPGTRFIEGENTIFLKQPRC 456 (476)
Q Consensus 381 ~~tLriala~a~~~~~~V~vN~~~~~~~~~~~~~~~~~~~i~R~~~~G~~~~~~~~ipa~~L~~G~NtI~l~~~~g 456 (476)
.++|.|+ +.+..++.|||+.+.. .+..+ ...-++. +-+|. +++| +.+|++|+|+|-+.+..|
T Consensus 5 ~A~l~is----a~g~Y~l~vNG~~V~~----~~l~P-~~t~y~~--~~~Y~--tyDV-t~~L~~G~N~iav~lg~g 66 (172)
T PF08531_consen 5 SARLYIS----ALGRYELYVNGERVGD----GPLAP-GWTDYDK--RVYYQ--TYDV-TPYLRPGENVIAVWLGNG 66 (172)
T ss_dssp --EEEEE----EESEEEEEETTEEEEE----E---------BTT--EEEEE--EEE--TTT--TTEEEEEEEEEE-
T ss_pred EEEEEEE----eCeeEEEEECCEEeeC----Ccccc-ccccCCC--ceEEE--EEeC-hHHhCCCCCEEEEEEeCC
Confidence 3455553 5579999999986642 11000 0000111 12344 4444 478999999999998654
No 62
>PF01060 DUF290: Transthyretin-like family; InterPro: IPR001534 This new apparently nematode-specific protein family has been called family 2 []. The proteins show weak similarity to transthyretin (formerly called prealbumin) which transports thyroid hormones. The specific function of this protein is unknown.; GO: 0005615 extracellular space
Probab=56.97 E-value=23 Score=28.96 Aligned_cols=49 Identities=22% Similarity=0.188 Sum_probs=30.0
Q ss_pred EEEEEEecCCCccccCceEEEecCCCCCCCccccccccceEEEeCCCcceEeCcccCc
Q 011825 181 SGRLLVQDSNDVISANGAYVGLAPPGDVGSWQTECKDYQFWTTADEDGCFSIKNIRTG 238 (476)
Q Consensus 181 sG~v~~~d~~~~~pa~~a~V~L~~~~~~g~~q~~~~~yqYwt~td~~G~FtI~nV~pG 238 (476)
+|+|.=. ++|+.++.|-|.+... ......-=-+.||++|+|+|..-...
T Consensus 1 ~G~L~C~----~~P~~~~~V~L~e~d~-----~~~Ddll~~~~Td~~G~F~l~G~~~e 49 (80)
T PF01060_consen 1 KGQLMCG----GKPAKNVKVKLWEDDY-----FDPDDLLDETKTDSDGNFELSGSTNE 49 (80)
T ss_pred CeEEEeC----CccCCCCEEEEEECCC-----CCCCceeEEEEECCCceEEEEEEccC
Confidence 3666643 5899999999954211 01122222377899999999754333
No 63
>PF01190 Pollen_Ole_e_I: Pollen proteins Ole e I like; InterPro: IPR006041 Allergies are hypersensitivity reactions of the immune system to specific substances called allergens (such as pollen, stings, drugs, or food) that, in most people, result in no symptoms. A nomenclature system has been established for antigens (allergens) that cause IgE-mediated atopic allergies in humans [WHO/IUIS Allergen Nomenclature Subcommittee King T.P., Hoffmann D., Loewenstein H., Marsh D.G., Platts-Mills T.A.E., Thomas W. Bull. World Health Organ. 72:797-806(1994)]. This nomenclature system is defined by a designation that is composed of the first three letters of the genus; a space; the first letter of the species name; a space and an arabic number. In the event that two species names have identical designations, they are discriminated from one another by adding one or more letters (as necessary) to each species designation. The allergens in this family include allergens with the following designations: Ole e 1. A number of plant pollen proteins, whose biological function is not yet known, are structurally related []. These proteins are most probably secreted and consist of about 145 residues. There are six cysteines which are conserved in the sequence of these proteins. They seem to be involved in disulphide bonds.
Probab=55.91 E-value=22 Score=29.95 Aligned_cols=37 Identities=24% Similarity=0.242 Sum_probs=24.9
Q ss_pred ccccCceEEEecCCCCCCCccccccccceEEEeCCCcceEeC
Q 011825 192 VISANGAYVGLAPPGDVGSWQTECKDYQFWTTADEDGCFSIK 233 (476)
Q Consensus 192 ~~pa~~a~V~L~~~~~~g~~q~~~~~yqYwt~td~~G~FtI~ 233 (476)
..|..+|.|.|.=.+.. ........+.||++|.|.|.
T Consensus 18 ~~~l~GA~V~v~C~~~~-----~~~~~~~~~~Td~~G~F~i~ 54 (97)
T PF01190_consen 18 AKPLPGAKVSVECKDGN-----GGVVFSAEAKTDENGYFSIE 54 (97)
T ss_pred CccCCCCEEEEECCCCC-----CCcEEEEEEEeCCCCEEEEE
Confidence 46788999998522111 00235666889999999996
No 64
>PF13754 Big_3_4: Bacterial Ig-like domain (group 3)
Probab=55.61 E-value=23 Score=26.86 Aligned_cols=28 Identities=29% Similarity=0.503 Sum_probs=20.8
Q ss_pred ceEEEeCCCcceEe--CcccCcceEEEEEE
Q 011825 219 QFWTTADEDGCFSI--KNIRTGNYNLYAWV 246 (476)
Q Consensus 219 qYwt~td~~G~FtI--~nV~pGtY~L~a~~ 246 (476)
.|.+.+|++|.+++ +....|+|++.+.+
T Consensus 3 ~~~~t~~~~G~Ws~t~~~~~dG~y~itv~a 32 (54)
T PF13754_consen 3 TYTTTVDSDGNWSFTVPALADGTYTITVTA 32 (54)
T ss_pred EEEEEECCCCcEEEeCCCCCCccEEEEEEE
Confidence 56778899998766 55556888888875
No 65
>PF11008 DUF2846: Protein of unknown function (DUF2846); InterPro: IPR022548 Some members in this group of proteins with unknown function are annotated as lipoproteins. However this cannot be confirmed.
Probab=51.22 E-value=19 Score=31.47 Aligned_cols=43 Identities=16% Similarity=0.220 Sum_probs=29.7
Q ss_pred CCcceEeCcccCcceEEEEEECceee-eeeeeeEEEEeCCceeee
Q 011825 226 EDGCFSIKNIRTGNYNLYAWVPGFVG-DYRSDALVTITSGSNIKM 269 (476)
Q Consensus 226 ~~G~FtI~nV~pGtY~L~a~~~G~~G-~~~~~~~VtV~aG~t~~l 269 (476)
..|.|..-.|+||+|++.+-. ++.+ .-..+.+|+|.+|++-=+
T Consensus 56 ~~g~y~~~~v~pG~h~i~~~~-~~~~~~~~~~l~~~~~~G~~yy~ 99 (117)
T PF11008_consen 56 KNGGYFYVEVPPGKHTISAKS-EFSSSPGANSLDVTVEAGKTYYV 99 (117)
T ss_pred CCCeEEEEEECCCcEEEEEec-CccCCCCccEEEEEEcCCCEEEE
Confidence 567777778999999999953 2221 011445799999998654
No 66
>PF03944 Endotoxin_C: delta endotoxin; InterPro: IPR005638 This family contains insecticidal toxins produced by Bacillus species of bacteria. During spore formation the bacteria produce crystals of this protein. When an insect ingests these proteins, they are activated by proteolytic cleavage. The N terminus is cleaved in all of the proteins and a C-terminal extension is cleaved in some members. Once activated, the endotoxin binds to the gut epithelium and causes cell lysis by the formation of cation-selective channels, which leads to death. The activated region of the delta toxin is composed of three distinct structural domains: an N-terminal helical bundle domain (IPR005639 from INTERPRO) involved in membrane insertion and pore formation; a beta-sheet central domain (IPR001178 from INTERPRO) involved in receptor binding; and a C-terminal beta-sandwich domain that interacts with the N-terminal domain to form a channel [, ]. This entry represents the conserved C-terminal domain.; PDB: 1DLC_A 1JI6_A 1W99_A 1CIY_A 1I5P_A 2C9K_A 3EB7_A.
Probab=47.47 E-value=56 Score=29.64 Aligned_cols=95 Identities=18% Similarity=0.296 Sum_probs=48.0
Q ss_pred EEEEEeCCCCCCCcEEEEEEEeccCCCeEEEEEcCCCCC--CCcccccccCCCCeeeceeeeeecEEEEE-EeeCC-Cee
Q 011825 368 QIKFKLDHVDRNSSYKLRVAIASATLAELQVRVNDPNAN--RPLFTTGLIGRDNAIARHGIHGLYLLYHV-NIPGT-RFI 443 (476)
Q Consensus 368 ~I~F~L~~~~~~~~~tLriala~a~~~~~~V~vN~~~~~--~~~~~~~~~~~~~~i~R~~~~G~~~~~~~-~ipa~-~L~ 443 (476)
+|++..+. .....|.+||-.|+...+.+.|.+++.... -.++.+..- ... +.++|..|.+ +++.. .+.
T Consensus 41 ~~~v~~~~-~~~~~YrIRiRYAs~~~~~~~i~~~~~~~~~~~~~~~T~~~--~~~-----~~~~y~~F~y~~~~~~~~~~ 112 (143)
T PF03944_consen 41 KIRVTINN-SSSQKYRIRIRYASNSNGTLSISINNSSGNLSFNFPSTMSN--GDN-----LTLNYESFQYVEFPTPFTFS 112 (143)
T ss_dssp EEEEEESS-SSTEEEEEEEEEEESS-EEEEEEETTEEEECEEEE--SSST--TGG-----CCETGGG-EEEEESSEEEES
T ss_pred EEEEEecC-CCCceEEEEEEEEECCCcEEEEEECCccceeeeeccccccC--CCc-----cccccceeEeeecCceEEec
Confidence 44444332 234679999999998889999999887441 011111111 111 3332332222 22221 122
Q ss_pred eee-cEEEEEeecCCCCCceEEEEEEEEe
Q 011825 444 EGE-NTIFLKQPRCTSPFQGIMYDYIRLE 471 (476)
Q Consensus 444 ~G~-NtI~l~~~~gss~~~~vmyD~IrLe 471 (476)
.+. .+|.|.+...++. ..|.-|=|++.
T Consensus 113 ~~~~~~~~i~i~~~~~~-~~v~IDkIEFI 140 (143)
T PF03944_consen 113 SNQSITITISIQNISSN-GNVYIDKIEFI 140 (143)
T ss_dssp TSEEEEEEEEEESSTTT-S-EEEEEEEEE
T ss_pred CCCceEEEEEEEecCCC-CeEEEEeEEEE
Confidence 222 5677765443322 47899999875
No 67
>PF07550 DUF1533: Protein of unknown function (DUF1533); InterPro: IPR011432 This domain is found duplicated in proteins of unknown function. The proteins typically also contain leucine-rich repeats.
Probab=46.54 E-value=16 Score=28.88 Aligned_cols=19 Identities=42% Similarity=0.726 Sum_probs=17.0
Q ss_pred EEeeCCCe-eeeecEEEEEe
Q 011825 435 VNIPGTRF-IEGENTIFLKQ 453 (476)
Q Consensus 435 ~~ipa~~L-~~G~NtI~l~~ 453 (476)
+.|.+++| ++|+|+|+|.-
T Consensus 36 l~i~~~~f~~~G~~~I~I~A 55 (65)
T PF07550_consen 36 LKIKASAFNKDGENTIVIKA 55 (65)
T ss_pred EEEcHHHcCcCCceEEEEEe
Confidence 88899999 78999999983
No 68
>PRK09525 lacZ beta-D-galactosidase; Reviewed
Probab=46.06 E-value=36 Score=40.88 Aligned_cols=68 Identities=13% Similarity=0.136 Sum_probs=45.1
Q ss_pred cEEEEEEeCCCCCCC-cEEEEEEEeccCCCeEEEEEcCCCCCCCcccccccCCCCeeeceeeeeecEEEEEEeeCCCeee
Q 011825 366 TWQIKFKLDHVDRNS-SYKLRVAIASATLAELQVRVNDPNANRPLFTTGLIGRDNAIARHGIHGLYLLYHVNIPGTRFIE 444 (476)
Q Consensus 366 ~w~I~F~L~~~~~~~-~~tLriala~a~~~~~~V~vN~~~~~~~~~~~~~~~~~~~i~R~~~~G~~~~~~~~ipa~~L~~ 444 (476)
-.+=.|++++..... ...|++. +......|.|||+.++ --.|-+.-++|+|. ..|+.
T Consensus 123 wYrr~F~vp~~w~~~~rv~L~Fe---GV~~~a~VwvNG~~VG------------------~~~g~~~pfefDIT-~~l~~ 180 (1027)
T PRK09525 123 CYSLTFTVDESWLQSGQTRIIFD---GVNSAFHLWCNGRWVG------------------YSQDSRLPAEFDLS-PFLRA 180 (1027)
T ss_pred EEEEEEEeChhhcCCCeEEEEEC---eeccEEEEEECCEEEE------------------eecCCCceEEEECh-hhhcC
Confidence 344458887653222 3445544 2467889999998552 01244677899996 67789
Q ss_pred eecEEEEEeec
Q 011825 445 GENTIFLKQPR 455 (476)
Q Consensus 445 G~NtI~l~~~~ 455 (476)
|+|+|.+.|.+
T Consensus 181 G~N~L~V~V~~ 191 (1027)
T PRK09525 181 GENRLAVMVLR 191 (1027)
T ss_pred CccEEEEEEEe
Confidence 99999999854
No 69
>TIGR03000 plancto_dom_1 Planctomycetes uncharacterized domain TIGR03000. Domains described by this model are found, so far, only in the Planctomycetes (Pirellula sp. strain 1 and Gemmata obscuriglobus), in up to six proteins per genome, and may be duplicated within a protein. The function is unknown.
Probab=44.90 E-value=1.2e+02 Score=25.25 Aligned_cols=39 Identities=18% Similarity=0.332 Sum_probs=29.2
Q ss_pred ceEeCcccCcc---eEEEEEE--CceeeeeeeeeEEEEeCCceeee
Q 011825 229 CFSIKNIRTGN---YNLYAWV--PGFVGDYRSDALVTITSGSNIKM 269 (476)
Q Consensus 229 ~FtI~nV~pGt---Y~L~a~~--~G~~G~~~~~~~VtV~aG~t~~l 269 (476)
.|.-+++.+|. |++.+-. +|- ....+.+|.|.||.+.++
T Consensus 30 ~F~T~~L~~G~~y~Y~v~a~~~~dG~--~~t~~~~V~vrAGd~~~v 73 (75)
T TIGR03000 30 TFTTPPLEAGKEYEYTVTAEYDRDGR--ILTRTRTVVVRAGDTVTV 73 (75)
T ss_pred EEECCCCCCCCEEEEEEEEEEecCCc--EEEEEEEEEEcCCceEEe
Confidence 69999999997 6666642 552 245677899999998776
No 70
>PRK10150 beta-D-glucuronidase; Provisional
Probab=42.25 E-value=51 Score=36.90 Aligned_cols=66 Identities=17% Similarity=0.121 Sum_probs=42.2
Q ss_pred EEEEEEeCCCCCCCcEEEEEEEeccCCCeEEEEEcCCCCCCCcccccccCCCCeeeceeeeeecEEEEEEeeCCCeeeee
Q 011825 367 WQIKFKLDHVDRNSSYKLRVAIASATLAELQVRVNDPNANRPLFTTGLIGRDNAIARHGIHGLYLLYHVNIPGTRFIEGE 446 (476)
Q Consensus 367 w~I~F~L~~~~~~~~~tLriala~a~~~~~~V~vN~~~~~~~~~~~~~~~~~~~i~R~~~~G~~~~~~~~ipa~~L~~G~ 446 (476)
.+=.|+|++........|++.. ....-+|.|||+.++ .-.|-+.-++|+|.. .|+.|+
T Consensus 69 Yrr~f~lp~~~~gk~v~L~Feg---v~~~a~V~lNG~~vg------------------~~~~~~~~f~~DIT~-~l~~G~ 126 (604)
T PRK10150 69 YQREVFIPKGWAGQRIVLRFGS---VTHYAKVWVNGQEVM------------------EHKGGYTPFEADITP-YVYAGK 126 (604)
T ss_pred EEEEEECCcccCCCEEEEEECc---ccceEEEEECCEEee------------------eEcCCccceEEeCch-hccCCC
Confidence 3445888764322334455532 345668999998653 112456778999975 577886
Q ss_pred c-EEEEEee
Q 011825 447 N-TIFLKQP 454 (476)
Q Consensus 447 N-tI~l~~~ 454 (476)
| +|.+.+.
T Consensus 127 ~n~L~V~v~ 135 (604)
T PRK10150 127 SVRITVCVN 135 (604)
T ss_pred ceEEEEEEe
Confidence 5 9999984
No 71
>PRK13211 N-acetylglucosamine-binding protein A; Reviewed
Probab=36.25 E-value=1.6e+02 Score=32.49 Aligned_cols=66 Identities=9% Similarity=0.085 Sum_probs=40.1
Q ss_pred CCCCCCCCCeEEEEEEEEecCCCccccCceEEEecCCCCCCCccccccccceEEEeCCCcceEeC--cccCcceEEEEEE
Q 011825 169 EDFQKSEERGCVSGRLLVQDSNDVISANGAYVGLAPPGDVGSWQTECKDYQFWTTADEDGCFSIK--NIRTGNYNLYAWV 246 (476)
Q Consensus 169 ~~y~~~s~RGtVsG~v~~~d~~~~~pa~~a~V~L~~~~~~g~~q~~~~~yqYwt~td~~G~FtI~--nV~pGtY~L~a~~ 246 (476)
++|.-..+..+|.=+|... ....+.+-|.+.. .+.++++--+..|.+-.|+|. ++++|.|+|.+.+
T Consensus 320 ~eY~I~dG~~~i~ftv~a~------g~~~vta~V~d~~------g~~~~~~~~~v~d~s~~vtL~Ls~~~AG~y~Lvv~~ 387 (478)
T PRK13211 320 KEYKIGDGAATLDFTVTAT------GDMNVEATVYNHD------GEALGSKSQTVNDGSQSVSLDLSKLKAGHHMLVVKA 387 (478)
T ss_pred ceeEEcCCcEEEEEEEEec------cceEEEEEEEcCC------CCeeeeeeEEecCCceeEEEecccCCCceEEEEEEE
Confidence 6677766666665555542 2456666665421 123444444444555567665 9999999999864
No 72
>PF11797 DUF3324: Protein of unknown function C-terminal (DUF3324); InterPro: IPR021759 This family consists of several hypothetical bacterial proteins of unknown function.
Probab=35.97 E-value=41 Score=30.52 Aligned_cols=30 Identities=20% Similarity=0.207 Sum_probs=19.8
Q ss_pred cccCcceEEEEEECceeeeeeeeeEEEEeC
Q 011825 234 NIRTGNYNLYAWVPGFVGDYRSDALVTITS 263 (476)
Q Consensus 234 nV~pGtY~L~a~~~G~~G~~~~~~~VtV~a 263 (476)
.++||+|+|.+-+..-.+....+.+++|++
T Consensus 102 ~lk~G~Y~l~~~~~~~~~~W~f~k~F~It~ 131 (140)
T PF11797_consen 102 KLKPGKYTLKITAKSGKKTWTFTKDFTITA 131 (140)
T ss_pred CccCCEEEEEEEEEcCCcEEEEEEEEEECH
Confidence 799999999987643233333445677764
No 73
>smart00634 BID_1 Bacterial Ig-like domain (group 1).
Probab=32.87 E-value=2.1e+02 Score=23.57 Aligned_cols=64 Identities=14% Similarity=0.122 Sum_probs=35.9
Q ss_pred eEEEEEEEEecCCCccccCceEEEecCCCCCCCccccccccceEEEeCCCcc--eEeCcccCcceEEEEEECce
Q 011825 178 GCVSGRLLVQDSNDVISANGAYVGLAPPGDVGSWQTECKDYQFWTTADEDGC--FSIKNIRTGNYNLYAWVPGF 249 (476)
Q Consensus 178 GtVsG~v~~~d~~~~~pa~~a~V~L~~~~~~g~~q~~~~~yqYwt~td~~G~--FtI~nV~pGtY~L~a~~~G~ 249 (476)
..|+=+|...+ +.|..+..|.+.-.+.. .....+ -...||++|. +.+..-++|++++++...|.
T Consensus 20 ~~i~v~v~D~~---Gnpv~~~~V~f~~~~~~-~~~~~~----~~~~Td~~G~a~~~l~~~~~G~~~vta~~~~~ 85 (92)
T smart00634 20 ITLTATVTDAN---GNPVAGQEVTFTTPSGG-ALTLSK----GTATTDANGIATVTLTSTTAGVYTVTASLENG 85 (92)
T ss_pred EEEEEEEECCC---CCCcCCCEEEEEECCCc-eeeccC----CeeeeCCCCEEEEEEECCCCcEEEEEEEECCC
Confidence 45555554433 55666666666533221 111111 1236888996 44556678999999987663
No 74
>PF03785 Peptidase_C25_C: Peptidase family C25, C terminal ig-like domain; InterPro: IPR005536 This domain is found in almost all members of MEROPS peptidase family C25, (clan CD). Peptidase family C25 is a protein family found in the bacteria Porphyromonas gingivalis (Bacteroides gingivalis) a Gram-negative anaerobic bacterial species strongly associated with adult periodontitis. One of its distinguishing characteristics and putative virulence properties is the ability to agglutinate erythrocytes []. It is a highly proteolytic organism which metabolises small peptides and amino acids. Indirect evidence suggests that the proteases produced by this microorganism constitute an important virulence factor []. Protease-encoding genes have been shown to contain multiple copies of repeated nucleotide sequences. These conserved sequences have also been found in haemagglutinin genes [].; GO: 0008233 peptidase activity, 0006508 proteolysis; PDB: 1CVR_A.
Probab=32.79 E-value=1.2e+02 Score=25.53 Aligned_cols=39 Identities=21% Similarity=0.298 Sum_probs=25.0
Q ss_pred CceEEEecCCCCCCCccccccccceEEEeCCCcceEeCccc-----CcceEEEEEE
Q 011825 196 NGAYVGLAPPGDVGSWQTECKDYQFWTTADEDGCFSIKNIR-----TGNYNLYAWV 246 (476)
Q Consensus 196 ~~a~V~L~~~~~~g~~q~~~~~yqYwt~td~~G~FtI~nV~-----pGtY~L~a~~ 246 (476)
.++.|-|+.. +--|....-++|+++|+ +. +|+|+|++-.
T Consensus 26 ~gs~ValS~d-----------g~l~G~ai~~sG~ati~-l~~~it~~~~~tlTit~ 69 (81)
T PF03785_consen 26 PGSYVALSQD-----------GDLYGKAIVNSGNATIN-LTNPITDEGTLTLTITA 69 (81)
T ss_dssp TT-EEEEEET-----------TEEEEEEE-BTTEEEEE--SS--TT-SEEEEEEE-
T ss_pred CCcEEEEecC-----------CEEEEEEEecCceEEEE-CCcccCCCceEEEEEEE
Confidence 5678888754 33677555449999985 55 6889998864
No 75
>PF14200 RicinB_lectin_2: Ricin-type beta-trefoil lectin domain-like; PDB: 2X2S_C 2X2T_A 2VSE_B 2VSA_A 3EF2_A 2IHO_A 3HZB_H 1YBI_B 3PHZ_A 3NBE_A ....
Probab=31.14 E-value=53 Score=27.59 Aligned_cols=48 Identities=31% Similarity=0.447 Sum_probs=29.2
Q ss_pred EEEEEEecCCCccccCceEEEecCCCCCCCccccccccceEEEeC-CCcceEeCcccCc
Q 011825 181 SGRLLVQDSNDVISANGAYVGLAPPGDVGSWQTECKDYQFWTTAD-EDGCFSIKNIRTG 238 (476)
Q Consensus 181 sG~v~~~d~~~~~pa~~a~V~L~~~~~~g~~q~~~~~yqYwt~td-~~G~FtI~nV~pG 238 (476)
+|+++..+. .....++.|.+.. ......|.|.... .+|.|.|.++..|
T Consensus 24 sg~~L~v~~--~~~~~g~~v~~~~--------~~~~~~Q~W~i~~~~~g~y~I~n~~s~ 72 (105)
T PF14200_consen 24 SGKYLDVAG--GSTANGTNVQQWT--------CNGNDNQQWKIEPVGDGYYRIRNKNSG 72 (105)
T ss_dssp TTEEEEEGC--TTCSTTEBEEEEE--------SSSSGGGEEEEEESTTSEEEEEETSTT
T ss_pred CCCEEEeCC--CCcCCCcEEEEec--------CCCCcCcEEEEEEecCCeEEEEECCCC
Confidence 455554441 2334566666642 2236778886664 6688999888665
No 76
>PF12866 DUF3823: Protein of unknown function (DUF3823); InterPro: IPR024278 This is a family of uncharacterised proteins from Bacteroidetes. These proteins have characteristic DN and DR sequence-motifs but their function is not known.; PDB: 3HN5_B 4EIU_A.
Probab=28.97 E-value=1.5e+02 Score=29.23 Aligned_cols=61 Identities=18% Similarity=0.326 Sum_probs=37.9
Q ss_pred CeEEEEEEEEecCCC--ccccCceEEEecCCCCCCCccccccccceEEEeCCCcceEeCcccCcceEEEE
Q 011825 177 RGCVSGRLLVQDSND--VISANGAYVGLAPPGDVGSWQTECKDYQFWTTADEDGCFSIKNIRTGNYNLYA 244 (476)
Q Consensus 177 RGtVsG~v~~~d~~~--~~pa~~a~V~L~~~~~~g~~q~~~~~yqYwt~td~~G~FtI~nV~pGtY~L~a 244 (476)
-++++|+|....... .....++.|-|.+.+ |... +.|.+ ....+|.|.-..+=+|+|+|..
T Consensus 21 ~s~l~G~iiD~~tgE~i~~~~~gv~i~l~e~g----y~~~--~~~~~-~v~qDGtf~n~~lF~G~Yki~~ 83 (222)
T PF12866_consen 21 DSTLTGRIIDVYTGEPIQTDIGGVRIQLYELG----YGDN--TPQDV-YVKQDGTFRNTKLFDGDYKIVP 83 (222)
T ss_dssp -EEEEEEEEECCTTEE----STSSEEEEECS-----CCG----SEEE-EB-TTSEEEEEEE-SEEEEEEE
T ss_pred CceEEEEEEEeecCCeeeecCCceEEEEEecc----cccC--CCcce-EEccCCceeeeeEeccceEEEE
Confidence 589999996533111 111247888887653 5532 44444 3778999988899999999998
No 77
>PF10794 DUF2606: Protein of unknown function (DUF2606); InterPro: IPR019730 This entry represents bacterial proteins with unknown function.
Probab=28.09 E-value=1.8e+02 Score=26.38 Aligned_cols=64 Identities=19% Similarity=0.034 Sum_probs=39.7
Q ss_pred EEEEEEEecCCCccccCceEEEecC-CCCCCCccccccccceEEEeCCCcceEeCcccCcceEEEEEECc
Q 011825 180 VSGRLLVQDSNDVISANGAYVGLAP-PGDVGSWQTECKDYQFWTTADEDGCFSIKNIRTGNYNLYAWVPG 248 (476)
Q Consensus 180 VsG~v~~~d~~~~~pa~~a~V~L~~-~~~~g~~q~~~~~yqYwt~td~~G~FtI~nV~pGtY~L~a~~~G 248 (476)
|+=.|...+ ++|+.++.|.|-. +.. +-|-...--.--.+||..|.+.-++++-|+|.+.+-.+|
T Consensus 44 VT~hVen~e---~~pi~~~ev~lmKa~ds--~~qPs~eig~~IGKTD~~Gki~Wk~~~kG~Y~v~l~n~e 108 (131)
T PF10794_consen 44 VTFHVENAE---GQPIKDFEVTLMKAADS--DPQPSKEIGISIGKTDEEGKIIWKNGRKGKYIVFLPNGE 108 (131)
T ss_pred EEEEEecCC---CCcccceEEEEEecccc--CCCCchhhceeecccCCCCcEEEecCCcceEEEEEcCCC
Confidence 444444433 7889988887732 111 111111111223569999999999999999998876543
No 78
>TIGR03769 P_ac_wall_RPT actinobacterial surface-anchored protein domain. This model describes a repeat domain that occurs one to three times in Actinobacterial proteins, some of which have LPXTG-type sortase recognition motifs for covalent attachment to the Gram-positive cell wall. Where it occurs with duplication in an LPXTG-anchored protein, it tends to be adjacent to the substrate-binding protein of the gene trio of an ABC transporter system, where that substrate-binding protein has a single copy of this same domain. This arrangement suggests a substrate-binding relay system, with the LPXTG protein acting as a substrate receptor.
Probab=26.23 E-value=1.2e+02 Score=22.06 Aligned_cols=11 Identities=27% Similarity=0.540 Sum_probs=9.8
Q ss_pred cCcceEEEEEE
Q 011825 236 RTGNYNLYAWV 246 (476)
Q Consensus 236 ~pGtY~L~a~~ 246 (476)
+||.|+|.+-+
T Consensus 11 ~PG~Y~l~~~a 21 (41)
T TIGR03769 11 KPGTYTLTVQA 21 (41)
T ss_pred CCeEEEEEEEE
Confidence 89999999876
No 79
>PF04571 Lipin_N: lipin, N-terminal conserved region; InterPro: IPR007651 Mutations in the lipin gene lead to fatty liver dystrophy in mice. The protein has been shown to be phosphorylated by the TOR Ser/Thr protein kinases in response to insulin stimulation. This entry represents a conserved domain found at the N terminus of the member proteins [, ].
Probab=24.87 E-value=1e+02 Score=27.34 Aligned_cols=38 Identities=11% Similarity=0.295 Sum_probs=28.2
Q ss_pred ecCCcccCcccEEEEEEeCCCCCCCcEEEEEEEeccCCCeEEEEEcCCCC
Q 011825 356 EMDNKTYQGTTWQIKFKLDHVDRNSSYKLRVAIASATLAELQVRVNDPNA 405 (476)
Q Consensus 356 ~~~~~~~~~~~w~I~F~L~~~~~~~~~tLriala~a~~~~~~V~vN~~~~ 405 (476)
+.++|+++.++|.++|-= +.+..+....+.|.|||+.+
T Consensus 34 ~q~DGs~~sSPFhVRFGk------------~~vl~~~ek~V~I~VNG~~~ 71 (110)
T PF04571_consen 34 EQPDGSLKSSPFHVRFGK------------LGVLRPREKVVDIEVNGKPV 71 (110)
T ss_pred ecCCCCEecCccEEEEcc------------eeeecccCcEEEEEECCEEc
Confidence 357889999999999971 13334456678999999855
No 80
>PF14344 DUF4397: Domain of unknown function (DUF4397)
Probab=24.23 E-value=1.2e+02 Score=26.16 Aligned_cols=36 Identities=17% Similarity=0.290 Sum_probs=25.9
Q ss_pred cccCcceEEEEEECceeee---eeeeeEEEEeCCceeee
Q 011825 234 NIRTGNYNLYAWVPGFVGD---YRSDALVTITSGSNIKM 269 (476)
Q Consensus 234 nV~pGtY~L~a~~~G~~G~---~~~~~~VtV~aG~t~~l 269 (476)
.|.||+|++.+...|-... .....+|++.+|+.-++
T Consensus 39 ~v~~G~~~i~v~~~g~~~~~~~~l~~~~i~l~~g~~yTl 77 (122)
T PF14344_consen 39 PVPPGTYTIEVTPAGTTPDVSTPLLSTTITLEAGKSYTL 77 (122)
T ss_pred EECCceEEEEEEECCCCCccceEEEeccEEEcCCCEEEE
Confidence 4679999999998763321 23445699999987665
No 81
>KOG3006 consensus Transthyretin and related proteins [Lipid transport and metabolism]
Probab=22.18 E-value=2.3e+02 Score=25.75 Aligned_cols=60 Identities=18% Similarity=0.180 Sum_probs=37.1
Q ss_pred eEEEEEEEEecCCCccccCceEEEecC-CCCCCCccccccccceEEEeCCCcceE----eCcccCcceEEEEE
Q 011825 178 GCVSGRLLVQDSNDVISANGAYVGLAP-PGDVGSWQTECKDYQFWTTADEDGCFS----IKNIRTGNYNLYAW 245 (476)
Q Consensus 178 GtVsG~v~~~d~~~~~pa~~a~V~L~~-~~~~g~~q~~~~~yqYwt~td~~G~Ft----I~nV~pGtY~L~a~ 245 (476)
-.|+-+|+..- .+.||.++-|.|.- .+++ +|+.-.. ..|+++|+-. -.-+.||+|+|..-
T Consensus 21 ~~itahVLd~s--~GsPA~gVqV~~f~~~~~~-~w~~igs-----~~T~~nGrv~~~~~~~tl~~GtYr~~~d 85 (132)
T KOG3006|consen 21 PPITAHVLDIS--RGSPAAGVQVHLFILANDD-TWTPIGS-----GFTQDNGRVDWVSPDFTLIPGTYRLVFD 85 (132)
T ss_pred CCcEeEEeecc--cCCcccceEEEEEEecCCC-cccCccc-----cccccCceeecccchhhhccceEEEEEe
Confidence 56777766443 47899999988752 1221 4554222 2367777644 23478999999863
No 82
>PF11589 DUF3244: Domain of unknown function (DUF3244); InterPro: IPR021638 This family of proteins with unknown function appear to be restricted to Bacteroidetes. The protein may have an immunoglobulin-like beta-sandwich fold however this cannot be confirmed. ; PDB: 3D33_B 3SD2_A.
Probab=21.86 E-value=2.8e+02 Score=23.72 Aligned_cols=65 Identities=12% Similarity=0.305 Sum_probs=34.4
Q ss_pred eEEEEEEEEecCCCccccCceEEEecCCCCCCCccccccccceEEEe--CCCc---ceEeCcccCcceEEEEEEC-c--e
Q 011825 178 GCVSGRLLVQDSNDVISANGAYVGLAPPGDVGSWQTECKDYQFWTTA--DEDG---CFSIKNIRTGNYNLYAWVP-G--F 249 (476)
Q Consensus 178 GtVsG~v~~~d~~~~~pa~~a~V~L~~~~~~g~~q~~~~~yqYwt~t--d~~G---~FtI~nV~pGtY~L~a~~~-G--~ 249 (476)
..+.|......- ..+...+.|.+.+. .|-..+..+ ...+ .+.+++.++|.|.|.+... | +
T Consensus 32 a~i~~~~l~I~F--~~~~~~vtI~I~d~----------~G~vVy~~~~~~~~~~~~~I~L~~~~~G~Y~l~i~~~~g~~l 99 (106)
T PF11589_consen 32 ASIDGNNLSIEF--ESPIGDVTITIKDS----------TGNVVYSETVSNSAGQSITIDLNGLPSGEYTLEITNGNGTYL 99 (106)
T ss_dssp EEEETTEEEEEE--SS--SEEEEEEEET----------T--EEEEEEESCGGTTEEEEE-TTS-SEEEEEEEEECTC-EE
T ss_pred EEEeCCEEEEEE--cCCCCCEEEEEEeC----------CCCEEEEEEccCCCCcEEEEEeCCCCCccEEEEEEeCCCCEE
Confidence 455554332221 34666777777641 344445443 2333 7889999999999998753 3 3
Q ss_pred eeeee
Q 011825 250 VGDYR 254 (476)
Q Consensus 250 ~G~~~ 254 (476)
.|+|.
T Consensus 100 ~G~F~ 104 (106)
T PF11589_consen 100 YGEFT 104 (106)
T ss_dssp EEEEE
T ss_pred EEEEE
Confidence 45544
No 83
>cd04970 Ig6_Contactin_like Sixth Ig domain of contactin. Ig6_Contactin_like: Sixth Ig domain of contactins. Contactins are neural cell adhesion molecules and are comprised of six Ig domains followed by four fibronectin type III(FnIII) domains anchored to the membrane by glycosylphosphatidylinositol. The first four Ig domains form the intermolecular binding fragment, which arranges as a compact U-shaped module via contacts between Ig domains 1 and 4, and between Ig domains 2 and 3. Contactin-2 (TAG-1, axonin-1) may play a part in the neuronal processes of neurite outgrowth, axon guidance and fasciculation, and neuronal migration. This group also includes contactin-1 and contactin-5. The different contactins show different expression patterns in the central nervous system. During development and in adulthood, contactin-2 is transiently expressed in subsets of central and peripheral neurons. Contactin-5 is expressed specifically in the rat postnatal nervous system, peaking at about 3 week
Probab=21.55 E-value=3.8e+02 Score=21.19 Aligned_cols=36 Identities=25% Similarity=0.363 Sum_probs=24.2
Q ss_pred CCCcceEeCcccC---cceEEEEEECceeeeeeeeeEEEEe
Q 011825 225 DEDGCFSIKNIRT---GNYNLYAWVPGFVGDYRSDALVTIT 262 (476)
Q Consensus 225 d~~G~FtI~nV~p---GtY~L~a~~~G~~G~~~~~~~VtV~ 262 (476)
..+|...|.++++ |.|+-.|.-. .|.......|+|.
T Consensus 44 ~~~~~L~I~~v~~~D~G~Y~C~a~n~--~g~~~~~~~l~V~ 82 (85)
T cd04970 44 DSNGDLMIRNAQLKHAGKYTCTAQTV--VDSLSASADLIVR 82 (85)
T ss_pred cccceEEEccCCHHhCeeeEEEEecC--CCcEEEEEEEEEE
Confidence 4678999999998 9999999742 2433334444443
No 84
>KOG0496 consensus Beta-galactosidase [Carbohydrate transport and metabolism]
Probab=21.51 E-value=1.4e+02 Score=34.04 Aligned_cols=71 Identities=20% Similarity=0.234 Sum_probs=45.9
Q ss_pred CcccEEEEEEeCCCCCCCcEEEEEEEeccCCCeEEEEEcCCCCCCCcccccccCCCCeeeceeeeeecEEEEEEeeCCCe
Q 011825 363 QGTTWQIKFKLDHVDRNSSYKLRVAIASATLAELQVRVNDPNANRPLFTTGLIGRDNAIARHGIHGLYLLYHVNIPGTRF 442 (476)
Q Consensus 363 ~~~~w~I~F~L~~~~~~~~~tLriala~a~~~~~~V~vN~~~~~~~~~~~~~~~~~~~i~R~~~~G~~~~~~~~ipa~~L 442 (476)
++-+|-=.|+.++... .. +|-...-+.=+|.|||++++. -+.+. |. ..++-||++.|
T Consensus 556 ~P~~w~k~f~~p~g~~--~t----~Ldm~g~GKG~vwVNG~niGR-YW~~~--------------G~--Q~~yhvPr~~L 612 (649)
T KOG0496|consen 556 QPLTWYKTFDIPSGSE--PT----ALDMNGWGKGQVWVNGQNIGR-YWPSF--------------GP--QRTYHVPRSWL 612 (649)
T ss_pred CCeEEEEEecCCCCCC--Ce----EEecCCCcceEEEECCccccc-ccCCC--------------CC--ceEEECcHHHh
Confidence 5667766677655431 11 222234567789999998863 22111 43 56788999999
Q ss_pred eeeecEEEEEeecC
Q 011825 443 IEGENTIFLKQPRC 456 (476)
Q Consensus 443 ~~G~NtI~l~~~~g 456 (476)
|.+.|.|.+---.+
T Consensus 613 k~~~N~lvvfEee~ 626 (649)
T KOG0496|consen 613 KPSGNLLVVFEEEG 626 (649)
T ss_pred CcCCceEEEEEecc
Confidence 99999988765554
No 85
>PF00041 fn3: Fibronectin type III domain; InterPro: IPR003961 Fibronectins are multi-domain glycoproteins found in a soluble form in plasma, and in an insoluble form in loose connective tissue and basement membranes []. They contain multiple copies of 3 repeat regions (types I, II and III), which bind to a variety of substances including heparin, collagen, DNA, actin, fibrin and fibronectin receptors on cell surfaces. The wide variety of these substances means that fibronectins are involved in a number of important functions: e.g., wound healing; cell adhesion; blood coagulation; cell differentiation and migration; maintenance of the cellular cytoskeleton; and tumour metastasis []. The role of fibronectin in cell differentiation is demonstrated by the marked reduction in the expression of its gene when neoplastic transformation occurs. Cell attachment has been found to be mediated by the binding of the tetrapeptide RGDS to integrins on the cell surface [], although related sequences can also display cell adhesion activity. Plasma fibronectin occurs as a dimer of 2 different subunits, linked together by 2 disulphide bonds near the C terminus. The difference in the 2 chains occurs in the type III repeat region and is caused by alternative splicing of the mRNA from one gene []. The observation that, in a given protein, an individual repeat of one of the 3 types (e.g., the first FnIII repeat) shows much less similarity to its subsequent tandem repeats within that protein than to its equivalent repeat between fibronectins from other species, has suggested that the repeating structure of fibronectin arose at an early stage of evolution. It also seems to suggest that the structure is subject to high selective pressure []. The fibronectin type III repeat region is an approximately 100 amino acid domain, different tandem repeats of which contain binding sites for DNA, heparin and the cell surface []. The superfamily of sequences believed to contain FnIII repeats represents 45 different families, the majority of which are involved in cell surface binding in some manner, or are receptor protein tyrosine kinases, or cytokine receptors.; GO: 0005515 protein binding; PDB: 1UEM_A 1TDQ_A 1X5I_A 2IC2_B 2IBG_C 2IBB_A 3R8Q_A 2FNB_A 1FNH_A 2EDB_A ....
Probab=20.29 E-value=3.1e+02 Score=20.85 Aligned_cols=21 Identities=14% Similarity=0.520 Sum_probs=17.3
Q ss_pred CCcceEeCcccCcc-eEEEEEE
Q 011825 226 EDGCFSIKNIRTGN-YNLYAWV 246 (476)
Q Consensus 226 ~~G~FtI~nV~pGt-Y~L~a~~ 246 (476)
..-.|.|.+++||+ |.+.+.+
T Consensus 54 ~~~~~~i~~L~p~t~Y~~~v~a 75 (85)
T PF00041_consen 54 NETSYTITGLQPGTTYEFRVRA 75 (85)
T ss_dssp TSSEEEEESCCTTSEEEEEEEE
T ss_pred eeeeeeeccCCCCCEEEEEEEE
Confidence 33489999999999 8888874
Done!