Query 023727
Match_columns 278
No_of_seqs 152 out of 251
Neff 4.8
Searched_HMMs 46136
Date Fri Mar 29 06:12:21 2013
Command hhsearch -i /work/01045/syshi/csienesis_hhblits_a3m/023727.a3m -d /work/01045/syshi/HHdatabase/Cdd.hhm -o /work/01045/syshi/hhsearch_cdd/023727hhsearch_cdd -cpu 12 -v 0
No Hit Prob E-value P-value Score SS Cols Query HMM Template HMM
1 PF09430 DUF2012: Protein of u 100.0 2.1E-28 4.6E-33 202.1 9.7 106 125-230 4-123 (123)
2 KOG3306 Predicted membrane pro 99.9 2.2E-23 4.8E-28 181.1 8.9 146 126-274 8-160 (185)
3 KOG3306 Predicted membrane pro 99.3 5.5E-14 1.2E-18 122.7 -3.8 182 68-270 2-183 (185)
4 PF13715 DUF4480: Domain of un 99.1 1.2E-09 2.6E-14 83.4 9.2 86 106-201 1-87 (88)
5 PF13620 CarboxypepD_reg: Carb 98.9 4.7E-09 1E-13 78.7 7.3 67 106-181 1-72 (82)
6 PF14686 fn3_3: Polysaccharide 98.4 1.9E-06 4.2E-11 68.9 8.9 73 103-180 1-86 (95)
7 cd03863 M14_CPD_II The second 98.3 2.1E-06 4.6E-11 83.5 9.7 68 104-181 296-364 (375)
8 cd03864 M14_CPN Peptidase M14 98.3 2.3E-06 4.9E-11 83.8 9.7 67 105-181 316-382 (392)
9 cd06245 M14_CPD_III The third 98.3 4.8E-06 1E-10 80.7 9.8 67 104-181 286-352 (363)
10 cd03867 M14_CPZ Peptidase M14- 98.1 1.1E-05 2.3E-10 79.0 9.3 67 105-181 318-384 (395)
11 cd03858 M14_CP_N-E_like Carbox 98.1 1.3E-05 2.8E-10 77.3 9.5 65 106-180 299-363 (374)
12 cd03865 M14_CPE_H Peptidase M1 98.1 1.6E-05 3.4E-10 78.3 9.4 65 107-181 328-392 (402)
13 cd03868 M14_CPD_I The first ca 97.9 4.1E-05 8.9E-10 74.0 8.0 66 105-180 296-362 (372)
14 cd03866 M14_CPM Peptidase M14 97.8 0.00011 2.5E-09 71.5 9.7 68 104-181 294-363 (376)
15 PRK15036 hydroxyisourate hydro 96.9 0.0091 2E-07 50.9 9.9 59 105-171 27-96 (137)
16 PF05738 Cna_B: Cna protein B- 96.7 0.0096 2.1E-07 43.4 7.4 51 129-179 2-60 (70)
17 KOG1948 Metalloproteinase-rela 96.7 0.0048 1E-07 65.9 7.8 70 103-181 314-384 (1165)
18 PF08308 PEGA: PEGA domain; I 96.6 0.0082 1.8E-07 44.3 6.3 49 129-182 11-59 (71)
19 PF08400 phage_tail_N: Prophag 96.3 0.034 7.3E-07 47.6 9.3 58 104-171 2-69 (134)
20 cd03869 M14_CPX_like Peptidase 95.4 0.047 1E-06 54.2 7.4 66 106-181 330-395 (405)
21 KOG2649 Zinc carboxypeptidase 94.3 0.18 3.8E-06 51.2 8.1 66 105-180 378-443 (500)
22 KOG1948 Metalloproteinase-rela 94.2 0.087 1.9E-06 56.7 6.0 62 103-173 117-181 (1165)
23 cd00421 intradiol_dioxygenase 93.9 0.19 4.1E-06 42.8 6.4 54 100-162 7-80 (146)
24 PF07495 Y_Y_Y: Y_Y_Y domain; 93.9 0.13 2.8E-06 36.9 4.7 39 131-169 10-50 (66)
25 PF07210 DUF1416: Protein of u 93.2 0.45 9.8E-06 37.8 7.0 56 103-169 6-65 (85)
26 PF11008 DUF2846: Protein of u 89.4 1.8 3.9E-05 35.2 7.2 55 129-188 41-98 (117)
27 COG3485 PcaH Protocatechuate 3 85.6 1.8 3.8E-05 40.1 5.6 57 100-165 68-146 (226)
28 PF12866 DUF3823: Protein of u 83.1 5 0.00011 36.8 7.4 75 101-180 18-103 (222)
29 PF10794 DUF2606: Protein of u 82.9 6.7 0.00015 33.4 7.4 28 138-165 78-105 (131)
30 PF10670 DUF4198: Domain of un 82.7 5.6 0.00012 34.3 7.2 55 104-168 150-213 (215)
31 cd03459 3,4-PCD Protocatechuat 81.7 4 8.7E-05 35.5 5.9 53 100-161 11-86 (158)
32 TIGR02465 chlorocat_1_2 chloro 79.9 4.2 9.2E-05 38.0 5.8 52 101-161 95-164 (246)
33 TIGR02962 hdxy_isourate hydrox 78.7 13 0.00027 30.8 7.6 44 128-171 16-71 (112)
34 cd03464 3,4-PCD_beta Protocate 78.4 4.5 9.7E-05 37.2 5.4 62 101-162 62-137 (220)
35 TIGR02422 protocat_beta protoc 77.8 3.9 8.5E-05 37.6 4.8 62 101-162 57-132 (220)
36 TIGR02438 catachol_actin catec 77.4 5.5 0.00012 38.0 5.8 51 102-161 130-198 (281)
37 cd03458 Catechol_intradiol_dio 77.4 13 0.00028 35.0 8.2 53 100-161 100-170 (256)
38 PF07550 DUF1533: Protein of u 76.7 4.3 9.3E-05 30.0 3.9 47 126-172 3-61 (65)
39 cd03462 1,2-CCD chlorocatechol 76.3 6.4 0.00014 36.9 5.8 53 100-161 95-165 (247)
40 TIGR02423 protocat_alph protoc 75.7 6.8 0.00015 35.3 5.7 63 100-162 35-111 (193)
41 cd03460 1,2-CTD Catechol 1,2 d 75.6 6.6 0.00014 37.5 5.8 52 101-161 121-190 (282)
42 PF14054 DUF4249: Domain of un 73.7 72 0.0016 29.0 14.1 96 84-179 4-121 (298)
43 PF00775 Dioxygenase_C: Dioxyg 71.0 14 0.00031 32.8 6.6 53 100-161 25-97 (183)
44 cd03461 1,2-HQD Hydroxyquinol 69.8 13 0.00027 35.5 6.2 54 100-162 116-187 (277)
45 TIGR02439 catechol_proteo cate 69.3 13 0.00028 35.6 6.2 51 102-161 126-194 (285)
46 cd03463 3,4-PCD_alpha Protocat 68.4 6.5 0.00014 35.2 3.8 51 102-161 34-106 (185)
47 PF03785 Peptidase_C25_C: Pept 65.2 37 0.00079 26.9 6.9 47 130-178 29-81 (81)
48 PF13754 Big_3_4: Bacterial Ig 61.7 18 0.00039 25.6 4.4 32 137-169 1-35 (54)
49 PF13953 PapC_C: PapC C-termin 61.7 19 0.00041 26.5 4.6 39 111-157 3-41 (68)
50 COG1470 Predicted membrane pro 61.5 1E+02 0.0023 31.8 11.1 110 101-243 395-509 (513)
51 COG4704 Uncharacterized protei 60.7 26 0.00056 30.6 5.8 20 146-165 75-94 (151)
52 PF14289 DUF4369: Domain of un 57.2 83 0.0018 23.8 9.4 46 103-162 11-64 (106)
53 PF11589 DUF3244: Domain of un 53.3 51 0.0011 26.2 6.2 42 128-169 47-96 (106)
54 PF00017 SH2: SH2 domain; Int 50.9 38 0.00083 24.7 4.7 35 145-180 19-55 (77)
55 PRK09619 flgD flagellar basal 50.2 67 0.0014 29.4 7.1 41 128-168 122-173 (218)
56 PF09912 DUF2141: Uncharacteri 49.8 22 0.00049 28.9 3.6 21 146-166 41-61 (112)
57 PF03404 Mo-co_dimer: Mo-co ox 49.4 65 0.0014 27.0 6.4 30 103-141 27-56 (131)
58 PF07172 GRP: Glycine rich pro 48.2 12 0.00025 30.2 1.7 12 73-84 1-12 (95)
59 PF01190 Pollen_Ole_e_I: Polle 47.7 51 0.0011 25.6 5.2 26 128-153 21-54 (97)
60 TIGR01710 typeII_sec_gspG gene 47.3 22 0.00047 29.6 3.3 32 216-251 2-33 (134)
61 PF13954 PapC_N: PapC N-termin 44.4 67 0.0014 26.9 5.8 38 154-191 26-63 (146)
62 PF15178 TOM_sub5: Mitochondri 43.6 18 0.00039 26.0 1.9 11 246-256 9-19 (51)
63 COG4537 ComGC Competence prote 43.6 32 0.00069 28.5 3.5 62 209-274 6-80 (107)
64 cd02110 SO_family_Moco_dimer S 43.1 97 0.0021 29.7 7.4 58 102-169 222-289 (317)
65 COG1422 Predicted membrane pro 43.1 34 0.00075 31.3 4.0 35 225-261 45-82 (201)
66 PF02369 Big_1: Bacterial Ig-l 42.6 1.7E+02 0.0036 23.0 9.4 67 101-176 21-98 (100)
67 cd05774 Ig_CEACAM_D1 First imm 42.4 85 0.0018 25.1 5.9 37 142-178 64-103 (105)
68 cd05822 TLP_HIUase HIUase (5-h 42.3 2E+02 0.0043 23.8 8.6 43 128-170 16-70 (112)
69 PRK12813 flgD flagellar basal 42.2 94 0.002 28.7 6.8 40 128-167 123-174 (223)
70 PF00576 Transthyretin: HIUase 41.1 32 0.0007 28.4 3.3 43 128-170 16-71 (112)
71 KOG4659 Uncharacterized conser 41.1 78 0.0017 36.7 7.0 71 100-179 47-118 (1899)
72 smart00060 FN3 Fibronectin typ 39.9 59 0.0013 21.3 4.0 22 147-168 56-78 (83)
73 cd00173 SH2 Src homology 2 dom 36.6 82 0.0018 23.4 4.7 34 145-179 19-54 (94)
74 PF03716 WCCH: WCCH motif ; I 36.6 17 0.00037 22.6 0.8 11 47-57 4-14 (25)
75 PRK06655 flgD flagellar basal 35.2 1.7E+02 0.0038 26.8 7.4 40 128-167 125-179 (225)
76 PRK12633 flgD flagellar basal 34.6 2.8E+02 0.006 25.5 8.7 42 128-169 128-184 (230)
77 PRK12812 flgD flagellar basal 34.3 1.1E+02 0.0023 29.0 6.0 41 128-168 140-195 (259)
78 COG4939 Major membrane immunog 34.2 1.5E+02 0.0032 25.8 6.2 64 83-154 8-77 (147)
79 COG2165 PulG Type II secretory 34.0 64 0.0014 25.4 3.9 33 215-251 8-40 (149)
80 cd02114 bact_SorA_Moco sulfite 33.4 1.6E+02 0.0034 29.0 7.3 57 102-168 274-340 (367)
81 PF08621 RPAP1_N: RPAP1-like, 33.1 20 0.00043 25.6 0.7 31 244-274 11-41 (49)
82 COG4676 Uncharacterized protei 32.3 78 0.0017 29.8 4.6 100 105-215 68-174 (268)
83 PF01186 Lysyl_oxidase: Lysyl 27.6 46 0.001 30.6 2.3 18 149-166 151-168 (205)
84 PF14347 DUF4399: Domain of un 27.2 42 0.00091 26.5 1.7 22 154-175 58-79 (87)
85 PRK14646 hypothetical protein; 27.1 35 0.00077 29.5 1.4 17 154-170 70-86 (155)
86 PF11346 DUF3149: Protein of u 27.0 73 0.0016 22.2 2.7 23 220-242 4-26 (42)
87 cd05469 Transthyretin_like Tra 26.8 3.7E+02 0.0081 22.3 7.4 43 128-170 16-70 (113)
88 cd05821 TLP_Transthyretin Tran 25.9 1.2E+02 0.0025 25.6 4.3 44 128-171 22-77 (121)
89 PF13753 SWM_repeat: Putative 24.6 2.7E+02 0.0058 26.0 6.9 52 103-165 61-117 (317)
90 COG4594 FecB ABC-type Fe3+-cit 24.6 1.3E+02 0.0029 29.0 4.8 31 141-171 34-64 (310)
91 COG4970 FimT Tfp pilus assembl 24.5 86 0.0019 28.2 3.4 36 215-254 8-43 (181)
92 PF00577 Usher: Outer membrane 24.1 2.2E+02 0.0047 28.8 6.6 40 129-168 99-142 (552)
93 PF10836 DUF2574: Protein of u 24.1 57 0.0012 26.4 1.9 14 101-114 24-37 (93)
94 COG4850 Uncharacterized conser 24.0 1.2E+02 0.0026 30.2 4.4 39 131-169 101-145 (373)
95 PF01060 DUF290: Transthyretin 23.9 1.5E+02 0.0032 22.5 4.2 31 128-158 11-49 (80)
96 PRK14643 hypothetical protein; 23.8 49 0.0011 29.0 1.7 17 154-170 74-90 (164)
97 PF11974 MG1: Alpha-2-macroglo 23.7 2.8E+02 0.006 21.9 5.9 28 128-155 28-60 (97)
98 COG2351 Transthyretin-like pro 23.5 4.7E+02 0.01 22.4 7.6 45 128-172 24-80 (124)
99 PRK10301 hypothetical protein; 23.2 1.4E+02 0.0031 24.8 4.3 16 150-165 88-106 (124)
100 PF05753 TRAP_beta: Translocon 22.9 5.4E+02 0.012 22.8 13.1 41 142-182 70-116 (181)
101 smart00831 Cation_ATPase_N Cat 22.8 83 0.0018 22.2 2.5 22 220-241 41-62 (64)
102 COG5266 CbiK ABC-type Co2+ tra 22.8 2.7E+02 0.0058 26.7 6.4 30 140-169 214-243 (264)
103 PRK14639 hypothetical protein; 22.7 56 0.0012 27.9 1.8 17 154-170 58-74 (140)
104 PF13860 FlgD_ig: FlgD Ig-like 22.3 3.4E+02 0.0074 20.3 7.2 14 154-167 65-78 (81)
105 PRK00022 lolB outer membrane l 22.3 4.2E+02 0.009 23.3 7.4 12 103-114 46-57 (202)
106 PRK11657 dsbG disulfide isomer 22.2 2.9E+02 0.0062 25.5 6.5 41 103-154 33-73 (251)
107 smart00634 BID_1 Bacterial Ig- 22.1 3.6E+02 0.0078 20.5 8.8 61 102-169 17-85 (92)
108 PRK14631 hypothetical protein; 21.5 53 0.0011 29.1 1.4 14 157-170 90-103 (174)
109 PRK14636 hypothetical protein; 21.5 54 0.0012 29.1 1.5 15 156-170 70-84 (176)
110 PRK14645 hypothetical protein; 21.4 61 0.0013 28.2 1.8 17 154-170 72-88 (154)
111 PRK14638 hypothetical protein; 21.3 63 0.0014 27.8 1.9 17 154-170 70-86 (150)
112 PRK02001 hypothetical protein; 21.3 61 0.0013 28.2 1.8 14 157-170 63-76 (152)
113 PF03443 Glyco_hydro_61: Glyco 21.2 70 0.0015 29.2 2.2 25 146-170 134-161 (218)
114 PRK14630 hypothetical protein; 21.0 59 0.0013 27.9 1.6 15 157-171 70-84 (143)
115 PF00630 Filamin: Filamin/ABP2 20.6 93 0.002 23.5 2.5 25 144-168 66-92 (101)
116 PF13677 MotB_plug: Membrane M 20.5 1.9E+02 0.0042 20.9 4.0 18 244-261 37-54 (58)
117 PRK14633 hypothetical protein; 20.3 69 0.0015 27.6 1.9 17 154-170 64-80 (150)
No 1
>PF09430 DUF2012: Protein of unknown function (DUF2012); InterPro: IPR019008 This domain is found in different proteins, including uncharacterised protein family UPF0480 and nodal modulators. A nodal modulator has been identified as part of a protein complex that participates in the nodal signaling pathway during vertebrate development.
Probab=99.95 E-value=2.1e-28 Score=202.11 Aligned_cols=106 Identities=42% Similarity=0.613 Sum_probs=90.5
Q ss_pred CCCceeEEEEEcCcee---EEEecCCccEEEccCCCeeEEEEEEecCcceeeEEEEEEcCCCC---ceEEEEecccC---
Q 023727 125 GGKASNVKVVLNGGEH---VTFLRPDGYFSFQNMSAGTHLIEVAAIGYFFSPVRVDVSARHPG---KVQAALTETRR--- 195 (278)
Q Consensus 125 ~~~~s~t~V~L~g~~~---~a~~~~dG~F~f~nVP~GsY~LeVss~gy~F~p~RVdV~~~~~G---~VrA~~~e~~~--- 195 (278)
..|.++++|+|+++++ .+++++||+|+|+|||+|+|+|+|+|++|.|.++||||..+.+. ++.+++...+.
T Consensus 4 ~~~~~~t~V~L~~g~~~~~~~~v~~dG~F~f~~Vp~GsY~L~V~s~~~~F~~~RVdV~~~~~~~~~~~~~~~~~~~~~~~ 83 (123)
T PF09430_consen 4 NNLPSSTRVTLNGGQYRPISAFVRSDGSFVFHNVPPGSYLLEVHSPDYVFPPYRVDVSSSGKIRARKVNYWQPYRGNPWS 83 (123)
T ss_pred ccCCCCEEEEEeCCCccceEEEecCCCEEEeCCCCCceEEEEEECCCccccCEEEEEecCCCCCceEEEeecCccccccc
Confidence 3678999999999988 99999999999999999999999999999999999999953211 23333333222
Q ss_pred -----ccccEEEEeccccceeeeccccChhhhccCHHHHH
Q 023727 196 -----GLNELVLEQLREEQYYEIREPFSIMSLVKSPMGLM 230 (278)
Q Consensus 196 -----~l~PLvv~p~~~~~Yfe~Re~fsi~~lLkNPM~LM 230 (278)
..+||+++|+++++|||+|++|++++||||||+||
T Consensus 84 ~~~~~~~~Pl~l~~~~~~~Yfe~r~~fsi~~lLknPM~Lm 123 (123)
T PF09430_consen 84 NSGESLPYPLVLKPVGRKQYFEEREGFSILSLLKNPMVLM 123 (123)
T ss_pred ccCcccCccEEEEEcccccCeEECCCCCHHHHhcCCeecC
Confidence 24599999999999999999999999999999987
No 2
>KOG3306 consensus Predicted membrane protein [Function unknown]
Probab=99.89 E-value=2.2e-23 Score=181.08 Aligned_cols=146 Identities=27% Similarity=0.432 Sum_probs=128.0
Q ss_pred CCceeEEEEEcCceeEEEecCCccEEEccCCCeeEEEEEEecCcceeeEEEEEEcCCCCceEEEEecccCc------ccc
Q 023727 126 GKASNVKVVLNGGEHVTFLRPDGYFSFQNMSAGTHLIEVAAIGYFFSPVRVDVSARHPGKVQAALTETRRG------LNE 199 (278)
Q Consensus 126 ~~~s~t~V~L~g~~~~a~~~~dG~F~f~nVP~GsY~LeVss~gy~F~p~RVdV~~~~~G~VrA~~~e~~~~------l~P 199 (278)
.|++.++++++++++..|.+.||+|..+++|.|+|.++|.+.+|.|.+.||++... |+.+|++...-+. .+|
T Consensus 8 r~~~~~~liv~~~~f~gF~~td~sf~~~d~ptgty~V~V~~s~vi~~p~rv~i~~k--gk~rarkvtflrp~~~~tl~yp 85 (185)
T KOG3306|consen 8 RSHSAARLIVNGGEFVGFASTDGSFGSEDSPTGTYRVEVPSSDVIGHPARVSITKK--GKNRARKVTFLRPDGYFTLPYP 85 (185)
T ss_pred ccccceeeEeccceEEEEEeeccccccccCCceeEEEEecCcceeecceEEEeehh--hcccceEEEEEccCceEEeccc
Confidence 68899999999999999999999999999999999999999999999999999974 7888877544332 347
Q ss_pred EEEEeccccceeeeccccChhhhccCHHHHHHHHHHHHHHhhhccccCCCHHHHHHHHH-HHHhCCCCchhhhCCC
Q 023727 200 LVLEQLREEQYYEIREPFSIMSLVKSPMGLMMGFMLVVVFLMPKLMENMDPEEMRRAQE-EMRSQGVPSLANLIPG 274 (278)
Q Consensus 200 Lvv~p~~~~~Yfe~Re~fsi~~lLkNPM~LM~lv~l~l~~~mPKLme~mDPEe~ke~qe-em~~~~~p~~s~ll~g 274 (278)
+.+.+.+...||+.|+.|++.+++|+||+||++.+++.++++||+. .+|||+.+|+.- ++.+..+|++++||..
T Consensus 86 l~l~~~~p~~YF~~Re~w~~td~l~~pmvLm~v~plL~~Lvlpk~~-~~dp~~~~e~~~m~f~~v~vp~v~e~m~~ 160 (185)
T KOG3306|consen 86 LRLKSSGPPSYFIVREDWSATDRLKVPMVLMEVRPLLTELVLPKMV-INDPEMKKEMENMQFPKVDVPDVSEMMTN 160 (185)
T ss_pred eeecccCCccceEEEeEeeeeecccCcchHHHHHHHHHHHhcCccc-cCChhhhhhhhhcCceeeecchHHHHhhh
Confidence 9999999999999999999999999999999999999999999996 679997766641 2346689999876643
No 3
>KOG3306 consensus Predicted membrane protein [Function unknown]
Probab=99.32 E-value=5.5e-14 Score=122.72 Aligned_cols=182 Identities=31% Similarity=0.295 Sum_probs=142.6
Q ss_pred cchhhcccchhHHHHHHHHHHHhhhcceeeccCCCCceEEEEEEECCCCCCCCCCCCCCCceeEEEEEcCceeEEEecCC
Q 023727 68 HMASIIRSKSVLSVFFINLFLSLVSSAVAVSSGSGDGFSISGRVKLPGMSLKAFGSPGGKASNVKVVLNGGEHVTFLRPD 147 (278)
Q Consensus 68 ~~~~~~~~~~~~~~~~~~~~~s~~~~~~a~s~~~~~~~tIsGrV~~p~~~p~~~~lp~~~~s~t~V~L~g~~~~a~~~~d 147 (278)
-|+.|.++.+.+.+++...|+-.++.+. .|.++.+.++|+|+|.++..+-.+. | .+.+.+++.++++++.+|+++|
T Consensus 2 ~m~~~~r~~~~~~liv~~~~f~gF~~td-~sf~~~d~ptgty~V~V~~s~vi~~--p-~rv~i~~kgk~rarkvtflrp~ 77 (185)
T KOG3306|consen 2 PMVKIQRSHSAARLIVNGGEFVGFASTD-GSFGSEDSPTGTYRVEVPSSDVIGH--P-ARVSITKKGKNRARKVTFLRPD 77 (185)
T ss_pred CcccccccccceeeEeccceEEEEEeec-cccccccCCceeEEEEecCcceeec--c-eEEEeehhhcccceEEEEEccC
Confidence 3677888887777777778877887777 7788889999999999998754221 1 4567788888899999999999
Q ss_pred ccEEEccCCCeeEEEEEEecCcceeeEEEEEEcCCCCceEEEEecccCccccEEEEeccccceeeeccccChhhhccCHH
Q 023727 148 GYFSFQNMSAGTHLIEVAAIGYFFSPVRVDVSARHPGKVQAALTETRRGLNELVLEQLREEQYYEIREPFSIMSLVKSPM 227 (278)
Q Consensus 148 G~F~f~nVP~GsY~LeVss~gy~F~p~RVdV~~~~~G~VrA~~~e~~~~l~PLvv~p~~~~~Yfe~Re~fsi~~lLkNPM 227 (278)
|.|+++-+ .++..+.-.+|.+ .|+|+.+..-+++.+.+.+..+.+..|++.+...-+|...++.
T Consensus 78 ~~~tl~yp---l~l~~~~p~~YF~--~Re~w~~td~l~~pmvLm~v~plL~~Lvlpk~~~~dp~~~~e~----------- 141 (185)
T KOG3306|consen 78 GYFTLPYP---LRLKSSGPPSYFI--VREDWSATDRLKVPMVLMEVRPLLTELVLPKMVINDPEMKKEM----------- 141 (185)
T ss_pred ceEEeccc---eeecccCCccceE--EEeEeeeeecccCcchHHHHHHHHHHHhcCccccCChhhhhhh-----------
Confidence 99999764 6777778888888 8888887666788888888777777777766666666655554
Q ss_pred HHHHHHHHHHHHhhhccccCCCHHHHHHHHHHHHhCCCCchhh
Q 023727 228 GLMMGFMLVVVFLMPKLMENMDPEEMRRAQEEMRSQGVPSLAN 270 (278)
Q Consensus 228 ~LM~lv~l~l~~~mPKLme~mDPEe~ke~qeem~~~~~p~~s~ 270 (278)
.++.++..-+..++++|+++++|++++.||+++.++.|+...
T Consensus 142 -~~m~f~~v~vp~v~e~m~~lf~~k~~~ake~~~~~gv~s~kr 183 (185)
T KOG3306|consen 142 -ENMQFPKVDVPDVSEMMTNLFSGKSPGAKEKSKTGGVGSGKR 183 (185)
T ss_pred -hhcCceeeecchHHHHhhhcccccCCcccccccCCCCCcccc
Confidence 344444556778899999999999999999999888877644
No 4
>PF13715 DUF4480: Domain of unknown function (DUF4480)
Probab=99.06 E-value=1.2e-09 Score=83.39 Aligned_cols=86 Identities=22% Similarity=0.339 Sum_probs=67.4
Q ss_pred EEEEEEECCC-CCCCCCCCCCCCceeEEEEEcCceeEEEecCCccEEEccCCCeeEEEEEEecCcceeeEEEEEEcCCCC
Q 023727 106 SISGRVKLPG-MSLKAFGSPGGKASNVKVVLNGGEHVTFLRPDGYFSFQNMSAGTHLIEVAAIGYFFSPVRVDVSARHPG 184 (278)
Q Consensus 106 tIsGrV~~p~-~~p~~~~lp~~~~s~t~V~L~g~~~~a~~~~dG~F~f~nVP~GsY~LeVss~gy~F~p~RVdV~~~~~G 184 (278)
+|+|+|.... ..| +++++|.+.++...+.+|.+|.|.|. +++|.|.|.+++.||......|++.....-
T Consensus 1 ti~G~V~d~~t~~p---------l~~a~V~~~~~~~~~~Td~~G~F~i~-~~~g~~~l~is~~Gy~~~~~~i~~~~~~~~ 70 (88)
T PF13715_consen 1 TISGKVVDSDTGEP---------LPGATVYLKNTKKGTVTDENGRFSIK-LPEGDYTLKISYIGYETKTITISVNSNKNT 70 (88)
T ss_pred CEEEEEEECCCCCC---------ccCeEEEEeCCcceEEECCCeEEEEE-EcCCCeEEEEEEeCEEEEEEEEEecCCCEE
Confidence 5899999877 333 68899999999899999999999999 999999999999999988888888643211
Q ss_pred ceEEEEecccCccccEE
Q 023727 185 KVQAALTETRRGLNELV 201 (278)
Q Consensus 185 ~VrA~~~e~~~~l~PLv 201 (278)
.+..++.+....+.+++
T Consensus 71 ~~~i~L~~~~~~L~eVv 87 (88)
T PF13715_consen 71 NLNIYLEPKSNQLDEVV 87 (88)
T ss_pred EEEEEEeeCcccCCeEE
Confidence 34555554444455544
No 5
>PF13620 CarboxypepD_reg: Carboxypeptidase regulatory-like domain; PDB: 3MN8_D 3P0D_I 3KCP_A 2B59_B 1UWY_A 1H8L_A 1QMU_A 2NSM_A.
Probab=98.91 E-value=4.7e-09 Score=78.65 Aligned_cols=67 Identities=33% Similarity=0.389 Sum_probs=52.5
Q ss_pred EEEEEEECCCCCCCCCCCCCCCceeEEEEEc----CceeEEEecCCccEEEccCCCeeEEEEEEecCcceeeE-EEEEEc
Q 023727 106 SISGRVKLPGMSLKAFGSPGGKASNVKVVLN----GGEHVTFLRPDGYFSFQNMSAGTHLIEVAAIGYFFSPV-RVDVSA 180 (278)
Q Consensus 106 tIsGrV~~p~~~p~~~~lp~~~~s~t~V~L~----g~~~~a~~~~dG~F~f~nVP~GsY~LeVss~gy~F~p~-RVdV~~ 180 (278)
+|+|+|..+.++| ++.++|.|. +....+.+|.+|.|.|.++|+|.|.|.+...+|..... .|.|.+
T Consensus 1 tI~G~V~d~~g~p---------v~~a~V~l~~~~~~~~~~~~Td~~G~f~~~~l~~g~Y~l~v~~~g~~~~~~~~v~v~~ 71 (82)
T PF13620_consen 1 TISGTVTDATGQP---------VPGATVTLTDQDGGTVYTTTTDSDGRFSFEGLPPGTYTLRVSAPGYQPQTQENVTVTA 71 (82)
T ss_dssp -EEEEEEETTSCB---------HTT-EEEET--TTTECCEEE--TTSEEEEEEE-SEEEEEEEEBTTEE-EEEEEEEESS
T ss_pred CEEEEEEcCCCCC---------cCCEEEEEEEeeCCCEEEEEECCCceEEEEccCCEeEEEEEEECCcceEEEEEEEEeC
Confidence 6999999886655 678898887 34668999999999999999999999999999999887 488875
Q ss_pred C
Q 023727 181 R 181 (278)
Q Consensus 181 ~ 181 (278)
.
T Consensus 72 ~ 72 (82)
T PF13620_consen 72 G 72 (82)
T ss_dssp S
T ss_pred C
Confidence 3
No 6
>PF14686 fn3_3: Polysaccharide lyase family 4, domain II; PDB: 1NKG_A 2XHN_B 3NJX_A 3NJV_A.
Probab=98.40 E-value=1.9e-06 Score=68.92 Aligned_cols=73 Identities=29% Similarity=0.299 Sum_probs=43.1
Q ss_pred CceEEEEEEECCCCCCCCCCCCCCCceeEEEEEc---------CceeEEEecCCccEEEccCCCeeEEEEEEe----cCc
Q 023727 103 DGFSISGRVKLPGMSLKAFGSPGGKASNVKVVLN---------GGEHVTFLRPDGYFSFQNMSAGTHLIEVAA----IGY 169 (278)
Q Consensus 103 ~~~tIsGrV~~p~~~p~~~~lp~~~~s~t~V~L~---------g~~~~a~~~~dG~F~f~nVP~GsY~LeVss----~gy 169 (278)
++.+|+|+|.+++.-.+. + .-..+-|-|. +-+||+.+|.+|+|+|.||+||+|.|.+-. -+|
T Consensus 1 ~RG~VsG~l~l~dg~~~~---~--~~~~~~Vgl~~~~d~~q~~~yqYwt~td~~G~Fti~~V~pGtY~L~ay~~g~~g~~ 75 (95)
T PF14686_consen 1 QRGSVSGRLTLSDGVTNP---P--AGANAVVGLAPPGDFQQNKGYQYWTRTDSDGNFTIPNVRPGTYRLYAYADGIFGDY 75 (95)
T ss_dssp G-BEEEEEEE---SS--T---T----S-EEEEEE--------SS-EEEEE--TTSEEE---B-SEEEEEEEEE----TTE
T ss_pred CCCEEEEEEEEccCcccC---c--cceeEEEEeeeccccccCCCCcEEEEeCCCCcEEeCCeeCcEeEEEEEEecccCce
Confidence 367999999988762100 1 0134556665 348999999999999999999999999999 566
Q ss_pred ceeeEEEEEEc
Q 023727 170 FFSPVRVDVSA 180 (278)
Q Consensus 170 ~F~p~RVdV~~ 180 (278)
......|.|.+
T Consensus 76 ~~~~~~ItV~~ 86 (95)
T PF14686_consen 76 KVASDSITVSG 86 (95)
T ss_dssp EEEEEEEEE-T
T ss_pred EEecceEEEcC
Confidence 66567777764
No 7
>cd03863 M14_CPD_II The second carboxypeptidase (CP)-like domain of Carboxypeptidase D (CPD; EC 3.4.17.22), domain II. CPD differs from all other metallocarboxypeptidases in that it contains multiple CP-like domains. CPD belongs to the N/E-like subfamily of the M14 family of metallocarboxypeptidases (MCPs). The M14 family are zinc-binding CPs which hydrolyze single, C-terminal amino acids from polypeptide chains, and have a recognition site for the free C-terminal carboxyl group, which is a key determinant of specificity. CPD is a single-chain protein containing a signal peptide, three tandem repeats of CP-like domains separated by short bridge regions, followed by a transmembrane domain, and a C-terminal cytosolic tail. The first two CP-like domains of CPD contain all of the essential active site and substrate-binding residues, while the third CP-like domain lacks critical residues necessary for enzymatic activity and is inactive towards standard CP substrates. Domain I is optimally ac
Probab=98.35 E-value=2.1e-06 Score=83.53 Aligned_cols=68 Identities=19% Similarity=0.111 Sum_probs=59.1
Q ss_pred ceEEEEEEECCC-CCCCCCCCCCCCceeEEEEEcCceeEEEecCCccEEEccCCCeeEEEEEEecCcceeeEEEEEEcC
Q 023727 104 GFSISGRVKLPG-MSLKAFGSPGGKASNVKVVLNGGEHVTFLRPDGYFSFQNMSAGTHLIEVAAIGYFFSPVRVDVSAR 181 (278)
Q Consensus 104 ~~tIsGrV~~p~-~~p~~~~lp~~~~s~t~V~L~g~~~~a~~~~dG~F~f~nVP~GsY~LeVss~gy~F~p~RVdV~~~ 181 (278)
+..|+|+|.... +.| +..++|.+.|..+.+.+|.||.|.+ .||+|+|+|+|++.||....+.|.|.+.
T Consensus 296 ~~gI~G~V~D~~~g~p---------l~~AtV~V~g~~~~~~Td~~G~f~~-~l~pG~ytl~vs~~GY~~~~~~v~V~~~ 364 (375)
T cd03863 296 HRGVRGFVLDATDGRG---------ILNATISVADINHPVTTYKDGDYWR-LLVPGTYKVTASARGYDPVTKTVEVDSK 364 (375)
T ss_pred cCeEEEEEEeCCCCCC---------CCCeEEEEecCcCceEECCCccEEE-ccCCeeEEEEEEEcCcccEEEEEEEcCC
Confidence 368999998764 333 6789999999989999999999998 6999999999999999999988888753
No 8
>cd03864 M14_CPN Peptidase M14 Carboxypeptidase N (CPN, also known as kininase I, creatine kinase conversion factor, plasma carboxypeptidase B, arginine carboxypeptidase, and protaminase; EC 3.4.17.3) is an extracellular glycoprotein synthesized in the liver and released into the blood, where it is present in high concentrations. CPN belongs to the N/E subfamily of the M14 family of metallocarboxypeptidases (MCPs). The M14 family are zinc-binding carboxypeptidases (CPs) which hydrolyze single, C-terminal amino acids from polypeptide chains, and have a recognition site for the free C-terminal carboxyl group, which is a key determinant of specificity. CPN plays an important role in protecting the body from excessive buildup of potentially deleterious peptides that normally act as local autocrine or paracrine hormones. It specifically removes C-terminal basic residues. As CPN can cleave lysine more avidly than arginine residues it is also called lysine carboxypeptidase. CPN substrates inclu
Probab=98.34 E-value=2.3e-06 Score=83.79 Aligned_cols=67 Identities=25% Similarity=0.248 Sum_probs=60.1
Q ss_pred eEEEEEEECCCCCCCCCCCCCCCceeEEEEEcCceeEEEecCCccEEEccCCCeeEEEEEEecCcceeeEEEEEEcC
Q 023727 105 FSISGRVKLPGMSLKAFGSPGGKASNVKVVLNGGEHVTFLRPDGYFSFQNMSAGTHLIEVAAIGYFFSPVRVDVSAR 181 (278)
Q Consensus 105 ~tIsGrV~~p~~~p~~~~lp~~~~s~t~V~L~g~~~~a~~~~dG~F~f~nVP~GsY~LeVss~gy~F~p~RVdV~~~ 181 (278)
..|+|+|....+.| ++.++|.+.|..+.+.++.+|.| +++++||+|+|+|++.||......|.|...
T Consensus 316 ~gI~G~V~D~~g~p---------i~~A~V~v~g~~~~~~T~~~G~y-~r~l~pG~Y~l~vs~~Gy~~~t~~v~V~~~ 382 (392)
T cd03864 316 QGIKGMVTDENNNG---------IANAVISVSGISHDVTSGTLGDY-FRLLLPGTYTVTASAPGYQPSTVTVTVGPA 382 (392)
T ss_pred CeEEEEEECCCCCc---------cCCeEEEEECCccceEECCCCcE-EecCCCeeEEEEEEEcCceeEEEEEEEcCC
Confidence 48999999876554 67899999999899999999999 999999999999999999999999998754
No 9
>cd06245 M14_CPD_III The third carboxypeptidase (CP)-like domain of Carboxypeptidase D (CPD; EC 3.4.17.22), domain III. CPD differs from all other metallocarboxypeptidases in that it contains multiple CP-like domains. CPD belongs to the N/E-like subfamily of the M14 family of metallocarboxypeptidases (MCPs). The M14 family are zinc-binding CPs which hydrolyze single, C-terminal amino acids from polypeptide chains, and have a recognition site for the free C-terminal carboxyl group, which is a key determinant of specificity. CPD is a single-chain protein containing a signal peptide, three tandem repeats of CP-like domains separated by short bridge regions, followed by a transmembrane domain, and a C-terminal cytosolic tail. The first two CP-like domains of CPD contain all of the essential active site and substrate-binding residues, while the third CP-like domain lacks critical residues necessary for enzymatic activity and is inactive towards standard CP substrates. Domain I is optimally active a
Probab=98.26 E-value=4.8e-06 Score=80.65 Aligned_cols=67 Identities=31% Similarity=0.428 Sum_probs=58.5
Q ss_pred ceEEEEEEECCCCCCCCCCCCCCCceeEEEEEcCceeEEEecCCccEEEccCCCeeEEEEEEecCcceeeEEEEEEcC
Q 023727 104 GFSISGRVKLPGMSLKAFGSPGGKASNVKVVLNGGEHVTFLRPDGYFSFQNMSAGTHLIEVAAIGYFFSPVRVDVSAR 181 (278)
Q Consensus 104 ~~tIsGrV~~p~~~p~~~~lp~~~~s~t~V~L~g~~~~a~~~~dG~F~f~nVP~GsY~LeVss~gy~F~p~RVdV~~~ 181 (278)
+..|+|+|....+.| ++.++|.++|.. .+.+|.+|.|.+. +|+|+|+|+|++.||.-....|.|..+
T Consensus 286 ~~gI~G~V~d~~g~p---------i~~A~V~v~g~~-~~~T~~~G~y~~~-L~pG~y~v~vs~~Gy~~~~~~V~v~~~ 352 (363)
T cd06245 286 HKGVHGVVTDKAGKP---------ISGATIVLNGGH-RVYTKEGGYFHVL-LAPGQHNINVIAEGYQQEHLPVVVSHD 352 (363)
T ss_pred CcEEEEEEEcCCCCC---------ccceEEEEeCCC-ceEeCCCcEEEEe-cCCceEEEEEEEeCceeEEEEEEEcCC
Confidence 468999998765544 678999999976 7889999999997 999999999999999999999998764
No 10
>cd03867 M14_CPZ Peptidase M14-like domain of carboxypeptidase (CP) Z (CPZ). CPZ belongs to the N/E subfamily of the M14 family of metallocarboxypeptidases (MCPs). The M14 family are zinc-binding CPs which hydrolyze single, C-terminal amino acids from polypeptide chains, and have a recognition site for the free C-terminal carboxyl group, which is a key determinant of specificity. CPZ is a secreted Zn-dependent enzyme whose biological function is largely unknown. Unlike other members of the N/E subfamily, CPZ has a bipartite structure, which consists of an N-terminal cysteine-rich domain (CRD) whose sequence is similar to Wnt-binding proteins, and a C-terminal CP catalytic domain that removes C-terminal Arg residues from substrates. CPZ is enriched in the extracellular matrix and is widely distributed during early embryogenesis. That the CRD of CPZ can bind to Wnt4 suggests that CPZ plays a role in Wnt signaling.
Probab=98.13 E-value=1.1e-05 Score=78.96 Aligned_cols=67 Identities=25% Similarity=0.301 Sum_probs=59.0
Q ss_pred eEEEEEEECCCCCCCCCCCCCCCceeEEEEEcCceeEEEecCCccEEEccCCCeeEEEEEEecCcceeeEEEEEEcC
Q 023727 105 FSISGRVKLPGMSLKAFGSPGGKASNVKVVLNGGEHVTFLRPDGYFSFQNMSAGTHLIEVAAIGYFFSPVRVDVSAR 181 (278)
Q Consensus 105 ~tIsGrV~~p~~~p~~~~lp~~~~s~t~V~L~g~~~~a~~~~dG~F~f~nVP~GsY~LeVss~gy~F~p~RVdV~~~ 181 (278)
..|+|+|....+.| ++.++|.+.|..+.+.+|.+|.|. .++|+|+|+|+|++.||.....+|+|..+
T Consensus 318 ~~i~G~V~D~~g~p---------i~~A~V~v~g~~~~~~Td~~G~y~-~~l~~G~y~l~vs~~Gy~~~~~~v~v~~~ 384 (395)
T cd03867 318 RGIKGFVKDKDGNP---------IKGARISVRGIRHDITTAEDGDYW-RLLPPGIHIVSAQAPGYTKVMKRVTLPAR 384 (395)
T ss_pred ceeEEEEEcCCCCc---------cCCeEEEEeccccceEECCCceEE-EecCCCcEEEEEEecCeeeEEEEEEeCCc
Confidence 46999999876554 678999999999999999999996 78999999999999999999989998653
No 11
>cd03858 M14_CP_N-E_like Carboxypeptidase (CP) N/E-like subfamily of the M14 family of metallocarboxypeptidases (MCPs). The M14 family are zinc-binding CPs which hydrolyze single, C-terminal amino acids from polypeptide chains, and have a recognition site for the free C-terminal carboxyl group, which is a key determinant of specificity. The N/E subfamily includes eight members, of which five (CPN, CPE, CPM, CPD, CPZ) are considered enzymatically active, while the other three are non-active (CPX1, CPX2, ACLP/AEBP1) and lack the critical active site and substrate-binding residues considered necessary for CP activity. These non-active members may function as binding proteins or display catalytic activity towards other substrates. Unlike the A/B CP subfamily, enzymes belonging to the N/E subfamily are not produced as inactive precursors that require proteolysis to produce the active form; rather, they rely on their substrate specificity and subcellular compartmentalization to prevent inappr
Probab=98.12 E-value=1.3e-05 Score=77.25 Aligned_cols=65 Identities=23% Similarity=0.206 Sum_probs=57.2
Q ss_pred EEEEEEECCCCCCCCCCCCCCCceeEEEEEcCceeEEEecCCccEEEccCCCeeEEEEEEecCcceeeEEEEEEc
Q 023727 106 SISGRVKLPGMSLKAFGSPGGKASNVKVVLNGGEHVTFLRPDGYFSFQNMSAGTHLIEVAAIGYFFSPVRVDVSA 180 (278)
Q Consensus 106 tIsGrV~~p~~~p~~~~lp~~~~s~t~V~L~g~~~~a~~~~dG~F~f~nVP~GsY~LeVss~gy~F~p~RVdV~~ 180 (278)
+|+|+|....+.| ++.++|++.|..+.+.+|.+|.|.+. +|+|+|+|+|++.||......|.|..
T Consensus 299 ~i~G~V~d~~g~p---------l~~A~V~i~~~~~~~~Td~~G~f~~~-l~~G~y~l~vs~~Gy~~~~~~v~v~~ 363 (374)
T cd03858 299 GIKGFVRDANGNP---------IANATISVEGINHDVTTAEDGDYWRL-LLPGTYNVTASAPGYEPQTKSVVVPN 363 (374)
T ss_pred ceEEEEECCCCCc---------cCCeEEEEecceeeeEECCCceEEEe-cCCEeEEEEEEEcCcceEEEEEEEec
Confidence 8999999875543 57899999999999999999999985 89999999999999988888888865
No 12
>cd03865 M14_CPE_H Peptidase M14 Carboxypeptidase (CP) E (CPE, also known as carboxypeptidase H, and enkephalin convertase; EC 3.4.17.10) belongs to the N/E subfamily of the M14 family of metallocarboxypeptidases (MCPs). The M14 family are zinc-binding CPs which hydrolyze single, C-terminal amino acids from polypeptide chains, and have a recognition site for the free C-terminal carboxyl group, which is a key determinant of specificity. CPE is an important enzyme responsible for the proteolytic processing of prohormone intermediates (such as pro-insulin, pro-opiomelanocortin, or pro-gonadotropin-releasing hormone) by specifically removing C-terminal basic residues. In addition, it has been proposed that the regulated secretory pathway (RSP) of the nervous and endocrine systems utilizes membrane-bound CPE as a sorting receptor. A naturally occurring point mutation in CPE reduces the stability of the enzyme and causes its degradation, leading to an accumulation of numerous neuroendocrine pe
Probab=98.08 E-value=1.6e-05 Score=78.30 Aligned_cols=65 Identities=22% Similarity=0.225 Sum_probs=57.5
Q ss_pred EEEEEECCCCCCCCCCCCCCCceeEEEEEcCceeEEEecCCccEEEccCCCeeEEEEEEecCcceeeEEEEEEcC
Q 023727 107 ISGRVKLPGMSLKAFGSPGGKASNVKVVLNGGEHVTFLRPDGYFSFQNMSAGTHLIEVAAIGYFFSPVRVDVSAR 181 (278)
Q Consensus 107 IsGrV~~p~~~p~~~~lp~~~~s~t~V~L~g~~~~a~~~~dG~F~f~nVP~GsY~LeVss~gy~F~p~RVdV~~~ 181 (278)
|+|+|....+.| ++.++|.+.|..+.+.++.+|.|.+ +++||+|+|+|++.||......|.|...
T Consensus 328 I~G~V~D~~g~p---------I~~AtV~V~g~~~~~~T~~~G~Y~~-~L~pG~Ytv~vsa~Gy~~~~~~V~V~~~ 392 (402)
T cd03865 328 VKGFVKDLQGNP---------IANATISVEGIDHDITSAKDGDYWR-LLAPGNYKLTASAPGYLAVVKKVAVPYS 392 (402)
T ss_pred eEEEEECCCCCc---------CCCeEEEEEcCccccEECCCeeEEE-CCCCEEEEEEEEecCcccEEEEEEEcCC
Confidence 999998765543 5789999999888899999999998 9999999999999999999888888753
No 13
>cd03868 M14_CPD_I The first carboxypeptidase (CP)-like domain of Carboxypeptidase D (CPD; EC 3.4.17.22), domain I. CPD differs from all other metallocarboxypeptidases in that it contains multiple CP-like domains. CPD belongs to the N/E-like subfamily of the M14 family of metallocarboxypeptidases (MCPs). The M14 family are zinc-binding CPs which hydrolyze single, C-terminal amino acids from polypeptide chains, and have a recognition site for the free C-terminal carboxyl group, which is a key determinant of specificity. CPD is a single-chain protein containing a signal peptide, three tandem repeats of CP-like domains separated by short bridge regions, followed by a transmembrane domain, and a C-terminal cytosolic tail. The first two CP-like domains of CPD contain all of the essential active site and substrate-binding residues, while the third CP-like domain lacks critical residues necessary for enzymatic activity and is inactive towards standard CP substrates. Domain I is optimally active at p
Probab=97.87 E-value=4.1e-05 Score=74.03 Aligned_cols=66 Identities=21% Similarity=0.193 Sum_probs=55.8
Q ss_pred eEEEEEEECCCCCCCCCCCCCCCceeEEEEEcCceeEEEecCCccEEEccCCCeeEEEEEEecCcceeeEEE-EEEc
Q 023727 105 FSISGRVKLPGMSLKAFGSPGGKASNVKVVLNGGEHVTFLRPDGYFSFQNMSAGTHLIEVAAIGYFFSPVRV-DVSA 180 (278)
Q Consensus 105 ~tIsGrV~~p~~~p~~~~lp~~~~s~t~V~L~g~~~~a~~~~dG~F~f~nVP~GsY~LeVss~gy~F~p~RV-dV~~ 180 (278)
..|+|+|....+.| ++.++|+|.+..+.+.+|.+|.|. .+||+|+|+|+|++.||......+ .|..
T Consensus 296 ~~i~G~V~d~~g~p---------v~~A~V~v~~~~~~~~td~~G~y~-~~l~~G~Y~l~vs~~Gf~~~~~~~v~v~~ 362 (372)
T cd03868 296 IGVKGFVRDASGNP---------IEDATIMVAGIDHNVTTAKFGDYW-RLLLPGTYTITAVAPGYEPSTVTDVVVKE 362 (372)
T ss_pred CceEEEEEcCCCCc---------CCCcEEEEEecccceEeCCCceEE-ecCCCEEEEEEEEecCCCceEEeeEEEcC
Confidence 56999998875544 678999999988899999999998 589999999999999999887764 4543
No 14
>cd03866 M14_CPM Peptidase M14 Carboxypeptidase (CP) M (CPM) belongs to the N/E subfamily of the M14 family of metallocarboxypeptidases (MCPs). The M14 family are zinc-binding CPs which hydrolyze single, C-terminal amino acids from polypeptide chains, and have a recognition site for the free C-terminal carboxyl group, which is a key determinant of specificity. CPM is an extracellular glycoprotein, bound to cell membranes via a glycosyl-phosphatidylinositol on the C-terminus of the protein. It specifically removes C-terminal basic residues such as lysine and arginine from peptides and proteins. The highest levels of CPM have been found in human lung and placenta, but significant amounts are present in kidney, blood vessels, intestine, brain, and peripheral nerves. CPM has also been found in soluble form in various body fluids, including amniotic fluid, seminal plasma and urine. Due to its wide distribution in a variety of tissues, it is believed that it plays an important role in the cont
Probab=97.80 E-value=0.00011 Score=71.46 Aligned_cols=68 Identities=19% Similarity=0.205 Sum_probs=55.7
Q ss_pred ceEEEEEEECCCCCCCCCCCCCCCceeEEEEEcCcee--EEEecCCccEEEccCCCeeEEEEEEecCcceeeEEEEEEcC
Q 023727 104 GFSISGRVKLPGMSLKAFGSPGGKASNVKVVLNGGEH--VTFLRPDGYFSFQNMSAGTHLIEVAAIGYFFSPVRVDVSAR 181 (278)
Q Consensus 104 ~~tIsGrV~~p~~~p~~~~lp~~~~s~t~V~L~g~~~--~a~~~~dG~F~f~nVP~GsY~LeVss~gy~F~p~RVdV~~~ 181 (278)
+..|+|+|....+.| ++.++|.+.|... .+.+|.+|.|.+ .++||+|+|.|++.||......|.|...
T Consensus 294 ~~gI~G~V~D~~g~p---------i~~A~V~v~g~~~~~~~~T~~~G~y~~-~l~pG~Y~v~vsa~Gy~~~~~~v~v~~~ 363 (376)
T cd03866 294 HLGVKGQVFDSNGNP---------IPNAIVEVKGRKHICPYRTNVNGEYFL-LLLPGKYMINVTAPGFKTVITNVIIPYN 363 (376)
T ss_pred cCceEEEEECCCCCc---------cCCeEEEEEcCCceeEEEECCCceEEE-ecCCeeEEEEEEeCCcceEEEEEEeCCC
Confidence 356999998665543 5778999988754 458999999966 5999999999999999988888888754
No 15
>PRK15036 hydroxyisourate hydrolase; Provisional
Probab=96.92 E-value=0.0091 Score=50.94 Aligned_cols=59 Identities=20% Similarity=0.224 Sum_probs=45.2
Q ss_pred eEEEEEEECCCCCCCCCCCCCCCceeEEEEEcCce-------eEEEecCCccEEE---cc-CCCeeEEEEEEecCcce
Q 023727 105 FSISGRVKLPGMSLKAFGSPGGKASNVKVVLNGGE-------HVTFLRPDGYFSF---QN-MSAGTHLIEVAAIGYFF 171 (278)
Q Consensus 105 ~tIsGrV~~p~~~p~~~~lp~~~~s~t~V~L~g~~-------~~a~~~~dG~F~f---~n-VP~GsY~LeVss~gy~F 171 (278)
..|+|+|...... + | +++++|.|.... ..+.+|.||.|.+ .+ +++|.|.|....-+|.-
T Consensus 27 ~~Is~HVLDt~~G-~----P---A~gV~V~L~~~~~~~w~~l~~~~Td~dGR~~~l~~~~~~~~G~Y~L~F~t~~Yf~ 96 (137)
T PRK15036 27 NILSVHILNQQTG-K----P---AADVTVTLEKKADNGWLQLNTAKTDKDGRIKALWPEQTATTGDYRVVFKTGDYFK 96 (137)
T ss_pred CCeEEEEEeCCCC-c----C---CCCCEEEEEEccCCceEEEEEEEECCCCCCccccCcccCCCeeEEEEEEcchhhh
Confidence 3699999866442 1 2 678888886422 5699999999987 34 78999999999999864
No 16
>PF05738 Cna_B: Cna protein B-type domain; InterPro: IPR008454 This entry represents a repeated B region domain found in the collagen-binding surface protein Cna in Staphylococcus aureus, as well as other related domains. The B region domain of Cna has a prealbumin-like beta-sandwich fold of seven strands in two sheets with a Greek key topology. However, this domain does not mediate collagen binding (the region covered by IPR008456 from INTERPRO carries out that function); instead it appears to form a stalk that presents the ligand binding domain away from the bacterial cell surface. Cna is a collagen-binding MSCRAMM (Microbial Surface Component Recognizing Adhesive Matrix Molecules), and is necessary and sufficient for S. aureus cells to adhere to cartilage.; PDB: 2X5P_A 3RKP_A 3KPT_A 1VLF_T 1TI2_F 1TI6_D 1TI4_J 1VLE_V 1VLD_X 3PF2_A ....
Probab=96.71 E-value=0.0096 Score=43.42 Aligned_cols=51 Identities=18% Similarity=0.324 Sum_probs=37.1
Q ss_pred eeEEEEEcC---cee---EEEecCCccEEEccCCCeeEEEEEEe--cCcceeeEEEEEE
Q 023727 129 SNVKVVLNG---GEH---VTFLRPDGYFSFQNMSAGTHLIEVAA--IGYFFSPVRVDVS 179 (278)
Q Consensus 129 s~t~V~L~g---~~~---~a~~~~dG~F~f~nVP~GsY~LeVss--~gy~F~p~RVdV~ 179 (278)
+.+++.|.. ... ...+|.+|.|.|.++++|.|.|+-.. .||...+-...+.
T Consensus 2 ~Ga~f~L~~~~~~~~~~~~~~Td~~G~~~f~~L~~G~Y~l~E~~aP~GY~~~~~~~~~~ 60 (70)
T PF05738_consen 2 AGATFELYDEDGNEVIEVTVTTDENGKYTFKNLPPGTYTLKETKAPDGYQLDDTPYEFT 60 (70)
T ss_dssp STEEEEEEETTSEEEEEEEEEGGTTSEEEEEEEESEEEEEEEEETTTTEEEEECEEEEE
T ss_pred CCeEEEEEECCCCEEEEEEEEECCCCEEEEeecCCeEEEEEEEECCCCCEECCCceEEE
Confidence 346666642 222 26799999999999999999998876 6887776554443
No 17
>KOG1948 consensus Metalloproteinase-related collagenase pM5 [Posttranslational modification, protein turnover, chaperones]
Probab=96.70 E-value=0.0048 Score=65.85 Aligned_cols=70 Identities=31% Similarity=0.459 Sum_probs=58.1
Q ss_pred CceEEEEEEECCCCCCCCCCCCCCCceeEEEEEcCceeEEEecCCccEEEcc-CCCeeEEEEEEecCcceeeEEEEEEcC
Q 023727 103 DGFSISGRVKLPGMSLKAFGSPGGKASNVKVVLNGGEHVTFLRPDGYFSFQN-MSAGTHLIEVAAIGYFFSPVRVDVSAR 181 (278)
Q Consensus 103 ~~~tIsGrV~~p~~~p~~~~lp~~~~s~t~V~L~g~~~~a~~~~dG~F~f~n-VP~GsY~LeVss~gy~F~p~RVdV~~~ 181 (278)
.+++|.|||...... +| ++.+.|++||. ..+-+|.+|+|++.| +..|+|++++....|.|.++.+.|...
T Consensus 314 tgfSvtGRVl~g~~g-----~~---l~gvvvlvngk-~~~kTdaqGyykLen~~t~gtytI~a~kehlqFstv~~kv~pn 384 (1165)
T KOG1948|consen 314 TGFSVTGRVLVGSKG-----LP---LSGVVVLVNGK-SGGKTDAQGYYKLENLKTDGTYTITAKKEHLQFSTVHAKVKPN 384 (1165)
T ss_pred EEEEeeeeEEeCCCC-----CC---ccceEEEEcCc-ccceEcccceEEeeeeeccCcEEEEEeccceeeeeEEEEecCC
Confidence 458999999865332 12 67788888875 457799999999999 899999999999999999999998764
No 18
>PF08308 PEGA: PEGA domain; InterPro: IPR013229 This domain is found in both archaea and bacteria and has similarity to S-layer (surface layer) proteins. It is named after the characteristic PEGA sequence motif found in this domain. The secondary structure of this domain is predicted to be beta-strands.
Probab=96.58 E-value=0.0082 Score=44.28 Aligned_cols=49 Identities=20% Similarity=0.275 Sum_probs=40.5
Q ss_pred eeEEEEEcCceeEEEecCCccEEEccCCCeeEEEEEEecCcceeeEEEEEEcCC
Q 023727 129 SNVKVVLNGGEHVTFLRPDGYFSFQNMSAGTHLIEVAAIGYFFSPVRVDVSARH 182 (278)
Q Consensus 129 s~t~V~L~g~~~~a~~~~dG~F~f~nVP~GsY~LeVss~gy~F~p~RVdV~~~~ 182 (278)
+.++|.|||...+ ..-..+.++++|.|.|++...||.-....|+|.+..
T Consensus 11 ~gA~V~vdg~~~G-----~tp~~~~~l~~G~~~v~v~~~Gy~~~~~~v~v~~~~ 59 (71)
T PF08308_consen 11 SGAEVYVDGKYIG-----TTPLTLKDLPPGEHTVTVEKPGYEPYTKTVTVKPGE 59 (71)
T ss_pred CCCEEEECCEEec-----cCcceeeecCCccEEEEEEECCCeeEEEEEEECCCC
Confidence 5689999996555 234578889999999999999999999999998643
No 19
>PF08400 phage_tail_N: Prophage tail fibre N-terminal; InterPro: IPR013609 This entry represents the N terminus of phage 933W tail fibre protein. The characteristics of the protein distribution suggest prophage matches.
Probab=96.32 E-value=0.034 Score=47.56 Aligned_cols=58 Identities=17% Similarity=0.180 Sum_probs=46.2
Q ss_pred ceEEEEEEECCCCCCCCCCCCCCCceeEEEEEcC----------ceeEEEecCCccEEEccCCCeeEEEEEEecCcce
Q 023727 104 GFSISGRVKLPGMSLKAFGSPGGKASNVKVVLNG----------GEHVTFLRPDGYFSFQNMSAGTHLIEVAAIGYFF 171 (278)
Q Consensus 104 ~~tIsGrV~~p~~~p~~~~lp~~~~s~t~V~L~g----------~~~~a~~~~dG~F~f~nVP~GsY~LeVss~gy~F 171 (278)
..+|+|.++.+.++| .+..++.|.- ..-+..++.+|.|.| ++.||.|.+.+..-++.+
T Consensus 2 sV~ISGvL~dg~G~p---------v~g~~I~L~A~~tS~~Vv~~t~as~~t~~~G~Ys~-~~epG~Y~V~l~~~g~~~ 69 (134)
T PF08400_consen 2 SVKISGVLKDGAGKP---------VPGCTITLKARRTSSTVVVGTVASVVTGEAGEYSF-DVEPGVYRVTLKVEGRPP 69 (134)
T ss_pred eEEEEEEEeCCCCCc---------CCCCEEEEEEccCchheEEEEEEEEEcCCCceEEE-EecCCeEEEEEEECCCCc
Confidence 478999999888876 4667777652 233578899999999 799999999999888843
No 20
>cd03869 M14_CPX_like Peptidase M14-like domain of carboxypeptidase (CP)-like protein X (CPX). CPX forms a distinct subgroup of the N/E subfamily of the M14 family of metallocarboxypeptidases (MCPs). The M14 family are zinc-binding CPs which hydrolyze single, C-terminal amino acids from polypeptide chains, and have a recognition site for the free C-terminal carboxyl group, which is a key determinant of specificity. Proteins belonging to this subgroup include CP-like protein X1 (CPX1), CP-like protein X2 (CPX2), and aortic CP-like protein (ACLP) and its isoform adipocyte enhancer binding protein-1 (AEBP1). AEBP1 is a truncated form of ACLP, which may arise from alternative splicing of the gene. These proteins are inactive towards standard CP substrates because they lack one or more critical active site and substrate-binding residues that are necessary for activity. They may function as binding proteins rather than as active CPs or display catalytic activity toward other substrates. Pro
Probab=95.40 E-value=0.047 Score=54.17 Aligned_cols=66 Identities=21% Similarity=0.207 Sum_probs=53.6
Q ss_pred EEEEEEECCCCCCCCCCCCCCCceeEEEEEcCceeEEEecCCccEEEccCCCeeEEEEEEecCcceeeEEEEEEcC
Q 023727 106 SISGRVKLPGMSLKAFGSPGGKASNVKVVLNGGEHVTFLRPDGYFSFQNMSAGTHLIEVAAIGYFFSPVRVDVSAR 181 (278)
Q Consensus 106 tIsGrV~~p~~~p~~~~lp~~~~s~t~V~L~g~~~~a~~~~dG~F~f~nVP~GsY~LeVss~gy~F~p~RVdV~~~ 181 (278)
-|+|.|....++| +++++|.++|-.+...+.++|.|- +=+.||+|.+.+++.||.-....|.|...
T Consensus 330 GikG~V~d~~g~~---------i~~a~i~v~g~~~~v~t~~~Gdyw-Rll~pG~y~v~~~a~gy~~~~~~~~v~~~ 395 (405)
T cd03869 330 GIKGVVRDKTGKG---------IPNAIISVEGINHDIRTASDGDYW-RLLNPGEYRVTAHAEGYTSSTKNCEVGYE 395 (405)
T ss_pred CceEEEECCCCCc---------CCCcEEEEecCccceeeCCCCceE-EecCCceEEEEEEecCCCcccEEEEEcCC
Confidence 4899998775544 567889999977777778888874 45899999999999999988888888743
No 21
>KOG2649 consensus Zinc carboxypeptidase [General function prediction only]
Probab=94.30 E-value=0.18 Score=51.21 Aligned_cols=66 Identities=21% Similarity=0.204 Sum_probs=56.4
Q ss_pred eEEEEEEECCCCCCCCCCCCCCCceeEEEEEcCceeEEEecCCccEEEccCCCeeEEEEEEecCcceeeEEEEEEc
Q 023727 105 FSISGRVKLPGMSLKAFGSPGGKASNVKVVLNGGEHVTFLRPDGYFSFQNMSAGTHLIEVAAIGYFFSPVRVDVSA 180 (278)
Q Consensus 105 ~tIsGrV~~p~~~p~~~~lp~~~~s~t~V~L~g~~~~a~~~~dG~F~f~nVP~GsY~LeVss~gy~F~p~RVdV~~ 180 (278)
--|+|-|....+++ +++++|.++|..+...+..+|.|= +=++||.|.|.++..||.-....|.|..
T Consensus 378 ~GIkG~V~D~~G~~---------I~NA~IsV~ginHdv~T~~~GDYW-RLL~PG~y~vta~A~Gy~~~tk~v~V~~ 443 (500)
T KOG2649|consen 378 RGIKGLVFDDTGNP---------IANATISVDGINHDVTTAKEGDYW-RLLPPGKYIITASAEGYDPVTKTVTVPP 443 (500)
T ss_pred hccceeEEcCCCCc---------cCceEEEEecCcCceeecCCCceE-EeeCCcceEEEEecCCCcceeeEEEeCC
Confidence 35889998755543 689999999999888889999874 3589999999999999999999999976
No 22
>KOG1948 consensus Metalloproteinase-related collagenase pM5 [Posttranslational modification, protein turnover, chaperones]
Probab=94.21 E-value=0.087 Score=56.71 Aligned_cols=62 Identities=23% Similarity=0.298 Sum_probs=50.2
Q ss_pred CceEEEEEEECCCCCCCCCCCCCCCceeEEEEEcCc---eeEEEecCCccEEEccCCCeeEEEEEEecCcceee
Q 023727 103 DGFSISGRVKLPGMSLKAFGSPGGKASNVKVVLNGG---EHVTFLRPDGYFSFQNMSAGTHLIEVAAIGYFFSP 173 (278)
Q Consensus 103 ~~~tIsGrV~~p~~~p~~~~lp~~~~s~t~V~L~g~---~~~a~~~~dG~F~f~nVP~GsY~LeVss~gy~F~p 173 (278)
.+++|+|+|.-..+. + .+.+.|.|... -..+.++.+|.|.|.|+.||.|.+..+++...+..
T Consensus 117 tGFsv~GkVlgaagg----G-----pagV~velrs~e~~iast~T~~~Gky~f~~iiPG~Yev~ashp~w~~~~ 181 (1165)
T KOG1948|consen 117 TGFSVRGKVLGAAGG----G-----PAGVLVELRSQEDPIASTKTEDGGKYEFRNIIPGKYEVSASHPAWECIS 181 (1165)
T ss_pred eeeeEeeEEeeccCC----C-----cccceeecccccCcceeeEecCCCeEEEEecCCCceEEeccCcceeEee
Confidence 357999999876541 1 25677777654 35799999999999999999999999999999876
No 23
>cd00421 intradiol_dioxygenase Intradiol dioxygenases catalyze the critical ring-cleavage step in the conversion of catecholate derivatives to citric acid cycle intermediates. This family contains catechol 1,2-dioxygenases and protocatechuate 3,4-dioxygenases which are mononuclear non-heme iron enzymes that catalyze the oxygenation of catecholates to aliphatic acids via the cleavage of aromatic rings. The members are intradiol-cleaving enzymes which break the catechol C1-C2 bond and utilize Fe3+, as opposed to the extradiol-cleaving enzymes which break the C2-C3 or C1-C6 bond and utilize Fe2+ and Mn+. Catechol 1,2-dioxygenases are mostly homodimers with one catalytic ferric ion per monomer. Protocatechuate 3,4-dioxygenases form more diverse oligomers.
Probab=93.87 E-value=0.19 Score=42.77 Aligned_cols=54 Identities=17% Similarity=0.237 Sum_probs=40.1
Q ss_pred CCCCceEEEEEEECCCCCCCCCCCCCCCceeEEEEEcC--------------------ceeEEEecCCccEEEccCCCee
Q 023727 100 GSGDGFSISGRVKLPGMSLKAFGSPGGKASNVKVVLNG--------------------GEHVTFLRPDGYFSFQNMSAGT 159 (278)
Q Consensus 100 ~~~~~~tIsGrV~~p~~~p~~~~lp~~~~s~t~V~L~g--------------------~~~~a~~~~dG~F~f~nVP~Gs 159 (278)
..+...+|.|+|...++.| ++.+.|.+=. +.-...+|++|.|.|..|+||.
T Consensus 7 ~~G~~l~l~G~V~D~~g~p---------v~~A~VeiW~~d~~G~Y~~~~~~~~~~~~~~rg~~~Td~~G~y~f~ti~Pg~ 77 (146)
T cd00421 7 APGEPLTLTGTVLDGDGCP---------VPDALVEIWQADADGRYSGQDDSGLDPEFFLRGRQITDADGRYRFRTIKPGP 77 (146)
T ss_pred CCCCEEEEEEEEECCCCCC---------CCCcEEEEEecCCCCccCCcCccccCCCCCCEEEEEECCCcCEEEEEEcCCC
Confidence 3467789999999887765 3445555521 1225789999999999999999
Q ss_pred EEE
Q 023727 160 HLI 162 (278)
Q Consensus 160 Y~L 162 (278)
|.+
T Consensus 78 Y~~ 80 (146)
T cd00421 78 YPI 80 (146)
T ss_pred CCC
Confidence 984
No 24
>PF07495 Y_Y_Y: Y_Y_Y domain; InterPro: IPR011123 This region is mostly found at the end of the beta propellers (IPR011110 from INTERPRO) in a family of two component regulators. However they are also found tandemly repeated in Q891H4 from SWISSPROT without other signal conduction domains being present. It is named after the conserved tyrosines found in the alignment. The exact function is not known.; PDB: 3V9F_D 3VA6_B 3OTT_B 4A2M_D 4A2L_B.
Probab=93.86 E-value=0.13 Score=36.92 Aligned_cols=39 Identities=23% Similarity=0.291 Sum_probs=28.4
Q ss_pred EEEEEcC-ceeEEEecCCc-cEEEccCCCeeEEEEEEecCc
Q 023727 131 VKVVLNG-GEHVTFLRPDG-YFSFQNMSAGTHLIEVAAIGY 169 (278)
Q Consensus 131 t~V~L~g-~~~~a~~~~dG-~F~f~nVP~GsY~LeVss~gy 169 (278)
-+-.|.| ...|..+.... .+.|.++|||+|+|+|.+.+-
T Consensus 10 Y~Y~l~g~d~~W~~~~~~~~~~~~~~L~~G~Y~l~V~a~~~ 50 (66)
T PF07495_consen 10 YRYRLEGFDDEWITLGSYSNSISYTNLPPGKYTLEVRAKDN 50 (66)
T ss_dssp EEEEEETTESSEEEESSTS-EEEEES--SEEEEEEEEEEET
T ss_pred EEEEEECCCCeEEECCCCcEEEEEEeCCCEEEEEEEEEECC
Confidence 3455665 34577777777 999999999999999998763
No 25
>PF07210 DUF1416: Protein of unknown function (DUF1416); InterPro: IPR010814 This family consists of several hypothetical bacterial proteins of around 100 residues in length. Members of this family appear to be Actinomycete specific. The function of this family is unknown.
Probab=93.16 E-value=0.45 Score=37.83 Aligned_cols=56 Identities=23% Similarity=0.265 Sum_probs=42.4
Q ss_pred CceEEEEEEECCCCCCCCCCCCCCCceeEEEEEcC--cee--EEEecCCccEEEccCCCeeEEEEEEecCc
Q 023727 103 DGFSISGRVKLPGMSLKAFGSPGGKASNVKVVLNG--GEH--VTFLRPDGYFSFQNMSAGTHLIEVAAIGY 169 (278)
Q Consensus 103 ~~~tIsGrV~~p~~~p~~~~lp~~~~s~t~V~L~g--~~~--~a~~~~dG~F~f~nVP~GsY~LeVss~gy 169 (278)
....|+|+|. .++.| +..+.|.|.+ +++ ...+.+.|.|.|. ..||+++|.+.++.=
T Consensus 6 ke~VItG~V~-~~G~P---------v~gAyVRLLD~sgEFtaEvvts~~G~FRFf-aapG~WtvRal~~~g 65 (85)
T PF07210_consen 6 KETVITGRVT-RDGEP---------VGGAYVRLLDSSGEFTAEVVTSATGDFRFF-AAPGSWTVRALSRGG 65 (85)
T ss_pred ceEEEEEEEe-cCCcC---------CCCeEEEEEcCCCCeEEEEEecCCccEEEE-eCCCceEEEEEccCC
Confidence 3468999999 66655 4566766643 455 4788999999996 689999999888764
No 26
>PF11008 DUF2846: Protein of unknown function (DUF2846); InterPro: IPR022548 Some members in this group of proteins with unknown function are annotated as lipoproteins. However this cannot be confirmed.
Probab=89.38 E-value=1.8 Score=35.17 Aligned_cols=55 Identities=20% Similarity=0.201 Sum_probs=36.4
Q ss_pred eeEEEEEcCceeEEEecCCccEEEccCCCeeEEEEEEecCcc---eeeEEEEEEcCCCCceEE
Q 023727 129 SNVKVVLNGGEHVTFLRPDGYFSFQNMSAGTHLIEVAAIGYF---FSPVRVDVSARHPGKVQA 188 (278)
Q Consensus 129 s~t~V~L~g~~~~a~~~~dG~F~f~nVP~GsY~LeVss~gy~---F~p~RVdV~~~~~G~VrA 188 (278)
..-.|.+||..... ..+|.|...+||||.|.+.....-.. -..+.|++.+ |++.-
T Consensus 41 ~~~~v~vdg~~ig~--l~~g~y~~~~v~pG~h~i~~~~~~~~~~~~~~l~~~~~~---G~~yy 98 (117)
T PF11008_consen 41 VKPDVYVDGELIGE--LKNGGYFYVEVPPGKHTISAKSEFSSSPGANSLDVTVEA---GKTYY 98 (117)
T ss_pred ccceEEECCEEEEE--eCCCeEEEEEECCCcEEEEEecCccCCCCccEEEEEEcC---CCEEE
Confidence 44567778765544 67889999999999999999554332 1345555543 56543
No 27
>COG3485 PcaH Protocatechuate 3,4-dioxygenase beta subunit [Secondary metabolites biosynthesis, transport, and catabolism]
Probab=85.64 E-value=1.8 Score=40.09 Aligned_cols=57 Identities=19% Similarity=0.253 Sum_probs=40.7
Q ss_pred CCCCceEEEEEEECCCCCCCCCCCCCCCceeEEEEEc----C------------------ceeEEEecCCccEEEccCCC
Q 023727 100 GSGDGFSISGRVKLPGMSLKAFGSPGGKASNVKVVLN----G------------------GEHVTFLRPDGYFSFQNMSA 157 (278)
Q Consensus 100 ~~~~~~tIsGrV~~p~~~p~~~~lp~~~~s~t~V~L~----g------------------~~~~a~~~~dG~F~f~nVP~ 157 (278)
+.++...|+|||...++.| +..+.|-+= + +.-.+.+|++|.|.|.-+.|
T Consensus 68 ~~Ge~i~l~G~VlD~~G~P---------v~~A~VEiWQAda~GrY~~~~d~~~~~~~~f~g~Gr~~Td~~G~y~F~Ti~P 138 (226)
T COG3485 68 ARGERILLEGRVLDGNGRP---------VPDALVEIWQADADGRYSHPKDSRLAPLPNFNGRGRTITDEDGEYRFRTIKP 138 (226)
T ss_pred CCCceEEEEEEEECCCCCC---------CCCCEEEEEEcCCCCcccCccccccCcCccccceEEEEeCCCceEEEEEeec
Confidence 4457899999999888776 344555440 1 12247899999999999999
Q ss_pred eeEEEEEE
Q 023727 158 GTHLIEVA 165 (278)
Q Consensus 158 GsY~LeVs 165 (278)
|.|-..-.
T Consensus 139 g~yp~~~~ 146 (226)
T COG3485 139 GPYPWRNG 146 (226)
T ss_pred ccccCCCC
Confidence 98854433
No 28
>PF12866 DUF3823: Protein of unknown function (DUF3823); InterPro: IPR024278 This is a family of uncharacterised proteins from Bacteroidetes. These proteins have characteristic DN and DR sequence-motifs but their function is not known.; PDB: 3HN5_B 4EIU_A.
Probab=83.05 E-value=5 Score=36.82 Aligned_cols=75 Identities=15% Similarity=0.170 Sum_probs=46.6
Q ss_pred CCCceEEEEEEECC-CCCCCCCCCCCCCceeEEEEEcC------ceeEEEecCCccEEEccCCCeeEEEEE-EecC---c
Q 023727 101 SGDGFSISGRVKLP-GMSLKAFGSPGGKASNVKVVLNG------GEHVTFLRPDGYFSFQNMSAGTHLIEV-AAIG---Y 169 (278)
Q Consensus 101 ~~~~~tIsGrV~~p-~~~p~~~~lp~~~~s~t~V~L~g------~~~~a~~~~dG~F~f~nVP~GsY~LeV-ss~g---y 169 (278)
.+...+++|+|... .+++ +... .-.+++.|-. +.....+..||+|.=..+=+|.|.|.+ .-.+ +
T Consensus 18 D~P~s~l~G~iiD~~tgE~--i~~~---~~gv~i~l~e~gy~~~~~~~~~v~qDGtf~n~~lF~G~Yki~~~~G~fp~~~ 92 (222)
T PF12866_consen 18 DEPDSTLTGRIIDVYTGEP--IQTD---IGGVRIQLYELGYGDNTPQDVYVKQDGTFRNTKLFDGDYKIVPKNGNFPWVV 92 (222)
T ss_dssp ----EEEEEEEEECCTTEE---------STSSEEEEECS-CCG--SEEEEB-TTSEEEEEEE-SEEEEEEE-CTSCSBSC
T ss_pred cCCCceEEEEEEEeecCCe--eeec---CCceEEEEEecccccCCCcceEEccCCceeeeeEeccceEEEEcCCCCcccC
Confidence 34668999999543 2333 2222 1367787753 345678999999955566789999999 7788 7
Q ss_pred ceeeEEEEEEc
Q 023727 170 FFSPVRVDVSA 180 (278)
Q Consensus 170 ~F~p~RVdV~~ 180 (278)
...+.+|+|.+
T Consensus 93 ~~dti~v~i~G 103 (222)
T PF12866_consen 93 PVDTIEVDIKG 103 (222)
T ss_dssp CE--EEEEESS
T ss_pred CCccEEEEecC
Confidence 88889999974
No 29
>PF10794 DUF2606: Protein of unknown function (DUF2606); InterPro: IPR019730 This entry represents bacterial proteins with unknown function.
Probab=82.92 E-value=6.7 Score=33.35 Aligned_cols=28 Identities=14% Similarity=0.241 Sum_probs=23.7
Q ss_pred ceeEEEecCCccEEEccCCCeeEEEEEE
Q 023727 138 GEHVTFLRPDGYFSFQNMSAGTHLIEVA 165 (278)
Q Consensus 138 ~~~~a~~~~dG~F~f~nVP~GsY~LeVs 165 (278)
|...+-+|++|.+..+++..|.|.+...
T Consensus 78 g~~IGKTD~~Gki~Wk~~~kG~Y~v~l~ 105 (131)
T PF10794_consen 78 GISIGKTDEEGKIIWKNGRKGKYIVFLP 105 (131)
T ss_pred ceeecccCCCCcEEEecCCcceEEEEEc
Confidence 4556889999999999999999987544
No 30
>PF10670 DUF4198: Domain of unknown function (DUF4198)
Probab=82.70 E-value=5.6 Score=34.34 Aligned_cols=55 Identities=20% Similarity=0.213 Sum_probs=38.7
Q ss_pred ceEEEEEEECCCCCCCCCCCCCCCceeEEEEEc--Cc-------eeEEEecCCccEEEccCCCeeEEEEEEecC
Q 023727 104 GFSISGRVKLPGMSLKAFGSPGGKASNVKVVLN--GG-------EHVTFLRPDGYFSFQNMSAGTHLIEVAAIG 168 (278)
Q Consensus 104 ~~tIsGrV~~p~~~p~~~~lp~~~~s~t~V~L~--g~-------~~~a~~~~dG~F~f~nVP~GsY~LeVss~g 168 (278)
+-.++.+|...+ +| ++++.|.+. +. .....+|.+|.|+|.=-.+|.|+|.+.+.+
T Consensus 150 g~~~~~~vl~~G-kP---------l~~a~V~~~~~~~~~~~~~~~~~~~TD~~G~~~~~~~~~G~wli~a~~~~ 213 (215)
T PF10670_consen 150 GDPLPFQVLFDG-KP---------LAGAEVEAFSPGGWYDVEHEAKTLKTDANGRATFTLPRPGLWLIRASHKD 213 (215)
T ss_pred CCEEEEEEEECC-eE---------cccEEEEEEECCCccccccceEEEEECCCCEEEEecCCCEEEEEEEEEec
Confidence 345677776443 33 344555553 21 567999999999998778999999998754
No 31
>cd03459 3,4-PCD Protocatechuate 3,4-dioxygenase (3,4-PCD) catalyzes the oxidative ring cleavage of 3,4-dihydroxybenzoate to produce beta-carboxy-cis,cis-muconate. 3,4-PCDs are large aggregates of 12 protomers, each composed of an alpha- and beta-subunit and an Fe3+ ion bound in the beta-subunit at the alpha-beta-subunit interface. 3,4-PCD is a member of the aromatic dioxygenases which are non-heme iron intradiol-cleaving enzymes that break the C1-C2 bond and utilize Fe3+.
Probab=81.69 E-value=4 Score=35.51 Aligned_cols=53 Identities=19% Similarity=0.260 Sum_probs=37.4
Q ss_pred CCCCceEEEEEEECCCCCCCCCCCCCCCceeEEEEEc-----C---------------c---eeEEEecCCccEEEccCC
Q 023727 100 GSGDGFSISGRVKLPGMSLKAFGSPGGKASNVKVVLN-----G---------------G---EHVTFLRPDGYFSFQNMS 156 (278)
Q Consensus 100 ~~~~~~tIsGrV~~p~~~p~~~~lp~~~~s~t~V~L~-----g---------------~---~~~a~~~~dG~F~f~nVP 156 (278)
.......|+|+|...+..| ++.+.|-+= | + .-...+|.+|.|.|.-|.
T Consensus 11 ~~G~~l~l~g~V~D~~g~P---------v~~A~veiWqad~~G~Y~~~~~~~~~~~~~~f~~rG~~~Td~~G~~~f~Ti~ 81 (158)
T cd03459 11 AIGERIILEGRVLDGDGRP---------VPDALVEIWQADAAGRYRHPRDSHRAPLDPNFTGFGRVLTDADGRYRFRTIK 81 (158)
T ss_pred CCCcEEEEEEEEECCCCCC---------CCCCEEEEEccCCCCccCCccCCcccccCCCCCceeEEEECCCCcEEEEEEC
Confidence 3456799999999766655 344444441 0 0 113679999999999999
Q ss_pred CeeEE
Q 023727 157 AGTHL 161 (278)
Q Consensus 157 ~GsY~ 161 (278)
||.|-
T Consensus 82 Pg~Y~ 86 (158)
T cd03459 82 PGAYP 86 (158)
T ss_pred CCCcC
Confidence 99986
No 32
>TIGR02465 chlorocat_1_2 chlorocatechol 1,2-dioxygenase. Members of this protein family are chlorocatechol 1,2-dioxygenase. This protein is closely related to catechol 1,2-dioxygenase, TIGR02439, EC 1.13.11.1. Note that annotated database entries have appeared for the present protein family with the EC number that refers to that of family TIGR02439. This protein acts in pathways of the biodegradation of chlorinated aromatic compounds.
Probab=79.91 E-value=4.2 Score=38.01 Aligned_cols=52 Identities=12% Similarity=0.077 Sum_probs=36.7
Q ss_pred CCCceEEEEEEECCCCCCCCCCCCCCCceeEEEEEc----Cc--------------eeEEEecCCccEEEccCCCeeEE
Q 023727 101 SGDGFSISGRVKLPGMSLKAFGSPGGKASNVKVVLN----GG--------------EHVTFLRPDGYFSFQNMSAGTHL 161 (278)
Q Consensus 101 ~~~~~tIsGrV~~p~~~p~~~~lp~~~~s~t~V~L~----g~--------------~~~a~~~~dG~F~f~nVP~GsY~ 161 (278)
.++...|+|+|...+++| ++++.|-+= .| .-...+|++|.|.|.-|.||.|-
T Consensus 95 ~G~~l~v~G~V~D~~G~P---------v~~A~VeiWqad~~G~Y~~~~~~~~~~~lRG~~~Td~~G~y~F~Ti~P~~Yp 164 (246)
T TIGR02465 95 DHKPLLIRGTVRDLSGTP---------VAGAVIDVWHSTPDGKYSGFHDNIPDDYYRGKLVTAADGSYEVRTTMPVPYQ 164 (246)
T ss_pred CCcEEEEEEEEEcCCCCC---------cCCcEEEEECCCCCCCCCCCCCCCCCCCCeEEEEECCCCCEEEEEECCCCCC
Confidence 356699999999766665 344444441 01 11467899999999999999883
No 33
>TIGR02962 hdxy_isourate hydroxyisourate hydrolase. Members of this family, hydroxyisourate hydrolase, represent a distinct clade of transthyretin-related proteins. Bacterial members typically are encoded next to ureidoglycolate hydrolase and often near either xanthine dehydrogenase or xanthine/uracil permease genes and have been demonstrated to have hydroxyisourate hydrolase activity. In eukaryotes, a clade separate from the transthyretins (a family of thyroid-hormone binding proteins) has also been shown to have HIU hydrolase activity in urate catabolizing organisms. Transthyretin, then, would appear to be the recently diverged paralog of the more ancient HIUH family.
Probab=78.68 E-value=13 Score=30.80 Aligned_cols=44 Identities=25% Similarity=0.307 Sum_probs=33.0
Q ss_pred ceeEEEEEc---Cce----eEEEecCCccEE-----EccCCCeeEEEEEEecCcce
Q 023727 128 ASNVKVVLN---GGE----HVTFLRPDGYFS-----FQNMSAGTHLIEVAAIGYFF 171 (278)
Q Consensus 128 ~s~t~V~L~---g~~----~~a~~~~dG~F~-----f~nVP~GsY~LeVss~gy~F 171 (278)
++.+.|.|. +++ ..+.+|.||... -..+++|.|.|+...-+|.-
T Consensus 16 Aagv~V~L~~~~~~~~~~i~~~~Tn~DGR~~~~l~~~~~~~~G~Y~l~F~~g~Yf~ 71 (112)
T TIGR02962 16 AAGVPVTLYRLDGSGWTPLAEGVTNADGRCPDLLPEGETLAAGIYKLRFDTGDYFA 71 (112)
T ss_pred CCCCEEEEEEecCCCeEEEEEEEECCCCCCcCcccCcccCCCeeEEEEEEhhhhhh
Confidence 567777764 322 258899999986 45668899999999888863
No 34
>cd03464 3,4-PCD_beta Protocatechuate 3,4-dioxygenase (3,4-PCD), beta subunit. 3,4-PCD catalyzes the oxidative ring cleavage of 3,4-dihydroxybenzoate to produce beta-carboxy-cis,cis-muconate. 3,4-PCDs are large aggregates of 12 protomers, each composed of an alpha- and beta-subunit and an Fe3+ ion bound in the beta-subunit at the alpha-subunit-beta-subunit interface. 3,4-PCD is a member of the aromatic dioxygenases which are non-heme iron intradiol-cleaving enzymes that break the C1-C2 bond and utilize Fe3+.
Probab=78.41 E-value=4.5 Score=37.24 Aligned_cols=62 Identities=21% Similarity=0.208 Sum_probs=36.6
Q ss_pred CCCceEEEEEEECCCCCCCCCCCCCCCceeEEEE-----------EcC---ceeEEEecCCccEEEccCCCeeEEE
Q 023727 101 SGDGFSISGRVKLPGMSLKAFGSPGGKASNVKVV-----------LNG---GEHVTFLRPDGYFSFQNMSAGTHLI 162 (278)
Q Consensus 101 ~~~~~tIsGrV~~p~~~p~~~~lp~~~~s~t~V~-----------L~g---~~~~a~~~~dG~F~f~nVP~GsY~L 162 (278)
..+...|+|+|...+.+|.+..+-.-|.++..=. .+. +.-...+|++|.|.|.-|.||.|-+
T Consensus 62 ~G~~i~l~G~V~D~~G~PV~~A~VEIWQad~~G~Y~~~~d~~~~~~~~~f~grGr~~TD~~G~y~F~TI~Pg~Yp~ 137 (220)
T cd03464 62 IGERIIVHGRVLDEDGRPVPNTLVEIWQANAAGRYRHKRDQHDAPLDPNFGGAGRTLTDDDGYYRFRTIKPGAYPW 137 (220)
T ss_pred CCCEEEEEEEEECCCCCCCCCCEEEEEecCCCCcccCccCCcccccCCCCCCEEEEEECCCccEEEEEECCCCccC
Confidence 4567999999997666652111111233222100 000 1114578999999999999999854
No 35
>TIGR02422 protocat_beta protocatechuate 3,4-dioxygenase, beta subunit. This model represents the beta chain of protocatechuate 3,4-dioxygenase. The most closely related family outside this family is that of the alpha chain (TIGR02423), typically encoded in an adjacent locus. This enzyme acts in the degradation of aromatic compounds by way of p-hydroxybenzoate to succinate and acetyl-CoA.
Probab=77.82 E-value=3.9 Score=37.59 Aligned_cols=62 Identities=19% Similarity=0.177 Sum_probs=36.5
Q ss_pred CCCceEEEEEEECCCCCCCCCCCCCCCceeEEEEEcC--------------ceeEEEecCCccEEEccCCCeeEEE
Q 023727 101 SGDGFSISGRVKLPGMSLKAFGSPGGKASNVKVVLNG--------------GEHVTFLRPDGYFSFQNMSAGTHLI 162 (278)
Q Consensus 101 ~~~~~tIsGrV~~p~~~p~~~~lp~~~~s~t~V~L~g--------------~~~~a~~~~dG~F~f~nVP~GsY~L 162 (278)
..+...|+|+|...+.+|.+..+-.-|.++..=.-++ +.-...+|+||.|.|.-|.||.|-.
T Consensus 57 ~G~~i~l~G~V~D~~g~PV~~A~VEIWQada~G~Y~~~~d~~~~~~~~~f~grGr~~TD~~G~y~F~TI~PG~Y~~ 132 (220)
T TIGR02422 57 IGERIIVHGRVLDEDGRPVPNTLVEVWQANAAGRYRHKNDQYLAPLDPNFGGVGRTLTDSDGYYRFRTIKPGPYPW 132 (220)
T ss_pred CCCEEEEEEEEECCCCCCCCCCEEEEEecCCCCcccCccCccccccCCCCCCEEEEEECCCccEEEEEECCCCccC
Confidence 3567999999997766652111111233222100000 1114668999999999999998843
No 36
>TIGR02438 catachol_actin catechol 1,2-dioxygenase, Actinobacterial. Members of this family are catechol 1,2-dioxygenases of the Actinobacteria. They are more closely related to actinobacterial chlorocatechol 1,2-dioxygenases than to proteobacterial catechol 1,2-dioxygenases, and so are built in this separate model. The member from Rhodococcus rhodochrous NCIMB 13259 (GB|AAC33003.1) is described as a homodimer with bound Fe, similarly active on catechol, 3-methylcatechol and 4-methylcatechol.
Probab=77.42 E-value=5.5 Score=38.02 Aligned_cols=51 Identities=22% Similarity=0.234 Sum_probs=36.4
Q ss_pred CCceEEEEEEECCCCCCCCCCCCCCCceeEEEEE---c-Ccee--------------EEEecCCccEEEccCCCeeEE
Q 023727 102 GDGFSISGRVKLPGMSLKAFGSPGGKASNVKVVL---N-GGEH--------------VTFLRPDGYFSFQNMSAGTHL 161 (278)
Q Consensus 102 ~~~~tIsGrV~~p~~~p~~~~lp~~~~s~t~V~L---~-g~~~--------------~a~~~~dG~F~f~nVP~GsY~ 161 (278)
+....|+|+|...+++| ++.+.|-+ | .|.| ...+|+||.|.|.-|.||.|-
T Consensus 130 G~pl~v~G~V~D~~G~P---------v~gA~VdiWqada~G~Ys~~~~~~~~~~lRGr~~TDadG~y~F~TI~Pg~Yp 198 (281)
T TIGR02438 130 GTPLVFSGQVTDLDGNG---------LAGAKVELWHADDDGFYSQFAPGIPEWNLRGTIIADDEGRFEITTMQPAPYQ 198 (281)
T ss_pred CCEEEEEEEEEcCCCCC---------cCCCEEEEEecCCCCCcCCCCCCCCCCCCeEEEEeCCCCCEEEEEECCCCcC
Confidence 45689999999666655 34455555 1 1111 467899999999999999884
No 37
>cd03458 Catechol_intradiol_dioxygenases Catechol intradiol dioxygenases can be divided into several subgroups according to their substrate specificity for catechol, chlorocatechols and hydroxyquinols. Almost all members of this family are homodimers containing one ferric ion (Fe3+) per monomer. They belong to the intradiol dioxygenase family, a family of mononuclear non-heme iron intradiol-cleaving enzymes that catalyze the oxygenation of catecholates to aliphatic acids via the cleavage of aromatic rings.
Probab=77.36 E-value=13 Score=35.03 Aligned_cols=53 Identities=15% Similarity=0.132 Sum_probs=37.8
Q ss_pred CCCCceEEEEEEECCCCCCCCCCCCCCCceeEEEEEc-----C-------------ceeEEEecCCccEEEccCCCeeEE
Q 023727 100 GSGDGFSISGRVKLPGMSLKAFGSPGGKASNVKVVLN-----G-------------GEHVTFLRPDGYFSFQNMSAGTHL 161 (278)
Q Consensus 100 ~~~~~~tIsGrV~~p~~~p~~~~lp~~~~s~t~V~L~-----g-------------~~~~a~~~~dG~F~f~nVP~GsY~ 161 (278)
..++...|+|+|...+++| ++.+.|-+= | +.-...+|++|.|.|.-|.||.|-
T Consensus 100 ~~G~~l~l~G~V~D~~G~P---------v~~A~VeiWqad~~G~Y~~~~~~~~~~~lRG~~~Td~~G~y~f~Ti~P~~Yp 170 (256)
T cd03458 100 ADGEPLFVHGTVTDTDGKP---------LAGATVDVWHADPDGFYSQQDPDQPEFNLRGKFRTDEDGRYRFRTIRPVPYP 170 (256)
T ss_pred CCCcEEEEEEEEEcCCCCC---------CCCcEEEEEccCCCCCcCCCCCCCCCCCCEEEEEeCCCCCEEEEEECCCCcc
Confidence 3467799999999777765 344444441 1 111577899999999999999883
No 38
>PF07550 DUF1533: Protein of unknown function (DUF1533); InterPro: IPR011432 This domain is found duplicated in proteins of unknown function. The proteins typically also contain leucine-rich repeats.
Probab=76.72 E-value=4.3 Score=29.98 Aligned_cols=47 Identities=23% Similarity=0.306 Sum_probs=29.1
Q ss_pred CCcee-EEEEEcCceeEEEecCCccEEE--cc--------C-CCeeEEEEEEecCccee
Q 023727 126 GKASN-VKVVLNGGEHVTFLRPDGYFSF--QN--------M-SAGTHLIEVAAIGYFFS 172 (278)
Q Consensus 126 ~~~s~-t~V~L~g~~~~a~~~~dG~F~f--~n--------V-P~GsY~LeVss~gy~F~ 172 (278)
+|... .+|++||..|..-.+.++.|.+ .+ . ..|.|.+.|.+-||.-.
T Consensus 3 ~~~~~I~~V~VNg~~y~~~~~~~~~y~~~~~~~l~i~~~~f~~~G~~~I~I~A~GY~d~ 61 (65)
T PF07550_consen 3 DWLKAITSVTVNGKEYNKSLKGNDKYSISSKGSLKIKASAFNKDGENTIVIKATGYKDK 61 (65)
T ss_pred hHHhhCCEEEECCEEeeccccccccEEeccCCcEEEcHHHcCcCCceEEEEEeCCccce
Confidence 34444 3488888888333344445555 11 1 45899999999988543
No 39
>cd03462 1,2-CCD chlorocatechol 1,2-dioxygenases (1,2-CCDs) (type II enzymes) are homodimeric intradiol dioxygenases that degrade chlorocatechols via the addition of molecular oxygen and the subsequent cleavage between two adjacent hydroxyl groups. This reaction is part of the modified ortho-cleavage pathway which is a central oxidative bacterial pathway that channels chlorocatechols, derived from the degradation of chlorinated benzoic acids, phenoxyacetic acids, phenols, benzenes, and other aromatics into the energy-generating tricarboxylic acid pathway.
Probab=76.32 E-value=6.4 Score=36.89 Aligned_cols=53 Identities=11% Similarity=0.043 Sum_probs=37.1
Q ss_pred CCCCceEEEEEEECCCCCCCCCCCCCCCceeEEEEEc----Cc--------------eeEEEecCCccEEEccCCCeeEE
Q 023727 100 GSGDGFSISGRVKLPGMSLKAFGSPGGKASNVKVVLN----GG--------------EHVTFLRPDGYFSFQNMSAGTHL 161 (278)
Q Consensus 100 ~~~~~~tIsGrV~~p~~~p~~~~lp~~~~s~t~V~L~----g~--------------~~~a~~~~dG~F~f~nVP~GsY~ 161 (278)
..++...++|+|...+++| ++.+.|-+= .| .-...+|.+|.|.|.-|.||.|-
T Consensus 95 ~~G~~l~l~G~V~D~~G~P---------v~~A~VeiWqad~~G~Y~~~~~~~~~~~~RG~~~Td~~G~y~F~Ti~P~~Yp 165 (247)
T cd03462 95 DDHKPLLFRGTVKDLAGAP---------VAGAVIDVWHSTPDGKYSGFHPNIPEDYYRGKIRTDEDGRYEVRTTVPVPYQ 165 (247)
T ss_pred CCCCEEEEEEEEEcCCCCC---------cCCcEEEEECCCCCCCcCCCCCCCCCCCCEEEEEeCCCCCEEEEEECCCCcC
Confidence 3456789999999776665 344444441 01 11457899999999999999883
No 40
>TIGR02423 protocat_alph protocatechuate 3,4-dioxygenase, alpha subunit. This model represents the alpha chain of protocatechuate 3,4-dioxygenase. The most closely related family outside this family is that of the beta chain (TIGR02422), typically encoded in an adjacent locus. This enzyme acts in the degradation of aromatic compounds by way of p-hydroxybenzoate to succinate and acetyl-CoA.
Probab=75.71 E-value=6.8 Score=35.27 Aligned_cols=63 Identities=17% Similarity=0.095 Sum_probs=36.9
Q ss_pred CCCCceEEEEEEECCCCCCCCCCCCCCCceeEEEE-----------Ec-C--ceeEEEecCCccEEEccCCCeeEEE
Q 023727 100 GSGDGFSISGRVKLPGMSLKAFGSPGGKASNVKVV-----------LN-G--GEHVTFLRPDGYFSFQNMSAGTHLI 162 (278)
Q Consensus 100 ~~~~~~tIsGrV~~p~~~p~~~~lp~~~~s~t~V~-----------L~-g--~~~~a~~~~dG~F~f~nVP~GsY~L 162 (278)
...+...++|+|...+.+|.+-.+-.-|+++..=. .+ + +.-...+|++|.|.|.-|.||.|-.
T Consensus 35 ~~G~~l~l~G~V~D~~g~Pv~~A~VeiWqad~~G~Y~~~~~~~~~~~~~~f~grGr~~Td~~G~y~f~TI~Pg~Yp~ 111 (193)
T TIGR02423 35 ADGERIRLEGRVLDGDGHPVPDALIEIWQADAAGRYNSPADLRAPATDPGFRGWGRTGTDESGEFTFETVKPGAVPD 111 (193)
T ss_pred CCCCEEEEEEEEECCCCCCCCCCEEEEEccCCCCccCCccCCcccccCCCCCCeEEEEECCCCCEEEEEEcCCCcCC
Confidence 44667999999997666552111111233221100 00 0 0113568999999999999998864
No 41
>cd03460 1,2-CTD Catechol 1,2 dioxygenase (1,2-CTD) catalyzes an intradiol cleavage reaction of catechol to form cis,cis-muconate. 1,2-CTDs are homodimers with one catalytic non-heme ferric ion per monomer. They belong to the aromatic dioxygenase family, a family of mononuclear non-heme iron intradiol-cleaving enzymes that catalyze the oxygenation of catecholates to aliphatic acids via the cleavage of aromatic rings.
Probab=75.65 E-value=6.6 Score=37.50 Aligned_cols=52 Identities=17% Similarity=0.229 Sum_probs=37.5
Q ss_pred CCCceEEEEEEECCCCCCCCCCCCCCCceeEEEEEc-----C-------------ceeEEEecCCccEEEccCCCeeEE
Q 023727 101 SGDGFSISGRVKLPGMSLKAFGSPGGKASNVKVVLN-----G-------------GEHVTFLRPDGYFSFQNMSAGTHL 161 (278)
Q Consensus 101 ~~~~~tIsGrV~~p~~~p~~~~lp~~~~s~t~V~L~-----g-------------~~~~a~~~~dG~F~f~nVP~GsY~ 161 (278)
.++...++|+|...+++| ++++.|-+= | +.-...+|.+|.|.|.-|.||.|-
T Consensus 121 ~Gepl~l~G~V~D~~G~P---------I~~A~VeiWqad~~G~Ys~~~~~~~~f~~RGr~~TD~~G~y~F~TI~P~~Yp 190 (282)
T cd03460 121 DGETLVMHGTVTDTDGKP---------VPGAKVEVWHANSKGFYSHFDPTQSPFNLRRSIITDADGRYRFRSIMPSGYG 190 (282)
T ss_pred CCCEEEEEEEEECCCCCC---------cCCcEEEEECCCCCCCcCCCCCCCCCCCCceEEEeCCCCCEEEEEECCCCCc
Confidence 456789999999777765 344444441 0 111577899999999999999884
No 42
>PF14054 DUF4249: Domain of unknown function (DUF4249)
Probab=73.66 E-value=72 Score=29.04 Aligned_cols=96 Identities=21% Similarity=0.253 Sum_probs=50.6
Q ss_pred HHHHHHhhhcceeecc---CCCCceEEEEEEECCCCCC-------CCCC--CCCCCceeEEEEE--cCcee-EEEecCC-
Q 023727 84 INLFLSLVSSAVAVSS---GSGDGFSISGRVKLPGMSL-------KAFG--SPGGKASNVKVVL--NGGEH-VTFLRPD- 147 (278)
Q Consensus 84 ~~~~~s~~~~~~a~s~---~~~~~~tIsGrV~~p~~~p-------~~~~--lp~~~~s~t~V~L--~g~~~-~a~~~~d- 147 (278)
+++++.+.+|.-.... .......|+|.+..++... .++. .+.....+++|.| ++... ..+...+
T Consensus 4 ~ll~l~l~sC~~~i~~~~~~~~~~lVV~~~i~~~~~~~~V~Ls~s~~~~~~~~~~~v~~A~V~i~~~~~~~~~~~~~~~~ 83 (298)
T PF14054_consen 4 LLLLLLLSSCEKEIDIDDLDEEPKLVVEGYITNPGDPQTVRLSRSVPYFDNSPPEPVSGATVTIYEDGQGNEYLFEESSN 83 (298)
T ss_pred HHHHHHHhccCcccccCcCCCCCeEEEEEEEecCCCcEEEEEEEeecccCCCCCcccCCcEEEEEeCCCcceEeecccCC
Confidence 3444445555333332 2237799999998444322 1211 0112267788888 33322 2333333
Q ss_pred --ccEE-Ecc--CCCe-eEEEEEEecCcceeeEEEEEE
Q 023727 148 --GYFS-FQN--MSAG-THLIEVAAIGYFFSPVRVDVS 179 (278)
Q Consensus 148 --G~F~-f~n--VP~G-sY~LeVss~gy~F~p~RVdV~ 179 (278)
|.|. ... +.+| +|.|+|...|...-.-...|-
T Consensus 84 ~~g~Y~~~~~~~~~~G~~Y~L~V~~~~~~~~sa~~~vp 121 (298)
T PF14054_consen 84 NDGVYYSSNSFRGRPGRTYRLEVETPGGKTYSAETTVP 121 (298)
T ss_pred CcceEEecccccccCCCEEEEEEEECCCCEEEEEEEEC
Confidence 6666 333 2344 999999997665555444443
No 43
>PF00775 Dioxygenase_C: Dioxygenase; InterPro: IPR000627 This entry represents the C-terminal domain common to several intradiol ring-cleavage dioxygenases. Dioxygenases catalyse the incorporation of both atoms of molecular oxygen into substrates using a variety of reaction mechanisms. Cleavage of aromatic rings is one of the most important functions of dioxygenases, which play key roles in the degradation of aromatic compounds. The substrates of ring-cleavage dioxygenases can be classified into two groups according to the mode of scission of the aromatic ring. Intradiol enzymes use a non-haem Fe(III) to cleave the aromatic ring between two hydroxyl groups (ortho-cleavage), whereas extradiol enzymes (IPR000486 from INTERPRO) use a non-haem Fe(II) to cleave the aromatic ring between a hydroxylated carbon and an adjacent non-hydroxylated carbon (meta-cleavage) []. These two subfamilies differ in sequence, structural fold, iron ligands, and the orientation of second sphere active site amino acid residues. Enzymes that belong to the intradiol family include catechol 1,2-dioxygenase (1,2-CTD) (1.13.11.1 from EC); protocatechuate 3,4-dioxygenase (3,4-PCD) (1.13.11.3 from EC); and chlorocatechol 1,2-dioxygenase (1.13.11.1 from EC) [].; GO: 0003824 catalytic activity, 0008199 ferric iron binding, 0006725 cellular aromatic compound metabolic process, 0055114 oxidation-reduction process; PDB: 2BUV_A 2BUX_A 2BUU_A 2BUR_A 1EO9_A 2BUZ_A 2BV0_A 1EO2_A 1EOC_A 1EOA_A ....
Probab=71.04 E-value=14 Score=32.77 Aligned_cols=53 Identities=21% Similarity=0.279 Sum_probs=34.7
Q ss_pred CCCCceEEEEEEECCCCCCCCCCCCCCCceeEEEEE---c--C---------------ceeEEEecCCccEEEccCCCee
Q 023727 100 GSGDGFSISGRVKLPGMSLKAFGSPGGKASNVKVVL---N--G---------------GEHVTFLRPDGYFSFQNMSAGT 159 (278)
Q Consensus 100 ~~~~~~tIsGrV~~p~~~p~~~~lp~~~~s~t~V~L---~--g---------------~~~~a~~~~dG~F~f~nVP~Gs 159 (278)
...+...|.|+|...+.+| ++++.|-+ | | ..-...+|++|.|.|.-|.||.
T Consensus 25 ~~G~~l~l~G~V~D~~g~P---------v~~A~veiWqada~G~Ys~~~~~~~~~~~~~rG~~~Td~~G~y~f~Ti~Pg~ 95 (183)
T PF00775_consen 25 APGEPLVLHGRVIDTDGKP---------VPGALVEIWQADADGRYSGQDPGSDQPDFNLRGRFRTDADGRYSFRTIKPGP 95 (183)
T ss_dssp SSS-EEEEEEEEEETTSSB----------TTEEEEEEE--TTS--TTTBTTSSSSTTTTEEEEEECTTSEEEEEEE----
T ss_pred CCCCEEEEEEEEECCCCCC---------CCCcEEEEEecCCCCccccccccccccCCCcceEEecCCCCEEEEEeeCCCC
Confidence 4556799999999877655 45666666 1 1 1225779999999999999999
Q ss_pred EE
Q 023727 160 HL 161 (278)
Q Consensus 160 Y~ 161 (278)
|.
T Consensus 96 Y~ 97 (183)
T PF00775_consen 96 YP 97 (183)
T ss_dssp EE
T ss_pred CC
Confidence 95
No 44
>cd03461 1,2-HQD Hydroxyquinol 1,2-dioxygenase (1,2-HQD) catalyzes the ring cleavage of hydroxyquinol (1,2,4-trihydroxybenzene), an intermediate in the degradation of a large variety of aromatic compounds including some polychloro- and nitroaromatic pollutants, to form 3-hydroxy-cis,cis-muconates. 1,2-HQD belongs to the aromatic dioxygenase family, a family of mononuclear non-heme intradiol-cleaving enzymes.
Probab=69.80 E-value=13 Score=35.51 Aligned_cols=54 Identities=19% Similarity=0.186 Sum_probs=37.9
Q ss_pred CCCCceEEEEEEECCCCCCCCCCCCCCCceeEEEEEc-----C-------------ceeEEEecCCccEEEccCCCeeEE
Q 023727 100 GSGDGFSISGRVKLPGMSLKAFGSPGGKASNVKVVLN-----G-------------GEHVTFLRPDGYFSFQNMSAGTHL 161 (278)
Q Consensus 100 ~~~~~~tIsGrV~~p~~~p~~~~lp~~~~s~t~V~L~-----g-------------~~~~a~~~~dG~F~f~nVP~GsY~ 161 (278)
..++...|+|+|...+++| ++.+.|-+= | +.-.-.+|++|.|.|.-|.||.|-
T Consensus 116 ~~G~~l~v~G~V~D~~G~P---------v~gA~VeiWqad~~G~Y~~~~~~~~~~~lRGr~~Td~~G~y~F~Ti~Pg~Yp 186 (277)
T cd03461 116 ADGEPCFVHGRVTDTDGKP---------LPGATVDVWQADPNGLYDVQDPDQPEFNLRGKFRTDEDGRYAFRTLRPTPYP 186 (277)
T ss_pred CCCCEEEEEEEEEcCCCCC---------cCCcEEEEECcCCCCCcCCCCCCCCCCCCeEEEEeCCCCCEEEEEECCCCcC
Confidence 3456799999999777765 334444441 1 111467899999999999999885
Q ss_pred E
Q 023727 162 I 162 (278)
Q Consensus 162 L 162 (278)
+
T Consensus 187 i 187 (277)
T cd03461 187 I 187 (277)
T ss_pred C
Confidence 3
No 45
>TIGR02439 catechol_proteo catechol 1,2-dioxygenase, proteobacterial. Members of this family known so far are catechol 1,2-dioxygenases of the Proteobacteria. They are distinct from catechol 1,2-dioxygenases and chlorocatechol 1,2-dioxygenases of the Actinobacteria, which are quite similar to each other and resolved by separate models. This enzyme catalyzes intradiol cleavage in which catechol + O2 becomes cis,cis-muconate. Catechol is an intermediate in the catabolism of many different aromatic compounds, as is the alternative intermediate protocatechuate. In Acinetobacter lwoffii, two isozymes are present with abilities, differing somewhat, to act on catechol analogs 3-methylcatechol, 4-methylcatechol, 4-methoxycatechol, and 4-chlorocatechol.
Probab=69.25 E-value=13 Score=35.60 Aligned_cols=51 Identities=16% Similarity=0.182 Sum_probs=36.3
Q ss_pred CCceEEEEEEECCCCCCCCCCCCCCCceeEEEEEc-----C-------------ceeEEEecCCccEEEccCCCeeEE
Q 023727 102 GDGFSISGRVKLPGMSLKAFGSPGGKASNVKVVLN-----G-------------GEHVTFLRPDGYFSFQNMSAGTHL 161 (278)
Q Consensus 102 ~~~~tIsGrV~~p~~~p~~~~lp~~~~s~t~V~L~-----g-------------~~~~a~~~~dG~F~f~nVP~GsY~ 161 (278)
++...|+|+|...+++| ++.+.|-+= | ..-...+|++|.|.|.-|.||.|-
T Consensus 126 G~pl~v~G~V~D~~G~P---------I~gA~VeIWqad~~G~Ys~~~~~~~~~~lRG~~~TD~~G~y~F~TI~P~~Yp 194 (285)
T TIGR02439 126 GETLFLHGQVTDADGKP---------IAGAKVELWHANTKGNYSHFDKSQSEFNLRRTIITDAEGRYRARSIVPSGYG 194 (285)
T ss_pred CcEEEEEEEEECCCCCC---------cCCcEEEEEccCCCCCcCCCCCCCCCCCceEEEEECCCCCEEEEEECCCCCc
Confidence 56689999999877765 334444441 0 112467899999999999999884
No 46
>cd03463 3,4-PCD_alpha Protocatechuate 3,4-dioxygenase (3,4-PCD), alpha subunit. 3,4-PCD catalyzes the oxidative ring cleavage of 3,4-dihydroxybenzoate to produce beta-carboxy-cis,cis-muconate. 3,4-PCDs are large aggregates of 12 protomers, each composed of an alpha- and beta-subunit and an Fe3+ ion bound in the beta-subunit at the alpha-subunit-beta-subunit interface. 3,4-PCD is a member of the aromatic dioxygenases which are non-heme iron intradiol-cleaving enzymes that break the C1-C2 bond and utilize Fe3+.
Probab=68.43 E-value=6.5 Score=35.19 Aligned_cols=51 Identities=20% Similarity=0.273 Sum_probs=36.4
Q ss_pred CCceEEEEEEECCCCCCCCCCCCCCCceeEEEEE---------cC---------cee----EEEecCCccEEEccCCCee
Q 023727 102 GDGFSISGRVKLPGMSLKAFGSPGGKASNVKVVL---------NG---------GEH----VTFLRPDGYFSFQNMSAGT 159 (278)
Q Consensus 102 ~~~~tIsGrV~~p~~~p~~~~lp~~~~s~t~V~L---------~g---------~~~----~a~~~~dG~F~f~nVP~Gs 159 (278)
.....|+|+|...+.+| ++++.|-+ ++ ..+ ...+|+||.|.|.-|.||.
T Consensus 34 G~~l~l~G~V~D~~g~P---------i~gA~VeiWqad~~G~Y~~~~~~~~~~~~~f~~rGr~~TD~~G~y~F~Ti~Pg~ 104 (185)
T cd03463 34 GERITLEGRVYDGDGAP---------VPDAMLEIWQADAAGRYAHPADSRRRLDPGFRGFGRVATDADGRFSFTTVKPGA 104 (185)
T ss_pred CCEEEEEEEEECCCCCC---------CCCCEEEEEcCCCCCccCCcCCcccccCCCCCcEEEEEECCCCCEEEEEEcCCC
Confidence 56799999999776665 33444443 10 111 3679999999999999999
Q ss_pred EE
Q 023727 160 HL 161 (278)
Q Consensus 160 Y~ 161 (278)
|-
T Consensus 105 Y~ 106 (185)
T cd03463 105 VP 106 (185)
T ss_pred cC
Confidence 85
No 47
>PF03785 Peptidase_C25_C: Peptidase family C25, C terminal ig-like domain; InterPro: IPR005536 This domain is found in almost all members of MEROPS peptidase family C25 (clan CD). Peptidase family C25 is a protein family found in the bacterium Porphyromonas gingivalis (Bacteroides gingivalis), a Gram-negative anaerobic bacterial species strongly associated with adult periodontitis. One of its distinguishing characteristics and putative virulence properties is the ability to agglutinate erythrocytes []. It is a highly proteolytic organism which metabolises small peptides and amino acids. Indirect evidence suggests that the proteases produced by this microorganism constitute an important virulence factor []. Protease-encoding genes have been shown to contain multiple copies of repeated nucleotide sequences. These conserved sequences have also been found in haemagglutinin genes [].; GO: 0008233 peptidase activity, 0006508 proteolysis; PDB: 1CVR_A.
Probab=65.21 E-value=37 Score=26.94 Aligned_cols=47 Identities=19% Similarity=0.299 Sum_probs=28.9
Q ss_pred eEEEEEcCceeE-EEecCCccEEEccCC-----CeeEEEEEEecCcceeeEEEEE
Q 023727 130 NVKVVLNGGEHV-TFLRPDGYFSFQNMS-----AGTHLIEVAAIGYFFSPVRVDV 178 (278)
Q Consensus 130 ~t~V~L~g~~~~-a~~~~dG~F~f~nVP-----~GsY~LeVss~gy~F~p~RVdV 178 (278)
.+.+.=||.-++ ++++ +|++.+ |++ +|.|+|.|..-+|..--..|.|
T Consensus 29 ~ValS~dg~l~G~ai~~-sG~ati-~l~~~it~~~~~tlTit~~n~~t~i~~i~V 81 (81)
T PF03785_consen 29 YVALSQDGDLYGKAIVN-SGNATI-NLTNPITDEGTLTLTITAFNYVTYIKTIQV 81 (81)
T ss_dssp EEEEEETTEEEEEEE-B-TTEEEE-E-SS--TT-SEEEEEEE-TTB--EEEEEEE
T ss_pred EEEEecCCEEEEEEEec-CceEEE-ECCcccCCCceEEEEEEEEccEEEEEEeeC
Confidence 344444555554 7777 999999 566 4899999988887766555544
No 48
>PF13754 Big_3_4: Bacterial Ig-like domain (group 3)
Probab=61.75 E-value=18 Score=25.64 Aligned_cols=32 Identities=34% Similarity=0.431 Sum_probs=23.0
Q ss_pred CceeEEEecCCccEEEccCC---CeeEEEEEEecCc
Q 023727 137 GGEHVTFLRPDGYFSFQNMS---AGTHLIEVAAIGY 169 (278)
Q Consensus 137 g~~~~a~~~~dG~F~f~nVP---~GsY~LeVss~gy 169 (278)
|..+.+..+.+|+++| ++| .|.|.+.|...|-
T Consensus 1 G~~~~~t~~~~G~Ws~-t~~~~~dG~y~itv~a~D~ 35 (54)
T PF13754_consen 1 GVTYTTTVDSDGNWSF-TVPALADGTYTITVTATDA 35 (54)
T ss_pred CeEEEEEECCCCcEEE-eCCCCCCccEEEEEEEEeC
Confidence 3456777888998888 354 4999888876653
No 49
>PF13953 PapC_C: PapC C-terminal domain; PDB: 3L48_E 2XET_A 3RFZ_E 2KT6_A.
Probab=61.67 E-value=19 Score=26.55 Aligned_cols=39 Identities=15% Similarity=0.282 Sum_probs=26.9
Q ss_pred EECCCCCCCCCCCCCCCceeEEEEEcCceeEEEecCCccEEEccCCC
Q 023727 111 VKLPGMSLKAFGSPGGKASNVKVVLNGGEHVTFLRPDGYFSFQNMSA 157 (278)
Q Consensus 111 V~~p~~~p~~~~lp~~~~s~t~V~L~g~~~~a~~~~dG~F~f~nVP~ 157 (278)
+..+++++.||+ +.|...++...+++..+|.--+.++++
T Consensus 3 l~~~~G~~lPfG--------A~v~~~~g~~~g~Vg~~G~vyl~~~~~ 41 (68)
T PF13953_consen 3 LRDADGKPLPFG--------ASVSDEDGNNIGIVGQDGQVYLSGLPP 41 (68)
T ss_dssp EEETTSEE--TT---------EEEETTSSEEEEB-GCGEEEEEEE-T
T ss_pred EEcCCCCcCCCC--------cEEEcCCCCEEEEEcCCCEEEEECCCC
Confidence 445666664444 777787888999999999999999874
No 50
>COG1470 Predicted membrane protein [Function unknown]
Probab=61.52 E-value=1e+02 Score=31.81 Aligned_cols=110 Identities=13% Similarity=0.250 Sum_probs=62.3
Q ss_pred CCCceEEEEEEECCCCCCCCCCCCCCCceeEEEEEcCceeE-EEecCCccEEEccCCCeeEEEEEEecCcceeeEEEEEE
Q 023727 101 SGDGFSISGRVKLPGMSLKAFGSPGGKASNVKVVLNGGEHV-TFLRPDGYFSFQNMSAGTHLIEVAAIGYFFSPVRVDVS 179 (278)
Q Consensus 101 ~~~~~tIsGrV~~p~~~p~~~~lp~~~~s~t~V~L~g~~~~-a~~~~dG~F~f~nVP~GsY~LeVss~gy~F~p~RVdV~ 179 (278)
.++.-+|++.|.-.+..| +.++++.+++-+-| .-+|+. +|..++||-+ ..-+++|.|.
T Consensus 395 aGee~~i~i~I~NsGna~---------LtdIkl~v~~PqgWei~Vd~~---~I~sL~pge~---------~tV~ltI~vP 453 (513)
T COG1470 395 AGEEKTIRISIENSGNAP---------LTDIKLTVNGPQGWEIEVDES---TIPSLEPGES---------KTVSLTITVP 453 (513)
T ss_pred CCccceEEEEEEecCCCc---------cceeeEEecCCccceEEECcc---cccccCCCCc---------ceEEEEEEcC
Confidence 345578888886544322 78899999985444 555554 9999999855 2223344443
Q ss_pred cCC-CCceEEEEecc-cCc-c-ccEEEEeccccceeeeccccChhhhccCHHHHHHHHHHHHHHhhhc
Q 023727 180 ARH-PGKVQAALTET-RRG-L-NELVLEQLREEQYYEIREPFSIMSLVKSPMGLMMGFMLVVVFLMPK 243 (278)
Q Consensus 180 ~~~-~G~VrA~~~e~-~~~-l-~PLvv~p~~~~~Yfe~Re~fsi~~lLkNPM~LM~lv~l~l~~~mPK 243 (278)
.+. .|..+++..-. .+. + .-+++. -.+|..+-+.+++ |++++-++++|+|-|
T Consensus 454 ~~a~aGdY~i~i~~ksDq~s~e~tlrV~-------V~~sS~st~iGI~-----Ii~~~v~~L~fviRK 509 (513)
T COG1470 454 EDAGAGDYRITITAKSDQASSEDTLRVV-------VGQSSTSTYIGIA-----IIVLVVLGLIFVIRK 509 (513)
T ss_pred CCCCCCcEEEEEEEeeccccccceEEEE-------EeccccchhhhHH-----HHHHHHHHHHhhhHH
Confidence 321 34444432210 011 1 123331 2456777888877 666777777776654
No 51
>COG4704 Uncharacterized protein conserved in bacteria [Function unknown]
Probab=60.72 E-value=26 Score=30.56 Aligned_cols=20 Identities=15% Similarity=0.092 Sum_probs=15.8
Q ss_pred CCccEEEccCCCeeEEEEEE
Q 023727 146 PDGYFSFQNMSAGTHLIEVA 165 (278)
Q Consensus 146 ~dG~F~f~nVP~GsY~LeVs 165 (278)
.-=+++|.+++||+|-+-+-
T Consensus 75 dpv~~~f~~Lk~G~YAvaa~ 94 (151)
T COG4704 75 DPVSKSFYGLKPGKYAVAAF 94 (151)
T ss_pred CchhheeecCCCccEEEEEE
Confidence 34479999999999976653
No 52
>PF14289 DUF4369: Domain of unknown function (DUF4369)
Probab=57.20 E-value=83 Score=23.82 Aligned_cols=46 Identities=28% Similarity=0.442 Sum_probs=28.3
Q ss_pred CceEEEEEEECCCCCCCCCCCCCCCceeEEEEE---cCce--e-EEEecCCccEEEccCC--CeeEEE
Q 023727 103 DGFSISGRVKLPGMSLKAFGSPGGKASNVKVVL---NGGE--H-VTFLRPDGYFSFQNMS--AGTHLI 162 (278)
Q Consensus 103 ~~~tIsGrV~~p~~~p~~~~lp~~~~s~t~V~L---~g~~--~-~a~~~~dG~F~f~nVP--~GsY~L 162 (278)
..++|+|+|.-... ..+|.| +++. . ++.++ ||.|+|..-- |+.|.|
T Consensus 11 ~~~~I~G~i~~~~~-------------~~~vyL~~~~~~~~~~ds~~v~-nG~F~f~~~~~~p~~~~l 64 (106)
T PF14289_consen 11 KQFTIEGKIKGLPD-------------GDKVYLYYYDNGKVVIDSVVVK-NGKFSFKGPLDEPGFYYL 64 (106)
T ss_pred CcEEEEEEEcCCCC-------------CCEEEEEEeCCCCEEEEEEEEe-CCEEEEEEeCCCCEEEEE
Confidence 78999999965411 122322 2322 1 35555 9999998753 477777
No 53
>PF11589 DUF3244: Domain of unknown function (DUF3244); InterPro: IPR021638 This family of proteins with unknown function appears to be restricted to Bacteroidetes. The protein may have an immunoglobulin-like beta-sandwich fold; however, this cannot be confirmed.; PDB: 3D33_B 3SD2_A.
Probab=53.29 E-value=51 Score=26.24 Aligned_cols=42 Identities=14% Similarity=0.277 Sum_probs=24.7
Q ss_pred ceeEEEEEcC--ce--e-EEEecCCc---cEEEccCCCeeEEEEEEecCc
Q 023727 128 ASNVKVVLNG--GE--H-VTFLRPDG---YFSFQNMSAGTHLIEVAAIGY 169 (278)
Q Consensus 128 ~s~t~V~L~g--~~--~-~a~~~~dG---~F~f~nVP~GsY~LeVss~gy 169 (278)
...++|++.+ |. | ..+....+ +|.+.+.+.|.|+|++....-
T Consensus 47 ~~~vtI~I~d~~G~vVy~~~~~~~~~~~~~I~L~~~~~G~Y~l~i~~~~g 96 (106)
T PF11589_consen 47 IGDVTITIKDSTGNVVYSETVSNSAGQSITIDLNGLPSGEYTLEITNGNG 96 (106)
T ss_dssp -SEEEEEEEETT--EEEEEEESCGGTTEEEEE-TTS-SEEEEEEEEECTC
T ss_pred CCCEEEEEEeCCCCEEEEEEccCCCCcEEEEEeCCCCCccEEEEEEeCCC
Confidence 3556666654 21 2 24444444 788888999999999887743
No 54
>PF00017 SH2: SH2 domain; InterPro: IPR000980 The Src homology 2 (SH2) domain is a protein domain of about 100 amino-acid residues first identified as a conserved sequence region between the oncoproteins Src and Fps []. Similar sequences were later found in many other intracellular signal-transducing proteins []. SH2 domains function as regulatory modules of intracellular signalling cascades by binding with high affinity to phosphotyrosine-containing target peptides in a sequence-specific and strictly phosphorylation-dependent manner [, , , ]. SH2 domains recognise 3-6 residues C-terminal to the phosphorylated tyrosine in a fashion that differs from one SH2 domain to another. They are found in a wide variety of protein contexts, e.g. in association with catalytic domains of phospholipase Cγ (PLCγ) and the non-receptor protein tyrosine kinases; within structural proteins such as fodrin and tensin; and in a group of small adaptor molecules, i.e. Crk and Nck. The domains are frequently found as repeats in a single protein sequence and will then often bind both mono- and di-phosphorylated substrates. The structure of the SH2 domain belongs to the alpha+beta class, its overall shape forming a compact flattened hemisphere. The core structural elements comprise a central hydrophobic anti-parallel beta-sheet, flanked by 2 short alpha-helices. The loop between strands 2 and 3 provides many of the binding interactions with the phosphate group of its phosphopeptide ligand, and is hence designated the phosphate binding loop. The phosphorylated ligand binds perpendicular to the beta-sheet and typically interacts with the phosphate binding loop and a hydrophobic binding pocket that interacts with a pY+3 side chain.
The N- and C-termini of the domain are close together in space and on the opposite face from the phosphopeptide binding surface and it has been speculated that this has facilitated their integration into surface-exposed regions of host proteins [].; GO: 0005515 protein binding; PDB: 1M27_A 1KA6_A 1D4W_B 1D4T_A 1D1Z_B 1KA7_A 1UUR_A 1UUS_A 1BLJ_A 1BLK_A ....
Probab=50.91 E-value=38 Score=24.67 Aligned_cols=35 Identities=26% Similarity=0.424 Sum_probs=26.4
Q ss_pred cCCccEEEccCC--CeeEEEEEEecCcceeeEEEEEEc
Q 023727 145 RPDGYFSFQNMS--AGTHLIEVAAIGYFFSPVRVDVSA 180 (278)
Q Consensus 145 ~~dG~F~f~nVP--~GsY~LeVss~gy~F~p~RVdV~~ 180 (278)
.++|+|.++.=. +|.|+|.|...+ ....++|.-..
T Consensus 19 ~~~G~FLvR~s~~~~~~~~Lsv~~~~-~v~h~~I~~~~ 55 (77)
T PF00017_consen 19 KPDGTFLVRPSSSKPGKYVLSVRFDG-KVKHFRINRTE 55 (77)
T ss_dssp SSTTEEEEEEESSSTTSEEEEEEETT-EEEEEEEEEET
T ss_pred CCCCeEEEEecccccccccccccccc-ccEEEEEEecC
Confidence 569999998776 689999999888 55555555554
No 55
>PRK09619 flgD flagellar basal body rod modification protein; Reviewed
Probab=50.22 E-value=67 Score=29.42 Aligned_cols=41 Identities=17% Similarity=0.380 Sum_probs=24.6
Q ss_pred ceeEEEEEcC--ceeEEEe---cCCc--cEEEcc----CCCeeEEEEEEecC
Q 023727 128 ASNVKVVLNG--GEHVTFL---RPDG--YFSFQN----MSAGTHLIEVAAIG 168 (278)
Q Consensus 128 ~s~t~V~L~g--~~~~a~~---~~dG--~F~f~n----VP~GsY~LeVss~g 168 (278)
...++|.+.+ |+..++. ...| .|.+.. +|+|.|.++|...+
T Consensus 122 a~~v~v~I~D~~G~v~t~~l~~~~aG~~~f~WDG~~~~lp~G~Y~~~V~a~~ 173 (218)
T PRK09619 122 APTLTLHITDILGQEKKIDLGKQPAGPVNFTLDPAALGLQPGQYQLSVVSGS 173 (218)
T ss_pred CcEEEEEEEeCCCCEEEEecCCcCCCceeEEECCCCCCCCCceeEEEEEEeC
Confidence 4567777753 3333331 2334 455555 89999999997544
No 56
>PF09912 DUF2141: Uncharacterized protein conserved in bacteria (DUF2141); InterPro: IPR018673 This family of conserved hypothetical proteins has no known function.
Probab=49.84 E-value=22 Score=28.93 Aligned_cols=21 Identities=24% Similarity=0.320 Sum_probs=17.5
Q ss_pred CCccEEEccCCCeeEEEEEEe
Q 023727 146 PDGYFSFQNMSAGTHLIEVAA 166 (278)
Q Consensus 146 ~dG~F~f~nVP~GsY~LeVss 166 (278)
.+-+++|.++|+|+|-+.|.+
T Consensus 41 ~~~~~~f~~lp~G~YAi~v~h 61 (112)
T PF09912_consen 41 GTVTITFEDLPPGTYAIAVFH 61 (112)
T ss_pred CcEEEEECCCCCccEEEEEEE
Confidence 455899999999999887765
No 57
>PF03404 Mo-co_dimer: Mo-co oxidoreductase dimerisation domain; InterPro: IPR005066 The majority of molybdenum-containing enzymes utilise a molybdenum cofactor (MoCF or Moco) consisting of a Mo atom coordinated via a cis-dithiolene moiety to molybdopterin (MPT). MoCF is ubiquitous in nature, and the pathway for MoCF biosynthesis is conserved in all three domains of life. MoCF-containing enzymes function as oxidoreductases in carbon, nitrogen, and sulphur metabolism [, ]. In Escherichia coli, biosynthesis of MoCF is a three-stage process. It begins with the MoaA and MoaC conversion of GTP to the meta-stable pterin intermediate precursor Z. The second stage involves MPT synthase (MoaD and MoaE), which converts precursor Z to MPT; MoeB is involved in the recycling of MPT synthase. The final step in MoCF synthesis is the attachment of mononuclear Mo to MPT, a process that requires MoeA and which is enhanced by MogA in an Mg2+ ATP-dependent manner []. MoCF is the active co-factor in eukaryotic and some prokaryotic molybdo-enzymes, but the majority of bacterial enzymes requiring MoCF need a modification of MPT for it to be active; MobA is involved in the attachment of a nucleotide monophosphate to MPT resulting in the MGD co-factor, the active co-factor for most prokaryotic molybdo-enzymes. Bacterial two-hybrid studies have revealed the close interactions between MoeA, MogA, and MobA in the synthesis of MoCF []. Moreover, the close functional association of MoeA and MogA in the synthesis of MoCF is supported by the fact that the known eukaryotic homologues to MoeA and MogA exist as fusion proteins: CNX1 (Q39054 from SWISSPROT) of Arabidopsis thaliana (Mouse-ear cress), mammalian Gephyrin (e.g. Q9NQX3 from SWISSPROT) and Drosophila melanogaster (Fruit fly) Cinnamon (P39205 from SWISSPROT) [].
This domain is found in molybdopterin cofactor oxidoreductases, such as in the C-terminal of Mo-containing sulphite oxidase, which catalyses the conversion of sulphite to sulphate, the terminal step in the oxidative degradation of cysteine and methionine []. This domain is involved in dimer formation, and has an Ig-fold structure [].; GO: 0016491 oxidoreductase activity, 0030151 molybdenum ion binding, 0055114 oxidation-reduction process; PDB: 2C9X_A 2CA3_A 2BLF_A 2CA4_A 2BPB_A 2XTS_C 2BII_A 2BIH_A 1OGP_A 2A9A_B ....
Probab=49.39 E-value=65 Score=27.03 Aligned_cols=30 Identities=23% Similarity=0.267 Sum_probs=21.2
Q ss_pred CceEEEEEEECCCCCCCCCCCCCCCceeEEEEEcCceeE
Q 023727 103 DGFSISGRVKLPGMSLKAFGSPGGKASNVKVVLNGGEHV 141 (278)
Q Consensus 103 ~~~tIsGrV~~p~~~p~~~~lp~~~~s~t~V~L~g~~~~ 141 (278)
+.++|+|....-++. .+..+.|++|||..+
T Consensus 27 ~~v~i~G~A~~g~g~---------~I~rVEVS~DgG~tW 56 (131)
T PF03404_consen 27 GTVTIRGYAWSGGGR---------GIARVEVSTDGGKTW 56 (131)
T ss_dssp EEEEEEEEEE-STT-----------EEEEEEESSTTSSE
T ss_pred cEEEEEEEEEeCCCc---------ceEEEEEEeCCCCCc
Confidence 478999998654442 268899999998644
No 58
>PF07172 GRP: Glycine rich protein family; InterPro: IPR010800 This family consists of glycine rich proteins. Some of them may be involved in resistance to environmental stress [].
Probab=48.19 E-value=12 Score=30.18 Aligned_cols=12 Identities=25% Similarity=0.329 Sum_probs=7.4
Q ss_pred cccchhHHHHHH
Q 023727 73 IRSKSVLSVFFI 84 (278)
Q Consensus 73 ~~~~~~~~~~~~ 84 (278)
|+||.+++|.++
T Consensus 1 MaSK~~llL~l~ 12 (95)
T PF07172_consen 1 MASKAFLLLGLL 12 (95)
T ss_pred CchhHHHHHHHH
Confidence 678876655443
No 59
>PF01190 Pollen_Ole_e_I: Pollen proteins Ole e I like; InterPro: IPR006041 Allergies are hypersensitivity reactions of the immune system to specific substances called allergens (such as pollen, stings, drugs, or food) that, in most people, result in no symptoms. A nomenclature system has been established for antigens (allergens) that cause IgE-mediated atopic allergies in humans [WHO/IUIS Allergen Nomenclature Subcommittee King T.P., Hoffmann D., Loewenstein H., Marsh D.G., Platts-Mills T.A.E., Thomas W. Bull. World Health Organ. 72:797-806(1994)]. This nomenclature system is defined by a designation that is composed of the first three letters of the genus; a space; the first letter of the species name; a space and an arabic number. In the event that two species names have identical designations, they are discriminated from one another by adding one or more letters (as necessary) to each species designation. The allergens in this family include allergens with the following designations: Ole e 1. A number of plant pollen proteins, whose biological function is not yet known, are structurally related []. These proteins are most probably secreted and consist of about 145 residues. There are six cysteines which are conserved in the sequence of these proteins. They seem to be involved in disulphide bonds.
Probab=47.69 E-value=51 Score=25.65 Aligned_cols=26 Identities=27% Similarity=0.461 Sum_probs=19.4
Q ss_pred ceeEEEEEc--C--c----eeEEEecCCccEEEc
Q 023727 128 ASNVKVVLN--G--G----EHVTFLRPDGYFSFQ 153 (278)
Q Consensus 128 ~s~t~V~L~--g--~----~~~a~~~~dG~F~f~ 153 (278)
++.++|.|. + + ...+.+|++|.|.|.
T Consensus 21 l~GA~V~v~C~~~~~~~~~~~~~~Td~~G~F~i~ 54 (97)
T PF01190_consen 21 LPGAKVSVECKDGNGGVVFSAEAKTDENGYFSIE 54 (97)
T ss_pred CCCCEEEEECCCCCCCcEEEEEEEeCCCCEEEEE
Confidence 677888885 2 1 346899999999984
No 60
>TIGR01710 typeII_sec_gspG general secretion pathway protein G. This model represents GspG, protein G of the main terminal branch of the general secretion pathway, also called type II secretion. It transports folded proteins across the bacterial outer membrane and is widely distributed in Gram-negative pathogens.
Probab=47.33 E-value=22 Score=29.63 Aligned_cols=32 Identities=19% Similarity=0.457 Sum_probs=22.2
Q ss_pred ccChhhhccCHHHHHHHHHHHHHHhhhccccCCCHH
Q 023727 216 PFSIMSLVKSPMGLMMGFMLVVVFLMPKLMENMDPE 251 (278)
Q Consensus 216 ~fsi~~lLkNPM~LM~lv~l~l~~~mPKLme~mDPE 251 (278)
+|++..|| +.++++.+++.+++|.+...++-.
T Consensus 2 GFTLiEll----ivlaIigil~~i~~p~~~~~~~~a 33 (134)
T TIGR01710 2 GFTLLEIM----VVLVILGLLAALVAPKLFSQADKA 33 (134)
T ss_pred ceeHHHHH----HHHHHHHHHHHHHHHHHHHHHHHH
Confidence 56666654 566667777777899988776643
No 61
>PF13954 PapC_N: PapC N-terminal domain; PDB: 2VQI_B 3FIP_A 3RFZ_E 3OHN_A 1ZDV_A 1ZE3_D 3BWU_D 1ZDX_A.
Probab=44.43 E-value=67 Score=26.94 Aligned_cols=38 Identities=24% Similarity=0.399 Sum_probs=29.4
Q ss_pred cCCCeeEEEEEEecCcceeeEEEEEEcCCCCceEEEEe
Q 023727 154 NMSAGTHLIEVAAIGYFFSPVRVDVSARHPGKVQAALT 191 (278)
Q Consensus 154 nVP~GsY~LeVss~gy~F~p~RVdV~~~~~G~VrA~~~ 191 (278)
.++||.|.++|.-.+-......|++....++++.+..+
T Consensus 26 ~~~pG~Y~vdv~vN~~~~~~~~i~f~~~~~~~~~pClt 63 (146)
T PF13954_consen 26 AIPPGEYSVDVYVNGKFIGRYDIEFINNDDGKLQPCLT 63 (146)
T ss_dssp SS-SEEEEEEEEETTEEEEEEEEEEEESSSTSEEEE-B
T ss_pred CCCCeEEEEEEEECCeeeeeEEEEEEeCCCcceeEEeC
Confidence 68999999999999999998888888765455666553
No 62
>PF15178 TOM_sub5: Mitochondrial import receptor subunit TOM5 homolog
Probab=43.65 E-value=18 Score=26.04 Aligned_cols=11 Identities=64% Similarity=1.072 Sum_probs=7.6
Q ss_pred cCCCHHHHHHH
Q 023727 246 ENMDPEEMRRA 256 (278)
Q Consensus 246 e~mDPEe~ke~ 256 (278)
.+|||||||+-
T Consensus 9 pk~DPeE~k~k 19 (51)
T PF15178_consen 9 PKMDPEEMKRK 19 (51)
T ss_pred CCCCHHHHHHH
Confidence 34699987653
No 63
>COG4537 ComGC Competence protein ComGC [Intracellular trafficking and secretion]
Probab=43.57 E-value=32 Score=28.50 Aligned_cols=62 Identities=13% Similarity=0.404 Sum_probs=40.5
Q ss_pred ceeeeccccChhhhccCHHHHHHHHHHHHHHhhhccccCCCH------H-HHHHHHHHHH------hCCCCchhhhCCC
Q 023727 209 QYYEIREPFSIMSLVKSPMGLMMGFMLVVVFLMPKLMENMDP------E-EMRRAQEEMR------SQGVPSLANLIPG 274 (278)
Q Consensus 209 ~Yfe~Re~fsi~~lLkNPM~LM~lv~l~l~~~mPKLme~mDP------E-e~ke~qeem~------~~~~p~~s~ll~g 274 (278)
.++....+|.+..|| +.++++++++++..|-|.+.-+- | ..|-.|-|++ +...||+++|.++
T Consensus 6 k~~~~~kgFTLvEML----iVLlIISiLlLl~iPNltKq~~~i~~kGc~A~vkmV~sQ~~~YeLdh~~~~pSl~~L~s~ 80 (107)
T COG4537 6 KFLKHKKGFTLVEML----IVLLIISILLLLFIPNLTKQKEVIQDKGCEAVVKMVESQAEAYELDHNRLPPSLSDLKSD 80 (107)
T ss_pred HHHHhcccccHHHHH----HHHHHHHHHHHHHccchhhhHHHHhcchHHHHHHHHHHHHHHHHhccCCCCCCHHHHHhC
Confidence 455566888888886 56778888888889999764321 1 2222233332 2348999999875
No 64
>cd02110 SO_family_Moco_dimer Subgroup of sulfite oxidase (SO) family molybdopterin binding domains that contains a conserved dimerization domain. This molybdopterin cofactor (Moco) binding domain is found in a variety of oxidoreductases; the main members of this family are nitrate reductase (NR) and sulfite oxidase (SO).
Probab=43.14 E-value=97 Score=29.68 Aligned_cols=58 Identities=21% Similarity=0.241 Sum_probs=36.6
Q ss_pred CCceEEEEEEECCCCCCCCCCCCCCCceeEEEEEcCceeE--EEecC-C-c-----cEEEc-cCCCeeEEEEEEecCc
Q 023727 102 GDGFSISGRVKLPGMSLKAFGSPGGKASNVKVVLNGGEHV--TFLRP-D-G-----YFSFQ-NMSAGTHLIEVAAIGY 169 (278)
Q Consensus 102 ~~~~tIsGrV~~p~~~p~~~~lp~~~~s~t~V~L~g~~~~--a~~~~-d-G-----~F~f~-nVP~GsY~LeVss~gy 169 (278)
.+.++|+|....- +. .+..+.|++|||+-| |-+.. . + .|++. ..++|.|+|.+...|-
T Consensus 222 ~~~~~i~G~A~~g-~~---------~I~rVEvS~DgG~tW~~A~l~~~~~~~~~W~~W~~~~~~~~G~~~l~vRA~D~ 289 (317)
T cd02110 222 GGRVEIGGVAWSG-GR---------GIRRVEVSLDGGRTWQEARLEGPLAGPRAWRQWELDWDLPPGEYELVARATDS 289 (317)
T ss_pred CCeEEEEEEEEcC-CC---------CEEEEEEEeCCCCcceEeEccCCcCCCCEEEEEEEEEEcCCCcEEEEEEEECC
Confidence 4568899988643 22 268899999999654 32322 2 2 33333 2446888888877664
No 65
>COG1422 Predicted membrane protein [Function unknown]
Probab=43.11 E-value=34 Score=31.31 Aligned_cols=35 Identities=23% Similarity=0.375 Sum_probs=20.1
Q ss_pred CHHHH---HHHHHHHHHHhhhccccCCCHHHHHHHHHHHH
Q 023727 225 SPMGL---MMGFMLVVVFLMPKLMENMDPEEMRRAQEEMR 261 (278)
Q Consensus 225 NPM~L---M~lv~l~l~~~mPKLme~mDPEe~ke~qeem~ 261 (278)
+|+.. ++++.++.+-+.-|+. +|-|.|+++|++|+
T Consensus 45 ~p~lvilV~avi~gl~~~i~~~~l--iD~ekm~~~qk~m~ 82 (201)
T COG1422 45 PPHLVILVAAVITGLYITILQKLL--IDQEKMKELQKMMK 82 (201)
T ss_pred ccHHHHHHHHHHHHHHHHHHHHHh--ccHHHHHHHHHHHH
Confidence 66543 3444444444444552 69998877776553
No 66
>PF02369 Big_1: Bacterial Ig-like domain (group 1); InterPro: IPR003344 Proteins that contain this domain are found in a variety of bacterial and phage surface proteins such as intimins. Intimin is a bacterial cell-adhesion molecule that mediates the intimate bacterial host-cell interaction. It contains three domains; two immunoglobulin-like domains and a C-type lectin-like module implying that carbohydrate recognition may be important in intimin-mediated cell adhesion [].; PDB: 1CWV_A 4E9L_A 1F02_I 1F00_I.
Probab=42.56 E-value=1.7e+02 Score=23.04 Aligned_cols=67 Identities=18% Similarity=0.162 Sum_probs=40.7
Q ss_pred CCCceEEEEEEECCCCCCCCCCCCCCCceeEEEEE----cCcee-----EEEecCCccEEE--ccCCCeeEEEEEEecCc
Q 023727 101 SGDGFSISGRVKLPGMSLKAFGSPGGKASNVKVVL----NGGEH-----VTFLRPDGYFSF--QNMSAGTHLIEVAAIGY 169 (278)
Q Consensus 101 ~~~~~tIsGrV~~p~~~p~~~~lp~~~~s~t~V~L----~g~~~-----~a~~~~dG~F~f--~nVP~GsY~LeVss~gy 169 (278)
..+..+|.-+|....++| ++...|.+ +++.+ .+.+|.+|...+ ..-.+|.|.+.+..-+.
T Consensus 21 g~~~~tltatV~D~~gnp---------v~g~~V~f~~~~~~~~l~~~~~~~~Td~~G~a~~tltst~aG~~~VtA~~~~~ 91 (100)
T PF02369_consen 21 GSDTNTLTATVTDANGNP---------VPGQPVTFSSSSSGGTLSPTNTSATTDSNGIATVTLTSTKAGTYTVTATVDGG 91 (100)
T ss_dssp SSS-EEEEEEEEETTSEB----------TS-EEEE--EESSSEES-CEE-EEE-TTSEEEEEEE-SS-EEEEEEEEETTE
T ss_pred CcCcEEEEEEEEcCCCCC---------CCCCEEEEEEcCCCcEEecCccccEECCCEEEEEEEEecCceEEEEEEEECCc
Confidence 357788999998777765 34455554 33333 478999997654 45577999998888776
Q ss_pred ceeeEEE
Q 023727 170 FFSPVRV 176 (278)
Q Consensus 170 ~F~p~RV 176 (278)
......+
T Consensus 92 ~~~~~~~ 98 (100)
T PF02369_consen 92 STSVTSV 98 (100)
T ss_dssp EEEEEEE
T ss_pred ceeEEee
Confidence 5554433
No 67
>cd05774 Ig_CEACAM_D1 First immunoglobulin (Ig)-like domain of carcinoembryonic antigen (CEA) related cell adhesion molecule (CEACAM). IG_CEACAM_D1: immunoglobulin (Ig)-like domain 1 in carcinoembryonic antigen (CEA) related cell adhesion molecule (CEACAM) protein subfamily. The CEA family is a group of anchored or secreted glycoproteins, expressed by epithelial cells, leukocytes, endothelial cells and placenta. The CEA family is divided into the CEACAM and pregnancy-specific glycoprotein (PSG) subfamilies. This group represents the CEACAM subfamily. CEACAM1 has many important cellular functions, it is a cell adhesion molecule, and a signaling molecule that regulates the growth of tumor cells, it is an angiogenic factor, and is a receptor for bacterial and viral pathogens, including mouse hepatitis virus (MHV). In mice, four isoforms of CEACAM1 generated by alternative splicing have either two [D1, D4] or four [D1-D4] Ig-like domains on the cell surface. This family corresponds to the D
Probab=42.38 E-value=85 Score=25.15 Aligned_cols=37 Identities=16% Similarity=0.331 Sum_probs=30.2
Q ss_pred EEecCCccEEEccCCC---eeEEEEEEecCcceeeEEEEE
Q 023727 142 TFLRPDGYFSFQNMSA---GTHLIEVAAIGYFFSPVRVDV 178 (278)
Q Consensus 142 a~~~~dG~F~f~nVP~---GsY~LeVss~gy~F~p~RVdV 178 (278)
+.+..||+=.|+||.. |.|+++|...++.+....|..
T Consensus 64 ~~~~~ngSL~I~~v~~~D~G~Y~~~v~~~~~~~~~~~v~l 103 (105)
T cd05774 64 ETIYPNGSLLIQNVTQKDTGFYTLQTITTNFQVEQATVHL 103 (105)
T ss_pred EEEeCCCcEEEecCCcccCEEEEEEEEeCCccEEEEEEEE
Confidence 4466799999999975 999999999998777666654
No 68
>cd05822 TLP_HIUase HIUase (5-hydroxyisourate hydrolase) catalyzes the second step in a three-step ureide pathway in which 5-hydroxyisourate (HIU), a product of the uricase (urate oxidase) reaction, is hydrolyzed to 2-oxo-4-hydroxy-4-carboxy-5-ureidoimidazoline (OHCU). HIUase has high sequence similarity with transthyretins and is a member of the transthyretin-like protein (TLP) family. HIUase is distinguished from transthyretins by a conserved signature motif at its C-terminus that forms part of the active site. In HIUase, this motif is YRGS, while transthyretins have a conserved TAVV sequence in the same location. Most HIUases are cytosolic but in plants and slime molds, they are peroxisomal based on the presence of N-terminal periplasmic localization sequences. HIUase forms a homotetramer with each subunit consisting of eight beta-strands arranged in two sheets and a short alpha-helix. The central channel of the tetramer contains two independent binding sites, each located betw
Probab=42.28 E-value=2e+02 Score=23.79 Aligned_cols=43 Identities=28% Similarity=0.349 Sum_probs=31.6
Q ss_pred ceeEEEEEc---Cc----eeEEEecCCccEEE-----ccCCCeeEEEEEEecCcc
Q 023727 128 ASNVKVVLN---GG----EHVTFLRPDGYFSF-----QNMSAGTHLIEVAAIGYF 170 (278)
Q Consensus 128 ~s~t~V~L~---g~----~~~a~~~~dG~F~f-----~nVP~GsY~LeVss~gy~ 170 (278)
++.+.|.|. +. -..+.+|.||...- ..+.+|.|.|....-+|.
T Consensus 16 Aagv~V~L~~~~~~~~~~i~~~~Td~DGR~~~~~~~~~~~~~G~Y~l~F~~~~Yf 70 (112)
T cd05822 16 AAGVAVTLYRLDGNGWTLLATGVTNADGRCDDLLPPGAQLAAGTYKLTFDTGAYF 70 (112)
T ss_pred cCCCEEEEEEecCCCeEEEEEEEECCCCCccCcccccccCCCeeEEEEEEhhhhh
Confidence 566777764 22 12588999999853 356789999999988885
No 69
>PRK12813 flgD flagellar basal body rod modification protein; Reviewed
Probab=42.20 E-value=94 Score=28.67 Aligned_cols=40 Identities=15% Similarity=0.214 Sum_probs=25.9
Q ss_pred ceeEEEEEcC--cee-EEE--ecCCccEEEcc-------CCCeeEEEEEEec
Q 023727 128 ASNVKVVLNG--GEH-VTF--LRPDGYFSFQN-------MSAGTHLIEVAAI 167 (278)
Q Consensus 128 ~s~t~V~L~g--~~~-~a~--~~~dG~F~f~n-------VP~GsY~LeVss~ 167 (278)
...++|.+.+ |+. .++ .--.+.|.+.. +|+|.|.++|...
T Consensus 123 a~~v~v~I~D~~G~vV~t~~~~~G~~~f~WDG~d~~G~~l~~G~Yt~~V~A~ 174 (223)
T PRK12813 123 ADKAELVVRDAAGAEVARETVPVGAGPVEWAGEDADGNPLPNGAYSFVVESY 174 (223)
T ss_pred CceEEEEEEcCCCCEEEEEeeCCCceeEEeCCcCCCCCcCCCccEEEEEEEE
Confidence 4567887764 332 222 22356788864 8899999999775
No 70
>PF00576 Transthyretin: HIUase/Transthyretin family; InterPro: IPR023416 This family includes transthyretin that is a thyroid hormone-binding protein that transports thyroxine from the bloodstream to the brain. However, most of the sequences listed in this family do not bind thyroid hormones. They are actually enzymes of the purine catabolism that catalyse the conversion of 5-hydroxyisourate (HIU) to OHCU [, ]. HIU hydrolysis is the original function of the family and is conserved from bacteria to mammals; transthyretins arose by gene duplications in the vertebrate lineage [, ]. HIUases are distinguished in the alignment from the conserved C-terminal YRGS sequence. Transthyretin (formerly prealbumin) is one of 3 thyroid hormone-binding proteins found in the blood of vertebrates []. It is produced in the liver and circulates in the bloodstream, where it binds retinol and thyroxine (T4) []. It differs from the other 2 hormone-binding proteins (T4-binding globulin and albumin) in 3 distinct ways: (1) the gene is expressed at a high rate in the brain choroid plexus; (2) it is enriched in cerebrospinal fluid; and (3) no genetically caused absence has been observed, suggesting an essential role in brain function, distinct from that played in the bloodstream []. The protein consists of around 130 amino acids, which assemble as a homotetramer that contains an internal channel in which T4 is bound. Within this complex, T4 appears to be transported across the blood-brain barrier, where, in the choroid plexus, the hormone stimulates further synthesis of transthyretin. The protein then diffuses back into the bloodstream, where it binds T4 for transport back to the brain [].; PDB: 1TFP_B 1KGJ_D 1IE4_C 1GKE_C 1KGI_D 2H0J_B 2H0E_B 2H0F_B 1ZD6_A 3DGD_D ....
Probab=41.13 E-value=32 Score=28.41 Aligned_cols=43 Identities=23% Similarity=0.298 Sum_probs=28.3
Q ss_pred ceeEEEEEcC----ce----eEEEecCCccEE-----EccCCCeeEEEEEEecCcc
Q 023727 128 ASNVKVVLNG----GE----HVTFLRPDGYFS-----FQNMSAGTHLIEVAAIGYF 170 (278)
Q Consensus 128 ~s~t~V~L~g----~~----~~a~~~~dG~F~-----f~nVP~GsY~LeVss~gy~ 170 (278)
++.+.|.|.. +. ..+.+|.||... -.++.+|.|.|....-+|.
T Consensus 16 A~gv~V~L~~~~~~~~~~~l~~~~Td~DGR~~~~~~~~~~~~~G~Y~l~F~~~~Yf 71 (112)
T PF00576_consen 16 AAGVPVTLYRLDSDGSWTLLAEGVTDADGRIKQPLLEGESLEPGIYKLVFDTGDYF 71 (112)
T ss_dssp -TT-EEEEEEEETTSCEEEEEEEEBETTSEESSTSSETTTS-SEEEEEEEEHHHHH
T ss_pred ccCCEEEEEEecCCCCcEEEEEEEECCCCcccccccccccccceEEEEEEEHHHhH
Confidence 5667777642 11 358899999983 2455689999999877764
No 71
>KOG4659 consensus Uncharacterized conserved protein (Rhs family) [Function unknown]
Probab=41.09 E-value=78 Score=36.75 Aligned_cols=71 Identities=21% Similarity=0.149 Sum_probs=51.3
Q ss_pred CCCCceEEEEEEECCCCCCCCCCCCCCCceeEEEEEc-CceeEEEecCCccEEEccCCCeeEEEEEEecCcceeeEEEEE
Q 023727 100 GSGDGFSISGRVKLPGMSLKAFGSPGGKASNVKVVLN-GGEHVTFLRPDGYFSFQNMSAGTHLIEVAAIGYFFSPVRVDV 178 (278)
Q Consensus 100 ~~~~~~tIsGrV~~p~~~p~~~~lp~~~~s~t~V~L~-g~~~~a~~~~dG~F~f~nVP~GsY~LeVss~gy~F~p~RVdV 178 (278)
.....-.|.|+|.+-++-| +-.++|.-- .-.|-|++++||+|-+----.++-+|+..-..|.-++-.|-|
T Consensus 47 ne~~~~vIrgrvv~~~~~p---------LVGVrVS~~~~~~yfTlTR~DG~FDL~vnGg~svtLqF~R~pF~~qkr~v~v 117 (1899)
T KOG4659|consen 47 NENRISVIRGRVVWGGGVP---------LVGVRVSDAAHPLYFTLTREDGYFDLTVNGGRSVTLQFLRTPFQSQKRSVFV 117 (1899)
T ss_pred ccccceEEeccEeecCCcc---------eEEEEeecccccceEEEEecCceEEEEEcccceEEEEEccCCCcccceeEEe
Confidence 4456679999999887744 445666533 345689999999999865556788888888777666655555
Q ss_pred E
Q 023727 179 S 179 (278)
Q Consensus 179 ~ 179 (278)
.
T Consensus 118 p 118 (1899)
T KOG4659|consen 118 P 118 (1899)
T ss_pred C
Confidence 4
No 72
>smart00060 FN3 Fibronectin type 3 domain. One of three types of internal repeat within the plasma protein, fibronectin. The tenth fibronectin type III repeat contains a RGD cell recognition sequence in a flexible loop between 2 strands. Type III modules are present in both extracellular and intracellular proteins.
Probab=39.92 E-value=59 Score=21.30 Aligned_cols=22 Identities=14% Similarity=0.324 Sum_probs=18.5
Q ss_pred CccEEEccCCCe-eEEEEEEecC
Q 023727 147 DGYFSFQNMSAG-THLIEVAAIG 168 (278)
Q Consensus 147 dG~F~f~nVP~G-sY~LeVss~g 168 (278)
+..|.+.++.+| .|.+.|...+
T Consensus 56 ~~~~~i~~L~~~~~Y~v~v~a~~ 78 (83)
T smart00060 56 STSYTLTGLKPGTEYEFRVRAVN 78 (83)
T ss_pred ccEEEEeCcCCCCEEEEEEEEEc
Confidence 568999999998 8988887765
No 73
>cd00173 SH2 Src homology 2 domains; Signal transduction, involved in recognition of phosphorylated tyrosine (pTyr). SH2 domains typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites.
Probab=36.61 E-value=82 Score=23.42 Aligned_cols=34 Identities=24% Similarity=0.479 Sum_probs=25.3
Q ss_pred cCCccEEEccCC--CeeEEEEEEecCcceeeEEEEEE
Q 023727 145 RPDGYFSFQNMS--AGTHLIEVAAIGYFFSPVRVDVS 179 (278)
Q Consensus 145 ~~dG~F~f~nVP--~GsY~LeVss~gy~F~p~RVdV~ 179 (278)
.++|+|-++.=+ +|.|+|.|...+ ....++|.-.
T Consensus 19 ~~~G~FLiR~s~~~~~~~~Lsv~~~~-~v~H~~I~~~ 54 (94)
T cd00173 19 KPDGTFLVRDSESSPGDYVLSVRVKG-KVKHYRIERT 54 (94)
T ss_pred CCCceEEEEecCCCCCCEEEEEEECC-EEEEEEEEEC
Confidence 589999998886 699999999887 4444554443
No 74
>PF03716 WCCH: WCCH motif ; InterPro: IPR005159 The WCCH motif is found in retrotransposons and Gemini viruses. A specific function has not been associated with this motif [].
Probab=36.59 E-value=17 Score=22.57 Aligned_cols=11 Identities=36% Similarity=0.860 Sum_probs=8.9
Q ss_pred cccCCCCCCcc
Q 023727 47 FPCKFHHCTSA 57 (278)
Q Consensus 47 ~~~~~~~~~~~ 57 (278)
-||.++|||--
T Consensus 4 ~pC~cphCprH 14 (25)
T PF03716_consen 4 QPCCCPHCPRH 14 (25)
T ss_pred cccCCCCCccc
Confidence 38999999854
No 75
>PRK06655 flgD flagellar basal body rod modification protein; Reviewed
Probab=35.16 E-value=1.7e+02 Score=26.81 Aligned_cols=40 Identities=30% Similarity=0.458 Sum_probs=23.8
Q ss_pred ceeEEEEEcC--cee-EEE---ecCCccEEE--cc-------CCCeeEEEEEEec
Q 023727 128 ASNVKVVLNG--GEH-VTF---LRPDGYFSF--QN-------MSAGTHLIEVAAI 167 (278)
Q Consensus 128 ~s~t~V~L~g--~~~-~a~---~~~dG~F~f--~n-------VP~GsY~LeVss~ 167 (278)
...++|.|.+ |+. .++ ....|.+.| .+ +|+|.|.++|...
T Consensus 125 a~~vti~I~D~~G~~Vrt~~lg~~~aG~~~f~WDG~d~~G~~lp~G~Yt~~V~A~ 179 (225)
T PRK06655 125 ADNVTVTITDSAGQVVRTIDLGAQSAGVVSFTWDGTDTDGNALPDGNYTIKASAS 179 (225)
T ss_pred CcEEEEEEEcCCCCEEEEEecCCcCCCceeEEECCCCCCCCcCCCeeEEEEEEEE
Confidence 4567787754 322 222 234554444 33 7889999988754
No 76
>PRK12633 flgD flagellar basal body rod modification protein; Provisional
Probab=34.62 E-value=2.8e+02 Score=25.53 Aligned_cols=42 Identities=24% Similarity=0.267 Sum_probs=24.9
Q ss_pred ceeEEEEEcC--cee-EEE---ecCCc--cEEEcc-------CCCeeEEEEEEecCc
Q 023727 128 ASNVKVVLNG--GEH-VTF---LRPDG--YFSFQN-------MSAGTHLIEVAAIGY 169 (278)
Q Consensus 128 ~s~t~V~L~g--~~~-~a~---~~~dG--~F~f~n-------VP~GsY~LeVss~gy 169 (278)
...++|.|.+ |+. .++ ....| .|.+.. +|+|.|.+.|...+.
T Consensus 128 a~~v~v~I~D~~G~vV~t~~lg~~~aG~~~f~WDG~d~~G~~~~~G~Y~~~V~a~~~ 184 (230)
T PRK12633 128 ATKVTVKVLDPSGAVVRTMELGDLKTGVHTLQWDGNNDGGQPLADGKYSITVSASDA 184 (230)
T ss_pred CcEEEEEEEeCCCCEEEEEecCCCCCCceeEEECCCCCCCCcCCCcceEEEEEEEeC
Confidence 4567777753 332 233 22344 455544 688999998877543
No 77
>PRK12812 flgD flagellar basal body rod modification protein; Reviewed
Probab=34.30 E-value=1.1e+02 Score=28.95 Aligned_cols=41 Identities=7% Similarity=0.086 Sum_probs=24.8
Q ss_pred ceeEEEEEcC--cee-EEE---ecCCc--cEEEcc-------CCCeeEEEEEEecC
Q 023727 128 ASNVKVVLNG--GEH-VTF---LRPDG--YFSFQN-------MSAGTHLIEVAAIG 168 (278)
Q Consensus 128 ~s~t~V~L~g--~~~-~a~---~~~dG--~F~f~n-------VP~GsY~LeVss~g 168 (278)
...++|.|.+ |+. .++ ....| .|.+.+ +|+|.|.++|...+
T Consensus 140 a~~v~v~I~D~~G~~V~t~~lg~~~aG~~~f~WDG~d~~G~~~~~G~Yt~~v~A~~ 195 (259)
T PRK12812 140 SDEGTLEIYDSNNKLVEKIDFKEISQGLFTMEWDGRDNDGVYAGDGEYTIKAVYNN 195 (259)
T ss_pred CceEEEEEEeCCCCEEEEEecCCCCCcceeEEECCCCCCCCcCCCeeeEEEEEEEc
Confidence 4568887764 332 222 11234 577776 78899998887543
No 78
>COG4939 Major membrane immunogen, membrane-anchored lipoprotein [Function unknown]
Probab=34.25 E-value=1.5e+02 Score=25.80 Aligned_cols=64 Identities=14% Similarity=0.123 Sum_probs=36.3
Q ss_pred HHHHHHHhhhcceeec--cCCCCceEEEEEEECCCCCCCCCCCCCCCceeEEEEEcCceeE----EEecCCccEEEcc
Q 023727 83 FINLFLSLVSSAVAVS--SGSGDGFSISGRVKLPGMSLKAFGSPGGKASNVKVVLNGGEHV----TFLRPDGYFSFQN 154 (278)
Q Consensus 83 ~~~~~~s~~~~~~a~s--~~~~~~~tIsGrV~~p~~~p~~~~lp~~~~s~t~V~L~g~~~~----a~~~~dG~F~f~n 154 (278)
+.+..++++.++.-.. .+.-.-++.+|.-..-+.. .|-.+++|++++|... -+.+.+|+|.=.+
T Consensus 8 ~~~~~~~LL~aCg~sd~s~~t~~dGtY~~~y~~fDd~--------gwk~f~~iti~dGKiv~~~ydy~~k~G~~Ks~D 77 (147)
T COG4939 8 GMIVALSLLTACGKSDFSKMTFNDGTYQGHYESFDDH--------GWKAFVTITIQDGKIVACTYDYRDKKGNIKSDD 77 (147)
T ss_pred HHHHHHHHHHHhcccccccccccCCceeeeecccccc--------CccceEEEEEeCCEEEEEEeeeecCCCCccccc
Confidence 4444445553333221 1222334556665544432 5789999999999775 3556677776554
No 79
>COG2165 PulG Type II secretory pathway, pseudopilin PulG [Cell motility and secretion / Intracellular trafficking and secretion]
Probab=34.03 E-value=64 Score=25.42 Aligned_cols=33 Identities=15% Similarity=0.385 Sum_probs=26.9
Q ss_pred cccChhhhccCHHHHHHHHHHHHHHhhhccccCCCHH
Q 023727 215 EPFSIMSLVKSPMGLMMGFMLVVVFLMPKLMENMDPE 251 (278)
Q Consensus 215 e~fsi~~lLkNPM~LM~lv~l~l~~~mPKLme~mDPE 251 (278)
.+|.+.-+| +-|+++.+++.+++|.++...|-.
T Consensus 8 rGFTLiElL----Vvl~Iigil~~~~~p~~~~~~~~~ 40 (149)
T COG2165 8 RGFTLIELL----VVLAIIGILAALALPSLQGSIDKA 40 (149)
T ss_pred CCcchHHHH----HHHHHHHHHHHHHHhhhhhHHHHH
Confidence 488888876 788888888889999998877654
No 80
>cd02114 bact_SorA_Moco sulfite:cytochrome c oxidoreductase subunit A (SorA), molybdopterin binding domain. SorA is involved in oxidation of sulfur compounds during chemolithotrophic growth. Together with SorB, a small c-type heme containing subunit, it forms a heterodimer. It is a member of the sulfite oxidase (SO) family of molybdopterin binding domains. This molybdopterin cofactor (Moco) binding domain is found in a variety of oxidoreductases; the main members of this family are nitrate reductase (NR) and sulfite oxidase (SO). Common features of all known members of this family are that they contain one single pterin cofactor and part of the coordination of the metal (Mo) is a cysteine ligand of the protein and that they catalyze the transfer of an oxygen to or from a lone pair of electrons on the substrate.
Probab=33.38 E-value=1.6e+02 Score=29.00 Aligned_cols=57 Identities=26% Similarity=0.309 Sum_probs=33.3
Q ss_pred CCceEEEEEEECCCCCCCCCCCCCCCceeEEEEEcCceeEEEec---CCccEEEc------cCC-CeeEEEEEEecC
Q 023727 102 GDGFSISGRVKLPGMSLKAFGSPGGKASNVKVVLNGGEHVTFLR---PDGYFSFQ------NMS-AGTHLIEVAAIG 168 (278)
Q Consensus 102 ~~~~tIsGrV~~p~~~p~~~~lp~~~~s~t~V~L~g~~~~a~~~---~dG~F~f~------nVP-~GsY~LeVss~g 168 (278)
.+.++|+|..- .++. .+..+.|++|||.-|...+ ..|.|... ..+ +|.|+|.+...|
T Consensus 274 ~~~~~i~G~A~-~G~~---------~I~rVEVS~DgG~tW~~A~l~~~~~~~aW~~W~~~~~~~~~G~~~l~~RA~D 340 (367)
T cd02114 274 AGELALRGIAF-DGGS---------GIRRVDVSADGGDSWTQATLGPDLGRFSFRGWKLTLDGVKKGPLTLMVRATN 340 (367)
T ss_pred CCeEEEEEEEE-cCCC---------CEEEEEEEeCCCCcceEeEeCCCCCCcEEEEEEEEEECCCCCcEEEEEEEEc
Confidence 45689999774 4432 2688999999986553322 23433222 222 466666655554
No 81
>PF08621 RPAP1_N: RPAP1-like, N-terminal; InterPro: IPR013930 Inhibition of RNA polymerase II-associated protein 1 (RPAP1) synthesis in Saccharomyces cerevisiae (Baker's yeast) results in changes in global gene expression that are similar to those caused by the loss of the RNAPII subunit Rpb11 []. This entry represents the N-terminal region of RPAP-1 that is conserved from yeast to humans.
Probab=33.09 E-value=20 Score=25.64 Aligned_cols=31 Identities=26% Similarity=0.449 Sum_probs=22.9
Q ss_pred cccCCCHHHHHHHHHHHHhCCCCchhhhCCC
Q 023727 244 LMENMDPEEMRRAQEEMRSQGVPSLANLIPG 274 (278)
Q Consensus 244 Lme~mDPEe~ke~qeem~~~~~p~~s~ll~g 274 (278)
.+..|.||+..+.|++...+=-|++=++|-.
T Consensus 11 rL~~MS~eEI~~er~eL~~~LdP~li~~L~~ 41 (49)
T PF08621_consen 11 RLASMSPEEIEEEREELLESLDPKLIEFLKK 41 (49)
T ss_pred HHHhCCHHHHHHHHHHHHHhCCHHHHHHHHH
Confidence 4568999999999998776655676665544
No 82
>COG4676 Uncharacterized protein conserved in bacteria [Function unknown]
Probab=32.27 E-value=78 Score=29.76 Aligned_cols=100 Identities=22% Similarity=0.284 Sum_probs=57.0
Q ss_pred eEEEEEEECCCCCCCCCCCCCCCceeEEEEEcCceeEEEecCCccEEEcc-CCCeeEEEEEEecCcceeeEEEEEEcCCC
Q 023727 105 FSISGRVKLPGMSLKAFGSPGGKASNVKVVLNGGEHVTFLRPDGYFSFQN-MSAGTHLIEVAAIGYFFSPVRVDVSARHP 183 (278)
Q Consensus 105 ~tIsGrV~~p~~~p~~~~lp~~~~s~t~V~L~g~~~~a~~~~dG~F~f~n-VP~GsY~LeVss~gy~F~p~RVdV~~~~~ 183 (278)
.-|.|+|.....+ ...-++++||...--.+|.||+|.=.= -.+||--++|.+.+=.- .+||..-....
T Consensus 68 a~I~G~iraa~~~----------~r~~~LVVNG~~mPl~~d~dG~F~RPyaFg~GSNsVeV~sadG~~-rqRvQFY~~~~ 136 (268)
T COG4676 68 ALIRGRIRAAAGK----------DRPGRLVVNGNSMPLRIDSDGSFARPYAFGEGSNSVEVRSADGQS-RQRVQFYETNA 136 (268)
T ss_pred HHHhhhHhhccCc----------CCCceEEEcCcccceeecCCCceecceeccCCCCceEEECCCcch-hheEEEEecCC
Confidence 3577777632221 244778899988888899999996432 25688888888877432 23444433223
Q ss_pred CceEEEEe-----cc-cCccccEEEEeccccceeeecc
Q 023727 184 GKVQAALT-----ET-RRGLNELVLEQLREEQYYEIRE 215 (278)
Q Consensus 184 G~VrA~~~-----e~-~~~l~PLvv~p~~~~~Yfe~Re 215 (278)
|+++|++- ++ +-.+.==++.|-+...||-.|.
T Consensus 137 g~~~arlRvvLsWD~d~tdlDlHvvtPdG~Hawygn~~ 174 (268)
T COG4676 137 GKTRARLRVVLSWDTDNTDLDLHVVTPDGDHAWYGNPV 174 (268)
T ss_pred CccCceEEEEEEECCCCCceeEEEecCCCceeeecCce
Confidence 44444331 11 1111112456777777776553
No 83
>PF01186 Lysyl_oxidase: Lysyl oxidase ; InterPro: IPR001695 Lysyl oxidase (1.4.3.13 from EC) (LOX) [] is an extracellular copper-dependent enzyme that catalyses the oxidative deamination of peptidyl lysine residues in precursors of various collagens and elastins, yielding alpha-aminoadipic-delta-semialdehyde. The deaminated lysines are then able to form semialdehyde cross-links, resulting in the formation of insoluble collagen and elastin fibres in the extracellular matrix []. The active site of LOX resides towards the C terminus: this region also binds a single copper atom in an octahedral coordination complex involving at least 3 His residues []. Four histidine residues are clustered in a central region of the enzyme. This region is thought to be involved in copper-binding and is called the 'copper-talon' [].; GO: 0005507 copper ion binding, 0016641 oxidoreductase activity, acting on the CH-NH2 group of donors, oxygen as acceptor, 0055114 oxidation-reduction process
Probab=27.64 E-value=46 Score=30.56 Aligned_cols=18 Identities=17% Similarity=0.545 Sum_probs=14.9
Q ss_pred cEEEccCCCeeEEEEEEe
Q 023727 149 YFSFQNMSAGTHLIEVAA 166 (278)
Q Consensus 149 ~F~f~nVP~GsY~LeVss 166 (278)
-+-|.+||+|.|+|+|.-
T Consensus 151 WiDITdvp~G~Y~l~V~v 168 (205)
T PF01186_consen 151 WIDITDVPPGTYILQVTV 168 (205)
T ss_pred ceeecCCCCccEEEEEec
Confidence 567889999999988763
No 84
>PF14347 DUF4399: Domain of unknown function (DUF4399)
Probab=27.16 E-value=42 Score=26.54 Aligned_cols=22 Identities=14% Similarity=0.157 Sum_probs=18.2
Q ss_pred cCCCeeEEEEEEecCcceeeEE
Q 023727 154 NMSAGTHLIEVAAIGYFFSPVR 175 (278)
Q Consensus 154 nVP~GsY~LeVss~gy~F~p~R 175 (278)
+++||+|+|.+..-|+.-.++.
T Consensus 58 ~L~PG~htLtl~~~d~~h~~~~ 79 (87)
T PF14347_consen 58 ELPPGKHTLTLQLGDGDHVPHD 79 (87)
T ss_pred EeCCCCEEEEEEeCCCCcccCC
Confidence 5899999999999988766553
No 85
>PRK14646 hypothetical protein; Provisional
Probab=27.08 E-value=35 Score=29.54 Aligned_cols=17 Identities=18% Similarity=0.311 Sum_probs=13.9
Q ss_pred cCCCeeEEEEEEecCcc
Q 023727 154 NMSAGTHLIEVAAIGYF 170 (278)
Q Consensus 154 nVP~GsY~LeVss~gy~ 170 (278)
+.=++.|+|||+|+|..
T Consensus 70 D~i~~~Y~LEVSSPGld 86 (155)
T PRK14646 70 NLLNCSYVLEISSQGVS 86 (155)
T ss_pred CCCCCCeEEEEcCCCCC
Confidence 34459999999999975
No 86
>PF11346 DUF3149: Protein of unknown function (DUF3149); InterPro: IPR021494 This bacterial family of proteins has no known function.
Probab=27.01 E-value=73 Score=22.20 Aligned_cols=23 Identities=30% Similarity=0.327 Sum_probs=18.5
Q ss_pred hhhccCHHHHHHHHHHHHHHhhh
Q 023727 220 MSLVKSPMGLMMGFMLVVVFLMP 242 (278)
Q Consensus 220 ~~lLkNPM~LM~lv~l~l~~~mP 242 (278)
.+++.|+..||.++..+.+++|.
T Consensus 4 ~~LF~s~vGL~Sl~vI~~~igm~ 26 (42)
T PF11346_consen 4 KDLFGSDVGLMSLIVIVFTIGMG 26 (42)
T ss_pred HHHhcChHHHHHHHHHHHHHHHH
Confidence 46688999999988888777664
No 87
>cd05469 Transthyretin_like Transthyretin_like. This domain is present in the transthyretin-like protein (TLP) family which includes transthyretin (TTR) and a transthyretin-related protein called 5-hydroxyisourate hydrolase (HIUase). TTR and HIUase are homotetrameric proteins with each subunit consisting of eight beta-strands arranged in two sheets and a short alpha-helix. The central channel of the tetramer contains two independent binding sites, each located between a pair of subunits. TTR transports thyroid hormones and retinol in the blood serum of vertebrates while HIUase catalyzes the second step in a three-step ureide pathway. TTRs are highly conserved and found only in vertebrates while the HIUases are found in a wide range of bacterial, plant, fungal, slime mold and vertebrate organisms.
Probab=26.84 E-value=3.7e+02 Score=22.33 Aligned_cols=43 Identities=23% Similarity=0.330 Sum_probs=30.4
Q ss_pred ceeEEEEEc---C-ce----eEEEecCCccEEE----ccCCCeeEEEEEEecCcc
Q 023727 128 ASNVKVVLN---G-GE----HVTFLRPDGYFSF----QNMSAGTHLIEVAAIGYF 170 (278)
Q Consensus 128 ~s~t~V~L~---g-~~----~~a~~~~dG~F~f----~nVP~GsY~LeVss~gy~ 170 (278)
++.+.|.|. + +. ..+.+|.||...= .++.+|.|.|...--+|.
T Consensus 16 Aagv~V~L~~~~~~~~w~~l~~~~Tn~DGR~~~~l~~~~~~~G~Y~l~F~t~~Yf 70 (113)
T cd05469 16 AANVAIKVFRKTADGSWEIFATGKTNEDGELHGLITEEEFXAGVYRVEFDTKSYW 70 (113)
T ss_pred CCCCEEEEEEecCCCceEEEEEEEECCCCCccCccccccccceEEEEEEehHHhH
Confidence 566777663 2 22 2488999999841 345689999999988884
No 88
>cd05821 TLP_Transthyretin Transthyretin (TTR) is a 55 kDa protein responsible for the transport of thyroid hormones and retinol in vertebrates. TTR distributes the two thyroid hormones T3 (3,5,3'-triiodo-L-thyronine) and T4 (Thyroxin, or 3,5,3',5'-tetraiodo-L-thyronine), as well as retinol (vitamin A) through the formation of a macromolecular complex that includes each of these as well as retinol-binding protein. Misfolded forms of TTR are implicated in the amyloid diseases familial amyloidotic polyneuropathy and senile systemic amyloidosis. TTR forms a homotetramer with each subunit consisting of eight beta-strands arranged in two sheets and a short alpha-helix. The central channel of the tetramer contains two independent binding sites, each located between a pair of subunits, which differ in their ligand binding affinity. A negative cooperativity has been observed for the binding of T4 and other TTR ligands. A fraction of plasma TTR is carried in high density lipoproteins by bindi
Probab=25.93 E-value=1.2e+02 Score=25.63 Aligned_cols=44 Identities=20% Similarity=0.254 Sum_probs=31.2
Q ss_pred ceeEEEEEc----Cce----eEEEecCCccEE--E--ccCCCeeEEEEEEecCcce
Q 023727 128 ASNVKVVLN----GGE----HVTFLRPDGYFS--F--QNMSAGTHLIEVAAIGYFF 171 (278)
Q Consensus 128 ~s~t~V~L~----g~~----~~a~~~~dG~F~--f--~nVP~GsY~LeVss~gy~F 171 (278)
++.+.|.|. +++ -.+.+|.||... + ..+++|.|.|+...-+|.-
T Consensus 22 AaGV~V~L~~~~~~~~w~~l~~~~Tn~DGR~~~ll~~~~~~~G~Y~l~F~tg~Yf~ 77 (121)
T cd05821 22 AANVAVKVFKKTADGSWEPFASGKTTETGEIHGLTTDEQFTEGVYKVEFDTKAYWK 77 (121)
T ss_pred CCCCEEEEEEecCCCceEEEEEEEECCCCCCCCccCccccCCeeEEEEEehhHhhh
Confidence 567777773 222 258899999985 2 2346799999999888853
No 89
>PF13753 SWM_repeat: Putative flagellar system-associated repeat
Probab=24.64 E-value=2.7e+02 Score=26.01 Aligned_cols=52 Identities=27% Similarity=0.335 Sum_probs=34.9
Q ss_pred CceEEEEEEECCCCCCCCCCCCCCCceeEEEEEcCceeEEEecCCccEEEcc-----CCCeeEEEEEE
Q 023727 103 DGFSISGRVKLPGMSLKAFGSPGGKASNVKVVLNGGEHVTFLRPDGYFSFQN-----MSAGTHLIEVA 165 (278)
Q Consensus 103 ~~~tIsGrV~~p~~~p~~~~lp~~~~s~t~V~L~g~~~~a~~~~dG~F~f~n-----VP~GsY~LeVs 165 (278)
...+++|.+.-... -..+.|.++|..+....+.+|.|.+.- ++.|.|.+.+.
T Consensus 61 ~~~t~s~tvs~~~~-----------g~~v~v~~~g~~~t~~~~~~G~ws~t~~~~~~l~~g~~ti~v~ 117 (317)
T PF13753_consen 61 NTVTFSGTVSGAEP-----------GSTVTVTINGTTGTLTADADGNWSVTVTPSDDLPDGDYTITVT 117 (317)
T ss_pred eeeEEEEEecCCCC-----------CCEEEEEECCEEEEEEEecCCcEEEeeccccccccCcceeEEE
Confidence 44577777743211 244677777777766688999977632 45789998887
No 90
>COG4594 FecB ABC-type Fe3+-citrate transport system, periplasmic component [Inorganic ion transport and metabolism]
Probab=24.61 E-value=1.3e+02 Score=29.01 Aligned_cols=31 Identities=13% Similarity=0.105 Sum_probs=26.9
Q ss_pred EEEecCCccEEEccCCCeeEEEEEEecCcce
Q 023727 141 VTFLRPDGYFSFQNMSAGTHLIEVAAIGYFF 171 (278)
Q Consensus 141 ~a~~~~dG~F~f~nVP~GsY~LeVss~gy~F 171 (278)
.++.|+.|+|++...|----+||.++.|+.-
T Consensus 34 ~tVkde~Gt~tv~k~PKRVVVLE~SFaDaLa 64 (310)
T COG4594 34 HTVKDELGTFTVPKTPKRVVVLELSFADALA 64 (310)
T ss_pred eeeeccCCceecCCCCceEEEEEecHHHHHH
Confidence 3488999999999999999999999888753
No 91
>COG4970 FimT Tfp pilus assembly protein FimT [Cell motility and secretion / Intracellular trafficking and secretion]
Probab=24.45 E-value=86 Score=28.22 Aligned_cols=36 Identities=14% Similarity=0.386 Sum_probs=29.8
Q ss_pred cccChhhhccCHHHHHHHHHHHHHHhhhccccCCCHHHHH
Q 023727 215 EPFSIMSLVKSPMGLMMGFMLVVVFLMPKLMENMDPEEMR 254 (278)
Q Consensus 215 e~fsi~~lLkNPM~LM~lv~l~l~~~mPKLme~mDPEe~k 254 (278)
.+|.+..|| +.|+++.++.++.+|-+.+.++-+.+.
T Consensus 8 rGfTL~ELl----iviAIlAIla~~A~P~fs~~i~~~rl~ 43 (181)
T COG4970 8 RGFTLLELL----IVLAILAILAVIAAPNFSQWIRSQRLR 43 (181)
T ss_pred CceeHHHHH----HHHHHHHHHHHHhcchHHHHHHHHHHH
Confidence 577877765 889999999999999999999987433
No 92
>PF00577 Usher: Outer membrane usher protein; InterPro: IPR000015 In Gram-negative bacteria the biogenesis of fimbriae (or pili) requires a two-component assembly and transport system which is composed of a periplasmic chaperone (see PDOC00552 from PROSITEDOC) and an outer membrane protein which has been termed a molecular 'usher' [, , ]. The usher protein is rather large (from 86 to 100 kDa) and seems to be mainly composed of membrane-spanning beta-sheets, a structure reminiscent of porins. Although the degree of sequence similarity of these proteins is not very high, they share a number of characteristics. One of these is the presence of two pairs of cysteines, the first one located in the N-terminal part and the second at the C-terminal extremity that are probably involved in disulphide bonds. The best conserved region is located in the central part of these proteins.; GO: 0005215 transporter activity, 0006810 transport, 0016020 membrane; PDB: 2VQI_B 3FIP_A 3RFZ_E 3OHN_A 3FCG_B.
Probab=24.10 E-value=2.2e+02 Score=28.82 Aligned_cols=40 Identities=23% Similarity=0.304 Sum_probs=24.2
Q ss_pred eeEEEEEcCceeEEEecCCccEEEccCCC----eeEEEEEEecC
Q 023727 129 SNVKVVLNGGEHVTFLRPDGYFSFQNMSA----GTHLIEVAAIG 168 (278)
Q Consensus 129 s~t~V~L~g~~~~a~~~~dG~F~f~nVP~----GsY~LeVss~g 168 (278)
+.+.|..+|.......=+-|-|.|.|+|. |.-.|+|.-.+
T Consensus 99 s~V~v~qnG~~iy~~~VppGpF~i~dlp~~~~~gdl~V~i~d~~ 142 (552)
T PF00577_consen 99 STVEVYQNGRLIYSTNVPPGPFEIDDLPLISGSGDLQVVITDAD 142 (552)
T ss_dssp EEEEEEETTEEEEEEEE-SEEEEE-SS-TTTTTSEEEEEEEETT
T ss_pred cEEEEEECCEEEEEEEeCCCCEEecCccccCCCceEEEEEEECC
Confidence 67888888875443333449999999996 44555444433
No 93
>PF10836 DUF2574: Protein of unknown function (DUF2574) ; InterPro: IPR020386 This entry contains proteins with no known function.
Probab=24.07 E-value=57 Score=26.36 Aligned_cols=14 Identities=43% Similarity=0.662 Sum_probs=11.0
Q ss_pred CCCceEEEEEEECC
Q 023727 101 SGDGFSISGRVKLP 114 (278)
Q Consensus 101 ~~~~~tIsGrV~~p 114 (278)
+..+.+|+|||..|
T Consensus 24 dTATLtIsGrv~~P 37 (93)
T PF10836_consen 24 DTATLTISGRVSPP 37 (93)
T ss_pred cceEEEEcceEcCC
Confidence 44569999999865
No 94
>COG4850 Uncharacterized conserved protein [Function unknown]
Probab=23.99 E-value=1.2e+02 Score=30.20 Aligned_cols=39 Identities=26% Similarity=0.434 Sum_probs=31.6
Q ss_pred EEEEEc-CceeEEEecCCccEEEccCC-----CeeEEEEEEecCc
Q 023727 131 VKVVLN-GGEHVTFLRPDGYFSFQNMS-----AGTHLIEVAAIGY 169 (278)
Q Consensus 131 t~V~L~-g~~~~a~~~~dG~F~f~nVP-----~GsY~LeVss~gy 169 (278)
+.+++. +....+.+|.+|+|.++-+- +|-+.+.+...|.
T Consensus 101 V~~T~~~~~tv~~~Td~~Gyf~i~~~~~~~~~~g~~av~lq~eg~ 145 (373)
T COG4850 101 VYVTLKNGATVNVATDDEGYFIIHAVIPFPPTKGNHAVRLQSEGE 145 (373)
T ss_pred EEEecCCCceEEeEecCCCceEEEEecccCCCCCceeEEeecCCC
Confidence 566676 66778999999999998763 4899999998884
No 95
>PF01060 DUF290: Transthyretin-like family; InterPro: IPR001534 This new apparently nematode-specific protein family has been called family 2 []. The proteins show weak similarity to transthyretin (formerly called prealbumin) which transports thyroid hormones. The specific function of this protein is unknown.; GO: 0005615 extracellular space
Probab=23.91 E-value=1.5e+02 Score=22.46 Aligned_cols=31 Identities=32% Similarity=0.289 Sum_probs=20.5
Q ss_pred ceeEEEEEcCc--------eeEEEecCCccEEEccCCCe
Q 023727 128 ASNVKVVLNGG--------EHVTFLRPDGYFSFQNMSAG 158 (278)
Q Consensus 128 ~s~t~V~L~g~--------~~~a~~~~dG~F~f~nVP~G 158 (278)
+++++|.|-.. --.+.+|.+|+|.+..=...
T Consensus 11 ~~~~~V~L~e~d~~~~Ddll~~~~Td~~G~F~l~G~~~e 49 (80)
T PF01060_consen 11 AKNVKVKLWEDDYFDPDDLLDETKTDSDGNFELSGSTNE 49 (80)
T ss_pred CCCCEEEEEECCCCCCCceeEEEEECCCceEEEEEEccC
Confidence 45677776321 12588999999999864433
No 96
>PRK14643 hypothetical protein; Provisional
Probab=23.77 E-value=49 Score=29.03 Aligned_cols=17 Identities=18% Similarity=0.225 Sum_probs=14.2
Q ss_pred cCCCeeEEEEEEecCcc
Q 023727 154 NMSAGTHLIEVAAIGYF 170 (278)
Q Consensus 154 nVP~GsY~LeVss~gy~ 170 (278)
+.-+|.|+|||+|+|..
T Consensus 74 d~i~~~Y~LEVSSPGle 90 (164)
T PRK14643 74 IKTSEKYLLEISSSGIE 90 (164)
T ss_pred CCCCCCeEEEecCCCCC
Confidence 45579999999999974
No 97
>PF11974 MG1: Alpha-2-macroglobulin MG1 domain; InterPro: IPR021868 This is the N-terminal MG1 domain from alpha-2-macroglobulin [].
Probab=23.69 E-value=2.8e+02 Score=21.89 Aligned_cols=28 Identities=25% Similarity=0.387 Sum_probs=21.1
Q ss_pred ceeEEEEEcC---ce--eEEEecCCccEEEccC
Q 023727 128 ASNVKVVLNG---GE--HVTFLRPDGYFSFQNM 155 (278)
Q Consensus 128 ~s~t~V~L~g---~~--~~a~~~~dG~F~f~nV 155 (278)
++.++|.|.+ ++ -.+.+|.+|.+.|...
T Consensus 28 v~ga~V~l~~~~~~~~l~~g~TD~~G~a~~~~~ 60 (97)
T PF11974_consen 28 VAGAEVELYDSRNGQVLASGKTDADGFASFDST 60 (97)
T ss_pred cCCCEEEEEECCCCcEeeeeeeCCCceEEecCC
Confidence 5677777754 22 2588999999999887
No 98
>COG2351 Transthyretin-like protein [General function prediction only]
Probab=23.52 E-value=4.7e+02 Score=22.35 Aligned_cols=45 Identities=27% Similarity=0.330 Sum_probs=32.1
Q ss_pred ceeEEEEEc---Cce----eEEEecCCccEEEcc-----CCCeeEEEEEEecCccee
Q 023727 128 ASNVKVVLN---GGE----HVTFLRPDGYFSFQN-----MSAGTHLIEVAAIGYFFS 172 (278)
Q Consensus 128 ~s~t~V~L~---g~~----~~a~~~~dG~F~f~n-----VP~GsY~LeVss~gy~F~ 172 (278)
.+.++|.|. +.+ ..+.+|.||.-.-.. +..|.|.|+.+.-||.-.
T Consensus 24 Aagv~V~L~rl~~~~~~~l~t~~Tn~DGR~d~pll~g~~~~~G~Y~l~F~~gdYf~~ 80 (124)
T COG2351 24 AAGVKVELYRLEGNQWELLKTVVTNADGRIDAPLLAGETLATGIYELVFHTGDYFKS 80 (124)
T ss_pred CCCCEEEEEEecCCcceeeeEEEecCCCcccccccCccccccceEEEEEEcchhhhc
Confidence 567777764 331 258899999877433 346999999998888543
No 99
>PRK10301 hypothetical protein; Provisional
Probab=23.22 E-value=1.4e+02 Score=24.77 Aligned_cols=16 Identities=13% Similarity=0.256 Sum_probs=10.4
Q ss_pred EEEc---cCCCeeEEEEEE
Q 023727 150 FSFQ---NMSAGTHLIEVA 165 (278)
Q Consensus 150 F~f~---nVP~GsY~LeVs 165 (278)
+++. ++++|+|+++-+
T Consensus 88 ~~v~l~~~L~~G~YtV~Wr 106 (124)
T PRK10301 88 LIVPLADSLKPGTYTVDWH 106 (124)
T ss_pred EEEECCCCCCCccEEEEEE
Confidence 5553 467899987644
No 100
>PF05753 TRAP_beta: Translocon-associated protein beta (TRAPB); InterPro: IPR008856 This family consists of several eukaryotic translocon-associated protein beta (TRAPB) or signal sequence receptor beta subunit (SSR-beta) proteins. The normal translocation of nascent polypeptides into the lumen of the endoplasmic reticulum (ER) is thought to be aided in part by a translocon-associated protein (TRAP) complex consisting of 4 protein subunits. The association of mature proteins with the ER and Golgi, or other intracellular locales, such as lysosomes, depends on the initial targeting of the nascent polypeptide to the ER membrane. A similar scenario must also exist for proteins destined for secretion [].; GO: 0005783 endoplasmic reticulum, 0016021 integral to membrane
Probab=22.91 E-value=5.4e+02 Score=22.83 Aligned_cols=41 Identities=12% Similarity=0.212 Sum_probs=26.8
Q ss_pred EEecCCccEEEccCCCe-----eEEEEEEec-CcceeeEEEEEEcCC
Q 023727 142 TFLRPDGYFSFQNMSAG-----THLIEVAAI-GYFFSPVRVDVSARH 182 (278)
Q Consensus 142 a~~~~dG~F~f~nVP~G-----sY~LeVss~-gy~F~p~RVdV~~~~ 182 (278)
.++....+.++..+||| +|+|+.... .|.+.+..|......
T Consensus 70 ~lvsG~~s~~~~~i~pg~~vsh~~vv~p~~~G~f~~~~a~VtY~~~~ 116 (181)
T PF05753_consen 70 ELVSGSLSASWERIPPGENVSHSYVVRPKKSGYFNFTPAVVTYRDSE 116 (181)
T ss_pred EeccCceEEEEEEECCCCeEEEEEEEeeeeeEEEEccCEEEEEECCC
Confidence 44555667788899998 566665533 356667777776543
No 101
>smart00831 Cation_ATPase_N Cation transporter/ATPase, N-terminus. This entry represents the conserved N-terminal region found in several classes of cation-transporting P-type ATPases, including those that transport H+, Na+, Ca2+, Na+/K+, and H+/K+. In the H+/K+- and Na+/K+-exchange P-ATPases, this domain is found in the catalytic alpha chain. In gastric H+/K+-ATPases, this domain undergoes reversible sequential phosphorylation inducing conformational changes that may be important for regulating the function of these ATPases PUBMED:12480547, PUBMED:12529322.
Probab=22.84 E-value=83 Score=22.20 Aligned_cols=22 Identities=14% Similarity=0.212 Sum_probs=16.9
Q ss_pred hhhccCHHHHHHHHHHHHHHhh
Q 023727 220 MSLVKSPMGLMMGFMLVVVFLM 241 (278)
Q Consensus 220 ~~lLkNPM~LM~lv~l~l~~~m 241 (278)
+.-|+|||.++.++..++.+++
T Consensus 41 l~~~~~p~~~iL~~~a~is~~~ 62 (64)
T smart00831 41 LRQFHNPLIYILLAAAVLSALL 62 (64)
T ss_pred HHHHHhHHHHHHHHHHHHHHHH
Confidence 3347899999998888877654
No 102
>COG5266 CbiK ABC-type Co2+ transport system, periplasmic component [Inorganic ion transport and metabolism]
Probab=22.80 E-value=2.7e+02 Score=26.66 Aligned_cols=30 Identities=27% Similarity=0.236 Sum_probs=25.1
Q ss_pred eEEEecCCccEEEccCCCeeEEEEEEecCc
Q 023727 140 HVTFLRPDGYFSFQNMSAGTHLIEVAAIGY 169 (278)
Q Consensus 140 ~~a~~~~dG~F~f~nVP~GsY~LeVss~gy 169 (278)
+..++|.+|.|.|-=+..|..-.-+.+.+=
T Consensus 214 ~~~~TD~kG~~~fip~r~G~W~~~~~~~~~ 243 (264)
T COG5266 214 LVQFTDDKGEVSFIPLRAGVWGFAVEHKTD 243 (264)
T ss_pred eEEEcCCCceEEEEEccCceEEEEeeccCC
Confidence 578999999999988888999877766653
No 103
>PRK14639 hypothetical protein; Provisional
Probab=22.72 E-value=56 Score=27.86 Aligned_cols=17 Identities=24% Similarity=0.299 Sum_probs=13.9
Q ss_pred cCCCeeEEEEEEecCcc
Q 023727 154 NMSAGTHLIEVAAIGYF 170 (278)
Q Consensus 154 nVP~GsY~LeVss~gy~ 170 (278)
+.-+|.|+|+|+|+|..
T Consensus 58 d~i~~~Y~LEVSSPGl~ 74 (140)
T PRK14639 58 PPVSGEYFLEVSSPGLE 74 (140)
T ss_pred cccCCCeEEEEeCCCCC
Confidence 34469999999999974
No 104
>PF13860 FlgD_ig: FlgD Ig-like domain; PDB: 3C12_A 3OSV_A.
Probab=22.33 E-value=3.4e+02 Score=20.33 Aligned_cols=14 Identities=29% Similarity=0.325 Sum_probs=9.8
Q ss_pred cCCCeeEEEEEEec
Q 023727 154 NMSAGTHLIEVAAI 167 (278)
Q Consensus 154 nVP~GsY~LeVss~ 167 (278)
.||+|.|.+.|...
T Consensus 65 ~~~~G~Y~~~v~a~ 78 (81)
T PF13860_consen 65 PVPDGTYTFRVTAT 78 (81)
T ss_dssp B--SEEEEEEEEEE
T ss_pred CCCCCCEEEEEEEE
Confidence 78999999988764
No 105
>PRK00022 lolB outer membrane lipoprotein LolB; Provisional
Probab=22.32 E-value=4.2e+02 Score=23.27 Aligned_cols=12 Identities=17% Similarity=0.332 Sum_probs=8.2
Q ss_pred CceEEEEEEECC
Q 023727 103 DGFSISGRVKLP 114 (278)
Q Consensus 103 ~~~tIsGrV~~p 114 (278)
+.+.++||+...
T Consensus 46 ~~w~~~Gria~~ 57 (202)
T PRK00022 46 TQYQTRGRFAYI 57 (202)
T ss_pred hceEEEEEEEEE
Confidence 557777777654
No 106
>PRK11657 dsbG disulfide isomerase/thiol-disulfide oxidase; Provisional
Probab=22.20 E-value=2.9e+02 Score=25.45 Aligned_cols=41 Identities=20% Similarity=0.322 Sum_probs=26.1
Q ss_pred CceEEEEEEECCCCCCCCCCCCCCCceeEEEEEcCceeEEEecCCccEEEcc
Q 023727 103 DGFSISGRVKLPGMSLKAFGSPGGKASNVKVVLNGGEHVTFLRPDGYFSFQN 154 (278)
Q Consensus 103 ~~~tIsGrV~~p~~~p~~~~lp~~~~s~t~V~L~g~~~~a~~~~dG~F~f~n 154 (278)
.+.+|..+...++. +........+....-++++||.|.|.+
T Consensus 33 ~g~~v~~~~~~p~~-----------l~g~~~~~~~~~~i~Y~t~dg~y~i~G 73 (251)
T PRK11657 33 QGITIIKTFDAPGG-----------LKGYAAKYQDMGVTIYLTPDGKHAISG 73 (251)
T ss_pred CCCEEEEeecCCCC-----------ceEEEEEeCCCceEEEEcCCCCEEEEE
Confidence 46677666443322 344555556666678999999988864
No 107
>smart00634 BID_1 Bacterial Ig-like domain (group 1).
Probab=22.12 E-value=3.6e+02 Score=20.52 Aligned_cols=61 Identities=11% Similarity=0.131 Sum_probs=34.3
Q ss_pred CCceEEEEEEECCCCCCCCCCCCCCCceeEEEEEcCcee------EEEecCCccEEE--ccCCCeeEEEEEEecCc
Q 023727 102 GDGFSISGRVKLPGMSLKAFGSPGGKASNVKVVLNGGEH------VTFLRPDGYFSF--QNMSAGTHLIEVAAIGY 169 (278)
Q Consensus 102 ~~~~tIsGrV~~p~~~p~~~~lp~~~~s~t~V~L~g~~~------~a~~~~dG~F~f--~nVP~GsY~LeVss~gy 169 (278)
.+..+|+=+|....++|. | -..+++.++++.. ...+|.+|...+ ..-.+|.|++.+...++
T Consensus 17 ~d~~~i~v~v~D~~Gnpv----~---~~~V~f~~~~~~~~~~~~~~~~Td~~G~a~~~l~~~~~G~~~vta~~~~~ 85 (92)
T smart00634 17 SDAITLTATVTDANGNPV----A---GQEVTFTTPSGGALTLSKGTATTDANGIATVTLTSTTAGVYTVTASLENG 85 (92)
T ss_pred cccEEEEEEEECCCCCCc----C---CCEEEEEECCCceeeccCCeeeeCCCCEEEEEEECCCCcEEEEEEEECCC
Confidence 467889988987777652 2 1347777776532 234555554333 22234555555554443
No 108
>PRK14631 hypothetical protein; Provisional
Probab=21.47 E-value=53 Score=29.15 Aligned_cols=14 Identities=29% Similarity=0.472 Sum_probs=12.6
Q ss_pred CeeEEEEEEecCcc
Q 023727 157 AGTHLIEVAAIGYF 170 (278)
Q Consensus 157 ~GsY~LeVss~gy~ 170 (278)
+|.|+|+|+|+|..
T Consensus 90 ~~~Y~LEVSSPGld 103 (174)
T PRK14631 90 SGEYALEVSSPGWD 103 (174)
T ss_pred CCCeEEEEeCCCCC
Confidence 68999999999974
No 109
>PRK14636 hypothetical protein; Provisional
Probab=21.45 E-value=54 Score=29.12 Aligned_cols=15 Identities=20% Similarity=0.120 Sum_probs=13.0
Q ss_pred CCeeEEEEEEecCcc
Q 023727 156 SAGTHLIEVAAIGYF 170 (278)
Q Consensus 156 P~GsY~LeVss~gy~ 170 (278)
-++.|+|+|+|+|..
T Consensus 70 i~~~Y~LEVSSPGld 84 (176)
T PRK14636 70 IEDAYRLEVSSPGID 84 (176)
T ss_pred CCCCeEEEEeCCCCC
Confidence 369999999999975
No 110
>PRK14645 hypothetical protein; Provisional
Probab=21.39 E-value=61 Score=28.16 Aligned_cols=17 Identities=24% Similarity=0.170 Sum_probs=13.9
Q ss_pred cCCCeeEEEEEEecCcc
Q 023727 154 NMSAGTHLIEVAAIGYF 170 (278)
Q Consensus 154 nVP~GsY~LeVss~gy~ 170 (278)
+.-+|.|.|+|+|+|..
T Consensus 72 d~i~~~Y~LEVSSPGld 88 (154)
T PRK14645 72 DPIEGEYRLEVESPGPK 88 (154)
T ss_pred ccCCCceEEEEeCCCCC
Confidence 33469999999999975
No 111
>PRK14638 hypothetical protein; Provisional
Probab=21.34 E-value=63 Score=27.84 Aligned_cols=17 Identities=18% Similarity=0.223 Sum_probs=13.8
Q ss_pred cCCCeeEEEEEEecCcc
Q 023727 154 NMSAGTHLIEVAAIGYF 170 (278)
Q Consensus 154 nVP~GsY~LeVss~gy~ 170 (278)
++-++.|.|+|+|+|..
T Consensus 70 d~i~~~Y~LEVSSPGld 86 (150)
T PRK14638 70 DLIEHSYTLEVSSPGLD 86 (150)
T ss_pred cccCCceEEEEeCCCCC
Confidence 33369999999999974
No 112
>PRK02001 hypothetical protein; Validated
Probab=21.27 E-value=61 Score=28.16 Aligned_cols=14 Identities=21% Similarity=0.271 Sum_probs=12.3
Q ss_pred CeeEEEEEEecCcc
Q 023727 157 AGTHLIEVAAIGYF 170 (278)
Q Consensus 157 ~GsY~LeVss~gy~ 170 (278)
.+.|.|+|+|+|..
T Consensus 63 d~~Y~LEVSSPGld 76 (152)
T PRK02001 63 EEDFELEVGSAGLT 76 (152)
T ss_pred CCCeEEEEeCCCCC
Confidence 38999999999974
No 113
>PF03443 Glyco_hydro_61: Glycosyl hydrolase family 61; InterPro: IPR005103 O-Glycosyl hydrolases 3.2.1. from EC are a widespread group of enzymes that hydrolyse the glycosidic bond between two or more carbohydrates, or between a carbohydrate and a non-carbohydrate moiety. A classification system for glycosyl hydrolases, based on sequence similarity, has led to the definition of 85 different families [, ]. This classification is available on the CAZy (CArbohydrate-Active EnZymes) web site. The only known activity within this family is that of endoglucanase (3.2.1.4 from EC) GH61 from CAZY ; PDB: 4EIS_B 2VTC_A 4EIR_B 3EJA_D 3EII_A.
Probab=21.21 E-value=70 Score=29.16 Aligned_cols=25 Identities=24% Similarity=0.513 Sum_probs=16.6
Q ss_pred CCccEEE--c-cCCCeeEEEEEEecCcc
Q 023727 146 PDGYFSF--Q-NMSAGTHLIEVAAIGYF 170 (278)
Q Consensus 146 ~dG~F~f--~-nVP~GsY~LeVss~gy~ 170 (278)
..|.++| . ++|+|.|+|....+...
T Consensus 134 ~~~~~~~~IP~~l~~G~YLlR~E~IaLH 161 (218)
T PF03443_consen 134 NNGSWTFTIPKNLPPGQYLLRHEIIALH 161 (218)
T ss_dssp TTCEEEEE--TTBBSEEEEEEEEEEE-T
T ss_pred cCCceEEEeCCCCCCCCceEEecceeec
Confidence 4455544 4 78899999998876554
No 114
>PRK14630 hypothetical protein; Provisional
Probab=21.01 E-value=59 Score=27.86 Aligned_cols=15 Identities=13% Similarity=0.049 Sum_probs=13.2
Q ss_pred CeeEEEEEEecCcce
Q 023727 157 AGTHLIEVAAIGYFF 171 (278)
Q Consensus 157 ~GsY~LeVss~gy~F 171 (278)
++.|.|||+|+|..-
T Consensus 70 ~~~Y~LEVSSPGldR 84 (143)
T PRK14630 70 KYNFSLEISTPGINR 84 (143)
T ss_pred CCCeEEEEeCCCCCC
Confidence 699999999999753
No 115
>PF00630 Filamin: Filamin/ABP280 repeat; InterPro: IPR017868 The many different actin cross-linking proteins share a common architecture, consisting of a globular actin-binding domain and an extended rod. Whereas their actin-binding domains consist of two calponin homology domains (see IPR001715 from INTERPRO), their rods fall into three families. The rod domain of the family including the Dictyostelium discoideum (Slime mould) gelation factor (ABP120) and human filamin (ABP280) is constructed from tandem repeats of a 100-residue motif that is glycine and proline rich []. The gelation factor's rod contains 6 copies of the repeat, whereas filamin has a rod constructed from 24 repeats. The resolution of the 3D structure of rod repeats from the gelation factor has shown that they consist of a beta-sandwich, formed by two beta-sheets arranged in an immunoglobulin-like fold [, ]. Because conserved residues that form the core of the repeats are preserved in filamin, the repeat structure should be common to the members of the gelation factor/filamin family. The head to tail homodimerisation is crucial to the function of the ABP120 and ABP280 proteins. This interaction involves a small portion at the distal end of the rod domains. For the gelation factor it has been shown that the carboxy-terminal repeat 6 dimerises through a double edge-to-edge extension of the beta-sheet and that repeat 5 contributes to dimerisation to some extent [, , ].; PDB: 2DI9_A 2EEC_A 2DIC_A 2EEA_A 2DMC_A 2EE9_A 2D7O_A 2D7N_A 2K7P_A 2NQC_A ....
Probab=20.59 E-value=93 Score=23.51 Aligned_cols=25 Identities=24% Similarity=0.299 Sum_probs=18.2
Q ss_pred ecCCccEEEccCC--CeeEEEEEEecC
Q 023727 144 LRPDGYFSFQNMS--AGTHLIEVAAIG 168 (278)
Q Consensus 144 ~~~dG~F~f~nVP--~GsY~LeVss~g 168 (278)
-+.||+|.+.=.| +|.|.|.|..-+
T Consensus 66 ~~~~G~y~v~y~p~~~G~y~i~V~~~g 92 (101)
T PF00630_consen 66 DNGDGTYTVSYTPTEPGKYKISVKING 92 (101)
T ss_dssp EESSSEEEEEEEESSSEEEEEEEEESS
T ss_pred ECCCCEEEEEEEeCccEeEEEEEEECC
Confidence 4457888776665 488888888766
No 116
>PF13677 MotB_plug: Membrane MotB of proton-channel complex MotA/MotB
Probab=20.54 E-value=1.9e+02 Score=20.90 Aligned_cols=18 Identities=17% Similarity=0.392 Sum_probs=12.6
Q ss_pred cccCCCHHHHHHHHHHHH
Q 023727 244 LMENMDPEEMRRAQEEMR 261 (278)
Q Consensus 244 Lme~mDPEe~ke~qeem~ 261 (278)
-|..+|.+..+++.+.++
T Consensus 37 s~s~~d~~k~~~~~~s~~ 54 (58)
T PF13677_consen 37 SMSSVDKEKFEEVAQSFQ 54 (58)
T ss_pred HHHhCCHHHHHHHHHHHH
Confidence 456679998777766554
No 117
>PRK14633 hypothetical protein; Provisional
Probab=20.35 E-value=69 Score=27.58 Aligned_cols=17 Identities=24% Similarity=0.391 Sum_probs=14.1
Q ss_pred cCCCeeEEEEEEecCcc
Q 023727 154 NMSAGTHLIEVAAIGYF 170 (278)
Q Consensus 154 nVP~GsY~LeVss~gy~ 170 (278)
++-++.|.|+|+|+|..
T Consensus 64 d~i~~~Y~LEVSSPGld 80 (150)
T PRK14633 64 DPVSGKYILEVSSPGMN 80 (150)
T ss_pred cCCCCCeEEEEeCCCCC
Confidence 44479999999999975
Done!