Query         gi|254780160|ref|YP_003064573.1| hypothetical protein CLIBASIA_00215 [Candidatus Liberibacter asiaticus str. psy62]
Match_columns 298
No_of_seqs    193 out of 319
Neff          8.3
Searched_HMMs 39220
Date          Sun May 22 22:56:23 2011
Command       /home/congqian_1/programs/hhpred/hhsearch -i 254780160.hhm -d /home/congqian_1/database/cdd/Cdd.hhm

 No Hit                             Prob E-value P-value  Score    SS Cols Query HMM  Template HMM
  1 COG4932 Predicted outer membra 100.0 2.5E-30 6.5E-35  214.2  26.0  266    21-288  953-1245(1531)
  2 COG4932 Predicted outer membra 100.0 2.3E-30 5.9E-35  214.5  25.7  284     4-290  1116-1433(1531)
  3 KOG1948 consensus               99.5 2.6E-10 6.6E-15   84.7  23.1  217    54-276  134-384 (1165)
  4 KOG1948 consensus               99.1 3.8E-08 9.6E-13   70.8  19.5  204    51-263  754-966 (1165)
  5 pfam05738 Cna_B Cna protein B-  97.9 3.1E-05 7.9E-10   51.9   6.7   48    54-102  2-49 (69)
  6 pfam05738 Cna_B Cna protein B-  97.9 4.3E-05 1.1E-09   51.0   7.4   59   141-199  2-65 (69)
  7 cd03863 M14_CPD_II The second   97.9 0.00016   4E-09   47.3   9.4   75   211-286  299-374 (375)
  8 cd03864 M14_CPN Peptidase M14   97.8 0.00022 5.5E-09   46.4   8.8   75   211-286  317-392 (392)
  9 cd03865 M14_CPE_H Peptidase M1  97.7 0.00035   9E-09   45.0   8.5   75   211-286  327-402 (402)
 10 cd06245 M14_CPD_III The third   97.7 0.00047 1.2E-08   44.3   9.0   75   211-287  288-363 (363)
 11 cd03868 M14_CPD_I The first ca  97.5 0.00093 2.4E-08   42.3   8.4   74   211-285  297-372 (372)
 12 cd03867 M14_CPZ Peptidase M14-  97.4  0.0012 3.1E-08   41.6   7.8   69   211-280  319-388 (395)
 13 KOG2649 consensus               97.3   0.002 5.1E-08   40.2   8.6   81   211-292  379-460 (500)
 14 cd03869 M14_CPX_like Peptidase  97.3  0.0022 5.5E-08   40.0   8.5   73   211-284  330-404 (405)
 15 cd03858 M14_CP_N-E_like Carbox  97.2  0.0039   1E-07   38.3   8.8   73   211-284  299-373 (374)
 16 cd03864 M14_CPN Peptidase M14   96.9   0.015 3.9E-07   34.5   9.4   78   123-207  315-392 (392)
 17 pfam07210 DUF1416 Protein of u  96.8    0.01 2.6E-07   35.6   8.3   69   214-284  13-85 (86)
 18 cd06245 M14_CPD_III The third   96.8   0.017 4.5E-07   34.1   9.1   77   123-207  286-362 (363)
 19 cd03863 M14_CPD_II The second   96.5   0.039   1E-06   31.8   9.6   78   124-207  297-374 (375)
 20 cd03865 M14_CPE_H Peptidase M1  96.5   0.031   8E-07   32.4   9.0   77   124-207  326-402 (402)
 21 pfam08308 PEGA PEGA domain. Th  96.5   0.025 6.5E-07   33.0   8.6   58   225-287  11-69 (71)
 22 cd03866 M14_CPM Peptidase M14   96.5   0.015 3.8E-07   34.5   7.1   65   211-276  296-363 (376)
 23 cd03869 M14_CPX_like Peptidase  96.0   0.087 2.2E-06   29.6   9.1   76   124-206  329-405 (405)
 24 cd03858 M14_CP_N-E_like Carbox  96.0   0.095 2.4E-06   29.3   9.3   76   124-206  298-374 (374)
 25 cd03868 M14_CPD_I The first ca  95.7    0.12 3.2E-06   28.6   8.7   76   124-206  296-372 (372)
 26 cd03867 M14_CPZ Peptidase M14-  95.5    0.13 3.3E-06   28.5   8.4   77   123-206  317-395 (395)
 27 cd03866 M14_CPM Peptidase M14   94.3    0.18 4.6E-06   27.5   6.4   69   124-197  295-363 (376)
 28 pfam08308 PEGA PEGA domain. Th  93.9    0.39 9.9E-06   25.4   7.4   46   161-207  23-68 (71)
 29 pfam02369 Big_1 Bacterial Ig-l  92.2    0.74 1.9E-05   23.6   7.0   61     38-99  18-83 (93)
 30 pfam11589 DUF3244 Protein of u  92.2    0.61 1.5E-05   24.1   6.3   54   221-274  44-106 (106)
 31 pfam08400 phage_tail_N Prophag  91.6    0.87 2.2E-05   23.1   9.3   73    39-111  3-80 (134)
 32 smart00634 BID_1 Bacterial Ig-  90.8       1 2.6E-05   22.6   7.4   61    38-101  19-86 (92)
 33 cd03459 3,4-PCD Protocatechuat  89.6     1.3 3.3E-05   22.0   6.4   61     33-93  10-87 (158)
 34 pfam10670 NikM Nickel uptake s  89.5     1.3 3.3E-05   21.9   7.6   52   210-263  156-216 (219)
 35 pfam11008 DUF2846 Protein of u  89.2     1.3 3.3E-05   22.0   5.7   18   162-179  57-74 (112)
 36 TIGR02422 protocat_beta protoc  88.3    0.59 1.5E-05   24.2   3.5   21     73-93  116-136 (224)
 37 KOG0518 consensus               87.9     1.7 4.3E-05   21.3  16.1   55    56-111  392-450 (1113)
 38 pfam00775 Dioxygenase_C Dioxyg  86.2     2.1 5.3E-05   20.6   6.0   59     35-93  26-97 (181)
 39 COG1470 Predicted membrane pro  85.9     2.1 5.5E-05   20.6  21.0   38   218-257  407-444 (513)
 40 pfam09430 DUF2012 Protein of u  85.4     1.8 4.7E-05   21.0   4.7   37    75-111  23-60 (113)
 41 cd03463 3,4-PCD_alpha Protocat  85.3     2.3 5.9E-05   20.4   6.7   58     35-92  33-106 (185)
 42 pfam00576 Transthyretin HIUase  84.7     2.4 6.2E-05   20.2   5.3   41     54-94  17-63 (111)
 43 cd03464 3,4-PCD_beta Protocate  81.6     3.2 8.2E-05   19.4   6.2   59     35-93  62-137 (220)
 44 pfam07495 Y_Y_Y Y_Y_Y domain.   80.6     3.5 8.9E-05   19.2   6.1   12   169-180  34-45 (65)
 45 cd03462 1,2-CCD chlorocatechol  80.6     3.5 8.9E-05   19.2   5.7   20   129-148  103-122 (247)
 46 cd00421 intradiol_dioxygenase   80.5     3.5   9E-05   19.2   6.6   60     34-93  7-80 (146)
 47 cd03461 1,2-HQD Hydroxyquinol   80.4     3.5   9E-05   19.2   6.1   20   129-148  124-143 (277)
 48 pfam01835 A2M_N MG2 domain. Th  79.2     3.9 9.8E-05   18.9   8.7   47   227-273  36-95 (95)
 49 COG3485 PcaH Protocatechuate 3  78.5     4.1  0.0001   18.8   6.6   25     37-61  71-95 (226)
 50 pfam04234 CopC Copper resistan  76.3     4.7 0.00012   18.4   6.5   25   250-274  91-118 (120)
 51 cd03458 Catechol_intradiol_dio  75.9     4.8 0.00012   18.3   5.8   21   128-148  107-127 (256)
 52 cd05469 Transthyretin_like Tra  74.9     5.1 0.00013   18.1   5.7   51     43-94  7-63 (113)
 53 cd05894 Ig_C5_MyBP-C C5 immuno  74.4     5.2 0.00013   18.1   6.8   71   191-274  4-85 (86)
 54 cd03460 1,2-CTD Catechol 1,2 d  73.7     5.4 0.00014   18.0   6.0   62   128-190  127-193 (282)
 55 pfam01186 Lysyl_oxidase Lysyl   72.3     3.1 7.9E-05   19.5   2.5   29   165-193  152-180 (205)
 56 cd05822 TLP_HIUase HIUase (5-h  71.0     6.3 0.00016   17.6   5.4   54     41-95  5-64 (112)
 57 COG2351 Transthyretin-like pro  70.7     6.3 0.00016   17.5   6.6   51   133-183  17-76 (124)
 58 pfam01105 EMP24_GP25L emp24/gp  70.5     6.4 0.00016   17.5   9.9   86   191-278  3-98 (177)
 59 PRK10378 hypothetical protein;  70.2     6.5 0.00017   17.5   5.8   11   168-178  91-101 (374)
 60 PRK12813 flgD flagellar basal   66.3     7.8  0.0002   16.9   9.0   14   168-181  160-173 (223)
 61 pfam03272 Enhancin Viral enhan  64.5     8.4 0.00021   16.7   5.9   76     21-99  426-502 (775)
 62 cd05821 TLP_Transthyretin Tran  63.9     8.6 0.00022   16.7   5.6   22     73-94  44-69 (121)
 63 KOG3287 consensus               62.9       9 0.00023   16.5   7.9   72   189-263  35-113 (236)
 64 pfam08842 DUF1812 Protein of u  61.8     9.4 0.00024   16.4   5.3   20    82-101  66-85 (293)
 65 PRK12812 flgD flagellar basal   61.1     9.7 0.00025   16.3   7.9   15   169-183  181-195 (259)
 66 TIGR02438 catachol_actin catec  60.5     9.9 0.00025   16.3   3.3   33   155-188  182-214 (287)
 67 cd00222 CollagenBindB Collagen  60.4     9.9 0.00025   16.3  17.2  133    53-190  22-172 (187)
 68 smart00720 calpain_III calpain  59.9      10 0.00026   16.2   3.2   27   249-275  108-134 (143)
 69 pfam06488 L_lac_phage_MSP Lact  58.4      11 0.00027   16.0   5.0   27   240-266  262-289 (301)
 70 COG4640 Predicted membrane pro  57.9      11 0.00028   16.0   3.9   30    74-103  201-232 (465)
 71 pfam11797 DUF3324 Protein of u  57.8      11 0.00028   16.0   7.5   26   250-275  102-131 (140)
 72 cd03457 intradiol_dioxygenase_  57.3      11 0.00029   15.9   4.5   55     38-92  26-100 (188)
 73 PRK06655 flgD flagellar basal   55.6      12  0.0003   15.7   9.4   36   169-205  167-202 (225)
 74 KOG1692 consensus               55.5      12 0.00031   15.7   7.1   69   223-291  53-128 (201)
 75 cd04975 Ig4_SCFR_like Fourth i  55.4      12 0.00031   15.7   8.2   86   190-275  11-99 (101)
 76 smart00095 TR_THY Transthyreti  55.3      12 0.00031   15.7   5.8   21     74-94  42-66 (121)
 77 PRK09619 flgD flagellar basal   54.9      12 0.00031   15.7   8.4   44   169-215  161-204 (220)
 78 pfam03443 Glyco_hydro_61 Glyco  47.5      16 0.00041   14.9   3.5   12   169-180  161-172 (234)
 79 PRK10301 hypothetical protein;  47.1      16 0.00041   14.9   6.1   24   250-273  95-123 (124)
 80 pfam11138 DUF2911 Protein of u  47.1      16 0.00041   14.9   3.6   60     31-96  14-73 (145)
 81 TIGR02439 catechol_proteo cate  46.6      16 0.00042   14.8   5.5   30     35-64  126-155 (288)
 82 TIGR03503 conserved hypothetic  46.0      17 0.00043   14.8  16.0   94   168-264  179-288 (374)
 83 cd05864 Ig2_VEGFR-2 Second imm  45.8      17 0.00043   14.8   2.6   22   252-273  45-66 (70)
 84 cd05860 Ig4_SCFR Fourth immuno  45.4      17 0.00044   14.7   9.3   86   189-275  10-99 (101)
 85 TIGR03000 plancto_dom_1 Planct  44.8      18 0.00045   14.7   6.2   37   168-204  38-81 (81)
 86 pfam01060 DUF290 Transthyretin  43.7      18 0.00047   14.5   2.9   10     54-63  12-21 (80)
 87 COG3656 Predicted periplasmic   41.6      20  0.0005   14.3   3.1   29   166-194  116-147 (172)
 88 cd05737 Ig_Myomesin_like_C C-t  41.5      20  0.0005   14.3   9.0   74   188-274  7-91 (92)
 89 cd05748 Ig_Titin_like Immunogl  41.2      20 0.00051   14.3   7.0   23   252-274  51-73 (74)
 90 COG5266 CbiK ABC-type Co2+ tra  38.9      22 0.00055   14.1   7.2   54   208-263  171-241 (264)
 91 pfam01067 Calpain_III Calpain   38.6      22 0.00056   14.0   3.5   10   169-178  107-116 (139)
 92 pfam07523 Big_3 Bacterial Ig-l  38.5      22 0.00056   14.0   2.5   13     87-99  47-59 (68)
 93 TIGR02656 cyanin_plasto plasto  38.3      18 0.00045   14.6   1.5   80   189-274  16-102 (102)
 94 cd05891 Ig_M-protein_C C-termi  37.6      23 0.00058   14.0   8.1   23   251-273  68-90 (92)
 95 PRK10689 transcription-repair   36.8      23 0.00059   13.9   3.0   22   248-269  474-495 (1148)
 96 PRK12633 flgD flagellar basal   36.3      24  0.0006   13.8   9.4   36   169-205  169-205 (230)
 97 cd04971 Ig_TrKABC_d5 Fifth dom  35.5      24 0.00062   13.7   5.7   22   251-272  56-77 (81)
 98 cd04976 Ig2_VEGFR Second immun  34.7      25 0.00064   13.7   3.6   22   252-273  46-67 (71)
 99 TIGR02962 hdxy_isourate hydrox  34.0      26 0.00066   13.6   2.9   53   132-184  8-70 (117)
100 cd05859 Ig4_PDGFR-alpha Fourth  32.9      27 0.00068   13.5   8.2   85   190-275  11-99 (101)
101 PRK05842 flgD flagellar basal   31.6      28 0.00072   13.3   9.8   43   168-211  206-250 (269)
102 cd05747 Ig5_Titin_like M5, fif  31.3      28 0.00073   13.3   7.8   71   190-273  11-91 (92)
103 COG4315 Uncharacterized protei  30.9      28 0.00071   13.4   1.5   16   250-265  87-102 (138)
104 cd00214 Calpain_III Calpain, s  30.4      29 0.00075   13.2   2.9   12   168-179  113-124 (150)
105 cd05863 Ig2_VEGFR-3 Second imm  30.1      30 0.00076   13.2   2.7   21   252-272  42-62 (67)
106 pfam09912 DUF2141 Uncharacteri  29.7      30 0.00077   13.1   5.1   14     85-98  48-61 (111)
107 cd02858 Esterase_N_term Estera  29.6      30 0.00077   13.1   4.3   23     71-93  32-56 (85)
108 PHA02358 hypothetical protein   29.3      31 0.00078   13.1   2.8   28    77-104  28-56 (194)
109 pfam09829 DUF2057 Uncharacteri  27.3      33 0.00085   12.9   6.7   79   167-246  29-111 (189)
110 cd04972 Ig_TrkABC_d4 Fourth do  26.0      35  0.0009   12.7   6.7   69   190-274  8-89 (90)
111 cd05855 Ig_TrkB_d5 Fifth domai  25.7      36 0.00091   12.7   5.9   22   251-272  54-75 (79)
112 pfam12580 TPPII Tripeptidyl pe  24.6      37 0.00095   12.5   4.4   11   168-178  107-117 (194)
113 TIGR02837 spore_II_R stage II   24.3      36 0.00092   12.6   1.1   13   246-258  127-139 (172)
114 COG4704 Uncharacterized protei  23.2      40   0.001   12.4   5.1   10   168-177  82-91 (151)
115 TIGR02423 protocat_alph protoc  22.8      40   0.001   12.3   5.1   25     37-61  40-64 (203)
116 PRK12634 flgD flagellar basal   22.7      40   0.001   12.3  11.1   35   169-204  162-197 (221)
117 pfam09116 gp45-slide_C gp45 sl  22.6      41   0.001   12.3   7.0   10   170-179  76-85 (112)
118 pfam10528 PA14_2 GLEYA domain.  22.5      41   0.001   12.3   8.3   50   197-247  57-107 (112)
119 pfam10289 consensus             22.5      41   0.001   12.3   8.9   78   192-276  9-95 (95)
120 COG2373 Large extracellular al  21.9      42  0.0011   12.2  20.5   23   252-274  467-491 (1621)
121 pfam10794 DUF2606 Protein of u  21.0      44  0.0011   12.1   6.4   57   218-274  50-118 (131)
122 cd05733 Ig6_L1-CAM_like Sixth   20.9      44  0.0011   12.1   5.0   52   219-273  7-76 (77)
123 cd05722 Ig1_Neogenin First imm  20.6      45  0.0011   12.0   6.6   55   220-274  24-94 (95)
124 COG1843 FlgD Flagellar hook ca  20.1      46  0.0012   12.0   6.7   17   169-185  164-180 (222)

No 1
>COG4932 Predicted outer membrane protein [Cell envelope biogenesis, outer membrane]
Probab=100.00 E-value=2.5e-30 Score=214.18 Aligned_cols=266 Identities=16% Similarity=0.200 Sum_probs=208.9

Q ss_pred ECCCCCCCCCHHEEECCEEEEEEEECCCCCCCCCCEEEEEEECCCCCCCEEEEEEECCCCEEEECCCCCEEEEEEEECCC
Q ss_conf 30331467700023023279999986887753467599999538887620588982468566302267218999961577
Q gi|254780160|r 21 LNNNISKGKGKRVVDAQRITCEARLTENSTSIDSGVSWHIFDSISNKKNTLSTTKKIIGGKVSFDLFPGDYLISASFGHV 100 (298)
Q Consensus 21 ~~~~~~~g~~~~~v~~~~i~l~a~~~~~~~~~~~G~~~~vy~~~~~~~g~~~~tt~~~G~~~~~~L~pG~Y~v~~s~g~~ 100 (298)
..-.++.|++.+....+++++......+.+..++|+.|.+|.+..+..++ .++|+++|.....+|..|+|.+.++.+|.
T Consensus 953 v~vp~sgGsGsGsgt~GSleiTKvDka~~~k~LeGA~F~Lyd~~Ge~~~r-eitT~~dGkl~~~nL~~~dY~LiETkAPt 1031 (1531) T COG4932 953 VAVPFSGGSGSGSGTIGSLEITKVDKADTGKKLEGAKFQLYDSEGEKLGR-EITTDEDGKLTFDNLQYGDYKLIETKAPT 1031 (1531) T ss_pred EECCCCCCCCCCCCCCCCCEEECCCCCCCCCCCCCCEEEEEECCCCEEEE-EEEECCCCEEEECCCCCCEEEEEECCCCC T ss_conf 50477788898866425515752575433453234379999358846500-56535563364235564406888705776 Q ss_pred CCEE------EEEEECCCCEEEEEEEC--CCCEEEEEEEECCCCCCCCCEEEEEEECCCCCEE--EEEEECCCCCEEEEE Q ss_conf 7527------89980686136787513--6643799986157887775518999983897435--667505675135210 Q gi|254780160|r 101 GVVK------KITVSSKEKNQKQVFIL--NAGGIRLYSIYKPGSPIVDDELTFSIYSNPNHKA--LLITDKVRSGTLVRL 170 (298) Q Consensus 101 ~~~~------~vtV~~~~~~~~~~~~~--~ag~~~~~~~~~~~~~~~~~~~~f~i~~~~~~~~--~~~t~~~~~~~~~~L 170 (298) +|.. .++|............+ ..|++.+.++++. .+..++++.|++...++... .+.++..|...+.+| T Consensus 1032 GY~l~~~dgkeiTI~~sg~Ei~vtkeN~~~~g~V~L~K~D~a-t~~~LaGA~FeLQdk~G~~l~enL~TD~~G~v~itdL 1110 (1531) T COG4932 1032 GYTLDYKDGKEITISASGKEIFVTKENEAKKGSVQLTKKDSA-TGATLAGAEFELQDKDGNTLQENLTTDEDGKVEITDL 1110 (1531) T ss_pred CEEECCCCCCEEEEECCCCEEEEEECCCCCCCCEEEEEECCC-CCCCCCCCEEEEEECCCCCHHHHCCCCCCCCEEECCC T ss_conf 514125766379984477256775211002563268873154-3240037457875045864132120166674786233 Q ss_pred CCCCEEEEEEECCCCCEEEE---EEEEC--CCCEEEEEEEE--CCCEEEEEEEECCCCCCCCCEEEEEECCCCCEE---- Q ss_conf 67617999961577854454---48842--88637899993--452389999834788632760899983899797---- Q gi|254780160|r 171 GTNNYQITSHYGKYNAIVST---VVKVE--PGKIIDVTIQN--RAAKITFKLVSEMGGEAVADTAWSILTASGDTV---- 239 (298) Q Consensus 171 ~~G~Y~v~et~a~~~~~~~~---~i~V~--~g~~~~~tv~~--~~~~~~~~~v~~~~G~~l~ga~~~i~~~~g~~v---- 239 (298) .||+|+|+|++||.||.+.+ .|+|+ .++...++-.+ ..|.+.+.++|+..+.+|+||.|+|++.+|..| T Consensus 1111 aPGDYqfVEtkAPtGY~LdatPV~FtI~eeq~e~~~vtKeN~~~~GsvqLtK~Ds~t~a~LaGA~Fel~d~dG~~VqegL 1190 (1531) T COG4932 1111 APGDYQFVETKAPTGYILDATPVNFTISEEQDEAAKVTKENTLKPGSVQLTKVDSATKATLAGAEFELQDEDGTLVQEGL 1190 (1531) T ss_pred 
CCCCEEEEEECCCCEEEECCCCCEEEEECCCCCEEEEEECCCCCCCCEEEEEECCCCCCCCCCCEEEEECCCCCEEECCC T ss_conf 67740237822774047147652048512577405885403435540699973154456025747998727786752153 Q ss_pred -EECCCCEEECCCCCEEEEEEEECCCCEEEE-----EEEEEECCCCEEEEEEECH Q ss_conf -421252121244870289999538824677-----7999728850489997410 Q gi|254780160|r 240 -GESANASPSMVLSEGDYTVIARNKERNYSR-----EFSVLTGKSTIVEVLMRQK 288 (298) Q Consensus 240 -t~~~G~~~~~~L~~G~Y~v~a~~~~~~y~~-----~ftV~~g~~~~veV~~~~~ 288 (298) +|.+|+....+|+||+|+|++.++|.+|.. +|+|+.+++..+.|...+. T Consensus 1191 tTD~nG~i~VtdL~PGdYqFVETkAP~GY~LdatP~~FtI~~~q~ev~~V~~en~ 1245 (1531) T COG4932 1191 TTDENGKINVTDLAPGDYQFVETKAPTGYILDATPTPFTIEFNQEEVVKVVKENT 1245 (1531) T ss_pred EECCCCCEEECCCCCCCEEEEEECCCCCEEECCCCCEEEEECCCCCEEEEEECCC T ss_conf 0057884886234786213466048764062156630388605642068851056 No 2 >COG4932 Predicted outer membrane protein [Cell envelope biogenesis, outer membrane] Probab=100.00 E-value=2.3e-30 Score=214.45 Aligned_cols=284 Identities=16% Similarity=0.203 Sum_probs=210.8 Q ss_pred EEEEEECCCCCCCCEEEECCCCC-------CCCCHHEEECCEEEEEEEECCCCCCCCCCEEEEEEECCCCCCCEEEEEEE Q ss_conf 55433136665320465303314-------67700023023279999986887753467599999538887620588982 Q gi|254780160|r 4 SFSSYAKAENMHTTSTLLNNNIS-------KGKGKRVVDAQRITCEARLTENSTSIDSGVSWHIFDSISNKKNTLSTTKK 76 (298) Q Consensus 4 ~~~~~~~~~~~~~~~~~~~~~~~-------~g~~~~~v~~~~i~l~a~~~~~~~~~~~G~~~~vy~~~~~~~g~~~~tt~ 76 (298) .|.+-+.+-+|..-+...+..|+ +-..+.+...+++.|..+... ....++||.|+|-+.. 
|...+-.|+|+ T Consensus 1116 qfVEtkAPtGY~LdatPV~FtI~eeq~e~~~vtKeN~~~~GsvqLtK~Ds~-t~a~LaGA~Fel~d~d-G~~VqegLtTD 1193 (1531) T COG4932 1116 QFVETKAPTGYILDATPVNFTISEEQDEAAKVTKENTLKPGSVQLTKVDSA-TKATLAGAEFELQDED-GTLVQEGLTTD 1193 (1531) T ss_pred EEEEECCCCEEEECCCCCEEEEECCCCCEEEEEECCCCCCCCEEEEEECCC-CCCCCCCCEEEEECCC-CCEEECCCEEC T ss_conf 237822774047147652048512577405885403435540699973154-4560257479987277-86752153005 Q ss_pred CCCCEEEECCCCCEEEEEEEECCCCCE-----EEEEEECCCCEEEEEEE---CCCCEEEEEEEECCCCCCCCCEEEEEEE Q ss_conf 468566302267218999961577752-----78998068613678751---3664379998615788777551899998 Q gi|254780160|r 77 IIGGKVSFDLFPGDYLISASFGHVGVV-----KKITVSSKEKNQKQVFI---LNAGGIRLYSIYKPGSPIVDDELTFSIY 148 (298) Q Consensus 77 ~~G~~~~~~L~pG~Y~v~~s~g~~~~~-----~~vtV~~~~~~~~~~~~---~~ag~~~~~~~~~~~~~~~~~~~~f~i~ 148 (298) .+|...+.+|.||+|.+.++.+|.+|. ..++|...+.....+.. ...|...+.+++. ...+.+.++.|.+. T Consensus 1194 ~nG~i~VtdL~PGdYqFVETkAP~GY~LdatP~~FtI~~~q~ev~~V~~en~~~pgsv~L~k~d~-~~~~~l~~a~fkl~ 1272 (1531) T COG4932 1194 ENGKINVTDLAPGDYQFVETKAPTGYILDATPTPFTIEFNQEEVVKVVKENTAIPGSVVLTKKDS-DTGAALSGAEFKLL 1272 (1531) T ss_pred CCCCEEECCCCCCCEEEEEECCCCCEEECCCCCEEEEECCCCCEEEEEECCCCCCCCCEEECCCC-CCCCCCCCCCEEEE T ss_conf 78848862347862134660487640621566303886056420688510566899725452578-76553588740556 Q ss_pred CCCCCE--EEEEEECCCCCEEEEECCCCEEEEEEECCCCCEEEE---EEEECCCC--EEEEEEEEC--CCEEEEEEEECC Q ss_conf 389743--566750567513521067617999961577854454---48842886--378999934--523899998347 Q gi|254780160|r 149 SNPNHK--ALLITDKVRSGTLVRLGTNNYQITSHYGKYNAIVST---VVKVEPGK--IIDVTIQNR--AAKITFKLVSEM 219 (298) Q Consensus 149 ~~~~~~--~~~~t~~~~~~~~~~L~~G~Y~v~et~a~~~~~~~~---~i~V~~g~--~~~~tv~~~--~~~~~~~~v~~~ 219 (298) ++++.. 
-.+.++..|...+.+|.||.|+|+|++||.||.+.+ .|+++..| ...+++.++ .+.+.+.+++.+ T Consensus 1273 ~~eg~~vqe~L~td~~Gei~v~dlkpGdyqfVETkAp~Gy~L~a~pv~ftI~~~q~e~~kV~~~n~~~~gsv~l~k~d~~ 1352 (1531) T COG4932 1273 DAEGTTVQEGLTTDETGEIVVADLKPGDYQFVETKAPEGYILDATPVNFTIEFNQEEAVKVTKENDAKTGSVVLTKLDSS 1352 (1531) T ss_pred CCCCCEECCCCEECCCCCEEECCCCCCCCCCEECCCCCCEEEEECCEEEEEEECCCCCEEEEEEECCCCCCEEEEEEECC T ss_conf 47786704672405777378604688751013714776359860531579985123317999850343340789996156 Q ss_pred CCCCCCCEEEEEECCCCCE-----EEECCCCEEECCCCCEEEEEEEECCCCEEEE-----EEEEEECCCCEEEEEEECHH Q ss_conf 8863276089998389979-----7421252121244870289999538824677-----79997288504899974101 Q gi|254780160|r 220 GGEAVADTAWSILTASGDT-----VGESANASPSMVLSEGDYTVIARNKERNYSR-----EFSVLTGKSTIVEVLMRQKR 289 (298) Q Consensus 220 ~G~~l~ga~~~i~~~~g~~-----vt~~~G~~~~~~L~~G~Y~v~a~~~~~~y~~-----~ftV~~g~~~~veV~~~~~~ 289 (298) .+..|+||+|++++..|.. +++.+|+....+|+||+|.|++.++|.+|+. +|+|+.++...+.|++.+.. T Consensus 1353 ~~~~LegA~F~l~de~g~ilke~l~t~~nG~l~v~dLaPGdYqfvETkAPtgY~Ld~tpv~FTIe~~q~e~~~vt~~nk~ 1432 (1531) T COG4932 1353 SGVTLEGAEFELLDEEGNILKEGLVTDENGQLLVDDLAPGDYQFVETKAPTGYELDATPVDFTIEFNQEEALKVTKTNKL 1432 (1531) T ss_pred CCCCCCCCEEEEECCCCCEEHHCCEECCCCCEEEEECCCCCEEEEECCCCCCEECCCCCEEEEEECCCCCCEEEEEECCC T ss_conf 67510673799873668660114300788768973048874145771488642615773578987286664379861356 Q ss_pred C Q ss_conf 0 Q gi|254780160|r 290 M 290 (298) Q Consensus 290 ~ 290 (298) . 
T Consensus 1433 ~ 1433 (1531) T COG4932 1433 F 1433 (1531) T ss_pred C T ss_conf 5 No 3 >KOG1948 consensus Probab=99.45 E-value=2.6e-10 Score=84.74 Aligned_cols=217 Identities=14% Similarity=0.131 Sum_probs=135.4 Q ss_pred CCEEEEEEECCCCCCCEEEEEEECCCCEEEECCCCCEEEEEEEECCCCCE---EEEEEECCCCEEEEEEECCCCEEEEEE Q ss_conf 67599999538887620588982468566302267218999961577752---789980686136787513664379998 Q gi|254780160|r 54 SGVSWHIFDSISNKKNTLSTTKKIIGGKVSFDLFPGDYLISASFGHVGVV---KKITVSSKEKNQKQVFILNAGGIRLYS 130 (298) Q Consensus 54 ~G~~~~vy~~~~~~~g~~~~tt~~~G~~~~~~L~pG~Y~v~~s~g~~~~~---~~vtV~~~~~~~~~~~~~~ag~~~~~~ 130 (298) +|+-..+-++ .+--....|++.|.+.++++.||+|.+.++++.+... +.+.+....+ ....-.+...|..+.+ T Consensus 134 agV~velrs~---e~~iast~T~~~Gky~f~~iiPG~Yev~ashp~w~~~~ag~tvvev~~a~-~~va~~f~VsGydl~g 209 (1165) T KOG1948 134 AGVLVELRSQ---EDPIASTKTEDGGKYEFRNIIPGKYEVSASHPAWECISAGKTVVEVKNAP-VVVAPNFKVSGYDLEG 209 (1165) T ss_pred CCCEEECCCC---CCCCEEEEECCCCEEEEEECCCCCEEEECCCCCEEEEECCCEEEEECCCC-CCCCCCEEEEEEEEEE T ss_conf 6535310356---57200367259976898864998568861575236762583799967886-4447763897444488 Q ss_pred EECCCCCCCCCEEEEEEECCCCC-----------------------EEEEEEECCCCCEEEEECCCCEEEEEEECCCCCE Q ss_conf 61578877755189999838974-----------------------3566750567513521067617999961577854 Q gi|254780160|r 131 IYKPGSPIVDDELTFSIYSNPNH-----------------------KALLITDKVRSGTLVRLGTNNYQITSHYGKYNAI 187 (298) Q Consensus 131 ~~~~~~~~~~~~~~f~i~~~~~~-----------------------~~~~~t~~~~~~~~~~L~~G~Y~v~et~a~~~~~ 187 (298) .....+. |..++.+.+|..... .....++..|.+.+..+|.|.|++...+-..... 
T Consensus 210 sv~s~s~-P~~gv~~~l~s~~v~~~dvpkc~gs~ap~n~~a~e~vslc~~vsd~~G~fsfksvPsGkY~l~a~y~ge~~~ 288 (1165) T KOG1948 210 SVRSESM-PFVGVVMTLYSTSVIDLDVPKCVGSEAPLNVPATENVSLCIGVSDPRGRFSFKSVPSGKYYLAASYVGEPKS 288 (1165) T ss_pred EEECCCC-CCCCEEEEEEECCCCCCCCCCCCCCCCCCCCCCCCCEEEEEEEECCCCEEEEEECCCCCEEEEEEECCCCEE T ss_conf 9852688-535518999972323454776226778988874330356887776886079997277877997784699507 Q ss_pred EE-----EEEEECCCCEE-EEEEEECCCEEEEEEEECCCCCCCCCEEEEEECCCCCEEEECCCCEEECC-CCCEEEEEEE Q ss_conf 45-----44884288637-89999345238999983478863276089998389979742125212124-4870289999 Q gi|254780160|r 188 VS-----TVVKVEPGKII-DVTIQNRAAKITFKLVSEMGGEAVADTAWSILTASGDTVGESANASPSMV-LSEGDYTVIA 260 (298) Q Consensus 188 ~~-----~~i~V~~g~~~-~~tv~~~~~~~~~~~v~~~~G~~l~ga~~~i~~~~g~~vt~~~G~~~~~~-L~~G~Y~v~a 260 (298) +. .++.|+...+. .-.+...+..++.++++...|.+++++.+.+ |..-...|++.|-|.+++ +..|.|++.| T Consensus 289 fdvSP~~l~v~Vehd~lqi~~ef~vtgfSvtGRVl~g~~g~~l~gvvvlv-ngk~~~kTdaqGyykLen~~t~gtytI~a 367 (1165) T KOG1948 289 FDVSPNPLKVVVEHDHLQIASEFRVTGFSVTGRVLVGSKGLPLSGVVVLV-NGKSGGKTDAQGYYKLENLKTDGTYTITA 367 (1165) T ss_pred EEECCCCEEEEEECCCEECCCEEEEEEEEEEEEEEECCCCCCCCCEEEEE-CCCCCCEECCCCEEEEEEEECCCCEEEEE T ss_conf 87189852478841411225505898887510698478988766249997-37304157366528831130157289998 Q ss_pred ECCCCEEE-EEEEEEEC Q ss_conf 53882467-77999728 Q gi|254780160|r 261 RNKERNYS-REFSVLTG 276 (298) Q Consensus 261 ~~~~~~y~-~~ftV~~g 276 (298) .+.-..+. ..|.|.+. 
T Consensus 368 ~kehlqFstv~~kv~pn 384 (1165) T KOG1948 368 KKEHLQFSTVHAKVKPN 384 (1165) T ss_pred ECCCEEEEEEEEEECCC T ss_conf 42430541379995278 No 4 >KOG1948 consensus Probab=99.14 E-value=3.8e-08 Score=70.75 Aligned_cols=204 Identities=14% Similarity=0.146 Sum_probs=92.3 Q ss_pred CCCCCEEEEEEECCCCCCCEEEEEEECCCCEEEECCCC-CEEEEEEEECCCCCEEEEEEECCCCEEEEEEECCCCEEEEE Q ss_conf 53467599999538887620588982468566302267-21899996157775278998068613678751366437999 Q gi|254780160|r 51 SIDSGVSWHIFDSISNKKNTLSTTKKIIGGKVSFDLFP-GDYLISASFGHVGVVKKITVSSKEKNQKQVFILNAGGIRLY 129 (298) Q Consensus 51 ~~~~G~~~~vy~~~~~~~g~~~~tt~~~G~~~~~~L~p-G~Y~v~~s~g~~~~~~~vtV~~~~~~~~~~~~~~ag~~~~~ 129 (298) |.++|++-+++.. .+++-.....|+.+|.+...-|.. =.|.+.++. .+| .++.-............-.+.+. T Consensus 754 Palega~Ikis~k-kds~~~Iev~T~~~Gafk~GPl~~dl~yd~tA~k--egy----vft~~~~t~~sfqa~kl~~vsv~ 826 (1165) T KOG1948 754 PALEGAVIKISLK-KDSDVVIEVITNKDGAFKIGPLKRDLDYDITATK--EGY----VFTPTSPTPGSFQAVKLSQVSVK 826 (1165) T ss_pred CCCCCCEEEEEEC-CCCCEEEEEEECCCCCEEECCCCCCCCCCEEECC--CCE----EEECCCCCCCCEEEEEEEEEEEE T ss_conf 6778857999806-9985248999747785774453335544114413--746----97427998653013445678999 Q ss_pred EEECCCCCCCCCEEEEEEECCCCCEEEEEEECCCCCEEEEECCCCEEEEEEECCCCC-EEEEEEEECCCCEEEEEEEE-C Q ss_conf 861578877755189999838974356675056751352106761799996157785-44544884288637899993-4 Q gi|254780160|r 130 SIYKPGSPIVDDELTFSIYSNPNHKALLITDKVRSGTLVRLGTNNYQITSHYGKYNA-IVSTVVKVEPGKIIDVTIQN-R 207 (298) Q Consensus 130 ~~~~~~~~~~~~~~~f~i~~~~~~~~~~~t~~~~~~~~~~L~~G~Y~v~et~a~~~~-~~~~~i~V~~g~~~~~tv~~-~ 207 (298) .++..+ -|+.++-.++--+......+++.+.|-..+..|.||.|+++.-...+.. 
+....|+|.+|+..++++.- | T Consensus 827 vkdea~--q~LpgvLLSLsGg~~yRsNlvtgdng~~nf~sLsPgqyylRpmlKEykFePst~mIevkeGq~~~vvl~gkR 904 (1165) T KOG1948 827 VKDEAT--QPLPGVLLSLSGGKDYRSNLVTGDNGHKNFVSLSPGQYYLRPMLKEYKFEPSTSMIEVKEGQHENVVLKGKR 904 (1165) T ss_pred EECCCC--CCCCCEEEEEECCCCHHHCCCCCCCCEEEEEECCCCHHHHHHHHHHCCCCCCCEEEEECCCCEEEEEEEEEE T ss_conf 833678--847867999756853311441278761688634862212466787607698732699616852789999889 Q ss_pred CCEEEEEEEECCCCCCCCCEEEEEECCC-----CCEEEECCCCEEECCCCCEE-EEEEEECC Q ss_conf 5238999983478863276089998389-----97974212521212448702-89999538 Q gi|254780160|r 208 AAKITFKLVSEMGGEAVADTAWSILTAS-----GDTVGESANASPSMVLSEGD-YTVIARNK 263 (298) Q Consensus 208 ~~~~~~~~v~~~~G~~l~ga~~~i~~~~-----g~~vt~~~G~~~~~~L~~G~-Y~v~a~~~ 263 (298) .+......|..-.|+|.+|..++.+..+ -+.+++.+|.|.+.+|-||- |.+.++.. T Consensus 905 vAySayGtvssLsGdp~~gVaieA~sdn~~~y~eeattdenG~yRiRGL~Pdc~Y~V~vk~~ 966 (1165) T KOG1948 905 VAYSAYGTVSSLSGDPMKGVAIEALSDNCDLYQEEATTDENGTYRIRGLLPDCEYQVHVKSY 966 (1165) T ss_pred EEEEEEEEHHHCCCCCCCCEEEEEECCCCCCCCCCCCCCCCCCEEEECCCCCCEEEEEEEEC T ss_conf 99876225011169965572778713788766421211468738873148992699998504 No 5 >pfam05738 Cna_B Cna protein B-type domain. This domain is found in Staphylococcus aureus collagen-binding surface protein. However, this region does not mediate collagen binding, the pfam05737 region carries out that function. The structure of the repetitive B-region has been solved and forms a beta sandwich structure. It is thought that this region forms a stalk in Staphylococcus aureus collagen-binding protein that presents the ligand binding domain away from the bacterial cell surface. 
Probab=97.95 E-value=3.1e-05 Score=51.86 Aligned_cols=48 Identities=19% Similarity=0.286 Sum_probs=21.9 Q ss_pred CCEEEEEEECCCCCCCEEEEEEECCCCEEEECCCCCEEEEEEEECCCCC Q ss_conf 6759999953888762058898246856630226721899996157775 Q gi|254780160|r 54 SGVSWHIFDSISNKKNTLSTTKKIIGGKVSFDLFPGDYLISASFGHVGV 102 (298) Q Consensus 54 ~G~~~~vy~~~~~~~g~~~~tt~~~G~~~~~~L~pG~Y~v~~s~g~~~~ 102 (298) +||+|+||+........ ..+|+++|.+.+.+|+||+|+|++..++.+| T Consensus 2 ~Ga~f~L~~~~~~~~~~-~~tTd~~G~~~f~~L~~G~Y~v~E~~ap~GY 49 (69) T pfam05738 2 EGAEFTLLDNGGKVVGE-TLTTDSNGKYTFTNLPPGTYTVKETKAPAGY 49 (69) T ss_pred CCCEEEEEECCCCEEEE-EEEECCCCEEEECCCCCCEEEEEEEECCCCC T ss_conf 98199999899999986-7999999869989879965999998199984 No 6 >pfam05738 Cna_B Cna protein B-type domain. This domain is found in Staphylococcus aureus collagen-binding surface protein. However, this region does not mediate collagen binding, the pfam05737 region carries out that function. The structure of the repetitive B-region has been solved and forms a beta sandwich structure. It is thought that this region forms a stalk in Staphylococcus aureus collagen-binding protein that presents the ligand binding domain away from the bacterial cell surface. Probab=97.94 E-value=4.3e-05 Score=50.97 Aligned_cols=59 Identities=19% Similarity=0.239 Sum_probs=32.3 Q ss_pred CEEEEEEECCCCCEEE--EEEECCCCCEEEEECCCCEEEEEEECCCCCEEEE---EEEECCCCE Q ss_conf 5189999838974356--6750567513521067617999961577854454---488428863 Q gi|254780160|r 141 DELTFSIYSNPNHKAL--LITDKVRSGTLVRLGTNNYQITSHYGKYNAIVST---VVKVEPGKI 199 (298) Q Consensus 141 ~~~~f~i~~~~~~~~~--~~t~~~~~~~~~~L~~G~Y~v~et~a~~~~~~~~---~i~V~~g~~ 199 (298) .+++|.|++.++.... ..++..|...+.+|++|+|+|+|..+|.+|..+. .+++..++. 
T Consensus 2 ~Ga~f~L~~~~~~~~~~~~tTd~~G~~~f~~L~~G~Y~v~E~~ap~GY~~~~~~~~~~~~~~~~ 65 (69) T pfam05738 2 EGAEFTLLDNGGKVVGETLTTDSNGKYTFTNLPPGTYTVKETKAPAGYTLTTTPTEFTIEAGQE 65 (69) T ss_pred CCCEEEEEECCCCEEEEEEEECCCCEEEECCCCCCEEEEEEEECCCCCEECCCEEEEEECCCCE T ss_conf 9819999989999998679999998699898799659999981999849899488999948999 No 7 >cd03863 M14_CPD_II The second carboxypeptidase (CP)-like domain of Carboxypeptidase D (CPD; EC 3.4.17.22), domain II. CPD differs from all other metallocarboxypeptidases in that it contains multiple CP-like domains. CPD belongs to the N/E-like subfamily of the M14 family of metallocarboxypeptidases (MCPs).The M14 family are zinc-binding CPs which hydrolyze single, C-terminal amino acids from polypeptide chains, and have a recognition site for the free C-terminal carboxyl group, which is a key determinant of specificity. CPD is a single-chain protein containing a signal peptide, three tandem repeats of CP-like domains separated by short bridge regions, followed by a transmembrane domain, and a C-terminal cytosolic tail. The first two CP-like domains of CPD contain all of the essential active site and substrate-binding residues, while the third CP-like domain lacks critical residues necessary for enzymatic activity and is inactive towards standard CP substrates. Domain I is optimally ac Probab=97.89 E-value=0.00016 Score=47.35 Aligned_cols=75 Identities=12% Similarity=0.148 Sum_probs=61.2 Q ss_pred EEEEEEECCCCCCCCCEEEEEECCCCCEEEECCCCEEECCCCCEEEEEEEECCCCE-EEEEEEEEECCCCEEEEEEE Q ss_conf 89999834788632760899983899797421252121244870289999538824-67779997288504899974 Q gi|254780160|r 211 ITFKLVSEMGGEAVADTAWSILTASGDTVGESANASPSMVLSEGDYTVIARNKERN-YSREFSVLTGKSTIVEVLMR 286 (298) Q Consensus 211 ~~~~~v~~~~G~~l~ga~~~i~~~~g~~vt~~~G~~~~~~L~~G~Y~v~a~~~~~~-y~~~ftV~~g~~~~veV~~~ 286 (298) +.+.+.+...|+||++|.+.+.+-+-.+.+..+|.|-. -|+||.|.|.|+.+++. ..+.++|..+..+.|.++|. 
T Consensus 299 IkG~V~d~~~g~pI~~A~I~V~g~~h~v~t~~~GdywR-lL~pG~Y~v~~sa~GY~~~t~~v~V~~~~~~~v~f~L~ 374 (375) T cd03863 299 VRGFVLDATDGRGILNATISVADINHPVTTYKDGDYWR-LLVPGTYKVTASARGYDPVTKTVEVDSKGAVQVNFTLS 374 (375) T ss_pred CEEEEECCCCCCCCCCEEEEEECCCCCEEECCCCCEEE-EECCCEEEEEEEECCCCCEEEEEEECCCCCEEEEEEEE T ss_conf 72899658999888882999936666246779854799-70781489999967987656999988998189999996 No 8 >cd03864 M14_CPN Peptidase M14 Carboxypeptidase N (CPN, also known as kininase I, creatine kinase conversion factor, plasma carboxypeptidase B, arginine carboxypeptidase, and protaminase; EC 3.4.17.3) is an extracellular glycoprotein synthesized in the liver and released into the blood, where it is present in high concentrations. CPN belongs to the N/E subfamily of the M14 family of metallocarboxypeptidases (MCPs).The M14 family are zinc-binding carboxypeptidases (CPs) which hydrolyze single, C-terminal amino acids from polypeptide chains, and have a recognition site for the free C-terminal carboxyl group, which is a key determinant of specificity. CPN plays an important role in protecting the body from excessive buildup of potentially deleterious peptides that normally act as local autocrine or paracrine hormones. It specifically removes C-terminal basic residues. As CPN can cleave lysine more avidly than arginine residues it is also called lysine carboxypeptidase. CPN substrates inclu Probab=97.79 E-value=0.00022 Score=46.42 Aligned_cols=75 Identities=19% Similarity=0.209 Sum_probs=60.2 Q ss_pred EEEEEEECCCCCCCCCEEEEEECCCCCEEEECCCCEEECCCCCEEEEEEEECCCCE-EEEEEEEEECCCCEEEEEEE Q ss_conf 89999834788632760899983899797421252121244870289999538824-67779997288504899974 Q gi|254780160|r 211 ITFKLVSEMGGEAVADTAWSILTASGDTVGESANASPSMVLSEGDYTVIARNKERN-YSREFSVLTGKSTIVEVLMR 286 (298) Q Consensus 211 ~~~~~v~~~~G~~l~ga~~~i~~~~g~~vt~~~G~~~~~~L~~G~Y~v~a~~~~~~-y~~~ftV~~g~~~~veV~~~ 286 (298) ++-.+|.+..|.+|++|.+.+..-+..+.+...|.|-.. |+||+|.|.|+.+++. 
-.+.++|..++.+.|..+++ T Consensus 317 GIkG~V~D~~g~pi~~A~I~V~g~~h~v~t~~~GdywRl-L~pG~Y~vt~sa~GY~~~t~~v~V~~~~~~~vnf~L~ 392 (392) T cd03864 317 GIKGMVTDENNNGIANAVISVSGISHDVTSGTLGDYFRL-LLPGTYTVTASAPGYQPSTVTVTVGPAEATLVNFQLK 392 (392) T ss_pred CCEEEEECCCCCCCCCCEEEEECCCCCEEECCCCEEEEE-CCCCEEEEEEEECCCCCEEEEEEECCCCCEEEEEEEC T ss_conf 875999889999788828999666454221898307987-0781689999978967623799988998589988839 No 9 >cd03865 M14_CPE_H Peptidase M14 Carboxypeptidase (CP) E (CPE, also known as carboxypeptidase H, and enkephalin convertase; EC 3.4.17.10) belongs to the N/E subfamily of the M14 family of metallocarboxypeptidases (MCPs).The M14 family are zinc-binding CPs which hydrolyze single, C-terminal amino acids from polypeptide chains, and have a recognition site for the free C-terminal carboxyl group, which is a key determinant of specificity. CPE is an important enzyme responsible for the proteolytic processing of prohormone intermediates (such as pro-insulin, pro-opiomelanocortin, or pro-gonadotropin-releasing hormone) by specifically removing C-terminal basic residues. In addition, it has been proposed that the regulated secretory pathway (RSP) of the nervous and endocrine systems utilizes membrane-bound CPE as a sorting receptor. A naturally occurring point mutation in CPE reduces the stability of the enzyme and causes its degradation, leading to an accumulation of numerous neuroendocrine pe Probab=97.68 E-value=0.00035 Score=45.05 Aligned_cols=75 Identities=15% Similarity=0.193 Sum_probs=60.0 Q ss_pred EEEEEEECCCCCCCCCEEEEEECCCCCEEEECCCCEEECCCCCEEEEEEEECCCC-EEEEEEEEEECCCCEEEEEEE Q ss_conf 8999983478863276089998389979742125212124487028999953882-467779997288504899974 Q gi|254780160|r 211 ITFKLVSEMGGEAVADTAWSILTASGDTVGESANASPSMVLSEGDYTVIARNKER-NYSREFSVLTGKSTIVEVLMR 286 (298) Q Consensus 211 ~~~~~v~~~~G~~l~ga~~~i~~~~g~~vt~~~G~~~~~~L~~G~Y~v~a~~~~~-~y~~~ftV~~g~~~~veV~~~ 286 (298) ++-+.|.+..|.||++|.+++.+-+-.+.+...|.|-. 
-|+||.|.|.|+.+++ ...+.++|..+..+.|..+++ T Consensus 327 GIkG~V~D~~g~PI~~A~I~V~g~~h~i~t~~~GdywR-LL~PG~Y~v~asa~GY~~~t~~V~V~~~~a~~v~f~L~ 402 (402) T cd03865 327 GVKGFVKDLQGNPIANATISVEGIDHDITSAKDGDYWR-LLAPGNYKLTASAPGYLAVVKKVAVPYSPAVRVDFELE 402 (402) T ss_pred CCEEEEECCCCCCCCCCEEEEECCCCCEEECCCCCEEE-EECCCEEEEEEEECCCCCCCEEEEECCCCCEEEEEEEC T ss_conf 97799988999957881899946534225889866688-60782689999967975534699988998389988849 No 10 >cd06245 M14_CPD_III The third carboxypeptidase (CP)-like domain of Carboxypeptidase D (CPD; EC 3.4.17.22), domain III. CPD differs from all other metallocarboxypeptidases in that it contains multiple CP-like domains. CPD belongs to the N/E-like subfamily of the M14 family of metallocarboxypeptidases (MCPs).The M14 family are zinc-binding CPs which hydrolyze single, C-terminal amino acids from polypeptide chains, and have a recognition site for the free C-terminal carboxyl group, which is a key determinant of specificity. CPD is a single-chain protein containing a signal peptide, three tandem repeats of CP-like domains separated by short bridge regions, followed by a transmembrane domain, and a C-terminal cytosolic tail. The first two CP-like domains of CPD contain all of the essential active site and substrate-binding residues, the third CP-like domain lacks critical residues necessary for enzymatic activity and is inactive towards standard CP substrates. Domain I is optimally active a Probab=97.67 E-value=0.00047 Score=44.26 Aligned_cols=75 Identities=8% Similarity=0.095 Sum_probs=58.5 Q ss_pred EEEEEEECCCCCCCCCEEEEEECCCCCEEEECCCCEEECCCCCEEEEEEEECCCCE-EEEEEEEEECCCCEEEEEEEC Q ss_conf 89999834788632760899983899797421252121244870289999538824-677799972885048999741 Q gi|254780160|r 211 ITFKLVSEMGGEAVADTAWSILTASGDTVGESANASPSMVLSEGDYTVIARNKERN-YSREFSVLTGKSTIVEVLMRQ 287 (298) Q Consensus 211 ~~~~~v~~~~G~~l~ga~~~i~~~~g~~vt~~~G~~~~~~L~~G~Y~v~a~~~~~~-y~~~ftV~~g~~~~veV~~~~ 287 (298) ++-.+|.+..|.||++|.+.+.. .-++.+...|.|-. -|+||.|.|.|+.+++. 
-.+.++|..++.+.|.+++.. T Consensus 288 GIkG~V~d~~g~PI~~A~I~V~g-~~~v~t~~~Gdy~R-lL~pG~Y~v~~sa~GY~~~t~~v~V~~~~~~~v~f~L~~ 363 (363) T cd06245 288 GVHGVVTDKAGKPISGATIVLNG-GHRVYTKEGGYFHV-LLAPGQHNINVIAEGYQQEHLPVVVSHDEASSVKIVLDM 363 (363) T ss_pred CCEEEEECCCCCCCCCCEEEECC-CCCEEECCCCCEEE-ECCCCEEEEEEEECCCCCEEEEEEECCCCCEEEEEEEEC T ss_conf 73799999999967881899838-73537778972799-727943799999679876248999889994799999719 No 11 >cd03868 M14_CPD_I The first carboxypeptidase (CP)-like domain of Carboxypeptidase D (CPD; EC 3.4.17.22), domain I. CPD differs from all other metallocarboxypeptidases in that it contains multiple CP-like domains. CPD belongs to the N/E-like subfamily of the M14 family of metallocarboxypeptidases (MCPs).The M14 family are zinc-binding CPs which hydrolyze single, C-terminal amino acids from polypeptide chains, and have a recognition site for the free C-terminal carboxyl group, which is a key determinant of specificity. CPD is a single-chain protein containing a signal peptide, three tandem repeats of CP-like domains separated by short bridge regions, followed by a transmembrane domain, and a C-terminal cytosolic tail. The first two CP-like domains of CPD contain all of the essential active site and substrate-binding residues, the third CP-like domain lacks critical residues necessary for enzymatic activity and is inactive towards standard CP substrates. Domain I is optimally active at p Probab=97.48 E-value=0.00093 Score=42.32 Aligned_cols=74 Identities=18% Similarity=0.179 Sum_probs=57.3 Q ss_pred EEEEEEECCCCCCCCCEEEEEECCCCCEEEECCCCEEECCCCCEEEEEEEECCCCEEE-E-EEEEEECCCCEEEEEE Q ss_conf 8999983478863276089998389979742125212124487028999953882467-7-7999728850489997 Q gi|254780160|r 211 ITFKLVSEMGGEAVADTAWSILTASGDTVGESANASPSMVLSEGDYTVIARNKERNYS-R-EFSVLTGKSTIVEVLM 285 (298) Q Consensus 211 ~~~~~v~~~~G~~l~ga~~~i~~~~g~~vt~~~G~~~~~~L~~G~Y~v~a~~~~~~y~-~-~ftV~~g~~~~veV~~ 285 (298) ++-.+|.+..|.||++|.+++.+-+-.+.+...|.|-. 
-|+||.|.|.|+.+++.-+ + .+.|..++...|.+++ T Consensus 297 GIkG~V~d~~g~pI~~A~I~V~g~~h~v~t~~~GdywR-lL~pG~Y~v~asa~GY~~~t~~~~~v~~~~a~~~~f~L 372 (372) T cd03868 297 GVKGFVRDASGNPIEDATIMVAGIDHNVTTAKFGDYWR-LLLPGTYTITAVAPGYEPSTVTDVVVKEGEATSVNFTL 372 (372) T ss_pred CCEEEEECCCCCCCCCEEEEEECCCCCEECCCCCCEEE-ECCCCEEEEEEEECCCCCCEEEEEEECCCCCEEEEEEC T ss_conf 86599989999957881999937654200288865799-73782589999967877741667995799857987689 No 12 >cd03867 M14_CPZ Peptidase M14-like domain of carboxypeptidase (CP) Z (CPZ), CPZ belongs to the N/E subfamily of the M14 family of metallocarboxypeptidases (MCPs). The M14 family are zinc-binding CPs which hydrolyze single, C-terminal amino acids from polypeptide chains, and have a recognition site for the free C-terminal carboxyl group, which is a key determinant of specificity. CPZ is a secreted Zn-dependent enzyme whose biological function is largely unknown. Unlike other members of the N/E subfamily, CPZ has a bipartite structure, which consists of an N-terminal cysteine-rich domain (CRD) whose sequence is similar to Wnt-binding proteins, and a C-terminal CP catalytic domain that removes C-terminal Arg residues from substrates. CPZ is enriched in the extracellular matrix and is widely distributed during early embryogenesis. That the CRD of CPZ can bind to Wnt4 suggests that CPZ plays a role in Wnt signaling. Probab=97.36 E-value=0.0012 Score=41.60 Aligned_cols=69 Identities=12% Similarity=0.045 Sum_probs=53.5 Q ss_pred EEEEEEECCCCCCCCCEEEEEECCCCCEEEECCCCEEECCCCCEEEEEEEECCCCE-EEEEEEEEECCCCE Q ss_conf 89999834788632760899983899797421252121244870289999538824-67779997288504 Q gi|254780160|r 211 ITFKLVSEMGGEAVADTAWSILTASGDTVGESANASPSMVLSEGDYTVIARNKERN-YSREFSVLTGKSTI 280 (298) Q Consensus 211 ~~~~~v~~~~G~~l~ga~~~i~~~~g~~vt~~~G~~~~~~L~~G~Y~v~a~~~~~~-y~~~ftV~~g~~~~ 280 (298) ++-.+|.+..|+||++|.+.+.+-+-.+.+...|-|-. -|+||.|.|.|..+++. ..+.++|..+.... 
T Consensus 319 GIkG~V~d~~g~pi~~A~I~V~g~~h~v~t~~~GdywR-LL~pG~Y~vtasa~GY~~~tk~V~v~~~~~~a 388 (395) T cd03867 319 GIKGFVKDKDGNPIKGARISVRGIRHDITTAEDGDYWR-LLPPGIHIVSAQAPGYTKVMKRVTLPARMKRA 388 (395) T ss_pred CCEEEEECCCCCCCCCEEEEEECCCCCEEECCCCCEEE-ECCCCEEEEEEEECCCCCCCEEEEECCCCCCC T ss_conf 87599989999968870899947645402779865699-72783279999957877666799967888875 No 13 >KOG2649 consensus Probab=97.33 E-value=0.002 Score=40.18 Aligned_cols=81 Identities=15% Similarity=0.161 Sum_probs=62.9 Q ss_pred EEEEEEECCCCCCCCCEEEEEECCCCCEEEECCCCEEECCCCCEEEEEEEECCCC-EEEEEEEEEECCCCEEEEEEECHH Q ss_conf 8999983478863276089998389979742125212124487028999953882-467779997288504899974101 Q gi|254780160|r 211 ITFKLVSEMGGEAVADTAWSILTASGDTVGESANASPSMVLSEGDYTVIARNKER-NYSREFSVLTGKSTIVEVLMRQKR 289 (298) Q Consensus 211 ~~~~~v~~~~G~~l~ga~~~i~~~~g~~vt~~~G~~~~~~L~~G~Y~v~a~~~~~-~y~~~ftV~~g~~~~veV~~~~~~ 289 (298) ++-.+|.+..|++|++|.+++-..+-++.|-..|-|-. -|+||+|.++|+..++ ...+..+|..-.-..++.+++--. T Consensus 379 GIkG~V~D~~G~~I~NA~IsV~ginHdv~T~~~GDYWR-LL~PG~y~vta~A~Gy~~~tk~v~V~~~~a~~~df~L~~~~ 457 (500) T KOG2649 379 GIKGLVFDDTGNPIANATISVDGINHDVTTAKEGDYWR-LLPPGKYIITASAEGYDPVTKTVTVPPDRAARVNFTLQRSI 457 (500) T ss_pred CCCEEEECCCCCCCCCEEEEEECCCCCEEECCCCCEEE-EECCCCEEEEEECCCCCCEEEEEEECCCCCCCEEEEEECCC T ss_conf 22335886899804764899724767604668885587-50786348998657876510489867887540468984278 Q ss_pred CCC Q ss_conf 035 Q gi|254780160|r 290 MDK 292 (298) Q Consensus 290 ~~~ 292 (298) ++- T Consensus 458 ~~~ 460 (500) T KOG2649 458 PQP 460 (500) T ss_pred CCC T ss_conf 751 No 14 >cd03869 M14_CPX_like Peptidase M14-like domain of carboxypeptidase (CP)-like protein X (CPX), CPX forms a distinct subgroup of the N/E subfamily of the M14 family of metallocarboxypeptidases (MCPs). The M14 family are zinc-binding CPs which hydrolyze single, C-terminal amino acids from polypeptide chains, and have a recognition site for the free C-terminal carboxyl group, which is a key determinant of specificity. 
Proteins belonging to this subgroup include CP-like protein X1 (CPX1), CP-like protein X2 (CPX2), and aortic CP-like protein (ACLP) and its isoform adipocyte enhancer binding protein-1 (AEBP1). AEBP1 is a truncated form of ACLP, which may arise from alternative splicing of the gene. These proteins are inactive towards standard CP substrates because they lack one or more critical active site and substrate-binding residues that are necessary for activity. They may function as binding proteins rather than as active CPs or display catalytic activity toward other substrates. Pro Probab=97.30 E-value=0.0022 Score=39.96 Aligned_cols=73 Identities=16% Similarity=0.134 Sum_probs=54.7 Q ss_pred EEEEEEECCCCCCCCCEEEEEECCCCCEEEECCCCEEECCCCCEEEEEEEECCCCE-EEEEEEEE-ECCCCEEEEE Q ss_conf 89999834788632760899983899797421252121244870289999538824-67779997-2885048999 Q gi|254780160|r 211 ITFKLVSEMGGEAVADTAWSILTASGDTVGESANASPSMVLSEGDYTVIARNKERN-YSREFSVL-TGKSTIVEVL 284 (298) Q Consensus 211 ~~~~~v~~~~G~~l~ga~~~i~~~~g~~vt~~~G~~~~~~L~~G~Y~v~a~~~~~~-y~~~ftV~-~g~~~~veV~ 284 (298) ++-++|.+..|.+|++|.+++..-+-++.+...|-|-. -|+||.|.|.|+.+++. -.+.++|. .+.-+.|..+ T Consensus 330 GIkG~V~d~~g~pi~~A~I~V~g~~h~v~t~~~GdywR-LL~pG~Y~vtasa~GY~~~tk~v~v~~~~~~~~~nF~ 404 (405) T cd03869 330 GIKGVVRDKTGKGIPNAIISVEGINHDIRTASDGDYWR-LLNPGEYRVTAHAEGYTSSTKNCEVGYEMGPTQCNFT 404 (405) T ss_pred CCEEEEECCCCCCCCCCEEEEECCCCCEEECCCCCEEE-ECCCCEEEEEEEECCCCCEEEEEEECCCCCCEEEEEE T ss_conf 87289988999988871899956535515678862799-7168258999997897763379998889972798657 No 15 >cd03858 M14_CP_N-E_like Carboxypeptidase (CP) N/E-like subfamily of the M14 family of metallocarboxypeptidases (MCPs). The M14 family are zinc-binding CPs which hydrolyze single, C-terminal amino acids from polypeptide chains, and have a recognition site for the free C-terminal carboxyl group, which is a key determinant of specificity. 
The N/E subfamily includes eight members, of which five (CPN, CPE, CPM, CPD, CPZ) are considered enzymatically active, while the other three are non-active (CPX1, PCX2, ACLP/AEBP1) and lack the critical active site and substrate-binding residues considered necessary for CP activity. These non-active members may function as binding proteins or display catalytic activity towards other substrates. Unlike the A/B CP subfamily, enzymes belonging to the N/E subfamily are not produced as inactive precursors that require proteolysis to produce the active form; rather, they rely on their substrate specificity and subcellular compartmentalization to prevent inappr Probab=97.18 E-value=0.0039 Score=38.29 Aligned_cols=73 Identities=16% Similarity=0.192 Sum_probs=52.1 Q ss_pred EEEEEEECCCCCCCCCEEEEEECCCCCEEEECCCCEEECCCCCEEEEEEEECCCCEE-EEEEEEEE-CCCCEEEEE Q ss_conf 899998347886327608999838997974212521212448702899995388246-77799972-885048999 Q gi|254780160|r 211 ITFKLVSEMGGEAVADTAWSILTASGDTVGESANASPSMVLSEGDYTVIARNKERNY-SREFSVLT-GKSTIVEVL 284 (298) Q Consensus 211 ~~~~~v~~~~G~~l~ga~~~i~~~~g~~vt~~~G~~~~~~L~~G~Y~v~a~~~~~~y-~~~ftV~~-g~~~~veV~ 284 (298) ++-.+|.+..|.||++|.+++..-+..+.+...|.|-. -|+||.|.+.|+.+++.- .+.+.|.. ++...+..+ T Consensus 299 GIkG~V~d~~g~pi~~A~I~V~g~~~~v~t~~~G~ywR-lL~pG~Y~v~~sa~GY~~~t~~v~v~~~~~~~~~~f~ 373 (374) T cd03858 299 GIKGFVRDANGNPIANATISVEGINHDVTTAEDGDYWR-LLLPGTYNVTASAPGYEPQTKSVVVPNDNSAVVVDFT 373 (374) T ss_pred CEEEEEECCCCCCCCCEEEEEECCCCCEEECCCCCEEE-ECCCCEEEEEEEECCCCCEEEEEEECCCCCEEEEEEE T ss_conf 73899988999987773999937535356579986689-7389247999997797761279998899966999768 No 16 >cd03864 M14_CPN Peptidase M14 Carboxypeptidase N (CPN, also known as kininase I, creatine kinase conversion factor, plasma carboxypeptidase B, arginine carboxypeptidase, and protaminase; EC 3.4.17.3) is an extracellular glycoprotein synthesized in the liver and released into the blood, where it is present in high concentrations. 
CPN belongs to the N/E subfamily of the M14 family of metallocarboxypeptidases (MCPs).The M14 family are zinc-binding carboxypeptidases (CPs) which hydrolyze single, C-terminal amino acids from polypeptide chains, and have a recognition site for the free C-terminal carboxyl group, which is a key determinant of specificity. CPN plays an important role in protecting the body from excessive buildup of potentially deleterious peptides that normally act as local autocrine or paracrine hormones. It specifically removes C-terminal basic residues. As CPN can cleave lysine more avidly than arginine residues it is also called lysine carboxypeptidase. CPN substrates inclu Probab=96.85 E-value=0.015 Score=34.45 Aligned_cols=78 Identities=17% Similarity=0.102 Sum_probs=52.6 Q ss_pred CCEEEEEEEECCCCCCCCCEEEEEEECCCCCEEEEEEECCCCCEEEEECCCCEEEEEEECCCCCEEEEEEEECCCCEEEE Q ss_conf 64379998615788777551899998389743566750567513521067617999961577854454488428863789 Q gi|254780160|r 123 AGGIRLYSIYKPGSPIVDDELTFSIYSNPNHKALLITDKVRSGTLVRLGTNNYQITSHYGKYNAIVSTVVKVEPGKIIDV 202 (298) Q Consensus 123 ag~~~~~~~~~~~~~~~~~~~~f~i~~~~~~~~~~~t~~~~~~~~~~L~~G~Y~v~et~a~~~~~~~~~i~V~~g~~~~~ 202 (298) ..|++..+.+..+.++ .+|++.+-.-+ ..+++...+...--|.||+|.|.+++ +.+.+....++|..++.+.+ T Consensus 315 h~GIkG~V~D~~g~pi--~~A~I~V~g~~----h~v~t~~~GdywRlL~pG~Y~vt~sa-~GY~~~t~~v~V~~~~~~~v 387 (392) T cd03864 315 HQGIKGMVTDENNNGI--ANAVISVSGIS----HDVTSGTLGDYFRLLLPGTYTVTASA-PGYQPSTVTVTVGPAEATLV 387 (392) T ss_pred HCCCEEEEECCCCCCC--CCCEEEEECCC----CCEEECCCCEEEEECCCCEEEEEEEE-CCCCCEEEEEEECCCCCEEE T ss_conf 4787599988999978--88289996664----54221898307987078168999997-89676237999889985899 Q ss_pred EEEEC Q ss_conf 99934 Q gi|254780160|r 203 TIQNR 207 (298) Q Consensus 203 tv~~~ 207 (298) .+.++ T Consensus 388 nf~L~ 392 (392) T cd03864 388 NFQLK 392 (392) T ss_pred EEEEC T ss_conf 88839 No 17 >pfam07210 DUF1416 Protein of unknown function (DUF1416). 
This family consists of several hypothetical bacterial proteins of around 100 residues in length. Members of this family appear to be Actinomycete specific. The function of this family is unknown. Probab=96.83 E-value=0.01 Score=35.63 Aligned_cols=69 Identities=22% Similarity=0.235 Sum_probs=51.7 Q ss_pred EEEECCCCCCCCCEEEEEECCCCC----EEEECCCCEEECCCCCEEEEEEEECCCCEEEEEEEEEECCCCEEEEE Q ss_conf 998347886327608999838997----97421252121244870289999538824677799972885048999 Q gi|254780160|r 214 KLVSEMGGEAVADTAWSILTASGD----TVGESANASPSMVLSEGDYTVIARNKERNYSREFSVLTGKSTIVEVL 284 (298) Q Consensus 214 ~~v~~~~G~~l~ga~~~i~~~~g~----~vt~~~G~~~~~~L~~G~Y~v~a~~~~~~y~~~ftV~~g~~~~veV~ 284 (298) ..|. .+|+|+.++.+.|+|.+|+ +++...|.|.++. +||+.++.+-...-.-++.+.-+-|.-.+++|. T Consensus 13 G~V~-~~g~pV~gayVRLLD~sgEFtAEV~ts~~G~FRFFA-ApG~WTvRaL~~~g~g~~~v~a~~g~v~~v~v~ 85 (86) T pfam07210 13 GVVR-RDGQPVGGAYVRLLDSSGEFTAEVVTSATGQFRFFA-APGTWTVRALVPGGTGDRTVTAEGGGIHEVDVA 85 (86) T ss_pred EEEE-CCCEECCCEEEEEECCCCCEEEEEECCCCCCEEEEE-CCCCEEEEEECCCCCCCEEEEECCCCEEEEEEE T ss_conf 8993-399085520899973899789997638986379994-698569999745887457998157953889986 No 18 >cd06245 M14_CPD_III The third carboxypeptidase (CP)-like domain of Carboxypeptidase D (CPD; EC 3.4.17.22), domain III. CPD differs from all other metallocarboxypeptidases in that it contains multiple CP-like domains. CPD belongs to the N/E-like subfamily of the M14 family of metallocarboxypeptidases (MCPs).The M14 family are zinc-binding CPs which hydrolyze single, C-terminal amino acids from polypeptide chains, and have a recognition site for the free C-terminal carboxyl group, which is a key determinant of specificity. CPD is a single-chain protein containing a signal peptide, three tandem repeats of CP-like domains separated by short bridge regions, followed by a transmembrane domain, and a C-terminal cytosolic tail. 
The first two CP-like domains of CPD contain all of the essential active site and substrate-binding residues, the third CP-like domain lacks critical residues necessary for enzymatic activity and is inactive towards standard CP substrates. Domain I is optimally active a Probab=96.76 E-value=0.017 Score=34.08 Aligned_cols=77 Identities=19% Similarity=0.102 Sum_probs=52.0 Q ss_pred CCEEEEEEEECCCCCCCCCEEEEEEECCCCCEEEEEEECCCCCEEEEECCCCEEEEEEECCCCCEEEEEEEECCCCEEEE Q ss_conf 64379998615788777551899998389743566750567513521067617999961577854454488428863789 Q gi|254780160|r 123 AGGIRLYSIYKPGSPIVDDELTFSIYSNPNHKALLITDKVRSGTLVRLGTNNYQITSHYGKYNAIVSTVVKVEPGKIIDV 202 (298) Q Consensus 123 ag~~~~~~~~~~~~~~~~~~~~f~i~~~~~~~~~~~t~~~~~~~~~~L~~G~Y~v~et~a~~~~~~~~~i~V~~g~~~~~ 202 (298) ..|+...+.+..+.++ .++.+.+- +... +.+...+...--|+||+|.|.+++ +.+.+....|+|..++.+.+ T Consensus 286 h~GIkG~V~d~~g~PI--~~A~I~V~--g~~~---v~t~~~Gdy~RlL~pG~Y~v~~sa-~GY~~~t~~v~V~~~~~~~v 357 (363) T cd06245 286 HKGVHGVVTDKAGKPI--SGATIVLN--GGHR---VYTKEGGYFHVLLAPGQHNINVIA-EGYQQEHLPVVVSHDEASSV 357 (363) T ss_pred HCCCEEEEECCCCCCC--CCCEEEEC--CCCC---EEECCCCCEEEECCCCEEEEEEEE-CCCCCEEEEEEECCCCCEEE T ss_conf 1773799999999967--88189983--8735---377789727997279437999996-79876248999889994799 Q ss_pred EEEEC Q ss_conf 99934 Q gi|254780160|r 203 TIQNR 207 (298) Q Consensus 203 tv~~~ 207 (298) ++.+. T Consensus 358 ~f~L~ 362 (363) T cd06245 358 KIVLD 362 (363) T ss_pred EEEEE T ss_conf 99971 No 19 >cd03863 M14_CPD_II The second carboxypeptidase (CP)-like domain of Carboxypeptidase D (CPD; EC 3.4.17.22), domain II. CPD differs from all other metallocarboxypeptidases in that it contains multiple CP-like domains. CPD belongs to the N/E-like subfamily of the M14 family of metallocarboxypeptidases (MCPs).The M14 family are zinc-binding CPs which hydrolyze single, C-terminal amino acids from polypeptide chains, and have a recognition site for the free C-terminal carboxyl group, which is a key determinant of specificity. 
CPD is a single-chain protein containing a signal peptide, three tandem repeats of CP-like domains separated by short bridge regions, followed by a transmembrane domain, and a C-terminal cytosolic tail. The first two CP-like domains of CPD contain all of the essential active site and substrate-binding residues, while the third CP-like domain lacks critical residues necessary for enzymatic activity and is inactive towards standard CP substrates. Domain I is optimally ac Probab=96.53 E-value=0.039 Score=31.81 Aligned_cols=78 Identities=15% Similarity=0.091 Sum_probs=50.3 Q ss_pred CEEEEEEEECCCCCCCCCEEEEEEECCCCCEEEEEEECCCCCEEEEECCCCEEEEEEECCCCCEEEEEEEECCCCEEEEE Q ss_conf 43799986157887775518999983897435667505675135210676179999615778544544884288637899 Q gi|254780160|r 124 GGIRLYSIYKPGSPIVDDELTFSIYSNPNHKALLITDKVRSGTLVRLGTNNYQITSHYGKYNAIVSTVVKVEPGKIIDVT 203 (298) Q Consensus 124 g~~~~~~~~~~~~~~~~~~~~f~i~~~~~~~~~~~t~~~~~~~~~~L~~G~Y~v~et~a~~~~~~~~~i~V~~g~~~~~t 203 (298) .|++-.+.+.. .+.|..++++.+-.-+ ..+++...+...--|+||+|.|.+++ +.+.+....|+|..++.+.++ T Consensus 297 ~GIkG~V~d~~-~g~pI~~A~I~V~g~~----h~v~t~~~GdywRlL~pG~Y~v~~sa-~GY~~~t~~v~V~~~~~~~v~ 370 (375) T cd03863 297 RGVRGFVLDAT-DGRGILNATISVADIN----HPVTTYKDGDYWRLLVPGTYKVTASA-RGYDPVTKTVEVDSKGAVQVN 370 (375) T ss_pred CCCEEEEECCC-CCCCCCCEEEEEECCC----CCEEECCCCCEEEEECCCEEEEEEEE-CCCCCEEEEEEECCCCCEEEE T ss_conf 68728996589-9988888299993666----62467798547997078148999996-798765699998899818999 Q ss_pred EEEC Q ss_conf 9934 Q gi|254780160|r 204 IQNR 207 (298) Q Consensus 204 v~~~ 207 (298) +.+. 
T Consensus 371 f~L~ 374 (375) T cd03863 371 FTLS 374 (375) T ss_pred EEEE T ss_conf 9996 No 20 >cd03865 M14_CPE_H Peptidase M14 Carboxypeptidase (CP) E (CPE, also known as carboxypeptidase H, and enkephalin convertase; EC 3.4.17.10) belongs to the N/E subfamily of the M14 family of metallocarboxypeptidases (MCPs).The M14 family are zinc-binding CPs which hydrolyze single, C-terminal amino acids from polypeptide chains, and have a recognition site for the free C-terminal carboxyl group, which is a key determinant of specificity. CPE is an important enzyme responsible for the proteolytic processing of prohormone intermediates (such as pro-insulin, pro-opiomelanocortin, or pro-gonadotropin-releasing hormone) by specifically removing C-terminal basic residues. In addition, it has been proposed that the regulated secretory pathway (RSP) of the nervous and endocrine systems utilizes membrane-bound CPE as a sorting receptor. A naturally occurring point mutation in CPE reduces the stability of the enzyme and causes its degradation, leading to an accumulation of numerous neuroendocrine pe Probab=96.52 E-value=0.031 Score=32.45 Aligned_cols=77 Identities=18% Similarity=0.106 Sum_probs=51.3 Q ss_pred CEEEEEEEECCCCCCCCCEEEEEEECCCCCEEEEEEECCCCCEEEEECCCCEEEEEEECCCCCEEEEEEEECCCCEEEEE Q ss_conf 43799986157887775518999983897435667505675135210676179999615778544544884288637899 Q gi|254780160|r 124 GGIRLYSIYKPGSPIVDDELTFSIYSNPNHKALLITDKVRSGTLVRLGTNNYQITSHYGKYNAIVSTVVKVEPGKIIDVT 203 (298) Q Consensus 124 g~~~~~~~~~~~~~~~~~~~~f~i~~~~~~~~~~~t~~~~~~~~~~L~~G~Y~v~et~a~~~~~~~~~i~V~~g~~~~~t 203 (298) .|+.-.+.+..+. |+.++++.+-.- ...+++...|...--|.||+|.|..++ +.+.+....|+|..++.+.+. 
T Consensus 326 ~GIkG~V~D~~g~--PI~~A~I~V~g~----~h~i~t~~~GdywRLL~PG~Y~v~asa-~GY~~~t~~V~V~~~~a~~v~ 398 (402) T cd03865 326 RGVKGFVKDLQGN--PIANATISVEGI----DHDITSAKDGDYWRLLAPGNYKLTASA-PGYLAVVKKVAVPYSPAVRVD 398 (402) T ss_pred CCCEEEEECCCCC--CCCCCEEEEECC----CCCEEECCCCCEEEEECCCEEEEEEEE-CCCCCCCEEEEECCCCCEEEE T ss_conf 6977999889999--578818999465----342258898666886078268999996-797553469998899838998 Q ss_pred EEEC Q ss_conf 9934 Q gi|254780160|r 204 IQNR 207 (298) Q Consensus 204 v~~~ 207 (298) +.++ T Consensus 399 f~L~ 402 (402) T cd03865 399 FELE 402 (402) T ss_pred EEEC T ss_conf 8849 No 21 >pfam08308 PEGA PEGA domain. This domain is found in both archaea and bacteria and has similarity to S-layer (surface layer) proteins. It is named after the characteristic PEGA sequence motif found in this domain. The secondary structure of this domain is predicted to be beta-strands [Adindla et al. Comparative and Functional Genomics 2004; 5:2-16]. Probab=96.52 E-value=0.025 Score=33.02 Aligned_cols=58 Identities=21% Similarity=0.274 Sum_probs=38.0 Q ss_pred CCEEEEEECCCCCEEEECCCCEEECCCCCEEEEEEEECCCC-EEEEEEEEEECCCCEEEEEEEC Q ss_conf 76089998389979742125212124487028999953882-4677799972885048999741 Q gi|254780160|r 225 ADTAWSILTASGDTVGESANASPSMVLSEGDYTVIARNKER-NYSREFSVLTGKSTIVEVLMRQ 287 (298) Q Consensus 225 ~ga~~~i~~~~g~~vt~~~G~~~~~~L~~G~Y~v~a~~~~~-~y~~~ftV~~g~~~~veV~~~~ 287 (298) +||.+.| +|..+.. ..+....|++|.|.+..+.+++ .+++.++|.+|++..+.+.+.. 
T Consensus 11 ~gA~V~i---dg~~~G~--TP~~~~~l~~G~h~v~v~~~GY~~~~~~v~V~~g~~~~v~~~L~p 69 (71) T pfam08308 11 SGATVYI---DGVYVGS--TPVTLSDLPAGTHTLRLEKEGYEDYSTTVTVTAGETVSVSLTLTP 69 (71) T ss_pred CCCEEEE---CCEEECC--CCCCCCCCCCCCEEEEEEECCCEEEEEEEEECCCCEEEEEEEEEE T ss_conf 9999999---9999124--987413258966899999599850799999969998999899987 No 22 >cd03866 M14_CPM Peptidase M14 Carboxypeptidase (CP) M (CPM) belongs to the N/E subfamily of the M14 family of metallocarboxypeptidases (MCPs).The M14 family are zinc-binding CPs which hydrolyze single, C-terminal amino acids from polypeptide chains, and have a recognition site for the free C-terminal carboxyl group, which is a key determinant of specificity. CPM is an extracellular glycoprotein, bound to cell membranes via a glycosyl-phosphatidylinositol on the C-terminus of the protein. It specifically removes C-terminal basic residues such as lysine and arginine from peptides and proteins. The highest levels of CPM have been found in human lung and placenta, but significant amounts are present in kidney, blood vessels, intestine, brain, and peripheral nerves. CPM has also been found in soluble form in various body fluids, including amniotic fluid, seminal plasma and urine. Due to its wide distribution in a variety of tissues, it is believed that it plays an important role in the cont Probab=96.47 E-value=0.015 Score=34.53 Aligned_cols=65 Identities=8% Similarity=-0.017 Sum_probs=46.8 Q ss_pred EEEEEEECCCCCCCCCEEEEEECCCC--CEEEECCCCEEECCCCCEEEEEEEECCCCE-EEEEEEEEEC Q ss_conf 89999834788632760899983899--797421252121244870289999538824-6777999728 Q gi|254780160|r 211 ITFKLVSEMGGEAVADTAWSILTASG--DTVGESANASPSMVLSEGDYTVIARNKERN-YSREFSVLTG 276 (298) Q Consensus 211 ~~~~~v~~~~G~~l~ga~~~i~~~~g--~~vt~~~G~~~~~~L~~G~Y~v~a~~~~~~-y~~~ftV~~g 276 (298) ++-+.|.+..|+||++|.+.+-+.+- .+.+...|-|-. -|+||.|.|.|..+++. ..+.++|.-. 
T Consensus 296 GIkG~V~d~~g~pi~~A~I~V~g~~hv~~~~t~~~GdywR-LL~pG~Y~vtasa~GY~~~t~~V~V~~~ 363 (376) T cd03866 296 GVKGQVFDSNGNPIPNAIVEVKGRKHICPYRTNVNGEYFL-LLLPGKYMINVTAPGFKTVITNVIIPYN 363 (376) T ss_pred CCEEEEECCCCCCCCCCEEEEECCCCCCCCCCCCCCCEEE-EECCCEEEEEEEECCCCCCCEEEEECCC T ss_conf 8738998999994788089995874555644588865698-8178327999996797554569998999 No 23 >cd03869 M14_CPX_like Peptidase M14-like domain of carboxypeptidase (CP)-like protein X (CPX), CPX forms a distinct subgroup of the N/E subfamily of the M14 family of metallocarboxypeptidases (MCPs). The M14 family are zinc-binding CPs which hydrolyze single, C-terminal amino acids from polypeptide chains, and have a recognition site for the free C-terminal carboxyl group, which is a key determinant of specificity. Proteins belonging to this subgroup include CP-like protein X1 (CPX1), CP-like protein X2 (CPX2), and aortic CP-like protein (ACLP) and its isoform adipocyte enhancer binding protein-1 (AEBP1). AEBP1 is a truncated form of ACLP, which may arise from alternative splicing of the gene. These proteins are inactive towards standard CP substrates because they lack one or more critical active site and substrate-binding residues that are necessary for activity. They may function as binding proteins rather than as active CPs or display catalytic activity toward other substrates. Pro Probab=96.03 E-value=0.087 Score=29.57 Aligned_cols=76 Identities=14% Similarity=0.059 Sum_probs=47.4 Q ss_pred CEEEEEEEECCCCCCCCCEEEEEEECCCCCEEEEEEECCCCCEEEEECCCCEEEEEEECCCCCEEEEEEEEC-CCCEEEE Q ss_conf 437999861578877755189999838974356675056751352106761799996157785445448842-8863789 Q gi|254780160|r 124 GGIRLYSIYKPGSPIVDDELTFSIYSNPNHKALLITDKVRSGTLVRLGTNNYQITSHYGKYNAIVSTVVKVE-PGKIIDV 202 (298) Q Consensus 124 g~~~~~~~~~~~~~~~~~~~~f~i~~~~~~~~~~~t~~~~~~~~~~L~~G~Y~v~et~a~~~~~~~~~i~V~-~g~~~~~ 202 (298) .|++-.+.+..+.+++.+.+.+.-. ...+++...+...--|+||+|.|.+++ +.+.+....++|. 
.++.+.+ T Consensus 329 ~GIkG~V~d~~g~pi~~A~I~V~g~------~h~v~t~~~GdywRLL~pG~Y~vtasa-~GY~~~tk~v~v~~~~~~~~~ 401 (405) T cd03869 329 RGIKGVVRDKTGKGIPNAIISVEGI------NHDIRTASDGDYWRLLNPGEYRVTAHA-EGYTSSTKNCEVGYEMGPTQC 401 (405) T ss_pred CCCEEEEECCCCCCCCCCEEEEECC------CCCEEECCCCCEEEECCCCEEEEEEEE-CCCCCEEEEEEECCCCCCEEE T ss_conf 6872899889999888718999565------355156788627997168258999997-897763379998889972798 Q ss_pred EEEE Q ss_conf 9993 Q gi|254780160|r 203 TIQN 206 (298) Q Consensus 203 tv~~ 206 (298) .+++ T Consensus 402 nF~L 405 (405) T cd03869 402 NFTL 405 (405) T ss_pred EEEC T ss_conf 6579 No 24 >cd03858 M14_CP_N-E_like Carboxypeptidase (CP) N/E-like subfamily of the M14 family of metallocarboxypeptidases (MCPs). The M14 family are zinc-binding CPs which hydrolyze single, C-terminal amino acids from polypeptide chains, and have a recognition site for the free C-terminal carboxyl group, which is a key determinant of specificity. The N/E subfamily includes eight members, of which five (CPN, CPE, CPM, CPD, CPZ) are considered enzymatically active, while the other three are non-active (CPX1, PCX2, ACLP/AEBP1) and lack the critical active site and substrate-binding residues considered necessary for CP activity. These non-active members may function as binding proteins or display catalytic activity towards other substrates. 
Unlike the A/B CP subfamily, enzymes belonging to the N/E subfamily are not produced as inactive precursors that require proteolysis to produce the active form; rather, they rely on their substrate specificity and subcellular compartmentalization to prevent inappr Probab=96.02 E-value=0.095 Score=29.32 Aligned_cols=76 Identities=18% Similarity=0.125 Sum_probs=46.1 Q ss_pred CEEEEEEEECCCCCCCCCEEEEEEECCCCCEEEEEEECCCCCEEEEECCCCEEEEEEECCCCCEEEEEEEE-CCCCEEEE Q ss_conf 43799986157887775518999983897435667505675135210676179999615778544544884-28863789 Q gi|254780160|r 124 GGIRLYSIYKPGSPIVDDELTFSIYSNPNHKALLITDKVRSGTLVRLGTNNYQITSHYGKYNAIVSTVVKV-EPGKIIDV 202 (298) Q Consensus 124 g~~~~~~~~~~~~~~~~~~~~f~i~~~~~~~~~~~t~~~~~~~~~~L~~G~Y~v~et~a~~~~~~~~~i~V-~~g~~~~~ 202 (298) .|+...+.+..+.++ .++.+.+-.-. ..+++...+...--|+||+|.|..++ +.+.+....+.| ..++.+.+ T Consensus 298 ~GIkG~V~d~~g~pi--~~A~I~V~g~~----~~v~t~~~G~ywRlL~pG~Y~v~~sa-~GY~~~t~~v~v~~~~~~~~~ 370 (374) T cd03858 298 RGIKGFVRDANGNPI--ANATISVEGIN----HDVTTAEDGDYWRLLLPGTYNVTASA-PGYEPQTKSVVVPNDNSAVVV 370 (374) T ss_pred CCEEEEEECCCCCCC--CCEEEEEECCC----CCEEECCCCCEEEECCCCEEEEEEEE-CCCCCEEEEEEECCCCCEEEE T ss_conf 373899988999987--77399993753----53565799866897389247999997-797761279998899966999 Q ss_pred EEEE Q ss_conf 9993 Q gi|254780160|r 203 TIQN 206 (298) Q Consensus 203 tv~~ 206 (298) .+.+ T Consensus 371 ~f~L 374 (374) T cd03858 371 DFTL 374 (374) T ss_pred EEEC T ss_conf 7689 No 25 >cd03868 M14_CPD_I The first carboxypeptidase (CP)-like domain of Carboxypeptidase D (CPD; EC 3.4.17.22), domain I. CPD differs from all other metallocarboxypeptidases in that it contains multiple CP-like domains. CPD belongs to the N/E-like subfamily of the M14 family of metallocarboxypeptidases (MCPs).The M14 family are zinc-binding CPs which hydrolyze single, C-terminal amino acids from polypeptide chains, and have a recognition site for the free C-terminal carboxyl group, which is a key determinant of specificity. 
CPD is a single-chain protein containing a signal peptide, three tandem repeats of CP-like domains separated by short bridge regions, followed by a transmembrane domain, and a C-terminal cytosolic tail. The first two CP-like domains of CPD contain all of the essential active site and substrate-binding residues, the third CP-like domain lacks critical residues necessary for enzymatic activity and is inactive towards standard CP substrates. Domain I is optimally active at p Probab=95.66 E-value=0.12 Score=28.56 Aligned_cols=76 Identities=17% Similarity=0.070 Sum_probs=47.9 Q ss_pred CEEEEEEEECCCCCCCCCEEEEEEECCCCCEEEEEEECCCCCEEEEECCCCEEEEEEECCCCCEEE-EEEEECCCCEEEE Q ss_conf 437999861578877755189999838974356675056751352106761799996157785445-4488428863789 Q gi|254780160|r 124 GGIRLYSIYKPGSPIVDDELTFSIYSNPNHKALLITDKVRSGTLVRLGTNNYQITSHYGKYNAIVS-TVVKVEPGKIIDV 202 (298) Q Consensus 124 g~~~~~~~~~~~~~~~~~~~~f~i~~~~~~~~~~~t~~~~~~~~~~L~~G~Y~v~et~a~~~~~~~-~~i~V~~g~~~~~ 202 (298) .|+...+.+..+. |+..+++.+-.-. ..+++...+...--|+||+|.|.+++ +.+.+.. ..+.|..++.+.+ T Consensus 296 ~GIkG~V~d~~g~--pI~~A~I~V~g~~----h~v~t~~~GdywRlL~pG~Y~v~asa-~GY~~~t~~~~~v~~~~a~~~ 368 (372) T cd03868 296 IGVKGFVRDASGN--PIEDATIMVAGID----HNVTTAKFGDYWRLLLPGTYTITAVA-PGYEPSTVTDVVVKEGEATSV 368 (372) T ss_pred CCCEEEEECCCCC--CCCCEEEEEECCC----CCEECCCCCCEEEECCCCEEEEEEEE-CCCCCCEEEEEEECCCCCEEE T ss_conf 6865999899999--5788199993765----42002888657997378258999996-787774166799579985798 Q ss_pred EEEE Q ss_conf 9993 Q gi|254780160|r 203 TIQN 206 (298) Q Consensus 203 tv~~ 206 (298) .+.+ T Consensus 369 ~f~L 372 (372) T cd03868 369 NFTL 372 (372) T ss_pred EEEC T ss_conf 7689 No 26 >cd03867 M14_CPZ Peptidase M14-like domain of carboxypeptidase (CP) Z (CPZ), CPZ belongs to the N/E subfamily of the M14 family of metallocarboxypeptidases (MCPs). 
The M14 family are zinc-binding CPs which hydrolyze single, C-terminal amino acids from polypeptide chains, and have a recognition site for the free C-terminal carboxyl group, which is a key determinant of specificity. CPZ is a secreted Zn-dependent enzyme whose biological function is largely unknown. Unlike other members of the N/E subfamily, CPZ has a bipartite structure, which consists of an N-terminal cysteine-rich domain (CRD) whose sequence is similar to Wnt-binding proteins, and a C-terminal CP catalytic domain that removes C-terminal Arg residues from substrates. CPZ is enriched in the extracellular matrix and is widely distributed during early embryogenesis. That the CRD of CPZ can bind to Wnt4 suggests that CPZ plays a role in Wnt signaling. Probab=95.54 E-value=0.13 Score=28.49 Aligned_cols=77 Identities=16% Similarity=0.028 Sum_probs=46.8 Q ss_pred CCEEEEEEEECCCCCCCCCEEEEEEECCCCCEEEEEEECCCCCEEEEECCCCEEEEEEECCCCCEEEEEEEECCC--CEE Q ss_conf 643799986157887775518999983897435667505675135210676179999615778544544884288--637 Q gi|254780160|r 123 AGGIRLYSIYKPGSPIVDDELTFSIYSNPNHKALLITDKVRSGTLVRLGTNNYQITSHYGKYNAIVSTVVKVEPG--KII 200 (298) Q Consensus 123 ag~~~~~~~~~~~~~~~~~~~~f~i~~~~~~~~~~~t~~~~~~~~~~L~~G~Y~v~et~a~~~~~~~~~i~V~~g--~~~ 200 (298) ..|+...+.+..+.+ +.++++.+-.- ...+++...+-..--|+||+|.|.+++ +.+.+....|+|..+ +.+ T Consensus 317 h~GIkG~V~d~~g~p--i~~A~I~V~g~----~h~v~t~~~GdywRLL~pG~Y~vtasa-~GY~~~tk~V~v~~~~~~a~ 389 (395) T cd03867 317 HRGIKGFVKDKDGNP--IKGARISVRGI----RHDITTAEDGDYWRLLPPGIHIVSAQA-PGYTKVMKRVTLPARMKRAG 389 (395) T ss_pred HCCCEEEEECCCCCC--CCCEEEEEECC----CCCEEECCCCCEEEECCCCEEEEEEEE-CCCCCCCEEEEECCCCCCCE T ss_conf 378759998999996--88708999476----454027798656997278327999995-78776667999678888756 Q ss_pred EEEEEE Q ss_conf 899993 Q gi|254780160|r 201 DVTIQN 206 (298) Q Consensus 201 ~~tv~~ 206 (298) .+.+.+ T Consensus 390 ~vdF~L 395 (395) T cd03867 390 RVDFVL 395 (395) T ss_pred EEEEEC T ss_conf 975589 No 27 >cd03866 M14_CPM Peptidase M14 Carboxypeptidase 
(CP) M (CPM) belongs to the N/E subfamily of the M14 family of metallocarboxypeptidases (MCPs).The M14 family are zinc-binding CPs which hydrolyze single, C-terminal amino acids from polypeptide chains, and have a recognition site for the free C-terminal carboxyl group, which is a key determinant of specificity. CPM is an extracellular glycoprotein, bound to cell membranes via a glycosyl-phosphatidylinositol on the C-terminus of the protein. It specifically removes C-terminal basic residues such as lysine and arginine from peptides and proteins. The highest levels of CPM have been found in human lung and placenta, but significant amounts are present in kidney, blood vessels, intestine, brain, and peripheral nerves. CPM has also been found in soluble form in various body fluids, including amniotic fluid, seminal plasma and urine. Due to its wide distribution in a variety of tissues, it is believed that it plays an important role in the cont Probab=94.30 E-value=0.18 Score=27.53 Aligned_cols=69 Identities=13% Similarity=0.018 Sum_probs=42.4 Q ss_pred CEEEEEEEECCCCCCCCCEEEEEEECCCCCEEEEEEECCCCCEEEEECCCCEEEEEEECCCCCEEEEEEEECCC Q ss_conf 43799986157887775518999983897435667505675135210676179999615778544544884288 Q gi|254780160|r 124 GGIRLYSIYKPGSPIVDDELTFSIYSNPNHKALLITDKVRSGTLVRLGTNNYQITSHYGKYNAIVSTVVKVEPG 197 (298) Q Consensus 124 g~~~~~~~~~~~~~~~~~~~~f~i~~~~~~~~~~~t~~~~~~~~~~L~~G~Y~v~et~a~~~~~~~~~i~V~~g 197 (298) .|+...+.+..+. |+.++.+.+.. ......+++...+-..--|+||+|.|.+++ +.+.+....|+|..+ T Consensus 295 ~GIkG~V~d~~g~--pi~~A~I~V~g--~~hv~~~~t~~~GdywRLL~pG~Y~vtasa-~GY~~~t~~V~V~~~ 363 (376) T cd03866 295 LGVKGQVFDSNGN--PIPNAIVEVKG--RKHICPYRTNVNGEYFLLLLPGKYMINVTA-PGFKTVITNVIIPYN 363 (376) T ss_pred CCCEEEEECCCCC--CCCCCEEEEEC--CCCCCCCCCCCCCCEEEEECCCEEEEEEEE-CCCCCCCEEEEECCC T ss_conf 6873899899999--47880899958--745556445888656988178327999996-797554569998999 No 28 >pfam08308 PEGA PEGA domain. 
This domain is found in both archaea and bacteria and has similarity to S-layer (surface layer) proteins. It is named after the characteristic PEGA sequence motif found in this domain. The secondary structure of this domain is predicted to be beta-strands [Adindla et al. Comparative and Functional Genomics 2004; 5:2-16]. Probab=93.89 E-value=0.39 Score=25.36 Aligned_cols=46 Identities=20% Similarity=0.136 Sum_probs=35.9 Q ss_pred CCCCCEEEEECCCCEEEEEEECCCCCEEEEEEEECCCCEEEEEEEEC Q ss_conf 56751352106761799996157785445448842886378999934 Q gi|254780160|r 161 KVRSGTLVRLGTNNYQITSHYGKYNAIVSTVVKVEPGKIIDVTIQNR 207 (298) Q Consensus 161 ~~~~~~~~~L~~G~Y~v~et~a~~~~~~~~~i~V~~g~~~~~tv~~~ 207 (298) +..+..+..|++|.|.|...+ +.+......|+|.+|+..++++.+. T Consensus 23 G~TP~~~~~l~~G~h~v~v~~-~GY~~~~~~v~V~~g~~~~v~~~L~ 68 (71) T pfam08308 23 GSTPVTLSDLPAGTHTLRLEK-EGYEDYSTTVTVTAGETVSVSLTLT 68 (71) T ss_pred CCCCCCCCCCCCCCEEEEEEE-CCCEEEEEEEEECCCCEEEEEEEEE T ss_conf 249874132589668999995-9985079999996999899989998 No 29 >pfam02369 Big_1 Bacterial Ig-like domain (group 1). This family consists of bacterial domains with an Ig-like fold. Members of this family are found in bacterial surface proteins such as intimins and invasins involved in pathogenicity. Probab=92.21 E-value=0.74 Score=23.57 Aligned_cols=61 Identities=18% Similarity=0.194 Sum_probs=32.8 Q ss_pred EEEEEEEECCCCCCCCCCEEEEEEECCCCCCC---EEEEEEECCCCEEEE--CCCCCEEEEEEEECC Q ss_conf 27999998688775346759999953888762---058898246856630--226721899996157 Q gi|254780160|r 38 RITCEARLTENSTSIDSGVSWHIFDSISNKKN---TLSTTKKIIGGKVSF--DLFPGDYLISASFGH 99 (298) Q Consensus 38 ~i~l~a~~~~~~~~~~~G~~~~vy~~~~~~~g---~~~~tt~~~G~~~~~--~L~pG~Y~v~~s~g~ 99 (298) ..+|++++++..++++.|..... +..++... .-..+|+++|.+... ...+|.|.|.++... 
T Consensus 18 ~~tltatV~D~~GnPv~g~~Vtf-s~~~~~~~~~~~~~~~Td~~G~A~~tLtSt~aG~~~VtAsv~~ 83 (93) T pfam02369 18 AITLTATVKDANGNPVAGQEVTF-SDVSSSGTLSNGNKATTDANGVATVTLTSTKAGTYTVTASLAN 83 (93) T ss_pred CEEEEEEEECCCCCCCCCCEEEE-EECCCCCEECCCCEEEECCCCEEEEEEECCCCEEEEEEEEECC T ss_conf 38999999999999979989999-9658985773685488899987999998565526999999889 No 30 >pfam11589 DUF3244 Protein of unknown function (DUF3244). This family of proteins with unknown function appear to be restricted to Bacteroidetes. The protein may have an immunoglobulin-like beta-sandwich fold however this cannot be confirmed. Probab=92.20 E-value=0.61 Score=24.12 Aligned_cols=54 Identities=17% Similarity=0.214 Sum_probs=36.6 Q ss_pred CCCCCCEEEEEECCCCCEEEE-----CCC---CEEECCCCCEEEEEEEECCCCEE-EEEEEEE Q ss_conf 863276089998389979742-----125---21212448702899995388246-7779997 Q gi|254780160|r 221 GEAVADTAWSILTASGDTVGE-----SAN---ASPSMVLSEGDYTVIARNKERNY-SREFSVL 274 (298) Q Consensus 221 G~~l~ga~~~i~~~~g~~vt~-----~~G---~~~~~~L~~G~Y~v~a~~~~~~y-~~~ftV~ 274 (298) -.++.+..++|.+.+|.+|.+ ..| .+.+.++++|+|++.-++..-.| .-.|+++ T Consensus 44 ~~~l~~vtI~I~d~~G~vVYe~~is~~~~~~~~isL~~~~~G~Y~l~it~~~G~~l~G~F~ie 106 (106) T pfam11589 44 TSPLDNLTITITDEKGVVVYEDTISVASGDTITISIAGEAPGEYKLELTHGLGGYLYGEFTIE 106 (106) T ss_pred ECCCCCEEEEEECCCCCEEEEEEECCCCCCEEEEEECCCCCCEEEEEEECCCCCEEEEEEEEC T ss_conf 655898699999799989999873267886899983675685089999758998899999979 No 31 >pfam08400 phage_tail_N Prophage tail fibre N-terminal. This domain is found at the N-terminus of prophage tail fibre proteins. 
Probab=91.56 E-value=0.87 Score=23.11 Aligned_cols=73 Identities=16% Similarity=0.217 Sum_probs=31.3 Q ss_pred EEEEEEECCCCCCCCCCEEEEEEECCCCCCCEE---EEEEECCCCEEEECCCCCEEEEEEEE-CCCCCE-EEEEEECC Q ss_conf 799999868877534675999995388876205---88982468566302267218999961-577752-78998068 Q gi|254780160|r 39 ITCEARLTENSTSIDSGVSWHIFDSISNKKNTL---STTKKIIGGKVSFDLFPGDYLISASF-GHVGVV-KKITVSSK 111 (298) Q Consensus 39 i~l~a~~~~~~~~~~~G~~~~vy~~~~~~~g~~---~~tt~~~G~~~~~~L~pG~Y~v~~s~-g~~~~~-~~vtV~~~ 111 (298) +.++..|.+..+.+..|+.-.|-+.++-.+.-. ....+.+.|.+.|++.||.|.|.... |+..+. ..|+|-.+ T Consensus 3 v~ISGvL~d~~G~pv~~~~I~LkA~~ns~~Vv~~t~as~~T~~~G~Ys~~vepG~Y~V~l~~~g~~~~~vG~i~V~~d 80 (134) T pfam08400 3 VKISGVLKDGTGEPVSNCTITLKARRTSPTVVVNTVASVVTDEDGRYSMDVEPGKYSVTLTVDGRNPSYVGDITVYED 80 (134) T ss_pred EEEEEEEECCCCCCCCCCEEEEEEEECCHHHEECCEEEEEECCCCCEEEEECCEEEEEEEEECCCCCEEEEEEEEECC T ss_conf 899999978999876798999997108768733115678808985178882360799999986715258513899569 No 32 >smart00634 BID_1 Bacterial Ig-like domain (group 1). Probab=90.80 E-value=1 Score=22.63 Aligned_cols=61 Identities=23% Similarity=0.202 Sum_probs=35.3 Q ss_pred EEEEEEEECCCCCCCCCC--EEEEEEECCCCCCCE---EEEEEECCCCEEEE--CCCCCEEEEEEEECCCC Q ss_conf 279999986887753467--599999538887620---58898246856630--22672189999615777 Q gi|254780160|r 38 RITCEARLTENSTSIDSG--VSWHIFDSISNKKNT---LSTTKKIIGGKVSF--DLFPGDYLISASFGHVG 101 (298) Q Consensus 38 ~i~l~a~~~~~~~~~~~G--~~~~vy~~~~~~~g~---~~~tt~~~G~~~~~--~L~pG~Y~v~~s~g~~~ 101 (298) ..+|++++.+..++++.| +.|.. . .+.... -..+|+.+|.+... ...+|.|.|.++.+... 
T Consensus 19 ~~tltatV~D~~gnpv~g~~V~f~~--~-~~~~~~~~~~~~~Td~~G~a~~~ltst~aG~~tVtA~~~~~~ 86 (92) T smart00634 19 AITLTATVTDANGNPVAGQEVTFTT--P-SGGALTLSKGTATTDANGIATVTLTSTTAGVYTVTASLENGS 86 (92) T ss_pred CEEEEEEEECCCCCCCCCCEEEEEE--C-CCCCEEECCCEEEECCCEEEEEEEECCCEEEEEEEEEECCCC T ss_conf 1999999997999997898999997--7-996145068758959987999999766408999999997985 No 33 >cd03459 3,4-PCD Protocatechuate 3,4-dioxygenase (3,4-PCD) catalyzes the oxidative ring cleavage of 3,4-dihydroxybenzoate to produce beta-carboxy-cis,cis-muconate. 3,4-PCDs are large aggregates of 12 protomers, each composed of an alpha- and beta-subunit and an Fe3+ ion bound in the beta-subunit at the alpha-beta-subunit interface. 3,4-PCD is a member of the aromatic dioxygenases which are non-heme iron intradiol-cleaving enzymes that break the C1-C2 bond and utilize Fe3+. Probab=89.64 E-value=1.3 Score=22.01 Aligned_cols=61 Identities=15% Similarity=0.073 Sum_probs=43.2 Q ss_pred EEECCEEEEEEEECCCCCCCCCCEEEEEEECCCCCCC-----------------EEEEEEECCCCEEEECCCCCEEEE Q ss_conf 2302327999998688775346759999953888762-----------------058898246856630226721899 Q gi|254780160|r 33 VVDAQRITCEARLTENSTSIDSGVSWHIFDSISNKKN-----------------TLSTTKKIIGGKVSFDLFPGDYLI 93 (298) Q Consensus 33 ~v~~~~i~l~a~~~~~~~~~~~G~~~~vy~~~~~~~g-----------------~~~~tt~~~G~~~~~~L~pG~Y~v 93 (298) -..++.|.|+.+..+..+.++.||...|+........ .-...|+++|.+.+.-+.||.|-+ T Consensus 10 ~a~G~~i~i~G~V~D~~g~Pi~~A~ieiWQad~~G~Y~~~~~~~~~~~d~~f~G~Gr~~Td~~G~y~f~Ti~Pg~y~~ 87 (158) T cd03459 10 EAIGERIILEGRVLDGDGRPVPDALVEIWQADAAGRYRHPRDSHRAPLDPNFTGFGRVLTDADGRYRFRTIKPGAYPW 87 (158) T ss_pred CCCCCEEEEEEEEECCCCCCCCCCEEEEECCCCCCCCCCCCCCCCCCCCCCCCCEEEEEECCCCEEEEEEECCCCCCC T ss_conf 999888999999999999998997899980599986046666655666866440689986789779999978833058 No 34 >pfam10670 NikM Nickel uptake substrate-specific transmembrane region. This family of proteins forms part of the nickel-transport complex in prokaryotes, NikMNQO. 
CbiMNQO (cobalt-transport) and NikMNQO are the most widespread groups of microbial transporters for nickel and cobalt ions and are unusual uptake systems as they consist of, e.g., two transmembrane components (NikM and NikQ), a small membrane-bound component (NikN) and an ATP-binding protein (NikO) but no extra-cytoplasmic solute-binding protein. Similar components constitute the cobalt transporters with some variability in the small membrane-bound component, CbiN, which is not similar to NikN or NikL at the sequence level. NikM is the substrate-specific component of the complex and is a seven-transmembrane protein. The CbiMNQO and NikMNQO systems form part of the coenzyme B12 biosynthesis pathway. The CbiM protein is pfam01891. Probab=89.50 E-value=1.3 Score=21.95 Aligned_cols=52 Identities=15% Similarity=0.190 Sum_probs=40.2 Q ss_pred EEEEEEEECCCCCCCCCEEEEEECCCCC---------EEEECCCCEEECCCCCEEEEEEEECC Q ss_conf 3899998347886327608999838997---------97421252121244870289999538 Q gi|254780160|r 210 KITFKLVSEMGGEAVADTAWSILTASGD---------TVGESANASPSMVLSEGDYTVIARNK 263 (298) Q Consensus 210 ~~~~~~v~~~~G~~l~ga~~~i~~~~g~---------~vt~~~G~~~~~~L~~G~Y~v~a~~~ 263 (298) .+.+++. -.|+|++++.+++...+.+ ..||++|.+.+.-..+|.|-+.|.+. T Consensus 156 ~~~~~vl--~~GkP~a~~~V~v~~~~~~~~~~~~~~~~~TD~~G~~~~~~~~~G~yll~a~~~ 216 (219) T pfam10670 156 PFTFQVL--YDGKPAAGAEVEVEYGGTDYRDEADTQEVKTDADGVFTFTPPKAGWYLLAALHE 216 (219) T ss_pred EEEEEEE--ECCCCCCCCEEEEEECCCCCCCCCCEEEEEECCCCEEEEECCCCCEEEEEEEEE T ss_conf 4899999--899678998899998887777762049999799967999618996899999963 No 35 >pfam11008 DUF2846 Protein of unknown function (DUF2846). Some members in this family of proteins with unknown function are annotated as lipoproteins; however, this cannot be confirmed.
Probab=89.17 E-value=1.3 Score=21.99 Aligned_cols=18 Identities=11% Similarity=0.152 Sum_probs=7.5 Q ss_pred CCCCEEEEECCCCEEEEE Q ss_conf 675135210676179999 Q gi|254780160|r 162 VRSGTLVRLGTNNYQITS 179 (298) Q Consensus 162 ~~~~~~~~L~~G~Y~v~e 179 (298) .+.+...+++||.|.+.. T Consensus 57 ~~~y~~~~v~pG~~~i~~ 74 (112) T pfam11008 57 NGGYFYLEVPPGEYVIST 74 (112) T ss_pred CCEEEEEEECCCCEEEEE T ss_conf 983999996898789987 No 36 >TIGR02422 protocat_beta protocatechuate 3,4-dioxygenase, beta subunit; InterPro: IPR012785 Protocatechuate (3,4-dihydroxybenzoate, PCA) is an aromatic compound which is a key intermediate in the degradation of the plant biopolymer lignin and other aromatic compounds. The key step of PCA degradation is the ring-cleavage performed by dioxygenases adding both atoms from molecular oxygen to specific carbon atoms within the ring. This step can be performed by two distinct mechanisms: intradiol cleavage and extradiol cleavage. In intradiol cleavage the oxygen atoms are added to the carbons carrying the hydroxyl groups, producing two carboxylate groups. In extradiol cleavage the oxygens are added to one carbon carrying a hydroxyl group and another carrying a hydrogen, resulting in the formation of a carboxylate group and an aldehydic group. PCA dioxygenases fall into the broader category of catechol dioxygenases. These are metalloenzymes which bind non-haeme iron. The extradiol dioxygenases use Fe(II) to activate oxygen for nucleophilic attack on the aromatic substrate, while the intradiol dioxygenases use Fe(III) to activate the aromatic substrate for an electrophilic attack by oxygen. This entry represents the beta subunit of protocatechuate 3,4-dioxygenase, an enzyme which cleaves the PCA ring by an intradiol mechanism. It is composed of two subunits, alpha (IPR012786 from INTERPRO) and beta which are highly similar in structure and are thought to share a common ancestor.
The core of each subunit is two four-stranded beta-sheets that fold upon each other to form a beta sandwich. The active site cavity contains the Fe(III)-binding site and is located between the two subunits. All Fe(III) ligands are contributed by the beta subunit.; GO: 0005506 iron ion binding, 0018578 protocatechuate 3,4-dioxygenase activity, 0019619 protocatechuate catabolic process. Probab=88.25 E-value=0.59 Score=24.18 Aligned_cols=21 Identities=24% Similarity=0.074 Sum_probs=11.1 Q ss_pred EEEECCCCEEEECCCCCEEEE Q ss_conf 898246856630226721899 Q gi|254780160|r 73 TTKKIIGGKVSFDLFPGDYLI 93 (298) Q Consensus 73 ~tt~~~G~~~~~~L~pG~Y~v 93 (298) ..|+++|.+.+.-++||-|=. T Consensus 116 tlTD~~G~Y~F~TiKPGpYPW 136 (224) T TIGR02422 116 TLTDEDGYYRFRTIKPGPYPW 136 (224) T ss_pred EEECCCCCEEEEEECCCCCCC T ss_conf 343799826888761689887 No 37 >KOG0518 consensus Probab=87.91 E-value=1.7 Score=21.26 Aligned_cols=55 Identities=18% Similarity=0.272 Sum_probs=28.3 Q ss_pred EEEEEEECCCCCCCEEEEEEECCCCEEEECCC---CCEEEEEEEECCCCCE-EEEEEECC Q ss_conf 59999953888762058898246856630226---7218999961577752-78998068 Q gi|254780160|r 56 VSWHIFDSISNKKNTLSTTKKIIGGKVSFDLF---PGDYLISASFGHVGVV-KKITVSSK 111 (298) Q Consensus 56 ~~~~vy~~~~~~~g~~~~tt~~~G~~~~~~L~---pG~Y~v~~s~g~~~~~-~~vtV~~~ 111 (298) +..+|+.+. |+.-......+..|+.+.+.+. ||.|.+...|+..... .++++.+. T Consensus 392 levqV~gp~-Gk~~~~~V~d~~~~~~h~vsY~pd~~G~y~i~v~~~gd~i~gSPf~~ra~ 450 (1113) T KOG0518 392 LEVQVVGPE-GKEKEVVVRDNGRGGIHIVTYVPDCPGRYLIVVFYGGDPIPGSPFTARAY 450 (1113) T ss_pred EEEEEECCC-CCCEEEEEEECCCCCEEEEEECCCCCCCEEEEEEECCCCCCCCCEEEEEC T ss_conf 789997899-97245589866888457899757888725999997782358996377733 No 38 >pfam00775 Dioxygenase_C Dioxygenase.
Probab=86.16 E-value=2.1 Score=20.65 Aligned_cols=59 Identities=12% Similarity=0.090 Sum_probs=42.0 Q ss_pred ECCEEEEEEEECCCCCCCCCCEEEEEEECCCC-------------CCCEEEEEEECCCCEEEECCCCCEEEE Q ss_conf 02327999998688775346759999953888-------------762058898246856630226721899 Q gi|254780160|r 35 DAQRITCEARLTENSTSIDSGVSWHIFDSISN-------------KKNTLSTTKKIIGGKVSFDLFPGDYLI 93 (298) Q Consensus 35 ~~~~i~l~a~~~~~~~~~~~G~~~~vy~~~~~-------------~~g~~~~tt~~~G~~~~~~L~pG~Y~v 93 (298) .++.+.++.++.+..+.++.||...|+..+.. -.++=...|+++|.+.+.-+.||.|-+ T Consensus 26 ~G~~l~v~G~V~D~~g~Pi~gA~veiWqad~~G~Y~~~~~~~~~~~~~rGr~~Td~dG~y~f~TI~P~~Ypi 97 (181) T pfam00775 26 PGEPLVVSGRVRDRDGKPLPGALVEIWHADADGFYSHFDPEDQPEFNLRGRIVTDAEGRYRFRTVQPAPYPI 97 (181) T ss_pred CCCEEEEEEEEECCCCCCCCCCEEEEEECCCCCCCCCCCCCCCCCCCCEEEEEECCCCEEEEEEEECCCCCC T ss_conf 988899999999899999799799998448998535538777887662569974899839999981646888 No 39 >COG1470 Predicted membrane protein [Function unknown] Probab=85.89 E-value=2.1 Score=20.56 Aligned_cols=38 Identities=18% Similarity=0.257 Sum_probs=21.3 Q ss_pred CCCCCCCCCEEEEEECCCCCEEEECCCCEEECCCCCEEEE Q ss_conf 4788632760899983899797421252121244870289 Q gi|254780160|r 218 EMGGEAVADTAWSILTASGDTVGESANASPSMVLSEGDYT 257 (298) Q Consensus 218 ~~~G~~l~ga~~~i~~~~g~~vt~~~G~~~~~~L~~G~Y~ 257 (298) ..|+.+|.|..+++..++|=.+ +-++ +.+-.|.||+|. T Consensus 407 NsGna~LtdIkl~v~~PqgWei-~Vd~-~~I~sL~pge~~ 444 (513) T COG1470 407 NSGNAPLTDIKLTVNGPQGWEI-EVDE-STIPSLEPGESK 444 (513) T ss_pred ECCCCCCCEEEEEECCCCCCEE-EECC-CCCCCCCCCCCC T ss_conf 4699832022688528766469-8781-005654889862 No 40 >pfam09430 DUF2012 Protein of unknown function (DUF2012). This is a eukaryotic family of uncharacterized proteins. 
Probab=85.36 E-value=1.8 Score=21.00 Aligned_cols=37 Identities=24% Similarity=0.236 Sum_probs=21.3 Q ss_pred EECCCCEEEECCCCCEEEEEEEE-CCCCCEEEEEEECC Q ss_conf 82468566302267218999961-57775278998068 Q gi|254780160|r 75 KKIIGGKVSFDLFPGDYLISASF-GHVGVVKKITVSSK 111 (298) Q Consensus 75 t~~~G~~~~~~L~pG~Y~v~~s~-g~~~~~~~vtV~~~ 111 (298) ...+|.+.+-++|+|+|.+.+.+ ++.=...+|+|... T Consensus 23 l~~dGsF~f~nVp~Gsy~ldv~~~~~~F~pvRVdV~~~ 60 (113) T pfam09430 23 LRRDGSFVFHNVPAGSYLLEVESPGYRFEPVRVDVSAK 60 (113) T ss_pred ECCCCEEEECCCCCEEEEEEEECCCCEECCEEEEEECC T ss_conf 62687799738998338999846982805789999469 No 41 >cd03463 3,4-PCD_alpha Protocatechuate 3,4-dioxygenase (3,4-PCD) , alpha subunit. 3,4-PCD catalyzes the oxidative ring cleavage of 3,4-dihydroxybenzoate to produce beta-carboxy-cis,cis-muconate. 3,4-PCDs are large aggregates of 12 protomers, each composed of an alpha- and beta-subunit and an Fe3+ ion bound in the beta-subunit at the alpha-subunit-beta-subunit interface. 3,4-PCD is a member of the aromatic dioxygenases which are non-heme iron intradiol-cleaving enzymes that break the C1-C2 bond and utilize Fe3+. Probab=85.25 E-value=2.3 Score=20.36 Aligned_cols=58 Identities=16% Similarity=0.123 Sum_probs=40.0 Q ss_pred ECCEEEEEEEECCCCCCCCCCEEEEEEECCCCCC----------------CEEEEEEECCCCEEEECCCCCEEE Q ss_conf 0232799999868877534675999995388876----------------205889824685663022672189 Q gi|254780160|r 35 DAQRITCEARLTENSTSIDSGVSWHIFDSISNKK----------------NTLSTTKKIIGGKVSFDLFPGDYL 92 (298) Q Consensus 35 ~~~~i~l~a~~~~~~~~~~~G~~~~vy~~~~~~~----------------g~~~~tt~~~G~~~~~~L~pG~Y~ 92 (298) .++.|.|+.++.+..+.++.||...|+..+.... 
|.-...|+++|.+.+.-+.||-|- T Consensus 33 ~G~~i~v~G~V~D~~g~Pi~~A~ieiWqad~~G~Y~~~~~~~~~~~~~f~g~Gr~~Td~~G~y~F~TI~PG~~~ 106 (185) T cd03463 33 AGERITLEGRVYDGDGAPVPDAMLEIWQADAAGRYAHPADSRRRLDPGFRGFGRVATDADGRFSFTTVKPGAVP 106 (185) T ss_pred CCCEEEEEEEEECCCCCCCCCCEEEEEEECCCCCCCCCCCCCCCCCCCCCEEEEEEECCCCCEEEEEECCCCCC T ss_conf 99889999999999989969978999960899843466654456787643077885489982999995773637 No 42 >pfam00576 Transthyretin HIUase/Transthyretin family. This family includes transthyretin that is a thyroid hormone-binding protein that transports thyroxine from the bloodstream to the brain. However, most of the sequences listed in this family do not bind thyroid hormones. They are actually enzymes of the purine catabolism that catalyse the conversion of 5-hydroxyisourate (HIU) to OHCU. HIU hydrolysis is the original function of the family and is conserved from bacteria to mammals; transthyretins arose by gene duplications in the vertebrate lineage. HIUases are distinguished in the alignment from the conserved C-terminal YRGS sequence. Probab=84.69 E-value=2.4 Score=20.20 Aligned_cols=41 Identities=17% Similarity=0.242 Sum_probs=14.8 Q ss_pred CCEEEEEEECCCCCC--CEEEEEEECCCCEEE----ECCCCCEEEEE Q ss_conf 675999995388876--205889824685663----02267218999 Q gi|254780160|r 54 SGVSWHIFDSISNKK--NTLSTTKKIIGGKVS----FDLFPGDYLIS 94 (298) Q Consensus 54 ~G~~~~vy~~~~~~~--g~~~~tt~~~G~~~~----~~L~pG~Y~v~ 94 (298) +|+..+||....+.. --....|+++|.... ..+.+|.|.|. T Consensus 17 ~gv~V~L~~~~~~~~~~~i~~~~Td~dGRi~~~~~~~~~~~G~Y~l~ 63 (111) T pfam00576 17 AGVAVKLFRLTGDGGWELLATGKTNADGRVHPLLTGETFAPGTYRLE 63 (111) T ss_pred CCCEEEEEEECCCCCEEEEEEEEECCCCCCCCCCCCCCCCCEEEEEE T ss_conf 89989999988999828999999899988978778332665779999 No 43 >cd03464 3,4-PCD_beta Protocatechuate 3,4-dioxygenase (3,4-PCD) , beta subunit. 3,4-PCD catalyzes the oxidative ring cleavage of 3,4-dihydroxybenzoate to produce beta-carboxy-cis,cis-muconate. 
3,4-PCDs are large aggregates of 12 protomers, each composed of an alpha- and beta-subunit and an Fe3+ ion bound in the beta-subunit at the alpha-subunit-beta-subunit interface. 3,4-PCD is a member of the aromatic dioxygenases which are non-heme iron intradiol-cleaving enzymes that break the C1-C2 bond and utilize Fe3+. Probab=81.57 E-value=3.2 Score=19.42 Aligned_cols=59 Identities=15% Similarity=0.070 Sum_probs=37.8 Q ss_pred ECCEEEEEEEECCCCCCCCCCEEEEEEECCCCC-----------------CCEEEEEEECCCCEEEECCCCCEEEE Q ss_conf 023279999986887753467599999538887-----------------62058898246856630226721899 Q gi|254780160|r 35 DAQRITCEARLTENSTSIDSGVSWHIFDSISNK-----------------KNTLSTTKKIIGGKVSFDLFPGDYLI 93 (298) Q Consensus 35 ~~~~i~l~a~~~~~~~~~~~G~~~~vy~~~~~~-----------------~g~~~~tt~~~G~~~~~~L~pG~Y~v 93 (298) .++.|.++.++.+..+.+..++...|+..+... .|.-...|+++|.+.+.-+.||.|-. T Consensus 62 ~GerI~v~GrVlD~~G~PV~dAlvEIWQAna~GrY~H~~D~~~a~~DpnF~G~GR~~Td~~G~y~F~TIkPG~yP~ 137 (220) T cd03464 62 IGERIIVHGRVLDEDGRPVPNTLVEIWQANAAGRYRHKRDQHDAPLDPNFGGAGRTLTDDDGYYRFRTIKPGAYPW 137 (220) T ss_pred CCCEEEEEEEEECCCCCCCCCCEEEEEECCCCCEECCCCCCCCCCCCCCCCEEEEEECCCCCEEEEEEECCCCCCC T ss_conf 7887999999988999997774657762488826888767776776888632789860899739999879987678 No 44 >pfam07495 Y_Y_Y Y_Y_Y domain. This domain is mostly found at the end of the beta propellers (pfam07494) in a family of two component regulators. However they are also found tandemly repeated in a hypothetical protein from Clostridium tetani without other signal conduction domains being present. It's named after the conserved tyrosines found in the alignment. The exact function is not known. Probab=80.57 E-value=3.5 Score=19.20 Aligned_cols=12 Identities=17% Similarity=0.398 Sum_probs=4.3 Q ss_pred EECCCCEEEEEE Q ss_conf 106761799996 Q gi|254780160|r 169 RLGTNNYQITSH 180 (298) Q Consensus 169 ~L~~G~Y~v~et 180 (298) .|+||+|+|.+. 
T Consensus 34 ~l~pG~Y~f~v~ 45 (65) T pfam07495 34 NLPPGKYTLKVK 45 (65) T ss_pred CCCCCCEEEEEE T ss_conf 889976899999 No 45 >cd03462 1,2-CCD chlorocatechol 1,2-dioxygenases (1,2-CCDs) (type II enzymes) are homodimeric intradiol dioxygenases that degrade chlorocatechols via the addition of molecular oxygen and the subsequent cleavage between two adjacent hydroxyl groups. This reaction is part of the modified ortho-cleavage pathway which is a central oxidative bacterial pathway that channels chlorocatechols, derived from the degradation of chlorinated benzoic acids, phenoxyacetic acids, phenols, benzenes, and other aromatics into the energy-generating tricarboxylic acid pathway. Probab=80.56 E-value=3.5 Score=19.19 Aligned_cols=20 Identities=5% Similarity=-0.074 Sum_probs=8.5 Q ss_pred EEEECCCCCCCCCEEEEEEE Q ss_conf 98615788777551899998 Q gi|254780160|r 129 YSIYKPGSPIVDDELTFSIY 148 (298) Q Consensus 129 ~~~~~~~~~~~~~~~~f~i~ 148 (298) ...+.+..+.|++++.+.+- T Consensus 103 ~G~V~D~~G~Pi~gA~vdvW 122 (247) T cd03462 103 RGTVKDLAGAPVAGAVIDVW 122 (247) T ss_pred EEEEECCCCCCCCCCEEEEE T ss_conf 99998999798688568888 No 46 >cd00421 intradiol_dioxygenase Intradiol dioxygenases catalyze the critical ring-cleavage step in the conversion of catecholate derivatives to citric acid cycle intermediates. This family contains catechol 1,2-dioxygenases and protocatechuate 3,4-dioxygenases which are mononuclear non-heme iron enzymes that catalyze the oxygenation of catecholates to aliphatic acids via the cleavage of aromatic rings. The members are intradiol-cleaving enzymes which break the catechol C1-C2 bond and utilize Fe3+, as opposed to the extradiol-cleaving enzymes which break the C2-C3 or C1-C6 bond and utilize Fe2+ and Mn+. Catechol 1,2-dioxygenases are mostly homodimers with one catalytic ferric ion per monomer. Protocatechuate 3,4-dioxygenases form more diverse oligomers. 
Probab=80.47 E-value=3.5 Score=19.17 Aligned_cols=60 Identities=12% Similarity=0.052 Sum_probs=41.8 Q ss_pred EECCEEEEEEEECCCCCCCCCCEEEEEEECCCCC--------------CCEEEEEEECCCCEEEECCCCCEEEE Q ss_conf 3023279999986887753467599999538887--------------62058898246856630226721899 Q gi|254780160|r 34 VDAQRITCEARLTENSTSIDSGVSWHIFDSISNK--------------KNTLSTTKKIIGGKVSFDLFPGDYLI 93 (298) Q Consensus 34 v~~~~i~l~a~~~~~~~~~~~G~~~~vy~~~~~~--------------~g~~~~tt~~~G~~~~~~L~pG~Y~v 93 (298) ..++.+.|+.++.+..+.++.+|.+.|+..+... .++=...|+++|.+.+.-+.||.|.+ T Consensus 7 ~~G~~l~l~g~V~D~~g~pv~~a~veiWqad~~G~Y~~~~~~~~~~~~~~rG~~~Td~~G~~~F~Ti~Pg~Y~~ 80 (146) T cd00421 7 APGEPLTLTGTVLDGDGCPVPDALVEIWQADADGRYSGQDDSGLDPEFFLRGRQITDADGRYRFRTIKPGPYPI 80 (146) T ss_pred CCCCEEEEEEEEECCCCCCCCCCEEEEEECCCCCCCCCCCCCCCCCCCEEEEEEEECCCCEEEEEEECCCCCCC T ss_conf 99778999999995999897996999985389975246345666888133889987999959999988878888 No 47 >cd03461 1,2-HQD Hydroxyquinol 1,2-dioxygenase (1,2-HQD) catalyzes the ring cleavage of hydroxyquinol (1,2,4-trihydroxybenzene), an intermediate in the degradation of a large variety of aromatic compounds including some polychloro- and nitroaromatic pollutants, to form 3-hydroxy-cis,cis-muconates. 1,2-HQD belongs to the aromatic dioxygenase family, a family of mononuclear non-heme intradiol-cleaving enzymes. Probab=80.40 E-value=3.5 Score=19.16 Aligned_cols=20 Identities=5% Similarity=-0.137 Sum_probs=7.9 Q ss_pred EEEECCCCCCCCCEEEEEEE Q ss_conf 98615788777551899998 Q gi|254780160|r 129 YSIYKPGSPIVDDELTFSIY 148 (298) Q Consensus 129 ~~~~~~~~~~~~~~~~f~i~ 148 (298) ...+.+..+.|..++.+.+- T Consensus 124 ~G~V~D~~G~Pi~gA~vdvW 143 (277) T cd03461 124 HGRVTDTDGKPLPGATVDVW 143 (277) T ss_pred EEEEECCCCCCCCCCEEEEE T ss_conf 99998899998788679988 No 48 >pfam01835 A2M_N MG2 domain. This is the MG2 (macroglobulin) domain of alpha-2-macroglobulin.
Probab=79.23 E-value=3.9 Score=18.92 Aligned_cols=47 Identities=23% Similarity=0.228 Sum_probs=21.3 Q ss_pred EEEEEECCCCCEE------EECCCC----EEE-CCCCCEEEEEEEECC--CCEEEEEEEE Q ss_conf 0899983899797------421252----121-244870289999538--8246777999 Q gi|254780160|r 227 TAWSILTASGDTV------GESANA----SPS-MVLSEGDYTVIARNK--ERNYSREFSV 273 (298) Q Consensus 227 a~~~i~~~~g~~v------t~~~G~----~~~-~~L~~G~Y~v~a~~~--~~~y~~~ftV 273 (298) ..+.|.+++|..+ .+..|. |.+ ...+.|.|++.+..+ ...++..|.| T Consensus 36 v~v~l~dp~g~~v~~~~~~~~~~G~~~~~f~lp~~~~~G~y~i~~~~~~~~~~~~~~F~V 95 (95) T pfam01835 36 LTVTITDPDGNEVRQWLLVTNEAGIFSGSFPLPEEAPLGTWTIEAEYDDGGSLASGSFRV 95 (95) T ss_pred EEEEEECCCCCEEEEEEEEECCCCCEEEEEECCCCCCCEEEEEEEEECCCCEEEEEEEEC T ss_conf 899999899989999998608898399999989988858689999988997598899889 No 49 >COG3485 PcaH Protocatechuate 3,4-dioxygenase beta subunit [Secondary metabolites biosynthesis, transport, and catabolism] Probab=78.46 E-value=4.1 Score=18.77 Aligned_cols=25 Identities=20% Similarity=0.242 Sum_probs=10.4 Q ss_pred CEEEEEEEECCCCCCCCCCEEEEEE Q ss_conf 3279999986887753467599999 Q gi|254780160|r 37 QRITCEARLTENSTSIDSGVSWHIF 61 (298) Q Consensus 37 ~~i~l~a~~~~~~~~~~~G~~~~vy 61 (298) +.|.|+.+..+.++.++.++...|+ T Consensus 71 e~i~l~G~VlD~~G~Pv~~A~VEiW 95 (226) T COG3485 71 ERILLEGRVLDGNGRPVPDALVEIW 95 (226) T ss_pred CEEEEEEEEECCCCCCCCCCEEEEE T ss_conf 4699999998899986788689999 No 50 >pfam04234 CopC Copper resistance protein CopC. CopC is a bacterial blue copper protein that binds 1 atom of copper per protein molecule. Along with CopA, CopC mediates copper resistance by sequestration of copper in the periplasm. Probab=76.28 E-value=4.7 Score=18.38 Aligned_cols=25 Identities=32% Similarity=0.368 Sum_probs=15.8 Q ss_pred CCCCEEEEEE---EECCCCEEEEEEEEE Q ss_conf 4487028999---953882467779997 Q gi|254780160|r 250 VLSEGDYTVI---ARNKERNYSREFSVL 274 (298) Q Consensus 250 ~L~~G~Y~v~---a~~~~~~y~~~ftV~ 274 (298) .|++|.|++. ...|++..+-.|+.. 
T Consensus 91 ~l~~G~YtV~wr~~s~DGH~v~G~~~F~ 118 (120) T pfam04234 91 PLPAGTYTVEWRVVSADGHPVSGSFSFT 118 (120) T ss_pred CCCCEEEEEEEEEEECCCCEECCEEEEE T ss_conf 8898469999999965898226879999 No 51 >cd03458 Catechol_intradiol_dioxygenases Catechol intradiol dioxygenases can be divided into several subgroups according to their substrate specificity for catechol, chlorocatechols and hydroxyquinols. Almost all members of this family are homodimers containing one ferric ion (Fe3+) per monomer. They belong to the intradiol dioxygenase family, a family of mononuclear non-heme iron intradiol-cleaving enzymes that catalyze the oxygenation of catecholates to aliphatic acids via the cleavage of aromatic rings. Probab=75.95 E-value=4.8 Score=18.32 Aligned_cols=21 Identities=5% Similarity=-0.067 Sum_probs=8.1 Q ss_pred EEEEECCCCCCCCCEEEEEEE Q ss_conf 998615788777551899998 Q gi|254780160|r 128 LYSIYKPGSPIVDDELTFSIY 148 (298) Q Consensus 128 ~~~~~~~~~~~~~~~~~f~i~ 148 (298) +...+.+..+.|.+++.+.+- T Consensus 107 v~G~V~D~~G~Pi~gA~vdvW 127 (256) T cd03458 107 VHGTVTDTDGKPLAGATVDVW 127 (256) T ss_pred EEEEEECCCCCCCCCCEEEEE T ss_conf 999998899999688789987 No 52 >cd05469 Transthyretin_like Transthyretin_like. This domain is present in the transthyretin-like protein (TLP) family which includes transthyretin (TTR) and a transthyretin-related protein called 5-hydroxyisourate hydrolase (HIUase). TTR and HIUase are homotetrameric proteins with each subunit consisting of eight beta-strands arranged in two sheets and a short alpha-helix. The central channel of the tetramer contains two independent binding sites, each located between a pair of subunits. TTR transports thyroid hormones and retinol in the blood serum of vertebrates while HIUase catalyzes the second step in a three-step ureide pathway. TTRs are highly conserved and found only in vertebrates while the HIUases are found in a wide range of bacterial, plant, fungal, slime mold and vertebrate organisms. 
Probab=74.86 E-value=5.1 Score=18.14 Aligned_cols=51 Identities=10% Similarity=0.152 Sum_probs=20.1 Q ss_pred EEECCCCCCCCCCEEEEEEECCCCCCCE--EEEEEECCCCEEEE----CCCCCEEEEE Q ss_conf 9986887753467599999538887620--58898246856630----2267218999 Q gi|254780160|r 43 ARLTENSTSIDSGVSWHIFDSISNKKNT--LSTTKKIIGGKVSF----DLFPGDYLIS 94 (298) Q Consensus 43 a~~~~~~~~~~~G~~~~vy~~~~~~~g~--~~~tt~~~G~~~~~----~L~pG~Y~v~ 94 (298) ...+..+.| -+|+..+||....+..-. -...|+++|..... ++.+|.|.|. T Consensus 7 VLDt~~G~P-A~gv~V~L~r~~~~~~~~~l~~~~Tn~dGR~~~~~~~~~~~~G~Y~L~ 63 (113) T cd05469 7 VLDAVRGSP-AANVAIKVFRKTADGSWEIFATGKTNEDGELHGLITEEEFXAGVYRVE 63 (113) T ss_pred EEECCCCCC-CCCCEEEEEEECCCCCEEEEEEEEECCCCCCCCCCCCCCCCCEEEEEE T ss_conf 813799867-579999999999999889999999899997577645233467229999 No 53 >cd05894 Ig_C5_MyBP-C C5 immunoglobulin (Ig) domain of cardiac myosin binding protein C (MyBP-C). Ig_C5_MyBP_C : the C5 immunoglobulin (Ig) domain of cardiac myosin binding protein C (MyBP-C). MyBP_C consists of repeated domains, Ig and fibronectin type 3, and various linkers. Three isoforms of MYBP_C exist and are included in this group: cardiac(c), and fast and slow skeletal muscle (s) MyBP_C. cMYBP_C has insertions between and inside domains and an additional cardiac-specific Ig domain at the N-terminus. For cMYBP_C an interaction has been demonstrated between this C5 domain and the Ig C8 domain. Probab=74.40 E-value=5.2 Score=18.07 Aligned_cols=71 Identities=21% Similarity=0.243 Sum_probs=35.5 Q ss_pred EEEECCCCEEEEEEEECCCEEEEEEEECCCCCCCCCEEEEEE-----CCCCCEEEECC---CCEEECC---CCCEEEEEE Q ss_conf 488428863789999345238999983478863276089998-----38997974212---5212124---487028999 Q gi|254780160|r 191 VVKVEPGKIIDVTIQNRAAKITFKLVSEMGGEAVADTAWSIL-----TASGDTVGESA---NASPSMV---LSEGDYTVI 259 (298) Q Consensus 191 ~i~V~~g~~~~~tv~~~~~~~~~~~v~~~~G~~l~ga~~~i~-----~~~g~~vt~~~---G~~~~~~---L~~G~Y~v~ 259 (298) .++|.+|+...+.+. -.|.|.+-+.|..- ..++....+.. ..+.+.. -..|.|++. 
T Consensus 4 ~i~V~~G~~~~l~~~-------------v~G~P~P~v~W~k~~~~l~~~~~r~~i~~~~~~~~L~I~~~~~~DsG~Yt~~ 70 (86) T cd05894 4 TIVVVAGNKLRLDVP-------------ISGEPAPTVTWSRGDKAFTETEGRVRVESYKDLSSFVIEGAEREDEGVYTIT 70 (86) T ss_pred EEEEECCCEEEEEEE-------------EEECCCCEEEEEECCEECCCCCCEEEEEECCCEEEEEECCCCCCCCEEEEEE T ss_conf 899986998999999-------------9994898999999999523899989999959969999999871139999999 Q ss_pred EECCCCEEEEEEEEE Q ss_conf 953882467779997 Q gi|254780160|r 260 ARNKERNYSREFSVL 274 (298) Q Consensus 260 a~~~~~~y~~~ftV~ 274 (298) |+|..-.-+..|.|. T Consensus 71 a~N~~G~~~~~~~v~ 85 (86) T cd05894 71 VTNPVGEDHASLFVK 85 (86) T ss_pred EEECCCEEEEEEEEE T ss_conf 997997999999999 No 54 >cd03460 1,2-CTD Catechol 1,2-dioxygenase (1,2-CTD) catalyzes an intradiol cleavage reaction of catechol to form cis,cis-muconate. 1,2-CTDs are homodimers with one catalytic non-heme ferric ion per monomer. They belong to the aromatic dioxygenase family, a family of mononuclear non-heme iron intradiol-cleaving enzymes that catalyze the oxygenation of catecholates to aliphatic acids via the cleavage of aromatic rings. Probab=73.66 E-value=5.4 Score=17.96 Aligned_cols=62 Identities=5% Similarity=-0.039 Sum_probs=24.1 Q ss_pred EEEEECCCCCCCCCEEEEEEECC--CCCEEEE---EEECCCCCEEEEECCCCEEEEEEECCCCCEEEE Q ss_conf 99861578877755189999838--9743566---750567513521067617999961577854454 Q gi|254780160|r 128 LYSIYKPGSPIVDDELTFSIYSNPNHKALLITDKVRSGTLVRLGTNNYQITSHYGKYNAIVST 190 (298) Q Consensus 128 ~~~~~~~~~~~~~~~~~f~i~~~--~~~~~~~---~t~~~~~~~~~~L~~G~Y~v~et~a~~~~~~~~ 190 (298)
T Consensus 127 v~G~V~D~~G~Pi~gA~vdvWqa~~~G~Y~~~dp~~~~~nlRg~~~td~~G~y~f~ti~-P~~YpIP~ 193 (282) T cd03460 127 MHGTVTDTDGKPVPGAKVEVWHANSKGFYSHFDPTQSPFNLRRSIITDADGRYRFRSIM-PSGYGVPP 193 (282) T ss_pred EEEEEECCCCCCCCCCEEEEEECCCCCCCCCCCCCCCCCCCEEEEECCCCCEEEEEEEE-CCCCCCCC T ss_conf 99999889989878977999854899875476999998764369970898369999970-56552699 No 55 >pfam01186 Lysyl_oxidase Lysyl oxidase. Probab=72.30 E-value=3.1 Score=19.53 Aligned_cols=29 Identities=7% Similarity=0.099 Sum_probs=22.6 Q ss_pred CEEEEECCCCEEEEEEECCCCCEEEEEEE Q ss_conf 13521067617999961577854454488 Q gi|254780160|r 165 GTLVRLGTNNYQITSHYGKYNAIVSTVVK 193 (298) Q Consensus 165 ~~~~~L~~G~Y~v~et~a~~~~~~~~~i~ 193 (298) +.+++|+||+|.+++...|...+...++. T Consensus 152 IDITDv~pG~Y~l~V~vNP~~~v~Esd~~ 180 (205) T pfam01186 152 IDITDVKPGNYILQVEVNPTYDVAESDFT 180 (205) T ss_pred EECCCCCCCCEEEEEEECCCCCCCEECCC T ss_conf 98157798768999996866402320246 No 56 >cd05822 TLP_HIUase HIUase (5-hydroxyisourate hydrolase) catalyzes the second step in a three-step ureide pathway in which 5-hydroxyisourate (HIU), a product of the uricase (urate oxidase) reaction, is hydrolyzed to 2-oxo-4-hydroxy-4-carboxy-5-ureidoimidazoline (OHCU). HIUase has high sequence similarity with transthyretins and is a member of the transthyretin-like protein (TLP) family. HIUase is distinguished from transthyretins by a conserved signature motif at its C-terminus that forms part of the active site. In HIUase, this motif is YRGS, while transthyretins have a conserved TAVV sequence in the same location. Most HIUases are cytosolic but in plants and slime molds, they are peroxisomal based on the presence of N-terminal periplasmic localization sequences. HIUase forms a homotetramer with each subunit consisting of eight beta-strands arranged in two sheets and a short alpha-helix. 
The central channel of the tetramer contains two independent binding sites, each located betw Probab=70.96 E-value=6.3 Score=17.56 Aligned_cols=54 Identities=15% Similarity=0.140 Sum_probs=22.4 Q ss_pred EEEEECCCCCCCCCCEEEEEEECCCCCCCE-EEEEEECCCCEEEE-----CCCCCEEEEEE Q ss_conf 999986887753467599999538887620-58898246856630-----22672189999 Q gi|254780160|r 41 CEARLTENSTSIDSGVSWHIFDSISNKKNT-LSTTKKIIGGKVSF-----DLFPGDYLISA 95 (298) Q Consensus 41 l~a~~~~~~~~~~~G~~~~vy~~~~~~~g~-~~~tt~~~G~~~~~-----~L~pG~Y~v~~ 95 (298) .++..+..+.+ .+|+..+||....+.--. ....|+++|....+ .+.+|.|.|.- T Consensus 5 tHvLD~~~G~P-A~gv~v~L~~~~~~~~~~i~~~~Td~dGR~~~~~~~~~~~~~g~Y~L~F 64 (112) T cd05822 5 THVLDTATGKP-AAGVAVTLYRLDGNGWTLLATGVTNADGRCDDLLPPGAQLAAGTYKLTF 64 (112) T ss_pred EEEECCCCCCC-CCCCEEEEEEECCCCCEEEEEEEECCCCCCCCCCCCCCCCCCEEEEEEE T ss_conf 89951899884-8899899999879996899999978999898676776666666799999 No 57 >COG2351 Transthyretin-like protein [General function prediction only] Probab=70.71 E-value=6.3 Score=17.52 Aligned_cols=51 Identities=16% Similarity=0.171 Sum_probs=24.0 Q ss_pred CCCCCCCCCEEEEEEECCCCC---E-EEEEEECCCCCEE-----EEECCCCEEEEEEECC Q ss_conf 578877755189999838974---3-5667505675135-----2106761799996157 Q gi|254780160|r 133 KPGSPIVDDELTFSIYSNPNH---K-ALLITDKVRSGTL-----VRLGTNNYQITSHYGK 183 (298) Q Consensus 133 ~~~~~~~~~~~~f~i~~~~~~---~-~~~~t~~~~~~~~-----~~L~~G~Y~v~et~a~ 183 (298) +...+.|.+++++++|.-.+. . ....++.+|.... ..+.+|.|.++-..++ T Consensus 17 Dta~GkPAagv~V~L~rl~~~~~~~l~t~~Tn~DGR~d~pll~g~~~~~G~Y~l~F~~gd 76 (124) T COG2351 17 DTASGKPAAGVKVELYRLEGNQWELLKTVVTNADGRIDAPLLAGETLATGIYELVFHTGD 76 (124) T ss_pred ECCCCCCCCCCEEEEEEECCCCCEEEEEEEECCCCCCCCCCCCCCCCCCCEEEEEEECCH T ss_conf 113698189988999996497325525788668874556453766323623899997001 No 58 >pfam01105 EMP24_GP25L emp24/gp25L/p24 family/GOLD. Members of this family are implicated in bringing cargo forward from the ER and binding to coat proteins by their cytoplasmic domains. 
This domain corresponds closely to the beta-strand rich GOLD domain described in. The GOLD domain is always found combined with lipid- or membrane-association domains. Probab=70.46 E-value=6.4 Score=17.49 Aligned_cols=86 Identities=13% Similarity=0.196 Sum_probs=47.6 Q ss_pred EEEECCCCEEEEEEEECCCEEEEEEEE-CCCCCCCCCEEEEEECCCC--CEE----EECCCCEEECCCCCEEEEEEEECC Q ss_conf 488428863789999345238999983-4788632760899983899--797----421252121244870289999538 Q gi|254780160|r 191 VVKVEPGKIIDVTIQNRAAKITFKLVS-EMGGEAVADTAWSILTASG--DTV----GESANASPSMVLSEGDYTVIARNK 263 (298) Q Consensus 191 ~i~V~~g~~~~~tv~~~~~~~~~~~v~-~~~G~~l~ga~~~i~~~~g--~~v----t~~~G~~~~~~L~~G~Y~v~a~~~ 263 (298) .|.+.+|+..++--....+.....-.. ..++ -.+..+.|.+++| ..+ ....|.|......+|+|.+--.|. T Consensus 3 ~f~l~~~~~~Cf~e~v~~~~~i~~~y~v~~~~--~~~v~~~i~~p~~~~~~~~~~~~~~~~~~~f~~~~~G~y~iCf~n~ 80 (177) T pfam01105 3 TFELPAGEKECFYEEVPKGTLVTGSYQVISGG--GLDIDFTVTDPDGNGNVIYSKKRKSEGKFSFTATESGEYKFCFSNS 80 (177) T ss_pred EEEECCCCCEEEEEECCCCCEEEEEEEEEECC--CCEEEEEEEECCCCCEEEEECCCCCCCEEEEEECCCEEEEEEEECC T ss_conf 99988999799999978998999999995389--9608999991489944999635545646999932770089999837 Q ss_pred CCEE---EEEEEEEECCC Q ss_conf 8246---77799972885 Q gi|254780160|r 264 ERNY---SREFSVLTGKS 278 (298) Q Consensus 264 ~~~y---~~~ftV~~g~~ 278 (298) .... .-.|.+..|.. T Consensus 81 ~~~~~~~~v~f~i~~~~~ 98 (177) T pfam01105 81 FSTFSSKTVSFDIKVGVD 98 (177) T ss_pred CCCCCCEEEEEEEEECCC T ss_conf 988775899999997664 No 59 >PRK10378 hypothetical protein; Provisional Probab=70.23 E-value=6.5 Score=17.46 Aligned_cols=11 Identities=27% Similarity=0.561 Sum_probs=4.5 Q ss_pred EEECCCCEEEE Q ss_conf 21067617999 Q gi|254780160|r 168 VRLGTNNYQIT 178 (298) Q Consensus 168 ~~L~~G~Y~v~ 178 (298) ..|.||+|++. 
T Consensus 91 v~L~~G~Y~~~ 101 (374) T PRK10378 91 ANLQPGEYDMT 101 (374) T ss_pred EEECCCEEEEE T ss_conf 99889537986 No 60 >PRK12813 flgD flagellar basal body rod modification protein; Reviewed Probab=66.33 E-value=7.8 Score=16.95 Aligned_cols=14 Identities=14% Similarity=0.154 Sum_probs=10.1 Q ss_pred EEECCCCEEEEEEE Q ss_conf 21067617999961 Q gi|254780160|r 168 VRLGTNNYQITSHY 181 (298) Q Consensus 168 ~~L~~G~Y~v~et~ 181 (298) ..||+|.|.|.... T Consensus 160 ~~~p~G~Y~~~v~a 173 (223) T PRK12813 160 NPLPNGAYSFVVES 173 (223) T ss_pred CCCCCCCEEEEEEE T ss_conf 89999834999999 No 61 >pfam03272 Enhancin Viral enhancin protein. Probab=64.45 E-value=8.4 Score=16.72 Aligned_cols=76 Identities=21% Similarity=0.239 Sum_probs=44.6 Q ss_pred ECCCCCCCCCHHEEECCE-EEEEEEECCCCCCCCCCEEEEEEECCCCCCCEEEEEEECCCCEEEECCCCCEEEEEEEECC Q ss_conf 303314677000230232-7999998688775346759999953888762058898246856630226721899996157 Q gi|254780160|r 21 LNNNISKGKGKRVVDAQR-ITCEARLTENSTSIDSGVSWHIFDSISNKKNTLSTTKKIIGGKVSFDLFPGDYLISASFGH 99 (298) Q Consensus 21 ~~~~~~~g~~~~~v~~~~-i~l~a~~~~~~~~~~~G~~~~vy~~~~~~~g~~~~tt~~~G~~~~~~L~pG~Y~v~~s~g~ 99 (298) |.-+++..+...++...- -.++.++.-+...=+.|..+.||+ |..-....+-+.++.....+|+||-|+++.+-|. T Consensus 426 l~snf~LV~~~~~~q~~i~~~~tl~f~Iddp~qI~G~~~~l~d---G~~~v~~~tv~~~~~~~~~~v~~GvYtl~~prG~ 502 (775) T pfam03272 426 LESNFSLVTPDDLVQTNIKANVTLTFVIDDPSQIAGETFSIYD---GDKLVLESTVTNSTSLLFTHLPPGVYTLRHPRGR 502 (775) T ss_pred CCCCEEEEEHHHHHHHCCCCEEEEEEEECCHHHHCCCEEEEEE---CCCEEEEEEECCCCCEECCCCCCCEEEEEECCCC T ss_conf 3464487204353441574148999980897893687699991---8711899998178737615878715999905898 No 62 >cd05821 TLP_Transthyretin Transthyretin (TTR) is a 55 kDa protein responsible for the transport of thyroid hormones and retinol in vertebrates. 
TTR distributes the two thyroid hormones T3 (3,5,3'-triiodo-L-thyronine) and T4 (Thyroxin, or 3,5,3',5'-tetraiodo-L-thyronine), as well as retinol (vitamin A) through the formation of a macromolecular complex that includes each of these as well as retinol-binding protein. Misfolded forms of TTR are implicated in the amyloid diseases familial amyloidotic polyneuropathy and senile systemic amyloidosis. TTR forms a homotetramer with each subunit consisting of eight beta-strands arranged in two sheets and a short alpha-helix. The central channel of the tetramer contains two independent binding sites, each located between a pair of subunits, which differ in their ligand binding affinity. A negative cooperativity has been observed for the binding of T4 and other TTR ligands. A fraction of plasma TTR is carried in high density lipoproteins by bindi Probab=63.87 E-value=8.6 Score=16.65 Aligned_cols=22 Identities=14% Similarity=0.197 Sum_probs=9.6 Q ss_pred EEEECCCCEEEE----CCCCCEEEEE Q ss_conf 898246856630----2267218999 Q gi|254780160|r 73 TTKKIIGGKVSF----DLFPGDYLIS 94 (298) Q Consensus 73 ~tt~~~G~~~~~----~L~pG~Y~v~ 94 (298) ..|+++|....+ ++.+|.|.|+ T Consensus 44 ~~Tn~DGR~~~l~~~~~~~~G~Y~L~ 69 (121) T cd05821 44 GKTTETGEIHGLTTDEQFTEGVYKVE 69 (121) T ss_pred EECCCCCCCCCCCCCCCCCCEEEEEE T ss_conf 85599767266667345677779999 No 63 >KOG3287 consensus Probab=62.91 E-value=9 Score=16.54 Aligned_cols=72 Identities=18% Similarity=0.327 Sum_probs=43.1 Q ss_pred EEEEEECCCCEEEEEEEECCC-E--EEEEEEECCCCCCCCCEEEEEECCCCCEEE----ECCCCEEECCCCCEEEEEEEE Q ss_conf 544884288637899993452-3--899998347886327608999838997974----212521212448702899995 Q gi|254780160|r 189 STVVKVEPGKIIDVTIQNRAA-K--ITFKLVSEMGGEAVADTAWSILTASGDTVG----ESANASPSMVLSEGDYTVIAR 261 (298) Q Consensus 189 ~~~i~V~~g~~~~~tv~~~~~-~--~~~~~v~~~~G~~l~ga~~~i~~~~g~~vt----~~~G~~~~~~L~~G~Y~v~a~ 261 (298) ..+|.|.+|+..++--....+ + +...+++. .|. -++.|++.++.|.++. ...|....+.-++|+|.+--. 
T Consensus 35 dftv~ipAGk~eCf~Q~v~~~~tle~eyQVi~G-~GD--l~i~Ftl~~P~G~~lv~~q~k~dg~ht~e~~e~GdY~~CfD 111 (236) T KOG3287 35 DFTVMIPAGKTECFYQPVPQGATLEVEYQVIDG-AGD--LDIDFTLLNPAGEVLVSDQRKVDGVHTVEVTETGDYQVCFD 111 (236) T ss_pred CEEEEECCCCCEEEEEECCCCEEEEEEEEEEEC-CCC--CCEEEEEECCCCCEEEECCCCCCCEEEEECCCCCCEEEEEC T ss_conf 369996699861654552677089999999705-786--42324775788628820010567536762068851699972 Q ss_pred CC Q ss_conf 38 Q gi|254780160|r 262 NK 263 (298) Q Consensus 262 ~~ 263 (298) |. T Consensus 112 Ns 113 (236) T KOG3287 112 NS 113 (236) T ss_pred CC T ss_conf 75 No 64 >pfam08842 DUF1812 Protein of unknown function (DUF1812). This family of proteins may be lipoproteins principally from bacilli. They are between 300 and 400 residues and are functionally uncharacterized. Probab=61.81 E-value=9.4 Score=16.41 Aligned_cols=20 Identities=25% Similarity=0.195 Sum_probs=14.9 Q ss_pred EEECCCCCEEEEEEEECCCC Q ss_conf 63022672189999615777 Q gi|254780160|r 82 VSFDLFPGDYLISASFGHVG 101 (298) Q Consensus 82 ~~~~L~pG~Y~v~~s~g~~~ 101 (298) ...+||+|+|.+.+=.|... T Consensus 66 m~l~l~~G~Y~~vaWgg~~~ 85 (293) T pfam08842 66 MPLPLPQGTYHFVAWGGLDD 85 (293) T ss_pred EEECCCCCCEEEEEEECCCC T ss_conf 87606987679999977888 No 65 >PRK12812 flgD flagellar basal body rod modification protein; Reviewed Probab=61.07 E-value=9.7 Score=16.33 Aligned_cols=15 Identities=27% Similarity=0.554 Sum_probs=11.0 Q ss_pred EECCCCEEEEEEECC Q ss_conf 106761799996157 Q gi|254780160|r 169 RLGTNNYQITSHYGK 183 (298) Q Consensus 169 ~L~~G~Y~v~et~a~ 183 (298) .+|+|.|+|+.++.+ T Consensus 181 ~~pdG~Yti~ata~d 195 (259) T PRK12812 181 YAGDGEYTIKAVYNN 195 (259) T ss_pred CCCCCCEEEEEEEEC T ss_conf 999984699999988 No 66 >TIGR02438 catachol_actin catechol 1,2-dioxygenase; InterPro: IPR012800 Members of this family are catechol 1,2-dioxygenases of the actinobacteria. They are more closely related to actinobacterial chlorocatechol 1,2-dioxygenases than to proteobacterial catechol 1,2-dioxygenases, and so form this separate entry. 
The member from Rhodococcus rhodochrous is described as a homodimer with bound Fe, similarly active on catechol, 3-methylcatechol and 4-methylcatechol.. Probab=60.52 E-value=9.9 Score=16.27 Aligned_cols=33 Identities=21% Similarity=0.301 Sum_probs=16.2 Q ss_pred EEEEEECCCCCEEEEECCCCEEEEEEECCCCCEE Q ss_conf 5667505675135210676179999615778544 Q gi|254780160|r 155 ALLITDKVRSGTLVRLGTNNYQITSHYGKYNAIV 188 (298) Q Consensus 155 ~~~~t~~~~~~~~~~L~~G~Y~v~et~a~~~~~~ 188 (298) ...++++.|.+.+.-|.|--|++ -+.++.|..+ T Consensus 182 gti~~d~~G~f~I~T~~PaPYqI-P~DG~~G~lI 214 (287) T TIGR02438 182 GTIIADDEGRFEITTLQPAPYQI-PTDGPTGKLI 214 (287) T ss_pred EEEEECCCCCEEEEEECCCCCCC-CCCCCCCCHH T ss_conf 33787488857887527888738-7888633043 No 67 >cd00222 CollagenBindB Collagen-binding protein B domain, mediates bacterial adherence to collagen; the primary sequence has a non-repetitive, collagen-binding A region, followed by the repetitive B region; the B region has one to four 23 kDa repeat units (B1-B4). The B repeat units have been suggested to serve as a `stalk' that projects the A region from the bacterial surface and thus facilitate bacterial adherence to collagen; each B repeat unit has two domains (D1 and D2) placed side-by-side; D1 and D2 have similar secondary structure and exhibit a unique inverse IgG-like domain fold. Probab=60.45 E-value=9.9 Score=16.26 Aligned_cols=133 Identities=14% Similarity=0.101 Sum_probs=60.6 Q ss_pred CCCEEEEEEECCCCCCCEEEEEEECCCCEEEE-CCCC----C---EEEEEEEECCCCCEEEEEEECCCCEEEEEEECCCC Q ss_conf 46759999953888762058898246856630-2267----2---18999961577752789980686136787513664 Q gi|254780160|r 53 DSGVSWHIFDSISNKKNTLSTTKKIIGGKVSF-DLFP----G---DYLISASFGHVGVVKKITVSSKEKNQKQVFILNAG 124 (298) Q Consensus 53 ~~G~~~~vy~~~~~~~g~~~~tt~~~G~~~~~-~L~p----G---~Y~v~~s~g~~~~~~~vtV~~~~~~~~~~~~~~ag 124 (298) ++-+.++||.. +...++...++.+++=...| +||- | .|+|.|..-+ +|...+ .+. 
...........- T Consensus 22 P~sI~V~L~~n-G~~~~~~~~l~~~~~W~~tF~~Lpkyd~~G~~i~YtV~E~~V~-gY~~~i--~g~-~itNt~~~~~~t 96 (187) T cd00222 22 PAKISVQLLAN-GEKYVKIVTVTKDNNWKYEFKDLPKYDNEGKKINYTVVEVQVP-DYETPI--IGE-TITNKYINKETT 96 (187) T ss_pred CCEEEEEEEEC-CCEEEEEEEECCCCCCEEEECCCCCCCCCCCEEEEEEEEECCC-CCEEEE--ECE-EEEEEEECCCCE T ss_conf 96599999949-9372569981689982899877757968998899999761479-938998--320-155238178706 Q ss_pred EEEEEEEECCC-CCCCCCEEEEEEECCCCCEEE-EEEECCC--CCEEEEECC---C---CEEEEEEECCCCCEEEE Q ss_conf 37999861578-877755189999838974356-6750567--513521067---6---17999961577854454 Q gi|254780160|r 125 GIRLYSIYKPG-SPIVDDELTFSIYSNPNHKAL-LITDKVR--SGTLVRLGT---N---NYQITSHYGKYNAIVST 190 (298) Q Consensus 125 ~~~~~~~~~~~-~~~~~~~~~f~i~~~~~~~~~-~~t~~~~--~~~~~~L~~---G---~Y~v~et~a~~~~~~~~ 190 (298) .+...++-.++ .+..+..+++.||.++..... ...+..+ ...+.+||. | .|+++|...|.||.... T Consensus 97 sv~vtK~W~d~~~~~RP~sI~V~L~anG~~~g~~~~l~~~n~W~~tF~dLp~~d~G~~i~YtV~E~~v~~GY~~~v 172 (187) T cd00222 97 SFSGKKIWKNDTAGQRPEEIQVQLLQDGQATGKTKIVTKSNDWTYTFKDLPKYDTGNEYKYSVEEVTVVDGYKTTY 172 (187) T ss_pred EEEEEEEECCCCCCCCCCEEEEEEEECCCCCCCEEEECCCCCEEEEECCCCCCCCCCEEEEEEEEECCCCCCEEEE T ss_conf 9879999519999878664999999699271756997378967999679761169986999999944899959898 No 68 >smart00720 calpain_III calpain_III. Probab=59.95 E-value=10 Score=16.21 Aligned_cols=27 Identities=22% Similarity=0.437 Sum_probs=10.7 Q ss_pred CCCCCEEEEEEEECCCCEEEEEEEEEE Q ss_conf 244870289999538824677799972 Q gi|254780160|r 249 MVLSEGDYTVIARNKERNYSREFSVLT 275 (298) Q Consensus 249 ~~L~~G~Y~v~a~~~~~~y~~~ftV~~ 275 (298) ..|+||+|.||-.--...-+.+|.++. T Consensus 108 ~~L~pG~YvIIPsTf~p~~eg~F~LrV 134 (143) T smart00720 108 FRLPPGEYVIVPSTFEPNQEGDFLLRV 134 (143) T ss_pred EECCCCCEEEEECCCCCCCEECEEEEE T ss_conf 971998899981346799753789999 No 69 >pfam06488 L_lac_phage_MSP Lactococcus lactis bacteriophage major structural protein. This family consists of several Lactococcus lactis bacteriophage major structural proteins. 
Probab=58.43 E-value=11 Score=16.04 Aligned_cols=27 Identities=15% Similarity=0.121 Sum_probs=17.9 Q ss_pred EECCCCE-EECCCCCEEEEEEEECCCCE Q ss_conf 4212521-21244870289999538824 Q gi|254780160|r 240 GESANAS-PSMVLSEGDYTVIARNKERN 266 (298) Q Consensus 240 t~~~G~~-~~~~L~~G~Y~v~a~~~~~~ 266 (298) +++.|.. ....|+||.|.+.-+.+++. T Consensus 262 kda~G~v~tN~eLapgvy~vtfSAeGYa 289 (301) T pfam06488 262 KDAHGKVATNGELAPGVYIVTFSADGYA 289 (301) T ss_pred ECCCCCCCCCCCCCCCEEEEEECCCCCC T ss_conf 6376727227711785289986046766 No 70 >COG4640 Predicted membrane protein [Function unknown] Probab=57.87 E-value=11 Score=15.98 Aligned_cols=30 Identities=23% Similarity=0.293 Sum_probs=17.2 Q ss_pred EEECCCCEEEECCCCCEEEEEEEEC--CCCCE Q ss_conf 9824685663022672189999615--77752 Q gi|254780160|r 74 TKKIIGGKVSFDLFPGDYLISASFG--HVGVV 103 (298) Q Consensus 74 tt~~~G~~~~~~L~pG~Y~v~~s~g--~~~~~ 103 (298) ++.++--.+....-||.|++++++- +++++ T Consensus 201 vtkad~~~t~GpyipG~YTvka~~kgdya~~v 232 (465) T COG4640 201 VTKADKVTTYGPYIPGTYTVKATYKGDYAGYV 232 (465) T ss_pred EECCCCCCCCCCCCCCEEEEEEEECCCCHHHH T ss_conf 40045654047866743787654057621455 No 71 >pfam11797 DUF3324 Protein of unknown function C-terminal (DUF3324). This family consists of several hypothetical bacterial proteins of unknown function. Probab=57.75 E-value=11 Score=15.97 Aligned_cols=26 Identities=23% Similarity=0.599 Sum_probs=12.4 Q ss_pred CCCCEEEEEE--EECCCC--EEEEEEEEEE Q ss_conf 4487028999--953882--4677799972 Q gi|254780160|r 250 VLSEGDYTVI--ARNKER--NYSREFSVLT 275 (298) Q Consensus 250 ~L~~G~Y~v~--a~~~~~--~y~~~ftV~~ 275 (298) -|.||+Y++. ++.... .+.++|+|.. T Consensus 102 ~lk~G~Y~l~~~~~~~~~~W~f~k~FtIt~ 131 (140) T pfam11797 102 RLKAGKYTLKLTAKSGKDKWTFTKDFTITG 131 (140) T ss_pred CCCCCEEEEEEEEECCCCEEEEEEEEEECH T ss_conf 035967899999982995598887689979 No 72 >cd03457 intradiol_dioxygenase_like Intradiol dioxygenase subgroup. 
Intradiol dioxygenases catalyze the critical ring-cleavage step in the conversion of catecholate derivatives to citric acid cycle intermediates. They break the catechol C1-C2 bond and utilize Fe3+, as opposed to the extradiol-cleaving enzymes which break the C2-C3 or C1-C6 bond and utilize Fe2+ and Mn+. The family contains catechol 1,2-dioxygenases and protocatechuate 3,4-dioxygenases. The specific function of this subgroup is unknown. Probab=57.26 E-value=11 Score=15.92 Aligned_cols=55 Identities=13% Similarity=0.128 Sum_probs=30.9 Q ss_pred EEEEEEEECC-CCCCCCCCEEEEEEECCCCCCC-------------------EEEEEEECCCCEEEECCCCCEEE Q ss_conf 2799999868-8775346759999953888762-------------------05889824685663022672189 Q gi|254780160|r 38 RITCEARLTE-NSTSIDSGVSWHIFDSISNKKN-------------------TLSTTKKIIGGKVSFDLFPGDYL 92 (298) Q Consensus 38 ~i~l~a~~~~-~~~~~~~G~~~~vy~~~~~~~g-------------------~~~~tt~~~G~~~~~~L~pG~Y~ 92 (298) -+.|..++.+ +...++.++.+.|+.-+..... +=...|+++|...+.-+.||-|. T Consensus 26 pl~l~~~v~D~~tc~P~~~a~VdiWhcda~G~YSg~~~~~~~~~~~~~~~flRG~q~TD~~G~~~F~TI~PG~Y~ 100 (188) T cd03457 26 PLTLDLQVVDVATCCPPPNAAVDIWHCDATGVYSGYSAGGGGGEDTDDETFLRGVQPTDADGVVTFTTIFPGWYP 100 (188) T ss_pred EEEEEEEEEECCCCEECCCCEEEEEECCCCCCCCCCCCCCCCCCCCCCCCEEEEEEECCCCEEEEEEEEECCCCC T ss_conf 799999999899980879988999867998540661268888777556743512797399869999988151358 No 73 >PRK06655 flgD flagellar basal body rod modification protein; Reviewed Probab=55.56 E-value=12 Score=15.74 Aligned_cols=36 Identities=17% Similarity=0.127 Sum_probs=18.5 Q ss_pred EECCCCEEEEEEECCCCCEEEEEEEECCCCEEEEEEE Q ss_conf 1067617999961577854454488428863789999 Q gi|254780160|r 169 RLGTNNYQITSHYGKYNAIVSTVVKVEPGKIIDVTIQ 205 (298) Q Consensus 169 ~L~~G~Y~v~et~a~~~~~~~~~i~V~~g~~~~~tv~ 205 (298) .||+|.|.|.......+.....+..+ .+.+..+.+. 
T Consensus 167 ~~p~G~Y~~~v~a~~~G~~~~~~~~~-~~~V~sV~~~ 202 (225) T PRK06655 167 ALPDGNYTIKASASVGGKQVVLQTLT-YANVQSVSLG 202 (225) T ss_pred CCCCCCEEEEEEEECCCCEEEEEEEE-EEEEEEEEEC T ss_conf 18997379999998099436640368-8999999966 No 74 >KOG1692 consensus Probab=55.49 E-value=12 Score=15.73 Aligned_cols=69 Identities=17% Similarity=0.207 Sum_probs=48.4 Q ss_pred CCCCEEEEEECCCCCEE----EECCCCEEECCCCCEEEEEEEECCCC-EE--EEEEEEEECCCCEEEEEEECHHCC Q ss_conf 32760899983899797----42125212124487028999953882-46--777999728850489997410103 Q gi|254780160|r 223 AVADTAWSILTASGDTV----GESANASPSMVLSEGDYTVIARNKER-NY--SREFSVLTGKSTIVEVLMRQKRMD 291 (298) Q Consensus 223 ~l~ga~~~i~~~~g~~v----t~~~G~~~~~~L~~G~Y~v~a~~~~~-~y--~~~ftV~~g~~~~veV~~~~~~~~ 291 (298) .-.|.-+.|..++|+.+ .++.|+|....-..|.|++-=.|+.- .+ ...|+|..|.-..-+=.++|.+.+ T Consensus 53 g~~~vd~~I~gP~~~~i~~~~~~ssgk~tF~a~~~G~Y~fCF~N~~s~mtpk~V~F~ihvg~~~~~~d~~~d~~~~ 128 (201) T KOG1692 53 GFLGVDVEITGPDGKIIHKGKRESSGKYTFTAPKKGTYTFCFSNKMSTMTPKTVMFTIHVGHAPQRDDLAKDAHQN 128 (201) T ss_pred CCCCEEEEEECCCCCHHHHCCCCCCCEEEEEECCCCEEEEEECCCCCCCCCEEEEEEEEEEECCCCCHHCCCCCCC T ss_conf 8663238997899861320565667518999427955899714788888743899999875146600100143335 No 75 >cd04975 Ig4_SCFR_like Fourth immunoglobulin (Ig)-like domain of stem cell factor receptor (SCFR) and similar proteins. Ig4_SCFR_like; fourth immunoglobulin (Ig)-like domain of stem cell factor receptor (SCFR). In addition to SCFR this group also includes the fourth Ig domain of platelet-derived growth factor receptors (PDGFR), alpha and beta, the fourth Ig domain of macrophage colony stimulating factor (M-CSF), and the Ig domain of the receptor tyrosine kinase KIT. SCFR and the PDGFR alpha and beta have similar organization: an extracellular component having five Ig-like domains, a transmembrane segment, and a cytoplasmic portion having protein tyrosine kinase activity. SCFR and its ligand SCF are critical for normal hematopoiesis, mast cell development, melanocytes and gametogenesis. 
SCF binds to the second and third Ig-like domains of SCFR, this fourth Ig-like domain participates in SCFR dimerization, which follows ligand binding. Deletion of this fourth SCFR_Ig-like domain abolishes Probab=55.40 E-value=12 Score=15.72 Aligned_cols=86 Identities=14% Similarity=0.119 Sum_probs=44.7 Q ss_pred EEEEECCCCEEEEEEEECCCEEEEEEEECCCCCCCCCEE-EEEECC-CCCEEEECCC-CEEECCCCCEEEEEEEECCCCE Q ss_conf 448842886378999934523899998347886327608-999838-9979742125-2121244870289999538824 Q gi|254780160|r 190 TVVKVEPGKIIDVTIQNRAAKITFKLVSEMGGEAVADTA-WSILTA-SGDTVGESAN-ASPSMVLSEGDYTVIARNKERN 266 (298) Q Consensus 190 ~~i~V~~g~~~~~tv~~~~~~~~~~~v~~~~G~~l~ga~-~~i~~~-~g~~vt~~~G-~~~~~~L~~G~Y~v~a~~~~~~ 266 (298) ..++|..|+..++.+...+..--.+..=...|.++.+.. ..+... .+..-..+.= -.....--.|.|++.|+|+... T Consensus 11 ~~~~V~~GE~~~L~V~ieAyP~p~~~~W~kd~~~l~~~~~~~~~~~~~~~~rY~S~L~L~R~k~~d~G~YT~~a~N~~~~ 90 (101) T cd04975 11 TTIFVNLGENLNLVVEVEAYPPPPHINWTYDNRTLTNKLTEIVTSENESEYRYVSELKLVRLKESEAGTYTFLASNSDAS 90 (101) T ss_pred CEEEEECCCCEEEEEEEEECCCCCEEEEEECCCCCCCCCCEEEEEEECCCEEEEEEEEEECCCCCCCEEEEEEEECCCCC T ss_conf 50999899978999999863897644897089556786427778860786088999998437834466799999977750 Q ss_pred EEEEEEEEE Q ss_conf 677799972 Q gi|254780160|r 267 YSREFSVLT 275 (298) Q Consensus 267 y~~~ftV~~ 275 (298) .+..|.|.. T Consensus 91 ~~~tF~l~V 99 (101) T cd04975 91 KSLTFELYV 99 (101) T ss_pred EEEEEEEEE T ss_conf 899999999 No 76 >smart00095 TR_THY Transthyretin. Probab=55.30 E-value=12 Score=15.71 Aligned_cols=21 Identities=14% Similarity=0.226 Sum_probs=8.8 Q ss_pred EEECCCCEEEE----CCCCCEEEEE Q ss_conf 98246856630----2267218999 Q gi|254780160|r 74 TKKIIGGKVSF----DLFPGDYLIS 94 (298) Q Consensus 74 tt~~~G~~~~~----~L~pG~Y~v~ 94 (298) .|+++|....+ ++.+|.|.|. 
T Consensus 42 ~Tn~DGR~~~l~~~~~~~~G~YrL~ 66 (121) T smart00095 42 KTNESGEIHELTTDEKFVEGLYKVE 66 (121) T ss_pred EECCCCCCCCCCCCCCCCCEEEEEE T ss_conf 6389988567677144676549999 No 77 >PRK09619 flgD flagellar basal body rod modification protein; Reviewed Probab=54.88 E-value=12 Score=15.67 Aligned_cols=44 Identities=23% Similarity=0.344 Sum_probs=21.1 Q ss_pred EECCCCEEEEEEECCCCCEEEEEEEECCCCEEEEEEEECCCEEEEEE Q ss_conf 10676179999615778544544884288637899993452389999 Q gi|254780160|r 169 RLGTNNYQITSHYGKYNAIVSTVVKVEPGKIIDVTIQNRAAKITFKL 215 (298) Q Consensus 169 ~L~~G~Y~v~et~a~~~~~~~~~i~V~~g~~~~~tv~~~~~~~~~~~ 215 (298) .||+|.|.|.+.... +.. ..++.+ .+++..+++....+.+.+.+ T Consensus 161 ~~~~G~Y~~~v~a~~-g~~-~~~~~~-~a~V~sVs~~~~g~~~~Lnl 204 (220) T PRK09619 161 GLQPGQYQLSVVSGS-GEE-LIPVEV-AGKVNNVRISPQGGAPQLNI 204 (220) T ss_pred CCCCCEEEEEEEECC-CCC-CCCEEE-EEEEEEEEECCCCCEEEEEE T ss_conf 899940799999778-952-232478-89989999758997689996 No 78 >pfam03443 Glyco_hydro_61 Glycosyl hydrolase family 61. Probab=47.51 E-value=16 Score=14.93 Aligned_cols=12 Identities=17% Similarity=0.457 Sum_probs=9.1 Q ss_pred EECCCCEEEEEE Q ss_conf 106761799996 Q gi|254780160|r 169 RLGTNNYQITSH 180 (298) Q Consensus 169 ~L~~G~Y~v~et 180 (298) +|++|.|-|+.. T Consensus 161 ~l~~G~YLlR~E 172 (234) T pfam03443 161 SIAPGNYLLRHE 172 (234) T ss_pred CCCCCCEEEEEC T ss_conf 789974488632 No 79 >PRK10301 hypothetical protein; Provisional Probab=47.14 E-value=16 Score=14.89 Aligned_cols=24 Identities=33% Similarity=0.385 Sum_probs=14.3 Q ss_pred CCCCEEEEEE---EECCCCEEEE--EEEE Q ss_conf 4487028999---9538824677--7999 Q gi|254780160|r 250 VLSEGDYTVI---ARNKERNYSR--EFSV 273 (298) Q Consensus 250 ~L~~G~Y~v~---a~~~~~~y~~--~ftV 273 (298) .|++|.|++. ...|++-.+- .|+| T Consensus 95 ~L~~G~YtV~WrvvS~DGH~v~G~~~FsV 123 (124) T PRK10301 95 SLKPGTYTVDWHVVSVDGHKTKGHYTFSV 123 (124) T ss_pred CCCCCEEEEEEEEEECCCCCCCCEEEEEE T ss_conf 88990189999999568983178698897 No 80 >pfam11138 DUF2911 Protein of unknown function (DUF2911). 
This bacterial family of proteins has no known function. Probab=47.11 E-value=16 Score=14.89 Aligned_cols=60 Identities=15% Similarity=0.093 Sum_probs=31.8 Q ss_pred HHEEECCEEEEEEEECCCCCCCCCCEEEEEEECCCCCCCEEEEEEECCCCEEEECCCCCEEEEEEE Q ss_conf 002302327999998688775346759999953888762058898246856630226721899996 Q gi|254780160|r 31 KRVVDAQRITCEARLTENSTSIDSGVSWHIFDSISNKKNTLSTTKKIIGGKVSFDLFPGDYLISAS 96 (298) Q Consensus 31 ~~~v~~~~i~l~a~~~~~~~~~~~G~~~~vy~~~~~~~g~~~~tt~~~G~~~~~~L~pG~Y~v~~s 96 (298) +..+.-++-.+..|...++ -+.=|-+||. |.+....++.+.+=..--..||.|+|.|..- T Consensus 14 ~i~I~YsrP~~kgR~IfG~-LVPygkvWRt-----GAN~aT~i~fs~dv~i~g~~l~aG~Ysl~ti 73 (145) T pfam11138 14 DITVEYSRPSVKGRKIFGG-LVPYGKVWRT-----GANEATEITFSKDVTIGGKKLPAGTYSLFTI 73 (145) T ss_pred EEEEEECCCCCCCCCCCCC-CCCCCCEEEC-----CCCCCEEEEECCCEEECCEECCCCEEEEEEE T ss_conf 9999988986588702445-3458865104-----7874218996367789989857861899998 No 81 >TIGR02439 catechol_proteo catechol 1,2-dioxygenase; InterPro: IPR012801 Members of this family known so far are catechol 1,2-dioxygenases of the proteobacteria. They are distinct from catechol 1,2-dioxygenases and chlorocatechol 1,2-dioxygenases of the actinobacteria, which are quite similar to each other and resolved by separate entries. This enzyme catalyses intradiol cleavage in which catechol + O2 becomes cis,cis-muconate. Catechol is an intermediate in the catabolism of many different aromatic compounds, as is the alternative intermediate protocatechuate. In Acinetobacter lwoffii, two isozymes are present with abilities, differing somewhat, to act on catechol analogues 3-methylcatechol, 4-methylcatechol, 4-methoxycatechol, and 4-chlorocatechol.; GO: 0005506 iron ion binding, 0018576 catechol 12-dioxygenase activity, 0019614 catechol catabolic process. 
Probab=46.57 E-value=16 Score=14.83 Aligned_cols=30 Identities=7% Similarity=0.182 Sum_probs=18.6 Q ss_pred ECCEEEEEEEECCCCCCCCCCEEEEEEECC Q ss_conf 023279999986887753467599999538 Q gi|254780160|r 35 DAQRITCEARLTENSTSIDSGVSWHIFDSI 64 (298) Q Consensus 35 ~~~~i~l~a~~~~~~~~~~~G~~~~vy~~~ 64 (298) ++..+-|+.++.+..++|.+||...|+-.+ T Consensus 126 ~g~tl~l~G~V~d~~G~Pi~gA~VE~WHAN 155 (288) T TIGR02439 126 KGETLVLHGTVTDTDGKPIAGAKVEVWHAN 155 (288) T ss_pred CCCEEEEEEEEECCCCCCCCCCEEEEEECC T ss_conf 831689865887898875478766577517 No 82 >TIGR03503 conserved hypothetical protein TIGR03503. This set of conserved hypothetical protein has a phylogenetic range that closely matches that of TIGR03501, a putative C-terminal protein targeting signal. Probab=45.98 E-value=17 Score=14.78 Aligned_cols=94 Identities=12% Similarity=0.083 Sum_probs=44.6 Q ss_pred EEECCCCEEEEEEECCCCCEEE---EEEEECCCCEEEEEEEEC---CCEEEEEEEECCCCCCCCC---EEEEEECCCCCE Q ss_conf 2106761799996157785445---448842886378999934---5238999983478863276---089998389979 Q gi|254780160|r 168 VRLGTNNYQITSHYGKYNAIVS---TVVKVEPGKIIDVTIQNR---AAKITFKLVSEMGGEAVAD---TAWSILTASGDT 238 (298) Q Consensus 168 ~~L~~G~Y~v~et~a~~~~~~~---~~i~V~~g~~~~~tv~~~---~~~~~~~~v~~~~G~~l~g---a~~~i~~~~g~~ 238 (298) ...+||.|.+..+... +-..| .+|.|.+ ...++++.-. .+.-.+.+. .+.+.--++ +.+++..++|+. T Consensus 179 l~~~~G~Y~~~v~~~n-~vF~R~~~q~v~v~p-~Pi~~~~~q~~~~~~~h~l~v~-~d~~~i~p~s~~~~~e~~~P~g~~ 255 (374) T TIGR03503 179 LDVAPGEYRPTYQSRN-PVFLREVEQPVLVYP-LPVSYTVIQSEDESGAHQLMVD-ADAGHIDPGSLVIHGELVFPNGQI 255 (374) T ss_pred CCCCCCEEEEEEEECC-CEEEEEEEEEEEEEC-CCCEEEEECCCCCCCCEEEEEE-CCCCEECCCCEEEEEEEECCCCCE T ss_conf 3579965799999769-748999876089967-9817998736889986699997-574645553279999997899867 Q ss_pred EEE---CCC----CEEECCCCCEEEEEEEECCC Q ss_conf 742---125----21212448702899995388 Q gi|254780160|r 239 VGE---SAN----ASPSMVLSEGDYTVIARNKE 264 (298) Q Consensus 239 vt~---~~G----~~~~~~L~~G~Y~v~a~~~~ 264 (298) .-- ... 
.....+...|.|.+..+--+ T Consensus 256 ~~~~~~~~~~~~~~~lp~~~e~G~Y~~~g~v~a 288 (374) T TIGR03503 256 QQFSIELEEPETRVDLPANYEFGKYRVKGTVFG 288 (374) T ss_pred EEEECCCCCCCEEEECCCCCCCCEEEEEEEEEE T ss_conf 751115666753897437788735899999999 No 83 >cd05864 Ig2_VEGFR-2 Second immunoglobulin (Ig)-like domain of vascular endothelial growth factor receptor 2 (VEGFR-2). Ig2_VEGF-2: Second immunoglobulin (Ig)-like domain of vascular endothelial growth factor receptor 2 (VEGFR-2). The VEGFRs have an extracellular component with seven Ig-like domains, a transmembrane segment, and an intracellular tyrosine kinase domain interrupted by a kinase-insert domain. VEGFRs bind VEGFs with high affinity at the Ig-like domains. VEGFR-2 (KDR/Flk-1) is a major mediator of the mitogenic, angiogenic and microvascular permeability-enhancing effects of VEGF-A; VEGF-A is important to the growth and maintenance of vascular endothelial cells and to the development of new blood- and lymphatic-vessels in physiological and pathological states. VEGF-A also interacts with VEGFR-1, which it binds more strongly than VEGFR-2. VEGFR-2 and -1 may mediate a chemotactic and a survival signal in hematopoietic stem cells or leukemia cells. Probab=45.78 E-value=17 Score=14.76 Aligned_cols=22 Identities=23% Similarity=0.406 Sum_probs=14.8 Q ss_pred CCEEEEEEEECCCCEEEEEEEE Q ss_conf 8702899995388246777999 Q gi|254780160|r 252 SEGDYTVIARNKERNYSREFSV 273 (298) Q Consensus 252 ~~G~Y~v~a~~~~~~y~~~ftV 273 (298) .+|.|+++.+|+-..-++..++ T Consensus 45 DAG~YTvvltN~~~~~~~~~t~ 66 (70) T cd05864 45 DAGNYTVVLTNPITKEEQRHTF 66 (70) T ss_pred CCCCEEEEEECHHHHCEEEEEE T ss_conf 1862699997405600304699 No 84 >cd05860 Ig4_SCFR Fourth immunoglobulin (Ig)-like domain of stem cell factor receptor (SCFR). Ig4_SCFR: The fourth Immunoglobulin (Ig)-like domain in stem cell factor receptor (SCFR). 
SCFR is organized as an extracellular component having five IG-like domains, a transmembrane segment, and a cytoplasmic portion having protein tyrosine kinase activity. SCFR and its ligand SCF are critical for normal hematopoiesis, mast cell development, melanocytes and gametogenesis. SCF binds to the second and third Ig-like domains of SCFR. This fourth Ig-like domain participates in SCFR dimerization, which follows ligand binding. Deletion of this fourth domain abolishes the ligand-induced dimerization of SCFR and completely inhibits signal transduction. Probab=45.42 E-value=17 Score=14.72 Aligned_cols=86 Identities=17% Similarity=0.127 Sum_probs=42.7 Q ss_pred EEEEEECCCCEEEEEEEECCCEEEEEEEECCCCCCCCCE---EEEEECCCCCEEEECC-CCEEECCCCCEEEEEEEECCC Q ss_conf 544884288637899993452389999834788632760---8999838997974212-521212448702899995388 Q gi|254780160|r 189 STVVKVEPGKIIDVTIQNRAAKITFKLVSEMGGEAVADT---AWSILTASGDTVGESA-NASPSMVLSEGDYTVIARNKE 264 (298) Q Consensus 189 ~~~i~V~~g~~~~~tv~~~~~~~~~~~v~~~~G~~l~ga---~~~i~~~~g~~vt~~~-G~~~~~~L~~G~Y~v~a~~~~ 264 (298) +..+.|..|+..++.|...+..--....=...+..+.+. .....+... --..+. --..+..--.|-|+|.|+|.+ T Consensus 10 ~~~~~V~~gE~l~L~V~ieAYP~p~~~~W~~~n~t~~n~~~~~~~~~~~~~-~RY~s~L~L~Rlk~~E~G~YTf~a~N~d 88 (101) T cd05860 10 NTTIFVNAGENLDLIVEYEAYPKPEHQQWIYMNRTLTNTSDHYVKSRNESN-NRYVSELHLTRLKGTEGGTYTFLVSNSD 88 (101) T ss_pred CCEEEEECCCCEEEEEEEEECCCCCCEEEEECCCCCCCCCCCEEEEEECCC-EEEEEEEEEEEECCCCCEEEEEEEECCC T ss_conf 831999779978999999968998525788789864565542016778465-0899999988607254727999998877 Q ss_pred CEEEEEEEEEE Q ss_conf 24677799972 Q gi|254780160|r 265 RNYSREFSVLT 275 (298) Q Consensus 265 ~~y~~~ftV~~ 275 (298) ...+..|.|.. T Consensus 89 ~~~s~TF~l~v 99 (101) T cd05860 89 ASASVTFNVYV 99 (101) T ss_pred CCEEEEEEEEE T ss_conf 73589999998 No 85 >TIGR03000 plancto_dom_1 Planctomycetes uncharacterized domain TIGR03000. 
Probab=44.79 E-value=18 Score=14.66 Aligned_cols=37 Identities=19% Similarity=0.333 Sum_probs=25.0 Q ss_pred EEECCC---C-EEEEEEECCCCCE-EE--EEEEECCCCEEEEEE Q ss_conf 210676---1-7999961577854-45--448842886378999 Q gi|254780160|r 168 VRLGTN---N-YQITSHYGKYNAI-VS--TVVKVEPGKIIDVTI 204 (298) Q Consensus 168 ~~L~~G---~-Y~v~et~a~~~~~-~~--~~i~V~~g~~~~~tv 204 (298) ..|++G . |.|+..+.++||. .+ ..|.|.+|+..++.+ T Consensus 38 p~L~~G~~y~nY~v~a~~~~~GY~~~t~~~~v~vrAGd~v~~~f 81 (81) T TIGR03000 38 PPLEAGKEYENYEVTAELERDGYRTLTRTRTVVVRAGDTVTVDF 81 (81) T ss_pred CCCCCCCCEEEEEEEEEEECCCCCCCCCCEEEEECCCCEEEEEC T ss_conf 88878981441488888721996338732388766697689819 No 86 >pfam01060 DUF290 Transthyretin-like family. This family called family 2 in, has weak similarity to transthyretin (formerly called pre-albumin) which transports thyroid hormones. The specific function of this protein is unknown. Probab=43.68 E-value=18 Score=14.55 Aligned_cols=10 Identities=20% Similarity=0.664 Sum_probs=3.7 Q ss_pred CCEEEEEEEC Q ss_conf 6759999953 Q gi|254780160|r 54 SGVSWHIFDS 63 (298) Q Consensus 54 ~G~~~~vy~~ 63 (298) +|+..+||+. T Consensus 12 ~~v~V~L~d~ 21 (80) T pfam01060 12 AGVKVKLYEK 21 (80) T ss_pred CCCEEEEEEC T ss_conf 8999999977 No 87 >COG3656 Predicted periplasmic protein [Function unknown] Probab=41.61 E-value=20 Score=14.35 Aligned_cols=29 Identities=17% Similarity=0.209 Sum_probs=17.5 Q ss_pred EEEEECCCCEEEEEEECCC--CC-EEEEEEEE Q ss_conf 3521067617999961577--85-44544884 Q gi|254780160|r 166 TLVRLGTNNYQITSHYGKY--NA-IVSTVVKV 194 (298) Q Consensus 166 ~~~~L~~G~Y~v~et~a~~--~~-~~~~~i~V 194 (298) .+-.||||+Y++++..+-+ +| .++.+|.. T Consensus 116 ~lk~lppG~Y~lvVEa~REvGg~elvr~pfsw 147 (172) T COG3656 116 KLKLLPPGDYYLVVEAGREVGGYELVRQPFSW 147 (172) T ss_pred HHCCCCCCCEEEEEEECCCCCCCEEEEEEEEC T ss_conf 11238998579999722124883028840413 No 88 >cd05737 Ig_Myomesin_like_C C-terminal immunoglobulin (Ig)-like domain of myomesin and M-protein. 
Ig_Myomesin_like_C: domain similar to the C-terminal immunoglobulin (Ig)-like domain of myomesin and M-protein. Myomesin and M-protein are both structural proteins localized to the M-band, a transverse structure in the center of the sarcomere, and are candidates for M-band bridges. Both proteins are modular, consisting mainly of repetitive Ig-like and fibronectin type III (FnIII) domains. Myomesin is expressed in all types of vertebrate striated muscle; M-protein has a muscle-type specific expression pattern. Myomesin is present in both slow and fast fibers; M-protein is present only in fast fibers. It has been suggested that myomesin acts as a molecular spring with alternative splicing as a means of modifying its elasticity. Probab=41.54 E-value=20 Score=14.34 Aligned_cols=74 Identities=19% Similarity=0.273 Sum_probs=36.7 Q ss_pred EEEEEEECCCCEEEEEEEECCCEEEEEEEECCCCCCCCCEEEEEE----CCCCC-EEEECCC---CEEECC---CCCEEE Q ss_conf 454488428863789999345238999983478863276089998----38997-9742125---212124---487028 Q gi|254780160|r 188 VSTVVKVEPGKIIDVTIQNRAAKITFKLVSEMGGEAVADTAWSIL----TASGD-TVGESAN---ASPSMV---LSEGDY 256 (298) Q Consensus 188 ~~~~i~V~~g~~~~~tv~~~~~~~~~~~v~~~~G~~l~ga~~~i~----~~~g~-~vt~~~G---~~~~~~---L~~G~Y 256 (298) +...++|.+|+...++... .|.|.+...|.-- ..+.. .+....+ .+.+.. -..|.| T Consensus 7 lP~~vtV~eG~~v~l~c~v-------------~G~P~P~v~W~k~~~~i~~~~~~~i~~~~~~~~~L~I~~v~~~D~G~Y 73 (92) T cd05737 7 LPDVVTIMEGKTLNLTCTV-------------FGDPDPEVSWLKNDQALALSDHYNVKVEQGKYASLTIKGVSSEDSGKY 73 (92) T ss_pred CCCEEEEECCCEEEEEEEE-------------EECCCCEEEEEECCEECCCCCCEEEEEECCCEEEEEECCCCCCCCEEE T ss_conf 9987899199919999999-------------880699999999999966899889999689989999927982239999 Q ss_pred EEEEECCCCEEEEEEEEE Q ss_conf 999953882467779997 Q gi|254780160|r 257 TVIARNKERNYSREFSVL 274 (298) Q Consensus 257 ~v~a~~~~~~y~~~ftV~ 274 (298) +..|+|..-.-+..++|. 
T Consensus 74 ~c~a~N~~G~~~~~~~l~ 91 (92) T cd05737 74 GIVVKNKYGGETVDVTVS 91 (92) T ss_pred EEEEEECCEEEEEEEEEE T ss_conf 999997981499999998 No 89 >cd05748 Ig_Titin_like Immunoglobulin (Ig)-like domain of titin and similar proteins. Ig_Titin_like: immunoglobulin (Ig)-like domain found in titin-like proteins. Titin (also called connectin) is a fibrous sarcomeric protein specifically found in vertebrate striated muscle. Titin is gigantic, depending on isoform composition it ranges from 2970 to 3700 kDa, and is of a length that spans half a sarcomere. Titin largely consists of multiple repeats of Ig-like and fibronectin type 3 (FN-III)-like domains. Titin connects the ends of myosin thick filaments to Z disks and extends along the thick filament to the H zone. It appears to function similarly to an elastic band, keeping the myosin filaments centered in the sarcomere during muscle contraction or stretching. Within the sarcomere, titin is also attached to or is associated with myosin binding protein C (MyBP-C). MyBP-C appears to contribute to the generation of passive tension by titin, and similar to titin has repeated Ig-like and FN- Probab=41.18 E-value=20 Score=14.30 Aligned_cols=23 Identities=26% Similarity=0.385 Sum_probs=14.5 Q ss_pred CCEEEEEEEECCCCEEEEEEEEE Q ss_conf 87028999953882467779997 Q gi|254780160|r 252 SEGDYTVIARNKERNYSREFSVL 274 (298) Q Consensus 252 ~~G~Y~v~a~~~~~~y~~~ftV~ 274 (298) ..|.|+++|+|..-..+..+.|. 
T Consensus 51 D~G~Y~c~a~N~~G~~~~~~~v~ 73 (74) T cd05748 51 DSGKYTLTLKNPAGEKSATINVK 73 (74) T ss_pred CCEEEEEEEEECCCEEEEEEEEE T ss_conf 09899999999985999999999 No 90 >COG5266 CbiK ABC-type Co2+ transport system, periplasmic component [Inorganic ion transport and metabolism] Probab=38.87 E-value=22 Score=14.07 Aligned_cols=54 Identities=13% Similarity=0.231 Sum_probs=37.8 Q ss_pred CCEEEEEEEECCCCCCCCCEEEEEECCC----------CC-------EEEECCCCEEECCCCCEEEEEEEECC Q ss_conf 5238999983478863276089998389----------97-------97421252121244870289999538 Q gi|254780160|r 208 AAKITFKLVSEMGGEAVADTAWSILTAS----------GD-------TVGESANASPSMVLSEGDYTVIARNK 263 (298) Q Consensus 208 ~~~~~~~~v~~~~G~~l~ga~~~i~~~~----------g~-------~vt~~~G~~~~~~L~~G~Y~v~a~~~ 263 (298) .....+++++. |+||++|++.+...+ +. .-||++|.|.+.-|..|-.-+.|-+. T Consensus 171 ge~f~~~vl~~--GkPv~nA~V~v~~~n~~~~d~~a~~~~~ek~~~~~~TD~kG~~~fip~r~G~W~~~~~~~ 241 (264) T COG5266 171 GEVFRGKVLDN--GKPVPNATVEVEFDNIDTKDNRAKTGNTEKTALVQFTDDKGEVSFIPLRAGVWGFAVEHK 241 (264) T ss_pred CCEEEEEEEEC--CCCCCCCEEEEEEECCCCCCCCCCCCCCCCCCEEEECCCCCEEEEEECCCCEEEEEEECC T ss_conf 77578999778--964898689999841211234344588887306887178953998872476478986514 No 91 >pfam01067 Calpain_III Calpain large subunit, domain III. The function of the domain III and I are currently unknown. Domain II is a cysteine protease and domain IV is a calcium binding domain. Calpains are believed to participate in intracellular signaling pathways mediated by calcium ions. Probab=38.60 E-value=22 Score=14.05 Aligned_cols=10 Identities=40% Similarity=0.670 Sum_probs=4.5 Q ss_pred EECCCCEEEE Q ss_conf 1067617999 Q gi|254780160|r 169 RLGTNNYQIT 178 (298) Q Consensus 169 ~L~~G~Y~v~ 178 (298) .|+||.|.|+ T Consensus 107 ~L~pG~YvII 116 (139) T pfam01067 107 RLPPGDYVIV 116 (139) T ss_pred ECCCCCEEEE T ss_conf 7199889999 No 92 >pfam07523 Big_3 Bacterial Ig-like domain (group 3). This family consists of bacterial domains with an Ig-like fold. 
Members of this family are found in a variety of bacterial surface proteins. Probab=38.54 E-value=22 Score=14.04 Aligned_cols=13 Identities=15% Similarity=0.419 Sum_probs=5.1 Q ss_pred CCCEEEEEEEECC Q ss_conf 6721899996157 Q gi|254780160|r 87 FPGDYLISASFGH 99 (298) Q Consensus 87 ~pG~Y~v~~s~g~ 99 (298) +||.|.|..+|.. T Consensus 47 ~~G~y~VTyty~g 59 (68) T pfam07523 47 KAGTYEVTYTYDG 59 (68) T ss_pred CCEEEEEEEEECC T ss_conf 9728899999899 No 93 >TIGR02656 cyanin_plasto plastocyanin; InterPro: IPR002387 Blue or 'type-1' copper proteins are small proteins which bind a single copper atom and which are characterised by an intense electronic absorption band near 600 nm. The most well known members of this class of proteins are the plant chloroplastic plastocyanins, and the distantly related bacterial azurins, which exchange electrons with cytochrome c551. Plastocyanin participates in electron transfer between the cytochrome b6f complex and photosystem I. Many cyanobacteria and eukaryotic algae can synthesise both plastocyanin and its functional analog cytochrome c6, depending on bioavailabilities of copper and iron, respectively. Plastocyanin participates in electron transfer between P700 and the cytochrome b/f complex in photosystem I.; GO: 0005507 copper ion binding, 0009055 electron carrier activity, 0006118 electron transport. Probab=38.29 E-value=18 Score=14.63 Aligned_cols=80 Identities=15% Similarity=0.255 Sum_probs=36.7 Q ss_pred EEEEEECCCCEEEEEEE-ECCCEEEEEEEECCCCCCCCCEEE-----EEECCCCCEEEECCCCEEECCCCCEEEEEEE-E Q ss_conf 54488428863789999-345238999983478863276089-----9983899797421252121244870289999-5 Q gi|254780160|r 189 STVVKVEPGKIIDVTIQ-NRAAKITFKLVSEMGGEAVADTAW-----SILTASGDTVGESANASPSMVLSEGDYTVIA-R 261 (298) Q Consensus 189 ~~~i~V~~g~~~~~tv~-~~~~~~~~~~v~~~~G~~l~ga~~-----~i~~~~g~~vt~~~G~~~~~~L~~G~Y~v~a-~ 261 (298) ..+|+|.+|++.+.... .---.++.-......+ ...... .|+++=|+- .-.-...+.+||+|++.+ . 
T Consensus 16 P~~~~i~aGDtV~f~NNK~~PHNvVFD~~~~P~~--~~~~a~~lS~~~Ll~~Pges----y~~Tf~~da~pGtY~fYC~P 89 (102) T TIGR02656 16 PAKISIAAGDTVKFVNNKGGPHNVVFDEDAVPAG--VKELAKSLSHSDLLNSPGES----YSVTFSTDAPPGTYTFYCEP 89 (102) T ss_pred CCCEEECCCCEEEEEECCCCCCCEEECCCCCCCC--CHHHHHHCCHHHHHCCCCCE----EEEEECCCCCCCCCCEECCC T ss_conf 5810468898178843788997678561017863--11466525814764189972----89851278889873302178 Q ss_pred CCCCEEEEEEEEE Q ss_conf 3882467779997 Q gi|254780160|r 262 NKERNYSREFSVL 274 (298) Q Consensus 262 ~~~~~y~~~ftV~ 274 (298) +.+.++--.++|+ T Consensus 90 HrGAGMVGkitV~ 102 (102) T TIGR02656 90 HRGAGMVGKITVE 102 (102) T ss_pred CCCCCCEEEEEEC T ss_conf 8677864378739 No 94 >cd05891 Ig_M-protein_C C-terminal immunoglobulin (Ig)-like domain of M-protein (also known as myomesin-2). Ig_M-protein_C: the C-terminal immunoglobulin (Ig)-like domain of M-protein (also known as myomesin-2). M-protein is a structural protein localized to the M-band, a transverse structure in the center of the sarcomere, and is a candidate for M-band bridges. M-protein is modular consisting mainly of repetitive IG-like and fibronectin type III (FnIII) domains, and has a muscle-type specific expression pattern. M-protein is present in fast fibers. 
Probab=37.64 E-value=23 Score=13.95 Aligned_cols=23 Identities=22% Similarity=0.372 Sum_probs=14.1 Q ss_pred CCCEEEEEEEECCCCEEEEEEEE Q ss_conf 48702899995388246777999 Q gi|254780160|r 251 LSEGDYTVIARNKERNYSREFSV 273 (298) Q Consensus 251 L~~G~Y~v~a~~~~~~y~~~ftV 273 (298) -..|.|+..|+|..-.-...++| T Consensus 68 ~D~G~Y~c~a~N~~G~~~~~~~l 90 (92) T cd05891 68 EDSGKYSINVKNKYGGETVDVTV 90 (92) T ss_pred HHCEEEEEEEEECCCCEEEEEEE T ss_conf 88999999999898789999999 No 95 >PRK10689 transcription-repair coupling factor; Provisional Probab=36.77 E-value=23 Score=13.86 Aligned_cols=22 Identities=18% Similarity=0.135 Sum_probs=17.2 Q ss_pred ECCCCCEEEEEEEECCCCEEEE Q ss_conf 1244870289999538824677 Q gi|254780160|r 248 SMVLSEGDYTVIARNKERNYSR 269 (298) Q Consensus 248 ~~~L~~G~Y~v~a~~~~~~y~~ 269 (298) +..|.+|||.|+..+.--.|.. T Consensus 474 l~eL~~GDyVVH~dHGIGrY~G 495 (1148) T PRK10689 474 LAELHPGQPVVHLEHGVGRYAG 495 (1148) T ss_pred HHHCCCCCEECCCCCCEEEEEC T ss_conf 8767889862211257567604 No 96 >PRK12633 flgD flagellar basal body rod modification protein; Provisional Probab=36.30 E-value=24 Score=13.82 Aligned_cols=36 Identities=19% Similarity=0.187 Sum_probs=19.3 Q ss_pred EECCCCEEEEEEEC-CCCCEEEEEEEECCCCEEEEEEE Q ss_conf 10676179999615-77854454488428863789999 Q gi|254780160|r 169 RLGTNNYQITSHYG-KYNAIVSTVVKVEPGKIIDVTIQ 205 (298) Q Consensus 169 ~L~~G~Y~v~et~a-~~~~~~~~~i~V~~g~~~~~tv~ 205 (298) .||+|.|.|..... ..+..+..+..+ .+++..+++. T Consensus 169 ~~~~G~Y~~~v~a~~~~G~~v~~~~~~-~~~V~sV~~~ 205 (230) T PRK12633 169 PLADGKYSITVSASDADAKPVKAEALT-YGQVKSVAYS 205 (230) T ss_pred CCCCCCEEEEEEEEECCCCEEEEEEEE-EEEEEEEEEC T ss_conf 389971289999994799527887778-8999899963 No 97 >cd04971 Ig_TrKABC_d5 Fifth domain (immunoglobulin-like) of Trk receptors TrkA, TrkB and TrkC. TrkABC_d5: the fifth domain of Trk receptors TrkA, TrkB and TrkC, this is an immunoglobulin (Ig)-like domain which binds to neurotrophin. The Trk family of receptors are tyrosine kinase receptors. 
They are activated by dimerization, leading to autophosphorylation of intracellular tyrosine residues, and triggering the signal transduction pathway. TrkA, TrkB, and TrkC share significant sequence homology and domain organization. The first three domains are leucine-rich domains. The fourth and fifth domains are Ig-like domains playing a part in ligand binding. TrkA, B and C mediate the trophic effects of the neurotrophin Nerve growth factor (NGF) family. TrkA is recognized by NGF. TrkB is recognized by brain-derived neurotrophic factor (BDNF) and neurotrophin (NT)-4. TrkC is recognized by NT-3. NT-3 is promiscuous as in some cell systems it activates TrkA and TrkB receptors. TrkA is a receptor foun Probab=35.52 E-value=24 Score=13.74 Aligned_cols=22 Identities=32% Similarity=0.566 Sum_probs=15.0 Q ss_pred CCCEEEEEEEECCCCEEEEEEE Q ss_conf 4870289999538824677799 Q gi|254780160|r 251 LSEGDYTVIARNKERNYSREFS 272 (298) Q Consensus 251 L~~G~Y~v~a~~~~~~y~~~ft 272 (298) +..|.|+++|+|..-..++.+. T Consensus 56 ~D~G~YTl~A~N~~G~~~~~i~ 77 (81) T cd04971 56 VNNGNYTLVASNEYGQDSKSIS 77 (81) T ss_pred CCCCEEEEEEECCCCEEEEEEE T ss_conf 3485389999956770014898 No 98 >cd04976 Ig2_VEGFR Second immunoglobulin (Ig)-like domain of vascular endothelial growth factor receptor (VEGFR). Ig2_VEGFR: Second immunoglobulin (Ig)-like domain of vascular endothelial growth factor receptor (VEGFR). The VEGFRs have an extracellular component with seven Ig-like domains, a transmembrane segment, and an intracellular tyrosine kinase domain interrupted by a kinase-insert domain. The VEGFR family consists of three members, VEGFR-1 (Flt-1), VEGFR-2 (KDR/Flk-1) and VEGFR-3 (Flt-4). VEGFRs bind VEGFs with high affinity at the Ig-like domains. VEGF-A is important to the growth and maintenance of vascular endothelial cells and to the development of new blood- and lymphatic-vessels in physiological and pathological states. 
VEGFR-2 is a major mediator of the mitogenic, angiogenic and microvascular permeability-enhancing effects of VEGF-A. VEGFR-1 may play an inhibitory part in these processes by binding VEGF and interfering with its interaction with VEGFR-2. VEGFR-1 has a signa Probab=34.67 E-value=25 Score=13.65 Aligned_cols=22 Identities=27% Similarity=0.614 Sum_probs=14.8 Q ss_pred CCEEEEEEEECCCCEEEEEEEE Q ss_conf 8702899995388246777999 Q gi|254780160|r 252 SEGDYTVIARNKERNYSREFSV 273 (298) Q Consensus 252 ~~G~Y~v~a~~~~~~y~~~ftV 273 (298) ..|.|+.+|+|..-..++.+++ T Consensus 46 D~G~YTc~a~N~aG~~~~~~~l 67 (71) T cd04976 46 DAGNYTVVLTNKQAKLEKRLTF 67 (71) T ss_pred HCEEEEEEEEECCCCEEEEEEE T ss_conf 8998899999954778999899 No 99 >TIGR02962 hdxy_isourate hydroxyisourate hydrolase; InterPro: IPR014306 Members of this entry, hydroxyisourate hydrolase, represent a distinct clade of transthyretin-related proteins. Bacterial members typically are encoded next to ureidoglycolate hydrolase and often near either xanthine dehydrogenase or xanthine/uracil permease genes.; GO: 0006810 transport. Probab=34.02 E-value=26 Score=13.58 Aligned_cols=53 Identities=15% Similarity=0.248 Sum_probs=32.1 Q ss_pred ECCCCCCCCCEEEEEEEC--CCCC-E-E-EEEEECCCCCEE-----EEECCCCEEEEEEECCC Q ss_conf 157887775518999983--8974-3-5-667505675135-----21067617999961577 Q gi|254780160|r 132 YKPGSPIVDDELTFSIYS--NPNH-K-A-LLITDKVRSGTL-----VRLGTNNYQITSHYGKY 184 (298) Q Consensus 132 ~~~~~~~~~~~~~f~i~~--~~~~-~-~-~~~t~~~~~~~~-----~~L~~G~Y~v~et~a~~ 184 (298) .+..-+.|.+.+++++|. +++. + . ...|+.+|..-- ..|..|.|.++-..|+. 
T Consensus 8 LDta~G~PA~~~ki~Lyr~~g~g~p~l~~~~~TN~DGR~D~PLl~G~~l~~G~Y~l~FhagdY 70 (117) T TIGR02962 8 LDTAHGRPAANVKIELYRLEGDGLPELVKTVVTNSDGRVDAPLLEGDELATGVYELQFHAGDY 70 (117) T ss_pred HHHHCCCCCCCCEEEEEEECCCCCHHHHCCCCCCCCCCCCCCCCCCCCCCCCEEEEEECHHHH T ss_conf 134068887775047898536577123224015888970730136775123303787511012 No 100 >cd05859 Ig4_PDGFR-alpha Fourth immunoglobulin (Ig)-like domain of platelet-derived growth factor receptor (PDGFR) alpha. IG4_PDGFR-alpha: The fourth immunoglobulin (Ig)-like domain of platelet-derived growth factor receptor (PDGFR) alpha. PDGF is a potent mitogen for connective tissue cells. PDGF-stimulated processes are mediated by three different PDGFs (PDGF-A,-B, and C). PDGFR alpha binds to all three PDGFs, whereas the PDGFR beta (not included in this group) binds only to PDGF-B. PDGF alpha is organized as an extracellular component having five Ig-like domains, a transmembrane segment, and a cytoplasmic portion having protein tyrosine kinase activity. In mice, PDGFR alpha and PDGFR beta are essential for normal development. Probab=32.89 E-value=27 Score=13.47 Aligned_cols=85 Identities=13% Similarity=0.075 Sum_probs=43.5 Q ss_pred EEEEECCCCEEEEEEEECCCEEEEEEEECCCCCCCCCEEEEEECCCCCE---EEEC-CCCEEECCCCCEEEEEEEECCCC Q ss_conf 4488428863789999345238999983478863276089998389979---7421-25212124487028999953882 Q gi|254780160|r 190 TVVKVEPGKIIDVTIQNRAAKITFKLVSEMGGEAVADTAWSILTASGDT---VGES-ANASPSMVLSEGDYTVIARNKER 265 (298) Q Consensus 190 ~~i~V~~g~~~~~tv~~~~~~~~~~~v~~~~G~~l~ga~~~i~~~~g~~---vt~~-~G~~~~~~L~~G~Y~v~a~~~~~ 265 (298) ..+.+..++..++.+...+..-- ++.=...|.+|.+....+....... -.-+ ---.....--.|.|++.|.|.+. 
T Consensus 11 ~~~~v~l~e~~~l~v~veayP~P-~~~W~k~~~~l~~~~~~~~~~~~~~~~~rY~S~L~L~R~K~~d~G~YT~~a~N~d~ 89 (101) T cd05859 11 QLEFANLHEVKEFVVEVEAYPPP-QIRWLKDNRTLIENLTEITTSEHNVQETRYVSKLKLIRAKEEDSGLYTALAQNEDA 89 (101) T ss_pred CEEEEECCCCEEEEEEEEECCCC-CEEEEECCEECCCCCCEEECCCCEECCEEEEEEEEEEECCCCCCEEEEEEEECCCC T ss_conf 25999889848999999984999-60999999496488767882353263248886899998240368469999987676 Q ss_pred EEEEEEEEEE Q ss_conf 4677799972 Q gi|254780160|r 266 NYSREFSVLT 275 (298) Q Consensus 266 ~y~~~ftV~~ 275 (298) ..+..|.+.. T Consensus 90 ~~~~tF~l~V 99 (101) T cd05859 90 VKSYTFALQI 99 (101) T ss_pred EEEEEEEEEE T ss_conf 0899999999 No 101 >PRK05842 flgD flagellar basal body rod modification protein; Reviewed Probab=31.64 E-value=28 Score=13.33 Aligned_cols=43 Identities=14% Similarity=0.308 Sum_probs=22.9 Q ss_pred EEECCCCEEEEEEEC--CCCCEEEEEEEECCCCEEEEEEEECCCEE Q ss_conf 210676179999615--77854454488428863789999345238 Q gi|254780160|r 168 VRLGTNNYQITSHYG--KYNAIVSTVVKVEPGKIIDVTIQNRAAKI 211 (298) Q Consensus 168 ~~L~~G~Y~v~et~a--~~~~~~~~~i~V~~g~~~~~tv~~~~~~~ 211 (298) ..+|+|.|.|+..+. ..+... ..-++..+++..+.+......+ T Consensus 206 ~~vpdG~Y~v~a~~~~d~~~~~~-~~t~~g~~~V~gV~f~~g~p~l 250 (269) T PRK05842 206 EKVPKGNYKIKAEYNLDSHSKQY-LQTRIGRGEVESVIFDKGKPML 250 (269) T ss_pred CCCCCCCEEEEEEEECCCCCCCE-EEEEEEEEEEEEEEECCCCEEE T ss_conf 89989746999998616888714-5555567887899952993789 No 102 >cd05747 Ig5_Titin_like M5, fifth immunoglobulin (Ig)-like domain of human titin C terminus and similar proteins. Ig5_Titin_like: domain similar to the M5, fifth immunoglobulin (Ig)-like domain from the human titin C terminus. Titin (also called connectin) is a fibrous sarcomeric protein specifically found in vertebrate striated muscle. Titin is gigantic; depending on isoform composition it ranges from 2970 to 3700 kDa, and is of a length that spans half a sarcomere. Titin largely consists of multiple repeats of Ig-like and fibronectin type 3 (FN-III)-like domains. 
Titin connects the ends of myosin thick filaments to Z disks and extends along the thick filament to the H zone, and appears to function similar to an elastic band, keeping the myosin filaments centered in the sarcomere during muscle contraction or stretching. Probab=31.31 E-value=28 Score=13.30 Aligned_cols=71 Identities=20% Similarity=0.350 Sum_probs=34.8 Q ss_pred EEEEECCCCEEEEEEEECCCEEEEEEEECCCCCCCCCEEEEE----ECCCCC-EEEECCC--CEEECC---CCCEEEEEE Q ss_conf 448842886378999934523899998347886327608999----838997-9742125--212124---487028999 Q gi|254780160|r 190 TVVKVEPGKIIDVTIQNRAAKITFKLVSEMGGEAVADTAWSI----LTASGD-TVGESAN--ASPSMV---LSEGDYTVI 259 (298) Q Consensus 190 ~~i~V~~g~~~~~tv~~~~~~~~~~~v~~~~G~~l~ga~~~i----~~~~g~-~vt~~~G--~~~~~~---L~~G~Y~v~ 259 (298) .+++|.+|+...+..... |.|.+...|.- +..+.. .+....+ .+.+.. =..|.|+.+ T Consensus 11 ~~~~v~eG~~~~l~C~v~-------------G~P~P~v~W~k~g~~l~~~~r~~i~~~~~~~~L~I~~~~~~D~G~Ytc~ 77 (92) T cd05747 11 RSLTVSEGESARFSCDVD-------------GEPAPTVTWMREGQIIVSSQRHQITSTEYKSTFEISKVQMSDEGNYTVV 77 (92) T ss_pred CEEEEECCCEEEEEEEEE-------------EECCCEEEEEECCEECCCCCCEEEEECCCEEEEEECCCCCCCCEEEEEE T ss_conf 559992999199999998-------------6079989999899999689978999869937999984585779999999 Q ss_pred EECCCCEEEEEEEE Q ss_conf 95388246777999 Q gi|254780160|r 260 ARNKERNYSREFSV 273 (298) Q Consensus 260 a~~~~~~y~~~ftV 273 (298) |+|..-.-+..|++ T Consensus 78 a~N~~G~~~a~~~L 91 (92) T cd05747 78 VENSEGKQEAQFTL 91 (92) T ss_pred EECCCCEEEEEEEE T ss_conf 99586469999998 No 103 >COG4315 Uncharacterized protein conserved in bacteria [Function unknown] Probab=30.87 E-value=28 Score=13.37 Aligned_cols=16 Identities=38% Similarity=0.655 Sum_probs=10.8 Q ss_pred CCCCEEEEEEEECCCC Q ss_conf 4487028999953882 Q gi|254780160|r 250 VLSEGDYTVIARNKER 265 (298) Q Consensus 250 ~L~~G~Y~v~a~~~~~ 265 (298) +-+.|+|.+++|++++ T Consensus 87 dka~Gdysii~RkDGt 102 (138) T COG4315 87 DKASGDYSIIARKDGT 102 (138) T ss_pred CCCCCCEEEEEECCCH T ss_conf 4357873468851760 No 104 >cd00214 Calpain_III 
Calpain, subdomain III. Calpains are calcium-activated cytoplasmic cysteine proteinases that participate in cytoskeletal remodeling processes, cell differentiation, apoptosis and signal transduction. The catalytic domain and the two calmodulin-like domains are separated by C2-like domain III. Domain III plays an important role in calcium-induced activation of calpain involving electrostatic interactions with subdomain II. Proposed to mediate calpain's interaction with phospholipids and translocation to cytoplasmic/nuclear membranes. CD includes subdomain III of typical and atypical calpains. Probab=30.43 E-value=29 Score=13.20 Aligned_cols=12 Identities=33% Similarity=0.537 Sum_probs=7.1 Q ss_pred EEECCCCEEEEE Q ss_conf 210676179999 Q gi|254780160|r 168 VRLGTNNYQITS 179 (298) Q Consensus 168 ~~L~~G~Y~v~e 179 (298) ..|+||.|.|+- T Consensus 113 ~~L~pG~YvIIP 124 (150) T cd00214 113 FRLPPGEYVIVP 124 (150) T ss_pred EEECCCCEEEEE T ss_conf 983998889993 No 105 >cd05863 Ig2_VEGFR-3 Second immunoglobulin (Ig)-like domain of vascular endothelial growth factor receptor 3 (VEGFR-3). Ig2_VEGFR-3: Second immunoglobulin (Ig)-like domain of vascular endothelial growth factor receptor 3 (VEGFR-3). The VEGFRs have an extracellular component with seven Ig-like domains, a transmembrane segment, and an intracellular tyrosine kinase domain interrupted by a kinase-insert domain. VEGFRs bind VEGFs with high affinity at the Ig-like domains. VEGFR-3 (Flt-4) binds two members of the VEGF family (VEGF-C and -D) and is involved in tumor angiogenesis and growth. 
Probab=30.10 E-value=30 Score=13.17 Aligned_cols=21 Identities=29% Similarity=0.499 Sum_probs=12.0 Q ss_pred CCEEEEEEEECCCCEEEEEEE Q ss_conf 870289999538824677799 Q gi|254780160|r 252 SEGDYTVIARNKERNYSREFS 272 (298) Q Consensus 252 ~~G~Y~v~a~~~~~~y~~~ft 272 (298) .+|.|+++..|.-..-++..+ T Consensus 42 DAG~YTv~L~n~~~~l~k~~t 62 (67) T cd05863 42 SAGTYTLVLWNSAAGLEKRIS 62 (67) T ss_pred HCCEEEEEEECHHHCCEEEEE T ss_conf 386079999712542130238 No 106 >pfam09912 DUF2141 Uncharacterized protein conserved in bacteria (DUF2141). This domain, found in various hypothetical prokaryotic proteins, has no known function. Probab=29.68 E-value=30 Score=13.12 Aligned_cols=14 Identities=36% Similarity=0.648 Sum_probs=7.9 Q ss_pred CCCCCEEEEEEEEC Q ss_conf 22672189999615 Q gi|254780160|r 85 DLFPGDYLISASFG 98 (298) Q Consensus 85 ~L~pG~Y~v~~s~g 98 (298) +||||.|-|.+-+. T Consensus 48 ~l~~G~YAvav~HD 61 (111) T pfam09912 48 DLPPGTYAVAVFHD 61 (111) T ss_pred CCCCCCEEEEEEEE T ss_conf 89987789999982 No 107 >cd02858 Esterase_N_term Esterase N-terminal domain. Esterases catalyze the hydrolysis of organic esters to release an alcohol or thiol and acid. The term can be applied to enzymes that hydrolyze carboxylate, phosphate and sulphate esters, but is more often restricted to the first class of substrate. The N-terminus of esterase may be related to the immunoglobulin and/or fibronectin type III superfamilies. These domains are associated with different types of catalytic domains at either the N-terminal or C-terminal end and may be involved in homodimeric/tetrameric/dodecameric interactions. Members of this family include members of the alpha amylase family, sialidase, galactose oxidase, cellulase, cellulose, hyaluronate lyase, chitobiase, and chitinase. 
Probab=29.64 E-value=30 Score=13.12 Aligned_cols=23 Identities=30% Similarity=0.196 Sum_probs=10.0 Q ss_pred EEEEEECCCCEEEE--CCCCCEEEE Q ss_conf 58898246856630--226721899 Q gi|254780160|r 71 LSTTKKIIGGKVSF--DLFPGDYLI 93 (298) Q Consensus 71 ~~~tt~~~G~~~~~--~L~pG~Y~v 93 (298) .+++-+.+|-.... .|+||-|.- T Consensus 32 ~~MtK~~~GvWs~t~~pl~pg~y~Y 56 (85) T cd02858 32 HPMTKDEAGVWSVTTGPLAPGIYTY 56 (85) T ss_pred CCCEECCCCEEEEEECCCCCCEEEE T ss_conf 0545889975999868879824889 No 108 >PHA02358 hypothetical protein Probab=29.33 E-value=31 Score=13.08 Aligned_cols=28 Identities=14% Similarity=0.070 Sum_probs=17.2 Q ss_pred CCCCEEEECCCCCEEEE-EEEECCCCCEE Q ss_conf 46856630226721899-99615777527 Q gi|254780160|r 77 IIGGKVSFDLFPGDYLI-SASFGHVGVVK 104 (298) Q Consensus 77 ~~G~~~~~~L~pG~Y~v-~~s~g~~~~~~ 104 (298) .+...+.+|||-|.|+= .+..+..+|.. T Consensus 28 ~~~~lP~iNlPvGpYTSY~v~a~kdGY~I 56 (194) T PHA02358 28 QQEQVPAINIPVGPYSSYRVKAGKDGYEI 56 (194) T ss_pred CCCCCCCCCCCCCCCEEEEEEECCCCEEE T ss_conf 77789956579899777899955893699 No 109 >pfam09829 DUF2057 Uncharacterized protein conserved in bacteria (DUF2057). This domain, found in various prokaryotic proteins, has no known function. Probab=27.26 E-value=33 Score=12.85 Aligned_cols=79 Identities=20% Similarity=0.218 Sum_probs=39.0 Q ss_pred EEEECCCCEEEEEEECCCCCEEEEEEEECCCCEEEEEEEECCCEEEEEEEE---CCCCC-CCCCEEEEEECCCCCEEEEC Q ss_conf 521067617999961577854454488428863789999345238999983---47886-32760899983899797421 Q gi|254780160|r 167 LVRLGTNNYQITSHYGKYNAIVSTVVKVEPGKIIDVTIQNRAAKITFKLVS---EMGGE-AVADTAWSILTASGDTVGES 242 (298) Q Consensus 167 ~~~L~~G~Y~v~et~a~~~~~~~~~i~V~~g~~~~~tv~~~~~~~~~~~v~---~~~G~-~l~ga~~~i~~~~g~~vt~~ 242 (298) ...|++|..+|+.+|.... ....+-.+-.....-+++......+.+..-. ....+ -...-.|.|.+.+|+.|.-. 
T Consensus 29 ~~~L~~G~nQiv~ry~~~~-~~~~~~~~~~S~p~iv~f~a~~~~l~l~~p~~~~~~~Ak~f~~~P~~~L~d~~g~~v~~~ 107 (189) T pfam09829 29 SLELPDGENQIVVRYEKLF-DSGGDRELVKSDPIIVTFDASDQDLTLSLPKIRSEREAKKFAKSPQWTLTDASGKAVAFK 107 (189) T ss_pred CEEECCCCEEEEEEEEEEE-ECCCCEEEEECCCEEEEEECCCCEEEEECCCCCCHHHHHHHHHCCCEEEECCCCCEEEEE T ss_conf 2564799569999972268-069978689627999999737828999549978899999887299589997999899888 Q ss_pred CCCE Q ss_conf 2521 Q gi|254780160|r 243 ANAS 246 (298) Q Consensus 243 ~G~~ 246 (298) .... T Consensus 108 ~d~L 111 (189) T pfam09829 108 QDKL 111 (189) T ss_pred EEEC T ss_conf 0000 No 110 >cd04972 Ig_TrkABC_d4 Fourth domain (immunoglobulin-like) of Trk receptors TrkA, TrkB and TrkC. TrkABC_d4: the fourth domain of Trk receptors TrkA, TrkB and TrkC, this is an immunoglobulin (Ig)-like domain which binds to neurotrophin. The Trk family of receptors are tyrosine kinase receptors. They are activated by dimerization, leading to autophosphorylation of intracellular tyrosine residues, and triggering the signal transduction pathway. TrkA, TrkB, and TrkC share significant sequence homology and domain organization. The first three domains are leucine-rich domains. The fourth and fifth domains are Ig-like domains playing a part in ligand binding. TrkA, B and C mediate the trophic effects of the neurotrophin Nerve growth factor (NGF) family. TrkA is recognized by NGF. TrkB is recognized by brain-derived neurotrophic factor (BDNF) and neurotrophin (NT)-4. TrkC is recognized by NT-3. NT-3 is promiscuous as in some cell systems it activates TrkA and TrkB receptors. 
TrkA is a receptor fo Probab=26.01 E-value=35 Score=12.71 Aligned_cols=69 Identities=16% Similarity=0.157 Sum_probs=37.3 Q ss_pred EEEEECCCCEEEEEEEECCCEEEEEEEECCCCCCCCCEEEEEECCCCCEEE--------ECC--CCEEECCC---CCEEE Q ss_conf 448842886378999934523899998347886327608999838997974--------212--52121244---87028 Q gi|254780160|r 190 TVVKVEPGKIIDVTIQNRAAKITFKLVSEMGGEAVADTAWSILTASGDTVG--------ESA--NASPSMVL---SEGDY 256 (298) Q Consensus 190 ~~i~V~~g~~~~~tv~~~~~~~~~~~v~~~~G~~l~ga~~~i~~~~g~~vt--------~~~--G~~~~~~L---~~G~Y 256 (298) .+++|.+|+...+.- ...|.|.+...|.. +|..+. ... +...+.++ ..|.| T Consensus 8 ~~v~V~eG~~~~l~C-------------~~~G~P~P~I~W~k---~g~~i~~~~~~~~~~~~~~~~L~I~nv~~~D~G~Y 71 (90) T cd04972 8 NATVVYEGGTATIRC-------------TAEGSPLPKVEWII---AGLIVIQTRTDTLETTVDIYNLQLSNITSETQTTV 71 (90) T ss_pred CCEEEECCCCEEEEE-------------EEEECCCCEEEEEE---CCEECCCCCCEEEEECCCEEEEEECCCCHHHCEEE T ss_conf 878993799899999-------------99883898489998---98096898628999468758999957998979999 Q ss_pred EEEEECCCCEEEEEEEEE Q ss_conf 999953882467779997 Q gi|254780160|r 257 TVIARNKERNYSREFSVL 274 (298) Q Consensus 257 ~v~a~~~~~~y~~~ftV~ 274 (298) +.+|+|..-..+..+++. T Consensus 72 tC~A~N~~G~~~as~~L~ 89 (90) T cd04972 72 TCTAENPVGQANVSVQVT 89 (90) T ss_pred EEEEECCCCEEEEEEEEE T ss_conf 999996888698999998 No 111 >cd05855 Ig_TrkB_d5 Fifth domain (immunoglobulin-like) of Trk receptor TrkB. TrkB_d5: the fifth domain of Trk receptor TrkB, this is an immunoglobulin (Ig)-like domain which binds to neurotrophin. The Trk family of receptors are tyrosine kinase receptors, which mediate the trophic effects of the neurotrophin Nerve growth factor (NGF) family. The Trks are activated by dimerization, leading to autophosphorylation of intracellular tyrosine residues, and triggering the signal transduction pathway. TrkB shares significant sequence homology and domain organization with TrkA, and TrkC. The first three domains are leucine-rich domains. 
The fourth and fifth domains are Ig-like domains playing a part in ligand binding. TrkB is recognized by brain-derived neurotrophic factor (BDNF) and neurotrophin (NT)-4. In some cell systems NT-3 can activate TrkA and TrkB receptors. TrkB transcripts are found throughout multiple structures of the central and peripheral nervous systems. Probab=25.70 E-value=36 Score=12.67 Aligned_cols=22 Identities=32% Similarity=0.522 Sum_probs=14.0 Q ss_pred CCCEEEEEEEECCCCEEEEEEE Q ss_conf 4870289999538824677799 Q gi|254780160|r 251 LSEGDYTVIARNKERNYSREFS 272 (298) Q Consensus 251 L~~G~Y~v~a~n~~G~~~~~ft 272 (298) +..|.|+++|+|..-..++++. T Consensus 54 ~dnG~Ytl~AkN~lGe~~ati~ 75 (79) T cd05855 54 LNNGIYTLVAKNEYGEDEKNVS 75 (79) T ss_pred ECCCEEEEEEECCCCEEEEEEE T ss_conf 2286589999876671422887 No 112 >pfam12580 TPPII Tripeptidyl peptidase II. This domain family is found in bacteria and eukaryotes, and is approximately 190 amino acids in length. The family is found in association with pfam00082. Tripeptidyl peptidase II (TPPII) is a crucial component of the proteolytic cascade acting downstream of the 26S proteasome in the ubiquitin-proteasome pathway. It is an amino peptidase belonging to the subtilase family removing tripeptides from the free N terminus of oligopeptides. Probab=24.62 E-value=37 Score=12.54 Aligned_cols=11 Identities=18% Similarity=0.576 Sum_probs=5.2 Q ss_pred EEECCCCEEEE Q ss_conf 21067617999 Q gi|254780160|r 168 VRLGTNNYQIT 178 (298) Q Consensus 168 ~~L~~G~Y~v~ 178 (298) ..|+=|.|+++ T Consensus 107 ~kL~KGdYtlr 117 (194) T pfam12580 107 TKLEKGDYTLR 117 (194) T ss_pred EECCCCCEEEE T ss_conf 04057668999 No 113 >TIGR02837 spore_II_R stage II sporulation protein R; InterPro: IPR014202 This entry is designated stage II sporulation protein R. A comparative genome analysis of all sequenced genomes of Firmicutes shows that the proteins are strictly conserved among the sub-set of endospore-forming species. 
SpoIIR is a signalling protein that links the activation of sigma E to the transcriptional activity of sigma F during sporulation. Probab=24.27 E-value=36 Score=12.65 Aligned_cols=13 Identities=31% Similarity=0.580 Sum_probs=7.5 Q ss_pred EEECCCCCEEEEE Q ss_conf 1212448702899 Q gi|254780160|r 246 SPSMVLSEGDYTV 258 (298) Q Consensus 246 ~~~~~L~~G~Y~v 258 (298) |....||+|+|.. T Consensus 127 YGn~vlPaG~Y~A 139 (172) T TIGR02837 127 YGNIVLPAGKYEA 139 (172) T ss_pred CCCCCCCCCCEEE T ss_conf 4441155873278 No 114 >COG4704 Uncharacterized protein conserved in bacteria [Function unknown] Probab=23.23 E-value=40 Score=12.37 Aligned_cols=10 Identities=20% Similarity=0.428 Sum_probs=3.9 Q ss_pred EEECCCCEEE Q ss_conf 2106761799 Q gi|254780160|r 168 VRLGTNNYQI 177 (298) Q Consensus 168 ~~L~~G~Y~v 177 (298) .+|+||+|-| T Consensus 82 ~~Lk~G~YAv 91 (151) T COG4704 82 YGLKPGKYAV 91 (151) T ss_pred ECCCCCCEEE T ss_conf 4589861777 No 115 >TIGR02423 protocat_alph protocatechuate 3,4-dioxygenase, alpha subunit; InterPro: IPR012786 Protocatechuate (3,4-dihydroxybenzene, PCA) is an aromatic compound which is a key intermediate in the degradation of the plant biopolymer lignin and other aromatic compounds. The key step of PCA degradation is the ring-cleavage performed by dioxygenases adding both atoms from molecular oxygen to specific carbon atoms within the ring. This step can be performed by two distinct mechanisms; intradiol cleavage and extradiol cleavage. In intradiol cleavage the oxygen atoms are added to the carbons carrying the hydroxyl groups, producing two carboxylate groups. In extradiol cleavage the oxygens are added to one carbon carrying a hydroxyl group and another carrying a hydrogen, resulting in the formation of a carboxylate group and an aldehydic group. PCA dioxygenases fall into the broader category of catechol dioxygenases. These are metalloenzymes which bind non-haeme iron. 
The extradiol dioxygenases use Fe(II) to activate oxygen for nucleophilic attack on the aromatic substrate, while the intradiol dioxygenases use Fe(III) to activate the aromatic substrate for an electrophilic attack by oxygen. This entry represents the alpha subunit of protocatechuate 3,4-dioxygenase, an enzyme which cleaves the PCA ring by an intradiol mechanism. It is composed of two subunits, alpha and beta (IPR012785 from INTERPRO) which are highly similar in structure and are thought to share a common ancestor. The core of each subunit is two four-stranded beta-sheets that fold upon each other to form a beta sandwich. The active site cavity contains the Fe(III)-binding site and is located between the two subunits. All Fe(III) ligands are contributed by the beta subunit.; GO: 0005506 iron ion binding, 0018578 protocatechuate 3,4-dioxygenase activity, 0019439 aromatic compound catabolic process. Probab=22.77 E-value=40 Score=12.32 Aligned_cols=25 Identities=20% Similarity=0.296 Sum_probs=9.2 Q ss_pred CEEEEEEEECCCCCCCCCCEEEEEE Q ss_conf 3279999986887753467599999 Q gi|254780160|r 37 QRITCEARLTENSTSIDSGVSWHIF 61 (298) Q Consensus 37 ~~i~l~a~~~~~~~~~~~G~~~~vy 61 (298) +.|.|+.++.++.+.++..+..+|+ T Consensus 40 ~~i~l~G~V~DG~G~pv~DAllE~W 64 (203) T TIGR02423 40 ERIQLEGRVLDGDGAPVPDALLEIW 64 (203) T ss_pred CEEEEEEEEEECCCCEECCEEEEEE T ss_conf 6789998998169974055478985 No 116 >PRK12634 flgD flagellar basal body rod modification protein; Reviewed Probab=22.73 E-value=40 Score=12.31 Aligned_cols=35 Identities=23% Similarity=0.250 Sum_probs=17.6 Q ss_pred EECCCCEEEEEEECCC-CCEEEEEEEECCCCEEEEEE Q ss_conf 1067617999961577-85445448842886378999 Q gi|254780160|r 169 RLGTNNYQITSHYGKY-NAIVSTVVKVEPGKIIDVTI 204 (298) Q Consensus 169 ~L~~G~Y~v~et~a~~-~~~~~~~i~V~~g~~~~~tv 204 (298) .||+|.|+|....... 
+.....+..+ .+++..+++ T Consensus 162 ~~p~G~Y~~~v~a~~~~g~~~~v~t~~-~~~V~sV~~ 197 (221) T PRK12634 162 RMAAGKYGITATQTDTAGSKSKLSTYV-DAPVDSVTI 197 (221) T ss_pred CCCCCCEEEEEEEECCCCCEEEEEEEE-EEEEEEEEE T ss_conf 689986699999981799658765678-899989995 No 117 >pfam09116 gp45-slide_C gp45 sliding clamp, C terminal. Members of this family are essential for the interaction of the gp45 sliding clamp with the corresponding polymerase. They adopt a DNA clamp fold, consisting of two alpha helices and two beta sheets - the fold is duplicated and has internal pseudo two-fold symmetry. Probab=22.57 E-value=41 Score=12.29 Aligned_cols=10 Identities=10% Similarity=0.358 Sum_probs=3.7 Q ss_pred ECCCCEEEEE Q ss_conf 0676179999 Q gi|254780160|r 170 LGTNNYQITS 179 (298) Q Consensus 170 L~~G~Y~v~e 179 (298) |-||+|.|.. T Consensus 76 ll~gdY~V~I 85 (112) T pfam09116 76 IIPGDYKVML 85 (112) T ss_pred ECCCCEEEEE T ss_conf 7689769999 No 118 >pfam10528 PA14_2 GLEYA domain. This presumed domain is found in fungal adhesins and is related to the PA14 domain. Probab=22.54 E-value=41 Score=12.29 Aligned_cols=50 Identities=16% Similarity=0.238 Sum_probs=29.3 Q ss_pred CCEEEEEEEECCCE-EEEEEEECCCCCCCCCEEEEEECCCCCEEEECCCCEE Q ss_conf 86378999934523-8999983478863276089998389979742125212 Q gi|254780160|r 197 GKIIDVTIQNRAAK-ITFKLVSEMGGEAVADTAWSILTASGDTVGESANASP 247 (298) Q Consensus 197 g~~~~~tv~~~~~~-~~~~~v~~~~G~~l~ga~~~i~~~~g~~vt~~~G~~~ 247 (298) ++...+++.+.+|. .=++++-. .+.....-.|++.+++|.++.+..+.+. 
T Consensus 57 ~~~~~~tv~L~aG~YyPiRi~y~-N~~~~g~l~fsf~~P~G~~~~d~~~~y~ 107 (112) T pfam10528 57 GMGNFVTGYLTAGEYVPFRFLWA-NGAGIGGFDFSFTSPDGNAIATTSYSYV 107 (112) T ss_pred CCCCEEEEEEECCEEEEEEEEEE-CCCCCCEEEEEEECCCCCEECCCCCEEE T ss_conf 77717999982547997999998-6887536689999999989837961468 No 119 >pfam10289 consensus Probab=22.50 E-value=41 Score=12.28 Aligned_cols=78 Identities=23% Similarity=0.319 Sum_probs=41.9 Q ss_pred EEECCCCEEEEEEEEC-CCEEEEEEEECCCCCCCCCEEEEEECCCCCEEEECCCCEEEC---CCCCE-EEEEEEEC---- Q ss_conf 8842886378999934-523899998347886327608999838997974212521212---44870-28999953---- Q gi|254780160|r 192 VKVEPGKIIDVTIQNR-AAKITFKLVSEMGGEAVADTAWSILTASGDTVGESANASPSM---VLSEG-DYTVIARN---- 262 (298) Q Consensus 192 i~V~~g~~~~~tv~~~-~~~~~~~~v~~~~G~~l~ga~~~i~~~~g~~vt~~~G~~~~~---~L~~G-~Y~v~a~~---- 262 (298) ..|.+|+..+++..-. .+.+++.+.+..... +.... .|... ...+|.|.-. .|++| +|.+.... T Consensus 9 ~~v~aG~~~tItW~~~~~~~Vti~L~~G~s~~-~~~v~-tias~-----~~n~GsytWt~p~sl~~~~~Y~i~I~~~~~~ 81 (95) T pfam10289 9 EVVEAGKPYTITWTPTTDGPVTLVLLKGPSTN-LDPVS-TIASS-----IPNSGSYTWTVPTSLEAGSDYALEITSVSDT 81 (95) T ss_pred CEECCCCEEEEEECCCCCCCEEEEEEECCCCC-CCEEE-EEEEC-----CCCCEEEEEECCCCCCCCCCEEEEEEECCCC T ss_conf 65248972899975899988899998489888-53368-97606-----8996279985987788899679999989988 Q ss_pred CCCEEEEEEEEEEC Q ss_conf 88246777999728 Q gi|254780160|r 263 KERNYSREFSVLTG 276 (298) Q Consensus 263 ~~~~y~~~ftV~~g 276 (298) ....|...|+|+.+ T Consensus 82 ~~~nyS~~F~I~g~ 95 (95) T pfam10289 82 GEYNYSGQFTIEGG 95 (95) T ss_pred CEEEECCCEEECCC T ss_conf 61350031376369 No 120 >COG2373 Large extracellular alpha-helical protein [General function prediction only] Probab=21.86 E-value=42 Score=12.20 Aligned_cols=23 Identities=26% Similarity=0.206 Sum_probs=10.0 Q ss_pred CCEEEEEEEECCC--CEEEEEEEEE Q ss_conf 8702899995388--2467779997 Q gi|254780160|r 252 SEGDYTVIARNKE--RNYSREFSVL 274 (298) Q Consensus 252 ~~G~Y~v~a~~~~--~~y~~~ftV~ 274 (298) +.|.|++.....+ ..++.+|.|+ T Consensus 467 ~tG~w~l~~~~~~~~~~~s~~f~V~ 491 (1621) T 
COG2373 467 LTGGYTLELYTGGKSAVISMSFRVE 491 (1621) T ss_pred CCCEEEEEEEECCCCCEEEEEEEHH T ss_conf 7643899999578650556568856 No 121 >pfam10794 DUF2606 Protein of unknown function, DUF2606. Family of bacterial proteins with unknown function. Probab=21.01 E-value=44 Score=12.09 Aligned_cols=57 Identities=14% Similarity=0.264 Sum_probs=30.9 Q ss_pred CCCCCCCCCEEEEEECC---CCC-------EE--EECCCCEEECCCCCEEEEEEEECCCCEEEEEEEEE Q ss_conf 47886327608999838---997-------97--42125212124487028999953882467779997 Q gi|254780160|r 218 EMGGEAVADTAWSILTA---SGD-------TV--GESANASPSMVLSEGDYTVIARNKERNYSREFSVL 274 (298) Q Consensus 218 ~~~G~~l~ga~~~i~~~---~g~-------~v--t~~~G~~~~~~L~~G~Y~v~a~~~~~~y~~~ftV~ 274 (298) +..|.|++|.++.+..+ +-. .+ ||..|.+.-..-.-|+|.+.-.++.....|.+.+. T Consensus 50 n~e~qpik~~ei~lmKa~d~~p~Pskeig~~IGKTd~eGkiiWk~~rKG~Yiv~lp~~et~~~r~isl~ 118 (131) T pfam10794 50 DAEGQPIKGVEVTLMKAADSDPQPSKEIGEIIGKTDHEGKIIWKSGRKGKYIVVLPKNETPETRNISLI 118 (131) T ss_pred CCCCCCCCCEEEEEEECCCCCCCCCHHHHEEECCCCCCCEEEEECCCCCEEEEEECCCCCEEEEEEEEE T ss_conf 577893267599999656569997421211445547677088733777208999638985057754210 No 122 >cd05733 Ig6_L1-CAM_like Sixth immunoglobulin (Ig)-like domain of the L1 cell adhesion molecule (CAM) and similar proteins. Ig6_L1-CAM_like: domain similar to the sixth immunoglobulin (Ig)-like domain of the L1 cell adhesion molecule (CAM). L1 belongs to the L1 subfamily of cell adhesion molecules (CAMs) and is comprised of an extracellular region having six Ig-like domains and five fibronectin type III domains, a transmembrane region and an intracellular domain. L1 is primarily expressed in the nervous system and is involved in its development and function. L1 is associated with an X-linked recessive disorder, X-linked hydrocephalus, MASA syndrome, or spastic paraplegia type 1, that involves abnormalities of axonal growth. 
This group also contains NrCAM [Ng(neuronglia)CAM-related cell adhesion molecule], which is primarily expressed in the nervous system, and human neurofascin. Probab=20.91 E-value=44 Score=12.08 Aligned_cols=52 Identities=19% Similarity=0.331 Sum_probs=28.8 Q ss_pred CCCCCCCCEEEEEECCCCCE----------EEECCCCEEECCC----C---CEEEEEEEECCC-CEEEEEEEE Q ss_conf 78863276089998389979----------7421252121244----8---702899995388-246777999 Q gi|254780160|r 219 MGGEAVADTAWSILTASGDT----------VGESANASPSMVL----S---EGDYTVIARNKE-RNYSREFSV 273 (298) Q Consensus 219 ~~G~~l~ga~~~i~~~~g~~----------vt~~~G~~~~~~L----~---~G~Y~v~a~~~~-~~y~~~ftV 273 (298) ..|.|-+-..|.- +|.. +....|.+.+..+ + .|.|+-+|+|.. ....+.+.| T Consensus 7 A~G~P~P~i~W~k---dG~~l~~~~~~~~~~~~~~g~l~i~~~~~~~~~~~~G~Y~C~A~N~~Gta~S~~~~l 76 (77) T cd05733 7 AKGNPPPTFSWTR---NGTHFDPEKDPRVTMKPDSGTLVIDNMNGGRAEDYEGEYQCYASNELGTAISNEIHL 76 (77) T ss_pred EEECCCCEEEEEE---CCEECCCCCCCCEEEECCCCEEEEEECCCCCCCCCCEEEEEEEECCCCEEECCCEEC T ss_conf 1012998899998---993968755898899469847999725888876789999999995878899230295 No 123 >cd05722 Ig1_Neogenin First immunoglobulin (Ig)-like domain in neogenin and similar proteins. Ig1_Neogenin: first immunoglobulin (Ig)-like domain in neogenin and related proteins. Neogenin is a cell surface protein which is expressed in the developing nervous system of vertebrate embryos in the growing nerve cells. It is also expressed in other embryonic tissues, and may play a general role in developmental processes such as cell migration, cell-cell recognition, and tissue growth regulation. Included in this group is the tumor suppressor protein DCC, which is deleted in colorectal carcinoma . DCC and neogenin each have four Ig-like domains followed by six fibronectin type III domains, a transmembrane domain, and an intracellular domain. 
Probab=20.55 E-value=45 Score=12.03 Aligned_cols=55 Identities=18% Similarity=0.199 Sum_probs=30.2 Q ss_pred CCCCCCCEEEEEE----C--CCCCEEEECCCCEEECCC--------CCEEEEEEEECC--CCEEEEEEEEE Q ss_conf 8863276089998----3--899797421252121244--------870289999538--82467779997 Q gi|254780160|r 220 GGEAVADTAWSIL----T--ASGDTVGESANASPSMVL--------SEGDYTVIARNK--ERNYSREFSVL 274 (298) Q Consensus 220 ~G~~l~ga~~~i~----~--~~g~~vt~~~G~~~~~~L--------~~G~Y~v~a~~~--~~~y~~~ftV~ 274 (298) .|.|.+...|.-- + .+.......+|...+..+ ..|.|+-+|+|. +....+..++. T Consensus 24 ~G~P~P~i~W~kdG~~l~~~~~~~~~~~~~G~L~I~~v~~~~~~~~D~G~Y~C~A~N~~~G~~~S~~a~l~ 94 (95) T cd05722 24 EGEPPPKIEWKKDGVLLNLVSDERRQQLPNGSLLITSVVHSKHNKPDEGFYQCVAQNDSLGSIVSRTARLT 94 (95) T ss_pred CEECCCEEEEEECCEECCCCCCCCEEECCCCCEEEEEEEECCCCCCCCEEEEEEEECCCCCCEEEEEEEEE T ss_conf 51599979999999897577887899916997999877706898886799999998367586310469998 No 124 >COG1843 FlgD Flagellar hook capping protein [Cell motility and secretion] Probab=20.08 E-value=46 Score=11.96 Aligned_cols=17 Identities=12% Similarity=0.147 Sum_probs=10.5 Q ss_pred EECCCCEEEEEEECCCC Q ss_conf 10676179999615778 Q gi|254780160|r 169 RLGTNNYQITSHYGKYN 185 (298) Q Consensus 169 ~L~~G~Y~v~et~a~~~ 185 (298) .+|.|.|++...+.-.+ T Consensus 164 ~~~~g~y~~~v~~~~~~ 180 (222) T COG1843 164 NVPDGQYTVKVVASKGG 180 (222) T ss_pred CCCCCCEEEEEEECCCC T ss_conf 47998589999981578 Done!