Query gi|254781108|ref|YP_003065521.1| von Willebrand factor type A [Candidatus Liberibacter asiaticus str. psy62] Match_columns 398 No_of_seqs 141 out of 1170 Neff 10.5 Searched_HMMs 39220 Date Mon May 30 07:03:08 2011 Command /home/congqian_1/programs/hhpred/hhsearch -i 254781108.hhm -d /home/congqian_1/database/cdd/Cdd.hhm No Hit Prob E-value P-value Score SS Cols Query HMM Template HMM 1 PRK13685 hypothetical protein; 99.9 4.9E-23 1.3E-27 156.6 19.0 195 147-393 88-296 (326) 2 cd01465 vWA_subgroup VWA subgr 99.9 4.8E-22 1.2E-26 150.7 14.8 167 148-380 1-170 (170) 3 cd01467 vWA_BatA_type VWA BatA 99.9 5.9E-21 1.5E-25 144.1 16.2 165 147-376 2-180 (180) 4 cd01461 vWA_interalpha_trypsin 99.8 4.6E-19 1.2E-23 132.8 16.1 164 147-380 2-169 (171) 5 cd01475 vWA_Matrilin VWA_Matri 99.8 1.7E-18 4.3E-23 129.4 17.6 182 146-389 1-185 (224) 6 cd01480 vWA_collagen_alpha_1-V 99.8 8.2E-18 2.1E-22 125.3 16.8 176 146-380 1-179 (186) 7 cd01470 vWA_complement_factors 99.8 6.7E-18 1.7E-22 125.8 16.0 181 148-381 1-198 (198) 8 cd01466 vWA_C3HC4_type VWA C3H 99.8 4.1E-18 1E-22 127.1 14.0 152 148-371 1-155 (155) 9 cd01474 vWA_ATR ATR (Anthrax T 99.8 5.6E-17 1.4E-21 120.2 17.9 178 146-390 3-184 (185) 10 cd01463 vWA_VGCC_like VWA Volt 99.8 4.6E-17 1.2E-21 120.8 14.3 171 143-373 9-189 (190) 11 cd01456 vWA_ywmD_type VWA ywmD 99.7 1.8E-16 4.5E-21 117.3 12.7 166 144-375 17-205 (206) 12 cd01451 vWA_Magnesium_chelatas 99.7 1.2E-15 3E-20 112.3 16.3 172 149-380 2-176 (178) 13 cd01469 vWA_integrins_alpha_su 99.7 1.8E-15 4.5E-20 111.3 15.3 170 148-378 1-176 (177) 14 cd01472 vWA_collagen von Wille 99.7 1.7E-15 4.4E-20 111.3 15.1 160 148-372 1-163 (164) 15 cd01473 vWA_CTRP CTRP for CS 99.7 9.3E-15 2.4E-19 106.9 18.7 181 148-389 1-192 (192) 16 pfam00092 VWA von Willebrand f 99.7 1.4E-14 3.6E-19 105.8 17.0 172 149-383 1-177 (177) 17 cd01450 vWFA_subfamily_ECM Von 99.7 6.6E-15 1.7E-19 107.8 14.9 154 148-365 1-156 (161) 18 smart00327 VWA von Willebrand 99.7 1.9E-14 4.8E-19 105.1 16.6 160 147-368 1-162 (177) 
19 cd01481 vWA_collagen_alpha3-VI 99.7 1.4E-14 3.7E-19 105.8 15.8 162 148-372 1-164 (165) 20 TIGR00868 hCaCC calcium-activa 99.6 2.1E-15 5.3E-20 110.8 11.1 176 147-394 307-493 (874) 21 cd01471 vWA_micronemal_protein 99.6 4E-14 1E-18 103.1 16.9 172 148-381 1-183 (186) 22 cd01464 vWA_subfamily VWA subf 99.6 2.7E-14 7E-19 104.1 13.6 173 147-381 3-175 (176) 23 cd01482 vWA_collagen_alphaI-XI 99.6 8E-14 2E-18 101.3 15.4 159 149-372 2-163 (164) 24 cd01476 VWA_integrin_invertebr 99.6 6.7E-14 1.7E-18 101.8 14.5 157 148-369 1-162 (163) 25 cd00198 vWFA Von Willebrand fa 99.6 3.7E-13 9.4E-18 97.3 14.8 153 148-364 1-155 (161) 26 TIGR03436 acidobact_VWFA VWFA- 99.5 8E-12 2E-16 89.3 16.9 181 145-392 51-259 (296) 27 cd01462 VWA_YIEM_type VWA YIEM 99.5 4.7E-12 1.2E-16 90.7 14.5 109 237-362 38-146 (152) 28 COG1240 ChlD Mg-chelatase subu 99.3 6E-11 1.5E-15 84.1 12.8 172 146-375 77-249 (261) 29 cd01454 vWA_norD_type norD typ 99.3 3.6E-10 9.2E-15 79.4 15.3 97 260-367 69-171 (174) 30 cd01477 vWA_F09G8-8_type VWA F 99.1 4.2E-09 1.1E-13 73.0 14.8 168 145-368 17-191 (193) 31 TIGR02442 Cob-chelat-sub cobal 98.8 1.2E-07 3E-12 64.3 10.2 165 147-369 508-686 (688) 32 COG4961 TadG Flp pilus assembl 98.8 3E-07 7.6E-12 61.9 12.2 69 1-75 26-94 (185) 33 COG4245 TerY Uncharacterized p 98.7 3E-07 7.7E-12 61.9 12.1 182 148-390 4-187 (207) 34 KOG2353 consensus 98.7 1.1E-07 2.7E-12 64.6 9.4 187 138-388 216-413 (1104) 35 PRK13406 bchD magnesium chelat 98.6 3.5E-06 9E-11 55.5 14.5 155 209-381 418-580 (584) 36 TIGR02031 BchD-ChlD magnesium 98.5 2.4E-06 6.1E-11 56.5 10.4 180 138-373 501-699 (705) 37 cd01457 vWA_ORF176_type VWA OR 98.3 6.7E-05 1.7E-09 47.8 13.5 160 147-361 2-163 (199) 38 cd01453 vWA_transcription_fact 98.2 0.0003 7.6E-09 43.9 16.3 173 148-382 4-177 (183) 39 COG2425 Uncharacterized protei 97.9 0.0004 1E-08 43.1 11.6 125 237-383 310-435 (437) 40 COG4548 NorD Nitric oxide redu 97.8 0.00028 7.1E-09 44.1 9.9 116 260-388 518-636 (637) 41 pfam05762 VWA_CoxE VWA domain 97.2 0.011 2.7E-07 34.6 10.8 75 
258-346 111-187 (223) 42 pfam06707 DUF1194 Protein of u 97.1 0.018 4.6E-07 33.2 14.4 141 235-388 48-203 (206) 43 pfam00362 Integrin_beta Integr 97.0 0.023 5.9E-07 32.6 15.4 134 250-391 179-343 (424) 44 cd01452 VWA_26S_proteasome_sub 96.8 0.032 8.2E-07 31.7 12.4 153 207-379 23-182 (187) 45 smart00187 INB Integrin beta s 96.8 0.033 8.3E-07 31.7 15.3 134 250-391 178-342 (423) 46 COG4655 Predicted membrane pro 96.8 0.00055 1.4E-08 42.3 1.6 44 1-44 15-58 (565) 47 pfam11775 CobT_C Cobalamin bio 96.5 0.029 7.3E-07 32.0 8.8 86 277-380 118-210 (220) 48 pfam04056 Ssl1 Ssl1-like. Ssl1 95.8 0.12 3.1E-06 28.2 16.0 174 147-382 52-228 (250) 49 COG2304 Uncharacterized protei 95.7 0.13 3.3E-06 28.0 9.6 102 256-368 93-196 (399) 50 cd01458 vWA_ku Ku70/Ku80 N-ter 95.6 0.14 3.7E-06 27.8 12.9 85 272-364 103-190 (218) 51 cd01455 vWA_F11C1-5a_type Von 95.3 0.19 4.8E-06 27.1 15.7 115 256-386 68-188 (191) 52 pfam07811 TadE TadE-like prote 95.3 0.048 1.2E-06 30.7 5.4 36 2-37 7-42 (43) 53 KOG2807 consensus 94.3 0.35 8.9E-06 25.5 14.4 173 148-383 61-235 (378) 54 KOG1226 consensus 93.1 0.38 9.7E-06 25.3 6.3 62 250-315 210-272 (783) 55 COG4867 Uncharacterized protei 92.7 0.65 1.6E-05 23.9 8.5 104 265-379 521-641 (652) 56 pfam09967 DUF2201 Predicted me 92.1 0.7 1.8E-05 23.7 6.6 34 270-319 351-384 (412) 57 pfam11443 DUF2828 Domain of un 90.5 1.1 2.8E-05 22.5 11.0 103 237-347 368-472 (524) 58 PRK10997 yieM hypothetical pro 89.0 1.4 3.7E-05 21.8 13.7 96 237-348 358-453 (484) 59 pfam03731 Ku_N Ku70/Ku80 N-ter 85.3 2.4 6E-05 20.5 10.8 114 271-390 99-219 (222) 60 pfam07002 Copine Copine. 
This 81.5 3.4 8.6E-05 19.6 7.8 76 259-348 70-145 (145) 61 pfam11265 Med25_VWA Mediator c 72.3 6.2 0.00016 18.0 9.8 111 235-346 53-178 (219) 62 KOG4465 consensus 71.0 6.6 0.00017 17.8 4.9 100 240-361 471-581 (598) 63 cd01459 vWA_copine_like VWA Co 69.6 7.1 0.00018 17.6 7.8 86 259-360 119-204 (254) 64 KOG2884 consensus 68.9 7.3 0.00019 17.6 13.1 154 208-381 24-184 (259) 65 TIGR02877 spore_yhbH sporulati 67.0 8 0.0002 17.3 8.0 102 269-383 275-391 (392) 66 KOG1327 consensus 67.0 8 0.00021 17.3 9.3 89 259-362 375-463 (529) 67 PRK05325 hypothetical protein; 63.5 9.4 0.00024 16.9 9.5 108 269-389 295-412 (414) 68 pfam04285 DUF444 Protein of un 56.9 12 0.00031 16.2 10.3 104 269-386 307-418 (421) 69 LOAD_ku consensus 54.2 14 0.00035 15.9 12.9 11 375-385 425-435 (521) 70 pfam02431 Chalcone Chalcone-fl 48.9 17 0.00042 15.4 3.9 12 277-288 114-125 (199) 71 PRK12938 acetyacetyl-CoA reduc 48.9 17 0.00042 15.4 8.4 21 328-348 164-184 (246) 72 COG2984 ABC-type uncharacteriz 45.6 19 0.00047 15.1 9.9 85 303-394 158-242 (322) 73 pfam11411 DNA_ligase_IV DNA li 42.4 19 0.00048 15.1 2.2 22 364-385 14-35 (36) 74 COG4547 CobT Cobalamin biosynt 39.9 23 0.00058 14.6 5.9 65 278-352 520-591 (620) 75 KOG2487 consensus 39.9 23 0.00058 14.6 5.2 46 335-382 192-237 (314) 76 pfam06508 ExsB ExsB. 
This fami 39.4 23 0.00059 14.5 6.2 16 332-347 105-120 (137) 77 PRK08643 acetoin reductase; Va 39.2 23 0.0006 14.5 8.4 22 328-349 163-184 (256) 78 pfam04392 ABC_sub_bind ABC tra 35.9 27 0.00068 14.2 9.3 14 331-344 202-215 (292) 79 TIGR01651 CobT cobaltochelatas 33.2 29 0.00075 13.9 4.8 64 278-351 505-575 (606) 80 cd02007 TPP_DXS Thiamine pyrop 33.1 29 0.00075 13.9 2.4 89 275-383 81-176 (195) 81 PRK00907 hypothetical protein; 32.2 31 0.00078 13.8 3.2 29 357-385 50-84 (92) 82 COG3419 PilY1 Tfp pilus assemb 32.0 22 0.00055 14.7 1.1 56 337-394 352-409 (1036) 83 COG4726 PilX Tfp pilus assembl 31.0 32 0.00081 13.7 6.1 39 6-44 21-69 (196) 84 COG3552 CoxE Protein containin 30.9 14 0.00036 15.9 0.0 40 271-319 287-326 (395) 85 PRK07806 short chain dehydroge 30.9 32 0.00082 13.7 8.1 18 329-346 165-182 (248) 86 PRK06939 2-amino-3-ketobutyrat 30.6 18 0.00047 15.2 0.6 61 324-387 334-394 (395) 87 TIGR01822 2am3keto_CoA 2-amino 29.6 34 0.00086 13.6 3.5 212 148-388 172-393 (395) 88 PRK07576 short chain dehydroge 29.3 34 0.00087 13.6 8.3 19 330-348 169-187 (260) 89 PRK12824 acetoacetyl-CoA reduc 29.3 34 0.00087 13.5 7.2 22 328-349 163-184 (245) 90 cd06325 PBP1_ABC_uncharacteriz 29.0 35 0.00088 13.5 9.1 30 307-344 187-216 (281) 91 TIGR01957 nuoB_fam NADH-quinon 28.7 35 0.00089 13.5 2.0 19 372-390 128-146 (146) 92 PRK06123 short chain dehydroge 28.6 35 0.0009 13.5 8.4 22 328-349 169-190 (249) 93 PRK12826 3-ketoacyl-(acyl-carr 27.6 37 0.00093 13.4 8.7 21 329-349 168-188 (253) 94 PRK09730 hypothetical protein; 26.0 39 0.001 13.2 8.1 55 329-385 168-230 (247) 95 PRK12745 3-ketoacyl-(acyl-carr 25.7 40 0.001 13.2 8.8 22 328-349 174-195 (259) 96 PTZ00099 rab6; Provisional 25.4 37 0.00095 13.3 1.4 55 338-392 84-146 (176) 97 COG5242 TFB4 RNA polymerase II 24.4 42 0.0011 13.0 5.2 95 275-382 129-224 (296) 98 PRK09186 flagellin modificatio 24.2 42 0.0011 13.0 8.0 56 329-386 179-237 (255) 99 PRK05557 fabG 3-ketoacyl-(acyl 24.2 42 0.0011 13.0 8.6 21 329-349 167-187 (248) 100 PRK12825 fabG 
3-ketoacyl-(acyl 24.2 42 0.0011 13.0 8.5 22 328-349 168-189 (250) 101 COG0623 FabI Enoyl-[acyl-carri 21.6 48 0.0012 12.7 5.3 19 330-348 171-189 (259) 102 TIGR03206 benzo_BadH 2-hydroxy 21.4 48 0.0012 12.6 8.1 32 328-361 163-194 (250) 103 PRK08063 enoyl-(acyl carrier p 21.3 48 0.0012 12.6 7.9 22 328-349 165-186 (250) 104 PRK06855 aminotransferase; Val 20.9 49 0.0013 12.6 5.3 20 370-389 409-429 (433) 105 pfam10526 NADH_ub_rd_NUML NADH 20.8 50 0.0013 12.6 2.7 23 6-28 13-35 (80) 106 KOG1575 consensus 20.3 51 0.0013 12.5 1.6 14 273-286 271-284 (336) No 1 >PRK13685 hypothetical protein; Provisional Probab=99.92 E-value=4.9e-23 Score=156.59 Aligned_cols=195 Identities=18% Similarity=0.302 Sum_probs=150.5 Q ss_pred CCCCEEEECCCCCCCCCCCCCCCCHHHHHCCCEECCCCCCCCEEEECCCCCCEEECCCCCCCCCHHHHHHHHHHHHHHHH Q ss_conf 54313575055666666766631012331023012688765403101343220112555333410002456655410220 Q gi|254781108|r 147 AISICMVLDVSRSMEDLYLQKHNDNNNMTSNKYLLPPPPKKSFWSKNTTKSKYAPAPAPANRKIDVLIESAGNLVNSIQK 226 (398) Q Consensus 147 ~~di~lvlD~SgSm~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~a~~~~~~~~~~ 226 (398) ..++++++|.|+||.-... ..+|.+..|.....+++.... T Consensus 88 ~~~i~l~lD~S~SM~a~D~----------------------------------------~p~Rl~~ak~~~~~fi~~~~~ 127 (326) T PRK13685 88 RAVVMLVIDVSQSMRATDV----------------------------------------EPNRLAAAQEAAKQFADQLTP 127 (326) T ss_pred CCCEEEEEECCCCCCCCCC----------------------------------------CCCHHHHHHHHHHHHHHHCCC T ss_conf 8867999989756558788----------------------------------------956899999999999973798 Q ss_pred CCCCCCCCCEEEEEEECCCCCCCCCCCCCCCCHHHHHHHHHCCCCCCCCCHHHHHHHHHHHHCCCCCCCCCCCCCCCCCE Q ss_conf 27687765213465412677654311235689999999973577888744588999999961134666665667666651 Q gi|254781108|r 227 AIQEKKNLSVRIGTIAYNIGIVGNQCTPLSNNLNEVKSRLNKLNPYENTNTYPAMHHAYRELYNEKESSHNTIGSTRLKK 306 (398) Q Consensus 227 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~lt~~~~~~~~~I~~l~~~G~T~~~~gl~~a~~~l~~~~~~~~~~~~~~~~~k 306 (398) ..+.+.+.|.... 
....|+|.|...++..|+.+.++.+|.++.|+..+.+.+....... ..++...++ T Consensus 128 --------~driGlv~Fa~~a--~~~~plT~D~~~~~~~l~~l~~~~~taiG~ai~~Al~~l~~~~~~~--~~~~~~~~~ 195 (326) T PRK13685 128 --------GINLGLIAFAGTA--TVLVSPTTNREATKNALDKLQLADRTATGEGIFTALQAIATVGAVI--GGGDTPPPA 195 (326) T ss_pred --------CCEEEEEEECCCC--EECCCCCCCHHHHHHHHHHCCCCCCCCCCHHHHHHHHHHHHHHHHC--CCCCCCCCC T ss_conf --------8828999965872--0148987539999999984687888864068999999998633201--456777886 Q ss_pred EEEECCCCCCCCCCCCCCCHHHHHHHHHHHHCCCEEEEEEECCCC-------------CHHHHHHHHHC-CCCCEEEECC Q ss_conf 699815887787777766314899999999789889999954797-------------55899999851-9981899469 Q gi|254781108|r 307 FVIFITDGENSGASAYQNTLNTLQICEYMRNAGMKIYSVAVSAPP-------------EGQDLLRKCTD-SSGQFFAVND 372 (398) Q Consensus 307 ~iillTDG~~~~~~~~~~~~~~~~~c~~~K~~gi~IytIg~~~~~-------------~~~~~l~~cAs-~~~~yy~a~~ 372 (398) +|||+|||++|.+...........+++.+|++||+|||||+|.+. -+.+.||++|. ++|.||.|.| T Consensus 196 ~IILLTDG~~n~g~~~~~p~~~~~AA~~A~~~gi~IyTIgvGt~~g~~~~~g~~~~~~lDe~~L~~IA~~TGG~yfrA~d 275 (326) T PRK13685 196 RIVLFSDGKETVPTNPDNPKGAYTAARTAKDQGVPISTISFGTPYGFVEINGQRQPVPVDDETLKKIAQLSGGEFYTAAS 275 (326) T ss_pred EEEEECCCCCCCCCCCCCCCHHHHHHHHHHHCCCCEEEEEECCCCCCCCCCCCCCCCCCCHHHHHHHHHHCCCEEEECCC T ss_conf 79997489977788988730299999999985994899997799884354784034568999999999972987997199 Q ss_pred HHHHHHHHHHHHHHHHHCEEE Q ss_conf 899999999999987512457 Q gi|254781108|r 373 SRELLESFDKITDKIQEQSVR 393 (398) Q Consensus 373 ~~~L~~aF~~I~~~i~~~r~~ 393 (398) .++|.++|++|.+.|..+.++ T Consensus 276 ~~~L~~Iy~~i~~~i~~~~~~ 296 (326) T PRK13685 276 LEELRAVYATLQQQIGYETIK 296 (326) T ss_pred HHHHHHHHHHHHHHHCCEEEC T ss_conf 999999999963331603311 No 2 >cd01465 vWA_subgroup VWA subgroup: Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. 
The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation, and immune defenses. In integrins these domains form heterodimers, while in vWF they form multimers. There are different interaction surfaces of this domain, as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by a metal ion dependent adhesion site, termed the MIDAS motif, that is a characteristic feature of most, if n Probab=99.89 E-value=4.8e-22 Score=150.66 Aligned_cols=167 Identities=16% Similarity=0.266 Sum_probs=128.2 Q ss_pred CCCEEEECCCCCCCCCCCCCCCCHHHHHCCCEECCCCCCCCEEEECCCCCCEEECCCCCCCCCHHHHHHHHHHHHHHHHC Q ss_conf 43135750556666667666310123310230126887654031013432201125553334100024566554102202 Q gi|254781108|r 148 ISICMVLDVSRSMEDLYLQKHNDNNNMTSNKYLLPPPPKKSFWSKNTTKSKYAPAPAPANRKIDVLIESAGNLVNSIQKA 227 (398) Q Consensus 148 ~di~lvlD~SgSm~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~a~~~~~~~~~~~ 227 (398) +|+++|+|.||||... +++..|.++..+++.+.. 
T Consensus 1 ldiv~vlD~SGSM~g~---------------------------------------------~~~~~k~a~~~~l~~l~~- 34 (170) T cd01465 1 LNLVFVIDRSGSMDGP---------------------------------------------KLPLVKSALKLLVDQLRP- 34 (170) T ss_pred CCEEEEEECCCCCCCC---------------------------------------------HHHHHHHHHHHHHHHCCC- T ss_conf 9199999088688971---------------------------------------------999999999999985898- Q ss_pred CCCCCCCCEEEEEEECCCCCCCCCCCCCC--CCHHHHHHHHHCCCCCCCCCHHHHHHHHHHHHCCCCCCCCCCCCCCCCC Q ss_conf 76877652134654126776543112356--8999999997357788874458899999996113466666566766665 Q gi|254781108|r 228 IQEKKNLSVRIGTIAYNIGIVGNQCTPLS--NNLNEVKSRLNKLNPYENTNTYPAMHHAYRELYNEKESSHNTIGSTRLK 305 (398) Q Consensus 228 ~~~~~~~~~~~~~~~~~~~~~~~~~~~lt--~~~~~~~~~I~~l~~~G~T~~~~gl~~a~~~l~~~~~~~~~~~~~~~~~ 305 (398) ..+.+.+.|++.... ..+++ .+...+..+|+.|.+.|+|+++.||..|++.+..... +... T Consensus 35 -------~dr~~iv~F~~~~~~--~~~~~~~~~~~~~~~~i~~l~~~G~T~~~~~l~~a~~~~~~~~~--------~~~~ 97 (170) T cd01465 35 -------DDRLAIVTYDGAAET--VLPATPVRDKAAILAAIDRLTAGGSTAGGAGIQLGYQEAQKHFV--------PGGV 97 (170) T ss_pred -------CCEEEEEEECCCCEE--CCCCCCHHHHHHHHHHHHCCCCCCCCCHHHHHHHHHHHHHHCCC--------CCCC T ss_conf -------787999983586155--15878666799999987438989985277999999999986337--------8875 Q ss_pred EEEEECCCCCCCCCCCCCCCHHHHHHHHHHHHCCCEEEEEEECCCCCHHHHHHHHHC-CCCCEEEECCHHHHHHHH Q ss_conf 169981588778777776631489999999978988999995479755899999851-998189946989999999 Q gi|254781108|r 306 KFVIFITDGENSGASAYQNTLNTLQICEYMRNAGMKIYSVAVSAPPEGQDLLRKCTD-SSGQFFAVNDSRELLESF 380 (398) Q Consensus 306 k~iillTDG~~~~~~~~~~~~~~~~~c~~~K~~gi~IytIg~~~~~~~~~~l~~cAs-~~~~yy~a~~~~~L~~aF 380 (398) +.|||+|||+.|.+.. +.......+...++.||.|||||||. +.+..+|+.+|. 
+.|.||++++++||.++| T Consensus 98 ~~iillTDG~~~~~~~--~~~~~~~~~~~~~~~~i~i~tiGiG~-~~~~~~L~~iA~~~~G~~~~v~~~~~l~~~f 170 (170) T cd01465 98 NRILLATDGDFNVGET--DPDELARLVAQKRESGITLSTLGFGD-NYNEDLMEAIADAGNGNTAYIDNLAEARKVF 170 (170) T ss_pred EEEEEEECCCCCCCCC--CHHHHHHHHHHHHHCCCCCEEEEECC-CCCHHHHHHHHHCCCCEEEECCCHHHHHHHC T ss_conf 0699981588567988--98999999999874388624898088-7999999999975798899849999999639 No 3 >cd01467 vWA_BatA_type VWA BatA type: Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses. In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if Probab=99.88 E-value=5.9e-21 Score=144.12 Aligned_cols=165 Identities=24% Similarity=0.372 Sum_probs=125.5 Q ss_pred CCCCEEEECCCCCCCCCCCCCCCCHHHHHCCCEECCCCCCCCEEEECCCCCCEEECCCCCCCCCHHHHHHHHHHHHHHHH Q ss_conf 54313575055666666766631012331023012688765403101343220112555333410002456655410220 Q gi|254781108|r 147 AISICMVLDVSRSMEDLYLQKHNDNNNMTSNKYLLPPPPKKSFWSKNTTKSKYAPAPAPANRKIDVLIESAGNLVNSIQK 226 (398) Q Consensus 147 ~~di~lvlD~SgSm~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~a~~~~~~~~~~ 226 (398) .+|+++|+|.||||...... ..+|++..|.++..+++.. 
T Consensus 2 G~dvvlvlD~SgSM~~~d~~---------------------------------------~~~rl~~ak~~~~~~i~~~-- 40 (180) T cd01467 2 GRDIMIALDVSGSMLAQDFV---------------------------------------KPSRLEAAKEVLSDFIDRR-- 40 (180) T ss_pred CCEEEEEEECCCCCCCCCCC---------------------------------------CCCHHHHHHHHHHHHHHHC-- T ss_conf 62799999898475786667---------------------------------------8589999999999999719-- Q ss_pred CCCCCCCCCEEEEEEECCCCCCCCCCCCCCCCHHHHHHHHHCCC---CCCCCCHHHHHHHHHHHHCCCCCCCCCCCCCCC Q ss_conf 27687765213465412677654311235689999999973577---888744588999999961134666665667666 Q gi|254781108|r 227 AIQEKKNLSVRIGTIAYNIGIVGNQCTPLSNNLNEVKSRLNKLN---PYENTNTYPAMHHAYRELYNEKESSHNTIGSTR 303 (398) Q Consensus 227 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~lt~~~~~~~~~I~~l~---~~G~T~~~~gl~~a~~~l~~~~~~~~~~~~~~~ 303 (398) + ..+.+.+.|+... ....|+|.+...++..++.+. ++|+|+++.|+..+.+.|.+.. . T Consensus 41 ----~---~drvglv~Fs~~a--~~~~plT~d~~~~~~~l~~i~~~~~~ggT~i~~al~~a~~~l~~~~----------~ 101 (180) T cd01467 41 ----E---NDRIGLVVFAGAA--FTQAPLTLDRESLKELLEDIKIGLAGQGTAIGDAIGLAIKRLKNSE----------A 101 (180) T ss_pred ----C---CCEEEEEEECCCC--EEECCCCCCHHHHHHHHHCCCCCCCCCCCCHHHHHHHHHHHHHCCC----------C T ss_conf ----9---9759999972873--6733766568999999862244532368608999999999764247----------6 Q ss_pred CCEEEEECCCCCCCCCCCCCCCHHHHHHHHHHHHCCCEEEEEEECCCCC----------HHHHHHHHHC-CCCCEEEECC Q ss_conf 6516998158877877777663148999999997898899999547975----------5899999851-9981899469 Q gi|254781108|r 304 LKKFVIFITDGENSGASAYQNTLNTLQICEYMRNAGMKIYSVAVSAPPE----------GQDLLRKCTD-SSGQFFAVND 372 (398) Q Consensus 304 ~~k~iillTDG~~~~~~~~~~~~~~~~~c~~~K~~gi~IytIg~~~~~~----------~~~~l~~cAs-~~~~yy~a~~ 372 (398) ..|+|||+|||++|.+.. ....+++.+|++||+||+||||.+.. +.+.|+++|. 
++|+||+|++ T Consensus 102 ~~~~ivLlTDG~~n~g~~-----~~~~~~~~a~~~gi~v~tIGvG~~~~~~~~~~~~~~d~~~L~~iA~~tgG~yy~a~~ 176 (180) T cd01467 102 KERVIVLLTDGENNAGEI-----DPATAAELAKNKGVRIYTIGVGKSGSGPKPDGSTILDEDSLVEIADKTGGRIFRALD 176 (180) T ss_pred CCCEEEEEECCCCCCCCC-----CHHHHHHHHHHCCCEEEEEEECCCCCCCCCCCCCCCCHHHHHHHHHHCCCEEEECCC T ss_conf 663799980588667876-----999999999976998999997789888768887655999999999961997997287 Q ss_pred HHHH Q ss_conf 8999 Q gi|254781108|r 373 SREL 376 (398) Q Consensus 373 ~~~L 376 (398) ++|| T Consensus 177 ~~eL 180 (180) T cd01467 177 GFEL 180 (180) T ss_pred HHHC T ss_conf 4649 No 4 >cd01461 vWA_interalpha_trypsin_inhibitor vWA_interalpha trypsin inhibitor (ITI): ITI is a glycoprotein composed of three polypeptides- two heavy chains and one light chain (bikunin). Bikunin confers the protease-inhibitor function while the heavy chains are involved in rendering stability to the extracellular matrix by binding to hyaluronic acid. The heavy chains carry the VWA domain with a conserved MIDAS motif. Although the exact role of the VWA domains remains unknown, it has been speculated to be involved in mediating protein-protein interactions with the components of the extracellular matrix. Probab=99.83 E-value=4.6e-19 Score=132.75 Aligned_cols=164 Identities=16% Similarity=0.224 Sum_probs=118.9 Q ss_pred CCCCEEEECCCCCCCCCCCCCCCCHHHHHCCCEECCCCCCCCEEEECCCCCCEEECCCCCCCCCHHHHHHHHHHHHHHHH Q ss_conf 54313575055666666766631012331023012688765403101343220112555333410002456655410220 Q gi|254781108|r 147 AISICMVLDVSRSMEDLYLQKHNDNNNMTSNKYLLPPPPKKSFWSKNTTKSKYAPAPAPANRKIDVLIESAGNLVNSIQK 226 (398) Q Consensus 147 ~~di~lvlD~SgSm~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~a~~~~~~~~~~ 226 (398) |.|+++|+|.||||... +++..+.++..+++.+.. 
T Consensus 2 P~div~viD~SgSM~g~---------------------------------------------~l~~ak~a~~~~l~~l~~ 36 (171) T cd01461 2 PKEVVFVIDTSGSMSGT---------------------------------------------KIEQTKEALLTALKDLPP 36 (171) T ss_pred CCEEEEEECCCCCCCCH---------------------------------------------HHHHHHHHHHHHHHHCCC T ss_conf 84699999179889863---------------------------------------------999999999999982998 Q ss_pred CCCCCCCCCEEEEEEECCCCCCCCCCCCCCC---CHHHHHHHHHCCCCCCCCCHHHHHHHHHHHHCCCCCCCCCCCCCCC Q ss_conf 2768776521346541267765431123568---9999999973577888744588999999961134666665667666 Q gi|254781108|r 227 AIQEKKNLSVRIGTIAYNIGIVGNQCTPLSN---NLNEVKSRLNKLNPYENTNTYPAMHHAYRELYNEKESSHNTIGSTR 303 (398) Q Consensus 227 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~lt~---~~~~~~~~I~~l~~~G~T~~~~gl~~a~~~l~~~~~~~~~~~~~~~ 303 (398) ..+.+.+.|+.....-...+... +.......|+.+.++|+|+++.||.+|++.+... .. T Consensus 37 --------~d~~~iv~F~~~~~~~~~~~~~~~~~~~~~a~~~i~~l~~~G~T~i~~aL~~a~~~l~~~----------~~ 98 (171) T cd01461 37 --------GDYFNIIGFSDTVEEFSPSSVSATAENVAAAIEYVNRLQALGGTNMNDALEAALELLNSS----------PG 98 (171) T ss_pred --------CCEEEEEEECCEEEEECCCCEECCHHHHHHHHHHHHCCCCCCCCHHHHHHHHHHHHHHHC----------CC T ss_conf --------787999998780659807753079999999998875478899866999999999988635----------79 Q ss_pred CCEEEEECCCCCCCCCCCCCCCHHHHHHHHHHHHCCCEEEEEEECCCCCHHHHHHHHHC-CCCCEEEECCHHHHHHHH Q ss_conf 65169981588778777776631489999999978988999995479755899999851-998189946989999999 Q gi|254781108|r 304 LKKFVIFITDGENSGASAYQNTLNTLQICEYMRNAGMKIYSVAVSAPPEGQDLLRKCTD-SSGQFFAVNDSRELLESF 380 (398) Q Consensus 304 ~~k~iillTDG~~~~~~~~~~~~~~~~~c~~~K~~gi~IytIg~~~~~~~~~~l~~cAs-~~~~yy~a~~~~~L~~aF 380 (398) ..+.|||+|||+.+.. ......+..+++.+|+||+||||.+ .+..+|+.+|. 
+.|.||++++.++|.+.+ T Consensus 99 ~~~~iillTDG~~~~~------~~~~~~~~~~~~~~i~i~tig~G~~-~~~~~L~~iA~~~~G~~~~v~~~~~l~~~~ 169 (171) T cd01461 99 SVPQIILLTDGEVTNE------SQILKNVREALSGRIRLFTFGIGSD-VNTYLLERLAREGRGIARRIYETDDIESQL 169 (171) T ss_pred CCCEEEEECCCCCCCH------HHHHHHHHHHHCCCCEEEEEEECCC-CCHHHHHHHHHCCCCEEEECCCHHHHHHHH T ss_conf 8618999757886886------8999999997448963999997897-999999999972898899889878999976 No 5 >cd01475 vWA_Matrilin VWA_Matrilin: In cartilaginous plate, extracellular matrix molecules mediate cell-matrix and matrix-matrix interactions thereby providing tissue integrity. Some members of the matrilin family are expressed specifically in developing cartilage rudiments. The matrilin family consists of at least four members. All the members of the matrilin family contain VWA domains, EGF-like domains and a heptad repeat coiled-coiled domain at the carboxy terminus which is responsible for the oligomerization of the matrilins. The VWA domains have been shown to be essential for matrilin network formation by interacting with matrix ligands. Probab=99.82 E-value=1.7e-18 Score=129.39 Aligned_cols=182 Identities=19% Similarity=0.297 Sum_probs=140.9 Q ss_pred CCCCCEEEECCCCCCCCCCCCCCCCHHHHHCCCEECCCCCCCCEEEECCCCCCEEECCCCCCCCCHHHHHHHHHHHHHHH Q ss_conf 65431357505566666676663101233102301268876540310134322011255533341000245665541022 Q gi|254781108|r 146 LAISICMVLDVSRSMEDLYLQKHNDNNNMTSNKYLLPPPPKKSFWSKNTTKSKYAPAPAPANRKIDVLIESAGNLVNSIQ 225 (398) Q Consensus 146 ~~~di~lvlD~SgSm~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~a~~~~~~~~~ 225 (398) +++|++|+||-|+|.... .....+..+..+++.+. 
T Consensus 1 gplDlvFllD~S~Svg~~---------------------------------------------nF~~~k~Fv~~lv~~f~ 35 (224) T cd01475 1 GPTDLVFLIDSSRSVRPE---------------------------------------------NFELVKQFLNQIIDSLD 35 (224) T ss_pred CCEEEEEEEECCCCCCHH---------------------------------------------HHHHHHHHHHHHHHHCC T ss_conf 974399999488998989---------------------------------------------99999999999998568 Q ss_pred HCCCCCCCCCEEEEEEECCCCCCCCCCCCCCCCHHHHHHHHHCCCC-CCCCCHHHHHHHHHHHHCCCCCCCCCCCCCCCC Q ss_conf 0276877652134654126776543112356899999999735778-887445889999999611346666656676666 Q gi|254781108|r 226 KAIQEKKNLSVRIGTIAYNIGIVGNQCTPLSNNLNEVKSRLNKLNP-YENTNTYPAMHHAYRELYNEKESSHNTIGSTRL 304 (398) Q Consensus 226 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~lt~~~~~~~~~I~~l~~-~G~T~~~~gl~~a~~~l~~~~~~~~~~~~~~~~ 304 (398) - ....+|.+.+.|+.............++..++.+|+++.+ +|+|+++.+|..+.+.++....+.+. ...+. T Consensus 36 I-----~~~~trVgvv~ys~~~~~~f~l~~~~~k~~l~~aI~~i~~~gggT~Tg~AL~~~~~~~f~~~~G~Rp--~~~~v 108 (224) T cd01475 36 V-----GPDATRVGLVQYSSTVKQEFPLGRFKSKADLKRAVRRMEYLETGTMTGLAIQYAMNNAFSEAEGARP--GSERV 108 (224) T ss_pred C-----CCCCEEEEEEEECCCEEEEEECCCCCCHHHHHHHHHHHHCCCCCCHHHHHHHHHHHHHCCCCCCCCC--CCCCC T ss_conf 7-----9985299999965827899966886788999999986361388446999999999972770239987--55689 Q ss_pred CEEEEECCCCCCCCCCCCCCCHHHHHHHHHHHHCCCEEEEEEECCCCCHHHHHHHHHCCC--CCEEEECCHHHHHHHHHH Q ss_conf 516998158877877777663148999999997898899999547975589999985199--818994698999999999 Q gi|254781108|r 305 KKFVIFITDGENSGASAYQNTLNTLQICEYMRNAGMKIYSVAVSAPPEGQDLLRKCTDSS--GQFFAVNDSRELLESFDK 382 (398) Q Consensus 305 ~k~iillTDG~~~~~~~~~~~~~~~~~c~~~K~~gi~IytIg~~~~~~~~~~l~~cAs~~--~~yy~a~~~~~L~~aF~~ 382 (398) +|++|+||||..+. .....+..+|++||+||+||+| + .+.+.|+.+||.| .|.|.+++.++|...-+. 
T Consensus 109 pkvlIviTDG~s~D--------~v~~~A~~lr~~GV~ifaVGVg-~-~~~~eL~~IAs~P~~~hvf~v~~F~~l~~l~~~ 178 (224) T cd01475 109 PRVGIVVTDGRPQD--------DVSEVAAKARALGIEMFAVGVG-R-ADEEELREIASEPLADHVFYVEDFSTIEELTKK 178 (224) T ss_pred CEEEEEECCCCCCC--------CHHHHHHHHHHCCCEEEEEECC-C-CCHHHHHHHHCCCCHHCEEEECCHHHHHHHHHH T ss_conf 85999971798766--------3899999999879889999637-4-798999998559737568994798899999999 Q ss_pred HHHHHHH Q ss_conf 9998751 Q gi|254781108|r 383 ITDKIQE 389 (398) Q Consensus 383 I~~~i~~ 389 (398) |.++|.. T Consensus 179 l~~~iC~ 185 (224) T cd01475 179 FQGKICV 185 (224) T ss_pred HHHHHCC T ss_conf 8761189 No 6 >cd01480 vWA_collagen_alpha_1-VI-type VWA_collagen alpha(VI) type: The extracellular matrix represents a complex alloy of variable members of diverse protein families defining structural integrity and various physiological functions. The most abundant family is the collagens with more than 20 different collagen types identified thus far. Collagens are centrally involved in the formation of fibrillar and microfibrillar networks of the extracellular matrix, basement membranes as well as other structures of the extracellular matrix. Some collagens have about 15-18 vWA domains in them. The VWA domains present in these collagens mediate protein-protein interactions. Probab=99.80 E-value=8.2e-18 Score=125.26 Aligned_cols=176 Identities=22% Similarity=0.230 Sum_probs=130.2 Q ss_pred CCCCCEEEECCCCCCCCCCCCCCCCHHHHHCCCEECCCCCCCCEEEECCCCCCEEECCCCCCCCCHHHHHHHHHHHHHHH Q ss_conf 65431357505566666676663101233102301268876540310134322011255533341000245665541022 Q gi|254781108|r 146 LAISICMVLDVSRSMEDLYLQKHNDNNNMTSNKYLLPPPPKKSFWSKNTTKSKYAPAPAPANRKIDVLIESAGNLVNSIQ 225 (398) Q Consensus 146 ~~~di~lvlD~SgSm~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~a~~~~~~~~~ 225 (398) +++|++++||-|+|+.... ....+..+..+++.+. 
T Consensus 1 gpvDlvFllD~S~Sv~~~~---------------------------------------------F~~~k~Fv~~lv~~f~ 35 (186) T cd01480 1 GPVDITFVLDSSESVGLQN---------------------------------------------FDITKNFVKRVAERFL 35 (186) T ss_pred CCEEEEEEEECCCCCCHHH---------------------------------------------HHHHHHHHHHHHHHHH T ss_conf 9746999996889878789---------------------------------------------9999999999999985 Q ss_pred HCC-CCCCCCCEEEEEEECCCCCCCCCC-CCCCCCHHHHHHHHHCCCC-CCCCCHHHHHHHHHHHHCCCCCCCCCCCCCC Q ss_conf 027-687765213465412677654311-2356899999999735778-8874458899999996113466666566766 Q gi|254781108|r 226 KAI-QEKKNLSVRIGTIAYNIGIVGNQC-TPLSNNLNEVKSRLNKLNP-YENTNTYPAMHHAYRELYNEKESSHNTIGST 302 (398) Q Consensus 226 ~~~-~~~~~~~~~~~~~~~~~~~~~~~~-~~lt~~~~~~~~~I~~l~~-~G~T~~~~gl~~a~~~l~~~~~~~~~~~~~~ 302 (398) ... .......+|.+.+.|+........ .....++..++.+|+++.. +|+|+++.+|.++.+.+....+ . T Consensus 36 ~~~~~~i~~~~~rVgvv~ys~~~~~~~~~~~~~~~~~~l~~~I~~i~y~gG~T~tg~AL~~a~~~~~~~~r--------~ 107 (186) T cd01480 36 KDYYRKDPAGSWRVGVVQYSDQQEVEAGFLRDIRNYTSLKEAVDNLEYIGGGTFTDCALKYATEQLLEGSH--------Q 107 (186) T ss_pred HHCCCCCCCCCEEEEEEEECCCEEEEECCCCCCCCHHHHHHHHHHHHCCCCCCHHHHHHHHHHHHHHHCCC--------C T ss_conf 30134568774389899855842798604777588999999997501358986299999999999861367--------8 Q ss_pred CCCEEEEECCCCCCCCCCCCCCCHHHHHHHHHHHHCCCEEEEEEECCCCCHHHHHHHHHCCCCCEEEECCHHHHHHHH Q ss_conf 665169981588778777776631489999999978988999995479755899999851998189946989999999 Q gi|254781108|r 303 RLKKFVIFITDGENSGASAYQNTLNTLQICEYMRNAGMKIYSVAVSAPPEGQDLLRKCTDSSGQFFAVNDSRELLESF 380 (398) Q Consensus 303 ~~~k~iillTDG~~~~~~~~~~~~~~~~~c~~~K~~gi~IytIg~~~~~~~~~~l~~cAs~~~~yy~a~~~~~L~~aF 380 (398) ..+|++||+|||+.+.. +........+.+|+.||+||+||+|. . 
..+.|+.+|++|.+.|.+.+.++|...| T Consensus 108 ~~~kvlvliTDG~S~~~----~~~~~~~aa~~lr~~GV~ifaVGVG~-~-~~~eL~~IAs~p~~~~~~~~f~~L~~~~ 179 (186) T cd01480 108 KENKFLLVITDGHSDGS----PDGGIEKAVNEADHLGIKIFFVAVGS-Q-NEEPLSRIACDGKSALYRENFAELLWSF 179 (186) T ss_pred CCCEEEEEEECCCCCCC----CCHHHHHHHHHHHHCCCEEEEEEECC-C-CHHHHHHHHCCCCCEEEECCHHHHHCCH T ss_conf 98538999845876667----40669999999998798999999474-8-8799999858997389736899870111 No 7 >cd01470 vWA_complement_factors Complement factors B and C2 are two critical proteases for complement activation. They both contain three CCP or Sushi domains, a trypsin-type serine protease domain and a single VWA domain with a conserved metal ion dependent adhesion site referred commonly as the MIDAS motif. Orthologues of these molecules are found from echinoderms to chordates. During complement activation, the CCP domains are cleaved off, resulting in the formation of an active protease that cleaves and activates complement C3. Complement C2 is in the classical pathway and complement B is in the alternative pathway. The interaction of C2 with C4 and of factor B with C3b are both dependent on Mg2+ binding sites within the VWA domains and the VWA domain of factor B has been shown to mediate the binding of C3. This is consistent with the common inferred function of VWA domains as magnesium-dependent protein interaction domains. Probab=99.80 E-value=6.7e-18 Score=125.80 Aligned_cols=181 Identities=23% Similarity=0.292 Sum_probs=127.9 Q ss_pred CCCEEEECCCCCCCCCCCCCCCCHHHHHCCCEECCCCCCCCEEEECCCCCCEEECCCCCCCCCHHHHHHHHHHHHHHHHC Q ss_conf 43135750556666667666310123310230126887654031013432201125553334100024566554102202 Q gi|254781108|r 148 ISICMVLDVSRSMEDLYLQKHNDNNNMTSNKYLLPPPPKKSFWSKNTTKSKYAPAPAPANRKIDVLIESAGNLVNSIQKA 227 (398) Q Consensus 148 ~di~lvlD~SgSm~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~a~~~~~~~~~~~ 227 (398) +|++++||-|+|++... ....+..+..+++.+... 
T Consensus 1 lDivfllD~SgSIg~~n---------------------------------------------F~~~k~Fv~~lv~~~~~~ 35 (198) T cd01470 1 LNIYIALDASDSIGEED---------------------------------------------FDEAKNAIKTLIEKISSY 35 (198) T ss_pred CEEEEEEECCCCCCHHH---------------------------------------------HHHHHHHHHHHHHHHCCC T ss_conf 91999997989888788---------------------------------------------999999999999984466 Q ss_pred CCCCCCCCEEEEEEECCCCCCCCCCC--CCCCCHHHHHHHHHCCCC-----CCCCCHHHHHHHHHHHHCCCCCCCCCCCC Q ss_conf 76877652134654126776543112--356899999999735778-----88744588999999961134666665667 Q gi|254781108|r 228 IQEKKNLSVRIGTIAYNIGIVGNQCT--PLSNNLNEVKSRLNKLNP-----YENTNTYPAMHHAYRELYNEKESSHNTIG 300 (398) Q Consensus 228 ~~~~~~~~~~~~~~~~~~~~~~~~~~--~lt~~~~~~~~~I~~l~~-----~G~T~~~~gl~~a~~~l~~~~~~~~~~~~ 300 (398) .. ..|.+.+.|+......... ..+.+...+..+|+.+.. +|||++..+|..+++.+.......+. . T Consensus 36 ~~-----~~rvgvv~ys~~~~~~f~l~~~~~~~~~~~~~~i~~i~y~~~~~~~gT~t~~AL~~~~~~~~~~~~~~~~--~ 108 (198) T cd01470 36 EV-----SPRYEIISYASDPKEIVSIRDFNSNDADDVIKRLEDFNYDDHGDKTGTNTAAALKKVYERMALEKVRNKE--A 108 (198) T ss_pred CC-----CCEEEEEEECCCCEEEEECCCCCCCCHHHHHHHHHHCCCCCCCCCCCHHHHHHHHHHHHHHHHHHHCCCC--C T ss_conf 87-----7538999815885389971576666899999999846033577886468999999999986555304664--4 Q ss_pred CCCCCEEEEECCCCCCCCCCCCCCCH-------HHHHHHHHHHHCCCEEEEEEECCCCCHHHHHHHHHCC-CC--CEEEE Q ss_conf 66665169981588778777776631-------4899999999789889999954797558999998519-98--18994 Q gi|254781108|r 301 STRLKKFVIFITDGENSGASAYQNTL-------NTLQICEYMRNAGMKIYSVAVSAPPEGQDLLRKCTDS-SG--QFFAV 370 (398) Q Consensus 301 ~~~~~k~iillTDG~~~~~~~~~~~~-------~~~~~c~~~K~~gi~IytIg~~~~~~~~~~l~~cAs~-~~--~yy~a 370 (398) ....+|++||+|||+.|.+..+.... 
........+++.||.||+||+| +..+.+.|+.+||+ |+ |+|.+ T Consensus 109 ~~~v~~v~illTDG~sn~g~~P~~~~~~~~~~~~~~~~a~~~r~~gi~ifaiGVG-~~~d~~eL~~IAS~~~~e~hvf~v 187 (198) T cd01470 109 FNETRHVIILFTDGKSNMGGSPLPTVDKIKNLVYKNNKSDNPREDYLDVYVFGVG-DDVNKEELNDLASKKDNERHFFKL 187 (198) T ss_pred CCCCCEEEEEECCCCCCCCCCCHHHHHHHHHHHHHHHHHHHHHHCCCEEEEEEEC-CCCCHHHHHHHHCCCCCCCEEEEE T ss_conf 4567559999737854578993367888777664101456788739479999966-615999999985799987169996 Q ss_pred CCHHHHHHHHH Q ss_conf 69899999999 Q gi|254781108|r 371 NDSRELLESFD 381 (398) Q Consensus 371 ~~~~~L~~aF~ 381 (398) .+.++|.++|. T Consensus 188 ~df~~L~~i~d 198 (198) T cd01470 188 KDYEDLQEVFD 198 (198) T ss_pred CCHHHHHHHHC T ss_conf 89999998639 No 8 >cd01466 vWA_C3HC4_type VWA C3HC4-type: Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. 
Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, Probab=99.79 E-value=4.1e-18 Score=127.08 Aligned_cols=152 Identities=18% Similarity=0.279 Sum_probs=112.0 Q ss_pred CCCEEEECCCCCCCCCCCCCCCCHHHHHCCCEECCCCCCCCEEEECCCCCCEEECCCCCCCCCHHHHHHHHHHHHHHHHC Q ss_conf 43135750556666667666310123310230126887654031013432201125553334100024566554102202 Q gi|254781108|r 148 ISICMVLDVSRSMEDLYLQKHNDNNNMTSNKYLLPPPPKKSFWSKNTTKSKYAPAPAPANRKIDVLIESAGNLVNSIQKA 227 (398) Q Consensus 148 ~di~lvlD~SgSm~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~a~~~~~~~~~~~ 227 (398) +|+++|+|+||||... +++..|.+...+++.+.. T Consensus 1 ~div~vlD~SGSM~g~---------------------------------------------~l~~~k~a~~~~~~~L~~- 34 (155) T cd01466 1 VDLVAVLDVSGSMAGD---------------------------------------------KLQLVKHALRFVISSLGD- 34 (155) T ss_pred CEEEEEEECCCCCCCH---------------------------------------------HHHHHHHHHHHHHHHCCC- T ss_conf 9399999089898873---------------------------------------------899999999999984897- Q ss_pred CCCCCCCCEEEEEEECCCCCCCCC-CCCC-CCCHHHHHHHHHCCCCCCCCCHHHHHHHHHHHHCCCCCCCCCCCCCCCCC Q ss_conf 768776521346541267765431-1235-68999999997357788874458899999996113466666566766665 Q gi|254781108|r 228 IQEKKNLSVRIGTIAYNIGIVGNQ-CTPL-SNNLNEVKSRLNKLNPYENTNTYPAMHHAYRELYNEKESSHNTIGSTRLK 305 (398) Q Consensus 228 ~~~~~~~~~~~~~~~~~~~~~~~~-~~~l-t~~~~~~~~~I~~l~~~G~T~~~~gl~~a~~~l~~~~~~~~~~~~~~~~~ 305 (398) ..+.+.+.|++...... ..+. ..++..++..|+.|.++|+|+++.||..|.+.|..... .+.. 
T Consensus 35 -------~d~v~iV~F~~~a~~~~pl~~~~~~~~~~~~~~i~~l~~~GgT~i~~gl~~a~~~l~~~~~--------~~~~ 99 (155) T cd01466 35 -------ADRLSIVTFSTSAKRLSPLRRMTAKGKRSAKRVVDGLQAGGGTNVVGGLKKALKVLGDRRQ--------KNPV 99 (155) T ss_pred -------CCEEEEEEECCCCEEEECCEECCHHHHHHHHHHHHCCCCCCCCCHHHHHHHHHHHHHHCCC--------CCCC T ss_conf -------6748999956874262046037999999999987537768887267999999999984366--------8983 Q ss_pred EEEEECCCCCCCCCCCCCCCHHHHHHHHHHHHCCCEEEEEEECCCCCHHHHHHHHHC-CCCCEEEEC Q ss_conf 169981588778777776631489999999978988999995479755899999851-998189946 Q gi|254781108|r 306 KFVIFITDGENSGASAYQNTLNTLQICEYMRNAGMKIYSVAVSAPPEGQDLLRKCTD-SSGQFFAVN 371 (398) Q Consensus 306 k~iillTDG~~~~~~~~~~~~~~~~~c~~~K~~gi~IytIg~~~~~~~~~~l~~cAs-~~~~yy~a~ 371 (398) +.|||||||+.|.. ..+.++++.||+|||||||. +.+..+|+.+|. +.|.||++. T Consensus 100 ~~IiLlTDG~~n~~----------~~~~~~~~~~i~i~tiGiG~-~~d~~lL~~iA~~~gG~~~~v~ 155 (155) T cd01466 100 ASIMLLSDGQDNHG----------AVVLRADNAPIPIHTFGLGA-SHDPALLAFIAEITGGTFSYVK 155 (155) T ss_pred EEEEEECCCCCCHH----------HHHHHHHCCCCEEEEEEECC-CCCHHHHHHHHHCCCCEEEEEC T ss_conf 08999826986405----------77899871797399999788-6789999999976997799949 No 9 >cd01474 vWA_ATR ATR (Anthrax Toxin Receptor): Anthrax toxin is a key virulence factor for Bacillus anthracis, the causative agent of anthrax. ATR is the cellular receptor for the anthrax protective antigen and facilitates entry of the toxin into cells. The VWA domain in ATR contains the toxin binding site and mediates interaction with protective antigen. The binding is mediated by divalent cations that binds to the MIDAS motif. These proteins are a family of vertebrate ECM receptors expressed by endothelial cells. 
Probab=99.78 E-value=5.6e-17 Score=120.23 Aligned_cols=178 Identities=19% Similarity=0.255 Sum_probs=129.6 Q ss_pred CCCCCEEEECCCCCCCCCCCCCCCCHHHHHCCCEECCCCCCCCEEEECCCCCCEEECCCCCCCCCHHHHHHHHHHHHHHH Q ss_conf 65431357505566666676663101233102301268876540310134322011255533341000245665541022 Q gi|254781108|r 146 LAISICMVLDVSRSMEDLYLQKHNDNNNMTSNKYLLPPPPKKSFWSKNTTKSKYAPAPAPANRKIDVLIESAGNLVNSIQ 225 (398) Q Consensus 146 ~~~di~lvlD~SgSm~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~a~~~~~~~~~ 225 (398) ..+|++|++|-|+|+...| .. .+..+..++..+. T Consensus 3 ~~~DivFllD~S~Sv~~~f--~~--------------------------------------------~~~Fv~~lv~~f~ 36 (185) T cd01474 3 GHFDLYFVLDKSGSVAANW--IE--------------------------------------------IYDFVEQLVDRFN 36 (185) T ss_pred CCEEEEEEEECCCCCCCCH--HH--------------------------------------------HHHHHHHHHHHCC T ss_conf 8613899997899876576--99--------------------------------------------9999999998569 Q ss_pred HCCCCCCCCCEEEEEEECCCCCCCCCCCCCCCCHHHHHHH---HHCCCCCCCCCHHHHHHHHHHHHCCCCCCCCCCCCCC Q ss_conf 0276877652134654126776543112356899999999---7357788874458899999996113466666566766 Q gi|254781108|r 226 KAIQEKKNLSVRIGTIAYNIGIVGNQCTPLSNNLNEVKSR---LNKLNPYENTNTYPAMHHAYRELYNEKESSHNTIGST 302 (398) Q Consensus 226 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~lt~~~~~~~~~---I~~l~~~G~T~~~~gl~~a~~~l~~~~~~~~~~~~~~ 302 (398) .+ ..|.+.+.|+..... ..+|+...+..... 
+..+.++|+|+++.||..+.+.++....+.+ T Consensus 37 ----~~---~~rvgvv~fS~~~~~--~f~l~~~~~~~~~~~~~~~~~~~~G~T~tg~AL~~a~~~~f~~~~g~R------ 101 (185) T cd01474 37 ----SP---GLRFSFITFSTRATK--ILPLTDDSSAIIKGLEVLKKVTPSGQTYIHEGLENANEQIFNRNGGGR------ 101 (185) T ss_pred ----CC---CEEEEEEEECCCCCE--EEECCCCHHHHHHHHHHHHHHCCCCCCHHHHHHHHHHHHHHCCCCCCC------ T ss_conf ----98---749999998698318--984578707889999998876158937899999999997503236998------ Q ss_pred CCCEEEEECCCCCCCCCCCCCCCHHHHHHHHHHHHCCCEEEEEEECCCCCHHHHHHHHHCCCCCEEEECC-HHHHHHHHH Q ss_conf 6651699815887787777766314899999999789889999954797558999998519981899469-899999999 Q gi|254781108|r 303 RLKKFVIFITDGENSGASAYQNTLNTLQICEYMRNAGMKIYSVAVSAPPEGQDLLRKCTDSSGQFFAVND-SRELLESFD 381 (398) Q Consensus 303 ~~~k~iillTDG~~~~~~~~~~~~~~~~~c~~~K~~gi~IytIg~~~~~~~~~~l~~cAs~~~~yy~a~~-~~~L~~aF~ 381 (398) ...|++|++|||..+... .......++.+|+.||.||+||++ ..+++.|..+|++|++.|.+++ .++|....+ T Consensus 102 ~~~kvlivlTDG~s~~~~----~~~~~~~a~~lr~~gV~i~aVGV~-- 175 (185) T cd01474 102 ETVSVIIALTDGQLLLNG----HKYPEHEAKLSRKLGAIVYCVGVT--DFLKSQLINIADSKEYVFPVTSGFQALSGIIE 175 (185) T ss_pred CCCEEEEEEECCCCCCCC----CHHHHHHHHHHHHCCCEEEEEECC--CCCHHHHHHHHCCCCEEEECCCCHHHHHHHHH T ss_conf 876289999326656762----141799999999789489999716--25999999871998648983475777899999 Q ss_pred HHHHHHHHC Q ss_conf 999987512 Q gi|254781108|r 382 KITDKIQEQ 390 (398) Q Consensus 382 ~I~~~i~~~ 390 (398) +|.++|..+ T Consensus 176 ~l~~~iC~~ 184 (185) T cd01474 176 SVVKKACIE 184 (185) T ss_pred HHHHHHCCC T ss_conf 999852879 No 10 >cd01463 vWA_VGCC_like VWA Voltage gated Calcium channel like: Voltage-gated calcium channels are a complex of five proteins: alpha 1, beta 1, gamma, alpha 2 and delta. The alpha 2 and delta subunits result from proteolytic processing of a single gene product and carries at its N-terminus the VWA and cache domains. The alpha 2 delta gene family has orthologues in D. melanogaster and C. elegans but none have been detected in either A. thaliana or yeast.
The exact biochemical function of the VWA domain is not known but the alpha 2 delta complex has been shown to regulate various functional properties of the channel complex. Probab=99.75 E-value=4.6e-17 Score=120.76 Aligned_cols=171 Identities=13% Similarity=0.229 Sum_probs=116.4 Q ss_pred CCCCCCCCEEEECCCCCCCCCCCCCCCCHHHHHCCCEECCCCCCCCEEEECCCCCCEEECCCCCCCCCHHHHHHHHHHHH Q ss_conf 45565431357505566666676663101233102301268876540310134322011255533341000245665541 Q gi|254781108|r 143 SENLAISICMVLDVSRSMEDLYLQKHNDNNNMTSNKYLLPPPPKKSFWSKNTTKSKYAPAPAPANRKIDVLIESAGNLVN 222 (398) Q Consensus 143 ~~~~~~di~lvlD~SgSm~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~a~~~~~~ 222 (398) ....|.++++|||+||||... +++.++.++..+++ T Consensus 9 a~~~Pkdvv~vlD~SGSM~g~---------------------------------------------kl~~ak~a~~~il~ 43 (190) T cd01463 9 AATSPKDIVILLDVSGSMTGQ---------------------------------------------RLHLAKQTVSSILD 43 (190) T ss_pred CCCCCCEEEEEEECCCCCCCC---------------------------------------------HHHHHHHHHHHHHH T ss_conf 678982699999799988973---------------------------------------------49999999999998 Q ss_pred HHHHCCCCCCCCCEEEEEEECCCCCCCC-------CCCCCCCCHHHHHHHHHCCCCCCCCCHHHHHHHHHHHHCCCCCCC Q ss_conf 0220276877652134654126776543-------112356899999999735778887445889999999611346666 Q gi|254781108|r 223 SIQKAIQEKKNLSVRIGTIAYNIGIVGN-------QCTPLSNNLNEVKSRLNKLNPYENTNTYPAMHHAYRELYNEKESS 295 (398) Q Consensus 223 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~-------~~~~lt~~~~~~~~~I~~l~~~G~T~~~~gl~~a~~~l~~~~~~~ 295 (398) .+... .+..++.|++....- .......++..++..|+.|.+.|+|++..||..|+++|....... 
T Consensus 44 ~L~~~--------D~~~iv~Fs~~~~~~~p~~~~~~~~~t~~n~~~~~~~i~~l~~~G~Tn~~~al~~A~~~l~~~~~~~ 115 (190) T cd01463 44 TLSDN--------DFFNIITFSNEVNPVVPCFNDTLVQATTSNKKVLKEALDMLEAKGIANYTKALEFAFSLLLKNLQSN 115 (190) T ss_pred HCCCC--------CEEEEEEECCCCEEEECCCCCCEEECCHHHHHHHHHHHHHCCCCCCCHHHHHHHHHHHHHHHHHCCC T ss_conf 19987--------7999999689753630245684336899999999999982857987248999999999998742015 Q ss_pred CCCCCCCCCCEEEEECCCCCCCCCCCCCCCHHHHHHHHHHH--HCCCEEEEEEECCCCCHHHHHHHHHCC-CCCEEEECC Q ss_conf 65667666651699815887787777766314899999999--789889999954797558999998519-981899469 Q gi|254781108|r 296 HNTIGSTRLKKFVIFITDGENSGASAYQNTLNTLQICEYMR--NAGMKIYSVAVSAPPEGQDLLRKCTDS-SGQFFAVND 372 (398) Q Consensus 296 ~~~~~~~~~~k~iillTDG~~~~~~~~~~~~~~~~~c~~~K--~~gi~IytIg~~~~~~~~~~l~~cAs~-~~~yy~a~~ 372 (398) .... .....++|||+|||..+.. ......-+..+ ..+|.|||+|||.+..+..+|+.+|.. .|+||+..+ T Consensus 116 ~~~~-~~~~~~~IillTDG~~~~~------~~i~~~~~~~~~~~~~i~ift~G~G~~~~d~~~L~~iA~~~~G~y~~I~~ 188 (190) T cd01463 116 HSGS-RSQCNQAIMLITDGVPENY------KEIFDKYNWDKNSEIPVRVFTYLIGREVTDRREIQWMACENKGYYSHIQS 188 (190) T ss_pred CCCC-CCCCCCEEEEEECCCCCCH------HHHHHHHHHHHCCCCCEEEEEEEECCCCCCHHHHHHHHHCCCCEEEECCC T ss_conf 5665-5555515999836988757------88999999975579987999999679977879999999809956997888 Q ss_pred H Q ss_conf 8 Q gi|254781108|r 373 S 373 (398) Q Consensus 373 ~ 373 (398) . T Consensus 189 ~ 189 (190) T cd01463 189 L 189 (190) T ss_pred C T ss_conf 9 No 11 >cd01456 vWA_ywmD_type VWA ywmD type:Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. 
These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if Probab=99.72 E-value=1.8e-16 Score=117.27 Aligned_cols=166 Identities=17% Similarity=0.226 Sum_probs=113.3 Q ss_pred CCCCCCCEEEECCCCCCCCCCCCCCCCHHHHHCCCEECCCCCCCCEEEECCCCCCEEECCCCCCCCCHHHHHHHHHHHHH Q ss_conf 55654313575055666666766631012331023012688765403101343220112555333410002456655410 Q gi|254781108|r 144 ENLAISICMVLDVSRSMEDLYLQKHNDNNNMTSNKYLLPPPPKKSFWSKNTTKSKYAPAPAPANRKIDVLIESAGNLVNS 223 (398) Q Consensus 144 ~~~~~di~lvlD~SgSm~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~a~~~~~~~ 223 (398) ...+.++++|+|.||||..... ....|++..+.++..+++. T Consensus 17 ~~~P~~~~lVlD~SGSM~~~~~---------------------------------------~g~~rl~~ak~a~~~~v~~ 57 (206) T cd01456 17 PQLPPNVAIVLDNSGSMREVDG---------------------------------------GGETRLDNAKAALDETANA 57 (206) T ss_pred CCCCCEEEEEEECCCCCCCCCC---------------------------------------CCCCHHHHHHHHHHHHHHH T ss_conf 9898738999979878778787---------------------------------------7645999999999999985 Q ss_pred HHHCCCCCCCCCEEEEEEECCCCCCCC----C---CCC--------CCCCHHHHHHHHHCCC-CCCCCCHHHHHHHHHHH Q ss_conf 220276877652134654126776543----1---123--------5689999999973577-88874458899999996 Q gi|254781108|r 224 IQKAIQEKKNLSVRIGTIAYNIGIVGN----Q---CTP--------LSNNLNEVKSRLNKLN-PYENTNTYPAMHHAYRE 287 (398) Q Consensus 224 ~~~~~~~~~~~~~~~~~~~~~~~~~~~----~---~~~--------lt~~~~~~~~~I~~l~-~~G~T~~~~gl~~a~~~ 287 (398) +... .+.+.+.|....... . ..+ ...++..+...|+.+. +.|+|+++.|+..+.+. 
T Consensus 58 l~~~--------drvgLv~F~~~~~~~~d~~~~~~~~~~~~~~~~~~~~~r~~l~~~i~~l~~~~G~T~l~~al~~a~~~ 129 (206) T cd01456 58 LPDG--------TRLGLWTFSGDGDNPLDVRVLVPKGCLTAPVNGFPSAQRSALDAALNSLQTPTGWTPLAAALAEAAAY 129 (206) T ss_pred CCCC--------CEEEEEEECCCCCCCCCCCEECCCCCCCCCCCCCCHHHHHHHHHHHHHCCCCCCCCHHHHHHHHHHHH T ss_conf 7999--------87999997786777888513214565444345523778999999997457788964799999999986 Q ss_pred HCCCCCCCCCCCCCCCCCEEEEECCCCCCCCCCCCCCCHHHHHHHHHH-----HHCCCEEEEEEECCCCCHHHHHHHHHC Q ss_conf 113466666566766665169981588778777776631489999999-----978988999995479755899999851 Q gi|254781108|r 288 LYNEKESSHNTIGSTRLKKFVIFITDGENSGASAYQNTLNTLQICEYM-----RNAGMKIYSVAVSAPPEGQDLLRKCTD 362 (398) Q Consensus 288 l~~~~~~~~~~~~~~~~~k~iillTDG~~~~~~~~~~~~~~~~~c~~~-----K~~gi~IytIg~~~~~~~~~~l~~cAs 362 (398) +.+. ..+.|||||||+++.+.. ....+..+ +..+|.||+||||.+ .+.++|+.+|. T Consensus 130 ~~~~------------~~~~IvLlTDG~~~~g~~------~~~~~~~l~~~~~~~~~v~V~tig~G~d-~d~~~L~~IA~ 190 (206) T cd01456 130 VDPG------------RVNVVVLITDGEDTCGPD------PCEVARELAKRRTPAPPIKVNVIDFGGD-ADRAELEAIAE 190 (206) T ss_pred HCCC------------CCCEEEEEECCCCCCCCC------HHHHHHHHHHHCCCCCCEEEEEEEECCC-CCHHHHHHHHH T ss_conf 2778------------764799992376446888------5999999998317799958999971886-58999999997 Q ss_pred C-CCCE-EEECCHHH Q ss_conf 9-9818-99469899 Q gi|254781108|r 363 S-SGQF-FAVNDSRE 375 (398) Q Consensus 363 ~-~~~y-y~a~~~~~ 375 (398) . +|.| |.++++.. T Consensus 191 ~tgG~y~y~~~d~~~ 205 (206) T cd01456 191 ATGGTYAYNQSDLAS 205 (206) T ss_pred CCCCEEEEECCCCCC T ss_conf 429789951676021 No 12 >cd01451 vWA_Magnesium_chelatase Magnesium chelatase: Mg-chelatase catalyses the insertion of Mg into protoporphyrin IX (Proto). In chlorophyll biosynthesis, insertion of Mg2+ into protoporphyrin IX is catalysed by magnesium chelatase in an ATP-dependent reaction. Magnesium chelatase is a three sub-unit (BchI, BchD and BchH) enzyme with a novel arrangement of domains: the C-terminal helical domain is located behind the nucleotide binding site. 
The BchD domain contains a AAA domain at its N-terminus and a VWA domain at its C-terminus. The VWA domain has been speculated to be involved in mediating protein-protein interactions. Probab=99.71 E-value=1.2e-15 Score=112.31 Aligned_cols=172 Identities=16% Similarity=0.229 Sum_probs=124.1 Q ss_pred CCEEEECCCCCCCCCCCCCCCCHHHHHCCCEECCCCCCCCEEEECCCCCCEEECCCCCCCCCHHHHHHHHHHHHHHHHCC Q ss_conf 31357505566666676663101233102301268876540310134322011255533341000245665541022027 Q gi|254781108|r 149 SICMVLDVSRSMEDLYLQKHNDNNNMTSNKYLLPPPPKKSFWSKNTTKSKYAPAPAPANRKIDVLIESAGNLVNSIQKAI 228 (398) Q Consensus 149 di~lvlD~SgSm~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~a~~~~~~~~~~~~ 228 (398) -++||+|.||||... .|+...|.++..++... T Consensus 2 lvvfvvD~SGSM~~~--------------------------------------------~rl~~aK~a~~~ll~d~---- 33 (178) T cd01451 2 LVIFVVDASGSMAAR--------------------------------------------HRMAAAKGAVLSLLRDA---- 33 (178) T ss_pred EEEEEEECCCCCCCC--------------------------------------------CHHHHHHHHHHHHHHHH---- T ss_conf 699999898788875--------------------------------------------67999999999999974---- Q ss_pred CCCCCCCEEEEEEECCCCCCCCCCCCCCCCHHHHHHHHHCCCCCCCCCHHHHHHHHHHHHCCCCCCCCCCCCCCCCCEEE Q ss_conf 68776521346541267765431123568999999997357788874458899999996113466666566766665169 Q gi|254781108|r 229 QEKKNLSVRIGTIAYNIGIVGNQCTPLSNNLNEVKSRLNKLNPYENTNTYPAMHHAYRELYNEKESSHNTIGSTRLKKFV 308 (398) Q Consensus 229 ~~~~~~~~~~~~~~~~~~~~~~~~~~lt~~~~~~~~~I~~l~~~G~T~~~~gl~~a~~~l~~~~~~~~~~~~~~~~~k~i 308 (398) .....+.+.+.|... ......|.|.+...++..|+.|.++|+|++..||..|++.+..... 
.+...+++ T Consensus 34 ---~~~~D~v~lv~F~g~-~a~~~lppT~~~~~~~~~l~~L~~gG~T~l~~gL~~a~~~~~~~~~-------~~~~~~~i 102 (178) T cd01451 34 ---YQRRDKVALIAFRGT-EAEVLLPPTRSVELAKRRLARLPTGGGTPLAAGLLAAYELAAEQAR-------DPGQRPLI 102 (178) T ss_pred ---CCCCCEEEEEEECCC-CCEEECCCCCCHHHHHHHHHCCCCCCCCCHHHHHHHHHHHHHHHCC-------CCCCCEEE T ss_conf ---346788999997597-5558568876579999987216788985199999999999998502-------78984399 Q ss_pred EECCCCCCCCCCCCCCCHHHHHHHHHHHHCCCEEEEEEECCCCCHHHHHHHHHCC-CCCEEEECCH--HHHHHHH Q ss_conf 9815887787777766314899999999789889999954797558999998519-9818994698--9999999 Q gi|254781108|r 309 IFITDGENSGASAYQNTLNTLQICEYMRNAGMKIYSVAVSAPPEGQDLLRKCTDS-SGQFFAVNDS--RELLESF 380 (398) Q Consensus 309 illTDG~~~~~~~~~~~~~~~~~c~~~K~~gi~IytIg~~~~~~~~~~l~~cAs~-~~~yy~a~~~--~~L~~aF 380 (398) ||+|||..|.+..... .....++.++++.||...+|+|+.+.....+++..|.. +|+||..++. ++|.++. T Consensus 103 iLlTDG~~N~g~~~~~-~~~~~~a~~~~~~gi~~~vId~~~~~~~~~~~~~LA~~~~g~Y~~id~l~~~~i~~~v 176 (178) T cd01451 103 VVITDGRANVGPDPTA-DRALAAARKLRARGISALVIDTEGRPVRRGLAKDLARALGGQYVRLPDLSADAIASAV 176 (178) T ss_pred EEECCCCCCCCCCCHH-HHHHHHHHHHHHCCCCEEEEECCCCCCCHHHHHHHHHHCCCCEEECCCCCHHHHHHHH T ss_conf 9984698667999512-6999999999866997899979999767489999999429969989979988999987 No 13 >cd01469 vWA_integrins_alpha_subunit Integrins are a class of adhesion receptors that link the extracellular matrix to the cytoskeleton and cooperate with growth factor receptors to promote cell survival, cell cycle progression and cell migration. Integrins consist of an alpha and a beta sub-unit. Each sub-unit has a large extracellular portion, a single transmembrane segment and a short cytoplasmic domain.
The N-terminal domains of the alpha and beta subunits associate to form the integrin headpiece, which contains the ligand binding site, whereas the C-terminal segments traverse the plasma membrane and mediate interaction with the cytoskeleton and with signalling proteins.The VWA domains present in the alpha subunits of integrins seem to be a chordate specific radiation of the gene family being found only in vertebrates. They mediate protein-protein interactions. Probab=99.69 E-value=1.8e-15 Score=111.27 Aligned_cols=170 Identities=19% Similarity=0.252 Sum_probs=123.9 Q ss_pred CCCEEEECCCCCCCCCCCCCCCCHHHHHCCCEECCCCCCCCEEEECCCCCCEEECCCCCCCCCHHHHHHHHHHHHHHHHC Q ss_conf 43135750556666667666310123310230126887654031013432201125553334100024566554102202 Q gi|254781108|r 148 ISICMVLDVSRSMEDLYLQKHNDNNNMTSNKYLLPPPPKKSFWSKNTTKSKYAPAPAPANRKIDVLIESAGNLVNSIQKA 227 (398) Q Consensus 148 ~di~lvlD~SgSm~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~a~~~~~~~~~~~ 227 (398) +|+++++|-|+|+.... ....+..+..+++.+.-. T Consensus 1 lDl~fllD~S~Sv~~~~---------------------------------------------F~~~k~fi~~lv~~f~i~ 35 (177) T cd01469 1 MDIVFVLDGSGSIYPDD---------------------------------------------FQKVKNFLSTVMKKLDIG 35 (177) T ss_pred CCEEEEEECCCCCCHHH---------------------------------------------HHHHHHHHHHHHHHCCCC T ss_conf 90999996889999899---------------------------------------------999999999999866769 Q ss_pred CCCCCCCCEEEEEEECCCCCCCCCCCCCCCCHHHHHHHHHCCCC-CCCCCHHHHHHHHHHHHCCCCCCCCCCCCCCCCCE Q ss_conf 76877652134654126776543112356899999999735778-88744588999999961134666665667666651 Q gi|254781108|r 228 IQEKKNLSVRIGTIAYNIGIVGNQCTPLSNNLNEVKSRLNKLNP-YENTNTYPAMHHAYRELYNEKESSHNTIGSTRLKK 306 (398) Q Consensus 228 ~~~~~~~~~~~~~~~~~~~~~~~~~~~lt~~~~~~~~~I~~l~~-~G~T~~~~gl~~a~~~l~~~~~~~~~~~~~~~~~k 306 (398) ....|.+.+.|+.............+...+..+|+.+.. 
+|+|+++.+|.++.+.++....+.+ ++..| T Consensus 36 -----~~~~rvglv~ys~~~~~~~~l~~~~~~~~~~~~i~~i~~~~g~t~t~~AL~~a~~~~f~~~~g~R-----~~~~k 105 (177) T cd01469 36 -----PTKTQFGLVQYSESFRTEFTLNEYRTKEEPLSLVKHISQLLGLTNTATAIQYVVTELFSESNGAR-----KDATK 105 (177) T ss_pred -----CCCCEEEEEEECCCEEEEEECCCCCCHHHHHHHHHHCCCCCCCCCHHHHHHHHHHHHCCCCCCCC-----CCCEE T ss_conf -----98748999993682499982355677899999986230368975252799999998536455886-----78716 Q ss_pred EEEECCCCCCCCCCCCCCCHHHHHHHHHHHHCCCEEEEEEECCC---CCHHHHHHHHHCCC--CCEEEECCHHHHHH Q ss_conf 69981588778777776631489999999978988999995479---75589999985199--81899469899999 Q gi|254781108|r 307 FVIFITDGENSGASAYQNTLNTLQICEYMRNAGMKIYSVAVSAP---PEGQDLLRKCTDSS--GQFFAVNDSRELLE 378 (398) Q Consensus 307 ~iillTDG~~~~~~~~~~~~~~~~~c~~~K~~gi~IytIg~~~~---~~~~~~l~~cAs~~--~~yy~a~~~~~L~~ 378 (398) ++|++|||..+.+ .....+.+.+|+.||+||+||+|.. ......|+.+||+| .|.|.+++-++|.+ T Consensus 106 v~ivlTDG~s~d~------~~~~~~~~~lk~~gv~vf~VGvG~~~~~~~~~~eL~~iAs~P~~~hvf~~~~f~~L~~ 176 (177) T cd01469 106 VLVVITDGESHDD------PLLKDVIPQAEREGIIRYAIGVGGHFQRENSREELKTIASKPPEEHFFNVTDFAALKD 176 (177) T ss_pred EEEEEECCCCCCC------CCHHHHHHHHHHCCEEEEEEEECCCCCCCCCHHHHHHHHCCCCHHCEEEECCHHHHCC T ss_conf 9999978986775------0149999999979908999995551467451999999967985871998379777646 No 14 >cd01472 vWA_collagen von Willebrand factor (vWF) type A domain; equivalent to the I-domain of integrins. This domain has a variety of functions including: intermolecular adhesion, cell migration, signalling, transcription, and DNA repair. In integrins these domains form heterodimers while in vWF it forms homodimers and multimers. There are different interaction surfaces of this domain as seen by its complexes with collagen with either integrin or human vWFA. In integrins collagen binding occurs via the metal ion-dependent adhesion site (MIDAS) and involves three surface loops located on the upper surface of the molecule. 
In human vWFA, collagen binding is thought to occur on the bottom of the molecule and does not involve the vestigial MIDAS motif. Probab=99.69 E-value=1.7e-15 Score=111.31 Aligned_cols=160 Identities=19% Similarity=0.281 Sum_probs=115.9 Q ss_pred CCCEEEECCCCCCCCCCCCCCCCHHHHHCCCEECCCCCCCCEEEECCCCCCEEECCCCCCCCCHHHHHHHHHHHHHHHHC Q ss_conf 43135750556666667666310123310230126887654031013432201125553334100024566554102202 Q gi|254781108|r 148 ISICMVLDVSRSMEDLYLQKHNDNNNMTSNKYLLPPPPKKSFWSKNTTKSKYAPAPAPANRKIDVLIESAGNLVNSIQKA 227 (398) Q Consensus 148 ~di~lvlD~SgSm~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~a~~~~~~~~~~~ 227 (398) .|+++|||.|+||.... .+..+..+..++..+... T Consensus 1 aDi~fvlD~S~Sv~~~~---------------------------------------------f~~~k~fi~~li~~~~i~ 35 (164) T cd01472 1 ADIVFLVDGSESIGLSN---------------------------------------------FNLVKDFVKRVVERLDIG 35 (164) T ss_pred CCEEEEEECCCCCCHHH---------------------------------------------HHHHHHHHHHHHHHCCCC T ss_conf 97999997979988799---------------------------------------------999999999999964768 Q ss_pred CCCCCCCCEEEEEEECCCCCCCCCCCCCCCCHHHHHHHHHCCCC-CCCCCHHHHHHHHHHHHCCCCCCCCCCCCCCCCCE Q ss_conf 76877652134654126776543112356899999999735778-88744588999999961134666665667666651 Q gi|254781108|r 228 IQEKKNLSVRIGTIAYNIGIVGNQCTPLSNNLNEVKSRLNKLNP-YENTNTYPAMHHAYRELYNEKESSHNTIGSTRLKK 306 (398) Q Consensus 228 ~~~~~~~~~~~~~~~~~~~~~~~~~~~lt~~~~~~~~~I~~l~~-~G~T~~~~gl~~a~~~l~~~~~~~~~~~~~~~~~k 306 (398) ....+.+.+.|..............+...+..+|+.|.. +|+|+++.||.++.+.++....+. 
.++.+| T Consensus 36 -----~~~~rvgvv~fs~~~~~~~~l~~~~~~~~l~~~i~~i~~~~g~t~~~~AL~~~~~~~~~~~~~~-----r~~~~k 105 (164) T cd01472 36 -----PDGVRVGVVQYSDDPRTEFYLNTYRSKDDVLEAVKNLRYIGGGTNTGKALKYVRENLFTEASGS-----REGVPK 105 (164) T ss_pred -----CCCCEEEEEEECCCEEEEECCCCCCCHHHHHHHHHHHHCCCCCCHHHHHHHHHHHHHHCCCCCC-----CCCCEE T ss_conf -----8860899998247415874454669889999999861166897529999999999863535787-----678515 Q ss_pred EEEECCCCCCCCCCCCCCCHHHHHHHHHHHHCCCEEEEEEECCCCCHHHHHHHHHCCC--CCEEEECC Q ss_conf 6998158877877777663148999999997898899999547975589999985199--81899469 Q gi|254781108|r 307 FVIFITDGENSGASAYQNTLNTLQICEYMRNAGMKIYSVAVSAPPEGQDLLRKCTDSS--GQFFAVND 372 (398) Q Consensus 307 ~iillTDG~~~~~~~~~~~~~~~~~c~~~K~~gi~IytIg~~~~~~~~~~l~~cAs~~--~~yy~a~~ 372 (398) ++||+|||..+. .....+..+|++||+||+||+| + .+.+.|+.+||+| .|+|.+++ T Consensus 106 vvvllTDG~s~~--------~~~~~a~~lr~~Gi~v~~VGig-~-~~~~~L~~iAs~p~~~~~~~~~~ 163 (164) T cd01472 106 VLVVITDGKSQD--------DVEEPAVELKQAGIEVFAVGVK-N-ADEEELKQIASDPKELYVFNVAD 163 (164) T ss_pred EEEEEECCCCCC--------HHHHHHHHHHHCCCEEEEEECC-C-CCHHHHHHHHCCCCHHEEEECCC T ss_conf 999983799864--------0889999999889889999788-4-79999999967993783896588 No 15 >cd01473 vWA_CTRP CTRP for CS protein-TRAP-related protein: Adhesion of Plasmodium to host cells is an important phenomenon in parasite invasion and in malaria associated pathology.CTRP encodes a protein containing a putative signal sequence followed by a long extracellular region of 1990 amino acids, a transmembrane domain, and a short cytoplasmic segment. The extracellular region of CTRP contains two separated adhesive domains. The first domain contains six 210-amino acid-long homologous VWA domain repeats. The second domain contains seven repeats of 87-60 amino acids in length, which share similarities with the thrombospondin type 1 domain found in a variety of adhesive molecules. Finally, CTRP also contains consensus motifs found in the superfamily of haematopoietin receptors. 
The VWA domains in these proteins likely mediate protein-protein interactions. Probab=99.69 E-value=9.3e-15 Score=106.91 Aligned_cols=181 Identities=17% Similarity=0.197 Sum_probs=127.4 Q ss_pred CCCEEEECCCCCCCCCCCCCCCCHHHHHCCCEECCCCCCCCEEEECCCCCCEEECCCCCCCCCHHHHHHHHHHHHHHHHC Q ss_conf 43135750556666667666310123310230126887654031013432201125553334100024566554102202 Q gi|254781108|r 148 ISICMVLDVSRSMEDLYLQKHNDNNNMTSNKYLLPPPPKKSFWSKNTTKSKYAPAPAPANRKIDVLIESAGNLVNSIQKA 227 (398) Q Consensus 148 ~di~lvlD~SgSm~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~a~~~~~~~~~~~ 227 (398) .|+++++|-|+|.+..... ...+..+..+++.+. T Consensus 1 ~DivFllD~S~SIg~~nf~--------------------------------------------~~v~~F~~~lv~~f~-- 34 (192) T cd01473 1 YDLTLILDESASIGYSNWR--------------------------------------------KDVIPFTEKIINNLN-- 34 (192) T ss_pred CCEEEEEECCCCCCHHHHH--------------------------------------------HHHHHHHHHHHHHCC-- T ss_conf 9789999389986667769--------------------------------------------999999999998756-- Q ss_pred CCCCCCCCEEEEEEECCCCCCCC--CCCCCCCCHHHHHHHHHCCC----CCCCCCHHHHHHHHHHHHCCCCCCCCCCCCC Q ss_conf 76877652134654126776543--11235689999999973577----8887445889999999611346666656676 Q gi|254781108|r 228 IQEKKNLSVRIGTIAYNIGIVGN--QCTPLSNNLNEVKSRLNKLN----PYENTNTYPAMHHAYRELYNEKESSHNTIGS 301 (398) Q Consensus 228 ~~~~~~~~~~~~~~~~~~~~~~~--~~~~lt~~~~~~~~~I~~l~----~~G~T~~~~gl~~a~~~l~~~~~~~~~~~~~ 301 (398) .+...++.+.+.|+...... .....++++..+..+|..|. 
.+|+|+++.+|..+.+.++.....+ T Consensus 35 ---Ig~~~~rvgvv~yS~~~~~~~~f~~~~~~~k~~~l~~i~~l~~~~~~gg~T~tg~AL~~~~~~~~~~~g~R------ 105 (192) T cd01473 35 ---ISKDKVHVGILLFAEKNRDVVPFSDEERYDKNELLKKINDLKNSYRSGGETYIVEALKYGLKNYTKHGNRR------ 105 (192) T ss_pred ---CCCCCEEEEEEEECCCCCEEEECCCCCCCCHHHHHHHHHHHHHCCCCCCCCHHHHHHHHHHHHHCCCCCCC------ T ss_conf ---59896199999955887401323554434899999999998731468982479999999999863467888------ Q ss_pred CCCCEEEEECCCCCCCCCCCCCCCHHHHHHHHHHHHCCCEEEEEEECCCCCHHHHHHHHHCCCCC-----EEEECCHHHH Q ss_conf 66651699815887787777766314899999999789889999954797558999998519981-----8994698999 Q gi|254781108|r 302 TRLKKFVIFITDGENSGASAYQNTLNTLQICEYMRNAGMKIYSVAVSAPPEGQDLLRKCTDSSGQ-----FFAVNDSREL 376 (398) Q Consensus 302 ~~~~k~iillTDG~~~~~~~~~~~~~~~~~c~~~K~~gi~IytIg~~~~~~~~~~l~~cAs~~~~-----yy~a~~~~~L 376 (398) ++..|++|++|||..+.. +......+...+|++||+||.||+|. .+...|+.+|+.|.. +|...+-++| T Consensus 106 ~~vpkv~IvlTDG~s~~~----~~~~~~~~a~~lr~~gV~i~avGVg~--~~~~eL~~iag~~~~~~~c~~~~~~~fd~l 179 (192) T cd01473 106 KDAPKVTMLFTDGNDTSA----SKKELQDISLLYKEENVKLLVVGVGA--ASENKLKLLAGCDINNDNCPNVIKTEWNNL 179 (192) T ss_pred CCCCEEEEEEECCCCCCC----CHHHHHHHHHHHHHCCCEEEEEEECC--CCHHHHHHHHCCCCCCCCCCEEEECCHHHH T ss_conf 899749999956998873----16789999999998797899998063--799999998699988997757994797899 Q ss_pred HHHHHHHHHHHHH Q ss_conf 9999999998751 Q gi|254781108|r 377 LESFDKITDKIQE 389 (398) Q Consensus 377 ~~aF~~I~~~i~~ 389 (398) ..+.+.|.++|.+ T Consensus 180 ~~i~~~l~~~vC~ 192 (192) T cd01473 180 NGISKFLTDKICD 192 (192) T ss_pred HHHHHHHHHHHCC T ss_conf 9999999997249 No 16 >pfam00092 VWA von Willebrand factor type A domain. 
Probab=99.66 E-value=1.4e-14 Score=105.85 Aligned_cols=172 Identities=21% Similarity=0.281 Sum_probs=120.1 Q ss_pred CCEEEECCCCCCCCCCCCCCCCHHHHHCCCEECCCCCCCCEEEECCCCCCEEECCCCCCCCCHHHHHHHHHHHHHHHHCC Q ss_conf 31357505566666676663101233102301268876540310134322011255533341000245665541022027 Q gi|254781108|r 149 SICMVLDVSRSMEDLYLQKHNDNNNMTSNKYLLPPPPKKSFWSKNTTKSKYAPAPAPANRKIDVLIESAGNLVNSIQKAI 228 (398) Q Consensus 149 di~lvlD~SgSm~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~a~~~~~~~~~~~~ 228 (398) |+++|+|.|+||... +....+.++..++..+.. T Consensus 1 Di~fvlD~S~Sm~~~---------------------------------------------~~~~~k~~~~~~i~~~~~-- 33 (177) T pfam00092 1 DIVFLLDGSGSIGEA---------------------------------------------NFEKVKEFIKKLVENLDI-- 33 (177) T ss_pred CEEEEEECCCCCCHH---------------------------------------------HHHHHHHHHHHHHHHHCC-- T ss_conf 989999687998868---------------------------------------------899999999999998365-- Q ss_pred CCCCCCCEEEEEEECCCCCCCCCCCCCCCCHHHHHHHHH--CCCCCCCCCHHHHHHHHHHHHCCCCCCCCCCCCCCCCCE Q ss_conf 687765213465412677654311235689999999973--577888744588999999961134666665667666651 Q gi|254781108|r 229 QEKKNLSVRIGTIAYNIGIVGNQCTPLSNNLNEVKSRLN--KLNPYENTNTYPAMHHAYRELYNEKESSHNTIGSTRLKK 306 (398) Q Consensus 229 ~~~~~~~~~~~~~~~~~~~~~~~~~~lt~~~~~~~~~I~--~l~~~G~T~~~~gl~~a~~~l~~~~~~~~~~~~~~~~~k 306 (398) .....+.+.+.|..............+...+...+. ...++|+|+++.||..+.+.+...... 
.++.+| T Consensus 34 ---~~~~~rv~lv~f~~~~~~~~~l~~~~~~~~~~~~~~~~~~~~~g~t~~~~al~~a~~~~~~~~~~------r~~~~k 104 (177) T pfam00092 34 ---GPDGTRVGLVQYSSDVTTEFSLNDYKSKDDLLSAVLRNIYYLGGGTNTGKALKYALENLFRSAGS------RPNAPK 104 (177) T ss_pred ---CCCCCEEEEEEECCCEEEEECCCCCCCHHHHHHHHHHHCCCCCCCCHHHHHHHHHHHHHHHCCCC------CCCCEE T ss_conf ---88752899999458458996178868999999998643157899565999999999998635478------878726 Q ss_pred EEEECCCCCCCCCCCCCCCHHHHHHHHHHHHCCCEEEEEEECCCCCHHHHHHHHHCC---CCCEEEECCHHHHHHHHHHH Q ss_conf 699815887787777766314899999999789889999954797558999998519---98189946989999999999 Q gi|254781108|r 307 FVIFITDGENSGASAYQNTLNTLQICEYMRNAGMKIYSVAVSAPPEGQDLLRKCTDS---SGQFFAVNDSRELLESFDKI 383 (398) Q Consensus 307 ~iillTDG~~~~~~~~~~~~~~~~~c~~~K~~gi~IytIg~~~~~~~~~~l~~cAs~---~~~yy~a~~~~~L~~aF~~I 383 (398) ++||+|||..+.+.. ........+|+.||+||+||+| + .+...|+.+|+. .+++|.+.+..+|.+++++| T Consensus 105 ~vvllTDG~~~~~~~-----~~~~~~~~~~~~gI~v~~vG~g-~-~~~~~L~~ia~~~~~~~~~~~~~~~~~l~~~~~~i 177 (177) T pfam00092 105 VVILLTDGKSNDGGL-----VPAAAAALRRKVGIIVFGVGVG-D-VDEEELRLIASEPCSEGHVFYVTDFDALSDIQEEL 177 (177) T ss_pred EEEEEECCCCCCCCC-----CHHHHHHHHHHCCCEEEEEECC-C-CCHHHHHHHHCCCCCCCEEEEECCHHHHHHHHHHC T ss_conf 899983698788864-----6999999999789589999747-4-48999999968999898599958989999999619 No 17 >cd01450 vWFA_subfamily_ECM Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. 
There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A Probab=99.66 E-value=6.6e-15 Score=107.81 Aligned_cols=154 Identities=19% Similarity=0.333 Sum_probs=111.2 Q ss_pred CCCEEEECCCCCCCCCCCCCCCCHHHHHCCCEECCCCCCCCEEEECCCCCCEEECCCCCCCCCHHHHHHHHHHHHHHHHC Q ss_conf 43135750556666667666310123310230126887654031013432201125553334100024566554102202 Q gi|254781108|r 148 ISICMVLDVSRSMEDLYLQKHNDNNNMTSNKYLLPPPPKKSFWSKNTTKSKYAPAPAPANRKIDVLIESAGNLVNSIQKA 227 (398) Q Consensus 148 ~di~lvlD~SgSm~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~a~~~~~~~~~~~ 227 (398) +|+++++|.|+||.... ....+.++..+++.+... T Consensus 1 ~DivfvlD~S~Sm~~~~---------------------------------------------~~~~k~~i~~~i~~~~~~ 35 (161) T cd01450 1 LDIVFLLDGSESVGPEN---------------------------------------------FEKVKDFIEKLVEKLDIG 35 (161) T ss_pred CEEEEEEECCCCCCHHH---------------------------------------------HHHHHHHHHHHHHHCCCC T ss_conf 96999997989988589---------------------------------------------999999999999970568 Q ss_pred CCCCCCCCEEEEEEECCCCCCCCCCCCCCCCHHHHHHHHHCCCCC--CCCCHHHHHHHHHHHHCCCCCCCCCCCCCCCCC Q ss_conf 768776521346541267765431123568999999997357788--874458899999996113466666566766665 Q gi|254781108|r 228 IQEKKNLSVRIGTIAYNIGIVGNQCTPLSNNLNEVKSRLNKLNPY--ENTNTYPAMHHAYRELYNEKESSHNTIGSTRLK 305 (398) Q Consensus 228 ~~~~~~~~~~~~~~~~~~~~~~~~~~~lt~~~~~~~~~I~~l~~~--G~T~~~~gl~~a~~~l~~~~~~~~~~~~~~~~~ 305 (398) . ...|.+.+.|+.............+...+...|+.|... 
++|+++.+|.++.+.+......+ .+.+ T Consensus 36 ~-----~~~rv~lv~fs~~~~~~~~l~~~~~~~~l~~~i~~l~~~~~~~t~~~~AL~~~~~~~~~~~~~r------~~~~ 104 (161) T cd01450 36 P-----DKTRVGLVQYSDDVRVEFSLNDYKSKDDLLKAVKNLKYLGGGGTNTGKALQYALEQLFSESNAR------ENVP 104 (161) T ss_pred C-----CCCEEEEEEECCCEEEEECCCCCCCHHHHHHHHHHCCCCCCCCCCHHHHHHHHHHHHHHCCCCC------CCCC T ss_conf 8-----7858999995573168714656466999999998421368998548999999999986144666------6675 Q ss_pred EEEEECCCCCCCCCCCCCCCHHHHHHHHHHHHCCCEEEEEEECCCCCHHHHHHHHHCCCC Q ss_conf 169981588778777776631489999999978988999995479755899999851998 Q gi|254781108|r 306 KFVIFITDGENSGASAYQNTLNTLQICEYMRNAGMKIYSVAVSAPPEGQDLLRKCTDSSG 365 (398) Q Consensus 306 k~iillTDG~~~~~~~~~~~~~~~~~c~~~K~~gi~IytIg~~~~~~~~~~l~~cAs~~~ 365 (398) |++||+|||..+... .....++.+|+.||+||+||+| + .+.+.|+.+|+.|+ T Consensus 105 kvivllTDG~~~~~~------~~~~~a~~lk~~gi~v~~vgiG-~-~~~~~L~~iA~~p~ 156 (161) T cd01450 105 KVIIVLTDGRSDDGG------DPKEAAAKLKDEGIKVFVVGVG-P-ADEEELREIASCPS 156 (161) T ss_pred EEEEEEECCCCCCCC------CHHHHHHHHHHCCCEEEEEEEC-C-CCHHHHHHHHCCCC T ss_conf 499998258878874------7999999999889989999826-4-89999999977994 No 18 >smart00327 VWA von Willebrand factor (vWF) type A domain. VWA domains in extracellular eukaryotic proteins mediate adhesion via metal ion-dependent adhesion sites (MIDAS). Intracellular VWA domains and homologues in prokaryotes have recently been identified. The proposed VWA domains in integrin beta subunits have recently been substantiated using sequence-based methods. 
Probab=99.65 E-value=1.9e-14 Score=105.09 Aligned_cols=160 Identities=21% Similarity=0.278 Sum_probs=116.4 Q ss_pred CCCCEEEECCCCCCCCCCCCCCCCHHHHHCCCEECCCCCCCCEEEECCCCCCEEECCCCCCCCCHHHHHHHHHHHHHHHH Q ss_conf 54313575055666666766631012331023012688765403101343220112555333410002456655410220 Q gi|254781108|r 147 AISICMVLDVSRSMEDLYLQKHNDNNNMTSNKYLLPPPPKKSFWSKNTTKSKYAPAPAPANRKIDVLIESAGNLVNSIQK 226 (398) Q Consensus 147 ~~di~lvlD~SgSm~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~a~~~~~~~~~~ 226 (398) ++|+++++|.|+||... +.+..+.++..++..+.. T Consensus 1 ~~di~~vvD~S~SM~~~---------------------------------------------~~~~~k~~~~~~i~~l~~ 35 (177) T smart00327 1 PLDVVFLLDGSGSMGPN---------------------------------------------RFEKAKEFVLKLVEQLDI 35 (177) T ss_pred CCEEEEEEECCCCCCCH---------------------------------------------HHHHHHHHHHHHHHHHHC T ss_conf 94899999288998828---------------------------------------------999999999999998641 Q ss_pred CCCCCCCCCEEEEEEECCCCCCCCCCCCCCCCHHHHHHHHHCCCC--CCCCCHHHHHHHHHHHHCCCCCCCCCCCCCCCC Q ss_conf 276877652134654126776543112356899999999735778--887445889999999611346666656676666 Q gi|254781108|r 227 AIQEKKNLSVRIGTIAYNIGIVGNQCTPLSNNLNEVKSRLNKLNP--YENTNTYPAMHHAYRELYNEKESSHNTIGSTRL 304 (398) Q Consensus 227 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~lt~~~~~~~~~I~~l~~--~G~T~~~~gl~~a~~~l~~~~~~~~~~~~~~~~ 304 (398) .. ...+.+.+.|+.............+...+...|..+.+ +|+|++..+|.++++.+........ .+. 
T Consensus 36 ~~-----~~~~v~vv~f~~~~~~~~~~~~~~~~~~~~~~i~~l~~~~~g~t~~~~al~~a~~~~~~~~~~~~-----~~~ 105 (177) T smart00327 36 GP-----DGDRVGLVTFSDDATVLFPLNDSRSKDALLEALASLSYKLGGGTNLGAALQYALENLFSKSAGSR-----RGA 105 (177) T ss_pred CC-----CCCEEEEEEECCCEEEEECCCCCCCHHHHHHHHHHCCCCCCCCCCCHHHHHHHHHHHHHHHCCCC-----CCC T ss_conf 79-----98789999963726899768886899999999971415578877642899999999976650377-----887 Q ss_pred CEEEEECCCCCCCCCCCCCCCHHHHHHHHHHHHCCCEEEEEEECCCCCHHHHHHHHHCCCCCEE Q ss_conf 5169981588778777776631489999999978988999995479755899999851998189 Q gi|254781108|r 305 KKFVIFITDGENSGASAYQNTLNTLQICEYMRNAGMKIYSVAVSAPPEGQDLLRKCTDSSGQFF 368 (398) Q Consensus 305 ~k~iillTDG~~~~~~~~~~~~~~~~~c~~~K~~gi~IytIg~~~~~~~~~~l~~cAs~~~~yy 368 (398) +|++||+|||..+.. .........+|+.||.||+||+|.+ .+..+|+..|+.++..| T Consensus 106 ~~~iil~TDG~~~~~------~~~~~~~~~~~~~~v~i~~ig~g~~-~~~~~l~~ia~~~~~~~ 162 (177) T smart00327 106 PKVLILITDGESNDG------GDLLKAAKELKRSGVKVFVVGVGND-VDEEELKKLASAPGGVY 162 (177) T ss_pred CEEEEEEECCCCCCC------HHHHHHHHHHHHCCCEEEEEEECCC-CCHHHHHHHHHCCCCEE T ss_conf 428999805887872------5299999999867948999995884-79999999984899659 No 19 >cd01481 vWA_collagen_alpha3-VI-like VWA_collagen alpha 3(VI) like: The extracellular matrix represents a complex alloy of variable members of diverse protein families defining structural integrity and various physiological functions. The most abundant family is the collagens with more than 20 different collagen types identified thus far. Collagens are centrally involved in the formation of fibrillar and microfibrillar networks of the extracellular matrix, basement membranes as well as other structures of the extracellular matrix. Some collagens have about 15-18 vWA domains in them. The VWA domains present in these collagens mediate protein-protein interactions. 
Probab=99.65 E-value=1.4e-14 Score=105.78 Aligned_cols=162 Identities=14% Similarity=0.213 Sum_probs=121.3 Q ss_pred CCCEEEECCCCCCCCCCCCCCCCHHHHHCCCEECCCCCCCCEEEECCCCCCEEECCCCCCCCCHHHHHHHHHHHHHHHHC Q ss_conf 43135750556666667666310123310230126887654031013432201125553334100024566554102202 Q gi|254781108|r 148 ISICMVLDVSRSMEDLYLQKHNDNNNMTSNKYLLPPPPKKSFWSKNTTKSKYAPAPAPANRKIDVLIESAGNLVNSIQKA 227 (398) Q Consensus 148 ~di~lvlD~SgSm~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~a~~~~~~~~~~~ 227 (398) .|+++++|-|+|+.... ....+..+..+++.+.- T Consensus 1 ~DlvFllD~S~si~~~~---------------------------------------------F~~~k~Fv~~lv~~f~i- 34 (165) T cd01481 1 KDIVFLIDGSDNVGSGN---------------------------------------------FPAIRDFIERIVQSLDV- 34 (165) T ss_pred CCEEEEEECCCCCCHHH---------------------------------------------HHHHHHHHHHHHHHHCC- T ss_conf 97899996889989899---------------------------------------------99999999999996046- Q ss_pred CCCCCCCCEEEEEEECCCCCCCCCCCCCCCCHHHHHHHHHCCCCCC--CCCHHHHHHHHHHHHCCCCCCCCCCCCCCCCC Q ss_conf 7687765213465412677654311235689999999973577888--74458899999996113466666566766665 Q gi|254781108|r 228 IQEKKNLSVRIGTIAYNIGIVGNQCTPLSNNLNEVKSRLNKLNPYE--NTNTYPAMHHAYRELYNEKESSHNTIGSTRLK 305 (398) Q Consensus 228 ~~~~~~~~~~~~~~~~~~~~~~~~~~~lt~~~~~~~~~I~~l~~~G--~T~~~~gl~~a~~~l~~~~~~~~~~~~~~~~~ 305 (398) ....+|.+.+.|+.............+...+..+|+.+.+.| +|+++.+|.++.+.++....+... 
.+..+ T Consensus 35 ----~~~~trVgvi~ys~~~~~~f~l~~~~~~~~l~~~I~~i~~~~g~~t~tg~AL~~a~~~~f~~~~g~R~---r~~v~ 107 (165) T cd01481 35 ----GPDKIRVAVVQFSDTPRPEFYLNTHSTKADVLGAVRRLRLRGGSQLNTGSALDYVVKNLFTKSAGSRI---EEGVP 107 (165) T ss_pred ----CCCCEEEEEEEECCCEEEEEECCCCCCHHHHHHHHHHHHCCCCCCEEHHHHHHHHHHHHCCCCCCCCC---CCCCC T ss_conf ----88862788999868647999767768999999999841045898436999999999971675678875---57998 Q ss_pred EEEEECCCCCCCCCCCCCCCHHHHHHHHHHHHCCCEEEEEEECCCCCHHHHHHHHHCCCCCEEEECC Q ss_conf 1699815887787777766314899999999789889999954797558999998519981899469 Q gi|254781108|r 306 KFVIFITDGENSGASAYQNTLNTLQICEYMRNAGMKIYSVAVSAPPEGQDLLRKCTDSSGQFFAVND 372 (398) Q Consensus 306 k~iillTDG~~~~~~~~~~~~~~~~~c~~~K~~gi~IytIg~~~~~~~~~~l~~cAs~~~~yy~a~~ 372 (398) |++|++|||..+. .....+..+|+.||+||+||++ + .+...|+.+|++|.+.|.+++ T Consensus 108 kvlvviTdG~s~d--------~~~~~a~~lr~~gV~i~aVGvg-~-~~~~eL~~IAs~p~~vf~~~~ 164 (165) T cd01481 108 QFLVLITGGKSQD--------DVERPAVALKRAGIVPFAIGAR-N-ADLAELQQIAFDPSFVFQVSD 164 (165) T ss_pred EEEEEEECCCCCC--------HHHHHHHHHHHCCCEEEEEECC-C-CCHHHHHHHHCCCCCEEECCC T ss_conf 6999984898853--------7899999999889789999689-7-999999998589877697389 No 20 >TIGR00868 hCaCC calcium-activated chloride channel protein 1; InterPro: IPR004727 This entry represents a family of Ca(2+)-regulated chloride channels (CLCA) which includes bovine, murine and human proteins , . Each CLCA exhibits a distinct, often overlapping, tissue expression pattern. With the exception of the truncated, secreted protein hCLCA3 , they are synthesized as an approximately 125 kDa precursor transmembrane glycoprotein that is rapidly cleaved into 90 and 35 kDa subunits. The human proteins have been shown to affect a large number of cell functions including chloride conductance, epithelial secretion, cell-cell adhesion, apoptosis, cell cycle control, mucus production in asthma, and blood pressure. 
The CLCA proteins expressed on the luminal surface of lung vascular endothelia (bCLCA2; mCLCA1; hCLCA2) serve as adhesion molecules for lung metastatic cancer cells, mediating vascular arrest and lung colonization. Expression of hCLCA2 in normal mammary epithelium is consistently lost in human breast cancer and in all tumorigenic breast cancer cell lines. Re-expression of hCLCA2 in human breast cancer cells abrogates tumorigenicity in nude mice, implying that hCLCA2 acts as a tumour suppressor in breast cancer.. Probab=99.65 E-value=2.1e-15 Score=110.83 Aligned_cols=176 Identities=19% Similarity=0.348 Sum_probs=110.8 Q ss_pred CCCCEEEECCCCCCCCCCCCCCCCHHHHHCCCEECCCCCCCCEEEECCCCCCEEECCCCCCCCCHHHHHHHHHHHHHHHH Q ss_conf 54313575055666666766631012331023012688765403101343220112555333410002456655410220 Q gi|254781108|r 147 AISICMVLDVSRSMEDLYLQKHNDNNNMTSNKYLLPPPPKKSFWSKNTTKSKYAPAPAPANRKIDVLIESAGNLVNSIQK 226 (398) Q Consensus 147 ~~di~lvlD~SgSm~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~a~~~~~~~~~~ 226 (398) -.-++||||.||||....+. ...++|...++-+.-+ T Consensus 307 qRiVCLVLDKSGSM~~~dRL--------------------------------------------~RmNQAa~lFL~Q~vE 342 (874) T TIGR00868 307 QRIVCLVLDKSGSMTKEDRL--------------------------------------------KRMNQAAKLFLLQIVE 342 (874) T ss_pred CEEEEEEECCCCCCCCCCHH--------------------------------------------HHHHHHHHHHEEEEEE T ss_conf 75899986344337988533--------------------------------------------4555566430123554 Q ss_pred CCCCCCCCCEEEEEEECCCCCCCCCCCCCCCCHHHHHHHHHCC--CCCCCCCHHHHHHHHHHHHCCCCCCCCCCCCCCCC Q ss_conf 2768776521346541267765431123568999999997357--78887445889999999611346666656676666 Q gi|254781108|r 227 AIQEKKNLSVRIGTIAYNIGIVGNQCTPLSNNLNEVKSRLNKL--NPYENTNTYPAMHHAYRELYNEKESSHNTIGSTRL 304 (398) Q Consensus 227 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~lt~~~~~~~~~I~~l--~~~G~T~~~~gl~~a~~~l~~~~~~~~~~~~~~~~ 304 (398) . -.-.|.+.|+........+---.+-.....-...| .|.|||+|=.||+.|++.+....+..... 
T Consensus 343 ~-------gs~VGmV~FDS~A~i~n~L~~I~s~~~~~~l~a~LP~~a~GGTSIC~Gl~~aFq~I~~~~~~t~GS------ 409 (874) T TIGR00868 343 K-------GSWVGMVTFDSAAEIKNELIKITSSDERDALTANLPTEASGGTSICSGLKAAFQVIKKSDQSTDGS------ 409 (874) T ss_pred C-------CCEEEEEECCCEEEEEEEEEEECCHHHHHHHHHHCCCCCCCCCHHHHHHHHHHHHHHHCCCCCCCC------ T ss_conf 1-------526776630644576542077526689989987077878768036566766654333126666753------ Q ss_pred CEEEEECCCCCCCCCCCCCCCHHHHHHH-HHHHHCCCEEEEEEECCCCCHHHH--HHHHHCCCCCEEEECCH---HHHHH Q ss_conf 5169981588778777776631489999-999978988999995479755899--99985199818994698---99999 Q gi|254781108|r 305 KKFVIFITDGENSGASAYQNTLNTLQIC-EYMRNAGMKIYSVAVSAPPEGQDL--LRKCTDSSGQFFAVNDS---RELLE 378 (398) Q Consensus 305 ~k~iillTDG~~~~~~~~~~~~~~~~~c-~~~K~~gi~IytIg~~~~~~~~~~--l~~cAs~~~~yy~a~~~---~~L~~ 378 (398) -||||||||+|.- ..| ++-|..|+.||||.+| ++.++++ |..+ ++|+-|+|++. ..|.+ T Consensus 410 --Ei~LLTDGEDN~i----------~sC~~eVkqsGaIiHtiALG-psAa~ele~lS~m--TGG~~fYa~D~~~~NgLid 474 (874) T TIGR00868 410 --EIVLLTDGEDNTI----------SSCIEEVKQSGAIIHTIALG-PSAAKELEELSDM--TGGLRFYASDEADNNGLID 474 (874) T ss_pred --EEEEEECCCCCCE----------EECHHHHHCCCEEEEEEECC-HHHHHHHHHHHHH--CCCCEEEEECHHHCCCHHH T ss_conf --6998306875762----------31305541098089985078-4589999998733--3871133413333141454 Q ss_pred HHHHHHH---HHHHCEEEE Q ss_conf 9999999---875124573 Q gi|254781108|r 379 SFDKITD---KIQEQSVRI 394 (398) Q Consensus 379 aF~~I~~---~i~~~r~~~ 394 (398) ||..|.. .|+++.||| T Consensus 475 AFg~lsS~~~~~sQ~~lQL 493 (874) T TIGR00868 475 AFGALSSGNGSVSQQSLQL 493 (874) T ss_pred HHHHHCCCCHHHHHHHHHH T ss_conf 6642214761255555555 No 21 >cd01471 vWA_micronemal_protein Micronemal proteins: The Toxoplasma lytic cycle begins when the parasite actively invades a target cell. In association with invasion, T. gondii sequentially discharges three sets of secretory organelles beginning with the micronemes, which contain adhesive proteins involved in parasite attachment to a host cell. 
Deployed as protein complexes, several micronemal proteins possess vertebrate-derived adhesive sequences that function in binding receptors. The VWA domain likely mediates the protein-protein interactions of these with their interacting partners. Probab=99.64 E-value=4e-14 Score=103.10 Aligned_cols=172 Identities=16% Similarity=0.200 Sum_probs=117.5 Q ss_pred CCCEEEECCCCCCCCCCCCCCCCHHHHHCCCEECCCCCCCCEEEECCCCCCEEECCCCCCCCCHHHHHHHHHHHHHHHHC Q ss_conf 43135750556666667666310123310230126887654031013432201125553334100024566554102202 Q gi|254781108|r 148 ISICMVLDVSRSMEDLYLQKHNDNNNMTSNKYLLPPPPKKSFWSKNTTKSKYAPAPAPANRKIDVLIESAGNLVNSIQKA 227 (398) Q Consensus 148 ~di~lvlD~SgSm~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~a~~~~~~~~~~~ 227 (398) +|+++++|-|+|++.... ....+..+..+++.+. T Consensus 1 lDlvFllD~S~SVg~~n~--------------------------------------------f~~~k~F~~~lv~~f~-- 34 (186) T cd01471 1 LDLYLLVDGSGSIGYSNW--------------------------------------------VTHVVPFLHTFVQNLN-- 34 (186) T ss_pred CEEEEEEECCCCCCCCCH--------------------------------------------HHHHHHHHHHHHHHCC-- T ss_conf 909999948898886131--------------------------------------------8999999999999749-- Q ss_pred CCCCCCCCEEEEEEECCCCCCCCCCCC--CCCCHHH---HHHHHHCCC-CCCCCCHHHHHHHHHHHHCCCCCCCCCCCCC Q ss_conf 768776521346541267765431123--5689999---999973577-8887445889999999611346666656676 Q gi|254781108|r 228 IQEKKNLSVRIGTIAYNIGIVGNQCTP--LSNNLNE---VKSRLNKLN-PYENTNTYPAMHHAYRELYNEKESSHNTIGS 301 (398) Q Consensus 228 ~~~~~~~~~~~~~~~~~~~~~~~~~~~--lt~~~~~---~~~~I~~l~-~~G~T~~~~gl~~a~~~l~~~~~~~~~~~~~ 301 (398) .+...++.+.+.|+.......... .+.++.. +...|..+. 
.+|+|+++.+|..+.+.++.....+ T Consensus 35 ---I~~~~~rVgvv~ys~~~~~~~~l~~~~~~~~~~~~~~~~~i~~~~y~gg~T~Tg~AL~~a~~~~f~~~g~R------ 105 (186) T cd01471 35 ---ISPDEINLYLVTFSTNAKELIRLSSPNSTNKDLALNAIRALLSLYYPNGSTNTTSALLVVEKHLFDTRGNR------ 105 (186) T ss_pred ---CCCCCEEEEEEEECCCCEEEEECCCCCCCCHHHHHHHHHHHHHCCCCCCCCHHHHHHHHHHHHHCCCCCCC------ T ss_conf ---69884499999954870599875775544656799999999837778996779999999999721146889------ Q ss_pred CCCCEEEEECCCCCCCCCCCCCCCHHHHHHHHHHHHCCCEEEEEEECCCCCHHHHHHHHHCCCCC-----EEEECCHHHH Q ss_conf 66651699815887787777766314899999999789889999954797558999998519981-----8994698999 Q gi|254781108|r 302 TRLKKFVIFITDGENSGASAYQNTLNTLQICEYMRNAGMKIYSVAVSAPPEGQDLLRKCTDSSGQ-----FFAVNDSREL 376 (398) Q Consensus 302 ~~~~k~iillTDG~~~~~~~~~~~~~~~~~c~~~K~~gi~IytIg~~~~~~~~~~l~~cAs~~~~-----yy~a~~~~~L 376 (398) ++.+|++||+|||..+.. ......+..+|++||+||+||+|. .-+.+.|+.+|+.+.. .|..++-++| T Consensus 106 ~~vpkv~illTDG~s~d~------~~~~~~a~~Lr~~GV~ifavGVG~-~v~~~eL~~Iag~~~~~~~c~~~~~~~~~~l 178 (186) T cd01471 106 ENAPQLVIIMTDGIPDSK------FRTLKEARKLRERGVIIAVLGVGQ-GVNHEENRSLVGCDPDDSPCPLYLQSSWSEV 178 (186) T ss_pred CCCCEEEEEEECCCCCCC------CHHHHHHHHHHHCCCEEEEEECCC-CCCHHHHHHHCCCCCCCCCCCEEEECCHHHH T ss_conf 999859999906987785------258999999998899999998343-2499999997099988899865751788888 Q ss_pred HHHHH Q ss_conf 99999 Q gi|254781108|r 377 LESFD 381 (398) Q Consensus 377 ~~aF~ 381 (398) .++-+ T Consensus 179 ~~~~~ 183 (186) T cd01471 179 QNVIK 183 (186) T ss_pred HHHHH T ss_conf 74775 No 22 >cd01464 vWA_subfamily VWA subfamily: Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. 
These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if Probab=99.61 E-value=2.7e-14 Score=104.11 Aligned_cols=173 Identities=20% Similarity=0.242 Sum_probs=115.9 Q ss_pred CCCCEEEECCCCCCCCCCCCCCCCHHHHHCCCEECCCCCCCCEEEECCCCCCEEECCCCCCCCCHHHHHHHHHHHHHHHH Q ss_conf 54313575055666666766631012331023012688765403101343220112555333410002456655410220 Q gi|254781108|r 147 AISICMVLDVSRSMEDLYLQKHNDNNNMTSNKYLLPPPPKKSFWSKNTTKSKYAPAPAPANRKIDVLIESAGNLVNSIQK 226 (398) Q Consensus 147 ~~di~lvlD~SgSm~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~a~~~~~~~~~~ 226 (398) .+++++|+|.||||... +++.++.++..+...+.. T Consensus 3 rlpvvlvlD~SGSM~G~---------------------------------------------~i~~~k~al~~~~~~L~~ 37 (176) T cd01464 3 RLPIYLLLDTSGSMAGE---------------------------------------------PIEALNQGLQMLQSELRQ 37 (176) T ss_pred CCCEEEEEECCCCCCCH---------------------------------------------HHHHHHHHHHHHHHHHHC T ss_conf 35789999789999984---------------------------------------------799999999999999711 Q ss_pred CCCCCCCCCEEEEEEECCCCCCCCCCCCCCCCHHHHHHHHHCCCCCCCCCHHHHHHHHHHHHCCCCCCCCCCCCCCCCCE Q ss_conf 27687765213465412677654311235689999999973577888744588999999961134666665667666651 Q gi|254781108|r 227 AIQEKKNLSVRIGTIAYNIGIVGNQCTPLSNNLNEVKSRLNKLNPYENTNTYPAMHHAYRELYNEKESSHNTIGSTRLKK 306 (398) Q Consensus 227 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~lt~~~~~~~~~I~~l~~~G~T~~~~gl~~a~~~l~~~~~~~~~~~~~~~~~k 306 (398) .+. ....++..++.|++... .+.|++. .. ...+..|.++|+|+++.||..+.+.|........ .....+++. 
T Consensus 38 d~~--a~~~~~vsVItF~s~a~--~~~pl~~-~~--~~~~~~L~a~G~T~~g~al~~a~~~l~~~~~~~~-~~~~~~~~P 109 (176) T cd01464 38 DPY--ALESVEISVITFDSAAR--VIVPLTP-LE--SFQPPRLTASGGTSMGAALELALDCIDRRVQRYR-ADQKGDWRP 109 (176) T ss_pred CCC--CHHEEEEEEEEECCCEE--EECCCCC-HH--HCCCCCCCCCCCCHHHHHHHHHHHHHHHHHHHCC-CCCCCCCCE T ss_conf 831--01132699999789517--8058634-76--6475547778998199999999999998652236-556677531 Q ss_pred EEEECCCCCCCCCCCCCCCHHHHHHHHHHHHCCCEEEEEEECCCCCHHHHHHHHHCCCCCEEEECCHHHHHHHHH Q ss_conf 699815887787777766314899999999789889999954797558999998519981899469899999999 Q gi|254781108|r 307 FVIFITDGENSGASAYQNTLNTLQICEYMRNAGMKIYSVAVSAPPEGQDLLRKCTDSSGQFFAVNDSRELLESFD 381 (398) Q Consensus 307 ~iillTDG~~~~~~~~~~~~~~~~~c~~~K~~gi~IytIg~~~~~~~~~~l~~cAs~~~~yy~a~~~~~L~~aF~ 381 (398) +|||||||+.+.. + ......+...+..++.|++||+|.+ .+.++|+..+.. -+...+..+..+.|+ T Consensus 110 ~I~LlTDG~PtD~-~----~~~~~~~~~~~~~~~~i~a~giG~d-ad~~~L~~is~~---~~~~~~~~~f~~ff~ 175 (176) T cd01464 110 WVFLLTDGEPTDD-L----TAAIERIKEARDSKGRIVACAVGPK-ADLDTLKQITEG---VPLLDDALSGLNFFK 175 (176) T ss_pred EEEEECCCCCCCC-H----HHHHHHHHHHHHCCCEEEEEEEECC-CCHHHHHHHHCC---CCCCCCHHHHHHHHC T ss_conf 7999668998875-8----9999999988863976999997387-189999988577---745345345888508 No 23 >cd01482 vWA_collagen_alphaI-XII-like Collagen: The extracellular matrix represents a complex alloy of variable members of diverse protein families defining structural integrity and various physiological functions. The most abundant family is the collagens with more than 20 different collagen types identified thus far. Collagens are centrally involved in the formation of fibrillar and microfibrillar networks of the extracellular matrix, basement membranes as well as other structures of the extracellular matrix. Some collagens have about 15-18 vWA domains in them. The VWA domains present in these collagens mediate protein-protein interactions. 
Probab=99.60 E-value=8e-14 Score=101.32 Aligned_cols=159 Identities=19% Similarity=0.272 Sum_probs=114.4 Q ss_pred CCEEEECCCCCCCCCCCCCCCCHHHHHCCCEECCCCCCCCEEEECCCCCCEEECCCCCCCCCHHHHHHHHHHHHHHHHCC Q ss_conf 31357505566666676663101233102301268876540310134322011255533341000245665541022027 Q gi|254781108|r 149 SICMVLDVSRSMEDLYLQKHNDNNNMTSNKYLLPPPPKKSFWSKNTTKSKYAPAPAPANRKIDVLIESAGNLVNSIQKAI 228 (398) Q Consensus 149 di~lvlD~SgSm~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~a~~~~~~~~~~~~ 228 (398) |+++++|-|+|+.... ....+..+..+++.+.- T Consensus 2 DlvfllD~S~Si~~~~---------------------------------------------f~~~k~fi~~lv~~f~i-- 34 (164) T cd01482 2 DIVFLVDGSWSIGRSN---------------------------------------------FNLVRSFLSSVVEAFEI-- 34 (164) T ss_pred CEEEEEECCCCCCHHH---------------------------------------------HHHHHHHHHHHHHHCCC-- T ss_conf 7999996989988899---------------------------------------------99999999999996476-- Q ss_pred CCCCCCCEEEEEEECCCCCCCCCCCCCCCCHHHHHHHHHCCCC-CCCCCHHHHHHHHHHHHCCCCCCCCCCCCCCCCCEE Q ss_conf 6877652134654126776543112356899999999735778-887445889999999611346666656676666516 Q gi|254781108|r 229 QEKKNLSVRIGTIAYNIGIVGNQCTPLSNNLNEVKSRLNKLNP-YENTNTYPAMHHAYRELYNEKESSHNTIGSTRLKKF 307 (398) Q Consensus 229 ~~~~~~~~~~~~~~~~~~~~~~~~~~lt~~~~~~~~~I~~l~~-~G~T~~~~gl~~a~~~l~~~~~~~~~~~~~~~~~k~ 307 (398) .....|.+.+.|..............+...+..+|+.+.. +|+|+++.+|.++.+.++....+.. 
++.+|+ T Consensus 35 ---~~~~~rvgvv~ys~~~~~~~~l~~~~~~~~l~~~i~~i~~~~g~t~~~~AL~~~~~~~f~~~~g~R-----~~~~kv 106 (164) T cd01482 35 ---GPDGVQVGLVQYSDDPRTEFDLNAYTSKEDVLAAIKNLPYKGGNTRTGKALTHVREKNFTPDAGAR-----PGVPKV 106 (164) T ss_pred ---CCCCEEEEEEEECCCCEEEECCCCCCCHHHHHHHHHHCCCCCCCCCHHHHHHHHHHHHHCHHCCCC-----CCCCEE T ss_conf ---888628999994475127873434699899999986402668997289999999998615002898-----888607 Q ss_pred EEECCCCCCCCCCCCCCCHHHHHHHHHHHHCCCEEEEEEECCCCCHHHHHHHHHCCC--CCEEEECC Q ss_conf 998158877877777663148999999997898899999547975589999985199--81899469 Q gi|254781108|r 308 VIFITDGENSGASAYQNTLNTLQICEYMRNAGMKIYSVAVSAPPEGQDLLRKCTDSS--GQFFAVND 372 (398) Q Consensus 308 iillTDG~~~~~~~~~~~~~~~~~c~~~K~~gi~IytIg~~~~~~~~~~l~~cAs~~--~~yy~a~~ 372 (398) +|++|||..+. .....++.+|++||+||+||++ + .+...|+.+||.| .|.|.+++ T Consensus 107 lvliTDG~s~d--------~~~~~a~~lr~~gv~i~~VGVg-~-~~~~eL~~IAs~P~~~hvf~~~~ 163 (164) T cd01482 107 VILITDGKSQD--------DVELPARVLRNLGVNVFAVGVK-D-ADESELKMIASKPSETHVFNVAD 163 (164) T ss_pred EEEECCCCCCC--------HHHHHHHHHHHCCCEEEEEECC-C-CCHHHHHHHHCCCCHHCEEECCC T ss_conf 99960798843--------3899999999889389999788-3-78999999968985661797479 No 24 >cd01476 VWA_integrin_invertebrates VWA_integrin (invertebrates): Integrins are a family of cell surface receptors that have diverse functions in cell-cell and cell-extracellular matrix interactions. Because of their involvement in many biologically important adhesion processes, integrins are conserved across a wide range of multicellular animals. Integrins from invertebrates have been identified from six phyla. There are no data to date to suggest any immunological functions for the invertebrate integrins. The members of this sub-group have the conserved MIDAS motif that is characteristic of this domain, suggesting the involvement of the integrins in the recognition and binding of multi-ligands. 
Probab=99.60 E-value=6.7e-14 Score=101.79 Aligned_cols=157 Identities=18% Similarity=0.258 Sum_probs=112.9 Q ss_pred CCCEEEECCCCCCCCCCCCCCCCHHHHHCCCEECCCCCCCCEEEECCCCCCEEECCCCCCCCCHHHHHHHHHHHHHHHHC Q ss_conf 43135750556666667666310123310230126887654031013432201125553334100024566554102202 Q gi|254781108|r 148 ISICMVLDVSRSMEDLYLQKHNDNNNMTSNKYLLPPPPKKSFWSKNTTKSKYAPAPAPANRKIDVLIESAGNLVNSIQKA 227 (398) Q Consensus 148 ~di~lvlD~SgSm~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~a~~~~~~~~~~~ 227 (398) +|++++||-|+|+... + ...+..+..+++.+.- T Consensus 1 lDl~fllD~S~Sv~~~--f--------------------------------------------~~~k~F~~~lv~~f~i- 33 (163) T cd01476 1 LDLLFVLDSSGSVRGK--F--------------------------------------------EKYKKYIERIVEGLEI- 33 (163) T ss_pred CCEEEEEECCCCHHHH--H--------------------------------------------HHHHHHHHHHHHHHCC- T ss_conf 9299999188886673--9--------------------------------------------9999999999996146- Q ss_pred CCCCCCCCEEEEEEECCCCCCCCCC--CCCCCCHHHHHHHHHCCCC-CCCCCHHHHHHHHHHHHCCCCCCCCCCCCCCCC Q ss_conf 7687765213465412677654311--2356899999999735778-887445889999999611346666656676666 Q gi|254781108|r 228 IQEKKNLSVRIGTIAYNIGIVGNQC--TPLSNNLNEVKSRLNKLNP-YENTNTYPAMHHAYRELYNEKESSHNTIGSTRL 304 (398) Q Consensus 228 ~~~~~~~~~~~~~~~~~~~~~~~~~--~~lt~~~~~~~~~I~~l~~-~G~T~~~~gl~~a~~~l~~~~~~~~~~~~~~~~ 304 (398) ....+|.+.+.|+........ ..-..++..+..+|+.+.. +|+|+++.+|..+.+.+.+....+ ++. 
T Consensus 34 ----~~~~~rVgvv~ys~~~~~~i~f~l~~~~~~~~l~~~I~~i~~~~g~T~tg~AL~~a~~~~~~~~g~R------~~~ 103 (163) T cd01476 34 ----GPTATRVALITYSGRGRQRVRFNLPKHNDGEELLEKVDNLRFIGGTTATGAAIEVALQQLDPSEGRR------EGI 103 (163) T ss_pred ----CCCCEEEEEEEECCCCCEEEEECCCCCCCHHHHHHHHHHEECCCCCCCHHHHHHHHHHHHHHHCCCC------CCC T ss_conf ----8885389999966987078887577779999999999752036898548999999999721420678------996 Q ss_pred CEEEEECCCCCCCCCCCCCCCHHHHHHHHHHHH-CCCEEEEEEECCC-CCHHHHHHHHHCCCCCEEE Q ss_conf 516998158877877777663148999999997-8988999995479-7558999998519981899 Q gi|254781108|r 305 KKFVIFITDGENSGASAYQNTLNTLQICEYMRN-AGMKIYSVAVSAP-PEGQDLLRKCTDSSGQFFA 369 (398) Q Consensus 305 ~k~iillTDG~~~~~~~~~~~~~~~~~c~~~K~-~gi~IytIg~~~~-~~~~~~l~~cAs~~~~yy~ 369 (398) +|++||+|||..+.+ ....+..+|+ .||+||.||+|.. ..+...|..+|++|+|.|. T Consensus 104 ~kv~vviTDG~s~d~--------~~~~a~~lr~~~gv~v~avgVG~~~~~d~~eL~~Ia~~~~~Vft 162 (163) T cd01476 104 PKVVVVLTDGRSHDD--------PEKQARILRAVPNIETFAVGTGDPGTVDTEELHSITGNEDHIFT 162 (163) T ss_pred EEEEEEEECCCCCCC--------HHHHHHHHHHHCCCEEEEEEECCCCCCCHHHHHHHCCCCCCCCC T ss_conf 169999818987664--------88999999970998999998388650159999986499725457 No 25 >cd00198 vWFA Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. 
Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains. Probab=99.55 E-value=3.7e-13 Score=97.33 Aligned_cols=153 Identities=22% Similarity=0.308 Sum_probs=109.2 Q ss_pred CCCEEEECCCCCCCCCCCCCCCCHHHHHCCCEECCCCCCCCEEEECCCCCCEEECCCCCCCCCHHHHHHHHHHHHHHHHC Q ss_conf 43135750556666667666310123310230126887654031013432201125553334100024566554102202 Q gi|254781108|r 148 ISICMVLDVSRSMEDLYLQKHNDNNNMTSNKYLLPPPPKKSFWSKNTTKSKYAPAPAPANRKIDVLIESAGNLVNSIQKA 227 (398) Q Consensus 148 ~di~lvlD~SgSm~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~a~~~~~~~~~~~ 227 (398) .++++|+|.|+||... +.+..+.++..++..+... T Consensus 1 ~div~vlD~S~Sm~~~---------------------------------------------~~~~~k~~~~~~~~~l~~~ 35 (161) T cd00198 1 ADIVFLLDVSGSMGGE---------------------------------------------KLDKAKEALKALVSSLSAS 35 (161) T ss_pred CEEEEEEECCCCCCCH---------------------------------------------HHHHHHHHHHHHHHHHHHC T ss_conf 9099999188998807---------------------------------------------9999999999999987655 Q ss_pred CCCCCCCCEEEEEEECCCCCCCCCCCCCCCCHHHHHHHHHCCC--CCCCCCHHHHHHHHHHHHCCCCCCCCCCCCCCCCC Q ss_conf 7687765213465412677654311235689999999973577--88874458899999996113466666566766665 Q gi|254781108|r 228 IQEKKNLSVRIGTIAYNIGIVGNQCTPLSNNLNEVKSRLNKLN--PYENTNTYPAMHHAYRELYNEKESSHNTIGSTRLK 305 (398) Q Consensus 228 ~~~~~~~~~~~~~~~~~~~~~~~~~~~lt~~~~~~~~~I~~l~--~~G~T~~~~gl~~a~~~l~~~~~~~~~~~~~~~~~ 305 (398) ....+.+.+.|..............+...+...++.+. +.|+|+++.||..+.+.+.... 
....+ T Consensus 36 -----~~~~~v~vv~f~~~~~~~~~~~~~~~~~~~~~~i~~~~~~~~g~t~~~~al~~a~~~~~~~~--------~~~~~ 102 (161) T cd00198 36 -----PPGDRVGLVTFGSNARVVLPLTTDTDKADLLEAIDALKKGLGGGTNIGAALRLALELLKSAK--------RPNAR 102 (161) T ss_pred -----CCCCEEEEEEECCCEEEEECCCCHHHHHHHHHHHHHCCCCCCCCCHHHHHHHHHHHHHHHHC--------CCCCC T ss_conf -----99988999993795148814741257999999775135689998389999999999987532--------55565 Q ss_pred EEEEECCCCCCCCCCCCCCCHHHHHHHHHHHHCCCEEEEEEECCCCCHHHHHHHHHCCC Q ss_conf 16998158877877777663148999999997898899999547975589999985199 Q gi|254781108|r 306 KFVIFITDGENSGASAYQNTLNTLQICEYMRNAGMKIYSVAVSAPPEGQDLLRKCTDSS 364 (398) Q Consensus 306 k~iillTDG~~~~~~~~~~~~~~~~~c~~~K~~gi~IytIg~~~~~~~~~~l~~cAs~~ 364 (398) |++||+|||..+... .......+.+|+.||.||+||+|. ......|+..++.+ T Consensus 103 ~~iiliTDG~~~~~~-----~~~~~~~~~~~~~~v~i~~igig~-~~~~~~l~~ia~~~ 155 (161) T cd00198 103 RVIILLTDGEPNDGP-----ELLAEAARELRKLGITVYTIGIGD-DANEDELKEIADKT 155 (161) T ss_pred EEEEEECCCCCCCCH-----HHHHHHHHHHHHCCCEEEEEEECH-HHCHHHHHHHHHCC T ss_conf 179996789989873-----679999999997799899999661-11999999998383 No 26 >TIGR03436 acidobact_VWFA VWFA-related Acidobacterial domain. Members of this family are bacterial domains that include a region related to the von Willebrand factor type A (VWFA) domain (pfam00092). These domains are restricted to, and have undergone a large paralogous family expansion in, the Acidobacteria, including Solibacter usitatus and Acidobacterium capsulatum ATCC 51196. Probab=99.48 E-value=8e-12 Score=89.33 Aligned_cols=181 Identities=21% Similarity=0.319 Sum_probs=121.0 Q ss_pred CCCCCCEEEECCCCCCCCCCCCCCCCHHHHHCCCEECCCCCCCCEEEECCCCCCEEECCCCCCCCCHHHHHHHHHHHHHH Q ss_conf 56543135750556666667666310123310230126887654031013432201125553334100024566554102 Q gi|254781108|r 145 NLAISICMVLDVSRSMEDLYLQKHNDNNNMTSNKYLLPPPPKKSFWSKNTTKSKYAPAPAPANRKIDVLIESAGNLVNSI 224 (398) Q Consensus 145 ~~~~di~lvlD~SgSm~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~a~~~~~~~~ 224 (398) ..|+.+.+++|.|+||.... 
...+.+...++... T Consensus 51 ~~P~sv~l~~D~S~s~~~~~----------------------------------------------~~~~~a~~~fl~~~ 84 (296) T TIGR03436 51 DLPLTVGLVIDTSGSMFNDL----------------------------------------------ARARAAAIRFLKTV 84 (296) T ss_pred CCCCEEEEEEECCCCCHHHH----------------------------------------------HHHHHHHHHHHHHH T ss_conf 89846999997899914539----------------------------------------------99999999999863 Q ss_pred HHCCCCCCCCCEEEEEEECCCCCCCCCCCCCCCCHHHHHHHHHCCCC---------------CCCCCHHHHHHHHHHHHC Q ss_conf 20276877652134654126776543112356899999999735778---------------887445889999999611 Q gi|254781108|r 225 QKAIQEKKNLSVRIGTIAYNIGIVGNQCTPLSNNLNEVKSRLNKLNP---------------YENTNTYPAMHHAYRELY 289 (398) Q Consensus 225 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~lt~~~~~~~~~I~~l~~---------------~G~T~~~~gl~~a~~~l~ 289 (398) . .+ ..+...+.|+... ....++|.+...+..+|+.+.+ .|+|+++.++..+...+. T Consensus 85 l----~p---~d~~avv~F~~~~--~l~~~fT~d~~~l~~al~~l~~~~~~~~~~~~~~~~~~g~tal~dAi~laa~~~~ 155 (296) T TIGR03436 85 L----RP---NDEVFVVTFSTQL--RLLQDFTSDPRLLEAALNKLKPPLRTDYNSSGAFVADAGGTALYDAITLAALQQL 155 (296) T ss_pred C----CC---CCEEEEEEECCCE--EECCCCCCCHHHHHHHHHHHCCCCCCCCCCCCCCCCCCCCCCHHHHHHHHHHHHH T ss_conf 6----88---8679999948954--5727898899999999986156765433334532357874102788999999998 Q ss_pred CCCCCCCCCCCCCCCCEEEEECCCCCCCCCCCCCCCHHHHHHHHHHHHCCCEEEEEEECCC------------CCHHHHH Q ss_conf 3466666566766665169981588778777776631489999999978988999995479------------7558999 Q gi|254781108|r 290 NEKESSHNTIGSTRLKKFVIFITDGENSGASAYQNTLNTLQICEYMRNAGMKIYSVAVSAP------------PEGQDLL 357 (398) Q Consensus 290 ~~~~~~~~~~~~~~~~k~iillTDG~~~~~~~~~~~~~~~~~c~~~K~~gi~IytIg~~~~------------~~~~~~l 357 (398) ..... ...-+|+||++|||.++... .....+-+.+++.+|.||+|++... 
-.+.+.| T Consensus 156 ~~~~~------~~~gRK~li~iSdG~d~~s~-----~~~~~~~~~a~~a~v~IY~I~~~~~~~~~~~~~~~~~~~~~~~L 224 (296) T TIGR03436 156 ANALA------GIPGRKALIVISDGEDNSSR-----DTLERAIEAAQRADVLIYSIDARGLRAPDLGAGAKAGLSGPETL 224 (296) T ss_pred HHHCC------CCCCCEEEEEEECCCCCCCC-----CCHHHHHHHHHHCCCEEEEECCCCCCCCCCCCCCCCCCCCHHHH T ss_conf 75404------79886799999269886330-----48999999999849779995467656656444445567627999 Q ss_pred HHHH-CCCCCEEEECCHHHHHHHHHHHHHHHHHCEE Q ss_conf 9985-1998189946989999999999998751245 Q gi|254781108|r 358 RKCT-DSSGQFFAVNDSRELLESFDKITDKIQEQSV 392 (398) Q Consensus 358 ~~cA-s~~~~yy~a~~~~~L~~aF~~I~~~i~~~r~ 392 (398) +..| .++|.+|..+. .+|.++|++|++++.+.-+ T Consensus 225 ~~lA~~TGG~~f~~~~-~dl~~~~~~i~~~lr~qY~ 259 (296) T TIGR03436 225 ERLAAETGGRAFYVNS-NDIDEAFAQIAEELRSQYV 259 (296) T ss_pred HHHHHHHCCEEECCCC-CCHHHHHHHHHHHHHHEEE T ss_conf 9999973996755474-1089999999998752389 No 27 >cd01462 VWA_YIEM_type VWA YIEM type: Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. 
Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if Probab=99.46 E-value=4.7e-12 Score=90.70 Aligned_cols=109 Identities=17% Similarity=0.163 Sum_probs=79.2 Q ss_pred EEEEEECCCCCCCCCCCCCCCCHHHHHHHHHCCCCCCCCCHHHHHHHHHHHHCCCCCCCCCCCCCCCCCEEEEECCCCCC Q ss_conf 34654126776543112356899999999735778887445889999999611346666656676666516998158877 Q gi|254781108|r 237 RIGTIAYNIGIVGNQCTPLSNNLNEVKSRLNKLNPYENTNTYPAMHHAYRELYNEKESSHNTIGSTRLKKFVIFITDGEN 316 (398) Q Consensus 237 ~~~~~~~~~~~~~~~~~~lt~~~~~~~~~I~~l~~~G~T~~~~gl~~a~~~l~~~~~~~~~~~~~~~~~k~iillTDG~~ 316 (398) +.+.+.|+.... ....+.+.+..++...+..+.++|||++..+|..|.+.+... ...++.|||+|||+. T Consensus 38 ~~~lv~F~~~~~-~~~~~~~~~~~~~~~~i~~~~~~GGT~i~~aL~~A~~~l~~~----------~~~~~~IvlITDG~~ 106 (152) T cd01462 38 DTYLILFDSEFQ-TKIVDKTDDLEEPVEFLSGVQLGGGTDINKALRYALELIERR----------DPRKADIVLITDGYE 106 (152) T ss_pred EEEEEEECCCCE-EEECCCCCCHHHHHHHHHHCCCCCCCCHHHHHHHHHHHHHCC----------CCCCCEEEEEECCCC T ss_conf 099999168735-771587645999999997253689865799999999987425----------765646999826756 Q ss_pred CCCCCCCCCHHHHHHHHHHHHCCCEEEEEEECCCCCHHHHHHHHHC Q ss_conf 8777776631489999999978988999995479755899999851 Q gi|254781108|r 317 SGASAYQNTLNTLQICEYMRNAGMKIYSVAVSAPPEGQDLLRKCTD 362 (398) Q Consensus 317 ~~~~~~~~~~~~~~~c~~~K~~gi~IytIg~~~~~~~~~~l~~cAs 362 (398) ... .......++..+++|+++|++++|. +.+..+++..+. 
T Consensus 107 ~~~-----~~~~~~~~~~~~~~~~r~~~~~iG~-~~~p~~~~~~~~ 146 (152) T cd01462 107 GGV-----SDELLREVELKRSRVARFVALALGD-HGNPGYDRISAE 146 (152) T ss_pred CCC-----HHHHHHHHHHHHHCCEEEEEEEECC-CCCCHHHHHHHH T ss_conf 798-----3999999999983891999999899-988278787666 No 28 >COG1240 ChlD Mg-chelatase subunit ChlD [Coenzyme metabolism] Probab=99.32 E-value=6e-11 Score=84.08 Aligned_cols=172 Identities=19% Similarity=0.253 Sum_probs=130.0 Q ss_pred CCCCCEEEECCCCCCCCCCCCCCCCHHHHHCCCEECCCCCCCCEEEECCCCCCEEECCCCCCCCCHHHHHHHHHHHHHHH Q ss_conf 65431357505566666676663101233102301268876540310134322011255533341000245665541022 Q gi|254781108|r 146 LAISICMVLDVSRSMEDLYLQKHNDNNNMTSNKYLLPPPPKKSFWSKNTTKSKYAPAPAPANRKIDVLIESAGNLVNSIQ 225 (398) Q Consensus 146 ~~~di~lvlD~SgSm~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~a~~~~~~~~~ 225 (398) ....++||+|.||||.... ++...|.++..|+.... T Consensus 77 ~g~lvvfvVDASgSM~~~~--------------------------------------------Rm~aaKG~~~~lL~dAY 112 (261) T COG1240 77 AGNLIVFVVDASGSMAARR--------------------------------------------RMAAAKGAALSLLRDAY 112 (261) T ss_pred CCCCEEEEEECCCCCHHHH--------------------------------------------HHHHHHHHHHHHHHHHH T ss_conf 6774899994765420578--------------------------------------------99999999999999999 Q ss_pred HCCCCCCCCCEEEEEEECCCCCCCCCCCCCCCCHHHHHHHHHCCCCCCCCCHHHHHHHHHHHHCCCCCCCCCCCCCCCCC Q ss_conf 02768776521346541267765431123568999999997357788874458899999996113466666566766665 Q gi|254781108|r 226 KAIQEKKNLSVRIGTIAYNIGIVGNQCTPLSNNLNEVKSRLNKLNPYENTNTYPAMHHAYRELYNEKESSHNTIGSTRLK 305 (398) Q Consensus 226 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~lt~~~~~~~~~I~~l~~~G~T~~~~gl~~a~~~l~~~~~~~~~~~~~~~~~ 305 (398) ..+.+...+.|.. ...+.+++.|.+...+..++..|.++|.|.+..||..+++.+....... ++.. 
T Consensus 113 -------q~RdkvavI~F~G-~~A~lll~pT~sv~~~~~~L~~l~~GG~TPL~~aL~~a~ev~~r~~r~~------p~~~ 178 (261) T COG1240 113 -------QRRDKVAVIAFRG-EKAELLLPPTSSVELAERALERLPTGGKTPLADALRQAYEVLAREKRRG------PDRR 178 (261) T ss_pred -------HCCCEEEEEEECC-CCCEEEECCCCCHHHHHHHHHHCCCCCCCCHHHHHHHHHHHHHHHHCCC------CCCC T ss_conf -------7035489999637-7653884786539999999983899998843999999999999751048------8765 Q ss_pred EEEEECCCCCCCCCCCCCCCHHHHHHHHHHHHCCCEEEEEEECCCCCHHHHHHHHHC-CCCCEEEECCHHH Q ss_conf 169981588778777776631489999999978988999995479755899999851-9981899469899 Q gi|254781108|r 306 KFVIFITDGENSGASAYQNTLNTLQICEYMRNAGMKIYSVAVSAPPEGQDLLRKCTD-SSGQFFAVNDSRE 375 (398) Q Consensus 306 k~iillTDG~~~~~~~~~~~~~~~~~c~~~K~~gi~IytIg~~~~~~~~~~l~~cAs-~~~~yy~a~~~~~ 375 (398) .++|++|||..|..........+..+|.++...|+.+-+|.+..+.-...+.+++|. .++.||+-++..+ T Consensus 179 ~~~vviTDGr~n~~~~~~~~~e~~~~a~~~~~~g~~~lvid~e~~~~~~g~~~~iA~~~Gg~~~~L~~l~~ 249 (261) T COG1240 179 PVMVVITDGRANVPIPLGPKAETLEAASKLRLRGIQLLVIDTEGSEVRLGLAEEIARASGGEYYHLDDLSD 249 (261) T ss_pred EEEEEEECCCCCCCCCCCHHHHHHHHHHHHHHCCCCEEEEECCCCCCCCCHHHHHHHHHCCEEEECCCCCC T ss_conf 38999737965888898657799999999852688479995578523344799999973990786555640 No 29 >cd01454 vWA_norD_type norD type: Denitrifying bacteria contain both membrane bound and periplasmic nitrate reductases. Denitrification plays a major role in completing the nitrogen cycle by converting nitrate or nitrite to nitrogen gas. The pathway for microbial denitrification has been established as NO3- ------ NO2- ------ NO ------- N2O --------- N2. This reaction generally occurs under oxygen limiting conditions. Genetic and biochemical studies have shown that the first srep of the biochemical pathway is catalyzed by periplasmic nitrate reductases. This family is widely present in proteobacteria and firmicutes. This version of the domain is also present in some archaeal members. The function of the vWA domain in this sub-group is not known. 
Members of this subgroup have a conserved MIDAS motif. Probab=99.29 E-value=3.6e-10 Score=79.40 Aligned_cols=97 Identities=21% Similarity=0.399 Sum_probs=73.9 Q ss_pred HHHHHHHHCCCCCCCCCHHHHHHHHHHHHCCCCCCCCCCCCCCCCCEEEEECCCCCCCCCCCCCCC----HHHHHHHHHH Q ss_conf 999999735778887445889999999611346666656676666516998158877877777663----1489999999 Q gi|254781108|r 260 NEVKSRLNKLNPYENTNTYPAMHHAYRELYNEKESSHNTIGSTRLKKFVIFITDGENSGASAYQNT----LNTLQICEYM 335 (398) Q Consensus 260 ~~~~~~I~~l~~~G~T~~~~gl~~a~~~l~~~~~~~~~~~~~~~~~k~iillTDG~~~~~~~~~~~----~~~~~~c~~~ 335 (398) ...+..|..|.|+|+|..+.++.|+.+.|... +..+|++|++|||+.+...++... .++..++.++ T Consensus 69 ~~~~~~i~~l~~~g~Tr~G~Air~a~~~L~~~----------~~~rkiliviSDG~P~D~~~~~~~~~~~~D~~~av~e~ 138 (174) T cd01454 69 ERARKRLAALSPGGNTRDGAAIRHAAERLLAR----------PEKRKILLVISDGEPNDLDYYEGNVFATEDALRAVIEA 138 (174) T ss_pred HHHHHHHHCCCCCCCCCCHHHHHHHHHHHHHC----------CCCCEEEEEEECCCCCCCCCCCCCHHHHHHHHHHHHHH T ss_conf 45688885118789896179999999998639----------76667999983899766777887553899999999999 Q ss_pred HHCCCEEEEEEECCCC--CHHHHHHHHHCCCCCE Q ss_conf 9789889999954797--5589999985199818 Q gi|254781108|r 336 RNAGMKIYSVAVSAPP--EGQDLLRKCTDSSGQF 367 (398) Q Consensus 336 K~~gi~IytIg~~~~~--~~~~~l~~cAs~~~~y 367 (398) +.+||.+|.|+++.+. ...+.++.+-+ .++| T Consensus 139 ~~~GI~~~~i~i~~~~~~~~~~~l~~i~g-~~~~ 171 (174) T cd01454 139 RKLGIEVFGITIDRDATTVDKEYLKNIFG-EEGY 171 (174) T ss_pred HHCCCEEEEEEECCCCCHHHHHHHHHHCC-CCCE T ss_conf 98798899999898555669999998428-7877 No 30 >cd01477 vWA_F09G8-8_type VWA F09G8.8 type: Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. 
These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of mo Probab=99.14 E-value=4.2e-09 Score=73.03 Aligned_cols=168 Identities=21% Similarity=0.236 Sum_probs=111.1 Q ss_pred CCCCCCEEEECCCCCCCCCCCCCCCCHHHHHCCCEECCCCCCCCEEEECCCCCCEEECCCCCCCCCHHHHHHHHHHHHHH Q ss_conf 56543135750556666667666310123310230126887654031013432201125553334100024566554102 Q gi|254781108|r 145 NLAISICMVLDVSRSMEDLYLQKHNDNNNMTSNKYLLPPPPKKSFWSKNTTKSKYAPAPAPANRKIDVLIESAGNLVNSI 224 (398) Q Consensus 145 ~~~~di~lvlD~SgSm~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~a~~~~~~~~ 224 (398) +.=+|+++|+|+|..|.....+ .++..+..++... T Consensus 17 nLWLDVv~VVD~S~~mt~~gl~---------------------------------------------~V~~~I~s~f~~~ 51 (193) T cd01477 17 NLWLDIVFVVDNSKGMTQGGLW---------------------------------------------QVRATISSLFGSS 51 (193) T ss_pred HEEEEEEEEEECCCCCCCCCHH---------------------------------------------HHHHHHHHHHHCC T ss_conf 2237899999678765621099---------------------------------------------9999999997135 Q ss_pred HHCCCCCC-CCCEEEEEEECCCCCCCCCCCCCCCCHHHHHHHHHC-CC---CCCCCCHHHHHHHHHHHHCCCCCCCCCCC Q ss_conf 20276877-652134654126776543112356899999999735-77---88874458899999996113466666566 Q gi|254781108|r 225 QKAIQEKK-NLSVRIGTIAYNIGIVGNQCTPLSNNLNEVKSRLNK-LN---PYENTNTYPAMHHAYRELYNEKESSHNTI 299 (398) Q Consensus 225 ~~~~~~~~-~~~~~~~~~~~~~~~~~~~~~~lt~~~~~~~~~I~~-l~---~~G~T~~~~gl~~a~~~l~~~~~~~~~~~ 299 (398) .....++. ...+|.+.+.|++.......+.--....++.+.|.. |. ....++++.||.-|-.+|..... . 
T Consensus 52 t~iGt~~~~pr~TRVGlVTYn~~AtvvAdLn~~~S~ddl~~~i~~~l~~vsss~~SyL~~GL~aA~~~l~~~~~-----~ 126 (193) T cd01477 52 SQIGTDYDDPRSTRVGLVTYNSNATVVADLNDLQSFDDLYSQIQGSLTDVSSTNASYLDTGLQAAEQMLAAGKR-----T 126 (193) T ss_pred CCCCCCCCCCCCEEEEEEEECCCCEEEECCCCCCCHHHHHHHHHHHHCCCCCCCHHHHHHHHHHHHHHHHHCCC-----C T ss_conf 40357889987338999996787459863454565788999998875146666312799999999999983326-----6 Q ss_pred CCCCCCEEEEECCCCCCCCCCCCCCCHHHHHHHHHHHHCCCEEEEEEECCCCCH--HHHHHHHHCCCCCEE Q ss_conf 766665169981588778777776631489999999978988999995479755--899999851998189 Q gi|254781108|r 300 GSTRLKKFVIFITDGENSGASAYQNTLNTLQICEYMRNAGMKIYSVAVSAPPEG--QDLLRKCTDSSGQFF 368 (398) Q Consensus 300 ~~~~~~k~iillTDG~~~~~~~~~~~~~~~~~c~~~K~~gi~IytIg~~~~~~~--~~~l~~cAs~~~~yy 368 (398) ...+++|+||+..-.....+ ...+..+++.+|..|+.|-||+|+.+... ...|+.||| |++-| T Consensus 127 ~R~nykKVVIVyAs~y~~~g-----~~dp~pvA~rLK~~Gv~IiTVa~~q~~~~~~~~~L~~IAS-pg~nF 191 (193) T cd01477 127 SRENYKKVVIVFASDYNDEG-----SNDPRPIAARLKSTGIAIITVAFTQDESSNLLDKLGKIAS-PGMNF 191 (193) T ss_pred CCCCCCEEEEEEECCCCCCC-----CCCHHHHHHHHHHCCCEEEEEECCCCCCHHHHHHHHHHCC-CCCCC T ss_conf 42486279999950246789-----8886999999987697899998268875889998887579-98887 No 31 >TIGR02442 Cob-chelat-sub cobaltochelatase subunit; InterPro: IPR012804 Cobalamin (vitamin B12) can be complexed with metal via ATP-dependent reactions (aerobic pathway) (e.g., in Pseudomonas denitrificans) or via ATP-independent reactions (anaerobic pathway) (e.g., in Salmonella typhimurium) , . The corresponding cobalt chelatases are not homologous. Cobaltochelatase is responsible for the insertion of cobalt into the corrin ring of coenzyme B12 during its biosynthesis. Two versions have been well described. CbiK/CbiX is a monomeric, anaerobic version which acts early in the biosynthesis (IPR010388 from INTERPRO). CobNST is a trimeric, ATP-dependent, aerobic version which acts late in the biosynthesis, (IPR011953 from INTERPRO, IPR006537 from INTERPRO, IPR006538 from INTERPRO) . 
The two pathways differ in the point of cobalt insertion during corrin ring formation . There are apparently a number of variations on these two pathways, where the major differences seem to be concerned with the process of ring contraction . Cobaltochelatase shows similarities with magnesium chelatase, which is also a complex ATP-dependent enzyme made up of two separable components. However, unlike the situation in cobaltochelatase, one of these two components is membrane bound in magnesium chelatase . . Probab=98.76 E-value=1.2e-07 Score=64.28 Aligned_cols=165 Identities=17% Similarity=0.176 Sum_probs=116.9 Q ss_pred CCCCEEEECCCCCCCCCCCCCCCCHHHHHCCCEECCCCCCCCEEEECCCCCCEEECCCCCCCCCHHHHHHHHHHHHHHHH Q ss_conf 54313575055666666766631012331023012688765403101343220112555333410002456655410220 Q gi|254781108|r 147 AISICMVLDVSRSMEDLYLQKHNDNNNMTSNKYLLPPPPKKSFWSKNTTKSKYAPAPAPANRKIDVLIESAGNLVNSIQK 226 (398) Q Consensus 147 ~~di~lvlD~SgSm~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~a~~~~~~~~~~ 226 (398) .--+.||+|-||||.... |+...|.++..|+-.. T Consensus 508 G~LviFvVDASGSM~ar~--------------------------------------------RM~~~KGavLsLL~DA-- 541 (688) T TIGR02442 508 GNLVIFVVDASGSMAARG--------------------------------------------RMAAAKGAVLSLLRDA-- 541 (688) T ss_pred CCCEEEEEECCHHHHHHH--------------------------------------------HHHHHHHHHHHHHHHH-- T ss_conf 152223533532044235--------------------------------------------7899899999988888-- Q ss_pred CCCCCCCCCEEEEEEECCCCCCCCCCCCCCCCHHHHHHHHHCCCCCCCCCHHHHHHHHHHHHCCCCCCCCCCCCCCCCCE Q ss_conf 27687765213465412677654311235689999999973577888744588999999961134666665667666651 Q gi|254781108|r 227 AIQEKKNLSVRIGTIAYNIGIVGNQCTPLSNNLNEVKSRLNKLNPYENTNTYPAMHHAYRELYNEKESSHNTIGSTRLKK 306 (398) Q Consensus 227 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~lt~~~~~~~~~I~~l~~~G~T~~~~gl~~a~~~l~~~~~~~~~~~~~~~~~k 306 (398) +..+.+...+.|- +......+|.|.+.......+..|..+|.|....||..|++++...--.. ..... 
T Consensus 542 -----Yq~RDkValI~Fr-G~~AevlLPPT~sv~~A~r~L~~lPtGGrTPLa~gL~~A~~v~~~~~~~~------~~~~p 609 (688) T TIGR02442 542 -----YQKRDKVALITFR-GEEAEVLLPPTSSVELAARRLEELPTGGRTPLAAGLLKAAEVLSNELLRD------DDRRP 609 (688) T ss_pred -----HHHCCEEEEEECC-CCEEEEECCCCCHHHHHHHHHHHCCCCCCCHHHHHHHHHHHHHHHHHHCC------CCCCE T ss_conf -----8627768886236-73435765878848999999972889898745899999999999986116------89942 Q ss_pred EEEECCCCCCCCCC---CCCC--CHHHHHHHHHHHHC-------CCEEEEEEECC-CCCHHHHHHHHHCC-CCCEEE Q ss_conf 69981588778777---7766--31489999999978-------98899999547-97558999998519-981899 Q gi|254781108|r 307 FVIFITDGENSGAS---AYQN--TLNTLQICEYMRNA-------GMKIYSVAVSA-PPEGQDLLRKCTDS-SGQFFA 369 (398) Q Consensus 307 ~iillTDG~~~~~~---~~~~--~~~~~~~c~~~K~~-------gi~IytIg~~~-~~~~~~~l~~cAs~-~~~yy~ 369 (398) ++|++|||.-|..- .... ...+..+..++++. ||..-+|=-.. +--.-.+-+++|+. ++.||. T Consensus 610 l~V~iTDGRaNv~L~~~~g~~qp~~~~~~~a~~L~~~~~R~R~Lg~~~vV~DTE~~~~v~lGlA~~~A~~lgg~~~~ 686 (688) T TIGR02442 610 LVVVITDGRANVALDVSLGEPQPLDDARTIASKLAARASRIRSLGIKFVVIDTENPGFVRLGLAEDLASALGGEYLR 686 (688) T ss_pred EEEEEECCCCCCCCCCCCCCCCHHHHHHHHHHHHHHHHCCEEECCCEEEEEECCCCCCCCCCHHHHHHHHHCCCEEC T ss_conf 89987078635426667888415778999999988750430111622789972688754222389999982983224 No 32 >COG4961 TadG Flp pilus assembly protein TadG [Intracellular trafficking and secretion] Probab=98.75 E-value=3e-07 Score=61.91 Aligned_cols=69 Identities=13% Similarity=0.159 Sum_probs=53.0 Q ss_pred CHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHCCCCCCCHHHHHHHHHHHHHHHHHHHHHCCC Q ss_conf 954798999999999999999999999999999999999986511444431023368999999998867541012 Q gi|254781108|r 1 MTAIIISVCFLFITYAIDLAHIMYIRNQMQSALDAAVLSGCASIVSDRTIKDPTTKKDQTSTIFKKQIKKHLKQG 75 (398) Q Consensus 1 l~Al~l~~ll~~~g~avD~~~~~~~k~~Lq~A~DaA~LAaa~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~l~~~ 75 (398) +|||++|||+++.++.||++.+++.|.+||+|+|+|++++++......... +..+.++..+........ 
T Consensus 26 eFAlvap~ll~l~~g~ve~~~~~~~~~~l~~a~d~aara~~~~~~~~~~~~------~~~~~~~~~~~~~~~~~~ 94 (185) T COG4961 26 EFALVAPPLLLLVFGIVEFGIAFLAKQSLQNAADAAARAAARGLTTDAADL------DTIQAAATAFLNAIAPAN 94 (185) T ss_pred HHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHCCCCCCCHH------HHHHHHHHHHHHHHCCCC T ss_conf 999999999999999999999999999999999999999985076442025------677899998887516422 No 33 >COG4245 TerY Uncharacterized protein encoded in toxicity protection region of plasmid R478, contains von Willebrand factor (vWF) domain [General function prediction only] Probab=98.75 E-value=3e-07 Score=61.87 Aligned_cols=182 Identities=16% Similarity=0.210 Sum_probs=116.5 Q ss_pred CCCEEEECCCCCCCCCCCCCCCCHHHHHCCCEECCCCCCCCEEEECCCCCCEEECCCCCCCCCHHHHHHHHHHHHHHHHC Q ss_conf 43135750556666667666310123310230126887654031013432201125553334100024566554102202 Q gi|254781108|r 148 ISICMVLDVSRSMEDLYLQKHNDNNNMTSNKYLLPPPPKKSFWSKNTTKSKYAPAPAPANRKIDVLIESAGNLVNSIQKA 227 (398) Q Consensus 148 ~di~lvlD~SgSm~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~a~~~~~~~~~~~ 227 (398) +.++|+||.||||..- .+..++.++..+.+++..- T Consensus 4 lP~~lllDtSgSM~Ge---------------------------------------------~IealN~Glq~m~~~Lkqd 38 (207) T COG4245 4 LPCYLLLDTSGSMIGE---------------------------------------------PIEALNAGLQMMIDTLKQD 38 (207) T ss_pred CCEEEEEECCCCCCCC---------------------------------------------CHHHHHHHHHHHHHHHHHC T ss_conf 7889999367542456---------------------------------------------1799989999999998748 Q ss_pred CCCCCCCCEEEEEEECCCCCCCCCCCCCCCCHHHHHHHHHCCCCCCCCCHHHHHHHHHHHHCCCCCCCCCCCCCCCCCEE Q ss_conf 76877652134654126776543112356899999999735778887445889999999611346666656676666516 Q gi|254781108|r 228 IQEKKNLSVRIGTIAYNIGIVGNQCTPLSNNLNEVKSRLNKLNPYENTNTYPAMHHAYRELYNEKESSHNTIGSTRLKKF 307 (398) Q Consensus 228 ~~~~~~~~~~~~~~~~~~~~~~~~~~~lt~~~~~~~~~I~~l~~~G~T~~~~gl~~a~~~l~~~~~~~~~~~~~~~~~k~ 307 (398) +.. ..++...++.|... ...+.||+. ..++ .--.|.+.|+|..+.+|..+.+.+....... 
.+....+++.+ T Consensus 39 p~A--le~v~lsIVTF~~~--a~~~~pf~~-~~nF--~~p~L~a~GgT~lGaAl~~a~d~Ie~~~~~~-~a~~kgdyrP~ 110 (207) T COG4245 39 PYA--LERVELSIVTFGGP--ARVIQPFTD-AANF--NPPILTAQGGTPLGAALTLALDMIEERKRKY-DANGKGDYRPW 110 (207) T ss_pred HHH--HHEEEEEEEEECCC--CEEEECHHH-HHHC--CCCCEECCCCCCHHHHHHHHHHHHHHHHHHC-CCCCCCCCCEE T ss_conf 465--44057899982685--068733155-7544--8870136999806799999999998777650-56775554417 Q ss_pred EEECCCCCCCCCCCCCCCHHHHHHHHH--HHHCCCEEEEEEECCCCCHHHHHHHHHCCCCCEEEECCHHHHHHHHHHHHH Q ss_conf 998158877877777663148999999--997898899999547975589999985199818994698999999999999 Q gi|254781108|r 308 VIFITDGENSGASAYQNTLNTLQICEY--MRNAGMKIYSVAVSAPPEGQDLLRKCTDSSGQFFAVNDSRELLESFDKITD 385 (398) Q Consensus 308 iillTDG~~~~~~~~~~~~~~~~~c~~--~K~~gi~IytIg~~~~~~~~~~l~~cAs~~~~yy~a~~~~~L~~aF~~I~~ 385 (398) ++|||||+.+.. . .+.+... -+.+.-.+-.+++|....+...|+...+.-..++. .+..++.+.|+=+.. T Consensus 111 vfLiTDG~PtD~------w-~~~~~~~~~~~~~~k~v~a~~~G~~~ad~~~L~qit~~V~~~~t-~d~~~f~~fFkW~Sa 182 (207) T COG4245 111 VFLITDGEPTDD------W-QAGAALVFQGERRAKSVAAFSVGVQGADNKTLNQITEKVRQFLT-LDGLQFREFFKWLSA 182 (207) T ss_pred EEEECCCCCCHH------H-HHHHHHHHHCCCCCCEEEEEEECCCCCCCHHHHHHHHHHCCCCC-CCHHHHHHHHHHHHH T ss_conf 999538996657------7-76777764033100528999953543441899998876525234-534889999999987 Q ss_pred HHHHC Q ss_conf 87512 Q gi|254781108|r 386 KIQEQ 390 (398) Q Consensus 386 ~i~~~ 390 (398) .|+.- T Consensus 183 Sisag 187 (207) T COG4245 183 SISAG 187 (207) T ss_pred HHHCC T ss_conf 75132 No 34 >KOG2353 consensus Probab=98.73 E-value=1.1e-07 Score=64.56 Aligned_cols=187 Identities=18% Similarity=0.270 Sum_probs=118.5 Q ss_pred CCCCCCCCCCCCCEEEECCCCCCCCCCCCCCCCHHHHHCCCEECCCCCCCCEEEECCCCCCEEECCCCCCCCCHHHHHHH Q ss_conf 22223455654313575055666666766631012331023012688765403101343220112555333410002456 Q gi|254781108|r 138 IIERSSENLAISICMVLDVSRSMEDLYLQKHNDNNNMTSNKYLLPPPPKKSFWSKNTTKSKYAPAPAPANRKIDVLIESA 217 (398) Q Consensus 138 ~~~~~~~~~~~di~lvlD~SgSm~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~a~ 
217 (398) .+-......+.++++.+|.||||... +.+..+..+ T Consensus 216 ~Wyi~aAt~pKdiviLlD~SgSm~g~---------------------------------------------~~~lak~tv 250 (1104) T KOG2353 216 SWYIQAATSPKDIVILLDVSGSMSGL---------------------------------------------RLDLAKQTV 250 (1104) T ss_pred CCCCCCCCCCCCEEEEEECCCCCCCH---------------------------------------------HHHHHHHHH T ss_conf 53000467866459999656555443---------------------------------------------169999999 Q ss_pred HHHHHHHHHCCCCCCCCCEEEEEEECCC------CCC-CCCCCCCCCCHHHHHHHHHCCCCCCCCCHHHHHHHHHHHHCC Q ss_conf 6554102202768776521346541267------765-431123568999999997357788874458899999996113 Q gi|254781108|r 218 GNLVNSIQKAIQEKKNLSVRIGTIAYNI------GIV-GNQCTPLSNNLNEVKSRLNKLNPYENTNTYPAMHHAYRELYN 290 (398) Q Consensus 218 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~------~~~-~~~~~~lt~~~~~~~~~I~~l~~~G~T~~~~gl~~a~~~l~~ 290 (398) ...++++...+. +. ...|+. .|. ...+.....|+..++..|+.+.+.|.++...|+..+..+|.. T Consensus 251 ~~iLdtLs~~Df------vn--i~tf~~~~~~v~pc~~~~lvqAt~~nk~~~~~~i~~l~~k~~a~~~~~~e~aF~lL~~ 322 (1104) T KOG2353 251 NEILDTLSDNDF------VN--ILTFNSEVNPVSPCFNGTLVQATMRNKKVFKEAIETLDAKGIANYTAALEYAFSLLRD 322 (1104) T ss_pred HHHHHHCCCCCE------EE--EEEECCCCCCCCCCCCCCEEECCHHHHHHHHHHHHHHCCCCCCCHHHHHHHHHHHHHH T ss_conf 999976154776------87--8762135675652025852204567799999998641412541243557789999987 Q ss_pred CCCCCCCCCCCCCCCEEEEECCCCCCCCCCCCCCCHHHHHHHHHH-H-HCCCEEEEEEECCCCC--HHHHHHHHHCCCCC Q ss_conf 466666566766665169981588778777776631489999999-9-7898899999547975--58999998519981 Q gi|254781108|r 291 EKESSHNTIGSTRLKKFVIFITDGENSGASAYQNTLNTLQICEYM-R-NAGMKIYSVAVSAPPE--GQDLLRKCTDSSGQ 366 (398) Q Consensus 291 ~~~~~~~~~~~~~~~k~iillTDG~~~~~~~~~~~~~~~~~c~~~-K-~~gi~IytIg~~~~~~--~~~~l~~cAs~~~~ 366 (398) ...+....... .-.++|+++|||..+... .+-+.- . .+.|.|||.-+|.... 
..-.+..|++ .|+ T Consensus 323 ~n~s~~~~~~~-~C~~~iml~tdG~~~~~~---------~If~~yn~~~~~Vrvftflig~~~~~~~~~~wmac~n-~gy 391 (1104) T KOG2353 323 YNDSRANTQRS-PCNQAIMLITDGVDENAK---------EIFEKYNWPDKKVRVFTFLIGDEVYDLDEIQWMACAN-KGY 391 (1104) T ss_pred HCCCCCCCCCC-CCCEEEEEEECCCCCCHH---------HHHHHHCCCCCCEEEEEEEECCCCCCCCCCHHHHHHC-CCC T ss_conf 44455443225-001045776247751089---------9998603677735999999244213454121225407-885 Q ss_pred EEEECCHHHHHHHHHHHHHHHH Q ss_conf 8994698999999999999875 Q gi|254781108|r 367 FFAVNDSRELLESFDKITDKIQ 388 (398) Q Consensus 367 yy~a~~~~~L~~aF~~I~~~i~ 388 (398) |++..+-++..+-=....+-+. T Consensus 392 y~~I~~~~~v~~~~~~y~~vls 413 (1104) T KOG2353 392 YVHIISIADVRENVLEYLDVLS 413 (1104) T ss_pred EEECCCHHHCCHHHHHHHHHHC T ss_conf 5864665645867655664532 No 35 >PRK13406 bchD magnesium chelatase subunit D; Provisional Probab=98.62 E-value=3.5e-06 Score=55.45 Aligned_cols=155 Identities=15% Similarity=0.172 Sum_probs=105.1 Q ss_pred CCHHHHHHHHHHHHHHHHCCCCCCCCCEEEEEEECCCCCCCCCCCCCCCCHHHHHHHHHCCCCCCCCCHHHHHHHHHHHH Q ss_conf 41000245665541022027687765213465412677654311235689999999973577888744588999999961 Q gi|254781108|r 209 KIDVLIESAGNLVNSIQKAIQEKKNLSVRIGTIAYNIGIVGNQCTPLSNNLNEVKSRLNKLNPYENTNTYPAMHHAYREL 288 (398) Q Consensus 209 ~~~~~~~a~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~lt~~~~~~~~~I~~l~~~G~T~~~~gl~~a~~~l 288 (398) |+...|.++..++.... ..+.+...+.|-.. .....+|.|......+..+..|..+|+|....||..++++. 
T Consensus 418 Rm~~aKGAV~~LL~dAY-------~~RD~ValIaFRG~-~AevlLPPTrSv~~A~r~L~~LP~GG~TPLA~GL~~A~~l~ 489 (584) T PRK13406 418 RLAEAKGAVELLLAECY-------VRRDHVALVAFRGR-GAELLLPPTRSLVRAKRSLAGLPGGGGTPLAAGLDAALALA 489 (584) T ss_pred HHHHHHHHHHHHHHHHH-------HHHCEEEEEEECCC-CCEEEECCCCCHHHHHHHHHCCCCCCCCHHHHHHHHHHHHH T ss_conf 99999999999999999-------60044789987687-63074188655999999996299999885999999999999 Q ss_pred CCCCCCCCCCCCCCCCCEEEEECCCCCCCCCCCCC-----CCHHHHHHHHHHHHCCCEEEEEEECCCCCHHHHHHHHHCC Q ss_conf 13466666566766665169981588778777776-----6314899999999789889999954797558999998519 Q gi|254781108|r 289 YNEKESSHNTIGSTRLKKFVIFITDGENSGASAYQ-----NTLNTLQICEYMRNAGMKIYSVAVSAPPEGQDLLRKCTDS 363 (398) Q Consensus 289 ~~~~~~~~~~~~~~~~~k~iillTDG~~~~~~~~~-----~~~~~~~~c~~~K~~gi~IytIg~~~~~~~~~~l~~cAs~ 363 (398) ...... ....++||+|||.-|..--.. -......++..++..||..-+|=-+ .......+..|.. T Consensus 490 ~~~r~~--------~~~p~~VllTDGRaNv~ldg~~~r~~a~~da~~~A~~l~~~g~~~vVIDT~--~~~~~~a~~LA~~ 559 (584) T PRK13406 490 LSVRRK--------GQTPTVVLLTDGRANIARDGAGGRAQAEEDALAAARALRAAGLPALVIDTS--PRPQPQARALAEA 559 (584) T ss_pred HHHHCC--------CCCEEEEEECCCCCCCCCCCCCCCHHHHHHHHHHHHHHHHCCCCEEEEECC--CCCCHHHHHHHHH T ss_conf 997557--------995489998279877787778871148999999999999769978999489--8886269999998 Q ss_pred -CCCEEEEC--CHHHHHHHHH Q ss_conf -98189946--9899999999 Q gi|254781108|r 364 -SGQFFAVN--DSRELLESFD 381 (398) Q Consensus 364 -~~~yy~a~--~~~~L~~aF~ 381 (398) ++.||.-+ +++.|..+-+ T Consensus 560 l~a~Y~~Lp~~~A~~l~~~V~ 580 (584) T PRK13406 560 MGARYLPLPRADATRLSQAVR 580 (584) T ss_pred CCCCEEECCCCCHHHHHHHHH T ss_conf 399189789789899999999 No 36 >TIGR02031 BchD-ChlD magnesium chelatase ATPase subunit D; InterPro: IPR011776 This entry represents one of two ATPase subunits of the trimeric magnesium chelatase responsible for insertion of magnesium ion into protoporphyrin IX. This is an essential step in the biosynthesis of both chlorophyll and bacteriochlorophyll. 
This subunit is found in green plants, photosynthetic algae, cyanobacteria and other photosynthetic bacteria. Unlike subunit I (IPR011775 from INTERPRO), this subunit is not found in archaea.; GO: 0005524 ATP binding, 0016851 magnesium chelatase activity, 0015995 chlorophyll biosynthetic process. Probab=98.48 E-value=2.4e-06 Score=56.48 Aligned_cols=180 Identities=15% Similarity=0.166 Sum_probs=123.4 Q ss_pred CCCCCCCCCCCCCEEEECCCCCCCCCCCCCCCCHHHHHCCCEECCCCCCCCEEEECCCCCCEEECCCCCCCCCHHHHHHH Q ss_conf 22223455654313575055666666766631012331023012688765403101343220112555333410002456 Q gi|254781108|r 138 IIERSSENLAISICMVLDVSRSMEDLYLQKHNDNNNMTSNKYLLPPPPKKSFWSKNTTKSKYAPAPAPANRKIDVLIESA 217 (398) Q Consensus 138 ~~~~~~~~~~~di~lvlD~SgSm~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~a~ 217 (398) ....+......-+.|++|-||||.- ..|+...|-|+ T Consensus 501 r~Kr~~~ksg~L~IF~VDASGSsaa--------------------------------------------~~Rm~~AKGAV 536 (705) T TIGR02031 501 RIKRYRRKSGALLIFVVDASGSSAA--------------------------------------------VARMSEAKGAV 536 (705) T ss_pred HHHHHHCCCCCEEEEEEECCHHHHH--------------------------------------------HHHHHHHHHHH T ss_conf 1244330288279997606357899--------------------------------------------99998778999 Q ss_pred HHHHHHHHHCCCCCCCCCEEEEEEECCCCCCCCCCCCCCCCHHHHHHHHHCCCCCCCCCHHHHHHHHHHHHCCCCCCCCC Q ss_conf 65541022027687765213465412677654311235689999999973577888744588999999961134666665 Q gi|254781108|r 218 GNLVNSIQKAIQEKKNLSVRIGTIAYNIGIVGNQCTPLSNNLNEVKSRLNKLNPYENTNTYPAMHHAYRELYNEKESSHN 297 (398) Q Consensus 218 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~lt~~~~~~~~~I~~l~~~G~T~~~~gl~~a~~~l~~~~~~~~~ 297 (398) ..|+...... .+.+++..+.|-. ...+..+|-|-....-+..++.|.-||||....||..||++-....... 
T Consensus 537 ~~LL~~AYv~-----RD~vkVaLi~FRG-~~Ae~LLPPsrSv~~aKr~L~~LP~GGGtPLA~gL~~A~~~a~qar~~G-- 608 (705) T TIGR02031 537 ELLLGEAYVH-----RDQVKVALIAFRG-TAAEVLLPPSRSVELAKRRLDVLPGGGGTPLAAGLAAAVEVAKQARSRG-- 608 (705) T ss_pred HHHHHHHHHH-----CCCEEEEEEECCC-CHHHHCCCCHHHHHHHHHHHCCCCCCCCCHHHHHHHHHHHHHHHHHCCC-- T ss_conf 9998765441-----3603577630444-3000037852358999999715899985678999999999998510268-- Q ss_pred CCCCCCCCEEEEECCCCCCCCCCCC-------CC-----------CHHHHHHHHHHHHCCCEEEEEEECCCCCHHHHHHH Q ss_conf 6676666516998158877877777-------66-----------31489999999978988999995479755899999 Q gi|254781108|r 298 TIGSTRLKKFVIFITDGENSGASAY-------QN-----------TLNTLQICEYMRNAGMKIYSVAVSAPPEGQDLLRK 359 (398) Q Consensus 298 ~~~~~~~~k~iillTDG~~~~~~~~-------~~-----------~~~~~~~c~~~K~~gi~IytIg~~~~~~~~~~l~~ 359 (398) .-.+-+|||+|||--|-.=-. .. ......++..+.+.||-.-+|=-.-.-...-+++. T Consensus 609 ----D~~~~~ivliTDGRgNvpL~~~~DP~~~~~~r~PrPts~~l~~e~~~lA~~i~~~G~~~lVIDT~~~f~s~G~a~~ 684 (705) T TIGR02031 609 ----DVGRITIVLITDGRGNVPLDASVDPKAAKADRLPRPTSEELKEEVLALARKIREAGISALVIDTANKFVSTGFAKK 684 (705) T ss_pred ----CCCEEEEEEECCCCCCCCCCCCCCCCCCCCCCCCCCHHHHHHHHHHHHHHHHHHCCCCEEEEECCCCCCCCCHHHH T ss_conf ----8524556776077877467656786100235678726899999999999998871886589826778667644899 Q ss_pred HHCC-CCCEEEECCH Q ss_conf 8519-9818994698 Q gi|254781108|r 360 CTDS-SGQFFAVNDS 373 (398) Q Consensus 360 cAs~-~~~yy~a~~~ 373 (398) +|.. .+|||+-+++ T Consensus 685 lA~~~~a~Y~yLP~a 699 (705) T TIGR02031 685 LARKLGARYIYLPNA 699 (705) T ss_pred HHHHHCCCEEECCCC T ss_conf 999858906713688 No 37 >cd01457 vWA_ORF176_type VWA ORF176 type: Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. 
The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses. In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most Probab=98.26 E-value=6.7e-05 Score=47.81 Aligned_cols=160 Identities=16% Similarity=0.194 Sum_probs=92.3 Q ss_pred CCCCEEEECCCCCCCCCCCCCCCCHHHHHCCCEECCCCCCCCEEEECCCCCCEEECCCCCCCCCHHHHHHHHHHHHHHHH Q ss_conf 54313575055666666766631012331023012688765403101343220112555333410002456655410220 Q gi|254781108|r 147 AISICMVLDVSRSMEDLYLQKHNDNNNMTSNKYLLPPPPKKSFWSKNTTKSKYAPAPAPANRKIDVLIESAGNLVNSIQK 226 (398) Q Consensus 147 ~~di~lvlD~SgSm~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~a~~~~~~~~~~ 226 (398) ..|++|++|.||||...... +...|...++.++..+...... 
T Consensus 2 ~rD~v~lIDdSgSM~~~d~~--------------------------------------~~~sRW~~a~~al~~iA~~c~~ 43 (199) T cd01457 2 NRDYTLLIDKSGSMAEADEA--------------------------------------KERSRWEEAQESTRALARKCEE 43 (199) T ss_pred CCCEEEEEECCCCCCCCCCC--------------------------------------CCCCHHHHHHHHHHHHHHHHHH T ss_conf 97779999688876677678--------------------------------------8876299999999999999987 Q ss_pred CCCCCCCCCEEEEEEECCCCCCCCCCCCCCCCHHHHHHHHHCCCCCCCCCHHHHHHHHHHHHCCCCCCCCCCCCCCCCC- Q ss_conf 2768776521346541267765431123568999999997357788874458899999996113466666566766665- Q gi|254781108|r 227 AIQEKKNLSVRIGTIAYNIGIVGNQCTPLSNNLNEVKSRLNKLNPYENTNTYPAMHHAYRELYNEKESSHNTIGSTRLK- 305 (398) Q Consensus 227 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~lt~~~~~~~~~I~~l~~~G~T~~~~gl~~a~~~l~~~~~~~~~~~~~~~~~- 305 (398) .+.++.. .+.++........ -+...+........|.|.|+....|.-+.......... +.+.++ T Consensus 44 ~D~DGId------vyfln~~~~~~~~----~~~~~V~~iF~~~~P~G~T~~g~~L~~il~~y~~r~~~-----~~~kp~g 108 (199) T cd01457 44 YDSDGIT------VYLFSGDFRRYDN----VNSSKVDQLFAENSPDGGTNLAAVLQDALNNYFQRKEN-----GATCPEG 108 (199) T ss_pred CCCCCCE------EEEEECCCCCCCC----CCHHHHHHHHHCCCCCCCCCHHHHHHHHHHHHHHHHHC-----CCCCCCC T ss_conf 4889987------9996277645688----89999999985589899796379999998999873200-----6899986 Q ss_pred EEEEECCCCCCCCCCCCCCCHHHHHHHHHHH-HCCCEEEEEEECCCCCHHHHHHHHH Q ss_conf 1699815887787777766314899999999-7898899999547975589999985 Q gi|254781108|r 306 KFVIFITDGENSGASAYQNTLNTLQICEYMR-NAGMKIYSVAVSAPPEGQDLLRKCT 361 (398) Q Consensus 306 k~iillTDG~~~~~~~~~~~~~~~~~c~~~K-~~gi~IytIg~~~~~~~~~~l~~cA 361 (398) -.||++|||+.+.... -......++.++. 
.+.+-|-+|.+|.+.....+|+..= T Consensus 109 ~~iIVITDG~p~D~~a--v~~~Ii~aa~kLd~~~qlgIqF~QVG~D~~A~~fL~~LD 163 (199) T cd01457 109 ETFLVITDGAPDDKDA--VERVIIKASDELDADNELAISFLQIGRDPAATAFLKALD 163 (199) T ss_pred EEEEEEECCCCCCCHH--HHHHHHHHHHHHCCCCCCCEEEEEECCCHHHHHHHHHHC T ss_conf 0799982799798288--999999999863440100367778559688999999858 No 38 >cd01453 vWA_transcription_factor_IIH_type Transcription factors IIH type: TFIIH is a multiprotein complex that is one of the five general transcription factors that binds RNA polymerase II holoenzyme. Orthologues of these genes are found in all completed eukaryotic genomes and all these proteins contain a VWA domain. The p44 subunit of TFIIH functions as a DNA helicase in RNA polymerase II transcription initiation and DNA repair, and its transcriptional activity is dependent on its C-terminal Zn-binding domains. The function of the vWA domain is unclear, but may be involved in complex assembly. The MIDAS motif is not conserved in this sub-group. Probab=98.23 E-value=0.0003 Score=43.89 Aligned_cols=173 Identities=15% Similarity=0.207 Sum_probs=114.3 Q ss_pred CCCEEEECCCCCCCCCCCCCCCCHHHHHCCCEECCCCCCCCEEEECCCCCCEEECCCCCCCCCHHHHHHHHHHHHHHHHC Q ss_conf 43135750556666667666310123310230126887654031013432201125553334100024566554102202 Q gi|254781108|r 148 ISICMVLDVSRSMEDLYLQKHNDNNNMTSNKYLLPPPPKKSFWSKNTTKSKYAPAPAPANRKIDVLIESAGNLVNSIQKA 227 (398) Q Consensus 148 ~di~lvlD~SgSm~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~a~~~~~~~~~~~ 227 (398) ..+++|+|.|.||.... ...+|......++..++..+.+. 
T Consensus 4 R~l~iiiD~S~am~~~D----------------------------------------~~PtRl~~~~~~l~~Fi~effdq 43 (183) T cd01453 4 RHLIIVIDCSRSMEEQD----------------------------------------LKPSRLAVVLKLLELFIEEFFDQ 43 (183) T ss_pred EEEEEEEECCHHHHHCC----------------------------------------CCCCHHHHHHHHHHHHHHHHHCC T ss_conf 69999998837677565----------------------------------------89549999999999999998707 Q ss_pred CCCCCCCCEEEEEEECCCCCCCCCCCCCCCCHHHHHHHHHCC-CCCCCCCHHHHHHHHHHHHCCCCCCCCCCCCCCCCCE Q ss_conf 768776521346541267765431123568999999997357-7888744588999999961134666665667666651 Q gi|254781108|r 228 IQEKKNLSVRIGTIAYNIGIVGNQCTPLSNNLNEVKSRLNKL-NPYENTNTYPAMHHAYRELYNEKESSHNTIGSTRLKK 306 (398) Q Consensus 228 ~~~~~~~~~~~~~~~~~~~~~~~~~~~lt~~~~~~~~~I~~l-~~~G~T~~~~gl~~a~~~l~~~~~~~~~~~~~~~~~k 306 (398) ++ -.+.+.+...++. ...+..++.|......++..+ .+.|..+...||..|...|..-.. ...+. T Consensus 44 NP-----isqlGii~~rn~~-a~~ls~lsgn~~~hi~~l~~~~~~~G~~SLqN~Le~A~~~L~~~P~--------~~sRE 109 (183) T cd01453 44 NP-----ISQLGIISIKNGR-AEKLTDLTGNPRKHIQALKTARECSGEPSLQNGLEMALESLKHMPS--------HGSRE 109 (183) T ss_pred CC-----CCEEEEEEEECCE-EEEEEECCCCHHHHHHHHHHCCCCCCCHHHHHHHHHHHHHHHHCCC--------CCCEE T ss_conf 97-----4048999994681-6997646899899999998545899981399999999999820898--------78448 Q ss_pred EEEECCCCCCCCCCCCCCCHHHHHHHHHHHHCCCEEEEEEECCCCCHHHHHHHHHCCCCCEEEECCHHHHHHHHHH Q ss_conf 6998158877877777663148999999997898899999547975589999985199818994698999999999 Q gi|254781108|r 307 FVIFITDGENSGASAYQNTLNTLQICEYMRNAGMKIYSVAVSAPPEGQDLLRKCTDSSGQFFAVNDSRELLESFDK 382 (398) Q Consensus 307 ~iillTDG~~~~~~~~~~~~~~~~~c~~~K~~gi~IytIg~~~~~~~~~~l~~cAs~~~~yy~a~~~~~L~~aF~~ 382 (398) ++|++ ..-.+. +..+-...-+.+|+.+|++.+|++... 
-.-.-+-|..++|.|+-+-+...+.+.+.+ T Consensus 110 ILiI~-~Sl~t~-----DpgdI~~ti~~lk~~~IrvsvI~l~aE--v~I~k~l~~~TgG~y~V~lde~H~~~ll~~ 177 (183) T cd01453 110 VLIIF-SSLSTC-----DPGNIYETIDKLKKENIRVSVIGLSAE--MHICKEICKATNGTYKVILDETHLKELLLE 177 (183) T ss_pred EEEEE-CCCCCC-----CCCCHHHHHHHHHHCCCEEEEEEECHH--HHHHHHHHHHHCCEEEEECCHHHHHHHHHH T ss_conf 99997-565347-----976499999999983978999974278--999999999839976875399999999995 No 39 >COG2425 Uncharacterized protein containing a von Willebrand factor type A (vWA) domain [General function prediction only] Probab=97.89 E-value=0.0004 Score=43.11 Aligned_cols=125 Identities=14% Similarity=0.170 Sum_probs=73.7 Q ss_pred EEEEEECCCCCCCCCCCCCCCCHHHHHHHHHCCCCCCCCCHHHHHHHHHHHHCCCCCCCCCCCCCCCCCEEEEECCCCCC Q ss_conf 34654126776543112356899999999735778887445889999999611346666656676666516998158877 Q gi|254781108|r 237 RIGTIAYNIGIVGNQCTPLSNNLNEVKSRLNKLNPYENTNTYPAMHHAYRELYNEKESSHNTIGSTRLKKFVIFITDGEN 316 (398) Q Consensus 237 ~~~~~~~~~~~~~~~~~~lt~~~~~~~~~I~~l~~~G~T~~~~gl~~a~~~l~~~~~~~~~~~~~~~~~k~iillTDG~~ 316 (398) +.....|.+....-...+-..+...+...+.... +|||++..++..+.+-+....-.. -=+|++|||+. T Consensus 310 ~~~~~lF~s~~~~~el~~k~~~~~e~i~fL~~~f-~GGTD~~~~l~~al~~~k~~~~~~----------adiv~ITDg~~ 378 (437) T COG2425 310 DCYVILFDSEVIEYELYEKKIDIEELIEFLSYVF-GGGTDITKALRSALEDLKSRELFK----------ADIVVITDGED 378 (437) T ss_pred CEEEEEECCCCEEEEECCCCCCHHHHHHHHHHHC-CCCCCHHHHHHHHHHHHHCCCCCC----------CCEEEEECCHH T ss_conf 5389995252025550577457999999996506-898885899999999864366567----------77899804376 Q ss_pred CCCCCCCCCHHHHHHHHHHHHCCCEEEEEEECCCCCHHHHHHHHHCCCCC-EEEECCHHHHHHHHHHH Q ss_conf 87777766314899999999789889999954797558999998519981-89946989999999999 Q gi|254781108|r 317 SGASAYQNTLNTLQICEYMRNAGMKIYSVAVSAPPEGQDLLRKCTDSSGQ-FFAVNDSRELLESFDKI 383 (398) Q Consensus 317 ~~~~~~~~~~~~~~~c~~~K~~gi~IytIg~~~~~~~~~~l~~cAs~~~~-yy~a~~~~~L~~aF~~I 383 (398) ... ..-...+-+..|..+.++|+|-++. .+..-+..+. ++ -|.+++. 
+...+++.+ T Consensus 379 ~~~-----~~~~~~v~e~~k~~~~rl~aV~I~~--~~~~~l~~Is---d~~i~~~~~~-~~~kv~~~~ 435 (437) T COG2425 379 ERL-----DDFLRKVKELKKRRNARLHAVLIGG--YGKPGLMRIS---DHIIYRVEPR-DRVKVVKRW 435 (437) T ss_pred HHH-----HHHHHHHHHHHHHHHCEEEEEEECC--CCCCCCCEEC---EEEEEEECCH-HHHHHHHCC T ss_conf 654-----6789999999887543489999647--8986600011---1467872747-776777344 No 40 >COG4548 NorD Nitric oxide reductase activation protein [Inorganic ion transport and metabolism] Probab=97.83 E-value=0.00028 Score=44.07 Aligned_cols=116 Identities=18% Similarity=0.250 Sum_probs=88.7 Q ss_pred HHHHHHHHCCCCCCCCCHHHHHHHHHHHHCCCCCCCCCCCCCCCCCEEEEECCCCCCCCCC-CCCC--CHHHHHHHHHHH Q ss_conf 9999997357788874458899999996113466666566766665169981588778777-7766--314899999999 Q gi|254781108|r 260 NEVKSRLNKLNPYENTNTYPAMHHAYRELYNEKESSHNTIGSTRLKKFVIFITDGENSGAS-AYQN--TLNTLQICEYMR 336 (398) Q Consensus 260 ~~~~~~I~~l~~~G~T~~~~gl~~a~~~l~~~~~~~~~~~~~~~~~k~iillTDG~~~~~~-~~~~--~~~~~~~c~~~K 336 (398) .++...|-+|.|+-.|-.+.++..+-..|... +.-+|.+|++|||+.|.-+ |... -.++..+...++ T Consensus 518 ~~~~~RImALePg~ytR~G~AIR~As~kL~~r----------pq~qklLivlSDGkPnd~d~YEgr~gIeDTr~AV~eaR 587 (637) T COG4548 518 ETVGPRIMALEPGYYTRDGAAIRHASAKLMER----------PQRQKLLIVLSDGKPNDFDHYEGRFGIEDTREAVIEAR 587 (637) T ss_pred CCCCHHHEECCCCCCCCCCHHHHHHHHHHHCC----------CCCCEEEEEECCCCCCCCCCCCCCCCHHHHHHHHHHHH T ss_conf 55331221337664443109999999998347----------41124899944898543443233211153799999998 Q ss_pred HCCCEEEEEEECCCCCHHHHHHHHHCCCCCEEEECCHHHHHHHHHHHHHHHH Q ss_conf 7898899999547975589999985199818994698999999999999875 Q gi|254781108|r 337 NAGMKIYSVAVSAPPEGQDLLRKCTDSSGQFFAVNDSRELLESFDKITDKIQ 388 (398) Q Consensus 337 ~~gi~IytIg~~~~~~~~~~l~~cAs~~~~yy~a~~~~~L~~aF~~I~~~i~ 388 (398) ..||.+|-|.+.-. ..+.+...-. .+.|-.|.+.++|.++.-.|-+++. 
T Consensus 588 k~Gi~VF~Vtld~e--a~~y~p~~fg-qngYa~V~~v~~LP~~L~~lyrkL~ 636 (637) T COG4548 588 KSGIEVFNVTLDRE--AISYLPALFG-QNGYAFVERVAQLPGALPPLYRKLL 636 (637) T ss_pred HCCCEEEEEEECCH--HHHHHHHHHC-CCCEEECCCHHHCCHHHHHHHHHHC T ss_conf 65834799983330--5555288852-6746970240016055799999962 No 41 >pfam05762 VWA_CoxE VWA domain containing CoxE-like protein. This family is annotated by SMART as containing a VWA type domain. The exact function of this family is unknown. It is found as part of a CO oxidising (Cox) system operon is several bacteria. Probab=97.16 E-value=0.011 Score=34.57 Aligned_cols=75 Identities=12% Similarity=0.041 Sum_probs=44.8 Q ss_pred CHHHHHHHHHCCC--CCCCCCHHHHHHHHHHHHCCCCCCCCCCCCCCCCCEEEEECCCCCCCCCCCCCCCHHHHHHHHHH Q ss_conf 9999999973577--88874458899999996113466666566766665169981588778777776631489999999 Q gi|254781108|r 258 NLNEVKSRLNKLN--PYENTNTYPAMHHAYRELYNEKESSHNTIGSTRLKKFVIFITDGENSGASAYQNTLNTLQICEYM 335 (398) Q Consensus 258 ~~~~~~~~I~~l~--~~G~T~~~~gl~~a~~~l~~~~~~~~~~~~~~~~~k~iillTDG~~~~~~~~~~~~~~~~~c~~~ 335 (398) +.......+.... -+|||+++.+|..-.+.... ..-..+-+||+++||..+.+ .......-+.+ T Consensus 111 d~~~al~~~~~~~~~~~GgT~ig~al~~f~~~~~~---------~~l~~~t~ViilsDg~~~~~-----~~~l~~~l~~L 176 (223) T pfam05762 111 DPAEALLRVSARVEDWGGGTRIGAALAYFNELWTR---------PALSRGAVVVLVSDGLERGD-----SEELLAEVARL 176 (223) T ss_pred CHHHHHHHHHHHHCCCCCCCCHHHHHHHHHHHCCC---------CCCCCCCEEEEEECCCCCCC-----HHHHHHHHHHH T ss_conf 99999999998603667997499999999985030---------34678867999723010388-----31899999999 Q ss_pred HHCCCEEEEEE Q ss_conf 97898899999 Q gi|254781108|r 336 RNAGMKIYSVA 346 (398) Q Consensus 336 K~~gi~IytIg 346 (398) +..+.+|.-+. T Consensus 177 ~~~~~rviWLN 187 (223) T pfam05762 177 VRSARRLVWLN 187 (223) T ss_pred HHHCCEEEEEC T ss_conf 98378799989 No 42 >pfam06707 DUF1194 Protein of unknown function (DUF1194). This family consists of several hypothetical Rhizobiales specific proteins of around 270 residues in length. The function of this family is unknown. 
Probab=97.11 E-value=0.018 Score=33.20 Aligned_cols=141 Identities=16% Similarity=0.180 Sum_probs=83.9 Q ss_pred CEEEEEEECCCCCCCCCCCCCC--CCHHH---HHHHHHCCC--CCCCCCHHHHHHHHHHHHCCCCCCCCCCCCCCCCCEE Q ss_conf 2134654126776543112356--89999---999973577--8887445889999999611346666656676666516 Q gi|254781108|r 235 SVRIGTIAYNIGIVGNQCTPLS--NNLNE---VKSRLNKLN--PYENTNTYPAMHHAYRELYNEKESSHNTIGSTRLKKF 307 (398) Q Consensus 235 ~~~~~~~~~~~~~~~~~~~~lt--~~~~~---~~~~I~~l~--~~G~T~~~~gl~~a~~~l~~~~~~~~~~~~~~~~~k~ 307 (398) .+....+.+.....+..++|.+ .+... +...|.... ..+.|.++.+|..+..+|..... ...+|+ T Consensus 48 ~Iava~~eWsg~~~q~~vv~Wt~I~~~~da~a~A~~i~~~~r~~~~~Taig~Al~~a~~l~~~~~~--------~~~Rrv 119 (206) T pfam06707 48 RIAVTYVEWSGPDDQRVVVPWTLIDSAEDAEAFAARLAAAPRRAGRRTAIGGALGFAAALLAQNPY--------ECLRRV 119 (206) T ss_pred EEEEEEEEECCCCCCEEEECCEEECCHHHHHHHHHHHHHCCCCCCCCCHHHHHHHHHHHHHHHCCC--------CCCEEE T ss_conf 189999980278874488699895899999999999975887889997699999999999982998--------761799 Q ss_pred EEECCCCCCCCCCCCCCCHHHHHHHHHHHHCCCEEEEEEECCCCC-----HHHHHHHHHC-CCCCE-EEECCHHHHHHHH Q ss_conf 998158877877777663148999999997898899999547975-----5899999851-99818-9946989999999 Q gi|254781108|r 308 VIFITDGENSGASAYQNTLNTLQICEYMRNAGMKIYSVAVSAPPE-----GQDLLRKCTD-SSGQF-FAVNDSRELLESF 380 (398) Q Consensus 308 iillTDG~~~~~~~~~~~~~~~~~c~~~K~~gi~IytIg~~~~~~-----~~~~l~~cAs-~~~~y-y~a~~~~~L~~aF 380 (398) |=+-.||.||.+-.+. ..+-+.+-..||+|--+.++.+.. -....+.|.- .||-| -.|.+.++-.+++ T Consensus 120 IDiSGDG~nN~G~~p~-----~~ard~~~~~GitINgL~I~~~~~~~~~~L~~yy~~~VIgGpgAFV~~a~~~~df~~Ai 194 (206) T pfam06707 120 IDVSGDGPNNQGFPPV-----TAARDAAVAAGVTINGLAIMGAEAPTSDDLDAYYRDCVIGGPGAFVEPANGFEDFAEAI 194 (206) T ss_pred EEEECCCCCCCCCCCH-----HHHHHHHHHCCEEEEEEEECCCCCCCCHHHHHHHHHCCCCCCCCEEEECCCHHHHHHHH T ss_conf 9960799888999813-----78987677759289667774789876236999997320238984499738879999999 Q ss_pred H-HHHHHHH Q ss_conf 9-9999875 Q gi|254781108|r 381 D-KITDKIQ 388 (398) Q Consensus 381 ~-~I~~~i~ 388 (398) . ++-.+|. 
T Consensus 195 rrKL~rEIa 203 (206) T pfam06707 195 RRKLVREIA 203 (206) T ss_pred HHHHHHHHH T ss_conf 999999873 No 43 >pfam00362 Integrin_beta Integrin, beta chain. Integrins have been found in animals and their homologues have also been found in cyanobacteria, probably due to horizontal gene transfer. The sequences repeats have been trimmed due to an overlap with EGF. Probab=97.00 E-value=0.023 Score=32.56 Aligned_cols=134 Identities=13% Similarity=0.221 Sum_probs=79.6 Q ss_pred CCCCCCCCCHHHHHHHHHCCCCCCCCCHHH-HHHHHHHHHCCCCCCCCCCCCCCCCCEEEEECCCCCC------------ Q ss_conf 311235689999999973577888744588-9999999611346666656676666516998158877------------ Q gi|254781108|r 250 NQCTPLSNNLNEVKSRLNKLNPYENTNTYP-AMHHAYRELYNEKESSHNTIGSTRLKKFVIFITDGEN------------ 316 (398) Q Consensus 250 ~~~~~lt~~~~~~~~~I~~l~~~G~T~~~~-gl~~a~~~l~~~~~~~~~~~~~~~~~k~iillTDG~~------------ 316 (398) .-.++||.|...+...|+...-+|+-...+ ||.--++..--...=.|.. ..++++||.||+.. T Consensus 179 ~~~l~LT~~~~~F~~~v~~q~iSgNlD~PEGGfDAlmQ~aVC~~~IGWR~----~arrllv~~TDa~fH~AgDGkL~GIv 254 (424) T pfam00362 179 RHVLSLTDDTDRFNEEVKKQKISGNLDAPEGGFDAIMQAAVCGEEIGWRN----EARRLLVFTTDAGFHFAGDGKLGGIV 254 (424) T ss_pred EEECCCCCCHHHHHHHHHHCCCCCCCCCCCCCHHHHHHHHHHCCCCCCCC----CCEEEEEEECCCCCCCCCCCCEEEEE T ss_conf 20024677789999998746364677787501777778876142337777----85289999858875135776334353 Q ss_pred --CCCCCCC-----------CCH-HHHHHHHHHHHCCC-EEEEEEECCCCCHHHHHHHHHC-CCCCEE--EECCHHHHHH Q ss_conf --8777776-----------631-48999999997898-8999995479755899999851-998189--9469899999 Q gi|254781108|r 317 --SGASAYQ-----------NTL-NTLQICEYMRNAGM-KIYSVAVSAPPEGQDLLRKCTD-SSGQFF--AVNDSRELLE 378 (398) Q Consensus 317 --~~~~~~~-----------~~~-~~~~~c~~~K~~gi-~IytIg~~~~~~~~~~l~~cAs-~~~~yy--~a~~~~~L~~ 378 (398) |.+..+. ..+ ....+.+++++++| .||.|.= ....+.+..+. =|+-+. 
-..+++.+.+ T Consensus 255 ~PNDg~CHL~~~g~Yt~s~~~DYPSv~ql~~kl~ennI~~IFAVt~----~~~~~Y~~Ls~~i~gs~vg~L~~DSsNIv~ 330 (424) T pfam00362 255 EPNDGQCHLDDNGEYTASTTLDYPSVGQLAEKLSENNINPIFAVTE----NVVDLYKELSELIPGSTVGVLSSDSSNVVQ 330 (424) T ss_pred CCCCCCEEECCCCCCCCCCCCCCCCHHHHHHHHHHCCCEEEEEECH----HHHHHHHHHHHHCCCCEEEEECCCCHHHHH T ss_conf 4888730448987614456678888899999998649259999750----245899999975776525662467502899 Q ss_pred HHHHHHHHHHHCE Q ss_conf 9999999875124 Q gi|254781108|r 379 SFDKITDKIQEQS 391 (398) Q Consensus 379 aF~~I~~~i~~~r 391 (398) ..++--++|++.. T Consensus 331 LI~~aY~ki~S~V 343 (424) T pfam00362 331 LIKDAYNKISSKV 343 (424) T ss_pred HHHHHHHHHHEEE T ss_conf 9999998752289 No 44 >cd01452 VWA_26S_proteasome_subunit 26S proteasome plays a major role in eukaryotic protein breakdown, especially for ubiquitin-tagged proteins. It is an ATP-dependent protease responsible for the bulk of non-lysosomal proteolysis in eukaryotes, often using covalent modification of proteins by ubiquitylation. It consists of a 20S proteolytic core particle (CP) and a 19S regulatory particle (RP). The CP is an ATP-independent peptidase consisting of hydrolyzing activities. One or both ends of the CP carry the RP, which confers both ubiquitin and ATP dependence on the 26S proteasome. The RP's proposed functions include recognition of substrates and translocation of these to the CP for proteolysis. The RP can dissociate into stable lid and base subcomplexes. The base is composed of three non-ATPase subunits (Rpn 1, 2 and 10). A single residue in the vWA domain of Rpn10 has been implicated in stabilizing the lid-base association. 
Probab=96.82 E-value=0.032 Score=31.70 Aligned_cols=153 Identities=10% Similarity=0.196 Sum_probs=98.7 Q ss_pred CCCCHHHHHHHHHHHHHHHHCCCCCCCCCEEEEEEECCCCCCCCCCCCCCCCHHHHHHHHHCCCCCCCCCHHHHHHHHHH Q ss_conf 33410002456655410220276877652134654126776543112356899999999735778887445889999999 Q gi|254781108|r 207 NRKIDVLIESAGNLVNSIQKAIQEKKNLSVRIGTIAYNIGIVGNQCTPLSNNLNEVKSRLNKLNPYENTNTYPAMHHAYR 286 (398) Q Consensus 207 ~~~~~~~~~a~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~lt~~~~~~~~~I~~l~~~G~T~~~~gl~~a~~ 286 (398) .+|.+.-.+|++.++........ . ...+..+... ........+|.|..++...+..+.++|.-+...|++-|.= T Consensus 23 PtR~~AQ~dAvn~i~~~k~~~Np--E---n~VGl~tmag-~~~~Vl~TlT~D~gkiL~~lh~i~~~G~~~~~~~IqiA~L 96 (187) T cd01452 23 PTRFQAQADAVNLICQAKTRSNP--E---NNVGLMTMAG-NSPEVLVTLTNDQGKILSKLHDVQPKGKANFITGIQIAQL 96 (187) T ss_pred CCHHHHHHHHHHHHHHHHHHCCC--C---CCEEEEEECC-CCCEEEEECCCCHHHHHHHCCCCCCCCEECHHHHHHHHHH T ss_conf 71899999999999977751495--3---3113576158-9866898448657889875326771876518879999999 Q ss_pred HHCCCCCCCCCCCCCCCCC-EEEEECCCCCCCCCCCCCCCHHHHHHHHHHHHCCCEEEEEEECCCCCHHHHHHHHH---- Q ss_conf 6113466666566766665-16998158877877777663148999999997898899999547975589999985---- Q gi|254781108|r 287 ELYNEKESSHNTIGSTRLK-KFVIFITDGENSGASAYQNTLNTLQICEYMRNAGMKIYSVAVSAPPEGQDLLRKCT---- 361 (398) Q Consensus 287 ~l~~~~~~~~~~~~~~~~~-k~iillTDG~~~~~~~~~~~~~~~~~c~~~K~~gi~IytIg~~~~~~~~~~l~~cA---- 361 (398) .|..- -++..+ |+|+|. -.-. ..+......+++++|+++|.|=.|.||......+.|+.-. T Consensus 97 ALKHR--------qnk~~~qRIv~FV-gSPi-----~~~ek~l~~laKklKKnnV~vDII~FGe~~~n~~kL~~f~~~vn 162 (187) T cd01452 97 ALKHR--------QNKNQKQRIVAFV-GSPI-----EEDEKDLVKLAKRLKKNNVSVDIINFGEIDDNTEKLTAFIDAVN 162 (187) T ss_pred HHHCC--------CCCCCCEEEEEEE-CCCC-----CCCHHHHHHHHHHHHHCCCCEEEEEECCCCCCHHHHHHHHHHHC T ss_conf 97234--------6777544799997-8987-----55789999999987555853589994688899899999999845 Q ss_pred CC-CCCEEEECCHHHH-HHH Q ss_conf 19-9818994698999-999 Q gi|254781108|r 362 DS-SGQFFAVNDSREL-LES 379 (398) Q Consensus 362 s~-~~~yy~a~~~~~L-~~a 379 (398) +. 
..|+-.++....| .++ T Consensus 163 ~~~~Shlv~ippg~~lLSd~ 182 (187) T cd01452 163 GKDGSHLVSVPPGENLLSDA 182 (187) T ss_pred CCCCCEEEEECCCCCHHHHH T ss_conf 89982599947998645676 No 45 >smart00187 INB Integrin beta subunits (N-terminal portion of extracellular region). Portion of beta integrins that lies N-terminal to their EGF-like repeats. Integrins are cell adhesion molecules that mediate cell-extracellular matrix and cell-cell interactions. They contain both alpha and beta subunits. Beta integrins are proposed to have a von Willebrand factor type-A "insert" or "I" -like domain (although this remains to be confirmed). Probab=96.82 E-value=0.033 Score=31.67 Aligned_cols=134 Identities=13% Similarity=0.212 Sum_probs=78.2 Q ss_pred CCCCCCCCCHHHHHHHHHCCCCCCCCCHHH-HHHHHHHHHCCCCCCCCCCCCCCCCCEEEEECCCCCC------------ Q ss_conf 311235689999999973577888744588-9999999611346666656676666516998158877------------ Q gi|254781108|r 250 NQCTPLSNNLNEVKSRLNKLNPYENTNTYP-AMHHAYRELYNEKESSHNTIGSTRLKKFVIFITDGEN------------ 316 (398) Q Consensus 250 ~~~~~lt~~~~~~~~~I~~l~~~G~T~~~~-gl~~a~~~l~~~~~~~~~~~~~~~~~k~iillTDG~~------------ 316 (398) .-..+||.|...+...|+.-.-+|+-...+ ||.--++..--...=.|.. ..++++||.||+-. 
T Consensus 178 ~n~l~LT~d~~~F~~~V~~q~iSgNlD~PEGGfDAlmQ~avC~~~IGWR~----~arrllVf~TDa~fH~AgDGkL~GIv 253 (423) T smart00187 178 KHVLSLTDDTDEFNEEVKKQRISGNLDAPEGGFDAIMQAAVCTEQIGWRE----DARRLLVFSTDAGFHFAGDGKLAGIV 253 (423) T ss_pred CCCCCCCCCHHHHHHHHHHCCCCCCCCCCCCCHHHHHHHHHHCCCCCCCC----CCEEEEEEECCCCCCCCCCCCEEEEE T ss_conf 11123678889999998625363466887612778888875200037655----74389999837863023676244354 Q ss_pred --CCCCCCCC-----------CH-HHHHHHHHHHHCCC-EEEEEEECCCCCHHHHHHHHHC-CCCCEE--EECCHHHHHH Q ss_conf --87777766-----------31-48999999997898-8999995479755899999851-998189--9469899999 Q gi|254781108|r 317 --SGASAYQN-----------TL-NTLQICEYMRNAGM-KIYSVAVSAPPEGQDLLRKCTD-SSGQFF--AVNDSRELLE 378 (398) Q Consensus 317 --~~~~~~~~-----------~~-~~~~~c~~~K~~gi-~IytIg~~~~~~~~~~l~~cAs-~~~~yy--~a~~~~~L~~ 378 (398) |.+..+.+ .+ ....+..++++++| .||.|.= ....+.+..+. =|+-+. -..+++.+.+ T Consensus 254 ~PNDg~CHLd~~g~Yt~s~~~DYPSi~ql~~kl~ennI~~IFAVT~----~~~~~Y~~Ls~~ipgs~vg~L~~DSsNVv~ 329 (423) T smart00187 254 QPNDGQCHLDNNGEYTMSTTQDYPSIGQLNQKLAENNINPIFAVTK----KQVSLYKELSALIPGSSVGVLSEDSSNVVE 329 (423) T ss_pred CCCCCCEEECCCCCCCCCCCCCCCCHHHHHHHHHHCCCEEEEEECC----CHHHHHHHHHHHCCCCEEEEECCCCHHHHH T ss_conf 3788730327888524456567887899999998539327998522----045699999875775403552457513899 Q ss_pred HHHHHHHHHHHCE Q ss_conf 9999999875124 Q gi|254781108|r 379 SFDKITDKIQEQS 391 (398) Q Consensus 379 aF~~I~~~i~~~r 391 (398) ..++=-++|++.. T Consensus 330 LI~~aY~ki~S~V 342 (423) T smart00187 330 LIKDAYNKISSRV 342 (423) T ss_pred HHHHHHHHHCEEE T ss_conf 9999998750189 No 46 >COG4655 Predicted membrane protein [Function unknown] Probab=96.78 E-value=0.00055 Score=42.28 Aligned_cols=44 Identities=14% Similarity=0.360 Sum_probs=40.6 Q ss_pred CHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHH Q ss_conf 95479899999999999999999999999999999999998651 Q gi|254781108|r 1 MTAIIISVCFLFITYAIDLAHIMYIRNQMQSALDAAVLSGCASI 44 (398) Q Consensus 1 l~Al~l~~ll~~~g~avD~~~~~~~k~~Lq~A~DaA~LAaa~~~ 44 (398) |+|+.+|..++.++++||+++.|..|.+||.+.|-|+++++... 
T Consensus 15 ltal~~~lal~~l~l~VD~G~l~leqR~LQ~~ADlAAiaAAs~~ 58 (565) T COG4655 15 LTALFVPLALATLLLGVDYGYLYLEQRELQRVADLAAIAAASNL 58 (565) T ss_pred HHHHHHHHHHHHHHHEECCCEEEEEHHHHHHHHHHHHHHHHHHC T ss_conf 99999999999886502201244117878887769988877627 No 47 >pfam11775 CobT_C Cobalamin biosynthesis protein CobT VWA domain. This family consists of several bacterial cobalamin biosynthesis (CobT) proteins. CobT is involved in the transformation of precorrin-3 into cobyrinic acid. Probab=96.51 E-value=0.029 Score=32.01 Aligned_cols=86 Identities=17% Similarity=0.228 Sum_probs=51.6 Q ss_pred HHHHHHHHHHHHCCCCCCCCCCCCCCCCCEEEEECCCCCCCCCC-----CCCCCHH-HHHHHHHH-HHCCCEEEEEEECC Q ss_conf 58899999996113466666566766665169981588778777-----7766314-89999999-97898899999547 Q gi|254781108|r 277 TYPAMHHAYRELYNEKESSHNTIGSTRLKKFVIFITDGENSGAS-----AYQNTLN-TLQICEYM-RNAGMKIYSVAVSA 349 (398) Q Consensus 277 ~~~gl~~a~~~l~~~~~~~~~~~~~~~~~k~iillTDG~~~~~~-----~~~~~~~-~~~~c~~~-K~~gi~IytIg~~~ 349 (398) -+++|.||.+-|..- +.-+|++++++||..-... ...+-.+ -...-+.+ +..+|++..||+|. T Consensus 118 DGEAL~wA~~RL~~R----------~e~RkILmViSDGaP~ddst~s~n~~~yL~~hLr~vi~~ie~~~~iel~aIGIgh 187 (220) T pfam11775 118 DGEALAQAAKLFAGR----------MEDKKILLMISDGAPCDDSTLSVAAGDGFEQHLRHIIEEIETLSEIDLIAIGIGH 187 (220) T ss_pred CCHHHHHHHHHHHCC----------CCCCEEEEEECCCCCCCCCCCCCCCCCCHHHHHHHHHHHHHCCCCCEEEEEEECC T ss_conf 719999999998639----------3124699997589967764112587776799999999998506882699987477 Q ss_pred CCCHHHHHHHHHCCCCCEEEECCHHHHHHHH Q ss_conf 9755899999851998189946989999999 Q gi|254781108|r 350 PPEGQDLLRKCTDSSGQFFAVNDSRELLESF 380 (398) Q Consensus 350 ~~~~~~~l~~cAs~~~~yy~a~~~~~L~~aF 380 (398) +. ...+.+++. .+.+.+||.++. T Consensus 188 Dv-~r~yY~~av-------~i~d~eeL~~~~ 210 (220) T pfam11775 188 DA-PRRYYKNAA-------LINDAEELGGAI 210 (220) T ss_pred CC-CHHHHHCCE-------EECCHHHHHHHH T ss_conf 76-866650656-------860388865999 No 48 >pfam04056 Ssl1 Ssl1-like. Ssl1-like proteins are 40kDa subunits of the Transcription factor II H complex. 
Probab=95.82 E-value=0.12 Score=28.25 Aligned_cols=174 Identities=17% Similarity=0.173 Sum_probs=109.7 Q ss_pred CCCCEEEECCCCCCCCCCCCCCCCHHHHHCCCEECCCCCCCCEEEECCCCCCEEECCCCCCCCCHHHHHHHHHHHHHHHH Q ss_conf 54313575055666666766631012331023012688765403101343220112555333410002456655410220 Q gi|254781108|r 147 AISICMVLDVSRSMEDLYLQKHNDNNNMTSNKYLLPPPPKKSFWSKNTTKSKYAPAPAPANRKIDVLIESAGNLVNSIQK 226 (398) Q Consensus 147 ~~di~lvlD~SgSm~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~a~~~~~~~~~~ 226 (398) -..+++|+|.|-+|....-- .+|....-..+..|+..+.+ T Consensus 52 iRhl~iilD~S~aM~e~Dlk----------------------------------------P~R~~~~l~~l~~Fi~efFd 91 (250) T pfam04056 52 IRHLYIVLDCSRAMEEKDLR----------------------------------------PSRFACTIKYLETFVEEFFD 91 (250) T ss_pred EEEEEEEEECCHHHHHCCCC----------------------------------------CCHHHHHHHHHHHHHHHHHH T ss_conf 16899999882767635158----------------------------------------64899999999999999874 Q ss_pred CCCCCCCCCEEEEEEECCCCCCCCCCCCCCCCHHHHHHHHHC---CCCCCCCCHHHHHHHHHHHHCCCCCCCCCCCCCCC Q ss_conf 276877652134654126776543112356899999999735---77888744588999999961134666665667666 Q gi|254781108|r 227 AIQEKKNLSVRIGTIAYNIGIVGNQCTPLSNNLNEVKSRLNK---LNPYENTNTYPAMHHAYRELYNEKESSHNTIGSTR 303 (398) Q Consensus 227 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~lt~~~~~~~~~I~~---l~~~G~T~~~~gl~~a~~~l~~~~~~~~~~~~~~~ 303 (398) .++ -.+.+.+...++. ...+..++.|......++.+ +.+.|.-+...||..+...|...... . 
T Consensus 92 qNP-----iSQlgii~~rn~~-a~~ls~lsgnp~~hi~aL~~~~~~~~~G~pSLqN~Le~a~~~L~~~P~~--------~ 157 (250) T pfam04056 92 QNP-----ISQIGLITCKDGR-AHRLTDLTGNPRVHIKALKSLREAECGGDPSLQNALELARASLKHVPSH--------G 157 (250) T ss_pred CCC-----CCCEEEEEEECCE-EEEEEECCCCHHHHHHHHHHHHCCCCCCCHHHHHHHHHHHHHHHCCCCC--------C T ss_conf 398-----3022799996571-3783325799899999999874069999920899999999887508987--------8 Q ss_pred CCEEEEECCCCCCCCCCCCCCCHHHHHHHHHHHHCCCEEEEEEECCCCCHHHHHHHHHCCCCCEEEECCHHHHHHHHHH Q ss_conf 6516998158877877777663148999999997898899999547975589999985199818994698999999999 Q gi|254781108|r 304 LKKFVIFITDGENSGASAYQNTLNTLQICEYMRNAGMKIYSVAVSAPPEGQDLLRKCTDSSGQFFAVNDSRELLESFDK 382 (398) Q Consensus 304 ~~k~iillTDG~~~~~~~~~~~~~~~~~c~~~K~~gi~IytIg~~~~~~~~~~l~~cAs~~~~yy~a~~~~~L~~aF~~ 382 (398) .+.++|++.==-. .+.-+-...-+.+|+.+|++-.|++.. .-.-.-+-|..++|.|.-+-+..-+.+.+-+ T Consensus 158 sREILii~gSL~T------~DPgdI~~tI~~l~~~~IrvsvI~Laa--Ev~Ick~l~~~T~G~y~V~lde~Hfk~ll~~ 228 (250) T pfam04056 158 SREVLIIFGSLST------CDPGDIYSTIDTLKKEKIRCSVIGLSA--EVFICKELCKATNGTYSVALDETHLKELLLE 228 (250) T ss_pred CEEEEEEEEECCC------CCCCCHHHHHHHHHHCCCEEEEEEECH--HHHHHHHHHHHHCCEEEEECCHHHHHHHHHH T ss_conf 5489999820444------588659999999997590799987338--9999999999749988875699999999995 No 49 >COG2304 Uncharacterized protein containing a von Willebrand factor type A (vWA) domain [General function prediction only] Probab=95.74 E-value=0.13 Score=28.05 Aligned_cols=102 Identities=13% Similarity=0.189 Sum_probs=65.7 Q ss_pred CCCHHHHHHHHHC-CCCCCCCCHHHHHHHHHHHHCCCCCCCCCCCCCCCCCEEEEECCCCCCCCCCCCCCCHHHHHHHHH Q ss_conf 6899999999735-778887445889999999611346666656676666516998158877877777663148999999 Q gi|254781108|r 256 SNNLNEVKSRLNK-LNPYENTNTYPAMHHAYRELYNEKESSHNTIGSTRLKKFVIFITDGENSGASAYQNTLNTLQICEY 334 (398) Q Consensus 256 t~~~~~~~~~I~~-l~~~G~T~~~~gl~~a~~~l~~~~~~~~~~~~~~~~~k~iillTDG~~~~~~~~~~~~~~~~~c~~ 334 (398) -.++..+...|+. +.+.|.|....++.|+...+....... ....+.+.|||+++.... +........+. 
T Consensus 93 ~~~~~~~~~~~i~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~--------~~~~~~~~tdg~~~~~~~--d~~~~~~~~~~ 162 (399) T COG2304 93 ATNKESITAAIDQSLQAGGATAVEASLSLAVELAAKALPRG--------TLNRILLLTDGENNLGLV--DPSRLSALAKL 162 (399) T ss_pred CCCHHHHHHHHHHHCCCCCCCCHHHHHHHHHHHHHHHCCCC--------CCCEEEEECCCCHHCCCC--CHHHHHHHHCC T ss_conf 23227788887640264554305778999999876423545--------532333303641202766--78899998634 Q ss_pred HHHCCCEEEEEEECCCCCHHHHHHHHHC-CCCCEE Q ss_conf 9978988999995479755899999851-998189 Q gi|254781108|r 335 MRNAGMKIYSVAVSAPPEGQDLLRKCTD-SSGQFF 368 (398) Q Consensus 335 ~K~~gi~IytIg~~~~~~~~~~l~~cAs-~~~~yy 368 (398) .-..+|.+.++|++.+.+ .+.+...+. .++.+. T Consensus 163 ~~~~~i~~~~~g~~~~~n-~~~~~~~~~~~~g~l~ 196 (399) T COG2304 163 AAGKGIVLDTLGLGDDVN-EDELTGIAAAANGNLA 196 (399) T ss_pred CCCCCEEEEEECCCCHHH-HHHHHHHHHHCCCCCC T ss_conf 556762786313552267-7777765530366411 No 50 >cd01458 vWA_ku Ku70/Ku80 N-terminal domain. The Ku78 heterodimer (composed of Ku70 and Ku80) contributes to genomic integrity through its ability to bind DNA double-strand breaks (DSBs) in a preferred orientation. DSBs are repaired by either homologous recombination or non-homologous end joining (NHEJ), and Ku facilitates repair by the NHEJ pathway. The Ku heterodimer is required for the accurate repair process that tends to preserve the sequence at the junction. Ku78 is found in all three kingdoms of life. However, only the eukaryotic proteins have a vWA domain fused to them at their N-termini. The vWA domain is not involved in DNA binding but very likely mediates Ku78's interactions with other proteins. Members of this subgroup lack the conserved MIDAS motif. 
Probab=95.63 E-value=0.14 Score=27.81 Aligned_cols=85 Identities=9% Similarity=0.142 Sum_probs=56.2 Q ss_pred CCCCCHHHHHHHHHHHHCCCCCCCCCCCCCCCCCEEEEECCCCCCCCCCCCCCCHHHHHHHHHHHHCCCEEEEEEECCCC Q ss_conf 88744588999999961134666665667666651699815887787777766314899999999789889999954797 Q gi|254781108|r 272 YENTNTYPAMHHAYRELYNEKESSHNTIGSTRLKKFVIFITDGENSGASAYQNTLNTLQICEYMRNAGMKIYSVAVSAPP 351 (398) Q Consensus 272 ~G~T~~~~gl~~a~~~l~~~~~~~~~~~~~~~~~k~iillTDG~~~~~~~~~~~~~~~~~c~~~K~~gi~IytIg~~~~~ 351 (398) .+....+..|..+.+++..... ....|-|+|+||.++...............+..+++.||.|-.+.+..++ T Consensus 103 ~~~~~l~~aL~~~~~~f~~~~~--------~~~~krI~lfTdnD~P~~~~~~~~~~a~~~a~DL~d~gI~iel~~l~~~~ 174 (218) T cd01458 103 SGQVSLSDALWVCLDLFSKGKK--------KKSHKRIFLFTNNDDPHGGDSIKDSQAAVKAEDLKDKGIELELFPLSSPG 174 (218) T ss_pred CCCCCHHHHHHHHHHHHHHCCC--------CCCCCEEEEECCCCCCCCCCHHHHHHHHHHHHHHHHCCCEEEEEECCCCC T ss_conf 8886799999999999985553--------45777799986899899988799999999998898779689998448998 Q ss_pred C---HHHHHHHHHCCC Q ss_conf 5---589999985199 Q gi|254781108|r 352 E---GQDLLRKCTDSS 364 (398) Q Consensus 352 ~---~~~~l~~cAs~~ 364 (398) . ...+.+.+-..+ T Consensus 175 ~~Fd~s~FY~dii~~~ 190 (218) T cd01458 175 KKFDVSKFYKDIIALV 190 (218) T ss_pred CCCCCHHHHHHHHCCC T ss_conf 8688067788752683 No 51 >cd01455 vWA_F11C1-5a_type The von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-Rossmann type of fold. Since its discovery, the vWA domain has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation, and immune defense. In integrins these domains form heterodimers, while in vWF they form multimers. 
The domain has different interaction surfaces, as shown by the variety of molecules it forms complexes with. Ligand binding is in most cases mediated by a metal ion-dependent adhesion site, termed the MIDAS motif, that is a characteristic feature of most, if not all A Probab=95.29 E-value=0.19 Score=27.09 Aligned_cols=115 Identities=17% Similarity=0.300 Sum_probs=71.2 Q ss_pred CCCHHHHHHHHHCCCC-----CCCCCHHHHHHHHHHHHCCCCCCCCCCCCCCCCCEEEEECCCCCCCCCCCCCCCHHHHH Q ss_conf 6899999999735778-----88744588999999961134666665667666651699815887787777766314899 Q gi|254781108|r 256 SNNLNEVKSRLNKLNP-----YENTNTYPAMHHAYRELYNEKESSHNTIGSTRLKKFVIFITDGENSGASAYQNTLNTLQ 330 (398) Q Consensus 256 t~~~~~~~~~I~~l~~-----~G~T~~~~gl~~a~~~l~~~~~~~~~~~~~~~~~k~iillTDG~~~~~~~~~~~~~~~~ 330 (398) .+++..+ ..+..|.+ ..|-++=+++.+|...+-...+. -..++|+++|-.-.. +...+.. T Consensus 68 k~~keRl-~vl~~M~AHsQyC~sGD~Tlea~~~Ai~~l~a~~d~---------De~fVivlSDANL~R-----YgI~p~~ 132 (191) T cd01455 68 KNNKERL-ETLKMMHAHSQFCWSGDHTVEATEFAIKELAAKEDF---------DEAIVIVLSDANLER-----YGIQPKK 132 (191) T ss_pred CCHHHHH-HHHHHHHHHHHHEECCCCHHHHHHHHHHHHHHCCCC---------CCCEEEEECCCCHHH-----CCCCHHH T ss_conf 8668999-999986312010025884489999999987530267---------760899981476443-----1889899 Q ss_pred HHHHHH-HCCCEEEEEEECCCCCHHHHHHHHHCCCCCEEEECCHHHHHHHHHHHHHH Q ss_conf 999999-78988999995479755899999851998189946989999999999998 Q gi|254781108|r 331 ICEYMR-NAGMKIYSVAVSAPPEGQDLLRKCTDSSGQFFAVNDSRELLESFDKITDK 386 (398) Q Consensus 331 ~c~~~K-~~gi~IytIg~~~~~~~~~~l~~cAs~~~~yy~a~~~~~L~~aF~~I~~~ 386 (398) +...++ +.+|.-|.|-++.-.+..+.|+.-= -.|+-|-.-+..+|..+|++|... T Consensus 133 l~~~l~~~p~V~a~~IfIgslg~eA~~l~~~l-P~G~~fVc~dt~~lP~il~qIfts 188 (191) T cd01455 133 LADALAREPNVNAFVIFIGSLSDEADQLQREL-PAGKAFVCMDTSELPHIMQQIFTS 188 (191) T ss_pred HHHHHHCCCCCCEEEEEEECHHHHHHHHHHHC-CCCCEEEECCHHHHHHHHHHHHHH T ss_conf 99997338776689999735167999999748-997417853653678999999887 No 52 >pfam07811 TadE TadE-like protein. 
The members of this family are similar to a region of the protein product of the bacterial tadE locus. In various bacterial species, the tad locus is closely linked to flp-like genes, which encode proteins required for the production of pili involved in adherence to surfaces. It is thought that the tad loci encode proteins that act to assemble or export an Flp pilus in various bacteria. All tad loci but TadA have putative transmembrane regions, and in fact the region in question in this family has a high proportion of hydrophobic amino acid residues. Probab=95.26 E-value=0.048 Score=30.68 Aligned_cols=36 Identities=25% Similarity=0.473 Sum_probs=32.8 Q ss_pred HHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHH Q ss_conf 547989999999999999999999999999999999 Q gi|254781108|r 2 TAIIISVCFLFITYAIDLAHIMYIRNQMQSALDAAV 37 (398) Q Consensus 2 ~Al~l~~ll~~~g~avD~~~~~~~k~~Lq~A~DaA~ 37 (398) +|+++|+++.+....+|++++...+..++.|...++ T Consensus 7 falv~p~~l~l~~~~~~~~~~~~~~~~~~~Aa~~aa 42 (43) T pfam07811 7 falv~p~~l~l~~~~~~~~~~~~~~~~~~~Aa~~aa 42 (43) T ss_pred HHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHC T ss_conf 999999999999999999999999999999998674 No 53 >KOG2807 consensus Probab=94.26 E-value=0.35 Score=25.50 Aligned_cols=173 Identities=16% Similarity=0.176 Sum_probs=108.3 Q ss_pred CCCEEEECCCCCCCCCCCCCCCCHHHHHCCCEECCCCCCCCEEEECCCCCCEEECCCCCCCCCHHHHHHHHHHHHHHHHC Q ss_conf 43135750556666667666310123310230126887654031013432201125553334100024566554102202 Q gi|254781108|r 148 ISICMVLDVSRSMEDLYLQKHNDNNNMTSNKYLLPPPPKKSFWSKNTTKSKYAPAPAPANRKIDVLIESAGNLVNSIQKA 227 (398) Q Consensus 148 ~di~lvlD~SgSm~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~a~~~~~~~~~~~ 227 (398) .-+++|+|.|..|...... .+|.......+..++..+.+. 
T Consensus 61 Rhl~iviD~S~am~e~Df~----------------------------------------P~r~a~~~K~le~Fv~eFFdQ 100 (378) T KOG2807 61 RHLYIVIDCSRAMEEKDFR----------------------------------------PSRFANVIKYLEGFVPEFFDQ 100 (378) T ss_pred EEEEEEEEHHHHHHHCCCC----------------------------------------CHHHHHHHHHHHHHHHHHHCC T ss_conf 3689998734556644478----------------------------------------048999999999999998614 Q ss_pred CCCCCCCCEEEEEEECCCCCCCCCCCCCCCCHHHHHHHHHCCC-CCCCCCHHHHHHHHHHHHCCCCCCCCCCCCCCCCCE Q ss_conf 7687765213465412677654311235689999999973577-888744588999999961134666665667666651 Q gi|254781108|r 228 IQEKKNLSVRIGTIAYNIGIVGNQCTPLSNNLNEVKSRLNKLN-PYENTNTYPAMHHAYRELYNEKESSHNTIGSTRLKK 306 (398) Q Consensus 228 ~~~~~~~~~~~~~~~~~~~~~~~~~~~lt~~~~~~~~~I~~l~-~~G~T~~~~gl~~a~~~l~~~~~~~~~~~~~~~~~k 306 (398) ++ -.+.+++.--++ ..+....++.|...-..+.+.+. ..|+-...-+|..+...|.+.... -.+. T Consensus 101 NP-----iSQigii~~k~g-~A~~lt~ltgnp~~hI~aL~~~~~~~g~fSLqNaLe~a~~~Lk~~p~H--------~sRE 166 (378) T KOG2807 101 NP-----ISQIGIISIKDG-KADRLTDLTGNPRIHIHALKGLTECSGDFSLQNALELAREVLKHMPGH--------VSRE 166 (378) T ss_pred CC-----HHHEEEEEEECC-HHHHHHHHCCCHHHHHHHHHCCCCCCCCHHHHHHHHHHHHHHCCCCCC--------CCEE T ss_conf 96-----203358997055-326888714887889999731224488867887999999985178765--------6327 Q ss_pred EEEECCCCCCCCCCCCCCCHHHHHHHHHHHHCCCEEEEEEECCCCCHHHHHHHH-HCCCCCEEEECCHHHHHHHHHHH Q ss_conf 699815887787777766314899999999789889999954797558999998-51998189946989999999999 Q gi|254781108|r 307 FVIFITDGENSGASAYQNTLNTLQICEYMRNAGMKIYSVAVSAPPEGQDLLRKC-TDSSGQFFAVNDSRELLESFDKI 383 (398) Q Consensus 307 ~iillTDG~~~~~~~~~~~~~~~~~c~~~K~~gi~IytIg~~~~~~~~~~l~~c-As~~~~yy~a~~~~~L~~aF~~I 383 (398) ++|++.---.. +.-+-...-+++|..+|++.+||+... -..-|.. 
-.+.|.|+-+-+..-|.+.|.+- T Consensus 167 VLii~sslsT~------DPgdi~~tI~~lk~~kIRvsvIgLsaE---v~icK~l~kaT~G~Y~V~lDe~HlkeLl~e~ 235 (378) T KOG2807 167 VLIIFSSLSTC------DPGDIYETIDKLKAYKIRVSVIGLSAE---VFICKELCKATGGRYSVALDEGHLKELLLEH 235 (378) T ss_pred EEEEEEEECCC------CCCCHHHHHHHHHHHCEEEEEEEECHH---HHHHHHHHHHHCCEEEEEECHHHHHHHHHHC T ss_conf 99998540355------852099999999861727999850055---8999999886188579875789999999845 No 54 >KOG1226 consensus Probab=93.14 E-value=0.38 Score=25.27 Aligned_cols=62 Identities=13% Similarity=0.202 Sum_probs=44.4 Q ss_pred CCCCCCCCCHHHHHHHHHCCCCCCCCCHHHH-HHHHHHHHCCCCCCCCCCCCCCCCCEEEEECCCCC Q ss_conf 3112356899999999735778887445889-99999961134666665667666651699815887 Q gi|254781108|r 250 NQCTPLSNNLNEVKSRLNKLNPYENTNTYPA-MHHAYRELYNEKESSHNTIGSTRLKKFVIFITDGE 315 (398) Q Consensus 250 ~~~~~lt~~~~~~~~~I~~l~~~G~T~~~~g-l~~a~~~l~~~~~~~~~~~~~~~~~k~iillTDG~ 315 (398) ...++||.|...++..+.+-.-.|+-...+| |.--++..--...=.|. .+.++.+||+||.. T Consensus 210 khvLsLT~~~~~F~~~V~~q~ISgNlDaPEGGfDAimQaavC~~~IGWR----~~a~~LLVF~td~~ 272 (783) T KOG1226 210 KHVLSLTNDAEEFNEEVGKQRISGNLDAPEGGFDAIMQAAVCTEKIGWR----NDATRLLVFSTDAG 272 (783) T ss_pred CEEEECCCCHHHHHHHHHHCEECCCCCCCCCHHHHHHHHHHCCCCCCCC----CCCEEEEEEECCCC T ss_conf 0021068876999998754353268898982298887664146552201----26516899970751 No 55 >COG4867 Uncharacterized protein with a von Willebrand factor type A (vWA) domain [General function prediction only] Probab=92.65 E-value=0.65 Score=23.89 Aligned_cols=104 Identities=19% Similarity=0.233 Sum_probs=65.4 Q ss_pred HHHCCCCC--CCCCHHHHHHHHHHHHCCCCCCCCCCCCCCCCCEEEEECCCCCCCC------C--CCCCCCHHHHH---- Q ss_conf 97357788--8744588999999961134666665667666651699815887787------7--77766314899---- Q gi|254781108|r 265 RLNKLNPY--ENTNTYPAMHHAYRELYNEKESSHNTIGSTRLKKFVIFITDGENSG------A--SAYQNTLNTLQ---- 330 (398) Q Consensus 265 ~I~~l~~~--G~T~~~~gl~~a~~~l~~~~~~~~~~~~~~~~~k~iillTDG~~~~------~--~~~~~~~~~~~---- 330 (398) .+.++.+- -+||.+-||..+-+.|... ...+|.|+++|||+.+. + -++.+...+.. 
T Consensus 521 eLt~l~~v~eqgTNlhhaL~LA~r~l~Rh----------~~~~~~il~vTDGePtAhle~~DG~~~~f~yp~DP~t~~~T 590 (652) T COG4867 521 ELTGLAGVYEQGTNLHHALALAGRHLRRH----------AGAQPVVLVVTDGEPTAHLEDGDGTSVFFDYPPDPRTIAHT 590 (652) T ss_pred HHHCCCCCCCCCCCHHHHHHHHHHHHHHC----------CCCCCEEEEEECCCCCCCCCCCCCCEEECCCCCCHHHHHHH T ss_conf 98248876745554588999999998737----------56576289983798630134789856616899877799898 Q ss_pred --HHHHHHHCCCEEEEEEECCCCCHHHHHHHHHCC-CCCEEEECCHHHHHHH Q ss_conf --999999789889999954797558999998519-9818994698999999 Q gi|254781108|r 331 --ICEYMRNAGMKIYSVAVSAPPEGQDLLRKCTDS-SGQFFAVNDSRELLES 379 (398) Q Consensus 331 --~c~~~K~~gi~IytIg~~~~~~~~~~l~~cAs~-~~~yy~a~~~~~L~~a 379 (398) -.++....||.|-+..++.+..-.-+++..|.- .|..| +++.+.|-.+ T Consensus 591 vr~~d~~~r~G~q~t~FrLg~DpgL~~Fv~qva~rv~G~vv-~pdldglGaa 641 (652) T COG4867 591 VRGFDDMARLGAQVTIFRLGSDPGLARFIDQVARRVQGRVV-VPDLDGLGAA 641 (652) T ss_pred HHHHHHHHHCCCEEEEEEECCCHHHHHHHHHHHHHHCCEEE-ECCCCHHHHH T ss_conf 99888887516413677522777689999999998588488-1382213589 No 56 >pfam09967 DUF2201 Predicted metallopeptidase (DUF2201). This domain, found in various hypothetical bacterial proteins, has no known function. Probab=92.13 E-value=0.7 Score=23.67 Aligned_cols=34 Identities=24% Similarity=0.154 Sum_probs=24.6 Q ss_pred CCCCCCCHHHHHHHHHHHHCCCCCCCCCCCCCCCCCEEEEECCCCCCCCC Q ss_conf 78887445889999999611346666656676666516998158877877 Q gi|254781108|r 270 NPYENTNTYPAMHHAYRELYNEKESSHNTIGSTRLKKFVIFITDGENSGA 319 (398) Q Consensus 270 ~~~G~T~~~~gl~~a~~~l~~~~~~~~~~~~~~~~~k~iillTDG~~~~~ 319 (398) .-+|||.....+.|..+. .| ..+|++|||+-... T Consensus 351 ~GgGGTdf~pvf~~~~~~-~p---------------~~~i~fTDG~g~~p 384 (412) T pfam09967 351 TGGGGTDFRPVLEAALRL-RP---------------DAAVVLTDLEGWPA 384 (412) T ss_pred CCCCCCCCHHHHHHHHHC-CC---------------CEEEEEECCCCCCC T ss_conf 578998784899999826-99---------------76999838998988 No 57 >pfam11443 DUF2828 Domain of unknown function (DUF2828). This is a uncharacterized domain found in eukaryotes and viruses. 
Probab=90.50 E-value=1.1 Score=22.49 Aligned_cols=103 Identities=15% Similarity=0.082 Sum_probs=64.1 Q ss_pred EEEEEECCCCCCCCCCCCCCCCHHHHHHHHHCCCCCCCCCHHHHHHHHHHHHCCCCCCCCCCCCCCCCCEEEEECCCCCC Q ss_conf 34654126776543112356899999999735778887445889999999611346666656676666516998158877 Q gi|254781108|r 237 RIGTIAYNIGIVGNQCTPLSNNLNEVKSRLNKLNPYENTNTYPAMHHAYRELYNEKESSHNTIGSTRLKKFVIFITDGEN 316 (398) Q Consensus 237 ~~~~~~~~~~~~~~~~~~lt~~~~~~~~~I~~l~~~G~T~~~~gl~~a~~~l~~~~~~~~~~~~~~~~~k~iillTDG~~ 316 (398) +....+|+..+....+.. .+...-...+....=+++||....+..-.+..-.+. -...++.|.|+++||=+. T Consensus 368 k~~~iTFs~~P~~~~l~g--~~l~ekv~~~~~~~wg~nTnf~~vf~lIL~~av~~~------l~~eempk~l~VfSDMqF 439 (524) T pfam11443 368 KGKVITFSSNPQLHHIKG--DSLREKVSFVRRMPWGMSTNFQKVFDLILETAVENK------LPQEDMPKRLFVFSDMEF 439 (524) T ss_pred CCCEEEECCCCEEEECCC--CCHHHHHHHHHHCCCCCCCHHHHHHHHHHHHHHHCC------CCHHHCCCEEEEEECCCH T ss_conf 581898449975897079--889999999995867635339999999999999869------997886773899845423 Q ss_pred CCCCCC--CCCHHHHHHHHHHHHCCCEEEEEEE Q ss_conf 877777--6631489999999978988999995 Q gi|254781108|r 317 SGASAY--QNTLNTLQICEYMRNAGMKIYSVAV 347 (398) Q Consensus 317 ~~~~~~--~~~~~~~~~c~~~K~~gi~IytIg~ 347 (398) +..... 
........+++..++.|.++=-|-| T Consensus 440 D~a~~~~~~~~t~~e~i~~~f~~aGY~~P~IVF 472 (524) T pfam11443 440 DQASTGTSGWETDYEAIQRKFKEAGYEVPELVF 472 (524) T ss_pred HHHCCCCCCCCCHHHHHHHHHHHCCCCCCEEEE T ss_conf 120379987623899999999983999983889 No 58 >PRK10997 yieM hypothetical protein; Provisional Probab=89.04 E-value=1.4 Score=21.80 Aligned_cols=96 Identities=13% Similarity=0.099 Sum_probs=56.4 Q ss_pred EEEEEECCCCCCCCCCCCCCCCHHHHHHHHHCCCCCCCCCHHHHHHHHHHHHCCCCCCCCCCCCCCCCCEEEEECCCCCC Q ss_conf 34654126776543112356899999999735778887445889999999611346666656676666516998158877 Q gi|254781108|r 237 RIGTIAYNIGIVGNQCTPLSNNLNEVKSRLNKLNPYENTNTYPAMHHAYRELYNEKESSHNTIGSTRLKKFVIFITDGEN 316 (398) Q Consensus 237 ~~~~~~~~~~~~~~~~~~lt~~~~~~~~~I~~l~~~G~T~~~~gl~~a~~~l~~~~~~~~~~~~~~~~~k~iillTDG~~ 316 (398) +...+.|+.....-.+ ....+...+...+.. +-.|||.+...|..+...+.... +.. -=++++||-.. T Consensus 358 ~CyvI~FSte~~t~eL-t~~~gl~~l~~FL~~-sF~GGTD~~~~L~~~l~~m~~~~---y~~-------ADllvISDFIa 425 (484) T PRK10997 358 RCYIMLFSTEVITYEL-SGPDGLEQAIRFLSQ-SFRGGTDLAPCLRAIIEKMQGRE---WFD-------ADAVVISDFIA 425 (484) T ss_pred CEEEEEECCCEEEEEE-CCCCCHHHHHHHHCC-CCCCCCCHHHHHHHHHHHHHHCC---CCC-------CCEEEECHHCC T ss_conf 8799981265178980-487887999998528-88898457999999999862324---465-------88799712206 Q ss_pred CCCCCCCCCHHHHHHHHHHHHCCCEEEEEEEC Q ss_conf 87777766314899999999789889999954 Q gi|254781108|r 317 SGASAYQNTLNTLQICEYMRNAGMKIYSVAVS 348 (398) Q Consensus 317 ~~~~~~~~~~~~~~~c~~~K~~gi~IytIg~~ 348 (398) ..- ...-...++..-|.++-+.|.|.++ T Consensus 426 ~~l----p~~l~~kv~~lqk~~~nrFhav~is 453 (484) T PRK10997 426 QRL----PDELVAKVKELQRVHQHRFHAVAMS 453 (484) T ss_pred CCC----CHHHHHHHHHHHHHHCCCEEEEECC T ss_conf 569----9999999999998506835888401 No 59 >pfam03731 Ku_N Ku70/Ku80 N-terminal alpha/beta domain. The Ku heterodimer (composed of Ku70 and Ku80) contributes to genomic integrity through its ability to bind DNA double-strand breaks and facilitate repair by the non-homologous end-joining pathway. 
This is the amino-terminal alpha/beta domain. This domain only makes a small contribution to the dimer interface. The domain comprises a six-stranded beta sheet of the Rossmann fold. Probab=85.30 E-value=2.4 Score=20.51 Aligned_cols=114 Identities=10% Similarity=0.120 Sum_probs=59.9 Q ss_pred CCCCCCHHHHHHHHHHHHCCCCCCCCCCCCCCCCCEEEEECCCCCCCCCCCCCCCH-HHHHHHHHHHHCCCEEEEEEECC Q ss_conf 88874458899999996113466666566766665169981588778777776631-48999999997898899999547 Q gi|254781108|r 271 PYENTNTYPAMHHAYRELYNEKESSHNTIGSTRLKKFVIFITDGENSGASAYQNTL-NTLQICEYMRNAGMKIYSVAVSA 349 (398) Q Consensus 271 ~~G~T~~~~gl~~a~~~l~~~~~~~~~~~~~~~~~k~iillTDG~~~~~~~~~~~~-~~~~~c~~~K~~gi~IytIg~~~ 349 (398) +........+|..+.+++..... .....+|.|+|+||.++.......... .....+..+.+.||.|-.+.++. T Consensus 99 ~~~~~~~~~aL~~~~~~~~~~~~------~~k~~~krI~LfTdnD~P~~~~~~~~~~~~~~~a~Dl~d~gi~i~lf~i~~ 172 (222) T pfam03731 99 DSSDGDLLSALWVCMDLLQKQTG------KKKLSKKRILLFTNLDDPFEDDDQLDTIRQKLLAEDLRDEGIEFNLIHLPN 172 (222) T ss_pred CCCCCCHHHHHHHHHHHHHHHHC------CCCCCCCEEEEECCCCCCCCCCCHHHHHHHHHHHCCHHHCCCEEEEEECCC T ss_conf 97766488899999999885103------434578679998999989887517789999998523787497799961498 Q ss_pred C--CCHHHHHHHHHCCC----CCEEEECCHHHHHHHHHHHHHHHHHC Q ss_conf 9--75589999985199----81899469899999999999987512 Q gi|254781108|r 350 P--PEGQDLLRKCTDSS----GQFFAVNDSRELLESFDKITDKIQEQ 390 (398) Q Consensus 350 ~--~~~~~~l~~cAs~~----~~yy~a~~~~~L~~aF~~I~~~i~~~ 390 (398) + -....+++..-.-+ ..+......+.|.+...+|-.+.... T Consensus 173 ~~~f~~~~FY~dii~~~~~~~~~~~~~~~~~~l~~l~~~i~~k~~~k 219 (222) T pfam03731 173 SGGFDPNIFYKEIIKLGEDEENEVMLDLEGEKLEDLLSRLRAKQTAK 219 (222) T ss_pred CCCCCHHHHHHHHCCCCCCCCCCCCCCCCHHHHHHHHHHHHHCCCCC T ss_conf 88887778888642677643456677730316999999997430356 No 60 >pfam07002 Copine Copine. This family represents a conserved region approximately 180 residues long within eukaryotic copines. 
Copines are Ca(2+)-dependent phospholipid-binding proteins that are thought to be involved in membrane-trafficking, and may also be involved in cell division and growth. Probab=81.46 E-value=3.4 Score=19.56 Aligned_cols=76 Identities=22% Similarity=0.240 Sum_probs=48.0 Q ss_pred HHHHHHHHHCCCCCCCCCHHHHHHHHHHHHCCCCCCCCCCCCCCCCCEEEEECCCCCCCCCCCCCCCHHHHHHHHHHHHC Q ss_conf 99999997357788874458899999996113466666566766665169981588778777776631489999999978 Q gi|254781108|r 259 LNEVKSRLNKLNPYENTNTYPAMHHAYRELYNEKESSHNTIGSTRLKKFVIFITDGENSGASAYQNTLNTLQICEYMRNA 338 (398) Q Consensus 259 ~~~~~~~I~~l~~~G~T~~~~gl~~a~~~l~~~~~~~~~~~~~~~~~k~iillTDG~~~~~~~~~~~~~~~~~c~~~K~~ 338 (398) ....+..+..+...|-|+...=+..+.+....... ...-.+++++|||+-+ +...+..+--++-+. T Consensus 70 l~aY~~~~~~v~l~gPT~fapiI~~a~~~a~~~~~--------~~~Y~VLlIiTDG~i~------D~~~Ti~aIv~AS~~ 135 (145) T pfam07002 70 LNAYREALPNLQLSGPTNFAPIIDAAARIAEATQK--------SGQYHVLLIITDGQVT------DMKATIDAIVRASHL 135 (145) T ss_pred HHHHHHHHCEEEECCCCCHHHHHHHHHHHHHHHCC--------CCEEEEEEEECCCCCC------CHHHHHHHHHHHHCC T ss_conf 99999985810654875279999999999997223--------7718999996389735------699999999998279 Q ss_pred CCEEEEEEEC Q ss_conf 9889999954 Q gi|254781108|r 339 GMKIYSVAVS 348 (398) Q Consensus 339 gi~IytIg~~ 348 (398) .+.|-.||+| T Consensus 136 PlSIIiVGVG 145 (145) T pfam07002 136 PLSIIIVGVG 145 (145) T ss_pred CEEEEEEEEC T ss_conf 9279999519 No 61 >pfam11265 Med25_VWA Mediator complex subunit 25 von Willebrand factor type A. The overall function of the full-length Med25 is efficiently to coordinate the transcriptional activation of RAR/RXR (retinoic acid receptor/retinoic X receptor) in higher eukaryotic cells. Human Med25 consists of several domains with different binding properties, the N-terminal, VWA domain which is this one, an SD2 domain from residues 229-381, a PTOV(B) or ACID domain from 395-545, an SD2 domain from residues 564-645 and a C-terminal NR box-containing domain (646-650) from 646-747. 
This VWA or von Willebrand factor type A domain when bound to RAR and the histone acetyltransferase CBP is responsible for recruiting Med1 to the rest of the Mediator complex. Probab=72.26 E-value=6.2 Score=18.00 Aligned_cols=111 Identities=10% Similarity=0.092 Sum_probs=68.3 Q ss_pred CEEEEEEECCCCCCC----CCCCCCCCCHHHHHHHHHCCCCCCC-----CCHHHHHHHHHHHHCCCCCCCCCCCCCCCCC Q ss_conf 213465412677654----3112356899999999735778887-----4458899999996113466666566766665 Q gi|254781108|r 235 SVRIGTIAYNIGIVG----NQCTPLSNNLNEVKSRLNKLNPYEN-----TNTYPAMHHAYRELYNEKESSHNTIGSTRLK 305 (398) Q Consensus 235 ~~~~~~~~~~~~~~~----~~~~~lt~~~~~~~~~I~~l~~~G~-----T~~~~gl~~a~~~l~~~~~~~~~~~~~~~~~ 305 (398) ....+.+.|+...+. -+....+.+...+...++++.-.|| -++.+||..|.+.+.+-...+. ...+.+.+ T Consensus 53 ~~~y~LVvf~t~~~~p~~~~q~~gpt~~~~~fl~~Ld~i~f~GGG~es~A~iaEGLa~ALq~Fdd~~~~r~-~~~~~~~q 131 (219) T pfam11265 53 GTQYSLVVFNTHASYPECLVQRSGPTRDVDEFLQWLSSIPFMGGGFESCALIAEGLAEALQMFDDFSKMRQ-QQGQTDVH 131 (219) T ss_pred CCEEEEEEEECCCCCCHHHHHHCCCCCCHHHHHHHHHHCCCCCCCCHHHHHHHHHHHHHHHHHHHHHHHCC-CCCCCCCC T ss_conf 84589999604687762366606885899999999973501678714668899899999987313665164-68988753 Q ss_pred EEEEECCCCCCC----CCCCCCCCHHHHHHHHHHH--HCCCEEEEEE Q ss_conf 169981588778----7777766314899999999--7898899999 Q gi|254781108|r 306 KFVIFITDGENS----GASAYQNTLNTLQICEYMR--NAGMKIYSVA 346 (398) Q Consensus 306 k~iillTDG~~~----~~~~~~~~~~~~~~c~~~K--~~gi~IytIg 346 (398) |+.||+.---.- ...+.+.......++..+. +++|.+-.|. 
T Consensus 132 khCILIcnSpPy~lP~~e~~~y~g~t~dqla~~~~f~e~~i~lSIis 178 (219) T pfam11265 132 RHCILICNSPPYPLPTVESWQYEGKTSDQLAAAINFAERSISLSIIC 178 (219) T ss_pred EEEEEEECCCCCCCCCCHHHHHCCCCHHHHHHHHCCCCCCEEEEEEC T ss_conf 04899968999768651204343876899998741200462589976 No 62 >KOG4465 consensus Probab=70.96 E-value=6.6 Score=17.82 Aligned_cols=100 Identities=19% Similarity=0.284 Sum_probs=55.0 Q ss_pred EEECCCCCCCCCCCCCC--CCHHHHHHHHHCCCCCCCCCHHHHHHHHHHHHCCCCCCCCCCCCCCCCCEEEEECCCCCCC Q ss_conf 54126776543112356--8999999997357788874458899999996113466666566766665169981588778 Q gi|254781108|r 240 TIAYNIGIVGNQCTPLS--NNLNEVKSRLNKLNPYENTNTYPAMHHAYRELYNEKESSHNTIGSTRLKKFVIFITDGENS 317 (398) Q Consensus 240 ~~~~~~~~~~~~~~~lt--~~~~~~~~~I~~l~~~G~T~~~~gl~~a~~~l~~~~~~~~~~~~~~~~~k~iillTDG~~~ 317 (398) .+.|++.... .|+| +...++..+++++ +.|+|.-+-.|.|+-..-.+ --+.|+.||.+.= T Consensus 471 ~vaf~d~lte---~pftkd~kigqv~~~~nni-~~g~tdcglpm~wa~ennlk--------------~dvfii~tdndt~ 532 (598) T KOG4465 471 CVAFCDELTE---CPFTKDMKIGQVLDAMNNI-DAGGTDCGLPMIWAQENNLK--------------ADVFIIFTDNDTF 532 (598) T ss_pred EEEECCCCCC---CCCCCCCCHHHHHHHHHCC-CCCCCCCCCCEEEHHHCCCC--------------CCEEEEEECCCCC T ss_conf 7886054546---8875534499999998558-88887668720432105887--------------5479998358632 Q ss_pred CCCCCCCCHHHHHHHHHHHHCCC---EEEEEEECC------CCCHHHHHHHHH Q ss_conf 77777663148999999997898---899999547------975589999985 Q gi|254781108|r 318 GASAYQNTLNTLQICEYMRNAGM---KIYSVAVSA------PPEGQDLLRKCT 361 (398) Q Consensus 318 ~~~~~~~~~~~~~~c~~~K~~gi---~IytIg~~~------~~~~~~~l~~cA 361 (398) .+..+ ...++-+.-+..+| ++.+.+... +.++..+|.-|- T Consensus 533 ageih----p~~aik~yrea~~i~dakliv~amqa~d~siadp~dagmldi~g 581 (598) T KOG4465 533 AGEIH----PAEAIKEYREAMDIHDAKLIVCAMQANDFSIADPDDAGMLDICG 581 (598) T ss_pred CCCCC----HHHHHHHHHHHCCCCCCEEEEEEEECCCCEECCCCCCCCEEECC T ss_conf 46667----78999999985489865179998633883224855466332036 No 63 >cd01459 vWA_copine_like VWA Copine: Copines are phospholipid-binding proteins originally identified in paramecium. 
They are found in humans, and orthologues have been found in C. elegans and Arabidopsis thaliana. None have been found in D. melanogaster or S. cerevisiae. Phylogenetic distribution suggests that copines have been lost in some eukaryotes. No functional properties have been assigned to the VWA domains present in copines. The members of this subgroup contain a functional MIDAS motif based on their preferential binding to magnesium and manganese. However, the MIDAS motif is not fully conserved: in most cases the MIDAS consists of the sequence DxTxS instead of the more common DxSxS motif. The C2 domains present in copines mediate phospholipid binding. Probab=69.59 E-value=7.1 Score=17.64 Aligned_cols=86 Identities=19% Similarity=0.174 Sum_probs=52.5 Q ss_pred HHHHHHHHHCCCCCCCCCHHHHHHHHHHHHCCCCCCCCCCCCCCCCCEEEEECCCCCCCCCCCCCCCHHHHHHHHHHHHC Q ss_conf 99999997357788874458899999996113466666566766665169981588778777776631489999999978 Q gi|254781108|r 259 LNEVKSRLNKLNPYENTNTYPAMHHAYRELYNEKESSHNTIGSTRLKKFVIFITDGENSGASAYQNTLNTLQICEYMRNA 338 (398) Q Consensus 259 ~~~~~~~I~~l~~~G~T~~~~gl~~a~~~l~~~~~~~~~~~~~~~~~k~iillTDG~~~~~~~~~~~~~~~~~c~~~K~~ 338 (398) ....+..+..+...|-|+...=+..+.+........ ..--+++++|||+-+. ...+..+--++-+. T Consensus 119 l~aY~~~l~~v~lsGPT~FapiI~~a~~~a~~~~~~--------~~Y~ILlIiTDG~i~D------~~~Ti~aIv~AS~~ 184 (254) T cd01459 119 LRAYREALPNVSLSGPTNFAPVIRAAANIAKASNSQ--------SKYHILLIITDGEITD------MNETIKAIVEASKY 184 (254) T ss_pred HHHHHHHCCCCEECCCCCHHHHHHHHHHHHHHHCCC--------CEEEEEEEECCCCCCC------HHHHHHHHHHHCCC T ss_conf 999998607437648860599999999999973248--------7089999980796367------89999999997179 Q ss_pred CCEEEEEEECCCCCHHHHHHHH Q ss_conf 9889999954797558999998 Q gi|254781108|r 339 GMKIYSVAVSAPPEGQDLLRKC 360 (398) Q Consensus 339 gi~IytIg~~~~~~~~~~l~~c 360 (398) -+-|-.||+| +.+=+.|+.. 
T Consensus 185 PlSIIiVGVG--d~dF~~M~~l 204 (254) T cd01459 185 PLSIVIVGVG--DGPFDAMERL 204 (254) T ss_pred CEEEEEEEEC--CCCHHHHHHH T ss_conf 8179999736--8882778873 No 64 >KOG2884 consensus Probab=68.94 E-value=7.3 Score=17.56 Aligned_cols=154 Identities=10% Similarity=0.173 Sum_probs=95.5 Q ss_pred CCCHHHHHHHHHHHHHHHHCCCCCCCCCEEEEEEECCCCCCCCCCCCCCCCHHHHHHHHHCCCCCCCCCHHHHHHHHHHH Q ss_conf 34100024566554102202768776521346541267765431123568999999997357788874458899999996 Q gi|254781108|r 208 RKIDVLIESAGNLVNSIQKAIQEKKNLSVRIGTIAYNIGIVGNQCTPLSNNLNEVKSRLNKLNPYENTNTYPAMHHAYRE 287 (398) Q Consensus 208 ~~~~~~~~a~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~lt~~~~~~~~~I~~l~~~G~T~~~~gl~~a~~~ 287 (398) .|...-+++++.++..-... +|. ...++.+... ...+.+..+|.++..+......+.+.|.-+...|++-+.-. T Consensus 24 tRf~aQ~daVn~v~~~K~~s--npE---ntvGiitla~-a~~~vLsT~T~d~gkils~lh~i~~~g~~~~~~~i~iA~la 97 (259) T KOG2884 24 TRFQAQKDAVNLVCQAKLRS--NPE---NTVGIITLAN-ASVQVLSTLTSDRGKILSKLHGIQPHGKANFMTGIQIAQLA 97 (259) T ss_pred HHHHHHHHHHHHHHHHHHCC--CCC---CCEEEEECCC-CCCEEEEECCCCCHHHHHHHCCCCCCCCCCHHHHHHHHHHH T ss_conf 88898899999998755027--954---3154686368-98504430343004898773277857761288889999999 Q ss_pred HCCCCCCCCCCCCCCCC-CEEEEECCCCCCCCCCCCCCCHHHHHHHHHHHHCCCEEEEEEECCCCCHHHHHH-HHH---- Q ss_conf 11346666656676666-516998158877877777663148999999997898899999547975589999-985---- Q gi|254781108|r 288 LYNEKESSHNTIGSTRL-KKFVIFITDGENSGASAYQNTLNTLQICEYMRNAGMKIYSVAVSAPPEGQDLLR-KCT---- 361 (398) Q Consensus 288 l~~~~~~~~~~~~~~~~-~k~iillTDG~~~~~~~~~~~~~~~~~c~~~K~~gi~IytIg~~~~~~~~~~l~-~cA---- 361 (398) |..- + +++- .++|+|+.--- ..........+..+|.+++.|=.|-||........|. -+. 
T Consensus 98 lkhR-q-------nk~~~~riVvFvGSpi------~e~ekeLv~~akrlkk~~Vaidii~FGE~~~~~e~l~~fida~N~ 163 (259) T KOG2884 98 LKHR-Q-------NKNQKQRIVVFVGSPI------EESEKELVKLAKRLKKNKVAIDIINFGEAENNTEKLFEFIDALNG 163 (259) T ss_pred HHHH-C-------CCCCCEEEEEEECCCC------HHHHHHHHHHHHHHHHCCEEEEEEEECCCCCCHHHHHHHHHHHCC T ss_conf 8710-3-------8886369999936832------233899999999987548027898724343337889999998538 Q ss_pred -CCCCCEEEECCHHHHHHHHH Q ss_conf -19981899469899999999 Q gi|254781108|r 362 -DSSGQFFAVNDSRELLESFD 381 (398) Q Consensus 362 -s~~~~yy~a~~~~~L~~aF~ 381 (398) +++.|--.++...-|.++.. T Consensus 164 ~~~gshlv~Vppg~~L~d~l~ 184 (259) T KOG2884 164 KGDGSHLVSVPPGPLLSDALL 184 (259) T ss_pred CCCCCEEEEECCCCCHHHHHH T ss_conf 988744898589840777764 No 65 >TIGR02877 spore_yhbH sporulation protein YhbH; InterPro: IPR014230 Proteins in this entry, typified by YhbH from Bacillus subtilis, are found in the genomes of nearly every endospore-forming bacterium, and in no other genomes. The gene in Bacillus subtilis was shown to be a member of the sigma-E regulon, with mutation leading to a sporulation defect .. 
Probab=67.03 E-value=8 Score=17.32 Aligned_cols=102 Identities=13% Similarity=0.187 Sum_probs=60.8 Q ss_pred CCCCCCCCHHHHHHHHHHHHCCC-CCCCCCCCCCCCCCEEEEECCCCCCCCCCCCCCCHH-HHHHHHH----HHHCC-CE Q ss_conf 77888744588999999961134-666665667666651699815887787777766314-8999999----99789-88 Q gi|254781108|r 269 LNPYENTNTYPAMHHAYRELYNE-KESSHNTIGSTRLKKFVIFITDGENSGASAYQNTLN-TLQICEY----MRNAG-MK 341 (398) Q Consensus 269 l~~~G~T~~~~gl~~a~~~l~~~-~~~~~~~~~~~~~~k~iillTDG~~~~~~~~~~~~~-~~~~c~~----~K~~g-i~ 341 (398) ..-+|||-+.+|...|+++...- .|..|+ =+-+-++||+|= ..++.+ ...+..+ ..==| .+ T Consensus 275 kgESGGT~~SS~Y~~ALeiI~~RYnP~~yN--------iY~FHfSDGDNl----~~Dn~Rlav~l~~~L~~~cNL~GYgE 342 (392) T TIGR02877 275 KGESGGTRCSSAYKLALEIIDERYNPARYN--------IYAFHFSDGDNL----SSDNERLAVKLVRKLLEVCNLFGYGE 342 (392) T ss_pred CCCCCCCCHHHHHHHHHHHHHCCCCCCCCC--------CCCCEEECCCCC----CCCCHHHHHHHHHHHHHHHHCCCEEE T ss_conf 256677430167889999974278831006--------565355337788----98864689999999988761111056 Q ss_pred E------EEEEECCCCCHHHHHHH-HHCCCCC-EEEECCHHHHHHHHHHH Q ss_conf 9------99995479755899999-8519981-89946989999999999 Q gi|254781108|r 342 I------YSVAVSAPPEGQDLLRK-CTDSSGQ-FFAVNDSRELLESFDKI 383 (398) Q Consensus 342 I------ytIg~~~~~~~~~~l~~-cAs~~~~-yy~a~~~~~L~~aF~~I 383 (398) | -+..++..++-....++ +- +|.+ ++...+.++|-.|.+.+ T Consensus 343 IEtqPqyls~~Y~y~~tL~~~f~~ei~-~~~Fv~~~I~~K~d~y~ALk~~ 391 (392) T TIGR02877 343 IETQPQYLSMPYGYSSTLKSKFKKEIK-DPNFVLLIIKDKEDVYPALKKF 391 (392) T ss_pred EECCCCEECCCCCCCHHHHHHHHHHHC-CCCCEEEEECCHHHHHHHHHHH T ss_conf 605651103788665577888887405-8883587650414689999983 No 66 >KOG1327 consensus Probab=66.98 E-value=8 Score=17.31 Aligned_cols=89 Identities=20% Similarity=0.266 Sum_probs=54.0 Q ss_pred HHHHHHHHHCCCCCCCCCHHHHHHHHHHHHCCCCCCCCCCCCCCCCCEEEEECCCCCCCCCCCCCCCHHHHHHHHHHHHC Q ss_conf 99999997357788874458899999996113466666566766665169981588778777776631489999999978 Q gi|254781108|r 259 LNEVKSRLNKLNPYENTNTYPAMHHAYRELYNEKESSHNTIGSTRLKKFVIFITDGENSGASAYQNTLNTLQICEYMRNA 338 (398) Q Consensus 259 
~~~~~~~I~~l~~~G~T~~~~gl~~a~~~l~~~~~~~~~~~~~~~~~k~iillTDG~~~~~~~~~~~~~~~~~c~~~K~~ 338 (398) ....+..+-.+.+.|.|+...=+..+.+....... ....| -|++++|||.-+. ...+..+--.|-.. T Consensus 375 l~aY~~~lp~v~l~GPTnFaPII~~va~~a~~~~~-~~~qY------~VLlIitDG~vTd------m~~T~~AIV~AS~l 441 (529) T KOG1327 375 LEAYRKALPNVQLYGPTNFSPIINHVARIAQQSGN-TAGQY------HVLLIITDGVVTD------MKETRDAIVSASDL 441 (529) T ss_pred HHHHHHHCCCCCCCCCCCCHHHHHHHHHHHHHHCC-CCCCE------EEEEEEECCCCCC------HHHHHHHHHHHCCC T ss_conf 99998645664205887618999999999997256-78624------9999993782344------88999999863159 Q ss_pred CCEEEEEEECCCCCHHHHHHHHHC Q ss_conf 988999995479755899999851 Q gi|254781108|r 339 GMKIYSVAVSAPPEGQDLLRKCTD 362 (398) Q Consensus 339 gi~IytIg~~~~~~~~~~l~~cAs 362 (398) -.-|-.||+| +.+-+.|+..=+ T Consensus 442 PlSIIiVGVG--d~df~~M~~lD~ 463 (529) T KOG1327 442 PLSIIIVGVG--DADFDMMRELDG 463 (529) T ss_pred CEEEEEEEEC--CCCHHHHHHHHC T ss_conf 8079999737--978789997506 No 67 >PRK05325 hypothetical protein; Provisional Probab=63.53 E-value=9.4 Score=16.91 Aligned_cols=108 Identities=14% Similarity=0.183 Sum_probs=63.3 Q ss_pred CCCCCCCCHHHHHHHHHHHHCC-CCCCCCCCCCCCCCCEEEEECCCCCCCCCCCCCCCHHHHH-HHHHHHHCCCEEEEEE Q ss_conf 7788874458899999996113-4666665667666651699815887787777766314899-9999997898899999 Q gi|254781108|r 269 LNPYENTNTYPAMHHAYRELYN-EKESSHNTIGSTRLKKFVIFITDGENSGASAYQNTLNTLQ-ICEYMRNAGMKIYSVA 346 (398) Q Consensus 269 l~~~G~T~~~~gl~~a~~~l~~-~~~~~~~~~~~~~~~k~iillTDG~~~~~~~~~~~~~~~~-~c~~~K~~gi~IytIg 346 (398) ...+|||-+..|+..+.+.+.. ..+..| .-+..=.|||+| |..++..... +.+++-.. 
+..|.-+ T Consensus 295 ~~esGGT~vSSa~~l~~eII~~rYpp~~W--------NIY~f~aSDGDN----w~~D~~~~~~~L~~~llp~-~~~f~Y~ 361 (414) T PRK05325 295 SRESGGTIVSSALKLMLEIIEERYPPAEW--------NIYAFQASDGDN----WSDDSPRCVELLVEELLPV-VNYFAYI 361 (414) T ss_pred CCCCCCEEEEHHHHHHHHHHHHHCCHHHC--------EEEEEEECCCCC----CCCCHHHHHHHHHHHHHHH-HHEEEEE T ss_conf 58989848508999999999854887565--------278899137767----5446699999999988887-5368999 Q ss_pred -EC--CCCCHHHHHHH---HHCCCCCE--EEECCHHHHHHHHHHHHHHHHH Q ss_conf -54--79755899999---85199818--9946989999999999998751 Q gi|254781108|r 347 -VS--APPEGQDLLRK---CTDSSGQF--FAVNDSRELLESFDKITDKIQE 389 (398) Q Consensus 347 -~~--~~~~~~~~l~~---cAs~~~~y--y~a~~~~~L~~aF~~I~~~i~~ 389 (398) +. .......+++. ......+| ..+.+.++|-.+|+++..+-.. T Consensus 362 Ei~~~~~~~~~~l~~~y~~~~~~~~~f~~~~I~~~~dI~p~fr~lf~k~~~ 412 (414) T PRK05325 362 EITPRAYYRHQTLWREYEKLQDEFDNFAMQHIRDKADIYPVFRELFKKELA 412 (414) T ss_pred EEECCCCCCCHHHHHHHHHHHCCCCCEEEEEECCHHHHHHHHHHHHHHHHC T ss_conf 971798887568999999975548886799948888889999999855552 No 68 >pfam04285 DUF444 Protein of unknown function (DUF444). Bacterial protein of unknown function. One family member is predicted to contain a von Willebrand factor (vWF) type A domain (Smart:VWA). Probab=56.92 E-value=12 Score=16.22 Aligned_cols=104 Identities=11% Similarity=0.206 Sum_probs=59.4 Q ss_pred CCCCCCCCHHHHHHHHHHHHCC-CCCCCCCCCCCCCCCEEEEECCCCCCCCCCCCCCCHHHHHHH-HHHHHCCCEEEEEE Q ss_conf 7788874458899999996113-466666566766665169981588778777776631489999-99997898899999 Q gi|254781108|r 269 LNPYENTNTYPAMHHAYRELYN-EKESSHNTIGSTRLKKFVIFITDGENSGASAYQNTLNTLQIC-EYMRNAGMKIYSVA 346 (398) Q Consensus 269 l~~~G~T~~~~gl~~a~~~l~~-~~~~~~~~~~~~~~~k~iillTDG~~~~~~~~~~~~~~~~~c-~~~K~~gi~IytIg 346 (398) ...+|||-+..|+..+.+.+.. ..+..|+ -+..=.+||+| |..++.....++ +.+-.. 
...|.-+ T Consensus 307 ~~EsGGT~vSSal~l~~~II~~RYpp~~WN--------iY~f~aSDGDN----w~~D~~~c~~lL~~~llp~-~~~f~Y~ 373 (421) T pfam04285 307 KQESGGTIVSSALELALEIIDERYPPAEWN--------IYAFQASDGDN----WTDDSERCVKLLMNKLMPN-AQYYGYV 373 (421) T ss_pred CCCCCCEEEEHHHHHHHHHHHHHCCHHHCE--------EEEEEECCCCC----CCCCHHHHHHHHHHHHHHH-HHEEEEE T ss_conf 489897587279999999998558864450--------46798037766----4346499999999989887-4158999 Q ss_pred -ECCCCCHHHH---HHHHHCCCCCE--EEECCHHHHHHHHHHHHHH Q ss_conf -5479755899---99985199818--9946989999999999998 Q gi|254781108|r 347 -VSAPPEGQDL---LRKCTDSSGQF--FAVNDSRELLESFDKITDK 386 (398) Q Consensus 347 -~~~~~~~~~~---l~~cAs~~~~y--y~a~~~~~L~~aF~~I~~~ 386 (398) +. +...+.+ ++.......+| ..+.+.+++-.+|+++..+ T Consensus 374 EI~-~~~~~~~~~~y~~~~~~~~nf~~~~I~~k~dIypvfr~lf~k 418 (421) T pfam04285 374 EIT-QRRSHSTWRKYEAVKGVKDNFAMYTIREKDDVYPVFRTLFQK 418 (421) T ss_pred EEC-CCCCCCHHHHHHHHHCCCCCEEEEEECCHHHHHHHHHHHHHH T ss_conf 945-887652799999863248975799958888889999999864 No 69 >LOAD_ku consensus Probab=54.23 E-value=14 Score=15.95 Aligned_cols=11 Identities=18% Similarity=0.528 Sum_probs=4.2 Q ss_pred HHHHHHHHHHH Q ss_conf 99999999999 Q gi|254781108|r 375 ELLESFDKITD 385 (398) Q Consensus 375 ~L~~aF~~I~~ 385 (398) ++.+++++|-+ T Consensus 425 eq~~a~~~li~ 435 (521) T LOAD_ku 425 EQVDTMKEIIE 435 (521) T ss_pred HHHHHHHHHHH T ss_conf 99999999998 No 70 >pfam02431 Chalcone Chalcone-flavanone isomerase. Chalcone-flavanone isomerase is a plant enzyme responsible for the isomerisation of chalcone to naringenin, a key step in the biosynthesis of flavonoids. 
Probab=48.91 E-value=17 Score=15.44 Aligned_cols=12 Identities=8% Similarity=0.011 Sum_probs=4.7 Q ss_pred HHHHHHHHHHHH Q ss_conf 588999999961 Q gi|254781108|r 277 TYPAMHHAYREL 288 (398) Q Consensus 277 ~~~gl~~a~~~l 288 (398) ...+|..-...+ T Consensus 114 ~~~al~~f~~~F 125 (199) T pfam02431 114 EEEALEKFKSAF 125 (199) T ss_pred HHHHHHHHHHHH T ss_conf 999999999985 No 71 >PRK12938 acetyacetyl-CoA reductase; Provisional Probab=48.89 E-value=17 Score=15.44 Aligned_cols=21 Identities=14% Similarity=0.374 Sum_probs=15.0 Q ss_pred HHHHHHHHHHCCCEEEEEEEC Q ss_conf 899999999789889999954 Q gi|254781108|r 328 TLQICEYMRNAGMKIYSVAVS 348 (398) Q Consensus 328 ~~~~c~~~K~~gi~IytIg~~ 348 (398) +..++.+.-..||++.+|.=| T Consensus 164 tk~lA~Ela~~gIrVN~VaPG 184 (246) T PRK12938 164 TMSLAQEVATKGVTVNTVSPG 184 (246) T ss_pred HHHHHHHHHHHCEEEEEEEEC T ss_conf 999999960439899999668 No 72 >COG2984 ABC-type uncharacterized transport system, periplasmic component [General function prediction only] Probab=45.62 E-value=19 Score=15.13 Aligned_cols=85 Identities=11% Similarity=0.203 Sum_probs=45.0 Q ss_pred CCCEEEEECCCCCCCCCCCCCCCHHHHHHHHHHHHCCCEEEEEEECCCCCHHHHHHHHHCCCCCEEEECCHHHHHHHHHH Q ss_conf 66516998158877877777663148999999997898899999547975589999985199818994698999999999 Q gi|254781108|r 303 RLKKFVIFITDGENSGASAYQNTLNTLQICEYMRNAGMKIYSVAVSAPPEGQDLLRKCTDSSGQFFAVNDSRELLESFDK 382 (398) Q Consensus 303 ~~~k~iillTDG~~~~~~~~~~~~~~~~~c~~~K~~gi~IytIg~~~~~~~~~~l~~cAs~~~~yy~a~~~~~L~~aF~~ 382 (398) +.+++.++..=||.|. ....+.+-..+++.|+++++.+.......+..++...+.++-.|-..+. .+..+|+. 
T Consensus 158 nak~Igv~Y~p~E~ns------~~l~eelk~~A~~~Gl~vve~~v~~~ndi~~a~~~l~g~~d~i~~p~dn-~i~s~~~~ 230 (322) T COG2984 158 NAKSIGVLYNPGEANS------VSLVEELKKEARKAGLEVVEAAVTSVNDIPRAVQALLGKVDVIYIPTDN-LIVSAIES 230 (322) T ss_pred CCEEEEEEECCCCCCC------HHHHHHHHHHHHHCCCEEEEEECCCCCCCHHHHHHHCCCCCEEEEECCH-HHHHHHHH T ss_conf 8706999957988660------8999999999987798899983476320089999734787679986606-77888999 Q ss_pred HHHHHHHCEEEE Q ss_conf 999875124573 Q gi|254781108|r 383 ITDKIQEQSVRI 394 (398) Q Consensus 383 I~~~i~~~r~~~ 394 (398) +-+...+.++-| T Consensus 231 l~~~a~~~kiPl 242 (322) T COG2984 231 LLQVANKAKIPL 242 (322) T ss_pred HHHHHHHHCCCE T ss_conf 999988708973 No 73 >pfam11411 DNA_ligase_IV DNA ligase IV. DNA ligase IV along with Xrcc4 functions in DNA non-homologous end joining. This process is required to mend double-strand breaks. Upon ligase binding to an Xrcc4 dimer, the helical tails unwind leading to a flat interaction surface. Probab=42.38 E-value=19 Score=15.07 Aligned_cols=22 Identities=23% Similarity=0.560 Sum_probs=17.6 Q ss_pred CCCEEEECCHHHHHHHHHHHHH Q ss_conf 9818994698999999999999 Q gi|254781108|r 364 SGQFFAVNDSRELLESFDKITD 385 (398) Q Consensus 364 ~~~yy~a~~~~~L~~aF~~I~~ 385 (398) ++.||--++..+|.++|..|.+ T Consensus 14 GDSy~~dt~~~qLk~vF~~i~~ 35 (36) T pfam11411 14 GDSYFVDTDEQQLKDVFHRIKK 35 (36) T ss_pred CCCEEECCCHHHHHHHHHHHCC T ss_conf 5400104858999999987504 No 74 >COG4547 CobT Cobalamin biosynthesis protein CobT (nicotinate-mononucleotide:5, 6-dimethylbenzimidazole phosphoribosyltransferase) [Coenzyme metabolism] Probab=39.93 E-value=23 Score=14.59 Aligned_cols=65 Identities=14% Similarity=0.218 Sum_probs=39.3 Q ss_pred HHHHHHHHHHHCCCCCCCCCCCCCCCCCEEEEECCCCCCCCCCCC-----CCCHHH-HHHHHHHH-HCCCEEEEEEECCC Q ss_conf 889999999611346666656676666516998158877877777-----663148-99999999-78988999995479 Q gi|254781108|r 278 YPAMHHAYRELYNEKESSHNTIGSTRLKKFVIFITDGENSGASAY-----QNTLNT-LQICEYMR-NAGMKIYSVAVSAP 350 (398) Q Consensus 278 
~~gl~~a~~~l~~~~~~~~~~~~~~~~~k~iillTDG~~~~~~~~-----~~~~~~-~~~c~~~K-~~gi~IytIg~~~~ 350 (398) +++++|+-+.|.- -+.-+|++++|+||..-..+.- .+.... .+.-+.|. ...|.+..||++.+ T Consensus 520 GEal~wah~rl~g----------RpEqrkIlmmiSDGAPvddstlsvnpGnylerHLRaVieeIEtrSpveLlAIGighD 589 (620) T COG4547 520 GEALMWAHQRLIG----------RPEQRKILMMISDGAPVDDSTLSVNPGNYLERHLRAVIEEIETRSPVELLAIGIGHD 589 (620) T ss_pred HHHHHHHHHHHHC----------CHHHCEEEEEECCCCCCCCCCCCCCCCHHHHHHHHHHHHHHHCCCCHHHEEEECCCC T ss_conf 1999999998735----------842413788834898555543455886079999999999970378403303312555 Q ss_pred CC Q ss_conf 75 Q gi|254781108|r 351 PE 352 (398) Q Consensus 351 ~~ 352 (398) .. T Consensus 590 vt 591 (620) T COG4547 590 VT 591 (620) T ss_pred CC T ss_conf 30 No 75 >KOG2487 consensus Probab=39.92 E-value=23 Score=14.59 Aligned_cols=46 Identities=17% Similarity=0.307 Sum_probs=34.5 Q ss_pred HHHCCCEEEEEEECCCCCHHHHHHHHHCCCCCEEEECCHHHHHHHHHH Q ss_conf 997898899999547975589999985199818994698999999999 Q gi|254781108|r 335 MRNAGMKIYSVAVSAPPEGQDLLRKCTDSSGQFFAVNDSRELLESFDK 382 (398) Q Consensus 335 ~K~~gi~IytIg~~~~~~~~~~l~~cAs~~~~yy~a~~~~~L~~aF~~ 382 (398) +.+++|.|-++.++.+ ..-+.|.|..++|-|-++++++.|....-. T Consensus 192 AqKq~I~Idv~~l~~~--s~~LqQa~D~TGG~YL~v~~~~gLLqyLlt 237 (314) T KOG2487 192 AQKQNIPIDVVSLGGD--SGFLQQACDITGGDYLHVEKPDGLLQYLLT 237 (314) T ss_pred HHHCCCEEEEEEECCC--CHHHHHHHHHCCCEEEECCCCCHHHHHHHH T ss_conf 8753961589995698--439999875028704714885259999999 No 76 >pfam06508 ExsB ExsB. This family includes putative transcriptional regulators from Bacteria and Archaea. 
Probab=39.35 E-value=23 Score=14.54 Aligned_cols=16 Identities=19% Similarity=0.283 Sum_probs=6.8 Q ss_pred HHHHHHCCCEEEEEEE Q ss_conf 9999978988999995 Q gi|254781108|r 332 CEYMRNAGMKIYSVAV 347 (398) Q Consensus 332 c~~~K~~gi~IytIg~ 347 (398) +.-+...|+.--.+|+ T Consensus 105 ~a~A~~~g~~~I~~G~ 120 (137) T pfam06508 105 ASYAEAIGANDIFIGV 120 (137) T ss_pred HHHHHHCCCCEEEEEE T ss_conf 9999986999799956 No 77 >PRK08643 acetoin reductase; Validated Probab=39.23 E-value=23 Score=14.53 Aligned_cols=22 Identities=14% Similarity=0.190 Sum_probs=16.6 Q ss_pred HHHHHHHHHHCCCEEEEEEECC Q ss_conf 8999999997898899999547 Q gi|254781108|r 328 TLQICEYMRNAGMKIYSVAVSA 349 (398) Q Consensus 328 ~~~~c~~~K~~gi~IytIg~~~ 349 (398) +..++.++-..||++-.|.-|. T Consensus 163 tkslA~ela~~gIrVN~V~PG~ 184 (256) T PRK08643 163 TQTAARDLASEGITVNAYAPGI 184 (256) T ss_pred HHHHHHHHHHHCCEEEEEEECC T ss_conf 9999999877591899996066 No 78 >pfam04392 ABC_sub_bind ABC transporter substrate binding protein. This family contains many hypothetical proteins and some ABC transporter substrate binding proteins. Probab=35.86 E-value=27 Score=14.21 Aligned_cols=14 Identities=7% Similarity=0.195 Sum_probs=6.5 Q ss_pred HHHHHHHCCCEEEE Q ss_conf 99999978988999 Q gi|254781108|r 331 ICEYMRNAGMKIYS 344 (398) Q Consensus 331 ~c~~~K~~gi~Iyt 344 (398) +...+.+.++.+|+ T Consensus 202 i~~~a~~~kiPv~~ 215 (292) T pfam04392 202 VLQEANKAKIPVIT 215 (292) T ss_pred HHHHHHHCCCCEEE T ss_conf 99999974999895 No 79 >TIGR01651 CobT cobaltochelatase, CobT subunit; InterPro: IPR006538 These proteins are CobT subunits of the aerobic cobalt chelatase (aerobic cobalamin biosynthesis pathway). Pseudomonas denitrificans CobT has been experimentally characterised. Aerobic cobalt chelatase consists of three subunits, CobT, CobN (IPR003672 from INTERPRO) and CobS (IPR006537 from INTERPRO). 
Cobalamin (vitamin B12) can be complexed with metal via ATP-dependent reactions (aerobic pathway) (e.g., in Pseudomonas denitrificans) or via ATP-independent reactions (anaerobic pathway) (e.g., in Salmonella typhimurium). The corresponding cobalt chelatases are not homologous. However, aerobic cobalt chelatase subunits CobN and CobS are homologous to Mg-chelatase subunits BchH and BchI, respectively. CobT, too, has been found to be remotely related to the third subunit of Mg-chelatase, BchD (involved in bacteriochlorophyll synthesis, e.g., in Rhodobacter capsulatus). Nomenclature note: CobT of the aerobic pathway (Pseudomonas denitrificans) is not a homolog of CobT of the anaerobic pathway (Salmonella typhimurium, Escherichia coli). Therefore, annotation of any members of this family as nicotinate-mononucleotide--5,6-dimethylbenzimidazole phosphoribosyltransferases is erroneous. Probab=33.23 E-value=29 Score=13.95 Aligned_cols=64 Identities=13% Similarity=0.208 Sum_probs=41.1 Q ss_pred HHHHHHHHHHHCCCCCCCCCCCCCCCCCEEEEECCCCCCCCCCCCC-----CCHHH-HHHHHHHH-HCCCEEEEEEECCC Q ss_conf 8899999996113466666566766665169981588778777776-----63148-99999999-78988999995479 Q gi|254781108|r 278 YPAMHHAYRELYNEKESSHNTIGSTRLKKFVIFITDGENSGASAYQ-----NTLNT-LQICEYMR-NAGMKIYSVAVSAP 350 (398) Q Consensus 278 ~~gl~~a~~~l~~~~~~~~~~~~~~~~~k~iillTDG~~~~~~~~~-----~~~~~-~~~c~~~K-~~gi~IytIg~~~~ 350 (398) +++|.||-+-|.-- +.-+|++++++||-.-.++.-. 
|-..+ .++-+.|- ...|++-.||+|.+ T Consensus 505 GEAL~WAH~RliAR----------~EQRrILM~ISDGAPVDDSTLSVN~G~YLERHLR~VI~~IEtrSPVELlAIGIGHD 574 (606) T TIGR01651 505 GEALLWAHERLIAR----------PEQRRILMMISDGAPVDDSTLSVNPGNYLERHLRAVIEEIETRSPVELLAIGIGHD 574 (606) T ss_pred HHHHHHHHHHHHCC----------HHHCEEEEEEECCCCCCCCCCCCCCCHHHHHHHHHHHHHHCCCCCCEEEEECCCCC T ss_conf 67999886664147----------20475877762788866452354785067899999998623778700023234434 Q ss_pred C Q ss_conf 7 Q gi|254781108|r 351 P 351 (398) Q Consensus 351 ~ 351 (398) - T Consensus 575 V 575 (606) T TIGR01651 575 V 575 (606) T ss_pred C T ss_conf 2 No 80 >cd02007 TPP_DXS Thiamine pyrophosphate (TPP) family, DXS subfamily, TPP-binding module; 1-Deoxy-D-xylulose-5-phosphate synthase (DXS) is a regulatory enzyme of the mevalonate-independent pathway involved in terpenoid biosynthesis. Terpenoids are plant natural products with important pharmaceutical activity. DXS catalyzes a transketolase-type condensation of pyruvate with D-glyceraldehyde-3-phosphate to form 1-deoxy-D-xylulose-5-phosphate (DXP) and carbon dioxide. The formation of DXP leads to the formation of the terpene precursor IPP (isopentenyl diphosphate) and to the formation of thiamine (vitamin B1) and pyridoxal (vitamin B6). Probab=33.15 E-value=29 Score=13.94 Aligned_cols=89 Identities=18% Similarity=0.304 Sum_probs=41.8 Q ss_pred CCHHHHHHHHHHHHCCCCCCCCCCCCCCCCCEEEEECCCCCCCCCCCCCCCHHHHHHHHHHHHCCCEEEEE-EECCCC-- Q ss_conf 44588999999961134666665667666651699815887787777766314899999999789889999-954797-- Q gi|254781108|r 275 TNTYPAMHHAYRELYNEKESSHNTIGSTRLKKFVIFITDGENSGASAYQNTLNTLQICEYMRNAGMKIYSV-AVSAPP-- 351 (398) Q Consensus 275 T~~~~gl~~a~~~l~~~~~~~~~~~~~~~~~k~iillTDG~~~~~~~~~~~~~~~~~c~~~K~~gi~IytI-g~~~~~-- 351 (398) -....|+.+|.+... ...++++++.|||-+.+..|. ....+...|.+-+.|+-- .++... 
T Consensus 81 ls~a~G~Ala~k~~~-------------~~~~v~~l~GDGEl~EG~~wE----A~~~A~~~~~nli~iid~N~~~i~~~~ 143 (195) T cd02007 81 ISAALGMAVARDLKG-------------KKRKVIAVIGDGALTGGMAFE----ALNNAGYLKSNMIVILNDNEMSISPNV 143 (195) T ss_pred HHHHHHHHHHHHHCC-------------CCCEEEEEECCCHHHHHHHHH----HHHHHHHCCCCEEEEEECCCEEECCCC T ss_conf 999999999995679-------------998499997781140189999----999976518986999967987614886 Q ss_pred -CHHHHHHHHHCCCCCEE-EE--CCHHHHHHHHHHH Q ss_conf -55899999851998189-94--6989999999999 Q gi|254781108|r 352 -EGQDLLRKCTDSSGQFF-AV--NDSRELLESFDKI 383 (398) Q Consensus 352 -~~~~~l~~cAs~~~~yy-~a--~~~~~L~~aF~~I 383 (398) .....++.. +=++. .+ .|.++|.++|++. T Consensus 144 ~~~~~~f~af---Gw~~v~~vDGhd~~~i~~al~~a 176 (195) T cd02007 144 GTPGNLFEEL---GFRYIGPVDGHNIEALIKVLKEV 176 (195) T ss_pred CCCCCHHHHC---CCCEECCCCCCCHHHHHHHHHHH T ss_conf 6642368774---88677860789999999999998 No 81 >PRK00907 hypothetical protein; Provisional Probab=32.16 E-value=31 Score=13.84 Aligned_cols=29 Identities=17% Similarity=0.331 Sum_probs=21.5 Q ss_pred HHHHHCCCCCE------EEECCHHHHHHHHHHHHH Q ss_conf 99985199818------994698999999999999 Q gi|254781108|r 357 LRKCTDSSGQF------FAVNDSRELLESFDKITD 385 (398) Q Consensus 357 l~~cAs~~~~y------y~a~~~~~L~~aF~~I~~ 385 (398) ++-=.|+.|+| +.|+|.++|+++++.+.. 
T Consensus 50 i~~r~SS~GkYisvTi~i~AtSReQlD~iYraL~~ 84 (92) T PRK00907 50 ISWKHSSSGKYVSVRIGFRAESREQYDAAHQALRD 84 (92) T ss_pred EEECCCCCCEEEEEEEEEEECCHHHHHHHHHHHHC T ss_conf 67644789827999999997889999999998725 No 82 >COG3419 PilY1 Tfp pilus assembly protein, tip-associated adhesin PilY1 [Cell motility and secretion / Intracellular trafficking and secretion] Probab=32.03 E-value=22 Score=14.74 Aligned_cols=56 Identities=18% Similarity=0.253 Sum_probs=38.9 Q ss_pred HCCCEEEEEEECCCCCHHHHHH--HHHCCCCCEEEECCHHHHHHHHHHHHHHHHHCEEEE Q ss_conf 7898899999547975589999--985199818994698999999999999875124573 Q gi|254781108|r 337 NAGMKIYSVAVSAPPEGQDLLR--KCTDSSGQFFAVNDSRELLESFDKITDKIQEQSVRI 394 (398) Q Consensus 337 ~~gi~IytIg~~~~~~~~~~l~--~cAs~~~~yy~a~~~~~L~~aF~~I~~~i~~~r~~~ 394 (398) +.+.+.+.++.... ...+++ ..++..+.||.++++..+...++.|...|.++-... T Consensus 352 ~~n~k~~~~~~~~~--n~~~~~~~~~~~~~~~~f~~~~a~~~vas~~~~f~~~~~~~~ss 409 (1036) T COG3419 352 NKNTKQPSVASQST--NAEYGADPTASSGAANFFSAPSAESMVASIKRIFSAISGYASSS 409 (1036) T ss_pred CCCCCCCCCCCCCC--CCEECCCCCCCCCCCCEEECCCCCHHHHHHHHHHHHHHHHCCCC T ss_conf 76645652102056--51012586334666523555874216676678877655302677 No 83 >COG4726 PilX Tfp pilus assembly protein PilX [Cell motility and secretion / Intracellular trafficking and secretion] Probab=31.02 E-value=32 Score=13.72 Aligned_cols=39 Identities=18% Similarity=0.246 Sum_probs=18.7 Q ss_pred HHHHHHHHHHHHHHHHHHHHHHH----------HHHHHHHHHHHHHHHH Q ss_conf 89999999999999999999999----------9999999999998651 Q gi|254781108|r 6 ISVCFLFITYAIDLAHIMYIRNQ----------MQSALDAAVLSGCASI 44 (398) Q Consensus 6 l~~ll~~~g~avD~~~~~~~k~~----------Lq~A~DaA~LAaa~~~ 44 (398) |++|++++-+++-.+|-...+.| .++|.++|.-.+...+ T Consensus 21 L~~LvvltLl~l~~~r~~llqeRiSaN~~D~~lAfqaAEaaLr~~E~~i 69 (196) T COG4726 21 LMVLVVLTLLGLAAARSVLLQERISANERDRSLAFQAAEAALREGELQI 69 (196) T ss_pred HHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHH T ss_conf 9999999999999999999898875206778999999999998778998 No 84 >COG3552 CoxE Protein containing von 
Willebrand factor type A (vWA) domain [General function prediction only] Probab=30.91 E-value=14 Score=15.85 Aligned_cols=40 Identities=10% Similarity=0.038 Sum_probs=22.8 Q ss_pred CCCCCCHHHHHHHHHHHHCCCCCCCCCCCCCCCCCEEEEECCCCCCCCC Q ss_conf 8887445889999999611346666656676666516998158877877 Q gi|254781108|r 271 PYENTNTYPAMHHAYRELYNEKESSHNTIGSTRLKKFVIFITDGENSGA 319 (398) Q Consensus 271 ~~G~T~~~~gl~~a~~~l~~~~~~~~~~~~~~~~~k~iillTDG~~~~~ 319 (398) -+|||-++..+.. +...|.+.- -...-+|+++|||-..+. T Consensus 287 w~ggTrig~tl~a----F~~~~~~~~-----L~~gA~VlilsDg~drd~ 326 (395) T COG3552 287 WDGGTRIGNTLAA----FLRRWHGNV-----LSGGAVVLILSDGLDRDD 326 (395) T ss_pred CCCCCCHHHHHHH----HHCCCCCCC-----CCCCEEEEEEECCCCCCC T ss_conf 3577623489999----971544445-----578617999705422478 No 85 >PRK07806 short chain dehydrogenase; Provisional Probab=30.88 E-value=32 Score=13.71 Aligned_cols=18 Identities=17% Similarity=0.228 Sum_probs=10.6 Q ss_pred HHHHHHHHHCCCEEEEEE Q ss_conf 999999997898899999 Q gi|254781108|r 329 LQICEYMRNAGMKIYSVA 346 (398) Q Consensus 329 ~~~c~~~K~~gi~IytIg 346 (398) ..++.++..+||++-.|. T Consensus 165 ~~la~ela~~gIrvn~v~ 182 (248) T PRK07806 165 RALRPELAHAGIGFVVVS 182 (248) T ss_pred HHHHHHHHHHCCEEEEEE T ss_conf 999999776598899972 No 86 >PRK06939 2-amino-3-ketobutyrate coenzyme A ligase; Provisional Probab=30.59 E-value=18 Score=15.18 Aligned_cols=61 Identities=11% Similarity=0.304 Sum_probs=43.5 Q ss_pred CCHHHHHHHHHHHHCCCEEEEEEECCCCCHHHHHHHHHCCCCCEEEECCHHHHHHHHHHHHHHH Q ss_conf 6314899999999789889999954797558999998519981899469899999999999987 Q gi|254781108|r 324 NTLNTLQICEYMRNAGMKIYSVAVSAPPEGQDLLRKCTDSSGQFFAVNDSRELLESFDKITDKI 387 (398) Q Consensus 324 ~~~~~~~~c~~~K~~gi~IytIg~~~~~~~~~~l~~cAs~~~~yy~a~~~~~L~~aF~~I~~~i 387 (398) +......+|+.+.++||-+..|.+-.-..++..++-|.+. 
.|- -.+-+.|.++|+++++++ T Consensus 334 ~~~~a~~~~~~L~~~Gi~v~~ir~PtVp~g~~rlRi~lta-~ht--~~did~lv~~l~~v~~~l 394 (395) T PRK06939 334 DAKLAQEFADRLLEEGVYVIGFSFPVVPKGQARIRTQMSA-AHT--KEQLDRAIDAFEKVGKEL 394 (395) T ss_pred CHHHHHHHHHHHHHCCCEEEEECCCCCCCCCCEEEEEECC-CCC--HHHHHHHHHHHHHHHHHC T ss_conf 9999999999999779748207899889898569988787-799--999999999999999963 No 87 >TIGR01822 2am3keto_CoA 2-amino-3-ketobutyrate coenzyme A ligase; InterPro: IPR011282 This entry represents a narrowly defined clade of animal and bacterial (almost exclusively proteobacterial) 2-amino-3-ketobutyrate--CoA ligase. This enzyme can act in threonine catabolism. The closest homologue from Bacillus subtilis, and sequences like it, may be functionally equivalent but were not included in the model because of difficulty in finding reports of function.; GO: 0008890 glycine C-acetyltransferase activity. Probab=29.62 E-value=34 Score=13.58 Aligned_cols=212 Identities=13% Similarity=0.129 Sum_probs=90.1 Q ss_pred CCCEEEECCCCCCCCCCCCCCCCHHHHHCCCEECCCCCCCCEEEECCCCCCEEE-CCCCCCCCCHHHH----HHHHHHHH Q ss_conf 431357505566666676663101233102301268876540310134322011-2555333410002----45665541 Q gi|254781108|r 148 ISICMVLDVSRSMEDLYLQKHNDNNNMTSNKYLLPPPPKKSFWSKNTTKSKYAP-APAPANRKIDVLI----ESAGNLVN 222 (398) Q Consensus 148 ~di~lvlD~SgSm~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~-~~~~~~~~~~~~~----~a~~~~~~ 222 (398) .-...+-|=.-||+...-. .+........|.+......+=-........... -......++|... .|++.... 
T Consensus 172 r~~Li~tDGvFSMDG~iA~--L~~i~~LA~~Y~ALv~~DecHA~GflG~~GRG~~E~~gv~g~vdI~tgTLGKAlGGA~G 249 (395) T TIGR01822 172 RHRLIATDGVFSMDGTIAP--LDEICDLADKYDALVMVDECHATGFLGATGRGTAELLGVMGRVDIITGTLGKALGGASG 249 (395) T ss_pred CEEEEEECCEEECCCCEEC--HHHHHHHHHHCCCEEEEECCCCCCCCCCCCCCHHHHCCCCCCEEEEECCHHHHHCCCCC T ss_conf 5899864660206873107--67899999855987887443342332789986142137675257740543345114567 Q ss_pred HHHHCCCCC-CCCCEEEEEEECCCCCCCCCCCCCCCCHHHHHHHHHCCCC-CCCCCHHHHHHHHHHHHCCCCCCCCCCCC Q ss_conf 022027687-7652134654126776543112356899999999735778-88744588999999961134666665667 Q gi|254781108|r 223 SIQKAIQEK-KNLSVRIGTIAYNIGIVGNQCTPLSNNLNEVKSRLNKLNP-YENTNTYPAMHHAYRELYNEKESSHNTIG 300 (398) Q Consensus 223 ~~~~~~~~~-~~~~~~~~~~~~~~~~~~~~~~~lt~~~~~~~~~I~~l~~-~G~T~~~~gl~~a~~~l~~~~~~~~~~~~ 300 (398) -+++.+... .--+.|...|.|++...|..+ -..|..|.- .+++.+..-|..=.+.+-......+..-. T Consensus 250 GfTta~~evv~lLRQRsRPYLFSNslpP~vv----------ga~~~vl~~l~~s~~l~~~L~~n~~~FR~~m~AaGFd~~ 319 (395) T TIGR01822 250 GFTTARKEVVELLRQRSRPYLFSNSLPPAVV----------GASIKVLDMLEGSNELRDRLAENTRYFRERMEAAGFDVK 319 (395) T ss_pred CCCCCCHHHHHHHHHCCCCCCHHCCCCHHHH----------HHHHHHHHHHHCHHHHHHHHHHHHHHHHHHHHHCCCCCC T ss_conf 6437746799985312651100046306899----------999999999842057999999999987875342474017 Q ss_pred CCCCCEEE-EECCCCCCCCCCCCCCCHHHHHHHHHHHHCCCEEEEEEECCC--CCHHHHHHHHHCCCCCEEEECCHHHHH Q ss_conf 66665169-981588778777776631489999999978988999995479--755899999851998189946989999 Q gi|254781108|r 301 STRLKKFV-IFITDGENSGASAYQNTLNTLQICEYMRNAGMKIYSVAVSAP--PEGQDLLRKCTDSSGQFFAVNDSRELL 377 (398) Q Consensus 301 ~~~~~k~i-illTDG~~~~~~~~~~~~~~~~~c~~~K~~gi~IytIg~~~~--~~~~~~l~~cAs~~~~yy~a~~~~~L~ 377 (398) +....++ |++-| -.....+++.+-++| ||+|||-+| ..++.=.+-== +..|- -..-+.=. 
T Consensus 320 -p~~hpI~pvMlyD-----------A~~Aq~~A~~Ll~~G--iYv~GFfYPVVPKGqARIRvQ~-SA~H~--~e~l~~a~ 382 (395) T TIGR01822 320 -PAEHPIIPVMLYD-----------AKLAQRFAERLLEEG--IYVIGFFYPVVPKGQARIRVQI-SAVHT--EEQLDRAV 382 (395) T ss_pred -CCCCCCEEEECCC-----------HHHHHHHHHHHHHCC--EEEEEEEECCCCCCCCEEEEEE-CCCCC--HHHHHHHH T ss_conf -8988510232114-----------789999999998779--2899745264378861244410-25689--88999999 Q ss_pred HHHHHHHHHHH Q ss_conf 99999999875 Q gi|254781108|r 378 ESFDKITDKIQ 388 (398) Q Consensus 378 ~aF~~I~~~i~ 388 (398) +||-+|+.++. T Consensus 383 ~AF~~~G~~lg 393 (395) T TIGR01822 383 EAFTRVGRELG 393 (395) T ss_pred HHHHHHHHHHC T ss_conf 99999878626 No 88 >PRK07576 short chain dehydrogenase; Provisional Probab=29.34 E-value=34 Score=13.55 Aligned_cols=19 Identities=11% Similarity=0.221 Sum_probs=13.6 Q ss_pred HHHHHHHHCCCEEEEEEEC Q ss_conf 9999999789889999954 Q gi|254781108|r 330 QICEYMRNAGMKIYSVAVS 348 (398) Q Consensus 330 ~~c~~~K~~gi~IytIg~~ 348 (398) .++.++-.+||++-+|.-| T Consensus 169 ~lA~e~a~~gIrVN~IaPG 187 (260) T PRK07576 169 TLALEWGPEGVRVNSISPG 187 (260) T ss_pred HHHHHHHHCCEEEEEEEEC T ss_conf 9999971339299998347 No 89 >PRK12824 acetoacetyl-CoA reductase; Provisional Probab=29.33 E-value=34 Score=13.55 Aligned_cols=22 Identities=14% Similarity=0.122 Sum_probs=15.4 Q ss_pred HHHHHHHHHHCCCEEEEEEECC Q ss_conf 8999999997898899999547 Q gi|254781108|r 328 TLQICEYMRNAGMKIYSVAVSA 349 (398) Q Consensus 328 ~~~~c~~~K~~gi~IytIg~~~ 349 (398) +..++.+.-..||++-+|.-|. T Consensus 163 tk~lA~E~a~~gIrvN~I~PG~ 184 (245) T PRK12824 163 TKALASEGARYGITVNCIAPGY 184 (245) T ss_pred HHHHHHHHHHHCEEEEEEEECC T ss_conf 9999999725491999997446 No 90 >cd06325 PBP1_ABC_uncharacterized_transporter Type I periplasmic ligand-binding domain of uncharacterized ABC-type transport systems that are predicted to be involved in the uptake of amino acids, peptides, or inorganic ions. 
This group includes the type I periplasmic ligand-binding domain of uncharacterized ABC (ATP-Binding Cassette)-type transport systems that are predicted to be involved in the uptake of amino acids, peptides, or inorganic ions. This subgroup has high sequence similarity to members of the family of hydrophobic amino acid transporters (HAAT), such as leucine/isoleucine/valine binding protein (LIVBP); its ligand specificity has not been determined experimentally. Probab=28.96 E-value=35 Score=13.51 Aligned_cols=30 Identities=10% Similarity=0.096 Sum_probs=13.9 Q ss_pred EEEECCCCCCCCCCCCCCCHHHHHHHHHHHHCCCEEEE Q ss_conf 69981588778777776631489999999978988999 Q gi|254781108|r 307 FVIFITDGENSGASAYQNTLNTLQICEYMRNAGMKIYS 344 (398) Q Consensus 307 ~iillTDG~~~~~~~~~~~~~~~~~c~~~K~~gi~Iyt 344 (398) .+++.+|.... .....+.+.+.+.+|.+|. T Consensus 187 al~~~~d~~v~--------s~~~~i~~~a~~~~iPv~~ 216 (281) T cd06325 187 AIYVPTDNTVA--------SAMEAVVKVANEAKIPVIA 216 (281) T ss_pred EEEEECCCHHH--------HHHHHHHHHHHHCCCCEEE T ss_conf 99991881277--------7999999999874998893 No 91 >TIGR01957 nuoB_fam NADH-quinone oxidoreductase, B subunit; InterPro: IPR006138 Respiratory-chain NADH dehydrogenase (1.6.5.3 from EC) (also known as complex I or NADH-ubiquinone oxidoreductase) is an oligomeric enzymatic complex located in the inner mitochondrial membrane which also seems to exist in the chloroplast and in cyanobacteria (as a NADH-plastoquinone oxidoreductase). Among the 25 to 30 polypeptide subunits of this bioenergetic enzyme complex there is one with a molecular weight of 20 kDa (in mammals), which is a component of the iron-sulphur (IP) fragment of the enzyme. It seems to bind a 4Fe-4S iron-sulphur cluster. The 20 kDa subunit has been found to be nuclear encoded, as a precursor form with a transit peptide in mammals, and in Neurospora crassa. 
It is mitochondrial encoded in Paramecium (gene psbG) and chloroplast encoded in various higher plants (gene ndhK or psbG).; GO: 0008137 NADH dehydrogenase (ubiquinone) activity, 0006120 mitochondrial electron transport NADH to ubiquinone. Probab=28.72 E-value=35 Score=13.49 Aligned_cols=19 Identities=21% Similarity=0.357 Sum_probs=12.4 Q ss_pred CHHHHHHHHHHHHHHHHHC Q ss_conf 9899999999999987512 Q gi|254781108|r 372 DSRELLESFDKITDKIQEQ 390 (398) Q Consensus 372 ~~~~L~~aF~~I~~~i~~~ 390 (398) .++.|..+|=+|-+||.++ T Consensus 128 RPEAL~~g~~~LQ~KI~~~ 146 (146) T TIGR01957 128 RPEALIYGLLKLQKKIKRE 146 (146) T ss_pred CHHHHHHHHHHHHHHHHCC T ss_conf 3789999999999987049 No 92 >PRK06123 short chain dehydrogenase; Provisional Probab=28.56 E-value=35 Score=13.47 Aligned_cols=22 Identities=14% Similarity=0.348 Sum_probs=15.6 Q ss_pred HHHHHHHHHHCCCEEEEEEECC Q ss_conf 8999999997898899999547 Q gi|254781108|r 328 TLQICEYMRNAGMKIYSVAVSA 349 (398) Q Consensus 328 ~~~~c~~~K~~gi~IytIg~~~ 349 (398) +..++.++-..||++-.|.=|. T Consensus 169 tr~lA~ela~~gIrvN~IaPG~ 190 (249) T PRK06123 169 TIGLAKEVAAEGIRVNAVRPGV 190 (249) T ss_pred HHHHHHHHHHCCEEEEEEEECC T ss_conf 9999999865596999998678 No 93 >PRK12826 3-ketoacyl-(acyl-carrier-protein) reductase; Reviewed Probab=27.62 E-value=37 Score=13.37 Aligned_cols=21 Identities=10% Similarity=0.077 Sum_probs=15.0 Q ss_pred HHHHHHHHHCCCEEEEEEECC Q ss_conf 999999997898899999547 Q gi|254781108|r 329 LQICEYMRNAGMKIYSVAVSA 349 (398) Q Consensus 329 ~~~c~~~K~~gi~IytIg~~~ 349 (398) ..++.+....||++..|.-|. 
T Consensus 168 k~lA~e~~~~gIrvN~I~PG~ 188 (253) T PRK12826 168 RALALELARRNITVNSVHPGM 188 (253) T ss_pred HHHHHHHHHHCEEEEEEEECC T ss_conf 999998532095999996287 No 94 >PRK09730 hypothetical protein; Provisional Probab=25.98 E-value=39 Score=13.19 Aligned_cols=55 Identities=7% Similarity=0.030 Sum_probs=26.7 Q ss_pred HHHHHHHHHCCCEEEEEEECCCCC-------HHHHHHHHHC-CCCCEEEECCHHHHHHHHHHHHH Q ss_conf 999999997898899999547975-------5899999851-99818994698999999999999 Q gi|254781108|r 329 LQICEYMRNAGMKIYSVAVSAPPE-------GQDLLRKCTD-SSGQFFAVNDSRELLESFDKITD 385 (398) Q Consensus 329 ~~~c~~~K~~gi~IytIg~~~~~~-------~~~~l~~cAs-~~~~yy~a~~~~~L~~aF~~I~~ 385 (398) ..++.++-..||++-.|.-|.=.. ..+.++.... .|=. ...+++|+.++-.-++. T Consensus 168 k~lA~ela~~gIrVN~IaPG~i~T~~~~~~~~~~~~~~~~~~~Pl~--R~g~pedia~~v~fL~S 230 (247) T PRK09730 168 TGLSLEVAAQGIRVNCVRPGFIYTEMHASGGEPGRVDRVKSNIPMQ--RGGQAEEVAQAIVWLLS 230 (247) T ss_pred HHHHHHHHHCCEEEEEEEECCCCCCCCCCCCCHHHHHHHHHCCCCC--CCCCHHHHHHHHHHHHC T ss_conf 9999997054928999977889785432349969999998579989--98499999999999968 No 95 >PRK12745 3-ketoacyl-(acyl-carrier-protein) reductase; Provisional Probab=25.68 E-value=40 Score=13.15 Aligned_cols=22 Identities=14% Similarity=0.255 Sum_probs=15.9 Q ss_pred HHHHHHHHHHCCCEEEEEEECC Q ss_conf 8999999997898899999547 Q gi|254781108|r 328 TLQICEYMRNAGMKIYSVAVSA 349 (398) Q Consensus 328 ~~~~c~~~K~~gi~IytIg~~~ 349 (398) +..++.++-.+||++-+|.-|. 
T Consensus 174 tr~lA~ela~~gIrVN~IaPG~ 195 (259) T PRK12745 174 AQLFALRLAEEGIGVYEVRPGL 195 (259) T ss_pred HHHHHHHHHHCCEEEEEEEECC T ss_conf 9999999855493999998615 No 96 >PTZ00099 rab6; Provisional Probab=25.40 E-value=37 Score=13.33 Aligned_cols=55 Identities=13% Similarity=0.059 Sum_probs=30.7 Q ss_pred CCCEEEEEEECCCCC-----HHHHHHHHHCCC-CCEEE--ECCHHHHHHHHHHHHHHHHHCEE Q ss_conf 898899999547975-----589999985199-81899--46989999999999998751245 Q gi|254781108|r 338 AGMKIYSVAVSAPPE-----GQDLLRKCTDSS-GQFFA--VNDSRELLESFDKITDKIQEQSV 392 (398) Q Consensus 338 ~gi~IytIg~~~~~~-----~~~~l~~cAs~~-~~yy~--a~~~~~L~~aF~~I~~~i~~~r~ 392 (398) .++.|+-||=-.|-. ..+.-+.-|..- -.||+ |.+...+.++|+.|+++|.++.. T Consensus 84 ~~~~iiLVGNK~DL~~~r~V~~ee~~~~A~~~~~~f~EtSAktg~nV~e~F~~la~~i~~~~~ 146 (176) T PTZ00099 84 KDVIIALVGNKTDLGDLRKVTYEEGMQKAQEYNTMFHETSAKAGHNIKVLFKKIAAKLPNLDN 146 (176) T ss_pred CCCCEEEEEECCCHHHHCCCCHHHHHHHHHHCCCEEEEEECCCCCCHHHHHHHHHHHHCCHHH T ss_conf 877439998556558616859999999999859999998489994989999999998608020 No 97 >COG5242 TFB4 RNA polymerase II transcription initiation/nucleotide excision repair factor TFIIH, subunit TFB4 [Transcription / DNA replication, recombination, and repair] Probab=24.37 E-value=42 Score=13.00 Aligned_cols=95 Identities=18% Similarity=0.286 Sum_probs=49.9 Q ss_pred CCHHHHHHHHHHHHCCCCCCCCCCCCCCCCCEEEEECCCCCCCCCCCCCCCHHHHHHHH-HHHHCCCEEEEEEECCCCCH Q ss_conf 44588999999961134666665667666651699815887787777766314899999-99978988999995479755 Q gi|254781108|r 275 TNTYPAMHHAYRELYNEKESSHNTIGSTRLKKFVIFITDGENSGASAYQNTLNTLQICE-YMRNAGMKIYSVAVSAPPEG 353 (398) Q Consensus 275 T~~~~gl~~a~~~l~~~~~~~~~~~~~~~~~k~iillTDG~~~~~~~~~~~~~~~~~c~-~~K~~gi~IytIg~~~~~~~ 353 (398) +.++.+|..|+.....-.. .+.--.+++||---|..-- ..+...--|- .+.+.||.|-+..++.+ . 
T Consensus 129 ~~v~gams~glay~n~~~~------e~slkSriliftlsG~d~~-----~qYip~mnCiF~Aqk~~ipI~v~~i~g~--s 195 (296) T COG5242 129 YDVGGAMSLGLAYCNHRDE------ETSLKSRILIFTLSGRDRK-----DQYIPYMNCIFAAQKFGIPISVFSIFGN--S 195 (296) T ss_pred EEHHHHHHHHHHHHHHHCC------CCCCCCEEEEEEECCCHHH-----HHHCHHHHHEEEHHHCCCCEEEEEECCC--C T ss_conf 2212256656888753121------1340003899980482056-----5421054401126434981489982486--1 Q ss_pred HHHHHHHHCCCCCEEEECCHHHHHHHHHH Q ss_conf 89999985199818994698999999999 Q gi|254781108|r 354 QDLLRKCTDSSGQFFAVNDSRELLESFDK 382 (398) Q Consensus 354 ~~~l~~cAs~~~~yy~a~~~~~L~~aF~~ 382 (398) .-++|.|..++|-|..+.+++.|....-. T Consensus 196 ~fl~Q~~daTgG~Yl~ve~~eGllqyL~~ 224 (296) T COG5242 196 KFLLQCCDATGGDYLTVEDTEGLLQYLLS 224 (296) T ss_pred HHHHHHHHCCCCEEEEECCCHHHHHHHHH T ss_conf 78998763448726862482069999999 No 98 >PRK09186 flagellin modification protein A; Provisional Probab=24.25 E-value=42 Score=12.99 Aligned_cols=56 Identities=13% Similarity=0.314 Sum_probs=28.1 Q ss_pred HHHHHHHHHCCCEEEEEEECC--CCCHHHHHHHHHCC-CCCEEEECCHHHHHHHHHHHHHH Q ss_conf 999999997898899999547--97558999998519-98189946989999999999998 Q gi|254781108|r 329 LQICEYMRNAGMKIYSVAVSA--PPEGQDLLRKCTDS-SGQFFAVNDSRELLESFDKITDK 386 (398) Q Consensus 329 ~~~c~~~K~~gi~IytIg~~~--~~~~~~~l~~cAs~-~~~yy~a~~~~~L~~aF~~I~~~ 386 (398) ..++.+....||++-+|.-|. +...+.+++..... +.. 
.--+++|+.++.--++.+ T Consensus 179 r~lA~e~a~~gIrVN~VaPG~i~~~~~~~~~~~~~~~~~~~--~~~~p~dia~~v~fL~Sd 237 (255) T PRK09186 179 KYLAKYFKDSNIRVNCVSPGGILDNQPEAFLNAYKKSCNGK--GMLDPEDICGSLVFLLSD 237 (255) T ss_pred HHHHHHHCCCCEEEEEEEECCCCCCCCHHHHHHHHHHCCCC--CCCCHHHHHHHHHHHHCC T ss_conf 99999967589899998557688999899999998635577--998999999999999570 No 99 >PRK05557 fabG 3-ketoacyl-(acyl-carrier-protein) reductase; Validated Probab=24.22 E-value=42 Score=12.99 Aligned_cols=21 Identities=14% Similarity=0.251 Sum_probs=15.3 Q ss_pred HHHHHHHHHCCCEEEEEEECC Q ss_conf 999999997898899999547 Q gi|254781108|r 329 LQICEYMRNAGMKIYSVAVSA 349 (398) Q Consensus 329 ~~~c~~~K~~gi~IytIg~~~ 349 (398) ..++.+....||++-+|.-|. T Consensus 167 ~~lA~e~~~~gIrvN~V~PG~ 187 (248) T PRK05557 167 KSLARELASRGITVNAVAPGF 187 (248) T ss_pred HHHHHHHHHHCEEEEEEEECC T ss_conf 999998533194999997488 No 100 >PRK12825 fabG 3-ketoacyl-(acyl-carrier-protein) reductase; Provisional Probab=24.20 E-value=42 Score=12.99 Aligned_cols=22 Identities=23% Similarity=0.382 Sum_probs=16.3 Q ss_pred HHHHHHHHHHCCCEEEEEEECC Q ss_conf 8999999997898899999547 Q gi|254781108|r 328 TLQICEYMRNAGMKIYSVAVSA 349 (398) Q Consensus 328 ~~~~c~~~K~~gi~IytIg~~~ 349 (398) +..++.++...||++..|.=|. T Consensus 168 ~~~la~e~~~~gIrvN~I~PG~ 189 (250) T PRK12825 168 TKALARELAERGIRVNAVAPGA 189 (250) T ss_pred HHHHHHHHHHHCEEEEEEEECC T ss_conf 9999998604292999997288 No 101 >COG0623 FabI Enoyl-[acyl-carrier-protein] Probab=21.57 E-value=48 Score=12.67 Aligned_cols=19 Identities=5% Similarity=0.364 Sum_probs=12.1 Q ss_pred HHHHHHHHCCCEEEEEEEC Q ss_conf 9999999789889999954 Q gi|254781108|r 330 QICEYMRNAGMKIYSVAVS 348 (398) Q Consensus 330 ~~c~~~K~~gi~IytIg~~ 348 (398) .++..+-.+||++-+|.=| T Consensus 171 yLA~dlG~~gIRVNaISAG 189 (259) T COG0623 171 YLAADLGKEGIRVNAISAG 189 (259) T ss_pred HHHHHHCCCCEEEEEECCC T ss_conf 9999847048377014145 No 102 >TIGR03206 benzo_BadH 2-hydroxycyclohexanecarboxyl-CoA dehydrogenase. 
Members of this protein family are the enzyme 2-hydroxycyclohexanecarboxyl-CoA dehydrogenase. The enzymatic properties were confirmed experimentally in Rhodopseudomonas palustris; the enzyme is homotetrameric, and not sensitive to oxygen. This enzyme is part of a proposed pathway for degradation of benzoyl-CoA to 3-hydroxypimeloyl-CoA that differs from the analogous pathway in Thauera aromatica. It also may occur in degradation of the non-aromatic compound cyclohexane-1-carboxylate. Probab=21.35 E-value=48 Score=12.64 Aligned_cols=32 Identities=13% Similarity=0.178 Sum_probs=20.8 Q ss_pred HHHHHHHHHHCCCEEEEEEECCCCCHHHHHHHHH Q ss_conf 8999999997898899999547975589999985 Q gi|254781108|r 328 TLQICEYMRNAGMKIYSVAVSAPPEGQDLLRKCT 361 (398) Q Consensus 328 ~~~~c~~~K~~gi~IytIg~~~~~~~~~~l~~cA 361 (398) +..+|.++...||++-.|.=|.- ..++++... T Consensus 163 tk~lA~ela~~gIrVNaV~PG~i--~T~~~~~~~ 194 (250) T TIGR03206 163 SKTMAREHARHGITVNVVCPGPT--DTALLDDIC 194 (250) T ss_pred HHHHHHHHCCCCEEEEEEEECCC--CCHHHHHHH T ss_conf 99999996532918999976888--867789876 No 103 >PRK08063 enoyl-(acyl carrier protein) reductase; Provisional Probab=21.32 E-value=48 Score=12.64 Aligned_cols=22 Identities=18% Similarity=0.209 Sum_probs=16.0 Q ss_pred HHHHHHHHHHCCCEEEEEEECC Q ss_conf 8999999997898899999547 Q gi|254781108|r 328 TLQICEYMRNAGMKIYSVAVSA 349 (398) Q Consensus 328 ~~~~c~~~K~~gi~IytIg~~~ 349 (398) +..++.+....||++-.|.-|. 
T Consensus 165 tk~lA~ela~~gIrVNaI~PG~ 186 (250) T PRK08063 165 TRYLAVELAPKGIAVNAVSGGA 186 (250) T ss_pred HHHHHHHHHHHCCEEEEEECCC T ss_conf 9999999725392899986087 No 104 >PRK06855 aminotransferase; Validated Probab=20.92 E-value=49 Score=12.59 Aligned_cols=20 Identities=25% Similarity=0.398 Sum_probs=11.6 Q ss_pred ECCHHHHHHHH-HHHHHHHHH Q ss_conf 46989999999-999998751 Q gi|254781108|r 370 VNDSRELLESF-DKITDKIQE 389 (398) Q Consensus 370 a~~~~~L~~aF-~~I~~~i~~ 389 (398) +++.++|.++- +.|+..|.+ T Consensus 409 l~~~E~~~E~i~r~~~e~~~e 429 (433) T PRK06855 409 LERDEEKFEWIYRTLAEKIGE 429 (433) T ss_pred CCCCHHHHHHHHHHHHHHHHH T ss_conf 797079999999999999999 No 105 >pfam10526 NADH_ub_rd_NUML NADH-ubiquinone reductase complex 1 MLRQ subunit. This subunit appears to be a recent vertebrate addition to the NADH-ubiquinone reductase complex 1, acting within the membrane. Its exact function is not known, but it is highly expressed in muscle and neural tissue, indicative of a role in ATP generation. Probab=20.76 E-value=50 Score=12.57 Aligned_cols=23 Identities=22% Similarity=0.270 Sum_probs=11.5 Q ss_pred HHHHHHHHHHHHHHHHHHHHHHH Q ss_conf 89999999999999999999999 Q gi|254781108|r 6 ISVCFLFITYAIDLAHIMYIRNQ 28 (398) Q Consensus 6 l~~ll~~~g~avD~~~~~~~k~~ 28 (398) |+||++++|++.=.+-.|..|.. T Consensus 13 LIPLfv~ig~g~~gA~~y~~rla 35 (80) T pfam10526 13 LIPLFVFIGAGATGATLYLLRLA 35 (80) T ss_pred HHHHHHHHHCCHHHHHHHHHHHH T ss_conf 32489999413889999999998 No 106 >KOG1575 consensus Probab=20.29 E-value=51 Score=12.51 Aligned_cols=14 Identities=14% Similarity=-0.062 Sum_probs=6.7 Q ss_pred CCCCHHHHHHHHHH Q ss_conf 87445889999999 Q gi|254781108|r 273 ENTNTYPAMHHAYR 286 (398) Q Consensus 273 G~T~~~~gl~~a~~ 286 (398) |-T-...+|.|.+. T Consensus 271 g~T~~qlALawv~~ 284 (336) T KOG1575 271 g~T~~qlALawv~~ 284 (336) T ss_pred CCCHHHHHHHHHHH T ss_conf 99889999999998 Done!