Query gi|254780388|ref|YP_003064801.1| hypothetical protein CLIBASIA_01365 [Candidatus Liberibacter asiaticus str. psy62] Match_columns 458 No_of_seqs 119 out of 522 Neff 10.4 Searched_HMMs 39220 Date Sun May 29 17:53:25 2011 Command /home/congqian_1/programs/hhpred/hhsearch -i 254780388.hhm -d /home/congqian_1/database/cdd/Cdd.hhm No Hit Prob E-value P-value Score SS Cols Query HMM Template HMM 1 PRK13685 hypothetical protein; 99.8 1.9E-17 4.7E-22 129.9 17.1 170 268-452 108-293 (326) 2 cd01467 vWA_BatA_type VWA BatA 99.8 3.4E-17 8.6E-22 128.2 15.2 149 269-437 24-180 (180) 3 cd01465 vWA_subgroup VWA subgr 99.7 2.9E-16 7.4E-21 122.2 14.4 148 271-441 18-170 (170) 4 cd01461 vWA_interalpha_trypsin 99.7 3.4E-15 8.6E-20 115.5 14.7 147 272-441 21-169 (171) 5 cd01466 vWA_C3HC4_type VWA C3H 99.6 5.7E-14 1.4E-18 107.6 13.7 136 271-432 18-155 (155) 6 COG4961 TadG Flp pilus assembl 99.5 6.4E-13 1.6E-17 100.9 13.8 72 4-75 5-76 (185) 7 cd01463 vWA_VGCC_like VWA Volt 99.5 1.2E-12 3E-17 99.2 13.1 152 271-434 31-189 (190) 8 cd01456 vWA_ywmD_type VWA ywmD 99.5 3.2E-12 8.1E-17 96.4 14.3 143 268-435 41-204 (206) 9 cd01480 vWA_collagen_alpha_1-V 99.5 5.4E-12 1.4E-16 95.0 14.8 151 274-441 23-179 (186) 10 cd01475 vWA_Matrilin VWA_Matri 99.4 5.8E-11 1.5E-15 88.4 16.2 153 273-443 22-178 (224) 11 cd01470 vWA_complement_factors 99.4 3E-11 7.5E-16 90.3 13.1 156 274-441 21-197 (198) 12 cd01451 vWA_Magnesium_chelatas 99.3 1.3E-10 3.4E-15 86.1 15.3 150 270-441 18-176 (178) 13 cd01474 vWA_ATR ATR (Anthrax T 99.3 2.8E-10 7.1E-15 84.0 16.8 136 294-444 39-177 (185) 14 pfam00092 VWA von Willebrand f 99.3 1.5E-10 3.8E-15 85.7 15.3 153 274-443 20-176 (177) 15 TIGR00868 hCaCC calcium-activa 99.3 2.2E-11 5.6E-16 91.1 10.5 130 296-447 346-481 (874) 16 cd01464 vWA_subfamily VWA subf 99.3 5E-11 1.3E-15 88.8 12.3 154 271-441 21-174 (176) 17 cd01450 vWFA_subfamily_ECM Von 99.3 3E-10 7.7E-15 83.8 14.0 133 274-426 21-155 (161) 18 cd01469 vWA_integrins_alpha_su 99.2 1.4E-09 3.5E-14 79.6 15.1 152 274-439 21-176 
(177) 19 smart00327 VWA von Willebrand 99.2 2.7E-09 7E-14 77.7 15.4 140 273-430 21-164 (177) 20 cd01472 vWA_collagen von Wille 99.2 2.2E-09 5.6E-14 78.3 13.9 139 274-433 21-163 (164) 21 cd01473 vWA_CTRP CTRP for CS 99.1 7.5E-09 1.9E-13 74.9 16.2 143 291-447 36-189 (192) 22 cd01471 vWA_micronemal_protein 99.1 1.6E-08 4.1E-13 72.8 16.1 151 274-441 22-182 (186) 23 cd00198 vWFA Von Willebrand fa 99.1 4E-09 1E-13 76.6 12.2 132 274-425 21-154 (161) 24 cd01481 vWA_collagen_alpha3-VI 99.1 1.2E-08 3.1E-13 73.6 14.1 141 274-433 21-164 (165) 25 TIGR03436 acidobact_VWFA VWFA- 99.0 3.6E-08 9.2E-13 70.6 15.9 109 332-450 136-255 (296) 26 cd01477 vWA_F09G8-8_type VWA F 99.0 3.3E-08 8.5E-13 70.8 15.1 132 290-429 59-191 (193) 27 cd01482 vWA_collagen_alphaI-XI 98.9 6.1E-08 1.6E-12 69.1 14.0 139 274-433 21-163 (164) 28 cd01462 VWA_YIEM_type VWA YIEM 98.9 3.8E-08 9.7E-13 70.4 12.6 78 322-410 60-137 (152) 29 cd01476 VWA_integrin_invertebr 98.9 5.9E-08 1.5E-12 69.2 13.3 139 274-429 20-161 (163) 30 cd01454 vWA_norD_type norD typ 98.5 8.1E-06 2.1E-10 55.5 13.5 91 325-428 72-171 (174) 31 COG1240 ChlD Mg-chelatase subu 98.2 6.7E-05 1.7E-09 49.7 11.7 151 271-442 97-257 (261) 32 COG4245 TerY Uncharacterized p 98.1 7E-05 1.8E-09 49.6 10.9 164 271-451 21-186 (207) 33 KOG2353 consensus 97.8 0.0004 1E-08 44.7 10.1 165 269-445 241-409 (1104) 34 COG4655 Predicted membrane pro 97.7 3.6E-05 9.1E-10 51.4 4.2 55 14-68 4-58 (565) 35 PRK13406 bchD magnesium chelat 97.2 0.0093 2.4E-07 36.0 11.0 163 257-445 406-583 (584) 36 COG4548 NorD Nitric oxide redu 97.1 0.011 2.7E-07 35.6 10.0 107 323-446 519-633 (637) 37 TIGR02031 BchD-ChlD magnesium 96.9 0.011 2.9E-07 35.5 9.0 158 259-435 517-700 (705) 38 pfam07811 TadE TadE-like prote 96.4 0.011 2.8E-07 35.5 6.3 42 20-61 1-42 (43) 39 cd01453 vWA_transcription_fact 96.2 0.091 2.3E-06 29.7 14.9 153 267-443 22-177 (183) 40 TIGR02442 Cob-chelat-sub cobal 96.2 0.017 4.2E-07 34.4 6.0 153 257-430 513-686 (688) 41 COG2425 Uncharacterized protei 95.5 0.17 4.3E-06 27.9 9.6 
97 325-442 336-433 (437) 42 pfam11775 CobT_C Cobalamin bio 94.9 0.26 6.6E-06 26.8 10.3 85 337-443 116-212 (220) 43 cd01457 vWA_ORF176_type VWA OR 94.7 0.3 7.7E-06 26.3 13.2 81 321-407 66-151 (199) 44 COG3847 Flp Flp pilus assembly 93.3 0.38 9.7E-06 25.7 6.6 27 8-34 2-28 (58) 45 cd01455 vWA_F11C1-5a_type Von 90.9 1 2.7E-05 22.9 13.1 102 333-448 87-189 (191) 46 COG4726 PilX Tfp pilus assembl 90.9 0.81 2.1E-05 23.6 5.9 54 13-67 7-68 (196) 47 pfam04285 DUF444 Protein of un 88.9 1.5 3.8E-05 21.9 10.6 110 331-450 307-421 (421) 48 PRK05325 hypothetical protein; 87.9 1.7 4.4E-05 21.5 7.4 110 331-450 295-412 (414) 49 pfam06707 DUF1194 Protein of u 87.5 1.8 4.7E-05 21.3 12.6 110 332-450 90-203 (206) 50 COG2304 Uncharacterized protei 84.2 2.7 6.9E-05 20.3 6.0 81 321-410 96-180 (399) 51 cd01452 VWA_26S_proteasome_sub 82.0 3.3 8.3E-05 19.7 12.5 160 263-440 18-182 (187) 52 TIGR02877 spore_yhbH sporulati 80.3 3.7 9.5E-05 19.4 9.6 106 331-444 275-391 (392) 53 pfam05762 VWA_CoxE VWA domain 77.9 4.4 0.00011 18.9 6.3 61 333-403 126-186 (223) 54 pfam00362 Integrin_beta Integr 74.6 5.4 0.00014 18.3 10.8 85 310-404 182-299 (424) 55 KOG1226 consensus 60.4 10 0.00027 16.5 5.4 47 139-192 102-148 (783) 56 PRK10913 dipeptide transporter 57.0 12 0.0003 16.1 3.3 35 3-37 15-49 (300) 57 cd04477 RPA1N RPA1N: A subfami 55.8 11 0.00028 16.4 2.7 38 365-402 33-70 (97) 58 smart00187 INB Integrin beta s 53.6 14 0.00035 15.8 10.7 87 309-405 180-299 (423) 59 pfam04057 Rep-A_N Replication 52.8 13 0.00034 15.8 2.7 38 365-402 34-71 (100) 60 COG1991 Uncharacterized conser 52.1 4 0.0001 19.2 -0.0 32 10-41 4-35 (131) 61 pfam04964 Flp_Fap Flp/Fap pili 50.9 15 0.00038 15.5 5.5 21 14-34 1-21 (47) 62 KOG4667 consensus 44.0 19 0.00049 14.8 5.2 10 415-424 218-227 (269) 63 pfam09967 DUF2201 Predicted me 41.5 21 0.00053 14.6 3.2 37 330-383 349-385 (412) 64 pfam07002 Copine Copine. 
This 37.2 24 0.00062 14.2 8.8 73 324-406 73-145 (145) 65 cd03132 GATase1_catalase Type 37.1 24 0.00062 14.2 4.7 20 427-446 119-139 (142) 66 pfam04917 Shufflon_N Bacterial 33.8 27 0.0007 13.8 6.5 45 15-59 2-46 (356) 67 pfam02060 ISK_Channel Slow vol 33.1 28 0.00072 13.7 4.3 37 11-47 31-67 (129) 68 TIGR02600 TIGR02600 Verrucomic 32.8 29 0.00073 13.7 5.9 44 22-65 2-53 (1697) 69 COG1681 FlaB Archaeal flagelli 31.4 30 0.00077 13.6 2.9 22 17-38 1-22 (209) 70 PRK10506 hypothetical protein; 31.3 30 0.00077 13.6 6.0 45 15-59 1-45 (155) 71 PRK06007 fliF flagellar MS-rin 28.8 33 0.00085 13.3 3.0 37 4-40 9-45 (540) 72 COG4867 Uncharacterized protei 28.7 33 0.00085 13.3 6.3 92 334-441 530-642 (652) 73 pfam10526 NADH_ub_rd_NUML NADH 28.5 34 0.00086 13.3 2.6 22 29-50 12-33 (80) 74 smart00310 PTBI Phosphotyrosin 27.9 34 0.00088 13.2 3.0 30 420-449 66-97 (98) 75 pfam00733 Asn_synthase Asparag 27.8 35 0.00088 13.2 6.5 35 416-450 154-191 (195) 76 KOG2487 consensus 27.5 35 0.00089 13.1 4.8 48 392-446 191-240 (314) 77 pfam02174 IRS PTB domain (IRS- 26.8 36 0.00092 13.1 3.2 28 420-447 67-96 (99) 78 cd01203 DOK_PTB Downstream of 23.8 41 0.001 12.7 3.4 31 420-450 67-99 (104) 79 TIGR00873 gnd 6-phosphoglucona 23.6 41 0.0011 12.7 5.1 22 386-409 107-128 (480) 80 pfam04056 Ssl1 Ssl1-like. Ssl1 23.4 42 0.0011 12.7 15.3 154 269-443 73-228 (250) 81 cd01569 PBEF_like pre-B-cell c 22.4 44 0.0011 12.5 3.1 53 368-423 325-382 (407) 82 COG3552 CoxE Protein containin 22.4 29 0.00075 13.6 0.4 54 332-395 286-339 (395) 83 COG4547 CobT Cobalamin biosynt 22.3 44 0.0011 12.5 5.5 82 339-443 519-612 (620) 84 TIGR00385 dsbE periplasmic pro 22.3 42 0.0011 12.7 1.1 18 28-45 1-18 (175) 85 PRK04214 rbn ribonuclease BN/u 22.3 44 0.0011 12.5 4.2 37 5-41 17-57 (411) 86 TIGR00263 trpB tryptophan synt 22.2 44 0.0011 12.5 2.3 44 398-447 317-363 (412) 87 PRK05434 phosphoglyceromutase; 21.9 45 0.0011 12.5 5.8 19 387-405 378-396 (511) 88 pfam06508 ExsB ExsB. 
This fami 21.7 45 0.0011 12.4 4.4 18 390-407 105-122 (137) 89 KOG0394 consensus 21.7 45 0.0011 12.4 3.4 32 416-447 141-176 (210) 90 PRK09198 putative nicotinate p 20.8 47 0.0012 12.3 2.1 39 368-409 327-365 (462) 91 KOG2884 consensus 20.7 47 0.0012 12.3 12.4 161 264-442 19-184 (259) 92 pfam03850 Tfb4 Transcription f 20.2 48 0.0012 12.2 5.9 53 387-445 162-216 (271) No 1 >PRK13685 hypothetical protein; Provisional Probab=99.79 E-value=1.9e-17 Score=129.87 Aligned_cols=170 Identities=19% Similarity=0.178 Sum_probs=125.8 Q ss_pred CCCCHHHHHHHHHHHCCCCCCCCCCCCCEEEEEEECCCCCCCCCCCCCHHHHHHHHHHHHHHCCCCCCCCCCHHHHHHHH Q ss_conf 43306677766522112346775666520121100677665565576812100566665641256777776458899999 Q gi|254780388|r 268 IKKKHLVRDALASVIRSIKKIDNVNDTVRMGATFFNDRVISDPSFSWGVHKLIRTIVKTFAIDENEMGSTAINDAMQTAY 347 (458) Q Consensus 268 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~i~~~~~~g~T~~~~gl~~g~ 347 (458) .+|++..+..+..++.... ...+.+...|....+..+|++.+. ..+...++.+.+..+|.++.|+..+. T Consensus 108 p~Rl~~ak~~~~~fi~~~~------~~driGlv~Fa~~a~~~~plT~D~-----~~~~~~l~~l~~~~~taiG~ai~~Al 176 (326) T PRK13685 108 PNRLAAAQEAAKQFADQLT------PGINLGLIAFAGTATVLVSPTTNR-----EATKNALDKLQLADRTATGEGIFTAL 176 (326) T ss_pred CCHHHHHHHHHHHHHHHCC------CCCEEEEEEECCCCEECCCCCCCH-----HHHHHHHHHCCCCCCCCCCHHHHHHH T ss_conf 5689999999999997379------888289999658720148987539-----99999998468788886406899999 Q ss_pred HHHCCCCCCCCCCCCCCCCCCEEEEEEECCCCCCCCCH-----HHHHHHHHHHCCCEEEEEEECCCCCCC---------C Q ss_conf 86236666544445667775069999960668888540-----899999999879689999943788743---------1 Q gi|254780388|r 348 DTIISSNEDEVHRMKNNLEAKKYIVLLTDGENTQDNEE-----GIAICNKAKSQGIRIMTIAFSVNKTQQ---------E 413 (458) Q Consensus 348 ~~Ls~~~~~~~~~~~~~~~~~K~iil~TDG~n~~~~~~-----~~~~C~~~K~~gI~IytI~f~~~~~~~---------~ 413 (458) +.+....... ......+.|.|||+|||+||.+... ....++.+|++||+|||||+|.+.... . 
T Consensus 177 ~~l~~~~~~~---~~~~~~~~~~IILLTDG~~n~g~~~~~p~~~~~AA~~A~~~gi~IyTIgvGt~~g~~~~~g~~~~~~ 253 (326) T PRK13685 177 QAIATVGAVI---GGGDTPPPARIVLFSDGKETVPTNPDNPKGAYTAARTAKDQGVPISTISFGTPYGFVEINGQRQPVP 253 (326) T ss_pred HHHHHHHHHC---CCCCCCCCCEEEEECCCCCCCCCCCCCCCHHHHHHHHHHHCCCCEEEEEECCCCCCCCCCCCCCCCC T ss_conf 9998633201---4567778867999748997778898873029999999998599489999779988435478403456 Q ss_pred HHHHHHHHHC--CCCCEEEECCHHHHHHHHHHHHHHHHHHH Q ss_conf 1789998606--89837882998999999999987587535 Q gi|254780388|r 414 KARYFLSNCA--SPNSFFEANSTHELNKIFRDRIGNEIFER 452 (458) Q Consensus 414 ~~~~~lk~CA--s~~~yy~a~~~~eL~~aF~~~i~~~~~~~ 452 (458) -..+.||+.| +.|.||.|.+.++|.+.|++ |++.+..+ T Consensus 254 lDe~~L~~IA~~TGG~yfrA~d~~~L~~Iy~~-i~~~i~~~ 293 (326) T PRK13685 254 VDDETLKKIAQLSGGEFYTAASLEELRAVYAT-LQQQIGYE 293 (326) T ss_pred CCHHHHHHHHHHCCCEEEECCCHHHHHHHHHH-HHHHHCCE T ss_conf 89999999999729879971999999999999-63331603 No 2 >cd01467 vWA_BatA_type VWA BatA type: Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses. In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. 
Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if Probab=99.77 E-value=3.4e-17 Score=128.20 Aligned_cols=149 Identities=23% Similarity=0.263 Sum_probs=113.0 Q ss_pred CCCHHHHHHHHHHHCCCCCCCCCCCCCEEEEEEECCCCCCCCCCCCCHHHHHHHHHHHHHHCCCCCCCCCCHHHHHHHHH Q ss_conf 33066777665221123467756665201211006776655655768121005666656412567777764588999998 Q gi|254780388|r 269 KKKHLVRDALASVIRSIKKIDNVNDTVRMGATFFNDRVISDPSFSWGVHKLIRTIVKTFAIDENEMGSTAINDAMQTAYD 348 (458) Q Consensus 269 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~i~~~~~~g~T~~~~gl~~g~~ 348 (458) .|++..+..+..++... + ..+.+...|+.......|++.+...... ....+..+.+.|+|++..||..+.+ T Consensus 24 ~rl~~ak~~~~~~i~~~----~---~drvglv~Fs~~a~~~~plT~d~~~~~~--~l~~i~~~~~~ggT~i~~al~~a~~ 94 (180) T cd01467 24 SRLEAAKEVLSDFIDRR----E---NDRIGLVVFAGAAFTQAPLTLDRESLKE--LLEDIKIGLAGQGTAIGDAIGLAIK 94 (180) T ss_pred CHHHHHHHHHHHHHHHC----C---CCEEEEEEECCCCEEECCCCCCHHHHHH--HHHCCCCCCCCCCCCHHHHHHHHHH T ss_conf 89999999999999719----9---9759999972873673376656899999--9862244532368608999999999 Q ss_pred HHCCCCCCCCCCCCCCCCCCEEEEEEECCCCCCCCCHHHHHHHHHHHCCCEEEEEEECCCCCC------CCHHHHHHHHH Q ss_conf 623666654444566777506999996066888854089999999987968999994378874------31178999860 Q gi|254780388|r 349 TIISSNEDEVHRMKNNLEAKKYIVLLTDGENTQDNEEGIAICNKAKSQGIRIMTIAFSVNKTQ------QEKARYFLSNC 422 (458) Q Consensus 349 ~Ls~~~~~~~~~~~~~~~~~K~iil~TDG~n~~~~~~~~~~C~~~K~~gI~IytI~f~~~~~~------~~~~~~~lk~C 422 (458) .|.+. ....|+|||+|||+++.+......+++.+|++||+||||||+.+... .......|++. 
T Consensus 95 ~l~~~-----------~~~~~~ivLlTDG~~n~g~~~~~~~~~~a~~~gi~v~tIGvG~~~~~~~~~~~~~~d~~~L~~i 163 (180) T cd01467 95 RLKNS-----------EAKERVIVLLTDGENNAGEIDPATAAELAKNKGVRIYTIGVGKSGSGPKPDGSTILDEDSLVEI 163 (180) T ss_pred HHHCC-----------CCCCCEEEEEECCCCCCCCCCHHHHHHHHHHCCCEEEEEEECCCCCCCCCCCCCCCCHHHHHHH T ss_conf 76424-----------7666379998058866787699999999997699899999778988876888765599999999 Q ss_pred C--CCCCEEEECCHHHH Q ss_conf 6--89837882998999 Q gi|254780388|r 423 A--SPNSFFEANSTHEL 437 (458) Q Consensus 423 A--s~~~yy~a~~~~eL 437 (458) | +.|.||+|.+++|| T Consensus 164 A~~tgG~yy~a~~~~eL 180 (180) T cd01467 164 ADKTGGRIFRALDGFEL 180 (180) T ss_pred HHHCCCEEEECCCHHHC T ss_conf 99619979972874649 No 3 >cd01465 vWA_subgroup VWA subgroup: Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. 
Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if n Probab=99.72 E-value=2.9e-16 Score=122.23 Aligned_cols=148 Identities=17% Similarity=0.163 Sum_probs=107.3 Q ss_pred CHHHHHHHHHHHCCCCCCCCCCCCCEEEEEEECCCCCCCCCCCCCHHHHHHHHHHHHHHCCCCCCCCCCHHHHHHHHHHH Q ss_conf 06677766522112346775666520121100677665565576812100566665641256777776458899999862 Q gi|254780388|r 271 KHLVRDALASVIRSIKKIDNVNDTVRMGATFFNDRVISDPSFSWGVHKLIRTIVKTFAIDENEMGSTAINDAMQTAYDTI 350 (458) Q Consensus 271 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~i~~~~~~g~T~~~~gl~~g~~~L 350 (458) ++..+.++..++..+. ...+.+.+.|++......|++.. .++..+...+..+.+.|+|+...||..++..+ T Consensus 18 ~~~~k~a~~~~l~~l~------~~dr~~iv~F~~~~~~~~~~~~~---~~~~~~~~~i~~l~~~G~T~~~~~l~~a~~~~ 88 (170) T cd01465 18 LPLVKSALKLLVDQLR------PDDRLAIVTYDGAAETVLPATPV---RDKAAILAAIDRLTAGGSTAGGAGIQLGYQEA 88 (170) T ss_pred HHHHHHHHHHHHHHCC------CCCEEEEEEECCCCEECCCCCCH---HHHHHHHHHHHCCCCCCCCCHHHHHHHHHHHH T ss_conf 9999999999998589------87879999835861551587866---67999999874389899852779999999999 Q ss_pred CCCCCCCCCCCCCCCCCCEEEEEEECCCCCCCCCH---HHHHHHHHHHCCCEEEEEEECCCCCCCCHHHHHHHHHC--CC Q ss_conf 36666544445667775069999960668888540---89999999987968999994378874311789998606--89 Q gi|254780388|r 351 ISSNEDEVHRMKNNLEAKKYIVLLTDGENTQDNEE---GIAICNKAKSQGIRIMTIAFSVNKTQQEKARYFLSNCA--SP 425 (458) Q Consensus 351 s~~~~~~~~~~~~~~~~~K~iil~TDG~n~~~~~~---~~~~C~~~K~~gI~IytI~f~~~~~~~~~~~~~lk~CA--s~ 425 (458) .... .+...+.|||+|||+.+.+... ....+.+.++.||+||||||+... ...+|+..| +. 
T Consensus 89 ~~~~---------~~~~~~~iillTDG~~~~~~~~~~~~~~~~~~~~~~~i~i~tiGiG~~~-----~~~~L~~iA~~~~ 154 (170) T cd01465 89 QKHF---------VPGGVNRILLATDGDFNVGETDPDELARLVAQKRESGITLSTLGFGDNY-----NEDLMEAIADAGN 154 (170) T ss_pred HHCC---------CCCCCEEEEEEECCCCCCCCCCHHHHHHHHHHHHHCCCCCEEEEECCCC-----CHHHHHHHHHCCC T ss_conf 8633---------7887506999815885679889899999999987438862489808879-----9999999997579 Q ss_pred CCEEEECCHHHHHHHH Q ss_conf 8378829989999999 Q gi|254780388|r 426 NSFFEANSTHELNKIF 441 (458) Q Consensus 426 ~~yy~a~~~~eL~~aF 441 (458) |.||++++++||.++| T Consensus 155 G~~~~v~~~~~l~~~f 170 (170) T cd01465 155 GNTAYIDNLAEARKVF 170 (170) T ss_pred CEEEECCCHHHHHHHC T ss_conf 8899849999999639 No 4 >cd01461 vWA_interalpha_trypsin_inhibitor vWA_interalpha trypsin inhibitor (ITI): ITI is a glycoprotein composed of three polypeptides- two heavy chains and one light chain (bikunin). Bikunin confers the protease-inhibitor function while the heavy chains are involved in rendering stability to the extracellular matrix by binding to hyaluronic acid. The heavy chains carry the VWA domain with a conserved MIDAS motif. Although the exact role of the VWA domains remains unknown, it has been speculated to be involved in mediating protein-protein interactions with the components of the extracellular matrix. Probab=99.67 E-value=3.4e-15 Score=115.45 Aligned_cols=147 Identities=21% Similarity=0.243 Sum_probs=103.1 Q ss_pred HHHHHHHHHHHCCCCCCCCCCCCCEEEEEEECCCCCCCCCCCCCHHHHHHHHHHHHHHCCCCCCCCCCHHHHHHHHHHHC Q ss_conf 66777665221123467756665201211006776655655768121005666656412567777764588999998623 Q gi|254780388|r 272 HLVRDALASVIRSIKKIDNVNDTVRMGATFFNDRVISDPSFSWGVHKLIRTIVKTFAIDENEMGSTAINDAMQTAYDTII 351 (458) Q Consensus 272 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~i~~~~~~g~T~~~~gl~~g~~~Ls 351 (458) +..+.++..++..+. ...+.+.+.|+.......|................+..+.+.|+|++..||.++++.|. 
T Consensus 21 ~~ak~a~~~~l~~l~------~~d~~~iv~F~~~~~~~~~~~~~~~~~~~~~a~~~i~~l~~~G~T~i~~aL~~a~~~l~ 94 (171) T cd01461 21 EQTKEALLTALKDLP------PGDYFNIIGFSDTVEEFSPSSVSATAENVAAAIEYVNRLQALGGTNMNDALEAALELLN 94 (171) T ss_pred HHHHHHHHHHHHHCC------CCCEEEEEEECCEEEEECCCCEECCHHHHHHHHHHHHCCCCCCCCHHHHHHHHHHHHHH T ss_conf 999999999998299------87879999987806598077530799999999988754788998669999999999886 Q ss_pred CCCCCCCCCCCCCCCCCEEEEEEECCCCCCCCCHHHHHHHHHHHCCCEEEEEEECCCCCCCCHHHHHHHHHC--CCCCEE Q ss_conf 666654444566777506999996066888854089999999987968999994378874311789998606--898378 Q gi|254780388|r 352 SSNEDEVHRMKNNLEAKKYIVLLTDGENTQDNEEGIAICNKAKSQGIRIMTIAFSVNKTQQEKARYFLSNCA--SPNSFF 429 (458) Q Consensus 352 ~~~~~~~~~~~~~~~~~K~iil~TDG~n~~~~~~~~~~C~~~K~~gI~IytI~f~~~~~~~~~~~~~lk~CA--s~~~yy 429 (458) .. ....+.+||+|||+.+.. ......+..+++.+|+||||||+.+.+ ..+|+..| +.|.|| T Consensus 95 ~~-----------~~~~~~iillTDG~~~~~-~~~~~~~~~~~~~~i~i~tig~G~~~~-----~~~L~~iA~~~~G~~~ 157 (171) T cd01461 95 SS-----------PGSVPQIILLTDGEVTNE-SQILKNVREALSGRIRLFTFGIGSDVN-----TYLLERLAREGRGIAR 157 (171) T ss_pred HC-----------CCCCCEEEEECCCCCCCH-HHHHHHHHHHHCCCCEEEEEEECCCCC-----HHHHHHHHHCCCCEEE T ss_conf 35-----------798618999757886886-899999999744896399999789799-----9999999972898899 Q ss_pred EECCHHHHHHHH Q ss_conf 829989999999 Q gi|254780388|r 430 EANSTHELNKIF 441 (458) Q Consensus 430 ~a~~~~eL~~aF 441 (458) ++.+++||.+.+ T Consensus 158 ~v~~~~~l~~~~ 169 (171) T cd01461 158 RIYETDDIESQL 169 (171) T ss_pred ECCCHHHHHHHH T ss_conf 889878999976 No 5 >cd01466 vWA_C3HC4_type VWA C3HC4-type: Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. 
These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, Probab=99.59 E-value=5.7e-14 Score=107.62 Aligned_cols=136 Identities=20% Similarity=0.221 Sum_probs=97.8 Q ss_pred CHHHHHHHHHHHCCCCCCCCCCCCCEEEEEEECCCCCCCCCCCCCHHHHHHHHHHHHHHCCCCCCCCCCHHHHHHHHHHH Q ss_conf 06677766522112346775666520121100677665565576812100566665641256777776458899999862 Q gi|254780388|r 271 KHLVRDALASVIRSIKKIDNVNDTVRMGATFFNDRVISDPSFSWGVHKLIRTIVKTFAIDENEMGSTAINDAMQTAYDTI 350 (458) Q Consensus 271 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~i~~~~~~g~T~~~~gl~~g~~~L 350 (458) ++..+.+...++..+. +..+.+...|+.......|+.... ...+......++.+.+.|+|++..||..|.++| T Consensus 18 l~~~k~a~~~~~~~L~------~~d~v~iV~F~~~a~~~~pl~~~~-~~~~~~~~~~i~~l~~~GgT~i~~gl~~a~~~l 90 (155) T cd01466 18 LQLVKHALRFVISSLG------DADRLSIVTFSTSAKRLSPLRRMT-AKGKRSAKRVVDGLQAGGGTNVVGGLKKALKVL 90 (155) T ss_pred HHHHHHHHHHHHHHCC------CCCEEEEEEECCCCEEEECCEECC-HHHHHHHHHHHHCCCCCCCCCHHHHHHHHHHHH T ss_conf 9999999999998489------767489999568742620460379-999999999875377688872679999999999 Q ss_pred CCCCCCCCCCCCCCCCCCEEEEEEECCCCCCCCCHHHHHHHHHHHCCCEEEEEEECCCCCCCCHHHHHHHHHC--CCCCE Q ss_conf 3666654444566777506999996066888854089999999987968999994378874311789998606--89837 Q gi|254780388|r 351 ISSNEDEVHRMKNNLEAKKYIVLLTDGENTQDNEEGIAICNKAKSQGIRIMTIAFSVNKTQQEKARYFLSNCA--SPNSF 428 (458) Q Consensus 351 s~~~~~~~~~~~~~~~~~K~iil~TDG~n~~~~~~~~~~C~~~K~~gI~IytI~f~~~~~~~~~~~~~lk~CA--s~~~y 428 (458) .... ..++.+.|||+|||+.+.. ..+.++++.||+||||||+.+. 
...+|+..| +.|.| T Consensus 91 ~~~~---------~~~~~~~IiLlTDG~~n~~-----~~~~~~~~~~i~i~tiGiG~~~-----d~~lL~~iA~~~gG~~ 151 (155) T cd01466 91 GDRR---------QKNPVASIMLLSDGQDNHG-----AVVLRADNAPIPIHTFGLGASH-----DPALLAFIAEITGGTF 151 (155) T ss_pred HHCC---------CCCCCEEEEEECCCCCCHH-----HHHHHHHCCCCEEEEEEECCCC-----CHHHHHHHHHCCCCEE T ss_conf 8436---------6898308999826986405-----7789987179739999978867-----8999999997699779 Q ss_pred EEEC Q ss_conf 8829 Q gi|254780388|r 429 FEAN 432 (458) Q Consensus 429 y~a~ 432 (458) |++. T Consensus 152 ~~v~ 155 (155) T cd01466 152 SYVK 155 (155) T ss_pred EEEC T ss_conf 9949 No 6 >COG4961 TadG Flp pilus assembly protein TadG [Intracellular trafficking and secretion] Probab=99.52 E-value=6.4e-13 Score=100.89 Aligned_cols=72 Identities=17% Similarity=0.090 Sum_probs=64.1 Q ss_pred HHHHHHHHHHHHHCCCCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHCCCCHH Q ss_conf 678999999874013872799999999999999999999999999999999999999999887641367314 Q gi|254780388|r 4 DTKFIFYSKKLIKSCTGHFFIITALLMPVMLGVGGMLVDVVRWSYYEHALKQAAQTAIITASVPLIQSLEEV 75 (458) Q Consensus 4 ~~~~~~~~~rf~~d~~G~vaiifal~l~~ll~~~g~aVD~~r~~~~ks~Lq~A~DaA~LA~a~~~~~~~~~~ 75 (458) ....+.+.+||+|||+|++||.|||.+||||+++++.||++++++.|.+||+|+|+|++++++......... T Consensus 5 ~~~~~~~~~rF~rdr~Ga~AVeFAlvap~ll~l~~g~ve~~~~~~~~~~l~~a~d~aara~~~~~~~~~~~~ 76 (185) T COG4961 5 RRGLRGLLRRFRRDRRGAAAVEFALVAPPLLLLVFGIVEFGIAFLAKQSLQNAADAAARAAARGLTTDAADL 76 (185) T ss_pred HHHHHHHHHHHHHCCCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHCCCCCCCHH T ss_conf 265799999887648768999999999999999999999999999999999999999999985076442025 No 7 >cd01463 vWA_VGCC_like VWA Voltage gated Calcium channel like: Voltage-gated calcium channels are a complex of five proteins: alpha 1, beta 1, gamma, alpha 2 and delta. The alpha 2 and delta subunits result from proteolytic processing of a single gene product and carries at its N-terminus the VWA and cache domains, The alpha 2 delta gene family has orthologues in D. melanogaster and C. 
elegans but none have been detected in aither A. thaliana or yeast. The exact biochemical function of the VWA domain is not known but the alpha 2 delta complex has been shown to regulate various functional properties of the channel complex. Probab=99.49 E-value=1.2e-12 Score=99.24 Aligned_cols=152 Identities=14% Similarity=0.122 Sum_probs=96.4 Q ss_pred CHHHHHHHHHHHCCCCCCCCCCCCCEEEEEEECCCCCCCCCCCC----CHHHHHHHHHHHHHHCCCCCCCCCCHHHHHHH Q ss_conf 06677766522112346775666520121100677665565576----81210056666564125677777645889999 Q gi|254780388|r 271 KHLVRDALASVIRSIKKIDNVNDTVRMGATFFNDRVISDPSFSW----GVHKLIRTIVKTFAIDENEMGSTAINDAMQTA 346 (458) Q Consensus 271 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~----~~~~~~~~~~~~~i~~~~~~g~T~~~~gl~~g 346 (458) ++.++.+...+++.+...+ +...+.|+.......|... ......+...+..|+.+.+.|+|++..||..| T Consensus 31 l~~ak~a~~~il~~L~~~D------~~~iv~Fs~~~~~~~p~~~~~~~~~t~~n~~~~~~~i~~l~~~G~Tn~~~al~~A 104 (190) T cd01463 31 LHLAKQTVSSILDTLSDND------FFNIITFSNEVNPVVPCFNDTLVQATTSNKKVLKEALDMLEAKGIANYTKALEFA 104 (190) T ss_pred HHHHHHHHHHHHHHCCCCC------EEEEEEECCCCEEEECCCCCCEEECCHHHHHHHHHHHHHCCCCCCCHHHHHHHHH T ss_conf 9999999999998199877------9999996897536302456843368999999999999828579872489999999 Q ss_pred HHHHCCCCCCCCCCCCCCCCCCEEEEEEECCCCCCCCCHHHHHHHHH-HHCCCEEEEEEECCCCCCCCHHHHHHHHH--C Q ss_conf 98623666654444566777506999996066888854089999999-98796899999437887431178999860--6 Q gi|254780388|r 347 YDTIISSNEDEVHRMKNNLEAKKYIVLLTDGENTQDNEEGIAICNKA-KSQGIRIMTIAFSVNKTQQEKARYFLSNC--A 423 (458) Q Consensus 347 ~~~Ls~~~~~~~~~~~~~~~~~K~iil~TDG~n~~~~~~~~~~C~~~-K~~gI~IytI~f~~~~~~~~~~~~~lk~C--A 423 (458) +.+|........ ........|.|+|+|||..+........+...- ...+|.|||+|||.+.. ...+||.. 
+ T Consensus 105 ~~~l~~~~~~~~--~~~~~~~~~~IillTDG~~~~~~~i~~~~~~~~~~~~~i~ift~G~G~~~~----d~~~L~~iA~~ 178 (190) T cd01463 105 FSLLLKNLQSNH--SGSRSQCNQAIMLITDGVPENYKEIFDKYNWDKNSEIPVRVFTYLIGREVT----DRREIQWMACE 178 (190) T ss_pred HHHHHHHHCCCC--CCCCCCCCCEEEEEECCCCCCHHHHHHHHHHHHCCCCCEEEEEEEECCCCC----CHHHHHHHHHC T ss_conf 999987420155--665555551599983698875788999999975579987999999679977----87999999980 Q ss_pred CCCCEEEECCH Q ss_conf 89837882998 Q gi|254780388|r 424 SPNSFFEANST 434 (458) Q Consensus 424 s~~~yy~a~~~ 434 (458) +.|+||+.++. T Consensus 179 ~~G~y~~I~~~ 189 (190) T cd01463 179 NKGYYSHIQSL 189 (190) T ss_pred CCCEEEECCCC T ss_conf 99569978889 No 8 >cd01456 vWA_ywmD_type VWA ywmD type:Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. 
Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if Probab=99.47 E-value=3.2e-12 Score=96.44 Aligned_cols=143 Identities=18% Similarity=0.232 Sum_probs=91.1 Q ss_pred CCCCHHHHHHHHHHHCCCCCCCCCCCCCEEEEEEECCCCCCCCC---------CCC---CHHHHHHHHHHHHHHCCC-CC Q ss_conf 43306677766522112346775666520121100677665565---------576---812100566665641256-77 Q gi|254780388|r 268 IKKKHLVRDALASVIRSIKKIDNVNDTVRMGATFFNDRVISDPS---------FSW---GVHKLIRTIVKTFAIDEN-EM 334 (458) Q Consensus 268 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~---------~~~---~~~~~~~~~~~~~i~~~~-~~ 334 (458) ..|++..+.++..++..+. ...+.+.+.|+.......+ +.. +....++..+...+..+. +. T Consensus 41 ~~rl~~ak~a~~~~v~~l~------~~drvgLv~F~~~~~~~~d~~~~~~~~~~~~~~~~~~~~~r~~l~~~i~~l~~~~ 114 (206) T cd01456 41 ETRLDNAKAALDETANALP------DGTRLGLWTFSGDGDNPLDVRVLVPKGCLTAPVNGFPSAQRSALDAALNSLQTPT 114 (206) T ss_pred CCHHHHHHHHHHHHHHHCC------CCCEEEEEEECCCCCCCCCCCEECCCCCCCCCCCCCCHHHHHHHHHHHHHCCCCC T ss_conf 4599999999999998579------9987999997786777888513214565444345523778999999997457788 Q ss_pred CCCCCHHHHHHHHHHHCCCCCCCCCCCCCCCCCCEEEEEEECCCCCCCCCHHHHHHHHH-H----HCCCEEEEEEECCCC Q ss_conf 77764588999998623666654444566777506999996066888854089999999-9----879689999943788 Q gi|254780388|r 335 GSTAINDAMQTAYDTIISSNEDEVHRMKNNLEAKKYIVLLTDGENTQDNEEGIAICNKA-K----SQGIRIMTIAFSVNK 409 (458) Q Consensus 335 g~T~~~~gl~~g~~~Ls~~~~~~~~~~~~~~~~~K~iil~TDG~n~~~~~~~~~~C~~~-K----~~gI~IytI~f~~~~ 409 (458) |+|++..++..+.+.+.+. ..+.|||||||+++...... ..+..+ + ..+|.||||||+.+. 
T Consensus 115 G~T~l~~al~~a~~~~~~~-------------~~~~IvLlTDG~~~~g~~~~-~~~~~l~~~~~~~~~v~V~tig~G~d~ 180 (206) T cd01456 115 GWTPLAAALAEAAAYVDPG-------------RVNVVVLITDGEDTCGPDPC-EVARELAKRRTPAPPIKVNVIDFGGDA 180 (206) T ss_pred CCCHHHHHHHHHHHHHCCC-------------CCCEEEEEECCCCCCCCCHH-HHHHHHHHHCCCCCCEEEEEEEECCCC T ss_conf 9647999999999862778-------------76479999237644688859-999999983177999589999718865 Q ss_pred CCCCHHHHHHHHHC--CCCCE-EEECCHH Q ss_conf 74311789998606--89837-8829989 Q gi|254780388|r 410 TQQEKARYFLSNCA--SPNSF-FEANSTH 435 (458) Q Consensus 410 ~~~~~~~~~lk~CA--s~~~y-y~a~~~~ 435 (458) ...+|++.| +.|.| |.+.++. T Consensus 181 -----d~~~L~~IA~~tgG~y~y~~~d~~ 204 (206) T cd01456 181 -----DRAELEAIAEATGGTYAYNQSDLA 204 (206) T ss_pred -----CHHHHHHHHHCCCCEEEEECCCCC T ss_conf -----899999999742978995167602 No 9 >cd01480 vWA_collagen_alpha_1-VI-type VWA_collagen alpha(VI) type: The extracellular matrix represents a complex alloy of variable members of diverse protein families defining structural integrity and various physiological functions. The most abundant family is the collagens with more than 20 different collagen types identified thus far. Collagens are centrally involved in the formation of fibrillar and microfibrillar networks of the extracellular matrix, basement membranes as well as other structures of the extracellular matrix. Some collagens have about 15-18 vWA domains in them. The VWA domains present in these collagens mediate protein-protein interactions. 
Probab=99.46 E-value=5.4e-12 Score=94.97 Aligned_cols=151 Identities=19% Similarity=0.210 Sum_probs=101.9 Q ss_pred HHHHHHHHHCCCCCC---CCCCCCCEEEEEEECCCCCCCCCCCCCHHHHHHHHHHHHHHCCC-CCCCCCCHHHHHHHHHH Q ss_conf 777665221123467---75666520121100677665565576812100566665641256-77777645889999986 Q gi|254780388|r 274 VRDALASVIRSIKKI---DNVNDTVRMGATFFNDRVISDPSFSWGVHKLIRTIVKTFAIDEN-EMGSTAINDAMQTAYDT 349 (458) Q Consensus 274 ~~~~~~~~~~~~~~~---~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~i~~~~-~~g~T~~~~gl~~g~~~ 349 (458) .+..+..++...... +-.....+.+...|+.......++... ..++..+...|..+. .+|+|+++.+|.++.+. T Consensus 23 ~k~Fv~~lv~~f~~~~~~~i~~~~~rVgvv~ys~~~~~~~~~~~~--~~~~~~l~~~I~~i~y~gG~T~tg~AL~~a~~~ 100 (186) T cd01480 23 TKNFVKRVAERFLKDYYRKDPAGSWRVGVVQYSDQQEVEAGFLRD--IRNYTSLKEAVDNLEYIGGGTFTDCALKYATEQ 100 (186) T ss_pred HHHHHHHHHHHHHHHCCCCCCCCCEEEEEEEECCCEEEEECCCCC--CCCHHHHHHHHHHHHCCCCCCHHHHHHHHHHHH T ss_conf 999999999998530134568774389899855842798604777--588999999997501358986299999999999 Q ss_pred HCCCCCCCCCCCCCCCCCCEEEEEEECCCCCCC-CCHHHHHHHHHHHCCCEEEEEEECCCCCCCCHHHHHHHHHCC-CCC Q ss_conf 236666544445667775069999960668888-540899999999879689999943788743117899986068-983 Q gi|254780388|r 350 IISSNEDEVHRMKNNLEAKKYIVLLTDGENTQD-NEEGIAICNKAKSQGIRIMTIAFSVNKTQQEKARYFLSNCAS-PNS 427 (458) Q Consensus 350 Ls~~~~~~~~~~~~~~~~~K~iil~TDG~n~~~-~~~~~~~C~~~K~~gI~IytI~f~~~~~~~~~~~~~lk~CAs-~~~ 427 (458) +.... .....|++|++|||..+.. ........+.+|..||+||+||++... 
..-|+.+|| |++ T Consensus 101 ~~~~~---------r~~~~kvlvliTDG~S~~~~~~~~~~aa~~lr~~GV~ifaVGVG~~~------~~eL~~IAs~p~~ 165 (186) T cd01480 101 LLEGS---------HQKENKFLLVITDGHSDGSPDGGIEKAVNEADHLGIKIFFVAVGSQN------EEPLSRIACDGKS 165 (186) T ss_pred HHHCC---------CCCCCEEEEEEECCCCCCCCCHHHHHHHHHHHHCCCEEEEEEECCCC------HHHHHHHHCCCCC T ss_conf 86136---------78985389998458766674066999999999879899999947488------7999998589973 Q ss_pred EEEECCHHHHHHHH Q ss_conf 78829989999999 Q gi|254780388|r 428 FFEANSTHELNKIF 441 (458) Q Consensus 428 yy~a~~~~eL~~aF 441 (458) .|.+.+=++|.+.| T Consensus 166 ~~~~~~f~~L~~~~ 179 (186) T cd01480 166 ALYRENFAELLWSF 179 (186) T ss_pred EEEECCHHHHHCCH T ss_conf 89736899870111 No 10 >cd01475 vWA_Matrilin VWA_Matrilin: In cartilaginous plate, extracellular matrix molecules mediate cell-matrix and matrix-matrix interactions thereby providing tissue integrity. Some members of the matrilin family are expressed specifically in developing cartilage rudiments. The matrilin family consists of at least four members. All the members of the matrilin family contain VWA domains, EGF-like domains and a heptad repeat coiled-coiled domain at the carboxy terminus which is responsible for the oligomerization of the matrilins. The VWA domains have been shown to be essential for matrilin network formation by interacting with matrix ligands. Probab=99.39 E-value=5.8e-11 Score=88.38 Aligned_cols=153 Identities=18% Similarity=0.208 Sum_probs=104.6 Q ss_pred HHHHHHHHHHCCCCCCCCCCCCCEEEEEEECCCCCCCCCCCCCHHHHHHHHHHHHHHCCCC-CCCCCCHHHHHHHHHHHC Q ss_conf 6777665221123467756665201211006776655655768121005666656412567-777764588999998623 Q gi|254780388|r 273 LVRDALASVIRSIKKIDNVNDTVRMGATFFNDRVISDPSFSWGVHKLIRTIVKTFAIDENE-MGSTAINDAMQTAYDTII 351 (458) Q Consensus 273 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~i~~~~~-~g~T~~~~gl~~g~~~Ls 351 (458) ..+..+..++...+. ..+..+.+...|++.....-++..- .++..+...|..+.. .|+|+.+.+|..+.+.+. 
T Consensus 22 ~~k~Fv~~lv~~f~I---~~~~trVgvv~ys~~~~~~f~l~~~---~~k~~l~~aI~~i~~~gggT~Tg~AL~~~~~~~f 95 (224) T cd01475 22 LVKQFLNQIIDSLDV---GPDATRVGLVQYSSTVKQEFPLGRF---KSKADLKRAVRRMEYLETGTMTGLAIQYAMNNAF 95 (224) T ss_pred HHHHHHHHHHHHCCC---CCCCEEEEEEEECCCEEEEEECCCC---CCHHHHHHHHHHHHCCCCCCHHHHHHHHHHHHHC T ss_conf 999999999985687---9985299999965827899966886---7889999999863613884469999999999727 Q ss_pred CCCCCCCCCCCCCCCCCEEEEEEECCCCCCCCCHHHHHHHHHHHCCCEEEEEEECCCCCCCCHHHHHHHHHCCC---CCE Q ss_conf 66665444456677750699999606688885408999999998796899999437887431178999860689---837 Q gi|254780388|r 352 SSNEDEVHRMKNNLEAKKYIVLLTDGENTQDNEEGIAICNKAKSQGIRIMTIAFSVNKTQQEKARYFLSNCASP---NSF 428 (458) Q Consensus 352 ~~~~~~~~~~~~~~~~~K~iil~TDG~n~~~~~~~~~~C~~~K~~gI~IytI~f~~~~~~~~~~~~~lk~CAs~---~~y 428 (458) ....+.+|. ..+..|++|+||||.-. .........+|++||+||+||++. . ...-|+.+||+ .|. T Consensus 96 ~~~~G~Rp~---~~~vpkvlIviTDG~s~---D~v~~~A~~lr~~GV~ifaVGVg~-~-----~~~eL~~IAs~P~~~hv 163 (224) T cd01475 96 SEAEGARPG---SERVPRVGIVVTDGRPQ---DDVSEVAAKARALGIEMFAVGVGR-A-----DEEELREIASEPLADHV 163 (224) T ss_pred CCCCCCCCC---CCCCCEEEEEECCCCCC---CCHHHHHHHHHHCCCEEEEEECCC-C-----CHHHHHHHHCCCCHHCE T ss_conf 702399875---56898599997179876---638999999998798899996374-7-----98999998559737568 Q ss_pred EEECCHHHHHHHHHH Q ss_conf 882998999999999 Q gi|254780388|r 429 FEANSTHELNKIFRD 443 (458) Q Consensus 429 y~a~~~~eL~~aF~~ 443 (458) |.+.+=++|...-+. T Consensus 164 f~v~~F~~l~~l~~~ 178 (224) T cd01475 164 FYVEDFSTIEELTKK 178 (224) T ss_pred EEECCHHHHHHHHHH T ss_conf 994798899999999 No 11 >cd01470 vWA_complement_factors Complement factors B and C2 are two critical proteases for complement activation. They both contain three CCP or Sushi domains, a trypsin-type serine protease domain and a single VWA domain with a conserved metal ion dependent adhesion site referred commonly as the MIDAS motif. Orthologues of these molecules are found from echinoderms to chordates. 
During complement activation, the CCP domains are cleaved off, resulting in the formation of an active protease that cleaves and activates complement C3. Complement C2 is in the classical pathway and complement B is in the alternative pathway. The interaction of C2 with C4 and of factor B with C3b are both dependent on Mg2+ binding sites within the VWA domains and the VWA domain of factor B has been shown to mediate the binding of C3. This is consistent with the common inferred function of VWA domains as magnesium-dependent protein interaction domains. Probab=99.36 E-value=3e-11 Score=90.26 Aligned_cols=156 Identities=16% Similarity=0.183 Sum_probs=94.0 Q ss_pred HHHHHHHHHCCCCCCCCCCCCCEEEEEEECCCCCCCCCCCCCHHHHHHHHHHHHHHCCC-----CCCCCCCHHHHHHHHH Q ss_conf 77766522112346775666520121100677665565576812100566665641256-----7777764588999998 Q gi|254780388|r 274 VRDALASVIRSIKKIDNVNDTVRMGATFFNDRVISDPSFSWGVHKLIRTIVKTFAIDEN-----EMGSTAINDAMQTAYD 348 (458) Q Consensus 274 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~i~~~~-----~~g~T~~~~gl~~g~~ 348 (458) .+..+..+......... ..+.+...|++.......+....+ .++......++.+. ..++|+...+|...+. T Consensus 21 ~k~Fv~~lv~~~~~~~~---~~rvgvv~ys~~~~~~f~l~~~~~-~~~~~~~~~i~~i~y~~~~~~~gT~t~~AL~~~~~ 96 (198) T cd01470 21 AKNAIKTLIEKISSYEV---SPRYEIISYASDPKEIVSIRDFNS-NDADDVIKRLEDFNYDDHGDKTGTNTAAALKKVYE 96 (198) T ss_pred HHHHHHHHHHHHCCCCC---CCEEEEEEECCCCEEEEECCCCCC-CCHHHHHHHHHHCCCCCCCCCCCHHHHHHHHHHHH T ss_conf 99999999998446687---753899981588538997157666-68999999998460335778864689999999999 Q ss_pred HHCCCCCCCCCCCCCCCCCCEEEEEEECCCCCCCCCH------------HHHHHHHHHHCCCEEEEEEECCCCCCCCHHH Q ss_conf 6236666544445667775069999960668888540------------8999999998796899999437887431178 Q gi|254780388|r 349 TIISSNEDEVHRMKNNLEAKKYIVLLTDGENTQDNEE------------GIAICNKAKSQGIRIMTIAFSVNKTQQEKAR 416 (458) Q Consensus 349 ~Ls~~~~~~~~~~~~~~~~~K~iil~TDG~n~~~~~~------------~~~~C~~~K~~gI~IytI~f~~~~~~~~~~~ 416 (458) .+.-...... ......+|++|++|||.-+.+... .....+.+|+.||+||+||++.+.+ . 
T Consensus 97 ~~~~~~~~~~---~~~~~v~~v~illTDG~sn~g~~P~~~~~~~~~~~~~~~~a~~~r~~gi~ifaiGVG~~~d-----~ 168 (198) T cd01470 97 RMALEKVRNK---EAFNETRHVIILFTDGKSNMGGSPLPTVDKIKNLVYKNNKSDNPREDYLDVYVFGVGDDVN-----K 168 (198) T ss_pred HHHHHHHCCC---CCCCCCCEEEEEECCCCCCCCCCCHHHHHHHHHHHHHHHHHHHHHHCCCEEEEEEECCCCC-----H T ss_conf 8655530466---4445675599997378545789933678887776641014567887394799999666159-----9 Q ss_pred HHHHHHCCC--C--CEEEECCHHHHHHHH Q ss_conf 999860689--8--378829989999999 Q gi|254780388|r 417 YFLSNCASP--N--SFFEANSTHELNKIF 441 (458) Q Consensus 417 ~~lk~CAs~--~--~yy~a~~~~eL~~aF 441 (458) .-|+.+||+ + |+|.+.+=++|.+.| T Consensus 169 ~eL~~IAS~~~~e~hvf~v~df~~L~~i~ 197 (198) T cd01470 169 EELNDLASKKDNERHFFKLKDYEDLQEVF 197 (198) T ss_pred HHHHHHHCCCCCCCEEEEECCHHHHHHHH T ss_conf 99999857999871699968999999863 No 12 >cd01451 vWA_Magnesium_chelatase Magnesium chelatase: Mg-chelatase catalyses the insertion of Mg into protoporphyrin IX (Proto). In chlorophyll biosynthesis, insertion of Mg2+ into protoporphyrin IX is catalysed by magnesium chelatase in an ATP-dependent reaction. Magnesium chelatase is a three sub-unit (BchI, BchD and BchH) enzyme with a novel arrangement of domains: the C-terminal helical domain is located behind the nucleotide binding site. The BchD domain contains a AAA domain at its N-terminus and a VWA domain at its C-terminus. The VWA domain has been speculated to be involved in mediating protein-protein interactions. Probab=99.34 E-value=1.3e-10 Score=86.12 Aligned_cols=150 Identities=15% Similarity=0.203 Sum_probs=99.5 Q ss_pred CCHHHHHHHHHHHCCCCCCCCCCCCCEEEEEEEC-CCCCCCCCCCCCHHHHHHHHHHHHHHCCCCCCCCCCHHHHHHHHH Q ss_conf 3066777665221123467756665201211006-776655655768121005666656412567777764588999998 Q gi|254780388|r 270 KKHLVRDALASVIRSIKKIDNVNDTVRMGATFFN-DRVISDPSFSWGVHKLIRTIVKTFAIDENEMGSTAINDAMQTAYD 348 (458) Q Consensus 270 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~-~~~~~~~~~~~~~~~~~~~~~~~~i~~~~~~g~T~~~~gl~~g~~ 348 (458) ++...+.++..++.. ......+.+...|. +......|++... 
...+..+..+.++|+|++..||..++. T Consensus 18 rl~~aK~a~~~ll~d-----~~~~~D~v~lv~F~g~~a~~~lppT~~~-----~~~~~~l~~L~~gG~T~l~~gL~~a~~ 87 (178) T cd01451 18 RMAAAKGAVLSLLRD-----AYQRRDKVALIAFRGTEAEVLLPPTRSV-----ELAKRRLARLPTGGGTPLAAGLLAAYE 87 (178) T ss_pred HHHHHHHHHHHHHHH-----HCCCCCEEEEEEECCCCCEEECCCCCCH-----HHHHHHHHCCCCCCCCCHHHHHHHHHH T ss_conf 799999999999997-----4346788999997597555856887657-----999998721678898519999999999 Q ss_pred HHCCCCCCCCCCCCCCCCCCEEEEEEECCCCCCCCCH----HHHHHHHHHHCCCEEEEEEECCCCCCCCHHHHHHHHHC- Q ss_conf 6236666544445667775069999960668888540----89999999987968999994378874311789998606- Q gi|254780388|r 349 TIISSNEDEVHRMKNNLEAKKYIVLLTDGENTQDNEE----GIAICNKAKSQGIRIMTIAFSVNKTQQEKARYFLSNCA- 423 (458) Q Consensus 349 ~Ls~~~~~~~~~~~~~~~~~K~iil~TDG~n~~~~~~----~~~~C~~~K~~gI~IytI~f~~~~~~~~~~~~~lk~CA- 423 (458) ++.... ..+...+++||+|||..|.+... ...++++++++||...+|+|+.+... ..+|++-| T Consensus 88 ~~~~~~--------~~~~~~~~iiLlTDG~~N~g~~~~~~~~~~~a~~~~~~gi~~~vId~~~~~~~----~~~~~~LA~ 155 (178) T cd01451 88 LAAEQA--------RDPGQRPLIVVITDGRANVGPDPTADRALAAARKLRARGISALVIDTEGRPVR----RGLAKDLAR 155 (178) T ss_pred HHHHHC--------CCCCCCEEEEEECCCCCCCCCCCHHHHHHHHHHHHHHCCCCEEEEECCCCCCC----HHHHHHHHH T ss_conf 999850--------27898439999846986679995126999999999866997899979999767----489999999 Q ss_pred -CCCCEEEECC--HHHHHHHH Q ss_conf -8983788299--89999999 Q gi|254780388|r 424 -SPNSFFEANS--THELNKIF 441 (458) Q Consensus 424 -s~~~yy~a~~--~~eL~~aF 441 (458) ..++||..++ +++|.++. T Consensus 156 ~~~g~Y~~id~l~~~~i~~~v 176 (178) T cd01451 156 ALGGQYVRLPDLSADAIASAV 176 (178) T ss_pred HCCCCEEECCCCCHHHHHHHH T ss_conf 429969989979988999987 No 13 >cd01474 vWA_ATR ATR (Anthrax Toxin Receptor): Anthrax toxin is a key virulence factor for Bacillus anthracis, the causative agent of anthrax. ATR is the cellular receptor for the anthrax protective antigen and facilitates entry of the toxin into cells. The VWA domain in ATR contains the toxin binding site and mediates interaction with protective antigen. 
The binding is mediated by divalent cations that bind to the MIDAS motif. These proteins are a family of vertebrate ECM receptors expressed by endothelial cells. Probab=99.33 E-value=2.8e-10 Score=84.04 Aligned_cols=136 Identities=18% Similarity=0.197 Sum_probs=92.6 Q ss_pred CCEEEEEEECCCCCCCCCCCCCHHHHHHHHHHHHHHCCCCCCCCCCHHHHHHHHHHHCCCCCCCCCCCCCCCCCCEEEEE Q ss_conf 52012110067766556557681210056666564125677777645889999986236666544445667775069999 Q gi|254780388|r 294 TVRMGATFFNDRVISDPSFSWGVHKLIRTIVKTFAIDENEMGSTAINDAMQTAYDTIISSNEDEVHRMKNNLEAKKYIVL 373 (458) Q Consensus 294 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~i~~~~~~g~T~~~~gl~~g~~~Ls~~~~~~~~~~~~~~~~~K~iil 373 (458) ..+.+...|++.....-++......... ....+....+.|+|+.+.+|..+...+....++ .+...|++|+ T Consensus 39 ~~rvgvv~fS~~~~~~f~l~~~~~~~~~--~~~~~~~~~~~G~T~tg~AL~~a~~~~f~~~~g-------~R~~~kvliv 109 (185) T cd01474 39 GLRFSFITFSTRATKILPLTDDSSAIIK--GLEVLKKVTPSGQTYIHEGLENANEQIFNRNGG-------GRETVSVIIA 109 (185) T ss_pred CEEEEEEEECCCCCEEEECCCCHHHHHH--HHHHHHHHCCCCCCHHHHHHHHHHHHHHCCCCC-------CCCCCEEEEE T ss_conf 7499999986983189845787078899--999988761589378999999999975032369-------9887628999 Q ss_pred EECCCCCCCC-CHHHHHHHHHHHCCCEEEEEEECCCCCCCCHHHHHHHHHCC-CCCEEEECC-HHHHHHHHHHH Q ss_conf 9606688885-40899999999879689999943788743117899986068-983788299-89999999999 Q gi|254780388|r 374 LTDGENTQDN-EEGIAICNKAKSQGIRIMTIAFSVNKTQQEKARYFLSNCAS-PNSFFEANS-THELNKIFRDR 444 (458) Q Consensus 374 ~TDG~n~~~~~~~~~~~C~~~K~~gI~IytI~f~~~~~~~~~~~~~lk~CAs-~~~yy~a~~-~~eL~~aF~~~ 444 (458) +|||..+...
.......+.+|+.||.||+||++ + ..+.-|...|| |++.|.+++ =++|....+++ T Consensus 110 lTDG~s~~~~~~~~~~~a~~lr~~gV~i~aVGV~-~-----~~~~eL~~IAs~p~~vf~v~~~F~~L~~i~~~l 177 (185) T cd01474 110 LTDGQLLLNGHKYPEHEAKLSRKLGAIVYCVGVT-D-----FLKSQLINIADSKEYVFPVTSGFQALSGIIESV 177 (185) T ss_pred EECCCCCCCCCHHHHHHHHHHHHCCCEEEEEECC-C-----CCHHHHHHHHCCCCEEEECCCCHHHHHHHHHHH T ss_conf 9326656762141799999999789489999716-2-----599999987199864898347577789999999 No 14 >pfam00092 VWA von Willebrand factor type A domain. Probab=99.33 E-value=1.5e-10 Score=85.74 Aligned_cols=153 Identities=18% Similarity=0.172 Sum_probs=102.7 Q ss_pred HHHHHHHHHCCCCCCCCCCCCCEEEEEEECCCCCCCCCCCCCHHHHHHHHHHHHHHCCCCCCCCCCHHHHHHHHHHHCCC Q ss_conf 77766522112346775666520121100677665565576812100566665641256777776458899999862366 Q gi|254780388|r 274 VRDALASVIRSIKKIDNVNDTVRMGATFFNDRVISDPSFSWGVHKLIRTIVKTFAIDENEMGSTAINDAMQTAYDTIISS 353 (458) Q Consensus 274 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~i~~~~~~g~T~~~~gl~~g~~~Ls~~ 353 (458) .+..+..++.... ......+.+...|+.......++.......... ...........|+|++..+|..+.+.+... T Consensus 20 ~k~~~~~~i~~~~---~~~~~~rv~lv~f~~~~~~~~~l~~~~~~~~~~-~~~~~~~~~~~g~t~~~~al~~a~~~~~~~ 95 (177) T pfam00092 20 VKEFIKKLVENLD---IGPDGTRVGLVQYSSDVTTEFSLNDYKSKDDLL-SAVLRNIYYLGGGTNTGKALKYALENLFRS 95 (177) T ss_pred HHHHHHHHHHHHC---CCCCCCEEEEEEECCCEEEEECCCCCCCHHHHH-HHHHHHCCCCCCCCHHHHHHHHHHHHHHHC T ss_conf 9999999999836---588752899999458458996178868999999-998643157899565999999999998635 Q ss_pred CCCCCCCCCCCCCCCEEEEEEECCCCCCCCCHHHHHHHHHHHCCCEEEEEEECCCCCCCCHHHHHHHHHCC----CCCEE Q ss_conf 66544445667775069999960668888540899999999879689999943788743117899986068----98378 Q gi|254780388|r 354 NEDEVHRMKNNLEAKKYIVLLTDGENTQDNEEGIAICNKAKSQGIRIMTIAFSVNKTQQEKARYFLSNCAS----PNSFF 429 (458) Q Consensus 354 ~~~~~~~~~~~~~~~K~iil~TDG~n~~~~~~~~~~C~~~K~~gI~IytI~f~~~~~~~~~~~~~lk~CAs----~~~yy 429 (458) . ...++..|++||+|||..+.........-...|..||+||+||.+- . 
....|+..|+ .+++| T Consensus 96 ~-------~~r~~~~k~vvllTDG~~~~~~~~~~~~~~~~~~~gI~v~~vG~g~-~-----~~~~L~~ia~~~~~~~~~~ 162 (177) T pfam00092 96 A-------GSRPNAPKVVILLTDGKSNDGGLVPAAAAALRRKVGIIVFGVGVGD-V-----DEEELRLIASEPCSEGHVF 162 (177) T ss_pred C-------CCCCCCEEEEEEEECCCCCCCCCCHHHHHHHHHHCCCEEEEEECCC-C-----CHHHHHHHHCCCCCCCEEE T ss_conf 4-------7887872689998369878886469999999997895899997474-4-----8999999968999898599 Q ss_pred EECCHHHHHHHHHH Q ss_conf 82998999999999 Q gi|254780388|r 430 EANSTHELNKIFRD 443 (458) Q Consensus 430 ~a~~~~eL~~aF~~ 443 (458) .+.+..+|.+.+++ T Consensus 163 ~~~~~~~l~~~~~~ 176 (177) T pfam00092 163 YVTDFDALSDIQEE 176 (177) T ss_pred EECCHHHHHHHHHH T ss_conf 95898999999961 No 15 >TIGR00868 hCaCC calcium-activated chloride channel protein 1; InterPro: IPR004727 This entry represents a family of Ca(2+)-regulated chloride channels (CLCA) which includes bovine, murine and human proteins. Each CLCA exhibits a distinct, often overlapping, tissue expression pattern. With the exception of the truncated, secreted protein hCLCA3, they are synthesized as an approximately 125 kDa precursor transmembrane glycoprotein that is rapidly cleaved into 90 and 35 kDa subunits. The human proteins have been shown to affect a large number of cell functions including chloride conductance, epithelial secretion, cell-cell adhesion, apoptosis, cell cycle control, mucus production in asthma, and blood pressure. The CLCA proteins expressed on the luminal surface of lung vascular endothelia (bCLCA2; mCLCA1; hCLCA2) serve as adhesion molecules for lung metastatic cancer cells, mediating vascular arrest and lung colonization. Expression of hCLCA2 in normal mammary epithelium is consistently lost in human breast cancer and in all tumorigenic breast cancer cell lines. Re-expression of hCLCA2 in human breast cancer cells abrogates tumorigenicity in nude mice, implying that hCLCA2 acts as a tumour suppressor in breast cancer.
Probab=99.32 E-value=2.2e-11 Score=91.11 Aligned_cols=130 Identities=24% Similarity=0.306 Sum_probs=88.8 Q ss_pred EEEEEEECCCCCCCCCCCCCHHHHHHHHHHHHHHCCCCCCCCCCHHHHHHHHHHHCCCCCCCCCCCCCCCCCCEEEEEEE Q ss_conf 01211006776655655768121005666656412567777764588999998623666654444566777506999996 Q gi|254780388|r 296 RMGATFFNDRVISDPSFSWGVHKLIRTIVKTFAIDENEMGSTAINDAMQTAYDTIISSNEDEVHRMKNNLEAKKYIVLLT 375 (458) Q Consensus 296 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~i~~~~~~g~T~~~~gl~~g~~~Ls~~~~~~~~~~~~~~~~~K~iil~T 375 (458) ..+.+.|++.......|.--.+...+..+-. --.-.|.|||.+..||..|.+.+....+...|- -||||| T Consensus 346 ~VGmV~FDS~A~i~n~L~~I~s~~~~~~l~a-~LP~~a~GGTSIC~Gl~~aFq~I~~~~~~t~GS---------Ei~LLT 415 (874) T TIGR00868 346 WVGMVTFDSAAEIKNELIKITSSDERDALTA-NLPTEASGGTSICSGLKAAFQVIKKSDQSTDGS---------EIVLLT 415 (874) T ss_pred EEEEEECCCEEEEEEEEEEECCHHHHHHHHH-HCCCCCCCCCHHHHHHHHHHHHHHHCCCCCCCC---------EEEEEE T ss_conf 6776630644576542077526689989987-077878768036566766654333126666753---------699830 Q ss_pred CCCCCCCCCHHHHHH-HHHHHCCCEEEEEEECCCCCCCCHHHHH--HHHHCCCCCEEEEC---CHHHHHHHHHHHHHH Q ss_conf 066888854089999-9999879689999943788743117899--98606898378829---989999999999875 Q gi|254780388|r 376 DGENTQDNEEGIAIC-NKAKSQGIRIMTIAFSVNKTQQEKARYF--LSNCASPNSFFEAN---STHELNKIFRDRIGN 447 (458) Q Consensus 376 DG~n~~~~~~~~~~C-~~~K~~gI~IytI~f~~~~~~~~~~~~~--lk~CAs~~~yy~a~---~~~eL~~aF~~~i~~ 447 (458) |||+|... .| +..|..|..||||++|= +.++.| |.+.. +|+.|+|. +-..|.+||.. |.. T Consensus 416 DGEDN~i~-----sC~~eVkqsGaIiHtiALGp-----sAa~ele~lS~mT-GG~~fYa~D~~~~NgLidAFg~-lsS 481 (874) T TIGR00868 416 DGEDNTIS-----SCIEEVKQSGAIIHTIALGP-----SAAKELEELSDMT-GGLRFYASDEADNNGLIDAFGA-LSS 481 (874) T ss_pred CCCCCCEE-----ECHHHHHCCCEEEEEEECCH-----HHHHHHHHHHHHC-CCCEEEEECHHHCCCHHHHHHH-HCC T ss_conf 68757623-----13055410980899850784-----5899999987333-8711334133331414546642-214 No 16 >cd01464 vWA_subfamily VWA subfamily: Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). 
Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if Probab=99.32 E-value=5e-11 Score=88.81 Aligned_cols=154 Identities=16% Similarity=0.112 Sum_probs=100.5 Q ss_pred CHHHHHHHHHHHCCCCCCCCCCCCCEEEEEEECCCCCCCCCCCCCHHHHHHHHHHHHHHCCCCCCCCCCHHHHHHHHHHH Q ss_conf 06677766522112346775666520121100677665565576812100566665641256777776458899999862 Q gi|254780388|r 271 KHLVRDALASVIRSIKKIDNVNDTVRMGATFFNDRVISDPSFSWGVHKLIRTIVKTFAIDENEMGSTAINDAMQTAYDTI 350 (458) Q Consensus 271 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~i~~~~~~g~T~~~~gl~~g~~~L 350 (458) ++.++.+...++..+...+......+.+...|+.......|+..- ....+..+.+.|+|+.+.||..+...| T Consensus 21 i~~~k~al~~~~~~L~~d~~a~~~~~vsVItF~s~a~~~~pl~~~--------~~~~~~~L~a~G~T~~g~al~~a~~~l 92 (176) T cd01464 21 IEALNQGLQMLQSELRQDPYALESVEISVITFDSAARVIVPLTPL--------ESFQPPRLTASGGTSMGAALELALDCI 92 (176) T ss_pred HHHHHHHHHHHHHHHHCCCCCHHEEEEEEEEECCCEEEECCCCCH--------HHCCCCCCCCCCCCHHHHHHHHHHHHH T ss_conf 999999999999997118310113269999978951780586347--------664755477789981999999999999 Q ss_pred CCCCCCCCCCCCCCCCCCEEEEEEECCCCCCCCCHHHHHHHHHHHCCCEEEEEEECCCCCCCCHHHHHHHHHCCCCCEEE Q ss_conf 36666544445667775069999960668888540899999999879689999943788743117899986068983788 Q 
gi|254780388|r 351 ISSNEDEVHRMKNNLEAKKYIVLLTDGENTQDNEEGIAICNKAKSQGIRIMTIAFSVNKTQQEKARYFLSNCASPNSFFE 430 (458) Q Consensus 351 s~~~~~~~~~~~~~~~~~K~iil~TDG~n~~~~~~~~~~C~~~K~~gI~IytI~f~~~~~~~~~~~~~lk~CAs~~~yy~ 430 (458) ........+ ......+.+|||||||+.|..-......++..+..++.|++||+|.+. ..++|++.+.. -.. T Consensus 93 ~~~~~~~~~--~~~~~~~P~I~LlTDG~PtD~~~~~~~~~~~~~~~~~~i~a~giG~da-----d~~~L~~is~~--~~~ 163 (176) T cd01464 93 DRRVQRYRA--DQKGDWRPWVFLLTDGEPTDDLTAAIERIKEARDSKGRIVACAVGPKA-----DLDTLKQITEG--VPL 163 (176) T ss_pred HHHHHHCCC--CCCCCCCEEEEEECCCCCCCCHHHHHHHHHHHHHCCCEEEEEEEECCC-----CHHHHHHHHCC--CCC T ss_conf 986522365--566775317999668998875899999999888639769999973871-----89999988577--745 Q ss_pred ECCHHHHHHHH Q ss_conf 29989999999 Q gi|254780388|r 431 ANSTHELNKIF 441 (458) Q Consensus 431 a~~~~eL~~aF 441 (458) ..++.+..+.| T Consensus 164 ~~~~~~f~~ff 174 (176) T cd01464 164 LDDALSGLNFF 174 (176) T ss_pred CCCHHHHHHHH T ss_conf 34534588850 No 17 >cd01450 vWFA_subfamily_ECM Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. 
Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A Probab=99.27 E-value=3e-10 Score=83.80 Aligned_cols=133 Identities=20% Similarity=0.259 Sum_probs=89.9 Q ss_pred HHHHHHHHHCCCCCCCCCCCCCEEEEEEECCCCCCCCCCCCCHHHHHHHHHHHHHHCCCCC--CCCCCHHHHHHHHHHHC Q ss_conf 7776652211234677566652012110067766556557681210056666564125677--77764588999998623 Q gi|254780388|r 274 VRDALASVIRSIKKIDNVNDTVRMGATFFNDRVISDPSFSWGVHKLIRTIVKTFAIDENEM--GSTAINDAMQTAYDTII 351 (458) Q Consensus 274 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~i~~~~~~--g~T~~~~gl~~g~~~Ls 351 (458) .+..+..++...... ....+.+...|++......++.... ++..+...+..+... ++|++..+|.++.+.+. T Consensus 21 ~k~~i~~~i~~~~~~---~~~~rv~lv~fs~~~~~~~~l~~~~---~~~~l~~~i~~l~~~~~~~t~~~~AL~~~~~~~~ 94 (161) T cd01450 21 VKDFIEKLVEKLDIG---PDKTRVGLVQYSDDVRVEFSLNDYK---SKDDLLKAVKNLKYLGGGGTNTGKALQYALEQLF 94 (161) T ss_pred HHHHHHHHHHHCCCC---CCCCEEEEEEECCCEEEEECCCCCC---CHHHHHHHHHHCCCCCCCCCCHHHHHHHHHHHHH T ss_conf 999999999970568---8785899999557316871465646---6999999998421368998548999999999986 Q ss_pred CCCCCCCCCCCCCCCCCEEEEEEECCCCCCCCCHHHHHHHHHHHCCCEEEEEEECCCCCCCCHHHHHHHHHCCCC Q ss_conf 666654444566777506999996066888854089999999987968999994378874311789998606898 Q gi|254780388|r 352 SSNEDEVHRMKNNLEAKKYIVLLTDGENTQDNEEGIAICNKAKSQGIRIMTIAFSVNKTQQEKARYFLSNCASPN 426 (458) Q Consensus 352 ~~~~~~~~~~~~~~~~~K~iil~TDG~n~~~~~~~~~~C~~~K~~gI~IytI~f~~~~~~~~~~~~~lk~CAs~~ 426 (458) .... ...+..|++|++|||..+.. .......+.+|++||+||+||++. . 
....|+..|+.+ T Consensus 95 ~~~~-------~r~~~~kvivllTDG~~~~~-~~~~~~a~~lk~~gi~v~~vgiG~-~-----~~~~L~~iA~~p 155 (161) T cd01450 95 SESN-------ARENVPKVIIVLTDGRSDDG-GDPKEAAAKLKDEGIKVFVVGVGP-A-----DEEELREIASCP 155 (161) T ss_pred HCCC-------CCCCCCEEEEEEECCCCCCC-CCHHHHHHHHHHCCCEEEEEEECC-C-----CHHHHHHHHCCC T ss_conf 1446-------66667549999825887887-479999999998899899998264-8-----999999997799 No 18 >cd01469 vWA_integrins_alpha_subunit Integrins are a class of adhesion receptors that link the extracellular matrix to the cytoskeleton and cooperate with growth factor receptors to promote cell survival, cell cycle progression and cell migration. Integrins consist of an alpha and a beta sub-unit. Each sub-unit has a large extracellular portion, a single transmembrane segment and a short cytoplasmic domain. The N-terminal domains of the alpha and beta subunits associate to form the integrin headpiece, which contains the ligand binding site, whereas the C-terminal segments traverse the plasma membrane and mediate interaction with the cytoskeleton and with signalling proteins. The VWA domains present in the alpha subunits of integrins seem to be a chordate specific radiation of the gene family being found only in vertebrates. They mediate protein-protein interactions. Probab=99.21 E-value=1.4e-09 Score=79.61 Aligned_cols=152 Identities=19% Similarity=0.269 Sum_probs=96.7 Q ss_pred HHHHHHHHHCCCCCCCCCCCCCEEEEEEECCCCCCCCCCCCCHHHHHHHHHHHHHHC-CCCCCCCCCHHHHHHHHHHHCC Q ss_conf 777665221123467756665201211006776655655768121005666656412-5677777645889999986236 Q gi|254780388|r 274 VRDALASVIRSIKKIDNVNDTVRMGATFFNDRVISDPSFSWGVHKLIRTIVKTFAID-ENEMGSTAINDAMQTAYDTIIS 352 (458) Q Consensus 274 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~i~~-~~~~g~T~~~~gl~~g~~~Ls~ 352 (458) .+..+..++..... .....+.+...|+........+.... ++..+...+.. .+..|+|+...+|.++.+.+..
T Consensus 21 ~k~fi~~lv~~f~i---~~~~~rvglv~ys~~~~~~~~l~~~~---~~~~~~~~i~~i~~~~g~t~t~~AL~~a~~~~f~ 94 (177) T cd01469 21 VKNFLSTVMKKLDI---GPTKTQFGLVQYSESFRTEFTLNEYR---TKEEPLSLVKHISQLLGLTNTATAIQYVVTELFS 94 (177) T ss_pred HHHHHHHHHHHCCC---CCCCCEEEEEEECCCEEEEEECCCCC---CHHHHHHHHHHCCCCCCCCCHHHHHHHHHHHHCC T ss_conf 99999999986676---99874899999368249998235567---7899999986230368975252799999998536 Q ss_pred CCCCCCCCCCCCCCCCEEEEEEECCCCCCCCCHHHHHHHHHHHCCCEEEEEEECCCCCCCCHHHHHHHHHCC-C--CCEE Q ss_conf 666544445667775069999960668888540899999999879689999943788743117899986068-9--8378 Q gi|254780388|r 353 SNEDEVHRMKNNLEAKKYIVLLTDGENTQDNEEGIAICNKAKSQGIRIMTIAFSVNKTQQEKARYFLSNCAS-P--NSFF 429 (458) Q Consensus 353 ~~~~~~~~~~~~~~~~K~iil~TDG~n~~~~~~~~~~C~~~K~~gI~IytI~f~~~~~~~~~~~~~lk~CAs-~--~~yy 429 (458) ...+.+ ++..|++|++|||..+ .+.......+.+|++||+||+||++-..+.. ....-|+.+|| | .|.| T Consensus 95 ~~~g~R------~~~~kv~ivlTDG~s~-d~~~~~~~~~~lk~~gv~vf~VGvG~~~~~~-~~~~eL~~iAs~P~~~hvf 166 (177) T cd01469 95 ESNGAR------KDATKVLVVITDGESH-DDPLLKDVIPQAEREGIIRYAIGVGGHFQRE-NSREELKTIASKPPEEHFF 166 (177) T ss_pred CCCCCC------CCCEEEEEEEECCCCC-CCCCHHHHHHHHHHCCEEEEEEEECCCCCCC-CCHHHHHHHHCCCCHHCEE T ss_conf 455886------7871699999789867-7501499999999799089999955514674-5199999996798587199 Q ss_pred EECCHHHHHH Q ss_conf 8299899999 Q gi|254780388|r 430 EANSTHELNK 439 (458) Q Consensus 430 ~a~~~~eL~~ 439 (458) .+.+=++|.+ T Consensus 167 ~~~~f~~L~~ 176 (177) T cd01469 167 NVTDFAALKD 176 (177) T ss_pred EECCHHHHCC T ss_conf 8379777646 No 19 >smart00327 VWA von Willebrand factor (vWF) type A domain. VWA domains in extracellular eukaryotic proteins mediate adhesion via metal ion-dependent adhesion sites (MIDAS). Intracellular VWA domains and homologues in prokaryotes have recently been identified. The proposed VWA domains in integrin beta subunits have recently been substantiated using sequence-based methods. 
Probab=99.18 E-value=2.7e-09 Score=77.70 Aligned_cols=140 Identities=16% Similarity=0.226 Sum_probs=89.7 Q ss_pred HHHHHHHHHHCCCCCCCCCCCCCEEEEEEECCCCCCCCCCCCCHHHHHHHHHHHHHHCCC--CCCCCCCHHHHHHHHHHH Q ss_conf 677766522112346775666520121100677665565576812100566665641256--777776458899999862 Q gi|254780388|r 273 LVRDALASVIRSIKKIDNVNDTVRMGATFFNDRVISDPSFSWGVHKLIRTIVKTFAIDEN--EMGSTAINDAMQTAYDTI 350 (458) Q Consensus 273 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~i~~~~--~~g~T~~~~gl~~g~~~L 350 (458) ..+..+..++....... ...+.+...|+.......++. ...+...+...+..+. +.|+|+...+|.++++.+ T Consensus 21 ~~k~~~~~~i~~l~~~~---~~~~v~vv~f~~~~~~~~~~~---~~~~~~~~~~~i~~l~~~~~g~t~~~~al~~a~~~~ 94 (177) T smart00327 21 KAKEFVLKLVEQLDIGP---DGDRVGLVTFSDDATVLFPLN---DSRSKDALLEALASLSYKLGGGTNLGAALQYALENL 94 (177) T ss_pred HHHHHHHHHHHHHHCCC---CCCEEEEEEECCCEEEEECCC---CCCCHHHHHHHHHHCCCCCCCCCCCHHHHHHHHHHH T ss_conf 99999999999864179---987899999637268997688---868999999999714155788776428999999999 Q ss_pred CCCCCCCCCCCCCCCCCCEEEEEEECCCCCCCCCHHHHHHHHHHHCCCEEEEEEECCCCCCCCHHHHHHHHHCC--CCCE Q ss_conf 36666544445667775069999960668888540899999999879689999943788743117899986068--9837 Q gi|254780388|r 351 ISSNEDEVHRMKNNLEAKKYIVLLTDGENTQDNEEGIAICNKAKSQGIRIMTIAFSVNKTQQEKARYFLSNCAS--PNSF 428 (458) Q Consensus 351 s~~~~~~~~~~~~~~~~~K~iil~TDG~n~~~~~~~~~~C~~~K~~gI~IytI~f~~~~~~~~~~~~~lk~CAs--~~~y 428 (458) ....... .....|++|++|||..+.. .......+.+|+.||.||+||++... 
....|+..|+ .+.| T Consensus 95 ~~~~~~~------~~~~~~~iil~TDG~~~~~-~~~~~~~~~~~~~~v~i~~ig~g~~~-----~~~~l~~ia~~~~~~~ 162 (177) T smart00327 95 FSKSAGS------RRGAPKVLILITDGESNDG-GDLLKAAKELKRSGVKVFVVGVGNDV-----DEEELKKLASAPGGVY 162 (177) T ss_pred HHHHCCC------CCCCCEEEEEEECCCCCCC-HHHHHHHHHHHHCCCEEEEEEECCCC-----CHHHHHHHHHCCCCEE T ss_conf 7665037------7887428999805887872-52999999998679489999958847-----9999999984899659 Q ss_pred EE Q ss_conf 88 Q gi|254780388|r 429 FE 430 (458) Q Consensus 429 y~ 430 (458) ++ T Consensus 163 ~~ 164 (177) T smart00327 163 VF 164 (177) T ss_pred EE T ss_conf 99 No 20 >cd01472 vWA_collagen von Willebrand factor (vWF) type A domain; equivalent to the I-domain of integrins. This domain has a variety of functions including: intermolecular adhesion, cell migration, signalling, transcription, and DNA repair. In integrins these domains form heterodimers while in vWF it forms homodimers and multimers. There are different interaction surfaces of this domain as seen by its complexes with collagen with either integrin or human vWFA. In integrins collagen binding occurs via the metal ion-dependent adhesion site (MIDAS) and involves three surface loops located on the upper surface of the molecule. In human vWFA, collagen binding is thought to occur on the bottom of the molecule and does not involve the vestigial MIDAS motif. Probab=99.16 E-value=2.2e-09 Score=78.33 Aligned_cols=139 Identities=17% Similarity=0.200 Sum_probs=90.6 Q ss_pred HHHHHHHHHCCCCCCCCCCCCCEEEEEEECCCCCCCCCCCCCHHHHHHHHHHHHHHCCCC-CCCCCCHHHHHHHHHHHCC Q ss_conf 777665221123467756665201211006776655655768121005666656412567-7777645889999986236 Q gi|254780388|r 274 VRDALASVIRSIKKIDNVNDTVRMGATFFNDRVISDPSFSWGVHKLIRTIVKTFAIDENE-MGSTAINDAMQTAYDTIIS 352 (458) Q Consensus 274 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~i~~~~~-~g~T~~~~gl~~g~~~Ls~ 352 (458) .+..+..++...... ....+.+...|+.......++... .++..+...+..+.. +|+|+++.+|.++...+.. 
T Consensus 21 ~k~fi~~li~~~~i~---~~~~rvgvv~fs~~~~~~~~l~~~---~~~~~l~~~i~~i~~~~g~t~~~~AL~~~~~~~~~ 94 (164) T cd01472 21 VKDFVKRVVERLDIG---PDGVRVGVVQYSDDPRTEFYLNTY---RSKDDVLEAVKNLRYIGGGTNTGKALKYVRENLFT 94 (164) T ss_pred HHHHHHHHHHHCCCC---CCCCEEEEEEECCCEEEEECCCCC---CCHHHHHHHHHHHHCCCCCCHHHHHHHHHHHHHHC T ss_conf 999999999964768---886089999824741587445466---98899999998611668975299999999998635 Q ss_pred CCCCCCCCCCCCCCCCEEEEEEECCCCCCCCCHHHHHHHHHHHCCCEEEEEEECCCCCCCCHHHHHHHHHCCC---CCEE Q ss_conf 6665444456677750699999606688885408999999998796899999437887431178999860689---8378 Q gi|254780388|r 353 SNEDEVHRMKNNLEAKKYIVLLTDGENTQDNEEGIAICNKAKSQGIRIMTIAFSVNKTQQEKARYFLSNCASP---NSFF 429 (458) Q Consensus 353 ~~~~~~~~~~~~~~~~K~iil~TDG~n~~~~~~~~~~C~~~K~~gI~IytI~f~~~~~~~~~~~~~lk~CAs~---~~yy 429 (458) .... ..++..|++|++|||..+ .........+|++||+||+||++- ..+.-|+..||. .|+| T Consensus 95 ~~~~------~r~~~~kvvvllTDG~s~---~~~~~~a~~lr~~Gi~v~~VGig~------~~~~~L~~iAs~p~~~~~~ 159 (164) T cd01472 95 EASG------SREGVPKVLVVITDGKSQ---DDVEEPAVELKQAGIEVFAVGVKN------ADEEELKQIASDPKELYVF 159 (164) T ss_pred CCCC------CCCCCEEEEEEEECCCCC---CHHHHHHHHHHHCCCEEEEEECCC------CCHHHHHHHHCCCCHHEEE T ss_conf 3578------767851599998379986---408899999998898899997884------7999999996799378389 Q ss_pred EECC Q ss_conf 8299 Q gi|254780388|r 430 EANS 433 (458) Q Consensus 430 ~a~~ 433 (458) .+.+ T Consensus 160 ~~~~ 163 (164) T cd01472 160 NVAD 163 (164) T ss_pred ECCC T ss_conf 6588 No 21 >cd01473 vWA_CTRP CTRP for CS protein-TRAP-related protein: Adhesion of Plasmodium to host cells is an important phenomenon in parasite invasion and in malaria associated pathology.CTRP encodes a protein containing a putative signal sequence followed by a long extracellular region of 1990 amino acids, a transmembrane domain, and a short cytoplasmic segment. The extracellular region of CTRP contains two separated adhesive domains. The first domain contains six 210-amino acid-long homologous VWA domain repeats. 
The second domain contains seven repeats of 87-60 amino acids in length, which share similarities with the thrombospondin type 1 domain found in a variety of adhesive molecules. Finally, CTRP also contains consensus motifs found in the superfamily of haematopoietin receptors. The VWA domains in these proteins likely mediate protein-protein interactions. Probab=99.15 E-value=7.5e-09 Score=74.89 Aligned_cols=143 Identities=19% Similarity=0.180 Sum_probs=94.3 Q ss_pred CCCCCEEEEEEECCCCCCCCCCCCCHHHHHHHHHHHHHHCC----CCCCCCCCHHHHHHHHHHHCCCCCCCCCCCCCCCC Q ss_conf 66652012110067766556557681210056666564125----67777764588999998623666654444566777 Q gi|254780388|r 291 VNDTVRMGATFFNDRVISDPSFSWGVHKLIRTIVKTFAIDE----NEMGSTAINDAMQTAYDTIISSNEDEVHRMKNNLE 366 (458) Q Consensus 291 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~i~~~----~~~g~T~~~~gl~~g~~~Ls~~~~~~~~~~~~~~~ 366 (458) ..+..+.+...|+.......++....+ .++..+...|..+ ..+|+|+...+|..+.+.+.... ...++ T Consensus 36 g~~~~rvgvv~yS~~~~~~~~f~~~~~-~~k~~~l~~i~~l~~~~~~gg~T~tg~AL~~~~~~~~~~~-------g~R~~ 107 (192) T cd01473 36 SKDKVHVGILLFAEKNRDVVPFSDEER-YDKNELLKKINDLKNSYRSGGETYIVEALKYGLKNYTKHG-------NRRKD 107 (192) T ss_pred CCCCEEEEEEEECCCCCEEEECCCCCC-CCHHHHHHHHHHHHHCCCCCCCCHHHHHHHHHHHHHCCCC-------CCCCC T ss_conf 989619999995588740132355443-4899999999998731468982479999999999863467-------88889 Q ss_pred CCEEEEEEECCCCCC-CCCHHHHHHHHHHHCCCEEEEEEECCCCCCCCHHHHHHHHHCC-C--C---CEEEECCHHHHHH Q ss_conf 506999996066888-8540899999999879689999943788743117899986068-9--8---3788299899999 Q gi|254780388|r 367 AKKYIVLLTDGENTQ-DNEEGIAICNKAKSQGIRIMTIAFSVNKTQQEKARYFLSNCAS-P--N---SFFEANSTHELNK 439 (458) Q Consensus 367 ~~K~iil~TDG~n~~-~~~~~~~~C~~~K~~gI~IytI~f~~~~~~~~~~~~~lk~CAs-~--~---~yy~a~~~~eL~~ 439 (458) ..|++|++|||..+. .......+...+|++||+||.||.+... ..-|+..|| | + .+|+..+=++|.. 
T Consensus 108 vpkv~IvlTDG~s~~~~~~~~~~~a~~lr~~gV~i~avGVg~~~------~~eL~~iag~~~~~~~c~~~~~~~fd~l~~ 181 (192) T cd01473 108 APKVTMLFTDGNDTSASKKELQDISLLYKEENVKLLVVGVGAAS------ENKLKLLAGCDINNDNCPNVIKTEWNNLNG 181 (192) T ss_pred CCEEEEEEECCCCCCCCHHHHHHHHHHHHHCCCEEEEEEECCCC------HHHHHHHHCCCCCCCCCCEEEECCHHHHHH T ss_conf 97499999569988731678999999999879789999806379------999999869998899775799479789999 Q ss_pred HHHHHHHH Q ss_conf 99999875 Q gi|254780388|r 440 IFRDRIGN 447 (458) Q Consensus 440 aF~~~i~~ 447 (458) ...++..+ T Consensus 182 i~~~l~~~ 189 (192) T cd01473 182 ISKFLTDK 189 (192) T ss_pred HHHHHHHH T ss_conf 99999997 No 22 >cd01471 vWA_micronemal_protein Micronemal proteins: The Toxoplasma lytic cycle begins when the parasite actively invades a target cell. In association with invasion, T. gondii sequentially discharges three sets of secretory organelles beginning with the micronemes, which contain adhesive proteins involved in parasite attachment to a host cell. Deployed as protein complexes, several micronemal proteins possess vertebrate-derived adhesive sequences that function in binding receptors. The VWA domain likely mediates the protein-protein interactions of these with their interacting partners. Probab=99.10 E-value=1.6e-08 Score=72.81 Aligned_cols=151 Identities=13% Similarity=0.136 Sum_probs=95.5 Q ss_pred HHHHHHHHHCCCCCCCCCCCCCEEEEEEECCCCCCCCCCCCCHHHHHHH---HHHHHHHCC-CCCCCCCCHHHHHHHHHH Q ss_conf 7776652211234677566652012110067766556557681210056---666564125-677777645889999986 Q gi|254780388|r 274 VRDALASVIRSIKKIDNVNDTVRMGATFFNDRVISDPSFSWGVHKLIRT---IVKTFAIDE-NEMGSTAINDAMQTAYDT 349 (458) Q Consensus 274 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~---~~~~~i~~~-~~~g~T~~~~gl~~g~~~ 349 (458) .+..+..+....+ -..+..+.+...|+.......++....+. ++. .....+..+ +.+|+|+.+.+|..+.+. 
T Consensus 22 ~k~F~~~lv~~f~---I~~~~~rVgvv~ys~~~~~~~~l~~~~~~-~~~~~~~~~~~i~~~~y~gg~T~Tg~AL~~a~~~ 97 (186) T cd01471 22 VVPFLHTFVQNLN---ISPDEINLYLVTFSTNAKELIRLSSPNST-NKDLALNAIRALLSLYYPNGSTNTTSALLVVEKH 97 (186) T ss_pred HHHHHHHHHHHCC---CCCCCEEEEEEEECCCCEEEEECCCCCCC-CHHHHHHHHHHHHHCCCCCCCCHHHHHHHHHHHH T ss_conf 9999999999749---69884499999954870599875775544-6567999999998377789967799999999997 Q ss_pred HCCCCCCCCCCCCCCCCCCEEEEEEECCCCCCCCCHHHHHHHHHHHCCCEEEEEEECCCCCCCCHHHHHHHHHCC-CCC- Q ss_conf 236666544445667775069999960668888540899999999879689999943788743117899986068-983- Q gi|254780388|r 350 IISSNEDEVHRMKNNLEAKKYIVLLTDGENTQDNEEGIAICNKAKSQGIRIMTIAFSVNKTQQEKARYFLSNCAS-PNS- 427 (458) Q Consensus 350 Ls~~~~~~~~~~~~~~~~~K~iil~TDG~n~~~~~~~~~~C~~~K~~gI~IytI~f~~~~~~~~~~~~~lk~CAs-~~~- 427 (458) +.... ...++..|++|++|||..+ .+......++.+|++||+||+||.+.+.+. .-|+..|+ ++. T Consensus 98 ~f~~~-------g~R~~vpkv~illTDG~s~-d~~~~~~~a~~Lr~~GV~ifavGVG~~v~~-----~eL~~Iag~~~~~ 164 (186) T cd01471 98 LFDTR-------GNRENAPQLVIIMTDGIPD-SKFRTLKEARKLRERGVIIAVLGVGQGVNH-----EENRSLVGCDPDD 164 (186) T ss_pred HCCCC-------CCCCCCCEEEEEEECCCCC-CCCHHHHHHHHHHHCCCEEEEEECCCCCCH-----HHHHHHCCCCCCC T ss_conf 21146-------8899998599999069877-852589999999988999999983432499-----9999970999888 Q ss_pred ----EEEECCHHHHHHHH Q ss_conf ----78829989999999 Q gi|254780388|r 428 ----FFEANSTHELNKIF 441 (458) Q Consensus 428 ----yy~a~~~~eL~~aF 441 (458) .|..++=++|...- T Consensus 165 ~~c~~~~~~~~~~l~~~~ 182 (186) T cd01471 165 SPCPLYLQSSWSEVQNVI 182 (186) T ss_pred CCCCEEEECCHHHHHHHH T ss_conf 998657517888887477 No 23 >cd00198 vWFA Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. 
The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation, and immune defenses. In integrins these domains form heterodimers, while in vWF they form multimers. There are different interaction surfaces of this domain, as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by a metal ion-dependent adhesion site, termed the MIDAS motif, that is a characteristic feature of most, if not all, A domains. Probab=99.07 E-value=4e-09 Score=76.62 Aligned_cols=132 Identities=20% Similarity=0.220 Sum_probs=87.0 Q ss_pred HHHHHHHHHCCCCCCCCCCCCCEEEEEEECCCCCCCCCCCCCHHHHHHHHHHHHHHCCC--CCCCCCCHHHHHHHHHHHC Q ss_conf 77766522112346775666520121100677665565576812100566665641256--7777764588999998623 Q gi|254780388|r 274 VRDALASVIRSIKKIDNVNDTVRMGATFFNDRVISDPSFSWGVHKLIRTIVKTFAIDEN--EMGSTAINDAMQTAYDTII 351 (458) Q Consensus 274 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~i~~~--~~g~T~~~~gl~~g~~~Ls 351 (458) .+..+..++...... ....+.+...|+.......++.... +.......+..+. +.|+|+...+|..+.+.+.
T Consensus 21 ~k~~~~~~~~~l~~~---~~~~~v~vv~f~~~~~~~~~~~~~~---~~~~~~~~i~~~~~~~~g~t~~~~al~~a~~~~~ 94 (161) T cd00198 21 AKEALKALVSSLSAS---PPGDRVGLVTFGSNARVVLPLTTDT---DKADLLEAIDALKKGLGGGTNIGAALRLALELLK 94 (161) T ss_pred HHHHHHHHHHHHHHC---CCCCEEEEEEECCCEEEEECCCCHH---HHHHHHHHHHHCCCCCCCCCHHHHHHHHHHHHHH T ss_conf 999999999987655---9998899999379514881474125---7999999775135689998389999999999987 Q ss_pred CCCCCCCCCCCCCCCCCEEEEEEECCCCCCCCCHHHHHHHHHHHCCCEEEEEEECCCCCCCCHHHHHHHHHCCC Q ss_conf 66665444456677750699999606688885408999999998796899999437887431178999860689 Q gi|254780388|r 352 SSNEDEVHRMKNNLEAKKYIVLLTDGENTQDNEEGIAICNKAKSQGIRIMTIAFSVNKTQQEKARYFLSNCASP 425 (458) Q Consensus 352 ~~~~~~~~~~~~~~~~~K~iil~TDG~n~~~~~~~~~~C~~~K~~gI~IytI~f~~~~~~~~~~~~~lk~CAs~ 425 (458) ... .....|++|++|||..+..........+.+|+.||.||+|+++... ....|+..|+. T Consensus 95 ~~~---------~~~~~~~iiliTDG~~~~~~~~~~~~~~~~~~~~v~i~~igig~~~-----~~~~l~~ia~~ 154 (161) T cd00198 95 SAK---------RPNARRVIILLTDGEPNDGPELLAEAARELRKLGITVYTIGIGDDA-----NEDELKEIADK 154 (161) T ss_pred HHC---------CCCCCEEEEEECCCCCCCCHHHHHHHHHHHHHCCCEEEEEEECHHH-----CHHHHHHHHHC T ss_conf 532---------5556517999678998987367999999999779989999966111-----99999999838 No 24 >cd01481 vWA_collagen_alpha3-VI-like VWA_collagen alpha 3(VI) like: The extracellular matrix represents a complex alloy of variable members of diverse protein families defining structural integrity and various physiological functions. The most abundant family is the collagens with more than 20 different collagen types identified thus far. Collagens are centrally involved in the formation of fibrillar and microfibrillar networks of the extracellular matrix, basement membranes as well as other structures of the extracellular matrix. Some collagens have about 15-18 vWA domains in them. The VWA domains present in these collagens mediate protein-protein interactions. 
Probab=99.06 E-value=1.2e-08 Score=73.56 Aligned_cols=141 Identities=16% Similarity=0.197 Sum_probs=91.8 Q ss_pred HHHHHHHHHCCCCCCCCCCCCCEEEEEEECCCCCCCCCCCCCHHHHHHHHHHHHHHCCCCCC--CCCCHHHHHHHHHHHC Q ss_conf 77766522112346775666520121100677665565576812100566665641256777--7764588999998623 Q gi|254780388|r 274 VRDALASVIRSIKKIDNVNDTVRMGATFFNDRVISDPSFSWGVHKLIRTIVKTFAIDENEMG--STAINDAMQTAYDTII 351 (458) Q Consensus 274 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~i~~~~~~g--~T~~~~gl~~g~~~Ls 351 (458) .+..+..+...... ..+..+.+...|+......-.+.. . .++..+...|..+.+.| +|+.+.+|.++.+.+. T Consensus 21 ~k~Fv~~lv~~f~i---~~~~trVgvi~ys~~~~~~f~l~~--~-~~~~~l~~~I~~i~~~~g~~t~tg~AL~~a~~~~f 94 (165) T cd01481 21 IRDFIERIVQSLDV---GPDKIRVAVVQFSDTPRPEFYLNT--H-STKADVLGAVRRLRLRGGSQLNTGSALDYVVKNLF 94 (165) T ss_pred HHHHHHHHHHHHCC---CCCCEEEEEEEECCCEEEEEECCC--C-CCHHHHHHHHHHHHCCCCCCEEHHHHHHHHHHHHC T ss_conf 99999999996046---888627889998686479997677--6-89999999998410458984369999999999716 Q ss_pred CCCCCCCCCCCCCCCCCEEEEEEECCCCCCCCCHHHHHHHHHHHCCCEEEEEEECCCCCCCCHHHHHHHHHCC-CCCEEE Q ss_conf 6666544445667775069999960668888540899999999879689999943788743117899986068-983788 Q gi|254780388|r 352 SSNEDEVHRMKNNLEAKKYIVLLTDGENTQDNEEGIAICNKAKSQGIRIMTIAFSVNKTQQEKARYFLSNCAS-PNSFFE 430 (458) Q Consensus 352 ~~~~~~~~~~~~~~~~~K~iil~TDG~n~~~~~~~~~~C~~~K~~gI~IytI~f~~~~~~~~~~~~~lk~CAs-~~~yy~ 430 (458) ....+... .+...|++|++|||.-+ .......+.+|+.||+||+||.+. .+ ..-|+..|| |++.|. 
T Consensus 95 ~~~~g~R~----r~~v~kvlvviTdG~s~---d~~~~~a~~lr~~gV~i~aVGvg~-~~-----~~eL~~IAs~p~~vf~ 161 (165) T cd01481 95 TKSAGSRI----EEGVPQFLVLITGGKSQ---DDVERPAVALKRAGIVPFAIGARN-AD-----LAELQQIAFDPSFVFQ 161 (165) T ss_pred CCCCCCCC----CCCCCEEEEEEECCCCC---CHHHHHHHHHHHCCCEEEEEECCC-CC-----HHHHHHHHCCCCCEEE T ss_conf 75678875----57998699998489885---378999999998897899996897-99-----9999998589877697 Q ss_pred ECC Q ss_conf 299 Q gi|254780388|r 431 ANS 433 (458) Q Consensus 431 a~~ 433 (458) +.+ T Consensus 162 ~~~ 164 (165) T cd01481 162 VSD 164 (165) T ss_pred CCC T ss_conf 389 No 25 >TIGR03436 acidobact_VWFA VWFA-related Acidobacterial domain. Members of this family are bacterial domains that include a region related to the von Willebrand factor type A (VWFA) domain (pfam00092). These domains are restricted to, and have undergone a large paralogous family expansion in, the Acidobacteria, including Solibacter usitatus and Acidobacterium capsulatum ATCC 51196. Probab=99.04 E-value=3.6e-08 Score=70.56 Aligned_cols=109 Identities=19% Similarity=0.276 Sum_probs=75.4 Q ss_pred CCCCCCCCHHHHHHHHH-HHCCCCCCCCCCCCCCCCCCEEEEEEECCCCCCCCCHHHHHHHHHHHCCCEEEEEEECCCCC Q ss_conf 67777764588999998-62366665444456677750699999606688885408999999998796899999437887 Q gi|254780388|r 332 NEMGSTAINDAMQTAYD-TIISSNEDEVHRMKNNLEAKKYIVLLTDGENTQDNEEGIAICNKAKSQGIRIMTIAFSVNKT 410 (458) Q Consensus 332 ~~~g~T~~~~gl~~g~~-~Ls~~~~~~~~~~~~~~~~~K~iil~TDG~n~~~~~~~~~~C~~~K~~gI~IytI~f~~~~~ 410 (458) .+.|+|+....+..+.. .+... ......+|+||++|||.++........+-+.++..+|.||+|++..... 
T Consensus 136 ~~~g~tal~dAi~laa~~~~~~~--------~~~~~gRK~li~iSdG~d~~s~~~~~~~~~~a~~a~v~IY~I~~~~~~~ 207 (296) T TIGR03436 136 ADAGGTALYDAITLAALQQLANA--------LAGIPGRKALIVISDGEDNSSRDTLERAIEAAQRADVLIYSIDARGLRA 207 (296) T ss_pred CCCCCCCHHHHHHHHHHHHHHHH--------CCCCCCCEEEEEEECCCCCCCCCCHHHHHHHHHHCCCEEEEECCCCCCC T ss_conf 57874102788999999998754--------0479886799999269886330489999999998497799954676566 Q ss_pred --------CCCHHHHHHHHHC--CCCCEEEECCHHHHHHHHHHHHHHHHH Q ss_conf --------4311789998606--898378829989999999999875875 Q gi|254780388|r 411 --------QQEKARYFLSNCA--SPNSFFEANSTHELNKIFRDRIGNEIF 450 (458) Q Consensus 411 --------~~~~~~~~lk~CA--s~~~yy~a~~~~eL~~aF~~~i~~~~~ 450 (458) ..-.+...|+..| ++|.+|+... .||.++|++ |.+++. T Consensus 208 ~~~~~~~~~~~~~~~~L~~lA~~TGG~~f~~~~-~dl~~~~~~-i~~~lr 255 (296) T TIGR03436 208 PDLGAGAKAGLSGPETLERLAAETGGRAFYVNS-NDIDEAFAQ-IAEELR 255 (296) T ss_pred CCCCCCCCCCCCCHHHHHHHHHHHCCEEECCCC-CCHHHHHHH-HHHHHH T ss_conf 564444455676279999999973996755474-108999999-999875 No 26 >cd01477 vWA_F09G8-8_type VWA F09G8.8 type: Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. 
Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of mo Probab=99.02 E-value=3.3e-08 Score=70.77 Aligned_cols=132 Identities=17% Similarity=0.225 Sum_probs=90.4 Q ss_pred CCCCCCEEEEEEECCCCCCCCCCCCCHHHHH-HHHHHHHHHCCCCCCCCCCHHHHHHHHHHHCCCCCCCCCCCCCCCCCC Q ss_conf 5666520121100677665565576812100-566665641256777776458899999862366665444456677750 Q gi|254780388|r 290 NVNDTVRMGATFFNDRVISDPSFSWGVHKLI-RTIVKTFAIDENEMGSTAINDAMQTAYDTIISSNEDEVHRMKNNLEAK 368 (458) Q Consensus 290 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~-~~~~~~~i~~~~~~g~T~~~~gl~~g~~~Ls~~~~~~~~~~~~~~~~~ 368 (458) ......|.+.++|+........+....+..+ ...+.............+...||..+-++|..... ....+++ T Consensus 59 ~~pr~TRVGlVTYn~~AtvvAdLn~~~S~ddl~~~i~~~l~~vsss~~SyL~~GL~aA~~~l~~~~~------~~R~nyk 132 (193) T cd01477 59 DDPRSTRVGLVTYNSNATVVADLNDLQSFDDLYSQIQGSLTDVSSTNASYLDTGLQAAEQMLAAGKR------TSRENYK 132 (193) T ss_pred CCCCCEEEEEEEECCCCEEEECCCCCCCHHHHHHHHHHHHCCCCCCCHHHHHHHHHHHHHHHHHCCC------CCCCCCC T ss_conf 9987338999996787459863454565788999998875146666312799999999999983326------6424862 Q ss_pred EEEEEEECCCCCCCCCHHHHHHHHHHHCCCEEEEEEECCCCCCCCHHHHHHHHHCCCCCEE Q ss_conf 6999996066888854089999999987968999994378874311789998606898378 Q gi|254780388|r 369 KYIVLLTDGENTQDNEEGIAICNKAKSQGIRIMTIAFSVNKTQQEKARYFLSNCASPNSFF 429 (458) Q Consensus 369 K~iil~TDG~n~~~~~~~~~~C~~~K~~gI~IytI~f~~~~~~~~~~~~~lk~CAs~~~yy 429 (458) |++|+++-.-+..+......+.+.+|..|+.|-||+|+.+.+. 
.....|++||||++-| T Consensus 133 KVVIVyAs~y~~~g~~dp~pvA~rLK~~Gv~IiTVa~~q~~~~--~~~~~L~~IASpg~nF 191 (193) T cd01477 133 KVVIVFASDYNDEGSNDPRPIAARLKSTGIAIITVAFTQDESS--NLLDKLGKIASPGMNF 191 (193) T ss_pred EEEEEEECCCCCCCCCCHHHHHHHHHHCCCEEEEEECCCCCCH--HHHHHHHHHCCCCCCC T ss_conf 7999995024678988869999999876978999982688758--8999888757998887 No 27 >cd01482 vWA_collagen_alphaI-XII-like Collagen: The extracellular matrix represents a complex alloy of variable members of diverse protein families defining structural integrity and various physiological functions. The most abundant family is the collagens with more than 20 different collagen types identified thus far. Collagens are centrally involved in the formation of fibrillar and microfibrillar networks of the extracellular matrix, basement membranes as well as other structures of the extracellular matrix. Some collagens have about 15-18 vWA domains in them. The VWA domains present in these collagens mediate protein-protein interactions. Probab=98.94 E-value=6.1e-08 Score=69.08 Aligned_cols=139 Identities=17% Similarity=0.237 Sum_probs=86.2 Q ss_pred HHHHHHHHHCCCCCCCCCCCCCEEEEEEECCCCCCCCCCCCCHHHHHHHHHHHHHHCC-CCCCCCCCHHHHHHHHHHHCC Q ss_conf 7776652211234677566652012110067766556557681210056666564125-677777645889999986236 Q gi|254780388|r 274 VRDALASVIRSIKKIDNVNDTVRMGATFFNDRVISDPSFSWGVHKLIRTIVKTFAIDE-NEMGSTAINDAMQTAYDTIIS 352 (458) Q Consensus 274 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~i~~~-~~~g~T~~~~gl~~g~~~Ls~ 352 (458) .+..+..+...... .....+.+...|+.......++.... ++..+...|..+ +.+|+|+.+.+|.++.+.+.. 
T Consensus 21 ~k~fi~~lv~~f~i---~~~~~rvgvv~ys~~~~~~~~l~~~~---~~~~l~~~i~~i~~~~g~t~~~~AL~~~~~~~f~ 94 (164) T cd01482 21 VRSFLSSVVEAFEI---GPDGVQVGLVQYSDDPRTEFDLNAYT---SKEDVLAAIKNLPYKGGNTRTGKALTHVREKNFT 94 (164) T ss_pred HHHHHHHHHHHCCC---CCCCEEEEEEEECCCCEEEECCCCCC---CHHHHHHHHHHCCCCCCCCCHHHHHHHHHHHHHC T ss_conf 99999999996476---88862899999447512787343469---9899999986402668997289999999998615 Q ss_pred CCCCCCCCCCCCCCCCEEEEEEECCCCCCCCCHHHHHHHHHHHCCCEEEEEEECCCCCCCCHHHHHHHHHCC-C--CCEE Q ss_conf 666544445667775069999960668888540899999999879689999943788743117899986068-9--8378 Q gi|254780388|r 353 SNEDEVHRMKNNLEAKKYIVLLTDGENTQDNEEGIAICNKAKSQGIRIMTIAFSVNKTQQEKARYFLSNCAS-P--NSFF 429 (458) Q Consensus 353 ~~~~~~~~~~~~~~~~K~iil~TDG~n~~~~~~~~~~C~~~K~~gI~IytI~f~~~~~~~~~~~~~lk~CAs-~--~~yy 429 (458) ...+ ..++..|++|++|||.-+ .......+.+|++||+||+||++- . + ..-|+..|| | .|.| T Consensus 95 ~~~g------~R~~~~kvlvliTDG~s~---d~~~~~a~~lr~~gv~i~~VGVg~-~---~--~~eL~~IAs~P~~~hvf 159 (164) T cd01482 95 PDAG------ARPGVPKVVILITDGKSQ---DDVELPARVLRNLGVNVFAVGVKD-A---D--ESELKMIASKPSETHVF 159 (164) T ss_pred HHCC------CCCCCCEEEEEECCCCCC---CHHHHHHHHHHHCCCEEEEEECCC-C---C--HHHHHHHHCCCCHHCEE T ss_conf 0028------988886079996079884---338999999998893899997883-7---8--99999996898566179 Q ss_pred EECC Q ss_conf 8299 Q gi|254780388|r 430 EANS 433 (458) Q Consensus 430 ~a~~ 433 (458) .+.+ T Consensus 160 ~~~~ 163 (164) T cd01482 160 NVAD 163 (164) T ss_pred ECCC T ss_conf 7479 No 28 >cd01462 VWA_YIEM_type VWA YIEM type: Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. 
These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if Probab=98.93 E-value=3.8e-08 Score=70.41 Aligned_cols=78 Identities=24% Similarity=0.213 Sum_probs=62.6 Q ss_pred HHHHHHHHCCCCCCCCCCHHHHHHHHHHHCCCCCCCCCCCCCCCCCCEEEEEEECCCCCCCCCHHHHHHHHHHHCCCEEE Q ss_conf 66665641256777776458899999862366665444456677750699999606688885408999999998796899 Q gi|254780388|r 322 TIVKTFAIDENEMGSTAINDAMQTAYDTIISSNEDEVHRMKNNLEAKKYIVLLTDGENTQDNEEGIAICNKAKSQGIRIM 401 (458) Q Consensus 322 ~~~~~~i~~~~~~g~T~~~~gl~~g~~~Ls~~~~~~~~~~~~~~~~~K~iil~TDG~n~~~~~~~~~~C~~~K~~gI~Iy 401 (458) ......+..+.+.|||++..+|..+...|... ....+.|||+|||+....+......++..+++|+++| T Consensus 60 ~~~~~~i~~~~~~GGT~i~~aL~~A~~~l~~~-----------~~~~~~IvlITDG~~~~~~~~~~~~~~~~~~~~~r~~ 128 (152) T cd01462 60 EEPVEFLSGVQLGGGTDINKALRYALELIERR-----------DPRKADIVLITDGYEGGVSDELLREVELKRSRVARFV 128 (152) T ss_pred HHHHHHHHHCCCCCCCCHHHHHHHHHHHHHCC-----------CCCCCEEEEEECCCCCCCHHHHHHHHHHHHHCCEEEE T ss_conf 99999997253689865799999999987425-----------7656469998267567983999999999983891999 Q ss_pred EEEECCCCC Q ss_conf 999437887 Q gi|254780388|r 402 TIAFSVNKT 410 (458) Q Consensus 402 tI~f~~~~~ 410 (458) +++++...+ T Consensus 129 ~~~iG~~~~ 137 (152) T cd01462 129 ALALGDHGN 137 (152) T ss_pred EEEECCCCC T ss_conf 999899988 No 29 >cd01476 VWA_integrin_invertebrates VWA_integrin (invertebrates): Integrins are a family of cell surface receptors that have diverse functions in cell-cell and cell-extracellular matrix interactions. 
Because of their involvement in many biologically important adhesion processes, integrins are conserved across a wide range of multicellular animals. Integrins from invertebrates have been identified from six phyla. There are no data to date to suggest any immunological functions for the invertebrate integrins. The members of this sub-group have the conserved MIDAS motif that is characteristic of this domain, suggesting the involvement of the integrins in the recognition and binding of multi-ligands. Probab=98.92 E-value=5.9e-08 Score=69.17 Aligned_cols=139 Identities=14% Similarity=0.136 Sum_probs=85.4 Q ss_pred HHHHHHHHHCCCCCCCCCCCCCEEEEEEECCCCCCCCCCCCCHHHHHHHHHHHHHHCCCC-CCCCCCHHHHHHHHHHHCC Q ss_conf 777665221123467756665201211006776655655768121005666656412567-7777645889999986236 Q gi|254780388|r 274 VRDALASVIRSIKKIDNVNDTVRMGATFFNDRVISDPSFSWGVHKLIRTIVKTFAIDENE-MGSTAINDAMQTAYDTIIS 352 (458) Q Consensus 274 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~i~~~~-~g~T~~~~gl~~g~~~Ls~ 352 (458) .+..+..+...... ..+..+.+...|++..........+.. .++..+...|..+.. .|+|+.+.+|..+.+.+.+ T Consensus 20 ~k~F~~~lv~~f~i---~~~~~rVgvv~ys~~~~~~i~f~l~~~-~~~~~l~~~I~~i~~~~g~T~tg~AL~~a~~~~~~ 95 (163) T cd01476 20 YKKYIERIVEGLEI---GPTATRVALITYSGRGRQRVRFNLPKH-NDGEELLEKVDNLRFIGGTTATGAAIEVALQQLDP 95 (163) T ss_pred HHHHHHHHHHHHCC---CCCCEEEEEEEECCCCCEEEEECCCCC-CCHHHHHHHHHHEECCCCCCCHHHHHHHHHHHHHH T ss_conf 99999999996146---888538999996698707888757777-99999999997520368985489999999997214 Q ss_pred CCCCCCCCCCCCCCCCEEEEEEECCCCCCCCCHHHHHHHHHHH-CCCEEEEEEECCCCCCCCHHHHHHHHHCC-CCCEE Q ss_conf 6665444456677750699999606688885408999999998-79689999943788743117899986068-98378 Q gi|254780388|r 353 SNEDEVHRMKNNLEAKKYIVLLTDGENTQDNEEGIAICNKAKS-QGIRIMTIAFSVNKTQQEKARYFLSNCAS-PNSFF 429 (458) Q Consensus 353 ~~~~~~~~~~~~~~~~K~iil~TDG~n~~~~~~~~~~C~~~K~-~gI~IytI~f~~~~~~~~~~~~~lk~CAs-~~~yy 429 (458) ... ..++..|++|++|||..+. ......+.+|+ .||+||.||++....
..+.-|+..|| |++.| T Consensus 96 ~~g-------~R~~~~kv~vviTDG~s~d---~~~~~a~~lr~~~gv~v~avgVG~~~~---~d~~eL~~Ia~~~~~Vf 161 (163) T cd01476 96 SEG-------RREGIPKVVVVLTDGRSHD---DPEKQARILRAVPNIETFAVGTGDPGT---VDTEELHSITGNEDHIF 161 (163) T ss_pred HCC-------CCCCCEEEEEEEECCCCCC---CHHHHHHHHHHHCCCEEEEEEECCCCC---CCHHHHHHHCCCCCCCC T ss_conf 206-------7899616999981898766---488999999970998999998388650---15999998649972545 No 30 >cd01454 vWA_norD_type norD type: Denitrifying bacteria contain both membrane-bound and periplasmic nitrate reductases. Denitrification plays a major role in completing the nitrogen cycle by converting nitrate or nitrite to nitrogen gas. The pathway for microbial denitrification has been established as NO3- -> NO2- -> NO -> N2O -> N2. This reaction generally occurs under oxygen-limiting conditions. Genetic and biochemical studies have shown that the first step of the biochemical pathway is catalyzed by periplasmic nitrate reductases. This family is widely present in proteobacteria and firmicutes. This version of the domain is also present in some archaeal members. The function of the vWA domain in this sub-group is not known. Members of this subgroup have a conserved MIDAS motif. Probab=98.50 E-value=8.1e-06 Score=55.53 Aligned_cols=91 Identities=19% Similarity=0.333 Sum_probs=66.6 Q ss_pred HHHHHCCCCCCCCCCHHHHHHHHHHHCCCCCCCCCCCCCCCCCCEEEEEEECCCCCCCC---------CHHHHHHHHHHH Q ss_conf 65641256777776458899999862366665444456677750699999606688885---------408999999998 Q gi|254780388|r 325 KTFAIDENEMGSTAINDAMQTAYDTIISSNEDEVHRMKNNLEAKKYIVLLTDGENTQDN---------EEGIAICNKAKS 395 (458) Q Consensus 325 ~~~~i~~~~~~g~T~~~~gl~~g~~~Ls~~~~~~~~~~~~~~~~~K~iil~TDG~n~~~~---------~~~~~~C~~~K~ 395 (458) +..+..+.|.|+|..+.+|.|+...|.. .+..+|++|++|||+.+..+ ..+...+..++.
T Consensus 72 ~~~i~~l~~~g~Tr~G~Air~a~~~L~~-----------~~~~rkiliviSDG~P~D~~~~~~~~~~~~D~~~av~e~~~ 140 (174) T cd01454 72 RKRLAALSPGGNTRDGAAIRHAAERLLA-----------RPEKRKILLVISDGEPNDLDYYEGNVFATEDALRAVIEARK 140 (174) T ss_pred HHHHHCCCCCCCCCCHHHHHHHHHHHHH-----------CCCCCEEEEEEECCCCCCCCCCCCCHHHHHHHHHHHHHHHH T ss_conf 8888511878989617999999999863-----------97666799998389976677788755389999999999998 Q ss_pred CCCEEEEEEECCCCCCCCHHHHHHHHHCCCCCE Q ss_conf 796899999437887431178999860689837 Q gi|254780388|r 396 QGIRIMTIAFSVNKTQQEKARYFLSNCASPNSF 428 (458) Q Consensus 396 ~gI~IytI~f~~~~~~~~~~~~~lk~CAs~~~y 428 (458) +||.+|.|+++..... ...+.|+..-+..+| T Consensus 141 ~GI~~~~i~i~~~~~~--~~~~~l~~i~g~~~~ 171 (174) T cd01454 141 LGIEVFGITIDRDATT--VDKEYLKNIFGEEGY 171 (174) T ss_pred CCCEEEEEEECCCCCH--HHHHHHHHHCCCCCE T ss_conf 7988999998985556--699999984287877 No 31 >COG1240 ChlD Mg-chelatase subunit ChlD [Coenzyme metabolism] Probab=98.16 E-value=6.7e-05 Score=49.68 Aligned_cols=151 Identities=15% Similarity=0.237 Sum_probs=95.2 Q ss_pred CHHHHHHHHHHHCCCCCCCCCCCCCEEEEEEECC-CCCCCCCCCCCHHHHHHHHHHHHHHCCCCCCCCCCHHHHHHHHHH Q ss_conf 0667776652211234677566652012110067-766556557681210056666564125677777645889999986 Q gi|254780388|r 271 KHLVRDALASVIRSIKKIDNVNDTVRMGATFFND-RVISDPSFSWGVHKLIRTIVKTFAIDENEMGSTAINDAMQTAYDT 349 (458) Q Consensus 271 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~-~~~~~~~~~~~~~~~~~~~~~~~i~~~~~~g~T~~~~gl~~g~~~ 349 (458) +...+..+..++.. -+....+.....|.- .....-|++.... 
.....+..+.++|.|....||..++.+ T Consensus 97 m~aaKG~~~~lL~d-----AYq~RdkvavI~F~G~~A~lll~pT~sv~-----~~~~~L~~l~~GG~TPL~~aL~~a~ev 166 (261) T COG1240 97 MAAAKGAALSLLRD-----AYQRRDKVAVIAFRGEKAELLLPPTSSVE-----LAERALERLPTGGKTPLADALRQAYEV 166 (261) T ss_pred HHHHHHHHHHHHHH-----HHHCCCEEEEEEECCCCCEEEECCCCCHH-----HHHHHHHHCCCCCCCCHHHHHHHHHHH T ss_conf 99999999999999-----99703548999963776538847865399-----999999838999988439999999999 Q ss_pred HCCCCCCCCCCCCCCCCCCEEEEEEECCCCCCC-----CCHHHHHHHHHHHCCCEEEEEEECCCCCCCCHHHHHHHHHC- Q ss_conf 236666544445667775069999960668888-----54089999999987968999994378874311789998606- Q gi|254780388|r 350 IISSNEDEVHRMKNNLEAKKYIVLLTDGENTQD-----NEEGIAICNKAKSQGIRIMTIAFSVNKTQQEKARYFLSNCA- 423 (458) Q Consensus 350 Ls~~~~~~~~~~~~~~~~~K~iil~TDG~n~~~-----~~~~~~~C~~~K~~gI~IytI~f~~~~~~~~~~~~~lk~CA- 423 (458) +....+ .+++...++|++|||..|.. ...+..+|.++...|+.+-+|.++.+.-.. .+.+..| T Consensus 167 ~~r~~r-------~~p~~~~~~vviTDGr~n~~~~~~~~~e~~~~a~~~~~~g~~~lvid~e~~~~~~----g~~~~iA~ 235 (261) T COG1240 167 LAREKR-------RGPDRRPVMVVITDGRANVPIPLGPKAETLEAASKLRLRGIQLLVIDTEGSEVRL----GLAEEIAR 235 (261) T ss_pred HHHHHC-------CCCCCCEEEEEEECCCCCCCCCCCHHHHHHHHHHHHHHCCCCEEEEECCCCCCCC----CHHHHHHH T ss_conf 997510-------4887653899973796588889865779999999985268847999557852334----47999999 Q ss_pred -CCCCEEEECCH--HHHHHHHH Q ss_conf -89837882998--99999999 Q gi|254780388|r 424 -SPNSFFEANST--HELNKIFR 442 (458) Q Consensus 424 -s~~~yy~a~~~--~eL~~aF~ 442 (458) +.+.||+-++- +.|..+.+ T Consensus 236 ~~Gg~~~~L~~l~~~~i~~~~r 257 (261) T COG1240 236 ASGGEYYHLDDLSDDSIVSAVR 257 (261) T ss_pred HHCCEEEECCCCCCHHHHHHHH T ss_conf 7399078655564048999887 No 32 >COG4245 TerY Uncharacterized protein encoded in toxicity protection region of plasmid R478, contains von Willebrand factor (vWF) domain [General function prediction only] Probab=98.10 E-value=7e-05 Score=49.55 Aligned_cols=164 Identities=14% Similarity=0.152 Sum_probs=104.2 Q ss_pred 
CHHHHHHHHHHHCCCCCCCCCCCCCEEEEEEECCCCCCCCCCCCCHHHHHHHHHHHHHHCCCCCCCCCCHHHHHHHHHHH Q ss_conf 06677766522112346775666520121100677665565576812100566665641256777776458899999862 Q gi|254780388|r 271 KHLVRDALASVIRSIKKIDNVNDTVRMGATFFNDRVISDPSFSWGVHKLIRTIVKTFAIDENEMGSTAINDAMQTAYDTI 350 (458) Q Consensus 271 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~i~~~~~~g~T~~~~gl~~g~~~L 350 (458) +..++..+..+++.+..-.+--.....+..+|+.......|++.- .. -.-..+.+.|+|..+.++.-+.+++ T Consensus 21 IealN~Glq~m~~~Lkqdp~Ale~v~lsIVTF~~~a~~~~pf~~~-~n-------F~~p~L~a~GgT~lGaAl~~a~d~I 92 (207) T COG4245 21 IEALNAGLQMMIDTLKQDPYALERVELSIVTFGGPARVIQPFTDA-AN-------FNPPILTAQGGTPLGAALTLALDMI 92 (207) T ss_pred HHHHHHHHHHHHHHHHHCHHHHHEEEEEEEEECCCCEEEECHHHH-HH-------CCCCCEECCCCCCHHHHHHHHHHHH T ss_conf 799989999999998748465440578999826850687331557-54-------4887013699980679999999999 Q ss_pred CCCCCCCCCCCCCCCCCCEEEEEEECCCCCCCCCHHHHHHHHH--HHCCCEEEEEEECCCCCCCCHHHHHHHHHCCCCCE Q ss_conf 3666654444566777506999996066888854089999999--98796899999437887431178999860689837 Q gi|254780388|r 351 ISSNEDEVHRMKNNLEAKKYIVLLTDGENTQDNEEGIAICNKA--KSQGIRIMTIAFSVNKTQQEKARYFLSNCASPNSF 428 (458) Q Consensus 351 s~~~~~~~~~~~~~~~~~K~iil~TDG~n~~~~~~~~~~C~~~--K~~gI~IytI~f~~~~~~~~~~~~~lk~CAs~~~y 428 (458) -...+-... ..-.+++.+++|||||+.+ +..+.++...- +...-.|-..+|+.... + -..|++..+-..- T Consensus 93 e~~~~~~~a--~~kgdyrP~vfLiTDG~Pt--D~w~~~~~~~~~~~~~~k~v~a~~~G~~~a--d--~~~L~qit~~V~~ 164 (207) T COG4245 93 EERKRKYDA--NGKGDYRPWVFLITDGEPT--DDWQAGAALVFQGERRAKSVAAFSVGVQGA--D--NKTLNQITEKVRQ 164 (207) T ss_pred HHHHHHCCC--CCCCCCCEEEEEECCCCCC--HHHHHHHHHHHHCCCCCCEEEEEEECCCCC--C--CHHHHHHHHHHCC T ss_conf 877765056--7755544179995389966--577767777640331005289999535434--4--1899998876525 Q ss_pred EEECCHHHHHHHHHHHHHHHHHH Q ss_conf 88299899999999998758753 Q gi|254780388|r 429 FEANSTHELNKIFRDRIGNEIFE 451 (458) Q Consensus 429 y~a~~~~eL~~aF~~~i~~~~~~ 451 (458) |...++..+.+-|+ .+...++. 
T Consensus 165 ~~t~d~~~f~~fFk-W~SaSisa 186 (207) T COG4245 165 FLTLDGLQFREFFK-WLSASISA 186 (207) T ss_pred CCCCCHHHHHHHHH-HHHHHHHC T ss_conf 23453488999999-99877513 No 33 >KOG2353 consensus Probab=97.78 E-value=0.0004 Score=44.71 Aligned_cols=165 Identities=13% Similarity=0.177 Sum_probs=101.1 Q ss_pred CCCHHHHHHHHHHHCCCCCCCCCCCCCEEEEEEECCCCCCCCCCCCC----HHHHHHHHHHHHHHCCCCCCCCCCHHHHH Q ss_conf 33066777665221123467756665201211006776655655768----12100566665641256777776458899 Q gi|254780388|r 269 KKKHLVRDALASVIRSIKKIDNVNDTVRMGATFFNDRVISDPSFSWG----VHKLIRTIVKTFAIDENEMGSTAINDAMQ 344 (458) Q Consensus 269 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~----~~~~~~~~~~~~i~~~~~~g~T~~~~gl~ 344 (458) .+++..+..+...++.+...+ ......|+.......|...+ ..-..+..++..++.+.+.|.++...|+. T Consensus 241 ~~~~lak~tv~~iLdtLs~~D------fvni~tf~~~~~~v~pc~~~~lvqAt~~nk~~~~~~i~~l~~k~~a~~~~~~e 314 (1104) T KOG2353 241 LRLDLAKQTVNEILDTLSDND------FVNILTFNSEVNPVSPCFNGTLVQATMRNKKVFKEAIETLDAKGIANYTAALE 314 (1104) T ss_pred HHHHHHHHHHHHHHHHCCCCC------EEEEEEECCCCCCCCCCCCCCEEECCHHHHHHHHHHHHHHCCCCCCCHHHHHH T ss_conf 316999999999997615477------68787621356756520258522045677999999986414125412435577 Q ss_pred HHHHHHCCCCCCCCCCCCCCCCCCEEEEEEECCCCCCCCCHHHHHHHHHHHCCCEEEEEEECCCCCCCCHHHHHHHHHCC Q ss_conf 99986236666544445667775069999960668888540899999999879689999943788743117899986068 Q gi|254780388|r 345 TAYDTIISSNEDEVHRMKNNLEAKKYIVLLTDGENTQDNEEGIAICNKAKSQGIRIMTIAFSVNKTQQEKARYFLSNCAS 424 (458) Q Consensus 345 ~g~~~Ls~~~~~~~~~~~~~~~~~K~iil~TDG~n~~~~~~~~~~C~~~K~~gI~IytI~f~~~~~~~~~~~~~lk~CAs 424 (458) .+..+|.....+... ..+..-.+.+++.|||..+.........-. -+..|.|||.=.+.....-.. 
.-+..|++ T Consensus 315 ~aF~lL~~~n~s~~~--~~~~~C~~~iml~tdG~~~~~~~If~~yn~--~~~~Vrvftflig~~~~~~~~--~~wmac~n 388 (1104) T KOG2353 315 YAFSLLRDYNDSRAN--TQRSPCNQAIMLITDGVDENAKEIFEKYNW--PDKKVRVFTFLIGDEVYDLDE--IQWMACAN 388 (1104) T ss_pred HHHHHHHHHCCCCCC--CCCCCCCEEEEEEECCCCCCHHHHHHHHCC--CCCCEEEEEEEECCCCCCCCC--CHHHHHHC T ss_conf 899999874445544--322500104577624775108999986036--777359999992442134541--21225407 Q ss_pred CCCEEEECCHHHHHHHHHHHH Q ss_conf 983788299899999999998 Q gi|254780388|r 425 PNSFFEANSTHELNKIFRDRI 445 (458) Q Consensus 425 ~~~yy~a~~~~eL~~aF~~~i 445 (458) .|.|++..+-.++.+--++.+ T Consensus 389 ~gyy~~I~~~~~v~~~~~~y~ 409 (1104) T KOG2353 389 KGYYVHIISIADVRENVLEYL 409 (1104) T ss_pred CCCEEECCCHHHCCHHHHHHH T ss_conf 885586466564586765566 No 34 >COG4655 Predicted membrane protein [Function unknown] Probab=97.74 E-value=3.6e-05 Score=51.41 Aligned_cols=55 Identities=24% Similarity=0.193 Sum_probs=50.1 Q ss_pred HHHCCCCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHH Q ss_conf 7401387279999999999999999999999999999999999999999988764 Q gi|254780388|r 14 LIKSCTGHFFIITALLMPVMLGVGGMLVDVVRWSYYEHALKQAAQTAIITASVPL 68 (458) Q Consensus 14 f~~d~~G~vaiifal~l~~ll~~~g~aVD~~r~~~~ks~Lq~A~DaA~LA~a~~~ 68 (458) |-|.+|+-+.++.++.+|..++.++++||+++.+.+|.+||..+|-|++++|... 
T Consensus 4 ~~r~~rs~~gvltal~~~lal~~l~l~VD~G~l~leqR~LQ~~ADlAAiaAAs~~ 58 (565) T COG4655 4 WPRRQRSMVGVLTALFVPLALATLLLGVDYGYLYLEQRELQRVADLAAIAAASNL 58 (565) T ss_pred HHHHHHHHHHHHHHHHHHHHHHHHHHEECCCEEEEEHHHHHHHHHHHHHHHHHHC T ss_conf 2376766778999999999999886502201244117878887769988877627 No 35 >PRK13406 bchD magnesium chelatase subunit D; Provisional Probab=97.22 E-value=0.0093 Score=35.98 Aligned_cols=163 Identities=15% Similarity=0.143 Sum_probs=98.8 Q ss_pred CCCCCCCCCCCCCCCHHHHHHHHHHHCCCCCCCCCCCCCEEEEEEECC-CCCCCCCCCCCHHHHHHHHHHHHHHCCCCCC Q ss_conf 101345677754330667776652211234677566652012110067-7665565576812100566665641256777 Q gi|254780388|r 257 HFVDSSSLRHVIKKKHLVRDALASVIRSIKKIDNVNDTVRMGATFFND-RVISDPSFSWGVHKLIRTIVKTFAIDENEMG 335 (458) Q Consensus 257 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~-~~~~~~~~~~~~~~~~~~~~~~~i~~~~~~g 335 (458) +..+...+ ....++...|.++..++.+ -+...-....+.|-. .....-|++...... +..+..|..+| T Consensus 406 FvVDASGS-~A~~Rm~~aKGAV~~LL~d-----AY~~RD~ValIaFRG~~AevlLPPTrSv~~A-----~r~L~~LP~GG 474 (584) T PRK13406 406 FVVDASGS-AALHRLAEAKGAVELLLAE-----CYVRRDHVALVAFRGRGAELLLPPTRSLVRA-----KRSLAGLPGGG 474 (584) T ss_pred EEEECCCC-HHHHHHHHHHHHHHHHHHH-----HHHHHCEEEEEEECCCCCEEEECCCCCHHHH-----HHHHHCCCCCC T ss_conf 99828862-7999999999999999999-----9960044789987687630741886559999-----99996299999 Q ss_pred CCCCHHHHHHHHHHHCCCCCCCCCCCCCCCCCCEEEEEEECCCCCCCC----------CHHHHHHHHHHHCCCEEEEEEE Q ss_conf 776458899999862366665444456677750699999606688885----------4089999999987968999994 Q gi|254780388|r 336 STAINDAMQTAYDTIISSNEDEVHRMKNNLEAKKYIVLLTDGENTQDN----------EEGIAICNKAKSQGIRIMTIAF 405 (458) Q Consensus 336 ~T~~~~gl~~g~~~Ls~~~~~~~~~~~~~~~~~K~iil~TDG~n~~~~----------~~~~~~C~~~K~~gI~IytI~f 405 (458) +|....||..+++++.... .....-.+||+|||-.|..- .....++..++..||...+|-. 
T Consensus 475 ~TPLA~GL~~A~~l~~~~r---------~~~~~p~~VllTDGRaNv~ldg~~~r~~a~~da~~~A~~l~~~g~~~vVIDT 545 (584) T PRK13406 475 GTPLAAGLDAALALALSVR---------RKGQTPTVVLLTDGRANIARDGAGGRAQAEEDALAAARALRAAGLPALVIDT 545 (584) T ss_pred CCHHHHHHHHHHHHHHHHH---------CCCCCEEEEEECCCCCCCCCCCCCCCHHHHHHHHHHHHHHHHCCCCEEEEEC T ss_conf 8859999999999999975---------5799548999827987778777887114899999999999976997899948 Q ss_pred CCCCCCCCHHHHHHHHHCC--CCCEEEE--CCHHHHHHHHHHHH Q ss_conf 3788743117899986068--9837882--99899999999998 Q gi|254780388|r 406 SVNKTQQEKARYFLSNCAS--PNSFFEA--NSTHELNKIFRDRI 445 (458) Q Consensus 406 ~~~~~~~~~~~~~lk~CAs--~~~yy~a--~~~~eL~~aF~~~i 445 (458) +.... ...+.-|. .+.||.- .+++.|..+.+.+. T Consensus 546 ~~~~~------~~a~~LA~~l~a~Y~~Lp~~~A~~l~~~V~~a~ 583 (584) T PRK13406 546 SPRPQ------PQARALAEAMGARYLPLPRADATRLSQAVRAAT 583 (584) T ss_pred CCCCC------HHHHHHHHHCCCCEEECCCCCHHHHHHHHHHHC T ss_conf 98886------269999998399189789789899999999851 No 36 >COG4548 NorD Nitric oxide reductase activation protein [Inorganic ion transport and metabolism] Probab=97.05 E-value=0.011 Score=35.60 Aligned_cols=107 Identities=15% Similarity=0.188 Sum_probs=78.4 Q ss_pred HHHHHHHCCCCCCCCCCHHHHHHHHHHHCCCCCCCCCCCCCCCCCCEEEEEEECCCCCCCC--------CHHHHHHHHHH Q ss_conf 6665641256777776458899999862366665444456677750699999606688885--------40899999999 Q gi|254780388|r 323 IVKTFAIDENEMGSTAINDAMQTAYDTIISSNEDEVHRMKNNLEAKKYIVLLTDGENTQDN--------EEGIAICNKAK 394 (458) Q Consensus 323 ~~~~~i~~~~~~g~T~~~~gl~~g~~~Ls~~~~~~~~~~~~~~~~~K~iil~TDG~n~~~~--------~~~~~~C~~~K 394 (458) .....|..+.|+-.|-.+.+|-.+-..|.- .+..+|++|++|||+.|.-+ ..+......++ T Consensus 519 ~~~~RImALePg~ytR~G~AIR~As~kL~~-----------rpq~qklLivlSDGkPnd~d~YEgr~gIeDTr~AV~eaR 587 (637) T COG4548 519 TVGPRIMALEPGYYTRDGAAIRHASAKLME-----------RPQRQKLLIVLSDGKPNDFDHYEGRFGIEDTREAVIEAR 587 (637) T ss_pred CCCHHHEECCCCCCCCCCHHHHHHHHHHHC-----------CCCCCEEEEEECCCCCCCCCCCCCCCCHHHHHHHHHHHH T ss_conf 
533122133766444310999999999834-----------741124899944898543443233211153799999998 Q ss_pred HCCCEEEEEEECCCCCCCCHHHHHHHHHCCCCCEEEECCHHHHHHHHHHHHH Q ss_conf 8796899999437887431178999860689837882998999999999987 Q gi|254780388|r 395 SQGIRIMTIAFSVNKTQQEKARYFLSNCASPNSFFEANSTHELNKIFRDRIG 446 (458) Q Consensus 395 ~~gI~IytI~f~~~~~~~~~~~~~lk~CAs~~~yy~a~~~~eL~~aF~~~i~ 446 (458) ..||+||-|.+.- ..++.+..-.+-+.|-.+.+...|..+.-.+.. T Consensus 588 k~Gi~VF~Vtld~------ea~~y~p~~fgqngYa~V~~v~~LP~~L~~lyr 633 (637) T COG4548 588 KSGIEVFNVTLDR------EAISYLPALFGQNGYAFVERVAQLPGALPPLYR 633 (637) T ss_pred HCCCEEEEEEECC------HHHHHHHHHHCCCCEEECCCHHHCCHHHHHHHH T ss_conf 6583479998333------055552888526746970240016055799999 No 37 >TIGR02031 BchD-ChlD magnesium chelatase ATPase subunit D; InterPro: IPR011776 This entry represents one of two ATPase subunits of the trimeric magnesium chelatase responsible for insertion of magnesium ion into protoporphyrin IX. This is an essential step in the biosynthesis of both chlorophyll and bacteriochlorophyll. This subunit is found in green plants, photosynthetic algae, cyanobacteria and other photosynthetic bacteria. Unlike subunit I (IPR011775 from INTERPRO), this subunit is not found in archaea.; GO: 0005524 ATP binding, 0016851 magnesium chelatase activity, 0015995 chlorophyll biosynthetic process. Probab=96.90 E-value=0.011 Score=35.46 Aligned_cols=158 Identities=18% Similarity=0.162 Sum_probs=100.1 Q ss_pred CCCCCCCCCCCCCHHHHHHHHHHHCCCCCCCCCCCCCEEEEEEEC-CCCCCCCCCCCCHHHHHHHHHHHHHHCCCCCCCC Q ss_conf 134567775433066777665221123467756665201211006-7766556557681210056666564125677777 Q gi|254780388|r 259 VDSSSLRHVIKKKHLVRDALASVIRSIKKIDNVNDTVRMGATFFN-DRVISDPSFSWGVHKLIRTIVKTFAIDENEMGST 337 (458) Q Consensus 259 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~-~~~~~~~~~~~~~~~~~~~~~~~~i~~~~~~g~T 337 (458) ++.+.+.....|+...|.++..++.. .+-..+.+...++.|- .......|++-..... 
+..++.|..+||| T Consensus 517 VDASGSsaa~~Rm~~AKGAV~~LL~~---AYv~RD~vkVaLi~FRG~~Ae~LLPPsrSv~~a-----Kr~L~~LP~GGGt 588 (705) T TIGR02031 517 VDASGSSAAVARMSEAKGAVELLLGE---AYVHRDQVKVALIAFRGTAAEVLLPPSRSVELA-----KRRLDVLPGGGGT 588 (705) T ss_pred EECCHHHHHHHHHHHHHHHHHHHHHH---HHHHCCCEEEEEEECCCCHHHHCCCCHHHHHHH-----HHHHCCCCCCCCC T ss_conf 60635789999998778999999876---544136035776304443000037852358999-----9997158999856 Q ss_pred CCHHHHHHHHHHHCCCCCCCCCCCCCCCCCCEEEEEEECCCCCC------------C-C----------CHHHHHHHHHH Q ss_conf 64588999998623666654444566777506999996066888------------8-5----------40899999999 Q gi|254780388|r 338 AINDAMQTAYDTIISSNEDEVHRMKNNLEAKKYIVLLTDGENTQ------------D-N----------EEGIAICNKAK 394 (458) Q Consensus 338 ~~~~gl~~g~~~Ls~~~~~~~~~~~~~~~~~K~iil~TDG~n~~------------~-~----------~~~~~~C~~~K 394 (458) ....||..||.+=-..- .+++-.+-.|||+|||=.|- . + .....+..+++ T Consensus 589 PLA~gL~~A~~~a~qar-------~~GD~~~~~ivliTDGRgNvpL~~~~DP~~~~~~r~PrPts~~l~~e~~~lA~~i~ 661 (705) T TIGR02031 589 PLAAGLAAAVEVAKQAR-------SRGDVGRITIVLITDGRGNVPLDASVDPKAAKADRLPRPTSEELKEEVLALARKIR 661 (705) T ss_pred HHHHHHHHHHHHHHHHH-------CCCCCCEEEEEEECCCCCCCCCCCCCCCCCCCCCCCCCCHHHHHHHHHHHHHHHHH T ss_conf 78999999999998510-------26885245567760778774676567861002356787268999999999999988 Q ss_pred HCCCEEEEEEECCCCCCCCHHHHHHHHHCC--CCCEEEECCHH Q ss_conf 879689999943788743117899986068--98378829989 Q gi|254780388|r 395 SQGIRIMTIAFSVNKTQQEKARYFLSNCAS--PNSFFEANSTH 435 (458) Q Consensus 395 ~~gI~IytI~f~~~~~~~~~~~~~lk~CAs--~~~yy~a~~~~ 435 (458) +.||-.-.|--.-.-. +++ .+++.|. .++||+-+++. T Consensus 662 ~~G~~~lVIDT~~~f~--s~G--~a~~lA~~~~a~Y~yLP~a~ 700 (705) T TIGR02031 662 EAGISALVIDTANKFV--STG--FAKKLARKLGARYIYLPNAT 700 (705) T ss_pred HCCCCEEEEECCCCCC--CCC--HHHHHHHHHCCCEEECCCCC T ss_conf 7188658982677866--764--48999998589067136888 No 38 >pfam07811 TadE TadE-like protein. The members of this family are similar to a region of the protein product of the bacterial tadE locus. 
In various bacterial species, the tad locus is closely linked to flp-like genes, which encode proteins required for the production of pili involved in adherence to surfaces. It is thought that the tad loci encode proteins that act to assemble or export an Flp pilus in various bacteria. All tad loci but TadA have putative transmembrane regions, and in fact the region in question in this family has a high proportion of hydrophobic amino acid residues. Probab=96.44 E-value=0.011 Score=35.53 Aligned_cols=42 Identities=31% Similarity=0.338 Sum_probs=38.7 Q ss_pred CCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHH Q ss_conf 727999999999999999999999999999999999999999 Q gi|254780388|r 20 GHFFIITALLMPVMLGVGGMLVDVVRWSYYEHALKQAAQTAI 61 (458) Q Consensus 20 G~vaiifal~l~~ll~~~g~aVD~~r~~~~ks~Lq~A~DaA~ 61 (458) |..+|=|++.+|+++.+....+|++++...+..++.|+..|+ T Consensus 1 G~a~VEfalv~p~~l~l~~~~~~~~~~~~~~~~~~~Aa~~aa 42 (43) T pfam07811 1 GAAAVEFALVLPVLLLLLFGIVELGRLFYARQVLQNAAREAA 42 (43) T ss_pred CHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHC T ss_conf 916999999999999999999999999999999999998674 No 39 >cd01453 vWA_transcription_factor_IIH_type Transcription factors IIH type: TFIIH is a multiprotein complex that is one of the five general transcription factors that binds RNA polymerase II holoenzyme. Orthologues of these genes are found in all completed eukaryotic genomes and all these proteins contain a VWA domain. The p44 subunit of TFIIH functions as a DNA helicase in RNA polymerase II transcription initiation and DNA repair, and its transcriptional activity is dependent on its C-terminal Zn-binding domains. The function of the vWA domain is unclear, but may be involved in complex assembly. The MIDAS motif is not conserved in this sub-group.
Probab=96.18 E-value=0.091 Score=29.65 Aligned_cols=153 Identities=12% Similarity=0.168 Sum_probs=90.9 Q ss_pred CCCCCHHHHHHHHHHHCCCCCCCCCCCCCEEEEEEE-CCCCCCCCCCCCCHHHHHHHHHHHHHHCCCCCCCCCCHHHHHH Q ss_conf 543306677766522112346775666520121100-6776655655768121005666656412567777764588999 Q gi|254780388|r 267 VIKKKHLVRDALASVIRSIKKIDNVNDTVRMGATFF-NDRVISDPSFSWGVHKLIRTIVKTFAIDENEMGSTAINDAMQT 345 (458) Q Consensus 267 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~-~~~~~~~~~~~~~~~~~~~~~~~~~i~~~~~~g~T~~~~gl~~ 345 (458) ..+|+.+.......++.......|...- +.... +......+.++-+....-.. .-....+.|......||.- T Consensus 22 ~PtRl~~~~~~l~~Fi~effdqNPisql---Gii~~rn~~a~~ls~lsgn~~~hi~~----l~~~~~~~G~~SLqN~Le~ 94 (183) T cd01453 22 KPSRLAVVLKLLELFIEEFFDQNPISQL---GIISIKNGRAEKLTDLTGNPRKHIQA----LKTARECSGEPSLQNGLEM 94 (183) T ss_pred CCCHHHHHHHHHHHHHHHHHCCCCCCEE---EEEEEECCEEEEEEECCCCHHHHHHH----HHHCCCCCCCHHHHHHHHH T ss_conf 9549999999999999998707974048---99999468169976468998999999----9854589998139999999 Q ss_pred HHHHHCCCCCCCCCCCCCCCCCCEEEEEEECCCCCCCCCHHHHHHHHHHHCCCEEEEEEECCCCCCCCHHHHHHHHHC-- Q ss_conf 998623666654444566777506999996066888854089999999987968999994378874311789998606-- Q gi|254780388|r 346 AYDTIISSNEDEVHRMKNNLEAKKYIVLLTDGENTQDNEEGIAICNKAKSQGIRIMTIAFSVNKTQQEKARYFLSNCA-- 423 (458) Q Consensus 346 g~~~Ls~~~~~~~~~~~~~~~~~K~iil~TDG~n~~~~~~~~~~C~~~K~~gI~IytI~f~~~~~~~~~~~~~lk~CA-- 423 (458) +...|...- ... .|-++++.-.--+.+...--..-+.+|+.+|+|..|+|.. 
.-.++|.++ T Consensus 95 A~~~L~~~P---------~~~-sREILiI~~Sl~t~DpgdI~~ti~~lk~~~IrvsvI~l~a-------Ev~I~k~l~~~ 157 (183) T cd01453 95 ALESLKHMP---------SHG-SREVLIIFSSLSTCDPGNIYETIDKLKKENIRVSVIGLSA-------EMHICKEICKA 157 (183) T ss_pred HHHHHHHCC---------CCC-CEEEEEEECCCCCCCCCCHHHHHHHHHHCCCEEEEEEECH-------HHHHHHHHHHH T ss_conf 999982089---------878-4489999756534797649999999998397899997427-------89999999998 Q ss_pred CCCCEEEECCHHHHHHHHHH Q ss_conf 89837882998999999999 Q gi|254780388|r 424 SPNSFFEANSTHELNKIFRD 443 (458) Q Consensus 424 s~~~yy~a~~~~eL~~aF~~ 443 (458) +.|.|+-+-+..-+.+-+.+ T Consensus 158 TgG~y~V~lde~H~~~ll~~ 177 (183) T cd01453 158 TNGTYKVILDETHLKELLLE 177 (183) T ss_pred HCCEEEEECCHHHHHHHHHH T ss_conf 39976875399999999995 No 40 >TIGR02442 Cob-chelat-sub cobaltochelatase subunit; InterPro: IPR012804 Cobalamin (vitamin B12) can be complexed with metal via ATP-dependent reactions (aerobic pathway) (e.g., in Pseudomonas denitrificans) or via ATP-independent reactions (anaerobic pathway) (e.g., in Salmonella typhimurium) , . The corresponding cobalt chelatases are not homologous. Cobaltochelatase is responsible for the insertion of cobalt into the corrin ring of coenzyme B12 during its biosynthesis. Two versions have been well described. CbiK/CbiX is a monomeric, anaerobic version which acts early in the biosynthesis (IPR010388 from INTERPRO). CobNST is a trimeric, ATP-dependent, aerobic version which acts late in the biosynthesis, (IPR011953 from INTERPRO, IPR006537 from INTERPRO, IPR006538 from INTERPRO) . The two pathways differ in the point of cobalt insertion during corrin ring formation . There are apparently a number of variations on these two pathways, where the major differences seem to be concerned with the process of ring contraction . Cobaltochelatase shows similarities with magnesium chelatase, which is also a complex ATP-dependent enzyme made up of two separable components. 
However, unlike the situation in cobaltochelatase, one of these two components is membrane bound in magnesium chelatase . . Probab=96.17 E-value=0.017 Score=34.37 Aligned_cols=153 Identities=15% Similarity=0.166 Sum_probs=98.3 Q ss_pred CCCCCCCCCCCCCCCHHHHHHHHHHHCCCCCCCCCCCCCEEEEEEEC-CCCCCCCCCCCCHHHHHHHHHHHHHHCCCCCC Q ss_conf 10134567775433066777665221123467756665201211006-77665565576812100566665641256777 Q gi|254780388|r 257 HFVDSSSLRHVIKKKHLVRDALASVIRSIKKIDNVNDTVRMGATFFN-DRVISDPSFSWGVHKLIRTIVKTFAIDENEMG 335 (458) Q Consensus 257 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~-~~~~~~~~~~~~~~~~~~~~~~~~i~~~~~~g 335 (458) +-.+.+.+...-.++...|.++..++.. -+...-+.+.++|- ......-|+|.......+ .+..|..+| T Consensus 513 FvVDASGSM~ar~RM~~~KGavLsLL~D-----AYq~RDkValI~FrG~~AevlLPPT~sv~~A~r-----~L~~lPtGG 582 (688) T TIGR02442 513 FVVDASGSMAARGRMAAAKGAVLSLLRD-----AYQKRDKVALITFRGEEAEVLLPPTSSVELAAR-----RLEELPTGG 582 (688) T ss_pred EEEECCHHHHHHHHHHHHHHHHHHHHHH-----HHHHCCEEEEEECCCCEEEEECCCCCHHHHHHH-----HHHHCCCCC T ss_conf 3533532044235789989999998888-----886277688862367343576587884899999-----997288989 Q ss_pred CCCCHHHHHHHHHHHCCCCCCCCCCCCCCCCCCEEEEEEECCCCCC------CCC-H---HHHHHHHHHHC-------CC Q ss_conf 7764588999998623666654444566777506999996066888------854-0---89999999987-------96 Q gi|254780388|r 336 STAINDAMQTAYDTIISSNEDEVHRMKNNLEAKKYIVLLTDGENTQ------DNE-E---GIAICNKAKSQ-------GI 398 (458) Q Consensus 336 ~T~~~~gl~~g~~~Ls~~~~~~~~~~~~~~~~~K~iil~TDG~n~~------~~~-~---~~~~C~~~K~~-------gI 398 (458) .|....||..|+.++...-- ......-++|++|||-.|. +.. . +..+..++++. 
|| T Consensus 583 rTPLa~gL~~A~~v~~~~~~-------~~~~~~pl~V~iTDGRaNv~L~~~~g~~qp~~~~~~~a~~L~~~~~R~R~Lg~ 655 (688) T TIGR02442 583 RTPLAAGLLKAAEVLSNELL-------RDDDRRPLVVVITDGRANVALDVSLGEPQPLDDARTIASKLAARASRIRSLGI 655 (688) T ss_pred CCHHHHHHHHHHHHHHHHHH-------CCCCCCEEEEEEECCCCCCCCCCCCCCCCHHHHHHHHHHHHHHHHCCEEECCC T ss_conf 87458999999999999861-------16899428998707863542666788841577899999998875043011162 Q ss_pred EEEEEEECC-CCCCCCHHHHHHHHHCC--CCCEEE Q ss_conf 899999437-88743117899986068--983788 Q gi|254780388|r 399 RIMTIAFSV-NKTQQEKARYFLSNCAS--PNSFFE 430 (458) Q Consensus 399 ~IytI~f~~-~~~~~~~~~~~lk~CAs--~~~yy~ 430 (458) ..-+|--+. ..-.- -+.++.|+ .+.||. T Consensus 656 ~~vV~DTE~~~~v~l----GlA~~~A~~lgg~~~~ 686 (688) T TIGR02442 656 KFVVIDTENPGFVRL----GLAEDLASALGGEYLR 686 (688) T ss_pred EEEEEECCCCCCCCC----CHHHHHHHHHCCCEEC T ss_conf 278997268875422----2389999982983224 No 41 >COG2425 Uncharacterized protein containing a von Willebrand factor type A (vWA) domain [General function prediction only] Probab=95.53 E-value=0.17 Score=27.92 Aligned_cols=97 Identities=18% Similarity=0.210 Sum_probs=59.3 Q ss_pred HHHHHCCCCCCCCCCHHHHHHHHHHHCCCCCCCCCCCCCCCCCCEEEEEEECCCCCCCCCHHHHHHHHHHHCCCEEEEEE Q ss_conf 65641256777776458899999862366665444456677750699999606688885408999999998796899999 Q gi|254780388|r 325 KTFAIDENEMGSTAINDAMQTAYDTIISSNEDEVHRMKNNLEAKKYIVLLTDGENTQDNEEGIAICNKAKSQGIRIMTIA 404 (458) Q Consensus 325 ~~~i~~~~~~g~T~~~~gl~~g~~~Ls~~~~~~~~~~~~~~~~~K~iil~TDG~n~~~~~~~~~~C~~~K~~gI~IytI~ 404 (458) ...+....+ |||++...+..+..-+.....+. 
--+|++|||+.-..+.....+-+-.|..+.++|+|- T Consensus 336 i~fL~~~f~-GGTD~~~~l~~al~~~k~~~~~~-----------adiv~ITDg~~~~~~~~~~~v~e~~k~~~~rl~aV~ 403 (437) T COG2425 336 IEFLSYVFG-GGTDITKALRSALEDLKSRELFK-----------ADIVVITDGEDERLDDFLRKVKELKKRRNARLHAVL 403 (437) T ss_pred HHHHHHHCC-CCCCHHHHHHHHHHHHHCCCCCC-----------CCEEEEECCHHHHHHHHHHHHHHHHHHHHCEEEEEE T ss_conf 999965068-98885899999999864366567-----------778998043766546789999999887543489999 Q ss_pred ECCCCCCCCHHHHHHHHHCCCCC-EEEECCHHHHHHHHH Q ss_conf 43788743117899986068983-788299899999999 Q gi|254780388|r 405 FSVNKTQQEKARYFLSNCASPNS-FFEANSTHELNKIFR 442 (458) Q Consensus 405 f~~~~~~~~~~~~~lk~CAs~~~-yy~a~~~~eL~~aF~ 442 (458) .+..+.. . |+.. .++ .|... +.+...+++ T Consensus 404 I~~~~~~-----~-l~~I--sd~~i~~~~-~~~~~kv~~ 433 (437) T COG2425 404 IGGYGKP-----G-LMRI--SDHIIYRVE-PRDRVKVVK 433 (437) T ss_pred ECCCCCC-----C-CCEE--CEEEEEEEC-CHHHHHHHH T ss_conf 6478986-----6-0001--114678727-477767773 No 42 >pfam11775 CobT_C Cobalamin biosynthesis protein CobT VWA domain. This family consists of several bacterial cobalamin biosynthesis (CobT) proteins. CobT is involved in the transformation of precorrin-3 into cobyrinic acid. Probab=94.92 E-value=0.26 Score=26.76 Aligned_cols=85 Identities=16% Similarity=0.261 Sum_probs=53.7 Q ss_pred CCCHHHHHHHHHHHCCCCCCCCCCCCCCCCCCEEEEEEECCCC--------CCCCCHH---HHHHHHHH-HCCCEEEEEE Q ss_conf 7645889999986236666544445667775069999960668--------8885408---99999999-8796899999 Q gi|254780388|r 337 TAINDAMQTAYDTIISSNEDEVHRMKNNLEAKKYIVLLTDGEN--------TQDNEEG---IAICNKAK-SQGIRIMTIA 404 (458) Q Consensus 337 T~~~~gl~~g~~~Ls~~~~~~~~~~~~~~~~~K~iil~TDG~n--------~~~~~~~---~~~C~~~K-~~gI~IytI~ 404 (458) .--+++|.|+.+-|.. .+..+|+++++|||.. |..+-.. ...-+.+. 
..+|++.-|| T Consensus 116 NiDGEAL~wA~~RL~~-----------R~e~RkILmViSDGaP~ddst~s~n~~~yL~~hLr~vi~~ie~~~~iel~aIG 184 (220) T pfam11775 116 NIDGEALAQAAKLFAG-----------RMEDKKILLMISDGAPCDDSTLSVAAGDGFEQHLRHIIEEIETLSEIDLIAIG 184 (220) T ss_pred CCCCHHHHHHHHHHHC-----------CCCCCEEEEEECCCCCCCCCCCCCCCCCCHHHHHHHHHHHHHCCCCCEEEEEE T ss_conf 8971999999999863-----------93124699997589967764112587776799999999998506882699987 Q ss_pred ECCCCCCCCHHHHHHHHHCCCCCEEEECCHHHHHHHHHH Q ss_conf 437887431178999860689837882998999999999 Q gi|254780388|r 405 FSVNKTQQEKARYFLSNCASPNSFFEANSTHELNKIFRD 443 (458) Q Consensus 405 f~~~~~~~~~~~~~lk~CAs~~~yy~a~~~~eL~~aF~~ 443 (458) .+.+. .+...+ ++--..+.+||-.+.-+ T Consensus 185 IghDv-----~r~yY~------~av~i~d~eeL~~~~~~ 212 (220) T pfam11775 185 IGHDA-----PRRYYK------NAALINDAEELGGAITE 212 (220) T ss_pred ECCCC-----CHHHHH------CCEEECCHHHHHHHHHH T ss_conf 47776-----866650------65686038886599999 No 43 >cd01457 vWA_ORF176_type VWA ORF176 type: Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses. In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. 
Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most Probab=94.67 E-value=0.3 Score=26.35 Aligned_cols=81 Identities=12% Similarity=0.226 Sum_probs=44.4 Q ss_pred HHHHHHHHHCCCCCCCCCCHHHHHHHHHHHCCCCCCCCCCCCCCCCCC-EEEEEEECCCCCCCCCHHH---HHHHHHHH- Q ss_conf 566665641256777776458899999862366665444456677750-6999996066888854089---99999998- Q gi|254780388|r 321 RTIVKTFAIDENEMGSTAINDAMQTAYDTIISSNEDEVHRMKNNLEAK-KYIVLLTDGENTQDNEEGI---AICNKAKS- 395 (458) Q Consensus 321 ~~~~~~~i~~~~~~g~T~~~~gl~~g~~~Ls~~~~~~~~~~~~~~~~~-K~iil~TDG~n~~~~~~~~---~~C~~~K~- 395 (458) +..+........|.|.|+....|.-..+-....... ....|. -.+|++|||+.+....... ..++++.. T Consensus 66 ~~~V~~iF~~~~P~G~T~~g~~L~~il~~y~~r~~~------~~~kp~g~~iIVITDG~p~D~~av~~~Ii~aa~kLd~~ 139 (199) T cd01457 66 SSKVDQLFAENSPDGGTNLAAVLQDALNNYFQRKEN------GATCPEGETFLVITDGAPDDKDAVERVIIKASDELDAD 139 (199) T ss_pred HHHHHHHHHCCCCCCCCCHHHHHHHHHHHHHHHHHC------CCCCCCCEEEEEEECCCCCCCHHHHHHHHHHHHHHCCC T ss_conf 999999985589899796379999998999873200------68999860799982799798288999999999863440 Q ss_pred CCCEEEEEEECC Q ss_conf 796899999437 Q gi|254780388|r 396 QGIRIMTIAFSV 407 (458) Q Consensus 396 ~gI~IytI~f~~ 407 (458) +-+-|-++.+|. 
T Consensus 140 ~qlgIqF~QVG~ 151 (199) T cd01457 140 NELAISFLQIGR 151 (199) T ss_pred CCCCEEEEEECC T ss_conf 100367778559 No 44 >COG3847 Flp Flp pilus assembly protein, pilin Flp [Intracellular trafficking and secretion] Probab=93.35 E-value=0.38 Score=25.69 Aligned_cols=27 Identities=11% Similarity=0.170 Sum_probs=20.9 Q ss_pred HHHHHHHHHCCCCCHHHHHHHHHHHHH Q ss_conf 999998740138727999999999999 Q gi|254780388|r 8 IFYSKKLIKSCTGHFFIITALLMPVML 34 (458) Q Consensus 8 ~~~~~rf~~d~~G~vaiifal~l~~ll 34 (458) +..++||+|||+|.-+|=.++.+..+- T Consensus 2 ~~~~~rF~rDE~GAtaiEYglia~lIa 28 (58) T COG3847 2 KKLLRRFLRDEDGATAIEYGLIAALIA 28 (58) T ss_pred CHHHHHHHHCCCCHHHHHHHHHHHHHH T ss_conf 178999977456518999999999999 No 45 >cd01455 vWA_F11C1-5a_type Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. 
Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A Probab=90.92 E-value=1 Score=22.88 Aligned_cols=102 Identities=12% Similarity=0.110 Sum_probs=68.5 Q ss_pred CCCCCCCHHHHHHHHHHHCCCCCCCCCCCCCCCCCCEEEEEEECCCCCCCCCHHHHHHHHHHH-CCCEEEEEEECCCCCC Q ss_conf 777776458899999862366665444456677750699999606688885408999999998-7968999994378874 Q gi|254780388|r 333 EMGSTAINDAMQTAYDTIISSNEDEVHRMKNNLEAKKYIVLLTDGENTQDNEEGIAICNKAKS-QGIRIMTIAFSVNKTQ 411 (458) Q Consensus 333 ~~g~T~~~~gl~~g~~~Ls~~~~~~~~~~~~~~~~~K~iil~TDG~n~~~~~~~~~~C~~~K~-~gI~IytI~f~~~~~~ 411 (458) -..|-+.=+++.++...|-+.. +.-..++|++||-.-.-+.-....+...++. ..+.-|.|-.+.-+++ T Consensus 87 C~sGD~Tlea~~~Ai~~l~a~~----------d~De~fVivlSDANL~RYgI~p~~l~~~l~~~p~V~a~~IfIgslg~e 156 (191) T cd01455 87 CWSGDHTVEATEFAIKELAAKE----------DFDEAIVIVLSDANLERYGIQPKKLADALAREPNVNAFVIFIGSLSDE 156 (191) T ss_pred EECCCCHHHHHHHHHHHHHHCC----------CCCCCEEEEECCCCHHHCCCCHHHHHHHHHCCCCCCEEEEEEECHHHH T ss_conf 0258844899999999875302----------677608999814764431889899999973387766899997351679 Q ss_pred CCHHHHHHHHHCCCCCEEEECCHHHHHHHHHHHHHHH Q ss_conf 3117899986068983788299899999999998758 Q gi|254780388|r 412 QEKARYFLSNCASPNSFFEANSTHELNKIFRDRIGNE 448 (458) Q Consensus 412 ~~~~~~~lk~CAs~~~yy~a~~~~eL~~aF~~~i~~~ 448 (458) +. -|+.-=-.|+-|-..+..+|...|++|+... 
T Consensus 157 ---A~-~l~~~lP~G~~fVc~dt~~lP~il~qIfts~ 189 (191) T cd01455 157 ---AD-QLQRELPAGKAFVCMDTSELPHIMQQIFTST 189 (191) T ss_pred ---HH-HHHHHCCCCCEEEECCHHHHHHHHHHHHHHH T ss_conf ---99-9997489974178536536789999998874 No 46 >COG4726 PilX Tfp pilus assembly protein PilX [Cell motility and secretion / Intracellular trafficking and secretion] Probab=90.90 E-value=0.81 Score=23.59 Aligned_cols=54 Identities=20% Similarity=0.229 Sum_probs=30.5 Q ss_pred HHHHCCCCCHHHHHHHHHHHHHHHHHHHHH--------HHHHHHHHHHHHHHHHHHHHHHHHH Q ss_conf 874013872799999999999999999999--------9999999999999999999998876 Q gi|254780388|r 13 KLIKSCTGHFFIITALLMPVMLGVGGMLVD--------VVRWSYYEHALKQAAQTAIITASVP 67 (458) Q Consensus 13 rf~~d~~G~vaiifal~l~~ll~~~g~aVD--------~~r~~~~ks~Lq~A~DaA~LA~a~~ 67 (458) |-.|-+|| ++.+++|+++++++++|++.= ++.-++.|++.++|+++|.-.+-.. T Consensus 7 r~~r~qRG-~~LivvL~~LvvltLl~l~~~r~~llqeRiSaN~~D~~lAfqaAEaaLr~~E~~ 68 (196) T COG4726 7 RGSRRQRG-FALIVVLMVLVVLTLLGLAAARSVLLQERISANERDRSLAFQAAEAALREGELQ 68 (196) T ss_pred CCCCCCCC-EEEHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHH T ss_conf 87645676-473899999999999999999999989887520677899999999999877899 No 47 >pfam04285 DUF444 Protein of unknown function (DUF444). Bacterial protein of unknown function. One family member is predicted to contain a von Willebrand factor (vWF) type A domain (Smart:VWA). Probab=88.94 E-value=1.5 Score=21.88 Aligned_cols=110 Identities=17% Similarity=0.269 Sum_probs=64.5 Q ss_pred CCCCCCCCCHHHHHHHHHHHCCCCCCCCCCCCCCCCCCEEEEEEECCCCCCCCCHH-HHHH-HHHHHCCCEEEEEEECCC Q ss_conf 56777776458899999862366665444456677750699999606688885408-9999-999987968999994378 Q gi|254780388|r 331 ENEMGSTAINDAMQTAYDTIISSNEDEVHRMKNNLEAKKYIVLLTDGENTQDNEEG-IAIC-NKAKSQGIRIMTIAFSVN 408 (458) Q Consensus 331 ~~~~g~T~~~~gl~~g~~~Ls~~~~~~~~~~~~~~~~~K~iil~TDG~n~~~~~~~-~~~C-~~~K~~gI~IytI~f~~~ 408 (458) -..+|||-+..++..+...+....| .....-|..-.|||+|...++.. ..++ +.+-.. ...| -+.+.. 
T Consensus 307 ~~EsGGT~vSSal~l~~~II~~RYp--------p~~WNiY~f~aSDGDNw~~D~~~c~~lL~~~llp~-~~~f-~Y~EI~ 376 (421) T pfam04285 307 KQESGGTIVSSALELALEIIDERYP--------PAEWNIYAFQASDGDNWTDDSERCVKLLMNKLMPN-AQYY-GYVEIT 376 (421) T ss_pred CCCCCCEEEEHHHHHHHHHHHHHCC--------HHHCEEEEEEECCCCCCCCCHHHHHHHHHHHHHHH-HHEE-EEEEEC T ss_conf 4898975872799999999985588--------64450467980377664346499999999989887-4158-999945 Q ss_pred -CCCCCHHHHHHHHHCCCCC--EEEECCHHHHHHHHHHHHHHHHH Q ss_conf -8743117899986068983--78829989999999999875875 Q gi|254780388|r 409 -KTQQEKARYFLSNCASPNS--FFEANSTHELNKIFRDRIGNEIF 450 (458) Q Consensus 409 -~~~~~~~~~~lk~CAs~~~--yy~a~~~~eL~~aF~~~i~~~~~ 450 (458) ....+.+...-..-...++ .....+.+||-.+|+++++++.+ T Consensus 377 ~~~~~~~~~~y~~~~~~~~nf~~~~I~~k~dIypvfr~lf~ke~~ 421 (421) T pfam04285 377 QRRSHSTWRKYEAVKGVKDNFAMYTIREKDDVYPVFRTLFQKELN 421 (421) T ss_pred CCCCCCHHHHHHHHHCCCCCEEEEEECCHHHHHHHHHHHHHHHCC T ss_conf 887652799999863248975799958888889999999864309 No 48 >PRK05325 hypothetical protein; Provisional Probab=87.93 E-value=1.7 Score=21.47 Aligned_cols=110 Identities=14% Similarity=0.231 Sum_probs=63.8 Q ss_pred CCCCCCCCCHHHHHHHHHHHCCCCCCCCCCCCCCCCCCEEEEEEECCCCCCCCCHHH-H-HHHHHHHCCCEEEEEEECCC Q ss_conf 567777764588999998623666654444566777506999996066888854089-9-99999987968999994378 Q gi|254780388|r 331 ENEMGSTAINDAMQTAYDTIISSNEDEVHRMKNNLEAKKYIVLLTDGENTQDNEEGI-A-ICNKAKSQGIRIMTIAFSVN 408 (458) Q Consensus 331 ~~~~g~T~~~~gl~~g~~~Ls~~~~~~~~~~~~~~~~~K~iil~TDG~n~~~~~~~~-~-~C~~~K~~gI~IytI~f~~~ 408 (458) ....|||-+..++..+...+....| .....-|..-.|||+|...+..-. . +.+.+-.. ...|. +.+.. 
T Consensus 295 ~~esGGT~vSSa~~l~~eII~~rYp--------p~~WNIY~f~aSDGDNw~~D~~~~~~~L~~~llp~-~~~f~-Y~Ei~ 364 (414) T PRK05325 295 SRESGGTIVSSALKLMLEIIEERYP--------PAEWNIYAFQASDGDNWSDDSPRCVELLVEELLPV-VNYFA-YIEIT 364 (414) T ss_pred CCCCCCEEEEHHHHHHHHHHHHHCC--------HHHCEEEEEEECCCCCCCCCHHHHHHHHHHHHHHH-HHEEE-EEEEE T ss_conf 5898984850899999999985488--------75652788991377675446699999999988887-53689-99971 Q ss_pred CCCCCHHHHHH---HHHCC-CCCE--EEECCHHHHHHHHHHHHHHHHH Q ss_conf 87431178999---86068-9837--8829989999999999875875 Q gi|254780388|r 409 KTQQEKARYFL---SNCAS-PNSF--FEANSTHELNKIFRDRIGNEIF 450 (458) Q Consensus 409 ~~~~~~~~~~l---k~CAs-~~~y--y~a~~~~eL~~aF~~~i~~~~~ 450 (458) .......+.++ +.... .+++ ......+||-.+|++++.++.. T Consensus 365 ~~~~~~~~~l~~~y~~~~~~~~~f~~~~I~~~~dI~p~fr~lf~k~~~ 412 (414) T PRK05325 365 PRAYYRHQTLWREYEKLQDEFDNFAMQHIRDKADIYPVFRELFKKELA 412 (414) T ss_pred CCCCCCCHHHHHHHHHHHCCCCCEEEEEECCHHHHHHHHHHHHHHHHC T ss_conf 798887568999999975548886799948888889999999855552 No 49 >pfam06707 DUF1194 Protein of unknown function (DUF1194). This family consists of several hypothetical Rhizobiales specific proteins of around 270 residues in length. The function of this family is unknown. Probab=87.53 E-value=1.8 Score=21.32 Aligned_cols=110 Identities=21% Similarity=0.248 Sum_probs=73.0 Q ss_pred CCCCCCCCHHHHHHHHHHHCCCCCCCCCCCCCCCCCCEEEEEEECCCCCCCCCHHHHHHHHHHHCCCEEEEEEECCCCCC Q ss_conf 67777764588999998623666654444566777506999996066888854089999999987968999994378874 Q gi|254780388|r 332 NEMGSTAINDAMQTAYDTIISSNEDEVHRMKNNLEAKKYIVLLTDGENTQDNEEGIAICNKAKSQGIRIMTIAFSVNKTQ 411 (458) Q Consensus 332 ~~~g~T~~~~gl~~g~~~Ls~~~~~~~~~~~~~~~~~K~iil~TDG~n~~~~~~~~~~C~~~K~~gI~IytI~f~~~~~~ 411 (458) ...+.|.+...|..+..+|... ++ .-.+|+|=+-.||.||.+.......=+.+-..||+|--+........ 
T Consensus 90 ~~~~~Taig~Al~~a~~l~~~~-~~--------~~~RrvIDiSGDG~nN~G~~p~~~ard~~~~~GitINgL~I~~~~~~ 160 (206) T pfam06707 90 RAGRRTAIGGALGFAAALLAQN-PY--------ECLRRVIDVSGDGPNNQGFPPVTAARDAAVAAGVTINGLAIMGAEAP 160 (206) T ss_pred CCCCCCHHHHHHHHHHHHHHHC-CC--------CCCEEEEEEECCCCCCCCCCCHHHHHHHHHHCCEEEEEEEECCCCCC T ss_conf 8899976999999999999829-98--------76179999607998889998137898767775928966777478987 Q ss_pred C-CHHHHHHHHHC--CCCCE-EEECCHHHHHHHHHHHHHHHHH Q ss_conf 3-11789998606--89837-8829989999999999875875 Q gi|254780388|r 412 Q-EKARYFLSNCA--SPNSF-FEANSTHELNKIFRDRIGNEIF 450 (458) Q Consensus 412 ~-~~~~~~lk~CA--s~~~y-y~a~~~~eL~~aF~~~i~~~~~ 450 (458) . ..-...++.|. +||.| -.+..-++-.+|++..+-.+++ T Consensus 161 ~~~~L~~yy~~~VIgGpgAFV~~a~~~~df~~AirrKL~rEIa 203 (206) T pfam06707 161 TSDDLDAYYRDCVIGGPGAFVEPANGFEDFAEAIRRKLVREIA 203 (206) T ss_pred CCHHHHHHHHHCCCCCCCCEEEECCCHHHHHHHHHHHHHHHHH T ss_conf 6236999997320238984499738879999999999999873 No 50 >COG2304 Uncharacterized protein containing a von Willebrand factor type A (vWA) domain [General function prediction only] Probab=84.18 E-value=2.7 Score=20.26 Aligned_cols=81 Identities=21% Similarity=0.220 Sum_probs=48.7 Q ss_pred HHHHHHHHHC-CCCCCCCCCHHHHHHHHHHHCCCCCCCCCCCCCCCCCCEEEEEEECCCCCCCCCHHHHHHHHHHH---C Q ss_conf 5666656412-56777776458899999862366665444456677750699999606688885408999999998---7 Q gi|254780388|r 321 RTIVKTFAID-ENEMGSTAINDAMQTAYDTIISSNEDEVHRMKNNLEAKKYIVLLTDGENTQDNEEGIAICNKAKS---Q 396 (458) Q Consensus 321 ~~~~~~~i~~-~~~~g~T~~~~gl~~g~~~Ls~~~~~~~~~~~~~~~~~K~iil~TDG~n~~~~~~~~~~C~~~K~---~ 396 (458) +..+...|.. ..+.|.|....++.|+...+.+.. .....-.+.+.|||+++........+-...|. . 
T Consensus 96 ~~~~~~~i~~~~~~~~~~~~~~~~~~~~~~~~~~~---------~~~~~~~~~~~tdg~~~~~~~d~~~~~~~~~~~~~~ 166 (399) T COG2304 96 KESITAAIDQSLQAGGATAVEASLSLAVELAAKAL---------PRGTLNRILLLTDGENNLGLVDPSRLSALAKLAAGK 166 (399) T ss_pred HHHHHHHHHHHCCCCCCCCHHHHHHHHHHHHHHHC---------CCCCCCEEEEECCCCHHCCCCCHHHHHHHHCCCCCC T ss_conf 27788887640264554305778999999876423---------545532333303641202766788999986345567 Q ss_pred CCEEEEEEECCCCC Q ss_conf 96899999437887 Q gi|254780388|r 397 GIRIMTIAFSVNKT 410 (458) Q Consensus 397 gI~IytI~f~~~~~ 410 (458) +|.+.++|++...+ T Consensus 167 ~i~~~~~g~~~~~n 180 (399) T COG2304 167 GIVLDTLGLGDDVN 180 (399) T ss_pred CEEEEEECCCCHHH T ss_conf 62786313552267 No 51 >cd01452 VWA_26S_proteasome_subunit 26S proteasome plays a major role in eukaryotic protein breakdown, especially for ubiquitin-tagged proteins. It is an ATP-dependent protease responsible for the bulk of non-lysosomal proteolysis in eukaryotes, often using covalent modification of proteins by ubiquitylation. It consists of a 20S proteolytic core particle (CP) and a 19S regulatory particle (RP). The CP is an ATP-independent peptidase consisting of hydrolyzing activities. One or both ends of CP carry the RP that confers both ubiquitin and ATP dependence to the 26S proteasome. The RP's proposed functions include recognition of substrates and translocation of these to CP for proteolysis. The RP can dissociate into stable lid and base subcomplexes. The base is composed of three non-ATPase subunits (Rpn 1, 2 and 10). A single residue in the vWA domain of Rpn10 has been implicated to be responsible for stabilizing the lid-base association.
Probab=82.04 E-value=3.3 Score=19.73 Aligned_cols=160 Identities=11% Similarity=0.122 Sum_probs=95.4 Q ss_pred CCCCCCCCCHHHHHHHHHHHCCCCCCCCCCCCCEEEEEEEC-CCCCCCCCCCCCHHHHHHHHHHHHHHCCCCCCCCCCHH Q ss_conf 67775433066777665221123467756665201211006-77665565576812100566665641256777776458 Q gi|254780388|r 263 SLRHVIKKKHLVRDALASVIRSIKKIDNVNDTVRMGATFFN-DRVISDPSFSWGVHKLIRTIVKTFAIDENEMGSTAIND 341 (458) Q Consensus 263 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~-~~~~~~~~~~~~~~~~~~~~~~~~i~~~~~~g~T~~~~ 341 (458) ..++..+|+++-.+++.-+........|.+.- +..+.. .......+++.+. ..+...+....+.|.-+... T Consensus 18 NGDy~PtR~~AQ~dAvn~i~~~k~~~NpEn~V---Gl~tmag~~~~Vl~TlT~D~-----gkiL~~lh~i~~~G~~~~~~ 89 (187) T cd01452 18 NGDYPPTRFQAQADAVNLICQAKTRSNPENNV---GLMTMAGNSPEVLVTLTNDQ-----GKILSKLHDVQPKGKANFIT 89 (187) T ss_pred CCCCCCCHHHHHHHHHHHHHHHHHHCCCCCCE---EEEEECCCCCEEEEECCCCH-----HHHHHHCCCCCCCCEECHHH T ss_conf 58989718999999999999777514953311---35761589866898448657-----88987532677187651887 Q ss_pred HHHHHHHHHCCCCCCCCCCCCCCCC-CCEEEEEEECCCCCCCCCHHHHHHHHHHHCCCEEEEEEECCCCCCCCHHHHHHH Q ss_conf 8999998623666654444566777-506999996066888854089999999987968999994378874311789998 Q gi|254780388|r 342 AMQTAYDTIISSNEDEVHRMKNNLE-AKKYIVLLTDGENTQDNEEGIAICNKAKSQGIRIMTIAFSVNKTQQEKARYFLS 420 (458) Q Consensus 342 gl~~g~~~Ls~~~~~~~~~~~~~~~-~~K~iil~TDG~n~~~~~~~~~~C~~~K~~gI~IytI~f~~~~~~~~~~~~~lk 420 (458) ||+-|.=.|-... ++. 
-+|+++|.-- -..........+.+++|.++|-|-.|.||........-+.+.+ T Consensus 90 ~IqiA~LALKHRq---------nk~~~qRIv~FVgS-Pi~~~ek~l~~laKklKKnnV~vDII~FGe~~~n~~kL~~f~~ 159 (187) T cd01452 90 GIQIAQLALKHRQ---------NKNQKQRIVAFVGS-PIEEDEKDLVKLAKRLKKNNVSVDIINFGEIDDNTEKLTAFID 159 (187) T ss_pred HHHHHHHHHHCCC---------CCCCCEEEEEEECC-CCCCCHHHHHHHHHHHHHCCCCEEEEEECCCCCCHHHHHHHHH T ss_conf 9999999972346---------77754479999789-8755789999999987555853589994688899899999999 Q ss_pred HHCC-CC-CEEEECCHHHH-HHH Q ss_conf 6068-98-37882998999-999 Q gi|254780388|r 421 NCAS-PN-SFFEANSTHEL-NKI 440 (458) Q Consensus 421 ~CAs-~~-~yy~a~~~~eL-~~a 440 (458) ..-+ ++ ++-..+.+..| .++ T Consensus 160 ~vn~~~~Shlv~ippg~~lLSd~ 182 (187) T cd01452 160 AVNGKDGSHLVSVPPGENLLSDA 182 (187) T ss_pred HHCCCCCCEEEEECCCCCHHHHH T ss_conf 84589982599947998645676 No 52 >TIGR02877 spore_yhbH sporulation protein YhbH; InterPro: IPR014230 Proteins in this entry, typified by YhbH from Bacillus subtilis, are found in the genomes of nearly every endospore-forming bacterium, and in no other genomes. The gene in Bacillus subtilis was shown to be a member of the sigma-E regulon, with mutation leading to a sporulation defect .. Probab=80.33 E-value=3.7 Score=19.36 Aligned_cols=106 Identities=18% Similarity=0.219 Sum_probs=67.7 Q ss_pred CCCCCCCCCHHHHHHHHHHHCCCCCCCCCCCCCCCCCCEEEEEEECCCCC-CCCCH-HHHHHHHH-HHCC----CEE--E Q ss_conf 56777776458899999862366665444456677750699999606688-88540-89999999-9879----689--9 Q gi|254780388|r 331 ENEMGSTAINDAMQTAYDTIISSNEDEVHRMKNNLEAKKYIVLLTDGENT-QDNEE-GIAICNKA-KSQG----IRI--M 401 (458) Q Consensus 331 ~~~~g~T~~~~gl~~g~~~Ls~~~~~~~~~~~~~~~~~K~iil~TDG~n~-~~~~~-~~~~C~~~-K~~g----I~I--y 401 (458) .--+|||-+..|...|+.++... 
|+-....=|-+.+|||+|- .+|.| ...+..++ +--+ ++| - T Consensus 275 kgESGGT~~SS~Y~~ALeiI~~R--------YnP~~yNiY~FHfSDGDNl~~Dn~Rlav~l~~~L~~~cNL~GYgEIEtq 346 (392) T TIGR02877 275 KGESGGTRCSSAYKLALEIIDER--------YNPARYNIYAFHFSDGDNLSSDNERLAVKLVRKLLEVCNLFGYGEIETQ 346 (392) T ss_pred CCCCCCCCHHHHHHHHHHHHHCC--------CCCCCCCCCCCEEECCCCCCCCCHHHHHHHHHHHHHHHHCCCEEEEECC T ss_conf 25667743016788999997427--------8831006565355337788988646899999999887611110566056 Q ss_pred EEEECCCCCCCCHHHHHHHH-HCCCCC-EEEECCHHHHHHHHHHH Q ss_conf 99943788743117899986-068983-78829989999999999 Q gi|254780388|r 402 TIAFSVNKTQQEKARYFLSN-CASPNS-FFEANSTHELNKIFRDR 444 (458) Q Consensus 402 tI~f~~~~~~~~~~~~~lk~-CAs~~~-yy~a~~~~eL~~aF~~~ 444 (458) --++..+.+++++.....++ .-.|-. ++.-.+.+||-.|.+.. T Consensus 347 Pqyls~~Y~y~~tL~~~f~~ei~~~~Fv~~~I~~K~d~y~ALk~~ 391 (392) T TIGR02877 347 PQYLSMPYGYSSTLKSKFKKEIKDPNFVLLIIKDKEDVYPALKKF 391 (392) T ss_pred CCEECCCCCCCHHHHHHHHHHHCCCCCEEEEECCHHHHHHHHHHH T ss_conf 511037886655778888874058883587650414689999983 No 53 >pfam05762 VWA_CoxE VWA domain containing CoxE-like protein. This family is annotated by SMART as containing a VWA type domain. The exact function of this family is unknown. It is found as part of a CO oxidising (Cox) system operon in several bacteria.
Probab=77.86 E-value=4.4 Score=18.88 Aligned_cols=61 Identities=23% Similarity=0.232 Sum_probs=41.5 Q ss_pred CCCCCCCHHHHHHHHHHHCCCCCCCCCCCCCCCCCCEEEEEEECCCCCCCCCHHHHHHHHHHHCCCEEEEE Q ss_conf 77777645889999986236666544445667775069999960668888540899999999879689999 Q gi|254780388|r 333 EMGSTAINDAMQTAYDTIISSNEDEVHRMKNNLEAKKYIVLLTDGENTQDNEEGIAICNKAKSQGIRIMTI 403 (458) Q Consensus 333 ~~g~T~~~~gl~~g~~~Ls~~~~~~~~~~~~~~~~~K~iil~TDG~n~~~~~~~~~~C~~~K~~gI~IytI 403 (458) -+|||+++..+..-.+...- ..-..+-++|++|||.++.+.......-+.++..+-+|.-+ T Consensus 126 ~~GgT~ig~al~~f~~~~~~----------~~l~~~t~ViilsDg~~~~~~~~l~~~l~~L~~~~~rviWL 186 (223) T pfam05762 126 WGGGTRIGAALAYFNELWTR----------PALSRGAVVVLVSDGLERGDSEELLAEVARLVRSARRLVWL 186 (223) T ss_pred CCCCCCHHHHHHHHHHHCCC----------CCCCCCCEEEEEECCCCCCCHHHHHHHHHHHHHHCCEEEEE T ss_conf 67997499999999985030----------34678867999723010388318999999999837879998 No 54 >pfam00362 Integrin_beta Integrin, beta chain. Integrins have been found in animals and their homologues have also been found in cyanobacteria, probably due to horizontal gene transfer. The sequences repeats have been trimmed due to an overlap with EGF. Probab=74.58 E-value=5.4 Score=18.32 Aligned_cols=85 Identities=13% Similarity=0.128 Sum_probs=49.5 Q ss_pred CCCCCCHHHHHHHHHHHHHHCCCCCCCCCCHHH-HHHHHHHHCCCCCCCCCCCCCCCCCCEEEEEEECCCCCC------- Q ss_conf 655768121005666656412567777764588-999998623666654444566777506999996066888------- Q gi|254780388|r 310 PSFSWGVHKLIRTIVKTFAIDENEMGSTAINDA-MQTAYDTIISSNEDEVHRMKNNLEAKKYIVLLTDGENTQ------- 381 (458) Q Consensus 310 ~~~~~~~~~~~~~~~~~~i~~~~~~g~T~~~~g-l~~g~~~Ls~~~~~~~~~~~~~~~~~K~iil~TDG~n~~------- 381 (458) .+++.+...+.. .+....-+|+-+.++| +-.-++.......-.|. ...+|++||.||+.... 
T Consensus 182 l~LT~~~~~F~~-----~v~~q~iSgNlD~PEGGfDAlmQ~aVC~~~IGWR-----~~arrllv~~TDa~fH~AgDGkL~ 251 (424) T pfam00362 182 LSLTDDTDRFNE-----EVKKQKISGNLDAPEGGFDAIMQAAVCGEEIGWR-----NEARRLLVFTTDAGFHFAGDGKLG 251 (424) T ss_pred CCCCCCHHHHHH-----HHHHCCCCCCCCCCCCCHHHHHHHHHHCCCCCCC-----CCCEEEEEEECCCCCCCCCCCCEE T ss_conf 246777899999-----9874636467778750177777887614233777-----785289999858875135776334 Q ss_pred ------------------------CCCHHHHHHHHHHHCCCEE-EEEE Q ss_conf ------------------------8540899999999879689-9999 Q gi|254780388|r 382 ------------------------DNEEGIAICNKAKSQGIRI-MTIA 404 (458) Q Consensus 382 ------------------------~~~~~~~~C~~~K~~gI~I-ytI~ 404 (458) +-.-...+-+++++++|.+ |.|. T Consensus 252 GIv~PNDg~CHL~~~g~Yt~s~~~DYPSv~ql~~kl~ennI~~IFAVt 299 (424) T pfam00362 252 GIVEPNDGQCHLDDNGEYTASTTLDYPSVGQLAEKLSENNINPIFAVT 299 (424) T ss_pred EEECCCCCCEEECCCCCCCCCCCCCCCCHHHHHHHHHHCCCEEEEEEC T ss_conf 353488873044898761445667888889999999864925999975 No 55 >KOG1226 consensus Probab=60.42 E-value=10 Score=16.49 Aligned_cols=47 Identities=26% Similarity=0.244 Sum_probs=28.0 Q ss_pred EEECCCCCCHHCCCCCCCCCCCCCCCCCCCCCCCCCCEEEEEEECCCCCCCCCC Q ss_conf 640454310000376643222432100001001232013445521345655466 Q gi|254780388|r 139 LLLNPLSLFLRSMGIKSWLIQTKAEAETVSRSYHKEHGVSIQWVIDFSRSMLDY 192 (458) Q Consensus 139 ~~~~~~~~~~~~~~~~s~~~~~~~~~~~~~~~~~~~~~~d~~~v~d~sgSm~~~ 192 (458) .++.+....+++.+.....+..... ...+.|+|+|...|.|.||.+. 
T Consensus 102 ~Qi~PQ~~~l~LRpg~~~~f~l~~r-------~a~~yPVDLYyLMDlS~SM~DD 148 (783) T KOG1226 102 TQITPQELRLRLRPGEEQTFQLKVR-------QAEDYPVDLYYLMDLSYSMKDD 148 (783) T ss_pred EEECCCEEEEEECCCCCEEEEEEEE-------ECCCCCEEEEEEEECCHHHHHH T ss_conf 6864615899965798504899996-------0357970379986130245656 No 56 >PRK10913 dipeptide transporter; Provisional Probab=56.98 E-value=12 Score=16.12 Aligned_cols=35 Identities=3% Similarity=-0.092 Sum_probs=23.9 Q ss_pred HHHHHHHHHHHHHHCCCCCHHHHHHHHHHHHHHHH Q ss_conf 36789999998740138727999999999999999 Q gi|254780388|r 3 FDTKFIFYSKKLIKSCTGHFFIITALLMPVMLGVG 37 (458) Q Consensus 3 ~~~~~~~~~~rf~~d~~G~vaiifal~l~~ll~~~ 37 (458) -+|.++.+.|||.||+.+-++....+.++.+-.++ T Consensus 15 p~sp~~~~w~r~~r~~~a~~g~~il~~~~l~alfa 49 (300) T PRK10913 15 PMTPLQEFWHYFKRNKGAVVGLVYVVIVLFIAIFA 49 (300) T ss_pred CCCHHHHHHHHHHHCHHHHHHHHHHHHHHHHHHHH T ss_conf 98989999999853879999999999999999999 No 57 >cd04477 RPA1N RPA1N: A subfamily of OB folds corresponding to the N-terminal OB-fold domain of human RPA1 (also called RPA70). RPA1 is the large subunit of Replication protein A (RPA). RPA is a nuclear ssDNA-binding protein (SSB) which appears to be involved in all aspects of DNA metabolism including replication, recombination, and repair. RPA also mediates specific interactions of various nuclear proteins. In animals, plants, and fungi, RPA is a heterotrimer with subunits of 70KDa (RPA1), 32kDa (RPA2), and 14 KDa (RPA3). RPA1N is known to specifically interact with the p53 tumor suppressor, DNA polymerase alpha, and transcription factors. In addition to RPA1N, RPA1 contains three other OB folds: ssDNA-binding domain (DBD)-A, DBD-B, and DBD-C. 
Probab=55.83 E-value=11 Score=16.36 Aligned_cols=38 Identities=16% Similarity=0.163 Sum_probs=23.1 Q ss_pred CCCCEEEEEEECCCCCCCCCHHHHHHHHHHHCCCEEEE Q ss_conf 77506999996066888854089999999987968999 Q gi|254780388|r 365 LEAKKYIVLLTDGENTQDNEEGIAICNKAKSQGIRIMT 402 (458) Q Consensus 365 ~~~~K~iil~TDG~n~~~~~~~~~~C~~~K~~gI~Iyt 402 (458) ...+||-++++||.+...--....+...+++..++-++ T Consensus 33 ~~~~RyRi~lSDG~~~~~amLatqln~~v~~g~l~~~s 70 (97) T cd04477 33 GSSERYRILLSDGVYYVQAMLATQLNPLVESGQLQRGS 70 (97) T ss_pred CCCCEEEEEEECCCEEEEEEEHHHHHHHHHCCCCCCCC T ss_conf 97632899997555045568624557898719966688 No 58 >smart00187 INB Integrin beta subunits (N-terminal portion of extracellular region). Portion of beta integrins that lies N-terminal to their EGF-like repeats. Integrins are cell adhesion molecules that mediate cell-extracellular matrix and cell-cell interactions. They contain both alpha and beta subunits. Beta integrins are proposed to have a von Willebrand factor type-A "insert" or "I" -like domain (although this remains to be confirmed). Probab=53.64 E-value=14 Score=15.78 Aligned_cols=87 Identities=10% Similarity=0.084 Sum_probs=50.3 Q ss_pred CCCCCCCHHHHHHHHHHHHHHCCCCCCCCCCHHH-HHHHHHHHCCCCCCCCCCCCCCCCCCEEEEEEECCCCCC------ Q ss_conf 5655768121005666656412567777764588-999998623666654444566777506999996066888------ Q gi|254780388|r 309 DPSFSWGVHKLIRTIVKTFAIDENEMGSTAINDA-MQTAYDTIISSNEDEVHRMKNNLEAKKYIVLLTDGENTQ------ 381 (458) Q Consensus 309 ~~~~~~~~~~~~~~~~~~~i~~~~~~g~T~~~~g-l~~g~~~Ls~~~~~~~~~~~~~~~~~K~iil~TDG~n~~------ 381 (458) ..+++.+...+... +....-+|+-+.++| +-.-++.......-.|. ...+|++||.||+.... 
T Consensus 180 ~l~LT~d~~~F~~~-----V~~q~iSgNlD~PEGGfDAlmQ~avC~~~IGWR-----~~arrllVf~TDa~fH~AgDGkL 249 (423) T smart00187 180 VLSLTDDTDEFNEE-----VKKQRISGNLDAPEGGFDAIMQAAVCTEQIGWR-----EDARRLLVFSTDAGFHFAGDGKL 249 (423) T ss_pred CCCCCCCHHHHHHH-----HHHCCCCCCCCCCCCCHHHHHHHHHHCCCCCCC-----CCCEEEEEEECCCCCCCCCCCCE T ss_conf 12367888999999-----862536346688761277888887520003765-----57438999983786302367624 Q ss_pred -------------------------CCCHHHHHHHHHHHCCCEE-EEEEE Q ss_conf -------------------------8540899999999879689-99994 Q gi|254780388|r 382 -------------------------DNEEGIAICNKAKSQGIRI-MTIAF 405 (458) Q Consensus 382 -------------------------~~~~~~~~C~~~K~~gI~I-ytI~f 405 (458) +-.-...+-+++++++|.+ |.|.= T Consensus 250 ~GIv~PNDg~CHLd~~g~Yt~s~~~DYPSi~ql~~kl~ennI~~IFAVT~ 299 (423) T smart00187 250 AGIVQPNDGQCHLDNNGEYTMSTTQDYPSIGQLNQKLAENNINPIFAVTK 299 (423) T ss_pred EEEECCCCCCEEECCCCCCCCCCCCCCCCHHHHHHHHHHCCCEEEEEECC T ss_conf 43543788730327888524456567887899999998539327998522 No 59 >pfam04057 Rep-A_N Replication factor-A protein 1, N-terminal domain. 
Probab=52.76 E-value=13 Score=15.84 Aligned_cols=38 Identities=18% Similarity=0.225 Sum_probs=23.0 Q ss_pred CCCCEEEEEEECCCCCCCCCHHHHHHHHHHHCCCEEEE Q ss_conf 77506999996066888854089999999987968999 Q gi|254780388|r 365 LEAKKYIVLLTDGENTQDNEEGIAICNKAKSQGIRIMT 402 (458) Q Consensus 365 ~~~~K~iil~TDG~n~~~~~~~~~~C~~~K~~gI~Iyt 402 (458) ..++||-++++||.+...--....+...+++..++-++ T Consensus 34 ~~~~RyR~~lSDG~~~~~aMLatqlN~~V~~g~l~~~s 71 (100) T pfam04057 34 NSPERYRFLLSDGKNKSKAMLATQLNSLVISGKLQNGS 71 (100) T ss_pred CCCCCEEEEEECCCCEEEEEEHHHHHHHHHCCCCCCCC T ss_conf 99763999998742015478502345687649866560 No 60 >COG1991 Uncharacterized conserved protein [Function unknown] Probab=52.13 E-value=4 Score=19.19 Aligned_cols=32 Identities=22% Similarity=0.283 Sum_probs=27.0 Q ss_pred HHHHHHHCCCCCHHHHHHHHHHHHHHHHHHHH Q ss_conf 99987401387279999999999999999999 Q gi|254780388|r 10 YSKKLIKSCTGHFFIITALLMPVMLGVGGMLV 41 (458) Q Consensus 10 ~~~rf~~d~~G~vaiifal~l~~ll~~~g~aV 41 (458) +..+++.+.||.+..=|.|+++.++.+++.++ T Consensus 4 ~i~~~~~~nkgQiSLEf~Ll~l~ivla~~i~~ 35 (131) T COG1991 4 YITKIILSNKGQISLEFSLLLLAIVLAASIAG 35 (131) T ss_pred EEEEEEECCCCCEEEEHHHHHHHHHHHHHHEE T ss_conf 66254224666256413899999999731211 No 61 >pfam04964 Flp_Fap Flp/Fap pilin component. 
Probab=50.88 E-value=15 Score=15.50 Aligned_cols=21 Identities=19% Similarity=0.282 Sum_probs=16.2 Q ss_pred HHHCCCCCHHHHHHHHHHHHH Q ss_conf 740138727999999999999 Q gi|254780388|r 14 LIKSCTGHFFIITALLMPVML 34 (458) Q Consensus 14 f~~d~~G~vaiifal~l~~ll 34 (458) |+|||+|.-||=.+|..-.+- T Consensus 1 F~kde~GaTAIEYgLIaalIa 21 (47) T pfam04964 1 FLKDESGATAIEYGLIAALIA 21 (47) T ss_pred CCCCCCCCHHHHHHHHHHHHH T ss_conf 965656415999999999999 No 62 >KOG4667 consensus Probab=44.02 E-value=19 Score=14.83 Aligned_cols=10 Identities=20% Similarity=0.189 Sum_probs=5.1 Q ss_pred HHHHHHHHCC Q ss_conf 7899986068 Q gi|254780388|r 415 ARYFLSNCAS 424 (458) Q Consensus 415 ~~~~lk~CAs 424 (458) ++.++|..++ T Consensus 218 AkefAk~i~n 227 (269) T KOG4667 218 AKEFAKIIPN 227 (269) T ss_pred HHHHHHHCCC T ss_conf 7999985668 No 63 >pfam09967 DUF2201 Predicted metallopeptidase (DUF2201). This domain, found in various hypothetical bacterial proteins, has no known function. Probab=41.50 E-value=21 Score=14.58 Aligned_cols=37 Identities=22% Similarity=0.171 Sum_probs=26.3 Q ss_pred CCCCCCCCCCHHHHHHHHHHHCCCCCCCCCCCCCCCCCCEEEEEEECCCCCCCC Q ss_conf 256777776458899999862366665444456677750699999606688885 Q gi|254780388|r 330 DENEMGSTAINDAMQTAYDTIISSNEDEVHRMKNNLEAKKYIVLLTDGENTQDN 383 (458) Q Consensus 330 ~~~~~g~T~~~~gl~~g~~~Ls~~~~~~~~~~~~~~~~~K~iil~TDG~n~~~~ 383 (458) .+.-+|||+....+.|.-.. - ...+|++|||+-+... T Consensus 349 ~~~GgGGTdf~pvf~~~~~~-~----------------p~~~i~fTDG~g~~p~ 385 (412) T pfam09967 349 ELTGGGGTDFRPVLEAALRL-R----------------PDAAVVLTDLEGWPAG 385 (412) T ss_pred CCCCCCCCCCHHHHHHHHHC-C----------------CCEEEEEECCCCCCCC T ss_conf 13578998784899999826-9----------------9769998389989887 No 64 >pfam07002 Copine Copine. This family represents a conserved region approximately 180 residues long within eukaryotic copines. Copines are Ca(2+)-dependent phospholipid-binding proteins that are thought to be involved in membrane-trafficking, and may also be involved in cell division and growth. 
Probab=37.21 E-value=24 Score=14.16 Aligned_cols=73 Identities=14% Similarity=0.138 Sum_probs=41.1 Q ss_pred HHHHHHCCCCCCCCCCHHHHHHHHHHHCCCCCCCCCCCCCCCCCCEEEEEEECCCCCCCCCHHHHHHHHHHHCCCEEEEE Q ss_conf 66564125677777645889999986236666544445667775069999960668888540899999999879689999 Q gi|254780388|r 324 VKTFAIDENEMGSTAINDAMQTAYDTIISSNEDEVHRMKNNLEAKKYIVLLTDGENTQDNEEGIAICNKAKSQGIRIMTI 403 (458) Q Consensus 324 ~~~~i~~~~~~g~T~~~~gl~~g~~~Ls~~~~~~~~~~~~~~~~~K~iil~TDG~n~~~~~~~~~~C~~~K~~gI~IytI 403 (458) .+..+......|.|+..+=|..+.+..... .+...--+++++|||+-+.-+.....+.++ -+.-+.|--| T Consensus 73 Y~~~~~~v~l~gPT~fapiI~~a~~~a~~~---------~~~~~Y~VLlIiTDG~i~D~~~Ti~aIv~A-S~~PlSIIiV 142 (145) T pfam07002 73 YREALPNLQLSGPTNFAPIIDAAARIAEAT---------QKSGQYHVLLIITDGQVTDMKATIDAIVRA-SHLPLSIIIV 142 (145) T ss_pred HHHHHCEEEECCCCCHHHHHHHHHHHHHHH---------CCCCEEEEEEEECCCCCCCHHHHHHHHHHH-HCCCEEEEEE T ss_conf 999858106548752799999999999972---------237718999996389735699999999998-2799279999 Q ss_pred EEC Q ss_conf 943 Q gi|254780388|r 404 AFS 406 (458) Q Consensus 404 ~f~ 406 (458) |+| T Consensus 143 GVG 145 (145) T pfam07002 143 GVG 145 (145) T ss_pred EEC T ss_conf 519 No 65 >cd03132 GATase1_catalase Type 1 glutamine amidotransferase (GATase1)-like domain found in at the C-terminal of several large catalases. Type 1 glutamine amidotransferase (GATase1)-like domain found in at the C-terminal of several large catalases. Catalase catalyzes the dismutation of hydrogen peroxide (H2O2) to water and oxygen. This group includes the large catalases: Neurospora crassa Catalase-1 and Catalase-3 and, Escherichia coli HP-II. This GATase1-like domain has an essential role in HP-II catalase activity. However, it lacks enzymatic activity and the catalytic triad typical of GATase1 domains. Catalase-1 and -3 are homotetrameric, HP-II is homohexameric. It has been proposed that this domain may facilitate the folding and oligomerization process. 
The interface between this GATase1-like domain of HP-II and the core of the subunit forms part of a channel which provides access to the deeply buried catalase active sites of HPII. Catalase-1 is associated with non-growing cells; C Probab=37.14 E-value=24 Score=14.15 Aligned_cols=20 Identities=15% Similarity=0.356 Sum_probs=9.3 Q ss_pred CEEEECCHHHH-HHHHHHHHH Q ss_conf 37882998999-999999987 Q gi|254780388|r 427 SFFEANSTHEL-NKIFRDRIG 446 (458) Q Consensus 427 ~yy~a~~~~eL-~~aF~~~i~ 446 (458) ..+...+..++ .+.|-+.+. T Consensus 119 Gvv~~~~~~~~~~~~F~~~~~ 139 (142) T cd03132 119 GVVTADDVKDVFTDRFIDALA 139 (142) T ss_pred CEEEECCCCHHHHHHHHHHHH T ss_conf 579815866778999999998 No 66 >pfam04917 Shufflon_N Bacterial shufflon protein, N-terminal constant region. This family represents the high-similarity N-terminal 'constant region' shared by shufflon proteins. Probab=33.81 E-value=27 Score=13.82 Aligned_cols=45 Identities=11% Similarity=0.056 Sum_probs=32.9 Q ss_pred HHCCCCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHH Q ss_conf 401387279999999999999999999999999999999999999 Q gi|254780388|r 15 IKSCTGHFFIITALLMPVMLGVGGMLVDVVRWSYYEHALKQAAQT 59 (458) Q Consensus 15 ~~d~~G~vaiifal~l~~ll~~~g~aVD~~r~~~~ks~Lq~A~Da 59 (458) .|..||-..|=..++|.+++.++.+.+.+..-++...+-|.+... T Consensus 2 r~~~kGf~LlE~~~~L~I~~~~~~~~~~~~~~~~~~~~~q~aA~q 46 (356) T pfam04917 2 KKTDKGVSLLEVGAVLLIVVMVIPKVAENIEDYLNNVRWQNAAEH 46 (356) T ss_pred CEECCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHH T ss_conf 120344308999999999999999999999989999999999999 No 67 >pfam02060 ISK_Channel Slow voltage-gated potassium channel. 
Probab=33.12 E-value=28 Score=13.75 Aligned_cols=37 Identities=8% Similarity=0.037 Sum_probs=27.2 Q ss_pred HHHHHHCCCCCHHHHHHHHHHHHHHHHHHHHHHHHHH Q ss_conf 9987401387279999999999999999999999999 Q gi|254780388|r 11 SKKLIKSCTGHFFIITALLMPVMLGVGGMLVDVVRWS 47 (458) Q Consensus 11 ~~rf~~d~~G~vaiifal~l~~ll~~~g~aVD~~r~~ 47 (458) .||.-+...|...++..++++-++++..++|=++..- T Consensus 31 arr~p~~~dg~le~lYiLmvlGfFgFft~GImlsyiR 67 (129) T pfam02060 31 ARRSPLGDDGKLEALYILMVLGFFGFFTLGIMLSYIR 67 (129) T ss_pred CCCCCCCCCCCHHHHHHHHHHHHHHHHHHHHHHHHHH T ss_conf 5678899986026889999999999999999999999 No 68 >TIGR02600 TIGR02600 Verrucomicrobium spinosum paralogous protein TIGR02600. Probab=32.76 E-value=29 Score=13.71 Aligned_cols=44 Identities=7% Similarity=0.002 Sum_probs=21.5 Q ss_pred HHHHHHHHHHHHHHHHHHHHH--------HHHHHHHHHHHHHHHHHHHHHHH Q ss_conf 799999999999999999999--------99999999999999999999988 Q gi|254780388|r 22 FFIITALLMPVMLGVGGMLVD--------VVRWSYYEHALKQAAQTAIITAS 65 (458) Q Consensus 22 vaiifal~l~~ll~~~g~aVD--------~~r~~~~ks~Lq~A~DaA~LA~a 65 (458) +|+|++|+++.||.++-+++= -+..+...++++.=.|.||=++. T Consensus 2 ~ALi~vL~~lALiT~LVl~fls~v~~E~rsss~y~~~~~ar~L~d~avn~V~ 53 (1697) T TIGR02600 2 MALIMVLIVLALITILVLAFLSMVRTETRSSSSYASRSDARTLSDMAVNIVI 53 (1697) T ss_pred CHHHHHHHHHHHHHHHHHHHHHCHHHHHHHHHHHHHHHHHHHHHHHHHHHHH T ss_conf 0488889999999999999871216899999989888988877655899999 No 69 >COG1681 FlaB Archaeal flagellins [Cell motility and secretion] Probab=31.38 E-value=30 Score=13.56 Aligned_cols=22 Identities=23% Similarity=0.110 Sum_probs=16.0 Q ss_pred CCCCCHHHHHHHHHHHHHHHHH Q ss_conf 1387279999999999999999 Q gi|254780388|r 17 SCTGHFFIITALLMPVMLGVGG 38 (458) Q Consensus 17 d~~G~vaiifal~l~~ll~~~g 38 (458) +|||-+.|=.+|.++.|++++. 
T Consensus 1 ~rrG~~GIgtlIVfIAmVlVAA 22 (209) T COG1681 1 DRRGATGIGTLIVFIAMVLVAA 22 (209) T ss_pred CCCCCCCHHHHHHHHHHHHHHH T ss_conf 9841104328999999999999 No 70 >PRK10506 hypothetical protein; Provisional Probab=31.30 E-value=30 Score=13.56 Aligned_cols=45 Identities=20% Similarity=0.122 Sum_probs=35.3 Q ss_pred HHCCCCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHH Q ss_conf 401387279999999999999999999999999999999999999 Q gi|254780388|r 15 IKSCTGHFFIITALLMPVMLGVGGMLVDVVRWSYYEHALKQAAQT 59 (458) Q Consensus 15 ~~d~~G~vaiifal~l~~ll~~~g~aVD~~r~~~~ks~Lq~A~Da 59 (458) ++.|+|=-.|=..+.+.++-.+..+++---+.++.+.||++++.. T Consensus 1 ~~~q~GFTLiEllvvi~ii~il~~~a~p~~~~~~q~~~L~~~a~~ 45 (155) T PRK10506 1 MKKQRGYTLIETLVAMTLVVILSAWGLYGWQYWQQQQRLWQTAQQ 45 (155) T ss_pred CCCCCCEEHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHH T ss_conf 985566279999999999999998877779999999999999999 No 71 >PRK06007 fliF flagellar MS-ring protein; Reviewed Probab=28.84 E-value=33 Score=13.29 Aligned_cols=37 Identities=16% Similarity=0.067 Sum_probs=24.3 Q ss_pred HHHHHHHHHHHHHCCCCCHHHHHHHHHHHHHHHHHHH Q ss_conf 6789999998740138727999999999999999999 Q gi|254780388|r 4 DTKFIFYSKKLIKSCTGHFFIITALLMPVMLGVGGML 40 (458) Q Consensus 4 ~~~~~~~~~rf~~d~~G~vaiifal~l~~ll~~~g~a 40 (458) -.+|+.++++|-+.||-.+.+.+++++..+++++-++ T Consensus 9 ~~~~~~~~~~l~~~qki~l~~~~~~~i~~~~~l~~~~ 45 (540) T PRK06007 9 MEKLLEFLKKLSKLRKIALIGAAAAVIAAIVALVLWA 45 (540) T ss_pred HHHHHHHHHCCCHHHHHHHHHHHHHHHHHHHHHHHHH T ss_conf 9999999970698889999999999999999999841 No 72 >COG4867 Uncharacterized protein with a von Willebrand factor type A (vWA) domain [General function prediction only] Probab=28.72 E-value=33 Score=13.28 Aligned_cols=92 Identities=14% Similarity=0.200 Sum_probs=55.7 Q ss_pred CCCCCCHHHHHHHHHHHCCCCCCCCCCCCCCCCCCEEEEEEECCCCCCC---------------CCHH----HHHHHHHH Q ss_conf 7777645889999986236666544445667775069999960668888---------------5408----99999999 Q gi|254780388|r 334 MGSTAINDAMQTAYDTIISSNEDEVHRMKNNLEAKKYIVLLTDGENTQD---------------NEEG----IAICNKAK 
394 (458) Q Consensus 334 ~g~T~~~~gl~~g~~~Ls~~~~~~~~~~~~~~~~~K~iil~TDG~n~~~---------------~~~~----~~~C~~~K 394 (458) .-+||...|+..+-+.|-- ....+|.++++|||+.+.- +..+ ..-.+++. T Consensus 530 eqgTNlhhaL~LA~r~l~R-----------h~~~~~~il~vTDGePtAhle~~DG~~~~f~yp~DP~t~~~Tvr~~d~~~ 598 (652) T COG4867 530 EQGTNLHHALALAGRHLRR-----------HAGAQPVVLVVTDGEPTAHLEDGDGTSVFFDYPPDPRTIAHTVRGFDDMA 598 (652) T ss_pred CCCCCHHHHHHHHHHHHHH-----------CCCCCCEEEEEECCCCCCCCCCCCCCEEECCCCCCHHHHHHHHHHHHHHH T ss_conf 4555458899999999873-----------75657628998379863013478985661689987779989899888887 Q ss_pred HCCCEEEEEEECCCCCCCCHHHHHHHHHC--CCCCEEEECCHHHHHHHH Q ss_conf 87968999994378874311789998606--898378829989999999 Q gi|254780388|r 395 SQGIRIMTIAFSVNKTQQEKARYFLSNCA--SPNSFFEANSTHELNKIF 441 (458) Q Consensus 395 ~~gI~IytI~f~~~~~~~~~~~~~lk~CA--s~~~yy~a~~~~eL~~aF 441 (458) ..||.|-+..++.+. .-.-++++.| ..|..| +++.+.|-.+. T Consensus 599 r~G~q~t~FrLg~Dp----gL~~Fv~qva~rv~G~vv-~pdldglGaaV 642 (652) T COG4867 599 RLGAQVTIFRLGSDP----GLARFIDQVARRVQGRVV-VPDLDGLGAAV 642 (652) T ss_pred HCCCEEEEEEECCCH----HHHHHHHHHHHHHCCEEE-ECCCCHHHHHH T ss_conf 516413677522777----689999999998588488-13822135899 No 73 >pfam10526 NADH_ub_rd_NUML NADH-ubiquinone reductase complex 1 MLRQ subunit. This subunit appears to be a recent vertebrate addition to the NADH-ubiquinone reductase complex 1, acting within the membrane. Its exact function is not known, but it is highly expressed in muscle and neural tissue, indicative of a role in ATP generation.
Probab=28.50 E-value=34 Score=13.25 Aligned_cols=22 Identities=5% Similarity=-0.144 Sum_probs=14.2 Q ss_pred HHHHHHHHHHHHHHHHHHHHHH Q ss_conf 9999999999999999999999 Q gi|254780388|r 29 LMPVMLGVGGMLVDVVRWSYYE 50 (458) Q Consensus 29 ~l~~ll~~~g~aVD~~r~~~~k 50 (458) .|+||++++|++.=.+-.+..| T Consensus 12 ~LIPLfv~ig~g~~gA~~y~~r 33 (80) T pfam10526 12 ALIPLFVFIGAGATGATLYLLR 33 (80) T ss_pred CHHHHHHHHHCCHHHHHHHHHH T ss_conf 1324899994138899999999 No 74 >smart00310 PTBI Phosphotyrosine-binding domain (IRS1-like). Probab=27.91 E-value=34 Score=13.19 Aligned_cols=30 Identities=17% Similarity=0.489 Sum_probs=23.7 Q ss_pred HHHC-CCCCE-EEECCHHHHHHHHHHHHHHHH Q ss_conf 8606-89837-882998999999999987587 Q gi|254780388|r 420 SNCA-SPNSF-FEANSTHELNKIFRDRIGNEI 449 (458) Q Consensus 420 k~CA-s~~~y-y~a~~~~eL~~aF~~~i~~~~ 449 (458) +.|. ++|.| |....+++|.++++++|..+. T Consensus 66 Rrc~~G~G~f~f~t~~~~~i~~~v~~am~a~k 97 (98) T smart00310 66 RRCVSGPGEFTFQTVVAQEIFQLVLEAMQAQS 97 (98) T ss_pred CCCCCCCCEEEEECCCHHHHHHHHHHHHHHHC T ss_conf 88788986799982849999999999998602 No 75 >pfam00733 Asn_synthase Asparagine synthase. This family is always found associated with pfam00310. Members of this family catalyse the conversion of aspartate to asparagine. Probab=27.80 E-value=35 Score=13.17 Aligned_cols=35 Identities=17% Similarity=0.179 Sum_probs=17.2 Q ss_pred HHHHHHHCC-CCCEEE--ECCHHHHHHHHHHHHHHHHH Q ss_conf 899986068-983788--29989999999999875875 Q gi|254780388|r 416 RYFLSNCAS-PNSFFE--ANSTHELNKIFRDRIGNEIF 450 (458) Q Consensus 416 ~~~lk~CAs-~~~yy~--a~~~~eL~~aF~~~i~~~~~ 450 (458) ..+++-|-+ |..+.. -.+.-=|.++++.++-+++. 
T Consensus 154 ~~lv~~~~~ip~~~k~~~~~~K~iLR~a~~~~lP~~i~ 191 (195) T pfam00733 154 HRLVEFALSLPPELKLRDGEEKYILREAARGILPDEIL 191 (195) T ss_pred HHHHHHHHHCCHHHHCCCCCCHHHHHHHHHCCCCHHHH T ss_conf 79999999499999479999889999998671999996 No 76 >KOG2487 consensus Probab=27.54 E-value=35 Score=13.15 Aligned_cols=48 Identities=15% Similarity=0.225 Sum_probs=34.6 Q ss_pred HHHHCCCEEEEEEECCCCCCCCHHHHHHHHHC--CCCCEEEECCHHHHHHHHHHHHH Q ss_conf 99987968999994378874311789998606--89837882998999999999987 Q gi|254780388|r 392 KAKSQGIRIMTIAFSVNKTQQEKARYFLSNCA--SPNSFFEANSTHELNKIFRDRIG 446 (458) Q Consensus 392 ~~K~~gI~IytI~f~~~~~~~~~~~~~lk~CA--s~~~yy~a~~~~eL~~aF~~~i~ 446 (458) .+..++|.|-.+-.+.++ .+|++|+ ++|.|-++.+.+.|....-.... T Consensus 191 aAqKq~I~Idv~~l~~~s-------~~LqQa~D~TGG~YL~v~~~~gLLqyLlt~~~ 240 (314) T KOG2487 191 AAQKQNIPIDVVSLGGDS-------GFLQQACDITGGDYLHVEKPDGLLQYLLTLLL 240 (314) T ss_pred HHHHCCCEEEEEEECCCC-------HHHHHHHHHCCCEEEECCCCCHHHHHHHHHHC T ss_conf 787539615899956984-------39999875028704714885259999999846 No 77 >pfam02174 IRS PTB domain (IRS-1 type). Probab=26.80 E-value=36 Score=13.06 Aligned_cols=28 Identities=14% Similarity=0.452 Sum_probs=21.9 Q ss_pred HHHC-CCCCE-EEECCHHHHHHHHHHHHHH Q ss_conf 8606-89837-8829989999999999875 Q gi|254780388|r 420 SNCA-SPNSF-FEANSTHELNKIFRDRIGN 447 (458) Q Consensus 420 k~CA-s~~~y-y~a~~~~eL~~aF~~~i~~ 447 (458) +.|. ++|.| |...+++||.+.+++++.. T Consensus 67 r~c~~G~G~f~f~t~~~~~i~~~v~~~m~a 96 (99) T pfam02174 67 RRCVTGEGEFTFQTDDAEEIFETVQAAMKA 96 (99) T ss_pred CCCCCCCCEEEEECCCHHHHHHHHHHHHHH T ss_conf 888889827999838899999999999997 No 78 >cd01203 DOK_PTB Downstream of tyrosine kinase (DOK) Phosphotyrosine-binding domain. This domain has a PH-like fold and is similar to the PTB domain that is found in insulin receptor substrate molecules. The DOK family of eukaryotic signaling molecules have an N-terminal PH domain, followed by an IRS-like PTB domain.
This PTBi domain is shorter than the PTB domain which is found in SHC, Numb and other proteins. The PTBi domain binds to phosphotyrosines which are in NPXpY motifs. Probab=23.78 E-value=41 Score=12.71 Aligned_cols=31 Identities=19% Similarity=0.391 Sum_probs=24.9 Q ss_pred HHHCC-CCCE-EEECCHHHHHHHHHHHHHHHHH Q ss_conf 86068-9837-8829989999999999875875 Q gi|254780388|r 420 SNCAS-PNSF-FEANSTHELNKIFRDRIGNEIF 450 (458) Q Consensus 420 k~CAs-~~~y-y~a~~~~eL~~aF~~~i~~~~~ 450 (458) +.|.+ +|.| |....++||-.+.+..|..+-. T Consensus 67 RrC~~GeG~f~F~t~~~~~If~~v~~~i~~qk~ 99 (104) T cd01203 67 RRCTSGEGVFTFDTTQGNEIFRAVEAAIKSQKK 99 (104) T ss_pred CCCCCCCCEEEEECCCHHHHHHHHHHHHHHHHH T ss_conf 878989977999549999999999999999984 No 79 >TIGR00873 gnd 6-phosphogluconate dehydrogenase, decarboxylating; InterPro: IPR006113 6-Phosphogluconate dehydrogenase (1.1.1.44 from EC) (6PGD) is an oxidative carboxylase that catalyses the decarboxylating reduction of 6-phosphogluconate into ribulose 5-phosphate in the presence of NADP. This reaction is a component of the hexose mono-phosphate shunt and pentose phosphate pathways (PPP). Prokaryotic and eukaryotic 6PGD are proteins of about 470 amino acids whose sequences are highly conserved. The protein is a homodimer in which the monomers act independently: each contains a large, mainly alpha-helical domain and a smaller beta-alpha-beta domain, containing a mixed parallel and anti-parallel 6-stranded beta sheet. NADP is bound in a cleft in the small domain, the substrate binding in an adjacent pocket. This model does not specify whether the cofactor is NADP only, NAD only, or both.; GO: 0004616 phosphogluconate dehydrogenase (decarboxylating) activity, 0050661 NADP binding, 0006098 pentose-phosphate shunt.
Probab=23.60 E-value=41 Score=12.68 Aligned_cols=22 Identities=18% Similarity=0.196 Sum_probs=10.7 Q ss_pred HHHHHHHHHHCCCEEEEEEECCCC Q ss_conf 899999999879689999943788 Q gi|254780388|r 386 GIAICNKAKSQGIRIMTIAFSVNK 409 (458) Q Consensus 386 ~~~~C~~~K~~gI~IytI~f~~~~ 409 (458) +.--|+.+|++||. +||.|+++ T Consensus 107 T~RR~~eL~~~Gi~--FvG~GvSG 128 (480) T TIGR00873 107 TERRYKELKAKGIL--FVGVGVSG 128 (480) T ss_pred HHHHHHHHHHCCCC--EEEEEEEC T ss_conf 57899999864981--67301324 No 80 >pfam04056 Ssl1 Ssl1-like. Ssl1-like proteins are 40kDa subunits of the Transcription factor II H complex. Probab=23.42 E-value=42 Score=12.66 Aligned_cols=154 Identities=11% Similarity=0.131 Sum_probs=81.4 Q ss_pred CCCHHHHHHHHHHHCCCCCCCCCCCCCEEEEEEECCCCCCCCCCCCCHHHHHHHHHHHHHHCCCCCCCCCCHHHHHHHHH Q ss_conf 33066777665221123467756665201211006776655655768121005666656412567777764588999998 Q gi|254780388|r 269 KKKHLVRDALASVIRSIKKIDNVNDTVRMGATFFNDRVISDPSFSWGVHKLIRTIVKTFAIDENEMGSTAINDAMQTAYD 348 (458) Q Consensus 269 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~i~~~~~~g~T~~~~gl~~g~~ 348 (458) +|+.+.-.....++.......|...- .-....|......+.++-+....-.. +. ..-...+.|.=....||.-+.. T Consensus 73 ~R~~~~l~~l~~Fi~efFdqNPiSQl--gii~~rn~~a~~ls~lsgnp~~hi~a-L~-~~~~~~~~G~pSLqN~Le~a~~ 148 (250) T pfam04056 73 SRFACTIKYLETFVEEFFDQNPISQI--GLITCKDGRAHRLTDLTGNPRVHIKA-LK-SLREAECGGDPSLQNALELARA 148 (250) T ss_pred CHHHHHHHHHHHHHHHHHHCCCCCCE--EEEEEECCEEEEEEECCCCHHHHHHH-HH-HHHCCCCCCCHHHHHHHHHHHH T ss_conf 48999999999999998743983022--79999657137833257998999999-99-8740699999208999999998 Q ss_pred HHCCCCCCCCCCCCCCCCCCEEEEEEECCCCCCCCCHHHHHHHHHHHCCCEEEEEEECCCCCCCCHHHHHHHHHC--CCC Q ss_conf 623666654444566777506999996066888854089999999987968999994378874311789998606--898 Q gi|254780388|r 349 TIISSNEDEVHRMKNNLEAKKYIVLLTDGENTQDNEEGIAICNKAKSQGIRIMTIAFSVNKTQQEKARYFLSNCA--SPN 426 (458) Q Consensus 349 ~Ls~~~~~~~~~~~~~~~~~K~iil~TDG~n~~~~~~~~~~C~~~K~~gI~IytI~f~~~~~~~~~~~~~lk~CA--s~~ 426 (458) .|...-. 
...+-++|++.- --+.+...--..-+.+|+.+|++-.|++.. .-.++|..+ +.| T Consensus 149 ~L~~~P~---------~~sREILii~gS-L~T~DPgdI~~tI~~l~~~~IrvsvI~Laa-------Ev~Ick~l~~~T~G 211 (250) T pfam04056 149 SLKHVPS---------HGSREVLIIFGS-LSTCDPGDIYSTIDTLKKEKIRCSVIGLSA-------EVFICKELCKATNG 211 (250) T ss_pred HHHCCCC---------CCCEEEEEEEEE-CCCCCCCCHHHHHHHHHHCCCEEEEEEECH-------HHHHHHHHHHHHCC T ss_conf 8750898---------785489999820-444588659999999997590799987338-------99999999997499 Q ss_pred CEEEECCHHHHHHHHHH Q ss_conf 37882998999999999 Q gi|254780388|r 427 SFFEANSTHELNKIFRD 443 (458) Q Consensus 427 ~yy~a~~~~eL~~aF~~ 443 (458) .|.-+-+..-+.+-+.+ T Consensus 212 ~y~V~lde~Hfk~ll~~ 228 (250) T pfam04056 212 TYSVALDETHLKELLLE 228 (250) T ss_pred EEEEECCHHHHHHHHHH T ss_conf 88875699999999995 No 81 >cd01569 PBEF_like pre-B-cell colony-enhancing factor (PBEF)-like. The mammalian members of this group of nicotinate phosphoribosyltransferases (NAPRTases) were originally identified as genes whose expression is upregulated upon activation in lymphoid cells. In general, nicotinate phosphoribosyltransferase catalyses the formation of NAMN and PPi from 5-phosphoribosy -1-pyrophosphate (PRPP) and nicotinic acid, this is the first, and also rate limiting, reaction in the NAD salvage synthesis. 
Probab=22.40 E-value=44 Score=12.53 Aligned_cols=53 Identities=17% Similarity=0.256 Sum_probs=38.3 Q ss_pred CEEEEEEECCCCCCCCCHHHHHHHHHHHCCCEEEEEEECCCC-----CCCCHHHHHHHHHC Q ss_conf 069999960668888540899999999879689999943788-----74311789998606 Q gi|254780388|r 368 KKYIVLLTDGENTQDNEEGIAICNKAKSQGIRIMTIAFSVNK-----TQQEKARYFLSNCA 423 (458) Q Consensus 368 ~K~iil~TDG~n~~~~~~~~~~C~~~K~~gI~IytI~f~~~~-----~~~~~~~~~lk~CA 423 (458) ..+-++.-||.+ -.+...+|+.+|++|.-.--|.||.++ ..+|+-+-.||.|| T Consensus 325 ~~vriIqGDgI~---~~~i~~Il~~l~~~G~sa~Ni~FG~Gg~llQ~~~RDT~~fA~K~s~ 382 (407) T cd01569 325 PHVRIIQGDGIT---LERIEEILERLKAKGFASENIVFGMGGGLLQKVTRDTQGFAMKASA 382 (407) T ss_pred CCEEEEECCCCC---HHHHHHHHHHHHHCCCCHHHEEEECCHHHHCCCCCCHHHHHHEEEE T ss_conf 742378548869---9999999999997798314226741677750377521234462556 No 82 >COG3552 CoxE Protein containing von Willebrand factor type A (vWA) domain [General function prediction only] Probab=22.37 E-value=29 Score=13.63 Aligned_cols=54 Identities=17% Similarity=0.287 Sum_probs=31.3 Q ss_pred CCCCCCCCHHHHHHHHHHHCCCCCCCCCCCCCCCCCCEEEEEEECCCCCCCCCHHHHHHHHHHH Q ss_conf 6777776458899999862366665444456677750699999606688885408999999998 Q gi|254780388|r 332 NEMGSTAINDAMQTAYDTIISSNEDEVHRMKNNLEAKKYIVLLTDGENTQDNEEGIAICNKAKS 395 (458) Q Consensus 332 ~~~g~T~~~~gl~~g~~~Ls~~~~~~~~~~~~~~~~~K~iil~TDG~n~~~~~~~~~~C~~~K~ 395 (458) ...|+|.++.-+.- +...|... .-...-+++++|||-............+.+.. 
T Consensus 286 dw~ggTrig~tl~a----F~~~~~~~------~L~~gA~VlilsDg~drd~~~~l~~~~~rl~r 339 (395) T COG3552 286 DWDGGTRIGNTLAA----FLRRWHGN------VLSGGAVVLILSDGLDRDDIPELVTAMARLRR 339 (395) T ss_pred CCCCCCCHHHHHHH----HHCCCCCC------CCCCCEEEEEEECCCCCCCCHHHHHHHHHHHH T ss_conf 33577623489999----97154444------55786179997054224781579999999998 No 83 >COG4547 CobT Cobalamin biosynthesis protein CobT (nicotinate-mononucleotide:5, 6-dimethylbenzimidazole phosphoribosyltransferase) [Coenzyme metabolism] Probab=22.29 E-value=44 Score=12.52 Aligned_cols=82 Identities=17% Similarity=0.306 Sum_probs=46.2 Q ss_pred CHHHHHHHHHHHCCCCCCCCCCCCCCCCCCEEEEEEECCCCCC--------CCC---HHHHHHHHHHH-CCCEEEEEEEC Q ss_conf 4588999998623666654444566777506999996066888--------854---08999999998-79689999943 Q gi|254780388|r 339 INDAMQTAYDTIISSNEDEVHRMKNNLEAKKYIVLLTDGENTQ--------DNE---EGIAICNKAKS-QGIRIMTIAFS 406 (458) Q Consensus 339 ~~~gl~~g~~~Ls~~~~~~~~~~~~~~~~~K~iil~TDG~n~~--------~~~---~~~~~C~~~K~-~gI~IytI~f~ 406 (458) -++.++|+-+.|.. ...-+|++.+|+||..-. ++- ...++-+.|.. 
..|++..||++ T Consensus 519 DGEal~wah~rl~g-----------RpEqrkIlmmiSDGAPvddstlsvnpGnylerHLRaVieeIEtrSpveLlAIGig 587 (620) T COG4547 519 DGEALMWAHQRLIG-----------RPEQRKILMMISDGAPVDDSTLSVNPGNYLERHLRAVIEEIETRSPVELLAIGIG 587 (620) T ss_pred CHHHHHHHHHHHHC-----------CHHHCEEEEEECCCCCCCCCCCCCCCCHHHHHHHHHHHHHHHCCCCHHHEEEECC T ss_conf 71999999998735-----------8424137888348985555434558860799999999999703784033033125 Q ss_pred CCCCCCCHHHHHHHHHCCCCCEEEECCHHHHHHHHHH Q ss_conf 7887431178999860689837882998999999999 Q gi|254780388|r 407 VNKTQQEKARYFLSNCASPNSFFEANSTHELNKIFRD 443 (458) Q Consensus 407 ~~~~~~~~~~~~lk~CAs~~~yy~a~~~~eL~~aF~~ 443 (458) -+- .+..-+ -|.-.+.+||--+.-+ T Consensus 588 hDv-----tRyYrr-------avtiVdaeeL~gamte 612 (620) T COG4547 588 HDV-----TRYYRR-------AVTIVDAEELAGAMTE 612 (620) T ss_pred CCC-----CHHHHH-------HEEEECHHHHCHHHHH T ss_conf 553-----066662-------0137428885658999 No 84 >TIGR00385 dsbE periplasmic protein thiol:disulfide oxidoreductases, DsbE subfamily; InterPro: IPR004799 Periplasmic protein thiol:disulphide oxidoreductase is involved in the biogenesis of c-type cytochromes as well as in disulphide bond formation in some periplasmic proteins. This group defines the DsbE subfamily.; GO: 0015036 disulfide oxidoreductase activity, 0017004 cytochrome complex assembly, 0030288 outer membrane-bounded periplasmic space. Probab=22.28 E-value=42 Score=12.67 Aligned_cols=18 Identities=17% Similarity=0.235 Sum_probs=13.0 Q ss_pred HHHHHHHHHHHHHHHHHH Q ss_conf 999999999999999999 Q gi|254780388|r 28 LLMPVMLGVGGMLVDVVR 45 (458) Q Consensus 28 l~l~~ll~~~g~aVD~~r 45 (458) |+|+|+++|+|+|+=... 
T Consensus 1 L~L~Pli~F~~ia~~~~~ 18 (175) T TIGR00385 1 LALLPLIIFLGIAVAILW 18 (175) T ss_pred CCHHHHHHHHHHHHHHHH T ss_conf 921457899999999999 No 85 >PRK04214 rbn ribonuclease BN/unknown domain fusion protein; Reviewed Probab=22.26 E-value=44 Score=12.52 Aligned_cols=37 Identities=11% Similarity=0.031 Sum_probs=26.0 Q ss_pred HHHHHHHHHHHHCC----CCCHHHHHHHHHHHHHHHHHHHH Q ss_conf 78999999874013----87279999999999999999999 Q gi|254780388|r 5 TKFIFYSKKLIKSC----TGHFFIITALLMPVMLGVGGMLV 41 (458) Q Consensus 5 ~~~~~~~~rf~~d~----~G~vaiifal~l~~ll~~~g~aV 41 (458) +=.++..|||.+|| -++.+-.+.|+++|++.++-..+ T Consensus 17 ~F~r~~~rrF~~dr~~~~AAsLtf~tlLALvPlL~v~~sl~ 57 (411) T PRK04214 17 SFGRFLWRRFLDDRLFQAAASLTFTTLLALVPLATVVFGVL 57 (411) T ss_pred HHHHHHHHHHHHCCHHHHHHHHHHHHHHHHHHHHHHHHHHH T ss_conf 99999999975467988999999999999999999999999 No 86 >TIGR00263 trpB tryptophan synthase, beta subunit; InterPro: IPR006654 Tryptophan synthase catalyzes the last step in the biosynthesis of tryptophan: L-serine + 1-(indol-3-yl)glycerol 3-phosphate = L-tryptophan + glyceraldehyde 3-phosphate + H_2O. It has two functional domains, each found in bacteria and plants on a separate subunit: alpha chain (IPR002028 from INTERPRO) is for the aldol cleavage of indoleglycerol phosphate to indole and glyceraldehyde 3-phosphate and beta chain is for the synthesis of tryptophan from indole and serine. In fungi the two domains are fused together on a single multifunctional protein. The beta chain of the enzyme, represented here, requires pyridoxal-phosphate as a cofactor. The pyridoxal-phosphate group is attached to a lysine residue. The region around this lysine residue also contains two histidine residues which are part of the pyridoxal-phosphate binding site.; GO: 0004834 tryptophan synthase activity, 0006568 tryptophan metabolic process.
Probab=22.15 E-value=44 Score=12.50 Aligned_cols=44 Identities=14% Similarity=0.237 Sum_probs=25.9 Q ss_pred CEEEEE--EECCCCCCCCHHHHHHHHHCCCCC-EEEECCHHHHHHHHHHHHHH Q ss_conf 689999--943788743117899986068983-78829989999999999875 Q gi|254780388|r 398 IRIMTI--AFSVNKTQQEKARYFLSNCASPNS-FFEANSTHELNKIFRDRIGN 447 (458) Q Consensus 398 I~IytI--~f~~~~~~~~~~~~~lk~CAs~~~-yy~a~~~~eL~~aF~~~i~~ 447 (458) -+.|.| |+.+++-++.= ..|. =.|+ -|++.+-+|=.+||+- +.+ T Consensus 317 ~~~hSvSAGLDYPGVGP~H--A~L~---~~GRa~Y~~iTD~EAl~AF~~-l~~ 363 (412) T TIGR00263 317 LEAHSVSAGLDYPGVGPEH--AYLH---ETGRAEYEAITDDEALEAFKL-LSR 363 (412) T ss_pred CCCEEEEEECCCCCCCHHH--HHHH---CCCCEEEEECCHHHHHHHHHH-HHH T ss_conf 3211278515788868677--8875---038756630688999999999-877 No 87 >PRK05434 phosphoglyceromutase; Provisional Probab=21.86 E-value=45 Score=12.47 Aligned_cols=19 Identities=5% Similarity=0.198 Sum_probs=7.4 Q ss_pred HHHHHHHHHCCCEEEEEEE Q ss_conf 9999999987968999994 Q gi|254780388|r 387 IAICNKAKSQGIRIMTIAF 405 (458) Q Consensus 387 ~~~C~~~K~~gI~IytI~f 405 (458) ..+++.+++.......+-| T Consensus 378 d~~i~ai~~~~yd~i~~Nf 396 (511) T PRK05434 378 DKLVEAIESGKYDLIIVNY 396 (511) T ss_pred HHHHHHHHCCCCCEEEEEC T ss_conf 9999999748998899935 No 88 >pfam06508 ExsB ExsB. This family includes putative transcriptional regulators from Bacteria and Archaea. Probab=21.71 E-value=45 Score=12.45 Aligned_cols=18 Identities=17% Similarity=0.156 Sum_probs=7.6 Q ss_pred HHHHHHCCCEEEEEEECC Q ss_conf 999998796899999437 Q gi|254780388|r 390 CNKAKSQGIRIMTIAFSV 407 (458) Q Consensus 390 C~~~K~~gI~IytI~f~~ 407 (458) +..+...|+.--.+|+.. 
T Consensus 105 ~a~A~~~g~~~I~~G~~~ 122 (137) T pfam06508 105 ASYAEAIGANDIFIGVNE 122 (137) T ss_pred HHHHHHCCCCEEEEEECC T ss_conf 999998699979995655 No 89 >KOG0394 consensus Probab=21.67 E-value=45 Score=12.44 Aligned_cols=32 Identities=19% Similarity=0.413 Sum_probs=17.6 Q ss_pred HHHHHHHCCCCC--EEE--ECCHHHHHHHHHHHHHH Q ss_conf 899986068983--788--29989999999999875 Q gi|254780388|r 416 RYFLSNCASPNS--FFE--ANSTHELNKIFRDRIGN 447 (458) Q Consensus 416 ~~~lk~CAs~~~--yy~--a~~~~eL~~aF~~~i~~ 447 (458) +.....|++.++ ||+ |.++--..+||++|..+ T Consensus 141 ~~Aq~WC~s~gnipyfEtSAK~~~NV~~AFe~ia~~ 176 (210) T KOG0394 141 KKAQTWCKSKGNIPYFETSAKEATNVDEAFEEIARR 176 (210) T ss_pred HHHHHHHHHCCCCEEEEECCCCCCCHHHHHHHHHHH T ss_conf 899999986599506871024344689999999999 No 90 >PRK09198 putative nicotinate phosphoribosyltransferase; Provisional Probab=20.77 E-value=47 Score=12.32 Aligned_cols=39 Identities=26% Similarity=0.250 Sum_probs=31.4 Q ss_pred CEEEEEEECCCCCCCCCHHHHHHHHHHHCCCEEEEEEECCCC Q ss_conf 069999960668888540899999999879689999943788 Q gi|254780388|r 368 KKYIVLLTDGENTQDNEEGIAICNKAKSQGIRIMTIAFSVNK 409 (458) Q Consensus 368 ~K~iil~TDG~n~~~~~~~~~~C~~~K~~gI~IytI~f~~~~ 409 (458) .++-++.-||.+ -.+...+|+.+|++|..+--|.||.++ T Consensus 327 ~~vrvIqGDgI~---~~~i~~Il~~l~~~G~sa~Ni~FG~Gg 365 (462) T PRK09198 327 KHVGLIQGDGIT---LERIEAILEALKAKGFAADNIVFGMGG 365 (462) T ss_pred CCEEEEECCCCC---HHHHHHHHHHHHHCCCCCCCEEEECCH T ss_conf 642168558739---999999999999759864243551153 No 91 >KOG2884 consensus Probab=20.70 E-value=47 Score=12.31 Aligned_cols=161 Identities=14% Similarity=0.090 Sum_probs=96.1 Q ss_pred CCCCCCCCHHHHHHHHHHHCCCCCCCCCCCCCEEEEEEECCCCCCCCCCCCCHHHHHHHHHHHHHHCCCCCCCCCCHHHH Q ss_conf 77754330667776652211234677566652012110067766556557681210056666564125677777645889 Q gi|254780388|r 264 LRHVIKKKHLVRDALASVIRSIKKIDNVNDTVRMGATFFNDRVISDPSFSWGVHKLIRTIVKTFAIDENEMGSTAINDAM 343 (458) Q Consensus 264 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~i~~~~~~g~T~~~~gl 343 (458) 
.++..+|+.+-++++..+........|.+........ +..+...+.++. +...+......+.+.|.-+...|| T Consensus 19 gDy~PtRf~aQ~daVn~v~~~K~~snpEntvGiitla--~a~~~vLsT~T~-----d~gkils~lh~i~~~g~~~~~~~i 91 (259) T KOG2884 19 GDYLPTRFQAQKDAVNLVCQAKLRSNPENTVGIITLA--NASVQVLSTLTS-----DRGKILSKLHGIQPHGKANFMTGI 91 (259) T ss_pred CCCCHHHHHHHHHHHHHHHHHHHCCCCCCCEEEEECC--CCCCEEEEECCC-----CCHHHHHHHCCCCCCCCCCHHHHH T ss_conf 8977188898899999998755027954315468636--898504430343-----004898773277857761288889 Q ss_pred HHHHHHHCCCCCCCCCCCCCCCC-CCEEEEEEECCCCCCCCCHHHHHHHHHHHCCCEEEEEEECCCCCCCCHHH---HHH Q ss_conf 99998623666654444566777-50699999606688885408999999998796899999437887431178---999 Q gi|254780388|r 344 QTAYDTIISSNEDEVHRMKNNLE-AKKYIVLLTDGENTQDNEEGIAICNKAKSQGIRIMTIAFSVNKTQQEKAR---YFL 419 (458) Q Consensus 344 ~~g~~~Ls~~~~~~~~~~~~~~~-~~K~iil~TDG~n~~~~~~~~~~C~~~K~~gI~IytI~f~~~~~~~~~~~---~~l 419 (458) ..+--.|-.. .++. .+++++|+---.... .-....+...+|.+++-|-.|-|+-..+...... +.+ T Consensus 92 ~iA~lalkhR---------qnk~~~~riVvFvGSpi~e~-ekeLv~~akrlkk~~Vaidii~FGE~~~~~e~l~~fida~ 161 (259) T KOG2884 92 QIAQLALKHR---------QNKNQKQRIVVFVGSPIEES-EKELVKLAKRLKKNKVAIDIINFGEAENNTEKLFEFIDAL 161 (259) T ss_pred HHHHHHHHHH---------CCCCCCEEEEEEECCCCHHH-HHHHHHHHHHHHHCCEEEEEEEECCCCCCHHHHHHHHHHH T ss_conf 9999998710---------38886369999936832233-8999999999875480278987243433378899999985 Q ss_pred HHHCCCCC-EEEECCHHHHHHHHH Q ss_conf 86068983-788299899999999 Q gi|254780388|r 420 SNCASPNS-FFEANSTHELNKIFR 442 (458) Q Consensus 420 k~CAs~~~-yy~a~~~~eL~~aF~ 442 (458) ..- +++. --.++.+.-|.++.. T Consensus 162 N~~-~~gshlv~Vppg~~L~d~l~ 184 (259) T KOG2884 162 NGK-GDGSHLVSVPPGPLLSDALL 184 (259) T ss_pred CCC-CCCCEEEEECCCCCHHHHHH T ss_conf 389-88744898589840777764 No 92 >pfam03850 Tfb4 Transcription factor Tfb4. 
Probab=20.17 E-value=48 Score=12.24 Aligned_cols=53 Identities=13% Similarity=0.145 Sum_probs=38.1 Q ss_pred HHHHHHHHHCCCEEEEEEECCCCCCCCHHHHHHHHHC--CCCCEEEECCHHHHHHHHHHHH Q ss_conf 9999999987968999994378874311789998606--8983788299899999999998 Q gi|254780388|r 387 IAICNKAKSQGIRIMTIAFSVNKTQQEKARYFLSNCA--SPNSFFEANSTHELNKIFRDRI 445 (458) Q Consensus 387 ~~~C~~~K~~gI~IytI~f~~~~~~~~~~~~~lk~CA--s~~~yy~a~~~~eL~~aF~~~i 445 (458) ...--.++..+|.|-+..++... ..+||+.+ +.|.|+.++..+.|....-..+ T Consensus 162 MN~iFaAqk~~I~IDvc~L~~~~------s~fLQQA~diT~G~Yl~~~~~~gLlQyL~~~f 216 (271) T pfam03850 162 MNSIFAAQKLKIPIDVCKLGGED------SSFLQQAADITGGVYLHVTEPDGLLQYLMTAF 216 (271) T ss_pred HHHHHHHHHCCCEEEEEEECCCC------CHHHHHHHHHHCCEEECCCCCCHHHHHHHHHH T ss_conf 99999998559747999936998------58999999974977751478333899999996 Done!