Query gi|254781110|ref|YP_003065523.1| von Willebrand factor type A [Candidatus Liberibacter asiaticus str. psy62]
Match_columns 420
No_of_seqs 153 out of 724
Neff 9.9
Searched_HMMs 39220
Date Mon May 30 05:51:04 2011
Command /home/congqian_1/programs/hhpred/hhsearch -i 254781110.hhm -d /home/congqian_1/database/cdd/Cdd.hhm

No Hit Prob E-value P-value Score SS Cols Query HMM Template HMM
1 PRK13685 hypothetical protein; 99.9 2.8E-20 7.1E-25 150.6 19.5 170 225-414 108-295 (326)
2 cd01467 vWA_BatA_type VWA BatA 99.8 1.8E-19 4.6E-24 145.3 17.3 167 191-398 3-180 (180)
3 cd01465 vWA_subgroup VWA subgr 99.8 1.2E-18 3.1E-23 139.9 16.5 166 191-402 1-170 (170)
4 cd01461 vWA_interalpha_trypsin 99.7 3.5E-16 9E-21 123.9 18.2 166 190-402 2-169 (171)
5 cd01466 vWA_C3HC4_type VWA C3H 99.7 2.1E-15 5.4E-20 118.8 16.6 152 192-393 2-155 (155)
6 cd01475 vWA_Matrilin VWA_Matri 99.7 3.8E-14 9.7E-19 110.6 19.3 179 190-410 2-184 (224)
7 cd01480 vWA_collagen_alpha_1-V 99.7 2.3E-14 5.8E-19 112.1 17.4 172 190-402 2-179 (186)
8 cd01470 vWA_complement_factors 99.6 4.8E-14 1.2E-18 110.0 16.4 179 191-403 1-198 (198)
9 cd01474 vWA_ATR ATR (Anthrax T 99.6 1.4E-13 3.5E-18 107.0 18.7 177 190-409 4-181 (185)
10 cd01456 vWA_ywmD_type VWA ywmD 99.6 1E-13 2.6E-18 107.9 15.3 164 187-396 17-204 (206)
11 cd01451 vWA_Magnesium_chelatas 99.6 2.9E-13 7.4E-18 104.9 15.9 168 192-401 2-175 (178)
12 cd01463 vWA_VGCC_like VWA Volt 99.6 7.5E-13 1.9E-17 102.2 16.7 170 188-395 11-189 (190)
13 pfam00092 VWA von Willebrand f 99.6 8.4E-13 2.1E-17 101.9 16.7 172 192-405 1-177 (177)
14 cd01469 vWA_integrins_alpha_su 99.5 9E-13 2.3E-17 101.7 16.4 169 191-400 1-176 (177)
15 cd01473 vWA_CTRP CTRP for CS 99.5 1.3E-11 3.4E-16 94.1 20.1 179 191-409 1-190 (192)
16 cd01472 vWA_collagen von Wille 99.5 2.9E-12 7.4E-17 98.4 16.4 159 191-394 1-163 (164)
17 cd01464 vWA_subfamily VWA subf 99.5 1E-12 2.6E-17 101.3 14.0 172 190-402 3-174 (176)
18 cd01471 vWA_micronemal_protein 99.5 1.3E-11 3.2E-16 94.2 18.9 173 191-403 1-183 (186)
19 cd01450 vWFA_subfamily_ECM Von 99.5 4.7E-12 1.2E-16 97.0 16.0 154 191-387 1-156 (161)
20 COG4961 TadG Flp pilus assembl 99.5 4.5E-12 1.1E-16 97.1 15.5 71 5-75 6-76 (185)
21 TIGR00868 hCaCC calcium-activa 99.5 4E-13 1E-17 104.0 9.7 180 192-419 309-496 (874)
22 smart00327 VWA von Willebrand 99.4 1.7E-11 4.4E-16 93.3 16.4 160 191-391 2-164 (177)
23 cd01481 vWA_collagen_alpha3-VI 99.4 3.3E-11 8.4E-16 91.5 17.4 161 192-394 2-164 (165)
24 cd01482 vWA_collagen_alphaI-XI 99.4 5.8E-11 1.5E-15 89.9 16.4 159 191-394 1-163 (164)
25 cd00198 vWFA Von Willebrand fa 99.4 4.3E-11 1.1E-15 90.8 15.7 151 192-385 2-154 (161)
26 TIGR03436 acidobact_VWFA VWFA- 99.3 4.4E-10 1.1E-14 84.2 18.3 180 189-412 52-257 (296)
27 cd01476 VWA_integrin_invertebr 99.3 6.6E-11 1.7E-15 89.5 14.0 156 191-391 1-162 (163)
28 cd01462 VWA_YIEM_type VWA YIEM 99.2 1.8E-09 4.6E-14 80.2 14.5 144 192-384 2-146 (152)
29 COG1240 ChlD Mg-chelatase subu 99.0 2.8E-08 7E-13 72.5 14.5 169 188-397 76-249 (261)
30 cd01454 vWA_norD_type norD typ 99.0 1.2E-07 3.2E-12 68.2 15.8 149 193-383 3-166 (174)
31 cd01477 vWA_F09G8-8_type VWA F 98.9 1.2E-07 3.1E-12 68.3 15.7 166 189-390 18-191 (193)
32 COG4245 TerY Uncharacterized p 98.8 1.3E-07 3.2E-12 68.2 10.6 187 190-416 3-191 (207)
33 cd01453 vWA_transcription_fact 98.4 0.0001 2.6E-09 49.2 17.1 172 192-404 5-177 (183)
34 PRK13406 bchD magnesium chelat 98.4 9.9E-05 2.5E-09 49.3 16.2 169 189-403 400-580 (584)
35 TIGR02442 Cob-chelat-sub cobal 98.3 1.6E-05 4.2E-10 54.4 11.5 162 189-391 507-686 (688)
36 TIGR02031 BchD-ChlD magnesium 98.1 9.9E-05 2.5E-09 49.3 10.9 167 189-396 509-700 (705)
37 KOG2353 consensus 98.0 0.00073 1.9E-08 43.7 14.0 177 188-407 223-410 (1104)
38 cd01457 vWA_ORF176_type VWA OR 97.8 0.0011 2.8E-08 42.5 12.7 155 192-381 4-161 (199)
39 COG4548 NorD Nitric oxide redu 97.6 0.0022 5.6E-08 40.6 11.2 106 283-409 524-635 (637)
40 COG4655 Predicted membrane pro 97.5 7.6E-05 1.9E-09 50.1 3.5 56 13-68 3-58 (565)
41 cd01452 VWA_26S_proteasome_sub 97.4 0.0071 1.8E-07 37.3 12.3 157 221-401 19-182 (187)
42 pfam04056 Ssl1 Ssl1-like. Ssl1 97.4 0.011 2.7E-07 36.1 17.1 174 192-404 54-228 (250)
43 pfam11775 CobT_C Cobalamin bio 97.3 0.013 3.2E-07 35.7 14.7 92 291-408 115-217 (220)
44 COG2425 Uncharacterized protei 97.3 0.0078 2E-07 37.0 11.7 162 191-405 273-435 (437)
45 KOG2807 consensus 96.8 0.044 1.1E-06 32.1 14.6 173 192-405 62-235 (378)
46 COG2304 Uncharacterized protei 96.7 0.019 4.7E-07 34.5 9.1 160 187-389 34-195 (399)
47 pfam07811 TadE TadE-like prote 96.5 0.0087 2.2E-07 36.7 6.3 42 20-61 1-42 (43)
48 cd01455 vWA_F11C1-5a_type Von 96.2 0.092 2.3E-06 30.0 16.3 180 193-408 3-188 (191)
49 KOG2884 consensus 95.4 0.19 4.8E-06 28.0 13.9 158 221-403 19-184 (259)
50 pfam05762 VWA_CoxE VWA domain 92.8 0.67 1.7E-05 24.4 11.5 96 253-368 90-187 (223)
51 COG4547 CobT Cobalamin biosynt 90.8 1.1 2.8E-05 23.0 7.4 65 292-374 517-591 (620)
52 COG4867 Uncharacterized protei 90.2 1.2 3.2E-05 22.7 10.3 93 290-401 531-641 (652)
53 COG3847 Flp Flp pilus assembly 89.3 1.5 3.7E-05 22.2 6.8 28 8-35 2-29 (58)
54 cd01460 vWA_midasin VWA_Midasi 88.9 1.6 4E-05 22.0 14.7 184 187-409 57-259 (266)
55 COG4726 PilX Tfp pilus assembl 87.4 1.9 4.9E-05 21.4 5.7 57 13-70 7-71 (196)
56 pfam00362 Integrin_beta Integr 82.8 3.1 8E-05 20.0 14.3 129 265-410 180-344 (424)
57 smart00187 INB Integrin beta s 79.1 4.2 0.00011 19.2 14.3 130 263-409 177-342 (423)
58 KOG2487 consensus 75.8 5.2 0.00013 18.6 5.1 47 356-404 191-237 (314)
59 cd01458 vWA_ku Ku70/Ku80 N-ter 73.9 5.8 0.00015 18.3 14.8 160 193-384 4-188 (218)
60 pfam09967 DUF2201 Predicted me 71.1 6.7 0.00017 17.9 7.0 46 253-303 322-367 (412)
61 TIGR02877 spore_yhbH sporulati 67.1 8.1 0.00021 17.4 8.7 106 285-405 274-391 (392)
62 COG2984 ABC-type uncharacteriz 65.5 8.6 0.00022 17.2 9.3 87 326-416 156-242 (322)
63 COG5242 TFB4 RNA polymerase II 62.6 9.8 0.00025 16.8 5.4 48 356-405 178-225 (296)
64 pfam02060 ISK_Channel Slow vol 60.6 11 0.00027 16.6 3.7 40 11-50 31-70 (129)
65 TIGR01651 CobT cobaltochelatas 60.3 5.1 0.00013 18.7 1.6 65 292-374 502-576 (606)
66 PRK05325 hypothetical protein; 59.6 11 0.00028 16.5 9.2 109 285-409 294-410 (414)
67 PRK10997 yieM hypothetical pro 58.1 12 0.0003 16.3 13.0 145 187-380 317-462 (484)
68 pfam03850 Tfb4 Transcription f 56.4 12 0.00032 16.1 7.0 73 332-406 144-216 (271)
69 PRK06939 2-amino-3-ketobutyrat 55.8 5.3 0.00014 18.6 1.1 60 348-410 336-395 (395)
70 COG1991 Uncharacterized conser 50.4 4.5 0.00011 19.0 0.0 33 10-42 4-36 (131)
71 PRK06007 fliF flagellar MS-rin 49.7 16 0.0004 15.5 3.1 39 2-40 7-45 (540)
72 PRK12938 acetyacetyl-CoA reduc 49.2 16 0.00041 15.4 7.1 23 350-372 164-186 (246)
73 COG5151 SSL1 RNA polymerase II 47.3 17 0.00044 15.2 14.3 173 192-405 89-266 (421)
74 TIGR00627 tfb4 transcription f 44.8 19 0.00048 15.0 4.7 47 356-403 185-232 (295)
75 pfam01482 DUF13 DUF13. This do 44.1 17 0.00044 15.2 2.2 47 359-406 21-68 (87)
76 pfam04392 ABC_sub_bind ABC tra 43.2 20 0.00051 14.8 8.7 14 357-370 155-168 (292)
77 pfam04964 Flp_Fap Flp/Fap pili 42.3 21 0.00052 14.7 5.4 28 14-41 1-28 (47)
78 PRK12824 acetoacetyl-CoA reduc 41.4 21 0.00054 14.6 8.8 24 350-373 163-186 (245)
79 TIGR00802 nico transition meta 38.9 23 0.00059 14.4 3.2 29 17-45 181-209 (290)
80 pfam03731 Ku_N Ku70/Ku80 N-ter 38.8 23 0.00059 14.4 14.9 186 193-409 2-216 (222)
81 COG5148 RPN10 26S proteasome r 38.8 23 0.00059 14.4 11.6 132 221-378 19-153 (243)
82 pfam04285 DUF444 Protein of un 38.0 24 0.00061 14.3 9.6 105 286-407 307-417 (421)
83 pfam06707 DUF1194 Protein of u 37.3 24 0.00062 14.2 16.2 143 252-412 49-205 (206)
84 pfam11443 DUF2828 Domain of un 34.9 27 0.00068 14.0 11.0 141 191-369 327-472 (524)
85 pfam05814 DUF843 Baculovirus p 34.3 27 0.0007 13.9 5.9 45 14-59 17-75 (83)
86 PRK07806 short chain dehydroge 34.0 28 0.0007 13.9 6.9 18 351-368 165-182 (248)
87 TIGR00937 2A51 chromate transp 33.2 28 0.00072 13.8 6.5 61 6-66 52-115 (390)
88 PRK13392 5-aminolevulinate syn 33.1 22 0.00057 14.5 1.4 61 348-411 339-400 (410)
89 PRK05557 fabG 3-ketoacyl-(acyl 31.6 30 0.00077 13.6 7.4 23 350-372 166-188 (248)
90 cd06325 PBP1_ABC_uncharacteriz 31.5 30 0.00077 13.6 8.4 14 353-366 203-216 (281)
91 TIGR02134 transald_staph trans 31.0 31 0.00079 13.6 2.9 88 324-411 83-197 (237)
92 pfam11812 DUF3333 Domain of un 29.8 19 0.00049 14.9 0.6 43 20-62 19-61 (155)
93 PRK06123 short chain dehydroge 29.3 33 0.00084 13.4 8.7 22 350-371 169-190 (249)
94 pfam11411 DNA_ligase_IV DNA li 29.0 33 0.00085 13.4 2.3 22 386-407 14-35 (36)
95 PRK07505 hypothetical protein; 28.9 34 0.00085 13.3 1.9 59 348-409 345-403 (405)
96 pfam00429 TLV_coat ENV polypro 28.2 34 0.00088 13.3 3.3 28 386-413 480-507 (560)
97 PRK09730 hypothetical protein; 28.1 35 0.00088 13.3 6.7 22 350-371 167-188 (247)
98 PRK05958 8-amino-7-oxononanoat 27.7 35 0.0009 13.2 2.6 57 348-410 329-385 (387)
99 pfam07423 DUF1510 Protein of u 26.7 37 0.00093 13.1 3.2 31 8-38 2-32 (214)
100 COG1681 FlaB Archaeal flagelli 25.2 39 0.00099 12.9 2.6 22 17-38 1-22 (209)
101 PRK10506 hypothetical protein; 24.9 39 0.001 12.9 5.9 45 15-59 1-45 (155)
102 PRK08063 enoyl-(acyl carrier p 24.4 40 0.001 12.8 6.6 24 350-373 165-188 (250)
103 KOG1226 consensus 24.3 40 0.001 12.8 6.8 60 263-341 209-273 (783)
104 pfam06508 ExsB ExsB. This fami 23.4 42 0.0011 12.7 5.4 15 355-369 106-120 (137)
105 pfam09001 DUF1890 Domain of un 23.2 42 0.0011 12.7 3.7 25 359-386 92-116 (138)
106 cd01985 ETF The electron trans 22.9 43 0.0011 12.6 8.7 24 349-372 23-46 (181)
107 cd03132 GATase1_catalase Type 22.9 43 0.0011 12.6 4.5 18 350-367 84-101 (142)
108 PRK07576 short chain dehydroge 22.5 44 0.0011 12.6 7.0 22 350-371 167-188 (260)
109 COG1842 PspA Phage shock prote 21.7 45 0.0012 12.5 2.2 17 1-17 1-17 (225)
110 pfam04917 Shufflon_N Bacterial 20.9 47 0.0012 12.4 6.3 45 15-59 2-46 (356)
111 pfam10526 NADH_ub_rd_NUML NADH 20.1 49 0.0012 12.3 2.6 23 29-51 12-34 (80)
112 PRK12825 fabG 3-ketoacyl-(acyl 20.0 49 0.0013 12.3 7.2 24 350-373 168-191 (250)

No 1
>PRK13685 hypothetical protein; Provisional
Probab=99.88 E-value=2.8e-20 Score=150.59 Aligned_cols=170 Identities=18% Similarity=0.232 Sum_probs=132.6

Q ss_pred CCHHHHHHHHHHHHHHHCCCCCCCCCCCCEEEECCCCCCCCCCCCCCCHHHHHHHHHHHHHCCCCCCCCHHHHHHHHHHH
Q ss_conf 30678899989999851046787655430223204876431244577848899999986404678876538899999986
Q gi|254781110|r 225 RTKMAALKNALLLFLDSIDLLSHVKEDVYMGLIGYTTRVEKNIEPSWGTEKVRQYVTRDMDSLILKPTDSTPAMKQAYQI 304 (420)
Q Consensus 225 ~~~~~~~~~a~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~lt~~~~~~~~~i~~~~~~~~~g~T~~~~gl~~~~~~ 304 (420)
++|++.+|.....|++... ...|+|++.|....+...|+|.|...++..++.+. ++.+|.++.|+..+.+.
T Consensus 108 p~Rl~~ak~~~~~fi~~~~------~~driGlv~Fa~~a~~~~plT~D~~~~~~~l~~l~---~~~~taiG~ai~~Al~~ 178 (326)
T PRK13685 108 PNRLAAAQEAAKQFADQLT------PGINLGLIAFAGTATVLVSPTTNREATKNALDKLQ---LADRTATGEGIFTALQA 178 (326)
T ss_pred CCHHHHHHHHHHHHHHHCC------CCCEEEEEEECCCCEECCCCCCCHHHHHHHHHHCC---CCCCCCCCHHHHHHHHH
T ss_conf 5689999999999997379------88828999965872014898753999999998468---78888640689999999

Q ss_pred HCCCCCCCCCCCCCCCCCCCCCCCCCCEEEEEECCCCCCCCCCCH---HHHHHHHHHHCCCEEEEEEECCCC--------
Q ss_conf 131012334554334776778776662489960686777776534---899999999789879999954897--------
Q gi|254781110|r 305 LTSDKKRSFFTNFFRQGVKIPSLPFQKFIIFLTDGENNNFKSNVN---TIKICDKAKENFIKIVTISINASP-------- 373 (420)
Q Consensus 305 L~~~~~~~~~~~~~~~~~~~~~~~~~k~iil~TDG~~~~~~~~~~---~~~~c~~~k~~gi~i~tIgf~~~~-------- 373 (420)
+...... ....+...++.||++|||+||.+..... ...+++.+|+.||+|||||+|.+.
T Consensus 179 l~~~~~~----------~~~~~~~~~~~IILLTDG~~n~g~~~~~p~~~~~AA~~A~~~gi~IyTIgvGt~~g~~~~~g~ 248 (326)
T PRK13685 179 IATVGAV----------IGGGDTPPPARIVLFSDGKETVPTNPDNPKGAYTAARTAKDQGVPISTISFGTPYGFVEINGQ 248 (326)
T ss_pred HHHHHHH----------CCCCCCCCCCEEEEECCCCCCCCCCCCCCCHHHHHHHHHHHCCCCEEEEEECCCCCCCCCCCC
T ss_conf 9863320----------145677788679997489977788988730299999999985994899997799884354784

Q ss_pred ------CHHHHHHHHH-CCCCCEEEECCHHHHHHHHHHHHHHHHCCEE
Q ss_conf ------7899999862-1898279817989999999999987420255
Q gi|254781110|r 374 ------NGQRLLKTCV-SSPEYHYNVVNADSLIHVFQNISQLMVHRKY 414 (420)
Q Consensus 374 ------~~~~~l~~ca-s~~~~yf~a~~~~~L~~aF~~Ia~~I~~lr~ 414 (420)
|. +.|++.| .++|.||.|.+.++|+++|++|.+.|..-..
T Consensus 249 ~~~~~lDe-~~L~~IA~~TGG~yfrA~d~~~L~~Iy~~i~~~i~~~~~ 295 (326)
T PRK13685 249 RQPVPVDD-ETLKKIAQLSGGEFYTAASLEELRAVYATLQQQIGYETI 295 (326)
T ss_pred CCCCCCCH-HHHHHHHHHCCCEEEECCCHHHHHHHHHHHHHHHCCEEE
T ss_conf 84034568-99999999997298799719999999999996333160331

No 2
>cd01467 vWA_BatA_type VWA BatA type: Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses. In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if
Probab=99.85 E-value=1.8e-19 Score=145.26 Aligned_cols=167 Identities=22% Similarity=0.253 Sum_probs=132.8

Q ss_pred CCEEEEECCCCCCCCCCCCCCCCCCCCCCCCCCCCCHHHHHHHHHHHHHHHCCCCCCCCCCCCEEEECCCCCCCCCCCCC
Q ss_conf 32012210223242233578766667534676653067889998999985104678765543022320487643124457
Q gi|254781110|r 191 FLIELVVDLSGSMHCAMNSDPEDVNSAPICQDKKRTKMAALKNALLLFLDSIDLLSHVKEDVYMGLIGYTTRVEKNIEPS 270 (420)
Q Consensus 191 ~~~~~~~d~s~s~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~a~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~lt 270 (420)
.++++++|+|+||...... ..+|++.+++++..+++... ..+++++.|++......|+|
T Consensus 3 ~dvvlvlD~SgSM~~~d~~--------------~~~rl~~ak~~~~~~i~~~~-------~drvglv~Fs~~a~~~~plT 61 (180)
T cd01467 3 RDIMIALDVSGSMLAQDFV--------------KPSRLEAAKEVLSDFIDRRE-------NDRIGLVVFAGAAFTQAPLT 61 (180)
T ss_pred CEEEEEEECCCCCCCCCCC--------------CCCHHHHHHHHHHHHHHHCC-------CCEEEEEEECCCCEEECCCC
T ss_conf 2799999898475786667--------------85899999999999997199-------97599999728736733766

Q ss_pred CCHHHHHHHHHHHHHCCCCCCCCHHHHHHHHHHHHCCCCCCCCCCCCCCCCCCCCCCCCCCEEEEEECCCCCCCCCCCHH
Q ss_conf 78488999999864046788765388999999861310123345543347767787766624899606867777765348
Q gi|254781110|r 271 WGTEKVRQYVTRDMDSLILKPTDSTPAMKQAYQILTSDKKRSFFTNFFRQGVKIPSLPFQKFIIFLTDGENNNFKSNVNT 350 (420)
Q Consensus 271 ~~~~~~~~~i~~~~~~~~~g~T~~~~gl~~~~~~L~~~~~~~~~~~~~~~~~~~~~~~~~k~iil~TDG~~~~~~~~~~~ 350 (420)
.+....+..+..+....+.|+|++..|+..+.+.|.+.. ...|+|||+|||++|.+.. ..
T Consensus 62 ~d~~~~~~~l~~i~~~~~~ggT~i~~al~~a~~~l~~~~------------------~~~~~ivLlTDG~~n~g~~--~~ 121 (180)
T cd01467 62 LDRESLKELLEDIKIGLAGQGTAIGDAIGLAIKRLKNSE------------------AKERVIVLLTDGENNAGEI--DP 121 (180)
T ss_pred CCHHHHHHHHHCCCCCCCCCCCCHHHHHHHHHHHHHCCC------------------CCCCEEEEEECCCCCCCCC--CH
T ss_conf 568999999862244532368608999999999764247------------------6663799980588667876--99

Q ss_pred HHHHHHHHHCCCEEEEEEECCCC-----------CHHHHHHHHHCCCCCEEEECCHHHH
Q ss_conf 99999999789879999954897-----------7899999862189827981798999
Q gi|254781110|r 351 IKICDKAKENFIKIVTISINASP-----------NGQRLLKTCVSSPEYHYNVVNADSL 398 (420)
Q Consensus 351 ~~~c~~~k~~gi~i~tIgf~~~~-----------~~~~~l~~cas~~~~yf~a~~~~~L 398 (420)
...++.+|+.||+||+||||.+. +.+.|-+.+..++|+||+|.+++||
T Consensus 122 ~~~~~~a~~~gi~v~tIGvG~~~~~~~~~~~~~~d~~~L~~iA~~tgG~yy~a~~~~eL 180 (180)
T cd01467 122 ATAAELAKNKGVRIYTIGVGKSGSGPKPDGSTILDEDSLVEIADKTGGRIFRALDGFEL 180 (180)
T ss_pred HHHHHHHHHCCCEEEEEEECCCCCCCCCCCCCCCCHHHHHHHHHHCCCEEEECCCHHHC
T ss_conf 99999999769989999977898887688876559999999999619979972874649

No 3
>cd01465 vWA_subgroup VWA subgroup: Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if n
Probab=99.82 E-value=1.2e-18 Score=139.93 Aligned_cols=166 Identities=19% Similarity=0.210 Sum_probs=128.1

Q ss_pred CCEEEEECCCCCCCCCCCCCCCCCCCCCCCCCCCCCHHHHHHHHHHHHHHHCCCCCCCCCCCCEEEECCCCCCCCCCCCC
Q ss_conf 32012210223242233578766667534676653067889998999985104678765543022320487643124457
Q gi|254781110|r 191 FLIELVVDLSGSMHCAMNSDPEDVNSAPICQDKKRTKMAALKNALLLFLDSIDLLSHVKEDVYMGLIGYTTRVEKNIEPS 270 (420)
Q Consensus 191 ~~~~~~~d~s~s~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~a~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~lt 270 (420)
.++.+++|+||||. ..+++.++.++..+++.+.. ..|++++.|++......+++
T Consensus 1 ldiv~vlD~SGSM~--------------------g~~~~~~k~a~~~~l~~l~~------~dr~~iv~F~~~~~~~~~~~ 54 (170)
T cd01465 1 LNLVFVIDRSGSMD--------------------GPKLPLVKSALKLLVDQLRP------DDRLAIVTYDGAAETVLPAT 54 (170)
T ss_pred CCEEEEEECCCCCC--------------------CCHHHHHHHHHHHHHHHCCC------CCEEEEEEECCCCEECCCCC
T ss_conf 91999990886889--------------------71999999999999985898------78799998358615515878

Q ss_pred CC--HHHHHHHHHHHHHCCCCCCCCHHHHHHHHHHHHCCCCCCCCCCCCCCCCCCCCCCCCCCEEEEEECCCCCCCCCC-
Q ss_conf 78--488999999864046788765388999999861310123345543347767787766624899606867777765-
Q gi|254781110|r 271 WG--TEKVRQYVTRDMDSLILKPTDSTPAMKQAYQILTSDKKRSFFTNFFRQGVKIPSLPFQKFIIFLTDGENNNFKSN- 347 (420)
Q Consensus 271 ~~--~~~~~~~i~~~~~~~~~g~T~~~~gl~~~~~~L~~~~~~~~~~~~~~~~~~~~~~~~~k~iil~TDG~~~~~~~~- 347 (420)
.. ...+...++ ...+.|+|++..|+..|+..+.... .+...+.+|++|||+.|.+..+
T Consensus 55 ~~~~~~~~~~~i~---~l~~~G~T~~~~~l~~a~~~~~~~~----------------~~~~~~~iillTDG~~~~~~~~~ 115 (170)
T cd01465 55 PVRDKAAILAAID---RLTAGGSTAGGAGIQLGYQEAQKHF----------------VPGGVNRILLATDGDFNVGETDP 115 (170)
T ss_pred CHHHHHHHHHHHH---CCCCCCCCCHHHHHHHHHHHHHHCC----------------CCCCCEEEEEEECCCCCCCCCCH
T ss_conf 6667999999874---3898998527799999999998633----------------78875069998158856798898

Q ss_pred CHHHHHHHHHHHCCCEEEEEEECCCCCHHHHHHH-HHCCCCCEEEECCHHHHHHHH
Q ss_conf 3489999999978987999995489778999998-621898279817989999999
Q gi|254781110|r 348 VNTIKICDKAKENFIKIVTISINASPNGQRLLKT-CVSSPEYHYNVVNADSLIHVF 402 (420)
Q Consensus 348 ~~~~~~c~~~k~~gi~i~tIgf~~~~~~~~~l~~-cas~~~~yf~a~~~~~L~~aF 402 (420)
......+...++.||+|||||||.+.+.. +|+. +..++|+||++++++||.++|
T Consensus 116 ~~~~~~~~~~~~~~i~i~tiGiG~~~~~~-~L~~iA~~~~G~~~~v~~~~~l~~~f 170 (170)
T cd01465 116 DELARLVAQKRESGITLSTLGFGDNYNED-LMEAIADAGNGNTAYIDNLAEARKVF 170 (170)
T ss_pred HHHHHHHHHHHHCCCCCEEEEECCCCCHH-HHHHHHHCCCCEEEECCCHHHHHHHC
T ss_conf 99999999987438862489808879999-99999975798899849999999639

No 4
>cd01461 vWA_interalpha_trypsin_inhibitor vWA_interalpha trypsin inhibitor (ITI): ITI is a glycoprotein composed of three polypeptides- two heavy chains and one light chain (bikunin). Bikunin confers the protease-inhibitor function while the heavy chains are involved in rendering stability to the extracellular matrix by binding to hyaluronic acid. The heavy chains carry the VWA domain with a conserved MIDAS motif. Although the exact role of the VWA domains remains unknown, it has been speculated to be involved in mediating protein-protein interactions with the components of the extracellular matrix.
Probab=99.75 E-value=3.5e-16 Score=123.87 Aligned_cols=166 Identities=26% Similarity=0.266 Sum_probs=126.2

Q ss_pred CCCEEEEECCCCCCCCCCCCCCCCCCCCCCCCCCCCCHHHHHHHHHHHHHHHCCCCCCCCCCCCEEEECCCCCCCCCCCC
Q ss_conf 43201221022324223357876666753467665306788999899998510467876554302232048764312445
Q gi|254781110|r 190 IFLIELVVDLSGSMHCAMNSDPEDVNSAPICQDKKRTKMAALKNALLLFLDSIDLLSHVKEDVYMGLIGYTTRVEKNIEP 269 (420)
Q Consensus 190 ~~~~~~~~d~s~s~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~a~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~l 269 (420)
+.++.+++|+|+||. ..+++.++.++..+++.+.. ..+++++.|++......+.
T Consensus 2 P~div~viD~SgSM~--------------------g~~l~~ak~a~~~~l~~l~~------~d~~~iv~F~~~~~~~~~~ 55 (171)
T cd01461 2 PKEVVFVIDTSGSMS--------------------GTKIEQTKEALLTALKDLPP------GDYFNIIGFSDTVEEFSPS 55 (171)
T ss_pred CCEEEEEECCCCCCC--------------------CHHHHHHHHHHHHHHHHCCC------CCEEEEEEECCEEEEECCC
T ss_conf 846999991798898--------------------63999999999999982998------7879999987806598077

Q ss_pred --CCCHHHHHHHHHHHHHCCCCCCCCHHHHHHHHHHHHCCCCCCCCCCCCCCCCCCCCCCCCCCEEEEEECCCCCCCCCC
Q ss_conf --778488999999864046788765388999999861310123345543347767787766624899606867777765
Q gi|254781110|r 270 --SWGTEKVRQYVTRDMDSLILKPTDSTPAMKQAYQILTSDKKRSFFTNFFRQGVKIPSLPFQKFIIFLTDGENNNFKSN 347 (420)
Q Consensus 270 --t~~~~~~~~~i~~~~~~~~~g~T~~~~gl~~~~~~L~~~~~~~~~~~~~~~~~~~~~~~~~k~iil~TDG~~~~~~~~ 347 (420)
..+.......+..+....+.|+|++..||.++++.|... +...+.++|+|||+.+..
T Consensus 56 ~~~~~~~~~~~a~~~i~~l~~~G~T~i~~aL~~a~~~l~~~------------------~~~~~~iillTDG~~~~~--- 114 (171)
T cd01461 56 SVSATAENVAAAIEYVNRLQALGGTNMNDALEAALELLNSS------------------PGSVPQIILLTDGEVTNE--- 114 (171)
T ss_pred CEECCHHHHHHHHHHHHCCCCCCCCHHHHHHHHHHHHHHHC------------------CCCCCEEEEECCCCCCCH---
T ss_conf 53079999999998875478899866999999999988635------------------798618999757886886---

Q ss_pred CHHHHHHHHHHHCCCEEEEEEECCCCCHHHHHHHHHCCCCCEEEECCHHHHHHHH
Q ss_conf 3489999999978987999995489778999998621898279817989999999
Q gi|254781110|r 348 VNTIKICDKAKENFIKIVTISINASPNGQRLLKTCVSSPEYHYNVVNADSLIHVF 402 (420)
Q Consensus 348 ~~~~~~c~~~k~~gi~i~tIgf~~~~~~~~~l~~cas~~~~yf~a~~~~~L~~aF 402 (420)
......+..+++.+|+|||||||.+.+...+-+.+..++|.||++++.++|.+.+
T Consensus 115 ~~~~~~~~~~~~~~i~i~tig~G~~~~~~~L~~iA~~~~G~~~~v~~~~~l~~~~ 169 (171)
T cd01461 115 SQILKNVREALSGRIRLFTFGIGSDVNTYLLERLAREGRGIARRIYETDDIESQL 169 (171)
T ss_pred HHHHHHHHHHHCCCCEEEEEEECCCCCHHHHHHHHHCCCCEEEECCCHHHHHHHH
T ss_conf 8999999997448963999997897999999999972898899889878999976

No 5
>cd01466 vWA_C3HC4_type VWA C3HC4-type: Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most,
Probab=99.70 E-value=2.1e-15 Score=118.78 Aligned_cols=152 Identities=21% Similarity=0.193 Sum_probs=114.8

Q ss_pred CEEEEECCCCCCCCCCCCCCCCCCCCCCCCCCCCCHHHHHHHHHHHHHHHCCCCCCCCCCCCEEEECCCCCCCCCCCCCC
Q ss_conf 20122102232422335787666675346766530678899989999851046787655430223204876431244577
Q gi|254781110|r 192 LIELVVDLSGSMHCAMNSDPEDVNSAPICQDKKRTKMAALKNALLLFLDSIDLLSHVKEDVYMGLIGYTTRVEKNIEPSW 271 (420)
Q Consensus 192 ~~~~~~d~s~s~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~a~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~lt~ 271 (420)
++.+++|+||||. ..+++.++.+...+++.+.. ..+++++.|++......|++.
T Consensus 2 div~vlD~SGSM~--------------------g~~l~~~k~a~~~~~~~L~~------~d~v~iV~F~~~a~~~~pl~~ 55 (155)
T cd01466 2 DLVAVLDVSGSMA--------------------GDKLQLVKHALRFVISSLGD------ADRLSIVTFSTSAKRLSPLRR 55 (155)
T ss_pred EEEEEEECCCCCC--------------------CHHHHHHHHHHHHHHHHCCC------CCEEEEEEECCCCEEEECCEE
T ss_conf 3999990898988--------------------73899999999999984897------674899995687426204603

Q ss_pred CHHHHHHH-HHHHHHCCCCCCCCHHHHHHHHHHHHCCCCCCCCCCCCCCCCCCCCCCCCCCEEEEEECCCCCCCCCCCHH
Q ss_conf 84889999-99864046788765388999999861310123345543347767787766624899606867777765348
Q gi|254781110|r 272 GTEKVRQY-VTRDMDSLILKPTDSTPAMKQAYQILTSDKKRSFFTNFFRQGVKIPSLPFQKFIIFLTDGENNNFKSNVNT 350 (420)
Q Consensus 272 ~~~~~~~~-i~~~~~~~~~g~T~~~~gl~~~~~~L~~~~~~~~~~~~~~~~~~~~~~~~~k~iil~TDG~~~~~~~~~~~ 350 (420)
.....+.. ...+....++|+|++..|+..|.+.|.... .++..+.|||+|||++|..
T Consensus 56 ~~~~~~~~~~~~i~~l~~~GgT~i~~gl~~a~~~l~~~~----------------~~~~~~~IiLlTDG~~n~~------ 113 (155)
T cd01466 56 MTAKGKRSAKRVVDGLQAGGGTNVVGGLKKALKVLGDRR----------------QKNPVASIMLLSDGQDNHG------ 113 (155)
T ss_pred CCHHHHHHHHHHHHCCCCCCCCCHHHHHHHHHHHHHHCC----------------CCCCCEEEEEECCCCCCHH------
T ss_conf 799999999998753776888726799999999998436----------------6898308999826986405------

Q ss_pred HHHHHHHHHCCCEEEEEEECCCCCHHHHHHHHHC-CCCCEEEEC
Q ss_conf 9999999978987999995489778999998621-898279817
Q gi|254781110|r 351 IKICDKAKENFIKIVTISINASPNGQRLLKTCVS-SPEYHYNVV 393 (420)
Q Consensus 351 ~~~c~~~k~~gi~i~tIgf~~~~~~~~~l~~cas-~~~~yf~a~ 393 (420)
..+..+++.+|+|||||||.+.+.. +|+..|. ++|.||.++
T Consensus 114 -~~~~~~~~~~i~i~tiGiG~~~d~~-lL~~iA~~~gG~~~~v~ 155 (155)
T cd01466 114 -AVVLRADNAPIPIHTFGLGASHDPA-LLAFIAEITGGTFSYVK 155 (155)
T ss_pred -HHHHHHHCCCCEEEEEEECCCCCHH-HHHHHHHCCCCEEEEEC
T ss_conf -7789987179739999978867899-99999976997799949

No 6
>cd01475 vWA_Matrilin VWA_Matrilin: In cartilaginous plate, extracellular matrix molecules mediate cell-matrix and matrix-matrix interactions thereby providing tissue integrity. Some members of the matrilin family are expressed specifically in developing cartilage rudiments. The matrilin family consists of at least four members. All the members of the matrilin family contain VWA domains, EGF-like domains and a heptad repeat coiled-coiled domain at the carboxy terminus which is responsible for the oligomerization of the matrilins. The VWA domains have been shown to be essential for matrilin network formation by interacting with matrix ligands.
Probab=99.67 E-value=3.8e-14 Score=110.64 Aligned_cols=179 Identities=17% Similarity=0.221 Sum_probs=134.5

Q ss_pred CCCEEEEECCCCCCCCCCCCCCCCCCCCCCCCCCCCCHHHHHHHHHHHHHHHCCCCCCCCCCCCEEEECCCCCCCCCCCC
Q ss_conf 43201221022324223357876666753467665306788999899998510467876554302232048764312445
Q gi|254781110|r 190 IFLIELVVDLSGSMHCAMNSDPEDVNSAPICQDKKRTKMAALKNALLLFLDSIDLLSHVKEDVYMGLIGYTTRVEKNIEP 269 (420)
Q Consensus 190 ~~~~~~~~d~s~s~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~a~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~l 269 (420)
..|+.|.+|-|+|.. ...++..++-+..+++.++ ..++.+|++++.|++.....++|
T Consensus 2 plDlvFllD~S~Svg--------------------~~nF~~~k~Fv~~lv~~f~---I~~~~trVgvv~ys~~~~~~f~l 58 (224)
T cd01475 2 PTDLVFLIDSSRSVR--------------------PENFELVKQFLNQIIDSLD---VGPDATRVGLVQYSSTVKQEFPL 58 (224)
T ss_pred CEEEEEEEECCCCCC--------------------HHHHHHHHHHHHHHHHHCC---CCCCCEEEEEEEECCCEEEEEEC
T ss_conf 743999994889989--------------------8999999999999998568---79985299999965827899966

Q ss_pred C--CCHHHHHHHHHHHHHCCCCCCCCHHHHHHHHHHHHCCCCCCCCCCCCCCCCCCCCCCCCCCEEEEEECCCCCCCCCC
Q ss_conf 7--78488999999864046788765388999999861310123345543347767787766624899606867777765
Q gi|254781110|r 270 S--WGTEKVRQYVTRDMDSLILKPTDSTPAMKQAYQILTSDKKRSFFTNFFRQGVKIPSLPFQKFIIFLTDGENNNFKSN 347 (420)
Q Consensus 270 t--~~~~~~~~~i~~~~~~~~~g~T~~~~gl~~~~~~L~~~~~~~~~~~~~~~~~~~~~~~~~k~iil~TDG~~~~~~~~ 347 (420)
. .+...+.+.+.++.. .+|+|+.+.+|..+.+.+.....+ ......+.+|++|+||||..+.
T Consensus 59 ~~~~~k~~l~~aI~~i~~--~gggT~Tg~AL~~~~~~~f~~~~G----------~Rp~~~~vpkvlIviTDG~s~D---- 122 (224)
T cd01475 59 GRFKSKADLKRAVRRMEY--LETGTMTGLAIQYAMNNAFSEAEG----------ARPGSERVPRVGIVVTDGRPQD---- 122 (224)
T ss_pred CCCCCHHHHHHHHHHHHC--CCCCCHHHHHHHHHHHHHCCCCCC----------CCCCCCCCCEEEEEECCCCCCC----
T ss_conf 886788999999986361--388446999999999972770239----------9875568985999971798766----

Q ss_pred CHHHHHHHHHHHCCCEEEEEEECCCCCHHHHHHHHHCCC--CCEEEECCHHHHHHHHHHHHHHHH
Q ss_conf 348999999997898799999548977899999862189--827981798999999999998742
Q gi|254781110|r 348 VNTIKICDKAKENFIKIVTISINASPNGQRLLKTCVSSP--EYHYNVVNADSLIHVFQNISQLMV 410 (420)
Q Consensus 348 ~~~~~~c~~~k~~gi~i~tIgf~~~~~~~~~l~~cas~~--~~yf~a~~~~~L~~aF~~Ia~~I~ 410 (420)
.....+..+|++||+||+||++. .+. ..|+..||.| .|+|.+++-++|...-+.|.+.|-
T Consensus 123 -~v~~~A~~lr~~GV~ifaVGVg~-~~~-~eL~~IAs~P~~~hvf~v~~F~~l~~l~~~l~~~iC 184 (224)
T cd01475 123 -DVSEVAAKARALGIEMFAVGVGR-ADE-EELREIASEPLADHVFYVEDFSTIEELTKKFQGKIC 184 (224)
T ss_pred -CHHHHHHHHHHCCCEEEEEECCC-CCH-HHHHHHHCCCCHHCEEEECCHHHHHHHHHHHHHHHC
T ss_conf -38999999998798899996374-798-999998559737568994798899999999876118

No 7
>cd01480 vWA_collagen_alpha_1-VI-type VWA_collagen alpha(VI) type: The extracellular matrix represents a complex alloy of variable members of diverse protein families defining structural integrity and various physiological functions. The most abundant family is the collagens with more than 20 different collagen types identified thus far. Collagens are centrally involved in the formation of fibrillar and microfibrillar networks of the extracellular matrix, basement membranes as well as other structures of the extracellular matrix. Some collagens have about 15-18 vWA domains in them. The VWA domains present in these collagens mediate protein-protein interactions.
Probab=99.66 E-value=2.3e-14 Score=112.08 Aligned_cols=172 Identities=18% Similarity=0.125 Sum_probs=129.5

Q ss_pred CCCEEEEECCCCCCCCCCCCCCCCCCCCCCCCCCCCCHHHHHHHHHHHHHHHCCCC---CCCCCCCCEEEECCCCCCCCC
Q ss_conf 43201221022324223357876666753467665306788999899998510467---876554302232048764312
Q gi|254781110|r 190 IFLIELVVDLSGSMHCAMNSDPEDVNSAPICQDKKRTKMAALKNALLLFLDSIDLL---SHVKEDVYMGLIGYTTRVEKN 266 (420)
Q Consensus 190 ~~~~~~~~d~s~s~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~a~~~~~~~~~~~---~~~~~~~~~~~~~~~~~~~~~ 266 (420)
..|+.|.+|.|+|+. ...++..++-+..+++.+... +..++.+|++++.|++.....
T Consensus 62 ~~~~~~~~~~~~l~~~I~~i~--y~gG~T~tg~AL~~a~~~~~~~----------------~r~~~~kvlvliTDG~S~~ 123 (186) T cd01480 62 AGFLRDIRNYTSLKEAVDNLE--YIGGGTFTDCALKYATEQLLEG----------------SHQKENKFLLVITDGHSDG 123 (186) T ss_pred ECCCCCCCCHHHHHHHHHHHH--CCCCCCHHHHHHHHHHHHHHHC----------------CCCCCCEEEEEEECCCCCC T ss_conf 604777588999999997501--3589862999999999998613----------------6789853899984587666 Q ss_pred CCCCCHHHHHHHHHHHCCCEEEEEEECCCCCHHHHHHHHHCCCCCEEEECCHHHHHHHH Q ss_conf 77653489999999978987999995489778999998621898279817989999999 Q gi|254781110|r 344 FKSNVNTIKICDKAKENFIKIVTISINASPNGQRLLKTCVSSPEYHYNVVNADSLIHVF 402 (420) Q Consensus 344 ~~~~~~~~~~c~~~k~~gi~i~tIgf~~~~~~~~~l~~cas~~~~yf~a~~~~~L~~aF 402 (420) .. +.......+.+|+.||+||+||+|... . ..|+.+|++|++.|.+++-++|.+.| T Consensus 124 ~~-~~~~~~aa~~lr~~GV~ifaVGVG~~~-~-~eL~~IAs~p~~~~~~~~f~~L~~~~ 179 (186) T cd01480 124 SP-DGGIEKAVNEADHLGIKIFFVAVGSQN-E-EPLSRIACDGKSALYRENFAELLWSF 179 (186) T ss_pred CC-CHHHHHHHHHHHHCCCEEEEEEECCCC-H-HHHHHHHCCCCCEEEECCHHHHHCCH T ss_conf 74-066999999999879899999947488-7-99999858997389736899870111 No 8 >cd01470 vWA_complement_factors Complement factors B and C2 are two critical proteases for complement activation. They both contain three CCP or Sushi domains, a trypsin-type serine protease domain and a single VWA domain with a conserved metal ion dependent adhesion site referred commonly as the MIDAS motif. Orthologues of these molecules are found from echinoderms to chordates. During complement activation, the CCP domains are cleaved off, resulting in the formation of an active protease that cleaves and activates complement C3. Complement C2 is in the classical pathway and complement B is in the alternative pathway. The interaction of C2 with C4 and of factor B with C3b are both dependent on Mg2+ binding sites within the VWA domains and the VWA domain of factor B has been shown to mediate the binding of C3. 
This is consistent with the common inferred function of VWA domains as magnesium-dependent protein interaction domains. Probab=99.63 E-value=4.8e-14 Score=110.00 Aligned_cols=179 Identities=18% Similarity=0.221 Sum_probs=125.7 Q ss_pred CCEEEEECCCCCCCCCCCCCCCCCCCCCCCCCCCCCHHHHHHHHHHHHHHHCCCCCCCCCCCCEEEECCCCCCCCCCCCC Q ss_conf 32012210223242233578766667534676653067889998999985104678765543022320487643124457 Q gi|254781110|r 191 FLIELVVDLSGSMHCAMNSDPEDVNSAPICQDKKRTKMAALKNALLLFLDSIDLLSHVKEDVYMGLIGYTTRVEKNIEPS 270 (420) Q Consensus 191 ~~~~~~~d~s~s~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~a~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~lt 270 (420) +|+.|++|-|+|+. ...++..++-+..+++.+...+. .+|++++.|++.......+. T Consensus 1 lDivfllD~SgSIg--------------------~~nF~~~k~Fv~~lv~~~~~~~~---~~rvgvv~ys~~~~~~f~l~ 57 (198) T cd01470 1 LNIYIALDASDSIG--------------------EEDFDEAKNAIKTLIEKISSYEV---SPRYEIISYASDPKEIVSIR 57 (198) T ss_pred CEEEEEEECCCCCC--------------------HHHHHHHHHHHHHHHHHHCCCCC---CCEEEEEEECCCCEEEEECC T ss_conf 91999997989888--------------------78899999999999998446687---75389998158853899715 Q ss_pred CC----HHHHHHHHHHHHHCC--CCCCCCHHHHHHHHHHHHCCCCCCCCCCCCCCCCCCCCCCCCCCEEEEEECCCCCCC Q ss_conf 78----488999999864046--788765388999999861310123345543347767787766624899606867777 Q gi|254781110|r 271 WG----TEKVRQYVTRDMDSL--ILKPTDSTPAMKQAYQILTSDKKRSFFTNFFRQGVKIPSLPFQKFIIFLTDGENNNF 344 (420) Q Consensus 271 ~~----~~~~~~~i~~~~~~~--~~g~T~~~~gl~~~~~~L~~~~~~~~~~~~~~~~~~~~~~~~~k~iil~TDG~~~~~ 344 (420) .. .......++++.... ..+||++..++......+.-... 
+........+|++|++|||..|.+ T Consensus 58 ~~~~~~~~~~~~~i~~i~y~~~~~~~gT~t~~AL~~~~~~~~~~~~----------~~~~~~~~v~~v~illTDG~sn~g 127 (198) T cd01470 58 DFNSNDADDVIKRLEDFNYDDHGDKTGTNTAAALKKVYERMALEKV----------RNKEAFNETRHVIILFTDGKSNMG 127 (198) T ss_pred CCCCCCHHHHHHHHHHCCCCCCCCCCCHHHHHHHHHHHHHHHHHHH----------CCCCCCCCCCEEEEEECCCCCCCC T ss_conf 7666689999999984603357788646899999999998655530----------466444567559999737854578 Q ss_pred CCCCHH----------HHHHHHHHHCCCEEEEEEECCCCCHHHHHHHHHCC-CC--CEEEECCHHHHHHHHH Q ss_conf 765348----------99999999789879999954897789999986218-98--2798179899999999 Q gi|254781110|r 345 KSNVNT----------IKICDKAKENFIKIVTISINASPNGQRLLKTCVSS-PE--YHYNVVNADSLIHVFQ 403 (420) Q Consensus 345 ~~~~~~----------~~~c~~~k~~gi~i~tIgf~~~~~~~~~l~~cas~-~~--~yf~a~~~~~L~~aF~ 403 (420) ..+... ......+|+.||.||+||+|.+.+. ..|+.+||. |+ |+|.+++-++|+++|. T Consensus 128 ~~P~~~~~~~~~~~~~~~~a~~~r~~gi~ifaiGVG~~~d~-~eL~~IAS~~~~e~hvf~v~df~~L~~i~d 198 (198) T cd01470 128 GSPLPTVDKIKNLVYKNNKSDNPREDYLDVYVFGVGDDVNK-EELNDLASKKDNERHFFKLKDYEDLQEVFD 198 (198) T ss_pred CCCHHHHHHHHHHHHHHHHHHHHHHCCCEEEEEEECCCCCH-HHHHHHHCCCCCCCEEEEECCHHHHHHHHC T ss_conf 99336788877766410145678873947999996661599-999998579998716999689999998639 No 9 >cd01474 vWA_ATR ATR (Anthrax Toxin Receptor): Anthrax toxin is a key virulence factor for Bacillus anthracis, the causative agent of anthrax. ATR is the cellular receptor for the anthrax protective antigen and facilitates entry of the toxin into cells. The VWA domain in ATR contains the toxin binding site and mediates interaction with protective antigen. The binding is mediated by divalent cations that bind to the MIDAS motif. These proteins are a family of vertebrate ECM receptors expressed by endothelial cells. 
Probab=99.63 E-value=1.4e-13 Score=106.97 Aligned_cols=177 Identities=14% Similarity=0.115 Sum_probs=129.4 Q ss_pred CCCEEEEECCCCCCCCCCCCCCCCCCCCCCCCCCCCCHHHHHHHHHHHHHHHCCCCCCCCCCCCEEEECCCCCCCCCCCC Q ss_conf 43201221022324223357876666753467665306788999899998510467876554302232048764312445 Q gi|254781110|r 190 IFLIELVVDLSGSMHCAMNSDPEDVNSAPICQDKKRTKMAALKNALLLFLDSIDLLSHVKEDVYMGLIGYTTRVEKNIEP 269 (420) Q Consensus 190 ~~~~~~~~d~s~s~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~a~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~l 269 (420) ..|+.|++|-|+|+... +...++-+..+++.+. ....|+|++.|++......+| T Consensus 4 ~~DivFllD~S~Sv~~~---------------------f~~~~~Fv~~lv~~f~-----~~~~rvgvv~fS~~~~~~f~l 57 (185) T cd01474 4 HFDLYFVLDKSGSVAAN---------------------WIEIYDFVEQLVDRFN-----SPGLRFSFITFSTRATKILPL 57 (185) T ss_pred CEEEEEEEECCCCCCCC---------------------HHHHHHHHHHHHHHCC-----CCCEEEEEEEECCCCCEEEEC T ss_conf 61389999789987657---------------------6999999999998569-----987499999986983189845 Q ss_pred CCCHHHHHHHHHHHHHCCCCCCCCHHHHHHHHHHHHCCCCCCCCCCCCCCCCCCCCCCCCCCEEEEEECCCCCCCCCCCH Q ss_conf 77848899999986404678876538899999986131012334554334776778776662489960686777776534 Q gi|254781110|r 270 SWGTEKVRQYVTRDMDSLILKPTDSTPAMKQAYQILTSDKKRSFFTNFFRQGVKIPSLPFQKFIIFLTDGENNNFKSNVN 349 (420) Q Consensus 270 t~~~~~~~~~i~~~~~~~~~g~T~~~~gl~~~~~~L~~~~~~~~~~~~~~~~~~~~~~~~~k~iil~TDG~~~~~~~~~~ 349 (420) +...++....+..+....++|+|+.+.||..+...+..... +.....|++|++|||..+... ... 
T Consensus 58 ~~~~~~~~~~~~~~~~~~~~G~T~tg~AL~~a~~~~f~~~~--------------g~R~~~kvlivlTDG~s~~~~-~~~ 122 (185) T cd01474 58 TDDSSAIIKGLEVLKKVTPSGQTYIHEGLENANEQIFNRNG--------------GGRETVSVIIALTDGQLLLNG-HKY 122 (185) T ss_pred CCCHHHHHHHHHHHHHHCCCCCCHHHHHHHHHHHHHHCCCC--------------CCCCCCEEEEEEECCCCCCCC-CHH T ss_conf 78707889999998876158937899999999997503236--------------998876289999326656762-141 Q ss_pred HHHHHHHHHHCCCEEEEEEECCCCCHHHHHHHHHCCCCCEEEECC-HHHHHHHHHHHHHHH Q ss_conf 899999999789879999954897789999986218982798179-899999999999874 Q gi|254781110|r 350 TIKICDKAKENFIKIVTISINASPNGQRLLKTCVSSPEYHYNVVN-ADSLIHVFQNISQLM 409 (420) Q Consensus 350 ~~~~c~~~k~~gi~i~tIgf~~~~~~~~~l~~cas~~~~yf~a~~-~~~L~~aF~~Ia~~I 409 (420) +...++.+|+.||.||+||++ +.+ ...|+..|++|+|.|.+++ -++|....+.|.+.| T Consensus 123 ~~~~a~~lr~~gV~i~aVGV~-~~~-~~eL~~IAs~p~~vf~v~~~F~~L~~i~~~l~~~i 181 (185) T cd01474 123 PEHEAKLSRKLGAIVYCVGVT-DFL-KSQLINIADSKEYVFPVTSGFQALSGIIESVVKKA 181 (185) T ss_pred HHHHHHHHHHCCCEEEEEECC-CCC-HHHHHHHHCCCCEEEECCCCHHHHHHHHHHHHHHH T ss_conf 799999999789489999716-259-99999871998648983475777899999999852 No 10 >cd01456 vWA_ywmD_type VWA ywmD type:Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. 
Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if Probab=99.60 E-value=1e-13 Score=107.89 Aligned_cols=164 Identities=15% Similarity=0.138 Sum_probs=106.5 Q ss_pred CCCCCCEEEEECCCCCCCCCCCCCCCCCCCCCCCCCCCCCHHHHHHHHHHHHHHHCCCCCCCCCCCCEEEECCCCCCCCC Q ss_conf 43443201221022324223357876666753467665306788999899998510467876554302232048764312 Q gi|254781110|r 187 ERPIFLIELVVDLSGSMHCAMNSDPEDVNSAPICQDKKRTKMAALKNALLLFLDSIDLLSHVKEDVYMGLIGYTTRVEKN 266 (420) Q Consensus 187 ~~~~~~~~~~~d~s~s~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~a~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 266 (420) .+...++.+++|+||||..... ...+|++.++.++..+++.+.. ..+++++.|++..... T Consensus 17 ~~~P~~~~lVlD~SGSM~~~~~--------------~g~~rl~~ak~a~~~~v~~l~~------~drvgLv~F~~~~~~~ 76 (206) T cd01456 17 PQLPPNVAIVLDNSGSMREVDG--------------GGETRLDNAKAALDETANALPD------GTRLGLWTFSGDGDNP 76 (206) T ss_pred CCCCCEEEEEEECCCCCCCCCC--------------CCCCHHHHHHHHHHHHHHHCCC------CCEEEEEEECCCCCCC T ss_conf 9898738999979878778787--------------7645999999999999985799------9879999977867778 Q ss_pred CC---------CCC--------CHHHHHHHHHHHHHCCCCCCCCHHHHHHHHHHHHCCCCCCCCCCCCCCCCCCCCCCCC Q ss_conf 44---------577--------8488999999864046788765388999999861310123345543347767787766 Q gi|254781110|r 267 IE---------PSW--------GTEKVRQYVTRDMDSLILKPTDSTPAMKQAYQILTSDKKRSFFTNFFRQGVKIPSLPF 329 (420) Q Consensus 267 ~~---------lt~--------~~~~~~~~i~~~~~~~~~g~T~~~~gl~~~~~~L~~~~~~~~~~~~~~~~~~~~~~~~ 329 (420) .+ ++. +...+...++.+. .+.|+|++..++..+.+.+.+. . 
T Consensus 77 ~d~~~~~~~~~~~~~~~~~~~~~r~~l~~~i~~l~--~~~G~T~l~~al~~a~~~~~~~--------------------~ 134 (206) T cd01456 77 LDVRVLVPKGCLTAPVNGFPSAQRSALDAALNSLQ--TPTGWTPLAAALAEAAAYVDPG--------------------R 134 (206) T ss_pred CCCCEECCCCCCCCCCCCCCHHHHHHHHHHHHHCC--CCCCCCHHHHHHHHHHHHHCCC--------------------C T ss_conf 88513214565444345523778999999997457--7889647999999999862778--------------------7 Q ss_pred CCEEEEEECCCCCCCCCCCHHHHHHHHH-----HHCCCEEEEEEECCCCCHHHHHHHHH-CCCCCE-EEECCHH Q ss_conf 6248996068677777653489999999-----97898799999548977899999862-189827-9817989 Q gi|254781110|r 330 QKFIIFLTDGENNNFKSNVNTIKICDKA-----KENFIKIVTISINASPNGQRLLKTCV-SSPEYH-YNVVNAD 396 (420) Q Consensus 330 ~k~iil~TDG~~~~~~~~~~~~~~c~~~-----k~~gi~i~tIgf~~~~~~~~~l~~ca-s~~~~y-f~a~~~~ 396 (420) .+.|||+|||++|++.... ..+..+ +..+|+|||||||.+.+.. +|+..| .++|.| |.++.+. T Consensus 135 ~~~IvLlTDG~~~~g~~~~---~~~~~l~~~~~~~~~v~V~tig~G~d~d~~-~L~~IA~~tgG~y~y~~~d~~ 204 (206) T cd01456 135 VNVVVLITDGEDTCGPDPC---EVARELAKRRTPAPPIKVNVIDFGGDADRA-ELEAIAEATGGTYAYNQSDLA 204 (206) T ss_pred CCEEEEEECCCCCCCCCHH---HHHHHHHHHCCCCCCEEEEEEEECCCCCHH-HHHHHHHCCCCEEEEECCCCC T ss_conf 6479999237644688859---999999983177999589999718865899-999999742978995167602 No 11 >cd01451 vWA_Magnesium_chelatase Magnesium chelatase: Mg-chelatase catalyses the insertion of Mg into protoporphyrin IX (Proto). In chlorophyll biosynthesis, insertion of Mg2+ into protoporphyrin IX is catalysed by magnesium chelatase in an ATP-dependent reaction. Magnesium chelatase is a three sub-unit (BchI, BchD and BchH) enzyme with a novel arrangement of domains: the C-terminal helical domain is located behind the nucleotide binding site. The BchD domain contains a AAA domain at its N-terminus and a VWA domain at its C-terminus. The VWA domain has been speculated to be involved in mediating protein-protein interactions. 
Probab=99.57 E-value=2.9e-13 Score=104.89 Aligned_cols=168 Identities=23% Similarity=0.276 Sum_probs=124.2 Q ss_pred CEEEEECCCCCCCCCCCCCCCCCCCCCCCCCCCCCHHHHHHHHHHHHHHHCCCCCCCCCCCCEEEECCC-CCCCCCCCCC Q ss_conf 201221022324223357876666753467665306788999899998510467876554302232048-7643124457 Q gi|254781110|r 192 LIELVVDLSGSMHCAMNSDPEDVNSAPICQDKKRTKMAALKNALLLFLDSIDLLSHVKEDVYMGLIGYT-TRVEKNIEPS 270 (420) Q Consensus 192 ~~~~~~d~s~s~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~a~~~~~~~~~~~~~~~~~~~~~~~~~~-~~~~~~~~lt 270 (420) .+.|++|.||||.. ..|+..+|.++..++... .....++++++|. +......|+| T Consensus 2 lvvfvvD~SGSM~~-------------------~~rl~~aK~a~~~ll~d~-----~~~~D~v~lv~F~g~~a~~~lppT 57 (178) T cd01451 2 LVIFVVDASGSMAA-------------------RHRMAAAKGAVLSLLRDA-----YQRRDKVALIAFRGTEAEVLLPPT 57 (178) T ss_pred EEEEEEECCCCCCC-------------------CCHHHHHHHHHHHHHHHH-----CCCCCEEEEEEECCCCCEEECCCC T ss_conf 69999989878887-------------------567999999999999974-----346788999997597555856887 Q ss_pred CCHHHHHHHHHHHHHCCCCCCCCHHHHHHHHHHHHCCCCCCCCCCCCCCCCCCCCCCCCCCEEEEEECCCCCCCCCCCH- Q ss_conf 7848899999986404678876538899999986131012334554334776778776662489960686777776534- Q gi|254781110|r 271 WGTEKVRQYVTRDMDSLILKPTDSTPAMKQAYQILTSDKKRSFFTNFFRQGVKIPSLPFQKFIIFLTDGENNNFKSNVN- 349 (420) Q Consensus 271 ~~~~~~~~~i~~~~~~~~~g~T~~~~gl~~~~~~L~~~~~~~~~~~~~~~~~~~~~~~~~k~iil~TDG~~~~~~~~~~- 349 (420) .+....+..++.+ .++|+|++..||..|+..+..... .+...+++||+|||..|.+..+.. 
T Consensus 58 ~~~~~~~~~l~~L---~~gG~T~l~~gL~~a~~~~~~~~~---------------~~~~~~~iiLlTDG~~N~g~~~~~~ 119 (178) T cd01451 58 RSVELAKRRLARL---PTGGGTPLAAGLLAAYELAAEQAR---------------DPGQRPLIVVITDGRANVGPDPTAD 119 (178) T ss_pred CCHHHHHHHHHCC---CCCCCCCHHHHHHHHHHHHHHHCC---------------CCCCCEEEEEECCCCCCCCCCCHHH T ss_conf 6579999987216---788985199999999999998502---------------7898439999846986679995126 Q ss_pred -HHHHHHHHHHCCCEEEEEEECCCCCHHHHHHHHHC-CCCCEEEECCH--HHHHHH Q ss_conf -89999999978987999995489778999998621-89827981798--999999 Q gi|254781110|r 350 -TIKICDKAKENFIKIVTISINASPNGQRLLKTCVS-SPEYHYNVVNA--DSLIHV 401 (420) Q Consensus 350 -~~~~c~~~k~~gi~i~tIgf~~~~~~~~~l~~cas-~~~~yf~a~~~--~~L~~a 401 (420) ...++..+++.||...+|+|+.+.-...++++.|. .+++||..++. ++|.++ T Consensus 120 ~~~~~a~~~~~~gi~~~vId~~~~~~~~~~~~~LA~~~~g~Y~~id~l~~~~i~~~ 175 (178) T cd01451 120 RALAAARKLRARGISALVIDTEGRPVRRGLAKDLARALGGQYVRLPDLSADAIASA 175 (178) T ss_pred HHHHHHHHHHHCCCCEEEEECCCCCCCHHHHHHHHHHCCCCEEECCCCCHHHHHHH T ss_conf 99999999986699789997999976748999999942996998997998899998 No 12 >cd01463 vWA_VGCC_like VWA Voltage gated Calcium channel like: Voltage-gated calcium channels are a complex of five proteins: alpha 1, beta 1, gamma, alpha 2 and delta. The alpha 2 and delta subunits result from proteolytic processing of a single gene product and carry at its N-terminus the VWA and cache domains. The alpha 2 delta gene family has orthologues in D. melanogaster and C. elegans but none have been detected in either A. thaliana or yeast. The exact biochemical function of the VWA domain is not known but the alpha 2 delta complex has been shown to regulate various functional properties of the channel complex. 
Probab=99.55 E-value=7.5e-13 Score=102.20 Aligned_cols=170 Identities=16% Similarity=0.190 Sum_probs=112.6 Q ss_pred CCCCCEEEEECCCCCCCCCCCCCCCCCCCCCCCCCCCCCHHHHHHHHHHHHHHHCCCCCCCCCCCCEEEECCCCCCCCCC Q ss_conf 34432012210223242233578766667534676653067889998999985104678765543022320487643124 Q gi|254781110|r 188 RPIFLIELVVDLSGSMHCAMNSDPEDVNSAPICQDKKRTKMAALKNALLLFLDSIDLLSHVKEDVYMGLIGYTTRVEKNI 267 (420) Q Consensus 188 ~~~~~~~~~~d~s~s~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~a~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 267 (420) ..+.++.+++|+||||. ..+++.+++++..+++.+...+ +++++.|++...... T Consensus 11 ~~Pkdvv~vlD~SGSM~--------------------g~kl~~ak~a~~~il~~L~~~D------~~~iv~Fs~~~~~~~ 64 (190) T cd01463 11 TSPKDIVILLDVSGSMT--------------------GQRLHLAKQTVSSILDTLSDND------FFNIITFSNEVNPVV 64 (190) T ss_pred CCCCEEEEEEECCCCCC--------------------CCHHHHHHHHHHHHHHHCCCCC------EEEEEEECCCCEEEE T ss_conf 89826999997999889--------------------7349999999999998199877------999999689753630 Q ss_pred C-----CCCCHH-HHHHHHHHHHHCCCCCCCCHHHHHHHHHHHHCCCCCCCCCCCCCCCCCCCCCCCCCCEEEEEECCCC Q ss_conf 4-----577848-8999999864046788765388999999861310123345543347767787766624899606867 Q gi|254781110|r 268 E-----PSWGTE-KVRQYVTRDMDSLILKPTDSTPAMKQAYQILTSDKKRSFFTNFFRQGVKIPSLPFQKFIIFLTDGEN 341 (420) Q Consensus 268 ~-----lt~~~~-~~~~~i~~~~~~~~~g~T~~~~gl~~~~~~L~~~~~~~~~~~~~~~~~~~~~~~~~k~iil~TDG~~ 341 (420) | +..... ..+.....+....+.|+|++..||..|+..|....... .........+.|+|+|||.. 
T Consensus 65 p~~~~~~~~~t~~n~~~~~~~i~~l~~~G~Tn~~~al~~A~~~l~~~~~~~---------~~~~~~~~~~~IillTDG~~ 135 (190) T cd01463 65 PCFNDTLVQATTSNKKVLKEALDMLEAKGIANYTKALEFAFSLLLKNLQSN---------HSGSRSQCNQAIMLITDGVP 135 (190) T ss_pred CCCCCCEEECCHHHHHHHHHHHHHCCCCCCCHHHHHHHHHHHHHHHHHCCC---------CCCCCCCCCCEEEEEECCCC T ss_conf 245684336899999999999982857987248999999999998742015---------56655555515999836988 Q ss_pred CCCCCCCHHHHHHHHHH--HCCCEEEEEEECCCCCHHHHHHHHHC-CCCCEEEECCH Q ss_conf 77776534899999999--78987999995489778999998621-89827981798 Q gi|254781110|r 342 NNFKSNVNTIKICDKAK--ENFIKIVTISINASPNGQRLLKTCVS-SPEYHYNVVNA 395 (420) Q Consensus 342 ~~~~~~~~~~~~c~~~k--~~gi~i~tIgf~~~~~~~~~l~~cas-~~~~yf~a~~~ 395 (420) ++.. .....-...+ ...|.|||+|||.+.....+|+..|. +.|+|+...+. T Consensus 136 ~~~~---~i~~~~~~~~~~~~~i~ift~G~G~~~~d~~~L~~iA~~~~G~y~~I~~~ 189 (190) T cd01463 136 ENYK---EIFDKYNWDKNSEIPVRVFTYLIGREVTDRREIQWMACENKGYYSHIQSL 189 (190) T ss_pred CCHH---HHHHHHHHHHCCCCCEEEEEEEECCCCCCHHHHHHHHHCCCCEEEECCCC T ss_conf 7578---89999999755799879999996799778799999998099569978889 No 13 >pfam00092 VWA von Willebrand factor type A domain. Probab=99.55 E-value=8.4e-13 Score=101.89 Aligned_cols=172 Identities=19% Similarity=0.197 Sum_probs=124.3 Q ss_pred CEEEEECCCCCCCCCCCCCCCCCCCCCCCCCCCCCHHHHHHHHHHHHHHHCCCCCCCCCCCCEEEECCCCCCCCCCCCCC Q ss_conf 20122102232422335787666675346766530678899989999851046787655430223204876431244577 Q gi|254781110|r 192 LIELVVDLSGSMHCAMNSDPEDVNSAPICQDKKRTKMAALKNALLLFLDSIDLLSHVKEDVYMGLIGYTTRVEKNIEPSW 271 (420) Q Consensus 192 ~~~~~~d~s~s~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~a~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~lt~ 271 (420) |+.|++|.|+||. ...++..++.+..+++.+. ..+...|++++.|++......+|+. 
T Consensus 1 Di~fvlD~S~Sm~--------------------~~~~~~~k~~~~~~i~~~~---~~~~~~rv~lv~f~~~~~~~~~l~~ 57 (177) T pfam00092 1 DIVFLLDGSGSIG--------------------EANFEKVKEFIKKLVENLD---IGPDGTRVGLVQYSSDVTTEFSLND 57 (177) T ss_pred CEEEEEECCCCCC--------------------HHHHHHHHHHHHHHHHHHC---CCCCCCEEEEEEECCCEEEEECCCC T ss_conf 9899996879988--------------------6889999999999999836---5887528999994584589961788 Q ss_pred CHHH--HHHHHHHHHHCCCCCCCCHHHHHHHHHHHHCCCCCCCCCCCCCCCCCCCCCCCCCCEEEEEECCCCCCCCCCCH Q ss_conf 8488--99999986404678876538899999986131012334554334776778776662489960686777776534 Q gi|254781110|r 272 GTEK--VRQYVTRDMDSLILKPTDSTPAMKQAYQILTSDKKRSFFTNFFRQGVKIPSLPFQKFIIFLTDGENNNFKSNVN 349 (420) Q Consensus 272 ~~~~--~~~~i~~~~~~~~~g~T~~~~gl~~~~~~L~~~~~~~~~~~~~~~~~~~~~~~~~k~iil~TDG~~~~~~~~~~ 349 (420) ..+. ....+. ......+|+|++..||..+.+.+.... ...++.+|++|++|||.++..... T Consensus 58 ~~~~~~~~~~~~-~~~~~~~g~t~~~~al~~a~~~~~~~~--------------~~r~~~~k~vvllTDG~~~~~~~~-- 120 (177) T pfam00092 58 YKSKDDLLSAVL-RNIYYLGGGTNTGKALKYALENLFRSA--------------GSRPNAPKVVILLTDGKSNDGGLV-- 120 (177) T ss_pred CCCHHHHHHHHH-HHCCCCCCCCHHHHHHHHHHHHHHHCC--------------CCCCCCEEEEEEEECCCCCCCCCC-- T ss_conf 689999999986-431578995659999999999986354--------------788787268999836987888646-- Q ss_pred HHHHHHHHHHCCCEEEEEEECCCCCHHHHHHHHHC---CCCCEEEECCHHHHHHHHHHH Q ss_conf 89999999978987999995489778999998621---898279817989999999999 Q gi|254781110|r 350 TIKICDKAKENFIKIVTISINASPNGQRLLKTCVS---SPEYHYNVVNADSLIHVFQNI 405 (420) Q Consensus 350 ~~~~c~~~k~~gi~i~tIgf~~~~~~~~~l~~cas---~~~~yf~a~~~~~L~~aF~~I 405 (420) .......++..||++|+||+|. 
.+ ...|+..|+ +.+++|.+.+.++|.++++.| T Consensus 121 ~~~~~~~~~~~gI~v~~vG~g~-~~-~~~L~~ia~~~~~~~~~~~~~~~~~l~~~~~~i 177 (177) T pfam00092 121 PAAAAALRRKVGIIVFGVGVGD-VD-EEELRLIASEPCSEGHVFYVTDFDALSDIQEEL 177 (177) T ss_pred HHHHHHHHHHCCCEEEEEECCC-CC-HHHHHHHHCCCCCCCEEEEECCHHHHHHHHHHC T ss_conf 9999999997895899997474-48-999999968999898599958989999999619 No 14 >cd01469 vWA_integrins_alpha_subunit Integrins are a class of adhesion receptors that link the extracellular matrix to the cytoskeleton and cooperate with growth factor receptors to promote cell survival, cell cycle progression and cell migration. Integrins consist of an alpha and a beta sub-unit. Each sub-unit has a large extracellular portion, a single transmembrane segment and a short cytoplasmic domain. The N-terminal domains of the alpha and beta subunits associate to form the integrin headpiece, which contains the ligand binding site, whereas the C-terminal segments traverse the plasma membrane and mediate interaction with the cytoskeleton and with signalling proteins. The VWA domains present in the alpha subunits of integrins seem to be a chordate specific radiation of the gene family being found only in vertebrates. They mediate protein-protein interactions. Probab=99.54 E-value=9e-13 Score=101.68 Aligned_cols=169 Identities=22% Similarity=0.244 Sum_probs=126.0 Q ss_pred CCEEEEECCCCCCCCCCCCCCCCCCCCCCCCCCCCCHHHHHHHHHHHHHHHCCCCCCCCCCCCEEEECCCCCCCCCCCCC Q ss_conf 32012210223242233578766667534676653067889998999985104678765543022320487643124457 Q gi|254781110|r 191 FLIELVVDLSGSMHCAMNSDPEDVNSAPICQDKKRTKMAALKNALLLFLDSIDLLSHVKEDVYMGLIGYTTRVEKNIEPS 270 (420) Q Consensus 191 ~~~~~~~d~s~s~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~a~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~lt 270 (420) .|+.|++|-|+|+. ...+...++.+..+++.+. ..+...|++++.|++......+|. 
T Consensus 1 lDl~fllD~S~Sv~--------------------~~~F~~~k~fi~~lv~~f~---i~~~~~rvglv~ys~~~~~~~~l~ 57 (177) T cd01469 1 MDIVFVLDGSGSIY--------------------PDDFQKVKNFLSTVMKKLD---IGPTKTQFGLVQYSESFRTEFTLN 57 (177) T ss_pred CCEEEEEECCCCCC--------------------HHHHHHHHHHHHHHHHHCC---CCCCCCEEEEEEECCCEEEEEECC T ss_conf 90999996889999--------------------8999999999999998667---699874899999368249998235 Q ss_pred C--CHHHHHHHHHHHHHCCCCCCCCHHHHHHHHHHHHCCCCCCCCCCCCCCCCCCCCCCCCCCEEEEEECCCCCCCCCCC Q ss_conf 7--84889999998640467887653889999998613101233455433477677877666248996068677777653 Q gi|254781110|r 271 W--GTEKVRQYVTRDMDSLILKPTDSTPAMKQAYQILTSDKKRSFFTNFFRQGVKIPSLPFQKFIIFLTDGENNNFKSNV 348 (420) Q Consensus 271 ~--~~~~~~~~i~~~~~~~~~g~T~~~~gl~~~~~~L~~~~~~~~~~~~~~~~~~~~~~~~~k~iil~TDG~~~~~~~~~ 348 (420) . +.......+..+. ..+|+|+.+.+|.++.+.+.....+ ..++.+|++|++|||..+.. . T Consensus 58 ~~~~~~~~~~~i~~i~--~~~g~t~t~~AL~~a~~~~f~~~~g-------------~R~~~~kv~ivlTDG~s~d~---~ 119 (177) T cd01469 58 EYRTKEEPLSLVKHIS--QLLGLTNTATAIQYVVTELFSESNG-------------ARKDATKVLVVITDGESHDD---P 119 (177) T ss_pred CCCCHHHHHHHHHHCC--CCCCCCCHHHHHHHHHHHHCCCCCC-------------CCCCCEEEEEEEECCCCCCC---C T ss_conf 5677899999986230--3689752527999999985364558-------------86787169999978986775---0 Q ss_pred HHHHHHHHHHHCCCEEEEEEECCCCC---HHHHHHHHHCCC--CCEEEECCHHHHHH Q ss_conf 48999999997898799999548977---899999862189--82798179899999 Q gi|254781110|r 349 NTIKICDKAKENFIKIVTISINASPN---GQRLLKTCVSSP--EYHYNVVNADSLIH 400 (420) Q Consensus 349 ~~~~~c~~~k~~gi~i~tIgf~~~~~---~~~~l~~cas~~--~~yf~a~~~~~L~~ 400 (420) ......+.+|+.||+||+||+|..-+ ....|+.+||.| .|.|.+++-++|++ T Consensus 120 ~~~~~~~~lk~~gv~vf~VGvG~~~~~~~~~~eL~~iAs~P~~~hvf~~~~f~~L~~ 176 (177) T cd01469 120 LLKDVIPQAEREGIIRYAIGVGGHFQRENSREELKTIASKPPEEHFFNVTDFAALKD 176 (177) T ss_pred CHHHHHHHHHHCCEEEEEEEECCCCCCCCCHHHHHHHHCCCCHHCEEEECCHHHHCC T ss_conf 149999999979908999995551467451999999967985871998379777646 No 15 >cd01473 vWA_CTRP CTRP for CS protein-TRAP-related protein: Adhesion of 
Plasmodium to host cells is an important phenomenon in parasite invasion and in malaria associated pathology. CTRP encodes a protein containing a putative signal sequence followed by a long extracellular region of 1990 amino acids, a transmembrane domain, and a short cytoplasmic segment. The extracellular region of CTRP contains two separated adhesive domains. The first domain contains six 210-amino acid-long homologous VWA domain repeats. The second domain contains seven repeats of 87-60 amino acids in length, which share similarities with the thrombospondin type 1 domain found in a variety of adhesive molecules. Finally, CTRP also contains consensus motifs found in the superfamily of haematopoietin receptors. The VWA domains in these proteins likely mediate protein-protein interactions. Probab=99.51 E-value=1.3e-11 Score=94.06 Aligned_cols=179 Identities=14% Similarity=0.142 Sum_probs=122.0 Q ss_pred CCEEEEECCCCCCCCCCCCCCCCCCCCCCCCCCCCCHHHH-HHHHHHHHHHHCCCCCCCCCCCCEEEECCCCCCCCCCCC Q ss_conf 3201221022324223357876666753467665306788-999899998510467876554302232048764312445 Q gi|254781110|r 191 FLIELVVDLSGSMHCAMNSDPEDVNSAPICQDKKRTKMAA-LKNALLLFLDSIDLLSHVKEDVYMGLIGYTTRVEKNIEP 269 (420) Q Consensus 191 ~~~~~~~d~s~s~~~~~~~~~~~~~~~~~~~~~~~~~~~~-~~~a~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~l 269 (420) .|+.|++|-|+|.... .|.. 
.++=+..+++.+ +..++.+|++++.|++......++ T Consensus 1 ~DivFllD~S~SIg~~--------------------nf~~~v~~F~~~lv~~f---~Ig~~~~rvgvv~yS~~~~~~~~f 57 (192) T cd01473 1 YDLTLILDESASIGYS--------------------NWRKDVIPFTEKIINNL---NISKDKVHVGILLFAEKNRDVVPF 57 (192) T ss_pred CCEEEEEECCCCCCHH--------------------HHHHHHHHHHHHHHHHC---CCCCCCEEEEEEEECCCCCEEEEC T ss_conf 9789999389986667--------------------76999999999999875---659896199999955887401323 Q ss_pred CC----CHHHHHHHHHHHHHC-CCCCCCCHHHHHHHHHHHHCCCCCCCCCCCCCCCCCCCCCCCCCCEEEEEECCCCCCC Q ss_conf 77----848899999986404-6788765388999999861310123345543347767787766624899606867777 Q gi|254781110|r 270 SW----GTEKVRQYVTRDMDS-LILKPTDSTPAMKQAYQILTSDKKRSFFTNFFRQGVKIPSLPFQKFIIFLTDGENNNF 344 (420) Q Consensus 270 t~----~~~~~~~~i~~~~~~-~~~g~T~~~~gl~~~~~~L~~~~~~~~~~~~~~~~~~~~~~~~~k~iil~TDG~~~~~ 344 (420) .. +.......++.+... +.+|+|+++.+|..+.+.+.... -.+++.+|++|++|||..+.. T Consensus 58 ~~~~~~~k~~~l~~i~~l~~~~~~gg~T~tg~AL~~~~~~~~~~~--------------g~R~~vpkv~IvlTDG~s~~~ 123 (192) T cd01473 58 SDEERYDKNELLKKINDLKNSYRSGGETYIVEALKYGLKNYTKHG--------------NRRKDAPKVTMLFTDGNDTSA 123 (192) T ss_pred CCCCCCCHHHHHHHHHHHHHCCCCCCCCHHHHHHHHHHHHHCCCC--------------CCCCCCCEEEEEEECCCCCCC T ss_conf 554434899999999998731468982479999999999863467--------------888899749999956998873 Q ss_pred CCCCHHHHHHHHHHHCCCEEEEEEECCCCCHHHHHHHHHCCC--CC---EEEECCHHHHHHHHHHHHHHH Q ss_conf 765348999999997898799999548977899999862189--82---798179899999999999874 Q gi|254781110|r 345 KSNVNTIKICDKAKENFIKIVTISINASPNGQRLLKTCVSSP--EY---HYNVVNADSLIHVFQNISQLM 409 (420) Q Consensus 345 ~~~~~~~~~c~~~k~~gi~i~tIgf~~~~~~~~~l~~cas~~--~~---yf~a~~~~~L~~aF~~Ia~~I 409 (420) . ......+...+|+.||+||.||+|.. +..+ |+..|+.| +. 
+|...+-++|....+.|.++| T Consensus 124 ~-~~~~~~~a~~lr~~gV~i~avGVg~~-~~~e-L~~iag~~~~~~~c~~~~~~~fd~l~~i~~~l~~~v 190 (192) T cd01473 124 S-KKELQDISLLYKEENVKLLVVGVGAA-SENK-LKLLAGCDINNDNCPNVIKTEWNNLNGISKFLTDKI 190 (192) T ss_pred C-HHHHHHHHHHHHHCCCEEEEEEECCC-CHHH-HHHHHCCCCCCCCCCEEEECCHHHHHHHHHHHHHHH T ss_conf 1-67899999999987978999980637-9999-999869998899775799479789999999999972 No 16 >cd01472 vWA_collagen von Willebrand factor (vWF) type A domain; equivalent to the I-domain of integrins. This domain has a variety of functions including: intermolecular adhesion, cell migration, signalling, transcription, and DNA repair. In integrins these domains form heterodimers while in vWF it forms homodimers and multimers. There are different interaction surfaces of this domain as seen by its complexes with collagen with either integrin or human vWFA. In integrins collagen binding occurs via the metal ion-dependent adhesion site (MIDAS) and involves three surface loops located on the upper surface of the molecule. In human vWFA, collagen binding is thought to occur on the bottom of the molecule and does not involve the vestigial MIDAS motif. Probab=99.51 E-value=2.9e-12 Score=98.37 Aligned_cols=159 Identities=19% Similarity=0.259 Sum_probs=119.9 Q ss_pred CCEEEEECCCCCCCCCCCCCCCCCCCCCCCCCCCCCHHHHHHHHHHHHHHHCCCCCCCCCCCCEEEECCCCCCCCCCCCC Q ss_conf 32012210223242233578766667534676653067889998999985104678765543022320487643124457 Q gi|254781110|r 191 FLIELVVDLSGSMHCAMNSDPEDVNSAPICQDKKRTKMAALKNALLLFLDSIDLLSHVKEDVYMGLIGYTTRVEKNIEPS 270 (420) Q Consensus 191 ~~~~~~~d~s~s~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~a~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~lt 270 (420) .|+.|++|.|+|+. ...++..++.+..+++.+.. 
.++..|++++.|++......+++ T Consensus 1 aDi~fvlD~S~Sv~--------------------~~~f~~~k~fi~~li~~~~i---~~~~~rvgvv~fs~~~~~~~~l~ 57 (164) T cd01472 1 ADIVFLVDGSESIG--------------------LSNFNLVKDFVKRVVERLDI---GPDGVRVGVVQYSDDPRTEFYLN 57 (164) T ss_pred CCEEEEEECCCCCC--------------------HHHHHHHHHHHHHHHHHCCC---CCCCCEEEEEEECCCEEEEECCC T ss_conf 97999997979988--------------------79999999999999996476---88860899998247415874454 Q ss_pred C--CHHHHHHHHHHHHHCCCCCCCCHHHHHHHHHHHHCCCCCCCCCCCCCCCCCCCCCCCCCCEEEEEECCCCCCCCCCC Q ss_conf 7--84889999998640467887653889999998613101233455433477677877666248996068677777653 Q gi|254781110|r 271 W--GTEKVRQYVTRDMDSLILKPTDSTPAMKQAYQILTSDKKRSFFTNFFRQGVKIPSLPFQKFIIFLTDGENNNFKSNV 348 (420) Q Consensus 271 ~--~~~~~~~~i~~~~~~~~~g~T~~~~gl~~~~~~L~~~~~~~~~~~~~~~~~~~~~~~~~k~iil~TDG~~~~~~~~~ 348 (420) . +...+.+.++.+. ..+|+|+++.+|.++.+.+..... ...++.+|++|++|||..+. T Consensus 58 ~~~~~~~l~~~i~~i~--~~~g~t~~~~AL~~~~~~~~~~~~-------------~~r~~~~kvvvllTDG~s~~----- 117 (164) T cd01472 58 TYRSKDDVLEAVKNLR--YIGGGTNTGKALKYVRENLFTEAS-------------GSREGVPKVLVVITDGKSQD----- 117 (164) T ss_pred CCCCHHHHHHHHHHHH--CCCCCCHHHHHHHHHHHHHHCCCC-------------CCCCCCEEEEEEEECCCCCC----- T ss_conf 6698899999998611--668975299999999998635357-------------87678515999983799864----- Q ss_pred HHHHHHHHHHHCCCEEEEEEECCCCCHHHHHHHHHCCC--CCEEEECC Q ss_conf 48999999997898799999548977899999862189--82798179 Q gi|254781110|r 349 NTIKICDKAKENFIKIVTISINASPNGQRLLKTCVSSP--EYHYNVVN 394 (420) Q Consensus 349 ~~~~~c~~~k~~gi~i~tIgf~~~~~~~~~l~~cas~~--~~yf~a~~ 394 (420) .....+..+|+.||+||+||+|. 
.+ .+.|+..||.| .|+|.+.+ T Consensus 118 ~~~~~a~~lr~~Gi~v~~VGig~-~~-~~~L~~iAs~p~~~~~~~~~~ 163 (164) T cd01472 118 DVEEPAVELKQAGIEVFAVGVKN-AD-EEELKQIASDPKELYVFNVAD 163 (164) T ss_pred HHHHHHHHHHHCCCEEEEEECCC-CC-HHHHHHHHCCCCHHEEEECCC T ss_conf 08899999998898899997884-79-999999967993783896588 No 17 >cd01464 vWA_subfamily VWA subfamily: Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses. In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if Probab=99.50 E-value=1e-12 Score=101.32 Aligned_cols=172 Identities=18% Similarity=0.157 Sum_probs=122.7 Q ss_pred CCCEEEEECCCCCCCCCCCCCCCCCCCCCCCCCCCCCHHHHHHHHHHHHHHHCCCCCCCCCCCCEEEECCCCCCCCCCCC Q ss_conf 43201221022324223357876666753467665306788999899998510467876554302232048764312445 Q gi|254781110|r 190 IFLIELVVDLSGSMHCAMNSDPEDVNSAPICQDKKRTKMAALKNALLLFLDSIDLLSHVKEDVYMGLIGYTTRVEKNIEP 269 (420) Q Consensus 190 ~~~~~~~~d~s~s~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~a~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~l 269 (420) ...+.+++|+||||. 
..+++.++.++..+++.+...+.....+++++++|++......|+ T Consensus 3 rlpvvlvlD~SGSM~--------------------G~~i~~~k~al~~~~~~L~~d~~a~~~~~vsVItF~s~a~~~~pl 62 (176) T cd01464 3 RLPIYLLLDTSGSMA--------------------GEPIEALNQGLQMLQSELRQDPYALESVEISVITFDSAARVIVPL 62 (176) T ss_pred CCCEEEEEECCCCCC--------------------CHHHHHHHHHHHHHHHHHHCCCCCHHEEEEEEEEECCCEEEECCC T ss_conf 357899997899999--------------------847999999999999997118310113269999978951780586 Q ss_pred CCCHHHHHHHHHHHHHCCCCCCCCHHHHHHHHHHHHCCCCCCCCCCCCCCCCCCCCCCCCCCEEEEEECCCCCCCCCCCH Q ss_conf 77848899999986404678876538899999986131012334554334776778776662489960686777776534 Q gi|254781110|r 270 SWGTEKVRQYVTRDMDSLILKPTDSTPAMKQAYQILTSDKKRSFFTNFFRQGVKIPSLPFQKFIIFLTDGENNNFKSNVN 349 (420) Q Consensus 270 t~~~~~~~~~i~~~~~~~~~g~T~~~~gl~~~~~~L~~~~~~~~~~~~~~~~~~~~~~~~~k~iil~TDG~~~~~~~~~~ 349 (420) +.-... . .....++|+|+++.|+..+.+.|...... .......+++++|||||||+++... .. T Consensus 63 ~~~~~~---~---~~~L~a~G~T~~g~al~~a~~~l~~~~~~---------~~~~~~~~~~P~I~LlTDG~PtD~~--~~ 125 (176) T cd01464 63 TPLESF---Q---PPRLTASGGTSMGAALELALDCIDRRVQR---------YRADQKGDWRPWVFLLTDGEPTDDL--TA 125 (176) T ss_pred CCHHHC---C---CCCCCCCCCCHHHHHHHHHHHHHHHHHHH---------CCCCCCCCCCEEEEEECCCCCCCCH--HH T ss_conf 347664---7---55477789981999999999999986522---------3655667753179996689988758--99 Q ss_pred HHHHHHHHHHCCCEEEEEEECCCCCHHHHHHHHHCCCCCEEEECCHHHHHHHH Q ss_conf 89999999978987999995489778999998621898279817989999999 Q gi|254781110|r 350 TIKICDKAKENFIKIVTISINASPNGQRLLKTCVSSPEYHYNVVNADSLIHVF 402 (420) Q Consensus 350 ~~~~c~~~k~~gi~i~tIgf~~~~~~~~~l~~cas~~~~yf~a~~~~~L~~aF 402 (420) ....++..+..++.|++||+|.+.+. ++|+..+... +. 
..+..++.+-| T Consensus 126 ~~~~~~~~~~~~~~i~a~giG~dad~-~~L~~is~~~--~~-~~~~~~f~~ff 174 (176) T cd01464 126 AIERIKEARDSKGRIVACAVGPKADL-DTLKQITEGV--PL-LDDALSGLNFF 174 (176) T ss_pred HHHHHHHHHHCCCEEEEEEEECCCCH-HHHHHHHCCC--CC-CCCHHHHHHHH T ss_conf 99999988863976999997387189-9999885777--45-34534588850 No 18 >cd01471 vWA_micronemal_protein Micronemal proteins: The Toxoplasma lytic cycle begins when the parasite actively invades a target cell. In association with invasion, T. gondii sequentially discharges three sets of secretory organelles beginning with the micronemes, which contain adhesive proteins involved in parasite attachment to a host cell. Deployed as protein complexes, several micronemal proteins possess vertebrate-derived adhesive sequences that function in binding receptors. The VWA domain likely mediates the protein-protein interactions of these with their interacting partners. Probab=99.49 E-value=1.3e-11 Score=94.23 Aligned_cols=173 Identities=17% Similarity=0.186 Sum_probs=119.6 Q ss_pred CCEEEEECCCCCCCCCCCCCCCCCCCCCCCCCCCCCHHHHHHHHHHHHHHHCCCCCCCCCCCCEEEECCCCCCCCCCCCC Q ss_conf 32012210223242233578766667534676653067889998999985104678765543022320487643124457 Q gi|254781110|r 191 FLIELVVDLSGSMHCAMNSDPEDVNSAPICQDKKRTKMAALKNALLLFLDSIDLLSHVKEDVYMGLIGYTTRVEKNIEPS 270 (420) Q Consensus 191 ~~~~~~~d~s~s~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~a~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~lt 270 (420) .|+.|.+|-|+|+.. ...+...+.-+..+++.+ +..++..|++++.|++......+|. 
T Consensus 1 lDlvFllD~S~SVg~-------------------~n~f~~~k~F~~~lv~~f---~I~~~~~rVgvv~ys~~~~~~~~l~ 58 (186) T cd01471 1 LDLYLLVDGSGSIGY-------------------SNWVTHVVPFLHTFVQNL---NISPDEINLYLVTFSTNAKELIRLS 58 (186) T ss_pred CEEEEEEECCCCCCC-------------------CCHHHHHHHHHHHHHHHC---CCCCCCEEEEEEEECCCCEEEEECC T ss_conf 909999948898886-------------------131899999999999974---9698844999999548705998757 Q ss_pred CCH----HHHHHHHHHHHH-CCCCCCCCHHHHHHHHHHHHCCCCCCCCCCCCCCCCCCCCCCCCCCEEEEEECCCCCCCC Q ss_conf 784----889999998640-467887653889999998613101233455433477677877666248996068677777 Q gi|254781110|r 271 WGT----EKVRQYVTRDMD-SLILKPTDSTPAMKQAYQILTSDKKRSFFTNFFRQGVKIPSLPFQKFIIFLTDGENNNFK 345 (420) Q Consensus 271 ~~~----~~~~~~i~~~~~-~~~~g~T~~~~gl~~~~~~L~~~~~~~~~~~~~~~~~~~~~~~~~k~iil~TDG~~~~~~ 345 (420) ... ......+..+.. .+.+|+|+++.+|..+.+.+.... -.+++.+|++|++|||..+. T Consensus 59 ~~~~~~~~~~~~~~~~i~~~~y~gg~T~Tg~AL~~a~~~~f~~~--------------g~R~~vpkv~illTDG~s~d-- 122 (186) T cd01471 59 SPNSTNKDLALNAIRALLSLYYPNGSTNTTSALLVVEKHLFDTR--------------GNRENAPQLVIIMTDGIPDS-- 122 (186) T ss_pred CCCCCCHHHHHHHHHHHHHCCCCCCCCHHHHHHHHHHHHHCCCC--------------CCCCCCCEEEEEEECCCCCC-- T ss_conf 75544656799999999837778996779999999999721146--------------88999985999990698778-- Q ss_pred CCCHHHHHHHHHHHCCCEEEEEEECCCCCHHHHHHHHHCCCCC-----EEEECCHHHHHHHHH Q ss_conf 6534899999999789879999954897789999986218982-----798179899999999 Q gi|254781110|r 346 SNVNTIKICDKAKENFIKIVTISINASPNGQRLLKTCVSSPEY-----HYNVVNADSLIHVFQ 403 (420) Q Consensus 346 ~~~~~~~~c~~~k~~gi~i~tIgf~~~~~~~~~l~~cas~~~~-----yf~a~~~~~L~~aF~ 403 (420) +..+...++.+|++||+||+||+|...+.++ |+..|+.+.. 
.|...+-++|+++-+ T Consensus 123 -~~~~~~~a~~Lr~~GV~ifavGVG~~v~~~e-L~~Iag~~~~~~~c~~~~~~~~~~l~~~~~ 183 (186) T cd01471 123 -KFRTLKEARKLRERGVIIAVLGVGQGVNHEE-NRSLVGCDPDDSPCPLYLQSSWSEVQNVIK 183 (186) T ss_pred -CCHHHHHHHHHHHCCCEEEEEECCCCCCHHH-HHHHCCCCCCCCCCCEEEECCHHHHHHHHH T ss_conf -5258999999998899999998343249999-999709998889986575178888874775 No 19 >cd01450 vWFA_subfamily_ECM Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A Probab=99.48 E-value=4.7e-12 Score=97.02 Aligned_cols=154 Identities=19% Similarity=0.236 Sum_probs=115.9 Q ss_pred CCEEEEECCCCCCCCCCCCCCCCCCCCCCCCCCCCCHHHHHHHHHHHHHHHCCCCCCCCCCCCEEEECCCCCCCCCCCCC Q ss_conf 32012210223242233578766667534676653067889998999985104678765543022320487643124457 Q gi|254781110|r 191 FLIELVVDLSGSMHCAMNSDPEDVNSAPICQDKKRTKMAALKNALLLFLDSIDLLSHVKEDVYMGLIGYTTRVEKNIEPS 270 (420) Q Consensus 191 ~~~~~~~d~s~s~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~a~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~lt 270 (420) .|+.|++|.|+||. ..+++..++.+..+++.+... 
....|++++.|++......+|+ T Consensus 1 ~DivfvlD~S~Sm~--------------------~~~~~~~k~~i~~~i~~~~~~---~~~~rv~lv~fs~~~~~~~~l~ 57 (161) T cd01450 1 LDIVFLLDGSESVG--------------------PENFEKVKDFIEKLVEKLDIG---PDKTRVGLVQYSDDVRVEFSLN 57 (161) T ss_pred CEEEEEEECCCCCC--------------------HHHHHHHHHHHHHHHHHCCCC---CCCCEEEEEEECCCEEEEECCC T ss_conf 96999997989988--------------------589999999999999970568---8785899999557316871465 Q ss_pred CC--HHHHHHHHHHHHHCCCCCCCCHHHHHHHHHHHHCCCCCCCCCCCCCCCCCCCCCCCCCCEEEEEECCCCCCCCCCC Q ss_conf 78--4889999998640467887653889999998613101233455433477677877666248996068677777653 Q gi|254781110|r 271 WG--TEKVRQYVTRDMDSLILKPTDSTPAMKQAYQILTSDKKRSFFTNFFRQGVKIPSLPFQKFIIFLTDGENNNFKSNV 348 (420) Q Consensus 271 ~~--~~~~~~~i~~~~~~~~~g~T~~~~gl~~~~~~L~~~~~~~~~~~~~~~~~~~~~~~~~k~iil~TDG~~~~~~~~~ 348 (420) .. ...+.+.+..+... .+++|++..+|.++.+.+.... ...++.+|++|++|||..+... T Consensus 58 ~~~~~~~l~~~i~~l~~~-~~~~t~~~~AL~~~~~~~~~~~--------------~~r~~~~kvivllTDG~~~~~~--- 119 (161) T cd01450 58 DYKSKDDLLKAVKNLKYL-GGGGTNTGKALQYALEQLFSES--------------NARENVPKVIIVLTDGRSDDGG--- 119 (161) T ss_pred CCCCHHHHHHHHHHCCCC-CCCCCCHHHHHHHHHHHHHHCC--------------CCCCCCCEEEEEEECCCCCCCC--- T ss_conf 646699999999842136-8998548999999999986144--------------6666675499998258878874--- Q ss_pred HHHHHHHHHHHCCCEEEEEEECCCCCHHHHHHHHHCCCC Q ss_conf 489999999978987999995489778999998621898 Q gi|254781110|r 349 NTIKICDKAKENFIKIVTISINASPNGQRLLKTCVSSPE 387 (420) Q Consensus 349 ~~~~~c~~~k~~gi~i~tIgf~~~~~~~~~l~~cas~~~ 387 (420) .....++.+|+.||+||+||+|. 
.+ .+.|+..|+.|+ T Consensus 120 ~~~~~a~~lk~~gi~v~~vgiG~-~~-~~~L~~iA~~p~ 156 (161) T cd01450 120 DPKEAAAKLKDEGIKVFVVGVGP-AD-EEELREIASCPS 156 (161) T ss_pred CHHHHHHHHHHCCCEEEEEEECC-CC-HHHHHHHHCCCC T ss_conf 79999999998899899998264-89-999999977994 No 20 >COG4961 TadG Flp pilus assembly protein TadG [Intracellular trafficking and secretion] Probab=99.48 E-value=4.5e-12 Score=97.15 Aligned_cols=71 Identities=21% Similarity=0.201 Sum_probs=64.2 Q ss_pred HHHHHHHHHHHHCCCCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHCCCCCCHH Q ss_conf 99999999886303885699999999999999999999999999999999999999999874204653025 Q gi|254781110|r 5 SRFRFYFKKGIASEKANFSIIFALSVMSFLLLIGFLIYVLDWHYKKNSMESANNAAILAGASKMVSNLSRL 75 (420) Q Consensus 5 ~~~~~~~~~~~~~~~G~vai~fal~l~~ll~~~g~aVD~~r~~~~ks~Lq~A~DaA~LA~a~~~~~~~~~~ 75 (420) -..+-+++||+||++|+++|.|||++||||+++++.||++.+++.|.+||+|+|+|++++++......... T Consensus 6 ~~~~~~~~rF~rdr~Ga~AVeFAlvap~ll~l~~g~ve~~~~~~~~~~l~~a~d~aara~~~~~~~~~~~~ 76 (185) T COG4961 6 RGLRGLLRRFRRDRRGAAAVEFALVAPPLLLLVFGIVEFGIAFLAKQSLQNAADAAARAAARGLTTDAADL 76 (185) T ss_pred HHHHHHHHHHHHCCCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHCCCCCCCHH T ss_conf 65799999887648768999999999999999999999999999999999999999999985076442025 No 21 >TIGR00868 hCaCC calcium-activated chloride channel protein 1; InterPro: IPR004727 This entry represents a family of Ca(2+)-regulated chloride channels (CLCA) which includes bovine, murine and human proteins , . Each CLCA exhibits a distinct, often overlapping, tissue expression pattern. With the exception of the truncated, secreted protein hCLCA3 , they are synthesized as an approximately 125 kDa precursor transmembrane glycoprotein that is rapidly cleaved into 90 and 35 kDa subunits. The human proteins have been shown to affect a large number of cell functions including chloride conductance, epithelial secretion, cell-cell adhesion, apoptosis, cell cycle control, mucus production in asthma, and blood pressure. 
The CLCA proteins expressed on the luminal surface of lung vascular endothelia (bCLCA2; mCLCA1; hCLCA2) serve as adhesion molecules for lung metastatic cancer cells, mediating vascular arrest and lung colonization. Expression of hCLCA2 in normal mammary epithelium is consistently lost in human breast cancer and in all tumorigenic breast cancer cell lines. Re-expression of hCLCA2 in human breast cancer cells abrogates tumorigenicity in nude mice, implying that hCLCA2 acts as a tumour suppressor in breast cancer.. Probab=99.47 E-value=4e-13 Score=103.96 Aligned_cols=180 Identities=24% Similarity=0.361 Sum_probs=122.4 Q ss_pred CEEEEECCCCCCCCCCCCCCCCCCCCCCCCCCCCCHHHHHHHHHHHHHHHCCCCCCCCCCCCEEEECCCCCCCCCCCCCC Q ss_conf 20122102232422335787666675346766530678899989999851046787655430223204876431244577 Q gi|254781110|r 192 LIELVVDLSGSMHCAMNSDPEDVNSAPICQDKKRTKMAALKNALLLFLDSIDLLSHVKEDVYMGLIGYTTRVEKNIEPSW 271 (420) Q Consensus 192 ~~~~~~d~s~s~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~a~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~lt~ 271 (420) .|-.|+|+||||. ...|+..+.+|.+-|+-+. ..+...+|+++|++.+.....|.. T Consensus 309 iVCLVLDKSGSM~-------------------~~dRL~RmNQAa~lFL~Q~-----vE~gs~VGmV~FDS~A~i~n~L~~ 364 (874) T TIGR00868 309 IVCLVLDKSGSMT-------------------KEDRLKRMNQAAKLFLLQI-----VEKGSWVGMVTFDSAAEIKNELIK 364 (874) T ss_pred EEEEEECCCCCCC-------------------CCCHHHHHHHHHHHHEEEE-----EECCCEEEEEECCCEEEEEEEEEE T ss_conf 8999863443379-------------------8853345555664301235-----541526776630644576542077 Q ss_pred CH-HHHHHHHHHHHHCCCCCCCCHHHHHHHHHHHHCCCCCCCCCCCCCCCCCCCCCCCCCCEEEEEECCCCCCCCCCCHH Q ss_conf 84-88999999864046788765388999999861310123345543347767787766624899606867777765348 Q gi|254781110|r 272 GT-EKVRQYVTRDMDSLILKPTDSTPAMKQAYQILTSDKKRSFFTNFFRQGVKIPSLPFQKFIIFLTDGENNNFKSNVNT 350 (420) Q Consensus 272 ~~-~~~~~~i~~~~~~~~~g~T~~~~gl~~~~~~L~~~~~~~~~~~~~~~~~~~~~~~~~k~iil~TDG~~~~~~~~~~~ 350 (420) =. +..+..+.+---..+.|||.+..||+.|.+.+....+...+.. |||+||||+|... 
T Consensus 365 I~s~~~~~~l~a~LP~~a~GGTSIC~Gl~~aFq~I~~~~~~t~GSE----------------i~LLTDGEDN~i~----- 423 (874) T TIGR00868 365 ITSSDERDALTANLPTEASGGTSICSGLKAAFQVIKKSDQSTDGSE----------------IVLLTDGEDNTIS----- 423 (874) T ss_pred ECCHHHHHHHHHHCCCCCCCCCHHHHHHHHHHHHHHHCCCCCCCCE----------------EEEEECCCCCCEE----- T ss_conf 5266899899870778787680365667666543331266667536----------------9983068757623----- Q ss_pred HHHH-HHHHHCCCEEEEEEECCCCCHH-HHHHHHHCCCCCEEEE--CCHHHHHHHHHHHHH---HHHCCEEEEEEC Q ss_conf 9999-9999789879999954897789-9999862189827981--798999999999998---742025587736 Q gi|254781110|r 351 IKIC-DKAKENFIKIVTISINASPNGQ-RLLKTCVSSPEYHYNV--VNADSLIHVFQNISQ---LMVHRKYSVILK 419 (420) Q Consensus 351 ~~~c-~~~k~~gi~i~tIgf~~~~~~~-~~l~~cas~~~~yf~a--~~~~~L~~aF~~Ia~---~I~~lr~s~~~~ 419 (420) .| +..|.+|+.||||++|-.++.+ .-|.+.- ++.+||.- ..-..|.+||..|+. .|++.-|=||=| T Consensus 424 --sC~~eVkqsGaIiHtiALGpsAa~ele~lS~mT-GG~~fYa~D~~~~NgLidAFg~lsS~~~~~sQ~~lQLESk 496 (874) T TIGR00868 424 --SCIEEVKQSGAIIHTIALGPSAAKELEELSDMT-GGLRFYASDEADNNGLIDAFGALSSGNGSVSQQSLQLESK 496 (874) T ss_pred --ECHHHHHCCCEEEEEEECCHHHHHHHHHHHHHC-CCCEEEEECHHHCCCHHHHHHHHCCCCHHHHHHHHHHHHH T ss_conf --130554109808998507845899999987333-8711334133331414546642214761255555555432 No 22 >smart00327 VWA von Willebrand factor (vWF) type A domain. VWA domains in extracellular eukaryotic proteins mediate adhesion via metal ion-dependent adhesion sites (MIDAS). Intracellular VWA domains and homologues in prokaryotes have recently been identified. The proposed VWA domains in integrin beta subunits have recently been substantiated using sequence-based methods. 
Probab=99.44 E-value=1.7e-11 Score=93.32 Aligned_cols=160 Identities=19% Similarity=0.227 Sum_probs=116.9 Q ss_pred CCEEEEECCCCCCCCCCCCCCCCCCCCCCCCCCCCCHHHHHHHHHHHHHHHCCCCCCCCCCCCEEEECCCCCCCCCCCC- Q ss_conf 3201221022324223357876666753467665306788999899998510467876554302232048764312445- Q gi|254781110|r 191 FLIELVVDLSGSMHCAMNSDPEDVNSAPICQDKKRTKMAALKNALLLFLDSIDLLSHVKEDVYMGLIGYTTRVEKNIEP- 269 (420) Q Consensus 191 ~~~~~~~d~s~s~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~a~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~l- 269 (420) .++.+++|.|+||. ..+++..+.++..+++.+...+ ...+++++.|++......++ T Consensus 2 ~di~~vvD~S~SM~--------------------~~~~~~~k~~~~~~i~~l~~~~---~~~~v~vv~f~~~~~~~~~~~ 58 (177) T smart00327 2 LDVVFLLDGSGSMG--------------------PNRFEKAKEFVLKLVEQLDIGP---DGDRVGLVTFSDDATVLFPLN 58 (177) T ss_pred CEEEEEEECCCCCC--------------------CHHHHHHHHHHHHHHHHHHCCC---CCCEEEEEEECCCEEEEECCC T ss_conf 48999992889988--------------------2899999999999999864179---987899999637268997688 Q ss_pred -CCCHHHHHHHHHHHHHCCCCCCCCHHHHHHHHHHHHCCCCCCCCCCCCCCCCCCCCCCCCCCEEEEEECCCCCCCCCCC Q ss_conf -7784889999998640467887653889999998613101233455433477677877666248996068677777653 Q gi|254781110|r 270 -SWGTEKVRQYVTRDMDSLILKPTDSTPAMKQAYQILTSDKKRSFFTNFFRQGVKIPSLPFQKFIIFLTDGENNNFKSNV 348 (420) Q Consensus 270 -t~~~~~~~~~i~~~~~~~~~g~T~~~~gl~~~~~~L~~~~~~~~~~~~~~~~~~~~~~~~~k~iil~TDG~~~~~~~~~ 348 (420) ..+...+.+.+..+.. ..+|+|+...+|.++.+.+...... ..+..+|++|++|||..+.. . 
T Consensus 59 ~~~~~~~~~~~i~~l~~-~~~g~t~~~~al~~a~~~~~~~~~~-------------~~~~~~~~iil~TDG~~~~~---~ 121 (177) T smart00327 59 DSRSKDALLEALASLSY-KLGGGTNLGAALQYALENLFSKSAG-------------SRRGAPKVLILITDGESNDG---G 121 (177) T ss_pred CCCCHHHHHHHHHHCCC-CCCCCCCCHHHHHHHHHHHHHHHCC-------------CCCCCCEEEEEEECCCCCCC---H T ss_conf 86899999999971415-5788776428999999999766503-------------77887428999805887872---5 Q ss_pred HHHHHHHHHHHCCCEEEEEEECCCCCHHHHHHHHHCCC-CCEEE Q ss_conf 48999999997898799999548977899999862189-82798 Q gi|254781110|r 349 NTIKICDKAKENFIKIVTISINASPNGQRLLKTCVSSP-EYHYN 391 (420) Q Consensus 349 ~~~~~c~~~k~~gi~i~tIgf~~~~~~~~~l~~cas~~-~~yf~ 391 (420) ........+|+.||.||+||+|.+.+.. .|+..|+.+ +.|+. T Consensus 122 ~~~~~~~~~~~~~v~i~~ig~g~~~~~~-~l~~ia~~~~~~~~~ 164 (177) T smart00327 122 DLLKAAKELKRSGVKVFVVGVGNDVDEE-ELKKLASAPGGVYVF 164 (177) T ss_pred HHHHHHHHHHHCCCEEEEEEECCCCCHH-HHHHHHHCCCCEEEE T ss_conf 2999999998679489999958847999-999998489965999 No 23 >cd01481 vWA_collagen_alpha3-VI-like VWA_collagen alpha 3(VI) like: The extracellular matrix represents a complex alloy of variable members of diverse protein families defining structural integrity and various physiological functions. The most abundant family is the collagens with more than 20 different collagen types identified thus far. Collagens are centrally involved in the formation of fibrillar and microfibrillar networks of the extracellular matrix, basement membranes as well as other structures of the extracellular matrix. Some collagens have about 15-18 vWA domains in them. The VWA domains present in these collagens mediate protein-protein interactions. 
Probab=99.43 E-value=3.3e-11 Score=91.49 Aligned_cols=161 Identities=12% Similarity=0.132 Sum_probs=116.3 Q ss_pred CEEEEECCCCCCCCCCCCCCCCCCCCCCCCCCCCCHHHHHHHHHHHHHHHCCCCCCCCCCCCEEEECCCCCCCCCCCCCC Q ss_conf 20122102232422335787666675346766530678899989999851046787655430223204876431244577 Q gi|254781110|r 192 LIELVVDLSGSMHCAMNSDPEDVNSAPICQDKKRTKMAALKNALLLFLDSIDLLSHVKEDVYMGLIGYTTRVEKNIEPSW 271 (420) Q Consensus 192 ~~~~~~d~s~s~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~a~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~lt~ 271 (420) |+.|.+|-|+|+. ...++..++-+..+++.+. ..++..|++++.|++.......|.. T Consensus 2 DlvFllD~S~si~--------------------~~~F~~~k~Fv~~lv~~f~---i~~~~trVgvi~ys~~~~~~f~l~~ 58 (165) T cd01481 2 DIVFLIDGSDNVG--------------------SGNFPAIRDFIERIVQSLD---VGPDKIRVAVVQFSDTPRPEFYLNT 58 (165) T ss_pred CEEEEEECCCCCC--------------------HHHHHHHHHHHHHHHHHHC---CCCCCEEEEEEEECCCEEEEEECCC T ss_conf 7899996889989--------------------8999999999999999604---6888627889998686479997677 Q ss_pred --CHHHHHHHHHHHHHCCCCCCCCHHHHHHHHHHHHCCCCCCCCCCCCCCCCCCCCCCCCCCEEEEEECCCCCCCCCCCH Q ss_conf --848899999986404678876538899999986131012334554334776778776662489960686777776534 Q gi|254781110|r 272 --GTEKVRQYVTRDMDSLILKPTDSTPAMKQAYQILTSDKKRSFFTNFFRQGVKIPSLPFQKFIIFLTDGENNNFKSNVN 349 (420) Q Consensus 272 --~~~~~~~~i~~~~~~~~~g~T~~~~gl~~~~~~L~~~~~~~~~~~~~~~~~~~~~~~~~k~iil~TDG~~~~~~~~~~ 349 (420) +...+...+.++... .+++|+.+.+|.++.+.+..... .....+..+|++|++|||..+. . 
T Consensus 59 ~~~~~~l~~~I~~i~~~-~g~~t~tg~AL~~a~~~~f~~~~-----------g~R~r~~v~kvlvviTdG~s~d-----~ 121 (165) T cd01481 59 HSTKADVLGAVRRLRLR-GGSQLNTGSALDYVVKNLFTKSA-----------GSRIEEGVPQFLVLITGGKSQD-----D 121 (165) T ss_pred CCCHHHHHHHHHHHHCC-CCCCEEHHHHHHHHHHHHCCCCC-----------CCCCCCCCCEEEEEEECCCCCC-----H T ss_conf 68999999999841045-89843699999999997167567-----------8875579986999984898853-----7 Q ss_pred HHHHHHHHHHCCCEEEEEEECCCCCHHHHHHHHHCCCCCEEEECC Q ss_conf 899999999789879999954897789999986218982798179 Q gi|254781110|r 350 TIKICDKAKENFIKIVTISINASPNGQRLLKTCVSSPEYHYNVVN 394 (420) Q Consensus 350 ~~~~c~~~k~~gi~i~tIgf~~~~~~~~~l~~cas~~~~yf~a~~ 394 (420) ....+..+|+.||+||+||++. .+. ..|+..||.|++.|.+++ T Consensus 122 ~~~~a~~lr~~gV~i~aVGvg~-~~~-~eL~~IAs~p~~vf~~~~ 164 (165) T cd01481 122 VERPAVALKRAGIVPFAIGARN-ADL-AELQQIAFDPSFVFQVSD 164 (165) T ss_pred HHHHHHHHHHCCCEEEEEECCC-CCH-HHHHHHHCCCCCEEECCC T ss_conf 8999999998897899996897-999-999998589877697389 No 24 >cd01482 vWA_collagen_alphaI-XII-like Collagen: The extracellular matrix represents a complex alloy of variable members of diverse protein families defining structural integrity and various physiological functions. The most abundant family is the collagens with more than 20 different collagen types identified thus far. Collagens are centrally involved in the formation of fibrillar and microfibrillar networks of the extracellular matrix, basement membranes as well as other structures of the extracellular matrix. Some collagens have about 15-18 vWA domains in them. The VWA domains present in these collagens mediate protein-protein interactions. 
Probab=99.39 E-value=5.8e-11 Score=89.89 Aligned_cols=159 Identities=16% Similarity=0.197 Sum_probs=116.2 Q ss_pred CCEEEEECCCCCCCCCCCCCCCCCCCCCCCCCCCCCHHHHHHHHHHHHHHHCCCCCCCCCCCCEEEECCCCCCCCCCCCC Q ss_conf 32012210223242233578766667534676653067889998999985104678765543022320487643124457 Q gi|254781110|r 191 FLIELVVDLSGSMHCAMNSDPEDVNSAPICQDKKRTKMAALKNALLLFLDSIDLLSHVKEDVYMGLIGYTTRVEKNIEPS 270 (420) Q Consensus 191 ~~~~~~~d~s~s~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~a~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~lt 270 (420) .|+.|++|.|+|+. ...++..++-+..+++.+. ..++..|++++.|++......+|. T Consensus 1 aDlvfllD~S~Si~--------------------~~~f~~~k~fi~~lv~~f~---i~~~~~rvgvv~ys~~~~~~~~l~ 57 (164) T cd01482 1 ADIVFLVDGSWSIG--------------------RSNFNLVRSFLSSVVEAFE---IGPDGVQVGLVQYSDDPRTEFDLN 57 (164) T ss_pred CCEEEEEECCCCCC--------------------HHHHHHHHHHHHHHHHHCC---CCCCCEEEEEEEECCCCEEEECCC T ss_conf 97999996989988--------------------8999999999999999647---688862899999447512787343 Q ss_pred C--CHHHHHHHHHHHHHCCCCCCCCHHHHHHHHHHHHCCCCCCCCCCCCCCCCCCCCCCCCCCEEEEEECCCCCCCCCCC Q ss_conf 7--84889999998640467887653889999998613101233455433477677877666248996068677777653 Q gi|254781110|r 271 W--GTEKVRQYVTRDMDSLILKPTDSTPAMKQAYQILTSDKKRSFFTNFFRQGVKIPSLPFQKFIIFLTDGENNNFKSNV 348 (420) Q Consensus 271 ~--~~~~~~~~i~~~~~~~~~g~T~~~~gl~~~~~~L~~~~~~~~~~~~~~~~~~~~~~~~~k~iil~TDG~~~~~~~~~ 348 (420) . +...+...+..+. +.+|+|+++.+|.++.+.+-.... ...++.+|++|++|||..+. 
T Consensus 58 ~~~~~~~l~~~i~~i~--~~~g~t~~~~AL~~~~~~~f~~~~-------------g~R~~~~kvlvliTDG~s~d----- 117 (164) T cd01482 58 AYTSKEDVLAAIKNLP--YKGGNTRTGKALTHVREKNFTPDA-------------GARPGVPKVVILITDGKSQD----- 117 (164) T ss_pred CCCCHHHHHHHHHHCC--CCCCCCCHHHHHHHHHHHHHCHHC-------------CCCCCCCEEEEEECCCCCCC----- T ss_conf 4699899999986402--668997289999999998615002-------------89888860799960798843----- Q ss_pred HHHHHHHHHHHCCCEEEEEEECCCCCHHHHHHHHHCCC--CCEEEECC Q ss_conf 48999999997898799999548977899999862189--82798179 Q gi|254781110|r 349 NTIKICDKAKENFIKIVTISINASPNGQRLLKTCVSSP--EYHYNVVN 394 (420) Q Consensus 349 ~~~~~c~~~k~~gi~i~tIgf~~~~~~~~~l~~cas~~--~~yf~a~~ 394 (420) .....++.+|+.||+||+||++. .+ ...|+..||.| .|+|.+++ T Consensus 118 ~~~~~a~~lr~~gv~i~~VGVg~-~~-~~eL~~IAs~P~~~hvf~~~~ 163 (164) T cd01482 118 DVELPARVLRNLGVNVFAVGVKD-AD-ESELKMIASKPSETHVFNVAD 163 (164) T ss_pred HHHHHHHHHHHCCCEEEEEECCC-CC-HHHHHHHHCCCCHHCEEECCC T ss_conf 38999999998893899997883-78-999999968985661797479 No 25 >cd00198 vWFA Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains. 
Probab=99.39 E-value=4.3e-11 Score=90.78 Aligned_cols=151 Identities=23% Similarity=0.238 Sum_probs=111.6 Q ss_pred CEEEEECCCCCCCCCCCCCCCCCCCCCCCCCCCCCHHHHHHHHHHHHHHHCCCCCCCCCCCCEEEECCCCCCCCCCCCCC Q ss_conf 20122102232422335787666675346766530678899989999851046787655430223204876431244577 Q gi|254781110|r 192 LIELVVDLSGSMHCAMNSDPEDVNSAPICQDKKRTKMAALKNALLLFLDSIDLLSHVKEDVYMGLIGYTTRVEKNIEPSW 271 (420) Q Consensus 192 ~~~~~~d~s~s~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~a~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~lt~ 271 (420) ++.+++|.|+||. ..+++..+.++..+++.+... ....+++++.|++......+++. T Consensus 2 div~vlD~S~Sm~--------------------~~~~~~~k~~~~~~~~~l~~~---~~~~~v~vv~f~~~~~~~~~~~~ 58 (161) T cd00198 2 DIVFLLDVSGSMG--------------------GEKLDKAKEALKALVSSLSAS---PPGDRVGLVTFGSNARVVLPLTT 58 (161) T ss_pred EEEEEEECCCCCC--------------------CHHHHHHHHHHHHHHHHHHHC---CCCCEEEEEEECCCEEEEECCCC T ss_conf 0999991889988--------------------079999999999999987655---99988999993795148814741 Q ss_pred C--HHHHHHHHHHHHHCCCCCCCCHHHHHHHHHHHHCCCCCCCCCCCCCCCCCCCCCCCCCCEEEEEECCCCCCCCCCCH Q ss_conf 8--48899999986404678876538899999986131012334554334776778776662489960686777776534 Q gi|254781110|r 272 G--TEKVRQYVTRDMDSLILKPTDSTPAMKQAYQILTSDKKRSFFTNFFRQGVKIPSLPFQKFIIFLTDGENNNFKSNVN 349 (420) Q Consensus 272 ~--~~~~~~~i~~~~~~~~~g~T~~~~gl~~~~~~L~~~~~~~~~~~~~~~~~~~~~~~~~k~iil~TDG~~~~~~~~~~ 349 (420) . .......++.+.. ...|+|+...++..+.+.+.... ....+|++|++|||.++... .. 
T Consensus 59 ~~~~~~~~~~i~~~~~-~~~g~t~~~~al~~a~~~~~~~~----------------~~~~~~~iiliTDG~~~~~~--~~ 119 (161) T cd00198 59 DTDKADLLEAIDALKK-GLGGGTNIGAALRLALELLKSAK----------------RPNARRVIILLTDGEPNDGP--EL 119 (161) T ss_pred HHHHHHHHHHHHHCCC-CCCCCCHHHHHHHHHHHHHHHHC----------------CCCCCEEEEEECCCCCCCCH--HH T ss_conf 2579999997751356-89998389999999999987532----------------55565179996789989873--67 Q ss_pred HHHHHHHHHHCCCEEEEEEECCCCCHHHHHHHHHCC Q ss_conf 899999999789879999954897789999986218 Q gi|254781110|r 350 TIKICDKAKENFIKIVTISINASPNGQRLLKTCVSS 385 (420) Q Consensus 350 ~~~~c~~~k~~gi~i~tIgf~~~~~~~~~l~~cas~ 385 (420) .....+.+|+.||.||+||+|.+.+.. .|+..++. T Consensus 120 ~~~~~~~~~~~~v~i~~igig~~~~~~-~l~~ia~~ 154 (161) T cd00198 120 LAEAARELRKLGITVYTIGIGDDANED-ELKEIADK 154 (161) T ss_pred HHHHHHHHHHCCCEEEEEEECHHHCHH-HHHHHHHC T ss_conf 999999999779989999966111999-99999838 No 26 >TIGR03436 acidobact_VWFA VWFA-related Acidobacterial domain. Members of this family are bacterial domains that include a region related to the von Willebrand factor type A (VWFA) domain (pfam00092). These domains are restricted to, and have undergone a large paralogous family expansion in, the Acidobacteria, including Solibacter usitatus and Acidobacterium capsulatum ATCC 51196. Probab=99.34 E-value=4.4e-10 Score=84.17 Aligned_cols=180 Identities=17% Similarity=0.210 Sum_probs=127.2 Q ss_pred CCCCEEEEECCCCCCCCCCCCCCCCCCCCCCCCCCCCCHHHHHHHHHHHHHHHCCCCCCCCCCCCEEEECCCCCCCCCCC Q ss_conf 44320122102232422335787666675346766530678899989999851046787655430223204876431244 Q gi|254781110|r 189 PIFLIELVVDLSGSMHCAMNSDPEDVNSAPICQDKKRTKMAALKNALLLFLDSIDLLSHVKEDVYMGLIGYTTRVEKNIE 268 (420) Q Consensus 189 ~~~~~~~~~d~s~s~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~a~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 268 (420) .+..+.+++|.|++|. .++...+.+...|++.... 
...++.++.|++......+ T Consensus 52 ~P~sv~l~~D~S~s~~---------------------~~~~~~~~a~~~fl~~~l~-----p~d~~avv~F~~~~~l~~~ 105 (296) T TIGR03436 52 LPLTVGLVIDTSGSMF---------------------NDLARARAAAIRFLKTVLR-----PNDEVFVVTFSTQLRLLQD 105 (296) T ss_pred CCCEEEEEEECCCCCH---------------------HHHHHHHHHHHHHHHHHCC-----CCCEEEEEEECCCEEECCC T ss_conf 9846999997899914---------------------5399999999999986368-----8867999994895457278 Q ss_pred CCCCHHHHHHHHHHHHHC------------CCCCCCCHHHHHHHHHH-HHCCCCCCCCCCCCCCCCCCCCCCCCCCEEEE Q ss_conf 577848899999986404------------67887653889999998-61310123345543347767787766624899 Q gi|254781110|r 269 PSWGTEKVRQYVTRDMDS------------LILKPTDSTPAMKQAYQ-ILTSDKKRSFFTNFFRQGVKIPSLPFQKFIIF 335 (420) Q Consensus 269 lt~~~~~~~~~i~~~~~~------------~~~g~T~~~~gl~~~~~-~L~~~~~~~~~~~~~~~~~~~~~~~~~k~iil 335 (420) +|.+...+...++.+... ...|+|+..+++..+.. .+.... ....-+|++|+ T Consensus 106 fT~d~~~l~~al~~l~~~~~~~~~~~~~~~~~~g~tal~dAi~laa~~~~~~~~---------------~~~~gRK~li~ 170 (296) T TIGR03436 106 FTSDPRLLEAALNKLKPPLRTDYNSSGAFVADAGGTALYDAITLAALQQLANAL---------------AGIPGRKALIV 170 (296) T ss_pred CCCCHHHHHHHHHHHCCCCCCCCCCCCCCCCCCCCCCHHHHHHHHHHHHHHHHC---------------CCCCCCEEEEE T ss_conf 988999999999861567654333345323578741027889999999987540---------------47988679999 Q ss_pred EECCCCCCCCCCCHHHHHHHHHHHCCCEEEEEEECCCC------------CH-HHHHHHHHCCCCCEEEECCHHHHHHHH Q ss_conf 60686777776534899999999789879999954897------------78-999998621898279817989999999 Q gi|254781110|r 336 LTDGENNNFKSNVNTIKICDKAKENFIKIVTISINASP------------NG-QRLLKTCVSSPEYHYNVVNADSLIHVF 402 (420) Q Consensus 336 ~TDG~~~~~~~~~~~~~~c~~~k~~gi~i~tIgf~~~~------------~~-~~~l~~cas~~~~yf~a~~~~~L~~aF 402 (420) +|||.++.... ....+-+.+..++|.||+|++.... .+ +.|-+.|..++|++|..+. 
.+|.++| T Consensus 171 iSdG~d~~s~~--~~~~~~~~a~~a~v~IY~I~~~~~~~~~~~~~~~~~~~~~~~L~~lA~~TGG~~f~~~~-~dl~~~~ 247 (296) T TIGR03436 171 ISDGEDNSSRD--TLERAIEAAQRADVLIYSIDARGLRAPDLGAGAKAGLSGPETLERLAAETGGRAFYVNS-NDIDEAF 247 (296) T ss_pred EECCCCCCCCC--CHHHHHHHHHHCCCEEEEECCCCCCCCCCCCCCCCCCCCHHHHHHHHHHHCCEEECCCC-CCHHHHH T ss_conf 92698863304--89999999998497799954676566564444455676279999999973996755474-1089999 Q ss_pred HHHHHHHHCC Q ss_conf 9999874202 Q gi|254781110|r 403 QNISQLMVHR 412 (420) Q Consensus 403 ~~Ia~~I~~l 412 (420) +.|++++-+- T Consensus 248 ~~i~~~lr~q 257 (296) T TIGR03436 248 AQIAEELRSQ 257 (296) T ss_pred HHHHHHHHHE T ss_conf 9999987523 No 27 >cd01476 VWA_integrin_invertebrates VWA_integrin (invertebrates): Integrins are a family of cell surface receptors that have diverse functions in cell-cell and cell-extracellular matrix interactions. Because of their involvement in many biologically important adhesion processes, integrins are conserved across a wide range of multicellular animals. Integrins from invertebrates have been identified from six phyla. There are no data to date to suggest any immunological functions for the invertebrate integrins. The members of this sub-group have the conserved MIDAS motif that is characteristic of this domain, suggesting the involvement of the integrins in the recognition and binding of multi-ligands. Probab=99.34 E-value=6.6e-11 Score=89.54 Aligned_cols=156 Identities=17% Similarity=0.221 Sum_probs=109.8 Q ss_pred CCEEEEECCCCCCCCCCCCCCCCCCCCCCCCCCCCCHHHHHHHHHHHHHHHCCCCCCCCCCCCEEEECCCCCCCCCC--C Q ss_conf 32012210223242233578766667534676653067889998999985104678765543022320487643124--4 Q gi|254781110|r 191 FLIELVVDLSGSMHCAMNSDPEDVNSAPICQDKKRTKMAALKNALLLFLDSIDLLSHVKEDVYMGLIGYTTRVEKNI--E 268 (420) Q Consensus 191 ~~~~~~~d~s~s~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~a~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~--~ 268 (420) .|+.|++|-|+|.. ..++..++-+..+++.+. ..++..|++++.|++...... .
T Consensus 1 lDl~fllD~S~Sv~---------------------~~f~~~k~F~~~lv~~f~---i~~~~~rVgvv~ys~~~~~~i~f~ 56 (163) T cd01476 1 LDLLFVLDSSGSVR---------------------GKFEKYKKYIERIVEGLE---IGPTATRVALITYSGRGRQRVRFN 56 (163) T ss_pred CCEEEEEECCCCHH---------------------HHHHHHHHHHHHHHHHHC---CCCCCEEEEEEEECCCCCEEEEEC T ss_conf 92999991888866---------------------739999999999999614---688853899999669870788875 Q ss_pred CC--CCHHHHHHHHHHHHHCCCCCCCCHHHHHHHHHHHHCCCCCCCCCCCCCCCCCCCCCCCCCCEEEEEECCCCCCCCC Q ss_conf 57--7848899999986404678876538899999986131012334554334776778776662489960686777776 Q gi|254781110|r 269 PS--WGTEKVRQYVTRDMDSLILKPTDSTPAMKQAYQILTSDKKRSFFTNFFRQGVKIPSLPFQKFIIFLTDGENNNFKS 346 (420) Q Consensus 269 lt--~~~~~~~~~i~~~~~~~~~g~T~~~~gl~~~~~~L~~~~~~~~~~~~~~~~~~~~~~~~~k~iil~TDG~~~~~~~ 346 (420) +. .+..++.+.++.+. +.+|+|+++.+|..+.+.+.+... .+++.+|++|++|||..+. T Consensus 57 l~~~~~~~~l~~~I~~i~--~~~g~T~tg~AL~~a~~~~~~~~g--------------~R~~~~kv~vviTDG~s~d--- 117 (163) T cd01476 57 LPKHNDGEELLEKVDNLR--FIGGTTATGAAIEVALQQLDPSEG--------------RREGIPKVVVVLTDGRSHD--- 117 (163) T ss_pred CCCCCCHHHHHHHHHHEE--CCCCCCCHHHHHHHHHHHHHHHCC--------------CCCCCEEEEEEEECCCCCC--- T ss_conf 777799999999997520--368985489999999997214206--------------7899616999981898766--- Q ss_pred CCHHHHHHHHHHH-CCCEEEEEEECCCCC-HHHHHHHHHCCCCCEEE Q ss_conf 5348999999997-898799999548977-89999986218982798 Q gi|254781110|r 347 NVNTIKICDKAKE-NFIKIVTISINASPN-GQRLLKTCVSSPEYHYN 391 (420) Q Consensus 347 ~~~~~~~c~~~k~-~gi~i~tIgf~~~~~-~~~~l~~cas~~~~yf~ 391 (420) .....+..+|+ .||+||.||+|.... ....|+..|++|+|.|. T Consensus 118 --~~~~~a~~lr~~~gv~v~avgVG~~~~~d~~eL~~Ia~~~~~Vft 162 (163) T cd01476 118 --DPEKQARILRAVPNIETFAVGTGDPGTVDTEELHSITGNEDHIFT 162 (163) T ss_pred --CHHHHHHHHHHHCCCEEEEEEECCCCCCCHHHHHHHCCCCCCCCC T ss_conf --488999999970998999998388650159999986499725457 No 28 >cd01462 VWA_YIEM_type VWA YIEM type: Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). 
Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation, and immune defenses. In integrins these domains form heterodimers, while in vWF they form multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion-dependent adhesion site termed the MIDAS motif that is a characteristic feature of most, if Probab=99.18 E-value=1.8e-09 Score=80.22 Aligned_cols=144 Identities=17% Similarity=0.200 Sum_probs=98.7 Q ss_pred CEEEEECCCCCCCCCCCCCCCCCCCCCCCCCCCCCHHHHHHHHHHHHHHHCCCCCCCCCCCCEEEECCCCCCCCC-CCCC Q ss_conf 201221022324223357876666753467665306788999899998510467876554302232048764312-4457 Q gi|254781110|r 192 LIELVVDLSGSMHCAMNSDPEDVNSAPICQDKKRTKMAALKNALLLFLDSIDLLSHVKEDVYMGLIGYTTRVEKN-IEPS 270 (420) Q Consensus 192 ~~~~~~d~s~s~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~a~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~lt 270 (420) .+.+++|+||||. ..++..++.+...++..+... ..+++++.|++.....
.+++ T Consensus 2 pvV~vlD~SGSM~--------------------G~~~~~ak~~~~~l~~~l~~~-----~~~~~lv~F~~~~~~~~~~~~ 56 (152) T cd01462 2 PVILLVDQSGSMY--------------------GAPEEVAKAVALALLRIALAE-----NRDTYLILFDSEFQTKIVDKT 56 (152) T ss_pred CEEEEEECCCCCC--------------------CCHHHHHHHHHHHHHHHHHHC-----CCEEEEEEECCCCEEEECCCC T ss_conf 9999997999989--------------------806999999999999973233-----980999991687357715876 Q ss_pred CCHHHHHHHHHHHHHCCCCCCCCHHHHHHHHHHHHCCCCCCCCCCCCCCCCCCCCCCCCCCEEEEEECCCCCCCCCCCHH Q ss_conf 78488999999864046788765388999999861310123345543347767787766624899606867777765348 Q gi|254781110|r 271 WGTEKVRQYVTRDMDSLILKPTDSTPAMKQAYQILTSDKKRSFFTNFFRQGVKIPSLPFQKFIIFLTDGENNNFKSNVNT 350 (420) Q Consensus 271 ~~~~~~~~~i~~~~~~~~~g~T~~~~gl~~~~~~L~~~~~~~~~~~~~~~~~~~~~~~~~k~iil~TDG~~~~~~~~~~~ 350 (420) .+..+....+ ....++|||++..+|..|...|... ...+..|||+|||+..... ... T Consensus 57 ~~~~~~~~~i---~~~~~~GGT~i~~aL~~A~~~l~~~------------------~~~~~~IvlITDG~~~~~~--~~~ 113 (152) T cd01462 57 DDLEEPVEFL---SGVQLGGGTDINKALRYALELIERR------------------DPRKADIVLITDGYEGGVS--DEL 113 (152) T ss_pred CCHHHHHHHH---HHCCCCCCCCHHHHHHHHHHHHHCC------------------CCCCCEEEEEECCCCCCCH--HHH T ss_conf 4599999999---7253689865799999999987425------------------7656469998267567983--999 Q ss_pred HHHHHHHHHCCCEEEEEEECCCCCHHHHHHHHHC Q ss_conf 9999999978987999995489778999998621 Q gi|254781110|r 351 IKICDKAKENFIKIVTISINASPNGQRLLKTCVS 384 (420) Q Consensus 351 ~~~c~~~k~~gi~i~tIgf~~~~~~~~~l~~cas 384 (420) ...++..++.|+++|++++|++.+ ..+++..+. 
T Consensus 114 ~~~~~~~~~~~~r~~~~~iG~~~~-p~~~~~~~~ 146 (152) T cd01462 114 LREVELKRSRVARFVALALGDHGN-PGYDRISAE 146 (152) T ss_pred HHHHHHHHHCCEEEEEEEECCCCC-CHHHHHHHH T ss_conf 999999983891999999899988-278787666 No 29 >COG1240 ChlD Mg-chelatase subunit ChlD [Coenzyme metabolism] Probab=99.02 E-value=2.8e-08 Score=72.47 Aligned_cols=169 Identities=25% Similarity=0.272 Sum_probs=128.6 Q ss_pred CCCCCEEEEECCCCCCCCCCCCCCCCCCCCCCCCCCCCCHHHHHHHHHHHHHHHCCCCCCCCCCCCEEEECCC-CCCCCC Q ss_conf 3443201221022324223357876666753467665306788999899998510467876554302232048-764312 Q gi|254781110|r 188 RPIFLIELVVDLSGSMHCAMNSDPEDVNSAPICQDKKRTKMAALKNALLLFLDSIDLLSHVKEDVYMGLIGYT-TRVEKN 266 (420) Q Consensus 188 ~~~~~~~~~~d~s~s~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~a~~~~~~~~~~~~~~~~~~~~~~~~~~-~~~~~~ 266 (420) +.-..+.|++|.|+||. ...|+..+|-++-.+.... +....++++++|. ...... T Consensus 76 r~g~lvvfvVDASgSM~-------------------~~~Rm~aaKG~~~~lL~dA-----Yq~RdkvavI~F~G~~A~ll 131 (261) T COG1240 76 RAGNLIVFVVDASGSMA-------------------ARRRMAAAKGAALSLLRDA-----YQRRDKVAVIAFRGEKAELL 131 (261) T ss_pred CCCCCEEEEEECCCCCH-------------------HHHHHHHHHHHHHHHHHHH-----HHCCCEEEEEEECCCCCEEE T ss_conf 76774899994765420-------------------5789999999999999999-----97035489999637765388 Q ss_pred CCCCCCHHHHHHHHHHHHHCCCCCCCCHHHHHHHHHHHHCCCCCCCCCCCCCCCCCCCCCCCCCCEEEEEECCCCCCCCC Q ss_conf 44577848899999986404678876538899999986131012334554334776778776662489960686777776 Q gi|254781110|r 267 IEPSWGTEKVRQYVTRDMDSLILKPTDSTPAMKQAYQILTSDKKRSFFTNFFRQGVKIPSLPFQKFIIFLTDGENNNFKS 346 (420) Q Consensus 267 ~~lt~~~~~~~~~i~~~~~~~~~g~T~~~~gl~~~~~~L~~~~~~~~~~~~~~~~~~~~~~~~~k~iil~TDG~~~~~~~ 346 (420) .|+|.+.......+..+ .++|.|....||..++..+....+ .++....++|++|||..|.... 
T Consensus 132 l~pT~sv~~~~~~L~~l---~~GG~TPL~~aL~~a~ev~~r~~r--------------~~p~~~~~~vviTDGr~n~~~~ 194 (261) T COG1240 132 LPPTSSVELAERALERL---PTGGKTPLADALRQAYEVLAREKR--------------RGPDRRPVMVVITDGRANVPIP 194 (261) T ss_pred ECCCCCHHHHHHHHHHC---CCCCCCCHHHHHHHHHHHHHHHHC--------------CCCCCCEEEEEEECCCCCCCCC T ss_conf 47865399999999838---999988439999999999997510--------------4887653899973796588889 Q ss_pred CC---HHHHHHHHHHHCCCEEEEEEECCCCCHHHH-HHHHHCCCCCEEEECCHHH Q ss_conf 53---489999999978987999995489778999-9986218982798179899 Q gi|254781110|r 347 NV---NTIKICDKAKENFIKIVTISINASPNGQRL-LKTCVSSPEYHYNVVNADS 397 (420) Q Consensus 347 ~~---~~~~~c~~~k~~gi~i~tIgf~~~~~~~~~-l~~cas~~~~yf~a~~~~~ 397 (420) .. .+..+|.++...|+.+-+|.++.+.-...+ .+.|--.++.||+-+...+ T Consensus 195 ~~~~~e~~~~a~~~~~~g~~~lvid~e~~~~~~g~~~~iA~~~Gg~~~~L~~l~~ 249 (261) T COG1240 195 LGPKAETLEAASKLRLRGIQLLVIDTEGSEVRLGLAEEIARASGGEYYHLDDLSD 249 (261) T ss_pred CCHHHHHHHHHHHHHHCCCCEEEEECCCCCCCCCHHHHHHHHHCCEEEECCCCCC T ss_conf 8657799999999852688479995578523344799999973990786555640 No 30 >cd01454 vWA_norD_type norD type: Denitrifying bacteria contain both membrane-bound and periplasmic nitrate reductases. Denitrification plays a major role in completing the nitrogen cycle by converting nitrate or nitrite to nitrogen gas. The pathway for microbial denitrification has been established as NO3- -> NO2- -> NO -> N2O -> N2. This reaction generally occurs under oxygen-limiting conditions. Genetic and biochemical studies have shown that the first step of the biochemical pathway is catalyzed by periplasmic nitrate reductases. This family is widely present in proteobacteria and firmicutes. This version of the domain is also present in some archaeal members. The function of the vWA domain in this sub-group is not known. Members of this subgroup have a conserved MIDAS motif. 
Probab=98.95 E-value=1.2e-07 Score=68.23 Aligned_cols=149 Identities=21% Similarity=0.302 Sum_probs=94.3 Q ss_pred EEEEECCCCCCCCCCCCCCCCCCCCCCCCCCCCCHHHHHHHHHHHHHHHCCCCCCCCCCCCEEEECCCCCCC--C---CC Q ss_conf 012210223242233578766667534676653067889998999985104678765543022320487643--1---24 Q gi|254781110|r 193 IELVVDLSGSMHCAMNSDPEDVNSAPICQDKKRTKMAALKNALLLFLDSIDLLSHVKEDVYMGLIGYTTRVE--K---NI 267 (420) Q Consensus 193 ~~~~~d~s~s~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~a~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~--~---~~ 267 (420) +.+.+|.|+||.. ..+++.++.+...+...+... ..+..++.|++... . .. T Consensus 3 V~lLlD~SgSM~~-------------------~~~i~~a~~a~~~l~~aL~~~-----g~~~~v~gF~s~~~~r~~~~~~ 58 (174) T cd01454 3 VTLLLDLSGSMRS-------------------DRRIDVAKKAAVLLAEALEAC-----GVPHAILGFTTDAGGRERVRWI 58 (174) T ss_pred EEEEEECCCCCCC-------------------CCHHHHHHHHHHHHHHHHHHC-----CCCEEEEECCCCCCCCCCEEEE T ss_conf 9999989868899-------------------848999999999999999976-----9956999515788984434789 Q ss_pred CC-CCCHHHHHHHHHHHHHCCCCCCCCHHHHHHHHHHHHCCCCCCCCCCCCCCCCCCCCCCCCCCEEEEEECCCCCCCCC Q ss_conf 45-77848899999986404678876538899999986131012334554334776778776662489960686777776 Q gi|254781110|r 268 EP-SWGTEKVRQYVTRDMDSLILKPTDSTPAMKQAYQILTSDKKRSFFTNFFRQGVKIPSLPFQKFIIFLTDGENNNFKS 346 (420) Q Consensus 268 ~l-t~~~~~~~~~i~~~~~~~~~g~T~~~~gl~~~~~~L~~~~~~~~~~~~~~~~~~~~~~~~~k~iil~TDG~~~~~~~ 346 (420) ++ ..+..-......++..+.+.|+|..+.+|.|+...|.. .+..+|+++++|||+++.... 
T Consensus 59 ~~k~f~e~~~~~~~~~i~~l~~~g~Tr~G~Air~a~~~L~~------------------~~~~rkiliviSDG~P~D~~~ 120 (174) T cd01454 59 KIKDFDESLHERARKRLAALSPGGNTRDGAAIRHAAERLLA------------------RPEKRKILLVISDGEPNDLDY 120 (174) T ss_pred ECCCCCCCHHHHHHHHHHCCCCCCCCCCHHHHHHHHHHHHH------------------CCCCCEEEEEEECCCCCCCCC T ss_conf 32366742114568888511878989617999999999863------------------976667999983899766777 Q ss_pred C--C-----HHHHHHHHHHHCCCEEEEEEECCCCC--HHHHHHHHH Q ss_conf 5--3-----48999999997898799999548977--899999862 Q gi|254781110|r 347 N--V-----NTIKICDKAKENFIKIVTISINASPN--GQRLLKTCV 383 (420) Q Consensus 347 ~--~-----~~~~~c~~~k~~gi~i~tIgf~~~~~--~~~~l~~ca 383 (420) . . .....+..++..||.+|.|+++.+.. ..+.++..= T Consensus 121 ~~~~~~~~~D~~~av~e~~~~GI~~~~i~i~~~~~~~~~~~l~~i~ 166 (174) T cd01454 121 YEGNVFATEDALRAVIEARKLGIEVFGITIDRDATTVDKEYLKNIF 166 (174) T ss_pred CCCCHHHHHHHHHHHHHHHHCCCEEEEEEECCCCCHHHHHHHHHHC T ss_conf 8875538999999999999879889999989855566999999842 No 31 >cd01477 vWA_F09G8-8_type VWA F09G8.8 type: Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses. In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. 
Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of mo Probab=98.95 E-value=1.2e-07 Score=68.25 Aligned_cols=166 Identities=16% Similarity=0.175 Sum_probs=113.0 Q ss_pred CCCCEEEEECCCCCCCCCCCCCCCCCCCCCCCCCCCCCHHHHHHHHHHHHHHHC---CCCCCCCCCCCEEEECCCCCCCC Q ss_conf 443201221022324223357876666753467665306788999899998510---46787655430223204876431 Q gi|254781110|r 189 PIFLIELVVDLSGSMHCAMNSDPEDVNSAPICQDKKRTKMAALKNALLLFLDSI---DLLSHVKEDVYMGLIGYTTRVEK 265 (420) Q Consensus 189 ~~~~~~~~~d~s~s~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~a~~~~~~~~---~~~~~~~~~~~~~~~~~~~~~~~ 265 (420) .-+|+++++|.|.+|.. ..+...+.-+..+.... ......+...|+|+++|++.+.. T Consensus 18 LWLDVv~VVD~S~~mt~--------------------~gl~~V~~~I~s~f~~~t~iGt~~~~pr~TRVGlVTYn~~Atv 77 (193) T cd01477 18 LWLDIVFVVDNSKGMTQ--------------------GGLWQVRATISSLFGSSSQIGTDYDDPRSTRVGLVTYNSNATV 77 (193) T ss_pred EEEEEEEEEECCCCCCC--------------------CCHHHHHHHHHHHHHCCCCCCCCCCCCCCEEEEEEEECCCCEE T ss_conf 23789999967876562--------------------1099999999999713540357889987338999996787459 Q ss_pred CCCCCC--CHHHHHHHHHH-HHHCCCCCCCCHHHHHHHHHHHHCCCCCCCCCCCCCCCCCCCCCCCCCCEEEEEECCCCC Q ss_conf 244577--84889999998-640467887653889999998613101233455433477677877666248996068677 Q gi|254781110|r 266 NIEPSW--GTEKVRQYVTR-DMDSLILKPTDSTPAMKQAYQILTSDKKRSFFTNFFRQGVKIPSLPFQKFIIFLTDGENN 342 (420) Q Consensus 266 ~~~lt~--~~~~~~~~i~~-~~~~~~~g~T~~~~gl~~~~~~L~~~~~~~~~~~~~~~~~~~~~~~~~k~iil~TDG~~~ 342 (420) ...|.. ....+...+.. +........++.+.||..+-++|..... ..+.+++|++|+.+-.-.. 
T Consensus 78 vAdLn~~~S~ddl~~~i~~~l~~vsss~~SyL~~GL~aA~~~l~~~~~-------------~~R~nykKVVIVyAs~y~~ 144 (193) T cd01477 78 VADLNDLQSFDDLYSQIQGSLTDVSSTNASYLDTGLQAAEQMLAAGKR-------------TSRENYKKVVIVFASDYND 144 (193) T ss_pred EECCCCCCCHHHHHHHHHHHHCCCCCCCHHHHHHHHHHHHHHHHHCCC-------------CCCCCCCEEEEEEECCCCC T ss_conf 863454565788999998875146666312799999999999983326-------------6424862799999502467 Q ss_pred CCCCCCHHHHHHHHHHHCCCEEEEEEECCCCCHH--HHHHHHHCCCCCEE Q ss_conf 7776534899999999789879999954897789--99998621898279 Q gi|254781110|r 343 NFKSNVNTIKICDKAKENFIKIVTISINASPNGQ--RLLKTCVSSPEYHY 390 (420) Q Consensus 343 ~~~~~~~~~~~c~~~k~~gi~i~tIgf~~~~~~~--~~l~~cas~~~~yf 390 (420) .+. ..+..+++.+|..|+.|-||+|+-+.+.+ ..|+.+|| |++-| T Consensus 145 ~g~--~dp~pvA~rLK~~Gv~IiTVa~~q~~~~~~~~~L~~IAS-pg~nF 191 (193) T cd01477 145 EGS--NDPRPIAARLKSTGIAIITVAFTQDESSNLLDKLGKIAS-PGMNF 191 (193) T ss_pred CCC--CCHHHHHHHHHHCCCEEEEEECCCCCCHHHHHHHHHHCC-CCCCC T ss_conf 898--886999999987697899998268875889998887579-98887 No 32 >COG4245 TerY Uncharacterized protein encoded in toxicity protection region of plasmid R478, contains von Willebrand factor (vWF) domain [General function prediction only] Probab=98.77 E-value=1.3e-07 Score=68.15 Aligned_cols=187 Identities=13% Similarity=0.104 Sum_probs=133.2 Q ss_pred CCCEEEEECCCCCCCCCCCCCCCCCCCCCCCCCCCCCHHHHHHHHHHHHHHHCCCCCCCCCCCCEEEECCCCCCCCCCCC Q ss_conf 43201221022324223357876666753467665306788999899998510467876554302232048764312445 Q gi|254781110|r 190 IFLIELVVDLSGSMHCAMNSDPEDVNSAPICQDKKRTKMAALKNALLLFLDSIDLLSHVKEDVYMGLIGYTTRVEKNIEP 269 (420) Q Consensus 190 ~~~~~~~~d~s~s~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~a~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~l 269 (420) ...+.+.+|+|+||. 
..++..+...+..+++.+..-+..-..+++++++|.+.+....|+ T Consensus 3 RlP~~lllDtSgSM~--------------------Ge~IealN~Glq~m~~~Lkqdp~Ale~v~lsIVTF~~~a~~~~pf 62 (207) T COG4245 3 RLPCYLLLDTSGSMI--------------------GEPIEALNAGLQMMIDTLKQDPYALERVELSIVTFGGPARVIQPF 62 (207) T ss_pred CCCEEEEEECCCCCC--------------------CCCHHHHHHHHHHHHHHHHHCHHHHHEEEEEEEEECCCCEEEECH T ss_conf 778899993675424--------------------561799989999999998748465440578999826850687331 Q ss_pred CCCHHHHHHHHHHHHHCCCCCCCCHHHHHHHHHHHHCCCCCCCCCCCCCCCCCCCCCCCCCCEEEEEECCCCCCCCCCCH Q ss_conf 77848899999986404678876538899999986131012334554334776778776662489960686777776534 Q gi|254781110|r 270 SWGTEKVRQYVTRDMDSLILKPTDSTPAMKQAYQILTSDKKRSFFTNFFRQGVKIPSLPFQKFIIFLTDGENNNFKSNVN 349 (420) Q Consensus 270 t~~~~~~~~~i~~~~~~~~~g~T~~~~gl~~~~~~L~~~~~~~~~~~~~~~~~~~~~~~~~k~iil~TDG~~~~~~~~~~ 349 (420) +.-.+= ......+.|+|..++++..+.+.+....+.. ......+++.++++||||+++.. . T Consensus 63 ~~~~nF------~~p~L~a~GgT~lGaAl~~a~d~Ie~~~~~~---------~a~~kgdyrP~vfLiTDG~PtD~----w 123 (207) T COG4245 63 TDAANF------NPPILTAQGGTPLGAALTLALDMIEERKRKY---------DANGKGDYRPWVFLITDGEPTDD----W 123 (207) T ss_pred HHHHHC------CCCCEECCCCCCHHHHHHHHHHHHHHHHHHC---------CCCCCCCCCEEEEEECCCCCCHH----H T ss_conf 557544------8870136999806799999999998777650---------56775554417999538996657----7 Q ss_pred HHHHHHHH--HHCCCEEEEEEECCCCCHHHHHHHHHCCCCCEEEECCHHHHHHHHHHHHHHHHCCEEEE Q ss_conf 89999999--97898799999548977899999862189827981798999999999998742025587 Q gi|254781110|r 350 TIKICDKA--KENFIKIVTISINASPNGQRLLKTCVSSPEYHYNVVNADSLIHVFQNISQLMVHRKYSV 416 (420) Q Consensus 350 ~~~~c~~~--k~~gi~i~tIgf~~~~~~~~~l~~cas~~~~yf~a~~~~~L~~aF~~Ia~~I~~lr~s~ 416 (420) ...+.... +.+...+-.+++|........|++..+.-..++...+ .++.+-|+=+...|..-.-|. 
T Consensus 124 ~~~~~~~~~~~~~~k~v~a~~~G~~~ad~~~L~qit~~V~~~~t~d~-~~f~~fFkW~SaSisagS~S~ 191 (207) T COG4245 124 QAGAALVFQGERRAKSVAAFSVGVQGADNKTLNQITEKVRQFLTLDG-LQFREFFKWLSASISAGSRST 191 (207) T ss_pred HHHHHHHHHCCCCCCEEEEEEECCCCCCCHHHHHHHHHHCCCCCCCH-HHHHHHHHHHHHHHHCCCCCC T ss_conf 76777764033100528999953543441899998876525234534-889999999987751323234 No 33 >cd01453 vWA_transcription_factor_IIH_type Transcription factors IIH type: TFIIH is a multiprotein complex that is one of the five general transcription factors that binds RNA polymerase II holoenzyme. Orthologues of these genes are found in all completed eukaryotic genomes and all these proteins contain a VWA domain. The p44 subunit of TFIIH functions as a DNA helicase in RNA polymerase II transcription initiation and DNA repair, and its transcriptional activity is dependent on its C-terminal Zn-binding domains. The function of the vWA domain is unclear, but may be involved in complex assembly. The MIDAS motif is not conserved in this sub-group. Probab=98.40 E-value=0.0001 Score=49.24 Aligned_cols=172 Identities=17% Similarity=0.169 Sum_probs=123.5 Q ss_pred CEEEEECCCCCCCCCCCCCCCCCCCCCCCCCCCCCHHHHHHHHHHHHHHHCCCCCCCCCCCCEEEECCCCC-CCCCCCCC Q ss_conf 20122102232422335787666675346766530678899989999851046787655430223204876-43124457 Q gi|254781110|r 192 LIELVVDLSGSMHCAMNSDPEDVNSAPICQDKKRTKMAALKNALLLFLDSIDLLSHVKEDVYMGLIGYTTR-VEKNIEPS 270 (420) Q Consensus 192 ~~~~~~d~s~s~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~a~~~~~~~~~~~~~~~~~~~~~~~~~~~~-~~~~~~lt 270 (420) .+..++|.|.+|.... 
..++|+....+++..|+..+-++.|- -+.|++..-++ +....+++ T Consensus 5 ~l~iiiD~S~am~~~D---------------~~PtRl~~~~~~l~~Fi~effdqNPi---sqlGii~~rn~~a~~ls~ls 66 (183) T cd01453 5 HLIIVIDCSRSMEEQD---------------LKPSRLAVVLKLLELFIEEFFDQNPI---SQLGIISIKNGRAEKLTDLT 66 (183) T ss_pred EEEEEEECCHHHHHCC---------------CCCCHHHHHHHHHHHHHHHHHCCCCC---CEEEEEEEECCEEEEEEECC T ss_conf 9999998837677565---------------89549999999999999998707974---04899999468169976468 Q ss_pred CCHHHHHHHHHHHHHCCCCCCCCHHHHHHHHHHHHCCCCCCCCCCCCCCCCCCCCCCCCCCEEEEEECCCCCCCCCCCHH Q ss_conf 78488999999864046788765388999999861310123345543347767787766624899606867777765348 Q gi|254781110|r 271 WGTEKVRQYVTRDMDSLILKPTDSTPAMKQAYQILTSDKKRSFFTNFFRQGVKIPSLPFQKFIIFLTDGENNNFKSNVNT 350 (420) Q Consensus 271 ~~~~~~~~~i~~~~~~~~~g~T~~~~gl~~~~~~L~~~~~~~~~~~~~~~~~~~~~~~~~k~iil~TDG~~~~~~~~~~~ 350 (420) .+..+....+.+.. .+.|......|+..|...|..- +....+.++|+ ...-.++... .. T Consensus 67 gn~~~hi~~l~~~~--~~~G~~SLqN~Le~A~~~L~~~----------------P~~~sREILiI-~~Sl~t~Dpg--dI 125 (183) T cd01453 67 GNPRKHIQALKTAR--ECSGEPSLQNGLEMALESLKHM----------------PSHGSREVLII-FSSLSTCDPG--NI 125 (183) T ss_pred CCHHHHHHHHHHCC--CCCCCHHHHHHHHHHHHHHHHC----------------CCCCCEEEEEE-ECCCCCCCCC--CH T ss_conf 99899999998545--8999813999999999998208----------------98784489999-7565347976--49 Q ss_pred HHHHHHHHHCCCEEEEEEECCCCCHHHHHHHHHCCCCCEEEECCHHHHHHHHHH Q ss_conf 999999997898799999548977899999862189827981798999999999 Q gi|254781110|r 351 IKICDKAKENFIKIVTISINASPNGQRLLKTCVSSPEYHYNVVNADSLIHVFQN 404 (420) Q Consensus 351 ~~~c~~~k~~gi~i~tIgf~~~~~~~~~l~~cas~~~~yf~a~~~~~L~~aF~~ 404 (420) ...-+.+|+.+|++.+||+... 
-.-.-+.|..++|.|+-+-+..-+.+.+.+ T Consensus 126 ~~ti~~lk~~~IrvsvI~l~aE--v~I~k~l~~~TgG~y~V~lde~H~~~ll~~ 177 (183) T cd01453 126 YETIDKLKKENIRVSVIGLSAE--MHICKEICKATNGTYKVILDETHLKELLLE 177 (183) T ss_pred HHHHHHHHHCCCEEEEEEECHH--HHHHHHHHHHHCCEEEEECCHHHHHHHHHH T ss_conf 9999999983978999974278--999999999839976875399999999995 No 34 >PRK13406 bchD magnesium chelatase subunit D; Provisional Probab=98.36 E-value=9.9e-05 Score=49.35 Aligned_cols=169 Identities=15% Similarity=0.152 Sum_probs=115.0 Q ss_pred CCCCEEEEECCCCCCCCCCCCCCCCCCCCCCCCCCCCCHHHHHHHHHHHHHHHCCCCCCCCCCCCEEEECCC-CCCCCCC Q ss_conf 443201221022324223357876666753467665306788999899998510467876554302232048-7643124 Q gi|254781110|r 189 PIFLIELVVDLSGSMHCAMNSDPEDVNSAPICQDKKRTKMAALKNALLLFLDSIDLLSHVKEDVYMGLIGYT-TRVEKNI 267 (420) Q Consensus 189 ~~~~~~~~~d~s~s~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~a~~~~~~~~~~~~~~~~~~~~~~~~~~-~~~~~~~ 267 (420) ....+.|++|.||||. ..|+..+|-++..+.... +....+++++.|- +...... T Consensus 400 ~~~lviFvVDASGS~A--------------------~~Rm~~aKGAV~~LL~dA-----Y~~RD~ValIaFRG~~AevlL 454 (584) T PRK13406 400 SETTTIFVVDASGSAA--------------------LHRLAEAKGAVELLLAEC-----YVRRDHVALVAFRGRGAELLL 454 (584) T ss_pred CCEEEEEEEECCCCHH--------------------HHHHHHHHHHHHHHHHHH-----HHHHCEEEEEEECCCCCEEEE T ss_conf 6606999982886279--------------------999999999999999999-----960044789987687630741 Q ss_pred CCCCCHHHHHHHHHHHHHCCCCCCCCHHHHHHHHHHHHCCCCCCCCCCCCCCCCCCCCCCCCCCEEEEEECCCCCCCCCC Q ss_conf 45778488999999864046788765388999999861310123345543347767787766624899606867777765 Q gi|254781110|r 268 EPSWGTEKVRQYVTRDMDSLILKPTDSTPAMKQAYQILTSDKKRSFFTNFFRQGVKIPSLPFQKFIIFLTDGENNNFKSN 347 (420) Q Consensus 268 ~lt~~~~~~~~~i~~~~~~~~~g~T~~~~gl~~~~~~L~~~~~~~~~~~~~~~~~~~~~~~~~k~iil~TDG~~~~~~~~ 347 (420) |+|......++.+..+ ..+|+|....||..++++...... ......+|++|||.-|..... 
T Consensus 455 PPTrSv~~A~r~L~~L---P~GG~TPLA~GL~~A~~l~~~~r~----------------~~~~p~~VllTDGRaNv~ldg 515 (584) T PRK13406 455 PPTRSLVRAKRSLAGL---PGGGGTPLAAGLDAALALALSVRR----------------KGQTPTVVLLTDGRANIARDG 515 (584) T ss_pred CCCCCHHHHHHHHHCC---CCCCCCHHHHHHHHHHHHHHHHHC----------------CCCCEEEEEECCCCCCCCCCC T ss_conf 8865599999999629---999988599999999999999755----------------799548999827987778777 Q ss_pred --------CHHHHHHHHHHHCCCEEEEEEECCCCCHHHHHHHHHC-CCCCEEEEC--CHHHHHHHHH Q ss_conf --------3489999999978987999995489778999998621-898279817--9899999999 Q gi|254781110|r 348 --------VNTIKICDKAKENFIKIVTISINASPNGQRLLKTCVS-SPEYHYNVV--NADSLIHVFQ 403 (420) Q Consensus 348 --------~~~~~~c~~~k~~gi~i~tIgf~~~~~~~~~l~~cas-~~~~yf~a~--~~~~L~~aF~ 403 (420) .....++..++..||..-+|--+.... ...+..|. -++.||.-. +++.|..+-+ T Consensus 516 ~~~r~~a~~da~~~A~~l~~~g~~~vVIDT~~~~~--~~a~~LA~~l~a~Y~~Lp~~~A~~l~~~V~ 580 (584) T PRK13406 516 AGGRAQAEEDALAAARALRAAGLPALVIDTSPRPQ--PQARALAEAMGARYLPLPRADATRLSQAVR 580 (584) T ss_pred CCCCHHHHHHHHHHHHHHHHCCCCEEEEECCCCCC--HHHHHHHHHCCCCEEECCCCCHHHHHHHHH T ss_conf 88711489999999999997699789994898886--269999998399189789789899999999 No 35 >TIGR02442 Cob-chelat-sub cobaltochelatase subunit; InterPro: IPR012804 Cobalamin (vitamin B12) can be complexed with metal via ATP-dependent reactions (aerobic pathway) (e.g., in Pseudomonas denitrificans) or via ATP-independent reactions (anaerobic pathway) (e.g., in Salmonella typhimurium). The corresponding cobalt chelatases are not homologous. Cobaltochelatase is responsible for the insertion of cobalt into the corrin ring of coenzyme B12 during its biosynthesis. Two versions have been well described. CbiK/CbiX is a monomeric, anaerobic version which acts early in the biosynthesis (IPR010388 from INTERPRO). CobNST is a trimeric, ATP-dependent, aerobic version which acts late in the biosynthesis (IPR011953 from INTERPRO, IPR006537 from INTERPRO, IPR006538 from INTERPRO). 
The two pathways differ in the point of cobalt insertion during corrin ring formation. There are apparently a number of variations on these two pathways, where the major differences seem to be concerned with the process of ring contraction. Cobaltochelatase shows similarities with magnesium chelatase, which is also a complex ATP-dependent enzyme made up of two separable components. However, unlike the situation in cobaltochelatase, one of these two components is membrane bound in magnesium chelatase. Probab=98.32 E-value=1.6e-05 Score=54.43 Aligned_cols=162 Identities=23% Similarity=0.195 Sum_probs=113.5 Q ss_pred CCCCEEEEECCCCCCCCCCCCCCCCCCCCCCCCCCCCCHHHHHHHHHHHHHHHCCCCCCCCCCCCEEEECC-CCCCCCCC Q ss_conf 44320122102232422335787666675346766530678899989999851046787655430223204-87643124 Q gi|254781110|r 189 PIFLIELVVDLSGSMHCAMNSDPEDVNSAPICQDKKRTKMAALKNALLLFLDSIDLLSHVKEDVYMGLIGY-TTRVEKNI 267 (420) Q Consensus 189 ~~~~~~~~~d~s~s~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~a~~~~~~~~~~~~~~~~~~~~~~~~~-~~~~~~~~ 267 (420) .-..+-|++|-||||. ...||..+|.++-.+.... +....++++++| ...+.... 
T Consensus 563 PPT~sv~~A~r~L~~l---PtGGrTPLa~gL~~A~~v~~~~~~~--------------~~~~~pl~V~iTDGRaNv~L~~ 625 (688) T TIGR02442 563 PPTSSVELAARRLEEL---PTGGRTPLAAGLLKAAEVLSNELLR--------------DDDRRPLVVVITDGRANVALDV 625 (688) T ss_pred CCCCHHHHHHHHHHHC---CCCCCCHHHHHHHHHHHHHHHHHHC--------------CCCCCEEEEEEECCCCCCCCCC T ss_conf 8788489999999728---8989874589999999999998611--------------6899428998707863542666 Q ss_pred -CCCH-----HHHHHHHHHHC-------CCEEEEEEECC-CCCHHHHHHHHHC-CCCCEEE Q ss_conf -6534-----89999999978-------98799999548-9778999998621-8982798 Q gi|254781110|r 346 -SNVN-----TIKICDKAKEN-------FIKIVTISINA-SPNGQRLLKTCVS-SPEYHYN 391 (420) Q Consensus 346 -~~~~-----~~~~c~~~k~~-------gi~i~tIgf~~-~~~~~~~l~~cas-~~~~yf~ 391 (420) .+.. ...+..++.+. ||..-+|==+. ..-.--+.++.|+ -++.||. T Consensus 626 ~~g~~qp~~~~~~~a~~L~~~~~R~R~Lg~~~vV~DTE~~~~v~lGlA~~~A~~lgg~~~~ 686 (688) T TIGR02442 626 SLGEPQPLDDARTIASKLAARASRIRSLGIKFVVIDTENPGFVRLGLAEDLASALGGEYLR 686 (688) T ss_pred CCCCCCHHHHHHHHHHHHHHHHCCEEECCCEEEEEECCCCCCCCCCHHHHHHHHHCCCEEC T ss_conf 7888415778999999988750430111622789972688754222389999982983224 No 36 >TIGR02031 BchD-ChlD magnesium chelatase ATPase subunit D; InterPro: IPR011776 This entry represents one of two ATPase subunits of the trimeric magnesium chelatase responsible for insertion of magnesium ion into protoporphyrin IX. This is an essential step in the biosynthesis of both chlorophyll and bacteriochlorophyll. This subunit is found in green plants, photosynthetic algae, cyanobacteria and other photosynthetic bacteria. Unlike subunit I (IPR011775 from INTERPRO), this subunit is not found in archaea.; GO: 0005524 ATP binding, 0016851 magnesium chelatase activity, 0015995 chlorophyll biosynthetic process. 
Probab=98.05 E-value=9.9e-05 Score=49.33 Aligned_cols=167 Identities=20% Similarity=0.157 Sum_probs=113.4 Q ss_pred CCCCEEEEECCCCCCCCCCCCCCCCCCCCCCCCCCCCCHHHHHHHHHHHHHHHCCCCCCCCCCCC--EEEECC-CCCCCC Q ss_conf 44320122102232422335787666675346766530678899989999851046787655430--223204-876431 Q gi|254781110|r 189 PIFLIELVVDLSGSMHCAMNSDPEDVNSAPICQDKKRTKMAALKNALLLFLDSIDLLSHVKEDVY--MGLIGY-TTRVEK 265 (420) Q Consensus 189 ~~~~~~~~~d~s~s~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~a~~~~~~~~~~~~~~~~~~~--~~~~~~-~~~~~~ 265 (420) ....+-|++|-||||. ...|+..+|-|+..+.... | -+..+ +.++.| +..... T Consensus 509 sg~L~IF~VDASGSsa-------------------a~~Rm~~AKGAV~~LL~~A----Y-v~RD~vkVaLi~FRG~~Ae~ 564 (705) T TIGR02031 509 SGALLIFVVDASGSSA-------------------AVARMSEAKGAVELLLGEA----Y-VHRDQVKVALIAFRGTAAEV 564 (705) T ss_pred CCCEEEEEEECCHHHH-------------------HHHHHHHHHHHHHHHHHHH----H-HHCCCEEEEEEECCCCHHHH T ss_conf 8827999760635789-------------------9999987789999998765----4-41360357763044430000 Q ss_pred CCCCCCCHHHHHHHHHHHHHCCCCCCCCHHHHHHHHHHHHCCCCCCCCCCCCCCCCCCCCCCCCCCEEEEEECCCCCCCC Q ss_conf 24457784889999998640467887653889999998613101233455433477677877666248996068677777 Q gi|254781110|r 266 NIEPSWGTEKVRQYVTRDMDSLILKPTDSTPAMKQAYQILTSDKKRSFFTNFFRQGVKIPSLPFQKFIIFLTDGENNNFK 345 (420) Q Consensus 266 ~~~lt~~~~~~~~~i~~~~~~~~~g~T~~~~gl~~~~~~L~~~~~~~~~~~~~~~~~~~~~~~~~k~iil~TDG~~~~~~ 345 (420) ..|+|......++.+..+ ..+|||....||..||++-...... 
+.-.+-.||++|||--|-.- T Consensus 565 LLPPsrSv~~aKr~L~~L---P~GGGtPLA~gL~~A~~~a~qar~~--------------GD~~~~~ivliTDGRgNvpL 627 (705) T TIGR02031 565 LLPPSRSVELAKRRLDVL---PGGGGTPLAAGLAAAVEVAKQARSR--------------GDVGRITIVLITDGRGNVPL 627 (705) T ss_pred CCCCHHHHHHHHHHHCCC---CCCCCCHHHHHHHHHHHHHHHHHCC--------------CCCCEEEEEEECCCCCCCCC T ss_conf 378523589999997158---9998567899999999999851026--------------88524556776077877467 Q ss_pred C----------C-----------CHHHHHHHHHHHCCCEEEEEEECCCCCHHHHHHHHHC-CCCCEEEECCHH Q ss_conf 6----------5-----------3489999999978987999995489778999998621-898279817989 Q gi|254781110|r 346 S----------N-----------VNTIKICDKAKENFIKIVTISINASPNGQRLLKTCVS-SPEYHYNVVNAD 396 (420) Q Consensus 346 ~----------~-----------~~~~~~c~~~k~~gi~i~tIgf~~~~~~~~~l~~cas-~~~~yf~a~~~~ 396 (420) . + .....++..++++||-.-+|==.-.....-.++..|. =.+|||.=.++. T Consensus 628 ~~~~DP~~~~~~r~PrPts~~l~~e~~~lA~~i~~~G~~~lVIDT~~~f~s~G~a~~lA~~~~a~Y~yLP~a~ 700 (705) T TIGR02031 628 DASVDPKAAKADRLPRPTSEELKEEVLALARKIREAGISALVIDTANKFVSTGFAKKLARKLGARYIYLPNAT 700 (705) T ss_pred CCCCCCCCCCCCCCCCCHHHHHHHHHHHHHHHHHHCCCCEEEEECCCCCCCCCHHHHHHHHHCCCEEECCCCC T ss_conf 6567861002356787268999999999999988718865898267786676448999998589067136888 No 37 >KOG2353 consensus Probab=97.97 E-value=0.00073 Score=43.67 Aligned_cols=177 Identities=15% Similarity=0.197 Sum_probs=114.8 Q ss_pred CCCCCEEEEECCCCCCCCCCCCCCCCCCCCCCCCCCCCCHHHHHHHHHHHHHHHCCCCCCCCCCCCEEEECCCCCCCCCC Q ss_conf 34432012210223242233578766667534676653067889998999985104678765543022320487643124 Q gi|254781110|r 188 RPIFLIELVVDLSGSMHCAMNSDPEDVNSAPICQDKKRTKMAALKNALLLFLDSIDLLSHVKEDVYMGLIGYTTRVEKNI 267 (420) Q Consensus 188 ~~~~~~~~~~d~s~s~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~a~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 267 (420) ....++++..|+||||. ..+++..+..+....+++...+ .+.+.+|++...+.. 
T Consensus 223 t~pKdiviLlD~SgSm~--------------------g~~~~lak~tv~~iLdtLs~~D------fvni~tf~~~~~~v~ 276 (1104) T KOG2353 223 TSPKDIVILLDVSGSMS--------------------GLRLDLAKQTVNEILDTLSDND------FVNILTFNSEVNPVS 276 (1104) T ss_pred CCCCCEEEEEECCCCCC--------------------CHHHHHHHHHHHHHHHHCCCCC------EEEEEEECCCCCCCC T ss_conf 78664599996565554--------------------4316999999999997615477------687876213567565 Q ss_pred CC---------CCCHHHHHHHHHHHHHCCCCCCCCHHHHHHHHHHHHCCCCCCCCCCCCCCCCCCCCCCCCCCEEEEEEC Q ss_conf 45---------778488999999864046788765388999999861310123345543347767787766624899606 Q gi|254781110|r 268 EP---------SWGTEKVRQYVTRDMDSLILKPTDSTPAMKQAYQILTSDKKRSFFTNFFRQGVKIPSLPFQKFIIFLTD 338 (420) Q Consensus 268 ~l---------t~~~~~~~~~i~~~~~~~~~g~T~~~~gl~~~~~~L~~~~~~~~~~~~~~~~~~~~~~~~~k~iil~TD 338 (420) |= ..+...+++.++.+ .+-|.++...|+..+...|.....+.... .+..-.+.++++|| T Consensus 277 pc~~~~lvqAt~~nk~~~~~~i~~l---~~k~~a~~~~~~e~aF~lL~~~n~s~~~~---------~~~~C~~~iml~td 344 (1104) T KOG2353 277 PCFNGTLVQATMRNKKVFKEAIETL---DAKGIANYTAALEYAFSLLRDYNDSRANT---------QRSPCNQAIMLITD 344 (1104) T ss_pred CCCCCCEEECCHHHHHHHHHHHHHH---CCCCCCCHHHHHHHHHHHHHHHCCCCCCC---------CCCCCCEEEEEEEC T ss_conf 2025852204567799999998641---41254124355778999998744455443---------22500104577624 Q ss_pred CCCCCCCCCCHHHHHHHHHHHCCCEEEEEEECCCCC--HHHHHHHHHCCCCCEEEECCHHHHHHHHHHHHH Q ss_conf 867777765348999999997898799999548977--899999862189827981798999999999998 Q gi|254781110|r 339 GENNNFKSNVNTIKICDKAKENFIKIVTISINASPN--GQRLLKTCVSSPEYHYNVVNADSLIHVFQNISQ 407 (420) Q Consensus 339 G~~~~~~~~~~~~~~c~~~k~~gi~i~tIgf~~~~~--~~~~l~~cas~~~~yf~a~~~~~L~~aF~~Ia~ 407 (420) |..++... -.+.. +.-...|+|||.-+|.... .+..+..|+ +.|+|++..+-+++.+-=..... 
T Consensus 345 G~~~~~~~--If~~y--n~~~~~Vrvftflig~~~~~~~~~~wmac~-n~gyy~~I~~~~~v~~~~~~y~~ 410 (1104) T KOG2353 345 GVDENAKE--IFEKY--NWPDKKVRVFTFLIGDEVYDLDEIQWMACA-NKGYYVHIISIADVRENVLEYLD 410 (1104) T ss_pred CCCCCHHH--HHHHH--CCCCCCEEEEEEEECCCCCCCCCCHHHHHH-CCCCEEECCCHHHCCHHHHHHHH T ss_conf 77510899--99860--367773599999924421345412122540-78855864665645867655664 No 38 >cd01457 vWA_ORF176_type VWA ORF176 type: Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses. In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most Probab=97.81 E-value=0.0011 Score=42.53 Aligned_cols=155 Identities=17% Similarity=0.169 Sum_probs=88.4 Q ss_pred CEEEEECCCCCCCCCCCCCCCCCCCCCCCCCCCCCHHHHHHHHHHHHHHHCCCCCCCCCCCCEEEECCCCCCCCCCCCCC Q ss_conf 20122102232422335787666675346766530678899989999851046787655430223204876431244577 Q gi|254781110|r 192 LIELVVDLSGSMHCAMNSDPEDVNSAPICQDKKRTKMAALKNALLLFLDSIDLLSHVKEDVYMGLIGYTTRVEKNIEPSW 271 (420) Q Consensus 192 ~~~~~~d~s~s~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~a~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~lt~ 271 (420) +..|.+|.||||.+...++ ..+||..++.++..+.......++..- -++.+++....... 
- T Consensus 4 D~v~lIDdSgSM~~~d~~~-------------~~sRW~~a~~al~~iA~~c~~~D~DGI----dvyfln~~~~~~~~--~ 64 (199) T cd01457 4 DYTLLIDKSGSMAEADEAK-------------ERSRWEEAQESTRALARKCEEYDSDGI----TVYLFSGDFRRYDN--V 64 (199) T ss_pred CEEEEEECCCCCCCCCCCC-------------CCCHHHHHHHHHHHHHHHHHHCCCCCC----EEEEEECCCCCCCC--C T ss_conf 7799996888766776788-------------876299999999999999987488998----79996277645688--8 Q ss_pred CHHHHHHHHHHHHHCCCCCCCCHHHHHHHHHHHHCCCCCCCCCCCCCCCCCCCCCCCC-CCEEEEEECCCCCCCCCC-CH Q ss_conf 8488999999864046788765388999999861310123345543347767787766-624899606867777765-34 Q gi|254781110|r 272 GTEKVRQYVTRDMDSLILKPTDSTPAMKQAYQILTSDKKRSFFTNFFRQGVKIPSLPF-QKFIIFLTDGENNNFKSN-VN 349 (420) Q Consensus 272 ~~~~~~~~i~~~~~~~~~g~T~~~~gl~~~~~~L~~~~~~~~~~~~~~~~~~~~~~~~-~k~iil~TDG~~~~~~~~-~~ 349 (420) +..++.... ....|.|.|+....|.......-..... ..+.+ .-.+|++|||+++....- .. T Consensus 65 ~~~~V~~iF---~~~~P~G~T~~g~~L~~il~~y~~r~~~-------------~~~kp~g~~iIVITDG~p~D~~av~~~ 128 (199) T cd01457 65 NSSKVDQLF---AENSPDGGTNLAAVLQDALNNYFQRKEN-------------GATCPEGETFLVITDGAPDDKDAVERV 128 (199) T ss_pred CHHHHHHHH---HCCCCCCCCCHHHHHHHHHHHHHHHHHC-------------CCCCCCCEEEEEEECCCCCCCHHHHHH T ss_conf 999999998---5589899796379999998999873200-------------689998607999827997982889999 Q ss_pred HHHHHHHHH-HCCCEEEEEEECCCCCHHHHHHH Q ss_conf 899999999-78987999995489778999998 Q gi|254781110|r 350 TIKICDKAK-ENFIKIVTISINASPNGQRLLKT 381 (420) Q Consensus 350 ~~~~c~~~k-~~gi~i~tIgf~~~~~~~~~l~~ 381 (420) ....++++. ..-+-|-+|.+|.+..+...|+. 
T Consensus 129 Ii~aa~kLd~~~qlgIqF~QVG~D~~A~~fL~~ 161 (199) T cd01457 129 IIKASDELDADNELAISFLQIGRDPAATAFLKA 161 (199) T ss_pred HHHHHHHHCCCCCCCEEEEEECCCHHHHHHHHH T ss_conf 999998634401003677785596889999998 No 39 >COG4548 NorD Nitric oxide reductase activation protein [Inorganic ion transport and metabolism] Probab=97.56 E-value=0.0022 Score=40.58 Aligned_cols=106 Identities=18% Similarity=0.150 Sum_probs=73.7 Q ss_pred HHHCCCCCCCCHHHHHHHHHHHHCCCCCCCCCCCCCCCCCCCCCCCCCCEEEEEECCCCCCCCCCC------HHHHHHHH Q ss_conf 640467887653889999998613101233455433477677877666248996068677777653------48999999 Q gi|254781110|r 283 DMDSLILKPTDSTPAMKQAYQILTSDKKRSFFTNFFRQGVKIPSLPFQKFIIFLTDGENNNFKSNV------NTIKICDK 356 (420) Q Consensus 283 ~~~~~~~g~T~~~~gl~~~~~~L~~~~~~~~~~~~~~~~~~~~~~~~~k~iil~TDG~~~~~~~~~------~~~~~c~~ 356 (420) |..+-|+..|..+.+|..+-..|-. .+..+|.+|++|||++|.-.... .+..+... T Consensus 524 ImALePg~ytR~G~AIR~As~kL~~------------------rpq~qklLivlSDGkPnd~d~YEgr~gIeDTr~AV~e 585 (637) T COG4548 524 IMALEPGYYTRDGAAIRHASAKLME------------------RPQRQKLLIVLSDGKPNDFDHYEGRFGIEDTREAVIE 585 (637) T ss_pred HEECCCCCCCCCCHHHHHHHHHHHC------------------CCCCCEEEEEECCCCCCCCCCCCCCCCHHHHHHHHHH T ss_conf 2133766444310999999999834------------------7411248999448985434432332111537999999 Q ss_pred HHHCCCEEEEEEECCCCCHHHHHHHHHCCCCCEEEECCHHHHHHHHHHHHHHH Q ss_conf 99789879999954897789999986218982798179899999999999874 Q gi|254781110|r 357 AKENFIKIVTISINASPNGQRLLKTCVSSPEYHYNVVNADSLIHVFQNISQLM 409 (420) Q Consensus 357 ~k~~gi~i~tIgf~~~~~~~~~l~~cas~~~~yf~a~~~~~L~~aF~~Ia~~I 409 (420) +++.||.+|-|-++-.+- ..+..-- +-+.|--+++...|..++-.|=+.+ T Consensus 586 aRk~Gi~VF~Vtld~ea~--~y~p~~f-gqngYa~V~~v~~LP~~L~~lyrkL 635 (637) T COG4548 586 ARKSGIEVFNVTLDREAI--SYLPALF-GQNGYAFVERVAQLPGALPPLYRKL 635 (637) T ss_pred HHHCCCEEEEEEECCHHH--HHHHHHH-CCCCEEECCCHHHCCHHHHHHHHHH T ss_conf 986583479998333055--5528885-2674697024001605579999996 No 40 >COG4655 Predicted membrane protein [Function unknown] Probab=97.54 E-value=7.6e-05 
Score=50.08 Aligned_cols=56 Identities=14% Similarity=0.059 Sum_probs=51.1 Q ss_pred HHHHCCCCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHC Q ss_conf 88630388569999999999999999999999999999999999999999987420 Q gi|254781110|r 13 KGIASEKANFSIIFALSVMSFLLLIGFLIYVLDWHYKKNSMESANNAAILAGASKM 68 (420) Q Consensus 13 ~~~~~~~G~vai~fal~l~~ll~~~g~aVD~~r~~~~ks~Lq~A~DaA~LA~a~~~ 68 (420) -|-|.+|+-+.++.|+.+|..+...+++|||++.+..|.+||.+.|-|+++++... T Consensus 3 g~~r~~rs~~gvltal~~~lal~~l~l~VD~G~l~leqR~LQ~~ADlAAiaAAs~~ 58 (565) T COG4655 3 GWPRRQRSMVGVLTALFVPLALATLLLGVDYGYLYLEQRELQRVADLAAIAAASNL 58 (565) T ss_pred CHHHHHHHHHHHHHHHHHHHHHHHHHHEECCCEEEEEHHHHHHHHHHHHHHHHHHC T ss_conf 42376766778999999999999886502201244117878887769988877627 No 41 >cd01452 VWA_26S_proteasome_subunit 26S proteasome plays a major role in eukaryotic protein breakdown, especially for ubiquitin-tagged proteins. It is an ATP-dependent protease responsible for the bulk of non-lysosomal proteolysis in eukaryotes, often using covalent modification of proteins by ubiquitylation. It consists of a 20S proteolytic core particle (CP) and a 19S regulatory particle (RP). The CP is an ATP-independent peptidase consisting of hydrolyzing activities. One or both ends of CP carry the RP that confers both ubiquitin and ATP dependence to the 26S proteasome. The RP's proposed functions include recognition of substrates and translocation of these to CP for proteolysis. The RP can dissociate into a stable lid and base subcomplexes. The base is composed of three non-ATPase subunits (Rpn 1, 2 and 10). A single residue in the vWA domain of Rpn10 has been implicated to be responsible for stabilizing the lid-base association. 
Probab=97.42 E-value=0.0071 Score=37.26 Aligned_cols=157 Identities=13% Similarity=0.114 Sum_probs=103.4 Q ss_pred CCCCCCHHHHHHHHHHHHHHHCCCCCCCCCCCCEEEECCCC-CCCCCCCCCCCHHHHHHHHHHHHHCCCCCCCCHHHHHH Q ss_conf 76653067889998999985104678765543022320487-64312445778488999999864046788765388999 Q gi|254781110|r 221 QDKKRTKMAALKNALLLFLDSIDLLSHVKEDVYMGLIGYTT-RVEKNIEPSWGTEKVRQYVTRDMDSLILKPTDSTPAMK 299 (420) Q Consensus 221 ~~~~~~~~~~~~~a~~~~~~~~~~~~~~~~~~~~~~~~~~~-~~~~~~~lt~~~~~~~~~i~~~~~~~~~g~T~~~~gl~ 299 (420) +++.++|+++-.++++-....-....| ...+|+.+... .......||.+..++.+.+..+. ++|.-+...|++ T Consensus 19 GDy~PtR~~AQ~dAvn~i~~~k~~~Np---En~VGl~tmag~~~~Vl~TlT~D~gkiL~~lh~i~---~~G~~~~~~~Iq 92 (187) T cd01452 19 GDYPPTRFQAQADAVNLICQAKTRSNP---ENNVGLMTMAGNSPEVLVTLTNDQGKILSKLHDVQ---PKGKANFITGIQ 92 (187) T ss_pred CCCCCCHHHHHHHHHHHHHHHHHHCCC---CCCEEEEEECCCCCEEEEECCCCHHHHHHHCCCCC---CCCEECHHHHHH T ss_conf 898971899999999999977751495---33113576158986689844865788987532677---187651887999 Q ss_pred HHHHHHCCCCCCCCCCCCCCCCCCCCCCCCCCEEEEEECCCCCCCCCCCHHHHHHHHHHHCCCEEEEEEECCCCCHHHHH Q ss_conf 99986131012334554334776778776662489960686777776534899999999789879999954897789999 Q gi|254781110|r 300 QAYQILTSDKKRSFFTNFFRQGVKIPSLPFQKFIIFLTDGENNNFKSNVNTIKICDKAKENFIKIVTISINASPNGQRLL 379 (420) Q Consensus 300 ~~~~~L~~~~~~~~~~~~~~~~~~~~~~~~~k~iil~TDG~~~~~~~~~~~~~~c~~~k~~gi~i~tIgf~~~~~~~~~l 379 (420) -|.-.|..+. +...++.||++- |-+... 
....+..+++++|+++|-|-.|.||...+..+.| T Consensus 93 iA~LALKHRq----------------nk~~~qRIv~FV-gSPi~~-~ek~l~~laKklKKnnV~vDII~FGe~~~n~~kL 154 (187) T cd01452 93 IAQLALKHRQ----------------NKNQKQRIVAFV-GSPIEE-DEKDLVKLAKRLKKNNVSVDIINFGEIDDNTEKL 154 (187) T ss_pred HHHHHHHCCC----------------CCCCCEEEEEEE-CCCCCC-CHHHHHHHHHHHHHCCCCEEEEEECCCCCCHHHH T ss_conf 9999972346----------------777544799997-898755-7899999999875558535899946888998999 Q ss_pred HH---HH-CC-CCCEEEECCHHH-HHHH Q ss_conf 98---62-18-982798179899-9999 Q gi|254781110|r 380 KT---CV-SS-PEYHYNVVNADS-LIHV 401 (420) Q Consensus 380 ~~---ca-s~-~~~yf~a~~~~~-L~~a 401 (420) +. .+ +. ++|+-....... |.++ T Consensus 155 ~~f~~~vn~~~~Shlv~ippg~~lLSd~ 182 (187) T cd01452 155 TAFIDAVNGKDGSHLVSVPPGENLLSDA 182 (187) T ss_pred HHHHHHHCCCCCCEEEEECCCCCHHHHH T ss_conf 9999984589982599947998645676 No 42 >pfam04056 Ssl1 Ssl1-like. Ssl1-like proteins are 40kDa subunits of the Transcription factor II H complex. Probab=97.41 E-value=0.011 Score=36.14 Aligned_cols=174 Identities=14% Similarity=0.093 Sum_probs=124.9 Q ss_pred CEEEEECCCCCCCCCCCCCCCCCCCCCCCCCCCCCHHHHHHHHHHHHHHHCCCCCCCCCCCCEEEECC-CCCCCCCCCCC Q ss_conf 20122102232422335787666675346766530678899989999851046787655430223204-87643124457 Q gi|254781110|r 192 LIELVVDLSGSMHCAMNSDPEDVNSAPICQDKKRTKMAALKNALLLFLDSIDLLSHVKEDVYMGLIGY-TTRVEKNIEPS 270 (420) Q Consensus 192 ~~~~~~d~s~s~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~a~~~~~~~~~~~~~~~~~~~~~~~~~-~~~~~~~~~lt 270 (420) .+..++|.|.+|.... ..++|+...-..+..|+..+-++.|-. +.|++.. 
+..+....+++ T Consensus 54 hl~iilD~S~aM~e~D---------------lkP~R~~~~l~~l~~Fi~efFdqNPiS---Qlgii~~rn~~a~~ls~ls 115 (250) T pfam04056 54 HLYIVLDCSRAMEEKD---------------LRPSRFACTIKYLETFVEEFFDQNPIS---QIGLITCKDGRAHRLTDLT 115 (250) T ss_pred EEEEEEECCHHHHHCC---------------CCCCHHHHHHHHHHHHHHHHHHCCCCC---CEEEEEEECCEEEEEEECC T ss_conf 8999998827676351---------------586489999999999999987439830---2279999657137833257 Q ss_pred CCHHHHHHHHHHHHHCCCCCCCCHHHHHHHHHHHHCCCCCCCCCCCCCCCCCCCCCCCCCCEEEEEECCCCCCCCCCCHH Q ss_conf 78488999999864046788765388999999861310123345543347767787766624899606867777765348 Q gi|254781110|r 271 WGTEKVRQYVTRDMDSLILKPTDSTPAMKQAYQILTSDKKRSFFTNFFRQGVKIPSLPFQKFIIFLTDGENNNFKSNVNT 350 (420) Q Consensus 271 ~~~~~~~~~i~~~~~~~~~g~T~~~~gl~~~~~~L~~~~~~~~~~~~~~~~~~~~~~~~~k~iil~TDG~~~~~~~~~~~ 350 (420) -+...-...+.++....+.|.-....||..+...|..- +.+..+.++|++. --.++... .. T Consensus 116 gnp~~hi~aL~~~~~~~~~G~pSLqN~Le~a~~~L~~~----------------P~~~sREILii~g-SL~T~DPg--dI 176 (250) T pfam04056 116 GNPRVHIKALKSLREAECGGDPSLQNALELARASLKHV----------------PSHGSREVLIIFG-SLSTCDPG--DI 176 (250) T ss_pred CCHHHHHHHHHHHHCCCCCCCHHHHHHHHHHHHHHHCC----------------CCCCCEEEEEEEE-ECCCCCCC--CH T ss_conf 99899999999874069999920899999999887508----------------9878548999982-04445886--59 Q ss_pred HHHHHHHHHCCCEEEEEEECCCCCHHHHHHHHHCCCCCEEEECCHHHHHHHHHH Q ss_conf 999999997898799999548977899999862189827981798999999999 Q gi|254781110|r 351 IKICDKAKENFIKIVTISINASPNGQRLLKTCVSSPEYHYNVVNADSLIHVFQN 404 (420) Q Consensus 351 ~~~c~~~k~~gi~i~tIgf~~~~~~~~~l~~cas~~~~yf~a~~~~~L~~aF~~ 404 (420) -..-+.+|+.+|++..||+... -.-.-+.|..++|.|.-+-+..-+.+.+.. T Consensus 177 ~~tI~~l~~~~IrvsvI~LaaE--v~Ick~l~~~T~G~y~V~lde~Hfk~ll~~ 228 (250) T pfam04056 177 YSTIDTLKKEKIRCSVIGLSAE--VFICKELCKATNGTYSVALDETHLKELLLE 228 (250) T ss_pred HHHHHHHHHCCCEEEEEEECHH--HHHHHHHHHHHCCEEEEECCHHHHHHHHHH T ss_conf 9999999975907999873389--999999999749988875699999999995 No 43 >pfam11775 CobT_C Cobalamin biosynthesis protein CobT VWA domain. 
This family consists of several bacterial cobalamin biosynthesis (CobT) proteins. CobT is involved in the transformation of precorrin-3 into cobyrinic acid. Probab=97.34 E-value=0.013 Score=35.65 Aligned_cols=92 Identities=15% Similarity=0.243 Sum_probs=57.2 Q ss_pred CCCHHHHHHHHHHHHCCCCCCCCCCCCCCCCCCCCCCCCCCEEEEEECCCC--------CCCCCC-CHHHHHHHHH-HHC Q ss_conf 765388999999861310123345543347767787766624899606867--------777765-3489999999-978 Q gi|254781110|r 291 PTDSTPAMKQAYQILTSDKKRSFFTNFFRQGVKIPSLPFQKFIIFLTDGEN--------NNFKSN-VNTIKICDKA-KEN 360 (420) Q Consensus 291 ~T~~~~gl~~~~~~L~~~~~~~~~~~~~~~~~~~~~~~~~k~iil~TDG~~--------~~~~~~-~~~~~~c~~~-k~~ 360 (420) .-..++++.|+..-|.. .+..+|++++++||.+ |.+..- ..+...-+.+ +.. T Consensus 115 ENiDGEAL~wA~~RL~~------------------R~e~RkILmViSDGaP~ddst~s~n~~~yL~~hLr~vi~~ie~~~ 176 (220) T pfam11775 115 ENIDGEALAQAAKLFAG------------------RMEDKKILLMISDGAPCDDSTLSVAAGDGFEQHLRHIIEEIETLS 176 (220) T ss_pred CCCCCHHHHHHHHHHHC------------------CCCCCEEEEEECCCCCCCCCCCCCCCCCCHHHHHHHHHHHHHCCC T ss_conf 18971999999999863------------------931246999975899677641125877767999999999985068 Q ss_pred CCEEEEEEECCCCCHHHHHHHHHCCCCCEEEECCHHHHHH-HHHHHHHH Q ss_conf 9879999954897789999986218982798179899999-99999987 Q gi|254781110|r 361 FIKIVTISINASPNGQRLLKTCVSSPEYHYNVVNADSLIH-VFQNISQL 408 (420) Q Consensus 361 gi~i~tIgf~~~~~~~~~l~~cas~~~~yf~a~~~~~L~~-aF~~Ia~~ 408 (420) +|++..||+|.+... ..-+.+. ...+.+||.. 
.|+++++- T Consensus 177 ~iel~aIGIghDv~r-~yY~~av-------~i~d~eeL~~~~~~~L~~l 217 (220) T pfam11775 177 EIDLIAIGIGHDAPR-RYYKNAA-------LINDAEELGGAITEELAEI 217 (220) T ss_pred CCEEEEEEECCCCCH-HHHHCCE-------EECCHHHHHHHHHHHHHHH T ss_conf 826999874777686-6650656-------8603888659999999998 No 44 >COG2425 Uncharacterized protein containing a von Willebrand factor type A (vWA) domain [General function prediction only] Probab=97.34 E-value=0.0078 Score=36.99 Aligned_cols=162 Identities=20% Similarity=0.154 Sum_probs=97.0 Q ss_pred CCEEEEECCCCCCCCCCCCCCCCCCCCCCCCCCCCCHHHHHHHHHHHHHHHCCCCCCCCCCCCEEEECCCCCCCCCCCCC Q ss_conf 32012210223242233578766667534676653067889998999985104678765543022320487643124457 Q gi|254781110|r 191 FLIELVVDLSGSMHCAMNSDPEDVNSAPICQDKKRTKMAALKNALLLFLDSIDLLSHVKEDVYMGLIGYTTRVEKNIEPS 270 (420) Q Consensus 191 ~~~~~~~d~s~s~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~a~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~lt 270 (420) ..+...+|.||||. ..+.+.++...-+++.... ..+-++.+..|++ ......++ T Consensus 273 GpvilllD~SGSM~--------------------G~~e~~AKAvalAl~~~al-----aenR~~~~~lF~s-~~~~~el~ 326 (437) T COG2425 273 GPVILLLDKSGSMS--------------------GFKEQWAKAVALALMRIAL-----AENRDCYVILFDS-EVIEYELY 326 (437) T ss_pred CCEEEEEECCCCCC--------------------CCHHHHHHHHHHHHHHHHH-----HHCCCEEEEEECC-CCEEEEEC T ss_conf 98799995888857--------------------8288999999999999998-----8430538999525-20255505 Q ss_pred CCHHHHHHHHHHHHHCCCCCCCCHHHHHHHHHHHHCCCCCCCCCCCCCCCCCCCCCCCCCCEEEEEECCCCCCCCCCCHH Q ss_conf 78488999999864046788765388999999861310123345543347767787766624899606867777765348 Q gi|254781110|r 271 WGTEKVRQYVTRDMDSLILKPTDSTPAMKQAYQILTSDKKRSFFTNFFRQGVKIPSLPFQKFIIFLTDGENNNFKSNVNT 350 (420) Q Consensus 271 ~~~~~~~~~i~~~~~~~~~g~T~~~~gl~~~~~~L~~~~~~~~~~~~~~~~~~~~~~~~~k~iil~TDG~~~~~~~~~~~ 350 (420) .-...+...++-+..-.+ |||++...+..+.+.+..... .+-=+|++|||+.--. +.-. 
T Consensus 327 ~k~~~~~e~i~fL~~~f~-GGTD~~~~l~~al~~~k~~~~------------------~~adiv~ITDg~~~~~--~~~~ 385 (437) T COG2425 327 EKKIDIEELIEFLSYVFG-GGTDITKALRSALEDLKSREL------------------FKADIVVITDGEDERL--DDFL 385 (437) T ss_pred CCCCCHHHHHHHHHHHCC-CCCCHHHHHHHHHHHHHCCCC------------------CCCCEEEEECCHHHHH--HHHH T ss_conf 774579999999965068-988858999999998643665------------------6777899804376654--6789 Q ss_pred HHHHHHHHHCCCEEEEEEECCCCCHHHHHHHHHCCCCC-EEEECCHHHHHHHHHHH Q ss_conf 99999999789879999954897789999986218982-79817989999999999 Q gi|254781110|r 351 IKICDKAKENFIKIVTISINASPNGQRLLKTCVSSPEY-HYNVVNADSLIHVFQNI 405 (420) Q Consensus 351 ~~~c~~~k~~gi~i~tIgf~~~~~~~~~l~~cas~~~~-yf~a~~~~~L~~aF~~I 405 (420) ...-+..|....++|+|-++.... ..+++. . .+ -|..+. .+...+++.+ T Consensus 386 ~~v~e~~k~~~~rl~aV~I~~~~~-~~l~~I-s---d~~i~~~~~-~~~~kv~~~~ 435 (437) T COG2425 386 RKVKELKKRRNARLHAVLIGGYGK-PGLMRI-S---DHIIYRVEP-RDRVKVVKRW 435 (437) T ss_pred HHHHHHHHHHHCEEEEEEECCCCC-CCCCEE-C---EEEEEEECC-HHHHHHHHCC T ss_conf 999999887543489999647898-660001-1---146787274-7776777344 No 45 >KOG2807 consensus Probab=96.75 E-value=0.044 Score=32.13 Aligned_cols=173 Identities=13% Similarity=0.098 Sum_probs=115.2 Q ss_pred CEEEEECCCCCCCCCCCCCCCCCCCCCCCCCCCCCHHHHHHHHHHHHHHHCCCCCCCCCCCCEEEECCC-CCCCCCCCCC Q ss_conf 201221022324223357876666753467665306788999899998510467876554302232048-7643124457 Q gi|254781110|r 192 LIELVVDLSGSMHCAMNSDPEDVNSAPICQDKKRTKMAALKNALLLFLDSIDLLSHVKEDVYMGLIGYT-TRVEKNIEPS 270 (420) Q Consensus 192 ~~~~~~d~s~s~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~a~~~~~~~~~~~~~~~~~~~~~~~~~~-~~~~~~~~lt 270 (420) -+..++|+|..|..+.. .++|+.....++..|+..+-++.|.. 
++|++.-- ..+.....++ T Consensus 62 hl~iviD~S~am~e~Df---------------~P~r~a~~~K~le~Fv~eFFdQNPiS---Qigii~~k~g~A~~lt~lt 123 (378) T KOG2807 62 HLYIVIDCSRAMEEKDF---------------RPSRFANVIKYLEGFVPEFFDQNPIS---QIGIISIKDGKADRLTDLT 123 (378) T ss_pred EEEEEEEHHHHHHHCCC---------------CCHHHHHHHHHHHHHHHHHHCCCCHH---HEEEEEEECCHHHHHHHHC T ss_conf 68999873455664447---------------80489999999999999986149620---3358997055326888714 Q ss_pred CCHHHHHHHHHHHHHCCCCCCCCHHHHHHHHHHHHCCCCCCCCCCCCCCCCCCCCCCCCCCEEEEEECCCCCCCCCCCHH Q ss_conf 78488999999864046788765388999999861310123345543347767787766624899606867777765348 Q gi|254781110|r 271 WGTEKVRQYVTRDMDSLILKPTDSTPAMKQAYQILTSDKKRSFFTNFFRQGVKIPSLPFQKFIIFLTDGENNNFKSNVNT 350 (420) Q Consensus 271 ~~~~~~~~~i~~~~~~~~~g~T~~~~gl~~~~~~L~~~~~~~~~~~~~~~~~~~~~~~~~k~iil~TDG~~~~~~~~~~~ 350 (420) -+.......++.+. ...|.-....+|..+...|.+- +....+.++|++.-= -+. ++..- T Consensus 124 gnp~~hI~aL~~~~--~~~g~fSLqNaLe~a~~~Lk~~----------------p~H~sREVLii~ssl-sT~--DPgdi 182 (378) T KOG2807 124 GNPRIHIHALKGLT--ECSGDFSLQNALELAREVLKHM----------------PGHVSREVLIIFSSL-STC--DPGDI 182 (378) T ss_pred CCHHHHHHHHHCCC--CCCCCHHHHHHHHHHHHHHCCC----------------CCCCCEEEEEEEEEE-CCC--CCCCH T ss_conf 88788999973122--4488867887999999985178----------------765632799998540-355--85209 Q ss_pred HHHHHHHHHCCCEEEEEEECCCCCHHHHHHHHHCCCCCEEEECCHHHHHHHHHHH Q ss_conf 9999999978987999995489778999998621898279817989999999999 Q gi|254781110|r 351 IKICDKAKENFIKIVTISINASPNGQRLLKTCVSSPEYHYNVVNADSLIHVFQNI 405 (420) Q Consensus 351 ~~~c~~~k~~gi~i~tIgf~~~~~~~~~l~~cas~~~~yf~a~~~~~L~~aF~~I 405 (420) -..-+++|..+|++.+||+...-. 
---..|-.++|.|+-+-+..-|.+.|..- T Consensus 183 ~~tI~~lk~~kIRvsvIgLsaEv~--icK~l~kaT~G~Y~V~lDe~HlkeLl~e~ 235 (378) T KOG2807 183 YETIDKLKAYKIRVSVIGLSAEVF--ICKELCKATGGRYSVALDEGHLKELLLEH 235 (378) T ss_pred HHHHHHHHHHCEEEEEEEECHHHH--HHHHHHHHHCCEEEEEECHHHHHHHHHHC T ss_conf 999999986172799985005589--99999886188579875789999999845 No 46 >COG2304 Uncharacterized protein containing a von Willebrand factor type A (vWA) domain [General function prediction only] Probab=96.75 E-value=0.019 Score=34.53 Aligned_cols=160 Identities=17% Similarity=0.151 Sum_probs=91.1 Q ss_pred CCCCCCEEEEECCCCCCCCCCCCCCCCCCCCCCCCCCCCCHHHHHHHHHHHHHHHCCCCCCCCCCCCEEEECCCCCCCCC Q ss_conf 43443201221022324223357876666753467665306788999899998510467876554302232048764312 Q gi|254781110|r 187 ERPIFLIELVVDLSGSMHCAMNSDPEDVNSAPICQDKKRTKMAALKNALLLFLDSIDLLSHVKEDVYMGLIGYTTRVEKN 266 (420) Q Consensus 187 ~~~~~~~~~~~d~s~s~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~a~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 266 (420) .....+..+..+.+++|.... +...+.....++..+... .......|....... T Consensus 34 ~~~~~~~~~~~~~~~s~~~~~--------------------~~~~~~~~~~~v~~~~~~------~~~~~~~~~~~~~~~ 87 (399) T COG2304 34 LLVPANLTLAIDTSGSMTGAL--------------------LELAKSAAIELVNGLNPG------DLLSIVTFAGSADVL 87 (399) T ss_pred CCCCCHHHHHHCCCCCCHHHH--------------------HHHHHHHHHHHHHCCCCH------HHHEEEECCCCCCEE T ss_conf 002210022213664100556--------------------776789988877315621------112046115655501 Q ss_pred CCCCCCHHHHHHHHHHHHH-CCCCCCCCHHHHHHHHHHHHCCCCCCCCCCCCCCCCCCCCCCCCCCEEEEEECCCCCCCC Q ss_conf 4457784889999998640-467887653889999998613101233455433477677877666248996068677777 Q gi|254781110|r 267 IEPSWGTEKVRQYVTRDMD-SLILKPTDSTPAMKQAYQILTSDKKRSFFTNFFRQGVKIPSLPFQKFIIFLTDGENNNFK 345 (420) Q Consensus 267 ~~lt~~~~~~~~~i~~~~~-~~~~g~T~~~~gl~~~~~~L~~~~~~~~~~~~~~~~~~~~~~~~~k~iil~TDG~~~~~~ 345 (420) .+.+ ...........+.. ..+.|.|....++.|+...+.+.. .......+.+.|||+++.+. 
T Consensus 88 ~~~~-~~~~~~~~~~~i~~~~~~~~~~~~~~~~~~~~~~~~~~~----------------~~~~~~~~~~~tdg~~~~~~ 150 (399) T COG2304 88 IPPT-GATNKESITAAIDQSLQAGGATAVEASLSLAVELAAKAL----------------PRGTLNRILLLTDGENNLGL 150 (399) T ss_pred CCCC-CCCCHHHHHHHHHHHCCCCCCCCHHHHHHHHHHHHHHHC----------------CCCCCCEEEEECCCCHHCCC T ss_conf 2564-223227788887640264554305778999999876423----------------54553233330364120276 Q ss_pred CCCHH-HHHHHHHHHCCCEEEEEEECCCCCHHHHHHHHHCCCCCE Q ss_conf 65348-999999997898799999548977899999862189827 Q gi|254781110|r 346 SNVNT-IKICDKAKENFIKIVTISINASPNGQRLLKTCVSSPEYH 389 (420) Q Consensus 346 ~~~~~-~~~c~~~k~~gi~i~tIgf~~~~~~~~~l~~cas~~~~y 389 (420) .+... ....+.....+|.+.++|++.+-+.+.++..-+-..+.+ T Consensus 151 ~d~~~~~~~~~~~~~~~i~~~~~g~~~~~n~~~~~~~~~~~~g~l 195 (399) T COG2304 151 VDPSRLSALAKLAAGKGIVLDTLGLGDDVNEDELTGIAAAANGNL 195 (399) T ss_pred CCHHHHHHHHCCCCCCCEEEEEECCCCHHHHHHHHHHHHHCCCCC T ss_conf 678899998634556762786313552267777776553036641 No 47 >pfam07811 TadE TadE-like protein. The members of this family are similar to a region of the protein product of the bacterial tadE locus. In various bacterial species, the tad locus is closely linked to flp-like genes, which encode proteins required for the production of pili involved in adherence to surfaces. It is thought that the tad loci encode proteins that act to assemble or export an Flp pilus in various bacteria. All tad loci but TadA have putative transmembrane regions, and in fact the region in question is this family has a high proportion of hydrophobic amino acid residues. 
Probab=96.54 E-value=0.0087 Score=36.69 Aligned_cols=42 Identities=21% Similarity=0.282 Sum_probs=38.9 Q ss_pred CCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHH Q ss_conf 856999999999999999999999999999999999999999 Q gi|254781110|r 20 ANFSIIFALSVMSFLLLIGFLIYVLDWHYKKNSMESANNAAI 61 (420) Q Consensus 20 G~vai~fal~l~~ll~~~g~aVD~~r~~~~ks~Lq~A~DaA~ 61 (420) |..++=|++++|+++.+..+.+|++++...+..+++|+..++ T Consensus 1 G~a~VEfalv~p~~l~l~~~~~~~~~~~~~~~~~~~Aa~~aa 42 (43) T pfam07811 1 GAAAVEFALVLPVLLLLLFGIVELGRLFYARQVLQNAAREAA 42 (43) T ss_pred CHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHC T ss_conf 916999999999999999999999999999999999998674 No 48 >cd01455 vWA_F11C1-5a_type Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-Rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and immune defenses. In integrins these domains form heterodimers, while in vWF they form multimers. There are different interaction surfaces of this domain, as seen by the various molecules it complexes with.
Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A Probab=96.22 E-value=0.092 Score=30.02 Aligned_cols=180 Identities=18% Similarity=0.220 Sum_probs=102.7 Q ss_pred EEEEECCCCCCCCCCCCCCCCCCCCCCCCCCCCCHHHHHHHHHHHHHHHCCCCCCCCCCCCEEEECCCCCCC---CCCCC Q ss_conf 012210223242233578766667534676653067889998999985104678765543022320487643---12445 Q gi|254781110|r 193 IELVVDLSGSMHCAMNSDPEDVNSAPICQDKKRTKMAALKNALLLFLDSIDLLSHVKEDVYMGLIGYTTRVE---KNIEP 269 (420) Q Consensus 193 ~~~~~d~s~s~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~a~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~---~~~~l 269 (420) +.++.|+|+||.-... .-.|++..-.+.--.+..+...+....+.-+|-..-+.+.. ...|+ T Consensus 3 lr~v~DvSgSMYRFNg---------------~DgRL~R~lEa~~MvMEaf~g~e~k~~ydIvGHSGd~~~I~lV~~~~~P 67 (191) T cd01455 3 LKLVVDVSGSMYRFNG---------------YDGRLDRSLEAVVMVMEAFDGFEDKIQYDIIGHSGDGPCVPFVKTNHPP 67 (191) T ss_pred EEEEEECCCCEEEECC---------------CCHHHHHHHHHHHHHHHHHHCCCCEEEEEEEECCCCCCCCCCCCCCCCC T ss_conf 6999973544233047---------------5328999999999999986175400578875026887751023489999 Q ss_pred CCCHHH--HHHHHHHHHHCCCCCCCCHHHHHHHHHHHHCCCCCCCCCCCCCCCCCCCCCCCCCCEEEEEECCCCCCCCCC Q ss_conf 778488--999999864046788765388999999861310123345543347767787766624899606867777765 Q gi|254781110|r 270 SWGTEK--VRQYVTRDMDSLILKPTDSTPAMKQAYQILTSDKKRSFFTNFFRQGVKIPSLPFQKFIIFLTDGENNNFKSN 347 (420) Q Consensus 270 t~~~~~--~~~~i~~~~~~~~~g~T~~~~gl~~~~~~L~~~~~~~~~~~~~~~~~~~~~~~~~k~iil~TDG~~~~~~~~ 347 (420) ..+... +.+.+.+.. .+.-.|-+.-+++.++...+-+... 
.-..++|+++|-+-..+.- T Consensus 68 k~~keRl~vl~~M~AHs-QyC~sGD~Tlea~~~Ai~~l~a~~d-----------------~De~fVivlSDANL~RYgI- 128 (191) T cd01455 68 KNNKERLETLKMMHAHS-QFCWSGDHTVEATEFAIKELAAKED-----------------FDEAIVIVLSDANLERYGI- 128 (191) T ss_pred CCHHHHHHHHHHHHHHH-HHEECCCCHHHHHHHHHHHHHHCCC-----------------CCCCEEEEECCCCHHHCCC- T ss_conf 86689999999863120-1002588448999999998753026-----------------7760899981476443188- Q ss_pred CHHHHHHHHHH-HCCCEEEEEEECCCCCHHHHHHHHHCCCCCEEEECCHHHHHHHHHHHHHH Q ss_conf 34899999999-78987999995489778999998621898279817989999999999987 Q gi|254781110|r 348 VNTIKICDKAK-ENFIKIVTISINASPNGQRLLKTCVSSPEYHYNVVNADSLIHVFQNISQL 408 (420) Q Consensus 348 ~~~~~~c~~~k-~~gi~i~tIgf~~~~~~~~~l~~cas~~~~yf~a~~~~~L~~aF~~Ia~~ 408 (420) .+..+...++ +..|.-|.|-++.-.+....++.-- ..|+-|-..++.+|..+|++|-.. T Consensus 129 -~p~~l~~~l~~~p~V~a~~IfIgslg~eA~~l~~~l-P~G~~fVc~dt~~lP~il~qIfts 188 (191) T cd01455 129 -QPKKLADALAREPNVNAFVIFIGSLSDEADQLQREL-PAGKAFVCMDTSELPHIMQQIFTS 188 (191) T ss_pred -CHHHHHHHHHCCCCCCEEEEEEECHHHHHHHHHHHC-CCCCEEEECCHHHHHHHHHHHHHH T ss_conf -989999997338776689999735167999999748-997417853653678999999887 No 49 >KOG2884 consensus Probab=95.45 E-value=0.19 Score=27.97 Aligned_cols=158 Identities=15% Similarity=0.119 Sum_probs=104.9 Q ss_pred CCCCCCHHHHHHHHHHHHHHHCCCCCCCCCCCCEEEECCCC-CCCCCCCCCCCHHHHHHHHHHHHHCCCCCCCCHHHHHH Q ss_conf 76653067889998999985104678765543022320487-64312445778488999999864046788765388999 Q gi|254781110|r 221 QDKKRTKMAALKNALLLFLDSIDLLSHVKEDVYMGLIGYTT-RVEKNIEPSWGTEKVRQYVTRDMDSLILKPTDSTPAMK 299 (420) Q Consensus 221 ~~~~~~~~~~~~~a~~~~~~~~~~~~~~~~~~~~~~~~~~~-~~~~~~~lt~~~~~~~~~i~~~~~~~~~g~T~~~~gl~ 299 (420) +++.++|+.+-+++++.+...-... +....+|+.+-.. .......||.+..++.+.++.+ .+.|.-+...||. 
T Consensus 19 gDy~PtRf~aQ~daVn~v~~~K~~s---npEntvGiitla~a~~~vLsT~T~d~gkils~lh~i---~~~g~~~~~~~i~ 92 (259) T KOG2884 19 GDYLPTRFQAQKDAVNLVCQAKLRS---NPENTVGIITLANASVQVLSTLTSDRGKILSKLHGI---QPHGKANFMTGIQ 92 (259) T ss_pred CCCCHHHHHHHHHHHHHHHHHHHCC---CCCCCEEEEECCCCCCEEEEECCCCCHHHHHHHCCC---CCCCCCCHHHHHH T ss_conf 8977188898899999998755027---954315468636898504430343004898773277---8577612888899 Q ss_pred HHHHHHCCCCCCCCCCCCCCCCCCCCCCCC-CCEEEEEECCCCCCCCCCCHHHHHHHHHHHCCCEEEEEEECCCCCHHHH Q ss_conf 999861310123345543347767787766-6248996068677777653489999999978987999995489778999 Q gi|254781110|r 300 QAYQILTSDKKRSFFTNFFRQGVKIPSLPF-QKFIIFLTDGENNNFKSNVNTIKICDKAKENFIKIVTISINASPNGQRL 378 (420) Q Consensus 300 ~~~~~L~~~~~~~~~~~~~~~~~~~~~~~~-~k~iil~TDG~~~~~~~~~~~~~~c~~~k~~gi~i~tIgf~~~~~~~~~ 378 (420) -+.-.|..+- +++. .++++++ |-+-..... .+...+..+|+.+|-|-.|-||...+.... T Consensus 93 iA~lalkhRq----------------nk~~~~riVvFv--GSpi~e~ek-eLv~~akrlkk~~Vaidii~FGE~~~~~e~ 153 (259) T KOG2884 93 IAQLALKHRQ----------------NKNQKQRIVVFV--GSPIEESEK-ELVKLAKRLKKNKVAIDIINFGEAENNTEK 153 (259) T ss_pred HHHHHHHHHC----------------CCCCCEEEEEEE--CCCCHHHHH-HHHHHHHHHHHCCEEEEEEEECCCCCCHHH T ss_conf 9999987103----------------888636999993--683223389-999999998754802789872434333788 Q ss_pred HHH-H--H---CCCCCEEEECCHHHHHHHHH Q ss_conf 998-6--2---18982798179899999999 Q gi|254781110|r 379 LKT-C--V---SSPEYHYNVVNADSLIHVFQ 403 (420) Q Consensus 379 l~~-c--a---s~~~~yf~a~~~~~L~~aF~ 403 (420) +.. . . ++++|.-.++...-|.++.. T Consensus 154 l~~fida~N~~~~gshlv~Vppg~~L~d~l~ 184 (259) T KOG2884 154 LFEFIDALNGKGDGSHLVSVPPGPLLSDALL 184 (259) T ss_pred HHHHHHHHCCCCCCCEEEEECCCCCHHHHHH T ss_conf 9999998538988744898589840777764 No 50 >pfam05762 VWA_CoxE VWA domain containing CoxE-like protein. This family is annotated by SMART as containing a VWA type domain. The exact function of this family is unknown. It is found as part of a CO oxidising (Cox) system operon is several bacteria. 
Probab=92.83 E-value=0.67 Score=24.42 Aligned_cols=96 Identities=11% Similarity=0.090 Sum_probs=56.5 Q ss_pred CEEEECCCCCCCCCCCCC--CCHHHHHHHHHHHHHCCCCCCCCHHHHHHHHHHHHCCCCCCCCCCCCCCCCCCCCCCCCC Q ss_conf 022320487643124457--784889999998640467887653889999998613101233455433477677877666 Q gi|254781110|r 253 YMGLIGYTTRVEKNIEPS--WGTEKVRQYVTRDMDSLILKPTDSTPAMKQAYQILTSDKKRSFFTNFFRQGVKIPSLPFQ 330 (420) Q Consensus 253 ~~~~~~~~~~~~~~~~lt--~~~~~~~~~i~~~~~~~~~g~T~~~~gl~~~~~~L~~~~~~~~~~~~~~~~~~~~~~~~~ 330 (420) ++-...|++.....++.- .+.......+..... .-+|||+++.++..-.+...-. .-..+ T Consensus 90 rv~~F~F~t~l~~vT~~l~~~d~~~al~~~~~~~~-~~~GgT~ig~al~~f~~~~~~~-----------------~l~~~ 151 (223) T pfam05762 90 RTRLFAFGTRLTDLTRALRERDPAEALLRVSARVE-DWGGGTRIGAALAYFNELWTRP-----------------ALSRG 151 (223) T ss_pred CCEEEEEECCHHHHHHHHHCCCHHHHHHHHHHHHC-CCCCCCCHHHHHHHHHHHCCCC-----------------CCCCC T ss_conf 61599983648988888712899999999998603-6679974999999999850303-----------------46788 Q ss_pred CEEEEEECCCCCCCCCCCHHHHHHHHHHHCCCEEEEEE Q ss_conf 24899606867777765348999999997898799999 Q gi|254781110|r 331 KFIIFLTDGENNNFKSNVNTIKICDKAKENFIKIVTIS 368 (420) Q Consensus 331 k~iil~TDG~~~~~~~~~~~~~~c~~~k~~gi~i~tIg 368 (420) -++|+++||.++... ......-+.++..+.+|.-+. 
T Consensus 152 t~ViilsDg~~~~~~--~~l~~~l~~L~~~~~rviWLN 187 (223) T pfam05762 152 AVVVLVSDGLERGDS--EELLAEVARLVRSARRLVWLN 187 (223) T ss_pred CEEEEEECCCCCCCH--HHHHHHHHHHHHHCCEEEEEC T ss_conf 679997230103883--189999999998378799989 No 51 >COG4547 CobT Cobalamin biosynthesis protein CobT (nicotinate-mononucleotide:5, 6-dimethylbenzimidazole phosphoribosyltransferase) [Coenzyme metabolism] Probab=90.78 E-value=1.1 Score=22.96 Aligned_cols=65 Identities=14% Similarity=0.132 Sum_probs=39.2 Q ss_pred CCHHHHHHHHHHHHCCCCCCCCCCCCCCCCCCCCCCCCCCEEEEEECCCCCCCC-----CCCHHH-HHHHHHH----HCC Q ss_conf 653889999998613101233455433477677877666248996068677777-----653489-9999999----789 Q gi|254781110|r 292 TDSTPAMKQAYQILTSDKKRSFFTNFFRQGVKIPSLPFQKFIIFLTDGENNNFK-----SNVNTI-KICDKAK----ENF 361 (420) Q Consensus 292 T~~~~gl~~~~~~L~~~~~~~~~~~~~~~~~~~~~~~~~k~iil~TDG~~~~~~-----~~~~~~-~~c~~~k----~~g 361 (420) .-.++++.|+-+.|-. .+..+|++.+|+||.+-..+ ..+++. .+-.-++ ..- T Consensus 517 NiDGEal~wah~rl~g------------------RpEqrkIlmmiSDGAPvddstlsvnpGnylerHLRaVieeIEtrSp 578 (620) T COG4547 517 NIDGEALMWAHQRLIG------------------RPEQRKILMMISDGAPVDDSTLSVNPGNYLERHLRAVIEEIETRSP 578 (620) T ss_pred CCCHHHHHHHHHHHHC------------------CHHHCEEEEEECCCCCCCCCCCCCCCCHHHHHHHHHHHHHHHCCCC T ss_conf 5771999999998735------------------8424137888348985555434558860799999999999703784 Q ss_pred CEEEEEEECCCCC Q ss_conf 8799999548977 Q gi|254781110|r 362 IKIVTISINASPN 374 (420) Q Consensus 362 i~i~tIgf~~~~~ 374 (420) |.+..||++-+.. 
T Consensus 579 veLlAIGighDvt 591 (620) T COG4547 579 VELLAIGIGHDVT 591 (620) T ss_pred HHHEEEECCCCCC T ss_conf 0330331255530 No 52 >COG4867 Uncharacterized protein with a von Willebrand factor type A (vWA) domain [General function prediction only] Probab=90.23 E-value=1.2 Score=22.65 Aligned_cols=93 Identities=17% Similarity=0.169 Sum_probs=62.6 Q ss_pred CCCCHHHHHHHHHHHHCCCCCCCCCCCCCCCCCCCCCCCCCCEEEEEECCCCCCCCCC-------------CH----HHH Q ss_conf 8765388999999861310123345543347767787766624899606867777765-------------34----899 Q gi|254781110|r 290 KPTDSTPAMKQAYQILTSDKKRSFFTNFFRQGVKIPSLPFQKFIIFLTDGENNNFKSN-------------VN----TIK 352 (420) Q Consensus 290 g~T~~~~gl~~~~~~L~~~~~~~~~~~~~~~~~~~~~~~~~k~iil~TDG~~~~~~~~-------------~~----~~~ 352 (420) .+||.+-|+..+.+.|.- .+..+|.++++|||+++....+ .+ +.. T Consensus 531 qgTNlhhaL~LA~r~l~R------------------h~~~~~~il~vTDGePtAhle~~DG~~~~f~yp~DP~t~~~Tvr 592 (652) T COG4867 531 QGTNLHHALALAGRHLRR------------------HAGAQPVVLVVTDGEPTAHLEDGDGTSVFFDYPPDPRTIAHTVR 592 (652) T ss_pred CCCCHHHHHHHHHHHHHH------------------CCCCCCEEEEEECCCCCCCCCCCCCCEEECCCCCCHHHHHHHHH T ss_conf 555458899999999873------------------75657628998379863013478985661689987779989899 Q ss_pred HHHHHHHCCCEEEEEEECCCCCHHHHHHHHHC-CCCCEEEECCHHHHHHH Q ss_conf 99999978987999995489778999998621-89827981798999999 Q gi|254781110|r 353 ICDKAKENFIKIVTISINASPNGQRLLKTCVS-SPEYHYNVVNADSLIHV 401 (420) Q Consensus 353 ~c~~~k~~gi~i~tIgf~~~~~~~~~l~~cas-~~~~yf~a~~~~~L~~a 401 (420) -.+++...||.|-+.-++.++.-.-++++.+- ..|..| +++.+.|-.+ T Consensus 593 ~~d~~~r~G~q~t~FrLg~DpgL~~Fv~qva~rv~G~vv-~pdldglGaa 641 (652) T COG4867 593 GFDDMARLGAQVTIFRLGSDPGLARFIDQVARRVQGRVV-VPDLDGLGAA 641 (652) T ss_pred HHHHHHHCCCEEEEEEECCCHHHHHHHHHHHHHHCCEEE-ECCCCHHHHH T ss_conf 888887516413677522777689999999998588488-1382213589 No 53 >COG3847 Flp Flp pilus assembly protein, pilin Flp [Intracellular trafficking and secretion] Probab=89.31 E-value=1.5 Score=22.19 Aligned_cols=28 Identities=11% Similarity=0.186 Sum_probs=21.2 Q 
ss_pred HHHHHHHHHCCCCCHHHHHHHHHHHHHH Q ss_conf 9999988630388569999999999999 Q gi|254781110|r 8 RFYFKKGIASEKANFSIIFALSVMSFLL 35 (420) Q Consensus 8 ~~~~~~~~~~~~G~vai~fal~l~~ll~ 35 (420) +..++||+|||+|.-+|=.+++...+-. T Consensus 2 ~~~~~rF~rDE~GAtaiEYglia~lIav 29 (58) T COG3847 2 KKLLRRFLRDEDGATAIEYGLIAALIAV 29 (58) T ss_pred CHHHHHHHHCCCCHHHHHHHHHHHHHHH T ss_conf 1789999774565189999999999999 No 54 >cd01460 vWA_midasin VWA_Midasin: Midasin is a member of the AAA ATPase family. The proteins of this family are unified by their common architectural organization, which is based upon a conserved ATPase domain. The AAA domain of midasin contains six tandem AAA protomers. The AAA domains in midasin are followed by a D/E-rich domain that is followed by a VWA domain. The members of this subgroup have a conserved MIDAS motif. The function of this domain is not exactly known, although it has been speculated to play a crucial role in midasin function. Probab=88.90 E-value=1.6 Score=22.01 Aligned_cols=184 Identities=18% Similarity=0.149 Sum_probs=108.5 Q ss_pred CCCCCCEEEEECCCCCCCCCCCCCCCCCCCCCCCCCCCCCHHHHHHHHHHHHHHHCCCCCCCCCCCCEEEECCCCCCCCC Q ss_conf 43443201221022324223357876666753467665306788999899998510467876554302232048764312 Q gi|254781110|r 187 ERPIFLIELVVDLSGSMHCAMNSDPEDVNSAPICQDKKRTKMAALKNALLLFLDSIDLLSHVKEDVYMGLIGYTTRVEKN 266 (420) Q Consensus 187 ~~~~~~~~~~~d~s~s~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~a~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 266 (420) +...--+...+|.|.||.... ....+-.++.-+...+...+.. ..+++.|+......
T Consensus 57 sKR~YqI~lAiDdSkSM~~~~-------------------~~~lAlesl~lvs~Als~LEvG----~l~V~~FGe~v~~l 113 (266) T cd01460 57 AKRDYQILIAIDDSKSMSENN-------------------SKKLALESLCLVSKALTLLEVG----QLGVCSFGEDVQIL 113 (266) T ss_pred CCCCEEEEEEECCCHHHHHHH-------------------HHHHHHHHHHHHHHHHHHHCCC----CEEEEECCCCEEEE T ss_conf 654258999972603310104-------------------7889999999999999872776----56899848872786 Q ss_pred CCCCCCHHH--HHHHHHHHHHCCCCCCCCHHHHHHHHHHHHCCCCCCCCCCCCCCCCCCCCCCCCCCEEEEEECCCCCCC Q ss_conf 445778488--999999864046788765388999999861310123345543347767787766624899606867777 Q gi|254781110|r 267 IEPSWGTEK--VRQYVTRDMDSLILKPTDSTPAMKQAYQILTSDKKRSFFTNFFRQGVKIPSLPFQKFIIFLTDGENNNF 344 (420) Q Consensus 267 ~~lt~~~~~--~~~~i~~~~~~~~~g~T~~~~gl~~~~~~L~~~~~~~~~~~~~~~~~~~~~~~~~k~iil~TDG~~~~~ 344 (420) .++...++. ....+.... .....|++..-+......+..... ...+....+.+++++||..... T Consensus 114 h~f~~~f~~~~g~~il~~f~--F~q~~T~v~~ll~~~~~~~~~a~~------------~~~~~~~~qL~lIiSDG~~~~~ 179 (266) T cd01460 114 HPFDEQFSSQSGPRILNQFT--FQQDKTDIANLLKFTAQIFEDART------------QSSSGSLWQLLLIISDGRGEFS 179 (266) T ss_pred EECCCCCCCCHHHHHHHHCC--CCCCCCCHHHHHHHHHHHHHHHHC------------CCCCCCHHHEEEEEECCCCCCC T ss_conf 22688765422799998398--876776199999999999999752------------6687562127999968986335 Q ss_pred CCCCHHHHHHHHHHHCCCEEEEEEECCCCCHHHHH--H--------------HHHCCCCCEEE-ECCHHHHHHHHHHHHH Q ss_conf 76534899999999789879999954897789999--9--------------86218982798-1798999999999998 Q gi|254781110|r 345 KSNVNTIKICDKAKENFIKIVTISINASPNGQRLL--K--------------TCVSSPEYHYN-VVNADSLIHVFQNISQ 407 (420) Q Consensus 345 ~~~~~~~~~c~~~k~~gi~i~tIgf~~~~~~~~~l--~--------------~cas~~~~yf~-a~~~~~L~~aF~~Ia~ 407 (420) .. 
.....-..+..++|-+..|=++.+.+.+..| + .--+=|=.||- +.+.++|.+++..+-+ T Consensus 180 ~~--~~r~~vr~a~~~~i~~vfiIiD~~~~~~SIldmk~~~f~~~~~~~~~~YLd~FPFpyYvivrdi~~LP~~Lsd~LR 257 (266) T cd01460 180 EG--AQKVRLREAREQNVFVVFIIIDNPDNKQSILDIKVVSFKNDKSGVITPYLDEFPFPYYVIVRDLNQLPSVLSDALR 257 (266) T ss_pred CC--HHHHHHHHHHHCCCEEEEEEECCCCCCCCCCCCEEEEEECCCCCEEEEHHHCCCCCEEEEECCHHHHHHHHHHHHH T ss_conf 30--6899999999769769999970898877633113777717984066472443998648997688780899999999 Q ss_pred HH Q ss_conf 74 Q gi|254781110|r 408 LM 409 (420) Q Consensus 408 ~I 409 (420) += T Consensus 258 Qw 259 (266) T cd01460 258 QW 259 (266) T ss_pred HH T ss_conf 99 No 55 >COG4726 PilX Tfp pilus assembly protein PilX [Cell motility and secretion / Intracellular trafficking and secretion] Probab=87.43 E-value=1.9 Score=21.42 Aligned_cols=57 Identities=19% Similarity=0.258 Sum_probs=32.1 Q ss_pred HHHHCCCCCHHHHHHHHHHHHHHHHHHHHH--------HHHHHHHHHHHHHHHHHHHHHHHHHCCC Q ss_conf 886303885699999999999999999999--------9999999999999999999998742046 Q gi|254781110|r 13 KGIASEKANFSIIFALSVMSFLLLIGFLIY--------VLDWHYKKNSMESANNAAILAGASKMVS 70 (420) Q Consensus 13 ~~~~~~~G~vai~fal~l~~ll~~~g~aVD--------~~r~~~~ks~Lq~A~DaA~LA~a~~~~~ 70 (420) |-.|.|||-. ++++|++++++.+.|++.- .+.-++.|+..++|+++|.-.+...+.+ T Consensus 7 r~~r~qRG~~-LivvL~~LvvltLl~l~~~r~~llqeRiSaN~~D~~lAfqaAEaaLr~~E~~i~n 71 (196) T COG4726 7 RGSRRQRGFA-LIVVLMVLVVLTLLGLAAARSVLLQERISANERDRSLAFQAAEAALREGELQINN 71 (196) T ss_pred CCCCCCCCEE-EHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHC T ss_conf 8764567647-3899999999999999999999989887520677899999999999877899860 No 56 >pfam00362 Integrin_beta Integrin, beta chain. Integrins have been found in animals and their homologues have also been found in cyanobacteria, probably due to horizontal gene transfer. The sequences repeats have been trimmed due to an overlap with EGF. 
Probab=82.78 E-value=3.1 Score=20.03 Aligned_cols=129 Identities=14% Similarity=0.182 Sum_probs=67.2 Q ss_pred CCCCCCCCHHHHHHHHHHHHHCCCCCCCCHHHHHHHHHHHHCCCCCCCCCCCCCCCCCCCCCCCCCCEEEEEECCCCCCC Q ss_conf 12445778488999999864046788765388999999861310123345543347767787766624899606867777 Q gi|254781110|r 265 KNIEPSWGTEKVRQYVTRDMDSLILKPTDSTPAMKQAYQILTSDKKRSFFTNFFRQGVKIPSLPFQKFIIFLTDGENNNF 344 (420) Q Consensus 265 ~~~~lt~~~~~~~~~i~~~~~~~~~g~T~~~~gl~~~~~~L~~~~~~~~~~~~~~~~~~~~~~~~~k~iil~TDG~~~~~ 344 (420) ...+||.+.+.+...+++.. .+|+-..++| |...|-.. ..-...-++ ++..++++|+.||+-.... T Consensus 180 ~~l~LT~~~~~F~~~v~~q~---iSgNlD~PEG---GfDAlmQ~-------aVC~~~IGW-R~~arrllv~~TDa~fH~A 245 (424) T pfam00362 180 HVLSLTDDTDRFNEEVKKQK---ISGNLDAPEG---GFDAIMQA-------AVCGEEIGW-RNEARRLLVFTTDAGFHFA 245 (424) T ss_pred EECCCCCCHHHHHHHHHHCC---CCCCCCCCCC---CHHHHHHH-------HHHCCCCCC-CCCCEEEEEEECCCCCCCC T ss_conf 00246777899999987463---6467778750---17777788-------761423377-7785289999858875135 Q ss_pred ------------------------C-----CCCHHHHHHHHHHHCCCE-EEEEEECCCCCHHHHHHHHHCCCCCEE---- Q ss_conf ------------------------7-----653489999999978987-999995489778999998621898279---- Q gi|254781110|r 345 ------------------------K-----SNVNTIKICDKAKENFIK-IVTISINASPNGQRLLKTCVSSPEYHY---- 390 (420) Q Consensus 345 ------------------------~-----~~~~~~~~c~~~k~~gi~-i~tIgf~~~~~~~~~l~~cas~~~~yf---- 390 (420) . .......+.++++.++|. ||.|.=..-.--+. |..-- |+.+. 
T Consensus 246 gDGkL~GIv~PNDg~CHL~~~g~Yt~s~~~DYPSv~ql~~kl~ennI~~IFAVt~~~~~~Y~~-Ls~~i--~gs~vg~L~ 322 (424) T pfam00362 246 GDGKLGGIVEPNDGQCHLDDNGEYTASTTLDYPSVGQLAEKLSENNINPIFAVTENVVDLYKE-LSELI--PGSTVGVLS 322 (424) T ss_pred CCCCEEEEECCCCCCEEECCCCCCCCCCCCCCCCHHHHHHHHHHCCCEEEEEECHHHHHHHHH-HHHHC--CCCEEEEEC T ss_conf 776334353488873044898761445667888889999999864925999975024589999-99757--765256624 Q ss_pred -EECCHHH-HHHHHHHHHHHHH Q ss_conf -8179899-9999999998742 Q gi|254781110|r 391 -NVVNADS-LIHVFQNISQLMV 410 (420) Q Consensus 391 -~a~~~~~-L~~aF~~Ia~~I~ 410 (420) +..|--+ +.++|++|...+. T Consensus 323 ~DSsNIv~LI~~aY~ki~S~V~ 344 (424) T pfam00362 323 SDSSNVVQLIKDAYNKISSKVE 344 (424) T ss_pred CCCHHHHHHHHHHHHHHHEEEE T ss_conf 6750289999999987522899 No 57 >smart00187 INB Integrin beta subunits (N-terminal portion of extracellular region). Portion of beta integrins that lies N-terminal to their EGF-like repeats. Integrins are cell adhesion molecules that mediate cell-extracellular matrix and cell-cell interactions. They contain both alpha and beta subunits. Beta integrins are proposed to have a von Willebrand factor type-A "insert" or "I" -like domain (although this remains to be confirmed). Probab=79.11 E-value=4.2 Score=19.22 Aligned_cols=130 Identities=13% Similarity=0.160 Sum_probs=67.0 Q ss_pred CCCCCCCCCCHHHHHHHHHHHHHCCCCCCCCHHHHHHHHHHHHCCCCCCCCCCCCCCCCCCCCCCCCCCEEEEEECCCCC Q ss_conf 43124457784889999998640467887653889999998613101233455433477677877666248996068677 Q gi|254781110|r 263 VEKNIEPSWGTEKVRQYVTRDMDSLILKPTDSTPAMKQAYQILTSDKKRSFFTNFFRQGVKIPSLPFQKFIIFLTDGENN 342 (420) Q Consensus 263 ~~~~~~lt~~~~~~~~~i~~~~~~~~~g~T~~~~gl~~~~~~L~~~~~~~~~~~~~~~~~~~~~~~~~k~iil~TDG~~~ 342 (420) ...+.+||.+...+.+.+++.. .+|+-..++| |...|-.. ..-...-+ .++..+|++|+.||+-.. 
T Consensus 177 f~n~l~LT~d~~~F~~~V~~q~---iSgNlD~PEG---GfDAlmQ~-------avC~~~IG-WR~~arrllVf~TDa~fH 242 (423) T smart00187 177 FKHVLSLTDDTDEFNEEVKKQR---ISGNLDAPEG---GFDAIMQA-------AVCTEQIG-WREDARRLLVFSTDAGFH 242 (423) T ss_pred CCCCCCCCCCHHHHHHHHHHCC---CCCCCCCCCC---CHHHHHHH-------HHHCCCCC-CCCCCEEEEEEECCCCCC T ss_conf 0111236788899999986253---6346688761---27788888-------75200037-655743899998378630 Q ss_pred CC------------------------C-----CCCHHHHHHHHHHHCCCE-EEEEEECCCCCHHHHHHHHHCCCCCEE-- Q ss_conf 77------------------------7-----653489999999978987-999995489778999998621898279-- Q gi|254781110|r 343 NF------------------------K-----SNVNTIKICDKAKENFIK-IVTISINASPNGQRLLKTCVSSPEYHY-- 390 (420) Q Consensus 343 ~~------------------------~-----~~~~~~~~c~~~k~~gi~-i~tIgf~~~~~~~~~l~~cas~~~~yf-- 390 (420) .. . .......+.+++++++|. ||.|.=..-.--+. |..-- |+... T Consensus 243 ~AgDGkL~GIv~PNDg~CHLd~~g~Yt~s~~~DYPSi~ql~~kl~ennI~~IFAVT~~~~~~Y~~-Ls~~i--pgs~vg~ 319 (423) T smart00187 243 FAGDGKLAGIVQPNDGQCHLDNNGEYTMSTTQDYPSIGQLNQKLAENNINPIFAVTKKQVSLYKE-LSALI--PGSSVGV 319 (423) T ss_pred CCCCCCEEEEECCCCCCEEECCCCCCCCCCCCCCCCHHHHHHHHHHCCCEEEEEECCCHHHHHHH-HHHHC--CCCEEEE T ss_conf 23676244354378873032788852445656788789999999853932799852204569999-98757--7540355 Q ss_pred ---EECCHHH-HHHHHHHHHHHH Q ss_conf ---8179899-999999999874 Q gi|254781110|r 391 ---NVVNADS-LIHVFQNISQLM 409 (420) Q Consensus 391 ---~a~~~~~-L~~aF~~Ia~~I 409 (420) +..|--+ +.++|++|...+ T Consensus 320 L~~DSsNVv~LI~~aY~ki~S~V 342 (423) T smart00187 320 LSEDSSNVVELIKDAYNKISSRV 342 (423) T ss_pred ECCCCHHHHHHHHHHHHHHCEEE T ss_conf 24575138999999998750189 No 58 >KOG2487 consensus Probab=75.78 E-value=5.2 Score=18.62 Aligned_cols=47 Identities=19% Similarity=0.247 Sum_probs=37.6 Q ss_pred HHHHCCCEEEEEEECCCCCHHHHHHHHHCCCCCEEEECCHHHHHHHHHH Q ss_conf 9997898799999548977899999862189827981798999999999 Q gi|254781110|r 356 KAKENFIKIVTISINASPNGQRLLKTCVSSPEYHYNVVNADSLIHVFQN 404 (420) Q Consensus 356 
~~k~~gi~i~tIgf~~~~~~~~~l~~cas~~~~yf~a~~~~~L~~aF~~ 404 (420) .+.+.+|.|-++-+|.+ +--+.|.|..++|-|-++++++.|.+.+-. T Consensus 191 aAqKq~I~Idv~~l~~~--s~~LqQa~D~TGG~YL~v~~~~gLLqyLlt 237 (314) T KOG2487 191 AAQKQNIPIDVVSLGGD--SGFLQQACDITGGDYLHVEKPDGLLQYLLT 237 (314) T ss_pred HHHHCCCEEEEEEECCC--CHHHHHHHHHCCCEEEECCCCCHHHHHHHH T ss_conf 78753961589995698--439999875028704714885259999999 No 59 >cd01458 vWA_ku Ku70/Ku80 N-terminal domain. The Ku78 heterodimer (composed of Ku70 and Ku80) contributes to genomic integrity through its ability to bind DNA double-strand breaks (DSB) in a preferred orientation. DSB's are repaired by either homologues recombination or non-homologues end joining and facilitate repair by the non-homologous end-joining pathway (NHEJ). The Ku heterodimer is required for accurate process that tends to preserve the sequence at the junction. Ku78 is found in all three kingdoms of life. However, only the eukaryotic proteins have a vWA domain fused to them at their N-termini. The vWA domain is not involved in DNA binding but may very likey mediate Ku78's interactions with other proteins. Members of this subgroup lack the conserved MIDAS motif. Probab=73.92 E-value=5.8 Score=18.32 Aligned_cols=160 Identities=11% Similarity=0.126 Sum_probs=83.3 Q ss_pred EEEEECCCCCCCCCCCCCCCCCCCCCCCCCCCCCHHHHHHHHHHHHHHHCCCCCCCCCCCCEEEECCCCCCCC------- Q ss_conf 0122102232422335787666675346766530678899989999851046787655430223204876431------- Q gi|254781110|r 193 IELVVDLSGSMHCAMNSDPEDVNSAPICQDKKRTKMAALKNALLLFLDSIDLLSHVKEDVYMGLIGYTTRVEK------- 265 (420) Q Consensus 193 ~~~~~d~s~s~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~a~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~------- 265 (420) +.|.+|++-+|-....+ ...+.+..+-..+..++...- .......+|++.|++.-.. 
T Consensus 4 ivflID~s~sM~~~~~~-------------~~~s~~~~al~~i~~~~~~ki---is~~~d~vGvv~~~T~~~~n~~~~~~ 67 (218) T cd01458 4 VVFLVDVSPSMFESKDG-------------EYESPFEEALKCIRQLMKSKI---ISSPKDLVGVVFYGTEESKNPVGYEN 67 (218) T ss_pred EEEEEECCHHHCCCCCC-------------CCCCHHHHHHHHHHHHHHHHE---ECCCCCEEEEEEECCCCCCCCCCCCE T ss_conf 99999799778477678-------------888839999999999998650---67899869999976788888789872 Q ss_pred ---CCCCCCCHHHHHHHHHHHHHC---------CCCCCCCHHHHHHHHHHHHCCCCCCCCCCCCCCCCCCCCCCCCCCEE Q ss_conf ---244577848899999986404---------67887653889999998613101233455433477677877666248 Q gi|254781110|r 266 ---NIEPSWGTEKVRQYVTRDMDS---------LILKPTDSTPAMKQAYQILTSDKKRSFFTNFFRQGVKIPSLPFQKFI 333 (420) Q Consensus 266 ---~~~lt~~~~~~~~~i~~~~~~---------~~~g~T~~~~gl~~~~~~L~~~~~~~~~~~~~~~~~~~~~~~~~k~i 333 (420) ..+|..-..+....+..+... ...+.......+--+.+++.... .....|.| T Consensus 68 i~vl~~l~~~~a~~i~~l~~~~~~~~~~~~~~~~~~~~~~l~~aL~~~~~~f~~~~----------------~~~~~krI 131 (218) T cd01458 68 IYVLLDLDTPGAERVEDLKELIEPGGLSFAGQVGDSGQVSLSDALWVCLDLFSKGK----------------KKKSHKRI 131 (218) T ss_pred EEEEECCCCCCHHHHHHHHHHHHCCHHHHHHHCCCCCCCCHHHHHHHHHHHHHHCC----------------CCCCCCEE T ss_conf 69963388767799999999860102355664488888679999999999998555----------------34577779 Q ss_pred EEEECCCCCCCCC---CCHHHHHHHHHHHCCCEEEEEEECCCC---CHHHHHHHHHC Q ss_conf 9960686777776---534899999999789879999954897---78999998621 Q gi|254781110|r 334 IFLTDGENNNFKS---NVNTIKICDKAKENFIKIVTISINASP---NGQRLLKTCVS 384 (420) Q Consensus 334 il~TDG~~~~~~~---~~~~~~~c~~~k~~gi~i~tIgf~~~~---~~~~~l~~cas 384 (420) +++||..+=.... .......+..+++.||.|-.+.+..+. +...+.+.... T Consensus 132 ~lfTdnD~P~~~~~~~~~~a~~~a~DL~d~gI~iel~~l~~~~~~Fd~s~FY~dii~ 188 (218) T cd01458 132 FLFTNNDDPHGGDSIKDSQAAVKAEDLKDKGIELELFPLSSPGKKFDVSKFYKDIIA 188 (218) T ss_pred EEECCCCCCCCCCHHHHHHHHHHHHHHHHCCCEEEEEECCCCCCCCCCHHHHHHHHC T ss_conf 998689989998879999999999889877968999844899886880677887526 No 60 >pfam09967 DUF2201 Predicted metallopeptidase (DUF2201). 
This domain, found in various hypothetical bacterial proteins, has no known function. Probab=71.13 E-value=6.7 Score=17.90 Aligned_cols=46 Identities=13% Similarity=0.106 Sum_probs=27.8 Q ss_pred CEEEECCCCCCCCCCCCCCCHHHHHHHHHHHHHCCCCCCCCHHHHHHHHHH Q ss_conf 022320487643124457784889999998640467887653889999998 Q gi|254781110|r 253 YMGLIGYTTRVEKNIEPSWGTEKVRQYVTRDMDSLILKPTDSTPAMKQAYQ 303 (420) Q Consensus 253 ~~~~~~~~~~~~~~~~lt~~~~~~~~~i~~~~~~~~~g~T~~~~gl~~~~~ 303 (420) .+-++.++........+.....- +..+ ....+|||.....+.|.-. T Consensus 322 ~i~vi~~D~~V~~~~~~~~~~~~----~~~~-~~~GgGGTdf~pvf~~~~~ 367 (412) T pfam09967 322 EVHVLACDEKVSSVQKFEPGDSE----ISEV-ELTGGGGTDFRPVLEAALR 367 (412) T ss_pred CEEEEEECCEECCEEEECCCCCC----CCCC-CCCCCCCCCCHHHHHHHHH T ss_conf 77999968885640786346676----4414-1357899878489999982 No 61 >TIGR02877 spore_yhbH sporulation protein YhbH; InterPro: IPR014230 Proteins in this entry, typified by YhbH from Bacillus subtilis, are found in the genomes of nearly every endospore-forming bacterium, and in no other genomes. The gene in Bacillus subtilis was shown to be a member of the sigma-E regulon, with mutation leading to a sporulation defect. Probab=67.06 E-value=8.1 Score=17.36 Aligned_cols=106 Identities=11% Similarity=0.143 Sum_probs=65.4 Q ss_pred HCCCCCCCCHHHHHHHHHHHHCCCCCCCCCCCCCCCCCCCCCCCCCCEEEEEECCCCCCCCCCC-HHHHHHHHHHHCC-- Q ss_conf 0467887653889999998613101233455433477677877666248996068677777653-4899999999789-- Q gi|254781110|r 285 DSLILKPTDSTPAMKQAYQILTSDKKRSFFTNFFRQGVKIPSLPFQKFIIFLTDGENNNFKSNV-NTIKICDKAKENF-- (420) Q Consensus 285 ~~~~~g~T~~~~gl~~~~~~L~~~~~~~~~~~~~~~~~~~~~~~~~k~iil~TDG~~~~~~~~~-~~~~~c~~~k~~g-- (420) ...-+|||.+..|...|..++...+ +...++=|-+-++||+|.....+. 
+..-+-+-++-.+ T Consensus 274 ~kgESGGT~~SS~Y~~ALeiI~~RY---------------nP~~yNiY~FHfSDGDNl~~Dn~Rlav~l~~~L~~~cNL~ 338 (392) T TIGR02877 274 TKGESGGTRCSSAYKLALEIIDERY---------------NPARYNIYAFHFSDGDNLSSDNERLAVKLVRKLLEVCNLF 338 (392) T ss_pred CCCCCCCCCHHHHHHHHHHHHHCCC---------------CCCCCCCCCCEEECCCCCCCCCHHHHHHHHHHHHHHHHCC T ss_conf 3256677430167889999974278---------------8310065653553377889886468999999998876111 Q ss_pred --CEE------EEEEECCCCCHHHHHHHHHCCCCC-EEEECCHHHHHHHHHHH Q ss_conf --879------999954897789999986218982-79817989999999999 Q gi|254781110|r 362 --IKI------VTISINASPNGQRLLKTCVSSPEY-HYNVVNADSLIHVFQNI 405 (420) Q Consensus 362 --i~i------~tIgf~~~~~~~~~l~~cas~~~~-yf~a~~~~~L~~aF~~I 405 (420) ++| .+..++++.+-....+.-..+|.+ ++...+-+||=.|++++ T Consensus 339 GYgEIEtqPqyls~~Y~y~~tL~~~f~~ei~~~~Fv~~~I~~K~d~y~ALk~~ 391 (392) T TIGR02877 339 GYGEIETQPQYLSMPYGYSSTLKSKFKKEIKDPNFVLLIIKDKEDVYPALKKF 391 (392) T ss_pred CEEEEECCCCEECCCCCCCHHHHHHHHHHHCCCCCEEEEECCHHHHHHHHHHH T ss_conf 10566056511037886655778888874058883587650414689999983 No 62 >COG2984 ABC-type uncharacterized transport system, periplasmic component [General function prediction only] Probab=65.52 E-value=8.6 Score=17.17 Aligned_cols=87 Identities=10% Similarity=0.182 Sum_probs=47.0 Q ss_pred CCCCCCEEEEEECCCCCCCCCCCHHHHHHHHHHHCCCEEEEEEECCCCCHHHHHHHHHCCCCCEEEECCHHHHHHHHHHH Q ss_conf 77666248996068677777653489999999978987999995489778999998621898279817989999999999 Q gi|254781110|r 326 SLPFQKFIIFLTDGENNNFKSNVNTIKICDKAKENFIKIVTISINASPNGQRLLKTCVSSPEYHYNVVNADSLIHVFQNI 405 (420) Q Consensus 326 ~~~~~k~iil~TDG~~~~~~~~~~~~~~c~~~k~~gi~i~tIgf~~~~~~~~~l~~cas~~~~yf~a~~~~~L~~aF~~I 405 (420) -++.+++.++..-||.|... 
..+.+-..+++.|+++++.+.....+.+..++.....++-.|--.+ .-+..+++.+ T Consensus 156 ~Pnak~Igv~Y~p~E~ns~~---l~eelk~~A~~~Gl~vve~~v~~~ndi~~a~~~l~g~~d~i~~p~d-n~i~s~~~~l 231 (322) T COG2984 156 LPNAKSIGVLYNPGEANSVS---LVEELKKEARKAGLEVVEAAVTSVNDIPRAVQALLGKVDVIYIPTD-NLIVSAIESL 231 (322) T ss_pred CCCCEEEEEEECCCCCCCHH---HHHHHHHHHHHCCCEEEEEECCCCCCCHHHHHHHCCCCCEEEEECC-HHHHHHHHHH T ss_conf 78870699995798866089---9999999998779889998347632008999973478767998660-6778889999 Q ss_pred HHHHHCCEEEE Q ss_conf 98742025587 Q gi|254781110|r 406 SQLMVHRKYSV 416 (420) Q Consensus 406 a~~I~~lr~s~ 416 (420) -...-.-|+-| T Consensus 232 ~~~a~~~kiPl 242 (322) T COG2984 232 LQVANKAKIPL 242 (322) T ss_pred HHHHHHHCCCE T ss_conf 99988708973 No 63 >COG5242 TFB4 RNA polymerase II transcription initiation/nucleotide excision repair factor TFIIH, subunit TFB4 [Transcription / DNA replication, recombination, and repair] Probab=62.60 E-value=9.8 Score=16.82 Aligned_cols=48 Identities=21% Similarity=0.270 Sum_probs=36.2 Q ss_pred HHHHCCCEEEEEEECCCCCHHHHHHHHHCCCCCEEEECCHHHHHHHHHHH Q ss_conf 99978987999995489778999998621898279817989999999999 Q gi|254781110|r 356 KAKENFIKIVTISINASPNGQRLLKTCVSSPEYHYNVVNADSLIHVFQNI 405 (420) Q Consensus 356 ~~k~~gi~i~tIgf~~~~~~~~~l~~cas~~~~yf~a~~~~~L~~aF~~I 405 (420) .+.+.||.|-+..++.+ ..-++|.|.+++|-|-.+++++.|...+-.. T Consensus 178 ~Aqk~~ipI~v~~i~g~--s~fl~Q~~daTgG~Yl~ve~~eGllqyL~~~ 225 (296) T COG5242 178 AAQKFGIPISVFSIFGN--SKFLLQCCDATGGDYLTVEDTEGLLQYLLSL 225 (296) T ss_pred EHHHCCCCEEEEEECCC--CHHHHHHHHCCCCEEEEECCCHHHHHHHHHH T ss_conf 26434981489982486--1789987634487268624820699999998 No 64 >pfam02060 ISK_Channel Slow voltage-gated potassium channel. 
Probab=60.60 E-value=11 Score=16.59 Aligned_cols=40 Identities=15% Similarity=0.255 Sum_probs=31.3 Q ss_pred HHHHHHCCCCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHH Q ss_conf 9988630388569999999999999999999999999999 Q gi|254781110|r 11 FKKGIASEKANFSIIFALSVMSFLLLIGFLIYVLDWHYKK 50 (420) Q Consensus 11 ~~~~~~~~~G~vai~fal~l~~ll~~~g~aVD~~r~~~~k 50 (420) .+|.-+...|...++..|+++.+++++.++|-++.....| T Consensus 31 arr~p~~~dg~le~lYiLmvlGfFgFft~GImlsyiRSkk 70 (129) T pfam02060 31 ARRSPLGDDGKLEALYILMVLGFFGFFTLGIMLSYIRSKK 70 (129) T ss_pred CCCCCCCCCCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHH T ss_conf 5678899986026889999999999999999999999886 No 65 >TIGR01651 CobT cobaltochelatase, CobT subunit; InterPro: IPR006538 These proteins are CobT subunits of the aerobic cobalt chelatase (aerobic cobalamin biosynthesis pathway). Pseudomonas denitrificans CobT has been experimentally characterised. Aerobic cobalt chelatase consists of three subunits, CobT, CobN (IPR003672 from INTERPRO) and CobS (IPR006537 from INTERPRO). Cobalamin (vitamin B12) can be complexed with metal via the ATP-dependent reactions (aerobic pathway) (e.g., in Pseudomonas denitrificans) or via ATP-independent reactions (anaerobic pathway) (e.g., in Salmonella typhimurium). The corresponding cobalt chelatases are not homologous. However, aerobic cobalt chelatase subunits CobN and CobS are homologous to Mg-chelatase subunits BchH and BchI, respectively. CobT, too, has been found to be remotely related to the third subunit of Mg-chelatase, BchD (involved in bacteriochlorophyll synthesis, e.g., in Rhodobacter capsulatus). Nomenclature note: CobT of the aerobic pathway Pseudomonas denitrificans is not a homolog of CobT of the anaerobic pathway (Salmonella typhimurium, Escherichia coli). Therefore, annotation of any members of this family as nicotinate-mononucleotide--5,6-dimethylbenzimidazole phosphoribosyltransferases is erroneous. 
Probab=60.31 E-value=5.1 Score=18.66 Aligned_cols=65 Identities=11% Similarity=0.156 Sum_probs=42.4 Q ss_pred CCHHHHHHHHHHHHCCCCCCCCCCCCCCCCCCCCCCCCCCEEEEEECCCCCCCC-----CCCHHH----HHHHHHH-HCC Q ss_conf 653889999998613101233455433477677877666248996068677777-----653489----9999999-789 Q gi|254781110|r 292 TDSTPAMKQAYQILTSDKKRSFFTNFFRQGVKIPSLPFQKFIIFLTDGENNNFK-----SNVNTI----KICDKAK-ENF 361 (420) Q Consensus 292 T~~~~gl~~~~~~L~~~~~~~~~~~~~~~~~~~~~~~~~k~iil~TDG~~~~~~-----~~~~~~----~~c~~~k-~~g 361 (420) ---+++|.|+-+=|.. .+.-+|++.+++||.+-..+ ...+++ ..-+.+- .+= T Consensus 502 NIDGEAL~WAH~RliA------------------R~EQRrILM~ISDGAPVDDSTLSVN~G~YLERHLR~VI~~IEtrSP 563 (606) T TIGR01651 502 NIDGEALLWAHERLIA------------------RPEQRRILMMISDGAPVDDSTLSVNPGNYLERHLRAVIEEIETRSP 563 (606) T ss_pred CCCHHHHHHHHHHHHC------------------CHHHCEEEEEEECCCCCCCCCCCCCCCHHHHHHHHHHHHHHCCCCC T ss_conf 5646799988666414------------------7204758777627888664523547850678999999986237787 Q ss_pred CEEEEEEECCCCC Q ss_conf 8799999548977 Q gi|254781110|r 362 IKIVTISINASPN 374 (420) Q Consensus 362 i~i~tIgf~~~~~ 374 (420) |++..||+|-+.+ T Consensus 564 VELlAIGIGHDVT 576 (606) T TIGR01651 564 VELLAIGIGHDVT 576 (606) T ss_pred CEEEEECCCCCCC T ss_conf 0002323443422 No 66 >PRK05325 hypothetical protein; Provisional Probab=59.60 E-value=11 Score=16.48 Aligned_cols=109 Identities=14% Similarity=0.127 Sum_probs=63.1 Q ss_pred HCCCCCCCCHHHHHHHHHHHHCCCCCCCCCCCCCCCCCCCCCCCCCCEEEEEECCCCCCCCCCCHHHHHHHHHHHCCCEE Q ss_conf 04678876538899999986131012334554334776778776662489960686777776534899999999789879 Q gi|254781110|r 285 DSLILKPTDSTPAMKQAYQILTSDKKRSFFTNFFRQGVKIPSLPFQKFIIFLTDGENNNFKSNVNTIKICDKAKENFIKI 364 (420) Q Consensus 285 ~~~~~g~T~~~~gl~~~~~~L~~~~~~~~~~~~~~~~~~~~~~~~~k~iil~TDG~~~~~~~~~~~~~~c~~~k~~gi~i 364 (420) ....+|||-+..|+..+...+.... +....+-|..-.|||+|.......+...+.+.+-.. ... 
T Consensus 294 ~~~esGGT~vSSa~~l~~eII~~rY---------------pp~~WNIY~f~aSDGDNw~~D~~~~~~~L~~~llp~-~~~ 357 (414) T PRK05325 294 YSRESGGTIVSSALKLMLEIIEERY---------------PPAEWNIYAFQASDGDNWSDDSPRCVELLVEELLPV-VNY 357 (414) T ss_pred CCCCCCCEEEEHHHHHHHHHHHHHC---------------CHHHCEEEEEEECCCCCCCCCHHHHHHHHHHHHHHH-HHE T ss_conf 5589898485089999999998548---------------875652788991377675446699999999988887-536 Q ss_pred EEE-EECCCC--CHHHHHHH---HHCCCCC--EEEECCHHHHHHHHHHHHHHH Q ss_conf 999-954897--78999998---6218982--798179899999999999874 Q gi|254781110|r 365 VTI-SINASP--NGQRLLKT---CVSSPEY--HYNVVNADSLIHVFQNISQLM 409 (420) Q Consensus 365 ~tI-gf~~~~--~~~~~l~~---cas~~~~--yf~a~~~~~L~~aF~~Ia~~I 409 (420) |.- -+.... ....+++. ......+ .......++|-.+|+.+-..- T Consensus 358 f~Y~Ei~~~~~~~~~~l~~~y~~~~~~~~~f~~~~I~~~~dI~p~fr~lf~k~ 410 (414) T PRK05325 358 FAYIEITPRAYYRHQTLWREYEKLQDEFDNFAMQHIRDKADIYPVFRELFKKE 410 (414) T ss_pred EEEEEEECCCCCCCHHHHHHHHHHHCCCCCEEEEEECCHHHHHHHHHHHHHHH T ss_conf 89999717988875689999999755488867999488888899999998555 No 67 >PRK10997 yieM hypothetical protein; Provisional Probab=58.10 E-value=12 Score=16.32 Aligned_cols=145 Identities=15% Similarity=0.123 Sum_probs=84.8 Q ss_pred CCCCCCEEEEECCCCCCCCCCCCCCCCCCCCCCCCCCCCCHHHHHHHHHHHHHHHCCCCCCCCCCCCEEEECCCCCCCCC Q ss_conf 43443201221022324223357876666753467665306788999899998510467876554302232048764312 Q gi|254781110|r 187 ERPIFLIELVVDLSGSMHCAMNSDPEDVNSAPICQDKKRTKMAALKNALLLFLDSIDLLSHVKEDVYMGLIGYTTRVEKN 266 (420) Q Consensus 187 ~~~~~~~~~~~d~s~s~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~a~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 266 (420) ..+...+-..+|.||||.-. +-..+|...-.++...... .-.+.++.|++... . 
T Consensus 317 ~~~kGP~IvCVDTSGSM~G~--------------------pE~~AKA~~Lal~r~Al~e-----~R~CyvI~FSte~~-t 370 (484) T PRK10997 317 EQPRGPFIVCVDTSGSMGGF--------------------NEQCAKAFCLALMRIALAE-----NRRCYIMLFSTEVI-T 370 (484) T ss_pred CCCCCCEEEEEECCCCCCCC--------------------HHHHHHHHHHHHHHHHHHC-----CCCEEEEEECCCEE-E T ss_conf 36789979999588888997--------------------6889999999999999962-----89879998126517-8 Q ss_pred CCCCCCHHHHHHHHHHHHHCCCCCCCCHHHHHHHHHHHHCCCCCCCCCCCCCCCCCCCCCCCCCCEEEEEECCCCCCCCC Q ss_conf 44577848899999986404678876538899999986131012334554334776778776662489960686777776 Q gi|254781110|r 267 IEPSWGTEKVRQYVTRDMDSLILKPTDSTPAMKQAYQILTSDKKRSFFTNFFRQGVKIPSLPFQKFIIFLTDGENNNFKS 346 (420) Q Consensus 267 ~~lt~~~~~~~~~i~~~~~~~~~g~T~~~~gl~~~~~~L~~~~~~~~~~~~~~~~~~~~~~~~~k~iil~TDG~~~~~~~ 346 (420) ..++.. ..+...++=+... -.|||.....+..+...+.... +..-=++++||-....-. T Consensus 371 ~eLt~~-~gl~~l~~FL~~s-F~GGTD~~~~L~~~l~~m~~~~------------------y~~ADllvISDFIa~~lp- 429 (484) T PRK10997 371 YELSGP-DGLEQAIRFLSQS-FRGGTDLAPCLRAIIEKMQGRE------------------WFDADAVVISDFIAQRLP- 429 (484) T ss_pred EEECCC-CCHHHHHHHHCCC-CCCCCCHHHHHHHHHHHHHHCC------------------CCCCCEEEECHHCCCCCC- T ss_conf 980487-8879999985288-8898457999999999862324------------------465887997122065699- Q ss_pred CCHHHHHHHH-HHHCCCEEEEEEECCCCCHHHHHH Q ss_conf 5348999999-997898799999548977899999 Q gi|254781110|r 347 NVNTIKICDK-AKENFIKIVTISINASPNGQRLLK 380 (420) Q Consensus 347 ~~~~~~~c~~-~k~~gi~i~tIgf~~~~~~~~~l~ 380 (420) .....-++. -|.++=+.|.|.++.-.+ ..+|+ T Consensus 430 -~~l~~kv~~lqk~~~nrFhav~is~~g~-p~~m~ 462 (484) T PRK10997 430 -DELVAKVKELQRVHQHRFHAVAMSAHGK-PGIMR 462 (484) T ss_pred -HHHHHHHHHHHHHHCCCEEEEECCCCCC-HHHHH T ss_conf -9999999999985068358884012358-57999 No 68 >pfam03850 Tfb4 Transcription factor Tfb4. 
Probab=56.41 E-value=12 Score=16.14 Aligned_cols=73 Identities=12% Similarity=0.119 Sum_probs=47.0 Q ss_pred EEEEEECCCCCCCCCCCHHHHHHHHHHHCCCEEEEEEECCCCCHHHHHHHHHCCCCCEEEECCHHHHHHHHHHHH Q ss_conf 489960686777776534899999999789879999954897789999986218982798179899999999999 Q gi|254781110|r 332 FIIFLTDGENNNFKSNVNTIKICDKAKENFIKIVTISINASPNGQRLLKTCVSSPEYHYNVVNADSLIHVFQNIS 406 (420) Q Consensus 332 ~iil~TDG~~~~~~~~~~~~~~c~~~k~~gi~i~tIgf~~~~~~~~~l~~cas~~~~yf~a~~~~~L~~aF~~Ia 406 (420) .|.+++-..+.. ...-.....-=.+++.+|.|-+..++.+ +..-+.|.|.-++|.|+.++..++|.+.+-.+. T Consensus 144 RILiis~S~d~~-~QYi~~MN~iFaAqk~~I~IDvc~L~~~-~s~fLQQA~diT~G~Yl~~~~~~gLlQyL~~~f 216 (271) T pfam03850 144 RILVLSGSPDSA-SQYIPIMNSIFAAQKLKIPIDVCKLGGE-DSSFLQQAADITGGVYLHVTEPDGLLQYLMTAF 216 (271) T ss_pred EEEEEECCCCCH-HHHHHHHHHHHHHHHCCCEEEEEEECCC-CCHHHHHHHHHHCCEEECCCCCCHHHHHHHHHH T ss_conf 599998788844-7789999999999855974799993699-858999999974977751478333899999996 No 69 >PRK06939 2-amino-3-ketobutyrate coenzyme A ligase; Provisional Probab=55.84 E-value=5.3 Score=18.55 Aligned_cols=60 Identities=18% Similarity=0.259 Sum_probs=46.0 Q ss_pred CHHHHHHHHHHHCCCEEEEEEECCCCCHHHHHHHHHCCCCCEEEECCHHHHHHHHHHHHHHHH Q ss_conf 348999999997898799999548977899999862189827981798999999999998742 Q gi|254781110|r 348 VNTIKICDKAKENFIKIVTISINASPNGQRLLKTCVSSPEYHYNVVNADSLIHVFQNISQLMV 410 (420) Q Consensus 348 ~~~~~~c~~~k~~gi~i~tIgf~~~~~~~~~l~~cas~~~~yf~a~~~~~L~~aF~~Ia~~I~ 410 (420) .....+|+.+.++||-+..|.+..-+.+...++.|.+ ..| .-+.-+.+.++|.+|+++++ T Consensus 336 ~~a~~~~~~L~~~Gi~v~~ir~PtVp~g~~rlRi~lt-a~h--t~~did~lv~~l~~v~~~lG 395 (395) T PRK06939 336 KLAQEFADRLLEEGVYVIGFSFPVVPKGQARIRTQMS-AAH--TKEQLDRAIDAFEKVGKELG 395 (395) T ss_pred HHHHHHHHHHHHCCCEEEEECCCCCCCCCCEEEEEEC-CCC--CHHHHHHHHHHHHHHHHHCC T ss_conf 9999999999977974820789988989856998878-779--99999999999999999639 No 70 >COG1991 Uncharacterized conserved protein [Function unknown] Probab=50.40 E-value=4.5 Score=19.03 Aligned_cols=33 Identities=27% Similarity=0.357 Sum_probs=28.2 Q 
ss_pred HHHHHHHCCCCCHHHHHHHHHHHHHHHHHHHHH Q ss_conf 999886303885699999999999999999999 Q gi|254781110|r 10 YFKKGIASEKANFSIIFALSVMSFLLLIGFLIY 42 (420) Q Consensus 10 ~~~~~~~~~~G~vai~fal~l~~ll~~~g~aVD 42 (420) +..+++.+.||.+.+=|.|+++.++++.+.++- T Consensus 4 ~i~~~~~~nkgQiSLEf~Ll~l~ivla~~i~~~ 36 (131) T COG1991 4 YITKIILSNKGQISLEFSLLLLAIVLAASIAGA 36 (131) T ss_pred EEEEEEECCCCCEEEEHHHHHHHHHHHHHHEEE T ss_conf 662542246662564138999999997312114 No 71 >PRK06007 fliF flagellar MS-ring protein; Reviewed Probab=49.68 E-value=16 Score=15.45 Aligned_cols=39 Identities=15% Similarity=0.141 Sum_probs=29.4 Q ss_pred CHHHHHHHHHHHHHHCCCCCHHHHHHHHHHHHHHHHHHH Q ss_conf 878999999998863038856999999999999999999 Q gi|254781110|r 2 HLLSRFRFYFKKGIASEKANFSIIFALSVMSFLLLIGFL 40 (420) Q Consensus 2 ~~~~~~~~~~~~~~~~~~G~vai~fal~l~~ll~~~g~a 40 (420) .++.+|+.++++|-+.||-.+.+.+++++..+++++-++ T Consensus 7 ~~~~~~~~~~~~l~~~qki~l~~~~~~~i~~~~~l~~~~ 45 (540) T PRK06007 7 ELMEKLLEFLKKLSKLRKIALIGAAAAVIAAIVALVLWA 45 (540) T ss_pred HHHHHHHHHHHCCCHHHHHHHHHHHHHHHHHHHHHHHHH T ss_conf 999999999970698889999999999999999999841 No 72 >PRK12938 acetyacetyl-CoA reductase; Provisional Probab=49.24 E-value=16 Score=15.41 Aligned_cols=23 Identities=13% Similarity=0.167 Sum_probs=16.2 Q ss_pred HHHHHHHHHHCCCEEEEEEECCC Q ss_conf 89999999978987999995489 Q gi|254781110|r 350 TIKICDKAKENFIKIVTISINAS 372 (420) Q Consensus 350 ~~~~c~~~k~~gi~i~tIgf~~~ 372 (420) ++.++..+-..||+|.+|.=|.= T Consensus 164 tk~lA~Ela~~gIrVN~VaPG~i 186 (246) T PRK12938 164 TMSLAQEVATKGVTVNTVSPGYI 186 (246) T ss_pred HHHHHHHHHHHCEEEEEEEECCC T ss_conf 99999996043989999966879 No 73 >COG5151 SSL1 RNA polymerase II transcription initiation/nucleotide excision repair factor TFIIH, subunit SSL1 [Transcription / DNA replication, recombination, and repair] Probab=47.30 E-value=17 Score=15.22 Aligned_cols=173 Identities=12% Similarity=0.077 Sum_probs=99.3 Q ss_pred CEEEEECCCCCCCCCCCCCCCCCCCCCCCCCCCCCHHHHHHHHHHHHHHHCCCCCCCCCCCCEEEECC-CCCCCCCCCCC Q ss_conf 
20122102232422335787666675346766530678899989999851046787655430223204-87643124457 Q gi|254781110|r 192 LIELVVDLSGSMHCAMNSDPEDVNSAPICQDKKRTKMAALKNALLLFLDSIDLLSHVKEDVYMGLIGY-TTRVEKNIEPS 270 (420) Q Consensus 192 ~~~~~~d~s~s~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~a~~~~~~~~~~~~~~~~~~~~~~~~~-~~~~~~~~~lt 270 (420) -+..++|+|.+|..... ..+|.+....+...|+-.+-.+.|..+ ++.+.- +..+.....+. T Consensus 89 hl~l~lD~Seam~e~Df---------------~p~r~a~vikya~~Fv~eFf~qNPiSq---lsii~irdg~a~~~s~~~ 150 (421) T COG5151 89 HLHLILDVSEAMDESDF---------------LPTRRANVIKYAEGFVPEFFSQNPISQ---LSIISIRDGCAKYTSSMD 150 (421) T ss_pred EEEEEEEHHHHHHHHHC---------------CCHHHHHHHHHHHHHHHHHHCCCCCHH---EEEEEHHHHHHHHHHHCC T ss_conf 05788873544433203---------------605888899999877688752597003---244333546888765347 Q ss_pred CCHHHHHHHHHHHHHCCCCCCCCHHHHHHHHHHHHCCCCCCCCCCCCCCCCCCCCCCCCCCEEEEEECCCCCCCCCCCHH Q ss_conf 78488999999864046788765388999999861310123345543347767787766624899606867777765348 Q gi|254781110|r 271 WGTEKVRQYVTRDMDSLILKPTDSTPAMKQAYQILTSDKKRSFFTNFFRQGVKIPSLPFQKFIIFLTDGENNNFKSNVNT 350 (420) Q Consensus 271 ~~~~~~~~~i~~~~~~~~~g~T~~~~gl~~~~~~L~~~~~~~~~~~~~~~~~~~~~~~~~k~iil~TDG~~~~~~~~~~~ 350 (420) -+.......+..+. .+.|+-....|+..+.-.|.+.. ...++.++|++-- ........- T Consensus 151 gnpq~hi~~lkS~r--d~~gnfSLqNaLEmar~~l~~~~----------------~H~trEvLiifgS---~st~DPgdi 209 (421) T COG5151 151 GNPQAHIGQLKSKR--DCSGNFSLQNALEMARIELMKNT----------------MHGTREVLIIFGS---TSTRDPGDI 209 (421) T ss_pred CCHHHHHHHHHCCC--CCCCCHHHHHHHHHHHHHHCCCC----------------CCCCEEEEEEEEE---CCCCCCCCH T ss_conf 99899998750100--46888537768877665403454----------------5562279999842---155897418 Q ss_pred HHHHHHHHHCCCEEEEEEECCCCCHHHHHHHHHCC----CCCEEEECCHHHHHHHHHHH Q ss_conf 99999999789879999954897789999986218----98279817989999999999 Q gi|254781110|r 351 IKICDKAKENFIKIVTISINASPNGQRLLKTCVSS----PEYHYNVVNADSLIHVFQNI 405 (420) Q Consensus 351 ~~~c~~~k~~gi~i~tIgf~~~~~~~~~l~~cas~----~~~yf~a~~~~~L~~aF~~I 405 (420) ...-+++...+|+|..||+-.... ---+.|-.+ ++.||-.-+..-|.+.|... 
T Consensus 210 ~~tid~Lv~~~IrV~~igL~aeva--icKeickaTn~~~e~~y~v~vde~Hl~el~~E~ 266 (421) T COG5151 210 AETIDKLVAYNIRVHFIGLCAEVA--ICKEICKATNSSTEGRYYVPVDEGHLSELMREL 266 (421) T ss_pred HHHHHHHHHHCEEEEEEEEHHHHH--HHHHHHHHCCCCCCCEEEEEECHHHHHHHHHHC T ss_conf 999999875052799975015899--999998614767675067660478899999862 No 74 >TIGR00627 tfb4 transcription factor tfb4; InterPro: IPR004600 Members of this family are part of the TFIIH complex which is involved in the initiation of transcription and nucleotide excision repair. The core-TFIIH basal transcription factor complex has six subunits, this is the p34 subunit.; GO: 0016251 general RNA polymerase II transcription factor activity, 0006281 DNA repair, 0006355 regulation of transcription DNA-dependent, 0000439 core TFIIH complex. Probab=44.84 E-value=19 Score=14.97 Aligned_cols=47 Identities=15% Similarity=0.170 Sum_probs=37.0 Q ss_pred HHHHCCCEEEEEEECCCCCHHHHHHH-HHCCCCCEEEECCHHHHHHHHH Q ss_conf 99978987999995489778999998-6218982798179899999999 Q gi|254781110|r 356 KAKENFIKIVTISINASPNGQRLLKT-CVSSPEYHYNVVNADSLIHVFQ 403 (420) Q Consensus 356 ~~k~~gi~i~tIgf~~~~~~~~~l~~-cas~~~~yf~a~~~~~L~~aF~ 403 (420) .+.+++|.|-++-+|.+-.+- +||+ |=.++|-|-+|+++..|.+.+- T Consensus 185 sA~K~~i~idvv~~~~~~~~~-~LqQAaD~TGG~YL~v~~~~~LL~yL~ 232 (295) T TIGR00627 185 SAQKQNIPIDVVKIGGDFESG-FLQQAADITGGVYLKVEKPKGLLQYLM 232 (295) T ss_pred HHHCCCCCEEEEEECCCCCCH-HHHHHHHHHCCEEEEECCCHHHHHHHH T ss_conf 851698415899808983020-677777663874574278746899999 No 75 >pfam01482 DUF13 DUF13. This domain is found in nematode proteins and is thought to be involved in nematode larval development and have a positive regulation on growth rate. It is currently of unknown function. 
Probab=44.05 E-value=17 Score=15.21 Aligned_cols=47 Identities=13% Similarity=0.218 Sum_probs=32.1 Q ss_pred HCCCEEEEEEECCCCCHHHHHHHHHCCCCCEEEECCHHHHHH-HHHHHH Q ss_conf 789879999954897789999986218982798179899999-999999 Q gi|254781110|r 359 ENFIKIVTISINASPNGQRLLKTCVSSPEYHYNVVNADSLIH-VFQNIS 406 (420) Q Consensus 359 ~~gi~i~tIgf~~~~~~~~~l~~cas~~~~yf~a~~~~~L~~-aF~~Ia 406 (420) +...++.|+|+|-+..++..|+.-.- .-.||-|+-..++++ .|++|+ T Consensus 21 ~~~~t~VTLGIG~Dv~aE~~lk~~~~-~~~FfGADP~~e~N~~LYs~~G 68 (87) T pfam01482 21 QEPLTMVTLGIGHDVKAELKLKKLLP-NIEFFGADPISEINKDLYSKIG 68 (87) T ss_pred CCCCEEEEEECCCCCHHHHHHHHHCC-CCEEECCCCCCCCCHHHHHHCC T ss_conf 78736899843644068899986587-7437657987651166786257 No 76 >pfam04392 ABC_sub_bind ABC transporter substrate binding protein. This family contains many hypothetical proteins and some ABC transporter substrate binding proteins. Probab=43.25 E-value=20 Score=14.82 Aligned_cols=14 Identities=43% Similarity=0.470 Sum_probs=5.5 Q ss_pred HHHCCCEEEEEEEC Q ss_conf 99789879999954 Q gi|254781110|r 357 AKENFIKIVTISIN 370 (420) Q Consensus 357 ~k~~gi~i~tIgf~ 370 (420) +++.|+++..+.+. T Consensus 155 a~~~gi~l~~~~v~ 168 (292) T pfam04392 155 AKKSGIKVVEASVP 168 (292) T ss_pred HHHCCCEEEEEECC T ss_conf 99769989999668 No 77 >pfam04964 Flp_Fap Flp/Fap pilin component. 
Probab=42.26 E-value=21 Score=14.72 Aligned_cols=28 Identities=14% Similarity=0.128 Sum_probs=18.4 Q ss_pred HHHCCCCCHHHHHHHHHHHHHHHHHHHH Q ss_conf 8630388569999999999999999999 Q gi|254781110|r 14 GIASEKANFSIIFALSVMSFLLLIGFLI 41 (420) Q Consensus 14 ~~~~~~G~vai~fal~l~~ll~~~g~aV 41 (420) |+|||+|.-||=.+|+...+-.++=.++ T Consensus 1 F~kde~GaTAIEYgLIaalIav~iI~~~ 28 (47) T pfam04964 1 FLKDESGATAIEYGLIAALIAVVIIAYV 28 (47) T ss_pred CCCCCCCCHHHHHHHHHHHHHHHHHHHH T ss_conf 9656564159999999999999999999 No 78 >PRK12824 acetoacetyl-CoA reductase; Provisional Probab=41.36 E-value=21 Score=14.63 Aligned_cols=24 Identities=13% Similarity=0.039 Sum_probs=16.8 Q ss_pred HHHHHHHHHHCCCEEEEEEECCCC Q ss_conf 899999999789879999954897 Q gi|254781110|r 350 TIKICDKAKENFIKIVTISINASP 373 (420) Q Consensus 350 ~~~~c~~~k~~gi~i~tIgf~~~~ 373 (420) ++.++......||++-+|.-|.=. T Consensus 163 tk~lA~E~a~~gIrvN~I~PG~i~ 186 (245) T PRK12824 163 TKALASEGARYGITVNCIAPGYIA 186 (245) T ss_pred HHHHHHHHHHHCEEEEEEEECCCC T ss_conf 999999972549199999744687 No 79 >TIGR00802 nico transition metal uptake transporter, Ni2+-Co2+ transporter (NiCoT) family; InterPro: IPR004688 This family is found in both Gram-negative and Gram-positive bacteria. The functionally characterised members of the family catalyze uptake of either Ni2+ or Co2+ in a proton motive force-dependent process. Topological analyses with the HoxN Ni2+ transporter of Ralstonia eutropha suggest that it possesses 8 TMSs with its N- and C-termini in the cytoplasm.; GO: 0015099 nickel ion transmembrane transporter activity, 0015675 nickel ion transport, 0016021 integral to membrane. 
Probab=38.86 E-value=23 Score=14.38 Aligned_cols=29 Identities=21% Similarity=0.182 Sum_probs=21.0 Q ss_pred CCCCCHHHHHHHHHHHHHHHHHHHHHHHH Q ss_conf 03885699999999999999999999999 Q gi|254781110|r 17 SEKANFSIIFALSVMSFLLLIGFLIYVLD 45 (420) Q Consensus 17 ~~~G~vai~fal~l~~ll~~~g~aVD~~r 45 (420) ..+|.+.+.-.|++|+|+.+...-||..- T Consensus 181 A~~Gtls~~~~L~lP~LFaAGMaL~DT~D 209 (290) T TIGR00802 181 AARGTLSIAAVLVLPVLFAAGMALVDTLD 209 (290) T ss_pred HHCCCCHHHHHHHHHHHHHHHHHHHHHHH T ss_conf 75588428999987899986778998756 No 80 >pfam03731 Ku_N Ku70/Ku80 N-terminal alpha/beta domain. The Ku heterodimer (composed of Ku70 and Ku80) contributes to genomic integrity through its ability to bind DNA double-strand breaks and facilitate repair by the non-homologous end-joining pathway. This is the amino terminal alpha/beta domain. This domain only makes a small contribution to the dimer interface. The domain comprises a six stranded beta sheet of the Rossman fold. Probab=38.80 E-value=23 Score=14.38 Aligned_cols=186 Identities=11% Similarity=0.100 Sum_probs=86.2 Q ss_pred EEEEECCCCCCCCCCCCCCCCCCCCCCCCCCCCCHHHHHHHHHHHHHHHCCCCCCCCCCCCEEEECCCCCCCC------- Q ss_conf 0122102232422335787666675346766530678899989999851046787655430223204876431------- Q gi|254781110|r 193 IELVVDLSGSMHCAMNSDPEDVNSAPICQDKKRTKMAALKNALLLFLDSIDLLSHVKEDVYMGLIGYTTRVEK------- 265 (420) Q Consensus 193 ~~~~~d~s~s~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~a~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~------- 265 (420) +.|.+|++-+|.....+. ..+.++.+-..+..++..---. .....+|++.|++.... 
T Consensus 2 vvf~ID~s~sM~~~~~~~-------------~~s~~~~al~~i~~~~~~kIis---~~kD~vGvv~~~T~~~~n~~~~~n 65 (222) T pfam03731 2 TVFLIDASPAMFESVKGL-------------EASPFEQALKCIDEILSRKIIS---NDKDLIGVVLYGTDESENSEGFEN 65 (222) T ss_pred EEEEEECCHHHCCCCCCC-------------CCCHHHHHHHHHHHHHHHHEEC---CCCCEEEEEEECCCCCCCCCCCCE T ss_conf 799997998886878899-------------8783999999999999877137---899858899970567777678860 Q ss_pred ---CCCCCC-CHHHHHHHHHHHHHC--------CCCCCCCHHHHHHHHHHHHCCCCCCCCCCCCCCCCCCCCCCCCCCEE Q ss_conf ---244577-848899999986404--------67887653889999998613101233455433477677877666248 Q gi|254781110|r 266 ---NIEPSW-GTEKVRQYVTRDMDS--------LILKPTDSTPAMKQAYQILTSDKKRSFFTNFFRQGVKIPSLPFQKFI 333 (420) Q Consensus 266 ---~~~lt~-~~~~~~~~i~~~~~~--------~~~g~T~~~~gl~~~~~~L~~~~~~~~~~~~~~~~~~~~~~~~~k~i 333 (420) ..+|.. +...++. +..+... ..........++--+..++..... .....+|.| T Consensus 66 i~vl~~l~~p~a~~ik~-L~~~~~~~~~~~~~~~~~~~~~~~~aL~~~~~~~~~~~~--------------~~k~~~krI 130 (222) T pfam03731 66 VTVLRDLDLPGAELLKE-LDQFLEPLADVFGFNGDSSDGDLLSALWVCMDLLQKQTG--------------KKKLSKKRI 130 (222) T ss_pred EEEECCCCCCCHHHHHH-HHHHHHHHHHHHHCCCCCCCCCHHHHHHHHHHHHHHHHC--------------CCCCCCCEE T ss_conf 69850578878899999-999851034444116897766488899999999885103--------------434578679 Q ss_pred EEEECCCCCCCCCCCH----HHHHHHHHHHCCCEEEEEEECCCC--CHHHHHH---HHHCCCC-CEEEECCHHHHHHHHH Q ss_conf 9960686777776534----899999999789879999954897--7899999---8621898-2798179899999999 Q gi|254781110|r 334 IFLTDGENNNFKSNVN----TIKICDKAKENFIKIVTISINASP--NGQRLLK---TCVSSPE-YHYNVVNADSLIHVFQ 403 (420) Q Consensus 334 il~TDG~~~~~~~~~~----~~~~c~~~k~~gi~i~tIgf~~~~--~~~~~l~---~cas~~~-~yf~a~~~~~L~~aF~ 403 (420) +++||..+=....+.. ....++.+.+.||.|-.+.++.+. +-..+.+ .+.+... 
.+......+.|.+..+ T Consensus 131 ~LfTdnD~P~~~~~~~~~~~~~~~a~Dl~d~gi~i~lf~i~~~~~f~~~~FY~dii~~~~~~~~~~~~~~~~~~l~~l~~ 210 (222) T pfam03731 131 LLFTNLDDPFEDDDQLDTIRQKLLAEDLRDEGIEFNLIHLPNSGGFDPNIFYKEIIKLGEDEENEVMLDLEGEKLEDLLS 210 (222) T ss_pred EEECCCCCCCCCCCHHHHHHHHHHHCCHHHCCCEEEEEECCCCCCCCHHHHHHHHCCCCCCCCCCCCCCCCHHHHHHHHH T ss_conf 99899998988751778999999852378749779996149888887778888642677643456677730316999999 Q ss_pred HHHHHH Q ss_conf 999874 Q gi|254781110|r 404 NISQLM 409 (420) Q Consensus 404 ~Ia~~I 409 (420) .|-... T Consensus 211 ~i~~k~ 216 (222) T pfam03731 211 RLRAKQ 216 (222) T ss_pred HHHHCC T ss_conf 997430 No 81 >COG5148 RPN10 26S proteasome regulatory complex, subunit RPN10/PSMD4 [Posttranslational modification, protein turnover, chaperones] Probab=38.79 E-value=23 Score=14.37 Aligned_cols=132 Identities=17% Similarity=0.167 Sum_probs=85.4 Q ss_pred CCCCCCHHHHHHHHHHHHHHH-CCCCCCCCCCCCEEEECCCCCC-CCCCCCCCCHHHHHHHHHHHHHCCCCCCCCHHHHH Q ss_conf 766530678899989999851-0467876554302232048764-31244577848899999986404678876538899 Q gi|254781110|r 221 QDKKRTKMAALKNALLLFLDS-IDLLSHVKEDVYMGLIGYTTRV-EKNIEPSWGTEKVRQYVTRDMDSLILKPTDSTPAM 298 (420) Q Consensus 221 ~~~~~~~~~~~~~a~~~~~~~-~~~~~~~~~~~~~~~~~~~~~~-~~~~~lt~~~~~~~~~i~~~~~~~~~g~T~~~~gl 298 (420) +++.++|+.+-++++...... +++ +....+|+++..... 
.....+|..+.++.+.++-+ .-.|+-.+..++ T Consensus 19 gDy~ptRFeAQkd~ve~if~~K~nd----npEntiGli~~~~a~p~vlsT~T~~~gkilt~lhd~---~~~g~a~~~~~l 91 (243) T COG5148 19 GDYLPTRFEAQKDAVESIFSKKFND----NPENTIGLIPLVQAQPNVLSTPTKQRGKILTFLHDI---RLHGGADIMRCL 91 (243) T ss_pred CCCCCHHHHHHHHHHHHHHHHHHCC----CCCCEEEEEECCCCCCCHHCCCHHHHHHHHHHHCCC---CCCCCCHHHHHH T ss_conf 8977078888788999999877238----963315445635688512113065412888772365---124764088899 Q ss_pred HHHHHHHCCCCCCCCCCCCCCCCCCCCCCC-CCCEEEEEECCCCCCCCCCCHHHHHHHHHHHCCCEEEEEEECCCCCHHH Q ss_conf 999986131012334554334776778776-6624899606867777765348999999997898799999548977899 Q gi|254781110|r 299 KQAYQILTSDKKRSFFTNFFRQGVKIPSLP-FQKFIIFLTDGENNNFKSNVNTIKICDKAKENFIKIVTISINASPNGQR 377 (420) Q Consensus 299 ~~~~~~L~~~~~~~~~~~~~~~~~~~~~~~-~~k~iil~TDG~~~~~~~~~~~~~~c~~~k~~gi~i~tIgf~~~~~~~~ 377 (420) ..+...|..+- ++. ..+++.++. .+-..+. ..+...+..+|++|+-|-.|-||.-.+..- T Consensus 92 qiaql~lkhR~----------------nk~q~qriVaFvg--Spi~ese-deLirlak~lkknnVAidii~fGE~~n~~~ 152 (243) T COG5148 92 QIAQLILKHRD----------------NKGQRQRIVAFVG--SPIQESE-DELIRLAKQLKKNNVAIDIIFFGEAANMAG 152 (243) T ss_pred HHHHHHHHCCC----------------CCCCCEEEEEEEC--CCCCCCH-HHHHHHHHHHHHCCEEEEEEEHHHHHHHHH T ss_conf 99999986014----------------8765058999946--8453267-999999999986683589986034555667 Q ss_pred H Q ss_conf 9 Q gi|254781110|r 378 L 378 (420) Q Consensus 378 ~ 378 (420) + T Consensus 153 l 153 (243) T COG5148 153 L 153 (243) T ss_pred H T ss_conf 8 No 82 >pfam04285 DUF444 Protein of unknown function (DUF444). Bacterial protein of unknown function. One family member is predicted to contain a von Willebrand factor (vWF) type A domain (Smart:VWA). 
Probab=38.02 E-value=24 Score=14.30 Aligned_cols=105 Identities=16% Similarity=0.209 Sum_probs=59.7 Q ss_pred CCCCCCCCHHHHHHHHHHHHCCCCCCCCCCCCCCCCCCCCCCCCCCEEEEEECCCCCCCCCCCHHHHHHHHHHHCCCEEE Q ss_conf 46788765388999999861310123345543347767787766624899606867777765348999999997898799 Q gi|254781110|r 286 SLILKPTDSTPAMKQAYQILTSDKKRSFFTNFFRQGVKIPSLPFQKFIIFLTDGENNNFKSNVNTIKICDKAKENFIKIV 365 (420) Q Consensus 286 ~~~~g~T~~~~gl~~~~~~L~~~~~~~~~~~~~~~~~~~~~~~~~k~iil~TDG~~~~~~~~~~~~~~c~~~k~~gi~i~ 365 (420) ...+|||-+..|+..+...+...+ +....+-|..-.+||+|.......+...+.+.+-.. ...| T Consensus 307 ~~EsGGT~vSSal~l~~~II~~RY---------------pp~~WNiY~f~aSDGDNw~~D~~~c~~lL~~~llp~-~~~f 370 (421) T pfam04285 307 KQESGGTIVSSALELALEIIDERY---------------PPAEWNIYAFQASDGDNWTDDSERCVKLLMNKLMPN-AQYY 370 (421) T ss_pred CCCCCCEEEEHHHHHHHHHHHHHC---------------CHHHCEEEEEEECCCCCCCCCHHHHHHHHHHHHHHH-HHEE T ss_conf 489897587279999999998558---------------864450467980377664346499999999989887-4158 Q ss_pred E-EEECCCCCHHHHH---HHH-HCCCCC-EEEECCHHHHHHHHHHHHH Q ss_conf 9-9954897789999---986-218982-7981798999999999998 Q gi|254781110|r 366 T-ISINASPNGQRLL---KTC-VSSPEY-HYNVVNADSLIHVFQNISQ 407 (420) Q Consensus 366 t-Igf~~~~~~~~~l---~~c-as~~~~-yf~a~~~~~L~~aF~~Ia~ 407 (420) . |-+..... +.++ +.. ...+.. ...+.+.+||-.+|+.+-. T Consensus 371 ~Y~EI~~~~~-~~~~~~y~~~~~~~~nf~~~~I~~k~dIypvfr~lf~ 417 (421) T pfam04285 371 GYVEITQRRS-HSTWRKYEAVKGVKDNFAMYTIREKDDVYPVFRTLFQ 417 (421) T ss_pred EEEEECCCCC-CCHHHHHHHHHCCCCCEEEEEECCHHHHHHHHHHHHH T ss_conf 9999458876-5279999986324897579995888888999999986 No 83 >pfam06707 DUF1194 Protein of unknown function (DUF1194). This family consists of several hypothetical Rhizobiales specific proteins of around 270 residues in length. The function of this family is unknown. 
Probab=37.30 E-value=24 Score=14.23 Aligned_cols=143 Identities=13% Similarity=0.116 Sum_probs=87.6 Q ss_pred CCEEEECCCCCCC--CCCCCC--CCHHHHHHHHHHHHHC--CCCCCCCHHHHHHHHHHHHCCCCCCCCCCCCCCCCCCCC Q ss_conf 3022320487643--124457--7848899999986404--678876538899999986131012334554334776778 Q gi|254781110|r 252 VYMGLIGYTTRVE--KNIEPS--WGTEKVRQYVTRDMDS--LILKPTDSTPAMKQAYQILTSDKKRSFFTNFFRQGVKIP 325 (420) Q Consensus 252 ~~~~~~~~~~~~~--~~~~lt--~~~~~~~~~i~~~~~~--~~~g~T~~~~gl~~~~~~L~~~~~~~~~~~~~~~~~~~~ 325 (420) +.+..+.++.... ...|.| .+.........++... ...+.|.+..+|..+..+|...- T Consensus 49 Iava~~eWsg~~~q~~vv~Wt~I~~~~da~a~A~~i~~~~r~~~~~Taig~Al~~a~~l~~~~~---------------- 112 (206) T pfam06707 49 IAVTYVEWSGPDDQRVVVPWTLIDSAEDAEAFAARLAAAPRRAGRRTAIGGALGFAAALLAQNP---------------- 112 (206) T ss_pred EEEEEEEECCCCCCEEEECCEEECCHHHHHHHHHHHHHCCCCCCCCCHHHHHHHHHHHHHHHCC---------------- T ss_conf 8999998027887448869989589999999999997588788999769999999999998299---------------- Q ss_pred CCCCCCEEEEEECCCCCCCCCCCHHHHHHHHHHHCCCEEEEEEECCCCC-----HHHHHHHHHCC-CC-CEEEECCHHHH Q ss_conf 7766624899606867777765348999999997898799999548977-----89999986218-98-27981798999 Q gi|254781110|r 326 SLPFQKFIIFLTDGENNNFKSNVNTIKICDKAKENFIKIVTISINASPN-----GQRLLKTCVSS-PE-YHYNVVNADSL 398 (420) Q Consensus 326 ~~~~~k~iil~TDG~~~~~~~~~~~~~~c~~~k~~gi~i~tIgf~~~~~-----~~~~l~~cas~-~~-~yf~a~~~~~L 398 (420) ..-.+|+|=+=.||.||.+.... ...-+.+-..||+|--+.++.... 
-....+.|+-+ || +.-.|.+-++- T Consensus 113 ~~~~RrvIDiSGDG~nN~G~~p~--~~ard~~~~~GitINgL~I~~~~~~~~~~L~~yy~~~VIgGpgAFV~~a~~~~df 190 (206) T pfam06707 113 YECLRRVIDVSGDGPNNQGFPPV--TAARDAAVAAGVTINGLAIMGAEAPTSDDLDAYYRDCVIGGPGAFVEPANGFEDF 190 (206) T ss_pred CCCCEEEEEEECCCCCCCCCCCH--HHHHHHHHHCCEEEEEEEECCCCCCCCHHHHHHHHHCCCCCCCCEEEECCCHHHH T ss_conf 87617999960799888999813--7898767775928966777478987623699999732023898449973887999 Q ss_pred HHHH-HHHHHHHHCC Q ss_conf 9999-9999874202 Q gi|254781110|r 399 IHVF-QNISQLMVHR 412 (420) Q Consensus 399 ~~aF-~~Ia~~I~~l 412 (420) .+++ +++-.||.-+ T Consensus 191 ~~AirrKL~rEIag~ 205 (206) T pfam06707 191 AEAIRRKLVREIAGL 205 (206) T ss_pred HHHHHHHHHHHHHCC T ss_conf 999999999987326 No 84 >pfam11443 DUF2828 Domain of unknown function (DUF2828). This is a uncharacterized domain found in eukaryotes and viruses. Probab=34.86 E-value=27 Score=13.98 Aligned_cols=141 Identities=16% Similarity=0.101 Sum_probs=82.3 Q ss_pred CCEEEEECCCCCCCCCCCCCCCCCCCCCCCCCCCCCHHHHHHHHHHHHHHHCCCCCCCCCCCCEEEECCCCCCCCCCCCC Q ss_conf 32012210223242233578766667534676653067889998999985104678765543022320487643124457 Q gi|254781110|r 191 FLIELVVDLSGSMHCAMNSDPEDVNSAPICQDKKRTKMAALKNALLLFLDSIDLLSHVKEDVYMGLIGYTTRVEKNIEPS 270 (420) Q Consensus 191 ~~~~~~~d~s~s~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~a~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~lt 270 (420) ..+.-++|+||||..... ..+.++.+-. +.-++.... ...++....+|+.....+. +. 
T Consensus 327 ~n~iav~DvSGSM~g~~~---------------~~~p~~vai~-Lgl~ise~~-----~~~fk~~~iTFs~~P~~~~-l~ 384 (524) T pfam11443 327 TNCIAVCDVSGSMSGPVF---------------SITPMDVCIA-LGLLVSELS-----EGPFKGKVITFSSNPQLHH-IK 384 (524) T ss_pred CCEEEEEECCCCCCCCCC---------------CCCHHHHHHH-HHHHHHHHC-----CCCCCCCEEEECCCCEEEE-CC T ss_conf 544899956877778888---------------8874999999-999999853-----5000581898449975897-07 Q ss_pred CCHHHHHHHHHHHHHCCCCCCCCHHHHHHHHHHHHCCCCCCCCCCCCCCCCCCCCCCCCCCEEEEEECCCCCCCC----- Q ss_conf 784889999998640467887653889999998613101233455433477677877666248996068677777----- Q gi|254781110|r 271 WGTEKVRQYVTRDMDSLILKPTDSTPAMKQAYQILTSDKKRSFFTNFFRQGVKIPSLPFQKFIIFLTDGENNNFK----- 345 (420) Q Consensus 271 ~~~~~~~~~i~~~~~~~~~g~T~~~~gl~~~~~~L~~~~~~~~~~~~~~~~~~~~~~~~~k~iil~TDG~~~~~~----- 345 (420) . ..+...++.+....=+++|+....+..=......+. -+..+..|.++++||=+.+... T Consensus 385 g--~~l~ekv~~~~~~~wg~nTnf~~vf~lIL~~av~~~--------------l~~eempk~l~VfSDMqFD~a~~~~~~ 448 (524) T pfam11443 385 G--DSLREKVSFVRRMPWGMSTNFQKVFDLILETAVENK--------------LPQEDMPKRLFVFSDMEFDQASTGTSG 448 (524) T ss_pred C--CCHHHHHHHHHHCCCCCCCHHHHHHHHHHHHHHHCC--------------CCHHHCCCEEEEEECCCHHHHCCCCCC T ss_conf 9--889999999995867635339999999999999869--------------997886773899845423120379987 Q ss_pred CCCHHHHHHHHHHHCCCEEEEEEE Q ss_conf 653489999999978987999995 Q gi|254781110|r 346 SNVNTIKICDKAKENFIKIVTISI 369 (420) Q Consensus 346 ~~~~~~~~c~~~k~~gi~i~tIgf 369 (420) +....+.+++..++.|-++=.|-| T Consensus 449 ~~t~~e~i~~~f~~aGY~~P~IVF 472 (524) T pfam11443 449 WETDYEAIQRKFKEAGYEVPELVF 472 (524) T ss_pred CCCHHHHHHHHHHHCCCCCCEEEE T ss_conf 623899999999983999983889 No 85 >pfam05814 DUF843 Baculovirus protein of unknown function (DUF843). This family consists of several Baculovirus proteins of around 85 residues long with no known function. 
Probab=34.26 E-value=27 Score=13.91 Aligned_cols=45 Identities=18% Similarity=0.104 Sum_probs=26.2 Q ss_pred HHHCCCCCHHHHHHHHHHHHHHHHHHHH--------------HHHHHHHHHHHHHHHHHH Q ss_conf 8630388569999999999999999999--------------999999999999999999 Q gi|254781110|r 14 GIASEKANFSIIFALSVMSFLLLIGFLI--------------YVLDWHYKKNSMESANNA 59 (420) Q Consensus 14 ~~~~~~G~vai~fal~l~~ll~~~g~aV--------------D~~r~~~~ks~Lq~A~Da 59 (420) |-|.++++-.++|.+++.+++.+. +.| +=+.-...|-+|++|.|| T Consensus 17 ~dk~e~~s~li~~~lllfvlF~~~-l~vyyinteS~~~dL~t~kaKsiKKK~~le~AfDA 75 (83) T pfam05814 17 FDKNEGSSELILTLLVLFVLFFCL-LNVYYINTESTPADLYTEKAKKIKKKQDLEDAFDA 75 (83) T ss_pred HCCCCCHHHHHHHHHHHHHHHHHH-HHHHHCCCCCCHHHCCCHHHHHHHHHHHHHHHHHH T ss_conf 704566278999999999999998-77640389776544014127888989889999999 No 86 >PRK07806 short chain dehydrogenase; Provisional Probab=34.00 E-value=28 Score=13.89 Aligned_cols=18 Identities=17% Similarity=0.080 Sum_probs=10.6 Q ss_pred HHHHHHHHHCCCEEEEEE Q ss_conf 999999997898799999 Q gi|254781110|r 351 IKICDKAKENFIKIVTIS 368 (420) Q Consensus 351 ~~~c~~~k~~gi~i~tIg 368 (420) ..++..+...||++-.|. T Consensus 165 ~~la~ela~~gIrvn~v~ 182 (248) T PRK07806 165 RALRPELAHAGIGFVVVS 182 (248) T ss_pred HHHHHHHHHHCCEEEEEE T ss_conf 999999776598899972 No 87 >TIGR00937 2A51 chromate transporter, chromate ion transporter (CHR) family; InterPro: IPR014047 Members of this family probably act as chromate transporters , , and are found in both bacteria and archaebacteria. The protein reduces chromate accumulation and is essential for chromate resistance. They are composed of one or two copies of this region. The alignment contains two conserved motifs, FGG and PGP.. 
Probab=33.24 E-value=28 Score=13.81 Aligned_cols=61 Identities=15% Similarity=0.149 Sum_probs=48.6 Q ss_pred HHHHHHHHHHHCCCCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHH---HHHHHHHHHHH Q ss_conf 99999998863038856999999999999999999999999999999999---99999999874 Q gi|254781110|r 6 RFRFYFKKGIASEKANFSIIFALSVMSFLLLIGFLIYVLDWHYKKNSMES---ANNAAILAGAS 66 (420) Q Consensus 6 ~~~~~~~~~~~~~~G~vai~fal~l~~ll~~~g~aVD~~r~~~~ks~Lq~---A~DaA~LA~a~ 66 (420) +.-.|.--..+...|.++...|++||.++.++.++.=|.|+...-.-+|+ .+-.++.+... T Consensus 52 q~~~~lG~~~~g~~Ga~~ag~AF~LPs~l~~~~L~~~y~~~~~l~~~~g~~f~G~~~~vi~lia 115 (390) T TIGR00937 52 QVAIYLGYLRGGILGAILAGVAFVLPSFLLVVALAWLYVQYGSLPKAVGAVFYGLKAAVIALIA 115 (390) T ss_pred HHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHCCCCHHHHHHHHHHHHHHHHHHH T ss_conf 9999999998779999999999986999999999999997436436899999889999999999 No 88 >PRK13392 5-aminolevulinate synthase; Provisional Probab=33.09 E-value=22 Score=14.47 Aligned_cols=61 Identities=21% Similarity=0.177 Sum_probs=44.2 Q ss_pred CHHHHHHHHH-HHCCCEEEEEEECCCCCHHHHHHHHHCCCCCEEEECCHHHHHHHHHHHHHHHHC Q ss_conf 3489999999-978987999995489778999998621898279817989999999999987420 Q gi|254781110|r 348 VNTIKICDKA-KENFIKIVTISINASPNGQRLLKTCVSSPEYHYNVVNADSLIHVFQNISQLMVH 411 (420) Q Consensus 348 ~~~~~~c~~~-k~~gi~i~tIgf~~~~~~~~~l~~cas~~~~yf~a~~~~~L~~aF~~Ia~~I~~ 411 (420) .....+++.+ +++||-+..|.+..-+.+...++.|.+. 
.| .-+.-+.|.++|.+|.++++= T Consensus 339 ~~a~~~~~~ll~e~Gi~v~~i~~PtVP~g~~rLRi~lsA-~H--t~~dId~l~~~L~~v~~e~gl 400 (410) T PRK13392 339 EKCKAISDLLMAEHGIYIQPINYPTVPRGTERLRITPSP-LH--TDEDIDALVAALVAIWRELAL 400 (410) T ss_pred HHHHHHHHHHHHCCCEEEEEECCCCCCCCCCEEEEEECC-CC--CHHHHHHHHHHHHHHHHHCCC T ss_conf 999999999987599899867888379997449877586-68--999999999999999998099 No 89 >PRK05557 fabG 3-ketoacyl-(acyl-carrier-protein) reductase; Validated Probab=31.57 E-value=30 Score=13.63 Aligned_cols=23 Identities=9% Similarity=0.030 Sum_probs=15.6 Q ss_pred HHHHHHHHHHCCCEEEEEEECCC Q ss_conf 89999999978987999995489 Q gi|254781110|r 350 TIKICDKAKENFIKIVTISINAS 372 (420) Q Consensus 350 ~~~~c~~~k~~gi~i~tIgf~~~ 372 (420) .+.++......||++-+|.-|.- T Consensus 166 t~~lA~e~~~~gIrvN~V~PG~i 188 (248) T PRK05557 166 TKSLARELASRGITVNAVAPGFI 188 (248) T ss_pred HHHHHHHHHHHCEEEEEEEECCC T ss_conf 99999985331949999974888 No 90 >cd06325 PBP1_ABC_uncharacterized_transporter Type I periplasmic ligand-binding domain of uncharacterized ABC-type transport systems that are predicted to be involved in the uptake of amino acids, peptides, or inorganic ions. This group includes the type I periplasmic ligand-binding domain of uncharacterized ABC (ATPase Binding Cassette)-type transport systems that are predicted to be involved in the uptake of amino acids, peptides, or inorganic ions. This subgroup has high sequence similarity to members of the family of hydrophobic amino acid transporters (HAAT), such as leucine/isoleucine/valine binding protein (LIVBP); its ligand specificity has not been determined experimentally. 
Probab=31.47 E-value=30 Score=13.62 Aligned_cols=14 Identities=21% Similarity=0.366 Sum_probs=5.8 Q ss_pred HHHHHHHCCCEEEE Q ss_conf 99999978987999 Q gi|254781110|r 353 ICDKAKENFIKIVT 366 (420) Q Consensus 353 ~c~~~k~~gi~i~t 366 (420) +.+.+..++|.+|+ T Consensus 203 i~~~a~~~~iPv~~ 216 (281) T cd06325 203 VVKVANEAKIPVIA 216 (281) T ss_pred HHHHHHHCCCCEEE T ss_conf 99999874998893 No 91 >TIGR02134 transald_staph transaldolase; InterPro: IPR011861 This small family of proteins belong to the transaldolases. Coxiella and Staphylococcus lack members of the known transaldolase families and appear to require a transaldolase activity for completion of the pentose phosphate pathway.. Probab=31.02 E-value=31 Score=13.57 Aligned_cols=88 Identities=17% Similarity=0.144 Sum_probs=48.8 Q ss_pred CCCCCCCCEEEEEECCCCCCCCC-----------------CCHHHHHHHHHHHCCCEEEEE---EE---CCC--CCHHHH Q ss_conf 78776662489960686777776-----------------534899999999789879999---95---489--778999 Q gi|254781110|r 324 IPSLPFQKFIIFLTDGENNNFKS-----------------NVNTIKICDKAKENFIKIVTI---SI---NAS--PNGQRL 378 (420) Q Consensus 324 ~~~~~~~k~iil~TDG~~~~~~~-----------------~~~~~~~c~~~k~~gi~i~tI---gf---~~~--~~~~~~ 378 (420) +++.-+-|+.|.-|-|+.+..-- -...+..|+.+-+.=-.|-+| .+ |.| +--+.- T Consensus 83 ~GnNV~vKIPvtntkGesT~PlIqkLSadgi~LNvTA~~TieQv~~v~~~~t~gvP~iVSVFAGRiADtGvDP~p~M~eA 162 (237) T TIGR02134 83 YGNNVYVKIPVTNTKGESTIPLIQKLSADGIKLNVTAVYTIEQVKKVVEALTEGVPAIVSVFAGRIADTGVDPLPLMKEA 162 (237) T ss_pred HCCCEEEEEEEECCCCCCCCHHHHHHHHCCCEEEEEEECCHHHHHHHHHHHHCCCCCEEEEECCEEECCCCCCHHHHHHH T ss_conf 07932788412418895153078664024866754354025889999998745898179872120206899836789999 Q ss_pred HHHHHCCCCCEEEECCHHHHHHHHHH--HHHHHHC Q ss_conf 99862189827981798999999999--9987420 Q gi|254781110|r 379 LKTCVSSPEYHYNVVNADSLIHVFQN--ISQLMVH 411 (420) Q Consensus 379 l~~cas~~~~yf~a~~~~~L~~aF~~--Ia~~I~~ 411 (420) |+.|++-+|----=.|..||=.++|. 
|+-+|++ T Consensus 163 l~i~~qK~gveLLWASpRElfNiiQAd~iG~dIIT 197 (237) T TIGR02134 163 LKIVRQKEGVELLWASPRELFNIIQADRIGVDIIT 197 (237) T ss_pred HHHHCCCCCCHHHCCCCHHHHHHHHHHHCCCEEEE T ss_conf 88651688720002451145667747342840676 No 92 >pfam11812 DUF3333 Domain of unknown function (DUF3333). This family of proteins are functionally uncharacterized. This family is only found in bacteria. This presumed domain is typically between 116 to 159 amino acids in length. Probab=29.75 E-value=19 Score=14.91 Aligned_cols=43 Identities=14% Similarity=0.085 Sum_probs=32.7 Q ss_pred CCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHH Q ss_conf 8569999999999999999999999999999999999999999 Q gi|254781110|r 20 ANFSIIFALSVMSFLLLIGFLIYVLDWHYKKNSMESANNAAIL 62 (420) Q Consensus 20 G~vai~fal~l~~ll~~~g~aVD~~r~~~~ks~Lq~A~DaA~L 62 (420) |-.||.+|++++++|+..-+.=-+.-..++..+|.--.|+..+ T Consensus 19 G~~AI~~al~fL~iLl~sI~s~G~~AF~qT~I~l~V~~d~~~i 61 (155) T pfam11812 19 GLAAILIGLAFLVILLGSIVSNGYGAFQQTEITLEVTLDEEVL 61 (155) T ss_pred HHHHHHHHHHHHHHHHHHHHHHHHHHHHHEEEEEEEEECHHHC T ss_conf 8999999999999999999985499887258888888289991 No 93 >PRK06123 short chain dehydrogenase; Provisional Probab=29.33 E-value=33 Score=13.39 Aligned_cols=22 Identities=14% Similarity=0.226 Sum_probs=15.5 Q ss_pred HHHHHHHHHHCCCEEEEEEECC Q ss_conf 8999999997898799999548 Q gi|254781110|r 350 TIKICDKAKENFIKIVTISINA 371 (420) Q Consensus 350 ~~~~c~~~k~~gi~i~tIgf~~ 371 (420) +..++..+...||++..|.=|. T Consensus 169 tr~lA~ela~~gIrvN~IaPG~ 190 (249) T PRK06123 169 TIGLAKEVAAEGIRVNAVRPGV 190 (249) T ss_pred HHHHHHHHHHCCEEEEEEEECC T ss_conf 9999999865596999998678 No 94 >pfam11411 DNA_ligase_IV DNA ligase IV. DNA ligase IV along with Xrcc4 functions in DNA non-homologous end joining. This process is required to mend double-strand breaks. Upon ligase binding to an Xrcc4 dimer, the helical tails unwind leading to a flat interaction surface. 
Probab=29.01 E-value=33 Score=13.35 Aligned_cols=22 Identities=18% Similarity=0.306 Sum_probs=18.3 Q ss_pred CCCEEEECCHHHHHHHHHHHHH Q ss_conf 9827981798999999999998 Q gi|254781110|r 386 PEYHYNVVNADSLIHVFQNISQ 407 (420) Q Consensus 386 ~~~yf~a~~~~~L~~aF~~Ia~ 407 (420) +..||.-.+..+|.++|..|.+ T Consensus 14 GDSy~~dt~~~qLk~vF~~i~~ 35 (36) T pfam11411 14 GDSYFVDTDEQQLKDVFHRIKK 35 (36) T ss_pred CCCEEECCCHHHHHHHHHHHCC T ss_conf 5400104858999999987504 No 95 >PRK07505 hypothetical protein; Provisional Probab=28.86 E-value=34 Score=13.34 Aligned_cols=59 Identities=12% Similarity=0.091 Sum_probs=41.4 Q ss_pred CHHHHHHHHHHHCCCEEEEEEECCCCCHHHHHHHHHCCCCCEEEECCHHHHHHHHHHHHHHH Q ss_conf 34899999999789879999954897789999986218982798179899999999999874 Q gi|254781110|r 348 VNTIKICDKAKENFIKIVTISINASPNGQRLLKTCVSSPEYHYNVVNADSLIHVFQNISQLM 409 (420) Q Consensus 348 ~~~~~~c~~~k~~gi~i~tIgf~~~~~~~~~l~~cas~~~~yf~a~~~~~L~~aF~~Ia~~I 409 (420) .....+++.+.++|+.+..|.+..-+.++..++.|.+ ..| .-+.-+.|.++++++..+. T Consensus 345 ~~a~~~~~~L~e~Gi~v~~i~~PtVp~g~~rlRi~l~-A~h--T~edId~l~~~L~~v~~e~ 403 (405) T PRK07505 345 DTAIKYAKQLKDAGFYTSPVFFPVVAKGNAGLRIMFR-ADH--TNEEIKRLCSLLKEILADY 403 (405) T ss_pred HHHHHHHHHHHHCCCCEEEEECCCCCCCCCEEEEEEC-CCC--CHHHHHHHHHHHHHHHHHH T ss_conf 9999999999978901875638838979824998748-889--9999999999999999985 No 96 >pfam00429 TLV_coat ENV polyprotein (coat polyprotein). 
Probab=28.23 E-value=34 Score=13.27 Aligned_cols=28 Identities=7% Similarity=0.170 Sum_probs=13.4 Q ss_pred CCCEEEECCHHHHHHHHHHHHHHHHCCE Q ss_conf 9827981798999999999998742025 Q gi|254781110|r 386 PEYHYNVVNADSLIHVFQNISQLMVHRK 413 (420) Q Consensus 386 ~~~yf~a~~~~~L~~aF~~Ia~~I~~lr 413 (420) ++--|.|+...-+.+...++-+.+-.+| T Consensus 480 eeCCfy~~~sgivrd~~~klre~l~~r~ 507 (560) T pfam00429 480 EECCFYADHSGIVRDSIAKLQERLPQRQ 507 (560) T ss_pred CCCEEEECCCCHHHHHHHHHHHHHHHHH T ss_conf 7515874564318899999999999988 No 97 >PRK09730 hypothetical protein; Provisional Probab=28.09 E-value=35 Score=13.25 Aligned_cols=22 Identities=9% Similarity=0.027 Sum_probs=14.5 Q ss_pred HHHHHHHHHHCCCEEEEEEECC Q ss_conf 8999999997898799999548 Q gi|254781110|r 350 TIKICDKAKENFIKIVTISINA 371 (420) Q Consensus 350 ~~~~c~~~k~~gi~i~tIgf~~ 371 (420) +..++..+...||+|-.|.-|. T Consensus 167 tk~lA~ela~~gIrVN~IaPG~ 188 (247) T PRK09730 167 TTGLSLEVAAQGIRVNCVRPGF 188 (247) T ss_pred HHHHHHHHHHCCEEEEEEEECC T ss_conf 9999999705492899997788 No 98 >PRK05958 8-amino-7-oxononanoate synthase; Reviewed Probab=27.68 E-value=35 Score=13.20 Aligned_cols=57 Identities=7% Similarity=0.171 Sum_probs=42.6 Q ss_pred CHHHHHHHHHHHCCCEEEEEEECCCCCHHHHHHHHHCCCCCEEEECCHHHHHHHHHHHHHHHH Q ss_conf 348999999997898799999548977899999862189827981798999999999998742 Q gi|254781110|r 348 VNTIKICDKAKENFIKIVTISINASPNGQRLLKTCVSSPEYHYNVVNADSLIHVFQNISQLMV 410 (420) Q Consensus 348 ~~~~~~c~~~k~~gi~i~tIgf~~~~~~~~~l~~cas~~~~yf~a~~~~~L~~aF~~Ia~~I~ 410 (420) .....+++.++++||.+..|.+..-+.+...++.|.+ ..| +.+||..+.+.+.+-+. T Consensus 329 ~~~~~~~~~L~~~Gi~v~~i~~PtVp~~~~rlRi~i~-a~h-----t~~dId~l~~~l~e~~~ 385 (387) T PRK05958 329 ERALALAAALQAQGFWVGAIRPPTVPAGTSRLRITLT-AAH-----TEADIDRLLEALAEALA 385 (387) T ss_pred HHHHHHHHHHHHCCCEEEEECCCCCCCCCCEEEEEEC-CCC-----CHHHHHHHHHHHHHHHH T ss_conf 9999999999977952831789988999844999978-779-----99999999999999995 No 99 >pfam07423 DUF1510 Protein of unknown function (DUF1510). 
This family consists of several hypothetical bacterial proteins of around 200 residues in length. The function of this family is unknown. Probab=26.71 E-value=37 Score=13.09 Aligned_cols=31 Identities=13% Similarity=0.193 Sum_probs=13.9 Q ss_pred HHHHHHHHHCCCCCHHHHHHHHHHHHHHHHH Q ss_conf 9999988630388569999999999999999 Q gi|254781110|r 8 RFYFKKGIASEKANFSIIFALSVMSFLLLIG 38 (420) Q Consensus 8 ~~~~~~~~~~~~G~vai~fal~l~~ll~~~g 38 (420) ||..|.-.|.+.+-+-|.++|+++.++++++ T Consensus 2 Rf~~r~k~Rk~n~vLNiaI~iV~llIiiva~ 32 (214) T pfam07423 2 RFEQRQKRRKINRVLNIAIGIVVVLIIIVAY 32 (214) T ss_pred CHHHHHHHHHHHHHHHHHHHHHHHHHHHHHH T ss_conf 1677777764534557999999999999766 No 100 >COG1681 FlaB Archaeal flagellins [Cell motility and secretion] Probab=25.24 E-value=39 Score=12.92 Aligned_cols=22 Identities=9% Similarity=0.191 Sum_probs=16.6 Q ss_pred CCCCCHHHHHHHHHHHHHHHHH Q ss_conf 0388569999999999999999 Q gi|254781110|r 17 SEKANFSIIFALSVMSFLLLIG 38 (420) Q Consensus 17 ~~~G~vai~fal~l~~ll~~~g 38 (420) +|||.+.|=.++.|+.|++++. T Consensus 1 ~rrG~~GIgtlIVfIAmVlVAA 22 (209) T COG1681 1 DRRGATGIGTLIVFIAMVLVAA 22 (209) T ss_pred CCCCCCCHHHHHHHHHHHHHHH T ss_conf 9841104328999999999999 No 101 >PRK10506 hypothetical protein; Provisional Probab=24.86 E-value=39 Score=12.88 Aligned_cols=45 Identities=7% Similarity=0.149 Sum_probs=34.5 Q ss_pred HHCCCCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHH Q ss_conf 630388569999999999999999999999999999999999999 Q gi|254781110|r 15 IASEKANFSIIFALSVMSFLLLIGFLIYVLDWHYKKNSMESANNA 59 (420) Q Consensus 15 ~~~~~G~vai~fal~l~~ll~~~g~aVD~~r~~~~ks~Lq~A~Da 59 (420) ++.|||=-.|=..+.+.++-.+..+|+---+.++.+.+|++++.. 
T Consensus 1 ~~~q~GFTLiEllvvi~ii~il~~~a~p~~~~~~q~~~L~~~a~~ 45 (155) T PRK10506 1 MKKQRGYTLIETLVAMTLVVILSAWGLYGWQYWQQQQRLWQTAQQ 45 (155) T ss_pred CCCCCCEEHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHH T ss_conf 985566279999999999999998877779999999999999999 No 102 >PRK08063 enoyl-(acyl carrier protein) reductase; Provisional Probab=24.41 E-value=40 Score=12.82 Aligned_cols=24 Identities=17% Similarity=0.004 Sum_probs=17.4 Q ss_pred HHHHHHHHHHCCCEEEEEEECCCC Q ss_conf 899999999789879999954897 Q gi|254781110|r 350 TIKICDKAKENFIKIVTISINASP 373 (420) Q Consensus 350 ~~~~c~~~k~~gi~i~tIgf~~~~ 373 (420) ++.++......||+|-.|.-|.-. T Consensus 165 tk~lA~ela~~gIrVNaI~PG~i~ 188 (250) T PRK08063 165 TRYLAVELAPKGIAVNAVSGGAVD 188 (250) T ss_pred HHHHHHHHHHHCCEEEEEECCCCC T ss_conf 999999972539289998608798 No 103 >KOG1226 consensus Probab=24.27 E-value=40 Score=12.80 Aligned_cols=60 Identities=15% Similarity=0.108 Sum_probs=35.3 Q ss_pred CCCCCCCCCCHHHHHHHHHHHHHCCCCCCCCHHHH----HHHHHH-HHCCCCCCCCCCCCCCCCCCCCCCCCCCEEEEEE Q ss_conf 43124457784889999998640467887653889----999998-6131012334554334776778776662489960 Q gi|254781110|r 263 VEKNIEPSWGTEKVRQYVTRDMDSLILKPTDSTPA----MKQAYQ-ILTSDKKRSFFTNFFRQGVKIPSLPFQKFIIFLT 337 (420) Q Consensus 263 ~~~~~~lt~~~~~~~~~i~~~~~~~~~g~T~~~~g----l~~~~~-~L~~~~~~~~~~~~~~~~~~~~~~~~~k~iil~T 337 (420) ...+.+||.+...+.+.+.+.. ..|+-..++| |+.+.. .=.-+| .+..++.+||+| T Consensus 209 fkhvLsLT~~~~~F~~~V~~q~---ISgNlDaPEGGfDAimQaavC~~~IGW----------------R~~a~~LLVF~t 269 (783) T KOG1226 209 FKHVLSLTNDAEEFNEEVGKQR---ISGNLDAPEGGFDAIMQAAVCTEKIGW----------------RNDATRLLVFST 269 (783) T ss_pred CCEEEECCCCHHHHHHHHHHCE---ECCCCCCCCCHHHHHHHHHHCCCCCCC----------------CCCCEEEEEEEC T ss_conf 4002106887699999875435---326889898229888766414655220----------------126516899970 Q ss_pred CCCC Q ss_conf 6867 Q gi|254781110|r 338 DGEN 341 (420) Q Consensus 338 DG~~ 341 (420) |... T Consensus 270 d~~~ 273 (783) T KOG1226 270 DAGF 273 (783) T ss_pred CCCE T ss_conf 7513 No 104 >pfam06508 ExsB ExsB. 
This family includes putative transcriptional regulators from Bacteria and Archaea. Probab=23.39 E-value=42 Score=12.70 Aligned_cols=15 Identities=13% Similarity=-0.028 Sum_probs=7.1 Q ss_pred HHHHHCCCEEEEEEE Q ss_conf 999978987999995 Q gi|254781110|r 355 DKAKENFIKIVTISI 369 (420) Q Consensus 355 ~~~k~~gi~i~tIgf 369 (420) ..+...|+.--.+|+ T Consensus 106 a~A~~~g~~~I~~G~ 120 (137) T pfam06508 106 SYAEAIGANDIFIGV 120 (137) T ss_pred HHHHHCCCCEEEEEE T ss_conf 999986999799956 No 105 >pfam09001 DUF1890 Domain of unknown function (DUF1890). This domain is found in a set of hypothetical archaeal proteins. Probab=23.19 E-value=42 Score=12.67 Aligned_cols=25 Identities=12% Similarity=0.024 Sum_probs=14.5 Q ss_pred HCCCEEEEEEECCCCCHHHHHHHHHCCC Q ss_conf 7898799999548977899999862189 Q gi|254781110|r 359 ENFIKIVTISINASPNGQRLLKTCVSSP 386 (420) Q Consensus 359 ~~gi~i~tIgf~~~~ 386 (420) -.+...+.|-||-+ .+.|..|...+ T Consensus 92 i~~~~~~aiVFg~~---~e~la~~i~~~ 116 (138) T pfam09001 92 VSNAKTYAIVFGEH---AEELAETIEFD 116 (138) T ss_pred HCCCCEEEEEECCC---HHHHHHHHCCC T ss_conf 54786589993588---79999986489 No 106 >cd01985 ETF The electron transfer flavoprotein (ETF) serves as a specific electron acceptor for various mitochondrial dehydrogenases. ETF transfers electrons to the main respiratory chain via ETF-ubiquinone oxidoreductase. ETF is a heterodimer that consists of an alpha and a beta subunit and binds one molecule of FAD per dimer. A similar system also exists in some bacteria. The homologous pair of proteins (FixA/FixB) is essential for nitrogen fixation. The alpha subunit of ETF is structurally related to the bacterial nitrogen fixation protein fixB, which could play a role in a redox process and feed electrons to ferredoxin. The beta subunit protein is distantly related to and forms a heterodimer with the alpha subunit.
Probab=22.94 E-value=43 Score=12.64 Aligned_cols=24 Identities=13% Similarity=0.148 Sum_probs=9.7 Q ss_pred HHHHHHHHHHHCCCEEEEEEECCC Q ss_conf 489999999978987999995489 Q gi|254781110|r 349 NTIKICDKAKENFIKIVTISINAS 372 (420) Q Consensus 349 ~~~~~c~~~k~~gi~i~tIgf~~~ 372 (420) .....+..+|+.|-++..+.+|.+ T Consensus 23 ~Ale~A~~lke~~~~v~~v~~G~~ 46 (181) T cd01985 23 EAVEAALRLKEYGGEVTALVIGPP 46 (181) T ss_pred HHHHHHHHHHHCCCCEEEEEECCC T ss_conf 999999986444996899997881 No 107 >cd03132 GATase1_catalase Type 1 glutamine amidotransferase (GATase1)-like domain found at the C-terminus of several large catalases. Catalase catalyzes the dismutation of hydrogen peroxide (H2O2) to water and oxygen. This group includes the large catalases: Neurospora crassa Catalase-1 and Catalase-3, and Escherichia coli HP-II. This GATase1-like domain has an essential role in HP-II catalase activity. However, it lacks enzymatic activity and the catalytic triad typical of GATase1 domains. Catalase-1 and -3 are homotetrameric, HP-II is homohexameric. It has been proposed that this domain may facilitate the folding and oligomerization process. The interface between this GATase1-like domain of HP-II and the core of the subunit forms part of a channel which provides access to the deeply buried catalase active sites of HP-II.
Catalase-1 is associated with non-growing cells; C Probab=22.89 E-value=43 Score=12.63 Aligned_cols=18 Identities=11% Similarity=0.080 Sum_probs=6.3 Q ss_pred HHHHHHHHHHCCCEEEEE Q ss_conf 899999999789879999 Q gi|254781110|r 350 TIKICDKAKENFIKIVTI 367 (420) Q Consensus 350 ~~~~c~~~k~~gi~i~tI 367 (420) ....-..+=.++-.|..| T Consensus 84 ~~~fv~eay~h~KpI~a~ 101 (142) T cd03132 84 ALHFVTEAFKHGKPIGAV 101 (142) T ss_pred HHHHHHHHHHCCCEEEEE T ss_conf 999999999769979993 No 108 >PRK07576 short chain dehydrogenase; Provisional Probab=22.53 E-value=44 Score=12.59 Aligned_cols=22 Identities=14% Similarity=0.058 Sum_probs=15.4 Q ss_pred HHHHHHHHHHCCCEEEEEEECC Q ss_conf 8999999997898799999548 Q gi|254781110|r 350 TIKICDKAKENFIKIVTISINA 371 (420) Q Consensus 350 ~~~~c~~~k~~gi~i~tIgf~~ 371 (420) ...++..+-.+||+|-+|.-|. T Consensus 167 tk~lA~e~a~~gIrVN~IaPG~ 188 (260) T PRK07576 167 TRTLALEWGPEGVRVNSISPGP 188 (260) T ss_pred HHHHHHHHHHCCEEEEEEEECC T ss_conf 9999999713392999983477 No 109 >COG1842 PspA Phage shock protein A (IM30), suppresses sigma54-dependent transcription [Transcription / Signal transduction mechanisms] Probab=21.75 E-value=45 Score=12.49 Aligned_cols=17 Identities=29% Similarity=0.432 Sum_probs=14.3 Q ss_pred CCHHHHHHHHHHHHHHC Q ss_conf 98789999999988630 Q gi|254781110|r 1 MHLLSRFRFYFKKGIAS 17 (420) Q Consensus 1 ~~~~~~~~~~~~~~~~~ 17 (420) |++|+||.++++.++.+ T Consensus 1 M~i~~r~~~~~~a~~~~ 17 (225) T COG1842 1 MGIFSRLKDLVKANINE 17 (225) T ss_pred CCHHHHHHHHHHHHHHH T ss_conf 95689999999988878 No 110 >pfam04917 Shufflon_N Bacterial shufflon protein, N-terminal constant region. This family represents the high-similarity N-terminal 'constant region' shared by shufflon proteins. 
Probab=20.90 E-value=47 Score=12.37 Aligned_cols=45 Identities=7% Similarity=0.020 Sum_probs=28.6 Q ss_pred HHCCCCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHH Q ss_conf 630388569999999999999999999999999999999999999 Q gi|254781110|r 15 IASEKANFSIIFALSVMSFLLLIGFLIYVLDWHYKKNSMESANNA 59 (420) Q Consensus 15 ~~~~~G~vai~fal~l~~ll~~~g~aVD~~r~~~~ks~Lq~A~Da 59 (420) .|..||-..|=..+.|.++++++.+.+.+..-++...+-|.+... T Consensus 2 r~~~kGf~LlE~~~~L~I~~~~~~~~~~~~~~~~~~~~~q~aA~q 46 (356) T pfam04917 2 KKTDKGVSLLEVGAVLLIVVMVIPKVAENIEDYLNNVRWQNAAEH 46 (356) T ss_pred CEECCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHH T ss_conf 120344308999999999999999999999989999999999999 No 111 >pfam10526 NADH_ub_rd_NUML NADH-ubiquinone reductase complex 1 MLRQ subunit. This subunit appears to be a recent vertebrate addition to the NADH-ubiquinone reductase complex 1, acting within the membrane. Its exact function is not known, but it is highly expressed in muscle and neural tissue, indicative of a role in ATP generation. Probab=20.14 E-value=49 Score=12.27 Aligned_cols=23 Identities=9% Similarity=0.075 Sum_probs=15.2 Q ss_pred HHHHHHHHHHHHHHHHHHHHHHH Q ss_conf 99999999999999999999999 Q gi|254781110|r 29 SVMSFLLLIGFLIYVLDWHYKKN 51 (420) Q Consensus 29 ~l~~ll~~~g~aVD~~r~~~~ks 51 (420) .|+||++++|++.-.+-.+..|. T Consensus 12 ~LIPLfv~ig~g~~gA~~y~~rl 34 (80) T pfam10526 12 ALIPLFVFIGAGATGATLYLLRL 34 (80) T ss_pred CHHHHHHHHHCCHHHHHHHHHHH T ss_conf 13248999941388999999999 No 112 >PRK12825 fabG 3-ketoacyl-(acyl-carrier-protein) reductase; Provisional Probab=20.00 E-value=49 Score=12.25 Aligned_cols=24 Identities=17% Similarity=0.160 Sum_probs=17.4 Q ss_pred HHHHHHHHHHCCCEEEEEEECCCC Q ss_conf 899999999789879999954897 Q gi|254781110|r 350 TIKICDKAKENFIKIVTISINASP 373 (420) Q Consensus 350 ~~~~c~~~k~~gi~i~tIgf~~~~ 373 (420) ...++......||++.+|.=|.-.
T Consensus 168 ~~~la~e~~~~gIrvN~I~PG~v~ 191 (250) T PRK12825 168 TKALARELAERGIRVNAVAPGAID 191 (250) T ss_pred HHHHHHHHHHHCEEEEEEEECCCC T ss_conf 999999860429299999728887 Done!