Query gi|254780833|ref|YP_003065246.1| hypothetical protein CLIBASIA_03630 [Candidatus Liberibacter asiaticus str. psy62]
Match_columns 371
No_of_seqs 156 out of 2153
Neff 10.1
Searched_HMMs 39220
Date Sun May 29 23:34:38 2011
Command /home/congqian_1/programs/hhpred/hhsearch -i 254780833.hhm -d /home/congqian_1/database/cdd/Cdd.hhm

No Hit Prob E-value P-value Score SS Cols Query HMM Template HMM
1 PRK13685 hypothetical protein; 100.0 6.4E-27 1.6E-31 180.9 21.1 191 168-368 88-296 (326)
2 cd01465 vWA_subgroup VWA subgr 100.0 1.3E-26 3.4E-31 178.9 18.0 166 169-355 1-170 (170)
3 cd01467 vWA_BatA_type VWA BatA 99.9 1E-25 2.7E-30 173.6 17.5 163 168-351 2-180 (180)
4 cd01461 vWA_interalpha_trypsin 99.9 7.2E-23 1.8E-27 156.4 17.7 162 167-355 1-169 (171)
5 cd01466 vWA_C3HC4_type VWA C3H 99.9 1.2E-22 3E-27 155.2 15.7 149 169-346 1-155 (155)
6 cd01456 vWA_ywmD_type VWA ywmD 99.9 3.1E-21 7.9E-26 146.5 14.9 171 160-349 12-204 (206)
7 cd01463 vWA_VGCC_like VWA Volt 99.9 9.1E-21 2.3E-25 143.7 16.0 166 165-348 10-189 (190)
8 cd01470 vWA_complement_factors 99.9 2.4E-20 6.1E-25 141.2 17.6 176 169-356 1-198 (198)
9 cd01475 vWA_Matrilin VWA_Matri 99.9 5E-20 1.3E-24 139.3 18.4 178 167-363 1-184 (224)
10 cd01480 vWA_collagen_alpha_1-V 99.9 8.6E-20 2.2E-24 137.8 17.7 171 167-355 1-179 (186)
11 cd01464 vWA_subfamily VWA subf 99.9 3E-20 7.6E-25 140.6 15.2 173 168-356 3-175 (176)
12 cd01474 vWA_ATR ATR (Anthrax T 99.9 2E-19 5E-24 135.7 18.2 179 165-365 1-184 (185)
13 cd01451 vWA_Magnesium_chelatas 99.8 1.5E-18 3.8E-23 130.3 17.6 165 171-354 3-175 (178)
14 pfam00092 VWA von Willebrand f 99.8 2.5E-17 6.3E-22 123.0 18.4 169 170-358 1-177 (177)
15 smart00327 VWA von Willebrand 99.8 3.6E-17 9.1E-22 122.0 18.5 158 168-344 1-164 (177)
16 cd01469 vWA_integrins_alpha_su 99.8 1.7E-17 4.2E-22 124.0 16.0 166 169-353 1-176 (177)
17 cd01450 vWFA_subfamily_ECM Von 99.8 2E-17 5.2E-22 123.5 16.0 150 169-339 1-154 (161)
18 cd01473 vWA_CTRP CTRP for CS 99.8 7.9E-17 2E-21 119.9 18.4 178 169-364 1-192 (192)
19 cd01471 vWA_micronemal_protein 99.8 5.7E-17 1.4E-21 120.8 17.5 169 169-356 1-183 (186)
20 TIGR03436 acidobact_VWFA VWFA- 99.8 1.8E-16 4.5E-21 117.8 19.8 181 164-367 49-259 (296)
21 cd01472 vWA_collagen von Wille 99.8 3.6E-17 9.2E-22 122.0 16.1 156 170-347 2-163 (164)
22 cd01481 vWA_collagen_alpha3-VI 99.7 4.2E-16 1.1E-20 115.6 16.6 158 170-347 2-164 (165)
23 TIGR00868 hCaCC calcium-activa 99.7 7.1E-17 1.8E-21 120.2 11.5 171 171-367 310-491 (874)
24 cd01482 vWA_collagen_alphaI-XI 99.7 8E-16 2E-20 113.9 16.2 156 170-347 2-163 (164)
25 cd00198 vWFA Von Willebrand fa 99.7 4.4E-16 1.1E-20 115.4 14.7 150 169-339 1-154 (161)
26 cd01476 VWA_integrin_invertebr 99.7 2.5E-15 6.4E-20 110.9 15.3 153 169-344 1-162 (163)
27 cd01462 VWA_YIEM_type VWA YIEM 99.6 1.9E-14 4.7E-19 105.6 15.2 144 170-338 2-146 (152)
28 COG1240 ChlD Mg-chelatase subu 99.6 1.3E-13 3.3E-18 100.5 15.7 171 164-351 74-250 (261)
29 COG4245 TerY Uncharacterized p 99.5 6.4E-13 1.6E-17 96.3 13.4 182 170-366 5-188 (207)
30 cd01454 vWA_norD_type norD typ 99.5 2.5E-12 6.3E-17 92.8 15.4 152 170-342 2-171 (174)
31 cd01477 vWA_F09G8-8_type VWA F 99.5 5.7E-12 1.4E-16 90.6 16.6 165 165-344 16-192 (193)
32 COG4961 TadG Flp pilus assembl 99.4 2.9E-12 7.3E-17 92.4 11.2 68 3-70 9-76 (185)
33 TIGR02442 Cob-chelat-sub cobal 99.3 1.6E-10 4E-15 81.9 13.9 165 164-345 504-687 (688)
34 KOG2353 consensus 99.1 1.9E-09 4.9E-14 75.3 12.3 183 164-367 221-417 (1104)
35 PRK13406 bchD magnesium chelat 99.1 1.3E-08 3.3E-13 70.3 16.1 171 165-356 398-580 (584)
36 TIGR02031 BchD-ChlD magnesium 99.1 2.1E-09 5.4E-14 75.1 11.5 170 165-349 507-700 (705)
37 cd01453 vWA_transcription_fact 98.9 3.8E-07 9.7E-12 61.5 16.5 169 170-357 5-177 (183)
38 cd01457 vWA_ORF176_type VWA OR 98.8 1.4E-07 3.5E-12 64.1 11.5 157 169-337 3-163 (199)
39 pfam00362 Integrin_beta Integr 98.7 1.3E-06 3.3E-11 58.2 14.9 192 158-363 90-344 (424)
40 COG4548 NorD Nitric oxide redu 98.7 3.9E-07 9.9E-12 61.4 11.3 175 168-362 446-635 (637)
41 smart00187 INB Integrin beta s 98.7 2.1E-06 5.3E-11 57.0 14.6 191 158-362 89-342 (423)
42 pfam11775 CobT_C Cobalamin bio 98.6 9.5E-07 2.4E-11 59.1 11.6 163 169-359 13-215 (220)
43 pfam04056 Ssl1 Ssl1-like. Ssl1 98.3 0.00013 3.4E-09 46.2 16.1 169 170-357 54-228 (250)
44 COG2425 Uncharacterized protei 98.2 1.6E-05 4E-10 51.7 9.5 159 170-358 274-435 (437)
45 COG2304 Uncharacterized protei 98.1 4.8E-05 1.2E-09 48.8 9.6 164 161-345 30-198 (399)
46 KOG2807 consensus 98.1 0.0003 7.5E-09 44.0 13.5 167 169-358 61-235 (378)
47 cd01455 vWA_F11C1-5a_type Von 98.0 0.00084 2.1E-08 41.3 15.2 174 171-361 3-188 (191)
48 COG4655 Predicted membrane pro 98.0 3.2E-06 8.1E-11 55.9 2.2 58 7-64 2-59 (565)
49 cd01452 VWA_26S_proteasome_sub 97.9 0.0013 3.3E-08 40.1 14.4 169 170-354 5-182 (187)
50 KOG3768 consensus 97.8 0.00038 9.7E-09 43.4 9.8 192 171-363 4-229 (888)
51 pfam05762 VWA_CoxE VWA domain 97.7 0.00024 6.2E-09 44.5 8.5 130 167-325 56-189 (223)
52 pfam06707 DUF1194 Protein of u 97.7 0.0031 8E-08 37.8 15.5 179 168-364 3-204 (206)
53 KOG1226 consensus 97.7 0.00051 1.3E-08 42.6 9.4 151 158-322 122-329 (783)
54 cd01460 vWA_midasin VWA_Midasi 97.6 0.0051 1.3E-07 36.6 14.8 189 147-361 45-258 (266)
55 pfam04285 DUF444 Protein of un 97.5 0.0062 1.6E-07 36.1 13.2 163 171-362 249-419 (421)
56 PRK05325 hypothetical protein; 97.5 0.0072 1.8E-07 35.7 13.3 164 171-363 237-411 (414)
57 pfam09967 DUF2201 Predicted me 97.2 0.0013 3.2E-08 40.3 6.6 96 169-299 288-385 (412)
58 cd01459 vWA_copine_like VWA Co 97.0 0.024 6.2E-07 32.5 13.5 154 168-336 31-204 (254)
59 COG4867 Uncharacterized protei 96.9 0.027 6.9E-07 32.2 12.0 156 170-354 465-641 (652)
60 pfam11443 DUF2828 Domain of un 96.9 0.031 7.8E-07 31.9 11.1 142 169-324 327-472 (524)
61 COG4547 CobT Cobalamin biosynt 96.8 0.0023 5.9E-08 38.7 4.7 82 256-355 520-610 (620)
62 cd01458 vWA_ku Ku70/Ku80 N-ter 96.8 0.037 9.3E-07 31.4 13.4 156 170-337 3-187 (218)
63 pfam07811 TadE TadE-like prote 96.6 0.0085 2.2E-07 35.2 6.3 42 15-56 1-42 (43)
64 KOG1327 consensus 96.5 0.057 1.5E-06 30.2 12.3 152 167-337 284-462 (529)
65 TIGR01651 CobT cobaltochelatas 96.4 0.0089 2.3E-07 35.1 5.5 140 170-331 400-578 (606)
66 pfam07002 Copine Copine. This 96.1 0.087 2.2E-06 29.1 11.1 124 189-325 10-145 (145)
67 COG5151 SSL1 RNA polymerase II 96.0 0.099 2.5E-06 28.8 11.2 170 167-358 86-266 (421)
68 KOG2884 consensus 96.0 0.11 2.7E-06 28.6 15.3 170 170-356 5-184 (259)
69 TIGR02877 spore_yhbH sporulati 95.9 0.075 1.9E-06 29.5 8.3 115 170-312 216-331 (392)
70 PRK10997 yieM hypothetical pro 95.7 0.14 3.5E-06 28.0 14.2 143 165-336 319-464 (484)
71 TIGR00873 gnd 6-phosphoglucona 95.5 0.036 9.1E-07 31.5 5.5 46 280-325 61-127 (480)
72 COG2718 Uncharacterized conser 95.4 0.18 4.5E-06 27.3 13.2 165 171-363 249-419 (423)
73 pfam03731 Ku_N Ku70/Ku80 N-ter 94.4 0.32 8.1E-06 25.7 14.4 186 171-366 2-220 (222)
74 COG3552 CoxE Protein containin 93.9 0.35 9E-06 25.5 7.2 119 166-313 216-338 (395)
75 COG3847 Flp Flp pilus assembly 92.7 0.48 1.2E-05 24.7 6.3 25 4-28 3-27 (58)
76 pfam11265 Med25_VWA Mediator c 92.6 0.66 1.7E-05 23.8 10.3 162 169-336 6-188 (219)
77 COG3864 Uncharacterized protei 92.4 0.47 1.2E-05 24.7 6.0 95 170-298 263-357 (396)
78 LOAD_ku consensus 90.7 1.1 2.7E-05 22.6 14.1 53 171-226 2-55 (521)
79 COG5148 RPN10 26S proteasome r 84.4 2.6 6.6E-05 20.3 12.5 143 170-330 5-150 (243)
80 KOG2487 consensus 82.8 2.8 7.2E-05 20.0 4.8 72 283-358 165-238 (314)
81 TIGR00627 tfb4 transcription f 82.1 2.7 6.9E-05 20.1 4.5 76 281-356 153-232 (295)
82 cd01468 trunk_domain trunk dom 77.5 4.5 0.00011 18.8 16.4 160 168-348 3-224 (239)
83 KOG4465 consensus 74.7 5.4 0.00014 18.3 11.0 118 166-308 425-544 (598)
84 KOG2653 consensus 72.6 6 0.00015 18.0 5.6 10 171-180 173-182 (487)
85 COG4726 PilX Tfp pilus assembl 69.2 7.2 0.00018 17.6 5.2 57 8-65 7-71 (196)
86 pfam04811 Sec23_trunk Sec23/Se 58.3 12 0.00029 16.3 16.6 159 168-348 3-224 (241)
87 pfam02060 ISK_Channel Slow vol 57.6 12 0.0003 16.3 4.5 40 6-45 31-70 (129)
88 COG5242 TFB4 RNA polymerase II 57.2 12 0.00031 16.2 5.4 45 311-357 178-224 (296)
89 TIGR02477 PFKA_PPi diphosphate 50.3 16 0.0004 15.5 3.3 39 285-329 170-208 (566)
90 COG1681 FlaB Archaeal flagelli 50.0 15 0.00039 15.6 2.7 22 12-33 1-22 (209)
91 PRK06939 2-amino-3-ketobutyrat 48.6 11 0.00029 16.3 1.9 60 301-362 334-394 (395)
92 pfam04964 Flp_Fap Flp/Fap pili 44.9 19 0.00048 15.0 5.7 21 9-29 1-21 (47)
93 cd01479 Sec24-like Sec24-like: 44.8 19 0.00048 15.0 16.5 174 168-363 3-241 (244)
94 cd06353 PBP1_BmpA_Med_like Per 43.9 20 0.0005 14.9 7.7 19 307-327 191-209 (258)
95 COG1991 Uncharacterized conser 43.7 9.5 0.00024 16.8 0.8 34 6-39 5-38 (131)
96 pfam03850 Tfb4 Transcription f 43.3 20 0.00051 14.9 12.0 75 282-359 140-216 (271)
97 pfam05814 DUF843 Baculovirus p 42.5 21 0.00053 14.8 5.3 43 10-53 18-74 (83)
98 PRK08265 short chain dehydroge 41.7 21 0.00054 14.7 7.7 21 306-326 162-182 (261)
99 COG2984 ABC-type uncharacteriz 37.2 25 0.00064 14.3 9.3 80 282-365 158-238 (322)
100 PRK08643 acetoin reductase; Va 35.9 26 0.00067 14.2 7.2 21 306-326 164-184 (256)
101 pfam04917 Shufflon_N Bacterial 34.1 28 0.00071 14.0 7.3 46 10-55 2-47 (356)
102 PRK12826 3-ketoacyl-(acyl-carr 33.3 29 0.00074 13.9 7.6 59 308-369 170-245 (253)
103 pfam02608 Bmp Basic membrane p 32.7 30 0.00075 13.9 8.7 21 307-327 198-218 (302)
104 TIGR00565 trpE_proteo anthrani 32.5 21 0.00053 14.8 1.1 40 281-322 277-316 (505)
105 cd06304 PBP1_BmpA_like Peripla 32.4 30 0.00076 13.8 7.4 19 307-327 193-211 (260)
106 pfam09001 DUF1890 Domain of un 32.2 30 0.00077 13.8 3.9 21 315-337 93-113 (138)
107 pfam11411 DNA_ligase_IV DNA li 30.8 32 0.00081 13.7 2.2 22 339-360 14-35 (36)
108 KOG4169 consensus 30.7 32 0.00081 13.6 6.2 10 313-322 171-180 (261)
109 pfam00331 Glyco_hydro_10 Glyco 30.3 32 0.00083 13.6 6.7 55 286-340 163-226 (308)
110 PRK05715 NADH dehydrogenase su 29.9 33 0.00084 13.6 4.0 39 3-41 45-83 (99)
111 TIGR02180 GRX_euk Glutaredoxin 29.3 34 0.00086 13.5 1.8 26 346-371 38-63 (85)
112 pfam00071 Ras Ras family. Incl 28.2 35 0.0009 13.4 4.4 59 306-364 91-162 (162)
113 cd06354 PBP1_BmpA_PnrA_like Pe 27.9 36 0.00091 13.3 7.2 13 312-326 202-214 (265)
114 smart00633 Glyco_10 Glycosyl h 27.9 36 0.00091 13.3 6.8 54 287-340 120-182 (254)
115 PRK05872 short chain dehydroge 27.6 36 0.00092 13.3 8.8 19 307-325 167-185 (296)
116 cd02012 TPP_TK Thiamine pyroph 27.2 37 0.00094 13.3 5.6 83 282-367 126-221 (255)
117 PRK10538 3-hydroxy acid dehydr 27.2 37 0.00094 13.3 9.0 51 307-360 160-221 (248)
118 TIGR00937 2A51 chromate transp 27.0 37 0.00095 13.3 4.2 49 11-59 62-113 (390)
119 PRK01215 competence damage-ind 26.7 38 0.00096 13.2 5.0 20 345-364 232-251 (264)
120 PRK08541 flagellin; Validated 25.2 40 0.001 13.1 2.5 22 12-33 1-22 (212)
121 PRK06138 short chain dehydroge 25.1 40 0.001 13.0 7.4 20 307-326 166-185 (252)
122 TIGR02848 spore_III_AC stage I 24.7 41 0.001 13.0 2.8 29 6-34 23-51 (64)
123 cd06325 PBP1_ABC_uncharacteriz 24.6 41 0.001 13.0 8.5 30 286-321 187-216 (281)
124 PRK08324 short chain dehydroge 24.6 41 0.001 13.0 8.5 77 286-369 422-502 (676)
125 PRK07523 gluconate 5-dehydroge 24.2 42 0.0011 12.9 7.5 20 307-326 168-187 (251)
126 cd00765 Pyrophosphate_PFK Phos 23.2 44 0.0011 12.8 3.9 16 166-181 164-179 (550)
127 TIGR00421 ubiX_pad polyprenyl 22.2 46 0.0012 12.7 2.7 33 283-320 111-143 (181)
128 PRK07576 short chain dehydroge 22.1 46 0.0012 12.7 7.2 50 308-360 170-232 (260)
129 PRK06113 7-alpha-hydroxysteroi 21.3 48 0.0012 12.6 7.5 20 307-326 172-191 (255)
130 cd03132 GATase1_catalase Type 21.0 48 0.0012 12.6 5.4 44 303-350 82-128 (142)
131 PRK08277 D-mannonate oxidoredu 20.4 50 0.0013 12.5 7.0 19 307-325 187-205 (278)
132 PRK07505 hypothetical protein; 20.4 50 0.0013 12.5 2.4 60 301-362 343-403 (405)
133 pfam04392 ABC_sub_bind ABC tra 20.2 51 0.0013 12.4 8.2 30 286-321 186-215 (292)
134 TIGR02764 spore_ybaN_pdaB poly 20.2 51 0.0013 12.4 2.2 36 287-323 160-195 (198)

No 1
>PRK13685 hypothetical protein; Provisional
Probab=99.96 E-value=6.4e-27 Score=180.87 Aligned_cols=191
Identities=19% Similarity=0.291 Sum_probs=155.6 Q ss_pred CCCEEEEECCCCCCCCCCCCCHHHHHHHHHHHHHHCCCCCCCCCCCEEEEEEEEEECCCCEEECCCCCCHHHHHHHHHCC Q ss_conf 74069985276311566787214898999998750023355553110479878863674178166655878999997401 Q gi|254780833|r 168 GLDMMMVLDVSLSMNDHFGPGMDKLGVATRSIREMLDIIKSIPDVNNVVRSGLVTFSSKIVQTFPLAWGVQHIQEKINRL 247 (371) Q Consensus 168 ~idi~~viD~SgSm~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~f~~~~~~~~~lt~~~~~~~~~i~~l 247 (371) ..++++++|.|+||....-. .+|++.++..+..+++..... .|+|++.|...+...+|+|.|+..++..++.+ T Consensus 88 ~~~i~l~lD~S~SM~a~D~~-p~Rl~~ak~~~~~fi~~~~~~------driGlv~Fa~~a~~~~plT~D~~~~~~~l~~l 160 (326) T PRK13685 88 RAVVMLVIDVSQSMRATDVE-PNRLAAAQEAAKQFADQLTPG------INLGLIAFAGTATVLVSPTTNREATKNALDKL 160 (326) T ss_pred CCCEEEEEECCCCCCCCCCC-CCHHHHHHHHHHHHHHHCCCC------CEEEEEEECCCCEECCCCCCCHHHHHHHHHHC T ss_conf 88679999897565587889-568999999999999737988------82899996587201489875399999999846 Q ss_pred CCCCCCCCCCHHHHHHHHHHHHHHCCCCCCCCCCCCCEEEEEEECCCCCCCCC--CHHHHHHHHHHHHHCCCEEEEEEEC Q ss_conf 56887456423788999987421101234677766616999984045888888--9789999999999789879999941 Q gi|254780833|r 248 IFGSTTKSTPGLEYAYNKIFDAKEKLEHIAKGHDDYKKYIIFLTDGENSSPNI--DNKESLFYCNEAKRRGAIVYAIGVQ 325 (371) Q Consensus 248 ~~~g~T~~~~~~~~~~~~l~~~~~~~~~~~~~~~~~~k~ivl~TDG~~~~~~~--~~~~~~~~c~~~k~~gi~i~tIg~~ 325 (371) .++.+|.+..|+..+.+.+....... ...+....+.|||+|||+||.+.. 
++......++.+|+.||+|||||+| T Consensus 161 ~~~~~taiG~ai~~Al~~l~~~~~~~---~~~~~~~~~~IILLTDG~~n~g~~~~~p~~~~~AA~~A~~~gi~IyTIgvG 237 (326) T PRK13685 161 QLADRTATGEGIFTALQAIATVGAVI---GGGDTPPPARIVLFSDGKETVPTNPDNPKGAYTAARTAKDQGVPISTISFG 237 (326) T ss_pred CCCCCCCCCHHHHHHHHHHHHHHHHC---CCCCCCCCCEEEEECCCCCCCCCCCCCCCHHHHHHHHHHHCCCCEEEEEEC T ss_conf 87888864068999999998633201---456777886799974899777889887302999999999859948999977 Q ss_pred CC--------------CCHHHHHHHC--CCCEEEEECCHHHHHHHHHHHHHHHHHEEEE Q ss_conf 86--------------4279899833--8980898289899999999999964000787 Q gi|254780833|r 326 AE--------------AADQFLKNCA--SPDRFYSVQNSRKLHDAFLRIGKEMVKQRIL 368 (371) Q Consensus 326 ~~--------------~~~~~l~~cA--s~~~~y~~~~~~~L~~af~~I~~~i~~~~i~ 368 (371) .+ -|++.||++| ++|.||.++|.++|+++|++|.+.+....++ T Consensus 238 t~~g~~~~~g~~~~~~lDe~~L~~IA~~TGG~yfrA~d~~~L~~Iy~~i~~~i~~~~~~ 296 (326) T PRK13685 238 TPYGFVEINGQRQPVPVDDETLKKIAQLSGGEFYTAASLEELRAVYATLQQQIGYETIK 296 (326) T ss_pred CCCCCCCCCCCCCCCCCCHHHHHHHHHHCCCEEEECCCHHHHHHHHHHHHHHHCCEEEC T ss_conf 99884354784034568999999999972987997199999999999963331603311 No 2 >cd01465 vWA_subgroup VWA subgroup: Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. 
Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if n Probab=99.95 E-value=1.3e-26 Score=178.94 Aligned_cols=166 Identities=19% Similarity=0.299 Sum_probs=143.8 Q ss_pred CCEEEEECCCCCCCCCCCCCHHHHHHHHHHHHHHCCCCCCCCCCCEEEEEEEEEECCCCEEECCCC--CCHHHHHHHHHC Q ss_conf 406998527631156678721489899999875002335555311047987886367417816665--587899999740 Q gi|254780833|r 169 LDMMMVLDVSLSMNDHFGPGMDKLGVATRSIREMLDIIKSIPDVNNVVRSGLVTFSSKIVQTFPLA--WGVQHIQEKINR 246 (371) Q Consensus 169 idi~~viD~SgSm~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~f~~~~~~~~~lt--~~~~~~~~~i~~ 246 (371) +|++||+|.||||.. .+++.+++++..+++.+... .|++++.|++.+....|++ .+...+..+|+. T Consensus 1 ldiv~vlD~SGSM~g------~~~~~~k~a~~~~l~~l~~~------dr~~iv~F~~~~~~~~~~~~~~~~~~~~~~i~~ 68 (170) T cd01465 1 LNLVFVIDRSGSMDG------PKLPLVKSALKLLVDQLRPD------DRLAIVTYDGAAETVLPATPVRDKAAILAAIDR 68 (170) T ss_pred CCEEEEEECCCCCCC------CHHHHHHHHHHHHHHHCCCC------CEEEEEEECCCCEECCCCCCHHHHHHHHHHHHC T ss_conf 919999908868897------19999999999999858987------879999835861551587866679999998743 Q ss_pred CCCCCCCCCCCHHHHHHHHHHHHHHCCCCCCCCCCCCCEEEEEEECCCCCCCCCCHHHHHHHHHHHHHCCCEEEEEEECC Q ss_conf 15688745642378899998742110123467776661699998404588888897899999999997898799999418 Q gi|254780833|r 247 LIFGSTTKSTPGLEYAYNKIFDAKEKLEHIAKGHDDYKKYIIFLTDGENSSPNIDNKESLFYCNEAKRRGAIVYAIGVQA 326 (371) Q Consensus 247 l~~~g~T~~~~~~~~~~~~l~~~~~~~~~~~~~~~~~~k~ivl~TDG~~~~~~~~~~~~~~~c~~~k~~gi~i~tIg~~~ 326 (371) +.++|+|++..|+..+++.+.... .++..+.|||+|||++|.+..+.......+...++.+|+||+||||. 
T Consensus 69 l~~~G~T~~~~~l~~a~~~~~~~~---------~~~~~~~iillTDG~~~~~~~~~~~~~~~~~~~~~~~i~i~tiGiG~ 139 (170) T cd01465 69 LTAGGSTAGGAGIQLGYQEAQKHF---------VPGGVNRILLATDGDFNVGETDPDELARLVAQKRESGITLSTLGFGD 139 (170) T ss_pred CCCCCCCCHHHHHHHHHHHHHHCC---------CCCCCEEEEEEECCCCCCCCCCHHHHHHHHHHHHHCCCCCEEEEECC T ss_conf 898998527799999999998633---------78875069998158856798898999999999874388624898088 Q ss_pred CCCHHHHHHHC--CCCEEEEECCHHHHHHHH Q ss_conf 64279899833--898089828989999999 Q gi|254780833|r 327 EAADQFLKNCA--SPDRFYSVQNSRKLHDAF 355 (371) Q Consensus 327 ~~~~~~l~~cA--s~~~~y~~~~~~~L~~af 355 (371) +.+.++|+.+| ++|+||++++++||.++| T Consensus 140 ~~~~~~L~~iA~~~~G~~~~v~~~~~l~~~f 170 (170) T cd01465 140 NYNEDLMEAIADAGNGNTAYIDNLAEARKVF 170 (170) T ss_pred CCCHHHHHHHHHCCCCEEEECCCHHHHHHHC T ss_conf 7999999999975798899849999999639 No 3 >cd01467 vWA_BatA_type VWA BatA type: Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses. In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. 
Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if Probab=99.94 E-value=1e-25 Score=173.56 Aligned_cols=163 Identities=26% Similarity=0.376 Sum_probs=135.2 Q ss_pred CCCEEEEECCCCCCCCCCCCCHHHHHHHHHHHHHHCCCCCCCCCCCEEEEEEEEEECCCCEEECCCCCCHHHHHHHHHCC Q ss_conf 74069985276311566787214898999998750023355553110479878863674178166655878999997401 Q gi|254780833|r 168 GLDMMMVLDVSLSMNDHFGPGMDKLGVATRSIREMLDIIKSIPDVNNVVRSGLVTFSSKIVQTFPLAWGVQHIQEKINRL 247 (371) Q Consensus 168 ~idi~~viD~SgSm~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~f~~~~~~~~~lt~~~~~~~~~i~~l 247 (371) ++|++||+|.||||.........|++.+++++..+++... ..|++++.|++.+...+|++.+...++..++.+ T Consensus 2 G~dvvlvlD~SgSM~~~d~~~~~rl~~ak~~~~~~i~~~~-------~drvglv~Fs~~a~~~~plT~d~~~~~~~l~~i 74 (180) T cd01467 2 GRDIMIALDVSGSMLAQDFVKPSRLEAAKEVLSDFIDRRE-------NDRIGLVVFAGAAFTQAPLTLDRESLKELLEDI 74 (180) T ss_pred CCEEEEEEECCCCCCCCCCCCCCHHHHHHHHHHHHHHHCC-------CCEEEEEEECCCCEEECCCCCCHHHHHHHHHCC T ss_conf 6279999989847578666785899999999999997199-------975999997287367337665689999998622 Q ss_pred C---CCCCCCCCCHHHHHHHHHHHHHHCCCCCCCCCCCCCEEEEEEECCCCCCCCCCHHHHHHHHHHHHHCCCEEEEEEE Q ss_conf 5---6887456423788999987421101234677766616999984045888888978999999999978987999994 Q gi|254780833|r 248 I---FGSTTKSTPGLEYAYNKIFDAKEKLEHIAKGHDDYKKYIIFLTDGENSSPNIDNKESLFYCNEAKRRGAIVYAIGV 324 (371) Q Consensus 248 ~---~~g~T~~~~~~~~~~~~l~~~~~~~~~~~~~~~~~~k~ivl~TDG~~~~~~~~~~~~~~~c~~~k~~gi~i~tIg~ 324 (371) . ++++|++..++..+.+.|.+. ....|+|||+|||++|.+..+. 
...++.+++.||+||+||+ T Consensus 75 ~~~~~~ggT~i~~al~~a~~~l~~~-----------~~~~~~ivLlTDG~~n~g~~~~---~~~~~~a~~~gi~v~tIGv 140 (180) T cd01467 75 KIGLAGQGTAIGDAIGLAIKRLKNS-----------EAKERVIVLLTDGENNAGEIDP---ATAAELAKNKGVRIYTIGV 140 (180) T ss_pred CCCCCCCCCCHHHHHHHHHHHHHCC-----------CCCCCEEEEEECCCCCCCCCCH---HHHHHHHHHCCCEEEEEEE T ss_conf 4453236860899999999976424-----------7666379998058866787699---9999999976998999997 Q ss_pred CCCC-----------CHHHHHHHC--CCCEEEEECCHHHH Q ss_conf 1864-----------279899833--89808982898999 Q gi|254780833|r 325 QAEA-----------ADQFLKNCA--SPDRFYSVQNSRKL 351 (371) Q Consensus 325 ~~~~-----------~~~~l~~cA--s~~~~y~~~~~~~L 351 (371) |.+. +++.|+++| ++|+||++.|++|| T Consensus 141 G~~~~~~~~~~~~~~d~~~L~~iA~~tgG~yy~a~~~~eL 180 (180) T cd01467 141 GKSGSGPKPDGSTILDEDSLVEIADKTGGRIFRALDGFEL 180 (180) T ss_pred CCCCCCCCCCCCCCCCHHHHHHHHHHCCCEEEECCCHHHC T ss_conf 7898887688876559999999999619979972874649 No 4 >cd01461 vWA_interalpha_trypsin_inhibitor vWA_interalpha trypsin inhibitor (ITI): ITI is a glycoprotein composed of three polypeptides- two heavy chains and one light chain (bikunin). Bikunin confers the protease-inhibitor function while the heavy chains are involved in rendering stability to the extracellular matrix by binding to hyaluronic acid. The heavy chains carry the VWA domain with a conserved MIDAS motif. Although the exact role of the VWA domains remains unknown, it has been speculated to be involved in mediating protein-protein interactions with the components of the extracellular matrix. 
Probab=99.91 E-value=7.2e-23 Score=156.42 Aligned_cols=162 Identities=20% Similarity=0.217 Sum_probs=130.7 Q ss_pred CCCCEEEEECCCCCCCCCCCCCHHHHHHHHHHHHHHCCCCCCCCCCCEEEEEEEEEECCCCEEECCCC--CC---HHHHH Q ss_conf 67406998527631156678721489899999875002335555311047987886367417816665--58---78999 Q gi|254780833|r 167 IGLDMMMVLDVSLSMNDHFGPGMDKLGVATRSIREMLDIIKSIPDVNNVVRSGLVTFSSKIVQTFPLA--WG---VQHIQ 241 (371) Q Consensus 167 ~~idi~~viD~SgSm~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~f~~~~~~~~~lt--~~---~~~~~ 241 (371) .|.|+++|+|.||||.. .+++.+++++..+++.+... .+++++.|++.+....|.. .+ ..... T Consensus 1 ~P~div~viD~SgSM~g------~~l~~ak~a~~~~l~~l~~~------d~~~iv~F~~~~~~~~~~~~~~~~~~~~~a~ 68 (171) T cd01461 1 LPKEVVFVIDTSGSMSG------TKIEQTKEALLTALKDLPPG------DYFNIIGFSDTVEEFSPSSVSATAENVAAAI 68 (171) T ss_pred CCCEEEEEECCCCCCCC------HHHHHHHHHHHHHHHHCCCC------CEEEEEEECCEEEEECCCCEECCHHHHHHHH T ss_conf 98469999917988986------39999999999999829987------8799999878065980775307999999999 Q ss_pred HHHHCCCCCCCCCCCCHHHHHHHHHHHHHHCCCCCCCCCCCCCEEEEEEECCCCCCCCCCHHHHHHHHHHHHHCCCEEEE Q ss_conf 99740156887456423788999987421101234677766616999984045888888978999999999978987999 Q gi|254780833|r 242 EKINRLIFGSTTKSTPGLEYAYNKIFDAKEKLEHIAKGHDDYKKYIIFLTDGENSSPNIDNKESLFYCNEAKRRGAIVYA 321 (371) Q Consensus 242 ~~i~~l~~~g~T~~~~~~~~~~~~l~~~~~~~~~~~~~~~~~~k~ivl~TDG~~~~~~~~~~~~~~~c~~~k~~gi~i~t 321 (371) .+|+.+.++|+|++..|+.++++.+... +...+.|||+|||+.+.. 
......+..+++.+|+||+ T Consensus 69 ~~i~~l~~~G~T~i~~aL~~a~~~l~~~-----------~~~~~~iillTDG~~~~~----~~~~~~~~~~~~~~i~i~t 133 (171) T cd01461 69 EYVNRLQALGGTNMNDALEAALELLNSS-----------PGSVPQIILLTDGEVTNE----SQILKNVREALSGRIRLFT 133 (171) T ss_pred HHHHCCCCCCCCHHHHHHHHHHHHHHHC-----------CCCCCEEEEECCCCCCCH----HHHHHHHHHHHCCCCEEEE T ss_conf 8875478899866999999999988635-----------798618999757886886----8999999997448963999 Q ss_pred EEECCCCCHHHHHHHC--CCCEEEEECCHHHHHHHH Q ss_conf 9941864279899833--898089828989999999 Q gi|254780833|r 322 IGVQAEAADQFLKNCA--SPDRFYSVQNSRKLHDAF 355 (371) Q Consensus 322 Ig~~~~~~~~~l~~cA--s~~~~y~~~~~~~L~~af 355 (371) ||||.+.+..+|+.+| ++|.||++++.++|.+.+ T Consensus 134 ig~G~~~~~~~L~~iA~~~~G~~~~v~~~~~l~~~~ 169 (171) T cd01461 134 FGIGSDVNTYLLERLAREGRGIARRIYETDDIESQL 169 (171) T ss_pred EEECCCCCHHHHHHHHHCCCCEEEECCCHHHHHHHH T ss_conf 997897999999999972898899889878999976 No 5 >cd01466 vWA_C3HC4_type VWA C3HC4-type: Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. 
Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, Probab=99.90 E-value=1.2e-22 Score=155.15 Aligned_cols=149 Identities=26% Similarity=0.331 Sum_probs=122.7 Q ss_pred CCEEEEECCCCCCCCCCCCCHHHHHHHHHHHHHHCCCCCCCCCCCEEEEEEEEEECCCCEEECCCCC----CHHHHHHHH Q ss_conf 4069985276311566787214898999998750023355553110479878863674178166655----878999997 Q gi|254780833|r 169 LDMMMVLDVSLSMNDHFGPGMDKLGVATRSIREMLDIIKSIPDVNNVVRSGLVTFSSKIVQTFPLAW----GVQHIQEKI 244 (371) Q Consensus 169 idi~~viD~SgSm~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~f~~~~~~~~~lt~----~~~~~~~~i 244 (371) +|+++|+|+||||.. ++++.+++++..+++.+... .+++++.|++.+....|++. ++..++..| T Consensus 1 ~div~vlD~SGSM~g------~~l~~~k~a~~~~~~~L~~~------d~v~iV~F~~~a~~~~pl~~~~~~~~~~~~~~i 68 (155) T cd01466 1 VDLVAVLDVSGSMAG------DKLQLVKHALRFVISSLGDA------DRLSIVTFSTSAKRLSPLRRMTAKGKRSAKRVV 68 (155) T ss_pred CEEEEEEECCCCCCC------HHHHHHHHHHHHHHHHCCCC------CEEEEEEECCCCEEEECCEECCHHHHHHHHHHH T ss_conf 939999908989887------38999999999999848976------748999956874262046037999999999987 Q ss_pred HCCCCCCCCCCCCHHHHHHHHHHHHHHCCCCCCCCCCCCCEEEEEEECCCCCCCCCCHHHHHHHHHHHHHCCCEEEEEEE Q ss_conf 40156887456423788999987421101234677766616999984045888888978999999999978987999994 Q gi|254780833|r 245 NRLIFGSTTKSTPGLEYAYNKIFDAKEKLEHIAKGHDDYKKYIIFLTDGENSSPNIDNKESLFYCNEAKRRGAIVYAIGV 324 (371) Q Consensus 245 ~~l~~~g~T~~~~~~~~~~~~l~~~~~~~~~~~~~~~~~~k~ivl~TDG~~~~~~~~~~~~~~~c~~~k~~gi~i~tIg~ 324 (371) +.+.++|+|++..|+..+.+.+.... 
.+++.+.|||+|||++|.+ ..+..+++.+|+|||||| T Consensus 69 ~~l~~~GgT~i~~gl~~a~~~l~~~~---------~~~~~~~IiLlTDG~~n~~--------~~~~~~~~~~i~i~tiGi 131 (155) T cd01466 69 DGLQAGGGTNVVGGLKKALKVLGDRR---------QKNPVASIMLLSDGQDNHG--------AVVLRADNAPIPIHTFGL 131 (155) T ss_pred HCCCCCCCCCHHHHHHHHHHHHHHCC---------CCCCCEEEEEECCCCCCHH--------HHHHHHHCCCCEEEEEEE T ss_conf 53776888726799999999998436---------6898308999826986405--------778998717973999997 Q ss_pred CCCCCHHHHHHHC--CCCEEEEEC Q ss_conf 1864279899833--898089828 Q gi|254780833|r 325 QAEAADQFLKNCA--SPDRFYSVQ 346 (371) Q Consensus 325 ~~~~~~~~l~~cA--s~~~~y~~~ 346 (371) |.+.++.+|+.+| ++|+||+++ T Consensus 132 G~~~d~~lL~~iA~~~gG~~~~v~ 155 (155) T cd01466 132 GASHDPALLAFIAEITGGTFSYVK 155 (155) T ss_pred CCCCCHHHHHHHHHCCCCEEEEEC T ss_conf 886789999999976997799949 No 6 >cd01456 vWA_ywmD_type VWA ywmD type:Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. 
Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if Probab=99.88 E-value=3.1e-21 Score=146.54 Aligned_cols=171 Identities=19% Similarity=0.182 Sum_probs=125.1 Q ss_pred ECCCCCCCCCCEEEEECCCCCCCCCCCCCHHHHHHHHHHHHHHCCCCCCCCCCCEEEEEEEEEECCCCEEEC-------- Q ss_conf 100013467406998527631156678721489899999875002335555311047987886367417816-------- Q gi|254780833|r 160 KISSKSDIGLDMMMVLDVSLSMNDHFGPGMDKLGVATRSIREMLDIIKSIPDVNNVVRSGLVTFSSKIVQTF-------- 231 (371) Q Consensus 160 ~~~~~~~~~idi~~viD~SgSm~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~f~~~~~~~~-------- 231 (371) +...+...|.++++|+|.||||....+.+.+|++.++.++..+++.+... .+++++.|++...... T Consensus 12 p~~~~~~~P~~~~lVlD~SGSM~~~~~~g~~rl~~ak~a~~~~v~~l~~~------drvgLv~F~~~~~~~~d~~~~~~~ 85 (206) T cd01456 12 PVETEPQLPPNVAIVLDNSGSMREVDGGGETRLDNAKAALDETANALPDG------TRLGLWTFSGDGDNPLDVRVLVPK 85 (206) T ss_pred CCCCCCCCCCEEEEEEECCCCCCCCCCCCCCHHHHHHHHHHHHHHHCCCC------CEEEEEEECCCCCCCCCCCEECCC T ss_conf 97568989873899997987877878776459999999999999857999------879999977867778885132145 Q ss_pred -CC--------CCCHHHHHHHHHCCC-CCCCCCCCCHHHHHHHHHHHHHHCCCCCCCCCCCCCEEEEEEECCCCCCCCCC Q ss_conf -66--------558789999974015-68874564237889999874211012346777666169999840458888889 Q gi|254780833|r 232 -PL--------AWGVQHIQEKINRLI-FGSTTKSTPGLEYAYNKIFDAKEKLEHIAKGHDDYKKYIIFLTDGENSSPNID 301 (371) Q Consensus 232 -~l--------t~~~~~~~~~i~~l~-~~g~T~~~~~~~~~~~~l~~~~~~~~~~~~~~~~~~k~ivl~TDG~~~~~~~~ 301 (371) ++ ..++..+...++.+. +.|+|.+..++..+...+. +...+.|||||||++|.+... 
T Consensus 86 ~~~~~~~~~~~~~~r~~l~~~i~~l~~~~G~T~l~~al~~a~~~~~-------------~~~~~~IvLlTDG~~~~g~~~ 152 (206) T cd01456 86 GCLTAPVNGFPSAQRSALDAALNSLQTPTGWTPLAAALAEAAAYVD-------------PGRVNVVVLITDGEDTCGPDP 152 (206) T ss_pred CCCCCCCCCCCHHHHHHHHHHHHHCCCCCCCCHHHHHHHHHHHHHC-------------CCCCCEEEEEECCCCCCCCCH T ss_conf 6544434552377899999999745778896479999999998627-------------787647999923764468885 Q ss_pred HHHHHHHHHH-HHHCCCEEEEEEECCCCCHHHHHHHC--CCCEE-EEECCHH Q ss_conf 7899999999-99789879999941864279899833--89808-9828989 Q gi|254780833|r 302 NKESLFYCNE-AKRRGAIVYAIGVQAEAADQFLKNCA--SPDRF-YSVQNSR 349 (371) Q Consensus 302 ~~~~~~~c~~-~k~~gi~i~tIg~~~~~~~~~l~~cA--s~~~~-y~~~~~~ 349 (371) ......+... .+..+|+||+||||.+.+..+|+++| ++|.| |.++++. T Consensus 153 ~~~~~~l~~~~~~~~~v~V~tig~G~d~d~~~L~~IA~~tgG~y~y~~~d~~ 204 (206) T cd01456 153 CEVARELAKRRTPAPPIKVNVIDFGGDADRAELEAIAEATGGTYAYNQSDLA 204 (206) T ss_pred HHHHHHHHHHCCCCCCEEEEEEEECCCCCHHHHHHHHHCCCCEEEEECCCCC T ss_conf 9999999983177999589999718865899999999742978995167602 No 7 >cd01463 vWA_VGCC_like VWA Voltage gated Calcium channel like: Voltage-gated calcium channels are a complex of five proteins: alpha 1, beta 1, gamma, alpha 2 and delta. The alpha 2 and delta subunits result from proteolytic processing of a single gene product and carries at its N-terminus the VWA and cache domains, The alpha 2 delta gene family has orthologues in D. melanogaster and C. elegans but none have been detected in aither A. thaliana or yeast. The exact biochemical function of the VWA domain is not known but the alpha 2 delta complex has been shown to regulate various functional properties of the channel complex. 
Probab=99.87 E-value=9.1e-21 Score=143.73 Aligned_cols=166 Identities=20% Similarity=0.272 Sum_probs=127.0 Q ss_pred CCCCCCEEEEECCCCCCCCCCCCCHHHHHHHHHHHHHHCCCCCCCCCCCEEEEEEEEEECCCCEEECCC---------CC Q ss_conf 346740699852763115667872148989999987500233555531104798788636741781666---------55 Q gi|254780833|r 165 SDIGLDMMMVLDVSLSMNDHFGPGMDKLGVATRSIREMLDIIKSIPDVNNVVRSGLVTFSSKIVQTFPL---------AW 235 (371) Q Consensus 165 ~~~~idi~~viD~SgSm~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~f~~~~~~~~~l---------t~ 235 (371) ...|.|+++|+|.||||.. .+++.+++++..+++.+...+ +++++.|++.+....|. .. T Consensus 10 ~~~Pkdvv~vlD~SGSM~g------~kl~~ak~a~~~il~~L~~~D------~~~iv~Fs~~~~~~~p~~~~~~~~~t~~ 77 (190) T cd01463 10 ATSPKDIVILLDVSGSMTG------QRLHLAKQTVSSILDTLSDND------FFNIITFSNEVNPVVPCFNDTLVQATTS 77 (190) T ss_pred CCCCCEEEEEEECCCCCCC------CHHHHHHHHHHHHHHHCCCCC------EEEEEEECCCCEEEECCCCCCEEECCHH T ss_conf 7898269999979998897------349999999999998199877------9999996897536302456843368999 Q ss_pred CHHHHHHHHHCCCCCCCCCCCCHHHHHHHHHHHHHHCCCCCCCCCCCCCEEEEEEECCCCCCCCCCHHHHHHHHHHHH-- Q ss_conf 878999997401568874564237889999874211012346777666169999840458888889789999999999-- Q gi|254780833|r 236 GVQHIQEKINRLIFGSTTKSTPGLEYAYNKIFDAKEKLEHIAKGHDDYKKYIIFLTDGENSSPNIDNKESLFYCNEAK-- 313 (371) Q Consensus 236 ~~~~~~~~i~~l~~~g~T~~~~~~~~~~~~l~~~~~~~~~~~~~~~~~~k~ivl~TDG~~~~~~~~~~~~~~~c~~~k-- 313 (371) ++..++.+|+.+.+.|+|++..|+..|++.|........ ........|.|||+|||.++.. 
..........+ T Consensus 78 n~~~~~~~i~~l~~~G~Tn~~~al~~A~~~l~~~~~~~~--~~~~~~~~~~IillTDG~~~~~----~~i~~~~~~~~~~ 151 (190) T cd01463 78 NKKVLKEALDMLEAKGIANYTKALEFAFSLLLKNLQSNH--SGSRSQCNQAIMLITDGVPENY----KEIFDKYNWDKNS 151 (190) T ss_pred HHHHHHHHHHHCCCCCCCHHHHHHHHHHHHHHHHHCCCC--CCCCCCCCCEEEEEECCCCCCH----HHHHHHHHHHHCC T ss_conf 999999999828579872489999999999987420155--6655555515999836988757----8899999997557 Q ss_pred HCCCEEEEEEECCC-CCHHHHHHHC--CCCEEEEECCH Q ss_conf 78987999994186-4279899833--89808982898 Q gi|254780833|r 314 RRGAIVYAIGVQAE-AADQFLKNCA--SPDRFYSVQNS 348 (371) Q Consensus 314 ~~gi~i~tIg~~~~-~~~~~l~~cA--s~~~~y~~~~~ 348 (371) ..+|+|||+|||.+ .+.++|+.+| +.|+||++.+. T Consensus 152 ~~~i~ift~G~G~~~~d~~~L~~iA~~~~G~y~~I~~~ 189 (190) T cd01463 152 EIPVRVFTYLIGREVTDRREIQWMACENKGYYSHIQSL 189 (190) T ss_pred CCCEEEEEEEECCCCCCHHHHHHHHHCCCCEEEECCCC T ss_conf 99879999996799778799999998099569978889 No 8 >cd01470 vWA_complement_factors Complement factors B and C2 are two critical proteases for complement activation. They both contain three CCP or Sushi domains, a trypsin-type serine protease domain and a single VWA domain with a conserved metal ion dependent adhesion site referred commonly as the MIDAS motif. Orthologues of these molecules are found from echinoderms to chordates. During complement activation, the CCP domains are cleaved off, resulting in the formation of an active protease that cleaves and activates complement C3. Complement C2 is in the classical pathway and complement B is in the alternative pathway. The interaction of C2 with C4 and of factor B with C3b are both dependent on Mg2+ binding sites within the VWA domains and the VWA domain of factor B has been shown to mediate the binding of C3. This is consistent with the common inferred function of VWA domains as magnesium-dependent protein interaction domains. 
Probab=99.87 E-value=2.4e-20 Score=141.20 Aligned_cols=176 Identities=17% Similarity=0.317 Sum_probs=129.9 Q ss_pred CCEEEEECCCCCCCCCCCCCHHHHHHHHHHHHHHCCCCCCCCCCCEEEEEEEEEECCCCEEECCCC----CCHHHHHHHH Q ss_conf 406998527631156678721489899999875002335555311047987886367417816665----5878999997 Q gi|254780833|r 169 LDMMMVLDVSLSMNDHFGPGMDKLGVATRSIREMLDIIKSIPDVNNVVRSGLVTFSSKIVQTFPLA----WGVQHIQEKI 244 (371) Q Consensus 169 idi~~viD~SgSm~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~f~~~~~~~~~lt----~~~~~~~~~i 244 (371) +|++|++|.|+|++. +..+..++++..+++.+..... .+|++++.|++.+...+++. .+...+...| T Consensus 1 lDivfllD~SgSIg~------~nF~~~k~Fv~~lv~~~~~~~~---~~rvgvv~ys~~~~~~f~l~~~~~~~~~~~~~~i 71 (198) T cd01470 1 LNIYIALDASDSIGE------EDFDEAKNAIKTLIEKISSYEV---SPRYEIISYASDPKEIVSIRDFNSNDADDVIKRL 71 (198) T ss_pred CEEEEEEECCCCCCH------HHHHHHHHHHHHHHHHHCCCCC---CCEEEEEEECCCCEEEEECCCCCCCCHHHHHHHH T ss_conf 919999979898887------8899999999999998446687---7538999815885389971576666899999999 Q ss_pred HCCCC-----CCCCCCCCHHHHHHHHHHHHHHCCCCCCCCCCCCCEEEEEEECCCCCCCCCCHHHHH---------HHHH Q ss_conf 40156-----887456423788999987421101234677766616999984045888888978999---------9999 Q gi|254780833|r 245 NRLIF-----GSTTKSTPGLEYAYNKIFDAKEKLEHIAKGHDDYKKYIIFLTDGENSSPNIDNKESL---------FYCN 310 (371) Q Consensus 245 ~~l~~-----~g~T~~~~~~~~~~~~l~~~~~~~~~~~~~~~~~~k~ivl~TDG~~~~~~~~~~~~~---------~~c~ 310 (371) +.+.. +++|++..+++..+..+...... .....++.+|++||+|||..|.+..+..... .... 
T Consensus 72 ~~i~y~~~~~~~gT~t~~AL~~~~~~~~~~~~~---~~~~~~~v~~v~illTDG~sn~g~~P~~~~~~~~~~~~~~~~a~ 148 (198) T cd01470 72 EDFNYDDHGDKTGTNTAAALKKVYERMALEKVR---NKEAFNETRHVIILFTDGKSNMGGSPLPTVDKIKNLVYKNNKSD 148 (198) T ss_pred HHCCCCCCCCCCCHHHHHHHHHHHHHHHHHHHC---CCCCCCCCCEEEEEECCCCCCCCCCCHHHHHHHHHHHHHHHHHH T ss_conf 846033577886468999999999986555304---66444567559999737854578993367888777664101456 Q ss_pred HHHHCCCEEEEEEECCCCCHHHHHHHCC--CC--EEEEECCHHHHHHHHH Q ss_conf 9997898799999418642798998338--98--0898289899999999 Q gi|254780833|r 311 EAKRRGAIVYAIGVQAEAADQFLKNCAS--PD--RFYSVQNSRKLHDAFL 356 (371) Q Consensus 311 ~~k~~gi~i~tIg~~~~~~~~~l~~cAs--~~--~~y~~~~~~~L~~af~ 356 (371) .+++.||.||+||+|.+.+...|+.+|| |+ |+|.+++.++|+++|+ T Consensus 149 ~~r~~gi~ifaiGVG~~~d~~eL~~IAS~~~~e~hvf~v~df~~L~~i~d 198 (198) T cd01470 149 NPREDYLDVYVFGVGDDVNKEELNDLASKKDNERHFFKLKDYEDLQEVFD 198 (198) T ss_pred HHHHCCCEEEEEEECCCCCHHHHHHHHCCCCCCCEEEEECCHHHHHHHHC T ss_conf 78873947999996661599999998579998716999689999998639 No 9 >cd01475 vWA_Matrilin VWA_Matrilin: In cartilaginous plate, extracellular matrix molecules mediate cell-matrix and matrix-matrix interactions thereby providing tissue integrity. Some members of the matrilin family are expressed specifically in developing cartilage rudiments. The matrilin family consists of at least four members. All the members of the matrilin family contain VWA domains, EGF-like domains and a heptad repeat coiled-coiled domain at the carboxy terminus which is responsible for the oligomerization of the matrilins. The VWA domains have been shown to be essential for matrilin network formation by interacting with matrix ligands. 
Probab=99.87 E-value=5e-20 Score=139.25 Aligned_cols=178 Identities=22% Similarity=0.388 Sum_probs=140.2 Q ss_pred CCCCEEEEECCCCCCCCCCCCCHHHHHHHHHHHHHHCCCCCCCCCCCEEEEEEEEEECCCCEEECCCC--CCHHHHHHHH Q ss_conf 67406998527631156678721489899999875002335555311047987886367417816665--5878999997 Q gi|254780833|r 167 IGLDMMMVLDVSLSMNDHFGPGMDKLGVATRSIREMLDIIKSIPDVNNVVRSGLVTFSSKIVQTFPLA--WGVQHIQEKI 244 (371) Q Consensus 167 ~~idi~~viD~SgSm~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~f~~~~~~~~~lt--~~~~~~~~~i 244 (371) .|+|++|++|.|+|++. +.....++.+..+++.++. ..+.+|++++.|++.+....+|. .++..+..+| T Consensus 1 gplDlvFllD~S~Svg~------~nF~~~k~Fv~~lv~~f~I---~~~~trVgvv~ys~~~~~~f~l~~~~~k~~l~~aI 71 (224) T cd01475 1 GPTDLVFLIDSSRSVRP------ENFELVKQFLNQIIDSLDV---GPDATRVGLVQYSSTVKQEFPLGRFKSKADLKRAV 71 (224) T ss_pred CCEEEEEEEECCCCCCH------HHHHHHHHHHHHHHHHCCC---CCCCEEEEEEEECCCEEEEEECCCCCCHHHHHHHH T ss_conf 97439999948899898------9999999999999985687---99852999999658278999668867889999999 Q ss_pred HCCCC-CCCCCCCCHHHHHHHHHHHHHHCCCCCCCCCCCCCEEEEEEECCCCCCCCCCHHHHHHHHHHHHHCCCEEEEEE Q ss_conf 40156-88745642378899998742110123467776661699998404588888897899999999997898799999 Q gi|254780833|r 245 NRLIF-GSTTKSTPGLEYAYNKIFDAKEKLEHIAKGHDDYKKYIIFLTDGENSSPNIDNKESLFYCNEAKRRGAIVYAIG 323 (371) Q Consensus 245 ~~l~~-~g~T~~~~~~~~~~~~l~~~~~~~~~~~~~~~~~~k~ivl~TDG~~~~~~~~~~~~~~~c~~~k~~gi~i~tIg 323 (371) +.+.. +|+|++..|+..+.+.++....... ....+..|++|+||||..++. 
.......+|++||+||+|| T Consensus 72 ~~i~~~gggT~Tg~AL~~~~~~~f~~~~G~R---p~~~~vpkvlIviTDG~s~D~------v~~~A~~lr~~GV~ifaVG 142 (224) T cd01475 72 RRMEYLETGTMTGLAIQYAMNNAFSEAEGAR---PGSERVPRVGIVVTDGRPQDD------VSEVAAKARALGIEMFAVG 142 (224) T ss_pred HHHHCCCCCCHHHHHHHHHHHHHCCCCCCCC---CCCCCCCEEEEEECCCCCCCC------HHHHHHHHHHCCCEEEEEE T ss_conf 8636138844699999999997277023998---755689859999717987663------8999999998798899996 Q ss_pred ECCCCCHHHHHHHCC-C--CEEEEECCHHHHHHHHHHHHHHHH Q ss_conf 418642798998338-9--808982898999999999999640 Q gi|254780833|r 324 VQAEAADQFLKNCAS-P--DRFYSVQNSRKLHDAFLRIGKEMV 363 (371) Q Consensus 324 ~~~~~~~~~l~~cAs-~--~~~y~~~~~~~L~~af~~I~~~i~ 363 (371) ++ +.+...|+.+|| | .|+|.+++.++|...-+.|.++|. T Consensus 143 Vg-~~~~~eL~~IAs~P~~~hvf~v~~F~~l~~l~~~l~~~iC 184 (224) T cd01475 143 VG-RADEEELREIASEPLADHVFYVEDFSTIEELTKKFQGKIC 184 (224) T ss_pred CC-CCCHHHHHHHHCCCCHHCEEEECCHHHHHHHHHHHHHHHC T ss_conf 37-4798999998559737568994798899999999876118 No 10 >cd01480 vWA_collagen_alpha_1-VI-type VWA_collagen alpha(VI) type: The extracellular matrix represents a complex alloy of variable members of diverse protein families defining structural integrity and various physiological functions. The most abundant family is the collagens with more than 20 different collagen types identified thus far. Collagens are centrally involved in the formation of fibrillar and microfibrillar networks of the extracellular matrix, basement membranes as well as other structures of the extracellular matrix. Some collagens have about 15-18 vWA domains in them. The VWA domains present in these collagens mediate protein-protein interactions. 
Probab=99.86 E-value=8.6e-20 Score=137.85 Aligned_cols=171 Identities=19% Similarity=0.263 Sum_probs=130.9 Q ss_pred CCCCEEEEECCCCCCCCCCCCCHHHHHHHHHHHHHHCCCCCCC---CCCCEEEEEEEEEECCCCEEECCCC---CCHHHH Q ss_conf 6740699852763115667872148989999987500233555---5311047987886367417816665---587899 Q gi|254780833|r 167 IGLDMMMVLDVSLSMNDHFGPGMDKLGVATRSIREMLDIIKSI---PDVNNVVRSGLVTFSSKIVQTFPLA---WGVQHI 240 (371) Q Consensus 167 ~~idi~~viD~SgSm~~~~~~~~~~~~~~~~~~~~~~~~~~~~---~~~~~~~~~~~~~f~~~~~~~~~lt---~~~~~~ 240 (371) .|+|++|++|.|+|++. ...+..++.+..+++.+... +..++.+|+|++.|++.+....++. .+...+ T Consensus 1 gpvDlvFllD~S~Sv~~------~~F~~~k~Fv~~lv~~f~~~~~~~i~~~~~rVgvv~ys~~~~~~~~~~~~~~~~~~l 74 (186) T cd01480 1 GPVDITFVLDSSESVGL------QNFDITKNFVKRVAERFLKDYYRKDPAGSWRVGVVQYSDQQEVEAGFLRDIRNYTSL 74 (186) T ss_pred CCEEEEEEEECCCCCCH------HHHHHHHHHHHHHHHHHHHHCCCCCCCCCEEEEEEEECCCEEEEECCCCCCCCHHHH T ss_conf 97469999968898787------899999999999999985301345687743898998558427986047775889999 Q ss_pred HHHHHCCC-CCCCCCCCCHHHHHHHHHHHHHHCCCCCCCCCCCCCEEEEEEECCCCCCCCCCHHHHHHHHHHHHHCCCEE Q ss_conf 99974015-68874564237889999874211012346777666169999840458888889789999999999789879 Q gi|254780833|r 241 QEKINRLI-FGSTTKSTPGLEYAYNKIFDAKEKLEHIAKGHDDYKKYIIFLTDGENSSPNIDNKESLFYCNEAKRRGAIV 319 (371) Q Consensus 241 ~~~i~~l~-~~g~T~~~~~~~~~~~~l~~~~~~~~~~~~~~~~~~k~ivl~TDG~~~~~~~~~~~~~~~c~~~k~~gi~i 319 (371) +..|+.+. .+|+|++..|+.++.+.+... ..+...|++||+|||..+.... 
.......+.+|+.||+| T Consensus 75 ~~~I~~i~y~gG~T~tg~AL~~a~~~~~~~---------~r~~~~kvlvliTDG~S~~~~~--~~~~~aa~~lr~~GV~i 143 (186) T cd01480 75 KEAVDNLEYIGGGTFTDCALKYATEQLLEG---------SHQKENKFLLVITDGHSDGSPD--GGIEKAVNEADHLGIKI 143 (186) T ss_pred HHHHHHHHCCCCCCHHHHHHHHHHHHHHHC---------CCCCCCEEEEEEECCCCCCCCC--HHHHHHHHHHHHCCCEE T ss_conf 999975013589862999999999998613---------6789853899984587666740--66999999999879899 Q ss_pred EEEEECCCCCHHHHHHHCC-CCEEEEECCHHHHHHHH Q ss_conf 9999418642798998338-98089828989999999 Q gi|254780833|r 320 YAIGVQAEAADQFLKNCAS-PDRFYSVQNSRKLHDAF 355 (371) Q Consensus 320 ~tIg~~~~~~~~~l~~cAs-~~~~y~~~~~~~L~~af 355 (371) |+||+|.. +.+.|+.||+ |++.|.++|.++|.+-| T Consensus 144 faVGVG~~-~~~eL~~IAs~p~~~~~~~~f~~L~~~~ 179 (186) T cd01480 144 FFVAVGSQ-NEEPLSRIACDGKSALYRENFAELLWSF 179 (186) T ss_pred EEEEECCC-CHHHHHHHHCCCCCEEEECCHHHHHCCH T ss_conf 99994748-8799999858997389736899870111 No 11 >cd01464 vWA_subfamily VWA subfamily: Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. 
Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if Probab=99.86 E-value=3e-20 Score=140.60 Aligned_cols=173 Identities=21% Similarity=0.228 Sum_probs=134.8 Q ss_pred CCCEEEEECCCCCCCCCCCCCHHHHHHHHHHHHHHCCCCCCCCCCCEEEEEEEEEECCCCEEECCCCCCHHHHHHHHHCC Q ss_conf 74069985276311566787214898999998750023355553110479878863674178166655878999997401 Q gi|254780833|r 168 GLDMMMVLDVSLSMNDHFGPGMDKLGVATRSIREMLDIIKSIPDVNNVVRSGLVTFSSKIVQTFPLAWGVQHIQEKINRL 247 (371) Q Consensus 168 ~idi~~viD~SgSm~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~f~~~~~~~~~lt~~~~~~~~~i~~l 247 (371) .+.++||+|.||||.. ++++.+++++..+++.+...+.....+++++++|++.+....|++.- ..-.+..+ T Consensus 3 rlpvvlvlD~SGSM~G------~~i~~~k~al~~~~~~L~~d~~a~~~~~vsVItF~s~a~~~~pl~~~---~~~~~~~L 73 (176) T cd01464 3 RLPIYLLLDTSGSMAG------EPIEALNQGLQMLQSELRQDPYALESVEISVITFDSAARVIVPLTPL---ESFQPPRL 73 (176) T ss_pred CCCEEEEEECCCCCCC------HHHHHHHHHHHHHHHHHHCCCCCHHEEEEEEEEECCCEEEECCCCCH---HHCCCCCC T ss_conf 3578999978999998------47999999999999997118310113269999978951780586347---66475547 Q ss_pred CCCCCCCCCCHHHHHHHHHHHHHHCCCCCCCCCCCCCEEEEEEECCCCCCCCCCHHHHHHHHHHHHHCCCEEEEEEECCC Q ss_conf 56887456423788999987421101234677766616999984045888888978999999999978987999994186 Q gi|254780833|r 248 IFGSTTKSTPGLEYAYNKIFDAKEKLEHIAKGHDDYKKYIIFLTDGENSSPNIDNKESLFYCNEAKRRGAIVYAIGVQAE 327 (371) Q Consensus 248 ~~~g~T~~~~~~~~~~~~l~~~~~~~~~~~~~~~~~~k~ivl~TDG~~~~~~~~~~~~~~~c~~~k~~gi~i~tIg~~~~ 327 (371) .++|+|++..|+..+...|........ 
.......+++|||||||.+|+ ++......+..++..+++|+++|+|.+ T Consensus 74 ~a~G~T~~g~al~~a~~~l~~~~~~~~--~~~~~~~~P~I~LlTDG~PtD---~~~~~~~~~~~~~~~~~~i~a~giG~d 148 (176) T cd01464 74 TASGGTSMGAALELALDCIDRRVQRYR--ADQKGDWRPWVFLLTDGEPTD---DLTAAIERIKEARDSKGRIVACAVGPK 148 (176) T ss_pred CCCCCCHHHHHHHHHHHHHHHHHHHCC--CCCCCCCCEEEEEECCCCCCC---CHHHHHHHHHHHHHCCCEEEEEEEECC T ss_conf 778998199999999999998652236--556677531799966899887---589999999988863976999997387 Q ss_pred CCHHHHHHHCCCCEEEEECCHHHHHHHHH Q ss_conf 42798998338980898289899999999 Q gi|254780833|r 328 AADQFLKNCASPDRFYSVQNSRKLHDAFL 356 (371) Q Consensus 328 ~~~~~l~~cAs~~~~y~~~~~~~L~~af~ 356 (371) ++.+.|++++.. .....+..++.+.|+ T Consensus 149 ad~~~L~~is~~--~~~~~~~~~f~~ff~ 175 (176) T cd01464 149 ADLDTLKQITEG--VPLLDDALSGLNFFK 175 (176) T ss_pred CCHHHHHHHHCC--CCCCCCHHHHHHHHC T ss_conf 189999988577--745345345888508 No 12 >cd01474 vWA_ATR ATR (Anthrax Toxin Receptor): Anthrax toxin is a key virulence factor for Bacillus anthracis, the causative agent of anthrax. ATR is the cellular receptor for the anthrax protective antigen and facilitates entry of the toxin into cells. The VWA domain in ATR contains the toxin binding site and mediates interaction with protective antigen. The binding is mediated by divalent cations that binds to the MIDAS motif. These proteins are a family of vertebrate ECM receptors expressed by endothelial cells. Probab=99.85 E-value=2e-19 Score=135.66 Aligned_cols=179 Identities=23% Similarity=0.295 Sum_probs=135.0 Q ss_pred CCCCCCEEEEECCCCCCCCCCCCCHHHHHHHHHHHHHHCCCCCCCCCCCEEEEEEEEEECCCCEEECCCCCCHHHHHHH- Q ss_conf 3467406998527631156678721489899999875002335555311047987886367417816665587899999- Q gi|254780833|r 165 SDIGLDMMMVLDVSLSMNDHFGPGMDKLGVATRSIREMLDIIKSIPDVNNVVRSGLVTFSSKIVQTFPLAWGVQHIQEK- 243 (371) Q Consensus 165 ~~~~idi~~viD~SgSm~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~f~~~~~~~~~lt~~~~~~~~~- 243 (371) +..++|++|++|.|+|++.+|. + .++.+..+++.+. ...+|+|++.|++.+....+|+.+.+..... 
T Consensus 1 C~~~~DivFllD~S~Sv~~~f~----~---~~~Fv~~lv~~f~-----~~~~rvgvv~fS~~~~~~f~l~~~~~~~~~~~ 68 (185) T cd01474 1 CAGHFDLYFVLDKSGSVAANWI----E---IYDFVEQLVDRFN-----SPGLRFSFITFSTRATKILPLTDDSSAIIKGL 68 (185) T ss_pred CCCCEEEEEEEECCCCCCCCHH----H---HHHHHHHHHHHCC-----CCCEEEEEEEECCCCCEEEECCCCHHHHHHHH T ss_conf 9986138999978998765769----9---9999999998569-----98749999998698318984578707889999 Q ss_pred --HHCCCCCCCCCCCCHHHHHHHHHHHHHHCCCCCCCCCCCCCEEEEEEECCCCCCCCCCHHHHHHHHHHHHHCCCEEEE Q ss_conf --740156887456423788999987421101234677766616999984045888888978999999999978987999 Q gi|254780833|r 244 --INRLIFGSTTKSTPGLEYAYNKIFDAKEKLEHIAKGHDDYKKYIIFLTDGENSSPNIDNKESLFYCNEAKRRGAIVYA 321 (371) Q Consensus 244 --i~~l~~~g~T~~~~~~~~~~~~l~~~~~~~~~~~~~~~~~~k~ivl~TDG~~~~~~~~~~~~~~~c~~~k~~gi~i~t 321 (371) ++...++|+|++..|+..+.+.++.... ++....|++|++|||..++.... ........+|+.||.||+ T Consensus 69 ~~~~~~~~~G~T~tg~AL~~a~~~~f~~~~-------g~R~~~kvlivlTDG~s~~~~~~--~~~~~a~~lr~~gV~i~a 139 (185) T cd01474 69 EVLKKVTPSGQTYIHEGLENANEQIFNRNG-------GGRETVSVIIALTDGQLLLNGHK--YPEHEAKLSRKLGAIVYC 139 (185) T ss_pred HHHHHHCCCCCCHHHHHHHHHHHHHHCCCC-------CCCCCCEEEEEEECCCCCCCCCH--HHHHHHHHHHHCCCEEEE T ss_conf 998876158937899999999997503236-------99887628999932665676214--179999999978948999 Q ss_pred EEECCCCCHHHHHHHCC-CCEEEEECC-HHHHHHHHHHHHHHHHHE Q ss_conf 99418642798998338-980898289-899999999999964000 Q gi|254780833|r 322 IGVQAEAADQFLKNCAS-PDRFYSVQN-SRKLHDAFLRIGKEMVKQ 365 (371) Q Consensus 322 Ig~~~~~~~~~l~~cAs-~~~~y~~~~-~~~L~~af~~I~~~i~~~ 365 (371) ||++ +.++..|+.+|| |.|.|.+++ .++|......|.+++... T Consensus 140 VGV~-~~~~~eL~~IAs~p~~vf~v~~~F~~L~~i~~~l~~~iC~~ 184 (185) T cd01474 140 VGVT-DFLKSQLINIADSKEYVFPVTSGFQALSGIIESVVKKACIE 184 (185) T ss_pred EECC-CCCHHHHHHHHCCCCEEEECCCCHHHHHHHHHHHHHHHCCC T ss_conf 9716-25999999871998648983475777899999999852879 No 13 >cd01451 vWA_Magnesium_chelatase Magnesium chelatase: Mg-chelatase catalyses the insertion of Mg into protoporphyrin IX (Proto). 
In chlorophyll biosynthesis, insertion of Mg2+ into protoporphyrin IX is catalysed by magnesium chelatase in an ATP-dependent reaction. Magnesium chelatase is a three sub-unit (BchI, BchD and BchH) enzyme with a novel arrangement of domains: the C-terminal helical domain is located behind the nucleotide binding site. The BchD domain contains a AAA domain at its N-terminus and a VWA domain at its C-terminus. The VWA domain has been speculated to be involved in mediating protein-protein interactions. Probab=99.83 E-value=1.5e-18 Score=130.33 Aligned_cols=165 Identities=18% Similarity=0.278 Sum_probs=130.1 Q ss_pred EEEEECCCCCCCCCCCCCHHHHHHHHHHHHHHCC-CCCCCCCCCEEEEEEEEEEC-CCCEEECCCCCCHHHHHHHHHCCC Q ss_conf 6998527631156678721489899999875002-33555531104798788636-741781666558789999974015 Q gi|254780833|r 171 MMMVLDVSLSMNDHFGPGMDKLGVATRSIREMLD-IIKSIPDVNNVVRSGLVTFS-SKIVQTFPLAWGVQHIQEKINRLI 248 (371) Q Consensus 171 i~~viD~SgSm~~~~~~~~~~~~~~~~~~~~~~~-~~~~~~~~~~~~~~~~~~f~-~~~~~~~~lt~~~~~~~~~i~~l~ 248 (371) ++||+|.||||... +|+..++.++..++. .+.. ..++++++|. +.+....|+|.+...++..|+.+. T Consensus 3 vvfvvD~SGSM~~~-----~rl~~aK~a~~~ll~d~~~~------~D~v~lv~F~g~~a~~~lppT~~~~~~~~~l~~L~ 71 (178) T cd01451 3 VIFVVDASGSMAAR-----HRMAAAKGAVLSLLRDAYQR------RDKVALIAFRGTEAEVLLPPTRSVELAKRRLARLP 71 (178) T ss_pred EEEEEECCCCCCCC-----CHHHHHHHHHHHHHHHHCCC------CCEEEEEEECCCCCEEECCCCCCHHHHHHHHHCCC T ss_conf 99999898788875-----67999999999999974346------78899999759755585688765799999872167 Q ss_pred CCCCCCCCCHHHHHHHHHHHHHHCCCCCCCCCCCCCEEEEEEECCCCCCCCCCHHH-HHHHHHHHHHCCCEEEEEEECCC Q ss_conf 68874564237889999874211012346777666169999840458888889789-99999999978987999994186 Q gi|254780833|r 249 FGSTTKSTPGLEYAYNKIFDAKEKLEHIAKGHDDYKKYIIFLTDGENSSPNIDNKE-SLFYCNEAKRRGAIVYAIGVQAE 327 (371) Q Consensus 249 ~~g~T~~~~~~~~~~~~l~~~~~~~~~~~~~~~~~~k~ivl~TDG~~~~~~~~~~~-~~~~c~~~k~~gi~i~tIg~~~~ 327 (371) ++|+|++..|+..++..+.... ..+...+++||+|||.+|.+..+... 
...++...++.||...+|+++.+ T Consensus 72 ~gG~T~l~~gL~~a~~~~~~~~--------~~~~~~~~iiLlTDG~~N~g~~~~~~~~~~~a~~~~~~gi~~~vId~~~~ 143 (178) T cd01451 72 TGGGTPLAAGLLAAYELAAEQA--------RDPGQRPLIVVITDGRANVGPDPTADRALAAARKLRARGISALVIDTEGR 143 (178) T ss_pred CCCCCCHHHHHHHHHHHHHHHC--------CCCCCCEEEEEECCCCCCCCCCCHHHHHHHHHHHHHHCCCCEEEEECCCC T ss_conf 8898519999999999999850--------27898439999846986679995126999999999866997899979999 Q ss_pred C-CHHHHHHHC--CCCEEEEECCH--HHHHHH Q ss_conf 4-279899833--89808982898--999999 Q gi|254780833|r 328 A-ADQFLKNCA--SPDRFYSVQNS--RKLHDA 354 (371) Q Consensus 328 ~-~~~~l~~cA--s~~~~y~~~~~--~~L~~a 354 (371) . ...+|+++| .+|+||..++. ++|.++ T Consensus 144 ~~~~~~~~~LA~~~~g~Y~~id~l~~~~i~~~ 175 (178) T cd01451 144 PVRRGLAKDLARALGGQYVRLPDLSADAIASA 175 (178) T ss_pred CCCHHHHHHHHHHCCCCEEECCCCCHHHHHHH T ss_conf 76748999999942996998997998899998 No 14 >pfam00092 VWA von Willebrand factor type A domain. Probab=99.79 E-value=2.5e-17 Score=123.01 Aligned_cols=169 Identities=23% Similarity=0.369 Sum_probs=128.2 Q ss_pred CEEEEECCCCCCCCCCCCCHHHHHHHHHHHHHHCCCCCCCCCCCEEEEEEEEEECCCCEEECCCCCCH--HHHHHHH--H Q ss_conf 06998527631156678721489899999875002335555311047987886367417816665587--8999997--4 Q gi|254780833|r 170 DMMMVLDVSLSMNDHFGPGMDKLGVATRSIREMLDIIKSIPDVNNVVRSGLVTFSSKIVQTFPLAWGV--QHIQEKI--N 245 (371) Q Consensus 170 di~~viD~SgSm~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~f~~~~~~~~~lt~~~--~~~~~~i--~ 245 (371) |++||+|.|+||.. .++..+++++..+++.+. ......|++++.|.+.+....||+... ..+...+ . 
T Consensus 1 Di~fvlD~S~Sm~~------~~~~~~k~~~~~~i~~~~---~~~~~~rv~lv~f~~~~~~~~~l~~~~~~~~~~~~~~~~ 71 (177) T pfam00092 1 DIVFLLDGSGSIGE------ANFEKVKEFIKKLVENLD---IGPDGTRVGLVQYSSDVTTEFSLNDYKSKDDLLSAVLRN 71 (177) T ss_pred CEEEEEECCCCCCH------HHHHHHHHHHHHHHHHHC---CCCCCCEEEEEEECCCEEEEECCCCCCCHHHHHHHHHHH T ss_conf 98999968799886------889999999999999836---588752899999458458996178868999999998643 Q ss_pred CCCCCCCCCCCCHHHHHHHHHHHHHHCCCCCCCCCCCCCEEEEEEECCCCCCCCCCHHHHHHHHHHHHHCCCEEEEEEEC Q ss_conf 01568874564237889999874211012346777666169999840458888889789999999999789879999941 Q gi|254780833|r 246 RLIFGSTTKSTPGLEYAYNKIFDAKEKLEHIAKGHDDYKKYIIFLTDGENSSPNIDNKESLFYCNEAKRRGAIVYAIGVQ 325 (371) Q Consensus 246 ~l~~~g~T~~~~~~~~~~~~l~~~~~~~~~~~~~~~~~~k~ivl~TDG~~~~~~~~~~~~~~~c~~~k~~gi~i~tIg~~ 325 (371) ....+|+|++..|+..+.+.+... ....++.+|++||+|||.++.+........ ..+++.||++|+||+| T Consensus 72 ~~~~~g~t~~~~al~~a~~~~~~~-------~~~r~~~~k~vvllTDG~~~~~~~~~~~~~---~~~~~~gI~v~~vG~g 141 (177) T pfam00092 72 IYYLGGGTNTGKALKYALENLFRS-------AGSRPNAPKVVILLTDGKSNDGGLVPAAAA---ALRRKVGIIVFGVGVG 141 (177) T ss_pred CCCCCCCCHHHHHHHHHHHHHHHC-------CCCCCCCEEEEEEEECCCCCCCCCCHHHHH---HHHHHCCCEEEEEECC T ss_conf 157899565999999999998635-------478878726899983698788864699999---9999789589999747 Q ss_pred CCCCHHHHHHHCC----CCEEEEECCHHHHHHHHHHH Q ss_conf 8642798998338----98089828989999999999 Q gi|254780833|r 326 AEAADQFLKNCAS----PDRFYSVQNSRKLHDAFLRI 358 (371) Q Consensus 326 ~~~~~~~l~~cAs----~~~~y~~~~~~~L~~af~~I 358 (371) +.+...|+.+|+ .+++|.+.+.++|.+.++.| T Consensus 142 -~~~~~~L~~ia~~~~~~~~~~~~~~~~~l~~~~~~i 177 (177) T pfam00092 142 -DVDEEELRLIASEPCSEGHVFYVTDFDALSDIQEEL 177 (177) T ss_pred -CCCHHHHHHHHCCCCCCCEEEEECCHHHHHHHHHHC T ss_conf -448999999968999898599958989999999619 No 15 >smart00327 VWA von Willebrand factor (vWF) type A domain. VWA domains in extracellular eukaryotic proteins mediate adhesion via metal ion-dependent adhesion sites (MIDAS). 
Intracellular VWA domains and homologues in prokaryotes have recently been identified. The proposed VWA domains in integrin beta subunits have recently been substantiated using sequence-based methods. Probab=99.79 E-value=3.6e-17 Score=122.04 Aligned_cols=158 Identities=28% Similarity=0.429 Sum_probs=124.3 Q ss_pred CCCEEEEECCCCCCCCCCCCCHHHHHHHHHHHHHHCCCCCCCCCCCEEEEEEEEEECCCCEEECCC--CCCHHHHHHHHH Q ss_conf 740699852763115667872148989999987500233555531104798788636741781666--558789999974 Q gi|254780833|r 168 GLDMMMVLDVSLSMNDHFGPGMDKLGVATRSIREMLDIIKSIPDVNNVVRSGLVTFSSKIVQTFPL--AWGVQHIQEKIN 245 (371) Q Consensus 168 ~idi~~viD~SgSm~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~f~~~~~~~~~l--t~~~~~~~~~i~ 245 (371) ++|+++++|.|+||.. .+++.+++++..+++.+...+ ...+++++.|++.+....|+ ..+...+...|+ T Consensus 1 ~~di~~vvD~S~SM~~------~~~~~~k~~~~~~i~~l~~~~---~~~~v~vv~f~~~~~~~~~~~~~~~~~~~~~~i~ 71 (177) T smart00327 1 PLDVVFLLDGSGSMGP------NRFEKAKEFVLKLVEQLDIGP---DGDRVGLVTFSDDATVLFPLNDSRSKDALLEALA 71 (177) T ss_pred CCEEEEEEECCCCCCC------HHHHHHHHHHHHHHHHHHCCC---CCCEEEEEEECCCEEEEECCCCCCCHHHHHHHHH T ss_conf 9489999928899882------899999999999999864179---9878999996372689976888689999999997 Q ss_pred CCCC--CCCCCCCCHHHHHHHHHHHHHHCCCCCCCCCCCCCEEEEEEECCCCCCCCCCHHHHHHHHHHHHHCCCEEEEEE Q ss_conf 0156--88745642378899998742110123467776661699998404588888897899999999997898799999 Q gi|254780833|r 246 RLIF--GSTTKSTPGLEYAYNKIFDAKEKLEHIAKGHDDYKKYIIFLTDGENSSPNIDNKESLFYCNEAKRRGAIVYAIG 323 (371) Q Consensus 246 ~l~~--~g~T~~~~~~~~~~~~l~~~~~~~~~~~~~~~~~~k~ivl~TDG~~~~~~~~~~~~~~~c~~~k~~gi~i~tIg 323 (371) .+.+ +|+|++..++.++.+.+...... 
..+..+|++|++|||.++.+ .........+++.||+||+|| T Consensus 72 ~l~~~~~g~t~~~~al~~a~~~~~~~~~~------~~~~~~~~iil~TDG~~~~~----~~~~~~~~~~~~~~v~i~~ig 141 (177) T smart00327 72 SLSYKLGGGTNLGAALQYALENLFSKSAG------SRRGAPKVLILITDGESNDG----GDLLKAAKELKRSGVKVFVVG 141 (177) T ss_pred HCCCCCCCCCCCHHHHHHHHHHHHHHHCC------CCCCCCEEEEEEECCCCCCC----HHHHHHHHHHHHCCCEEEEEE T ss_conf 14155788776428999999999766503------77887428999805887872----529999999986794899999 Q ss_pred ECCCCCHHHHHHHCC--CCEEEE Q ss_conf 418642798998338--980898 Q gi|254780833|r 324 VQAEAADQFLKNCAS--PDRFYS 344 (371) Q Consensus 324 ~~~~~~~~~l~~cAs--~~~~y~ 344 (371) +|.+.+...|+.+|+ ++.|+. T Consensus 142 ~g~~~~~~~l~~ia~~~~~~~~~ 164 (177) T smart00327 142 VGNDVDEEELKKLASAPGGVYVF 164 (177) T ss_pred ECCCCCHHHHHHHHHCCCCEEEE T ss_conf 58847999999998489965999 No 16 >cd01469 vWA_integrins_alpha_subunit Integrins are a class of adhesion receptors that link the extracellular matrix to the cytoskeleton and cooperate with growth factor receptors to promote cell survival, cell cycle progression and cell migration. Integrins consist of an alpha and a beta sub-unit. Each sub-unit has a large extracellular portion, a single transmembrane segment and a short cytoplasmic domain. The N-terminal domains of the alpha and beta subunits associate to form the integrin headpiece, which contains the ligand binding site, whereas the C-terminal segments traverse the plasma membrane and mediate interaction with the cytoskeleton and with signalling proteins. The VWA domains present in the alpha subunits of integrins seem to be a chordate-specific radiation of the gene family being found only in vertebrates. They mediate protein-protein interactions. 
Probab=99.78 E-value=1.7e-17 Score=124.03 Aligned_cols=166 Identities=24% Similarity=0.358 Sum_probs=127.4 Q ss_pred CCEEEEECCCCCCCCCCCCCHHHHHHHHHHHHHHCCCCCCCCCCCEEEEEEEEEECCCCEEECCCCC--CHHHHHHHHHC Q ss_conf 4069985276311566787214898999998750023355553110479878863674178166655--87899999740 Q gi|254780833|r 169 LDMMMVLDVSLSMNDHFGPGMDKLGVATRSIREMLDIIKSIPDVNNVVRSGLVTFSSKIVQTFPLAW--GVQHIQEKINR 246 (371) Q Consensus 169 idi~~viD~SgSm~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~f~~~~~~~~~lt~--~~~~~~~~i~~ 246 (371) +|++|++|.|+|++. +.....++.+..+++.+... ...+|+|++.|+..+....+|.. +...+...++. T Consensus 1 lDl~fllD~S~Sv~~------~~F~~~k~fi~~lv~~f~i~---~~~~rvglv~ys~~~~~~~~l~~~~~~~~~~~~i~~ 71 (177) T cd01469 1 MDIVFVLDGSGSIYP------DDFQKVKNFLSTVMKKLDIG---PTKTQFGLVQYSESFRTEFTLNEYRTKEEPLSLVKH 71 (177) T ss_pred CCEEEEEECCCCCCH------HHHHHHHHHHHHHHHHCCCC---CCCCEEEEEEECCCEEEEEECCCCCCHHHHHHHHHH T ss_conf 909999968899998------99999999999999866769---987489999936824999823556778999999862 Q ss_pred CC-CCCCCCCCCHHHHHHHHHHHHHHCCCCCCCCCCCCCEEEEEEECCCCCCCCCCHHHHHHHHHHHHHCCCEEEEEEEC Q ss_conf 15-68874564237889999874211012346777666169999840458888889789999999999789879999941 Q gi|254780833|r 247 LI-FGSTTKSTPGLEYAYNKIFDAKEKLEHIAKGHDDYKKYIIFLTDGENSSPNIDNKESLFYCNEAKRRGAIVYAIGVQ 325 (371) Q Consensus 247 l~-~~g~T~~~~~~~~~~~~l~~~~~~~~~~~~~~~~~~k~ivl~TDG~~~~~~~~~~~~~~~c~~~k~~gi~i~tIg~~ 325 (371) +. .+|+|++..|+.++.+.++.... ...++..|++|++|||..+++. 
......+.+|+.||+||+||+| T Consensus 72 i~~~~g~t~t~~AL~~a~~~~f~~~~------g~R~~~~kv~ivlTDG~s~d~~----~~~~~~~~lk~~gv~vf~VGvG 141 (177) T cd01469 72 ISQLLGLTNTATAIQYVVTELFSESN------GARKDATKVLVVITDGESHDDP----LLKDVIPQAEREGIIRYAIGVG 141 (177) T ss_pred CCCCCCCCCHHHHHHHHHHHHCCCCC------CCCCCCEEEEEEEECCCCCCCC----CHHHHHHHHHHCCEEEEEEEEC T ss_conf 30368975252799999998536455------8867871699999789867750----1499999999799089999955 Q ss_pred CCC----CHHHHHHHCC-C--CEEEEECCHHHHHH Q ss_conf 864----2798998338-9--80898289899999 Q gi|254780833|r 326 AEA----ADQFLKNCAS-P--DRFYSVQNSRKLHD 353 (371) Q Consensus 326 ~~~----~~~~l~~cAs-~--~~~y~~~~~~~L~~ 353 (371) ... +...|+.+|| | .|.|.+++.++|++ T Consensus 142 ~~~~~~~~~~eL~~iAs~P~~~hvf~~~~f~~L~~ 176 (177) T cd01469 142 GHFQRENSREELKTIASKPPEEHFFNVTDFAALKD 176 (177) T ss_pred CCCCCCCCHHHHHHHHCCCCHHCEEEECCHHHHCC T ss_conf 51467451999999967985871998379777646 No 17 >cd01450 vWFA_subfamily_ECM Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. 
Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A Probab=99.78 E-value=2e-17 Score=123.51 Aligned_cols=150 Identities=25% Similarity=0.402 Sum_probs=118.9 Q ss_pred CCEEEEECCCCCCCCCCCCCHHHHHHHHHHHHHHCCCCCCCCCCCEEEEEEEEEECCCCEEECCCCC--CHHHHHHHHHC Q ss_conf 4069985276311566787214898999998750023355553110479878863674178166655--87899999740 Q gi|254780833|r 169 LDMMMVLDVSLSMNDHFGPGMDKLGVATRSIREMLDIIKSIPDVNNVVRSGLVTFSSKIVQTFPLAW--GVQHIQEKINR 246 (371) Q Consensus 169 idi~~viD~SgSm~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~f~~~~~~~~~lt~--~~~~~~~~i~~ 246 (371) +|++|++|.|+||.. ++++.+++++..+++.+...+ ...|++++.|++.+....||+. +...+...|+. T Consensus 1 ~DivfvlD~S~Sm~~------~~~~~~k~~i~~~i~~~~~~~---~~~rv~lv~fs~~~~~~~~l~~~~~~~~l~~~i~~ 71 (161) T cd01450 1 LDIVFLLDGSESVGP------ENFEKVKDFIEKLVEKLDIGP---DKTRVGLVQYSDDVRVEFSLNDYKSKDDLLKAVKN 71 (161) T ss_pred CEEEEEEECCCCCCH------HHHHHHHHHHHHHHHHCCCCC---CCCEEEEEEECCCEEEEECCCCCCCHHHHHHHHHH T ss_conf 969999979899885------899999999999999705688---78589999955731687146564669999999984 Q ss_pred CCCC--CCCCCCCHHHHHHHHHHHHHHCCCCCCCCCCCCCEEEEEEECCCCCCCCCCHHHHHHHHHHHHHCCCEEEEEEE Q ss_conf 1568--87456423788999987421101234677766616999984045888888978999999999978987999994 Q gi|254780833|r 247 LIFG--STTKSTPGLEYAYNKIFDAKEKLEHIAKGHDDYKKYIIFLTDGENSSPNIDNKESLFYCNEAKRRGAIVYAIGV 324 (371) Q Consensus 247 l~~~--g~T~~~~~~~~~~~~l~~~~~~~~~~~~~~~~~~k~ivl~TDG~~~~~~~~~~~~~~~c~~~k~~gi~i~tIg~ 324 (371) +... ++|++..|+.++.+.+.... ...++.+|++||+|||.++.+. 
.....+..+|+.||+||+||+ T Consensus 72 l~~~~~~~t~~~~AL~~~~~~~~~~~-------~~r~~~~kvivllTDG~~~~~~----~~~~~a~~lk~~gi~v~~vgi 140 (161) T cd01450 72 LKYLGGGGTNTGKALQYALEQLFSES-------NARENVPKVIIVLTDGRSDDGG----DPKEAAAKLKDEGIKVFVVGV 140 (161) T ss_pred CCCCCCCCCCHHHHHHHHHHHHHHCC-------CCCCCCCEEEEEEECCCCCCCC----CHHHHHHHHHHCCCEEEEEEE T ss_conf 21368998548999999999986144-------6666675499998258878874----799999999988998999982 Q ss_pred CCCCCHHHHHHHCCC Q ss_conf 186427989983389 Q gi|254780833|r 325 QAEAADQFLKNCASP 339 (371) Q Consensus 325 ~~~~~~~~l~~cAs~ 339 (371) |. .+...|+.+|+. T Consensus 141 G~-~~~~~L~~iA~~ 154 (161) T cd01450 141 GP-ADEEELREIASC 154 (161) T ss_pred CC-CCHHHHHHHHCC T ss_conf 64-899999999779 No 18 >cd01473 vWA_CTRP CTRP for CS protein-TRAP-related protein: Adhesion of Plasmodium to host cells is an important phenomenon in parasite invasion and in malaria-associated pathology. CTRP encodes a protein containing a putative signal sequence followed by a long extracellular region of 1990 amino acids, a transmembrane domain, and a short cytoplasmic segment. The extracellular region of CTRP contains two separated adhesive domains. The first domain contains six 210-amino acid-long homologous VWA domain repeats. The second domain contains seven repeats of 87-60 amino acids in length, which share similarities with the thrombospondin type 1 domain found in a variety of adhesive molecules. Finally, CTRP also contains consensus motifs found in the superfamily of haematopoietin receptors. The VWA domains in these proteins likely mediate protein-protein interactions. 
Probab=99.78 E-value=7.9e-17 Score=119.94 Aligned_cols=178 Identities=20% Similarity=0.225 Sum_probs=130.3 Q ss_pred CCEEEEECCCCCCCCCCCCCHHHHHHHHHHHHHHCCCCCCCCCCCEEEEEEEEEECCCCEEECCCC----CCHHHHHHHH Q ss_conf 406998527631156678721489899999875002335555311047987886367417816665----5878999997 Q gi|254780833|r 169 LDMMMVLDVSLSMNDHFGPGMDKLGVATRSIREMLDIIKSIPDVNNVVRSGLVTFSSKIVQTFPLA----WGVQHIQEKI 244 (371) Q Consensus 169 idi~~viD~SgSm~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~f~~~~~~~~~lt----~~~~~~~~~i 244 (371) .|++|++|.|+|.+...+ -+..++.+..+++.++ ..++.+|+|++.|+..+....++. .+...+..+| T Consensus 1 ~DivFllD~S~SIg~~nf-----~~~v~~F~~~lv~~f~---Ig~~~~rvgvv~yS~~~~~~~~f~~~~~~~k~~~l~~i 72 (192) T cd01473 1 YDLTLILDESASIGYSNW-----RKDVIPFTEKIINNLN---ISKDKVHVGILLFAEKNRDVVPFSDEERYDKNELLKKI 72 (192) T ss_pred CCEEEEEECCCCCCHHHH-----HHHHHHHHHHHHHHCC---CCCCCEEEEEEEECCCCCEEEECCCCCCCCHHHHHHHH T ss_conf 978999938998666776-----9999999999998756---59896199999955887401323554434899999999 Q ss_pred HCCC----CCCCCCCCCHHHHHHHHHHHHHHCCCCCCCCCCCCCEEEEEEECCCCCCCCCCHHHHHHHHHHHHHCCCEEE Q ss_conf 4015----688745642378899998742110123467776661699998404588888897899999999997898799 Q gi|254780833|r 245 NRLI----FGSTTKSTPGLEYAYNKIFDAKEKLEHIAKGHDDYKKYIIFLTDGENSSPNIDNKESLFYCNEAKRRGAIVY 320 (371) Q Consensus 245 ~~l~----~~g~T~~~~~~~~~~~~l~~~~~~~~~~~~~~~~~~k~ivl~TDG~~~~~~~~~~~~~~~c~~~k~~gi~i~ 320 (371) +.+. .+|+|++..|+..+.+.+... ....++..|++|++|||..+..+. 
.........+|++||+|| T Consensus 73 ~~l~~~~~~gg~T~tg~AL~~~~~~~~~~-------~g~R~~vpkv~IvlTDG~s~~~~~--~~~~~~a~~lr~~gV~i~ 143 (192) T cd01473 73 NDLKNSYRSGGETYIVEALKYGLKNYTKH-------GNRRKDAPKVTMLFTDGNDTSASK--KELQDISLLYKEENVKLL 143 (192) T ss_pred HHHHHCCCCCCCCHHHHHHHHHHHHHCCC-------CCCCCCCCEEEEEEECCCCCCCCH--HHHHHHHHHHHHCCCEEE T ss_conf 99873146898247999999999986346-------788889974999995699887316--789999999998797899 Q ss_pred EEEECCCCCHHHHHHHCC-C--CE---EEEECCHHHHHHHHHHHHHHHHH Q ss_conf 999418642798998338-9--80---89828989999999999996400 Q gi|254780833|r 321 AIGVQAEAADQFLKNCAS-P--DR---FYSVQNSRKLHDAFLRIGKEMVK 364 (371) Q Consensus 321 tIg~~~~~~~~~l~~cAs-~--~~---~y~~~~~~~L~~af~~I~~~i~~ 364 (371) +||+|. .+...|+.+|+ + +. +|...+.++|....+.|.+++.. T Consensus 144 avGVg~-~~~~eL~~iag~~~~~~~c~~~~~~~fd~l~~i~~~l~~~vC~ 192 (192) T cd01473 144 VVGVGA-ASENKLKLLAGCDINNDNCPNVIKTEWNNLNGISKFLTDKICD 192 (192) T ss_pred EEEECC-CCHHHHHHHHCCCCCCCCCCEEEECCHHHHHHHHHHHHHHHCC T ss_conf 998063-7999999986999889977579947978999999999997249 No 19 >cd01471 vWA_micronemal_protein Micronemal proteins: The Toxoplasma lytic cycle begins when the parasite actively invades a target cell. In association with invasion, T. gondii sequentially discharges three sets of secretory organelles beginning with the micronemes, which contain adhesive proteins involved in parasite attachment to a host cell. Deployed as protein complexes, several micronemal proteins possess vertebrate-derived adhesive sequences that function in binding receptors. The VWA domain likely mediates the protein-protein interactions of these with their interacting partners. 
Probab=99.77 E-value=5.7e-17 Score=120.82 Aligned_cols=169 Identities=17% Similarity=0.218 Sum_probs=124.0 Q ss_pred CCEEEEECCCCCCCCCCCCCHHHHHHHHHHHHHHCCCCCCCCCCCEEEEEEEEEECCCCEEECCCCCC----HHH---HH Q ss_conf 40699852763115667872148989999987500233555531104798788636741781666558----789---99 Q gi|254780833|r 169 LDMMMVLDVSLSMNDHFGPGMDKLGVATRSIREMLDIIKSIPDVNNVVRSGLVTFSSKIVQTFPLAWG----VQH---IQ 241 (371) Q Consensus 169 idi~~viD~SgSm~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~f~~~~~~~~~lt~~----~~~---~~ 241 (371) +|++|++|.|+|++.. ++ .+..++.+..+++.++ ..++.+|+|++.|+..+....+|... ... +. T Consensus 1 lDlvFllD~S~SVg~~--n~---f~~~k~F~~~lv~~f~---I~~~~~rVgvv~ys~~~~~~~~l~~~~~~~~~~~~~~~ 72 (186) T cd01471 1 LDLYLLVDGSGSIGYS--NW---VTHVVPFLHTFVQNLN---ISPDEINLYLVTFSTNAKELIRLSSPNSTNKDLALNAI 72 (186) T ss_pred CEEEEEEECCCCCCCC--CH---HHHHHHHHHHHHHHCC---CCCCCEEEEEEEECCCCEEEEECCCCCCCCHHHHHHHH T ss_conf 9099999488988861--31---8999999999999749---69884499999954870599875775544656799999 Q ss_pred HHHHCCC-CCCCCCCCCHHHHHHHHHHHHHHCCCCCCCCCCCCCEEEEEEECCCCCCCCCCHHHHHHHHHHHHHCCCEEE Q ss_conf 9974015-688745642378899998742110123467776661699998404588888897899999999997898799 Q gi|254780833|r 242 EKINRLI-FGSTTKSTPGLEYAYNKIFDAKEKLEHIAKGHDDYKKYIIFLTDGENSSPNIDNKESLFYCNEAKRRGAIVY 320 (371) Q Consensus 242 ~~i~~l~-~~g~T~~~~~~~~~~~~l~~~~~~~~~~~~~~~~~~k~ivl~TDG~~~~~~~~~~~~~~~c~~~k~~gi~i~ 320 (371) ..+..+. .+|+|++..|+..+.+.++... ...++.+|++||+|||..++.. 
.....++.+|+.||+|| T Consensus 73 ~~i~~~~y~gg~T~Tg~AL~~a~~~~f~~~-------g~R~~vpkv~illTDG~s~d~~----~~~~~a~~Lr~~GV~if 141 (186) T cd01471 73 RALLSLYYPNGSTNTTSALLVVEKHLFDTR-------GNRENAPQLVIIMTDGIPDSKF----RTLKEARKLRERGVIIA 141 (186) T ss_pred HHHHHCCCCCCCCHHHHHHHHHHHHHCCCC-------CCCCCCCEEEEEEECCCCCCCC----HHHHHHHHHHHCCCEEE T ss_conf 999837778996779999999999721146-------8899998599999069877852----58999999998899999 Q ss_pred EEEECCCCCHHHHHHHCC-CCE-----EEEECCHHHHHHHHH Q ss_conf 999418642798998338-980-----898289899999999 Q gi|254780833|r 321 AIGVQAEAADQFLKNCAS-PDR-----FYSVQNSRKLHDAFL 356 (371) Q Consensus 321 tIg~~~~~~~~~l~~cAs-~~~-----~y~~~~~~~L~~af~ 356 (371) +||+|...+...|+.+|+ +.. .|..++-++|+.+-+ T Consensus 142 avGVG~~v~~~eL~~Iag~~~~~~~c~~~~~~~~~~l~~~~~ 183 (186) T cd01471 142 VLGVGQGVNHEENRSLVGCDPDDSPCPLYLQSSWSEVQNVIK 183 (186) T ss_pred EEECCCCCCHHHHHHHCCCCCCCCCCCEEEECCHHHHHHHHH T ss_conf 998343249999999709998889986575178888874775 No 20 >TIGR03436 acidobact_VWFA VWFA-related Acidobacterial domain. Members of this family are bacterial domains that include a region related to the von Willebrand factor type A (VWFA) domain (pfam00092). These domains are restricted to, and have undergone a large paralogous family expansion in, the Acidobacteria, including Solibacter usitatus and Acidobacterium capsulatum ATCC 51196. Probab=99.77 E-value=1.8e-16 Score=117.84 Aligned_cols=181 Identities=22% Similarity=0.276 Sum_probs=131.9 Q ss_pred CCCCCCCEEEEECCCCCCCCCCCCCHHHHHHHHHHHHHHCCCCCCCCCCCEEEEEEEEEECCCCEEECCCCCCHHHHHHH Q ss_conf 13467406998527631156678721489899999875002335555311047987886367417816665587899999 Q gi|254780833|r 164 KSDIGLDMMMVLDVSLSMNDHFGPGMDKLGVATRSIREMLDIIKSIPDVNNVVRSGLVTFSSKIVQTFPLAWGVQHIQEK 243 (371) Q Consensus 164 ~~~~~idi~~viD~SgSm~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~f~~~~~~~~~lt~~~~~~~~~ 243 (371) +...|+.+.+++|.|+||.. ++..+++++..++..... 
...++.++.|+.......++|.+...+..+ T Consensus 49 ~~~~P~sv~l~~D~S~s~~~-------~~~~~~~a~~~fl~~~l~-----p~d~~avv~F~~~~~l~~~fT~d~~~l~~a 116 (296) T TIGR03436 49 ETDLPLTVGLVIDTSGSMFN-------DLARARAAAIRFLKTVLR-----PNDEVFVVTFSTQLRLLQDFTSDPRLLEAA 116 (296) T ss_pred CCCCCCEEEEEEECCCCCHH-------HHHHHHHHHHHHHHHHCC-----CCCEEEEEEECCCEEECCCCCCCHHHHHHH T ss_conf 78898469999978999145-------399999999999986368-----886799999489545727898899999999 Q ss_pred HHCCCC---------------CCCCCCCCHHHHHHHHHHHHHHCCCCCCCCCCCCCEEEEEEECCCCCCCCCCHHHHHHH Q ss_conf 740156---------------88745642378899998742110123467776661699998404588888897899999 Q gi|254780833|r 244 INRLIF---------------GSTTKSTPGLEYAYNKIFDAKEKLEHIAKGHDDYKKYIIFLTDGENSSPNIDNKESLFY 308 (371) Q Consensus 244 i~~l~~---------------~g~T~~~~~~~~~~~~l~~~~~~~~~~~~~~~~~~k~ivl~TDG~~~~~~~~~~~~~~~ 308 (371) |+.+.+ .|+|.++.++..+...+... ...+...+|+||++|||.++.......+.. T Consensus 117 l~~l~~~~~~~~~~~~~~~~~~g~tal~dAi~laa~~~~~~-------~~~~~~gRK~li~iSdG~d~~s~~~~~~~~-- 187 (296) T TIGR03436 117 LNKLKPPLRTDYNSSGAFVADAGGTALYDAITLAALQQLAN-------ALAGIPGRKALIVISDGEDNSSRDTLERAI-- 187 (296) T ss_pred HHHHCCCCCCCCCCCCCCCCCCCCCCHHHHHHHHHHHHHHH-------HCCCCCCCEEEEEEECCCCCCCCCCHHHHH-- T ss_conf 98615676543333453235787410278899999999875-------404798867999992698863304899999-- Q ss_pred HHHHHHCCCEEEEEEECCC-------------CCHHHHHHHC--CCCEEEEECCHHHHHHHHHHHHHHHHHEEE Q ss_conf 9999978987999994186-------------4279899833--898089828989999999999996400078 Q gi|254780833|r 309 CNEAKRRGAIVYAIGVQAE-------------AADQFLKNCA--SPDRFYSVQNSRKLHDAFLRIGKEMVKQRI 367 (371) Q Consensus 309 c~~~k~~gi~i~tIg~~~~-------------~~~~~l~~cA--s~~~~y~~~~~~~L~~af~~I~~~i~~~~i 367 (371) +.++..+|.||+|++... 
.....|+.+| +||++|..+ ..+|.++|+.|.+++.++.+ T Consensus 188 -~~a~~a~v~IY~I~~~~~~~~~~~~~~~~~~~~~~~L~~lA~~TGG~~f~~~-~~dl~~~~~~i~~~lr~qY~ 259 (296) T TIGR03436 188 -EAAQRADVLIYSIDARGLRAPDLGAGAKAGLSGPETLERLAAETGGRAFYVN-SNDIDEAFAQIAEELRSQYV 259 (296) T ss_pred -HHHHHCCCEEEEECCCCCCCCCCCCCCCCCCCCHHHHHHHHHHHCCEEECCC-CCCHHHHHHHHHHHHHHEEE T ss_conf -9999849779995467656656444445567627999999997399675547-41089999999998752389 No 21 >cd01472 vWA_collagen von Willebrand factor (vWF) type A domain; equivalent to the I-domain of integrins. This domain has a variety of functions including: intermolecular adhesion, cell migration, signalling, transcription, and DNA repair. In integrins these domains form heterodimers while in vWF it forms homodimers and multimers. There are different interaction surfaces of this domain as seen by its complexes with collagen with either integrin or human vWFA. In integrins collagen binding occurs via the metal ion-dependent adhesion site (MIDAS) and involves three surface loops located on the upper surface of the molecule. In human vWFA, collagen binding is thought to occur on the bottom of the molecule and does not involve the vestigial MIDAS motif. Probab=99.77 E-value=3.6e-17 Score=122.01 Aligned_cols=156 Identities=23% Similarity=0.387 Sum_probs=121.7 Q ss_pred CEEEEECCCCCCCCCCCCCHHHHHHHHHHHHHHCCCCCCCCCCCEEEEEEEEEECCCCEEECCCCC--CHHHHHHHHHCC Q ss_conf 069985276311566787214898999998750023355553110479878863674178166655--878999997401 Q gi|254780833|r 170 DMMMVLDVSLSMNDHFGPGMDKLGVATRSIREMLDIIKSIPDVNNVVRSGLVTFSSKIVQTFPLAW--GVQHIQEKINRL 247 (371) Q Consensus 170 di~~viD~SgSm~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~f~~~~~~~~~lt~--~~~~~~~~i~~l 247 (371) |++|++|.|+||.. ...+..++.+..+++.+... ....|++++.|+..+....|+.. 
+...+...|+.+ T Consensus 2 Di~fvlD~S~Sv~~------~~f~~~k~fi~~li~~~~i~---~~~~rvgvv~fs~~~~~~~~l~~~~~~~~l~~~i~~i 72 (164) T cd01472 2 DIVFLVDGSESIGL------SNFNLVKDFVKRVVERLDIG---PDGVRVGVVQYSDDPRTEFYLNTYRSKDDVLEAVKNL 72 (164) T ss_pred CEEEEEECCCCCCH------HHHHHHHHHHHHHHHHCCCC---CCCCEEEEEEECCCEEEEECCCCCCCHHHHHHHHHHH T ss_conf 79999979799887------99999999999999964768---8860899998247415874454669889999999861 Q ss_pred CC-CCCCCCCCHHHHHHHHHHHHHHCCCCCCCCCCCCCEEEEEEECCCCCCCCCCHHHHHHHHHHHHHCCCEEEEEEECC Q ss_conf 56-88745642378899998742110123467776661699998404588888897899999999997898799999418 Q gi|254780833|r 248 IF-GSTTKSTPGLEYAYNKIFDAKEKLEHIAKGHDDYKKYIIFLTDGENSSPNIDNKESLFYCNEAKRRGAIVYAIGVQA 326 (371) Q Consensus 248 ~~-~g~T~~~~~~~~~~~~l~~~~~~~~~~~~~~~~~~k~ivl~TDG~~~~~~~~~~~~~~~c~~~k~~gi~i~tIg~~~ 326 (371) .. +|+|++..|++++.+.+.... ....++.+|++||+|||..++ ........+|+.||+||+||+| T Consensus 73 ~~~~g~t~~~~AL~~~~~~~~~~~------~~~r~~~~kvvvllTDG~s~~------~~~~~a~~lr~~Gi~v~~VGig- 139 (164) T cd01472 73 RYIGGGTNTGKALKYVRENLFTEA------SGSREGVPKVLVVITDGKSQD------DVEEPAVELKQAGIEVFAVGVK- 139 (164) T ss_pred HCCCCCCHHHHHHHHHHHHHHCCC------CCCCCCCEEEEEEEECCCCCC------HHHHHHHHHHHCCCEEEEEECC- T ss_conf 166897529999999999863535------787678515999983799864------0889999999889889999788- Q ss_pred CCCHHHHHHHCC-C--CEEEEECC Q ss_conf 642798998338-9--80898289 Q gi|254780833|r 327 EAADQFLKNCAS-P--DRFYSVQN 347 (371) Q Consensus 327 ~~~~~~l~~cAs-~--~~~y~~~~ 347 (371) +.+...|+.+|| | .|+|.+.+ T Consensus 140 ~~~~~~L~~iAs~p~~~~~~~~~~ 163 (164) T cd01472 140 NADEEELKQIASDPKELYVFNVAD 163 (164) T ss_pred CCCHHHHHHHHCCCCHHEEEECCC T ss_conf 479999999967993783896588 No 22 >cd01481 vWA_collagen_alpha3-VI-like VWA_collagen alpha 3(VI) like: The extracellular matrix represents a complex alloy of variable members of diverse protein families defining structural integrity and various physiological functions. The most abundant family is the collagens with more than 20 different collagen types identified thus far. 
Collagens are centrally involved in the formation of fibrillar and microfibrillar networks of the extracellular matrix, basement membranes as well as other structures of the extracellular matrix. Some collagens have about 15-18 vWA domains in them. The VWA domains present in these collagens mediate protein-protein interactions. Probab=99.73 E-value=4.2e-16 Score=115.56 Aligned_cols=158 Identities=18% Similarity=0.296 Sum_probs=121.6 Q ss_pred CEEEEECCCCCCCCCCCCCHHHHHHHHHHHHHHCCCCCCCCCCCEEEEEEEEEECCCCEEECCCCC--CHHHHHHHHHCC Q ss_conf 069985276311566787214898999998750023355553110479878863674178166655--878999997401 Q gi|254780833|r 170 DMMMVLDVSLSMNDHFGPGMDKLGVATRSIREMLDIIKSIPDVNNVVRSGLVTFSSKIVQTFPLAW--GVQHIQEKINRL 247 (371) Q Consensus 170 di~~viD~SgSm~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~f~~~~~~~~~lt~--~~~~~~~~i~~l 247 (371) |++|++|.|+|++. ...+..++.+..+++.++.. .+.+|+|++.|++.+....+|.. +...+..+|+.+ T Consensus 2 DlvFllD~S~si~~------~~F~~~k~Fv~~lv~~f~i~---~~~trVgvi~ys~~~~~~f~l~~~~~~~~l~~~I~~i 72 (165) T cd01481 2 DIVFLIDGSDNVGS------GNFPAIRDFIERIVQSLDVG---PDKIRVAVVQFSDTPRPEFYLNTHSTKADVLGAVRRL 72 (165) T ss_pred CEEEEEECCCCCCH------HHHHHHHHHHHHHHHHHCCC---CCCEEEEEEEECCCEEEEEECCCCCCHHHHHHHHHHH T ss_conf 78999968899898------99999999999999960468---8862788999868647999767768999999999841 Q ss_pred CCCCC--CCCCCHHHHHHHHHHHHHHCCCCCCCCCCCCCEEEEEEECCCCCCCCCCHHHHHHHHHHHHHCCCEEEEEEEC Q ss_conf 56887--4564237889999874211012346777666169999840458888889789999999999789879999941 Q gi|254780833|r 248 IFGST--TKSTPGLEYAYNKIFDAKEKLEHIAKGHDDYKKYIIFLTDGENSSPNIDNKESLFYCNEAKRRGAIVYAIGVQ 325 (371) Q Consensus 248 ~~~g~--T~~~~~~~~~~~~l~~~~~~~~~~~~~~~~~~k~ivl~TDG~~~~~~~~~~~~~~~c~~~k~~gi~i~tIg~~ 325 (371) .+.++ |++..|+.++.+.++... 
+..+..+..+|++|++|||..++ ........+|+.||+||+||++ T Consensus 73 ~~~~g~~t~tg~AL~~a~~~~f~~~----~g~R~r~~v~kvlvviTdG~s~d------~~~~~a~~lr~~gV~i~aVGvg 142 (165) T cd01481 73 RLRGGSQLNTGSALDYVVKNLFTKS----AGSRIEEGVPQFLVLITGGKSQD------DVERPAVALKRAGIVPFAIGAR 142 (165) T ss_pred HCCCCCCEEHHHHHHHHHHHHCCCC----CCCCCCCCCCEEEEEEECCCCCC------HHHHHHHHHHHCCCEEEEEECC T ss_conf 0458984369999999999716756----78875579986999984898853------7899999999889789999689 Q ss_pred CCCCHHHHHHHCC-CCEEEEECC Q ss_conf 8642798998338-980898289 Q gi|254780833|r 326 AEAADQFLKNCAS-PDRFYSVQN 347 (371) Q Consensus 326 ~~~~~~~l~~cAs-~~~~y~~~~ 347 (371) +.+...|+.+|| |.|.|.+++ T Consensus 143 -~~~~~eL~~IAs~p~~vf~~~~ 164 (165) T cd01481 143 -NADLAELQQIAFDPSFVFQVSD 164 (165) T ss_pred -CCCHHHHHHHHCCCCCEEECCC T ss_conf -7999999998589877697389 No 23 >TIGR00868 hCaCC calcium-activated chloride channel protein 1; InterPro: IPR004727 This entry represents a family of Ca(2+)-regulated chloride channels (CLCA) which includes bovine, murine and human proteins , . Each CLCA exhibits a distinct, often overlapping, tissue expression pattern. With the exception of the truncated, secreted protein hCLCA3 , they are synthesized as an approximately 125 kDa precursor transmembrane glycoprotein that is rapidly cleaved into 90 and 35 kDa subunits. The human proteins have been shown to affect a large number of cell functions including chloride conductance, epithelial secretion, cell-cell adhesion, apoptosis, cell cycle control, mucus production in asthma, and blood pressure. The CLCA proteins expressed on the luminal surface of lung vascular endothelia (bCLCA2; mCLCA1; hCLCA2) serve as adhesion molecules for lung metastatic cancer cells, mediating vascular arrest and lung colonization. Expression of hCLCA2 in normal mammary epithelium is consistently lost in human breast cancer and in all tumorigenic breast cancer cell lines. 
Re-expression of hCLCA2 in human breast cancer cells abrogates tumorigenicity in nude mice, implying that hCLCA2 acts as a tumour suppressor in breast cancer.. Probab=99.72 E-value=7.1e-17 Score=120.23 Aligned_cols=171 Identities=26% Similarity=0.331 Sum_probs=119.9 Q ss_pred EEEEECCCCCCCCCCCCCHHHHHHHHHHHHHHCCCCCCCCCCCEEEEEEEEEECCCCEEECCCCCCHH-HHHHHH-HCC- Q ss_conf 69985276311566787214898999998750023355553110479878863674178166655878-999997-401- Q gi|254780833|r 171 MMMVLDVSLSMNDHFGPGMDKLGVATRSIREMLDIIKSIPDVNNVVRSGLVTFSSKIVQTFPLAWGVQ-HIQEKI-NRL- 247 (371) Q Consensus 171 i~~viD~SgSm~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~f~~~~~~~~~lt~~~~-~~~~~i-~~l- 247 (371) |++|+|.||||... +|+....++...+|-+.-. ...-+|++.|.+.+...-.|..-.+ ..++.| ..| T Consensus 310 VCLVLDKSGSM~~~-----dRL~RmNQAa~lFL~Q~vE-----~gs~VGmV~FDS~A~i~n~L~~I~s~~~~~~l~a~LP 379 (874) T TIGR00868 310 VCLVLDKSGSMTKE-----DRLKRMNQAAKLFLLQIVE-----KGSWVGMVTFDSAAEIKNELIKITSSDERDALTANLP 379 (874) T ss_pred EEEEECCCCCCCCC-----CHHHHHHHHHHHHEEEEEE-----CCCEEEEEECCCEEEEEEEEEEECCHHHHHHHHHHCC T ss_conf 99986344337988-----5334555566430123554-----1526776630644576542077526689989987077 Q ss_pred -CCCCCCCCCCHHHHHHHHHHHHHHCCCCCCCCCCCCCEEEEEEECCCCCCCCCCHHHHHHHHHHHHHCCCEEEEEEECC Q ss_conf -5688745642378899998742110123467776661699998404588888897899999999997898799999418 Q gi|254780833|r 248 -IFGSTTKSTPGLEYAYNKIFDAKEKLEHIAKGHDDYKKYIIFLTDGENSSPNIDNKESLFYCNEAKRRGAIVYAIGVQA 326 (371) Q Consensus 248 -~~~g~T~~~~~~~~~~~~l~~~~~~~~~~~~~~~~~~k~ivl~TDG~~~~~~~~~~~~~~~c~~~k~~gi~i~tIg~~~ 326 (371) .+.|||.+=.|++.|++.+....+...+ + =|||||||++|.-+.. .++-|+.|+-||||++|. 
T Consensus 380 ~~a~GGTSIC~Gl~~aFq~I~~~~~~t~G----S-----Ei~LLTDGEDN~i~sC-------~~eVkqsGaIiHtiALGp 443 (874) T TIGR00868 380 TEASGGTSICSGLKAAFQVIKKSDQSTDG----S-----EIVLLTDGEDNTISSC-------IEEVKQSGAIIHTIALGP 443 (874) T ss_pred CCCCCCCHHHHHHHHHHHHHHHCCCCCCC----C-----EEEEEECCCCCCEEEC-------HHHHHCCCEEEEEEECCH T ss_conf 87876803656676665433312666675----3-----6998306875762313-------055410980899850784 Q ss_pred CCCHHH--HHHHCCCCEEEEE--CCHHHHHHHHHHHHH---HHHHEEE Q ss_conf 642798--9983389808982--898999999999999---6400078 Q gi|254780833|r 327 EAADQF--LKNCASPDRFYSV--QNSRKLHDAFLRIGK---EMVKQRI 367 (371) Q Consensus 327 ~~~~~~--l~~cAs~~~~y~~--~~~~~L~~af~~I~~---~i~~~~i 367 (371) .+++++ |.++.+|-+||-. .+-+.|.+||..|.. .++|+.| T Consensus 444 sAa~ele~lS~mTGG~~fYa~D~~~~NgLidAFg~lsS~~~~~sQ~~l 491 (874) T TIGR00868 444 SAAKELEELSDMTGGLRFYASDEADNNGLIDAFGALSSGNGSVSQQSL 491 (874) T ss_pred HHHHHHHHHHHHCCCCEEEEECHHHCCCHHHHHHHHCCCCHHHHHHHH T ss_conf 589999998733387113341333314145466422147612555555 No 24 >cd01482 vWA_collagen_alphaI-XII-like Collagen: The extracellular matrix represents a complex alloy of variable members of diverse protein families defining structural integrity and various physiological functions. The most abundant family is the collagens with more than 20 different collagen types identified thus far. Collagens are centrally involved in the formation of fibrillar and microfibrillar networks of the extracellular matrix, basement membranes as well as other structures of the extracellular matrix. Some collagens have about 15-18 vWA domains in them. The VWA domains present in these collagens mediate protein-protein interactions. 
Probab=99.72 E-value=8e-16 Score=113.89 Aligned_cols=156 Identities=23% Similarity=0.383 Sum_probs=119.1 Q ss_pred CEEEEECCCCCCCCCCCCCHHHHHHHHHHHHHHCCCCCCCCCCCEEEEEEEEEECCCCEEECCCCC--CHHHHHHHHHCC Q ss_conf 069985276311566787214898999998750023355553110479878863674178166655--878999997401 Q gi|254780833|r 170 DMMMVLDVSLSMNDHFGPGMDKLGVATRSIREMLDIIKSIPDVNNVVRSGLVTFSSKIVQTFPLAW--GVQHIQEKINRL 247 (371) Q Consensus 170 di~~viD~SgSm~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~f~~~~~~~~~lt~--~~~~~~~~i~~l 247 (371) |++|++|.|+|++.. ..+..++.+..+++.++. ....+|++++.|+..+....+|.. +...+...|+.+ T Consensus 2 DlvfllD~S~Si~~~------~f~~~k~fi~~lv~~f~i---~~~~~rvgvv~ys~~~~~~~~l~~~~~~~~l~~~i~~i 72 (164) T cd01482 2 DIVFLVDGSWSIGRS------NFNLVRSFLSSVVEAFEI---GPDGVQVGLVQYSDDPRTEFDLNAYTSKEDVLAAIKNL 72 (164) T ss_pred CEEEEEECCCCCCHH------HHHHHHHHHHHHHHHCCC---CCCCEEEEEEEECCCCEEEECCCCCCCHHHHHHHHHHC T ss_conf 799999698998889------999999999999996476---88862899999447512787343469989999998640 Q ss_pred CC-CCCCCCCCHHHHHHHHHHHHHHCCCCCCCCCCCCCEEEEEEECCCCCCCCCCHHHHHHHHHHHHHCCCEEEEEEECC Q ss_conf 56-88745642378899998742110123467776661699998404588888897899999999997898799999418 Q gi|254780833|r 248 IF-GSTTKSTPGLEYAYNKIFDAKEKLEHIAKGHDDYKKYIIFLTDGENSSPNIDNKESLFYCNEAKRRGAIVYAIGVQA 326 (371) Q Consensus 248 ~~-~g~T~~~~~~~~~~~~l~~~~~~~~~~~~~~~~~~k~ivl~TDG~~~~~~~~~~~~~~~c~~~k~~gi~i~tIg~~~ 326 (371) .. +|+|++..|++++.+.++... 
....++.+|++|++|||..++ ........+|+.||+||+||++ T Consensus 73 ~~~~g~t~~~~AL~~~~~~~f~~~------~g~R~~~~kvlvliTDG~s~d------~~~~~a~~lr~~gv~i~~VGVg- 139 (164) T cd01482 73 PYKGGNTRTGKALTHVREKNFTPD------AGARPGVPKVVILITDGKSQD------DVELPARVLRNLGVNVFAVGVK- 139 (164) T ss_pred CCCCCCCCHHHHHHHHHHHHHCHH------CCCCCCCCEEEEEECCCCCCC------HHHHHHHHHHHCCCEEEEEECC- T ss_conf 266899728999999999861500------289888860799960798843------3899999999889389999788- Q ss_pred CCCHHHHHHHCC-C--CEEEEECC Q ss_conf 642798998338-9--80898289 Q gi|254780833|r 327 EAADQFLKNCAS-P--DRFYSVQN 347 (371) Q Consensus 327 ~~~~~~l~~cAs-~--~~~y~~~~ 347 (371) +.+...|+.+|| | .|+|.+++ T Consensus 140 ~~~~~eL~~IAs~P~~~hvf~~~~ 163 (164) T cd01482 140 DADESELKMIASKPSETHVFNVAD 163 (164) T ss_pred CCCHHHHHHHHCCCCHHCEEECCC T ss_conf 378999999968985661797479 No 25 >cd00198 vWFA Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains. 
Probab=99.72 E-value=4.4e-16 Score=115.44 Aligned_cols=150 Identities=30% Similarity=0.408 Sum_probs=116.8 Q ss_pred CCEEEEECCCCCCCCCCCCCHHHHHHHHHHHHHHCCCCCCCCCCCEEEEEEEEEECCCCEEECCCCC--CHHHHHHHHHC Q ss_conf 4069985276311566787214898999998750023355553110479878863674178166655--87899999740 Q gi|254780833|r 169 LDMMMVLDVSLSMNDHFGPGMDKLGVATRSIREMLDIIKSIPDVNNVVRSGLVTFSSKIVQTFPLAW--GVQHIQEKINR 246 (371) Q Consensus 169 idi~~viD~SgSm~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~f~~~~~~~~~lt~--~~~~~~~~i~~ 246 (371) .|++||+|.|+||.. .+++.+++++..+++.+... ....+++++.|...+....+++. +...+...++. T Consensus 1 ~div~vlD~S~Sm~~------~~~~~~k~~~~~~~~~l~~~---~~~~~v~vv~f~~~~~~~~~~~~~~~~~~~~~~i~~ 71 (161) T cd00198 1 ADIVFLLDVSGSMGG------EKLDKAKEALKALVSSLSAS---PPGDRVGLVTFGSNARVVLPLTTDTDKADLLEAIDA 71 (161) T ss_pred CEEEEEEECCCCCCC------HHHHHHHHHHHHHHHHHHHC---CCCCEEEEEEECCCEEEEECCCCHHHHHHHHHHHHH T ss_conf 909999918899880------79999999999999987655---999889999937951488147412579999997751 Q ss_pred CC--CCCCCCCCCHHHHHHHHHHHHHHCCCCCCCCCCCCCEEEEEEECCCCCCCCCCHHHHHHHHHHHHHCCCEEEEEEE Q ss_conf 15--6887456423788999987421101234677766616999984045888888978999999999978987999994 Q gi|254780833|r 247 LI--FGSTTKSTPGLEYAYNKIFDAKEKLEHIAKGHDDYKKYIIFLTDGENSSPNIDNKESLFYCNEAKRRGAIVYAIGV 324 (371) Q Consensus 247 l~--~~g~T~~~~~~~~~~~~l~~~~~~~~~~~~~~~~~~k~ivl~TDG~~~~~~~~~~~~~~~c~~~k~~gi~i~tIg~ 324 (371) +. ..|+|++..++..+.+.+.... ....+|++||+|||.++.+. .........+|+.||.||+||+ T Consensus 72 ~~~~~~g~t~~~~al~~a~~~~~~~~---------~~~~~~~iiliTDG~~~~~~---~~~~~~~~~~~~~~v~i~~igi 139 (161) T cd00198 72 LKKGLGGGTNIGAALRLALELLKSAK---------RPNARRVIILLTDGEPNDGP---ELLAEAARELRKLGITVYTIGI 139 (161) T ss_pred CCCCCCCCCHHHHHHHHHHHHHHHHC---------CCCCCEEEEEECCCCCCCCH---HHHHHHHHHHHHCCCEEEEEEE T ss_conf 35689998389999999999987532---------55565179996789989873---6799999999977998999996 Q ss_pred CCCCCHHHHHHHCCC Q ss_conf 186427989983389 Q gi|254780833|r 325 QAEAADQFLKNCASP 339 (371) Q Consensus 325 ~~~~~~~~l~~cAs~ 339 (371) |.+.+.+.|+.+|+. 
T Consensus 140 g~~~~~~~l~~ia~~ 154 (161) T cd00198 140 GDDANEDELKEIADK 154 (161) T ss_pred CHHHCHHHHHHHHHC T ss_conf 611199999999838 No 26 >cd01476 VWA_integrin_invertebrates VWA_integrin (invertebrates): Integrins are a family of cell surface receptors that have diverse functions in cell-cell and cell-extracellular matrix interactions. Because of their involvement in many biologically important adhesion processes, integrins are conserved across a wide range of multicellular animals. Integrins from invertebrates have been identified from six phyla. There are no data to date to suggest any immunological functions for the invertebrate integrins. The members of this sub-group have the conserved MIDAS motif that is charateristic of this domain suggesting the involvement of the integrins in the recognition and binding of multi-ligands. Probab=99.69 E-value=2.5e-15 Score=110.89 Aligned_cols=153 Identities=22% Similarity=0.325 Sum_probs=111.0 Q ss_pred CCEEEEECCCCCCCCCCCCCHHHHHHHHHHHHHHCCCCCCCCCCCEEEEEEEEEECCCCEEE--CCCC--CCHHHHHHHH Q ss_conf 40699852763115667872148989999987500233555531104798788636741781--6665--5878999997 Q gi|254780833|r 169 LDMMMVLDVSLSMNDHFGPGMDKLGVATRSIREMLDIIKSIPDVNNVVRSGLVTFSSKIVQT--FPLA--WGVQHIQEKI 244 (371) Q Consensus 169 idi~~viD~SgSm~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~f~~~~~~~--~~lt--~~~~~~~~~i 244 (371) +|++|++|.|+|+... ....++.+..+++.+. ...+.+|++++.|+...... .++. 
.+..++...| T Consensus 1 lDl~fllD~S~Sv~~~-------f~~~k~F~~~lv~~f~---i~~~~~rVgvv~ys~~~~~~i~f~l~~~~~~~~l~~~I 70 (163) T cd01476 1 LDLLFVLDSSGSVRGK-------FEKYKKYIERIVEGLE---IGPTATRVALITYSGRGRQRVRFNLPKHNDGEELLEKV 70 (163) T ss_pred CCEEEEEECCCCHHHH-------HHHHHHHHHHHHHHHC---CCCCCEEEEEEEECCCCCEEEEECCCCCCCHHHHHHHH T ss_conf 9299999188886673-------9999999999999614---68885389999966987078887577779999999999 Q ss_pred HCCCC-CCCCCCCCHHHHHHHHHHHHHHCCCCCCCCCCCCCEEEEEEECCCCCCCCCCHHHHHHHHHHHHH-CCCEEEEE Q ss_conf 40156-88745642378899998742110123467776661699998404588888897899999999997-89879999 Q gi|254780833|r 245 NRLIF-GSTTKSTPGLEYAYNKIFDAKEKLEHIAKGHDDYKKYIIFLTDGENSSPNIDNKESLFYCNEAKR-RGAIVYAI 322 (371) Q Consensus 245 ~~l~~-~g~T~~~~~~~~~~~~l~~~~~~~~~~~~~~~~~~k~ivl~TDG~~~~~~~~~~~~~~~c~~~k~-~gi~i~tI 322 (371) +.+.. +|+|++..|+..+.+.+.+.. ...++.+|++|++|||..+++ .......+|+ .||+||+| T Consensus 71 ~~i~~~~g~T~tg~AL~~a~~~~~~~~-------g~R~~~~kv~vviTDG~s~d~------~~~~a~~lr~~~gv~v~av 137 (163) T cd01476 71 DNLRFIGGTTATGAAIEVALQQLDPSE-------GRREGIPKVVVVLTDGRSHDD------PEKQARILRAVPNIETFAV 137 (163) T ss_pred HHEECCCCCCCHHHHHHHHHHHHHHHC-------CCCCCCEEEEEEEECCCCCCC------HHHHHHHHHHHCCCEEEEE T ss_conf 752036898548999999999721420-------678996169999818987664------8899999997099899999 Q ss_pred EECC--CCCHHHHHHHCC-CCEEEE Q ss_conf 9418--642798998338-980898 Q gi|254780833|r 323 GVQA--EAADQFLKNCAS-PDRFYS 344 (371) Q Consensus 323 g~~~--~~~~~~l~~cAs-~~~~y~ 344 (371) |+|. +.+...|+.+|+ ++|.|. T Consensus 138 gVG~~~~~d~~eL~~Ia~~~~~Vft 162 (163) T cd01476 138 GTGDPGTVDTEELHSITGNEDHIFT 162 (163) T ss_pred EECCCCCCCHHHHHHHCCCCCCCCC T ss_conf 8388650159999986499725457 No 27 >cd01462 VWA_YIEM_type VWA YIEM type: Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. 
The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and immune defenses. In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal-ion-dependent adhesion site termed the MIDAS motif that is a characteristic feature of most, if Probab=99.64 E-value=1.9e-14 Score=105.63 Aligned_cols=144 Identities=21% Similarity=0.191 Sum_probs=107.1 Q ss_pred CEEEEECCCCCCCCCCCCCHHHHHHHHHHHHHHCCCCCCCCCCCEEEEEEEEEECCCCEE-ECCCCCCHHHHHHHHHCCC Q ss_conf 069985276311566787214898999998750023355553110479878863674178-1666558789999974015 Q gi|254780833|r 170 DMMMVLDVSLSMNDHFGPGMDKLGVATRSIREMLDIIKSIPDVNNVVRSGLVTFSSKIVQ-TFPLAWGVQHIQEKINRLI 248 (371) Q Consensus 170 di~~viD~SgSm~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~f~~~~~~~-~~~lt~~~~~~~~~i~~l~ 248 (371) .+++++|.||||.. .++..++.++..++......+ .+++++.|++.... ..+.+.+.......+..+.
T Consensus 2 pvV~vlD~SGSM~G------~~~~~ak~~~~~l~~~l~~~~-----~~~~lv~F~~~~~~~~~~~~~~~~~~~~~i~~~~ 70 (152) T cd01462 2 PVILLVDQSGSMYG------APEEVAKAVALALLRIALAEN-----RDTYLILFDSEFQTKIVDKTDDLEEPVEFLSGVQ 70 (152) T ss_pred CEEEEEECCCCCCC------CHHHHHHHHHHHHHHHHHHCC-----CEEEEEEECCCCEEEECCCCCCHHHHHHHHHHCC T ss_conf 99999979999898------069999999999999732339-----8099999168735771587645999999997253 Q ss_pred CCCCCCCCCHHHHHHHHHHHHHHCCCCCCCCCCCCCEEEEEEECCCCCCCCCCHHHHHHHHHHHHHCCCEEEEEEECCCC Q ss_conf 68874564237889999874211012346777666169999840458888889789999999999789879999941864 Q gi|254780833|r 249 FGSTTKSTPGLEYAYNKIFDAKEKLEHIAKGHDDYKKYIIFLTDGENSSPNIDNKESLFYCNEAKRRGAIVYAIGVQAEA 328 (371) Q Consensus 249 ~~g~T~~~~~~~~~~~~l~~~~~~~~~~~~~~~~~~k~ivl~TDG~~~~~~~~~~~~~~~c~~~k~~gi~i~tIg~~~~~ 328 (371) ++|||++..++..+...+... ...+..|||+|||+....+ ......+..+++.++++|++++|.+. T Consensus 71 ~~GGT~i~~aL~~A~~~l~~~-----------~~~~~~IvlITDG~~~~~~---~~~~~~~~~~~~~~~r~~~~~iG~~~ 136 (152) T cd01462 71 LGGGTDINKALRYALELIERR-----------DPRKADIVLITDGYEGGVS---DELLREVELKRSRVARFVALALGDHG 136 (152) T ss_pred CCCCCCHHHHHHHHHHHHHCC-----------CCCCCEEEEEECCCCCCCH---HHHHHHHHHHHHCCEEEEEEEECCCC T ss_conf 689865799999999987425-----------7656469998267567983---99999999998389199999989998 Q ss_pred CHHHHHHHCC Q ss_conf 2798998338 Q gi|254780833|r 329 ADQFLKNCAS 338 (371) Q Consensus 329 ~~~~l~~cAs 338 (371) +..+++..+. 
T Consensus 137 ~p~~~~~~~~ 146 (152) T cd01462 137 NPGYDRISAE 146 (152) T ss_pred CCHHHHHHHH T ss_conf 8278787666 No 28 >COG1240 ChlD Mg-chelatase subunit ChlD [Coenzyme metabolism] Probab=99.59 E-value=1.3e-13 Score=100.52 Aligned_cols=171 Identities=19% Similarity=0.260 Sum_probs=138.1 Q ss_pred CCCCCCCEEEEECCCCCCCCCCCCCHHHHHHHHHHHHHHCCCCCCCCCCCEEEEEEEEEEC-CCCEEECCCCCCHHHHHH Q ss_conf 1346740699852763115667872148989999987500233555531104798788636-741781666558789999 Q gi|254780833|r 164 KSDIGLDMMMVLDVSLSMNDHFGPGMDKLGVATRSIREMLDIIKSIPDVNNVVRSGLVTFS-SKIVQTFPLAWGVQHIQE 242 (371) Q Consensus 164 ~~~~~idi~~viD~SgSm~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~f~-~~~~~~~~lt~~~~~~~~ 242 (371) ..+....++||+|.||||... .|+..++.++..++... +....+++++.|. .++....|.|.+...+.. T Consensus 74 ~~r~g~lvvfvVDASgSM~~~-----~Rm~aaKG~~~~lL~dA-----Yq~RdkvavI~F~G~~A~lll~pT~sv~~~~~ 143 (261) T COG1240 74 EGRAGNLIVFVVDASGSMAAR-----RRMAAAKGAALSLLRDA-----YQRRDKVAVIAFRGEKAELLLPPTSSVELAER 143 (261) T ss_pred CCCCCCCEEEEEECCCCCHHH-----HHHHHHHHHHHHHHHHH-----HHCCCEEEEEEECCCCCEEEECCCCCHHHHHH T ss_conf 047677489999476542057-----89999999999999999-----97035489999637765388478653999999 Q ss_pred HHHCCCCCCCCCCCCHHHHHHHHHHHHHHCCCCCCCCCCCCCEEEEEEECCCCCCCCCCH--HHHHHHHHHHHHCCCEEE Q ss_conf 974015688745642378899998742110123467776661699998404588888897--899999999997898799 Q gi|254780833|r 243 KINRLIFGSTTKSTPGLEYAYNKIFDAKEKLEHIAKGHDDYKKYIIFLTDGENSSPNIDN--KESLFYCNEAKRRGAIVY 320 (371) Q Consensus 243 ~i~~l~~~g~T~~~~~~~~~~~~l~~~~~~~~~~~~~~~~~~k~ivl~TDG~~~~~~~~~--~~~~~~c~~~k~~gi~i~ 320 (371) .+..+.++|.|.+..|+..+++.+.... ..+++...++|++|||..|.+.... 
.....+|..+...|+.+- T Consensus 144 ~L~~l~~GG~TPL~~aL~~a~ev~~r~~-------r~~p~~~~~~vviTDGr~n~~~~~~~~~e~~~~a~~~~~~g~~~l 216 (261) T COG1240 144 ALERLPTGGKTPLADALRQAYEVLAREK-------RRGPDRRPVMVVITDGRANVPIPLGPKAETLEAASKLRLRGIQLL 216 (261) T ss_pred HHHHCCCCCCCCHHHHHHHHHHHHHHHH-------CCCCCCCEEEEEEECCCCCCCCCCCHHHHHHHHHHHHHHCCCCEE T ss_conf 9983899998843999999999999751-------048876538999737965888898657799999999852688479 Q ss_pred EEEECCCCC-HHHHHHHC--CCCEEEEECCHHHH Q ss_conf 999418642-79899833--89808982898999 Q gi|254780833|r 321 AIGVQAEAA-DQFLKNCA--SPDRFYSVQNSRKL 351 (371) Q Consensus 321 tIg~~~~~~-~~~l~~cA--s~~~~y~~~~~~~L 351 (371) +|.++.+.- -.+.+.+| ++|.||+.++..+. T Consensus 217 vid~e~~~~~~g~~~~iA~~~Gg~~~~L~~l~~~ 250 (261) T COG1240 217 VIDTEGSEVRLGLAEEIARASGGEYYHLDDLSDD 250 (261) T ss_pred EEECCCCCCCCCHHHHHHHHHCCEEEECCCCCCH T ss_conf 9955785233447999999739907865556404 No 29 >COG4245 TerY Uncharacterized protein encoded in toxicity protection region of plasmid R478, contains von Willebrand factor (vWF) domain [General function prediction only] Probab=99.51 E-value=6.4e-13 Score=96.33 Aligned_cols=182 Identities=20% Similarity=0.263 Sum_probs=137.3 Q ss_pred CEEEEECCCCCCCCCCCCCHHHHHHHHHHHHHHCCCCCCCCCCCEEEEEEEEEECCCCEEECCCCCCHHHHHHH-HHCCC Q ss_conf 06998527631156678721489899999875002335555311047987886367417816665587899999-74015 Q gi|254780833|r 170 DMMMVLDVSLSMNDHFGPGMDKLGVATRSIREMLDIIKSIPDVNNVVRSGLVTFSSKIVQTFPLAWGVQHIQEK-INRLI 248 (371) Q Consensus 170 di~~viD~SgSm~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~f~~~~~~~~~lt~~~~~~~~~-i~~l~ 248 (371) -|++++|.||||.. ++|+.++..++.+.+.+.+.+..-..+.+++++|...+....|++.- .++ -..|. 
T Consensus 5 P~~lllDtSgSM~G------e~IealN~Glq~m~~~Lkqdp~Ale~v~lsIVTF~~~a~~~~pf~~~----~nF~~p~L~ 74 (207) T COG4245 5 PCYLLLDTSGSMIG------EPIEALNAGLQMMIDTLKQDPYALERVELSIVTFGGPARVIQPFTDA----ANFNPPILT 74 (207) T ss_pred CEEEEEECCCCCCC------CCHHHHHHHHHHHHHHHHHCHHHHHEEEEEEEEECCCCEEEECHHHH----HHCCCCCEE T ss_conf 88999936754245------61799989999999998748465440578999826850687331557----544887013 Q ss_pred CCCCCCCCCHHHHHHHHHHHHHHCCCCCCCCCCCCCEEEEEEECCCCCCCCCCHHHHHHHHHHHHHCCCEEEEEEECCC- Q ss_conf 6887456423788999987421101234677766616999984045888888978999999999978987999994186- Q gi|254780833|r 249 FGSTTKSTPGLEYAYNKIFDAKEKLEHIAKGHDDYKKYIIFLTDGENSSPNIDNKESLFYCNEAKRRGAIVYAIGVQAE- 327 (371) Q Consensus 249 ~~g~T~~~~~~~~~~~~l~~~~~~~~~~~~~~~~~~k~ivl~TDG~~~~~~~~~~~~~~~c~~~k~~gi~i~tIg~~~~- 327 (371) +.|+|....|+..+..++....+.... ....+++++++|||||++++. +......-..-+.....|-.+++|.+ T Consensus 75 a~GgT~lGaAl~~a~d~Ie~~~~~~~a--~~kgdyrP~vfLiTDG~PtD~---w~~~~~~~~~~~~~~k~v~a~~~G~~~ 149 (207) T COG4245 75 AQGGTPLGAALTLALDMIEERKRKYDA--NGKGDYRPWVFLITDGEPTDD---WQAGAALVFQGERRAKSVAAFSVGVQG 149 (207) T ss_pred CCCCCCHHHHHHHHHHHHHHHHHHCCC--CCCCCCCEEEEEECCCCCCHH---HHHHHHHHHHCCCCCCEEEEEEECCCC T ss_conf 699980679999999999877765056--775554417999538996657---776777764033100528999953543 Q ss_pred CCHHHHHHHCCCCEEEEECCHHHHHHHHHHHHHHHHHEE Q ss_conf 427989983389808982898999999999999640007 Q gi|254780833|r 328 AADQFLKNCASPDRFYSVQNSRKLHDAFLRIGKEMVKQR 366 (371) Q Consensus 328 ~~~~~l~~cAs~~~~y~~~~~~~L~~af~~I~~~i~~~~ 366 (371) ++-..|+++.+.-.-|.-.+..++.+.|+.+...|+.-+ T Consensus 150 ad~~~L~qit~~V~~~~t~d~~~f~~fFkW~SaSisagS 188 (207) T COG4245 150 ADNKTLNQITEKVRQFLTLDGLQFREFFKWLSASISAGS 188 (207) T ss_pred CCCHHHHHHHHHHCCCCCCCHHHHHHHHHHHHHHHHCCC T ss_conf 441899998876525234534889999999987751323 No 30 >cd01454 vWA_norD_type norD type: Denitrifying bacteria contain both membrane bound and periplasmic nitrate reductases. 
Denitrification plays a major role in completing the nitrogen cycle by converting nitrate or nitrite to nitrogen gas. The pathway for microbial denitrification has been established as NO3- -> NO2- -> NO -> N2O -> N2. This reaction generally occurs under oxygen limiting conditions. Genetic and biochemical studies have shown that the first step of the biochemical pathway is catalyzed by periplasmic nitrate reductases. This family is widely present in proteobacteria and firmicutes. This version of the domain is also present in some archaeal members. The function of the vWA domain in this sub-group is not known. Members of this subgroup have a conserved MIDAS motif. Probab=99.50 E-value=2.5e-12 Score=92.79 Aligned_cols=152 Identities=18% Similarity=0.357 Sum_probs=105.4 Q ss_pred CEEEEECCCCCCCCCCCCCHHHHHHHHHHHHHHCCCCCCCCCCCEEEEEEEEEECCCC--------EEECCCCCC-HHHH Q ss_conf 0699852763115667872148989999987500233555531104798788636741--------781666558-7899 Q gi|254780833|r 170 DMMMVLDVSLSMNDHFGPGMDKLGVATRSIREMLDIIKSIPDVNNVVRSGLVTFSSKI--------VQTFPLAWG-VQHI 240 (371) Q Consensus 170 di~~viD~SgSm~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~f~~~~--------~~~~~lt~~-~~~~ 240 (371) .+.+++|.||||... .+++.+++++.-+.+.+... ..+++++.|+... ....++... .... 
T Consensus 2 aV~lLlD~SgSM~~~-----~~i~~a~~a~~~l~~aL~~~-----g~~~~v~gF~s~~~~r~~~~~~~~k~f~e~~~~~~ 71 (174) T cd01454 2 AVTLLLDLSGSMRSD-----RRIDVAKKAAVLLAEALEAC-----GVPHAILGFTTDAGGRERVRWIKIKDFDESLHERA 71 (174) T ss_pred EEEEEEECCCCCCCC-----CHHHHHHHHHHHHHHHHHHC-----CCCEEEEECCCCCCCCCCEEEEECCCCCCCHHHHH T ss_conf 899999898688998-----48999999999999999976-----99569995157889844347893236674211456 Q ss_pred HHHHHCCCCCCCCCCCCHHHHHHHHHHHHHHCCCCCCCCCCCCCEEEEEEECCCCCCCCC------CHHHHHHHHHHHHH Q ss_conf 999740156887456423788999987421101234677766616999984045888888------97899999999997 Q gi|254780833|r 241 QEKINRLIFGSTTKSTPGLEYAYNKIFDAKEKLEHIAKGHDDYKKYIIFLTDGENSSPNI------DNKESLFYCNEAKR 314 (371) Q Consensus 241 ~~~i~~l~~~g~T~~~~~~~~~~~~l~~~~~~~~~~~~~~~~~~k~ivl~TDG~~~~~~~------~~~~~~~~c~~~k~ 314 (371) +..|..+.++++|....++.|+...|.. .+..+|++|++|||.|++... ....+...+..++. T Consensus 72 ~~~i~~l~~~g~Tr~G~Air~a~~~L~~-----------~~~~rkiliviSDG~P~D~~~~~~~~~~~~D~~~av~e~~~ 140 (174) T cd01454 72 RKRLAALSPGGNTRDGAAIRHAAERLLA-----------RPEKRKILLVISDGEPNDLDYYEGNVFATEDALRAVIEARK 140 (174) T ss_pred HHHHHCCCCCCCCCCHHHHHHHHHHHHH-----------CCCCCEEEEEEECCCCCCCCCCCCCHHHHHHHHHHHHHHHH T ss_conf 8888511878989617999999999863-----------97666799998389976677788755389999999999998 Q ss_pred CCCEEEEEEECCCCC---HHHHHHHCCCCEE Q ss_conf 898799999418642---7989983389808 Q gi|254780833|r 315 RGAIVYAIGVQAEAA---DQFLKNCASPDRF 342 (371) Q Consensus 315 ~gi~i~tIg~~~~~~---~~~l~~cAs~~~~ 342 (371) .||.+|.|+++.+.. .+.|+.+-+.++| T Consensus 141 ~GI~~~~i~i~~~~~~~~~~~l~~i~g~~~~ 171 (174) T cd01454 141 LGIEVFGITIDRDATTVDKEYLKNIFGEEGY 171 (174) T ss_pred CCCEEEEEEECCCCCHHHHHHHHHHCCCCCE T ss_conf 7988999998985556699999984287877 No 31 >cd01477 vWA_F09G8-8_type VWA F09G8.8 type: Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. 
The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses. In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of mo Probab=99.49 E-value=5.7e-12 Score=90.64 Aligned_cols=165 Identities=19% Similarity=0.253 Sum_probs=117.0 Q ss_pred CCCCCCEEEEECCCCCCCCCCCCCHHHHHHHHHHHHHHCC---CCCCCCCCCEEEEEEEEEECCCCEEECCCCC--CHHH Q ss_conf 3467406998527631156678721489899999875002---3355553110479878863674178166655--8789 Q gi|254780833|r 165 SDIGLDMMMVLDVSLSMNDHFGPGMDKLGVATRSIREMLD---IIKSIPDVNNVVRSGLVTFSSKIVQTFPLAW--GVQH 239 (371) Q Consensus 165 ~~~~idi~~viD~SgSm~~~~~~~~~~~~~~~~~~~~~~~---~~~~~~~~~~~~~~~~~~f~~~~~~~~~lt~--~~~~ 239 (371) .+.=+|+++|+|.|..|... -+..+...+..++. .....+.-+..+|+|+++|++.++....|.. .... 
T Consensus 16 ~nLWLDVv~VVD~S~~mt~~------gl~~V~~~I~s~f~~~t~iGt~~~~pr~TRVGlVTYn~~AtvvAdLn~~~S~dd 89 (193) T cd01477 16 KNLWLDIVFVVDNSKGMTQG------GLWQVRATISSLFGSSSQIGTDYDDPRSTRVGLVTYNSNATVVADLNDLQSFDD 89 (193) T ss_pred HHEEEEEEEEEECCCCCCCC------CHHHHHHHHHHHHHCCCCCCCCCCCCCCEEEEEEEECCCCEEEECCCCCCCHHH T ss_conf 22237899999678765621------099999999999713540357889987338999996787459863454565788 Q ss_pred HHHHHH----CCCCCCCCCCCCHHHHHHHHHHHHHHCCCCCCCCCCCCCEEEEEEECCCCCCCCCCHHHHHHHHHHHHHC Q ss_conf 999974----0156887456423788999987421101234677766616999984045888888978999999999978 Q gi|254780833|r 240 IQEKIN----RLIFGSTTKSTPGLEYAYNKIFDAKEKLEHIAKGHDDYKKYIIFLTDGENSSPNIDNKESLFYCNEAKRR 315 (371) Q Consensus 240 ~~~~i~----~l~~~g~T~~~~~~~~~~~~l~~~~~~~~~~~~~~~~~~k~ivl~TDG~~~~~~~~~~~~~~~c~~~k~~ 315 (371) +.+.|. ........+...|+..+-.+|... ....+.+++|+||+++-.....+. ..+..+++.+|.. T Consensus 90 l~~~i~~~l~~vsss~~SyL~~GL~aA~~~l~~~------~~~~R~nykKVVIVyAs~y~~~g~---~dp~pvA~rLK~~ 160 (193) T cd01477 90 LYSQIQGSLTDVSSTNASYLDTGLQAAEQMLAAG------KRTSRENYKKVVIVFASDYNDEGS---NDPRPIAARLKST 160 (193) T ss_pred HHHHHHHHHCCCCCCCHHHHHHHHHHHHHHHHHC------CCCCCCCCCEEEEEEECCCCCCCC---CCHHHHHHHHHHC T ss_conf 9999988751466663127999999999999833------266424862799999502467898---8869999999876 Q ss_pred CCEEEEEEECCCCCHH---HHHHHCCCCEEEE Q ss_conf 9879999941864279---8998338980898 Q gi|254780833|r 316 GAIVYAIGVQAEAADQ---FLKNCASPDRFYS 344 (371) Q Consensus 316 gi~i~tIg~~~~~~~~---~l~~cAs~~~~y~ 344 (371) |+.|.||+|+.+.+.. .|.+|||||+-|. 
T Consensus 161 Gv~IiTVa~~q~~~~~~~~~L~~IASpg~nFt 192 (193) T cd01477 161 GIAIITVAFTQDESSNLLDKLGKIASPGMNFT 192 (193) T ss_pred CCEEEEEECCCCCCHHHHHHHHHHCCCCCCCC T ss_conf 97899998268875889998887579988877 No 32 >COG4961 TadG Flp pilus assembly protein TadG [Intracellular trafficking and secretion] Probab=99.42 E-value=2.9e-12 Score=92.42 Aligned_cols=68 Identities=16% Similarity=0.204 Sum_probs=60.7 Q ss_pred HHHHHHHHCCCCCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHCCCCCCH Q ss_conf 25788752036871899999999999999999999999999999999999999999986520365302 Q gi|254780833|r 3 FLNIRNFFYNCKGSISILTAILLPVIFIVMGLVIETSHKFFVKAKLHYILDHSLLYTATKILNQENGN 70 (371) Q Consensus 3 ~~~l~~f~~d~~G~vai~~al~l~~li~~~g~aVD~~r~~~~ks~Lq~a~DaA~LAaa~~~~~~~~~~ 70 (371) .+-++||+|||+|+++|.|||++|||++++++.||++.+++.|.+||+|+|+|+++++.......... T Consensus 9 ~~~~~rF~rdr~Ga~AVeFAlvap~ll~l~~g~ve~~~~~~~~~~l~~a~d~aara~~~~~~~~~~~~ 76 (185) T COG4961 9 RGLLRRFRRDRRGAAAVEFALVAPPLLLLVFGIVEFGIAFLAKQSLQNAADAAARAAARGLTTDAADL 76 (185) T ss_pred HHHHHHHHHCCCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHCCCCCCCHH T ss_conf 99999887648768999999999999999999999999999999999999999999985076442025 No 33 >TIGR02442 Cob-chelat-sub cobaltochelatase subunit; InterPro: IPR012804 Cobalamin (vitamin B12) can be complexed with metal via ATP-dependent reactions (aerobic pathway) (e.g., in Pseudomonas denitrificans) or via ATP-independent reactions (anaerobic pathway) (e.g., in Salmonella typhimurium). The corresponding cobalt chelatases are not homologous. Cobaltochelatase is responsible for the insertion of cobalt into the corrin ring of coenzyme B12 during its biosynthesis. Two versions have been well described. CbiK/CbiX is a monomeric, anaerobic version which acts early in the biosynthesis (IPR010388 from INTERPRO). CobNST is a trimeric, ATP-dependent, aerobic version which acts late in the biosynthesis (IPR011953 from INTERPRO, IPR006537 from INTERPRO, IPR006538 from INTERPRO). 
The two pathways differ in the point of cobalt insertion during corrin ring formation. There are apparently a number of variations on these two pathways, where the major differences seem to be concerned with the process of ring contraction. Cobaltochelatase shows similarities with magnesium chelatase, which is also a complex ATP-dependent enzyme made up of two separable components. However, unlike the situation in cobaltochelatase, one of these two components is membrane bound in magnesium chelatase. Probab=99.30 E-value=1.6e-10 Score=81.91 Aligned_cols=165 Identities=18% Similarity=0.218 Sum_probs=127.9 Q ss_pred CCCCCCCEEEEECCCCCCCCCCCCCHHHHHHHHHHHHHHCCCCCCCCCCCEEEEEEEEEECCC-CEEECCCCCCHHHHHH Q ss_conf 134674069985276311566787214898999998750023355553110479878863674-1781666558789999 Q gi|254780833|r 164 KSDIGLDMMMVLDVSLSMNDHFGPGMDKLGVATRSIREMLDIIKSIPDVNNVVRSGLVTFSSK-IVQTFPLAWGVQHIQE 242 (371) Q Consensus 164 ~~~~~idi~~viD~SgSm~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~f~~~-~~~~~~lt~~~~~~~~ 242 (371) +.+.+--|+||+|-||||.-. .||..++.++..+|... +-...++++++|... +...+|.|.+....+. 
T Consensus 574 ~L~~lPtGGrTPLa~gL~~A~~v~~~~~~-------~~~~~~pl~V~iTDGRaNv~L~~~~g~~qp~~~~~~~a~~L~~~ 646 (688) T TIGR02442 574 RLEELPTGGRTPLAAGLLKAAEVLSNELL-------RDDDRRPLVVVITDGRANVALDVSLGEPQPLDDARTIASKLAAR 646 (688) T ss_pred HHHHCCCCCCCHHHHHHHHHHHHHHHHHH-------CCCCCCEEEEEEECCCCCCCCCCCCCCCCHHHHHHHHHHHHHHH T ss_conf 99728898987458999999999999861-------16899428998707863542666788841577899999998875 Q ss_pred -------CCEEEEEEECC-CCC-HHHHHHHCC--CCEEEEE Q ss_conf -------98799999418-642-798998338--9808982 Q gi|254780833|r 316 -------GAIVYAIGVQA-EAA-DQFLKNCAS--PDRFYSV 345 (371) Q Consensus 316 -------gi~i~tIg~~~-~~~-~~~l~~cAs--~~~~y~~ 345 (371) |+..-+|=-+. ..- -.+-+++|+ +|.||.. T Consensus 647 ~~R~R~Lg~~~vV~DTE~~~~v~lGlA~~~A~~lgg~~~~l 687 (688) T TIGR02442 647 ASRIRSLGIKFVVIDTENPGFVRLGLAEDLASALGGEYLRL 687 (688) T ss_pred HCCEEECCCEEEEEECCCCCCCCCCHHHHHHHHHCCCEECC T ss_conf 04301116227899726887542223899999829832247 No 34 >KOG2353 consensus Probab=99.12 E-value=1.9e-09 Score=75.35 Aligned_cols=183 Identities=19% Similarity=0.265 Sum_probs=128.8 Q ss_pred CCCCCCCEEEEECCCCCCCCCCCCCHHHHHHHHHHHHHHCCCCCCCCCCCEEEEEEEEEECCCCEEECCC---------C Q ss_conf 1346740699852763115667872148989999987500233555531104798788636741781666---------5 Q gi|254780833|r 164 KSDIGLDMMMVLDVSLSMNDHFGPGMDKLGVATRSIREMLDIIKSIPDVNNVVRSGLVTFSSKIVQTFPL---------A 234 (371) Q Consensus 164 ~~~~~idi~~viD~SgSm~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~f~~~~~~~~~l---------t 234 (371) ....+-|+++.+|.||||.. .+++.++..+...++.+...+. +.+.+|+.......|= . 
T Consensus 221 aAt~pKdiviLlD~SgSm~g------~~~~lak~tv~~iLdtLs~~Df------vni~tf~~~~~~v~pc~~~~lvqAt~ 288 (1104) T KOG2353 221 AATSPKDIVILLDVSGSMSG------LRLDLAKQTVNEILDTLSDNDF------VNILTFNSEVNPVSPCFNGTLVQATM 288 (1104) T ss_pred CCCCCCCEEEEEECCCCCCC------HHHHHHHHHHHHHHHHCCCCCE------EEEEEECCCCCCCCCCCCCCEEECCH T ss_conf 46786645999965655544------3169999999999976154776------87876213567565202585220456 Q ss_pred CCHHHHHHHHHCCCCCCCCCCCCHHHHHHHHHHHHHHCCCCCCCCCCCCCEEEEEEECCCCCCCCCCHHHHHHHHHHH-- Q ss_conf 587899999740156887456423788999987421101234677766616999984045888888978999999999-- Q gi|254780833|r 235 WGVQHIQEKINRLIFGSTTKSTPGLEYAYNKIFDAKEKLEHIAKGHDDYKKYIIFLTDGENSSPNIDNKESLFYCNEA-- 312 (371) Q Consensus 235 ~~~~~~~~~i~~l~~~g~T~~~~~~~~~~~~l~~~~~~~~~~~~~~~~~~k~ivl~TDG~~~~~~~~~~~~~~~c~~~-- 312 (371) .++..+++.++.+.+.|.++...|+..++..|.+.. .......+....+.|+++|||..... ..+-+.- T Consensus 289 ~nk~~~~~~i~~l~~k~~a~~~~~~e~aF~lL~~~n--~s~~~~~~~~C~~~iml~tdG~~~~~-------~~If~~yn~ 359 (1104) T KOG2353 289 RNKKVFKEAIETLDAKGIANYTAALEYAFSLLRDYN--DSRANTQRSPCNQAIMLITDGVDENA-------KEIFEKYNW 359 (1104) T ss_pred HHHHHHHHHHHHHCCCCCCCHHHHHHHHHHHHHHHC--CCCCCCCCCCCCEEEEEEECCCCCCH-------HHHHHHHCC T ss_conf 779999999864141254124355778999998744--45544322500104577624775108-------999986036 Q ss_pred HHCCCEEEEEEECCCC---CHHHHHHHCCCCEEEEECCHHHHHHHHHHHHHHHHHEEE Q ss_conf 9789879999941864---279899833898089828989999999999996400078 Q gi|254780833|r 313 KRRGAIVYAIGVQAEA---ADQFLKNCASPDRFYSVQNSRKLHDAFLRIGKEMVKQRI 367 (371) Q Consensus 313 k~~gi~i~tIg~~~~~---~~~~l~~cAs~~~~y~~~~~~~L~~af~~I~~~i~~~~i 367 (371) -++.|+|||.-+|... 
....+-.|++.|.|++..+..++.+--.+...-+...++ T Consensus 360 ~~~~Vrvftflig~~~~~~~~~~wmac~n~gyy~~I~~~~~v~~~~~~y~~vlsRp~v 417 (1104) T KOG2353 360 PDKKVRVFTFLIGDEVYDLDEIQWMACANKGYYVHIISIADVRENVLEYLDVLSRPLV 417 (1104) T ss_pred CCCCEEEEEEEECCCCCCCCCCHHHHHHCCCCEEECCCHHHCCHHHHHHHHHHCCCEE T ss_conf 7773599999924421345412122540788558646656458676556645320002 No 35 >PRK13406 bchD magnesium chelatase subunit D; Provisional Probab=99.11 E-value=1.3e-08 Score=70.35 Aligned_cols=171 Identities=17% Similarity=0.206 Sum_probs=125.0 Q ss_pred CCCCCCEEEEECCCCCCCCCCCCCHHHHHHHHHHHHHHCCCCCCCCCCCEEEEEEEEEECCC-CEEECCCCCCHHHHHHH Q ss_conf 34674069985276311566787214898999998750023355553110479878863674-17816665587899999 Q gi|254780833|r 165 SDIGLDMMMVLDVSLSMNDHFGPGMDKLGVATRSIREMLDIIKSIPDVNNVVRSGLVTFSSK-IVQTFPLAWGVQHIQEK 243 (371) Q Consensus 165 ~~~~idi~~viD~SgSm~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~f~~~-~~~~~~lt~~~~~~~~~ 243 (371) .+.+.-++||+|-||||.. .|+..++.++..++... +....+++++.|... +....|.|......+.. T Consensus 398 ~r~~~lviFvVDASGS~A~------~Rm~~aKGAV~~LL~dA-----Y~~RD~ValIaFRG~~AevlLPPTrSv~~A~r~ 466 (584) T PRK13406 398 QRSETTTIFVVDASGSAAL------HRLAEAKGAVELLLAEC-----YVRRDHVALVAFRGRGAELLLPPTRSLVRAKRS 466 (584) T ss_pred CCCCEEEEEEEECCCCHHH------HHHHHHHHHHHHHHHHH-----HHHHCEEEEEEECCCCCEEEECCCCCHHHHHHH T ss_conf 2566069999828862799------99999999999999999-----960044789987687630741886559999999 Q ss_pred HHCCCCCCCCCCCCHHHHHHHHHHHHHHCCCCCCCCCCCCCEEEEEEECCCCCCCCC-------CHHHHHHHHHHHHHCC Q ss_conf 740156887456423788999987421101234677766616999984045888888-------9789999999999789 Q gi|254780833|r 244 INRLIFGSTTKSTPGLEYAYNKIFDAKEKLEHIAKGHDDYKKYIIFLTDGENSSPNI-------DNKESLFYCNEAKRRG 316 (371) Q Consensus 244 i~~l~~~g~T~~~~~~~~~~~~l~~~~~~~~~~~~~~~~~~k~ivl~TDG~~~~~~~-------~~~~~~~~c~~~k~~g 316 (371) +..|..+|+|....|+..+++........ .....+||+|||--|-+-. 
.......++..++..| T Consensus 467 L~~LP~GG~TPLA~GL~~A~~l~~~~r~~---------~~~p~~VllTDGRaNv~ldg~~~r~~a~~da~~~A~~l~~~g 537 (584) T PRK13406 467 LAGLPGGGGTPLAAGLDAALALALSVRRK---------GQTPTVVLLTDGRANIARDGAGGRAQAEEDALAAARALRAAG 537 (584) T ss_pred HHCCCCCCCCHHHHHHHHHHHHHHHHHCC---------CCCEEEEEECCCCCCCCCCCCCCCHHHHHHHHHHHHHHHHCC T ss_conf 96299999885999999999999997557---------995489998279877787778871148999999999999769 Q ss_pred CEEEEEEECCCCCHHHHHHHCC--CCEEEEECCH--HHHHHHHH Q ss_conf 8799999418642798998338--9808982898--99999999 Q gi|254780833|r 317 AIVYAIGVQAEAADQFLKNCAS--PDRFYSVQNS--RKLHDAFL 356 (371) Q Consensus 317 i~i~tIg~~~~~~~~~l~~cAs--~~~~y~~~~~--~~L~~af~ 356 (371) |...+|-.+.. .....+.+|. ++.||...+. +.|..+-+ T Consensus 538 ~~~vVIDT~~~-~~~~a~~LA~~l~a~Y~~Lp~~~A~~l~~~V~ 580 (584) T PRK13406 538 LPALVIDTSPR-PQPQARALAEAMGARYLPLPRADATRLSQAVR 580 (584) T ss_pred CCEEEEECCCC-CCHHHHHHHHHCCCCEEECCCCCHHHHHHHHH T ss_conf 97899948988-86269999998399189789789899999999 No 36 >TIGR02031 BchD-ChlD magnesium chelatase ATPase subunit D; InterPro: IPR011776 This entry represents one of two ATPase subunits of the trimeric magnesium chelatase responsible for insertion of magnesium ion into protoporphyrin IX. This is an essential step in the biosynthesis of both chlorophyll and bacteriochlorophyll. This subunit is found in green plants, photosynthetic algae, cyanobacteria and other photosynthetic bacteria. Unlike subunit I (IPR011775 from INTERPRO), this subunit is not found in archaea.; GO: 0005524 ATP binding, 0016851 magnesium chelatase activity, 0015995 chlorophyll biosynthetic process. 
Probab=99.09 E-value=2.1e-09 Score=75.11 Aligned_cols=170 Identities=20% Similarity=0.254 Sum_probs=127.4 Q ss_pred CCCCCCEEEEECCCCCCCCCCCCCHHHHHHHHHHHHHHCCCCCCCCCCCEEEEEEEEEECCC-CEEECCCCCCHHHHHHH Q ss_conf 34674069985276311566787214898999998750023355553110479878863674-17816665587899999 Q gi|254780833|r 165 SDIGLDMMMVLDVSLSMNDHFGPGMDKLGVATRSIREMLDIIKSIPDVNNVVRSGLVTFSSK-IVQTFPLAWGVQHIQEK 243 (371) Q Consensus 165 ~~~~idi~~viD~SgSm~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~f~~~-~~~~~~lt~~~~~~~~~ 243 (371) ...+--++|++|-||||.. +.|+..||-++..++.+. +...+.++++++.|... +....|.|.....-|.. T Consensus 507 ~ksg~L~IF~VDASGSsaa-----~~Rm~~AKGAV~~LL~~A---Yv~RD~vkVaLi~FRG~~Ae~LLPPsrSv~~aKr~ 578 (705) T TIGR02031 507 RKSGALLIFVVDASGSSAA-----VARMSEAKGAVELLLGEA---YVHRDQVKVALIAFRGTAAEVLLPPSRSVELAKRR 578 (705) T ss_pred CCCCCEEEEEEECCHHHHH-----HHHHHHHHHHHHHHHHHH---HHHCCCEEEEEEECCCCHHHHCCCCHHHHHHHHHH T ss_conf 0288279997606357899-----999987789999998765---44136035776304443000037852358999999 Q ss_pred HHCCCCCCCCCCCCHHHHHHHHHHHHHHCCCCCCCCCCCCCEEEEEEECCCCCC-------C---CC----------CHH Q ss_conf 740156887456423788999987421101234677766616999984045888-------8---88----------978 Q gi|254780833|r 244 INRLIFGSTTKSTPGLEYAYNKIFDAKEKLEHIAKGHDDYKKYIIFLTDGENSS-------P---NI----------DNK 303 (371) Q Consensus 244 i~~l~~~g~T~~~~~~~~~~~~l~~~~~~~~~~~~~~~~~~k~ivl~TDG~~~~-------~---~~----------~~~ 303 (371) ++.|..||||-...||..||..-.++. ..|+-.+-.|||+|||=.|- + .. -.. 
T Consensus 579 L~~LP~GGGtPLA~gL~~A~~~a~qar-------~~GD~~~~~ivliTDGRgNvpL~~~~DP~~~~~~r~PrPts~~l~~ 651 (705) T TIGR02031 579 LDVLPGGGGTPLAAGLAAAVEVAKQAR-------SRGDVGRITIVLITDGRGNVPLDASVDPKAAKADRLPRPTSEELKE 651 (705) T ss_pred HCCCCCCCCCHHHHHHHHHHHHHHHHH-------CCCCCCEEEEEEECCCCCCCCCCCCCCCCCCCCCCCCCCHHHHHHH T ss_conf 715899985678999999999998510-------2688524556776077877467656786100235678726899999 Q ss_pred HHHHHHHHHHHCCCEEEEEEECCCC-CHHHHHHHCC--CCEEEEECCHH Q ss_conf 9999999999789879999941864-2798998338--98089828989 Q gi|254780833|r 304 ESLFYCNEAKRRGAIVYAIGVQAEA-ADQFLKNCAS--PDRFYSVQNSR 349 (371) Q Consensus 304 ~~~~~c~~~k~~gi~i~tIg~~~~~-~~~~l~~cAs--~~~~y~~~~~~ 349 (371) +...++..+.+.||-.-+|=-.... ...+++++|. .+|||.-.++. T Consensus 652 e~~~lA~~i~~~G~~~lVIDT~~~f~s~G~a~~lA~~~~a~Y~yLP~a~ 700 (705) T TIGR02031 652 EVLALARKIREAGISALVIDTANKFVSTGFAKKLARKLGARYIYLPNAT 700 (705) T ss_pred HHHHHHHHHHHCCCCEEEEECCCCCCCCCHHHHHHHHHCCCEEECCCCC T ss_conf 9999999988718865898267786676448999998589067136888 No 37 >cd01453 vWA_transcription_factor_IIH_type Transcription factors IIH type: TFIIH is a multiprotein complex that is one of the five general transcription factors that binds RNA polymerase II holoenzyme. Orthologues of these genes are found in all completed eukaryotic genomes and all these proteins contain a VWA domain. The p44 subunit of TFIIH functions as a DNA helicase in RNA polymerase II transcription initiation and DNA repair, and its transcriptional activity is dependent on its C-terminal Zn-binding domains. The function of the vWA domain is unclear, but may be involved in complex assembly. The MIDAS motif is not conserved in this sub-group. 
Probab=98.89 E-value=3.8e-07 Score=61.50 Aligned_cols=169 Identities=15% Similarity=0.220 Sum_probs=119.1 Q ss_pred CEEEEECCCCCCCCCCCCCHHHHHHHHHHHHHHCCCCCCCCCCCEEEEEEEEEECCC-CEEECCCCCCHHHHHHHHHCCC Q ss_conf 069985276311566787214898999998750023355553110479878863674-1781666558789999974015 Q gi|254780833|r 170 DMMMVLDVSLSMNDHFGPGMDKLGVATRSIREMLDIIKSIPDVNNVVRSGLVTFSSK-IVQTFPLAWGVQHIQEKINRLI 248 (371) Q Consensus 170 di~~viD~SgSm~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~f~~~-~~~~~~lt~~~~~~~~~i~~l~ 248 (371) .+++++|.|.+|....- ..+|+....+.+..++.++-. .+...+.|++...++ +....+++.|.......+.... T Consensus 5 ~l~iiiD~S~am~~~D~-~PtRl~~~~~~l~~Fi~effd---qNPisqlGii~~rn~~a~~ls~lsgn~~~hi~~l~~~~ 80 (183) T cd01453 5 HLIIVIDCSRSMEEQDL-KPSRLAVVLKLLELFIEEFFD---QNPISQLGIISIKNGRAEKLTDLTGNPRKHIQALKTAR 80 (183) T ss_pred EEEEEEECCHHHHHCCC-CCCHHHHHHHHHHHHHHHHHC---CCCCCEEEEEEEECCEEEEEEECCCCHHHHHHHHHHCC T ss_conf 99999988376775658-954999999999999999870---79740489999946816997646899899999998545 Q ss_pred -CCCCCCCCCHHHHHHHHHHHHHHCCCCCCCCCCCCCEEEEEEECCCCCCCCCCHHHHHHHHHHHHHCCCEEEEEEECCC Q ss_conf -6887456423788999987421101234677766616999984045888888978999999999978987999994186 Q gi|254780833|r 249 -FGSTTKSTPGLEYAYNKIFDAKEKLEHIAKGHDDYKKYIIFLTDGENSSPNIDNKESLFYCNEAKRRGAIVYAIGVQAE 327 (371) Q Consensus 249 -~~g~T~~~~~~~~~~~~l~~~~~~~~~~~~~~~~~~k~ivl~TDG~~~~~~~~~~~~~~~c~~~k~~gi~i~tIg~~~~ 327 (371) +.|......|+..+...|... .....+.++|++. .-.+. ++.+.....+.+|+.+|++.+||+... 
T Consensus 81 ~~~G~~SLqN~Le~A~~~L~~~---------P~~~sREILiI~~-Sl~t~---DpgdI~~ti~~lk~~~IrvsvI~l~aE 147 (183) T cd01453 81 ECSGEPSLQNGLEMALESLKHM---------PSHGSREVLIIFS-SLSTC---DPGNIYETIDKLKKENIRVSVIGLSAE 147 (183) T ss_pred CCCCCHHHHHHHHHHHHHHHHC---------CCCCCEEEEEEEC-CCCCC---CCCCHHHHHHHHHHCCCEEEEEEECHH T ss_conf 8999813999999999998208---------9878448999975-65347---976499999999983978999974278 Q ss_pred CCHHHHHHHC--CCCEEEEECCHHHHHHHHHH Q ss_conf 4279899833--89808982898999999999 Q gi|254780833|r 328 AADQFLKNCA--SPDRFYSVQNSRKLHDAFLR 357 (371) Q Consensus 328 ~~~~~l~~cA--s~~~~y~~~~~~~L~~af~~ 357 (371) . ..+|.++ ++|.|+-+-|..-+.+.+.+ T Consensus 148 v--~I~k~l~~~TgG~y~V~lde~H~~~ll~~ 177 (183) T cd01453 148 M--HICKEICKATNGTYKVILDETHLKELLLE 177 (183) T ss_pred H--HHHHHHHHHHCCEEEEECCHHHHHHHHHH T ss_conf 9--99999999839976875399999999995 No 38 >cd01457 vWA_ORF176_type VWA ORF176 type: Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses. In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. 
Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most Probab=98.79 E-value=1.4e-07 Score=64.14 Aligned_cols=157 Identities=16% Similarity=0.124 Sum_probs=90.8 Q ss_pred CCEEEEECCCCCCCCCCCC-CHHHHHHHHHHHHHHCCCCCCCCCCCEEEEEEEEEECCCCEEECCCCCCHHHHHHHHHCC Q ss_conf 4069985276311566787-214898999998750023355553110479878863674178166655878999997401 Q gi|254780833|r 169 LDMMMVLDVSLSMNDHFGP-GMDKLGVATRSIREMLDIIKSIPDVNNVVRSGLVTFSSKIVQTFPLAWGVQHIQEKINRL 247 (371) Q Consensus 169 idi~~viD~SgSm~~~~~~-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~f~~~~~~~~~lt~~~~~~~~~i~~l 247 (371) -|.+|++|.||||.+..++ ...|.+.+++++..+.......+.-. +.++-++........ -+.+.+...-... T Consensus 3 rD~v~lIDdSgSM~~~d~~~~~sRW~~a~~al~~iA~~c~~~D~DG----Idvyfln~~~~~~~~--~~~~~V~~iF~~~ 76 (199) T cd01457 3 RDYTLLIDKSGSMAEADEAKERSRWEEAQESTRALARKCEEYDSDG----ITVYLFSGDFRRYDN--VNSSKVDQLFAEN 76 (199) T ss_pred CCEEEEEECCCCCCCCCCCCCCCHHHHHHHHHHHHHHHHHHCCCCC----CEEEEEECCCCCCCC--CCHHHHHHHHHCC T ss_conf 7779999688876677678887629999999999999998748899----879996277645688--8999999998558 Q ss_pred CCCCCCCCCCHHHHHHHHHHHHHHCCCCCCCCCCCCC-EEEEEEECCCCCCCCCCHHHHHHHHHHHHH-CCCEEEEEEEC Q ss_conf 5688745642378899998742110123467776661-699998404588888897899999999997-89879999941 Q gi|254780833|r 248 IFGSTTKSTPGLEYAYNKIFDAKEKLEHIAKGHDDYK-KYIIFLTDGENSSPNIDNKESLFYCNEAKR-RGAIVYAIGVQ 325 (371) Q Consensus 248 ~~~g~T~~~~~~~~~~~~l~~~~~~~~~~~~~~~~~~-k~ivl~TDG~~~~~~~~~~~~~~~c~~~k~-~gi~i~tIg~~ 325 (371) .|.|.|+....+.............. 
...++ -.||++|||.+++...-.......+..+.+ ..+-|-.+.+| T Consensus 77 ~P~G~T~~g~~L~~il~~y~~r~~~~------~~kp~g~~iIVITDG~p~D~~av~~~Ii~aa~kLd~~~qlgIqF~QVG 150 (199) T cd01457 77 SPDGGTNLAAVLQDALNNYFQRKENG------ATCPEGETFLVITDGAPDDKDAVERVIIKASDELDADNELAISFLQIG 150 (199) T ss_pred CCCCCCCHHHHHHHHHHHHHHHHHCC------CCCCCCEEEEEEECCCCCCCHHHHHHHHHHHHHHCCCCCCCEEEEEEC T ss_conf 98997963799999989998732006------899986079998279979828899999999986344010036777855 Q ss_pred CCCC-HHHHHHHC Q ss_conf 8642-79899833 Q gi|254780833|r 326 AEAA-DQFLKNCA 337 (371) Q Consensus 326 ~~~~-~~~l~~cA 337 (371) .|.. ..+|+.+= T Consensus 151 ~D~~A~~fL~~LD 163 (199) T cd01457 151 RDPAATAFLKALD 163 (199) T ss_pred CCHHHHHHHHHHC T ss_conf 9688999999858 No 39 >pfam00362 Integrin_beta Integrin, beta chain. Integrins have been found in animals and their homologues have also been found in cyanobacteria, probably due to horizontal gene transfer. The sequences repeats have been trimmed due to an overlap with EGF. Probab=98.73 E-value=1.3e-06 Score=58.24 Aligned_cols=192 Identities=17% Similarity=0.269 Sum_probs=113.2 Q ss_pred EEECCCCCCCCCCEEEEECCCCCCCCCCCCCHHHHHHHHHHHHHHCCCCCCCCCCCEEEEEEEEEECCCC---------- Q ss_conf 2210001346740699852763115667872148989999987500233555531104798788636741---------- Q gi|254780833|r 158 SVKISSKSDIGLDMMMVLDVSLSMNDHFGPGMDKLGVATRSIREMLDIIKSIPDVNNVVRSGLVTFSSKI---------- 227 (371) Q Consensus 158 ~~~~~~~~~~~idi~~viD~SgSm~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~f~~~~---------- 227 (371) ........+.|+|+++++|.|.||... .++++.+...+...+..+.. 
..|+|.-+|-+++ T Consensus 90 ~~~~~~aedYPVDLYYLMDLS~SM~dD----l~~lk~LG~~La~~m~~iT~------nfrlGFGSFVDK~v~P~~~t~p~ 159 (424) T pfam00362 90 KLKVRQAEDYPVDLYYLMDLSYSMKDD----LENLKTLGTDLAKEMANITS------NFRLGFGSFVDKTVSPYVSTVPE 159 (424) T ss_pred EEEEECCCCCCCEEEEEEECCHHHHHH----HHHHHHHHHHHHHHHHHHCC------CCEEECCCCCCCCCCCCCCCCHH T ss_conf 999871356971369986054146778----99999999999999986140------43563020004766885337977 Q ss_pred -------------------EEECCCCCCHHHHHHHHHCCCCCCCCCCCCHHHHHHHHHHHHHHCCCCCCCCCCCCCEEEE Q ss_conf -------------------7816665587899999740156887456423788999987421101234677766616999 Q gi|254780833|r 228 -------------------VQTFPLAWGVQHIQEKINRLIFGSTTKSTPGLEYAYNKIFDAKEKLEHIAKGHDDYKKYII 288 (371) Q Consensus 228 -------------------~~~~~lt~~~~~~~~~i~~l~~~g~T~~~~~~~~~~~~l~~~~~~~~~~~~~~~~~~k~iv 288 (371) .-+.|||.+.......++....+|+-...+| |+.+|-....=....++ ++..++++| T Consensus 160 ~l~~PC~~~~~~C~~~fgf~~~l~LT~~~~~F~~~v~~q~iSgNlD~PEG---GfDAlmQ~aVC~~~IGW-R~~arrllv 235 (424) T pfam00362 160 KLKNPCSSKNPGCQPPFGFRHVLSLTDDTDRFNEEVKKQKISGNLDAPEG---GFDAIMQAAVCGEEIGW-RNEARRLLV 235 (424) T ss_pred HHCCCCCCCCCCCCCCCCEEEECCCCCCHHHHHHHHHHCCCCCCCCCCCC---CHHHHHHHHHHCCCCCC-CCCCEEEEE T ss_conf 85398757788988980222002467778999999874636467778750---17777788761423377-778528999 Q ss_pred EEECCCCC--------------CC--------------CCCHHHHHHHHHHHHHCCCE-EEEEEECCCCCHHHHHHHCCC Q ss_conf 98404588--------------88--------------88978999999999978987-999994186427989983389 Q gi|254780833|r 289 FLTDGENS--------------SP--------------NIDNKESLFYCNEAKRRGAI-VYAIGVQAEAADQFLKNCASP 339 (371) Q Consensus 289 l~TDG~~~--------------~~--------------~~~~~~~~~~c~~~k~~gi~-i~tIg~~~~~~~~~l~~cAs~ 339 (371) |.||+..+ ++ ..+.+...++.+.+++++|. 
||+|.=..-.-.+.|.+.-.+ T Consensus 236 ~~TDa~fH~AgDGkL~GIv~PNDg~CHL~~~g~Yt~s~~~DYPSv~ql~~kl~ennI~~IFAVt~~~~~~Y~~Ls~~i~g 315 (424) T pfam00362 236 FTTDAGFHFAGDGKLGGIVEPNDGQCHLDDNGEYTASTTLDYPSVGQLAEKLSENNINPIFAVTENVVDLYKELSELIPG 315 (424) T ss_pred EECCCCCCCCCCCCEEEEECCCCCCEEECCCCCCCCCCCCCCCCHHHHHHHHHHCCCEEEEEECHHHHHHHHHHHHHCCC T ss_conf 98588751357763343534888730448987614456678888899999998649259999750245899999975776 Q ss_pred CEE----EEECCHHH-HHHHHHHHHHHHH Q ss_conf 808----98289899-9999999999640 Q gi|254780833|r 340 DRF----YSVQNSRK-LHDAFLRIGKEMV 363 (371) Q Consensus 340 ~~~----y~~~~~~~-L~~af~~I~~~i~ 363 (371) ... -+..|--+ +.+++++|..++. T Consensus 316 s~vg~L~~DSsNIv~LI~~aY~ki~S~V~ 344 (424) T pfam00362 316 STVGVLSSDSSNVVQLIKDAYNKISSKVE 344 (424) T ss_pred CEEEEECCCCHHHHHHHHHHHHHHHEEEE T ss_conf 52566246750289999999987522899 No 40 >COG4548 NorD Nitric oxide reductase activation protein [Inorganic ion transport and metabolism] Probab=98.69 E-value=3.9e-07 Score=61.44 Aligned_cols=175 Identities=17% Similarity=0.184 Sum_probs=113.6 Q ss_pred CCCEEEEECCCCCCCCCCCCCHHHHHHHHHHHHHHCCC----------CCCCCCCCEEEEEEEEEECCCCEEECCCCCCH Q ss_conf 74069985276311566787214898999998750023----------35555311047987886367417816665587 Q gi|254780833|r 168 GLDMMMVLDVSLSMNDHFGPGMDKLGVATRSIREMLDI----------IKSIPDVNNVVRSGLVTFSSKIVQTFPLAWGV 237 (371) Q Consensus 168 ~idi~~viD~SgSm~~~~~~~~~~~~~~~~~~~~~~~~----------~~~~~~~~~~~~~~~~~f~~~~~~~~~lt~~~ 237 (371) .+-+.+.+|.|-||..+...-..-++.-.+++.-+... ++.......++++..+.-.+. .-. 
T Consensus 446 Dla~TLLvD~S~St~a~mdetrRvidl~~eaL~~la~~~qa~gd~~~~~~fts~rr~~vri~tvk~FDe--------s~~ 517 (637) T COG4548 446 DLAFTLLVDVSASTDAKMDETRRVIDLFHEALLVLAHGHQALGDSEDILDFTSRRRPWVRINTVKDFDE--------SMG 517 (637) T ss_pred CCEEEEEEECCCCHHHHHHHHHHHHHHHHHHHHHHHCHHHHHCCHHHHCCCHHHCCCCEEEEEEECCCC--------CCC T ss_conf 614678762343367776522125787899999861326551788874372121376312311103430--------004 Q ss_pred HHHHHHHHCCCCCCCCCCCCHHHHHHHHHHHHHHCCCCCCCCCCCCCEEEEEEECCCCCCCC-----CCHHHHHHHHHHH Q ss_conf 89999974015688745642378899998742110123467776661699998404588888-----8978999999999 Q gi|254780833|r 238 QHIQEKINRLIFGSTTKSTPGLEYAYNKIFDAKEKLEHIAKGHDDYKKYIIFLTDGENSSPN-----IDNKESLFYCNEA 312 (371) Q Consensus 238 ~~~~~~i~~l~~~g~T~~~~~~~~~~~~l~~~~~~~~~~~~~~~~~~k~ivl~TDG~~~~~~-----~~~~~~~~~c~~~ 312 (371) .++...|-.|.|+..|....++..+-.-|. ..+..+|++|++|||++|.-+ ..-..+......+ T Consensus 518 ~~~~~RImALePg~ytR~G~AIR~As~kL~-----------~rpq~qklLivlSDGkPnd~d~YEgr~gIeDTr~AV~ea 586 (637) T COG4548 518 ETVGPRIMALEPGYYTRDGAAIRHASAKLM-----------ERPQRQKLLIVLSDGKPNDFDHYEGRFGIEDTREAVIEA 586 (637) T ss_pred CCCCHHHEECCCCCCCCCCHHHHHHHHHHH-----------CCCCCCEEEEEECCCCCCCCCCCCCCCCHHHHHHHHHHH T ss_conf 553312213376644431099999999983-----------474112489994489854344323321115379999999 Q ss_pred HHCCCEEEEEEECCCCCHHHHHHHCCCCEEEEECCHHHHHHHHHHHHHHH Q ss_conf 97898799999418642798998338980898289899999999999964 Q gi|254780833|r 313 KRRGAIVYAIGVQAEAADQFLKNCASPDRFYSVQNSRKLHDAFLRIGKEM 362 (371) Q Consensus 313 k~~gi~i~tIg~~~~~~~~~l~~cAs~~~~y~~~~~~~L~~af~~I~~~i 362 (371) +..||.+|.|.+..+...+ +...-.-+.|-.|++..+|-.++-.|-+++ T Consensus 587 Rk~Gi~VF~Vtld~ea~~y-~p~~fgqngYa~V~~v~~LP~~L~~lyrkL 635 (637) T COG4548 587 RKSGIEVFNVTLDREAISY-LPALFGQNGYAFVERVAQLPGALPPLYRKL 635 (637) T ss_pred HHCCCEEEEEEECCHHHHH-HHHHHCCCCEEECCCHHHCCHHHHHHHHHH T ss_conf 8658347999833305555-288852674697024001605579999996 No 41 >smart00187 INB Integrin beta subunits (N-terminal portion of extracellular region). 
Portion of beta integrins that lies N-terminal to their EGF-like repeats. Integrins are cell adhesion molecules that mediate cell-extracellular matrix and cell-cell interactions. They contain both alpha and beta subunits. Beta integrins are proposed to have a von Willebrand factor type-A "insert" or "I" -like domain (although this remains to be confirmed). Probab=98.68 E-value=2.1e-06 Score=57.02 Aligned_cols=191 Identities=16% Similarity=0.250 Sum_probs=112.8 Q ss_pred EEECCCCCCCCCCEEEEECCCCCCCCCCCCCHHHHHHHHHHHHHHCCCCCCCCCCCEEEEEEEEEECCCC---------- Q ss_conf 2210001346740699852763115667872148989999987500233555531104798788636741---------- Q gi|254780833|r 158 SVKISSKSDIGLDMMMVLDVSLSMNDHFGPGMDKLGVATRSIREMLDIIKSIPDVNNVVRSGLVTFSSKI---------- 227 (371) Q Consensus 158 ~~~~~~~~~~~idi~~viD~SgSm~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~f~~~~---------- 227 (371) ........+.|+|+.+++|.|.||... .++++.+...+...+..+. ...|+|.-+|-+++ T Consensus 89 ~~~~~~aedYPVDLYYLMDLS~SM~dD----l~~l~~LG~~La~~m~~iT------~nfrlGFGsFVDK~v~Py~~t~p~ 158 (423) T smart00187 89 TLTVRQAEDYPVDLYYLMDLSYSMKDD----LDNLKSLGDDLAREMKGLT------SNFRLGFGSFVDKTVSPFVSTRPE 158 (423) T ss_pred EEEEECCCCCCCEEEEEEECCHHHHHH----HHHHHHHHHHHHHHHHHHC------CCCEEEEEECCCCCCCCCCCCCHH T ss_conf 987303246971268885054457888----9999999999999998640------054551111203665775448978 Q ss_pred -------------------EEECCCCCCHHHHHHHHHCCCCCCCCCCCCHHHHHHHHHHHHHHCCCCCCCCCCCCCEEEE Q ss_conf -------------------7816665587899999740156887456423788999987421101234677766616999 Q gi|254780833|r 228 -------------------VQTFPLAWGVQHIQEKINRLIFGSTTKSTPGLEYAYNKIFDAKEKLEHIAKGHDDYKKYII 288 (371) Q Consensus 228 -------------------~~~~~lt~~~~~~~~~i~~l~~~g~T~~~~~~~~~~~~l~~~~~~~~~~~~~~~~~~k~iv 288 (371) .-+.|||.|.....+.++....+|+-...+| |+.+|-....=....++ +++.+|.+| T Consensus 159 ~l~~PC~~~~~~C~ppfgf~n~l~LT~d~~~F~~~V~~q~iSgNlD~PEG---GfDAlmQ~avC~~~IGW-R~~arrllV 234 (423) T smart00187 159 KLENPCPNYNLTCEPPYGFKHVLSLTDDTDEFNEEVKKQRISGNLDAPEG---GFDAIMQAAVCTEQIGW-REDARRLLV 234 (423) 
T ss_pred HHCCCCCCCCCCCCCCCCCCCCCCCCCCHHHHHHHHHHCCCCCCCCCCCC---CHHHHHHHHHHCCCCCC-CCCCEEEEE T ss_conf 85499878898767981301112367888999999862536346688761---27788888752000376-557438999 Q ss_pred EEECCCCC--------------CCC--------------CCHHHHHHHHHHHHHCCCE-EEEEEECCCCCHHHHHHHCCC Q ss_conf 98404588--------------888--------------8978999999999978987-999994186427989983389 Q gi|254780833|r 289 FLTDGENS--------------SPN--------------IDNKESLFYCNEAKRRGAI-VYAIGVQAEAADQFLKNCASP 339 (371) Q Consensus 289 l~TDG~~~--------------~~~--------------~~~~~~~~~c~~~k~~gi~-i~tIg~~~~~~~~~l~~cAs~ 339 (371) |.||+..+ ++. .+.+...++.+.+++++|. ||+|.=..-.-.+.|...-.+ T Consensus 235 f~TDa~fH~AgDGkL~GIv~PNDg~CHLd~~g~Yt~s~~~DYPSi~ql~~kl~ennI~~IFAVT~~~~~~Y~~Ls~~ipg 314 (423) T smart00187 235 FSTDAGFHFAGDGKLAGIVQPNDGQCHLDNNGEYTMSTTQDYPSIGQLNQKLAENNINPIFAVTKKQVSLYKELSALIPG 314 (423) T ss_pred EECCCCCCCCCCCCEEEEECCCCCCEEECCCCCCCCCCCCCCCCHHHHHHHHHHCCCEEEEEECCCHHHHHHHHHHHCCC T ss_conf 98378630236762443543788730327888524456567887899999998539327998522045699999875775 Q ss_pred ---CE-EEEECCHHH-HHHHHHHHHHHH Q ss_conf ---80-898289899-999999999964 Q gi|254780833|r 340 ---DR-FYSVQNSRK-LHDAFLRIGKEM 362 (371) Q Consensus 340 ---~~-~y~~~~~~~-L~~af~~I~~~i 362 (371) |. --+..|--+ +.++|++|..++ T Consensus 315 s~vg~L~~DSsNVv~LI~~aY~ki~S~V 342 (423) T smart00187 315 SSVGVLSEDSSNVVELIKDAYNKISSRV 342 (423) T ss_pred CEEEEECCCCHHHHHHHHHHHHHHCEEE T ss_conf 4035524575138999999998750189 No 42 >pfam11775 CobT_C Cobalamin biosynthesis protein CobT VWA domain. This family consists of several bacterial cobalamin biosynthesis (CobT) proteins. CobT is involved in the transformation of precorrin-3 into cobyrinic acid. 
Probab=98.62 E-value=9.5e-07 Score=59.09 Aligned_cols=163 Identities=18% Similarity=0.207 Sum_probs=85.8 Q ss_pred CCEEEEECCCCCCCCCCCCCHHHHHHH---HHHHHHHCCCCCCCCCCCEEEEEEEEEECCCCEE-----------ECCC- Q ss_conf 406998527631156678721489899---9998750023355553110479878863674178-----------1666- Q gi|254780833|r 169 LDMMMVLDVSLSMNDHFGPGMDKLGVA---TRSIREMLDIIKSIPDVNNVVRSGLVTFSSKIVQ-----------TFPL- 233 (371) Q Consensus 169 idi~~viD~SgSm~~~~~~~~~~~~~~---~~~~~~~~~~~~~~~~~~~~~~~~~~~f~~~~~~-----------~~~l- 233 (371) .-+.+++|+||||... ++..+ .+.+...++.....-++ .|..+..|+... ..|- T Consensus 13 t~VtlLID~SGSMrgr------~i~~Aa~~adiL~~~Ler~gv~~EI-----LGFtT~~wkGg~~r~~w~~~G~p~~pgR 81 (220) T pfam11775 13 ACVQLLIDLSGSMGGR------KIQLAAACADIIADALDRCGVKNEI-----LGFTTFAWKGGPDREAMLAAGFPAFEAL 81 (220) T ss_pred EEEEEEEECCCCCCCC------HHHHHHHHHHHHHHHHHHCCCCEEE-----EEECCCCCCCCCHHHHHHHCCCCCCCHH T ss_conf 2899988688888988------6789999999999999876998799-----8540476577502999987599767236 Q ss_pred ----------CCCHH--HHHHHHHCCCCC---CCCCCCCHHHHHHHHHHHHHHCCCCCCCCCCCCCEEEEEEECCCC--- Q ss_conf ----------55878--999997401568---874564237889999874211012346777666169999840458--- Q gi|254780833|r 234 ----------AWGVQ--HIQEKINRLIFG---STTKSTPGLEYAYNKIFDAKEKLEHIAKGHDDYKKYIIFLTDGEN--- 295 (371) Q Consensus 234 ----------t~~~~--~~~~~i~~l~~~---g~T~~~~~~~~~~~~l~~~~~~~~~~~~~~~~~~k~ivl~TDG~~--- 295 (371) ..+.. .-+..+..+.-. -..--.+++.|+.+.|.. 
.+..+|++++++||.| T Consensus 82 lndl~hiiyk~ad~~wrrar~~lg~m~~~gllkENiDGEAL~wA~~RL~~-----------R~e~RkILmViSDGaP~dd 150 (220) T pfam11775 82 LLDIIHIINEKADAPEIRARKNLGCMCEEFLLKENIDGEALAQAAKLFAG-----------RMEDKKILLMISDGAPCDD 150 (220) T ss_pred HHHHHHHCCCCCCCCHHHHHHHHHHHHHHHHHHCCCCCHHHHHHHHHHHC-----------CCCCCEEEEEECCCCCCCC T ss_conf 65566421244688307789998777663144318971999999999863-----------9312469999758996776 Q ss_pred -----CCCCCCHHHHHHHHHHHH-HCCCEEEEEEECCCCCHHHHHHHCCCCEEEEECCHHHHHHHH-HHHH Q ss_conf -----888889789999999999-789879999941864279899833898089828989999999-9999 Q gi|254780833|r 296 -----SSPNIDNKESLFYCNEAK-RRGAIVYAIGVQAEAADQFLKNCASPDRFYSVQNSRKLHDAF-LRIG 359 (371) Q Consensus 296 -----~~~~~~~~~~~~~c~~~k-~~gi~i~tIg~~~~~~~~~l~~cAs~~~~y~~~~~~~L~~af-~~I~ 359 (371) |.++.-............ ..+|++..||++-|..+++-+. +..+.+.+||..++ ++++ T Consensus 151 st~s~n~~~yL~~hLr~vi~~ie~~~~iel~aIGIghDv~r~yY~~------av~i~d~eeL~~~~~~~L~ 215 (220) T pfam11775 151 STLSVAAGDGFEQHLRHIIEEIETLSEIDLIAIGIGHDAPRRYYKN------AALINDAEELGGAITEELA 215 (220) T ss_pred CCCCCCCCCCHHHHHHHHHHHHHCCCCCEEEEEEECCCCCHHHHHC------CEEECCHHHHHHHHHHHHH T ss_conf 4112587776799999999998506882699987477768666506------5686038886599999999 No 43 >pfam04056 Ssl1 Ssl1-like. Ssl1-like proteins are 40kDa subunits of the Transcription factor II H complex. Probab=98.32 E-value=0.00013 Score=46.17 Aligned_cols=169 Identities=15% Similarity=0.193 Sum_probs=109.9 Q ss_pred CEEEEECCCCCCCCCCCCCHHHHHHHHHHHHHHCCCCCCCCCCCEEEEEEEEEECC-CCEEECCCCCCHHHHHHHHHCCC Q ss_conf 06998527631156678721489899999875002335555311047987886367-41781666558789999974015 Q gi|254780833|r 170 DMMMVLDVSLSMNDHFGPGMDKLGVATRSIREMLDIIKSIPDVNNVVRSGLVTFSS-KIVQTFPLAWGVQHIQEKINRLI 248 (371) Q Consensus 170 di~~viD~SgSm~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~f~~-~~~~~~~lt~~~~~~~~~i~~l~ 248 (371) .+++++|-|.+|....-.+ +|+....+.+..++.++- +.|...+.|++...+ .+..+.+++.+......++..+. 
T Consensus 54 hl~iilD~S~aM~e~DlkP-~R~~~~l~~l~~Fi~efF---dqNPiSQlgii~~rn~~a~~ls~lsgnp~~hi~aL~~~~ 129 (250) T pfam04056 54 HLYIVLDCSRAMEEKDLRP-SRFACTIKYLETFVEEFF---DQNPISQIGLITCKDGRAHRLTDLTGNPRVHIKALKSLR 129 (250) T ss_pred EEEEEEECCHHHHHCCCCC-CHHHHHHHHHHHHHHHHH---HCCCCCCEEEEEEECCEEEEEEECCCCHHHHHHHHHHHH T ss_conf 8999998827676351586-489999999999999987---439830227999965713783325799899999999874 Q ss_pred ---CCCCCCCCCHHHHHHHHHHHHHHCCCCCCCCCCCCCEEEEEEECCCCCCCCCCHHHHHHHHHHHHHCCCEEEEEEEC Q ss_conf ---68874564237889999874211012346777666169999840458888889789999999999789879999941 Q gi|254780833|r 249 ---FGSTTKSTPGLEYAYNKIFDAKEKLEHIAKGHDDYKKYIIFLTDGENSSPNIDNKESLFYCNEAKRRGAIVYAIGVQ 325 (371) Q Consensus 249 ---~~g~T~~~~~~~~~~~~l~~~~~~~~~~~~~~~~~~k~ivl~TDG~~~~~~~~~~~~~~~c~~~k~~gi~i~tIg~~ 325 (371) +.|.-....|+..+...|... + ....+.++|++.-= ...++.+.....+.+|+.+|++.+||+. T Consensus 130 ~~~~~G~pSLqN~Le~a~~~L~~~-------P--~~~sREILii~gSL----~T~DPgdI~~tI~~l~~~~IrvsvI~La 196 (250) T pfam04056 130 EAECGGDPSLQNALELARASLKHV-------P--SHGSREVLIIFGSL----STCDPGDIYSTIDTLKKEKIRCSVIGLS 196 (250) T ss_pred CCCCCCCHHHHHHHHHHHHHHHCC-------C--CCCCEEEEEEEEEC----CCCCCCCHHHHHHHHHHCCCEEEEEEEC T ss_conf 069999920899999999887508-------9--87854899998204----4458865999999999759079998733 Q ss_pred CCCCHHHHHHHC--CCCEEEEECCHHHHHHHHHH Q ss_conf 864279899833--89808982898999999999 Q gi|254780833|r 326 AEAADQFLKNCA--SPDRFYSVQNSRKLHDAFLR 357 (371) Q Consensus 326 ~~~~~~~l~~cA--s~~~~y~~~~~~~L~~af~~ 357 (371) .. 
-...|.++ ++|.|.-+-|..-+.+.+.+ T Consensus 197 aE--v~Ick~l~~~T~G~y~V~lde~Hfk~ll~~ 228 (250) T pfam04056 197 AE--VFICKELCKATNGTYSVALDETHLKELLLE 228 (250) T ss_pred HH--HHHHHHHHHHHCCEEEEECCHHHHHHHHHH T ss_conf 89--999999999749988875699999999995 No 44 >COG2425 Uncharacterized protein containing a von Willebrand factor type A (vWA) domain [General function prediction only] Probab=98.21 E-value=1.6e-05 Score=51.74 Aligned_cols=159 Identities=21% Similarity=0.212 Sum_probs=87.9 Q ss_pred CEEEEECCCCCCCCCCCCCHHHHHHHHHHHHHHCCCCCCCCCCCEEEEEEEEEECCCCEE--ECCCCCCHHHHHHHHHCC Q ss_conf 069985276311566787214898999998750023355553110479878863674178--166655878999997401 Q gi|254780833|r 170 DMMMVLDVSLSMNDHFGPGMDKLGVATRSIREMLDIIKSIPDVNNVVRSGLVTFSSKIVQ--TFPLAWGVQHIQEKINRL 247 (371) Q Consensus 170 di~~viD~SgSm~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~f~~~~~~--~~~lt~~~~~~~~~i~~l 247 (371) .+++.+|.||||.+. +..-++..+..++.... ..+-++.+.-|.+.... ..+...+..++..++... T Consensus 274 pvilllD~SGSM~G~------~e~~AKAvalAl~~~al-----aenR~~~~~lF~s~~~~~el~~k~~~~~e~i~fL~~~ 342 (437) T COG2425 274 PVILLLDKSGSMSGF------KEQWAKAVALALMRIAL-----AENRDCYVILFDSEVIEYELYEKKIDIEELIEFLSYV 342 (437) T ss_pred CEEEEEECCCCCCCC------HHHHHHHHHHHHHHHHH-----HHCCCEEEEEECCCCEEEEECCCCCCHHHHHHHHHHH T ss_conf 879999588885782------88999999999999998-----8430538999525202555057745799999999650 Q ss_pred CCCCCCCCCCHHHHHHHHHHHHHHCCCCCCCCCCCCCEEEEEEECCCCCCCCCCHHHHHHHHHHHHHCCCEEEEEEECCC Q ss_conf 56887456423788999987421101234677766616999984045888888978999999999978987999994186 Q gi|254780833|r 248 IFGSTTKSTPGLEYAYNKIFDAKEKLEHIAKGHDDYKKYIIFLTDGENSSPNIDNKESLFYCNEAKRRGAIVYAIGVQAE 327 (371) Q Consensus 248 ~~~g~T~~~~~~~~~~~~l~~~~~~~~~~~~~~~~~~k~ivl~TDG~~~~~~~~~~~~~~~c~~~k~~gi~i~tIg~~~~ 327 (371) - +|||++..++..+...+...... +--||++|||++--. ..-....-...|..+.++|+|-++.. 
T Consensus 343 f-~GGTD~~~~l~~al~~~k~~~~~-----------~adiv~ITDg~~~~~---~~~~~~v~e~~k~~~~rl~aV~I~~~ 407 (437) T COG2425 343 F-GGGTDITKALRSALEDLKSRELF-----------KADIVVITDGEDERL---DDFLRKVKELKKRRNARLHAVLIGGY 407 (437) T ss_pred C-CCCCCHHHHHHHHHHHHHCCCCC-----------CCCEEEEECCHHHHH---HHHHHHHHHHHHHHHCEEEEEEECCC T ss_conf 6-89888589999999986436656-----------777899804376654---67899999998875434899996478 Q ss_pred CCHHHHHHHCCCCE-EEEECCHHHHHHHHHHH Q ss_conf 42798998338980-89828989999999999 Q gi|254780833|r 328 AADQFLKNCASPDR-FYSVQNSRKLHDAFLRI 358 (371) Q Consensus 328 ~~~~~l~~cAs~~~-~y~~~~~~~L~~af~~I 358 (371) .. +.+..+. ++ .|.++. .+...+++.+ T Consensus 408 ~~-~~l~~Is--d~~i~~~~~-~~~~kv~~~~ 435 (437) T COG2425 408 GK-PGLMRIS--DHIIYRVEP-RDRVKVVKRW 435 (437) T ss_pred CC-CCCCEEC--EEEEEEECC-HHHHHHHHCC T ss_conf 98-6600011--146787274-7776777344 No 45 >COG2304 Uncharacterized protein containing a von Willebrand factor type A (vWA) domain [General function prediction only] Probab=98.08 E-value=4.8e-05 Score=48.83 Aligned_cols=164 Identities=20% Similarity=0.219 Sum_probs=109.4 Q ss_pred CCCCCCCCCCEEEEECCCCCCCCCCCCCHHHHHHHHHHHHHHCCCCCCCCCCCEEEEEEEEEECCCCEEECCCC--CCHH Q ss_conf 00013467406998527631156678721489899999875002335555311047987886367417816665--5878 Q gi|254780833|r 161 ISSKSDIGLDMMMVLDVSLSMNDHFGPGMDKLGVATRSIREMLDIIKSIPDVNNVVRSGLVTFSSKIVQTFPLA--WGVQ 238 (371) Q Consensus 161 ~~~~~~~~idi~~viD~SgSm~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~f~~~~~~~~~lt--~~~~ 238 (371) .......+....+++|.|+||... . +..++......+..+...+ ....+.|........|.+ .+.. 
T Consensus 30 ~~~~~~~~~~~~~~~~~~~s~~~~-----~-~~~~~~~~~~~v~~~~~~~------~~~~~~~~~~~~~~~~~~~~~~~~ 97 (399) T COG2304 30 IDLDLLVPANLTLAIDTSGSMTGA-----L-LELAKSAAIELVNGLNPGD------LLSIVTFAGSADVLIPPTGATNKE 97 (399) T ss_pred CCCCCCCCCHHHHHHCCCCCCHHH-----H-HHHHHHHHHHHHHCCCCHH------HHEEEECCCCCCEECCCCCCCCHH T ss_conf 111000221002221366410055-----6-7767899888773156211------120461156555012564223227 Q ss_pred HHHHHHHC-CCCCCCCCCCCHHHHHHHHHHHHHHCCCCCCCCCCCCCEEEEEEECCCCCCCCCCHHHHHHHHHHHHHCCC Q ss_conf 99999740-15688745642378899998742110123467776661699998404588888897899999999997898 Q gi|254780833|r 239 HIQEKINR-LIFGSTTKSTPGLEYAYNKIFDAKEKLEHIAKGHDDYKKYIIFLTDGENSSPNIDNKESLFYCNEAKRRGA 317 (371) Q Consensus 239 ~~~~~i~~-l~~~g~T~~~~~~~~~~~~l~~~~~~~~~~~~~~~~~~k~ivl~TDG~~~~~~~~~~~~~~~c~~~k~~gi 317 (371) .+...|+. +.+.+.|....++.|+...+.... .......+.+.|||+++.+..+........+.....+| T Consensus 98 ~~~~~i~~~~~~~~~~~~~~~~~~~~~~~~~~~---------~~~~~~~~~~~tdg~~~~~~~d~~~~~~~~~~~~~~~i 168 (399) T COG2304 98 SITAAIDQSLQAGGATAVEASLSLAVELAAKAL---------PRGTLNRILLLTDGENNLGLVDPSRLSALAKLAAGKGI 168 (399) T ss_pred HHHHHHHHHCCCCCCCCHHHHHHHHHHHHHHHC---------CCCCCCEEEEECCCCHHCCCCCHHHHHHHHCCCCCCCE T ss_conf 788887640264554305778999999876423---------54553233330364120276678899998634556762 Q ss_pred EEEEEEECCCCCHHHHHHHC--CCCEEEEE Q ss_conf 79999941864279899833--89808982 Q gi|254780833|r 318 IVYAIGVQAEAADQFLKNCA--SPDRFYSV 345 (371) Q Consensus 318 ~i~tIg~~~~~~~~~l~~cA--s~~~~y~~ 345 (371) .+.++|++.+.+.+.+...+ ..|.+... 
T Consensus 169 ~~~~~g~~~~~n~~~~~~~~~~~~g~l~~~ 198 (399) T COG2304 169 VLDTLGLGDDVNEDELTGIAAAANGNLAFI 198 (399) T ss_pred EEEEECCCCHHHHHHHHHHHHHCCCCCCCC T ss_conf 786313552267777776553036641102 No 46 >KOG2807 consensus Probab=98.06 E-value=0.0003 Score=44.04 Aligned_cols=167 Identities=16% Similarity=0.148 Sum_probs=110.3 Q ss_pred CCEEEEECCCCCCCCCCCCCHHHHHHHH----HHHHHHCCCCCCCCCCCEEEEEEEEEECC-CCEEECCCCCCHHHHHHH Q ss_conf 4069985276311566787214898999----99875002335555311047987886367-417816665587899999 Q gi|254780833|r 169 LDMMMVLDVSLSMNDHFGPGMDKLGVAT----RSIREMLDIIKSIPDVNNVVRSGLVTFSS-KIVQTFPLAWGVQHIQEK 243 (371) Q Consensus 169 idi~~viD~SgSm~~~~~~~~~~~~~~~----~~~~~~~~~~~~~~~~~~~~~~~~~~f~~-~~~~~~~lt~~~~~~~~~ 243 (371) -.+++|+|.|..|......+ .|..... ..+.+++++. ...++|++.--+ .+.....++.|......+ T Consensus 61 Rhl~iviD~S~am~e~Df~P-~r~a~~~K~le~Fv~eFFdQN-------PiSQigii~~k~g~A~~lt~ltgnp~~hI~a 132 (378) T KOG2807 61 RHLYIVIDCSRAMEEKDFRP-SRFANVIKYLEGFVPEFFDQN-------PISQIGIISIKDGKADRLTDLTGNPRIHIHA 132 (378) T ss_pred EEEEEEEEHHHHHHHCCCCC-HHHHHHHHHHHHHHHHHHCCC-------CHHHEEEEEEECCHHHHHHHHCCCHHHHHHH T ss_conf 36899987345566444780-489999999999999986149-------6203358997055326888714887889999 Q ss_pred HHCCC-CCCCCCCCCHHHHHHHHHHHHHHCCCCCCCCCCCCCEEEEEEECCCCCCCCCCHHHHHHHHHHHHHCCCEEEEE Q ss_conf 74015-68874564237889999874211012346777666169999840458888889789999999999789879999 Q gi|254780833|r 244 INRLI-FGSTTKSTPGLEYAYNKIFDAKEKLEHIAKGHDDYKKYIIFLTDGENSSPNIDNKESLFYCNEAKRRGAIVYAI 322 (371) Q Consensus 244 i~~l~-~~g~T~~~~~~~~~~~~l~~~~~~~~~~~~~~~~~~k~ivl~TDG~~~~~~~~~~~~~~~c~~~k~~gi~i~tI 322 (371) +..+. .+|.-....+++.+...|.. 
..+-..+.++|+++-=- ..++.+.....+.+|..+|++.+| T Consensus 133 L~~~~~~~g~fSLqNaLe~a~~~Lk~---------~p~H~sREVLii~ssls----T~DPgdi~~tI~~lk~~kIRvsvI 199 (378) T KOG2807 133 LKGLTECSGDFSLQNALELAREVLKH---------MPGHVSREVLIIFSSLS----TCDPGDIYETIDKLKAYKIRVSVI 199 (378) T ss_pred HHCCCCCCCCHHHHHHHHHHHHHHCC---------CCCCCCEEEEEEEEEEC----CCCCCCHHHHHHHHHHHCEEEEEE T ss_conf 73122448886788799999998517---------87656327999985403----558520999999998617279998 Q ss_pred EECCCCCHHHHHHH--CCCCEEEEECCHHHHHHHHHHH Q ss_conf 94186427989983--3898089828989999999999 Q gi|254780833|r 323 GVQAEAADQFLKNC--ASPDRFYSVQNSRKLHDAFLRI 358 (371) Q Consensus 323 ig~~~~~~~~~l~~c--As~~~~y~~~~~~~L~~af~~I 358 (371) |+.... ..-+.+ |++|.|+-+-+..-|.+.|.+- T Consensus 200 gLsaEv--~icK~l~kaT~G~Y~V~lDe~HlkeLl~e~ 235 (378) T KOG2807 200 GLSAEV--FICKELCKATGGRYSVALDEGHLKELLLEH 235 (378) T ss_pred EECHHH--HHHHHHHHHHCCEEEEEECHHHHHHHHHHC T ss_conf 500558--999999886188579875789999999845 No 47 >cd01455 vWA_F11C1-5a_type Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses. In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with.
Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A Probab=98.02 E-value=0.00084 Score=41.30 Aligned_cols=174 Identities=13% Similarity=0.168 Sum_probs=103.5 Q ss_pred EEEEECCCCCCCCCCCCCHHHHHHHHHHHHHHCCCCCCCCCCCEEEEEEEEEECCC------CEEECCCCCCHHHHHHHH Q ss_conf 69985276311566787214898999998750023355553110479878863674------178166655878999997 Q gi|254780833|r 171 MMMVLDVSLSMNDHFGPGMDKLGVATRSIREMLDIIKSIPDVNNVVRSGLVTFSSK------IVQTFPLAWGVQHIQEKI 244 (371) Q Consensus 171 i~~viD~SgSm~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~f~~~------~~~~~~lt~~~~~~~~~i 244 (371) +.+|+|.||||-.. ++.-.|++...+++...++.+...... .++..+--+.+ +....|+.+++..++ .+ T Consensus 3 lr~v~DvSgSMYRF-Ng~DgRL~R~lEa~~MvMEaf~g~e~k---~~ydIvGHSGd~~~I~lV~~~~~Pk~~keRl~-vl 77 (191) T cd01455 3 LKLVVDVSGSMYRF-NGYDGRLDRSLEAVVMVMEAFDGFEDK---IQYDIIGHSGDGPCVPFVKTNHPPKNNKERLE-TL 77 (191) T ss_pred EEEEEECCCCEEEE-CCCCHHHHHHHHHHHHHHHHHHCCCCE---EEEEEEECCCCCCCCCCCCCCCCCCCHHHHHH-HH T ss_conf 69999735442330-475328999999999999986175400---57887502688775102348999986689999-99 Q ss_pred HCCCC-----CCCCCCCCHHHHHHHHHHHHHHCCCCCCCCCCCCCEEEEEEECCCCCCCCCCHHHHHHHHHHHHHCCCEE Q ss_conf 40156-----8874564237889999874211012346777666169999840458888889789999999999789879 Q gi|254780833|r 245 NRLIF-----GSTTKSTPGLEYAYNKIFDAKEKLEHIAKGHDDYKKYIIFLTDGENSSPNIDNKESLFYCNEAKRRGAIV 319 (371) Q Consensus 245 ~~l~~-----~g~T~~~~~~~~~~~~l~~~~~~~~~~~~~~~~~~k~ivl~TDG~~~~~~~~~~~~~~~c~~~k~~gi~i 319 (371) ..+.+ ..|-++-+++.++...+.... 
+.-..++|+++|.+-......+.....+.+ ++..|+- T Consensus 78 ~~M~AHsQyC~sGD~Tlea~~~Ai~~l~a~~----------d~De~fVivlSDANL~RYgI~p~~l~~~l~--~~p~V~a 145 (191) T cd01455 78 KMMHAHSQFCWSGDHTVEATEFAIKELAAKE----------DFDEAIVIVLSDANLERYGIQPKKLADALA--REPNVNA 145 (191) T ss_pred HHHHHHHHHEECCCCHHHHHHHHHHHHHHCC----------CCCCCEEEEECCCCHHHCCCCHHHHHHHHH--CCCCCCE T ss_conf 9863120100258844899999999875302----------677608999814764431889899999973--3877668 Q ss_pred EEEEECCCCCH-HHHHHHCCCCEEEEECCHHHHHHHHHHHHHH Q ss_conf 99994186427-9899833898089828989999999999996 Q gi|254780833|r 320 YAIGVQAEAAD-QFLKNCASPDRFYSVQNSRKLHDAFLRIGKE 361 (371) Q Consensus 320 ~tIg~~~~~~~-~~l~~cAs~~~~y~~~~~~~L~~af~~I~~~ 361 (371) |.|-++.-+++ +.|+.-=-.|+-|...+..+|..+|++|+.. T Consensus 146 ~~IfIgslg~eA~~l~~~lP~G~~fVc~dt~~lP~il~qIfts 188 (191) T cd01455 146 FVIFIGSLSDEADQLQRELPAGKAFVCMDTSELPHIMQQIFTS 188 (191) T ss_pred EEEEEECHHHHHHHHHHHCCCCCEEEECCHHHHHHHHHHHHHH T ss_conf 9999735167999999748997417853653678999999887 No 48 >COG4655 Predicted membrane protein [Function unknown] Probab=97.98 E-value=3.2e-06 Score=55.94 Aligned_cols=58 Identities=14% Similarity=0.285 Sum_probs=53.1 Q ss_pred HHHHCCCCCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHC Q ss_conf 8752036871899999999999999999999999999999999999999999986520 Q gi|254780833|r 7 RNFFYNCKGSISILTAILLPVIFIVMGLVIETSHKFFVKAKLHYILDHSLLYTATKIL 64 (371) Q Consensus 7 ~~f~~d~~G~vai~~al~l~~li~~~g~aVD~~r~~~~ks~Lq~a~DaA~LAaa~~~~ 64 (371) +.|.|.+|+-+.|+.++.++..++.++++|||++.|..|.+||+++|-|+++++.... T Consensus 2 ~g~~r~~rs~~gvltal~~~lal~~l~l~VD~G~l~leqR~LQ~~ADlAAiaAAs~~~ 59 (565) T COG4655 2 NGWPRRQRSMVGVLTALFVPLALATLLLGVDYGYLYLEQRELQRVADLAAIAAASNLD 59 (565) T ss_pred CCHHHHHHHHHHHHHHHHHHHHHHHHHHEECCCEEEEEHHHHHHHHHHHHHHHHHHCC T ss_conf 8423767667789999999999998865022012441178788877699888776279 No 49 >cd01452 VWA_26S_proteasome_subunit 26S proteasome plays a major role in eukaryotic protein breakdown, especially for ubiquitin-tagged proteins. 
It is an ATP-dependent protease responsible for the bulk of non-lysosomal proteolysis in eukaryotes, often using covalent modification of proteins by ubiquitylation. It consists of a 20S proteolytic core particle (CP) and a 19S regulatory particle (RP). The CP is an ATP-independent peptidase that carries the hydrolyzing activities. One or both ends of the CP carry the RP, which confers both ubiquitin and ATP dependence on the 26S proteasome. The RP's proposed functions include recognition of substrates and their translocation to the CP for proteolysis. The RP can dissociate into stable lid and base subcomplexes. The base is composed of three non-ATPase subunits (Rpn 1, 2 and 10). A single residue in the vWA domain of Rpn10 has been implicated in stabilizing the lid-base association. Probab=97.90 E-value=0.0013 Score=40.14 Aligned_cols=169 Identities=15% Similarity=0.123 Sum_probs=113.0 Q ss_pred CEEEEECCCCCCCCCCCCCHHHHHHHHHHHHHHCCCCCCCCCCCEEEEEEEEEECC-CCEEECCCCCCHHHHHHHHHCCC Q ss_conf 06998527631156678721489899999875002335555311047987886367-41781666558789999974015 Q gi|254780833|r 170 DMMMVLDVSLSMNDHFGPGMDKLGVATRSIREMLDIIKSIPDVNNVVRSGLVTFSS-KIVQTFPLAWGVQHIQEKINRLI 248 (371) Q Consensus 170 di~~viD~SgSm~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~f~~-~~~~~~~lt~~~~~~~~~i~~l~ 248 (371) .+++.+|+|..|..... ..+|++.-.+++......... .+....+|+.+... .......||.+...+...+..+.
T Consensus 5 Atmi~iDNSe~~RNGDy-~PtR~~AQ~dAvn~i~~~k~~---~NpEn~VGl~tmag~~~~Vl~TlT~D~gkiL~~lh~i~ 80 (187) T cd01452 5 ATMICIDNSEYMRNGDY-PPTRFQAQADAVNLICQAKTR---SNPENNVGLMTMAGNSPEVLVTLTNDQGKILSKLHDVQ 80 (187) T ss_pred EEEEEEECCHHHCCCCC-CCCHHHHHHHHHHHHHHHHHH---CCCCCCEEEEEECCCCCEEEEECCCCHHHHHHHCCCCC T ss_conf 89999978566505898-971899999999999977751---49533113576158986689844865788987532677 Q ss_pred CCCCCCCCCHHHHHHHHHHHHHHCCCCCCCCCCCCCEEEEEEECCCCCCCCCCHHHHHHHHHHHHHCCCEEEEEEECCC- Q ss_conf 6887456423788999987421101234677766616999984045888888978999999999978987999994186- Q gi|254780833|r 249 FGSTTKSTPGLEYAYNKIFDAKEKLEHIAKGHDDYKKYIIFLTDGENSSPNIDNKESLFYCNEAKRRGAIVYAIGVQAE- 327 (371) Q Consensus 249 ~~g~T~~~~~~~~~~~~l~~~~~~~~~~~~~~~~~~k~ivl~TDG~~~~~~~~~~~~~~~c~~~k~~gi~i~tIg~~~~- 327 (371) ++|.-+...|++-+.-.|.-. ..+..++-||+|--. +-. .+..+...+...+|+++|.|-.|.||.. T Consensus 81 ~~G~~~~~~~IqiA~LALKHR---------qnk~~~qRIv~FVgS-Pi~--~~ek~l~~laKklKKnnV~vDII~FGe~~ 148 (187) T cd01452 81 PKGKANFITGIQIAQLALKHR---------QNKNQKQRIVAFVGS-PIE--EDEKDLVKLAKRLKKNNVSVDIINFGEID 148 (187) T ss_pred CCCEECHHHHHHHHHHHHHCC---------CCCCCCEEEEEEECC-CCC--CCHHHHHHHHHHHHHCCCCEEEEEECCCC T ss_conf 187651887999999997234---------677754479999789-875--57899999999875558535899946888 Q ss_pred CCHHHHHHHC----C--CCEEEEECCHHHH-HHH Q ss_conf 4279899833----8--9808982898999-999 Q gi|254780833|r 328 AADQFLKNCA----S--PDRFYSVQNSRKL-HDA 354 (371) Q Consensus 328 ~~~~~l~~cA----s--~~~~y~~~~~~~L-~~a 354 (371) .+.+.|+... 
+ +.|+-.+.....| .++ T Consensus 149 ~n~~kL~~f~~~vn~~~~Shlv~ippg~~lLSd~ 182 (187) T cd01452 149 DNTEKLTAFIDAVNGKDGSHLVSVPPGENLLSDA 182 (187) T ss_pred CCHHHHHHHHHHHCCCCCCEEEEECCCCCHHHHH T ss_conf 9989999999984589982599947998645676 No 50 >KOG3768 consensus Probab=97.77 E-value=0.00038 Score=43.39 Aligned_cols=192 Identities=17% Similarity=0.139 Sum_probs=111.2 Q ss_pred EEEEECCCCCCCCCCCCCHHHHHHHHHHHHHHCCCCCCCCCCCEEEEEEEEEECCCCEEE-CCCCCCHHHHHHHHHCCCC Q ss_conf 699852763115667872148989999987500233555531104798788636741781-6665587899999740156 Q gi|254780833|r 171 MMMVLDVSLSMNDHFGPGMDKLGVATRSIREMLDIIKSIPDVNNVVRSGLVTFSSKIVQT-FPLAWGVQHIQEKINRLIF 249 (371) Q Consensus 171 i~~viD~SgSm~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~f~~~~~~~-~~lt~~~~~~~~~i~~l~~ 249 (371) +.|++|.|+||.+.--...+.++.++.++-.++.......... ..|+-+++|......+ ..+..+...+.+.|..|.+ T Consensus 4 ~lFllDTS~SM~qrah~~~tylD~AKgaVEtFiK~R~r~~~~~-gdryml~TfeepP~~vk~~~~~~~a~~~~eik~l~a 82 (888) T KOG3768 4 FLFLLDTSGSMSQRAHPQFTYLDLAKGAVETFIKQRTRVGRET-GDRYMLTTFEEPPKNVKVACEKLGAVVIEEIKKLHA 82 (888) T ss_pred EEEEEECCCCHHHHCCCCCHHHHHHHHHHHHHHHHHHCCCCCC-CCEEEEEECCCCCHHHHHHHHHCCCHHHHHHHHHCC T ss_conf 9999706641222036783156777779999999874154124-865899862458602556776503079888776157 Q ss_pred CCCCC-CCCHHHHHHHHHHHHHHCC----CCCCCCC-CCCCEEEEEEECCCCCCCCCCHHHH-----------HHHHHHH Q ss_conf 88745-6423788999987421101----2346777-6661699998404588888897899-----------9999999 Q gi|254780833|r 250 GSTTK-STPGLEYAYNKIFDAKEKL----EHIAKGH-DDYKKYIIFLTDGENSSPNIDNKES-----------LFYCNEA 312 (371) Q Consensus 250 ~g~T~-~~~~~~~~~~~l~~~~~~~----~~~~~~~-~~~~k~ivl~TDG~~~~~~~~~~~~-----------~~~c~~~ 312 (371) .+++- ...+...++..|.-..-.. .+.++.- .-..-+||++|||---.+....... ...-... 
T Consensus 83 ~~~s~~~~~~~t~AFdlLnlnR~qtGID~yGqGR~pf~lEP~~iI~iTDG~r~s~~~GV~~e~~Lpl~~p~pGse~Tkep 162 (888) T KOG3768 83 PYGSCQLHHAITEAFDLLNLNRVQTGIDGYGQGRLPFNLEPVTIILITDGGRYSGVAGVPIEFRLPLDPPFPGSEMTKEP 162 (888) T ss_pred CCCHHHHHHHHHHHHHHHHHHHHHHCCCCCCCCCCCCCCCCEEEEEEECCCCCCCCCCCCEEEEECCCCCCCCCCCCCCC T ss_conf 63124556788877654315555305545555657456686489998248721344677515885068999862023563 Q ss_pred HHCCCEEEEEEECC--------------CCCH-HHHHHHC-CCCEEEEECCHHHHHHHHHHHHHHHH Q ss_conf 97898799999418--------------6427-9899833-89808982898999999999999640 Q gi|254780833|r 313 KRRGAIVYAIGVQA--------------EAAD-QFLKNCA-SPDRFYSVQNSRKLHDAFLRIGKEMV 363 (371) Q Consensus 313 k~~gi~i~tIg~~~--------------~~~~-~~l~~cA-s~~~~y~~~~~~~L~~af~~I~~~i~ 363 (371) =.+.-+.|++-|.- +.|. ..-+.|+ ++|+-|.+-+...|....+.+-+++. T Consensus 163 FRWDQrlftlVlRiPgt~~~~~~qlt~Vp~Dds~IermCevTGGRSysV~Spr~lnqciesLvqkvQ 229 (888) T KOG3768 163 FRWDQRLFTLVLRIPGTPYPTISQLTAVPIDDSVIERMCEVTGGRSYSVVSPRQLNQCIESLVQKVQ 229 (888) T ss_pred CHHHHHHHEEEEECCCCCCCCHHHHCCCCCCCHHHHHHHHHCCCCEEEEECHHHHHHHHHHHHHHHC T ss_conf 0234312004675589998667550577777326677665317850355379999999999998644 No 51 >pfam05762 VWA_CoxE VWA domain containing CoxE-like protein. This family is annotated by SMART as containing a VWA type domain. The exact function of this family is unknown. It is found as part of a CO oxidising (Cox) system operon in several bacteria. Probab=97.74 E-value=0.00024 Score=44.54 Aligned_cols=130 Identities=15% Similarity=0.142 Sum_probs=74.4 Q ss_pred CCCCEEEEECCCCCCCCCCCCCHHHHHHHHHHHHHHCCCCCCCCCCCEEEEEEEEEECCCCEEECCCCC--CHHHHHHHH Q ss_conf 674069985276311566787214898999998750023355553110479878863674178166655--878999997 Q gi|254780833|r 167 IGLDMMMVLDVSLSMNDHFGPGMDKLGVATRSIREMLDIIKSIPDVNNVVRSGLVTFSSKIVQTFPLAW--GVQHIQEKI 244 (371) Q Consensus 167 ~~idi~~viD~SgSm~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~f~~~~~~~~~lt~--~~~~~~~~i 244 (371) .|-.+++++|.||||..+ .......+..+...+. ++-...|......+.+.-.
+..+....+ T Consensus 56 ~p~~lVvl~DVSGSM~~y-------s~~~L~~~~al~~~~~---------rv~~F~F~t~l~~vT~~l~~~d~~~al~~~ 119 (223) T pfam05762 56 KPRRLVLLLDVSGSMADY-------SRIFLALLHALLAGRP---------RTRLFAFGTRLTDLTRALRERDPAEALLRV 119 (223) T ss_pred CCCCEEEEECCCCCCHHH-------HHHHHHHHHHHHHCCC---------CCEEEEEECCHHHHHHHHHCCCHHHHHHHH T ss_conf 987589997378874999-------9999999999985468---------615999836489888887128999999999 Q ss_pred HCC--CCCCCCCCCCHHHHHHHHHHHHHHCCCCCCCCCCCCCEEEEEEECCCCCCCCCCHHHHHHHHHHHHHCCCEEEEE Q ss_conf 401--568874564237889999874211012346777666169999840458888889789999999999789879999 Q gi|254780833|r 245 NRL--IFGSTTKSTPGLEYAYNKIFDAKEKLEHIAKGHDDYKKYIIFLTDGENSSPNIDNKESLFYCNEAKRRGAIVYAI 322 (371) Q Consensus 245 ~~l--~~~g~T~~~~~~~~~~~~l~~~~~~~~~~~~~~~~~~k~ivl~TDG~~~~~~~~~~~~~~~c~~~k~~gi~i~tI 322 (371) ... ..+|||++..++........ ...-..+-++|+++||.++++. .........++..+.+|+-+ T Consensus 120 ~~~~~~~~GgT~ig~al~~f~~~~~----------~~~l~~~t~ViilsDg~~~~~~---~~l~~~l~~L~~~~~rviWL 186 (223) T pfam05762 120 SARVEDWGGGTRIGAALAYFNELWT----------RPALSRGAVVVLVSDGLERGDS---EELLAEVARLVRSARRLVWL 186 (223) T ss_pred HHHHCCCCCCCCHHHHHHHHHHHCC----------CCCCCCCCEEEEEECCCCCCCH---HHHHHHHHHHHHHCCEEEEE T ss_conf 9860366799749999999998503----------0346788679997230103883---18999999999837879998 Q ss_pred EEC Q ss_conf 941 Q gi|254780833|r 323 GVQ 325 (371) Q Consensus 323 g~~ 325 (371) --. T Consensus 187 NPl 189 (223) T pfam05762 187 NPL 189 (223) T ss_pred CCC T ss_conf 998 No 52 >pfam06707 DUF1194 Protein of unknown function (DUF1194). This family consists of several hypothetical Rhizobiales specific proteins of around 270 residues in length. The function of this family is unknown. 
Probab=97.72 E-value=0.0031 Score=37.85 Aligned_cols=179 Identities=16% Similarity=0.225 Sum_probs=101.8 Q ss_pred CCCEEEEECCCCCCCCCCCCCHHHHHHHHHHHHHHCCCCCCC----CCCCEEEEEEEEEECC--CCEEECCCCC--CHHH Q ss_conf 740699852763115667872148989999987500233555----5311047987886367--4178166655--8789 Q gi|254780833|r 168 GLDMMMVLDVSLSMNDHFGPGMDKLGVATRSIREMLDIIKSI----PDVNNVVRSGLVTFSS--KIVQTFPLAW--GVQH 239 (371) Q Consensus 168 ~idi~~viD~SgSm~~~~~~~~~~~~~~~~~~~~~~~~~~~~----~~~~~~~~~~~~~f~~--~~~~~~~lt~--~~~~ 239 (371) .+.+++.+|.|+||... ....-++.+-..+..-... ......+.+..+.|+. ......||+. +... T Consensus 3 dlaLvLavDvS~SVD~~------E~~lQr~G~A~Al~dp~V~~Ai~~g~~g~Iava~~eWsg~~~q~~vv~Wt~I~~~~d 76 (206) T pfam06707 3 DLALVLAVDVSGSVDEE------EYRLQREGYAAALRDPEVLDALLSGPHGRIAVTYVEWSGPDDQRVVVPWTLIDSAED 76 (206) T ss_pred HHHHHHHHHCCCCCCHH------HHHHHHHHHHHHHCCHHHHHHHHCCCCCEEEEEEEEECCCCCCEEEECCEEECCHHH T ss_conf 47786773323686999------999999999999759999999962899718999998027887448869989589999 Q ss_pred HHHHHHCC-----CCCCCCCCCCHHHHHHHHHHHHHHCCCCCCCCCCCCCEEEEEEECCCCCCCCCCHHHHHHHHHHHHH Q ss_conf 99997401-----5688745642378899998742110123467776661699998404588888897899999999997 Q gi|254780833|r 240 IQEKINRL-----IFGSTTKSTPGLEYAYNKIFDAKEKLEHIAKGHDDYKKYIIFLTDGENSSPNIDNKESLFYCNEAKR 314 (371) Q Consensus 240 ~~~~i~~l-----~~~g~T~~~~~~~~~~~~l~~~~~~~~~~~~~~~~~~k~ivl~TDG~~~~~~~~~~~~~~~c~~~k~ 314 (371) ...+...+ ...+.|.+..++..+..+|.... .+-.+|+|=+=.||.||.+..+. ...-+.+.. 
T Consensus 77 a~a~A~~i~~~~r~~~~~Taig~Al~~a~~l~~~~~---------~~~~RrvIDiSGDG~nN~G~~p~---~~ard~~~~ 144 (206) T pfam06707 77 AEAFAARLAAAPRRAGRRTAIGGALGFAAALLAQNP---------YECLRRVIDVSGDGPNNQGFPPV---TAARDAAVA 144 (206) T ss_pred HHHHHHHHHHCCCCCCCCCHHHHHHHHHHHHHHHCC---------CCCCEEEEEEECCCCCCCCCCCH---HHHHHHHHH T ss_conf 999999997588788999769999999999998299---------87617999960799888999813---789876777 Q ss_pred CCCEEEEEEECCCCC------HHHHHHHC--CCCEE-EEECCHHHHHHHH-HHHHHHHHH Q ss_conf 898799999418642------79899833--89808-9828989999999-999996400 Q gi|254780833|r 315 RGAIVYAIGVQAEAA------DQFLKNCA--SPDRF-YSVQNSRKLHDAF-LRIGKEMVK 364 (371) Q Consensus 315 ~gi~i~tIg~~~~~~------~~~l~~cA--s~~~~-y~~~~~~~L~~af-~~I~~~i~~ 364 (371) .||+|..+.++.+.. ....+.|. ++|-| -.+.+.++..+++ +++-+||.. T Consensus 145 ~GitINgL~I~~~~~~~~~~L~~yy~~~VIgGpgAFV~~a~~~~df~~AirrKL~rEIag 204 (206) T pfam06707 145 AGVTINGLAIMGAEAPTSDDLDAYYRDCVIGGPGAFVEPANGFEDFAEAIRRKLVREIAG 204 (206) T ss_pred CCEEEEEEEECCCCCCCCHHHHHHHHHCCCCCCCCEEEECCCHHHHHHHHHHHHHHHHHC T ss_conf 592896677747898762369999973202389844997388799999999999998732 No 53 >KOG1226 consensus Probab=97.68 E-value=0.00051 Score=42.61 Aligned_cols=151 Identities=18% Similarity=0.314 Sum_probs=90.6 Q ss_pred EEECCCCCCCCCCEEEEECCCCCCCCCCCCCHHHHHHHHHHHHHHCCCCCCCCCCCEEEEEEEEEECCC----------- Q ss_conf 221000134674069985276311566787214898999998750023355553110479878863674----------- Q gi|254780833|r 158 SVKISSKSDIGLDMMMVLDVSLSMNDHFGPGMDKLGVATRSIREMLDIIKSIPDVNNVVRSGLVTFSSK----------- 226 (371) Q Consensus 158 ~~~~~~~~~~~idi~~viD~SgSm~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~f~~~----------- 226 (371) ........+.|+|+..++|.|.||.. ..++++.+...+...+..+... 
.|.|.-+|-++ T Consensus 122 ~l~~r~a~~yPVDLYyLMDlS~SM~D----Dl~~l~~LG~~L~~~m~~lT~n------frlGFGSFVDK~v~P~i~~~pe 191 (783) T KOG1226 122 QLKVRQAEDYPVDLYYLMDLSYSMKD----DLENLKSLGTDLAREMRKLTSN------FRLGFGSFVDKTVSPYISTTPE 191 (783) T ss_pred EEEEEECCCCCEEEEEEEECCHHHHH----HHHHHHHHHHHHHHHHHHHHCC------CCCCCCCHHCCCCCCCCCCCCH T ss_conf 99996035797037998613024565----6999999999999999987546------7766531112443541116807 Q ss_pred -----------------CEEECCCCCCHHHHHHHHHCCCCCCCCCCCCHHHHHHHHHHHHHHCCCCCCCCCCCCCEEEEE Q ss_conf -----------------178166655878999997401568874564237889999874211012346777666169999 Q gi|254780833|r 227 -----------------IVQTFPLAWGVQHIQEKINRLIFGSTTKSTPGLEYAYNKIFDAKEKLEHIAKGHDDYKKYIIF 289 (371) Q Consensus 227 -----------------~~~~~~lt~~~~~~~~~i~~l~~~g~T~~~~~~~~~~~~l~~~~~~~~~~~~~~~~~~k~ivl 289 (371) ..-+.+||++.....+.+.+-.-.|+-...+| |+.++-.+..=....+ ...+..+.+|| T Consensus 192 kl~npc~~~~~C~ppfgfkhvLsLT~~~~~F~~~V~~q~ISgNlDaPEG---GfDAimQaavC~~~IG-WR~~a~~LLVF 267 (783) T KOG1226 192 KLRNPCPNYKNCAPPFGFKHVLSLTNDAEEFNEEVGKQRISGNLDAPEG---GFDAIMQAAVCTEKIG-WRNDATRLLVF 267 (783) T ss_pred HHCCCCCCCCCCCCCCCCCEEEECCCCHHHHHHHHHHCEECCCCCCCCC---HHHHHHHHHHCCCCCC-CCCCCEEEEEE T ss_conf 7538998744577985540021068876999998754353268898982---2988876641465522-01265168999 Q ss_pred EECCCC--------------CCC--------------CCCHHHHHHHHHHHHHCCCE-EEEE Q ss_conf 840458--------------888--------------88978999999999978987-9999 Q gi|254780833|r 290 LTDGEN--------------SSP--------------NIDNKESLFYCNEAKRRGAI-VYAI 322 (371) Q Consensus 290 ~TDG~~--------------~~~--------------~~~~~~~~~~c~~~k~~gi~-i~tI 322 (371) .||... |++ ..++.....+...+.+++|. 
.||+| T Consensus 268 ~td~~~H~a~DgkLaGiv~pnDG~CHL~~~g~Yt~S~~qdyPSia~l~~kl~~~ni~~IFAV 329 (783) T KOG1226 268 STDAGFHFAGDGKLAGIVQPNDGQCHLDKNGEYTQSTTQDYPSIAQLAQKLADNNINTIFAV 329 (783) T ss_pred ECCCCEEEECCCCEEEEECCCCCCCCCCCCCCCCEECCCCCCCHHHHHHHHHHHCCHHHHHH T ss_conf 70751345115531258537887415687775120137777738999998766045347787 No 54 >cd01460 vWA_midasin VWA_Midasin: Midasin is a member of the AAA ATPase family. The proteins of this family are unified by their common architectural organization, which is based upon a conserved ATPase domain. The AAA domain of midasin contains six tandem AAA protomers. The AAA domains in midasin are followed by a D/E-rich domain, which is followed by a VWA domain. The members of this subgroup have a conserved MIDAS motif. The function of this domain is not exactly known, although it has been speculated to play a crucial role in midasin function. Probab=97.58 E-value=0.0051 Score=36.57 Aligned_cols=189 Identities=15% Similarity=0.201 Sum_probs=109.8 Q ss_pred CCCCCCCCEEEEEECCCCCCCCCCEEEEECCCCCCCCCCCCCHHHHHHHHHHHHH---HCCCCCCCCCCCEEEEEEEEEE Q ss_conf 2565443201122100013467406998527631156678721489899999875---0023355553110479878863 Q gi|254780833|r 147 NSSHAPLLITSSVKISSKSDIGLDMMMVLDVSLSMNDHFGPGMDKLGVATRSIRE---MLDIIKSIPDVNNVVRSGLVTF 223 (371) Q Consensus 147 ~~~~~~~~~~~~~~~~~~~~~~idi~~viD~SgSm~~~~~~~~~~~~~~~~~~~~---~~~~~~~~~~~~~~~~~~~~~f 223 (371) ....+++.-+...++.++ +++++|.|.||..+. ....+.+++.. .+..+..
-++++..| T Consensus 45 rkDKIWLRRtkPsKR~Yq------I~lAiDdSkSM~~~~-----~~~lAlesl~lvs~Als~LEv-------G~l~V~~F 106 (266) T cd01460 45 RKDKIWLRRTKPAKRDYQ------ILIAIDDSKSMSENN-----SKKLALESLCLVSKALTLLEV-------GQLGVCSF 106 (266) T ss_pred CCCCEEEEECCCCCCCEE------EEEEECCCHHHHHHH-----HHHHHHHHHHHHHHHHHHHCC-------CCEEEEEC T ss_conf 566357760477654258------999972603310104-----788999999999999987277-------65689984 Q ss_pred CCCCEEECCCCCCHHH--HHHHHHCCCC-CCCCCCCCHHHHHHHHHHHHHHCCCCCCCCCCCCCEEEEEEECCCCCCCCC Q ss_conf 6741781666558789--9999740156-887456423788999987421101234677766616999984045888888 Q gi|254780833|r 224 SSKIVQTFPLAWGVQH--IQEKINRLIF-GSTTKSTPGLEYAYNKIFDAKEKLEHIAKGHDDYKKYIIFLTDGENSSPNI 300 (371) Q Consensus 224 ~~~~~~~~~lt~~~~~--~~~~i~~l~~-~g~T~~~~~~~~~~~~l~~~~~~~~~~~~~~~~~~k~ivl~TDG~~~~~~~ 300 (371) ...+....|+....+. -...+..++. ...|++..-+......+..... ...+.+..+.+++++||....... T Consensus 107 Ge~v~~lh~f~~~f~~~~g~~il~~f~F~q~~T~v~~ll~~~~~~~~~a~~-----~~~~~~~~qL~lIiSDG~~~~~~~ 181 (266) T cd01460 107 GEDVQILHPFDEQFSSQSGPRILNQFTFQQDKTDIANLLKFTAQIFEDART-----QSSSGSLWQLLLIISDGRGEFSEG 181 (266) T ss_pred CCCEEEEEECCCCCCCCHHHHHHHHCCCCCCCCCHHHHHHHHHHHHHHHHC-----CCCCCCHHHEEEEEECCCCCCCCC T ss_conf 887278622688765422799998398876776199999999999999752-----668756212799996898633530 Q ss_pred CHHHHHHHHHHHHHCCCEEEEEEECCCCCH-HH--HHHH--------------CC-C-CEEEEECCHHHHHHHHHHHHHH Q ss_conf 978999999999978987999994186427-98--9983--------------38-9-8089828989999999999996 Q gi|254780833|r 301 DNKESLFYCNEAKRRGAIVYAIGVQAEAAD-QF--LKNC--------------AS-P-DRFYSVQNSRKLHDAFLRIGKE 361 (371) Q Consensus 301 ~~~~~~~~c~~~k~~gi~i~tIg~~~~~~~-~~--l~~c--------------As-~-~~~y~~~~~~~L~~af~~I~~~ 361 (371) . .......+.+++|-+-.|=+..+.+. .. |++. 
-+ | .+|--+.+.++|.+++..+-++ T Consensus 182 ~---~r~~vr~a~~~~i~~vfiIiD~~~~~~SIldmk~~~f~~~~~~~~~~YLd~FPFpyYvivrdi~~LP~~Lsd~LRQ 258 (266) T cd01460 182 A---QKVRLREAREQNVFVVFIIIDNPDNKQSILDIKVVSFKNDKSGVITPYLDEFPFPYYVIVRDLNQLPSVLSDALRQ 258 (266) T ss_pred H---HHHHHHHHHHCCCEEEEEEECCCCCCCCCCCCEEEEEECCCCCEEEEHHHCCCCCEEEEECCHHHHHHHHHHHHHH T ss_conf 6---8999999997697699999708988776331137777179840664724439986489976887808999999999 No 55 >pfam04285 DUF444 Protein of unknown function (DUF444). Bacterial protein of unknown function. One family member is predicted to contain a von Willebrand factor (vWF) type A domain (Smart:VWA). Probab=97.52 E-value=0.0062 Score=36.08 Aligned_cols=163 Identities=15% Similarity=0.166 Sum_probs=88.6 Q ss_pred EEEEECCCCCCCCCCCCCHHHHHHHHHHHHHHCCCCCCCCCCCEEEEEEEEEECCCCEEECCCCCCHHHHHHHHHCCCCC Q ss_conf 69985276311566787214898999998750023355553110479878863674178166655878999997401568 Q gi|254780833|r 171 MMMVLDVSLSMNDHFGPGMDKLGVATRSIREMLDIIKSIPDVNNVVRSGLVTFSSKIVQTFPLAWGVQHIQEKINRLIFG 250 (371) Q Consensus 171 i~~viD~SgSm~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~f~~~~~~~~~lt~~~~~~~~~i~~l~~~ 250 (371) |++++|+||||+. 
.+-..++....++..-+...+..-..+.+.- ...+..+.- ..-=....+ T Consensus 249 mfc~MDVSGSM~e------~~K~lAk~FfflLy~FL~r~Y~~VEvVFI~H---~t~AkEVdE---------e~FF~~~Es 310 (421) T pfam04285 249 VFCLMDVSGSMGE------SEKDLAKRFFFLLYLFLTRKYENVEIVFIAH---HTEAKEVDE---------TDFFYKQES 310 (421) T ss_pred EEEEEECCCCCCH------HHHHHHHHHHHHHHHHHHHCCCCEEEEEEEE---CCCEEEECH---------HHHCCCCCC T ss_conf 9999855768778------8999999999999999971578548999971---383478367---------993254898 Q ss_pred CCCCCCCHHHHHHHHHHHHHHCCCCCCCCCCCCCEEEEEEECCCCCCCCCCHHHHHHHH-HHHHHCCCEEEEE-EECCCC Q ss_conf 87456423788999987421101234677766616999984045888888978999999-9999789879999-941864 Q gi|254780833|r 251 STTKSTPGLEYAYNKIFDAKEKLEHIAKGHDDYKKYIIFLTDGENSSPNIDNKESLFYC-NEAKRRGAIVYAI-GVQAEA 328 (371) Q Consensus 251 g~T~~~~~~~~~~~~l~~~~~~~~~~~~~~~~~~k~ivl~TDG~~~~~~~~~~~~~~~c-~~~k~~gi~i~tI-g~~~~~ 328 (371) |||-+..|+..+...+...- +.....-|.+-.+||+|-.. ++.....++ +.+- .-++.|.- -+.... T Consensus 311 GGT~vSSal~l~~~II~~RY--------pp~~WNiY~f~aSDGDNw~~--D~~~c~~lL~~~ll-p~~~~f~Y~EI~~~~ 379 (421) T pfam04285 311 GGTIVSSALELALEIIDERY--------PPAEWNIYAFQASDGDNWTD--DSERCVKLLMNKLM-PNAQYYGYVEITQRR 379 (421) T ss_pred CCEEEEHHHHHHHHHHHHHC--------CHHHCEEEEEEECCCCCCCC--CHHHHHHHHHHHHH-HHHHEEEEEEECCCC T ss_conf 97587279999999998558--------86445046798037766434--64999999999898-874158999945887 Q ss_pred CHHHH---HHHC-CCCEE--EEECCHHHHHHHHHHHHHHH Q ss_conf 27989---9833-89808--98289899999999999964 Q gi|254780833|r 329 ADQFL---KNCA-SPDRF--YSVQNSRKLHDAFLRIGKEM 362 (371) Q Consensus 329 ~~~~l---~~cA-s~~~~--y~~~~~~~L~~af~~I~~~i 362 (371) ....+ +... 
...+| ..+.+..+|-.+|++++.+- T Consensus 380 ~~~~~~~y~~~~~~~~nf~~~~I~~k~dIypvfr~lf~ke 419 (421) T pfam04285 380 SHSTWRKYEAVKGVKDNFAMYTIREKDDVYPVFRTLFQKE 419 (421) T ss_pred CCCHHHHHHHHHCCCCCEEEEEECCHHHHHHHHHHHHHHH T ss_conf 6527999998632489757999588888899999998643 No 56 >PRK05325 hypothetical protein; Provisional Probab=97.47 E-value=0.0072 Score=35.67 Aligned_cols=164 Identities=14% Similarity=0.145 Sum_probs=89.2 Q ss_pred EEEEECCCCCCCCCCCCCHHHHHHHHHHHHHHCCCCCCCCCCCEEEEEEEEEECCCCEEECCCCCCHHHHHHHHHCCCCC Q ss_conf 69985276311566787214898999998750023355553110479878863674178166655878999997401568 Q gi|254780833|r 171 MMMVLDVSLSMNDHFGPGMDKLGVATRSIREMLDIIKSIPDVNNVVRSGLVTFSSKIVQTFPLAWGVQHIQEKINRLIFG 250 (371) Q Consensus 171 i~~viD~SgSm~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~f~~~~~~~~~lt~~~~~~~~~i~~l~~~ 250 (371) |++++|+||||+. .+-..++....++..-+...+..-. +-.+.-...+..+. + ..+ =....+ T Consensus 237 ~f~lMDvSGSM~~------~~K~lak~ff~lLy~fL~~~Y~~ve---vvFI~H~t~AkEVd-------E-e~F-F~~~es 298 (414) T PRK05325 237 MFCLMDVSGSMDE------AEKDLAKRFFFLLYLFLRRKYENVE---VVFIRHHTEAKEVD-------E-EEF-FYSRES 298 (414) T ss_pred EEEEEECCCCCCH------HHHHHHHHHHHHHHHHHHHHCCCEE---EEEEEECCCEEECC-------H-HHH-CCCCCC T ss_conf 9999855667767------8999999999999999985157548---99997159426747-------8-983-155898 Q ss_pred CCCCCCCHHHHHHHHHHHHHHCCCCCCCCCCCCCEEEEEEECCCCCCCCCCHHHHHH-HHHHHHHCCCEEEEE-EECCC- Q ss_conf 874564237889999874211012346777666169999840458888889789999-999999789879999-94186- Q gi|254780833|r 251 STTKSTPGLEYAYNKIFDAKEKLEHIAKGHDDYKKYIIFLTDGENSSPNIDNKESLF-YCNEAKRRGAIVYAI-GVQAE- 327 (371) Q Consensus 251 g~T~~~~~~~~~~~~l~~~~~~~~~~~~~~~~~~k~ivl~TDG~~~~~~~~~~~~~~-~c~~~k~~gi~i~tI-g~~~~- 327 (371) |||-+..|+..+...+...- ......-|.+-.|||+|-... +..... +...+.. -++.|.- -+... 
T Consensus 299 GGT~vSSa~~l~~eII~~rY--------pp~~WNIY~f~aSDGDNw~~D--~~~~~~~L~~~llp-~~~~f~Y~Ei~~~~ 367 (414) T PRK05325 299 GGTIVSSALKLMLEIIEERY--------PPAEWNIYAFQASDGDNWSDD--SPRCVELLVEELLP-VVNYFAYIEITPRA 367 (414) T ss_pred CCEEEEHHHHHHHHHHHHHC--------CHHHCEEEEEEECCCCCCCCC--HHHHHHHHHHHHHH-HHHEEEEEEEECCC T ss_conf 98485089999999998548--------875652788991377675446--69999999998888-75368999971798 Q ss_pred --CCHHHHHHH---CC-CCEE--EEECCHHHHHHHHHHHHHHHH Q ss_conf --427989983---38-9808--982898999999999999640 Q gi|254780833|r 328 --AADQFLKNC---AS-PDRF--YSVQNSRKLHDAFLRIGKEMV 363 (371) Q Consensus 328 --~~~~~l~~c---As-~~~~--y~~~~~~~L~~af~~I~~~i~ 363 (371) ....+++.. .. .++| ..+.+..+|-.+|++++.+-. T Consensus 368 ~~~~~~l~~~y~~~~~~~~~f~~~~I~~~~dI~p~fr~lf~k~~ 411 (414) T PRK05325 368 YYRHQTLWREYEKLQDEFDNFAMQHIRDKADIYPVFRELFKKEL 411 (414) T ss_pred CCCCHHHHHHHHHHHCCCCCEEEEEECCHHHHHHHHHHHHHHHH T ss_conf 88756899999997554888679994888888999999985555 No 57 >pfam09967 DUF2201 Predicted metallopeptidase (DUF2201). This domain, found in various hypothetical bacterial proteins, has no known function. Probab=97.23 E-value=0.0013 Score=40.25 Aligned_cols=96 Identities=22% Similarity=0.234 Sum_probs=58.4 Q ss_pred CCEEEEECCCCCCCCCCCCCHHHHHHHHHHHHHHCCCCCCCCCCCEEEEEEEEEECCCCEEECCCCCCHHHHHHHHHC-- Q ss_conf 406998527631156678721489899999875002335555311047987886367417816665587899999740-- Q gi|254780833|r 169 LDMMMVLDVSLSMNDHFGPGMDKLGVATRSIREMLDIIKSIPDVNNVVRSGLVTFSSKIVQTFPLAWGVQHIQEKINR-- 246 (371) Q Consensus 169 idi~~viD~SgSm~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~f~~~~~~~~~lt~~~~~~~~~i~~-- 246 (371) .++++++|.||||.. +.+......+...+.... ..+-++.|...++....+.... ..+.. 
T Consensus 288 ~~i~v~iDTSGSis~------~~l~~flsEi~~I~~~~~--------~~i~vi~~D~~V~~~~~~~~~~----~~~~~~~ 349 (412) T pfam09967 288 PRLAVAIDTSGSISD------AELARFAAEIAAILRRTR--------AEVHVLACDEKVSSVQKFEPGD----SEISEVE 349 (412) T ss_pred CCEEEEEECCCCCCH------HHHHHHHHHHHHHHHHCC--------CCEEEEEECCEECCEEEECCCC----CCCCCCC T ss_conf 557999969899889------999999999999998489--------9779999688856407863466----7644141 Q ss_pred CCCCCCCCCCCHHHHHHHHHHHHHHCCCCCCCCCCCCCEEEEEEECCCCCCCC Q ss_conf 15688745642378899998742110123467776661699998404588888 Q gi|254780833|r 247 LIFGSTTKSTPGLEYAYNKIFDAKEKLEHIAKGHDDYKKYIIFLTDGENSSPN 299 (371) Q Consensus 247 l~~~g~T~~~~~~~~~~~~l~~~~~~~~~~~~~~~~~~k~ivl~TDG~~~~~~ 299 (371) +.-+|||...+.+.|... + . ...+|+||||+-+.+. T Consensus 350 ~~GgGGTdf~pvf~~~~~-~---------------~-p~~~i~fTDG~g~~p~ 385 (412) T pfam09967 350 LTGGGGTDFRPVLEAALR-L---------------R-PDAAVVLTDLEGWPAG 385 (412) T ss_pred CCCCCCCCCHHHHHHHHH-C---------------C-CCEEEEEECCCCCCCC T ss_conf 357899878489999982-6---------------9-9769998389989887 No 58 >cd01459 vWA_copine_like VWA Copine: Copines are phospholipid-binding proteins originally identified in paramecium. They are found in human and orthologues have been found in C. elegans and Arabidopsis Thaliana. None have been found in D. Melanogaster or S. Cereviciae. Phylogenetic distribution suggests that copines have been lost in some eukaryotes. No functional properties have been assigned to the VWA domains present in copines. The members of this subgroup contain a functional MIDAS motif based on their preferential binding to magnesium and manganese. However, the MIDAS motif is not totally conserved, in most cases the MIDAS consists of the sequence DxTxS instead of the motif DxSxS that is found in most cases. The C2 domains present in copines mediate phospholipid binding. 
Probab=96.97 E-value=0.024 Score=32.48 Aligned_cols=154 Identities=16% Similarity=0.175 Sum_probs=88.1 Q ss_pred CCCEEEEECCCCCCCCCC---------CCCHHHHHHHHHHHHHHCCCCCCCCCCCEEEEEEEEEECC-CCEEECCC---- Q ss_conf 740699852763115667---------8721489899999875002335555311047987886367-41781666---- Q gi|254780833|r 168 GLDMMMVLDVSLSMNDHF---------GPGMDKLGVATRSIREMLDIIKSIPDVNNVVRSGLVTFSS-KIVQTFPL---- 233 (371) Q Consensus 168 ~idi~~viD~SgSm~~~~---------~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~f~~-~~~~~~~l---- 233 (371) .+++++.+|.+.|-++.. +........+..++...+..++.....+-+ -+|...-.. .....+|+ T Consensus 31 einlivaIDFT~SNg~p~~~~SLHy~~~~~~N~Y~~aI~~vg~iL~~YD~Dk~~p~y-GFGa~~~~~~~~~~~~~~~~~~ 109 (254) T cd01459 31 ESNLIVAIDFTKSNGWPGEKRSLHYISPGRLNPYQKAIRIVGEVLQPYDSDKLIPAF-GFGAIVTKDQSVFSFFPGYSES 109 (254) T ss_pred EEEEEEEEEECCCCCCCCCCCCCCCCCCCCCCHHHHHHHHHHHHHHHCCCCCCEEEE-CCCCCCCCCCEEECCCCCCCCC T ss_conf 885899997078689989998854589999899999999999998632778811301-1231139997573577799999 Q ss_pred --CCC----HHHHHHHHHCCCCCCCCCCCCHHHHHHHHHHHHHHCCCCCCCCCCCCCEEEEEEECCCCCCCCCCHHHHHH Q ss_conf --558----78999997401568874564237889999874211012346777666169999840458888889789999 Q gi|254780833|r 234 --AWG----VQHIQEKINRLIFGSTTKSTPGLEYAYNKIFDAKEKLEHIAKGHDDYKKYIIFLTDGENSSPNIDNKESLF 307 (371) Q Consensus 234 --t~~----~~~~~~~i~~l~~~g~T~~~~~~~~~~~~l~~~~~~~~~~~~~~~~~~k~ivl~TDG~~~~~~~~~~~~~~ 307 (371) -.+ .+.-+..+..+..+|-|+..+-+..+.+....... ..---+++++|||.-++ ..++.. 
T Consensus 110 p~~~G~~gvl~aY~~~l~~v~lsGPT~FapiI~~a~~~a~~~~~---------~~~Y~ILlIiTDG~i~D----~~~Ti~ 176 (254) T cd01459 110 PECQGFEGVLRAYREALPNVSLSGPTNFAPVIRAAANIAKASNS---------QSKYHILLIITDGEITD----MNETIK 176 (254) T ss_pred CCCCCHHHHHHHHHHHCCCCEECCCCCHHHHHHHHHHHHHHHCC---------CCEEEEEEEECCCCCCC----HHHHHH T ss_conf 96559999999999860743764886059999999999997324---------87089999980796367----899999 Q ss_pred HHHHHHHCCCEEEEEEECCCCCHHHHHHH Q ss_conf 99999978987999994186427989983 Q gi|254780833|r 308 YCNEAKRRGAIVYAIGVQAEAADQFLKNC 336 (371) Q Consensus 308 ~c~~~k~~gi~i~tIg~~~~~~~~~l~~c 336 (371) ..-.+.+..+.|..||+|. ++=..|+.. T Consensus 177 aIv~AS~~PlSIIiVGVGd-~dF~~M~~l 204 (254) T cd01459 177 AIVEASKYPLSIVIVGVGD-GPFDAMERL 204 (254) T ss_pred HHHHHCCCCEEEEEEEECC-CCHHHHHHH T ss_conf 9999717981799997368-882778873 No 59 >COG4867 Uncharacterized protein with a von Willebrand factor type A (vWA) domain [General function prediction only] Probab=96.92 E-value=0.027 Score=32.19 Aligned_cols=156 Identities=15% Similarity=0.168 Sum_probs=91.8 Q ss_pred CEEEEECCCCCCCCCCCCCHHHHHHHHHHHHHHCCCCCCCCCCCEEEEEEEEEECCCCEEECCCCCCHHHHHHHHHCCC- Q ss_conf 0699852763115667872148989999987500233555531104798788636741781666558789999974015- Q gi|254780833|r 170 DMMMVLDVSLSMNDHFGPGMDKLGVATRSIREMLDIIKSIPDVNNVVRSGLVTFSSKIVQTFPLAWGVQHIQEKINRLI- 248 (371) Q Consensus 170 di~~viD~SgSm~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~f~~~~~~~~~lt~~~~~~~~~i~~l~- 248 (371) -+++.+|.|.||.... ....++...=++..++..--..+ ...++.|...+..+. .+++ -.+. 
T Consensus 465 AvallvDtS~SM~~eG--Rw~PmKQtALALhHLv~TrfrGD------~l~~i~Fgr~A~~v~-----v~eL----t~l~~ 527 (652) T COG4867 465 AVALLVDTSFSMVMEG--RWLPMKQTALALHHLVCTRFRGD------ALQIIAFGRYARTVT-----AAEL----TGLAG 527 (652) T ss_pred CEEEEEECCHHHHHHC--CCCCHHHHHHHHHHHHHHCCCCC------EEEEEECCCCCCCCC-----HHHH----HCCCC T ss_conf 2454341516777741--66716889999999998317886------058876043020177-----9998----24887 Q ss_pred -CCCCCCCCCHHHHHHHHHHHHHHCCCCCCCCCCCCCEEEEEEECCCCCCCC-------------CCHH---HHHHHHHH Q ss_conf -688745642378899998742110123467776661699998404588888-------------8978---99999999 Q gi|254780833|r 249 -FGSTTKSTPGLEYAYNKIFDAKEKLEHIAKGHDDYKKYIIFLTDGENSSPN-------------IDNK---ESLFYCNE 311 (371) Q Consensus 249 -~~g~T~~~~~~~~~~~~l~~~~~~~~~~~~~~~~~~k~ivl~TDG~~~~~~-------------~~~~---~~~~~c~~ 311 (371) ...+||...++..+.+.|.. .+..+|.|+++|||+++.-- .++. .+..-.++ T Consensus 528 v~eqgTNlhhaL~LA~r~l~R-----------h~~~~~~il~vTDGePtAhle~~DG~~~~f~yp~DP~t~~~Tvr~~d~ 596 (652) T COG4867 528 VYEQGTNLHHALALAGRHLRR-----------HAGAQPVVLVVTDGEPTAHLEDGDGTSVFFDYPPDPRTIAHTVRGFDD 596 (652) T ss_pred CCCCCCCHHHHHHHHHHHHHH-----------CCCCCCEEEEEECCCCCCCCCCCCCCEEECCCCCCHHHHHHHHHHHHH T ss_conf 674555458899999999873-----------756576289983798630134789856616899877799898998888 Q ss_pred HHHCCCEEEEEEECCCCC-HHHHHHHC--CCCEEEEECCHHHHHHH Q ss_conf 997898799999418642-79899833--89808982898999999 Q gi|254780833|r 312 AKRRGAIVYAIGVQAEAA-DQFLKNCA--SPDRFYSVQNSRKLHDA 354 (371) Q Consensus 312 ~k~~gi~i~tIg~~~~~~-~~~l~~cA--s~~~~y~~~~~~~L~~a 354 (371) ....|++|-++-++.|.. ..++++.| .+|..|.. +.+.|..+ T Consensus 597 ~~r~G~q~t~FrLg~DpgL~~Fv~qva~rv~G~vv~p-dldglGaa 641 (652) T COG4867 597 MARLGAQVTIFRLGSDPGLARFIDQVARRVQGRVVVP-DLDGLGAA 641 (652) T ss_pred HHHCCCEEEEEEECCCHHHHHHHHHHHHHHCCEEEEC-CCCHHHHH T ss_conf 8751641367752277768999999999858848813-82213589 No 60 >pfam11443 DUF2828 Domain of unknown function (DUF2828). This is a uncharacterized domain found in eukaryotes and viruses. 
Probab=96.85 E-value=0.031 Score=31.86 Aligned_cols=142 Identities=16% Similarity=0.139 Sum_probs=82.5 Q ss_pred CCEEEEECCCCCCCCCCCCCHHHHHHHHHHHHHHCCCCCCCCCCCEEEEEEEEEECCCCEEECCCCCCHHHHHHHHHCCC Q ss_conf 40699852763115667872148989999987500233555531104798788636741781666558789999974015 Q gi|254780833|r 169 LDMMMVLDVSLSMNDHFGPGMDKLGVATRSIREMLDIIKSIPDVNNVVRSGLVTFSSKIVQTFPLAWGVQHIQEKINRLI 248 (371) Q Consensus 169 idi~~viD~SgSm~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~f~~~~~~~~~lt~~~~~~~~~i~~l~ 248 (371) -+++-|+|+||||.+... ....++.+... .-.+.+.+..+ .+--+.+|+..+.-+.--..+..+-.+.+.... T Consensus 327 ~n~iav~DvSGSM~g~~~-~~~p~~vai~L-gl~ise~~~~~-----fk~~~iTFs~~P~~~~l~g~~l~ekv~~~~~~~ 399 (524) T pfam11443 327 TNCIAVCDVSGSMSGPVF-SITPMDVCIAL-GLLVSELSEGP-----FKGKVITFSSNPQLHHIKGDSLREKVSFVRRMP 399 (524) T ss_pred CCEEEEEECCCCCCCCCC-CCCHHHHHHHH-HHHHHHHCCCC-----CCCCEEEECCCCEEEECCCCCHHHHHHHHHHCC T ss_conf 544899956877778888-88749999999-99999853500-----058189844997589707988999999999586 Q ss_pred CCCCCCCCCHHHHHHHHHHHHHHCCCCCCCCCCCCCEEEEEEECCCCCCCCC----CHHHHHHHHHHHHHCCCEEEEEEE Q ss_conf 6887456423788999987421101234677766616999984045888888----978999999999978987999994 Q gi|254780833|r 249 FGSTTKSTPGLEYAYNKIFDAKEKLEHIAKGHDDYKKYIIFLTDGENSSPNI----DNKESLFYCNEAKRRGAIVYAIGV 324 (371) Q Consensus 249 ~~g~T~~~~~~~~~~~~l~~~~~~~~~~~~~~~~~~k~ivl~TDG~~~~~~~----~~~~~~~~c~~~k~~gi~i~tIg~ 324 (371) .+++|+....+..-........ -..++..|.|+++||=+..+... 
....-..++...++.|.++=-|=| T Consensus 400 wg~nTnf~~vf~lIL~~av~~~-------l~~eempk~l~VfSDMqFD~a~~~~~~~~t~~e~i~~~f~~aGY~~P~IVF 472 (524) T pfam11443 400 WGMSTNFQKVFDLILETAVENK-------LPQEDMPKRLFVFSDMEFDQASTGTSGWETDYEAIQRKFKEAGYEVPELVF 472 (524) T ss_pred CCCCCHHHHHHHHHHHHHHHCC-------CCHHHCCCEEEEEECCCHHHHCCCCCCCCCHHHHHHHHHHHCCCCCCEEEE T ss_conf 7635339999999999999869-------997886773899845423120379987623899999999983999983889 No 61 >COG4547 CobT Cobalamin biosynthesis protein CobT (nicotinate-mononucleotide:5, 6-dimethylbenzimidazole phosphoribosyltransferase) [Coenzyme metabolism] Probab=96.79 E-value=0.0023 Score=38.66 Aligned_cols=82 Identities=16% Similarity=0.324 Sum_probs=48.2 Q ss_pred CCHHHHHHHHHHHHHHCCCCCCCCCCCCCEEEEEEECCCCCCCCCCH--------HHHHHHHHHHH-HCCCEEEEEEECC Q ss_conf 42378899998742110123467776661699998404588888897--------89999999999-7898799999418 Q gi|254780833|r 256 TPGLEYAYNKIFDAKEKLEHIAKGHDDYKKYIIFLTDGENSSPNIDN--------KESLFYCNEAK-RRGAIVYAIGVQA 326 (371) Q Consensus 256 ~~~~~~~~~~l~~~~~~~~~~~~~~~~~~k~ivl~TDG~~~~~~~~~--------~~~~~~c~~~k-~~gi~i~tIg~~~ 326 (371) .+++.|+-+.|. +.+.-+|++.+|+||.+-..++-. .-.....+.+. 
...|...+||++- T Consensus 520 GEal~wah~rl~-----------gRpEqrkIlmmiSDGAPvddstlsvnpGnylerHLRaVieeIEtrSpveLlAIGigh 588 (620) T COG4547 520 GEALMWAHQRLI-----------GRPEQRKILMMISDGAPVDDSTLSVNPGNYLERHLRAVIEEIETRSPVELLAIGIGH 588 (620) T ss_pred HHHHHHHHHHHH-----------CCHHHCEEEEEECCCCCCCCCCCCCCCCHHHHHHHHHHHHHHHCCCCHHHEEEECCC T ss_conf 199999999873-----------584241378883489855554345588607999999999997037840330331255 Q ss_pred CCCHHHHHHHCCCCEEEEECCHHHHHHHH Q ss_conf 64279899833898089828989999999 Q gi|254780833|r 327 EAADQFLKNCASPDRFYSVQNSRKLHDAF 355 (371) Q Consensus 327 ~~~~~~l~~cAs~~~~y~~~~~~~L~~af 355 (371) |.-++.-|..+ .-+.++|.-++ T Consensus 589 DvtRyYrravt-------iVdaeeL~gam 610 (620) T COG4547 589 DVTRYYRRAVT-------IVDAEELAGAM 610 (620) T ss_pred CCCHHHHHHEE-------EECHHHHCHHH T ss_conf 53066662013-------74288856589 No 62 >cd01458 vWA_ku Ku70/Ku80 N-terminal domain. The Ku78 heterodimer (composed of Ku70 and Ku80) contributes to genomic integrity through its ability to bind DNA double-strand breaks (DSB) in a preferred orientation. DSB's are repaired by either homologues recombination or non-homologues end joining and facilitate repair by the non-homologous end-joining pathway (NHEJ). The Ku heterodimer is required for accurate process that tends to preserve the sequence at the junction. Ku78 is found in all three kingdoms of life. However, only the eukaryotic proteins have a vWA domain fused to them at their N-termini. The vWA domain is not involved in DNA binding but may very likey mediate Ku78's interactions with other proteins. Members of this subgroup lack the conserved MIDAS motif. 
Probab=96.75 E-value=0.037 Score=31.40 Aligned_cols=156 Identities=13% Similarity=0.157 Sum_probs=85.2 Q ss_pred CEEEEECCCCCCCCCCCC-CHHHHHHHHHHHHHHCCCCCCCCCCCEEEEEEEEEECCCCE----------EECCCCC-CH Q ss_conf 069985276311566787-21489899999875002335555311047987886367417----------8166655-87 Q gi|254780833|r 170 DMMMVLDVSLSMNDHFGP-GMDKLGVATRSIREMLDIIKSIPDVNNVVRSGLVTFSSKIV----------QTFPLAW-GV 237 (371) Q Consensus 170 di~~viD~SgSm~~~~~~-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~f~~~~~----------~~~~lt~-~~ 237 (371) .++|++|.|.||-..... ....+..+.+.+..++... -.......+|++-|+..-. ...+|.. +. T Consensus 3 ~ivflID~s~sM~~~~~~~~~s~~~~al~~i~~~~~~k---iis~~~d~vGvv~~~T~~~~n~~~~~~i~vl~~l~~~~a 79 (218) T cd01458 3 SVVFLVDVSPSMFESKDGEYESPFEEALKCIRQLMKSK---IISSPKDLVGVVFYGTEESKNPVGYENIYVLLDLDTPGA 79 (218) T ss_pred EEEEEEECCHHHCCCCCCCCCCHHHHHHHHHHHHHHHH---EECCCCCEEEEEEECCCCCCCCCCCCEEEEEECCCCCCH T ss_conf 79999979977847767888883999999999999865---067899869999976788888789872699633887677 Q ss_pred HHHHHHHHCCCC-----------CCCCCCCCHHHHHHHHHHHHHHCCCCCCCCCCCCCEEEEEEECCCCCCCCCC--HHH Q ss_conf 899999740156-----------8874564237889999874211012346777666169999840458888889--789 Q gi|254780833|r 238 QHIQEKINRLIF-----------GSTTKSTPGLEYAYNKIFDAKEKLEHIAKGHDDYKKYIIFLTDGENSSPNID--NKE 304 (371) Q Consensus 238 ~~~~~~i~~l~~-----------~g~T~~~~~~~~~~~~l~~~~~~~~~~~~~~~~~~k~ivl~TDG~~~~~~~~--~~~ 304 (371) ..++....-+.. .+......++..+.+++... ......|-|+|+||.++-.+... 
..+ T Consensus 80 ~~i~~l~~~~~~~~~~~~~~~~~~~~~~l~~aL~~~~~~f~~~---------~~~~~~krI~lfTdnD~P~~~~~~~~~~ 150 (218) T cd01458 80 ERVEDLKELIEPGGLSFAGQVGDSGQVSLSDALWVCLDLFSKG---------KKKKSHKRIFLFTNNDDPHGGDSIKDSQ 150 (218) T ss_pred HHHHHHHHHHHCCHHHHHHHCCCCCCCCHHHHHHHHHHHHHHC---------CCCCCCCEEEEECCCCCCCCCCHHHHHH T ss_conf 9999999986010235566448888867999999999999855---------5345777799986899899988799999 Q ss_pred HHHHHHHHHHCCCEEEEEEECCCC---C-HHHHHHHC Q ss_conf 999999999789879999941864---2-79899833 Q gi|254780833|r 305 SLFYCNEAKRRGAIVYAIGVQAEA---A-DQFLKNCA 337 (371) Q Consensus 305 ~~~~c~~~k~~gi~i~tIg~~~~~---~-~~~l~~cA 337 (371) +...+..+++.||.|-.+.+..+. + ..+.+.+- T Consensus 151 a~~~a~DL~d~gI~iel~~l~~~~~~Fd~s~FY~dii 187 (218) T cd01458 151 AAVKAEDLKDKGIELELFPLSSPGKKFDVSKFYKDII 187 (218) T ss_pred HHHHHHHHHHCCCEEEEEECCCCCCCCCCHHHHHHHH T ss_conf 9999988987796899984489988688067788752 No 63 >pfam07811 TadE TadE-like protein. The members of this family are similar to a region of the protein product of the bacterial tadE locus. In various bacterial species, the tad locus is closely linked to flp-like genes, which encode proteins required for the production of pili involved in adherence to surfaces. It is thought that the tad loci encode proteins that act to assemble or export an Flp pilus in various bacteria. All tad loci but TadA have putative transmembrane regions, and in fact the region in question is this family has a high proportion of hydrophobic amino acid residues. 
Probab=96.55 E-value=0.0085 Score=35.23 Aligned_cols=42 Identities=19% Similarity=0.421 Sum_probs=39.1 Q ss_pred CCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHH Q ss_conf 718999999999999999999999999999999999999999 Q gi|254780833|r 15 GSISILTAILLPVIFIVMGLVIETSHKFFVKAKLHYILDHSL 56 (371) Q Consensus 15 G~vai~~al~l~~li~~~g~aVD~~r~~~~ks~Lq~a~DaA~ 56 (371) |+.+|=|++++|+++.+....+|++++...+..++.|+..++ T Consensus 1 G~a~VEfalv~p~~l~l~~~~~~~~~~~~~~~~~~~Aa~~aa 42 (43) T pfam07811 1 GAAAVEFALVLPVLLLLLFGIVELGRLFYARQVLQNAAREAA 42 (43) T ss_pred CHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHC T ss_conf 916999999999999999999999999999999999998674 No 64 >KOG1327 consensus Probab=96.47 E-value=0.057 Score=30.25 Aligned_cols=152 Identities=17% Similarity=0.154 Sum_probs=87.5 Q ss_pred CCCCEEEEECCCCCCCCCCC---------CCHHHHHHHHHHHHHHCCCCCCCCCCCEEEEEEEEEECCC------CEEEC Q ss_conf 67406998527631156678---------7214898999998750023355553110479878863674------17816 Q gi|254780833|r 167 IGLDMMMVLDVSLSMNDHFG---------PGMDKLGVATRSIREMLDIIKSIPDVNNVVRSGLVTFSSK------IVQTF 231 (371) Q Consensus 167 ~~idi~~viD~SgSm~~~~~---------~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~f~~~------~~~~~ 231 (371) ..++..+.+|.+.|-++... .....+..|...+...+..++.... +..+-|... 
..-.+ T Consensus 284 ~~lnf~vgIDfTaSNg~p~~~sSLHyi~p~~~N~Y~~Ai~~vG~~lq~ydsdk~------fpa~GFGakip~~~~vs~~f 357 (529) T KOG1327 284 EQLNFTVGIDFTASNGDPRNPSSLHYIDPHQPNPYEQAIRSVGETLQDYDSDKL------FPAFGFGAKIPPDGQVSHEF 357 (529) T ss_pred CEEEEEEEEEEECCCCCCCCCCCCEECCCCCCCHHHHHHHHHHHHHCCCCCCCC------CCCCCCCCCCCCCCCCCCCE T ss_conf 566169999971247899999752452788998899999997134105587776------42345466579986524202 Q ss_pred CCC--------CC----HHHHHHHHHCCCCCCCCCCCCHHHHHHHHHHHHHHCCCCCCCCCCCCCEEEEEEECCCCCCCC Q ss_conf 665--------58----789999974015688745642378899998742110123467776661699998404588888 Q gi|254780833|r 232 PLA--------WG----VQHIQEKINRLIFGSTTKSTPGLEYAYNKIFDAKEKLEHIAKGHDDYKKYIIFLTDGENSSPN 299 (371) Q Consensus 232 ~lt--------~~----~~~~~~~i~~l~~~g~T~~~~~~~~~~~~l~~~~~~~~~~~~~~~~~~k~ivl~TDG~~~~~~ 299 (371) .|. .. ...-+..+..+.+.|.|+..+-+..+.+....... ...---+++++|||.-++ T Consensus 358 ~ln~~~~~~~c~Gi~gVl~aY~~~lp~v~l~GPTnFaPII~~va~~a~~~~~--------~~~qY~VLlIitDG~vTd-- 427 (529) T KOG1327 358 VLNFNPEDPECRGIEGVLEAYRKALPNVQLYGPTNFSPIINHVARIAQQSGN--------TAGQYHVLLIITDGVVTD-- 427 (529) T ss_pred EECCCCCCCCCCCHHHHHHHHHHHCCCCCCCCCCCCHHHHHHHHHHHHHHCC--------CCCCEEEEEEEECCCCCC-- T ss_conf 2037889985434788999998645664205887618999999999997256--------786249999993782344-- Q ss_pred CCHHHHHHHHHHHHHCCCEEEEEEECCCCCHHHHHHHC Q ss_conf 89789999999999789879999941864279899833 Q gi|254780833|r 300 IDNKESLFYCNEAKRRGAIVYAIGVQAEAADQFLKNCA 337 (371) Q Consensus 300 ~~~~~~~~~c~~~k~~gi~i~tIg~~~~~~~~~l~~cA 337 (371) ...+....-.|-.....|..||+| +.+-+.|+..= T Consensus 428 --m~~T~~AIV~AS~lPlSIIiVGVG-d~df~~M~~lD 462 (529) T KOG1327 428 --MKETRDAIVSASDLPLSIIIVGVG-DADFDMMRELD 462 (529) T ss_pred --HHHHHHHHHHHCCCCEEEEEEEEC-CCCHHHHHHHH T ss_conf --889999998631598079999737-97878999750 No 65 >TIGR01651 CobT cobaltochelatase, CobT subunit; InterPro: IPR006538 These proteins are CobT subunits of the aerobic cobalt chelatase (aerobic cobalamin biosynthesis pathway). 
Pseudomonas denitrificans CobT has been experimentally characterised , . Aerobic cobalt chelatase consists of three subunits, CobT, CobN (IPR003672 from INTERPRO) and CobS (IPR006537 from INTERPRO). Cobalamin (vitamin B12) can be complexed with metal via the ATP-dependent reactions (aerobic pathway) (e.g., in Pseudomonas denitrificans) or via ATP-independent reactions (anaerobic pathway) (e.g., in Salmonella typhimurium) , . The corresponding cobalt chelatases are not homologous. However, aerobic cobalt chelatase subunits CobN and CobS are homologous to Mg-chelatase subunits BchH and BchI, respectively . CobT, too, has been found to be remotely related to the third subunit of Mg-chelatase, BchD (involved in bacteriochlorophyll synthesis, e.g., in Rhodobacter capsulatus) . Nomenclature note: CobT of the aerobic pathway Pseudomonas denitrificans is not a homolog of CobT of the anaerobic pathway (Salmonella typhimurium, Escherichia coli). Therefore, annotation of any members of this family as nicotinate-mononucleotide--5,6-dimethylbenzimidazole phosphoribosyltransferases is erroneous. . Probab=96.37 E-value=0.0089 Score=35.11 Aligned_cols=140 Identities=18% Similarity=0.206 Sum_probs=71.9 Q ss_pred CEEEEECCCCCCCCCCCCCHHHHHHHH---HHHHHHCCCCCCCCCCCEEEEEEEEEECCCCEE--ECCCC---------- Q ss_conf 069985276311566787214898999---998750023355553110479878863674178--16665---------- Q gi|254780833|r 170 DMMMVLDVSLSMNDHFGPGMDKLGVAT---RSIREMLDIIKSIPDVNNVVRSGLVTFSSKIVQ--TFPLA---------- 234 (371) Q Consensus 170 di~~viD~SgSm~~~~~~~~~~~~~~~---~~~~~~~~~~~~~~~~~~~~~~~~~~f~~~~~~--~~~lt---------- 234 (371) -|.++||+||||.+. .|.++. +.+-.-++.+...-++ .|+++-.|+... .-+|. 
T Consensus 400 VVTLliDNSGSMRGR------PI~VAA~CADILARTLERCGV~~Ei-----LGFTTrAWKGG~sR~~Wl~~GKP~aPGRL 468 (606) T TIGR01651 400 VVTLLIDNSGSMRGR------PITVAATCADILARTLERCGVKVEI-----LGFTTRAWKGGQSREKWLKAGKPAAPGRL 468 (606) T ss_pred EEEEEEECCCCCCCC------HHHHHHHHHHHHHHHHHHCCCEEEE-----CCCCCCCCCCCCCHHHHHHCCCCCCCCCC T ss_conf 788877467887881------4788988898987666417730675-----25401455788648999737787777842 Q ss_pred CCHHHHHHHH-HC----------CCCCCC---CCC-CCHHHHHHHHHHHHHHCCCCCCCCCCCCCEEEEEEECCCCCCCC Q ss_conf 5878999997-40----------156887---456-42378899998742110123467776661699998404588888 Q gi|254780833|r 235 WGVQHIQEKI-NR----------LIFGST---TKS-TPGLEYAYNKIFDAKEKLEHIAKGHDDYKKYIIFLTDGENSSPN 299 (371) Q Consensus 235 ~~~~~~~~~i-~~----------l~~~g~---T~~-~~~~~~~~~~l~~~~~~~~~~~~~~~~~~k~ivl~TDG~~~~~~ 299 (371) +|...+...- +. |....| -|+ .+++.|+-+-|.. .+.-+|++.+++||.|-+.+ T Consensus 469 NDLRHIiYKsAD~PWRRARrNLGLMMREGLLKENIDGEAL~WAH~RliA-----------R~EQRrILM~ISDGAPVDDS 537 (606) T TIGR01651 469 NDLRHIIYKSADAPWRRARRNLGLMMREGLLKENIDGEALLWAHERLIA-----------RPEQRRILMMISDGAPVDDS 537 (606) T ss_pred CHHHHHHHHCCCCCHHHHHHHCCHHHHHCCHHCCCCHHHHHHHHHHHHC-----------CHHHCEEEEEEECCCCCCCC T ss_conf 0234575321687146777512325541200105646799988666414-----------72047587776278886645 Q ss_pred CCH--------HHHHHHHHHHH-HCCCEEEEEEECCCCCHH Q ss_conf 897--------89999999999-789879999941864279 Q gi|254780833|r 300 IDN--------KESLFYCNEAK-RRGAIVYAIGVQAEAADQ 331 (371) Q Consensus 300 ~~~--------~~~~~~c~~~k-~~gi~i~tIg~~~~~~~~ 331 (371) +-. .-.....+.+- ...|+..+||+|-|.-++ T Consensus 538 TLSVN~G~YLERHLR~VI~~IEtrSPVELlAIGIGHDVTRY 578 (606) T TIGR01651 538 TLSVNPGNYLERHLRAVIEEIETRSPVELLAIGIGHDVTRY 578 (606) T ss_pred CCCCCCCHHHHHHHHHHHHHHCCCCCCEEEEECCCCCCCCE T ss_conf 23547850678999999986237787000232344342200 No 66 >pfam07002 Copine Copine. This family represents a conserved region approximately 180 residues long within eukaryotic copines. 
Copines are Ca(2+)-dependent phospholipid-binding proteins that are thought to be involved in membrane-trafficking, and may also be involved in cell division and growth. Probab=96.14 E-value=0.087 Score=29.15 Aligned_cols=124 Identities=17% Similarity=0.184 Sum_probs=72.8 Q ss_pred HHHHHHHHHHHHHHCCCCCCCCCCCEEEEEEEEEECCCCEEECCCCCCH------------HHHHHHHHCCCCCCCCCCC Q ss_conf 1489899999875002335555311047987886367417816665587------------8999997401568874564 Q gi|254780833|r 189 MDKLGVATRSIREMLDIIKSIPDVNNVVRSGLVTFSSKIVQTFPLAWGV------------QHIQEKINRLIFGSTTKST 256 (371) Q Consensus 189 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~f~~~~~~~~~lt~~~------------~~~~~~i~~l~~~g~T~~~ 256 (371) ......+..++...+..++.....+-+-..+..+......-++||+.+. +.-+..+..+...|-|+.. T Consensus 10 ~N~Y~~AI~~vg~il~~YD~Dk~~p~yGFGa~~~~~~~vsh~F~ln~~~~~p~~~G~~gvl~aY~~~~~~v~l~gPT~fa 89 (145) T pfam07002 10 PNPYEQAIRIVGEILQPYDSDKRFPAFGFGARLPPDYEVSHDFPLNFNPENPECNGIEGVLNAYREALPNLQLSGPTNFA 89 (145) T ss_pred CCHHHHHHHHHHHHHHHCCCCCCEEEECCCCCCCCCCCEEEEEECCCCCCCCCCCCHHHHHHHHHHHHCEEEECCCCCHH T ss_conf 88999999999999872689881365435556699986330123568989997669999999999985810654875279 Q ss_pred CHHHHHHHHHHHHHHCCCCCCCCCCCCCEEEEEEECCCCCCCCCCHHHHHHHHHHHHHCCCEEEEEEEC Q ss_conf 237889999874211012346777666169999840458888889789999999999789879999941 Q gi|254780833|r 257 PGLEYAYNKIFDAKEKLEHIAKGHDDYKKYIIFLTDGENSSPNIDNKESLFYCNEAKRRGAIVYAIGVQ 325 (371) Q Consensus 257 ~~~~~~~~~l~~~~~~~~~~~~~~~~~~k~ivl~TDG~~~~~~~~~~~~~~~c~~~k~~gi~i~tIg~~ 325 (371) +-++.+.+...... 
...--.+++++|||+-+ +...+....-.+-+..+.|-.||+| T Consensus 90 piI~~a~~~a~~~~---------~~~~Y~VLlIiTDG~i~----D~~~Ti~aIv~AS~~PlSIIiVGVG 145 (145) T pfam07002 90 PIIDAAARIAEATQ---------KSGQYHVLLIITDGQVT----DMKATIDAIVRASHLPLSIIIVGVG 145 (145) T ss_pred HHHHHHHHHHHHHC---------CCCEEEEEEEECCCCCC----CHHHHHHHHHHHHCCCEEEEEEEEC T ss_conf 99999999999722---------37718999996389735----6999999999982799279999519 No 67 >COG5151 SSL1 RNA polymerase II transcription initiation/nucleotide excision repair factor TFIIH, subunit SSL1 [Transcription / DNA replication, recombination, and repair] Probab=96.02 E-value=0.099 Score=28.80 Aligned_cols=170 Identities=16% Similarity=0.151 Sum_probs=96.4 Q ss_pred CCCCEEEEECCCCCCCCCCCCCHHH---HHHHHHHHHHHCCCCCCCCCCCEEEEEEEEEECCC-CEEECCCCCCHHHHHH Q ss_conf 6740699852763115667872148---98999998750023355553110479878863674-1781666558789999 Q gi|254780833|r 167 IGLDMMMVLDVSLSMNDHFGPGMDK---LGVATRSIREMLDIIKSIPDVNNVVRSGLVTFSSK-IVQTFPLAWGVQHIQE 242 (371) Q Consensus 167 ~~idi~~viD~SgSm~~~~~~~~~~---~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~f~~~-~~~~~~lt~~~~~~~~ 242 (371) .--.+.+++|.|.+|.....-+..+ ++.+.+.+-+++++... .+++++.-.+. +.....+.-|...-.. T Consensus 86 IiRhl~l~lD~Seam~e~Df~p~r~a~vikya~~Fv~eFf~qNPi-------Sqlsii~irdg~a~~~s~~~gnpq~hi~ 158 (421) T COG5151 86 IIRHLHLILDVSEAMDESDFLPTRRANVIKYAEGFVPEFFSQNPI-------SQLSIISIRDGCAKYTSSMDGNPQAHIG 158 (421) T ss_pred HHHEEEEEEEHHHHHHHHHCCCHHHHHHHHHHHHHHHHHHCCCCC-------HHEEEEEHHHHHHHHHHHCCCCHHHHHH T ss_conf 021057888735444332036058888999998776887525970-------0324433354688876534799899998 Q ss_pred HHHCCC-CCCCCCCCCHHHHHHHHHHHHHHCCCCCCCCCCCCCEEEEEEECCCCCCCCCCHHHHHHHHHHHHHCCCEEEE Q ss_conf 974015-6887456423788999987421101234677766616999984045888888978999999999978987999 Q gi|254780833|r 243 KINRLI-FGSTTKSTPGLEYAYNKIFDAKEKLEHIAKGHDDYKKYIIFLTDGENSSPNIDNKESLFYCNEAKRRGAIVYA 321 (371) Q Consensus 243 ~i~~l~-~~g~T~~~~~~~~~~~~l~~~~~~~~~~~~~~~~~~k~ivl~TDG~~~~~~~~~~~~~~~c~~~k~~gi~i~t 321 (371) .+..+. ..|+-....|++.+.-.|.+. 
..-.++.++|++..=.- .++....+..+.+...+|+|.. T Consensus 159 ~lkS~rd~~gnfSLqNaLEmar~~l~~~---------~~H~trEvLiifgS~st----~DPgdi~~tid~Lv~~~IrV~~ 225 (421) T COG5151 159 QLKSKRDCSGNFSLQNALEMARIELMKN---------TMHGTREVLIIFGSTST----RDPGDIAETIDKLVAYNIRVHF 225 (421) T ss_pred HHHCCCCCCCCHHHHHHHHHHHHHHCCC---------CCCCCEEEEEEEEECCC----CCCCCHHHHHHHHHHHCEEEEE T ss_conf 7501004688853776887766540345---------45562279999842155----8974189999998750527999 Q ss_pred EEECCCCCHHHHHH-H-CC----CCEEEEECCHHHHHHHHHHH Q ss_conf 99418642798998-3-38----98089828989999999999 Q gi|254780833|r 322 IGVQAEAADQFLKN-C-AS----PDRFYSVQNSRKLHDAFLRI 358 (371) Q Consensus 322 Ig~~~~~~~~~l~~-c-As----~~~~y~~~~~~~L~~af~~I 358 (371) ||+..... .-+. | |+ .+.||..-+..-|.+.|.+. T Consensus 226 igL~aeva--icKeickaTn~~~e~~y~v~vde~Hl~el~~E~ 266 (421) T COG5151 226 IGLCAEVA--ICKEICKATNSSTEGRYYVPVDEGHLSELMREL 266 (421) T ss_pred EEEHHHHH--HHHHHHHHCCCCCCCEEEEEECHHHHHHHHHHC T ss_conf 75015899--999998614767675067660478899999862 No 68 >KOG2884 consensus Probab=95.96 E-value=0.11 Score=28.62 Aligned_cols=170 Identities=15% Similarity=0.148 Sum_probs=109.3 Q ss_pred CEEEEECCCCCCCCCCCCCHHHHHHHHHHHHHHCCCCCCCCCCCEEEEEEEEEECC-CCEEECCCCCCHHHHHHHHHCCC Q ss_conf 06998527631156678721489899999875002335555311047987886367-41781666558789999974015 Q gi|254780833|r 170 DMMMVLDVSLSMNDHFGPGMDKLGVATRSIREMLDIIKSIPDVNNVVRSGLVTFSS-KIVQTFPLAWGVQHIQEKINRLI 248 (371) Q Consensus 170 di~~viD~SgSm~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~f~~-~~~~~~~lt~~~~~~~~~i~~l~ 248 (371) ..++.+|+|.-|..... ..+|+..-++++......- ...+.-..+|+.+-.. .+.....+|.+...+......+. 
T Consensus 5 atmi~iDNse~mrNgDy-~PtRf~aQ~daVn~v~~~K---~~snpEntvGiitla~a~~~vLsT~T~d~gkils~lh~i~ 80 (259) T KOG2884 5 ATMICIDNSEYMRNGDY-LPTRFQAQKDAVNLVCQAK---LRSNPENTVGIITLANASVQVLSTLTSDRGKILSKLHGIQ 80 (259) T ss_pred EEEEEEECHHHHHCCCC-CHHHHHHHHHHHHHHHHHH---HCCCCCCCEEEEECCCCCCEEEEECCCCCHHHHHHHCCCC T ss_conf 27999847677624897-7188898899999998755---0279543154686368985044303430048987732778 Q ss_pred CCCCCCCCCHHHHHHHHHHHHHHCCCCCCCCCCCC-CEEEEEEECCCCCCCCCCHHHHHHHHHHHHHCCCEEEEEEECCC Q ss_conf 68874564237889999874211012346777666-16999984045888888978999999999978987999994186 Q gi|254780833|r 249 FGSTTKSTPGLEYAYNKIFDAKEKLEHIAKGHDDY-KKYIIFLTDGENSSPNIDNKESLFYCNEAKRRGAIVYAIGVQAE 327 (371) Q Consensus 249 ~~g~T~~~~~~~~~~~~l~~~~~~~~~~~~~~~~~-~k~ivl~TDG~~~~~~~~~~~~~~~c~~~k~~gi~i~tIg~~~~ 327 (371) +.|.-+...|++-+--.|... .+++. .++++|+ |.+-. ....+...+...+|..+|.|-.|-||.. T Consensus 81 ~~g~~~~~~~i~iA~lalkhR---------qnk~~~~riVvFv--GSpi~--e~ekeLv~~akrlkk~~Vaidii~FGE~ 147 (259) T KOG2884 81 PHGKANFMTGIQIAQLALKHR---------QNKNQKQRIVVFV--GSPIE--ESEKELVKLAKRLKKNKVAIDIINFGEA 147 (259) T ss_pred CCCCCCHHHHHHHHHHHHHHH---------CCCCCCEEEEEEE--CCCCH--HHHHHHHHHHHHHHHCCEEEEEEEECCC T ss_conf 577612888899999998710---------3888636999993--68322--3389999999998754802789872434 Q ss_pred CCH-HHHHHH----C---CCCEEEEECCHHHHHHHHH Q ss_conf 427-989983----3---8980898289899999999 Q gi|254780833|r 328 AAD-QFLKNC----A---SPDRFYSVQNSRKLHDAFL 356 (371) Q Consensus 328 ~~~-~~l~~c----A---s~~~~y~~~~~~~L~~af~ 356 (371) .+. ..|... - ++.|...+....-|.++.. T Consensus 148 ~~~~e~l~~fida~N~~~~gshlv~Vppg~~L~d~l~ 184 (259) T KOG2884 148 ENNTEKLFEFIDALNGKGDGSHLVSVPPGPLLSDALL 184 (259) T ss_pred CCCHHHHHHHHHHHCCCCCCCEEEEECCCCCHHHHHH T ss_conf 3337889999998538988744898589840777764 No 69 >TIGR02877 spore_yhbH sporulation protein YhbH; InterPro: IPR014230 Proteins in this entry, typified by YhbH from Bacillus subtilis, are found in the genomes of nearly every endospore-forming bacterium, and in no other genomes. 
The gene in Bacillus subtilis was shown to be a member of the sigma-E regulon, with mutation leading to a sporulation defect .. Probab=95.89 E-value=0.075 Score=29.51 Aligned_cols=115 Identities=16% Similarity=0.141 Sum_probs=67.2 Q ss_pred CEEEEECCCCCCCCCCCCCHHHHHHHHHHHHHHCCCCCCCCCCCEEEEEEEEEECCCCEEECCCCCCHHHHHHHHHCCCC Q ss_conf 06998527631156678721489899999875002335555311047987886367417816665587899999740156 Q gi|254780833|r 170 DMMMVLDVSLSMNDHFGPGMDKLGVATRSIREMLDIIKSIPDVNNVVRSGLVTFSSKIVQTFPLAWGVQHIQEKINRLIF 249 (371) Q Consensus 170 di~~viD~SgSm~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~f~~~~~~~~~lt~~~~~~~~~i~~l~~ 249 (371) -|+.++|.||||+.. .| =.|+..+-++..-+...+.. +.+..++=.+.+..+.--. + =..-- T Consensus 216 Vvi~mMDtSGSMg~~-----kK-YiARSfFFw~~kFlr~KY~~---VeI~FisH~TeAkEV~Ee~--------F-F~kgE 277 (392) T TIGR02877 216 VVIAMMDTSGSMGEF-----KK-YIARSFFFWMVKFLRTKYEN---VEIVFISHHTEAKEVTEEE--------F-FTKGE 277 (392) T ss_pred EEEEEECCCCCCCCC-----HH-HHHHHHHHHHHHHHHHEEEE---EEEEEEEECCCCEEECHHH--------C-CCCCC T ss_conf 777764478898873-----16-78888999999886320014---7899971045212401666--------0-13256 Q ss_pred CCCCCCCCHHHHHHHHHHHHHHCCCCCCCCCCCCCEEEEEEECCCCCCCCCCHHH-HHHHHHHH Q ss_conf 8874564237889999874211012346777666169999840458888889789-99999999 Q gi|254780833|r 250 GSTTKSTPGLEYAYNKIFDAKEKLEHIAKGHDDYKKYIIFLTDGENSSPNIDNKE-SLFYCNEA 312 (371) Q Consensus 250 ~g~T~~~~~~~~~~~~l~~~~~~~~~~~~~~~~~~k~ivl~TDG~~~~~~~~~~~-~~~~c~~~ 312 (371) +|||.+..|.+.|++.+...-.+..++ =|-+-++||+|=.. 
++.+ +..+...+ T Consensus 278 SGGT~~SS~Y~~ALeiI~~RYnP~~yN--------iY~FHfSDGDNl~~--Dn~Rlav~l~~~L 331 (392) T TIGR02877 278 SGGTRCSSAYKLALEIIDERYNPARYN--------IYAFHFSDGDNLSS--DNERLAVKLVRKL 331 (392) T ss_pred CCCCCHHHHHHHHHHHHHCCCCCCCCC--------CCCCEEECCCCCCC--CCHHHHHHHHHHH T ss_conf 677430167889999974278831006--------56535533778898--8646899999999 No 70 >PRK10997 yieM hypothetical protein; Provisional Probab=95.70 E-value=0.14 Score=27.96 Aligned_cols=143 Identities=16% Similarity=0.152 Sum_probs=79.4 Q ss_pred CCCCCCEEEEECCCCCCCCCCCCCHHHHHHHHHHHHHHCCCCCCCCCCCEEEEEEEEEECCCCEEECCCCCCHHHHHHHH Q ss_conf 34674069985276311566787214898999998750023355553110479878863674178166655878999997 Q gi|254780833|r 165 SDIGLDMMMVLDVSLSMNDHFGPGMDKLGVATRSIREMLDIIKSIPDVNNVVRSGLVTFSSKIVQTFPLAWGVQHIQEKI 244 (371) Q Consensus 165 ~~~~idi~~viD~SgSm~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~f~~~~~~~~~lt~~~~~~~~~i 244 (371) .+.| ++.-+|.||||.+. .+.+..+ .+..++..... . .-..-++.|+.... ...|+. ...+...+ T Consensus 319 ~kGP--~IvCVDTSGSM~G~----pE~~AKA--~~Lal~r~Al~-e----~R~CyvI~FSte~~-t~eLt~-~~gl~~l~ 383 (484) T PRK10997 319 PRGP--FIVCVDTSGSMGGF----NEQCAKA--FCLALMRIALA-E----NRRCYIMLFSTEVI-TYELSG-PDGLEQAI 383 (484) T ss_pred CCCC--EEEEEECCCCCCCC----HHHHHHH--HHHHHHHHHHH-C----CCCEEEEEECCCEE-EEEECC-CCCHHHHH T ss_conf 7899--79999588888997----6889999--99999999996-2----89879998126517-898048-78879999 Q ss_pred HCC--CCCCCCCCCCHHHHHHHHHHHHHHCCCCCCCCCCCCCEEEEEEECCCCCCCCCCHHHHHHHHHH-HHHCCCEEEE Q ss_conf 401--5688745642378899998742110123467776661699998404588888897899999999-9978987999 Q gi|254780833|r 245 NRL--IFGSTTKSTPGLEYAYNKIFDAKEKLEHIAKGHDDYKKYIIFLTDGENSSPNIDNKESLFYCNE-AKRRGAIVYA 321 (371) Q Consensus 245 ~~l--~~~g~T~~~~~~~~~~~~l~~~~~~~~~~~~~~~~~~k~ivl~TDG~~~~~~~~~~~~~~~c~~-~k~~gi~i~t 321 (371) +-| ..+|||...+++..+...+...... .. =++++||-.-..- + .....-+.. 
-|+.+=++|+ T Consensus 384 ~FL~~sF~GGTD~~~~L~~~l~~m~~~~y~-------~A----DllvISDFIa~~l--p-~~l~~kv~~lqk~~~nrFha 449 (484) T PRK10997 384 RFLSQSFRGGTDLAPCLRAIIEKMQGREWF-------DA----DAVVISDFIAQRL--P-DELVAKVKELQRVHQHRFHA 449 (484) T ss_pred HHHCCCCCCCCCHHHHHHHHHHHHHHCCCC-------CC----CEEEECHHCCCCC--C-HHHHHHHHHHHHHHCCCEEE T ss_conf 985288889845799999999986232446-------58----8799712206569--9-99999999999850683588 Q ss_pred EEECCCCCHHHHHHH Q ss_conf 994186427989983 Q gi|254780833|r 322 IGVQAEAADQFLKNC 336 (371) Q Consensus 322 Ig~~~~~~~~~l~~c 336 (371) |..+.-+...+|+.- T Consensus 450 v~is~~g~p~~m~iF 464 (484) T PRK10997 450 VAMSAHGKPGIMRIF 464 (484) T ss_pred EECCCCCCHHHHHHH T ss_conf 840123585799997 No 71 >TIGR00873 gnd 6-phosphogluconate dehydrogenase, decarboxylating; InterPro: IPR006113 6-Phosphogluconate dehydrogenase (1.1.1.44 from EC) (6PGD) is an oxidative carboxylase that catalyses the decarboxylating reduction of 6-phosphogluconate into ribulose 5-phosphate in the presence of NADP. This reaction is a component of the hexose mono-phosphate shunt and pentose phosphate pathways (PPP) , . Prokaryotic and eukaryotic 6PGD are proteins of about 470 amino acids whose sequence are highly conserved . The protein is a homodimer in which the monomers act independently : each contains a large, mainly alpha-helical domain and a smaller beta-alpha-beta domain, containing a mixed parallel and anti-parallel 6-stranded beta sheet . NADP is bound in a cleft in the small domain, the substrate binding in an adjacent pocket . This model does not specify whether the cofactor is NADP only, NAD only, or both.; GO: 0004616 phosphogluconate dehydrogenase (decarboxylating) activity, 0050661 NADP binding, 0006098 pentose-phosphate shunt. 
Probab=95.52 E-value=0.036 Score=31.48 Aligned_cols=46 Identities=20% Similarity=0.325 Sum_probs=36.6 Q ss_pred CCCCCEEEEEEECCCCC--C-------------------CCCCHHHHHHHHHHHHHCCCEEEEEEEC Q ss_conf 76661699998404588--8-------------------8889789999999999789879999941 Q gi|254780833|r 280 HDDYKKYIIFLTDGENS--S-------------------PNIDNKESLFYCNEAKRRGAIVYAIGVQ 325 (371) Q Consensus 280 ~~~~~k~ivl~TDG~~~--~-------------------~~~~~~~~~~~c~~~k~~gi~i~tIg~~ 325 (371) -++|+|+|++++-|.+. + ||+..+.+..-|++++++||.+--+|++ T Consensus 61 Le~PRKImLMVkAG~pVdaD~~I~~L~P~LE~GDiIIDGGNS~Y~DT~RR~~eL~~~Gi~FvG~GvS 127 (480) T TIGR00873 61 LERPRKIMLMVKAGAPVDADAVINSLLPLLEKGDIIIDGGNSHYKDTERRYKELKAKGILFVGVGVS 127 (480) T ss_pred CCCCCEEEEEEECCCCCCHHHHHHHHHHHCCCCCEEECCCCCCCCCHHHHHHHHHHCCCCEEEEEEE T ss_conf 0688728887753885377899999644358998887588788466578999998649816730132 No 72 >COG2718 Uncharacterized conserved protein [Function unknown] Probab=95.37 E-value=0.18 Score=27.25 Aligned_cols=165 Identities=14% Similarity=0.110 Sum_probs=84.1 Q ss_pred EEEEECCCCCCCCCCCCCHHHHHHHHHHHHHHCCCCCCCCCCCEEEEEEEEEECCCCEEECCCCCCHHHHHHHHHCCCCC Q ss_conf 69985276311566787214898999998750023355553110479878863674178166655878999997401568 Q gi|254780833|r 171 MMMVLDVSLSMNDHFGPGMDKLGVATRSIREMLDIIKSIPDVNNVVRSGLVTFSSKIVQTFPLAWGVQHIQEKINRLIFG 250 (371) Q Consensus 171 i~~viD~SgSm~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~f~~~~~~~~~lt~~~~~~~~~i~~l~~~ 250 (371) ++-++|+||||+.. .-+.+++....+.-.+...+.....+.+.--+-.+.+.... 
-=.-..+ T Consensus 249 mfclMDvSGSM~~~------~KdlAkrFF~lL~~FL~~kYenveivfIrHht~A~EVdE~d------------FF~~~es 310 (423) T COG2718 249 MFCLMDVSGSMDQS------EKDLAKRFFFLLYLFLRRKYENVEIVFIRHHTEAKEVDETD------------FFYSQES 310 (423) T ss_pred EEEEEECCCCCCHH------HHHHHHHHHHHHHHHHHCCCCEEEEEEEEECCCCEECCHHH------------CEEECCC T ss_conf 99977457774467------89999999999999984440025899996037431423224------------0232478 Q ss_pred CCCCCCCHHHHHHHHHHHHHHCCCCCCCCCCCCCEEEEEEECCCCCCCCCCHHHHHHHHHHHHHCCCEEEEEE-ECCCCC Q ss_conf 8745642378899998742110123467776661699998404588888897899999999997898799999-418642 Q gi|254780833|r 251 STTKSTPGLEYAYNKIFDAKEKLEHIAKGHDDYKKYIIFLTDGENSSPNIDNKESLFYCNEAKRRGAIVYAIG-VQAEAA 329 (371) Q Consensus 251 g~T~~~~~~~~~~~~l~~~~~~~~~~~~~~~~~~k~ivl~TDG~~~~~~~~~~~~~~~c~~~k~~gi~i~tIg-~~~~~~ 329 (371) |||-+..+++.+.+.+...-+ -....-|.+-.+||+|-.+... ....+..+---..++.|+-+ +-.... T Consensus 311 GGTivSSAl~~m~evi~ErYp--------~aeWNIY~fqaSDGDN~~dDse--rc~~ll~~~im~~~~~y~Y~Eitq~~~ 380 (423) T COG2718 311 GGTIVSSALKLMLEVIKERYP--------PAEWNIYAFQASDGDNWADDSE--RCVELLAKKLMPVVQYYGYIEITQRRT 380 (423) T ss_pred CCEEEHHHHHHHHHHHHHHCC--------HHHEEEEEEEECCCCCCCCCCH--HHHHHHHHHHHHHHHHEEEEEEEECCC T ss_conf 976868899999999970388--------5350255454057766568778--899999999998633438874200243 Q ss_pred HHHH--HHHCC-CCE--EEEECCHHHHHHHHHHHHHHHH Q ss_conf 7989--98338-980--8982898999999999999640 Q gi|254780833|r 330 DQFL--KNCAS-PDR--FYSVQNSRKLHDAFLRIGKEMV 363 (371) Q Consensus 330 ~~~l--~~cAs-~~~--~y~~~~~~~L~~af~~I~~~i~ 363 (371) ...| +..-. -++ ++...+.+++--+|++++.+-. T Consensus 381 H~t~~y~~~~~~~dnFa~~~I~~~~Diypvfr~lf~ke~ 419 (423) T COG2718 381 HQTLEYEALQGVFDNFAMQTIREPDDIYPVFRELFSKEL 419 (423) T ss_pred CHHHHHHHHHCCCCCHHEEEECCHHHHHHHHHHHHHCCC T ss_conf 112433444335753130340677877999999984140 No 73 >pfam03731 Ku_N Ku70/Ku80 N-terminal alpha/beta domain. 
The Ku heterodimer (composed of Ku70 and Ku80) contributes to genomic integrity through its ability to bind DNA double-strand breaks and facilitate repair by the non-homologous end-joining pathway. This is the amino terminal alpha/beta domain. This domain only makes a small contribution to the dimer interface. The domain comprises a six stranded beta sheet of the Rossman fold. Probab=94.43 E-value=0.32 Score=25.73 Aligned_cols=186 Identities=15% Similarity=0.146 Sum_probs=94.2 Q ss_pred EEEEECCCCCCCCCCCCCH-HHHHHHHHHHHHHCCCCCCCCCCCEEEEEEEEEECCCC----------EEECCCCC-CHH Q ss_conf 6998527631156678721-48989999987500233555531104798788636741----------78166655-878 Q gi|254780833|r 171 MMMVLDVSLSMNDHFGPGM-DKLGVATRSIREMLDIIKSIPDVNNVVRSGLVTFSSKI----------VQTFPLAW-GVQ 238 (371) Q Consensus 171 i~~viD~SgSm~~~~~~~~-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~f~~~~----------~~~~~lt~-~~~ 238 (371) ++|++|.|.+|.....+.. ..+..+.+.+...+...-.. .....+|++-|+..- +...+|.. +.. T Consensus 2 vvf~ID~s~sM~~~~~~~~~s~~~~al~~i~~~~~~kIis---~~kD~vGvv~~~T~~~~n~~~~~ni~vl~~l~~p~a~ 78 (222) T pfam03731 2 TVFLIDASPAMFESVKGLEASPFEQALKCIDEILSRKIIS---NDKDLIGVVLYGTDESENSEGFENVTVLRDLDLPGAE 78 (222) T ss_pred EEEEEECCHHHCCCCCCCCCCHHHHHHHHHHHHHHHHEEC---CCCCEEEEEEECCCCCCCCCCCCEEEEECCCCCCCHH T ss_conf 7999979988868788998783999999999999877137---8998588999705677776788606985057887889 Q ss_pred HHH---HHHHC-------CCCCCCCCCCCHHHHHHHHHHHHHHCCCCCCCCCCCCCEEEEEEECCCCCCCCCCHHH---H Q ss_conf 999---99740-------1568874564237889999874211012346777666169999840458888889789---9 Q gi|254780833|r 239 HIQ---EKINR-------LIFGSTTKSTPGLEYAYNKIFDAKEKLEHIAKGHDDYKKYIIFLTDGENSSPNIDNKE---S 305 (371) Q Consensus 239 ~~~---~~i~~-------l~~~g~T~~~~~~~~~~~~l~~~~~~~~~~~~~~~~~~k~ivl~TDG~~~~~~~~~~~---~ 305 (371) .++ ..+.. ...........++..+.+++... .......+|-|+|+||.++-.+..+..+ . 
T Consensus 79 ~ik~L~~~~~~~~~~~~~~~~~~~~~~~~aL~~~~~~~~~~-------~~~~k~~~krI~LfTdnD~P~~~~~~~~~~~~ 151 (222) T pfam03731 79 LLKELDQFLEPLADVFGFNGDSSDGDLLSALWVCMDLLQKQ-------TGKKKLSKKRILLFTNLDDPFEDDDQLDTIRQ 151 (222) T ss_pred HHHHHHHHHHHHHHHHHCCCCCCCCCHHHHHHHHHHHHHHH-------HCCCCCCCCEEEEECCCCCCCCCCCHHHHHHH T ss_conf 99999998510344441168977664888999999998851-------03434578679998999989887517789999 Q ss_pred HHHHHHHHHCCCEEEEEEECCCCC---HHHHHHHC---CC--CEEEEECCHHHHHHHHHHHHHHHHHEE Q ss_conf 999999997898799999418642---79899833---89--808982898999999999999640007 Q gi|254780833|r 306 LFYCNEAKRRGAIVYAIGVQAEAA---DQFLKNCA---SP--DRFYSVQNSRKLHDAFLRIGKEMVKQR 366 (371) Q Consensus 306 ~~~c~~~k~~gi~i~tIg~~~~~~---~~~l~~cA---s~--~~~y~~~~~~~L~~af~~I~~~i~~~~ 366 (371) ...+..+.+.||.|-.+.++.+.. ..+.+.+- +. ..+.....++.+.+....|-.+....| T Consensus 152 ~~~a~Dl~d~gi~i~lf~i~~~~~f~~~~FY~dii~~~~~~~~~~~~~~~~~~l~~l~~~i~~k~~~kR 220 (222) T pfam03731 152 KLLAEDLRDEGIEFNLIHLPNSGGFDPNIFYKEIIKLGEDEENEVMLDLEGEKLEDLLSRLRAKQTAKR 220 (222) T ss_pred HHHHCCHHHCCCEEEEEECCCCCCCCHHHHHHHHCCCCCCCCCCCCCCCCHHHHHHHHHHHHHCCCCCC T ss_conf 998523787497799961498888877788886426776434566777303169999999974303565 No 74 >COG3552 CoxE Protein containing von Willebrand factor type A (vWA) domain [General function prediction only] Probab=93.91 E-value=0.35 Score=25.46 Aligned_cols=119 Identities=12% Similarity=0.082 Sum_probs=57.4 Q ss_pred CCCCCEEEEECCCCCCCCCCCCCHHHHHHHHHHHHHHCCCCCCCCCCCEEEEEEEEEECCCCEEECCCCC--CHHHHHHH Q ss_conf 4674069985276311566787214898999998750023355553110479878863674178166655--87899999 Q gi|254780833|r 166 DIGLDMMMVLDVSLSMNDHFGPGMDKLGVATRSIREMLDIIKSIPDVNNVVRSGLVTFSSKIVQTFPLAW--GVQHIQEK 243 (371) Q Consensus 166 ~~~idi~~viD~SgSm~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~f~~~~~~~~~lt~--~~~~~~~~ 243 (371) ..+-.+++.+|+||||.+++.- .--...++.... .++-.-.|.+..+.+++.-. +....... 
T Consensus 216 ~~~~~lvvL~DVSGSm~~ys~~----~L~l~hAl~q~~------------~R~~~F~F~TRLt~vT~~l~~rD~~~Al~~ 279 (395) T COG3552 216 RRKPPLVVLCDVSGSMSGYSRI----FLHLLHALRQQR------------SRVHVFLFGTRLTRVTHMLRERDLEDALRR 279 (395) T ss_pred CCCCCEEEEEECCCCHHHHHHH----HHHHHHHHHHCC------------CCEEEEEEECHHHHHHHHHCCCCHHHHHHH T ss_conf 6899859998346634566999----999999998522------------660599730158777887624899999999 Q ss_pred HHCC--CCCCCCCCCCHHHHHHHHHHHHHHCCCCCCCCCCCCCEEEEEEECCCCCCCCCCHHHHHHHHHHHH Q ss_conf 7401--568874564237889999874211012346777666169999840458888889789999999999 Q gi|254780833|r 244 INRL--IFGSTTKSTPGLEYAYNKIFDAKEKLEHIAKGHDDYKKYIIFLTDGENSSPNIDNKESLFYCNEAK 313 (371) Q Consensus 244 i~~l--~~~g~T~~~~~~~~~~~~l~~~~~~~~~~~~~~~~~~k~ivl~TDG~~~~~~~~~~~~~~~c~~~k 313 (371) +..- .-.|+|.+...+ ..+...|.... -...-+++++|||.+.++.. ........+. T Consensus 280 ~~a~v~dw~ggTrig~tl----~aF~~~~~~~~------L~~gA~VlilsDg~drd~~~---~l~~~~~rl~ 338 (395) T COG3552 280 LSAQVKDWDGGTRIGNTL----AAFLRRWHGNV------LSGGAVVLILSDGLDRDDIP---ELVTAMARLR 338 (395) T ss_pred HHHHCCCCCCCCCHHHHH----HHHHCCCCCCC------CCCCEEEEEEECCCCCCCCH---HHHHHHHHHH T ss_conf 986411335776234899----99971544445------57861799970542247815---7999999999 No 75 >COG3847 Flp Flp pilus assembly protein, pilin Flp [Intracellular trafficking and secretion] Probab=92.66 E-value=0.48 Score=24.65 Aligned_cols=25 Identities=20% Similarity=0.376 Sum_probs=20.1 Q ss_pred HHHHHHHCCCCCCHHHHHHHHHHHH Q ss_conf 5788752036871899999999999 Q gi|254780833|r 4 LNIRNFFYNCKGSISILTAILLPVI 28 (371) Q Consensus 4 ~~l~~f~~d~~G~vai~~al~l~~l 28 (371) ..++||+|||+|.-+|=-+|+...+ T Consensus 3 ~~~~rF~rDE~GAtaiEYglia~lI 27 (58) T COG3847 3 KLLRRFLRDEDGATAIEYGLIAALI 27 (58) T ss_pred HHHHHHHHCCCCHHHHHHHHHHHHH T ss_conf 7899997745651899999999999 No 76 >pfam11265 Med25_VWA Mediator complex subunit 25 von Willebrand factor type A. 
The overall function of the full-length Med25 is efficiently to coordinate the transcriptional activation of RAR/RXR (retinoic acid receptor/retinoic X receptor) in higher eukaryotic cells. Human Med25 consists of several domains with different binding properties, the N-terminal, VWA domain which is this one, an SD2 domain from residues 229-381, a PTOV(B) or ACID domain from 395-545, an SD2 domain from residues 564-645 and a C-terminal NR box-containing domain (646-650) from 646-747. This VWA or von Willebrand factor type A domain when bound to RAR and the histone acetyltransferase CBP is responsible for recruiting Med1 to the rest of the Mediator complex. Probab=92.56 E-value=0.66 Score=23.81 Aligned_cols=162 Identities=15% Similarity=0.062 Sum_probs=89.2 Q ss_pred CCEEEEECCCCCCCCCCCCCH-HHHHHHHHHHH-HHCCCCCCCCCCCEEEEEEEEEECCCC------EEECCCCCCHHHH Q ss_conf 406998527631156678721-48989999987-500233555531104798788636741------7816665587899 Q gi|254780833|r 169 LDMMMVLDVSLSMNDHFGPGM-DKLGVATRSIR-EMLDIIKSIPDVNNVVRSGLVTFSSKI------VQTFPLAWGVQHI 240 (371) Q Consensus 169 idi~~viD~SgSm~~~~~~~~-~~~~~~~~~~~-~~~~~~~~~~~~~~~~~~~~~~f~~~~------~~~~~lt~~~~~~ 240 (371) -|++||+..+.-|+.+|..-. +.+....+.+. .-++..+... -...++++++.|.... 
....+++.+...+ T Consensus 6 ~dvVfviEgTA~~g~y~~~lkt~Yi~p~ieyF~~g~~~~~~~~~-~~~~~~y~LVvf~t~~~~p~~~~q~~gpt~~~~~f 84 (219) T pfam11265 6 KDVVFVIEGTANLGPYFETLKTDYILPIIEYFNGGPLAETDFGG-EYGGTQYSLVVFNTHASYPECLVQRSGPTRDVDEF 84 (219) T ss_pred EEEEEEEECCCCCCCHHHHHHHHHHHHHHHHHCCCCCCCCCCCC-CCCCCEEEEEEEECCCCCCHHHHHHCCCCCCHHHH T ss_conf 04899995543452128999887689999986189965332356-67884589999604687762366606885899999 Q ss_pred HHHHHCCCCCCCC-----CCCCHHHHHHHHHHHHHHCCCCCCCCCCCCCEEEEEEECCCCCCCCC--CH----HHHHHHH Q ss_conf 9997401568874-----56423788999987421101234677766616999984045888888--97----8999999 Q gi|254780833|r 241 QEKINRLIFGSTT-----KSTPGLEYAYNKIFDAKEKLEHIAKGHDDYKKYIIFLTDGENSSPNI--DN----KESLFYC 309 (371) Q Consensus 241 ~~~i~~l~~~g~T-----~~~~~~~~~~~~l~~~~~~~~~~~~~~~~~~k~ivl~TDG~~~~~~~--~~----~~~~~~c 309 (371) ...++++...||- ++.+|+..+...+.+-.... ...+..+.+|+=||+.--.+-.-.. .+ ....++. T Consensus 85 l~~Ld~i~f~GGG~es~A~iaEGLa~ALq~Fdd~~~~r--~~~~~~~~qkhCILIcnSpPy~lP~~e~~~y~g~t~dqla 162 (219) T pfam11265 85 LQWLSSIPFMGGGFESCALIAEGLAEALQMFDDFSKMR--QQQGQTDVHRHCILICNSPPYPLPTVESWQYEGKTSDQLA 162 (219) T ss_pred HHHHHHCCCCCCCCHHHHHHHHHHHHHHHHHHHHHHHC--CCCCCCCCCEEEEEEECCCCCCCCCCHHHHHCCCCHHHHH T ss_conf 99997350167871466889989999998731366516--4689887530489996899976865120434387689999 Q ss_pred HHHH--HCCCEEEEEEECCCCCHHHHHHH Q ss_conf 9999--78987999994186427989983 Q gi|254780833|r 310 NEAK--RRGAIVYAIGVQAEAADQFLKNC 336 (371) Q Consensus 310 ~~~k--~~gi~i~tIg~~~~~~~~~l~~c 336 (371) ..+. +++|.+..|. 
+..-+.|+.+ T Consensus 163 ~~~~f~e~~i~lSIis---PRklP~L~~l 188 (219) T pfam11265 163 AAINFAERSISLSIIC---PRKLPALRLL 188 (219) T ss_pred HHHCCCCCCEEEEEEC---CCCCHHHHHH T ss_conf 8741200462589976---4304899999 No 77 >COG3864 Uncharacterized protein conserved in bacteria [Function unknown] Probab=92.39 E-value=0.47 Score=24.70 Aligned_cols=95 Identities=22% Similarity=0.337 Sum_probs=46.5 Q ss_pred CEEEEECCCCCCCCCCCCCHHHHHHHHHHHHHHCCCCCCCCCCCEEEEEEEEEECCCCEEECCCCCCHHHHHHHHHCCCC Q ss_conf 06998527631156678721489899999875002335555311047987886367417816665587899999740156 Q gi|254780833|r 170 DMMMVLDVSLSMNDHFGPGMDKLGVATRSIREMLDIIKSIPDVNNVVRSGLVTFSSKIVQTFPLAWGVQHIQEKINRLIF 249 (371) Q Consensus 170 di~~viD~SgSm~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~f~~~~~~~~~lt~~~~~~~~~i~~l~~ 249 (371) -++.++|.||||.. ..++.+...+...+... .++.-++.-...++....+..... .--.+.- T Consensus 263 ~i~vaVDtSGS~~d------~ei~a~~~Ei~~Il~~~--------~~eltli~~D~~v~~~~~~r~g~~----~~~~~~g 324 (396) T COG3864 263 KIVVAVDTSGSMTD------AEIDAAMTEIFDILKNK--------NYELTLIECDNIVRRMYRVRKGRD----MKKKLDG 324 (396) T ss_pred HEEEEEECCCCCCH------HHHHHHHHHHHHHHHCC--------CCEEEEEEECCHHHHHHCCCCCCC----CCCCCCC T ss_conf 14789815787448------99999999999987078--------807999980301210310577445----7865578 Q ss_pred CCCCCCCCHHHHHHHHHHHHHHCCCCCCCCCCCCCEEEEEEECCCCCCC Q ss_conf 8874564237889999874211012346777666169999840458888 Q gi|254780833|r 250 GSTTKSTPGLEYAYNKIFDAKEKLEHIAKGHDDYKKYIIFLTDGENSSP 298 (371) Q Consensus 250 ~g~T~~~~~~~~~~~~l~~~~~~~~~~~~~~~~~~k~ivl~TDG~~~~~ 298 (371) +|+|...+.+..--..+ +.-++|.+|||.-..+ T Consensus 325 gG~Tdf~Pvfeylek~~----------------~~~~lIyfTDG~gd~p 357 (396) T COG3864 325 GGGTDFSPVFEYLEKNR----------------MECFLIYFTDGMGDQP 357 (396) T ss_pred CCCCCCCHHHHHHHHHC----------------CCCEEEEECCCCCCCC T ss_conf 98766417999997616----------------3335999816888764 No 78 >LOAD_ku consensus Probab=90.69 E-value=1.1 Score=22.58 Aligned_cols=53 Identities=17% Similarity=0.239 Sum_probs=34.0 Q ss_pred 
EEEEECCCCCCCCCCCCC-HHHHHHHHHHHHHHCCCCCCCCCCCEEEEEEEEEECCC Q ss_conf 699852763115667872-14898999998750023355553110479878863674 Q gi|254780833|r 171 MMMVLDVSLSMNDHFGPG-MDKLGVATRSIREMLDIIKSIPDVNNVVRSGLVTFSSK 226 (371) Q Consensus 171 i~~viD~SgSm~~~~~~~-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~f~~~ 226 (371) ++|++|.|.||....++. ...+..+.+.+..++...-.. .....+|++-|+.. T Consensus 2 ivflID~s~sM~~~~~~~~~s~l~~al~~v~~~~~~ki~~---~~~D~vGvvl~gT~ 55 (521) T LOAD_ku 2 ILFCIDVSPAMFESSDGEELSPFEQALKCIRTLMQRKVIS---RPKDLIGVVLYGTD 55 (521) T ss_pred EEEEEECCHHHCCCCCCCCCCHHHHHHHHHHHHHHHHEEC---CCCCEEEEEEECCC T ss_conf 8999979988878789988886999999999999864458---99986999998279 No 79 >COG5148 RPN10 26S proteasome regulatory complex, subunit RPN10/PSMD4 [Posttranslational modification, protein turnover, chaperones] Probab=84.44 E-value=2.6 Score=20.25 Aligned_cols=143 Identities=10% Similarity=0.061 Sum_probs=88.5 Q ss_pred CEEEEECCCCCCCCCCCCCHHHHHHHHHHHHHHCCC-CCCCCCCCEEEEEEEEEECCC-CEEECCCCCCHHHHHHHHHCC Q ss_conf 069985276311566787214898999998750023-355553110479878863674-178166655878999997401 Q gi|254780833|r 170 DMMMVLDVSLSMNDHFGPGMDKLGVATRSIREMLDI-IKSIPDVNNVVRSGLVTFSSK-IVQTFPLAWGVQHIQEKINRL 247 (371) Q Consensus 170 di~~viD~SgSm~~~~~~~~~~~~~~~~~~~~~~~~-~~~~~~~~~~~~~~~~~f~~~-~~~~~~lt~~~~~~~~~i~~l 247 (371) -.++++|+|.-|..... ..+|...-++++-..++. ++. +....+|+.+.... 
+.....+|..+..+..++..+ T Consensus 5 atvvliDNse~s~NgDy-~ptRFeAQkd~ve~if~~K~nd----npEntiGli~~~~a~p~vlsT~T~~~gkilt~lhd~ 79 (243) T COG5148 5 ATVVLIDNSEASQNGDY-LPTRFEAQKDAVESIFSKKFND----NPENTIGLIPLVQAQPNVLSTPTKQRGKILTFLHDI 79 (243) T ss_pred EEEEEEECHHHHHCCCC-CCHHHHHHHHHHHHHHHHHHCC----CCCCEEEEEECCCCCCCHHCCCHHHHHHHHHHHCCC T ss_conf 28999847066524897-7078888788999999877238----963315445635688512113065412888772365 Q ss_pred CCCCCCCCCCHHHHHHHHHHHHHHCCCCCCCCCCC-CCEEEEEEECCCCCCCCCCHHHHHHHHHHHHHCCCEEEEEEECC Q ss_conf 56887456423788999987421101234677766-61699998404588888897899999999997898799999418 Q gi|254780833|r 248 IFGSTTKSTPGLEYAYNKIFDAKEKLEHIAKGHDD-YKKYIIFLTDGENSSPNIDNKESLFYCNEAKRRGAIVYAIGVQA 326 (371) Q Consensus 248 ~~~g~T~~~~~~~~~~~~l~~~~~~~~~~~~~~~~-~~k~ivl~TDG~~~~~~~~~~~~~~~c~~~k~~gi~i~tIg~~~ 326 (371) ...|+-.+..++.-+...|... ..+. .++++.|+ |.+-.. +..+...++..+|..|+.|-.|-||. T Consensus 80 ~~~g~a~~~~~lqiaql~lkhR---------~nk~q~qriVaFv--gSpi~e--sedeLirlak~lkknnVAidii~fGE 146 (243) T COG5148 80 RLHGGADIMRCLQIAQLILKHR---------DNKGQRQRIVAFV--GSPIQE--SEDELIRLAKQLKKNNVAIDIIFFGE 146 (243) T ss_pred CCCCCCHHHHHHHHHHHHHHCC---------CCCCCCEEEEEEE--CCCCCC--CHHHHHHHHHHHHHCCEEEEEEEHHH T ss_conf 1247640888999999998601---------4876505899994--684532--67999999999986683589986034 Q ss_pred CCCH Q ss_conf 6427 Q gi|254780833|r 327 EAAD 330 (371) Q Consensus 327 ~~~~ 330 (371) -.+. 
T Consensus 147 ~~n~ 150 (243) T COG5148 147 AANM 150 (243) T ss_pred HHHH T ss_conf 5556 No 80 >KOG2487 consensus Probab=82.83 E-value=2.8 Score=20.02 Aligned_cols=72 Identities=13% Similarity=0.203 Sum_probs=43.9 Q ss_pred CCEEEEEEECCCCCCCCCCHHHHHHHHHHHHHCCCEEEEEEECCCCCHHHHHHHC--CCCEEEEECCHHHHHHHHHHH Q ss_conf 6169999840458888889789999999999789879999941864279899833--898089828989999999999 Q gi|254780833|r 283 YKKYIIFLTDGENSSPNIDNKESLFYCNEAKRRGAIVYAIGVQAEAADQFLKNCA--SPDRFYSVQNSRKLHDAFLRI 358 (371) Q Consensus 283 ~~k~ivl~TDG~~~~~~~~~~~~~~~c~~~k~~gi~i~tIg~~~~~~~~~l~~cA--s~~~~y~~~~~~~L~~af~~I 358 (371) .+--|+++|=+.+-....- ......=.|.+.+|.|-++-.+.+ ..+||+|+ ++|-|-++++++.|-+.+-.. T Consensus 165 lkSRilV~t~t~d~~~qyi--~~MNciFaAqKq~I~Idv~~l~~~--s~~LqQa~D~TGG~YL~v~~~~gLLqyLlt~ 238 (314) T KOG2487 165 LKSRILVFTLTRDRALQYI--PYMNCIFAAQKQNIPIDVVSLGGD--SGFLQQACDITGGDYLHVEKPDGLLQYLLTL 238 (314) T ss_pred HHCEEEEEEECHHHHHHHH--HHHHHHHHHHHCCCEEEEEEECCC--CHHHHHHHHHCCCEEEECCCCCHHHHHHHHH T ss_conf 1232899991517776553--577877778753961589995698--4399998750287047148852599999998 No 81 >TIGR00627 tfb4 transcription factor tfb4; InterPro: IPR004600 Members of this family are part of the TFIIH complex which is involved in the initiation of transcription and nucleotide excision repair. The core-TFIIH basal transcription factor complex has six subunits, this is the p34 subunit.; GO: 0016251 general RNA polymerase II transcription factor activity, 0006281 DNA repair, 0006355 regulation of transcription DNA-dependent, 0000439 core TFIIH complex. 
Probab=82.09 E-value=2.7 Score=20.13 Aligned_cols=76 Identities=13% Similarity=0.267 Sum_probs=50.4 Q ss_pred CCCCEEEEEEECCCCCCCCC--CHHHHHHHHHHHHHCCCEEEEEEECCCCCHHHHHHHC--CCCEEEEECCHHHHHHHHH Q ss_conf 66616999984045888888--9789999999999789879999941864279899833--8980898289899999999 Q gi|254780833|r 281 DDYKKYIIFLTDGENSSPNI--DNKESLFYCNEAKRRGAIVYAIGVQAEAADQFLKNCA--SPDRFYSVQNSRKLHDAFL 356 (371) Q Consensus 281 ~~~~k~ivl~TDG~~~~~~~--~~~~~~~~c~~~k~~gi~i~tIg~~~~~~~~~l~~cA--s~~~~y~~~~~~~L~~af~ 356 (371) ...+--+++++=|.-..+.. ..-......=.|.+++|-|=++-++.+....+|||-| |+|-|-+|+++..|-+.+- T Consensus 153 ~~l~sR~lv~~~GsGst~d~~~qY~~~MN~iFsA~K~~i~idvv~~~~~~~~~~LqQAaD~TGG~YL~v~~~~~LL~yL~ 232 (295) T TIGR00627 153 EKLKSRILVISIGSGSTEDVALQYIPLMNCIFSAQKQNIPIDVVKIGGDFESGFLQQAADITGGVYLKVEKPKGLLQYLM 232 (295) T ss_pred HHCCCEEEEEEECCCCCCCHHHHHHHHHHHHHHHHCCCCCEEEEEECCCCCCHHHHHHHHHHCCEEEEECCCHHHHHHHH T ss_conf 11001178997357876210100200556999851698415899808983020677777663874574278746899999 No 82 >cd01468 trunk_domain trunk domain. COPII-coated vesicles carry proteins from the endoplasmic reticulum to the Golgi complex. This vesicular transport can be reconstituted by using three cytosolic components containing five proteins: the small GTPase Sar1p, the Sec23p/24p complex, and the Sec13p/Sec31p complex. This domain is known as the trunk domain and has an alpha/beta vWA fold and forms the dimer interface. Some members of this family possess a partial MIDAS motif that is a characteristic feature of most vWA domain proteins. 
Probab=77.51 E-value=4.5 Score=18.78 Aligned_cols=160 Identities=21% Similarity=0.188 Sum_probs=83.6 Q ss_pred CCCEEEEECCCCCCCCCCCCCHHHHHHHHHHHHHHCCCCCCCCCCCEEEEEEEEEECCCCEE------------------ Q ss_conf 74069985276311566787214898999998750023355553110479878863674178------------------ Q gi|254780833|r 168 GLDMMMVLDVSLSMNDHFGPGMDKLGVATRSIREMLDIIKSIPDVNNVVRSGLVTFSSKIVQ------------------ 229 (371) Q Consensus 168 ~idi~~viD~SgSm~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~f~~~~~~------------------ 229 (371) |.-.+|++|.|..--.. .-++.+++++...++..... ...++|+++|...... T Consensus 3 pp~~~FvIDvs~~ai~~-----g~l~~~~~si~~~l~~lp~~----~~~~VgiiTfd~~v~~y~l~~~~~~~~~~vv~d~ 73 (239) T cd01468 3 PPVFVFVIDVSYEAIKE-----GLLQALKESLLASLDLLPGD----PRARVGLITYDSTVHFYNLSSDLAQPKMYVVSDL 73 (239) T ss_pred CCEEEEEEECCHHHHHC-----HHHHHHHHHHHHHHHHCCCC----CCCEEEEEEECCEEEEEECCCCCCCCCEEEECCC T ss_conf 98899999886665334-----09999999999999857789----9978999997877899975888778823675388 Q ss_pred -----------ECCCCCCHHHHHHHHHCCCCC----CCC----CCCCHHHHHHHHHHHHHHCCCCCCCCCCCCCEEEEEE Q ss_conf -----------166655878999997401568----874----5642378899998742110123467776661699998 Q gi|254780833|r 230 -----------TFPLAWGVQHIQEKINRLIFG----STT----KSTPGLEYAYNKIFDAKEKLEHIAKGHDDYKKYIIFL 290 (371) Q Consensus 230 -----------~~~lt~~~~~~~~~i~~l~~~----g~T----~~~~~~~~~~~~l~~~~~~~~~~~~~~~~~~k~ivl~ 290 (371) ..|+......+.+.++.+... .+. ....|++.+...|... 
..++ .|+++ T Consensus 74 ~d~f~P~~~~~lv~~~e~~~~i~~lL~~l~~~~~~~~~~~~~~~~GsAl~~A~~~L~~~-------~~GG-----kI~~f 141 (239) T cd01468 74 KDVFLPLPDRFLVPLSECKKVIHDLLEQLPPMFWPVPTHRPERCLGPALQAAFLLLKGT-------FAGG-----RIIVF 141 (239) T ss_pred CCCCCCCCCCEEEEHHHHHHHHHHHHHHHHHHCCCCCCCCCCCCHHHHHHHHHHHHHCC-------CCCC-----EEEEE T ss_conf 55667887545761899899999999987751145478888743799999999997245-------7896-----69999 Q ss_pred ECCCCCCCCCC----------------------HHHHHHHHHHHHHCCCEEEEEEECCC-CCHHHHHHHC--CCCEEEEE Q ss_conf 40458888889----------------------78999999999978987999994186-4279899833--89808982 Q gi|254780833|r 291 TDGENSSPNID----------------------NKESLFYCNEAKRRGAIVYAIGVQAE-AADQFLKNCA--SPDRFYSV 345 (371) Q Consensus 291 TDG~~~~~~~~----------------------~~~~~~~c~~~k~~gi~i~tIg~~~~-~~~~~l~~cA--s~~~~y~~ 345 (371) +-|.+|.|... ..--..++..+...||-|.-..+..+ .+-..|+.++ |+|..|.- T Consensus 142 ~s~~pt~GpG~l~~r~~~~~~~s~~e~~~~~~~~~fY~~la~~~~~~~isvDlF~~s~~~~dlatl~~l~~~TGG~~~~y 221 (239) T cd01468 142 QGGLPTVGPGKLKSREDKEPIRSHDEAQLLKPATKFYKSLAKECVKSGICVDLFAFSLDYVDVATLKQLAKSTGGQVYLY 221 (239) T ss_pred ECCCCCCCCCCCCCCCCCCCCCCCHHHHHHCCHHHHHHHHHHHHHHCCEEEEEEECCCCCCCHHHHHHHHHHCCCEEEEE T ss_conf 57899788986546554557897115665352078999999999986915999972586678388875887269679985 Q ss_pred CCH Q ss_conf 898 Q gi|254780833|r 346 QNS 348 (371) Q Consensus 346 ~~~ 348 (371) ++. 
T Consensus 222 ~~F 224 (239) T cd01468 222 DSF 224 (239) T ss_pred CCC T ss_conf 888 No 83 >KOG4465 consensus Probab=74.68 E-value=5.4 Score=18.32 Aligned_cols=118 Identities=18% Similarity=0.235 Sum_probs=67.1 Q ss_pred CCCCCEEEEECCCCCCCCCCCCCHHHHHHHHHHHHHHCCCCCCCCCCCEEEEEEEEEECCCCEEECCCCCC--HHHHHHH Q ss_conf 46740699852763115667872148989999987500233555531104798788636741781666558--7899999 Q gi|254780833|r 166 DIGLDMMMVLDVSLSMNDHFGPGMDKLGVATRSIREMLDIIKSIPDVNNVVRSGLVTFSSKIVQTFPLAWG--VQHIQEK 243 (371) Q Consensus 166 ~~~idi~~viD~SgSm~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~f~~~~~~~~~lt~~--~~~~~~~ 243 (371) +.+-..++.+|+|+||.+..-+.+-....+..+ -.+++..+.. ..-.+.|++... .+|+|.+ ..++..+ T Consensus 425 ptgkr~~laldvs~sm~~rv~~s~ln~reaaa~-m~linlhnea-------d~~~vaf~d~lt-e~pftkd~kigqv~~~ 495 (598) T KOG4465 425 PTGKRFCLALDVSASMNQRVLGSILNAREAAAA-MCLINLHNEA-------DSRCVAFCDELT-ECPFTKDMKIGQVLDA 495 (598) T ss_pred CCCCEEEEEEECCHHHHHHHHCCCCCHHHHHHH-HHEEEECCCC-------CEEEEEECCCCC-CCCCCCCCCHHHHHHH T ss_conf 877458999851044445564365556788755-5124512444-------403788605454-6887553449999999 Q ss_pred HHCCCCCCCCCCCCHHHHHHHHHHHHHHCCCCCCCCCCCCCEEEEEEECCCCCCCCCCHHHHHHH Q ss_conf 74015688745642378899998742110123467776661699998404588888897899999 Q gi|254780833|r 244 INRLIFGSTTKSTPGLEYAYNKIFDAKEKLEHIAKGHDDYKKYIIFLTDGENSSPNIDNKESLFY 308 (371) Q Consensus 244 i~~l~~~g~T~~~~~~~~~~~~l~~~~~~~~~~~~~~~~~~k~ivl~TDG~~~~~~~~~~~~~~~ 308 (371) .+.+. .|+|...-.|.|+-.. .-+-.+.|++||-+--.+...+..++.. 
T Consensus 496 ~nni~-~g~tdcglpm~wa~en---------------nlk~dvfii~tdndt~ageihp~~aik~ 544 (598) T KOG4465 496 MNNID-AGGTDCGLPMIWAQEN---------------NLKADVFIIFTDNDTFAGEIHPAEAIKE 544 (598) T ss_pred HHCCC-CCCCCCCCCEEEHHHC---------------CCCCCEEEEEECCCCCCCCCCHHHHHHH T ss_conf 85588-8887668720432105---------------8875479998358632466677899999 No 84 >KOG2653 consensus Probab=72.63 E-value=6 Score=18.02 Aligned_cols=10 Identities=0% Similarity=-0.340 Sum_probs=3.6 Q ss_pred EEEEECCCCC Q ss_conf 6998527631 Q gi|254780833|r 171 MMMVLDVSLS 180 (371) Q Consensus 171 i~~viD~SgS 180 (371) ++-++..-|+ T Consensus 173 Cc~wvG~~Ga 182 (487) T KOG2653 173 CCDWVGEGGA 182 (487) T ss_pred CEEEECCCCC T ss_conf 8354468887 No 85 >COG4726 PilX Tfp pilus assembly protein PilX [Cell motility and secretion / Intracellular trafficking and secretion] Probab=69.24 E-value=7.2 Score=17.56 Aligned_cols=57 Identities=16% Similarity=0.100 Sum_probs=25.9 Q ss_pred HHHCCCCCCHHHHHHHHHHHHHHHHHHHH-----H---HHHHHHHHHHHHHHHHHHHHHHHHHHCC Q ss_conf 75203687189999999999999999999-----9---9999999999999999999999865203 Q gi|254780833|r 8 NFFYNCKGSISILTAILLPVIFIVMGLVI-----E---TSHKFFVKAKLHYILDHSLLYTATKILN 65 (371) Q Consensus 8 ~f~~d~~G~vai~~al~l~~li~~~g~aV-----D---~~r~~~~ks~Lq~a~DaA~LAaa~~~~~ 65 (371) |-.|.+||-.+|+. |++.+++.+.|++. | ++.-+..|+.+++++++|.-.+-..+.+ T Consensus 7 r~~r~qRG~~Livv-L~~LvvltLl~l~~~r~~llqeRiSaN~~D~~lAfqaAEaaLr~~E~~i~n 71 (196) T COG4726 7 RGSRRQRGFALIVV-LMVLVVLTLLGLAAARSVLLQERISANERDRSLAFQAAEAALREGELQINN 71 (196) T ss_pred CCCCCCCCEEEHHH-HHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHC T ss_conf 87645676473899-999999999999999999989887520677899999999999877899860 No 86 >pfam04811 Sec23_trunk Sec23/Sec24 trunk domain. COPII-coated vesicles carry proteins from the endoplasmic reticulum to the Golgi complex. This vesicular transport can be reconstituted by using three cytosolic components containing five proteins: the small GTPase Sar1p, the Sec23p/24p complex, and the Sec13p/Sec31p complex. 
This domain is known as the trunk domain and has an alpha/beta vWA fold and forms the dimer interface. Probab=58.33 E-value=12 Score=16.33 Aligned_cols=159 Identities=18% Similarity=0.160 Sum_probs=84.6 Q ss_pred CCCEEEEECCCCC-CCCCCCCCHHHHHHHHHHHHHHCCCCCCCCCCCEEEEEEEEEECCCCEEE---------------- Q ss_conf 7406998527631-15667872148989999987500233555531104798788636741781---------------- Q gi|254780833|r 168 GLDMMMVLDVSLS-MNDHFGPGMDKLGVATRSIREMLDIIKSIPDVNNVVRSGLVTFSSKIVQT---------------- 230 (371) Q Consensus 168 ~idi~~viD~SgS-m~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~f~~~~~~~---------------- 230 (371) |.-.+|++|.|-. ... .-++.+++++...++..... ...++|+++|....... T Consensus 3 PP~f~FvIDvS~~ai~~------g~l~~~~~si~~~l~~lp~~----~~~~VgiITf~~~v~~y~l~~~~~~~~~~vv~d 72 (241) T pfam04811 3 PPVFLFVIDVSYNAIKS------GLLAALKESLLQSLDLLPGD----PRALVGFITFDSTVHFFNLSSSLRQPKMLVVSD 72 (241) T ss_pred CCEEEEEEECCHHHHHC------CHHHHHHHHHHHHHHHCCCC----CCCEEEEEEECCEEEEEECCCCCCCCEEEEHHH T ss_conf 98899999886565334------39999999999999847799----986899999688689998778888872650356 Q ss_pred -----CC--------CCCCHHHHHHHHHCCCCC------CCCCCCCHHHHHHHHHHHHHHCCCCCCCCCCCCCEEEEEEE Q ss_conf -----66--------655878999997401568------87456423788999987421101234677766616999984 Q gi|254780833|r 231 -----FP--------LAWGVQHIQEKINRLIFG------STTKSTPGLEYAYNKIFDAKEKLEHIAKGHDDYKKYIIFLT 291 (371) Q Consensus 231 -----~~--------lt~~~~~~~~~i~~l~~~------g~T~~~~~~~~~~~~l~~~~~~~~~~~~~~~~~~k~ivl~T 291 (371) .| +......+.+.++.+... .......|++.+...|... 
..++ .|++++ T Consensus 73 l~d~f~P~~~~~lv~~~e~~~~i~~lL~~L~~~~~~~~~~~~~~G~Al~~A~~lL~~~-------~~GG-----kI~~F~ 140 (241) T pfam04811 73 LQDMFLPLPDRFLVPLSECRFVLEDLLEELPRMFPVTKRPERCLGPALQAAVLLLKAA-------FTGG-----KIMLFQ 140 (241) T ss_pred HHHHCCCCCCCEEEEHHHHHHHHHHHHHHHHHHCCCCCCCCCCHHHHHHHHHHHHHHC-------CCCC-----EEEEEE T ss_conf 5430367755546446999999999999757643467887422489999999998516-------8997-----699994 Q ss_pred CCCCCCCCC--------------C----------HHHHHHHHHHHHHCCCEEEEEEECCC-CCHHHHHHHC--CCCEEEE Q ss_conf 045888888--------------9----------78999999999978987999994186-4279899833--8980898 Q gi|254780833|r 292 DGENSSPNI--------------D----------NKESLFYCNEAKRRGAIVYAIGVQAE-AADQFLKNCA--SPDRFYS 344 (371) Q Consensus 292 DG~~~~~~~--------------~----------~~~~~~~c~~~k~~gi~i~tIg~~~~-~~~~~l~~cA--s~~~~y~ 344 (371) -|.+|.|.. + ..--..++..+-..||.+--..+..+ .+-..|+.++ |+|..|. T Consensus 141 s~~pt~Gpg~~~~~~~~~~~~~~~~e~~~~~~~~~~fY~~la~~~~~~~isvDlF~~s~~~~dlatl~~l~~~TGG~i~~ 220 (241) T pfam04811 141 GGLPTVGPGGKLKSRLDESHHDTDKEKAKLVKKADKFYKSLAKECVAQGHSVDLFAFSLDYVDVAELGCLSRLTGGQVYL 220 (241) T ss_pred CCCCCCCCCCCCCCCCCCCCCCCCCCHHHCCCCHHHHHHHHHHHHHHCCCEEEEEECCCCCCCHHHHHHHHHCCCCEEEE T ss_conf 78998797544555444344676300454166217999999999998694599995278667828877687606936998 Q ss_pred ECCH Q ss_conf 2898 Q gi|254780833|r 345 VQNS 348 (371) Q Consensus 345 ~~~~ 348 (371) -++. T Consensus 221 y~~F 224 (241) T pfam04811 221 YPSF 224 (241) T ss_pred ECCC T ss_conf 6788 No 87 >pfam02060 ISK_Channel Slow voltage-gated potassium channel. 
Probab=57.64 E-value=12 Score=16.25 Aligned_cols=40 Identities=20% Similarity=0.054 Sum_probs=31.7 Q ss_pred HHHHHCCCCCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHH Q ss_conf 8875203687189999999999999999999999999999 Q gi|254780833|r 6 IRNFFYNCKGSISILTAILLPVIFIVMGLVIETSHKFFVK 45 (371) Q Consensus 6 l~~f~~d~~G~vai~~al~l~~li~~~g~aVD~~r~~~~k 45 (371) -||..+..+|...++..|+++.++++..++|-++.....| T Consensus 31 arr~p~~~dg~le~lYiLmvlGfFgFft~GImlsyiRSkk 70 (129) T pfam02060 31 ARRSPLGDDGKLEALYILMVLGFFGFFTLGIMLSYIRSKK 70 (129) T ss_pred CCCCCCCCCCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHH T ss_conf 5678899986026889999999999999999999999886 No 88 >COG5242 TFB4 RNA polymerase II transcription initiation/nucleotide excision repair factor TFIIH, subunit TFB4 [Transcription / DNA replication, recombination, and repair] Probab=57.16 E-value=12 Score=16.21 Aligned_cols=45 Identities=18% Similarity=0.414 Sum_probs=32.0 Q ss_pred HHHHCCCEEEEEEECCCCCHHHHHHHC--CCCEEEEECCHHHHHHHHHH Q ss_conf 999789879999941864279899833--89808982898999999999 Q gi|254780833|r 311 EAKRRGAIVYAIGVQAEAADQFLKNCA--SPDRFYSVQNSRKLHDAFLR 357 (371) Q Consensus 311 ~~k~~gi~i~tIg~~~~~~~~~l~~cA--s~~~~y~~~~~~~L~~af~~ 357 (371) .|...||.|-++.++.+ ..+|++|+ ++|-|-.++++..|-..+-. T Consensus 178 ~Aqk~~ipI~v~~i~g~--s~fl~Q~~daTgG~Yl~ve~~eGllqyL~~ 224 (296) T COG5242 178 AAQKFGIPISVFSIFGN--SKFLLQCCDATGGDYLTVEDTEGLLQYLLS 224 (296) T ss_pred EHHHCCCCEEEEEECCC--CHHHHHHHHCCCCEEEEECCCHHHHHHHHH T ss_conf 26434981489982486--178998763448726862482069999999 No 89 >TIGR02477 PFKA_PPi diphosphate--fructose-6-phosphate 1-phosphotransferase; InterPro: IPR011183 Diphosphate--fructose-6-phosphate 1-phosphotransferase catalyses the addition of phosphate from diphosphate (PPi) to fructose 6-phosphate to give fructose 1,6-bisphosphate (2.7.1.90 from EC). The enzyme is also known as pyrophosphate-dependent phosphofructokinase. The usage of PPi-dependent enzymes in glycolysis presumably frees up ATP for other processes . 
IPR012828 from INTERPRO represents the ATP-dependent 6-phosphofructokinase enzyme contained within Phosphofructokinase. This entry contains primarily bacterial, plant alpha, and plant beta sequences. These may be dimeric nonallosteric enzymes (bacteria) or allosteric heterotetramers (plants). GO: 0005524 ATP binding, 0047334 diphosphate-fructose-6-phosphate 1-phosphotransferase activity, 0006096 glycolysis, 0005945 6-phosphofructokinase complex. Probab=50.29 E-value=16 Score=15.53 Aligned_cols=39 Identities=13% Similarity=0.223 Sum_probs=29.7 Q ss_pred EEEEEEECCCCCCCCCCHHHHHHHHHHHHHCCCEEEEEEECCCCC Q ss_conf 699998404588888897899999999997898799999418642 Q gi|254780833|r 285 KYIIFLTDGENSSPNIDNKESLFYCNEAKRRGAIVYAIGVQAEAA 329 (371) Q Consensus 285 k~ivl~TDG~~~~~~~~~~~~~~~c~~~k~~gi~i~tIg~~~~~~ 329 (371) .-+|++ +|...+.++.-+++...+++..+=+||+---.| T Consensus 170 dgLVII------GGDdSNTnAA~LAEyF~~~~~~t~viGVPKTID 208 (566) T TIGR02477 170 DGLVII------GGDDSNTNAALLAEYFAKKGLKTQVIGVPKTID 208 (566) T ss_pred CEEEEE------CCCCCHHHHHHHHHHHHHCCCCCEEEEEECCCC T ss_conf 648997------479867999999999997389922786402547 No 90 >COG1681 FlaB Archaeal flagellins [Cell motility and secretion] Probab=50.04 E-value=15 Score=15.60 Aligned_cols=22 Identities=23% Similarity=0.380 Sum_probs=16.6 Q ss_pred CCCCCHHHHHHHHHHHHHHHHH Q ss_conf 3687189999999999999999 Q gi|254780833|r 12 NCKGSISILTAILLPVIFIVMG 33 (371) Q Consensus 12 d~~G~vai~~al~l~~li~~~g 33 (371) +|||.+-|=.++.|+.|+++++ T Consensus 1 ~rrG~~GIgtlIVfIAmVlVAA 22 (209) T COG1681 1 DRRGATGIGTLIVFIAMVLVAA 22 (209) T ss_pred CCCCCCCHHHHHHHHHHHHHHH T ss_conf 9841104328999999999999 No 91 >PRK06939 2-amino-3-ketobutyrate coenzyme A ligase; Provisional Probab=48.62 E-value=11 Score=16.35 Aligned_cols=60 Identities=18% Similarity=0.269 Sum_probs=42.0 Q ss_pred CHHHHHHHHHHHHHCCCEEEEEEEC-CCCCHHHHHHHCCCCEEEEECCHHHHHHHHHHHHHHH Q ss_conf 9789999999999789879999941-8642798998338980898289899999999999964 Q
gi|254780833|r 301 DNKESLFYCNEAKRRGAIVYAIGVQ-AEAADQFLKNCASPDRFYSVQNSRKLHDAFLRIGKEM 362 (371) Q Consensus 301 ~~~~~~~~c~~~k~~gi~i~tIg~~-~~~~~~~l~~cAs~~~~y~~~~~~~L~~af~~I~~~i 362 (371) ++..+..+++.+.++||-+..|.+- .+.++..||-|-+..|= -++-+.+.++++++++++ T Consensus 334 ~~~~a~~~~~~L~~~Gi~v~~ir~PtVp~g~~rlRi~lta~ht--~~did~lv~~l~~v~~~l 394 (395) T PRK06939 334 DAKLAQEFADRLLEEGVYVIGFSFPVVPKGQARIRTQMSAAHT--KEQLDRAIDAFEKVGKEL 394 (395) T ss_pred CHHHHHHHHHHHHHCCCEEEEECCCCCCCCCCEEEEEECCCCC--HHHHHHHHHHHHHHHHHC T ss_conf 9999999999999779748207899889898569988787799--999999999999999963 No 92 >pfam04964 Flp_Fap Flp/Fap pilin component. Probab=44.89 E-value=19 Score=15.02 Aligned_cols=21 Identities=19% Similarity=0.357 Sum_probs=16.2 Q ss_pred HHCCCCCCHHHHHHHHHHHHH Q ss_conf 520368718999999999999 Q gi|254780833|r 9 FFYNCKGSISILTAILLPVIF 29 (371) Q Consensus 9 f~~d~~G~vai~~al~l~~li 29 (371) |+|||+|.-+|=-+|+...+- T Consensus 1 F~kde~GaTAIEYgLIaalIa 21 (47) T pfam04964 1 FLKDESGATAIEYGLIAALIA 21 (47) T ss_pred CCCCCCCCHHHHHHHHHHHHH T ss_conf 965656415999999999999 No 93 >cd01479 Sec24-like Sec24-like: Protein and membrane traffic in eukaryotes is mediated by at least in part by the budding and fusion of intracellular transport vesicles that selectively carry cargo proteins and lipids from donor to acceptor organelles. The two main classes of vesicular carriers within the endocytic and the biosynthetic pathways are COP- and clathrin-coated vesicles. Formation of COPII vesicles requires the ordered assembly of the coat built from several cytosolic components GTPase Sar1, complexes of Sec23-Sec24 and Sec13-Sec31. The process is initiated by the conversion of GDP to GTP by the GTPase Sar1 which then recruits the heterodimeric complex of Sec23 and Sec24. This heterodimeric complex generates the pre-budding complex. The final step leading to membrane deformation and budding of COPII-coated vesicles is carried by the heterodimeric complex Sec13-Sec31. 
The members of this CD belong to the Sec23-like family. Sec 24 is very similar to Sec23. The Sec23 and Sec24 Probab=44.82 E-value=19 Score=15.01 Aligned_cols=174 Identities=15% Similarity=0.155 Sum_probs=85.1 Q ss_pred CCCEEEEECCCCCCCCCCCCCHHHHHHHHHHHHHHCCCCCCCCCCCEEEEEEEEEECCCCEE------------------ Q ss_conf 74069985276311566787214898999998750023355553110479878863674178------------------ Q gi|254780833|r 168 GLDMMMVLDVSLSMNDHFGPGMDKLGVATRSIREMLDIIKSIPDVNNVVRSGLVTFSSKIVQ------------------ 229 (371) Q Consensus 168 ~idi~~viD~SgSm~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~f~~~~~~------------------ 229 (371) |.-.+|++|.|..--.. .-+..+.+++...++..... ....++|+++|.....- T Consensus 3 Pp~y~FvIDvS~~av~s-----G~l~~~~~sI~~~L~~lp~~---~~rt~VgiiTfds~vhfy~l~~~l~~pqm~vv~Dl 74 (244) T cd01479 3 PAVYVFLIDVSYNAIKS-----GLLATACEALLSNLDNLPGD---DPRTRVGFITFDSTLHFFNLKSSLEQPQMMVVSDL 74 (244) T ss_pred CCEEEEEEECCHHHHHH-----CHHHHHHHHHHHHHHHCCCC---CCCEEEEEEEECCEEEEEECCCCCCCCEEEEEECC T ss_conf 98899999898999876-----87999999999999857788---88628999996887999966888888716873066 Q ss_pred ---ECCC--------CCCHHHHHHHHHCCCC------CCCCCCCCHHHHHHHHHHHHHHCCCCCCCCCCCCCEEEEEEEC Q ss_conf ---1666--------5587899999740156------8874564237889999874211012346777666169999840 Q gi|254780833|r 230 ---TFPL--------AWGVQHIQEKINRLIF------GSTTKSTPGLEYAYNKIFDAKEKLEHIAKGHDDYKKYIIFLTD 292 (371) Q Consensus 230 ---~~~l--------t~~~~~~~~~i~~l~~------~g~T~~~~~~~~~~~~l~~~~~~~~~~~~~~~~~~k~ivl~TD 292 (371) ..|+ ...+..+...++.+.. ........|++.+...|... 
++ |+ +++.- T Consensus 75 ~d~f~P~~~~llv~l~e~~~~i~~lL~~lp~~f~~~~~~~~~~G~Al~aA~~~l~~~---------GG----kI-~~f~s 140 (244) T cd01479 75 DDPFLPLPDGLLVNLKESRQVIEDLLDQIPEMFQDTKETESALGPALQAAFLLLKET---------GG----KI-IVFQS 140 (244) T ss_pred CCCCCCCCCCEEECHHHHHHHHHHHHHHHHHHHHCCCCCCCCHHHHHHHHHHHHHHC---------CC----EE-EEEEC T ss_conf 555678874436525998999999999878875068988643889999999999734---------98----89-99964 Q ss_pred CCCCCCCC----------------------CHHHHHHHHHHHHHCCCEEEEEEECCCC-CHHHHHHHC--CCCE--EEE- Q ss_conf 45888888----------------------9789999999999789879999941864-279899833--8980--898- Q gi|254780833|r 293 GENSSPNI----------------------DNKESLFYCNEAKRRGAIVYAIGVQAEA-ADQFLKNCA--SPDR--FYS- 344 (371) Q Consensus 293 G~~~~~~~----------------------~~~~~~~~c~~~k~~gi~i~tIg~~~~~-~~~~l~~cA--s~~~--~y~- 344 (371) +-++.|.. .+.--..++...-..+|.|...-+.... +-..|..++ |+|. ||. T Consensus 141 s~Pt~G~G~l~~r~~~~~~~~~~e~~ll~p~~~fY~~la~~~~~~~isvDlF~~~~~~~Dlatl~~l~~~TGG~~~~Yp~ 220 (244) T cd01479 141 SLPTLGAGKLKSREDPKLLSTDKEKQLLQPQTDFYKKLALECVKSQISVDLFLFSNQYVDVATLGCLSRLTGGQVYYYPS 220 (244) T ss_pred CCCCCCCCCCCCCCCCCCCCCHHHHHCCCCCCHHHHHHHHHHHHCCCEEEEEECCCCCCCHHHHHHHHHCCCEEEEEECC T ss_conf 79988887550054434568602222038660899999999998592189996268766544430044316724898078 Q ss_pred --ECCHHHHHHHHHHHHHHHH Q ss_conf --2898999999999999640 Q gi|254780833|r 345 --VQNSRKLHDAFLRIGKEMV 363 (371) Q Consensus 345 --~~~~~~L~~af~~I~~~i~ 363 (371) .....+-.+..+++.+-++ T Consensus 221 f~~~~~~d~~kl~~dl~~~lt 241 (244) T cd01479 221 FNFSAPNDVEKLVNELARYLT 241 (244) T ss_pred CCCCCHHHHHHHHHHHHHHHC T ss_conf 777677688999999999834 No 94 >cd06353 PBP1_BmpA_Med_like Periplasmic binding domain of the basic membrane lipoprotein Med in Bacillus and its close homologs from other bacteria and Archaea. Periplasmic binding domain of the basic membrane lipoprotein Med in Bacillus and its close homologs from other bacteria and Archaea. 
Med, a cell-surface localized protein, which regulates the competence transcription factor gene comK in Bacillus subtilis, lacks the DNA binding domain when compared with structures of transcription regulators from the LacI family. Nevertheless, Med has significant overall sequence homology to various periplasmic substrate-binding proteins. Moreover, the structure of Med shows a striking similarity to PnrA, a periplasmic nucleoside binding protein of an ATP-binding cassette transport system. Members of this group contain the type I periplasmic sugar-binding protein-like fold. Probab=43.86 E-value=20 Score=14.92 Aligned_cols=19 Identities=37% Similarity=0.562 Sum_probs=11.4 Q ss_pred HHHHHHHHCCCEEEEEEECCC Q ss_conf 999999978987999994186 Q gi|254780833|r 307 FYCNEAKRRGAIVYAIGVQAE 327 (371) Q Consensus 307 ~~c~~~k~~gi~i~tIg~~~~ 327 (371) ...+.++++| +|+||+..| T Consensus 191 gv~~aa~e~g--~~~IG~d~d 209 (258) T cd06353 191 GVIQAAEEKG--VYAIGYVSD 209 (258) T ss_pred HHHHHHHHCC--CEEEECCCC T ss_conf 8999999729--879954676 No 95 >COG1991 Uncharacterized conserved protein [Function unknown] Probab=43.67 E-value=9.5 Score=16.84 Aligned_cols=34 Identities=24% Similarity=0.311 Sum_probs=27.2 Q ss_pred HHHHHCCCCCCHHHHHHHHHHHHHHHHHHHHHHH Q ss_conf 8875203687189999999999999999999999 Q gi|254780833|r 6 IRNFFYNCKGSISILTAILLPVIFIVMGLVIETS 39 (371) Q Consensus 6 l~~f~~d~~G~vai~~al~l~~li~~~g~aVD~~ 39 (371) +..|..+.||++..=|.|+++.++++++.++-|- T Consensus 5 i~~~~~~nkgQiSLEf~Ll~l~ivla~~i~~~y~ 38 (131) T COG1991 5 ITKIILSNKGQISLEFSLLLLAIVLAASIAGAYV 38 (131) T ss_pred EEEEEECCCCCEEEEHHHHHHHHHHHHHHEEEEE T ss_conf 6254224666256413899999999731211489 No 96 >pfam03850 Tfb4 Transcription factor Tfb4. 
Probab=43.28 E-value=20 Score=14.87 Aligned_cols=75 Identities=15% Similarity=0.144 Sum_probs=47.8 Q ss_pred CCCEEEEEEECCCCCCCCCCHHHHHHHHHHHHHCCCEEEEEEECCCCCHHHHHHHC--CCCEEEEECCHHHHHHHHHHHH Q ss_conf 66169999840458888889789999999999789879999941864279899833--8980898289899999999999 Q gi|254780833|r 282 DYKKYIIFLTDGENSSPNIDNKESLFYCNEAKRRGAIVYAIGVQAEAADQFLKNCA--SPDRFYSVQNSRKLHDAFLRIG 359 (371) Q Consensus 282 ~~~k~ivl~TDG~~~~~~~~~~~~~~~c~~~k~~gi~i~tIg~~~~~~~~~l~~cA--s~~~~y~~~~~~~L~~af~~I~ 359 (371) ..+.-|++++-..+.. ...-......-.|++.+|.|-+..++.. +..+||+.+ ++|.|+.+++.+.|-+..-..+ T Consensus 140 ~~~~RILiis~S~d~~--~QYi~~MN~iFaAqk~~I~IDvc~L~~~-~s~fLQQA~diT~G~Yl~~~~~~gLlQyL~~~f 216 (271) T pfam03850 140 SLKSRILVLSGSPDSA--SQYIPIMNSIFAAQKLKIPIDVCKLGGE-DSSFLQQAADITGGVYLHVTEPDGLLQYLMTAF 216 (271) T ss_pred CCEEEEEEEECCCCCH--HHHHHHHHHHHHHHHCCCEEEEEEECCC-CCHHHHHHHHHHCCEEECCCCCCHHHHHHHHHH T ss_conf 5302599998788844--7789999999999855974799993699-858999999974977751478333899999996 No 97 >pfam05814 DUF843 Baculovirus protein of unknown function (DUF843). This family consists of several Baculovirus proteins of around 85 residues long with no known function. Probab=42.53 E-value=21 Score=14.80 Aligned_cols=43 Identities=30% Similarity=0.244 Sum_probs=23.7 Q ss_pred HCCCCCCHHHHHHHHHHHHHHHHHHHHHHH--------------HHHHHHHHHHHHHH Q ss_conf 203687189999999999999999999999--------------99999999999999 Q gi|254780833|r 10 FYNCKGSISILTAILLPVIFIVMGLVIETS--------------HKFFVKAKLHYILD 53 (371) Q Consensus 10 ~~d~~G~vai~~al~l~~li~~~g~aVD~~--------------r~~~~ks~Lq~a~D 53 (371) -|.++++-.+++.|++.++++++ +.+-|. 
.-...|.+|.+|.| T Consensus 18 dk~e~~s~li~~~lllfvlF~~~-l~vyyinteS~~~dL~t~kaKsiKKK~~le~AfD 74 (83) T pfam05814 18 DKNEGSSELILTLLVLFVLFFCL-LNVYYINTESTPADLYTEKAKKIKKKQDLEDAFD 74 (83) T ss_pred CCCCCHHHHHHHHHHHHHHHHHH-HHHHHCCCCCCHHHCCCHHHHHHHHHHHHHHHHH T ss_conf 04566278999999999999998-7764038977654401412788898988999999 No 98 >PRK08265 short chain dehydrogenase; Provisional Probab=41.71 E-value=21 Score=14.72 Aligned_cols=21 Identities=10% Similarity=-0.051 Sum_probs=15.2 Q ss_pred HHHHHHHHHCCCEEEEEEECC Q ss_conf 999999997898799999418 Q gi|254780833|r 306 LFYCNEAKRRGAIVYAIGVQA 326 (371) Q Consensus 306 ~~~c~~~k~~gi~i~tIg~~~ 326 (371) ..++...-.+||+|.+|..|. T Consensus 162 k~lA~e~a~~gIrVN~IaPG~ 182 (261) T PRK08265 162 RSMAMDLAPDGIRVNSVSPGW 182 (261) T ss_pred HHHHHHHHHHCEEEEEEEECC T ss_conf 999999741092998885587 No 99 >COG2984 ABC-type uncharacterized transport system, periplasmic component [General function prediction only] Probab=37.18 E-value=25 Score=14.29 Aligned_cols=80 Identities=14% Similarity=0.061 Sum_probs=31.1 Q ss_pred CCCEEEEEEECCCCCCCCCCHHHHHHHHHHHHHCCCEEEEEEECCCCC-HHHHHHHCCCCEEEEECCHHHHHHHHHHHHH Q ss_conf 661699998404588888897899999999997898799999418642-7989983389808982898999999999999 Q gi|254780833|r 282 DYKKYIIFLTDGENSSPNIDNKESLFYCNEAKRRGAIVYAIGVQAEAA-DQFLKNCASPDRFYSVQNSRKLHDAFLRIGK 360 (371) Q Consensus 282 ~~~k~ivl~TDG~~~~~~~~~~~~~~~c~~~k~~gi~i~tIg~~~~~~-~~~l~~cAs~~~~y~~~~~~~L~~af~~I~~ 360 (371) +.+++-++..-|+.|.- .....+-..++..|++++..+.....+ +..++.+.+............+..+++.+-+ T Consensus 158 nak~Igv~Y~p~E~ns~----~l~eelk~~A~~~Gl~vve~~v~~~ndi~~a~~~l~g~~d~i~~p~dn~i~s~~~~l~~ 233 (322) T COG2984 158 NAKSIGVLYNPGEANSV----SLVEELKKEARKAGLEVVEAAVTSVNDIPRAVQALLGKVDVIYIPTDNLIVSAIESLLQ 233 (322) T ss_pred CCEEEEEEECCCCCCCH----HHHHHHHHHHHHCCCEEEEEECCCCCCCHHHHHHHCCCCCEEEEECCHHHHHHHHHHHH T ss_conf 87069999579886608----99999999998779889998347632008999973478767998660677888999999 Q ss_pred HHHHE Q ss_conf 64000 Q gi|254780833|r 361 EMVKQ 365 (371) Q Consensus 361 ~i~~~ 365 
(371) ...+. T Consensus 234 ~a~~~ 238 (322) T COG2984 234 VANKA 238 (322) T ss_pred HHHHH T ss_conf 98870 No 100 >PRK08643 acetoin reductase; Validated Probab=35.92 E-value=26 Score=14.17 Aligned_cols=21 Identities=14% Similarity=0.033 Sum_probs=15.3 Q ss_pred HHHHHHHHHCCCEEEEEEECC Q ss_conf 999999997898799999418 Q gi|254780833|r 306 LFYCNEAKRRGAIVYAIGVQA 326 (371) Q Consensus 306 ~~~c~~~k~~gi~i~tIg~~~ 326 (371) ..++..+-..||+|.+|..|. T Consensus 164 kslA~ela~~gIrVN~V~PG~ 184 (256) T PRK08643 164 QTAARDLASEGITVNAYAPGI 184 (256) T ss_pred HHHHHHHHHHCCEEEEEEECC T ss_conf 999999877591899996066 No 101 >pfam04917 Shufflon_N Bacterial shufflon protein, N-terminal constant region. This family represents the high-similarity N-terminal 'constant region' shared by shufflon proteins. Probab=34.12 E-value=28 Score=13.99 Aligned_cols=46 Identities=15% Similarity=0.235 Sum_probs=32.7 Q ss_pred HCCCCCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHH Q ss_conf 2036871899999999999999999999999999999999999999 Q gi|254780833|r 10 FYNCKGSISILTAILLPVIFIVMGLVIETSHKFFVKAKLHYILDHS 55 (371) Q Consensus 10 ~~d~~G~vai~~al~l~~li~~~g~aVD~~r~~~~ks~Lq~a~DaA 55 (371) +|..||-.++=..+.|.++++++.+...+..-++...+.|.+...+ T Consensus 2 r~~~kGf~LlE~~~~L~I~~~~~~~~~~~~~~~~~~~~~q~aA~q~ 47 (356) T pfam04917 2 KKTDKGVSLLEVGAVLLIVVMVIPKVAENIEDYLNNVRWQNAAEHA 47 (356) T ss_pred CEECCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHH T ss_conf 1203443089999999999999999999999899999999999999 No 102 >PRK12826 3-ketoacyl-(acyl-carrier-protein) reductase; Reviewed Probab=33.27 E-value=29 Score=13.91 Aligned_cols=59 Identities=17% Similarity=0.198 Sum_probs=30.2 Q ss_pred HHHHHHHCCCEEEEEEECCCC----------CHHHHHHHCC--C-CEEEEECCHHHHHHHHHHHHH----HHHHEEEEE Q ss_conf 999999789879999941864----------2798998338--9-808982898999999999999----640007874 Q gi|254780833|r 308 YCNEAKRRGAIVYAIGVQAEA----------ADQFLKNCAS--P-DRFYSVQNSRKLHDAFLRIGK----EMVKQRILY 369 (371) Q Consensus 308 
~c~~~k~~gi~i~tIg~~~~~----------~~~~l~~cAs--~-~~~y~~~~~~~L~~af~~I~~----~i~~~~i~~ 369 (371) +.......||+|.+|..|.-. +++..+.... | ++ .-+++|+.++..-+.. -|+-+.|.+ T Consensus 170 lA~e~~~~gIrvN~I~PG~i~T~~~~~~~~~~~~~~~~~~~~~pl~R---~~~p~eiA~~v~fL~S~~s~~itG~~i~v 245 (253) T PRK12826 170 LALELARRNITVNSVHPGMVDTPMAGNVFLGDASVAEAAAAAIPLGR---LGEPEDIAAAVLFLASDAARYITGQTLPV 245 (253) T ss_pred HHHHHHHHCEEEEEEEECCCCCHHHHCCCCCCHHHHHHHHHCCCCCC---CCCHHHHHHHHHHHHCCHHCCCCCCEEEE T ss_conf 99985320959999962879672121446687899999983799999---85999999999999686322956873887 No 103 >pfam02608 Bmp Basic membrane protein. This is a family of basic membrane lipoproteins from Borrelia and various putative lipoproteins from other bacteria. All of these proteins are outer membrane proteins and are thus antigenic in nature when possessed by the pathogenic members of the family. One protein, Bacillus subtilis med, is a transcriptional activator. Probab=32.73 E-value=30 Score=13.85 Aligned_cols=21 Identities=33% Similarity=0.379 Sum_probs=11.7 Q ss_pred HHHHHHHHCCCEEEEEEECCC Q ss_conf 999999978987999994186 Q gi|254780833|r 307 FYCNEAKRRGAIVYAIGVQAE 327 (371) Q Consensus 307 ~~c~~~k~~gi~i~tIg~~~~ 327 (371) ...+.+|+.|+..|+||+..| T Consensus 198 Gv~~aa~e~g~~~~~IGvd~d 218 (302) T pfam02608 198 GVIQAAKELGLYGYVIGVDQD 218 (302) T ss_pred HHHHHHHHCCCCCEEEEEECC T ss_conf 999999971998269999676 No 104 >TIGR00565 trpE_proteo anthranilate synthase component I; InterPro: IPR005257 This family represents anthranilate/para-aminobenzoate synthase component I from proteobacteria and actinobacteria. This enzyme resembles some other chorismate-binding enzymes, including para-aminobenzoate synthase (pabB) and isochorismate synthase. There is a fairly deep split between two sets, seen in the pattern of gaps as well as in amino acid sequence differences. This group includes proteobacteria such as Escherichia coli and Helicobacter pylori but also the Gram-positive organism Corynebacterium glutamicum.
The second group (IPR005256 from INTERPRO) includes eukaryotes, archaea, and most other bacterial lineages; sequences from the second group may resemble pabB more closely than other trpE from this group. GO: 0004049 anthranilate synthase activity, 0009058 biosynthetic process. Probab=32.52 E-value=21 Score=14.79 Aligned_cols=40 Identities=20% Similarity=0.353 Sum_probs=28.8 Q ss_pred CCCCEEEEEEECCCCCCCCCCHHHHHHHHHHHHHCCCEEEEE Q ss_conf 666169999840458888889789999999999789879999 Q gi|254780833|r 281 DDYKKYIIFLTDGENSSPNIDNKESLFYCNEAKRRGAIVYAI 322 (371) Q Consensus 281 ~~~~k~ivl~TDG~~~~~~~~~~~~~~~c~~~k~~gi~i~tI 322 (371) .+|-+||++|-|-+..-=...++.++.. ++..+.++||-| T Consensus 277 ~NPSPYMFy~~D~DFiLFGASPESaLKY--~a~~r~~EIYPI 316 (505) T TIGR00565 277 SNPSPYMFYMKDEDFILFGASPESALKY--DALSRQLEIYPI 316 (505) T ss_pred CCCCCCEEEEECCCEEEECCCCCHHHHC--CCCCCCEEEECC T ss_conf 4859603232047535546870012310--312582777035 No 105 >cd06304 PBP1_BmpA_like Periplasmic binding component of a family of basic membrane lipoproteins from Borrelia and various putative lipoproteins from other bacteria. These outer membrane proteins include Med, a cell-surface localized protein regulating the competence transcription factor gene comK in Bacillus subtilis, and PnrA, a periplasmic purine nucleoside binding protein of an ATP-binding cassette (ABC) transport system in Treponema pallidum. All contain the type I periplasmic sugar-binding protein-like fold.
Probab=32.39 E-value=30 Score=13.82 Aligned_cols=19 Identities=47% Similarity=0.660 Sum_probs=10.0 Q ss_pred HHHHHHHHCCCEEEEEEECCC Q ss_conf 999999978987999994186 Q gi|254780833|r 307 FYCNEAKRRGAIVYAIGVQAE 327 (371) Q Consensus 307 ~~c~~~k~~gi~i~tIg~~~~ 327 (371) ..++.++++| +|+||+..| T Consensus 193 gv~~aa~e~g--~~~IG~d~d 211 (260) T cd06304 193 GVIQAAKEAG--VYAIGVDSD 211 (260) T ss_pred HHHHHHHHHC--CEEEEECCC T ss_conf 7999998609--789984576 No 106 >pfam09001 DUF1890 Domain of unknown function (DUF1890). This domain is found in a set of hypothetical archaeal proteins. Probab=32.20 E-value=30 Score=13.80 Aligned_cols=21 Identities=24% Similarity=0.186 Sum_probs=12.1 Q ss_pred CCCEEEEEEECCCCCHHHHHHHC Q ss_conf 89879999941864279899833 Q gi|254780833|r 315 RGAIVYAIGVQAEAADQFLKNCA 337 (371) Q Consensus 315 ~gi~i~tIg~~~~~~~~~l~~cA 337 (371) .+.+.|.|=||.+ .+.|.+|. T Consensus 93 ~~~~~~aiVFg~~--~e~la~~i 113 (138) T pfam09001 93 SNAKTYAIVFGEH--AEELAETI 113 (138) T ss_pred CCCCEEEEEECCC--HHHHHHHH T ss_conf 4786589993588--79999986 No 107 >pfam11411 DNA_ligase_IV DNA ligase IV. DNA ligase IV along with Xrcc4 functions in DNA non-homologous end joining. This process is required to mend double-strand breaks. Upon ligase binding to an Xrcc4 dimer, the helical tails unwind leading to a flat interaction surface. 
Probab=30.76 E-value=32 Score=13.65 Aligned_cols=22 Identities=32% Similarity=0.605 Sum_probs=18.6 Q ss_pred CCEEEEECCHHHHHHHHHHHHH Q ss_conf 9808982898999999999999 Q gi|254780833|r 339 PDRFYSVQNSRKLHDAFLRIGK 360 (371) Q Consensus 339 ~~~~y~~~~~~~L~~af~~I~~ 360 (371) |+.||--++.++|.++|..|.+ T Consensus 14 GDSy~~dt~~~qLk~vF~~i~~ 35 (36) T pfam11411 14 GDSYFVDTDEQQLKDVFHRIKK 35 (36) T ss_pred CCCEEECCCHHHHHHHHHHHCC T ss_conf 5400104858999999987504 No 108 >KOG4169 consensus Probab=30.71 E-value=32 Score=13.65 Aligned_cols=10 Identities=30% Similarity=0.484 Sum_probs=4.7 Q ss_pred HHCCCEEEEE Q ss_conf 9789879999 Q gi|254780833|r 313 KRRGAIVYAI 322 (371) Q Consensus 313 k~~gi~i~tI 322 (371) .+.||++++| T Consensus 171 ~~sGV~~~av 180 (261) T KOG4169 171 QRSGVRFNAV 180 (261) T ss_pred HHCCEEEEEE T ss_conf 6558799997 No 109 >pfam00331 Glyco_hydro_10 Glycosyl hydrolase family 10. Probab=30.34 E-value=32 Score=13.61 Aligned_cols=55 Identities=22% Similarity=0.309 Sum_probs=33.8 Q ss_pred EEEEEECCCCCCCCCCHHHHHHHHHHHHHCCCEEEEEEECCC-----CC----HHHHHHHCCCC Q ss_conf 999984045888888978999999999978987999994186-----42----79899833898 Q gi|254780833|r 286 YIIFLTDGENSSPNIDNKESLFYCNEAKRRGAIVYAIGVQAE-----AA----DQFLKNCASPD 340 (371) Q Consensus 286 ~ivl~TDG~~~~~~~~~~~~~~~c~~~k~~gi~i~tIg~~~~-----~~----~~~l~~cAs~~ 340 (371) ...++-|-.-............+++.++++|+.|-.||+..- .+ ...|+.+++-| T Consensus 163 akL~~NDyn~e~~~~k~~~~~~lv~~l~~~gvpIDgIG~Q~H~~~~~~~~~~i~~~l~~~a~lG 226 (308) T pfam00331 163 AKLYYNDYNIEGPNAKREAIYNLVKDLKAKGVPIDGIGMQSHLSAGGPSISEIEAALKRFASLG 226 (308) T ss_pred CEEEEECCCCCCCCHHHHHHHHHHHHHHHCCCCCCEEEEEEEECCCCCCHHHHHHHHHHHHHCC T ss_conf 7788742665466478999999999999779986448876660689999999999999999659 No 110 >PRK05715 NADH dehydrogenase subunit K; Validated Probab=29.88 E-value=33 Score=13.56 Aligned_cols=39 Identities=15% Similarity=0.254 Sum_probs=20.0 Q ss_pred HHHHHHHHCCCCCCHHHHHHHHHHHHHHHHHHHHHHHHH Q ss_conf 257887520368718999999999999999999999999 Q gi|254780833|r 3 
FLNIRNFFYNCKGSISILTAILLPVIFIVMGLVIETSHK 41 (371) Q Consensus 3 ~~~l~~f~~d~~G~vai~~al~l~~li~~~g~aVD~~r~ 41 (371) |....+++.+-.|++..+|.+.+-..=..+|+++=...+ T Consensus 45 fv~fs~~~~~~~Gqv~~lfii~vAAaE~avgLAi~v~~~ 83 (99) T PRK05715 45 FVAFSSYLGDLDGQVFAFFVITVAAAEAAIGLAILLQLY 83 (99) T ss_pred HHHHHHHHCCCCCHHHHHHHHHHHHHHHHHHHHHHHHHH T ss_conf 999999918987145899999999999999999999999 No 111 >TIGR02180 GRX_euk Glutaredoxin; InterPro: IPR011899 Glutaredoxins, also known as thioltransferases (disulphide reductases), are small proteins of approximately one hundred amino-acid residues which utilise glutathione and NADPH as cofactors. Oxidized glutathione is regenerated by glutathione reductase. Together these components compose the glutathione system. Glutaredoxin functions as an electron carrier in the glutathione-dependent synthesis of deoxyribonucleotides by the enzyme ribonucleotide reductase. Like thioredoxin, which functions in a similar way, glutaredoxin possesses an active centre disulphide bond. It exists in either a reduced or an oxidized form where the two cysteine residues are linked in an intramolecular disulphide bond. Glutaredoxin has been sequenced in a variety of species. On the basis of extensive sequence similarity, it has been proposed that Vaccinia virus protein O2L is most probably a glutaredoxin. Finally, it must be noted that Bacteriophage T4 thioredoxin also seems to be evolutionarily related. In position 5 of the pattern T4 thioredoxin has Val instead of Pro. This entry is found in eukaryotic glutaredoxins and includes sequences from fungi, plants and metazoans as well as viruses. GO: 0006118 electron transport, 0045454 cell redox homeostasis.
Probab=29.32 E-value=34 Score=13.50 Aligned_cols=26 Identities=19% Similarity=0.263 Sum_probs=17.7 Q ss_pred CCHHHHHHHHHHHHHHHHHEEEEEEC Q ss_conf 89899999999999964000787419 Q gi|254780833|r 346 QNSRKLHDAFLRIGKEMVKQRILYNK 371 (371) Q Consensus 346 ~~~~~L~~af~~I~~~i~~~~i~~~~ 371 (371) ++.++|+++|.+|+.+=+=.||.+.+ T Consensus 38 ~~g~~~Q~~L~~~TG~~TVP~iFi~g 63 (85) T TIGR02180 38 SNGSEIQDYLKEITGQRTVPNIFING 63 (85) T ss_pred CCHHHHHHHHHHHCCCCCCCCEEECC T ss_conf 88578999999844892388265688 No 112 >pfam00071 Ras Ras family. Includes sub-families Ras, Rab, Rac, Ral, Ran, Rap Ypt1 and more. Shares P-loop motif with GTP_EFTU, arf and myosin_head. See pfam00009 pfam00025, pfam00063. As regards Rab GTPases, these are important regulators of vesicle formation, motility and fusion. They share a fold in common with all Ras GTPases: this is a six-stranded beta-sheet surrounded by five alpha-helices. Probab=28.15 E-value=35 Score=13.38 Aligned_cols=59 Identities=15% Similarity=0.245 Sum_probs=36.8 Q ss_pred HHHHHHHHH---CCCEEEEEEECCCC------CHHHHHHHCC--CCEEEE--ECCHHHHHHHHHHHHHHHHH Q ss_conf 999999997---89879999941864------2798998338--980898--28989999999999996400 Q gi|254780833|r 306 LFYCNEAKR---RGAIVYAIGVQAEA------ADQFLKNCAS--PDRFYS--VQNSRKLHDAFLRIGKEMVK 364 (371) Q Consensus 306 ~~~c~~~k~---~gi~i~tIg~~~~~------~~~~l~~cAs--~~~~y~--~~~~~~L~~af~~I~~~i~~ 364 (371) ..+.+.+++ .++.++.||--.|- ..+..++.|. +-.||+ +.+...+.+.|..|+++|-+ T Consensus 91 ~~~~~~i~~~~~~~~piilvgnK~Dl~~~~~i~~~e~~~~a~~~~~~y~e~Sak~g~gI~~~F~~i~~~il~ 162 (162) T pfam00071 91 KKWLEEILRHADDNVPIVLVGNKCDLEDQRVVSTEEGEALAKELGLPFMETSAKTNENVEEAFEELAREILK 162 (162) T ss_pred HHHHHHHHHHCCCCCEEEEEEECCCHHHCCCCCHHHHHHHHHHHCCEEEEECCCCCCCHHHHHHHHHHHHCC T ss_conf 999999998579886288997524746518899999999999809979997378882999999999999676 No 113 >cd06354 PBP1_BmpA_PnrA_like Periplasmic binding domain of basic membrane lipoprotein, PnrA, in Treponema pallidum and its homologs from other bacteria and Archaea. 
The PnrA lipoprotein, also known as Tp0319 or TmpC, represents a novel family of bacterial purine nucleoside receptor encoded within an ATP-binding cassette (ABC) transport system (pnrABCDE). It shows a striking structural similarity to another basic membrane lipoprotein Med which regulates the competence transcription factor gene, comK, in Bacillus subtilis. The members of PnrA-like subgroup are likely to have similar nucleoside-binding functions and a similar type I periplasmic sugar-binding protein-like fold. Probab=27.86 E-value=36 Score=13.35 Aligned_cols=13 Identities=69% Similarity=1.066 Sum_probs=5.6 Q ss_pred HHHCCCEEEEEEECC Q ss_conf 997898799999418 Q gi|254780833|r 312 AKRRGAIVYAIGVQA 326 (371) Q Consensus 312 ~k~~gi~i~tIg~~~ 326 (371) ++++| +|+||+.. T Consensus 202 a~e~g--~~~IG~d~ 214 (265) T cd06354 202 AKEAG--VYAIGVDS 214 (265) T ss_pred HHHCC--CEEEEEEC T ss_conf 99709--87999977 No 114 >smart00633 Glyco_10 Glycosyl hydrolase family 10.
Probab=27.86 E-value=36 Score=13.35 Aligned_cols=54 Identities=22% Similarity=0.295 Sum_probs=31.8 Q ss_pred EEEEECCCCCCCCCCHHHHHHHHHHHHHCCCEEEEEEECCCC-----C----HHHHHHHCCCC Q ss_conf 999840458888889789999999999789879999941864-----2----79899833898 Q gi|254780833|r 287 IIFLTDGENSSPNIDNKESLFYCNEAKRRGAIVYAIGVQAEA-----A----DQFLKNCASPD 340 (371) Q Consensus 287 ivl~TDG~~~~~~~~~~~~~~~c~~~k~~gi~i~tIg~~~~~-----~----~~~l~~cAs~~ 340 (371) ..++-|-.--...........+++.++++|+.|..||+..-- + ...|+.+++-| T Consensus 120 kL~~NDy~~~~~~~k~~~~~~lv~~l~~~g~pIdgIG~Q~H~~~~~~~~~~~~~~l~~~~~~g 182 (254) T smart00633 120 KLFYNDYNTEEPNAKRQAIYELVKKLKAKGVPIDGIGLQSHLSLGSPNIAEIRAALDRFASLG 182 (254) T ss_pred EEEEECCCCCCCCHHHHHHHHHHHHHHHCCCCCCEEEEEEEECCCCCCHHHHHHHHHHHHHCC T ss_conf 898742555477477999999999999779983158762430479999999999999999659 No 115 >PRK05872 short chain dehydrogenase; Provisional Probab=27.57 E-value=36 Score=13.32 Aligned_cols=19 Identities=21% Similarity=0.067 Sum_probs=13.0 Q ss_pred HHHHHHHHCCCEEEEEEEC Q ss_conf 9999999789879999941 Q gi|254780833|r 307 FYCNEAKRRGAIVYAIGVQ 325 (371) Q Consensus 307 ~~c~~~k~~gi~i~tIg~~ 325 (371) .+..++...||+|-+|..| T Consensus 167 sLa~Ela~~GIrVn~V~PG 185 (296) T PRK05872 167 ALRLEVAHRGVSVGSAYLS 185 (296) T ss_pred HHHHHHHHHCCEEEEEECC T ss_conf 9999840019389999708 No 116 >cd02012 TPP_TK Thiamine pyrophosphate (TPP) family, Transketolase (TK) subfamily, TPP-binding module; TK catalyzes the transfer of a two-carbon unit from ketose phosphates to aldose phosphates. In heterotrophic organisms, TK provides a link between glycolysis and the pentose phosphate pathway and provides precursors for nucleotide, aromatic amino acid and vitamin biosynthesis. In addition, the enzyme plays a central role in the Calvin cycle in plants. Typically, TKs are homodimers. They require TPP and divalent cations, such as magnesium ions, for activity. 
Probab=27.23 E-value=37 Score=13.28 Aligned_cols=83 Identities=14% Similarity=0.145 Sum_probs=38.1 Q ss_pred CCCEEEEEEECCCCCCCCCCHHHHHHHHHHHHHCCCEEEEE-----EECCCC----C-HHHHHHHCC-CCEEEEE--CCH Q ss_conf 66169999840458888889789999999999789879999-----941864----2-798998338-9808982--898 Q gi|254780833|r 282 DYKKYIIFLTDGENSSPNIDNKESLFYCNEAKRRGAIVYAI-----GVQAEA----A-DQFLKNCAS-PDRFYSV--QNS 348 (371) Q Consensus 282 ~~~k~ivl~TDG~~~~~~~~~~~~~~~c~~~k~~gi~i~tI-----g~~~~~----~-~~~l~~cAs-~~~~y~~--~~~ 348 (371) ...++++++.||+-+.+... ++...+-..|-.++.+ .| ...... . +++-+.+.+ +=++..+ .|. T Consensus 126 ~~~~v~~~iGDGel~EG~~w--EAl~~A~~~~L~nLi~-ivD~N~~~~~g~~~~~~~~~~l~~~~~sfG~~v~~vdGhd~ 202 (255) T cd02012 126 FDYRVYVLLGDGELQEGSVW--EAASFAGHYKLDNLIA-IVDSNRIQIDGPTDDILFTEDLAKKFEAFGWNVIEVDGHDV 202 (255) T ss_pred CCCCEEEEECCCCCCCCHHH--HHHHHHHHCCCCCEEE-EECCCCCEECCCCCCCCCHHHHHHHHHHCCCCCCCCCCCCH T ss_conf 88717999425110331289--9999985558775699-98689826256030254768899999966981110179999 Q ss_pred HHHHHHHHHHHHHHHHEEE Q ss_conf 9999999999996400078 Q gi|254780833|r 349 RKLHDAFLRIGKEMVKQRI 367 (371) Q Consensus 349 ~~L~~af~~I~~~i~~~~i 367 (371) .+|.++|+...+.-.+.++ T Consensus 203 ~~i~~a~~~a~~~~~kP~~ 221 (255) T cd02012 203 EEILAALEEAKKSKGKPTL 221 (255) T ss_pred HHHHHHHHHHHHCCCCCEE T ss_conf 9999999999867999589 No 117 >PRK10538 3-hydroxy acid dehydrogenase; Provisional Probab=27.21 E-value=37 Score=13.28 Aligned_cols=51 Identities=8% Similarity=-0.031 Sum_probs=25.9 Q ss_pred HHHHHHHHCCCEEEEEEECCCC-----------CHHHHHHHCCCCEEEEECCHHHHHHHHHHHHH Q ss_conf 9999999789879999941864-----------27989983389808982898999999999999 Q gi|254780833|r 307 FYCNEAKRRGAIVYAIGVQAEA-----------ADQFLKNCASPDRFYSVQNSRKLHDAFLRIGK 360 (371) Q Consensus 307 ~~c~~~k~~gi~i~tIg~~~~~-----------~~~~l~~cAs~~~~y~~~~~~~L~~af~~I~~ 360 (371) .++..+...||++.+|.-|.-. +.+.+...-..... -.++|+.++.--++. 
T Consensus 160 ~La~El~~~gIrVn~v~PG~v~t~~~~~~~~~~~~~~~~~~~~~~~---l~PedVA~av~fl~s 221 (248) T PRK10538 160 NLRTDLHGTAVRVTDIEPGLVGGTEFSNVRFKGDDGKAEKTYQNTVA---LTPEDVSEAVWWVAT 221 (248) T ss_pred HHHHHHCCCCEEEEEEECCCCCCCCHHCCCCCCCHHHHHHHHCCCCC---CCHHHHHHHHHHHHC T ss_conf 99998478685999998475768411114556768889740357899---999999999999982 No 118 >TIGR00937 2A51 chromate transporter, chromate ion transporter (CHR) family; InterPro: IPR014047 Members of this family probably act as chromate transporters and are found in both bacteria and archaebacteria. The protein reduces chromate accumulation and is essential for chromate resistance. They are composed of one or two copies of this region. The alignment contains two conserved motifs, FGG and PGP. Probab=27.02 E-value=37 Score=13.26 Aligned_cols=49 Identities=16% Similarity=0.138 Sum_probs=41.1 Q ss_pred CCCCCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHH---HHHHHHHH Q ss_conf 03687189999999999999999999999999999999999---99999999 Q gi|254780833|r 11 YNCKGSISILTAILLPVIFIVMGLVIETSHKFFVKAKLHYI---LDHSLLYT 59 (371) Q Consensus 11 ~d~~G~vai~~al~l~~li~~~g~aVD~~r~~~~ks~Lq~a---~DaA~LAa 59 (371) +...|.++.-.|++||.++.+.++++=|.|....-..+|+. +..++.+.
T Consensus 62 ~g~~Ga~~ag~AF~LPs~l~~~~L~~~y~~~~~l~~~~g~~f~G~~~~vi~l 113 (390) T TIGR00937 62 GGILGAILAGVAFVLPSFLLVVALAWLYVQYGSLPKAVGAVFYGLKAAVIAL 113 (390) T ss_pred HHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHCCCCHHHHHHHHHHHHHHHHH T ss_conf 7799999999999869999999999999974364368999998899999999 No 119 >PRK01215 competence damage-inducible protein A; Provisional Probab=26.72 E-value=38 Score=13.22 Aligned_cols=20 Identities=0% Similarity=0.076 Sum_probs=9.7 Q ss_pred ECCHHHHHHHHHHHHHHHHH Q ss_conf 28989999999999996400 Q gi|254780833|r 345 VQNSRKLHDAFLRIGKEMVK 364 (371) Q Consensus 345 ~~~~~~L~~af~~I~~~i~~ 364 (371) +.+..+..+..+.+.++|.+ T Consensus 232 a~~~~ea~~~i~~~~~~ire 251 (264) T PRK01215 232 AESEEEAEEKVEKALERLRE 251 (264) T ss_pred CCCHHHHHHHHHHHHHHHHH T ss_conf 09999999999999999999 No 120 >PRK08541 flagellin; Validated Probab=25.23 E-value=40 Score=13.06 Aligned_cols=22 Identities=27% Similarity=0.368 Sum_probs=15.6 Q ss_pred CCCCCHHHHHHHHHHHHHHHHH Q ss_conf 3687189999999999999999 Q gi|254780833|r 12 NCKGSISILTAILLPVIFIVMG 33 (371) Q Consensus 12 d~~G~vai~~al~l~~li~~~g 33 (371) +|||.+-|=.++.++.|+++++ T Consensus 1 ~kkG~~GigtlIVfIAmVlVAA 22 (212) T PRK08541 1 MKKGAVGIGTLIVFIAMVLVAA 22 (212) T ss_pred CCCCCCCHHHHHHHHHHHHHHH T ss_conf 9753024208999999999999 No 121 >PRK06138 short chain dehydrogenase; Provisional Probab=25.07 E-value=40 Score=13.04 Aligned_cols=20 Identities=15% Similarity=0.089 Sum_probs=14.2 Q ss_pred HHHHHHHHCCCEEEEEEECC Q ss_conf 99999997898799999418 Q gi|254780833|r 307 FYCNEAKRRGAIVYAIGVQA 326 (371) Q Consensus 307 ~~c~~~k~~gi~i~tIg~~~ 326 (371) .++...-..||++.+|..|. T Consensus 166 ~lA~e~a~~gIrVNaI~PG~ 185 (252) T PRK06138 166 AMALDHATDGIRVNAVAPGT 185 (252) T ss_pred HHHHHHHHCCEEEEEEEECC T ss_conf 99998622291999997588 No 122 >TIGR02848 spore_III_AC stage III sporulation protein AC; InterPro: IPR009570 This family consists of several bacterial stage III sporulation protein AC (SpoIIIAC) sequences. The exact function of this family is unknown.. 
Probab=24.72 E-value=41 Score=13.00 Aligned_cols=29 Identities=7% Similarity=0.196 Sum_probs=24.3 Q ss_pred HHHHHCCCCCCHHHHHHHHHHHHHHHHHH Q ss_conf 88752036871899999999999999999 Q gi|254780833|r 6 IRNFFYNCKGSISILTAILLPVIFIVMGL 34 (371) Q Consensus 6 l~~f~~d~~G~vai~~al~l~~li~~~g~ 34 (371) |++..|||-++++.+.++..+.+.++-.+ T Consensus 23 Lk~~GKeE~A~~vtL~G~~~vL~mVi~l~ 51 (64) T TIGR02848 23 LKQAGKEEQAQMVTLAGIVVVLFMVITLI 51 (64) T ss_pred HHHHCCHHHHHHHHHHHHHHHHHHHHHHH T ss_conf 88708667876778887789999999999 No 123 >cd06325 PBP1_ABC_uncharacterized_transporter Type I periplasmic ligand-binding domain of uncharacterized ABC-type transport systems that are predicted to be involved in the uptake of amino acids, peptides, or inorganic ions. This group includes the type I periplasmic ligand-binding domain of uncharacterized ABC (ATPase Binding Cassette)-type transport systems that are predicted to be involved in the uptake of amino acids, peptides, or inorganic ions. This subgroup has high sequence similarity to members of the family of hydrophobic amino acid transporters (HAAT), such as leucine/isoleucine/valine binding protein (LIVBP); its ligand specificity has not been determined experimentally. 
Probab=24.64 E-value=41 Score=12.99 Aligned_cols=30 Identities=20% Similarity=0.006 Sum_probs=14.8 Q ss_pred EEEEEECCCCCCCCCCHHHHHHHHHHHHHCCCEEEE Q ss_conf 999984045888888978999999999978987999 Q gi|254780833|r 286 YIIFLTDGENSSPNIDNKESLFYCNEAKRRGAIVYA 321 (371) Q Consensus 286 ~ivl~TDG~~~~~~~~~~~~~~~c~~~k~~gi~i~t 321 (371) .+++.+|..-- .....+...+.+.+|.+|+ T Consensus 187 al~~~~d~~v~------s~~~~i~~~a~~~~iPv~~ 216 (281) T cd06325 187 AIYVPTDNTVA------SAMEAVVKVANEAKIPVIA 216 (281) T ss_pred EEEEECCCHHH------HHHHHHHHHHHHCCCCEEE T ss_conf 99991881277------7999999999874998893 No 124 >PRK08324 short chain dehydrogenase; Validated Probab=24.60 E-value=41 Score=12.98 Aligned_cols=77 Identities=18% Similarity=0.193 Sum_probs=44.5 Q ss_pred EEEEEECCCCCCCCCCHHHHHHHHHHHHHCCCEEEEEEECCCCCHHHHHHHCCCCE----EEEECCHHHHHHHHHHHHHH Q ss_conf 99998404588888897899999999997898799999418642798998338980----89828989999999999996 Q gi|254780833|r 286 YIIFLTDGENSSPNIDNKESLFYCNEAKRRGAIVYAIGVQAEAADQFLKNCASPDR----FYSVQNSRKLHDAFLRIGKE 361 (371) Q Consensus 286 ~ivl~TDG~~~~~~~~~~~~~~~c~~~k~~gi~i~tIg~~~~~~~~~l~~cAs~~~----~y~~~~~~~L~~af~~I~~~ 361 (371) -++|+|=|-..- -..+|..+.+.|.+|.......+.-+...+.+..++. 
..++++.+++.++|+++.++ T Consensus 422 KVALVTGga~GI-------G~A~A~~fa~eGA~Vvl~D~~~~~l~~~a~el~~~~~~~~~~~DVtd~~~v~~~v~~~~~~ 494 (676) T PRK08324 422 KVALVTGAAGGI-------GLATAKRLAAEGACVVLADIDEEAAEAAAAELGGRDRALGVACDVTDEAAVQAAFEEAALA 494 (676) T ss_pred CEEEEECCCCCH-------HHHHHHHHHHCCCEEEEEECCHHHHHHHHHHHHCCCCEEEEEECCCCHHHHHHHHHHHHHH T ss_conf 879994798816-------2999999998799899995888999999999707994799980689999999999999998 Q ss_pred HHHEEEEE Q ss_conf 40007874 Q gi|254780833|r 362 MVKQRILY 369 (371) Q Consensus 362 i~~~~i~~ 369 (371) .-..-+.+ T Consensus 495 fGgIDiLV 502 (676) T PRK08324 495 FGGVDIVV 502 (676) T ss_pred HCCCCEEE T ss_conf 59988899 No 125 >PRK07523 gluconate 5-dehydrogenase; Provisional Probab=24.19 E-value=42 Score=12.94 Aligned_cols=20 Identities=15% Similarity=0.029 Sum_probs=13.8 Q ss_pred HHHHHHHHCCCEEEEEEECC Q ss_conf 99999997898799999418 Q gi|254780833|r 307 FYCNEAKRRGAIVYAIGVQA 326 (371) Q Consensus 307 ~~c~~~k~~gi~i~tIg~~~ 326 (371) .++...-..||++.+|..|. T Consensus 168 ~lA~e~a~~gIrVNaVaPG~ 187 (251) T PRK07523 168 GMATDWAKHGLQCNAIAPGY 187 (251) T ss_pred HHHHHHCCCCEEEEEEEECC T ss_conf 99999702094999997378 No 126 >cd00765 Pyrophosphate_PFK Phosphofructokinase, a key regulatory enzyme in glycolysis, catalyzes the phosphorylation of fructose-6-phosphate to fructose-1,6-biphosphate. The members belong to a subfamily of the PFKA family (cd00363) and include pyrophosphate-dependent phosphofructokinases. These are found in bacteria as well as plants. These may be dimeric nonallosteric enzymes as in bacteria or allosteric heterotetramers as in plants. 
Probab=23.19 E-value=44 Score=12.82 Aligned_cols=16 Identities=25% Similarity=0.320 Sum_probs=7.1 Q ss_pred CCCCCEEEEECCCCCC Q ss_conf 4674069985276311 Q gi|254780833|r 166 DIGLDMMMVLDVSLSM 181 (371) Q Consensus 166 ~~~idi~~viD~SgSm 181 (371) ...+|-..++..-+|+ T Consensus 164 ~l~LdgLVIiGGddSn 179 (550) T cd00765 164 KLDLDALVVIGGDDSN 179 (550) T ss_pred HCCCCEEEEECCCCCH T ss_conf 8599879996898734 No 127 >TIGR00421 ubiX_pad polyprenyl P-hydroxybenzoate and phenylacrylic acid decarboxylases; InterPro: IPR004507 This family contains flavoproteins, which are aromatic acid decarboxylases. An example is the Saccharomyces cerevisiae gene, PAD1 that encodes phenylacrylic acid decarboxylase. Mutations of this gene are viable and confer resistance to cinnamic acid. ; GO: 0016831 carboxy-lyase activity. Probab=22.21 E-value=46 Score=12.70 Aligned_cols=33 Identities=15% Similarity=0.261 Sum_probs=14.4 Q ss_pred CCEEEEEEECCCCCCCCCCHHHHHHHHHHHHHCCCEEE Q ss_conf 61699998404588888897899999999997898799 Q gi|254780833|r 283 YKKYIIFLTDGENSSPNIDNKESLFYCNEAKRRGAIVY 320 (371) Q Consensus 283 ~~k~ivl~TDG~~~~~~~~~~~~~~~c~~~k~~gi~i~ 320 (371) .+|-+||++==-|=. +.+-++.+.+|+ .|.-|+ T Consensus 111 ErRkLvL~~REtPL~-SiHLENmL~L~~----~G~IIl 143 (181) T TIGR00421 111 ERRKLVLVPRETPLN-SIHLENMLRLSR----MGAIIL 143 (181) T ss_pred CCCCEEEECCCCCCC-HHHHHHHHHHHH----CCCEEE T ss_conf 054246403678875-154899999982----792532 No 128 >PRK07576 short chain dehydrogenase; Provisional Probab=22.15 E-value=46 Score=12.69 Aligned_cols=50 Identities=20% Similarity=0.200 Sum_probs=26.0 Q ss_pred HHHHHHHCCCEEEEEEECCCCC----------HHHHHHHCC--C-CEEEEECCHHHHHHHHHHHHH Q ss_conf 9999997898799999418642----------798998338--9-808982898999999999999 Q gi|254780833|r 308 YCNEAKRRGAIVYAIGVQAEAA----------DQFLKNCAS--P-DRFYSVQNSRKLHDAFLRIGK 360 (371) Q Consensus 308 ~c~~~k~~gi~i~tIg~~~~~~----------~~~l~~cAs--~-~~~y~~~~~~~L~~af~~I~~ 360 (371) ++...-.+||+|.+|..|.=.+ +...+.+.. | ++ .-.++|+.++.-=++. 
T Consensus 170 lA~e~a~~gIrVN~IaPG~i~~t~~~~~~~~~~~~~~~~~~~~Pl~R---~g~pedia~~v~FL~S 232 (260) T PRK07576 170 LALEWGPEGVRVNSISPGPIAGTEGMARLAPTPELQAAVAQSVPLKR---NGTGQDIANAALFLAS 232 (260) T ss_pred HHHHHHHCCEEEEEEEECCCCCCHHHHHCCCCHHHHHHHHHCCCCCC---CCCHHHHHHHHHHHHC T ss_conf 99997133929999834775783666632799999999984799999---8699999999999958 No 129 >PRK06113 7-alpha-hydroxysteroid dehydrogenase; Validated Probab=21.34 E-value=48 Score=12.59 Aligned_cols=20 Identities=15% Similarity=0.026 Sum_probs=14.4 Q ss_pred HHHHHHHHCCCEEEEEEECC Q ss_conf 99999997898799999418 Q gi|254780833|r 307 FYCNEAKRRGAIVYAIGVQA 326 (371) Q Consensus 307 ~~c~~~k~~gi~i~tIg~~~ 326 (371) .+...+-..||++.+|..|. T Consensus 172 ~lA~ela~~gIrVN~V~PG~ 191 (255) T PRK06113 172 NMAFDLGEKNIRVNGIAPGA 191 (255) T ss_pred HHHHHHHHCCEEEEEEEECC T ss_conf 99999826494999986488 No 130 >cd03132 GATase1_catalase Type 1 glutamine amidotransferase (GATase1)-like domain found in at the C-terminal of several large catalases. Type 1 glutamine amidotransferase (GATase1)-like domain found in at the C-terminal of several large catalases. Catalase catalyzes the dismutation of hydrogen peroxide (H2O2) to water and oxygen. This group includes the large catalases: Neurospora crassa Catalase-1 and Catalase-3 and, Escherichia coli HP-II. This GATase1-like domain has an essential role in HP-II catalase activity. However, it lacks enzymatic activity and the catalytic triad typical of GATase1 domains. Catalase-1 and -3 are homotetrameric, HP-II is homohexameric. It has been proposed that this domain may facilitate the folding and oligomerization process. The interface between this GATase1-like domain of HP-II and the core of the subunit forms part of a channel which provides access to the deeply buried catalase active sites of HPII. 
Catalase-1 is associated with non-growing cells; C Probab=21.04 E-value=48 Score=12.55 Aligned_cols=44 Identities=14% Similarity=0.280 Sum_probs=17.2 Q ss_pred HHHHHHHHHHHHCCCEEEEEEECCCCCHHHHHHHC---CCCEEEEECCHHH Q ss_conf 89999999999789879999941864279899833---8980898289899 Q gi|254780833|r 303 KESLFYCNEAKRRGAIVYAIGVQAEAADQFLKNCA---SPDRFYSVQNSRK 350 (371) Q Consensus 303 ~~~~~~c~~~k~~gi~i~tIg~~~~~~~~~l~~cA---s~~~~y~~~~~~~ 350 (371) .........+=+.+-.|-.+ ....++|...- .....+..++..+ T Consensus 82 ~~~~~fv~eay~h~KpI~a~----~~~~~lL~~agi~~~~~Gvv~~~~~~~ 128 (142) T cd03132 82 GRALHFVTEAFKHGKPIGAV----GEGSDLLEAAGIPLEDPGVVTADDVKD 128 (142) T ss_pred HHHHHHHHHHHHCCCEEEEE----CCHHHHHHHCCCCCCCCCEEEECCCCH T ss_conf 67999999999769979993----772999997697999985798158667 No 131 >PRK08277 D-mannonate oxidoreductase; Provisional Probab=20.39 E-value=50 Score=12.47 Aligned_cols=19 Identities=26% Similarity=0.230 Sum_probs=12.9 Q ss_pred HHHHHHHHCCCEEEEEEEC Q ss_conf 9999999789879999941 Q gi|254780833|r 307 FYCNEAKRRGAIVYAIGVQ 325 (371) Q Consensus 307 ~~c~~~k~~gi~i~tIg~~ 325 (371) .++...-..||+|.+|..| T Consensus 187 ~lA~e~a~~gIrVNaIaPG 205 (278) T PRK08277 187 WLAVEFAKVGIRVNAIAPG 205 (278) T ss_pred HHHHHHCCCCEEEEEEEEC T ss_conf 9999965359499998528 No 132 >PRK07505 hypothetical protein; Provisional Probab=20.35 E-value=50 Score=12.46 Aligned_cols=60 Identities=13% Similarity=0.153 Sum_probs=40.8 Q ss_pred CHHHHHHHHHHHHHCCCEEEEEEEC-CCCCHHHHHHHCCCCEEEEECCHHHHHHHHHHHHHHH Q ss_conf 9789999999999789879999941-8642798998338980898289899999999999964 Q gi|254780833|r 301 DNKESLFYCNEAKRRGAIVYAIGVQ-AEAADQFLKNCASPDRFYSVQNSRKLHDAFLRIGKEM 362 (371) Q Consensus 301 ~~~~~~~~c~~~k~~gi~i~tIg~~-~~~~~~~l~~cAs~~~~y~~~~~~~L~~af~~I~~~i 362 (371) +......+++.+.++|+-+..|.+- .+.++..||-|-+..|= -++-+.|.++++++.++. 
T Consensus 343 ~~~~a~~~~~~L~e~Gi~v~~i~~PtVp~g~~rlRi~l~A~hT--~edId~l~~~L~~v~~e~ 403 (405) T PRK07505 343 DEDTAIKYAKQLKDAGFYTSPVFFPVVAKGNAGLRIMFRADHT--NEEIKRLCSLLKEILADY 403 (405) T ss_pred CHHHHHHHHHHHHHCCCCEEEEECCCCCCCCCEEEEEECCCCC--HHHHHHHHHHHHHHHHHH T ss_conf 9999999999999789018756388389798249987488899--999999999999999985 No 133 >pfam04392 ABC_sub_bind ABC transporter substrate binding protein. This family contains many hypothetical proteins and some ABC transporter substrate binding proteins. Probab=20.20 E-value=51 Score=12.45 Aligned_cols=30 Identities=20% Similarity=0.048 Sum_probs=15.0 Q ss_pred EEEEEECCCCCCCCCCHHHHHHHHHHHHHCCCEEEE Q ss_conf 999984045888888978999999999978987999 Q gi|254780833|r 286 YIIFLTDGENSSPNIDNKESLFYCNEAKRRGAIVYA 321 (371) Q Consensus 286 ~ivl~TDG~~~~~~~~~~~~~~~c~~~k~~gi~i~t 321 (371) .+++.+|..-. .....+...+.+.++.+|+ T Consensus 186 al~i~~d~~v~------s~~~~i~~~a~~~kiPv~~ 215 (292) T pfam04392 186 AIFIPTDNLIA------SAFTAVLQEANKAKIPVIT 215 (292) T ss_pred EEEEECCCCHH------HHHHHHHHHHHHCCCCEEE T ss_conf 89993781078------8999999999974999895 No 134 >TIGR02764 spore_ybaN_pdaB polysaccharide deacetylase family sporulation protein PdaB; InterPro: IPR014132 This entry describes the YbaN protein family, also called PdaB and SpoVIE, of Gram-positive bacteria. Although ybaN null mutants have only a mild sporulation defect, ybaN/ytrI double mutants show drastically reduced sporulation efficiencies. This synthetic defect suggests the role of this sigmaE-controlled gene in sporulation had been masked by functional redundancy. Members of this family are homologous to a characterised polysaccharide deacetylase; the exact function of this protein family is unknown.
Probab=20.19 E-value=51 Score=12.44 Aligned_cols=36 Identities=25% Similarity=0.265 Sum_probs=24.6 Q ss_pred EEEEECCCCCCCCCCHHHHHHHHHHHHHCCCEEEEEE Q ss_conf 9998404588888897899999999997898799999 Q gi|254780833|r 287 IIFLTDGENSSPNIDNKESLFYCNEAKRRGAIVYAIG 323 (371) Q Consensus 287 ivl~TDG~~~~~~~~~~~~~~~c~~~k~~gi~i~tIg 323 (371) ||||=|+-++... .......+...+|++|-+.-+|+ T Consensus 160 IiL~HDASd~~Kq-T~~ALp~Ii~~LK~~GY~fv~is 195 (198) T TIGR02764 160 IILLHDASDSAKQ-TVKALPEIIKKLKEKGYEFVTIS 195 (198) T ss_pred EEEEEECCCCCCC-HHHHHHHHHHHHHHCCCEEEEHH T ss_conf 6987634879744-17789999899875495343422 Done!