BLASTP 2.2.22 [Sep-27-2009]


Reference: Altschul, Stephen F., Thomas L. Madden, Alejandro A. Schaffer,
Jinghui Zhang, Zheng Zhang, Webb Miller, and David J. Lipman (1997),
"Gapped BLAST and PSI-BLAST: a new generation of protein database search
programs", Nucleic Acids Res. 25:3389-3402.


Reference for composition-based statistics: Schaffer, Alejandro A.,
L. Aravind, Thomas L. Madden, Sergei Shavirin, John L. Spouge, Yuri I. Wolf,
Eugene V. Koonin, and Stephen F. Altschul (2001), "Improving the accuracy of
PSI-BLAST protein database searches with composition-based statistics and
other refinements", Nucleic Acids Res. 29:2994-3005.

Query= gi|254780980|ref|YP_003065393.1| hypothetical protein CLIBASIA_04405 [Candidatus Liberibacter asiaticus str. psy62]
         (121 letters)

Database: nr
           14,124,377 sequences; 4,842,793,630 total letters

Searching..................................................done

>gi|254780980|ref|YP_003065393.1| hypothetical protein CLIBASIA_04405 [Candidatus Liberibacter asiaticus str. psy62]
 gi|254040657|gb|ACT57453.1| hypothetical protein CLIBASIA_04405 [Candidatus Liberibacter asiaticus str. psy62]
          Length = 121

 Score = 173 bits (439), Expect = 6e-42, Method: Composition-based stats.
 Identities = 121/121 (100%), Positives = 121/121 (100%)

Query: 1   MKAKILMTSAFSTTLLTIGGCDIVIGRTEDLLNKLQKNSTQMIKEANFKISETHRLAQER 60
           MKAKILMTSAFSTTLLTIGGCDIVIGRTEDLLNKLQKNSTQMIKEANFKISETHRLAQER
Sbjct: 1   MKAKILMTSAFSTTLLTIGGCDIVIGRTEDLLNKLQKNSTQMIKEANFKISETHRLAQER 60

Query: 61  VEAAEKRVKEVEERATASRKLSVDELANAFWDLSDEDKNAFTGNVKQEVCKVKKITVPPS 120
           VEAAEKRVKEVEERATASRKLSVDELANAFWDLSDEDKNAFTGNVKQEVCKVKKITVPPS
Sbjct: 61  VEAAEKRVKEVEERATASRKLSVDELANAFWDLSDEDKNAFTGNVKQEVCKVKKITVPPS 120

Query: 121 N 121
           N
Sbjct: 121 N 121

>gi|254780556|ref|YP_003064969.1| hypothetical protein CLIBASIA_02215 [Candidatus Liberibacter asiaticus str. psy62]
 gi|254040233|gb|ACT57029.1| hypothetical protein CLIBASIA_02215 [Candidatus Liberibacter asiaticus str.
 psy62]
          Length = 120

 Score = 123 bits (309), Expect = 7e-27, Method: Composition-based stats.
 Identities = 41/110 (37%), Positives = 67/110 (60%), Gaps = 10/110 (9%)

Query: 1   MKAKILMTSAFSTTLLTIGGCDIVIGRTEDLLNKLQKNSTQMIKEANFKISETHRLAQER 60
           M+AK L+ S   TT +TI GC +V    ED  N+++  S +M+KEA  ++ E H+LA+E
Sbjct: 1   MRAKTLLASTLVTTAITIIGCSLV----ED--NRIE--SLRMVKEAKMEVLEAHKLAKEY 52

Query: 61  VEAAEKRVKEVEERATAS--RKLSVDELANAFWDLSDEDKNAFTGNVKQE 108
           VE A +RVKE EE++ A   + L +D+L   F +L  +++  F   ++ +
Sbjct: 53  VEQANQRVKEAEEQSNARLLKGLGMDDLVRYFMNLDSQNQAFFIDTIQNK 102

>gi|75911009|ref|YP_325305.1| WD-40 repeat-containing protein [Anabaena variabilis ATCC 29413]
 gi|75704734|gb|ABA24410.1| WD-40 repeat-containing protein [Anabaena variabilis ATCC 29413]
          Length = 1477

 Score = 36.6 bits (83), Expect = 0.99, Method: Composition-based stats.
 Identities = 19/68 (27%), Positives = 32/68 (47%), Gaps = 7/68 (10%)

Query: 43  IKEANFKISETHRLAQERVEAAEKRVKEV-------EERATASRKLSVDELANAFWDLSD 95
           + +AN K++ T + A+E  + A +RVKE        +E A    K + D++  A  +L
Sbjct: 496 VNDANVKVANTQKEAKELTDKANQRVKEANIQVANTKEEAKNRTKKADDDVRLALENLGK 555

Query: 96  EDKNAFTG 103
             K A
Sbjct: 556 ITKQAKID 563

>gi|159486149|ref|XP_001701106.1| hypothetical protein CHLREDRAFT_194202 [Chlamydomonas reinhardtii]
 gi|158272000|gb|EDO97808.1| predicted protein [Chlamydomonas reinhardtii]
          Length = 1430

 Score = 36.6 bits (83), Expect = 1.1, Method: Composition-based stats.
 Identities = 13/34 (38%), Positives = 23/34 (67%)

Query: 41  QMIKEANFKISETHRLAQERVEAAEKRVKEVEER 74
           +++K     + E  + AQ+RVE AE+RV++ E+R
Sbjct: 70  KLVKAVKEDVEEAQQEAQQRVEKAEQRVEKAEQR 103

 Score = 35.1 bits (79), Expect = 3.5, Method: Composition-based stats.
 Identities = 17/44 (38%), Positives = 28/44 (63%)

Query: 31  LLNKLQKNSTQMIKEANFKISETHRLAQERVEAAEKRVKEVEER 74
           +L KL K   + ++EA  +  +    A++RVE AE+RV+E E+R
Sbjct: 67  ILQKLVKAVKEDVEEAQQEAQQRVEKAEQRVEKAEQRVEEAEQR 110

>gi|288922780|ref|ZP_06416949.1| hypothetical protein FrEUN1fDRAFT_6647 [Frankia sp.
 EUN1f]
 gi|288345893|gb|EFC80253.1| hypothetical protein FrEUN1fDRAFT_6647 [Frankia sp. EUN1f]
          Length = 674

 Score = 36.6 bits (83), Expect = 1.2, Method: Composition-based stats.
 Identities = 22/87 (25%), Positives = 40/87 (45%), Gaps = 5/87 (5%)

Query: 14  TLLTIGGCDIVIGRTEDLLNKLQKNSTQMIKEANFKISETHRLAQERVEAAEKRVKEVEE 73
           T   + G D V+   E    ++Q+++  ++ E       + R A   VE AE+ ++   E
Sbjct: 267 TAKGVDGSDEVVAAQE----RVQESAAAIV-ETQLDGQRSVRDALHAVEQAERSLQSARE 321

Query: 74  RATASRKLSVDELANAFWDLSDEDKNA 100
           ++ AS    VD  A+A  DL  + +
Sbjct: 322 QSAASAAGGVDAYADALKDLPAKTQAF 348

>gi|255943494|ref|XP_002562515.1| Pc19g00220 [Penicillium chrysogenum Wisconsin 54-1255]
 gi|211587249|emb|CAP79438.1| Pc19g00220 [Penicillium chrysogenum Wisconsin 54-1255]
          Length = 566

 Score = 36.2 bits (82), Expect = 1.5, Method: Composition-based stats.
 Identities = 17/78 (21%), Positives = 32/78 (41%), Gaps = 5/78 (6%)

Query: 23  IVIGRTEDLLNKLQKNST---QMIKEANFKI--SETHRLAQERVEAAEKRVKEVEERATA 77
           I+ G ++  L+ +  NS    +  KE    I     H   ++    A + +KE EE +
Sbjct: 484 IIKGASKSPLSSIAGNSRVHRRCTKETKLDIGNDSAHARIRDYWLRANQALKEKEEASGG 543

Query: 78  SRKLSVDELANAFWDLSD 95
              L VD+    + ++
Sbjct: 544 RDYLDVDDEVFKYVNIDS 561

>gi|270009401|gb|EFA05849.1| hypothetical protein TcasGA2_TC008640 [Tribolium castaneum]
          Length = 4573

 Score = 35.5 bits (80), Expect = 2.2, Method: Composition-based stats.
 Identities = 29/111 (26%), Positives = 56/111 (50%), Gaps = 6/111 (5%)

Query: 11   FSTTLLTIGGCDIVIGRTEDLLNKLQKNSTQMIKEANFKISETHRLAQERVEAAEK--RV 68
             ++T   + G   V+   E  LN+  + + ++I   +   +E  ++A+E+  A+E+  +V
Sbjct: 3163 LASTSEEVDGLKEVLAVQEVELNEKNEAAGKLIAVLS---AENEKVAKEQAIASEEEAKV 3219

Query: 69   KEVEERATASRKLSVDELANAFWDLSDEDKNAFTGNVKQEVCKVKKITVPP 119
            K +EE  +  +K+  D+LA A   L    +   T N K  + ++K  T PP
Sbjct: 3220 KLIEEDVSVKQKICADDLAKAEPALIAAQQALNTLN-KNNLTELKSFTQPP 3269

>gi|58259447|ref|XP_567136.1| hypothetical protein [Cryptococcus neoformans var. neoformans JEC21]
 gi|134107537|ref|XP_777653.1| hypothetical protein CNBA7730 [Cryptococcus neoformans var.
 neoformans B-3501A]
 gi|50260347|gb|EAL23006.1| hypothetical protein CNBA7730 [Cryptococcus neoformans var. neoformans B-3501A]
 gi|57223273|gb|AAW41317.1| conserved hypothetical protein [Cryptococcus neoformans var. neoformans JEC21]
          Length = 1184

 Score = 35.1 bits (79), Expect = 3.2, Method: Composition-based stats.
 Identities = 25/74 (33%), Positives = 38/74 (51%), Gaps = 6/74 (8%)

Query: 8   TSAFSTTLLTIGGCDIVIGRTEDLLNKLQKNSTQMIKEANFKISETHRLAQERVEAAEKR 67
           TS + T  +    CD+    ++DL  K      Q  ++A  + +     AQ+RV+AAE R
Sbjct: 257 TSTYVTLRMNSALCDVAADVSKDLSVK------QRQRDAEVRKAGATNAAQKRVKAAEDR 310

Query: 68  VKEVEERATASRKL 81
           VKEV+ER     +L
Sbjct: 311 VKEVQERKQTLEEL 324

>gi|224075020|ref|XP_002304521.1| predicted protein [Populus trichocarpa]
 gi|222841953|gb|EEE79500.1| predicted protein [Populus trichocarpa]
          Length = 822

 Score = 34.7 bits (78), Expect = 4.5, Method: Composition-based stats.
 Identities = 23/89 (25%), Positives = 38/89 (42%), Gaps = 11/89 (12%)

Query: 39  STQMIKEANFKISETHRLAQERVEAAEKRVKEVEERATASRKLSV-------DELANAFW 91
           S + I E++ K+ +T    QER    EK + E  E  T++R L +       D+  N
Sbjct: 309 SLKAISESDNKVDQT----QERTLHCEKGMPEQVESMTSTRALPMVMDLTVDDDEINGED 364

Query: 92  DLSDEDKNAFTGNVKQEVCKVKKITVPPS 120
           ++  ED+  F   ++        I   PS
Sbjct: 365 NIDAEDRKPFLATLQNHPVDTNPIPTMPS 393

>gi|52080071|ref|YP_078862.1| inositol monophosphatase SuhB [Bacillus licheniformis ATCC 14580]
 gi|52785445|ref|YP_091274.1| YktC [Bacillus licheniformis ATCC 14580]
 gi|319646154|ref|ZP_08000384.1| YktC protein [Bacillus sp. BT1B_CT2]
 gi|52003282|gb|AAU23224.1| Inositol monophosphatase SuhB [Bacillus licheniformis ATCC 14580]
 gi|52347947|gb|AAU40581.1| YktC [Bacillus licheniformis ATCC 14580]
 gi|317391904|gb|EFV72701.1| YktC protein [Bacillus sp. BT1B_CT2]
          Length = 264

 Score = 34.7 bits (78), Expect = 4.6, Method: Composition-based stats.
 Identities = 14/57 (24%), Positives = 29/57 (50%), Gaps = 4/57 (7%)

Query: 52  ETHRLAQERVEAAEKRVKEV-EERATASRKLSVDELANAFWDLSDEDKNAFTGNVKQ 107
           E  RLA+  V+ A +R+K+  +E+ T   K + ++L     ++  E +  F   ++
Sbjct: 6   EIDRLAKSWVKEAGQRIKQSMKEKMTIETKSNPNDLVT---NIDKETERFFIEKIQS 59

>gi|72389040|ref|XP_844815.1| 65 kDa invariant surface glycoprotein [Trypanosoma brucei TREU927]
 gi|62176336|gb|AAX70448.1| 65 kDa invariant surface glycoprotein, putative [Trypanosoma brucei]
 gi|70801349|gb|AAZ11256.1| 65 kDa invariant surface glycoprotein, putative [Trypanosoma brucei brucei strain 927/4 GUTat10.1]
          Length = 435

 Score = 34.7 bits (78), Expect = 4.7, Method: Composition-based stats.
 Identities = 21/62 (33%), Positives = 32/62 (51%), Gaps = 4/62 (6%)

Query: 28  TEDLLNKLQKNSTQMIKE----ANFKISETHRLAQERVEAAEKRVKEVEERATASRKLSV 83
           + +  +KL    T+ +KE    A  K+SE    A+E  E A KR +EV E A  +R   +
Sbjct: 84  SNNGYSKLSDADTKKVKEIYEKAKGKVSEQLPKAKEFGEEAGKRHQEVTEAAKRARGWGL 143

Query: 84  DE 85
           D+
Sbjct: 144 DD 145

>gi|229596643|ref|XP_001007975.2| hypothetical protein TTHERM_01395380 [Tetrahymena thermophila]
 gi|225565189|gb|EAR87730.2| hypothetical protein TTHERM_01395380 [Tetrahymena thermophila SB210]
          Length = 2032

 Score = 34.3 bits (77), Expect = 5.0, Method: Composition-based stats.
 Identities = 19/86 (22%), Positives = 40/86 (46%), Gaps = 2/86 (2%)

Query: 30   DLLNKLQKNSTQMIKEANFKISETHRLAQERVEAAEKRVKEVEERATASRKLSVDELANA 89
            + +N+ Q++  Q I E   K+ E  +L Q +    +K +KE   +      + +D  ++
Sbjct: 1852 EEINQFQQSLIQDIYEMENKMREKQQLKQRKTNPKKKVIKEANNKIYKQLSMGMD--SSQ 1909

Query: 90   FWDLSDEDKNAFTGNVKQEVCKVKKI 115
            F++    DK  FT ++  +   +  I
Sbjct: 1910 FYNDQSYDKTNFTPSIVNKSISLAAI 1935

>gi|72382218|ref|YP_291573.1| hypothetical protein PMN2A_0378 [Prochlorococcus marinus str. NATL2A]
 gi|124025767|ref|YP_001014883.1| hypothetical protein NATL1_10601 [Prochlorococcus marinus str. NATL1A]
 gi|72002068|gb|AAZ57870.1| uncharacterized membrane protein [Prochlorococcus marinus str. NATL2A]
 gi|123960835|gb|ABM75618.1| conserved hypothetical protein [Prochlorococcus marinus str.
 NATL1A]
          Length = 215

 Score = 34.3 bits (77), Expect = 5.1, Method: Composition-based stats.
 Identities = 19/64 (29%), Positives = 28/64 (43%), Gaps = 2/64 (3%)

Query: 44  KEANFKISETHRLAQERVEAA--EKRVKEVEERATASRKLSVDELANAFWDLSDEDKNAF 101
           K + FK      L    +     E RV+  +E+A    +L V E    F  L  ED++ F
Sbjct: 85  KNSKFKSFTFSALVDGYISDVYVEDRVENKQEQANQDGRLEVLEKKRTFVILDIEDEDGF 144

Query: 102 TGNV 105
            GN+
Sbjct: 145 IGNI 148

>gi|321250385|ref|XP_003191789.1| hypothetical protein CGB_A9310C [Cryptococcus gattii WM276]
 gi|317458256|gb|ADV20002.1| Conserved hypothetical protein [Cryptococcus gattii WM276]
          Length = 1309

 Score = 34.3 bits (77), Expect = 5.5, Method: Composition-based stats.
 Identities = 24/74 (32%), Positives = 38/74 (51%), Gaps = 6/74 (8%)

Query: 8   TSAFSTTLLTIGGCDIVIGRTEDLLNKLQKNSTQMIKEANFKISETHRLAQERVEAAEKR 67
           TS + T  +    CD+    ++DL  K      Q  ++A  + +     AQ+R++AAE R
Sbjct: 257 TSTYMTLKINSALCDVAADVSKDLSVK------QRQRDAEVRKAGATNAAQKRMQAAEDR 310

Query: 68  VKEVEERATASRKL 81
           VKEV+ER     +L
Sbjct: 311 VKEVQERKQTLEEL 324

>gi|224075467|ref|XP_002304646.1| predicted protein [Populus trichocarpa]
 gi|222842078|gb|EEE79625.1| predicted protein [Populus trichocarpa]
          Length = 555

 Score = 34.3 bits (77), Expect = 5.5, Method: Composition-based stats.
 Identities = 18/60 (30%), Positives = 36/60 (60%), Gaps = 4/60 (6%)

Query: 22  DIVIGRTEDLLNKLQKNSTQ----MIKEANFKISETHRLAQERVEAAEKRVKEVEERATA 77
           ++V   ++D  ++ + ++ +    M++E N  I E  RL +ER + A+ RVKE+E++  A
Sbjct: 194 EVVNFNSKDTGDQHEASALRDELDMLQEENGNILEKLRLEEERCKEADARVKELEKQVAA 253

>gi|261328538|emb|CBH11515.1| 64 kDa invariant surface glycoprotein, putative [Trypanosoma brucei gambiense DAL972]
          Length = 435

 Score = 34.3 bits (77), Expect = 5.6, Method: Composition-based stats.
 Identities = 21/62 (33%), Positives = 32/62 (51%), Gaps = 4/62 (6%)

Query: 28  TEDLLNKLQKNSTQMIKE----ANFKISETHRLAQERVEAAEKRVKEVEERATASRKLSV 83
           + +  +KL    T+ +KE    A  K+SE    A+E  E A KR +EV E A  +R   +
Sbjct: 84  SNNGYSKLSDADTKKVKEIYEKAKGKVSEQLPKAKEFGEEAGKRHQEVTEAAKRARGWGL 143

Query: 84  DE 85
           D+
Sbjct: 144 DD 145

>gi|261328521|emb|CBH11498.1| 65 kDa invariant surface glycoprotein, putative [Trypanosoma brucei gambiense DAL972]
          Length = 431

 Score = 34.3 bits (77), Expect = 6.1, Method: Composition-based stats.
 Identities = 20/62 (32%), Positives = 31/62 (50%), Gaps = 4/62 (6%)

Query: 28  TEDLLNKLQKNSTQMIKE----ANFKISETHRLAQERVEAAEKRVKEVEERATASRKLSV 83
           + D  +KL    T+ +K+    A  K+SE    A+E  E A KR + V E A  +R   +
Sbjct: 84  SNDGYSKLSDADTKKVKDIYEKAKGKVSEQLPKAKEFGEEAGKRCQSVTEAAKKARGWGL 143

Query: 84  DE 85
           D+
Sbjct: 144 DD 145

>gi|261328080|emb|CBH11057.1| 65 kDa invariant surface glycoprotein, putative [Trypanosoma brucei gambiense DAL972]
          Length = 432

 Score = 33.9 bits (76), Expect = 6.5, Method: Composition-based stats.
 Identities = 21/62 (33%), Positives = 32/62 (51%), Gaps = 4/62 (6%)

Query: 28  TEDLLNKLQKNSTQMIKE----ANFKISETHRLAQERVEAAEKRVKEVEERATASRKLSV 83
           + +  +KL    T+ +KE    A  K+SE    A+E  E A KR +EV E A  +R   +
Sbjct: 84  SNNGYSKLSDADTKKVKEIYEKAKGKVSEQLPKAKEFGEEAGKRHQEVTEAAKRARGWGL 143

Query: 84  DE 85
           D+
Sbjct: 144 DD 145

>gi|255948250|ref|XP_002564892.1| Pc22g08800 [Penicillium chrysogenum Wisconsin 54-1255]
 gi|211591909|emb|CAP98168.1| Pc22g08800 [Penicillium chrysogenum Wisconsin 54-1255]
          Length = 481

 Score = 33.9 bits (76), Expect = 7.9, Method: Composition-based stats.
 Identities = 16/78 (20%), Positives = 32/78 (41%), Gaps = 5/78 (6%)

Query: 23  IVIGRTEDLLNKLQKNST---QMIKEANFKI--SETHRLAQERVEAAEKRVKEVEERATA 77
           I+ G ++  L+ +  NS    +  KE    I     H   ++    A + +KE EE +
Sbjct: 399 IIKGASKSPLSSIAGNSRVHRRCTKETKLDIGNESAHARIRDYWLRANQALKEKEEASGG 458

Query: 78  SRKLSVDELANAFWDLSD 95
              L V++    + ++
Sbjct: 459 RDYLDVEDEVFKYVNIDS 476

  Database: nr
    Posted date:  May 22, 2011 12:22 AM
  Number of letters in database: 999,999,966
  Number of sequences in database:  2,987,313

  Database: /data/usr2/db/fasta/nr.01
    Posted date:  May 22, 2011 12:30 AM
  Number of letters in database: 999,999,796
  Number of sequences in database:  2,903,041

  Database: /data/usr2/db/fasta/nr.02
    Posted date:  May 22, 2011 12:36 AM
  Number of letters in database: 999,999,281
  Number of sequences in database:  2,904,016

  Database: /data/usr2/db/fasta/nr.03
    Posted date:  May 22, 2011 12:41 AM
  Number of letters in database: 999,999,960
  Number of sequences in database:  2,935,328

  Database: /data/usr2/db/fasta/nr.04
    Posted date:  May 22, 2011 12:46 AM
  Number of letters in database: 842,794,627
  Number of sequences in database:  2,394,679

Lambda     K      H
   0.315    0.129    0.321

Lambda     K      H
   0.267   0.0394    0.140


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Hits to DB: 711,604,737
Number of Sequences: 14124377
Number of extensions: 19207885
Number of successful extensions: 132023
Number of sequences better than 10.0: 126
Number of HSP's better than 10.0 without gapping: 46
Number of HSP's successfully gapped in prelim test: 80
Number of HSP's that attempted gapping in prelim test: 131898
Number of HSP's gapped (non-prelim): 197
length of query: 121
length of database: 4,842,793,630
effective HSP length: 88
effective length of query: 33
effective length of database: 3,599,848,454
effective search space: 118794998982
effective search space used: 118794998982
T: 11
A: 40
X1: 16 ( 7.3 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 42 (22.1 bits)
S2: 75 (33.6 bits)