BLASTP 2.2.22 [Sep-27-2009]


Reference: Altschul, Stephen F., Thomas L. Madden, Alejandro A. Schaffer, 
Jinghui Zhang, Zheng Zhang, Webb Miller, and David J. Lipman (1997), 
"Gapped BLAST and PSI-BLAST: a new generation of protein database search
programs",  Nucleic Acids Res. 25:3389-3402.


Reference for composition-based statistics:
Schaffer, Alejandro A., L. Aravind, Thomas L. Madden,
Sergei Shavirin, John L. Spouge, Yuri I. Wolf,  
Eugene V. Koonin, and Stephen F. Altschul (2001), 
"Improving the accuracy of PSI-BLAST protein database searches with 
composition-based statistics and other refinements",  Nucleic Acids Res. 29:2994-3005.

Query= gi|254780980|ref|YP_003065393.1| hypothetical protein
CLIBASIA_04405 [Candidatus Liberibacter asiaticus str. psy62]
         (121 letters)

Database: nr 
           14,124,377 sequences; 4,842,793,630 total letters

Searching..................................................done



>gi|254780980|ref|YP_003065393.1| hypothetical protein CLIBASIA_04405 [Candidatus Liberibacter
           asiaticus str. psy62]
 gi|254040657|gb|ACT57453.1| hypothetical protein CLIBASIA_04405 [Candidatus Liberibacter
           asiaticus str. psy62]
          Length = 121

 Score =  173 bits (439), Expect = 6e-42,   Method: Composition-based stats.
 Identities = 121/121 (100%), Positives = 121/121 (100%)

Query: 1   MKAKILMTSAFSTTLLTIGGCDIVIGRTEDLLNKLQKNSTQMIKEANFKISETHRLAQER 60
           MKAKILMTSAFSTTLLTIGGCDIVIGRTEDLLNKLQKNSTQMIKEANFKISETHRLAQER
Sbjct: 1   MKAKILMTSAFSTTLLTIGGCDIVIGRTEDLLNKLQKNSTQMIKEANFKISETHRLAQER 60

Query: 61  VEAAEKRVKEVEERATASRKLSVDELANAFWDLSDEDKNAFTGNVKQEVCKVKKITVPPS 120
           VEAAEKRVKEVEERATASRKLSVDELANAFWDLSDEDKNAFTGNVKQEVCKVKKITVPPS
Sbjct: 61  VEAAEKRVKEVEERATASRKLSVDELANAFWDLSDEDKNAFTGNVKQEVCKVKKITVPPS 120

Query: 121 N 121
           N
Sbjct: 121 N 121


>gi|254780556|ref|YP_003064969.1| hypothetical protein CLIBASIA_02215 [Candidatus Liberibacter
           asiaticus str. psy62]
 gi|254040233|gb|ACT57029.1| hypothetical protein CLIBASIA_02215 [Candidatus Liberibacter
           asiaticus str. psy62]
          Length = 120

 Score =  123 bits (309), Expect = 7e-27,   Method: Composition-based stats.
 Identities = 41/110 (37%), Positives = 67/110 (60%), Gaps = 10/110 (9%)

Query: 1   MKAKILMTSAFSTTLLTIGGCDIVIGRTEDLLNKLQKNSTQMIKEANFKISETHRLAQER 60
           M+AK L+ S   TT +TI GC +V    ED  N+++  S +M+KEA  ++ E H+LA+E 
Sbjct: 1   MRAKTLLASTLVTTAITIIGCSLV----ED--NRIE--SLRMVKEAKMEVLEAHKLAKEY 52

Query: 61  VEAAEKRVKEVEERATAS--RKLSVDELANAFWDLSDEDKNAFTGNVKQE 108
           VE A +RVKE EE++ A   + L +D+L   F +L  +++  F   ++ +
Sbjct: 53  VEQANQRVKEAEEQSNARLLKGLGMDDLVRYFMNLDSQNQAFFIDTIQNK 102


>gi|75911009|ref|YP_325305.1| WD-40 repeat-containing protein [Anabaena variabilis ATCC 29413]
 gi|75704734|gb|ABA24410.1| WD-40 repeat-containing protein [Anabaena variabilis ATCC 29413]
          Length = 1477

 Score = 36.6 bits (83), Expect = 0.99,   Method: Composition-based stats.
 Identities = 19/68 (27%), Positives = 32/68 (47%), Gaps = 7/68 (10%)

Query: 43  IKEANFKISETHRLAQERVEAAEKRVKEV-------EERATASRKLSVDELANAFWDLSD 95
           + +AN K++ T + A+E  + A +RVKE        +E A    K + D++  A  +L  
Sbjct: 496 VNDANVKVANTQKEAKELTDKANQRVKEANIQVANTKEEAKNRTKKADDDVRLALENLGK 555

Query: 96  EDKNAFTG 103
             K A   
Sbjct: 556 ITKQAKID 563


>gi|159486149|ref|XP_001701106.1| hypothetical protein CHLREDRAFT_194202 [Chlamydomonas reinhardtii]
 gi|158272000|gb|EDO97808.1| predicted protein [Chlamydomonas reinhardtii]
          Length = 1430

 Score = 36.6 bits (83), Expect = 1.1,   Method: Composition-based stats.
 Identities = 13/34 (38%), Positives = 23/34 (67%)

Query: 41  QMIKEANFKISETHRLAQERVEAAEKRVKEVEER 74
           +++K     + E  + AQ+RVE AE+RV++ E+R
Sbjct: 70  KLVKAVKEDVEEAQQEAQQRVEKAEQRVEKAEQR 103



 Score = 35.1 bits (79), Expect = 3.5,   Method: Composition-based stats.
 Identities = 17/44 (38%), Positives = 28/44 (63%)

Query: 31  LLNKLQKNSTQMIKEANFKISETHRLAQERVEAAEKRVKEVEER 74
           +L KL K   + ++EA  +  +    A++RVE AE+RV+E E+R
Sbjct: 67  ILQKLVKAVKEDVEEAQQEAQQRVEKAEQRVEKAEQRVEEAEQR 110


>gi|288922780|ref|ZP_06416949.1| hypothetical protein FrEUN1fDRAFT_6647 [Frankia sp. EUN1f]
 gi|288345893|gb|EFC80253.1| hypothetical protein FrEUN1fDRAFT_6647 [Frankia sp. EUN1f]
          Length = 674

 Score = 36.6 bits (83), Expect = 1.2,   Method: Composition-based stats.
 Identities = 22/87 (25%), Positives = 40/87 (45%), Gaps = 5/87 (5%)

Query: 14  TLLTIGGCDIVIGRTEDLLNKLQKNSTQMIKEANFKISETHRLAQERVEAAEKRVKEVEE 73
           T   + G D V+   E    ++Q+++  ++ E       + R A   VE AE+ ++   E
Sbjct: 267 TAKGVDGSDEVVAAQE----RVQESAAAIV-ETQLDGQRSVRDALHAVEQAERSLQSARE 321

Query: 74  RATASRKLSVDELANAFWDLSDEDKNA 100
           ++ AS    VD  A+A  DL  + +  
Sbjct: 322 QSAASAAGGVDAYADALKDLPAKTQAF 348


>gi|255943494|ref|XP_002562515.1| Pc19g00220 [Penicillium chrysogenum Wisconsin 54-1255]
 gi|211587249|emb|CAP79438.1| Pc19g00220 [Penicillium chrysogenum Wisconsin 54-1255]
          Length = 566

 Score = 36.2 bits (82), Expect = 1.5,   Method: Composition-based stats.
 Identities = 17/78 (21%), Positives = 32/78 (41%), Gaps = 5/78 (6%)

Query: 23  IVIGRTEDLLNKLQKNST---QMIKEANFKI--SETHRLAQERVEAAEKRVKEVEERATA 77
           I+ G ++  L+ +  NS    +  KE    I     H   ++    A + +KE EE +  
Sbjct: 484 IIKGASKSPLSSIAGNSRVHRRCTKETKLDIGNDSAHARIRDYWLRANQALKEKEEASGG 543

Query: 78  SRKLSVDELANAFWDLSD 95
              L VD+    + ++  
Sbjct: 544 RDYLDVDDEVFKYVNIDS 561


>gi|270009401|gb|EFA05849.1| hypothetical protein TcasGA2_TC008640 [Tribolium castaneum]
          Length = 4573

 Score = 35.5 bits (80), Expect = 2.2,   Method: Composition-based stats.
 Identities = 29/111 (26%), Positives = 56/111 (50%), Gaps = 6/111 (5%)

Query: 11   FSTTLLTIGGCDIVIGRTEDLLNKLQKNSTQMIKEANFKISETHRLAQERVEAAEK--RV 68
             ++T   + G   V+   E  LN+  + + ++I   +   +E  ++A+E+  A+E+  +V
Sbjct: 3163 LASTSEEVDGLKEVLAVQEVELNEKNEAAGKLIAVLS---AENEKVAKEQAIASEEEAKV 3219

Query: 69   KEVEERATASRKLSVDELANAFWDLSDEDKNAFTGNVKQEVCKVKKITVPP 119
            K +EE  +  +K+  D+LA A   L    +   T N K  + ++K  T PP
Sbjct: 3220 KLIEEDVSVKQKICADDLAKAEPALIAAQQALNTLN-KNNLTELKSFTQPP 3269


>gi|58259447|ref|XP_567136.1| hypothetical protein [Cryptococcus neoformans var. neoformans
           JEC21]
 gi|134107537|ref|XP_777653.1| hypothetical protein CNBA7730 [Cryptococcus neoformans var.
           neoformans B-3501A]
 gi|50260347|gb|EAL23006.1| hypothetical protein CNBA7730 [Cryptococcus neoformans var.
           neoformans B-3501A]
 gi|57223273|gb|AAW41317.1| conserved hypothetical protein [Cryptococcus neoformans var.
           neoformans JEC21]
          Length = 1184

 Score = 35.1 bits (79), Expect = 3.2,   Method: Composition-based stats.
 Identities = 25/74 (33%), Positives = 38/74 (51%), Gaps = 6/74 (8%)

Query: 8   TSAFSTTLLTIGGCDIVIGRTEDLLNKLQKNSTQMIKEANFKISETHRLAQERVEAAEKR 67
           TS + T  +    CD+    ++DL  K      Q  ++A  + +     AQ+RV+AAE R
Sbjct: 257 TSTYVTLRMNSALCDVAADVSKDLSVK------QRQRDAEVRKAGATNAAQKRVKAAEDR 310

Query: 68  VKEVEERATASRKL 81
           VKEV+ER     +L
Sbjct: 311 VKEVQERKQTLEEL 324


>gi|224075020|ref|XP_002304521.1| predicted protein [Populus trichocarpa]
 gi|222841953|gb|EEE79500.1| predicted protein [Populus trichocarpa]
          Length = 822

 Score = 34.7 bits (78), Expect = 4.5,   Method: Composition-based stats.
 Identities = 23/89 (25%), Positives = 38/89 (42%), Gaps = 11/89 (12%)

Query: 39  STQMIKEANFKISETHRLAQERVEAAEKRVKEVEERATASRKLSV-------DELANAFW 91
           S + I E++ K+ +T    QER    EK + E  E  T++R L +       D+  N   
Sbjct: 309 SLKAISESDNKVDQT----QERTLHCEKGMPEQVESMTSTRALPMVMDLTVDDDEINGED 364

Query: 92  DLSDEDKNAFTGNVKQEVCKVKKITVPPS 120
           ++  ED+  F   ++        I   PS
Sbjct: 365 NIDAEDRKPFLATLQNHPVDTNPIPTMPS 393


>gi|52080071|ref|YP_078862.1| inositol monophosphatase SuhB [Bacillus licheniformis ATCC 14580]
 gi|52785445|ref|YP_091274.1| YktC [Bacillus licheniformis ATCC 14580]
 gi|319646154|ref|ZP_08000384.1| YktC protein [Bacillus sp. BT1B_CT2]
 gi|52003282|gb|AAU23224.1| Inositol monophosphatase SuhB [Bacillus licheniformis ATCC 14580]
 gi|52347947|gb|AAU40581.1| YktC [Bacillus licheniformis ATCC 14580]
 gi|317391904|gb|EFV72701.1| YktC protein [Bacillus sp. BT1B_CT2]
          Length = 264

 Score = 34.7 bits (78), Expect = 4.6,   Method: Composition-based stats.
 Identities = 14/57 (24%), Positives = 29/57 (50%), Gaps = 4/57 (7%)

Query: 52  ETHRLAQERVEAAEKRVKEV-EERATASRKLSVDELANAFWDLSDEDKNAFTGNVKQ 107
           E  RLA+  V+ A +R+K+  +E+ T   K + ++L     ++  E +  F   ++ 
Sbjct: 6   EIDRLAKSWVKEAGQRIKQSMKEKMTIETKSNPNDLVT---NIDKETERFFIEKIQS 59


>gi|72389040|ref|XP_844815.1| 65 kDa invariant surface glycoprotein [Trypanosoma brucei TREU927]
 gi|62176336|gb|AAX70448.1| 65 kDa invariant surface glycoprotein, putative [Trypanosoma
           brucei]
 gi|70801349|gb|AAZ11256.1| 65 kDa invariant surface glycoprotein, putative [Trypanosoma brucei
           brucei strain 927/4 GUTat10.1]
          Length = 435

 Score = 34.7 bits (78), Expect = 4.7,   Method: Composition-based stats.
 Identities = 21/62 (33%), Positives = 32/62 (51%), Gaps = 4/62 (6%)

Query: 28  TEDLLNKLQKNSTQMIKE----ANFKISETHRLAQERVEAAEKRVKEVEERATASRKLSV 83
           + +  +KL    T+ +KE    A  K+SE    A+E  E A KR +EV E A  +R   +
Sbjct: 84  SNNGYSKLSDADTKKVKEIYEKAKGKVSEQLPKAKEFGEEAGKRHQEVTEAAKRARGWGL 143

Query: 84  DE 85
           D+
Sbjct: 144 DD 145


>gi|229596643|ref|XP_001007975.2| hypothetical protein TTHERM_01395380 [Tetrahymena thermophila]
 gi|225565189|gb|EAR87730.2| hypothetical protein TTHERM_01395380 [Tetrahymena thermophila SB210]
          Length = 2032

 Score = 34.3 bits (77), Expect = 5.0,   Method: Composition-based stats.
 Identities = 19/86 (22%), Positives = 40/86 (46%), Gaps = 2/86 (2%)

Query: 30   DLLNKLQKNSTQMIKEANFKISETHRLAQERVEAAEKRVKEVEERATASRKLSVDELANA 89
            + +N+ Q++  Q I E   K+ E  +L Q +    +K +KE   +      + +D  ++ 
Sbjct: 1852 EEINQFQQSLIQDIYEMENKMREKQQLKQRKTNPKKKVIKEANNKIYKQLSMGMD--SSQ 1909

Query: 90   FWDLSDEDKNAFTGNVKQEVCKVKKI 115
            F++    DK  FT ++  +   +  I
Sbjct: 1910 FYNDQSYDKTNFTPSIVNKSISLAAI 1935


>gi|72382218|ref|YP_291573.1| hypothetical protein PMN2A_0378 [Prochlorococcus marinus str.
           NATL2A]
 gi|124025767|ref|YP_001014883.1| hypothetical protein NATL1_10601 [Prochlorococcus marinus str.
           NATL1A]
 gi|72002068|gb|AAZ57870.1| uncharacterized membrane protein [Prochlorococcus marinus str.
           NATL2A]
 gi|123960835|gb|ABM75618.1| conserved hypothetical protein [Prochlorococcus marinus str.
           NATL1A]
          Length = 215

 Score = 34.3 bits (77), Expect = 5.1,   Method: Composition-based stats.
 Identities = 19/64 (29%), Positives = 28/64 (43%), Gaps = 2/64 (3%)

Query: 44  KEANFKISETHRLAQERVEAA--EKRVKEVEERATASRKLSVDELANAFWDLSDEDKNAF 101
           K + FK      L    +     E RV+  +E+A    +L V E    F  L  ED++ F
Sbjct: 85  KNSKFKSFTFSALVDGYISDVYVEDRVENKQEQANQDGRLEVLEKKRTFVILDIEDEDGF 144

Query: 102 TGNV 105
            GN+
Sbjct: 145 IGNI 148


>gi|321250385|ref|XP_003191789.1| hypothetical protein CGB_A9310C [Cryptococcus gattii WM276]
 gi|317458256|gb|ADV20002.1| Conserved hypothetical protein [Cryptococcus gattii WM276]
          Length = 1309

 Score = 34.3 bits (77), Expect = 5.5,   Method: Composition-based stats.
 Identities = 24/74 (32%), Positives = 38/74 (51%), Gaps = 6/74 (8%)

Query: 8   TSAFSTTLLTIGGCDIVIGRTEDLLNKLQKNSTQMIKEANFKISETHRLAQERVEAAEKR 67
           TS + T  +    CD+    ++DL  K      Q  ++A  + +     AQ+R++AAE R
Sbjct: 257 TSTYMTLKINSALCDVAADVSKDLSVK------QRQRDAEVRKAGATNAAQKRMQAAEDR 310

Query: 68  VKEVEERATASRKL 81
           VKEV+ER     +L
Sbjct: 311 VKEVQERKQTLEEL 324


>gi|224075467|ref|XP_002304646.1| predicted protein [Populus trichocarpa]
 gi|222842078|gb|EEE79625.1| predicted protein [Populus trichocarpa]
          Length = 555

 Score = 34.3 bits (77), Expect = 5.5,   Method: Composition-based stats.
 Identities = 18/60 (30%), Positives = 36/60 (60%), Gaps = 4/60 (6%)

Query: 22  DIVIGRTEDLLNKLQKNSTQ----MIKEANFKISETHRLAQERVEAAEKRVKEVEERATA 77
           ++V   ++D  ++ + ++ +    M++E N  I E  RL +ER + A+ RVKE+E++  A
Sbjct: 194 EVVNFNSKDTGDQHEASALRDELDMLQEENGNILEKLRLEEERCKEADARVKELEKQVAA 253


>gi|261328538|emb|CBH11515.1| 64 kDa invariant surface glycoprotein, putative [Trypanosoma brucei
           gambiense DAL972]
          Length = 435

 Score = 34.3 bits (77), Expect = 5.6,   Method: Composition-based stats.
 Identities = 21/62 (33%), Positives = 32/62 (51%), Gaps = 4/62 (6%)

Query: 28  TEDLLNKLQKNSTQMIKE----ANFKISETHRLAQERVEAAEKRVKEVEERATASRKLSV 83
           + +  +KL    T+ +KE    A  K+SE    A+E  E A KR +EV E A  +R   +
Sbjct: 84  SNNGYSKLSDADTKKVKEIYEKAKGKVSEQLPKAKEFGEEAGKRHQEVTEAAKRARGWGL 143

Query: 84  DE 85
           D+
Sbjct: 144 DD 145


>gi|261328521|emb|CBH11498.1| 65 kDa invariant surface glycoprotein, putative [Trypanosoma brucei
           gambiense DAL972]
          Length = 431

 Score = 34.3 bits (77), Expect = 6.1,   Method: Composition-based stats.
 Identities = 20/62 (32%), Positives = 31/62 (50%), Gaps = 4/62 (6%)

Query: 28  TEDLLNKLQKNSTQMIKE----ANFKISETHRLAQERVEAAEKRVKEVEERATASRKLSV 83
           + D  +KL    T+ +K+    A  K+SE    A+E  E A KR + V E A  +R   +
Sbjct: 84  SNDGYSKLSDADTKKVKDIYEKAKGKVSEQLPKAKEFGEEAGKRCQSVTEAAKKARGWGL 143

Query: 84  DE 85
           D+
Sbjct: 144 DD 145


>gi|261328080|emb|CBH11057.1| 65 kDa invariant surface glycoprotein, putative [Trypanosoma brucei
           gambiense DAL972]
          Length = 432

 Score = 33.9 bits (76), Expect = 6.5,   Method: Composition-based stats.
 Identities = 21/62 (33%), Positives = 32/62 (51%), Gaps = 4/62 (6%)

Query: 28  TEDLLNKLQKNSTQMIKE----ANFKISETHRLAQERVEAAEKRVKEVEERATASRKLSV 83
           + +  +KL    T+ +KE    A  K+SE    A+E  E A KR +EV E A  +R   +
Sbjct: 84  SNNGYSKLSDADTKKVKEIYEKAKGKVSEQLPKAKEFGEEAGKRHQEVTEAAKRARGWGL 143

Query: 84  DE 85
           D+
Sbjct: 144 DD 145


>gi|255948250|ref|XP_002564892.1| Pc22g08800 [Penicillium chrysogenum Wisconsin 54-1255]
 gi|211591909|emb|CAP98168.1| Pc22g08800 [Penicillium chrysogenum Wisconsin 54-1255]
          Length = 481

 Score = 33.9 bits (76), Expect = 7.9,   Method: Composition-based stats.
 Identities = 16/78 (20%), Positives = 32/78 (41%), Gaps = 5/78 (6%)

Query: 23  IVIGRTEDLLNKLQKNST---QMIKEANFKI--SETHRLAQERVEAAEKRVKEVEERATA 77
           I+ G ++  L+ +  NS    +  KE    I     H   ++    A + +KE EE +  
Sbjct: 399 IIKGASKSPLSSIAGNSRVHRRCTKETKLDIGNESAHARIRDYWLRANQALKEKEEASGG 458

Query: 78  SRKLSVDELANAFWDLSD 95
              L V++    + ++  
Sbjct: 459 RDYLDVEDEVFKYVNIDS 476


  Database: nr
    Posted date:  May 22, 2011 12:22 AM
  Number of letters in database: 999,999,966
  Number of sequences in database:  2,987,313
  
  Database: /data/usr2/db/fasta/nr.01
    Posted date:  May 22, 2011 12:30 AM
  Number of letters in database: 999,999,796
  Number of sequences in database:  2,903,041
  
  Database: /data/usr2/db/fasta/nr.02
    Posted date:  May 22, 2011 12:36 AM
  Number of letters in database: 999,999,281
  Number of sequences in database:  2,904,016
  
  Database: /data/usr2/db/fasta/nr.03
    Posted date:  May 22, 2011 12:41 AM
  Number of letters in database: 999,999,960
  Number of sequences in database:  2,935,328
  
  Database: /data/usr2/db/fasta/nr.04
    Posted date:  May 22, 2011 12:46 AM
  Number of letters in database: 842,794,627
  Number of sequences in database:  2,394,679
  
Lambda     K      H
   0.315    0.129    0.321 

Lambda     K      H
   0.267   0.0394    0.140 


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Hits to DB: 711,604,737
Number of Sequences: 14124377
Number of extensions: 19207885
Number of successful extensions: 132023
Number of sequences better than 10.0: 126
Number of HSP's better than 10.0 without gapping: 46
Number of HSP's successfully gapped in prelim test: 80
Number of HSP's that attempted gapping in prelim test: 131898
Number of HSP's gapped (non-prelim): 197
length of query: 121
length of database: 4,842,793,630
effective HSP length: 88
effective length of query: 33
effective length of database: 3,599,848,454
effective search space: 118794998982
effective search space used: 118794998982
T: 11
A: 40
X1: 16 ( 7.3 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 42 (22.1 bits)
S2: 75 (33.6 bits)
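
The effective lengths and the effective search space in the statistics block above can be reproduced from the raw lengths. The following is a minimal Python sketch of that arithmetic using only values printed in this report. The variable names are illustrative, and the formula (subtract the effective HSP length once from the query and once per database sequence, then multiply) is the usual BLAST length adjustment, assumed here rather than stated by this output.

```python
# Values copied from the statistics block of this report.
query_len = 121              # "length of query"
db_len = 4_842_793_630       # "length of database"
num_seqs = 14_124_377        # "Number of Sequences"
hsp_len = 88                 # "effective HSP length"

# Assumed length adjustment: trim the HSP length from the query once
# and from every database sequence once.
eff_query_len = query_len - hsp_len
eff_db_len = db_len - num_seqs * hsp_len

# Effective search space is the product of the two effective lengths.
search_space = eff_query_len * eff_db_len

print(eff_query_len)   # 33
print(eff_db_len)      # 3599848454
print(search_space)    # 118794998982
```

These three results agree with the "effective length of query", "effective length of database", and "effective search space" lines reported above.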