BLASTP 2.2.22 [Sep-27-2009]

Reference: Altschul, Stephen F., Thomas L. Madden, Alejandro A. Schaffer,
Jinghui Zhang, Zheng Zhang, Webb Miller, and David J. Lipman (1997),
"Gapped BLAST and PSI-BLAST: a new generation of protein database search
programs", Nucleic Acids Res. 25:3389-3402.

Reference for compositional score matrix adjustment: Altschul, Stephen F.,
John C. Wootton, E. Michael Gertz, Richa Agarwala, Aleksandr Morgulis,
Alejandro A. Schaffer, and Yi-Kuo Yu (2005) "Protein database searches
using compositionally adjusted substitution matrices", FEBS J. 272:5101-5109.

Reference for composition-based statistics starting in round 2:
Schaffer, Alejandro A., L. Aravind, Thomas L. Madden, Sergei Shavirin,
John L. Spouge, Yuri I. Wolf, Eugene V. Koonin, and Stephen F. Altschul (2001),
"Improving the accuracy of PSI-BLAST protein database searches with
composition-based statistics and other refinements", Nucleic Acids Res. 29:2994-3005.

Query= gi|254781136|ref|YP_003065549.1| hypothetical protein CLIBASIA_05190 [Candidatus Liberibacter asiaticus str. psy62]
         (114 letters)

Database: nr
           14,124,377 sequences; 4,842,793,630 total letters

Searching..................................................done

Results from round 1

>gi|254781136|ref|YP_003065549.1| hypothetical protein CLIBASIA_05190 [Candidatus Liberibacter asiaticus str. psy62]
 gi|254040813|gb|ACT57609.1| hypothetical protein CLIBASIA_05190 [Candidatus Liberibacter asiaticus str. psy62]
          Length = 114

 Score = 240 bits (612), Expect = 6e-62, Method: Compositional matrix adjust.
 Identities = 114/114 (100%), Positives = 114/114 (100%)

Query: 1    MSQFLRNQYFDSNNLSGEYISSKDQNFLYFPTEKQQLQENAFSILHNFWPTTKGPIMRGG 60
            MSQFLRNQYFDSNNLSGEYISSKDQNFLYFPTEKQQLQENAFSILHNFWPTTKGPIMRGG
Sbjct: 1    MSQFLRNQYFDSNNLSGEYISSKDQNFLYFPTEKQQLQENAFSILHNFWPTTKGPIMRGG 60

Query: 61   CRKICTLDSNSSIISAFSYRSSTQQRVFLANHREIYDVTDASSMQSATVCVCRE 114
            CRKICTLDSNSSIISAFSYRSSTQQRVFLANHREIYDVTDASSMQSATVCVCRE
Sbjct: 61   CRKICTLDSNSSIISAFSYRSSTQQRVFLANHREIYDVTDASSMQSATVCVCRE 114

>gi|315122528|ref|YP_004063017.1| hypothetical protein CKC_03900 [Candidatus Liberibacter solanacearum CLso-ZC1]
 gi|313495930|gb|ADR52529.1| hypothetical protein CKC_03900 [Candidatus Liberibacter solanacearum CLso-ZC1]
          Length = 532

 Score = 136 bits (342), Expect = 1e-30, Method: Compositional matrix adjust.
 Identities = 64/114 (56%), Positives = 84/114 (73%), Gaps = 3/114 (2%)

Query: 1    MSQFLRNQYFDSNNLSGE---YISSKDQNFLYFPTEKQQLQENAFSILHNFWPTTKGPIM 57
            MS ++N+Y + N+ E + S + NFL+FP QQL++N +L NFWPT+ GP +
Sbjct: 1    MSYLMKNRYHNQKNIGTEQSSFFSREQHNFLHFPVGTQQLEDNISCVLRNFWPTSNGPSI 60

Query: 58   RGGCRKICTLDSNSSIISAFSYRSSTQQRVFLANHREIYDVTDASSMQSATVCV 111
            RGGCR+ICTL+ S+IISAFSY SS+QQR+FLANHREIYDVT+AS MQ A C+
Sbjct: 61   RGGCRRICTLNHRSNIISAFSYCSSSQQRIFLANHREIYDVTNASLMQPAIRCL 114

>gi|227822437|ref|YP_002826409.1| hypothetical protein NGR_c18920 [Sinorhizobium fredii NGR234]
 gi|227341438|gb|ACP25656.1| hypothetical protein NGR_c18920 [Sinorhizobium fredii NGR234]
          Length = 536

 Score = 44.7 bits (104), Expect = 0.005, Method: Composition-based stats.
 Identities = 20/56 (35%), Positives = 32/56 (57%)

Query: 43   SILHNFWPTTKGPIMRGGCRKICTLDSNSSIISAFSYRSSTQQRVFLANHREIYDV 98
            ++L NF PT G +RGG RK+ +I SAF Y+ +++F+A IY++
Sbjct: 51   TVLTNFLPTLAGCKIRGGSRKVGLAADGGAIRSAFKYKFGNNEKLFMATATAIYNM 106

>gi|150397022|ref|YP_001327489.1| hypothetical protein Smed_1819 [Sinorhizobium medicae WSM419]
 gi|150028537|gb|ABR60654.1| conserved hypothetical protein [Sinorhizobium medicae WSM419]
          Length = 538

 Score = 42.0 bits (97), Expect = 0.029, Method: Composition-based stats.
 Identities = 19/61 (31%), Positives = 35/61 (57%)

Query: 38   QENAFSILHNFWPTTKGPIMRGGCRKICTLDSNSSIISAFSYRSSTQQRVFLANHREIYD 97
            Q + ++L NF+PT G +RGG ++ I SAF Y+ + +++F+A + IY+
Sbjct: 48   QPGSATVLRNFFPTLMGCKIRGGSQRKGLAADGGDIRSAFKYKYGSNEKLFMATNAGIYN 107

Query: 98   V 98
            +
Sbjct: 108  M 108

>gi|316933870|ref|YP_004108852.1| hypothetical protein Rpdx1_2528 [Rhodopseudomonas palustris DX-1]
 gi|315601584|gb|ADU44119.1| hypothetical protein Rpdx1_2528 [Rhodopseudomonas palustris DX-1]
          Length = 558

 Score = 38.1 bits (87), Expect = 0.43, Method: Composition-based stats.
 Identities = 21/66 (31%), Positives = 34/66 (51%), Gaps = 3/66 (4%)

Query: 39   ENAFSILHNFWPTTKGPIMRGGCRKICTL--DSNSSIISAFSYRSSTQQRVFLANHREIY 96
            + AF +L NF+P G MR G + D + ++S FSY + ++F A +IY
Sbjct: 47   QGAF-VLDNFFPEATGLRMRRGSESYAQVGADGSQPVLSLFSYINGANAKLFAATATDIY 105

Query: 97   DVTDAS 102
            DV+ +
Sbjct: 106  DVSSPA 111

>gi|13470677|ref|NP_102246.1| hypothetical protein mll0452 [Mesorhizobium loti MAFF303099]
 gi|14021419|dbj|BAB48032.1| mll0452 [Mesorhizobium loti MAFF303099]
          Length = 528

 Score = 36.6 bits (83), Expect = 1.3, Method: Composition-based stats.
 Identities = 19/60 (31%), Positives = 33/60 (55%), Gaps = 1/60 (1%)

Query: 44   ILHNFWPTTKGPIMRGGCRKICTLDSNSSIISAFSYRSSTQQRVFLANHREIYDVTDASS 103
            +L N++PT+ G +RGG + T+ + ++S SY +F A+ I+DVT +S
Sbjct: 46   VLENWFPTSTGIRLRGGAARHATIGT-IPVVSMMSYDGPAGSFMFAADGTNIFDVTTPAS 104

>gi|110632595|ref|YP_672803.1| hypothetical protein Meso_0234 [Mesorhizobium sp.
BNC1]
 gi|110283579|gb|ABG61638.1| hypothetical protein Meso_0234 [Chelativorans sp. BNC1]
          Length = 632

 Score = 36.2 bits (82), Expect = 1.4, Method: Composition-based stats.
 Identities = 20/56 (35%), Positives = 29/56 (51%), Gaps = 2/56 (3%)

Query: 44   ILHNFWPTTKGPIMRGGCRKICTLDSNSSIISAFSYRSSTQQRVFLANHREIYDVT 99
            +L NF+PT G I+R G +K L + + S F Y + +F + IYDVT
Sbjct: 47   VLENFFPTATGAILRRGRQKHAELP--TEVRSLFKYVVGNNRHLFASTVTTIYDVT 100

Searching..................................................done

Results from round 2

>gi|315122528|ref|YP_004063017.1| hypothetical protein CKC_03900 [Candidatus Liberibacter solanacearum CLso-ZC1]
 gi|313495930|gb|ADR52529.1| hypothetical protein CKC_03900 [Candidatus Liberibacter solanacearum CLso-ZC1]
          Length = 532

 Score = 207 bits (526), Expect = 6e-52, Method: Composition-based stats.
 Identities = 64/114 (56%), Positives = 84/114 (73%), Gaps = 3/114 (2%)

Query: 1    MSQFLRNQYFDSNNLSGE---YISSKDQNFLYFPTEKQQLQENAFSILHNFWPTTKGPIM 57
            MS ++N+Y + N+ E + S + NFL+FP QQL++N +L NFWPT+ GP +
Sbjct: 1    MSYLMKNRYHNQKNIGTEQSSFFSREQHNFLHFPVGTQQLEDNISCVLRNFWPTSNGPSI 60

Query: 58   RGGCRKICTLDSNSSIISAFSYRSSTQQRVFLANHREIYDVTDASSMQSATVCV 111
            RGGCR+ICTL+ S+IISAFSY SS+QQR+FLANHREIYDVT+AS MQ A C+
Sbjct: 61   RGGCRRICTLNHRSNIISAFSYCSSSQQRIFLANHREIYDVTNASLMQPAIRCL 114

>gi|254781136|ref|YP_003065549.1| hypothetical protein CLIBASIA_05190 [Candidatus Liberibacter asiaticus str. psy62]
 gi|254040813|gb|ACT57609.1| hypothetical protein CLIBASIA_05190 [Candidatus Liberibacter asiaticus str. psy62]
          Length = 114

 Score = 191 bits (486), Expect = 2e-47, Method: Composition-based stats.
 Identities = 114/114 (100%), Positives = 114/114 (100%)

Query: 1    MSQFLRNQYFDSNNLSGEYISSKDQNFLYFPTEKQQLQENAFSILHNFWPTTKGPIMRGG 60
            MSQFLRNQYFDSNNLSGEYISSKDQNFLYFPTEKQQLQENAFSILHNFWPTTKGPIMRGG
Sbjct: 1    MSQFLRNQYFDSNNLSGEYISSKDQNFLYFPTEKQQLQENAFSILHNFWPTTKGPIMRGG 60

Query: 61   CRKICTLDSNSSIISAFSYRSSTQQRVFLANHREIYDVTDASSMQSATVCVCRE 114
            CRKICTLDSNSSIISAFSYRSSTQQRVFLANHREIYDVTDASSMQSATVCVCRE
Sbjct: 61   CRKICTLDSNSSIISAFSYRSSTQQRVFLANHREIYDVTDASSMQSATVCVCRE 114

>gi|150397022|ref|YP_001327489.1| hypothetical protein Smed_1819 [Sinorhizobium medicae WSM419]
 gi|150028537|gb|ABR60654.1| conserved hypothetical protein [Sinorhizobium medicae WSM419]
          Length = 538

 Score = 46.2 bits (108), Expect = 0.002, Method: Composition-based stats.
 Identities = 19/61 (31%), Positives = 35/61 (57%)

Query: 38   QENAFSILHNFWPTTKGPIMRGGCRKICTLDSNSSIISAFSYRSSTQQRVFLANHREIYD 97
            Q + ++L NF+PT G +RGG ++ I SAF Y+ + +++F+A + IY+
Sbjct: 48   QPGSATVLRNFFPTLMGCKIRGGSQRKGLAADGGDIRSAFKYKYGSNEKLFMATNAGIYN 107

Query: 98   V 98
            +
Sbjct: 108  M 108

>gi|227822437|ref|YP_002826409.1| hypothetical protein NGR_c18920 [Sinorhizobium fredii NGR234]
 gi|227341438|gb|ACP25656.1| hypothetical protein NGR_c18920 [Sinorhizobium fredii NGR234]
          Length = 536

 Score = 43.9 bits (102), Expect = 0.007, Method: Composition-based stats.
 Identities = 20/56 (35%), Positives = 32/56 (57%)

Query: 43   SILHNFWPTTKGPIMRGGCRKICTLDSNSSIISAFSYRSSTQQRVFLANHREIYDV 98
            ++L NF PT G +RGG RK+ +I SAF Y+ +++F+A IY++
Sbjct: 51   TVLTNFLPTLAGCKIRGGSRKVGLAADGGAIRSAFKYKFGNNEKLFMATATAIYNM 106

>gi|13470677|ref|NP_102246.1| hypothetical protein mll0452 [Mesorhizobium loti MAFF303099]
 gi|14021419|dbj|BAB48032.1| mll0452 [Mesorhizobium loti MAFF303099]
          Length = 528

 Score = 38.5 bits (88), Expect = 0.36, Method: Composition-based stats.
 Identities = 22/92 (23%), Positives = 41/92 (44%), Gaps = 10/92 (10%)

Query: 21   SSKDQNFLYFPTEKQQLQEN---------AFSILHNFWPTTKGPIMRGGCRKICTLDSNS 71
            + + + + FP + N +L N++PT+ G +RGG + T+ +
Sbjct: 14   AQQRHSTMTFPAPTRGKISNENLAAAGPKGAKVLENWFPTSTGIRLRGGAARHATIGT-I 72

Query: 72   SIISAFSYRSSTQQRVFLANHREIYDVTDASS 103
            ++S SY +F A+ I+DVT +S
Sbjct: 73   PVVSMMSYDGPAGSFMFAADGTNIFDVTTPAS 104

>gi|316933870|ref|YP_004108852.1| hypothetical protein Rpdx1_2528 [Rhodopseudomonas palustris DX-1]
 gi|315601584|gb|ADU44119.1| hypothetical protein Rpdx1_2528 [Rhodopseudomonas palustris DX-1]
          Length = 558

 Score = 38.1 bits (87), Expect = 0.43, Method: Composition-based stats.
 Identities = 19/62 (30%), Positives = 31/62 (50%), Gaps = 2/62 (3%)

Query: 44   ILHNFWPTTKGPIMRGGCRKICTL--DSNSSIISAFSYRSSTQQRVFLANHREIYDVTDA 101
            +L NF+P G MR G + D + ++S FSY + ++F A +IYDV+
Sbjct: 51   VLDNFFPEATGLRMRRGSESYAQVGADGSQPVLSLFSYINGANAKLFAATATDIYDVSSP 110

Query: 102  SS 103
            +
Sbjct: 111  AI 112

>gi|110632595|ref|YP_672803.1| hypothetical protein Meso_0234 [Mesorhizobium sp. BNC1]
 gi|110283579|gb|ABG61638.1| hypothetical protein Meso_0234 [Chelativorans sp. BNC1]
          Length = 632

 Score = 36.5 bits (83), Expect = 1.3, Method: Composition-based stats.
 Identities = 20/56 (35%), Positives = 29/56 (51%), Gaps = 2/56 (3%)

Query: 44   ILHNFWPTTKGPIMRGGCRKICTLDSNSSIISAFSYRSSTQQRVFLANHREIYDVT 99
            +L NF+PT G I+R G +K L + + S F Y + +F + IYDVT
Sbjct: 47   VLENFFPTATGAILRRGRQKHAELP--TEVRSLFKYVVGNNRHLFASTVTTIYDVT 100

  Database: nr
    Posted date: May 22, 2011 12:22 AM
  Number of letters in database: 999,999,966
  Number of sequences in database: 2,987,313

  Database: /data/usr2/db/fasta/nr.01
    Posted date: May 22, 2011 12:30 AM
  Number of letters in database: 999,999,796
  Number of sequences in database: 2,903,041

  Database: /data/usr2/db/fasta/nr.02
    Posted date: May 22, 2011 12:36 AM
  Number of letters in database: 999,999,281
  Number of sequences in database: 2,904,016

  Database: /data/usr2/db/fasta/nr.03
    Posted date: May 22, 2011 12:41 AM
  Number of letters in database: 999,999,960
  Number of sequences in database: 2,935,328

  Database: /data/usr2/db/fasta/nr.04
    Posted date: May 22, 2011 12:46 AM
  Number of letters in database: 842,794,627
  Number of sequences in database: 2,394,679

Lambda     K      H
   0.318    0.138    0.415

Lambda     K      H
   0.267   0.0417    0.140

Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Hits to DB: 1,939,987,013
Number of Sequences: 14124377
Number of extensions: 68276005
Number of successful extensions: 139632
Number of sequences better than 10.0: 9
Number of HSP's better than 10.0 without gapping: 8
Number of HSP's successfully gapped in prelim test: 8
Number of HSP's that attempted gapping in prelim test: 139619
Number of HSP's gapped (non-prelim): 17
length of query: 114
length of database: 4,842,793,630
effective HSP length: 81
effective length of query: 33
effective length of database: 3,698,719,093
effective search space: 122057730069
effective search space used: 122057730069
T: 11
A: 40
X1: 16 ( 7.3 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.6 bits)
S2: 76 (33.9 bits)
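
Note: the footer figures can be cross-checked with a little arithmetic. The sketch below is not part of the BLAST output. It assumes the usual Karlin-Altschul relations (effective lengths obtained by subtracting the effective HSP length, bit score = (lambda * raw - ln K) / ln 2, E = search space * 2^-bits) and assumes that the second "Lambda K H" row holds the gapped parameters, since the report does not label it. The function names are illustrative only, and because compositional adjustment can rescale scores per hit, the computed E-values should be read as approximations of the reported ones.

```python
import math

# Values copied from the statistics footer of the report.
QUERY_LENGTH = 114
DATABASE_LETTERS = 4_842_793_630
DATABASE_SEQUENCES = 14_124_377
EFFECTIVE_HSP_LENGTH = 81

# Assumption: the second "Lambda K H" row holds the gapped parameters.
GAPPED_LAMBDA = 0.267
GAPPED_K = 0.0417


def effective_search_space(query_len, db_letters, db_seqs, hsp_len):
    """Return (effective query length, effective database length, search space)."""
    eff_query = query_len - hsp_len
    # One HSP length is subtracted for every sequence in the database.
    eff_db = db_letters - db_seqs * hsp_len
    return eff_query, eff_db, eff_query * eff_db


def bit_score(raw_score, lam=GAPPED_LAMBDA, k=GAPPED_K):
    """Convert a raw alignment score to a bit score."""
    return (lam * raw_score - math.log(k)) / math.log(2)


def expect_value(raw_score, search_space):
    """Approximate E-value for a raw score in the given search space."""
    return search_space * 2.0 ** (-bit_score(raw_score))


if __name__ == "__main__":
    eff_q, eff_db, space = effective_search_space(
        QUERY_LENGTH, DATABASE_LETTERS, DATABASE_SEQUENCES, EFFECTIVE_HSP_LENGTH
    )
    print(eff_q, eff_db, space)
    # Raw scores taken from three round 1 hits in the report.
    for raw in (612, 342, 104):
        print(raw, round(bit_score(raw), 1), f"{expect_value(raw, space):.0e}")
```

Under these assumptions the three effective-length figures reproduce the footer exactly, and the raw scores 612 and 342 convert to roughly 240 and 136 bits, in line with the round 1 hit lines.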