RPS-BLAST 2.2.26 [Sep-21-2011]
Database: CDD.v3.10
44,354 sequences; 10,937,602 total letters
Searching..................................................done
Query= psy12410
(615 letters)
>gnl|CDD|240273 PTZ00110, PTZ00110, helicase; Provisional.
Length = 545
Score = 341 bits (877), Expect = e-110
Identities = 144/328 (43%), Positives = 216/328 (65%), Gaps = 6/328 (1%)
Query: 265 ASKQKKELSKVDHSTIEYLPFRKDFYVEVPEIARMTPEEVEKYKEELEGIRVKGKGCPRP 324
+S K L +D +I +PF K+FY E PE++ ++ +EV++ ++E E + G+ P+P
Sbjct: 69 SSTLGKRLQPIDWKSINLVPFEKNFYKEHPEVSALSSKEVDEIRKEKEITIIAGENVPKP 128
Query: 325 IKTWAQCGVSKKILDALKKQNYEKPTPIQAQAIPAIMSGRDLIGIAKTGSGKTVAFVLPL 384
+ ++ IL +LK + +PTPIQ Q P +SGRD+IGIA+TGSGKT+AF+LP
Sbjct: 129 VVSFEYTSFPDYILKSLKNAGFTEPTPIQVQGWPIALSGRDMIGIAETGSGKTLAFLLPA 188
Query: 385 LRHILDQPPLEETDGPMAIIMSPTRELCMQIGKEAKKFTKSLGLRVVCVYGGTGISEQIS 444
+ HI QP L DGP+ ++++PTREL QI ++ KF S +R YGG QI
Sbjct: 189 IVHINAQPLLRYGDGPIVLVLAPTRELAEQIREQCNKFGASSKIRNTVAYGGVPKRGQIY 248
Query: 445 ELKRGAEIIVCTPGRMIDMLAANSGRVTNLRRVTYIVLDEADRMFDMGFEPQVMRIIDNV 504
L+RG EI++ PGR+ID L +N VTNLRRVTY+VLDEADRM DMGFEPQ+ +I+ +
Sbjct: 249 ALRRGVEILIACPGRLIDFLESN---VTNLRRVTYLVLDEADRMLDMGFEPQIRKIVSQI 305
Query: 505 RPDRQTVMFSATFPRQMEALARRIL-NKPIEIQVGGRSV-VCKEVEQHVIVLDEEQKMLK 562
RPDRQT+M+SAT+P+++++LAR + +P+ + VG + C ++Q V V++E +K K
Sbjct: 306 RPDRQTLMWSATWPKEVQSLARDLCKEEPVHVNVGSLDLTACHNIKQEVFVVEEHEKRGK 365
Query: 563 LLELLG-IYQDQGSVIVFVDKQENADSL 589
L LL I +D +++FV+ ++ AD L
Sbjct: 366 LKMLLQRIMRDGDKILIFVETKKGADFL 393
>gnl|CDD|238167 cd00268, DEADc, DEAD-box helicases. A diverse family of proteins
involved in ATP-dependent RNA unwinding, needed in a
variety of cellular processes including splicing,
ribosome biogenesis and RNA degradation. The name
derives from the sequence of the Walker B motif (motif
II). This domain contains the ATP-binding region.
Length = 203
Score = 295 bits (757), Expect = 4e-97
Identities = 103/204 (50%), Positives = 146/204 (71%), Gaps = 6/204 (2%)
Query: 332 GVSKKILDALKKQNYEKPTPIQAQAIPAIMSGRDLIGIAKTGSGKTVAFVLPLLRHILDQ 391
G+S ++L + +EKPTPIQA+AIP ++SGRD+IG A+TGSGKT AF++P+L +
Sbjct: 5 GLSPELLRGIYALGFEKPTPIQARAIPPLLSGRDVIGQAQTGSGKTAAFLIPILEKLDPS 64
Query: 392 PPLEETDGPMAIIMSPTRELCMQIGKEAKKFTKSLGLRVVCVYGGTGISEQISELKRGAE 451
P DGP A+I++PTREL +QI + A+K K L+VV +YGGT I +QI +LKRG
Sbjct: 65 PKK---DGPQALILAPTRELALQIAEVARKLGKHTNLKVVVIYGGTSIDKQIRKLKRGPH 121
Query: 452 IIVCTPGRMIDMLAANSGRVTNLRRVTYIVLDEADRMFDMGFEPQVMRIIDNVRPDRQTV 511
I+V TPGR++D+L G++ +L +V Y+VLDEADRM DMGFE Q+ I+ + DRQT+
Sbjct: 122 IVVATPGRLLDLL--ERGKL-DLSKVKYLVLDEADRMLDMGFEDQIREILKLLPKDRQTL 178
Query: 512 MFSATFPRQMEALARRILNKPIEI 535
+FSAT P+++ LAR+ L P+ I
Sbjct: 179 LFSATMPKEVRDLARKFLRNPVRI 202
>gnl|CDD|223587 COG0513, SrmB, Superfamily II DNA and RNA helicases [DNA
replication, recombination, and repair / Transcription /
Translation, ribosomal structure and biogenesis].
Length = 513
Score = 275 bits (706), Expect = 2e-85
Identities = 126/296 (42%), Positives = 180/296 (60%), Gaps = 11/296 (3%)
Query: 298 RMTPEEVEKYKEELEGIRVKGKGCPRPIKTWAQCGVSKKILDALKKQNYEKPTPIQAQAI 357
+ K + +G + +A G+S ++L ALK +E+PTPIQ AI
Sbjct: 1 LAREDYDRFVKLKSAHNVALSRGEEKTPPEFASLGLSPELLQALKDLGFEEPTPIQLAAI 60
Query: 358 PAIMSGRDLIGIAKTGSGKTVAFVLPLLRHILDQPPLEETDGPMAIIMSPTRELCMQIGK 417
P I++GRD++G A+TG+GKT AF+LPLL+ IL E A+I++PTREL +QI +
Sbjct: 61 PLILAGRDVLGQAQTGTGKTAAFLLPLLQKILKSV---ERKYVSALILAPTRELAVQIAE 117
Query: 418 EAKKFTKSL-GLRVVCVYGGTGISEQISELKRGAEIIVCTPGRMIDMLAANSGRVTNLRR 476
E +K K+L GLRV VYGG I +QI LKRG +I+V TPGR++D++ +L
Sbjct: 118 ELRKLGKNLGGLRVAVVYGGVSIRKQIEALKRGVDIVVATPGRLLDLIKRGKL---DLSG 174
Query: 477 VTYIVLDEADRMFDMGFEPQVMRIIDNVRPDRQTVMFSATFPRQMEALARRILNKPIEIQ 536
V +VLDEADRM DMGF + +I+ + PDRQT++FSAT P + LARR LN P+EI+
Sbjct: 175 VETLVLDEADRMLDMGFIDDIEKILKALPPDRQTLLFSATMPDDIRELARRYLNDPVEIE 234
Query: 537 VGGRSVVC--KEVEQHVI-VLDEEQKMLKLLELLGIYQDQGSVIVFVDKQENADSL 589
V + K+++Q + V EE+K+ LL+LL D+G VIVFV + + L
Sbjct: 235 VSVEKLERTLKKIKQFYLEVESEEEKLELLLKLLKDE-DEGRVIVFVRTKRLVEEL 289
>gnl|CDD|215832 pfam00270, DEAD, DEAD/DEAH box helicase. Members of this family
include the DEAD and DEAH box helicases. Helicases are
involved in unwinding nucleic acids. The DEAD box
helicases are involved in various aspects of RNA
metabolism, including nuclear transcription, pre-mRNA
splicing, ribosome biogenesis, nucleocytoplasmic
transport, translation, RNA decay and organellar gene
expression.
Length = 169
Score = 212 bits (542), Expect = 7e-66
Identities = 87/176 (49%), Positives = 123/176 (69%), Gaps = 8/176 (4%)
Query: 350 TPIQAQAIPAIMSGRDLIGIAKTGSGKTVAFVLPLLRHILDQPPLEETDGPMAIIMSPTR 409
TPIQAQAIPAI+SG+D++ A TGSGKT+AF+LP+L+ +L + GP A++++PTR
Sbjct: 1 TPIQAQAIPAILSGKDVLVQAPTGSGKTLAFLLPILQALLPKK-----GGPQALVLAPTR 55
Query: 410 ELCMQIGKEAKKFTKSLGLRVVCVYGGTGISEQISELKRG-AEIIVCTPGRMIDMLAANS 468
EL QI +E KK K LGLRV + GGT + EQ +LK+G A+I+V TPGR++D+L
Sbjct: 56 ELAEQIYEELKKLFKILGLRVALLTGGTSLKEQARKLKKGKADILVGTPGRLLDLL--RR 113
Query: 469 GRVTNLRRVTYIVLDEADRMFDMGFEPQVMRIIDNVRPDRQTVMFSATFPRQMEAL 524
G++ L+ + +VLDEA R+ DMGF + I+ + PDRQ ++ SAT PR +E L
Sbjct: 114 GKLKLLKNLKLLVLDEAHRLLDMGFGDDLEEILSRLPPDRQILLLSATLPRNLEDL 169
>gnl|CDD|236977 PRK11776, PRK11776, ATP-dependent RNA helicase DbpA; Provisional.
Length = 460
Score = 208 bits (533), Expect = 8e-61
Identities = 97/261 (37%), Positives = 151/261 (57%), Gaps = 23/261 (8%)
Query: 338 LDALKKQNYEKPTPIQAQAIPAIMSGRDLIGIAKTGSGKTVAFVLPLLRHILDQPPLEET 397
L L + Y + TPIQAQ++PAI++G+D+I AKTGSGKT AF L LL+ +
Sbjct: 16 LANLNELGYTEMTPIQAQSLPAILAGKDVIAQAKTGSGKTAAFGLGLLQKL--------- 66
Query: 398 D----GPMAIIMSPTRELCMQIGKEAKKFTKSL-GLRVVCVYGGTGISEQISELKRGAEI 452
D A+++ PTREL Q+ KE ++ + + ++V+ + GG + QI L+ GA I
Sbjct: 67 DVKRFRVQALVLCPTRELADQVAKEIRRLARFIPNIKVLTLCGGVPMGPQIDSLEHGAHI 126
Query: 453 IVCTPGRMIDMLAANSGRVTNLRRVTYIVLDEADRMFDMGFEPQVMRIIDNVRPDRQTVM 512
IV TPGR++D L + +L + +VLDEADRM DMGF+ + II RQT++
Sbjct: 127 IVGTPGRILDHLRKGT---LDLDALNTLVLDEADRMLDMGFQDAIDAIIRQAPARRQTLL 183
Query: 513 FSATFPRQMEALARRILNKPIEIQVGGRSVVCKEVEQHVIVLDEEQKMLKLLELLGIYQD 572
FSAT+P + A+++R P+E++V + +EQ + ++++ L LL +Q
Sbjct: 184 FSATYPEGIAAISQRFQRDPVEVKVESTHDLPA-IEQRFYEVSPDERLPALQRLLLHHQP 242
Query: 573 QGSVIVF----VDKQENADSL 589
+ S +VF + QE AD+L
Sbjct: 243 E-SCVVFCNTKKECQEVADAL 262
>gnl|CDD|236722 PRK10590, PRK10590, ATP-dependent RNA helicase RhlE; Provisional.
Length = 456
Score = 208 bits (530), Expect = 1e-60
Identities = 107/259 (41%), Positives = 171/259 (66%), Gaps = 5/259 (1%)
Query: 332 GVSKKILDALKKQNYEKPTPIQAQAIPAIMSGRDLIGIAKTGSGKTVAFVLPLLRHILDQ 391
G+S IL A+ +Q Y +PTPIQ QAIPA++ GRDL+ A+TG+GKT F LPLL+H++ +
Sbjct: 7 GLSPDILRAVAEQGYREPTPIQQQAIPAVLEGRDLMASAQTGTGKTAGFTLPLLQHLITR 66
Query: 392 PPLEETDGPM-AIIMSPTRELCMQIGKEAKKFTKSLGLRVVCVYGGTGISEQISELKRGA 450
P + P+ A+I++PTREL QIG+ + ++K L +R + V+GG I+ Q+ +L+ G
Sbjct: 67 QPHAKGRRPVRALILTPTRELAAQIGENVRDYSKYLNIRSLVVFGGVSINPQMMKLRGGV 126
Query: 451 EIIVCTPGRMIDMLAANSGRVTNLRRVTYIVLDEADRMFDMGFEPQVMRIIDNVRPDRQT 510
+++V TPGR++D+ N+ L +V +VLDEADRM DMGF + R++ + RQ
Sbjct: 127 DVLVATPGRLLDLEHQNA---VKLDQVEILVLDEADRMLDMGFIHDIRRVLAKLPAKRQN 183
Query: 511 VMFSATFPRQMEALARRILNKPIEIQVGGRSVVCKEVEQHVIVLDEEQKMLKLLELLGIY 570
++FSATF ++ALA ++L+ P+EI+V R+ ++V QHV +D+++K L +++G
Sbjct: 184 LLFSATFSDDIKALAEKLLHNPLEIEVARRNTASEQVTQHVHFVDKKRKRELLSQMIGKG 243
Query: 571 QDQGSVIVFVDKQENADSL 589
Q V+VF + A+ L
Sbjct: 244 NWQ-QVLVFTRTKHGANHL 261
>gnl|CDD|215103 PLN00206, PLN00206, DEAD-box ATP-dependent RNA helicase;
Provisional.
Length = 518
Score = 209 bits (533), Expect = 2e-60
Identities = 125/353 (35%), Positives = 193/353 (54%), Gaps = 18/353 (5%)
Query: 247 EYSSEEEQEDLTS---TAANLASKQKKELSKVDHSTIEYLPFRKD-FYVEVPEI-ARMTP 301
EY +E +D+ S A L K ++ V + LP + FYV P + ++
Sbjct: 39 EYICDETDDDICSLECKQALLRRVAKSRVA-VGAPKPKRLPATDECFYVRDPGSTSGLSS 97
Query: 302 EEVEKYKEELEGIRVKGKGCPRPIKTWAQCGVSKKILDALKKQNYEKPTPIQAQAIPAIM 361
+ E + +LE I VKG+ P PI +++ CG+ K+L L+ YE PTPIQ QAIPA +
Sbjct: 98 SQAELLRRKLE-IHVKGEAVPPPILSFSSCGLPPKLLLNLETAGYEFPTPIQMQAIPAAL 156
Query: 362 SGRDLIGIAKTGSGKTVAFVLPLLRHI----LDQPPLEETDGPMAIIMSPTRELCMQIGK 417
SGR L+ A TGSGKT +F++P++ P E P+A++++PTRELC+Q+
Sbjct: 157 SGRSLLVSADTGSGKTASFLVPIISRCCTIRSGHPS--EQRNPLAMVLTPTRELCVQVED 214
Query: 418 EAKKFTKSLGLRVVCVYGGTGISEQISELKRGAEIIVCTPGRMIDMLAANSGRVTNLRRV 477
+AK K L + V GG + +Q+ +++G E+IV TPGR+ID+L+ + L V
Sbjct: 215 QAKVLGKGLPFKTALVVGGDAMPQQLYRIQQGVELIVGTPGRLIDLLSKHD---IELDNV 271
Query: 478 TYIVLDEADRMFDMGFEPQVMRIIDNVRPDRQTVMFSATFPRQMEALARRILNKPIEIQV 537
+ +VLDE D M + GF QVM+I + Q ++FSAT ++E A + I I +
Sbjct: 272 SVLVLDEVDCMLERGFRDQVMQIFQAL-SQPQVLLFSATVSPEVEKFASSLAKDIILISI 330
Query: 538 GGRSVVCKEVEQHVIVLDEEQKMLKLLELLGIYQD-QGSVIVFVDKQENADSL 589
G + K V+Q I ++ +QK KL ++L Q + +VFV + AD L
Sbjct: 331 GNPNRPNKAVKQLAIWVETKQKKQKLFDILKSKQHFKPPAVVFVSSRLGADLL 383
>gnl|CDD|214692 smart00487, DEXDc, DEAD-like helicases superfamily.
Length = 201
Score = 191 bits (488), Expect = 1e-57
Identities = 80/212 (37%), Positives = 122/212 (57%), Gaps = 13/212 (6%)
Query: 341 LKKQNYEKPTPIQAQAIPAIMSG-RDLIGIAKTGSGKTVAFVLPLLRHILDQPPLEETDG 399
++K +E P Q +AI A++SG RD+I A TGSGKT+A +LP L + G
Sbjct: 1 IEKFGFEPLRPYQKEAIEALLSGLRDVILAAPTGSGKTLAALLPALEALKRGK------G 54
Query: 400 PMAIIMSPTRELCMQIGKEAKKFTKSLGLRVVCVYGGTGISEQISELKRG-AEIIVCTPG 458
+++ PTREL Q +E KK SLGL+VV +YGG EQ+ +L+ G +I+V TPG
Sbjct: 55 GRVLVLVPTRELAEQWAEELKKLGPSLGLKVVGLYGGDSKREQLRKLESGKTDILVTTPG 114
Query: 459 RMIDMLAANSGRVTNLRRVTYIVLDEADRMFDMGFEPQVMRIIDNVRPDRQTVMFSATFP 518
R++D+L + +L V ++LDEA R+ D GF Q+ +++ + + Q ++ SAT P
Sbjct: 115 RLLDLLENDK---LSLSNVDLVILDEAHRLLDGGFGDQLEKLLKLLPKNVQLLLLSATPP 171
Query: 519 RQMEALARRILNKPIEIQVGGRSVVCKEVEQH 550
++E L LN P+ I VG + +EQ
Sbjct: 172 EEIENLLELFLNDPVFIDVGFT--PLEPIEQF 201
>gnl|CDD|236877 PRK11192, PRK11192, ATP-dependent RNA helicase SrmB; Provisional.
Length = 434
Score = 172 bits (439), Expect = 9e-48
Identities = 93/266 (34%), Positives = 144/266 (54%), Gaps = 7/266 (2%)
Query: 326 KTWAQCGVSKKILDALKKQNYEKPTPIQAQAIPAIMSGRDLIGIAKTGSGKTVAFVLPLL 385
T+++ + + +L+AL+ + Y +PT IQA+AIP + GRD++G A TG+GKT AF+LP L
Sbjct: 1 TTFSELELDESLLEALQDKGYTRPTAIQAEAIPPALDGRDVLGSAPTGTGKTAAFLLPAL 60
Query: 386 RHILDQPPLEETDGPMAI-IMSPTRELCMQIGKEAKKFTKSLGLRVVCVYGGTGISEQIS 444
+H+LD P GP I I++PTREL MQ+ +A++ K L + + GG
Sbjct: 61 QHLLDFP--RRKSGPPRILILTPTRELAMQVADQARELAKHTHLDIATITGGVAYMNHAE 118
Query: 445 ELKRGAEIIVCTPGRMIDMLAANSGRVTNLRRVTYIVLDEADRMFDMGFEPQVMRIIDNV 504
+I+V TPGR++ + + + R V ++LDEADRM DMGF + I
Sbjct: 119 VFSENQDIVVATPGRLLQYIKEEN---FDCRAVETLILDEADRMLDMGFAQDIETIAAET 175
Query: 505 RPDRQTVMFSATFP-RQMEALARRILNKPIEIQVGGRSVVCKEVEQHVIVLDEEQKMLKL 563
R +QT++FSAT ++ A R+LN P+E++ K++ Q D+ + L
Sbjct: 176 RWRKQTLLFSATLEGDAVQDFAERLLNDPVEVEAEPSRRERKKIHQWYYRADDLEHKTAL 235
Query: 564 LELLGIYQDQGSVIVFVDKQENADSL 589
L L + IVFV +E L
Sbjct: 236 LCHLLKQPEVTRSIVFVRTRERVHEL 261
>gnl|CDD|235314 PRK04837, PRK04837, ATP-dependent RNA helicase RhlB; Provisional.
Length = 423
Score = 171 bits (435), Expect = 3e-47
Identities = 85/253 (33%), Positives = 142/253 (56%), Gaps = 21/253 (8%)
Query: 326 KTWAQCGVSKKILDALKKQNYEKPTPIQAQAIPAIMSGRDLIGIAKTGSGKTVAFVLPLL 385
+ ++ + ++++AL+K+ + TPIQA A+P ++GRD+ G A+TG+GKT+AF+
Sbjct: 8 QKFSDFALHPQVVEALEKKGFHNCTPIQALALPLTLAGRDVAGQAQTGTGKTMAFLTATF 67
Query: 386 RHILDQPPLE--ETDGPMAIIMSPTRELCMQIGKEAKKFTKSLGLRVVCVYGGTGISEQI 443
++L P E + + P A+IM+PTREL +QI +A+ ++ GL++ YGG G +Q+
Sbjct: 68 HYLLSHPAPEDRKVNQPRALIMAPTRELAVQIHADAEPLAQATGLKLGLAYGGDGYDKQL 127
Query: 444 SELKRGAEIIVCTPGRMIDMLAANSGRVTNLRRVTYIVLDEADRMFDMGFEPQVMRIIDN 503
L+ G +I++ T GR+ID N NL + +VLDEADRMFD+GF I +
Sbjct: 128 KVLESGVDILIGTTGRLIDYAKQN---HINLGAIQVVVLDEADRMFDLGF-------IKD 177
Query: 504 VR------PD---RQTVMFSATFPRQMEALARRILNKPIEIQVGGRSVVCKEVEQHVIVL 554
+R P R ++FSAT ++ LA +N P ++V +++ +
Sbjct: 178 IRWLFRRMPPANQRLNMLFSATLSYRVRELAFEHMNNPEYVEVEPEQKTGHRIKEELFYP 237
Query: 555 DEEQKMLKLLELL 567
E+KM L L+
Sbjct: 238 SNEEKMRLLQTLI 250
>gnl|CDD|236941 PRK11634, PRK11634, ATP-dependent RNA helicase DeaD; Provisional.
Length = 629
Score = 166 bits (423), Expect = 3e-44
Identities = 85/212 (40%), Positives = 131/212 (61%), Gaps = 9/212 (4%)
Query: 327 TWAQCGVSKKILDALKKQNYEKPTPIQAQAIPAIMSGRDLIGIAKTGSGKTVAFVLPLLR 386
T+A G+ IL+AL YEKP+PIQA+ IP +++GRD++G+A+TGSGKT AF LPLL
Sbjct: 7 TFADLGLKAPILEALNDLGYEKPSPIQAECIPHLLNGRDVLGMAQTGSGKTAAFSLPLLH 66
Query: 387 HILDQPPLEETDGPMAIIMSPTRELCMQIGKEAKKFTKSL-GLRVVCVYGGTGISEQISE 445
++ E P ++++PTREL +Q+ + F+K + G+ VV +YGG Q+
Sbjct: 67 NL-----DPELKAPQILVLAPTRELAVQVAEAMTDFSKHMRGVNVVALYGGQRYDVQLRA 121
Query: 446 LKRGAEIIVCTPGRMIDMLAANSGRVTNLRRVTYIVLDEADRMFDMGFEPQVMRIIDNVR 505
L++G +I+V TPGR++D L + ++ L +VLDEAD M MGF V I+ +
Sbjct: 122 LRQGPQIVVGTPGRLLDHLKRGTLDLSKLSG---LVLDEADEMLRMGFIEDVETIMAQIP 178
Query: 506 PDRQTVMFSATFPRQMEALARRILNKPIEIQV 537
QT +FSAT P + + RR + +P E+++
Sbjct: 179 EGHQTALFSATMPEAIRRITRRFMKEPQEVRI 210
>gnl|CDD|234938 PRK01297, PRK01297, ATP-dependent RNA helicase RhlB; Provisional.
Length = 475
Score = 162 bits (412), Expect = 8e-44
Identities = 96/240 (40%), Positives = 140/240 (58%), Gaps = 9/240 (3%)
Query: 350 TPIQAQAIPAIMSGRDLIGIAKTGSGKTVAFVLPLLRHILDQPPLEE--TDGPMAIIMSP 407
TPIQAQ + ++G D IG A+TG+GKT AF++ ++ +L PP +E P A+I++P
Sbjct: 111 TPIQAQVLGYTLAGHDAIGRAQTGTGKTAAFLISIINQLLQTPPPKERYMGEPRALIIAP 170
Query: 408 TRELCMQIGKEAKKFTKSLGLRVVCVYGGTGISEQISELK-RGAEIIVCTPGRMIDMLAA 466
TREL +QI K+A TK GL V+ GG +Q+ +L+ R +I+V TPGR++D
Sbjct: 171 TRELVVQIAKDAAALTKYTGLNVMTFVGGMDFDKQLKQLEARFCDILVATPGRLLDF--- 227
Query: 467 NSGRVTNLRRVTYIVLDEADRMFDMGFEPQVMRIIDNVRP--DRQTVMFSATFPRQMEAL 524
N +L V +VLDEADRM DMGF PQV +II +RQT++FSATF + L
Sbjct: 228 NQRGEVHLDMVEVMVLDEADRMLDMGFIPQVRQIIRQTPRKEERQTLLFSATFTDDVMNL 287
Query: 525 ARRILNKPIEIQVGGRSVVCKEVEQHVIVLDEEQKMLKLLELLGIYQDQGSVIVFVDKQE 584
A++ P +++ +V VEQHV + K KLL L V+VF ++++
Sbjct: 288 AKQWTTDPAIVEIEPENVASDTVEQHVYAVAGSDKY-KLLYNLVTQNPWERVMVFANRKD 346
>gnl|CDD|235307 PRK04537, PRK04537, ATP-dependent RNA helicase RhlB; Provisional.
Length = 572
Score = 158 bits (401), Expect = 1e-41
Identities = 89/235 (37%), Positives = 141/235 (60%), Gaps = 6/235 (2%)
Query: 337 ILDALKKQNYEKPTPIQAQAIPAIMSGRDLIGIAKTGSGKTVAFVLPLLRHILDQPPL-- 394
+L L+ + + TPIQA +P + G D+ G A+TG+GKT+AF++ ++ +L +P L
Sbjct: 20 LLAGLESAGFTRCTPIQALTLPVALPGGDVAGQAQTGTGKTLAFLVAVMNRLLSRPALAD 79
Query: 395 EETDGPMAIIMSPTRELCMQIGKEAKKFTKSLGLRVVCVYGGTGISEQISELKRGAEIIV 454
+ + P A+I++PTREL +QI K+A KF LGLR VYGG +Q L++G ++I+
Sbjct: 80 RKPEDPRALILAPTRELAIQIHKDAVKFGADLGLRFALVYGGVDYDKQRELLQQGVDVII 139
Query: 455 CTPGRMIDMLAANSGRVTNLRRVTYIVLDEADRMFDMGFEPQVMRIIDNV--RPDRQTVM 512
TPGR+ID + + +V +L VLDEADRMFD+GF + ++ + R RQT++
Sbjct: 140 ATPGRLIDYVKQH--KVVSLHACEICVLDEADRMFDLGFIKDIRFLLRRMPERGTRQTLL 197
Query: 513 FSATFPRQMEALARRILNKPIEIQVGGRSVVCKEVEQHVIVLDEEQKMLKLLELL 567
FSAT ++ LA +N+P ++ V ++ V Q + +E+K LL LL
Sbjct: 198 FSATLSHRVLELAYEHMNEPEKLVVETETITAARVRQRIYFPADEEKQTLLLGLL 252
>gnl|CDD|238005 cd00046, DEXDc, DEAD-like helicases superfamily. A diverse family
of proteins involved in ATP-dependent RNA or DNA
unwinding. This domain contains the ATP-binding region.
Length = 144
Score = 146 bits (371), Expect = 2e-41
Identities = 52/154 (33%), Positives = 87/154 (56%), Gaps = 10/154 (6%)
Query: 364 RDLIGIAKTGSGKTVAFVLPLLRHILDQPPLEETDGPMAIIMSPTRELCMQIGKEAKKFT 423
RD++ A TGSGKT+A +LP+L + G ++++PTREL Q+ + K+
Sbjct: 1 RDVLLAAPTGSGKTLAALLPILELLDSL------KGGQVLVLAPTRELANQVAERLKELF 54
Query: 424 KSLGLRVVCVYGGTGISEQISELKRGAEIIVCTPGRMIDMLAANSGRVTNLRRVTYIVLD 483
G++V + GGT I +Q L +I+V TPGR++D L +L+++ ++LD
Sbjct: 55 G-EGIKVGYLIGGTSIKQQEKLLSGKTDIVVGTPGRLLDELERLK---LSLKKLDLLILD 110
Query: 484 EADRMFDMGFEPQVMRIIDNVRPDRQTVMFSATF 517
EA R+ + GF ++I+ + DRQ ++ SAT
Sbjct: 111 EAHRLLNQGFGLLGLKILLKLPKDRQVLLLSATP 144
>gnl|CDD|185609 PTZ00424, PTZ00424, helicase 45; Provisional.
Length = 401
Score = 131 bits (332), Expect = 2e-33
Identities = 78/236 (33%), Positives = 123/236 (52%), Gaps = 9/236 (3%)
Query: 332 GVSKKILDALKKQNYEKPTPIQAQAIPAIMSGRDLIGIAKTGSGKTVAFVLPLLRHILDQ 391
+++ +L + +EKP+ IQ + I I+ G D IG A++G+GKT FV+ L+ I
Sbjct: 34 KLNEDLLRGIYSYGFEKPSAIQQRGIKPILDGYDTIGQAQSGTGKTATFVIAALQLI--D 91
Query: 392 PPLEETDGPMAIIMSPTRELCMQIGKEAKKFTKSLGLRVVCVYGGTGISEQISELKRGAE 451
L A+I++PTREL QI K L +R GGT + + I++LK G
Sbjct: 92 YDLNACQ---ALILAPTRELAQQIQKVVLALGDYLKVRCHACVGGTVVRDDINKLKAGVH 148
Query: 452 IIVCTPGRMIDMLAANSGRVTNLRRVTYIVLDEADRMFDMGFEPQVMRIIDNVRPDRQTV 511
++V TPGR+ DM+ RV +L+ +LDEAD M GF+ Q+ + + PD Q
Sbjct: 149 MVVGTPGRVYDMIDKRHLRVDDLK---LFILDEADEMLSRGFKGQIYDVFKKLPPDVQVA 205
Query: 512 MFSATFPRQMEALARRILNKPIEIQVGGRSVVCKEVEQHVIVLD-EEQKMLKLLEL 566
+FSAT P ++ L + + P I V + + + Q + ++ EE K L +L
Sbjct: 206 LFSATMPNEILELTTKFMRDPKRILVKKDELTLEGIRQFYVAVEKEEWKFDTLCDL 261
>gnl|CDD|224122 COG1201, Lhr, Lhr-like helicases [General function prediction
only].
Length = 814
Score = 76.1 bits (188), Expect = 2e-14
Identities = 49/142 (34%), Positives = 76/142 (53%), Gaps = 1/142 (0%)
Query: 343 KQNYEKPTPIQAQAIPAIMSGRDLIGIAKTGSGKTVAFVLPLLRHILDQPPLEETDGPMA 402
K+ + TP Q AIP I SG +++ IA TGSGKT A LP++ +L + DG A
Sbjct: 17 KRKFTSLTPPQRYAIPEIHSGENVLIIAPTGSGKTEAAFLPVINELLSLGKGKLEDGIYA 76
Query: 403 IIMSPTRELCMQIGKEAKKFTKSLGLRVVCVYGGTGISEQISELKRGAEIIVCTPGRMID 462
+ +SP + L I + ++ + LG+ V +G T SE+ LK I++ TP +
Sbjct: 77 LYISPLKALNNDIRRRLEEPLRELGIEVAVRHGDTPQSEKQKMLKNPPHILITTPESLAI 136
Query: 463 MLAANSGRVTNLRRVTYIVLDE 484
+L + R LR V Y+++DE
Sbjct: 137 LLNSPKFR-ELLRDVRYVIVDE 157
>gnl|CDD|224126 COG1205, COG1205, Distinct helicase family with a unique C-terminal
domain including a metal-binding cysteine cluster
[General function prediction only].
Length = 851
Score = 66.7 bits (163), Expect = 2e-11
Identities = 43/172 (25%), Positives = 68/172 (39%), Gaps = 17/172 (9%)
Query: 319 KGCPRPIKTWAQCGVSKKILDALKKQNYEKPTPIQAQAIPAIMSGRDLIGIAKTGSGKTV 378
G + + AL K E+ Q A+ I GR+++ TGSGKT
Sbjct: 45 PGKTSEFPELRD----ESLKSALVKAGIERLYSHQVDALRLIREGRNVVVTTGTGSGKTE 100
Query: 379 AFVLPLLRHILDQPPLEETDGPMAIIMSPTRELCMQIGKEAKKFTKSLGLRVVC-VYGGT 437
+F+LP+L H+L P A+++ PT L + ++ L +V Y G
Sbjct: 101 SFLLPILDHLLRDP------SARALLLYPTNALANDQAERLRELISDLPGKVTFGRYTGD 154
Query: 438 GISEQISELKRGAEIIVCTPGRMIDMLAANSGRVTN----LRRVTYIVLDEA 485
E+ + R I+ T M+ L LR + Y+V+DE
Sbjct: 155 TPPEERRAIIRNPPDILLTNPDMLHYLLLR--NHDAWLWLLRNLKYLVVDEL 204
>gnl|CDD|233496 TIGR01622, SF-CC1, splicing factor, CC1-like family. This model
represents a subfamily of RNA splicing factors including
the Pad-1 protein (N. crassa), CAPER (M. musculus) and
CC1.3 (H. sapiens). These proteins are characterized by
an N-terminal arginine-rich, low complexity domain
followed by three (or in the case of 4 H. sapiens
paralogs, two) RNA recognition domains (rrm: pfam00706).
These splicing factors are closely related to the U2AF
splicing factor family (TIGR01642). A homologous gene
from Plasmodium falciparum was identified in the course
of the analysis of that genome at TIGR and was included
in the seed.
Length = 457
Score = 64.9 bits (158), Expect = 5e-11
Identities = 42/145 (28%), Positives = 61/145 (42%), Gaps = 14/145 (9%)
Query: 6 RKRSRSRSPSPSHKRPKESRRDKDRDRRRRSRSHERRSERDRDRDLER---RKEKSRGSK 62
R R R R R R DK R+R RR RS R RDRD R + +SR
Sbjct: 1 RYRDRERGRL----RNDTRRSDKGRERSRRRSRSRDRSRRRRDRDYYRGRRGRSRSRSPN 56
Query: 63 RRSRSREAERSKDHSKKEEKDKREKEEEEAAFDPSKLDKEVEATRLELEMQKRRDRIERW 122
R R R + ++ R +E + + D+ V +L L+ ++ RD E +
Sbjct: 57 RYYRPRGDRSYRRDDRRS---GRNTKEPLT--EAERDDRTVFVLQLALKARE-RDLYEFF 110
Query: 123 RAERKKKDIETIKKDIKSNLSSGLG 147
K +D++ I KD S S G+
Sbjct: 111 SKVGKVRDVQCI-KDRNSRRSKGVA 134
Score = 37.2 bits (86), Expect = 0.025
Identities = 20/47 (42%), Positives = 27/47 (57%), Gaps = 1/47 (2%)
Query: 3 RSRRKRSRSRSPSPSHKRPKESRRDKDRDRRRRSRSHERRSERDRDR 49
R RR RSRSRSP+ + RP+ R + DRR + E +E +RD
Sbjct: 44 RGRRGRSRSRSPN-RYYRPRGDRSYRRDDRRSGRNTKEPLTEAERDD 89
>gnl|CDD|233503 TIGR01642, U2AF_lg, U2 snRNP auxilliary factor, large subunit,
splicing factor. These splicing factors consist of an
N-terminal arginine-rich low complexity domain followed
by three tandem RNA recognition motifs (pfam00076). The
well-characterized members of this family are auxiliary
components of the U2 small nuclear ribonucleoprotein
splicing factor (U2AF). These proteins are closely
related to the CC1-like subfamily of splicing factors
(TIGR01622). Members of this subfamily are found in
plants, metazoa and fungi.
Length = 509
Score = 58.8 bits (142), Expect = 5e-09
Identities = 42/152 (27%), Positives = 64/152 (42%), Gaps = 18/152 (11%)
Query: 3 RSRRKRSRSRSPSPSHKRPKESRRDKDRDRRRRSRSHERR-SERDRDRDLERRKEKSRGS 61
R++SR R S +RP+ RD+ R R R RS ER E R RD R +S S
Sbjct: 6 DREREKSRGRDRDRSSERPRRRSRDRSRFRDRHRRSRERSYREDSRPRDRRRYDSRSPRS 65
Query: 62 KRRSRSREAERSKDHSKKEEKD-KREKEEEEAAFDPSKLDKEVEATRLELEMQKRRDRIE 120
R S R RS+D ++ + + ++ D S + Q R+D +
Sbjct: 66 LRYSSVR---RSRDRPRRRSRSVRSIEQHRRRLRDRSPSN------------QWRKDDKK 110
Query: 121 RWRAERKKKDIETIKKD-IKSNLSSGLGGSAP 151
R + K E + D K++ + G+AP
Sbjct: 111 RSLWDIKPPGYELVTADQAKASQVFSVPGTAP 142
Score = 50.3 bits (120), Expect = 2e-06
Identities = 30/118 (25%), Positives = 49/118 (41%), Gaps = 5/118 (4%)
Query: 26 RDKDRDR-RRRSRSHERRSERDRDRDLERRKEKSRGSKRRSRSR---EAERSKDHSKKEE 81
RD++ DR R +SR +R +R R R + + R RRSR R E R +D + +
Sbjct: 1 RDEEPDREREKSRGRDRDRSSERPRRRSRDRSRFRDRHRRSRERSYREDSRPRDRRRYDS 60
Query: 82 KDKREKEEEEAAFDPSKLDKEVEATRLELEMQKRRDRIERWRAERKKKDIETIKKDIK 139
+ R + + + +E +RR R + +K D + DIK
Sbjct: 61 RSPRSLRYSSVRRSRDRPRRR-SRSVRSIEQHRRRLRDRSPSNQWRKDDKKRSLWDIK 117
>gnl|CDD|219953 pfam08648, DUF1777, Protein of unknown function (DUF1777). This is
a family of eukaryotic proteins of unknown function.
Some of the proteins in this family are putative nucleic
acid binding proteins.
Length = 158
Score = 54.5 bits (131), Expect = 1e-08
Identities = 39/109 (35%), Positives = 53/109 (48%), Gaps = 4/109 (3%)
Query: 9 SRSRSPSPSHKRPKESRRDKDRDRRRRSRSHER---RSERDRDRDLERRKEKSRGSKRRS 65
RSRS SP R + R +DR RRR RS R R R R R R + + RS
Sbjct: 2 GRSRSRSPRRSRRRGRSRSRDRRERRRERSRSRERDRRRRSRSRSPHRSRRSRSPRRHRS 61
Query: 66 RSREAERSKDHSKKEEKDKREKEEEEAAFDPSKLDKEVEATRLELEMQK 114
RSR R +D ++ +KD RE ++ E + D E ++ E+EM K
Sbjct: 62 RSRSPSRRRDRKRERDKDAREPKKRERQKLIKEEDLEGKSDE-EVEMMK 109
Score = 48.3 bits (115), Expect = 1e-06
Identities = 28/89 (31%), Positives = 41/89 (46%), Gaps = 3/89 (3%)
Query: 3 RSRRKRSRSRSPSPSHKRPKESRRDKDRDRRRRSRSHERRSERDRDRDLERRKEKSRGSK 62
R RR+R R RS S R RR + R R RS R R R R RR+++ R
Sbjct: 21 RDRRERRRERSRSRERDR---RRRSRSRSPHRSRRSRSPRRHRSRSRSPSRRRDRKRERD 77
Query: 63 RRSRSREAERSKDHSKKEEKDKREKEEEE 91
+ +R + + K+E+ + + EE E
Sbjct: 78 KDAREPKKRERQKLIKEEDLEGKSDEEVE 106
Score = 45.6 bits (108), Expect = 1e-05
Identities = 37/88 (42%), Positives = 48/88 (54%), Gaps = 3/88 (3%)
Query: 5 RRKRSRSRSPSPSHKRPKESR-RDKDRDRRRRSRSH-ERRSERDRDRDLERRKEKSRGSK 62
RR R R RS S + + R R ++RDRRRRSRS RS R R R + +S S+
Sbjct: 10 RRSRRRGRSRSRDRRERRRERSRSRERDRRRRSRSRSPHRSRRSRSPRRHRSRSRSP-SR 68
Query: 63 RRSRSREAERSKDHSKKEEKDKREKEEE 90
RR R RE ++ KK E+ K KEE+
Sbjct: 69 RRDRKRERDKDAREPKKRERQKLIKEED 96
>gnl|CDD|237171 PRK12678, PRK12678, transcription termination factor Rho;
Provisional.
Length = 672
Score = 48.7 bits (117), Expect = 6e-06
Identities = 24/126 (19%), Positives = 47/126 (37%), Gaps = 2/126 (1%)
Query: 3 RSRRKRSRSRSPSPSHKRPKESRRDKDRDRRRRSRSHERRSERDRDRDLERRKEKSRGSK 62
+ R+ + + +R + RR DR+ + ER +R RD + R + R +
Sbjct: 152 PATEARADAAERTEEEERDERRRRGDREDRQAEAERGERGRREERGRDGDDRDRRDRREQ 211
Query: 63 RRSRSRE--AERSKDHSKKEEKDKREKEEEEAAFDPSKLDKEVEATRLELEMQKRRDRIE 120
R + ++ +D+R+ ++ D D + R ++ RDR
Sbjct: 212 GDRREERGRRDGGDRRGRRRRRDRRDARGDDNREDRGDRDGDDGEGRGGRRGRRFRDRDR 271
Query: 121 RWRAER 126
R R
Sbjct: 272 RGRRGG 277
Score = 48.0 bits (115), Expect = 1e-05
Identities = 18/89 (20%), Positives = 33/89 (37%)
Query: 3 RSRRKRSRSRSPSPSHKRPKESRRDKDRDRRRRSRSHERRSERDRDRDLERRKEKSRGSK 62
R R + R + E + R E R ER R D E R+ ++ +
Sbjct: 130 RRERGEAARRGAARKAGEGGEQPATEARADAAERTEEEERDERRRRGDREDRQAEAERGE 189
Query: 63 RRSRSREAERSKDHSKKEEKDKREKEEEE 91
R R D +++ +++ ++ EE
Sbjct: 190 RGRREERGRDGDDRDRRDRREQGDRREER 218
Score = 44.5 bits (106), Expect = 2e-04
Identities = 15/90 (16%), Positives = 33/90 (36%), Gaps = 1/90 (1%)
Query: 3 RSRRKRSRSRSPSPSHKRPKESRRDKDRDRRRRSRSHERRSERDRDRD-LERRKEKSRGS 61
+ R+ + ++ + E+R D R RR DR R E+ R
Sbjct: 135 EAARRGAARKAGEGGEQPATEARADAAERTEEEERDERRRRGDREDRQAEAERGERGRRE 194
Query: 62 KRRSRSREAERSKDHSKKEEKDKREKEEEE 91
+R + +R + + +++R + +
Sbjct: 195 ERGRDGDDRDRRDRREQGDRREERGRRDGG 224
Score = 40.7 bits (96), Expect = 0.002
Identities = 18/88 (20%), Positives = 35/88 (39%), Gaps = 3/88 (3%)
Query: 3 RSRRKRSRSRSPSPSHKRPKESRRDKDRDRRRRSRSHERRSERDRDRDLERRKEKSRGSK 62
+ + + + E +++RD RRR E R + RR+E+ R
Sbjct: 142 ARKAGEGGEQPATEARADAAERTEEEERDERRRRGDREDRQAEAERGERGRREERGRDGD 201
Query: 63 RRSRSREAERSKDHSKKEEKDKREKEEE 90
R R E+ ++EE+ +R+ +
Sbjct: 202 DRDRRDRREQG---DRREERGRRDGGDR 226
Score = 40.3 bits (95), Expect = 0.003
Identities = 22/87 (25%), Positives = 32/87 (36%), Gaps = 2/87 (2%)
Query: 3 RSRRKRSRSRSPSPSHKRPKESRRDKDRDRRRRSRSHERRSERDRDRDLERRKEKSRGSK 62
R + R + RR +DR R + E R +RD D R + R +
Sbjct: 208 RREQGDRREERGRRDGGDRRGRRRRRDRRDARGDDNREDRGDRDGDDGEGRGGRRGR--R 265
Query: 63 RRSRSREAERSKDHSKKEEKDKREKEE 89
R R R R D + E + RE +
Sbjct: 266 FRDRDRRGRRGGDGGNEREPELREDDV 292
Score = 38.7 bits (91), Expect = 0.010
Identities = 17/82 (20%), Positives = 34/82 (41%), Gaps = 1/82 (1%)
Query: 5 RRKRSRSRSPSPSHKRPKESRRDKDRDRRRRSRSHERRSERDRDRDLERRKEKSRGSKRR 64
++ + + + + +R +E RD+ R R R ER E R RR
Sbjct: 148 GGEQPATEARADAAERTEEEERDERRRRGDREDRQAEA-ERGERGRREERGRDGDDRDRR 206
Query: 65 SRSREAERSKDHSKKEEKDKRE 86
R + +R ++ +++ D+R
Sbjct: 207 DRREQGDRREERGRRDGGDRRG 228
Score = 34.9 bits (81), Expect = 0.13
Identities = 14/92 (15%), Positives = 36/92 (39%), Gaps = 1/92 (1%)
Query: 2 VRSRRKRSRSRSPSPSHKRPKESRRDKDRDRRRRSRSHERRSERDRDRDLERRKEKSRGS 61
++ + + + + + + R+RR R + R + R E+ ++R
Sbjct: 100 AKAEAAPAARAAAAAAAEAASAPEAAQARERRERGEAARRGAARKAGEGGEQPATEARAD 159
Query: 62 KRRSRSREAERSKDHSKKEEKDKREKEEEEAA 93
R+ E ER + + + +D++ + E
Sbjct: 160 -AAERTEEEERDERRRRGDREDRQAEAERGER 190
>gnl|CDD|173412 PTZ00121, PTZ00121, MAEBL; Provisional.
Length = 2084
Score = 48.2 bits (114), Expect = 1e-05
Identities = 57/299 (19%), Positives = 124/299 (41%), Gaps = 32/299 (10%)
Query: 4 SRRKRSRSRSPSPSHKRPKESRRDKDRDRRRRSRSHERRSERDRDRDLERRKEKSRGSKR 63
+++K ++ + + K+ E+++ ++ + +++E + D ++ E+ + +
Sbjct: 1495 AKKKADEAKKAAEAKKKADEAKKAEEA----KKADEAKKAEEAKKADEAKKAEEKKKADE 1550
Query: 64 RSRSREAERSKDHSKKEEKDKREKEEEEAAFDPSKLDKEVEATRLELEM----------- 112
++ E +++++ K EE K+ +E++ A ++ K+ E R+E M
Sbjct: 1551 LKKAEELKKAEEKKKAEEA-KKAEEDKNMALRKAEEAKKAEEARIEEVMKLYEEEKKMKA 1609
Query: 113 -QKRRDRIERWRAERKKKDIETIKK--DIKSNLSSGLGGSAPMKKWNLEDD-------SD 162
+ ++ + +AE KK E KK +K + + +KK E+
Sbjct: 1610 EEAKKAEEAKIKAEELKKAEEEKKKVEQLKKKEAEEKKKAEELKKAEEENKIKAAEEAKK 1669
Query: 163 EDENDNKDENGKTAEEDIDPLDAFMQGVHEEMRKVNKPAVPTTADVKPADSGSKPAGVVI 222
+E+ K E K AEED ++ EE +K + + K A+ K
Sbjct: 1670 AEEDKKKAEEAKKAEEDEKKAAEALKKEAEEAKKAEELKKKEAEEKKKAEELKKAEEEN- 1728
Query: 223 VTGVVKKSVEKAKGELMEENQDGLEYSSEEEQEDLTSTAANLASKQKKELSKVDHSTIE 281
K E+AK E E+ + E +EE++ + K+ +E+ K + IE
Sbjct: 1729 -----KIKAEEAKKEAEEDKKKAEEAKKDEEEKKKIAHLKKEEEKKAEEIRKEKEAVIE 1782
Score = 45.1 bits (106), Expect = 1e-04
Identities = 57/295 (19%), Positives = 125/295 (42%), Gaps = 8/295 (2%)
Query: 19 KRPKESRRDKDRDRRRRSRSHE--RRSERDRDRDLERRKEKSRGSKRRSRSREAERSKDH 76
++ +E+R+ +D R +R E R++E R + ++ E +R ++ ++ E +++D
Sbjct: 1140 RKAEEARKAEDAKRVEIARKAEDARKAEEARKAEDAKKAEAARKAEEVRKAEELRKAEDA 1199
Query: 77 SKKEEKDKREKEEEEAAFDPSKLDKEVEATRLELEMQKRRDRIERWRAERKKKDIETIKK 136
K E K E+E + ++ K+ EA + E +K + ++ ER ++I ++
Sbjct: 1200 RKAEAARKAEEERKAEEARKAEDAKKAEAVKKAEEAKKDAEEAKKAEEERNNEEIRKFEE 1259
Query: 137 DIKSNLSSGLGGSAPMKKWNLEDDSDEDENDNKDENGKTAEEDIDPLDAFMQGVHEEMRK 196
++ + + ++ +E DE K E+ D + EE +K
Sbjct: 1260 ARMAHFARRQAAIKAEEARKADELKKAEEKKKADEAKKA--EEKKKADEAKKKA-EEAKK 1316
Query: 197 VNKPAVPTTADVKPADSGSKPAGVVIVTGVVKKSVEKAKGELMEENQDGLEYSSEEEQED 256
++ K AD+ K A K+ +A + E ++ E + ++++E
Sbjct: 1317 ADEAKKKAEEAKKKADAAKKKAEEAKKAAEAAKAEAEAAADEAEAAEEKAEAAEKKKEEA 1376
Query: 257 LTSTAANLASKQKKELSKVDHSTIEYLPFRKDF-YVEVPEIARMTPEEVEKYKEE 310
A+ A K+ +E K D + + +K ++ A+ +E +K EE
Sbjct: 1377 --KKKADAAKKKAEEKKKADEAKKKAEEDKKKADELKKAAAAKKKADEAKKKAEE 1429
Score = 45.1 bits (106), Expect = 1e-04
Identities = 39/196 (19%), Positives = 89/196 (45%), Gaps = 7/196 (3%)
Query: 5 RRKRSRSRSPSPSHKRPKESRRDKDRDRRRRSRSHERRSERDRDRDLERRKEKSRGSKRR 64
++K + + K+ E + + + + ++ +++E D+ + E +K + +++
Sbjct: 1632 KKKVEQLKKKEAEEKKKAEELKKAEEENKIKAAEEAKKAEEDKKKAEEAKKAEE--DEKK 1689
Query: 65 SRSREAERSKDHSKKEEKDKREKEEEEAAFDPSKLDKEVEATRLELEMQKRRDRIERWRA 124
+ + +++ K EE K+E EE++ A +L K E +++ E K+ ++ +A
Sbjct: 1690 AAEALKKEAEEAKKAEELKKKEAEEKKKA---EELKKAEEENKIKAEEAKKEAEEDKKKA 1746
Query: 125 ERKKKDIETIKK--DIKSNLSSGLGGSAPMKKWNLEDDSDEDENDNKDENGKTAEEDIDP 182
E KKD E KK +K K+ +E++ DE++ + E K ++ D
Sbjct: 1747 EEAKKDEEEKKKIAHLKKEEEKKAEEIRKEKEAVIEEELDEEDEKRRMEVDKKIKDIFDN 1806
Query: 183 LDAFMQGVHEEMRKVN 198
++G E +N
Sbjct: 1807 FANIIEGGKEGNLVIN 1822
Score = 44.4 bits (104), Expect = 2e-04
Identities = 55/280 (19%), Positives = 109/280 (38%), Gaps = 17/280 (6%)
Query: 6 RKRSRSRSPSPSHKRPKESRR-DKDRDRRRRSRSHERRSERDRDRDLERRKEKSRGSKRR 64
+K + + K +E+++ +++R+ + E R R + E++R +
Sbjct: 1224 KKAEAVKKAEEAKKDAEEAKKAEEERNNEEIRKFEEARMAHFARRQAAIKAEEARKADEL 1283
Query: 65 SRSREAERSKDHSKKEEKDKRE---------KEEEEAAFDPSKLDKEVEATRLELEMQKR 115
++ E +++ + K EEK K + K+ +EA + K+ +A + + E K+
Sbjct: 1284 KKAEEKKKADEAKKAEEKKKADEAKKKAEEAKKADEAKKKAEEAKKKADAAKKKAEEAKK 1343
Query: 116 RDRIERWRAERKKKDIETIKKDIKSNLSSGLGGSAPMKKWNLEDDSDEDENDNKDENGKT 175
+ AE + E ++ ++ KK +E DE K
Sbjct: 1344 AAEAAKAEAEAAADEAEAAEEKAEAAEKK----KEEAKKKADAAKKKAEEKKKADEAKKK 1399
Query: 176 AEEDIDPLDAFMQGVHEEMRKVNKPAVPTTADVKPADSGSKPAGVVIVTGVVKKSVEKA- 234
AEED D + +K A + K AD K A KK E+A
Sbjct: 1400 AEEDKKKADELKKA--AAAKKKADEAKKKAEEKKKADEAKKKAEEAKKADEAKKKAEEAK 1457
Query: 235 KGELMEENQDGLEYSSEEEQEDLTSTAANLASKQKKELSK 274
K E ++ + + + E +++ + A+ A K+ +E K
Sbjct: 1458 KAEEAKKKAEEAKKADEAKKKAEEAKKADEAKKKAEEAKK 1497
Score = 43.2 bits (101), Expect = 5e-04
Identities = 65/316 (20%), Positives = 120/316 (37%), Gaps = 20/316 (6%)
Query: 4 SRRKRSRSRSPSPSHKRPKESRRDKDRDRRRRSRSHERRSERDRDRDLERRKEKSRGSKR 63
+++K ++ + K+ E+ + + + + E ++E + E +K+ K+
Sbjct: 1327 AKKKADAAKKKAEEAKKAAEAAKAEAEAAADEAEAAEEKAEAAEKKKEEAKKKADAAKKK 1386
Query: 64 RSRSREAERSKDHSKKEEKDKREKEE----EEAAFDPSKLDKEVEATRLELEMQKRRDRI 119
++A+ +K KK E+DK++ +E A + K+ E + E +K+ +
Sbjct: 1387 AEEKKKADEAK---KKAEEDKKKADELKKAAAAKKKADEAKKKAEEKKKADEAKKKAEE- 1442
Query: 120 ERWRAERKKKDIETIKKDIKSNLSSGLGGSAPMKKWNLEDDSDEDENDNKDENGKTAEED 179
+ E KKK E K + A K E +E DE K AEE
Sbjct: 1443 AKKADEAKKKAEEAKK-------AEEAKKKAEEAKKADEAKKKAEEAKKADEAKKKAEEA 1495
Query: 180 IDPLDAFMQGVHEEMR--KVNKPAVPTTADVKPADSGSKPAGVVIVTGVVKKSVEKAKGE 237
D + + + + K AD +K A KK+ E K E
Sbjct: 1496 KKKADEAKKAAEAKKKADEAKKAEEAKKADEAKKAEEAKKADEAKKAEEKKKADELKKAE 1555
Query: 238 LM---EENQDGLEYSSEEEQEDLTSTAANLASKQKKELSKVDHSTIEYLPFRKDFYVEVP 294
+ EE + E EE +++ A A K ++ + E K +
Sbjct: 1556 ELKKAEEKKKAEEAKKAEEDKNMALRKAEEAKKAEEARIEEVMKLYEEEKKMKAEEAKKA 1615
Query: 295 EIARMTPEEVEKYKEE 310
E A++ EE++K +EE
Sbjct: 1616 EEAKIKAEELKKAEEE 1631
Score = 42.1 bits (98), Expect = 9e-04
Identities = 51/258 (19%), Positives = 100/258 (38%), Gaps = 7/258 (2%)
Query: 19 KRPKESRRDKDRDRRRRSRSHE--RRSERDRDRDLERRKEKSRGSKRRSRSREAERSKDH 76
K+ + +R+ ++ + R E R++E R + ER+ E++R ++ ++ +++++
Sbjct: 1176 KKAEAARKAEEVRKAEELRKAEDARKAEAARKAEEERKAEEARKAEDAKKAEAVKKAEEA 1235
Query: 77 SKKEEKDKREKEEEEAAFDPSKLDKEVEA----TRLELEMQKRRDRIERWRAERKKKDIE 132
K E + ++ EEE + K ++ A + ++ ++ R E +AE KKK E
Sbjct: 1236 KKDAE-EAKKAEEERNNEEIRKFEEARMAHFARRQAAIKAEEARKADELKKAEEKKKADE 1294
Query: 133 TIKKDIKSNLSSGLGGSAPMKKWNLEDDSDEDENDNKDENGKTAEEDIDPLDAFMQGVHE 192
K + K + KK + E+ D K AEE +A
Sbjct: 1295 AKKAEEKKKADEAKKKAEEAKKADEAKKKAEEAKKKADAAKKKAEEAKKAAEAAKAEAEA 1354
Query: 193 EMRKVNKPAVPTTADVKPADSGSKPAGVVIVTGVVKKSVEKAKGELMEENQDGLEYSSEE 252
+ A K + K A KK ++AK + E+ + E
Sbjct: 1355 AADEAEAAEEKAEAAEKKKEEAKKKADAAKKKAEEKKKADEAKKKAEEDKKKADELKKAA 1414
Query: 253 EQEDLTSTAANLASKQKK 270
+ A A ++KK
Sbjct: 1415 AAKKKADEAKKKAEEKKK 1432
Score = 42.1 bits (98), Expect = 0.001
Identities = 56/276 (20%), Positives = 102/276 (36%), Gaps = 7/276 (2%)
Query: 3 RSRRKRSRSRSPSPSHKRPKESRRDKDRDRRRRSRSHERRSERDRDRDLERRKEKSRGSK 62
++ K+ + K+ + + K + ++ + ++ E + D ++K +
Sbjct: 1285 KAEEKKKADEAKKAEEKKKADEAKKKAEEAKKADEAKKKAEEAKKKADAAKKKAEEAKKA 1344
Query: 63 RRSRSREAERSKDHSKKEEKDKR--EKEEEEAAFDPSKLDKEVEATRLELEMQKRRDRIE 120
+ EAE + D ++ E+ EK++EEA K+ E + E +K+ + +
Sbjct: 1345 AEAAKAEAEAAADEAEAAEEKAEAAEKKKEEAKKKADAAKKKAEEKKKADEAKKKAEEDK 1404
Query: 121 RWRAERKKKDIETIKKDIKSNLSSGLGGSAPMKKWNLEDDSDEDENDNKDENGKTAEEDI 180
+ E KK K D + + KK E+ DE K E K AEE
Sbjct: 1405 KKADELKKAAAAKKKADEAKKKAEEKKKADEAKK-KAEEAKKADEAKKKAEEAKKAEEAK 1463
Query: 181 DPLDAFMQGVHEEMRKVNKPAVPTTADVKPADSGSKPAGVVIVTGVVKKSVEKAKGELME 240
+ + +E +K + A K A+ K A KK ++AK E
Sbjct: 1464 KKAEEAKKA--DEAKKKAEEAKKADEAKKKAEEAKKKADEAKKAAEAKKKADEAKKA--E 1519
Query: 241 ENQDGLEYSSEEEQEDLTSTAANLASKQKKELSKVD 276
E + E EE + K+ EL K +
Sbjct: 1520 EAKKADEAKKAEEAKKADEAKKAEEKKKADELKKAE 1555
Score = 37.4 bits (86), Expect = 0.024
Identities = 44/219 (20%), Positives = 92/219 (42%), Gaps = 14/219 (6%)
Query: 3 RSRRKRSRSRSPSPSHKRPKESRRDKDRDRRRRSRSHERRSERDRDRDLERRKEKSRGSK 62
++ R ++ ++ K + + +++E ++ + + +K+++ K
Sbjct: 1588 KAEEARIEEVMKLYEEEKKMKAEEAKKAEEAKIKAEELKKAEEEKKKVEQLKKKEAEEKK 1647
Query: 63 RRSRSREAE-----RSKDHSKKEEKDKREKEEEEAAFDPSKLDKEVEATRLELEMQKRRD 117
+ ++AE ++ + +KK E+DK++ EE + A + K K EA + E E K+ +
Sbjct: 1648 KAEELKKAEEENKIKAAEEAKKAEEDKKKAEEAKKAEEDEK--KAAEALKKEAEEAKKAE 1705
Query: 118 RIERWRAERKKKDIETIK-KDIKSNLSSGLGGSAPMKKWNLEDDSDEDENDNK-----DE 171
+++ AE KKK E K ++ + A K E+ ++E K E
Sbjct: 1706 ELKKKEAEEKKKAEELKKAEEENKIKAEEAKKEAEEDKKKAEEAKKDEEEKKKIAHLKKE 1765
Query: 172 NGKTAEEDIDPLDAFM-QGVHEEMRKVNKPAVPTTADVK 209
K AEE +A + + + EE K D+
Sbjct: 1766 EEKKAEEIRKEKEAVIEEELDEEDEKRRMEVDKKIKDIF 1804
>gnl|CDD|220648 pfam10243, MIP-T3, Microtubule-binding protein MIP-T3. This
protein, which interacts with both microtubules and
TRAF3 (tumour necrosis factor receptor-associated factor
3), is conserved from worms to humans. The N-terminal
region is the microtubule binding domain and is
well-conserved; the C-terminal 100 residues, also
well-conserved, constitute the coiled-coil region which
binds to TRAF3. The central region of the protein is
rich in lysine and glutamic acid and carries KKE motifs
which may also be necessary for tubulin-binding, but
this region is the least well-conserved.
Length = 506
Score = 47.2 bits (112), Expect = 2e-05
Identities = 39/268 (14%), Positives = 90/268 (33%), Gaps = 22/268 (8%)
Query: 7 KRSRSRSPSPSHKRPKESRRDKDRDRRRRSRSHERRSERDRDRDLERRKEKSRGSKRRSR 66
K ++ P + +E +++ ++ +++ + + +DR + ++E + +
Sbjct: 91 KTKPAKEPKNESGKEEEKEKEQVKEEKKKKKEKPKEEPKDR----KPKEEAKEKRPPKEK 146
Query: 67 SREAERSKDHSKKEEKDK---REKEEEEAAFDPSKLDKEVEATRLELEMQKRRDRIERWR 123
+E E+ + + E++K R + + P K + E E Q++ R
Sbjct: 147 EKEKEKKVEEPRDREEEKKRERVRAKSRPKKPPKKKPPNKKKEPPEEEKQRQAARE---- 202
Query: 124 AERKKKDIETIKKDIKSNLSSGLGGSAPMKKWNLEDDSDEDENDNKDENGKTAEEDIDPL 183
K K E + + ED+S + ++ + + D P
Sbjct: 203 -AVKGKPEEPDVNEEREKEEDDGKDRETTTSPMEEDESRQSSEISRRSSSSLKKPDPSPS 261
Query: 184 DAFMQ------GVHEEMRKVNKPAVPTTADVKPADSGSKPAGVVIVTGVVKKSVEKAKGE 237
A + R +P A +PA K +V V + +
Sbjct: 262 MASPETRESSKRTETRPRTSLRPPSARPASARPAPPRVKRKEIVTVLQDAQGVGKIVSNV 321
Query: 238 LME----ENQDGLEYSSEEEQEDLTSTA 261
++E E++D + E + A
Sbjct: 322 ILEGKKSEDEDDENFVVEAAAQAPDIVA 349
Score = 37.2 bits (86), Expect = 0.027
Identities = 27/170 (15%), Positives = 51/170 (30%), Gaps = 5/170 (2%)
Query: 3 RSRRKRSRSRSPSPSHKRPKESRRDKDRDRRRRSRSHERRSERDRDRDLERRKEKSRGSK 62
++ ++ + P K+ + +R + ER E D +D E +
Sbjct: 179 PKKKPPNKKKEPPEEEKQRQAAREAVKGKPEEPDVNEEREKEEDDGKDRETTTSPMEEDE 238
Query: 63 RRSRSREAERSKDHSKKEEKDKREKEEEEAAFDPSKLDKEVEATRLELEMQKRRDRIERW 122
R S E S+ S +K S E R
Sbjct: 239 SRQSS---EISRRSSSSLKKPDPSPSMASPETRESSKRTETRPRTSLRPPSARPASARPA 295
Query: 123 RAERKKKDIETIKKDIKSNLSSGLGGSAPMKKWNLEDDSDEDENDNKDEN 172
K+K+I T+ +D + + + ++ ED+ DE+
Sbjct: 296 PPRVKRKEIVTVLQD--AQGVGKIVSNVILEGKKSEDEDDENFVVEAAAQ 343
Score = 33.3 bits (76), Expect = 0.44
Identities = 21/150 (14%), Positives = 52/150 (34%), Gaps = 3/150 (2%)
Query: 5 RRKRSRSRSPSPSHKRPKESRRDKDRDRRRRSRSHERRSERDRDRDLERRKEKSRGSKRR 64
+ R KRP + + + + R E +R+R R R K+ K++
Sbjct: 126 EEPKDRKPKEEAKEKRPPKEKEKEKEKKVEEPRDREEEKKRERVRAKSRPKKP---PKKK 182
Query: 65 SRSREAERSKDHSKKEEKDKREKEEEEAAFDPSKLDKEVEATRLELEMQKRRDRIERWRA 124
+++ E ++ +++ + K + E + +KE + + + E ++
Sbjct: 183 PPNKKKEPPEEEKQRQAAREAVKGKPEEPDVNEEREKEEDDGKDRETTTSPMEEDESRQS 242
Query: 125 ERKKKDIETIKKDIKSNLSSGLGGSAPMKK 154
+ + K + S + K
Sbjct: 243 SEISRRSSSSLKKPDPSPSMASPETRESSK 272
>gnl|CDD|234365 TIGR03817, DECH_helic, helicase/secretion neighborhood putative
DEAH-box helicase. A conserved gene neighborhood widely
spread in the Actinobacteria contains this
uncharacterized DEAH-box family helicase encoded
convergently towards an operon of genes for protein
homologous to type II secretion and pilus formation
proteins. The context suggests that this helicase may
play a role in conjugal transfer of DNA.
Length = 742
Score = 46.6 bits (111), Expect = 3e-05
Identities = 47/175 (26%), Positives = 77/175 (44%), Gaps = 27/175 (15%)
Query: 318 GKGCPRPIKTWAQCGVSKKILDALKKQNYEKPTPIQAQAIPAIMSGRDLIGIAKTGSGKT 377
G+ P P WA ++ AL+ +P QA+A +GR ++ T SGK+
Sbjct: 12 GRTAPWP--AWAH----PDVVAALEAAGIHRPWQHQARAAELAHAGRHVVVATGTASGKS 65
Query: 378 VAFVLPLLRHILDQPPLEETDGPMAIIMSPTRELCMQIGKEAKKFTKSLGLRVV--CVYG 435
+A+ LP+L + D P A+ ++PT+ L + + + L LR V Y
Sbjct: 66 LAYQLPVLSALADDP------RATALYLAPTKAL----AADQLRAVRELTLRGVRPATYD 115
Query: 436 GTGISEQISELKRGAEIIVCTPGRMIDML----AANSGRVTN-LRRVTYIVLDEA 485
G +E+ + A ++ P DML + R LRR+ Y+V+DE
Sbjct: 116 GDTPTEERRWAREHARYVLTNP----DMLHRGILPSHARWARFLRRLRYVVIDEC 166
>gnl|CDD|224125 COG1204, COG1204, Superfamily II helicase [General function
prediction only].
Length = 766
Score = 46.2 bits (110), Expect = 4e-05
Identities = 34/135 (25%), Positives = 60/135 (44%), Gaps = 14/135 (10%)
Query: 351 PIQAQAIPAIMSGRDLIGIAKTGSGKTVAFVLPLLRHILDQPPLEETDGPMAIIMSPTRE 410
P Q ++S +++ A TGSGKT+ +L +L +L+ G + + P +
Sbjct: 35 PQQEAVEKGLLSDENVLISAPTGSGKTLIALLAILSTLLE-------GGGKVVYIVPLKA 87
Query: 411 LCMQIGKEAKKFTKSLGLRVVCVYGGTGISEQISELKRGAEIIVCTPGRMIDMLAANSGR 470
L + +E + + LG+RV TG + E ++IV TP ++ D L
Sbjct: 88 LAEEKYEEFSRL-EELGIRVGIS---TGDYDLDDERLARYDVIVTTPEKL-DSLTRKRPS 142
Query: 471 VTNLRRVTYIVLDEA 485
+ V +V+DE
Sbjct: 143 W--IEEVDLVVIDEI 155
>gnl|CDD|234478 TIGR04121, DEXH_lig_assoc, DEXH box helicase, DNA
ligase-associated. Members of this protein family are
DEAD/DEAH box helicases found associated with a
bacterial ATP-dependent DNA ligase, part of a four-gene
system that occurs in about 12 % of prokaryotic
reference genomes. The actual motif in this family is
DE[VILW]H.
Length = 803
Score = 45.6 bits (109), Expect = 6e-05
Identities = 45/142 (31%), Positives = 72/142 (50%), Gaps = 11/142 (7%)
Query: 348 KPTPIQAQAIPAIMSGRDLIGIAKTGSGKTVAFVLP-LLRHILDQPPLEETDGPMAIIMS 406
P P Q + A + GR + IA TGSGKT+A LP L+ + P G + ++
Sbjct: 13 TPRPFQLEMWAAALEGRSGLLIAPTGSGKTLAGFLPSLIDLAGPEKP---KKGLHTLYIT 69
Query: 407 PTRELCMQIGKEAKKFTKSLGL--RVVCVYGGTGISEQISELKRGAEIIVCTPGRMIDML 464
P R L + I + + + LGL RV G T SE+ + K+ +I++ TP + +L
Sbjct: 70 PLRALAVDIARNLQAPIEELGLPIRVETRTGDTSSSERARQRKKPPDILLTTPESLALLL 129
Query: 465 A-ANSGRV-TNLRRVTYIVLDE 484
+ ++ R+ +LR V V+DE
Sbjct: 130 SYPDAARLFKDLRCV---VVDE 148
>gnl|CDD|221821 pfam12871, PRP38_assoc, Pre-mRNA-splicing factor 38-associated
hydrophilic C-term. This domain is a hydrophilic
region found at the C-terminus of plant and metazoan
pre-mRNA-splicing factor 38 proteins. The function is
not known.
Length = 97
Score = 41.3 bits (97), Expect = 1e-04
Identities = 24/79 (30%), Positives = 34/79 (43%), Gaps = 3/79 (3%)
Query: 4 SRRKRSRSRSPSPSHKRPKESRRDKDRDRRRRSRSHERRSERDRDRDLERRKEK---SRG 60
R R + S R + R R RR + +R R RDRD R +++ R
Sbjct: 19 EEDDEEIRRKAERDVDRGRRSPRRRTRRRSRRRKRSRKRRRRRRDRDRARYRDRDDRDRD 78
Query: 61 SKRRSRSREAERSKDHSKK 79
RSRSR RS+D ++
Sbjct: 79 RYDRSRSRSRSRSRDRRRR 97
Score = 40.5 bits (95), Expect = 2e-04
Identities = 25/73 (34%), Positives = 35/73 (47%), Gaps = 1/73 (1%)
Query: 3 RSRRKRSRSRSPSPSHKRPKESRRDKDRDRRRRSRSHERRSERDRDRDLERRKEKSRGSK 62
R + +R R +R + R + R R+RR R +R R RDRD R R S+
Sbjct: 26 RRKAERDVDRGRRSPRRRTRRRSRRRKRSRKRRRRRRDRDRARYRDRDDRDRDRYDR-SR 84
Query: 63 RRSRSREAERSKD 75
RSRSR +R +
Sbjct: 85 SRSRSRSRDRRRR 97
Score = 37.1 bits (86), Expect = 0.003
Identities = 20/84 (23%), Positives = 34/84 (40%), Gaps = 1/84 (1%)
Query: 4 SRRKRSRSRSPSPSHKRPKESRRDKDRDRRRRSRSHERRSERDRDRDLERRKEKSRGSKR 63
+ +R + R R R RR +RS + R R +R + + R
Sbjct: 15 ESEEEEDDEEIRRKAERDVDRGRRSPRRRTRRRSRRRKRSRKRRRRRRDRDRARYRDRDD 74
Query: 64 RSRSREAERSKDHSKKEEKDKREK 87
R R R +RS+ S+ +D+R +
Sbjct: 75 RDRDRY-DRSRSRSRSRSRDRRRR 97
Score = 29.8 bits (67), Expect = 1.1
Identities = 17/71 (23%), Positives = 28/71 (39%)
Query: 21 PKESRRDKDRDRRRRSRSHERRSERDRDRDLERRKEKSRGSKRRSRSREAERSKDHSKKE 80
E D + RR+ R +R R R R + + R KRR R R+ +R++ + +
Sbjct: 15 ESEEEEDDEEIRRKAERDVDRGRRSPRRRTRRRSRRRKRSRKRRRRRRDRDRARYRDRDD 74
Query: 81 EKDKREKEEEE 91
R
Sbjct: 75 RDRDRYDRSRS 85
>gnl|CDD|223989 COG1061, SSL2, DNA or RNA helicases of superfamily II
[Transcription / DNA replication, recombination, and
repair].
Length = 442
Score = 41.7 bits (98), Expect = 9e-04
Identities = 37/178 (20%), Positives = 62/178 (34%), Gaps = 34/178 (19%)
Query: 348 KPTPIQAQAIPAIMSGRDLIG----IAKTGSGKTVAFVLPLLRHILDQPPLEETDGPMAI 403
+ P Q +A+ A++ R + TG+GKTV + + +
Sbjct: 36 ELRPYQEEALDALVKNRRTERRGVIVLPTGAGKTVVAAE-AIAEL----------KRSTL 84
Query: 404 IMSPTRELCMQIGKEAKKFTKSLGLRVVCVYGGTGISEQISELKRGAEIIVCTPGRMIDM 463
++ PT+EL Q + KKF L + +YGG A++ V T +
Sbjct: 85 VLVPTKELLDQWAEALKKFL--LLNDEIGIYGGGEKEL------EPAKVTVAT----VQT 132
Query: 464 LAANSGRVTNL-RRVTYIVLDEADRMFDMGFEPQVMRIIDNVRPDRQTVM-FSATFPR 519
LA L I+ DE + + R I + + +AT R
Sbjct: 133 LARRQLLDEFLGNEFGLIIFDEVHHLPAPSY-----RRILELLSAAYPRLGLTATPER 185
>gnl|CDD|236766 PRK10811, rne, ribonuclease E; Reviewed.
Length = 1068
Score = 40.4 bits (95), Expect = 0.003
Identities = 26/117 (22%), Positives = 55/117 (47%), Gaps = 10/117 (8%)
Query: 5 RRKRSRSRSPSPSHKRPKESRRDKDRDRRRRSRSHERRSERDRDRDLERRKEKSRGSKRR 64
RR R+ R + R +E R +++ +RR R ++ ++ +E R+ EK+R +
Sbjct: 614 RRDRNERRDTRDNRTR-REGRENREENRRNRRQAQQQTAE-TRESQQAEVTEKARTQDEQ 671
Query: 65 SRSREAERSKDHSKKEEKDKREKEEEEAAF---DPSKLDKEVEATRLELEMQKRRDR 118
++ ER ++ +KR+ ++E A + S + E E ++ +R+ R
Sbjct: 672 QQAPRRER----QRRRNDEKRQAQQEAKALNVEEQSVQETEQEERVQQV-QPRRKQR 723
Score = 38.9 bits (91), Expect = 0.008
Identities = 25/110 (22%), Positives = 46/110 (41%), Gaps = 13/110 (11%)
Query: 19 KRPKESRRDKDRDRRRRSRSHERRSERDRDRDLERRKEKSRGSKRRSRSREAERSKDHSK 78
PK + + + RR+ R + RR +R+ RD + + G + R +R R
Sbjct: 592 PAPKAEAKPERQQDRRKPRQNNRR-DRNERRDTRDNRTRREGRENREENRRNRRQAQQQT 650
Query: 79 KEEKDKREKEEEEAAFDPSKLDKEVEATRLELEMQKRRDRIERWRAERKK 128
E ++ ++ E E T+ E + RR+R +R R + K+
Sbjct: 651 AETRESQQAEVTEK-----------ARTQDEQQQAPRRER-QRRRNDEKR 688
>gnl|CDD|178945 PRK00247, PRK00247, putative inner membrane protein translocase
component YidC; Validated.
Length = 429
Score = 39.8 bits (93), Expect = 0.003
Identities = 27/160 (16%), Positives = 59/160 (36%), Gaps = 16/160 (10%)
Query: 18 HKRPKESRRDKDRDRRRRSRSHERRSERDRDRDLERRKEKSRGSKRRSRSREAERSKDHS 77
K +R + R++++ ++ R+R R + R + + + E ++++
Sbjct: 283 FKEHHAEQRAQYREKQKEKKAFLWTLRRNRLRMIIT---PWRAPELHAENAEIKKTRTAE 339
Query: 78 KKEEKD--KREKEEEEAAFDPSKLDKEVEATRLELEMQKRRDRIERWRAERKKKDIETIK 135
K E K K ++ AA ++++E R M + R R +A +KK I+
Sbjct: 340 KNEAKARKKEIAQKRRAAER--EINREARQERAA-AMARARARRAAVKA-KKKGLIDASP 395
Query: 136 KDIKSNLSSGLGGSAPMKKWNLEDDSDEDENDNKDENGKT 175
+ + + GS P + E +
Sbjct: 396 NEDTPSENEESKGSPP-------QVEATTTAEPNREPSQE 428
Score = 38.3 bits (89), Expect = 0.010
Identities = 19/94 (20%), Positives = 38/94 (40%), Gaps = 8/94 (8%)
Query: 5 RRKRSRS-----RSPS---PSHKRPKESRRDKDRDRRRRSRSHERRSERDRDRDLERRKE 56
RR R R R+P + + K +K+ + R+ ++R +R+ + E R+E
Sbjct: 309 RRNRLRMIITPWRAPELHAENAEIKKTRTAEKNEAKARKKEIAQKRRAAEREINREARQE 368
Query: 57 KSRGSKRRSRSREAERSKDHSKKEEKDKREKEEE 90
++ R R A ++K + + E
Sbjct: 369 RAAAMARARARRAAVKAKKKGLIDASPNEDTPSE 402
>gnl|CDD|240271 PTZ00108, PTZ00108, DNA topoisomerase 2-like protein; Provisional.
Length = 1388
Score = 40.0 bits (94), Expect = 0.004
Identities = 31/240 (12%), Positives = 85/240 (35%), Gaps = 7/240 (2%)
Query: 23 ESRRDKDRDRRRRSRSHERRSERDRDRDLERRKEKSRGSKRRSRSREAERSKDHSKKEEK 82
E +K+ + +R +S + + ++KEK + +S++A + + +
Sbjct: 1145 EEVEEKEIAKEQRLKSKTKGKASKLRKPKLKKKEKKKKKSSADKSKKASVVGNSKRVDSD 1204
Query: 83 DKREKEEEEAAFDP--SKLDKEVEATRLELEMQKRRDRIERWRAERKKKDIETIKKDIKS 140
+KR+ +++ S D+E + + + R++ + K + +
Sbjct: 1205 EKRKLDDKPDNKKSNSSGSDQEDDEEQKTKPKKSSVKRLKSKKNNSSKSSEDNDEFSSDD 1264
Query: 141 NLSSGLGGSAPMKKWNLEDDSDEDENDN----KDENGKTAEEDIDPLDAFMQGVHEEMRK 196
G +AP + + S + + K + + ++G ++K
Sbjct: 1265 LSKEGKPKNAPKRV-SAVQYSPPPPSKRPDGESNGGSKPSSPTKKKVKKRLEGSLAALKK 1323
Query: 197 VNKPAVPTTADVKPADSGSKPAGVVIVTGVVKKSVEKAKGELMEENQDGLEYSSEEEQED 256
K T K + + + + +K+ +++ ++ S +E+ ED
Sbjct: 1324 KKKSEKKTARKKKSKTRVKQASASQSSRLLRRPRKKKSDSSSEDDDDSEVDDSEDEDDED 1383
Score = 38.5 bits (90), Expect = 0.013
Identities = 27/181 (14%), Positives = 62/181 (34%), Gaps = 8/181 (4%)
Query: 2 VRSRRKRSRSRSPSPSHKRPKESRRDKDRDRRRRSRSHERRSERDRDRDLERRKEKSRGS 61
+K + S S + K + R + +++ +S D D K
Sbjct: 1212 KPDNKKSNSSGSDQEDDEEQKTKPKKSSVKRLKSKKNNSSKSSEDNDEFSSDDLSKEGKP 1271
Query: 62 KRR-SRSREAERSKDHSKKEEKDKREKEEEEAAFDPSKLDKEVEATRLELEMQKRRDRIE 120
K R + S K + + ++ K+ K +E + L+ +K+ ++
Sbjct: 1272 KNAPKRVSAVQYSPPPPSKRPDGESNGGSKPSSPTKKKVKKRLEGSLAALKKKKKSEKK- 1330
Query: 121 RWRAERKKKDIETIKKDIKSNLSSGLGGSAPMKKWNLEDDSDEDENDNKDENGKTAEEDI 180
A +KK + + +K + S++D++ D++ +ED
Sbjct: 1331 --TARKKKSKTRVKQASASQSSRLL----RRPRKKKSDSSSEDDDDSEVDDSEDEDDEDD 1384
Query: 181 D 181
+
Sbjct: 1385 E 1385
Score = 33.1 bits (76), Expect = 0.56
Identities = 34/235 (14%), Positives = 73/235 (31%), Gaps = 11/235 (4%)
Query: 44 ERDRDRDLERRKEKSRGSKRRSRSREAERSKDHSKKEEKDKREKEEEEAAFDPSKLDKEV 103
E++ + E KE+ SK + ++ + + K KKE+K K+ ++ K V
Sbjct: 1143 EQEEVEEKEIAKEQRLKSKTKGKASKLRKPKL-KKKEKKKKKSSADKSKKASVVGNSKRV 1201
Query: 104 EATRLELEMQKRRDRIERWRAERKKKDIETIKKDIKSNLSSGLGGSAPMKKWNLEDDSDE 163
++ K ++ ++ D E K KS++ K + ++D
Sbjct: 1202 DSDEKRKLDDKPDNKKSNSSGSDQEDDEEQKTKPKKSSVKRLKSKKNNSSKSSEDNDEFS 1261
Query: 164 DENDNKDENGKTAEEDIDPLDAFMQGVHEEMRKVNKPAVPTTADVKPADSGSKPAGVVIV 223
++ +K+ K A + + + K + +
Sbjct: 1262 SDDLSKEGKPKNAPKRVSAVQYSPP----PPSKRPDGESNGGSKPSSPTKKKVKKRLEGS 1317
Query: 224 TGVVKKSV--EKAKGELMEENQDGLEYSSEEEQEDLTSTAANLASKQKKELSKVD 276
+KK EK + + S + K+ S+ D
Sbjct: 1318 LAALKKKKKSEKKTARKKKSKTRVKQAS----ASQSSRLLRRPRKKKSDSSSEDD 1368
>gnl|CDD|234702 PRK00254, PRK00254, ski2-like helicase; Provisional.
Length = 720
Score = 39.0 bits (91), Expect = 0.007
Identities = 41/153 (26%), Positives = 74/153 (48%), Gaps = 15/153 (9%)
Query: 333 VSKKILDALKKQNYEKPTPIQAQAIPA-IMSGRDLIGIAKTGSGKTVAFVLPLLRHILDQ 391
V ++I LK++ E+ P QA+A+ + ++ G++L+ T SGKT+ + ++ +L
Sbjct: 8 VDERIKRVLKERGIEELYPPQAEALKSGVLEGKNLVLAIPTASGKTLVAEIVMVNKLL-- 65
Query: 392 PPLEETDGPMAIIMSPTRELCMQIGKEAKKFTKSLGLRVVCVYGGTGISEQISELKRGAE 451
+G A+ + P + L + +E K + K LGLRV TG + E +
Sbjct: 66 -----REGGKAVYLVPLKALAEEKYREFKDWEK-LGLRVAMT---TGDYDSTDEWLGKYD 116
Query: 452 IIVCTPGRMIDMLAANSGRVTNLRRVTYIVLDE 484
II+ T + +L S + + V +V DE
Sbjct: 117 IIIATAEKFDSLLRHGSSWI---KDVKLVVADE 146
>gnl|CDD|236779 PRK10864, PRK10864, putative methyltransferase; Provisional.
Length = 346
Score = 38.2 bits (89), Expect = 0.010
Identities = 28/92 (30%), Positives = 36/92 (39%), Gaps = 14/92 (15%)
Query: 12 RSPSPSHKRPKESR--RDKDRDRRRRSRSHERRSERDRDRDLERRKEKSRGSKRRSRSR- 68
RS S KR R + R + RR RD DR+ + R K S R+ SR
Sbjct: 18 RSDDDSDKRTHNPRTGKGGGRPSGKSRADGGRRPARD-DRNSQSRDRKWEDSPWRTVSRA 76
Query: 69 ---EAERSKDH---SKKEEKD----KREKEEE 90
E DH S K D +R++ EE
Sbjct: 77 PGDETPEKADHGGISGKSFIDPEVLRRQRAEE 108
Score = 31.7 bits (72), Expect = 1.1
Identities = 23/97 (23%), Positives = 31/97 (31%), Gaps = 10/97 (10%)
Query: 27 DKDRDRRRRSRSHERRSERDRDRDLERRKEKSRGSKRRSRSREAERSKDHSKKEEKDKRE 86
DK R + R S + R R R S+ R R E + S+ + E
Sbjct: 24 DKRTHNPRTGKGGGRPSGKSRADGGRRPARDDRNSQSRDRKWEDSPWRTVSRAPGDETPE 83
Query: 87 KEEEEAAFDPSKLDKEVEATRLELEMQKRRDRIERWR 123
K + S +D E RR R E R
Sbjct: 84 KADHGGISGKSFIDPE----------VLRRQRAEETR 110
>gnl|CDD|219061 pfam06495, Transformer, Fruit fly transformer protein. This family
consists of transformer proteins from several Drosophila
species and also from Ceratitis capitata (Mediterranean
fruit fly). The transformer locus (tra) produces an RNA
processing protein that alternatively splices the
doublesex pre-mRNA in the sex determination hierarchy of
Drosophila melanogaster.
Length = 182
Score = 37.3 bits (86), Expect = 0.011
Identities = 21/65 (32%), Positives = 27/65 (41%), Gaps = 1/65 (1%)
Query: 5 RRKRSRSRSPSPSHKRPKESR-RDKDRDRRRRSRSHERRSERDRDRDLERRKEKSRGSKR 63
+RK +R + R SR R + +R R H RS D R S +R
Sbjct: 41 QRKTQSTRPTTSHRGRRTRSRSRSQSAERNSCQRRHRSRSRSRNRSDSRHRSTSSTERRR 100
Query: 64 RSRSR 68
RSRSR
Sbjct: 101 RSRSR 105
Score = 36.6 bits (84), Expect = 0.014
Identities = 26/65 (40%), Positives = 29/65 (44%), Gaps = 7/65 (10%)
Query: 3 RSRRKRSRSRSPSPSHKRPKESRRDKDRDRRRRSRSHERRSERDRDRDLERRKEKSRGSK 62
R RR RSRSRS S R R R RSRS R R R R+ +SR
Sbjct: 54 RGRRTRSRSRSQS-------AERNSCQRRHRSRSRSRNRSDSRHRSTSSTERRRRSRSRS 106
Query: 63 RRSRS 67
R SR+
Sbjct: 107 RYSRT 111
Score = 34.3 bits (78), Expect = 0.10
Identities = 26/96 (27%), Positives = 37/96 (38%), Gaps = 10/96 (10%)
Query: 4 SRRKRSRSRSPSPSHKRP--KESRRDKDRDRRRRSRSHE-------RRSERDRDRDLERR 54
SR R R K P + R++DR R R R + R R R R +
Sbjct: 7 SRSPRDTRRDSRKKEKIPYFADEVRERDRVRNLRQRKTQSTRPTTSHRGRRTRSRSRSQS 66
Query: 55 KEKSRGSKR-RSRSREAERSKDHSKKEEKDKREKEE 89
E++ +R RSRSR RS + +R +
Sbjct: 67 AERNSCQRRHRSRSRSRNRSDSRHRSTSSTERRRRS 102
Score = 33.1 bits (75), Expect = 0.22
Identities = 26/78 (33%), Positives = 32/78 (41%), Gaps = 7/78 (8%)
Query: 2 VRSRRKRSRSRSPSPSHKRPKESRRDKDRDRRRRSRSHER--RSERDRDRDLERRKEKSR 59
VR R + R RP S R + R RS+S ER R R R R + SR
Sbjct: 30 VRERDRVRNLRQRKTQSTRPTTSHRGRRTRSRSRSQSAERNSCQRRHRSRSRSRNRSDSR 89
Query: 60 -----GSKRRSRSREAER 72
++RR RSR R
Sbjct: 90 HRSTSSTERRRRSRSRSR 107
Score = 33.1 bits (75), Expect = 0.25
Identities = 23/79 (29%), Positives = 37/79 (46%), Gaps = 4/79 (5%)
Query: 6 RKRSRSRSPSPSHKRPKESRRDKDRDRRRRSRSHERRSERDRDRDLERRKEKSRGSKRRS 65
R+R R R+ +R +S R R RR+RS R +R+ R + +SR S+ RS
Sbjct: 31 RERDRVRN---LRQRKTQSTRPTTSHRGRRTRSRSRSQSAERNSCQRRHRSRSR-SRNRS 86
Query: 66 RSREAERSKDHSKKEEKDK 84
SR S ++ + +
Sbjct: 87 DSRHRSTSSTERRRRSRSR 105
Score = 31.9 bits (72), Expect = 0.54
Identities = 21/45 (46%), Positives = 23/45 (51%), Gaps = 3/45 (6%)
Query: 3 RSRRKRSRSRSPSPSHKRPKESRRDKDRDRRRRSRSHERRSERDR 47
RR RSRSRS + S R R +RRRRSRS R S R
Sbjct: 72 CQRRHRSRSRSRNRSDSR---HRSTSSTERRRRSRSRSRYSRTPR 113
>gnl|CDD|237497 PRK13767, PRK13767, ATP-dependent helicase; Provisional.
Length = 876
Score = 38.7 bits (91), Expect = 0.011
Identities = 44/158 (27%), Positives = 73/158 (46%), Gaps = 20/158 (12%)
Query: 343 KQNYEKPTPIQAQAIPAIMSGRDLIGIAKTGSGKTVAFVLPLLRHILDQPPLEET----D 398
K+ + TP Q AIP I G++++ + TGSGKT+A L ++ + L D
Sbjct: 27 KEKFGTFTPPQRYAIPLIHEGKNVLISSPTGSGKTLAAFLAIIDELFR---LGREGELED 83
Query: 399 GPMAIIMSPTREL-----------CMQIGKEAKKFTKSL-GLRVVCVYGGTGISEQISEL 446
+ +SP R L +I + AK+ + L +RV G T E+ L
Sbjct: 84 KVYCLYVSPLRALNNDIHRNLEEPLTEIREIAKERGEELPEIRVAIRTGDTSSYEKQKML 143
Query: 447 KRGAEIIVCTPGRMIDMLAANSGRVTNLRRVTYIVLDE 484
K+ I++ TP + +L + R LR V ++++DE
Sbjct: 144 KKPPHILITTPESLAILLNSPKFR-EKLRTVKWVIVDE 180
>gnl|CDD|224117 COG1196, Smc, Chromosome segregation ATPases [Cell division and
chromosome partitioning].
Length = 1163
Score = 38.2 bits (89), Expect = 0.014
Identities = 28/167 (16%), Positives = 67/167 (40%), Gaps = 1/167 (0%)
Query: 28 KDRDRRRRSRSHERRSERDRDRDLERRKEKSRGSKRRSRSREAERSKD-HSKKEEKDKRE 86
K RS R + +LER+ E+ + + +EE ++ E
Sbjct: 691 KSLKNELRSLEDLLEELRRQLEELERQLEELKRELAALEEELEQLQSRLEELEEELEELE 750
Query: 87 KEEEEAAFDPSKLDKEVEATRLELEMQKRRDRIERWRAERKKKDIETIKKDIKSNLSSGL 146
+E EE +L++E+E+ L K + + ++++E ++++++
Sbjct: 751 EELEELQERLEELEEELESLEEALAKLKEEIEELEEKRQALQEELEELEEELEEAERRLD 810
Query: 147 GGSAPMKKWNLEDDSDEDENDNKDENGKTAEEDIDPLDAFMQGVHEE 193
++ + E E + +E + EE +D L+ ++ + +E
Sbjct: 811 ALERELESLEQRRERLEQEIEELEEEIEELEEKLDELEEELEELEKE 857
Score = 35.8 bits (83), Expect = 0.085
Identities = 41/255 (16%), Positives = 91/255 (35%), Gaps = 17/255 (6%)
Query: 28 KDRDRRRRSRSHERRSERDRDRDLERRKEKSRGSKRRSRSREAERSKDHSKKEEKDKREK 87
++ E ++ +LE R + E + + +EK + K
Sbjct: 277 EELREELEELQEELLELKEEIEELEGEISLLRERLEELENELEELEERLEELKEKIEALK 336
Query: 88 EEEEAAFDPSKLDKEVEATRLELEMQ-KRRDRIERWRAERKKKDIETIKKDIKSNLSSGL 146
EE E L +E+E ELE + + E ++ E +++++ +
Sbjct: 337 EELEER---ETLLEELEQLLAELEEAKEELEEKLSALLEELEELFEALREELAELEAELA 393
Query: 147 GGSAPMKKWNLEDDSDEDENDNKDENGKTAEEDIDPLDAFMQGVHEEMRKVNKPAVPTTA 206
+++ E +S E+ + E + +E++ L+A ++ + E+ ++N+
Sbjct: 394 EIRNELEELKREIESLEERLERLSERLEDLKEELKELEAELEELQTELEELNEELEELEE 453
Query: 207 DVKPADSGSKPAGVVIVTGVVKKSVEKAKGELMEENQDGLEYSSEEEQEDLTSTAANLAS 266
++ K +E+ EL EE Q + S E A AS
Sbjct: 454 QLEELRD-------------RLKELERELAELQEELQRLEKELSSLEARLDRLEAEQRAS 500
Query: 267 KQKKELSKVDHSTIE 281
+ + + + S +
Sbjct: 501 QGVRAVLEALESGLP 515
Score = 33.9 bits (78), Expect = 0.28
Identities = 22/113 (19%), Positives = 42/113 (37%), Gaps = 5/113 (4%)
Query: 23 ESRRDKDRDRRRRSRSHERRSERDRDR--DLERRKEKSRGSKRRSRSREAERSKDHSKKE 80
E + + +R + E E+ + R +LE E+ + R E ++ E
Sbjct: 712 EELERQLEELKRELAALEEELEQLQSRLEELEEELEELEEELEELQERLEELEEELESLE 771
Query: 81 EKDKREKEEEEAAFDPSKLDKEVEATRLELEMQKRRDRIERWRAERKKKDIET 133
E + KEE E + + ++ ELE + ER+ + +E
Sbjct: 772 EALAKLKEEIEEL---EEKRQALQEELEELEEELEEAERRLDALERELESLEQ 821
Score = 33.5 bits (77), Expect = 0.45
Identities = 30/180 (16%), Positives = 66/180 (36%), Gaps = 8/180 (4%)
Query: 22 KESRRDKDRDRR--RRSRSHERRSERDRDRDLERRKEKSRGSKRRSRSREAERSKDHSKK 79
KE + + + R + + LE KEK K RE +
Sbjct: 294 KEEIEELEGEISLLRERLEELENELEELEERLEELKEKIEALKEELEERETLLEELEQLL 353
Query: 80 EEKDKREKEEEEAAFDPSKLDKEVEATRLELEMQKRRDRIERWRAERKKKDIETIKKDIK 139
E ++ ++E EE + E L E+ + + R E +E +K++I+
Sbjct: 354 AELEEAKEELEEKL-SALLEELEELFEALREELAELEAELAEIRNE-----LEELKREIE 407
Query: 140 SNLSSGLGGSAPMKKWNLEDDSDEDENDNKDENGKTAEEDIDPLDAFMQGVHEEMRKVNK 199
S S ++ E E E + + E+++ L+ ++ + + ++++ +
Sbjct: 408 SLEERLERLSERLEDLKEELKELEAELEELQTELEELNEELEELEEQLEELRDRLKELER 467
Score = 32.8 bits (75), Expect = 0.74
Identities = 31/170 (18%), Positives = 74/170 (43%), Gaps = 8/170 (4%)
Query: 19 KRPKESRRDKDRDRRRRSRSHERRSERDRDRDLERRKEKSRGSKRRSRSREAERSKDHSK 78
++ + + + + E ER + + E + +K + E E + +
Sbjct: 733 EQLQSRLEELEEELEELEEELEELQERLEELEEELESLEEALAKLKEEIEELEEKRQ-AL 791
Query: 79 KEEKDKREKEEEEAAFDPSKLDKEVEAT-----RLELEMQKRRDRIERWRAERK--KKDI 131
+EE ++ E+E EEA L++E+E+ RLE E+++ + IE + ++++
Sbjct: 792 QEELEELEEELEEAERRLDALERELESLEQRRERLEQEIEELEEEIEELEEKLDELEEEL 851
Query: 132 ETIKKDIKSNLSSGLGGSAPMKKWNLEDDSDEDENDNKDENGKTAEEDID 181
E ++K+++ A ++ E E+E + +E + E ++
Sbjct: 852 EELEKELEELKEELEELEAEKEELEDELKELEEEKEELEEELRELESELA 901
Score = 30.5 bits (69), Expect = 3.4
Identities = 23/130 (17%), Positives = 48/130 (36%), Gaps = 4/130 (3%)
Query: 17 SHKRPKESRRDKDRDRRRRSRSHERRSERDRDRDLERRKEKSRGSKRRSRSREAERSKDH 76
K E + + R + + +L + + KR S E +
Sbjct: 358 EAKEELEEKLSALLEELEELFEALREELAELEAELAEIRNELEELKREIESLEERLERLS 417
Query: 77 SKKEEKDKR--EKEEEEAAFDPSKLDKEVEATRLELEMQKRRDRIERWRAERKKKDIETI 134
+ E+ + E E E + E LE ++++ RDR++ ER+ +++
Sbjct: 418 ERLEDLKEELKELEAELEELQTELEELNEELEELEEQLEELRDRLK--ELERELAELQEE 475
Query: 135 KKDIKSNLSS 144
+ ++ LSS
Sbjct: 476 LQRLEKELSS 485
Score = 30.1 bits (68), Expect = 4.5
Identities = 22/134 (16%), Positives = 53/134 (39%), Gaps = 6/134 (4%)
Query: 3 RSRRKRSRSRSPSPSHKRPKESRRDKDRDRRRRSRSHERRSERDRDRDLERRKEKSRGSK 62
+ R + S K ++ + +R++ E + + ERR +
Sbjct: 757 QERLEELEEELESLEEALAKLKEEIEELEEKRQALQEELEELEEELEEAERRLDALEREL 816
Query: 63 RRSRSREAERSKD-HSKKEEKDKREKEEEEAAFDPSKLDKEVEA-----TRLELEMQKRR 116
R ++ +EE ++ E++ +E + +L+KE+E LE E ++
Sbjct: 817 ESLEQRRERLEQEIEELEEEIEELEEKLDELEEELEELEKELEELKEELEELEAEKEELE 876
Query: 117 DRIERWRAERKKKD 130
D ++ E+++ +
Sbjct: 877 DELKELEEEKEELE 890
Score = 29.3 bits (66), Expect = 7.2
Identities = 22/118 (18%), Positives = 45/118 (38%)
Query: 22 KESRRDKDRDRRRRSRSHERRSERDRDRDLERRKEKSRGSKRRSRSREAERSKDHSKKEE 81
E + RR R E + + E+ E + + E + + + E
Sbjct: 812 LERELESLEQRRERLEQEIEELEEEIEELEEKLDELEEELEELEKELEELKEELEELEAE 871
Query: 82 KDKREKEEEEAAFDPSKLDKEVEATRLELEMQKRRDRIERWRAERKKKDIETIKKDIK 139
K++ E E +E + +L++E+ EL K R R E + +E ++ ++
Sbjct: 872 KEELEDELKELEEEKEELEEELRELESELAELKEEIEKLRERLEELEAKLERLEVELP 929
>gnl|CDD|225288 COG2433, COG2433, Uncharacterized conserved protein [Function
unknown].
Length = 652
Score = 37.8 bits (88), Expect = 0.020
Identities = 24/124 (19%), Positives = 55/124 (44%), Gaps = 9/124 (7%)
Query: 20 RPKESRRDKDRDRRRRSRSHERRSERDRDRDLERRKEKSRGSKRRSRSREAERSKDHSKK 79
RP+E ++ +RR + + + + +ER +E++ KR + E K S+
Sbjct: 402 RPREKEGTEEEERREITVY--EKRIKKLEETVERLEEENSELKRELEELKREIEKLESEL 459
Query: 80 EEKDKREKEEEEAAFDPSKLDKEVEATRLELEMQKRRDRIERWRAERKKKDIETIKKDIK 139
E + +++ + D+E+ A +E ++ ++ R E ++ + ++K K
Sbjct: 460 ERFRREVRDKV-------RKDREIRARDRRIERLEKELEEKKKRVEELERKLAELRKMRK 512
Query: 140 SNLS 143
LS
Sbjct: 513 LELS 516
>gnl|CDD|179385 PRK02224, PRK02224, chromosome segregation protein; Provisional.
Length = 880
Score = 37.7 bits (88), Expect = 0.021
Identities = 42/175 (24%), Positives = 70/175 (40%), Gaps = 33/175 (18%)
Query: 20 RPKESRRDKDRDRRRR-----SRSHERRSERDRDR-----------DLERRKEKSRGSKR 63
R +E ++ RD R R + +E D +LE R E+ R
Sbjct: 272 REREELAEEVRDLRERLEELEEERDDLLAEAGLDDADAEAVEARREELEDRDEELRDRLE 331
Query: 64 RSR-SREAERSKDHSKKEEKDKREKEEEEAAFDPSKLDKEVEATRLELEMQKRRDRIERW 122
R + +A + S +E+ D E+ EE + ++L+ E+E R +E RR+ IE
Sbjct: 332 ECRVAAQAHNEEAESLREDADDLEERAEELREEAAELESELEEAREAVE--DRREEIEEL 389
Query: 123 RAERKKKDIETIKKDIKSNLSSGLGGSAPMKKWNLEDDSDEDENDNKDENGKTAE 177
E IE +++ G AP+ N ED +E + + + AE
Sbjct: 390 EEE-----IEELRERF---------GDAPVDLGNAEDFLEELREERDELREREAE 430
Score = 29.2 bits (66), Expect = 7.2
Identities = 39/138 (28%), Positives = 63/138 (45%), Gaps = 21/138 (15%)
Query: 22 KESRRDKDRDRRRRSRSHERRSERDR--DRDLERRKE----KSRGSKRRSRSREAERSKD 75
E + +R +R ++ E R E D + ERR+E ++ R E ER ++
Sbjct: 216 AELDEEIERYEEQREQARETRDEADEVLEEHEERREELETLEAEIEDLRETIAETERERE 275
Query: 76 HSKKEEKDKREKEEE----------EAAFDPSKLDKEVEATRLELEMQKR--RDRIE--R 121
+E +D RE+ EE EA D + VEA R ELE + RDR+E R
Sbjct: 276 ELAEEVRDLRERLEELEEERDDLLAEAGLD-DADAEAVEARREELEDRDEELRDRLEECR 334
Query: 122 WRAERKKKDIETIKKDIK 139
A+ ++ E++++D
Sbjct: 335 VAAQAHNEEAESLREDAD 352
Score = 29.2 bits (66), Expect = 8.2
Identities = 25/112 (22%), Positives = 47/112 (41%), Gaps = 12/112 (10%)
Query: 20 RPKESRRDKDRDRRRRSRSHERRSERDRDRDLERRKEKSRGSKRRSRSREAERSKDHSKK 79
+ ++ +R E R ER +R R + ++RR E +R + +
Sbjct: 488 EEEVEEVEERLERAEDLVEAEDRIERLEER---REDLEELIAERRETI-EEKRERAEELR 543
Query: 80 EEKDKREKEEEEAAFDPSKLDKEVEATRLEL--------EMQKRRDRIERWR 123
E + E E EE ++ ++E E R E+ E+++R + +ER R
Sbjct: 544 ERAAELEAEAEEKREAAAEAEEEAEEAREEVAELNSKLAELKERIESLERIR 595
>gnl|CDD|223588 COG0514, RecQ, Superfamily II DNA helicase [DNA replication,
recombination, and repair].
Length = 590
Score = 37.3 bits (87), Expect = 0.023
Identities = 41/198 (20%), Positives = 80/198 (40%), Gaps = 36/198 (18%)
Query: 346 YEKPTPIQAQAIPAIMSGRDLIGIAKTGSGKTVAFVLP-LLRHILDQPPLEETDGPMAII 404
Y P Q + I A++SG+D + + TG GK++ + +P LL +G ++
Sbjct: 15 YASFRPGQQEIIDALLSGKDTLVVMPTGGGKSLCYQIPALLL-----------EGL-TLV 62
Query: 405 MSPTRELCMQIGKEAKKFTKSLGLRVVCVYGGTGISEQISELKRGAE-----IIVCTPGR 459
+SP L M+ + ++ G+R + T E+ ++ + ++ +P R
Sbjct: 63 VSPLISL-MKDQVDQ---LEAAGIRAAYL-NSTLSREERQQVLNQLKSGQLKLLYISPER 117
Query: 460 MIDMLAANSGRVTNLR---RVTYIVLDEADRMFDMG--FEPQVMRIIDNVR--PDRQTVM 512
+ S R L ++ + +DEA + G F P R+ P+ +
Sbjct: 118 L------MSPRFLELLKRLPISLVAIDEAHCISQWGHDFRPDYRRLGRLRAGLPNPPVLA 171
Query: 513 FSATFPRQMEALARRILN 530
+AT ++ R L
Sbjct: 172 LTATATPRVRDDIREQLG 189
>gnl|CDD|234468 TIGR04095, dnd_restrict_1, DNA phosphorothioation system
restriction enzyme. The DNA phosphorothioate
modification system dnd (DNA instability during
electrophoresis) recently has been shown to provide a
modification essential to a restriction system. This
protein family was detected by Partial Phylogenetic
Profiling as linked to dnd, and its members usually are
clustered with the dndABCDE genes.
Length = 451
Score = 36.9 bits (86), Expect = 0.027
Identities = 37/152 (24%), Positives = 61/152 (40%), Gaps = 31/152 (20%)
Query: 348 KPTPIQAQAIPAIMS--GRDLIGIAKTGSGKTV-AFVLPLLRHILDQPPLEETDGPMAII 404
+ Q +AI A GR ++ +A TG+GKT+ A + E+ + ++
Sbjct: 8 ELRDYQKEAIRAWFKNNGRGILKMA-TGTGKTLTALAAASKLY-------EKIGLLVLLV 59
Query: 405 MSPTRELCMQIGKEAKKFTKSLGLRVVCVYGGTG-----ISEQISELKRGAE---IIVCT 456
+ P + L Q +EA+KF GL + Y +S + L G + I+ T
Sbjct: 60 VCPYQHLVDQWAREAEKF----GLNPILCYESVSNWQSELSTGLYNLNSGNQKFLAIITT 115
Query: 457 PGRMIDMLAANSGRVTNLRRV---TYIVLDEA 485
I + LRR T ++ DEA
Sbjct: 116 NATFI-----GKNFQSQLRRFPGKTLLIGDEA 142
>gnl|CDD|220383 pfam09756, DDRGK, DDRGK domain. This is a family of proteins of
approximately 300 residues, found in plants and
vertebrates. They contain a highly conserved DDRGK
motif.
Length = 189
Score = 35.8 bits (83), Expect = 0.031
Identities = 15/69 (21%), Positives = 37/69 (53%)
Query: 22 KESRRDKDRDRRRRSRSHERRSERDRDRDLERRKEKSRGSKRRSRSREAERSKDHSKKEE 81
K+ + +++ RR+ R E +R + E+R+ + + + RE ++ ++ K+ E
Sbjct: 6 KKRAKLEEKQARRQQREAEEEEREERKKLEEKREGERKEEEELEEEREKKKEEEERKERE 65
Query: 82 KDKREKEEE 90
+ R+++EE
Sbjct: 66 EQARKEQEE 74
>gnl|CDD|219293 pfam07093, SGT1, SGT1 protein. This family consists of several
eukaryotic SGT1 proteins. Human SGT1 or hSGT1 is known
to suppress GCR2 and is highly expressed in the muscle
and heart. The function of this family is unknown
although it has been speculated that SGT1 may be
functionally analogous to the Gcr2p protein of
Saccharomyces cerevisiae which is known to be a
regulatory factor of glycolytic gene expression.
Length = 557
Score = 36.6 bits (85), Expect = 0.035
Identities = 26/165 (15%), Positives = 54/165 (32%), Gaps = 17/165 (10%)
Query: 74 KDHSKKEEKDKREKEEEEAAFDPSKLDKEVEA--------TRLELEMQKRRDRIERWRAE 125
++ ++ K KE+ D ++ +E E + D E +E
Sbjct: 392 QERQGDKKDLKSNKEDANEVDDLEEVVSSMEEFLNKVSSFEGAEFADDEDEDDDEPDDSE 451
Query: 126 RKKKDIETIKKDIKSNLSSGLGGSAPMKKWNLEDDSDEDENDNKDENGKTAEEDIDPLDA 185
K + + L + LG +L DDSD+ + D+ +++ + + D
Sbjct: 452 DKDVSFDE--DEFFEFLKNMLGLKDDEIDNDLPDDSDDADEDDDEDDDEDEDSSSDSTLE 509
Query: 186 FMQGVHEEM-------RKVNKPAVPTTADVKPADSGSKPAGVVIV 223
++ ++M N + + D GV V
Sbjct: 510 ELEEYMDQMDAELKQTDSSNNADISNSGSSGAEDDDDDIEGVEPV 554
>gnl|CDD|224123 COG1202, COG1202, Superfamily II helicase, archaea-specific
[General function prediction only].
Length = 830
Score = 36.7 bits (85), Expect = 0.036
Identities = 59/243 (24%), Positives = 109/243 (44%), Gaps = 25/243 (10%)
Query: 333 VSKKILDALKKQNYEKPTPIQAQAIPA-IMSGRDLIGIAKTGSGKTVAFVLPLLRHILDQ 391
+ +K LK++ E+ P+Q A+ A ++ G +L+ ++ T SGKT L+ +
Sbjct: 201 IPEKFKRMLKREGIEELLPVQVLAVEAGLLEGENLLVVSATASGKT------LIGELAGI 254
Query: 392 PPLEETDGPMAIIMSPTRELCMQIGKEAKKFTKSLGLRVVCVYGGTGISEQISELK---- 447
P L M + + P L Q ++ K+ LGL+V G + I + +
Sbjct: 255 PRLLSGGKKM-LFLVPLVALANQKYEDFKERYSKLGLKVAIRVGMSRIKTREEPVVVDTS 313
Query: 448 RGAEIIVCTPGRMIDMLAANSGRVTNLRRVTYIVLDEADRMFDMGFEPQVMRIIDNVR-- 505
A+IIV T + +L +G+ +L + +V+DE + D P++ +I +R
Sbjct: 314 PDADIIVGTYEGIDYLL--RTGK--DLGDIGTVVIDEIHTLEDEERGPRLDGLIGRLRYL 369
Query: 506 -PDRQTVMFSATFPRQMEALARRILNKPIEIQVGGRSVVCKEVEQHVIVLDEEQKMLKLL 564
P Q + SAT E LA+++ K + R V +E+H++ E + ++
Sbjct: 370 FPGAQFIYLSATVGNPEE-LAKKLGAKLVLYD--ERPV---PLERHLVFARNESEKWDII 423
Query: 565 ELL 567
L
Sbjct: 424 ARL 426
>gnl|CDD|219256 pfam06991, Prp19_bind, Splicing factor, Prp19-binding domain. This
family represents the C-terminus (approximately 300
residues) of proteins that are involved as binding
partners for Prp19 as part of the nuclear pore complex.
The family in Drosophila is necessary for pre-mRNA
splicing, and the human protein has been found in
purifications of the spliceosome. In the past this
family was thought, erroneously, to be associated with
microfibrillin.
Length = 277
Score = 35.3 bits (81), Expect = 0.063
Identities = 32/107 (29%), Positives = 53/107 (49%), Gaps = 8/107 (7%)
Query: 37 RSHERRSERDRDRDLERRKEKSRGSKRRSRSREAERSK---DHSKKEEKDKREKEEEEAA 93
R +R + ++R+R+ + K +KR++ R+ E K + KKE + K+ EA
Sbjct: 43 RKKDRITIQEREREAAKEKALEEEAKRKAEERKRETLKIVEEEVKKELELKKRNTLLEAN 102
Query: 94 FDPSKLDKEVEATRLEL----EM-QKRRDRIERWRAERKKKDIETIK 135
D D E E E E+ + +RDR ER ER+K +IE ++
Sbjct: 103 IDDVDTDDENEEEEYEAWKLRELKRIKRDREEREEMEREKAEIEKMR 149
>gnl|CDD|109943 pfam00906, Hepatitis_core, Hepatitis core antigen. The core
antigen of hepatitis viruses possesses a carboxyl
terminus rich in arginine. On this basis it was
predicted that the core antigen would bind DNA. There is
some experimental evidence to support this.
Length = 182
Score = 34.8 bits (80), Expect = 0.073
Identities = 16/47 (34%), Positives = 20/47 (42%), Gaps = 15/47 (31%)
Query: 2 VRSRRKRSRSRSPSPSHKRPKESRRDKDRDRRRRSRSHERRSERDRD 48
VR R + R R+PSP RRRRS+S RR +
Sbjct: 148 VRRRGRSPRRRTPSP---------------RRRRSQSPRRRRSQSPS 179
Score = 32.8 bits (75), Expect = 0.32
Identities = 15/39 (38%), Positives = 22/39 (56%), Gaps = 6/39 (15%)
Query: 33 RRRSRSHERRSERDRDRDLERRKEKSRGSKRRSRSREAE 71
RRR RS RR+ R RR+ +S +RRS+S ++
Sbjct: 149 RRRGRSPRRRTPSPR-----RRRSQSPR-RRRSQSPSSQ 181
>gnl|CDD|219939 pfam08619, Nha1_C, Alkali metal cation/H+ antiporter Nha1 C
terminus. The C terminus of the plasma membrane Nha1
antiporter plays an important role in the immediate cell
response to hypo-osmotic shock which prevents an
excessive loss of ions and water. This domain is found
with pfam00999.
Length = 430
Score = 35.2 bits (81), Expect = 0.087
Identities = 25/107 (23%), Positives = 33/107 (30%), Gaps = 14/107 (13%)
Query: 19 KRPKESRRDKDRDRRRRSRSHE---------RRSERD-----RDRDLERRKEKSRGSKRR 64
K + RR RRRR R E S+ R E E S
Sbjct: 63 KGSRAGRRASSLRRRRRQRRKEPQAGTGALGPISQSAISPQRRSSTGENSAESDNTSYGL 122
Query: 65 SRSREAERSKDHSKKEEKDKREKEEEEAAFDPSKLDKEVEATRLELE 111
S+ E + D E D+R EE + ++E T E
Sbjct: 123 SKLAEDSENIDVRPVYESDERSGISEEGSRPSKLREQEQRPTEAYQE 169
>gnl|CDD|219406 pfam07420, DUF1509, Protein of unknown function (DUF1509). This
family consists of several uncharacterized viral
proteins from the Marek's disease-like viruses. Members
of this family are typically around 400 residues in
length. The function of this family is unknown.
Length = 377
Score = 35.0 bits (80), Expect = 0.10
Identities = 22/67 (32%), Positives = 28/67 (41%), Gaps = 4/67 (5%)
Query: 9 SRSRSPSPSHK---RPKESRRDKDRDRRR-RSRSHERRSERDRDRDLERRKEKSRGSKRR 64
S P H + RR++ R R RSRS RR R R R + R+ +SR
Sbjct: 310 SHHTMRRPPHSTSGERRGRRRNRSESRSRSRSRSGSRRYRRRRGRGVPGRRSESRQDTVL 369
Query: 65 SRSREAE 71
S EA
Sbjct: 370 VSSSEAS 376
Score = 31.9 bits (72), Expect = 1.0
Identities = 18/54 (33%), Positives = 25/54 (46%), Gaps = 1/54 (1%)
Query: 29 DRDRRRRSRSHERRSERDRDRDLERRKEKSRGS-KRRSRSREAERSKDHSKKEE 81
+R RRR+RS R R R R+ + RG RRS SR+ S+ +
Sbjct: 324 ERRGRRRNRSESRSRSRSRSGSRRYRRRRGRGVPGRRSESRQDTVLVSSSEASD 377
>gnl|CDD|217348 pfam03064, U79_P34, HSV U79 / HCMV P34. This family represents
herpes virus protein U79 and cytomegalovirus early
phosphoprotein P34 (UL112).
Length = 238
Score = 34.5 bits (79), Expect = 0.11
Identities = 15/69 (21%), Positives = 27/69 (39%)
Query: 22 KESRRDKDRDRRRRSRSHERRSERDRDRDLERRKEKSRGSKRRSRSREAERSKDHSKKEE 81
KE RR +D + + R ++ +R D D + S + E K+ +K
Sbjct: 167 KEKRRVEDSQKHKEDRRKKQEEKRRNDEDKRPGGGGGSSGGQSGLSTKDEPPKEKRQKHH 226
Query: 82 KDKREKEEE 90
+R E +
Sbjct: 227 DPERRLEPQ 235
Score = 29.8 bits (67), Expect = 3.2
Identities = 16/85 (18%), Positives = 35/85 (41%), Gaps = 4/85 (4%)
Query: 49 RDLERRKEKSRGSKRRSRSREAERSKDHSKKEEKDKREKEEEEAAFDPSKLDKEVEATRL 108
R L R+K KR + +E R +D K +E ++++EE+ + + ++
Sbjct: 148 RALSRKKSDDEHRKRSGKQKEKRRVEDSQKHKEDRRKKQEEKRRNDEDKRPGGGGGSS-- 205
Query: 109 ELEMQKRRDRIERWRAERKKKDIET 133
Q + E+++K +
Sbjct: 206 --GGQSGLSTKDEPPKEKRQKHHDP 228
>gnl|CDD|223046 PHA03328, PHA03328, nuclear egress lamina protein UL31;
Provisional.
Length = 316
Score = 34.7 bits (80), Expect = 0.11
Identities = 18/55 (32%), Positives = 23/55 (41%), Gaps = 1/55 (1%)
Query: 20 RPKESRRDK-DRDRRRRSRSHERRSERDRDRDLERRKEKSRGSKRRSRSREAERS 73
+ S + R RRSR R R R R RR+ R S RR+ + ER
Sbjct: 6 LRRSSSSLRRSRRAARRSRRDGRVGSRGRSRYRSRRRSSRRSSTRRAELADTERD 60
Score = 33.5 bits (77), Expect = 0.28
Identities = 13/45 (28%), Positives = 18/45 (40%)
Query: 3 RSRRKRSRSRSPSPSHKRPKESRRDKDRDRRRRSRSHERRSERDR 47
+RR R R S R + RR R RR+ + +R R
Sbjct: 19 AARRSRRDGRVGSRGRSRYRSRRRSSRRSSTRRAELADTERDRYR 63
Score = 32.8 bits (75), Expect = 0.49
Identities = 18/53 (33%), Positives = 22/53 (41%), Gaps = 6/53 (11%)
Query: 3 RSRRKRSRSRSPSPSHKRPKESRRDKDRDRRRRSRSHERRSE-RDRDRDLERR 54
R +RSR S R + R RR RS RR+E D +RD R
Sbjct: 17 RRAARRSRRDGRVGSRGRSRYRSR-----RRSSRRSSTRRAELADTERDRYRA 64
Score = 32.0 bits (73), Expect = 0.74
Identities = 16/47 (34%), Positives = 21/47 (44%)
Query: 3 RSRRKRSRSRSPSPSHKRPKESRRDKDRDRRRRSRSHERRSERDRDR 49
RSRR RSR R + R + R RR S ++ +RDR
Sbjct: 15 RSRRAARRSRRDGRVGSRGRSRYRSRRRSSRRSSTRRAELADTERDR 61
Score = 32.0 bits (73), Expect = 0.91
Identities = 22/61 (36%), Positives = 25/61 (40%), Gaps = 4/61 (6%)
Query: 8 RSRSRSPSPSHKRPKESRRDKDRDRRRRSRSHERRSERDRDRDLERRKEKSRGSKRRSRS 67
R S S S + + SRRD R RSR RR R R RR E + R R
Sbjct: 7 RRSSSSLRRSRRAARRSRRDGRVGSRGRSRYRSRR--RSSRRSSTRRAELAD--TERDRY 62
Query: 68 R 68
R
Sbjct: 63 R 63
Score = 30.5 bits (69), Expect = 2.4
Identities = 19/71 (26%), Positives = 28/71 (39%), Gaps = 8/71 (11%)
Query: 25 RRDKDRDRRRRSRSHERRSERDRDRDLERRKEKSRGSKRRSRSREAERSKDHSKKEEKDK 84
+ RRSR RRS RD R R + + R +R SR +++ E
Sbjct: 5 ALRRSSSSLRRSRRAARRSRRD-GRVGSRGRSRYRSRRRSSRRS-------STRRAELAD 56
Query: 85 REKEEEEAAFD 95
E++ A F
Sbjct: 57 TERDRYRAYFA 67
>gnl|CDD|129701 TIGR00614, recQ_fam, ATP-dependent DNA helicase, RecQ family. All
proteins in this family for which functions are known
are 3'-5' DNA-DNA helicases. These proteins are used for
recombination, recombinational repair, and possibly
maintenance of chromosome stability. This family is
based on the phylogenomic analysis of JA Eisen (1999,
Ph.D. Thesis, Stanford University) [DNA metabolism, DNA
replication, recombination, and repair].
Length = 470
Score = 35.1 bits (81), Expect = 0.12
Identities = 34/151 (22%), Positives = 67/151 (44%), Gaps = 34/151 (22%)
Query: 346 YEKPTPIQAQAIPAIMSGRDLIGIAKTGSGKTVAFVLPLLRHILDQPPLEETDGPMAIIM 405
P+Q + I A++ GRD + TG GK++ + LP L +DG + +++
Sbjct: 9 LSSFRPVQLEVINAVLLGRDCFVVMPTGGGKSLCYQLPALC----------SDG-ITLVI 57
Query: 406 SPTREL----CMQIGKEAKKFTKSLGLRVVCVYGGTGISEQ---ISELKRGA-EIIVCTP 457
SP L +Q+ K+ G+ + +Q +++LK G +++ TP
Sbjct: 58 SPLISLMEDQVLQL--------KASGIPATFLNSSQSKEQQKNVLTDLKDGKIKLLYVTP 109
Query: 458 GRMIDMLAANSG---RVTNLRRVTYIVLDEA 485
+ +A++ + + +T I +DEA
Sbjct: 110 EK----CSASNRLLQTLEERKGITLIAVDEA 136
>gnl|CDD|221931 pfam13136, DUF3984, Protein of unknown function (DUF3984). This
family of proteins is functionally uncharacterized. This
family of proteins is found in eukaryotes. Proteins in
this family are typically between 393 and 442 amino
acids in length.
Length = 301
Score = 34.7 bits (80), Expect = 0.12
Identities = 29/118 (24%), Positives = 39/118 (33%), Gaps = 26/118 (22%)
Query: 2 VRSRRKRSRSRSPSPSHKRPKESRRDKDRDRRRRSRSHERRSERDRDRDLERRKEKSRGS 61
SRR RS +PS + RR SRS RR R DL
Sbjct: 179 RASRRGRSGYSTPSAAL-------------SRRGSRSASRRGSRA---DLSM-----TPL 217
Query: 62 KRRSRSREAERSKDHSKKEEKDKREKEEEEAAFDP-----SKLDKEVEATRLELEMQK 114
+ R E R + D+ + E + D S + E E E E+Q+
Sbjct: 218 EARRADAEDSRDTVLLGPDFVDEDIRAEMASIDDESFSSLSDSESESEDEIDEAEVQR 275
Score = 30.8 bits (70), Expect = 1.9
Identities = 24/71 (33%), Positives = 31/71 (43%), Gaps = 2/71 (2%)
Query: 8 RSRSRSPSPSHKRPKESRRDKDRDRRRRSRSHERRSERDRDRDLERRKEKSRGSKRRSRS 67
SRS S S HKR K SRR D +S+S R R+ KS + R S
Sbjct: 58 HSRSPSRSRLHKRKKSSRRSPMSDTLLKSKSSAHLLHHQSTR--SHRRSKSGTTSPRKPS 115
Query: 68 REAERSKDHSK 78
A R ++ S+
Sbjct: 116 SSAHRRRNDSE 126
>gnl|CDD|218684 pfam05672, MAP7, MAP7 (E-MAP-115) family. The organisation of
microtubules varies with the cell type and is presumably
controlled by tissue-specific microtubule-associated
proteins (MAPs). The 115-kDa epithelial MAP
(E-MAP-115/MAP7) has been identified as a
microtubule-stabilising protein predominantly expressed
in cell lines of epithelial origin. The binding of this
microtubule associated protein is nucleotide
independent.
Length = 171
Score = 33.5 bits (76), Expect = 0.14
Identities = 35/125 (28%), Positives = 66/125 (52%), Gaps = 4/125 (3%)
Query: 19 KRPKESRRDKDRDRRRRSRSHERRSERDRDRDLERRKEKSRGSKRRSRSREAERSKDHSK 78
+ +E R +++DR R E R + L R +E R + R+R +E + + +
Sbjct: 41 QEEQERREQEEQDRLER----EELKRRAAEERLRREEEARRQEEERAREKEEKAKRKAEE 96
Query: 79 KEEKDKREKEEEEAAFDPSKLDKEVEATRLELEMQKRRDRIERWRAERKKKDIETIKKDI 138
+E++++ E+E + + ++ EA R+ LE +K +IE+ R ERKK+ E +K+
Sbjct: 97 EEKQEQEEQERIQKQKEEAEARAREEAERMRLEREKHFQQIEQERLERKKRLEEIMKRTR 156
Query: 139 KSNLS 143
KS +S
Sbjct: 157 KSEVS 161
>gnl|CDD|114629 pfam05917, DUF874, Helicobacter pylori protein of unknown function
(DUF874). This family consists of several hypothetical
proteins specific to Helicobacter pylori. The function
of this family is unknown.
Length = 417
Score = 34.5 bits (78), Expect = 0.15
Identities = 26/110 (23%), Positives = 54/110 (49%), Gaps = 7/110 (6%)
Query: 29 DRDRRRRSRSHERRSERDRDR------DLERRKEKSRGSKRRSRSREAERSKDHSKKE-E 81
D+D++ ++ +E RDR +LE+ ++K+ K+++ E + K E E
Sbjct: 123 DQDKKIELAQAKKEAENARDRANKSGIELEQEEQKTEQEKQKTEKEGIELANSQIKAEQE 182
Query: 82 KDKREKEEEEAAFDPSKLDKEVEATRLELEMQKRRDRIERWRAERKKKDI 131
K K E+E+++ + K +ELE +K++ E+ +++KD
Sbjct: 183 KQKTEQEKQKTEQEKQKTSNIANKNAIELEQEKQKTENEKQDLIKEQKDF 232
>gnl|CDD|173502 PTZ00266, PTZ00266, NIMA-related protein kinase; Provisional.
Length = 1021
Score = 34.7 bits (79), Expect = 0.16
Identities = 38/172 (22%), Positives = 74/172 (43%), Gaps = 12/172 (6%)
Query: 25 RRDKDRDRRRRSRSHERRSERDRDRDLERRKEKSRGSKRRSRSREAERSKDHSKKEEKDK 84
R DKD R R E+ + + +++ ++K R ER + + E+ +
Sbjct: 429 RVDKDHAERARI---EKENAHRKALEMKILEKKRIERLEREERERLERERMERIERERLE 485
Query: 85 REKEEEEAAFDPSKLDKE-VEATRLELEMQKRRDRIERWRAERKKKDIETIKKDIKSNLS 143
RE+ E E +L+++ +E RL+ ++R DR+ER R E+ +++ +K N
Sbjct: 486 RERLERE------RLERDRLERDRLDRLERERVDRLERDRLEKARRNSYFLKG--MENGL 537
Query: 144 SGLGGSAPMKKWNLEDDSDEDENDNKDENGKTAEEDIDPLDAFMQGVHEEMR 195
S GG + +D ++ +G + + GVH+ +R
Sbjct: 538 SAGGGPGDGPGVGAGVGAGVGTSDGRNHSGVRSGIHCSIQSSARGGVHDSVR 589
Score = 30.9 bits (69), Expect = 2.8
Identities = 25/90 (27%), Positives = 42/90 (46%), Gaps = 2/90 (2%)
Query: 17 SHKRPKESRRDKDRDRRRRSRSHERRSERDRDRDLER-RKEKSRGSKRRSRSREAERSK- 74
+H++ E + + + R R R ER+R +ER R E+ R + R ER +
Sbjct: 445 AHRKALEMKILEKKRIERLEREERERLERERMERIERERLERERLERERLERDRLERDRL 504
Query: 75 DHSKKEEKDKREKEEEEAAFDPSKLDKEVE 104
D ++E D+ E++ E A S K +E
Sbjct: 505 DRLERERVDRLERDRLEKARRNSYFLKGME 534
>gnl|CDD|223039 PHA03307, PHA03307, transcriptional regulator ICP4; Provisional.
Length = 1352
Score = 34.8 bits (80), Expect = 0.18
Identities = 17/65 (26%), Positives = 20/65 (30%), Gaps = 2/65 (3%)
Query: 4 SRRKRSRSRSPSPSHKRPKESRRDKDRDRRRRSRSHERRSERDRDRDLERRKEKSRGSKR 63
S SRSPSPS RP +R R R RR + +
Sbjct: 341 VSPGPSPSRSPSPS--RPPPPADPSSPRKRPRPSRAPSSPAASAGRPTRRRARAAVAGRA 398
Query: 64 RSRSR 68
R R
Sbjct: 399 RRRDA 403
Score = 31.7 bits (72), Expect = 1.6
Identities = 11/47 (23%), Positives = 14/47 (29%)
Query: 3 RSRRKRSRSRSPSPSHKRPKESRRDKDRDRRRRSRSHERRSERDRDR 49
S R + P R RRR+R+ R RD
Sbjct: 357 PPPADPSSPRKRPRPSRAPSSPAASAGRPTRRRARAAVAGRARRRDA 403
Score = 30.9 bits (70), Expect = 2.5
Identities = 16/71 (22%), Positives = 23/71 (32%), Gaps = 4/71 (5%)
Query: 3 RSRRKRSRSRSPSPSHKRPKESRRDKDRDRRRRSRSHERRSERDRDRDLERRKEKSRGSK 62
S R + SP PS R R R+ R R ++
Sbjct: 332 SSESSRGAAVSPGPSPSRSPSPSRPPPPADPSSP----RKRPRPSRAPSSPAASAGRPTR 387
Query: 63 RRSRSREAERS 73
RR+R+ A R+
Sbjct: 388 RRARAAVAGRA 398
>gnl|CDD|235032 PRK02362, PRK02362, ski2-like helicase; Provisional.
Length = 737
Score = 34.5 bits (80), Expect = 0.20
Identities = 66/255 (25%), Positives = 118/255 (46%), Gaps = 39/255 (15%)
Query: 351 PIQAQAIPA-IMSGRDLIGIAKTGSGKTVAFVLPLLRHILDQPPLEETDGPMAIIMSPTR 409
P QA+A+ A ++ G++L+ T SGKT+ L +L+ I G A+ + P R
Sbjct: 26 PPQAEAVEAGLLDGKNLLAAIPTASGKTLIAELAMLKAIA--------RGGKALYIVPLR 77
Query: 410 ELCMQIGKEAKKFTKSLGLRVVCVYGGTGIS----EQISELKRGAEIIVCTPGRMIDMLA 465
L + +E ++F + LG+RV GIS + E +IIV T + +D L
Sbjct: 78 ALASEKFEEFERFEE-LGVRV-------GISTGDYDSRDEWLGDNDIIVATSEK-VDSLL 128
Query: 466 ANSGRVTNLRRVTYIVLDEADRMFDMGFEPQVMRIIDNVR---PDRQTVMFSATF--PRQ 520
N L +T +V+DE + P + + +R PD Q V SAT +
Sbjct: 129 RNGAPW--LDDITCVVVDEVHLIDSANRGPTLEVTLAKLRRLNPDLQVVALSATIGNADE 186
Query: 521 MEAL--ARRILN--KPIEIQVG---GRSVVCKEVEQHVIVLDEEQKMLKLLELLGIYQDQ 573
+ A + + +PI+++ G G ++ + ++ V V ++ + +L+ L ++
Sbjct: 187 LADWLDAELVDSEWRPIDLREGVFYGGAIHFDDSQREVEVPSKDDTLNLVLDTL---EEG 243
Query: 574 GSVIVFVDKQENADS 588
G +VFV + NA+
Sbjct: 244 GQCLVFVSSRRNAEG 258
>gnl|CDD|152107 pfam11671, Apis_Csd, Complementary sex determiner protein. This
family of proteins represents the complementary sex
determiner in the honeybee. In the honeybee, the
mechanism of sex determination depends on the csd gene
which produces an SR-type protein. Males are homozygous
while females are heterozygous for the csd gene.
Heterozygosity generates an active protein which
initiates female development.
Length = 146
Score = 32.8 bits (74), Expect = 0.24
Identities = 14/40 (35%), Positives = 23/40 (57%)
Query: 3 RSRRKRSRSRSPSPSHKRPKESRRDKDRDRRRRSRSHERR 42
RSR + +S S++ +E+ R++ RDR R RS E +
Sbjct: 8 RSREREQKSYKNENSYREYRETSRERSRDRTERERSREHK 47
Score = 30.8 bits (69), Expect = 1.0
Identities = 16/46 (34%), Positives = 23/46 (50%), Gaps = 2/46 (4%)
Query: 24 SRRDKDRDRRRRSRSHERRSERDRDRDLERRKEKSRGSKRRSRSRE 69
SR+ R R R +S+ ++E E +E+SR R RSRE
Sbjct: 2 SRKRYSRSREREQKSY--KNENSYREYRETSRERSRDRTERERSRE 45
Score = 29.3 bits (65), Expect = 3.0
Identities = 18/46 (39%), Positives = 28/46 (60%), Gaps = 3/46 (6%)
Query: 32 RRRRSRSHERRSERDRDRDLERRKEKSRGSKRRSRSR-EAERSKDH 76
R+R SRS ER + ++ + R ++ S+ RSR R E ERS++H
Sbjct: 3 RKRYSRSREREQKSYKNENSYREYRET--SRERSRDRTERERSREH 46
>gnl|CDD|216205 pfam00937, Corona_nucleoca, Coronavirus nucleocapsid protein.
Length = 346
Score = 33.5 bits (77), Expect = 0.28
Identities = 24/83 (28%), Positives = 33/83 (39%), Gaps = 8/83 (9%)
Query: 3 RSR---RKRSRSRSPSPSHKRPKESRRDKDRDRRRRSRSHERRSERDRDRDLERRKEKSR 59
RSR R SRS S PS R+ R+R S + L K+KS
Sbjct: 155 RSRSSSRSSSRSNSRGPSRGSS----RNNSRNRNSSSPDDLVAAVLAALAKLGFGKQKSS 210
Query: 60 GSKRRSRSREAERSKDHSKKEEK 82
SK+ SR + ++ K+ K
Sbjct: 211 -SKKPSRVTKKSAAEAAKKQLNK 232
>gnl|CDD|233758 TIGR02169, SMC_prok_A, chromosome segregation protein SMC,
primarily archaeal type. SMC (structural maintenance of
chromosomes) proteins bind DNA and act in organizing and
segregating chromosomes for partition. SMC proteins are
found in bacteria, archaea, and eukaryotes. It is found
in a single copy and is homodimeric in prokaryotes, but
six paralogs (excluded from this family) are found in
eukaryotes, where SMC proteins are heterodimeric. This
family represents the SMC protein of archaea and a few
bacteria (Aquifex, Synechocystis, etc); the SMC of other
bacteria is described by TIGR02168. The N- and
C-terminal domains of this protein are well conserved,
but the central hinge region is skewed in composition
and highly divergent [Cellular processes, Cell division,
DNA metabolism, Chromosome-associated proteins].
Length = 1164
Score = 33.9 bits (78), Expect = 0.29
Identities = 40/179 (22%), Positives = 74/179 (41%), Gaps = 16/179 (8%)
Query: 32 RRRRSRSHERRSERDRDRDLERRKEKSRGSKRRSRSREAERSKDHSKKEEKDKREKEEEE 91
R++ R R + +R + L + K + G + ER K+ +++ E+E E+
Sbjct: 197 RQQLERLRREREKAERYQALLKEKREYEGYELLKEKEALERQKEAIERQLASL-EEELEK 255
Query: 92 AAFDPSKLDKEVEATRLEL-EMQKRRDRIERWRAERKKKDIETIKKDIKSNLSSGLGGSA 150
+ S+L+K +E L E+ K+ + R K+ I ++ +I S L S
Sbjct: 256 LTEEISELEKRLEEIEQLLEELNKKIKDLGEEEQLRVKEKIGELEAEIAS-----LERSI 310
Query: 151 PMKKWNLEDDSDEDENDNKDENGKTAEEDIDPLDAFMQGVHEEMRKVNKPAVPTTADVK 209
K+ LED +E E +ID L A ++ + E+ + K T +
Sbjct: 311 AEKERELED---------AEERLAKLEAEIDKLLAEIEELEREIEEERKRRDKLTEEYA 360
>gnl|CDD|137505 PRK09751, PRK09751, putative ATP-dependent helicase Lhr;
Provisional.
Length = 1490
Score = 33.7 bits (77), Expect = 0.30
Identities = 34/133 (25%), Positives = 55/133 (41%), Gaps = 19/133 (14%)
Query: 369 IAKTGSGKTV-AFVLPLLRHILDQPPLEETDGPMA----IIMSPTRELCMQIGKEAKKFT 423
IA TGSGKT+ AF+ L R + + +SP + L + + +
Sbjct: 2 IAPTGSGKTLAAFLYALDRLFREGGEDTREAHKRKTSRILYISPIKALGTDVQRNLQIPL 61
Query: 424 KSLG------------LRVVCVYGGTGISEQISELKRGAEIIVCTPGRMIDMLAANSGRV 471
K + LRV G T E+ + +I++ TP + ML + + R
Sbjct: 62 KGIADERRRRGETEVNLRVGIRTGDTPAQERSKLTRNPPDILITTPESLYLMLTSRA-RE 120
Query: 472 TNLRRVTYIVLDE 484
T LR V +++DE
Sbjct: 121 T-LRGVETVIIDE 132
>gnl|CDD|153337 cd07653, F-BAR_CIP4-like, The F-BAR (FES-CIP4 Homology and
Bin/Amphiphysin/Rvs) domain of Cdc42-Interacting Protein
4 and similar proteins. F-BAR domains are dimerization
modules that bind and bend membranes and are found in
proteins involved in membrane dynamics and actin
reorganization. This subfamily is composed of
Cdc42-Interacting Protein 4 (CIP4), Formin Binding
Protein 17 (FBP17), FormiN Binding Protein 1-Like
(FNBP1L), and similar proteins. CIP4 and FNBP1L are
Cdc42 effectors that bind Wiskott-Aldrich syndrome
protein (WASP) and function in endocytosis. CIP4 and
FBP17 bind to the Fas ligand and may be implicated in
the inflammatory response. CIP4 may also play a role in
phagocytosis. Members of this subfamily typically
contain an N-terminal F-BAR domain and a C-terminal SH3
domain. In addition, some members such as FNBP1L contain
a central Cdc42-binding HR1 domain. F-BAR domains form
banana-shaped dimers with a positively-charged concave
surface that binds to negatively-charged lipid
membranes. They can induce membrane deformation in the
form of long tubules.
Length = 251
Score = 33.0 bits (76), Expect = 0.32
Identities = 19/65 (29%), Positives = 35/65 (53%), Gaps = 3/65 (4%)
Query: 52 ERRKEKSRGSKRRSRSREAERSKDHSKKE-EKDKREKEEEEAAFDPSKLDKEVEATRLEL 110
ER+K S GSK + + + + + SKK EK +E E+ + ++ K D ++ T+ ++
Sbjct: 106 ERKKHLSEGSKLQQKLESSIKQLEKSKKAYEKAFKEAEKAKQKYE--KADADMNLTKADV 163
Query: 111 EMQKR 115
E K
Sbjct: 164 EKAKA 168
>gnl|CDD|217301 pfam02956, TT_ORF1, TT viral orf 1. TT virus (TTV), isolated
initially from a Japanese patient with hepatitis of
unknown aetiology, has since been found to infect both
healthy and diseased individuals and numerous
prevalence studies have raised questions about its role
in unexplained hepatitis. ORF1 is a large 750 residue
protein. The N-terminal half of this protein
corresponds to the capsid protein.
Length = 525
Score = 33.4 bits (77), Expect = 0.34
Identities = 26/66 (39%), Positives = 32/66 (48%), Gaps = 12/66 (18%)
Query: 3 RSRRKRSRSRSPSPSHKRPKESRRDKDRDRRRRSRSHERRSERDRDRDLERRKEKSRGSK 62
R RR+R R R +R RR R RRRR R RR R R R RR+ + R +
Sbjct: 7 RRRRRRWRGRR----RRRR---RR---RARRRRRRRRVRR--RRRGRRRRRRRRRRRRRR 54
Query: 63 RRSRSR 68
RR R +
Sbjct: 55 RRKRKK 60
>gnl|CDD|222914 PHA02666, PHA02666, hypothetical protein; Provisional.
Length = 287
Score = 33.4 bits (75), Expect = 0.34
Identities = 21/113 (18%), Positives = 34/113 (30%), Gaps = 5/113 (4%)
Query: 7 KRSRSRSPSPSHKRPKESRRDKDRDRRR-RSRSHERRSERDRDRDLERRKEKSRGSKRRS 65
+R + S RP R +R S +HE + R K SR S R
Sbjct: 32 RRRANSMESRRKSRPSRQHRSAERTPTTASSLTHENNTAPSRHGKQHSCKASSRSSHNRG 91
Query: 66 R----SREAERSKDHSKKEEKDKREKEEEEAAFDPSKLDKEVEATRLELEMQK 114
H + K + + P + ++ RL E ++
Sbjct: 92 STSSSHNHHAHRGPHQSAHRRSKHDAVRDTYQPCPQSPETDLYKGRLPGETER 144
>gnl|CDD|236794 PRK10917, PRK10917, ATP-dependent DNA helicase RecG; Provisional.
Length = 681
Score = 33.2 bits (77), Expect = 0.45
Identities = 28/88 (31%), Positives = 39/88 (44%), Gaps = 12/88 (13%)
Query: 373 GSGKTVAFVLPLLRHILDQPPLEETDGPMAIIMSPTRELCMQIGKEAKKFTKSLGLRVVC 432
GSGKTV L L I G A +M+PT L Q + KK + LG+RV
Sbjct: 292 GSGKTVVAALAALAAI--------EAGYQAALMAPTEILAEQHYENLKKLLEPLGIRVAL 343
Query: 433 VYGGTGIS---EQISELKRG-AEIIVCT 456
+ G E + + G A+I++ T
Sbjct: 344 LTGSLKGKERREILEAIASGEADIVIGT 371
>gnl|CDD|234017 TIGR02794, tolA_full, TolA protein. TolA couples the inner
membrane complex of itself with TolQ and TolR to the
outer membrane complex of TolB and OprL (also called
Pal). Most of the length of the protein consists of
low-complexity sequence that may differ in both length
and composition from one species to another,
complicating efforts to discriminate TolA (the most
divergent gene in the tol-pal system) from paralogs such
as TonB. Selection of members of the seed alignment and
criteria for setting scoring cutoffs are based largely
on conserved operon structure. The Tol-Pal complex is
required for maintaining outer membrane integrity. Also
involved in transport (uptake) of colicins and
filamentous DNA, and implicated in pathogenesis.
Transport is energized by the proton motive force. TolA
is an inner membrane protein that interacts with
periplasmic TolB and with outer membrane porins ompC,
phoE and lamB [Transport and binding proteins, Other,
Cellular processes, Pathogenesis].
Length = 346
Score = 32.9 bits (75), Expect = 0.46
Identities = 31/128 (24%), Positives = 55/128 (42%), Gaps = 8/128 (6%)
Query: 20 RPKESRRDKDRDRRRRSRSHERRSERDRDRDLERRKEKSRGSKR--------RSRSREAE 71
KE R K +++ +R +E+ R ++LE+R + +K+ + ++AE
Sbjct: 63 AKKEQERQKKLEQQAEEAEKQRAAEQARQKELEQRAAAEKAAKQAEQAAKQAEEKQKQAE 122
Query: 72 RSKDHSKKEEKDKREKEEEEAAFDPSKLDKEVEATRLELEMQKRRDRIERWRAERKKKDI 131
+K E K K E E E+ A + +K E EA K++ + +AE + K
Sbjct: 123 EAKAKQAAEAKAKAEAEAEKKAKEEAKKQAEEEAKAKAAAEAKKKAAEAKKKAEAEAKAK 182
Query: 132 ETIKKDIK 139
K K
Sbjct: 183 AEAKAKAK 190
Score = 31.7 bits (72), Expect = 1.2
Identities = 23/99 (23%), Positives = 46/99 (46%), Gaps = 16/99 (16%)
Query: 31 DRRRRSRSHERRSERDRDRDLERRKEKSRGSKRRSRSREAERSKDHSKKEEKDKREKEEE 90
+R ++ + + E++R + LE++ E+ + R+ E R K+ ++ +K K+ E
Sbjct: 53 NRIQQQKKPAAKKEQERQKKLEQQAEE----AEKQRAAEQARQKELEQRAAAEKAAKQAE 108
Query: 91 EAAFDPSKLDKEVEATRLELEMQKRRDRIERWRAERKKK 129
+AA K+ E + + E K + AE K K
Sbjct: 109 QAA-------KQAEEKQKQAEEAKAKQA-----AEAKAK 135
Score = 29.4 bits (66), Expect = 6.5
Identities = 27/122 (22%), Positives = 59/122 (48%), Gaps = 7/122 (5%)
Query: 19 KRPKESRRDKDRDRRRRSRSHERRSERDRDRDLERRKEKSRGSKRRSRSREAE---RSKD 75
++ K+ K+++R+++ +E+ R + R+KE + + +++AE + +
Sbjct: 56 QQQKKPAAKKEQERQKKLEQQAEEAEKQRAAEQARQKELEQRAAAEKAAKQAEQAAKQAE 115
Query: 76 HSKKEEKDKREKEEEEAAFDPSKLDKEVE-ATRLELEMQKRRDRIERWRAERKKKDIETI 134
+K+ ++ + K+ EA +K + E E + E + Q + + AE KKK E
Sbjct: 116 EKQKQAEEAKAKQAAEAK---AKAEAEAEKKAKEEAKKQAEEEAKAKAAAEAKKKAAEAK 172
Query: 135 KK 136
KK
Sbjct: 173 KK 174
>gnl|CDD|112890 pfam04094, DUF390, Protein of unknown function (DUF390). This is a
family of long proteins currently only found in the rice
genome. They have no known function. However they may be
some kind of transposable element.
Length = 843
Score = 33.3 bits (75), Expect = 0.46
Identities = 25/97 (25%), Positives = 46/97 (47%), Gaps = 3/97 (3%)
Query: 4 SRRKRSRSRSPSPSHKRPKESRRDKDRDRRRRSRSHERRSERDRDRDLERRKEKSRGSKR 63
S+ +S + P+ + R +E+ R + DR R + + + R R + R+E +R +
Sbjct: 220 SKSGQSEAEDPAAAEARRREADRREAADRLREAEEAAQDAARARQAEEAAREEAARARQA 279
Query: 64 RSRSREAE---RSKDHSKKEEKDKREKEEEEAAFDPS 97
+REAE R+ + + E + E + A DPS
Sbjct: 280 EEAAREAEAAFRADEAAATSEAARDEAAGAQLAPDPS 316
Score = 30.2 bits (67), Expect = 4.2
Identities = 25/99 (25%), Positives = 44/99 (44%), Gaps = 3/99 (3%)
Query: 12 RSPSPSHKRPKESRRDKDRDRRRRSRSHERRSERDRDRDLERRKEKSRGSKRRSRSREAE 71
S PS RP + + + + RR E DR +R +E ++ +R+R+AE
Sbjct: 209 LSEIPS--RPSRHSKSGQSEAEDPAAAEARRREADRREAADRLREAEEAAQDAARARQAE 266
Query: 72 RS-KDHSKKEEKDKREKEEEEAAFDPSKLDKEVEATRLE 109
+ ++ + + + + E EAAF + EA R E
Sbjct: 267 EAAREEAARARQAEEAAREAEAAFRADEAAATSEAARDE 305
>gnl|CDD|227615 COG5296, COG5296, Transcription factor involved in TATA site
selection and in elongation by RNA polymerase II
[Transcription].
Length = 521
Score = 33.1 bits (75), Expect = 0.47
Identities = 23/82 (28%), Positives = 38/82 (46%), Gaps = 5/82 (6%)
Query: 24 SRRDKDRDRRRRSRSHERRSERDRDRDLERRKEKSRGSKRRS-RSREAERSKDHSKKEEK 82
S D R R +E + + LE +K + R + S R E +R KD+ + EE
Sbjct: 122 SSGCTDTRRSTRYEPLTSAAEEKKKKLLELKKTREREERLYSERHIELQRFKDYKELEES 181
Query: 83 DKREKEEEEAAFDPSKLDKEVE 104
++ +EE + PS ++ VE
Sbjct: 182 EQGLQEE----YTPSYAEEAVE 199
>gnl|CDD|218734 pfam05758, Ycf1, Ycf1. The chloroplast genomes of most higher
plants contain two giant open reading frames designated
ycf1 and ycf2. Although the function of Ycf1 is unknown,
it is known to be an essential gene.
Length = 832
Score = 33.1 bits (76), Expect = 0.49
Identities = 21/88 (23%), Positives = 35/88 (39%), Gaps = 14/88 (15%)
Query: 12 RSPSPSH-KRPKESRRDKDRDRRRRSRSHERRSERDRDRDLERRKEKSRGSKRRSRSREA 70
R PSP K+ KE+ ++R E + D ++E E + + S E
Sbjct: 216 RIPSPFFTKKLKETSETEER-------------EEETDVEIETTSETKGTKQEQEGSTEE 262
Query: 71 ERSKDHSKKEEKDKREKEEEEAAFDPSK 98
+ S +KE+ DK E ++ K
Sbjct: 263 DPSLFSEEKEDPDKTEDLDKLEILKEKK 290
>gnl|CDD|114011 pfam05262, Borrelia_P83, Borrelia P83/100 protein. This family
consists of several Borrelia P83/P100 antigen proteins.
Length = 489
Score = 33.1 bits (75), Expect = 0.51
Identities = 31/184 (16%), Positives = 76/184 (41%), Gaps = 9/184 (4%)
Query: 66 RSREAERSKDHSKKEEKDKREKEEEEAAFDPSKLDKEVEATRLELEMQKRRDRIERWRAE 125
RE + +++ D +E+E +E A +L +E++ +++ + +++ + A+
Sbjct: 185 ALREDNEKGVNFRRDMTDLKERESQEDAKRAQQLKEELDKKQIDADKAQQKADFAQDNAD 244
Query: 126 RKKKDIETIKKDIKSNLSSGLGGSAPMKKWNLEDDSDEDENDNKDENGKTAEEDIDPLDA 185
+++ ++ +++ K+ S K E+ E E E K EE + D
Sbjct: 245 KQRDEVRQKQQEAKNLPKPADTSSPKEDKQVAENQKREIEKAQI-EIKKNDEEALKAKDH 303
Query: 186 FMQGVHEEMRKVNKPAVPTTADVKPADSGSKPAGVVIVTGVVKKSVEKAKGELMEENQDG 245
+ +E + K A + A +P V ++K+ + + + N+D
Sbjct: 304 KAFDLKQESKASEKEAEDKELE---AQKKREP-----VAEDLQKTKPQVEAQPTSLNEDA 355
Query: 246 LEYS 249
++ S
Sbjct: 356 IDSS 359
Score = 31.1 bits (70), Expect = 2.0
Identities = 22/114 (19%), Positives = 55/114 (48%), Gaps = 10/114 (8%)
Query: 23 ESRRDKDRDRRRRSRSHERRSERDRDRDLERRKEKSRGSKRRSRSREAERSKDHSKKEEK 82
RRD + R S+ +R+++ ++ +++ + + + +A+ ++D++ K+
Sbjct: 195 NFRRDMTDLKERESQEDAKRAQQLKEELDKKQIDADKAQQ------KADFAQDNADKQRD 248
Query: 83 DKREKEEEEA-AFDPSKLDKEVEATRLELEMQKRRDRIERWRAERKKKDIETIK 135
+ R+K++E P+ E ++ ++ IE+ + E KK D E +K
Sbjct: 249 EVRQKQQEAKNLPKPADTSSPKEDKQVAENQKR---EIEKAQIEIKKNDEEALK 299
Score = 29.2 bits (65), Expect = 6.8
Identities = 20/136 (14%), Positives = 53/136 (38%), Gaps = 11/136 (8%)
Query: 7 KRSRSRSPSPSHKRPKESRRDKDRDRRRRSRSHERRSERDRDRDLERRKEKSRGSKRRSR 66
KR++ K+ + + D + + +R R + ++ + + + S +
Sbjct: 213 KRAQQLKEELDKKQIDADKAQQKADFAQDNADKQRDEVRQKQQEAKNLPKPADTSSPKED 272
Query: 67 SREAERSKDHSKKEEKDKREKEEEEAAFDPSKLDKEVEATRLELEMQKRRDRIERWRAER 126
+ AE K +K + + ++ +EE K + +L K+ + AE
Sbjct: 273 KQVAENQKREIEKAQIEIKKNDEE--------ALKAKDHKAFDL---KQESKASEKEAED 321
Query: 127 KKKDIETIKKDIKSNL 142
K+ + + ++ + +L
Sbjct: 322 KELEAQKKREPVAEDL 337
>gnl|CDD|113514 pfam04747, DUF612, Protein of unknown function, DUF612. This
family includes several uncharacterized proteins from
Caenorhabditis elegans.
Length = 517
Score = 32.7 bits (73), Expect = 0.55
Identities = 29/127 (22%), Positives = 64/127 (50%), Gaps = 3/127 (2%)
Query: 31 DRRRRSRSHERRSERDRDRDLERRKEKSRGSKRRSRSREAERSKDHSKKEEKDKREKEEE 90
D+R+ + + +E+ + + ++ EK + K+ ++ EAE+ + K EK+ R E E
Sbjct: 49 DQRKEAFASLELTEQPQQVEKVKKSEKKKAQKQIAKDHEAEQKVNAKKAAEKEARRAEAE 108
Query: 91 ---EAAFDPSKLDKEVEATRLELEMQKRRDRIERWRAERKKKDIETIKKDIKSNLSSGLG 147
AA + + E R++ E +K+ +++ +AE+KK+ +K K+ +
Sbjct: 109 AKKRAAQEEEHKQWKAEQERIQKEQEKKEADLKKLQAEKKKEKAVKAEKAEKAEKTKKAS 168
Query: 148 GSAPMKK 154
AP+++
Sbjct: 169 TPAPVEE 175
>gnl|CDD|130009 TIGR00934, 2a38euk, potassium uptake protein, Trk family. The
proteins of the Trk family are derived from
Gram-negative and Gram-positive bacteria, yeast and
wheat. The proteins of E. coli K12 TrkH and TrkG as well
as several yeast proteins have been functionally
characterized. The E. coli TrkH and TrkG proteins are
complexed to two peripheral membrane proteins, TrkA, an
NAD-binding protein, and TrkE, an ATP-binding protein.
This complex forms the potassium uptake system. This
family is specific for the eukaryotic Trk system
[Transport and binding proteins, Cations and iron
carrying compounds].
Length = 800
Score = 33.0 bits (75), Expect = 0.58
Identities = 42/314 (13%), Positives = 90/314 (28%), Gaps = 37/314 (11%)
Query: 4 SRRKRSRSRSPSPSHKRPKESRRDKDRDRRRRSRSHERRSERDRDRDLERRKEKSRGSKR 63
S+++ R+ + + + R + R + H RD L + R
Sbjct: 141 SKQRFFLRRTKTLLQR--ELEDRPETGVAGRVTVPHGSAKRRDFQDKLFSGEFVKRDEPD 198
Query: 64 -RSRSREAERSKDHS-KKEEKDKREKEEEEAAFDPSKLDKEVEATRLELEMQKRRDRIER 121
S +++ D S E +K K DP L + + + E + +
Sbjct: 199 QNSPDVKSDTRADESISDLEFEKFAKRRGSRDVDPEDLYRSIMMLQGIHERIREKSSANS 258
Query: 122 WRAERKKKDIETIKKDIKSNLSSGLGG--------------SAPMKKWNLEDDSDEDEND 167
ER + I+ + S + +++ D ++ + +
Sbjct: 259 RSDERSSESIQEQVERRPSTSDIERNSQSLTRRYDDKSFDKAVRLRRSKTIDRAEACDLE 318
Query: 168 NKDENGKTAEEDIDPLDAFM--------QGVHEEMRKVNKPAVPTTADVKPADSGSKPAG 219
D + D A +G + + RK + + A
Sbjct: 319 ELDRAKDFEKMTYDNWKAHHRKKKNFRPRGWNLKFRK-ASRFPKDSDRNYEDNGNHLSAS 377
Query: 220 VVIVTGVVKKSVEKAKGELMEENQD----------GLEYSSEEEQEDLTSTAANLASKQK 269
+ S E+ + ++ Y S + S L +QK
Sbjct: 378 SSFGSEEPSLSSEENLYPTYNKKREDSRHTLSKTMSTNYLSWQPTIGRNSNFVGLTKEQK 437
Query: 270 KELSKVDHSTIEYL 283
EL +++ ++ L
Sbjct: 438 DELGGIEYRALKCL 451
Score = 30.0 bits (67), Expect = 4.6
Identities = 21/114 (18%), Positives = 44/114 (38%)
Query: 1 MVRSRRKRSRSRSPSPSHKRPKESRRDKDRDRRRRSRSHERRSERDRDRDLERRKEKSRG 60
M++ +R R +S + S + S +++ RR S S R+ + R + +
Sbjct: 242 MLQGIHERIREKSSANSRSDERSSESIQEQVERRPSTSDIERNSQSLTRRYDDKSFDKAV 301
Query: 61 SKRRSRSREAERSKDHSKKEEKDKREKEEEEAAFDPSKLDKEVEATRLELEMQK 114
RRS++ + + D + + EK + + K L+ +K
Sbjct: 302 RLRRSKTIDRAEACDLEELDRAKDFEKMTYDNWKAHHRKKKNFRPRGWNLKFRK 355
>gnl|CDD|220371 pfam09736, Bud13, Pre-mRNA-splicing factor of RES complex. This
entry is characterized by proteins with alternating
conserved and low-complexity regions. Bud13 together
with Snu17p and a newly identified factor,
Pml1p/Ylr016c, form a novel trimeric complex called the
RES complex, the pre-mRNA retention and splicing complex.
Subunits of this complex are not essential for viability
of yeasts but they are required for efficient splicing
in vitro and in vivo. Furthermore, inactivation of this
complex causes pre-mRNA leakage from the nucleus. Bud13
contains a unique, phylogenetically conserved C-terminal
region of unknown function.
Length = 141
Score = 31.1 bits (71), Expect = 0.63
Identities = 26/83 (31%), Positives = 40/83 (48%), Gaps = 4/83 (4%)
Query: 35 RSRSHERRSERDRDRDLERRKEKSRGSKRRSRSREAERSKDHSKKEEKDKREKEEEEAAF 94
R +S ++ + ER KE+ K R +E E K +KEE++KR +E E+A
Sbjct: 5 RDKSGRIIDIEEKREEKEREKEE----KERKEEKEKEWGKGLVQKEEREKRLEELEKAKN 60
Query: 95 DPSKLDKEVEATRLELEMQKRRD 117
P + E EL+ Q+R D
Sbjct: 61 KPLARYADDEDYDEELKEQERWD 83
>gnl|CDD|221188 pfam11725, AvrE, Pathogenicity factor. This family is secreted by
gram-negative Gammaproteobacteria such as Pseudomonas
syringae of tomato and the fire blight plant pathogen
Erwinia amylovora, amongst others. It is an essential
pathogenicity factor of approximately 198 kDa. Its
injection into the host-plant is dependent upon the
bacterial type III or Hrp secretion system. The family
is long and carries a number of predicted functional
regions, including an ERMS or endoplasmic reticulum
membrane retention signal at both the C- and the
N-termini, a leucine-zipper motif from residues 539-560,
and a nuclear localisation signal at 1358-1361. This
conserved AvrE-family of effectors is among the few that
are required for full virulence of many phytopathogenic
pseudomonads, erwinias and pantoeas.
Length = 1771
Score = 32.8 bits (75), Expect = 0.63
Identities = 32/176 (18%), Positives = 48/176 (27%), Gaps = 10/176 (5%)
Query: 4 SRRKRSRSRSPSPSHKRPKESRRDKDRDRRRRSRSHERRSERDRDRDLERRKEKSRGSKR 63
SR R P ++ RR R E + K + R
Sbjct: 85 SRGPTLRELLALPEDDGETQAPESSPSARRLTRSEGVARHEMEDLAGRPVVKPDADRQLR 144
Query: 64 -----RSRSREAERSKDHSKKEEKDKREKEEEEAAFDPSKLDKEVEATRLELEMQKRRDR 118
+S S K A F ++ +EV+A R + Q R
Sbjct: 145 QDILNKSSSSRRPPVSKEEGTSSKMPATALASAALFKDDEIRQEVDAARSDQASQSRLS- 203
Query: 119 IERWRAERKKKDIETIKKDIKSNLSSGLGGSAPMKKWNLEDDSDEDENDNKDENGK 174
R+ I + L+ GG + NLE + D+ GK
Sbjct: 204 ----RSRGNPPAIPPDAAPRQPMLTRSAGGRFEGEDENLERNLQPQSPITLDKKGK 255
>gnl|CDD|222636 pfam14265, DUF4355, Domain of unknown function (DUF4355). This
family of proteins is found in bacteria and viruses.
Proteins in this family are typically between 180 and
214 amino acids in length.
Length = 125
Score = 31.1 bits (71), Expect = 0.65
Identities = 17/77 (22%), Positives = 31/77 (40%), Gaps = 5/77 (6%)
Query: 62 KRRSRSREAERSKDHSKKEEKDKREKEEEEAAFDPSKLDKEVEATRLELEMQKRRDRIER 121
K + + K EK + EK+ E KL K + E E++K +E
Sbjct: 6 KTFTDKEVDKAIAKEKAKWEKKQEEKKSEAE-----KLAKMSAEEKAEYELEKLEKELEE 60
Query: 122 WRAERKKKDIETIKKDI 138
AE +++++ K +
Sbjct: 61 LEAELARRELKAEAKKM 77
>gnl|CDD|165391 PHA03118, PHA03118, multifunctional expression regulator;
Provisional.
Length = 474
Score = 32.4 bits (73), Expect = 0.69
Identities = 27/137 (19%), Positives = 45/137 (32%), Gaps = 24/137 (17%)
Query: 20 RPKESRRDKDR---DRRRRSRSHERRSERDRDRD-------------LERRKEK-----S 58
P + D DR + ++H RR D + R E
Sbjct: 69 NPADVCEDADRAYTNPNFEKKAHGRREGYHHDDEKCLVTFLDDINHHGGRDTEPGHAHIE 128
Query: 59 RGSKRRSRSREAERSKDHSKKEEKDKREKEEEEAAFDPSKLDKEVEATRLELEMQKRRDR 118
G ++ +S + K H + ++K + A P + + L +R R
Sbjct: 129 NGERKSPKSYNQQSRKKHRDESLRNKHGRPSGPPAMSPGEHFDQTHDAEYRLRFNERDAR 188
Query: 119 IERWRAERKKKDIETIK 135
+R RK+ DI T K
Sbjct: 189 RDRI---RKEYDIPTDK 202
Score = 30.0 bits (67), Expect = 4.2
Identities = 14/47 (29%), Positives = 19/47 (40%)
Query: 3 RSRRKRSRSRSPSPSHKRPKESRRDKDRDRRRRSRSHERRSERDRDR 49
R R++ PS D+ D R R +ER + RDR R
Sbjct: 147 RDESLRNKHGRPSGPPAMSPGEHFDQTHDAEYRLRFNERDARRDRIR 193
Score = 29.7 bits (66), Expect = 5.9
Identities = 12/73 (16%), Positives = 24/73 (32%), Gaps = 4/73 (5%)
Query: 3 RSRRKRSRSRSPSPSHKRPKESRRDKDRDRRRRSRSHERRSERDRDRDLERRKEKSRGSK 62
RK +S + K R + R++ R S + + + R ++
Sbjct: 129 NGERKSPKSYNQQSRKKH----RDESLRNKHGRPSGPPAMSPGEHFDQTHDAEYRLRFNE 184
Query: 63 RRSRSREAERSKD 75
R +R + D
Sbjct: 185 RDARRDRIRKEYD 197
>gnl|CDD|236394 PRK09169, PRK09169, hypothetical protein; Validated.
Length = 2316
Score = 32.8 bits (75), Expect = 0.72
Identities = 17/88 (19%), Positives = 28/88 (31%), Gaps = 5/88 (5%)
Query: 4 SRRKRSRSRSPSPSHKRPKESRRDKDRDRRRRSRSHERRSE-----RDRDRDLERRKEKS 58
R R R P+ +R D R R R+ R RD +R L+R
Sbjct: 17 PADPRPRRRPRLGDAPAPRTARADSGATPRGRPRAGADREPTSEQLRDYERWLDRAAAGQ 76
Query: 59 RGSKRRSRSREAERSKDHSKKEEKDKRE 86
++R + ++ + D
Sbjct: 77 LDAQREQQCARLWFLVQQARARKVDPDF 104
>gnl|CDD|227519 COG5192, BMS1, GTP-binding protein required for 40S ribosome
biogenesis [Translation, ribosomal structure and
biogenesis].
Length = 1077
Score = 32.4 bits (73), Expect = 0.79
Identities = 29/130 (22%), Positives = 53/130 (40%), Gaps = 7/130 (5%)
Query: 70 AERSKDHSKKEEKDKREKEEEEAAFDPSKLDKEVEATRLELEMQKRRDRI-ERWRAERKK 128
E + +K E D ++E E FD SK+ E ++ E M+ + + ++W + +
Sbjct: 517 EEYKGESAKSSESDLVVQDEPEDFFDVSKVANESISSNHEKLMESEFEELKKKWSSLAQL 576
Query: 129 KDIETIKKDIKSNLSSGLGGSAPMKKWNLEDDSDE---DENDNKDENGK--TAEEDIDPL 183
K K ++ +K N ED DE +N+ ++ G TAE +
Sbjct: 577 KS-RFQKDATLDSIEGEEELIQDDEKGNFEDLEDEENSSDNEMEESRGSSVTAENEESAD 635
Query: 184 DAFMQGVHEE 193
+ + EE
Sbjct: 636 EVDYETEREE 645
>gnl|CDD|234173 TIGR03346, chaperone_ClpB, ATP-dependent chaperone ClpB. Members
of this protein family are the bacterial ATP-dependent
chaperone ClpB. This protein belongs to the AAA family,
ATPases associated with various cellular activities
(pfam00004). This molecular chaperone does not act as a
protease, but rather serves to disaggregate misfolded
and aggregated proteins [Protein fate, Protein folding
and stabilization].
Length = 852
Score = 32.2 bits (74), Expect = 0.90
Identities = 22/66 (33%), Positives = 34/66 (51%), Gaps = 6/66 (9%)
Query: 54 RKEKSRGSKRRSRSRE---AERSKDHSKKEEKDKREKEEEEAAFDPSKLDKEVEATRLEL 110
+KEK SK R E AE ++++ EE+ K EK + ++ +E+E RLEL
Sbjct: 425 KKEKDEASKERLEDLEKELAELEEEYADLEEQWKAEKAAIQGI---QQIKEEIEQVRLEL 481
Query: 111 EMQKRR 116
E +R
Sbjct: 482 EQAERE 487
>gnl|CDD|218790 pfam05876, Terminase_GpA, Phage terminase large subunit (GpA).
This family consists of several phage terminase large
subunit proteins as well as related sequences from
several bacterial species. The DNA packaging enzyme of
bacteriophage lambda, terminase, is a heteromultimer
composed of a small subunit, gpNu1, and a large subunit,
gpA, products of the Nu1 and A genes, respectively.
Terminase is involved in the site-specific binding and
cutting of the DNA in the initial stages of packaging.
It is now known that gpA is actively involved in late
stages of packaging, including DNA translocation, and
that this enzyme contains separate functional domains
for its early and late packaging activities.
Length = 552
Score = 32.2 bits (74), Expect = 0.90
Identities = 13/34 (38%), Positives = 18/34 (52%), Gaps = 4/34 (11%)
Query: 457 PGRMIDMLAANSGRVTNLRRVT--YIVLDEADRM 488
PG + ++ ANS NLR Y++LDE D
Sbjct: 115 PGGSLLLIGANS--PANLRSRPVRYVILDEVDAY 146
>gnl|CDD|215824 pfam00260, Protamine_P1, Protamine P1.
Length = 51
Score = 28.7 bits (64), Expect = 0.93
Identities = 20/54 (37%), Positives = 25/54 (46%), Gaps = 3/54 (5%)
Query: 4 SRRKRSRSRSPSPSHKRPKESRRDKDRDRRRRSRSHERRSERDRDRDLERRKEK 57
+R + RSRS S +R RR R RRR R RR R R RR+ +
Sbjct: 1 ARYRCCRSRSRSRCRRRR---RRRCRRRRRRCCRRRRRRVGCCRRRYTRRRRRR 51
Score = 28.7 bits (64), Expect = 1.0
Identities = 19/47 (40%), Positives = 22/47 (46%)
Query: 3 RSRRKRSRSRSPSPSHKRPKESRRDKDRDRRRRSRSHERRSERDRDR 49
R R RSRSRS +R + RR + RRRR R R R R
Sbjct: 2 RYRCCRSRSRSRCRRRRRRRCRRRRRRCCRRRRRRVGCCRRRYTRRR 48
>gnl|CDD|182933 PRK11057, PRK11057, ATP-dependent DNA helicase RecQ; Provisional.
Length = 607
Score = 32.0 bits (73), Expect = 0.99
Identities = 15/40 (37%), Positives = 26/40 (65%)
Query: 346 YEKPTPIQAQAIPAIMSGRDLIGIAKTGSGKTVAFVLPLL 385
Y++ P Q + I A++SGRD + + TG GK++ + +P L
Sbjct: 23 YQQFRPGQQEIIDAVLSGRDCLVVMPTGGGKSLCYQIPAL 62
>gnl|CDD|215814 pfam00242, DNA_pol_viral_N, DNA polymerase (viral) N-terminal
domain.
Length = 379
Score = 31.7 bits (72), Expect = 0.99
Identities = 17/83 (20%), Positives = 27/83 (32%), Gaps = 8/83 (9%)
Query: 2 VRSRRKRSRSRSPSPSHKRPKESRRDKDRDRRRRSRSH--ERRSERDRD---RDLERRKE 56
+S+ +RSR + + K + + R R R H RR R
Sbjct: 241 RQSQIQRSRLGLQA---NQGKLAHGQQGRSGSIRGRKHSTTRRPFGVEPSSSGVTTNRAS 297
Query: 57 KSRGSKRRSRSREAERSKDHSKK 79
S +S RE S + +
Sbjct: 298 SSSSCFHQSAVRETAYSSLSTSE 320
>gnl|CDD|130456 TIGR01389, recQ, ATP-dependent DNA helicase RecQ. The
ATP-dependent DNA helicase RecQ of E. coli is about 600
residues long. This model represents bacterial proteins
with a high degree of similarity in domain architecture
and in primary sequence to E. coli RecQ. The model
excludes eukaryotic and archaeal proteins with RecQ-like
regions, as well as more distantly related bacterial
helicases related to RecQ [DNA metabolism, DNA
replication, recombination, and repair].
Length = 591
Score = 32.0 bits (73), Expect = 1.0
Identities = 13/40 (32%), Positives = 24/40 (60%)
Query: 346 YEKPTPIQAQAIPAIMSGRDLIGIAKTGSGKTVAFVLPLL 385
Y+ P Q + I ++ GRD++ + TG GK++ + +P L
Sbjct: 11 YDDFRPGQEEIISHVLDGRDVLVVMPTGGGKSLCYQVPAL 50
>gnl|CDD|236545 PRK09510, tolA, cell envelope integrity inner membrane protein
TolA; Provisional.
Length = 387
Score = 31.7 bits (72), Expect = 1.0
Identities = 24/115 (20%), Positives = 48/115 (41%), Gaps = 14/115 (12%)
Query: 22 KESRRDKDRDRRRRSRSHERRSERDRDRDLERRKEKSRGSKRRSRSREAERSKDHSKKEE 81
E +R K ++ ++ +E++R + LE+ + ++ +++ ++ EA + +K+
Sbjct: 77 AEEQRKKKEQQQAEELQQKQAAEQERLKQLEKERLAAQ--EQKKQAEEAAKQAALKQKQA 134
Query: 82 KDKREKEEEEAAFDPSKLDKEVEATRLELEMQKRRDRIERWRAERKKKDIETIKK 136
++ K A K E EA R +K AE KKK K
Sbjct: 135 EEAAAKAAAAA-----KAKAEAEAKRAAAAAKKA-------AAEAKKKAEAEAAK 177
>gnl|CDD|113290 pfam04514, BTV_NS2, Bluetongue virus non-structural protein NS2.
This family includes NS2 proteins from other members of
the Orbivirus genus. NS2 is a non-specific
single-stranded RNA-binding protein that forms large
homomultimers and accumulates in viral inclusion bodies
of infected cells. Three RNA binding regions have been
identified in Bluetongue virus serotype 17 at residues
2-11, 153-166 and 274-286. NS2 multimers also possess
nucleotidyl phosphatase activity. The precise function
of NS2 is not known, but it may be involved in the
transport and condensation of viral mRNAs.
Length = 363
Score = 31.8 bits (72), Expect = 1.2
Identities = 16/74 (21%), Positives = 28/74 (37%), Gaps = 7/74 (9%)
Query: 21 PKESRRDKDRDRRR-RSRSHERRSERDRDRDLERRKEKSRGSKRRSRSREAERSKDHSKK 79
P+ S++D+ +RR R E ERR++ R E + S
Sbjct: 205 PETSKQDQKEERRAAVERRLAELVEMINWNLEERRRDL------RKEQELEENVERDSDD 258
Query: 80 EEKDKREKEEEEAA 93
E++ + E+ E
Sbjct: 259 EDEHGEDSEDGETK 272
Score = 31.4 bits (71), Expect = 1.5
Identities = 18/74 (24%), Positives = 28/74 (37%), Gaps = 1/74 (1%)
Query: 11 SRSPSPSHKRPKESRRDKDRDRRRRSRSHERRSERDRDRDLERRKEKSRGSKRRSRSREA 70
+P S + KE RR R + +R RDL + +E +R S +
Sbjct: 202 DMTPETSKQDQKEERRAAVERRLAELVEMINWNLEERRRDLRKEQELEENVERDSDDED- 260
Query: 71 ERSKDHSKKEEKDK 84
E +D E K +
Sbjct: 261 EHGEDSEDGETKPE 274
Score = 29.5 bits (66), Expect = 5.9
Identities = 18/131 (13%), Positives = 47/131 (35%), Gaps = 7/131 (5%)
Query: 3 RSRRKRSRSRSPSPSHKRPKESRRDKDRDRR-------RRSRSHERRSERDRDRDLERRK 55
R + K + P+ +R + + + ++ + +R +ERR
Sbjct: 165 REKEKEEQPMKPAFKPERWMGGPDSDEDENPLDEEAPDMTPETSKQDQKEERRAAVERRL 224
Query: 56 EKSRGSKRRSRSREAERSKDHSKKEEKDKREKEEEEAAFDPSKLDKEVEATRLELEMQKR 115
+ + + + EE +R+ ++E+ + S+ + + + E +R
Sbjct: 225 AELVEMINWNLEERRRDLRKEQELEENVERDSDDEDEHGEDSEDGETKPESYITSEYIER 284
Query: 116 RDRIERWRAER 126
I + + ER
Sbjct: 285 ISEIRKMKDER 295
>gnl|CDD|217198 pfam02718, Herpes_UL31, Herpesvirus UL31-like protein. This is a
family of Herpesvirus proteins including UL31, UL53,
and the product of ORF 69 in some strains. The proteins
in this family have no known function.
Length = 262
Score = 31.1 bits (71), Expect = 1.4
Identities = 13/44 (29%), Positives = 15/44 (34%), Gaps = 10/44 (22%)
Query: 8 RSRSRSPSPSHKRPKESRRDKDRDRRRRSRSHERRSERDRDRDL 51
RSRS RP R SR RR+ R R+
Sbjct: 1 RSRSSVRRLPRSRPS----------RSSSRKKARRALRLTLREF 34
>gnl|CDD|227492 COG5163, NOP7, Protein required for biogenesis of the 60S ribosomal
subunit [Translation, ribosomal structure and
biogenesis].
Length = 591
Score = 31.6 bits (71), Expect = 1.4
Identities = 24/93 (25%), Positives = 42/93 (45%), Gaps = 3/93 (3%)
Query: 46 DRDRDLERRKEKSRGSKRRSRSREAERSKDHSKKEEKDKREKEEEEAAFDPSKLDKEVEA 105
D D +L+ +KE ++ S +E KD +K + K ++ EEEE + +
Sbjct: 495 DDDEELQAQKELELEAQGIKYSETSEADKDVNKSKNKKRKVDEEEEEK-KLKMIMMSNKQ 553
Query: 106 TRLELEMQKRRDRIERWRA--ERKKKDIETIKK 136
+L +M+ + E ++KKK I KK
Sbjct: 554 KKLYKKMKYSNAKKEEQAENLKKKKKQIAKQKK 586
>gnl|CDD|198139 smart01071, CDC37_N, Cdc37 N terminal kinase binding. Cdc37 is a
molecular chaperone required for the activity of
numerous eukaryotic protein kinases. This domain
corresponds to the N terminal domain which binds
predominantly to protein kinases and is found N terminal
to the Hsp (Heat shock protein) 90-binding domain.
Expression of a construct consisting of only the
N-terminal domain of Schizosaccharomyces pombe Cdc37 results
in cellular viability. This indicates that interactions
with the cochaperone Hsp90 may not be essential for
Cdc37 function.
Length = 154
Score = 30.5 bits (69), Expect = 1.4
Identities = 26/111 (23%), Positives = 48/111 (43%), Gaps = 13/111 (11%)
Query: 32 RRRRSRSHERRSER-----------DRDRDLERRKEKSRGSKRRSRSREAERSKDHSKKE 80
R ++ H+ R ER + L +R +K R + + E
Sbjct: 31 RWKQRDIHQARVERMEEIKNLKYELIMNDHLNKRIDKLLKGLREEELSPETPTYNEMLAE 90
Query: 81 EKDKREKEEEEAAFDPSKLDKEVEATRLELEMQKR--RDRIERWRAERKKK 129
+D+ +KE EEA D L +E++ R +L+ +++ R +++ E KKK
Sbjct: 91 LQDQLKKELEEANGDSEGLLEELKKHRDKLKKEQKELRKKLDELEKEEKKK 141
>gnl|CDD|224120 COG1199, DinG, Rad3-related DNA helicases [Transcription / DNA
replication, recombination, and repair].
Length = 654
Score = 31.7 bits (72), Expect = 1.4
Identities = 23/78 (29%), Positives = 38/78 (48%), Gaps = 11/78 (14%)
Query: 346 YEKPTPIQAQAI----PAIMSGRDLIGIAKTGSGKTVAFVLPLLRHILDQPPLEETDGPM 401
+P P Q + A+ G L+ A TG+GKT+A++LP L + + +G
Sbjct: 13 GFEPRPEQREMAEAVAEALKGGEGLLIEAPTGTGKTLAYLLPALAYARE-------EGKK 65
Query: 402 AIIMSPTRELCMQIGKEA 419
II + T+ L Q+ +E
Sbjct: 66 VIISTRTKALQEQLLEED 83
>gnl|CDD|239286 cd02988, Phd_like_VIAF, Phosducin (Phd)-like family, Viral
inhibitor of apoptosis (IAP)-associated factor (VIAF)
subfamily; VIAF is a Phd-like protein that functions in
caspase activation during apoptosis. It was identified
as an IAP binding protein through a screen of a human
B-cell library using a prototype IAP. VIAF lacks a
consensus IAP binding motif and while it does not
function as an IAP antagonist, it still plays a
regulatory role in the complete activation of caspases.
VIAF itself is a substrate for IAP-mediated
ubiquitination, suggesting that it may be a target of
IAPs in the prevention of cell death. The similarity of
VIAF to Phd points to a potential role distinct from
apoptosis regulation. Phd functions as a cytosolic
regulator of G protein by specifically binding to G
protein betagamma (Gbg)-subunits. The C-terminal domain
of Phd adopts a thioredoxin fold, but it does not
contain a CXXC motif. Phd interacts with G protein beta
mostly through the N-terminal helical domain.
Length = 192
Score = 30.7 bits (70), Expect = 1.5
Identities = 17/75 (22%), Positives = 28/75 (37%), Gaps = 11/75 (14%)
Query: 55 KEKSRGSKRRSRSREAERSKDHSKKEEKDKREKEEEEAAFDPSKLDKEVEATRLELEMQK 114
K S + A + + E+K E +EE D+E + LE +
Sbjct: 16 KPPSPKEEEEEALELAIQEAHENALEKKLLDELDEEL--------DEEEDDRFLE---EY 64
Query: 115 RRDRIERWRAERKKK 129
RR R+ +A +K
Sbjct: 65 RRKRLAEMKALAEKS 79
>gnl|CDD|233757 TIGR02168, SMC_prok_B, chromosome segregation protein SMC, common
bacterial type. SMC (structural maintenance of
chromosomes) proteins bind DNA and act in organizing and
segregating chromosomes for partition. SMC proteins are
found in bacteria, archaea, and eukaryotes. This family
represents the SMC protein of most bacteria. The smc
gene is often associated with scpB (TIGR00281) and scpA
genes, where scp stands for segregation and condensation
protein. SMC was shown (in Caulobacter crescentus) to be
induced early in S phase but present and bound to DNA
throughout the cell cycle [Cellular processes, Cell
division, DNA metabolism, Chromosome-associated
proteins].
Length = 1179
Score = 31.6 bits (72), Expect = 1.5
Identities = 21/90 (23%), Positives = 34/90 (37%)
Query: 52 ERRKEKSRGSKRRSRSREAERSKDHSKKEEKDKREKEEEEAAFDPSKLDKEVEATRLELE 111
ER R + E SK EE + E++ EE + L+ E+E ELE
Sbjct: 309 ERLANLERQLEELEAQLEELESKLDELAEELAELEEKLEELKEELESLEAELEELEAELE 368
Query: 112 MQKRRDRIERWRAERKKKDIETIKKDIKSN 141
+ R + E + + ++ I S
Sbjct: 369 ELESRLEELEEQLETLRSKVAQLELQIASL 398
Score = 30.4 bits (69), Expect = 3.1
Identities = 18/119 (15%), Positives = 46/119 (38%), Gaps = 1/119 (0%)
Query: 23 ESRRDKDRDRRRRSRSHERRSERDRDRDLERRKEKSRGSKRRSR-SREAERSKDHSKKEE 81
E + + R E + E R + + + + + R ER +D ++ +
Sbjct: 361 EELEAELEELESRLEELEEQLETLRSKVAQLELQIASLNNEIERLEARLERLEDRRERLQ 420
Query: 82 KDKREKEEEEAAFDPSKLDKEVEATRLELEMQKRRDRIERWRAERKKKDIETIKKDIKS 140
++ E ++ + +L E+E ELE + E ++++E ++ + +
Sbjct: 421 QEIEELLKKLEEAELKELQAELEELEEELEELQEELERLEEALEELREELEEAEQALDA 479
Score = 29.6 bits (67), Expect = 7.0
Identities = 21/109 (19%), Positives = 39/109 (35%), Gaps = 9/109 (8%)
Query: 23 ESRRDKDRDRRRRSRSHERRSERDRDRDLERRKEKSRGSKRRSRSREAERSKDHSKKEEK 82
E ++ R+ R+ R E + +R ++ + + +E +
Sbjct: 708 EELEEELEQLRKELEELSRQISALRKDLARLEAEVEQLEERIAQLSKELTELEAEIEELE 767
Query: 83 DKREKEEEEAAFDPSKLDKEVEATRLELEMQKRRDRIERWRAERKKKDI 131
++ E+ EEE E EA ELE + E +A R+ D
Sbjct: 768 ERLEEAEEEL--------AEAEAEIEELE-AQIEQLKEELKALREALDE 807
>gnl|CDD|216108 pfam00769, ERM, Ezrin/radixin/moesin family. This family of
proteins contain a band 4.1 domain (pfam00373) at their
amino terminus. This family represents the rest of these
proteins.
Length = 244
Score = 30.9 bits (70), Expect = 1.5
Identities = 30/180 (16%), Positives = 62/180 (34%), Gaps = 17/180 (9%)
Query: 19 KRPKESRRDKDRDRRRRSRSHERRSERDRDRDLERRKEKSRGSKRRSRSREAERSKDHSK 78
+R ++ D R ++ E E + E + + K E R ++ +
Sbjct: 12 ERMEQMEEDMRRAQKELEEYEETALELEEKLKQEEEEAQLLEKKADELEEENRRLEEEAA 71
Query: 79 KEEKDKREKEEEEAAFDPSKLDKEVEATRLELEMQKRRDRIERWRAE--RKKKDIETIKK 136
E+++ E E E +LE E +K+ + + E ++ E ++
Sbjct: 72 ASEEERERLEAEVDEA-------TAEVAKLEEEREKKEAETRQLQQELREAQEAHERARQ 124
Query: 137 DIKSNLSSGLGGSAPMKKWNLEDDSDEDENDNKDENGKTAEEDIDPLDAFMQGVHEEMRK 196
++ ++ E D+NG+ A D++ D M+ EE R
Sbjct: 125 ELLEAAAAPTAPPHVA-------APVNGEQLEPDDNGEEASADLET-DPDMKDRSEEERV 176
>gnl|CDD|180485 PRK06245, cofG, FO synthase subunit 1; Reviewed.
Length = 336
Score = 31.4 bits (72), Expect = 1.5
Identities = 16/43 (37%), Positives = 19/43 (44%), Gaps = 5/43 (11%)
Query: 500 IIDNVRPDRQTVMFSATFP-----RQMEALARRILNKPIEIQV 537
II N P M + P ++ ALAR IL I IQV
Sbjct: 205 IIQNFSPKPGIPMENHPEPSLEEMLRVVALARLILPPDISIQV 247
>gnl|CDD|215597 PLN03137, PLN03137, ATP-dependent DNA helicase; Q4-like;
Provisional.
Length = 1195
Score = 31.4 bits (71), Expect = 1.6
Identities = 15/35 (42%), Positives = 21/35 (60%)
Query: 351 PIQAQAIPAIMSGRDLIGIAKTGSGKTVAFVLPLL 385
P Q + I A MSG D+ + TG GK++ + LP L
Sbjct: 463 PNQREIINATMSGYDVFVLMPTGGGKSLTYQLPAL 497
>gnl|CDD|215434 PLN02813, PLN02813, pfkB-type carbohydrate kinase family protein.
Length = 426
Score = 31.3 bits (71), Expect = 1.8
Identities = 16/67 (23%), Positives = 27/67 (40%), Gaps = 2/67 (2%)
Query: 9 SRSRSPSPSHKRPKESRRDKDRDRRRRSRSHERRSER--DRDRDLERRKEKSRGSKRRSR 66
S + S SPS PK +RR + +R + R R + +++ +E+ G
Sbjct: 4 SSTASTSPSLYVPKPNRRLRRVTSQRGAPGLFRIHSRANNAALAIQQDEEQPEGFGPIPE 63
Query: 67 SREAERS 73
ER
Sbjct: 64 KAVPERW 70
>gnl|CDD|217393 pfam03154, Atrophin-1, Atrophin-1 family. Atrophin-1 is the
protein product of the dentatorubral-pallidoluysian
atrophy (DRPLA) gene. DRPLA OMIM:125370 is a progressive
neurodegenerative disorder. It is caused by the
expansion of a CAG repeat in the DRPLA gene on
chromosome 12p. This results in an extended
polyglutamine region in atrophin-1, that is thought to
confer toxicity to the protein, possibly through
altering its interactions with other proteins. The
expansion of a CAG repeat is also the underlying defect
in six other neurodegenerative disorders, including
Huntington's disease. One interaction of expanded
polyglutamine repeats that is thought to be pathogenic
is that with the short glutamine repeat in the
transcriptional coactivator CREB binding protein, CBP.
This interaction draws CBP away from its usual nuclear
location to the expanded polyglutamine repeat protein
aggregates that are characteristic of the polyglutamine
neurodegenerative disorders. This interferes with
CBP-mediated transcription and causes cytotoxicity.
Length = 979
Score = 31.6 bits (71), Expect = 1.8
Identities = 16/55 (29%), Positives = 28/55 (50%)
Query: 19 KRPKESRRDKDRDRRRRSRSHERRSERDRDRDLERRKEKSRGSKRRSRSREAERS 73
KR + + K ++ ER E++++R+ ER +E R +K S S E+ S
Sbjct: 580 KREEAVEKAKREAEQKAREEREREKEKEKEREREREREAERAAKASSSSHESRMS 634
>gnl|CDD|235370 PRK05244, PRK05244, Der GTPase activator; Provisional.
Length = 177
Score = 30.3 bits (69), Expect = 1.8
Identities = 18/68 (26%), Positives = 33/68 (48%), Gaps = 4/68 (5%)
Query: 19 KRPKESRRDKDRDRRRRSRSHERRSERDRDRDLERRKEKSRGSKRRSR-SREAERSKDHS 77
K+ K S + + + ++ R E D + +RK+K +G K SR + +SK
Sbjct: 1 KKKKSSPKRSKGMAKSKKKT---REELDAEARERKRKKKHKGLKSGSRHNEGNTQSKGKG 57
Query: 78 KKEEKDKR 85
+ ++KD R
Sbjct: 58 QAQKKDPR 65
>gnl|CDD|218561 pfam05340, DUF740, Protein of unknown function (DUF740). This
family consists of several uncharacterized plant
proteins of unknown function.
Length = 565
Score = 31.2 bits (70), Expect = 2.0
Identities = 30/139 (21%), Positives = 54/139 (38%), Gaps = 7/139 (5%)
Query: 33 RRRS---RSHERRSERDRDRDLERRKEKSRGSKRRSRSREAERSKDHSKKEEKDKREKEE 89
+RRS RS D D E + R+ ++EE+ + E++E
Sbjct: 93 QRRSCDVRSRSTLWSLFHDDDEENLPSSIAPPEIDPEPRKPIVPDLVLEEEEEVEMEEDE 152
Query: 90 EEAAFDPSKLDKEVEATRLELEMQKRRDRIERWRAERKKKDIETIKKDIKSNLSSGLGGS 149
E +P K+ E E E++ +D I+ ++ KK ++ K S S S
Sbjct: 153 EYYEKEPGKVVDEKSEEEEEEELKTMKDFID-LESQTKKPSVKDNGKSFWSAASV---FS 208
Query: 150 APMKKWNLEDDSDEDENDN 168
++KW + + + N
Sbjct: 209 KKLQKWRQKQKLKKPRSGN 227
>gnl|CDD|235175 PRK03918, PRK03918, chromosome segregation protein; Provisional.
Length = 880
Score = 31.2 bits (71), Expect = 2.1
Identities = 17/102 (16%), Positives = 40/102 (39%), Gaps = 8/102 (7%)
Query: 40 ERRSERDRDRDLERRKEKSRGSKRRSRSREAERSKDHSKKEEKDKREKEEEEAAFDPSKL 99
D R++E+R + E + K+E ++ +K+ +E +L
Sbjct: 301 FYEEYLDELREIEKRLSRLE---EEINGIEERIKELEEKEERLEELKKKLKELEKRLEEL 357
Query: 100 DKEVEATRLELEMQKRR-DRIERWRAERKKKDIETIKKDIKS 140
++ E E K + + +ER + E ++K+++
Sbjct: 358 EERHE----LYEEAKAKKEELERLKKRLTGLTPEKLEKELEE 395
>gnl|CDD|115057 pfam06375, BLVR, Bovine leukaemia virus receptor (BLVR). This
family consists of several bovine specific leukaemia
virus receptors which are thought to function as
transmembrane proteins, although their exact function is
unknown.
Length = 561
Score = 30.8 bits (69), Expect = 2.2
Identities = 40/184 (21%), Positives = 71/184 (38%), Gaps = 5/184 (2%)
Query: 37 RSHERRSERDRDRDLERRKEKSRGSKRRSRSREAERSKDHSKKEEKDKREKEEEEAAFDP 96
R H +R E+D+ ++++EK + +RR S E +D + + D +E E A
Sbjct: 88 RRHRQRLEKDKRE--KKKREKEKRGRRRHHSLGTESDEDIAPAQMVDIVTEEMPENALPS 145
Query: 97 SKLDKEVEATRLELEMQKRRDRIERWRAE-RKKKDIETIKKDIKSNLSSGLGGSAPMKKW 155
+ DK+ L++ + + + +K ++ ET K K ++ + S KK
Sbjct: 146 DEDDKDPNDPYRALDIDLDKPLADSEKLPVQKHRNAETSKSPEKGDVPAVEKKSKKPKKK 205
Query: 156 NLEDDSDEDENDNK--DENGKTAEEDIDPLDAFMQGVHEEMRKVNKPAVPTTADVKPADS 213
++ E + D K E K+ +D A V E V TA D
Sbjct: 206 EKKEKEKERDKDKKKEVEGFKSLLLALDDSPASAASVAEADEASLANTVSGTAPDSEPDE 265
Query: 214 GSKP 217
Sbjct: 266 PKDA 269
Score = 28.9 bits (64), Expect = 8.9
Identities = 20/117 (17%), Positives = 37/117 (31%), Gaps = 33/117 (28%)
Query: 12 RSPSPSHKRPKESRRDKDRDRRRRSRSHE------------------------------- 40
+S P K KE +++D+D+++ +
Sbjct: 198 KSKKPKKKEKKEKEKERDKDKKKEVEGFKSLLLALDDSPASAASVAEADEASLANTVSGT 257
Query: 41 -RRSERDRDRDLERRKEKSRGSKRRSRSREAERSKDHSKKEEKDKREKEEEEAAFDP 96
SE D +D E + K ++ + R+ + K KK R + A P
Sbjct: 258 APDSEPDEPKDAEAEETKKSPKHKKKKQRKEKEEKKKKKKHHH-HRCHHSDGGAEQP 313
>gnl|CDD|223003 PHA03169, PHA03169, hypothetical protein; Provisional.
Length = 413
Score = 30.7 bits (69), Expect = 2.2
Identities = 22/92 (23%), Positives = 34/92 (36%), Gaps = 7/92 (7%)
Query: 1 MVRSRRKRSRSRSPSPSHKRPKESRRDKDR-DRRRRSRSHERRSERDRDRDLERRKEKSR 59
M R RRK RSR S R R R RR + R ++
Sbjct: 1 MSRQRRKAKRSRHTLRSSCRGHCKRHGGTREQAGRRRGTAARAAKPAPPAPTT------S 54
Query: 60 GSKRRSRSREAERSKDHSKKEEKDKREKEEEE 91
G + R+ + + R + + ++ R E+EE
Sbjct: 55 GPQVRAVAEQGHRQTESDTETAEESRHGEKEE 86
>gnl|CDD|225606 COG3064, TolA, Membrane protein involved in colicin uptake [Cell
envelope biogenesis, outer membrane].
Length = 387
Score = 30.7 bits (69), Expect = 2.2
Identities = 26/124 (20%), Positives = 59/124 (47%), Gaps = 7/124 (5%)
Query: 10 RSRSPSPSHKRPKESRRDKDRDRRRRSRSHERRSERDRDRDLERRKEKSRGSKRRSRSRE 69
R +S S K+ ++ R+ K+ + ++ +E++R + LE KE+ + +++ ++ E
Sbjct: 66 RIQSQQSSAKKGEQQRKKKEE-QVAEELKPKQAAEQERLKQLE--KERLKAQEQQKQAEE 122
Query: 70 AERSKDHSKKEEKDKREKEEEE----AAFDPSKLDKEVEATRLELEMQKRRDRIERWRAE 125
AE+ +K+++++ K E A +K E + E +K+ + + E
Sbjct: 123 AEKQAQLEQKQQEEQARKAAAEQKKKAEAAKAKAAAEAAKLKAAAEAKKKAEEAAKAAEE 182
Query: 126 RKKK 129
K K
Sbjct: 183 AKAK 186
Score = 29.1 bits (65), Expect = 6.8
Identities = 21/115 (18%), Positives = 50/115 (43%), Gaps = 3/115 (2%)
Query: 19 KRPKESRRDKDRDRRRRSRSHERRSERDRDRDLERRKEKSRGSKRRSRSREAERSK---D 75
+R K + K + + E++ + ++ R ++K + + + EA + K +
Sbjct: 109 ERLKAQEQQKQAEEAEKQAQLEQKQQEEQARKAAAEQKKKAEAAKAKAAAEAAKLKAAAE 168
Query: 76 HSKKEEKDKREKEEEEAAFDPSKLDKEVEATRLELEMQKRRDRIERWRAERKKKD 130
KK E+ + EE +A + + K+ EA + + + + +AE+K +
Sbjct: 169 AKKKAEEAAKAAEEAKAKAEAAAAKKKAEAEAKAAAEKAKAEAEAKAKAEKKAEA 223
>gnl|CDD|197664 smart00338, BRLZ, basic region leucine zipper.
Length = 65
Score = 27.9 bits (63), Expect = 2.3
Identities = 23/63 (36%), Positives = 35/63 (55%), Gaps = 9/63 (14%)
Query: 52 ERRKEKSRGSKRRSRSREAERSKDHSKKEEKDKREKEEEEAAFDPSKLDKEVEATRLELE 111
RR+E++R + RRSR R KK E ++ E++ E+ + +L KE+E R ELE
Sbjct: 7 RRRRERNREAARRSRER---------KKAEIEELERKVEQLEAENERLKKEIERLRRELE 57
Query: 112 MQK 114
K
Sbjct: 58 KLK 60
>gnl|CDD|131281 TIGR02226, two_anch, N-terminal double-transmembrane domain. This
model represents a prokaryotic N-terminal region of
about 80 amino acids. The predicted membrane topology by
TMHMM puts the N-terminus outside and spans the membrane
twice, with a cytosolic region of about 25 amino acids
between the two transmembrane regions. Member proteins
tend to be between 600 and 1000 amino acids in length
[Hypothetical proteins, Domain].
Length = 82
Score = 28.5 bits (64), Expect = 2.4
Identities = 13/34 (38%), Positives = 17/34 (50%), Gaps = 2/34 (5%)
Query: 379 AFVLPLLRHILDQPPLEETD-GPMAIIM-SPTRE 410
A VLPLL H+L + P D + + P RE
Sbjct: 14 AAVLPLLIHLLRRRPPRPVDFPALRFLREVPKRE 47
>gnl|CDD|224036 COG1111, MPH1, ERCC4-like helicases [DNA replication,
recombination, and repair].
Length = 542
Score = 30.8 bits (70), Expect = 2.4
Identities = 27/117 (23%), Positives = 50/117 (42%), Gaps = 13/117 (11%)
Query: 372 TGSGKTVAFVLPLLRHILDQPPLEETDGPMAIIMSPTRELCMQIGKEAKKFTKSLGLRVV 431
TG GKT + + + G + ++PT+ L +Q + +K T +
Sbjct: 38 TGLGKTFIAAMVIANRLRW-------FGGKVLFLAPTKPLVLQHAEFCRKVTGIPEDEIA 90
Query: 432 CVYGGTGISEQISELKRGAEIIVCTPGRMI-DMLAANSGRVTNLRRVTYIVLDEADR 487
+ G E+ + ++ V TP + D+ A GR+ +L V+ ++ DEA R
Sbjct: 91 ALTGEVRPEEREELWAKK-KVFVATPQVVENDLKA---GRI-DLDDVSLLIFDEAHR 142
>gnl|CDD|225620 COG3078, COG3078, Uncharacterized protein conserved in bacteria
[Function unknown].
Length = 169
Score = 29.8 bits (67), Expect = 2.5
Identities = 23/71 (32%), Positives = 37/71 (52%), Gaps = 9/71 (12%)
Query: 19 KRPKESRRDK---DRDRRRRSRSHERRSERDRDRDLERRKEKSRGSKRRSR-SREAERSK 74
KR K++RR + D+ RR++R RDR +R++K +G SR S E S
Sbjct: 2 KRSKKTRRPRSKADKKARRKTREELDAEARDR-----KRQKKRKGLASGSRHSGGNENSG 56
Query: 75 DHSKKEEKDKR 85
+ + ++KD R
Sbjct: 57 NKQQNQKKDPR 67
>gnl|CDD|219746 pfam08208, RNA_polI_A34, DNA-directed RNA polymerase I subunit
RPA34.5. This is a family of proteins conserved from
yeasts to human. Subunit A34.5 of RNA polymerase I is a
non-essential subunit which is thought to help Pol I
overcome topological constraints imposed on ribosomal
DNA during the process of transcription.
Length = 193
Score = 30.1 bits (68), Expect = 2.6
Identities = 12/74 (16%), Positives = 34/74 (45%), Gaps = 1/74 (1%)
Query: 15 SPSHKRPKESRRDKDRDRRRRSRSHERRSERDRDRDLERRKEK-SRGSKRRSRSREAERS 73
P E + + + + E+ +E + + E++K+K + K+ + ++ +
Sbjct: 119 GAPDGPPSELGSESETSEKETTAKVEKEAEVEEEEKKEKKKKKEVKKEKKEKKDKKEKMV 178
Query: 74 KDHSKKEEKDKREK 87
+ K++K K++K
Sbjct: 179 EPKGSKKKKKKKKK 192
>gnl|CDD|165442 PHA03171, PHA03171, UL37 tegument protein; Provisional.
Length = 499
Score = 30.8 bits (69), Expect = 2.7
Identities = 29/102 (28%), Positives = 46/102 (45%), Gaps = 11/102 (10%)
Query: 7 KRSRSRSPSPSHKRPKESRRDKDRDRRRRSRSH----ERRSERDRDRDLERRKEKSRGSK 62
+R R+ P P R +++ + + R R R H SE +R RDL G +
Sbjct: 26 QRKRAEDPLPPWLRKEKACALRQQRRHRLQRQHGVIDGENSETERPRDLTAALFAEAGEE 85
Query: 63 RRSRSREAERSKDHSKKEEKDKREKEEEEAAFDP--SKLDKE 102
+ + +R ++ EE+D +EEE A DP + LD E
Sbjct: 86 --AEEEDNDRECPDTEAEEED---EEEEIEAPDPEVNPLDAE 122
>gnl|CDD|223496 COG0419, SbcC, ATPase involved in DNA repair [DNA replication,
recombination, and repair].
Length = 908
Score = 30.9 bits (70), Expect = 2.8
Identities = 17/108 (15%), Positives = 50/108 (46%), Gaps = 1/108 (0%)
Query: 38 SHERRSERDRDRDLERRKEKSRGSKRRSRSREAERSKDHSKKEEKDKREKEEEEAAFDPS 97
HE+ + +LE +E+ K + RE + +E +++ + E
Sbjct: 470 EHEKELLELYELELEELEEELSREKEEAELREEIEELEKELRELEEELIELLELEEALKE 529
Query: 98 KLDKEVEATRLELE-MQKRRDRIERWRAERKKKDIETIKKDIKSNLSS 144
+L++++E LE +++ +++++ + + + + +E +++K L
Sbjct: 530 ELEEKLEKLENLLEELEELKEKLQLQQLKEELRQLEDRLQELKELLEE 577
Score = 30.5 bits (69), Expect = 3.2
Identities = 26/147 (17%), Positives = 51/147 (34%), Gaps = 10/147 (6%)
Query: 3 RSRRKRSRSRSPSPSHKRPKESRRDKDRDRRRRSRSHERRSERDRDRDLERRKEKSRGSK 62
R ++R + R ++ R+ R E + ER + + E + +
Sbjct: 250 RLEELKARLLEIESLELEALKIREEELRELERLLEELEEKIERLEELEREIEELEEELEG 309
Query: 63 RRSRSREAERSKDHSKKEEKDKREKEE-------EEAAFDPSKLDKEVEATRLELEMQKR 115
R+ E E + K E+ + EE E K + E+++R
Sbjct: 310 LRALLEELEELLEKLKSLEERLEKLEEKLEKLESELEELAEEKNELAKLLEERLKELEER 369
Query: 116 RDRIE---RWRAERKKKDIETIKKDIK 139
+ +E ER K+ E I++ +
Sbjct: 370 LEELEKELEKALERLKQLEEAIQELKE 396
Score = 29.7 bits (67), Expect = 5.7
Identities = 14/113 (12%), Positives = 41/113 (36%), Gaps = 4/113 (3%)
Query: 18 HKRPKESRRDKDRDRRRRSRSHERRSERDRDRDLERRKEKSRGSKRRSRSREAERSKDHS 77
+ +E ++ R+ E ++ + LE R EK + ++ +
Sbjct: 294 EELEREIEELEEELEGLRALLEELEELLEKLKSLEERLEK----LEEKLEKLESELEELA 349
Query: 78 KKEEKDKREKEEEEAAFDPSKLDKEVEATRLELEMQKRRDRIERWRAERKKKD 130
+++ + + EE + + E E + +++ + I+ + E +
Sbjct: 350 EEKNELAKLLEERLKELEERLEELEKELEKALERLKQLEEAIQELKEELAELS 402
>gnl|CDD|241525 cd13374, PH_RASAL3, RAS protein activator like-3 Pleckstrin
homology (PH) domain. RASAL3 is thought to be a Ras
GTPase-activating protein. It is involved in positive
regulation of Ras GTPase activity and of small GTPase
mediated signal transduction as well as negative
regulation of Ras protein signal transduction. It
contains a PH domain, a C2 domain, and a Ras-GAP
domain. PH domains have diverse functions, but in
general are involved in targeting proteins to the
appropriate cellular location or in the interaction
with a binding partner. They share little sequence
conservation, but all have a common fold, which is
electrostatically polarized. Less than 10% of PH
domains bind phosphoinositide phosphates (PIPs) with
high affinity and specificity. PH domains are
distinguished from other PIP-binding domains by their
specific high-affinity binding to PIPs with two vicinal
phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or
PtdIns(3,4,5)P3 which results in targeting some PH
domain proteins to the plasma membrane. A few display
strong specificity in lipid binding. Any specificity is
usually determined by loop regions or insertions in the
N-terminus of the domain, which are not conserved
across all PH domains. PH domains are found in cellular
signaling proteins such as serine/threonine kinase,
tyrosine kinases, regulators of G-proteins, endocytotic
GTPases, adaptors, as well as cytoskeletal associated
molecules and in lipid associated enzymes.
Length = 180
Score = 29.9 bits (67), Expect = 2.9
Identities = 20/75 (26%), Positives = 29/75 (38%), Gaps = 6/75 (8%)
Query: 5 RRKRSRSRSPSPSHKRPKESRRDKDRDRRRRSRSHERRSERDRDRDLE-RRKEKSR---- 59
RR R S S S K S D DR + +++ R L R KE+ +
Sbjct: 13 RRPRPGSASSGGSIISAKGSGGDPDRKPGKTEPEAAGQNQVHNVRGLLKRLKEEKKARVS 72
Query: 60 -GSKRRSRSREAERS 73
K S +R ++ S
Sbjct: 73 GEGKPSSSARGSQES 87
>gnl|CDD|224121 COG1200, RecG, RecG-like helicase [DNA replication, recombination,
and repair / Transcription].
Length = 677
Score = 30.6 bits (70), Expect = 3.1
Identities = 21/59 (35%), Positives = 30/59 (50%), Gaps = 8/59 (13%)
Query: 373 GSGKTVAFVLPLLRHILDQPPLEETDGPMAIIMSPTRELCMQIGKEAKKFTKSLGLRVV 431
GSGKTV +L +L I G A +M+PT L Q + +K+ + LG+RV
Sbjct: 293 GSGKTVVALLAMLAAI--------EAGYQAALMAPTEILAEQHYESLRKWLEPLGIRVA 343
>gnl|CDD|236544 PRK09506, mrcB, bifunctional glycosyl transferase/transpeptidase;
Reviewed.
Length = 830
Score = 30.5 bits (69), Expect = 3.2
Identities = 14/57 (24%), Positives = 20/57 (35%)
Query: 12 RSPSPSHKRPKESRRDKDRDRRRRSRSHERRSERDRDRDLERRKEKSRGSKRRSRSR 68
R P +P + K RR R + D + RK K +G K R +
Sbjct: 6 REPIGRKGKPSRPVKQKVSRRRYRDDDDYDDYDDYEDEEPMPRKGKGKGRKPRGKRG 62
>gnl|CDD|217051 pfam02463, SMC_N, RecF/RecN/SMC N terminal domain. This domain is
found at the N terminus of SMC proteins. The SMC
(structural maintenance of chromosomes) superfamily
proteins have ATP-binding domains at the N- and
C-termini, and two extended coiled-coil domains
separated by a hinge in the middle. The eukaryotic SMC
proteins form two kinds of heterodimers: the SMC1/SMC3
and the SMC2/SMC4 types. These heterodimers constitute
an essential part of higher order complexes, which are
involved in chromatin and DNA dynamics. This family also
includes the RecF and RecN proteins that are involved in
DNA metabolism and recombination.
Length = 1162
Score = 30.7 bits (69), Expect = 3.2
Identities = 25/159 (15%), Positives = 67/159 (42%), Gaps = 2/159 (1%)
Query: 22 KESRRDKDRDRRRRSRSHERRSERDRDRDLERRKEKSRGSKRRSRSREAERSKDHSK--K 79
++K + + DL + + + S +E E+ ++
Sbjct: 213 YYQLKEKLELEEENLLYLDYLKLNEERIDLLQELLRDEQEEIESSKQELEKEEEILAQVL 272
Query: 80 EEKDKREKEEEEAAFDPSKLDKEVEATRLELEMQKRRDRIERWRAERKKKDIETIKKDIK 139
+E + EKE++ + L KE E + EL +RR + + + +K+++ ++K++K
Sbjct: 273 KENKEEEKEKKLQEEELKLLAKEEEELKSELLKLERRKVDDEEKLKESEKELKKLEKELK 332
Query: 140 SNLSSGLGGSAPMKKWNLEDDSDEDENDNKDENGKTAEE 178
+K+ ++ +++E+E + ++ + E+
Sbjct: 333 KEKEEIEELEKELKELEIKREAEEEEEEQLEKLQEKLEQ 371
Score = 29.6 bits (66), Expect = 6.1
Identities = 36/160 (22%), Positives = 65/160 (40%), Gaps = 7/160 (4%)
Query: 40 ERRSERDRDRDLERRKEKSRGSKRRSRSREAERSKDHSKKEEKDKREKEEEEAAFDPSKL 99
ERR D ++ E KE + K + +E + KE + KRE EEEE +
Sbjct: 307 ERRKVDDEEKLKESEKELKKLEKELKKEKEEIEELEKELKELEIKREAEEEEEEQ--LEK 364
Query: 100 DKEVEATRLELEMQKRRDRIERWRAERKKKDIETIKKDIKSNLSSGLGGSAPMKKWNLED 159
+E E + K++ ER + K K+ E K+ + + L ++ E+
Sbjct: 365 LQEKLEQLEEELLAKKKLESERLSSAAKLKEEELELKNEEEKEAKLL-----LELSEQEE 419
Query: 160 DSDEDENDNKDENGKTAEEDIDPLDAFMQGVHEEMRKVNK 199
D ++E + + + EE ++ + EE+ K
Sbjct: 420 DLLKEEKKEELKIVEELEESLETKQGKLTEEKEELEKQAL 459
>gnl|CDD|237178 PRK12705, PRK12705, hypothetical protein; Provisional.
Length = 508
Score = 30.4 bits (69), Expect = 3.3
Identities = 28/144 (19%), Positives = 54/144 (37%), Gaps = 9/144 (6%)
Query: 5 RRKRSRSRSPSPSHKRPKESRRDKDRDRRRRSRSHERRSERDRDRDLERRKEKSRGSKR- 63
+R+R + + KE+ + R + R R+E R +R
Sbjct: 27 KRQRLAKEAERILQEAQKEAEEKLEAALLEAKELLLRERNQQRQEARREREELQREEERL 86
Query: 64 --RSRSREAERSKDHSKKEEKDKREK----EEEEAAFDPSKLDKE-VEATRLELEMQKRR 116
+ +A K + + + ++REK E E +LD E L E Q R+
Sbjct: 87 VQKEEQLDARAEKLDNLENQLEEREKALSARELELEELEKQLDNELYRVAGLTPE-QARK 145
Query: 117 DRIERWRAERKKKDIETIKKDIKS 140
++ AE +++ + +KK +
Sbjct: 146 LLLKLLDAELEEEKAQRVKKIEEE 169
>gnl|CDD|227352 COG5019, CDC3, Septin family protein [Cell division and chromosome
partitioning / Cytoskeleton].
Length = 373
Score = 30.0 bits (68), Expect = 3.4
Identities = 18/86 (20%), Positives = 36/86 (41%)
Query: 3 RSRRKRSRSRSPSPSHKRPKESRRDKDRDRRRRSRSHERRSERDRDRDLERRKEKSRGSK 62
R+ + S PS K E+R +++ ++ + + R + R +LE+ + R
Sbjct: 288 RTEKLSGLKNSGEPSLKEIHEARLNEEERELKKKFTEKIREKEKRLEELEQNLIEERKEL 347
Query: 63 RRSRSREAERSKDHSKKEEKDKREKE 88
++ +D K+ EK K K
Sbjct: 348 NSKLEEIQKKLEDLEKRLEKLKSNKS 373
>gnl|CDD|227594 COG5269, ZUO1, Ribosome-associated chaperone zuotin [Translation,
ribosomal structure and biogenesis / Posttranslational
modification, protein turnover, chaperones].
Length = 379
Score = 30.0 bits (67), Expect = 3.4
Identities = 22/99 (22%), Positives = 41/99 (41%), Gaps = 2/99 (2%)
Query: 43 SERDRDRDLERRKEKSRGSKRRSRSREAERSKDHSKKEE-KDKREKEEEEAAFDPSKLDK 101
ERDR R E + + R + + +R +KK + + K KE+E+ K ++
Sbjct: 191 EERDRKRYSEAKNREKRAKLKNQDNARLKRLVQIAKKRDPRIKSFKEQEKEMKKIRKWER 250
Query: 102 EVEATRLELEMQKRRDRIERWRAERKKKDIETIKKDIKS 140
E A L K + + +AE + + + + K
Sbjct: 251 EAGARLKALAALKGKAEAKN-KAEIEAEALASATAVKKK 288
>gnl|CDD|144738 pfam01254, TP2, Nuclear transition protein 2.
Length = 132
Score = 29.0 bits (64), Expect = 3.6
Identities = 17/69 (24%), Positives = 27/69 (39%), Gaps = 9/69 (13%)
Query: 7 KRSRSRSPSPS---HKRPKESRRDKDRDRRRRSRSHERRSERDRDRDLERRKEKSRGSKR 63
+S+S SPSP HK S R + R ++LE + K + KR
Sbjct: 62 HQSQSPSPSPPPKHHKTTMHSHYSPSRPTTHSCSCPKNR------KNLEGKVSKRKAVKR 115
Query: 64 RSRSREAER 72
+ + +R
Sbjct: 116 SKQVYKTKR 124
>gnl|CDD|235334 PRK05035, PRK05035, electron transport complex protein RnfC;
Provisional.
Length = 695
Score = 30.3 bits (69), Expect = 3.6
Identities = 14/51 (27%), Positives = 22/51 (43%)
Query: 85 REKEEEEAAFDPSKLDKEVEATRLELEMQKRRDRIERWRAERKKKDIETIK 135
R E+E+ + +K E RLE E R R ++ R KD + +
Sbjct: 439 RAIEQEKKKAEEAKARFEARQARLEREKAAREARHKKAAEARAAKDKDAVA 489
>gnl|CDD|218115 pfam04502, DUF572, Family of unknown function (DUF572). Family of
eukaryotic proteins with undetermined function.
Length = 321
Score = 30.1 bits (68), Expect = 3.6
Identities = 50/226 (22%), Positives = 86/226 (38%), Gaps = 43/226 (19%)
Query: 67 SREAERSKDHSKKEEKDKREKEEEEAAFDP----------SKLDKEVEATRLEL-EMQKR 115
+R E K ++EE+ ++E+EEE A D SK + EV EL E+Q R
Sbjct: 105 TRNYEADKLDEEQEERVEKEREEELAG-DAMKKLENRTADSKREMEVLERLEELKELQSR 163
Query: 116 RD-------------RIERWRAERKKKDIETIKKDIKSNLSSGLGGSAPMKKWNLEDDSD 162
R R ++ E +++D IK LS G + ++ ++DS+
Sbjct: 164 RADVDVNSMLEALFRREKKEEEEEEEEDEALIKS-----LSFGP-ETEEDRRRADDEDSE 217
Query: 163 EDENDNKDENGKTAEEDIDPLDAFMQGVHEEMRKVNKPAVPTTADVKPADSGSKPAGVVI 222
+DE DN + + + + + P+++ K G +
Sbjct: 218 DDEEDNDNTPSPKSGSSSPAKPTSI----LKKSAAKRSEAPSSSKAKKNSRGIPKPRDAL 273
Query: 223 VTGVVKKSVEKAKGELMEENQDGLEYSSEEEQEDLTSTAANLASKQ 268
+ VV+K KA E ++ SS E + TA N +
Sbjct: 274 SSLVVRK---KAAPESTSQSP-----SSAEPTSESPQTAGNSSLSS 311
>gnl|CDD|165563 PHA03308, PHA03308, transcriptional regulator ICP4; Provisional.
Length = 1463
Score = 30.5 bits (68), Expect = 3.6
Identities = 25/81 (30%), Positives = 33/81 (40%), Gaps = 12/81 (14%)
Query: 53 RRKEKSRGSKR--RSRSREAER----------SKDHSKKEEKDKREKEEEEAAFDPSKLD 100
RR+E R S+R RSRSR R S + K+ KR K E PS L
Sbjct: 180 RRQEVVRKSERVARSRSRRPWRDLWSNRRPVPSPQRTSKDRLPKRGKREFSKKMGPSHLT 239
Query: 101 KEVEATRLELEMQKRRDRIER 121
++ + RR R+ R
Sbjct: 240 SSSSSSSSSFSLSGRRGRLAR 260
>gnl|CDD|238432 cd00852, NifB, NifB belongs to a family of iron-molybdenum
cluster-binding proteins that includes NifX, and NifY,
all of which are involved in the synthesis of an
iron-molybdenum cofactor (FeMo-co) that binds the active
site of the dinitrogenase enzyme as part of nitrogen
fixation in bacteria. This domain is sometimes found
fused to an N-terminal domain (the Radical SAM domain) in
nifB-like proteins.
Length = 106
Score = 28.4 bits (64), Expect = 3.7
Identities = 14/36 (38%), Positives = 20/36 (55%)
Query: 411 LCMQIGKEAKKFTKSLGLRVVCVYGGTGISEQISEL 446
LC +IG E K+ + G+ V+ Y G I E + EL
Sbjct: 70 LCAKIGDEPKEKLEEAGIEVIEAYAGEYIEEALLEL 105
>gnl|CDD|217970 pfam04220, YihI, Der GTPase activator (YihI). YihI activates the
GTPase activity of Der, a 50S ribosomal subunit
stability factor. The stimulation is specific to Der as
YihI does not stimulate the GTPase activity of Era or
ObgE. The interaction of YihI with Der requires only
the C-terminal 78 amino acids of YihI. A yihI deletion
mutant is viable and shows a shorter lag period, but
the same post-lag growth rate as a wild-type strain.
yihI is expressed during the lag period. Overexpression
of yihI inhibits cell growth and biogenesis of the 50S
ribosomal subunit. YihI is an unusual, highly
hydrophilic protein with an uneven distribution of
charged residues, resulting in an N-terminal region
with high pI and a C-terminal region with low pI.
Length = 169
Score = 29.2 bits (66), Expect = 3.7
Identities = 19/73 (26%), Positives = 31/73 (42%), Gaps = 11/73 (15%)
Query: 14 PSPSHKRPKESRRDKDRDRRRRSRSHERRSERDRDRDLERRKEKSRGSKRRSR-SREAER 72
PK + + K + R E D++ +RK+K +G K SR + E+E
Sbjct: 1 RKSGKNGPKLAPKGKKKTR----------YELDQEARERKRKKKRKGLKSGSRHNEESES 50
Query: 73 SKDHSKKEEKDKR 85
K ++KD R
Sbjct: 51 QKQKGAAQKKDPR 63
>gnl|CDD|188306 TIGR03319, RNase_Y, ribonuclease Y. Members of this family are
RNase Y, an endoribonuclease. The member from Bacillus
subtilis, YmdA, has been shown to be involved in
turnover of yitJ riboswitch [Transcription, Degradation
of RNA].
Length = 514
Score = 30.3 bits (69), Expect = 3.9
Identities = 25/83 (30%), Positives = 45/83 (54%), Gaps = 3/83 (3%)
Query: 39 HERRSERDRDRDLERRKEKSRGSKRRSRSREAERSKDHSKKEEKDKREKEEEEAAFDPSK 98
H+ R+E +R+ ERR E R +R + E K S ++++ EK+E+E +
Sbjct: 61 HKLRAELERELK-ERRNELQRLERRLLQREETLDRKMESLDKKEENLEKKEKELSNKEKN 119
Query: 99 LDKEVEATRLELEMQKRRDRIER 121
LD++ E LE + ++R+ +ER
Sbjct: 120 LDEKEE--ELEELIAEQREELER 140
>gnl|CDD|221432 pfam12128, DUF3584, Protein of unknown function (DUF3584). This
protein is found in bacteria and eukaryotes. Proteins in
this family are typically between 943 and 1234 amino
acids in length. This family contains a P-loop motif
suggesting it is a nucleotide binding protein. It may be
involved in replication.
Length = 1198
Score = 30.4 bits (69), Expect = 3.9
Identities = 20/74 (27%), Positives = 38/74 (51%), Gaps = 4/74 (5%)
Query: 56 EKSRGSKRRSRSREAERSKDHSKKEEKDKREKEEEEAAFDPSKLDKEVEATRLELEMQKR 115
E++ S + + E+ + + E+ KR + E A ++LD + RL+ E Q
Sbjct: 613 EEALQSAVAKQKQAEEQLVQANAELEEQKRAEAEARTALKQARLDLQ----RLQNEQQSL 668
Query: 116 RDRIERWRAERKKK 129
+D++E AERK++
Sbjct: 669 KDKLELAIAERKQQ 682
>gnl|CDD|235250 PRK04195, PRK04195, replication factor C large subunit;
Provisional.
Length = 482
Score = 29.9 bits (68), Expect = 4.0
Identities = 13/53 (24%), Positives = 30/53 (56%)
Query: 52 ERRKEKSRGSKRRSRSREAERSKDHSKKEEKDKREKEEEEAAFDPSKLDKEVE 104
+ EK R +++ + ++A K ++EE++K +KEEE+ + +++ E
Sbjct: 416 VEKAEKKREEEKKEKKKKAFAGKKKEEEEEEEKEKKEEEKEEEEEEAEEEKEE 468
Score = 29.5 bits (67), Expect = 5.9
Identities = 14/56 (25%), Positives = 30/56 (53%)
Query: 51 LERRKEKSRGSKRRSRSREAERSKDHSKKEEKDKREKEEEEAAFDPSKLDKEVEAT 106
+ K++ K + + A + K+ ++EEK+K+E+E+EE + + +E E
Sbjct: 417 EKAEKKREEEKKEKKKKAFAGKKKEEEEEEEKEKKEEEKEEEEEEAEEEKEEEEEK 472
>gnl|CDD|173965 cd08045, TAF4, TATA Binding Protein (TBP) Associated Factor 4
(TAF4) is one of several TAFs that bind TBP and is
involved in forming Transcription Factor IID (TFIID)
complex. The TATA Binding Protein (TBP) Associated
Factor 4 (TAF4) is one of several TAFs that bind TBP and
are involved in forming the Transcription Factor IID
(TFIID) complex. TFIID is one of seven General
Transcription Factors (GTF) (TFIIA, TFIIB, TFIID, TFIIE,
TFIIF, and TFIIH) that are involved in accurate
initiation of transcription by RNA polymerase II in
eukaryote. TFIID plays an important role in the
recognition of promoter DNA and assembly of the
pre-initiation complex. TFIID complex is composed of the
TBP and at least 13 TAFs. TAFs from various species were
originally named by their predicted molecular weight or
their electrophoretic mobility in polyacrylamide gels. A
new, unified nomenclature for the pol II TAFs has been
suggested to show the relationship between TAF orthologs
and paralogs. Several hypotheses are proposed for TAFs
functions such as serving as activator-binding sites,
core-promoter recognition or a role in essential
catalytic activity. Each TAF, with the help of a
specific activator, is required only for the expression
of a subset of genes and is not universally involved in
transcription as are GTFs. In yeast and human cells,
TAFs have been found as components of other complexes
besides TFIID. Several TAFs interact via histone-fold
(HFD) motifs; HFD is the interaction motif involved in
heterodimerization of the core histones and their
assembly into nucleosome octamers. The minimal HFD
contains three alpha-helices linked by two loops and is
found in core histones, TAFS and many other
transcription factors. TFIID has a histone octamer-like
substructure. TAF4 domain interacts with TAF12 and makes
a novel histone-like heterodimer that binds DNA and has
a core promoter function of a subset of genes.
Length = 212
Score = 29.6 bits (67), Expect = 4.0
Identities = 19/84 (22%), Positives = 31/84 (36%), Gaps = 9/84 (10%)
Query: 3 RSRRKRSRSRSPSPSHKRPKESRRDKDRDRRRRSRSHERRSERDRDRDLERRKEKSRGSK 62
R ++ R S R K+ R + +R E + +R+ R KSR +
Sbjct: 96 RVDSEKEDERYEITSDVR-KQLRFLEQLER------EEEEKRDEEERERLLRAAKSRSEQ 148
Query: 63 RRSRSREAERSKDHSKKEEKDKRE 86
R + + E K+ EE R
Sbjct: 149 SRLKQKAKEMQKEED--EEMRHRA 170
>gnl|CDD|217927 pfam04147, Nop14, Nop14-like family. Emg1 and Nop14 are novel
proteins whose interaction is required for the
maturation of the 18S rRNA and for 40S ribosome
production.
Length = 809
Score = 30.4 bits (69), Expect = 4.0
Identities = 35/203 (17%), Positives = 77/203 (37%), Gaps = 29/203 (14%)
Query: 39 HERRSERDRDRDLERRKEKSRGSKRRS-RSREAERSKDHSKKEEKDKREKEEEEAAFDP- 96
ER+ ++ D DL + R+ + + +E+ D+ ++ E FD
Sbjct: 191 AERQKAKEEDEDLREELDDDFKDLMSLLRTVKPPPKPPMTPEEKDDEYDQRVRELTFDRR 250
Query: 97 ------SKLDKEV---EATRLE-LEMQKRRDRIERWR--------AERKKKDIETIKKDI 138
+K ++E+ EA RL+ LE +R+ R R E K+ + + +
Sbjct: 251 AQPTDRTKTEEELAKEEAERLKKLE----AERLRRMRGEEEDDEEEEDSKESADDLDDEF 306
Query: 139 KSNLSSGLGGSAPMKKWN-----LEDDSDEDENDNKDENGKTAEEDIDPLDAFMQGVHEE 193
+ + G + ++D+ +ED++D+ +E + + + D + +E
Sbjct: 307 EPDDDDNFGLGQGEEDEEEEEDGVDDEDEEDDDDDLEEEEEDVDLSDEEEDEEDEDSDDE 366
Query: 194 MRKVNKPAVPTTADVKPADSGSK 216
+ + K A+S
Sbjct: 367 DDEEEEEEEKEKKKKKSAESTRS 389
>gnl|CDD|163426 TIGR03714, secA2, accessory Sec system translocase SecA2. Members
of this protein family are homologous to SecA and part
of the accessory Sec system. This system, including both
five core proteins for export and a variable number of
proteins for glycosylation, operates in certain
Gram-positive pathogens for the maturation and delivery
of serine-rich glycoproteins such as the cell surface
glycoprotein GspB in Streptococcus gordonii [Protein
fate, Protein and peptide secretion and trafficking].
Length = 762
Score = 30.0 bits (68), Expect = 4.2
Identities = 39/149 (26%), Positives = 61/149 (40%), Gaps = 27/149 (18%)
Query: 351 PIQAQAIPAIMSGRDLIGIAKTGSGKTVAFVLPLLRHILDQPPLEETDGPMAIIMSPTRE 410
P Q + AI+ + I KTG GKT+ +PL L G A++++
Sbjct: 71 PYDVQVLGAIVLHQGNIAEMKTGEGKTLTATMPLY--------LNALTGKGAMLVTTNDY 122
Query: 411 LCMQIGKEAKKFTKSLGLRVVCVYGGTGISEQISE-----LKR---GAEIIVCTPGR--- 459
L + +E + LGL V G+ + E KR ++I+ T
Sbjct: 123 LAKRDAEEMGPVYEWLGLTV-----SLGVVDDPDEEYDANEKRKIYNSDIVYTTNSALGF 177
Query: 460 --MIDMLAANSGRVTNLRRVTYIVLDEAD 486
+ID LA+N LR Y+++DE D
Sbjct: 178 DYLIDNLASNK-EGKFLRPFNYVIVDEVD 205
>gnl|CDD|215770 pfam00176, SNF2_N, SNF2 family N-terminal domain. This domain is
found in proteins involved in a variety of processes
including transcription regulation (e.g., SNF2, STH1,
brahma, MOT1), DNA repair (e.g. ERCC6, RAD16, RAD5), DNA
recombination (e.g. RAD54), and chromatin unwinding
(e.g. ISWI) as well as a variety of other proteins with
little functional information (e.g. lodestar, ETL1).
Length = 301
Score = 29.6 bits (67), Expect = 4.2
Identities = 26/126 (20%), Positives = 51/126 (40%), Gaps = 23/126 (18%)
Query: 372 TGSGKT---VAFVLPLLRHILDQPPLEETDGPMAIIMSPTRELCMQIGKEAKKFTKSLGL 428
G GKT +A + L+ D+ GP ++ + E +K+ L
Sbjct: 25 MGLGKTLQTIALLATYLKEGKDRR------GPTLVVCPLS--TLHNWLNEFEKWA--PAL 74
Query: 429 RVVCVYGGTGISEQISELK----RGAEIIVCTPGRMIDMLAANSGRVTNLRRVT--YIVL 482
RVV +G ++ + ++++ T ++L + ++ L +V +VL
Sbjct: 75 RVVVYHGDGRERSKLRQSMAKRLDTYDVVITT----YEVLRKDKKLLSLLNKVEWDRVVL 130
Query: 483 DEADRM 488
DEA R+
Sbjct: 131 DEAHRL 136
>gnl|CDD|180626 PRK06565, PRK06565, amidase; Validated.
Length = 566
Score = 30.1 bits (68), Expect = 4.2
Identities = 25/76 (32%), Positives = 36/76 (47%), Gaps = 14/76 (18%)
Query: 156 NLEDD--SDEDENDNKDENGKTAEEDIDPLDAFMQGVHEEMRKVNKP---------AV-- 202
N E D + DE N + G + + I L ++G+ E+ RK++ AV
Sbjct: 414 NREGDLAAGMDEYVNMAKRGLKSWDQIPTLPDGLRGL-EKTRKLDLEDWMDGLGLDAVLF 472
Query: 203 PTTADVKPADSGSKPA 218
PT ADV PAD+ PA
Sbjct: 473 PTVADVGPADADVNPA 488
>gnl|CDD|221275 pfam11861, DUF3381, Domain of unknown function (DUF3381). This
domain is functionally uncharacterized. This domain is
found in eukaryotes. This presumed domain is typically
between 156 and 174 amino acids in length. This domain is
found associated with pfam07780, pfam01728.
Length = 154
Score = 29.2 bits (66), Expect = 4.2
Identities = 16/54 (29%), Positives = 33/54 (61%), Gaps = 6/54 (11%)
Query: 78 KKEEKDKREKEEEEAAFDPSKLDKEVEATRLELEMQKRRDRIERWRAERKKKDI 131
K++E+++ E+ E E + ++D+ +E +L+ +KRR+ ERK+K+I
Sbjct: 99 KEKEEEEEEEVEVEELDEEEQIDELLEKELAKLKREKRRE------NERKQKEI 146
>gnl|CDD|233069 TIGR00643, recG, ATP-dependent DNA helicase RecG. [DNA metabolism,
DNA replication, recombination, and repair].
Length = 630
Score = 30.0 bits (68), Expect = 4.2
Identities = 25/98 (25%), Positives = 35/98 (35%), Gaps = 18/98 (18%)
Query: 348 KPTPIQAQAIPAIMS--------GRDLIGIAKTGSGKTVAFVLPLLRHILDQPPLEETDG 399
K T Q + + I+ R L G GSGKT+ L +L I G
Sbjct: 235 KLTRAQKRVVKEILQDLKSDVPMNRLLQG--DVGSGKTLVAALAMLAAI--------EAG 284
Query: 400 PMAIIMSPTRELCMQIGKEAKKFTKSLGLRVVCVYGGT 437
+M+PT L Q + LG+ V + G
Sbjct: 285 YQVALMAPTEILAEQHYNSLRNLLAPLGIEVALLTGSL 322
>gnl|CDD|223649 COG0576, GrpE, Molecular chaperone GrpE (heat shock protein)
[Posttranslational modification, protein turnover,
chaperones].
Length = 193
Score = 29.2 bits (66), Expect = 4.3
Identities = 18/80 (22%), Positives = 39/80 (48%), Gaps = 6/80 (7%)
Query: 69 EAERSKDHSKKEEKDKREKEEEEAAFDPSKLDKEVEATRLELEMQKRRDRIERWRAE--- 125
E +++ + E+ ++ E EEEE + +++ E LE ++++ +D+ R +AE
Sbjct: 9 EEPDAEETEEAEKSEEEEAEEEEPEEENELEEEQQEIAELEAQLEELKDKYLRAQAEFEN 68
Query: 126 ---RKKKDIETIKKDIKSNL 142
R +++ E KK
Sbjct: 69 LRKRTEREREEAKKYAIEKF 88
>gnl|CDD|218177 pfam04615, Utp14, Utp14 protein. This protein is found to be part
of a large ribonucleoprotein complex containing the U3
snoRNA. Depletion of the Utp proteins impedes production
of the 18S rRNA, indicating that they are part of the
active pre-rRNA processing complex. This large RNP
complex has been termed the small subunit (SSU)
processome.
Length = 728
Score = 30.0 bits (68), Expect = 4.4
Identities = 32/179 (17%), Positives = 61/179 (34%), Gaps = 19/179 (10%)
Query: 18 HKRPKESRRDKDRDRRRRSRSHERRSERDRDRDLERRKEKSRGSKRRSRSREAERSKDHS 77
+R + +++++ R E + + E +K+ G RR E + S
Sbjct: 379 MQRAEARKKEENDAEIEELRRELEGEEESDEEENEEPSKKNVG--RRKFGPENGEKEAES 436
Query: 78 KKEEKDKREKEEEEAAFDPSKLDKEVEATRLELEMQKRRDRIER---------------W 122
KK +K+ + + +E+ D + ++ E ++E K R E+ W
Sbjct: 437 KKLKKENKNEFKEKKESDEEEELEDEEEAKVEKVANKLLKRSEKAQKEEEEEELDEENPW 496
Query: 123 RAERKKKDIETIKKDIKSNLSSGLGGSAPMKKWNLEDDSDED--ENDNKDENGKTAEED 179
K+D K SS L +A + E ++ EED
Sbjct: 497 LKTTSSVGKSAKKQDSKKKSSSKLDKAANKISKAAVKVKKKKKKEKSIDLDDDLIDEED 555
>gnl|CDD|219655 pfam07946, DUF1682, Protein of unknown function (DUF1682). The
members of this family are all hypothetical eukaryotic
proteins of unknown function. One member is described as
being an adipocyte-specific protein, but no evidence of
this was found.
Length = 322
Score = 29.5 bits (67), Expect = 4.5
Identities = 17/73 (23%), Positives = 33/73 (45%), Gaps = 8/73 (10%)
Query: 55 KEKSRGSKRRSRSREAERSKDHSKKEEKDKREKEEEEAAFDPSKLDKEVEATRLELEMQK 114
K K R E +++E+ + +KEE++ K ++E + +L E Q+
Sbjct: 258 LRKVD--KTREEEEEKILKAAEEERQEEAQEKKEEKK------KEEREAKLAKLSPEEQR 309
Query: 115 RRDRIERWRAERK 127
+ + ER + RK
Sbjct: 310 KLEEKERKKQARK 322
>gnl|CDD|219124 pfam06658, DUF1168, Protein of unknown function (DUF1168). This
family consists of several hypothetical eukaryotic
proteins of unknown function.
Length = 142
Score = 28.9 bits (65), Expect = 4.6
Identities = 29/116 (25%), Positives = 48/116 (41%), Gaps = 15/116 (12%)
Query: 63 RRSRSREAERSKDHSKKEEKDKREKEEEEAAFDPSKLDKEVEATRLELEMQKRRDRIERW 122
R R RE ER + +EK K+E E D+E + R E + K ++ +
Sbjct: 42 RALRRREYERLE---LMDEKWKKETE-----------DEEFQQKREEKKR-KDEEKTAKK 86
Query: 123 RAERKKKDIETIKKDIKSNLSSGLGGSAPMKKWNLEDDSDEDENDNKDENGKTAEE 178
RA+R+KK + KK + D+ +E E D ++E + E+
Sbjct: 87 RAKRQKKKQKKKKKKKAKKGNKKEEKEGSKSSEESSDEEEEGEEDKQEEPVEIMEK 142
>gnl|CDD|234229 TIGR03490, Mycoplas_LppA, mycoides cluster lipoprotein, LppA/P72
family. Members of this protein family occur in
Mycoplasma mycoides, Mycoplasma hyopneumoniae, and
related Mycoplasmas in small paralogous families that
may also include truncated forms and/or pseudogenes.
Members are predicted lipoproteins with a conserved
signal peptidase II processing and lipid attachment
site. Note that the name for certain characterized
members, p72, reflects an anomalous apparent molecular
weight, given a theoretical MW of about 61 kDa.
Length = 541
Score = 29.8 bits (67), Expect = 4.6
Identities = 30/172 (17%), Positives = 63/172 (36%), Gaps = 30/172 (17%)
Query: 16 PSHKRPKESRR----DKDRDRRRRSRSHERRSERDRDRDLERRKEKSRGSKRRSRSREAE 71
P+ PK ++ + + +S + + E + E++ + S+ + E E
Sbjct: 42 PNENTPKIPKKPDNKEPSENNNNKSNNENKDEENPSSTNPEKKPDPSKNKE------EIE 95
Query: 72 RSKDHSKKEEKDKREKEEEEAAFDPS-----------KLDKEVEATRLELEMQKRRDRIE 120
+ KD KK +K + + D KL KE+ L ++D
Sbjct: 96 KPKDEPKKPDKKPQADQPNNVHADQPNNNKVDFSDLDKLKKELSFENFTL--YSQKDPKT 153
Query: 121 RWRAERKKKDIETIKKDIKSNLSSGLGGSAPMKKWNLEDDSDEDENDNKDEN 172
K D+ T K I + K+NL+ +S+++ ++ ++
Sbjct: 154 AL--SSLKGDLSTFFKTIFYK-----TNKDILDKYNLKLESNKEPKEDFEKG 198
>gnl|CDD|218737 pfam05764, YL1, YL1 nuclear protein. The proteins in this family
are designated YL1. These proteins have been shown to be
DNA-binding and may be a transcription factor.
Length = 238
Score = 29.3 bits (66), Expect = 4.7
Identities = 30/131 (22%), Positives = 55/131 (41%), Gaps = 19/131 (14%)
Query: 6 RKRSRSRSPSPSHKRPKESRRDKDRDRRRRSRSHERRSERDRDRDLERRKEKSRGSKRRS 65
++ + P +R + D RR+ SRS +++ L+ R+ + + + ++
Sbjct: 112 SPKAAAPRPKKKSERISWAPTLLDSPRRKSSRSSTVQNKEATHERLKEREIRRKKIQAKA 171
Query: 66 RSREAERSKDHSKKEEKDKREKEEEEAAFDPSKLDKEVEATRLELEMQKRRDRIERWRAE 125
R R K+++K+K +EE A EA E K +R E E
Sbjct: 172 RKR---------KEKKKEKELTQEERLA----------EAKETERINLKSLERYEEQEEE 212
Query: 126 RKKKDIETIKK 136
+KK I+ +KK
Sbjct: 213 KKKAKIQALKK 223
>gnl|CDD|218883 pfam06075, DUF936, Plant protein of unknown function (DUF936).
This family consists of several hypothetical proteins
from Arabidopsis thaliana and Oryza sativa. The function
of this family is unknown.
Length = 564
Score = 29.8 bits (67), Expect = 4.8
Identities = 22/138 (15%), Positives = 36/138 (26%), Gaps = 15/138 (10%)
Query: 2 VRSRRKRSRSR----SPSPSHKRPKESRRDKDRDRRRRSRSHERRSERDRDRDLERRKE- 56
V R RS S +P+ R S S R R
Sbjct: 161 VIGPRPRSFSELNLTDRTPAKVRSSRSELGAPSPSGGTSCPSSSGGRRSSIGSRRLRGSA 220
Query: 57 ---KSRGSKRRSRSREAERSKDHSKKEEKDKREKEEEEAAFDPSKLDKEVEATRLELEMQ 113
K R S+ K R +A P K + +AT+ ++
Sbjct: 221 SLRKKVAVLSAPRKP---GSRSSDCKSSPRARSS----SAKSPFKSSIQRKATKALSKLS 273
Query: 114 KRRDRIERWRAERKKKDI 131
R + ++ + +
Sbjct: 274 LRASPKDTSKSSKSEVAP 291
>gnl|CDD|225995 COG3464, COG3464, Transposase and inactivated derivatives [DNA
replication, recombination, and repair].
Length = 402
Score = 29.7 bits (67), Expect = 4.9
Identities = 17/115 (14%), Positives = 31/115 (26%), Gaps = 4/115 (3%)
Query: 28 KDRDRRRRSRSHERRSERDRDRDLERRKEKSRGSKRRSRSREAERSKDHSKKEEKDKREK 87
+ R R + D+ ++ S+ S ++ E
Sbjct: 241 SRALEQVRRRVRNQFRSEDKRIKALWKRRARLSSRYLCDKNFQNLSLLRYERLSPILGEL 300
Query: 88 EE-EEAAFDPSKLDKEVEATRLELEMQKRRDRIERWRAER---KKKDIETIKKDI 138
A L +E+ A R ++K IE + T+KK
Sbjct: 301 YSLYPALRVAYDLAQELAADRRREAVKKLIQWIEDAVKSAIKELARLAATLKKHQ 355
>gnl|CDD|222665 pfam14303, NAM-associated, No apical meristem-associated C-terminal
domain. This domain is found in a number of different
types of plant proteins including NAM-like proteins.
Length = 147
Score = 28.9 bits (65), Expect = 5.0
Identities = 31/109 (28%), Positives = 51/109 (46%), Gaps = 6/109 (5%)
Query: 19 KRPKESRRDKDRDRRRRSRSHERRSERDRDRDLERRKEKSRGSKRRSRSREAERSKDHSK 78
KR + + K + +R S +E + D D E E R + R +++E R
Sbjct: 21 KRSELKKASKKKKKRSNSSPGSTSNEENEDEDDESTAESKR-PEGRKKAKEKLRRDKLKA 79
Query: 79 KEEKDKREKEEEEAAFDPSKLDKEVEATRLELEMQKRRDRIERWRAERK 127
K+E+ ++EKE+EE K E E R ELE K++ + + E+K
Sbjct: 80 KKEEAEKEKEKEERF---MKALAEAEKERAELE--KKKAEAKLMKEEKK 123
>gnl|CDD|221803 pfam12846, AAA_10, AAA-like domain. This family of domains contains
a P-loop motif that is characteristic of the AAA
superfamily. Many of the proteins in this family are
conjugative transfer proteins.
Length = 316
Score = 29.7 bits (67), Expect = 5.1
Identities = 17/72 (23%), Positives = 30/72 (41%), Gaps = 16/72 (22%)
Query: 369 IAKTGSGKTVAFVLPLLRHILDQPPLEETDGPMAIIMSPTRELCMQIGKEAKKFTKSLGL 428
+ +GSGK+ LL+ + + G I++ P E ++LG
Sbjct: 7 VGPSGSGKST-----LLKLLALRLLAR---GGRVIVIDP--------KGEYSGLARALGG 50
Query: 429 RVVCVYGGTGIS 440
V+ + G+GIS
Sbjct: 51 EVIDLGPGSGIS 62
>gnl|CDD|213994 cd12110, PHP_HisPPase_Hisj_like, Polymerase and Histidinol
Phosphatase domain of Histidinol phosphate phosphatase
of Hisj like. Bacillus subtilis YtvP HisJ has strong
histidinol phosphate phosphatase (HisPPase) activity.
The PHP (also called histidinol phosphatase-2/HIS2)
domain is associated with several types of DNA
polymerases, such as PolIIIA and family X DNA
polymerases, stand alone histidinol phosphate
phosphatases (HisPPases), and a number of
uncharacterized protein families. HisPPase catalyzes the
eighth step of histidine biosynthesis, in which
L-histidinol phosphate undergoes dephosphorylation to
produce histidinol. The PHP domain has four conserved
sequence motifs and contains an invariant histidine that
is involved in metal ion coordination. The PHP domain of
HisPPase is structurally homologous to other members of
the PHP family that have a distorted (beta/alpha)7
barrel fold with a trinuclear metal site on the
C-terminal side of the barrel.
Length = 244
Score = 29.5 bits (67), Expect = 5.2
Identities = 17/48 (35%), Positives = 25/48 (52%), Gaps = 7/48 (14%)
Query: 270 KELSKVDHSTIEYLPFRKDFYVEVPEIARMTPEEVEKYKEELEGIRVK 317
E+ +H+ LPF D Y E +RM EE+E Y EE+ ++ K
Sbjct: 30 TEIGFSEHA---PLPFEFDDYPE----SRMAEEELEDYVEEIRRLKEK 70
>gnl|CDD|227935 COG5648, NHP6B, Chromatin-associated proteins containing the HMG
domain [Chromatin structure and dynamics].
Length = 211
Score = 29.1 bits (65), Expect = 5.3
Identities = 18/94 (19%), Positives = 30/94 (31%), Gaps = 1/94 (1%)
Query: 50 DLERRKEKSRGSKRRSRSREAERSKDHSKKEEKDKREKEEEEAAFDPSKLDKEVEATRLE 109
D E+ + R R + E+ + + K K E K++ L
Sbjct: 114 DEEKEPYYKEANSDRERY-QREKEEYNKKLPNKAPIGPFIENEPKIRPKVEGPSPDKALV 172
Query: 110 LEMQKRRDRIERWRAERKKKDIETIKKDIKSNLS 143
E + +KKK I+ KK + S
Sbjct: 173 EETKIISKAWSELDESKKKKYIDKYKKLKEEYDS 206
>gnl|CDD|215507 PLN02939, PLN02939, transferase, transferring glycosyl groups.
Length = 977
Score = 29.9 bits (67), Expect = 5.6
Identities = 21/102 (20%), Positives = 42/102 (41%), Gaps = 5/102 (4%)
Query: 8 RSRSRSPSPSHKRPKESRRDKDRDRRRRSRSHERRSERDRDRDLERRKEKSRGSKRRSRS 67
RSR+ PS +R S R RRR S +++ +R ++ ++R S+ +
Sbjct: 18 RSRAPFYLPSRRRLAVSCR-----ARRRGFSSQQKKKRGKNIAPKQRSSNSKLQSNTDEN 72
Query: 68 REAERSKDHSKKEEKDKREKEEEEAAFDPSKLDKEVEATRLE 109
+ E + + E K +++ + D+ + A E
Sbjct: 73 GQLENTSLRTVMELPQKSTSSDDDHNRASMQRDEAIAAIDNE 114
>gnl|CDD|235640 PRK05901, PRK05901, RNA polymerase sigma factor; Provisional.
Length = 509
Score = 29.6 bits (67), Expect = 5.7
Identities = 20/159 (12%), Positives = 54/159 (33%), Gaps = 10/159 (6%)
Query: 55 KEKSRGSKRRSRSREAERSKDHSKKEEKDKREKEEEEAAFDPSKLDKEVEATRLELEMQK 114
K+ ++ ++ + + + + K++ ++E + ++ ++ ++
Sbjct: 64 KDTDDATESDIPKKKTKTAAKAAAAKAPAKKKLKDELDSSKKAEKKNALDKDDDLNYVKD 123
Query: 115 RRDRIERWRAERKKKDIETIKKDIKSNLSSGLGGSAPMKKWNLEDDSDEDENDNKDENGK 174
+ + D + DI + ++D D+D+ D++DE K
Sbjct: 124 IDVLNQADDDDDDDDDDDLDDDDID----------DDDDDEDDDEDDDDDDVDDEDEEKK 173
Query: 175 TAEEDIDPLDAFMQGVHEEMRKVNKPAVPTTADVKPADS 213
A+E D E+ + + A AD
Sbjct: 174 EAKELEKLSDDDDFVWDEDDSEALRQARKDAKLTATADP 212
Score = 28.8 bits (65), Expect = 9.3
Identities = 27/179 (15%), Positives = 60/179 (33%), Gaps = 9/179 (5%)
Query: 43 SERDRDRDLERRKEKSRGSKRRSRSREAERSKDHSKKEEKDKREKEEEEAAFDPSKLDKE 102
+E + + L++ KS+ ++ E + K E+ + + E
Sbjct: 12 AEEEAKKKLKKLAAKSKSKGFITKEEIKEALESKKKTPEQIDQVLIFLSGMVKDTDDATE 71
Query: 103 VEATRLELEMQKRRDRIERWRAERKKKDIETIKKDIKSN--------LSSGLGGSAPMKK 154
+ + + + + + A++K KD K + L+ +
Sbjct: 72 SDIPKKKTKTAAKAAAAKA-PAKKKLKDELDSSKKAEKKNALDKDDDLNYVKDIDVLNQA 130
Query: 155 WNLEDDSDEDENDNKDENGKTAEEDIDPLDAFMQGVHEEMRKVNKPAVPTTADVKPADS 213
+ +DD D+D+ D+ D + +ED D D E+ K + +D
Sbjct: 131 DDDDDDDDDDDLDDDDIDDDDDDEDDDEDDDDDDVDDEDEEKKEAKELEKLSDDDDFVW 189
>gnl|CDD|221041 pfam11241, DUF3043, Protein of unknown function (DUF3043). Some
members in this family of proteins with unknown
function are annotated as membrane proteins. This
cannot be confirmed.
Length = 168
Score = 28.8 bits (65), Expect = 5.8
Identities = 17/56 (30%), Positives = 27/56 (48%), Gaps = 8/56 (14%)
Query: 20 RPKESRRDKDRDRRRRSRSHERRSERDRDRDLERRKEKSRGSKRRSRSREAERSKD 75
RP RR+ + RRR +R++ + R R+E RR+R+R A + D
Sbjct: 4 RPTPKRREAEAARRRPLVPEDRKAAKKAAR--AARRE------RRARARAAMMAGD 51
>gnl|CDD|153359 cd07675, F-BAR_FNBP1L, The F-BAR (FES-CIP4 Homology and
Bin/Amphiphysin/Rvs) domain of Formin Binding Protein
1-Like. F-BAR domains are dimerization modules that
bind and bend membranes and are found in proteins
involved in membrane dynamics and actin reorganization.
FormiN Binding Protein 1-Like (FNBP1L), also known as
Toca-1 (Transducer of Cdc42-dependent actin assembly),
forms a complex with neural Wiskott-Aldrich syndrome
protein (N-WASP). The FNBP1L/N-WASP complex induces the
formation of filopodia and endocytic vesicles. FNBP1L is
required for Cdc42-induced actin assembly and is
essential for autophagy of intracellular pathogens. It
contains an N-terminal F-BAR domain, a central
Cdc42-binding HR1 domain, and a C-terminal SH3 domain.
F-BAR domains form banana-shaped dimers with a
positively-charged concave surface that binds to
negatively-charged lipid membranes. They can induce
membrane deformation in the form of long tubules.
Length = 252
Score = 29.2 bits (65), Expect = 5.8
Identities = 20/80 (25%), Positives = 42/80 (52%), Gaps = 3/80 (3%)
Query: 52 ERRKEKSRGSKRRSRSREAERSKDHSKKE-EKDKREKEEEEAAFDPSKLDKEVEATRLEL 110
ER+ G K + + D+SKK+ E++ RE E+ + +++ +LD + AT+ ++
Sbjct: 107 ERKMHLQEGRKAQQYLDMCWKQMDNSKKKFERECREAEKAQQSYE--RLDNDTNATKSDV 164
Query: 111 EMQKRRDRIERWRAERKKKD 130
E K++ + A+ K +
Sbjct: 165 EKAKQQLNLRTHMADESKNE 184
>gnl|CDD|177439 PHA02620, PHA02620, VP3; Provisional.
Length = 353
Score = 29.6 bits (66), Expect = 5.8
Identities = 13/35 (37%), Positives = 19/35 (54%)
Query: 4 SRRKRSRSRSPSPSHKRPKESRRDKDRDRRRRSRS 38
+++KR SR S K P+ S + + R R SRS
Sbjct: 319 NKKKRRMSRGSSQKAKGPRASSKTSYKRRSRSSRS 353
>gnl|CDD|173534 PTZ00341, PTZ00341, Ring-infected erythrocyte surface antigen;
Provisional.
Length = 1136
Score = 29.8 bits (66), Expect = 6.1
Identities = 22/111 (19%), Positives = 56/111 (50%)
Query: 71 ERSKDHSKKEEKDKREKEEEEAAFDPSKLDKEVEATRLELEMQKRRDRIERWRAERKKKD 130
E +++ ++ +++ E+ EE + +E+E E + + IE + E ++
Sbjct: 1013 ENIEENVEEYDEENVEEVEENVEEYDEENVEEIEENAEENVEENIEENIEEYDEENVEEI 1072
Query: 131 IETIKKDIKSNLSSGLGGSAPMKKWNLEDDSDEDENDNKDENGKTAEEDID 181
E I+++I+ N+ + + + N+E++ +E+ +N +EN + E+ D
Sbjct: 1073 EENIEENIEENVEENVEENVEEIEENVEENVEENAEENAEENAEENAEEYD 1123
>gnl|CDD|220369 pfam09731, Mitofilin, Mitochondrial inner membrane protein.
Mitofilin controls mitochondrial cristae morphology.
Mitofilin is enriched in the narrow space between the
inner boundary and the outer membranes, where it forms a
homotypic interaction and assembles into a large
multimeric protein complex. The first 78 amino acids
contain a typical amino-terminal-cleavable mitochondrial
presequence rich in positive-charged and hydroxylated
residues and a membrane anchor domain. In addition, it
has three centrally located coiled coil domains.
Length = 493
Score = 29.6 bits (67), Expect = 6.1
Identities = 23/97 (23%), Positives = 40/97 (41%), Gaps = 8/97 (8%)
Query: 48 DRDLERR-KEKSRGSKRRSRSREAERSKDHSKKEEKDKRE-----KEEEEAAFDPSKLDK 101
+ +LER KEK + R + EK R KEE ++ KL +
Sbjct: 187 EEELERALKEKREELLSKLEEELLARLESKEAALEKQLRLEFEREKEELRKKYE-EKLRQ 245
Query: 102 EVEATRLELEMQKRRDRIERWRAERKKKDIETIKKDI 138
E+E E QK ++ + E +++ + IK+ +
Sbjct: 246 ELERQAEAHE-QKLKNELALQAIELQREFNKEIKEKV 281
>gnl|CDD|218292 pfam04851, ResIII, Type III restriction enzyme, res subunit.
Length = 100
Score = 27.6 bits (62), Expect = 6.2
Identities = 10/31 (32%), Positives = 18/31 (58%)
Query: 348 KPTPIQAQAIPAIMSGRDLIGIAKTGSGKTV 378
+ P Q +AI ++ + + + TGSGKT+
Sbjct: 3 ELRPYQEEAIERLLEKKRGLIVMATGSGKTL 33
>gnl|CDD|205235 pfam13054, DUF3915, Protein of unknown function (DUF3915). This
family of proteins is functionally uncharacterized.
This family of proteins is found in bacteria. Proteins
in this family are approximately 120 amino acids in
length.
Length = 116
Score = 27.9 bits (62), Expect = 6.5
Identities = 10/25 (40%), Positives = 14/25 (56%)
Query: 35 RSRSHERRSERDRDRDLERRKEKSR 59
R H R + +R+R+ ER KE R
Sbjct: 12 RDCHHHEREDFEREREREREKEPQR 36
>gnl|CDD|151146 pfam10628, CotE, Outer spore coat protein E (CotE). CotE is a
morphogenic protein that is required for the assembly of
the outer coat of the endospore and spore resistance to
lysozyme. CotE also regulates the expression of cotA,
cotB, cotC and other genes encoding spore outer coat
proteins. The timing of cotE expression has been shown
in Bacillus subtilis to affect spore coat morphology but
not lysozyme resistance.
Length = 182
Score = 28.6 bits (64), Expect = 6.8
Identities = 16/40 (40%), Positives = 19/40 (47%), Gaps = 7/40 (17%)
Query: 154 KWNLEDDSDEDENDNKDENGKTAEEDIDPLDAFMQGVHEE 193
EDD +EDE DE ED+DP F+ G EE
Sbjct: 150 DGCEEDDDEEDEEITDDE-----FEDLDP--DFLVGEEEE 182
>gnl|CDD|215180 PLN02316, PLN02316, synthase/transferase.
Length = 1036
Score = 29.5 bits (66), Expect = 7.1
Identities = 16/56 (28%), Positives = 30/56 (53%), Gaps = 8/56 (14%)
Query: 32 RRRRSRSHERRSERDRDRDLERRKEKSRGSKRRSRSREAERSKDHSKKEEKDKREK 87
RR + + +ER+R + +RR+E+ + + EA+R +K E + +REK
Sbjct: 255 RRELEKLAKEEAERERQAEEQRRREEEKAA------MEADR--AQAKAEVEKRREK 302
>gnl|CDD|237133 PRK12553, PRK12553, ATP-dependent Clp protease proteolytic subunit;
Reviewed.
Length = 207
Score = 28.8 bits (65), Expect = 7.2
Identities = 14/40 (35%), Positives = 24/40 (60%), Gaps = 3/40 (7%)
Query: 105 ATRLEL---EMQKRRDRIERWRAERKKKDIETIKKDIKSN 141
A+ LE+ E+ + R+R+ER AE + +E I+KD +
Sbjct: 143 ASDLEIQAREILRMRERLERILAEHTGQSVEKIRKDTDRD 182
>gnl|CDD|234634 PRK00103, PRK00103, rRNA large subunit methyltransferase;
Provisional.
Length = 157
Score = 28.2 bits (64), Expect = 7.4
Identities = 23/101 (22%), Positives = 34/101 (33%), Gaps = 26/101 (25%)
Query: 485 ADRM-FDMGFEPQVMRIIDNVRPDRQTVMFSATFPRQMEALARRILNKPIEIQVGGRSVV 543
R + E + I D RP +A + RIL + +
Sbjct: 24 LKRFPRYLKLEL--IEIPDEKRPK------NADAEQIKAKEGERILAA-----LPKGA-- 68
Query: 544 CKEVEQHVIVLDEEQKML---KLLELLGIYQDQG-SVIVFV 580
VI LDE K L + + L ++D G S + FV
Sbjct: 69 ------RVIALDERGKQLSSEEFAQELERWRDDGRSDVAFV 103
>gnl|CDD|217756 pfam03839, Sec62, Translocation protein Sec62.
Length = 217
Score = 28.6 bits (64), Expect = 7.5
Identities = 12/72 (16%), Positives = 26/72 (36%), Gaps = 6/72 (8%)
Query: 49 RDLERRKEKSRGSKRRSRSREAERSKDHSKKEEKDKREKEEEEAAFDPSKLDKEVEATRL 108
R LE K K+ K S+D + ++ K A ++ K +
Sbjct: 14 RALESEKYKANKDKGNPEIYNKINSQDKAIEKFK------LLIKAQMAERVKKLHSQEKK 67
Query: 109 ELEMQKRRDRIE 120
E + + ++ ++
Sbjct: 68 EEKKKPKKKKVP 79
>gnl|CDD|222011 pfam13257, DUF4048, Domain of unknown function (DUF4048). This
presumed domain is functionally uncharacterized. This
domain family is found in eukaryotes, and is typically
between 228 and 257 amino acids in length.
Length = 242
Score = 28.6 bits (64), Expect = 7.6
Identities = 19/73 (26%), Positives = 24/73 (32%), Gaps = 17/73 (23%)
Query: 9 SRSRSPSPSHKRPKESRRDKDRDRRRRSRSHERRSERDRDRDLERRKEKSRGSKRRSRSR 68
+ SR+ P R RR SRS R R + L S R S S+
Sbjct: 117 TESRTVPPP------------RSRRSGSRSTSRSRLRLQGGSLSS-----SRSSRSSTSK 159
Query: 69 EAERSKDHSKKEE 81
A KD +
Sbjct: 160 GATSGKDSKSADI 172
>gnl|CDD|215641 PLN03237, PLN03237, DNA topoisomerase 2; Provisional.
Length = 1465
Score = 29.4 bits (66), Expect = 7.6
Identities = 37/226 (16%), Positives = 73/226 (32%), Gaps = 4/226 (1%)
Query: 48 DRDLERRKEKSRGSKRRSRSREAERSKDHSKKEEK----DKREKEEEEAAFDPSKLDKEV 103
D E +EK + + R S A++ + K+ K+ E E E
Sbjct: 1172 DAKAEEAREKLQRAAARGESGAAKKVSRQAPKKPAPKKTTKKASESETTEETYGSSAMET 1231
Query: 104 EATRLELEMQKRRDRIERWRAERKKKDIETIKKDIKSNLSSGLGGSAPMKKWNLEDDSDE 163
E ++ + R ++ A K+K+ E D+K L++ SAP + +E+
Sbjct: 1232 ENVAEVVKPKGRAGAKKKAPAAAKEKEEEDEILDLKDRLAAYNLDSAPAQSAKMEETVKA 1291
Query: 164 DENDNKDENGKTAEEDIDPLDAFMQGVHEEMRKVNKPAVPTTADVKPADSGSKPAGVVIV 223
K D+ + + KPA + K A
Sbjct: 1292 VPARRAAARKKPLASVSVISDSDDDDDDFAVEVSLAERLKKKGGRKPAAANKKAAKPPAA 1351
Query: 224 TGVVKKSVEKAKGELMEENQDGLEYSSEEEQEDLTSTAANLASKQK 269
+ ++ +L+ E E ++ + A+ +K+
Sbjct: 1352 AKKRGPATVQSGQKLLTEMLKPAEAIGISPEKKVRKMRASPFNKKS 1397
>gnl|CDD|237177 PRK12704, PRK12704, phosphodiesterase; Provisional.
Length = 520
Score = 29.0 bits (66), Expect = 7.7
Identities = 23/85 (27%), Positives = 45/85 (52%), Gaps = 7/85 (8%)
Query: 39 HERRSERDRDRDLERRKEKSRGSKRRSRSREAERSKDHSKKEEKDKREKEEEEAAFDPSK 98
H+ R+E +++ ERR E + ++R +E + K E +KRE+E E+ + +
Sbjct: 67 HKLRNEFEKEL-RERRNELQK-LEKRLLQKEENLDR---KLELLEKREEELEKKEKELEQ 121
Query: 99 LDKEVEATRLELE--MQKRRDRIER 121
+E+E ELE ++++ +ER
Sbjct: 122 KQQELEKKEEELEELIEEQLQELER 146
>gnl|CDD|212661 cd07777, FGGY_SHK_like, sedoheptulokinase-like proteins; a
subfamily of the FGGY family of carbohydrate kinases.
This subfamily is predominantly composed of
uncharacterized bacterial and eukaryotic proteins with
similarity to human sedoheptulokinase (SHK, also known
as D-altro-heptulose or heptulokinase, EC 2.7.1.14)
encoded by the carbohydrate kinase-like (CARKL/SHPK)
gene. SHK catalyzes the ATP-dependent phosphorylation of
sedoheptulose to produce sedoheptulose 7-phosphate and
ADP. The presence of Mg2+ or Mn2+ might be required for
catalytic activity. Members of this subfamily belong to
the FGGY family of carbohydrate kinases, the monomers of
which contain two large domains, which are separated by
a deep cleft that forms the active site. This model
includes both the N-terminal domain, which adopts a
ribonuclease H-like fold, and the structurally related
C-terminal domain.
Length = 448
Score = 29.2 bits (66), Expect = 7.7
Identities = 10/35 (28%), Positives = 18/35 (51%)
Query: 259 STAANLASKQKKELSKVDHSTIEYLPFRKDFYVEV 293
T+A L+ + V ++ EY P+ K+ Y+ V
Sbjct: 263 GTSAQLSFLPVFKPETVPPASPEYRPYFKNHYLAV 297
>gnl|CDD|221408 pfam12072, DUF3552, Domain of unknown function (DUF3552). This
presumed domain is functionally uncharacterized. This
domain is found in bacteria, archaea and eukaryotes.
This domain is about 200 amino acids in length. This
domain is found associated with pfam00013, pfam01966.
This domain has a single completely conserved residue A
that may be functionally important.
Length = 201
Score = 28.7 bits (65), Expect = 7.9
Identities = 28/84 (33%), Positives = 44/84 (52%), Gaps = 4/84 (4%)
Query: 38 SHERRSERDRDRDLERRKEKSRGSKRRSRSREA--ERSKDHSKKEEKDKREKEEEEAAFD 95
H+ R+E +R+ ERR E R ++R +E +R + +K+E+ EKE+E AA
Sbjct: 62 IHKLRAEAEREL-KERRNELQR-QEKRLLQKEETLDRKDESLEKKEESLEEKEKELAARQ 119
Query: 96 PSKLDKEVEATRLELEMQKRRDRI 119
+KE E L E Q+ +RI
Sbjct: 120 QQLEEKEEELEELIEEQQQELERI 143
>gnl|CDD|110514 pfam01517, HDV_ag, Hepatitis delta virus delta antigen. The
hepatitis delta virus (HDV) encodes a single protein,
the hepatitis delta antigen (HDAg). The central region
of this protein has been shown to bind RNA. Several
interactions are also mediated by a coiled-coil region
at the N terminus of the protein.
Length = 194
Score = 28.7 bits (64), Expect = 7.9
Identities = 22/84 (26%), Positives = 39/84 (46%), Gaps = 5/84 (5%)
Query: 7 KRSRSRSPSPSHKRPKESRRDKDRDRRRRSRS-----HERRSERDRDRDLERRKEKSRGS 61
++ + +P KRP+ + + D +R + ERR R R ++K+ S G
Sbjct: 59 RKDKDGEGAPPAKRPRTDQMEVDSGPGKRPHAGGFTDQERRDHRRRKALENKKKQLSSGG 118
Query: 62 KRRSRSREAERSKDHSKKEEKDKR 85
K SR E E + + EE+++R
Sbjct: 119 KHLSREEEEELRRLTEEDEERERR 142
>gnl|CDD|218336 pfam04935, SURF6, Surfeit locus protein 6. The surfeit locus
protein SURF-6 is shown to be a component of the
nucleolar matrix and has a strong binding capacity for
nucleic acids.
Length = 206
Score = 28.4 bits (64), Expect = 8.2
Identities = 25/129 (19%), Positives = 56/129 (43%), Gaps = 14/129 (10%)
Query: 25 RRDKDRDRRRRSRSHERRSERDRDRDLERRKEKSRGSKRRSRSREAERSK---------D 75
+R++ + R+++ R ++ E + + E K + SK+++ E D
Sbjct: 14 KREQRKARKKQKRKEAKKKEDAQKSEAEEVKNEENKSKKKAAPIENAEGNIVFSKVEFAD 73
Query: 76 HSKKEEKDKREKEEEEAAFDPSKLDKEVEATRLELEMQKRRDRIE-----RWRAERKKKD 130
+ ++ K +K++++ D +L K++EA + +LE E +W K +
Sbjct: 74 GEQAKKDLKLKKKKKKKKTDYKQLLKKLEARKKKLEELDEDKAAEIEEKEKWTKALAKAE 133
Query: 131 IETIKKDIK 139
+K D K
Sbjct: 134 GVKVKDDEK 142
>gnl|CDD|215214 PLN02381, PLN02381, valyl-tRNA synthetase.
Length = 1066
Score = 29.1 bits (65), Expect = 8.6
Identities = 14/71 (19%), Positives = 36/71 (50%), Gaps = 12/71 (16%)
Query: 21 PKESRRDKDRDRRRRSRSHERRSERDRDRDLERRKEKSRGSKRRSRSREAERSKDHSKKE 80
+E R K ++ E+ ++++L++ K + +K + ++++A + KK
Sbjct: 15 EEELERKKKKE------------EKAKEKELKKLKAAQKEAKAKLQAQQASDGTNVPKKS 62
Query: 81 EKDKREKEEEE 91
EK R+++ E+
Sbjct: 63 EKKSRKRDVED 73
>gnl|CDD|218545 pfam05300, DUF737, Protein of unknown function (DUF737). This
family consists of several uncharacterized mammalian
proteins of unknown function.
Length = 187
Score = 28.2 bits (63), Expect = 8.6
Identities = 24/122 (19%), Positives = 52/122 (42%), Gaps = 8/122 (6%)
Query: 15 SPSHKRPKESRRDKDRDRRRRSRSHERRSERDRDRDLERRKEKSRGSKRRSRSREAERSK 74
E + + +R + + DL +R E+ + + +R A+R +
Sbjct: 48 ELRRLIAGELKGALEDAKRPSEETAGGLQSSEVKEDLLKRYEQEQAIVQEELARIAKRER 107
Query: 75 DHSKKE-------EKDKREKEEEEAAFDPSKLD-KEVEATRLELEMQKRRDRIERWRAER 126
+ ++++ EK E+E ++A +L+ KE E RL+ +++ R+E +E
Sbjct: 108 EAAEEQLSRAVLREKASAEQERQKAKHLARQLEEKEAELKRLDAFYKEQLARLEEKNSEF 167
Query: 127 KK 128
K
Sbjct: 168 YK 169
>gnl|CDD|148630 pfam07133, Merozoite_SPAM, Merozoite surface protein (SPAM). This
family consists of several Plasmodium falciparum SPAM
(secreted polymorphic antigen associated with
merozoites) proteins. Variation among SPAM alleles is
the result of deletions and amino acid substitutions in
non-repetitive sequences within and flanking the alanine
heptad-repeat domain. Heptad repeats in which the a and
d position contain hydrophobic residues generate
amphipathic alpha-helices which give rise to helical
bundles or coiled-coil structures in proteins. SPAM is
an example of a P. falciparum antigen in which a
repetitive sequence has features characteristic of a
well-defined structural element.
Length = 164
Score = 28.3 bits (63), Expect = 9.0
Identities = 27/128 (21%), Positives = 52/128 (40%), Gaps = 21/128 (16%)
Query: 67 SREAERSKDHSKKEEKDKREKEEEEAAFDPSKLDKEVEATRLELEMQKRRDRIERWRAER 126
+E E KD +++++++ E++EEE D+E E E + D ++ E+
Sbjct: 37 IKENEDVKDEKQEDDEEEEEEDEEEIEEPEDIEDEEEIVEDEEEEEEDEEDNVDLKDIEK 96
Query: 127 KKK-DIETIKKDIKS-NLSSGLGGSAPMKKWNLEDDSDEDENDNKDENGKTAEEDIDPLD 184
K DI +D + NL S +++ KTAE+ + L
Sbjct: 97 KNINDIFNSTQDDNAQNLIS-------------------KNYKKNEKSKKTAEDIVKTLF 137
Query: 185 AFMQGVHE 192
+ G ++
Sbjct: 138 GLLNGNNQ 145
>gnl|CDD|226263 COG3740, COG3740, Phage head maturation protease [General function
prediction only].
Length = 194
Score = 28.2 bits (63), Expect = 9.2
Identities = 15/60 (25%), Positives = 21/60 (35%), Gaps = 6/60 (10%)
Query: 72 RSKDHSKKEEKDKREKEEEEAAFD------PSKLDKEVEATRLELEMQKRRDRIERWRAE 125
R D + E + P+ D VEA RLE + RR R E+ +
Sbjct: 129 RDGDEWGDDGSPIVRIRLEATLLEVSVVTFPAYPDARVEAVRLEELFEVRRTRAEKRKLL 188
>gnl|CDD|226809 COG4372, COG4372, Uncharacterized protein conserved in bacteria
with the myosin-like domain [Function unknown].
Length = 499
Score = 28.8 bits (64), Expect = 9.3
Identities = 15/100 (15%), Positives = 35/100 (35%), Gaps = 6/100 (6%)
Query: 31 DRRRRSRSHERRSERDRDRDLERRKEKSRGSKRRSRSREAERSKDHSKKEEKDKR--EKE 88
+ ++L R ++ R A + +++D + +K
Sbjct: 185 KSQVLDLKLRSAQIEQEAQNLATRANAAQARTEELARRAAAAQQTAQAIQQRDAQISQKA 244
Query: 89 EEEAAFDPSKLDKEVEATRLELEMQKRRDRIERWRAERKK 128
++ AA ++E + RLE R+E+ A+ +
Sbjct: 245 QQIAARAEQIRERERQLQRLETAQ----ARLEQEVAQLEA 280
>gnl|CDD|183984 PRK13340, PRK13340, alanine racemase; Reviewed.
Length = 406
Score = 28.8 bits (65), Expect = 9.4
Identities = 14/64 (21%), Positives = 30/64 (46%), Gaps = 9/64 (14%)
Query: 482 LDEADRMFDMGFEPQVMRI-------IDNVRPDRQTVMF-SATFPRQMEALARRILNKPI 533
+EA R+ ++GF Q++R+ I+ + + + A+A++ KPI
Sbjct: 100 NEEARRVRELGFTGQLLRVRSASPAEIEQALRYDLEELIGDDEQAKLLAAIAKKN-GKPI 158
Query: 534 EIQV 537
+I +
Sbjct: 159 DIHL 162
>gnl|CDD|221313 pfam11917, DUF3435, Protein of unknown function (DUF3435). This
family of proteins are functionally uncharacterized.
This protein is found in eukaryotes. Proteins in this
family are typically between 435 to 791 amino acids in
length. This family is related to pfam00589 suggesting
it may be an integrase enzyme.
Length = 418
Score = 28.9 bits (65), Expect = 9.5
Identities = 14/83 (16%), Positives = 32/83 (38%)
Query: 9 SRSRSPSPSHKRPKESRRDKDRDRRRRSRSHERRSERDRDRDLERRKEKSRGSKRRSRSR 68
SR+R P E + + D + +R + L + K++G+ R
Sbjct: 265 SRTRDPRRPRDLTDEQKASVEEDPELQELIRKRDHLKKEIIALYGQVAKAKGTPLYERLE 324
Query: 69 EAERSKDHSKKEEKDKREKEEEE 91
+ R + ++ + + +K+ E
Sbjct: 325 KRRREVRNERQRLRRELKKKIRE 347
>gnl|CDD|191187 pfam05087, Rota_VP2, Rotavirus VP2 protein. Rotavirus particles
consist of three concentric proteinaceous capsid layers.
The innermost capsid (core) is made of VP2. The genomic
RNA and the two minor proteins VP1 and VP3 are
encapsidated within this layer. The N-terminus of
rotavirus VP2 is necessary for the encapsidation of VP1
and VP3.
Length = 887
Score = 29.1 bits (65), Expect = 9.7
Identities = 19/101 (18%), Positives = 40/101 (39%), Gaps = 14/101 (13%)
Query: 54 RKEKSRGSKRRSRSREAERSK--DHSKKEEKDKREKEEEEAAFDPSKLDKEVEATRLEL- 110
R + R +E + K ++ E K+K ++EE D + ++++ L
Sbjct: 4 RNRREANINNNDRMQEKDDEKQDQKNRMELKEKVLDKKEEVVTDNVDSPVKEQSSQENLK 63
Query: 111 ----------EMQKRRDRIERWRAERKKK-DIETIKKDIKS 140
E K+ + + + E +K+ E ++K I S
Sbjct: 64 IADEVKKSTKEESKQLLEVLKTKEEHQKEIQYEILQKTIPS 104
>gnl|CDD|236944 PRK11642, PRK11642, exoribonuclease R; Provisional.
Length = 813
Score = 28.9 bits (65), Expect = 9.7
Identities = 16/76 (21%), Positives = 34/76 (44%)
Query: 54 RKEKSRGSKRRSRSREAERSKDHSKKEEKDKREKEEEEAAFDPSKLDKEVEATRLELEMQ 113
R ++ G R ++++ + K K+ + K+ E ++AF K K A + + +
Sbjct: 729 RAPRNVGKTAREKAKKGDAGKKGGKRRQVGKKVNFEPDSAFRGEKKAKPKAAKKDARKAK 788
Query: 114 KRRDRIERWRAERKKK 129
K + ++ A K K
Sbjct: 789 KPSAKTQKIAAATKAK 804
>gnl|CDD|227596 COG5271, MDN1, AAA ATPase containing von Willebrand factor type A
(vWA) domain [General function prediction only].
Length = 4600
Score = 29.2 bits (65), Expect = 9.7
Identities = 14/80 (17%), Positives = 30/80 (37%)
Query: 23 ESRRDKDRDRRRRSRSHERRSERDRDRDLERRKEKSRGSKRRSRSREAERSKDHSKKEEK 82
E +D+D + + + + D + K G + + E EE
Sbjct: 4029 EPMQDEDPLEENNTLDEDIQQDDFSDLAEDDEKMNEDGFEENVQENEESTEDGVKSDEEL 4088
Query: 83 DKREKEEEEAAFDPSKLDKE 102
++ E E++A + K+D +
Sbjct: 4089 EQGEVPEDQAIDNHPKMDAK 4108
>gnl|CDD|222447 pfam13904, DUF4207, Domain of unknown function (DUF4207). This
family is found in eukaryotes; it has several conserved
tryptophan residues. The function is not known.
Length = 261
Score = 28.5 bits (64), Expect = 9.8
Identities = 16/57 (28%), Positives = 30/57 (52%), Gaps = 1/57 (1%)
Query: 4 SRRKRSRSRSPSPSHKRPKESRRDKDRDRRRRSRSHERRSERDRDRDLERRKEKSRG 60
K R+ S + KR +E K + ++++ R ERR +R + ++ E RK+K+
Sbjct: 167 GSAKPERNVSQEEAKKRLQEWELKKLKQQQQK-REEERRKQRKKQQEEEERKQKAEE 222
Database: CDD.v3.10
Posted date: Mar 20, 2013 7:55 AM
Number of letters in database: 10,937,602
Number of sequences in database: 44,354
Lambda K H
0.314 0.132 0.369
Gapped
Lambda K H
0.267 0.0677 0.140
Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 44354
Number of Hits to DB: 32,083,254
Number of extensions: 3302346
Number of successful extensions: 8123
Number of sequences better than 10.0: 1
Number of HSP's gapped: 6315
Number of HSP's successfully gapped: 776
Length of query: 615
Length of database: 10,937,602
Length adjustment: 103
Effective length of query: 512
Effective length of database: 6,369,140
Effective search space: 3260999680
Effective search space used: 3260999680
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.2 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 42 (21.9 bits)
S2: 62 (27.8 bits)
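
The footer values above are related by simple arithmetic, and the sketch below reproduces them. It assumes the usual BLAST conventions: the length adjustment is subtracted once from the query and once per database sequence, and raw scores are converted to bits with the Karlin-Altschul formula using the gapped Lambda and K. The function names are illustrative and are not part of any BLAST tool. The bit-score conversion is checked only against the footer's S2 line; individual hit lines in this report show slightly different bit values for the same raw score (for example 27.6 and 27.9 bits for raw score 62), which this sketch does not attempt to model.

```python
import math

# Values copied from the statistics footer of this report.
QUERY_LEN = 615
DB_LETTERS = 10_937_602
DB_SEQS = 44_354
LENGTH_ADJUSTMENT = 103
GAPPED_LAMBDA = 0.267
GAPPED_K = 0.0677


def effective_search_space(query_len, db_letters, db_seqs, adjustment):
    """Return (effective query length, effective db length, search space).

    Assumption: the adjustment is removed once from the query and once
    from every database sequence before the two lengths are multiplied.
    """
    eff_query = query_len - adjustment
    eff_db = db_letters - db_seqs * adjustment
    return eff_query, eff_db, eff_query * eff_db


def bit_score(raw_score, lam, k):
    """Convert a raw score to bits: (lambda * S - ln K) / ln 2."""
    return (lam * raw_score - math.log(k)) / math.log(2)


eff_query, eff_db, space = effective_search_space(
    QUERY_LEN, DB_LETTERS, DB_SEQS, LENGTH_ADJUSTMENT
)
s2_bits = round(bit_score(62, GAPPED_LAMBDA, GAPPED_K), 1)

print(eff_query)  # 512, matches "Effective length of query"
print(eff_db)     # 6369140, matches "Effective length of database"
print(space)      # 3260999680, matches "Effective search space"
print(s2_bits)    # 27.8, matches "S2: 62 (27.8 bits)"
```

All four printed values agree with the corresponding footer lines, which is a quick consistency check on a parsed report.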