RPS-BLAST 2.2.26 [Sep-21-2011]

Database: CDD.v3.10 
           44,354 sequences; 10,937,602 total letters

Searching..................................................done

Query= psy8081
         (463 letters)



>gnl|CDD|218393 pfam05033, Pre-SET, Pre-SET motif.  This protein motif is a zinc
           binding motif. It contains 9 conserved cysteines that
           coordinate three zinc ions. It is thought that this
           region plays a structural role in stabilising SET
           domains.
          Length = 103

 Score = 99.4 bits (248), Expect = 2e-25
 Identities = 47/117 (40%), Positives = 58/117 (49%), Gaps = 16/117 (13%)

Query: 80  DMSNGRENVPISCVNYIDTDVPKTV-DYMTERKPKEGVTINTNKEFLVCCDCTDDCRDRN 138
           D+SNG+E+VPI  VN +D + P     Y+ E  P  GV+ +   EFLV C C D C D +
Sbjct: 1   DISNGKESVPIPVVNEVDLEGPPPNFTYINEYIPGSGVS-DIPNEFLVGCSCKDGCPDSS 59

Query: 139 NCACWQLTIKGSRDLWNVSEPKDFVGYQ-NRRLPEHVVSGIFECNDLCKCKHTCHNR 194
           NCAC QL   G               Y  N RL       I+ECN  CKC  +C NR
Sbjct: 60  NCACLQLNGGG-------------FAYDKNGRLRVEPGPPIYECNSRCKCDPSCPNR 103


>gnl|CDD|128744 smart00468, PreSET, N-terminal to some SET domains.  A Cys-rich
           putative Zn2+-binding domain that occurs N-terminal to
           some SET domains. Function is unknown. Unpublished.
          Length = 98

 Score = 77.8 bits (192), Expect = 1e-17
 Identities = 35/110 (31%), Positives = 49/110 (44%), Gaps = 13/110 (11%)

Query: 78  IKDMSNGRENVPISCVNYIDTDV-PKTVDYMTERKPKEGVTINTNKEFLVCCDCTDDCRD 136
             D+SNG+ENVP+  VN +D D  P   +Y++E    +GV I+ +   LV C C+ DC  
Sbjct: 1   CLDISNGKENVPVPLVNEVDEDPPPPDFEYISEYIYGQGVPIDRSPSPLVGCSCSGDCSS 60

Query: 137 RNNCACWQLTIKGSRDLWNVSEPKDFVGYQNRRLPEHVVSGIFECNDLCK 186
            N C C +                +F    N  L       I+ECN  C 
Sbjct: 61  SNKCECARKN------------GGEFAYELNGGLRLKRKPLIYECNSRCS 98


>gnl|CDD|214614 smart00317, SET, SET (Su(var)3-9, Enhancer-of-zeste, Trithorax)
           domain.  Putative methyl transferase, based on outlier
           plant homologues.
          Length = 124

 Score = 78.1 bits (193), Expect = 2e-17
 Identities = 28/83 (33%), Positives = 42/83 (50%), Gaps = 4/83 (4%)

Query: 361 KKTKRLRSLREYFGEDENVYIMDARTSGNIGRYLNHSCTPNVFVQNVFVDTHDPRFPWVS 420
           K      +   Y  + ++   +DAR  GN+ R++NHSC PN  +  V V+  D     + 
Sbjct: 45  KAYDTDGAKAFYLFDIDSDLCIDARRKGNLARFINHSCEPNCELLFVEVNGDD----RIV 100

Query: 421 FFALKFIEAGSELTWDYAYDIGS 443
            FAL+ I+ G ELT DY  D  +
Sbjct: 101 IFALRDIKPGEELTIDYGSDYAN 123



 Score = 55.4 bits (134), Expect = 1e-09
 Identities = 22/49 (44%), Positives = 31/49 (63%)

Query: 203 KLQLFKTEMKGWGLRCLNDIPQGTFICIYAGHLLTDSDANEEGKNYGDE 251
           KL++FK+  KGWG+R   DIP+G FI  Y G ++T  +A E  K Y  +
Sbjct: 2   KLEVFKSPGKGWGVRATEDIPKGEFIGEYVGEIITSEEAEERPKAYDTD 50


>gnl|CDD|216155 pfam00856, SET, SET domain.  SET domains are protein lysine
           methyltransferase enzymes. SET domains appear to be
           protein-protein interaction domains. It has been
           demonstrated that SET domains mediate interactions with
           a family of proteins that display similarity with
           dual-specificity phosphatases (dsPTPases). A subset of
           SET domains have been called PR domains. These domains
           are divergent in sequence from other SET domains, but
           also appear to mediate protein-protein interaction. The
           SET domain consists of two regions known as SET-N and
           SET-C. SET-C forms an unusual and conserved knot-like
           structure of probably functional importance.
           Additionally to SET-N and SET-C, an insert region
           (SET-I) and flanking regions of high structural
           variability form part of the overall structure.
          Length = 113

 Score = 68.7 bits (168), Expect = 3e-14
 Identities = 26/69 (37%), Positives = 38/69 (55%), Gaps = 4/69 (5%)

Query: 369 LREYFGEDENVYIMDARTSGNIGRYLNHSCTPNVFVQNVFVDTHDPRFPWVSFFALKFIE 428
           L  +    ++ Y +DA   GN+ R++NHSC PN  V+ VFV+        +   AL+ I+
Sbjct: 48  LELFLSRLDSEYDIDATGLGNVARFINHSCEPNCEVRFVFVNGG----DRIVVRALRDIK 103

Query: 429 AGSELTWDY 437
            G ELT DY
Sbjct: 104 PGEELTIDY 112



 Score = 45.6 bits (108), Expect = 3e-06
 Identities = 20/61 (32%), Positives = 26/61 (42%)

Query: 213 GWGLRCLNDIPQGTFICIYAGHLLTDSDANEEGKNYGDEYLAELDFIETVERYKEAYESD 272
           G GL    DIP+G  I  Y G L+T  +A E    Y  E L  L     +   +   E D
Sbjct: 1   GRGLFATRDIPKGELIIEYVGELITPEEAEERELLYNKEELRGLLSDLELFLSRLDSEYD 60

Query: 273 V 273
           +
Sbjct: 61  I 61


>gnl|CDD|238689 cd01395, HMT_MBD, Methyl-CpG binding domains (MBD) present in
          putative histone methyltransferases (HMT) such as CLLD8
          and SETDB1 proteins; CLLD8 contains a MBD, a PreSET and
          a bifurcated SET domain, suggesting that CLLD8 might be
          associated with methylation-mediated transcriptional
          repression. SETDB1 and other proteins in this group
          have a similar domain architecture. SETDB1 is a novel
          KAP-1-associated histone H3, lysine 9-specific
          methyltransferase that contributes to HP1-mediated
          silencing of euchromatic genes by KRAB zinc-finger
          proteins.
          Length = 60

 Score = 39.7 bits (93), Expect = 1e-04
 Identities = 16/36 (44%), Positives = 23/36 (63%)

Query: 7  KKCIMYTAPCGRTLRTSDQLVLYLFITKAKWTIDMF 42
          KK ++Y APCGR+LR   ++  YL  T +  T+D F
Sbjct: 23 KKHVIYKAPCGRSLRNMSEVHRYLRETCSFLTVDNF 58


>gnl|CDD|225491 COG2940, COG2940, Proteins containing SET domain [General function
           prediction only].
          Length = 480

 Score = 42.9 bits (101), Expect = 3e-04
 Identities = 32/166 (19%), Positives = 69/166 (41%), Gaps = 16/166 (9%)

Query: 282 DEAENENSDEESPNSNSNEDNSQDKAILNSDDETENSSNADSDHIRSRLRKRKR----KQ 337
           +       D  + +  S  + S+ K +LNS+            ++      +       +
Sbjct: 295 NRISKSEEDSTTSSDFSKSNVSKLKELLNSN----GCKKRREPNVVQESEIKGYGVFALE 350

Query: 338 KADKKEGKRKTSSLLMTLQANQKKKTKRLRSLREYF---GEDENVYIMDARTSGNIGRYL 394
              K E   +    ++  +   +++ +    L   F     ++   + D++ +G++ R++
Sbjct: 351 SIKKGEFIIEYHGEIIR-RKEAREREENYDLLGNEFSFGLLEDKDKVRDSQKAGDVARFI 409

Query: 395 NHSCTPNVFVQNVFVDTHDPRFPWVSFFALKFIEAGSELTWDYAYD 440
           NHSCTPN     + V+        +S +A++ I+AG ELT+DY   
Sbjct: 410 NHSCTPNCEASPIEVNGIFK----ISIYAIRDIKAGEELTYDYGPS 451


>gnl|CDD|217203 pfam02724, CDC45, CDC45-like protein.  CDC45 is an essential gene
           required for initiation of DNA replication in S.
           cerevisiae, forming a complex with MCM5/CDC46.
           Homologues of CDC45 have been identified in human, mouse
           and smut fungus among others.
          Length = 583

 Score = 39.2 bits (92), Expect = 0.004
 Identities = 20/87 (22%), Positives = 38/87 (43%), Gaps = 19/87 (21%)

Query: 262 VERYKEAYESDVPEEDMVEDDEAENENSDEESPNSNSNEDNSQDKAILNSDDETENSSNA 321
             RY +AY     ++D  E+ + E+E S +                   S+D+ ++  + 
Sbjct: 113 EPRYDDAYRDLEEDDDDDEESDEEDEESSK-------------------SEDDEDDDDDD 153

Query: 322 DSDHIRSRLRKRKRKQKADKKEGKRKT 348
           D D I +R R  +R+++  + E KR  
Sbjct: 154 DDDDIATRERSLERRRRRREWEEKRAE 180


>gnl|CDD|218737 pfam05764, YL1, YL1 nuclear protein.  The proteins in this family
           are designated YL1. These proteins have been shown to be
           DNA-binding and may be a transcription factor.
          Length = 238

 Score = 37.0 bits (86), Expect = 0.011
 Identities = 24/118 (20%), Positives = 49/118 (41%), Gaps = 12/118 (10%)

Query: 264 RYKEAYESDVPEEDMVEDDEAENENSDEESPNSNSNEDNSQDKAILNSDDETENSSNADS 323
           R K+  E ++ E++         E  D+E       E+  +  +  +  ++ E  S+ + 
Sbjct: 13  RMKKLLEEELEEDEFFWTYLLFEEEEDDEEFEIEEEEEEEEVDSDFDDSEDDEPESDDEE 72

Query: 324 D-----HIRSRLRKRKRKQKADKKEG----KRKTSSLLMTLQAN---QKKKTKRLRSL 369
           +         RL+K+KR +    KE     K+K  +   + +A     KKK++R+   
Sbjct: 73  EGEKELQREERLKKKKRVKTKAYKEPTKKKKKKDPTAAKSPKAAAPRPKKKSERISWA 130


>gnl|CDD|218538 pfam05285, SDA1, SDA1.  This family consists of several SDA1
           protein homologues. SDA1 is a Saccharomyces cerevisiae
           protein which is involved in the control of the actin
           cytoskeleton. The protein is essential for cell
           viability and is localised in the nucleus.
          Length = 317

 Score = 37.3 bits (87), Expect = 0.012
 Identities = 18/112 (16%), Positives = 48/112 (42%), Gaps = 2/112 (1%)

Query: 259 IETVERYKEAYESDVPEEDMVEDDEAENENSDEESPNSNSNEDNSQDKAILNSDDETENS 318
           +E +E++KE        E  +E D+ ++E  + E       + + + + I    D+   S
Sbjct: 72  LELLEKWKEEERKKKEAEQGLESDDDDDEEEEWEV--EEDEDSDDEGEWIDVESDKEIES 129

Query: 319 SNADSDHIRSRLRKRKRKQKADKKEGKRKTSSLLMTLQANQKKKTKRLRSLR 370
           S+++ +  +    K+ ++   ++   + +  +        +K+K   L + R
Sbjct: 130 SDSEDEEEKDEAAKKAKEDSDEELSEEDEEEAAEEEEAEAEKEKASELATTR 181


>gnl|CDD|216497 pfam01429, MBD, Methyl-CpG binding domain.  The Methyl-CpG
          binding domain (MBD) binds to DNA that contains one or
          more symmetrically methylated CpGs. DNA methylation in
          animals is associated with alterations in chromatin
          structure and silencing of gene expression. MBD has
          negligible non-specific affinity for DNA. In vitro
          foot-printing with MeCP2 showed the MBD can protect a
          12 nucleotide region surrounding a methyl CpG pair.
          MBDs are found in several Methyl-CpG binding proteins
          and also DNA demethylase.
          Length = 75

 Score = 34.3 bits (79), Expect = 0.015
 Identities = 9/41 (21%), Positives = 19/41 (46%), Gaps = 1/41 (2%)

Query: 6  NKKCIMYTAPCGRTLRTSDQLVLYLFITKAKW-TIDMFEYD 45
           K  + Y +P G+  R+  +L+ YL         ++ F++ 
Sbjct: 28 GKVDVYYYSPTGKKFRSKSELIRYLEKNGDTSLKLEDFDFT 68


>gnl|CDD|237629 PRK14160, PRK14160, heat shock protein GrpE; Provisional.
          Length = 211

 Score = 36.3 bits (84), Expect = 0.017
 Identities = 24/105 (22%), Positives = 43/105 (40%), Gaps = 12/105 (11%)

Query: 267 EAYESDVPEEDMVEDDEAENENSDEESPNSNSNEDNSQDKAILNSDDETENSSNADSDHI 326
           E    D   E+M ED   ENEN  EE      + +  + +     +D  E++     +  
Sbjct: 2   EKECKDAKHENMEEDCCKENEN-KEEDKGKEEDLEFEEIEKEEIIEDSEESNEVKIEE-- 58

Query: 327 RSRLRKRKRKQKADKKEGKRKTSSL---LMTLQA---NQKKKTKR 365
              L+    K K + K+ + +  +L   L+   A   N +K+T +
Sbjct: 59  ---LKDENNKLKEENKKLENELEALKDRLLRTVAEYDNYRKRTAK 100



 Score = 31.6 bits (72), Expect = 0.60
 Identities = 22/89 (24%), Positives = 36/89 (40%), Gaps = 11/89 (12%)

Query: 260 ETVERYKEAYESDVPEEDMVEDDEAENENSDEESPNSNSNEDNSQDKAILNSDDETENSS 319
           E   +  E  E D  +E+ +E +E E E   E+S  SN  +            +E +   
Sbjct: 15  EDCCKENENKEEDKGKEEDLEFEEIEKEEIIEDSEESNEVKIEELKDENNKLKEENKKLE 74

Query: 320 NADSDHIRSRL----------RKRKRKQK 338
           N + + ++ RL          RKR  K+K
Sbjct: 75  N-ELEALKDRLLRTVAEYDNYRKRTAKEK 102


>gnl|CDD|217840 pfam04006, Mpp10, Mpp10 protein.  This family includes proteins
           related to Mpp10 (M phase phosphoprotein 10). The U3
           small nucleolar ribonucleoprotein (snoRNP) is required
           for three cleavage events that generate the mature 18S
           rRNA from the pre-rRNA. In Saccharomyces cerevisiae,
           depletion of Mpp10, a U3 snoRNP-specific protein, halts
           18S rRNA production and impairs cleavage at the three U3
           snoRNP-dependent sites.
          Length = 613

 Score = 37.3 bits (86), Expect = 0.017
 Identities = 24/98 (24%), Positives = 45/98 (45%), Gaps = 3/98 (3%)

Query: 240 DANEEGKNYGDEYLAELDFIE-TVERYKEAYESDVPEEDMVEDDEAENENSDEESPNSNS 298
           D  E+ K       AEL+  E   E  K+  +S   EED  EDDE E++  +EE P +  
Sbjct: 259 DPKEKDKKKDAGDDAELEDDEPDKEAVKKEADSKPEEED-EEDDEQEDDQDEEEPPEAAM 317

Query: 299 NEDNSQDKAILNSDDETENSSNADSDHIRSRLRKRKRK 336
           ++    +  +   D E      +  +  +++L+++  +
Sbjct: 318 DKVKLDEPVLEGVDLE-SPKELSSFEKRQAKLKQQIEQ 354



 Score = 30.0 bits (67), Expect = 3.5
 Identities = 17/82 (20%), Positives = 39/82 (47%)

Query: 266 KEAYESDVPEEDMVEDDEAENENSDEESPNSNSNEDNSQDKAILNSDDETENSSNADSDH 325
            E++E      DM  +D A++E  +EE  +      + +D+A L ++ E+     +D + 
Sbjct: 97  SESHEDGSDGSDMDSEDSADDEEEEEEDESLEDEMIDDEDEADLFNESESSLEDLSDDET 156

Query: 326 IRSRLRKRKRKQKADKKEGKRK 347
                +K + ++  ++KE   +
Sbjct: 157 EDDEEKKMEEEEAGEEKESVEQ 178


>gnl|CDD|227596 COG5271, MDN1, AAA ATPase containing von Willebrand factor type A
            (vWA) domain [General function prediction only].
          Length = 4600

 Score = 36.5 bits (84), Expect = 0.040
 Identities = 26/89 (29%), Positives = 38/89 (42%), Gaps = 6/89 (6%)

Query: 240  DANEEGKNYGDEYLAELDFIETVERYKEAYESDVPEEDMVEDDEAENENSDEESPNSNSN 299
             ANEE      + L E +  E         +  V  E+   D E EN++ DEE  +   +
Sbjct: 3840 LANEEDTANQSD-LDESEARELESDMNGVTKDSVVSENENSDSEEENQDLDEEVNDIPED 3898

Query: 300  EDNSQDKAIL---NSDD--ETENSSNADS 323
              NS ++ +    N +D  ETE  SN  S
Sbjct: 3899 LSNSLNEKLWDEPNEEDLLETEQKSNEQS 3927



 Score = 34.2 bits (78), Expect = 0.16
 Identities = 22/91 (24%), Positives = 34/91 (37%), Gaps = 8/91 (8%)

Query: 243  EEGKNYGDEYLAELDFIETVERYKEAYESDVPEEDMVEDDEAENEN-------SDEESPN 295
            EE     DE + + D +E      E  + D    D+ EDDE  NE+        +EES  
Sbjct: 4021 EEADAEKDEPMQDEDPLEENNTLDEDIQQD-DFSDLAEDDEKMNEDGFEENVQENEESTE 4079

Query: 296  SNSNEDNSQDKAILNSDDETENSSNADSDHI 326
                 D   ++  +  D   +N    D+   
Sbjct: 4080 DGVKSDEELEQGEVPEDQAIDNHPKMDAKST 4110



 Score = 31.9 bits (72), Expect = 0.93
 Identities = 21/87 (24%), Positives = 34/87 (39%), Gaps = 9/87 (10%)

Query: 240  DANEEGKNYGDEYL----AELDFIETVERYKE-AYESDVPEEDMVEDDEAENENSDEESP 294
            D  E+  N  +E L     E D +ET ++  E +  ++  +    EDD    E+ D +  
Sbjct: 3894 DIPEDLSNSLNEKLWDEPNEEDLLETEQKSNEQSAANNESDLVSKEDDNKALEDKDRQEK 3953

Query: 295  NSNSNEDNS--QDKAILNSDDETENSS 319
                   +    D  I    D  EN+S
Sbjct: 3954 EDEEEMSDDVGIDDEI--QPDIQENNS 3978



 Score = 31.9 bits (72), Expect = 0.96
 Identities = 24/88 (27%), Positives = 38/88 (43%), Gaps = 9/88 (10%)

Query: 239  SDANEEGK--NYGDEYLAELDFIETVERYKEAYESDVPEEDMVEDDEAENENSDEESPNS 296
            S AN E    +  D+  A  D     +  +E    DV  +D ++ D  EN   + + P  
Sbjct: 3927 SAANNESDLVSKEDDNKALEDKDRQEKEDEEEMSDDVGIDDEIQPDIQEN---NSQPPPE 3983

Query: 297  NSNEDNSQDKAILNSDDETENSSNADSD 324
            N + D  +D   L  D++  + S  DSD
Sbjct: 3984 NEDLDLPED---LKLDEKEGDVSK-DSD 4007


>gnl|CDD|240433 PTZ00482, PTZ00482, membrane-attack complex/perforin (MACPF)
           Superfamily; Provisional.
          Length = 844

 Score = 34.8 bits (80), Expect = 0.093
 Identities = 18/90 (20%), Positives = 36/90 (40%), Gaps = 2/90 (2%)

Query: 276 EDMVEDDEAENENSDEESP-NSNSNEDNSQDKAILNSDDETE-NSSNADSDHIRSRLRKR 333
           +D  +D+       DE+   N+ S E ++ D ++L   D  E   + A++D      +  
Sbjct: 87  DDDDDDEFDFLYEDDEDDAGNATSGESSTDDDSLLELPDRDEDADTQANNDQTNDFDQDD 146

Query: 334 KRKQKADKKEGKRKTSSLLMTLQANQKKKT 363
               + D+   +    S    L   +K +T
Sbjct: 147 SSNSQTDQGLKQSVNLSSAEKLIEEKKGQT 176


>gnl|CDD|220102 pfam09073, BUD22, BUD22.  BUD22 has been shown in yeast to be a
           nuclear protein involved in bud-site selection. It plays
           a role in positioning the proximal bud pole signal. More
           recently it has been shown to be involved in ribosome
           biogenesis.
          Length = 424

 Score = 34.4 bits (79), Expect = 0.13
 Identities = 20/114 (17%), Positives = 48/114 (42%), Gaps = 11/114 (9%)

Query: 237 TDSDANEEGKNYGDEYLAELDFIETVERYKEAYESDVPEEDMVEDDEAENENSDEESPNS 296
             SD ++E ++  ++           E   E    D  +++  ED ++E+ +  +     
Sbjct: 165 ESSDKDDEEESESED-----------ESKSEESAEDDSDDEEEEDSDSEDYSQYDGMLVD 213

Query: 297 NSNEDNSQDKAILNSDDETENSSNADSDHIRSRLRKRKRKQKADKKEGKRKTSS 350
           +S+E+  ++   +N +++T  S + +SD   S  R     +++     K K   
Sbjct: 214 SSDEEEGEEAPSINYNEDTSESESDESDSEISESRSVSDSEESSPPSKKPKEKK 267



 Score = 31.0 bits (70), Expect = 1.4
 Identities = 24/108 (22%), Positives = 41/108 (37%), Gaps = 17/108 (15%)

Query: 260 ETVERYKEAYESDVPEEDMVEDDEAENENSDEESPNSNSNEDNSQDKAILNSDDETEN-- 317
           E  E   +  E +   ED  + +E+  ++SD+E    + +ED SQ   +L    + E   
Sbjct: 162 EAKESSDKDDEEESESEDESKSEESAEDDSDDEEEEDSDSEDYSQYDGMLVDSSDEEEGE 221

Query: 318 ---------------SSNADSDHIRSRLRKRKRKQKADKKEGKRKTSS 350
                          S  +DS+   SR      +     K+ K K +S
Sbjct: 222 EAPSINYNEDTSESESDESDSEISESRSVSDSEESSPPSKKPKEKKTS 269


>gnl|CDD|148051 pfam06213, CobT, Cobalamin biosynthesis protein CobT.  This family
           consists of several bacterial cobalamin biosynthesis
           (CobT) proteins. CobT is involved in the transformation
           of precorrin-3 into cobyrinic acid.
          Length = 282

 Score = 34.0 bits (78), Expect = 0.14
 Identities = 14/57 (24%), Positives = 29/57 (50%), Gaps = 5/57 (8%)

Query: 271 SDVPEE---DMVEDDEAENENSDEESPNSNSNEDNSQDKAILNSDDETENSSNADSD 324
            D+ EE   +    D  +NE  DE+ P  + ++D  +++   +SD  +E+S  +  +
Sbjct: 206 MDMAEELGDEPESADSEDNE--DEDDPKEDEDDDQGEEEESGSSDSLSEDSDASSEE 260



 Score = 34.0 bits (78), Expect = 0.15
 Identities = 20/73 (27%), Positives = 34/73 (46%), Gaps = 4/73 (5%)

Query: 256 LDFIETVERYKEAYESDVPEEDMVEDDEAENENSD----EESPNSNSNEDNSQDKAILNS 311
           L  ++  E   +  ES   E++  EDD  E+E+ D    EES +S+S  ++S   +    
Sbjct: 203 LSSMDMAEELGDEPESADSEDNEDEDDPKEDEDDDQGEEEESGSSDSLSEDSDASSEEME 262

Query: 312 DDETENSSNADSD 324
             E E +  +  D
Sbjct: 263 SGEMEAAEASADD 275


>gnl|CDD|227519 COG5192, BMS1, GTP-binding protein required for 40S ribosome
           biogenesis [Translation, ribosomal structure and
           biogenesis].
          Length = 1077

 Score = 34.3 bits (78), Expect = 0.16
 Identities = 31/154 (20%), Positives = 66/154 (42%), Gaps = 11/154 (7%)

Query: 270 ESDVPEEDMVEDDEAEN--ENSDEESPNSNSNEDNSQDKAILNSDDETENSSNADSDHIR 327
           +S   EE++++DDE  N  +  DEE+ + N  E+ S+  ++   ++E+ +  + +++   
Sbjct: 587 DSIEGEEELIQDDEKGNFEDLEDEENSSDNEMEE-SRGSSVTAENEESADEVDYETEREE 645

Query: 328 SRLRKRKRKQKADKKEGKRKTSSLLMTLQANQKKKTKRLRSLREYFGEDENVYIMDARTS 387
           +  +K + +   + +E        +      ++K  ++L+  R  F        M   + 
Sbjct: 646 NARKKEELRGNFELEERGDPEKKDVDWYTEEKRKIEEQLKINRSEFET------MVPESR 699

Query: 388 GNIGRYLNHSCTPNVF--VQNVFVDTHDPRFPWV 419
             I  Y        V   V   FVD  + R+P V
Sbjct: 700 VVIEGYRAGRYVRIVLSHVPLEFVDEFNSRYPIV 733


>gnl|CDD|223496 COG0419, SbcC, ATPase involved in DNA repair [DNA replication,
           recombination, and repair].
          Length = 908

 Score = 34.0 bits (78), Expect = 0.18
 Identities = 26/143 (18%), Positives = 47/143 (32%), Gaps = 18/143 (12%)

Query: 254 AELDFIETVERYKEAYESDVPEEDMVEDDEAENENSDEESPNSNSNEDNSQDKAILNSDD 313
                +E +E   E  E+++  E    ++E + E   EE       E   ++   L  + 
Sbjct: 651 LLQAALEELEEKVEELEAEIRRELQRIENEEQLEEKLEEL------EQLEEELEQLREEL 704

Query: 314 ETENSSNADSDHIRSRLRKRKRKQKADKKEGKRKTSSLLMTLQANQKKKTKRLRSLREYF 373
           E       + + +   L  RK + +  KKE +              +K  + L  LRE  
Sbjct: 705 EELLKKLGEIEQLIEELESRKAELEELKKELE------------KLEKALELLEELREKL 752

Query: 374 GEDENVYIMDARTSGNIGRYLNH 396
           G+      +       I    N 
Sbjct: 753 GKAGLRADILRNLLAQIEAEANE 775


>gnl|CDD|218312 pfam04889, Cwf_Cwc_15, Cwf15/Cwc15 cell cycle control protein.
           This family represents Cwf15/Cwc15 (from
           Schizosaccharomyces pombe and Saccharomyces cerevisiae
           respectively) and their homologues. The function of
           these proteins is unknown, but they form part of the
           spliceosome and are thus thought to be involved in mRNA
           splicing.
          Length = 241

 Score = 33.2 bits (76), Expect = 0.21
 Identities = 29/145 (20%), Positives = 62/145 (42%), Gaps = 24/145 (16%)

Query: 236 LTDSDANEEGKNYGDEYLAELDFIETVERYKEAYESDVPE-EDMVEDDEAENENSDEESP 294
           L +++   + K      + + D    ++   E  E D  E E   +  E +  NSD +  
Sbjct: 63  LEEAERAHKSKKENKLAIEDADKSTNLDASNEGDEDDDEEDEIKRKRIEEDARNSDADD- 121

Query: 295 NSNSNEDNSQDKAILNSDDETENSSNADSDHI-----------RSRLRKRKRKQKADKKE 343
            S+S+ D+       +SDD++++  + D               R+  ++R+ ++KA ++E
Sbjct: 122 -SDSSSDSD------SSDDDSDDDDSEDETAALLRELEKIKKERAEEKEREEEEKAAEEE 174

Query: 344 GKRK----TSSLLMTLQANQKKKTK 364
             R+    T + L+    + K K +
Sbjct: 175 KAREEEILTGNPLLNTSGDFKVKRR 199


>gnl|CDD|217783 pfam03896, TRAP_alpha, Translocon-associated protein (TRAP), alpha
           subunit.  The alpha-subunit of the TRAP complex (TRAP
           alpha) is a single-spanning membrane protein of the
           endoplasmic reticulum (ER) which is found in proximity
           of nascent polypeptide chains translocating across the
           membrane.
          Length = 281

 Score = 33.2 bits (76), Expect = 0.21
 Identities = 14/52 (26%), Positives = 28/52 (53%)

Query: 273 VPEEDMVEDDEAENENSDEESPNSNSNEDNSQDKAILNSDDETENSSNADSD 324
              +D+ ED+EAE++  DE+  +    E++  +      D+E E  ++ D+D
Sbjct: 29  ASAQDLTEDEEAEDDVVDEDEEDEAVVEEDENELTEEEEDEEGEVKASPDAD 80


>gnl|CDD|218752 pfam05793, TFIIF_alpha, Transcription initiation factor IIF, alpha
           subunit (TFIIF-alpha).  Transcription initiation factor
           IIF, alpha subunit (TFIIF-alpha) or RNA polymerase
           II-associating protein 74 (RAP74) is the large subunit
           of transcription factor IIF (TFIIF), which is essential
           for accurate initiation and stimulates elongation by RNA
           polymerase II.
          Length = 528

 Score = 33.4 bits (76), Expect = 0.24
 Identities = 29/125 (23%), Positives = 50/125 (40%), Gaps = 18/125 (14%)

Query: 238 DSDANEEGKNYGDEYLAELDFI--------ETVERYKEAYESDVPEEDMV---EDDEAEN 286
           D DA+E   + GD+   E D+I        +  ER  +       + ++    + +E+E 
Sbjct: 266 DDDADEYDSDDGDDEGREEDYISDSSASGNDPEEREDKLSPEIPAKPEIEQDEDSEESEE 325

Query: 287 ENSDEESPNSNS------NEDNSQDKAILNSDDETENSSNADSDHIRSRLRKRKRKQKAD 340
           E ++EE   S         +         +SD   ++S ++D D   S      +KQK  
Sbjct: 326 EKNEEEGGLSKKGKKLKKLKGKKNGLDKDDSDSG-DDSDDSDIDGEDSVSLVTAKKQKEP 384

Query: 341 KKEGK 345
           KKE  
Sbjct: 385 KKEEP 389



 Score = 29.5 bits (66), Expect = 4.7
 Identities = 15/77 (19%), Positives = 28/77 (36%), Gaps = 6/77 (7%)

Query: 275 EEDMVEDDEA-ENENSDEESPNSNSNEDNSQDKAILNSDDETENSSNADSDHIRSRLRKR 333
           EED + D  A  N+  + E   S       + +   +S++  E  +  +       L K+
Sbjct: 283 EEDYISDSSASGNDPEEREDKLSPEIPAKPEIEQDEDSEESEEEKNEEEGG-----LSKK 337

Query: 334 KRKQKADKKEGKRKTSS 350
            +K K  K +       
Sbjct: 338 GKKLKKLKGKKNGLDKD 354


>gnl|CDD|217502 pfam03343, SART-1, SART-1 family.  SART-1 is a protein involved in
           cell cycle arrest and pre-mRNA splicing. It has been
           shown to be a component of U4/U6 x U5 tri-snRNP complex
           in human, Schizosaccharomyces pombe and Saccharomyces
           cerevisiae. SART-1 is a known tumour antigen in a range
           of cancers recognised by T cells.
          Length = 603

 Score = 33.2 bits (76), Expect = 0.34
 Identities = 26/140 (18%), Positives = 47/140 (33%), Gaps = 10/140 (7%)

Query: 266 KEAYESDVPEEDMVEDDEAENENSDEESPNSNSNEDNSQDKAILNSDDETENSSNADSDH 325
           KE  E     +D   ++ ++ E  DE+  + + + D        + + E E+     S  
Sbjct: 405 KEPLEEKPENKDESVEEISDAEEDDEDEEDEDGDGDVEMSAVDNDEEKEEEDKEAIPSTI 464

Query: 326 ----------IRSRLRKRKRKQKADKKEGKRKTSSLLMTLQANQKKKTKRLRSLREYFGE 375
                     + + L+  K +    K + +R+    L   +  +     R R  RE    
Sbjct: 465 LEEEPTVGGGLAAALKLLKSRGILKKNQLERERREFLKEKERLKLLAEIRERIERERDRN 524

Query: 376 DENVYIMDARTSGNIGRYLN 395
           D     M AR      R  N
Sbjct: 525 DGKYSRMSAREREEYARPEN 544


>gnl|CDD|218598 pfam05470, eIF-3c_N, Eukaryotic translation initiation factor 3
           subunit 8 N-terminus.  The largest of the mammalian
           translation initiation factors, eIF3, consists of at
           least eight subunits ranging in mass from 35 to 170 kDa.
           eIF3 binds to the 40 S ribosome in an early step of
           translation initiation and promotes the binding of
           methionyl-tRNAi and mRNA.
          Length = 593

 Score = 32.8 bits (75), Expect = 0.46
 Identities = 22/111 (19%), Positives = 42/111 (37%), Gaps = 2/111 (1%)

Query: 257 DFIETVERYKEAYESDVPEEDMVEDDEAENENSDEESPNSNSNEDNSQDKAILNSDDETE 316
            F + + RY+E  ES+  EE+  EDD+ +  + ++E  +     +     +    D   E
Sbjct: 123 QFEDDITRYREDPESEDEEEEEDEDDDDDGSDDEDEDEDGVGATEEVAASSESGVDRVKE 182

Query: 317 NSSNADSDHIRSRLRKRKRKQKADKKEGKRKT--SSLLMTLQANQKKKTKR 365
           +    +   +  +    + K     +E         L   + A  KK T R
Sbjct: 183 DDEEDEDADLSKKDVLEEPKMFKKPEEITWDDVFKKLKEIMSARGKKTTDR 233


>gnl|CDD|203043 pfam04546, Sigma70_ner, Sigma-70, non-essential region.  The domain
           is found in the primary vegetative sigma factor. The
           function of this domain is unclear and can be removed
           without loss of function.
          Length = 211

 Score = 31.8 bits (73), Expect = 0.50
 Identities = 7/43 (16%), Positives = 17/43 (39%)

Query: 267 EAYESDVPEEDMVEDDEAENENSDEESPNSNSNEDNSQDKAIL 309
            A  +        E DE + E+ D++  + + +++   D    
Sbjct: 34  AAAAAATAAAIESELDEEDLEDDDDDDEDEDEDDEEEADLGPD 76


>gnl|CDD|215565 PLN03083, PLN03083, E3 UFM1-protein ligase 1 homolog; Provisional.
          Length = 803

 Score = 32.1 bits (73), Expect = 0.66
 Identities = 21/114 (18%), Positives = 46/114 (40%), Gaps = 14/114 (12%)

Query: 249 GDEYLAELDFI----ETVERYKEAYESDVPEEDMVEDDEAENENSDEESPNSNSNEDNSQ 304
           G+  +    FI    + +E+  +A+        ++   E ++  S+E SP +++++  S+
Sbjct: 363 GESCVLSNGFIKGIYDQIEKEMDAFSIQASSAGLIGSSE-KSLGSNESSPAASNSDKGSK 421

Query: 305 DK---------AILNSDDETENSSNADSDHIRSRLRKRKRKQKADKKEGKRKTS 349
            K             S  + E  +       + + R +  K  +D K G +K S
Sbjct: 422 KKKGKSTSTKGGTAESIPDDEEDAPKKGKKNQKKGRDKSSKVPSDSKAGGKKES 475


>gnl|CDD|185626 PTZ00447, PTZ00447, apical membrane antigen 1-like protein;
           Provisional.
          Length = 508

 Score = 32.3 bits (73), Expect = 0.66
 Identities = 22/96 (22%), Positives = 40/96 (41%), Gaps = 10/96 (10%)

Query: 217 RCLNDIPQGTFICIYAGHLLTDSDANEEGKNYGDEYLAEL--------DFIETVERYKEA 268
           R   +IP+     I   ++  DS     G+   DE+  ++        D I   E   + 
Sbjct: 393 RFRRNIPKSAHKMIKEMYVQKDSLETTAGER--DEFKNDVLEEDEISGDNIGPDEIENDH 450

Query: 269 YESDVPEEDMVEDDEAENENSDEESPNSNSNEDNSQ 304
           Y+    E D ++ DE +NE  ++  PN+   + + Q
Sbjct: 451 YQEKEIENDNIQYDENKNEGKNDHIPNNTLPKGHIQ 486


>gnl|CDD|219979 pfam08704, GCD14, tRNA methyltransferase complex GCD14 subunit.
           GCD14 is a subunit of the tRNA methyltransferase complex
           and is required for 1-methyladenosine modification and
           maturation of initiator methionyl-tRNA.
          Length = 309

 Score = 31.7 bits (72), Expect = 0.70
 Identities = 20/63 (31%), Positives = 28/63 (44%), Gaps = 7/63 (11%)

Query: 251 EYLAELDF--IETVE---RYKEAYESDVPEEDMVEDDEAENENSDEESP--NSNSNEDNS 303
             LA L F  IET+E   R  +     +P  D+  D   ENE    E P     +N+  S
Sbjct: 219 LALAALGFTEIETIEVLPRQYDVRTVSLPVIDLGRDTLEENERRRIEGPKERKANNDAKS 278

Query: 304 QDK 306
           +D+
Sbjct: 279 EDQ 281


>gnl|CDD|219563 pfam07767, Nop53, Nop53 (60S ribosomal biogenesis).  This nucleolar
           family of proteins are involved in 60S ribosomal
           biogenesis. They are specifically involved in the
           processing beyond the 27S stage of 25S rRNA maturation.
           This family contains sequences that bear similarity to
           the glioma tumour suppressor candidate region gene 2
           protein (p60). This protein has been found to interact
           with herpes simplex type 1 regulatory proteins.
          Length = 387

 Score = 32.0 bits (73), Expect = 0.72
 Identities = 15/88 (17%), Positives = 35/88 (39%), Gaps = 8/88 (9%)

Query: 285 ENENSDEESPNSNSNEDNSQDKAILNSDDETENSSNADSDHIRSRLRKRKRKQKADKKEG 344
           E      E  + +  E++  + A    + E E  +       +++  +R ++++  + E 
Sbjct: 236 EMSEGLLEESDDDGEEESDDESAWEGFESEYEPINKPVRPKRKTK-AQRNKEKRRKELER 294

Query: 345 KRKTSSLLMTLQANQKKKTKRLRSLREY 372
           + K        +   KKK  +L  L+E 
Sbjct: 295 EAK-------EEKQLKKKLAQLARLKEI 315


>gnl|CDD|217834 pfam03998, Utp11, Utp11 protein.  This protein is found to be part
           of a large ribonucleoprotein complex containing the U3
           snoRNA. Depletion of the Utp proteins impedes production
           of the 18S rRNA, indicating that they are part of the
           active pre-rRNA processing complex. This large RNP
           complex has been termed the small subunit (SSU)
           processome.
          Length = 239

 Score = 31.6 bits (72), Expect = 0.73
 Identities = 23/93 (24%), Positives = 40/93 (43%), Gaps = 2/93 (2%)

Query: 275 EEDMVEDDEAENENSDEESPNSNSNE-DNSQDKAILNSDDETENSSNADSDHIRSRLRKR 333
           EE+    D AE  ++  E  +   N    SQ +     D++ +  S      +   L++R
Sbjct: 131 EEEQKSFDPAEYFDTTPELLDRRENRPRISQLEKTSLVDEKQKKKSAKKKRKLYKELKER 190

Query: 334 KRKQKADKK-EGKRKTSSLLMTLQANQKKKTKR 365
           K ++K  KK E + +    LM     +KKK  +
Sbjct: 191 KEREKKLKKVEQRLELQRELMKKGKGKKKKIVK 223


>gnl|CDD|227458 COG5129, MAK16, Nuclear protein with HMG-like acidic region
           [General function prediction only].
          Length = 303

 Score = 31.6 bits (71), Expect = 0.75
 Identities = 20/87 (22%), Positives = 35/87 (40%), Gaps = 9/87 (10%)

Query: 263 ERYKEAYESDVPEEDMVEDDEAENENSDEESPNSNSNED---NSQDKAILNSDDETENSS 319
           ERY E  E    E + V DD       ++E       E    + Q      S++E  + S
Sbjct: 201 ERYVEEEEESDTELEAVTDDS------EKEKTKKKDLEKWLGSDQSMETSESEEEESSES 254

Query: 320 NADSDHIRSRLRKRKRKQKADKKEGKR 346
            +D D       K ++++  D K+ ++
Sbjct: 255 ESDEDEDEDNKGKIRKRKTDDAKKSRK 281


>gnl|CDD|201707 pfam01280, Ribosomal_L19e, Ribosomal protein L19e. 
          Length = 148

 Score = 30.6 bits (70), Expect = 0.90
 Identities = 18/53 (33%), Positives = 25/53 (47%), Gaps = 10/53 (18%)

Query: 325 HIRSRLRKRKRKQKAD--KKEGKRKTSSLLMTLQANQKKKT---KRLRSLREY 372
           H R R RKRK K++    +  G RK +       A   KK    +R+R+LR  
Sbjct: 57  HSRGRARKRKEKRRKGRHRGPGSRKGTK-----GARMPKKELWIRRIRALRRL 104


>gnl|CDD|218333 pfam04931, DNA_pol_phi, DNA polymerase phi.  This family includes
           the fifth essential DNA polymerase in yeast EC:2.7.7.7.
           Pol5p is localised exclusively to the nucleolus and
           binds near or at the enhancer region of rRNA-encoding
           DNA repeating units.
          Length = 784

 Score = 31.8 bits (72), Expect = 0.92
 Identities = 22/133 (16%), Positives = 47/133 (35%), Gaps = 21/133 (15%)

Query: 239 SDANEEGKNYGDEYLAELDFIETVERYKEAYESDVPEEDMVEDDEAENENSDEESPNSNS 298
            +  EE ++  +E   + D  E +E      +S+   E   ED E + +  D E+     
Sbjct: 645 FEGEEEDEDDLEETDDDEDECEAIE------DSESESESDGEDGEEDEQEDDAEANEGVV 698

Query: 299 NEDNSQDK----------AILNSDDETENSSNADS-----DHIRSRLRKRKRKQKADKKE 343
             D +  +          A+   D E E   + +       ++    +++K + +A  + 
Sbjct: 699 PIDKAVRRALPKVLNLPDALDGGDSEDEEGMDDEQMMRLDTYLAQIFKEKKERIQAGGET 758

Query: 344 GKRKTSSLLMTLQ 356
            K   S     + 
Sbjct: 759 KKEAQSQKQNVIS 771


>gnl|CDD|238069 cd00122, MBD, MeCP2, MBD1, MBD2, MBD3, MBD4, CLLD8-like, and
          BAZ2A-like proteins constitute a family of proteins
          that share the methyl-CpG-binding domain (MBD). The MBD
          consists of about 70 residues and is defined as the
          minimal region required for binding to methylated DNA
          by a methyl-CpG-binding protein which binds
          specifically to methylated DNA. The MBD can recognize a
          single symmetrically methylated CpG either as naked DNA
          or within chromatin.  MeCP2, MBD1 and MBD2 (and likely
          MBD3) form complexes with histone deacetylase and are
          involved in histone deacetylase-dependent repression of
          transcription. MBD4 is an endonuclease that forms a
          complex with the DNA mismatch-repair protein MLH1. The
          MBDs present in putative chromatin remodelling subunit,
          BAZ2A, and putative histone methyltransferase, CLLD8,
          represent two phylogenetically distinct groups within
          the MBD protein family.
          Length = 62

 Score = 28.8 bits (65), Expect = 0.94
 Identities = 8/20 (40%), Positives = 13/20 (65%)

Query: 12 YTAPCGRTLRTSDQLVLYLF 31
          Y +PCG+ LR+  ++  YL 
Sbjct: 29 YYSPCGKKLRSKPEVARYLE 48


>gnl|CDD|220232 pfam09421, FRQ, Frequency clock protein.  The frequency clock
           protein is the central component of the frq-based
           circadian negative feedback loop and regulates various
           aspects of the circadian clock in Neurospora crassa.
           This protein has been shown to interact with itself via
           a coiled-coil.
          Length = 989

 Score = 31.5 bits (71), Expect = 1.0
 Identities = 19/58 (32%), Positives = 29/58 (50%), Gaps = 5/58 (8%)

Query: 275 EEDMVEDDEAENENSDEE---SPNSNSNEDNSQDKAILNSDDETENSSNADSDHIRSR 329
           E+D+  D + E+ +S+E      N + +++N  DK  L S DET      D D   SR
Sbjct: 869 EDDLGSDGDEEDSSSEEFMSRRANPHQSDNNYPDKVDLASGDETG--EEPDDDIDASR 924


>gnl|CDD|221333 pfam11942, Spt5_N, Spt5 transcription elongation factor, acidic
           N-terminal.  This is the very acidic N-terminal region
           of the early transcription elongation factor Spt5. The
           Spt5-Spt4 complex regulates early transcription
           elongation by RNA polymerase II and has an imputed role
           in pre-mRNA processing via its physical association with
           mRNA capping enzymes. The actual function of this
           N-terminal domain is not known although it is
           dispensable for binding to Spt4.
          Length = 92

 Score = 29.3 bits (66), Expect = 1.0
 Identities = 16/66 (24%), Positives = 27/66 (40%), Gaps = 3/66 (4%)

Query: 275 EEDMVEDDEAENENSDEESPNSNSNEDN---SQDKAILNSDDETENSSNADSDHIRSRLR 331
           E D  E++E E E+  E+  + +   D      D+     D   E     D++ +   LR
Sbjct: 7   EVDDEEEEEEEEEDDLEDLSDEDEFIDEAEAEDDRRHRRLDRRREKEEEEDAEELAEYLR 66

Query: 332 KRKRKQ 337
           KR   +
Sbjct: 67  KRYGDE 72


>gnl|CDD|217830 pfam03986, Autophagy_N, Autophagocytosis associated protein (Atg3),
           N-terminal domain.  Autophagocytosis is a
           starvation-induced process responsible for transport of
           cytoplasmic proteins to the lysosome/vacuole. Atg3 is a
           ubiquitin-like modifier that is topologically similar to
           the canonical E2 enzyme. It catalyzes the conjugation of
           Atg8 and phosphatidylethanolamine.
          Length = 146

 Score = 30.4 bits (69), Expect = 1.0
 Identities = 8/57 (14%), Positives = 23/57 (40%)

Query: 269 YESDVPEEDMVEDDEAENENSDEESPNSNSNEDNSQDKAILNSDDETENSSNADSDH 325
            E     E++VED++ ++             +D + ++ I    D+ ++  ++    
Sbjct: 82  MEYGDGAEEIVEDEDEDDGWVTTHGNRDKQKDDIADEEDIPEIGDDDDDVVDSSDAD 138


>gnl|CDD|221173 pfam11702, DUF3295, Protein of unknown function (DUF3295).  This
           family is conserved in fungi but the function is not
           known.
          Length = 509

 Score = 31.4 bits (71), Expect = 1.1
 Identities = 20/97 (20%), Positives = 35/97 (36%), Gaps = 12/97 (12%)

Query: 272 DVPEEDMVEDDEAENENSDEESPNSNSNEDNSQDKAILNSDDETENSSNADSDHIRSRLR 331
             PE    +DDE   E  +++  + ++ ED+  D    +S +E+  SS  +         
Sbjct: 278 TFPER-TSDDDEDAIETEEDDV-DESAIEDDDDDSDWEDSVEESGRSSVDEKTMF----- 330

Query: 332 KRKRKQKADKKEGKRKTSSLLMTLQANQKKKTKRLRS 368
                Q+ D K       SLL  +     +      S
Sbjct: 331 -----QRVDSKPNLTSRRSLLTLMLHQNDRAQGNEAS 362


>gnl|CDD|130712 TIGR01651, CobT, cobaltochelatase, CobT subunit.  This model
           describes Pseudomonas denitrificans CobT gene product,
           which is a cobalt chelatase subunit that functions in
           cobalamin biosynthesis. Cobalamin (vitamin B12) can be
           synthesized via several pathways, including an aerobic
           pathway (found in Pseudomonas denitrificans) and an
           anaerobic pathway (found in P. shermanii and Salmonella
           typhimurium). These pathways differ in the point of
           cobalt insertion during corrin ring formation. There are
           apparently a number of variations on these two pathways,
           where the major differences seem to be concerned with
           the process of ring contraction. Confusion regarding the
           functions of enzymes found in the aerobic vs. anaerobic
           pathways has arisen because nonhomologous genes in these
           different pathways were given the same gene symbols.
           Thus, cobT in the aerobic pathway (P. denitrificans) is
           not a homolog of cobT in the anaerobic pathway (S.
           typhimurium). It should be noted that E. coli
           synthesizes cobalamin only when it is supplied with the
           precursor cobinamide, which is a complex intermediate.
           Additionally, all E. coli cobalamin synthesis genes
           (cobU, cobS and cobT) were named after their Salmonella
           typhimurium homologs which function in the anaerobic
           cobalamin synthesis pathway. This model describes the
           aerobic cobalamin pathway Pseudomonas denitrificans CobT
           gene product, which is a cobalt chelatase subunit, with
           a MW ~70 kDa. The aerobic pathway cobalt chelatase is a
           heterotrimeric, ATP-dependent enzyme that catalyzes
           cobalt insertion during cobalamin biosynthesis. The
           other two subunits are the P. denitrificans CobS
           (TIGR01650) and CobN (pfam02514 CobN/Magnesium
           Chelatase) proteins. To avoid potential confusion with
           the nonhomologous Salmonella typhimurium/E.coli cobT
           gene product, the P. denitrificans gene symbol is not
           used in the name of this model [Biosynthesis of
           cofactors, prosthetic groups, and carriers, Heme,
           porphyrin, and cobalamin].
          Length = 600

 Score = 31.5 bits (71), Expect = 1.2
 Identities = 13/74 (17%), Positives = 27/74 (36%)

Query: 251 EYLAELDFIETVERYKEAYESDVPEEDMVEDDEAENENSDEESPNSNSNEDNSQDKAILN 310
           E L  ++  E +    E+ + +  ++D   ++E E +   E      S    S+     +
Sbjct: 192 EMLRSMELAEEMGDDTESEDEEDGDDDQPTENEQEEQGEGEGEGQEGSAPQESEATDRES 251

Query: 311 SDDETENSSNADSD 324
              E E   +   D
Sbjct: 252 ESGEEEMVQSDQDD 265


>gnl|CDD|235046 PRK02507, PRK02507, proton extrusion protein PcxA; Provisional.
          Length = 422

 Score = 31.1 bits (71), Expect = 1.2
 Identities = 14/70 (20%), Positives = 30/70 (42%), Gaps = 2/70 (2%)

Query: 236 LTDSDANEEGKNYGDEYLAELDFIETV-ERYK-EAYESDVPEEDMVEDDEAENENSDEES 293
           L++     E +      L +L FI++V  RY+ ++ +   P     + + + +  S    
Sbjct: 87  LSELAKAGEEQITETIILEKLQFIDSVISRYRSKSDQESPPLAASSQIETSTSSPSTSNP 146

Query: 294 PNSNSNEDNS 303
            N  S++  S
Sbjct: 147 VNDPSSKSES 156


>gnl|CDD|236178 PRK08187, PRK08187, pyruvate kinase; Validated.
          Length = 493

 Score = 31.1 bits (71), Expect = 1.2
 Identities = 10/21 (47%), Positives = 14/21 (66%)

Query: 350 SLLMTLQANQKKKTKRLRSLR 370
            LL  +  +Q KKT +LR+LR
Sbjct: 471 DLLARMDGHQHKKTPQLRALR 491


>gnl|CDD|227578 COG5253, MSS4, Phosphatidylinositol-4-phosphate 5-kinase [Signal
           transduction mechanisms].
          Length = 612

 Score = 31.1 bits (70), Expect = 1.4
 Identities = 32/165 (19%), Positives = 55/165 (33%), Gaps = 29/165 (17%)

Query: 237 TDSDANEEGKNYGDEYLAELDFIETVERYKEAYESDVPEEDMVE-----DDEAENENSDE 291
           T SD   +  +       E D ++  +R  +A         +VE      D  +  N   
Sbjct: 60  TFSDQLHDALSKEFTLERERDRLQLNKRKYQAIRL-QTSTPIVEIFKNNKDAVDPPNHTR 118

Query: 292 ESPNSNSNED------------NSQDKAILNSDDETENSSNAD---------SDHIRSRL 330
            S N+ SN +             S +   L+ + +TE  S+           S    S  
Sbjct: 119 SSGNNLSNANVKTLSAPVGEHSRSNNPPNLDQNLDTEPESSISQWGELQLNPSGKTLSSQ 178

Query: 331 RKRKRKQKADKKEGKRKTSSLLMTLQANQKKKTKRLRSLREYFGE 375
             RK   +  K E     S L  ++ +    K+   R+L  ++ E
Sbjct: 179 PSRKPTSENPKSE--SDNSKLPTSVNSPLPDKSLLKRTLSNFWAE 221


>gnl|CDD|233191 TIGR00927, 2A1904, K+-dependent Na+/Ca2+ exchanger.  [Transport and
           binding proteins, Cations and iron carrying compounds].
          Length = 1096

 Score = 31.1 bits (70), Expect = 1.5
 Identities = 15/80 (18%), Positives = 29/80 (36%)

Query: 238 DSDANEEGKNYGDEYLAELDFIETVERYKEAYESDVPEEDMVEDDEAENENSDEESPNSN 297
           + +  EEG+   DE   E +    VE   +  E++   E   E  E E+E   +   +  
Sbjct: 731 EIETGEEGEEVEDEGEGEAEGKHEVETEGDRKETEHEGETEAEGKEDEDEGEIQAGEDGE 790

Query: 298 SNEDNSQDKAILNSDDETEN 317
              D   +  + +  +    
Sbjct: 791 MKGDEGAEGKVEHEGETEAG 810


>gnl|CDD|219746 pfam08208, RNA_polI_A34, DNA-directed RNA polymerase I subunit
           RPA34.5.  This is a family of proteins conserved from
           yeasts to human. Subunit A34.5 of RNA polymerase I is a
           non-essential subunit which is thought to help Pol I
           overcome topological constraints imposed on ribosomal
           DNA during the process of transcription.
          Length = 193

 Score = 30.1 bits (68), Expect = 1.5
 Identities = 13/38 (34%), Positives = 21/38 (55%), Gaps = 3/38 (7%)

Query: 331 RKRKRKQKADKK---EGKRKTSSLLMTLQANQKKKTKR 365
           +K K+K+K  KK   E K K   ++    + +KKK K+
Sbjct: 154 KKEKKKKKEVKKEKKEKKDKKEKMVEPKGSKKKKKKKK 191


>gnl|CDD|148314 pfam06632, XRCC4, DNA double-strand break repair and V(D)J
           recombination protein XRCC4.  This family consists of
           several eukaryotic DNA double-strand break repair and
           V(D)J recombination protein XRCC4 sequences. In the
           non-homologous end joining pathway of DNA double-strand
           break repair, the ligation step is catalyzed by a
           complex of XRCC4 and DNA ligase IV. It is thought that
           XRCC4 and ligase IV are essential for alignment-based
           gap filling, as well as for final ligation of the
           breaks.
          Length = 331

 Score = 30.6 bits (69), Expect = 1.7
 Identities = 22/75 (29%), Positives = 32/75 (42%), Gaps = 9/75 (12%)

Query: 274 PEEDMVEDDEAENENSDEESPNSNSNEDNSQDKAILNSDDETENSSNADSDHIRSRLRKR 333
           P+ED   D   + E      P+ +     S+D ++++S D T        D   SR R R
Sbjct: 222 PDEDSKYDGSTDEEQEAPPKPSESMPAAVSKDDSLISSPDIT--------DIAPSRKR-R 272

Query: 334 KRKQKADKKEGKRKT 348
           +R QK    E K  T
Sbjct: 273 QRMQKNLGTEPKMAT 287


>gnl|CDD|226920 COG4547, CobT, Cobalamin biosynthesis protein CobT
           (nicotinate-mononucleotide:5, 6-dimethylbenzimidazole
           phosphoribosyltransferase) [Coenzyme metabolism].
          Length = 620

 Score = 31.0 bits (70), Expect = 1.8
 Identities = 18/72 (25%), Positives = 32/72 (44%), Gaps = 2/72 (2%)

Query: 250 DEYLAELDFIETVERYKEAYESDVPEEDMVEDDEAENENSDEESPNSNSNEDNSQDKAIL 309
            + L  +D  E  E   +  E D  EED  +D    NE+S+     S  ++++ +D+A  
Sbjct: 210 RDMLGSMDMAE--ETGDDGIEEDADEEDGDDDQPDNNEDSEAGREESEGSDESEEDEAEA 267

Query: 310 NSDDETENSSNA 321
              +  E   +A
Sbjct: 268 TDGEGEEGEMDA 279



 Score = 29.8 bits (67), Expect = 4.0
 Identities = 29/162 (17%), Positives = 62/162 (38%), Gaps = 12/162 (7%)

Query: 235 LLTDSDANEEGKNYGDEYLAELDFIETVERYKEAYESDVPEEDMVEDDEAENENSDEESP 294
           +L   D  EE    GD+ + E    +  +   +  +++   E   E+ E  +E+ ++E+ 
Sbjct: 212 MLGSMDMAEE---TGDDGIEEDA--DEEDGDDDQPDNNEDSEAGREESEGSDESEEDEAE 266

Query: 295 NSNSNEDNSQDKAILNSDDETENSSNADSDHI-RSRLRKRKRKQKADKKEGKRKTSSLLM 353
            ++   +  +  A   S+D   + S+ D++             +  ++ + K  T     
Sbjct: 267 ATDGEGEEGEMDAAEASEDSESDESDEDTETPGEDARPATPFTELMEEVDYKVFTREFDE 326

Query: 354 TLQANQKKKTKRLRSLREYFGEDENVYIMDARTSGNIGRYLN 395
            + A +      L  LR +  +        A  SG +GR  N
Sbjct: 327 IVLAEELCDEAELDRLRAFLDKQL------AHLSGVVGRLAN 362


>gnl|CDD|235943 PRK07133, PRK07133, DNA polymerase III subunits gamma and tau;
           Validated.
          Length = 725

 Score = 30.9 bits (70), Expect = 1.8
 Identities = 19/109 (17%), Positives = 41/109 (37%), Gaps = 5/109 (4%)

Query: 282 DEAENENSDEESPNSNSNEDNSQDKAILNSDDETENSSNADSDHIRSRLRKRKRKQKADK 341
           DE  NE  D  + ++N +E+ +      N+ D    +S    D I+  +     ++   K
Sbjct: 498 DENINETFDTSTISANLSENKTNFAQSFNNKDTNLINSEIPIDLIKDTITINNSQKNVKK 557

Query: 342 KEGKRKTS-----SLLMTLQANQKKKTKRLRSLREYFGEDENVYIMDAR 385
              K   S     +L+M       +     + L + + ++  ++  D  
Sbjct: 558 NGNKDYLSVEEVINLIMLAIKFHSQNQVEYKKLVQNWNKNLPLFEYDVE 606


>gnl|CDD|227880 COG5593, COG5593, Nucleic-acid-binding protein possibly involved in
           ribosomal biogenesis [Translation, ribosomal structure
           and biogenesis].
          Length = 821

 Score = 30.8 bits (69), Expect = 1.9
 Identities = 23/121 (19%), Positives = 55/121 (45%), Gaps = 11/121 (9%)

Query: 270 ESDVPEEDMVEDDEAENENSDEESPNSNSNEDNSQDKAILNSDDE-------TENSSNAD 322
           +SD  E D  EDD +++ + DE   ++  +ED   + +  +  +E       + +    +
Sbjct: 704 DSDDSELDFAEDDFSDSTSDDEPKLDAIDDEDAKSEGSQESDQEEGLDEIFYSFDGEQDN 763

Query: 323 SDHIRSRLRKRKRKQKADKKEGKRKTSSLLMTLQANQKKKTKRLRSLREYFGEDENVYIM 382
           SD       + +  ++  ++E  ++ S+     +A +K++   L+SL  +   D+    +
Sbjct: 764 SDSFAESSEEDESSEEEKEEEENKEVSA----KRAKKKQRKNMLKSLPVFASADDYAQYL 819

Query: 383 D 383
           D
Sbjct: 820 D 820


>gnl|CDD|240338 PTZ00262, PTZ00262, subtilisin-like protease; Provisional.
          Length = 639

 Score = 30.7 bits (69), Expect = 2.0
 Identities = 23/116 (19%), Positives = 48/116 (41%), Gaps = 4/116 (3%)

Query: 237 TDSDANEEGKNYGDEYLAELDFIETVERYKEAYESDVPEEDMVEDDEAENENSDEESPNS 296
           +    N    + G++  +EL F+  VER  E   S++  E +  + +  + N D  SP  
Sbjct: 30  SRRKHNTARNDDGEKIGSELRFLGKVERGAE--TSNLRGEGV--EADVNSSNPDSASPKE 85

Query: 297 NSNEDNSQDKAILNSDDETENSSNADSDHIRSRLRKRKRKQKADKKEGKRKTSSLL 352
              +   Q ++            + + D ++++LRK+K+  +    E    + S  
Sbjct: 86  ELQKIQGQQESSPPQVSHLLQDDSHNMDEMKTKLRKKKKILRLIVSENHATSPSFF 141


>gnl|CDD|220600 pfam10147, CR6_interact, Growth arrest and DNA-damage-inducible
           proteins-interacting protein 1.  Members of this family
           of proteins act as negative regulators of G1 to S cell
           cycle phase progression by inhibiting cyclin-dependent
           kinases. Inhibitory effects are additive with GADD45
           proteins but occur also in the absence of GADD45
           proteins. Furthermore, they act as a repressor of the
           orphan nuclear receptor NR4A1 by inhibiting AB
           domain-mediated transcriptional activity.
          Length = 217

 Score = 30.2 bits (68), Expect = 2.0
 Identities = 13/56 (23%), Positives = 23/56 (41%), Gaps = 19/56 (33%)

Query: 330 LRKRKRKQKADKKEGKRKTSSLLMTLQANQKKKTKRLRSLREYFGEDENVYIMDAR 385
            R +KRK++   +  K              ++K + +   RE+FG     Y +D R
Sbjct: 140 WRAQKRKREQKARAAK--------------ERKERLVAEAREHFG-----YWVDPR 176


>gnl|CDD|217927 pfam04147, Nop14, Nop14-like family.  Emg1 and Nop14 are novel
           proteins whose interaction is required for the
           maturation of the 18S rRNA and for 40S ribosome
           production.
          Length = 809

 Score = 30.7 bits (70), Expect = 2.1
 Identities = 15/108 (13%), Positives = 40/108 (37%), Gaps = 7/108 (6%)

Query: 238 DSDANEEGKNYGDEYLAELDFIETVERYKEAYESDVPEEDMVEDDEAENEN---SDEESP 294
           D +  E+ K   D+    LD     +           E++  E+D  ++E+    D++  
Sbjct: 288 DDEEEEDSKESADD----LDDEFEPDDDDNFGLGQGEEDEEEEEDGVDDEDEEDDDDDLE 343

Query: 295 NSNSNEDNSQDKAILNSDDETENSSNADSDHIRSRLRKRKRKQKADKK 342
               + D S ++     +D  +     + +  + + +K+  +    + 
Sbjct: 344 EEEEDVDLSDEEEDEEDEDSDDEDDEEEEEEEKEKKKKKSAESTRSEL 391


>gnl|CDD|219408 pfam07423, DUF1510, Protein of unknown function (DUF1510).  This
           family consists of several hypothetical bacterial
           proteins of around 200 residues in length. The function
           of this family is unknown.
          Length = 214

 Score = 30.1 bits (68), Expect = 2.2
 Identities = 32/102 (31%), Positives = 49/102 (48%), Gaps = 12/102 (11%)

Query: 228 ICIYAGHLLTDSDANE-------EGKNYGDEYLAELDFIETVERYKEAYESDVPEEDMV- 279
           I I A  L   S  ++       E K   D+  AE++  E  E  KEA  S+  E+    
Sbjct: 27  IIIVAYQLFFPSSPSDQAAADEQEAKKSDDQETAEIE--EVKEEEKEAANSEDKEDKGDA 84

Query: 280 --EDDEAENENSDEESPNSNSNEDNSQDKAILNSDDETENSS 319
             ED+E+E EN +E+  +S+ NE  +++K   N + E  N S
Sbjct: 85  EKEDEESEEENEEEDEESSDENEKETEEKTESNVEKEITNPS 126


>gnl|CDD|215579 PLN03106, TCP2, Protein TCP2; Provisional.
          Length = 447

 Score = 30.4 bits (68), Expect = 2.3
 Identities = 22/85 (25%), Positives = 38/85 (44%), Gaps = 7/85 (8%)

Query: 267 EAYESDVPEEDMVEDDEAENENSDEESPNSNSNEDNSQDKAILNSDDETENSS--NADSD 324
           E   SD  E+    DDE      + ++   N  + NS  K+  +S  +T   S  +    
Sbjct: 149 EKRTSDGTEQGFDSDDE----EHENQTLTQNQAQHNSLSKSACSSTSDTSKGSGLSLSRS 204

Query: 325 HIRSRLRKRKRKQKADKKEGKRKTS 349
            +R + R+R R++ A +KE K   +
Sbjct: 205 ELRDKARERARERTAKEKE-KEDHN 228


>gnl|CDD|240402 PTZ00399, PTZ00399, cysteinyl-tRNA-synthetase; Provisional.
          Length = 651

 Score = 30.4 bits (69), Expect = 2.3
 Identities = 18/85 (21%), Positives = 32/85 (37%), Gaps = 20/85 (23%)

Query: 300 EDNSQDKAILNSDDETENSSNADSDHIRSRLRKRKRKQKADKKEGKRKTSSLLMTLQANQ 359
           ED     ++   DD+ E            R ++ K   K  K+  K K        +  +
Sbjct: 536 EDKPDGPSVWKLDDKEELQ----------REKEEKEALKEQKRLRKLKKQ------EEKK 579

Query: 360 KKKTKRLRSLR----EYFGEDENVY 380
           KK+ ++L   +    E+F   E+ Y
Sbjct: 580 KKELEKLEKAKIPPAEFFKRQEDKY 604


>gnl|CDD|240226 PTZ00007, PTZ00007, (NAP-L) nucleosome assembly protein -L;
           Provisional.
          Length = 337

 Score = 30.1 bits (68), Expect = 2.6
 Identities = 10/62 (16%), Positives = 25/62 (40%), Gaps = 10/62 (16%)

Query: 275 EEDMVEDDEAENENSDE-ESPNSNSNEDNSQDKAILNSDDETENSSNADSDHIRSRLRKR 333
           +ED     + ++++ D  +S +S S++ NS         D   N  +   +   +  +  
Sbjct: 281 DEDSDYSSDEDDDDYDSYDSSDSASSDSNS---------DVDTNEEDDRGEKESNGAKSN 331

Query: 334 KR 335
           + 
Sbjct: 332 EL 333


>gnl|CDD|112562 pfam03753, HHV6-IE, Human herpesvirus 6 immediate early protein.
           The proteins in this family are poorly characterized,
           but an investigation has indicated that the immediate
           early protein is required for the down-regulation of MHC
           class I expression in dendritic cells. Human herpesvirus
           6 immediate early protein is also referred to as U90.
          Length = 993

 Score = 30.4 bits (68), Expect = 2.6
 Identities = 18/86 (20%), Positives = 36/86 (41%), Gaps = 10/86 (11%)

Query: 275 EEDMVEDDEAENENSDEESPNSNSNEDNSQDKAILNSDDETENSSN----------ADSD 324
           E D     E+E++N+DE    ++ +  N  +   +N+ +E    S           + + 
Sbjct: 585 ETDHSAPYESESDNNDEIDYIASVDSGNRTNNIHMNNTNENTPFSKSGKSPPEVTPSKTF 644

Query: 325 HIRSRLRKRKRKQKADKKEGKRKTSS 350
           + R + +     +K  K+  KRKT  
Sbjct: 645 YKRDKKKDISTNRKVKKRTAKRKTVG 670


>gnl|CDD|227891 COG5604, COG5604, Uncharacterized conserved protein [Function
           unknown].
          Length = 523

 Score = 30.2 bits (68), Expect = 2.6
 Identities = 11/49 (22%), Positives = 24/49 (48%), Gaps = 1/49 (2%)

Query: 266 KEAYESDVPEEDMVEDDEAENENSDEESPNSNSNEDNSQDKAILNSDDE 314
           +   ES   + +  ++  ++   SD +S +S   ED S+  + LN + +
Sbjct: 43  ENKMESGTNDNNKNKEKLSKLY-SDVDSSSSEEEEDGSESISKLNVNSK 90


>gnl|CDD|233503 TIGR01642, U2AF_lg, U2 snRNP auxiliary factor, large subunit,
           splicing factor.  These splicing factors consist of an
           N-terminal arginine-rich low complexity domain followed
           by three tandem RNA recognition motifs (pfam00076). The
           well-characterized members of this family are auxiliary
           components of the U2 small nuclear ribonucleoprotein
           splicing factor (U2AF). These proteins are closely
           related to the CC1-like subfamily of splicing factors
           (TIGR01622). Members of this subfamily are found in
           plants, metazoa and fungi.
          Length = 509

 Score = 30.2 bits (68), Expect = 2.6
 Identities = 22/94 (23%), Positives = 33/94 (35%), Gaps = 12/94 (12%)

Query: 280 EDDEAENENSDEESPNSNSNEDNSQDKAILNSDDETENSSNADSDHIRSRLRKRKRKQKA 339
             D +   +    S   +  ED S+ +     D      S   S   RSR R R+R +  
Sbjct: 28  SRDRSRFRDRHRRSRERSYRED-SRPRDRRRYDSR-SPRSLRYSSVRRSRDRPRRRSRSV 85

Query: 340 DK----KEGKRKTSSLLMTLQANQKKKTKRLRSL 369
                 +   R  S       +NQ +K  + RSL
Sbjct: 86  RSIEQHRRRLRDRSP------SNQWRKDDKKRSL 113


>gnl|CDD|220759 pfam10446, DUF2457, Protein of unknown function (DUF2457).  This is
           a family of uncharacterized proteins.
          Length = 449

 Score = 30.0 bits (67), Expect = 2.7
 Identities = 17/65 (26%), Positives = 32/65 (49%), Gaps = 3/65 (4%)

Query: 260 ETVERYKEAYESDVPEEDMVEDDEAENENSDEESPNSNSNEDNSQDKAILNSDDETENSS 319
              E   E  + D  +ED  +DD+ ++E+ ++E  + ++  D+S       +D+E     
Sbjct: 50  MEEEDDDEEDDDDDDDEDEDDDDDDDDEDDEDEDDDDSTLHDDSSADDGNETDNEA---G 106

Query: 320 NADSD 324
            ADSD
Sbjct: 107 FADSD 111



 Score = 28.8 bits (64), Expect = 6.0
 Identities = 17/57 (29%), Positives = 33/57 (57%), Gaps = 3/57 (5%)

Query: 266 KEAYESDVPEEDMVEDDEAENENSDEESPNSNSNEDNSQDKAILNSDDETENSSNAD 322
           KEA E  + EED  E+D+ ++++ DE+    + ++D+  D+   + D    + S+AD
Sbjct: 43  KEAEEEAMEEEDDDEEDDDDDDDEDED---DDDDDDDEDDEDEDDDDSTLHDDSSAD 96


>gnl|CDD|240329 PTZ00248, PTZ00248, eukaryotic translation initiation factor 2
           subunit 1; Provisional.
          Length = 319

 Score = 30.0 bits (68), Expect = 2.7
 Identities = 9/47 (19%), Positives = 20/47 (42%)

Query: 271 SDVPEEDMVEDDEAENENSDEESPNSNSNEDNSQDKAILNSDDETEN 317
               E+     ++AE E  +++   S   ++  +D+     DDE + 
Sbjct: 273 GGDEEDLEELLEKAEEEEEEDDYSESEDEDEEDEDEEEEEDDDEGDK 319



 Score = 28.5 bits (64), Expect = 7.8
 Identities = 12/46 (26%), Positives = 20/46 (43%), Gaps = 1/46 (2%)

Query: 248 YGDEYLAELDFIETVERYKEAYESDVPEEDMVEDDEAENENSDEES 293
            G +     + +E  E  +E  +     ED  E+DE E E  D++ 
Sbjct: 272 VGGDEEDLEELLEKAEE-EEEEDDYSESEDEDEEDEDEEEEEDDDE 316


>gnl|CDD|240388 PTZ00372, PTZ00372, endonuclease 4-like protein; Provisional.
          Length = 413

 Score = 30.1 bits (68), Expect = 2.9
 Identities = 20/111 (18%), Positives = 36/111 (32%), Gaps = 13/111 (11%)

Query: 280 EDDEAENENSDEESPNSNSNEDNSQDKAILNSDDETENSSNADSDHIRSRLRKRKRKQKA 339
                   + +  S  S      +      +  ++  N S   S     + +K+K+++K 
Sbjct: 27  IYSNVLVLSKEILSTFSEEENKVATTSTKKDKKEDKNNESKKKS----EKKKKKKKEKKE 82

Query: 340 DKKEGKRKTSSLLMTLQANQKKKTKRLRSLRE---------YFGEDENVYI 381
            K EG+ K            KKK  + +   +            E  NVYI
Sbjct: 83  PKSEGETKLGFKTPKKSKKTKKKPPKPKPNEDVDNAFNKIAELAEKSNVYI 133


>gnl|CDD|200340 TIGR03927, T7SS_EssA_Firm, type VII secretion protein EssA.
           Members of this family are associated with type VII
           secretion of WXG100 family targets in the Firmicutes,
           but not in the Actinobacteria. This highly divergent
           protein family consists largely of a central region of
           highly polar low-complexity sequence containing
           occasional LF motifs in weak repeats about 17 residues
           in length, flanked by hydrophobic N- and C-terminal
           regions [Protein fate, Protein and peptide secretion and
           trafficking].
          Length = 150

 Score = 28.9 bits (65), Expect = 3.0
 Identities = 12/78 (15%), Positives = 25/78 (32%), Gaps = 17/78 (21%)

Query: 287 ENSDEESPNSNSNEDNSQDKAILNSDDETENSSNADSDHIRSRLRKRKRKQKADKKEGKR 346
           E  D E       E+   DK +   +++ + +                  QK  +K  + 
Sbjct: 32  EKKDIEINTDYLQEETELDKELFTPEEQKKIT-----------------FQKHKEKPEQE 74

Query: 347 KTSSLLMTLQANQKKKTK 364
           +  + L +  A +    K
Sbjct: 75  ELKNQLFSENATENNTVK 92


>gnl|CDD|165391 PHA03118, PHA03118, multifunctional expression regulator;
           Provisional.
          Length = 474

 Score = 30.0 bits (67), Expect = 3.0
 Identities = 23/75 (30%), Positives = 37/75 (49%), Gaps = 1/75 (1%)

Query: 274 PEEDMVEDDEAENENSDEES-PNSNSNEDNSQDKAILNSDDETENSSNADSDHIRSRLRK 332
           PE+D ++D   E ++  EES PNS   +D+     IL  D+ET N ++   D  R+    
Sbjct: 25  PEDDPIDDFALEVQDWAEESVPNSVQIDDDEAAGEILGEDEETPNPADVCEDADRAYTNP 84

Query: 333 RKRKQKADKKEGKRK 347
              K+   ++EG   
Sbjct: 85  NFEKKAHGRREGYHH 99


>gnl|CDD|185603 PTZ00415, PTZ00415, transmission-blocking target antigen s230;
           Provisional.
          Length = 2849

 Score = 30.4 bits (68), Expect = 3.1
 Identities = 20/87 (22%), Positives = 41/87 (47%), Gaps = 2/87 (2%)

Query: 248 YGDEYLAELDFIETVERYKE--AYESDVPEEDMVEDDEAENENSDEESPNSNSNEDNSQD 305
           +G   + +LD I    R     A E   P ++ V DD+ E+E+ D++    +  E+  ++
Sbjct: 119 HGKAEIGDLDMIIIKRRRARHLAEEDMSPRDNFVIDDDDEDEDEDDDDEEDDEEEEEEEE 178

Query: 306 KAILNSDDETENSSNADSDHIRSRLRK 332
           +     D++ E+    D  + +S + K
Sbjct: 179 EIKGFDDEDEEDEGGEDFTYEKSEVDK 205


>gnl|CDD|128673 smart00391, MBD, Methyl-CpG binding domain.  Methyl-CpG binding
          domain, also known as the TAM (TTF-IIP5, ARBP, MeCP1)
          domain.
          Length = 77

 Score = 27.7 bits (62), Expect = 3.1
 Identities = 11/40 (27%), Positives = 21/40 (52%), Gaps = 1/40 (2%)

Query: 7  KKCIMYTAPCGRTLRTSDQLVLYLFITKA-KWTIDMFEYD 45
          K  + Y +PCG+ LR+  +L  YL         ++ F+++
Sbjct: 27 KFDVYYISPCGKKLRSKSELARYLHKNGDLSLDLECFDFN 66


>gnl|CDD|172295 PRK13757, PRK13757, chloramphenicol acetyltransferase; Provisional.
          Length = 219

 Score = 29.4 bits (66), Expect = 3.2
 Identities = 17/69 (24%), Positives = 29/69 (42%), Gaps = 12/69 (17%)

Query: 362 KTKRLRSLREYFGEDEN----VYIMDARTSGNIGRYLNHSCTPNVFVQNVFVDTHDPRFP 417
           +T+   SL   + +D      +Y  D    G      N +  P  F++N+F  + +   P
Sbjct: 98  QTETFSSLWSEYHDDFRQFLHIYSQDVACYGE-----NLAYFPKGFIENMFFVSAN---P 149

Query: 418 WVSFFALKF 426
           WVSF +   
Sbjct: 150 WVSFTSFDL 158


>gnl|CDD|215601 PLN03142, PLN03142, Probable chromatin-remodeling complex ATPase
           chain; Provisional.
          Length = 1033

 Score = 30.2 bits (68), Expect = 3.2
 Identities = 17/97 (17%), Positives = 37/97 (38%), Gaps = 7/97 (7%)

Query: 275 EEDMVEDDEAENENSDEESPNSNSNEDNSQDKAILNSDDETENSSNADSDHIRSRLRKRK 334
           EE +      E +  + E+   ++  D+  D+     +DE E          ++ + KR+
Sbjct: 2   EEQVNTQANEEEDEEELEAVARSAGSDSDDDEVPAEDEDEDEEDDEEAESPAKAEISKRE 61

Query: 335 R-------KQKADKKEGKRKTSSLLMTLQANQKKKTK 364
           +       KQK  + +   +  +  +    N K K +
Sbjct: 62  KARLKELKKQKKQEIQKILEQQNAAIDADMNNKGKGR 98



 Score = 29.4 bits (66), Expect = 5.3
 Identities = 17/74 (22%), Positives = 35/74 (47%), Gaps = 1/74 (1%)

Query: 270 ESDVPEEDMVEDDEAENENSDEESPNSNSNEDNSQDKAILNSDDETENSSNADSDHIRSR 329
           E  V  +   E+DE E E     + + + +++   +    + +D+ E  S A ++ I  R
Sbjct: 2   EEQVNTQANEEEDEEELEAVARSAGSDSDDDEVPAEDEDEDEEDDEEAESPAKAE-ISKR 60

Query: 330 LRKRKRKQKADKKE 343
            + R ++ K  KK+
Sbjct: 61  EKARLKELKKQKKQ 74


>gnl|CDD|177447 PHA02664, PHA02664, hypothetical protein; Provisional.
          Length = 534

 Score = 30.0 bits (67), Expect = 3.2
 Identities = 14/54 (25%), Positives = 28/54 (51%), Gaps = 1/54 (1%)

Query: 271 SDVPEEDMVEDDEAENENSDEESPNSNSNEDNSQDKAILNSDDETENSSNADSD 324
           +     D  EDD  E E+ DE   +   + D+S   +  +S+DE++++ ++  D
Sbjct: 445 AHADRADSDEDDMDEQESGDE-RADGEDDSDSSYSYSTTSSEDESDSADDSWGD 497


>gnl|CDD|238691 cd01397, HAT_MBD, Methyl-CpG binding domains (MBD) present in
          putative chromatin remodelling factor such as BAZ2A;
          BAZ2A contains a MBD, DDT, PHD-type zinc finger and
          Bromo domain suggesting that BAZ2A might be associated
          with histone acetyltransferase (HAT) activity. The
          Drosophila melanogaster toutatis protein, a putative
          subunit of the chromatin-remodeling complex, and other
          such proteins in this group share a similar domain
          architecture with BAZ2A, as does the Caenorhabditis
          elegans flectin homolog.
          Length = 73

 Score = 27.4 bits (61), Expect = 3.5
 Identities = 13/39 (33%), Positives = 21/39 (53%), Gaps = 4/39 (10%)

Query: 12 YTAPCGRTLRTSDQLVLYLFITKAKWTIDMFEYDHFVSS 50
          Y APCG+ LR   +++ YL    +K  I +   ++F  S
Sbjct: 29 YYAPCGKKLRQYPEVIKYL----SKNGISLLSRENFSFS 63


>gnl|CDD|217861 pfam04050, Upf2, Up-frameshift suppressor 2.  Transcripts
           harbouring premature signals for translation termination
           are recognised and rapidly degraded by eukaryotic cells
           through a pathway known as nonsense-mediated mRNA decay.
           In Saccharomyces cerevisiae, three trans-acting factors
           (Upf1 to Upf3) are required for nonsense-mediated mRNA
           decay.
          Length = 171

 Score = 28.9 bits (65), Expect = 3.5
 Identities = 9/37 (24%), Positives = 23/37 (62%), Gaps = 3/37 (8%)

Query: 272 DVPEEDMVEDDEAENENSDEES---PNSNSNEDNSQD 305
           D  E++ + +++ ++E+SDEE    P+   +E++  +
Sbjct: 9   DGEEDEELPEEDEDDESSDEEEVDLPDDEQDEESDSE 45


>gnl|CDD|227492 COG5163, NOP7, Protein required for biogenesis of the 60S ribosomal
           subunit [Translation, ribosomal structure and
           biogenesis].
          Length = 591

 Score = 29.7 bits (66), Expect = 3.8
 Identities = 26/111 (23%), Positives = 50/111 (45%), Gaps = 17/111 (15%)

Query: 254 AELDFIETVERYKEAYESDVPEEDMVEDDEAEN--ENSDEESPNSNSNEDNSQDKAILNS 311
           A L  +E  +R+ E    +  E+      E     E+ D++       E   + + I  S
Sbjct: 460 ASLMTMEETQRHSEEDLVNRFED---VRYEHVAGEEDDDDDEELQAQKELELEAQGIKYS 516

Query: 312 DDETENSSNADSDHIRSRLRKRKRKQKADKKEGKRKTSSLLMTLQANQKKK 362
           +     +S AD D  +S+ +KRK  ++ ++K+       L M + +N++KK
Sbjct: 517 E-----TSEADKDVNKSKNKKRKVDEEEEEKK-------LKMIMMSNKQKK 555



 Score = 29.7 bits (66), Expect = 4.1
 Identities = 21/113 (18%), Positives = 43/113 (38%), Gaps = 4/113 (3%)

Query: 258 FIETVERYKEAYESDVPEEDMVEDDEAENENSDEESPNSNSNEDNSQDKAILNSDD-ETE 316
           F         A    + E     +++  N   D    +    ED+  D+ +    + E E
Sbjct: 450 FASVDSYDPRASLMTMEETQRHSEEDLVNRFEDVRYEHVAGEEDDDDDEELQAQKELELE 509

Query: 317 NSSNADSDHIRSR--LRKRKRKQKADKKEGKRKTSSLLMTLQANQKKKTKRLR 367
                 S+   +   + K K K++   +E + K   ++M +   QKK  K+++
Sbjct: 510 AQGIKYSETSEADKDVNKSKNKKRKVDEEEEEKKLKMIM-MSNKQKKLYKKMK 561


>gnl|CDD|220267 pfam09494, Slx4, Slx4 endonuclease.  The Slx4 protein is a
           heteromeric structure-specific endonuclease found in
           fungi. Slx4 with Slx1 acts as a nuclease on branched DNA
           substrates, particularly simple-Y, 5'-flap, or
           replication fork structures by cleaving the strand
           bearing the 5' non-homologous arm at the branch junction
           and thus generating ligatable nicked products from
           5'-flap or replication fork substrates.
          Length = 627

 Score = 29.6 bits (66), Expect = 3.9
 Identities = 27/116 (23%), Positives = 43/116 (37%), Gaps = 23/116 (19%)

Query: 281 DDEAENENSDEESPNSNSNED---NSQDKAILNSDDETEN-------------SSNA-DS 323
           DD+  +E  D ES      +    ++Q +  L+  +E EN             S  + D 
Sbjct: 1   DDDEADEEEDTESGEEEHEDQIFISTQIQGRLDDFEEEENQRLQLRQVISRFTSFESDDQ 60

Query: 324 DHIRSRLRKRKRKQKADKKEGKRKTSSLL------MTLQANQKKKTKRLRSLREYF 373
            +  +   KR  K+K  KK   RK +         +T    +  +T R  SL  Y 
Sbjct: 61  ANSGNVSGKRVPKKKKIKKPKLRKRTKRKNKKIKSLTAFNEENFETDRAPSLLSYL 116


>gnl|CDD|218177 pfam04615, Utp14, Utp14 protein.  This protein is found to be part
           of a large ribonucleoprotein complex containing the U3
           snoRNA. Depletion of the Utp proteins impedes production
           of the 18S rRNA, indicating that they are part of the
           active pre-rRNA processing complex. This large RNP
           complex has been termed the small subunit (SSU)
           processome.
          Length = 728

 Score = 29.6 bits (67), Expect = 3.9
 Identities = 21/100 (21%), Positives = 42/100 (42%), Gaps = 6/100 (6%)

Query: 266 KEAYESDVPEEDMVEDDEAENENSDEESPNSNSNEDNSQDKAILNSDDETENS-SNADSD 324
           KE  ESD  EE++ +++EA+ E    +    +      +++     + + EN      S 
Sbjct: 448 KEKKESDE-EEELEDEEEAKVEKVANKLLKRSEKAQKEEEE----EELDEENPWLKTTSS 502

Query: 325 HIRSRLRKRKRKQKADKKEGKRKTSSLLMTLQANQKKKTK 364
             +S  ++  +K+ + K +      S        +KKK K
Sbjct: 503 VGKSAKKQDSKKKSSSKLDKAANKISKAAVKVKKKKKKEK 542



 Score = 29.3 bits (66), Expect = 5.5
 Identities = 22/132 (16%), Positives = 42/132 (31%), Gaps = 18/132 (13%)

Query: 270 ESDVPEEDMVEDDEAENENSDEESPNSNSNEDNSQDKAILNSD-------DETENSSNAD 322
           E    EE+  +DDE +++  +         +    +    NS           E     +
Sbjct: 330 EDSDSEEEDEDDDEDDDDGENPWMLRKKLGKLKEGEDDEENSGLLSMKFMQRAEARKKEE 389

Query: 323 SD----------HIRSRLRKRKRKQKADKKEGKRKTSSLLMTLQANQKKKTKRLR-SLRE 371
           +D                 + + ++ + K  G+RK        +A  KK  K  +   +E
Sbjct: 390 NDAEIEELRRELEGEEESDEEENEEPSKKNVGRRKFGPENGEKEAESKKLKKENKNEFKE 449

Query: 372 YFGEDENVYIMD 383
               DE   + D
Sbjct: 450 KKESDEEEELED 461


>gnl|CDD|223726 COG0653, SecA, Preprotein translocase subunit SecA (ATPase, RNA
           helicase) [Intracellular trafficking and secretion].
          Length = 822

 Score = 29.9 bits (68), Expect = 4.0
 Identities = 19/94 (20%), Positives = 31/94 (32%), Gaps = 7/94 (7%)

Query: 250 DEYLAELDFIETVERYKEAYESDVPEEDMVEDDEAENENSDEESPNSNSNEDNSQDKAIL 309
           +  L  LD  E + +  E     +  E +    +AE    D E            D  I 
Sbjct: 618 NRLLEALDLSEFISKMIEDVIKALVGEYIPPPQQAELW--DLEGLIDELKGTVHPDLPIN 675

Query: 310 NSDDETENSSNADSDHIRSRLRKRKRKQKADKKE 343
            SD E     +   + +  R+ K   +    K+E
Sbjct: 676 KSDLE-----DEAEEELAERILKAADEAYDKKEE 704


>gnl|CDD|237748 PRK14537, PRK14537, 50S ribosomal protein L20/unknown domain fusion
           protein; Provisional.
          Length = 230

 Score = 29.2 bits (65), Expect = 4.0
 Identities = 19/108 (17%), Positives = 46/108 (42%), Gaps = 5/108 (4%)

Query: 260 ETVERYKEAYESDVPEEDMVEDDEAENENSDEESPNSNSNEDNSQDKAILNSDDETENSS 319
            T+ + K+   ++V +E  +  +E +    +E+  +   +E+ S  +             
Sbjct: 123 TTITKPKKVLINEVLQEKTINQNEEKTSLQNEKVLSPELSEEKSDSELETQPQKTQLKEK 182

Query: 320 NADSDHIR-SRLRKRKRKQKADKKE----GKRKTSSLLMTLQANQKKK 362
               +HI  S++  ++ K+ A + +     K K + ++  L+   KKK
Sbjct: 183 KPSIEHIDLSKMLLKELKKLAKEHKIPNFNKLKKTEIIKALKKALKKK 230


>gnl|CDD|237312 PRK13236, PRK13236, nitrogenase reductase; Reviewed.
          Length = 296

 Score = 29.2 bits (65), Expect = 4.0
 Identities = 20/63 (31%), Positives = 35/63 (55%), Gaps = 2/63 (3%)

Query: 255 ELDFIETV-ERYKEAYESDVPEEDMVEDDEAENENSDEESPNSN-SNEDNSQDKAILNSD 312
           E++ IET+ +R        VP +++V+  E      +E +P+SN  NE  +  K I+N+D
Sbjct: 195 EIELIETLAKRLNTQMIHFVPRDNIVQHAELRRMTVNEYAPDSNQGNEYRALAKKIINND 254

Query: 313 DET 315
           + T
Sbjct: 255 NLT 257


>gnl|CDD|220635 pfam10221, DUF2151, Cell cycle and development regulator.  This is
           a set of proteins conserved from worms to humans. The
           proteins are a PAN GU kinase substrate, Mat89Bb,
           essential for S-M cycles of early Drosophila
           embryogenesis, Xenopus embryonic cell cycles and
           morphogenesis, and cell division in cultured mammalian
           cells.
          Length = 692

 Score = 29.8 bits (67), Expect = 4.1
 Identities = 29/127 (22%), Positives = 44/127 (34%), Gaps = 22/127 (17%)

Query: 263 ERYKEAYE------SDVPEEDMVEDDEAENENSDEESPNSNSNEDNSQDKAILNSDDETE 316
           ER+K  YE      S  P  +  EDDEA ++   +E     +      ++   + D E  
Sbjct: 553 ERHKAVYECLMTCRSADPLLE--EDDEAGDKMFVKEPFFEKTRGTADTEEGEGSGDREKN 610

Query: 317 NSSN----ADSDHIRSRLRKRKR-----KQKADKKEGKRKTSSLLMTLQANQKKKTKRLR 367
                    DS      L K+K            K+G     S     +  +K+  KR  
Sbjct: 611 MLKEFEKYTDSPDSPEPLEKKKNAKIEELLPEMAKKGPVSLLS-NWCSRIEKKESRKR-- 667

Query: 368 SLREYFG 374
             RE+ G
Sbjct: 668 --REFVG 672


>gnl|CDD|217829 pfam03985, Paf1, Paf1.  Members of this family are components of
           the RNA polymerase II associated Paf1 complex. The Paf1
           complex functions during the elongation phase of
           transcription in conjunction with Spt4-Spt5 and
           Spt16-Pob3i.
          Length = 431

 Score = 29.3 bits (66), Expect = 4.1
 Identities = 17/81 (20%), Positives = 27/81 (33%), Gaps = 8/81 (9%)

Query: 264 RYKEAYESDVPEEDMVEDDEAENENSDEESPNSNSNEDNSQDKAILNSDDETENSSNADS 323
           R       D  E D  ED+E E + SDE       + +              E+ S+  S
Sbjct: 358 RRARLDPIDFEEVDEDEDEE-EEQRSDEHEEEEGEDSEE-------EGSQSREDGSSESS 409

Query: 324 DHIRSRLRKRKRKQKADKKEG 344
             + S    +  K+ A   + 
Sbjct: 410 SDVGSDSESKADKESASDSDS 430


>gnl|CDD|218658 pfam05616, Neisseria_TspB, Neisseria meningitidis TspB protein.
           This family consists of several Neisseria meningitidis
           TspB virulence factor proteins.
          Length = 502

 Score = 29.6 bits (66), Expect = 4.2
 Identities = 21/81 (25%), Positives = 35/81 (43%), Gaps = 6/81 (7%)

Query: 267 EAYESDVPEEDMVEDDEAENENSDEE---SPNSNSNEDNSQDKAILNSDDETENSSNADS 323
           EA E+    E    ++ A N N  E     PN   + D + D    N D + +  +  DS
Sbjct: 325 EAPEAQPLPEVSPAENPANNPNPRENPGTRPNPEPDPDLNPDA---NPDTDGQPGTRPDS 381

Query: 324 DHIRSRLRKRKRKQKADKKEG 344
             +  R   R RK++ + ++G
Sbjct: 382 PAVPDRPNGRHRKERKEGEDG 402


>gnl|CDD|238268 cd00481, Ribosomal_L19e, Ribosomal protein L19e.  L19e is found in
           the large ribosomal subunit of eukaryotes and archaea.
           L19e is distinct from the ribosomal subunit L19, which
           is found in prokaryotes. It consists of two small
           globular domains connected by an extended segment. It is
           located toward the surface of the large subunit, with
           one exposed end involved in forming the intersubunit
           bridge with the small subunit.  The other exposed end is
           involved in forming the translocon binding site, along
           with L22, L23, L24, L29, and L31e subunits.
          Length = 145

 Score = 28.3 bits (64), Expect = 4.7
 Identities = 17/53 (32%), Positives = 23/53 (43%), Gaps = 10/53 (18%)

Query: 325 HIRSRLRKR--KRKQKADKKEGKRKTSSLLMTLQANQKKKT---KRLRSLREY 372
           H R R RKR   R++   +  G RK      T  A    K    +R+R+LR  
Sbjct: 55  HSRGRARKRHEARRKGRHRGPGSRKG-----TKGARMPSKELWIRRIRALRRL 102


>gnl|CDD|148630 pfam07133, Merozoite_SPAM, Merozoite surface protein (SPAM).  This
           family consists of several Plasmodium falciparum SPAM
           (secreted polymorphic antigen associated with
           merozoites) proteins. Variation among SPAM alleles is
           the result of deletions and amino acid substitutions in
           non-repetitive sequences within and flanking the alanine
           heptad-repeat domain. Heptad repeats in which the a and
           d position contain hydrophobic residues generate
           amphipathic alpha-helices which give rise to helical
           bundles or coiled-coil structures in proteins. SPAM is
           an example of a P. falciparum antigen in which a
           repetitive sequence has features characteristic of a
           well-defined structural element.
          Length = 164

 Score = 28.7 bits (64), Expect = 4.8
 Identities = 27/101 (26%), Positives = 42/101 (41%), Gaps = 13/101 (12%)

Query: 239 SDANEEGKNYGDEYLAELDFIETVERYKEAYESDVPEEDMVEDDEAENEN-------SDE 291
              NE+ K+   E   E +  E  E   E  E    EE++VED+E E E+        D 
Sbjct: 37  IKENEDVKDEKQEDDEEEE--EEDEEEIEEPEDIEDEEEIVEDEEEEEEDEEDNVDLKDI 94

Query: 292 ESPNSNSNEDNSQD----KAILNSDDETENSSNADSDHIRS 328
           E  N N   +++QD      I  +  + E S     D +++
Sbjct: 95  EKKNINDIFNSTQDDNAQNLISKNYKKNEKSKKTAEDIVKT 135


>gnl|CDD|133377 cd04177, RSR1, RSR1/Bud1p family GTPase.  RSR1/Bud1p is a member of
           the Rap subfamily of the Ras family that is found in
           fungi. In budding yeasts, RSR1 is involved in selecting
           a site for bud growth on the cell cortex, which directs
           the establishment of cell polarization. The Rho family
           GTPase cdc42 and its GEF, cdc24, then establish an axis
           of polarized growth by organizing the actin cytoskeleton
           and secretory apparatus at the bud site. It is believed
           that cdc42 interacts directly with RSR1 in vivo. In
           filamentous fungi, polar growth occurs at the tips of
           hyphae and at novel growth sites along the extending
           hypha. In Ashbya gossypii, RSR1 is a key regulator of
           hyphal growth, localizing at the tip region and
           regulating apical polarization of the actin
           cytoskeleton. Most Ras proteins contain a lipid
           modification site at the C-terminus, with a typical
           sequence motif CaaX, where a = an aliphatic amino acid
           and X = any amino acid. Lipid binding is essential for
           membrane attachment, a key feature of most Ras proteins.
          Length = 168

 Score = 28.6 bits (64), Expect = 5.3
 Identities = 11/28 (39%), Positives = 18/28 (64%), Gaps = 4/28 (14%)

Query: 387 SGNIGRYLNHSCTPNVFVQNVFVDTHDP 414
           +G +G+    S     FVQNVF++++DP
Sbjct: 9   AGGVGK----SALTVQFVQNVFIESYDP 32


>gnl|CDD|217503 pfam03344, Daxx, Daxx Family.  The Daxx protein (also known as the
           Fas-binding protein) is thought to play a role in
           apoptosis, but the precise role played by Daxx remains to be
           determined. Daxx forms a complex with Axin.
          Length = 715

 Score = 29.1 bits (65), Expect = 5.3
 Identities = 18/55 (32%), Positives = 26/55 (47%), Gaps = 1/55 (1%)

Query: 270 ESDVPEEDMVEDDEAENENSDEESPNSNSNEDNSQDKAILNSDDETENSSNADSD 324
           E  V EE+  E++E E E   EE    +  E+   +     S++E E SS  D D
Sbjct: 442 EESVEEEEEEEEEEEEEEQESEEEEGEDEEEEEEVEADN-GSEEEMEGSSEGDGD 495


>gnl|CDD|227952 COG5665, NOT5, CCR4-NOT transcriptional regulation complex, NOT5
           subunit [Transcription].
          Length = 548

 Score = 29.3 bits (65), Expect = 5.4
 Identities = 15/72 (20%), Positives = 30/72 (41%), Gaps = 9/72 (12%)

Query: 257 DFIETVERYKEAYESDVPEEDMVEDDEAENENSDEESPNSNSNE-----DNSQDKAILNS 311
           +F + ++ Y E  +      D +E D    +   E  P+S++NE     +N    + + S
Sbjct: 169 EFQDDIKYYVENNDD----PDFIEYDTIYEDMGCEIQPSSSNNEAPKEGNNQTSLSSIRS 224

Query: 312 DDETENSSNADS 323
             + E S    +
Sbjct: 225 SKKQERSPKKKA 236


>gnl|CDD|217476 pfam03286, Pox_Ag35, Pox virus Ag35 surface protein. 
          Length = 198

 Score = 28.6 bits (64), Expect = 5.4
 Identities = 15/59 (25%), Positives = 29/59 (49%), Gaps = 7/59 (11%)

Query: 260 ETVERYKEAYESDVPEEDMVEDDEAENENSDEESPNSNSNEDNSQDKAILNSDDETENS 318
            T E  K   + D  EE+  E+D   NE S +   ++++N  +  D     ++D+ ++S
Sbjct: 76  LTEEEKKPESDDDKTEEN--ENDPDNNEESGDSQESASANSLSDID-----NEDDMDDS 127


>gnl|CDD|183859 PRK13103, secA, preprotein translocase subunit SecA; Reviewed.
          Length = 913

 Score = 29.2 bits (65), Expect = 6.0
 Identities = 18/53 (33%), Positives = 28/53 (52%)

Query: 324 DHIRSRLRKRKRKQKADKKEGKRKTSSLLMTLQANQKKKTKRLRSLREYFGED 376
           DH+R  +  R   QK  K+E KR++ +L   L  + K+ T R+ S  +   ED
Sbjct: 783 DHLRHGIHLRGYAQKNPKQEYKRESFTLFQELLDSIKRDTIRVLSHVQVRRED 835


>gnl|CDD|214703 smart00508, PostSET, Cysteine-rich motif following a subset of SET
           domains. 
          Length = 17

 Score = 25.4 bits (57), Expect = 6.1
 Identities = 6/12 (50%), Positives = 7/12 (58%)

Query: 451 CYCGSSECRQRL 462
           C CG+  CR  L
Sbjct: 5   CLCGAPNCRGFL 16


>gnl|CDD|220785 pfam10498, IFT57, Intra-flagellar transport protein 57.  Eukaryotic
           cilia and flagella are specialised organelles found at
           the periphery of cells of diverse organisms.
           Intra-flagellar transport (IFT) is required for the
           assembly and maintenance of eukaryotic cilia and
           flagella, and consists of the bidirectional movement of
           large protein particles between the base and the distal
           tip of the organelle. IFT particles contain multiple
           copies of two distinct protein complexes, A and B, which
           contain at least 6 and 11 protein subunits. IFT57 is
           part of complex B but is not, however, required for the
           core subunits to stay associated. This protein is known
           as Huntington-interacting protein-1 in humans.
          Length = 355

 Score = 28.9 bits (65), Expect = 6.1
 Identities = 16/68 (23%), Positives = 29/68 (42%), Gaps = 3/68 (4%)

Query: 251 EYLAELDFIETVERYK-EAYESDVPEEDMVEDDEAENENSDEESPNSNSNEDNSQDKAIL 309
           +Y  E D  E V+    E    +V EE  +E+ + +    + +    +++    Q K +L
Sbjct: 125 KYPNEEDEEENVDEDDAEIILEEVEEEVEIEEVDDDEGTQETKYKRGDTSLTP-QAKDVL 183

Query: 310 NSD-DETE 316
            S  D  E
Sbjct: 184 ESLIDAAE 191


>gnl|CDD|235640 PRK05901, PRK05901, RNA polymerase sigma factor; Provisional.
          Length = 509

 Score = 28.8 bits (65), Expect = 6.8
 Identities = 16/110 (14%), Positives = 46/110 (41%), Gaps = 2/110 (1%)

Query: 237 TDSDANEEGKNYGDEYLAELDFIETVERYKEAYESDVPEEDMVEDDEA-ENENSDEESPN 295
            DS    E KN  D+        +     +   + D  ++D ++DD+  ++++ +++  +
Sbjct: 100 LDSSKKAEKKNALDKDDDLNYVKDIDVLNQADDDDDDDDDDDLDDDDIDDDDDDEDDDED 159

Query: 296 SNSNEDNSQDKAILNSDDETENSSNADSDHIRSRLRKRKRKQKADKKEGK 345
            + ++ + +D+    + +  + S + D           ++ +K D K   
Sbjct: 160 DDDDDVDDEDEEKKEAKELEKLSDDDDFVWDEDDSEALRQARK-DAKLTA 208


>gnl|CDD|202096 pfam02029, Caldesmon, Caldesmon. 
          Length = 431

 Score = 28.8 bits (64), Expect = 7.0
 Identities = 27/138 (19%), Positives = 53/138 (38%), Gaps = 7/138 (5%)

Query: 241 ANEEGKNYGDEYLAELDFIETVERYKEAYESDVPEEDMVEDDEAENENSDEESPNSNSNE 300
             EE K    E   E++  E V + ++  +    EE   E+ E E E  ++    S    
Sbjct: 111 VEEEEKEESREEREEVEETEGVTKSEQKNDWRDAEECQKEEKEPEPEEEEKPKRGSLEEN 170

Query: 301 DNSQDKAILNSDDETENSSNADSDHIRSRLRKRKRKQK-------ADKKEGKRKTSSLLM 353
           +       L   + T +   A+   + +     K KQK        ++ + KR+    ++
Sbjct: 171 NGEFMTHKLKHTENTFSRGGAEGAQVEAGKEFEKLKQKQQEAALELEELKKKREERRKVL 230

Query: 354 TLQANQKKKTKRLRSLRE 371
             +  ++K+ +  R  RE
Sbjct: 231 EEEEQRRKQEEADRKSRE 248


>gnl|CDD|129694 TIGR00606, rad50, rad50.  All proteins in this family for which
            functions are known are involved in recombination,
            recombinational repair, and/or non-homologous end
            joining. They are components of an exonuclease complex
            with MRE11 homologs. This family is distantly related to
            the SbcC family of bacterial proteins. This family is
            based on the phylogenomic analysis of JA Eisen (1999,
            Ph.D. Thesis, Stanford University).
          Length = 1311

 Score = 28.9 bits (64), Expect = 7.1
 Identities = 22/108 (20%), Positives = 45/108 (41%), Gaps = 5/108 (4%)

Query: 266  KEAYESDVPEEDMVEDDEAENENSDEESPNSNSNEDNSQDKAILNSDDETENSSNADSDH 325
            K+A E D P E  +E D+ E E        S+    N + +  +N   E   + +     
Sbjct: 905  KDAKEQDSPLETFLEKDQQEKE-----ELISSKETSNKKAQDKVNDIKEKVKNIHGYMKD 959

Query: 326  IRSRLRKRKRKQKADKKEGKRKTSSLLMTLQANQKKKTKRLRSLREYF 373
            I ++++  K      K+      ++ L   + +Q+K  + +R +R+  
Sbjct: 960  IENKIQDGKDDYLKQKETELNTVNAQLEECEKHQEKINEDMRLMRQDI 1007


>gnl|CDD|218336 pfam04935, SURF6, Surfeit locus protein 6.  The surfeit locus
           protein SURF-6 is shown to be a component of the
           nucleolar matrix and has a strong binding capacity for
           nucleic acids.
          Length = 206

 Score = 28.0 bits (63), Expect = 7.4
 Identities = 23/109 (21%), Positives = 40/109 (36%), Gaps = 20/109 (18%)

Query: 263 ERYKEAYESDVPEEDMVEDDEAENENSDEESPNSNSNEDNSQDKAILNSDDETENSSNAD 322
           +R +   + D  + +  E    EN++  + +P  N+  +    K      ++        
Sbjct: 25  KRKEAKKKEDAQKSEAEEVKNEENKSKKKAAPIENAEGNIVFSKVEFADGEQA------- 77

Query: 323 SDHIRSRLRKRKRKQKADKKEGKRKTSSLLMTLQANQKKKTKRLRSLRE 371
                    K+  K K  KK+ K     LL  L+A +KK    L  L E
Sbjct: 78  ---------KKDLKLKKKKKKKKTDYKQLLKKLEARKKK----LEELDE 113


>gnl|CDD|235549 PRK05658, PRK05658, RNA polymerase sigma factor RpoD; Validated.
          Length = 619

 Score = 28.6 bits (65), Expect = 7.4
 Identities = 20/100 (20%), Positives = 39/100 (39%), Gaps = 15/100 (15%)

Query: 280 EDDEAENENSDEESPNSNSNEDNSQDKAILNSDDETENSSNADSDHIRSRLRKR---KRK 336
            D  AE + +   S     ++D  ++      +DE ++S  AD   +  ++ ++     K
Sbjct: 174 VDPNAEEDPAHVGSELEELDDDEDEE----EEEDENDDSLAADESELPEKVLEKFKALAK 229

Query: 337 QKA------DKKEGKRKTSSLLMTLQANQKKKTKRLRSLR 370
           Q        +KK   R            ++K  + L+SLR
Sbjct: 230 QYKKLRKAQEKKVEGRLAQHK--KYAKLREKLKEELKSLR 267


>gnl|CDD|235401 PRK05306, infB, translation initiation factor IF-2; Validated.
          Length = 746

 Score = 28.7 bits (65), Expect = 7.8
 Identities = 19/106 (17%), Positives = 38/106 (35%)

Query: 260 ETVERYKEAYESDVPEEDMVEDDEAENENSDEESPNSNSNEDNSQDKAILNSDDETENSS 319
           E +E+ KE           VE++EA  E +  E+      E      A    + + E ++
Sbjct: 19  ELLEKLKELGIEVKSHSSTVEEEEARKEEAKREAEEEAKAEAEEAAAAEAEEEAKAEAAA 78

Query: 320 NADSDHIRSRLRKRKRKQKADKKEGKRKTSSLLMTLQANQKKKTKR 365
            A ++         +   +  + E  R   +     +A +  K K+
Sbjct: 79  AAPAEEAAEAAAAAEAAARPAEDEAARPAEAAARRPKAKKAAKKKK 124


>gnl|CDD|184885 PRK14891, PRK14891, 50S ribosomal protein L24e/unknown domain
           fusion protein; Provisional.
          Length = 131

 Score = 27.6 bits (61), Expect = 8.6
 Identities = 13/58 (22%), Positives = 27/58 (46%), Gaps = 6/58 (10%)

Query: 267 EAYESDVPEEDMVEDDEAENENSDEESPNSNSNEDNSQDKAILNSDDETENSSNADSD 324
            A E++  + D   D+ AE + +DE       +E+   D+A+  + DE +  +    +
Sbjct: 72  AAEEAEAADADEDADEAAEADAADEA------DEEEETDEAVDETADEADAEAEEADE 123


>gnl|CDD|221821 pfam12871, PRP38_assoc, Pre-mRNA-splicing factor 38-associated
           hydrophilic C-term.  This domain is a hydrophilic region
           found at the C-terminus of plant and metazoan
           pre-mRNA-splicing factor 38 proteins. The function is
           not known.
          Length = 97

 Score = 26.7 bits (59), Expect = 8.6
 Identities = 13/87 (14%), Positives = 27/87 (31%), Gaps = 2/87 (2%)

Query: 264 RYKEAYESDVPEEDMVEDDEAENENSDEESPNSNSNEDNSQDKAILNSDDETENSSNADS 323
               A E D+  ++  E +E E++         + +      +            S    
Sbjct: 1   PRVSALEEDL--DEEEESEEEEDDEEIRRKAERDVDRGRRSPRRRTRRRSRRRKRSRKRR 58

Query: 324 DHIRSRLRKRKRKQKADKKEGKRKTSS 350
              R R R R R +    ++   ++ S
Sbjct: 59  RRRRDRDRARYRDRDDRDRDRYDRSRS 85


>gnl|CDD|236304 PRK08581, PRK08581, N-acetylmuramoyl-L-alanine amidase; Validated.
          Length = 619

 Score = 28.6 bits (64), Expect = 8.8
 Identities = 14/84 (16%), Positives = 35/84 (41%), Gaps = 8/84 (9%)

Query: 270 ESDVPEEDMVEDDEAENENSDEESPNSNSNEDNSQDKAILNSDDETENSSNADSDHIRSR 329
            +  P     + +  +   +     NS    D++ ++   +S  + ++ S++  D I  +
Sbjct: 180 NNTKPSTSNKQPNSPKP--TQPNQSNSQPASDDTANQK--SSSKDNQSMSDSALDSILDQ 235

Query: 330 LRKRKRKQKAD----KKEGKRKTS 349
             +  +K + D     K+ K +TS
Sbjct: 236 YSEDAKKTQKDYASQSKKDKTETS 259


>gnl|CDD|236295 PRK08570, rpl19e, 50S ribosomal protein L19e; Reviewed.
          Length = 150

 Score = 27.5 bits (62), Expect = 9.2
 Identities = 17/51 (33%), Positives = 22/51 (43%), Gaps = 10/51 (19%)

Query: 327 RSRLRKRKRKQKAD--KKEGKRKTSSLLMTLQANQKKKT---KRLRSLREY 372
           R R R+R  K+K    +  G RK         A   KK     R+R+LR Y
Sbjct: 60  RGRARERHEKRKKGRRRGPGSRKGKKG-----ARTPKKERWINRIRALRRY 105


>gnl|CDD|184536 PRK14145, PRK14145, heat shock protein GrpE; Provisional.
          Length = 196

 Score = 28.0 bits (62), Expect = 9.6
 Identities = 20/91 (21%), Positives = 35/91 (38%), Gaps = 19/91 (20%)

Query: 266 KEAYESDVPEEDMVEDDEAENENSDEESPNSNSNEDNSQDKAILNSDDETENSSNADSDH 325
            E  E ++ +E   E+ +  N +S+E+      +E   Q++    + DE E         
Sbjct: 1   MEEVEKEINKE---EEKDVNNLSSNEQMEGPPEDEQAQQNQPQQQTVDEIEELKQKLQQK 57

Query: 326 ---------IRSRL-------RKRKRKQKAD 340
                    I  RL       RKR  K+K++
Sbjct: 58  EVEAQEYLDIAQRLKAEFENYRKRTEKEKSE 88


>gnl|CDD|237624 PRK14143, PRK14143, heat shock protein GrpE; Provisional.
          Length = 238

 Score = 27.8 bits (62), Expect = 9.9
 Identities = 28/111 (25%), Positives = 44/111 (39%), Gaps = 23/111 (20%)

Query: 260 ETVERYKEAYESDVPEEDM-----VEDDEAENENSDEESPNSNSNEDNS-------QDKA 307
           E  +   E+ E    +E        +  EAE  + D  S  S +  DN+       Q+  
Sbjct: 19  EAEDNSPESSEEVTEQEAELTNPEGDAAEAE-SSPDSGSAASETAADNAARLAQLEQELE 77

Query: 308 ILNSDDETENSS----NADSDHIRSRLRKRKRKQKAD-KKEGKRKT-SSLL 352
            L  + E  NS      AD D+     RKR  +++ D + + K  T S +L
Sbjct: 78  SLKQELEELNSQYMRIAADFDN----FRKRTSREQEDLRLQLKCNTLSEIL 124


>gnl|CDD|222665 pfam14303, NAM-associated, No apical meristem-associated C-terminal
           domain.  This domain is found in a number of different
           types of plant proteins including NAM-like proteins.
          Length = 147

 Score = 27.3 bits (61), Expect = 10.0
 Identities = 15/66 (22%), Positives = 25/66 (37%)

Query: 302 NSQDKAILNSDDETENSSNADSDHIRSRLRKRKRKQKADKKEGKRKTSSLLMTLQANQKK 361
             + K   +S   T N  N D D   +   KR   +K  K++ +R            +K+
Sbjct: 30  KKKKKRSNSSPGSTSNEENEDEDDESTAESKRPEGRKKAKEKLRRDKLKAKKEEAEKEKE 89

Query: 362 KTKRLR 367
           K +R  
Sbjct: 90  KEERFM 95


  Database: CDD.v3.10
    Posted date:  Mar 20, 2013  7:55 AM
  Number of letters in database: 10,937,602
  Number of sequences in database:  44,354
  
Lambda     K      H
   0.318    0.134    0.415 

Gapped
Lambda     K      H
   0.267   0.0892    0.140 


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 44354
Number of Hits to DB: 22,946,486
Number of extensions: 2183355
Number of successful extensions: 3711
Number of sequences better than 10.0: 1
Number of HSP's gapped: 3213
Number of HSP's successfully gapped: 225
Length of query: 463
Length of database: 10,937,602
Length adjustment: 100
Effective length of query: 363
Effective length of database: 6,502,202
Effective search space: 2360299326
Effective search space used: 2360299326
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.3 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.7 bits)
S2: 61 (27.0 bits)