RPS-BLAST 2.2.26 [Sep-21-2011]

Database: CDD.v3.10 
           44,354 sequences; 10,937,602 total letters

Searching..................................................done

Query= psy5197
         (564 letters)



>gnl|CDD|205150 pfam12925, APP_E2, E2 domain of amyloid precursor protein.  The E2
           domain is the largest of the conserved domains of the
           amyloid precursor protein. The structure of E2 consists
           of two coiled-coil sub-structures connected through a
           continuous helix, and bears an unexpected resemblance to
           the spectrin family of protein structures. E2 can
           reversibly dimerise in solution, and the dimerisation
           occurs along the longest dimension of the molecule in an
           antiparallel orientation, which enables the N-terminal
           substructure of one monomer to pack against the
           C-terminal substructure of a second monomer. The high
           degree of conservation of residues at the putative dimer
           interface suggests that the E2 dimer observed in the
           crystal could be physiologically relevant. Heparin
           sulfate proteoglycans, the putative ligands for the
           precursor present in extracellular matrix, bind to E2 at
           a conserved and positively charged site near the dimer
           interface.
          Length = 193

 Score =  206 bits (527), Expect = 8e-64
 Identities = 83/203 (40%), Positives = 129/203 (63%), Gaps = 12/203 (5%)

Query: 305 STTTPTSTATTKSHATTRVPTPDPYFTHFEPKDEHHAFKEALQRLEEMHREKVTKVMKDW 364
            T +PTS A             DPYF+  +  +EH  +KEA +RLEE HRE++T+VMK+W
Sbjct: 3   PTPSPTSDAV------------DPYFSEPKEDNEHERYKEAKKRLEEKHRERMTQVMKEW 50

Query: 365 SDLEERYQDMRSKSPGVAEDFKQKMTLRFQQTVQSLEEEGNAEKHQLIVMHQQRVAARIN 424
            + E +Y+++    P  A+  ++++T RFQ+TVQ+LE+E  AE+ QL+  HQQRV A +N
Sbjct: 51  EEAESQYKNLPKADPKAAQLMRKELTERFQETVQTLEQEAAAERQQLVETHQQRVEAHLN 110

Query: 425 QHKKDAMNCYIEALNDVSLNTHKVQKCLQKLLRALHKDRHHTIAHYKHLLATNLDFAVKE 484
           + ++ A+  Y+ AL     N HK+ K L++ +RA  KDR HT+ H++H+  T+ + A + 
Sbjct: 111 ERRRAALENYLRALQAEPPNPHKILKALKRYIRAEQKDRQHTLRHFQHVRKTDPEKAAQM 170

Query: 485 KPMTLEHLVDIDHTINQSMTMLQ 507
           +P  LEHL  ID  +NQS+T+L 
Sbjct: 171 RPQVLEHLRVIDERMNQSLTLLY 193


>gnl|CDD|128326 smart00006, A4_EXTRA, amyloid A4.  amyloid A4 precursor of
           Alzheimer's disease.
          Length = 165

 Score =  185 bits (470), Expect = 8e-56
 Identities = 61/113 (53%), Positives = 76/113 (67%), Gaps = 1/113 (0%)

Query: 1   IYPKHDITNIVESSNYVKITNWCKVGHSKCKHTD-WVKPYRCLEGPFQSDALLVPEHCVF 59
            YP+  ITN+VE+S  V I NWC+ G S+CK    +V P+RCL G F SDALLVPE C F
Sbjct: 53  AYPELQITNVVEASQPVTIQNWCRRGRSQCKTHHHFVIPFRCLVGEFVSDALLVPEGCQF 112

Query: 60  DHIHNQSKCWEYERWNQTAAQSCLERDLSLRSFAMLLPCGISLFAGVEFVCCP 112
            H     +C  ++RW+Q A ++C E+ + L SF MLLPCGI  F GVEFVCCP
Sbjct: 113 LHQERMDQCETHQRWHQEAKEACSEKGMILHSFGMLLPCGIDKFRGVEFVCCP 165


>gnl|CDD|193397 pfam12924, APP_Cu_bd, Copper-binding of amyloid precursor, CuBD.
           This short domain, part of the extra-cellular N-terminus
           of the amyloid precursor protein, APP, can bind both
           copper and zinc, CuBD. The structure of Cu2+-bound CuBD
           reveals that the metal ligands are His147, His151,
           Tyr168 and two water molecules, which are arranged in a
           square pyramidal geometry. The structure of Cu+-bound
           CuBD is almost identical to the Cu2+-bound structure
           except for the loss of one of the water ligands. The
           geometry of the site is unfavourable for Cu+, thus
           providing a mechanism by which CuBD could readily
           transfer Cu ions to other proteins.
          Length = 57

 Score =  107 bits (270), Expect = 1e-28
 Identities = 30/57 (52%), Positives = 37/57 (64%)

Query: 56  HCVFDHIHNQSKCWEYERWNQTAAQSCLERDLSLRSFAMLLPCGISLFAGVEFVCCP 112
            C FDHIH    C  ++ W+  A ++C  + + L SF MLLPCGI LF GVEFVCCP
Sbjct: 1   KCKFDHIHRMDVCESFQHWHTVAKEACSTKGMELHSFGMLLPCGIDLFTGVEFVCCP 57


>gnl|CDD|216917 pfam02177, APP_N, Amyloid A4 N-terminal heparin-binding.  This
           N-terminal domain of APP, amyloid precursor protein, is
           the heparin-binding domain of the protein. This region
           is also responsible for stimulation of neurite
           outgrowth. The structure reveals both a highly charged
           basic surface that may interact with glycosaminoglycans
           in the brain and an abutting hydrophobic surface that is
           proposed to play an important functional role such as in
           dimerisation or ligand-binding. Structural similarities
           with cysteine-rich growth factors, taken together with
           its known growth-promoting properties, suggest the APP
           N-terminal domain could function as a growth factor in
           vivo.
          Length = 102

 Score =  103 bits (260), Expect = 9e-27
 Identities = 38/56 (67%), Positives = 43/56 (76%), Gaps = 1/56 (1%)

Query: 1   IYPKHDITNIVESSNYVKITNWCKVGHSKCK-HTDWVKPYRCLEGPFQSDALLVPE 55
           +YP+ DITN+VE+S  V I+NWCK   SKCK HT  VKPYRCL G F SDALLVPE
Sbjct: 47  VYPELDITNVVEASQPVTISNWCKFNRSKCKSHTHTVKPYRCLVGEFVSDALLVPE 102


>gnl|CDD|151071 pfam10515, APP_amyloid, beta-amyloid precursor protein C-terminus. 
           This is the amyloid, C-terminal, protein of the
           beta-Amyloid precursor protein (APP) which is a
           conserved and ubiquitous transmembrane glycoprotein
           strongly implicated in the pathogenesis of Alzheimer's
           disease but whose normal biological function is unknown.
           The C-terminal 100 residues are released and aggregate
           into amyloid deposits which are strongly implicated in
           the pathology of Alzheimer's disease plaque-formation.
           The domain is associated with family A4_EXTRA,
           pfam02177, further towards the N-terminus.
          Length = 53

 Score = 80.3 bits (198), Expect = 6e-19
 Identities = 30/62 (48%), Positives = 39/62 (62%), Gaps = 9/62 (14%)

Query: 224 AAVFIAMTVLKRRSARSPQNLCNVFFYFQGFIEVDQAATPEERHVANMQINGYENPTYKY 283
             + +++ +L+RR   +            G +EVD A TPEERH+ANMQ NGYENPTYKY
Sbjct: 1   IVIVVSLAMLRRRPYGAIS---------HGVVEVDPALTPEERHLANMQNNGYENPTYKY 51

Query: 284 FE 285
           FE
Sbjct: 52  FE 53


>gnl|CDD|217803 pfam03938, OmpH, Outer membrane protein (OmpH-like).  This family
           includes outer membrane proteins such as OmpH among
           others. Skp (OmpH) has been characterized as a molecular
           chaperone that interacts with unfolded proteins as they
           emerge in the periplasm from the Sec translocation
           machinery.
          Length = 157

 Score = 38.8 bits (91), Expect = 0.002
 Identities = 20/101 (19%), Positives = 46/101 (45%), Gaps = 1/101 (0%)

Query: 337 DEHHAFKEALQRLEEMHREKVTKVMKDWSDLEERYQDMRSKSPGVAEDFKQKMTLRFQQT 396
            E  A K A ++LE+  ++   ++ K   +L++  Q ++ ++  ++E+ ++      QQ 
Sbjct: 28  SESPAGKAAQKQLEKEFKKLQAELQKKEKELQKEEQKLQKQAATLSEEARKAKQQELQQK 87

Query: 397 VQSLEEEGNAEKHQLIVMHQQRVAARINQHKKDAMNCYIEA 437
            Q L+++  A   Q +   QQ +   I      A+    + 
Sbjct: 88  QQELQQKQQA-AQQELQQKQQELLQPIYDKIDKAIKEVAKE 127


>gnl|CDD|115579 pfam06933, SSP160, Special lobe-specific silk protein SSP160.  This
           family consists of several special lobe-specific silk
           protein SSP160 sequences which appear to be specific to
           Chironomus (Midge) species.
          Length = 758

 Score = 37.4 bits (86), Expect = 0.020
 Identities = 16/38 (42%), Positives = 23/38 (60%), Gaps = 2/38 (5%)

Query: 290 DSYENIVSPSS--GPASSTTTPTSTATTKSHATTRVPT 325
            +Y+N  +P S   PA++ TT +ST TT +  TT  PT
Sbjct: 662 AAYQNCTAPGSVTVPAAANTTTSSTTTTTTTTTTAAPT 699



 Score = 29.7 bits (66), Expect = 5.5
 Identities = 20/65 (30%), Positives = 29/65 (44%)

Query: 294 NIVSPSSGPASSTTTPTSTATTKSHATTRVPTPDPYFTHFEPKDEHHAFKEALQRLEEMH 353
           N  S S+  ++STT   ST TT S  +T   +     T     D    F  ALQ L+ + 
Sbjct: 284 NSTSNSNSTSNSTTNSNSTTTTNSTTSTNSTSSSNSSTIAGCIDIAANFTIALQNLQALL 343

Query: 354 REKVT 358
            ++ T
Sbjct: 344 LQEAT 348


>gnl|CDD|221641 pfam12569, NARP1, NMDA receptor-regulated protein 1.  This domain
           family is found in eukaryotes, and is approximately 40
           amino acids in length. The family is found in
           association with pfam07719, pfam00515. There is a single
           completely conserved residue L that may be functionally
           important. NARP1 is the mammalian homologue of a yeast
           N-terminal acetyltransferase that regulates entry into
           the G(0) phase of the cell cycle.
          Length = 516

 Score = 34.5 bits (80), Expect = 0.15
 Identities = 18/53 (33%), Positives = 24/53 (45%), Gaps = 2/53 (3%)

Query: 112 PMKDKERERFLEKQRKEVHKHEREELREEKARVKAAAEGRTYEPTGPSTPIPP 164
            +   ER++  +KQRK   K E+EE   EKA  K  AE    +  GP      
Sbjct: 405 NLSPAERKKLRKKQRKAEKKAEKEE--AEKAAAKKKAEAAAKKAKGPDGETKK 455


>gnl|CDD|219655 pfam07946, DUF1682, Protein of unknown function (DUF1682).  The
           members of this family are all hypothetical eukaryotic
           proteins of unknown function. One member is described as
           being an adipocyte-specific protein, but no evidence of
           this was found.
          Length = 322

 Score = 33.0 bits (76), Expect = 0.37
 Identities = 12/35 (34%), Positives = 16/35 (45%)

Query: 115 DKERERFLEKQRKEVHKHEREELREEKARVKAAAE 149
           DK RE   EK  K   +  +EE +E+K   K    
Sbjct: 262 DKTREEEEEKILKAAEEERQEEAQEKKEEKKKEER 296


>gnl|CDD|219102 pfam06600, DUF1140, Protein of unknown function (DUF1140).  This
           family consists of several short, hypothetical phage and
           bacterial proteins. The function of this family is
           unknown.
          Length = 107

 Score = 30.7 bits (69), Expect = 0.51
 Identities = 19/80 (23%), Positives = 38/80 (47%), Gaps = 4/80 (5%)

Query: 447 KVQKCLQKLLRALHKDRHHTIAHYKHLLATNLDFAVKEKPMTLEHLVDIDHTINQSMTML 506
           +  +C +K+  +    R  T  H+K     NL+F ++EK   L  L+++D +   S  + 
Sbjct: 31  RANRCREKIAESGLCVR--TSRHWKA--QENLEFYIREKSFLLHQLLELDRSYRWSEKLH 86

Query: 507 QRHPALAVKISELMQDYMQA 526
           Q   +   K   ++++Y Q 
Sbjct: 87  QDRYSFVTKYVAVLEEYRQE 106


>gnl|CDD|225715 COG3174, COG3174, Predicted membrane protein [Function unknown].
          Length = 371

 Score = 32.7 bits (75), Expect = 0.55
 Identities = 36/191 (18%), Positives = 60/191 (31%), Gaps = 40/191 (20%)

Query: 84  ERDLSLRS-FAMLLPCGISLFAGVEFVCCPMKDKERE------RFLEKQRKEVHKHER-- 134
           +RDL + +  A+L    +   AG       M D E          L   ++ +H+  R  
Sbjct: 43  DRDLGVTTPIALLATFALGALAG-------MGDLEAAAGGIVLALLLASKEPLHRFLRRL 95

Query: 135 --EELREEKARVKAAAEGRTYEPTGPSTPIPPGVDAHPPYSSQRHDTVQPAYAMSHDLSI 192
             EELR   A   AA     Y P  P+  + P       +             ++   +I
Sbjct: 96  SWEELRS--ALELAALAAVVY-PVLPNGGVDPWGGPREVWL--------MVVLIA---AI 141

Query: 193 GEPSYLRHEVRPRGDSKGVYVTVVFAGLAVMAAVFIAMTVLKR------RSARSPQNLCN 246
               Y+   VR  G  +G+ +T +  G     AV        R          +   L  
Sbjct: 142 SFAGYI--AVRILGGRRGLILTGLIGGFVSSTAVTATFAARVRIGEDVLPPEAAAALLAA 199

Query: 247 VFFYFQGFIEV 257
                +  + +
Sbjct: 200 AVMLIRNLLLI 210


>gnl|CDD|163153 TIGR03142, cytochro_ccmI, cytochrome c-type biogenesis protein
           CcmI.  This TPR repeat-containing protein is the CcmI
           protein (also called CycH) of c-type cytochrome
           biogenesis. CcmI is thought to act as an apo-cytochrome
           c chaperone. This model describes the N-terminal region
           of the protein, Members of this protein family [Protein
           fate, Protein folding and stabilization, Energy
           metabolism, Electron transport].
          Length = 117

 Score = 31.1 bits (71), Expect = 0.55
 Identities = 12/46 (26%), Positives = 25/46 (54%), Gaps = 2/46 (4%)

Query: 215 VVFAGLAVMAAVFIAMTVLKRRSARSPQNL--CNVFFYFQGFIEVD 258
           +V A L ++A +F+ + +L+RR A +  +    N+  Y     E++
Sbjct: 4   IVAALLTLVALLFLLLPLLRRRRAAATVDRDELNLAVYRDRLAELE 49


>gnl|CDD|236993 PRK11820, PRK11820, hypothetical protein; Provisional.
          Length = 288

 Score = 32.4 bits (75), Expect = 0.56
 Identities = 18/70 (25%), Positives = 35/70 (50%), Gaps = 10/70 (14%)

Query: 341 AFKEALQRLEEMHREKVTKVMKDWSDLEERYQDM-------RSKSPGVAEDFKQKMTLRF 393
           A  EAL  L EM RE+    +K   DL +R   +        + +P + E++++++  R 
Sbjct: 134 ALDEALDDLIEM-REREGAALKA--DLLQRLDAIEALVAKIEALAPEILEEYRERLRERL 190

Query: 394 QQTVQSLEEE 403
           ++ +  L+E 
Sbjct: 191 EELLGELDEN 200


>gnl|CDD|218738 pfam05766, NinG, Bacteriophage Lambda NinG protein.  NinG or Rap is
           involved in recombination. Rap (recombination adept with
           plasmid) increases lambda-by-plasmid recombination
           catalyzed by Escherichia coli's RecBCD pathway.
          Length = 188

 Score = 31.2 bits (71), Expect = 0.83
 Identities = 16/44 (36%), Positives = 25/44 (56%), Gaps = 4/44 (9%)

Query: 109 VCCP---MKDKERERFLEKQRKEVHKHEREELREEKARVKAAAE 149
           VC P   +  K RE+  EK+RK   + ER EL+  K ++K  ++
Sbjct: 25  VCSPECALALK-REKAQEKKRKAEAQAERRELKARKEKLKTRSD 67


>gnl|CDD|224328 COG1410, MetH, Methionine synthase I, cobalamin-binding domain
           [Amino acid transport and metabolism].
          Length = 842

 Score = 31.5 bits (72), Expect = 1.2
 Identities = 14/66 (21%), Positives = 23/66 (34%), Gaps = 4/66 (6%)

Query: 103 FAGVEFVCCPMKDKERERFLEKQRKEVHK----HEREELREEKARVKAAAEGRTYEPTGP 158
              V  +   M  ++R  + E  RKE       H   + R     ++AA +         
Sbjct: 521 SRAVGVMDTLMSAEQRADYSEGFRKEYETVRTQHANRKARTRPLSIEAARDNAEAVWADY 580

Query: 159 STPIPP 164
             P+PP
Sbjct: 581 EPPVPP 586


>gnl|CDD|139531 PRK13383, PRK13383, acyl-CoA synthetase; Provisional.
          Length = 516

 Score = 31.5 bits (71), Expect = 1.3
 Identities = 25/102 (24%), Positives = 34/102 (33%), Gaps = 19/102 (18%)

Query: 138 REEKARVKAAAEGRTYEPTGPSTPIPPGVDAHPPYSSQRHDTVQPAYAMSHDLSIGEPSY 197
            E   R   AA GR    T  +T  P GV   P   S                ++G    
Sbjct: 164 EESGGRPAVAAPGRIVLLTSGTTGKPKGVPRAPQLRS----------------AVGVWVT 207

Query: 198 LRHEVRPRGDSKGVYVTVVFAGLA---VMAAVFIAMTVLKRR 236
           +    R R  S+      +F GL    +M  + +  TVL  R
Sbjct: 208 ILDRTRLRTGSRISVAMPMFHGLGLGMLMLTIALGGTVLTHR 249


>gnl|CDD|220371 pfam09736, Bud13, Pre-mRNA-splicing factor of RES complex.  This
           entry is characterized by proteins with alternating
           conserved and low-complexity regions. Bud13 together
           with Snu17p and a newly identified factor,
           Pml1p/Ylr016c, form a novel trimeric complex called the
           RES complex, the pre-mRNA retention and splicing complex.
           Subunits of this complex are not essential for viability
           of yeasts but they are required for efficient splicing
           in vitro and in vivo. Furthermore, inactivation of this
           complex causes pre-mRNA leakage from the nucleus. Bud13
           contains a unique, phylogenetically conserved C-terminal
           region of unknown function.
          Length = 141

 Score = 30.0 bits (68), Expect = 1.6
 Identities = 12/40 (30%), Positives = 19/40 (47%)

Query: 114 KDKERERFLEKQRKEVHKHEREELREEKARVKAAAEGRTY 153
           K+++ E+  E  +  V K ERE+  EE  + K     R  
Sbjct: 28  KERKEEKEKEWGKGLVQKEEREKRLEELEKAKNKPLARYA 67


>gnl|CDD|151642 pfam11200, DUF2981, Protein of unknown function (DUF2981).  This
           eukaryotic family of proteins has no known function.
          Length = 319

 Score = 31.0 bits (69), Expect = 1.6
 Identities = 15/60 (25%), Positives = 24/60 (40%)

Query: 274 NGYENPTYKYFEIKDYDSYENIVSPSSGPASSTTTPTSTATTKSHATTRVPTPDPYFTHF 333
           N Y+NP  K    KD +S     S ++ P S   T   +    + A  +    D ++ H 
Sbjct: 196 NAYDNPDGKVGAAKDLNSGNTANSVNNTPNSVDNTAKPSDNGSNSAGEKKKEDDSFYDHL 255


>gnl|CDD|221432 pfam12128, DUF3584, Protein of unknown function (DUF3584).  This
           protein is found in bacteria and eukaryotes. Proteins in
           this family are typically between 943 to 1234 amino
           acids in length. This family contains a P-loop motif
           suggesting it is a nucleotide binding protein. It may be
           involved in replication.
          Length = 1198

 Score = 31.2 bits (71), Expect = 2.1
 Identities = 36/226 (15%), Positives = 70/226 (30%), Gaps = 36/226 (15%)

Query: 322 RVPTPDPYFTHFEPKDEHHAFKEALQRLEEMHREKVTKVMKDWSDLEERYQDMRSKSPGV 381
           R+  PD      E ++     +EALQ      ++            EE+     ++    
Sbjct: 592 RLDVPDYAANETELRERLQQAEEALQSAVAKQKQ-----------AEEQLVQANAE---- 636

Query: 382 AEDFKQKMTLRFQQTVQSLEEEGNAEKHQLIVMHQQRVAARINQHKKDAMNCYIEALNDV 441
            E+ K+               +      Q +   QQ +  ++     +        L   
Sbjct: 637 LEEQKRAEAEART------ALKQARLDLQRLQNEQQSLKDKLELAIAERKQQAETQLRQ- 689

Query: 442 SLNTHKVQKCLQKLLRALHKDRHHTIAHYKHLLATNLDFAVKEKPMTLEHLVDIDHTINQ 501
                     L   L+ L + +   +   K             K   +E   ++D+ + Q
Sbjct: 690 ----------LDAQLKQLLEQQQAFLEALKDDFRELR-TERLAKWQVVEG--ELDNQLAQ 736

Query: 502 -SMTMLQRHPALAVKISELMQDYMQALRSKDETPGSLLSLTREAEE 546
            S  +         ++ EL + Y + L S D  P ++  L R+ EE
Sbjct: 737 LSAAIEAARTQAKARLKELKKQYDRELASLDVDPNTVKELKRQIEE 782


>gnl|CDD|216513 pfam01456, Mucin, Mucin-like glycoprotein.  This family of
           trypanosomal proteins resembles vertebrate mucins. The
           protein consists of three regions. The N and C termini
           are conserved between all members of the family, whereas
           the central region is not well conserved and contains a
           large number of threonine residues which can be
           glycosylated. Indirect evidence suggested that these
           genes might encode the core protein of parasite mucins,
           glycoproteins that were proposed to be involved in the
           interaction with, and invasion of, mammalian host cells.
           This family contains an N-terminal signal peptide.
          Length = 143

 Score = 29.8 bits (66), Expect = 2.2
 Identities = 13/42 (30%), Positives = 22/42 (52%)

Query: 284 FEIKDYDSYENIVSPSSGPASSTTTPTSTATTKSHATTRVPT 325
            E  +  S     + ++ P ++TTT T+T TT +  TT+  T
Sbjct: 36  VEAAEGQSQTTTTTTTTTPPTTTTTTTTTTTTITTTTTKTTT 77



 Score = 29.1 bits (64), Expect = 3.9
 Identities = 11/30 (36%), Positives = 17/30 (56%)

Query: 297 SPSSGPASSTTTPTSTATTKSHATTRVPTP 326
           S ++   ++TT PT+T TT +  TT   T 
Sbjct: 43  SQTTTTTTTTTPPTTTTTTTTTTTTITTTT 72


>gnl|CDD|215964 pfam00513, Late_protein_L2, Late Protein L2. 
          Length = 466

 Score = 30.3 bits (69), Expect = 2.6
 Identities = 15/54 (27%), Positives = 18/54 (33%), Gaps = 12/54 (22%)

Query: 289 YDSYENI--------VSPSSGPASSTTTPTSTATTK----SHATTRVPTPDPYF 330
             SYE I           +  P SST  P      +    S A  +V   DP F
Sbjct: 188 THSYEEIPMDTFAVSEGTTPPPISSTPIPGVRRVARLRLYSRALQQVKVTDPAF 241


>gnl|CDD|233045 TIGR00601, rad23, UV excision repair protein Rad23.  All proteins
           in this family for which functions are known are
           components of a multiprotein complex used for targeting
           nucleotide excision repair to specific parts of the
           genome. In humans, Rad23 complexes with the XPC protein.
           This family is based on the phylogenomic analysis of JA
           Eisen (1999, Ph.D. Thesis, Stanford University) [DNA
           metabolism, DNA replication, recombination, and repair].
          Length = 378

 Score = 30.2 bits (68), Expect = 2.8
 Identities = 9/74 (12%), Positives = 24/74 (32%), Gaps = 7/74 (9%)

Query: 296 VSPSSGPASSTTTPTSTATTKSHATTRVPTPDPYFTHFEPKDEH-----HAFKEALQRLE 350
            +     A ++     + + +S   T   +P                     +  ++ + 
Sbjct: 105 PASGMSAAPASAVEEKSPSEESATATAPESPSTSVPSSGSDAASTLVVGSERETTIEEIM 164

Query: 351 EM--HREKVTKVMK 362
           EM   RE+V + ++
Sbjct: 165 EMGYEREEVERALR 178


>gnl|CDD|224117 COG1196, Smc, Chromosome segregation ATPases [Cell division and
           chromosome partitioning].
          Length = 1163

 Score = 30.5 bits (69), Expect = 2.8
 Identities = 39/244 (15%), Positives = 94/244 (38%), Gaps = 28/244 (11%)

Query: 336 KDEHHAFKEALQRLEEMHREKVTKVM---KDWSDLEERYQDMRSKSPGVAEDFKQKMTLR 392
           + E    +E L RLEE   E   ++    K+  +L+   +++R +      +  Q+  L 
Sbjct: 238 RKELEELEEELSRLEEELEELQEELEEAEKEIEELKSELEELREE-----LEELQEELLE 292

Query: 393 FQQTVQSLEEEGNAEKHQL------IVMHQQRVAARINQHKKDAMN-----CYIEALNDV 441
            ++ ++ LE E +  + +L      +   ++R+     + +            +E L  +
Sbjct: 293 LKEEIEELEGEISLLRERLEELENELEELEERLEELKEKIEALKEELEERETLLEELEQL 352

Query: 442 SLNTHKVQKCLQKLLRALHKDRHHTIAHYKHLLATNLDFAVKEKPMTLEHLVDIDHTINQ 501
                + ++ L++ L AL ++        +      L     E       L ++   I  
Sbjct: 353 LAELEEAKEELEEKLSALLEELEELFEALR----EELAELEAELAEIRNELEELKREIES 408

Query: 502 SMTMLQRHPALAVKISELMQDYMQALRSKDETPGSLLSLTREAE--EAILDKYKAQVIAM 559
               L+R   L+ ++ +L ++  +     +E    L  L  E E  E  L++ + ++  +
Sbjct: 409 LEERLER---LSERLEDLKEELKELEAELEELQTELEELNEELEELEEQLEELRDRLKEL 465

Query: 560 QEDF 563
           + + 
Sbjct: 466 EREL 469


>gnl|CDD|226809 COG4372, COG4372, Uncharacterized protein conserved in bacteria
           with the myosin-like domain [Function unknown].
          Length = 499

 Score = 30.4 bits (68), Expect = 2.8
 Identities = 14/90 (15%), Positives = 38/90 (42%), Gaps = 1/90 (1%)

Query: 343 KEALQRLEEMHREKVTKVMKDWSDLEERYQDMRSKSPGVAEDFKQKMTLRFQQTVQSLEE 402
           +   Q L    +    K  ++ + L ++ QD++++   +AE  +Q             + 
Sbjct: 119 EAVRQELAAARQNL-AKAQQELARLTKQAQDLQTRLKTLAEQRRQLEAQAQSLQASQKQL 177

Query: 403 EGNAEKHQLIVMHQQRVAARINQHKKDAMN 432
           + +A + +  V+  +  +A+I Q  ++   
Sbjct: 178 QASATQLKSQVLDLKLRSAQIEQEAQNLAT 207


>gnl|CDD|224477 COG1561, COG1561, Uncharacterized stress-induced protein [Function
           unknown].
          Length = 290

 Score = 29.9 bits (68), Expect = 3.0
 Identities = 19/70 (27%), Positives = 34/70 (48%), Gaps = 10/70 (14%)

Query: 341 AFKEALQRLEEMHREKVTKVMKDWSDLEERYQDM-------RSKSPGVAEDFKQKMTLRF 393
           A +EAL  L EM RE+    +K  +DL +R   +        S  P + E +++++  R 
Sbjct: 136 ALEEALDDLIEM-REREGAALK--ADLLQRLDAIEELVEKVESLMPEILEWYRERLVARL 192

Query: 394 QQTVQSLEEE 403
            +    L+E+
Sbjct: 193 NEAQDQLDED 202


>gnl|CDD|225252 COG2377, COG2377, Predicted molecular chaperone distantly related
           to HSP70-fold metalloproteases [Posttranslational
           modification, protein turnover, chaperones].
          Length = 371

 Score = 30.3 bits (69), Expect = 3.0
 Identities = 13/63 (20%), Positives = 21/63 (33%), Gaps = 2/63 (3%)

Query: 371 YQDMRSKSPGVAEDFKQKMTLRFQQTVQSLEEEGNAEKHQL--IVMHQQRVAARINQHKK 428
               R+ +     +  + + L   Q V +L  E       +  I  H Q V  R   H  
Sbjct: 53  LCAARADTLAELAELDRALALLHAQAVAALLAEQGLLPRDIRAIGCHGQTVLHRPPGHAP 112

Query: 429 DAM 431
           D +
Sbjct: 113 DTV 115


>gnl|CDD|143653 cd07912, Tweety_N, N-terminal domain of the protein encoded by the
           Drosophila tweety gene and related proteins, a family of
           chloride ion channels.  The protein product of the
           Drosophila tweety (tty) gene is thought to form a
           trans-membrane protein with five membrane-spanning
           regions and a cytoplasmic C-terminus. This N-terminal
           domain contains the putative transmembrane spanning
           regions. Tweety has been suggested as a candidate for a
           large conductance chloride channel, both in vertebrate
           and insect cells. Three human homologs have been
           identified and designated TTYH1-3. TTYH2 has been
           associated with the progression of cancer, and
           Drosophila melanogaster tweety has been assumed to play
           a role in development. TTYH2 and TTYH3 bind to and are
           ubiquitinated by Nedd4-2, a HECT type E3 ubiquitin ligase,
           which most likely plays a role in controlling the
           cellular levels of tweety family proteins.
          Length = 418

 Score = 30.0 bits (68), Expect = 3.4
 Identities = 16/68 (23%), Positives = 32/68 (47%), Gaps = 1/68 (1%)

Query: 462 DRHHTIAHYKHLLATNLDFAVKEKPMTLEHLVDIDHTINQSMTMLQRHPALAVKISELMQ 521
           +    + + +  +   L  AV E P   ++L+ +   +N +   L +  AL +    L +
Sbjct: 310 ESQRALTNMQSQVQGLLREAVFEFPTAEDNLLSLQGDLNSTEINLHQLTAL-LDCRGLHK 368

Query: 522 DYMQALRS 529
           DY++ALR 
Sbjct: 369 DYVEALRG 376


>gnl|CDD|219546 pfam07739, TipAS, TipAS antibiotic-recognition domain.  This domain
           is found at the C-terminus of some MerR family
           transcription factors. The domain has an alpha-helical
           globin-like fold. The family includes Mta, a central
           regulator of multidrug resistance in Bacillus subtilis.
          Length = 118

 Score = 28.8 bits (65), Expect = 3.4
 Identities = 15/106 (14%), Positives = 40/106 (37%), Gaps = 10/106 (9%)

Query: 339 HHAFKEALQRLEEMHREKVTKVMKDWSDLEERYQDMRSKSPGVAEDFKQKMTLRFQQTVQ 398
             A++E+ ++ + + +E   ++ ++W +L         +      +  Q++  R +  + 
Sbjct: 12  DEAYQESEEKTKNLSKEDWEEIQEEWDELFAELAAAMDEGVDPDSEEVQELAERHRDWLS 71

Query: 399 SLEEEGNAEKH-QLIVMHQQ--RVAARINQHKK-------DAMNCY 434
                 + E    L  M+    R  A  +++         DA+  Y
Sbjct: 72  ESFTPCDPEALAGLGEMYVADPRFTANYDKYGPGLAEFLRDAIEAY 117


>gnl|CDD|234347 TIGR03759, conj_TIGR03759, integrating conjugative element protein,
           PFL_4693 family.  Members of this protein family, such
           as model protein PFL_4693 from Pseudomonas fluorescens
           Pf-5, belong to extended genomic regions that appear to
           be spread by conjugative transfer. Most members have a
           predicted N-terminal signal sequence. The function is
           unknown [Mobile and extrachromosomal element functions,
           Plasmid functions].
          Length = 200

 Score = 29.6 bits (67), Expect = 3.6
 Identities = 16/66 (24%), Positives = 26/66 (39%), Gaps = 3/66 (4%)

Query: 115 DKERERFLEKQRKEVHKHEREELREEKARVKAAAEGRTYEPTGPSTPIPPGVDAHPPYSS 174
           D+ER R+ E   K+  +   +EL  ++A   A    R Y P      +     A  P + 
Sbjct: 50  DEERRRYAELWVKQEAQRVEKELAFQRAYDAAWQ--RLY-PNVLPVNLFDPSGAAGPIAL 106

Query: 175 QRHDTV 180
           Q    +
Sbjct: 107 QGGGRL 112


>gnl|CDD|235316 PRK04863, mukB, cell division protein MukB; Provisional.
          Length = 1486

 Score = 30.3 bits (69), Expect = 3.9
 Identities = 22/113 (19%), Positives = 45/113 (39%), Gaps = 16/113 (14%)

Query: 334 EPKDEHHAFKEALQRLEE--MHREKVTKVMKDWSDLEERYQD------------MRSKSP 379
              +     +E L+RL E     E++ ++    S+LE+R +              R    
Sbjct: 490 SRSEAWDVARELLRRLREQRHLAEQLQQLRMRLSELEQRLRQQQRAERLLAEFCKRLGKN 549

Query: 380 GVAEDFKQKMTLRFQQTVQSLEEEGNAEKHQLIVMHQQR--VAARINQHKKDA 430
              ED  +++    +  ++SL E  +  + + + + QQ   + ARI +    A
Sbjct: 550 LDDEDELEQLQEELEARLESLSESVSEARERRMALRQQLEQLQARIQRLAARA 602


>gnl|CDD|235334 PRK05035, PRK05035, electron transport complex protein RnfC;
           Provisional.
          Length = 695

 Score = 29.9 bits (68), Expect = 4.2
 Identities = 13/37 (35%), Positives = 20/37 (54%), Gaps = 5/37 (13%)

Query: 115 DKERERFLEKQRKEVHKHEREELREEKARVKAAAEGR 151
           ++ + RF  +Q +     ERE+    +AR K AAE R
Sbjct: 449 EEAKARFEARQARL----EREKA-AREARHKKAAEAR 480


>gnl|CDD|215243 PLN02444, PLN02444, HMP-P synthase.
          Length = 642

 Score = 29.9 bits (67), Expect = 4.5
 Identities = 15/68 (22%), Positives = 28/68 (41%), Gaps = 4/68 (5%)

Query: 290 DSYENIVSPSSGPASSTTTPTSTATTKSHATTRVPTPDPYFTHFEPKDEHHAFKEALQRL 349
            S   +   S+   ++ T    T  ++    T+ PT DP    F P     +F+E   + 
Sbjct: 38  VSAARLKKESTATRATLTFDPPTGNSEKAKQTK-PTVDPSAPDFLP---IPSFEECFPKS 93

Query: 350 EEMHREKV 357
            + ++E V
Sbjct: 94  TKEYKEVV 101


>gnl|CDD|184349 PRK13824, PRK13824, replication initiation protein RepC;
           Provisional.
          Length = 404

 Score = 29.8 bits (68), Expect = 4.5
 Identities = 19/85 (22%), Positives = 35/85 (41%), Gaps = 17/85 (20%)

Query: 341 AFKEALQRLEE---MHREKVTK---------VMKDWSDLEERYQDM-----RSKSPGVAE 383
           A ++AL+RL E   + R  + K         V  DW  +E+R++ +     R  +    E
Sbjct: 160 AERKALRRLRERLTLCRRDIAKLIEAAIEEGVPGDWEGVEQRFRAIVARLPRRATLAELE 219

Query: 384 DFKQKMTLRFQQTVQSLEEEGNAEK 408
               ++    ++ V  LE    +E 
Sbjct: 220 PILDELEALREEVVNLLESHLKSEN 244


>gnl|CDD|222665 pfam14303, NAM-associated, No apical meristem-associated C-terminal
           domain.  This domain is found in a number of different
           types of plant proteins including NAM-like proteins.
          Length = 147

 Score = 28.5 bits (64), Expect = 4.8
 Identities = 14/35 (40%), Positives = 18/35 (51%), Gaps = 1/35 (2%)

Query: 111 CPMKDKERERFLEKQRKEVHKHEREELREEKARVK 145
              K+KE+E    K   E  K ER EL ++KA  K
Sbjct: 83  EAEKEKEKEERFMKALAEAEK-ERAELEKKKAEAK 116


>gnl|CDD|224250 COG1331, COG1331, Highly conserved protein containing a thioredoxin
           domain [Posttranslational modification, protein
           turnover, chaperones].
          Length = 667

 Score = 29.6 bits (67), Expect = 5.1
 Identities = 14/38 (36%), Positives = 21/38 (55%), Gaps = 6/38 (15%)

Query: 325 TPD--PYF--THFEPKDEHHA--FKEALQRLEEMHREK 356
           TPD  P+F  T+F  +D +    FK+ L+ + E  RE 
Sbjct: 116 TPDGKPFFAGTYFPKEDRYGRPGFKQLLEAIRETWRED 153


>gnl|CDD|224556 COG1641, COG1641, Uncharacterized conserved protein [Function
           unknown].
          Length = 387

 Score = 29.2 bits (66), Expect = 5.4
 Identities = 16/64 (25%), Positives = 28/64 (43%), Gaps = 6/64 (9%)

Query: 435 IEALN-DVSLNTHKVQKCLQKLLRALHKDRHHTIAHYKH----LLATNLDFAVKEKPMTL 489
           +EAL  +V L   +VQK   +  + +  D  H   H       + A +L   VK+  + +
Sbjct: 34  VEALGPEVDLRVEEVQKRGIRATK-VEVDAEHHHRHLPEIVELIKAADLPDRVKQVALAI 92

Query: 490 EHLV 493
             L+
Sbjct: 93  FELL 96


>gnl|CDD|218421 pfam05086, Dicty_REP, Dictyostelium (Slime Mold) REP protein.  This
           family consists of REP proteins from Dictyostelium
           (Slime molds). REP protein is likely involved in
           transcription regulation and control of DNA replication,
           specifically amplification of plasmid at low copy
           numbers. The formation of homomultimers may be required
           for their regulatory activity.
          Length = 910

 Score = 29.8 bits (67), Expect = 5.5
 Identities = 12/26 (46%), Positives = 15/26 (57%)

Query: 297 SPSSGPASSTTTPTSTATTKSHATTR 322
            PS  P ++TTT T+T TT     TR
Sbjct: 247 QPSKRPNNTTTTTTTTTTTTFQPRTR 272


>gnl|CDD|178945 PRK00247, PRK00247, putative inner membrane protein translocase
           component YidC; Validated.
          Length = 429

 Score = 29.0 bits (65), Expect = 7.6
 Identities = 14/69 (20%), Positives = 24/69 (34%)

Query: 114 KDKERERFLEKQRKEVHKHEREELREEKARVKAAAEGRTYEPTGPSTPIPPGVDAHPPYS 173
           K +  ER + ++ ++         R  +A VKA  +G         TP         P  
Sbjct: 353 KRRAAEREINREARQERAAAMARARARRAAVKAKKKGLIDASPNEDTPSENEESKGSPPQ 412

Query: 174 SQRHDTVQP 182
            +   T +P
Sbjct: 413 VEATTTAEP 421


>gnl|CDD|222023 pfam13281, DUF4071, Domain of unknown function (DUF4071).  This
           domain is found at the N-terminus of many
           serine-threonine kinase-like proteins.
          Length = 365

 Score = 28.5 bits (64), Expect = 9.3
 Identities = 13/38 (34%), Positives = 19/38 (50%)

Query: 109 VCCPMKDKERERFLEKQRKEVHKHEREELREEKARVKA 146
           V       +RE+FL   RK   K   +ELR+E   ++A
Sbjct: 85  VPIQSSAHKREKFLSDLRKAREKLSGKELRKELRELRA 122


>gnl|CDD|132218 TIGR03174, cas_Csc3, CRISPR type I-D/CYANO-associated protein
           Csc3/Cas10d.  CRISPR (Clustered Regularly Interspaced
           Short Palindromic Repeats) is a widespread family of
           prokaryotic direct repeats with spacers of unique
           sequence between consecutive repeats. This protein
           family is a CRISPR-associated (Cas) family strictly
           associated with the Cyano subtype of CRISPR/Cas locus,
           found in several species of Cyanobacteria and several
           archaeal species. This family is designated Csc3 for
           CRISPR/Cas Subtype Cyano protein 3, as it is often the
           third gene upstream of the core cas genes,
           cas3-cas4-cas1-cas2 [Mobile and extrachromosomal element
           functions, Other].
          Length = 953

 Score = 28.7 bits (64), Expect = 9.5
 Identities = 17/52 (32%), Positives = 27/52 (51%)

Query: 500 NQSMTMLQRHPALAVKISELMQDYMQALRSKDETPGSLLSLTREAEEAILDK 551
           N  +T    +  L  ++ EL + + +   SK+  P ++L   REA EAIL K
Sbjct: 815 NIVLTEKLENSNLTKRLVELYRRFYRVALSKNPKPHAVLKPFREAVEAILSK 866


>gnl|CDD|225500 COG2949, SanA, Uncharacterized membrane protein [Function unknown].
          Length = 235

 Score = 28.1 bits (63), Expect = 9.6
 Identities = 20/48 (41%), Positives = 21/48 (43%), Gaps = 9/48 (18%)

Query: 128 EVHKHEREELREEKARVKAAAE----GRTYEPT--GPSTPIPPGVDAH 169
           E        LRE  ARVKA  +     R  EP   GP  PIPP  DA 
Sbjct: 185 EGRSGLSVRLREFLARVKAVLDLYILKR--EPKFLGPPVPIPPF-DAQ 229


  Database: CDD.v3.10
    Posted date:  Mar 20, 2013  7:55 AM
  Number of letters in database: 10,937,602
  Number of sequences in database:  44,354
  
Lambda     K      H
   0.318    0.131    0.391 

Gapped
Lambda     K      H
   0.267   0.0788    0.140 


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 44354
Number of Hits to DB: 28,532,182
Number of extensions: 2771925
Number of successful extensions: 4385
Number of sequences better than 10.0: 1
Number of HSP's gapped: 4349
Number of HSP's successfully gapped: 82
Length of query: 564
Length of database: 10,937,602
Length adjustment: 102
Effective length of query: 462
Effective length of database: 6,413,494
Effective search space: 2963034228
Effective search space used: 2963034228
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.3 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.7 bits)
S2: 62 (27.5 bits)