RPS-BLAST 2.2.26 [Sep-21-2011]

Database: CDD.v3.10 
           44,354 sequences; 10,937,602 total letters

Searching..................................................done

Query= psy13861
         (1048 letters)



>gnl|CDD|214614 smart00317, SET, SET (Su(var)3-9, Enhancer-of-zeste, Trithorax)
           domain.  Putative methyl transferase, based on outlier
           plant homologues.
          Length = 124

 Score =  128 bits (324), Expect = 1e-34
 Identities = 51/97 (52%), Positives = 61/97 (62%), Gaps = 2/97 (2%)

Query: 484 KNEFISEYCGEIISQDEADRRGKVYDKYM--CSFLFNLNNDFVVDATRKGNKIRFANHSI 541
           K EFI EY GEII+ +EA+ R K YD       +LF++++D  +DA RKGN  RF NHS 
Sbjct: 23  KGEFIGEYVGEIITSEEAEERPKAYDTDGAKAFYLFDIDSDLCIDARRKGNLARFINHSC 82

Query: 542 NPNCYAKVMMVNGDHRIGIFAKRAILPGEELYFDYRY 578
            PNC    + VNGD RI IFA R I PGEEL  DY  
Sbjct: 83  EPNCELLFVEVNGDDRIVIFALRDIKPGEELTIDYGS 119



 Score = 61.2 bits (149), Expect = 3e-11
 Identities = 21/50 (42%), Positives = 27/50 (54%)

Query: 995  KHLLMAPSDVAGWGIFLKDSAQKNEFISEYCGEIISQDEADRRGKVYDKY 1044
              L +  S   GWG+   +   K EFI EY GEII+ +EA+ R K YD  
Sbjct: 1    NKLEVFKSPGKGWGVRATEDIPKGEFIGEYVGEIITSEEAEERPKAYDTD 50


>gnl|CDD|216155 pfam00856, SET, SET domain.  SET domains are protein lysine
           methyltransferase enzymes. SET domains appear to be
           protein-protein interaction domains. It has been
           demonstrated that SET domains mediate interactions with
           a family of proteins that display similarity with
           dual-specificity phosphatases (dsPTPases). A subset of
           SET domains have been called PR domains. These domains
           are divergent in sequence from other SET domains, but
           also appear to mediate protein-protein interaction. The
           SET domain consists of two regions known as SET-N and
           SET-C. SET-C forms an unusual and conserved knot-like
           structure of probably functional importance.
           In addition to SET-N and SET-C, an insert region
           (SET-I) and flanking regions of high structural
           variability form part of the overall structure.
          Length = 113

 Score =  106 bits (265), Expect = 6e-27
 Identities = 44/102 (43%), Positives = 58/102 (56%), Gaps = 8/102 (7%)

Query: 483 QKNEFISEYCGEIISQDEADRRGKVYDKYMCS--------FLFNLNNDFVVDATRKGNKI 534
            K E I EY GE+I+ +EA+ R  +Y+K            FL  L++++ +DAT  GN  
Sbjct: 11  PKGELIIEYVGELITPEEAEERELLYNKEELRGLLSDLELFLSRLDSEYDIDATGLGNVA 70

Query: 535 RFANHSINPNCYAKVMMVNGDHRIGIFAKRAILPGEELYFDY 576
           RF NHS  PNC  + + VNG  RI + A R I PGEEL  DY
Sbjct: 71  RFINHSCEPNCEVRFVFVNGGDRIVVRALRDIKPGEELTIDY 112



 Score = 47.9 bits (114), Expect = 1e-06
 Identities = 16/43 (37%), Positives = 23/43 (53%)

Query: 1006 GWGIFLKDSAQKNEFISEYCGEIISQDEADRRGKVYDKYMCSF 1048
            G G+F      K E I EY GE+I+ +EA+ R  +Y+K     
Sbjct: 1    GRGLFATRDIPKGELIIEYVGELITPEEAEERELLYNKEELRG 43


>gnl|CDD|225491 COG2940, COG2940, Proteins containing SET domain [General function
           prediction only].
          Length = 480

 Score = 72.9 bits (179), Expect = 3e-13
 Identities = 45/95 (47%), Positives = 54/95 (56%), Gaps = 2/95 (2%)

Query: 484 KNEFISEYCGEIISQDEADRRGKVYDKYMCSFLFNLNND--FVVDATRKGNKIRFANHSI 541
           K EFI EY GEII + EA  R + YD     F F L  D   V D+ + G+  RF NHS 
Sbjct: 354 KGEFIIEYHGEIIRRKEAREREENYDLLGNEFSFGLLEDKDKVRDSQKAGDVARFINHSC 413

Query: 542 NPNCYAKVMMVNGDHRIGIFAKRAILPGEELYFDY 576
            PNC A  + VNG  +I I+A R I  GEEL +DY
Sbjct: 414 TPNCEASPIEVNGIFKISIYAIRDIKAGEELTYDY 448



 Score = 42.5 bits (100), Expect = 0.001
 Identities = 27/120 (22%), Positives = 39/120 (32%), Gaps = 16/120 (13%)

Query: 931  CKCSFDCQNRFPGCRC--KAQCNTKQCPCYLAVRECDPDLCQTCGADQFDVSKISCKNVS 988
              CS    +  P                          ++ +       +  K   +   
Sbjct: 280  EGCSPLLCSASPSAINRISKSEEDSTTSSD----FSKSNVSKLKELLNSNGCKKRREPNV 335

Query: 989  VQRGLHKHLLMAPSDVAGWGIFLKDSAQKNEFISEYCGEIISQDEADRRGKVYDKYMCSF 1048
            VQ    K          G+G+F  +S +K EFI EY GEII + EA  R + YD     F
Sbjct: 336  VQESEIK----------GYGVFALESIKKGEFIIEYHGEIIRRKEAREREENYDLLGNEF 385


>gnl|CDD|238096 cd00167, SANT, 'SWI3, ADA2, N-CoR and TFIIIB' DNA-binding domains.
           Tandem copies of the domain bind telomeric DNA tandem
           repeats as part of the capping complex. Binding is
           sequence dependent for repeats which contain the G/C
           rich motif [C2-3 A (CA)1-6]. The domain is also found in
           regulatory transcriptional repressor complexes where it
           also binds DNA.
          Length = 45

 Score = 32.9 bits (76), Expect = 0.038
 Identities = 11/43 (25%), Positives = 20/43 (46%), Gaps = 1/43 (2%)

Query: 814 EWTGSDQSLF-RAIHKVLYNNYCAIAQVMMTKTCQQVYQFAQK 855
            WT  +  L   A+ K   NN+  IA+ +  +T +Q  +  + 
Sbjct: 1   PWTEEEDELLLEAVKKYGKNNWEKIAKELPGRTPKQCRERWRN 43


>gnl|CDD|197842 smart00717, SANT, SANT 'SWI3, ADA2, N-CoR and TFIIIB' DNA-binding
           domains. 
          Length = 49

 Score = 31.8 bits (73), Expect = 0.10
 Identities = 11/44 (25%), Positives = 20/44 (45%), Gaps = 1/44 (2%)

Query: 813 NEWTGSDQSLF-RAIHKVLYNNYCAIAQVMMTKTCQQVYQFAQK 855
            EWT  +  L    + K   NN+  IA+ +  +T +Q  +  + 
Sbjct: 2   GEWTEEEDELLIELVKKYGKNNWEKIAKELPGRTAEQCRERWRN 45


>gnl|CDD|220297 pfam09581, Spore_III_AF, Stage III sporulation protein AF
           (Spore_III_AF).  This family represents the stage III
           sporulation protein AF (Spore_III_AF) of the bacterial
           endospore formation program, which exists in some but
           not all members of the Firmicutes (formerly called
           low-GC Gram-positives). The C-terminal region of these
           proteins is poorly conserved.
          Length = 185

 Score = 34.2 bits (79), Expect = 0.18
 Identities = 19/84 (22%), Positives = 38/84 (45%), Gaps = 3/84 (3%)

Query: 707 DGMKEKIEAEIKDEEEQEMKKKTKLDLEEDDKMQVDDQNAV--QATEVKTTKGKLSIEKQ 764
             +++++E ++K EE     K  +++++ED +    D   V     E    K K  +E  
Sbjct: 86  KQLEKQVEKKLK-EEYGVKVKDVEVEIDEDLESNNFDIKEVNVTLKEESKEKQKSKVEPV 144

Query: 765 VSLDSGSGNDASSEDSNDSKDLKN 788
           V     S      E+S +++ +KN
Sbjct: 145 VIDTQTSKPKEEEEESEEAEKIKN 168



 Score = 33.8 bits (78), Expect = 0.24
 Identities = 20/87 (22%), Positives = 38/87 (43%), Gaps = 3/87 (3%)

Query: 340 DGMKEKIEAKIKDEEEQEMKKKTKLDLEEDDKMQVDDQNAV--QATEVKTTKGKLSIEKQ 397
             +++++E K+K EE     K  +++++ED +    D   V     E    K K  +E  
Sbjct: 86  KQLEKQVEKKLK-EEYGVKVKDVEVEIDEDLESNNFDIKEVNVTLKEESKEKQKSKVEPV 144

Query: 398 VSLDSGSGNDASSEDSNDSRDLKNNIE 424
           V     S      E+S ++  +KN + 
Sbjct: 145 VIDTQTSKPKEEEEESEEAEKIKNFLA 171



 Score = 33.8 bits (78), Expect = 0.28
 Identities = 19/87 (21%), Positives = 38/87 (43%), Gaps = 3/87 (3%)

Query: 55  DGMKEKIEAEIKDEEEQEMKKKTKLDLEEDDKMQVDDQNAV--QATEVKTTKGKLSIEKQ 112
             +++++E ++K EE     K  +++++ED +    D   V     E    K K  +E  
Sbjct: 86  KQLEKQVEKKLK-EEYGVKVKDVEVEIDEDLESNNFDIKEVNVTLKEESKEKQKSKVEPV 144

Query: 113 VSLDSGSGNDASSEDSNDSRDLKNNIE 139
           V     S      E+S ++  +KN + 
Sbjct: 145 VIDTQTSKPKEEEEESEEAEKIKNFLA 171


>gnl|CDD|220611 pfam10169, Laps, Learning-associated protein.  This is a family of
           121-amino acid secretory proteins. Laps functions in the
           regulation of neuronal cell adhesion and/or movement and
           synapse attachment. Laps binds to the ApC/EBP (Aplysia
           CCAAT/enhancer binding protein) promoter and activates
           the transcription of ApC/EBP mRNA.
          Length = 124

 Score = 30.6 bits (69), Expect = 1.5
 Identities = 15/47 (31%), Positives = 29/47 (61%)

Query: 334 DCYMLLDGMKEKIEAKIKDEEEQEMKKKTKLDLEEDDKMQVDDQNAV 380
           D  +L+  +K+    K  +E +++ KK+     E+D+KM+VD++ AV
Sbjct: 35  DGNVLMKDVKDIATVKPAEEVKEKKKKEGTESEEDDEKMEVDNKAAV 81



 Score = 29.0 bits (65), Expect = 5.8
 Identities = 14/47 (29%), Positives = 29/47 (61%)

Query: 49 DCYMLLDGMKEKIEAEIKDEEEQEMKKKTKLDLEEDDKMQVDDQNAV 95
          D  +L+  +K+    +  +E +++ KK+     E+D+KM+VD++ AV
Sbjct: 35 DGNVLMKDVKDIATVKPAEEVKEKKKKEGTESEEDDEKMEVDNKAAV 81



 Score = 29.0 bits (65), Expect = 5.8
 Identities = 14/47 (29%), Positives = 29/47 (61%)

Query: 701 DCYMLLDGMKEKIEAEIKDEEEQEMKKKTKLDLEEDDKMQVDDQNAV 747
           D  +L+  +K+    +  +E +++ KK+     E+D+KM+VD++ AV
Sbjct: 35  DGNVLMKDVKDIATVKPAEEVKEKKKKEGTESEEDDEKMEVDNKAAV 81


>gnl|CDD|240555 cd13122, MSL2_CXC, DNA-binding cysteine-rich domain of
           male-specific lethal 2 and related proteins.  The CXC
           domain of Drosophila melanogaster MSL2 forms a
           Zn(3)Cys(9) cluster and is involved in recruiting
           members of the dosage compensation complex (DCC) to
           sites on the X chromosome.
          Length = 50

 Score = 28.9 bits (65), Expect = 1.6
 Identities = 12/39 (30%), Positives = 16/39 (41%), Gaps = 8/39 (20%)

Query: 942 PGCRC--------KAQCNTKQCPCYLAVRECDPDLCQTC 972
            GCRC           C  ++CPCY   + C    C+ C
Sbjct: 5   KGCRCGTATQSPGVLTCRGQRCPCYSNGKSCLDCKCRGC 43


>gnl|CDD|227434 COG5103, CDC39, Cell division control protein, negative regulator of
            transcription [Cell division and chromosome partitioning
            / Transcription].
          Length = 2005

 Score = 32.3 bits (73), Expect = 1.9
 Identities = 18/86 (20%), Positives = 29/86 (33%), Gaps = 3/86 (3%)

Query: 599  IYEWD-FNLRSPVSATILFGNMRAMEIKNYQSSKVVL--GKNKTGGILMPLELLREANTS 655
            +Y  D  N    VS T++      +E   ++S  + +   + + GG      LL      
Sbjct: 1834 VYLIDNLNEMDAVSHTVVESIKFMIERFMHKSEILTMLWIRMRAGGPPKRYFLLTAILAQ 1893

Query: 656  CQYDTAGRCYKYDCFLHRLKDHHSGP 681
             +Y           FL   K H   P
Sbjct: 1894 LRYPEIHTYDFSRVFLKSFKSHGYNP 1919


>gnl|CDD|235250 PRK04195, PRK04195, replication factor C large subunit;
           Provisional.
          Length = 482

 Score = 31.8 bits (73), Expect = 2.1
 Identities = 14/69 (20%), Positives = 36/69 (52%), Gaps = 2/69 (2%)

Query: 51  YMLLDGMKEKIEAEIKDEEEQEMKKKTKLDLEEDDKMQVDDQNAVQATEVKTTKGKLSIE 110
             +++  ++K E E K+++++    K K + EE++K + +++   +  E +  K +   +
Sbjct: 413 KKIVEKAEKKREEEKKEKKKKAFAGKKKEEEEEEEKEKKEEEKEEEEEEAEEEKEEEEEK 472

Query: 111 --KQVSLDS 117
             KQ +L  
Sbjct: 473 KKKQATLFD 481



 Score = 31.8 bits (73), Expect = 2.1
 Identities = 14/69 (20%), Positives = 36/69 (52%), Gaps = 2/69 (2%)

Query: 703 YMLLDGMKEKIEAEIKDEEEQEMKKKTKLDLEEDDKMQVDDQNAVQATEVKTTKGKLSIE 762
             +++  ++K E E K+++++    K K + EE++K + +++   +  E +  K +   +
Sbjct: 413 KKIVEKAEKKREEEKKEKKKKAFAGKKKEEEEEEEKEKKEEEKEEEEEEAEEEKEEEEEK 472

Query: 763 --KQVSLDS 769
             KQ +L  
Sbjct: 473 KKKQATLFD 481



 Score = 31.4 bits (72), Expect = 2.5
 Identities = 13/69 (18%), Positives = 36/69 (52%), Gaps = 2/69 (2%)

Query: 336 YMLLDGMKEKIEAKIKDEEEQEMKKKTKLDLEEDDKMQVDDQNAVQATEVKTTKGKLSIE 395
             +++  ++K E + K+++++    K K + EE++K + +++   +  E +  K +   +
Sbjct: 413 KKIVEKAEKKREEEKKEKKKKAFAGKKKEEEEEEEKEKKEEEKEEEEEEAEEEKEEEEEK 472

Query: 396 --KQVSLDS 402
             KQ +L  
Sbjct: 473 KKKQATLFD 481


>gnl|CDD|222912 PHA02624, PHA02624, large T antigen; Provisional.
          Length = 647

 Score = 31.5 bits (72), Expect = 2.9
 Identities = 15/42 (35%), Positives = 18/42 (42%), Gaps = 11/42 (26%)

Query: 921 VSA-QNFCEKFCKCSFDCQNRFPGCRCKAQCNTKQCPCYLAV 961
           VSA  NFC+K C  SF          CK     K+   Y A+
Sbjct: 220 VSAVNNFCKKHCTVSF--------LICKGV--KKEYELYSAL 251


>gnl|CDD|219956 pfam08658, Rad54_N, Rad54 N terminal.  This is the N terminal of
           the DNA repair protein Rad54.
          Length = 191

 Score = 30.8 bits (70), Expect = 2.9
 Identities = 16/61 (26%), Positives = 26/61 (42%), Gaps = 5/61 (8%)

Query: 25  HHSGPNLMRRKRPDL--KPFSDPCSPDCYMLLDGM---KEKIEAEIKDEEEQEMKKKTKL 79
               P L  R+      +P  DP      +L D     K KIE E  +++++  + +TKL
Sbjct: 106 RRPPPTLGMRRGAVFVPRPLHDPTGEFAIVLYDPTVDDKPKIEEEKAEKDQEPEESETKL 165

Query: 80  D 80
            
Sbjct: 166 S 166



 Score = 30.8 bits (70), Expect = 2.9
 Identities = 16/61 (26%), Positives = 26/61 (42%), Gaps = 5/61 (8%)

Query: 677 HHSGPNLMRRKRPDL--KPFSDPCSPDCYMLLDGM---KEKIEAEIKDEEEQEMKKKTKL 731
               P L  R+      +P  DP      +L D     K KIE E  +++++  + +TKL
Sbjct: 106 RRPPPTLGMRRGAVFVPRPLHDPTGEFAIVLYDPTVDDKPKIEEEKAEKDQEPEESETKL 165

Query: 732 D 732
            
Sbjct: 166 S 166


>gnl|CDD|220460 pfam09894, DUF2121, Uncharacterized protein conserved in archaea
           (DUF2121).  This domain, found in various hypothetical
           archaeal proteins, has no known function.
          Length = 194

 Score = 30.4 bits (69), Expect = 3.9
 Identities = 20/75 (26%), Positives = 34/75 (45%), Gaps = 12/75 (16%)

Query: 341 GMKEKIEA--------KIKDEEEQEMKKKTKLDLE---EDDKMQVDDQNAVQATEVKTTK 389
           G +E  E         KIK +EE   KK ++  ++    DD+ +V     V   EV + +
Sbjct: 26  GDEENREKLEEELYSGKIKTDEEL-KKKASEFGVKIYITDDREKVRKVGDVLVGEVTSIE 84

Query: 390 GKLSIEKQVSLDSGS 404
           GK S  +++    G+
Sbjct: 85  GKDSKRRRMYATKGA 99



 Score = 30.0 bits (68), Expect = 4.7
 Identities = 19/75 (25%), Positives = 34/75 (45%), Gaps = 12/75 (16%)

Query: 56  GMKEKIEA--------EIKDEEEQEMKKKTKLDLE---EDDKMQVDDQNAVQATEVKTTK 104
           G +E  E         +IK +EE   KK ++  ++    DD+ +V     V   EV + +
Sbjct: 26  GDEENREKLEEELYSGKIKTDEEL-KKKASEFGVKIYITDDREKVRKVGDVLVGEVTSIE 84

Query: 105 GKLSIEKQVSLDSGS 119
           GK S  +++    G+
Sbjct: 85  GKDSKRRRMYATKGA 99



 Score = 30.0 bits (68), Expect = 4.7
 Identities = 19/75 (25%), Positives = 34/75 (45%), Gaps = 12/75 (16%)

Query: 708 GMKEKIEA--------EIKDEEEQEMKKKTKLDLE---EDDKMQVDDQNAVQATEVKTTK 756
           G +E  E         +IK +EE   KK ++  ++    DD+ +V     V   EV + +
Sbjct: 26  GDEENREKLEEELYSGKIKTDEEL-KKKASEFGVKIYITDDREKVRKVGDVLVGEVTSIE 84

Query: 757 GKLSIEKQVSLDSGS 771
           GK S  +++    G+
Sbjct: 85  GKDSKRRRMYATKGA 99


>gnl|CDD|219060 pfam06493, DUF1096, Protein of unknown function (DUF1096).  This
           family represents the N-terminal region of several
           proteins found in C. elegans. The family is often found
           with pfam02363.
          Length = 51

 Score = 27.6 bits (61), Expect = 5.0
 Identities = 14/38 (36%), Positives = 17/38 (44%), Gaps = 3/38 (7%)

Query: 909 PPTQPCDASCPCVSAQNFCEKFCKCSFDCQNRFPGCRC 946
           PP QP   SC C  AQ   ++ C C    Q +   C C
Sbjct: 16  PPQQP---SCSCQQAQQNQQQSCSCQNAPQPQQSSCSC 50


>gnl|CDD|112704 pfam03904, DUF334, Domain of unknown function (DUF334).
           Staphylococcus aureus plasmid proteins with no
           characterized function.
          Length = 229

 Score = 30.0 bits (67), Expect = 5.2
 Identities = 13/52 (25%), Positives = 32/52 (61%), Gaps = 6/52 (11%)

Query: 344 EKIEAKIKDEEEQEMKKKTKL------DLEEDDKMQVDDQNAVQATEVKTTK 389
           +KI+  +++EE QE+KK+ KL      +++E+  ++  +  A+++   + T+
Sbjct: 33  QKIQKSLENEELQELKKQNKLIIKYIAEIKENQDIREKELKAIKSELKEATE 84



 Score = 30.0 bits (67), Expect = 5.3
 Identities = 13/52 (25%), Positives = 32/52 (61%), Gaps = 6/52 (11%)

Query: 59  EKIEAEIKDEEEQEMKKKTKL------DLEEDDKMQVDDQNAVQATEVKTTK 104
           +KI+  +++EE QE+KK+ KL      +++E+  ++  +  A+++   + T+
Sbjct: 33  QKIQKSLENEELQELKKQNKLIIKYIAEIKENQDIREKELKAIKSELKEATE 84



 Score = 30.0 bits (67), Expect = 5.3
 Identities = 13/52 (25%), Positives = 32/52 (61%), Gaps = 6/52 (11%)

Query: 711 EKIEAEIKDEEEQEMKKKTKL------DLEEDDKMQVDDQNAVQATEVKTTK 756
           +KI+  +++EE QE+KK+ KL      +++E+  ++  +  A+++   + T+
Sbjct: 33  QKIQKSLENEELQELKKQNKLIIKYIAEIKENQDIREKELKAIKSELKEATE 84


>gnl|CDD|240396 PTZ00388, PTZ00388, 40S ribosomal protein S8-like; Provisional.
          Length = 223

 Score = 30.1 bits (68), Expect = 5.4
 Identities = 19/59 (32%), Positives = 29/59 (49%), Gaps = 8/59 (13%)

Query: 339 LDGMKEKIEAKIKDEEEQEMKKKTKLDLEEDDKMQVDDQ---NAVQA-----TEVKTTK 389
           L G+K K+  K +  E+ EMKK  K   E+D K +V ++    AV +      +V   K
Sbjct: 5   LRGIKAKLFNKKRYAEKAEMKKTIKAHEEKDVKEKVPEKVPDGAVPSYLLDREQVNRAK 63


>gnl|CDD|149438 pfam08374, Protocadherin, Protocadherin.  The structure of
           protocadherins is similar to that of classic cadherins
           (pfam00028), but particularly on the cytoplasmic domains
           they also have some unique features. They are expressed
           in a variety of organisms and are found in high
           concentrations in the brain where they seem to be
           localised mainly at cell-cell contact sites. Their
           expression seems to be developmentally regulated.
          Length = 223

 Score = 29.8 bits (67), Expect = 5.4
 Identities = 14/48 (29%), Positives = 22/48 (45%), Gaps = 5/48 (10%)

Query: 837 IAQVMMTKTCQQ-----VYQFAQKEAADITTEDSANDTTPPRKKKKKH 879
           I  V++ + C+Q      YQ  +KE  D  + +  N     +K KKK 
Sbjct: 54  ILVVVLVRYCRQAERKKGYQAGKKETEDWFSPNQENKQKKKKKDKKKK 101


>gnl|CDD|212055 cd11486, SLC5sbd_SGLT1, Na(+)/glucose cotransporter SGLT1; solute
           binding domain.  Human SGLT1 (hSGLT1) is a
           high-affinity/low-capacity glucose transporter, which
           can also transport galactose. In the transport
           mechanism, two Na+ ions first bind to the extracellular
           side of the transporter and induce a conformational
           change in the glucose binding site. This results in an
           increased affinity for glucose. A second conformational
           change in the transporter follows, bringing the Na+ and
           glucose binding sites to the inner surface of the
           membrane. Glucose is then released, followed by the Na+
           ions. In the process, hSGLT1 is also able to transport
           water and urea and may be a major pathway for transport
           of these across the intestinal brush-border membrane.
           hSGLT1 is encoded by the SLC5A1 gene and expressed
           mostly in the intestine, but also in the trachea,
           kidney, heart, brain, testis, and prostate. The
           WHO/UNICEF oral rehydration solution (ORS) for the
           treatment of secretory diarrhea contains salt and
           glucose. The glucose, along with sodium ions, is
           transported by hSGLT1 and water is either co-transported
           along with these or follows by osmosis. Mutations in
           SGLT1 are associated with intestinal glucose galactose
           malabsorption (GGM). Up-regulation of intestinal SGLT1
           may protect against enteric infections. SGLT1 is
           expressed in colorectal, head and neck, and prostate
           tumors. Epidermal growth factor receptor (EGFR)
           functions in cell survival by stabilizing SGLT1, and
           thereby maintaining intracellular glucose levels. SGLT1
           is predicted to have 14 membrane-spanning regions. This
           subgroup belongs to the solute carrier 5
           (SLC5) transporter family.
          Length = 635

 Score = 30.2 bits (68), Expect = 6.9
 Identities = 9/37 (24%), Positives = 18/37 (48%)

Query: 50  CYMLLDGMKEKIEAEIKDEEEQEMKKKTKLDLEEDDK 86
           C+ L +  +E+I+ +  D  E E + + + D E    
Sbjct: 534 CWSLRNSTEERIDLDADDWTEDEDENEMETDEERKKP 570



 Score = 30.2 bits (68), Expect = 6.9
 Identities = 9/37 (24%), Positives = 18/37 (48%)

Query: 702 CYMLLDGMKEKIEAEIKDEEEQEMKKKTKLDLEEDDK 738
           C+ L +  +E+I+ +  D  E E + + + D E    
Sbjct: 534 CWSLRNSTEERIDLDADDWTEDEDENEMETDEERKKP 570


>gnl|CDD|211393 cd11381, NSA2, pre-ribosomal protein NSA2 (Nop seven-associated 2).
            NSA2 appears to be a protein required for the
           maturation of 27S pre-rRNA in yeast; it has been
           characterized in mammalian cells as a nucleolar protein
           that might play a role in the regulation of the cell
           cycle and in cell proliferation.
          Length = 257

 Score = 29.5 bits (67), Expect = 7.9
 Identities = 18/47 (38%), Positives = 28/47 (59%), Gaps = 3/47 (6%)

Query: 339 LDGMKEKIEAKIKDEEEQEMKKKTKLDLEEDDKMQVDD---QNAVQA 382
           L G+K K+  K + +E+ +MKK  K+  E + K +VDD   + AV A
Sbjct: 39  LRGLKAKLYNKKRYKEKIQMKKTIKMHEERNVKQKVDDKVPEGAVPA 85


>gnl|CDD|218393 pfam05033, Pre-SET, Pre-SET motif.  This protein motif is a zinc
           binding motif. It contains 9 conserved cysteines that
           coordinate three zinc ions. It is thought that this
           region plays a structural role in stabilising SET
           domains.
          Length = 103

 Score = 28.1 bits (63), Expect = 8.0
 Identities = 7/14 (50%), Positives = 7/14 (50%)

Query: 927 CEKFCKCSFDCQNR 940
           C   CKC   C NR
Sbjct: 90  CNSRCKCDPSCPNR 103


>gnl|CDD|193205 pfam12729, 4HB_MCP_1, Four helix bundle sensory module for signal
           transduction.  This family is a four helix bundle that
           operates as a ubiquitous sensory module in prokaryotic
           signal-transduction. The 4HB_MCP is always found between
           two predicted transmembrane helices indicating that it
           detects only extracellular signals. In many cases the
           domain is associated with a cytoplasmic HAMP domain
           suggesting that most proteins carrying the bundle might
           share the mechanism of transmembrane signalling which is
           well-characterized in E. coli chemoreceptors.
          Length = 181

 Score = 29.1 bits (66), Expect = 8.5
 Identities = 10/27 (37%), Positives = 16/27 (59%)

Query: 238 FIELVNDLIKYQVKDSEEESNSNKGSA 264
            IE +++LI Y +K ++E    NK S 
Sbjct: 155 VIEALDELIDYNLKVAKEAYEDNKASY 181


>gnl|CDD|217384 pfam03137, OATP, Organic Anion Transporter Polypeptide (OATP)
           family.  This family consists of several eukaryotic
           Organic-Anion-Transporting Polypeptides (OATPs). Several
           have been identified mostly in human and rat. Different
           OATPs vary in tissue distribution and substrate
           specificity. Since the numbering of different OATPs in
           particular species was based originally on the order of
           discovery, similarly numbered OATPs in humans and rats
           did not necessarily correspond in function, tissue
           distribution and substrate specificity (in spite of the
           name, some OATPs also transport organic cations and
           neutral molecules). Thus, Tamai et al. initiated the
           current scheme of using digits for rat OATPs and letters
           for human ones. Prostaglandin transporter (PGT) proteins
           are also considered to be OATP family members. In
           addition, the methotrexate transporter OATK is closely
           related to OATPs. This family also includes several
           predicted proteins from Caenorhabditis elegans and
           Drosophila melanogaster. This similarity was not
           previously noted. Note: Members of this family are
           described (in the Swiss-Prot database) as belonging to
           the SLC21 family of transporters.
          Length = 582

 Score = 30.0 bits (68), Expect = 8.5
 Identities = 13/75 (17%), Positives = 17/75 (22%), Gaps = 11/75 (14%)

Query: 907 RHPPTQPCDASCPC-------VSAQNFCEKFCKCSFDCQNRFPGCRCKAQCNTKQCPCY- 958
              P   C+  C C       V   N       C     +   GC       +  C    
Sbjct: 418 HLEPLSSCNEDCSCDTSFFPPVCGDNGLTYLSPCHAGSSSSGTGCDTSCSTWSNNCSSGN 477

Query: 959 -LAVR--ECDPDLCQ 970
             +     C  D C 
Sbjct: 478 SHSASKGYCPSDCCT 492


>gnl|CDD|227466 COG5137, COG5137, Histone chaperone involved in gene silencing
           [Transcription / Chromatin structure and dynamics].
          Length = 279

 Score = 29.6 bits (66), Expect = 9.6
 Identities = 15/87 (17%), Positives = 33/87 (37%), Gaps = 3/87 (3%)

Query: 707 DGMKEKIEAEIKDEEEQEMKKKTKLDLEEDDKMQVDDQNAVQATEVKTTKGKLSIEKQVS 766
           DG +E+ + E+  +   E  ++   + EE+ +   D ++ V     +  K +   E+   
Sbjct: 186 DGREEEEDEEVGSDSYGEGNRELNEEEEEEAEGSDDGEDVVDYEGERIDKKQGEEEEM-- 243

Query: 767 LDSGSGNDASSEDSNDSKDLKNNTEVE 793
            +    N    E   +S   +     E
Sbjct: 244 -EEEVINLFEIEWEEESPSEEVPRNNE 269


  Database: CDD.v3.10
    Posted date:  Mar 20, 2013  7:55 AM
  Number of letters in database: 10,937,602
  Number of sequences in database:  44,354
  
Lambda     K      H
   0.315    0.131    0.394 

Gapped
Lambda     K      H
   0.267   0.0774    0.140 


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 44354
Number of Hits to DB: 50,932,405
Number of extensions: 4902743
Number of successful extensions: 5968
Number of sequences better than 10.0: 1
Number of HSP's gapped: 5913
Number of HSP's successfully gapped: 92
Length of query: 1048
Length of database: 10,937,602
Length adjustment: 107
Effective length of query: 941
Effective length of database: 6,191,724
Effective search space: 5826412284
Effective search space used: 5826412284
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.3 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.6 bits)
S2: 64 (28.3 bits)