RPS-BLAST 2.2.26 [Sep-21-2011]

Database: CDD.v3.10 
           44,354 sequences; 10,937,602 total letters

Searching..................................................done

Query= psy592
         (339 letters)



>gnl|CDD|216265 pfam01049, Cadherin_C, Cadherin cytoplasmic region.  Cadherins are
           vital in cell-cell adhesion during tissue
           differentiation. Cadherins are linked to the
           cytoskeleton by catenins. Catenins bind to the
           cytoplasmic tail of the cadherin. Cadherins cluster to
           form foci of homophilic binding units. A key determinant
           of the strength of the binding that is mediated by
           cadherins is the juxtamembrane region of the cadherin.
           This region induces clustering and also binds to the
           protein p120ctn.
          Length = 145

 Score =  164 bits (418), Expect = 1e-50
 Identities = 59/121 (48%), Positives = 71/121 (58%), Gaps = 6/121 (4%)

Query: 219 RRREAHIKYPGPDDDVRENIINYDDEGGGEDDMTAFDITPLQIPIGGP----HPEMNNKI 274
           RRR+        +DD+RENIINYDDEGGGE+D  A+DI+ L+ P+        P ++   
Sbjct: 1   RRRKKEPLIIDKEDDIRENIINYDDEGGGEEDTDAYDISALRNPLDPLRRDVQPLISALP 60

Query: 275 PYGLGPMPMGVEPNVGIFIEEHKKRADADPNAPPFDDLRNYAYEGGGSTAGSLSSLASGS 334
                P P     +VG FI E    AD DP APP+D L  YAYEG GS AGSLSSL S S
Sbjct: 61  RPRPPPPPDP--GDVGDFIHEKLPEADNDPTAPPYDSLLTYAYEGRGSVAGSLSSLNSSS 118

Query: 335 S 335
           S
Sbjct: 119 S 119


>gnl|CDD|238058 cd00110, LamG, Laminin G domain; Laminin G-like domains are usually
           Ca++ mediated receptors that can have binding sites for
           steroids, beta1 integrins, heparin, sulfatides,
           fibulin-1, and alpha-dystroglycans. Proteins that
           contain LamG domains serve a variety of purposes
           including signal transduction via cell-surface steroid
           receptors, adhesion, migration and differentiation
           through mediation of cell adhesion molecules.
          Length = 151

 Score = 61.3 bits (149), Expect = 2e-11
 Identities = 26/117 (22%), Positives = 42/117 (35%), Gaps = 31/117 (26%)

Query: 77  PVTFKPQSYIKYALSFEPDKYTTQLQLKFRTREEFGELFRLSDQHNREYAILEIKDSKLR 136
            V+F   SY++      P    + +   FRT    G L     Q+  ++  LE++D +L 
Sbjct: 1   GVSFSGSSYVRLPTLPAPRTRLS-ISFSFRTTSPNGLLLYAGSQNGGDFLALELEDGRLV 59

Query: 137 FRYNLNNLKTDEKDIWLNAIKDSKLRFRYNLNNLKTDEKDIWLNAVAVNDGQWHFVK 193
            RY+L         + L++                            +NDGQWH V 
Sbjct: 60  LRYDLG-----SGSLVLSSK-------------------------TPLNDGQWHSVS 86


>gnl|CDD|216930 pfam02210, Laminin_G_2, Laminin G domain.  This family includes the
           Thrombospondin N-terminal-like domain, a Laminin G
           subfamily.
          Length = 124

 Score = 50.1 bits (120), Expect = 7e-08
 Identities = 20/89 (22%), Positives = 32/89 (35%), Gaps = 29/89 (32%)

Query: 105 FRTREEFGELFRLSDQHNREYAILEIKDSKLRFRYNLNNLKTDEKDIWLNAIKDSKLRFR 164
           FRT +  G L     +   ++  LE++D +L  RY+L                       
Sbjct: 1   FRTTQPNGLLLYAGGEDGLDFLALELEDGRLVLRYDLG---------------------- 38

Query: 165 YNLNNLKTDEKDIWLNAVAVNDGQWHFVK 193
                  +    + L+   +NDGQWH V 
Sbjct: 39  -------SGGSVLLLSGKKLNDGQWHRVS 60


>gnl|CDD|214598 smart00282, LamG, Laminin G domain. 
          Length = 132

 Score = 49.6 bits (119), Expect = 1e-07
 Identities = 20/94 (21%), Positives = 31/94 (32%), Gaps = 29/94 (30%)

Query: 100 QLQLKFRTREEFGELFRLSDQHNREYAILEIKDSKLRFRYNLNNLKTDEKDIWLNAIKDS 159
            +   FRT    G L     +   +Y  LE++D +L  RY+L +                
Sbjct: 1   SISFSFRTTSPNGLLLYAGSKGGGDYLALELRDGRLVLRYDLGS---------------- 44

Query: 160 KLRFRYNLNNLKTDEKDIWLNAVAVNDGQWHFVK 193
                            +  +   +NDGQWH V 
Sbjct: 45  -------------GPARLTSDPTPLNDGQWHRVA 65


>gnl|CDD|219677 pfam07974, EGF_2, EGF-like domain.  This family contains EGF
          domains found in a variety of extracellular proteins.
          Length = 31

 Score = 32.4 bits (74), Expect = 0.018
 Identities = 13/43 (30%), Positives = 16/43 (37%), Gaps = 12/43 (27%)

Query: 29 CPATEAVCNAHSRLHCYPEHGVCVGSMHTAKCQCLPGWSGTGC 71
          C A+  +CN           G CV      KC C  G+ G  C
Sbjct: 1  CSASG-ICN---------GRGTCVR--PCGKCVCDSGYQGATC 31


>gnl|CDD|238011 cd00054, EGF_CA, Calcium-binding EGF-like domain, present in a
          large number of membrane-bound and extracellular
          (mostly animal) proteins. Many of these proteins
          require calcium for their biological function and
          calcium-binding sites have been found to be located at
          the N-terminus of particular EGF-like domains;
          calcium-binding may be crucial for numerous
          protein-protein interactions. Six conserved core
          cysteines form three disulfide bridges as in non
          calcium-binding EGF domains, whose structures are very
          similar. EGF_CA can be found in tandem repeat
          arrangements.
          Length = 38

 Score = 31.8 bits (73), Expect = 0.029
 Identities = 10/29 (34%), Positives = 16/29 (55%), Gaps = 1/29 (3%)

Query: 44 CYPEHGVCVGSMHTAKCQCLPGWSGTGCV 72
          C    G CV ++ + +C C PG++G  C 
Sbjct: 11 CQN-GGTCVNTVGSYRCSCPPGYTGRNCE 38


>gnl|CDD|214544 smart00181, EGF, Epidermal growth factor-like domain. 
          Length = 35

 Score = 30.6 bits (69), Expect = 0.087
 Identities = 7/22 (31%), Positives = 13/22 (59%)

Query: 47 EHGVCVGSMHTAKCQCLPGWSG 68
           +G C+ +  +  C C PG++G
Sbjct: 9  SNGTCINTPGSYTCSCPPGYTG 30


>gnl|CDD|238010 cd00053, EGF, Epidermal growth factor domain, found in epidermal
          growth factor (EGF) and present in a large number of
          proteins, mostly animal; the list of proteins currently
          known to contain one or more copies of an EGF-like
          pattern is large and varied; the functional
          significance of EGF-like domains in what appear to be
          unrelated proteins is not yet clear; a common feature
          is that these repeats are found in the extracellular
          domain of membrane-bound proteins or in proteins known
          to be secreted (exception: prostaglandin G/H synthase);
          the domain includes six cysteine residues which have
          been shown to be involved in disulfide bonds; the main
          structure is a two-stranded beta-sheet followed by a
          loop to a C-terminal short two-stranded sheet;
          Subdomains between the conserved cysteines vary in
          length; the region between the 5th and 6th cysteine
          contains two conserved glycines of which at least one
          is present in most EGF-like domains; a subset of
          these bind calcium.
          Length = 36

 Score = 30.1 bits (68), Expect = 0.12
 Identities = 8/21 (38%), Positives = 13/21 (61%)

Query: 48 HGVCVGSMHTAKCQCLPGWSG 68
           G CV +  + +C C PG++G
Sbjct: 11 GGTCVNTPGSYRCVCPPGYTG 31


>gnl|CDD|204999 pfam12661, hEGF, Human growth factor-like EGF.  hEGF, or human
          growth factor-like EGF, domains have six conserved
          residues disulfide-bonded into the characteristic
          'ababcc' pattern. They are involved in growth and
          proliferation of cells, in proteins of the Notch/Delta
          pathway, neuregulin and selectins. hEGFs are also found
          in mosaic proteins with four-disulfide laminin EGFs
          such as aggrecan and perlecan. The core fold of the EGF
          domain consists of two small beta-hairpins packed
          against each other. Two major structural variants have
          been identified based on the structural context of the
          C-terminal Cys residue of disulfide 'c' in the
          C-terminal hairpin: hEGFs and cEGFs. In hEGFs the
          C-terminal thiol resides in the beta-turn, resulting in
          shorter loop-lengths between the Cys residues of
          disulfide 'c', typically C[8-9]XC. These shorter
          loop-lengths are also typical of the four-disulfide EGF
          domains, laminin and integrin. Tandem hEGF domains have
          six linking residues between terminal cysteines of
          adjacent domains. hEGF domains may or may not bind
          calcium in the linker region. hEGF domains with the
          consensus motif CXD4X[F,Y]XCXC are hydroxylated
          exclusively in the Asp residue.
          Length = 13

 Score = 28.1 bits (64), Expect = 0.44
 Identities = 7/10 (70%), Positives = 9/10 (90%)

Query: 59 KCQCLPGWSG 68
          KCQC PG++G
Sbjct: 1  KCQCPPGYTG 10


>gnl|CDD|215680 pfam00053, Laminin_EGF, Laminin EGF-like (Domains III and V).
          This family is like pfam00008 but has 8 conserved
          cysteines instead of six.
          Length = 49

 Score = 28.1 bits (63), Expect = 0.87
 Identities = 14/38 (36%), Positives = 15/38 (39%), Gaps = 11/38 (28%)

Query: 36 CNAHSRLH--CYPEHGVCVGSMHTAKCQCLPGWSGTGC 71
          CN H  L   C PE G C          C PG +G  C
Sbjct: 3  CNPHGSLSDTCDPETGQC---------LCKPGVTGRHC 31


>gnl|CDD|222036 pfam13304, AAA_21, AAA domain. 
          Length = 256

 Score = 30.5 bits (68), Expect = 1.0
 Identities = 22/157 (14%), Positives = 50/157 (31%), Gaps = 5/157 (3%)

Query: 89  ALSFEPDKYTTQLQLKFRTREEFGELFRLSDQHNR----EYAILEIKDSKLRFRYNLNNL 144
           AL+      +  L L          L  L D++      E+ I E     +R+RY     
Sbjct: 18  ALALLLLLLSLGLTLDRGLNVGIKLLPFLLDENEIEIPLEFEIEEFLIDGIRYRYGFELD 77

Query: 145 KTDEKDIWLNAIKDSKLRFRYNLNNLKTDEKDIWLNAVAVNDGQWHFVKTGSALFVTLLI 204
           K D  +  L   +  +    +     K   +        +   +   +    +L   LL+
Sbjct: 78  KEDILEELLYEYRKGEELL-FERERSKESFEKSPEKKRELRGLREVLLLLNLSLSSFLLL 136

Query: 205 SPNQVLVLVFVVYSRRREAHIKYPGPDDDVRENIINY 241
           +  ++L+ + + +S            +  + + +   
Sbjct: 137 ASLEILLSILLPFSFILGNLRNLRNIELKLLDLVKRL 173


>gnl|CDD|215681 pfam00054, Laminin_G_1, Laminin G domain. 
          Length = 131

 Score = 29.6 bits (67), Expect = 1.1
 Identities = 14/37 (37%), Positives = 20/37 (54%)

Query: 105 FRTREEFGELFRLSDQHNREYAILEIKDSKLRFRYNL 141
           FRT E  G L     Q  R++  LE++D +L   Y+L
Sbjct: 1   FRTTEPSGLLLYNGTQTERDFLALELRDGRLEVSYDL 37


>gnl|CDD|236405 PRK09194, PRK09194, prolyl-tRNA synthetase; Provisional.
          Length = 565

 Score = 30.8 bits (71), Expect = 1.1
 Identities = 10/19 (52%), Positives = 14/19 (73%), Gaps = 1/19 (5%)

Query: 109 EEFG-ELFRLSDQHNREYA 126
           EE+G EL RL D+H R++ 
Sbjct: 87  EEYGPELLRLKDRHGRDFV 105


>gnl|CDD|114337 pfam05609, LAP1C, Lamina-associated polypeptide 1C (LAP1C).  This
           family contains rat LAP1C proteins and several
           uncharacterized highly related sequences from both mice
           and humans. LAP1s (lamina-associated polypeptide 1s) are
           type 2 integral membrane proteins with a single
           membrane-spanning region of the inner nuclear membrane.
           LAP1s bind to both A- and B-type lamins and have a
           putative role in the membrane attachment and assembly of
           the nuclear lamina.
          Length = 465

 Score = 30.4 bits (68), Expect = 1.3
 Identities = 9/33 (27%), Positives = 17/33 (51%)

Query: 218 SRRREAHIKYPGPDDDVRENIINYDDEGGGEDD 250
            + R   ++YP  +    +N  ++ +EG  EDD
Sbjct: 73  DKLRRPPLRYPRYEATEVQNKQSFLEEGETEDD 105


>gnl|CDD|214543 smart00180, EGF_Lam, Laminin-type epidermal growth factor-like
          domain. 
          Length = 46

 Score = 26.9 bits (60), Expect = 2.4
 Identities = 11/44 (25%), Positives = 18/44 (40%), Gaps = 8/44 (18%)

Query: 36 CNA--HSRLHCYPEHGVCVGSMHTA--KC-QCLPG-W--SGTGC 71
          C+    +   C P+ G C    +    +C +C PG +     GC
Sbjct: 3  CDPGGSASGTCDPDTGQCECKPNVTGRRCDRCAPGYYGDGPPGC 46


>gnl|CDD|215652 pfam00008, EGF, EGF-like domain.  There is no clear separation
          between noise and signal. pfam00053 is very similar,
          but has 8 instead of 6 conserved cysteines. Includes
          some cytokine receptors. The EGF domain misses the
          N-terminus regions of the Ca2+ binding EGF domains
          (this is the main reason of discrepancy between
          swiss-prot domain start/end and Pfam). The family is
          hard to model due to many similar but different
          sub-types of EGF domains. Pfam certainly misses a
          number of EGF domains.
          Length = 32

 Score = 26.2 bits (58), Expect = 2.5
 Identities = 7/21 (33%), Positives = 11/21 (52%)

Query: 48 HGVCVGSMHTAKCQCLPGWSG 68
           G CV +     C+C  G++G
Sbjct: 10 GGTCVDTPGGYTCECPEGYTG 30


>gnl|CDD|238825 cd01647, RT_LTR, RT_LTR: Reverse transcriptases (RTs) from
           retrotransposons and retroviruses which have long
           terminal repeats (LTRs) in their DNA copies but not in
           their RNA template. RT catalyzes DNA replication from an
           RNA template, and is responsible for the replication of
           retroelements. An RT gene is usually indicative of a
           mobile element such as a retrotransposon or retrovirus.
           RTs are present in a variety of mobile elements,
           including retrotransposons, retroviruses, group II
           introns, bacterial msDNAs, hepadnaviruses, and
           Caulimoviruses.
          Length = 177

 Score = 28.7 bits (65), Expect = 3.0
 Identities = 29/129 (22%), Positives = 45/129 (34%), Gaps = 39/129 (30%)

Query: 131 KDSKLRF-----RYNLNNLKTDE-----KDIWLNAIKDSK------LRFRYNLNNLKTDE 174
           KD KLR      + N      D       D  L  +  +K      LR  Y+   +   E
Sbjct: 20  KDGKLRLCVDYRKLN-KVTIKDRYPLPTIDELLEELAGAKVFSKLDLRSGYH--QIPLAE 76

Query: 175 KDIWLNAVAVNDGQWHFV------KTGSALFVTLLISPNQVL-------VLVFV----VY 217
           +     A     G + +       K   A F  L+   N++L       V V++    VY
Sbjct: 77  ESRPKTAFRTPFGLYEYTRMPFGLKNAPATFQRLM---NKILGDLLGDFVEVYLDDILVY 133

Query: 218 SRRREAHIK 226
           S+  E H++
Sbjct: 134 SKTEEEHLE 142


>gnl|CDD|238012 cd00055, EGF_Lam, Laminin-type epidermal growth factor-like
          domain; laminins are the major noncollagenous
          components of basement membranes that mediate cell
          adhesion, growth migration, and differentiation; the
          laminin-type epidermal growth factor-like module occurs
          in tandem arrays; the domain contains 4 disulfide bonds
          (loops a-d) the first three resemble epidermal growth
          factor (EGF); the number of copies of this domain in
          the different forms of laminins is highly variable
          ranging from 3 up to 22 copies.
          Length = 50

 Score = 26.2 bits (58), Expect = 3.8
 Identities = 12/38 (31%), Positives = 14/38 (36%), Gaps = 11/38 (28%)

Query: 36 CNAHSRLH--CYPEHGVCVGSMHTAKCQCLPGWSGTGC 71
          CN H  L   C P  G C         +C P  +G  C
Sbjct: 4  CNGHGSLSGQCDPGTGQC---------ECKPNTTGRRC 32


>gnl|CDD|153114 cd01056, Euk_Ferritin, eukaryotic ferritins.  Eukaryotic Ferritin
           (Euk_Ferritin) domain. Ferritins are the primary iron
           storage proteins of most living organisms and members of
           a broad superfamily of ferritin-like diiron-carboxylate
           proteins. The iron-free (apoferritin) ferritin molecule
           is a protein shell composed of 24 protein chains
           arranged in 432 symmetry. Iron storage involves the
           uptake of iron (II) at the protein shell, its oxidation
           by molecular oxygen at the dinuclear ferroxidase
           centers, and the movement of iron (III) into the cavity
           for deposition as ferrihydrite; the protein shell can
           hold up to 4500 iron atoms. In vertebrates, two types of
           chains (subunits) have been characterized, H or M (fast)
           and L (slow), which differ in rates of iron uptake and
           mineralization. Fe(II) oxidation in the H/M subunits
           take place initially at the ferroxidase center, a
           carboxylate-bridged diiron center, located within the
           subunit four-helix bundle. In a complementary role,
           negatively charged residues on the protein shell inner
           surface of the L subunits promote ferrihydrite
           nucleation. Most plant ferritins combine both oxidase
           and nucleation functions in one chain: they have four
           interior glutamate residues as well as seven ferroxidase
           center residues.
          Length = 161

 Score = 27.9 bits (63), Expect = 5.4
 Identities = 22/76 (28%), Positives = 35/76 (46%), Gaps = 10/76 (13%)

Query: 84  SYIKYALSFEPDKYTTQLQLKFRTREEFGELFR-LSDQHNREYAILEIKDSKLR-FRYNL 141
           SY+  +++   D+              F + FR LSD+  RE+A   IK    R  R  L
Sbjct: 19  SYVYLSMAAYFDRDDV-------ALPGFAKFFRKLSDEE-REHAEKLIKYQNKRGGRVVL 70

Query: 142 NNLKTDEKDIWLNAIK 157
            ++K  EKD W + ++
Sbjct: 71  QDIKKPEKDEWGSGLE 86


>gnl|CDD|191582 pfam06679, DUF1180, Protein of unknown function (DUF1180).  This
           family consists of several hypothetical mammalian
           proteins of around 190 residues in length. The function
           of this family is unknown.
          Length = 163

 Score = 28.0 bits (62), Expect = 5.5
 Identities = 22/68 (32%), Positives = 30/68 (44%), Gaps = 8/68 (11%)

Query: 194 TGSALFVTLLISPNQVLVLVF-VVYSRRREAHIKYPGPDDDVRENI----INYDDEGGGE 248
           T  AL V +  S   ++  V   V +RRR    +  G  D   EN+    +  DDE   +
Sbjct: 94  TQRALIVLVAFSAAVIVYFVLRTVRTRRRNKKTRKYGVLDTNAENMELTPLEQDDE---D 150

Query: 249 DDMTAFDI 256
           DD T FD 
Sbjct: 151 DDSTLFDA 158


>gnl|CDD|166027 PLN02386, PLN02386, superoxide dismutase [Cu-Zn].
          Length = 152

 Score = 27.2 bits (60), Expect = 8.2
 Identities = 9/21 (42%), Positives = 12/21 (57%)

Query: 247 GEDDMTAFDITPLQIPIGGPH 267
           G+D    F I   QIP+ GP+
Sbjct: 89  GDDGTATFTIVDKQIPLTGPN 109


>gnl|CDD|219210 pfam06873, SerH, Cell surface immobilisation antigen SerH.  This
           family consists of several cell surface immobilisation
           antigen SerH proteins which seem to be specific to
           Tetrahymena thermophila. The SerH locus of Tetrahymena
           thermophila is one of several paralogous loci with genes
           encoding variants of the major cell surface protein
           known as the immobilisation antigen (i-ag).
          Length = 407

 Score = 28.0 bits (62), Expect = 8.4
 Identities = 19/71 (26%), Positives = 24/71 (33%), Gaps = 20/71 (28%)

Query: 29  CPATEAVCNAHSR----------LHCYP-------EHGVCVGSMHTAKCQCLPGWSGTGC 71
           C A+ A C + SR          L C P       +   C  S   A      GW+ + C
Sbjct: 175 CVASSASCGSTSRGTTAWTDADCLLCNPTTPYLVGDKSSCAASSCAACSSSTSGWTDSDC 234

Query: 72  V---TPTSPVT 79
               T  SP T
Sbjct: 235 NACNTTASPST 245


>gnl|CDD|221769 pfam12784, PDDEXK_2, PD-(D/E)XK nuclease family transposase.
           Members of this family belong to the PD-(D/E)XK nuclease
           superfamily. These proteins are transposase proteins.
          Length = 229

 Score = 27.5 bits (62), Expect = 9.0
 Identities = 12/56 (21%), Positives = 24/56 (42%), Gaps = 11/56 (19%)

Query: 109 EEFGELFRLSDQHNR-------EYAILEIKDSKLRFRYNLNNLKTDEKDIWLNAIK 157
           +++   ++L ++          E   +E+     +F  N   L+TD  D WL  +K
Sbjct: 124 DKYHSCYKLKEKETNKILTDDLEIHFIELP----KFNKNEEELETDTLDKWLYFLK 175


  Database: CDD.v3.10
    Posted date:  Mar 20, 2013  7:55 AM
  Number of letters in database: 10,937,602
  Number of sequences in database:  44,354
  
Lambda     K      H
   0.317    0.137    0.423 

Gapped
Lambda     K      H
   0.267   0.0715    0.140 


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 44354
Number of Hits to DB: 17,567,633
Number of extensions: 1710603
Number of successful extensions: 1292
Number of sequences better than 10.0: 1
Number of HSP's gapped: 1287
Number of HSP's successfully gapped: 33
Length of query: 339
Length of database: 10,937,602
Length adjustment: 98
Effective length of query: 241
Effective length of database: 6,590,910
Effective search space: 1588409310
Effective search space used: 1588409310
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.3 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.6 bits)
S2: 59 (26.5 bits)