RPS-BLAST 2.2.26 [Sep-21-2011]

Database: CDD.v3.10 
           44,354 sequences; 10,937,602 total letters

Searching..................................................done

Query= psy6570
         (713 letters)



>gnl|CDD|214531 smart00135, LY, Low-density lipoprotein-receptor YWTD domain.  Type
           "B" repeats in low-density lipoprotein (LDL) receptor
           that plays a central role in mammalian cholesterol
           metabolism. Also present in a variety of molecules
           similar to gp300/megalin.
          Length = 43

 Score = 61.5 bits (150), Expect = 2e-12
 Identities = 15/41 (36%), Positives = 23/41 (56%)

Query: 110 LVDNNIQWPTGITIDYPSQRLYWADPKARTIESINLNGKDR 150
           L+ + +  P G+ +D+   RLYW D     IE  NL+G +R
Sbjct: 3   LLSSGLGHPNGLAVDWIEGRLYWTDWGLDVIEVANLDGTNR 43



 Score = 51.8 bits (125), Expect = 5e-09
 Identities = 18/40 (45%), Positives = 23/40 (57%), Gaps = 3/40 (7%)

Query: 21 VLSNLHDPRGVAVDWVGKNLYWTDAGGRSSNNIMVSTLEG 60
          + S L  P G+AVDW+   LYWTD G      I V+ L+G
Sbjct: 4  LSSGLGHPNGLAVDWIEGRLYWTDWGLDV---IEVANLDG 40



 Score = 47.6 bits (114), Expect = 2e-07
 Identities = 18/43 (41%), Positives = 29/43 (67%), Gaps = 1/43 (2%)

Query: 64  RTLLNTGLNEPYDIALEPLSGRMFWTELGIKPRISGASIDGKN 106
           RTLL++GL  P  +A++ + GR++WT+ G    I  A++DG N
Sbjct: 1   RTLLSSGLGHPNGLAVDWIEGRLYWTDWG-LDVIEVANLDGTN 42


>gnl|CDD|215683 pfam00058, Ldl_recept_b, Low-density lipoprotein receptor repeat
           class B.  This domain is also known as the YWTD motif
           after the most conserved region of the repeat. The YWTD
           repeat is found in multiple tandem repeats and has been
           predicted to form a beta-propeller structure.
          Length = 42

 Score = 56.8 bits (138), Expect = 9e-11
 Identities = 15/42 (35%), Positives = 27/42 (64%)

Query: 84  GRMFWTELGIKPRISGASIDGKNKFNLVDNNIQWPTGITIDY 125
           GR++WT+  ++  IS A ++G ++  L   ++QWP GI +D 
Sbjct: 1   GRLYWTDSSLRASISVADLNGSDRRTLFSEDLQWPNGIAVDP 42



 Score = 41.8 bits (99), Expect = 2e-05
 Identities = 19/43 (44%), Positives = 26/43 (60%), Gaps = 2/43 (4%)

Query: 39 NLYWTDAGGRSSNNIMVSTLEGRKKRTLLNTGLNEPYDIALEP 81
           LYWTD+  R+S  I V+ L G  +RTL +  L  P  IA++P
Sbjct: 2  RLYWTDSSLRAS--ISVADLNGSDRRTLFSEDLQWPNGIAVDP 42



 Score = 35.2 bits (82), Expect = 0.004
 Identities = 13/35 (37%), Positives = 19/35 (54%), Gaps = 5/35 (14%)

Query: 2  ASISSGNVTRVKREMNLKTVLS-NLHDPRGVAVDW 35
          ASIS  ++    R    +T+ S +L  P G+AVD 
Sbjct: 12 ASISVADLNGSDR----RTLFSEDLQWPNGIAVDP 42



 Score = 32.5 bits (75), Expect = 0.042
 Identities = 13/37 (35%), Positives = 19/37 (51%), Gaps = 3/37 (8%)

Query: 128 QRLYWADPKAR-TIESINLNGKDRFVVYHTEDNGYKP 163
            RLYW D   R +I   +LNG DR  ++   ++   P
Sbjct: 1   GRLYWTDSSLRASISVADLNGSDRRTLF--SEDLQWP 35


>gnl|CDD|225926 COG3391, COG3391, Uncharacterized conserved protein [Function
           unknown].
          Length = 381

 Score = 38.7 bits (90), Expect = 0.010
 Identities = 45/199 (22%), Positives = 74/199 (37%), Gaps = 22/199 (11%)

Query: 2   ASISSGNVTRVKREMNLKTVLSN--LHDPRGVAVDWVGKNLYWTDAGGRSSNNIMVSTLE 59
           A+  S +V+ +    N  T   +     P GVAV+  G  +Y T      SN + V    
Sbjct: 48  ANSGSNDVSVIDATSNTVTQSLSVGGVYPAGVAVNPAGNKVYVTT---GDSNTVSVIDTA 104

Query: 60  GRKKRTLLNTGLNEPYDIALEPLSGRMFWTELGIKPRISGASIDGK-NKFNLVDNNIQWP 118
                  +  GL  P  +A++P    ++    G     + + ID   NK          P
Sbjct: 105 TNTVLGSIPVGLG-PVGLAVDPDGKYVYVANAGNGNN-TVSVIDAATNKVTATIPVGNTP 162

Query: 119 TGITIDYPSQRLYWADPKARTIESINLNGKDRFVVYHTEDNGYK----PYKLEVFEDNLY 174
           TG+ +D    ++Y  +    T+  I+ +G    VV  +  +       P  + V  D   
Sbjct: 163 TGVAVDPDGNKVYVTNSDDNTVSVIDTSGN--SVVRGSVGSLVGVGTGPAGIAVDPDGNR 220

Query: 175 FSTYRTN------NILKIN 187
              Y  N      N+LKI+
Sbjct: 221 V--YVANDGSGSNNVLKID 237



 Score = 31.8 bits (72), Expect = 1.2
 Identities = 16/61 (26%), Positives = 28/61 (45%), Gaps = 3/61 (4%)

Query: 27 DPRGVAVDWVGKNLYWTDAGGRSSNNIMVSTLEGRKKRTLLNTGLNEPYDIALEPLSGRM 86
           P GVAV+  G  +Y  ++G   SN++ V           L+ G   P  +A+ P   ++
Sbjct: 32 GPGGVAVNPDGTQVYVANSG---SNDVSVIDATSNTVTQSLSVGGVYPAGVAVNPAGNKV 88

Query: 87 F 87
          +
Sbjct: 89 Y 89



 Score = 30.6 bits (69), Expect = 2.7
 Identities = 33/162 (20%), Positives = 50/162 (30%), Gaps = 15/162 (9%)

Query: 27  DPRGVAVDWVGKNLYWTDAGGRSSNNIMVSTLEG----RKKRTLLNTGLNEPYDIALEPL 82
            P GVAVD  G  +Y T++     N + V    G    R     L      P  IA++P 
Sbjct: 161 TPTGVAVDPDGNKVYVTNSD---DNTVSVIDTSGNSVVRGSVGSLVGVGTGPAGIAVDP- 216

Query: 83  SGRMFWTELGIKPRISGASIDGKNKFNLV-DNNIQ--WPTGITIDYPSQRLYWADPKART 139
            G   +         +   ID         D  +    P G+ +D   +  Y A+ +  T
Sbjct: 217 DGNRVYVANDGSGSNNVLKIDTATGNVTATDLPVGSGAPRGVAVDPAGKAAYVANSQGGT 276

Query: 140 IESINLNGKDRFVVYHTEDNGYKPYKLEVFEDNLYFSTYRTN 181
           +  I+        V  T   G      E     +        
Sbjct: 277 VSVIDG---ATDRVVKTGPTG-NEALGEPVSIAISPLYDTNY 314


>gnl|CDD|215652 pfam00008, EGF, EGF-like domain.  There is no clear separation
           between noise and signal. pfam00053 is very similar, but
           has 8 instead of 6 conserved cysteines. Includes some
           cytokine receptors. The EGF domain misses the N-terminus
           regions of the Ca2+ binding EGF domains (this is the
           main reason of discrepancy between swiss-prot domain
           start/end and Pfam). The family is hard to model due to
           many similar but different sub-types of EGF domains.
           Pfam certainly misses a number of EGF domains.
          Length = 32

 Score = 33.6 bits (77), Expect = 0.012
 Identities = 16/32 (50%), Positives = 21/32 (65%), Gaps = 1/32 (3%)

Query: 555 CTPN-YCSNNGTCVLIEGKPSCKCLPPYSGKQ 585
           C+PN  CSN GTCV   G  +C+C   Y+GK+
Sbjct: 1   CSPNNPCSNGGTCVDTPGGYTCECPEGYTGKR 32



 Score = 28.9 bits (65), Expect = 0.57
 Identities = 15/29 (51%), Positives = 17/29 (58%), Gaps = 2/29 (6%)

Query: 388 ENKCHNGGTCIATTQ--TCVCPPGFTGDT 414
            N C NGGTC+ T    TC CP G+TG  
Sbjct: 4   NNPCSNGGTCVDTPGGYTCECPEGYTGKR 32



 Score = 27.8 bits (62), Expect = 1.8
 Identities = 14/29 (48%), Positives = 17/29 (58%)

Query: 420 NLKCQNGGVCVNKTTGLECDCPKFYYGKN 448
           N  C NGG CV+   G  C+CP+ Y GK 
Sbjct: 4   NNPCSNGGTCVDTPGGYTCECPEGYTGKR 32



 Score = 25.9 bits (57), Expect = 7.0
 Identities = 9/29 (31%), Positives = 15/29 (51%)

Query: 224 DDKPCHQSALCINLPSSHTCLCPDHLTEE 252
            + PC     C++ P  +TC CP+  T +
Sbjct: 3   PNNPCSNGGTCVDTPGGYTCECPEGYTGK 31



 Score = 25.9 bits (57), Expect = 7.5
 Identities = 13/26 (50%), Positives = 16/26 (61%)

Query: 513 CLNGGTCIPNSKNNVCKCPSQYTGRR 538
           C NGGTC+       C+CP  YTG+R
Sbjct: 7   CSNGGTCVDTPGGYTCECPEGYTGKR 32



 Score = 25.9 bits (57), Expect = 8.1
 Identities = 10/22 (45%), Positives = 11/22 (50%)

Query: 460 NGECSITDSGPKCMCSPGYSGK 481
            G C  T  G  C C  GY+GK
Sbjct: 10  GGTCVDTPGGYTCECPEGYTGK 31



 Score = 25.5 bits (56), Expect = 9.0
 Identities = 9/20 (45%), Positives = 12/20 (60%)

Query: 487 TCLNGDSGPKCMCSPGYSGK 506
           TC++   G  C C  GY+GK
Sbjct: 12  TCVDTPGGYTCECPEGYTGK 31


>gnl|CDD|238011 cd00054, EGF_CA, Calcium-binding EGF-like domain, present in a
           large number of membrane-bound and extracellular (mostly
           animal) proteins. Many of these proteins require calcium
           for their biological function and calcium-binding sites
           have been found to be located at the N-terminus of
           particular EGF-like domains; calcium-binding may be
           crucial for numerous protein-protein interactions. Six
           conserved core cysteines form three disulfide bridges as
           in non calcium-binding EGF domains, whose structures are
           very similar. EGF_CA can be found in tandem repeat
           arrangements.
          Length = 38

 Score = 33.4 bits (77), Expect = 0.018
 Identities = 16/31 (51%), Positives = 19/31 (61%), Gaps = 2/31 (6%)

Query: 388 ENKCHNGGTCIATTQ--TCVCPPGFTGDTCQ 416
            N C NGGTC+ T     C CPPG+TG  C+
Sbjct: 8   GNPCQNGGTCVNTVGSYRCSCPPGYTGRNCE 38



 Score = 33.4 bits (77), Expect = 0.020
 Identities = 14/32 (43%), Positives = 17/32 (53%)

Query: 556 TPNYCSNNGTCVLIEGKPSCKCLPPYSGKQCT 587
           + N C N GTCV   G   C C P Y+G+ C 
Sbjct: 7   SGNPCQNGGTCVNTVGSYRCSCPPGYTGRNCE 38



 Score = 32.2 bits (74), Expect = 0.041
 Identities = 15/28 (53%), Positives = 17/28 (60%)

Query: 423 CQNGGVCVNKTTGLECDCPKFYYGKNCQ 450
           CQNGG CVN      C CP  Y G+NC+
Sbjct: 11  CQNGGTCVNTVGSYRCSCPPGYTGRNCE 38



 Score = 29.1 bits (66), Expect = 0.61
 Identities = 16/29 (55%), Positives = 19/29 (65%), Gaps = 2/29 (6%)

Query: 513 CLNGGTCIPNSKNN-VCKCPSQYTGRRCE 540
           C NGGTC+ N+  +  C CP  YTGR CE
Sbjct: 11  CQNGGTCV-NTVGSYRCSCPPGYTGRNCE 38



 Score = 29.1 bits (66), Expect = 0.65
 Identities = 10/22 (45%), Positives = 14/22 (63%)

Query: 487 TCLNGDSGPKCMCSPGYSGKKC 508
           TC+N     +C C PGY+G+ C
Sbjct: 16  TCVNTVGSYRCSCPPGYTGRNC 37



 Score = 27.6 bits (62), Expect = 2.0
 Identities = 14/33 (42%), Positives = 16/33 (48%), Gaps = 4/33 (12%)

Query: 351 PCLNQGMCYPDLTHPEPTYKCHCAPSYTGARCE 383
           PC N G C         +Y+C C P YTG  CE
Sbjct: 10  PCQNGGTCVNT----VGSYRCSCPPGYTGRNCE 38



 Score = 27.2 bits (61), Expect = 2.7
 Identities = 8/21 (38%), Positives = 10/21 (47%)

Query: 227 PCHQSALCINLPSSHTCLCPD 247
           PC     C+N   S+ C CP 
Sbjct: 10  PCQNGGTCVNTVGSYRCSCPP 30



 Score = 26.1 bits (58), Expect = 6.4
 Identities = 10/24 (41%), Positives = 13/24 (54%)

Query: 460 NGECSITDSGPKCMCSPGYSGKKC 483
            G C  T    +C C PGY+G+ C
Sbjct: 14  GGTCVNTVGSYRCSCPPGYTGRNC 37


>gnl|CDD|165214 PHA02887, PHA02887, EGF-like protein; Provisional.
          Length = 126

 Score = 34.9 bits (80), Expect = 0.035
 Identities = 17/43 (39%), Positives = 28/43 (65%), Gaps = 6/43 (13%)

Query: 451 YSQCK----NYCVNGEC-SITDSGPK-CMCSPGYSGKKCDTCT 487
           + +CK    ++C+NGEC +I D   K C+C+ GY+G +CD  +
Sbjct: 83  FEKCKNDFNDFCINGECMNIIDLDEKFCICNKGYTGIRCDEVS 125



 Score = 30.3 bits (68), Expect = 1.6
 Identities = 15/35 (42%), Positives = 19/35 (54%), Gaps = 5/35 (14%)

Query: 635 FCFNGGTCREQNYSLDPDLKPICICPRGYAGVRCQ 669
           FC NG  C      +D D +  CIC +GY G+RC 
Sbjct: 93  FCING-ECM---NIIDLD-EKFCICNKGYTGIRCD 122


>gnl|CDD|119334 cd06564, GH20_DspB_LnbB-like, Glycosyl hydrolase family 20 (GH20)
           catalytic domain of dispersin B (DspB),
           lacto-N-biosidase (LnbB) and related proteins. Dispersin
           B is a soluble beta-N-acetylglucosamidase found in
           bacteria that hydrolyzes the beta-1,6-linkages of PGA
           (poly-beta-(1,6)-N-acetylglucosamine), a major component
           of the extracellular polysaccharide matrix.
           Lacto-N-biosidase hydrolyzes lacto-N-biose (LNB) type I
           oligosaccharides at the nonreducing terminus to produce
           lacto-N-biose as part of the GNB/LNB
           (galacto-N-biose/lacto-N-biose I) degradation pathway.
           The lacto-N-biosidase from Bifidobacterium bifidum has
           this GH20 domain, a carbohydrate binding module 32, and
           a bacterial immunoglobulin-like domain 2, as well as a
           YSIRK signal peptide and a G5 membrane anchor at the N
           and C termini, respectively. The GH20 hexosaminidases
           are thought to act via a catalytic mechanism in which
           the catalytic nucleophile is not provided by solvent or
           the enzyme, but by the substrate itself.
          Length = 326

 Score = 36.5 bits (85), Expect = 0.044
 Identities = 24/118 (20%), Positives = 41/118 (34%), Gaps = 27/118 (22%)

Query: 90  ELGIKPRISGASIDGKNKFNLVDNNIQWPTGITIDYPSQRLYWADPKAR-----TIESIN 144
           + G  PR+ G  I  K    ++  ++       I+Y S    WADPK        I  IN
Sbjct: 191 DKGKTPRVWGDGIYYKGDTTVLSKDVI------INYWS--YGWADPKELLNKGYKI--IN 240

Query: 145 LNGKDRFVVYHTEDNGYKPYKLEVFEDNLYFSTYRTNNILKINKFGNSDFNVLANNLN 202
            N    ++V      G      +++ +               NKFG ++  +   +  
Sbjct: 241 TNDGYLYIVPGAGYYGDYLNTEDIYNNW------------TPNKFGGTNATLPEGDPQ 286


>gnl|CDD|219677 pfam07974, EGF_2, EGF-like domain.  This family contains EGF
           domains found in a variety of extracellular proteins.
          Length = 31

 Score = 32.0 bits (73), Expect = 0.052
 Identities = 11/25 (44%), Positives = 14/25 (56%)

Query: 391 CHNGGTCIATTQTCVCPPGFTGDTC 415
           C+  GTC+     CVC  G+ G TC
Sbjct: 7   CNGRGTCVRPCGKCVCDSGYQGATC 31



 Score = 26.6 bits (59), Expect = 4.3
 Identities = 12/31 (38%), Positives = 13/31 (41%), Gaps = 2/31 (6%)

Query: 556 TPNYCSNNGTCVLIEGKPSCKCLPPYSGKQC 586
               C+  GTCV   GK  C C   Y G  C
Sbjct: 3   ASGICNGRGTCVRPCGK--CVCDSGYQGATC 31


>gnl|CDD|193205 pfam12729, 4HB_MCP_1, Four helix bundle sensory module for signal
           transduction.  This family is a four helix bundle that
           operates as a ubiquitous sensory module in prokaryotic
           signal-transduction. The 4HB_MCP is always found between
           two predicted transmembrane helices indicating that it
           detects only extracellular signals. In many cases the
           domain is associated with a cytoplasmic HAMP domain
           suggesting that most proteins carrying the bundle might
           share the mechanism of transmembrane signalling which is
           well-characterized in E coli chemoreceptors.
          Length = 181

 Score = 34.9 bits (81), Expect = 0.068
 Identities = 11/20 (55%), Positives = 13/20 (65%)

Query: 686 ISSILILILLLITVGGIGYY 705
           I   L+L LLLI VG +G Y
Sbjct: 9   ILLFLLLALLLIIVGIVGLY 28


>gnl|CDD|218955 pfam06247, Plasmod_Pvs28, Plasmodium ookinete surface protein
           Pvs28.  This family consists of several ookinete surface
           proteins (Pvs28) from several species of Plasmodium.
           Pvs25 and Pvs28 are expressed on the surface of
           ookinetes. These proteins are potential candidates for
           vaccine and induce antibodies that block the infectivity
           of Plasmodium vivax in immunised animals.
          Length = 196

 Score = 34.3 bits (79), Expect = 0.11
 Identities = 39/152 (25%), Positives = 52/152 (34%), Gaps = 50/152 (32%)

Query: 471 KCMCSPGYSGKKCDTC--------------------TCLNGDSGP-----KCMCSPGYSG 505
           +C C+ GY  K  +TC                    TC+N  +       KC C  GY+ 
Sbjct: 21  ECKCNEGYVLKNENTCEEKVKCDKLENVNKVCGEYATCINQANKAEEKALKCGCINGYT- 79

Query: 506 KKCDTCTCLNGGTCIPNSKNNVC----KC---PSQYTGRRCECAVGDTSCASLANKCTPN 558
                   L+ G C+PN  NN      KC   P+      C C +G     +   KCT  
Sbjct: 80  --------LSQGVCVPNKCNNKVCGSGKCIVDPANPNNTTCSCNIGKVPDQN--GKCTKT 129

Query: 559 -------YCSNNGTCVLIEGKPSCKCLPPYSG 583
                   C  N  C L+ G   C C   + G
Sbjct: 130 GETKCSLKCKENEECKLVGGYYECVCKEGFPG 161


>gnl|CDD|238010 cd00053, EGF, Epidermal growth factor domain, found in epidermal
           growth factor (EGF) and present in a large number of
           proteins, mostly animal; the list of proteins currently
           known to contain one or more copies of an EGF-like
           pattern is large and varied; the functional significance
           of EGF-like domains in what appear to be unrelated
           proteins is not yet clear; a common feature is that
           these repeats are found in the extracellular domain of
           membrane-bound proteins or in proteins known to be
           secreted (exception: prostaglandin G/H synthase); the
           domain includes six cysteine residues which have been
           shown to be involved in disulfide bonds; the main
           structure is a two-stranded beta-sheet followed by a
           loop to a C-terminal short two-stranded sheet;
           Subdomains between the conserved cysteines vary in
           length; the region between the 5th and 6th cysteine
           contains two conserved glycines of which at least one
           is present in most EGF-like domains; a subset of
           these bind calcium.
          Length = 36

 Score = 30.5 bits (69), Expect = 0.19
 Identities = 17/28 (60%), Positives = 19/28 (67%), Gaps = 2/28 (7%)

Query: 389 NKCHNGGTCIATTQ--TCVCPPGFTGDT 414
           N C NGGTC+ T     CVCPPG+TGD 
Sbjct: 6   NPCSNGGTCVNTPGSYRCVCPPGYTGDR 33



 Score = 30.1 bits (68), Expect = 0.23
 Identities = 14/29 (48%), Positives = 15/29 (51%)

Query: 556 TPNYCSNNGTCVLIEGKPSCKCLPPYSGK 584
             N CSN GTCV   G   C C P Y+G 
Sbjct: 4   ASNPCSNGGTCVNTPGSYRCVCPPGYTGD 32



 Score = 28.2 bits (63), Expect = 1.3
 Identities = 15/29 (51%), Positives = 17/29 (58%), Gaps = 1/29 (3%)

Query: 513 CLNGGTCIPNSKNNVCKCPSQYTG-RRCE 540
           C NGGTC+    +  C CP  YTG R CE
Sbjct: 8   CSNGGTCVNTPGSYRCVCPPGYTGDRSCE 36



 Score = 26.7 bits (59), Expect = 4.5
 Identities = 9/21 (42%), Positives = 12/21 (57%)

Query: 227 PCHQSALCINLPSSHTCLCPD 247
           PC     C+N P S+ C+CP 
Sbjct: 7   PCSNGGTCVNTPGSYRCVCPP 27


>gnl|CDD|238012 cd00055, EGF_Lam, Laminin-type epidermal growth factor-like domain;
           laminins are the major noncollagenous components of
           basement membranes that mediate cell adhesion, growth
           migration, and differentiation; the laminin-type
           epidermal growth factor-like module occurs in tandem
           arrays; the domain contains 4 disulfide bonds (loops
           a-d) the first three resemble epidermal growth factor
           (EGF); the number of copies of this domain in the
           different forms of laminins is highly variable ranging
           from 3 up to 22 copies.
          Length = 50

 Score = 30.4 bits (69), Expect = 0.29
 Identities = 11/33 (33%), Positives = 14/33 (42%), Gaps = 4/33 (12%)

Query: 391 CHNGGT----CIATTQTCVCPPGFTGDTCQQCL 419
           C+  G+    C   T  C C P  TG  C +C 
Sbjct: 4   CNGHGSLSGQCDPGTGQCECKPNTTGRRCDRCA 36


>gnl|CDD|215680 pfam00053, Laminin_EGF, Laminin EGF-like (Domains III and V).  This
           family is like pfam00008 but has 8 conserved cysteines
           instead of six.
          Length = 49

 Score = 29.6 bits (67), Expect = 0.48
 Identities = 14/40 (35%), Positives = 16/40 (40%), Gaps = 4/40 (10%)

Query: 394 GGTCIATTQTCVCPPGFTGDTCQQCL----NLKCQNGGVC 429
             TC   T  C+C PG TG  C +C      L    G  C
Sbjct: 10  SDTCDPETGQCLCKPGVTGRHCDRCKPGYYGLPSDPGQGC 49


>gnl|CDD|214544 smart00181, EGF, Epidermal growth factor-like domain. 
          Length = 35

 Score = 29.4 bits (66), Expect = 0.50
 Identities = 12/30 (40%), Positives = 13/30 (43%)

Query: 453 QCKNYCVNGECSITDSGPKCMCSPGYSGKK 482
                C NG C  T     C C PGY+G K
Sbjct: 3   ASGGPCSNGTCINTPGSYTCSCPPGYTGDK 32


>gnl|CDD|214543 smart00180, EGF_Lam, Laminin-type epidermal growth factor-like
           domain.
          Length = 46

 Score = 29.2 bits (66), Expect = 0.70
 Identities = 11/29 (37%), Positives = 12/29 (41%)

Query: 392 HNGGTCIATTQTCVCPPGFTGDTCQQCLN 420
              GTC   T  C C P  TG  C +C  
Sbjct: 8   SASGTCDPDTGQCECKPNVTGRRCDRCAP 36


>gnl|CDD|215497 PLN02919, PLN02919, haloacid dehalogenase-like hydrolase family
           protein.
          Length = 1057

 Score = 32.9 bits (75), Expect = 0.80
 Identities = 45/196 (22%), Positives = 75/196 (38%), Gaps = 56/196 (28%)

Query: 28  PRGVAVDWVGKNLYWTDAGGRSSNNI-----MVSTLEG--------RKKRTLLNTGLNEP 74
           P+G+A +     LY  D    +   I      V TL G        +  +   +  LN P
Sbjct: 626 PQGLAYNAKKNLLYVADTENHALREIDFVNETVRTLAGNGTKGSDYQGGKKGTSQVLNSP 685

Query: 75  YDIALEPLSGRMF---------W---TELGIKPRISGASIDG----KNKFNLVDNNIQWP 118
           +D+  EP++ +++         W      G+    SG   DG     N  +    +   P
Sbjct: 686 WDVCFEPVNEKVYIAMAGQHQIWEYNISDGVTRVFSG---DGYERNLNGSSGTSTSFAQP 742

Query: 119 TGITIDYPSQRLYWADPKARTIESINL-NGKDRFVVYHTEDNGYKPYKLEVFEDNLYFST 177
           +GI++    + LY AD ++ +I +++L  G  R +       G  P     F DNL+   
Sbjct: 743 SGISLSPDLKELYIADSESSSIRALDLKTGGSRLLA------GGDPT----FSDNLF--- 789

Query: 178 YRTNNILKINKFGNSD 193
                     KFG+ D
Sbjct: 790 ----------KFGDHD 795


>gnl|CDD|221770 pfam12785, VESA1_N, Variant erythrocyte surface antigen-1.  This
           family represents the N-terminal of the variant
           erythrocyte surface antigen 1, versions a and b, of
           Babesia. Babesia bovis is a tick-borne,
           intra-erythrocytic, protozoal parasite of cattle that
           shares many lifestyle parallels with the most virulent
           of the human malarial parasites, Plasmodium falciparum.
           Babesia uses antigenic variation to establish consistent
           infections of long duration. The two variants of VESA1,
           a and b, are expressed from different but closely
           related genes, and variation is achieved through the
           involvement of a segmental gene conversion mechanism and
           low-frequency epigenetic in situ switching of
           transcriptional activity from the VESA1 gene-pair to a
           possible other gene pair.
          Length = 428

 Score = 32.3 bits (74), Expect = 0.84
 Identities = 27/117 (23%), Positives = 32/117 (27%), Gaps = 31/117 (26%)

Query: 458 CVNGECSITDSGPKCMCSPGYSG-----------KKCDTCTCLNGDS-----------GP 495
           C  G       G       G  G             CD C C+  D            G 
Sbjct: 78  CWGGGGGKCKGGGGNGNGHGQKGGCKYLKDVKPNNPCDDCGCMKWDVPKADSDEGHHLGR 137

Query: 496 KCM-CSPGYSGKKCDTCTC-LNGGTCIPNSKNNVCKCPSQYTGRRCECAVGDTSCAS 550
            C  CS   SG     C C   GG+C    +   CKC     G+ C+C         
Sbjct: 138 GCTRCSD--SGGSDHGCKCSTGGGSCSAGKE---CKCALA--GKCCKCCCKGKCGKG 187



 Score = 31.5 bits (72), Expect = 1.7
 Identities = 28/115 (24%), Positives = 37/115 (32%), Gaps = 19/115 (16%)

Query: 422 KCQNGGVCVNKTTGLECDCPKFYYGKN--CQYSQ---CKNYCVNGECSITDSGPKCMCSP 476
           KC  GG       G         +G+   C+Y +     N C +  C   D         
Sbjct: 77  KCWGGG-GGKCKGGGGNGNG---HGQKGGCKYLKDVKPNNPCDDCGCMKWDVPKADSDEG 132

Query: 477 GYSGKKCDTCTCLNGDSGPKCMCSPGYSG-KKCDTCTCLNGGTCIPNSKNNVCKC 530
            + G+ C  C+  +G S   C CS G         C C   G C        CKC
Sbjct: 133 HHLGRGCTRCSD-SGGSDHGCKCSTGGGSCSAGKECKCALAGKC--------CKC 178


>gnl|CDD|227064 COG4720, COG4720, Predicted membrane protein [Function unknown].
          Length = 177

 Score = 31.6 bits (72), Expect = 0.86
 Identities = 5/36 (13%), Positives = 15/36 (41%)

Query: 671 LVHYISKKQSYVNSHISSILILILLLITVGGIGYYI 706
           +     K++   +   + I + I+L + +    Y +
Sbjct: 90  IAGLFGKREMLSSGKKTIIWLGIVLGLAIMVGWYLL 125


>gnl|CDD|204999 pfam12661, hEGF, Human growth factor-like EGF.  hEGF, or human
           growth factor-like EGF, domains have six conserved
           residues disulfide-bonded into the characteristic
           'ababcc' pattern. They are involved in growth and
           proliferation of cells, in proteins of the Notch/Delta
           pathway, neuregulin and selectins. hEGFs are also found
           in mosaic proteins with four-disulfide laminin EGFs such
           as aggrecan and perlecan. The core fold of the EGF
           domain consists of two small beta-hairpins packed
           against each other. Two major structural variants have
           been identified based on the structural context of the
           C-terminal Cys residue of disulfide 'c' in the
           C-terminal hairpin: hEGFs and cEGFs. In hEGFs the
           C-terminal thiol resides in the beta-turn, resulting in
           shorter loop-lengths between the Cys residues of
           disulfide 'c', typically C[8-9]XC. These shorter
           loop-lengths are also typical of the four-disulfide EGF
           domains, laminin and integrin. Tandem hEGF domains have
           six linking residues between terminal cysteines of
           adjacent domains. hEGF domains may or may not bind
           calcium in the linker region. hEGF domains with the
           consensus motif CXD4X[F,Y]XCXC are hydroxylated
           exclusively in the Asp residue.
          Length = 13

 Score = 27.7 bits (63), Expect = 1.2
 Identities = 8/13 (61%), Positives = 9/13 (69%)

Query: 403 TCVCPPGFTGDTC 415
            C CPPG+TG  C
Sbjct: 1   KCQCPPGYTGPRC 13


>gnl|CDD|214542 smart00179, EGF_CA, Calcium-binding EGF-like domain. 
          Length = 39

 Score = 28.0 bits (63), Expect = 1.3
 Identities = 14/29 (48%), Positives = 18/29 (62%), Gaps = 1/29 (3%)

Query: 423 CQNGGVCVNKTTGLECDCPK-FYYGKNCQ 450
           CQNGG CVN      C+CP  +  G+NC+
Sbjct: 11  CQNGGTCVNTVGSYRCECPPGYTDGRNCE 39



 Score = 28.0 bits (63), Expect = 1.5
 Identities = 8/21 (38%), Positives = 10/21 (47%)

Query: 227 PCHQSALCINLPSSHTCLCPD 247
           PC     C+N   S+ C CP 
Sbjct: 10  PCQNGGTCVNTVGSYRCECPP 30



 Score = 28.0 bits (63), Expect = 1.5
 Identities = 16/31 (51%), Positives = 20/31 (64%), Gaps = 3/31 (9%)

Query: 389 NKCHNGGTCIAT--TQTCVCPPGFT-GDTCQ 416
           N C NGGTC+ T  +  C CPPG+T G  C+
Sbjct: 9   NPCQNGGTCVNTVGSYRCECPPGYTDGRNCE 39



 Score = 26.1 bits (58), Expect = 6.8
 Identities = 14/33 (42%), Positives = 18/33 (54%), Gaps = 1/33 (3%)

Query: 556 TPNYCSNNGTCVLIEGKPSCKCLPPYS-GKQCT 587
           + N C N GTCV   G   C+C P Y+ G+ C 
Sbjct: 7   SGNPCQNGGTCVNTVGSYRCECPPGYTDGRNCE 39


>gnl|CDD|165381 PHA03099, PHA03099, epidermal growth factor-like protein (EGF-like
           protein); Provisional.
          Length = 139

 Score = 30.4 bits (68), Expect = 1.6
 Identities = 16/46 (34%), Positives = 27/46 (58%), Gaps = 3/46 (6%)

Query: 657 CICPRGYAGVRCQTLV---HYISKKQSYVNSHISSILILILLLITV 699
           C C  GY G+RCQ +V   +  S+K +   S+I S  I+++L+  +
Sbjct: 69  CRCSHGYTGIRCQHVVLVDYQRSEKPNTTTSYIPSPGIVLVLVGII 114


>gnl|CDD|233951 TIGR02614, ftsW, cell division protein FtsW.  This family consists
           of FtsW, an integral membrane protein with ten
           transmembrane segments. In general, it is one of two
           paralogs involved in peptidoglycan biosynthesis, the
           other being RodA, and is essential for cell division.
           All members of the seed alignment for this model are
           encoded in operons for the biosynthesis of
           UDP-N-acetylmuramoyl-pentapeptide, a precursor of murein
           (peptidoglycan). The FtsW designation is not used in
           endospore-forming bacteria (e.g. Bacillus subtilis),
           where the member of this family is designated SpoVE and
           three or more RodA/FtsW/SpoVE family paralogs are
           present. SpoVE acts in spore cortex formation and is
           dispensable for growth. Biological roles for FtsW in
           cell division include recruitment of penicillin-binding
           protein 3 to the division site [Cell envelope,
           Biosynthesis and degradation of murein sacculus and
           peptidoglycan, Cellular processes, Cell division].
          Length = 356

 Score = 30.2 bits (69), Expect = 3.6
 Identities = 10/32 (31%), Positives = 18/32 (56%)

Query: 671 LVHYISKKQSYVNSHISSILILILLLITVGGI 702
           L  Y+++KQ  V S +  +  L +L + VG +
Sbjct: 118 LAWYLARKQKEVKSFLKFLKPLAVLGLLVGLL 149


>gnl|CDD|220280 pfam09529, Intg_mem_TP0381, Integral membrane protein
           (intg_mem_TP0381).  This entry represents a family of
           hydrophobic proteins with seven predicted transmembrane
           alpha helices. Members are found in Bacillus subtilis
           (ywaF), TP0381 from Treponema pallidum (TP0381),
           Streptococcus pyogenes, Rhodococcus erythropolis, etc.
          Length = 225

 Score = 29.9 bits (68), Expect = 3.7
 Identities = 8/37 (21%), Positives = 15/37 (40%)

Query: 671 LVHYISKKQSYVNSHISSILILILLLITVGGIGYYIF 707
           LV    ++          I  ++LLL  +    +YI+
Sbjct: 25  LVLLGRRQTERQKRLFRRIFAILLLLQEIALYLWYIY 61


>gnl|CDD|185732 cd08991, GH43_bXyl_2, Glycosyl hydrolase family 43.  This glycosyl
           hydrolase family 43 (GH43) includes enzymes that have
           been characterized with xylan-digesting beta-xylosidase
           (EC 3.2.1.37) and xylanase (endo-alpha-L-arabinanase)
           activities. These are all inverting enzymes (i.e. they
           invert the stereochemistry of the anomeric carbon atom
           of the substrate) that have an aspartate as the
           catalytic general base, a glutamate as the catalytic
           general acid and another aspartate that is responsible
           for pKa modulation and orienting the catalytic acid.
           Many of the enzymes in this family display both
           alpha-L-arabinofuranosidase and beta-D-xylosidase
           activity using aryl-glycosides as substrates. A common
           structural feature of GH43 enzymes is a 5-bladed
           beta-propeller domain that contains the catalytic acid
           and catalytic base. A long V-shaped groove, partially
           enclosed at one end, forms a single extended
           substrate-binding surface across the face of the
           propeller.
          Length = 294

 Score = 29.6 bits (67), Expect = 5.2
 Identities = 9/42 (21%), Positives = 19/42 (45%), Gaps = 4/42 (9%)

Query: 146 NGKDRFVVYHTEDNGYKPYKLEVFEDNLYFSTYRTNNILKIN 187
           +G + ++VYH  +   +     +  D LYF     ++ L + 
Sbjct: 254 DGGELYIVYHAHNATDEVEPRTMRIDPLYF----KDDGLDVA 291


>gnl|CDD|205157 pfam12947, EGF_3, EGF domain.  This family includes a variety of
           EGF-like domain homologues. This family includes the
           C-terminal domain of the malaria parasite MSP1 protein.
          Length = 36

 Score = 26.0 bits (58), Expect = 6.6
 Identities = 9/21 (42%), Positives = 10/21 (47%)

Query: 227 PCHQSALCINLPSSHTCLCPD 247
            CH +A C N   S TC C  
Sbjct: 7   GCHPNATCTNTGGSFTCTCKS 27


>gnl|CDD|201774 pfam01401, Peptidase_M2, Angiotensin-converting enzyme.  Members of
           this family are dipeptidyl carboxydipeptidases (cleave
           carboxyl dipeptides) and most notably convert
           angiotensin I to angiotensin II. Many members of this
           family contain a tandem duplication of the 600 amino
           acid peptidase domain, both of these are catalytically
           active. Most members are secreted membrane bound
           ectoenzymes.
          Length = 595

 Score = 29.5 bits (66), Expect = 6.6
 Identities = 16/34 (47%), Positives = 18/34 (52%), Gaps = 4/34 (11%)

Query: 631 SCAHFCFNGGTCREQNYSLDPDLKPICICPRGYA 664
           S A  CF  GTC    +SL+PDL  I    R YA
Sbjct: 119 STAKVCFPNGTC----WSLEPDLTNIMATSRKYA 148


  Database: CDD.v3.10
    Posted date:  Mar 20, 2013  7:55 AM
  Number of letters in database: 10,937,602
  Number of sequences in database:  44,354
  
Lambda     K      H
   0.320    0.136    0.459 

Gapped
Lambda     K      H
   0.267   0.0623    0.140 


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 44354
Number of Hits to DB: 33,329,328
Number of extensions: 3030625
Number of successful extensions: 2820
Number of sequences better than 10.0: 1
Number of HSP's gapped: 2750
Number of HSP's successfully gapped: 178
Length of query: 713
Length of database: 10,937,602
Length adjustment: 104
Effective length of query: 609
Effective length of database: 6,324,786
Effective search space: 3851794674
Effective search space used: 3851794674
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.4 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.8 bits)
S2: 63 (28.3 bits)