RPS-BLAST 2.2.26 [Sep-21-2011]

Database: CDD.v3.10 
           44,354 sequences; 10,937,602 total letters

Searching..................................................done

Query= psy17325
         (461 letters)



>gnl|CDD|214531 smart00135, LY, Low-density lipoprotein-receptor YWTD domain.  Type
           "B" repeats in low-density lipoprotein (LDL) receptor
           that plays a central role in mammalian cholesterol
           metabolism. Also present in a variety of molecules
           similar to gp300/megalin.
          Length = 43

 Score = 56.8 bits (138), Expect = 7e-11
 Identities = 21/41 (51%), Positives = 27/41 (65%)

Query: 143 IVVSGLDLVEGLAYDWIGGHIYWLDSRLNRIEVCDENGTNR 183
           ++ SGL    GLA DWI G +YW D  L+ IEV + +GTNR
Sbjct: 3   LLSSGLGHPNGLAVDWIEGRLYWTDWGLDVIEVANLDGTNR 43



 Score = 54.5 bits (132), Expect = 4e-10
 Identities = 18/42 (42%), Positives = 26/42 (61%)

Query: 361 TIISTKIYWPNGLTLDIATRRVYFADSKLDFIDFCNYDGTGR 402
           T++S+ +  PNGL +D    R+Y+ D  LD I+  N DGT R
Sbjct: 2   TLLSSGLGHPNGLAVDWIEGRLYWTDWGLDVIEVANLDGTNR 43



 Score = 46.4 bits (111), Expect = 3e-07
 Identities = 16/45 (35%), Positives = 22/45 (48%), Gaps = 3/45 (6%)

Query: 185 VLAKDNITQPRGMMLDPSPGTRWLFWTDWGENPRIERIGMDGSNR 229
            L    +  P G+ +D       L+WTDWG    IE   +DG+NR
Sbjct: 2   TLLSSGLGHPNGLAVDWI--EGRLYWTDWG-LDVIEVANLDGTNR 43



 Score = 43.4 bits (103), Expect = 4e-06
 Identities = 13/34 (38%), Positives = 21/34 (61%)

Query: 231 TIISTKIYWPNGLTLDIATRRVYFADSKLDFIDC 264
           T++S+ +  PNGL +D    R+Y+ D  LD I+ 
Sbjct: 2   TLLSSGLGHPNGLAVDWIEGRLYWTDWGLDVIEV 35



 Score = 39.9 bits (94), Expect = 7e-05
 Identities = 20/54 (37%), Positives = 24/54 (44%), Gaps = 14/54 (25%)

Query: 308 ILVSGQLY--EALALDLENGMLYYSTLKCTRWLFWTDWGENPRIERIGMDGSNR 359
            L+S  L     LA+D   G LY           WTDWG    IE   +DG+NR
Sbjct: 2   TLLSSGLGHPNGLAVDWIEGRLY-----------WTDWG-LDVIEVANLDGTNR 43


>gnl|CDD|215683 pfam00058, Ldl_recept_b, Low-density lipoprotein receptor repeat
           class B.  This domain is also known as the YWTD motif
           after the most conserved region of the repeat. The YWTD
           repeat is found in multiple tandem repeats and has been
           predicted to form a beta-propeller structure.
          Length = 42

 Score = 54.9 bits (133), Expect = 3e-10
 Identities = 15/40 (37%), Positives = 23/40 (57%)

Query: 208 LFWTDWGENPRIERIGMDGSNRSTIISTKIYWPNGLTLDI 247
           L+WTD      I    ++GS+R T+ S  + WPNG+ +D 
Sbjct: 3   LYWTDSSLRASISVADLNGSDRRTLFSEDLQWPNGIAVDP 42



 Score = 54.9 bits (133), Expect = 3e-10
 Identities = 15/40 (37%), Positives = 23/40 (57%)

Query: 338 LFWTDWGENPRIERIGMDGSNRSTIISTKIYWPNGLTLDI 377
           L+WTD      I    ++GS+R T+ S  + WPNG+ +D 
Sbjct: 3   LYWTDSSLRASISVADLNGSDRRTLFSEDLQWPNGIAVDP 42



 Score = 37.9 bits (89), Expect = 3e-04
 Identities = 17/42 (40%), Positives = 25/42 (59%), Gaps = 1/42 (2%)

Query: 161 GHIYWLDSRL-NRIEVCDENGTNRIVLAKDNITQPRGMMLDP 201
           G +YW DS L   I V D NG++R  L  +++  P G+ +DP
Sbjct: 1   GRLYWTDSSLRASISVADLNGSDRRTLFSEDLQWPNGIAVDP 42



 Score = 34.8 bits (81), Expect = 0.003
 Identities = 13/44 (29%), Positives = 22/44 (50%), Gaps = 5/44 (11%)

Query: 280 RKLYWIDEGGNGVPLKIGKANMDGSNASILVSGQLY--EALALD 321
            +LYW D   + +   I  A+++GS+   L S  L     +A+D
Sbjct: 1   GRLYWTD---SSLRASISVADLNGSDRRTLFSEDLQWPNGIAVD 41


>gnl|CDD|214542 smart00179, EGF_CA, Calcium-binding EGF-like domain. 
          Length = 39

 Score = 45.7 bits (109), Expect = 5e-07
 Identities = 16/42 (38%), Positives = 19/42 (45%), Gaps = 5/42 (11%)

Query: 70  DMNECEQPGYCSQ--MCTNTKGSYICSCNEGYVLEYDNHTCK 109
           D++EC     C     C NT GSY C C  GY    D   C+
Sbjct: 1   DIDECASGNPCQNGGTCVNTVGSYRCECPPGY---TDGRNCE 39



 Score = 28.4 bits (64), Expect = 0.69
 Identities = 12/29 (41%), Positives = 12/29 (41%), Gaps = 3/29 (10%)

Query: 32 ECKHKFGLCSN--TCHPTPLGALCTCPPG 58
          EC      C N  TC  T     C CPPG
Sbjct: 4  ECASG-NPCQNGGTCVNTVGSYRCECPPG 31


>gnl|CDD|238011 cd00054, EGF_CA, Calcium-binding EGF-like domain, present in a
           large number of membrane-bound and extracellular (mostly
           animal) proteins. Many of these proteins require calcium
           for their biological function and calcium-binding sites
           have been found to be located at the N-terminus of
           particular EGF-like domains; calcium-binding may be
           crucial for numerous protein-protein interactions. Six
           conserved core cysteines form three disulfide bridges as
           in non calcium-binding EGF domains, whose structures are
           very similar. EGF_CA can be found in tandem repeat
           arrangements.
          Length = 38

 Score = 43.4 bits (103), Expect = 3e-06
 Identities = 15/34 (44%), Positives = 17/34 (50%), Gaps = 2/34 (5%)

Query: 70  DMNECEQPGYCS--QMCTNTKGSYICSCNEGYVL 101
           D++EC     C     C NT GSY CSC  GY  
Sbjct: 1   DIDECASGNPCQNGGTCVNTVGSYRCSCPPGYTG 34



 Score = 27.2 bits (61), Expect = 1.7
 Identities = 13/32 (40%), Positives = 15/32 (46%), Gaps = 3/32 (9%)

Query: 29 DLAECKHKFGLCSN--TCHPTPLGALCTCPPG 58
          D+ EC      C N  TC  T     C+CPPG
Sbjct: 1  DIDECAS-GNPCQNGGTCVNTVGSYRCSCPPG 31


>gnl|CDD|219496 pfam07645, EGF_CA, Calcium-binding EGF domain. 
          Length = 42

 Score = 41.2 bits (97), Expect = 2e-05
 Identities = 15/42 (35%), Positives = 20/42 (47%), Gaps = 3/42 (7%)

Query: 70  DMNECEQPGY-CSQM--CTNTKGSYICSCNEGYVLEYDNHTC 108
           D++EC    + C     C NT GS+ C C +GY    D   C
Sbjct: 1   DVDECADGTHNCPANTVCVNTIGSFECVCPDGYENNEDGTNC 42


>gnl|CDD|238752 cd01475, vWA_Matrilin, VWA_Matrilin: In cartilaginous plate,
           extracellular matrix molecules mediate cell-matrix and
           matrix-matrix interactions thereby providing tissue
           integrity. Some members of the matrilin family are
           expressed specifically in developing cartilage
           rudiments. The matrilin family consists of at least four
           members. All the members of the matrilin family contain
           VWA domains, EGF-like domains and a heptad repeat
           coiled-coil domain at the carboxy terminus which is
           responsible for the oligomerization of the matrilins.
           The VWA domains have been shown to be essential for
           matrilin network formation by interacting with matrix
           ligands.
          Length = 224

 Score = 44.3 bits (105), Expect = 5e-05
 Identities = 18/41 (43%), Positives = 23/41 (56%), Gaps = 1/41 (2%)

Query: 68  CQDMNEC-EQPGYCSQMCTNTKGSYICSCNEGYVLEYDNHT 107
           C   + C      C Q+C +T GSY+C+C EGY L  DN T
Sbjct: 184 CVVPDLCATLSHVCQQVCISTPGSYLCACTEGYALLEDNKT 224


>gnl|CDD|214544 smart00181, EGF, Epidermal growth factor-like domain. 
          Length = 35

 Score = 39.8 bits (93), Expect = 6e-05
 Identities = 17/38 (44%), Positives = 19/38 (50%), Gaps = 4/38 (10%)

Query: 73  ECEQPGYCSQ-MCTNTKGSYICSCNEGYVLEYDNHTCK 109
           EC   G CS   C NT GSY CSC  GY     +  C+
Sbjct: 1   ECASGGPCSNGTCINTPGSYTCSCPPGYTG---DKRCE 35



 Score = 29.0 bits (65), Expect = 0.35
 Identities = 15/28 (53%), Positives = 16/28 (57%), Gaps = 2/28 (7%)

Query: 32 ECKHKFGLCSN-TCHPTPLGALCTCPPG 58
          EC    G CSN TC  TP    C+CPPG
Sbjct: 1  ECASG-GPCSNGTCINTPGSYTCSCPPG 27


>gnl|CDD|221695 pfam12662, cEGF, Complement Clr-like EGF-like.  cEGF, or complement
           Clr-like EGF, domains have six conserved cysteine
           residues disulfide-bonded into the characteristic
           pattern 'ababcc'. They are found in blood coagulation
           proteins such as fibrillin, Clr and Cls, thrombomodulin,
           and the LDL receptor. The core fold of the EGF domain
           consists of two small beta-hairpins packed against each
           other. Two major structural variants have been
           identified based on the structural context of the
           C-terminal cysteine residue of disulfide 'c' in the
           C-terminal hairpin: hEGFs and cEGFs. In cEGFs the
           C-terminal thiol resides on the C-terminal beta-sheet,
           resulting in long loop-lengths between the cysteine
           residues of disulfide 'c', typically C[10+]XC. These
           longer loop-lengths may have arisen by selective
           cysteine loss from a four-disulfide EGF template such as
           laminin or integrin. Tandem cEGF domains have five
           linking residues between terminal cysteines of adjacent
           domains. cEGF domains may or may not bind calcium in the
           linker region. cEGF domains with the consensus motif
           CXN4X[F,Y]XCXC are hydroxylated exclusively on the
           asparagine residue.
          Length = 24

 Score = 37.8 bits (89), Expect = 2e-04
 Identities = 12/23 (52%), Positives = 14/23 (60%)

Query: 90  SYICSCNEGYVLEYDNHTCKAIN 112
           SY CSC  GY L  D  TC+ I+
Sbjct: 1   SYTCSCPPGYQLSGDGRTCEDID 23



 Score = 32.4 bits (75), Expect = 0.018
 Identities = 12/21 (57%), Positives = 16/21 (76%)

Query: 53 CTCPPGEILSNDSVTCQDMNE 73
          C+CPPG  LS D  TC+D++E
Sbjct: 4  CSCPPGYQLSGDGRTCEDIDE 24


>gnl|CDD|238010 cd00053, EGF, Epidermal growth factor domain, found in epidermal
           growth factor (EGF) and present in a large number of
           proteins, mostly animal; the list of proteins currently
           known to contain one or more copies of an EGF-like
           pattern is large and varied; the functional significance
           of EGF-like domains in what appear to be unrelated
           proteins is not yet clear; a common feature is that
           these repeats are found in the extracellular domain of
           membrane-bound proteins or in proteins known to be
           secreted (exception: prostaglandin G/H synthase); the
           domain includes six cysteine residues which have been
           shown to be involved in disulfide bonds; the main
           structure is a two-stranded beta-sheet followed by a
           loop to a C-terminal short two-stranded sheet;
           Subdomains between the conserved cysteines vary in
           length; the region between the 5th and 6th cysteine
           contains two conserved glycines of which at least one
           is present in most EGF-like domains; a subset of
           these bind calcium.
          Length = 36

 Score = 37.8 bits (88), Expect = 3e-04
 Identities = 15/38 (39%), Positives = 17/38 (44%), Gaps = 5/38 (13%)

Query: 73  ECEQPGYCS--QMCTNTKGSYICSCNEGYVLEYDNHTC 108
           EC     CS    C NT GSY C C  GY     + +C
Sbjct: 1   ECAASNPCSNGGTCVNTPGSYRCVCPPGYTG---DRSC 35



 Score = 28.2 bits (63), Expect = 0.67
 Identities = 14/29 (48%), Positives = 14/29 (48%), Gaps = 3/29 (10%)

Query: 32 ECKHKFGLCSN--TCHPTPLGALCTCPPG 58
          EC      CSN  TC  TP    C CPPG
Sbjct: 1  ECAA-SNPCSNGGTCVNTPGSYRCVCPPG 28


>gnl|CDD|205157 pfam12947, EGF_3, EGF domain.  This family includes a variety of
           EGF-like domain homologues. This family includes the
           C-terminal domain of the malaria parasite MSP1 protein.
          Length = 36

 Score = 37.5 bits (88), Expect = 4e-04
 Identities = 13/25 (52%), Positives = 15/25 (60%), Gaps = 2/25 (8%)

Query: 84  CTNTKGSYICSCNEGYVLEYDNHTC 108
           CTNT GS+ C+C  GY    D  TC
Sbjct: 14  CTNTGGSFTCTCKSGY--TGDGVTC 36


>gnl|CDD|197566 smart00192, LDLa, Low-density lipoprotein receptor domain class
          A.  Cysteine-rich repeat in the low-density lipoprotein
          (LDL) receptor that plays a central role in mammalian
          cholesterol metabolism. The N-terminal type A repeats
          in LDL receptor bind the lipoproteins. Other homologous
          domains occur in related receptors, including the very
          low-density lipoprotein receptor and the LDL
          receptor-related protein/alpha 2-macroglobulin
          receptor, and in proteins which are functionally
          unrelated, such as the C9 component of complement.
          Mutations in the LDL receptor gene cause familial
          hypercholesterolemia.
          Length = 33

 Score = 36.1 bits (84), Expect = 0.001
 Identities = 11/21 (52%), Positives = 15/21 (71%)

Query: 4  RCVNLTKVCDGKTDCPNGADE 24
          RC+  + VCDG  DC +G+DE
Sbjct: 13 RCIPSSWVCDGVDDCGDGSDE 33


>gnl|CDD|238060 cd00112, LDLa, Low Density Lipoprotein Receptor Class A domain, a
          cysteine-rich repeat that plays a central role in
          mammalian cholesterol metabolism; the receptor protein
          binds LDL and transports it into cells by endocytosis;
          7 successive cysteine-rich repeats of about 40 amino
          acids are present in the N-terminal of this multidomain
          membrane protein; other homologous domains occur in
          related receptors, including the very low-density
          lipoprotein receptor and the LDL receptor-related
          protein/alpha 2-macroglobulin receptor, and in proteins
          which are functionally unrelated, such as the C9
          component of complement; the binding of calcium is
          required for in vitro formation of the native disulfide
          isomer and is necessary in establishing and maintaining
          the modular structure.
          Length = 35

 Score = 35.3 bits (82), Expect = 0.002
 Identities = 11/21 (52%), Positives = 16/21 (76%)

Query: 4  RCVNLTKVCDGKTDCPNGADE 24
          RC+  + VCDG+ DC +G+DE
Sbjct: 12 RCIPSSWVCDGEDDCGDGSDE 32


>gnl|CDD|200964 pfam00057, Ldl_recept_a, Low-density lipoprotein receptor domain
          class A. 
          Length = 37

 Score = 34.6 bits (80), Expect = 0.004
 Identities = 10/21 (47%), Positives = 15/21 (71%)

Query: 4  RCVNLTKVCDGKTDCPNGADE 24
           C+ ++ VCDG  DC +G+DE
Sbjct: 14 ECIPMSWVCDGDPDCEDGSDE 34


>gnl|CDD|215652 pfam00008, EGF, EGF-like domain.  There is no clear separation
          between noise and signal. pfam00053 is very similar,
          but has 8 instead of 6 conserved cysteines. Includes
          some cytokine receptors. The EGF domain misses the
          N-terminus regions of the Ca2+ binding EGF domains
          (this is the main reason of discrepancy between
          swiss-prot domain start/end and Pfam). The family is
          hard to model due to many similar but different
          sub-types of EGF domains. Pfam certainly misses a
          number of EGF domains.
          Length = 32

 Score = 29.3 bits (66), Expect = 0.30
 Identities = 12/28 (42%), Positives = 13/28 (46%), Gaps = 2/28 (7%)

Query: 74 CEQPGYCS--QMCTNTKGSYICSCNEGY 99
          C     CS    C +T G Y C C EGY
Sbjct: 1  CSPNNPCSNGGTCVDTPGGYTCECPEGY 28



 Score = 26.2 bits (58), Expect = 4.0
 Identities = 12/21 (57%), Positives = 12/21 (57%), Gaps = 2/21 (9%)

Query: 40 CSN--TCHPTPLGALCTCPPG 58
          CSN  TC  TP G  C CP G
Sbjct: 7  CSNGGTCVDTPGGYTCECPEG 27


>gnl|CDD|218955 pfam06247, Plasmod_Pvs28, Plasmodium ookinete surface protein
           Pvs28.  This family consists of several ookinete surface
           protein (Pvs28) from several species of Plasmodium.
           Pvs25 and Pvs28 are expressed on the surface of
           ookinetes. These proteins are potential candidates for
           vaccine and induce antibodies that block the infectivity
           of Plasmodium vivax in immunised animals.
          Length = 196

 Score = 31.6 bits (72), Expect = 0.56
 Identities = 16/68 (23%), Positives = 27/68 (39%), Gaps = 14/68 (20%)

Query: 53  CTCPPGEILSNDSVTCQDMNECEQP----------GYC-SQMCTNTKGSYICSCNEGYVL 101
           C C  G +L N++ TC++  +C++             C +Q     + +  C C  GY L
Sbjct: 22  CKCNEGYVLKNEN-TCEEKVKCDKLENVNKVCGEYATCINQANKAEEKALKCGCINGYTL 80

Query: 102 EYDNHTCK 109
                 C 
Sbjct: 81  --SQGVCV 86



 Score = 27.8 bits (62), Expect = 9.9
 Identities = 15/36 (41%), Positives = 19/36 (52%), Gaps = 5/36 (13%)

Query: 78  GYCSQMCTNTKGSYICSCNEGYVLEYDNHTCKAINH 113
           GY  QM       + C CNEGYVL+ +N TC+    
Sbjct: 11  GYLIQM----SNHFECKCNEGYVLKNEN-TCEEKVK 41


>gnl|CDD|190848 pfam04056, Ssl1, Ssl1-like.  Ssl1-like proteins are 40kDa subunits
           of the Transcription factor II H complex.
          Length = 193

 Score = 31.6 bits (72), Expect = 0.57
 Identities = 14/48 (29%), Positives = 23/48 (47%), Gaps = 6/48 (12%)

Query: 50  GALCTCPPGEI------LSNDSVTCQDMNECEQPGYCSQMCTNTKGSY 91
           G+L TC PG+I      L  + + C  +    +   C ++C  T G+Y
Sbjct: 109 GSLSTCDPGDIYSTIDTLKKEKIRCSVIGLSAEVFICKELCKATNGTY 156


>gnl|CDD|219847 pfam08450, SGL, SMP-30/Gluconolactonase/LRE-like region.  This
           family describes a region that is found in proteins
           expressed by a variety of eukaryotic and prokaryotic
           species. These proteins include various enzymes, such as
           senescence marker protein 30 (SMP-30), gluconolactonase
           and luciferin-regenerating enzyme (LRE). SMP-30 is known
           to hydrolyse diisopropyl phosphorofluoridate in the
           liver, and has been noted as having sequence similarity,
           in the region described in this family, with PON1 and
           LRE.
          Length = 245

 Score = 31.4 bits (72), Expect = 0.78
 Identities = 32/153 (20%), Positives = 50/153 (32%), Gaps = 39/153 (25%)

Query: 118 LIISNRHSILV-----------ADLEEKGKDHRAQDIVVSGL------DLVEGLAYDWIG 160
           LI++ +  + +           ADLE     +R  D  V          +   +A     
Sbjct: 54  LIVALKRGLALLDLDTGELTTLADLEPDEPLNRFNDGKVDPDGRFWFGTMGFDIAPGGEP 113

Query: 161 GHIYWLDSRLNRIEVCDENGTNRIVLAKDNITQPRGMMLDPSPGTRWLFWTDWGENPRIE 220
           G +Y LD               ++    D IT   G+    SP  + L++ D     RI 
Sbjct: 114 GALYRLD------------PDGKVERVLDGITISNGLAW--SPDGKTLYFADSPTR-RIW 158

Query: 221 RI-----GMDGSNRSTIISTKIY--WPNGLTLD 246
                  G   SNR      K     P+G+ +D
Sbjct: 159 AFDYDADGGLISNRRVFADFKDGDGEPDGMAVD 191



 Score = 29.1 bits (66), Expect = 4.0
 Identities = 37/203 (18%), Positives = 54/203 (26%), Gaps = 49/203 (24%)

Query: 203 PGTRWLFWTDWGENPRIERIGMDGSNRSTIISTKIYWPNGLTLDIATRRVYFADSK---- 258
                L+W D     RI R+       +         P G        R+  A  +    
Sbjct: 9   EEEGALYWVDI-LGGRIHRLDPATGKETVWDLPG---PVGAIALRDDGRLIVALKRGLAL 64

Query: 259 LDFIDCKSLEIKTNVNYSTIFRKLYWIDEGGNGVPLKIGKANMDGSNASILVSGQLYEAL 318
           LD               +     L  ++          GK + DG        G +   +
Sbjct: 65  LDL-------------DTGELTTLADLEPDEPLNRFNDGKVDPDGR----FWFGTMGFDI 107

Query: 319 ALDLENGMLYYSTLKCTRWLFWTDWGENPRIERIGMDGSNRSTIISTKIYWPNGLTLDIA 378
           A   E G LY                         +D   +   +   I   NGL     
Sbjct: 108 APGGEPGALYR------------------------LDPDGKVERVLDGITISNGLAWSPD 143

Query: 379 TRRVYFADSKLDFIDFCNYDGTG 401
            + +YFADS    I   +YD  G
Sbjct: 144 GKTLYFADSPTRRIWAFDYDADG 166


>gnl|CDD|213205 cd03238, ABC_UvrA, ATP-binding cassette domain of the excision
           repair protein UvrA.  Nucleotide excision repair in
           eubacteria is a process that repairs DNA damage by the
           removal of a 12-13-mer oligonucleotide containing the
           lesion. Recognition and cleavage of the damaged DNA is a
           multistep ATP-dependent reaction that requires the UvrA,
           UvrB, and UvrC proteins. Both UvrA and UvrB are ATPases,
           with UvrA having two ATP binding sites, which have the
           characteristic signature of the family of ABC proteins,
           and UvrB having one ATP binding site that is
           structurally related to that of helicases.
          Length = 176

 Score = 30.4 bits (69), Expect = 1.1
 Identities = 18/60 (30%), Positives = 30/60 (50%), Gaps = 2/60 (3%)

Query: 223 GMDGSNRSTIISTKIYWPNGLTLDIATRRVYFADSKLDFIDCKSLEIKTNVNYSTIFRKL 282
           G+ GS +ST+++  +Y      L     +  F+ +KL FID     I   + Y T+ +KL
Sbjct: 28  GVSGSGKSTLVNEGLYASGKARLISFLPK--FSRNKLIFIDQLQFLIDVGLGYLTLGQKL 85


>gnl|CDD|201923 pfam01683, EB, EB module.  This domain has no known function. It is
           found in several C. elegans proteins. The domain
           contains 8 conserved cysteines that probably form four
           disulphide bridges. This domain is found associated with
           kunitz domains pfam00014.
          Length = 52

 Score = 27.8 bits (62), Expect = 1.8
 Identities = 14/62 (22%), Positives = 21/62 (33%), Gaps = 18/62 (29%)

Query: 55  CPPGEILSNDS--------VTCQDMNECEQPGYCSQMCTNTKGSYICSCNEGYVLEYDNH 106
           CP G++L N           +C+   +C+    C            C C EG+ L     
Sbjct: 1   CPSGQVLVNGECLPKVLPGESCEYDEQCQGGSVCING--------TCQCPEGFTL--VGG 50

Query: 107 TC 108
            C
Sbjct: 51  RC 52


>gnl|CDD|215497 PLN02919, PLN02919, haloacid dehalogenase-like hydrolase family
           protein.
          Length = 1057

 Score = 30.6 bits (69), Expect = 2.3
 Identities = 20/53 (37%), Positives = 26/53 (49%), Gaps = 4/53 (7%)

Query: 133 EKGKDHRAQDIVVSGLDLVEGLAYDWIGGHIYWLDSRLNRIEVCDENGTNRIV 185
           EK  D R   ++ S L     LA D +   ++  DS  NRI V D +G N IV
Sbjct: 555 EKDNDPR---LLTSPLKFPGKLAIDLLNNRLFISDSNHNRIVVTDLDG-NFIV 603


>gnl|CDD|216726 pfam01826, TIL, Trypsin Inhibitor like cysteine rich domain.  This
           family contains trypsin inhibitors as well as a domain
           found in many extracellular proteins. The domain
           typically contains ten cysteine residues that form five
           disulphide bonds. The cysteine residues that form the
           disulphide bonds are 1-7, 2-6, 3-5, 4-10 and 8-9.
          Length = 55

 Score = 27.3 bits (61), Expect = 2.6
 Identities = 16/57 (28%), Positives = 20/57 (35%), Gaps = 11/57 (19%)

Query: 55  CPPGEILSNDSVTCQDMNECEQ---PGYCSQMCTNTKGSYICSCNEGYVLEYDNHTC 108
           CPP E+ S     C     C     P  C   C        C C  GYV + ++  C
Sbjct: 1   CPPNEVYSECGSACPP--TCANLSTPPPCPLPCVEG-----CVCPPGYVRD-NDGKC 49


>gnl|CDD|212108 cd10796, GH57N_APU, N-terminal catalytic domain of thermoactive
           amylopullulanases; glycoside hydrolase family 57 (GH57).
            Pullulanases (EC 3.2.1.41) are capable of hydrolyzing
           the alpha-1,6 glucosidic bonds of pullulan, producing
           maltotriose.  Amylopullulanases (APU, E.C 3.2.1.1/41)
           are type II pullulanases which can also degrade both the
           alpha-1,6 and alpha-1,4 glucosidic bonds of starch,
           producing oligosaccharides. This subfamily includes GH57
           archaeal thermoactive APUs, which show both
           pullulanolytic and amylolytic activities. They have an
           acid pH optimum and the presence of Ca2+ might increase
           their activity, thermostability, and substrate affinity.
           Besides GH57 thermoactive APUs, all mesophilic and some
           thermoactive APUs belong to glycoside hydrolase family
           13 with catalytic features distinct from GH57. This
           subfamily also includes many uncharacterized proteins
           found in bacteria and archaea.
          Length = 313

 Score = 29.5 bits (67), Expect = 3.4
 Identities = 20/71 (28%), Positives = 29/71 (40%), Gaps = 4/71 (5%)

Query: 381 RVYFADSKL-DFIDFCNYDGTGRQQVI--LIDLMVEFTNDLMGDNGVPALALDLENGMLY 437
            V+F D +L D I F  Y     +      I  +      +    GV  +ALD EN   +
Sbjct: 197 YVFFRDHELSDLIGF-TYSFWPAEDAARDFIHRLKSIREQIYNPGGVVTIALDGENAWEF 255

Query: 438 YSTEGAEFLKT 448
           Y   G +FL+ 
Sbjct: 256 YPNNGYDFLEA 266


>gnl|CDD|238730 cd01453, vWA_transcription_factor_IIH_type, Transcription factors
           IIH type: TFIIH is a multiprotein complex that is one of
           the five general transcription factors that binds RNA
           polymerase II holoenzyme. Orthologues of these genes are
           found in all completed eukaryotic genomes and all these
           proteins contain a VWA domain. The p44 subunit of TFIIH
           functions as a DNA helicase in RNA polymerase II
           transcription initiation and DNA repair, and its
           transcriptional activity is dependent on its C-terminal
           Zn-binding domains. The function of the vWA domain is
           unclear, but may be involved in complex assembly. The
           MIDAS motif is not conserved in this sub-group.
          Length = 183

 Score = 28.8 bits (65), Expect = 4.4
 Identities = 12/48 (25%), Positives = 21/48 (43%), Gaps = 6/48 (12%)

Query: 50  GALCTCPPGEI------LSNDSVTCQDMNECEQPGYCSQMCTNTKGSY 91
            +L TC PG I      L  +++    +    +   C ++C  T G+Y
Sbjct: 115 SSLSTCDPGNIYETIDKLKKENIRVSVIGLSAEMHICKEICKATNGTY 162


>gnl|CDD|183740 PRK12779, PRK12779, putative bifunctional glutamate synthase
           subunit beta/2-polyprenylphenol hydroxylase;
           Provisional.
          Length = 944

 Score = 29.4 bits (66), Expect = 5.5
 Identities = 30/113 (26%), Positives = 47/113 (41%), Gaps = 14/113 (12%)

Query: 320 LDLENGMLYYSTLKCTRWLFWTDWGENPRIERIGMDGSNRSTIISTKIYWPNGLTLDIAT 379
           L L N +   S  +   +LFWT   E  R+ ++  +  ++  +I T        T D + 
Sbjct: 773 LRLGNHVTLISGFRAKEFLFWTGDDE--RVGKLKAEFGDQLDVIYT--------TNDGSF 822

Query: 380 RRVYFADSKLDFIDFCNYDGTGRQ--QVILID--LMVEFTNDLMGDNGVPALA 428
               F    L+ +   N  G GR   +VI I   LM+   +DL    GV  +A
Sbjct: 823 GVKGFVTGPLEEMLKANQQGKGRTIAEVIAIGPPLMMRAVSDLTKPYGVKTVA 875


>gnl|CDD|215784 pfam00200, Disintegrin, Disintegrin. 
          Length = 76

 Score = 26.8 bits (60), Expect = 5.6
 Identities = 20/76 (26%), Positives = 25/76 (32%), Gaps = 25/76 (32%)

Query: 12 CDGKTDCPNGADEGPGCDLAECKHK------FGLCSNTCHPTPLGALCTCPPGEILSNDS 65
          C    +C N     P CD   CK K       G C + C   P G +C            
Sbjct: 8  CGSPEECQN-----PCCDATTCKLKPGAQCATGPCCDQCKFKPAGTVCR----------- 51

Query: 66 VTCQDMNECEQPGYCS 81
                 EC+ P YC+
Sbjct: 52 ---PASGECDLPEYCT 64


  Database: CDD.v3.10
    Posted date:  Mar 20, 2013  7:55 AM
  Number of letters in database: 10,937,602
  Number of sequences in database:  44,354
  
Lambda     K      H
   0.321    0.139    0.448 

Gapped
Lambda     K      H
   0.267   0.0761    0.140 


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 44354
Number of Hits to DB: 23,351,192
Number of extensions: 2239606
Number of successful extensions: 1634
Number of sequences better than 10.0: 1
Number of HSP's gapped: 1620
Number of HSP's successfully gapped: 54
Length of query: 461
Length of database: 10,937,602
Length adjustment: 100
Effective length of query: 361
Effective length of database: 6,502,202
Effective search space: 2347294922
Effective search space used: 2347294922
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.4 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.8 bits)
S2: 61 (27.2 bits)