RPS-BLAST 2.2.26 [Sep-21-2011]

Database: CDD.v3.10 
           44,354 sequences; 10,937,602 total letters

Searching..................................................done

Query= psy11306
         (413 letters)



>gnl|CDD|173890 cd06902, lectin_ERGIC-53_ERGL, ERGIC-53 and ERGL type 1
           transmembrane proteins, N-terminal lectin domain.
           ERGIC-53 and ERGL, N-terminal carbohydrate recognition
           domain. ERGIC-53 and ERGL are eukaryotic mannose-binding
           type 1 transmembrane proteins of the early secretory
           pathway that transport newly synthesized glycoproteins
           from the endoplasmic reticulum (ER) to the ER-Golgi
           intermediate compartment (ERGIC).  ERGIC-53 and ERGL
           have an N-terminal lectin-like carbohydrate recognition
           domain (represented by this alignment model) as well as
           a C-terminal transmembrane domain.  ERGIC-53 functions
           as a 'cargo receptor' to facilitate the export of
           glycoproteins with different characteristics from the
           ER, while the ERGIC-53-like protein (ERGL) may act
           as a regulator of ERGIC-53.  In mammals, ERGIC-53 forms
           a complex with MCFD2 (multi-coagulation factor
           deficiency 2) which then recruits blood coagulation
           factors V and VIII.  Mutations in either MCFD2 or
           ERGIC-53 cause a mild form of inherited hemophilia known
           as combined deficiency of factors V and VIII (F5F8D). In
           addition to the lectin and transmembrane domains,
           ERGIC-53 and ERGL have a short C-terminal cytoplasmic
           region of about 12 amino acids. ERGIC-53 forms
           disulphide-linked homodimers and homohexamers. ERGIC-53
           and ERGL are sequence-similar to the lectins of
           leguminous plants.  L-type lectins have a dome-shaped
           beta-barrel carbohydrate recognition domain with a
           curved seven-stranded beta-sheet referred to as the
           "front face" and a flat six-stranded beta-sheet referred
           to as the "back face".  This domain homodimerizes so
           that adjacent back sheets form a contiguous 12-stranded
           sheet and homotetramers occur by a back-to-back
           association of these homodimers.  Though L-type lectins
           exhibit both sequence and structural similarity to one
           another, their carbohydrate binding specificities differ
           widely.
          Length = 225

 Score =  420 bits (1083), Expect = e-149
 Identities = 153/224 (68%), Positives = 180/224 (80%), Gaps = 1/224 (0%)

Query: 31  RFEYKYSFKPPYLAQKDGSVPFWEYGGNCIASLENVRVAPSLRSQKGAIWTKQTTNFEWW 90
           RFEYKYSFK P+LAQKDG+VPFW +GG+ IASLE VR+ PSLRS+KG++WTK   +FE W
Sbjct: 2   RFEYKYSFKGPHLAQKDGTVPFWSHGGDAIASLEQVRLTPSLRSKKGSVWTKNPFSFENW 61

Query: 91  NVDIVFRVTGRGRIGADGLAFWYTSEKGSYDGEVFGSSDRWKGLGLFFDSFDNDNNHNNP 150
            V++ FRVTGRGRIGADGLA WYT E+G  +G VFGSSD+W G+G+FFDSFDND   NNP
Sbjct: 62  EVEVTFRVTGRGRIGADGLAIWYTKERGE-EGPVFGSSDKWNGVGIFFDSFDNDGKKNNP 120

Query: 151 YIMAVVNDGNMAFDHQNDGASQSLAGCLRDFRNKPYPTRARIQYYMNTLTVWFHNGMTNN 210
            I+ V NDG  ++DHQNDG +Q+L  CLRDFRNKPYP RA+I YY N LTV  +NG T N
Sbjct: 121 AILVVGNDGTKSYDHQNDGLTQALGSCLRDFRNKPYPVRAKITYYQNVLTVSINNGFTPN 180

Query: 211 EQDIEVCLRVENIYLPKEGYFGVSAATGGLADDHDILHFLTSSL 254
           + D E+C RVEN+ LP  GYFGVSAATGGLADDHD+L FLT SL
Sbjct: 181 KDDYELCTRVENMVLPPNGYFGVSAATGGLADDHDVLSFLTFSL 224



 Score = 46.2 bits (110), Expect = 9e-06
 Identities = 19/35 (54%), Positives = 23/35 (65%), Gaps = 3/35 (8%)

Query: 296 RFEYKYSFKPPYLAQKDG---YQKDHPDAHPNEEE 327
           RFEYKYSFK P+LAQKDG   +     DA  + E+
Sbjct: 2   RFEYKYSFKGPHLAQKDGTVPFWSHGGDAIASLEQ 36


>gnl|CDD|217528 pfam03388, Lectin_leg-like, Legume-like lectin family.  Lectins are
           structurally diverse proteins that bind to specific
           carbohydrates. This family includes the VIP36 and
           ERGIC-53 lectins. These two proteins were the first
           recognised members of a family of animal lectins similar
           (19-24%) to the leguminous plant lectins. The alignment
           for this family aligns residues lying towards the
           N-terminus, where the similarity of VIP36 and ERGIC-53
           is greatest. However, while Fiedler and Simons
           identified these proteins as a new family of animal
           lectins, our alignment also includes yeast sequences.
           ERGIC-53 is a 53kD protein, localised to the
           intermediate region between the endoplasmic reticulum
           and the Golgi apparatus (ER-Golgi-Intermediate
           Compartment, ERGIC). It was identified as a
           calcium-dependent, mannose-specific lectin. Its
           dysfunction has been associated with combined factors V
           and VIII deficiency OMIM:227300 OMIM:601567, suggesting
           an important and substrate-specific role for ERGIC-53 in
           the glycoprotein-secreting pathway.
          Length = 226

 Score =  321 bits (825), Expect = e-110
 Identities = 123/225 (54%), Positives = 157/225 (69%), Gaps = 1/225 (0%)

Query: 30  ERFEYKYSFKPPYLAQKDGSVPFWEYGGNCIASLENVRVAPSLRSQKGAIWTKQTTNFEW 89
           +RF+YK+S K PYL Q DGS+PFWEYGG+ I S   +R+ P L+SQKG++W KQ  + + 
Sbjct: 1   DRFKYKHSLKAPYLGQGDGSIPFWEYGGSAILSSGYIRLTPDLQSQKGSLWNKQPLDLDS 60

Query: 90  WNVDIVFRVTGRGRIGADGLAFWYTSEKGSYDGEVFGSSDRWKGLGLFFDSFDNDNNHNN 149
           W V++ FRV G GR+G DGLA WYTSE+G   G VFGS D W GL +F D++DNDN   N
Sbjct: 61  WEVEVTFRVHGSGRLGGDGLAIWYTSERGV-PGPVFGSKDNWDGLAIFLDTYDNDNQTLN 119

Query: 150 PYIMAVVNDGNMAFDHQNDGASQSLAGCLRDFRNKPYPTRARIQYYMNTLTVWFHNGMTN 209
           PYI  ++NDG+  +DH  DG  Q LA C  DFRNK YPTR RI Y  NTLTV   +G+  
Sbjct: 120 PYISGMLNDGSKPYDHTKDGTDQELASCTADFRNKDYPTRIRITYDKNTLTVMIDDGLLE 179

Query: 210 NEQDIEVCLRVENIYLPKEGYFGVSAATGGLADDHDILHFLTSSL 254
           ++ D ++C +VEN+ LP   YFGVSAATG L+D+HD+  FLT  L
Sbjct: 180 DKVDYKLCFQVENVRLPTGYYFGVSAATGDLSDNHDVFSFLTFQL 224



 Score = 31.6 bits (72), Expect = 0.55
 Identities = 19/62 (30%), Positives = 24/62 (38%), Gaps = 13/62 (20%)

Query: 295 ERFEYKYSFKPPYLAQKDGYQKDHPDAHPNEEEWYES----ENQRELRQIFQGQSQLAEW 350
           +RF+YK+S K PYL Q DG             E+  S         L    Q Q   + W
Sbjct: 1   DRFKYKHSLKAPYLGQGDG--------SIPFWEYGGSAILSSGYIRLTPDLQSQKG-SLW 51

Query: 351 TK 352
            K
Sbjct: 52  NK 53


>gnl|CDD|173892 cd07308, lectin_leg-like, legume-like lectins: ERGIC-53, ERGL,
           VIP36, VIPL, EMP46, and EMP47.  The legume-like
           (leg-like) lectins are eukaryotic intracellular sugar
           transport proteins with a carbohydrate recognition
           domain similar to that of the legume lectins.  This
           domain binds high-mannose-type oligosaccharides for
           transport from the endoplasmic reticulum to the Golgi
           complex.  These leg-like lectins include ERGIC-53, ERGL,
           VIP36, VIPL, EMP46, EMP47, and the UIP5
           (ULP1-interacting protein 5) precursor protein.
           Leg-like lectins have different intracellular
           distributions and dynamics in the endoplasmic
           reticulum-Golgi system of the secretory pathway and
           interact with N-glycans of glycoproteins in a
           calcium-dependent manner, suggesting a role in
           glycoprotein sorting and trafficking.  L-type lectins
           have a dome-shaped beta-barrel carbohydrate recognition
           domain with a curved seven-stranded beta-sheet referred
           to as the "front face" and a flat six-stranded
           beta-sheet referred to as the "back face".  This domain
           homodimerizes so that adjacent back sheets form a
           contiguous 12-stranded sheet and homotetramers occur by
           a back-to-back association of these homodimers.  Though
           L-type lectins exhibit both sequence and structural
           similarity to one another, their carbohydrate binding
           specificities differ widely.
          Length = 218

 Score =  215 bits (550), Expect = 2e-68
 Identities = 89/223 (39%), Positives = 128/223 (57%), Gaps = 5/223 (2%)

Query: 32  FEYKYSFKPPYLAQKDGSVPFWEYGGNCIASLENVRVAPSLRSQKGAIWTKQTTNFEWWN 91
           F  ++S  PP+L   DG +  W  GG+ + +   +R+ P + SQ G++W++     + + 
Sbjct: 1   FISEHSLSPPFLDDNDGEIGNWTVGGSTVITKNYIRLTPDVPSQSGSLWSRVPIPAKDFE 60

Query: 92  VDIVFRVTGRGRIGADGLAFWYTSEKGSYDGEVFGSSDRWKGLGLFFDSFDNDNNHNNPY 151
           +++ F + G   +G DG AFWYT E GS DG +FG  D++KGL +FFD++DND     P 
Sbjct: 61  IEVEFSIHGGSGLGGDGFAFWYTEEPGS-DGPLFGGPDKFKGLAIFFDTYDNDGK-GFPS 118

Query: 152 IMAVVNDGNMAFDHQNDGASQSLAGCLRDFRNKPYPTRARIQYYMNTLTVWFHNGMTNNE 211
           I   +NDG  ++D++ DG    LA C   FRN   PT  RI Y  NTL V       NN 
Sbjct: 119 ISVFLNDGTKSYDYETDGEKLELASCSLKFRNSNAPTTLRISYLNNTLKVDITYSEGNNW 178

Query: 212 QDIEVCLRVENIYLPKEGYFGVSAATGGLADDHDILHFLTSSL 254
           ++   C  VE++ LP +GYFG SA TG L+D+HDIL   T  L
Sbjct: 179 KE---CFTVEDVILPSQGYFGFSAQTGDLSDNHDILSVHTYEL 218


>gnl|CDD|173889 cd06901, lectin_VIP36_VIPL, VIP36 and VIPL type 1 transmembrane
           proteins, lectin domain.  The vesicular integral protein
           of 36 kDa (VIP36) is a type 1 transmembrane protein of
           the mammalian early secretory pathway that acts as a
           cargo receptor transporting high mannose type
           glycoproteins between the Golgi and the endoplasmic
           reticulum (ER).  Lectins of the early secretory pathway
           are involved in the selective transport of newly
           synthesized glycoproteins from the ER to the ER-Golgi
           intermediate compartment (ERGIC). The most prominent
           cycling lectin is the mannose-binding type1 membrane
           protein ERGIC-53, which functions as a cargo receptor to
           facilitate export of glycoproteins from the ER. L-type
           lectins have a dome-shaped beta-barrel carbohydrate
           recognition domain with a curved seven-stranded
           beta-sheet referred to as the "front face" and a flat
           six-stranded beta-sheet referred to as the "back face". 
           This domain homodimerizes so that adjacent back sheets
           form a contiguous 12-stranded sheet and homotetramers
           occur by a back-to-back association of these homodimers.
           Though L-type lectins exhibit both sequence and
           structural similarity to one another, their carbohydrate
           binding specificities differ widely.
          Length = 248

 Score =  189 bits (481), Expect = 9e-58
 Identities = 85/221 (38%), Positives = 126/221 (57%), Gaps = 15/221 (6%)

Query: 36  YSFKPPYLAQKDGSV-PFWEYGGNCIASLENVRVAPSLRSQKGAIWTKQTTNFEWWNVDI 94
           +S   PY  Q  GS  P W++ G+ + + + +R+ P  +S++G+IW +       W + +
Sbjct: 6   HSLIKPY--QGVGSSMPLWDFLGSTMVTSQYIRLTPDHQSKQGSIWNRVPCYLRDWEMHV 63

Query: 95  VFRVTGRGR-IGADGLAFWYTSEKGSYDGEVFGSSDRWKGLGLFFDSFDNDN---NHNNP 150
            F+V G G+ +  DG A WYT E+    G VFGS D + GL +FFD++ N N    H +P
Sbjct: 64  HFKVHGSGKNLFGDGFAIWYTKER-MQPGPVFGSKDNFHGLAIFFDTYSNQNGEHEHVHP 122

Query: 151 YIMAVVNDGNMAFDHQNDGASQSLAGCLRDFRNKPYPTRARIQYYMNTLTVWFHNGMTN- 209
           YI A+VN+G++++DH  DG    LAGC   FRNK + T   I+Y    LTV     MT+ 
Sbjct: 123 YISAMVNNGSLSYDHDRDGTHTELAGCSAPFRNKDHDTFVAIRYSKGRLTV-----MTDI 177

Query: 210 -NEQDIEVCLRVENIYLPKEGYFGVSAATGGLADDHDILHF 249
             + + + C  V  + LP   YFG SAATG L+D+HDI+  
Sbjct: 178 DGKNEWKECFDVTGVRLPTGYYFGASAATGDLSDNHDIISM 218


>gnl|CDD|173891 cd06903, lectin_EMP46_EMP47, EMP46 and EMP47 type 1 transmembrane
           proteins, N-terminal lectin domain.  EMP46 and EMP47,
           N-terminal carbohydrate recognition domain. EMP46 and
           EMP47 are fungal type-I transmembrane proteins that
           cycle between the endoplasmic reticulum and the Golgi
           apparatus and are thought to function as cargo receptors
           that transport newly synthesized glycoproteins.  EMP47
           is a receptor for EMP46 responsible for the selective
           transport of EMP46 through hetero-oligomerization of
           the two proteins. EMP46 and EMP47 have an
           N-terminal lectin-like carbohydrate recognition domain
           (represented by this alignment model) as well as a
           C-terminal transmembrane domain. EMP46 and EMP47 are 45%
           sequence-identical to one another and have sequence
           homology to a class of intracellular lectins defined by
           ERGIC-53 and VIP36.  L-type lectins have a dome-shaped
           beta-barrel carbohydrate recognition domain with a
           curved seven-stranded beta-sheet referred to as the
           "front face" and a flat six-stranded beta-sheet referred
           to as the "back face".  This domain homodimerizes so
           that adjacent back sheets form a contiguous 12-stranded
           sheet and homotetramers occur by a back-to-back
           association of these homodimers.  Though L-type lectins
           exhibit both sequence and structural similarity to one
           another, their carbohydrate binding specificities differ
           widely.
          Length = 215

 Score = 89.7 bits (223), Expect = 1e-20
 Identities = 48/213 (22%), Positives = 91/213 (42%), Gaps = 18/213 (8%)

Query: 47  DGSVPFWEYGGNCIASLENVR-VAPSLRSQKGAIWTKQTTNFEW-WNVDIVFRVTGRGRI 104
              +P W+  GN    LE+ R +     +Q+G++W K+  + +  W ++  FR TG    
Sbjct: 17  GKLIPNWQTSGN--PKLESGRIILTPPGNQRGSLWLKKPLSLKDEWTIEWTFRSTGPEGR 74

Query: 105 GADGLAFWYTSEKGSYDGE-VFGSSDRWKGLGLFFDSFDNDNNHNNPYIMAVVNDGNMAF 163
              GL FW   +  +  G        ++ GL L  D+       +   +   +NDG+  +
Sbjct: 75  SGGGLNFWLVKDGNADVGTSSIYGPSKFDGLQLLIDNNGG----SGGSLRGFLNDGSKDY 130

Query: 164 DHQNDGASQSLAGCLRDFRNKPYPTRARIQYYMNTLTVWFHNGMTNNEQDIEVCLRVENI 223
            +++   S +   CL  +++   P+  R+ Y        F   + N      +C + + +
Sbjct: 131 KNEDV-DSLAFGSCLFAYQDSGVPSTIRLSYDAL--NSLFKVQVDNR-----LCFQTDKV 182

Query: 224 YLPKEGY-FGVSAATGGLADDHDILHFLTSSLL 255
            LP+ GY FG++AA     +  +IL     + L
Sbjct: 183 QLPQGGYRFGITAANADNPESFEILKLKVWNGL 215


>gnl|CDD|173886 cd01951, lectin_L-type, legume lectins.  The L-type (legume-type)
           lectins are a highly diverse family of carbohydrate
           binding proteins that generally display no enzymatic
           activity toward the sugars they bind.  This family
           includes arcelin, concanavalinA, the lectin-like
           receptor kinases, the ERGIC-53/VIP36/EMP46 type1
           transmembrane proteins, and an alpha-amylase inhibitor. 
           L-type lectins have a dome-shaped beta-barrel
           carbohydrate recognition domain with a curved
           seven-stranded beta-sheet referred to as the "front
           face" and a flat six-stranded beta-sheet referred to as
           the "back face".  This domain homodimerizes so that
           adjacent back sheets form a contiguous 12-stranded sheet
           and homotetramers occur by a back-to-back association of
           these homodimers.  Though L-type lectins exhibit both
           sequence and structural similarity to one another, their
           carbohydrate binding specificities differ widely.
          Length = 223

 Score = 80.6 bits (199), Expect = 2e-17
 Identities = 57/215 (26%), Positives = 88/215 (40%), Gaps = 25/215 (11%)

Query: 49  SVPFWEYGGNCIASLENV--RVAPSLRSQKGAIWTKQ----TTNFEWWNVDIVFRVTGRG 102
           +   W+  G+   + ++   R+ P   +Q G+ W K     + +F        F +  +G
Sbjct: 12  NQSNWQLNGSATLTTDSGVLRLTPDTGNQAGSAWYKTPIDLSKDFT---TTFKFYLGTKG 68

Query: 103 RIGADGLAFWYTSEK-----GSYDGEVFGSSDRWKGLGLFFDSFDND--NNHNNPYIMAV 155
             GADG+AF   ++      G   G   G       + + FD++ ND  N+ N  +I   
Sbjct: 69  TNGADGIAFVLQNDPAGALGGGGGGGGLGYGGIGNSVAVEFDTYKNDDNNDPNGNHISID 128

Query: 156 VNDGNMAFDHQNDGASQSLAGCLRDFRNKPYPTR-ARIQY--YMNTLTVWFHNGMTNNEQ 212
           VN               SL                 RI Y    NTLTV+  NG T    
Sbjct: 129 VNGNGNNTALAT-----SLGSASLPNGTGLGNEHTVRITYDPTTNTLTVYLDNGSTLTSL 183

Query: 213 DIEVCLRVENIYLPKEGYFGVSAATGGLADDHDIL 247
           DI + + +  +  P + YFG +A+TGGL + HDIL
Sbjct: 184 DITIPVDLIQL-GPTKAYFGFTASTGGLTNLHDIL 217


>gnl|CDD|215744 pfam00139, Lectin_legB, Legume lectin domain. 
          Length = 231

 Score = 43.8 bits (104), Expect = 6e-05
 Identities = 48/190 (25%), Positives = 74/190 (38%), Gaps = 35/190 (18%)

Query: 79  IWTKQTTNFEWWNVDIVFRVTGR--GRIGADGLAFWYTSEKGSYDGEVFGSSDRWKGLGL 136
           +W   T     ++   VF +        G DGLAF+      +  G    SS  +  LGL
Sbjct: 51  LWDSSTGKVASFSTSFVFAIKNIPKSTNGGDGLAFFLAPSG-TQPG---ASSGGY--LGL 104

Query: 137 FFDSFDNDNNHNNPYIMAVVNDGNMAFDHQN-DG-------------ASQSLAGCLRDFR 182
           F  S  N+ N +N  I+AV  D  +  +  + D              AS+S +    D  
Sbjct: 105 FNSS--NNGNSSNH-IVAVEFDTFLNPEFNDIDDNHVGIDVNSIISVASESASFVPLDLN 161

Query: 183 N-KPYPTRARIQY--YMNTLTVWFHNGMTNNEQDIEVCLRVENI--YLPKEGYFGVSAAT 237
           + KP   +  I Y      L+V        N+    +     ++   LP+  Y G SA+T
Sbjct: 162 SGKPI--QVWIDYDGSSKRLSVTLAY---PNKPKRPLLSASVDLSTVLPEWVYVGFSAST 216

Query: 238 GGLADDHDIL 247
           GG  + H +L
Sbjct: 217 GGATESHYVL 226


>gnl|CDD|173887 cd06899, lectin_legume_LecRK_Arcelin_ConA, legume lectins,
           lectin-like receptor kinases, arcelin, concanavalinA,
           and alpha-amylase inhibitor.  This alignment model
           includes the legume lectins (also known as agglutinins),
           the arcelin (also known as phytohemagglutinin-L) family
           of lectin-like defense proteins, the LecRK family of
           lectin-like receptor kinases, concanavalinA (ConA), and
           an alpha-amylase inhibitor.  Arcelin is a major seed
           glycoprotein discovered in kidney beans (Phaseolus
           vulgaris) that has insecticidal properties and protects
           the seeds from predation by larvae of various bruchids. 
           Arcelin is devoid of monosaccharide binding properties
           and lacks a key metal-binding loop that is present in
           other members of this family.  Phytohaemagglutinin (PHA)
           is a lectin found in plants, especially beans, that
           affects cell metabolism by inducing mitosis and by
           altering the permeability of the cell membrane to
           various proteins.  PHA agglutinates most mammalian red
           blood cell types by binding glycans on the cell surface.
           Medically, PHA is used as a mitogen to trigger cell
           division in T-lymphocytes and to activate latent HIV-1
           from human peripheral lymphocytes.  Plant L-type lectins
           are primarily found in the seeds of leguminous plants
           where they constitute about 10% of the total soluble
           protein of the seed extracts. They are synthesized
           during seed development several weeks after flowering
           and transported to the vacuole where they become
           condensed into specialized vesicles called protein
           bodies. L-type lectins have a dome-shaped beta-barrel
           carbohydrate recognition domain with a curved
           seven-stranded beta-sheet referred to as the "front
           face" and a flat six-stranded beta-sheet referred to as
           the "back face".  This domain homodimerizes so that
           adjacent back sheets form a contiguous 12-stranded sheet
           and homotetramers occur by a back-to-back association of
           these homodimers.  Though L-type lectins exhibit both
           sequence and structural similarity to one another, their
           carbohydrate binding specificities differ widely.
          Length = 236

 Score = 36.8 bits (86), Expect = 0.012
 Identities = 53/195 (27%), Positives = 71/195 (36%), Gaps = 61/195 (31%)

Query: 84  TTNFEWWNVDIVFRVTGRGR-IGADGLAFWYTSEKGSYDGEVFGSSDRWKGLGLFFDSFD 142
           +T+F        F +T     +G DGLAF+        D     SS  +  LGLF  S  
Sbjct: 64  STSF-------SFSITPPNPSLGGDGLAFFLAP----TDSLPPASSGGY--LGLFNSS-- 108

Query: 143 NDNNHNNPYIMAVVND--GNMAFDHQND---G------ASQSLAGCLRDFRNKPY---PT 188
           N+ N +N  I+AV  D   N  F   +D   G       S   AG   D   K     P 
Sbjct: 109 NNGNSSNH-IVAVEFDTFQNPEFGDPDDNHVGIDVNSLVSVK-AGYWDDDGGKLKSGKPM 166

Query: 189 RARIQYYMNTLTVWFHNGMTNNEQDIEVCLRVENI----------------YLPKEGYFG 232
           +A I Y          +  +     + V L    +                 LP+E Y G
Sbjct: 167 QAWIDY----------DSSSKR---LSVTLAYSGVAKPKKPLLSYPVDLSKVLPEEVYVG 213

Query: 233 VSAATGGLADDHDIL 247
            SA+TG L + H IL
Sbjct: 214 FSASTGLLTELHYIL 228


>gnl|CDD|234354 TIGR03789, pdsO, proteobacterial sortase system OmpA family
           protein.  A newly defined histidine kinase (TIGR03785)
           and response regulator (TIGR03787) gene pair occurs
           exclusively in Proteobacteria, mostly of marine origin,
           nearly all of which contain a subfamily 6 sortase
           (TIGR03784) and its single dedicated target protein
           (TIGR03788) adjacent to the sortase. This protein
           family shows up only in those species with the
           histidine kinase/response regulator gene pair, and often
           adjacent to that pair. It belongs to the OmpA protein
           family (pfam00691). Its function is unknown. We assign
           the gene symbol pdsO, for Proteobacterial Dedicated
           Sortase system OmpA family protein.
          Length = 239

 Score = 31.3 bits (71), Expect = 0.69
 Identities = 11/39 (28%), Positives = 20/39 (51%), Gaps = 6/39 (15%)

Query: 259 AKQQEQV------NQEDQKVAQEYAQYEKKLEEQKQHSQ 291
           A+Q++Q+       Q  +++  EY Q +  LE  +Q  Q
Sbjct: 87  AQQRQQMVALTQKQQALEQLEAEYQQAQVHLETLQQDQQ 125


>gnl|CDD|213783 TIGR03181, PDH_E1_alph_x, pyruvate dehydrogenase E1 component,
           alpha subunit.  Members of this protein family are the
           alpha subunit of the E1 component of pyruvate
           dehydrogenase (PDH). This model represents one branch of
           a larger family that includes E1-alpha proteins from
           2-oxoisovalerate dehydrogenase, acetoin dehydrogenase,
           another PDH clade, etc [Energy metabolism, Pyruvate
           dehydrogenase].
          Length = 341

 Score = 31.3 bits (72), Expect = 0.90
 Identities = 12/55 (21%), Positives = 25/55 (45%), Gaps = 8/55 (14%)

Query: 260 KQQEQVNQE-DQKVAQEYAQYEKKLEEQKQHSQNPVERFEYKYSFKPPYL-AQKD 312
           +Q+E + +E + +VA+  A+                + F++ Y+  PP L  Q+ 
Sbjct: 291 EQEEALEEEAEAEVAEAVAEALAL------PPPPVDDIFDHVYAELPPELEEQRA 339


>gnl|CDD|233758 TIGR02169, SMC_prok_A, chromosome segregation protein SMC,
           primarily archaeal type.  SMC (structural maintenance of
           chromosomes) proteins bind DNA and act in organizing and
           segregating chromosomes for partition. SMC proteins are
           found in bacteria, archaea, and eukaryotes. It is found
           in a single copy and is homodimeric in prokaryotes, but
           six paralogs (excluded from this family) are found in
           eukaryotes, where SMC proteins are heterodimeric. This
           family represents the SMC protein of archaea and a few
           bacteria (Aquifex, Synechocystis, etc); the SMC of other
           bacteria is described by TIGR02168. The N- and
           C-terminal domains of this protein are well conserved,
           but the central hinge region is skewed in composition
           and highly divergent [Cellular processes, Cell division,
           DNA metabolism, Chromosome-associated proteins].
          Length = 1164

 Score = 31.6 bits (72), Expect = 0.99
 Identities = 15/98 (15%), Positives = 39/98 (39%), Gaps = 7/98 (7%)

Query: 260 KQQEQVNQEDQKVAQEYAQYEKKLEEQKQHSQNPVERFEYKYSFKPPYLAQKDGYQK--D 317
           K+ EQ+ QE++K+ +   + E+ L   +Q  +N     +   +            ++  +
Sbjct: 723 KEIEQLEQEEEKLKERLEELEEDLSSLEQEIENVKSELKELEARIEELEEDLHKLEEALN 782

Query: 318 HPDAHPNEEEWYESENQRELRQIFQGQSQLAEWTKAIA 355
             +A  +     E   Q EL ++   + +++     + 
Sbjct: 783 DLEARLSHSRIPEI--QAELSKL---EEEVSRIEARLR 815


>gnl|CDD|221432 pfam12128, DUF3584, Protein of unknown function (DUF3584).  This
           protein is found in bacteria and eukaryotes. Proteins in
           this family are typically between 943 and 1234 amino
           acids in length. This family contains a P-loop motif
           suggesting it is a nucleotide binding protein. It may be
           involved in replication.
          Length = 1198

 Score = 30.8 bits (70), Expect = 1.5
 Identities = 32/148 (21%), Positives = 55/148 (37%), Gaps = 23/148 (15%)

Query: 239 GLADDHDILHFLTSSLLPPGAKQQEQVNQEDQKVAQEYAQYEKKLEEQKQHSQNPVERFE 298
             A   + L  L S L       ++Q+     +  +E  + E +L   KQ   +      
Sbjct: 414 QKAAIEEDLQALESQL-------RQQLEAGKLEFNEEEYELELRLGRLKQRLDSA----- 461

Query: 299 YKYSFKPPYLAQKDGYQKDHPDAHPNEEEWYESENQRELRQIFQGQSQLAEWTKAIATGL 358
              +  P  L Q +   +    A    EE  ++E   E     Q QS+L +  K     L
Sbjct: 462 ---TATPEELEQLEINDEALEKAQ---EEQEQAEANVE-----QLQSELRQLRKRRDEAL 510

Query: 359 DALQQKQDRILAVVSQGGGIPHQVVPGQ 386
           +ALQ+ + R+L +      +  Q+ P  
Sbjct: 511 EALQRAERRLLQLRQALDELELQLSPQA 538


>gnl|CDD|197696 smart00389, HOX, Homeodomain.  DNA-binding factors that are
           involved in the transcriptional regulation of key
           developmental processes.
          Length = 57

 Score = 27.6 bits (62), Expect = 1.9
 Identities = 10/32 (31%), Positives = 13/32 (40%), Gaps = 7/32 (21%)

Query: 181 FRNKPYPTRARIQYYMNTL-------TVWFHN 205
           F+  PYP+R   +     L        VWF N
Sbjct: 20  FQKNPYPSREEREELAKKLGLSERQVKVWFQN 51


>gnl|CDD|236196 PRK08241, PRK08241, RNA polymerase factor sigma-70; Validated.
          Length = 339

 Score = 30.3 bits (69), Expect = 2.0
 Identities = 23/67 (34%), Positives = 29/67 (43%), Gaps = 14/67 (20%)

Query: 341 FQGQSQLAEWTKAIATG--LDALQQKQDRIL-------AVVSQGGGIPHQVVPG-QPMPM 390
           F+G+S L  W   IAT   LDAL+ +  R L       A       +    VP  +P P 
Sbjct: 64  FEGRSSLRTWLYRIATNVCLDALEGRARRPLPTDLGAPAADPVDELVERPEVPWLEPYP- 122

Query: 391 INNDALL 397
              DALL
Sbjct: 123 ---DALL 126


>gnl|CDD|99890 cd05816, CBM20_DPE2_repeat2, Disproportionating enzyme 2 (DPE2),
          N-terminal CBM20 (carbohydrate-binding module, family
          20) domain, repeat 2. DPE2 is a transglucosidase that
          is essential for the cytosolic metabolism of maltose in
          plant leaves at night. Maltose is an intermediate on
          the pathway from starch to sucrose and DPE2 is thought
          to metabolize the maltose that is exported from the
          chloroplast. DPE2 has two N-terminal CBM20 domains as
          well as a C-terminal amylomaltase
          (4-alpha-glucanotransferase) catalytic domain. DPE1,
          the plastid version of this enzyme, has a
          transglucosidase domain that is similar to that of DPE2
          but lacks the N-terminal CBM20 domains. Included in
          this group are DPE2-like proteins from Dictyostelium,
          Entamoeba, and Bacteroides. The CBM20 domain is found
          in a large number of starch degrading enzymes including
          alpha-amylase, beta-amylase, glucoamylase, and CGTase
          (cyclodextrin glucanotransferase). CBM20 is also
          present in proteins that have a regulatory role in
          starch metabolism in plants (e.g. alpha-amylase) or
          glycogen metabolism in mammals (e.g. laforin). CBM20
          folds as an antiparallel beta-barrel structure with two
          starch binding sites. These two sites are thought to
          differ functionally with site 1 acting as the initial
          starch recognition site and site 2 involved in the
          specific recognition of appropriate regions of starch.
          Length = 99

 Score = 28.1 bits (63), Expect = 2.6
 Identities = 15/40 (37%), Positives = 18/40 (45%), Gaps = 8/40 (20%)

Query: 19 LVVLSSSQNPVERFEYKYSFKPPYLAQKDGSVPFWEYGGN 58
           + +S    P   FEYKY      +A KD  V  WE G N
Sbjct: 48 DIDISKDSFP---FEYKY-----IIANKDSGVVSWENGPN 79


>gnl|CDD|240584 cd12949, NOPS_PSPC1, NOPS domain, including C-terminal coiled-coil
           region, in paraspeckle protein component 1 (PSPC1) and
           similar proteins.  The family contains a DBHS domain
           (for Drosophila behavior, human splicing), which
           comprises two conserved RNA recognition motifs (RRMs),
           also termed RBDs (RNA binding domains) or RNPs
           (ribonucleoprotein domains), and a charged
           protein-protein interaction NOPS (NONA and PSP1) domain.
           This model corresponds to the NOPS domain, with a long
           helical C-terminal extension, of paraspeckle component 1
           (PSPC1, also termed PSP1), a novel nucleolar factor that
           accumulates within a new nucleoplasmic compartment,
           termed paraspeckles, and diffusely distributes in the
           nucleoplasm. It is ubiquitously expressed and highly
           conserved in vertebrates. Although its cellular function
           remains unknown currently, PSPC1 forms a novel
           heterodimer with the nuclear protein p54nrb, also known
           as non-POU domain-containing octamer-binding protein
           (NONO), which localizes to paraspeckles in an
           RNA-dependent manner. The NOPS domain specifically binds
           to the second RNA recognition motif (RRM2) domain of the
           partner DBHS protein via a substantial interaction
           surface. Its highly conserved C-terminal residues are
           critical for functional DBHS dimerization while the
           highly conserved C-terminal helical extension, forming a
           right-handed antiparallel heterodimeric coiled-coil, is
           essential for localization of these proteins to
           subnuclear bodies.
          Length = 94

 Score = 28.1 bits (62), Expect = 2.8
 Identities = 22/74 (29%), Positives = 38/74 (51%), Gaps = 6/74 (8%)

Query: 263 EQVNQED---QKVAQEYAQYEKKLEEQKQHSQNPVERFEYKYSFKPPYLAQKDGYQKDHP 319
           EQ + ED   +K+ Q+  QY K+  EQ      P   FE++Y+ +   L + +  Q++  
Sbjct: 10  EQFDDEDGLPEKLMQKTQQYHKE-REQPPRFAQP-GTFEFEYASRWKALDEMEKQQREQV 67

Query: 320 DAHPNE-EEWYESE 332
           D +  E +E  E+E
Sbjct: 68  DRNIREAKEKLEAE 81


>gnl|CDD|182544 PRK10555, PRK10555, aminoglycoside/multidrug efflux system;
           Provisional.
          Length = 1037

 Score = 29.8 bits (67), Expect = 3.0
 Identities = 14/34 (41%), Positives = 19/34 (55%), Gaps = 1/34 (2%)

Query: 249 FLTSSLLPPGAKQQEQVNQEDQKVAQEYAQYEKK 282
           F TS  LP G+ QQ Q  +  +KV + Y  +EK 
Sbjct: 571 FTTSVQLPSGSTQQ-QTLKVVEKVEKYYFTHEKD 603


>gnl|CDD|236465 PRK09319, PRK09319, bifunctional 3,4-dihydroxy-2-butanone
           4-phosphate synthase/GTP cyclohydrolase II/unknown
           domain fusion protein; Provisional.
          Length = 555

 Score = 29.5 bits (67), Expect = 3.9
 Identities = 14/60 (23%), Positives = 23/60 (38%), Gaps = 9/60 (15%)

Query: 318 HPDAHPNEEEWYESENQRELRQIFQGQSQLAEWTKA------IATGLD---ALQQKQDRI 368
                    +WY+  N   ++ I     +LA+W         I++G D    LQ + DR 
Sbjct: 466 FDQNKVASADWYKQSNHPYIKAIELLLDELAQWPNTKRLGFLISSGDDPALHLQVQLDRQ 525


>gnl|CDD|225087 COG2176, PolC, DNA polymerase III, alpha subunit (gram-positive
           type) [DNA replication, recombination, and repair].
          Length = 1444

 Score = 29.6 bits (67), Expect = 4.5
 Identities = 19/85 (22%), Positives = 33/85 (38%), Gaps = 15/85 (17%)

Query: 260 KQQEQVNQEDQKVAQEYAQYEKKLEEQKQHSQNPVERFEYKYSFKPPYLAQKDGYQKDHP 319
           K +E +N+E +K AQE  + EKKL+ +       VE+ +  +  +     +     K   
Sbjct: 178 KFEEAINEEVEKAAQEALEAEKKLKAE----SPKVEKPKPLFDGQKGRKIKSTEEIKP-- 231

Query: 320 DAHPNEEEWYESENQRELRQIFQGQ 344
                        N+ E R   +G 
Sbjct: 232 ---------LIKINEEETRVKVEGY 247


>gnl|CDD|240581 cd12946, NOPS_p54nrb_PSF_PSPC1, NOPS domain, including C-terminal
           coiled-coil region, in p54nrb/PSF/PSPC1 family proteins.
           The family contains a DBHS domain (for Drosophila
           behavior, human splicing), which comprises two conserved
           RNA recognition motifs (RRMs), also termed RBDs (RNA
           binding domains) or RNPs (ribonucleoprotein domains),
           and a charged protein-protein interaction NOPS (NONA and
           PSP1) domain. This model corresponds to the NOPS domain,
           with a long helical C-terminal extension, found in the
           p54nrb/PSF/PSPC1 proteins. The NOPS domain specifically
           binds to the second RNA recognition motif (RRM2) domain
           of the partner DBHS protein via a substantial
           interaction surface. Its highly conserved C-terminal
           residues are critical for functional DBHS dimerization
           while the highly conserved C-terminal helical extension,
           forming a right-handed antiparallel heterodimeric
           coiled-coil, is essential for localization of these
           proteins to subnuclear bodies. Members in the family
           include 54 kDa nuclear RNA- and DNA-binding protein
           (p54nrb), polypyrimidine tract-binding protein
           (PTB)-associated-splicing factor (PSF) and paraspeckle
           protein component 1 (PSPC1 or PSP1), which are
           ubiquitously expressed and are conserved in vertebrates.
           p54nrb, also termed NONO or NMT55, is a multi-functional
           protein involved in numerous nuclear processes including
           transcriptional regulation, splicing, DNA unwinding,
           nuclear retention of hyperedited double-stranded RNA,
           viral RNA processing, control of cell proliferation, and
           circadian rhythm maintenance. PSF, also termed POMp100,
           is a multi-functional protein that binds RNA,
           single-stranded DNA (ssDNA), double-stranded DNA (dsDNA)
           and many factors, and mediates diverse activities in the
           cell. PSPC1 is a novel nucleolar factor that accumulates
           within a new nucleoplasmic compartment, termed
           paraspeckles, and diffusely distributes in the
           nucleoplasm. The cellular function of PSPC1 remains
           unknown. PSF has an additional large
           N-terminal domain that differentiates it from other
           family members.
          Length = 93

 Score = 27.4 bits (60), Expect = 4.9
 Identities = 24/74 (32%), Positives = 42/74 (56%), Gaps = 6/74 (8%)

Query: 263 EQVNQED---QKVAQEYAQYEKKLEEQKQHSQNPVERFEYKYSFKPPYLAQKDGYQKDHP 319
           EQ++ ED   +K+AQ+  QY K+ E+  + +Q     FEY+Y+ +   L + +  Q++  
Sbjct: 10  EQLDDEDGLPEKLAQKNQQYHKEREQPPRFAQPGT--FEYEYAQRWKALDEMEKQQREQV 67

Query: 320 DAHPNE-EEWYESE 332
           D +  E +E  ESE
Sbjct: 68  DRNIKEAKEKLESE 81


>gnl|CDD|238039 cd00086, homeodomain, Homeodomain;  DNA binding domains involved in
           the transcriptional regulation of key eukaryotic
           developmental processes; may bind to DNA as monomers or
           as homo- and/or heterodimers, in a sequence-specific
           manner.
          Length = 59

 Score = 26.4 bits (59), Expect = 5.2
 Identities = 9/32 (28%), Positives = 12/32 (37%), Gaps = 7/32 (21%)

Query: 181 FRNKPYPTRARIQYYMNTL-------TVWFHN 205
           F   PYP+R   +     L        +WF N
Sbjct: 19  FEKNPYPSREEREELAKELGLTERQVKIWFQN 50


>gnl|CDD|233065 TIGR00634, recN, DNA repair protein RecN.  All proteins in this
           family for which functions are known are ATP binding
           proteins involved in the initiation of recombination and
           recombinational repair [DNA metabolism, DNA replication,
           recombination, and repair].
          Length = 563

 Score = 28.9 bits (65), Expect = 5.8
 Identities = 22/109 (20%), Positives = 48/109 (44%), Gaps = 20/109 (18%)

Query: 259 AKQQEQVNQEDQKVAQEYAQYEKKLEEQKQHSQNPVERFEY-KYSFKPPYLAQKDGYQKD 317
           A   E+V +  +++ Q + +  ++L++++Q  Q   +R ++ ++  +          + +
Sbjct: 154 AGANEKV-KAYRELYQAWLKARQQLKDRQQKEQELAQRLDFLQFQLE----------ELE 202

Query: 318 HPDAHPNEEEWYESENQRELRQIFQGQSQLAEWTKAIATGLDALQQKQD 366
             D  P E+E  E+E QR         S L +  +     L AL+   D
Sbjct: 203 EADLQPGEDEALEAEQQR--------LSNLEKLRELSQNALAALRGDVD 243


>gnl|CDD|235179 PRK03947, PRK03947, prefoldin subunit alpha; Reviewed.
          Length = 140

 Score = 27.6 bits (62), Expect = 6.1
 Identities = 9/32 (28%), Positives = 19/32 (59%)

Query: 260 KQQEQVNQEDQKVAQEYAQYEKKLEEQKQHSQ 291
           K  E++ +  QK+A   AQ  ++L++ +Q + 
Sbjct: 108 KALEKLEEALQKLASRIAQLAQELQQLQQEAA 139


>gnl|CDD|227278 COG4942, COG4942, Membrane-bound metallopeptidase [Cell division
           and chromosome partitioning].
          Length = 420

 Score = 28.5 bits (64), Expect = 6.7
 Identities = 11/46 (23%), Positives = 23/46 (50%)

Query: 246 ILHFLTSSLLPPGAKQQEQVNQEDQKVAQEYAQYEKKLEEQKQHSQ 291
           +L  L S+ +   A      +++ +++ +E A  EKK+ EQ+    
Sbjct: 17  LLASLLSAAVLAAAFSAAADDKQLKQIQKEIAALEKKIREQQDQRA 62


>gnl|CDD|221122 pfam11490, DNA_pol3_alph_N, DNA polymerase III polC-type
           N-terminus.  This is an N-terminal domain of DNA
           polymerase III polC subunit A that is found only in
           Firmicutes. DNA polymerase polC-type III enzyme
           functions as the 'replicase' in low G + C Gram-positive
           bacteria. Purine asymmetry is a characteristic of
           organisms with a heterodimeric DNA polymerase III
           alpha-subunit constituted by polC which probably plays a
           direct role in the maintenance of strand-biased gene
           distribution; since, among prokaryotic genomes, the
           distribution of genes on the leading and lagging strands
           of the replication fork is known to be biased. The
           domain is associated with DNA_pol3_alpha pfam07733.
          Length = 180

 Score = 28.1 bits (63), Expect = 6.7
 Identities = 10/35 (28%), Positives = 20/35 (57%)

Query: 259 AKQQEQVNQEDQKVAQEYAQYEKKLEEQKQHSQNP 293
            + +EQ  +E+ K+A+E  +  KK E +K+  +  
Sbjct: 146 EEFEEQKEEEEAKLAEEALEALKKKEAEKKKKEKE 180


>gnl|CDD|227731 COG5444, COG5444, Uncharacterized conserved protein [Function
           unknown].
          Length = 565

 Score = 28.6 bits (64), Expect = 7.4
 Identities = 18/133 (13%), Positives = 33/133 (24%), Gaps = 30/133 (22%)

Query: 260 KQQEQVNQEDQKVAQEYAQYEKKLEEQKQHSQNPVERFEYKYSF---------------- 303
              E++ + DQ+     A  E K++E K   +    + E                     
Sbjct: 157 DTLEKLYKLDQEGMTLMAAVESKMQELKAIIR----QLEEWTIKGGATKKGVPIHYVAKA 212

Query: 304 --------KPPYLAQKDGYQKDHPDAHPNEEEWYESENQRELRQIFQ-GQSQLAEWTKAI 354
                   K   +A +     D       +E     +  + L    + G   LA      
Sbjct: 213 FAEVTIHKKAAEVALQSETYLDIKTEL-AKERPDMRDLDKPLESANKTGYEGLAGEDIVP 271

Query: 355 ATGLDALQQKQDR 367
               +   Q    
Sbjct: 272 FFLAEETGQLAYS 284


>gnl|CDD|234750 PRK00409, PRK00409, recombination and DNA strand exchange inhibitor
           protein; Reviewed.
          Length = 782

 Score = 28.6 bits (65), Expect = 7.9
 Identities = 9/46 (19%), Positives = 22/46 (47%), Gaps = 4/46 (8%)

Query: 260 KQQEQVNQEDQKVAQE----YAQYEKKLEEQKQHSQNPVERFEYKY 301
           ++ EQ  +E + + +E      + E+K E+ ++     +E  E + 
Sbjct: 530 RELEQKAEEAEALLKEAEKLKEELEEKKEKLQEEEDKLLEEAEKEA 575


>gnl|CDD|221389 pfam12037, DUF3523, Domain of unknown function (DUF3523).  This
           presumed domain is functionally uncharacterized. This
           domain is found in eukaryotes. This domain is typically
           between 257 and 277 amino acids in length. This domain is
           found associated with pfam00004. This domain has a
           conserved LER sequence motif.
          Length = 276

 Score = 27.8 bits (62), Expect = 9.6
 Identities = 22/80 (27%), Positives = 34/80 (42%), Gaps = 11/80 (13%)

Query: 260 KQQEQVNQEDQKVAQEYAQYEKKLEEQKQHSQNPVERFEYKYSFKPPYLAQKDGYQKDHP 319
            QQ Q   E  +V  E  + +   E+ +Q  Q    R +Y+       LA+K  YQK+  
Sbjct: 77  AQQAQAKLERARVEAE-ERRKTLQEQTQQEQQ----RAQYQDE-----LARK-RYQKELE 125

Query: 320 DAHPNEEEWYESENQRELRQ 339
                 EE  + + +  LRQ
Sbjct: 126 QQRRQNEELLKMQEESVLRQ 145


  Database: CDD.v3.10
    Posted date:  Mar 20, 2013  7:55 AM
  Number of letters in database: 10,937,602
  Number of sequences in database:  44,354
  
Lambda     K      H
   0.316    0.134    0.414 

Gapped
Lambda     K      H
   0.267   0.0760    0.140 


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 44354
Number of Hits to DB: 21,118,718
Number of extensions: 2026201
Number of successful extensions: 2291
Number of sequences better than 10.0: 1
Number of HSP's gapped: 2266
Number of HSP's successfully gapped: 51
Length of query: 413
Length of database: 10,937,602
Length adjustment: 99
Effective length of query: 314
Effective length of database: 6,546,556
Effective search space: 2055618584
Effective search space used: 2055618584
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.3 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.6 bits)
S2: 60 (26.8 bits)