RPS-BLAST 2.2.26 [Sep-21-2011]

Database: CDD.v3.10 
           44,354 sequences; 10,937,602 total letters

Searching..................................................done

Query= psy1765
         (867 letters)



>gnl|CDD|133004 cd02510, pp-GalNAc-T, pp-GalNAc-T initiates the formation of
           mucin-type O-linked glycans.  UDP-GalNAc: polypeptide
           alpha-N-acetylgalactosaminyltransferases (pp-GalNAc-T)
           initiate the formation of mucin-type, O-linked glycans
           by catalyzing the transfer of
           alpha-N-acetylgalactosamine (GalNAc) from UDP-GalNAc to
           hydroxyl groups of Ser or Thr residues of core proteins
           to form the Tn antigen (GalNAc-a-1-O-Ser/Thr). These
           enzymes are type II membrane proteins with a GT-A type
           catalytic domain and a lectin domain located on the
           lumen side of the Golgi apparatus. In humans, there are
           15 isozymes of pp-GalNAc-Ts, representing the largest of
           all glycosyltransferase families. Each isozyme has
           unique but partially redundant substrate specificity for
           glycosylation sites on acceptor proteins.
          Length = 299

 Score =  418 bits (1077), Expect = e-140
 Identities = 166/274 (60%), Positives = 197/274 (71%), Gaps = 5/274 (1%)

Query: 147 LLHEIILVNDFSEYPSNLHGEVESFVKGLNNGRVHLYRTSKREGLIRARMFGAKYATGKV 206
           LL EIILV+DFS+ P  L   +E + K     +V + R  KREGLIRAR+ GA+ ATG V
Sbjct: 29  LLKEIILVDDFSDKP-ELKLLLEEYYKKYL-PKVKVLRLKKREGLIRARIAGARAATGDV 86

Query: 207 LVFLDSHIEVNTHWLEPLLVPIAERTNTVTVPIIDIINADTFQY-TSSALVRGGFNWGLH 265
           LVFLDSH EVN  WLEPLL  IAE   TV  PIID+I+ADTF+Y  SS   RGGF+W LH
Sbjct: 87  LVFLDSHCEVNVGWLEPLLARIAENRKTVVCPIIDVIDADTFEYRGSSGDARGGFDWSLH 146

Query: 266 FKWENLPKGTLNSSEDFIKPILSPTMAGGLFAIDRQYFDSLGQYDAGLEIWGGENLELSF 325
           FKW  LP+      E    PI SPTMAGGLFAIDR++F  LG YD G++IWGGENLELSF
Sbjct: 147 FKWLPLPEEERRR-ESPTAPIRSPTMAGGLFAIDREWFLELGGYDEGMDIWGGENLELSF 205

Query: 326 RIWMCGGSLAMIPCSRIGHVFRS-RRPYNNGHNEDPLTRNSLRVAHVWMDEYIEHFLKQR 384
           ++W CGGS+ ++PCSR+GH+FR  R+PY        + RN  RVA VWMDEY E+F K R
Sbjct: 206 KVWQCGGSIEIVPCSRVGHIFRRKRKPYTFPGGSGTVLRNYKRVAEVWMDEYKEYFYKAR 265

Query: 385 PEARNIDYGDVTDRKQLRARLGCKSFKWYLDNVY 418
           PE RNIDYGD+++RK LR RL CKSFKWYL+NVY
Sbjct: 266 PELRNIDYGDLSERKALRERLKCKSFKWYLENVY 299



 Score =  105 bits (265), Expect = 6e-25
 Identities = 40/55 (72%), Positives = 45/55 (81%)

Query: 455 NSLRVAHVWMDEYIEHFLKQRPEARNIDYGDVTDRKQLRARLGCKSFKWYLDNVY 509
           N  RVA VWMDEY E+F K RPE RNIDYGD+++RK LR RL CKSFKWYL+NVY
Sbjct: 245 NYKRVAEVWMDEYKEYFYKARPELRNIDYGDLSERKALRERLKCKSFKWYLENVY 299



 Score =  104 bits (263), Expect = 1e-24
 Identities = 43/88 (48%), Positives = 58/88 (65%), Gaps = 2/88 (2%)

Query: 55  SVIICFYNEHPATLYRSVQTLLSRTGQSLLHEIILVNDFSEYPSNLHGEVETFVKGLNDG 114
           SVII F+NE  +TL R+V ++++RT   LL EIILV+DFS+ P  L   +E + K     
Sbjct: 1   SVIIIFHNEALSTLLRTVHSVINRTPPELLKEIILVDDFSDKP-ELKLLLEEYYKKYL-P 58

Query: 115 RVHLYRTSKREGLIRARMFGAKYATGKN 142
           +V + R  KREGLIRAR+ GA+ ATG  
Sbjct: 59  KVKVLRLKKREGLIRARIAGARAATGDV 86



 Score = 39.1 bits (92), Expect = 0.007
 Identities = 12/14 (85%), Positives = 14/14 (100%)

Query: 626 LRCKSFKWYLDNVY 639
           L+CKSFKWYL+NVY
Sbjct: 286 LKCKSFKWYLENVY 299


>gnl|CDD|215980 pfam00535, Glycos_transf_2, Glycosyl transferase family 2.  Diverse
           family, transferring sugar from UDP-glucose,
           UDP-N-acetylgalactosamine, GDP-mannose or
           CDP-abequose, to a range of substrates including
           cellulose, dolichol phosphate and teichoic acids.
          Length = 168

 Score = 85.2 bits (211), Expect = 3e-19
 Identities = 42/166 (25%), Positives = 64/166 (38%), Gaps = 20/166 (12%)

Query: 144 IQSLLH------EIILVNDFSEYPSNLHGEVESFVKGLNNGRVHLYRTSKREGLIRARMF 197
           ++SLL+      EII+V+D S          E + K  N+ RV + R  +  G   AR  
Sbjct: 17  LESLLNQTYKNFEIIVVDDGS--TDGTVEIAEEYAK--NDPRVRVIRLEENLGKAAARNA 72

Query: 198 GAKYATGKVLVFLDSHIEVNTHWLEPLLVPIAERTNTVTVPIIDIINADTFQYTSSALVR 257
           G K ATG  + FLD+  EV   WLE L+  + +    + +    +IN +T  Y       
Sbjct: 73  GLKLATGDYIAFLDADDEVAPDWLEKLVELLEKNGADIVIGSRVVINGETRLY------- 125

Query: 258 GGFNWGLHFKWENLPKGTLNSSEDFIKPILSPTMAGGLFAIDRQYF 303
                 L F+   L       S       L  + A     +  +  
Sbjct: 126 ---GRALRFELLLLLGKLGARSLGLKVLFLIGSNALYRREVLEELL 168



 Score = 68.3 bits (167), Expect = 3e-13
 Identities = 32/87 (36%), Positives = 45/87 (51%), Gaps = 7/87 (8%)

Query: 55  SVIICFYNEHPATLYRSVQTLLSRTGQSLLHEIILVNDFSEYPSNLHGEVETFVKGLNDG 114
           SVII  YNE    L  ++++LL++T ++   EII+V+D S          E + K  ND 
Sbjct: 1   SVIIPTYNE-EKYLEETLESLLNQTYKNF--EIIVVDDGS--TDGTVEIAEEYAK--NDP 53

Query: 115 RVHLYRTSKREGLIRARMFGAKYATGK 141
           RV + R  +  G   AR  G K ATG 
Sbjct: 54  RVRVIRLEENLGKAAARNAGLKLATGD 80


>gnl|CDD|216044 pfam00652, Ricin_B_lectin, Ricin-type beta-trefoil lectin domain. 
          Length = 124

 Score = 62.6 bits (152), Expect = 9e-12
 Identities = 26/111 (23%), Positives = 41/111 (36%), Gaps = 9/111 (8%)

Query: 683 SGTDLCLTSKVDKTKGSPLVLKKCDELSKTQRWSKTDKSEL--VLAELLCLDAG----AT 736
           + +  CL        G P+ L  C      Q W+ T    +       LCLD       +
Sbjct: 9   NRSGKCLDVPGGSADGGPVGLYPCHG-GGNQLWTLTGDGTIRSNGNSNLCLDVSGGGNGS 67

Query: 737 KPKLTKCHEMGGSQEWNFVLRDKTPIYSPATGTCLGSKNRLENTVIVMEMC 787
           K  L  C+   G+Q W++       I +  +G CL  K     T +++  C
Sbjct: 68  KVVLWPCNGGSGNQRWDY--DGDGTIRNRKSGKCLDVKGASNGTKVILWTC 116



 Score = 50.2 bits (120), Expect = 2e-07
 Identities = 31/169 (18%), Positives = 47/169 (27%), Gaps = 59/169 (34%)

Query: 553 SSTDLCLTSKVDKTKGSPLVLKKCDELSKTQHWSKTDKSEL--VLAELLCLDAG----AT 606
           + +  CL        G P+ L  C      Q W+ T    +       LCLD       +
Sbjct: 9   NRSGKCLDVPGGSADGGPVGLYPCHG-GGNQLWTLTGDGTIRSNGNSNLCLDVSGGGNGS 67

Query: 607 KPKLTKCHEMGGSQEYWCWLRCKSFKWYLDNVYPEMILPSDDEDRLKKKWAQVEQPKFQP 666
           K  L  C+   G+Q           +W                                 
Sbjct: 68  KVVLWPCNGGSGNQ-----------RWD-------------------------------- 84

Query: 667 WYSRARNYTSHFHIRLSGTDLCLTSKVDKTKGSPLVLKKCDELSKTQRW 715
                  Y     IR   +  CL  K   + G+ ++L  CD  +  Q+W
Sbjct: 85  -------YDGDGTIRNRKSGKCLDVKGA-SNGTKVILWTCDG-NPNQQW 124



 Score = 36.4 bits (84), Expect = 0.012
 Identities = 17/68 (25%), Positives = 24/68 (35%), Gaps = 3/68 (4%)

Query: 544 YTSHFHIR-LSSTDLCLTSKVDKTKGSPLVLKKCDELSKTQHWSKTDKSELVLAEL-LCL 601
            T    IR   +++LCL        GS +VL  C+  S  Q W       +   +   CL
Sbjct: 42  LTGDGTIRSNGNSNLCLDVSGGGN-GSKVVLWPCNGGSGNQRWDYDGDGTIRNRKSGKCL 100

Query: 602 DAGATKPK 609
           D       
Sbjct: 101 DVKGASNG 108



 Score = 32.9 bits (75), Expect = 0.24
 Identities = 14/51 (27%), Positives = 20/51 (39%), Gaps = 7/51 (13%)

Query: 535 QPWYSRARNYTSHFHIRLSSTDLCLTSKVDKTKGSPLVLKKCDELSKTQHW 585
           Q W      Y     IR   +  CL  K   + G+ ++L  CD  +  Q W
Sbjct: 81  QRWD-----YDGDGTIRNRKSGKCLDVKGA-SNGTKVILWTCDG-NPNQQW 124


>gnl|CDD|238092 cd00161, RICIN, Ricin-type beta-trefoil; Carbohydrate-binding
           domain formed from presumed gene triplication. The
           domain is found in a variety of molecules serving
           diverse functions such as enzymatic activity, inhibitory
           toxicity and signal transduction. Highly specific ligand
           binding occurs on exposed surfaces of the compact domain
           structure.
          Length = 124

 Score = 60.2 bits (146), Expect = 6e-11
 Identities = 31/114 (27%), Positives = 46/114 (40%), Gaps = 10/114 (8%)

Query: 678 FHIRLSG-TDLCLTSKVDKTKGSPLVLKKCDELSKTQRWSKTDKSELVL-AELLCLDAGA 735
             IR    T LCL      + G P+ L  C      Q+W+ T    + + +  LCLD G 
Sbjct: 1   GTIRNVNNTGLCLDVN-GGSDGGPVQLYPCHGNGNNQKWTLTSDGTIRIKSSNLCLDVGG 59

Query: 736 ----TKPKLTKCHEMGGSQEWNFVLRDKTPIYSPATGTCLGSKNRLEN-TVIVM 784
               +K +L  C     +Q W F       I +  +G CL  K    N T +++
Sbjct: 60  DAPGSKVRLYTCSGGSDNQRWTF--NKDGTIRNLKSGKCLDVKGGNTNGTNLIL 111



 Score = 53.7 bits (129), Expect = 1e-08
 Identities = 34/169 (20%), Positives = 52/169 (30%), Gaps = 57/169 (33%)

Query: 552 LSSTDLCLTSKVDKTKGSPLVLKKCDELSKTQHWSKTDKSELVL-AELLCLDAGA----T 606
           +++T LCL      + G P+ L  C      Q W+ T    + + +  LCLD G     +
Sbjct: 6   VNNTGLCLDVN-GGSDGGPVQLYPCHGNGNNQKWTLTSDGTIRIKSSNLCLDVGGDAPGS 64

Query: 607 KPKLTKCHEMGGSQEYWCWLRCKSFKWYLDNVYPEMILPSDDEDRLKKKWAQVEQPKFQP 666
           K +L  C     +Q           +W  +                              
Sbjct: 65  KVRLYTCSGGSDNQ-----------RWTFNKDG--------------------------- 86

Query: 667 WYSRARNYTSHFHIRLSGTDLCLTSKVDKTKGSPLVLKKCDELSKTQRW 715
                        IR   +  CL  K   T G+ L+L  CD     Q+W
Sbjct: 87  ------------TIRNLKSGKCLDVKGGNTNGTNLILWTCDG-GPNQKW 122



 Score = 43.6 bits (103), Expect = 4e-05
 Identities = 23/95 (24%), Positives = 31/95 (32%), Gaps = 12/95 (12%)

Query: 526 WAQVEQPKFQPWYSRARNYTSHFHIRLSSTDLCLTSKVDKTKGSPLVLKKCDELSKTQHW 585
           +        Q W       TS   IR+ S++LCL        GS + L  C   S  Q W
Sbjct: 27  YPCHGNGNNQKWT-----LTSDGTIRIKSSNLCLDV-GGDAPGSKVRLYTCSGGSDNQRW 80

Query: 586 SKTDKSELVLAEL-LCLDA-----GATKPKLTKCH 614
           +      +   +   CLD        T   L  C 
Sbjct: 81  TFNKDGTIRNLKSGKCLDVKGGNTNGTNLILWTCD 115



 Score = 36.7 bits (85), Expect = 0.010
 Identities = 15/51 (29%), Positives = 19/51 (37%), Gaps = 6/51 (11%)

Query: 535 QPWYSRARNYTSHFHIRLSSTDLCLTSKVDKTKGSPLVLKKCDELSKTQHW 585
           Q W      +     IR   +  CL  K   T G+ L+L  CD     Q W
Sbjct: 78  QRWT-----FNKDGTIRNLKSGKCLDVKGGNTNGTNLILWTCDG-GPNQKW 122


>gnl|CDD|214672 smart00458, RICIN, Ricin-type beta-trefoil.  Carbohydrate-binding
           domain formed from presumed gene triplication.
          Length = 118

 Score = 60.2 bits (146), Expect = 6e-11
 Identities = 30/123 (24%), Positives = 45/123 (36%), Gaps = 9/123 (7%)

Query: 680 IRLSGTDLCLTSKVDKTKGSPLVLKKCDELSKTQRWSKTDKSEL-VLAELLCLDAGATKP 738
           I    T  CL    +K    P+ L  C      Q W  T    + +    LCL A     
Sbjct: 1   IISGNTGKCLDVNGNKN---PVGLFDCHGTGGNQLWKLTSDGAIRIKDTDLCLTANGNTG 57

Query: 739 ---KLTKCHEMGGSQEWNFVLRDKTPIYSPATGTCLGSKNRLENTVIVMEMCAQHKDTSW 795
               L  C     +Q W    +D T I +P +G CL  K+    T +++  C+ + +  W
Sbjct: 58  STVTLYSCDGTNDNQYWEV-NKDGT-IRNPDSGKCLDVKDGNTGTKVILWTCSGNPNQKW 115

Query: 796 DLV 798
              
Sbjct: 116 IFE 118



 Score = 43.3 bits (102), Expect = 5e-05
 Identities = 31/170 (18%), Positives = 46/170 (27%), Gaps = 59/170 (34%)

Query: 550 IRLSSTDLCLTSKVDKTKGSPLVLKKCDELSKTQHWSKTDKSEL-VLAELLCLDAGATKP 608
           I   +T  CL    +K    P+ L  C      Q W  T    + +    LCL A     
Sbjct: 1   IISGNTGKCLDVNGNKN---PVGLFDCHGTGGNQLWKLTSDGAIRIKDTDLCLTANGNTG 57

Query: 609 ---KLTKCHEMGGSQEYWCWLRCKSFKWYLDNVYPEMILPSDDEDRLKKKWAQVEQPKFQ 665
               L  C     +Q            W ++                             
Sbjct: 58  STVTLYSCDGTNDNQ-----------YWEVNK---------------------------- 78

Query: 666 PWYSRARNYTSHFHIRLSGTDLCLTSKVDKTKGSPLVLKKCDELSKTQRW 715
                  + T    IR   +  CL  K D   G+ ++L  C   +  Q+W
Sbjct: 79  -------DGT----IRNPDSGKCLDVK-DGNTGTKVILWTCSG-NPNQKW 115



 Score = 42.5 bits (100), Expect = 9e-05
 Identities = 30/96 (31%), Positives = 37/96 (38%), Gaps = 13/96 (13%)

Query: 665 QPWYSRARNYTSHFHIRLSGTDLCLTSKVDKTKGSPLVLKKCDELSKTQRWS-KTDKSEL 723
           Q W       TS   IR+  TDLCLT   +   GS + L  CD  +  Q W    D +  
Sbjct: 31  QLWK-----LTSDGAIRIKDTDLCLT--ANGNTGSTVTLYSCDGTNDNQYWEVNKDGTIR 83

Query: 724 VLAELLCLDA----GATKPKLTKCHEMGGSQEWNFV 755
                 CLD       TK  L  C     +Q+W F 
Sbjct: 84  NPDSGKCLDVKDGNTGTKVILWTCSG-NPNQKWIFE 118



 Score = 35.9 bits (83), Expect = 0.018
 Identities = 27/86 (31%), Positives = 33/86 (38%), Gaps = 12/86 (13%)

Query: 535 QPWYSRARNYTSHFHIRLSSTDLCLTSKVDKTKGSPLVLKKCDELSKTQHWS-KTDKSEL 593
           Q W       TS   IR+  TDLCLT   +   GS + L  CD  +  Q+W    D +  
Sbjct: 31  QLWK-----LTSDGAIRIKDTDLCLT--ANGNTGSTVTLYSCDGTNDNQYWEVNKDGTIR 83

Query: 594 VLAELLCLDA----GATKPKLTKCHE 615
                 CLD       TK  L  C  
Sbjct: 84  NPDSGKCLDVKDGNTGTKVILWTCSG 109


>gnl|CDD|234419 TIGR03965, mycofact_glyco, mycofactocin system glycosyltransferase.
            Members of this protein family are putative
           glycosyltransferases, members of pfam00535 (glycosyl
           transferase family 2). Members appear mostly in the
           Actinobacteria, where they appear to be part of a system
           for converting a precursor peptide (TIGR03969) into a
           novel redox carrier designated mycofactocin. A radical
           SAM enzyme, TIGR03962, is proposed to be a key
           maturase for mycofactocin.
          Length = 467

 Score = 57.8 bits (140), Expect = 1e-08
 Identities = 56/239 (23%), Positives = 93/239 (38%), Gaps = 44/239 (18%)

Query: 149 HEIILVNDFSEYPSNLHGEVESFVKGLNNGRVHLYRTSKREGLIRARMFGAKYATGKVLV 208
            E+I+V+D SE P      V +         V + R  +R+G   AR  GA+ A  + + 
Sbjct: 106 LEVIVVDDGSEDP------VPTRAARGARLPVRVIRHPRRQGPAAARNAGARAARTEFVA 159

Query: 209 FLDSHIEVNTHWLEPLLVPIAE-RTNTVTVPIIDIINADT----FQYTSSALVRGGFNWG 263
           F DS +     WL  LL    +     V   ++ +   DT    ++   S+L        
Sbjct: 160 FTDSDVVPRPGWLRALLAHFDDPGVALVAPRVVALPAEDTRLARYEAVRSSL-------- 211

Query: 264 LHFKWENLPKGTLNSSEDFIKPILSPT--MAGGLFAIDRQYFDSLGQYDAGLEIWGGENL 321
                       L   E  ++P   P   +      + R+    +G +D  LE+  GE++
Sbjct: 212 -----------DLGPEEAVVRP-RGPVSYVPSAALLVRRRALLEVGGFDERLEV--GEDV 257

Query: 322 ELSFRIWMCGGSLAMIPCSRIGHVFRSR------RPYNNGHNEDPLTR---NSLRVAHV 371
           +L +R+   GG +   P + + H  R+R      R    G +  PL R    S+R   V
Sbjct: 258 DLCWRLCEAGGRVRYEPAAVVAHDHRTRLWPWLARRAFYGTSAAPLARRHPGSVRPMVV 316



 Score = 34.7 bits (80), Expect = 0.22
 Identities = 28/93 (30%), Positives = 41/93 (44%), Gaps = 7/93 (7%)

Query: 48  PSTLPSTSVIICFYNEHPATLYRSVQTLLSRTGQSLLHEIILVNDFSEYPSNLHGEVETF 107
             + PS +V++   N  PA L R +  LL+        E+I+V+D SE P      V T 
Sbjct: 70  LPSPPSVTVVVPVRN-RPAGLARLLAALLALDYPRDRLEVIVVDDGSEDP------VPTR 122

Query: 108 VKGLNDGRVHLYRTSKREGLIRARMFGAKYATG 140
                   V + R  +R+G   AR  GA+ A  
Sbjct: 123 AARGARLPVRVIRHPRRQGPAAARNAGARAART 155


>gnl|CDD|132997 cd00761, Glyco_tranf_GTA_type, Glycosyltransferase family A (GT-A)
           includes diverse families of glycosyl transferases with
           a common GT-A type structural fold.
           Glycosyltransferases (GTs) are enzymes that synthesize
           oligosaccharides, polysaccharides, and glycoconjugates
           by transferring the sugar moiety from an activated
           nucleotide-sugar donor to an acceptor molecule, which
           may be a growing oligosaccharide, a lipid, or a protein.
            Based on the stereochemistry of the donor and acceptor
           molecules, GTs are classified as either retaining or
           inverting enzymes. To date, all GT structures adopt one
           of two possible folds, termed GT-A fold and GT-B fold.
           This hierarchy includes diverse families of glycosyl
           transferases with a common GT-A type structural fold,
           which has two tightly associated beta/alpha/beta domains
           that tend to form a continuous central sheet of at least
           eight beta-strands. The majority of the proteins in this
           superfamily are Glycosyltransferase family 2 (GT-2)
           proteins, but it also includes families GT-43, GT-6,
           GT-8, GT-13 and GT-7, which are evolutionarily related to
           GT-2 and share structural similarities.
          Length = 156

 Score = 53.7 bits (129), Expect = 2e-08
 Identities = 27/96 (28%), Positives = 45/96 (46%), Gaps = 10/96 (10%)

Query: 144 IQSLL------HEIILVNDFSEYPSNLHGEVESFVKGLNNGRVHLYRTSKREGLIRARMF 197
           ++SLL       E+I+V+D S   +     +E + K   + RV      + +GL  AR  
Sbjct: 16  LESLLAQTYPNFEVIVVDDGSTDGTLE--ILEEYAK--KDPRVIRVINEENQGLAAARNA 71

Query: 198 GAKYATGKVLVFLDSHIEVNTHWLEPLLVPIAERTN 233
           G K A G+ ++FLD+   +   WLE L+  +     
Sbjct: 72  GLKAARGEYILFLDADDLLLPDWLERLVAELLADPE 107



 Score = 41.7 bits (98), Expect = 3e-04
 Identities = 33/101 (32%), Positives = 47/101 (46%), Gaps = 16/101 (15%)

Query: 56  VIICFYNEHPATLYRSVQTLLSRTGQSLLHEIILVNDFSEYPSNLHGEVETFVKGLNDGR 115
           VII  YNE    L R +++LL++T  +   E+I+V+D S   +     +E + K   D R
Sbjct: 1   VIIPAYNE-EPYLERCLESLLAQTYPNF--EVIVVDDGSTDGTLE--ILEEYAK--KDPR 53

Query: 116 VHLYRTSKREGLIRARMFGAKYATGKNRIQSLLHEIILVND 156
           V      + +GL  AR  G K A G         E IL  D
Sbjct: 54  VIRVINEENQGLAAARNAGLKAARG---------EYILFLD 85


>gnl|CDD|197727 smart00443, G_patch, glycine rich nucleic binding domain.  A
           predicted glycine rich nucleic binding domain found in
           the splicing factor 45, SON DNA binding protein and
           D-type retrovirus polyproteins.
          Length = 47

 Score = 49.5 bits (119), Expect = 5e-08
 Identities = 15/26 (57%), Positives = 18/26 (69%)

Query: 818 IPNENKGFRMLSKMGWKAGQTLGKDE 843
           I   N G ++L KMGWK GQ LGK+E
Sbjct: 1   ISTSNIGAKLLRKMGWKEGQGLGKNE 26


>gnl|CDD|144978 pfam01585, G-patch, G-patch domain.  This domain is found in a
           number of RNA binding proteins, and is also found in
           proteins that contain RNA binding domains. This suggests
           that this domain may have an RNA binding function. This
           domain has seven highly conserved glycines.
          Length = 45

 Score = 48.3 bits (116), Expect = 1e-07
 Identities = 15/23 (65%), Positives = 18/23 (78%)

Query: 821 ENKGFRMLSKMGWKAGQTLGKDE 843
            N GF++L KMGWK GQ LGK+E
Sbjct: 2   SNIGFKLLQKMGWKPGQGLGKNE 24


>gnl|CDD|224137 COG1216, COG1216, Predicted glycosyltransferases [General function
           prediction only].
          Length = 305

 Score = 50.2 bits (120), Expect = 2e-06
 Identities = 50/224 (22%), Positives = 79/224 (35%), Gaps = 32/224 (14%)

Query: 137 YATGKNRIQSL---------LHEIILVNDFSEYPSNLHGEVESFVKGLNNGRVHLYRTSK 187
           Y  G++ ++ L            I++V++ S             +K      V L    +
Sbjct: 12  YNRGEDLVECLASLAAQTYPDDVIVVVDNGS------TDGSLEALKARFFPNVRLIENGE 65

Query: 188 REGLIRARMFGAKYATGKV---LVFLDSHIEVNTHWLEPLLVPIAE-RTNTVTVPIIDII 243
             G       G KYA  K    ++ L+    V    LE LL    E     V  P+I   
Sbjct: 66  NLGFAGGFNRGIKYALAKGDDYVLLLNPDTVVEPDLLEELLKAAEEDPAAGVVGPLIRN- 124

Query: 244 NADTFQYTSSALVRGGFNWGLHFKWENLPKGT---LNSSEDFIKPILSPTMAGGLFAIDR 300
               +  +     RGG + GL   W   P        SS   +   LS    G    I R
Sbjct: 125 ----YDESLYIDRRGGESDGLTGGWRASPLLEIAPDLSSYLEVVASLS----GACLLIRR 176

Query: 301 QYFDSLGQYDAGLEIWGGENLELSFRIWMCGGSLAMIPCSRIGH 344
           + F+ +G +D    I+  E+++L  R    G  +  +P + I H
Sbjct: 177 EAFEKVGGFDERFFIY-YEDVDLCLRARKAGYKIYYVPDAIIYH 219



 Score = 34.0 bits (78), Expect = 0.30
 Identities = 23/105 (21%), Positives = 37/105 (35%), Gaps = 14/105 (13%)

Query: 52  PSTSVIICFYNEHPATLYRSVQTLLSRTGQSLLHEIILVNDFSEYPSNLHGEVETFVKGL 111
           P  S+II  YN     L   + +L ++T       I++V++ S             +K  
Sbjct: 3   PKISIIIVTYN-RGEDLVECLASLAAQTYPD--DVIVVVDNGS------TDGSLEALKAR 53

Query: 112 NDGRVHLYRTSKREGLIRARMFGAKYATGKNRIQSLLHEIILVND 156
               V L    +  G       G KYA  K         ++L+N 
Sbjct: 54  FFPNVRLIENGENLGFAGGFNRGIKYALAKGDD-----YVLLLNP 93


>gnl|CDD|223539 COG0463, WcaA, Glycosyltransferases involved in cell wall
           biogenesis [Cell envelope biogenesis, outer membrane].
          Length = 291

 Score = 48.5 bits (114), Expect = 6e-06
 Identities = 28/87 (32%), Positives = 42/87 (48%), Gaps = 7/87 (8%)

Query: 55  SVIICFYNEHPATLYRSVQTLLSRTGQSLLHEIILVNDFSEYPSNLHGEVETFVKGLNDG 114
           SV+I  YNE    L  ++++LL++T +    EII+V+D S   +        +  G  D 
Sbjct: 6   SVVIPTYNE-EEYLPEALESLLNQTYKDF--EIIVVDDGSTDGTTE--IAIEY--GAKDV 58

Query: 115 RVHLYRTSKREGLIRARMFGAKYATGK 141
           RV      +  GL  AR  G +YA G 
Sbjct: 59  RVIRLINERNGGLGAARNAGLEYARGD 85



 Score = 43.1 bits (100), Expect = 3e-04
 Identities = 28/111 (25%), Positives = 43/111 (38%), Gaps = 15/111 (13%)

Query: 144 IQSLLH------EIILVNDFSEYPSNLHGEVESFVKGLNNGRVHLYRTSKREGLIRARMF 197
           ++SLL+      EII+V+D S   +        +  G  + RV      +  GL  AR  
Sbjct: 22  LESLLNQTYKDFEIIVVDDGSTDGTTE--IAIEY--GAKDVRVIRLINERNGGLGAARNA 77

Query: 198 GAKYATGKVLVFLDSHIEVNTHWLEPLLVPIAERTNTVTVPIIDIINADTF 248
           G +YA G  +VFLD+          P L+P+                 D +
Sbjct: 78  GLEYARGDYIVFLDADD-----QHPPELIPLVAAGGDGDYIARLDDRDDIW 123


>gnl|CDD|224136 COG1215, COG1215, Glycosyltransferases, probably involved in cell
           wall biogenesis [Cell envelope biogenesis, outer
           membrane].
          Length = 439

 Score = 46.1 bits (109), Expect = 7e-05
 Identities = 36/183 (19%), Positives = 62/183 (33%), Gaps = 16/183 (8%)

Query: 149 HEIILVNDFSEYPSNLHGEVESFVKGLNNGRVHLYRTSKREGLIRARMFGAKYATGKVLV 208
           +E+I+V+D S      +  +E            +Y   K  G   A   G K A G V+V
Sbjct: 85  YEVIVVDDGST--DETYEILEELGAEYGPNFRVIYPEKKNGGKAGALNNGLKRAKGDVVV 142

Query: 209 FLDSHIEVNTHWLEPLLVPIAERTNTVTVPIIDIINADTFQYTSSALVRGGFNWGLHFKW 268
            LD+        L  L+ P  +      V    I N          +    +    +F+ 
Sbjct: 143 ILDADTVPEPDALRELVSPFEDPPVGAVVGTPRIRNRPDPSNLLGRIQAIEYLSAFYFRL 202

Query: 269 ENLPKGTLNSSEDFIKPILSPTMAGGLFAIDRQYFDSLGQYDAGLEIWGGENLELSFRIW 328
                    +S+  +   LS    G   A  R   + +G +         E+ +L+ R+ 
Sbjct: 203 -------RAASKGGLISFLS----GSSSAFRRSALEEVGGWLED---TITEDADLTLRLH 248

Query: 329 MCG 331
           + G
Sbjct: 249 LRG 251



 Score = 44.5 bits (105), Expect = 2e-04
 Identities = 28/91 (30%), Positives = 40/91 (43%), Gaps = 4/91 (4%)

Query: 51  LPSTSVIICFYNEHPATLYRSVQTLLSRTGQSLLHEIILVNDFSEYPSNLHGEVETFVKG 110
           LP  SVII  YNE P  L  ++++LLS+      +E+I+V+D S      +  +E     
Sbjct: 53  LPKVSVIIPAYNEEPEVLEETLESLLSQDYPR--YEVIVVDDGST--DETYEILEELGAE 108

Query: 111 LNDGRVHLYRTSKREGLIRARMFGAKYATGK 141
                  +Y   K  G   A   G K A G 
Sbjct: 109 YGPNFRVIYPEKKNGGKAGALNNGLKRAKGD 139


>gnl|CDD|221692 pfam12656, G-patch_2, DExH-box splicing factor binding site.  Yeast
           Spp2, a G-patch protein and spliceosome component,
           interacts with the ATP-dependent DExH-box splicing
           factor Prp2. As this interaction involves the G-patch
           sequence in Spp2 and is required for the recruitment of
           Prp2 to the spliceosome before the first catalytic step
           of splicing, it is proposed that Spp2 might be an
           accessory factor that confers spliceosome specificity on
           Prp2.
          Length = 79

 Score = 39.3 bits (92), Expect = 4e-04
 Identities = 18/46 (39%), Positives = 24/46 (52%), Gaps = 5/46 (10%)

Query: 816 ESIPNENKGFRMLSKMGWKAGQTLGKDEANSAALIEPE-----LGL 856
           E++P E  G  +L  MGWK GQ +GK+        EP+     LGL
Sbjct: 26  EAVPVEEFGAALLRGMGWKEGQGIGKNNKGDVKPKEPKRRPGGLGL 71


>gnl|CDD|133016 cd02525, Succinoglycan_BP_ExoA, ExoA is involved in the
           biosynthesis of succinoglycan.  Succinoglycan
           Biosynthesis Protein ExoA catalyzes the formation of a
           beta-1,3 linkage of the second sugar (glucose) of the
           succinoglycan with the galactose on the lipid carrier.
           Succinoglycan is an acidic exopolysaccharide that is
           important for invasion of the nodules. Succinoglycan is
           a high-molecular-weight polymer composed of repeating
           octasaccharide units. These units are synthesized on
           membrane-bound isoprenoid lipid carriers, beginning with
           galactose followed by seven glucose molecules, and
           modified by the addition of acetate, succinate, and
           pyruvate. ExoA is a membrane protein with a
           transmembrane domain at the C-terminus.
          Length = 249

 Score = 38.4 bits (90), Expect = 0.010
 Identities = 29/170 (17%), Positives = 59/170 (34%), Gaps = 25/170 (14%)

Query: 194 ARMFGAKYATGKVLVFLDSHIEVNTHWLEPLLVPIAERTNTVTVPIIDIINADTFQYTSS 253
               G + + G +++ +D+H      ++   LV   +RT    V                
Sbjct: 72  GLNIGIRNSRGDIIIRVDAHAVYPKDYILE-LVEALKRTGADNVGGPMET---------- 120

Query: 254 ALVRGGFNWGLHFKWENLP--KGTLNSSEDFIKPILSPTMAGGLFAIDRQYFDSLGQYDA 311
            +    F   +     +     G+        K     T+  G +  +   F+ +G +D 
Sbjct: 121 -IGESKFQKAIAVAQSSPLGSGGSAYRGGAV-KIGYVDTVHHGAYRREV--FEKVGGFDE 176

Query: 312 GLEIWGGENLELSFRIWMCGGSLAMIPCSRIGHVFRS------RRPYNNG 355
            L     E+ EL++R+   G  + + P  R+ +  RS      R+ +  G
Sbjct: 177 SLVR--NEDAELNYRLRKAGYKIWLSPDIRVYYYPRSTLKKLARQYFRYG 224


>gnl|CDD|133022 cd04179, DPM_DPG-synthase_like, DPM_DPG-synthase_like is a member
           of the Glycosyltransferase 2 superfamily.  DPM1 is the
           catalytic subunit of eukaryotic dolichol-phosphate
           mannose (DPM) synthase. DPM synthase is required for
           synthesis of the glycosylphosphatidylinositol (GPI)
           anchor, N-glycan precursor, protein O-mannose, and
           C-mannose. In higher eukaryotes, the enzyme has three
           subunits, DPM1, DPM2 and DPM3. DPM is synthesized from
           dolichol phosphate and GDP-Man on the cytosolic surface
           of the ER membrane by DPM synthase and then is flipped
           onto the luminal side and used as a donor substrate. In
           lower eukaryotes, such as Saccharomyces cerevisiae and
           Trypanosoma brucei, DPM synthase consists of a single
           component (Dpm1p and TbDpm1, respectively) that
           possesses one predicted transmembrane region near the C
           terminus for anchoring to the ER membrane. In contrast,
           the Dpm1 homologues of higher eukaryotes, namely fission
           yeast, fungi, and animals, have no transmembrane region,
           suggesting the existence of adapter molecules for
           membrane anchoring. This family also includes bacteria
           and archaea DPM1_like enzymes. However, the enzyme
           structure and mechanism of function are not well
           understood. The UDP-glucose:dolichyl-phosphate
           glucosyltransferase (DPG_synthase) is a
           transmembrane-bound enzyme of the endoplasmic reticulum
           involved in protein N-linked glycosylation. This enzyme
           catalyzes the transfer of glucose from UDP-glucose to
           dolichyl phosphate. This protein family belongs to
           Glycosyltransferase 2 superfamily.
          Length = 185

 Score = 37.2 bits (87), Expect = 0.015
 Identities = 37/186 (19%), Positives = 62/186 (33%), Gaps = 46/186 (24%)

Query: 149 HEIILVNDFSEYPSNLHGEVESFVKGLNNGRVHLYRTSKREGLIRARMFGAKYATGKVLV 208
           +EII+V+D S                    RV + R S+  G   A   G K A G ++V
Sbjct: 29  YEIIVVDDGS--TDGTAEIAREL--AARVPRVRVIRLSRNFGKGAAVRAGFKAARGDIVV 84

Query: 209 FLDS----HIEVNTHWLEPLLVPIAERTNTVTVPIIDIINADTFQYTSSA---LVRGGFN 261
            +D+      E     +  LL  + E          D++    F     A   L+R   +
Sbjct: 85  TMDADLQHPPED----IPKLLEKLLEGGA-------DVVIGSRFVRGGGAGMPLLRRLGS 133

Query: 262 WGLHFKWENLPKGTLNSSEDFIKPILSPTM---AGGLFAIDRQYFDSLGQ------YDAG 312
              +F                I+ +L   +     G     R+  ++L        ++ G
Sbjct: 134 RLFNF---------------LIRLLLGVRISDTQSGFRLFRREVLEALLSLLESNGFEFG 178

Query: 313 LEIWGG 318
           LE+  G
Sbjct: 179 LELLVG 184



 Score = 34.1 bits (79), Expect = 0.17
 Identities = 26/87 (29%), Positives = 39/87 (44%), Gaps = 9/87 (10%)

Query: 56  VIICFYNEHPATLYRSVQTLLSRTGQSLLHEIILVNDFSEYPSNLHGEVETFVKGL--ND 113
           V+I  YNE    +   V+ LL+   +   +EII+V+D S   +    E+    + L    
Sbjct: 1   VVIPAYNE-EENIPELVERLLAVLEEGYDYEIIVVDDGSTDGT---AEI---ARELAARV 53

Query: 114 GRVHLYRTSKREGLIRARMFGAKYATG 140
            RV + R S+  G   A   G K A G
Sbjct: 54  PRVRVIRLSRNFGKGAAVRAGFKAARG 80


>gnl|CDD|217196 pfam02709, Glyco_transf_7C, N-terminal domain of
           galactosyltransferase.  This is the N-terminal domain of
           a family of galactosyltransferases from a wide range of
           Metazoa with three related galactosyltransferase
           activities, all three of which are possessed by one
           sequence in some cases. EC:2.4.1.90, N-acetyllactosamine
           synthase; EC:2.4.1.38,
           Beta-N-acetylglucosaminyl-glycopeptide beta-1,4-
           galactosyltransferase; and EC:2.4.1.22 Lactose synthase.
           Note that N-acetyllactosamine synthase is a component of
           Lactose synthase along with alpha-lactalbumin, in the
           absence of alpha-lactalbumin EC:2.4.1.90 is the
           catalyzed reaction.
          Length = 78

 Score = 34.9 bits (81), Expect = 0.017
 Identities = 12/57 (21%), Positives = 25/57 (43%)

Query: 278 SSEDFIKPILSPTMAGGLFAIDRQYFDSLGQYDAGLEIWGGENLELSFRIWMCGGSL 334
           + + F   +      GG+ A  ++ F  +  +      WGGE+ +L  R+ + G  +
Sbjct: 6   ALDKFNYKLPYKGYFGGVLAFSKEDFLKVNGFSNNFWGWGGEDDDLYARLLLAGLKI 62


>gnl|CDD|133030 cd04187, DPM1_like_bac, Bacterial DPM1_like enzymes are related to
           eukaryotic DPM1.  A family of bacterial enzymes related
           to eukaryotic DPM1. Although the mechanism of the eukaryotic
           enzyme is well studied, the mechanism of the bacterial
           enzymes is not well understood. The eukaryotic DPM1 is
           the catalytic subunit of eukaryotic Dolichol-phosphate
           mannose (DPM) synthase. DPM synthase is required for
           synthesis of the glycosylphosphatidylinositol (GPI)
           anchor, N-glycan precursor, protein O-mannose, and
           C-mannose. The enzyme has three subunits, DPM1, DPM2 and
           DPM3. DPM is synthesized from dolichol phosphate and
           GDP-Man on the cytosolic surface of the ER membrane by
           DPM synthase and then is flipped onto the luminal side
           and used as a donor substrate. This protein family
           belongs to Glycosyltransferase 2 superfamily.
          Length = 181

 Score = 35.1 bits (82), Expect = 0.072
 Identities = 33/153 (21%), Positives = 62/153 (40%), Gaps = 32/153 (20%)

Query: 56  VIICFYNEHPA--TLYRSVQTLLSRTGQSLLHEIILVNDFSEYPSNLHGEVETF--VKGL 111
           +++  YNE      LY  ++ +L   G    +EII V+D S           T   ++ L
Sbjct: 1   IVVPVYNEEENLPELYERLKAVLESLGYD--YEIIFVDDGS--TDR------TLEILREL 50

Query: 112 --NDGRVHLYRTSKREGLIRARMFGAKYATGKNRIQSLLHEIILVNDFSEYPSNLHGEVE 169
              D RV + R S+  G   A + G  +A G + +      I +  D  + P      + 
Sbjct: 51  AARDPRVKVIRLSRNFGQQAALLAGLDHARG-DAV------ITMDADLQDPPE----LIP 99

Query: 170 SFVKGLNNGR--VHLYRTSKREGLIR---ARMF 197
             +     G   V+  R +++E  ++   +++F
Sbjct: 100 EMLAKWEEGYDVVYGVRKNRKESWLKRLTSKLF 132



 Score = 29.8 bits (68), Expect = 4.4
 Identities = 18/76 (23%), Positives = 34/76 (44%), Gaps = 10/76 (13%)

Query: 143 RIQSLL------HEIILVNDFSEYPSNLHGEVESFVKGLNNGRVHLYRTSKREGLIRARM 196
           R++++L      +EII V+D S         +        + RV + R S+  G   A +
Sbjct: 18  RLKAVLESLGYDYEIIFVDDGS--TDRTLEILRELAA--RDPRVKVIRLSRNFGQQAALL 73

Query: 197 FGAKYATGKVLVFLDS 212
            G  +A G  ++ +D+
Sbjct: 74  AGLDHARGDAVITMDA 89


>gnl|CDD|223844 COG0773, MurC, UDP-N-acetylmuramate-alanine ligase [Cell envelope
           biogenesis, outer membrane].
          Length = 459

 Score = 36.0 bits (84), Expect = 0.092
 Identities = 28/117 (23%), Positives = 41/117 (35%), Gaps = 35/117 (29%)

Query: 465 DEYIEHFLKQRPEA---RNID------YGDVTDRKQLRARLGCKSFKWYLDNV--YPEMI 513
           DE    FL   P      NI+      YGD+   KQ        +F  ++ NV  Y   +
Sbjct: 164 DESDSSFLHYNPRVAIVTNIEFDHLDYYGDLEAIKQ--------AFHHFVRNVPFYGRAV 215

Query: 514 LPSDDEERLKK-----KWAQVEQPKFQP---WYSRARNYT-----SHFHIRLSSTDL 557
           +  DD   L++      W+ V    F     W  RA N       + F +     +L
Sbjct: 216 VCGDDPN-LRELLSRGCWSPVVTYGFDDEADW--RAENIRQDGSGTTFDVLFRGEEL 269



 Score = 32.6 bits (75), Expect = 1.0
 Identities = 26/100 (26%), Positives = 36/100 (36%), Gaps = 30/100 (30%)

Query: 374 DEYIEHFLKQRPEA---RNID------YGDVTDRKQLRARLGCKSFKWYLDNV--YPEMI 422
           DE    FL   P      NI+      YGD+   KQ        +F  ++ NV  Y   +
Sbjct: 164 DESDSSFLHYNPRVAIVTNIEFDHLDYYGDLEAIKQ--------AFHHFVRNVPFYGRAV 215

Query: 423 LPSDDEERLKK-----KWAQVEQPKFQP---WYSRARNYT 454
           +  DD   L++      W+ V    F     W  RA N  
Sbjct: 216 VCGDDPN-LRELLSRGCWSPVVTYGFDDEADW--RAENIR 252


>gnl|CDD|133029 cd04186, GT_2_like_c, Subfamily of Glycosyltransferase Family GT2
           of unknown function.  GT-2 includes diverse families of
           glycosyltransferases with a common GT-A type structural
           fold, which has two tightly associated beta/alpha/beta
           domains that tend to form a continuous central sheet of
           at least eight beta-strands. These are enzymes that
           catalyze the transfer of sugar moieties from activated
           donor molecules to specific acceptor molecules, forming
           glycosidic bonds. Glycosyltransferases have been
           classified into more than 90 distinct sequence based
           families.
          Length = 166

 Score = 34.1 bits (79), Expect = 0.12
 Identities = 35/207 (16%), Positives = 65/207 (31%), Gaps = 64/207 (30%)

Query: 144 IQSLL------HEIILVNDFSEYPSNLHGEVESFVKGLNNGRVHLYRTSKREGLIRARMF 197
           + SLL       E+I+V++ S   S         ++ L    V L R  +  G       
Sbjct: 16  LDSLLAQTYPDFEVIVVDNASTDGS------VELLRELF-PEVRLIRNGENLGFGAGNNQ 68

Query: 198 GAKYATGKVLVFLDSHIEVNTHWLEPLLVPIAERTNTVTVPIIDIINADTFQYTSSALVR 257
           G + A G  ++ L+    V    L  LL    +  +   V                    
Sbjct: 69  GIREAKGDYVLLLNPDTVVEPGALLELLDAAEQDPDVGIV-------------------- 108

Query: 258 GGFNWGLHFKWENLPKGTLNSSEDFIKPILSPTMAGGLFAIDRQYFDSLGQYDAGLEIWG 317
                                          P ++G    + R+ F+ +G +D    ++ 
Sbjct: 109 ------------------------------GPKVSGAFLLVRREVFEEVGGFDEDFFLY- 137

Query: 318 GENLELSFRIWMCGGSLAMIPCSRIGH 344
            E+++L  R  + G  +  +P + I H
Sbjct: 138 YEDVDLCLRARLAGYRVLYVPQAVIYH 164


>gnl|CDD|133035 cd04192, GT_2_like_e, Subfamily of Glycosyltransferase Family GT2
           of unknown function.  GT-2 includes diverse families of
           glycosyltransferases with a common GT-A type structural
           fold, which has two tightly associated beta/alpha/beta
           domains that tend to form a continuous central sheet of
           at least eight beta-strands. These are enzymes that
           catalyze the transfer of sugar moieties from activated
           donor molecules to specific acceptor molecules, forming
           glycosidic bonds. Glycosyltransferases have been
           classified into more than 90 distinct sequence based
           families.
          Length = 229

 Score = 34.2 bits (79), Expect = 0.20
 Identities = 24/90 (26%), Positives = 38/90 (42%), Gaps = 11/90 (12%)

Query: 150 EIILVNDFS-----EYPSNLHGEVESFVKGLNNGRVHLYRTSKREGLIRARMFGAKYATG 204
           E+ILV+D S     +       +    +K LNN RV +  + K+  L  A     K A G
Sbjct: 30  EVILVDDHSTDGTVQILEFAAAKPNFQLKILNNSRVSI--SGKKNALTTA----IKAAKG 83

Query: 205 KVLVFLDSHIEVNTHWLEPLLVPIAERTNT 234
             +V  D+   V ++WL   +  I +    
Sbjct: 84  DWIVTTDADCVVPSNWLLTFVAFIQKEQIG 113


>gnl|CDD|222281 pfam13641, Glyco_tranf_2_3, Glycosyltransferase like family 2.
           Members of this family of prokaryotic proteins include
           putative glucosyltransferase, which are involved in
           bacterial capsule biosynthesis.
          Length = 229

 Score = 33.9 bits (78), Expect = 0.24
 Identities = 33/190 (17%), Positives = 55/190 (28%), Gaps = 30/190 (15%)

Query: 167 EVESFVKGLNNGRVHLYRTSKREGL---IRARMFGAKYATGKVLVFLDSHIEVNTHWLEP 223
                     + RVH+ R  +  G     RA     +     ++V LD+   V+   L  
Sbjct: 47  VARELAAAYPDVRVHVVRRPRPPGPTGKARALNEALRAIKSDLVVLLDADSVVDPDTLR- 105

Query: 224 LLVPIAERTNTVTVPIIDIINADTFQYTSSALVRGGFNW-----GLHFKWENLPKGTLNS 278
                        +P          Q      V            L F   +L    L  
Sbjct: 106 -----------RLLPFFLSKGVGAVQ--GPVFVLNLRTAVAPLYALEFALRHLRFMALRR 152

Query: 279 SEDFIKPILSPTMAGGLFAIDRQYFDSLGQYDAGLEIWGGENLELSFRIWMCGGSLAMIP 338
           +           +AG      R   + +G +D G  +   E+ EL  R+   G   A +P
Sbjct: 153 ALGV------APLAGSGSLFRRSVLEEIGGFDPGFLLG--EDKELGLRLRRAGWRTAYVP 204

Query: 339 CSRIGHVFRS 348
            + +  +  S
Sbjct: 205 GAAVYELSPS 214


>gnl|CDD|220577 pfam10111, Glyco_tranf_2_2, Glycosyltransferase like family 2.
           Members of this family of prokaryotic proteins include
           putative glucosyltransferase, which are involved in
           bacterial capsule biosynthesis.
          Length = 278

 Score = 33.9 bits (78), Expect = 0.26
 Identities = 27/134 (20%), Positives = 47/134 (35%), Gaps = 13/134 (9%)

Query: 194 ARMFGAKYATGKVLVFLDSHIEVNTHWLEPLLVPIAERTNTVTVPIIDIINADTFQYTSS 253
           AR  GA+Y++   + FLD    ++   LE ++    E        +       + + +  
Sbjct: 79  ARNRGAEYSSSDFIFFLDVDCLISPDTLEKIIKHFQELQTNPNAFLALPCLYLSKEGSE- 137

Query: 254 ALVRGGFNWGLHFKWENLPKGTLNSSEDFIK-PILSPTMAGGLFAIDRQYFDSLGQYDAG 312
                       F  +          ED I        +A     I+R +F  +G +D  
Sbjct: 138 -----------IFLSDFKYLLREEILEDAITGKSTFFALASSCILINRDFFLKIGGFDEN 186

Query: 313 LEIWGGENLELSFR 326
               GGE+ EL +R
Sbjct: 187 FRGHGGEDFELLYR 200


>gnl|CDD|133031 cd04188, DPG_synthase, DPG_synthase is involved in protein
          N-linked glycosylation.  UDP-glucose:dolichyl-phosphate
          glucosyltransferase (DPG_synthase) is a
          transmembrane-bound enzyme of the endoplasmic reticulum
          involved in protein N-linked glycosylation. This enzyme
          catalyzes the transfer of glucose from UDP-glucose to
          dolichyl phosphate.
          Length = 211

 Score = 31.8 bits (73), Expect = 1.1
 Identities = 18/42 (42%), Positives = 24/42 (57%), Gaps = 5/42 (11%)

Query: 56 VIICFYNEH---PATLYRSVQTLLSRTGQSLLHEIILVNDFS 94
          V+I  YNE    P TL  +V+ L  R   S  +EII+V+D S
Sbjct: 1  VVIPAYNEEKRLPPTLEEAVEYLEERPSFS--YEIIVVDDGS 40


>gnl|CDD|213250 cd03283, ABC_MutS-like, ATP-binding cassette domain of MutS-like
           homolog.  The MutS protein initiates DNA mismatch repair
           by recognizing mispaired and unpaired bases embedded in
           duplex DNA and activating endo- and exonucleases to
           remove the mismatch. Members of the MutS family possess
           C-terminal domain with a conserved ATPase activity that
           belongs to the ATP binding cassette (ABC) superfamily.
           MutS homologs (MSH) have been identified in most
           prokaryotic and all eukaryotic organisms examined.
           Prokaryotes have two homologs (MutS1 and MutS2), whereas
           seven MSH proteins (MSH1 to MSH7) have been identified
           in eukaryotes. The homodimer MutS1 and heterodimers
           MSH2-MSH3 and MSH2-MSH6 are primarily involved in
           mitotic mismatch repair, whereas MSH4-MSH5 is involved
           in resolution of Holliday junctions during meiosis. All
           members of the MutS family contain the highly conserved
           Walker A/B ATPase domain, and many share a common
           mechanism of action. MutS1, MSH2-MSH3, MSH2-MSH6, and
           MSH4-MSH5 dimerize to form sliding clamps, and
           recognition of specific DNA structures or lesions
           results in ADP/ATP exchange.
          Length = 199

 Score = 31.1 bits (71), Expect = 1.9
 Identities = 16/40 (40%), Positives = 19/40 (47%), Gaps = 5/40 (12%)

Query: 7   DLITRDEGYRYYGFNALI-SNKLSLDRKIPD----TRNSL 41
           DL+  D   R Y F   I  NKL  D K+      TRN+L
Sbjct: 152 DLLDLDSAVRNYHFREDIDDNKLIFDYKLKPGVSPTRNAL 191


>gnl|CDD|133056 cd06434, GT2_HAS, Hyaluronan synthases catalyze polymerization of
           hyaluronan.  Hyaluronan synthases (HASs) are
           bi-functional glycosyltransferases that catalyze
           polymerization of hyaluronan. HASs transfer both GlcUA
           and GlcNAc in beta-(1,3) and beta-(1,4) linkages,
           respectively to the hyaluronan chain using UDP-GlcNAc
           and UDP-GlcUA as substrates. HA is made as a free
           glycan, not attached to a protein or lipid. HASs do not
           need a primer for HA synthesis; they initiate HA
           biosynthesis de novo with only UDP-GlcNAc, UDP-GlcUA,
           and Mg2+. Hyaluronan (HA) is a linear
           heteropolysaccharide composed of (1-3)-linked
           beta-D-GlcUA-beta-D-GlcNAc disaccharide repeats. It can
           be found in vertebrates and a few microbes and is
           typically on the cell surface or in the extracellular
           space, but is also found inside mammalian cells.
           Hyaluronan has several physicochemical and biological
           functions such as space filling, lubrication, and
           providing a hydrated matrix through which cells can
           migrate.
          Length = 235

 Score = 30.7 bits (70), Expect = 2.5
 Identities = 21/82 (25%), Positives = 31/82 (37%), Gaps = 18/82 (21%)

Query: 141 KNRIQSLL----HEIILVNDFS--EYPSNLHGEVESFVKGLNNGRVHLY--RTSKREGLI 192
           +  ++S+L     EII+V D     Y S L   V         G   +      KR  L 
Sbjct: 17  RECLRSILRQKPLEIIVVTDGDDEPYLSILSQTV------KYGGIFVITVPHPGKRRALA 70

Query: 193 RARMFGAKYATGKVLVFLDSHI 214
                G ++ T  ++V LDS  
Sbjct: 71  E----GIRHVTTDIVVLLDSDT 88


>gnl|CDD|133062 cd06442, DPM1_like, DPM1_like represents putative enzymes similar
           to eukaryotic DPM1.  Proteins similar to eukaryotic
           DPM1, including enzymes from bacteria and archaea; DPM1
           is the catalytic subunit of eukaryotic
           dolichol-phosphate mannose (DPM) synthase. DPM synthase
           is required for synthesis of the
           glycosylphosphatidylinositol (GPI) anchor, N-glycan
           precursor, protein O-mannose, and C-mannose. In higher
           eukaryotes, the enzyme has three subunits, DPM1, DPM2 and
           DPM3. DPM is synthesized from dolichol phosphate and
           GDP-Man on the cytosolic surface of the ER membrane by
           DPM synthase and then is flipped onto the luminal side
           and used as a donor substrate. In lower eukaryotes, such
           as Saccharomyces cerevisiae and Trypanosoma brucei, DPM
           synthase consists of a single component (Dpm1p and
           TbDpm1, respectively) that possesses one predicted
           transmembrane region near the C terminus for anchoring
           to the ER membrane. In contrast, the Dpm1 homologues of
           higher eukaryotes, namely fission yeast, fungi, and
           animals, have no transmembrane region, suggesting the
           existence of adapter molecules for membrane anchoring.
           This family also includes bacteria and archaea DPM1_like
           enzymes. However, the enzyme structure and mechanism of
           function are not well understood. This protein family
           belongs to Glycosyltransferase 2 superfamily.
          Length = 224

 Score = 30.6 bits (70), Expect = 3.1
 Identities = 24/67 (35%), Positives = 30/67 (44%), Gaps = 7/67 (10%)

Query: 150 EIILVNDFSEYPSNLHGEVESFVKGLNNGRVHLYRTSKREGLIRARMFGAKYATGKVLVF 209
           EII+V+D S  P      V    K     RV L     + GL  A + G K A G V+V 
Sbjct: 29  EIIVVDDNS--PDGTAEIVRELAK--EYPRVRLIVRPGKRGLGSAYIEGFKAARGDVIVV 84

Query: 210 LD---SH 213
           +D   SH
Sbjct: 85  MDADLSH 91


>gnl|CDD|234666 PRK00147, queA, S-adenosylmethionine:tRNA
           ribosyltransferase-isomerase; Provisional.
          Length = 342

 Score = 30.4 bits (70), Expect = 3.5
 Identities = 15/42 (35%), Positives = 19/42 (45%), Gaps = 12/42 (28%)

Query: 111 LNDGRVHLYRTSKREGLIRARMFGAKYATGKNRIQSLLHEII 152
            ND RV           I AR+FG K  TG  +I+ LL   +
Sbjct: 58  FNDTRV-----------IPARLFGRKKETGG-KIEVLLLRRL 87



 Score = 29.3 bits (67), Expect = 9.8
 Identities = 14/37 (37%), Positives = 17/37 (45%), Gaps = 12/37 (32%)

Query: 175 LNNGRVHLYRTSKREGLIRARMFGAKYATG-KVLVFL 210
            N+ RV           I AR+FG K  TG K+ V L
Sbjct: 58  FNDTRV-----------IPARLFGRKKETGGKIEVLL 83


>gnl|CDD|211965 TIGR04242, nodulat_NodC, chitooligosaccharide synthase NodC.
           Members of this family are NodC, an
           N-acetylglucosaminyltransferase involved in the
           production of nodulation factors through which rhizobia
           establish symbioses with leguminous plants.
          Length = 395

 Score = 30.1 bits (68), Expect = 5.2
 Identities = 17/74 (22%), Positives = 32/74 (43%), Gaps = 1/74 (1%)

Query: 35  PDTRNSLCANQTFPSTLPSTSVIICFYNEHPATLYRSVQTLLSRTGQSLLHEIILVNDFS 94
           P T  +  ++      LPS  VI+  +NE P TL   + ++ ++     L  + +V+D S
Sbjct: 31  PTTVPATSSDALPSDPLPSVDVIVPCFNEDPRTLSECLASIAAQDYAGKLR-VYVVDDGS 89

Query: 95  EYPSNLHGEVETFV 108
                L    + + 
Sbjct: 90  TNRDALVPVHDAYA 103


>gnl|CDD|206323 pfam14154, DUF4306, Domain of unknown function (DUF4306).  This
           family includes the B. subtilis YjdJ protein, which is
           functionally uncharacterized. This is not a homologue of
           E. coli YjdJ, which belongs to pfam00583. This family of
           proteins is functionally uncharacterized. This family of
           proteins is found in bacteria. Proteins in this family
           are typically between 95 and 152 amino acids in length.
          Length = 89

 Score = 28.1 bits (63), Expect = 5.2
 Identities = 8/32 (25%), Positives = 12/32 (37%)

Query: 250 YTSSALVRGGFNWGLHFKWENLPKGTLNSSED 281
           Y  S L+   + W    K+     GT+ S   
Sbjct: 20  YQGSELLDDPWEWKYTAKFTPQLNGTVTSYSQ 51


>gnl|CDD|217203 pfam02724, CDC45, CDC45-like protein.  CDC45 is an essential gene
           required for initiation of DNA replication in S.
           cerevisiae, forming a complex with MCM5/CDC46.
           Homologues of CDC45 have been identified in human, mouse
           and smut fungus among others.
          Length = 583

 Score = 30.0 bits (68), Expect = 6.5
 Identities = 17/73 (23%), Positives = 28/73 (38%), Gaps = 11/73 (15%)

Query: 330 CGGSLAM-----IPCSRIGHVFRSRRPYNNGHNEDPLTRNSLRVAHVWMDEYIEHFLKQR 384
           CGG + +     +    I +V  S RP+N  +        S +V  ++ D  IE  L+  
Sbjct: 60  CGGMVDLEEFLQLDEDVIVYVIDSHRPWNLDN-----VFGSDQV-VIFDDGDIEEELQDE 113

Query: 385 PEARNIDYGDVTD 397
           P   +       D
Sbjct: 114 PRYDDAYRDLEED 126


>gnl|CDD|133038 cd04195, GT2_AmsE_like, GT2_AmsE_like is involved in
           exopolysaccharide amylovora biosynthesis.  AmsE is a
           glycosyltransferase involved in exopolysaccharide
           amylovora biosynthesis in Erwinia amylovora. Amylovoran
           is one of the three exopolysaccharides produced by E.
           amylovora. Amylovoran-deficient mutants are
           non-pathogenic. It is a subfamily of Glycosyltransferase
           Family GT2, which includes diverse families of
           glycosyltransferases with a common GT-A type structural
           fold, which has two tightly associated beta/alpha/beta
           domains that tend to form a continuous central sheet of
           at least eight beta-strands. These are enzymes that
           catalyze the transfer of sugar moieties from activated
           donor molecules to specific acceptor molecules, forming
           glycosidic bonds.
          Length = 201

 Score = 29.2 bits (66), Expect = 6.6
 Identities = 24/87 (27%), Positives = 42/87 (48%), Gaps = 9/87 (10%)

Query: 55  SVIICFY-NEHPATLYRSVQTLLSRTGQSLL-HEIILVNDFSEYPSNLHGEVETFVKGLN 112
           SV++  Y  E P  L  +++++L    Q+L   E++LV D      +L+  +E F + L 
Sbjct: 1   SVLMSVYIKEKPEFLREALESILK---QTLPPDEVVLVKD-GPVTQSLNEVLEEFKRKLP 56

Query: 113 DGRVHLYRTSKREGLIRARMFGAKYAT 139
              + +    K  GL +A   G K+ T
Sbjct: 57  ---LKVVPLEKNRGLGKALNEGLKHCT 80


>gnl|CDD|234173 TIGR03346, chaperone_ClpB, ATP-dependent chaperone ClpB.  Members
           of this protein family are the bacterial ATP-dependent
           chaperone ClpB. This protein belongs to the AAA family,
           ATPases associated with various cellular activities
           (pfam00004). This molecular chaperone does not act as a
           protease, but rather serves to disaggregate misfolded
           and aggregated proteins [Protein fate, Protein folding
           and stabilization].
          Length = 852

 Score = 29.9 bits (68), Expect = 7.0
 Identities = 10/24 (41%), Positives = 17/24 (70%)

Query: 795 WDLVPVGSLVEGEKTQVAHMEESI 818
           W  +PV  ++EGE+ ++ HMEE +
Sbjct: 540 WTGIPVSKMLEGEREKLLHMEEVL 563


>gnl|CDD|181467 PRK08559, nusG, transcription antitermination protein NusG;
           Validated.
          Length = 153

 Score = 28.7 bits (65), Expect = 7.7
 Identities = 10/21 (47%), Positives = 13/21 (61%)

Query: 375 EYIEHFLKQRPEARNIDYGDV 395
           E +EHFLK +P    I  GD+
Sbjct: 80  EEVEHFLKPKPIVEGIKEGDI 100



 Score = 28.7 bits (65), Expect = 7.7
 Identities = 10/21 (47%), Positives = 13/21 (61%)

Query: 466 EYIEHFLKQRPEARNIDYGDV 486
           E +EHFLK +P    I  GD+
Sbjct: 80  EEVEHFLKPKPIVEGIKEGDI 100


>gnl|CDD|198432 cd10034, UDG_like_2, Uncharacterized subfamily of Uracil-DNA
           glycosylases.  This is a subfamily of Uracil-DNA
           glycosylase superfamily. Uracil-DNA glycosylases (UDG)
           catalyze the removal of uracil from DNA to initiate DNA
           base excision repair pathway. Uracil in DNA can arise as
           a result of mis-incorporation of dUMP residues by DNA
           polymerase or deamination of cytosine. Uracil mispaired
           with guanine in DNA is one of the major pro-mutagenic
           events, causing G:C->A:T mutations. UDG is an essential
           enzyme for maintaining the integrity of genetic
           information. This ubiquitously found enzyme hydrolyzes
           the N-glycosidic bond of deoxyuridine in DNA.
          Length = 141

 Score = 28.4 bits (64), Expect = 9.7
 Identities = 8/37 (21%), Positives = 15/37 (40%), Gaps = 1/37 (2%)

Query: 610 LTKCHEMGGSQEYWCWLRCKS-FKWYLDNVYPEMILP 645
           L KC       +      C+      +D + P++I+P
Sbjct: 56  LLKCRSPDREADDEEIKNCEPYLLAQIDLIKPKIIVP 92


  Database: CDD.v3.10
    Posted date:  Mar 20, 2013  7:55 AM
  Number of letters in database: 10,937,602
  Number of sequences in database:  44,354
  
Lambda     K      H
   0.319    0.135    0.420 

Gapped
Lambda     K      H
   0.267   0.0634    0.140 


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 44354
Number of Hits to DB: 44,253,013
Number of extensions: 4327300
Number of successful extensions: 3200
Number of sequences better than 10.0: 1
Number of HSP's gapped: 3164
Number of HSP's successfully gapped: 68
Length of query: 867
Length of database: 10,937,602
Length adjustment: 105
Effective length of query: 762
Effective length of database: 6,280,432
Effective search space: 4785689184
Effective search space used: 4785689184
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.4 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.7 bits)
S2: 63 (28.2 bits)
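
The effective lengths and search space in the statistics footer above can be reproduced from the raw query length, database size, and length adjustment. The sketch below is a minimal check, assuming the usual BLAST length correction (the adjustment is subtracted once from the query and once per database sequence); the function name is illustrative and is not part of any BLAST tool.

```python
# Values copied from the statistics footer of this report.
QUERY_LENGTH = 867           # "Length of query"
DB_LETTERS = 10_937_602      # "Length of database"
DB_SEQUENCES = 44_354        # "Number of Sequences"
LENGTH_ADJUSTMENT = 105      # "Length adjustment"


def effective_search_space(query_len, db_letters, db_seqs, adjustment):
    """Return (effective query length, effective database length, search space).

    Hypothetical helper: applies the length adjustment once to the query
    and once to every database sequence, then multiplies the two lengths.
    """
    eff_query = query_len - adjustment
    eff_db = db_letters - db_seqs * adjustment
    return eff_query, eff_db, eff_query * eff_db


eff_query, eff_db, space = effective_search_space(
    QUERY_LENGTH, DB_LETTERS, DB_SEQUENCES, LENGTH_ADJUSTMENT
)
print(eff_query, eff_db, space)  # 762 6280432 4785689184
```

The three printed values agree with the "Effective length of query", "Effective length of database", and "Effective search space" lines reported above.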