RPS-BLAST 2.2.26 [Sep-21-2011]

Database: CDD.v3.10 
           44,354 sequences; 10,937,602 total letters

Searching..................................................done

Query= psy12365
         (1324 letters)



>gnl|CDD|225201 COG2319, COG2319, FOG: WD40 repeat [General function prediction
           only].
          Length = 466

 Score = 51.2 bits (121), Expect = 3e-06
 Identities = 52/224 (23%), Positives = 96/224 (42%), Gaps = 25/224 (11%)

Query: 2   DKLIFTLEEPHGPGDVYVCWQRRTGE--LLATTGRDSSVSIYNKHGKLIDKITLPGL--- 56
           +KLI +LE  H      +      G   LLA++  D +V +++         TL G    
Sbjct: 98  EKLIKSLEGLHDSSVSKLALSSPDGNSILLASSSLDGTVKLWDLSTPGKLIRTLEGHSES 157

Query: 57  CIVMDWDSEGDLLGIISSNSSAVNVWNTYTKKRTIVDSGLRDPLTCLVWCKQCSML---- 112
              + +  +G LL   SS    + +W+  T K     +G  DP++ L +     +L    
Sbjct: 158 VTSLAFSPDGKLLASGSSLDGTIKLWDLRTGKPLSTLAGHTDPVSSLAFSPDGGLLIASG 217

Query: 113 --QLSVSIYNKHGKLIDKITLPGL--CIVMDWDSEGDLLGIISSNSSAVNVWTL-----L 163
               ++ +++     + + TL G    +V  +  +G LL    S+   + +W L     L
Sbjct: 218 SSDGTIRLWDLSTGKLLRSTLSGHSDSVVSSFSPDGSLL-ASGSSDGTIRLWDLRSSSSL 276

Query: 164 TYTLG------ERISWSDDGQLLAVTTSGGSVKIYLSKLPKLVV 201
             TL         +++S DG+LLA  +S G+V+++  +  KL+ 
Sbjct: 277 LRTLSGHSSSVLSVAFSPDGKLLASGSSDGTVRLWDLETGKLLS 320



 Score = 47.8 bits (112), Expect = 4e-05
 Identities = 43/239 (17%), Positives = 95/239 (39%), Gaps = 28/239 (11%)

Query: 3   KLIFTLEEPHGPGDVYVCWQRRTGELLATTGRDSSVSIYN--KHGKLIDKITLPGLCIV- 59
           KL+ +    H    V        G LLA+   D ++ +++      L+  ++     ++ 
Sbjct: 232 KLLRSTLSGHSDSVVSSFS--PDGSLLASGSSDGTIRLWDLRSSSSLLRTLSGHSSSVLS 289

Query: 60  MDWDSEGDLLGIISSNSSAVNVWNTYTKKRTIVDS--GLRDPLTCLVWCKQCSMLQLSVS 117
           + +  +G LL    S+   V +W+  T K     +  G   P++ L +    S+L    S
Sbjct: 290 VAFSPDGKLL-ASGSSDGTVRLWDLETGKLLSSLTLKGHEGPVSSLSFSPDGSLLVSGGS 348

Query: 118 IYNKH-------GKLIDKITLPGLCIVMDWDSEGDLLGIISSNSSAVNVWTLLTYTLG-- 168
                       GK +  +      + + +  +G ++   S++ + V +W L T +L   
Sbjct: 349 DDGTIRLWDLRTGKPLKTLEGHSNVLSVSFSPDGRVVSSGSTDGT-VRLWDLSTGSLLRN 407

Query: 169 --------ERISWSDDGQLLAVTTSGGSVKIY--LSKLPKLVVANNGKIAILSSLNQVS 217
                     + +S DG+ LA  +S  +++++   + L  +  + +GK+    S +   
Sbjct: 408 LDGHTSRVTSLDFSPDGKSLASGSSDNTIRLWDLKTSLKSVSFSPDGKVLASKSSDLSV 466



 Score = 47.4 bits (111), Expect = 4e-05
 Identities = 38/210 (18%), Positives = 90/210 (42%), Gaps = 22/210 (10%)

Query: 19  VCWQRRTGELLATTGRDSSVSIYNKHGKLIDKITLPGL--CIVMDWDSEGDLLGIISSNS 76
           + +    G L+A+   D ++ +++     + + TL G    +V  +  +G LL    S+ 
Sbjct: 204 LAFSPDGGLLIASGSSDGTIRLWDLSTGKLLRSTLSGHSDSVVSSFSPDGSLL-ASGSSD 262

Query: 77  SAVNVWNT-YTKKRTIVDSGLRDPLTCLVWCKQCSML-----QLSVSIYN-KHGKLIDKI 129
             + +W+   +       SG    +  + +     +L       +V +++ + GKL+  +
Sbjct: 263 GTIRLWDLRSSSSLLRTLSGHSSSVLSVAFSPDGKLLASGSSDGTVRLWDLETGKLLSSL 322

Query: 130 TLPGLCIV---MDWDSEGDLLGIISSNSSAVNVWTLLT---------YTLGERISWSDDG 177
           TL G       + +  +G LL    S+   + +W L T         ++    +S+S DG
Sbjct: 323 TLKGHEGPVSSLSFSPDGSLLVSGGSDDGTIRLWDLRTGKPLKTLEGHSNVLSVSFSPDG 382

Query: 178 QLLAVTTSGGSVKIYLSKLPKLVVANNGKI 207
           ++++  ++ G+V+++      L+   +G  
Sbjct: 383 RVVSSGSTDGTVRLWDLSTGSLLRNLDGHT 412


>gnl|CDD|238121 cd00200, WD40, WD40 domain, found in a number of eukaryotic
           proteins that cover a wide variety of functions
           including adaptor/regulatory modules in signal
           transduction, pre-mRNA processing and cytoskeleton
           assembly; typically contains a GH dipeptide 11-24
           residues from its N-terminus and the WD dipeptide at its
           C-terminus and is 40 residues long, hence the name WD40;
           between GH and WD lies a conserved core; serves as a
           stable propeller-like platform to which proteins can
           bind either stably or reversibly; forms a propeller-like
           structure with several blades where each blade is
           composed of a four-stranded anti-parallel b-sheet;
           instances with few detectable copies are hypothesized to
           form larger structures by dimerization; each WD40
           sequence repeat forms the first three strands of one
           blade and the last strand in the next blade; the last
           C-terminal WD40 repeat completes the blade structure of
           the first WD40 repeat to create the closed ring
           propeller-structure; residues on the top and bottom
           surface of the propeller are proposed to coordinate
           interactions with other proteins and/or small ligands; 7
           copies of the repeat are present in this alignment.
          Length = 289

 Score = 43.5 bits (103), Expect = 4e-04
 Identities = 41/213 (19%), Positives = 88/213 (41%), Gaps = 33/213 (15%)

Query: 4   LIFTLEEPHGPGDVY-VCWQRRTGELLATTGRDSSVSIYN-KHGKLIDKITLPGLCIV-M 60
           L  TL+  H  G V  V +    G+LLAT   D ++ +++ + G+L+  +      +  +
Sbjct: 1   LRRTLKG-H-TGGVTCVAFSP-DGKLLATGSGDGTIKVWDLETGELLRTLKGHTGPVRDV 57

Query: 61  DWDSEGDLLGIISSNSSAVNVWNTYTKKRTIVDSGLRDPLTCLVWCKQCSML-----QLS 115
              ++G  L    S+   + +W+  T +     +G    ++ + +     +L       +
Sbjct: 58  AASADGTYL-ASGSSDKTIRLWDLETGECVRTLTGHTSYVSSVAFSPDGRILSSSSRDKT 116

Query: 116 VSIYN-KHGKLIDKITLPG-----LCIVMDWDSEGDLLGIISSNSSAVNVWTL----LTY 165
           + +++ + GK +   TL G       +    D         SS    + +W L       
Sbjct: 117 IKVWDVETGKCL--TTLRGHTDWVNSVAFSPD---GTFVASSSQDGTIKLWDLRTGKCVA 171

Query: 166 TL------GERISWSDDGQLLAVTTSGGSVKIY 192
           TL         +++S DG+ L  ++S G++K++
Sbjct: 172 TLTGHTGEVNSVAFSPDGEKLLSSSSDGTIKLW 204



 Score = 37.3 bits (87), Expect = 0.036
 Identities = 42/194 (21%), Positives = 78/194 (40%), Gaps = 33/194 (17%)

Query: 24  RTGELLATTGRDSSVSIYN-KHGKLIDKITLPG-----LCIVMDWDSEGDLLGIISSNSS 77
             G +L+++ RD ++ +++ + GK +   TL G       +    D         SS   
Sbjct: 103 PDGRILSSSSRDKTIKVWDVETGKCL--TTLRGHTDWVNSVAFSPD---GTFVASSSQDG 157

Query: 78  AVNVWNTYTKKRTIVDSGLRDPLTCLVWCKQCSMLQLSVS-----IYN-KHGKLIDKITL 131
            + +W+  T K     +G    +  + +      L  S S     +++   GK +   TL
Sbjct: 158 TIKLWDLRTGKCVATLTGHTGEVNSVAFSPDGEKLLSSSSDGTIKLWDLSTGKCLG--TL 215

Query: 132 PG---LCIVMDWDSEGDLLGIISSNSSAVNVWTLLT----YTLGE------RISWSDDGQ 178
            G       + +  +G LL    S    + VW L T     TL         ++WS DG+
Sbjct: 216 RGHENGVNSVAFSPDGYLL-ASGSEDGTIRVWDLRTGECVQTLSGHTNSVTSLAWSPDGK 274

Query: 179 LLAVTTSGGSVKIY 192
            LA  ++ G+++I+
Sbjct: 275 RLASGSADGTIRIW 288


>gnl|CDD|205602 pfam13424, TPR_12, Tetratricopeptide repeat. 
          Length = 78

 Score = 34.7 bits (80), Expect = 0.028
 Identities = 16/69 (23%), Positives = 33/69 (47%), Gaps = 3/69 (4%)

Query: 834 LNDAVTLYESAGNYEKAATCYIQ-LKNWTKIGQLLPHIKSATTFIQYAKAKEAMGSYRES 892
           LN+   +    G+Y++A     + L+   ++G+   H ++A      A+   A+G Y E+
Sbjct: 8   LNNLALVLRRLGDYDEALELLEKALELARELGED--HPETARALNNLARLYLALGDYDEA 65

Query: 893 VGAYERAED 901
           +   E+A  
Sbjct: 66  LEYLEKALA 74



 Score = 31.6 bits (72), Expect = 0.39
 Identities = 18/84 (21%), Positives = 32/84 (38%), Gaps = 21/84 (25%)

Query: 697 GHVAALLGNHDTAQQRYLTSDIPTMALTLRRDLRQWREALALATSLGSN--QTPIISCDY 754
             V   LG++D A +                      +AL LA  LG +  +T     + 
Sbjct: 12  ALVLRRLGDYDEALELL-------------------EKALELARELGEDHPETARALNNL 52

Query: 755 AQQLEMTGQHAQALSFYQKSMELA 778
           A+     G + +AL + +K++ L 
Sbjct: 53  ARLYLALGDYDEALEYLEKALALR 76


>gnl|CDD|216037 pfam00637, Clathrin, Region in Clathrin and VPS.  Each region is
           about 140 amino acids long. The regions are composed of
           multiple alpha helical repeats. They occur in the arm
           region of the Clathrin heavy chain.
          Length = 143

 Score = 35.3 bits (82), Expect = 0.071
 Identities = 20/135 (14%), Positives = 47/135 (34%), Gaps = 15/135 (11%)

Query: 822 NECADILQQFNKLNDAVTLYESAGNYEKAAT---------CYIQLKNWTKIGQLLPHIKS 872
           +    + ++   L + +   ESA                  Y + ++  K   L   +K 
Sbjct: 11  SRVVKLFEKRGLLEELIPYLESALKENSRENPALQTALLELYAKYEDPEK---LEEFLKK 67

Query: 873 ATTF-IQYAKAK-EAMGSYRESVGAYERAEDYDNVVRVDLDHLNDIRHAVDIVKAKKCTE 930
              + ++      E    Y E+V  Y++  +Y   + + L  L   + A++        E
Sbjct: 68  NNNYDLEKVAKLCEKADLYEEAVILYKKNGNYKEAISL-LKKLKLYKDAIEYAVKSNDPE 126

Query: 931 GAKRIADYCNKHGDF 945
             +++ +    +G F
Sbjct: 127 LWEKLLNALLDNGRF 141


>gnl|CDD|238112 cd00189, TPR, Tetratricopeptide repeat domain; typically contains
           34 amino acids
           [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-
           X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found
           in a variety of organisms including bacteria,
           cyanobacteria, yeast, fungi, plants, and humans in
           various subcellular locations; involved in a variety of
           functions including protein-protein interactions, but
           common features in the interaction partners have not
           been defined; involved in chaperone, cell-cycle,
           transcription, and protein transport complexes; the
           number of TPR motifs varies among proteins (1,3-11,13
           15,16,19); 5-6 tandem repeats generate a right-handed
           helical structure with an amphipathic channel that is
           thought to accommodate an alpha-helix of a target
           protein; it has been proposed that TPR proteins
           preferably interact with WD-40 repeat proteins, but in
           many instances several TPR-proteins seem to aggregate to
           multi-protein complexes; examples of TPR-proteins
           include, Cdc16p, Cdc23p and Cdc27p components of the
           cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal
           targeting signals, the Tom70p co-receptor for
           mitochondrial targeting signals, Ser/Thr phosphatase 5C
           and the p110 subunit of O-GlcNAc transferase; three
           copies of the repeat are present here.
          Length = 100

 Score = 30.8 bits (70), Expect = 1.1
 Identities = 19/65 (29%), Positives = 30/65 (46%), Gaps = 8/65 (12%)

Query: 840 LYESAGNYEKAATCYIQLKNWTKIGQLLPHIKSATTFIQYAKAKEAMGSYRESVGAYERA 899
           LY   G+Y++A   Y       K  +L P   +A  +   A A   +G Y E++  YE+A
Sbjct: 9   LYYKLGDYDEALEYY------EKALELDP--DNADAYYNLAAAYYKLGKYEEALEDYEKA 60

Query: 900 EDYDN 904
            + D 
Sbjct: 61  LELDP 65



 Score = 28.9 bits (65), Expect = 5.5
 Identities = 18/66 (27%), Positives = 27/66 (40%), Gaps = 8/66 (12%)

Query: 834 LNDAVTLYESAGNYEKAATCYIQLKNWTKIGQLLPHIKSATTFIQYAKAKEAMGSYRESV 893
             +    Y   G YE+A   Y       K  +L P   +A  +     A   +G Y E++
Sbjct: 37  YYNLAAAYYKLGKYEEALEDY------EKALELDP--DNAKAYYNLGLAYYKLGKYEEAL 88

Query: 894 GAYERA 899
            AYE+A
Sbjct: 89  EAYEKA 94


>gnl|CDD|223533 COG0457, NrfG, FOG: TPR repeat [General function prediction only].
          Length = 291

 Score = 32.5 bits (72), Expect = 1.5
 Identities = 35/193 (18%), Positives = 60/193 (31%), Gaps = 10/193 (5%)

Query: 824  CADILQQFNKLNDAVTLYESAGNYEKAATCYIQLKNWTKIGQLLPHIKSATTFIQYAKAK 883
              + L+     + A  L   A    K       L+   K  +L      A   +      
Sbjct: 46   LEEALELLPNSDLAGLLLLLALALLKLGRLEEALELLEKALELELLPNLAEALLNLGLLL 105

Query: 884  EAMGSYRESVGAYERAEDYDNVVRVDLDHLNDI--RHAVDIVKAKKCTEGAKRIADYCNK 941
            EA+G Y E++   E+A   D    +    L         D  +A +  E A  +    N+
Sbjct: 106  EALGKYEEALELLEKALALDPDPDLAEALLALGALYELGDYEEALELYEKALELDPELNE 165

Query: 942  HGDFGAAIHFLILSKCYQDAFNLSQQHKKLHEFGKFLLEEDEPNPVELKRLAIHFEEDKD 1001
              +   A+              L +  + L    K L    + +   L  L + + +   
Sbjct: 166  LAEALLAL--------GALLEALGRYEEALELLEKALKLNPDDDAEALLNLGLLYLKLGK 217

Query: 1002 MFRAAQYYYHAKE 1014
               A +YY  A E
Sbjct: 218  YEEALEYYEKALE 230


>gnl|CDD|222858 PHA02533, 17, large terminase protein; Provisional.
          Length = 534

 Score = 32.7 bits (75), Expect = 1.5
 Identities = 26/84 (30%), Positives = 34/84 (40%), Gaps = 25/84 (29%)

Query: 445 PESGEKYTIMTQAMSE-------EFLFFATSEYELKIFSLSEWKFVSGYKHSDKIKS-IY 496
           P  G KY I T  +SE              +EY         +K V+ Y H++ I   I 
Sbjct: 310 PVEGHKY-IATLDVSEGRGQDYSALHIIDITEYP--------YKQVAVY-HNNTISPLIL 359

Query: 497 PDI-------YGICLVLIEMNNTG 513
           PDI       Y    V IE+N+TG
Sbjct: 360 PDIIVDYLMEYNEAPVYIELNSTG 383


>gnl|CDD|146707 pfam04212, MIT, MIT (microtubule interacting and transport) domain.
            The MIT domain forms an asymmetric three-helix bundle
           and binds ESCRT-III (endosomal sorting complexes
           required for transport) substrates.
          Length = 69

 Score = 29.5 bits (67), Expect = 1.8
 Identities = 21/76 (27%), Positives = 33/76 (43%), Gaps = 14/76 (18%)

Query: 827 ILQQFNKLNDAVTLYESAGNYEKAATCYIQLKNWTKIGQLLPHIKSATTFIQYAKAKEAM 886
           + +    +  AV   + AGNYE+A   Y +      I  LL  +K         K +EA 
Sbjct: 2   LEKALELVKKAVEA-DEAGNYEEALELYKE-----AIEYLLQALKYEPD----PKRREA- 50

Query: 887 GSYRESVGAY-ERAED 901
              R+ +  Y +RAE+
Sbjct: 51  --LRQKIAEYLDRAEE 64


>gnl|CDD|225504 COG2956, COG2956, Predicted N-acetylglucosaminyl transferase
           [Carbohydrate transport and metabolism].
          Length = 389

 Score = 32.0 bits (73), Expect = 2.3
 Identities = 35/159 (22%), Positives = 58/159 (36%), Gaps = 36/159 (22%)

Query: 666 AIRAYQSLDEAGMVWCLESLVEEEEDTSIL-CGHVAALLGNHDTAQQRY-LTSDIPTMAL 723
           AIR +Q+L E+        L  E+   ++   G      G  D A+  +    D    A 
Sbjct: 88  AIRIHQTLLES------PDLTFEQRLLALQQLGRDYMAAGLLDRAEDIFNQLVDEGEFAE 141

Query: 724 TLRRDL-------RQWREALALATSL----GSNQTPIIS---CDYAQQLEMTGQHAQALS 769
              + L       R+W +A+ +A  L    G      I+   C+ AQQ   +    +A  
Sbjct: 142 GALQQLLNIYQATREWEKAIDVAERLVKLGGQTYRVEIAQFYCELAQQALASSDVDRARE 201

Query: 770 FYQKSMELATPDIQDPECQRKCKEGIARTSIRVGDFRLG 808
             +K+++       D +C         R SI +G   L 
Sbjct: 202 LLKKALQ------ADKKC--------VRASIILGRVELA 226


>gnl|CDD|233895 TIGR02494, PFLE_PFLC, glycyl-radical enzyme activating protein
            family.  This subset of the radical-SAM family
            (pfam04055) includes a number of probable activating
            proteins acting on different enzymes all requiring an
            amino-acid-centered radical. The closest relatives to
            this family are the pyruvate-formate lyase activating
            enzyme (PflA, 1.97.1.4, TIGR02493) and the anaerobic
            ribonucleotide reductase activating enzyme (TIGR02491).
            Included within this subfamily are activators of
            hydroxyphenyl acetate decarboxylase (HdpA, ),
            benzylsuccinate synthase (BssD, ), glycerol dehydratase
            (DhaB2,) as well as enzymes annotated in E. coli as
            activators of different isozymes of pyruvate-formate
            lyase (PFLC and PFLE) however, these appear to lack
            characterization and may activate enzymes with
            distinctive functions. Most of the sequence-level
            variability between these forms is concentrated within an
            N-terminal domain which follows a conserved group of
            three cysteines and contains a variable pattern of 0 to 8
            additional cysteines.
          Length = 295

 Score = 31.5 bits (72), Expect = 2.4
 Identities = 13/62 (20%), Positives = 20/62 (32%), Gaps = 17/62 (27%)

Query: 1236 LPCPYCD-----TMVPDMM------LHCASCARIIPFCIA------SGKHITRNELTKCL 1278
            L C +C         P+++      L C  C  + P   A       G++       KC 
Sbjct: 26   LRCKWCSNPESQRKSPELLFKENRCLGCGKCVEVCPAGTARLSELADGRNRIIIRREKCT 85

Query: 1279 EC 1280
             C
Sbjct: 86   HC 87


>gnl|CDD|184804 PRK14720, PRK14720, transcript cleavage factor/unknown domain fusion
            protein; Provisional.
          Length = 906

 Score = 32.0 bits (73), Expect = 2.6
 Identities = 22/75 (29%), Positives = 36/75 (48%), Gaps = 6/75 (8%)

Query: 939  CNKHGDFGAAIHFL-ILSKCYQDAFNLSQQHKKLHEFGKFLLEEDEPNPVELKRLAIHFE 997
            C+K   +G     L  L++ Y        ++KKL    + L++ D  NP  +K+LA  +E
Sbjct: 106  CDKILLYGENKLALRTLAEAYAKL----NENKKLKGVWERLVKADRDNPEIVKKLATSYE 161

Query: 998  EDKDMFRAAQYYYHA 1012
            E+ D  +A  Y   A
Sbjct: 162  EE-DKEKAITYLKKA 175


>gnl|CDD|237756 PRK14559, PRK14559, putative protein serine/threonine phosphatase;
            Provisional.
          Length = 645

 Score = 31.6 bits (72), Expect = 3.1
 Identities = 10/19 (52%), Positives = 11/19 (57%)

Query: 1237 PCPYCDTMVPDMMLHCASC 1255
            PCP C T VP    HC +C
Sbjct: 29   PCPQCGTEVPVDEAHCPNC 47


>gnl|CDD|237953 PRK15378, PRK15378, inositol phosphate phosphatase SopB;
           Provisional.
          Length = 564

 Score = 31.8 bits (72), Expect = 3.1
 Identities = 20/86 (23%), Positives = 29/86 (33%)

Query: 812 AAESNSSVLKNECADILQQFNKLNDAVTLYESAGNYEKAATCYIQLKNWTKIGQLLPHIK 871
           AA++    L    A   QQ   L        +A  +  A    +  + W  I   L H  
Sbjct: 126 AAKALKKNLIELIAARTQQQLGLPAKEAHRFAALAFSDAQVKQLNNQPWQTIKNTLSHNG 185

Query: 872 SATTFIQYAKAKEAMGSYRESVGAYE 897
              T  Q   A+  +G+      AYE
Sbjct: 186 HHYTNTQLPAAEMKIGAKDIFPKAYE 211


>gnl|CDD|221693 pfam12657, TFIIIC_delta, Transcription factor IIIC subunit delta
           N-term.  In humans there are six subunits of
           transcription factor IIIC, and this one is the 90 kDa
           subunit; whereas in fungi the complex resolves into nine
           different subunits and this is No. 9 in yeasts. The
           whole subunit is involved in RNA polymerase III-mediated
           transcription. It is possible that this N-terminal
           domain interacts with TFIIIC subunit 8.
          Length = 167

 Score = 30.5 bits (69), Expect = 3.2
 Identities = 33/115 (28%), Positives = 45/115 (39%), Gaps = 21/115 (18%)

Query: 171 ISWSDDGQLLAVTTSGGSVKIYLSKLPKLVVANNGKIAILSSLNQVSVYLRSIERKGTPW 230
           +SWS+DGQL   T  G +V I   K               S++      LR     G   
Sbjct: 10  LSWSEDGQLAVAT--GETVHILNPKSLAKSFIPTPSTLPASAIQWDITKLRGNLFTGQEL 67

Query: 231 --------TNFIIETEIEPS------WREYHGLVVANNGK--IAILSSLNQVSVY 269
                     F I  EI PS      W    GL  A NG+  +A+L+S  ++S+Y
Sbjct: 68  PSILPQSRDLFSIGEEISPSHVRAVAWSP-PGL--AKNGRCLLAVLTSNLRLSLY 119


>gnl|CDD|197651 smart00320, WD40, WD40 repeats.  Note that these repeats are
           permuted with respect to the structural repeats (blades)
           of the beta propeller domain.
          Length = 40

 Score = 27.7 bits (62), Expect = 3.3
 Identities = 7/22 (31%), Positives = 16/22 (72%)

Query: 171 ISWSDDGQLLAVTTSGGSVKIY 192
           +++S DG+ LA  +  G++K++
Sbjct: 18  VAFSPDGKYLASGSDDGTIKLW 39


>gnl|CDD|236058 PRK07579, PRK07579, hypothetical protein; Provisional.
          Length = 245

 Score = 30.6 bits (69), Expect = 4.2
 Identities = 11/53 (20%), Positives = 24/53 (45%), Gaps = 2/53 (3%)

Query: 891 ESVGAYERAEDYDNVVRVDLDHLNDIRHAVDIVKAKKCTEGAKRIADYCNKHG 943
            + G     +D+  +  +DLD     RH ++ ++A   T    + A + ++ G
Sbjct: 178 ATEGNLNSKKDFKQLREIDLDERGTFRHFINRLRA--LTHDDYKNAYFVDESG 228


>gnl|CDD|239145 cd02682, MIT_AAA_Arch, MIT: domain contained within Microtubule
            Interacting and Trafficking molecules. This sub-family of
            MIT domains is found in mostly archaebacterial
            AAA-ATPases. The molecular function of the MIT domain is
            unclear.
          Length = 75

 Score = 28.6 bits (64), Expect = 4.2
 Identities = 13/36 (36%), Positives = 20/36 (55%), Gaps = 3/36 (8%)

Query: 1186 NLQESALKFATLLLRPEYRSNLEDK---YRKQIELI 1218
             L+E A K+A   ++ E   N ED    Y+K IE++
Sbjct: 1    MLEEMARKYAINAVKAEKEGNAEDAITNYKKAIEVL 36


>gnl|CDD|211334 cd02566, PseudoU_synth_RluE, Pseudouridine synthase, Escherichia
          coli RluE.  This group is comprised of bacterial
          proteins similar to E. coli RluE. Pseudouridine
          synthases catalyze the isomerization of specific
          uridines in an RNA molecule to pseudouridines
          (5-ribosyluracil, psi).  No cofactors are required.
          Escherichia coli RluE makes psi2457 in 23S RNA. psi2457
          is not universally conserved.
          Length = 168

 Score = 30.0 bits (68), Expect = 4.4
 Identities = 14/30 (46%), Positives = 16/30 (53%), Gaps = 2/30 (6%)

Query: 42 NKHGKLIDKITLPGLCIV--MDWDSEGDLL 69
           KH  L D I  PG+     +D DSEG LL
Sbjct: 19 EKHKTLKDYIDDPGVYAAGRLDRDSEGLLL 48



 Score = 30.0 bits (68), Expect = 4.4
 Identities = 14/30 (46%), Positives = 16/30 (53%), Gaps = 2/30 (6%)

Query: 120 NKHGKLIDKITLPGLCIV--MDWDSEGDLL 147
            KH  L D I  PG+     +D DSEG LL
Sbjct: 19  EKHKTLKDYIDDPGVYAAGRLDRDSEGLLL 48


>gnl|CDD|212567 cd11694, DHR2_DOCK_D, Dock Homology Region 2, a GEF domain, of
           Class D Dedicator of Cytokinesis proteins.  DOCK
           proteins are atypical guanine nucleotide exchange
           factors (GEFs) that lack the conventional Dbl homology
           (DH) domain. As GEFs, they activate small GTPases by
           exchanging bound GDP for free GTP. They are divided into
           four classes (A-D) based on sequence similarity and
           domain architecture; class D, also called the Zizimin
           subfamily, includes Dock9, 10 and 11. Class D Docks are
           specific GEFs for Cdc42. Dock9 plays important roles in
           spine formation and dendritic growth. Dock10 and Dock11
           are preferentially expressed in lymphocytes. All DOCKs
           contain two homology domains: the DHR-1 (Dock homology
           region-1), also called CZH1 (CED-5, Dock180, and
           MBC-zizimin homology 1), and DHR-2 (also called CZH2 or
           Docker). The DHR-1 domain binds
           phosphatidylinositol-3,4,5-triphosphate. This alignment
           model represents the DHR-2 domain of class D DOCKs,
           which contains the catalytic GEF activity for Cdc42.
           Class D DOCKs also contain a Pleckstrin homology (PH)
           domain at the N-terminus.
          Length = 376

 Score = 31.2 bits (71), Expect = 4.5
 Identities = 20/88 (22%), Positives = 29/88 (32%), Gaps = 29/88 (32%)

Query: 824 CADILQQ---------FNKLNDAVTLYESAGNYEKAATCYIQLKNWTKIGQLLPHIKSAT 874
           C + L +           KL   + +YE   ++E+ A CY  L                 
Sbjct: 51  CVEGLWKAERYELLGELYKL--IIPIYEKRRDFEQLADCYRTLH---------------- 92

Query: 875 TFIQYAKAKEAMGSYRESVGAYERAEDY 902
               Y K  E M S +  +G Y R   Y
Sbjct: 93  --RAYEKVVEVMESGKRLLGTYYRVAFY 118


>gnl|CDD|128594 smart00299, CLH, Clathrin heavy chain repeat homology. 
          Length = 140

 Score = 29.9 bits (68), Expect = 4.7
 Identities = 10/63 (15%), Positives = 24/63 (38%)

Query: 880 AKAKEAMGSYRESVGAYERAEDYDNVVRVDLDHLNDIRHAVDIVKAKKCTEGAKRIADYC 939
            K  E    Y E+V  Y++  ++ + +   ++HL +   A++    +   E    +    
Sbjct: 76  GKLCEKAKLYEEAVELYKKDGNFKDAIVTLIEHLGNYEKAIEYFVKQNNPELWAEVLKAL 135

Query: 940 NKH 942
              
Sbjct: 136 LDK 138


>gnl|CDD|132218 TIGR03174, cas_Csc3, CRISPR type I-D/CYANO-associated protein
           Csc3/Cas10d.  CRISPR (Clustered Regularly Interspaced
           Short Palindromic Repeats) is a widespread family of
           prokaryotic direct repeats with spacers of unique
           sequence between consecutive repeats. This protein
           family is a CRISPR-associated (Cas) family strictly
           associated with the Cyano subtype of CRISPR/Cas locus,
           found in several species of Cyanobacteria and several
           archaeal species. This family is designated Csc3 for
           CRISPR/Cas Subtype Cyano protein 3, as it is often the
           third gene upstream of the core cas genes,
           cas3-cas4-cas1-cas2 [Mobile and extrachromosomal element
           functions, Other].
          Length = 953

 Score = 31.0 bits (70), Expect = 5.6
 Identities = 29/129 (22%), Positives = 39/129 (30%), Gaps = 23/129 (17%)

Query: 623 ISLHKVLKLLNWKEAWNICAVLNQSETWRSFAEACLQNL---EFSWAIRAYQSLDEAGMV 679
            +LH   K  +  +  N    LN  E       A  + L   EF     +    D    +
Sbjct: 45  FTLHDYHKHCHADDMPNDEFDLNIQEI-IPIILALGKRLGLDEFWAPENSEDWRDYIAEI 103

Query: 680 WCLESLVEEEEDTSILCGHVAALLGNHDTAQQRYLTSDIPTMALTLRRDLRQWREALALA 739
             L   V  ++  S           NH TA        +        R LR  R  L LA
Sbjct: 104 SFLAQNVHGKQHIS----------SNHSTAG---YNFTLKE------RTLRPLRHLLLLA 144

Query: 740 TSLGSNQTP 748
            S  S  +P
Sbjct: 145 DSAASLSSP 153


>gnl|CDD|218251 pfam04762, IKI3, IKI3 family.  Members of this family are
           components of the elongator multi-subunit component of a
           novel RNA polymerase II holoenzyme for transcriptional
           elongation. This region contains WD40 like repeats.
          Length = 903

 Score = 31.1 bits (71), Expect = 5.6
 Identities = 9/27 (33%), Positives = 17/27 (62%), Gaps = 1/27 (3%)

Query: 169 ERISWSDDGQLLAVTTSGGSVKIYLSK 195
              +WS D +LLA+TT   +V + +++
Sbjct: 120 SAAAWSPDEELLALTTGENTV-LLMTR 145


>gnl|CDD|239956 cd04583, CBS_pair_ABC_OpuCA_assoc2, This cd contains two tandem
           repeats of the cystathionine beta-synthase (CBS pair)
           domains in association with the ABC transporter OpuCA.
           OpuCA is the ATP binding component of a bacterial solute
           transporter that serves a protective role to cells
           growing in a hyperosmolar environment but the function
           of the CBS domains in OpuCA remains unknown.  In the
           related ABC transporter, OpuA, the tandem CBS domains
           have been shown to function as sensors for ionic
           strength, whereby they control the transport activity
           through an electronic switching mechanism. ABC
           transporters are a large family of proteins involved in
           the transport of a wide variety of different compounds,
           like sugars, ions, peptides, and more complex organic
           molecules. They are a subset of nucleotide hydrolases
           that contain a signature motif, Q-loop, and
           H-loop/switch region, in addition to the Walker A
           motif/P-loop and Walker B motif commonly found in a
           number of ATP- and GTP-binding and hydrolyzing proteins.
           CBS is a small domain originally identified in
           cystathionine beta-synthase and subsequently found in a
           wide range of different proteins. CBS domains usually
           come in tandem repeats, which associate to form a
           so-called Bateman domain or a CBS pair which is
           reflected in this model. The interface between the two
           CBS domains forms a cleft that is a potential ligand
           binding site. The CBS pair coexists with a variety of
           other functional domains.  It has been proposed that the
           CBS domain may play a regulatory role, although its
           exact function is unknown.
          Length = 109

 Score = 29.1 bits (66), Expect = 5.7
 Identities = 22/92 (23%), Positives = 40/92 (43%), Gaps = 18/92 (19%)

Query: 63  DSEGDLLGIISSNSSAVNVWNTYTKKRTIVDSGLRDPLTCLVWCKQCSMLQLSVSIYNKH 122
           D +  LLGI+S  S    +   Y + +++ D  L D  T          +Q   S+ +  
Sbjct: 32  DKDNKLLGIVSLES----LEQAYKEAKSLEDIMLEDVFT----------VQPDASLRD-- 75

Query: 123 GKLIDKITLPGLCIVMDWDSEGDLLGIISSNS 154
             ++  +   G   V   D +G L+G+I+ +S
Sbjct: 76  --VLGLVLKRGPKYVPVVDEDGKLVGLITRSS 105


>gnl|CDD|239057 cd02143, NADH_nitroreductase, Nitroreductase family. Members of this
            family utilize FMN as a cofactor. This family is involved
            in the reduction of flavin or nitroaromatic compounds by
            using NAD(P)H as electron donor in an obligatory
            two-electron transfer. Nitrogenase is homodimer. Each
            subunit contains one FMN molecule. Members of this family
            are also called NADH dehydrogenase, oxygen-insensitive
            NAD(P)H nitrogenase or dihydropteridine reductase.
          Length = 147

 Score = 29.5 bits (67), Expect = 6.5
 Identities = 12/51 (23%), Positives = 20/51 (39%), Gaps = 4/51 (7%)

Query: 1207 LEDKYRKQIELIVRKAPRKDIASPEEAHVLPCPYCDTMVP--DMMLHCASC 1255
            ++D + K I+ I R AP   +AS       P    D ++      L   + 
Sbjct: 45   VDDPWEKGIDPIFRGAPHLLLASAPRDF--PTAQVDAIIALTYFELAAQAL 93


>gnl|CDD|129168 TIGR00058, Hemerythrin, hemerythrin family non-heme iron protein.
            This family includes oxygen carrier proteins of various
            oligomeric states from the vascular fluid (hemerythrin)
            and muscle (myohemerythrin) of some marine invertebrates.
            Each unit binds 2 non-heme Fe using 5 H, one E and one D.
            One member of this family, from the sandworm Nereis
            diversicolor, is an unusual (non-metallothionein)
            cadmium-binding protein. Homologous proteins, excluded
            from this narrowly defined family, are found in archaea
            and bacteria (see pfam01814).
          Length = 115

 Score = 29.0 bits (65), Expect = 6.8
 Identities = 18/50 (36%), Positives = 25/50 (50%), Gaps = 6/50 (12%)

Query: 963  NLSQQHKKLHEFGKFLLEEDEPNPVELKRL----AIHFEEDKDMFRAAQY 1008
            NL ++HK L   G F L  D  +   LK L     +HF +++ M  AA Y
Sbjct: 17   NLDEEHKTLFN-GIFALAAD-NSATALKELIDVTVLHFLDEEAMMIAANY 64


>gnl|CDD|237602 PRK14081, PRK14081, triple tyrosine motif-containing protein;
           Provisional.
          Length = 667

 Score = 30.8 bits (70), Expect = 7.0
 Identities = 19/82 (23%), Positives = 33/82 (40%), Gaps = 23/82 (28%)

Query: 445 PESGEKYTIMTQAMSEE----FLFFATSEYELKIFSLSEWKFVSGYKHSDK-IKSIYPDI 499
           P+   +YTIM QA  E+    F + +  +Y +              K  +K IK+IY D 
Sbjct: 60  PKEEGEYTIMVQAKKEDSNKPFDYVSKEDYVIG-------------KAEEKLIKNIYLDK 106

Query: 500 YGI-----CLVLIEMNNTGFLY 516
             +       + ++ N    +Y
Sbjct: 107 DTLNVGEKIEIKVDSNKEPLMY 128


>gnl|CDD|234059 TIGR02917, PEP_TPR_lipo, putative PEP-CTERM system TPR-repeat
            lipoprotein.  This protein family occurs strictly
            within a subset of Gram-negative bacterial species with
            the proposed PEP-CTERM/exosortase system, analogous to
            the LPXTG/sortase system common in Gram-positive
            bacteria. This protein occurs in a species if and only if
            a transmembrane histidine kinase (TIGR02916) and a
            DNA-binding response regulator (TIGR02915) also occur.
            The presence of tetratricopeptide repeats (TPR) suggests
            protein-protein interaction, possibly for the regulation
            of PEP-CTERM protein expression, since many PEP-CTERM
            proteins in these genomes are preceded by a proposed DNA
            binding site for the response regulator.
          Length = 899

 Score = 30.4 bits (69), Expect = 7.2
 Identities = 60/345 (17%), Positives = 105/345 (30%), Gaps = 90/345 (26%)

Query: 711  QRYLTSDIPTMALTLRRDLRQW-REALALATSLGSNQTPIISCDYAQQLEMTGQHAQALS 769
            Q YL       AL +  +      ++      LG  Q               G   +A+S
Sbjct: 575  QYYLGKGQLKKALAILNEAADAAPDSPEAWLMLGRAQL------------AAGDLNKAVS 622

Query: 770  FYQKSMELA---------TPDIQDPECQRKCKEGIARTSIRVGDFRLGIRLAAESNSSVL 820
             ++K + L            D       +   +  A TS++         L  + +++  
Sbjct: 623  SFKKLLALQPDSALALLLLADAY--AVMKNYAK--AITSLKRA-------LELKPDNTEA 671

Query: 821  KNECADILQQFNKLNDAVTLYES-AGNYEKAATCYIQLKNWTKIGQLLPHIKSATTFIQY 879
            +   A +L    +   A  + +S    + KAA  +         G L    K     IQ 
Sbjct: 672  QIGLAQLLLAAKRTESAKKIAKSLQKQHPKAALGFELE------GDLYLRQKDYPAAIQA 725

Query: 880  AKAKEAMGSYRESVGAYERAEDYDNVVRVDLDHLNDIRHAVDIVKAKKCTEGAKRIADYC 939
                     YR+   A +RA    N +++    L           +    E  K +  + 
Sbjct: 726  ---------YRK---ALKRAPSSQNAIKLHRALL----------ASGNTAEAVKTLEAWL 763

Query: 940  NKHGDFGAAIHFLILSKCYQDAFNLSQQHKKLHEFGKFLLEEDEPNPVELKRLAIHFEED 999
              H +  A +    L++ Y    +  +  K        ++++   N V L  LA  + E 
Sbjct: 764  KTHPN-DAVLRTA-LAELYLAQKDYDKAIKHYQT----VVKKAPDNAVVLNNLAWLYLEL 817

Query: 1000 KDMFRAAQY---------------------YYHAKEYGRAMKLLL 1023
            KD  RA +Y                          E  RA+ LL 
Sbjct: 818  KDP-RALEYAERALKLAPNIPAILDTLGWLLVEKGEADRALPLLR 861


>gnl|CDD|238917 cd01942, ribokinase_group_A, Ribokinase-like subgroup A.  Found in
           bacteria and archaea, this subgroup is part of the
           ribokinase/pfkB superfamily.  Its oligomerization state
           is unknown at this time.
          Length = 279

 Score = 30.0 bits (68), Expect = 8.3
 Identities = 8/25 (32%), Positives = 11/25 (44%)

Query: 517 HTAMDYLLPIPEFPPATEEVLWDTV 541
           H   D +L +  FP   E VL   +
Sbjct: 7   HLNYDIILKVESFPGPFESVLVKDL 31


>gnl|CDD|177301 PHA00733, PHA00733, hypothetical protein.
          Length = 128

 Score = 28.7 bits (64), Expect = 8.7
 Identities = 15/38 (39%), Positives = 19/38 (50%), Gaps = 7/38 (18%)

Query: 362 HYLGDNQKTIPDAFIKNRNYFAHTLVYKPQLLGDKEYL 399
           H L   QK +  A +K       TL+Y PQLL +  YL
Sbjct: 32  HSLTPEQKRLIRAVVK-------TLIYNPQLLDESSYL 62


>gnl|CDD|222112 pfam13414, TPR_11, TPR repeat. 
          Length = 69

 Score = 27.3 bits (61), Expect = 9.7
 Identities = 16/64 (25%), Positives = 27/64 (42%), Gaps = 9/64 (14%)

Query: 841 YESAGNYEKAATCYIQLKNWTKIGQLLPHIKSATTFIQYAKAKEAMGS-YRESVGAYERA 899
               G+Y++A   Y       K  +L P   +A  +   A A   +G  Y E++   E+A
Sbjct: 13  LFKLGDYDEAIEAY------EKALELDP--DNAEAYYNLALAYLKLGKDYEEALEDLEKA 64

Query: 900 EDYD 903
            + D
Sbjct: 65  LELD 68


  Database: CDD.v3.10
    Posted date:  Mar 20, 2013  7:55 AM
  Number of letters in database: 10,937,602
  Number of sequences in database:  44,354
  
Lambda     K      H
   0.319    0.135    0.405 

Gapped
Lambda     K      H
   0.267   0.0728    0.140 


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 44354
Number of Hits to DB: 66,463,601
Number of extensions: 6555846
Number of successful extensions: 5515
Number of sequences better than 10.0: 1
Number of HSP's gapped: 5489
Number of HSP's successfully gapped: 51
Length of query: 1324
Length of database: 10,937,602
Length adjustment: 108
Effective length of query: 1216
Effective length of database: 6,147,370
Effective search space: 7475201920
Effective search space used: 7475201920
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.4 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.8 bits)
S2: 65 (28.8 bits)