RPS-BLAST 2.2.26 [Sep-21-2011]

Database: CDD.v3.10 
           44,354 sequences; 10,937,602 total letters

Searching..................................................done

Query= psy6070
         (1454 letters)



>gnl|CDD|238492 cd00992, PDZ_signaling, PDZ domain found in a variety of Eumetazoan
            signaling molecules, often in tandem arrangements. May be
            responsible for specific protein-protein interactions, as
            most PDZ domains bind C-terminal polypeptides, and
            binding to internal (non-C-terminal) polypeptides and
            even to lipids has been demonstrated. In this subfamily
            of PDZ domains an N-terminal beta-strand forms the
            peptide-binding groove base, a circular permutation with
            respect to PDZ domains found in proteases.
          Length = 82

 Score = 76.5 bits (189), Expect = 7e-17
 Identities = 35/88 (39%), Positives = 46/88 (52%), Gaps = 6/88 (6%)

Query: 1303 LVRVSFEKGPGKKSLGFSIVGGVDSPKGEMGIFVKTIFPHGQAAESGLLVEGDEILLFNN 1362
            +  V+  K PG   LGFS+ GG DS     GIFV  + P G  AE G L  GD IL  N 
Sbjct: 1    VRTVTLRKDPGG-GLGFSLRGGKDS---GGGIFVSRVEP-GGPAERGGLRVGDRILEVNG 55

Query: 1363 EPLQGRTHAEAITIFKKTKQGLVELVLQ 1390
              ++G TH EA+ + K +    V L ++
Sbjct: 56   VSVEGLTHEEAVELLKNSG-DEVTLTVR 82


>gnl|CDD|214570 smart00228, PDZ, Domain present in PSD-95, Dlg, and ZO-1/2.  Also
            called DHR (Dlg homologous region) or GLGF (relatively
            well conserved tetrapeptide in these domains). Some PDZs
            have been shown to bind C-terminal polypeptides; others
            appear to bind internal (non-C-terminal) polypeptides.
            Different PDZs possess different binding specificities.
          Length = 85

 Score = 70.9 bits (174), Expect = 6e-15
 Identities = 39/93 (41%), Positives = 51/93 (54%), Gaps = 9/93 (9%)

Query: 1300 EPTLVRVSFEKGPGKKSLGFSIVGGVDSPKGEMGIFVKTIFPHGQAAESGLLVEGDEILL 1359
            EP LV +  EKG G   LGFS+VGG D   G   + V ++ P   AA++GL V GD IL 
Sbjct: 1    EPRLVEL--EKGGG--GLGFSLVGGKDEGGG---VVVSSVVPGSPAAKAGLRV-GDVILE 52

Query: 1360 FNNEPLQGRTHAEAITIFKKTKQGLVELVLQPN 1392
             N   ++G TH EA+ + KK   G V L +   
Sbjct: 53   VNGTSVEGLTHLEAVDLLKKAG-GKVTLTVLRG 84


>gnl|CDD|238080 cd00136, PDZ, PDZ domain, also called DHR (Dlg homologous region) or
            GLGF (after a conserved sequence motif). Many PDZ domains
            bind C-terminal polypeptides, though binding to internal
            (non-C-terminal) polypeptides and even to lipids has been
            demonstrated. Heterodimerization through PDZ-PDZ domain
            interactions adds to the domain's versatility, and PDZ
            domain-mediated interactions may be modulated dynamically
            through target phosphorylation. Some PDZ domains play a
            role in scaffolding supramolecular complexes. PDZ domains
            are found in diverse signaling proteins in bacteria,
            archaebacteria, and eukaryotes. This CD contains two
            distinct structural subgroups with either an N- or
            C-terminal beta-strand forming the peptide-binding groove
            base. The circular permutation placing the strand on the
            N-terminus appears to be found in Eumetazoa only, while
            the C-terminal variant is found in all three kingdoms of
            life, and seems to co-occur with protease domains. PDZ
            domains have been named after PSD95 (post synaptic density
            protein), DlgA (Drosophila disc large tumor suppressor),
            and ZO1, a mammalian tight junction protein.
          Length = 70

 Score = 54.2 bits (131), Expect = 3e-09
 Identities = 24/75 (32%), Positives = 33/75 (44%), Gaps = 6/75 (8%)

Query: 1315 KSLGFSIVGGVDSPKGEMGIFVKTIFPHGQAAESGLLVEGDEILLFNNEPLQGRTHAEAI 1374
              LGFSI GG      E G+ V ++ P   A  +GL   GD IL  N   ++  T  +  
Sbjct: 1    GGLGFSIRGG-----TEGGVVVLSVEPGSPAERAGLQA-GDVILAVNGTDVKNLTLEDVA 54

Query: 1375 TIFKKTKQGLVELVL 1389
             + KK     V L +
Sbjct: 55   ELLKKEVGEKVTLTV 69


>gnl|CDD|201332 pfam00595, PDZ, PDZ domain (Also known as DHR or GLGF).  PDZ domains
            are found in diverse signaling proteins.
          Length = 80

 Score = 51.1 bits (123), Expect = 5e-08
 Identities = 35/77 (45%), Positives = 46/77 (59%), Gaps = 5/77 (6%)

Query: 1306 VSFEKGPGKKSLGFSIVGGVDSPKGEMGIFVKTIFPHGQAAESGLLVEGDEILLFNNEPL 1365
            V+ EK  G+  LGFS+VGG D   G+ GIFV  + P G AAE+G L EGD IL  N + L
Sbjct: 2    VTLEK-SGRGGLGFSLVGGSD---GDPGIFVSEVLP-GGAAEAGGLQEGDRILSINGQDL 56

Query: 1366 QGRTHAEAITIFKKTKQ 1382
            +  +H EA+   K +  
Sbjct: 57   ENLSHDEAVLALKGSGG 73


>gnl|CDD|236304 PRK08581, PRK08581, N-acetylmuramoyl-L-alanine amidase; Validated.
          Length = 619

 Score = 40.5 bits (95), Expect = 0.007
 Identities = 16/92 (17%), Positives = 43/92 (46%), Gaps = 4/92 (4%)

Query: 1185 LPSSSDFMSNSNSSSRSNANQSLPKSNQSLPNSNQNLPTSNQVPSSTDSVSNTNRASPNA 1244
             P +S+  +N ++ +  ++     K++    +S Q+   + + PSS ++  +T+   PN+
Sbjct: 138  QPRNSEKSTNDSNKNSDSSI----KNDTDTQSSKQDKADNQKAPSSNNTKPSTSNKQPNS 193

Query: 1245 TNSNQALTNLTDSVSNTNQESPTSTELAKTSS 1276
                Q   + +   S+      +S++  ++ S
Sbjct: 194  PKPTQPNQSNSQPASDDTANQKSSSKDNQSMS 225



 Score = 32.8 bits (75), Expect = 1.6
 Identities = 21/102 (20%), Positives = 41/102 (40%), Gaps = 5/102 (4%)

Query: 1180 NSNEVLPSSSDFMSNSNSSSRSNANQSLPKSNQSLPNSNQNLPTSNQVPSSTDSVSNTNR 1239
              N+    SS      N  + S+ N     SN+   +     P  +    ++D  +N   
Sbjct: 157  IKNDTDTQSSKQDKADNQKAPSSNNTKPSTSNKQPNSPKPTQPNQSNSQPASDDTANQKS 216

Query: 1240 AS-PNATNSNQALTNLTDSVS----NTNQESPTSTELAKTSS 1276
            +S  N + S+ AL ++ D  S     T ++  + ++  KT +
Sbjct: 217  SSKDNQSMSDSALDSILDQYSEDAKKTQKDYASQSKKDKTET 258



 Score = 32.5 bits (74), Expect = 1.9
 Identities = 12/65 (18%), Positives = 25/65 (38%)

Query: 106 SSNESQCSQSQSSPQRSSQSNNSPQNNSQSNNSSQSQRSPQSQSTSQSSSQTPCGSQNKG 165
            S+    + +QSS Q  + +  +P +N+   ++S  Q +    +    S+  P       
Sbjct: 154 DSSIKNDTDTQSSKQDKADNQKAPSSNNTKPSTSNKQPNSPKPTQPNQSNSQPASDDTAN 213

Query: 166 AIRSD 170
              S 
Sbjct: 214 QKSSS 218



 Score = 32.1 bits (73), Expect = 2.5
 Identities = 16/95 (16%), Positives = 35/95 (36%), Gaps = 10/95 (10%)

Query: 75  QGDGGEDGRMRRRSSIWRRVGSVQSTQAPKG--SSNESQCSQSQSSPQRSSQSNNSPQNN 132
           Q    E        +    + +   TQ+ K   + N+   S + + P  S++  NSP+  
Sbjct: 138 QPRNSEKSTNDSNKNSDSSIKNDTDTQSSKQDKADNQKAPSSNNTKPSTSNKQPNSPKPT 197

Query: 133 SQSNNSSQSQRSPQSQSTSQSSSQTPCGSQNKGAI 167
             + +         SQ  S  ++     S++  ++
Sbjct: 198 QPNQS--------NSQPASDDTANQKSSSKDNQSM 224



 Score = 31.3 bits (71), Expect = 4.3
 Identities = 39/210 (18%), Positives = 69/210 (32%), Gaps = 6/210 (2%)

Query: 95  GSVQSTQAPKGSSNESQCSQSQSSPQRSSQSNNSPQNNSQSNNSSQSQRSPQSQSTSQSS 154
              + + A   S +  + +  ++S   SS+  +   NN+ SN  +  ++     S++  S
Sbjct: 29  DPQKDSTAKTTSHDSKKSNDDETSKDTSSKDTDKADNNNTSNQDNNDKKFSTIDSSTSDS 88

Query: 155 SQTPCGSQNKGAIRSDLTLSENRPLFSPSLNEEVMDMNYLYNFKCRRKQNFLLLNDYKSL 214
           +           +                       +  L+N            N  KS 
Sbjct: 89  NNIIDFIYKN--LPQTNINQLLTKNKYDDNYSLTTLIQNLFNLNSDISDYEQPRNSEKST 146

Query: 215 TNDYKSCDSKSLSNDYKSSDCKSLANDYKLSDCKALANDYKSSDSMSLVNDYKPSQTNGS 274
            +  K+ DS   ++    S  +  A++ K       +N+ K S S    N  KP+Q N S
Sbjct: 147 NDSNKNSDSSIKNDTDTQSSKQDKADNQKAPS----SNNTKPSTSNKQPNSPKPTQPNQS 202

Query: 275 TEVTAHRSTDLVGTDGKPNLSDVSKPLPHI 304
               A   T    +  K N S     L  I
Sbjct: 203 NSQPASDDTANQKSSSKDNQSMSDSALDSI 232



 Score = 30.1 bits (68), Expect = 9.4
 Identities = 22/105 (20%), Positives = 48/105 (45%), Gaps = 10/105 (9%)

Query: 1181 SNEVLPSSSDFMSNSNSSSRSNANQSLPKSNQSLPNSN---QNLPTSNQVPSSTDSVSNT 1237
             N+  PSS++   ++++   ++   + P  + S P S+       +S    S +DS  ++
Sbjct: 172  DNQKAPSSNNTKPSTSNKQPNSPKPTQPNQSNSQPASDDTANQKSSSKDNQSMSDSALDS 231

Query: 1238 --NRASPNATNSN-----QALTNLTDSVSNTNQESPTSTELAKTS 1275
              ++ S +A  +      Q+  + T++ +  N + PT  EL   S
Sbjct: 232  ILDQYSEDAKKTQKDYASQSKKDKTETSNTKNPQLPTQDELKHKS 276


>gnl|CDD|148679 pfam07218, RAP1, Rhoptry-associated protein 1 (RAP-1).  This family
           consists of several rhoptry-associated protein 1 (RAP-1)
           sequences which appear to be specific to Plasmodium
           falciparum.
          Length = 790

 Score = 39.7 bits (92), Expect = 0.012
 Identities = 47/260 (18%), Positives = 92/260 (35%), Gaps = 35/260 (13%)

Query: 37  NDGAVSKRKSSTWTKVAKTFDLMRKSEARQCSEAGPSSQGDGGEDGRMRRRSSIWRRVGS 96
           +DG ++   +S     +K      KS  R  S A    + D  +D   +   +       
Sbjct: 77  DDGNINLTDTSENGDASKKGH--GKSRVRSASAAAILEEDDSKDDMEFKANPNE------ 128

Query: 97  VQSTQAPKGSSNESQCSQSQSSPQRSSQSNNSPQNNSQSNNSSQSQRSPQSQSTSQSSSQ 156
                 PKG+  E   S S    + S++S +  ++ S+   S+ S  S      + +S  
Sbjct: 129 AGKPGKPKGNQGEGLASSSDGKSKASAKSGS--KSASKHGESNSSDESATDSGKASASVA 186

Query: 157 TPCGSQNKGAIRSDLTLSENRPLFSP---------SLNEEVMDMNYLYN--FKCRRKQNF 205
              G+  +       TL+    L+            L +   +++ L N   K   ++ F
Sbjct: 187 GIVGADEEAPPAPKNTLTPLEELYETNVNLFALKHPLEKLEEEIDILKNDGDKVAEEEEF 246

Query: 206 LLLNDYKSLTNDYKSCDSKSLSNDYKSSDCKSLANDYKLSDCKALANDYKS-------SD 258
            L  +++    D K    ++L       D +    D    + K + +D K        S+
Sbjct: 247 ELDEEHEEAEEDKK----EALEKIGAEGDEEKFKFD---EEIKFIEHDVKDRNIAGGFSE 299

Query: 259 SMSLVNDYKPSQTNGSTEVT 278
             S +N +K  +     E++
Sbjct: 300 FFSKLNPFKKDEKIEKKEIS 319


>gnl|CDD|238488 cd00988, PDZ_CTP_protease, PDZ domain of C-terminal processing-,
            tail-specific-, and tricorn proteases, which function in
            posttranslational protein processing, maturation, and
            disassembly or degradation, in Bacteria, Archaea, and
            plant chloroplasts. May be responsible for substrate
            recognition and/or binding, as most PDZ domains bind
            C-terminal polypeptides, and binding to internal
            (non-C-terminal) polypeptides and even to lipids has been
            demonstrated. In this subfamily of protease-associated
            PDZ domains a C-terminal beta-strand forms the
            peptide-binding groove base, a circular permutation with
            respect to PDZ domains found in Eumetazoan signaling
            proteins.
          Length = 85

 Score = 35.3 bits (82), Expect = 0.025
 Identities = 15/63 (23%), Positives = 31/63 (49%), Gaps = 1/63 (1%)

Query: 1333 GIFVKTIFPHGQAAESGLLVEGDEILLFNNEPLQGRTHAEAITIFKKTKQGLVELVLQPN 1392
            G+ + ++ P   AA++G+   GD I+  + EP+ G +  + + + +      V L L+  
Sbjct: 14   GLVITSVLPGSPAAKAGIKA-GDIIVAIDGEPVDGLSLEDVVKLLRGKAGTKVRLTLKRG 72

Query: 1393 TTE 1395
              E
Sbjct: 73   DGE 75


>gnl|CDD|215386 PLN02727, PLN02727, NAD kinase.
          Length = 986

 Score = 38.3 bits (89), Expect = 0.037
 Identities = 45/236 (19%), Positives = 84/236 (35%), Gaps = 24/236 (10%)

Query: 73  SSQGDGGEDGRMRRRSSIWRRVGSVQSTQAPKGSSNESQCSQSQSSPQRSSQSNNSPQNN 132
           S++   G++  +     + +  GS+Q T     SSN S+  +S S    + +SN    N+
Sbjct: 369 SAERLLGQNSVVNGNGKLDQETGSLQETNDKDSSSNGSESGESCSIKDETGRSNLEAYNS 428

Query: 133 SQSNNSSQSQRSPQSQSTSQSS---------SQTPCGSQNKGAIRSDLTLSENRPLFSPS 183
             S+ S+Q      +   SQS+         +Q P          S    S+      P 
Sbjct: 429 LPSDQSTQQGEMVGTGVESQSNFNMESDPLKAQVPPCDVFSKKEMSKFFRSKK---IYPP 485

Query: 184 --LNEEVMDMNYLYNFKCRRKQNFLLLNDYKSLT------NDYKSCDSKSLSNDYKSSDC 235
             LN        L   +         ++D  S++              K+ S  Y+SS+ 
Sbjct: 486 TYLNYRRKGFEKLPVPQFTGVTQGSKIDDTDSISRLVETGRSNGLVSEKNSSPKYQSSEF 545

Query: 236 KSLANDYKLSDCKALANDYKSSDSMSLVNDYKPSQTNGSTEVTAHRSTDLVGTDGK 291
            +     K S+  + A+D   S + S+ N    +    S+ V+ +    +     +
Sbjct: 546 DNG----KSSNGSSFASDGSLSVASSITNGNPSNNGASSSTVSDNLERSVASVSVR 597


>gnl|CDD|223039 PHA03307, PHA03307, transcriptional regulator ICP4; Provisional.
          Length = 1352

 Score = 38.2 bits (89), Expect = 0.041
 Identities = 26/156 (16%), Positives = 49/156 (31%), Gaps = 9/156 (5%)

Query: 8   PSGQSVTADTNTEAEDC---PLTEAPDMDTRSNDGAVSKRKSSTWTKVAKTFDLMRKSEA 64
            +G S +  +++E+  C   P  E P              ++S W   +        S +
Sbjct: 231 DAGASSSDSSSSESSGCGWGPENECPLPRPAPITLPTRIWEASGWNGPSSRPGPASSSSS 290

Query: 65  RQ--CSEAGPSSQGDGGEDGRMRRRSSIWRRVGSVQSTQAPKGSSNESQCSQSQSSPQRS 122
            +       PSS G G                 S + + +   SS+      +  SP  S
Sbjct: 291 PRERSPSPSPSSPGSGPAPSS----PRASSSSSSSRESSSSSTSSSSESSRGAAVSPGPS 346

Query: 123 SQSNNSPQNNSQSNNSSQSQRSPQSQSTSQSSSQTP 158
              + SP       + S  ++ P+      S + + 
Sbjct: 347 PSRSPSPSRPPPPADPSSPRKRPRPSRAPSSPAASA 382


>gnl|CDD|237081 PRK12372, PRK12372, ribonuclease III; Reviewed.
          Length = 413

 Score = 36.8 bits (85), Expect = 0.080
 Identities = 23/90 (25%), Positives = 36/90 (40%), Gaps = 16/90 (17%)

Query: 633 ENTPSKGRLPALPPHSSLDAFCHSGSISVSSVDKPVNSVDKPVRSVDKPVNSVDKPVSSV 692
           E    KG   A P  +   A            DKP +  D   ++ +KP  +  +   + 
Sbjct: 308 ETAADKGERAAKPAAADKAA------------DKPADRPDAAEKAAEKPAEAAPR---AA 352

Query: 693 DKPVNSMDKP-VNSVDKPVRAVDSAFVSPS 721
           DKP      P  +S DKP  + D+A  +P+
Sbjct: 353 DKPAGQAADPASSSADKPGASADAAARTPA 382


>gnl|CDD|215814 pfam00242, DNA_pol_viral_N, DNA polymerase (viral) N-terminal
           domain. 
          Length = 379

 Score = 36.7 bits (85), Expect = 0.092
 Identities = 28/118 (23%), Positives = 43/118 (36%), Gaps = 12/118 (10%)

Query: 60  RKSEARQCSEAGPSSQGDGGEDGRMRRR--SSIWRRVGSVQSTQAPKGSSNESQC-SQSQ 116
           ++S     +  G  + G  G  G +R R  S+  R  G       P  S   +   S S 
Sbjct: 246 QRSRLGLQANQGKLAHGQQGRSGSIRGRKHSTTRRPFG-----VEPSSSGVTTNRASSSS 300

Query: 117 SSPQRSSQSNNSPQNNSQSNNSSQSQRSPQSQSTSQSSSQTPCGSQNKGAIRSDLTLS 174
           S   +S+    +  + S S   S S  + + +S    S      SQN G + S   L 
Sbjct: 301 SCFHQSAVRETAYSSLSTSERHSSSGHAVELRSIPGGS----VSSQNAGPLLSCWWLQ 354



 Score = 33.3 bits (76), Expect = 0.97
 Identities = 17/85 (20%), Positives = 27/85 (31%), Gaps = 5/85 (5%)

Query: 85  RRRSSIWRRVGSVQSTQAPKGSSNESQCSQSQSSPQRSSQSNNSPQNNSQSNNSSQSQRS 144
           RR  ++         T    G   +SQ  +S+   Q    +N     + Q    S S R 
Sbjct: 218 RRTRNLANNTSRKSDTSRSVGPVRQSQIQRSRLGLQ----ANQGKLAHGQQG-RSGSIRG 272

Query: 145 PQSQSTSQSSSQTPCGSQNKGAIRS 169
            +  +T +     P  S       S
Sbjct: 273 RKHSTTRRPFGVEPSSSGVTTNRAS 297


>gnl|CDD|220365 pfam09726, Macoilin, Transmembrane protein.  This entry is a highly
           conserved protein present in eukaryotes.
          Length = 680

 Score = 36.1 bits (83), Expect = 0.16
 Identities = 30/160 (18%), Positives = 55/160 (34%), Gaps = 6/160 (3%)

Query: 8   PSGQSVTADTNTEAEDCPLTEAPDMDTRSNDGAVSKRKSSTWTKVAKTFDLMRKSEARQC 67
            S    +     E E     +           ++      +    +K   +         
Sbjct: 240 HSLSKSSNSQTPELEYSEKGKDHHHSHNHQHHSIGINNHHSKHADSKLQTIEVIENHSNK 299

Query: 68  SEAGPSSQGDGGEDGRMRRRSSIWRRVGSVQSTQAPKGSSNESQCSQSQSSPQRSSQSNN 127
           S    SS     E       +S     GS+ S  +       S  ++S SSP+  S +N 
Sbjct: 300 SRPSSSSTNGSKE----TTSNSSSAAAGSIGSKSSKSAKH--SNRNKSNSSPKSHSSANG 353

Query: 128 SPQNNSQSNNSSQSQRSPQSQSTSQSSSQTPCGSQNKGAI 167
           S  ++S S+N S+ +R+ +S S ++ S +   G    G +
Sbjct: 354 SVPSSSVSDNESKQKRASKSSSGARDSKKDASGMSANGTV 393



 Score = 33.7 bits (77), Expect = 0.84
 Identities = 17/77 (22%), Positives = 40/77 (51%), Gaps = 1/77 (1%)

Query: 96  SVQSTQAPKGSSNESQCSQSQSSPQRSSQSNNSPQNNSQSNNSSQSQRSPQSQSTSQSSS 155
           +  ++ +    S  S+ S+S     R+  SN+SP+++S +N S  S     ++S  + +S
Sbjct: 313 TTSNSSSAAAGSIGSKSSKSAKHSNRNK-SNSSPKSHSSANGSVPSSSVSDNESKQKRAS 371

Query: 156 QTPCGSQNKGAIRSDLT 172
           ++  G+++     S ++
Sbjct: 372 KSSSGARDSKKDASGMS 388


>gnl|CDD|220633 pfam10214, Rrn6, RNA polymerase I-specific transcription-initiation
           factor.  RNA polymerase I-specific
           transcription-initiation factor Rrn6 and Rrn7 represent
           components of a multisubunit transcription factor
           essential for the initiation of rDNA transcription by
           Pol I. These proteins are found in fungi.
          Length = 753

 Score = 35.9 bits (83), Expect = 0.19
 Identities = 25/90 (27%), Positives = 39/90 (43%), Gaps = 7/90 (7%)

Query: 74  SQGDGGEDGRMRRRSSIWRRVGSVQSTQAPKGSSNESQCSQSQ-----SSPQRSSQSNNS 128
           +Q D  +  ++  +S I     S Q +Q  KG S+    + +      S P  S  S++ 
Sbjct: 652 TQPDVTDSSQLESQSQIPTIRSSQQVSQTRKGGSSVVPSAPAPRLAQSSQPPTSQSSSDL 711

Query: 129 PQNNSQSNNSSQSQRSPQSQSTSQSSSQTP 158
           P ++SQ+   S S    QSQS S  S    
Sbjct: 712 PPSSSQA--FSLSDLPMQSQSESGLSGGRS 739



 Score = 33.2 bits (76), Expect = 1.3
 Identities = 21/70 (30%), Positives = 27/70 (38%), Gaps = 3/70 (4%)

Query: 95  GSVQSTQAPKG--SSNESQCSQSQSSPQRSSQSNNSPQNNSQSNNSSQSQRSPQSQSTSQ 152
                +Q P    S   SQ  +  SS   S+ +    Q+ SQ   S  S   P S S + 
Sbjct: 661 QLESQSQIPTIRSSQQVSQTRKGGSSVVPSAPAPRLAQS-SQPPTSQSSSDLPPSSSQAF 719

Query: 153 SSSQTPCGSQ 162
           S S  P  SQ
Sbjct: 720 SLSDLPMQSQ 729


>gnl|CDD|233496 TIGR01622, SF-CC1, splicing factor, CC1-like family.  This model
            represents a subfamily of RNA splicing factors including
            the Pad-1 protein (N. crassa), CAPER (M. musculus) and
            CC1.3 (H. sapiens). These proteins are characterized by an
            N-terminal arginine-rich, low complexity domain followed
            by three (or in the case of 4 H. sapiens paralogs, two)
            RNA recognition domains (rrm: pfam00706). These splicing
            factors are closely related to the U2AF splicing factor
            family (TIGR01642). A homologous gene from Plasmodium
            falciparum was identified in the course of the analysis
            of that genome at TIGR and was included in the seed.
          Length = 457

 Score = 35.6 bits (82), Expect = 0.20
 Identities = 21/100 (21%), Positives = 34/100 (34%), Gaps = 3/100 (3%)

Query: 951  SRREETMRSRQSDPLPAVDSRKSQDRKSVRNQVRKAEHVSLKSKAIFSNEESRGLSN--R 1008
             R  E  R R           +S+ R   R++ R+                SR  +   R
Sbjct: 2    YRDRERGRLRNDTRRSDKGRERSRRRSRSRDRSRRRRDRDYYRGRRG-RSRSRSPNRYYR 60

Query: 1009 HEANRRSKRSSKKSVEPTPAQNDENSLENDLVRALNLALR 1048
               +R  +R  ++S   T     E   ++  V  L LAL+
Sbjct: 61   PRGDRSYRRDDRRSGRNTKEPLTEAERDDRTVFVLQLALK 100


>gnl|CDD|233230 TIGR01000, bacteriocin_acc, bacteriocin secretion accessory protein. 
            This family represents an accessory protein that works
            with the bacteriocin maturation and ABC transport
            secretion protein described by TIGR01193 [Transport and
            binding proteins, Other].
          Length = 457

 Score = 35.4 bits (82), Expect = 0.22
 Identities = 36/195 (18%), Positives = 77/195 (39%), Gaps = 20/195 (10%)

Query: 1085 NIQRKVSSLDRKRRSQQNNPNQHRSEDVKAYRSRTEDLKDCLSRTENVNASTNSTKEYRS 1144
            N++ +  SLD  ++S +N  NQ  ++D   YR+        L++ E++ + T    +   
Sbjct: 108  NLKDQKKSLDTLKQSIENGRNQFPTDDSFGYRNL---FNGYLAQVESLTSETQQQNDKSQ 164

Query: 1145 LVDTASEVRLDEMGHVPDVLRNCVRRDAKVIQNFPNSNEVLPSSSDFMSNSNS-----SS 1199
              + A+E    ++          + +D + ++N  ++   + + + + S   +      S
Sbjct: 165  TQNEAAEKTKAQLDQQISKTDQKL-QDYQALKNAISNGTKVANFNPYQSLYENYQAQLKS 223

Query: 1200 RSNANQSLPKSNQSLPNSNQN---LPTS-----NQVPSSTDSVSNTNRASPN---ATNSN 1248
             S+ +Q     +  L    Q    L  S      Q    T S ++   +S N   A    
Sbjct: 224  ASDKDQKNQVKSTILATIQQQIDQLQKSIASYQVQKAGLTKSTASNYASSQNSKLAQLKE 283

Query: 1249 QALTNLTDSVSNTNQ 1263
            Q L  +   +++ NQ
Sbjct: 284  QQLAKVKQEITDLNQ 298


>gnl|CDD|237555 PRK13914, PRK13914, invasion associated secreted endopeptidase;
            Provisional.
          Length = 481

 Score = 35.2 bits (80), Expect = 0.30
 Identities = 22/81 (27%), Positives = 42/81 (51%), Gaps = 5/81 (6%)

Query: 1198 SSRSNANQSLPKSNQSLPNSNQNLPTSNQVPSSTDSVSNTNRASPNATNSNQALTNLTDS 1257
            S+ +NAN++   +N +  N+N + P+ N     T++ +N+N  + + TN+NQ  +N   +
Sbjct: 309  STNTNANKTNTNTNTNTNNTNTSTPSKN-----TNTNTNSNTNTNSNTNANQGSSNNNSN 363

Query: 1258 VSNTNQESPTSTELAKTSSGG 1278
             S +   +     L K  S G
Sbjct: 364  SSASAIIAEAQKHLGKAYSWG 384



 Score = 31.3 bits (70), Expect = 4.4
 Identities = 24/108 (22%), Positives = 49/108 (45%), Gaps = 4/108 (3%)

Query: 1169 RRDAKVIQNFPNSNEVLPSSSDFMSNSNSSSRSNANQSLPKSNQSLPNSNQNLPTSNQVP 1228
            ++ A V++   N+N       +  +   ++ ++    + P    S  N+N N   +N   
Sbjct: 265  KQAAPVVKENTNTNTATTEKKETTTQQQTAPKAPTEAAKPAPAPS-TNTNANKTNTNTNT 323

Query: 1229 SSTDSVSNTNRASPNATNSNQALTNLTDSVSNTNQESPTSTELAKTSS 1276
            ++ ++  NT+  S N TN+N      T+S +N NQ S  +   +  S+
Sbjct: 324  NTNNT--NTSTPSKN-TNTNTNSNTNTNSNTNANQGSSNNNSNSSASA 368


>gnl|CDD|240274 PTZ00112, PTZ00112, origin recognition complex 1 protein;
            Provisional.
          Length = 1164

 Score = 35.4 bits (81), Expect = 0.34
 Identities = 33/165 (20%), Positives = 62/165 (37%), Gaps = 22/165 (13%)

Query: 1085 NIQRKVSSLDRKRRSQQNNPNQHRSEDVKAYRSRTEDLKDCLSRTENVNASTNSTKEYRS 1144
            NI++      + +R+ + +  Q+   DV+  RS T++ K        V+   +S    R 
Sbjct: 232  NIKKDRDGDKQTKRNSEKSKVQNSHFDVRILRSYTKENKKDEKNV--VSGIRSSVLLKRK 289

Query: 1145 LVDTASEVRLDEMGHVPDVLRNCVRRDAKVIQNFPNSNEVLPSSSDFMSNSNSSSRSNAN 1204
                                  C+R+D+ V  N     +     +    N+ SS+ +N +
Sbjct: 290  --------------------SQCLRKDSYVYSNHQKKAKTGDPKNIIHRNNGSSNSNNDD 329

Query: 1205 QSLPKSNQSLPNSNQNLPTSNQVPSSTDSVSNTNRASPNATNSNQ 1249
             S      S   SN+N  +  +  ++T   +NT     N T + Q
Sbjct: 330  TSSSNHLGSNRISNRNPSSPYKKQTTTKHTNNTKNNKYNKTKTTQ 374


>gnl|CDD|113196 pfam04415, DUF515, Protein of unknown function (DUF515).  Family of
           hypothetical Archaeal proteins.
          Length = 416

 Score = 34.5 bits (79), Expect = 0.44
 Identities = 21/66 (31%), Positives = 33/66 (50%), Gaps = 4/66 (6%)

Query: 106 SSNESQCS----QSQSSPQRSSQSNNSPQNNSQSNNSSQSQRSPQSQSTSQSSSQTPCGS 161
           S ++SQ +     S +S   SS ++ SP + S  N+     +S  SQS S+S+S +   S
Sbjct: 280 SESQSQSTSTSSSSSTSSSESSSTSYSPGDASIQNSQRSQLQSSTSQSESESASSSYSYS 339

Query: 162 QNKGAI 167
            N   I
Sbjct: 340 VNLPEI 345



 Score = 30.3 bits (68), Expect = 9.3
 Identities = 21/64 (32%), Positives = 31/64 (48%), Gaps = 1/64 (1%)

Query: 93  RVGSVQSTQAPKGSSNESQCSQSQSSPQRSSQSNNSPQNNSQSN-NSSQSQRSPQSQSTS 151
            +   +S      +S+ S  S S+SS    S  + S QN+ +S   SS SQ   +S S+S
Sbjct: 276 SISVSESQSQSTSTSSSSSTSSSESSSTSYSPGDASIQNSQRSQLQSSTSQSESESASSS 335

Query: 152 QSSS 155
            S S
Sbjct: 336 YSYS 339


>gnl|CDD|218439 pfam05109, Herpes_BLLF1, Herpes virus major outer envelope
            glycoprotein (BLLF1).  This family consists of the BLLF1
            viral late glycoprotein, also termed gp350/220. It is the
            most abundantly expressed glycoprotein in the viral
            envelope of the Herpesviruses and is the major antigen
            responsible for stimulating the production of
            neutralising antibodies in vivo.
          Length = 830

 Score = 34.4 bits (78), Expect = 0.63
 Identities = 32/165 (19%), Positives = 58/165 (35%), Gaps = 22/165 (13%)

Query: 1129 TENVNASTNSTKEYRSLVDTASEVRLDEMGHVPDVLRNCVRRDAKVIQNFPNSNEVLPSS 1188
            T    +    T      V  A+  ++ E   V +     V     V+ +   + +    S
Sbjct: 536  TTTATSPPTGTTS----VPNATSPQVTEESPVNNTNTPVVTSAPSVLTSAVTTGQHGTGS 591

Query: 1189 SDFM------SNSNSSSRSNANQSLPKSNQSLPNSNQNLPTSNQVPSSTDSVSNTNRASP 1242
            S         S+S+S+ RSN+  + P    + P   +N+        ST  VS  +    
Sbjct: 592  SPTSQQPGIPSSSHSTPRSNSTSTTPLLTSAHPTGGENITEETPSVPSTTHVSTLSPGPG 651

Query: 1243 NATNSNQA------------LTNLTDSVSNTNQESPTSTELAKTS 1275
              T S  +              ++T+ + N N  SP++    KT+
Sbjct: 652  PGTTSQVSGPGNSSTSRYPGEVHVTEGMPNPNATSPSAPSGQKTA 696



 Score = 31.3 bits (70), Expect = 4.5
 Identities = 24/117 (20%), Positives = 43/117 (36%), Gaps = 3/117 (2%)

Query: 1187 SSSDFMSNSNSSSRSNANQSLPKSNQSLPNSNQNLPTSNQV-PSSTDSVSNTNRASPNAT 1245
            +S+   + S + + +  N + P + ++    N   PT   +  ++T +   T   S    
Sbjct: 493  TSATPNATSPTPAVTTPNATSPTTQKTSDTPNATSPTPIVIGVTTTATSPPTGTTSVPNA 552

Query: 1246 NSNQALTNLTDSVSNTNQESPTSTELAKTSSGGFGLLHSLLASRSAFRSRPQTCEPT 1302
             S Q        V+NTN    TS     TS+   G   +  +  S     P +   T
Sbjct: 553  TSPQVTE--ESPVNNTNTPVVTSAPSVLTSAVTTGQHGTGSSPTSQQPGIPSSSHST 607


>gnl|CDD|217469 pfam03276, Gag_spuma, Spumavirus gag protein. 
          Length = 582

 Score = 34.1 bits (78), Expect = 0.65
 Identities = 16/62 (25%), Positives = 23/62 (37%), Gaps = 5/62 (8%)

Query: 97  VQSTQAP-----KGSSNESQCSQSQSSPQRSSQSNNSPQNNSQSNNSSQSQRSPQSQSTS 151
            QS Q       +G       SQ Q+  ++ ++   S Q   Q  N S      QSQ  +
Sbjct: 439 PQSDQQRPVSRGRGRGQRGPRSQPQNQRRQQNRGRQSSQPPRQQQNRSNQNNQRQSQGPN 498

Query: 152 QS 153
           Q 
Sbjct: 499 QG 500


>gnl|CDD|237460 PRK13656, PRK13656, trans-2-enoyl-CoA reductase; Provisional.
          Length = 398

 Score = 33.3 bits (77), Expect = 0.94
 Identities = 16/46 (34%), Positives = 22/46 (47%), Gaps = 12/46 (26%)

Query: 1275 SSGGFGLLHSLLASR--SAFRSRPQTCEPTLVRVSFEKGPGKKSLG 1318
            +S G+GL     ASR  +AF +   T     + V FEK   +K  G
Sbjct: 49   ASSGYGL-----ASRIAAAFGAGADT-----LGVFFEKPGTEKKTG 84


>gnl|CDD|219500 pfam07655, Secretin_N_2, Secretin N-terminal domain.  This is a
           short domain found in bacterial type II/III secretory
           system proteins. The architecture of these proteins
           suggests that this family may be functionally analogous
           to pfam03958.
          Length = 95

 Score = 31.1 bits (71), Expect = 0.98
 Identities = 20/54 (37%), Positives = 24/54 (44%), Gaps = 2/54 (3%)

Query: 104 KGSSNESQCS--QSQSSPQRSSQSNNSPQNNSQSNNSSQSQRSPQSQSTSQSSS 155
            GSSN S  S   S S    SS S+NS    S S++SS    S    +T   S 
Sbjct: 14  SGSSNTSVTSGSVSSSGSNSSSSSSNSSNGGSSSSSSSGDSSSGTRITTESESD 67


>gnl|CDD|217503 pfam03344, Daxx, Daxx Family.  The Daxx protein (also known as the
           Fas-binding protein) is thought to play a role in
           apoptosis, but the precise role played by Daxx remains to be
           determined. Daxx forms a complex with Axin.
          Length = 715

 Score = 33.3 bits (76), Expect = 1.1
 Identities = 35/182 (19%), Positives = 63/182 (34%), Gaps = 19/182 (10%)

Query: 8   PSGQSVTADTN--TEAEDCPLTEAPDMDTRSNDGAVSKRKSSTWTK---VAKTFDLMRKS 62
           PS  S T+  +    +++    E+ + +    +    + + S   +     +  ++   +
Sbjct: 421 PSKASSTSGESPSMASQESEEEESVEEEEEEEEEEEEEEQESEEEEGEDEEEEEEVEADN 480

Query: 63  EARQCSEAGPSSQGDGGEDGR-MRRRSSIWRRVGSVQSTQAPKGSSNESQCSQSQSSPQR 121
            + +  E      GDG E      RR+S    +  +   Q P+GS      S    SPQ 
Sbjct: 481 GSEEEMEGSSEGDGDGEEPEEDAERRNSEMAGISRMSEGQQPRGS------SVQPESPQE 534

Query: 122 SSQ---SNNSPQNNSQSNNSSQSQRSPQSQSTSQSSSQTP----CGSQNKGAIRSDLTLS 174
                 S ++     +S+    ++ SP S  T      TP      S  K       T  
Sbjct: 535 EPLQPESMDAESVGEESDEELLAEESPLSSHTELEGVATPVETKISSSRKLPPPPVSTSL 594

Query: 175 EN 176
           EN
Sbjct: 595 EN 596


>gnl|CDD|220401 pfam09786, CytochromB561_N, Cytochrome B561, N terminal.  Members
           of this family are found in the N terminal region of
           cytochrome B561, as well as in various other putative
           uncharacterized proteins.
          Length = 559

 Score = 32.8 bits (75), Expect = 1.4
 Identities = 16/89 (17%), Positives = 28/89 (31%), Gaps = 12/89 (13%)

Query: 108 NESQCSQSQSSPQRSSQSNNSPQNNSQ-----------SNNSSQSQR-SPQSQSTSQSSS 155
           +      SQ+     +   ++P N S+           S + S S        ST Q S 
Sbjct: 107 DSQFTVVSQAKKSPPASKTSTPMNTSEPLVPGHSSFSDSPSRSASPSRKFSPSSTIQQSP 166

Query: 156 QTPCGSQNKGAIRSDLTLSENRPLFSPSL 184
           Q    ++      S  + S +  L   + 
Sbjct: 167 QLTPSNKPASPSSSYQSPSYSSSLGPVNS 195


>gnl|CDD|218693 pfam05687, DUF822, Plant protein of unknown function (DUF822).
           This family consists of the N terminal regions of
           several plant proteins of unknown function.
          Length = 151

 Score = 31.7 bits (72), Expect = 1.5
 Identities = 19/73 (26%), Positives = 29/73 (39%), Gaps = 2/73 (2%)

Query: 101 QAPKGSSNESQCSQSQSSPQRSS-QSNNSPQNNSQSNNSSQSQRSPQSQSTSQSSSQTPC 159
           +    S+  S CS  Q SP  S+  S     + S +++S  S  S  S   S ++S  P 
Sbjct: 80  EGAGSSATASPCSSYQLSPVSSAFPSPVPSYSASPASSSFPSPSSLDSIPISSAASLLP- 138

Query: 160 GSQNKGAIRSDLT 172
                  + S L 
Sbjct: 139 WLSVLSLVSSSLP 151


>gnl|CDD|215556 PLN03064, PLN03064, alpha,alpha-trehalose-phosphate synthase
           (UDP-forming); Provisional.
          Length = 934

 Score = 32.8 bits (75), Expect = 1.5
 Identities = 21/80 (26%), Positives = 29/80 (36%), Gaps = 14/80 (17%)

Query: 72  PSSQGDGGEDGRMRRRSSIWRRVGSVQSTQAPKGSSNESQCSQSQSSPQRSSQSNNSPQN 131
            S   DG        +SS  RR          K  S+ S    SQ   QRS  S+     
Sbjct: 823 RSRSPDGL-------KSSGDRRPSG-------KLPSSRSNSKNSQGKKQRSLLSSAKSGV 868

Query: 132 NSQSNNSSQSQRSPQSQSTS 151
           N  +++ S  + SP+    S
Sbjct: 869 NHAASHGSDRRPSPEKIGWS 888


>gnl|CDD|236402 PRK09191, PRK09191, two-component response regulator; Provisional.
          Length = 261

 Score = 32.1 bits (74), Expect = 1.6
 Identities = 10/18 (55%), Positives = 14/18 (77%)

Query: 1368 RTHAEAITIFKKTKQGLV 1385
            RT AEA+ + KKT+ GL+
Sbjct: 169  RTRAEAVALAKKTRPGLI 186


>gnl|CDD|220764 pfam10454, DUF2458, Protein of unknown function (DUF2458).  This is a
            family of uncharacterized proteins.
          Length = 155

 Score = 31.4 bits (71), Expect = 1.7
 Identities = 18/86 (20%), Positives = 33/86 (38%), Gaps = 7/86 (8%)

Query: 1081 SSSPNIQRKVSSLDRKRRSQQNNPNQHRSEDVKAYRSRTEDLKDCLSRTE-----NVNAS 1135
            S +P   +K+  L  ++   +    + R   ++   +R ED K  L         +V A 
Sbjct: 20   SKNPEKLQKIRRLIIEQHRHERQWWKEREALIQKQEARKEDKKKLLPNGTDSKLSSVGAQ 79

Query: 1136 TNST-KEYRSLVDTASEVR-LDEMGH 1159
             +   K  R   +   E+R  DE  +
Sbjct: 80   VDDGSKNTRLEKELERELRAFDERVY 105


>gnl|CDD|227578 COG5253, MSS4, Phosphatidylinositol-4-phosphate 5-kinase [Signal
            transduction mechanisms].
          Length = 612

 Score = 31.8 bits (72), Expect = 2.8
 Identities = 36/211 (17%), Positives = 67/211 (31%), Gaps = 28/211 (13%)

Query: 1072 APRAMVPLHSSSPNIQRKVSSLDRKRRSQQNNPNQHRSEDVKAYRSRTEDLKDCLSRTEN 1131
             P         S    +     DR   +  +    +++ D               +  E 
Sbjct: 4    RPPISRSGTGISMTHDKSTRPNDRSMSNDSSLCGLNQASD--------------ANGNEY 49

Query: 1132 VNASTNSTKEYRS--LVDTASEVRLDEMGHVPDVL------RNCVRRDAKVIQNFPNSNE 1183
               +  S K+  S  L D  S+    E       L         ++    +++ F N+ +
Sbjct: 50   SPNNKVSKKDTFSDQLHDALSKEFTLERERDRLQLNKRKYQAIRLQTSTPIVEIFKNNKD 109

Query: 1184 -VLPSSSDFMSNSNSSSRSNANQSLPKSNQSL----PNSNQNLPTSNQVPSSTDS-VSNT 1237
             V P +    S +N S+ +    S P    S     PN +QNL T  +   S    +   
Sbjct: 110  AVDPPNHTRSSGNNLSNANVKTLSAPVGEHSRSNNPPNLDQNLDTEPESSISQWGELQLN 169

Query: 1238 NRASPNATNSNQALTNLTDSVSNTNQESPTS 1268
                  ++  ++  T+      + N + PTS
Sbjct: 170  PSGKTLSSQPSRKPTSENPKSESDNSKLPTS 200


>gnl|CDD|153367 cd07683, F-BAR_srGAP1, The F-BAR (FES-CIP4 Homology and
            Bin/Amphiphysin/Rvs) domain of Slit-Robo GTPase
            Activating Protein 1.  F-BAR domains are dimerization
            modules that bind and bend membranes and are found in
            proteins involved in membrane dynamics and actin
            reorganization. Slit-Robo GTPase Activating Proteins
            (srGAPs) are Rho GAPs that interact with Robo1, the
            transmembrane receptor of Slit proteins. Slit proteins
            are secreted proteins that control axon guidance and the
            migration of neurons and leukocytes. Vertebrates contain
            three isoforms of srGAPs. srGAP1, also called Rho
            GTPase-Activating Protein 13 (ARHGAP13), is a Cdc42- and
            RhoA-specific GAP and is expressed later in the
            development of CNS (central nervous system) tissues. It
            is an important downstream signaling molecule of Robo1.
            srGAP1 contains an N-terminal F-BAR domain, a Rho GAP
            domain, and a C-terminal SH3 domain. F-BAR domains form
            banana-shaped dimers with a positively-charged concave
            surface that binds to negatively-charged lipid membranes.
            They can induce membrane deformation in the form of long
            tubules.
          Length = 253

 Score = 31.6 bits (71), Expect = 2.9
 Identities = 18/61 (29%), Positives = 31/61 (50%), Gaps = 3/61 (4%)

Query: 950  ASRREETMRSRQSDPLPAVDSRKSQDRKSVRNQVRKAEHVSLKSKAIFSNEESRGLSNRH 1009
            A ++EE    R  DP   V   + +DR   R+ V+K E +  K +A +S  + + +  R+
Sbjct: 160  AEKQEEKQIGRSGDP---VFHIRLEDRHQRRSSVKKIEKMKEKRQAKYSENKLKSIKARN 216

Query: 1010 E 1010
            E
Sbjct: 217  E 217


>gnl|CDD|132697 TIGR03658, IsdH_HarA, haptoglobin-binding heme uptake protein HarA.
            HarA is a heme-binding NEAT-domain (NEAr Transporter,
            pfam05031) protein which has been shown to bind to the
            haptoglobin-hemoglobin complex in order to extract heme
            from it. HarA has also been reported to bind hemoglobin
            directly. HarA (also known as IsdH) contains three NEAT
            domains as well as a sortase A C-terminal signal for
            localization to the cell wall. The heme bound at the
            third of these NEAT domains has been shown to be
            transferred to the IsdA protein also localized at the
            cell wall, presumably through an additional specific
            protein-protein interaction. Haptoglobin is a hemoglobin
            carrier protein involved in scavenging hemoglobin in the
            blood following red blood cell lysis and targeting it to
            the liver.
          Length = 895

 Score = 31.8 bits (71), Expect = 3.1
 Identities = 23/85 (27%), Positives = 38/85 (44%), Gaps = 2/85 (2%)

Query: 1188 SSDFMSNSNSSSRSNANQSLPKSNQSLPNSNQNLPTSNQVPSST--DSVSNTNRASPNAT 1245
            SS   SN  +++ SN N S   +  + P +  N+    Q  SS   D  S+      N+ 
Sbjct: 244  SSSDASNQTNTNTSNQNTSTINNANNQPQATTNMSQPAQPKSSANADQASSQPAHETNSN 303

Query: 1246 NSNQALTNLTDSVSNTNQESPTSTE 1270
             +    TN + + S+ NQ+ P + E
Sbjct: 304  GNTNDKTNESSNQSDVNQQYPPADE 328



 Score = 31.0 bits (69), Expect = 5.6
 Identities = 18/56 (32%), Positives = 30/56 (53%), Gaps = 1/56 (1%)

Query: 103 PKGSSNESQCSQSQSSPQRSSQSNNSPQNNSQSNNSSQSQRSPQSQSTSQSSSQTP 158
           PK S+N  Q S SQ + + +S  N + + N  SN S  +Q+ P +  + Q + + P
Sbjct: 283 PKSSANADQAS-SQPAHETNSNGNTNDKTNESSNQSDVNQQYPPADESLQDAIKNP 337


>gnl|CDD|216927 pfam02203, TarH, Tar ligand binding domain homologue. 
          Length = 146

 Score = 30.4 bits (69), Expect = 3.2
 Identities = 14/90 (15%), Positives = 31/90 (34%), Gaps = 8/90 (8%)

Query: 1191 FMSNSNSSSRSNANQSLPKSNQSLPNSNQNLPTS----NQV---PSSTDSVSNTNRASPN 1243
             + ++N +       SL +   +L ++   L  +    N+      + D+     RA  +
Sbjct: 7    GLQSANDALDEVYTNSL-QQQAALADAWVLLLQARLTLNRAAMLGDAPDAAELLARARKS 65

Query: 1244 ATNSNQALTNLTDSVSNTNQESPTSTELAK 1273
               S +A           ++E   + EL  
Sbjct: 66   LAQSEKAWKAYLALPKTADEEEALADELKA 95


>gnl|CDD|191716 pfam07263, DMP1, Dentin matrix protein 1 (DMP1).  This family
           consists of several mammalian dentin matrix protein 1
           (DMP1) sequences. The dentin matrix acidic
           phosphoprotein 1 (DMP1) gene has been mapped to human
           chromosome 4q21. DMP1 is a bone and teeth specific
           protein initially identified from mineralised dentin.
           DMP1 is primarily localised in the nuclear compartment
           of undifferentiated osteoblasts. In the nucleus, DMP1
           acts as a transcriptional component for activation of
           osteoblast-specific genes like osteocalcin. During the
           early phase of osteoblast maturation, Ca(2+) surges into
           the nucleus from the cytoplasm, triggering the
           phosphorylation of DMP1 by a nuclear isoform of casein
           kinase II. This phosphorylated DMP1 is then exported out
           into the extracellular matrix, where it regulates
           nucleation of hydroxyapatite. DMP1 is a unique molecule
           that initiates osteoblast differentiation by
           transcription in the nucleus and orchestrates
           mineralised matrix formation extracellularly, at later
           stages of osteoblast maturation. The DMP1 gene has been
           found to be ectopically expressed in lung cancer
           although the reason for this is unknown.
          Length = 514

 Score = 31.6 bits (71), Expect = 3.7
 Identities = 21/59 (35%), Positives = 32/59 (54%)

Query: 98  QSTQAPKGSSNESQCSQSQSSPQRSSQSNNSPQNNSQSNNSSQSQRSPQSQSTSQSSSQ 156
           QST+    S +    S S+ SP+ +   N+S Q   QS+++S   RS +SQS   S S+
Sbjct: 393 QSTEEQADSESNESLSSSEESPESTEDENSSSQEGLQSHSASTESRSQESQSEQDSRSE 451


>gnl|CDD|139585 PRK13460, PRK13460, F0F1 ATP synthase subunit B; Provisional.
          Length = 173

 Score = 30.8 bits (69), Expect = 3.8
 Identities = 25/98 (25%), Positives = 47/98 (47%), Gaps = 10/98 (10%)

Query: 965  LPAVDSRKSQDRKSVRNQVRKAEHVSLKSKAIFSNEESRGLSNRHEANRRSKRSSKKSVE 1024
            L A+D R S     V+N + KA  + L+++A+  + E+R  S + EAN     +   +++
Sbjct: 42   LKALDERAS----GVQNDINKASELRLEAEALLKDYEARLNSAKDEANAIVAEAKSDALK 97

Query: 1025 -----PTPAQNDENSLENDLVRALNLALRTGALMSPQN 1057
                      N+  + ++  V+ + LA +  AL   QN
Sbjct: 98   LKNKLLEETNNEVKAQKDQAVKEIELA-KGKALSQLQN 134


>gnl|CDD|173412 PTZ00121, PTZ00121, MAEBL; Provisional.
          Length = 2084

 Score = 31.6 bits (71), Expect = 4.4
 Identities = 44/331 (13%), Positives = 105/331 (31%), Gaps = 54/331 (16%)

Query: 951  SRREETMRSRQSDP-LPAVDSRKSQDRKSVRNQVRKAEHVSLKSKAIFSNE--------- 1000
            +R EE M+  + +  + A +++K+++ K    +++KAE    K + +   E         
Sbjct: 1592 ARIEEVMKLYEEEKKMKAEEAKKAEEAKIKAEELKKAEEEKKKVEQLKKKEAEEKKKAEE 1651

Query: 1001 ----ESRGLSNRHEANRRSKRSSKKSVEPTPAQNDENSLENDLVRALNLALRTGALMSPQ 1056
                E        E  ++++   KK+ E   A+ DE      L +    A +   L   +
Sbjct: 1652 LKKAEEENKIKAAEEAKKAEEDKKKAEEAKKAEEDEKKAAEALKKEAEEAKKAEELKKKE 1711

Query: 1057 NPSAQRPKSFDLEEFAPRAMVPLHSSSPNIQRKVSSLDRKRRSQQNNPNQHRS------- 1109
                ++ +     E   +             +K +   +K   ++      +        
Sbjct: 1712 AEEKKKAEELKKAEEENKIKAEEAKKEAEEDKKKAEEAKKDEEEKKKIAHLKKEEEKKAE 1771

Query: 1110 ----------------EDVKAYRSRTEDLKDCLSRTENVNASTNSTKEYRSLVDTASEVR 1153
                            ED K      + +KD      N+             ++ + E+ 
Sbjct: 1772 EIRKEKEAVIEEELDEEDEKRRMEVDKKIKDIFDNFANIIEGGKEGNLV---INDSKEME 1828

Query: 1154 LDEMGHVPDVLRNCVRRDAKVIQ--NFPNSNEVLPSSSDFMSNSNSSSRSNANQSL---- 1207
               +  V D  +N    +A   +   F  +NE   +  D    ++ +   +  +      
Sbjct: 1829 DSAIKEVADS-KNMQLEEADAFEKHKFNKNNE---NGEDGNKEADFNKEKDLKEDDEEEI 1884

Query: 1208 --PKSNQSLPNSN--QNLPTSNQVPSSTDSV 1234
                  + +   +  + +P +N    + D +
Sbjct: 1885 EEADEIEKIDKDDIEREIPNNNMAGKNNDII 1915


>gnl|CDD|216289 pfam01080, Presenilin, Presenilin.  Mutations in presenilin-1 are a
            major cause of early onset Alzheimer's disease. It has
            been found that presenilin-1 binds to beta-catenin
            in vivo. This family also contains SPE proteins from
            C. elegans.
          Length = 403

 Score = 31.3 bits (71), Expect = 4.5
 Identities = 36/168 (21%), Positives = 61/168 (36%), Gaps = 13/168 (7%)

Query: 1143 RSLVDTASEVRLDEMGHVPDVLRNCVRRDAKVIQNFPNSNEVLPSSSDFMSNSNSSSRSN 1202
            R LV+TA E R + +   P ++ +       V  N   +NE  PS+    S S  S+   
Sbjct: 200  RMLVETAQE-RNEPI--FPALIYSSTVVVLTVGSNQEETNEGTPSTIRRTSKSTRSAA-- 254

Query: 1203 ANQSLPKSNQSLPNSNQNLPTSNQVPSSTDSVSNTNRASPNATNSNQALTNLTDSVSNTN 1262
               S P S+ +L    ++         S  S + +   S  A   + A      S S  +
Sbjct: 255  NPDSAPTSHSTLELPEKSSTPELSDDESDSSETESQSDSSLAPEEDAAEQPEVQSNSLPS 314

Query: 1263 QESPTSTELAKTSSGGFG--LLHSLLASRSAFRSRPQTCEPTLVRVSF 1308
             E     E  +    G G  + +S+L  +++      T +       F
Sbjct: 315  NEKREEEE-ERGVKLGLGDFIFYSVLVGKAS-----ATGDWNTTIACF 356


>gnl|CDD|197320 cd09086, ExoIII-like_AP-endo, Escherichia coli exonuclease III
           (ExoIII) and Neisseria meningitidis NExo-like subfamily
           of the ExoIII family purinic/apyrimidinic (AP)
           endonucleases.  This subfamily includes Escherichia coli
           ExoIII, Neisseria meningitidis NExo, and related
           proteins. These are ExoIII family AP endonucleases and
           they belong to the large EEP
           (exonuclease/endonuclease/phosphatase) superfamily that
           contains functionally diverse enzymes that share a
           common catalytic mechanism of cleaving phosphodiester
           bonds. AP endonucleases participate in the DNA base
           excision repair (BER) pathway. AP sites are one of the
           most common lesions in cellular DNA. During BER, the
           damaged DNA is first recognized by DNA glycosylase. AP
           endonucleases then catalyze the hydrolytic cleavage of
           the phosphodiester bond 5' to the AP site, and this is
           followed by the coordinated actions of DNA polymerase,
           deoxyribose phosphatase, and DNA ligase. If left
           unrepaired, AP sites block DNA replication, and have
           both mutagenic and cytotoxic effects. AP endonucleases
           can carry out a variety of excision and incision
           reactions on DNA, including 3'-5' exonuclease,
           3'-deoxyribose phosphodiesterase, 3'-phosphatase, and
           occasionally, nonspecific DNase activities. Different AP
           endonuclease enzymes catalyze the different reactions
           with different efficiencies. Many organisms have two AP
           endonucleases, usually one is the dominant AP
           endonuclease, the other has weak AP endonuclease
           activity. For example, Neisseria meningitidis Nape and
           NExo, and exonuclease III (ExoIII) and endonuclease IV
           (EndoIV) in Escherichia coli. NExo and ExoIII are found
           in this subfamily. NExo is the non-dominant AP
           endonuclease. It exhibits strong 3'-5' exonuclease and
           3'-deoxyribose phosphodiesterase activities. Escherichia
           coli ExoIII is an active AP endonuclease, and in
           addition, it exhibits double strand (ds)-specific 3'-5'
           exonuclease, exonucleolytic RNase H,
           3'-phosphomonoesterase and 3'-phosphodiesterase
           activities, all catalyzed by a single active site. Class
           II AP endonucleases have been classified into two
           families, designated ExoIII and EndoIV, based on their
           homology to the Escherichia coli enzymes ExoIII and
           endonuclease IV (EndoIV). This subfamily belongs to the
           ExoIII family; the EndoIV family belongs to a different
           superfamily.
          Length = 254

 Score = 30.9 bits (71), Expect = 4.5
 Identities = 10/47 (21%), Positives = 17/47 (36%), Gaps = 19/47 (40%)

Query: 797 FTDMVRKFHPLKTQPVWSWFVSSNNILMRIWVSTLVPNYQVGRRQRN 843
           F D  R  HP +    ++W+                 +Y+ G  +RN
Sbjct: 184 FVDAFRALHPDEKL--FTWW-----------------DYRAGAFERN 211


>gnl|CDD|227952 COG5665, NOT5, CCR4-NOT transcriptional regulation complex, NOT5
           subunit [Transcription].
          Length = 548

 Score = 31.2 bits (70), Expect = 4.7
 Identities = 20/64 (31%), Positives = 34/64 (53%), Gaps = 1/64 (1%)

Query: 96  SVQSTQAPKGSSNESQCSQSQSSP-QRSSQSNNSPQNNSQSNNSSQSQRSPQSQSTSQSS 154
           S  + +APK  +N++  S  +SS  Q  S    +PQ +   ++ + +  +P  +S SQS 
Sbjct: 203 SSSNNEAPKEGNNQTSLSSIRSSKKQERSPKKKAPQRDVSISDRATTPIAPGVESASQSI 262

Query: 155 SQTP 158
           S TP
Sbjct: 263 SSTP 266


>gnl|CDD|233696 TIGR02038, protease_degS, periplasmic serine pepetdase DegS.  This
            family consists of the periplasmic serine protease DegS
            (HhoB), a shorter paralog of protease DO (HtrA, DegP) and
            DegQ (HhoA). It is found in E. coli and several other
            Proteobacteria of the gamma subdivision. It contains a
            trypsin domain and a single copy of PDZ domain (in
            contrast to DegP with two copies). A critical role of
            this DegS is to sense stress in the periplasm and
            partially degrade an inhibitor of sigma(E) [Protein fate,
            Degradation of proteins, peptides, and glycopeptides,
            Regulatory functions, Protein interactions].
          Length = 351

 Score = 30.9 bits (70), Expect = 4.9
 Identities = 13/36 (36%), Positives = 22/36 (61%), Gaps = 1/36 (2%)

Query: 1333 GIFVKTIFPHGQAAESGLLVEGDEILLFNNEPLQGR 1368
            GI +  + P+G AA +G+LV  D IL ++ + + G 
Sbjct: 279  GIVITGVDPNGPAARAGILV-RDVILKYDGKDVIGA 313


>gnl|CDD|237533 PRK13863, PRK13863, type IV secretion system T-DNA border
           endonuclease VirD2; Provisional.
          Length = 446

 Score = 31.1 bits (70), Expect = 5.1
 Identities = 17/85 (20%), Positives = 32/85 (37%), Gaps = 5/85 (5%)

Query: 116 QSSPQRS----SQSNNSPQNNSQSNNSSQSQRSPQSQSTSQSSSQTPCGSQNKGAIRSDL 171
           + SP       SQS ++    +       ++R  + Q+ S+   Q P GS  K   R   
Sbjct: 218 EFSPGEDHREPSQSFDTSPGEAPQGEPESAERPEKLQNESEVRLQEPAGSSIKADAR-IR 276

Query: 172 TLSENRPLFSPSLNEEVMDMNYLYN 196
              E+     PS ++  +  ++   
Sbjct: 277 VSLESERRAQPSASKIPVADDFGIE 301


>gnl|CDD|114011 pfam05262, Borrelia_P83, Borrelia P83/100 protein.  This family
            consists of several Borrelia P83/P100 antigen proteins.
          Length = 489

 Score = 30.7 bits (69), Expect = 6.4
 Identities = 18/95 (18%), Positives = 37/95 (38%), Gaps = 7/95 (7%)

Query: 954  EETMRSRQSDPLPAVDSRKSQDRKSVRNQVRKAEHVSLKSK----AIFSNEESRGLSNRH 1009
             +  +  ++ P PA  S   +D++   NQ R+ E   ++ K         ++ +    + 
Sbjct: 251  RQKQQEAKNLPKPADTSSPKEDKQVAENQKREIEKAQIEIKKNDEEALKAKDHKAFDLKQ 310

Query: 1010 EANRRSKRSSKKSVEPTPAQNDENSLENDLVRALN 1044
            E+    K +  K +E   AQ     +  DL +   
Sbjct: 311  ESKASEKEAEDKELE---AQKKREPVAEDLQKTKP 342


>gnl|CDD|177266 PHA00101, PHA00101, internal virion protein B.
          Length = 194

 Score = 29.8 bits (67), Expect = 7.0
 Identities = 14/74 (18%), Positives = 32/74 (43%)

Query: 96  SVQSTQAPKGSSNESQCSQSQSSPQRSSQSNNSPQNNSQSNNSSQSQRSPQSQSTSQSSS 155
           ++   QA  G++ +++   +Q    R        + N Q+ N S   R    +++++ + 
Sbjct: 10  AMMGAQAIMGANQQAKAEGAQIDAGRRQAMEMVKEMNIQNANLSLEARDKLEEASAELTE 69

Query: 156 QTPCGSQNKGAIRS 169
                 +N G IR+
Sbjct: 70  ANMQKVRNMGTIRA 83


>gnl|CDD|221391 pfam12042, RP1-2, Tubuliform egg casing silk strands structural
           domain.  Spiders use fibroins to make silk strands. This
           family includes tubuliform silk fibroins which are used
           to protect egg cases. This domain is a structural domain
           which is found in repeats of up to 20 in many
           individuals (although this is not necessarily the case).
           RP1 makes up structural domains in the N terminal while
           RP2 makes up structural domains in the C terminal.
          Length = 167

 Score = 29.8 bits (67), Expect = 7.6
 Identities = 17/75 (22%), Positives = 38/75 (50%), Gaps = 2/75 (2%)

Query: 88  SSIWRRVGSVQSTQAPKGSSNESQCSQSQSSPQRSSQSNNSPQNNSQSNNSSQSQRSPQS 147
           ++I   +G   + Q    +SN S  + S +S   +S ++++ Q  S S  ++ +Q   Q+
Sbjct: 91  NAISNAIGQFLAGQGVLNASNASSLASSFASALSASAASSAAQAASASAAAAAAQ--SQA 148

Query: 148 QSTSQSSSQTPCGSQ 162
            +++ S + +   SQ
Sbjct: 149 AASAFSQAASQSSSQ 163


>gnl|CDD|204002 pfam08613, Cyclin, Cyclin.  This family includes many different
           cyclin proteins. Members include the G1/S-specific
           cyclin pas1, and the phosphate system cyclin
           PHO80/PHO85.
          Length = 140

 Score = 29.3 bits (66), Expect = 7.9
 Identities = 12/35 (34%), Positives = 16/35 (45%), Gaps = 2/35 (5%)

Query: 343 SDSTTRQSEQHSPNSEHFNFGTPSPPLIEVGDYLS 377
           +DST   S   S   E   F + + P I +  YLS
Sbjct: 18  NDSTATASSSASSPLE--PFYSKAVPSISLTQYLS 50


>gnl|CDD|227430 COG5099, COG5099, RNA-binding protein of the Puf family,
            translational repressor [Translation, ribosomal structure
            and biogenesis].
          Length = 777

 Score = 30.5 bits (69), Expect = 9.0
 Identities = 24/99 (24%), Positives = 39/99 (39%), Gaps = 9/99 (9%)

Query: 1179 PNSNEVLPSSSDFMSNSNSSSRSNANQSLPKSNQSLPNSNQNLPTSNQVPSSTDSVSNTN 1238
               +     S   +    SSS + +  S  KSN +L ++ Q    S        SV+ ++
Sbjct: 86   VAISSSTSGSQSLLMELPSSSFNPSTSSRNKSNSALSSTQQGNANS--------SVTLSS 137

Query: 1239 RASPNATNSNQALT-NLTDSVSNTNQESPTSTELAKTSS 1276
              + +  NSN+    N   S S T  +S +S      SS
Sbjct: 138  STASSMFNSNKLPLPNPNHSNSATTNQSGSSFINTPASS 176


>gnl|CDD|223023 PHA03249, PHA03249, DNA packaging tegument protein UL25;
           Provisional.
          Length = 653

 Score = 30.4 bits (68), Expect = 9.8
 Identities = 28/142 (19%), Positives = 45/142 (31%), Gaps = 11/142 (7%)

Query: 25  PLTEAPDMDTRSNDGAVSKRKSSTWTKVAKTFDLMRKSEARQCSEAGPSSQGDGGEDGRM 84
           P   AP  D    +  +S   SS+  K   +F+++   E    SEA        G  GR 
Sbjct: 33  PRPRAPTEDLDRMEAGLSSYSSSSDNK--SSFEVVS--ETDSGSEAEAERGRRAGMGGRN 88

Query: 85  ------RRRSSIWRRVGSVQSTQAPKGSSNESQCSQSQSSPQRSSQSNNSPQNNSQSNNS 138
                 RR  +   R  S+ +              +S       S  + S ++     NS
Sbjct: 89  KATKPSRRNKTTQCRPTSL-ALATAATMPATPSSGKSPKVSSPPSIPSLSEEDEGAERNS 147

Query: 139 SQSQRSPQSQSTSQSSSQTPCG 160
                S     ++QS  +    
Sbjct: 148 GGDDSSHTDNESTQSQPEADDE 169


>gnl|CDD|222912 PHA02624, PHA02624, large T antigen; Provisional.
          Length = 647

 Score = 30.3 bits (69), Expect = 9.9
 Identities = 28/118 (23%), Positives = 48/118 (40%), Gaps = 17/118 (14%)

Query: 55  TFDLMRKSEARQCSEAGPSSQGDGGEDGRMRRRSSIWRRV---GSVQSTQAPKGSSNESQ 111
              LMRK+  R+C E  P     GG++ +M+R +S+++++               S+E  
Sbjct: 26  NLPLMRKAYLRKCKEYHPDK---GGDEEKMKRLNSLYKKLQEGVKSARQSFGTQDSSEIP 82

Query: 112 CSQSQSSPQRSSQSNNSPQNN-------SQSNNSSQSQ---RSPQSQSTSQSSSQ-TP 158
              +    Q   + N     +       S S++  +       P  +  SQSSSQ TP
Sbjct: 83  TYGTPEWEQWWEEFNEKWDEDLFCDEELSSSDDEDEPPPPSPPPSQEEESQSSSQATP 140


  Database: CDD.v3.10
    Posted date:  Mar 20, 2013  7:55 AM
  Number of letters in database: 10,937,602
  Number of sequences in database:  44,354
  
Lambda     K      H
   0.309    0.125    0.353 

Gapped
Lambda     K      H
   0.267   0.0695    0.140 


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 44354
Number of Hits to DB: 68,046,114
Number of extensions: 6354281
Number of successful extensions: 4884
Number of sequences better than 10.0: 1
Number of HSP's gapped: 4567
Number of HSP's successfully gapped: 184
Length of query: 1454
Length of database: 10,937,602
Length adjustment: 109
Effective length of query: 1345
Effective length of database: 6,103,016
Effective search space: 8208556520
Effective search space used: 8208556520
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.1 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 42 (21.7 bits)
S2: 65 (28.9 bits)