RPS-BLAST 2.2.26 [Sep-21-2011]

Database: CDD.v3.10 
           44,354 sequences; 10,937,602 total letters

Searching..................................................done

Query= psy15738
         (187 letters)



>gnl|CDD|197706 smart00408, IGc2, Immunoglobulin C-2 Type. 
          Length = 63

 Score = 68.6 bits (168), Expect = 4e-16
 Identities = 28/62 (45%), Positives = 36/62 (58%), Gaps = 1/62 (1%)

Query: 113 GYSVTLECKADGNPVPNITWTRKNNNLPGGEYSYSGNS-LTVRHTNRHSAGIYLCVANNM 171
           G SVTL C A+GNPVPNITW +    LP      +  S LT++  +   +G+Y CVA N 
Sbjct: 2   GQSVTLTCPAEGNPVPNITWLKDGKPLPESNRFVASGSTLTIKSVSLEDSGLYTCVAENS 61

Query: 172 VG 173
            G
Sbjct: 62  AG 63


>gnl|CDD|143202 cd05725, Ig3_Robo, Third immunoglobulin (Ig)-like domain in Robo
           (roundabout) receptors.  Ig3_Robo: domain similar to the
           third immunoglobulin (Ig)-like domain in Robo
           (roundabout) receptors. Robo receptors play a role in
           the development of the central nervous system (CNS), and
           are receptors of Slit protein. Slit is a repellant
           secreted by the neural cells in the midline. Slit acts
           through Robo to prevent most neurons from crossing the
           midline from either side. Three mammalian Robo homologs
           (robo1, -2, and -3), and three mammalian Slit homologs
           (Slit-1, -2, and -3), have been identified. Commissural
           axons, which cross the midline, express low levels of
           Robo; longitudinal axons, which avoid the midline,
           express high levels of Robo. robo1, -2, and -3 are
           expressed by commissural neurons in the vertebrate
           spinal cord and Slits 1, -2, -3 are expressed at the
           ventral midline. Robo-3 is a divergent member of the
           Robo family which, instead of being a positive regulator
           of slit responsiveness, antagonizes slit responsiveness
           in precrossing axons.  The Slit-Robo interaction is
           mediated by the second leucine-rich repeat (LRR) domain
           of Slit and the two N-terminal Ig domains of Robo, Ig1
           and Ig2. The primary Robo binding site for Slit2 has
           been shown by surface plasmon resonance experiments and
           mutational analysis to be the Ig1 domain, while the
           Ig2 domain has been proposed to harbor a weak secondary
           binding site.
          Length = 69

 Score = 63.2 bits (154), Expect = 8e-14
 Identities = 22/66 (33%), Positives = 33/66 (50%), Gaps = 1/66 (1%)

Query: 116 VTLECKADGNPVPNITWTRKNNNLPGGEYS-YSGNSLTVRHTNRHSAGIYLCVANNMVGS 174
           V  +C+  G+PVP + W +++  LP G        SL +R+      G Y C A NMVG 
Sbjct: 1   VEFQCEVGGDPVPTVLWRKEDGELPKGRAEILDDKSLKIRNVTAGDEGSYTCEAENMVGK 60

Query: 175 SAAASI 180
             A++ 
Sbjct: 61  IEASAS 66


>gnl|CDD|191810 pfam07679, I-set, Immunoglobulin I-set domain. 
          Length = 90

 Score = 62.7 bits (153), Expect = 2e-13
 Identities = 27/82 (32%), Positives = 38/82 (46%), Gaps = 6/82 (7%)

Query: 108 VEVKKGYSVTLECKADGNPVPNITWTRKNNNLPGGE---YSYSGN--SLTVRHTNRHSAG 162
           VEV++G S    C   G+P P ++W +    L   +    +Y G   +LT+ +      G
Sbjct: 10  VEVQEGESARFTCTVTGDPDPTVSWFKDGQPLRSSDRFKVTYEGGTYTLTISNVQPDDEG 69

Query: 163 IYLCVANNMVGSSAAASIALHV 184
            Y CVA N  G  A AS  L V
Sbjct: 70  KYTCVATNSAG-EAEASAELTV 90


>gnl|CDD|143207 cd05730, Ig3_NCAM-1_like, Third immunoglobulin (Ig)-like domain of
           Neural Cell Adhesion Molecule NCAM-1 (NCAM).
           Ig3_NCAM-1_like: domain similar to the third
           immunoglobulin (Ig)-like domain of Neural Cell Adhesion
           Molecule NCAM-1 (NCAM). NCAM plays important roles in
           the development and regeneration of the central nervous
           system, in synaptogenesis and neural migration. NCAM
           mediates cell-cell and cell-substratum recognition and
           adhesion via homophilic (NCAM-NCAM) and heterophilic
           (NCAM-non-NCAM) interactions. NCAM is expressed as
           three major isoforms having different intracellular
           extensions. The extracellular portion of NCAM has five
           N-terminal Ig-like domains and two fibronectin type III
           domains. The double zipper adhesion complex model for
           NCAM homophilic binding involves Ig1, Ig2, and Ig3. By
           this model, Ig1 and Ig2 mediate dimerization of NCAM
           molecules situated on the same cell surface (cis
           interactions), and Ig3 domains mediate interactions
           between NCAM molecules expressed on the surface of
           opposing cells (trans interactions), through binding to
           the Ig1 and Ig2 domains. The adhesive ability of NCAM is
           modulated by the addition of polysialic acid chains to
           the fifth Ig-like domain.
          Length = 95

 Score = 62.3 bits (151), Expect = 3e-13
 Identities = 28/79 (35%), Positives = 39/79 (49%), Gaps = 5/79 (6%)

Query: 113 GYSVTLECKADGNPVPNITWTRKNNNLPGGE----YSYSGNSLTVRHTNRHSAGIYLCVA 168
           G SVTL C ADG P P +TWT+    +  GE    ++  G+ +T+   ++     Y C+A
Sbjct: 18  GQSVTLACDADGFPEPTMTWTKDGEPIESGEEKYSFNEDGSEMTILDVDKLDEAEYTCIA 77

Query: 169 NNMVGSSAAASIALHVLCK 187
            N  G    A I L V  K
Sbjct: 78  ENKAGEQ-EAEIHLKVFAK 95


>gnl|CDD|214653 smart00410, IG_like, Immunoglobulin like.  IG domains that cannot
           be classified into one of IGv1, IGc1, IGc2, IG.
          Length = 85

 Score = 58.3 bits (141), Expect = 7e-12
 Identities = 30/85 (35%), Positives = 43/85 (50%), Gaps = 7/85 (8%)

Query: 106 GKVEVKKGYSVTLECKADGNPVPNITWTRKNNNLPG------GEYSYSGNSLTVRHTNRH 159
             V VK+G SVTL C+A G+P P +TW ++   L           S S ++LT+ +    
Sbjct: 2   PSVTVKEGESVTLSCEASGSPPPEVTWYKQGGKLLAESGRFSVSRSGSTSTLTISNVTPE 61

Query: 160 SAGIYLCVANNMVGSSAAASIALHV 184
            +G Y C A N  G SA++   L V
Sbjct: 62  DSGTYTCAATNSSG-SASSGTTLTV 85


>gnl|CDD|214652 smart00409, IG, Immunoglobulin. 
          Length = 85

 Score = 58.3 bits (141), Expect = 7e-12
 Identities = 30/85 (35%), Positives = 43/85 (50%), Gaps = 7/85 (8%)

Query: 106 GKVEVKKGYSVTLECKADGNPVPNITWTRKNNNLPG------GEYSYSGNSLTVRHTNRH 159
             V VK+G SVTL C+A G+P P +TW ++   L           S S ++LT+ +    
Sbjct: 2   PSVTVKEGESVTLSCEASGSPPPEVTWYKQGGKLLAESGRFSVSRSGSTSTLTISNVTPE 61

Query: 160 SAGIYLCVANNMVGSSAAASIALHV 184
            +G Y C A N  G SA++   L V
Sbjct: 62  DSGTYTCAATNSSG-SASSGTTLTV 85


>gnl|CDD|143277 cd05869, Ig5_NCAM-1, Fifth immunoglobulin (Ig)-like domain of
           Neural Cell Adhesion Molecule NCAM-1 (NCAM).
           Ig5_NCAM-1: The fifth immunoglobulin (Ig)-like domain of
           Neural Cell Adhesion Molecule NCAM-1 (NCAM). NCAM plays
           important roles in the development and regeneration of
           the central nervous system, in synaptogenesis and neural
           migration. NCAM mediates cell-cell and cell-substratum
           recognition and adhesion via homophilic (NCAM-NCAM) and
           heterophilic (NCAM-non-NCAM) interactions. NCAM is
           expressed as three major isoforms having different
           intracellular extensions. The extracellular portion of
           NCAM has five N-terminal Ig-like domains and two
           fibronectin type III domains. The double zipper adhesion
           complex model for NCAM homophilic binding involves Ig1,
           Ig2, and Ig3. By this model, Ig1 and Ig2 mediate
           dimerization of NCAM molecules situated on the same cell
           surface (cis interactions), and Ig3 domains mediate
           interactions between NCAM molecules expressed on the
           surface of opposing cells (trans interactions), through
           binding to the Ig1 and Ig2 domains. The adhesive ability
           of NCAM is modulated by the addition of polysialic acid
           chains to the fifth Ig-like domain.
          Length = 97

 Score = 57.7 bits (139), Expect = 2e-11
 Identities = 29/92 (31%), Positives = 49/92 (53%), Gaps = 12/92 (13%)

Query: 97  PRIIYVSGAGKVEVKKGYSVTLECKADGNPVPNITWTRKNNNLPGGEYSYSGN------- 149
           P+I YV     +E+++   +TL C+A G+P+P+ITW     N+   E +  G+       
Sbjct: 3   PKITYVENQTAMELEE--QITLTCEASGDPIPSITWRTSTRNISSEEKTLDGHIVVRSHA 60

Query: 150 ---SLTVRHTNRHSAGIYLCVANNMVGSSAAA 178
              SLT+++     AG YLC A+N +G  + +
Sbjct: 61  RVSSLTLKYIQYTDAGEYLCTASNTIGQDSQS 92


>gnl|CDD|143209 cd05732, Ig5_NCAM-1_like, Fifth immunoglobulin (Ig)-like domain of
           Neural Cell Adhesion Molecule NCAM-1 (NCAM) and similar
           proteins.  Ig5_NCAM-1 like: domain similar to the fifth
           immunoglobulin (Ig)-like domain of Neural Cell Adhesion
           Molecule NCAM-1 (NCAM). NCAM plays important roles in
           the development and regeneration of the central nervous
           system, in synaptogenesis and neural migration. NCAM
           mediates cell-cell and cell-substratum recognition and
           adhesion via homophilic (NCAM-NCAM) and heterophilic
           (NCAM-non-NCAM) interactions. NCAM is expressed as
           three major isoforms having different intracellular
           extensions. The extracellular portion of NCAM has five
           N-terminal Ig-like domains and two fibronectin type III
           domains. The double zipper adhesion complex model for
           NCAM homophilic binding involves Ig1, Ig2, and Ig3. By
           this model, Ig1 and Ig2 mediate dimerization of NCAM
           molecules situated on the same cell surface (cis
           interactions), and Ig3 domains mediate interactions
           between NCAM molecules expressed on the surface of
           opposing cells (trans interactions), through binding to
           the Ig1 and Ig2 domains. The adhesive ability of NCAM is
           modulated by the addition of polysialic acid chains to
           the fifth Ig-like domain. Also included in this group is
           NCAM-2 (also known as OCAM/mamFas II and RNCAM). NCAM-2
           is differentially expressed in the developing and mature
           olfactory epithelium (OE).
          Length = 96

 Score = 56.8 bits (137), Expect = 4e-11
 Identities = 30/87 (34%), Positives = 45/87 (51%), Gaps = 13/87 (14%)

Query: 97  PRIIYVSGAGKVEVKKGYSVTLECKADGNPVPNITWTRKNNNLPGGEYSYSGN------- 149
           P+I Y+     VE+++   +TL C+A+G+P+P ITW R   N   G+ S  G        
Sbjct: 3   PKITYLENQTAVELEQ---ITLTCEAEGDPIPEITWRRATRNFSEGDKSLDGRIVVRGHA 59

Query: 150 ---SLTVRHTNRHSAGIYLCVANNMVG 173
              SLT++      AG Y C A+N +G
Sbjct: 60  RVSSLTLKDVQLTDAGRYDCEASNRIG 86


>gnl|CDD|143165 cd00096, Ig, Immunoglobulin domain.  Ig: immunoglobulin (Ig) domain
           found in the Ig superfamily. The Ig superfamily is a
           heterogeneous group of proteins, built on a common fold
           comprised of a sandwich of two beta sheets. Members of
           this group are components of immunoglobulin, neuroglia,
           cell surface glycoproteins, such as T-cell receptors,
           CD2, CD4, CD8, and membrane glycoproteins, such as
           butyrophilin and chondroitin sulfate proteoglycan core
           protein. A predominant feature of most Ig domains is a
           disulfide bridge connecting the two beta-sheets with a
           tryptophan residue packed against the disulfide bond.
          Length = 74

 Score = 56.0 bits (134), Expect = 4e-11
 Identities = 25/68 (36%), Positives = 33/68 (48%), Gaps = 9/68 (13%)

Query: 116 VTLECKADGNPVPNITWTRKNNNLPGGEY--------SYSGNS-LTVRHTNRHSAGIYLC 166
           VTL C A G P P ITW +    LP            + SG+S LT+ +     +G Y C
Sbjct: 1   VTLTCLASGPPPPTITWLKNGKPLPSSVLTRVRSSRGTSSGSSTLTISNVTLEDSGTYTC 60

Query: 167 VANNMVGS 174
           VA+N  G+
Sbjct: 61  VASNSAGT 68


>gnl|CDD|206066 pfam13895, Ig_2, Immunoglobulin domain.  This domain contains
           immunoglobulin-like domains.
          Length = 80

 Score = 54.4 bits (131), Expect = 2e-10
 Identities = 26/77 (33%), Positives = 34/77 (44%), Gaps = 5/77 (6%)

Query: 108 VEVKKGYSVTLECKADGNPVPNITWTRKNNNLPGGEYSYSGNSLTVRHTNRHSAGIYLCV 167
             V +G  VTL C A GNP PN TW +    L       S N     + +   +G Y CV
Sbjct: 9   TVVFEGEDVTLTCSAPGNPPPNYTWYKDGVPLS-----SSQNGFFTPNVSAEDSGTYTCV 63

Query: 168 ANNMVGSSAAASIALHV 184
           A+N  G   +  + L V
Sbjct: 64  ASNGGGGKTSNPVTLTV 80


>gnl|CDD|143231 cd05754, Ig3_Perlecan_like, Third immunoglobulin (Ig)-like domain
           found in Perlecan and similar proteins.
           Ig3_Perlecan_like: domain similar to the third
           immunoglobulin (Ig)-like domain found in Perlecan.
           Perlecan is a large multi-domain heparan sulfate
           proteoglycan, important in tissue development and
           organogenesis.  Perlecan can be represented as 5 major
           portions; its fourth major portion (domain IV) is a
           tandem repeat of immunoglobulin-like domains (Ig2-Ig15),
           which can vary in size due to alternative splicing.
           Perlecan binds many cellular and extracellular ligands.
           Its domain IV region has many binding sites.  Some of
           these have been mapped at the level of individual
           Ig-like domains, including a site restricted to the Ig5
           domain for heparin/sulfatide, a site restricted to the
           Ig3 domain for nidogen-1 and nidogen-2, a site
           restricted to Ig4-5 for fibronectin, and sites
           restricted to Ig2 and to Ig13-15 for fibulin-2.
          Length = 85

 Score = 54.1 bits (130), Expect = 3e-10
 Identities = 24/78 (30%), Positives = 38/78 (48%), Gaps = 3/78 (3%)

Query: 107 KVEVKKGYSVTLECKA-DGNPVPNITWTRKNNNLPGGEYSYSGNSLTVRHTNRHSAGIYL 165
             EV+ G  V+  C+A   +P   + WTR    LP     ++G  LT+R+     AG Y+
Sbjct: 10  SQEVRPGADVSFICRAKSKSPAYTLVWTRVGGGLPSRAMDFNG-ILTIRNVQLSDAGTYV 68

Query: 166 CVANNMVGSSAAASIALH 183
           C  +NM   +  A+  L+
Sbjct: 69  CTGSNM-LDTDEATATLY 85


>gnl|CDD|143259 cd05851, Ig3_Contactin-1, Third Ig domain of contactin-1.
           Ig3_Contactin-1: Third Ig domain of the neural cell
           adhesion molecule contactin-1. Contactins are comprised
           of six Ig domains followed by four fibronectin type III
           (FnIII) domains anchored to the membrane by
           glycosylphosphatidylinositol. Contactin-1 is
           differentially expressed in tumor tissues and may,
           through a RhoA mechanism, facilitate invasion and
           metastasis of human lung adenocarcinoma.
          Length = 88

 Score = 53.5 bits (128), Expect = 5e-10
 Identities = 27/63 (42%), Positives = 33/63 (52%), Gaps = 1/63 (1%)

Query: 112 KGYSVTLECKADGNPVPNITWTRKNNNLPG-GEYSYSGNSLTVRHTNRHSAGIYLCVANN 170
           KG +VTLEC A GNPVP I W +    +P   E S SG  L + +      G Y C A N
Sbjct: 15  KGQNVTLECFALGNPVPVIRWRKILEPMPATAEISMSGAVLKIFNIQPEDEGTYECEAEN 74

Query: 171 MVG 173
           + G
Sbjct: 75  IKG 77


>gnl|CDD|143169 cd04968, Ig3_Contactin_like, Third Ig domain of contactin.
           Ig3_Contactin_like: Third Ig domain of contactins.
           Contactins are neural cell adhesion molecules and are
           comprised of six Ig domains followed by four fibronectin
           type III (FnIII) domains anchored to the membrane by
           glycosylphosphatidylinositol. The first four Ig domains
           form the intermolecular binding fragment, which arranges
           as a compact U-shaped module via contacts between Ig
           domains 1 and 4, and between Ig domains 2 and 3.
           Contactin-2 (TAG-1, axonin-1) may play a part in the
           neuronal processes of neurite outgrowth, axon guidance
           and fasciculation, and neuronal migration. This group
           also includes contactin-1 and contactin-5. The different
           contactins show different expression patterns in the
           central nervous system. During development and in
           adulthood, contactin-2 is transiently expressed in
           subsets of central and peripheral neurons. Contactin-5
           is expressed specifically in the rat postnatal nervous
           system, peaking at about 3 weeks postnatal, and a lack
           of contactin-5 (NB-2) results in an impairment of
           neuronal activity in the rat auditory system.
           Contactin-5 is highly expressed in the adult human brain
           in the occipital lobe and in the amygdala. Contactin-1
           is differentially expressed in tumor tissues and may,
           through a RhoA mechanism, facilitate invasion and
           metastasis of human lung adenocarcinoma.
          Length = 88

 Score = 53.2 bits (128), Expect = 6e-10
 Identities = 27/63 (42%), Positives = 35/63 (55%), Gaps = 1/63 (1%)

Query: 112 KGYSVTLECKADGNPVPNITWTRKNNNLPG-GEYSYSGNSLTVRHTNRHSAGIYLCVANN 170
           KG +VTLEC A GNPVP I W + + ++P   E S SG  L + +      G Y C A N
Sbjct: 15  KGQNVTLECFALGNPVPQIKWRKVDGSMPSSAEISMSGAVLKIPNIQFEDEGTYECEAEN 74

Query: 171 MVG 173
           + G
Sbjct: 75  IKG 77


>gnl|CDD|143317 cd07693, Ig1_Robo, First immunoglobulin (Ig)-like domain in Robo
           (roundabout) receptors and similar proteins.  Ig1_Robo:
           domain similar to the first immunoglobulin (Ig)-like
           domain in Robo (roundabout) receptors. Robo receptors
           play a role in the development of the central nervous
           system (CNS), and are receptors of Slit protein. Slit is
           a repellant secreted by the neural cells in the midline.
           Slit acts through Robo to prevent most neurons from
           crossing the midline from either side. Three mammalian
           Robo homologs (robo1, -2, and -3), and three mammalian
           Slit homologs (Slit-1, -2, and -3), have been identified.
           Commissural axons, which cross the midline, express low
           levels of Robo; longitudinal axons, which avoid the
           midline, express high levels of Robo. robo1, -2, and -3
           are expressed by commissural neurons in the vertebrate
           spinal cord and Slits 1, -2, -3 are expressed at the
           ventral midline. Robo-3 is a divergent member of the
           Robo family which, instead of being a positive regulator
           of slit responsiveness, antagonizes slit responsiveness
           in precrossing axons.  The Slit-Robo interaction is
           mediated by the second leucine-rich repeat (LRR) domain
           of Slit and the two N-terminal Ig domains of Robo, Ig1
           and Ig2. The primary Robo binding site for Slit2 has
           been shown by surface plasmon resonance experiments and
           mutational analysis to be the Ig1 domain, while the
           Ig2 domain has been proposed to harbor a weak secondary
           binding site.
          Length = 100

 Score = 50.2 bits (120), Expect = 1e-08
 Identities = 36/106 (33%), Positives = 49/106 (46%), Gaps = 24/106 (22%)

Query: 96  PPRIIYVSGAGKVEVKKGYSVTLECKADGNPVPNITW--------TRKNNN------LPG 141
           PPRI  V     + V KG   TL CKA+G P P I W        T K++       LP 
Sbjct: 1   PPRI--VEHPSDLIVSKGDPATLNCKAEGRPTPTIQWLKNGQPLETDKDDPRSHRIVLPS 58

Query: 142 GEYSYSGNSLTVRHTNRHS---AGIYLCVANNMVGSSAAASIALHV 184
           G   +    L V H  R      G+Y+CVA+N +G + + + +L V
Sbjct: 59  GSLFF----LRVVH-GRKGRSDEGVYVCVAHNSLGEAVSRNASLEV 99


>gnl|CDD|143208 cd05731, Ig3_L1-CAM_like, Third immunoglobulin (Ig)-like domain of
           the L1 cell adhesion molecule (CAM).  Ig3_L1-CAM_like:
           domain similar to the third immunoglobulin (Ig)-like
           domain of the L1 cell adhesion molecule (CAM). L1
           belongs to the L1 subfamily of cell adhesion molecules
           (CAMs) and is comprised of an extracellular region
           having six Ig-like domains and five fibronectin type III
           domains, a transmembrane region and an intracellular
           domain. L1 is primarily expressed in the nervous system
           and is involved in its development and function. L1 is
           associated with an X-linked recessive disorder, X-linked
           hydrocephalus, MASA syndrome, or spastic paraplegia type
           1, that involves abnormalities of axonal growth. This
           group also contains the chicken neuron-glia cell
           adhesion molecule, Ng-CAM and human neurofascin.
          Length = 71

 Score = 49.3 bits (118), Expect = 1e-08
 Identities = 20/62 (32%), Positives = 31/62 (50%), Gaps = 2/62 (3%)

Query: 116 VTLECKADGNPVPNITWTRKNNNLPGGEYSYSGNSLTVRHTNRHSA--GIYLCVANNMVG 173
           + LEC A+G P P I+W +    LP     +   + T++  N      G Y C A+N +G
Sbjct: 1   LLLECIAEGLPTPEISWIKIGGELPADRTKFENFNKTLKIDNVSEEDDGEYRCTASNSLG 60

Query: 174 SS 175
           S+
Sbjct: 61  SA 62


>gnl|CDD|143237 cd05760, Ig2_PTK7, Second immunoglobulin (Ig)-like domain of
           protein tyrosine kinase (PTK) 7, also known as CCK4.
           Ig2_PTK7: domain similar to the second immunoglobulin
           (Ig)-like domain in protein tyrosine kinase (PTK) 7,
           also known as CCK4. PTK7 is a subfamily of the receptor
           protein tyrosine kinase family, and is referred to as an
           RPTK-like molecule. RPTKs transduce extracellular
           signals across the cell membrane, and play important
           roles in regulating cell proliferation, migration, and
           differentiation. PTK7 is organized as an extracellular
           portion having seven Ig-like domains, a single
           transmembrane region, and a cytoplasmic tyrosine
           kinase-like domain. PTK7 is considered a pseudokinase as
           it has several unusual residues in some of the highly
           conserved tyrosine kinase (TK) motifs; it is predicted
           to lack TK activity. PTK7 may function as a
           cell-adhesion molecule. PTK7 mRNA is expressed at high
           levels in placenta, melanocytes, liver, lung, pancreas,
           and kidney. PTK7 is overexpressed in several cancers,
           including melanoma and colon cancer lines.
          Length = 77

 Score = 49.1 bits (117), Expect = 2e-08
 Identities = 25/63 (39%), Positives = 33/63 (52%), Gaps = 4/63 (6%)

Query: 116 VTLECKADGNPVPNITWTRKNNNLPGGEYSYSGNS----LTVRHTNRHSAGIYLCVANNM 171
           VTL C  DG+P P   W R    L  G+ +YS +S    LT+R      +G+Y C A+N 
Sbjct: 1   VTLRCHIDGHPRPTYQWFRDGTPLSDGQGNYSVSSKERTLTLRSAGPDDSGLYYCCAHNA 60

Query: 172 VGS 174
            GS
Sbjct: 61  FGS 63


>gnl|CDD|143201 cd05724, Ig2_Robo, Second immunoglobulin (Ig)-like domain in Robo
           (roundabout) receptors.  Ig2_Robo: domain similar to the
           second immunoglobulin (Ig)-like domain in Robo
           (roundabout) receptors. Robo receptors play a role in
           the development of the central nervous system (CNS), and
           are receptors of Slit protein. Slit is a repellant
           secreted by the neural cells in the midline. Slit acts
           through Robo to prevent most neurons from crossing the
           midline from either side. Three mammalian Robo homologs
           (robo1, -2, and -3), and three mammalian Slit homologs
           (Slit-1, -2, and -3), have been identified. Commissural
           axons, which cross the midline, express low levels of
           Robo; longitudinal axons, which avoid the midline,
           express high levels of Robo. robo1, -2, and -3 are
           expressed by commissural neurons in the vertebrate
           spinal cord and Slits 1, -2, -3 are expressed at the
           ventral midline. Robo-3 is a divergent member of the
           Robo family which, instead of being a positive regulator
           of slit responsiveness, antagonizes slit responsiveness
           in precrossing axons.  The Slit-Robo interaction is
           mediated by the second leucine-rich repeat (LRR) domain
           of Slit and the two N-terminal Ig domains of Robo, Ig1
           and Ig2. The primary Robo binding site for Slit2 has
           been shown by surface plasmon resonance experiments and
           mutational analysis to be the Ig1 domain, while the
           Ig2 domain has been proposed to harbor a weak secondary
           binding site.
          Length = 86

 Score = 48.5 bits (116), Expect = 3e-08
 Identities = 26/78 (33%), Positives = 34/78 (43%), Gaps = 8/78 (10%)

Query: 108 VEVKKGYSVTLECKAD-GNPVPNITWTRKN----NNLPGGEYSYSGNSLTVRHTNRHSAG 162
            +V  G    LEC    G+P P ++W RK+    N            +L +    +   G
Sbjct: 6   TQVAVGEMAVLECSPPRGHPEPTVSW-RKDGQPLNLDNERVRIVDDGNLLIAEARKSDEG 64

Query: 163 IYLCVANNMVGS--SAAA 178
            Y CVA NMVG   SAAA
Sbjct: 65  TYKCVATNMVGERESAAA 82


>gnl|CDD|143211 cd05734, Ig7_DSCAM, Seventh immunoglobulin (Ig)-like domain of Down
           Syndrome Cell Adhesion molecule (DSCAM).  Ig7_DSCAM: the
           seventh immunoglobulin (Ig)-like domain of Down Syndrome
           Cell Adhesion molecule (DSCAM). DSCAM is a cell adhesion
           molecule expressed largely in the developing nervous
           system. The gene encoding DSCAM is located at human
           chromosome 21q22, the locus associated with the mental
           retardation phenotype of Down Syndrome. DSCAM is
           predicted to be the largest member of the IG
           superfamily. It has been demonstrated that DSCAM can
           mediate cation-independent homophilic intercellular
           adhesion.
          Length = 79

 Score = 47.6 bits (113), Expect = 6e-08
 Identities = 27/79 (34%), Positives = 35/79 (44%), Gaps = 10/79 (12%)

Query: 116 VTLECKADGNPVPNITWTRKNNN----------LPGGEYSYSGNSLTVRHTNRHSAGIYL 165
           VTL C A+G P P I W                L G     S  SL ++H     +G YL
Sbjct: 1   VTLNCSAEGYPPPTIVWKHSKGRGHPQHTHTCCLAGRIQLLSNGSLLIKHVLEEDSGYYL 60

Query: 166 CVANNMVGSSAAASIALHV 184
           C  +N VG+ A+ S+ L V
Sbjct: 61  CKVSNDVGADASKSMVLTV 79


>gnl|CDD|222457 pfam13927, Ig_3, Immunoglobulin domain.  This family contains
           immunoglobulin-like domains.
          Length = 74

 Score = 47.0 bits (111), Expect = 9e-08
 Identities = 26/77 (33%), Positives = 32/77 (41%), Gaps = 5/77 (6%)

Query: 96  PPRIIYVSGAGKVEVKKGYSVTLECKADGNPVP-NITWTRKNNNLPGGEYSYSGNS-LTV 153
            P I          V  G  VTL C A+G P P  I+W R  +   G     S  S LT+
Sbjct: 1   KPVITVSPSP---SVTSGGGVTLTCSAEGGPPPPTISWYRNGSISGGSGGLGSSGSTLTL 57

Query: 154 RHTNRHSAGIYLCVANN 170
                  +G Y CVA+N
Sbjct: 58  SSVTSEDSGTYTCVASN 74


>gnl|CDD|143179 cd04978, Ig4_L1-NrCAM_like, Fourth immunoglobulin (Ig)-like domain
           of L1, Ng-CAM (Neuron-glia CAM cell adhesion molecule),
           and NrCAM (Ng-CAM-related).  Ig4_L1-NrCAM_like: fourth
           immunoglobulin (Ig)-like domain of L1, Ng-CAM
           (Neuron-glia CAM cell adhesion molecule), and NrCAM
           (Ng-CAM-related). These proteins belong to the L1
           subfamily of cell adhesion molecules (CAMs) and are
           comprised of an extracellular region having six Ig-like
           domains and five fibronectin type III domains, a
           transmembrane region and an intracellular domain. These
           molecules are primarily expressed in the nervous system.
           L1 is associated with an X-linked recessive disorder,
           X-linked hydrocephalus, MASA syndrome, or spastic
           paraplegia type 1, that involves abnormalities of axonal
           growth.
          Length = 76

 Score = 47.0 bits (112), Expect = 1e-07
 Identities = 23/78 (29%), Positives = 36/78 (46%), Gaps = 7/78 (8%)

Query: 113 GYSVTLECKADGNPVPNITWTRKNNNL-----PGGEYSYSGNSLTVRHTNRHSAGIYLCV 167
           G +  L+C+A+G P P ITW R N        P       G +L + +   +   +Y C 
Sbjct: 1   GETGRLDCEAEGIPQPTITW-RLNGVPIEELPPDPRRRVDGGTLILSNVQPNDTAVYQCN 59

Query: 168 ANNMVGSSAAASIALHVL 185
           A+N V     A+  +HV+
Sbjct: 60  ASN-VHGYLLANAFVHVV 76


>gnl|CDD|143168 cd04967, Ig1_Contactin, First Ig domain of contactin.
           Ig1_Contactin: First Ig domain of contactins. Contactins
           are neural cell adhesion molecules and are comprised of
           six Ig domains followed by four fibronectin type
           III (FnIII) domains anchored to the membrane by
           glycosylphosphatidylinositol. The first four Ig domains
           form the intermolecular binding fragment, which arranges
           as a compact U-shaped module via contacts between Ig
           domains 1 and 4, and between Ig domains 2 and 3.
           Contactin-2 (TAG-1, axonin-1) may play a part in the
           neuronal processes of neurite outgrowth, axon guidance
           and fasciculation, and neuronal migration. This group
           also includes contactin-1 and contactin-5. The different
           contactins show different expression patterns in the
           central nervous system. During development and in
           adulthood, contactin-2 is transiently expressed in
           subsets of central and peripheral neurons. Contactin-5
           is expressed specifically in the rat postnatal nervous
           system, peaking at about 3 weeks postnatal, and a lack
           of contactin-5 (NB-2) results in an impairment of
           neuronal activity in the rat auditory system.
           Contactin-5 is highly expressed in the adult human brain
           in the occipital lobe and in the amygdala. Contactin-1
           is differentially expressed in tumor tissues and may,
           through a RhoA mechanism, facilitate invasion and
           metastasis of human lung adenocarcinoma.
          Length = 91

 Score = 46.7 bits (111), Expect = 2e-07
 Identities = 21/63 (33%), Positives = 32/63 (50%), Gaps = 4/63 (6%)

Query: 116 VTLECKADGNPVPNITWTRKNNN---LPGGEYSYSGNSLTVRH-TNRHSAGIYLCVANNM 171
           V+L C+A G+P P   W          P   YS  G +L + + +    AG Y C+A+N+
Sbjct: 22  VSLNCRARGSPPPTYRWLMNGTEIDDEPDSRYSLVGGNLVISNPSKAKDAGRYQCLASNI 81

Query: 172 VGS 174
           VG+
Sbjct: 82  VGT 84


>gnl|CDD|143222 cd05745, Ig3_Peroxidasin, Third immunoglobulin (Ig)-like domain of
           peroxidasin.  Ig3_Peroxidasin: the third immunoglobulin
           (Ig)-like domain in peroxidasin. Peroxidasin has a
           peroxidase domain and interacting extracellular motifs
           containing four Ig-like domains. It has been suggested
           that peroxidasin is secreted and has functions related
           to the stabilization of the extracellular matrix. It may
           play a part in various other important processes such as
           removal and destruction of cells which have undergone
           programmed cell death, and protection of the organism
           against non-self.
          Length = 74

 Score = 46.1 bits (109), Expect = 2e-07
 Identities = 22/65 (33%), Positives = 30/65 (46%), Gaps = 2/65 (3%)

Query: 112 KGYSVTLECKADGNPVPNITWTRKNNNLPGGEYS--YSGNSLTVRHTNRHSAGIYLCVAN 169
           +G +V   C+A G P P I WT+  + L         S  +L +     H  G Y C A 
Sbjct: 1   EGQTVDFLCEAQGYPQPVIAWTKGGSQLSVDRRHLVLSSGTLRISRVALHDQGQYECQAV 60

Query: 170 NMVGS 174
           N+VGS
Sbjct: 61  NIVGS 65


>gnl|CDD|143180 cd04979, Ig_Semaphorin_C, Immunoglobulin (Ig)-like domain of
           semaphorin.  Ig_Semaphorin_C: Immunoglobulin (Ig)-like
           domain in semaphorins. Semaphorins are transmembrane
           proteins that have important roles in a variety of
           tissues. Functionally, semaphorins were initially
           characterized for their importance in the development of
           the nervous system and in axonal guidance. Later, they
           were found to be important for the formation and
           functioning of the cardiovascular, endocrine,
           gastrointestinal, hepatic, immune, musculoskeletal,
           renal, reproductive, and respiratory systems.
           Semaphorins function through binding to their receptors,
           and transmembrane semaphorins also serve as receptors
           themselves. Although the molecular mechanism of
           semaphorins is poorly understood, the Ig-like domains
           may be involved in ligand binding or dimerization.
          Length = 89

 Score = 46.2 bits (110), Expect = 3e-07
 Identities = 22/84 (26%), Positives = 33/84 (39%), Gaps = 6/84 (7%)

Query: 107 KVEVKKGYSVTLECKADGNPVPNITWTRKNNNLPGGEYSYSG-----NSLTVRHTNRHSA 161
            V V +G SV LEC    N    + W  +   L   E          + L +R  +   A
Sbjct: 5   VVTVVEGNSVFLECSPKSNLAS-VVWLFQGGPLQRKEEPEERLLVTEDGLLIRSVSPADA 63

Query: 162 GIYLCVANNMVGSSAAASIALHVL 185
           G+Y C +         A+ +L+VL
Sbjct: 64  GVYTCQSVEHGFKQTLATYSLNVL 87


>gnl|CDD|215677 pfam00047, ig, Immunoglobulin domain.  Members of the
           immunoglobulin superfamily are found in hundreds of
           proteins of different functions. Examples include
           antibodies, the giant muscle kinase titin and receptor
           tyrosine kinases. Immunoglobulin-like domains may be
           involved in protein-protein and protein-ligand
           interactions. The Pfam alignments do not include the
           first and last strand of the immunoglobulin-like domain.
          Length = 62

 Score = 45.6 bits (108), Expect = 3e-07
 Identities = 18/60 (30%), Positives = 26/60 (43%), Gaps = 6/60 (10%)

Query: 115 SVTLECKADGNPVPNITWTRKNNNLPGG------EYSYSGNSLTVRHTNRHSAGIYLCVA 168
           SVTL C   G P  ++TW ++   L         E   S  +LT+ +     +G Y CV 
Sbjct: 3   SVTLTCSVSGPPQVDVTWFKEGKGLEESTTVGTDENRVSSITLTISNVTPEDSGTYTCVV 62


>gnl|CDD|143206 cd05729, Ig2_FGFR_like, Second immunoglobulin (Ig)-like domain of
           fibroblast growth factor (FGF) receptor and similar
           proteins.  Ig2_FGFR_like: domain similar to the second
           immunoglobulin (Ig)-like domain of fibroblast growth
           factor (FGF) receptor. FGF receptors bind FGF signaling
           polypeptides. FGFs participate in multiple processes
           such as morphogenesis, development, and angiogenesis.
           FGFs bind to four FGF receptor tyrosine kinases (FGFR1,
           -2, -3, -4). Receptor diversity is controlled by
           alternative splicing producing splice variants with
           different ligand binding characteristics and different
           expression patterns. FGFRs have an extracellular region
           comprised of three Ig-like domains, a single
           transmembrane helix, and an intracellular tyrosine
           kinase domain. Ligand binding and specificity reside in
           the Ig-like domains 2 and 3, and the linker region that
           connects these two. FGFR activation and signaling depend
           on FGF-induced dimerization, a process involving cell
           surface heparin or heparan sulfate proteoglycans. This
           group also contains fibroblast growth factor (FGF)
           receptor_like-1 (FGFRL1). FGFRL1 does not have a protein
           tyrosine kinase domain at its C terminus; neither does
           its cytoplasmic domain appear to interact with a
           signaling partner. It has been suggested that FGFRL1 may
           not have any direct signaling function, but instead acts
           as a decoy receptor trapping FGFs and preventing them
           from binding other receptors.
          Length = 85

 Score = 45.8 bits (109), Expect = 3e-07
 Identities = 21/68 (30%), Positives = 31/68 (45%), Gaps = 6/68 (8%)

Query: 113 GYSVTLECKADGNPVPNITWT------RKNNNLPGGEYSYSGNSLTVRHTNRHSAGIYLC 166
           G +V L+C A GNP P ITW       +K + + G +      +L +       +G Y C
Sbjct: 9   GSTVRLKCPASGNPRPTITWLKDGKPFKKEHRIGGYKVRKKKWTLILESVVPSDSGKYTC 68

Query: 167 VANNMVGS 174
           +  N  GS
Sbjct: 69  IVENKYGS 76


>gnl|CDD|143205 cd05728, Ig4_Contactin-2-like, Fourth Ig domain of the neural cell
           adhesion molecule contactin-2 and similar proteins.
           Ig4_Contactin-2-like: fourth Ig domain of the neural
           cell adhesion molecule contactin-2. Contactins are
           comprised of six Ig domains followed by four fibronectin
           type III (FnIII) domains anchored to the membrane by
           glycosylphosphatidylinositol. Contactin-2 (aliases
           TAG-1, axonin-1) facilitates cell adhesion by homophilic
           binding between molecules in apposed membranes. The
           first four Ig domains form the intermolecular binding
           fragment which arranges as a compact U-shaped module by
           contacts between Ig domains 1 and 4, and domains 2 and
           3. It has been proposed that a linear zipper-like array
           forms, from contactin-2 molecules alternately provided
           by the two apposed membranes.
          Length = 85

 Score = 44.5 bits (105), Expect = 1e-06
 Identities = 22/72 (30%), Positives = 31/72 (43%), Gaps = 1/72 (1%)

Query: 109 EVKKGYSVTLECKADGNPVPNITWTRKNNNLPG-GEYSYSGNSLTVRHTNRHSAGIYLCV 167
           E   G S+  ECKA GNP P   W +    L            L +   +   +G+Y CV
Sbjct: 10  EADIGSSLRWECKASGNPRPAYRWLKNGQPLASENRIEVEAGDLRITKLSLSDSGMYQCV 69

Query: 168 ANNMVGSSAAAS 179
           A N  G+  A++
Sbjct: 70  AENKHGTIYASA 81


>gnl|CDD|143241 cd05764, Ig_2, Subgroup of the immunoglobulin (Ig) superfamily.
           Ig_2: subgroup of the immunoglobulin (Ig) domain found
           in the Ig superfamily. The Ig superfamily is a
           heterogeneous group of proteins, built on a common fold
           comprised of a sandwich of two beta sheets. Members of
           the Ig superfamily are components of immunoglobulin,
           neuroglia, cell surface glycoproteins, such as T-cell
           receptors, CD2, CD4, CD8, and membrane glycoproteins,
           such as butyrophilin and chondroitin sulfate
           proteoglycan core protein. A predominant feature of most
           Ig domains is a disulfide bridge connecting the two
           beta-sheets with a tryptophan residue packed against the
           disulfide bond.
          Length = 74

 Score = 43.6 bits (103), Expect = 2e-06
 Identities = 24/75 (32%), Positives = 34/75 (45%), Gaps = 4/75 (5%)

Query: 113 GYSVTLECKADGNPVPNITWTRKNNNLPGGE---YSYSGNSLTVRHTNRHSAGIYLCVAN 169
           G   TL CKA G+P P I W   +  L         Y   +L +  T     G + C+A+
Sbjct: 1   GQRATLRCKARGDPEPAIHWISPDGKLISNSSRTLVYDNGTLDILITTVKDTGSFTCIAS 60

Query: 170 NMVGSSAAASIALHV 184
           N  G  A A++ LH+
Sbjct: 61  NAAG-EATATVELHI 74


>gnl|CDD|143203 cd05726, Ig4_Robo, Fourth immunoglobulin (Ig)-like domain in Robo
           (roundabout) receptors.  Ig4_Robo: domain similar to the
           fourth immunoglobulin (Ig)-like domain in Robo
           (roundabout) receptors. Robo receptors play a role in
           the development of the central nervous system (CNS), and
           are receptors of Slit protein. Slit is a repellant
           secreted by the neural cells in the midline. Slit acts
           through Robo to prevent most neurons from crossing the
           midline from either side. Three mammalian Robo homologs
           (robo1, -2, and -3), and three mammalian Slit homologs
           (Slit-1,-2, -3), have been identified. Commissural
           axons, which cross the midline, express low levels of
           Robo; longitudinal axons, which avoid the midline,
           express high levels of Robo. robo1, -2, and -3 are
           expressed by commissural neurons in the vertebrate
           spinal cord and Slits 1, -2, -3 are expressed at the
           ventral midline. Robo-3 is a divergent member of the
           Robo family which instead of being a positive regulator
           of slit responsiveness, antagonizes slit responsiveness
           in precrossing axons.  The Slit-Robo interaction is
           mediated by the second leucine-rich repeat (LRR) domain
           of Slit and the two N-terminal Ig domains of Robo, Ig1
           and Ig2. The primary Robo binding site for Slit2 has
           been shown by surface plasmon resonance experiments and
           mutational analysis to be the Ig1 domain, while the
           Ig2 domain has been proposed to harbor a weak secondary
           binding site.
          Length = 90

 Score = 43.4 bits (102), Expect = 4e-06
 Identities = 24/72 (33%), Positives = 35/72 (48%), Gaps = 10/72 (13%)

Query: 113 GYSVTLECKADGNPVPNITWTRK-NNNL--------PGGEYSYS-GNSLTVRHTNRHSAG 162
           G +VT +C+A GNP P I W ++ + NL            +S S    LT+ +  R   G
Sbjct: 1   GRTVTFQCEATGNPQPAIFWQKEGSQNLLFSYQPPQSSSRFSVSQTGDLTITNVQRSDVG 60

Query: 163 IYLCVANNMVGS 174
            Y+C   N+ GS
Sbjct: 61  YYICQTLNVAGS 72


>gnl|CDD|143256 cd05848, Ig1_Contactin-5, First Ig domain of contactin-5.
           Ig1_Contactin-5: First Ig domain of the neural cell
           adhesion molecule contactin-5. Contactins are comprised
           of six Ig domains followed by four fibronectin type III
           (FnIII) domains, anchored to the membrane by
           glycosylphosphatidylinositol. The different contactins
           show different expression patterns in the central
           nervous system. In rats, a lack of contactin-5 (NB-2)
           results in an impairment of the neuronal activity in the
           auditory system. Contactin-5 is expressed specifically
           in the postnatal nervous system, peaking at about 3
           weeks postnatal. Contactin-5 is highly expressed in the
           adult human brain in the occipital lobe and in the
           amygdala; lower levels of expression have been detected
           in the corpus callosum, caudate nucleus, and spinal
           cord.
          Length = 94

 Score = 43.0 bits (101), Expect = 4e-06
 Identities = 23/63 (36%), Positives = 31/63 (49%), Gaps = 4/63 (6%)

Query: 116 VTLECKADGNPVPNITWTRKNNNLP-GGEYSYS---GNSLTVRHTNRHSAGIYLCVANNM 171
           V L C+A GNPVP   W R    +    +Y YS   GN +    +    +G Y C+A N 
Sbjct: 22  VILNCEARGNPVPTYRWLRNGTEIDTESDYRYSLIDGNLIISNPSEVKDSGRYQCLATNS 81

Query: 172 VGS 174
           +GS
Sbjct: 82  IGS 84


>gnl|CDD|143257 cd05849, Ig1_Contactin-1, First Ig domain of contactin-1.
           Ig1_Contactin-1: First Ig domain of the neural cell
           adhesion molecule contactin-1. Contactins are comprised
           of six Ig domains followed by four fibronectin type III
           (FnIII) domains anchored to the membrane by
           glycosylphosphatidylinositol. Contactin-1 is
           differentially expressed in tumor tissues and may,
           through a RhoA mechanism, facilitate invasion and
           metastasis of human lung adenocarcinoma.
          Length = 93

 Score = 43.0 bits (101), Expect = 5e-06
 Identities = 22/62 (35%), Positives = 36/62 (58%), Gaps = 5/62 (8%)

Query: 116 VTLECKADGNPVPNITWTRKNN---NLPGGEYSYSGNSLTVRHTNRH-SAGIYLCVANNM 171
           V++ C+A  NP P   W RKNN   +L    YS  G +L + + +++  AG Y+C+ +N+
Sbjct: 22  VSVNCRARANPFPIYKW-RKNNLDIDLTNDRYSMVGGNLVINNPDKYKDAGRYVCIVSNI 80

Query: 172 VG 173
            G
Sbjct: 81  YG 82


>gnl|CDD|143264 cd05856, Ig2_FGFRL1-like, Second immunoglobulin (Ig)-like domain of
           fibroblast growth factor (FGF) receptor_like-1 (FGFRL1).
           Ig2_FGFRL1-like: second immunoglobulin (Ig)-like domain
           of fibroblast growth factor (FGF)
           receptor_like-1 (FGFRL1). FGFRL1 is comprised of a signal
           peptide, three extracellular Ig-like modules, a
           transmembrane segment, and a short intracellular domain.
           FGFRL1 is expressed preferentially in skeletal tissues.
           Similar to FGF receptors, the expressed protein
           interacts specifically with heparin and with FGF2.
           FGFRL1 does not have a protein tyrosine kinase domain at
           its C terminus; neither does its cytoplasmic domain
           appear to interact with a signaling partner. It has been
           suggested that FGFRL1 may not have any direct signaling
           function, but instead acts as a decoy receptor trapping
           FGFs and preventing them from binding other receptors.
          Length = 82

 Score = 42.5 bits (100), Expect = 5e-06
 Identities = 23/64 (35%), Positives = 33/64 (51%), Gaps = 3/64 (4%)

Query: 113 GYSVTLECKADGNPVPNITWTRKNNNLP---GGEYSYSGNSLTVRHTNRHSAGIYLCVAN 169
           G SV L+C A GNP P+ITW + N  L     GE      +L++++     +G Y C  +
Sbjct: 9   GSSVRLKCVASGNPRPDITWLKDNKPLTPTEIGESRKKKWTLSLKNLKPEDSGKYTCHVS 68

Query: 170 NMVG 173
           N  G
Sbjct: 69  NRAG 72


>gnl|CDD|143284 cd05876, Ig3_L1-CAM, Third immunoglobulin (Ig)-like domain of the
           L1 cell adhesion molecule (CAM).  Ig3_L1-CAM:  third
           immunoglobulin (Ig)-like domain of the L1 cell adhesion
           molecule (CAM). L1 belongs to the L1 subfamily of cell
           adhesion molecules (CAMs) and is comprised of an
           extracellular region having six Ig-like domains, five
           fibronectin type III domains, a transmembrane region and
           an intracellular domain. L1 is primarily expressed in
           the nervous system and is involved in its development
           and function. L1 is associated with an X-linked
           recessive disorder, X-linked hydrocephalus, MASA
           syndrome, or spastic paraplegia type 1, that involves
           abnormalities of axonal growth. This group also contains
           the chicken neuron-glia cell adhesion molecule, Ng-CAM.
          Length = 71

 Score = 42.2 bits (99), Expect = 6e-06
 Identities = 18/61 (29%), Positives = 28/61 (45%), Gaps = 2/61 (3%)

Query: 116 VTLECKADGNPVPNITWTRKNNNLP--GGEYSYSGNSLTVRHTNRHSAGIYLCVANNMVG 173
           + LEC A+G P P + W R +  L     +   +  +L + +      G Y+C A N  G
Sbjct: 1   LVLECIAEGLPTPEVHWDRIDGPLSPNRTKKLNNNKTLQLDNVLESDDGEYVCTAENSEG 60

Query: 174 S 174
           S
Sbjct: 61  S 61


>gnl|CDD|143223 cd05746, Ig4_Peroxidasin, Fourth immunoglobulin (Ig)-like domain of
           peroxidasin.  Ig4_Peroxidasin: the fourth immunoglobulin
           (Ig)-like domain in peroxidasin. Peroxidasin has a
           peroxidase domain and interacting extracellular motifs
           containing four Ig-like domains. It has been suggested
           that peroxidasin is secreted, and has functions related
           to the stabilization of the extracellular matrix. It may
           play a part in various other important processes such as
           removal and destruction of cells which have undergone
           programmed cell death, and protection of the organism
           against non-self.
          Length = 69

 Score = 41.0 bits (96), Expect = 1e-05
 Identities = 20/65 (30%), Positives = 31/65 (47%), Gaps = 2/65 (3%)

Query: 116 VTLECKADGNPVPNITWTRKNNNLP-GGEYSYS-GNSLTVRHTNRHSAGIYLCVANNMVG 173
           V + C A G+P P ITW +    +   G++  S    L +R       G Y CVA N +G
Sbjct: 1   VQIPCSAQGDPEPTITWNKDGVQVTESGKFHISPEGYLAIRDVGVADQGRYECVARNTIG 60

Query: 174 SSAAA 178
            ++ +
Sbjct: 61  YASVS 65


>gnl|CDD|143178 cd04977, Ig1_NCAM-1_like, First immunoglobulin (Ig)-like domain of
           neural cell adhesion molecule NCAM-1 and similar
           proteins.  Ig1_NCAM-1 like: first immunoglobulin
           (Ig)-like domain of neural cell adhesion molecule
           NCAM-1. NCAM-1 plays important roles in the development
           and regeneration of the central nervous system, in
           synaptogenesis and neural migration. NCAM mediates
           cell-cell and cell-substratum recognition and adhesion
           via homophilic (NCAM-NCAM) and heterophilic
           (NCAM-nonNCAM) interactions. NCAM is expressed as three
           major isoforms having different intracellular
           extensions. The extracellular portion of NCAM has five
           N-terminal Ig-like domains and two fibronectin type III
           domains. The double zipper adhesion complex model for
           NCAM homophilic binding involves the Ig1, Ig2, and Ig3
           domains. By this model, Ig1 and Ig2 mediate dimerization
           of NCAM molecules situated on the same cell surface (cis
           interactions), and Ig3 domains mediate interactions
           between NCAM molecules expressed on the surface of
           opposing cells (trans interactions), through binding to
           the Ig1 and Ig2 domains. The adhesive ability of NCAM is
           modulated by the addition of polysialic acid chains to
           the fifth Ig-like domain. Also included in this group is
           NCAM-2 (also known as OCAM/mamFas II and RNCAM).  NCAM-2
           is differentially expressed in the developing and mature
           olfactory epithelium (OE).
          Length = 92

 Score = 39.0 bits (91), Expect = 1e-04
 Identities = 24/85 (28%), Positives = 40/85 (47%), Gaps = 8/85 (9%)

Query: 107 KVEVKKGYSVTLECKADGNPVPNITWTRKNNN--LPGGEYSYSGN-----SLTVRHTNRH 159
           + E+  G S    C+  G P  +I+W   N    +   + S   N     +LT+ + N  
Sbjct: 9   QGEISVGESKFFLCQVIGEPK-DISWFSPNGEKLVTQQQISVVQNDDVRSTLTIYNANIE 67

Query: 160 SAGIYLCVANNMVGSSAAASIALHV 184
            AGIY CVA +  G+ + A++ L +
Sbjct: 68  DAGIYKCVATDAKGTESEATVNLKI 92


>gnl|CDD|143175 cd04974, Ig3_FGFR, Third immunoglobulin (Ig)-like domain of
           fibroblast growth factor receptor (FGFR).  Ig3_FGFR:
           third immunoglobulin (Ig)-like domain of fibroblast
           growth factor receptor (FGFR). Fibroblast growth factors
           (FGFs) participate in morphogenesis, development,
           angiogenesis, and wound healing. These FGF-stimulated
           processes are mediated by four FGFR tyrosine kinases
           (FGFR1-4). FGFRs are comprised of an extracellular
           portion consisting of three Ig-like domains, a
           transmembrane helix, and a cytoplasmic portion having
           protein tyrosine kinase activity. The highly conserved
           Ig-like domains 2 and 3, and the linker region between
           D2 and D3 define a general binding site for FGFs.
          Length = 90

 Score = 38.2 bits (89), Expect = 2e-04
 Identities = 25/92 (27%), Positives = 35/92 (38%), Gaps = 21/92 (22%)

Query: 113 GYSVTLECKADGNPVPNITWTRKNNNLPGGEYSYSGNS-------------------LTV 153
           G  V   CK   +  P+I W  K+  + G +Y   G                     L +
Sbjct: 1   GSDVEFHCKVYSDAQPHIQWL-KHVEVNGSKYGPDGLPYVTVLKVAGINTTDNESEVLYL 59

Query: 154 RHTNRHSAGIYLCVANNMVGSSAAASIALHVL 185
           R+ +   AG Y C+A N +G S   S  L VL
Sbjct: 60  RNVSFDDAGEYTCLAGNSIGPS-HHSAWLTVL 90


>gnl|CDD|143278 cd05870, Ig5_NCAM-2, Fifth immunoglobulin (Ig)-like domain of
           Neural Cell Adhesion Molecule NCAM-2 (also known as
           OCAM/mamFas II and RNCAM).  Ig5_NCAM-2: the fifth
           immunoglobulin (Ig)-like domain of Neural Cell Adhesion
           Molecule NCAM-2 (also known as OCAM/mamFas II and
           RNCAM). NCAM-2 is organized similarly to NCAM,
           including five N-terminal Ig-like domains and two
           fibronectin type III domains. NCAM-2 is differentially
           expressed in the developing and mature olfactory
           epithelium (OE), and may function like NCAM, as an
           adhesion molecule.
          Length = 98

 Score = 38.4 bits (89), Expect = 2e-04
 Identities = 27/91 (29%), Positives = 40/91 (43%), Gaps = 15/91 (16%)

Query: 95  IPPRIIYVSGAGKVEVKKGYSVTLECKADGNPVPNITWTRKNNNL--------PGGEYSY 146
           + P II +     VE     + TL CKA+G P+P ITW R ++          P G    
Sbjct: 1   VQPHIIQLKNETTVE---NGAATLSCKAEGEPIPEITWKRASDGHTFSEGDKSPDGRIEV 57

Query: 147 SG----NSLTVRHTNRHSAGIYLCVANNMVG 173
            G    +SL ++      +G Y C A + +G
Sbjct: 58  KGQHGESSLHIKDVKLSDSGRYDCEAASRIG 88


>gnl|CDD|143242 cd05765, Ig_3, Subgroup of the immunoglobulin (Ig) superfamily.
           Ig_3: subgroup of the immunoglobulin (Ig) domain found
           in the Ig superfamily. The Ig superfamily is a
           heterogeneous group of proteins, built on a common fold
           comprised of a sandwich of two beta sheets. Members of
           the Ig superfamily are components of immunoglobulin,
           neuroglia, cell surface glycoproteins, such as T-cell
           receptors, CD2, CD4, CD8, and membrane glycoproteins,
           such as butyrophilin and chondroitin sulfate
           proteoglycan core protein. A predominant feature of most
           Ig domains is a disulfide bridge connecting the two
           beta-sheets with a tryptophan residue packed against the
           disulfide bond.
          Length = 81

 Score = 37.1 bits (86), Expect = 4e-04
 Identities = 20/75 (26%), Positives = 28/75 (37%), Gaps = 16/75 (21%)

Query: 113 GYSVTLECKADGNPVPNITWTRKNNNLPGGEY------SYSGN-------SLTVRHTNRH 159
           G + +  C   G P P ITW ++   + G E          GN        L + +    
Sbjct: 1   GETASFHCDVTGRPPPEITWEKQ---VHGKENLIMRPNHVRGNVVVTNIGQLVIYNAQPQ 57

Query: 160 SAGIYLCVANNMVGS 174
            AG+Y C A N  G 
Sbjct: 58  DAGLYTCTARNSGGL 72


>gnl|CDD|143170 cd04969, Ig5_Contactin_like, Fifth Ig domain of contactin.
           Ig5_Contactin_like: Fifth Ig domain of contactins.
           Contactins are neural cell adhesion molecules and are
           comprised of six Ig domains followed by four fibronectin
           type III (FnIII) domains anchored to the membrane by
           glycosylphosphatidylinositol. The first four Ig domains
           form the intermolecular binding fragment, which arranges
           as a compact U-shaped module via contacts between Ig
           domains 1 and 4, and between Ig domains 2 and 3.
           Contactin-2 (TAG-1, axonin-1) may play a part in the
           neuronal processes of neurite outgrowth, axon guidance
           and fasciculation, and neuronal migration. This group
           also includes contactin-1 and contactin-5. The different
           contactins show different expression patterns in the
           central nervous system. During development and in
           adulthood, contactin-2 is transiently expressed in
           subsets of central and peripheral neurons. Contactin-5
           is expressed specifically in the rat postnatal nervous
           system, peaking at about 3 weeks postnatal, and a lack
           of contactin-5 (NB-2) results in an impairment of
           neuronal activity in the rat auditory system.
           Contactin-5 is highly expressed in the adult human brain
           in the occipital lobe and in the amygdala. Contactin-1
           is differentially expressed in tumor tissues and may,
           through a RhoA mechanism, facilitate invasion and
           metastasis of human lung adenocarcinoma.
          Length = 73

 Score = 36.7 bits (85), Expect = 7e-04
 Identities = 20/71 (28%), Positives = 28/71 (39%), Gaps = 4/71 (5%)

Query: 113 GYSVTLECKADGNPVPNITWTRKNNNLP-GGEYSYSGN-SLTVRHTNRHSAGIYLCVANN 170
           G  V +ECK    P P I+W++    L          + SL + +  +   G Y C A N
Sbjct: 1   GGDVIIECKPKAAPKPTISWSKGTELLTNSSRICIWPDGSLEILNVTKSDEGKYTCFAEN 60

Query: 171 MVGSSAAASIA 181
             G   A S  
Sbjct: 61  FFGK--ANSTG 69


>gnl|CDD|212460 cd05723, Ig4_Neogenin, Fourth immunoglobulin (Ig)-like domain in
           neogenin and similar proteins.  Ig4_Neogenin: fourth
           immunoglobulin (Ig)-like domain in neogenin and related
           proteins. Neogenin is a cell surface protein which is
           expressed in the developing nervous system of vertebrate
           embryos in the growing nerve cells. It is also expressed
           in other embryonic tissues, and may play a general role
           in developmental processes such as cell migration,
           cell-cell recognition, and tissue growth regulation.
           Included in this group is the tumor suppressor protein
           DCC, which is deleted in colorectal carcinoma. DCC and
           neogenin each have four Ig-like domains followed by six
           fibronectin type III domains, a transmembrane domain,
           and an intracellular domain.
          Length = 71

 Score = 36.5 bits (84), Expect = 7e-04
 Identities = 17/66 (25%), Positives = 30/66 (45%), Gaps = 2/66 (3%)

Query: 116 VTLECKADGNPVPNITWTRKNNNLPGGEY--SYSGNSLTVRHTNRHSAGIYLCVANNMVG 173
           +  EC+  G P P + W +  + +   +Y      ++L V    +   G Y C+A N VG
Sbjct: 2   IVFECEVTGKPTPTVKWVKNGDMVIPSDYFKIVKEHNLQVLGLVKSDEGFYQCIAENDVG 61

Query: 174 SSAAAS 179
           +  A +
Sbjct: 62  NVQAGA 67


>gnl|CDD|143258 cd05850, Ig1_Contactin-2, First Ig domain of contactin-2.
           Ig1_Contactin-2: First Ig domain of the neural cell
           adhesion molecule contactin-2-like. Contactins are
           comprised of six Ig domains followed by four fibronectin
           type III (FnIII) domains anchored to the membrane by
           glycosylphosphatidylinositol. Contactin-2 (TAG-1,
           axonin-1) facilitates cell adhesion by homophilic
           binding between molecules in apposed membranes. It may
           play a part in the neuronal processes of neurite
           outgrowth, axon guidance and fasciculation, and neuronal
           migration. The first four Ig domains form the
           intermolecular binding fragment, which arranges as a
           compact U-shaped module by contacts between Ig domains 1
           and 4, and domains 2 and 3. The different contactins
           show different expression patterns in the central
           nervous system. During development and in adulthood,
           contactin-2 is transiently expressed in subsets of
           central and peripheral neurons. Contactin-2 is also
           expressed in retinal amacrine cells in the developing
           chick retina, corresponding to the period of formation
           and maturation of AC processes.
          Length = 94

 Score = 36.5 bits (84), Expect = 0.001
 Identities = 17/63 (26%), Positives = 26/63 (41%), Gaps = 4/63 (6%)

Query: 116 VTLECKADGNPVPNITWTRKNNNL---PGGEYSYSGNSLTVRH-TNRHSAGIYLCVANNM 171
           VTL C+A  +P     W      +   P   Y+    +L + +      AG Y C+A N 
Sbjct: 22  VTLGCRARASPPATYRWKMNGTEIKFAPESRYTLVAGNLVINNPQKARDAGSYQCLAINR 81

Query: 172 VGS 174
            G+
Sbjct: 82  CGT 84


>gnl|CDD|237753 PRK14552, PRK14552, C/D box methylation guide ribonucleoprotein
           complex aNOP56 subunit; Provisional.
          Length = 414

 Score = 38.4 bits (90), Expect = 0.001
 Identities = 13/25 (52%), Positives = 21/25 (84%)

Query: 70  KEEEERKKKKKRKKRKKKKKKKEKR 94
           K EE++ +K+K+KK++KKK KK K+
Sbjct: 386 KREEKKPQKRKKKKKRKKKGKKRKK 410



 Score = 37.6 bits (88), Expect = 0.002
 Identities = 12/25 (48%), Positives = 22/25 (88%)

Query: 70  KEEEERKKKKKRKKRKKKKKKKEKR 94
           K+ EE+K +K++KK+K+KKK K+++
Sbjct: 385 KKREEKKPQKRKKKKKRKKKGKKRK 409



 Score = 37.6 bits (88), Expect = 0.002
 Identities = 16/26 (61%), Positives = 23/26 (88%)

Query: 69  EKEEEERKKKKKRKKRKKKKKKKEKR 94
           EK+ ++RKKKKKRKK+ KK+KKK ++
Sbjct: 389 EKKPQKRKKKKKRKKKGKKRKKKGRK 414



 Score = 36.9 bits (86), Expect = 0.003
 Identities = 12/26 (46%), Positives = 22/26 (84%)

Query: 69  EKEEEERKKKKKRKKRKKKKKKKEKR 94
            +E++ +K+KKK+K++KK KK+K+K 
Sbjct: 387 REEKKPQKRKKKKKRKKKGKKRKKKG 412



 Score = 36.9 bits (86), Expect = 0.003
 Identities = 14/26 (53%), Positives = 24/26 (92%)

Query: 69  EKEEEERKKKKKRKKRKKKKKKKEKR 94
           ++EE++ +K+KK+KKRKKK KK++K+
Sbjct: 386 KREEKKPQKRKKKKKRKKKGKKRKKK 411



 Score = 34.6 bits (80), Expect = 0.020
 Identities = 15/25 (60%), Positives = 21/25 (84%)

Query: 70  KEEEERKKKKKRKKRKKKKKKKEKR 94
           K++ E KK +KRKK+KK+KKK +KR
Sbjct: 384 KKKREEKKPQKRKKKKKRKKKGKKR 408



 Score = 33.4 bits (77), Expect = 0.058
 Identities = 12/26 (46%), Positives = 23/26 (88%)

Query: 69  EKEEEERKKKKKRKKRKKKKKKKEKR 94
           E+++ +++KKKK++K+K KK+KK+ R
Sbjct: 388 EEKKPQKRKKKKKRKKKGKKRKKKGR 413



 Score = 31.1 bits (71), Expect = 0.29
 Identities = 10/24 (41%), Positives = 19/24 (79%)

Query: 75  RKKKKKRKKRKKKKKKKEKRIPPR 98
           +KK++++K +K+KKKKK K+   +
Sbjct: 384 KKKREEKKPQKRKKKKKRKKKGKK 407



 Score = 30.7 bits (70), Expect = 0.44
 Identities = 13/37 (35%), Positives = 23/37 (62%), Gaps = 6/37 (16%)

Query: 68  MEKEEEERKK------KKKRKKRKKKKKKKEKRIPPR 98
           + K  EE K+      KKKR+++K +K+KK+K+   +
Sbjct: 368 LNKRIEEIKEKYPKPPKKKREEKKPQKRKKKKKRKKK 404


>gnl|CDD|143215 cd05738, Ig2_RPTP_IIa_LAR_like, Second immunoglobulin (Ig)-like
           domain of the receptor protein tyrosine phosphatase
           (RPTP)-F, also known as LAR.  Ig2_RPTP_IIa_LAR_like:
           domain similar to the second immunoglobulin (Ig)-like
           domain found in the receptor protein tyrosine
           phosphatase (RPTP)-F, also known as LAR. LAR belongs to
           the RPTP type IIa subfamily. Members of this subfamily
           are cell adhesion molecule-like proteins involved in
           central nervous system (CNS) development. They have
           large extracellular portions, comprised of multiple
           Ig-like domains and two to nine fibronectin type III
           (FNIII) domains, and a cytoplasmic portion having two
           tandem phosphatase domains.
          Length = 74

 Score = 35.0 bits (80), Expect = 0.002
 Identities = 22/72 (30%), Positives = 32/72 (44%), Gaps = 4/72 (5%)

Query: 117 TLECKADGNPVPNITWTRK----NNNLPGGEYSYSGNSLTVRHTNRHSAGIYLCVANNMV 172
           T+ C A GNP P ITW +     +    G        +L + ++     G Y CVA N  
Sbjct: 2   TMLCAASGNPDPEITWFKDFLPVDTTSNGRIKQLRSGALQIENSEESDQGKYECVATNSA 61

Query: 173 GSSAAASIALHV 184
           G+  +A   L+V
Sbjct: 62  GTRYSAPANLYV 73


>gnl|CDD|218482 pfam05178, Kri1, KRI1-like family.  The yeast member of this
          family (Kri1p) is found to be required for 40S ribosome
          biogenesis in the nucleolus.
          Length = 99

 Score = 35.3 bits (82), Expect = 0.003
 Identities = 9/24 (37%), Positives = 18/24 (75%)

Query: 69 EKEEEERKKKKKRKKRKKKKKKKE 92
          E++EEE+ ++++  KR K  K++E
Sbjct: 2  ERKEEEKAQREEELKRLKNLKREE 25



 Score = 32.2 bits (74), Expect = 0.034
 Identities = 9/26 (34%), Positives = 19/26 (73%)

Query: 68 MEKEEEERKKKKKRKKRKKKKKKKEK 93
           E+E+ +R+++ KR K  K+++ +EK
Sbjct: 4  KEEEKAQREEELKRLKNLKREEIEEK 29



 Score = 28.4 bits (64), Expect = 0.84
 Identities = 8/26 (30%), Positives = 16/26 (61%)

Query: 70 KEEEERKKKKKRKKRKKKKKKKEKRI 95
          KE +E +K ++ ++ K+ K  K + I
Sbjct: 1  KERKEEEKAQREEELKRLKNLKREEI 26



 Score = 25.3 bits (56), Expect = 9.8
 Identities = 6/29 (20%), Positives = 18/29 (62%)

Query: 67 AMEKEEEERKKKKKRKKRKKKKKKKEKRI 95
            EK + E + K+ +  ++++ ++K ++I
Sbjct: 5  EEEKAQREEELKRLKNLKREEIEEKLEKI 33


>gnl|CDD|143213 cd05736, Ig2_Follistatin_like, Second immunoglobulin (Ig)-like
           domain of a follistatin-like molecule encoded by the
           Mahya gene and similar proteins.  Ig2_Follistatin_like:
           domain similar to the second immunoglobulin (Ig)-like
           domain found in a follistatin-like molecule encoded by
           the CNS-related Mahya gene. Mahya genes have been
           retained in certain Bilaterian branches during
           evolution.  They are conserved in Hymenoptera and
           Deuterostomes, but are absent from other metazoan
           species such as fruit fly and nematode. Mahya proteins
           are secretory, with a follistatin-like domain
           (Kazal-type serine/threonine protease inhibitor domain
           and EF-hand calcium-binding domain), two Ig-like
           domains, and a novel C-terminal domain. Mahya may be
           involved in learning and memory and in processing of
           sensory information in Hymenoptera and vertebrates.
           Follistatin is a secreted, multidomain protein that
           binds activins with high affinity and antagonizes their
           signaling.
          Length = 76

 Score = 34.9 bits (80), Expect = 0.003
 Identities = 17/63 (26%), Positives = 27/63 (42%), Gaps = 7/63 (11%)

Query: 117 TLECKADGNPVPNITWTRK----NNNLPGGEYSYSGNSLTVRHTNRHSA--GIYLCVANN 170
           +L C A+G P+P +TW +        L   + +   N   +  +N      G Y C+A N
Sbjct: 2   SLRCHAEGIPLPRLTWLKNGMDITPKLS-KQLTLIANGSELHISNVRYEDTGAYTCIAKN 60

Query: 171 MVG 173
             G
Sbjct: 61  EAG 63


>gnl|CDD|143224 cd05747, Ig5_Titin_like, M5, fifth immunoglobulin (Ig)-like domain
           of human titin C terminus and similar proteins.
           Ig5_Titin_like: domain similar to the M5, fifth
           immunoglobulin (Ig)-like domain from the human titin C
           terminus. Titin (also called connectin) is a fibrous
           sarcomeric protein specifically found in vertebrate
           striated muscle. Titin is gigantic; depending on isoform
           composition it ranges from 2970 to 3700 kDa, and is of a
           length that spans half a sarcomere. Titin largely
           consists of multiple repeats of Ig-like and fibronectin
           type 3 (FN-III)-like domains. Titin connects the ends of
           myosin thick filaments to Z disks and extends along the
           thick filament to the H zone, and appears to function
           similar to an elastic band, keeping the myosin filaments
           centered in the sarcomere during muscle contraction or
           stretching.
          Length = 92

 Score = 35.0 bits (80), Expect = 0.004
 Identities = 11/26 (42%), Positives = 14/26 (53%)

Query: 110 VKKGYSVTLECKADGNPVPNITWTRK 135
           V +G S    C  DG P P +TW R+
Sbjct: 15  VSEGESARFSCDVDGEPAPTVTWMRE 40


>gnl|CDD|236877 PRK11192, PRK11192, ATP-dependent RNA helicase SrmB; Provisional.
          Length = 434

 Score = 36.8 bits (86), Expect = 0.004
 Identities = 13/33 (39%), Positives = 20/33 (60%), Gaps = 1/33 (3%)

Query: 66  LAMEKEEEERKKKKKR-KKRKKKKKKKEKRIPP 97
           LA   E++E++K+K + KKR +  K   KR  P
Sbjct: 396 LAKRAEKKEKEKEKPKVKKRHRDTKNIGKRRKP 428



 Score = 34.9 bits (81), Expect = 0.017
 Identities = 10/26 (38%), Positives = 16/26 (61%)

Query: 73  EERKKKKKRKKRKKKKKKKEKRIPPR 98
            E+K+K+K K + KK+ +  K I  R
Sbjct: 400 AEKKEKEKEKPKVKKRHRDTKNIGKR 425



 Score = 32.2 bits (74), Expect = 0.14
 Identities = 10/32 (31%), Positives = 18/32 (56%)

Query: 62  LSYILAMEKEEEERKKKKKRKKRKKKKKKKEK 93
           L+     +++E+E+ K KKR +  K   K+ K
Sbjct: 396 LAKRAEKKEKEKEKPKVKKRHRDTKNIGKRRK 427



 Score = 31.8 bits (73), Expect = 0.18
 Identities = 9/24 (37%), Positives = 16/24 (66%)

Query: 70  KEEEERKKKKKRKKRKKKKKKKEK 93
           K   +R +KK+++K K K KK+ +
Sbjct: 394 KVLAKRAEKKEKEKEKPKVKKRHR 417



 Score = 31.8 bits (73), Expect = 0.19
 Identities = 9/25 (36%), Positives = 16/25 (64%)

Query: 70  KEEEERKKKKKRKKRKKKKKKKEKR 94
           K+   ++ +KK K+++K K KK  R
Sbjct: 393 KKVLAKRAEKKEKEKEKPKVKKRHR 417



 Score = 29.9 bits (68), Expect = 0.73
 Identities = 7/26 (26%), Positives = 16/26 (61%)

Query: 69  EKEEEERKKKKKRKKRKKKKKKKEKR 94
              +   KK+K+++K K KK+ ++ +
Sbjct: 395 VLAKRAEKKEKEKEKPKVKKRHRDTK 420



 Score = 29.5 bits (67), Expect = 1.1
 Identities = 11/25 (44%), Positives = 17/25 (68%)

Query: 69  EKEEEERKKKKKRKKRKKKKKKKEK 93
           EK+  +  KK   K+ +KK+K+KEK
Sbjct: 385 EKKTGKPSKKVLAKRAEKKEKEKEK 409



 Score = 29.1 bits (66), Expect = 1.5
 Identities = 8/25 (32%), Positives = 16/25 (64%)

Query: 69  EKEEEERKKKKKRKKRKKKKKKKEK 93
           +  +  +K   KR ++K+K+K+K K
Sbjct: 387 KTGKPSKKVLAKRAEKKEKEKEKPK 411



 Score = 28.8 bits (65), Expect = 1.6
 Identities = 9/24 (37%), Positives = 14/24 (58%)

Query: 70  KEEEERKKKKKRKKRKKKKKKKEK 93
           K  ++   K+  KK K+K+K K K
Sbjct: 390 KPSKKVLAKRAEKKEKEKEKPKVK 413



 Score = 28.8 bits (65), Expect = 1.7
 Identities = 10/29 (34%), Positives = 17/29 (58%), Gaps = 4/29 (13%)

Query: 70  KEEEERKKKKKRKK----RKKKKKKKEKR 94
           K   E+K  K  KK    R +KK+K++++
Sbjct: 381 KAPSEKKTGKPSKKVLAKRAEKKEKEKEK 409



 Score = 28.4 bits (64), Expect = 2.7
 Identities = 11/40 (27%), Positives = 19/40 (47%)

Query: 76  KKKKKRKKRKKKKKKKEKRIPPRIIYVSGAGKVEVKKGYS 115
           KK   ++  KK+K+K++ ++  R       GK     G S
Sbjct: 393 KKVLAKRAEKKEKEKEKPKVKKRHRDTKNIGKRRKPSGTS 432


>gnl|CDD|143276 cd05868, Ig4_NrCAM, Fourth immunoglobulin (Ig)-like domain of NrCAM
           (NgCAM-related cell adhesion molecule).  Ig4_NrCAM:
           fourth immunoglobulin (Ig)-like domain of NrCAM
           (NgCAM-related cell adhesion molecule). NrCAM belongs to
           the L1 subfamily of cell adhesion molecules (CAMs) and
           is comprised of an extracellular region having six
           IG-like domains and five fibronectin type III domains, a
           transmembrane region and an intracellular domain. NrCAM
           is primarily expressed in the nervous system.
          Length = 76

 Score = 34.6 bits (79), Expect = 0.004
 Identities = 19/63 (30%), Positives = 30/63 (47%), Gaps = 8/63 (12%)

Query: 117 TLECKADGNPVPNITWTRKNNNLP------GGEYSYSGNSLTVRHTNRHSAGIYLCVANN 170
           TL C+A+GNP P+I+W    N +P             G+++        S+ +Y C A+N
Sbjct: 5   TLICRANGNPKPSISWL--TNGVPIEIAPTDPSRKVDGDTIIFSKVQERSSAVYQCNASN 62

Query: 171 MVG 173
             G
Sbjct: 63  EYG 65


>gnl|CDD|143217 cd05740, Ig_CEACAM_D4, Fourth immunoglobulin (Ig)-like domain of
           carcinoembryonic antigen (CEA) related cell adhesion
           molecule (CEACAM).  Ig_CEACAM_D4:  immunoglobulin
           (Ig)-like domain 4 in carcinoembryonic antigen (CEA)
           related cell adhesion molecule (CEACAM) protein
           subfamily. The CEA family is a group of anchored or
           secreted glycoproteins, expressed by epithelial cells,
           leukocytes, endothelial cells and placenta. The CEA
           family is divided into the CEACAM and pregnancy-specific
           glycoprotein (PSG) subfamilies. This group represents
           the CEACAM subfamily. CEACAM1 has many important
           cellular functions: it is a cell adhesion molecule, a
           signaling molecule that regulates the growth of tumor
           cells, an angiogenic factor, and a receptor for
           bacterial and viral pathogens, including mouse hepatitis
           virus (MHV). In mice, four isoforms of CEACAM1 generated
           by alternative splicing have either two [D1, D4] or four
           [D1-D4] Ig-like domains on the cell surface. This family
           corresponds to the D4 Ig-like domain.
          Length = 91

 Score = 34.6 bits (79), Expect = 0.004
 Identities = 22/82 (26%), Positives = 33/82 (40%), Gaps = 3/82 (3%)

Query: 106 GKVEVKKGYSVTLECKADGNPVPNITWTRKNNNLPGGEYSYSGN--SLTVRHTNRHSAGI 163
           G    +    VTL C+A+G     I W    + L       S +  +LT  +  R   G 
Sbjct: 11  GNQPPEDNQPVTLTCEAEGQ-ATYIWWVNNGSLLVPPRLQLSNDNRTLTFNNVTRSDTGH 69

Query: 164 YLCVANNMVGSSAAASIALHVL 185
           Y C A+N V +  +    L+V 
Sbjct: 70  YQCEASNEVSNMTSDPYILNVN 91


>gnl|CDD|143199 cd05722, Ig1_Neogenin, First immunoglobulin (Ig)-like domain in
           neogenin and similar proteins.  Ig1_Neogenin: first
           immunoglobulin (Ig)-like domain in neogenin and related
           proteins. Neogenin is a cell surface protein which is
           expressed in the developing nervous system of vertebrate
           embryos in the growing nerve cells. It is also expressed
           in other embryonic tissues, and may play a general role
           in developmental processes such as cell migration,
           cell-cell recognition, and tissue growth regulation.
           Included in this group is the tumor suppressor protein
           DCC, which is deleted in colorectal carcinoma. DCC and
           neogenin each have four Ig-like domains followed by six
           fibronectin type III domains, a transmembrane domain,
           and an intracellular domain.
          Length = 95

 Score = 34.8 bits (80), Expect = 0.005
 Identities = 20/68 (29%), Positives = 25/68 (36%), Gaps = 9/68 (13%)

Query: 112 KGYSVTLECKADGNPVPNITWTRKNNNLPGGE----YSYSGNSLTVRH-----TNRHSAG 162
           +G  V L C A+G P P I W +    L              SL +        N+   G
Sbjct: 13  RGGPVVLNCSAEGEPPPKIEWKKDGVLLNLVSDERRQQLPNGSLLITSVVHSKHNKPDEG 72

Query: 163 IYLCVANN 170
            Y CVA N
Sbjct: 73  FYQCVAQN 80


>gnl|CDD|150884 pfam10278, Med19, Mediator of RNA pol II transcription subunit 19. 
           Med19 represents a family of conserved proteins which
           are members of the multi-protein co-activator Mediator
           complex. Mediator is required for activation of RNA
           polymerase II transcription by DNA binding
           transactivators.
          Length = 178

 Score = 35.6 bits (82), Expect = 0.006
 Identities = 18/37 (48%), Positives = 23/37 (62%), Gaps = 6/37 (16%)

Query: 67  AMEKEEEERKKKKK------RKKRKKKKKKKEKRIPP 97
              K  E++ KKKK      RKK+KK+KKKK+KR  P
Sbjct: 135 EGLKGHEKKHKKKKHEDDKERKKKKKEKKKKKKRHSP 171



 Score = 31.3 bits (71), Expect = 0.17
 Identities = 12/29 (41%), Positives = 22/29 (75%)

Query: 69  EKEEEERKKKKKRKKRKKKKKKKEKRIPP 97
           +K E+++++KKK+K++KKKKK+     P 
Sbjct: 147 KKHEDDKERKKKKKEKKKKKKRHSPEHPG 175


>gnl|CDD|219124 pfam06658, DUF1168, Protein of unknown function (DUF1168).  This
           family consists of several hypothetical eukaryotic
           proteins of unknown function.
          Length = 142

 Score = 35.4 bits (82), Expect = 0.006
 Identities = 12/27 (44%), Positives = 23/27 (85%)

Query: 69  EKEEEERKKKKKRKKRKKKKKKKEKRI 95
           ++++EE+  KK+ K++KKK+KKK+K+ 
Sbjct: 76  KRKDEEKTAKKRAKRQKKKQKKKKKKK 102



 Score = 31.9 bits (73), Expect = 0.080
 Identities = 13/28 (46%), Positives = 20/28 (71%)

Query: 69  EKEEEERKKKKKRKKRKKKKKKKEKRIP 96
            K+  +R+KKK++KK+KKK KK  K+  
Sbjct: 84  AKKRAKRQKKKQKKKKKKKAKKGNKKEE 111



 Score = 30.8 bits (70), Expect = 0.22
 Identities = 12/27 (44%), Positives = 21/27 (77%)

Query: 67  AMEKEEEERKKKKKRKKRKKKKKKKEK 93
              K+EE+  KK+ ++++KK+KKKK+K
Sbjct: 75  KKRKDEEKTAKKRAKRQKKKQKKKKKK 101



 Score = 29.6 bits (67), Expect = 0.45
 Identities = 13/34 (38%), Positives = 23/34 (67%), Gaps = 3/34 (8%)

Query: 69  EKEEEERKKK---KKRKKRKKKKKKKEKRIPPRI 99
           +++ EE+K+K   K  KKR K++KKK+K+   + 
Sbjct: 69  QQKREEKKRKDEEKTAKKRAKRQKKKQKKKKKKK 102



 Score = 26.2 bits (58), Expect = 7.3
 Identities = 14/29 (48%), Positives = 20/29 (68%), Gaps = 3/29 (10%)

Query: 69 EKEEEERKKKKKRKKRK---KKKKKKEKR 94
          E E+EE ++K++ KKRK   K  KK+ KR
Sbjct: 62 ETEDEEFQQKREEKKRKDEEKTAKKRAKR 90


>gnl|CDD|143265 cd05857, Ig2_FGFR, Second immunoglobulin (Ig)-like domain of
           fibroblast growth factor (FGF) receptor.  Ig2_FGFR:
           second immunoglobulin (Ig)-like domain of fibroblast
           growth factor (FGF) receptor. FGF receptors bind FGF
           signaling polypeptides. FGFs participate in multiple
           processes such as morphogenesis, development, and
           angiogenesis. FGFs bind to four FGF receptor tyrosine
           kinases (FGFR1, -2, -3, -4). Receptor diversity is
           controlled by alternative splicing producing splice
           variants with different ligand binding characteristics
           and different expression patterns. FGFRs have an
           extracellular region comprised of three IG-like domains,
           a single transmembrane helix, and an intracellular
           tyrosine kinase domain. Ligand binding and specificity
           reside in the Ig-like domains 2 and 3, and the linker
           region that connects these two. FGFR activation and
           signaling depend on FGF-induced dimerization, a process
           involving cell surface heparin or heparan sulfate
           proteoglycans.
          Length = 85

 Score = 34.1 bits (78), Expect = 0.006
 Identities = 19/71 (26%), Positives = 27/71 (38%), Gaps = 6/71 (8%)

Query: 110 VKKGYSVTLECKADGNPVPNITWT------RKNNNLPGGEYSYSGNSLTVRHTNRHSAGI 163
           V    +V   C A GNP P + W       ++ + + G +      SL +        G 
Sbjct: 6   VPAANTVKFRCPAAGNPTPTMRWLKNGKEFKQEHRIGGYKVRNQHWSLIMESVVPSDKGN 65

Query: 164 YLCVANNMVGS 174
           Y CV  N  GS
Sbjct: 66  YTCVVENEYGS 76


>gnl|CDD|221654 pfam12589, WBS_methylT, Methyltransferase involved in
          Williams-Beuren syndrome.  This domain family is found
          in eukaryotes, and is typically between 72 and 83 amino
          acids in length. The family is found in association
          with pfam08241. This family is made up of
          S-adenosylmethionine-dependent methyltransferases. The
          proteins are deleted in Williams-Beuren syndrome (WBS),
          a complex developmental disorder with multisystemic
          manifestations including supravalvular aortic stenosis
          (SVAS) and a specific cognitive phenotype.
          Length = 85

 Score = 33.8 bits (78), Expect = 0.008
 Identities = 13/31 (41%), Positives = 18/31 (58%), Gaps = 4/31 (12%)

Query: 68 MEKEEEERKKKKKRKKRKKKK----KKKEKR 94
             +   RKKKKK+K +KK K    +KKE+ 
Sbjct: 34 RISQRNRRKKKKKKKLKKKSKEWILRKKEQM 64



 Score = 27.6 bits (62), Expect = 1.2
 Identities = 10/24 (41%), Positives = 16/24 (66%)

Query: 71 EEEERKKKKKRKKRKKKKKKKEKR 94
           +  R  ++ R+K+KKKKK K+K 
Sbjct: 30 SKVRRISQRNRRKKKKKKKLKKKS 53



 Score = 27.6 bits (62), Expect = 1.2
 Identities = 10/26 (38%), Positives = 16/26 (61%)

Query: 68 MEKEEEERKKKKKRKKRKKKKKKKEK 93
            K     ++ +++KK+KKK KKK K
Sbjct: 29 ASKVRRISQRNRRKKKKKKKLKKKSK 54


>gnl|CDD|143274 cd05866, Ig1_NCAM-2, First immunoglobulin (Ig)-like domain of
           neural cell adhesion molecule NCAM-2.  Ig1_NCAM-2:
           first immunoglobulin (Ig)-like domain of neural cell
           adhesion molecule NCAM-2 (OCAM/mamFas II, RNCAM). NCAM-2
           is organized similarly to NCAM, including five
           N-terminal Ig-like domains and two fibronectin type III
           domains. NCAM-2 is differentially expressed in the
           developing and mature olfactory epithelium (OE), and may
           function like NCAM, as an adhesion molecule.
          Length = 92

 Score = 33.8 bits (77), Expect = 0.009
 Identities = 27/89 (30%), Positives = 37/89 (41%), Gaps = 17/89 (19%)

Query: 107 KVEVKKGYSVTLECKADGNPVPNITWTRKNNNLPGGEYSYSG-----------NSLTVRH 155
           KVE+  G S    C A G P  +I W       P GE   S            + LT+ +
Sbjct: 9   KVELSVGESKFFTCTAIGEPE-SIDWYN-----PQGEKIVSSQRVVVQKEGVRSRLTIYN 62

Query: 156 TNRHSAGIYLCVANNMVGSSAAASIALHV 184
            N   AGIY C A +  G +  A++ L +
Sbjct: 63  ANIEDAGIYRCQATDAKGQTQEATVVLEI 91


>gnl|CDD|143240 cd05763, Ig_1, Subgroup of the immunoglobulin (Ig) superfamily.
           Ig_1: subgroup of the immunoglobulin (Ig) domain found
           in the Ig superfamily. The Ig superfamily is a
           heterogeneous group of proteins, built on a common fold
           comprised of a sandwich of two beta sheets. Members of
           the Ig superfamily are components of immunoglobulin,
           neuroglia, cell surface glycoproteins, such as T-cell
           receptors, CD2, CD4, CD8, and membrane glycoproteins,
           such as butyrophilin and chondroitin sulfate
           proteoglycan core protein. A predominant feature of most
           Ig domains is a disulfide bridge connecting the two
           beta-sheets with a tryptophan residue packed against the
           disulfide bond.
          Length = 75

 Score = 33.4 bits (76), Expect = 0.009
 Identities = 20/72 (27%), Positives = 29/72 (40%), Gaps = 7/72 (9%)

Query: 118 LECKADGNPVPNITWTRK-NNNLPGGEYSY-----SGNSLTVRHTNRHSAGIYLCVANNM 171
           LEC A G+P P I W +    + P             +   +        G+Y C A N 
Sbjct: 3   LECAATGHPTPQIAWQKDGGTDFPAARERRMHVMPEDDVFFIVDVKIEDTGVYSCTAQNT 62

Query: 172 VGS-SAAASIAL 182
            GS SA A++ +
Sbjct: 63  AGSISANATLTV 74


>gnl|CDD|143210 cd05733, Ig6_L1-CAM_like, Sixth immunoglobulin (Ig)-like domain of
           the L1 cell adhesion molecule (CAM) and similar
           proteins.  Ig6_L1-CAM_like: domain similar to the sixth
           immunoglobulin (Ig)-like domain of the L1 cell adhesion
           molecule (CAM).  L1 belongs to the L1 subfamily of cell
           adhesion molecules (CAMs) and is comprised of an
           extracellular region having six Ig-like domains and five
           fibronectin type III domains, a transmembrane region and
           an intracellular domain. L1 is primarily expressed in
           the nervous system and is involved in its development
           and function. L1 is associated with an X-linked
           recessive disorder, X-linked hydrocephalus, MASA
           syndrome, or spastic paraplegia type 1, that involves
           abnormalities of axonal growth. This group also contains
           NrCAM [Ng(neuronglia)CAM-related cell adhesion
           molecule], which is primarily expressed in the nervous
           system, and human neurofascin.
          Length = 77

 Score = 33.5 bits (77), Expect = 0.009
 Identities = 21/76 (27%), Positives = 35/76 (46%), Gaps = 9/76 (11%)

Query: 116 VTLECKADGNPVPNITWTRKNNNL-----PGGEYSYSGNSLTVRHTNRHSA----GIYLC 166
           + ++C+A GNP P  +WTR   +      P         +L + + N   A    G Y C
Sbjct: 1   IVIKCEAKGNPPPTFSWTRNGTHFDPEKDPRVTMKPDSGTLVIDNMNGGRAEDYEGEYQC 60

Query: 167 VANNMVGSSAAASIAL 182
            A+N +G++ +  I L
Sbjct: 61  YASNELGTAISNEIHL 76


>gnl|CDD|217502 pfam03343, SART-1, SART-1 family.  SART-1 is a protein involved in
           cell cycle arrest and pre-mRNA splicing. It has been
           shown to be a component of U4/U6 x U5 tri-snRNP complex
           in human, Schizosaccharomyces pombe and Saccharomyces
           cerevisiae. SART-1 is a known tumour antigen in a range
           of cancers recognised by T cells.
          Length = 603

 Score = 35.5 bits (82), Expect = 0.013
 Identities = 13/25 (52%), Positives = 17/25 (68%)

Query: 71  EEEERKKKKKRKKRKKKKKKKEKRI 95
           +  E  K KK KK+KKKKKK+ K +
Sbjct: 268 DVSEMVKFKKPKKKKKKKKKRRKDL 292



 Score = 33.2 bits (76), Expect = 0.064
 Identities = 11/26 (42%), Positives = 18/26 (69%)

Query: 68  MEKEEEERKKKKKRKKRKKKKKKKEK 93
            +  E  + KK K+KK+KKKK++K+ 
Sbjct: 267 YDVSEMVKFKKPKKKKKKKKKRRKDL 292



 Score = 30.5 bits (69), Expect = 0.50
 Identities = 11/25 (44%), Positives = 16/25 (64%)

Query: 70  KEEEERKKKKKRKKRKKKKKKKEKR 94
              E  K KK +KK+KKKKK+++  
Sbjct: 268 DVSEMVKFKKPKKKKKKKKKRRKDL 292



 Score = 28.2 bits (63), Expect = 3.1
 Identities = 9/18 (50%), Positives = 14/18 (77%)

Query: 75  RKKKKKRKKRKKKKKKKE 92
           + KKKK+KK+K++K   E
Sbjct: 277 KPKKKKKKKKKRRKDLDE 294



 Score = 27.8 bits (62), Expect = 4.5
 Identities = 9/18 (50%), Positives = 15/18 (83%)

Query: 75  RKKKKKRKKRKKKKKKKE 92
           +K KKK+KK+KK++K  +
Sbjct: 276 KKPKKKKKKKKKRRKDLD 293



 Score = 26.6 bits (59), Expect = 9.8
 Identities = 12/25 (48%), Positives = 18/25 (72%)

Query: 68  MEKEEEERKKKKKRKKRKKKKKKKE 92
           M K ++ +KKKKK+KKR+K   + E
Sbjct: 272 MVKFKKPKKKKKKKKKRRKDLDEDE 296


>gnl|CDD|143275 cd05867, Ig4_L1-CAM_like, Fourth immunoglobulin (Ig)-like domain of
           the L1 cell adhesion molecule (CAM).  Ig4_L1-CAM_like:
           fourth immunoglobulin (Ig)-like domain of the L1 cell
           adhesion molecule (CAM). L1 is comprised of an
           extracellular region having six Ig-like domains and five
           fibronectin type III domains, a transmembrane region and
           an intracellular domain. L1 is primarily expressed in
           the nervous system and is involved in its development
           and function. L1 is associated with an X-linked
           recessive disorder, X-linked hydrocephalus, MASA
           syndrome, or spastic paraplegia type 1, that involves
           abnormalities of axonal growth. This group also contains
           the chicken neuron-glia cell adhesion molecule, Ng-CAM.
          Length = 76

 Score = 32.9 bits (75), Expect = 0.014
 Identities = 19/67 (28%), Positives = 27/67 (40%), Gaps = 8/67 (11%)

Query: 113 GYSVTLECKADGNPVPNITWTRKNNNLPGGE------YSYSGNSLTVRHTNRHSAGIYLC 166
           G +  L+C+ +G P PNITW+   N  P            S  +L +         +Y C
Sbjct: 1   GETARLDCQVEGIPTPNITWSI--NGAPIEGTDPDPRRHVSSGALILTDVQPSDTAVYQC 58

Query: 167 VANNMVG 173
            A N  G
Sbjct: 59  EARNRHG 65


>gnl|CDD|143283 cd05875, Ig6_hNeurofascin_like, Sixth immunoglobulin (Ig)-like
           domain of human neurofascin (NF).
           Ig6_hNeurofascin_like:  the sixth immunoglobulin
           (Ig)-like domain of human neurofascin (NF). NF belongs
           to the L1 subfamily of cell adhesion molecules (CAMs)
           and is comprised of an extracellular region having six
           Ig-like domains and five fibronectin type III domains, a
           transmembrane region, and a cytoplasmic domain. NF has
           many alternatively spliced isoforms having different
           temporal expression patterns during development. NF
           participates in axon subcellular targeting and synapse
           formation; however, little is known of the functions of
           the different isoforms.
          Length = 77

 Score = 33.0 bits (75), Expect = 0.015
 Identities = 21/76 (27%), Positives = 31/76 (40%), Gaps = 9/76 (11%)

Query: 116 VTLECKADGNPVPNITWTRKNNNL-----PGGEYSYSGNSLTVRHTNRHSA----GIYLC 166
           + +EC+A GNPVP   WTR          P         +L +  +         G Y C
Sbjct: 1   IIIECEAKGNPVPTFQWTRNGKFFNVAKDPRVSMRRRSGTLVIDFSGGGRPEDYEGEYQC 60

Query: 167 VANNMVGSSAAASIAL 182
            A N +G++ +  I L
Sbjct: 61  FARNNLGTALSNKILL 76


>gnl|CDD|236978 PRK11778, PRK11778, putative inner membrane peptidase; Provisional.
          Length = 330

 Score = 34.8 bits (81), Expect = 0.016
 Identities = 10/35 (28%), Positives = 22/35 (62%), Gaps = 1/35 (2%)

Query: 68  MEKEEEERKKKKKRKKRKKKKKKKEKRIPPRIIYV 102
           ++K+E +   K ++KK K++ K  + +  PR ++V
Sbjct: 61  LDKKELKAWHKAQKKKEKQEAKAAKAKSKPR-LFV 94



 Score = 27.9 bits (63), Expect = 3.0
 Identities = 9/22 (40%), Positives = 12/22 (54%)

Query: 72 EEERKKKKKRKKRKKKKKKKEK 93
          +     KK+ K   K +KKKEK
Sbjct: 57 KAALLDKKELKAWHKAQKKKEK 78


>gnl|CDD|143220 cd05743, Ig_Perlecan_D2_like, Immunoglobulin (Ig)-like domain II
           (D2) of the human basement membrane heparan sulfate
           proteoglycan perlecan, also known as HSPG2.
           Ig_Perlecan_D2_like: the immunoglobulin (Ig)-like domain
           II (D2) of the human basement membrane heparan sulfate
           proteoglycan perlecan, also known as HSPG2. Perlecan
           consists of five domains. Domain I has three putative
           heparan sulfate attachment sites; domain II has four LDL
           receptor-like repeats, and one Ig-like repeat; domain
           III resembles the short arm of laminin chains; domain IV
           has multiple Ig-like repeats (21 repeats in human
           perlecan); and domain V resembles the globular G domain
           of the laminin A chain and internal repeats of EGF.
           Perlecan may participate in a variety of biological
           functions including cell binding, LDL-metabolism,
           basement membrane assembly and selective permeability,
           calcium binding, and growth- and neurite-promoting
           activities.
          Length = 78

 Score = 32.8 bits (75), Expect = 0.016
 Identities = 21/66 (31%), Positives = 26/66 (39%), Gaps = 5/66 (7%)

Query: 113 GYSVTLECKADGNPVPNITWTRKNNNLPGGEY----SYSG-NSLTVRHTNRHSAGIYLCV 167
           G +V   C A G P P I W     ++P        S  G  +LT+R       G Y C 
Sbjct: 1   GETVEFTCVATGVPTPIINWRLNWGHVPDSARVSITSEGGYGTLTIRDVKESDQGAYTCE 60

Query: 168 ANNMVG 173
           A N  G
Sbjct: 61  AINTRG 66


>gnl|CDD|143171 cd04970, Ig6_Contactin_like, Sixth Ig domain of contactin.
           Ig6_Contactin_like: Sixth Ig domain of contactins.
           Contactins are neural cell adhesion molecules and are
           comprised of six Ig domains followed by four fibronectin
           type III (FnIII) domains anchored to the membrane by
           glycosylphosphatidylinositol. The first four Ig domains
           form the intermolecular binding fragment, which arranges
           as a compact U-shaped module via contacts between Ig
           domains 1 and 4, and between Ig domains 2 and 3.
           Contactin-2 (TAG-1, axonin-1) may play a part in the
           neuronal processes of neurite outgrowth, axon guidance
           and fasciculation, and neuronal migration. This group
           also includes contactin-1 and contactin-5. The different
           contactins show different expression patterns in the
           central nervous system. During development and in
           adulthood, contactin-2 is transiently expressed in
           subsets of central and peripheral neurons. Contactin-5
           is expressed specifically in the rat postnatal nervous
           system, peaking at about 3 weeks postnatal, and a lack
           of contactin-5 (NB-2) results in an impairment of
           neuronal activity in the rat auditory system. Contactin-5
           is highly expressed in the adult human brain in the
           occipital lobe and in the amygdala. Contactin-1 is
           differentially expressed in tumor tissues and may,
           through a RhoA mechanism, facilitate invasion and
           metastasis of human lung adenocarcinoma.
          Length = 85

 Score = 32.9 bits (75), Expect = 0.018
 Identities = 24/76 (31%), Positives = 37/76 (48%), Gaps = 11/76 (14%)

Query: 115 SVTLECKADGNPVPNITWTRKNNNLP------GGEYSYSG-----NSLTVRHTNRHSAGI 163
           S+TL+C A  +P  ++T+T   N +P      GG Y   G       L +R+     AG 
Sbjct: 2   SITLQCHASHDPTLDLTFTWSFNGVPIDFDKDGGHYRRVGGKDSNGDLMIRNAQLKHAGK 61

Query: 164 YLCVANNMVGSSAAAS 179
           Y C A  +V S +A++
Sbjct: 62  YTCTAQTVVDSLSASA 77


>gnl|CDD|219868 pfam08496, Peptidase_S49_N, Peptidase family S49 N-terminal.  This
           domain is found at the N-terminus of bacterial signal
           peptidases of the S49 family (pfam01343).
          Length = 154

 Score = 34.0 bits (79), Expect = 0.018
 Identities = 12/33 (36%), Positives = 20/33 (60%), Gaps = 1/33 (3%)

Query: 70  KEEEERKKKKKRKKRKKKKKKKEKRIPPRIIYV 102
             E+  KK +K K + +KKK K++   PR ++V
Sbjct: 70  AWEKAEKKAEKAKAKAEKKKAKKEEPKPR-LFV 101



 Score = 27.5 bits (62), Expect = 2.7
 Identities = 9/23 (39%), Positives = 12/23 (52%)

Query: 72 EEERKKKKKRKKRKKKKKKKEKR 94
          E     KK+ K  +K +KK EK 
Sbjct: 59 EAALLDKKELKAWEKAEKKAEKA 81


>gnl|CDD|205206 pfam13025, DUF3886, Protein of unknown function (DUF3886).  This
          family of proteins is functionally uncharacterized.
          This family of proteins is found in bacteria. Proteins
          in this family are approximately 90 amino acids in
          length. There are two completely conserved L residues
          that may be functionally important.
          Length = 70

 Score = 32.3 bits (74), Expect = 0.020
 Identities = 7/24 (29%), Positives = 21/24 (87%)

Query: 70 KEEEERKKKKKRKKRKKKKKKKEK 93
          K EEE++++++  ++++++K++EK
Sbjct: 30 KAEEEKREEEEEARKREERKEREK 53



 Score = 31.2 bits (71), Expect = 0.054
 Identities = 9/27 (33%), Positives = 22/27 (81%), Gaps = 1/27 (3%)

Query: 68 MEKEEEERKKK-KKRKKRKKKKKKKEK 93
          ++ EEE+R+++ + RK+ ++K+++K K
Sbjct: 29 LKAEEEKREEEEEARKREERKEREKNK 55



 Score = 29.2 bits (66), Expect = 0.25
 Identities = 9/31 (29%), Positives = 22/31 (70%), Gaps = 2/31 (6%)

Query: 66 LAMEKE--EEERKKKKKRKKRKKKKKKKEKR 94
           A +KE   EE K++++ + RK++++K+ ++
Sbjct: 23 KAKKKELKAEEEKREEEEEARKREERKEREK 53


>gnl|CDD|220237 pfam09428, DUF2011, Fungal protein of unknown function (DUF2011).
           This is a family of fungal proteins whose function is
           unknown.
          Length = 130

 Score = 33.4 bits (77), Expect = 0.021
 Identities = 7/29 (24%), Positives = 19/29 (65%)

Query: 66  LAMEKEEEERKKKKKRKKRKKKKKKKEKR 94
               KE  E++K+ ++ + KK K++++++
Sbjct: 100 RERTKERAEKEKRTRKNREKKFKRRQKEK 128



 Score = 29.5 bits (67), Expect = 0.45
 Identities = 11/26 (42%), Positives = 21/26 (80%), Gaps = 1/26 (3%)

Query: 69  EKEEEERKKKKKRKKRKKKK-KKKEK 93
           E+ E+E++ +K R+K+ K++ K+KEK
Sbjct: 105 ERAEKEKRTRKNREKKFKRRQKEKEK 130



 Score = 29.2 bits (66), Expect = 0.71
 Identities = 9/31 (29%), Positives = 20/31 (64%), Gaps = 2/31 (6%)

Query: 66  LAMEKEEEERKKKKKRKKRKKK--KKKKEKR 94
           +A+    E  K++ +++KR +K  +KK ++R
Sbjct: 94  IALRLRRERTKERAEKEKRTRKNREKKFKRR 124



 Score = 26.9 bits (60), Expect = 4.4
 Identities = 8/21 (38%), Positives = 14/21 (66%)

Query: 75  RKKKKKRKKRKKKKKKKEKRI 95
           R   + R++R K++ +KEKR 
Sbjct: 93  RIALRLRRERTKERAEKEKRT 113


>gnl|CDD|115057 pfam06375, BLVR, Bovine leukaemia virus receptor (BLVR).  This
           family consists of several bovine specific leukaemia
           virus receptors which are thought to function as
           transmembrane proteins, although their exact function is
           unknown.
          Length = 561

 Score = 34.7 bits (79), Expect = 0.022
 Identities = 14/27 (51%), Positives = 22/27 (81%), Gaps = 2/27 (7%)

Query: 70  KEEEERKKKKK--RKKRKKKKKKKEKR 94
           K EEER+ +++  + KR+KKK++KEKR
Sbjct: 83  KLEEERRHRQRLEKDKREKKKREKEKR 109



 Score = 32.3 bits (73), Expect = 0.11
 Identities = 11/25 (44%), Positives = 19/25 (76%)

Query: 71  EEEERKKKKKRKKRKKKKKKKEKRI 95
           E++ +K KKK KK K+K++ K+K+ 
Sbjct: 196 EKKSKKPKKKEKKEKEKERDKDKKK 220



 Score = 32.3 bits (73), Expect = 0.14
 Identities = 11/29 (37%), Positives = 21/29 (72%)

Query: 66  LAMEKEEEERKKKKKRKKRKKKKKKKEKR 94
           L  E+   +R +K KR+K+K++K+K+ +R
Sbjct: 84  LEEERRHRQRLEKDKREKKKREKEKRGRR 112



 Score = 29.3 bits (65), Expect = 1.5
 Identities = 6/26 (23%), Positives = 17/26 (65%)

Query: 69  EKEEEERKKKKKRKKRKKKKKKKEKR 94
            ++  E+ K++K+K+ K+K+ ++   
Sbjct: 90  HRQRLEKDKREKKKREKEKRGRRRHH 115



 Score = 28.9 bits (64), Expect = 1.7
 Identities = 5/25 (20%), Positives = 18/25 (72%)

Query: 73  EERKKKKKRKKRKKKKKKKEKRIPP 97
           E+ K++KK+++++K+ +++   +  
Sbjct: 95  EKDKREKKKREKEKRGRRRHHSLGT 119



 Score = 28.9 bits (64), Expect = 1.9
 Identities = 12/25 (48%), Positives = 18/25 (72%)

Query: 69  EKEEEERKKKKKRKKRKKKKKKKEK 93
           EK +    +KK +K +KK+KK+KEK
Sbjct: 188 EKGDVPAVEKKSKKPKKKEKKEKEK 212



 Score = 28.5 bits (63), Expect = 2.4
 Identities = 10/24 (41%), Positives = 15/24 (62%)

Query: 74  ERKKKKKRKKRKKKKKKKEKRIPP 97
           ++KK++K K+ KKKKKK       
Sbjct: 281 KKKKQRKEKEEKKKKKKHHHHRCH 304



 Score = 28.5 bits (63), Expect = 2.6
 Identities = 12/29 (41%), Positives = 23/29 (79%)

Query: 67  AMEKEEEERKKKKKRKKRKKKKKKKEKRI 95
           A+EK+ ++ KKK+K++K K++ K K+K +
Sbjct: 194 AVEKKSKKPKKKEKKEKEKERDKDKKKEV 222



 Score = 27.3 bits (60), Expect = 5.5
 Identities = 12/30 (40%), Positives = 19/30 (63%)

Query: 69  EKEEEERKKKKKRKKRKKKKKKKEKRIPPR 98
           E ++ E ++ KK  K KKKK++KEK    +
Sbjct: 265 EPKDAEAEETKKSPKHKKKKQRKEKEEKKK 294



 Score = 27.3 bits (60), Expect = 5.8
 Identities = 13/33 (39%), Positives = 22/33 (66%), Gaps = 2/33 (6%)

Query: 68  MEKEEEERK--KKKKRKKRKKKKKKKEKRIPPR 98
             + EE +K  K KK+K+RK+K++KK+K+    
Sbjct: 268 DAEAEETKKSPKHKKKKQRKEKEEKKKKKKHHH 300



 Score = 27.3 bits (60), Expect = 6.4
 Identities = 8/26 (30%), Positives = 18/26 (69%)

Query: 73  EERKKKKKRKKRKKKKKKKEKRIPPR 98
           +++K++K+++++KKKKK    R    
Sbjct: 281 KKKKQRKEKEEKKKKKKHHHHRCHHS 306


>gnl|CDD|221275 pfam11861, DUF3381, Domain of unknown function (DUF3381).  This
           domain is functionally uncharacterized. This domain is
           found in eukaryotes. This presumed domain is typically
           between 156 and 174 amino acids in length. This domain is
           found associated with pfam07780, pfam01728.
          Length = 154

 Score = 33.4 bits (77), Expect = 0.027
 Identities = 6/25 (24%), Positives = 18/25 (72%)

Query: 71  EEEERKKKKKRKKRKKKKKKKEKRI 95
           E  E++  K +++++++ ++K+K I
Sbjct: 122 ELLEKELAKLKREKRRENERKQKEI 146



 Score = 31.5 bits (72), Expect = 0.13
 Identities = 6/27 (22%), Positives = 17/27 (62%)

Query: 68  MEKEEEERKKKKKRKKRKKKKKKKEKR 94
              E+E  K K+++++  ++K+K+  +
Sbjct: 122 ELLEKELAKLKREKRRENERKQKEILK 148



 Score = 31.5 bits (72), Expect = 0.13
 Identities = 7/27 (25%), Positives = 19/27 (70%)

Query: 69  EKEEEERKKKKKRKKRKKKKKKKEKRI 95
           ++  E+   K KR+KR++ ++K+++ +
Sbjct: 121 DELLEKELAKLKREKRRENERKQKEIL 147



 Score = 30.3 bits (69), Expect = 0.29
 Identities = 9/28 (32%), Positives = 20/28 (71%), Gaps = 1/28 (3%)

Query: 68  MEKE-EEERKKKKKRKKRKKKKKKKEKR 94
           +EKE  + +++K++  +RK+K+  KE+ 
Sbjct: 124 LEKELAKLKREKRRENERKQKEILKEQM 151


>gnl|CDD|218188 pfam04641, Rtf2, Replication termination factor 2.  It is vital for
           effective cell-replication that replication is not
           stalled at any point by, for instance, damaged bases.
           Rtf2 stabilizes the replication fork stalled at the
           site-specific replication barrier RTS1 by preventing
           replication restart until completion of DNA synthesis by
           a converging replication fork initiated at a flanking
           origin. The RTS1 element terminates replication forks
           that are moving in the cen2-distal direction while
           allowing forks moving in the cen2-proximal direction to
           pass through the region. Rtf2 contains a C2HC2 motif
           related to the C3HC4 RING-finger motif, and would appear
           to fold up, creating a RING finger-like structure but
           forming only one functional Zn2+ ion-binding site.
          Length = 254

 Score = 33.9 bits (78), Expect = 0.033
 Identities = 11/28 (39%), Positives = 16/28 (57%)

Query: 66  LAMEKEEEERKKKKKRKKRKKKKKKKEK 93
             +E+E  ++KKKKK+KK KK       
Sbjct: 174 ARLEEERAKKKKKKKKKKTKKNNATGSS 201



 Score = 33.1 bits (76), Expect = 0.056
 Identities = 19/56 (33%), Positives = 25/56 (44%)

Query: 67  AMEKEEEERKKKKKRKKRKKKKKKKEKRIPPRIIYVSGAGKVEVKKGYSVTLECKA 122
            ++   EE + KKK+KK+KKK KK           VS A   E+  G     E K 
Sbjct: 171 LLKARLEEERAKKKKKKKKKKTKKNNATGSSAEATVSSAVPTELSSGAGQVGEAKK 226



 Score = 30.4 bits (69), Expect = 0.48
 Identities = 15/28 (53%), Positives = 18/28 (64%)

Query: 66  LAMEKEEEERKKKKKRKKRKKKKKKKEK 93
           L   + EEER KKKK+KK+KK KK    
Sbjct: 171 LLKARLEEERAKKKKKKKKKKTKKNNAT 198



 Score = 28.5 bits (64), Expect = 1.8
 Identities = 13/32 (40%), Positives = 19/32 (59%), Gaps = 2/32 (6%)

Query: 65  ILAMEKEEEERKKKK--KRKKRKKKKKKKEKR 94
            L   +EE E  K +  + + +KKKKKKK+K 
Sbjct: 161 PLNPTEEEVELLKARLEEERAKKKKKKKKKKT 192


>gnl|CDD|143173 cd04972, Ig_TrkABC_d4, Fourth domain (immunoglobulin-like) of Trk
           receptors TrkA, TrkB and TrkC.  TrkABC_d4: the fourth
           domain of Trk receptors TrkA, TrkB and TrkC, this is an
           immunoglobulin (Ig)-like domain which binds to
           neurotrophin. The Trk family of receptors are tyrosine
           kinase receptors. They are activated by dimerization,
           leading to autophosphorylation of intracellular tyrosine
           residues, and triggering the signal transduction
           pathway. TrkA, TrkB, and TrkC share significant sequence
           homology and domain organization. The first three
           domains are leucine-rich domains. The fourth and fifth
           domains are Ig-like domains playing a part in ligand
           binding. TrkA, B and C mediate the trophic effects of the
           neurotrophin Nerve growth factor (NGF) family. TrkA is
           recognized by NGF. TrkB is recognized by brain-derived
           neurotrophic factor (BDNF) and neurotrophin (NT)-4. TrkC
           is recognized by NT-3. NT-3 is promiscuous as in some
           cell systems it activates TrkA and TrkB receptors. TrkA
           is a receptor found in all major NGF targets, including
           the sympathetic, trigeminal, and dorsal root ganglia,
           cholinergic neurons of the basal forebrain and the
           striatum. TrkB transcripts are found throughout multiple
           structures of the central and peripheral nervous
           systems. The TrkC gene is expressed throughout the
           mammalian nervous system.
          Length = 90

 Score = 32.1 bits (73), Expect = 0.033
 Identities = 22/90 (24%), Positives = 36/90 (40%), Gaps = 6/90 (6%)

Query: 100 IYVSGAGKVEVKKGYSVTLECKADGNPVPNITWT-----RKNNNLPGGEYSYSGNSLTVR 154
           I V G     V +G + T+ C A+G+P+P + W               E +    +L + 
Sbjct: 2   IPVDGPNATVVYEGGTATIRCTAEGSPLPKVEWIIAGLIVIQTRTDTLETTVDIYNLQLS 61

Query: 155 HTNRHSAGIYLCVANNMVGSSAAASIALHV 184
           +    +     C A N VG  A  S+ + V
Sbjct: 62  NITSETQTTVTCTAENPVG-QANVSVQVTV 90


>gnl|CDD|143260 cd05852, Ig5_Contactin-1, Fifth Ig domain of contactin-1.
           Ig5_Contactin-1: fifth Ig domain of the neural cell
           adhesion molecule contactin-1. Contactins are comprised
           of six Ig domains followed by four fibronectin type III
           (FnIII) domains anchored to the membrane by
           glycosylphosphatidylinositol. Contactin-1 is
           differentially expressed in tumor tissues and may,
           through a RhoA mechanism, facilitate invasion and
           metastasis of human lung adenocarcinoma.
          Length = 73

 Score = 31.5 bits (71), Expect = 0.039
 Identities = 17/70 (24%), Positives = 28/70 (40%), Gaps = 2/70 (2%)

Query: 113 GYSVTLECKADGNPVPNITWTRKNNNLPGGE--YSYSGNSLTVRHTNRHSAGIYLCVANN 170
           G  V +ECK    P P  +W++    L        +   SL + +  +   G Y C A N
Sbjct: 1   GGRVIIECKPKAAPKPKFSWSKGTELLVNNSRISIWDDGSLEILNITKLDEGSYTCFAEN 60

Query: 171 MVGSSAAASI 180
             G + +  +
Sbjct: 61  NRGKANSTGV 70


>gnl|CDD|221028 pfam11208, DUF2992, Protein of unknown function (DUF2992).  This
           bacterial family of proteins has no known function.
           However, the cis-regulatory yjdF motif, just upstream
           from the gene encoding the proteins for this family, is
           a small non-coding RNA, Rfam:RF01764. The yjdF motif is
           found in many Firmicutes, including Bacillus subtilis.
           In most cases, it resides in potential 5' UTRs of
           homologues of the yjdF gene whose function is unknown.
           However, in Streptococcus thermophilus, a yjdF RNA motif
           is associated with an operon whose protein products
           synthesise nicotinamide adenine dinucleotide (NAD+).
           Also, the S. thermophilus yjdF RNA lacks typical yjdF
           motif consensus features downstream of and including the
           P4 stem. Thus, if yjdF RNAs are riboswitch aptamers, the
           S. thermophilus RNAs might sense a distinct compound
           that structurally resembles the ligand bound by other
           yjdF RNAs. On the other hand, perhaps these RNAs have an
           alternative solution forming a similar binding site, as
           is observed with some SAM riboswitches.
          Length = 132

 Score = 32.6 bits (75), Expect = 0.047
 Identities = 10/28 (35%), Positives = 23/28 (82%)

Query: 67  AMEKEEEERKKKKKRKKRKKKKKKKEKR 94
           A++ E E  K++KK++ ++KK+++KE++
Sbjct: 90  ALKLEHERNKQEKKKRSKEKKEEEKERK 117



 Score = 28.8 bits (65), Expect = 0.96
 Identities = 10/24 (41%), Positives = 20/24 (83%)

Query: 70  KEEEERKKKKKRKKRKKKKKKKEK 93
           KE++E +K++KR+ +++KKK K +
Sbjct: 107 KEKKEEEKERKRQLKQQKKKAKHR 130



 Score = 28.0 bits (63), Expect = 1.6
 Identities = 11/23 (47%), Positives = 20/23 (86%)

Query: 69  EKEEEERKKKKKRKKRKKKKKKK 91
           EK+EEE+++K++ K++KKK K +
Sbjct: 108 EKKEEEKERKRQLKQQKKKAKHR 130


>gnl|CDD|143262 cd05854, Ig6_Contactin-2, Sixth Ig domain of contactin-2.
           Ig6_Contactin-2: Sixth Ig domain of the neural cell
           adhesion molecule contactin-2-like. Contactins are
           comprised of six Ig domains followed by four fibronectin
           type III (FnIII) domains anchored to the membrane by
           glycosylphosphatidylinositol. Contactin-2 (TAG-1,
           axonin-1) facilitates cell adhesion by homophilic
           binding between molecules in apposed membranes. It may
           play a part in the neuronal processes of neurite
           outgrowth, axon guidance and fasciculation, and neuronal
           migration. The first four Ig domains form the
           intermolecular binding fragment, which arranges as a
           compact U-shaped module by contacts between IG domains 1
           and 4, and domains 2 and 3. The different contactins
           show different expression patterns in the central
           nervous system. During development and in adulthood,
           contactin-2 is transiently expressed in subsets of
           central and peripheral neurons. Contactin-2 is also
           expressed in retinal amacrine cells in the developing
           chick retina, corresponding to the period of formation
           and maturation of AC processes.
          Length = 85

 Score = 31.9 bits (72), Expect = 0.047
 Identities = 19/76 (25%), Positives = 35/76 (46%), Gaps = 11/76 (14%)

Query: 115 SVTLECKADGNPVPNITWTRKNNNLP------GGEYSYSG-----NSLTVRHTNRHSAGI 163
           ++TL+C A  +P  ++T+T   ++ P       G Y           L + +     AG 
Sbjct: 2   NLTLQCHASHDPTMDLTFTWSLDDFPIDLDKPNGHYRRMEVKETIGDLVIVNAQLSHAGT 61

Query: 164 YLCVANNMVGSSAAAS 179
           Y C A  +V S++A++
Sbjct: 62  YTCTAQTVVDSASASA 77


>gnl|CDD|143216 cd05739, Ig3_RPTP_IIa_LAR_like, Third immunoglobulin (Ig)-like
           domain of the receptor protein tyrosine phosphatase
           (RPTP)-F, also known as LAR.  Ig3_RPTP_IIa_LAR_like:
           domain similar to the third immunoglobulin (Ig)-like
           domain found in the receptor protein tyrosine
           phosphatase (RPTP)-F, also known as LAR. LAR belongs to
           the RPTP type IIa subfamily. Members of this subfamily
           are cell adhesion molecule-like proteins involved in
           central nervous system (CNS) development. They have
           large extracellular portions, comprised of multiple
           IG-like domains and two to nine fibronectin type III
           (FNIII) domains, and a cytoplasmic portion having two
           tandem phosphatase domains. Included in this group is
           Drosophila LAR (DLAR).
          Length = 69

 Score = 31.4 bits (71), Expect = 0.049
 Identities = 23/73 (31%), Positives = 33/73 (45%), Gaps = 13/73 (17%)

Query: 113 GYSVTLECKADGNPVPNITWTR------KNNNLPGGEYSYSGNSLTVRHTNRHSAGIYLC 166
           G SV L C A G P+P + W +      K + +P G      N L + +    SA  Y C
Sbjct: 1   GGSVNLTCVAVGAPMPYVKWMKGGEELTKEDEMPVGR-----NVLELTNI-YESAN-YTC 53

Query: 167 VANNMVGSSAAAS 179
           VA + +G   A +
Sbjct: 54  VAISSLGMIEATA 66


>gnl|CDD|219924 pfam08597, eIF3_subunit, Translation initiation factor eIF3
           subunit.  This is a family of proteins which are
           subunits of the eukaryotic translation initiation factor
           3 (eIF3). In yeast it is called Hcr1. The Saccharomyces
           cerevisiae protein eIF3j (HCR1) has been shown to be
           required for processing of 20S pre-rRNA and binds to 18S
           rRNA and eIF3 subunits Rpg1p and Prt1p.
          Length = 242

 Score = 33.1 bits (76), Expect = 0.053
 Identities = 10/26 (38%), Positives = 15/26 (57%)

Query: 65  ILAMEKEEEERKKKKKRKKRKKKKKK 90
           +L  EK + E+  K  +KK+ K K K
Sbjct: 189 VLINEKLKAEKAAKGGKKKKGKAKAK 214



 Score = 30.8 bits (70), Expect = 0.35
 Identities = 12/25 (48%), Positives = 14/25 (56%)

Query: 69 EKEEEERKKKKKRKKRKKKKKKKEK 93
          E EE+E +K K   K K KK  K K
Sbjct: 41 EDEEKEEEKAKVAAKAKAKKALKAK 65



 Score = 30.0 bits (68), Expect = 0.54
 Identities = 10/26 (38%), Positives = 17/26 (65%)

Query: 69 EKEEEERKKKKKRKKRKKKKKKKEKR 94
          ++EE+E K+++K K   K K KK  +
Sbjct: 38 DEEEDEEKEEEKAKVAAKAKAKKALK 63



 Score = 30.0 bits (68), Expect = 0.54
 Identities = 10/29 (34%), Positives = 16/29 (55%), Gaps = 1/29 (3%)

Query: 69 EKEEEERKKKKKRKKRKK-KKKKKEKRIP 96
          EK +   K K K+  + K ++K+K KR  
Sbjct: 48 EKAKVAAKAKAKKALKAKIEEKEKAKREK 76



 Score = 29.6 bits (67), Expect = 0.79
 Identities = 9/27 (33%), Positives = 17/27 (62%)

Query: 69 EKEEEERKKKKKRKKRKKKKKKKEKRI 95
          E++EE+ ++K K   + K KK  + +I
Sbjct: 40 EEDEEKEEEKAKVAAKAKAKKALKAKI 66



 Score = 27.3 bits (61), Expect = 4.1
 Identities = 10/25 (40%), Positives = 15/25 (60%)

Query: 69 EKEEEERKKKKKRKKRKKKKKKKEK 93
          E++EEE+ K   + K KK  K K +
Sbjct: 43 EEKEEEKAKVAAKAKAKKALKAKIE 67


>gnl|CDD|143239 cd05762, Ig8_MLCK, Eighth immunoglobulin (Ig)-like domain of human
           myosin light-chain kinase (MLCK).  Ig8_MLCK: the eighth
           immunoglobulin (Ig)-like domain of human myosin
           light-chain kinase (MLCK). MLCK is a key regulator of
           different forms of cell motility involving actin and
           myosin II.  Agonist stimulation of smooth muscle cells
           increases cytosolic Ca2+, which binds calmodulin.  This
           Ca2+-calmodulin complex in turn binds to and activates
           MLCK. Activated MLCK leads to the phosphorylation of the
           20 kDa myosin regulatory light chain (RLC) of myosin II
           and the stimulation of actin-activated myosin MgATPase
           activity. MLCK is widely present in vertebrate tissues;
           it phosphorylates the 20 kDa RLC of both smooth and
           nonmuscle myosin II. Phosphorylation leads to the
           activation of the myosin motor domain and altered
           structural properties of myosin II. In smooth muscle
           MLCK is involved in initiating contraction. In
           nonmuscle cells, MLCK may participate in cell division
           and cell motility; it has been suggested MLCK plays a
           role in cardiomyocyte differentiation and contraction
           through regulation of nonmuscle myosin II.
          Length = 98

 Score = 31.9 bits (72), Expect = 0.059
 Identities = 20/75 (26%), Positives = 30/75 (40%), Gaps = 5/75 (6%)

Query: 109 EVKKGYSVTLECKADGNPVPNITWTRKNNNLPGGEY-----SYSGNSLTVRHTNRHSAGI 163
           +V+ G SV L CK  G      TW +    +  GE      + + + LT+    +   G 
Sbjct: 11  KVRAGESVELFCKVTGTQPITCTWMKFRKQIQEGEGIKIENTENSSKLTITEGQQEHCGC 70

Query: 164 YLCVANNMVGSSAAA 178
           Y     N +GS  A 
Sbjct: 71  YTLEVENKLGSRQAQ 85


>gnl|CDD|235322 PRK04950, PRK04950, ProP expression regulator; Provisional.
          Length = 213

 Score = 32.6 bits (75), Expect = 0.066
 Identities = 12/32 (37%), Positives = 18/32 (56%)

Query: 67  AMEKEEEERKKKKKRKKRKKKKKKKEKRIPPR 98
           A ++E    K+K  R++RK K K   K+  PR
Sbjct: 119 AKKREAAGEKEKAPRRERKPKPKAPRKKRKPR 150



 Score = 31.0 bits (71), Expect = 0.26
 Identities = 12/50 (24%), Positives = 26/50 (52%)

Query: 69  EKEEEERKKKKKRKKRKKKKKKKEKRIPPRIIYVSGAGKVEVKKGYSVTL 118
           +++   R++K K K  +KK+K + ++  P+   VS   ++ V +   V  
Sbjct: 128 KEKAPRRERKPKPKAPRKKRKPRAQKPEPQHTPVSDISELTVGQAVKVKA 177



 Score = 28.7 bits (65), Expect = 1.3
 Identities = 7/30 (23%), Positives = 16/30 (53%)

Query: 69  EKEEEERKKKKKRKKRKKKKKKKEKRIPPR 98
           E   E+ K  ++ +K K K  +K+++   +
Sbjct: 123 EAAGEKEKAPRRERKPKPKAPRKKRKPRAQ 152



 Score = 28.4 bits (64), Expect = 2.0
 Identities = 8/30 (26%), Positives = 16/30 (53%)

Query: 69  EKEEEERKKKKKRKKRKKKKKKKEKRIPPR 98
              E+E+  +++RK + K  +KK K    +
Sbjct: 124 AAGEKEKAPRRERKPKPKAPRKKRKPRAQK 153



 Score = 26.4 bits (59), Expect = 7.5
 Identities = 7/30 (23%), Positives = 18/30 (60%)

Query: 69  EKEEEERKKKKKRKKRKKKKKKKEKRIPPR 98
            +  E++ KK++    K+K  ++E++  P+
Sbjct: 112 AQRAEQQAKKREAAGEKEKAPRRERKPKPK 141


>gnl|CDD|219746 pfam08208, RNA_polI_A34, DNA-directed RNA polymerase I subunit
           RPA34.5.  This is a family of proteins conserved from
           yeasts to human. Subunit A34.5 of RNA polymerase I is a
           non-essential subunit which is thought to help Pol I
           overcome topological constraints imposed on ribosomal
           DNA during the process of transcription.
          Length = 193

 Score = 32.8 bits (75), Expect = 0.069
 Identities = 15/26 (57%), Positives = 23/26 (88%)

Query: 69  EKEEEERKKKKKRKKRKKKKKKKEKR 94
           E EEEE+K+KKK+K+ KK+KK+K+ +
Sbjct: 148 EVEEEEKKEKKKKKEVKKEKKEKKDK 173



 Score = 30.5 bits (69), Expect = 0.33
 Identities = 15/31 (48%), Positives = 23/31 (74%), Gaps = 2/31 (6%)

Query: 66  LAMEKEEEERKKKK--KRKKRKKKKKKKEKR 94
           +  EK+E++ KK+K  + K  KKKKKKK+K+
Sbjct: 163 VKKEKKEKKDKKEKMVEPKGSKKKKKKKKKK 193



 Score = 29.3 bits (66), Expect = 0.95
 Identities = 13/30 (43%), Positives = 21/30 (70%)

Query: 65  ILAMEKEEEERKKKKKRKKRKKKKKKKEKR 94
              +EKE E  +++KK KK+KK+ KK++K 
Sbjct: 140 TAKVEKEAEVEEEEKKEKKKKKEVKKEKKE 169



 Score = 29.3 bits (66), Expect = 0.98
 Identities = 15/26 (57%), Positives = 21/26 (80%)

Query: 68  MEKEEEERKKKKKRKKRKKKKKKKEK 93
            E E EE +KK+K+KK++ KK+KKEK
Sbjct: 145 KEAEVEEEEKKEKKKKKEVKKEKKEK 170



 Score = 27.4 bits (61), Expect = 4.4
 Identities = 11/30 (36%), Positives = 19/30 (63%)

Query: 69  EKEEEERKKKKKRKKRKKKKKKKEKRIPPR 98
            K E+E + +++ KK KKKKK+ +K    +
Sbjct: 141 AKVEKEAEVEEEEKKEKKKKKEVKKEKKEK 170



 Score = 26.6 bits (59), Expect = 7.3
 Identities = 8/27 (29%), Positives = 17/27 (62%)

Query: 67  AMEKEEEERKKKKKRKKRKKKKKKKEK 93
                + E++ + + +++K+KKKKKE 
Sbjct: 137 KETTAKVEKEAEVEEEEKKEKKKKKEV 163


>gnl|CDD|143221 cd05744, Ig_Myotilin_C_like, Immunoglobulin (Ig)-like domain of
           myotilin, palladin, and myopalladin.
           Ig_Myotilin_like_C: immunoglobulin (Ig)-like domain in
           myotilin, palladin, and myopalladin.  Myotilin,
           palladin, and myopalladin function as scaffolds that
           regulate actin organization. Myotilin and myopalladin
           are most abundant in skeletal and cardiac muscle;
           palladin is ubiquitously expressed in the organs of
           developing vertebrates and  plays a key role in cellular
           morphogenesis. The three family members each interact
           with specific molecular partners: all three bind to
           alpha-actinin; in addition, palladin also binds to
           vasodilator-stimulated phosphoprotein (VASP) and ezrin,
           myotilin binds to filamin and actin, and myopalladin
           also binds to nebulin and cardiac ankyrin repeat protein
           (CARP).
          Length = 75

 Score = 30.9 bits (70), Expect = 0.071
 Identities = 20/65 (30%), Positives = 26/65 (40%), Gaps = 7/65 (10%)

Query: 116 VTLECKADGNPVPNITWTRKNNNL---PGGEYSYSGNS----LTVRHTNRHSAGIYLCVA 168
           V LEC+    P P I W + N  L         Y  N     L +++ N+  AG Y   A
Sbjct: 1   VRLECRVSAIPPPQIFWKKNNEMLTYNTDRISLYQDNCGRICLLIQNANKEDAGWYTVSA 60

Query: 169 NNMVG 173
            N  G
Sbjct: 61  VNEAG 65


>gnl|CDD|143301 cd05893, Ig_Palladin_C, C-terminal immunoglobulin (Ig)-like domain
           of palladin.  Ig_Palladin_C: C-terminal immunoglobulin
           (Ig)-like domain of palladin. Palladin belongs to the
           palladin-myotilin-myopalladin family. Proteins belonging
           to this family contain multiple Ig-like domains and
           function as scaffolds, modulating actin cytoskeleton.
           Palladin binds to alpha-actinin, ezrin,
           vasodilator-stimulated phosphoprotein VASP, SPIN90 (DIP,
           mDia interacting protein), and Src. Palladin also binds
           F-actin directly, via its Ig3 domain. Palladin is
           expressed as several alternatively spliced isoforms,
           having various combinations of Ig-like domains, in a
           cell-type-specific manner. It has been suggested that
           palladin's different Ig-like domains may be specialized
           for distinct functions.
          Length = 75

 Score = 31.2 bits (70), Expect = 0.073
 Identities = 19/65 (29%), Positives = 27/65 (41%), Gaps = 7/65 (10%)

Query: 116 VTLECKADGNPVPNITWTRKNNNLPGGEYSYSGNS-------LTVRHTNRHSAGIYLCVA 168
           V LEC+  G P P I W ++N +L       S +        L ++   +  AG Y   A
Sbjct: 1   VRLECRVSGVPHPQIFWKKENESLTHNTDRVSMHQDNCGYICLLIQGATKEDAGWYTVSA 60

Query: 169 NNMVG 173
            N  G
Sbjct: 61  KNEAG 65


>gnl|CDD|220648 pfam10243, MIP-T3, Microtubule-binding protein MIP-T3.  This
           protein, which interacts with both microtubules and
           TRAF3 (tumour necrosis factor receptor-associated factor
           3), is conserved from worms to humans. The N-terminal
           region is the microtubule binding domain and is
           well-conserved; the C-terminal 100 residues, also
           well-conserved, constitute the coiled-coil region which
           binds to TRAF3. The central region of the protein is
           rich in lysine and glutamic acid and carries KKE motifs
           which may also be necessary for tubulin-binding, but
           this region is the least well-conserved.
          Length = 506

 Score = 32.9 bits (75), Expect = 0.074
 Identities = 14/29 (48%), Positives = 20/29 (68%)

Query: 69  EKEEEERKKKKKRKKRKKKKKKKEKRIPP 97
           E  +EE K+K++ K+ KKKKK+K K  P 
Sbjct: 101 ESGKEEEKEKEQVKEEKKKKKEKPKEEPK 129



 Score = 30.2 bits (68), Expect = 0.59
 Identities = 10/30 (33%), Positives = 22/30 (73%)

Query: 69  EKEEEERKKKKKRKKRKKKKKKKEKRIPPR 98
             +EEE++K++ ++++KKKK+K ++    R
Sbjct: 102 SGKEEEKEKEQVKEEKKKKKEKPKEEPKDR 131



 Score = 30.2 bits (68), Expect = 0.62
 Identities = 8/26 (30%), Positives = 21/26 (80%)

Query: 69  EKEEEERKKKKKRKKRKKKKKKKEKR 94
            K E  ++++K++++ K++KKKK+++
Sbjct: 98  PKNESGKEEEKEKEQVKEEKKKKKEK 123



 Score = 30.2 bits (68), Expect = 0.65
 Identities = 8/30 (26%), Positives = 22/30 (73%)

Query: 69  EKEEEERKKKKKRKKRKKKKKKKEKRIPPR 98
           ++ + E  K+++++K + K++KK+K+  P+
Sbjct: 96  KEPKNESGKEEEKEKEQVKEEKKKKKEKPK 125



 Score = 29.1 bits (65), Expect = 1.5
 Identities = 12/26 (46%), Positives = 22/26 (84%)

Query: 69  EKEEEERKKKKKRKKRKKKKKKKEKR 94
           EKE+E+ K++KK+KK K K++ K+++
Sbjct: 107 EKEKEQVKEEKKKKKEKPKEEPKDRK 132



 Score = 28.7 bits (64), Expect = 1.8
 Identities = 11/28 (39%), Positives = 23/28 (82%)

Query: 69  EKEEEERKKKKKRKKRKKKKKKKEKRIP 96
           E++E+E+ K++K+KK++K K++ + R P
Sbjct: 106 EEKEKEQVKEEKKKKKEKPKEEPKDRKP 133



 Score = 28.7 bits (64), Expect = 1.8
 Identities = 16/34 (47%), Positives = 23/34 (67%), Gaps = 5/34 (14%)

Query: 69  EKEEEERKKKKKR-----KKRKKKKKKKEKRIPP 97
           E+ +EE+KKKK++     K RK K++ KEKR P 
Sbjct: 111 EQVKEEKKKKKEKPKEEPKDRKPKEEAKEKRPPK 144



 Score = 28.7 bits (64), Expect = 1.9
 Identities = 12/30 (40%), Positives = 20/30 (66%)

Query: 69  EKEEEERKKKKKRKKRKKKKKKKEKRIPPR 98
           + +EE++KKK+K K+  K +K KE+    R
Sbjct: 112 QVKEEKKKKKEKPKEEPKDRKPKEEAKEKR 141



 Score = 28.7 bits (64), Expect = 2.0
 Identities = 9/30 (30%), Positives = 13/30 (43%)

Query: 69  EKEEEERKKKKKRKKRKKKKKKKEKRIPPR 98
               + R KK  +KK   KKK+  +    R
Sbjct: 168 RVRAKSRPKKPPKKKPPNKKKEPPEEEKQR 197



 Score = 28.7 bits (64), Expect = 2.0
 Identities = 11/25 (44%), Positives = 16/25 (64%)

Query: 69  EKEEEERKKKKKRKKRKKKKKKKEK 93
            KEE + +K K+  K K+  K+KEK
Sbjct: 124 PKEEPKDRKPKEEAKEKRPPKEKEK 148



 Score = 28.3 bits (63), Expect = 2.5
 Identities = 7/30 (23%), Positives = 19/30 (63%)

Query: 69  EKEEEERKKKKKRKKRKKKKKKKEKRIPPR 98
           E +E+   K+K+++K KK ++ +++    +
Sbjct: 136 EAKEKRPPKEKEKEKEKKVEEPRDREEEKK 165



 Score = 27.9 bits (62), Expect = 3.1
 Identities = 8/33 (24%), Positives = 19/33 (57%)

Query: 67  AMEKEEEERKKKKKRKKRKKKKKKKEKRIPPRI 99
           A  K  +E K +  +++ K+K++ KE++   + 
Sbjct: 90  AKTKPAKEPKNESGKEEEKEKEQVKEEKKKKKE 122



 Score = 27.9 bits (62), Expect = 3.5
 Identities = 9/26 (34%), Positives = 18/26 (69%)

Query: 69  EKEEEERKKKKKRKKRKKKKKKKEKR 94
           E+ +E+R  K+K K+++KK ++   R
Sbjct: 135 EEAKEKRPPKEKEKEKEKKVEEPRDR 160



 Score = 27.5 bits (61), Expect = 4.2
 Identities = 10/37 (27%), Positives = 21/37 (56%), Gaps = 7/37 (18%)

Query: 69  EKEEEERKKKKKRKK-------RKKKKKKKEKRIPPR 98
           +++ +E  K +K K+        K+K+K+KEK++   
Sbjct: 121 KEKPKEEPKDRKPKEEAKEKRPPKEKEKEKEKKVEEP 157



 Score = 27.2 bits (60), Expect = 5.5
 Identities = 11/25 (44%), Positives = 18/25 (72%)

Query: 70  KEEEERKKKKKRKKRKKKKKKKEKR 94
           KEE + K+  K K+++K+KK +E R
Sbjct: 134 KEEAKEKRPPKEKEKEKEKKVEEPR 158



 Score = 27.2 bits (60), Expect = 6.1
 Identities = 10/26 (38%), Positives = 20/26 (76%)

Query: 69  EKEEEERKKKKKRKKRKKKKKKKEKR 94
           EKE+E+ KK ++ + R+++KK++  R
Sbjct: 145 EKEKEKEKKVEEPRDREEEKKRERVR 170



 Score = 27.2 bits (60), Expect = 6.3
 Identities = 10/31 (32%), Positives = 20/31 (64%)

Query: 69  EKEEEERKKKKKRKKRKKKKKKKEKRIPPRI 99
           EK+ EE + +++ KKR++ + K   + PP+ 
Sbjct: 151 EKKVEEPRDREEEKKRERVRAKSRPKKPPKK 181



 Score = 27.2 bits (60), Expect = 7.0
 Identities = 11/30 (36%), Positives = 20/30 (66%)

Query: 69  EKEEEERKKKKKRKKRKKKKKKKEKRIPPR 98
            K +EE K+K+  K+++K+K+KK +    R
Sbjct: 131 RKPKEEAKEKRPPKEKEKEKEKKVEEPRDR 160



 Score = 26.8 bits (59), Expect = 9.6
 Identities = 10/25 (40%), Positives = 15/25 (60%)

Query: 69  EKEEEERKKKKKRKKRKKKKKKKEK 93
           E+++ ER + K R K+  KKK   K
Sbjct: 162 EEKKRERVRAKSRPKKPPKKKPPNK 186


>gnl|CDD|217756 pfam03839, Sec62, Translocation protein Sec62. 
          Length = 217

 Score = 32.5 bits (74), Expect = 0.080
 Identities = 24/82 (29%), Positives = 34/82 (41%), Gaps = 12/82 (14%)

Query: 13  IFGKPITR-IMFADLAIMFADSRPESTFWIFGKLITSCSYIILATSLLHYLSYILAMEKE 71
            F   I R I+F    I++A    +  FWIF  L     ++     L  +          
Sbjct: 147 FFALAILRLILFV---IVWAIVGGKPGFWIFPNLTEDVGFLDSFKPLYTW--------HY 195

Query: 72  EEERKKKKKRKKRKKKKKKKEK 93
           + ++   KK KK KKKKKKK  
Sbjct: 196 KGDKSSAKKDKKSKKKKKKKRS 217


>gnl|CDD|222447 pfam13904, DUF4207, Domain of unknown function (DUF4207).  This
           family is found in eukaryotes; it has several conserved
           tryptophan residues. The function is not known.
          Length = 261

 Score = 32.8 bits (75), Expect = 0.083
 Identities = 7/25 (28%), Positives = 25/25 (100%)

Query: 70  KEEEERKKKKKRKKRKKKKKKKEKR 94
           K++++++++++RK+RKK+++++E++
Sbjct: 193 KQQQQKREEERRKQRKKQQEEEERK 217



 Score = 29.7 bits (67), Expect = 0.84
 Identities = 8/28 (28%), Positives = 27/28 (96%), Gaps = 1/28 (3%)

Query: 68  MEK-EEEERKKKKKRKKRKKKKKKKEKR 94
           ++K +++++K++++R+K++KK++++E+R
Sbjct: 189 LKKLKQQQQKREEERRKQRKKQQEEEER 216



 Score = 28.9 bits (65), Expect = 1.3
 Identities = 8/28 (28%), Positives = 24/28 (85%)

Query: 66  LAMEKEEEERKKKKKRKKRKKKKKKKEK 93
           L  ++++ E +++K+RKK+++++++K+K
Sbjct: 192 LKQQQQKREEERRKQRKKQQEEEERKQK 219



 Score = 27.4 bits (61), Expect = 4.7
 Identities = 10/25 (40%), Positives = 22/25 (88%)

Query: 69  EKEEEERKKKKKRKKRKKKKKKKEK 93
           ++EEE RK++KK+++ +++K+K E+
Sbjct: 198 KREEERRKQRKKQQEEEERKQKAEE 222



 Score = 26.6 bits (59), Expect = 8.1
 Identities = 6/24 (25%), Positives = 21/24 (87%)

Query: 71  EEEERKKKKKRKKRKKKKKKKEKR 94
           +E E KK K+++++++++++K+++
Sbjct: 185 QEWELKKLKQQQQKREEERRKQRK 208


>gnl|CDD|143282 cd05874, Ig6_NrCAM, Sixth immunoglobulin (Ig)-like domain of NrCAM
           (Ng (neuronglia) CAM-related cell adhesion molecule).
           Ig6_NrCAM: sixth immunoglobulin (Ig)-like domain of
           NrCAM (Ng (neuronglia) CAM-related cell adhesion
           molecule). NrCAM belongs to the L1 subfamily of cell
           adhesion molecules (CAMs) and is comprised of an
           extracellular region having six Ig-like domains and five
           fibronectin type III domains, a transmembrane region,
           and an intracellular domain. NrCAM is primarily
           expressed in the nervous system.
          Length = 77

 Score = 30.7 bits (69), Expect = 0.083
 Identities = 19/76 (25%), Positives = 35/76 (46%), Gaps = 9/76 (11%)

Query: 116 VTLECKADGNPVPNITWTRKNNNL-----PGGEYSYSGNSLTVRHTNRHSA----GIYLC 166
           + ++C+A G P P+ +WTR   +      P      +  +L +   N   A    G+Y C
Sbjct: 1   IVIQCEAKGKPPPSFSWTRNGTHFDIDKDPKVTMKPNTGTLVINIMNGEKAEAYEGVYQC 60

Query: 167 VANNMVGSSAAASIAL 182
            A N  G++ + +I +
Sbjct: 61  TARNERGAAVSNNIVI 76


>gnl|CDD|215521 PLN02967, PLN02967, kinase.
          Length = 581

 Score = 32.7 bits (74), Expect = 0.092
 Identities = 12/24 (50%), Positives = 15/24 (62%)

Query: 67  AMEKEEEERKKKKKRKKRKKKKKK 90
           A    EEE+ +KK RK+RK KK  
Sbjct: 124 ASSDVEEEKTEKKVRKRRKVKKMD 147



 Score = 26.9 bits (59), Expect = 8.5
 Identities = 10/24 (41%), Positives = 15/24 (62%)

Query: 69  EKEEEERKKKKKRKKRKKKKKKKE 92
            +EE+  KK +KR+K KK  +  E
Sbjct: 128 VEEEKTEKKVRKRRKVKKMDEDVE 151


>gnl|CDD|219564 pfam07771, TSGP1, Tick salivary peptide group 1.  This contains a
           group of peptides derived from a salivary gland cDNA
           library of the tick Ixodes scapularis. Also present are
           peptides from a related tick species, Ixodes ricinus.
           They are characterized by a putative signal peptide
           indicative of secretion and conserved cysteine residues.
          Length = 120

 Score = 31.4 bits (71), Expect = 0.099
 Identities = 14/25 (56%), Positives = 17/25 (68%)

Query: 69  EKEEEERKKKKKRKKRKKKKKKKEK 93
           E  E+ +KKKKK KK KK KK  +K
Sbjct: 95  EPTEKPKKKKKKSKKTKKPKKSSKK 119



 Score = 28.7 bits (64), Expect = 1.1
 Identities = 12/22 (54%), Positives = 16/22 (72%)

Query: 73  EERKKKKKRKKRKKKKKKKEKR 94
           E+ KKKKK+ K+ KK KK  K+
Sbjct: 98  EKPKKKKKKSKKTKKPKKSSKK 119



 Score = 28.3 bits (63), Expect = 1.1
 Identities = 12/23 (52%), Positives = 16/23 (69%)

Query: 69  EKEEEERKKKKKRKKRKKKKKKK 91
           EK ++++KK KK KK KK  KK 
Sbjct: 98  EKPKKKKKKSKKTKKPKKSSKKD 120



 Score = 27.5 bits (61), Expect = 2.6
 Identities = 11/22 (50%), Positives = 15/22 (68%)

Query: 73  EERKKKKKRKKRKKKKKKKEKR 94
            E  +K K+KK+K KK KK K+
Sbjct: 94  PEPTEKPKKKKKKSKKTKKPKK 115



 Score = 26.8 bits (59), Expect = 3.7
 Identities = 10/23 (43%), Positives = 16/23 (69%)

Query: 71  EEEERKKKKKRKKRKKKKKKKEK 93
           E+ ++KKKK +K +K KK  K+ 
Sbjct: 98  EKPKKKKKKSKKTKKPKKSSKKD 120



 Score = 26.8 bits (59), Expect = 4.4
 Identities = 11/22 (50%), Positives = 15/22 (68%)

Query: 72  EEERKKKKKRKKRKKKKKKKEK 93
            E  +K KK+KK+ KK KK +K
Sbjct: 94  PEPTEKPKKKKKKSKKTKKPKK 115



 Score = 26.4 bits (58), Expect = 5.4
 Identities = 12/23 (52%), Positives = 14/23 (60%)

Query: 71  EEEERKKKKKRKKRKKKKKKKEK 93
            E   K KKK+KK KK KK K+ 
Sbjct: 94  PEPTEKPKKKKKKSKKTKKPKKS 116


>gnl|CDD|240402 PTZ00399, PTZ00399, cysteinyl-tRNA-synthetase; Provisional.
          Length = 651

 Score = 32.7 bits (75), Expect = 0.10
 Identities = 13/29 (44%), Positives = 23/29 (79%)

Query: 66  LAMEKEEEERKKKKKRKKRKKKKKKKEKR 94
           L  EKEE+E  K++KR ++ KK+++K+K+
Sbjct: 553 LQREKEEKEALKEQKRLRKLKKQEEKKKK 581



 Score = 31.9 bits (73), Expect = 0.17
 Identities = 14/33 (42%), Positives = 25/33 (75%), Gaps = 1/33 (3%)

Query: 66  LAMEKEEEERKKKKKRKKRKKKKKKKEK-RIPP 97
            A+++++  RK KK+ +K+KK+ +K EK +IPP
Sbjct: 561 EALKEQKRLRKLKKQEEKKKKELEKLEKAKIPP 593


>gnl|CDD|220692 pfam10324, 7TM_GPCR_Srw, Serpentine type 7TM GPCR chemoreceptor
           Srw.  Chemoreception is mediated in Caenorhabditis
           elegans by members of the seven-transmembrane
           G-protein-coupled receptor class (7TM GPCRs) of proteins
           which are of the serpentine type. Srw is a solo family
           amongst the superfamilies of chemoreceptors.
           Chemoperception is one of the central senses of soil
           nematodes like C. elegans which are otherwise 'blind'
           and 'deaf'. The genes encoding Srw do not appear to be
           under as strong an adaptive evolutionary pressure as
           those of Srz.
          Length = 317

 Score = 32.2 bits (74), Expect = 0.11
 Identities = 14/53 (26%), Positives = 18/53 (33%), Gaps = 12/53 (22%)

Query: 41  IFGKLITSCSYIILATSLLHYLSYILAMEKEEEERKKKKKRKKRKKKKKKKEK 93
           I  K+I      IL   L+  L            RK KK RK       K ++
Sbjct: 198 IVSKIIPCILLPILTILLIIEL------------RKAKKSRKNLSSSSNKSDR 238


>gnl|CDD|218737 pfam05764, YL1, YL1 nuclear protein.  The proteins in this family
           are designated YL1. These proteins have been shown to be
           DNA-binding and may be a transcription factor.
          Length = 238

 Score = 32.4 bits (74), Expect = 0.11
 Identities = 11/45 (24%), Positives = 21/45 (46%)

Query: 69  EKEEEERKKKKKRKKRKKKKKKKEKRIPPRIIYVSGAGKVEVKKG 113
            ++E +R+++ K+KKR K K  KE     +    + A   +    
Sbjct: 74  GEKELQREERLKKKKRVKTKAYKEPTKKKKKKDPTAAKSPKAAAP 118



 Score = 31.6 bits (72), Expect = 0.18
 Identities = 15/25 (60%), Positives = 19/25 (76%)

Query: 70  KEEEERKKKKKRKKRKKKKKKKEKR 94
           KE E R+KK + K RK+K+KKKEK 
Sbjct: 158 KEREIRRKKIQAKARKRKEKKKEKE 182



 Score = 30.0 bits (68), Expect = 0.50
 Identities = 14/44 (31%), Positives = 21/44 (47%)

Query: 69  EKEEEERKKKKKRKKRKKKKKKKEKRIPPRIIYVSGAGKVEVKK 112
            +E  ++KK+ K K  K+  KKK+K+ P        A     KK
Sbjct: 80  REERLKKKKRVKTKAYKEPTKKKKKKDPTAAKSPKAAAPRPKKK 123



 Score = 30.0 bits (68), Expect = 0.62
 Identities = 14/36 (38%), Positives = 20/36 (55%), Gaps = 4/36 (11%)

Query: 70  KEEEERKKKKKRKKRKKKKKKKEKRIPPRIIYVSGA 105
           +E+EE KKK K +  KK++  +     P I Y SG 
Sbjct: 207 EEQEEEKKKAKIQALKKRRLYEG----PVIRYWSGT 238



 Score = 29.7 bits (67), Expect = 0.65
 Identities = 10/31 (32%), Positives = 20/31 (64%)

Query: 69  EKEEEERKKKKKRKKRKKKKKKKEKRIPPRI 99
           E+E E+  ++++R K+KK+ K K  + P + 
Sbjct: 71  EEEGEKELQREERLKKKKRVKTKAYKEPTKK 101



 Score = 28.5 bits (64), Expect = 1.6
 Identities = 12/60 (20%), Positives = 24/60 (40%)

Query: 66  LAMEKEEEERKKKKKRKKRKKKKKKKEKRIPPRIIYVSGAGKVEVKKGYSVTLECKADGN 125
           L  E+  +++K+ K +  ++  KKKK+K         + A + + K           D  
Sbjct: 78  LQREERLKKKKRVKTKAYKEPTKKKKKKDPTAAKSPKAAAPRPKKKSERISWAPTLLDSP 137



 Score = 28.5 bits (64), Expect = 2.1
 Identities = 9/26 (34%), Positives = 19/26 (73%)

Query: 71  EEEERKKKKKRKKRKKKKKKKEKRIP 96
           +E E ++KK + K +K+K+KK+++  
Sbjct: 158 KEREIRRKKIQAKARKRKEKKKEKEL 183



 Score = 27.7 bits (62), Expect = 3.2
 Identities = 8/25 (32%), Positives = 18/25 (72%)

Query: 69  EKEEEERKKKKKRKKRKKKKKKKEK 93
           E+ +E   ++KK + + +K+K+K+K
Sbjct: 155 ERLKEREIRRKKIQAKARKRKEKKK 179



 Score = 27.3 bits (61), Expect = 4.6
 Identities = 10/23 (43%), Positives = 17/23 (73%)

Query: 69 EKEEEERKKKKKRKKRKKKKKKK 91
            +EEE +K+ +R++R KKKK+ 
Sbjct: 68 SDDEEEGEKELQREERLKKKKRV 90



 Score = 26.6 bits (59), Expect = 6.9
 Identities = 12/24 (50%), Positives = 18/24 (75%)

Query: 69  EKEEEERKKKKKRKKRKKKKKKKE 92
           E+E   +K + K +KRK+KKK+KE
Sbjct: 159 EREIRRKKIQAKARKRKEKKKEKE 182


>gnl|CDD|220600 pfam10147, CR6_interact, Growth arrest and DNA-damage-inducible
           proteins-interacting protein 1.  Members of this family
           of proteins act as negative regulators of G1 to S cell
           cycle phase progression by inhibiting cyclin-dependent
           kinases. Inhibitory effects are additive with GADD45
           proteins but occur also in the absence of GADD45
           proteins. Furthermore, they act as a repressor of the
           orphan nuclear receptor NR4A1 by inhibiting AB
           domain-mediated transcriptional activity.
          Length = 217

 Score = 32.1 bits (73), Expect = 0.11
 Identities = 11/25 (44%), Positives = 20/25 (80%)

Query: 69  EKEEEERKKKKKRKKRKKKKKKKEK 93
           +KE+EE+KK K+ K+R+K++K+   
Sbjct: 186 QKEKEEKKKVKEAKRREKEEKRMAA 210


>gnl|CDD|219655 pfam07946, DUF1682, Protein of unknown function (DUF1682).  The
           members of this family are all hypothetical eukaryotic
           proteins of unknown function. One member is described as
           being an adipocyte-specific protein, but no evidence of
           this was found.
          Length = 322

 Score = 32.2 bits (74), Expect = 0.12
 Identities = 10/27 (37%), Positives = 20/27 (74%)

Query: 67  AMEKEEEERKKKKKRKKRKKKKKKKEK 93
           A E+ +EE ++KK+ KK+++++ K  K
Sbjct: 276 AEEERQEEAQEKKEEKKKEEREAKLAK 302



 Score = 31.1 bits (71), Expect = 0.26
 Identities = 10/31 (32%), Positives = 22/31 (70%)

Query: 67  AMEKEEEERKKKKKRKKRKKKKKKKEKRIPP 97
           A E+E +E  ++KK +K+K++++ K  ++ P
Sbjct: 275 AAEEERQEEAQEKKEEKKKEEREAKLAKLSP 305



 Score = 29.1 bits (66), Expect = 1.1
 Identities = 11/38 (28%), Positives = 24/38 (63%), Gaps = 10/38 (26%)

Query: 67  AMEKEEEERKKKKKR----------KKRKKKKKKKEKR 94
           A EK+EE++K++++           +K ++K++KK+ R
Sbjct: 284 AQEKKEEKKKEEREAKLAKLSPEEQRKLEEKERKKQAR 321



 Score = 28.8 bits (65), Expect = 1.8
 Identities = 8/39 (20%), Positives = 22/39 (56%), Gaps = 10/39 (25%)

Query: 66  LAMEKEEEERKKKKK----------RKKRKKKKKKKEKR 94
              +KEE+++++++           RK  +K++KK+ ++
Sbjct: 284 AQEKKEEKKKEEREAKLAKLSPEEQRKLEEKERKKQARK 322



 Score = 28.0 bits (63), Expect = 3.4
 Identities = 10/37 (27%), Positives = 22/37 (59%), Gaps = 10/37 (27%)

Query: 68  MEKEEEERKKKKKRKK----------RKKKKKKKEKR 94
             +E++E KKK++R+           RK ++K+++K+
Sbjct: 283 EAQEKKEEKKKEEREAKLAKLSPEEQRKLEEKERKKQ 319



 Score = 27.6 bits (62), Expect = 4.4
 Identities = 7/30 (23%), Positives = 21/30 (70%)

Query: 65  ILAMEKEEEERKKKKKRKKRKKKKKKKEKR 94
           IL   +EE + + ++K++++KK++++ +  
Sbjct: 272 ILKAAEEERQEEAQEKKEEKKKEEREAKLA 301


>gnl|CDD|206039 pfam13868, Trichoplein, Tumour suppressor, Mitostatin.  Trichoplein,
           or mitostatin, was first defined as a meiosis-specific
           nuclear structural protein. It has since been linked
           with mitochondrial movement. It is associated with the
           mitochondrial outer membrane, and over-expression leads
           to reduction in mitochondrial motility whereas lack of
           it enhances mitochondrial movement. The activity appears
           to be mediated through binding the mitochondria to the
           actin intermediate filaments (IFs).
          Length = 349

 Score = 32.2 bits (74), Expect = 0.12
 Identities = 8/26 (30%), Positives = 18/26 (69%)

Query: 67  AMEKEEEERKKKKKRKKRKKKKKKKE 92
            +E+ +EE + + + K+ K+KK ++E
Sbjct: 104 IIERIQEEDEAEAQEKREKQKKLREE 129



 Score = 30.6 bits (70), Expect = 0.40
 Identities = 9/25 (36%), Positives = 19/25 (76%)

Query: 65  ILAMEKEEEERKKKKKRKKRKKKKK 89
           I+   +EE+E + ++KR+K+KK ++
Sbjct: 104 IIERIQEEDEAEAQEKREKQKKLRE 128



 Score = 29.1 bits (66), Expect = 1.2
 Identities = 9/29 (31%), Positives = 21/29 (72%)

Query: 67  AMEKEEEERKKKKKRKKRKKKKKKKEKRI 95
             EK E E +++ +R++RK++K+++  R+
Sbjct: 159 QREKAEREEEREAERRERKEEKEREVARL 187



 Score = 29.1 bits (66), Expect = 1.4
 Identities = 8/27 (29%), Positives = 17/27 (62%)

Query: 62  LSYILAMEKEEEERKKKKKRKKRKKKK 88
           L Y     + EEER+ +++ +K +K++
Sbjct: 156 LEYQREKAEREEEREAERRERKEEKER 182



 Score = 28.0 bits (63), Expect = 3.5
 Identities = 10/26 (38%), Positives = 18/26 (69%)

Query: 69 EKEEEERKKKKKRKKRKKKKKKKEKR 94
          E  EEER K    ++ +++K+K+E+R
Sbjct: 45 EMMEEERLKALAEEEERERKRKEERR 70



 Score = 27.2 bits (61), Expect = 5.1
 Identities = 8/34 (23%), Positives = 20/34 (58%), Gaps = 8/34 (23%)

Query: 69  EKEEEERKKKKKRKKR--------KKKKKKKEKR 94
           E E +E+++K+K+ +         + ++K++EK 
Sbjct: 113 EAEAQEKREKQKKLREEIDEFNEERIERKEEEKE 146



 Score = 26.8 bits (60), Expect = 7.7
 Identities = 9/31 (29%), Positives = 25/31 (80%), Gaps = 1/31 (3%)

Query: 65  ILAMEKEEEERKKKKKRKKRKKKKKKKEKRI 95
           IL  ++E+ ER+++++  +R+++K++KE+ +
Sbjct: 155 ILEYQREKAEREEEREA-ERRERKEEKEREV 184



 Score = 26.8 bits (60), Expect = 8.4
 Identities = 8/27 (29%), Positives = 19/27 (70%)

Query: 66  LAMEKEEEERKKKKKRKKRKKKKKKKE 92
           L  E+ E + ++K+K +  K++++K+E
Sbjct: 208 LYQEEYERKERQKEKEEAEKRRRQKQE 234


>gnl|CDD|235334 PRK05035, PRK05035, electron transport complex protein RnfC;
           Provisional.
          Length = 695

 Score = 32.2 bits (74), Expect = 0.13
 Identities = 6/25 (24%), Positives = 14/25 (56%)

Query: 68  MEKEEEERKKKKKRKKRKKKKKKKE 92
           +E+E+  R+ + K+    +  K K+
Sbjct: 462 LEREKAAREARHKKAAEARAAKDKD 486


>gnl|CDD|219547 pfam07741, BRF1, Brf1-like TBP-binding domain.  This region
          covers both the Brf homology II and III regions. This
          region is involved in binding TATA binding protein.
          Length = 95

 Score = 30.7 bits (70), Expect = 0.13
 Identities = 12/38 (31%), Positives = 20/38 (52%), Gaps = 4/38 (10%)

Query: 64 YILAMEKEEEERKK----KKKRKKRKKKKKKKEKRIPP 97
          Y+   E++E ++K        +KK+K+K KKK     P
Sbjct: 28 YLEEQEEKELKQKADEGNNSGKKKKKRKAKKKRDEAGP 65



 Score = 27.6 bits (62), Expect = 1.6
 Identities = 10/29 (34%), Positives = 18/29 (62%)

Query: 70 KEEEERKKKKKRKKRKKKKKKKEKRIPPR 98
          +E+EE++ K+K  +     KKK+KR   +
Sbjct: 30 EEQEEKELKQKADEGNNSGKKKKKRKAKK 58



 Score = 26.1 bits (58), Expect = 5.6
 Identities = 13/31 (41%), Positives = 16/31 (51%)

Query: 69 EKEEEERKKKKKRKKRKKKKKKKEKRIPPRI 99
          E+EE+E K+K        KKKKK K    R 
Sbjct: 31 EQEEKELKQKADEGNNSGKKKKKRKAKKKRD 61


>gnl|CDD|143174 cd04973, Ig1_FGFR, First immunoglobulin (Ig)-like domain of
           fibroblast growth factor receptor (FGFR).  Ig1_FGFR: The
           first immunoglobulin (Ig)-like domain of fibroblast
           growth factor receptor (FGFR). Fibroblast growth factors
           (FGFs) participate in morphogenesis, development,
           angiogenesis, and wound healing. These FGF-stimulated
           processes are mediated by four FGFR tyrosine kinases
           (FGFR1-4). FGFRs are comprised of an extracellular
           portion consisting of three Ig-like domains, a
           transmembrane helix, and a cytoplasmic portion having
           protein tyrosine kinase activity. The highly conserved
           Ig-like domains 2 and 3, and the linker region between
           D2 and D3 define a general binding site for all FGFs.
          Length = 79

 Score = 30.2 bits (68), Expect = 0.15
 Identities = 15/63 (23%), Positives = 28/63 (44%), Gaps = 2/63 (3%)

Query: 113 GYSVTLECKADGNPVPNITWTRKNNNL-PGGEYSYSGNSLTVRHTNRHSAGIYLCVANNM 171
           G  + L C+   + V +I WT+    L        +G  + ++      +G+Y CV ++ 
Sbjct: 9   GDLLQLRCRLR-DDVQSINWTKDGVQLGENNRTRITGEEVQIKDAVPRDSGLYACVTSSP 67

Query: 172 VGS 174
            GS
Sbjct: 68  SGS 70


>gnl|CDD|240246 PTZ00053, PTZ00053, methionine aminopeptidase 2; Provisional.
          Length = 470

 Score = 32.0 bits (73), Expect = 0.15
 Identities = 16/72 (22%), Positives = 29/72 (40%), Gaps = 7/72 (9%)

Query: 65  ILAMEKEEEERKKKKKRKKRKKKKKKKEKRIPPRIIYVSGAGKVEVKKGYSVTLECKADG 124
           +  +  E +E + K+  KK+KKKKKKK+K+                    S   +   D 
Sbjct: 40  LAELISENQEAENKQNNKKKKKKKKKKKKKNLGEA---YDLAYDLPVVWSSAAFQ---DN 93

Query: 125 NPVPNIT-WTRK 135
           + +  +  W  +
Sbjct: 94  SHIRKLGNWPEQ 105


>gnl|CDD|233160 TIGR00869, sec62, protein translocation protein, Sec62 family.
           Members of the NSCC2 family have been sequenced from
           various yeast, fungal and animal species including
           Saccharomyces cerevisiae, Drosophila melanogaster and
           Homo sapiens. These proteins are the Sec62 proteins,
           believed to be associated with the Sec61 and Sec63
           constituents of the general protein secretory systems of
           yeast microsomes. They are also the non-selective cation
           (NS) channels of the mammalian cytoplasmic membrane. The
           yeast Sec62 protein has been shown to be essential for
           cell growth. The mammalian NS channel proteins have been
           implicated in platelet-derived growth factor (PDGF)
           dependent single channel current in fibroblasts. These
           channels are essentially closed in serum deprived
           tissue-culture cells and are specifically opened by
           exposure to PDGF. These channels are reported to exhibit
           equal selectivity for Na+, K+ and Cs+ with low
           permeability to Ca2+, and no permeability to anions
           [Transport and binding proteins, Amino acids, peptides
           and amines].
          Length = 232

 Score = 31.8 bits (72), Expect = 0.16
 Identities = 18/54 (33%), Positives = 25/54 (46%), Gaps = 6/54 (11%)

Query: 39  FWIFGKLITSCSYIILATSLLHYLSYILAMEKEEEERKKKKKRKKRKKKKKKKE 92
            WIF  L     ++     L  +       EK++   KKK K KK KKK+ K+E
Sbjct: 183 IWIFPNLFADVGFLDSFKPLWGW------HEKDKYSYKKKLKSKKLKKKQAKRE 230


>gnl|CDD|218336 pfam04935, SURF6, Surfeit locus protein 6.  The surfeit locus
          protein SURF-6 is shown to be a component of the
          nucleolar matrix and has a strong binding capacity for
          nucleic acids.
          Length = 206

 Score = 31.5 bits (72), Expect = 0.16
 Identities = 9/26 (34%), Positives = 20/26 (76%)

Query: 70 KEEEERKKKKKRKKRKKKKKKKEKRI 95
          ++ E+RK +KK+K+++ KKK+  ++ 
Sbjct: 13 RKREQRKARKKQKRKEAKKKEDAQKS 38



 Score = 30.7 bits (70), Expect = 0.32
 Identities = 11/23 (47%), Positives = 20/23 (86%)

Query: 72 EEERKKKKKRKKRKKKKKKKEKR 94
          E+ R+K+++RK RKK+K+K+ K+
Sbjct: 9  EQRRRKREQRKARKKQKRKEAKK 31



 Score = 30.0 bits (68), Expect = 0.55
 Identities = 13/25 (52%), Positives = 18/25 (72%)

Query: 71  EEEERKKKKKRKKRKKKKKKKEKRI 95
           EE  +K+K  +K +KKKK KK+ RI
Sbjct: 181 EENLKKRKDDKKNKKKKKAKKKGRI 205



 Score = 29.6 bits (67), Expect = 0.68
 Identities = 10/26 (38%), Positives = 21/26 (80%)

Query: 69 EKEEEERKKKKKRKKRKKKKKKKEKR 94
          E   E+R++K++++K +KK+K+KE +
Sbjct: 5  EALLEQRRRKREQRKARKKQKRKEAK 30



 Score = 29.6 bits (67), Expect = 0.76
 Identities = 10/30 (33%), Positives = 22/30 (73%)

Query: 65 ILAMEKEEEERKKKKKRKKRKKKKKKKEKR 94
          +L   + + E++K +K++KRK+ KKK++ +
Sbjct: 7  LLEQRRRKREQRKARKKQKRKEAKKKEDAQ 36



 Score = 28.8 bits (65), Expect = 1.2
 Identities = 7/25 (28%), Positives = 18/25 (72%)

Query: 69 EKEEEERKKKKKRKKRKKKKKKKEK 93
           ++ + RKK+K+++ +KK+  +K +
Sbjct: 15 REQRKARKKQKRKEAKKKEDAQKSE 39



 Score = 28.4 bits (64), Expect = 1.7
 Identities = 18/37 (48%), Positives = 24/37 (64%), Gaps = 5/37 (13%)

Query: 69  EKEEEERKKK-----KKRKKRKKKKKKKEKRIPPRII 100
           EK++ ER+KK     KKRK  KK KKKK+ +   RI+
Sbjct: 170 EKKKAERQKKREENLKKRKDDKKNKKKKKAKKKGRIL 206



 Score = 27.7 bits (62), Expect = 3.0
 Identities = 9/25 (36%), Positives = 17/25 (68%)

Query: 69 EKEEEERKKKKKRKKRKKKKKKKEK 93
           K E+ + +KK+++K  KKK+  +K
Sbjct: 13 RKREQRKARKKQKRKEAKKKEDAQK 37



 Score = 26.5 bits (59), Expect = 7.1
 Identities = 9/27 (33%), Positives = 18/27 (66%)

Query: 67 AMEKEEEERKKKKKRKKRKKKKKKKEK 93
             +E    ++++KR++RK +KK+K K
Sbjct: 1  PSSREALLEQRRRKREQRKARKKQKRK 27


>gnl|CDD|143279 cd05871, Ig_Semaphorin_classIII, Immunoglobulin (Ig)-like domain of
           class III semaphorin.  Ig_Semaphorin_class III;
           Immunoglobulin (Ig)-like domain of class III
           semaphorins. Semaphorins are classified into various
           classes on the basis of structural features additional
           to the Sema domain. Class III semaphorins are a
           vertebrate class having a Sema domain, an Ig domain, a
           short basic domain, and are secreted. They have been
           shown to be axonal guidance cues and have a part in the
           regulation of the cardiovascular, immune and respiratory
           systems. Sema3A, the prototype member of this class III
           subfamily, induces growth cone collapse and is an
           inhibitor of axonal sprouting. In perinatal rat cortex
           as a chemoattractant, it functions to direct, for
           pyramidal neurons, the orientated extension of apical
           dendrites. It may play a role, prior to the development
           of apical dendrites, in signaling the radial migration
           of newborn cortical neurons towards the upper layers.
           Sema3A selectively inhibits vascular endothelial growth
           factor receptor (VEGF)-induced angiogenesis and induces
           microvascular permeability. This group also includes
           Sema3B, -C, -D, -E, -G.
          Length = 91

 Score = 30.4 bits (69), Expect = 0.17
 Identities = 21/81 (25%), Positives = 30/81 (37%), Gaps = 8/81 (9%)

Query: 112 KGYSVTLECKADGNPVPNITWT-------RKNNNLPGGEYSYSGNSLTVRHTNRHSAGIY 164
           +  S  LEC    +P  ++ W        RK          ++   L +R   R  AG+Y
Sbjct: 10  ENNSTFLECLPK-SPQASVKWLFQRGGDQRKEEVKTEERLIHTERGLLLRSLQRSDAGVY 68

Query: 165 LCVANNMVGSSAAASIALHVL 185
            C A     S   A   LHV+
Sbjct: 69  TCTAVEHSFSQTLAKYTLHVI 89


>gnl|CDD|221641 pfam12569, NARP1, NMDA receptor-regulated protein 1.  This domain
           family is found in eukaryotes, and is approximately 40
           amino acids in length. The family is found in
           association with pfam07719, pfam00515. There is a single
           completely conserved residue L that may be functionally
           important. NARP1 is the mammalian homologue of a yeast
           N-terminal acetyltransferase that regulates entry into
           the G(0) phase of the cell cycle.
          Length = 516

 Score = 31.8 bits (73), Expect = 0.17
 Identities = 10/25 (40%), Positives = 13/25 (52%)

Query: 69  EKEEEERKKKKKRKKRKKKKKKKEK 93
           EKEE E+   KK+ +   KK K   
Sbjct: 426 EKEEAEKAAAKKKAEAAAKKAKGPD 450



 Score = 30.3 bits (69), Expect = 0.60
 Identities = 11/25 (44%), Positives = 17/25 (68%), Gaps = 2/25 (8%)

Query: 72  EEERKK--KKKRKKRKKKKKKKEKR 94
             ERKK  KK+RK  KK +K++ ++
Sbjct: 408 PAERKKLRKKQRKAEKKAEKEEAEK 432



 Score = 28.4 bits (64), Expect = 2.7
 Identities = 8/30 (26%), Positives = 12/30 (40%)

Query: 69  EKEEEERKKKKKRKKRKKKKKKKEKRIPPR 98
            ++EE  K   K+K     KK K      +
Sbjct: 425 AEKEEAEKAAAKKKAEAAAKKAKGPDGETK 454



 Score = 28.0 bits (63), Expect = 3.8
 Identities = 7/27 (25%), Positives = 14/27 (51%)

Query: 66  LAMEKEEEERKKKKKRKKRKKKKKKKE 92
                     +KK ++K+RK +KK ++
Sbjct: 401 GENGNLSPAERKKLRKKQRKAEKKAEK 427



 Score = 27.6 bits (62), Expect = 4.7
 Identities = 10/28 (35%), Positives = 15/28 (53%)

Query: 67  AMEKEEEERKKKKKRKKRKKKKKKKEKR 94
           A +K E+E  +K   KK+ +   KK K 
Sbjct: 421 AEKKAEKEEAEKAAAKKKAEAAAKKAKG 448


>gnl|CDD|143266 cd05858, Ig3_FGFR-2, Third immunoglobulin (Ig)-like domain of
           fibroblast growth factor receptor 2 (FGFR2).
           Ig3_FGFR-2-like; domain similar to the third
           immunoglobulin (Ig)-like domain of human fibroblast
           growth factor receptor 2 (FGFR2). Fibroblast growth
           factors (FGFs) participate in morphogenesis,
           development, angiogenesis, and wound healing. These
           FGF-stimulated processes are mediated by four FGFR
           tyrosine kinases (FGFR1-4). FGFRs are comprised of an
           extracellular portion consisting of three Ig-like
           domains, a transmembrane helix, and a cytoplasmic
           portion having protein tyrosine kinase activity. The
           highly conserved Ig-like domains 2 and 3, and the linker
           region between D2 and D3 define a general binding site
           for FGFs. FGFR2 is required for male sex determination.
          Length = 90

 Score = 30.3 bits (68), Expect = 0.18
 Identities = 21/82 (25%), Positives = 30/82 (36%), Gaps = 20/82 (24%)

Query: 113 GYSVTLECKADGNPVPNITWTRKNNNLPGGEYSYSG-------------------NSLTV 153
           G +V   CK   +  P+I W  K+    G +Y   G                     L +
Sbjct: 1   GSTVEFVCKVYSDAQPHIQWL-KHVEKNGSKYGPDGLPYVTVLKTAGVNTTDKEMEVLYL 59

Query: 154 RHTNRHSAGIYLCVANNMVGSS 175
           R+     AG Y C+A N +G S
Sbjct: 60  RNVTFEDAGEYTCLAGNSIGIS 81


>gnl|CDD|185429 PTZ00074, PTZ00074, 60S ribosomal protein L34; Provisional.
          Length = 135

 Score = 30.8 bits (70), Expect = 0.18
 Identities = 13/25 (52%), Positives = 20/25 (80%)

Query: 69  EKEEEERKKKKKRKKRKKKKKKKEK 93
           EK +++++KKKK+KK+KKK  KK  
Sbjct: 107 EKAKQKKQKKKKKKKKKKKTSKKAA 131



 Score = 30.0 bits (68), Expect = 0.32
 Identities = 12/24 (50%), Positives = 19/24 (79%)

Query: 68  MEKEEEERKKKKKRKKRKKKKKKK 91
            +K+++++KKKKK+K  KK  KKK
Sbjct: 111 QKKQKKKKKKKKKKKTSKKAAKKK 134



 Score = 30.0 bits (68), Expect = 0.35
 Identities = 14/24 (58%), Positives = 21/24 (87%)

Query: 70  KEEEERKKKKKRKKRKKKKKKKEK 93
           KE+ ++KK+KK+KK+KKKKK  +K
Sbjct: 106 KEKAKQKKQKKKKKKKKKKKTSKK 129



 Score = 29.3 bits (66), Expect = 0.56
 Identities = 13/24 (54%), Positives = 20/24 (83%)

Query: 70  KEEEERKKKKKRKKRKKKKKKKEK 93
            +E+ ++KK+K+KK+KKKKKK  K
Sbjct: 105 LKEKAKQKKQKKKKKKKKKKKTSK 128



 Score = 29.3 bits (66), Expect = 0.62
 Identities = 14/27 (51%), Positives = 22/27 (81%)

Query: 67  AMEKEEEERKKKKKRKKRKKKKKKKEK 93
           A +K+++++KKKKK+KK  KK  KK+K
Sbjct: 109 AKQKKQKKKKKKKKKKKTSKKAAKKKK 135



 Score = 28.9 bits (65), Expect = 0.82
 Identities = 11/27 (40%), Positives = 22/27 (81%)

Query: 67  AMEKEEEERKKKKKRKKRKKKKKKKEK 93
            ++++ +++K+KKK+KK+KKKK  K+ 
Sbjct: 104 VLKEKAKQKKQKKKKKKKKKKKTSKKA 130



 Score = 28.9 bits (65), Expect = 0.99
 Identities = 12/26 (46%), Positives = 21/26 (80%)

Query: 69  EKEEEERKKKKKRKKRKKKKKKKEKR 94
            K+++++KKKKK+KK+K  KK  +K+
Sbjct: 109 AKQKKQKKKKKKKKKKKTSKKAAKKK 134



 Score = 27.3 bits (61), Expect = 3.3
 Identities = 13/23 (56%), Positives = 20/23 (86%)

Query: 72  EEERKKKKKRKKRKKKKKKKEKR 94
           +E+ K+KK++KK+KKKKKKK  +
Sbjct: 106 KEKAKQKKQKKKKKKKKKKKTSK 128



 Score = 26.6 bits (59), Expect = 4.9
 Identities = 13/29 (44%), Positives = 22/29 (75%)

Query: 65  ILAMEKEEEERKKKKKRKKRKKKKKKKEK 93
           +L  + +++++KKKKK+KK+KK  KK  K
Sbjct: 104 VLKEKAKQKKQKKKKKKKKKKKTSKKAAK 132


>gnl|CDD|214818 smart00784, SPT2, SPT2 chromatin protein.  This entry includes the
           Saccharomyces cerevisiae protein SPT2 which is a
           chromatin protein involved in transcriptional
           regulation.
          Length = 106

 Score = 30.4 bits (69), Expect = 0.19
 Identities = 9/29 (31%), Positives = 19/29 (65%)

Query: 63  SYILAMEKEEEERKKKKKRKKRKKKKKKK 91
           S  LA  ++ EE + +K+ ++ K+ +K+K
Sbjct: 78  SARLARLEDREEERLEKEEEREKRARKRK 106


>gnl|CDD|143229 cd05752, Ig1_FcgammaR_like, First immunoglobulin (Ig)-like domain of
           Fcgamma-receptors (FcgammaRs) and similar proteins.
           Ig1_FcgammaR_like: domain similar to the first
           immunoglobulin (Ig)-like domain of Fcgamma-receptors
           (FcgammaRs). Interactions between IgG and FcgammaR are
           important to the initiation of cellular and humoral
           response. IgG binding to FcgammaR leads to a cascade of
           signals and ultimately to functions such as
           antibody-dependent-cellular-cytotoxicity (ADCC),
           endocytosis, phagocytosis, release of inflammatory
           mediators, etc. FcgammaR has two Ig-like domains. This
           group also contains FcepsilonRI, which binds IgE with
           high affinity.
          Length = 78

 Score = 29.6 bits (67), Expect = 0.21
 Identities = 18/55 (32%), Positives = 22/55 (40%), Gaps = 5/55 (9%)

Query: 112 KGYSVTLECKADGNPVPNITWTRKNNNLPGGEYSYSGNSLTVRHTNRHSAGIYLC 166
           +G  VTL C    +P  N T    N  L       + NS  +R  N  S G Y C
Sbjct: 14  QGEKVTLTCNGFNSPEQNSTQWYHNGKL----LETTTNSYRIRAANNDS-GEYRC 63


>gnl|CDD|219514 pfam07686, V-set, Immunoglobulin V-set domain.  This domain is
           found in antibodies as well as neural protein P0 and
           CTL4 amongst others.
          Length = 114

 Score = 30.3 bits (68), Expect = 0.22
 Identities = 22/104 (21%), Positives = 34/104 (32%), Gaps = 27/104 (25%)

Query: 108 VEVKKGYSVTLECK-ADGNPVPNITWTR----------------KNNNLPGGEY------ 144
           V V +G SVTL C  +  +   ++ W +                  N   G  +      
Sbjct: 11  VTVAEGGSVTLPCSFSSSSGSTSVYWYKQPLGKGPELIIHYVTSTPNGKVGPRFKGRVTL 70

Query: 145 ----SYSGNSLTVRHTNRHSAGIYLCVANNMVGSSAAASIALHV 184
               S +  SLT+ +     +G Y C  +N       A   L V
Sbjct: 71  SGNGSKNDFSLTISNLRLSDSGTYTCAVSNPNELVFGAGTRLTV 114


>gnl|CDD|219838 pfam08432, DUF1742, Fungal protein of unknown function (DUF1742).
           This is a family of fungal proteins of unknown
          function.
          Length = 182

 Score = 30.8 bits (70), Expect = 0.25
 Identities = 14/24 (58%), Positives = 19/24 (79%)

Query: 70 KEEEERKKKKKRKKRKKKKKKKEK 93
          KE EE++K K +KK+ KKKK K+K
Sbjct: 76 KEYEEKQKWKWKKKKSKKKKDKDK 99



 Score = 30.0 bits (68), Expect = 0.48
 Identities = 13/25 (52%), Positives = 19/25 (76%)

Query: 69  EKEEEERKKKKKRKKRKKKKKKKEK 93
           E EE+++ K KK+K +KKK K K+K
Sbjct: 77  EYEEKQKWKWKKKKSKKKKDKDKDK 101



 Score = 28.5 bits (64), Expect = 1.6
 Identities = 11/33 (33%), Positives = 18/33 (54%)

Query: 61  YLSYILAMEKEEEERKKKKKRKKRKKKKKKKEK 93
           +       +K++++ K KK  KK  K +KK EK
Sbjct: 84  WKWKKKKSKKKKDKDKDKKDDKKDDKSEKKDEK 116



 Score = 27.3 bits (61), Expect = 3.4
 Identities = 14/59 (23%), Positives = 26/59 (44%), Gaps = 5/59 (8%)

Query: 68  MEKEEEERKKKKKRKKRKKKKKKKEKRIPPRIIYVSGAGKVEVKKGYSVTLECKADGNP 126
            +K ++++ K K +K  KK  K ++K               ++ K YS TL   ++  P
Sbjct: 88  KKKSKKKKDKDKDKKDDKKDDKSEKKDEKEAE-----DKLEDLTKSYSETLSTLSELKP 141



 Score = 27.3 bits (61), Expect = 3.6
 Identities = 12/26 (46%), Positives = 20/26 (76%)

Query: 68 MEKEEEERKKKKKRKKRKKKKKKKEK 93
          +EK ++E ++K+K K +KKK KKK+ 
Sbjct: 71 IEKVKKEYEEKQKWKWKKKKSKKKKD 96



 Score = 27.0 bits (60), Expect = 4.9
 Identities = 13/24 (54%), Positives = 17/24 (70%)

Query: 70 KEEEERKKKKKRKKRKKKKKKKEK 93
          K+E E K+K K KK+K KKKK + 
Sbjct: 75 KKEYEEKQKWKWKKKKSKKKKDKD 98


>gnl|CDD|234707 PRK00270, rpsU, 30S ribosomal protein S21; Reviewed.
          Length = 64

 Score = 29.1 bits (66), Expect = 0.26
 Identities = 9/26 (34%), Positives = 17/26 (65%)

Query: 69 EKEEEERKKKKKRKKRKKKKKKKEKR 94
          EK  E+RK+KK   +++++KK   + 
Sbjct: 39 EKPSEKRKRKKAAARKRRRKKLAREE 64


>gnl|CDD|205480 pfam13300, DUF4078, Domain of unknown function (DUF4078).  This
          family is found from fungi to humans, but its exact
          function is not known.
          Length = 88

 Score = 29.6 bits (67), Expect = 0.27
 Identities = 10/29 (34%), Positives = 22/29 (75%)

Query: 66 LAMEKEEEERKKKKKRKKRKKKKKKKEKR 94
          L   +EE ER++K++ ++++K+K+  E+R
Sbjct: 52 LEKAREETERERKEREERKEKRKRAIEER 80



 Score = 28.0 bits (63), Expect = 1.1
 Identities = 13/25 (52%), Positives = 20/25 (80%)

Query: 70 KEEEERKKKKKRKKRKKKKKKKEKR 94
          KE EERK+K+KR   +++KK +E+R
Sbjct: 64 KEREERKEKRKRAIEERRKKIEERR 88



 Score = 27.2 bits (61), Expect = 1.7
 Identities = 11/32 (34%), Positives = 26/32 (81%), Gaps = 1/32 (3%)

Query: 68 MEKEEEERKK-KKKRKKRKKKKKKKEKRIPPR 98
          ME+ E+ R++ +++RK+R+++K+K+++ I  R
Sbjct: 49 MEELEKAREETERERKEREERKEKRKRAIEER 80



 Score = 27.2 bits (61), Expect = 2.2
 Identities = 12/34 (35%), Positives = 23/34 (67%), Gaps = 9/34 (26%)

Query: 70 KEEEERKK---------KKKRKKRKKKKKKKEKR 94
          K+EEERK+         ++  ++RK+++++KEKR
Sbjct: 40 KDEEERKEQMEELEKAREETERERKEREERKEKR 73


>gnl|CDD|219621 pfam07890, Rrp15p, Rrp15p.  Rrp15p is required for the formation
          of 60S ribosomal subunits.
          Length = 132

 Score = 30.4 bits (69), Expect = 0.29
 Identities = 6/28 (21%), Positives = 19/28 (67%)

Query: 67 AMEKEEEERKKKKKRKKRKKKKKKKEKR 94
          A +K + E+ +KK +++ + +K++  ++
Sbjct: 33 AKKKLKSEKLEKKAKRQLRAEKRQALEK 60


>gnl|CDD|143299 cd05891, Ig_M-protein_C, C-terminal immunoglobulin (Ig)-like domain
           of M-protein (also known as myomesin-2).
           Ig_M-protein_C: the C-terminal immunoglobulin (Ig)-like
           domain of M-protein (also known as myomesin-2).
           M-protein is a structural protein localized to the
           M-band, a transverse structure in the center of the
           sarcomere, and is a candidate for M-band bridges.
           M-protein is modular consisting mainly of repetitive
           IG-like and fibronectin type III (FnIII) domains, and
           has a muscle-type specific expression pattern. M-protein
           is present in fast fibers.
          Length = 92

 Score = 29.5 bits (66), Expect = 0.32
 Identities = 18/72 (25%), Positives = 30/72 (41%), Gaps = 6/72 (8%)

Query: 108 VEVKKGYSVTLECKADGNPVPNITWTRKNNNL-PGGEYSYSGN-----SLTVRHTNRHSA 161
           V + +G ++ L C   GNP P + W + + ++     YS         SLT++      +
Sbjct: 11  VTIMEGKTLNLTCTVFGNPDPEVIWFKNDQDIELSEHYSVKLEQGKYASLTIKGVTSEDS 70

Query: 162 GIYLCVANNMVG 173
           G Y     N  G
Sbjct: 71  GKYSINVKNKYG 82


>gnl|CDD|129661 TIGR00570, cdk7, CDK-activating kinase assembly factor MAT1.  All
           proteins in this family for which functions are known
           are cyclin dependent protein kinases that are components
           of TFIIH, a complex that is involved in nucleotide
           excision repair and transcription initiation. Also known
           as MAT1 (menage a trois 1). This family is based on the
           phylogenomic analysis of JA Eisen (1999, Ph.D. Thesis,
           Stanford University) [DNA metabolism, DNA replication,
           recombination, and repair].
          Length = 309

 Score = 30.9 bits (70), Expect = 0.35
 Identities = 11/33 (33%), Positives = 20/33 (60%)

Query: 61  YLSYILAMEKEEEERKKKKKRKKRKKKKKKKEK 93
            L   L  EKEEEE+++   +K+ ++++  K K
Sbjct: 146 ELEEALEFEKEEEEQRRLLLQKEEEEQQMNKRK 178


>gnl|CDD|215774 pfam00183, HSP90, Hsp90 protein. 
          Length = 529

 Score = 30.9 bits (70), Expect = 0.35
 Identities = 12/25 (48%), Positives = 19/25 (76%)

Query: 69 EKEEEERKKKKKRKKRKKKKKKKEK 93
          +KEEE  ++++K +K+KK KK KE 
Sbjct: 56 DKEEEVDEEEEKEEKKKKTKKVKET 80



 Score = 29.7 bits (67), Expect = 0.95
 Identities = 11/24 (45%), Positives = 17/24 (70%)

Query: 69 EKEEEERKKKKKRKKRKKKKKKKE 92
          E +EEE K++KK+K +K K+   E
Sbjct: 60 EVDEEEEKEEKKKKTKKVKETTTE 83



 Score = 29.0 bits (65), Expect = 1.6
 Identities = 9/24 (37%), Positives = 16/24 (66%)

Query: 71 EEEERKKKKKRKKRKKKKKKKEKR 94
          EE + +++K+ KK+K KK K+   
Sbjct: 59 EEVDEEEEKEEKKKKTKKVKETTT 82



 Score = 27.8 bits (62), Expect = 3.8
 Identities = 10/23 (43%), Positives = 17/23 (73%)

Query: 69 EKEEEERKKKKKRKKRKKKKKKK 91
          E+ +EE +K++K+KK KK K+  
Sbjct: 59 EEVDEEEEKEEKKKKTKKVKETT 81



 Score = 27.4 bits (61), Expect = 4.8
 Identities = 11/24 (45%), Positives = 17/24 (70%)

Query: 69 EKEEEERKKKKKRKKRKKKKKKKE 92
          ++EEE+ +KKKK KK K+   + E
Sbjct: 62 DEEEEKEEKKKKTKKVKETTTEWE 85



 Score = 27.0 bits (60), Expect = 7.2
 Identities = 10/26 (38%), Positives = 16/26 (61%)

Query: 68 MEKEEEERKKKKKRKKRKKKKKKKEK 93
             ++EE   +++ K+ KKKK KK K
Sbjct: 53 KTTDKEEEVDEEEEKEEKKKKTKKVK 78


>gnl|CDD|143273 cd05865, Ig1_NCAM-1, First immunoglobulin (Ig)-like domain of
           neural cell adhesion molecule NCAM-1.  Ig1_NCAM-1: first
           immunoglobulin (Ig)-like domain of neural cell adhesion
           molecule NCAM-1. NCAM-1 plays important roles in the
           development and regeneration of the central nervous
           system, in synaptogenesis and neural migration. NCAM
           mediates cell-cell and cell-substratum recognition and
           adhesion via homophilic (NCAM-NCAM), and heterophilic
           (NCAM-nonNCAM), interactions. NCAM is expressed as three
           major isoforms having different intracellular
           extensions. The extracellular portion of NCAM has five
           N-terminal Ig-like domains and two fibronectin type III
           domains. The double zipper adhesion complex model for
           NCAM homophilic binding involves the Ig1, Ig2, and Ig3
           domains. By this model, Ig1 and Ig2 mediate dimerization
           of NCAM molecules situated on the same cell surface (cis
           interactions), and Ig3 domains mediate interactions
           between NCAM molecules expressed on the surface of
           opposing cells (trans interactions), through binding to
           the Ig1 and Ig2 domains. The adhesive ability of NCAM is
           modulated by the addition of polysialic acid chains to
           the fifth Ig-like domain.
          Length = 96

 Score = 29.6 bits (66), Expect = 0.37
 Identities = 11/36 (30%), Positives = 21/36 (58%)

Query: 149 NSLTVRHTNRHSAGIYLCVANNMVGSSAAASIALHV 184
           ++LT+ + N   AGIY CV +N     + A++ + +
Sbjct: 60  STLTIYNANIDDAGIYKCVVSNEDEGESEATVNVKI 95


>gnl|CDD|220371 pfam09736, Bud13, Pre-mRNA-splicing factor of RES complex.  This
          entry is characterized by proteins with alternating
          conserved and low-complexity regions. Bud13 together
          with Snu17p and a newly identified factor,
          Pml1p/Ylr016c, form a novel trimeric complex called
          the RES complex, the pre-mRNA retention and splicing
          complex. Subunits of this complex are not essential for
          viability of yeasts but they are required for efficient
          splicing in vitro and in vivo. Furthermore,
          inactivation of this complex causes pre-mRNA leakage
          from the nucleus. Bud13 contains a unique,
          phylogenetically conserved C-terminal region of unknown
          function.
          Length = 141

 Score = 30.0 bits (68), Expect = 0.43
 Identities = 10/23 (43%), Positives = 20/23 (86%)

Query: 72 EEERKKKKKRKKRKKKKKKKEKR 94
          EE+R++K++ K+ K++K++KEK 
Sbjct: 15 EEKREEKEREKEEKERKEEKEKE 37



 Score = 28.8 bits (65), Expect = 1.1
 Identities = 9/24 (37%), Positives = 20/24 (83%)

Query: 66 LAMEKEEEERKKKKKRKKRKKKKK 89
          +  ++EE+ER+K++K +K +K+K+
Sbjct: 14 IEEKREEKEREKEEKERKEEKEKE 37



 Score = 26.9 bits (60), Expect = 4.4
 Identities = 9/23 (39%), Positives = 21/23 (91%)

Query: 69 EKEEEERKKKKKRKKRKKKKKKK 91
          E++ EE++++K+ K+RK++K+K+
Sbjct: 15 EEKREEKEREKEEKERKEEKEKE 37



 Score = 26.9 bits (60), Expect = 4.9
 Identities = 9/23 (39%), Positives = 19/23 (82%)

Query: 70 KEEEERKKKKKRKKRKKKKKKKE 92
          +E+ E K+++K +K +K++K+KE
Sbjct: 15 EEKREEKEREKEEKERKEEKEKE 37



 Score = 26.5 bits (59), Expect = 6.2
 Identities = 7/23 (30%), Positives = 21/23 (91%)

Query: 73 EERKKKKKRKKRKKKKKKKEKRI 95
          EE++++K+R+K +K++K+++++ 
Sbjct: 15 EEKREEKEREKEEKERKEEKEKE 37



 Score = 26.1 bits (58), Expect = 7.6
 Identities = 9/38 (23%), Positives = 19/38 (50%)

Query: 75  RKKKKKRKKRKKKKKKKEKRIPPRIIYVSGAGKVEVKK 112
           R    + K+ +K+++K+EK          G G V+ ++
Sbjct: 10  RIIDIEEKREEKEREKEEKERKEEKEKEWGKGLVQKEE 47


>gnl|CDD|197876 smart00792, Agouti, Agouti protein.  The agouti protein regulates
          pigmentation in the mouse hair follicle producing a
          black hair with a subapical yellow band. A highly
          homologous protein agouti signal protein (ASIP) is
          present in humans and is expressed at highest levels in
          adipose tissue where it may play a role in energy
          homeostasis and possibly human pigmentation.
          Length = 124

 Score = 29.5 bits (66), Expect = 0.44
 Identities = 10/29 (34%), Positives = 14/29 (48%)

Query: 70 KEEEERKKKKKRKKRKKKKKKKEKRIPPR 98
          K   E  +KK  +K++KK      R  PR
Sbjct: 56 KISAEEAEKKLLQKKEKKALTNVLRPEPR 84



 Score = 26.0 bits (57), Expect = 8.7
 Identities = 8/31 (25%), Positives = 16/31 (51%)

Query: 68 MEKEEEERKKKKKRKKRKKKKKKKEKRIPPR 98
          +  EE E+K  +K++K+      + +   PR
Sbjct: 57 ISAEEAEKKLLQKKEKKALTNVLRPEPRSPR 87


>gnl|CDD|220926 pfam10988, DUF2807, Protein of unknown function (DUF2807).  This
           bacterial family of proteins shows structural similarity
           to other pectin lyase families. Although structures from
           this family align with acetyl-transferases, there is no
           conservation of catalytic residues found. It is likely
           that the function is one of cell-adhesion. In PDB:3jx8,
           it is interesting to note that the sequence contains
           several well defined sequence repeats, centred around
           GSG motifs defining the tight beta turn between the two
           sheets of the super-helix; there are 8 such repeats in
           the C-terminal half of the protein, which could be
           grouped into 4 repeats of two. It seems likely that this
           family belongs to the superfamily of trimeric
           autotransporter adhesins (TAAs), which are important
           virulence factors in Gram-negative pathogens. In the
           case of Parabacteroides distasonis, which is a component
           of the normal distal human gut microbiota, TAA-like
           complexes probably modulate adherence to the host
           (information derived from TOPSAN).
          Length = 181

 Score = 30.2 bits (69), Expect = 0.45
 Identities = 14/42 (33%), Positives = 18/42 (42%), Gaps = 4/42 (9%)

Query: 100 IYVSGAGKVEVKKG--YSVTLECKADGNPVPNITWTRKNNNL 139
           I VSG   V + +G   SV +E   D N +  I    K   L
Sbjct: 5   IKVSGGIDVILTQGDENSVVIE--GDENLLDKIETEVKGGTL 44


>gnl|CDD|218640 pfam05565, Sipho_Gp157, Siphovirus Gp157.  This family contains
           both viral and bacterial proteins which are related to
           the Gp157 protein of the Streptococcus thermophilus SFi
           bacteriophages. It is thought that bacteria possessing
           the gene coding for this protein have an increased
           resistance to the bacteriophage.
          Length = 162

 Score = 29.9 bits (68), Expect = 0.48
 Identities = 14/54 (25%), Positives = 25/54 (46%), Gaps = 2/54 (3%)

Query: 60  HYLSYILAMEKEEEERKKKKKR-KKRKKKKKKKEKRIPPRI-IYVSGAGKVEVK 111
           +    I  +E + E  K + KR  +RKK  + K KR+   +   +   G  ++K
Sbjct: 44  NIAKVIKNLEADIEAIKAEIKRLAERKKSIENKVKRLKDYLEEAMEATGIKKIK 97


>gnl|CDD|224212 COG1293, COG1293, Predicted RNA-binding protein homologous to
           eukaryotic snRNP [Transcription].
          Length = 564

 Score = 30.4 bits (69), Expect = 0.49
 Identities = 10/25 (40%), Positives = 15/25 (60%)

Query: 69  EKEEEERKKKKKRKKRKKKKKKKEK 93
            +E  E    K +KK++KKK+  EK
Sbjct: 416 REELIEEGLLKSKKKKRKKKEWFEK 440



 Score = 29.3 bits (66), Expect = 1.4
 Identities = 12/28 (42%), Positives = 17/28 (60%)

Query: 67  AMEKEEEERKKKKKRKKRKKKKKKKEKR 94
             E+  EE   K K+KKRKKK+  ++ R
Sbjct: 415 IREELIEEGLLKSKKKKRKKKEWFEKFR 442



 Score = 28.5 bits (64), Expect = 2.5
 Identities = 10/36 (27%), Positives = 17/36 (47%)

Query: 69  EKEEEERKKKKKRKKRKKKKKKKEKRIPPRIIYVSG 104
           E  EE  ++   + K+KK+KKK+        +   G
Sbjct: 414 EIREELIEEGLLKSKKKKRKKKEWFEKFRWFVSSDG 449



 Score = 28.1 bits (63), Expect = 3.1
 Identities = 12/37 (32%), Positives = 19/37 (51%)

Query: 70  KEEEERKKKKKRKKRKKKKKKKEKRIPPRIIYVSGAG 106
           +E  E   ++   K KKKK+KK++       +VS  G
Sbjct: 413 EEIREELIEEGLLKSKKKKRKKKEWFEKFRWFVSSDG 449


>gnl|CDD|219761 pfam08243, SPT2, SPT2 chromatin protein.  This family includes the
           Saccharomyces cerevisiae protein SPT2 which is a
           chromatin protein involved in transcriptional
           regulation.
          Length = 116

 Score = 29.5 bits (66), Expect = 0.50
 Identities = 11/26 (42%), Positives = 19/26 (73%)

Query: 66  LAMEKEEEERKKKKKRKKRKKKKKKK 91
           +A  ++E E  ++++ +KRKKKKK K
Sbjct: 91  MARLEDERELAREEEEEKRKKKKKNK 116



 Score = 25.6 bits (56), Expect = 9.3
 Identities = 11/26 (42%), Positives = 19/26 (73%)

Query: 68  MEKEEEERKKKKKRKKRKKKKKKKEK 93
           M + E+ER+  ++ ++ K+KKKKK K
Sbjct: 91  MARLEDERELAREEEEEKRKKKKKNK 116


>gnl|CDD|214395 CHL00204, ycf1, Ycf1; Provisional.
          Length = 1832

 Score = 30.5 bits (69), Expect = 0.53
 Identities = 8/26 (30%), Positives = 13/26 (50%)

Query: 70   KEEEERKKKKKRKKRKKKKKKKEKRI 95
            +E+      KKRK +K+ K   E  +
Sbjct: 1558 EEDYAESDIKKRKNKKQYKSNTEAEL 1583



 Score = 29.7 bits (67), Expect = 1.0
 Identities = 11/31 (35%), Positives = 23/31 (74%)

Query: 65  ILAMEKEEEERKKKKKRKKRKKKKKKKEKRI 95
           I    +E+ ++KKKK++KK ++ K++++ RI
Sbjct: 737 ISDSVEEKTKKKKKKEKKKEEEYKREEKARI 767



 Score = 28.1 bits (63), Expect = 3.1
 Identities = 12/31 (38%), Positives = 15/31 (48%)

Query: 63   SYILAMEKEEEERKKKKKRKKRKKKKKKKEK 93
            S +   EK  EE   +   KKRK KK+ K  
Sbjct: 1548 SVLSNQEKNIEEDYAESDIKKRKNKKQYKSN 1578


>gnl|CDD|215214 PLN02381, PLN02381, valyl-tRNA synthetase.
          Length = 1066

 Score = 30.3 bits (68), Expect = 0.60
 Identities = 12/23 (52%), Positives = 18/23 (78%)

Query: 68 MEKEEEERKKKKKRKKRKKKKKK 90
          + +EE ERKKKK+ K ++K+ KK
Sbjct: 13 LTEEELERKKKKEEKAKEKELKK 35



 Score = 28.3 bits (63), Expect = 2.6
 Identities = 13/23 (56%), Positives = 17/23 (73%)

Query: 71 EEEERKKKKKRKKRKKKKKKKEK 93
          EEE  +KKKK +K K+K+ KK K
Sbjct: 15 EEELERKKKKEEKAKEKELKKLK 37



 Score = 27.6 bits (61), Expect = 4.8
 Identities = 10/29 (34%), Positives = 20/29 (68%)

Query: 67 AMEKEEEERKKKKKRKKRKKKKKKKEKRI 95
          A +K   E + ++K+KK +K K+K+ K++
Sbjct: 8  AEKKILTEEELERKKKKEEKAKEKELKKL 36



 Score = 27.2 bits (60), Expect = 6.6
 Identities = 11/25 (44%), Positives = 17/25 (68%)

Query: 69 EKEEEERKKKKKRKKRKKKKKKKEK 93
          E+E E +KKK+++ K K+ KK K  
Sbjct: 15 EEELERKKKKEEKAKEKELKKLKAA 39


>gnl|CDD|218752 pfam05793, TFIIF_alpha, Transcription initiation factor IIF, alpha
           subunit (TFIIF-alpha).  Transcription initiation factor
           IIF, alpha subunit (TFIIF-alpha) or RNA polymerase
           II-associating protein 74 (RAP74) is the large subunit
           of transcription factor IIF (TFIIF), which is essential
           for accurate initiation and stimulates elongation by RNA
           polymerase II.
          Length = 528

 Score = 30.3 bits (68), Expect = 0.66
 Identities = 12/26 (46%), Positives = 14/26 (53%)

Query: 69  EKEEEERKKKKKRKKRKKKKKKKEKR 94
           E+E    KK KK KK K KK   +K 
Sbjct: 329 EEEGGLSKKGKKLKKLKGKKNGLDKD 354



 Score = 28.4 bits (63), Expect = 2.2
 Identities = 13/30 (43%), Positives = 18/30 (60%)

Query: 69  EKEEEERKKKKKRKKRKKKKKKKEKRIPPR 98
           E  +EE+ KKKK+K  K KKK  + +   R
Sbjct: 234 EDGDEEKSKKKKKKLAKNKKKLDDDKKGKR 263



 Score = 27.2 bits (60), Expect = 6.7
 Identities = 11/22 (50%), Positives = 15/22 (68%)

Query: 69  EKEEEERKKKKKRKKRKKKKKK 90
            ++ +E K KKK+KK  K KKK
Sbjct: 233 GEDGDEEKSKKKKKKLAKNKKK 254


>gnl|CDD|212594 cd11720, FANCI, Fanconi anemia I protein.  The Fanconi anemia ID
            complex consists of two subunits, Fanconi anemia I and
            Fanconi anemia D2 (FANCI-FANCD2) and plays a central role
            in the repair of DNA interstrand cross-links (ICLs). The
            complex is activated via DNA damage-induced
            phosphorylation by ATR (ataxia telangiectasia and
            Rad3-related) and monoubiquitination by the FA core
            complex ubiquitin ligase, and it binds to DNA at the ICL
            site, recognizing branched DNA structures. Defects in the
            complex cause Fanconi anemia, a cancer predisposition
            syndrome. The phosphorylation of FANCI may function as a
            molecular switch to turn on the FA pathway.
          Length = 1202

 Score = 30.4 bits (69), Expect = 0.66
 Identities = 15/41 (36%), Positives = 25/41 (60%), Gaps = 4/41 (9%)

Query: 65   ILAMEKEEEERKKKKKRKKRKKK----KKKKEKRIPPRIIY 101
            I  +E+ + E+  KKK+KK K K    K  +E ++ P++IY
Sbjct: 1125 ITYVEENQSEQDSKKKKKKAKSKVQRNKILRETKLIPKLIY 1165


>gnl|CDD|240576 cd12932, RRP7_like, RRP7 domain ribosomal RNA-processing protein
          7 (Rrp7p), ribosomal RNA-processing protein 7 homolog A
          (Rrp7A), and similar proteins.  This CD corresponds to
          the RRP7 domain of Rrp7p and Rrp7A. Rrp7p is encoded by
          YCL031C gene from Saccharomyces cerevisiae. It is an
          essential yeast protein involved in pre-rRNA processing
          and ribosome assembly, and is speculated to be required
          for correct assembly of rpS27 into the pre-ribosomal
          particle. Rrp7A, also termed gastric cancer antigen
          Zg14, is the Rrp7p homolog mainly found in Metazoans.
          The cellular function of Rrp7A currently remains
          unclear. Both Rrp7p and Rrp7A harbor an N-terminal
          RNA recognition motif (RRM), also termed RBD (RNA
          binding domain) or RNP (ribonucleoprotein domain), and
          a C-terminal RRP7 domain.
          Length = 118

 Score = 28.8 bits (65), Expect = 0.71
 Identities = 12/22 (54%), Positives = 15/22 (68%)

Query: 72 EEERKKKKKRKKRKKKKKKKEK 93
           EE  + K ++K KKKKKKKE 
Sbjct: 67 REEAVEAKAKEKEKKKKKKKEL 88



 Score = 28.8 bits (65), Expect = 0.94
 Identities = 12/35 (34%), Positives = 22/35 (62%), Gaps = 7/35 (20%)

Query: 67  AMEKEEEERKKKKKRKKRKKKKKK-------KEKR 94
           A E+  E + K+K++KK+KKK+ +       +EK+
Sbjct: 66  AREEAVEAKAKEKEKKKKKKKELEDFYRFQIREKK 100


>gnl|CDD|176058 cd08676, C2A_Munc13-like, C2 domain first repeat in Munc13
           (mammalian uncoordinated)-like proteins.  C2-like
           domains are thought to be involved in phospholipid
           binding in a Ca2+ independent manner in both Unc13 and
           Munc13. Caenorhabditis elegans Unc13 has a central domain
           with sequence similarity to PKC, which includes C1 and
           C2-related domains. Unc13 binds phorbol esters and DAG
           with high affinity in a phospholipid manner.  Mutations
           in Unc13 result in abnormal neuronal connections and
           impairment in cholinergic neurotransmission in the
           nematode.  Munc13 is the mammalian homolog, which is
           expressed in the brain.  There are 3 isoforms (Munc13-1,
           -2, -3), which are thought to play a role in
           neurotransmitter release and are hypothesized to be
           high-affinity receptors for phorbol esters.  Unc13 and
           Munc13 contain both C1 and C2 domains.  There are two C2
           related domains present, one central and one at the
           carboxyl end.  Munc13-1 contains a third C2-like domain.
           Munc13 interacts with syntaxin, synaptobrevin, and
           synaptotagmin suggesting a role for these as scaffolding
           proteins. C2 domains fold into an 8-stranded
           beta-sandwich that can adopt 2 structural arrangements:
           Type I and Type II, distinguished by a circular
           permutation involving their N- and C-terminal beta
           strands. Many C2 domains are Ca2+-dependent
           membrane-targeting modules that bind a wide variety of
           substances, including phospholipids, inositol
           polyphosphates, and intracellular proteins.  Most C2
           domain proteins are either signal transduction enzymes
           that contain a single C2 domain, such as protein kinase
           C, or membrane trafficking proteins which contain at
           least two C2 domains, such as synaptotagmin 1.  However,
           there are a few exceptions to this including RIM
           isoforms and some splice variants of piccolo/aczonin and
           intersectin which only have a single C2 domain.  C2
           domains with a calcium binding region have negatively
           charged residues, primarily aspartates, that serve as
           ligands for calcium ions. This cd contains the second C2
           repeat, C2B, and has a type-II topology.
          Length = 153

 Score = 29.3 bits (66), Expect = 0.72
 Identities = 12/44 (27%), Positives = 20/44 (45%), Gaps = 5/44 (11%)

Query: 69  EKEEEERKKKKKRKKRKKKKKKKEKRIPPRIIYVSGAGKVEVKK 112
               E   +K K++K  +KK   +  +P + I V+     EVK 
Sbjct: 59  PASRERNSEKSKKRKSHRKKAVLKDTVPAKSIKVT-----EVKP 97


>gnl|CDD|193409 pfam12936, Kri1_C, KRI1-like family C-terminal.  The yeast member
          of this family (Kri1p) is found to be required for 40S
          ribosome biogenesis in the nucleolus. This is the
          C-terminal domain of the family.
          Length = 93

 Score = 28.3 bits (64), Expect = 0.78
 Identities = 10/26 (38%), Positives = 16/26 (61%)

Query: 69 EKEEEERKKKKKRKKRKKKKKKKEKR 94
          +KEE  + KKK  KK + ++ KK+  
Sbjct: 68 DKEERRKDKKKYGKKARLREWKKKVF 93



 Score = 26.4 bits (59), Expect = 3.7
 Identities = 10/30 (33%), Positives = 22/30 (73%)

Query: 66 LAMEKEEEERKKKKKRKKRKKKKKKKEKRI 95
          LA  +++EER+K KK+  +K + ++ +K++
Sbjct: 63 LAPYRDKEERRKDKKKYGKKARLREWKKKV 92


>gnl|CDD|219563 pfam07767, Nop53, Nop53 (60S ribosomal biogenesis).  This nucleolar
           family of proteins is involved in 60S ribosomal
           biogenesis. They are specifically involved in the
           processing beyond the 27S stage of 25S rRNA maturation.
           This family contains sequences that bear similarity to
           the glioma tumour suppressor candidate region gene 2
           protein (p60). This protein has been found to interact
           with herpes simplex type 1 regulatory proteins.
          Length = 387

 Score = 29.7 bits (67), Expect = 0.87
 Identities = 10/28 (35%), Positives = 15/28 (53%)

Query: 67  AMEKEEEERKKKKKRKKRKKKKKKKEKR 94
             E E E   K  + K++ K ++ KEKR
Sbjct: 261 GFESEYEPINKPVRPKRKTKAQRNKEKR 288



 Score = 28.1 bits (63), Expect = 2.9
 Identities = 11/26 (42%), Positives = 20/26 (76%)

Query: 69  EKEEEERKKKKKRKKRKKKKKKKEKR 94
           +KE+   +KK++RK+R +KKK K ++
Sbjct: 321 QKEKARARKKEQRKERGEKKKLKRRK 346



 Score = 27.7 bits (62), Expect = 3.5
 Identities = 9/28 (32%), Positives = 21/28 (75%)

Query: 67  AMEKEEEERKKKKKRKKRKKKKKKKEKR 94
            + ++E+ R +KK+++K + +KKK ++R
Sbjct: 318 EVAQKEKARARKKEQRKERGEKKKLKRR 345



 Score = 27.4 bits (61), Expect = 5.8
 Identities = 10/32 (31%), Positives = 19/32 (59%), Gaps = 6/32 (18%)

Query: 70  KEEEERKKKKKRKKR------KKKKKKKEKRI 95
           K + +R K+K+RK+       +K+ KKK  ++
Sbjct: 278 KTKAQRNKEKRRKELEREAKEEKQLKKKLAQL 309



 Score = 27.0 bits (60), Expect = 6.2
 Identities = 8/26 (30%), Positives = 15/26 (57%)

Query: 69  EKEEEERKKKKKRKKRKKKKKKKEKR 94
              +  R K+K + +R K+K++KE  
Sbjct: 268 PINKPVRPKRKTKAQRNKEKRRKELE 293


>gnl|CDD|235250 PRK04195, PRK04195, replication factor C large subunit;
           Provisional.
          Length = 482

 Score = 29.9 bits (68), Expect = 0.88
 Identities = 13/26 (50%), Positives = 19/26 (73%)

Query: 68  MEKEEEERKKKKKRKKRKKKKKKKEK 93
            EK+ EE KK+KK+K    KKK++E+
Sbjct: 419 AEKKREEEKKEKKKKAFAGKKKEEEE 444



 Score = 29.5 bits (67), Expect = 0.93
 Identities = 11/25 (44%), Positives = 21/25 (84%)

Query: 69  EKEEEERKKKKKRKKRKKKKKKKEK 93
           ++EEE+++KKKK    KKK++++E+
Sbjct: 422 KREEEKKEKKKKAFAGKKKEEEEEE 446



 Score = 29.5 bits (67), Expect = 0.97
 Identities = 8/33 (24%), Positives = 21/33 (63%)

Query: 65  ILAMEKEEEERKKKKKRKKRKKKKKKKEKRIPP 97
           I  + ++ E++++++K++K+KK    K+K    
Sbjct: 412 IKKIVEKAEKKREEEKKEKKKKAFAGKKKEEEE 444



 Score = 29.5 bits (67), Expect = 0.99
 Identities = 12/25 (48%), Positives = 20/25 (80%)

Query: 69  EKEEEERKKKKKRKKRKKKKKKKEK 93
           +K EEE+K+KKK+    KKK+++E+
Sbjct: 421 KKREEEKKEKKKKAFAGKKKEEEEE 445



 Score = 28.7 bits (65), Expect = 2.2
 Identities = 10/26 (38%), Positives = 21/26 (80%)

Query: 69  EKEEEERKKKKKRKKRKKKKKKKEKR 94
           E+E++E+KKK    K+K++++++EK 
Sbjct: 424 EEEKKEKKKKAFAGKKKEEEEEEEKE 449



 Score = 28.0 bits (63), Expect = 3.0
 Identities = 11/36 (30%), Positives = 21/36 (58%)

Query: 62  LSYILAMEKEEEERKKKKKRKKRKKKKKKKEKRIPP 97
           +  I+   +++ E +KK+K+KK    KKK+E+    
Sbjct: 412 IKKIVEKAEKKREEEKKEKKKKAFAGKKKEEEEEEE 447



 Score = 27.6 bits (62), Expect = 5.0
 Identities = 9/44 (20%), Positives = 24/44 (54%)

Query: 69  EKEEEERKKKKKRKKRKKKKKKKEKRIPPRIIYVSGAGKVEVKK 112
           +K ++  +K +K+++ +KK+KKK+     +        K + ++
Sbjct: 410 KKIKKIVEKAEKKREEEKKEKKKKAFAGKKKEEEEEEEKEKKEE 453



 Score = 27.2 bits (61), Expect = 5.8
 Identities = 23/85 (27%), Positives = 30/85 (35%), Gaps = 18/85 (21%)

Query: 44  KLITSCSYIILATSLLHYLSYI--------------LAMEKEEEE--RKKKKKRKKRKKK 87
           KL TS   +     +L +LS I              L + +EE E     KK  KK KK 
Sbjct: 358 KLHTSKRKV--RREVLPFLSIIFKHNPELAARLAAFLELTEEEIEFLTGSKKATKKIKKI 415

Query: 88  KKKKEKRIPPRIIYVSGAGKVEVKK 112
            +K EK+                KK
Sbjct: 416 VEKAEKKREEEKKEKKKKAFAGKKK 440



 Score = 27.2 bits (61), Expect = 6.3
 Identities = 14/51 (27%), Positives = 25/51 (49%), Gaps = 1/51 (1%)

Query: 69  EKEEEERKKKKKRKKRKKKKKKKEKRIP-PRIIYVSGAGKVEVKKGYSVTL 118
           EK+++    KKK ++ +++K+KKE+              + E KK    TL
Sbjct: 429 EKKKKAFAGKKKEEEEEEEKEKKEEEKEEEEEEAEEEKEEEEEKKKKQATL 479


>gnl|CDD|238356 cd00660, Topoisomer_IB_N, Topoisomer_IB_N: N-terminal DNA binding
           fragment found in eukaryotic DNA topoisomerase (topo) IB
           proteins similar to the monomeric yeast and human topo I
           and heterodimeric topo I from Leishmania donovani. Topo
           I enzymes are divided into:  topo type IA (bacterial)
           and type IB (eukaryotic). Topo I relaxes superhelical
           tension in duplex DNA by creating a single-strand nick;
           the broken strand can then rotate around the unbroken
           strand to remove DNA supercoils, and the nick is
           religated, liberating topo I. These enzymes regulate the
           topological changes that accompany DNA replication,
           transcription and other nuclear processes.  Human topo I
           is the target of a diverse set of anticancer drugs
           including camptothecins (CPTs). CPTs bind to the topo
           I-DNA complex and inhibit re-ligation of the
           single-strand nick, resulting in the accumulation of
           topo I-DNA adducts.  In addition to differences in
           structure and some biochemical properties,
           Trypanosomatid parasite topo I differ from human topo I
           in their sensitivity to CPTs and other classical topo I
           inhibitors. Trypanosomatid topos I play putative roles
           in organizing the kinetoplast DNA network unique to
           these parasites.  This family may represent more than
           one structural domain.
          Length = 215

 Score = 29.2 bits (66), Expect = 0.94
 Identities = 12/22 (54%), Positives = 16/22 (72%)

Query: 72  EEERKKKKKRKKRKKKKKKKEK 93
           EEE++KKK   K +KK  K+EK
Sbjct: 95  EEEKEKKKAMSKEEKKAIKEEK 116



 Score = 26.5 bits (59), Expect = 8.8
 Identities = 10/25 (40%), Positives = 19/25 (76%)

Query: 70  KEEEERKKKKKRKKRKKKKKKKEKR 94
           +EE+E+KK   ++++K  K++KEK 
Sbjct: 95  EEEKEKKKAMSKEEKKAIKEEKEKL 119


>gnl|CDD|220383 pfam09756, DDRGK, DDRGK domain.  This is a family of proteins of
          approximately 300 residues, found in plants and
          vertebrates. They contain a highly conserved DDRGK
          motif.
          Length = 189

 Score = 29.3 bits (66), Expect = 0.96
 Identities = 9/28 (32%), Positives = 18/28 (64%)

Query: 67 AMEKEEEERKKKKKRKKRKKKKKKKEKR 94
            E+EE E +KK + K+  ++K+++E  
Sbjct: 22 EAEEEEREERKKLEEKREGERKEEEELE 49



 Score = 27.0 bits (60), Expect = 5.3
 Identities = 13/31 (41%), Positives = 22/31 (70%), Gaps = 2/31 (6%)

Query: 67 AMEKEEEERKKK--KKRKKRKKKKKKKEKRI 95
          A E+E EERKK   K+  +RK++++ +E+R 
Sbjct: 23 AEEEEREERKKLEEKREGERKEEEELEEERE 53



 Score = 26.2 bits (58), Expect = 9.6
 Identities = 9/25 (36%), Positives = 22/25 (88%)

Query: 69 EKEEEERKKKKKRKKRKKKKKKKEK 93
          +KEEEERK+++++ ++++++ +K K
Sbjct: 55 KKEEEERKEREEQARKEQEEYEKLK 79


>gnl|CDD|217840 pfam04006, Mpp10, Mpp10 protein.  This family includes proteins
           related to Mpp10 (M phase phosphoprotein 10). The U3
           small nucleolar ribonucleoprotein (snoRNP) is required
           for three cleavage events that generate the mature 18S
           rRNA from the pre-rRNA. In Saccharomyces cerevisiae,
           depletion of Mpp10, a U3 snoRNP-specific protein, halts
           18S rRNA production and impairs cleavage at the three U3
           snoRNP-dependent sites.
          Length = 613

 Score = 29.6 bits (66), Expect = 0.97
 Identities = 16/46 (34%), Positives = 20/46 (43%), Gaps = 3/46 (6%)

Query: 67  AMEKEEEERKKKKKRKKRKKKKKKKEKRIPPRIIYVSGAGKVEVKK 112
             +K  E RKKK+KR+  KK   K +K    R I       V  K 
Sbjct: 556 RTDKNRERRKKKRKRRAAKKAVTKAKKE---RKIGKEKVDGVAKKS 598



 Score = 28.4 bits (63), Expect = 2.6
 Identities = 10/28 (35%), Positives = 18/28 (64%)

Query: 67  AMEKEEEERKKKKKRKKRKKKKKKKEKR 94
           A+EK + E  +  K ++R+KKK+K+   
Sbjct: 546 AIEKSKTELDRTDKNRERRKKKRKRRAA 573



 Score = 28.0 bits (62), Expect = 3.2
 Identities = 8/27 (29%), Positives = 17/27 (62%)

Query: 68  MEKEEEERKKKKKRKKRKKKKKKKEKR 94
             K E +R  K + +++KK+K++  K+
Sbjct: 549 KSKTELDRTDKNRERRKKKRKRRAAKK 575



 Score = 27.3 bits (60), Expect = 6.7
 Identities = 7/25 (28%), Positives = 17/25 (68%)

Query: 69  EKEEEERKKKKKRKKRKKKKKKKEK 93
           E +  ++ +++++KKRK++  KK  
Sbjct: 553 ELDRTDKNRERRKKKRKRRAAKKAV 577


>gnl|CDD|220129 pfam09159, Ydc2-catalyt, Mitochondrial resolvase Ydc2 / RNA
           splicing MRS1.  Members of this family adopt a secondary
           structure consisting of two beta sheets and one alpha
           helix, arranged as a beta-alpha-beta motif. Each beta
           sheet has five strands, arranged in a 32145 order, with
           the second strand being antiparallel to the rest.
           Mitochondrial resolvase Ydc2 is capable of resolving
           Holliday junctions and cleaves DNA after 5'-CT-3' and
           5'-TT-3' sequences. This family also contains the
           mitochondrial RNA-splicing protein MRS1 which is
           involved in the excision of group I introns.
          Length = 254

 Score = 29.3 bits (66), Expect = 0.97
 Identities = 12/28 (42%), Positives = 16/28 (57%)

Query: 68  MEKEEEERKKKKKRKKRKKKKKKKEKRI 95
            E+ E   +KKK R K+K  K  K+ RI
Sbjct: 146 CERTEILAEKKKPRSKKKSSKNSKKLRI 173


>gnl|CDD|237035 PRK12280, rplW, 50S ribosomal protein L23; Reviewed.
          Length = 158

 Score = 29.0 bits (65), Expect = 1.0
 Identities = 10/27 (37%), Positives = 21/27 (77%)

Query: 69  EKEEEERKKKKKRKKRKKKKKKKEKRI 95
           ++  +E ++K+  K +K+KK+KKEK++
Sbjct: 100 KEVSKETEEKEAIKAKKEKKEKKEKKV 126



 Score = 27.4 bits (61), Expect = 3.6
 Identities = 10/31 (32%), Positives = 21/31 (67%)

Query: 69  EKEEEERKKKKKRKKRKKKKKKKEKRIPPRI 99
           EKE++E  K+ + K+  K KK+K+++   ++
Sbjct: 96  EKEQKEVSKETEEKEAIKAKKEKKEKKEKKV 126



 Score = 26.3 bits (58), Expect = 8.9
 Identities = 8/24 (33%), Positives = 17/24 (70%)

Query: 71  EEEERKKKKKRKKRKKKKKKKEKR 94
           EE E+++K+  K+ ++K+  K K+
Sbjct: 93  EESEKEQKEVSKETEEKEAIKAKK 116


>gnl|CDD|143300 cd05892, Ig_Myotilin_C, C-terminal immunoglobulin (Ig)-like domain
           of myotilin.  Ig_Myotilin_C: C-terminal immunoglobulin
           (Ig)-like domain of myotilin. Myotilin belongs to the
           palladin-myotilin-myopalladin family. Proteins belonging
           to the latter family contain multiple Ig-like domains
           and function as scaffolds, modulating actin
           cytoskeleton. Myotilin is most abundant in skeletal and
           cardiac muscle, and is involved in maintaining sarcomere
           integrity. It binds to alpha-actinin, filamin and actin.
           Mutations in myotilin lead to muscle disorders.
          Length = 75

 Score = 27.6 bits (61), Expect = 1.1
 Identities = 22/68 (32%), Positives = 30/68 (44%), Gaps = 13/68 (19%)

Query: 116 VTLECKADGNPVPNITWTRKNNNLPGGEYS------YSGNS----LTVRHTNRHSAGIYL 165
           V LEC+    P P I W R N  +   +Y+      Y  NS    L +++ N+  AG Y 
Sbjct: 1   VKLECQISAIPPPKIFWKRNNEMV---QYNTDRISLYQDNSGRVTLLIKNVNKKDAGWYT 57

Query: 166 CVANNMVG 173
             A N  G
Sbjct: 58  VSAVNEAG 65


>gnl|CDD|235501 PRK05559, PRK05559, DNA topoisomerase IV subunit B; Reviewed.
          Length = 631

 Score = 29.3 bits (67), Expect = 1.1
 Identities = 8/21 (38%), Positives = 11/21 (52%)

Query: 70  KEEEERKKKKKRKKRKKKKKK 90
           K  + R +  K+ KRKKK   
Sbjct: 378 KAAQARLRAAKKVKRKKKTSG 398



 Score = 27.8 bits (63), Expect = 4.1
 Identities = 7/22 (31%), Positives = 10/22 (45%)

Query: 65  ILAMEKEEEERKKKKKRKKRKK 86
           I A +      KK K++KK   
Sbjct: 377 IKAAQARLRAAKKVKRKKKTSG 398



 Score = 27.4 bits (62), Expect = 5.0
 Identities = 7/26 (26%), Positives = 13/26 (50%)

Query: 72  EEERKKKKKRKKRKKKKKKKEKRIPP 97
           +  + + +  KK K+KKK     +P 
Sbjct: 378 KAAQARLRAAKKVKRKKKTSGPALPG 403


>gnl|CDD|177016 CHL00077, rps18, ribosomal protein S18.
          Length = 86

 Score = 28.0 bits (63), Expect = 1.1
 Identities = 5/24 (20%), Positives = 11/24 (45%)

Query: 75 RKKKKKRKKRKKKKKKKEKRIPPR 98
           K K+  +K K+  +++   I   
Sbjct: 2  DKSKRPFRKSKRSFRRRLPPIQSG 25


>gnl|CDD|220252 pfam09468, RNase_H2-Ydr279, Ydr279p protein family (RNase H2
           complex component).  RNases H are enzymes that
           specifically hydrolyse RNA when annealed to a
           complementary DNA and are present in all living
           organisms. In yeast, RNase H2 is composed of a complex of
           three proteins (Rnh2Ap, Ydr279p and Ylr154p); this
           family represents the homologues of Ydr279p. It is not
           known whether non-yeast proteins in this family fulfil
           the same function.
          Length = 287

 Score = 29.2 bits (66), Expect = 1.2
 Identities = 11/27 (40%), Positives = 18/27 (66%)

Query: 67  AMEKEEEERKKKKKRKKRKKKKKKKEK 93
           A  + E++RK K++ KK+K K+ K  K
Sbjct: 244 AESRAEKKRKSKEEIKKKKPKESKGVK 270



 Score = 28.8 bits (65), Expect = 1.6
 Identities = 10/23 (43%), Positives = 13/23 (56%)

Query: 69  EKEEEERKKKKKRKKRKKKKKKK 91
           + +EE +KKK K  K  K  KK 
Sbjct: 253 KSKEEIKKKKPKESKGVKALKKV 275



 Score = 28.8 bits (65), Expect = 1.7
 Identities = 11/25 (44%), Positives = 14/25 (56%)

Query: 69  EKEEEERKKKKKRKKRKKKKKKKEK 93
            KEE ++KK K+ K  K  KK   K
Sbjct: 254 SKEEIKKKKPKESKGVKALKKVVAK 278



 Score = 28.8 bits (65), Expect = 1.7
 Identities = 11/30 (36%), Positives = 15/30 (50%)

Query: 62  LSYILAMEKEEEERKKKKKRKKRKKKKKKK 91
            S      K +EE KKKK ++ +  K  KK
Sbjct: 245 ESRAEKKRKSKEEIKKKKPKESKGVKALKK 274



 Score = 28.5 bits (64), Expect = 1.8
 Identities = 9/26 (34%), Positives = 15/26 (57%)

Query: 68  MEKEEEERKKKKKRKKRKKKKKKKEK 93
            +++ +E  KKKK K+ K  K  K+ 
Sbjct: 250 KKRKSKEEIKKKKPKESKGVKALKKV 275


>gnl|CDD|219901 pfam08555, DUF1754, Eukaryotic family of unknown function
          (DUF1754).  This is a eukaryotic protein family of
          unknown function.
          Length = 90

 Score = 27.8 bits (62), Expect = 1.2
 Identities = 11/17 (64%), Positives = 13/17 (76%)

Query: 77 KKKKRKKRKKKKKKKEK 93
          K KK   +KKKKKKK+K
Sbjct: 13 KGKKIDVKKKKKKKKKK 29



 Score = 26.2 bits (58), Expect = 4.6
 Identities = 11/19 (57%), Positives = 14/19 (73%)

Query: 76 KKKKKRKKRKKKKKKKEKR 94
           KKKK+KK+KK K K+E  
Sbjct: 19 VKKKKKKKKKKNKSKEEVV 37



 Score = 25.8 bits (57), Expect = 6.6
 Identities = 11/18 (61%), Positives = 14/18 (77%)

Query: 76 KKKKKRKKRKKKKKKKEK 93
            KKK+KK+KKK K KE+
Sbjct: 18 DVKKKKKKKKKKNKSKEE 35


>gnl|CDD|206208 pfam14038, YqzE, YqzE-like protein.  The YqzE-like protein family
          includes the B. subtilis YqzE protein, which is
          functionally uncharacterized. It is a part of the ComG
          operon, which is regulated by the competence
          transcription factor ComK. This family of proteins is
          found in bacteria. Proteins in this family are
          typically between 49 and 66 amino acids in length.
          Length = 54

 Score = 26.8 bits (60), Expect = 1.3
 Identities = 9/20 (45%), Positives = 15/20 (75%)

Query: 68 MEKEEEERKKKKKRKKRKKK 87
          M+  +EERK+KK+ +K +K 
Sbjct: 17 MDTPKEERKEKKEERKEEKP 36


>gnl|CDD|227880 COG5593, COG5593, Nucleic-acid-binding protein possibly involved in
           ribosomal biogenesis [Translation, ribosomal structure
           and biogenesis].
          Length = 821

 Score = 29.2 bits (65), Expect = 1.3
 Identities = 12/28 (42%), Positives = 19/28 (67%)

Query: 69  EKEEEERKKKKKRKKRKKKKKKKEKRIP 96
           EKEEEE K+   ++ +KK++K   K +P
Sbjct: 780 EKEEEENKEVSAKRAKKKQRKNMLKSLP 807


>gnl|CDD|234533 TIGR04285, nucleoid_noc, nucleoid occlusion protein.  This model
           describes nucleoid occlusion protein, a close homolog to
           ParB chromosome partitioning proteins including Spo0J in
           Bacillus subtilis. Its gene often is located near the
           gene for the Spo0J ortholog. This protein binds a
           specific DNA sequence and blocks cytokinesis from
           happening until chromosome segregation is complete.
          Length = 255

 Score = 29.0 bits (66), Expect = 1.3
 Identities = 12/25 (48%), Positives = 15/25 (60%)

Query: 70  KEEEERKKKKKRKKRKKKKKKKEKR 94
           K+ EE  KK   K  K KKKKK ++
Sbjct: 185 KQTEELIKKLLEKPEKPKKKKKRRK 209


>gnl|CDD|221673 pfam12626, PolyA_pol_arg_C, Polymerase A arginine-rich C-terminus. 
           The C-terminus of polymerase A in E. coli is
           arginine-rich and is necessary for full functioning of
           the enzyme.
          Length = 123

 Score = 28.3 bits (64), Expect = 1.3
 Identities = 8/24 (33%), Positives = 17/24 (70%)

Query: 67  AMEKEEEERKKKKKRKKRKKKKKK 90
           AM +  + R+  K+R++R +++KK
Sbjct: 100 AMIEALQGREGGKRRRRRPRRRKK 123



 Score = 26.0 bits (58), Expect = 6.6
 Identities = 6/21 (28%), Positives = 15/21 (71%)

Query: 71  EEEERKKKKKRKKRKKKKKKK 91
           E  + ++  KR++R+ +++KK
Sbjct: 103 EALQGREGGKRRRRRPRRRKK 123


>gnl|CDD|179580 PRK03449, PRK03449, putative inner membrane protein translocase
           component YidC; Provisional.
          Length = 304

 Score = 28.9 bits (65), Expect = 1.4
 Identities = 10/34 (29%), Positives = 18/34 (52%), Gaps = 8/34 (23%)

Query: 68  MEKEEEERKKKKKRKKR--------KKKKKKKEK 93
           ++KEEE +K+ K  ++         K K+ KK+ 
Sbjct: 270 IDKEEEAKKQAKAERRAANAPKPGAKPKRSKKKA 303


>gnl|CDD|218328 pfam04921, XAP5, XAP5, circadian clock regulator.  This protein is
           found in a wide range of eukaryotes. It is a nuclear
           protein and is suggested to be DNA binding. In plants,
           this family is essential for correct circadian clock
           functioning by acting as a light-quality regulator
           coordinating the activities of blue and red light
           signalling pathways during plant growth - inhibiting
           growth in red light but promoting growth in blue light.
          Length = 233

 Score = 28.9 bits (65), Expect = 1.4
 Identities = 17/59 (28%), Positives = 32/59 (54%), Gaps = 5/59 (8%)

Query: 63  SYILAMEKEEEERKKKKKRKKR--KKKKKKKEKRIPPRIIYVSG---AGKVEVKKGYSV 116
           S++    +EE+E + +++ ++   KK++  KE+ I     Y  G    G V VKKG ++
Sbjct: 59  SFLPDKAREEKEAELREELREEFLKKQEAVKEEEIEITFSYYDGTGHPGTVRVKKGDTI 117


>gnl|CDD|143166 cd00098, IgC, Immunoglobulin Constant domain.  IgC: Immunoglobulin
           constant domain (IgC). Members of the IgC family are
           components of immunoglobulin, T-cell receptors, CD1 cell
           surface glycoproteins, secretory glycoproteins A/C, and
           Major Histocompatibility Complex (MHC) class I/II
           molecules. In immunoglobulins, each chain is composed of
           one variable domain (IgV) and one or more IgC domains.
           These names reflect the fact that the variability in
           sequences is higher in the variable domain than in the
           constant domain. The IgV domain is responsible for
           antigen binding, and the IgC domain is involved in
           oligomerization and molecular interactions.
          Length = 95

 Score = 27.8 bits (62), Expect = 1.4
 Identities = 23/74 (31%), Positives = 31/74 (41%), Gaps = 15/74 (20%)

Query: 109 EVKKGYSVTLECKADG---NPVPNITWTRKNNNLPGGEY----------SYSGNS-LTVR 154
           E   G SVTL C A G     +  +TW +    L  G            +YS +S LTV 
Sbjct: 9   EELLGGSVTLTCLATGFYPPDI-TVTWLKNGKELTSGVTTTPPVPNSDGTYSVSSQLTVS 67

Query: 155 HTNRHSAGIYLCVA 168
            ++ +S   Y CV 
Sbjct: 68  PSDWNSGDTYTCVV 81


>gnl|CDD|143214 cd05737, Ig_Myomesin_like_C, C-terminal immunoglobulin (Ig)-like
           domain of myomesin and M-protein.  Ig_Myomesin_like_C:
           domain similar to the C-terminal immunoglobulin
           (Ig)-like domain of myomesin and M-protein. Myomesin and
           M-protein are both structural proteins localized to the
           M-band, a transverse structure in the center of the
           sarcomere, and are candidates for M-band bridges. Both
           proteins are modular, consisting mainly of repetitive
           Ig-like and fibronectin type III (FnIII) domains.
           Myomesin is expressed in all types of vertebrate
           striated muscle; M-protein has a muscle-type specific
           expression pattern. Myomesin is present in both slow and
           fast fibers; M-protein is present only in fast fibers.
           It has been suggested that myomesin acts as a molecular
           spring with alternative splicing as a means of modifying
           its elasticity.
          Length = 92

 Score = 27.9 bits (62), Expect = 1.4
 Identities = 18/73 (24%), Positives = 32/73 (43%), Gaps = 6/73 (8%)

Query: 108 VEVKKGYSVTLECKADGNPVPNITWTRKNNNL-PGGEYSYSGN-----SLTVRHTNRHSA 161
           V + +G ++ L C   G+P P ++W + +  L     Y+         SLT++  +   +
Sbjct: 11  VTIMEGKTLNLTCTVFGDPDPEVSWLKNDQALALSDHYNVKVEQGKYASLTIKGVSSEDS 70

Query: 162 GIYLCVANNMVGS 174
           G Y  V  N  G 
Sbjct: 71  GKYGIVVKNKYGG 83


>gnl|CDD|221857 pfam12923, RRP7, Ribosomal RNA-processing protein 7 (RRP7).  RRP7
           is an essential protein in yeast that is involved in
           pre-rRNA processing and ribosome assembly. It is
           speculated to be required for correct assembly of rpS27
           into the pre-ribosomal particle.
          Length = 131

 Score = 28.3 bits (64), Expect = 1.5
 Identities = 12/33 (36%), Positives = 20/33 (60%), Gaps = 6/33 (18%)

Query: 69  EKEEEERKKKKKRKKRKKKKK------KKEKRI 95
           +  EE RK K+K+KK+KK+ +       +EK+ 
Sbjct: 70  KAAEERRKLKEKKKKKKKELENFYRFQIREKKK 102


>gnl|CDD|220431 pfam09831, DUF2058, Uncharacterized protein conserved in bacteria
          (DUF2058).  This domain, found in various prokaryotic
          proteins, has no known function.
          Length = 177

 Score = 28.3 bits (64), Expect = 1.5
 Identities = 9/19 (47%), Positives = 15/19 (78%)

Query: 76 KKKKKRKKRKKKKKKKEKR 94
          KK KK KK K+K++K+ ++
Sbjct: 15 KKAKKAKKEKRKQRKQARK 33



 Score = 27.2 bits (61), Expect = 3.8
 Identities = 7/21 (33%), Positives = 14/21 (66%)

Query: 77 KKKKRKKRKKKKKKKEKRIPP 97
          KK K++KRK++K+ ++     
Sbjct: 18 KKAKKEKRKQRKQARKGADDG 38



 Score = 26.8 bits (60), Expect = 6.1
 Identities = 9/19 (47%), Positives = 16/19 (84%)

Query: 76 KKKKKRKKRKKKKKKKEKR 94
          KKK K+ K++K+K++K+ R
Sbjct: 14 KKKAKKAKKEKRKQRKQAR 32


>gnl|CDD|177060 CHL00138, rps5, ribosomal protein S5; Validated.
          Length = 143

 Score = 28.0 bits (63), Expect = 1.6
 Identities = 14/42 (33%), Positives = 19/42 (45%), Gaps = 1/42 (2%)

Query: 72  EEERKKKKKRKKRKKKKKKKEKRIPPRIIYVSGAGKVEVKKG 113
                KK   KK+ +K   KE +   R+I +    KV VK G
Sbjct: 1   MLFLLKKMYNKKKNRKSNIKENKWEERVIQIKRVSKV-VKGG 41


>gnl|CDD|240577 cd12950, RRP7_Rrp7p, RRP7 domain ribosomal RNA-processing protein
          7 (Rrp7p) and similar proteins.  This CD corresponds to
          the RRP7 domain of Rrp7p. Rrp7p is encoded by YCL031C
          gene from Saccharomyces cerevisiae. It is an essential
          yeast protein involved in pre-rRNA processing and
          ribosome assembly. Rrp7p contains an N-terminal RNA
          recognition motif (RRM), also termed RBD (RNA binding
          domain) or RNP (ribonucleoprotein domain), and a
          C-terminal RRP7 domain.
          Length = 128

 Score = 28.0 bits (63), Expect = 1.7
 Identities = 11/23 (47%), Positives = 17/23 (73%)

Query: 71 EEEERKKKKKRKKRKKKKKKKEK 93
          E  +  +++K++K KKKKKKKE 
Sbjct: 69 EAGKAAEEEKKEKEKKKKKKKEL 91



 Score = 26.1 bits (58), Expect = 7.6
 Identities = 12/23 (52%), Positives = 16/23 (69%)

Query: 71 EEEERKKKKKRKKRKKKKKKKEK 93
           EE  K  ++ KK K+KKKKK+K
Sbjct: 67 GEEAGKAAEEEKKEKEKKKKKKK 89



 Score = 26.1 bits (58), Expect = 7.9
 Identities = 11/22 (50%), Positives = 17/22 (77%)

Query: 69 EKEEEERKKKKKRKKRKKKKKK 90
           K  EE KK+K++KK+KKK+ +
Sbjct: 71 GKAAEEEKKEKEKKKKKKKELE 92



 Score = 25.7 bits (57), Expect = 9.4
 Identities = 15/37 (40%), Positives = 23/37 (62%), Gaps = 10/37 (27%)

Query: 67  AMEKEEEERKKKKKRKKRKKKK----------KKKEK 93
           A +  EEE+K+K+K+KK+KK+           KKKE+
Sbjct: 70  AGKAAEEEKKEKEKKKKKKKELEDFYRFQLREKKKEE 106


>gnl|CDD|173607 PTZ00417, PTZ00417, lysine-tRNA ligase; Provisional.
          Length = 585

 Score = 28.8 bits (64), Expect = 1.8
 Identities = 24/93 (25%), Positives = 43/93 (46%), Gaps = 19/93 (20%)

Query: 63  SYILAMEKEE---EERKKKKKRKKRKKKKKKKEKRIPPRIIYVSGAGKVEVKKGYSVTLE 119
            ++   EK+E   E  KK +  +  K KKK++E  + PR+ Y + +  ++ +K   +   
Sbjct: 45  CFVTMSEKKEHVMEGEKKVRSVQASKDKKKEEEAEVDPRLYYENRSKFIQEQKAKGI--- 101

Query: 120 CKADGNPVPN-----IT---WTRKNNNLPGGEY 144
                NP P+     IT   +  K  +L  GE+
Sbjct: 102 -----NPYPHKFERTITVPEFVEKYQDLASGEH 129


>gnl|CDD|143244 cd05767, IgC_MHC_II_alpha, Class II major histocompatibility
           complex (MHC) alpha chain immunoglobulin domain.
           IgC_MHC_II_alpha: Immunoglobulin (Ig) domain of major
           histocompatibility complex (MHC) class II alpha chain.
           MHC class II molecules play a key role in the initiation
           of the antigen-specific immune response. In both humans
           and in mice these molecules have been shown to be
           expressed constitutively on the cell surface of
           professional antigen-presenting cells (APCs), for
           example on B-lymphocytes, monocytes, and macrophages.
           The expression of these molecules has been shown to be
           induced in nonprofessional APCs such as keratinocytes,
           and they are expressed on the surface of activated human
           T cells and on T cells from other species. The MHC II
           molecules present antigenic peptides to CD4(+)
           T-lymphocytes. These peptides derive mostly from
           proteolytic processing via the endocytic pathway, of
           antigens internalized by the APC. These peptides bind to
           the MHC class II molecules in the endosome before they
           are transported to the cell surface. MHC class II
           molecules are heterodimers, comprised of two
           similarly-sized membrane-spanning chains, alpha and
           beta. Each chain has two globular domains (N- and
           C-terminal), and a membrane-anchoring transmembrane
           segment. The two chains form a compact four-domain
           structure. The peptide-binding site is a cleft in the
           structure.
          Length = 94

 Score = 27.3 bits (61), Expect = 1.8
 Identities = 14/45 (31%), Positives = 18/45 (40%), Gaps = 4/45 (8%)

Query: 110 VKKGYSVTLECKADG--NPVPNITWTRKNNNLPGG--EYSYSGNS 150
           V+ G   TL C  D    PV N+TW +    +  G  E  Y    
Sbjct: 12  VELGEPNTLICFVDNFFPPVLNVTWLKNGVPVTDGVSETRYYPRQ 56


>gnl|CDD|217286 pfam02919, Topoisom_I_N, Eukaryotic DNA topoisomerase I, DNA
           binding fragment.  Topoisomerase I promotes the
           relaxation of DNA superhelical tension by introducing a
           transient single-stranded break in duplex DNA and is
           vital for the processes of replication, transcription,
           and recombination. This family may be more than one
           structural domain.
          Length = 215

 Score = 28.3 bits (64), Expect = 1.8
 Identities = 11/22 (50%), Positives = 15/22 (68%)

Query: 72  EEERKKKKKRKKRKKKKKKKEK 93
           E E++KKK   K +KK  K+EK
Sbjct: 96  EAEKEKKKAMSKEEKKAIKEEK 117


>gnl|CDD|211392 cd11380, Ribosomal_S8e_like, Eukaryotic/archaeal ribosomal protein
           S8e and similar proteins.  This family contains the
           eukaryotic/archaeal ribosomal protein S8, a component of
           the small ribosomal subunits, as well as the NSA2 gene
           product.
          Length = 138

 Score = 27.9 bits (63), Expect = 1.8
 Identities = 9/40 (22%), Positives = 14/40 (35%)

Query: 75  RKKKKKRKKRKKKKKKKEKRIPPRIIYVSGAGKVEVKKGY 114
           RK    + K  +KK+K E    P    + G       +  
Sbjct: 7   RKATGGKFKVVRKKRKYELGRKPANTKLGGERFTRKVRVR 46


>gnl|CDD|215124 PLN02195, PLN02195, cellulose synthase A.
          Length = 977

 Score = 28.8 bits (64), Expect = 1.8
 Identities = 12/26 (46%), Positives = 17/26 (65%)

Query: 72  EEERKKKKKRKKRKKKKKKKEKRIPP 97
           E  + KK K+KK  KKK+  + +IPP
Sbjct: 119 ESWKDKKNKKKKSAKKKEAHKAQIPP 144


>gnl|CDD|178635 PLN03086, PLN03086, PRLI-interacting factor K; Provisional.
          Length = 567

 Score = 28.7 bits (64), Expect = 1.9
 Identities = 10/27 (37%), Positives = 19/27 (70%)

Query: 68 MEKEEEERKKKKKRKKRKKKKKKKEKR 94
          +E+E+ ERK++ K K  +++K K+E  
Sbjct: 12 LEREQRERKQRAKLKLERERKAKEEAA 38



 Score = 27.1 bits (60), Expect = 6.1
 Identities = 11/27 (40%), Positives = 20/27 (74%)

Query: 67 AMEKEEEERKKKKKRKKRKKKKKKKEK 93
          A EK E E++++K+R K K ++++K K
Sbjct: 8  AREKLEREQRERKQRAKLKLERERKAK 34


>gnl|CDD|220231 pfam09420, Nop16, Ribosome biogenesis protein Nop16.  Nop16 is a
          protein involved in ribosome biogenesis.
          Length = 173

 Score = 28.2 bits (63), Expect = 1.9
 Identities = 10/25 (40%), Positives = 13/25 (52%)

Query: 74 ERKKKKKRKKRKKKKKKKEKRIPPR 98
           RKKKK R    K  +K+ KR   +
Sbjct: 1  VRKKKKNRSSNYKVNRKRLKRKDRK 25


>gnl|CDD|143271 cd05863, Ig2_VEGFR-3, Second immunoglobulin (Ig)-like domain of
           vascular endothelial growth factor receptor 3 (VEGFR-3).
            Ig2_VEGFR-3: Second immunoglobulin (Ig)-like domain of
           vascular endothelial growth factor receptor 3 (VEGFR-3).
           The VEGFRs have an extracellular component with seven
           Ig-like domains, a transmembrane segment, and an
           intracellular tyrosine kinase domain interrupted by a
           kinase-insert domain. VEGFRs bind VEGFs with high
           affinity at the Ig-like domains. VEGFR-3 (Flt-4) binds
           two members of the VEGF family (VEGF-C and -D) and is
           involved in tumor angiogenesis and growth.
          Length = 67

 Score = 26.8 bits (59), Expect = 1.9
 Identities = 18/55 (32%), Positives = 23/55 (41%), Gaps = 3/55 (5%)

Query: 116 VTLECKADGNPVPNITWTRKNNNLPGGEYSYSGNSLTVRHTNRHSAGIYLCVANN 170
           V L  K    P P   W  K+  L  G++S    SL ++     SAG Y  V  N
Sbjct: 1   VKLPVKVAAYPPPEFQWY-KDGKLISGKHSQH--SLQIKDVTEASAGTYTLVLWN 52


>gnl|CDD|203848 pfam08079, Ribosomal_L30_N, Ribosomal L30 N-terminal domain.
          This presumed domain is found at the N-terminus of
          Ribosomal L30 proteins and has been termed RL30NT or
          NUC018.
          Length = 71

 Score = 26.8 bits (60), Expect = 2.0
 Identities = 9/25 (36%), Positives = 15/25 (60%)

Query: 70 KEEEERKKKKKRKKRKKKKKKKEKR 94
          + +  +K+  K+  RKKK+K   KR
Sbjct: 13 RAKRAKKRAAKKAARKKKRKLIFKR 37



 Score = 26.0 bits (58), Expect = 3.6
 Identities = 9/25 (36%), Positives = 16/25 (64%)

Query: 70 KEEEERKKKKKRKKRKKKKKKKEKR 94
          K  E+ + K+ +K+  KK  +K+KR
Sbjct: 7  KRNEKLRAKRAKKRAAKKAARKKKR 31



 Score = 24.9 bits (55), Expect = 8.6
 Identities = 11/26 (42%), Positives = 15/26 (57%)

Query: 70 KEEEERKKKKKRKKRKKKKKKKEKRI 95
           E+   K+ KKR  +K  +KKK K I
Sbjct: 9  NEKLRAKRAKKRAAKKAARKKKRKLI 34


>gnl|CDD|220611 pfam10169, Laps, Learning-associated protein.  This is a family of
           121-amino acid secretory proteins. Laps functions in the
           regulation of neuronal cell adhesion and/or movement and
           synapse attachment. Laps binds to the ApC/EBP (Aplysia
           CCAAT/enhancer binding protein) promoter and activates
           the transcription of ApC/EBP mRNA.
          Length = 124

 Score = 27.5 bits (61), Expect = 2.2
 Identities = 12/17 (70%), Positives = 15/17 (88%)

Query: 75  RKKKKKRKKRKKKKKKK 91
           R+ KK +KKR+KKKKKK
Sbjct: 101 RQAKKLKKKREKKKKKK 117


>gnl|CDD|236782 PRK10871, nlpD, lipoprotein NlpD; Provisional.
          Length = 319

 Score = 28.3 bits (63), Expect = 2.2
 Identities = 11/27 (40%), Positives = 17/27 (62%), Gaps = 2/27 (7%)

Query: 129 NITWTRKNNNLPGGEYSYSGNSLTVRH 155
            I + R+  N+P G  SYSG++ TV+ 
Sbjct: 43  RIVYNRQYGNIPKG--SYSGSTYTVKK 67


>gnl|CDD|218660 pfam05620, DUF788, Protein of unknown function (DUF788).  This
           family consists of several eukaryotic proteins of
           unknown function.
          Length = 166

 Score = 28.1 bits (63), Expect = 2.2
 Identities = 7/33 (21%), Positives = 16/33 (48%)

Query: 61  YLSYILAMEKEEEERKKKKKRKKRKKKKKKKEK 93
            L    ++   +     + K K+++K +K+ EK
Sbjct: 134 VLGPFASLPSSQGAETNETKSKRQEKLEKRGEK 166


>gnl|CDD|227935 COG5648, NHP6B, Chromatin-associated proteins containing the HMG
           domain [Chromatin structure and dynamics].
          Length = 211

 Score = 28.3 bits (63), Expect = 2.2
 Identities = 11/27 (40%), Positives = 15/27 (55%)

Query: 70  KEEEERKKKKKRKKRKKKKKKKEKRIP 96
            E +E KKKK   K KK K++ +   P
Sbjct: 183 SELDESKKKKYIDKYKKLKEEYDSFYP 209


>gnl|CDD|203444 pfam06424, PRP1_N, PRP1 splicing factor, N-terminal.  This domain
          is specific to the N-terminal part of the prp1 splicing
          factor, which is involved in mRNA splicing (and
          possibly also poly(A)+ RNA nuclear export and cell
          cycle progression). This domain is specific to the N
          terminus of the RNA splicing factor encoded by prp1. It
          is involved in mRNA splicing and possibly also
          poly(A)+ RNA nuclear export and cell cycle
          progression.
          Length = 131

 Score = 27.6 bits (62), Expect = 2.3
 Identities = 8/23 (34%), Positives = 17/23 (73%)

Query: 71 EEEERKKKKKRKKRKKKKKKKEK 93
          E  + +  ++RKKR+++K+K+E 
Sbjct: 68 ESIDERMDERRKKRREQKEKEEI 90



 Score = 26.9 bits (60), Expect = 4.7
 Identities = 10/44 (22%), Positives = 24/44 (54%), Gaps = 13/44 (29%)

Query: 69  EKEEEE-------------RKKKKKRKKRKKKKKKKEKRIPPRI 99
           + E+EE              ++KK+R++++K++ +K +   P+I
Sbjct: 57  DDEDEEADRIYESIDERMDERRKKRREQKEKEEIEKYREENPKI 100


>gnl|CDD|185611 PTZ00428, PTZ00428, 60S ribosomal protein L4; Provisional.
          Length = 381

 Score = 28.5 bits (64), Expect = 2.3
 Identities = 10/30 (33%), Positives = 16/30 (53%)

Query: 65  ILAMEKEEEERKKKKKRKKRKKKKKKKEKR 94
           +LA EK   +  +K K ++ +KK KK    
Sbjct: 340 VLAQEKATAKGAQKVKNRRARKKAKKARLA 369


>gnl|CDD|218941 pfam06217, GAGA_bind, GAGA binding protein-like family.  This
           family includes gbp a protein from Soybean that binds to
           GAGA element dinucleotide repeat DNA. It seems likely
           that this domain mediates DNA binding. This putative
           domain contains several conserved cysteines and a
           histidine suggesting this may be a zinc-binding DNA
           interaction domain.
          Length = 301

 Score = 28.3 bits (63), Expect = 2.4
 Identities = 10/24 (41%), Positives = 13/24 (54%)

Query: 71  EEEERKKKKKRKKRKKKKKKKEKR 94
           E +E KK KK +  K  K  K K+
Sbjct: 143 EAKEVKKPKKGQSPKVPKAPKPKK 166


>gnl|CDD|115071 pfam06390, NESP55, Neuroendocrine-specific golgi protein P55
           (NESP55).  This family consists of several mammalian
           neuroendocrine-specific golgi protein P55 (NESP55)
           sequences. NESP55 is a novel member of the chromogranin
           family and is a soluble, acidic, heat-stable secretory
           protein that is expressed exclusively in endocrine and
           nervous tissues, although less widely than
           chromogranins.
          Length = 261

 Score = 28.3 bits (62), Expect = 2.4
 Identities = 12/28 (42%), Positives = 20/28 (71%)

Query: 69  EKEEEERKKKKKRKKRKKKKKKKEKRIP 96
           E+EEEE++++K++  R K KK   +R P
Sbjct: 218 EEEEEEKEEEKQQPHRCKPKKPARRRDP 245


>gnl|CDD|236498 PRK09401, PRK09401, reverse gyrase; Reviewed.
          Length = 1176

 Score = 28.4 bits (64), Expect = 2.4
 Identities = 11/50 (22%), Positives = 19/50 (38%), Gaps = 4/50 (8%)

Query: 50  SYIILATSLLHYLSY----ILAMEKEEEERKKKKKRKKRKKKKKKKEKRI 95
           SYII  T LL             +     +        +KK+K++  +R+
Sbjct: 126 SYIIFPTRLLVEQVVEKLEKFGEKVGCGVKILYYHSSLKKKEKEEFLERL 175


>gnl|CDD|220245 pfam09444, MRC1, MRC1-like domain.  This putative domain is found
           to be the most conserved region in mediator of
           replication checkpoint protein 1.
          Length = 145

 Score = 27.7 bits (62), Expect = 2.4
 Identities = 6/24 (25%), Positives = 14/24 (58%)

Query: 69  EKEEEERKKKKKRKKRKKKKKKKE 92
           E E  +R++ K+R+    +++  E
Sbjct: 103 EDELLQRRRLKRRELALMRQRLLE 126



 Score = 25.8 bits (57), Expect = 9.7
 Identities = 3/23 (13%), Positives = 14/23 (60%)

Query: 69  EKEEEERKKKKKRKKRKKKKKKK 91
            ++E  ++++ KR++    +++ 
Sbjct: 102 SEDELLQRRRLKRRELALMRQRL 124


>gnl|CDD|217301 pfam02956, TT_ORF1, TT viral orf 1.  TT virus (TTV), isolated
           initially from a Japanese patient with hepatitis of
           unknown aetiology, has since been found to infect both
           healthy and diseased individuals and numerous prevalence
           studies have raised questions about its role in
           unexplained hepatitis. ORF1 is a large 750 residue
           protein. The N-terminal half of this protein corresponds
           to the capsid protein.
          Length = 525

 Score = 28.4 bits (64), Expect = 2.4
 Identities = 9/26 (34%), Positives = 11/26 (42%), Gaps = 3/26 (11%)

Query: 75  RKKKK---KRKKRKKKKKKKEKRIPP 97
            K K      K R K KK  + +I P
Sbjct: 188 AKHKILIPSLKTRPKGKKYVKIKIKP 213


>gnl|CDD|219947 pfam08639, SLD3, DNA replication regulator SLD3.  The SLD3 DNA
           replication regulator is required for loading and
           maintenance of Cdc45 on chromatin during DNA
           replication.
          Length = 437

 Score = 28.2 bits (63), Expect = 2.5
 Identities = 18/53 (33%), Positives = 27/53 (50%), Gaps = 8/53 (15%)

Query: 52  IILATSLLHYL-SYILAMEKEEEERKKK-------KKRKKRKKKKKKKEKRIP 96
           IIL   LL          EKE +++  K        KR+KR+K+KK K++ +P
Sbjct: 109 IILILELLALELDRFTKFEKEYKKKLLKRSQNLDRSKRRKRRKRKKNKKQDLP 161


>gnl|CDD|222636 pfam14265, DUF4355, Domain of unknown function (DUF4355).  This
          family of proteins is found in bacteria and viruses.
          Proteins in this family are typically between 180 and
          214 amino acids in length.
          Length = 125

 Score = 27.2 bits (61), Expect = 2.6
 Identities = 12/32 (37%), Positives = 19/32 (59%), Gaps = 2/32 (6%)

Query: 67 AMEKEEEERKKKKKRKKRKKKKKK--KEKRIP 96
           +EKE EE + +  R++ K + KK   EK +P
Sbjct: 53 KLEKELEELEAELARRELKAEAKKMLSEKGLP 84


>gnl|CDD|217889 pfam04092, SAG, SRS domain.  Toxoplasma gondii is a persistent
           protozoan parasite capable of infecting almost any
           warm-blooded vertebrate. The surface of Toxoplasma is
           coated with a family of developmentally regulated
           glycosylphosphatidylinositol (GPI)-linked proteins
           (SRSs), of which SAG1 is the prototypic member. SRS
           proteins mediate attachment to host cells and interface
           with the host immune response to regulate the virulence
           of the parasite. SAG1 is composed of two disulphide
           linked SRS domains. These have 6 cysteines that form
           1-6,2-5 and 3-4 pairings. The structure of the
           immunodominant SAG1 antigen reveals a homodimeric
           configuration. The SRS domain is found in a single copy
           in the SAG2 proteins. This family of surface antigens
           are found in other apicomplexans.
          Length = 126

 Score = 27.5 bits (61), Expect = 2.6
 Identities = 12/31 (38%), Positives = 16/31 (51%), Gaps = 3/31 (9%)

Query: 101 YVSGAGKVEV---KKGYSVTLECKADGNPVP 128
           Y S A    V   K+  ++TL+C  DG  VP
Sbjct: 8   YESNASPTTVTLSKESNTLTLKCGGDGTLVP 38


>gnl|CDD|223562 COG0488, Uup, ATPase components of ABC transporters with duplicated
           ATPase domains [General function prediction only].
          Length = 530

 Score = 28.4 bits (64), Expect = 2.6
 Identities = 7/27 (25%), Positives = 17/27 (62%)

Query: 69  EKEEEERKKKKKRKKRKKKKKKKEKRI 95
           +K E  R++    +K++K+  K+++ I
Sbjct: 241 QKAERLRQEAAAYEKQQKELAKEQEWI 267


>gnl|CDD|224415 COG1498, SIK1, Protein implicated in ribosomal biogenesis, Nop56p
           homolog [Translation, ribosomal structure and
           biogenesis].
          Length = 395

 Score = 28.1 bits (63), Expect = 2.7
 Identities = 10/25 (40%), Positives = 17/25 (68%)

Query: 70  KEEEERKKKKKRKKRKKKKKKKEKR 94
           K E ++K++  R +RKKK+KK +  
Sbjct: 365 KPERDKKERPGRYRRKKKEKKAKSE 389



 Score = 27.7 bits (62), Expect = 3.4
 Identities = 9/26 (34%), Positives = 19/26 (73%)

Query: 69  EKEEEERKKKKKRKKRKKKKKKKEKR 94
           E++++ER  + +RKK++KK K + + 
Sbjct: 367 ERDKKERPGRYRRKKKEKKAKSERRG 392



 Score = 26.6 bits (59), Expect = 9.6
 Identities = 10/25 (40%), Positives = 16/25 (64%)

Query: 70  KEEEERKKKKKRKKRKKKKKKKEKR 94
            + E  KK++  + R+KKK+KK K 
Sbjct: 364 AKPERDKKERPGRYRRKKKEKKAKS 388



 Score = 26.6 bits (59), Expect = 9.8
 Identities = 10/32 (31%), Positives = 16/32 (50%), Gaps = 5/32 (15%)

Query: 69  EKEEEERKKK-----KKRKKRKKKKKKKEKRI 95
           EK  +   K      KK +  + ++KKKEK+ 
Sbjct: 355 EKPPKPPTKAKPERDKKERPGRYRRKKKEKKA 386


>gnl|CDD|173412 PTZ00121, PTZ00121, MAEBL; Provisional.
          Length = 2084

 Score = 28.6 bits (63), Expect = 2.7
 Identities = 8/26 (30%), Positives = 13/26 (50%)

Query: 67   AMEKEEEERKKKKKRKKRKKKKKKKE 92
            A +K+ EE KK  +  K + +    E
Sbjct: 1333 AAKKKAEEAKKAAEAAKAEAEAAADE 1358



 Score = 27.8 bits (61), Expect = 5.0
 Identities = 8/28 (28%), Positives = 17/28 (60%)

Query: 68   MEKEEEERKKKKKRKKRKKKKKKKEKRI 95
             +K EE+ KK  +  K++ ++ KK + +
Sbjct: 1680 AKKAEEDEKKAAEALKKEAEEAKKAEEL 1707



 Score = 27.4 bits (60), Expect = 5.7
 Identities = 9/27 (33%), Positives = 19/27 (70%)

Query: 69   EKEEEERKKKKKRKKRKKKKKKKEKRI 95
            +K EE++KK ++ KK ++ +KK  + +
Sbjct: 1668 KKAEEDKKKAEEAKKAEEDEKKAAEAL 1694


>gnl|CDD|173965 cd08045, TAF4, TATA Binding Protein (TBP) Associated Factor 4
           (TAF4) is one of several TAFs that bind TBP and is
           involved in forming Transcription Factor IID (TFIID)
           complex.  The TATA Binding Protein (TBP) Associated
           Factor 4 (TAF4) is one of several TAFs that bind TBP and
           are involved in forming the Transcription Factor IID
           (TFIID) complex. TFIID is one of seven General
           Transcription Factors (GTF) (TFIIA, TFIIB, TFIID, TFIIE,
           TFIIF, and TFIIH) that are involved in accurate
           initiation of transcription by RNA polymerase II in
           eukaryote. TFIID plays an important role in the
           recognition of promoter DNA and assembly of the
           pre-initiation complex. TFIID complex is composed of the
           TBP and at least 13 TAFs. TAFs from various species were
           originally named by their predicted molecular weight or
           their electrophoretic mobility in polyacrylamide gels. A
           new, unified nomenclature for the pol II TAFs has been
           suggested to show the relationship between TAF orthologs
           and paralogs. Several hypotheses are proposed for TAFs
           functions such as serving as activator-binding sites,
           core-promoter recognition or a role in essential
           catalytic activity. Each TAF, with the help of a
           specific activator, is required only for the expression
           of a subset of genes and is not universally involved for
           transcription as are GTFs. In yeast and human cells,
           TAFs have been found as components of other complexes
           besides TFIID.   Several TAFs interact via histone-fold
           (HFD) motifs; HFD is the interaction motif involved in
           heterodimerization of the core histones and their
           assembly into nucleosome octamers. The minimal HFD
           contains three alpha-helices linked by two loops and is
           found in core histones, TAFS and many other
           transcription factors. TFIID has a histone octamer-like
           substructure. TAF4 domain interacts with TAF12 and makes
           a novel histone-like heterodimer that binds DNA and has
           a core promoter function of a subset of genes.
          Length = 212

 Score = 28.1 bits (63), Expect = 2.7
 Identities = 7/27 (25%), Positives = 20/27 (74%)

Query: 68  MEKEEEERKKKKKRKKRKKKKKKKEKR 94
           +E+EEEE++ +++R++  +  K + ++
Sbjct: 122 LEREEEEKRDEEERERLLRAAKSRSEQ 148


>gnl|CDD|202096 pfam02029, Caldesmon, Caldesmon. 
          Length = 431

 Score = 28.1 bits (62), Expect = 2.8
 Identities = 9/31 (29%), Positives = 22/31 (70%)

Query: 69  EKEEEERKKKKKRKKRKKKKKKKEKRIPPRI 99
             EEEE+++K++   RK +++++++R+   I
Sbjct: 229 VLEEEEQRRKQEEADRKSREEEEKRRLKEEI 259


>gnl|CDD|147699 pfam05688, DUF824, Salmonella repeat of unknown function (DUF824). 
           This family consists of several repeated sequences of
           around 45 residues.
          Length = 47

 Score = 25.7 bits (57), Expect = 2.8
 Identities = 14/34 (41%), Positives = 17/34 (50%), Gaps = 3/34 (8%)

Query: 103 SGAGKVEVKKGYSVTLECK---ADGNPVPNITWT 133
           S     + KKG S+ L      A GNPVPN  +T
Sbjct: 2   SDKNAAKAKKGESIPLTVTVKDAAGNPVPNAPFT 35


>gnl|CDD|233352 TIGR01310, L7, 60S ribosomal protein L7, eukaryotic.  This model
          describes the eukaryotic 60S (cytosolic) ribosomal
          protein L7 and paralogs that may or may not also be L7.
          Human, Drosophila, and Arabidopsis all have both a
          typical L7 and an L7-related paralog. This family is
          designated subfamily rather than equivalog to reflect
          these uncharacterized paralogs. Members of this family
          average ~ 250 residues in length, somewhat longer than
          the archaeal L30P/L7E homolog (~ 155 residues) and much
          longer than the related bacterial/organellar form (~ 60
          residues).
          Length = 235

 Score = 28.1 bits (63), Expect = 2.8
 Identities = 8/27 (29%), Positives = 14/27 (51%)

Query: 69 EKEEEERKKKKKRKKRKKKKKKKEKRI 95
          ++   +  K+ K KK+  KKK+K    
Sbjct: 11 QELAVQVAKQAKAKKKANKKKRKIYFK 37


>gnl|CDD|204935 pfam12474, PKK, Polo kinase kinase.  This domain family is found in
           eukaryotes, and is approximately 140 amino acids in
           length. The family is found in association with
           pfam00069. Polo-like kinase 1 (Plx1) is essential during
           mitosis for the activation of Cdc25C, for spindle
           assembly, and for cyclin B degradation. This family is
           Polo kinase kinase (PKK) which phosphorylates Polo
           kinase and Polo-like kinase to activate them. PKK is a
           serine/threonine kinase.
          Length = 142

 Score = 27.3 bits (61), Expect = 2.9
 Identities = 7/24 (29%), Positives = 17/24 (70%)

Query: 70  KEEEERKKKKKRKKRKKKKKKKEK 93
           K+E E+  + + +++K+ K +KE+
Sbjct: 80  KQEVEKLPRFQEQEKKRMKAEKEE 103


>gnl|CDD|218391 pfam05029, TIMELESS_C, Timeless protein C terminal region.  The
           timeless (tim) gene is essential for circadian function
           in Drosophila. Putative homologues of Drosophila tim
           have been identified in both mice and humans (mTim and
           hTIM, respectively). Mammalian TIM is not the true
           orthologue of Drosophila TIM, but is the likely
           orthologue of a fly gene, timeout (also called tim-2).
           mTim has been shown to be essential for embryonic
           development, but does not have substantiated circadian
           function. Some family members contain a SANT domain in
           this region.
          Length = 507

 Score = 28.1 bits (62), Expect = 2.9
 Identities = 8/29 (27%), Positives = 12/29 (41%)

Query: 67  AMEKEEEERKKKKKRKKRKKKKKKKEKRI 95
           A E E E    +K  ++RK      E+  
Sbjct: 398 ADESEHETLALRKNARQRKAGLASPEEEA 426


>gnl|CDD|217838 pfam04004, Leo1, Leo1-like protein.  Members of this family are
           part of the Paf1/RNA polymerase II complex. The Paf1
           complex probably functions during the elongation phase
           of transcription. The Leo1 subunit of the yeast
           Paf1-complex binds RNA and contributes to complex
           recruitment. The subunit acts by co-ordinating
           co-transcriptional chromatin modifications and helping
           recruitment of mRNA 3prime-end processing factors.
          Length = 312

 Score = 27.9 bits (62), Expect = 2.9
 Identities = 10/34 (29%), Positives = 19/34 (55%), Gaps = 1/34 (2%)

Query: 69  EKEEEER-KKKKKRKKRKKKKKKKEKRIPPRIIY 101
           EK+EE++ + +++R+ R+K K K   R       
Sbjct: 243 EKKEEQKLRARRRRQNREKMKNKPPNRPGHGSGS 276



 Score = 27.5 bits (61), Expect = 4.8
 Identities = 12/30 (40%), Positives = 22/30 (73%), Gaps = 1/30 (3%)

Query: 69  EKEEEERKKKKK-RKKRKKKKKKKEKRIPP 97
           EK E E+K+++K R +R+++ ++K K  PP
Sbjct: 238 EKREREKKEEQKLRARRRRQNREKMKNKPP 267


>gnl|CDD|218337 pfam04939, RRS1, Ribosome biogenesis regulatory protein (RRS1).
           This family consists of several eukaryotic ribosome
           biogenesis regulatory (RRS1) proteins. RRS1 is a nuclear
           protein that is essential for the maturation of 25 S
           rRNA and the 60 S ribosomal subunit assembly in
           Saccharomyces cerevisiae.
          Length = 164

 Score = 27.6 bits (62), Expect = 3.0
 Identities = 7/23 (30%), Positives = 17/23 (73%)

Query: 72  EEERKKKKKRKKRKKKKKKKEKR 94
            ++R++KK+R  + +K++ K K+
Sbjct: 142 AKKRREKKERVAKNEKRELKNKK 164



 Score = 26.1 bits (58), Expect = 8.2
 Identities = 6/23 (26%), Positives = 15/23 (65%)

Query: 69  EKEEEERKKKKKRKKRKKKKKKK 91
            K+  E+K++  + ++++ K KK
Sbjct: 142 AKKRREKKERVAKNEKRELKNKK 164


>gnl|CDD|176075 cd08693, C2_PI3K_class_I_beta_delta, C2 domain present in class I
           beta and delta phosphatidylinositol 3-kinases (PI3Ks).
           PI3Ks (AKA phosphatidylinositol (PtdIns) 3-kinases)
           regulate cell processes such as cell growth,
           differentiation, proliferation, and motility.  PI3Ks
           work on phosphorylation of phosphatidylinositol,
           phosphatidylinositide (4)P (PtdIns (4)P),2 or
           PtdIns(4,5)P2. Specifically they phosphorylate the D3
           hydroxyl group of phosphoinositol lipids on the inositol
           ring. There are 3 classes of PI3Ks based on structure,
           regulation, and specificity. All classes contain a C2
           domain, a PIK domain, and a kinase catalytic domain.
           The members here are class I, beta and delta isoforms of
           PI3Ks and contain both a Ras-binding domain and a
           p85-binding domain.  Class II PI3Ks contain both of
           these as well as a PX domain, and a C-terminal C2 domain
           containing a nuclear localization signal.  C2 domains
           fold into an 8-stranded beta-sandwich that can adopt 2
           structural arrangements: Type I and Type II,
           distinguished by a circular permutation involving their
           N- and C-terminal beta strands. Many C2 domains are
           Ca2+-dependent membrane-targeting modules that bind a
           wide variety of substances including phospholipids,
           inositol polyphosphates, and intracellular proteins.
           Most C2 domain proteins are either signal transduction
           enzymes that contain a single C2 domain, such as protein
           kinase C, or membrane trafficking proteins which contain
           at least two C2 domains, such as synaptotagmin 1.
           However, there are a few exceptions to this including
           RIM isoforms and some splice variants of piccolo/aczonin
           and intersectin which only have a single C2 domain.  C2
           domains with a calcium binding region have negatively
           charged residues, primarily aspartates, that serve as
           ligands for calcium ions.  Members have a type-I
           topology.
          Length = 173

 Score = 27.7 bits (62), Expect = 3.0
 Identities = 10/23 (43%), Positives = 15/23 (65%)

Query: 75  RKKKKKRKKRKKKKKKKEKRIPP 97
           +K K KR ++ + KKKK+K   P
Sbjct: 87  KKAKGKRSRKNQTKKKKKKDDNP 109


>gnl|CDD|192292 pfam09429, Wbp11, WW domain binding protein 11.  The WW domain is
          a small protein module with a triple-stranded
          beta-sheet fold. This is a family of WW domain binding
          proteins.
          Length = 78

 Score = 26.5 bits (59), Expect = 3.1
 Identities = 7/23 (30%), Positives = 17/23 (73%)

Query: 72 EEERKKKKKRKKRKKKKKKKEKR 94
          +  RK++KK++ +K K +++ +R
Sbjct: 8  DAYRKEQKKKELKKNKAERQARR 30


>gnl|CDD|143270 cd05862, Ig1_VEGFR, First immunoglobulin (Ig)-like domain of
           vascular endothelial growth factor (VEGF) receptor(R).
           IG1_VEGFR: first immunoglobulin (Ig)-like domain of
           vascular endothelial growth factor (VEGF) receptor(R).
           The VEGFRs have an extracellular component with seven
           Ig-like domains, a transmembrane segment, and an
           intracellular tyrosine kinase domain interrupted by a
           kinase-insert domain. The VEGFR family consists of three
           members, VEGFR-1 (Flt-1), VEGFR-2 (KDR/Flk-1) and
           VEGFR-3 (Flt-4). VEGF-A interacts with both VEGFR-1 and
           VEGFR-2. VEGFR-1 binds strongest to VEGF, VEGFR-2 binds
           more weakly. VEGFR-3 appears not to bind VEGF, but binds
           other members of the VEGF family (VEGF-C and -D). VEGFRs
           bind VEGFs with high affinity at the Ig-like domains.
           VEGF-A is important to the growth and maintenance of
           vascular endothelial cells and to the development of new
           blood- and lymphatic-vessels in physiological and
           pathological states. VEGFR-2 is a major mediator of the
           mitogenic, angiogenic and microvascular
           permeability-enhancing effects of VEGF-A. VEGFR-1 may
           play an inhibitory part in these processes by binding
           VEGF and interfering with its interaction with VEGFR-2.
           VEGFR-1 has a signaling role in mediating monocyte
           chemotaxis. VEGFR-2 and -1 may mediate a chemotactic and
           a survival signal in hematopoietic stem cells or
           leukemia cells. VEGFR-3 has been shown to be involved in
           tumor angiogenesis and growth.
          Length = 86

 Score = 26.7 bits (59), Expect = 3.2
 Identities = 6/22 (27%), Positives = 12/22 (54%)

Query: 149 NSLTVRHTNRHSAGIYLCVANN 170
           ++LT+ +      G Y C A++
Sbjct: 50  STLTIENVTLSDLGRYTCTASS 71


>gnl|CDD|226894 COG4499, COG4499, Predicted membrane protein [Function unknown].
          Length = 434

 Score = 27.8 bits (62), Expect = 3.2
 Identities = 10/25 (40%), Positives = 19/25 (76%)

Query: 70  KEEEERKKKKKRKKRKKKKKKKEKR 94
           K+EE  KK+K++    K+K++K++R
Sbjct: 408 KQEENEKKQKEQADEDKEKRQKDER 432



 Score = 27.8 bits (62), Expect = 3.4
 Identities = 8/28 (28%), Positives = 19/28 (67%)

Query: 66  LAMEKEEEERKKKKKRKKRKKKKKKKEK 93
           L  E+ E+++K++    K K++K +++K
Sbjct: 407 LKQEENEKKQKEQADEDKEKRQKDERKK 434



 Score = 26.7 bits (59), Expect = 7.6
 Identities = 9/24 (37%), Positives = 19/24 (79%)

Query: 71  EEEERKKKKKRKKRKKKKKKKEKR 94
           EE E+K+K++  + K+K++K E++
Sbjct: 410 EENEKKQKEQADEDKEKRQKDERK 433



 Score = 26.7 bits (59), Expect = 8.9
 Identities = 7/24 (29%), Positives = 18/24 (75%)

Query: 69  EKEEEERKKKKKRKKRKKKKKKKE 92
           + ++EE +KK+K +  + K+K+++
Sbjct: 406 KLKQEENEKKQKEQADEDKEKRQK 429


>gnl|CDD|227518 COG5191, COG5191, Uncharacterized conserved protein, contains HAT
           (Half-A-TPR) repeat [General function prediction only].
          Length = 435

 Score = 28.0 bits (62), Expect = 3.2
 Identities = 15/46 (32%), Positives = 22/46 (47%)

Query: 57  SLLHYLSYILAMEKEEEERKKKKKRKKRKKKKKKKEKRIPPRIIYV 102
            L  ++ YI      E+ R K+ KRKK  KK    +  IP + I+ 
Sbjct: 50  KLNDFMRYIKYECNLEKLRAKRVKRKKVGKKASFSDMSIPQKKIFE 95


>gnl|CDD|233042 TIGR00598, rad14, DNA repair protein.  All proteins in this family
           for which functions are known are used for the
           recognition of DNA damage as part of nucleotide excision
           repair. This family is based on the phylogenomic
           analysis of JA Eisen (1999, Ph.D. Thesis, Stanford
           University) [DNA metabolism, DNA replication,
           recombination, and repair].
          Length = 172

 Score = 27.4 bits (61), Expect = 3.2
 Identities = 11/30 (36%), Positives = 19/30 (63%)

Query: 69  EKEEEERKKKKKRKKRKKKKKKKEKRIPPR 98
           E++E   + K++ K++K +KK KE R   R
Sbjct: 98  EEKERREESKEEMKEKKFEKKLKELRRAVR 127


>gnl|CDD|217970 pfam04220, YihI, Der GTPase activator (YihI).  YihI activates the
          GTPase activity of Der, a 50S ribosomal subunit
          stability factor. The stimulation is specific to Der as
          YihI does not stimulate the GTPase activity of Era or
          ObgE. The interaction of YihI with Der requires only
          the C-terminal 78 amino acids of YihI. A yihI deletion
          mutant is viable and shows a shorter lag period, but
          the same post-lag growth rate as a wild-type strain.
          yihI is expressed during the lag period. Overexpression
          of yihI inhibits cell growth and biogenesis of the 50S
          ribosomal subunit. YihI is an unusual, highly
          hydrophilic protein with an uneven distribution of
          charged residues, resulting in an N-terminal region
          with high pI and a C-terminal region with low pI.
          Length = 169

 Score = 27.3 bits (61), Expect = 3.3
 Identities = 15/40 (37%), Positives = 19/40 (47%), Gaps = 15/40 (37%)

Query: 71 EEEERKKKKKRK---------------KRKKKKKKKEKRI 95
          E  ERK+KKKRK               K+K   +KK+ RI
Sbjct: 25 EARERKRKKKRKGLKSGSRHNEESESQKQKGAAQKKDPRI 64


>gnl|CDD|217203 pfam02724, CDC45, CDC45-like protein.  CDC45 is an essential gene
           required for initiation of DNA replication in S.
           cerevisiae, forming a complex with MCM5/CDC46.
           Homologues of CDC45 have been identified in human, mouse
           and smut fungus among others.
          Length = 583

 Score = 28.0 bits (63), Expect = 3.3
 Identities = 4/27 (14%), Positives = 17/27 (62%)

Query: 69  EKEEEERKKKKKRKKRKKKKKKKEKRI 95
           + +   R++  +R++R+++ ++K   +
Sbjct: 155 DDDIATRERSLERRRRRREWEEKRAEL 181


>gnl|CDD|143177 cd04976, Ig2_VEGFR, Second immunoglobulin (Ig)-like domain of
           vascular endothelial growth factor receptor (VEGFR).
           Ig2_VEGFR: Second immunoglobulin (Ig)-like domain of
           vascular endothelial growth factor receptor (VEGFR). The
           VEGFRs have an extracellular component with seven
           Ig-like domains, a transmembrane segment, and an
           intracellular tyrosine kinase domain interrupted by a
           kinase-insert domain. The VEGFR family consists of three
           members, VEGFR-1 (Flt-1), VEGFR-2 (KDR/Flk-1) and
           VEGFR-3 (Flt-4). VEGFRs bind VEGFs with high affinity at
           the Ig-like domains. VEGF-A is important to the growth
           and maintenance of vascular endothelial cells and to the
           development of new blood- and lymphatic-vessels in
           physiological and pathological states. VEGFR-2 is a
           major mediator of the mitogenic, angiogenic and
           microvascular permeability-enhancing effects of VEGF-A.
           VEGFR-1 may play an inhibitory part in these processes
           by binding VEGF and interfering with its interaction
           with VEGFR-2. VEGFR-1 has a signaling role in mediating
           monocyte chemotaxis. VEGFR-2 and -1 may mediate a
           chemotactic and a survival signal in hematopoietic stem
           cells or leukemia cells. VEGFR-3 has been shown to be
           involved in tumor angiogenesis and growth.
          Length = 71

 Score = 26.2 bits (58), Expect = 3.4
 Identities = 19/58 (32%), Positives = 23/58 (39%), Gaps = 5/58 (8%)

Query: 116 VTLECKADGNPVPNITWTRKNNNLPGGE---YSYSGNSLTVRHTNRHSAGIYLCVANN 170
           V L  K    P P I W +  N     E      SG+SLT++      AG Y  V  N
Sbjct: 1   VRLPVKVKAYPPPEIQWYK--NGKLISEKNRTKKSGHSLTIKDVTEEDAGNYTVVLTN 56


>gnl|CDD|146486 pfam03879, Cgr1, Cgr1 family.  Members of this family are
          coiled-coil proteins that are involved in pre-rRNA
          processing.
          Length = 105

 Score = 27.0 bits (60), Expect = 3.4
 Identities = 9/25 (36%), Positives = 18/25 (72%)

Query: 69 EKEEEERKKKKKRKKRKKKKKKKEK 93
          EKE E +++ +  K+R+  K++KE+
Sbjct: 53 EKEAERQRRIQAIKERRAAKEEKER 77



 Score = 25.8 bits (57), Expect = 7.5
 Identities = 10/26 (38%), Positives = 16/26 (61%)

Query: 69 EKEEEERKKKKKRKKRKKKKKKKEKR 94
          EKE  E+   K   K+ ++ K++EKR
Sbjct: 74 EKERYEKMAAKMHAKKVERLKRREKR 99


>gnl|CDD|218312 pfam04889, Cwf_Cwc_15, Cwf15/Cwc15 cell cycle control protein.
           This family represents Cwf15/Cwc15 (from
           Schizosaccharomyces pombe and Saccharomyces cerevisiae
           respectively) and their homologues. The function of
           these proteins is unknown, but they form part of the
           spliceosome and are thus thought to be involved in mRNA
           splicing.
          Length = 241

 Score = 27.8 bits (62), Expect = 3.5
 Identities = 8/26 (30%), Positives = 20/26 (76%)

Query: 69  EKEEEERKKKKKRKKRKKKKKKKEKR 94
           EK ++ER ++K+R++ +K  ++++ R
Sbjct: 152 EKIKKERAEEKEREEEEKAAEEEKAR 177


>gnl|CDD|218148 pfam04557, tRNA_synt_1c_R2, Glutaminyl-tRNA synthetase,
          non-specific RNA binding region part 2.  This is a
          region found N terminal to the catalytic domain of
          glutaminyl-tRNA synthetase (EC 6.1.1.18) in eukaryotes
          but not in Escherichia coli. This region is thought to
          bind RNA in a non-specific manner, enhancing
          interactions between the tRNA and enzyme, but is not
          essential for enzyme function.
          Length = 83

 Score = 26.6 bits (59), Expect = 3.5
 Identities = 8/18 (44%), Positives = 11/18 (61%)

Query: 76 KKKKKRKKRKKKKKKKEK 93
          KKKKK+KK+K +      
Sbjct: 24 KKKKKKKKKKAEDTAATA 41



 Score = 25.4 bits (56), Expect = 6.7
 Identities = 9/20 (45%), Positives = 12/20 (60%)

Query: 74 ERKKKKKRKKRKKKKKKKEK 93
          E    KK+KK+KKKK +   
Sbjct: 19 EADLVKKKKKKKKKKAEDTA 38



 Score = 25.4 bits (56), Expect = 7.1
 Identities = 10/24 (41%), Positives = 14/24 (58%)

Query: 70 KEEEERKKKKKRKKRKKKKKKKEK 93
          K E +  KKKK+KK+KK +     
Sbjct: 17 KTEADLVKKKKKKKKKKAEDTAAT 40



 Score = 25.4 bits (56), Expect = 9.1
 Identities = 8/22 (36%), Positives = 10/22 (45%)

Query: 76 KKKKKRKKRKKKKKKKEKRIPP 97
          KKKKK+K        K K+   
Sbjct: 27 KKKKKKKAEDTAATAKAKKATA 48


>gnl|CDD|149438 pfam08374, Protocadherin, Protocadherin.  The structure of
           protocadherins is similar to that of classic cadherins
           (pfam00028), but particularly on the cytoplasmic domains
           they also have some unique features. They are expressed
           in a variety of organisms and are found in high
           concentrations in the brain where they seem to be
           localised mainly at cell-cell contact sites. Their
           expression seems to be developmentally regulated.
          Length = 223

 Score = 27.5 bits (61), Expect = 3.6
 Identities = 14/44 (31%), Positives = 24/44 (54%), Gaps = 4/44 (9%)

Query: 67  AMEKEEEE----RKKKKKRKKRKKKKKKKEKRIPPRIIYVSGAG 106
           A +KE E+     ++ K++KK+K KKKK  K +    + V  + 
Sbjct: 74  AGKKETEDWFSPNQENKQKKKKKDKKKKSPKSLLLNFVTVEESK 117


>gnl|CDD|222581 pfam14181, YqfQ, YqfQ-like protein.  The YqfQ-like protein family
           includes the B. subtilis YqfQ protein, also known as
           VrrA, which is functionally uncharacterized. This family
           of proteins is found in bacteria. Proteins in this
           family are typically between 146 and 237 amino acids in
           length. There are two conserved sequence motifs: QYGP
           and PKLY.
          Length = 155

 Score = 27.4 bits (61), Expect = 3.7
 Identities = 10/28 (35%), Positives = 17/28 (60%)

Query: 69  EKEEEERKKKKKRKKRKKKKKKKEKRIP 96
           E + E ++KKK+   + K +K+K K  P
Sbjct: 116 ETKTESKEKKKREVPKPKTEKEKPKTEP 143


>gnl|CDD|200446 cd11290, gelsolin_S1_like, Gelsolin sub-domain 1-like domain
          found in gelsolin, severin, villin, and related
          proteins.  Gelsolin repeats occur in gelsolin, severin,
          villin, advillin, villidin, supervillin, flightless,
          quail, fragmin, and other proteins, usually in several
          copies. They co-occur with villin headpiece domains,
          leucine-rich repeats, and several other domains. These
          gelsolin-related actin binding proteins (GRABPs) play
          regulatory roles in the assembly and disassembly of
          actin filaments; they are involved in F-actin capping,
          uncapping, severing, or the nucleation of actin
          filaments. Severing of actin filaments is Ca2+
          dependent. Villins are also linked to generating
          bundles of F-actin with uniform filament polarity,
          which is most likely mediated by their extra villin
          headpiece domain. Many family members have also adopted
          functions in the nucleus, including the regulation of
          transcription. Supervillin, gelsolin, and flightless I
          are involved in intracellular signaling via nuclear
          hormone receptors. The gelsolin_like domain is
          distantly related to the actin depolymerizing domains
          found in cofilin and similar proteins.
          Length = 113

 Score = 26.8 bits (60), Expect = 3.7
 Identities = 13/40 (32%), Positives = 19/40 (47%), Gaps = 6/40 (15%)

Query: 42 FGKLITSCSYIILAT------SLLHYLSYILAMEKEEEER 75
          +GK     SYI+L T      SL + + Y L  E  ++E 
Sbjct: 28 YGKFYEGDSYIVLKTTLDPSGSLSYDIHYWLGKEASQDEA 67


>gnl|CDD|223683 COG0610, COG0610, Type I site-specific restriction-modification
           system, R (restriction) subunit and related helicases
           [Defense mechanisms].
          Length = 962

 Score = 27.8 bits (62), Expect = 3.8
 Identities = 11/70 (15%), Positives = 21/70 (30%), Gaps = 21/70 (30%)

Query: 26  LAIMFADSRPESTFWIFGKLITSCSYIILATSLLHYLSYILAMEKEEEERKKKKKRKKRK 85
           LA     +         G  +                      EK  +E+  ++K K   
Sbjct: 905 LAFYDDLALNGGKLPENGTELV---------------------EKLAKEKSLREKNKDDW 943

Query: 86  KKKKKKEKRI 95
           K K++ E ++
Sbjct: 944 KAKEEVEAKL 953


>gnl|CDD|237276 PRK13024, PRK13024, bifunctional preprotein translocase subunit
           SecD/SecF; Reviewed.
          Length = 755

 Score = 27.9 bits (63), Expect = 3.8
 Identities = 24/83 (28%), Positives = 38/83 (45%), Gaps = 25/83 (30%)

Query: 13  IFGKPITRIMFADLAIMFADSRPESTFWIFGKLITSCSYIILATSLLHYLSYILAMEKEE 72
           IFG    R  F+ LA+            + G ++ + S I +A  L   L          
Sbjct: 697 IFGGSSLRN-FS-LAL------------LVGLIVGTYSSIFIAAPLWLDL---------- 732

Query: 73  EERKKKKKRKKRKKKKKKKEKRI 95
            E+++ KK+KKRKK KK + ++I
Sbjct: 733 -EKRRLKKKKKRKKVKKWEVEKI 754


>gnl|CDD|215239 PLN02436, PLN02436, cellulose synthase A.
          Length = 1094

 Score = 27.9 bits (62), Expect = 3.9
 Identities = 13/24 (54%), Positives = 18/24 (75%)

Query: 76  KKKKKRKKRKKKKKKKEKRIPPRI 99
           +KKKK+KK K+KKKKK +    +I
Sbjct: 684 RKKKKKKKSKEKKKKKNREASKQI 707


>gnl|CDD|201249 pfam00472, RF-1, RF-1 domain.  This domain is found in peptide
          chain release factors such as RF-1 and RF-2, and a
          number of smaller proteins of unknown function. This
          domain contains the peptidyl-tRNA hydrolase activity.
          The domain contains a highly conserved motif GGQ, where
          the glutamine is thought to coordinate the water that
          mediates the hydrolysis.
          Length = 114

 Score = 26.8 bits (60), Expect = 3.9
 Identities = 7/22 (31%), Positives = 13/22 (59%)

Query: 74 ERKKKKKRKKRKKKKKKKEKRI 95
          E + +KKR+K K  +  + +R 
Sbjct: 73 EAELQKKREKTKPTRASQVRRG 94



 Score = 25.6 bits (57), Expect = 9.1
 Identities = 7/29 (24%), Positives = 15/29 (51%)

Query: 65 ILAMEKEEEERKKKKKRKKRKKKKKKKEK 93
          +   E +++  K K  R  + ++  +KEK
Sbjct: 71 LYEAELQKKREKTKPTRASQVRRGDRKEK 99


>gnl|CDD|218896 pfam06098, Radial_spoke_3, Radial spoke protein 3.  This family
           consists of several radial spoke protein 3 (RSP3)
           sequences. Eukaryotic cilia and flagella present in
           diverse types of cells perform motile, sensory, and
           developmental functions in organisms from protists to
           humans. They are centred by precisely organised,
           microtubule-based structures, the axonemes. The axoneme
           consists of two central singlet microtubules, called the
           central pair, and nine outer doublet microtubules. These
           structures are well-conserved during evolution. The
           outer doublet microtubules, each composed of A and B
           sub-fibres, are connected to each other by nexin links,
           while the central pair is held at the centre of the
           axoneme by radial spokes. The radial spokes are T-shaped
           structures extending from the A-tubule of each outer
           doublet microtubule to the centre of the axoneme. Radial
           spoke protein 3 (RSP3), is present at the proximal end
           of the spoke stalk and helps in anchoring the radial
           spoke to the outer doublet. It is thought that radial
           spokes regulate the activity of inner arm dynein through
           protein phosphorylation and dephosphorylation.
          Length = 288

 Score = 27.7 bits (62), Expect = 3.9
 Identities = 9/24 (37%), Positives = 22/24 (91%)

Query: 71  EEEERKKKKKRKKRKKKKKKKEKR 94
           EE ER++++++++RKK+ K++++R
Sbjct: 179 EEAERRRREEKERRKKQDKERKQR 202



 Score = 26.6 bits (59), Expect = 8.0
 Identities = 9/24 (37%), Positives = 21/24 (87%)

Query: 70  KEEEERKKKKKRKKRKKKKKKKEK 93
           +EE+ER+KK+ +++++++K+  EK
Sbjct: 186 REEKERRKKQDKERKQREKETAEK 209


>gnl|CDD|227519 COG5192, BMS1, GTP-binding protein required for 40S ribosome
            biogenesis [Translation, ribosomal structure and
            biogenesis].
          Length = 1077

 Score = 27.8 bits (61), Expect = 3.9
 Identities = 15/45 (33%), Positives = 24/45 (53%), Gaps = 1/45 (2%)

Query: 69   EKEEEERKKKKKRKKRKKKKKKKEKRIPPRII-YVSGAGKVEVKK 112
            EKE  E  ++ K ++  KK+K++E+RI   I        K  +KK
Sbjct: 1031 EKERMESLQRAKEEEIGKKEKEREQRIRKTIHDNYKEMAKKRLKK 1075



 Score = 26.6 bits (58), Expect = 9.1
 Identities = 10/27 (37%), Positives = 15/27 (55%)

Query: 68   MEKEEEERKKKKKRKKRKKKKKKKEKR 94
             EKE E+R +K      K+  KK+ K+
Sbjct: 1049 KEKEREQRIRKTIHDNYKEMAKKRLKK 1075


>gnl|CDD|225816 COG3277, GAR1, RNA-binding protein involved in rRNA processing
          [Translation, ribosomal structure and biogenesis].
          Length = 98

 Score = 26.6 bits (59), Expect = 4.0
 Identities = 7/22 (31%), Positives = 15/22 (68%)

Query: 73 EERKKKKKRKKRKKKKKKKEKR 94
           ++  +KKRK  +KK++ K+ +
Sbjct: 77 PDKLIRKKRKLPRKKRRPKKPK 98


>gnl|CDD|225651 COG3109, ProQ, Activator of osmoprotectant transporter ProP [Signal
           transduction mechanisms].
          Length = 208

 Score = 27.5 bits (61), Expect = 4.1
 Identities = 6/43 (13%), Positives = 18/43 (41%)

Query: 70  KEEEERKKKKKRKKRKKKKKKKEKRIPPRIIYVSGAGKVEVKK 112
           K+  E K + + ++ +++ KK+E+         +       + 
Sbjct: 101 KQLAEAKARVQAQRAEQQAKKREEAPAAGEKPTAERPATAARP 143


>gnl|CDD|221185 pfam11719, Drc1-Sld2, DNA replication and checkpoint protein.
           Genome duplication is precisely regulated by
           cyclin-dependent kinases (CDKs), which bring about the
           onset of S phase by activating replication origins and
           then prevent relicensing of origins until mitosis is
           completed. The optimum sequence motif for CDK
           phosphorylation is S/T-P-K/R-K/R, and Drc1-Sld2 is found
           to have at least 11 potential phosphorylation sites.
           Drc1 is required for DNA synthesis and S-M replication
           checkpoint control. Drc1 associates with Cdc2 and is
           phosphorylated at the onset of S phase when Cdc2 is
           activated. Thus Cdc2 promotes DNA replication by
           phosphorylating Drc1 and regulating its association with
           Cut5. Sld2 and Sld3 represent the minimal set of S-CDK
           substrates required for DNA replication.
          Length = 397

 Score = 27.5 bits (61), Expect = 4.2
 Identities = 11/21 (52%), Positives = 15/21 (71%)

Query: 69  EKEEEERKKKKKRKKRKKKKK 89
            KEE E+K+K K+K RK+K  
Sbjct: 350 SKEEVEKKQKVKKKPRKRKVN 370


>gnl|CDD|216652 pfam01698, FLO_LFY, Floricaula / Leafy protein.  This family
           consists of various plant development proteins which are
           homologues of floricaula (FLO) and Leafy (LFY) proteins
           which are floral meristem identity proteins. Mutations
           in the sequences of these proteins affect flower and
           leaf development.
          Length = 382

 Score = 27.7 bits (62), Expect = 4.3
 Identities = 9/29 (31%), Positives = 17/29 (58%)

Query: 66  LAMEKEEEERKKKKKRKKRKKKKKKKEKR 94
           L          +KKK++K++++K+ KE R
Sbjct: 171 LVGVPGHSSDSEKKKQRKKQRRKRSKELR 199


>gnl|CDD|220818 pfam10595, UPF0564, Uncharacterized protein family UPF0564.  This
           family of proteins has no known function. However, one
           of the members is annotated as an EF-hand family
           protein.
          Length = 349

 Score = 27.4 bits (61), Expect = 4.3
 Identities = 15/45 (33%), Positives = 22/45 (48%)

Query: 55  ATSLLHYLSYILAMEKEEEERKKKKKRKKRKKKKKKKEKRIPPRI 99
           A  LL        M K E + +  KK+K+ +KK K K+K   P+ 
Sbjct: 154 AQELLQSSRLPPRMAKHEAQERLTKKKKRGQKKSKYKKKTFKPKR 198


>gnl|CDD|220413 pfam09805, Nop25, Nucleolar protein 12 (25kDa).  Members of this
          family of proteins are part of the yeast nuclear pore
          complex-associated pre-60S ribosomal subunit. The
          family functions as a highly conserved exonuclease that
          is required for the 5'-end maturation of 5.8S and 25S
          rRNAs, demonstrating that 5'-end processing also has a
          redundant pathway. Nop25 binds late pre-60S ribosomes,
          accompanying them from the nucleolus to the nuclear
          periphery; and there is evidence for both physical and
          functional links between late 60S subunit processing
          and export.
          Length = 134

 Score = 26.9 bits (60), Expect = 4.4
 Identities = 9/24 (37%), Positives = 19/24 (79%), Gaps = 3/24 (12%)

Query: 75 RKKKKKRKKRKK---KKKKKEKRI 95
           K+K++R+K+ +   K+K++E+RI
Sbjct: 28 HKRKQQRRKKAQEEAKEKEREERI 51


>gnl|CDD|235370 PRK05244, PRK05244, Der GTPase activator; Provisional.
          Length = 177

 Score = 27.2 bits (61), Expect = 4.4
 Identities = 14/40 (35%), Positives = 18/40 (45%), Gaps = 15/40 (37%)

Query: 71 EEEERKKKKKRK---------------KRKKKKKKKEKRI 95
          E  ERK+KKK K               K K + +KK+ RI
Sbjct: 27 EARERKRKKKHKGLKSGSRHNEGNTQSKGKGQAQKKDPRI 66


>gnl|CDD|218177 pfam04615, Utp14, Utp14 protein.  This protein is found to be part
           of a large ribonucleoprotein complex containing the U3
           snoRNA. Depletion of the Utp proteins impedes production
           of the 18S rRNA, indicating that they are part of the
           active pre-rRNA processing complex. This large RNP
           complex has been termed the small subunit (SSU)
           processome.
          Length = 728

 Score = 27.7 bits (62), Expect = 4.5
 Identities = 7/25 (28%), Positives = 16/25 (64%)

Query: 74  ERKKKKKRKKRKKKKKKKEKRIPPR 98
           +++KKK+++KR+   K +  +   R
Sbjct: 627 KKRKKKRKRKRRFLTKIEGVKKEKR 651



 Score = 27.7 bits (62), Expect = 4.7
 Identities = 10/25 (40%), Positives = 18/25 (72%)

Query: 70  KEEEERKKKKKRKKRKKKKKKKEKR 94
           K+ ++++K+K+R   K +  KKEKR
Sbjct: 627 KKRKKKRKRKRRFLTKIEGVKKEKR 651


>gnl|CDD|217503 pfam03344, Daxx, Daxx Family.  The Daxx protein (also known as the
           Fas-binding protein) is thought to play a role in
           apoptosis, but the precise role played by Daxx remains to be
           determined. Daxx forms a complex with Axin.
          Length = 715

 Score = 27.6 bits (61), Expect = 4.6
 Identities = 7/24 (29%), Positives = 17/24 (70%)

Query: 69  EKEEEERKKKKKRKKRKKKKKKKE 92
           + EEEER+K+++R+++    +  +
Sbjct: 397 DTEEEERRKRQERERQGTSSRSSD 420


>gnl|CDD|223880 COG0810, TonB, Periplasmic protein TonB, links inner and outer
           membranes [Cell envelope biogenesis, outer membrane].
          Length = 244

 Score = 27.4 bits (61), Expect = 4.8
 Identities = 12/29 (41%), Positives = 15/29 (51%)

Query: 68  MEKEEEERKKKKKRKKRKKKKKKKEKRIP 96
             K E++ KK K + K K K K K K  P
Sbjct: 87  KPKPEKKPKKPKPKPKPKPKPKPKVKPQP 115



 Score = 26.7 bits (59), Expect = 8.3
 Identities = 13/31 (41%), Positives = 16/31 (51%)

Query: 68  MEKEEEERKKKKKRKKRKKKKKKKEKRIPPR 98
            EK + E+K KK + K K K K K K  P  
Sbjct: 85  KEKPKPEKKPKKPKPKPKPKPKPKPKVKPQP 115


>gnl|CDD|241121 cd12677, RRM4_Nop4p, RNA recognition motif 4 in yeast nucleolar
          protein 4 (Nop4p) and similar proteins.  This subgroup
          corresponds to the RRM4 of Nop4p (also known as
          Nop77p), encoded by YPL043W from Saccharomyces
          cerevisiae. It is an essential nucleolar protein
          involved in processing and maturation of 27S pre-rRNA
          and biogenesis of 60S ribosomal subunits. Nop4p has
          four RNA recognition motifs (RRMs), also termed RBDs
          (RNA binding domains) or RNPs (ribonucleoprotein
          domains).
          Length = 156

 Score = 26.9 bits (59), Expect = 5.0
 Identities = 10/24 (41%), Positives = 14/24 (58%)

Query: 68 MEKEEEERKKKKKRKKRKKKKKKK 91
          + KEEE R K  + K+ + KK K 
Sbjct: 44 LSKEEENRDKGHRYKEAQLKKGKS 67


>gnl|CDD|221712 pfam12687, DUF3801, Protein of unknown function (DUF3801).  This
           family of proteins is functionally uncharacterized. This
           family of proteins is found in bacteria. Proteins in
           this family are typically between 158 and 187 amino
           acids in length. This family includes the PcfB protein.
          Length = 137

 Score = 26.8 bits (60), Expect = 5.1
 Identities = 12/23 (52%), Positives = 16/23 (69%)

Query: 69  EKEEEERKKKKKRKKRKKKKKKK 91
           +K +EE+  KKK +K K K KKK
Sbjct: 115 KKFKEEQAAKKKERKDKVKNKKK 137


>gnl|CDD|240253 PTZ00068, PTZ00068, 60S ribosomal protein L13a; Provisional.
          Length = 202

 Score = 26.9 bits (60), Expect = 5.6
 Identities = 9/26 (34%), Positives = 15/26 (57%)

Query: 69  EKEEEERKKKKKRKKRKKKKKKKEKR 94
            K EE+RK++     +KK K +K  +
Sbjct: 153 AKLEEKRKERAAAYYKKKVKLRKAWK 178


>gnl|CDD|241486 cd13332, FERM_C_JAK1, Janus kinase 1 FERM domain C-lobe.  JAK1 is a
           tyrosine kinase protein essential in signaling type I
           and type II cytokines. It interacts with the gamma chain
           of type I cytokine receptors to elicit signals from the
           IL-2 receptor family, the IL-4 receptor family, the
           gp130 receptor family, ciliary neurotrophic factor
           receptor (CNTF-R), neurotrophin-1 receptor (NNT-1R) and
           Leptin-R. It is also involved in transducing a signal
           by type I (IFN-alpha/beta) and type II (IFN-gamma)
           interferons, and members of the IL-10 family via type II
           cytokine receptors. JAK (also called Just Another
           Kinase) is a family of intracellular, non-receptor
           tyrosine kinases that transduce cytokine-mediated
           signals via the JAK-STAT pathway. The JAK family in
           mammals consists of 4 members: JAK1, JAK2, JAK3 and
           TYK2. JAKs are composed of seven JAK homology (JH)
           domains (JH1-JH7). The C-terminal JH1 domain is the
           main catalytic domain, followed by JH2, which is often
           referred to as a pseudokinase domain, followed by
           JH3-JH4 which is homologous to the SH2 domain, and
           lastly JH5-JH7 which is a FERM domain.  Named after
           Janus, the two-faced Roman god of doorways, JAKs possess
           two near-identical phosphate-transferring domains; one
           which displays the kinase activity (JH1), while the
           other negatively regulates the kinase activity of the
           first (JH2). The FERM domain has a cloverleaf tripart
           structure (FERM_N, FERM_M, FERM_C/N, alpha-, and
           C-lobe/A-lobe,A-lobe, B-lobe, C-lobe/F1, F2, F3). The
           C-lobe/F3 within the FERM domain is part of the PH
           domain family. The FERM domain is found in the
           cytoskeletal-associated proteins such as ezrin, moesin,
           radixin, 4.1R, and merlin. These proteins provide a link
           between the membrane and cytoskeleton and are involved
           in signal transduction pathways. The FERM domain is also
           found in protein tyrosine phosphatases (PTPs), the
           tyrosine kinases FAK and JAK, in addition to other
           proteins involved in signaling. This domain is
           structurally similar to the PH and PTB domains and
           consequently is capable of binding to both peptides and
           phospholipids at different sites.
          Length = 198

 Score = 27.1 bits (60), Expect = 5.6
 Identities = 12/21 (57%), Positives = 14/21 (66%)

Query: 74  ERKKKKKRKKRKKKKKKKEKR 94
           E+KKK K KK K K KK E +
Sbjct: 90  EKKKKGKSKKNKLKGKKDEDK 110



 Score = 26.7 bits (59), Expect = 6.3
 Identities = 11/27 (40%), Positives = 19/27 (70%)

Query: 67  AMEKEEEERKKKKKRKKRKKKKKKKEK 93
           A+EK+++ + KK K K +K + KKK +
Sbjct: 88  AVEKKKKGKSKKNKLKGKKDEDKKKAR 114


>gnl|CDD|239572 cd03490, Topoisomer_IB_N_1, Topoisomer_IB_N_1: A subgroup of the
           N-terminal DNA binding fragment found in eukaryotic DNA
           topoisomerase (topo) IB. Topo IB proteins include the
           monomeric yeast and human topo I and heterodimeric topo
           I from Leishmania donovani. Topo I enzymes are divided
           into:  topo type IA (bacterial) and type IB
           (eukaryotic). Topo I relaxes superhelical tension in
           duplex DNA by creating a single-strand nick; the broken
           strand can then rotate around the unbroken strand to
           remove DNA supercoils, and the nick is religated,
           liberating topo I. These enzymes regulate the
           topological changes that accompany DNA replication,
           transcription and other nuclear processes.  Human topo I
           is the target of a diverse set of anticancer drugs
           including camptothecins (CPTs). CPTs bind to the topo
           I-DNA complex and inhibit religation of the
           single-strand nick, resulting in the accumulation of
           topo I-DNA adducts.  In addition to differences in
           structure and some biochemical properties,
           Trypanosomatid parasite topos I differ from human topo I
           in their sensitivity to CPTs and other classical topo I
           inhibitors. Trypanosomatid topos I have putative roles
           in organizing the kinetoplast DNA network unique to
           these parasites.  This family may represent more than
           one structural domain.
          Length = 217

 Score = 27.2 bits (60), Expect = 5.6
 Identities = 10/29 (34%), Positives = 20/29 (68%)

Query: 68  MEKEEEERKKKKKRKKRKKKKKKKEKRIP 96
           +E+E+E++K   K +K  KKK++ ++  P
Sbjct: 93  LEEEKEKKKNLNKEEKEAKKKERAKREYP 121


>gnl|CDD|226520 COG4033, COG4033, Uncharacterized protein conserved in archaea
           [Function unknown].
          Length = 102

 Score = 26.3 bits (58), Expect = 5.7
 Identities = 10/30 (33%), Positives = 19/30 (63%)

Query: 61  YLSYILAMEKEEEERKKKKKRKKRKKKKKK 90
           Y +++L ++ E EE +K  ++ K+ K  KK
Sbjct: 71  YGTFVLKIKVEAEEIEKLLEKYKKDKNVKK 100


>gnl|CDD|227674 COG5384, Mpp10, U3 small nucleolar ribonucleoprotein component
           [Translation, ribosomal structure and biogenesis].
          Length = 569

 Score = 27.3 bits (60), Expect = 5.7
 Identities = 10/31 (32%), Positives = 19/31 (61%), Gaps = 1/31 (3%)

Query: 66  LAMEKEEEERKKKKKRKKRKKKKKKKEKRIP 96
           +AM KEE  R+ K + ++   K+K+ +  +P
Sbjct: 490 VAMSKEELTREDKNRLRRA-LKRKRSKANLP 519


>gnl|CDD|219882 pfam08524, rRNA_processing, rRNA processing.  This is a family of
          proteins that are involved in rRNA processing. In a
          localisation study they were found to localise to the
          nucleus and nucleolus. The family also includes other
          metazoa members from plants to mammals where the
          protein has been named BR22 and is associated with
          TTF-1, thyroid transcription factor 1. In the lungs,
          the family binds TTF-1 to form a complex which
          influences the expression of the key lung surfactant
          protein-B (SP-B) and -C (SP-C), the small hydrophobic
          surfactant proteins that maintain surface tension in
          alveoli.
          Length = 150

 Score = 26.8 bits (59), Expect = 5.7
 Identities = 9/24 (37%), Positives = 19/24 (79%)

Query: 71 EEEERKKKKKRKKRKKKKKKKEKR 94
          E++E  K++KR++R+K+  K++K 
Sbjct: 76 EKKEIAKQRKREQREKELAKRQKE 99



 Score = 26.4 bits (58), Expect = 8.1
 Identities = 14/30 (46%), Positives = 21/30 (70%)

Query: 68  MEKEEEERKKKKKRKKRKKKKKKKEKRIPP 97
           +EK E  +KK+K+R++R+KK  KK K   P
Sbjct: 100 LEKIELSKKKQKERERRRKKLTKKTKSGQP 129


>gnl|CDD|206228 pfam14058, PcfK, PcfK-like protein.  The PcfK-like protein family
           includes the Enterococcus faecalis PcfK protein, which
           is functionally uncharacterized. This family of proteins
           is found in bacteria and viruses. Proteins in this
           family are typically between 137 and 257 amino acids in
           length. There are two completely conserved residues (D
           and L) that may be functionally important.
          Length = 136

 Score = 26.6 bits (59), Expect = 5.7
 Identities = 8/27 (29%), Positives = 17/27 (62%)

Query: 69  EKEEEERKKKKKRKKRKKKKKKKEKRI 95
             ++EE +K +KR K+ KK +  + ++
Sbjct: 105 AYQQEELRKIQKRSKKSKKAEPVQGQL 131


>gnl|CDD|221550 pfam12366, Casc1, Cancer susceptibility candidate 1.  This domain
           family is found in eukaryotes, and is typically between
           216 and 263 amino acids in length. Casc1 has many SNPs
           associated with cancer susceptibility.
          Length = 227

 Score = 27.0 bits (60), Expect = 5.9
 Identities = 14/51 (27%), Positives = 19/51 (37%), Gaps = 4/51 (7%)

Query: 53  ILATSLLHYLSYILAMEKEEEE----RKKKKKRKKRKKKKKKKEKRIPPRI 99
           I+ +    ++ Y L + K  EE    R       K  K K  K   IP  I
Sbjct: 53  IIFSMDTFHVRYELYISKIAEEAEGARGYVTDIPKEYKAKPVKYLEIPKPI 103


>gnl|CDD|237028 PRK12267, PRK12267, methionyl-tRNA synthetase; Reviewed.
          Length = 648

 Score = 27.1 bits (61), Expect = 5.9
 Identities = 8/32 (25%), Positives = 18/32 (56%)

Query: 68  MEKEEEERKKKKKRKKRKKKKKKKEKRIPPRI 99
           +E+E    K++ +    K+ ++K++K   P I
Sbjct: 512 VEEEIAYIKEQMEGSAPKEPEEKEKKPEKPEI 543


>gnl|CDD|240271 PTZ00108, PTZ00108, DNA topoisomerase 2-like protein; Provisional.
          Length = 1388

 Score = 27.3 bits (61), Expect = 5.9
 Identities = 10/34 (29%), Positives = 17/34 (50%)

Query: 65   ILAMEKEEEERKKKKKRKKRKKKKKKKEKRIPPR 98
            +   E  +E+R K K + K  K +K K K+   +
Sbjct: 1147 VEEKEIAKEQRLKSKTKGKASKLRKPKLKKKEKK 1180


>gnl|CDD|221177 pfam11708, Slu7, Pre-mRNA splicing Prp18-interacting factor.  The
           spliceosome, an assembly of snRNAs (U1, U2, U4/U6, and
           U5) and proteins, catalyzes the excision of introns from
           pre-mRNAs in two successive trans-esterification
           reactions. Step 2 depends upon integral spliceosome
           constituents such as U5 snRNA and Prp8 and
           non-spliceosomal proteins Prp16, Slu7, Prp18, and Prp22.
           ATP hydrolysis by the DEAH-box enzyme Prp16 promotes a
           conformational change in the spliceosome that leads to
           protection of the 3'ss from targeted RNase H cleavage.
           This change, which probably reflects binding of the 3'ss
           PyAG in the catalytic centre of the spliceosome,
           requires the ordered recruitment of Slu7, Prp18, and
           Prp22 to the spliceosome. There is a close functional
           relationship between Prp8, Prp18, and Slu7, and Prp18
           interacts with Slu7, so that together they recruit Prp22
           to the spliceosome. Most members of the family carry a
           zinc-finger of the CCHC-type upstream of this domain.
          Length = 236

 Score = 27.0 bits (60), Expect = 5.9
 Identities = 15/63 (23%), Positives = 26/63 (41%), Gaps = 11/63 (17%)

Query: 68  MEKEEEERKKKKKRKKRKK-----------KKKKKEKRIPPRIIYVSGAGKVEVKKGYSV 116
           + K+E+E+K++ K +K++             K  KE  +     YV      + KK  S 
Sbjct: 161 LRKKEKEKKEQLKIQKKQSLLEKYGGEEHLDKPPKELLLGQSEDYVEYDRAGKKKKAKSK 220

Query: 117 TLE 119
             E
Sbjct: 221 YEE 223


>gnl|CDD|143236 cd05759, Ig2_KIRREL3-like, Second immunoglobulin (Ig)-like domain
           of Kirrel (kin of irregular chiasm-like) 3 (also known
           as Neph2).  Ig2_KIRREL3-like: domain similar to the
           second immunoglobulin (Ig)-like domain of Kirrel (kin of
           irregular chiasm-like) 3 (also known as Neph2). This
           protein has five Ig-like domains, one transmembrane
           domain, and a cytoplasmic tail. Included in this group
           is mammalian Kirrel (Neph1), Kirrel2 (Neph3), and
           Drosophila RST (irregular chiasm C-roughest) protein.
           These proteins contain multiple Ig domains, have
           properties of cell adhesion molecules, and are important
           in organ development.
          Length = 82

 Score = 25.9 bits (57), Expect = 6.0
 Identities = 19/79 (24%), Positives = 30/79 (37%), Gaps = 12/79 (15%)

Query: 118 LECKADG-NPVPNITWTRKNNNLPGGEYSYS----------GNSLTVRHTNRHSAGIYLC 166
           L C+A G  P   I W R    L G  YS             ++L +  ++  +   + C
Sbjct: 4   LTCRARGAKPAAEIIWFRDGEVLDGATYSKELLKDGKRETTVSTLPITPSDHDTGRTFTC 63

Query: 167 VANNM-VGSSAAASIALHV 184
            A N  + +    S+ L V
Sbjct: 64  RARNEALPTGKETSVTLDV 82


>gnl|CDD|234017 TIGR02794, tolA_full, TolA protein.  TolA couples the inner
           membrane complex of itself with TolQ and TolR to the
           outer membrane complex of TolB and OprL (also called
           Pal). Most of the length of the protein consists of
           low-complexity sequence that may differ in both length
           and composition from one species to another,
           complicating efforts to discriminate TolA (the most
           divergent gene in the tol-pal system) from paralogs such
           as TonB. Selection of members of the seed alignment and
           criteria for setting scoring cutoffs are based largely
           on conserved operon structure. The Tol-Pal complex is
           required for maintaining outer membrane integrity. Also
           involved in transport (uptake) of colicins and
           filamentous DNA, and implicated in pathogenesis.
           Transport is energized by the proton motive force. TolA
           is an inner membrane protein that interacts with
           periplasmic TolB and with outer membrane porins ompC,
           phoE and lamB [Transport and binding proteins, Other,
           Cellular processes, Pathogenesis].
          Length = 346

 Score = 27.1 bits (60), Expect = 6.0
 Identities = 6/25 (24%), Positives = 14/25 (56%)

Query: 69  EKEEEERKKKKKRKKRKKKKKKKEK 93
             E+  +K+ ++R   +K  K+ E+
Sbjct: 85  AAEQARQKELEQRAAAEKAAKQAEQ 109



 Score = 27.1 bits (60), Expect = 6.8
 Identities = 10/27 (37%), Positives = 13/27 (48%)

Query: 67  AMEKEEEERKKKKKRKKRKKKKKKKEK 93
           A  K   E KKK    K+K + + K K
Sbjct: 156 AKAKAAAEAKKKAAEAKKKAEAEAKAK 182


>gnl|CDD|224461 COG1544, COG1544, Ribosome-associated protein Y (PSrp-1)
           [Translation, ribosomal structure and biogenesis].
          Length = 110

 Score = 26.1 bits (58), Expect = 6.1
 Identities = 13/36 (36%), Positives = 21/36 (58%), Gaps = 1/36 (2%)

Query: 61  YLSYILAMEK-EEEERKKKKKRKKRKKKKKKKEKRI 95
           Y +  LA++K E + RK K+K K  ++ K  KE+  
Sbjct: 73  YAAIDLAIDKLERQLRKHKEKLKDHRRAKVSKEEAF 108


>gnl|CDD|214335 CHL00016, ndhG, NADH dehydrogenase subunit 6.
          Length = 182

 Score = 26.8 bits (60), Expect = 6.2
 Identities = 15/42 (35%), Positives = 18/42 (42%), Gaps = 8/42 (19%)

Query: 21  IMFADLAIMFADSRPES----TFWIFGKLITSCSYIILATSL 58
           I+FA   +MF +  PE       W  G  ITS     L  SL
Sbjct: 74  IIFA---VMFMNG-PEYSKDFNLWTVGDGITSLVCTSLFFSL 111


>gnl|CDD|197664 smart00338, BRLZ, basic region leucine zipper. 
          Length = 65

 Score = 25.2 bits (56), Expect = 6.2
 Identities = 8/29 (27%), Positives = 19/29 (65%), Gaps = 7/29 (24%)

Query: 71 EEEERKKKKK-------RKKRKKKKKKKE 92
          EE+E++++++       R+ R++KK + E
Sbjct: 1  EEDEKRRRRRERNREAARRSRERKKAEIE 29


>gnl|CDD|221466 pfam12220, U1snRNP70_N, U1 small nuclear ribonucleoprotein of
          70kDa MW N terminal.  This domain is found in
          eukaryotes. This domain is about 90 amino acids in
          length. This domain is found associated with pfam00076.
          This domain is part of U1 snRNP, which is the pre-mRNA
          binding protein of the penta-snRNP spliceosome complex.
          It extends over a distance of 180 A from its RNA
          binding domain, wraps around the core domain of U1
          snRNP consisting of the seven Sm proteins and finally
          contacts U1-C, which is crucial for 5'-splice-site
          recognition.
          Length = 94

 Score = 25.7 bits (57), Expect = 6.3
 Identities = 10/39 (25%), Positives = 22/39 (56%), Gaps = 8/39 (20%)

Query: 63 SYILAMEKEEEE--------RKKKKKRKKRKKKKKKKEK 93
           Y+   +  ++E          +K++R+KR+KK+K ++K
Sbjct: 42 QYLSEFKDYKDEPPPEPTETWLEKREREKREKKEKLEKK 80


>gnl|CDD|236944 PRK11642, PRK11642, exoribonuclease R; Provisional.
          Length = 813

 Score = 27.4 bits (61), Expect = 6.3
 Identities = 8/24 (33%), Positives = 13/24 (54%)

Query: 70  KEEEERKKKKKRKKRKKKKKKKEK 93
           + E++ K K  +K  +K KK   K
Sbjct: 770 RGEKKAKPKAAKKDARKAKKPSAK 793



 Score = 27.0 bits (60), Expect = 8.5
 Identities = 9/25 (36%), Positives = 14/25 (56%)

Query: 69  EKEEEERKKKKKRKKRKKKKKKKEK 93
           EK+ + +  KK  +K KK   K +K
Sbjct: 772 EKKAKPKAAKKDARKAKKPSAKTQK 796


>gnl|CDD|223014 PHA03231, PHA03231, glycoprotein BALF4; Provisional.
          Length = 829

 Score = 27.2 bits (61), Expect = 6.3
 Identities = 10/24 (41%), Positives = 14/24 (58%), Gaps = 1/24 (4%)

Query: 65  ILAMEK-EEEERKKKKKRKKRKKK 87
           +LAM     EER++KK +KK    
Sbjct: 777 LLAMHLLSAEERQEKKAKKKNSGP 800


>gnl|CDD|219978 pfam08701, GN3L_Grn1, GNL3L/Grn1 putative GTPase.  Grn1 (yeast)
          and GNL3L (human) are putative GTPases which are
          required for growth and play a role in processing of
          nucleolar pre-rRNA. This family contains a potential
          nuclear localisation signal.
          Length = 80

 Score = 25.7 bits (57), Expect = 6.4
 Identities = 13/27 (48%), Positives = 16/27 (59%), Gaps = 2/27 (7%)

Query: 72 EEERKKKKKRKKR--KKKKKKKEKRIP 96
          E  RK +K+ KK    K KKKK+  IP
Sbjct: 12 EHHRKLRKEAKKNPTWKSKKKKDPGIP 38



 Score = 25.3 bits (56), Expect = 7.5
 Identities = 7/25 (28%), Positives = 17/25 (68%)

Query: 68 MEKEEEERKKKKKRKKRKKKKKKKE 92
           ++EEE+ ++K+ RK  + + +K+ 
Sbjct: 56 RKQEEEKERRKEARKAERAEARKRG 80



 Score = 25.3 bits (56), Expect = 8.4
 Identities = 10/29 (34%), Positives = 21/29 (72%)

Query: 70 KEEEERKKKKKRKKRKKKKKKKEKRIPPR 98
          +E EE+K+K++ +K ++K+ +K +R   R
Sbjct: 49 EEIEEKKRKQEEEKERRKEARKAERAEAR 77



 Score = 25.3 bits (56), Expect = 8.8
 Identities = 9/27 (33%), Positives = 22/27 (81%)

Query: 68 MEKEEEERKKKKKRKKRKKKKKKKEKR 94
          +E+ EE+++K+++ K+R+K+ +K E+ 
Sbjct: 48 LEEIEEKKRKQEEEKERRKEARKAERA 74


>gnl|CDD|152599 pfam12164, SporV_AA, Stage V sporulation protein AA.  This domain
           family is found in bacteria - primarily Firmicutes, and
           is approximately 90 amino acids in length. There is a
           single completely conserved residue G that may be
           functionally important. Most annotation associated with
           this domain suggests that it is involved in the fifth
           stage of sporulation; however, there is little
           published evidence to back this up.
          Length = 93

 Score = 25.6 bits (57), Expect = 6.5
 Identities = 8/22 (36%), Positives = 11/22 (50%)

Query: 98  RIIYVSGAGKVEVKKGYSVTLE 119
             +Y+    +VEV    SVTL 
Sbjct: 2   DTVYIRLRHRVEVTPKKSVTLG 23


>gnl|CDD|217051 pfam02463, SMC_N, RecF/RecN/SMC N terminal domain.  This domain is
           found at the N terminus of SMC proteins. The SMC
           (structural maintenance of chromosomes) superfamily
           proteins have ATP-binding domains at the N- and
           C-termini, and two extended coiled-coil domains
           separated by a hinge in the middle. The eukaryotic SMC
           proteins form two kinds of heterodimers: the SMC1/SMC3
           and the SMC2/SMC4 types. These heterodimers constitute
           an essential part of higher order complexes, which are
           involved in chromatin and DNA dynamics. This family also
           includes the RecF and RecN proteins that are involved in
           DNA metabolism and recombination.
          Length = 1162

 Score = 27.2 bits (60), Expect = 6.5
 Identities = 8/36 (22%), Positives = 17/36 (47%)

Query: 60  HYLSYILAMEKEEEERKKKKKRKKRKKKKKKKEKRI 95
                 L   K +E+ KK  +  + K+K + +E+ +
Sbjct: 192 DLEELKLQELKLKEQAKKALEYYQLKEKLELEEENL 227


>gnl|CDD|206035 pfam13864, Enkurin, Calmodulin-binding.  This is a family of
           apparent calmodulin-binding proteins found at high
           levels in the testis and vomeronasal organ and at lower
           levels in certain other tissues. Enkurin is a scaffold
           protein that binds PI3 kinase to sperm transient
           receptor potential (canonical) (TRPC) channels. The
           mammalian transient receptor potential (canonical)
           channels are the primary candidates for the Ca(2+) entry
           pathway activated by the hormones, growth factors, and
           neurotransmitters that exert their effect through
           activation of PLC. Calmodulin binds to the C-terminus of
           all TRPC channels, and dissociation of calmodulin from
           TRPC4 results in profound activation of the channel.
          Length = 98

 Score = 26.0 bits (58), Expect = 6.8
 Identities = 8/41 (19%), Positives = 24/41 (58%)

Query: 73  EERKKKKKRKKRKKKKKKKEKRIPPRIIYVSGAGKVEVKKG 113
            +RK++   +K ++++++++   PP    +S   ++E+  G
Sbjct: 8   LKRKEEIAEEKEERERQEEDPDCPPGHRLLSEEERLELLNG 48


>gnl|CDD|220665 pfam10268, Tmemb_161AB, Predicted transmembrane protein 161AB.
          Transmemb_161AB is a family of conserved proteins found
          from worms to humans. Members are putative
          transmembrane proteins but otherwise the function is
          not known.
          Length = 486

 Score = 27.0 bits (60), Expect = 6.8
 Identities = 14/52 (26%), Positives = 23/52 (44%), Gaps = 8/52 (15%)

Query: 44 KLITSCSY---IILATSLLHYLSYILAMEKEEEERKKKKKRKKRKKKKKKKE 92
          KL    S+   ++   SL  YL        E+E R    K+KK K ++ ++ 
Sbjct: 19 KLSPHYSFARWLLCNGSLYRYLH-----PTEDELRALAGKQKKPKGRRDRRA 65


>gnl|CDD|143225 cd05748, Ig_Titin_like, Immunoglobulin (Ig)-like domain of titin
           and similar proteins.  Ig_Titin_like: immunoglobulin
           (Ig)-like domain found in titin-like proteins. Titin
           (also called connectin) is a fibrous sarcomeric protein
           specifically found in vertebrate striated muscle. Titin
           is gigantic; depending on isoform composition, it ranges
           from 2970 to 3700 kDa, and is of a length that spans
           half a sarcomere. Titin largely consists of multiple
           repeats of Ig-like and fibronectin type 3 (FN-III)-like
           domains. Titin connects the ends of myosin thick
           filaments to Z disks and extends along the thick
           filament to the H zone.  It appears to function
           similarly to an elastic band, keeping the myosin
           filaments centered in the sarcomere during muscle
           contraction or stretching. Within the sarcomere, titin
           is also attached to or is associated with myosin binding
           protein C (MyBP-C). MyBP-C appears to contribute to the
           generation of passive tension by titin, and similar to
           titin has repeated Ig-like and FN-III domains. Also
           included in this group are worm twitchin and insect
           projectin, thick filament proteins of invertebrate
           muscle, which also have repeated Ig-like and FN-III
           domains.
          Length = 74

 Score = 25.2 bits (56), Expect = 6.9
 Identities = 18/71 (25%), Positives = 30/71 (42%), Gaps = 16/71 (22%)

Query: 124 GNPVPNITWTR--KNNNLPG-------GEYSYSGNSLTVRHTNRHSAGIY-LCVANNMVG 173
           G P P +TW++  K   L G          +    SL +++  R  +G Y L + N    
Sbjct: 10  GRPTPTVTWSKDGKPLKLSGRVQIETTASST----SLVIKNAERSDSGKYTLTLKNP--A 63

Query: 174 SSAAASIALHV 184
              +A+I + V
Sbjct: 64  GEKSATINVKV 74


>gnl|CDD|215641 PLN03237, PLN03237, DNA topoisomerase 2; Provisional.
          Length = 1465

 Score = 27.1 bits (60), Expect = 7.1
 Identities = 10/35 (28%), Positives = 17/35 (48%), Gaps = 6/35 (17%)

Query: 70   KEEEERKKKKKRKKR------KKKKKKKEKRIPPR 98
            K EE R+K ++   R      KK  ++  K+  P+
Sbjct: 1174 KAEEAREKLQRAAARGESGAAKKVSRQAPKKPAPK 1208


>gnl|CDD|218845 pfam05991, NYN_YacP, YacP-like NYN domain.  This family consists of
           bacterial proteins related to YacP. This family is
           uncharacterized functionally, but it has been suggested
           that these proteins are nucleases due to them containing
           a NYN domain. NYN (for N4BP1, YacP-like Nuclease)
           domains were discovered by Anantharaman and Aravind.
           Based on gene neighborhoods it was suggested that the
           bacterial YacP proteins interact with the Ribonuclease
           III and TrmH methylase in a processome complex that
           catalyzes the maturation of rRNA and tRNA.
          Length = 165

 Score = 26.4 bits (59), Expect = 7.1
 Identities = 13/34 (38%), Positives = 21/34 (61%)

Query: 66  LAMEKEEEERKKKKKRKKRKKKKKKKEKRIPPRI 99
           LA E +  E+K +KK +KR+K +KK   R+   +
Sbjct: 123 LAEEVKRAEKKIRKKAEKRRKSRKKYLDRLSDEV 156


>gnl|CDD|240388 PTZ00372, PTZ00372, endonuclease 4-like protein; Provisional.
          Length = 413

 Score = 27.0 bits (60), Expect = 7.2
 Identities = 10/27 (37%), Positives = 19/27 (70%)

Query: 69 EKEEEERKKKKKRKKRKKKKKKKEKRI 95
           K++ E+KKKKK++K++ K + + K  
Sbjct: 66 SKKKSEKKKKKKKEKKEPKSEGETKLG 92



 Score = 26.6 bits (59), Expect = 8.5
 Identities = 11/29 (37%), Positives = 17/29 (58%)

Query: 69  EKEEEERKKKKKRKKRKKKKKKKEKRIPP 97
           EK+E + + + K   +  KK KK K+ PP
Sbjct: 79  EKKEPKSEGETKLGFKTPKKSKKTKKKPP 107


>gnl|CDD|227474 COG5145, RAD14, DNA excision repair protein [DNA replication,
           recombination, and repair].
          Length = 292

 Score = 26.9 bits (59), Expect = 7.2
 Identities = 13/33 (39%), Positives = 22/33 (66%), Gaps = 1/33 (3%)

Query: 67  AMEKEEEERKK-KKKRKKRKKKKKKKEKRIPPR 98
            +++E++ R+K K  RK++K +KK KE R   R
Sbjct: 213 ELDREKQRREKMKDDRKEKKLEKKIKELRRKTR 245


>gnl|CDD|191063 pfam04689, S1FA, DNA binding protein S1FA.  S1FA is a DNA-binding
          protein found in plants that specifically recognises
          the negative promoter element S1F.
          Length = 70

 Score = 25.2 bits (55), Expect = 7.4
 Identities = 17/50 (34%), Positives = 28/50 (56%), Gaps = 4/50 (8%)

Query: 52 IILATSLLHYL--SYILAM--EKEEEERKKKKKRKKRKKKKKKKEKRIPP 97
          +++   LL +L  +YIL +  +K    RKKK   KK+ K++K K+    P
Sbjct: 19 LVVGGLLLTFLVGNYILYVYAQKNLPPRKKKPVSKKKMKREKLKQGVSVP 68


>gnl|CDD|219039 pfam06461, DUF1086, Domain of Unknown Function (DUF1086).  This
          family consists of several eukaryotic domains of
          unknown function which are present in chromodomain
          helicase DNA binding proteins. This domain is often
          found in conjunction with pfam00176, pfam00271,
          pfam06465, pfam00385 and pfam00628.
          Length = 158

 Score = 26.5 bits (58), Expect = 7.4
 Identities = 7/29 (24%), Positives = 19/29 (65%)

Query: 69 EKEEEERKKKKKRKKRKKKKKKKEKRIPP 97
          +++ +E+ +  +R  RK+ +   E+++PP
Sbjct: 9  DEDYDEKPRTVRRPYRKRARDNSEEKLPP 37


>gnl|CDD|193580 cd09891, NGN_Bact_1, Bacterial N-Utilization Substance G (NusG)
          N-terminal (NGN) domain, subgroup 1.  The N-Utilization
          Substance G (NusG) protein is involved in transcription
          elongation and termination in bacteria. NusG is
          essential in Escherichia coli and associates with RNA
          polymerase elongation and Rho-termination. Homologs of
          the NusG gene exist in all bacteria. The NusG
          N-terminal domain (NGN) is similar in all NusG
          homologs, but its C-terminal domain and the linker that
          separates these two domains are different. The domain
          organization of NusG suggests that the common
          properties of NusG and its homologs are due to their
          similar NGN domains.
          Length = 107

 Score = 25.9 bits (58), Expect = 7.6
 Identities = 8/18 (44%), Positives = 11/18 (61%)

Query: 68 MEKEEEERKKKKKRKKRK 85
           E+  E +  KKK K+RK
Sbjct: 39 TEEVVEVKNGKKKVKERK 56


>gnl|CDD|188282 TIGR03110, exosort_Gpos, exosortase family protein XrtG.  Members
           of this protein family are found in a modest number of
           non-pathogenic Gram-positive bacteria, including three
           species of Lactococcus and three paralogs in Clostridium
           acetobutylicum. This protein appears related to the
           conserved core region of a family of proposed
           transpeptidases, exosortase (previously EpsH), thought
           to act on PEP-CTERM proteins. Members of the seed
           alignment include all exosortase proposed active site
           residues. However, in contrast to canonical exosortase
           (TIGR02602) and archaeal (TIGR03762), and cyanobacterial
           (TIGR03763) variants, this family has not yet been
           matched to a cognate PEP-CTERM-like sorting signal. This
           protein is assigned the gene symbol XrtG (eXosoRTase
           family protein of Gram-positives).
          Length = 187

 Score = 26.6 bits (59), Expect = 8.0
 Identities = 12/46 (26%), Positives = 19/46 (41%), Gaps = 5/46 (10%)

Query: 12  WIFGKPITRIMFADLAIMFADSRPESTFW---IFGKLITSCSYIIL 54
           WIF   I RI    + + +      S +    + G++I     IIL
Sbjct: 128 WIFLANILRISLIIVIVKYFGV--SSFYIAHTVLGRIIFYVLVIIL 171


>gnl|CDD|153340 cd07656, F-BAR_srGAP, The F-BAR (FES-CIP4 Homology and
           Bin/Amphiphysin/Rvs) domain of Slit-Robo GTPase
           Activating Proteins.  F-BAR domains are dimerization
           modules that bind and bend membranes and are found in
           proteins involved in membrane dynamics and actin
           reorganization. Slit-Robo GTPase Activating Proteins
           (srGAPs) are Rho GAPs that interact with Robo1, the
           transmembrane receptor of Slit proteins. Slit proteins
           are secreted proteins that control axon guidance and the
           migration of neurons and leukocytes. Vertebrates contain
           three isoforms of srGAPs, all of which are expressed
           during embryonic and early development in the nervous
           system but with different localization and timing.
           srGAPs contain an N-terminal F-BAR domain, a Rho GAP
           domain, and a C-terminal SH3 domain. F-BAR domains form
           banana-shaped dimers with a positively-charged concave
           surface that binds to negatively-charged lipid
           membranes. They can induce membrane deformation in the
           form of long tubules.
          Length = 241

 Score = 26.5 bits (59), Expect = 8.1
 Identities = 9/27 (33%), Positives = 17/27 (62%)

Query: 68  MEKEEEERKKKKKRKKRKKKKKKKEKR 94
            ++E+   KK ++ +  KK +K+ EKR
Sbjct: 162 EKQEQSPEKKLERSRSSKKIEKEVEKR 188


>gnl|CDD|234045 TIGR02876, spore_yqfD, sporulation protein YqfD.  YqfD is part of
           the sigma-E regulon in the sporulation program of
           endospore-forming Gram-positive bacteria. Mutation
           results in a sporulation defect in Bacillus subtilis.
           Members are found in all currently known
           endospore-forming bacteria, including the genera
           Bacillus, Symbiobacterium, Carboxydothermus,
           Clostridium, and Thermoanaerobacter [Cellular processes,
           Sporulation and germination].
          Length = 382

 Score = 26.5 bits (59), Expect = 8.1
 Identities = 14/61 (22%), Positives = 25/61 (40%), Gaps = 3/61 (4%)

Query: 65  ILAMEKEEEERKKKKK---RKKRKKKKKKKEKRIPPRIIYVSGAGKVEVKKGYSVTLECK 121
               E +E+  K  K+    K ++K +K+ +K + P    VS     E  +G  V +   
Sbjct: 314 ETYYEVKEKVEKVTKEEAIEKAKEKAEKELKKELDPNAKIVSDKILSERVEGGKVKVTVH 373

Query: 122 A 122
            
Sbjct: 374 V 374


>gnl|CDD|220628 pfam10198, Ada3, Histone acetyltransferases subunit 3.  Ada3 is a
           family of proteins conserved from yeasts to humans. It
           is an essential component of the Ada transcriptional
           coactivator (alteration/deficiency in activation)
           complex. Ada3 plays a key role in linking histone
           acetyltransferase-containing complexes to p53 (tumour
           suppressor protein) thereby regulating p53 acetylation,
           stability and transcriptional activation following DNA
           damage.
          Length = 127

 Score = 26.1 bits (58), Expect = 8.2
 Identities = 10/16 (62%), Positives = 12/16 (75%)

Query: 79  KKRKKRKKKKKKKEKR 94
            KR + + KKKKKEKR
Sbjct: 91  LKRIRARGKKKKKEKR 106


>gnl|CDD|218517 pfam05236, TAF4, Transcription initiation factor TFIID component
           TAF4 family.  This region of similarity is found in
           Transcription initiation factor TFIID component TAF4.
          Length = 255

 Score = 26.6 bits (59), Expect = 8.3
 Identities = 7/26 (26%), Positives = 17/26 (65%)

Query: 69  EKEEEERKKKKKRKKRKKKKKKKEKR 94
           +KEEEER+ +++R+   +  ++   +
Sbjct: 122 QKEEEERRVERRRELGLEDPEQLRLK 147


>gnl|CDD|227479 COG5150, COG5150, Class 2 transcription repressor NC2, beta subunit
           (Dr1) [Transcription].
          Length = 148

 Score = 26.1 bits (57), Expect = 8.4
 Identities = 14/48 (29%), Positives = 23/48 (47%)

Query: 44  KLITSCSYIILATSLLHYLSYILAMEKEEEERKKKKKRKKRKKKKKKK 91
           K   +  ++I A   L +  YI +  +E E  K  +K+K+ K  K K 
Sbjct: 63  KKTIAYEHVIKALENLEFEEYIESCMEEHENYKSYQKQKESKISKFKD 110


>gnl|CDD|239570 cd03488, Topoisomer_IB_N_htopoI_like, Topoisomer_IB_N_htopoI_like :
           N-terminal DNA binding fragment found in eukaryotic DNA
           topoisomerase (topo) IB proteins similar to the
           monomeric yeast and human topo I.  Topo I enzymes are
           divided into:  topo type IA (bacterial) and type IB
           (eukaryotic). Topo I relaxes superhelical tension in
           duplex DNA by creating a single-strand nick; the broken
           strand can then rotate around the unbroken strand to
           remove DNA supercoils, and the nick is religated,
           liberating topo I. These enzymes regulate the
           topological changes that accompany DNA replication,
           transcription and other nuclear processes.  Human topo I
           is the target of a diverse set of anticancer drugs
           including camptothecins (CPTs). CPTs bind to the topo
           I-DNA complex and inhibit religation of the
           single-strand nick, resulting in the accumulation of
           topo I-DNA adducts.  This family may represent more than
           one structural domain.
          Length = 215

 Score = 26.5 bits (59), Expect = 8.4
 Identities = 8/22 (36%), Positives = 14/22 (63%)

Query: 72  EEERKKKKKRKKRKKKKKKKEK 93
           + ++++KK   K +KK  K EK
Sbjct: 95  KAQKEEKKAMSKEEKKAIKAEK 116


>gnl|CDD|218899 pfam06102, DUF947, Domain of unknown function (DUF947).  Family of
           eukaryotic proteins with unknown function.
          Length = 168

 Score = 26.1 bits (58), Expect = 8.5
 Identities = 10/37 (27%), Positives = 22/37 (59%)

Query: 58  LLHYLSYILAMEKEEEERKKKKKRKKRKKKKKKKEKR 94
           L    S +  ++ ++ ER+  K+ KK++K+  K+ K+
Sbjct: 88  LQSMKSRLKTLKNKDREREILKEHKKQEKELIKEGKK 124


>gnl|CDD|218734 pfam05758, Ycf1, Ycf1.  The chloroplast genomes of most higher
           plants contain two giant open reading frames designated
           ycf1 and ycf2. Although the function of Ycf1 is unknown,
           it is known to be an essential gene.
          Length = 832

 Score = 26.9 bits (60), Expect = 8.5
 Identities = 11/34 (32%), Positives = 24/34 (70%), Gaps = 6/34 (17%)

Query: 68  MEKEEEER------KKKKKRKKRKKKKKKKEKRI 95
           +EK++E +      +K KK +K++K K+++++RI
Sbjct: 647 IEKKKEFKILDYTEEKTKKEEKKEKNKREEKERI 680


>gnl|CDD|222592 pfam14204, Ribosomal_L18_c, Ribosomal L18 C-terminal region.
          This domain is the C-terminal end of ribosomal L18/L5
          proteins.
          Length = 93

 Score = 25.6 bits (57), Expect = 8.6
 Identities = 7/22 (31%), Positives = 15/22 (68%)

Query: 74 ERKKKKKRKKRKKKKKKKEKRI 95
           RKKK+K++ + + K+   K++
Sbjct: 60 SRKKKEKKEVKAESKRYNAKKL 81


>gnl|CDD|153289 cd07605, I-BAR_IMD, Inverse (I)-BAR, also known as the IRSp53/MIM
           homology Domain (IMD), a dimerization module that binds
           and bends membranes.  Inverse (I)-BAR (or IMD) is a
           member of the Bin/Amphiphysin/Rvs (BAR) domain family.
           It is a dimerization and lipid-binding module that bends
           membranes and induces membrane protrusions in the
           opposite direction compared to classical BAR and F-BAR
           domains, which produce membrane invaginations. IMD
           domains are found in Insulin Receptor tyrosine kinase
           Substrate p53 (IRSp53), Missing in Metastasis (MIM), and
           Brain-specific Angiogenesis Inhibitor 1-Associated
           Protein 2-like (BAIAP2L) proteins. These are
           multi-domain proteins that act as scaffolding proteins
           and transducers of a variety of signaling pathways that
           link membrane dynamics and the underlying actin
           cytoskeleton. Most members contain an N-terminal IMD, an
           SH3 domain, and a WASP homology 2 (WH2) actin-binding
           motif at the C-terminus, except for MIM which does not
           carry an SH3 domain. Some members contain additional
           domains and motifs. The IMD domain binds and bundles
           actin filaments, binds membranes and produces membrane
           protrusions, and interacts with the small GTPase Rac.
          Length = 223

 Score = 26.6 bits (59), Expect = 8.8
 Identities = 13/43 (30%), Positives = 21/43 (48%), Gaps = 6/43 (13%)

Query: 64  YILAMEKE-EEERKKK-----KKRKKRKKKKKKKEKRIPPRII 100
            I   EK+ ++E K+K     K R + KK +KK +K    +  
Sbjct: 109 VINKFEKDYKKEYKQKREDLDKARSELKKLQKKSQKSGTGKYQ 151


>gnl|CDD|165398 PHA03126, PHA03126, dUTPase; Provisional.
          Length = 326

 Score = 26.5 bits (58), Expect = 8.9
 Identities = 12/22 (54%), Positives = 14/22 (63%)

Query: 23  FADLAIMFADSRPESTFWIFGK 44
           F DL I+FA S P  T  IFG+
Sbjct: 197 FVDLPIVFASSNPAVTPCIFGR 218


>gnl|CDD|217935 pfam04158, Sof1, Sof1-like domain.  Sof1 is essential for cell
          growth and is a component of the nucleolar rRNA
          processing machinery.
          Length = 88

 Score = 25.3 bits (56), Expect = 8.9
 Identities = 8/34 (23%), Positives = 18/34 (52%), Gaps = 2/34 (5%)

Query: 67 AMEKEEEERKKKKKRKKRKKK--KKKKEKRIPPR 98
          A + + E ++ KK++++ ++K  K       P R
Sbjct: 50 AQKIKREMKEAKKRKEENRRKHSKPGSVPPKPER 83


>gnl|CDD|189027 cd09857, PIN_EXO1, PIN domain of Exonuclease-1, a
           structure-specific, divalent-metal-ion dependent, 5'
           nuclease and homologs.  Exonuclease-1 (EXO1) is involved
           in multiple, eukaryotic DNA metabolic pathways,
           including DNA replication processes (5' flap DNA
           endonuclease activity and double stranded DNA
           5'-exonuclease activity), DNA repair processes (DNA
           mismatch repair (MMR) and post-replication repair
           (PRR)), recombination, and telomere integrity. EXO1
           functions in the MMS2 error-free branch of the PRR
           pathway in the maintenance and repair of stalled
           replication forks. Studies also suggest that EXO1 plays
           both structural and catalytic roles during MMR-mediated
           mutation avoidance. EXO1 belongs to the FEN1-EXO1-like
           family of structure-specific, 5' nucleases. These
           nucleases contain a PIN (PilT N terminus) domain with a
           helical arch/clamp region (I domain) of variable length
           (approximately 43 residues in EXO1 PIN domains) and a
           H3TH (helix-3-turn-helix) domain, an atypical
           helix-hairpin-helix-2-like region. Both the H3TH domain
           (not included here) and the helical arch/clamp region
           are involved in DNA binding. Nucleases within this group
           also have a carboxylate-rich active site that is
           involved in binding essential divalent metal ion
           cofactors (Mg2+/Mn2+). EXO1 nucleases also have
           C-terminal Mlh1- and Msh2-binding domains which allow
           interaction with MMR and PRR proteins, respectively.
          Length = 210

 Score = 26.3 bits (59), Expect = 8.9
 Identities = 4/25 (16%), Positives = 15/25 (60%)

Query: 70  KEEEERKKKKKRKKRKKKKKKKEKR 94
           K+  E +++++R++  +K  +  + 
Sbjct: 85  KKGTEEERRERREENLEKALELLRE 109


>gnl|CDD|143230 cd05753, Ig2_FcgammaR_like, Second immunoglobulin (Ig)-like domain
           of  Fcgamma-receptors (FcgammaRs) and similar proteins. 
           Ig2_FcgammaR_like: domain similar to the second
           immunoglobulin (Ig)-like domain of  Fcgamma-receptors
           (FcgammaRs). Interactions between IgG and FcgammaR are
           important to the initiation of cellular and humoral
           response. IgG binding to FcgammaR leads to a cascade of
           signals and ultimately to functions such as
           antibody-dependent-cellular-cytotoxicity (ADCC),
           endocytosis, phagocytosis, release of inflammatory
           mediators, etc. FcgammaR has two Ig-like domains. This
           group also contains FcepsilonRI, which binds IgE with
           high affinity.
          Length = 83

 Score = 25.4 bits (56), Expect = 9.1
 Identities = 16/62 (25%), Positives = 28/62 (45%), Gaps = 4/62 (6%)

Query: 109 EVKKGYSVTLECKADGN-PVPNITWTRKNNNLPGGEYSYSGNSLTVRHTNRHSAGIYLCV 167
            V +G  + L C    N PV  +T+ R   +    +YS+S ++L++       +G Y C 
Sbjct: 10  VVFEGEPLVLRCHGWKNKPVYKVTYYR---DGKAKKYSHSNSNLSIPQATLSDSGSYHCS 66

Query: 168 AN 169
             
Sbjct: 67  GI 68


>gnl|CDD|214659 smart00433, TOP2c, TopoisomeraseII.  Eukaryotic DNA topoisomerase
           II, GyrB, ParE.
          Length = 594

 Score = 26.8 bits (60), Expect = 9.1
 Identities = 7/28 (25%), Positives = 12/28 (42%)

Query: 70  KEEEERKKKKKRKKRKKKKKKKEKRIPP 97
              + R   KK ++  +KKK     +P 
Sbjct: 342 LAAKARAAAKKARELTRKKKLSSISLPG 369


>gnl|CDD|216295 pfam01093, Clusterin, Clusterin. 
          Length = 434

 Score = 26.6 bits (59), Expect = 9.1
 Identities = 9/23 (39%), Positives = 14/23 (60%)

Query: 68 MEKEEEERKKKKKRKKRKKKKKK 90
          ME+ EEE K      ++ KK+K+
Sbjct: 32 MERTEEEHKNLMSTLEKTKKEKE 54


>gnl|CDD|215444 PLN02830, PLN02830, UDP-sugar pyrophosphorylase.
          Length = 615

 Score = 26.6 bits (59), Expect = 9.3
 Identities = 11/25 (44%), Positives = 17/25 (68%)

Query: 58  LLHYLSYILAMEKEEEERKKKKKRK 82
           L  Y+  ILA+++  ++RK KK RK
Sbjct: 162 LQLYIESILALQERAKKRKAKKGRK 186


>gnl|CDD|220296 pfam09580, Spore_YhcN_YlaJ, Sporulation lipoprotein YhcN/YlaJ
           (Spore_YhcN_YlaJ).  This entry contains YhcN and YlaJ,
           which are predicted lipoproteins that have been detected
           as spore proteins but not vegetative proteins in
           Bacillus subtilis. Both appear to be expressed under
           control of the RNA polymerase sigma-G factor. The
           YlaJ-like members of this family have a low-complexity,
           strongly acidic, 40-residue C-terminal domain.
          Length = 169

 Score = 26.1 bits (58), Expect = 9.3
 Identities = 11/30 (36%), Positives = 18/30 (60%), Gaps = 2/30 (6%)

Query: 76  KKKKKRKKRKKKKKKKEKRIPPRI--IYVS 103
            ++   ++ KK+ KK  K + PRI  +YVS
Sbjct: 102 GERSLTEEIKKQVKKAVKSVDPRIYNVYVS 131


>gnl|CDD|219589 pfam07808, RED_N, RED-like protein N-terminal region.  This
          family contains sequences that are similar to the
          N-terminal region of Red protein. This and related
          proteins contain a RED repeat which consists of a
          number of RE and RD sequence elements. The region in
          question has several conserved NLS sequences and a
          putative trimeric coiled-coil region, suggesting that
          these proteins are expressed in the nucleus. The
          function of Red protein is unknown, but efficient
          sequestration to nuclear bodies suggests that its
          expression may be tightly regulated or that the protein
          self-aggregates extremely efficiently.
          Length = 238

 Score = 26.4 bits (58), Expect = 9.3
 Identities = 10/23 (43%), Positives = 15/23 (65%)

Query: 76 KKKKKRKKRKKKKKKKEKRIPPR 98
          KKKKK    +K+++  EK I P+
Sbjct: 1  KKKKKYAYLRKQEENAEKEINPK 23


>gnl|CDD|234379 TIGR03876, cas_csaX, CRISPR type I-A/APERN-associated protein CsaX.
            This family comprises a minor CRISPR-associated protein
           family. It occurs only in the context of the (strictly
           archaeal) Apern subtype of CRISPR/Cas system, and is
           further restricted to the Sulfolobales, including
           Metallosphaera sedula DSM 5348 and multiple species of
           the genus Sulfolobus.
          Length = 281

 Score = 26.4 bits (58), Expect = 9.3
 Identities = 18/46 (39%), Positives = 27/46 (58%), Gaps = 6/46 (13%)

Query: 75  RKKKKKRKKRKKKKKKKEKRIPPRIIYVSGAGKVEVKKGYSVTLEC 120
           R K+++R+KR +K KK E  +  RII +SG      KK +   L+C
Sbjct: 20  RGKEEERRKRIEKTKKIEGIL--RIIPLSGND----KKPFEQALKC 59


>gnl|CDD|206332 pfam14163, SieB, Superinfection exclusion protein B.  This family
          includes superinfection exclusion proteins. These
          proteins prevent the growth of superinfecting phage
          which are insensitive to repression. It aborts lytic
          development of superinfecting phage.
          Length = 151

 Score = 26.0 bits (58), Expect = 9.4
 Identities = 13/65 (20%), Positives = 30/65 (46%), Gaps = 14/65 (21%)

Query: 31 ADSRPESTFWIFGKLITSCSYIILATSLLHYLSYILAMEKEEEERKKKKKRKKRKKKKKK 90
           +   +   WI    + S +Y+I        LS IL          + K++ ++K+++++
Sbjct: 26 DEFVTKYRPWIGLIFLISVAYLIT-----LLLSKILQ---------EAKEKYQKKREQER 71

Query: 91 KEKRI 95
           EK++
Sbjct: 72 IEKKL 76


>gnl|CDD|201659 pfam01201, Ribosomal_S8e, Ribosomal protein S8e. 
          Length = 129

 Score = 25.9 bits (58), Expect = 9.6
 Identities = 13/38 (34%), Positives = 18/38 (47%), Gaps = 3/38 (7%)

Query: 75  RKKKKKRKKRKKKKKKKEKRIPPRIIYVSGAGKVEVKK 112
           RK+   R    +KK+K E   PP    +   GK  VK+
Sbjct: 11  RKRTGGRLDHHRKKRKFELGRPPAPTKL---GKNRVKQ 45


>gnl|CDD|129200 TIGR00092, TIGR00092, GTP-binding protein YchF.  This predicted
           GTP-binding protein is found in a single copy in every
           complete bacterial genome, and is found in Eukaryotes. A
           more distantly related protein, separated from this
           model, is found in the archaea. It is known to bind GTP
           and double-stranded nucleic acid. It is suggested to
           belong to a nucleoprotein complex and act as a
           translation factor [Unknown function, General].
          Length = 368

 Score = 26.7 bits (59), Expect = 9.8
 Identities = 12/51 (23%), Positives = 22/51 (43%)

Query: 65  ILAMEKEEEERKKKKKRKKRKKKKKKKEKRIPPRIIYVSGAGKVEVKKGYS 115
           + A E   E+R  + K+     K KK+E  +   I+ +   G++      S
Sbjct: 134 LKADEFLVEKRIGRSKKSAEGGKDKKEELLLLEIILPLLNGGQMARHVDLS 184


>gnl|CDD|149048 pfam07768, PVL_ORF50, PVL ORF-50-like family.  This is a family
          of sequences found in both bacteria and bacteriophages.
          This region is approximately 130 residues long and in
          some cases is found as part of the PVL
          (Panton-Valentine leukocidin) group of genes, which
          encode a member of the leukocidin group of bacterial
          toxins that kill leukocytes by creation of pores in the
          cell membrane. PVL appears to be a virulence factor
          associated with a number of human diseases.
          Length = 118

 Score = 25.6 bits (56), Expect = 9.8
 Identities = 9/25 (36%), Positives = 19/25 (76%)

Query: 67 AMEKEEEERKKKKKRKKRKKKKKKK 91
           +E+E +ER+ ++KR+K  + ++KK
Sbjct: 62 NIEQERKERELERKRRKEAELRRKK 86


>gnl|CDD|220705 pfam10344, Fmp27, Mitochondrial protein from FMP27.  This family
           contains mitochondrial FMP27 proteins which in yeasts
           together with SEN1 are long genes that exist in a looped
           conformation, effectively bringing together their
           promoter and terminator regions. Pol-II is located at
           both ends of FMP27 when this gene is transcribed from a
           GAL1 promoter under induced and non-induced conditions.
           The exact function of the Fmp27 protein is not certain.
          Length = 861

 Score = 26.6 bits (59), Expect = 9.9
 Identities = 8/39 (20%), Positives = 19/39 (48%), Gaps = 1/39 (2%)

Query: 62  LSYILAMEKEEEERKKKKKRKKRKKKKKKKE-KRIPPRI 99
           LS +L++  ++  +      KK+K+ +      R  P++
Sbjct: 406 LSLLLSLLNKKLLKSLPSLNKKKKRHRLISLLSRYLPKL 444


>gnl|CDD|151173 pfam10669, Phage_Gp23, Protein gp23 (Bacteriophage A118).  This
          is the highly conserved family of the major tail
          subunit protein.
          Length = 121

 Score = 25.8 bits (56), Expect = 10.0
 Identities = 11/32 (34%), Positives = 21/32 (65%)

Query: 63 SYILAMEKEEEERKKKKKRKKRKKKKKKKEKR 94
          S I+ +E +EE  K + +R+KR K+ K++  +
Sbjct: 41 SKIVRVEMKEERDKMETEREKRDKESKEERDK 72


  Database: CDD.v3.10
    Posted date:  Mar 20, 2013  7:55 AM
  Number of letters in database: 10,937,602
  Number of sequences in database:  44,354
  
Lambda     K      H
   0.318    0.133    0.395 

Gapped
Lambda     K      H
   0.267   0.0603    0.140 


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 44354
Number of Hits to DB: 9,534,490
Number of extensions: 895827
Number of successful extensions: 8281
Number of sequences better than 10.0: 1
Number of HSP's gapped: 6572
Number of HSP's successfully gapped: 939
Length of query: 187
Length of database: 10,937,602
Length adjustment: 91
Effective length of query: 96
Effective length of database: 6,901,388
Effective search space: 662533248
Effective search space used: 662533248
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.3 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.7 bits)
S2: 56 (25.6 bits)