RPS-BLAST 2.2.26 [Sep-21-2011]

Database: CDD.v3.10 
           44,354 sequences; 10,937,602 total letters

Searching..................................................done

Query= psy11546
         (1634 letters)



>gnl|CDD|238113 cd00190, Tryp_SPc, Trypsin-like serine protease; Many of these are
            synthesized as inactive precursor zymogens that are
            cleaved during limited proteolysis to generate their
            active forms. Alignment contains also inactive enzymes
            that have substitutions of the catalytic triad residues.
          Length = 232

 Score =  303 bits (778), Expect = 4e-94
 Identities = 110/238 (46%), Positives = 150/238 (63%), Gaps = 9/238 (3%)

Query: 1285 IVGGGNARLGSWPWQAAL-YKEGEFQCGATLISDQWLLSAGHCFYRAQDDYWVARLGTLR 1343
            IVGG  A++GS+PWQ +L Y  G   CG +LIS +W+L+A HC Y +    +  RLG+  
Sbjct: 1    IVGGSEAKIGSFPWQVSLQYTGGRHFCGGSLISPRWVLTAAHCVYSSAPSNYTVRLGSHD 60

Query: 1344 RGTKLPSPYEQLRPISKIILHPQYVDAGFINDISILKMKTP--FSNYVRPICLPHPNTPL 1401
              +       Q+  + K+I+HP Y  + + NDI++LK+K P   S+ VRPICLP     L
Sbjct: 61   LSS--NEGGGQVIKVKKVIVHPNYNPSTYDNDIALLKLKRPVTLSDNVRPICLPSSGYNL 118

Query: 1402 TDGTLCTVVGWGQLFEIGRVFPDTLQEVQLPIISTAECRKRTLFLPLYRVTENMFCAGFE 1461
              GT CTV GWG+  E G   PD LQEV +PI+S AEC++   +     +T+NM CAG  
Sbjct: 119  PAGTTCTVSGWGRTSEGGP-LPDVLQEVNVPIVSNAECKRA--YSYGGTITDNMLCAGGL 175

Query: 1462 RGGRDACLGDSGGPLMCQEPDGRWSLMGVTSNGYGCARANRPGVYTKVSNYIPWLYNN 1519
             GG+DAC GDSGGPL+C + +GR  L+G+ S G GCAR N PGVYT+VS+Y+ W+   
Sbjct: 176  EGGKDACQGDSGGPLVCND-NGRGVLVGIVSWGSGCARPNYPGVYTRVSSYLDWIQKT 232


>gnl|CDD|214473 smart00020, Tryp_SPc, Trypsin-like serine protease.  Many of these
            are synthesised as inactive precursor zymogens that are
            cleaved during limited proteolysis to generate their
            active forms. A few, however, are active as single chain
            molecules, and others are inactive due to substitutions
            of the catalytic triad residues.
          Length = 229

 Score =  302 bits (776), Expect = 7e-94
 Identities = 112/235 (47%), Positives = 148/235 (62%), Gaps = 10/235 (4%)

Query: 1284 RIVGGGNARLGSWPWQAAL-YKEGEFQCGATLISDQWLLSAGHCFYRAQDDYWVARLGTL 1342
            RIVGG  A +GS+PWQ +L Y  G   CG +LIS +W+L+A HC   +       RLG+ 
Sbjct: 1    RIVGGSEANIGSFPWQVSLQYGGGRHFCGGSLISPRWVLTAAHCVRGSDPSNIRVRLGSH 60

Query: 1343 RRGTKLPSPYEQLRPISKIILHPQYVDAGFINDISILKMKTP--FSNYVRPICLPHPNTP 1400
               +       Q+  +SK+I+HP Y  + + NDI++LK+K P   S+ VRPICLP  N  
Sbjct: 61   DLSS---GEEGQVIKVSKVIIHPNYNPSTYDNDIALLKLKEPVTLSDNVRPICLPSSNYN 117

Query: 1401 LTDGTLCTVVGWGQLFEIGRVFPDTLQEVQLPIISTAECRKRTLFLPLYRVTENMFCAGF 1460
            +  GT CTV GWG+  E     PDTLQEV +PI+S A CR+   +     +T+NM CAG 
Sbjct: 118  VPAGTTCTVSGWGRTSEGAGSLPDTLQEVNVPIVSNATCRRA--YSGGGAITDNMLCAGG 175

Query: 1461 ERGGRDACLGDSGGPLMCQEPDGRWSLMGVTSNGYGCARANRPGVYTKVSNYIPW 1515
              GG+DAC GDSGGPL+C   DGRW L+G+ S G GCAR  +PGVYT+VS+Y+ W
Sbjct: 176  LEGGKDACQGDSGGPLVCN--DGRWVLVGIVSWGSGCARPGKPGVYTRVSSYLDW 228


>gnl|CDD|215708 pfam00089, Trypsin, Trypsin. 
          Length = 218

 Score =  231 bits (591), Expect = 2e-69
 Identities = 107/235 (45%), Positives = 140/235 (59%), Gaps = 20/235 (8%)

Query: 1285 IVGGGNARLGSWPWQAALY-KEGEFQCGATLISDQWLLSAGHCFYRAQDDYWVARLGTLR 1343
            IVGG  A+ GS+PWQ +L    G+  CG +LIS+ W+L+A HC   A+       LG   
Sbjct: 1    IVGGDEAQPGSFPWQVSLQVSSGKHFCGGSLISENWVLTAAHCVSNAKS--VRVVLGAHN 58

Query: 1344 RGTKLPSPYEQLRPISKIILHPQYVDAGFINDISILKMKTP--FSNYVRPICLPHPNTPL 1401
               +     EQ   + K+I+HP Y +    NDI++LK+K+P    + VRPICLP  ++ L
Sbjct: 59   IVLREGG--EQKFDVKKVIVHPNY-NPDTDNDIALLKLKSPVTLGDTVRPICLPTASSDL 115

Query: 1402 TDGTLCTVVGWGQLFEIGRVFPDTLQEVQLPIISTAECRKRTLFLPLYRVTENMFCAGFE 1461
              GT CTV GWG    +G   PDTLQEV +P++S   CR          VT+NM CAG  
Sbjct: 116  PVGTTCTVSGWGNTKTLGL--PDTLQEVTVPVVSRETCRSAYGG----TVTDNMICAGA- 168

Query: 1462 RGGRDACLGDSGGPLMCQEPDGRWSLMGVTSNGYGCARANRPGVYTKVSNYIPWL 1516
             GG+DAC GDSGGPL+C   DG   L+G+ S GYGCA  N PGVYT VS+Y+ W+
Sbjct: 169  -GGKDACQGDSGGPLVC--SDGE--LIGIVSWGYGCASGNYPGVYTPVSSYLDWI 218


>gnl|CDD|227927 COG5640, COG5640, Secreted trypsin-like serine protease
            [Posttranslational modification, protein turnover,
            chaperones].
          Length = 413

 Score =  101 bits (252), Expect = 3e-22
 Identities = 76/273 (27%), Positives = 104/273 (38%), Gaps = 38/273 (13%)

Query: 1283 SRIVGGGNARLGSWPWQAAL------YKEGEFQCGATLISDQWLLSAGHCFYRAQDDYWV 1336
            SRI+GG NA  G +P   AL      Y  G F CG + +  +++L+A HC     D    
Sbjct: 31   SRIIGGSNANAGEYPSLVALVDRISDYVSGTF-CGGSKLGGRYVLTAAHCA----DASSP 85

Query: 1337 ARLGTLRRGTKLP-SPYEQLRPISKIILHPQYVDAGFINDISILKMKTPFSNYVRPICLP 1395
                  R    L  S   +   +  I +H  Y      NDI++L++    S     I   
Sbjct: 86   ISSDVNRVVVDLNDSSQAERGHVRTIYVHEFYSPGNLGNDIAVLELARAASLPRVKITSF 145

Query: 1396 HPNTPLTDGTLCTVVGWGQLFEIGRVFPDT--------------LQEVQLPIISTAECRK 1441
                  +D  L +V     +      F  T              L EV +  +  + C +
Sbjct: 146  DA----SDTFLNSVTTVSPMTNGT--FGVTTPSDVPRSSPKGTILHEVAVLFVPLSTCAQ 199

Query: 1442 RTLFLPLYRVTENM--FCAGFERGGRDACLGDSGGPLMCQEPDGRWSLMGVTSNGYG-CA 1498
                         +  FCAG  R  +DAC GDSGGP+  +  +GR    GV S G G C 
Sbjct: 200  YKGCANASDGATGLTGFCAG--RPPKDACQGDSGGPIFHKGEEGR-VQRGVVSWGDGGCG 256

Query: 1499 RANRPGVYTKVSNYIPWLYNNMAASEYNMMRNE 1531
                PGVYT VSNY  W+        Y   R  
Sbjct: 257  GTLIPGVYTNVSNYQDWIAAMTNGLSYLQFRPL 289


>gnl|CDD|216474 pfam01390, SEA, SEA domain.  Domain found in Sea urchin sperm
           protein, Enterokinase, Agrin (SEA). Proposed function of
           regulating or binding carbohydrate side chains. Recently
           a proteolytic activity has been shown for a SEA domain.
          Length = 107

 Score = 66.2 bits (162), Expect = 5e-13
 Identities = 30/107 (28%), Positives = 53/107 (49%), Gaps = 5/107 (4%)

Query: 70  VELIFDSSFRVTAGDSYNPSLENSTSNLYKEKSKRYKSMIEKLYNASVLSPAIKYCGVIG 129
           VE +F+ SFR+T    ++  L + +S  YKE ++R ++++ +++  S L P  K   V+ 
Sbjct: 2   VEQVFNGSFRIT-NLEFSEDLSDPSSPEYKELARRIENLLNEVFKKSSLKPGFKGVRVLS 60

Query: 130 FKNGSLIVFYRIILDRRKIPRSIGNVEEVVKNILVDEITSRKAVAFK 176
           F+ GS++V Y +I        S  N   V + +L     S      K
Sbjct: 61  FRPGSVVVDYDVIFR----KPSSENGATVEEQLLEQLQQSNNIGNLK 103


>gnl|CDD|238060 cd00112, LDLa, Low Density Lipoprotein Receptor Class A domain, a
            cysteine-rich repeat that plays a central role in
            mammalian cholesterol metabolism; the receptor protein
            binds LDL and transports it into cells by endocytosis; 7
            successive cysteine-rich repeats of about 40 amino acids
            are present in the N-terminal of this multidomain
            membrane protein; other homologous domains occur in
            related receptors, including the very low-density
            lipoprotein receptor and the LDL receptor-related
            protein/alpha 2-macroglobulin receptor, and in proteins
            which are functionally unrelated, such as the C9
            component of complement; the binding of calcium is
            required for in vitro formation of the native disulfide
            isomer and is necessary in establishing and maintaining
            the modular structure.
          Length = 35

 Score = 54.9 bits (133), Expect = 9e-10
 Identities = 16/30 (53%), Positives = 19/30 (63%)

Query: 999  FRCGNGECVSIGSKCNQLVDCADGSDEKNC 1028
            FRC NG C+     C+   DC DGSDE+NC
Sbjct: 6    FRCANGRCIPSSWVCDGEDDCGDGSDEENC 35



 Score = 54.9 bits (133), Expect = 9e-10
 Identities = 16/30 (53%), Positives = 19/30 (63%)

Query: 1574 FRCGNGECVSIGSKCNQLVDCADGSDEKNC 1603
            FRC NG C+     C+   DC DGSDE+NC
Sbjct: 6    FRCANGRCIPSSWVCDGEDDCGDGSDEENC 35



 Score = 52.2 bits (126), Expect = 8e-09
 Identities = 16/36 (44%), Positives = 21/36 (58%), Gaps = 1/36 (2%)

Query: 1062 CSPGQYICPNSRVCIERTRLCDGIKDCPLGDDEKQC 1097
            C P ++ C N R CI  + +CDG  DC  G DE+ C
Sbjct: 1    CPPNEFRCANGR-CIPSSWVCDGEDDCGDGSDEENC 35


>gnl|CDD|200964 pfam00057, Ldl_recept_a, Low-density lipoprotein receptor domain
            class A. 
          Length = 37

 Score = 53.8 bits (130), Expect = 2e-09
 Identities = 18/38 (47%), Positives = 23/38 (60%), Gaps = 1/38 (2%)

Query: 1566 GKCEMNSSFRCGNGECVSIGSKCNQLVDCADGSDEKNC 1603
              C     F+CG+GEC+ +   C+   DC DGSDEKNC
Sbjct: 1    STC-GPDEFQCGSGECIPMSWVCDGDPDCEDGSDEKNC 37



 Score = 53.5 bits (129), Expect = 3e-09
 Identities = 19/38 (50%), Positives = 24/38 (63%), Gaps = 1/38 (2%)

Query: 991  SECEMNSSFRCGNGECVSIGSKCNQLVDCADGSDEKNC 1028
            S C     F+CG+GEC+ +   C+   DC DGSDEKNC
Sbjct: 1    STC-GPDEFQCGSGECIPMSWVCDGDPDCEDGSDEKNC 37



 Score = 43.8 bits (104), Expect = 8e-06
 Identities = 15/36 (41%), Positives = 20/36 (55%), Gaps = 1/36 (2%)

Query: 1062 CSPGQYICPNSRVCIERTRLCDGIKDCPLGDDEKQC 1097
            C P ++ C +   CI  + +CDG  DC  G DEK C
Sbjct: 3    CGPDEFQCGSGE-CIPMSWVCDGDPDCEDGSDEKNC 37


>gnl|CDD|197566 smart00192, LDLa, Low-density lipoprotein receptor domain class A.
            Cysteine-rich repeat in the low-density lipoprotein (LDL)
            receptor that plays a central role in mammalian
            cholesterol metabolism. The N-terminal type A repeats in
            LDL receptor bind the lipoproteins. Other homologous
            domains occur in related receptors, including the very
            low-density lipoprotein receptor and the LDL
            receptor-related protein/alpha 2-macroglobulin receptor,
            and in proteins which are functionally unrelated, such as
            the C9 component of complement. Mutations in the LDL
            receptor gene cause familial hypercholesterolemia.
          Length = 33

 Score = 47.2 bits (113), Expect = 4e-07
 Identities = 13/27 (48%), Positives = 17/27 (62%)

Query: 999  FRCGNGECVSIGSKCNQLVDCADGSDE 1025
            F+C NG C+     C+ + DC DGSDE
Sbjct: 7    FQCDNGRCIPSSWVCDGVDDCGDGSDE 33



 Score = 47.2 bits (113), Expect = 4e-07
 Identities = 13/27 (48%), Positives = 17/27 (62%)

Query: 1574 FRCGNGECVSIGSKCNQLVDCADGSDE 1600
            F+C NG C+     C+ + DC DGSDE
Sbjct: 7    FQCDNGRCIPSSWVCDGVDDCGDGSDE 33



 Score = 46.5 bits (111), Expect = 9e-07
 Identities = 16/33 (48%), Positives = 21/33 (63%), Gaps = 1/33 (3%)

Query: 1062 CSPGQYICPNSRVCIERTRLCDGIKDCPLGDDE 1094
            C PG++ C N R CI  + +CDG+ DC  G DE
Sbjct: 2    CPPGEFQCDNGR-CIPSSWVCDGVDDCGDGSDE 33



 Score = 37.2 bits (87), Expect = 0.002
 Identities = 10/27 (37%), Positives = 15/27 (55%)

Query: 1540 CNGHRCPLGECLPKARVCNGYMECSDG 1566
                +C  G C+P + VC+G  +C DG
Sbjct: 4    PGEFQCDNGRCIPSSWVCDGVDDCGDG 30


>gnl|CDD|220189 pfam09342, DUF1986, Domain of unknown function (DUF1986).  This
            domain is found in serine proteases and is predicted to
            contain disulphide bonds.
          Length = 267

 Score = 49.7 bits (118), Expect = 5e-06
 Identities = 35/120 (29%), Positives = 53/120 (44%), Gaps = 11/120 (9%)

Query: 1296 WPWQAALYKEGEFQCGATLISDQWLLSAGHCFY--RAQDDYWVARLGTLRRGTKLPSPYE 1353
            WPW A +Y EG ++C   LI   W+L +  C +    +  Y    LG  +    +  PYE
Sbjct: 16   WPWIAKVYVEGNYRCTGVLIDLSWVLVSHSCLWDTSLEHSYISVVLGGHKTLKSVKGPYE 75

Query: 1354 QLRPISKIILHPQYVDAGFINDISILKMKTP--FSNYVRPICLPHPNTPLTDGTLCTVVG 1411
            Q+  +      P+       + IS+L +K+P  FSN+V P  +P           C  VG
Sbjct: 76   QIYRVDCRKDLPR-------SKISLLHLKSPATFSNHVLPTFVPSTRNHNEKNNKCVTVG 128


>gnl|CDD|214554 smart00200, SEA, Domain found in sea urchin sperm protein,
           enterokinase, agrin.  Proposed function of regulating or
           binding carbohydrate sidechains.
          Length = 121

 Score = 45.5 bits (108), Expect = 1e-05
 Identities = 19/52 (36%), Positives = 33/52 (63%)

Query: 86  YNPSLENSTSNLYKEKSKRYKSMIEKLYNASVLSPAIKYCGVIGFKNGSLIV 137
           Y+PSLE+ +S  Y+E  +  + ++E++Y  + L P      VI F+NGS++V
Sbjct: 21  YSPSLEDPSSEEYQELVRDVEKLLEQIYGKTDLKPDFVGTEVIEFRNGSVVV 72


>gnl|CDD|225766 COG3225, GldG, ABC-type uncharacterized transport system involved
           in gliding motility, auxiliary component [Cell motility
           and secretion].
          Length = 538

 Score = 36.3 bits (84), Expect = 0.15
 Identities = 33/188 (17%), Positives = 57/188 (30%), Gaps = 22/188 (11%)

Query: 528 QLIHGPSSEFPVLQKIGNLDEVLKAYKANRTMSSIQKKNDFVSSETAFNGDLAIMESSNE 587
           +L+     +F +       D +L  Y         ++   FV             E   E
Sbjct: 114 KLVGFLLQQFDLRP--NGQDIILGNYSRISIDYEFEQVIPFVRPLE---------EKFLE 162

Query: 588 YQYAHTIRTPGNRHSPVVTLLPVRSNVGPG------KPLRPR-PYLGTRNNIGTTTITTI 640
           Y  A  +   G R   V  L+               + +RPR   +     I   T+  I
Sbjct: 163 YDLARLVIELGQRTQLVQGLMSSEPLSEIQLTNANQQEIRPRAFMVYLLQEIDLRTLKLI 222

Query: 641 PT--PTLEDDPHNIDSDYVDQHSNRGASMNIFKGHNLDYGALNTKDFYLPPPPNISDHII 698
            T  P L +    +    +D+ +       +  G  L    ++T  +YL     +     
Sbjct: 223 STRIPALVNVLLIVGPLNLDEQAAYDIDAFVLAGGKLL-AFVSTLSYYLNALYMVGP-KS 280

Query: 699 NDFSDSLL 706
           +D    LL
Sbjct: 281 SDLLPDLL 288


>gnl|CDD|224794 COG1882, PflD, Pyruvate-formate lyase [Energy production and
           conversion].
          Length = 755

 Score = 34.2 bits (79), Expect = 0.74
 Identities = 22/88 (25%), Positives = 33/88 (37%), Gaps = 5/88 (5%)

Query: 314 LENINEKIRITPSTNQP----RKRTSPIANKHAGLIETANDEPVFRETDLDDKMLRHSPL 369
           LE +   I+     NQ         S I    AG IE   +  V  +TD + K       
Sbjct: 54  LEKVEILIKDEELGNQAVDFDTAIISTITTHDAGYIEKELEPIVGLQTDEELKRALRPFG 113

Query: 370 ESF-AHNSLLDMYKPMMEEDEEIKTKSQ 396
               A  SL    + +  + E+I TK++
Sbjct: 114 GPRMAEGSLKAYGRELDPDIEKIFTKTR 141


>gnl|CDD|149682 pfam08702, Fib_alpha, Fibrinogen alpha/beta chain family.
           Fibrinogen is a protein involved in platelet aggregation
           and is essential for the coagulation of blood. This
           domain forms part of the central coiled coiled region of
           the protein which is formed from two sets of three
           non-identical chains (alpha, beta and gamma).
          Length = 146

 Score = 32.3 bits (74), Expect = 0.87
 Identities = 21/108 (19%), Positives = 38/108 (35%), Gaps = 20/108 (18%)

Query: 832 KDLLNKRDGDTKESGVKLE-------NNTSEADSIEKKVILVMSSNSSNMLNFNENRTSD 884
           +DLL+K + D  +    LE       N+TS A    K +   +            ++   
Sbjct: 24  QDLLDKYEKDVDKRIEDLENLLDQLANSTSSAHQYVKHIKDSL----------RGDQKQA 73

Query: 885 -DNDNKNKAMAQNLLTQMLEKYNRVITNDSSVSSLKYLIDQISHQHLK 931
             NDN   A +++L   +       I      S ++ L + +     K
Sbjct: 74  QPNDNIYNAYSKSLRKMIEYILETKINT--QESQIRVLQEVLRSNRSK 119


>gnl|CDD|147509 pfam05357, Phage_Coat_A, Phage Coat Protein A.  Infection of
           Escherichia coli by filamentous bacteriophages is
           mediated by the minor phage coat protein A and involves
           two distinct cellular receptors, the F' pilus and the
           periplasmic protein TolA. These two receptors are
           contacted in a sequential manner, such that binding of
           TolA by the extreme N-terminal domain is conditional on
           a primary interaction of the second coat protein A
           domain with the F' pilus.
          Length = 62

 Score = 29.9 bits (67), Expect = 1.2
 Identities = 10/30 (33%), Positives = 16/30 (53%)

Query: 87  NPSLENSTSNLYKEKSKRYKSMIEKLYNAS 116
            PSLE S  N++K ++ RY +    L   +
Sbjct: 27  KPSLEESQPNVWKFQNNRYANREGCLTVYT 56


>gnl|CDD|147167 pfam04867, DUF643, Protein of unknown function (DUF643).  Protein
           of unknown function found in Borrelia burgdorferi, the
           Lyme disease spirochete.
          Length = 114

 Score = 31.1 bits (70), Expect = 1.6
 Identities = 31/102 (30%), Positives = 43/102 (42%), Gaps = 18/102 (17%)

Query: 92  NSTSNLYKEKSKRYKSMIEKLYNASVLSPAIK---YCGVI-----GFKNG-SLIVFYRII 142
           N  S+ Y   SKR K  I KLY  S ++   K   Y  V        K G S+      I
Sbjct: 2   NEISDFYDNLSKRTKKEINKLYLTSQITLKQKRQIYSAVEKMQEYVIKTGKSVEEIINDI 61

Query: 143 LDRRKIPRSIGNVEEVVKNILVDEITSRKAVAFKNIKVDEND 184
           +D  K         E +K++L  +   +K   FKN+KVD + 
Sbjct: 62  IDPEK---------EFIKDVLKRKNLIKKYKNFKNMKVDFSY 94


>gnl|CDD|114270 pfam05539, Pneumo_att_G, Pneumovirinae attachment membrane
           glycoprotein G. 
          Length = 408

 Score = 32.3 bits (73), Expect = 2.4
 Identities = 16/86 (18%), Positives = 29/86 (33%), Gaps = 4/86 (4%)

Query: 392 KTKSQPISKPEAAMPKEEISGVGEAQVIVLPAASSSHELLLSPHGTLHSKPTTFRPHTYT 451
            T +    +P+   P  +    G  Q    P +++S +   +  G  H+      P   +
Sbjct: 223 GTTTSSNPEPQTEPPPSQRGPSGSPQH---PPSTTSQDQSTTGDGQEHT-QRRKTPPATS 278

Query: 452 KSRQTTQHSVPPEIVVTTEARKATPS 477
             R     + PP      E  + TP 
Sbjct: 279 NRRSPHSTATPPPTTKRQETGRPTPR 304


>gnl|CDD|226845 COG4421, COG4421, Capsular polysaccharide biosynthesis protein
           [Carbohydrate transport and metabolism].
          Length = 368

 Score = 31.7 bits (72), Expect = 3.6
 Identities = 18/72 (25%), Positives = 30/72 (41%), Gaps = 8/72 (11%)

Query: 385 MEEDEEIKTKSQ----PISKPEAAMPKEEISGVGEAQVIVLPAASSSHELLLSPHGT--- 437
           +  +EE++   Q     I +PE   P+E+     +A+VIV P  S     + +  G    
Sbjct: 240 LVNEEEVERLLQRSGLTIVRPETLGPREQARLFRKAKVIVGPHGSGLANAVFAAPGCKVV 299

Query: 438 -LHSKPTTFRPH 448
            +    T FR  
Sbjct: 300 EIQPGTTNFRSF 311


>gnl|CDD|226119 COG3591, COG3591, V8-like Glu-specific endopeptidase [Amino acid
            transport and metabolism].
          Length = 251

 Score = 30.8 bits (70), Expect = 4.6
 Identities = 14/38 (36%), Positives = 19/38 (50%), Gaps = 3/38 (7%)

Query: 1295 SWPWQAALYKE---GEFQCGATLISDQWLLSAGHCFYR 1329
             +P+ A +  E   G     ATLI    +L+AGHC Y 
Sbjct: 48   QFPYSAVVQFEAATGRLCTAATLIGPNTVLTAGHCIYS 85


>gnl|CDD|218601 pfam05477, SURF2, Surfeit locus protein 2 (SURF2).  Surfeit locus
           protein 2 is part of a group of at least six sequence
           unrelated genes (Surf-1 to Surf-6). The six Surfeit
           genes have been classified as housekeeping genes, being
           expressed in all tissue types tested and not containing
           a TATA box in their promoter region. The exact function
           of SURF2 is unknown.
          Length = 244

 Score = 30.7 bits (69), Expect = 5.0
 Identities = 18/88 (20%), Positives = 30/88 (34%), Gaps = 3/88 (3%)

Query: 684 DFYLPPPPNISDHIINDFSDSLLDKISVGDSPSLSEEAPPLDNGDDNFSTTEESRKVILN 743
           DF+ PP    SD   +D  DS+ D          +       + DD+F T +E    + +
Sbjct: 153 DFWEPPS---SDEDDSDSEDSMSDLYPPELFTLKNPGKEQNGDEDDDFETDDEDEMEVES 209

Query: 744 TEVVTSTRSLNLNGTHELTVKNERTGKM 771
            E+             +   KN +    
Sbjct: 210 PELQQKRSKKQSGSLTKKFKKNHKKKGP 237


>gnl|CDD|193472 pfam12999, PRKCSH-like, Glucosidase II beta subunit-like.  The
            sequences found in this family are similar to a region
            found in the beta-subunit of glucosidase II, which is
            also known as protein kinase C substrate 80K-H (PRKCSH).
            The enzyme catalyzes the sequential removal of two
            alpha-1,3-linked glucose residues in the second step of
            N-linked oligosaccharide processing. The beta subunit is
            required for the solubility and stability of the
            heterodimeric enzyme, and is involved in retaining the
            enzyme within the endoplasmic reticulum.
          Length = 176

 Score = 30.5 bits (69), Expect = 5.0
 Identities = 25/76 (32%), Positives = 31/76 (40%), Gaps = 23/76 (30%)

Query: 1018 DCADGSDEKNCS--------CAD--FLKSQFLTRKICDGIID---CWDFSDEYECEWCSP 1064
            DC DGSDE   +        CA+  F+     + K+ DG+ D   C D SDE        
Sbjct: 59   DCPDGSDEPGTNACSNGKFYCANEGFIPGYIPSFKVDDGVCDYDICCDGSDEALG----- 113

Query: 1065 GQYICPNSRVCIERTR 1080
                CPN   C E  R
Sbjct: 114  ---KCPNK--CGEIAR 124


>gnl|CDD|235943 PRK07133, PRK07133, DNA polymerase III subunits gamma and tau;
           Validated.
          Length = 725

 Score = 30.9 bits (70), Expect = 6.7
 Identities = 30/162 (18%), Positives = 54/162 (33%), Gaps = 15/162 (9%)

Query: 832 KDLLNKRDG-DTKESGVKLENNTSEADSIEKKVILVMSSNSSNMLNFNENRTSDDNDNKN 890
           K   N  D  D KE  ++ EN+      IE K             +     +S  N  +N
Sbjct: 370 KIEENSIDNLDIKEKKIENEND------IEGKSDTKNLEEGFETKDNKNKNSSFINKTEN 423

Query: 891 KAMAQNLLTQMLEKYNRVITNDS-SVSSLKYLIDQI--SHQHLKHTHQHNPDT-----NI 942
                 L  ++LEK   +I  ++        + + I  +       +Q+  DT       
Sbjct: 424 ILTNSPLKDELLEKTTEIINIENPQEFEFGQIGNDIISTEIAQLDENQNLIDTGEFDLEN 483

Query: 943 STKAIPSSFNFNQVNGIPILTKVYTKVSKTNSTSKERNQTEF 984
           +     +  N N+++     T   + +S   S +K      F
Sbjct: 484 NFSNSFNPENGNKIDENINETFDTSTISANLSENKTNFAQSF 525


>gnl|CDD|226908 COG4531, ZnuA, ABC-type Zn2+ transport system, periplasmic
           component/surface adhesin [Inorganic ion transport and
           metabolism].
          Length = 318

 Score = 30.5 bits (69), Expect = 8.0
 Identities = 15/72 (20%), Positives = 29/72 (40%), Gaps = 5/72 (6%)

Query: 363 MLRHSPLESFAHNSLLDMYKPMMEEDEEIKTKSQPISKPEAAMPKEEISGVGEAQVIVLP 422
           ++ H      +    L +          + T  +P+    +A+      GVGE +V++ P
Sbjct: 1   IMLHKKTLLLSALFALLLGSAPAAAAAAVVTSIKPLGFIASAIAD----GVGEPEVLL-P 55

Query: 423 AASSSHELLLSP 434
             +S H+  L P
Sbjct: 56  GGASPHDYSLRP 67


>gnl|CDD|130673 TIGR01612, 235kDa-fam, reticulocyte binding/rhoptry protein.  This
            model represents a group of paralogous families in
            plasmodium species alternately annotated as reticulocyte
            binding protein, 235-kDa family protein and rhoptry
            protein. Rhoptry protein is localized on the cell surface
            and is extremely large (although apparently lacking in
            repeat structure) and is important for the process of
            invasion of the RBCs by the parasite. These proteins are
            found in P. falciparum, P. vivax and P. yoelii.
          Length = 2757

 Score = 30.8 bits (69), Expect = 8.2
 Identities = 43/167 (25%), Positives = 76/167 (45%), Gaps = 14/167 (8%)

Query: 815  TSLENNENL-FSYGSEEHKDLLNKRDGDTKESG---VKLENNTSEADSIEKKVILVMSSN 870
            TSLE  + +  SYG    K  L K D + K+S      +E    + D I++K   + +  
Sbjct: 1207 TSLEEVKGINLSYGKNLGKLFLEKIDEEKKKSEHMIKAMEAYIEDLDEIKEKSPEIENEM 1266

Query: 871  SSNMLNFNENRT---SDDNDNKNKAMAQN---LLTQMLEKYNRVITNDSSVSSLKYLIDQ 924
               M    E  T   S D+D  +  +++     ++ + EK  ++I + S  S +   I +
Sbjct: 1267 GIEMDIKAEMETFNISHDDDKDHHIISKKHDENISDIREKSLKIIEDFSEESDIND-IKK 1325

Query: 925  ISHQHLKHTHQHNPDTNISTKAIPSSFNFNQVNGIP-ILTKV--YTK 968
               ++L    +HN D N+    I + +N  ++N I  I+ +V  YTK
Sbjct: 1326 ELQKNLLDAQKHNSDINLYLNEIANIYNILKLNKIKKIIDEVKEYTK 1372


>gnl|CDD|109198 pfam00131, Metallothio, Metallothionein. 
          Length = 62

 Score = 27.4 bits (61), Expect = 9.4
 Identities = 12/51 (23%), Positives = 15/51 (29%)

Query: 1556 VCNGYMECSDGKCEMNSSFRCGNGECVSIGSKCNQLVDCADGSDEKNCSCA 1606
             C     C    C+     +C    C S   KC     C   +    CSC 
Sbjct: 11   TCICGTSCKCTNCKCGPCKKCCCSCCCSGCCKCAGGCVCKGCTGPDKCSCC 61


>gnl|CDD|115185 pfam06513, DUF1103, Repeat of unknown function (DUF1103).  This
           family consists of several repeats of around 30 residues
           in length which are found specifically in
           mature-parasite-infected erythrocyte surface antigen
           proteins from Plasmodium falciparum. This family often
           found in conjunction with pfam00226.
          Length = 215

 Score = 29.8 bits (66), Expect = 9.8
 Identities = 20/77 (25%), Positives = 37/77 (48%), Gaps = 5/77 (6%)

Query: 828 SEEHKDLLNKRDGDTKESGVKLENNTSEADSIEKKVILVMSSNSSNMLNFNENRTSDDND 887
           +EE K+ + K+     E G+K EN+T   D +    I+             E    +D +
Sbjct: 136 TEEVKEEIKKQ----VEEGIK-ENDTEGKDKLIGPEIITEEVKEEIKKQVEEGIKENDTE 190

Query: 888 NKNKAMAQNLLTQMLEK 904
           NK+K + Q ++T+ ++K
Sbjct: 191 NKDKVIGQEIITEEVKK 207


  Database: CDD.v3.10
    Posted date:  Mar 20, 2013  7:55 AM
  Number of letters in database: 10,937,602
  Number of sequences in database:  44,354
  
Lambda     K      H
   0.315    0.131    0.390 

Gapped
Lambda     K      H
   0.267   0.0730    0.140 


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 44354
Number of Hits to DB: 79,481,594
Number of extensions: 7638899
Number of successful extensions: 6317
Number of sequences better than 10.0: 1
Number of HSP's gapped: 6281
Number of HSP's successfully gapped: 64
Length of query: 1634
Length of database: 10,937,602
Length adjustment: 110
Effective length of query: 1524
Effective length of database: 6,058,662
Effective search space: 9233400888
Effective search space used: 9233400888
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.3 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.6 bits)
S2: 66 (29.2 bits)
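

Note on the footer statistics: the effective lengths and the effective search space reported above can be reproduced from the other footer values. The sketch below assumes the usual BLAST convention: the length adjustment is subtracted once from the query and once per database sequence. The variable names are illustrative and are not part of the RPS-BLAST output.

```python
# Values copied from the statistics footer of this report.
query_length = 1634          # "Length of query"
db_letters = 10_937_602      # "Length of database"
db_sequences = 44_354        # "Number of Sequences"
length_adjustment = 110      # "Length adjustment"

# Assumed convention: subtract the adjustment once from the query
# and once for every sequence in the database.
effective_query = query_length - length_adjustment
effective_db = db_letters - db_sequences * length_adjustment
search_space = effective_query * effective_db

print(effective_query)  # 1524
print(effective_db)     # 6058662
print(search_space)     # 9233400888
```

All three results match the "Effective length of query", "Effective length of database" and "Effective search space" lines in the footer.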