RPS-BLAST 2.2.26 [Sep-21-2011]
Database: CDD.v3.10
44,354 sequences; 10,937,602 total letters
Searching..................................................done
Query= psy11546
(1634 letters)
>gnl|CDD|238113 cd00190, Tryp_SPc, Trypsin-like serine protease; Many of these are
synthesized as inactive precursor zymogens that are
cleaved during limited proteolysis to generate their
active forms. Alignment contains also inactive enzymes
that have substitutions of the catalytic triad residues.
Length = 232
Score = 303 bits (778), Expect = 4e-94
Identities = 110/238 (46%), Positives = 150/238 (63%), Gaps = 9/238 (3%)
Query: 1285 IVGGGNARLGSWPWQAAL-YKEGEFQCGATLISDQWLLSAGHCFYRAQDDYWVARLGTLR 1343
IVGG A++GS+PWQ +L Y G CG +LIS +W+L+A HC Y + + RLG+
Sbjct: 1 IVGGSEAKIGSFPWQVSLQYTGGRHFCGGSLISPRWVLTAAHCVYSSAPSNYTVRLGSHD 60
Query: 1344 RGTKLPSPYEQLRPISKIILHPQYVDAGFINDISILKMKTP--FSNYVRPICLPHPNTPL 1401
+ Q+ + K+I+HP Y + + NDI++LK+K P S+ VRPICLP L
Sbjct: 61 LSS--NEGGGQVIKVKKVIVHPNYNPSTYDNDIALLKLKRPVTLSDNVRPICLPSSGYNL 118
Query: 1402 TDGTLCTVVGWGQLFEIGRVFPDTLQEVQLPIISTAECRKRTLFLPLYRVTENMFCAGFE 1461
GT CTV GWG+ E G PD LQEV +PI+S AEC++ + +T+NM CAG
Sbjct: 119 PAGTTCTVSGWGRTSEGGP-LPDVLQEVNVPIVSNAECKRA--YSYGGTITDNMLCAGGL 175
Query: 1462 RGGRDACLGDSGGPLMCQEPDGRWSLMGVTSNGYGCARANRPGVYTKVSNYIPWLYNN 1519
GG+DAC GDSGGPL+C + +GR L+G+ S G GCAR N PGVYT+VS+Y+ W+
Sbjct: 176 EGGKDACQGDSGGPLVCND-NGRGVLVGIVSWGSGCARPNYPGVYTRVSSYLDWIQKT 232
>gnl|CDD|214473 smart00020, Tryp_SPc, Trypsin-like serine protease. Many of these
are synthesised as inactive precursor zymogens that are
cleaved during limited proteolysis to generate their
active forms. A few, however, are active as single chain
molecules, and others are inactive due to substitutions
of the catalytic triad residues.
Length = 229
Score = 302 bits (776), Expect = 7e-94
Identities = 112/235 (47%), Positives = 148/235 (62%), Gaps = 10/235 (4%)
Query: 1284 RIVGGGNARLGSWPWQAAL-YKEGEFQCGATLISDQWLLSAGHCFYRAQDDYWVARLGTL 1342
RIVGG A +GS+PWQ +L Y G CG +LIS +W+L+A HC + RLG+
Sbjct: 1 RIVGGSEANIGSFPWQVSLQYGGGRHFCGGSLISPRWVLTAAHCVRGSDPSNIRVRLGSH 60
Query: 1343 RRGTKLPSPYEQLRPISKIILHPQYVDAGFINDISILKMKTP--FSNYVRPICLPHPNTP 1400
+ Q+ +SK+I+HP Y + + NDI++LK+K P S+ VRPICLP N
Sbjct: 61 DLSS---GEEGQVIKVSKVIIHPNYNPSTYDNDIALLKLKEPVTLSDNVRPICLPSSNYN 117
Query: 1401 LTDGTLCTVVGWGQLFEIGRVFPDTLQEVQLPIISTAECRKRTLFLPLYRVTENMFCAGF 1460
+ GT CTV GWG+ E PDTLQEV +PI+S A CR+ + +T+NM CAG
Sbjct: 118 VPAGTTCTVSGWGRTSEGAGSLPDTLQEVNVPIVSNATCRRA--YSGGGAITDNMLCAGG 175
Query: 1461 ERGGRDACLGDSGGPLMCQEPDGRWSLMGVTSNGYGCARANRPGVYTKVSNYIPW 1515
GG+DAC GDSGGPL+C DGRW L+G+ S G GCAR +PGVYT+VS+Y+ W
Sbjct: 176 LEGGKDACQGDSGGPLVCN--DGRWVLVGIVSWGSGCARPGKPGVYTRVSSYLDW 228
>gnl|CDD|215708 pfam00089, Trypsin, Trypsin.
Length = 218
Score = 231 bits (591), Expect = 2e-69
Identities = 107/235 (45%), Positives = 140/235 (59%), Gaps = 20/235 (8%)
Query: 1285 IVGGGNARLGSWPWQAALY-KEGEFQCGATLISDQWLLSAGHCFYRAQDDYWVARLGTLR 1343
IVGG A+ GS+PWQ +L G+ CG +LIS+ W+L+A HC A+ LG
Sbjct: 1 IVGGDEAQPGSFPWQVSLQVSSGKHFCGGSLISENWVLTAAHCVSNAKS--VRVVLGAHN 58
Query: 1344 RGTKLPSPYEQLRPISKIILHPQYVDAGFINDISILKMKTP--FSNYVRPICLPHPNTPL 1401
+ EQ + K+I+HP Y + NDI++LK+K+P + VRPICLP ++ L
Sbjct: 59 IVLREGG--EQKFDVKKVIVHPNY-NPDTDNDIALLKLKSPVTLGDTVRPICLPTASSDL 115
Query: 1402 TDGTLCTVVGWGQLFEIGRVFPDTLQEVQLPIISTAECRKRTLFLPLYRVTENMFCAGFE 1461
GT CTV GWG +G PDTLQEV +P++S CR VT+NM CAG
Sbjct: 116 PVGTTCTVSGWGNTKTLGL--PDTLQEVTVPVVSRETCRSAYGG----TVTDNMICAGA- 168
Query: 1462 RGGRDACLGDSGGPLMCQEPDGRWSLMGVTSNGYGCARANRPGVYTKVSNYIPWL 1516
GG+DAC GDSGGPL+C DG L+G+ S GYGCA N PGVYT VS+Y+ W+
Sbjct: 169 -GGKDACQGDSGGPLVC--SDGE--LIGIVSWGYGCASGNYPGVYTPVSSYLDWI 218
>gnl|CDD|227927 COG5640, COG5640, Secreted trypsin-like serine protease
[Posttranslational modification, protein turnover,
chaperones].
Length = 413
Score = 101 bits (252), Expect = 3e-22
Identities = 76/273 (27%), Positives = 104/273 (38%), Gaps = 38/273 (13%)
Query: 1283 SRIVGGGNARLGSWPWQAAL------YKEGEFQCGATLISDQWLLSAGHCFYRAQDDYWV 1336
SRI+GG NA G +P AL Y G F CG + + +++L+A HC D
Sbjct: 31 SRIIGGSNANAGEYPSLVALVDRISDYVSGTF-CGGSKLGGRYVLTAAHCA----DASSP 85
Query: 1337 ARLGTLRRGTKLP-SPYEQLRPISKIILHPQYVDAGFINDISILKMKTPFSNYVRPICLP 1395
R L S + + I +H Y NDI++L++ S I
Sbjct: 86 ISSDVNRVVVDLNDSSQAERGHVRTIYVHEFYSPGNLGNDIAVLELARAASLPRVKITSF 145
Query: 1396 HPNTPLTDGTLCTVVGWGQLFEIGRVFPDT--------------LQEVQLPIISTAECRK 1441
+D L +V + F T L EV + + + C +
Sbjct: 146 DA----SDTFLNSVTTVSPMTNGT--FGVTTPSDVPRSSPKGTILHEVAVLFVPLSTCAQ 199
Query: 1442 RTLFLPLYRVTENM--FCAGFERGGRDACLGDSGGPLMCQEPDGRWSLMGVTSNGYG-CA 1498
+ FCAG R +DAC GDSGGP+ + +GR GV S G G C
Sbjct: 200 YKGCANASDGATGLTGFCAG--RPPKDACQGDSGGPIFHKGEEGR-VQRGVVSWGDGGCG 256
Query: 1499 RANRPGVYTKVSNYIPWLYNNMAASEYNMMRNE 1531
PGVYT VSNY W+ Y R
Sbjct: 257 GTLIPGVYTNVSNYQDWIAAMTNGLSYLQFRPL 289
>gnl|CDD|216474 pfam01390, SEA, SEA domain. Domain found in Sea urchin sperm
protein, Enterokinase, Agrin (SEA). Proposed function of
regulating or binding carbohydrate side chains. Recently
a proteolytic activity has been shown for a SEA domain.
Length = 107
Score = 66.2 bits (162), Expect = 5e-13
Identities = 30/107 (28%), Positives = 53/107 (49%), Gaps = 5/107 (4%)
Query: 70 VELIFDSSFRVTAGDSYNPSLENSTSNLYKEKSKRYKSMIEKLYNASVLSPAIKYCGVIG 129
VE +F+ SFR+T ++ L + +S YKE ++R ++++ +++ S L P K V+
Sbjct: 2 VEQVFNGSFRIT-NLEFSEDLSDPSSPEYKELARRIENLLNEVFKKSSLKPGFKGVRVLS 60
Query: 130 FKNGSLIVFYRIILDRRKIPRSIGNVEEVVKNILVDEITSRKAVAFK 176
F+ GS++V Y +I S N V + +L S K
Sbjct: 61 FRPGSVVVDYDVIFR----KPSSENGATVEEQLLEQLQQSNNIGNLK 103
>gnl|CDD|238060 cd00112, LDLa, Low Density Lipoprotein Receptor Class A domain, a
cysteine-rich repeat that plays a central role in
mammalian cholesterol metabolism; the receptor protein
binds LDL and transports it into cells by endocytosis; 7
successive cysteine-rich repeats of about 40 amino acids
are present in the N-terminal of this multidomain
membrane protein; other homologous domains occur in
related receptors, including the very low-density
lipoprotein receptor and the LDL receptor-related
protein/alpha 2-macroglobulin receptor, and in proteins
which are functionally unrelated, such as the C9
component of complement; the binding of calcium is
required for in vitro formation of the native disulfide
isomer and is necessary in establishing and maintaining
the modular structure.
Length = 35
Score = 54.9 bits (133), Expect = 9e-10
Identities = 16/30 (53%), Positives = 19/30 (63%)
Query: 999 FRCGNGECVSIGSKCNQLVDCADGSDEKNC 1028
FRC NG C+ C+ DC DGSDE+NC
Sbjct: 6 FRCANGRCIPSSWVCDGEDDCGDGSDEENC 35
Score = 54.9 bits (133), Expect = 9e-10
Identities = 16/30 (53%), Positives = 19/30 (63%)
Query: 1574 FRCGNGECVSIGSKCNQLVDCADGSDEKNC 1603
FRC NG C+ C+ DC DGSDE+NC
Sbjct: 6 FRCANGRCIPSSWVCDGEDDCGDGSDEENC 35
Score = 52.2 bits (126), Expect = 8e-09
Identities = 16/36 (44%), Positives = 21/36 (58%), Gaps = 1/36 (2%)
Query: 1062 CSPGQYICPNSRVCIERTRLCDGIKDCPLGDDEKQC 1097
C P ++ C N R CI + +CDG DC G DE+ C
Sbjct: 1 CPPNEFRCANGR-CIPSSWVCDGEDDCGDGSDEENC 35
>gnl|CDD|200964 pfam00057, Ldl_recept_a, Low-density lipoprotein receptor domain
class A.
Length = 37
Score = 53.8 bits (130), Expect = 2e-09
Identities = 18/38 (47%), Positives = 23/38 (60%), Gaps = 1/38 (2%)
Query: 1566 GKCEMNSSFRCGNGECVSIGSKCNQLVDCADGSDEKNC 1603
C F+CG+GEC+ + C+ DC DGSDEKNC
Sbjct: 1 STC-GPDEFQCGSGECIPMSWVCDGDPDCEDGSDEKNC 37
Score = 53.5 bits (129), Expect = 3e-09
Identities = 19/38 (50%), Positives = 24/38 (63%), Gaps = 1/38 (2%)
Query: 991 SECEMNSSFRCGNGECVSIGSKCNQLVDCADGSDEKNC 1028
S C F+CG+GEC+ + C+ DC DGSDEKNC
Sbjct: 1 STC-GPDEFQCGSGECIPMSWVCDGDPDCEDGSDEKNC 37
Score = 43.8 bits (104), Expect = 8e-06
Identities = 15/36 (41%), Positives = 20/36 (55%), Gaps = 1/36 (2%)
Query: 1062 CSPGQYICPNSRVCIERTRLCDGIKDCPLGDDEKQC 1097
C P ++ C + CI + +CDG DC G DEK C
Sbjct: 3 CGPDEFQCGSGE-CIPMSWVCDGDPDCEDGSDEKNC 37
>gnl|CDD|197566 smart00192, LDLa, Low-density lipoprotein receptor domain class A.
Cysteine-rich repeat in the low-density lipoprotein (LDL)
receptor that plays a central role in mammalian
cholesterol metabolism. The N-terminal type A repeats in
LDL receptor bind the lipoproteins. Other homologous
domains occur in related receptors, including the very
low-density lipoprotein receptor and the LDL
receptor-related protein/alpha 2-macroglobulin receptor,
and in proteins which are functionally unrelated, such as
the C9 component of complement. Mutations in the LDL
receptor gene cause familial hypercholesterolemia.
Length = 33
Score = 47.2 bits (113), Expect = 4e-07
Identities = 13/27 (48%), Positives = 17/27 (62%)
Query: 999 FRCGNGECVSIGSKCNQLVDCADGSDE 1025
F+C NG C+ C+ + DC DGSDE
Sbjct: 7 FQCDNGRCIPSSWVCDGVDDCGDGSDE 33
Score = 47.2 bits (113), Expect = 4e-07
Identities = 13/27 (48%), Positives = 17/27 (62%)
Query: 1574 FRCGNGECVSIGSKCNQLVDCADGSDE 1600
F+C NG C+ C+ + DC DGSDE
Sbjct: 7 FQCDNGRCIPSSWVCDGVDDCGDGSDE 33
Score = 46.5 bits (111), Expect = 9e-07
Identities = 16/33 (48%), Positives = 21/33 (63%), Gaps = 1/33 (3%)
Query: 1062 CSPGQYICPNSRVCIERTRLCDGIKDCPLGDDE 1094
C PG++ C N R CI + +CDG+ DC G DE
Sbjct: 2 CPPGEFQCDNGR-CIPSSWVCDGVDDCGDGSDE 33
Score = 37.2 bits (87), Expect = 0.002
Identities = 10/27 (37%), Positives = 15/27 (55%)
Query: 1540 CNGHRCPLGECLPKARVCNGYMECSDG 1566
+C G C+P + VC+G +C DG
Sbjct: 4 PGEFQCDNGRCIPSSWVCDGVDDCGDG 30
>gnl|CDD|220189 pfam09342, DUF1986, Domain of unknown function (DUF1986). This
domain is found in serine proteases and is predicted to
contain disulphide bonds.
Length = 267
Score = 49.7 bits (118), Expect = 5e-06
Identities = 35/120 (29%), Positives = 53/120 (44%), Gaps = 11/120 (9%)
Query: 1296 WPWQAALYKEGEFQCGATLISDQWLLSAGHCFY--RAQDDYWVARLGTLRRGTKLPSPYE 1353
WPW A +Y EG ++C LI W+L + C + + Y LG + + PYE
Sbjct: 16 WPWIAKVYVEGNYRCTGVLIDLSWVLVSHSCLWDTSLEHSYISVVLGGHKTLKSVKGPYE 75
Query: 1354 QLRPISKIILHPQYVDAGFINDISILKMKTP--FSNYVRPICLPHPNTPLTDGTLCTVVG 1411
Q+ + P+ + IS+L +K+P FSN+V P +P C VG
Sbjct: 76 QIYRVDCRKDLPR-------SKISLLHLKSPATFSNHVLPTFVPSTRNHNEKNNKCVTVG 128
>gnl|CDD|214554 smart00200, SEA, Domain found in sea urchin sperm protein,
enterokinase, agrin. Proposed function of regulating or
binding carbohydrate sidechains.
Length = 121
Score = 45.5 bits (108), Expect = 1e-05
Identities = 19/52 (36%), Positives = 33/52 (63%)
Query: 86 YNPSLENSTSNLYKEKSKRYKSMIEKLYNASVLSPAIKYCGVIGFKNGSLIV 137
Y+PSLE+ +S Y+E + + ++E++Y + L P VI F+NGS++V
Sbjct: 21 YSPSLEDPSSEEYQELVRDVEKLLEQIYGKTDLKPDFVGTEVIEFRNGSVVV 72
>gnl|CDD|225766 COG3225, GldG, ABC-type uncharacterized transport system involved
in gliding motility, auxiliary component [Cell motility
and secretion].
Length = 538
Score = 36.3 bits (84), Expect = 0.15
Identities = 33/188 (17%), Positives = 57/188 (30%), Gaps = 22/188 (11%)
Query: 528 QLIHGPSSEFPVLQKIGNLDEVLKAYKANRTMSSIQKKNDFVSSETAFNGDLAIMESSNE 587
+L+ +F + D +L Y ++ FV E E
Sbjct: 114 KLVGFLLQQFDLRP--NGQDIILGNYSRISIDYEFEQVIPFVRPLE---------EKFLE 162
Query: 588 YQYAHTIRTPGNRHSPVVTLLPVRSNVGPG------KPLRPR-PYLGTRNNIGTTTITTI 640
Y A + G R V L+ + +RPR + I T+ I
Sbjct: 163 YDLARLVIELGQRTQLVQGLMSSEPLSEIQLTNANQQEIRPRAFMVYLLQEIDLRTLKLI 222
Query: 641 PT--PTLEDDPHNIDSDYVDQHSNRGASMNIFKGHNLDYGALNTKDFYLPPPPNISDHII 698
T P L + + +D+ + + G L ++T +YL +
Sbjct: 223 STRIPALVNVLLIVGPLNLDEQAAYDIDAFVLAGGKLL-AFVSTLSYYLNALYMVGP-KS 280
Query: 699 NDFSDSLL 706
+D LL
Sbjct: 281 SDLLPDLL 288
>gnl|CDD|224794 COG1882, PflD, Pyruvate-formate lyase [Energy production and
conversion].
Length = 755
Score = 34.2 bits (79), Expect = 0.74
Identities = 22/88 (25%), Positives = 33/88 (37%), Gaps = 5/88 (5%)
Query: 314 LENINEKIRITPSTNQP----RKRTSPIANKHAGLIETANDEPVFRETDLDDKMLRHSPL 369
LE + I+ NQ S I AG IE + V +TD + K
Sbjct: 54 LEKVEILIKDEELGNQAVDFDTAIISTITTHDAGYIEKELEPIVGLQTDEELKRALRPFG 113
Query: 370 ESF-AHNSLLDMYKPMMEEDEEIKTKSQ 396
A SL + + + E+I TK++
Sbjct: 114 GPRMAEGSLKAYGRELDPDIEKIFTKTR 141
>gnl|CDD|149682 pfam08702, Fib_alpha, Fibrinogen alpha/beta chain family.
Fibrinogen is a protein involved in platelet aggregation
and is essential for the coagulation of blood. This
domain forms part of the central coiled-coil region of
the protein which is formed from two sets of three
non-identical chains (alpha, beta and gamma).
Length = 146
Score = 32.3 bits (74), Expect = 0.87
Identities = 21/108 (19%), Positives = 38/108 (35%), Gaps = 20/108 (18%)
Query: 832 KDLLNKRDGDTKESGVKLE-------NNTSEADSIEKKVILVMSSNSSNMLNFNENRTSD 884
+DLL+K + D + LE N+TS A K + + ++
Sbjct: 24 QDLLDKYEKDVDKRIEDLENLLDQLANSTSSAHQYVKHIKDSL----------RGDQKQA 73
Query: 885 -DNDNKNKAMAQNLLTQMLEKYNRVITNDSSVSSLKYLIDQISHQHLK 931
NDN A +++L + I S ++ L + + K
Sbjct: 74 QPNDNIYNAYSKSLRKMIEYILETKINT--QESQIRVLQEVLRSNRSK 119
>gnl|CDD|147509 pfam05357, Phage_Coat_A, Phage Coat Protein A. Infection of
Escherichia coli by filamentous bacteriophages is
mediated by the minor phage coat protein A and involves
two distinct cellular receptors, the F' pilus and the
periplasmic protein TolA. These two receptors are
contacted in a sequential manner, such that binding of
TolA by the extreme N-terminal domain is conditional on
a primary interaction of the second coat protein A
domain with the F' pilus.
Length = 62
Score = 29.9 bits (67), Expect = 1.2
Identities = 10/30 (33%), Positives = 16/30 (53%)
Query: 87 NPSLENSTSNLYKEKSKRYKSMIEKLYNAS 116
PSLE S N++K ++ RY + L +
Sbjct: 27 KPSLEESQPNVWKFQNNRYANREGCLTVYT 56
>gnl|CDD|147167 pfam04867, DUF643, Protein of unknown function (DUF643). Protein
of unknown function found in Borrelia burgdorferi, the
Lyme disease spirochete.
Length = 114
Score = 31.1 bits (70), Expect = 1.6
Identities = 31/102 (30%), Positives = 43/102 (42%), Gaps = 18/102 (17%)
Query: 92 NSTSNLYKEKSKRYKSMIEKLYNASVLSPAIK---YCGVI-----GFKNG-SLIVFYRII 142
N S+ Y SKR K I KLY S ++ K Y V K G S+ I
Sbjct: 2 NEISDFYDNLSKRTKKEINKLYLTSQITLKQKRQIYSAVEKMQEYVIKTGKSVEEIINDI 61
Query: 143 LDRRKIPRSIGNVEEVVKNILVDEITSRKAVAFKNIKVDEND 184
+D K E +K++L + +K FKN+KVD +
Sbjct: 62 IDPEK---------EFIKDVLKRKNLIKKYKNFKNMKVDFSY 94
>gnl|CDD|114270 pfam05539, Pneumo_att_G, Pneumovirinae attachment membrane
glycoprotein G.
Length = 408
Score = 32.3 bits (73), Expect = 2.4
Identities = 16/86 (18%), Positives = 29/86 (33%), Gaps = 4/86 (4%)
Query: 392 KTKSQPISKPEAAMPKEEISGVGEAQVIVLPAASSSHELLLSPHGTLHSKPTTFRPHTYT 451
T + +P+ P + G Q P +++S + + G H+ P +
Sbjct: 223 GTTTSSNPEPQTEPPPSQRGPSGSPQH---PPSTTSQDQSTTGDGQEHT-QRRKTPPATS 278
Query: 452 KSRQTTQHSVPPEIVVTTEARKATPS 477
R + PP E + TP
Sbjct: 279 NRRSPHSTATPPPTTKRQETGRPTPR 304
>gnl|CDD|226845 COG4421, COG4421, Capsular polysaccharide biosynthesis protein
[Carbohydrate transport and metabolism].
Length = 368
Score = 31.7 bits (72), Expect = 3.6
Identities = 18/72 (25%), Positives = 30/72 (41%), Gaps = 8/72 (11%)
Query: 385 MEEDEEIKTKSQ----PISKPEAAMPKEEISGVGEAQVIVLPAASSSHELLLSPHGT--- 437
+ +EE++ Q I +PE P+E+ +A+VIV P S + + G
Sbjct: 240 LVNEEEVERLLQRSGLTIVRPETLGPREQARLFRKAKVIVGPHGSGLANAVFAAPGCKVV 299
Query: 438 -LHSKPTTFRPH 448
+ T FR
Sbjct: 300 EIQPGTTNFRSF 311
>gnl|CDD|226119 COG3591, COG3591, V8-like Glu-specific endopeptidase [Amino acid
transport and metabolism].
Length = 251
Score = 30.8 bits (70), Expect = 4.6
Identities = 14/38 (36%), Positives = 19/38 (50%), Gaps = 3/38 (7%)
Query: 1295 SWPWQAALYKE---GEFQCGATLISDQWLLSAGHCFYR 1329
+P+ A + E G ATLI +L+AGHC Y
Sbjct: 48 QFPYSAVVQFEAATGRLCTAATLIGPNTVLTAGHCIYS 85
>gnl|CDD|218601 pfam05477, SURF2, Surfeit locus protein 2 (SURF2). Surfeit locus
protein 2 is part of a group of at least six sequence
unrelated genes (Surf-1 to Surf-6). The six Surfeit
genes have been classified as housekeeping genes, being
expressed in all tissue types tested and not containing
a TATA box in their promoter region. The exact function
of SURF2 is unknown.
Length = 244
Score = 30.7 bits (69), Expect = 5.0
Identities = 18/88 (20%), Positives = 30/88 (34%), Gaps = 3/88 (3%)
Query: 684 DFYLPPPPNISDHIINDFSDSLLDKISVGDSPSLSEEAPPLDNGDDNFSTTEESRKVILN 743
DF+ PP SD +D DS+ D + + DD+F T +E + +
Sbjct: 153 DFWEPPS---SDEDDSDSEDSMSDLYPPELFTLKNPGKEQNGDEDDDFETDDEDEMEVES 209
Query: 744 TEVVTSTRSLNLNGTHELTVKNERTGKM 771
E+ + KN +
Sbjct: 210 PELQQKRSKKQSGSLTKKFKKNHKKKGP 237
>gnl|CDD|193472 pfam12999, PRKCSH-like, Glucosidase II beta subunit-like. The
sequences found in this family are similar to a region
found in the beta-subunit of glucosidase II, which is
also known as protein kinase C substrate 80K-H (PRKCSH).
The enzyme catalyzes the sequential removal of two
alpha-1,3-linked glucose residues in the second step of
N-linked oligosaccharide processing. The beta subunit is
required for the solubility and stability of the
heterodimeric enzyme, and is involved in retaining the
enzyme within the endoplasmic reticulum.
Length = 176
Score = 30.5 bits (69), Expect = 5.0
Identities = 25/76 (32%), Positives = 31/76 (40%), Gaps = 23/76 (30%)
Query: 1018 DCADGSDEKNCS--------CAD--FLKSQFLTRKICDGIID---CWDFSDEYECEWCSP 1064
DC DGSDE + CA+ F+ + K+ DG+ D C D SDE
Sbjct: 59 DCPDGSDEPGTNACSNGKFYCANEGFIPGYIPSFKVDDGVCDYDICCDGSDEALG----- 113
Query: 1065 GQYICPNSRVCIERTR 1080
CPN C E R
Sbjct: 114 ---KCPNK--CGEIAR 124
>gnl|CDD|235943 PRK07133, PRK07133, DNA polymerase III subunits gamma and tau;
Validated.
Length = 725
Score = 30.9 bits (70), Expect = 6.7
Identities = 30/162 (18%), Positives = 54/162 (33%), Gaps = 15/162 (9%)
Query: 832 KDLLNKRDG-DTKESGVKLENNTSEADSIEKKVILVMSSNSSNMLNFNENRTSDDNDNKN 890
K N D D KE ++ EN+ IE K + +S N +N
Sbjct: 370 KIEENSIDNLDIKEKKIENEND------IEGKSDTKNLEEGFETKDNKNKNSSFINKTEN 423
Query: 891 KAMAQNLLTQMLEKYNRVITNDS-SVSSLKYLIDQI--SHQHLKHTHQHNPDT-----NI 942
L ++LEK +I ++ + + I + +Q+ DT
Sbjct: 424 ILTNSPLKDELLEKTTEIINIENPQEFEFGQIGNDIISTEIAQLDENQNLIDTGEFDLEN 483
Query: 943 STKAIPSSFNFNQVNGIPILTKVYTKVSKTNSTSKERNQTEF 984
+ + N N+++ T + +S S +K F
Sbjct: 484 NFSNSFNPENGNKIDENINETFDTSTISANLSENKTNFAQSF 525
>gnl|CDD|226908 COG4531, ZnuA, ABC-type Zn2+ transport system, periplasmic
component/surface adhesin [Inorganic ion transport and
metabolism].
Length = 318
Score = 30.5 bits (69), Expect = 8.0
Identities = 15/72 (20%), Positives = 29/72 (40%), Gaps = 5/72 (6%)
Query: 363 MLRHSPLESFAHNSLLDMYKPMMEEDEEIKTKSQPISKPEAAMPKEEISGVGEAQVIVLP 422
++ H + L + + T +P+ +A+ GVGE +V++ P
Sbjct: 1 IMLHKKTLLLSALFALLLGSAPAAAAAAVVTSIKPLGFIASAIAD----GVGEPEVLL-P 55
Query: 423 AASSSHELLLSP 434
+S H+ L P
Sbjct: 56 GGASPHDYSLRP 67
>gnl|CDD|130673 TIGR01612, 235kDa-fam, reticulocyte binding/rhoptry protein. This
model represents a group of paralogous families in
Plasmodium species alternately annotated as reticulocyte
binding protein, 235-kDa family protein and rhoptry
protein. Rhoptry protein is localized on the cell surface
and is extremely large (although apparently lacking in
repeat structure) and is important for the process of
invasion of the RBCs by the parasite. These proteins are
found in P. falciparum, P. vivax and P. yoelii.
Length = 2757
Score = 30.8 bits (69), Expect = 8.2
Identities = 43/167 (25%), Positives = 76/167 (45%), Gaps = 14/167 (8%)
Query: 815 TSLENNENL-FSYGSEEHKDLLNKRDGDTKESG---VKLENNTSEADSIEKKVILVMSSN 870
TSLE + + SYG K L K D + K+S +E + D I++K + +
Sbjct: 1207 TSLEEVKGINLSYGKNLGKLFLEKIDEEKKKSEHMIKAMEAYIEDLDEIKEKSPEIENEM 1266
Query: 871 SSNMLNFNENRT---SDDNDNKNKAMAQN---LLTQMLEKYNRVITNDSSVSSLKYLIDQ 924
M E T S D+D + +++ ++ + EK ++I + S S + I +
Sbjct: 1267 GIEMDIKAEMETFNISHDDDKDHHIISKKHDENISDIREKSLKIIEDFSEESDIND-IKK 1325
Query: 925 ISHQHLKHTHQHNPDTNISTKAIPSSFNFNQVNGIP-ILTKV--YTK 968
++L +HN D N+ I + +N ++N I I+ +V YTK
Sbjct: 1326 ELQKNLLDAQKHNSDINLYLNEIANIYNILKLNKIKKIIDEVKEYTK 1372
>gnl|CDD|109198 pfam00131, Metallothio, Metallothionein.
Length = 62
Score = 27.4 bits (61), Expect = 9.4
Identities = 12/51 (23%), Positives = 15/51 (29%)
Query: 1556 VCNGYMECSDGKCEMNSSFRCGNGECVSIGSKCNQLVDCADGSDEKNCSCA 1606
C C C+ +C C S KC C + CSC
Sbjct: 11 TCICGTSCKCTNCKCGPCKKCCCSCCCSGCCKCAGGCVCKGCTGPDKCSCC 61
>gnl|CDD|115185 pfam06513, DUF1103, Repeat of unknown function (DUF1103). This
family consists of several repeats of around 30 residues
in length which are found specifically in
mature-parasite-infected erythrocyte surface antigen
proteins from Plasmodium falciparum. This family is often
found in conjunction with pfam00226.
Length = 215
Score = 29.8 bits (66), Expect = 9.8
Identities = 20/77 (25%), Positives = 37/77 (48%), Gaps = 5/77 (6%)
Query: 828 SEEHKDLLNKRDGDTKESGVKLENNTSEADSIEKKVILVMSSNSSNMLNFNENRTSDDND 887
+EE K+ + K+ E G+K EN+T D + I+ E +D +
Sbjct: 136 TEEVKEEIKKQ----VEEGIK-ENDTEGKDKLIGPEIITEEVKEEIKKQVEEGIKENDTE 190
Query: 888 NKNKAMAQNLLTQMLEK 904
NK+K + Q ++T+ ++K
Sbjct: 191 NKDKVIGQEIITEEVKK 207
Database: CDD.v3.10
Posted date: Mar 20, 2013 7:55 AM
Number of letters in database: 10,937,602
Number of sequences in database: 44,354
Lambda K H
0.315 0.131 0.390
Gapped
Lambda K H
0.267 0.0730 0.140
Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 44354
Number of Hits to DB: 79,481,594
Number of extensions: 7638899
Number of successful extensions: 6317
Number of sequences better than 10.0: 1
Number of HSP's gapped: 6281
Number of HSP's successfully gapped: 64
Length of query: 1634
Length of database: 10,937,602
Length adjustment: 110
Effective length of query: 1524
Effective length of database: 6,058,662
Effective search space: 9233400888
Effective search space used: 9233400888
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.3 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.6 bits)
S2: 66 (29.2 bits)