RPS-BLAST 2.2.26 [Sep-21-2011]
Database: CDD.v3.10
44,354 sequences; 10,937,602 total letters
Searching..................................................done
Query= psy13967
(379 letters)
>gnl|CDD|219673 pfam07970, COPIIcoated_ERV, Endoplasmic reticulum vesicle
transporter. This family is conserved from plants and
fungi to humans. Erv46 works in close conjunction with
Erv41 and together they form a complex which cycles
between the endoplasmic reticulum and Golgi complex.
Erv46-41 interacts strongly with the endoplasmic
reticulum glucosidase II. Mammalian glucosidase II
comprises a catalytic alpha-subunit and a 58 kDa beta
subunit, which is required for ER localisation. All
proteins identified biochemically as Erv41p-Erv46p
interactors are localised to the early secretory pathway
and are involved in protein maturation and processing in
the ER and/or sorting into COPII vesicles for transport
to the Golgi.
Length = 222
Score = 243 bits (623), Expect = 9e-80
Identities = 97/225 (43%), Positives = 131/225 (58%), Gaps = 22/225 (9%)
Query: 151 CYGAETETRKCCNTCNEVKEAYRYKKWALPELDTIVQCKNEYSTEKLK--NTFTEGCQIY 208
CYGAE KCCNTC +V+EAY+ K WA P+L+ I QCK EY EKLK EGC++
Sbjct: 1 CYGAEDNNGKCCNTCEDVREAYKKKGWAFPDLENIEQCKREYV-EKLKAQKNSNEGCRVK 59
Query: 209 GYLEVNRVSGSFHIAPGLSYSINHVHVHDIQPYTSAAFNTTHHIRHLSFGIKLQDDDERR 268
G LEVNRV+G+FHIAPG S+ HVHD+ +T N +H I HLSFG +
Sbjct: 60 GTLEVNRVAGNFHIAPGRSFQEKGGHVHDLSLFTDEKLNFSHTINHLSFGEE--FPGGVT 117
Query: 269 KPLDGT--VAKAEEGASMFNYYIKIIPTIYERLDGSKL---------------GGGDGGM 311
PLDGT + ++G M++Y++K++PT YE L+G + GG GG+
Sbjct: 118 NPLDGTTKFVQTDKGYHMYSYFLKVVPTRYESLNGLIVETNQYSVTSHDRPVTGGSRGGV 177
Query: 312 PGIFFSYELSPLMVKITEKSKSLGHLWTKIMCNISGTYITFMLVD 356
PG+FF+Y+ SP+ V TE +S H T + I G + L+D
Sbjct: 178 PGVFFNYDFSPIKVINTEDRQSFSHFLTNLCAIIGGVFAVAGLID 222
>gnl|CDD|206021 pfam13850, ERGIC_N, Endoplasmic Reticulum-Golgi Intermediate
Compartment (ERGIC). This family is the N-terminal of
ERGIC proteins, ER-Golgi intermediate compartment
clusters, otherwise known as Ervs, and is associated
with family COPIIcoated_ERV, pfam07970.
Length = 105
Score = 138 bits (350), Expect = 2e-40
Identities = 54/105 (51%), Positives = 65/105 (61%)
Query: 6 RLKGLDAFTKPYEDFHEKTVYGGAVTIVCWLFISYLICVDVCDYFQVSTTEELFVDSSRG 65
+LK LDAF K EDF KT GG +T++ L I L ++ DY T EL VD+SRG
Sbjct: 1 KLKSLDAFPKTDEDFRIKTTSGGIITLISILIIIILFVSELRDYLTPVTRPELVVDTSRG 60
Query: 66 SKLPIHLDIVVPTISCDYLALDAVDSSGEQHLHVEHNIYKRRLDL 110
KL I+LDI P + CD L+LD +D SGE L VEHNI K RLD
Sbjct: 61 EKLRINLDITFPRLPCDLLSLDVMDVSGEHQLDVEHNIKKTRLDS 105
>gnl|CDD|111090 pfam02158, Neuregulin, Neuregulin family.
Length = 406
Score = 30.6 bits (68), Expect = 1.3
Identities = 15/47 (31%), Positives = 24/47 (51%), Gaps = 2/47 (4%)
Query: 113 KPIQEPQKEVVNAVKKKKVTTENGTTTTELE-DPNKCGSCYGAETET 158
+P EP K++ N+ ++ K T NG LE D + +E+ET
Sbjct: 308 EPSLEPAKKLTNS-RRAKRTKPNGHIANRLELDSDSSSESSNSESET 353
>gnl|CDD|225528 COG2981, CysZ, Uncharacterized protein involved in cysteine
biosynthesis [Amino acid transport and metabolism].
Length = 250
Score = 29.6 bits (67), Expect = 2.4
Identities = 8/24 (33%), Positives = 13/24 (54%), Gaps = 1/24 (4%)
Query: 27 GGAVTIVCW-LFISYLICVDVCDY 49
G V V W LF ++++ + DY
Sbjct: 159 GQTVAPVAWFLFTAWMLAIQYFDY 182
>gnl|CDD|184320 PRK13778, paaA, phenylacetate-CoA oxygenase subunit PaaA;
Provisional.
Length = 314
Score = 29.5 bits (67), Expect = 2.8
Identities = 12/35 (34%), Positives = 22/35 (62%), Gaps = 1/35 (2%)
Query: 247 NTTHHIRHLSFGIKLQDDDE-RRKPLDGTVAKAEE 280
++ H + +++ IK +DE R+K +D TV +AE
Sbjct: 213 DSPHSAQSMAWKIKRFSNDELRQKFVDATVPQAEV 247
>gnl|CDD|212490 cd11679, archaeal_Sm_like, archaeal Sm-related protein. Archaeal
Sm-related proteins: The Sm proteins are conserved in
all three domains of life and are always associated with
U-rich RNA sequences. They function to mediate RNA-RNA
interactions and RNA biogenesis. All Sm proteins contain
a common sequence motif in two segments, Sm1 and Sm2,
separated by a short variable linker. Eukaryotic Sm
proteins form part of specific small nuclear
ribonucleoproteins (snRNPs) that are involved in the
processing of pre-mRNAs to mature mRNAs, and are a major
component of the eukaryotic spliceosome. Most snRNPs
consist of seven Sm proteins (B/B', D1, D2, D3, E, F and
G) arranged in a ring on a uridine-rich sequence (Sm
site), plus a small nuclear RNA (snRNA) (either U1, U2,
U5 or U4/6). Since archaebacteria do not have any
splicing apparatus, their Sm proteins may play a more
general role. Archaeal Lsm proteins are likely to
represent the ancestral Sm domain.
Length = 65
Score = 27.2 bits (61), Expect = 3.1
Identities = 12/43 (27%), Positives = 23/43 (53%), Gaps = 2/43 (4%)
Query: 313 GIFFSYELSPLMVKITEKSKSLGHLWTKIMCNISGTYITFMLV 355
G ++ S L + +T S G+ + K++ I+G I+ +LV
Sbjct: 25 GQLVGFDPSSLNIVLTNAKDSSGNKFPKVI--INGNRISEILV 65
>gnl|CDD|220713 pfam10356, DUF2034, Protein of unknown function (DUF2034). This
protein is expressed in fungi but its function is
unknown.
Length = 185
Score = 28.6 bits (64), Expect = 3.5
Identities = 17/80 (21%), Positives = 24/80 (30%), Gaps = 5/80 (6%)
Query: 142 LEDPNKCGSCYGAETETRKC-CNTCNEVKEAYRYKKWALPELDTIVQCKNEYSTEKLK-N 199
L D + + K C + YR A P L +VQCK +K+
Sbjct: 45 LPDIWEIERDMPVDEFPSKLKCGSIRLKPLKYRIIPLA-PPLRVLVQCKAL--KKKIGPR 101
Query: 200 TFTEGCQIYGYLEVNRVSGS 219
E + R S
Sbjct: 102 LVRELEGTFSSHVFRRARNS 121
>gnl|CDD|130130 TIGR01058, parE_Gpos, DNA topoisomerase IV, B subunit,
Gram-positive. Operationally, topoisomerase IV is a
type II topoisomerase required for the decatenation step
of chromosome segregation. Not every bacterium has both
a topo II and a topo IV. The topo IV families of the
Gram-positive bacteria and the Gram-negative bacteria
appear not to represent a single clade among the type II
topoisomerases, and are represented by separate models
for this reason [DNA metabolism, DNA replication,
recombination, and repair].
Length = 637
Score = 29.1 bits (65), Expect = 4.9
Identities = 24/110 (21%), Positives = 35/110 (31%), Gaps = 33/110 (30%)
Query: 61 DSSRGSKLPIHLDIVVPTISCDYLALDAVDSSGEQ---------------------HLHV 99
D RG IH D + T+ + L A + L V
Sbjct: 72 DDGRGIPTGIHQDGNISTVETVFTVLHAGGKFDQGGYKTAGGLHGVGASVVNALSSWLEV 131
Query: 100 E----HNIYKRRLDLDGKPIQEPQKEVVNAVKKKKVTTENGTTTTELEDP 145
IY++R + GK +Q ++KK T + GT DP
Sbjct: 132 TVKRDGQIYQQRFENGGKIVQ--------SLKKIGTTKKTGTLVHFHPDP 173
>gnl|CDD|221396 pfam12051, DUF3533, Protein of unknown function (DUF3533). This
family of transmembrane proteins is functionally
uncharacterized. This protein is found in bacteria and
eukaryotes. Proteins in this family are typically
between 393 to 772 amino acids in length.
Length = 379
Score = 28.0 bits (63), Expect = 9.9
Identities = 7/33 (21%), Positives = 14/33 (42%)
Query: 241 YTSAAFNTTHHIRHLSFGIKLQDDDERRKPLDG 273
Y + +N + + +L + QD P+ G
Sbjct: 16 YWGSLYNRSDRLHNLKVLVVNQDGGGGTVPVIG 48
Database: CDD.v3.10
Posted date: Mar 20, 2013 7:55 AM
Number of letters in database: 10,937,602
Number of sequences in database: 44,354
Lambda K H
0.319 0.136 0.411
Gapped
Lambda K H
0.267 0.0838 0.140
Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 44354
Number of Hits to DB: 19,112,268
Number of extensions: 1813873
Number of successful extensions: 1335
Number of sequences better than 10.0: 1
Number of HSP's gapped: 1331
Number of HSP's successfully gapped: 12
Length of query: 379
Length of database: 10,937,602
Length adjustment: 99
Effective length of query: 280
Effective length of database: 6,546,556
Effective search space: 1833035680
Effective search space used: 1833035680
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.4 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.7 bits)
S2: 60 (26.7 bits)