Query 027172
Match_columns 227
No_of_seqs 38 out of 40
Neff 2.6
Searched_HMMs 46136
Date Fri Mar 29 06:02:22 2013
Command hhsearch -i /work/01045/syshi/csienesis_hhblits_a3m/027172.a3m -d /work/01045/syshi/HHdatabase/Cdd.hhm -o /work/01045/syshi/hhsearch_cdd/027172hhsearch_cdd -cpu 12 -v 0
No Hit Prob E-value P-value Score SS Cols Query HMM Template HMM
1 PF07655 Secretin_N_2: Secreti 88.2 0.42 9.1E-06 36.7 2.5 26 177-202 67-94 (98)
2 PF05151 PsbM: Photosystem II 63.3 10 0.00022 25.2 2.9 19 70-88 11-29 (31)
3 PF09049 SNN_transmemb: Stanni 62.0 12 0.00026 25.1 3.1 26 56-82 4-29 (33)
4 PF09323 DUF1980: Domain of un 60.7 39 0.00085 28.2 6.8 32 109-140 30-61 (182)
5 PF14387 DUF4418: Domain of un 58.6 71 0.0015 26.1 7.8 76 53-130 31-120 (124)
6 PF14126 DUF4293: Domain of un 57.6 19 0.0004 29.9 4.3 72 68-139 7-78 (149)
7 PF06638 Strabismus: Strabismu 49.3 85 0.0019 31.7 8.0 83 53-136 85-188 (505)
8 PF02932 Neur_chan_memb: Neuro 43.7 1.3E+02 0.0027 22.3 6.4 16 70-85 6-21 (237)
9 CHL00036 ycf4 photosystem I as 43.5 44 0.00096 29.7 4.6 48 69-122 24-76 (184)
10 PF08910 Aida_N: Aida N-termin 41.4 5 0.00011 32.7 -1.4 36 145-195 6-41 (106)
11 PRK02542 photosystem I assembl 41.4 49 0.0011 29.5 4.6 29 69-97 28-56 (188)
12 TIGR01573 cas2 CRISPR-associat 39.0 29 0.00063 26.4 2.5 46 176-223 41-87 (95)
13 PRK12895 ubiA prenyltransferas 38.9 2.9E+02 0.0063 25.2 9.2 16 118-133 136-151 (286)
14 KOG3208 SNARE protein GS28 [In 38.7 27 0.00059 32.0 2.6 21 66-86 209-229 (231)
15 PF09827 CRISPR_Cas2: CRISPR a 38.6 30 0.00065 24.7 2.4 30 176-205 38-67 (78)
16 TIGR03038 PS_II_psbM photosyst 37.5 27 0.00059 23.5 1.8 18 71-88 12-29 (33)
17 PF02392 Ycf4: Ycf4; InterPro 33.8 84 0.0018 27.8 4.8 28 69-96 21-48 (180)
18 PF07069 PRRSV_2b: Porcine rep 33.6 76 0.0016 24.5 4.0 29 64-92 25-56 (73)
19 PRK10927 essential cell divisi 33.5 47 0.001 31.7 3.5 26 60-85 31-56 (319)
20 PRK04989 psbM photosystem II r 32.3 35 0.00077 23.2 1.8 19 71-89 12-30 (35)
21 CHL00080 psbM photosystem II p 32.0 35 0.00076 23.2 1.7 18 71-88 12-29 (34)
22 PF13994 PgaD: PgaD-like prote 30.9 2.4E+02 0.0052 22.8 6.7 71 63-135 13-86 (138)
23 PF07172 GRP: Glycine rich pro 30.8 43 0.00093 26.3 2.3 18 69-86 6-23 (95)
24 PRK04165 acetyl-CoA decarbonyl 28.9 53 0.0012 32.2 3.1 29 137-166 17-45 (450)
25 PF07204 Orthoreo_P10: Orthore 28.8 38 0.00082 27.6 1.7 26 113-138 42-67 (98)
26 PF03839 Sec62: Translocation 28.1 1.9E+02 0.0041 26.1 6.2 20 65-84 109-129 (224)
27 PF13303 PTS_EIIC_2: Phosphotr 26.8 2E+02 0.0044 27.1 6.4 22 76-97 130-151 (327)
28 PF05568 ASFV_J13L: African sw 26.2 81 0.0018 28.0 3.4 19 131-158 48-70 (189)
29 KOG3814 Signaling protein van 25.3 3.4E+02 0.0073 27.6 7.7 74 63-136 114-212 (531)
30 PHA02980 hypothetical protein; 25.2 1.1E+02 0.0024 26.4 4.0 27 61-87 100-126 (160)
31 PF01102 Glycophorin_A: Glycop 24.8 83 0.0018 26.0 3.1 45 108-162 61-105 (122)
32 PF00509 Hemagglutinin: Haemag 24.4 49 0.0011 33.7 2.0 43 141-191 61-105 (550)
33 PRK14584 hmsS hemin storage sy 21.7 4.3E+02 0.0093 22.8 6.9 13 123-135 73-85 (153)
34 PF11755 DUF3311: Protein of u 21.4 1.8E+02 0.0039 21.2 4.0 32 104-135 21-52 (66)
35 PF07829 Toxin_14: Alpha-A con 21.3 45 0.00097 21.4 0.7 9 205-213 12-20 (26)
36 PF07787 DUF1625: Protein of u 21.2 2E+02 0.0043 25.1 4.9 16 76-91 195-210 (248)
37 PRK00753 psbL photosystem II r 21.1 69 0.0015 22.3 1.6 19 68-86 19-37 (39)
38 PF02038 ATP1G1_PLM_MAT8: ATP1 20.5 54 0.0012 23.8 1.0 23 118-140 16-38 (50)
39 PRK13029 2-oxoacid ferredoxin 20.5 85 0.0018 34.6 2.9 28 179-209 607-634 (1186)
40 smart00508 PostSET Cysteine-ri 20.3 46 0.001 21.1 0.6 11 139-149 6-16 (26)
41 PRK10747 putative protoheme IX 20.1 1.1E+02 0.0025 27.9 3.3 21 145-166 89-109 (398)
42 COG0230 RpmH Ribosomal protein 20.0 19 0.00041 25.6 -1.3 16 189-204 19-34 (44)
No 1
>PF07655 Secretin_N_2: Secretin N-terminal domain; InterPro: IPR011514 This is a short domain found in bacterial type II/III secretory system proteins. The architecture of these proteins suggests that this family may be functionally analogous to IPR005644 from INTERPRO.; GO: 0009297 pilus assembly, 0019867 outer membrane
Probab=88.24 E-value=0.42 Score=36.72 Aligned_cols=26 Identities=31% Similarity=0.532 Sum_probs=22.5
Q ss_pred CCcchhHHHHHHHhhcC--CCCCcEEEE
Q 027172 177 LPRDHHRELQAELKKMA--PPNGRAVLV 202 (227)
Q Consensus 177 Lp~d~hreLeAELrKMA--PPNGRAVLv 202 (227)
.--|=+++||.||+.|+ |.+||.|++
T Consensus 67 s~~dfW~~L~~~l~~ilg~~~~Gr~vv~ 94 (98)
T PF07655_consen 67 SKSDFWEDLQKTLQAILGTPGDGRSVVS 94 (98)
T ss_pred ECCchHHHHHHHHHHHhCCCCCCCEEEe
Confidence 33456899999999999 899999987
No 2
>PF05151 PsbM: Photosystem II reaction centre M protein (PsbM); InterPro: IPR007826 Oxygenic photosynthesis uses two multi-subunit photosystems (I and II) located in the cell membranes of cyanobacteria and in the thylakoid membranes of chloroplasts in plants and algae. Photosystem II (PSII) has a P680 reaction centre containing chlorophyll 'a' that uses light energy to carry out the oxidation (splitting) of water molecules, and to produce ATP via a proton pump. Photosystem I (PSI) has a P700 reaction centre containing chlorophyll that takes the electron and associated hydrogen donated from PSII to reduce NADP+ to NADPH. Both ATP and NADPH are subsequently used in the light-independent reactions to convert carbon dioxide to glucose using the hydrogen atom extracted from water by PSII, releasing oxygen as a by-product. PSII is a multisubunit protein-pigment complex containing polypeptides both intrinsic and extrinsic to the photosynthetic membrane [, ]. Within the core of the complex, the chlorophyll and beta-carotene pigments are mainly bound to the antenna proteins CP43 (PsbC) and CP47 (PsbB), which pass the excitation energy on to the reaction centre proteins D1 (Qb, PsbA) and D2 (Qa, PsbD) that bind all the redox-active cofactors involved in the energy conversion process. The PSII oxygen-evolving complex (OEC) oxidises water to provide protons for use by PSI, and consists of OEE1 (PsbO), OEE2 (PsbP) and OEE3 (PsbQ). The remaining subunits in PSII are of low molecular weight (less than 10 kDa), and are involved in PSII assembly, stabilisation, dimerisation, and photo-protection []. This family represents the low molecular weight transmembrane protein PsbM found in PSII. PsbM is one of the most hydrophobic proteins in the thylakoid membrane. The function of this protein is unknown.; GO: 0015979 photosynthesis, 0019684 photosynthesis, light reaction, 0009523 photosystem II, 0016021 integral to membrane; PDB: 3A0H_m 3ARC_m 3A0B_M 3PRR_M 3PRQ_M 1S5L_M 4FBY_e 3BZ2_M 3BZ1_M 2AXT_M ....
Probab=63.26 E-value=10 Score=25.22 Aligned_cols=19 Identities=16% Similarity=0.513 Sum_probs=15.5
Q ss_pred HHHHHHHhhhhhhhhHHHH
Q 027172 70 ILIAVITACGFLLFPYIRV 88 (227)
Q Consensus 70 ILiaVl~a~~FLl~pY~~~ 88 (227)
+++.++.-.+||+.+|++-
T Consensus 11 taLfi~iPt~FLiilyvqT 29 (31)
T PF05151_consen 11 TALFILIPTAFLIILYVQT 29 (31)
T ss_dssp HHHHHHHHHHHHHHHHHHH
T ss_pred HHHHHHHHHHHHhheEeee
Confidence 4566778889999999984
No 3
>PF09049 SNN_transmemb: Stannin transmembrane; InterPro: IPR015135 This region consists of a single highly hydrophobic transmembrane helix that transverses the lipid bilayer at a 20 degree angle with respect to the membrane normal. It contains a conserved cysteine residue (Cys32) that, together with Cys34 found in the stannin unstructured linker domain, constitutes the putative trimethyltin-binding site that resides at the end of the transmembrane domain close to the lipid/solvent interface []. ; PDB: 1ZZA_A.
Probab=61.96 E-value=12 Score=25.11 Aligned_cols=26 Identities=23% Similarity=0.493 Sum_probs=14.9
Q ss_pred CCCCCcchhHHHHHHHHHHHHhhhhhh
Q 027172 56 CDRSRFRSAAVDVVILIAVITACGFLL 82 (227)
Q Consensus 56 C~hS~~psA~vD~lILiaVl~a~~FLl 82 (227)
-||| |...++-+++.+..++|+|-|+
T Consensus 4 ~dhs-pttgvvti~viliavaalg~li 29 (33)
T PF09049_consen 4 TDHS-PTTGVVTIIVILIAVAALGALI 29 (33)
T ss_dssp -TTT-THHHHHHHHHHHHHHHHHHHHH
T ss_pred ccCC-CCccEEEehhHHHHHHHHhhhh
Confidence 4888 3345556555555566776654
No 4
>PF09323 DUF1980: Domain of unknown function (DUF1980); InterPro: IPR015402 Members of this occur in gene pairs with members of PF03773 from PFAM. The N-terminal region contains several predicted transmembrane helix regions while the few invariant residues (G, CxxD, and W) occur in the C-terminal region. Members of this family are found in a set of prokaryotic hypothetical proteins. Their exact function has not, as yet, been defined.
Probab=60.67 E-value=39 Score=28.16 Aligned_cols=32 Identities=19% Similarity=0.171 Sum_probs=26.1
Q ss_pred ccchhHHHHHHHHHHHHHHHHHHHhhhcccCC
Q 027172 109 IGNPLIYSSIGVSMSCVAIATWVALLCTSRKC 140 (227)
Q Consensus 109 ~~~P~~y~~~~~~~~~aa~~~w~~~~c~sRKC 140 (227)
...|++++++.++.+.|++-+|..+.-..++|
T Consensus 30 ~~~~~~~~a~i~l~ilai~q~~~~~~~~~~~~ 61 (182)
T PF09323_consen 30 RYIPLLYFAAILLLILAIVQLWRWFRPKRRKE 61 (182)
T ss_pred cHHHHHHHHHHHHHHHHHHHHHHHHhcccccc
Confidence 34589998888888899999999888877765
No 5
>PF14387 DUF4418: Domain of unknown function (DUF4418)
Probab=58.64 E-value=71 Score=26.11 Aligned_cols=76 Identities=13% Similarity=0.108 Sum_probs=47.2
Q ss_pred CCCCCCCCcchhHHHHHHHHHHHHhhhhhh-hhHHHHHHHHhhhhhhhhhhhhhccc---------c----cchhHHHHH
Q 027172 53 TPACDRSRFRSAAVDVVILIAVITACGFLL-FPYIRVVSVKSVEVSAAVFYLVKEEV---------I----GNPLIYSSI 118 (227)
Q Consensus 53 ~~~C~hS~~psA~vD~lILiaVl~a~~FLl-~pY~~~i~~~~~~~~~~i~~l~~~~~---------~----~~P~~y~~~ 118 (227)
.=.|..+ ..|+.=+-++++++....|+. .+.++....=..-..+...+|++... . -.|+++...
T Consensus 31 ~M~Ch~t--g~a~~~ig~vi~~~~li~~~~k~~~~~~gl~i~~i~~gil~~lip~~lIG~C~~~~M~Ch~~T~p~v~v~~ 108 (124)
T PF14387_consen 31 HMKCHWT--GQAVTGIGAVIAVLSLIMLFVKNKKARIGLSIANIALGILVILIPTVLIGVCMMPTMHCHTVTKPAVRVLG 108 (124)
T ss_pred eeeehhH--HHHHHHHHHHHHHHHHHHHHhCcHHHHHHHHHHHHHHHHHHHHhhcccccCCCCCCCChhhhHHHHHHHHH
Confidence 4458888 778777777777777766666 46666555543333444455543211 1 238888777
Q ss_pred HHHHHHHHHHHH
Q 027172 119 GVSMSCVAIATW 130 (227)
Q Consensus 119 ~~~~~~aa~~~w 130 (227)
++.++.+++..|
T Consensus 109 ~l~iv~~~~~~f 120 (124)
T PF14387_consen 109 GLIIVIGIIYLF 120 (124)
T ss_pred HHHHHHHHHHHH
Confidence 777777766655
No 6
>PF14126 DUF4293: Domain of unknown function (DUF4293)
Probab=57.57 E-value=19 Score=29.93 Aligned_cols=72 Identities=17% Similarity=0.078 Sum_probs=39.1
Q ss_pred HHHHHHHHHhhhhhhhhHHHHHHHHhhhhhhhhhhhhhcccccchhHHHHHHHHHHHHHHHHHHHhhhcccC
Q 027172 68 VVILIAVITACGFLLFPYIRVVSVKSVEVSAAVFYLVKEEVIGNPLIYSSIGVSMSCVAIATWVALLCTSRK 139 (227)
Q Consensus 68 ~lILiaVl~a~~FLl~pY~~~i~~~~~~~~~~i~~l~~~~~~~~P~~y~~~~~~~~~aa~~~w~~~~c~sRK 139 (227)
+-.|++++.++..|++|...+...+..........-+..+.......-...++.+..++++.+.++.-+.||
T Consensus 7 lyLlla~i~~~~~l~~Pi~~~~~~~~~~~~~~~~l~~~~~~~~~~~~~~l~il~~l~~~lal~aIFlyKnR~ 78 (149)
T PF14126_consen 7 LYLLLAAILMGVLLFFPIWSFSSNGMGYSADLYNLGLVDAAGGKVSNTPLFILLVLSAILALIAIFLYKNRK 78 (149)
T ss_pred HHHHHHHHHHHHHHHhhhhhhccCcceeeeeehhhhcccccccchhhhHHHHHHHHHHHHHHHHHHccccHH
Confidence 346788888888899999988776654321111111111111111110345555666666777777776665
No 7
>PF06638 Strabismus: Strabismus protein; InterPro: IPR009539 This family consists of several strabismus (STB) or Van Gogh-like (VANGL) proteins 1 and 2. The exact function of this family is unknown. It is thought, however that STB1 gene and STB2 may be potent tumour suppressor gene candidates [].; GO: 0007275 multicellular organismal development, 0016021 integral to membrane
Probab=49.27 E-value=85 Score=31.70 Aligned_cols=83 Identities=16% Similarity=0.157 Sum_probs=48.8
Q ss_pred CCCCCCCCcchhHHHHHHHHHHHHhhhhhhhhHHH------------------HHHHHhhhhhhhhhhhhhcccccchhH
Q 027172 53 TPACDRSRFRSAAVDVVILIAVITACGFLLFPYIR------------------VVSVKSVEVSAAVFYLVKEEVIGNPLI 114 (227)
Q Consensus 53 ~~~C~hS~~psA~vD~lILiaVl~a~~FLl~pY~~------------------~i~~~~~~~~~~i~~l~~~~~~~~P~~ 114 (227)
...|.+.-....++ +|-+++++.-++||+.|++- +.+-.++.+++...++..+.-...|=+
T Consensus 85 ~~~c~r~l~~~~~~-~L~l~aflSPiaflvLP~il~~~~~~~C~~~CeGllislafKLliLlig~WAlf~R~~~a~lPRi 163 (505)
T PF06638_consen 85 GFDCSRYLGLILAS-ILGLLAFLSPIAFLVLPKILWRWQLEPCGAECEGLLISLAFKLLILLIGTWALFFRRPRADLPRI 163 (505)
T ss_pred CcccceeHHHHHHH-HHHHHHHHhhHHHHHhcccccCccccccCCcccceeHHHHHHHHHHHHHHHHHhcCcccCCCchh
Confidence 35587773233333 56678888899999999763 222222333333333345555666766
Q ss_pred HHHHH---HHHHHHHHHHHHHhhhc
Q 027172 115 YSSIG---VSMSCVAIATWVALLCT 136 (227)
Q Consensus 115 y~~~~---~~~~~aa~~~w~~~~c~ 136 (227)
+.+-+ +.+++.+++-|.++.=+
T Consensus 164 f~fRa~ll~Lvfl~~~syWLFY~vr 188 (505)
T PF06638_consen 164 FVFRALLLVLVFLFLFSYWLFYGVR 188 (505)
T ss_pred HHHHHHHHHHHHHHHHHHHHHhhhe
Confidence 65544 44455567788888775
No 8
>PF02932 Neur_chan_memb: Neurotransmitter-gated ion-channel transmembrane region ion channel family signature gamma-aminobutyric acid (GABA) receptor signature nicotinic acetylcholine receptor signature; InterPro: IPR006029 Neurotransmitter ligand-gated ion channels are transmembrane receptor-ion channel complexes that open transiently upon binding of specific ligands, allowing rapid transmission of signals at chemical synapses [, ]. Five of these ion channel receptor families have been shown to form a sequence-related superfamily: Nicotinic acetylcholine receptor (AchR), an excitatory cation channel in vertebrates and invertebrates; in vertebrate motor endplates it is composed of alpha, beta, gamma and delta/epsilon subunits; in neurons it is composed of alpha and non-alpha (or beta) subunits []. Glycine receptor, an inhibitory chloride ion channel composed of alpha and beta subunits []. Gamma-aminobutyric acid (GABA) receptor, an inhibitory chloride ion channel; at least four types of subunits (alpha, beta, gamma and delta) are known []. Serotonin 5HT3 receptor, of which there are seven major types (5HT3-5HT7) []. Glutamate receptor, an excitatory cation channel of which at least three types have been described (kainate, N-methyl-D-aspartate (NMDA) and quisqualate) []. These receptors possess a pentameric structure (made up of varying subunits), surrounding a central pore. All known sequences of subunits from neurotransmitter-gated ion-channels are structurally related. They are composed of a large extracellular glycosylated N-terminal ligand-binding domain, followed by three hydrophobic transmembrane regions which form the ionic channel, followed by an intracellular region of variable length. A fourth hydrophobic region is found at the C-terminal of the sequence [, ]. This domain represents four transmembrane helices of a variety of neurotransmitter-gated ion-channels.; GO: 0006811 ion transport, 0016020 membrane; PDB: 1DXZ_A 3MRA_A 1EQ8_C 1OED_C 2PR9_P 1A11_A 1CEK_A 2BG9_E 2KSR_A 2K59_B ....
Probab=43.71 E-value=1.3e+02 Score=22.25 Aligned_cols=16 Identities=44% Similarity=0.553 Sum_probs=12.6
Q ss_pred HHHHHHHhhhhhhhhH
Q 027172 70 ILIAVITACGFLLFPY 85 (227)
Q Consensus 70 ILiaVl~a~~FLl~pY 85 (227)
+||.++.-++|.+-|-
T Consensus 6 ~li~~~s~~~f~~~~~ 21 (237)
T PF02932_consen 6 ILIVVLSWLSFWLPPE 21 (237)
T ss_dssp HHHHHHHHHHHHHHHH
T ss_pred HHHHHHHHhheEeCcc
Confidence 5778888888888775
No 9
>CHL00036 ycf4 photosystem I assembly protein Ycf4
Probab=43.51 E-value=44 Score=29.66 Aligned_cols=48 Identities=17% Similarity=0.317 Sum_probs=31.4
Q ss_pred HHHHHHHHhhhhhhhhHHHHHHHHhhhhhhhhhhhhhcccccch--h---HHHHHHHHH
Q 027172 69 VILIAVITACGFLLFPYIRVVSVKSVEVSAAVFYLVKEEVIGNP--L---IYSSIGVSM 122 (227)
Q Consensus 69 lILiaVl~a~~FLl~pY~~~i~~~~~~~~~~i~~l~~~~~~~~P--~---~y~~~~~~~ 122 (227)
...++.+.++|||++.--+|+=..+.+++. .++....| + .|+.+|+..
T Consensus 24 wA~i~~~G~~GFll~g~SSYl~~~Llpf~~------~~~i~FiPQGivM~FYGi~gl~l 76 (184)
T CHL00036 24 WAFILFLGSLGFLLVGISSYLGKNLIPFLP------SQQILFFPQGIVMCFYGIAGLFI 76 (184)
T ss_pred HHHHHHhhhHHHHHhhhHHhhCcCccccCC------hhhCeEeCccHHHHHHHHHHHHH
Confidence 456778899999999888888777655432 34455555 2 355555433
No 10
>PF08910 Aida_N: Aida N-terminus; InterPro: IPR015006 This entry represents the axin interactor, dorsalization-associated protein family AIDA [].; PDB: 1UG7_A.
Probab=41.44 E-value=5 Score=32.70 Aligned_cols=36 Identities=33% Similarity=0.395 Sum_probs=26.9
Q ss_pred CccchhhhhhccccchhhhhcCCccccCCcccCCcchhHHHHHHHhhcCCC
Q 027172 145 CKGLKKAAEFDIQLETEECVKNKDSAKKGLFELPRDHHRELQAELKKMAPP 195 (227)
Q Consensus 145 CKGLkKA~EFDIQLeTEeCVk~~~~~k~gl~eLp~d~hreLeAELrKMAPP 195 (227)
|..|+||.||| .++-++|. -|+|+-|-.+|+|-+.-
T Consensus 6 ~~s~~ka~dfD--------------sWGQlvEA-~deY~~La~~l~k~~~~ 41 (106)
T PF08910_consen 6 HASFKKATDFD--------------SWGQLVEA-IDEYQRLARQLKKEVQS 41 (106)
T ss_dssp HHHHHHHHHHH--------------HHT-HHHH-HHHHHHHHHHHHHHHT-
T ss_pred HHHHHHhcCcc--------------hHHHHHHH-HHHHHHHHHHHHHHHhc
Confidence 56799999999 45666676 68899888888876654
No 11
>PRK02542 photosystem I assembly protein Ycf4; Provisional
Probab=41.42 E-value=49 Score=29.48 Aligned_cols=29 Identities=21% Similarity=0.209 Sum_probs=22.5
Q ss_pred HHHHHHHHhhhhhhhhHHHHHHHHhhhhh
Q 027172 69 VILIAVITACGFLLFPYIRVVSVKSVEVS 97 (227)
Q Consensus 69 lILiaVl~a~~FLl~pY~~~i~~~~~~~~ 97 (227)
..+++.+.++|||++.--+|+=..+.+++
T Consensus 28 wA~i~~~G~~GFll~g~sSYl~~~Llpf~ 56 (188)
T PRK02542 28 WASMVTIGGIGFLLAGLSSYLGRNLLPVG 56 (188)
T ss_pred HHHHHHhhhHHHHHhhhHHhhCcCccccC
Confidence 45677889999999988888877765543
No 12
>TIGR01573 cas2 CRISPR-associated endoribonuclease Cas2. This model describes most members of the family of Cas2, one of the first four protein families found to mark prokaryotic genomes that contain multiple CRISPR elements. It is an endoribonuclease, capable of cleaving single-stranded RNA. CRISPR is an acronym for Clustered Regularly Interspaced Short Palindromic Repeats. The cas genes are found near the repeats. A distinct branch of the Cas2 family shows a very low level of sequence identity and is modeled by TIGR01873 instead.
Probab=38.98 E-value=29 Score=26.37 Aligned_cols=46 Identities=17% Similarity=0.216 Sum_probs=34.7
Q ss_pred cCCcchhH-HHHHHHhhcCCCCCcEEEEeeccCCCccccccccCCcccc
Q 027172 176 ELPRDHHR-ELQAELKKMAPPNGRAVLVFRARCGCSVGRLEVPGPKKQR 223 (227)
Q Consensus 176 eLp~d~hr-eLeAELrKMAPPNGRAVLvFRArCGCpv~rLEvwGpKK~r 223 (227)
+++..++. +|+++|++..|++|. |.+++=.=.| ..+++++|-++..
T Consensus 41 ~~~~~~~~~~l~~~l~~~i~~~ds-v~i~~l~~~~-~~~~~~~G~~~~~ 87 (95)
T TIGR01573 41 ILEPNQLARKLIERLKRIIPDEGD-IRIYPLTEKQ-KAAAIVIGGEPVT 87 (95)
T ss_pred EcCHHHHHHHHHHHHHHhCCCCCe-EEEEEeChHH-hceEEEEeCCcCC
Confidence 36666788 799999999999885 8888753333 5688888876653
No 13
>PRK12895 ubiA prenyltransferase; Reviewed
Probab=38.87 E-value=2.9e+02 Score=25.18 Aligned_cols=16 Identities=19% Similarity=0.391 Sum_probs=7.4
Q ss_pred HHHHHHHHHHHHHHHh
Q 027172 118 IGVSMSCVAIATWVAL 133 (227)
Q Consensus 118 ~~~~~~~aa~~~w~~~ 133 (227)
+|+....+.+..|..+
T Consensus 136 lG~~~g~~~l~g~~Av 151 (286)
T PRK12895 136 MGSIIGLGVLAGYLAV 151 (286)
T ss_pred HHHHHHhHHHHHHHHH
Confidence 4444444555555443
No 14
>KOG3208 consensus SNARE protein GS28 [Intracellular trafficking, secretion, and vesicular transport]
Probab=38.71 E-value=27 Score=32.02 Aligned_cols=21 Identities=48% Similarity=0.741 Sum_probs=18.5
Q ss_pred HHHHHHHHHHHhhhhhhhhHH
Q 027172 66 VDVVILIAVITACGFLLFPYI 86 (227)
Q Consensus 66 vD~lILiaVl~a~~FLl~pY~ 86 (227)
-|.+||-+|+..|.+|++=|.
T Consensus 209 rdslILa~Vis~C~llllfy~ 229 (231)
T KOG3208|consen 209 RDSLILAAVISVCTLLLLFYW 229 (231)
T ss_pred hhhHHHHHHHHHHHHHHHHHH
Confidence 489999999999999998764
No 15
>PF09827 CRISPR_Cas2: CRISPR associated protein Cas2; InterPro: IPR019199 Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR) are a family of DNA direct repeats separated by regularly sized non-repetitive spacer sequences that are found in most bacterial and archaeal genomes []. CRISPRs appear to provide acquired resistance against bacteriophages, possibly acting with an RNA interference-like mechanism to inhibit gene functions of invasive DNA elements [, ]. Differences in the number and type of spacers between CRISPR repeats correlate with phage sensitivity. It is thought that following phage infection, bacteria integrate new spacers derived from phage genomic sequences, and that the removal or addition of particular spacers modifies the phage-resistance phenotype of the cell. Therefore, the specificity of CRISPRs may be determined by spacer-phage sequence similarity. In addition, there are many protein families known as CRISPR-associated sequences (Cas), which are encoded in the vicinity of CRISPR loci []. CRISPR/cas gene regions can be quite large, with up to 20 different, tandem-arranged cas genes next to a CRISPR cluster or filling the region between two repeat clusters. Cas genes and CRISPRs are found on mobile genetic elements such as plasmids, and have undergone extensive horizontal transfer. Cas proteins are thought to be involved in the propagation and functioning of CRISPRs. Some Cas proteins show similarity to helicases and repair proteins, although the functions of most are unknown. Cas families can be divided into subtypes according to operon organisation and phylogeny. Members of this family of bacterial proteins comprise various hypothetical proteins, as well as CRISPR (clustered regularly interspaced short palindromic repeats) associated proteins, conferring resistance to infection by certain bacteriophages. ; PDB: 3EXC_X 2I0X_A 3OQ2_B 3UI3_A 1ZPW_X 2I8E_A 2IVY_A.
Probab=38.56 E-value=30 Score=24.73 Aligned_cols=30 Identities=23% Similarity=0.494 Sum_probs=23.4
Q ss_pred cCCcchhHHHHHHHhhcCCCCCcEEEEeec
Q 027172 176 ELPRDHHRELQAELKKMAPPNGRAVLVFRA 205 (227)
Q Consensus 176 eLp~d~hreLeAELrKMAPPNGRAVLvFRA 205 (227)
++...+.++|++||++..+|+..-|.+|+-
T Consensus 38 ~~~~~~~~~l~~~l~~~i~~~~d~i~i~~l 67 (78)
T PF09827_consen 38 NLTNAELRKLRRELEKLIDPDEDSIRIYPL 67 (78)
T ss_dssp EE-HHHHHHHHHHHHHHSCTTTCEEEEEEE
T ss_pred EcCHHHHHHHHHHHHhhCCCCCCEEEEEEe
Confidence 345667889999999999999556777763
No 16
>TIGR03038 PS_II_psbM photosystem II reaction center protein PsbM. Members of this protein family are the photosystem II reaction center M protein, product of the psbM gene, in Cyanobacteria and their derived organelles in plants. This model resembles Pfam model pfam05151 but has cutoffs set to avoid false-positive matches to similar (not necessarily homologous) sequences in species that are not photosynthetic.
Probab=37.46 E-value=27 Score=23.52 Aligned_cols=18 Identities=33% Similarity=0.628 Sum_probs=13.0
Q ss_pred HHHHHHhhhhhhhhHHHH
Q 027172 71 LIAVITACGFLLFPYIRV 88 (227)
Q Consensus 71 LiaVl~a~~FLl~pY~~~ 88 (227)
++.++.-.+||+.+|++-
T Consensus 12 ~Lfi~iPt~FLiilYvqT 29 (33)
T TIGR03038 12 LLFILVPTVFLLILYIQT 29 (33)
T ss_pred HHHHHHHHHHHHHHheec
Confidence 445566778999998863
No 17
>PF02392 Ycf4: Ycf4; InterPro: IPR003359 Photosystem I (PSI) is a large protein complex embedded within the photosynthetic thylakoid membrane. It consists of 11 subunits, ~100 chlorophyll a molecules, 2 phylloquinones, and 3 Fe4S4-clusters. The three dimensional structure of the PSI complex has been resolved at 2.5 A [], which allows the precise localisation of each cofactor. PSI together with photosystem II (PSII) catalyses the light-induced steps in oxygenic photosynthesis - a process found in cyanobacteria, eukaryotic algae (e.g. red algae, green algae) and higher plants. To date, three thylakoid proteins involved in the stable accumulation of PSI have been identified: BtpA (IPR005137 from INTERPRO) [], Ycf3 [, ], and Ycf4 []. Because translation of the psaA and psaB mRNAs encoding the two reaction centre polypeptides, of PSI and PSII respectively, is not affected in mutant strains lacking functional ycf3 and ycf4, the products of these two genes appear to act at a post-translational step of PSI biosynthesis. These gene products are therefore involved either in the stabilisation or in the assembly of the PSI complex. However, their exact roles remain unknown. The BtpA protein appears to act at the level of PSI stabilisation []. It is an extrinsic membrane protein located on the cytoplasmic side of the thylakoid membrane [, ]. Homologs of BtpA are found in the crenarchaeota and euryarchaeota, where their function remains unknown. The Ycf4 protein is firmly associated with the thylakoid membrane, presumably through a transmembrane domain []. Ycf4 co-fractionates with a protein complex larger than PSI upon sucrose density gradient centrifugation of solubilised thylakoids []. The Ycf3 protein is loosely associated with the thylakoid membrane and can be released from the membrane with sodium carbonate. This suggests that Ycf3 is not part of a stable complex and that it probably interacts transiently with its partners []. Ycf3 contains a number of tetratrico peptide repeats (TPR, IPR001440 from INTERPRO); TPR is a structural motif present in a wide range of proteins, which mediates protein-protein interactions. ; GO: 0015979 photosynthesis, 0009522 photosystem I, 0009579 thylakoid, 0016021 integral to membrane
Probab=33.77 E-value=84 Score=27.78 Aligned_cols=28 Identities=18% Similarity=0.234 Sum_probs=22.2
Q ss_pred HHHHHHHHhhhhhhhhHHHHHHHHhhhh
Q 027172 69 VILIAVITACGFLLFPYIRVVSVKSVEV 96 (227)
Q Consensus 69 lILiaVl~a~~FLl~pY~~~i~~~~~~~ 96 (227)
..+++.+.++|||++--.+|+=..+.++
T Consensus 21 wa~ii~~G~lGFll~G~sSYl~~nll~~ 48 (180)
T PF02392_consen 21 WAFIIFLGGLGFLLVGISSYLGKNLLPF 48 (180)
T ss_pred HHHHHHHhhHHHHHhHHHHHhCCCcccc
Confidence 4567788999999998888887766544
No 18
>PF07069 PRRSV_2b: Porcine reproductive and respiratory syndrome virus 2b ; InterPro: IPR009775 This family consists of several Porcine reproductive and respiratory syndrome virus (PRRSV) ORF2b proteins. The function of this family is unknown however it is known that large amounts of 2b protein are present in the virion and it is thought that this protein may be an integral component of the virion [].
Probab=33.62 E-value=76 Score=24.47 Aligned_cols=29 Identities=31% Similarity=0.616 Sum_probs=18.2
Q ss_pred hHHHHHHHHHHHHhh---hhhhhhHHHHHHHH
Q 027172 64 AAVDVVILIAVITAC---GFLLFPYIRVVSVK 92 (227)
Q Consensus 64 A~vD~lILiaVl~a~---~FLl~pY~~~i~~~ 92 (227)
..||++|+++.+|.. |.|+.=-+++++..
T Consensus 25 sivdiiiflailfgftiagwlvvfcirlv~sa 56 (73)
T PF07069_consen 25 SIVDIIIFLAILFGFTIAGWLVVFCIRLVCSA 56 (73)
T ss_pred HHHHHHHHHHHHHhhHHHHHHHHHHHHHHHHH
Confidence 468999999998753 34444444444443
No 19
>PRK10927 essential cell division protein FtsN; Provisional
Probab=33.51 E-value=47 Score=31.66 Aligned_cols=26 Identities=15% Similarity=0.193 Sum_probs=21.2
Q ss_pred CcchhHHHHHHHHHHHHhhhhhhhhH
Q 027172 60 RFRSAAVDVVILIAVITACGFLLFPY 85 (227)
Q Consensus 60 ~~psA~vD~lILiaVl~a~~FLl~pY 85 (227)
.++-+++=+.+.|+|+|.+||+|+..
T Consensus 31 ~~~~~m~alAvavlv~fiGGLyFith 56 (319)
T PRK10927 31 AVSPAMVAIAAAVLVTFIGGLYFITH 56 (319)
T ss_pred CcchHHHHHHHHHHHHHhhheEEEec
Confidence 44678999999999999999887543
No 20
>PRK04989 psbM photosystem II reaction center protein M; Provisional
Probab=32.34 E-value=35 Score=23.24 Aligned_cols=19 Identities=32% Similarity=0.541 Sum_probs=13.4
Q ss_pred HHHHHHhhhhhhhhHHHHH
Q 027172 71 LIAVITACGFLLFPYIRVV 89 (227)
Q Consensus 71 LiaVl~a~~FLl~pY~~~i 89 (227)
++.|+.-.+||+..|++-.
T Consensus 12 ~Lfi~iPt~FLlilYvqT~ 30 (35)
T PRK04989 12 LLFVLVPTVFLIILYIQTN 30 (35)
T ss_pred HHHHHHHHHHHHHHheecc
Confidence 4455667788999888743
No 21
>CHL00080 psbM photosystem II protein M
Probab=32.00 E-value=35 Score=23.16 Aligned_cols=18 Identities=22% Similarity=0.602 Sum_probs=12.6
Q ss_pred HHHHHHhhhhhhhhHHHH
Q 027172 71 LIAVITACGFLLFPYIRV 88 (227)
Q Consensus 71 LiaVl~a~~FLl~pY~~~ 88 (227)
++.|+.--+||+.+|++-
T Consensus 12 ~LFi~iPt~FLlilyvkT 29 (34)
T CHL00080 12 ALFILVPTAFLLIIYVKT 29 (34)
T ss_pred HHHHHHHHHHHHHhheee
Confidence 445556678888888764
No 22
>PF13994 PgaD: PgaD-like protein
Probab=30.88 E-value=2.4e+02 Score=22.79 Aligned_cols=71 Identities=10% Similarity=0.107 Sum_probs=39.3
Q ss_pred hhHHHHHHHHHHHHhhhhhhhhHHHHHHHHhhhhhh---hhhhhhhcccccchhHHHHHHHHHHHHHHHHHHHhhh
Q 027172 63 SAAVDVVILIAVITACGFLLFPYIRVVSVKSVEVSA---AVFYLVKEEVIGNPLIYSSIGVSMSCVAIATWVALLC 135 (227)
Q Consensus 63 sA~vD~lILiaVl~a~~FLl~pY~~~i~~~~~~~~~---~i~~l~~~~~~~~P~~y~~~~~~~~~aa~~~w~~~~c 135 (227)
.-.+|.++-++.=+...||+.|.+..+.- ++..-. .+......+....=..| +..+.+.+.+++.|..+.-
T Consensus 13 ~r~~~~~lT~~~W~~~~yL~~pl~~ll~~-ll~~~~~~~~~~~~~~~~~~~~l~~y-~~i~~~~a~~Li~Wa~yn~ 86 (138)
T PF13994_consen 13 QRLIDYFLTLLFWGGFIYLWRPLLTLLAW-LLGLHLFYPQMSLGGFLSSLNTLQIY-LLIALVNAVILILWAKYNR 86 (138)
T ss_pred HHHHHHHHHHHHHHHHHHHHHHHHHHHHH-HHccccccchhhhcchhhHHHHHHHH-HHHHHHHHHHHHHHHHHHH
Confidence 36789998888889999999998774433 222110 01100000111111223 3334445567889999886
No 23
>PF07172 GRP: Glycine rich protein family; InterPro: IPR010800 This family consists of glycine rich proteins. Some of them may be involved in resistance to environmental stress [].
Probab=30.84 E-value=43 Score=26.28 Aligned_cols=18 Identities=17% Similarity=0.349 Sum_probs=11.4
Q ss_pred HHHHHHHHhhhhhhhhHH
Q 027172 69 VILIAVITACGFLLFPYI 86 (227)
Q Consensus 69 lILiaVl~a~~FLl~pY~ 86 (227)
++||++++|..+|+++=+
T Consensus 6 ~llL~l~LA~lLlisSev 23 (95)
T PF07172_consen 6 FLLLGLLLAALLLISSEV 23 (95)
T ss_pred HHHHHHHHHHHHHHHhhh
Confidence 667777777666665543
No 24
>PRK04165 acetyl-CoA decarbonylase/synthase complex subunit gamma; Provisional
Probab=28.91 E-value=53 Score=32.18 Aligned_cols=29 Identities=21% Similarity=0.512 Sum_probs=21.2
Q ss_pred ccCCCCCCCccchhhhhhccccchhhhhcC
Q 027172 137 SRKCGNPNCKGLKKAAEFDIQLETEECVKN 166 (227)
Q Consensus 137 sRKCgnP~CKGLkKA~EFDIQLeTEeCVk~ 166 (227)
-.+||-|+|.++-.|+- +=+.+.++|.-.
T Consensus 17 Cg~CG~~~C~afA~~v~-~g~~~~~~C~~~ 45 (450)
T PRK04165 17 CGECGEPTCLAFAMKLA-SGKAELDDCPYL 45 (450)
T ss_pred CCCCCCccHHHHHHHHH-cCCCCccCCCCC
Confidence 46799999999987764 335566777655
No 25
>PF07204 Orthoreo_P10: Orthoreovirus membrane fusion protein p10; InterPro: IPR009854 This family consists of several Orthoreovirus membrane fusion protein p10 sequences. p10 is thought to be a multifunctional protein that plays a key role in virus-host interaction [].
Probab=28.81 E-value=38 Score=27.58 Aligned_cols=26 Identities=15% Similarity=0.268 Sum_probs=14.6
Q ss_pred hHHHHHHHHHHHHHHHHHHHhhhccc
Q 027172 113 LIYSSIGVSMSCVAIATWVALLCTSR 138 (227)
Q Consensus 113 ~~y~~~~~~~~~aa~~~w~~~~c~sR 138 (227)
|-|.+.|..++..+++...++||+.|
T Consensus 42 WpyLA~GGG~iLilIii~Lv~CC~~K 67 (98)
T PF07204_consen 42 WPYLAAGGGLILILIIIALVCCCRAK 67 (98)
T ss_pred hHHhhccchhhhHHHHHHHHHHhhhh
Confidence 34444444444445556677888654
No 26
>PF03839 Sec62: Translocation protein Sec62; InterPro: IPR004728 Members of the NSCC2 family have been sequenced from various yeast, fungal and animal species including Saccharomyces cerevisiae, Drosophila melanogaster and Homo sapiens. These proteins are the Sec62 proteins, believed to be associated with the Sec61 and Sec63 constituents of the general protein secretory systems of yeast microsomes. They are also the non-selective cation (NS) channels of the mammalian cytoplasmic membrane. The yeast Sec62 protein has been shown to be essential for cell growth. The mammalian NS channel proteins have been implicated in platelet derived growth factor (PDGF) dependent single channel current in fibroblasts. These channels are essentially closed in serum deprived tissue-culture cells and are specifically opened by exposure to PDGF. These channels are reported to exhibit equal selectivity for Na+, K+ and Cs+ with low permeability to Ca2+, and no permeability to anions.; GO: 0008565 protein transporter activity, 0015031 protein transport, 0016021 integral to membrane
Probab=28.07 E-value=1.9e+02 Score=26.08 Aligned_cols=20 Identities=20% Similarity=0.464 Sum_probs=9.6
Q ss_pred HHHHHH-HHHHHHhhhhhhhh
Q 027172 65 AVDVVI-LIAVITACGFLLFP 84 (227)
Q Consensus 65 ~vD~lI-LiaVl~a~~FLl~p 84 (227)
..+.|+ ++++++.++..+||
T Consensus 109 ~~~~l~~~~~~~~v~a~~lFP 129 (224)
T PF03839_consen 109 LMQYLIGALLLVGVIAICLFP 129 (224)
T ss_pred HHHHHHHHHHHHHHHHHHhhh
Confidence 444444 44444444555555
No 27
>PF13303 PTS_EIIC_2: Phosphotransferase system, EIIC
Probab=26.85 E-value=2e+02 Score=27.13 Aligned_cols=22 Identities=23% Similarity=0.379 Sum_probs=17.0
Q ss_pred HhhhhhhhhHHHHHHHHhhhhh
Q 027172 76 TACGFLLFPYIRVVSVKSVEVS 97 (227)
Q Consensus 76 ~a~~FLl~pY~~~i~~~~~~~~ 97 (227)
...|.++.||++.+...+..+.
T Consensus 130 g~ig~~~~P~v~~i~~~IG~~I 151 (327)
T PF13303_consen 130 GLIGLLTLPYVSPITTWIGNVI 151 (327)
T ss_pred HHHHHHHHHHHHHHHHHHHHHH
Confidence 3457889999999998875543
No 28
>PF05568 ASFV_J13L: African swine fever virus J13L protein; InterPro: IPR008385 This family consists of several African swine fever virus (ASFV) j13L proteins.
Probab=26.22 E-value=81 Score=27.95 Aligned_cols=19 Identities=68% Similarity=0.919 Sum_probs=13.5
Q ss_pred HHhhhcccCCCCCCCccchhhh----hhcccc
Q 027172 131 VALLCTSRKCGNPNCKGLKKAA----EFDIQL 158 (227)
Q Consensus 131 ~~~~c~sRKCgnP~CKGLkKA~----EFDIQL 158 (227)
.++.|.+|| |||| |=|||.
T Consensus 48 li~lcssRK---------kKaaAAi~eediQf 70 (189)
T PF05568_consen 48 LIYLCSSRK---------KKAAAAIEEEDIQF 70 (189)
T ss_pred HHHHHhhhh---------HHHHhhhhhhcccc
Confidence 456788887 6664 678886
No 29
>KOG3814 consensus Signaling protein van gogh/strabismus [Signal transduction mechanisms]
Probab=25.29 E-value=3.4e+02 Score=27.56 Aligned_cols=74 Identities=12% Similarity=0.212 Sum_probs=39.9
Q ss_pred hhHHHHHHHHHHHHhhhhhhhhHH----------------------HHHHHHhhhhhhhhhhhhhcccccchhHHHHHH-
Q 027172 63 SAAVDVVILIAVITACGFLLFPYI----------------------RVVSVKSVEVSAAVFYLVKEEVIGNPLIYSSIG- 119 (227)
Q Consensus 63 sA~vD~lILiaVl~a~~FLl~pY~----------------------~~i~~~~~~~~~~i~~l~~~~~~~~P~~y~~~~- 119 (227)
.++--+|-+|.++...+|++.|-+ .+.+.-++.+++.-.++..-.-..-|=+|+.-|
T Consensus 114 l~~~slL~~~sf~sp~am~~lP~~~P~~~~r~~l~~C~~~CeGllismA~kll~L~ig~walf~Rk~~A~mPRvf~~RAl 193 (531)
T KOG3814|consen 114 LLASSLLGLLSFLSPPAMCLLPIIAPRFLWRMELEPCGTDCEGLLISMAFKLLILLIGIWALFFRKAMADMPRVFVVRAL 193 (531)
T ss_pred HHHHHHHHHHHHhchhHHHhccccccchhhhccccccccccchhhHHHHHHHHHHHHHHHHHHhhhhhccCchhHHHHHH
Confidence 344456778888999999988821 111211111111111122223334465555433
Q ss_pred --HHHHHHHHHHHHHhhhc
Q 027172 120 --VSMSCVAIATWVALLCT 136 (227)
Q Consensus 120 --~~~~~aa~~~w~~~~c~ 136 (227)
+.+++.+++-|.++.-+
T Consensus 194 ll~LV~~~~fayWLFYiVr 212 (531)
T KOG3814|consen 194 LLVLVFLIVFAYWLFYIVR 212 (531)
T ss_pred HHHHHHHHHHHHHHHHhhh
Confidence 55667778889887765
No 30
>PHA02980 hypothetical protein; Provisional
Probab=25.25 E-value=1.1e+02 Score=26.42 Aligned_cols=27 Identities=11% Similarity=-0.085 Sum_probs=20.0
Q ss_pred cchhHHHHHHHHHHHHhhhhhhhhHHH
Q 027172 61 FRSAAVDVVILIAVITACGFLLFPYIR 87 (227)
Q Consensus 61 ~psA~vD~lILiaVl~a~~FLl~pY~~ 87 (227)
+.-|.+|+++|.+.+.+..+.+.+.=+
T Consensus 100 ~~lAli~illL~~lv~~~~~~f~~i~~ 126 (160)
T PHA02980 100 LRLSIAISTFSICLSVYNIYLWRFETD 126 (160)
T ss_pred hhHHHHHHHHHHHHHHHHHHHHHhccH
Confidence 467889999988888887766555444
No 31
>PF01102 Glycophorin_A: Glycophorin A; InterPro: IPR001195 Proteins in this group are responsible for the molecular basis of the blood group antigens, surface markers on the outside of the red blood cell membrane. Most of these markers are proteins, but some are carbohydrates attached to lipids or proteins [Reid M.E., Lomas-Francis C. The Blood Group Antigen FactsBook Academic Press, London / San Diego, (1997)]. Glycophorin A (PAS-2) and glycophorin B (PAS-3) belong to the MNS blood group system and are associated with antigens that include M/N, S/s, U, He, Mi(a), M(c), Vw, Mur, M(g), Vr, M(e), Mt(a), St(a), Ri(a), Cl(a), Ny(a), Hut, Hil, M(v), Far, Mit, Dantu, Hop, Nob, En(a), ENKT, amongst others. Glycophorin A is the major sialoglycoprotein of the erythrocyte membrane. Structurally, glycophorin A consists of an N-terminal extracellular domain, heavily glycosylated on serine and threonine residues, followed by a transmembrane region and a C-terminal cytoplasmic domain. Other glycophorins in this entry such as Glycophorin B and Glycophorin E represent minor sialoglycoproteins in the erythrocyte membrane.; GO: 0016021 integral to membrane; PDB: 2KPF_B 1AFO_B 2KPE_A.
Probab=24.79 E-value=83 Score=25.96 Aligned_cols=45 Identities=22% Similarity=0.240 Sum_probs=17.5
Q ss_pred cccchhHHHHHHHHHHHHHHHHHHHhhhcccCCCCCCCccchhhhhhccccchhh
Q 027172 108 VIGNPLIYSSIGVSMSCVAIATWVALLCTSRKCGNPNCKGLKKAAEFDIQLETEE 162 (227)
Q Consensus 108 ~~~~P~~y~~~~~~~~~aa~~~w~~~~c~sRKCgnP~CKGLkKA~EFDIQLeTEe 162 (227)
++..-++++.+|+......+++.+.|+++. |||--.+|+|-..|+
T Consensus 61 fs~~~i~~Ii~gv~aGvIg~Illi~y~irR----------~~Kk~~~~~~p~P~~ 105 (122)
T PF01102_consen 61 FSEPAIIGIIFGVMAGVIGIILLISYCIRR----------LRKKSSSDVQPLPEE 105 (122)
T ss_dssp SS-TCHHHHHHHHHHHHHHHHHHHHHHHHH----------HS-------------
T ss_pred ccccceeehhHHHHHHHHHHHHHHHHHHHH----------HhccCCCCCCCCCCC
Confidence 333445566666655555556666677643 456677888874443
No 32
>PF00509 Hemagglutinin: Haemagglutinin; InterPro: IPR001364 Haemagglutinin (HA) is one of two main surface fusion glycoproteins embedded in the envelope of influenza viruses, the other being neuraminidase (NA). There are sixteen known HA subtypes (H1-H16) and nine NA subtypes (N1-N9), which together are used to classify influenza viruses (e.g. H5N1). The antigenic variations in HA and NA enable the virus to evade host antibodies made to previous influenza strains, accounting for recurrent influenza epidemics. The HA glycoprotein is present in the viral membrane as a single polypeptide (HA0), which must be cleaved by the host's trypsin-like proteases to produce two peptides (HA1 and HA2) in order for the virus to be infectious. Once HA0 is cleaved, the newly exposed N-terminal of the HA2 peptide then acts to fuse the viral envelope to the cellular membrane of the host cell, which allows the viral negative-stranded RNA to infect the host cell. The type of host protease can influence the infectivity and pathogenicity of the virus. The haemagglutinin glycoprotein is a trimer containing three structurally distinct regions: a globular head consisting of anti-parallel beta-sheets that form a beta-sandwich with a jelly-roll fold (contains the receptor binding site and the HA1/HA2 cleavage site); a triple-stranded, coiled-coil, alpha-helical stalk; and a globular foot composed of anti-parallel beta-sheets. Each monomer consists of an intact HA0 polypeptide with the HA1 and HA2 regions linked by disulphide bonds. The N terminus of HA1 provides the central strand in the 5-stranded globular foot, while the rest of the HA1 chain makes its way to the 8-stranded globular head. HA2 provides two alpha helices, which form part of the triple-stranded coiled-coil that stabilises the trimer, its C terminus providing the remaining strands of the 5-stranded globular foot.
This entry represents the entire haemagglutinin protein (HA0) consisting of both the HA1 and HA2 regions, as found in influenza A and B viruses.; GO: 0046789 host cell surface receptor binding, 0019064 viral envelope fusion with host membrane, 0019031 viral envelope; PDB: 2WR5_A 2IBX_A 2WR0_B 2WR1_C 2XN9_F 2WRF_I 3S11_E 3BT6_A 3SM5_E 2FK0_H ....
Probab=24.39 E-value=49 Score=33.69 Aligned_cols=43 Identities=30% Similarity=0.537 Sum_probs=32.0
Q ss_pred CCCCCccchhhhhhccccchhhhhcCCccccCCcccCCcc--hhHHHHHHHhh
Q 027172 141 GNPNCKGLKKAAEFDIQLETEECVKNKDSAKKGLFELPRD--HHRELQAELKK 191 (227)
Q Consensus 141 gnP~CKGLkKA~EFDIQLeTEeCVk~~~~~k~gl~eLp~d--~hreLeAELrK 191 (227)
|||.|-+|-.+-|-|.=+|+..-+-+ .=-|+| ||+|||+.+.-
T Consensus 61 GnP~CD~ll~~~~WsyIVEr~~~~ng--------~CYPG~~~d~eeLR~l~ss 105 (550)
T PF00509_consen 61 GNPQCDSLLNASSWSYIVERPNAVNG--------ICYPGDFEDYEELRSLLSS 105 (550)
T ss_dssp T-GGGGGGTT-SBBSSEEEETTSSBS--------SSSSEEETTHHHHHHHHTT
T ss_pred cCcchhcccCcccceeeEecCCCCCC--------cccCCcccCHHHHHHHHhh
Confidence 89999999999999999999876554 122333 79999998763
No 33
>PRK14584 hmsS hemin storage system protein; Provisional
Probab=21.68 E-value=4.3e+02 Score=22.83 Aligned_cols=13 Identities=15% Similarity=0.348 Sum_probs=9.9
Q ss_pred HHHHHHHHHHhhh
Q 027172 123 SCVAIATWVALLC 135 (227)
Q Consensus 123 ~~aa~~~w~~~~c 135 (227)
.+.++++|..+.-
T Consensus 73 nAvlLI~WA~YN~ 85 (153)
T PRK14584 73 NAVLLIIWAKYNQ 85 (153)
T ss_pred HHHHHHHHHHHHH
Confidence 3447789999887
No 34
>PF11755 DUF3311: Protein of unknown function (DUF3311); InterPro: IPR021741 This is a family of short bacterial proteins of unknown function.
Probab=21.39 E-value=1.8e+02 Score=21.24 Aligned_cols=32 Identities=16% Similarity=0.150 Sum_probs=23.3
Q ss_pred hhcccccchhHHHHHHHHHHHHHHHHHHHhhh
Q 027172 104 VKEEVIGNPLIYSSIGVSMSCVAIATWVALLC 135 (227)
Q Consensus 104 ~~~~~~~~P~~y~~~~~~~~~aa~~~w~~~~c 135 (227)
+.+.+...|+.|.-..+-++.+.+.+|.++--
T Consensus 21 ~~P~v~G~Pff~~w~~~wv~lts~~~~~~y~l 52 (66)
T PF11755_consen 21 VEPTVFGMPFFYWWQLAWVVLTSVCMAIVYRL 52 (66)
T ss_pred CCccccCcHHHHHHHHHHHHHHHHHHHHHHHh
Confidence 35667788988887777777777777766554
No 35
>PF07829 Toxin_14: Alpha-A conotoxin PIVA-like protein; InterPro: IPR012498 Alpha-A conotoxin PIVA (P55963 from SWISSPROT) is the major paralytic toxin found in the venom produced by the piscivorous snail Conus purpurascens. This peptide acts by blocking the acetylcholine-binding site of the nicotinic acetylcholine receptor at the neuromuscular junction. The overall shape of the peptide is described as an "iron" with a highly charged hydrophilic loop of 15S-19R forming the "handle" domain that is exposed to the exterior of the protein. The stability of the conotoxin is primarily governed by three disulphide bonds. A triangular structural motif formed by residues 19R, 12H and 6Y is thought to constitute a "binding core" that is important in binding to the acetylcholine receptor.; GO: 0030550 acetylcholine receptor inhibitor activity, 0009405 pathogenesis, 0005576 extracellular region; PDB: 1PQR_A 1P1P_A.
Probab=21.29 E-value=45 Score=21.40 Aligned_cols=9 Identities=67% Similarity=1.579 Sum_probs=6.3
Q ss_pred ccCCCcccc
Q 027172 205 ARCGCSVGR 213 (227)
Q Consensus 205 ArCGCpv~r 213 (227)
-||||-|+|
T Consensus 12 hpc~ckv~r 20 (26)
T PF07829_consen 12 HPCGCKVGR 20 (26)
T ss_dssp -TTTSTST-
T ss_pred cccccccCC
Confidence 389999887
No 36
>PF07787 DUF1625: Protein of unknown function (DUF1625); InterPro: IPR012430 Sequences making up this family are derived from hypothetical proteins expressed by both prokaryotic and eukaryotic species. The region in question is approximately 250 residues long.
Probab=21.16 E-value=2e+02 Score=25.12 Aligned_cols=16 Identities=25% Similarity=0.168 Sum_probs=8.0
Q ss_pred HhhhhhhhhHHHHHHH
Q 027172 76 TACGFLLFPYIRVVSV 91 (227)
Q Consensus 76 ~a~~FLl~pY~~~i~~ 91 (227)
|.+-++++..+..++.
T Consensus 195 f~G~~~~~~~l~~l~~ 210 (248)
T PF07787_consen 195 FIGFFLLFSPLYTLVD 210 (248)
T ss_pred HHHHHHHHHHHHHHHh
Confidence 3344555555555544
No 37
>PRK00753 psbL photosystem II reaction center L; Provisional
Probab=21.10 E-value=69 Score=22.33 Aligned_cols=19 Identities=16% Similarity=0.387 Sum_probs=13.4
Q ss_pred HHHHHHHHHhhhhhhhhHH
Q 027172 68 VVILIAVITACGFLLFPYI 86 (227)
Q Consensus 68 ~lILiaVl~a~~FLl~pY~ 86 (227)
+-.=++.+|.+|.|+++|+
T Consensus 19 Ly~GlLlifvl~vLFssYf 37 (39)
T PRK00753 19 LYLGLLLVFVLGILFSSYF 37 (39)
T ss_pred HHHHHHHHHHHHHHHHhhc
Confidence 3444566777888888886
No 38
>PF02038 ATP1G1_PLM_MAT8: ATP1G1/PLM/MAT8 family; InterPro: IPR000272 The FXYD protein family contains at least seven members in mammals. Two other family members that are not obvious orthologs of any identified mammalian FXYD protein exist in zebrafish. All these proteins share a signature sequence of six conserved amino acids comprising the FXYD motif in the NH2-terminus, and two glycines and one serine residue in the transmembrane domain. FXYD proteins are widely distributed in mammalian tissues with prominent expression in tissues that perform fluid and solute transport or that are electrically excitable. Initial functional characterisation suggested that FXYD proteins act as channels or as modulators of ion channels; however, studies have revealed that most FXYD proteins have another specific function and act as tissue-specific regulatory subunits of the Na,K-ATPase. Each of these auxiliary subunits produces a distinct functional effect on the transport characteristics of the Na,K-ATPase that is adjusted to the specific functional demands of the tissue in which the FXYD protein is expressed. FXYD proteins appear to preferentially associate with Na,K-ATPase alpha1-beta isozymes, and affect their function in a way that renders them operationally complementary or supplementary to coexisting isozymes.; GO: 0005216 ion channel activity, 0006811 ion transport, 0016020 membrane; PDB: 2JO1_A 2JP3_A 2ZXE_G 3A3Y_G 3N23_E 3B8E_H 3KDP_G 3N2F_E.
Probab=20.49 E-value=54 Score=23.76 Aligned_cols=23 Identities=26% Similarity=0.257 Sum_probs=12.8
Q ss_pred HHHHHHHHHHHHHHHhhhcccCC
Q 027172 118 IGVSMSCVAIATWVALLCTSRKC 140 (227)
Q Consensus 118 ~~~~~~~aa~~~w~~~~c~sRKC 140 (227)
+|..++++++.+-+++--.|+||
T Consensus 16 igGLi~A~vlfi~Gi~iils~kc 38 (50)
T PF02038_consen 16 IGGLIFAGVLFILGILIILSGKC 38 (50)
T ss_dssp HHHHHHHHHHHHHHHHHHCTTHH
T ss_pred ccchHHHHHHHHHHHHHHHcCcc
Confidence 44455555555555555566776
No 39
>PRK13029 2-oxoacid ferredoxin oxidoreductase; Provisional
Probab=20.48 E-value=85 Score=34.56 Aligned_cols=28 Identities=32% Similarity=0.624 Sum_probs=21.0
Q ss_pred cchhHHHHHHHhhcCCCCCcEEEEeeccCCC
Q 027172 179 RDHHRELQAELKKMAPPNGRAVLVFRARCGC 209 (227)
Q Consensus 179 ~d~hreLeAELrKMAPPNGRAVLvFRArCGC 209 (227)
+|+-+.++.|||. -.|=+|||++.+|-=
T Consensus 607 R~~l~~vq~~lr~---~~GvsViI~~q~Ca~ 634 (1186)
T PRK13029 607 RDELDAVQRELRE---VPGVSVLIYDQTCAT 634 (1186)
T ss_pred HHHHHHHHHHHhc---CCCcEEEEEcCcCcc
Confidence 3455666777764 459999999999964
No 40
>smart00508 PostSET Cysteine-rich motif following a subset of SET domains.
Probab=20.35 E-value=46 Score=21.12 Aligned_cols=11 Identities=55% Similarity=1.446 Sum_probs=9.4
Q ss_pred CCCCCCCccch
Q 027172 139 KCGNPNCKGLK 149 (227)
Q Consensus 139 KCgnP~CKGLk 149 (227)
.||-++|+|.-
T Consensus 6 ~CGs~~CRG~l 16 (26)
T smart00508 6 LCGAPNCRGFL 16 (26)
T ss_pred eCCCcccccee
Confidence 49999999975
No 41
>PRK10747 putative protoheme IX biogenesis protein; Provisional
Probab=20.10 E-value=1.1e+02 Score=27.88 Aligned_cols=21 Identities=33% Similarity=0.278 Sum_probs=12.6
Q ss_pred CccchhhhhhccccchhhhhcC
Q 027172 145 CKGLKKAAEFDIQLETEECVKN 166 (227)
Q Consensus 145 CKGLkKA~EFDIQLeTEeCVk~ 166 (227)
=+||....|=|.+ +-|.....
T Consensus 89 ~~gl~a~~eGd~~-~A~k~l~~ 109 (398)
T PRK10747 89 EQALLKLAEGDYQ-QVEKLMTR 109 (398)
T ss_pred HHHHHHHhCCCHH-HHHHHHHH
Confidence 4567666766766 55555443
No 42
>COG0230 RpmH Ribosomal protein L34 [Translation, ribosomal structure and biogenesis]
Probab=20.02 E-value=19 Score=25.62 Aligned_cols=16 Identities=50% Similarity=0.625 Sum_probs=12.8
Q ss_pred HhhcCCCCCcEEEEee
Q 027172 189 LKKMAPPNGRAVLVFR 204 (227)
Q Consensus 189 LrKMAPPNGRAVLvFR 204 (227)
+-+|+-.|||.||--|
T Consensus 19 raRM~Tk~GR~vl~~R 34 (44)
T COG0230 19 RARMATKNGRKVLARR 34 (44)
T ss_pred HHHhcccchHHHHHHH
Confidence 3589999999998554
Done!