Query 048367
Match_columns 85
No_of_seqs 113 out of 140
Neff 4.1
Searched_HMMs 46136
Date Fri Mar 29 08:51:47 2013
Command hhsearch -i /work/01045/syshi/csienesis_hhblits_a3m/048367.a3m -d /work/01045/syshi/HHdatabase/Cdd.hhm -o /work/01045/syshi/hhsearch_cdd/048367hhsearch_cdd -cpu 12 -v 0
 No Hit                             Prob E-value P-value  Score    SS Cols Query HMM  Template HMM
  1 PF07189 SF3b10: Splicing fact  100.0 3.5E-57 7.6E-62  296.2   9.1   79    3-81      1-79  (79)
  2 KOG3485 Uncharacterized conser 100.0 8.3E-51 1.8E-55  268.8   7.3   83    2-84      2-85  (86)
  3 PF10022 DUF2264: Uncharacteri   85.5     2.5 5.5E-05   33.9   5.7   75    2-81    172-255 (361)
  4 PF02909 TetR_C: Tetracyclin r   51.0      44 0.00095   21.4   4.5   44   27-72      7-50  (139)
  5 COG4519 Uncharacterized protei  50.6      13 0.00028   25.3   1.9   17   27-43     75-91  (95)
  6 PF13215 DUF4023: Protein of u   49.6      22 0.00047   20.7   2.5   24    2-25      8-31  (38)
  7 PF04614 Pex19: Pex19 protein    42.8     8.6 0.00019   29.2   0.1   16   65-81    201-216 (248)
  8 PF07627 PSCyt3: Protein of un   40.2      31 0.00066   23.2   2.5   41   42-82      4-45  (101)
  9 PF03181 BURP: BURP domain; I    38.3      36 0.00077   25.7   2.9   35   46-84     46-80  (216)
 10 cd07594 BAR_Endophilin_B The B  34.8      94   0.002   23.6   4.7   32   28-72     29-60  (229)
 11 PF15178 TOM_sub5: Mitochondri   31.2 1.3E+02  0.0028   18.4   4.1   36   18-57      4-39  (51)
 12 PF12616 DUF3775: Protein of u   30.3 1.5E+02  0.0033   19.0   4.5   52    3-56     11-69  (75)
 13 PF05750 Rubella_Capsid: Rubel   29.9      25 0.00054   27.5   0.9   12   71-82    192-203 (300)
 14 PF13978 DUF4223: Protein of u   27.8      41  0.0009   20.9   1.4   13   69-81     43-55  (56)
 15 PF11333 DUF3135: Protein of u   26.2      70  0.0015   20.8   2.4   32   46-77      3-36  (83)
 16 PF01152 Bac_globin: Bacterial   25.8 1.1E+02  0.0024   19.6   3.3   48   30-80     10-57  (120)
 17 KOG3541 Predicted guanine nucl  25.2      49  0.0011   28.2   1.8   36   39-74    262-299 (477)
 18 KOG2468 Dolichol kinase [Lipid  25.1      29 0.00062   29.8   0.5   10   38-47    423-432 (510)
 19 TIGR00994 3a0901s05TIC20 chlor  25.0      67  0.0015   25.5   2.5   24   50-73    174-198 (267)
 20 cd07617 BAR_Endophilin_B2 The   23.3 1.8E+02  0.0038   22.2   4.4   33   29-74     30-62  (220)
 21 cd00252 SPARC_EC SPARC_EC; ext  21.8 1.1E+02  0.0025   20.6   2.8   19   29-47     15-35  (116)
 22 PF13774 Longin: Regulated-SNA   21.7      17 0.00037   22.3  -1.1   27   47-73     31-57  (83)
 23 PRK11188 rrmJ 23S rRNA methylt  21.7      73  0.0016   23.0   2.0   22   25-46      7-28  (209)
 24 KOG3821 Heparin sulfate cell s  20.5      70  0.0015   27.9   1.9   45   36-83    306-350 (563)
No 1
>PF07189 SF3b10: Splicing factor 3B subunit 10 (SF3b10); InterPro: IPR009846 This family consists of several eukaryotic splicing factor 3B subunit 5 (SF3b5) proteins. SF3b5 is a 10 kDa subunit of the splicing factor SF3b. SF3b associates with the splicing factor SF3a and a 12S RNA unit to form the U2 small nuclear ribonucleoprotein complex. SF3b5 and SF3b14b are also thought to facilitate the interaction of U2 with the branch site []. Also included in this entry is RDS3 complex subunit 10, another protein involved in mRNA splicing [].
Probab=100.00 E-value=3.5e-57 Score=296.19 Aligned_cols=79 Identities=58% Similarity=1.047 Sum_probs=78.4
Q ss_pred hhHhHHHHHHHHHhhhcCCCCCCccHHHHHHHhhhhhHhhhhCChhHHHHHHHhhCcchHHHHHHHHHHhhCCCCCCCC
Q 048367 3 DRFNINSQLEHLQAKYVGTGHADLNRFEWAVNIQRDSYASYVGHYPILAYFSIAENESIGREHYNFMQKMLLPCGLPPE 81 (85)
Q Consensus 3 dk~~~~~qle~Lq~KY~GtG~~dTTk~EW~tnihRDT~aS~~gH~~lL~Y~aiaenes~~r~r~~ll~kM~~pcg~pp~ 81 (85)
||+|+++||||||+||+||||+|||||||++||||||||||+||++||+|||||+|||++|+|++||+||+|||||||+
T Consensus 1 Dk~~~~~qle~Lq~KY~GtG~~dTTk~EW~tnihRDT~aS~~gH~~lL~Y~aia~ne~~~r~r~~ll~kM~~p~g~pp~ 79 (79)
T PF07189_consen 1 DKYRIQQQLEHLQSKYVGTGHADTTKEEWLTNIHRDTYASIIGHPDLLEYFAIAENESKARVRFNLLEKMVQPCGPPPP 79 (79)
T ss_pred ChhhHHHHHHHHHHHhCCCCCCCcCHHHHHHHHHHHHHHHHhcCHHHHHHHHHhcCCCHHHHHHHHHHHHhccCCCCCC
Confidence 8999999999999999999999999999999999999999999999999999999999999999999999999999996
No 2
>KOG3485 consensus Uncharacterized conserved protein [Function unknown]
Probab=100.00 E-value=8.3e-51 Score=268.84 Aligned_cols=83 Identities=67% Similarity=1.204 Sum_probs=79.6
Q ss_pred chhHhHHHHHHHHHhhhcCCCCCCccHHHHHHHhhhhhHhhhhCChhHHHHHHHhhCcchHHHHHHHHHHhhCCCCC-CC
Q 048367 2 SDRFNINSQLEHLQAKYVGTGHADLNRFEWAVNIQRDSYASYVGHYPILAYFSIAENESIGREHYNFMQKMLLPCGL-PP 80 (85)
Q Consensus 2 ~dk~~~~~qle~Lq~KY~GtG~~dTTk~EW~tnihRDT~aS~~gH~~lL~Y~aiaenes~~r~r~~ll~kM~~pcg~-pp 80 (85)
+|||++++||||||+||+||||+||||+||++||||||++|++||+++|.||+||||||++|+|+|+|+||++|||| ||
T Consensus 2 ~dRf~i~aqLEhLQskYvGtg~a~~tk~ew~vnq~RdS~~S~vgh~~~l~Y~a~ae~Ep~~rvr~N~lekml~pcg~wpp 81 (86)
T KOG3485|consen 2 GDRFNIHAQLEHLQSKYVGTGHADTTKFEWLVNQHRDSLASYVGHYPLLNYFAIAENEPKARVRFNLLEKMLQPCGPWPP 81 (86)
T ss_pred cchhhHHHHHHHHHHHhhcccccccchHHHHHhcchhhhhhhcCCchHHHHHHHhccCchhhhhhcHHHHhhcccCCCCC
Confidence 69999999999999999999999999999999999999999999999999999999999999999999999999999 55
Q ss_pred CCCC
Q 048367 81 ERED 84 (85)
Q Consensus 81 ~~~~ 84 (85)
++++
T Consensus 82 ~~e~ 85 (86)
T KOG3485|consen 82 EKEE 85 (86)
T ss_pred cccc
Confidence 5554
No 3
>PF10022 DUF2264: Uncharacterized protein conserved in bacteria (DUF2264); InterPro: IPR016624 There is currently no experimental data for members of this group or their homologues, nor do they exhibit features indicative of any function.
Probab=85.53 E-value=2.5 Score=33.92 Aligned_cols=75 Identities=27% Similarity=0.360 Sum_probs=58.6
Q ss_pred chhHhHHHHHHHHHhhhcCCCCCCccHHHHHHHhhhhhHhhhhCChhHHHHHHHhhCcchH---HHHHH------HHHHh
Q 048367 2 SDRFNINSQLEHLQAKYVGTGHADLNRFEWAVNIQRDSYASYVGHYPILAYFSIAENESIG---REHYN------FMQKM 72 (85)
Q Consensus 2 ~dk~~~~~qle~Lq~KY~GtG~~dTTk~EW~tnihRDT~aS~~gH~~lL~Y~aiaenes~~---r~r~~------ll~kM 72 (85)
.|..++..-++.+.+=|+|=|=-.-.. +-|.|=|+|.+-|+-+|-|..++....+. +.+.+ -.+.|
T Consensus 172 ~d~~~i~~~l~~~e~~Y~GdGWY~DG~-----~~~~DYYns~aih~y~l~~~~~~~~~~~~~~~~~~~Ra~~fa~~~~~~ 246 (361)
T PF10022_consen 172 YDEERIDYDLERIEEWYLGDGWYSDGP-----EFQFDYYNSWAIHPYLLLYARLMGDEDPERAARYRQRAQRFAEDYERM 246 (361)
T ss_pred CcHHHHHHHHHHHHHHhccCCccccCC-----ccCCcchHHHHHHHHHHHHHHHhcccCHHHHHHHHHHHHHHHHHHHHH
Confidence 366788889999999999998544333 67999999999999999999999977633 33332 35789
Q ss_pred hCCCCCCCC
Q 048367 73 LLPCGLPPE 81 (85)
Q Consensus 73 ~~pcg~pp~ 81 (85)
+.|.|.+|+
T Consensus 247 f~~dG~~~~ 255 (361)
T PF10022_consen 247 FSPDGAAPP 255 (361)
T ss_pred cCCCCCcCC
Confidence 999996664
No 4
>PF02909 TetR_C: Tetracyclin repressor, C-terminal all-alpha domain; InterPro: IPR004111 The antibiotic tetracycline has a broad spectrum of activity, acting to inhibit bacterial protein synthesis by binding to the 30S ribosomal subunit, which prevents the association of the aminoacyl-tRNA to the ribosomal acceptor A site. Tetracycline binding is reversible, therefore diluting out the antibiotic can reverse its effects. Tetracycline resistance genes are often located on mobile elements, such as plasmids, transposons and/or conjugative transposons, which can sometimes be transferred between bacterial species. In certain cases, tetracycline can enhance the transfer of these elements, thereby promoting resistance amongst a bacterial colony. There are three types of tetracycline resistance: tetracycline efflux, ribosomal protection, and tetracycline modification [, ]: Tetracycline efflux proteins belong to the major facilitator superfamily. Efflux proteins are membrane-associated proteins that recognise and export tetracycline from the cell. They are found in both Gram-positive and Gram-negative bacteria []. There are at least 22 different tetracycline efflux proteins, grouped according to sequence similarity: Group 1 are Tet(A), Tet(B), Tet(C), Tet(D), Tet(E), Tet(G), Tet(H), Tet(J), Tet(Z) and Tet(30); Group 2 are Tet(K) and Tet(L); Group 3 are Otr(B) and Tcr(3); Group 4 is TetA(P); Group 5 is Tet(V). In addition, there are the efflux proteins Tet(31), Tet(33), Tet(V), Tet(Y), Tet(34), and Tet(35). Ribosomal protection proteins are cytoplasmic proteins that display homology with the elongation factors EF-Tu and EF-G. Protection proteins bind the ribosome, causing an alteration in ribosomal conformation that prevents tetracycline from binding. There are at least ten ribosomal protection proteins: Tet(M), Tet(O), Tet(S), Tet(W), Tet(32), Tet(36), Tet(Q), Tet(T), Otr(A), and TetB(P). 
Both Tet(M) and Tet(O) have ribosome-dependent GTPase activity, the hydrolysis of GTP providing the energy for the ribosomal conformational changes. Tetracycline modification proteins include the enzymes Tet(37) and Tet(X), both of which inactivate tetracycline. In addition, there are the tetracycline resistance proteins Tet(U) and Otr(C). The expression of several of these tet genes is controlled by a family of tetracycline transcriptional regulators known as TetR. TetR family regulators are involved in the transcriptional control of multidrug efflux pumps, pathways for the biosynthesis of antibiotics, response to osmotic stress and toxic chemicals, control of catabolic pathways, differentiation processes, and pathogenicity []. The TetR proteins identified in over 115 genera of bacteria and archaea share a common helix-turn-helix (HTH) structure in their DNA-binding domain. However, TetR proteins can work in different ways: they can bind a target operator directly to exert their effect (e.g. TetR binds Tet(A) gene to repress it in the absence of tetracycline), or they can be involved in complex regulatory cascades in which the TetR protein can either be modulated by another regulator or TetR can trigger the cellular response. This entry represents the C-terminal domain found in the tetracycline transcriptional repressor TetR, which binds to the Tet(A) gene to repress its expression in the absence of tetracycline []. Tet(A) is a membrane-associated efflux protein that exports tetracycline from the cell before it can attach to ribosomes and inhibit polypeptide chain growth. TetR occurs as a homodimer and uses two helix-turn-helix (HTH) motifs to bind tandem DNA operators, thereby blocking the expression of the associated genes, TetA and TetR. The structure of the class D TetR repressor protein [] involves 10 alpha-helices, with connecting turns and loops. 
The three N-terminal helices constitute the DNA-binding HTH domain, which has an inverse orientation compared with HTH motifs in other DNA-binding proteins. The core of the protein, formed by helices 5-10, is responsible for dimerisation and contains, for each monomer, a binding pocket that accommodates tetracycline in the presence of a divalent cation.; GO: 0045892 negative regulation of transcription, DNA-dependent; PDB: 2Y30_B 3ZQL_C 2Y31_B 2Y2Z_A 2VPR_A 3B6A_A 3B6C_A 2OPT_A 2NS7_B 2NS8_C ....
Probab=50.98 E-value=44 Score=21.43 Aligned_cols=44 Identities=14% Similarity=0.072 Sum_probs=32.0
Q ss_pred cHHHHHHHhhhhhHhhhhCChhHHHHHHHhhCcchHHHHHHHHHHh
Q 048367 27 NRFEWAVNIQRDSYASYVGHYPILAYFSIAENESIGREHYNFMQKM 72 (85)
Q Consensus 27 Tk~EW~tnihRDT~aS~~gH~~lL~Y~aiaenes~~r~r~~ll~kM 72 (85)
.=.||+...-+...+.+.-||.+..+++-....+.. .+.++|.|
T Consensus 7 ~W~~~l~~~a~~~r~~~~~hP~~~~~~~~~~~~~p~--~l~~~e~~ 50 (139)
T PF02909_consen 7 DWRERLRALARAYRAALLRHPWLAELLLARPPPGPN--ALRLMEAM 50 (139)
T ss_dssp EHHHHHHHHHHHHHHHHHTSTTHHHHHHTSSCTSHH--HHHHHHHH
T ss_pred CHHHHHHHHHHHHHHHHHHCcCHHHHHHhcCCCChh--HHHHHHHH
Confidence 457899999999999999999999997654333333 33444444
No 5
>COG4519 Uncharacterized protein conserved in bacteria [Function unknown]
Probab=50.58 E-value=13 Score=25.27 Aligned_cols=17 Identities=35% Similarity=0.485 Sum_probs=14.9
Q ss_pred cHHHHHHHhhhhhHhhh
Q 048367 27 NRFEWAVNIQRDSYASY 43 (85)
Q Consensus 27 Tk~EW~tnihRDT~aS~ 43 (85)
-+.||++.+|||..|++
T Consensus 75 ~~~~Wv~E~h~~IaAal 91 (95)
T COG4519 75 VRREWVVEQHRRIAAAL 91 (95)
T ss_pred hhHHHHHHHHHHHHHHh
Confidence 36799999999999986
No 6
>PF13215 DUF4023: Protein of unknown function (DUF4023)
Probab=49.64 E-value=22 Score=20.66 Aligned_cols=24 Identities=17% Similarity=0.320 Sum_probs=19.8
Q ss_pred chhHhHHHHHHHHHhhhcCCCCCC
Q 048367 2 SDRFNINSQLEHLQAKYVGTGHAD 25 (85)
Q Consensus 2 ~dk~~~~~qle~Lq~KY~GtG~~d 25 (85)
.||++..+.-...-.+..|.|||+
T Consensus 8 v~Kl~e~Q~K~e~Nk~~qG~G~P~ 31 (38)
T PF13215_consen 8 VEKLNETQEKQEKNKKHQGKGNPS 31 (38)
T ss_pred HHHHHHHHHHHHHHHhccCCCCch
Confidence 378888887777778889999997
No 7
>PF04614 Pex19: Pex19 protein family; InterPro: IPR006708 Peroxisome(s) form an intracellular compartment, bounded by a typical lipid bilayer membrane. Peroxisome functions are often specialised by organism and cell type; two widely distributed and well-conserved functions are H2O2-based respiration and fatty acid beta-oxidation. Other functions include ether lipid (plasmalogen) synthesis and cholesterol synthesis in animals, the glyoxylate cycle in germinating seeds ("glyoxysomes"), photorespiration in leaves, glycolysis in trypanosomes ("glycosomes"), and methanol and/or amine oxidation and assimilation in some yeasts. PEX genes encode the machinery ("peroxins") required to assemble the peroxisome. Membrane assembly and maintenance requires three of these (peroxins 3, 16, and 19) and may occur without the import of the matrix (lumen) enzymes. Matrix protein import follows a branched pathway of soluble recycling receptors, with one branch for each class of peroxisome targeting sequence (two are well characterised), and a common trunk for all. At least one of these receptors, Pex5p, enters and exits peroxisomes as it functions. Proliferation of the organelle is regulated by Pex11p. Peroxisome biogenesis is remarkably conserved among eukaryotes. A group of fatal, inherited neuropathologies are recognised as peroxisome biogenesis diseases. ; GO: 0005777 peroxisome; PDB: 2WL8_B 2W85_B.
Probab=42.78 E-value=8.6 Score=29.22 Aligned_cols=16 Identities=44% Similarity=0.692 Sum_probs=8.3
Q ss_pred HHHHHHHhhCCCCCCCC
Q 048367 65 HYNFMQKMLLPCGLPPE 81 (85)
Q Consensus 65 r~~ll~kM~~pcg~pp~ 81 (85)
=..|+.+| |-||.||+
T Consensus 201 i~~lmqem-Q~~G~PP~ 216 (248)
T PF04614_consen 201 IMELMQEM-QELGQPPE 216 (248)
T ss_dssp HHHHHHHH-HHT----G
T ss_pred HHHHHHHH-HHcCCCcH
Confidence 35677887 67999996
No 8
>PF07627 PSCyt3: Protein of unknown function (DUF1588); InterPro: IPR013039 A region of similarity shared by several Rhodopirellula baltica cytochrome-like proteins that are predicted to be secreted. These proteins also contain IPR011478 from INTERPRO, IPR013036 from INTERPRO, IPR013042 from INTERPRO and IPR013043 from INTERPRO.
Probab=40.21 E-value=31 Score=23.24 Aligned_cols=41 Identities=15% Similarity=0.143 Sum_probs=27.4
Q ss_pred hhhCChhHHHHHHHhhCc-chHHHHHHHHHHhhCCCCCCCCC
Q 048367 42 SYVGHYPILAYFSIAENE-SIGREHYNFMQKMLLPCGLPPER 82 (85)
Q Consensus 42 S~~gH~~lL~Y~aiaene-s~~r~r~~ll~kM~~pcg~pp~~ 82 (85)
-+.+|.++|+-.|-...- ++.|=++-+=+=|.+|-+|||+.
T Consensus 4 GlLt~~~~Lt~~s~~~~tsPv~RG~~v~~~lLc~~~ppPP~~ 45 (101)
T PF07627_consen 4 GLLTQGAFLTRTSDGDRTSPVHRGVWVRERLLCQPPPPPPPN 45 (101)
T ss_pred hhhhhHHHHhccCCCCCCCchHHHHHHHHHHcCCCCCCCCCC
Confidence 467888888888877644 55665554444566777777763
No 9
>PF03181 BURP: BURP domain; InterPro: IPR004873 The BURP domain is a ~230-residue module, which has been named for the four members of the group initially identified, BNM2, USP, RD22, and PG1beta. It is found in the C-terminal part of a number of plant cell wall proteins, which are defined not only by the BURP domain, but also by the overall similarity in their modular construction. The BURP domain proteins consist of either three or four modules: (i) an N-terminal hydrophobic domain - a presumptive transit peptide, joined to (ii) a short conserved segment or other short segment, (iii) an optional segment consisting of repeated units which is unique to each member, and (iv) the C-terminal BURP domain. Although the BURP domain proteins share primary structural features, their expression patterns and the conditions under which they are expressed differ. The presence of the conserved BURP domain in diverse plant proteins suggests an important and fundamental functional role for this domain []. It is possible that the BURP domain represents a general motif for localization of proteins within the cell wall matrix. The other structural domains associated with the BURP domain may specify other target sites for intermolecular interactions []. Some proteins known to contain a BURP domain are listed below [, , ]: Brassica protein BNM2, which is expressed during the induction of microspore embryogenesis. Field bean USPs, abundant non-storage seed proteins with unknown function. Soybean USP-like proteins ADR6 (or SALI5-4A), an auxin-repressible, aluminium-inducible protein and SALI3-2, a protein that is up-regulated by aluminium. Soybean seed coat BURP-domain protein 1 (SCB1). It might play a role in the differentiation of the seed coat parenchyma cells. Arabidopsis RD22 drought-induced protein. Maize ZRP2, a protein of unknown function in cortex parenchyma. Tomato PG1beta, the beta-subunit of polygalacturonase isozyme 1 (PG1), which is expressed in ripening fruits. 
Cereal RAFTIN. It is essential specifically for the maturation phase of pollen development.
Probab=38.32 E-value=36 Score=25.75 Aligned_cols=35 Identities=29% Similarity=0.517 Sum_probs=25.9
Q ss_pred ChhHHHHHHHhhCcchHHHHHHHHHHhhCCCCCCCCCCC
Q 048367 46 HYPILAYFSIAENESIGREHYNFMQKMLLPCGLPPERED 84 (85)
Q Consensus 46 H~~lL~Y~aiaenes~~r~r~~ll~kM~~pcg~pp~~~~ 84 (85)
-+.+|.+|+|..|-..+.. ++.-+.-|..||.+.|
T Consensus 46 l~~iL~~Fsi~~~S~~A~~----m~~Tl~~Ce~~~~~GE 80 (216)
T PF03181_consen 46 LPEILQMFSIPPGSPMAKA----MKNTLEECESPPIKGE 80 (216)
T ss_pred HHHHHHHhcCCCCCHHHHH----HHHHHHHhhcCCCCCc
Confidence 4789999999999877654 3444455998887655
No 10
>cd07594 BAR_Endophilin_B The Bin/Amphiphysin/Rvs (BAR) domain of Endophilin-B. BAR domains are dimerization, lipid binding and curvature sensing modules found in many different proteins with diverse functions. Endophilins play roles in synaptic vesicle formation, virus budding, mitochondrial morphology maintenance, receptor-mediated endocytosis inhibition, and endosomal sorting. Endophilins contain an N-terminal N-BAR domain (BAR domain with an additional N-terminal amphipathic helix), followed by a variable region containing proline clusters, and a C-terminal SH3 domain. They are classified into two types, A and B. Vertebrates contain two endophilin-B isoforms. Endophilin-B proteins are cytoplasmic proteins expressed mainly in the heart, placenta, and skeletal muscle.
Probab=34.81 E-value=94 Score=23.60 Aligned_cols=32 Identities=16% Similarity=0.184 Sum_probs=26.9
Q ss_pred HHHHHHHhhhhhHhhhhCChhHHHHHHHhhCcchHHHHHHHHHHh
Q 048367 28 RFEWAVNIQRDSYASYVGHYPILAYFSIAENESIGREHYNFMQKM 72 (85)
Q Consensus 28 k~EW~tnihRDT~aS~~gH~~lL~Y~aiaenes~~r~r~~ll~kM 72 (85)
..+|..+|++.|-+-+.+.|+. |++-.+.+||
T Consensus 29 ~~~~~e~i~~~~~~~lqpNp~~-------------r~~~~~~~k~ 60 (229)
T cd07594 29 TKVWTEKILKQTEAVLQPNPNV-------------RVEDFIYEKL 60 (229)
T ss_pred HHHHHHHHHHHHHHHhCCChhh-------------hHHHHHHHHh
Confidence 3589999999999999988754 7777788888
No 11
>PF15178 TOM_sub5: Mitochondrial import receptor subunit TOM5 homolog
Probab=31.21 E-value=1.3e+02 Score=18.42 Aligned_cols=36 Identities=19% Similarity=0.311 Sum_probs=27.5
Q ss_pred hcCCCCCCccHHHHHHHhhhhhHhhhhCChhHHHHHHHhh
Q 048367 18 YVGTGHADLNRFEWAVNIQRDSYASYVGHYPILAYFSIAE 57 (85)
Q Consensus 18 Y~GtG~~dTTk~EW~tnihRDT~aS~~gH~~lL~Y~aiae 57 (85)
-.|+| +-..-+|--....+|+++|+ .+.|-|.|+-.
T Consensus 4 ~egl~-pk~DPeE~k~kmR~dvissv---rnFliyVALlR 39 (51)
T PF15178_consen 4 IEGLG-PKMDPEEMKRKMREDVISSV---RNFLIYVALLR 39 (51)
T ss_pred cccCC-CCCCHHHHHHHHHHHHHHHH---HHHHHHHHHHH
Confidence 35666 44556788888999999998 57888988754
No 12
>PF12616 DUF3775: Protein of unknown function (DUF3775); InterPro: IPR022254 This domain family is found in bacteria, and is approximately 80 amino acids in length. There is a single completely conserved residue G that may be functionally important.
Probab=30.31 E-value=1.5e+02 Score=18.97 Aligned_cols=52 Identities=19% Similarity=0.347 Sum_probs=37.7
Q ss_pred hhHhHHHHHHHHHhhhcCCCCCCccHHHHHH--Hhh-----hhhHhhhhCChhHHHHHHHh
Q 048367 3 DRFNINSQLEHLQAKYVGTGHADLNRFEWAV--NIQ-----RDSYASYVGHYPILAYFSIA 56 (85)
Q Consensus 3 dk~~~~~qle~Lq~KY~GtG~~dTTk~EW~t--nih-----RDT~aS~~gH~~lL~Y~aia 56 (85)
+.++..+|.+..---|+|=| |-+-.||-. ..- ..|..=++|+|.+-.|+.=+
T Consensus 11 ~~l~~deqaeLvALmwiGRG--d~~~eew~~a~~~A~~~~~~~ta~YLl~~p~ladyLe~G 69 (75)
T PF12616_consen 11 EDLNEDEQAELVALMWIGRG--DFEAEEWEEAVAEARERASARTADYLLGTPMLADYLEEG 69 (75)
T ss_pred HhCCHHHHHHHHHHHHhcCC--CCCHHHHHHHHHHHHHhccchHHHHHHcCCcHHHHHHHH
Confidence 56788889999999999999 566666643 222 23455589999999987543
No 13
>PF05750 Rubella_Capsid: Rubella capsid protein; InterPro: IPR008819 Rubella virus is an enveloped positive-strand RNA virus of the family Togaviridae. Virions are composed of three structural proteins: a capsid and two membrane-spanning glycoproteins, E2 and E1. During virus assembly, the capsid interacts with genomic RNA to form nucleocapsids. It has been discovered that capsid phosphorylation serves to negatively regulate binding of viral genomic RNA. This may delay the initiation of nucleocapsid assembly until sufficient amounts of virus glycoproteins accumulate at the budding site and/or prevent non-specific binding to cellular RNA when levels of genomic RNA are low. It follows that at a late stage in replication, the capsid may undergo dephosphorylation before nucleocapsid assembly occurs []. This family is found together with IPR008820 from INTERPRO and IPR008821 from INTERPRO.; GO: 0016021 integral to membrane, 0019013 viral nucleocapsid
Probab=29.93 E-value=25 Score=27.50 Aligned_cols=12 Identities=42% Similarity=0.794 Sum_probs=9.7
Q ss_pred HhhCCCCCCCCC
Q 048367 71 KMLLPCGLPPER 82 (85)
Q Consensus 71 kM~~pcg~pp~~ 82 (85)
-|..||||-|+.
T Consensus 192 lmynpcgpeppa 203 (300)
T PF05750_consen 192 LMYNPCGPEPPA 203 (300)
T ss_pred hhcCCCCCCChh
Confidence 378999998864
No 14
>PF13978 DUF4223: Protein of unknown function (DUF4223)
Probab=27.85 E-value=41 Score=20.95 Aligned_cols=13 Identities=23% Similarity=0.598 Sum_probs=10.7
Q ss_pred HHHhhCCCCCCCC
Q 048367 69 MQKMLLPCGLPPE 81 (85)
Q Consensus 69 l~kM~~pcg~pp~ 81 (85)
|.||+-.|||+.+
T Consensus 43 ISKiIGGCGp~~~ 55 (56)
T PF13978_consen 43 ISKIIGGCGPAAQ 55 (56)
T ss_pred HHHHhcCCCCccc
Confidence 5899999999754
No 15
>PF11333 DUF3135: Protein of unknown function (DUF3135); InterPro: IPR021482 This family of proteins with unknown function appears to be restricted to Proteobacteria.
Probab=26.19 E-value=70 Score=20.84 Aligned_cols=32 Identities=9% Similarity=0.099 Sum_probs=25.0
Q ss_pred ChhHHHHHHHhhCcchH--HHHHHHHHHhhCCCC
Q 048367 46 HYPILAYFSIAENESIG--REHYNFMQKMLLPCG 77 (85)
Q Consensus 46 H~~lL~Y~aiaenes~~--r~r~~ll~kM~~pcg 77 (85)
.|+.-+...+|++.|.+ .+|.++++.|+.-|-
T Consensus 3 lp~FD~L~~LA~~dPe~fe~lr~~~~ee~I~~a~ 36 (83)
T PF11333_consen 3 LPDFDELKELAQNDPEAFEQLRQELIEEMIESAP 36 (83)
T ss_pred CCCHHHHHHHHHhCHHHHHHHHHHHHHHHHHhCC
Confidence 45666778899998874 499999999987653
No 16
>PF01152 Bac_globin: Bacterial-like globin; InterPro: IPR001486 Globins are haem-containing proteins involved in binding and/or transporting oxygen. They belong to a very large and well studied family that is widely distributed in many organisms []. Globins have evolved from a common ancestor and can be divided into three groups: single-domain globins, and two types of chimeric globins, flavohaemoglobins and globin-coupled sensors. Bacteria have all three types of globins, while archaea lack flavohaemoglobins, and eukaryotes lack globin-coupled sensors []. Several functionally different haemoglobins can coexist in the same species. The major types of globins include: Haemoglobin (Hb): tetramer of two alpha and two beta chains, although embryonic and foetal forms can substitute the alpha or beta chain for ones with higher oxygen affinity, such as gamma, delta, epsilon or zeta chains. Hb transports oxygen from lungs to other tissues in vertebrates []. Hb proteins are also present in unicellular organisms where they act as enzymes or sensors []. Myoglobin (Mb): monomeric protein responsible for oxygen storage in vertebrate muscle []. Neuroglobin: a myoglobin-like haemoprotein expressed in vertebrate brain and retina, where it is involved in neuroprotection from damage due to hypoxia or ischemia []. Neuroglobin belongs to a branch of the globin family that diverged early in evolution. Cytoglobin: an oxygen sensor expressed in multiple tissues. Related to neuroglobin []. Erythrocruorin: highly cooperative extracellular respiratory proteins found in annelids and arthropods that are assembled from as many as 180 subunits into hexagonal bilayers []. Leghaemoglobin (legHb or symbiotic Hb): occurs in the root nodules of leguminous plants, where it facilitates the diffusion of oxygen to symbiotic bacteroids in order to promote nitrogen fixation. Non-symbiotic haemoglobin (NsHb): occurs in non-leguminous plants, and can be over-expressed in stressed plants []. 
Flavohaemoglobins (FHb): chimeric, with an N-terminal globin domain and a C-terminal ferredoxin reductase-like NAD/FAD-binding domain. FHb provides protection against nitric oxide via its C-terminal domain, which transfers electrons to haem in the globin []. Globin-coupled sensors: chimeric, with an N-terminal myoglobin-like domain and a C-terminal domain that resembles the cytoplasmic signalling domain of bacterial chemoreceptors. They bind oxygen, and act to initiate an aerotactic response or regulate gene expression [, ]. Protoglobin: a single domain globin found in archaea that is related to the N-terminal domain of globin-coupled sensors []. Truncated 2/2 globin: lack the first helix, giving them a 2-over-2 instead of the canonical 3-over-3 alpha-helical sandwich fold. Can be divided into three main groups (I, II and III) based on structural features []. This entry represents a group of haemoglobin-like proteins found in eubacteria, cyanobacteria, protozoa, algae and plants, but not in animals or yeast. These proteins have a truncated 2-over-2 rather than the canonical 3-over-3 alpha-helical sandwich fold []. This entry includes: HbN (or GlbN): a truncated haemoglobin-like protein that binds oxygen cooperatively with a very high affinity and a slow dissociation rate, which may exclude it from oxygen transport. It appears to be involved in bacterial nitric oxide detoxification and in nitrosative stress []. Cyanoglobin (or GlbN): a truncated haemoprotein found in cyanobacteria that has high oxygen affinity, and which appears to serve as part of a terminal oxidase, rather than as a respiratory pigment []. HbO (or GlbO): a truncated haemoglobin-like protein with a lower oxygen affinity than HbN. HbO associates with the bacterial cell membrane, where it significantly increases oxygen uptake over membranes lacking this protein. 
HbO appears to interact with a terminal oxidase, and could participate in an oxygen/electron-transfer process that facilitates oxygen transfer during aerobic metabolism []. Glb3: a nuclear-encoded truncated haemoglobin from plants that appears more closely related to HbO than HbN. Glb3 from Arabidopsis thaliana (Mouse-ear cress) exhibits an unusual concentration-independent binding of oxygen and carbon dioxide []. ; GO: 0019825 oxygen binding, 0015671 oxygen transport; PDB: 2BKM_B 1UVY_A 1DLW_A 2XYK_B 2IG3_A 2GKM_B 1S61_A 1S56_B 1RTE_B 2GLN_A ....
Probab=25.85 E-value=1.1e+02 Score=19.62 Aligned_cols=48 Identities=13% Similarity=0.158 Sum_probs=34.6
Q ss_pred HHHHHhhhhhHhhhhCChhHHHHHHHhhCcchHHHHHHHHHHhhCCCCCCC
Q 048367 30 EWAVNIQRDSYASYVGHYPILAYFSIAENESIGREHYNFMQKMLLPCGLPP 80 (85)
Q Consensus 30 EW~tnihRDT~aS~~gH~~lL~Y~aiaenes~~r~r~~ll~kM~~pcg~pp 80 (85)
|=+..+...-|.-+.-++.+..||. +......+..+.+=|.+=||.|+
T Consensus 10 ~~I~~lv~~fY~rv~~d~~l~~~F~---~~d~~~~~~~~~~fl~~~~GGp~ 57 (120)
T PF01152_consen 10 EGIRALVDAFYDRVLADPRLKPFFE---GIDLEKHKEKQAEFLSQLLGGPP 57 (120)
T ss_dssp HHHHHHHHHHHHHHHT-TTTGGGGT---TSCHHHHHHHHHHHHHHHTTSSS
T ss_pred HHHHHHHHHHHHHHHcCHHHHhhcC---CCCHHHHHHHHHHHHHHHhCCCC
Confidence 3355667777888888888888887 66666777777777777788776
No 17
>KOG3541 consensus Predicted guanine nucleotide exchange factor [Signal transduction mechanisms]
Probab=25.21 E-value=49 Score=28.24 Aligned_cols=36 Identities=11% Similarity=0.220 Sum_probs=29.2
Q ss_pred hHhhhhCChhHHHHHHHhh--CcchHHHHHHHHHHhhC
Q 048367 39 SYASYVGHYPILAYFSIAE--NESIGREHYNFMQKMLL 74 (85)
Q Consensus 39 T~aS~~gH~~lL~Y~aiae--nes~~r~r~~ll~kM~~ 74 (85)
|+.+|+.|.+.|+|++--| -..+.+.|..||+.|++
T Consensus 262 sie~y~~wfn~Lsa~~Atevlk~~kk~~rsamlef~iD 299 (477)
T KOG3541|consen 262 SIERYMSWFNHLSALCATEVLKAAKKQTRSAMLEFLID 299 (477)
T ss_pred cHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHH
Confidence 4567889999999998777 34677899999999875
No 18
>KOG2468 consensus Dolichol kinase [Lipid transport and metabolism]
Probab=25.10 E-value=29 Score=29.83 Aligned_cols=10 Identities=40% Similarity=0.614 Sum_probs=9.1
Q ss_pred hhHhhhhCCh
Q 048367 38 DSYASYVGHY 47 (85)
Q Consensus 38 DT~aS~~gH~ 47 (85)
||.||++||.
T Consensus 423 DTmASiiG~r 432 (510)
T KOG2468|consen 423 DTMASIIGKR 432 (510)
T ss_pred hHHHHHHhhh
Confidence 8999999985
No 19
>TIGR00994 3a0901s05TIC20 chloroplast protein import component, Tic20 family. Two families of proteins are involved in the chloroplast envelope import apparatus. They are the three proteins of the outer membrane (TOC) and four proteins in the inner membrane (TIC). This family is specific for the Tic20 protein.
Probab=25.04 E-value=67 Score=25.54 Aligned_cols=24 Identities=25% Similarity=0.419 Sum_probs=20.3
Q ss_pred HHHHHHhhCcchHH-HHHHHHHHhh
Q 048367 50 LAYFSIAENESIGR-EHYNFMQKML 73 (85)
Q Consensus 50 L~Y~aiaenes~~r-~r~~ll~kM~ 73 (85)
+-|+.|+.|+.... +|||.++-|+
T Consensus 174 ~Lyl~VVRN~~iphFIRFNtMQAIL 198 (267)
T TIGR00994 174 LAYMWVVRRKEWPHFFRFHMMMGML 198 (267)
T ss_pred HHHHHHhcCCCcchhhhHHHHHHHH
Confidence 56889999998766 9999998875
No 20
>cd07617 BAR_Endophilin_B2 The Bin/Amphiphysin/Rvs (BAR) domain of Endophilin-B2. BAR domains are dimerization, lipid binding and curvature sensing modules found in many different proteins with diverse functions. Endophilins play roles in synaptic vesicle formation, virus budding, mitochondrial morphology maintenance, receptor-mediated endocytosis inhibition, and endosomal sorting. Endophilins contain an N-terminal N-BAR domain (BAR domain with an additional N-terminal amphipathic helix), followed by a variable region containing proline clusters, and a C-terminal SH3 domain. They are classified into two types, A and B. Vertebrates contain two endophilin-B isoforms. Endophilin-B proteins are cytoplasmic proteins expressed mainly in the heart, placenta, and skeletal muscle. Endophilin-B2, also called SH3GLB2 (SH3-domain GRB2-like endophilin B2), is a cytoplasmic protein that interacts with the apoptosis inducer Bax. It is overexpressed in prostate cancer metastasis and has been identified
Probab=23.34 E-value=1.8e+02 Score=22.25 Aligned_cols=33 Identities=15% Similarity=0.130 Sum_probs=28.7
Q ss_pred HHHHHHhhhhhHhhhhCChhHHHHHHHhhCcchHHHHHHHHHHhhC
Q 048367 29 FEWAVNIQRDSYASYVGHYPILAYFSIAENESIGREHYNFMQKMLL 74 (85)
Q Consensus 29 ~EW~tnihRDT~aS~~gH~~lL~Y~aiaenes~~r~r~~ll~kM~~ 74 (85)
.+|..+|.+.|-.-+.+.|+ .|++..|++||..
T Consensus 30 k~~~~~i~~~t~~~LqPNp~-------------~R~~~~~~~k~~~ 62 (220)
T cd07617 30 KNWTEKILRQTEVLLQPNPS-------------ARVEEFLYEKLDR 62 (220)
T ss_pred HHHHHHHHHHHHHHhCCChh-------------hhHHHHHHHHHhc
Confidence 47999999999999999886 6888888899855
No 21
>cd00252 SPARC_EC SPARC_EC; extracellular Ca2+ binding domain (containing 2 EF-hand motifs) of SPARC and related proteins (QR1, SC1/hevin, testican and tsc-36/FRP). SPARC (BM-40) is a multifunctional glycoprotein, a matricellular protein, that functions to regulate cell-matrix interactions; binds to such proteins as collagen and vitronectin and binds to endothelial cells thus inhibiting cellular proliferation. The EC domain interacts with a follistatin-like (FS) domain which appears to stabilize Ca2+ binding. The two EF-hands interact canonically but their conserved disulfide bonds confer a tight association between the EF-hand pair and an acid/amphiphilic N-terminal helix. Proposed active form involves a Ca2+ dependent symmetric homodimerization of EC-FS modules.
Probab=21.81 E-value=1.1e+02 Score=20.62 Aligned_cols=19 Identities=21% Similarity=0.293 Sum_probs=14.0
Q ss_pred HHHHHHhhhhhHhhh--hCCh
Q 048367 29 FEWAVNIQRDSYASY--VGHY 47 (85)
Q Consensus 29 ~EW~tnihRDT~aS~--~gH~ 47 (85)
-+|+.|+|+|++--. -||.
T Consensus 15 ~dW~~~~~~~~~~~~~~~~~~ 35 (116)
T cd00252 15 RDWFKNVHEDLKERDELEKHK 35 (116)
T ss_pred HHHHHHHHHHHhhcccchhhh
Confidence 489999999998533 4444
No 22
>PF13774 Longin: Regulated-SNARE-like domain; PDB: 1IOU_A 3BW6_A 1H8M_A 3EGX_C 2NUP_C 3EGD_C 2NUT_C 3KYQ_A 1IFQ_B 2VX8_D ....
Probab=21.69 E-value=17 Score=22.28 Aligned_cols=27 Identities=19% Similarity=0.369 Sum_probs=22.5
Q ss_pred hhHHHHHHHhhCcchHHHHHHHHHHhh
Q 048367 47 YPILAYFSIAENESIGREHYNFMQKML 73 (85)
Q Consensus 47 ~~lL~Y~aiaenes~~r~r~~ll~kM~ 73 (85)
.+=+.|++|+..+-+.|+-|.||+.+.
T Consensus 31 ~~~i~~~citd~~~~~r~aF~fL~~i~ 57 (83)
T PF13774_consen 31 EDGIAYLCITDKSYPKRVAFAFLEEIK 57 (83)
T ss_dssp ETTEEEEEEEETTS-HHHHHHHHHHHH
T ss_pred cCCeEEEEEEcCCCCcchHHHHHHHHH
Confidence 566789999999999999999999875
No 23
>PRK11188 rrmJ 23S rRNA methyltransferase J; Provisional
Probab=21.66 E-value=73 Score=22.97 Aligned_cols=22 Identities=14% Similarity=0.243 Sum_probs=16.9
Q ss_pred CccHHHHHHHhhhhhHhhhhCC
Q 048367 25 DLNRFEWAVNIQRDSYASYVGH 46 (85)
Q Consensus 25 dTTk~EW~tnihRDT~aS~~gH 46 (85)
..+...|+.++.||-|.+..-.
T Consensus 7 ~~~~~~~~~~~~~d~~~~~~~~ 28 (209)
T PRK11188 7 SASSSRWLQEHFSDKYVQQAQK 28 (209)
T ss_pred ccchHHHHHHhhcCHHHHHHhh
Confidence 3467789999999988776443
No 24
>KOG3821 consensus Heparin sulfate cell surface proteoglycan [Signal transduction mechanisms]
Probab=20.45 E-value=70 Score=27.90 Aligned_cols=45 Identities=13% Similarity=0.090 Sum_probs=25.7
Q ss_pred hhhhHhhhhCChhHHHHHHHhhCcchHHHHHHHHHHhhCCCCCCCCCC
Q 048367 36 QRDSYASYVGHYPILAYFSIAENESIGREHYNFMQKMLLPCGLPPERE 83 (85)
Q Consensus 36 hRDT~aS~~gH~~lL~Y~aiaenes~~r~r~~ll~kM~~pcg~pp~~~ 83 (85)
+++++-|.+.|-+.+=-=||-- ...=+-.+.+||.|-||+|-+.+
T Consensus 306 g~~~iesvl~~i~v~iseAIm~---~q~N~~~lt~kV~q~Cg~p~~~p 350 (563)
T KOG3821|consen 306 GPFNIESVLLPIHVKISEAIMA---AQENSDKLTAKVFQGCGPPKPTP 350 (563)
T ss_pred CcchHHHHHhhhhhHHHHHHHH---HHHhhHHHHHHHHhhcCCCCCCc
Confidence 4566666666654432111111 11134567899999999997643
Done!