Query psy394
Match_columns 66
No_of_seqs 136 out of 1027
Neff 8.6
Searched_HMMs 46136
Date Fri Aug 16 21:13:14 2013
Command hhsearch -i /work/01045/syshi/Psyhhblits/psy394.a3m -d /work/01045/syshi/HHdatabase/Cdd.hhm -o /work/01045/syshi/hhsearch_cdd/394hhsearch_cdd -cpu 12 -v 0
No Hit Prob E-value P-value Score SS Cols Query HMM Template HMM
1 KOG1542|consensus 99.9 2.3E-23 4.9E-28 131.3 4.2 61 1-66 119-180 (372)
2 PTZ00203 cathepsin L protease; 99.7 5.5E-18 1.2E-22 108.0 5.1 63 1-66 85-149 (348)
3 PTZ00021 falcipain-2; Provisio 99.6 3.2E-16 6.8E-21 103.3 4.8 66 1-66 217-289 (489)
4 PTZ00200 cysteine proteinase; 99.6 1.6E-15 3.4E-20 99.3 4.6 24 43-66 234-258 (448)
5 KOG1543|consensus 99.5 3.9E-14 8.4E-19 89.7 4.2 59 1-66 74-133 (325)
6 cd02621 Peptidase_C1A_Cathepsi 99.1 3.2E-11 7E-16 73.4 1.8 24 43-66 1-28 (243)
7 smart00645 Pept_C1 Papain fami 99.1 5.4E-11 1.2E-15 69.6 1.8 24 43-66 1-24 (174)
8 cd02698 Peptidase_C1A_Cathepsi 99.0 3.2E-10 6.8E-15 69.1 2.0 24 43-66 1-30 (239)
9 PF00112 Peptidase_C1: Papain 98.9 4E-10 8.6E-15 66.6 1.8 24 43-66 1-25 (219)
10 cd02248 Peptidase_C1A Peptidas 98.9 1E-09 2.2E-14 65.0 1.9 23 44-66 1-23 (210)
11 cd02620 Peptidase_C1A_Cathepsi 98.8 1.3E-09 2.8E-14 66.3 1.4 23 44-66 1-27 (236)
12 PTZ00049 cathepsin C-like prot 98.8 3.1E-09 6.7E-14 72.7 1.8 27 40-66 378-408 (693)
13 PTZ00364 dipeptidyl-peptidase 98.8 4.1E-09 8.9E-14 70.8 2.3 27 40-66 202-234 (548)
14 COG4870 Cysteine protease [Pos 98.0 2.4E-06 5.3E-11 55.1 1.2 25 42-66 98-122 (372)
15 KOG1544|consensus 95.8 0.0024 5.3E-08 41.5 -0.1 27 40-66 206-234 (470)
16 PTZ00462 Serine-repeat antigen 92.2 0.065 1.4E-06 39.1 1.0 18 49-66 535-555 (1004)
17 PF05391 Lsm_interact: Lsm int 79.6 2 4.3E-05 16.9 1.5 12 5-16 10-21 (21)
18 smart00002 PLP Myelin proteoli 70.3 1.4 3E-05 21.9 -0.0 20 46-66 23-44 (60)
19 PLN00165 hypothetical protein; 68.6 1 2.2E-05 24.0 -0.7 13 52-64 14-26 (88)
20 KOG1297|consensus 56.2 4.9 0.00011 24.8 0.5 21 46-66 95-115 (249)
21 PF01320 Colicin_Pyocin: Colic 56.1 9.1 0.0002 20.3 1.5 14 1-14 6-19 (85)
22 PF10655 DUF2482: Hypothetical 51.3 9 0.00019 20.7 1.0 14 2-15 5-18 (100)
23 TIGR02792 PCA_ligA protocatech 50.5 12 0.00027 20.9 1.5 15 2-16 94-108 (117)
24 PRK12702 mannosyl-3-phosphogly 49.0 13 0.00027 24.2 1.5 17 2-19 121-137 (302)
25 PF09851 SHOCT: Short C-termin 48.4 19 0.0004 15.1 1.6 11 3-13 15-25 (31)
26 PF04369 Lactococcin: Lactococ 45.6 14 0.00031 18.3 1.1 15 2-16 7-21 (60)
27 PF08992 QH-AmDH_gamma: Quinoh 43.5 10 0.00022 19.6 0.4 12 46-57 64-75 (78)
28 PRK13377 protocatechuate 4,5-d 42.9 21 0.00045 20.4 1.6 15 2-16 100-114 (129)
29 PF13986 DUF4224: Domain of un 41.9 25 0.00054 16.3 1.6 15 5-20 3-17 (47)
30 PF09725 Fra10Ac1: Folate-sens 39.8 6.6 0.00014 22.1 -0.6 21 46-66 55-75 (118)
31 PF13900 GVQW: Putative bindin 38.4 15 0.00032 17.4 0.5 13 40-52 26-38 (48)
32 cd07924 PCA_45_Doxase_A The A 35.9 29 0.00063 19.6 1.5 15 2-16 97-111 (121)
33 PF02209 VHP: Villin headpiece 34.4 38 0.00083 14.9 1.5 10 5-14 2-11 (36)
34 PF07606 DUF1569: Protein of u 34.2 39 0.00085 19.4 1.9 14 1-14 125-138 (152)
35 PF00376 MerR: MerR family reg 33.2 16 0.00035 16.1 0.2 13 49-61 19-32 (38)
36 cd04761 HTH_MerR-SF Helix-Turn 32.6 22 0.00048 15.6 0.6 12 49-60 20-31 (49)
37 PF07110 EthD: EthD domain; I 31.9 36 0.00078 17.0 1.4 12 4-15 2-13 (95)
38 PF12368 DUF3650: Protein of u 31.8 50 0.0011 13.8 1.7 11 4-14 15-25 (28)
39 PF13647 Glyco_hydro_80: Glyco 31.8 21 0.00046 22.1 0.6 13 54-66 218-230 (308)
40 KOG0816|consensus 31.6 24 0.00052 21.8 0.8 16 1-16 103-118 (223)
41 PF09056 Phospholip_A2_3: Prok 29.4 23 0.00051 19.6 0.5 15 43-57 22-36 (111)
42 PF10685 KGG: Stress-induced b 28.7 51 0.0011 13.1 1.5 12 2-13 2-13 (23)
43 KOG3032|consensus 28.7 19 0.0004 22.8 -0.0 11 45-55 253-263 (264)
44 KOG2479|consensus 28.0 18 0.00038 25.0 -0.2 21 42-62 403-430 (549)
45 PRK13372 pcmA protocatechuate 27.5 44 0.00096 22.9 1.6 14 2-15 100-113 (444)
46 PF04833 COBRA: COBRA-like pro 26.9 30 0.00066 20.6 0.7 18 47-66 32-49 (169)
47 KOG2476|consensus 26.9 24 0.00052 24.5 0.2 11 41-51 492-502 (528)
48 PF11918 DUF3436: Domain of un 26.6 77 0.0017 15.4 1.9 14 2-15 35-48 (55)
49 COG2832 Uncharacterized protei 25.7 23 0.00051 19.9 0.0 14 44-57 58-71 (119)
50 smart00153 VHP Villin headpiec 25.7 69 0.0015 14.0 1.6 10 5-14 2-11 (36)
51 TIGR01847 bacteriocin_sig bact 25.4 64 0.0014 13.2 1.3 17 2-18 1-17 (26)
52 smart00422 HTH_MERR helix_turn 25.3 35 0.00076 16.2 0.7 10 49-58 20-29 (70)
53 PF06574 FAD_syn: FAD syntheta 25.0 51 0.0011 19.0 1.4 15 2-16 88-102 (157)
54 PF15649 Tox-REase-7: Restrict 25.0 29 0.00062 18.4 0.3 20 40-60 30-49 (87)
55 PF08127 Propeptide_C1: Peptid 24.7 53 0.0012 14.8 1.1 15 2-17 22-36 (41)
56 PRK15431 ferrous iron transpor 24.6 29 0.00063 18.1 0.3 17 48-64 39-56 (78)
57 PF09012 FeoC: FeoC like trans 23.3 19 0.00042 17.6 -0.5 15 49-63 38-52 (69)
58 PF08838 DUF1811: Protein of u 23.2 38 0.00083 18.6 0.6 13 2-14 4-16 (102)
59 PF02845 CUE: CUE domain; Int 23.2 62 0.0013 14.2 1.2 14 1-14 11-24 (42)
60 PF02276 CytoC_RC: Photosynthe 23.0 65 0.0014 21.2 1.6 13 2-14 60-72 (314)
61 PF05443 ROS_MUCR: ROS/MUCR tr 22.9 57 0.0012 18.6 1.3 14 4-18 94-107 (132)
62 PF13024 DUF3884: Protein of u 22.7 69 0.0015 16.6 1.4 14 3-16 42-55 (77)
63 PF12958 DUF3847: Protein of u 22.6 71 0.0015 16.9 1.5 11 4-14 62-72 (86)
64 PF07105 DUF1367: Protein of u 21.6 78 0.0017 19.4 1.7 15 2-16 151-165 (196)
65 cd04774 HTH_YfmP Helix-Turn-He 21.2 48 0.001 17.4 0.7 13 49-61 20-32 (96)
66 cd01277 HINT_subgroup HINT (hi 20.9 72 0.0016 16.3 1.4 12 2-13 48-59 (103)
67 PF03047 ComC: COMC family; I 20.8 33 0.00071 14.8 0.0 15 2-16 11-25 (32)
68 PF12162 STAT1_TAZ2bind: STAT1 20.7 82 0.0018 12.5 1.3 9 5-13 10-18 (23)
69 cd04766 HTH_HspR Helix-Turn-He 20.1 50 0.0011 17.0 0.6 13 49-61 21-33 (91)
No 1
>KOG1542|consensus
Probab=99.88 E-value=2.3e-23 Score=131.32 Aligned_cols=61 Identities=36% Similarity=0.605 Sum_probs=45.1
Q ss_pred CCcCCCHHHHHHHHcCCCCC-cccccccccccceeeeCCCCCCCCCcccccccCCCCCCCcccCCCC
Q psy394 1 MSINWLHHEFVHMMNGFKRS-TRLLGTERVEEGVTYIAPDNVKLPEEVDWRNKGAVTPIKDQGQCYK 66 (66)
Q Consensus 1 ~fsDlt~eEf~~~~~~~~~~-~~~~~~~~~~~~~~~~~~~~~~~P~~~DWR~~g~Vt~Vk~Qg~CGS 66 (66)
.|||||.|||+++|++.+.. ... ..............+|++||||++|+||||||||+|||
T Consensus 119 qFSDlT~eEFkk~~l~~~~~~~~~-----~~~~~~~~~~~~~~lP~~fDWR~kgaVTpVKnQG~CGS 180 (372)
T KOG1542|consen 119 QFSDLTEEEFKKIYLGVKRRGSKL-----PGDAAEAPIEPGESLPESFDWRDKGAVTPVKNQGMCGS 180 (372)
T ss_pred chhhcCHHHHHHHhhccccccccC-----ccccccCcCCCCCCCCcccchhccCCccccccCCcCcc
Confidence 49999999999999876543 111 11111111233568999999999999999999999997
No 2
>PTZ00203 cathepsin L protease; Provisional
Probab=99.73 E-value=5.5e-18 Score=107.96 Aligned_cols=63 Identities=32% Similarity=0.481 Sum_probs=40.9
Q ss_pred CCcCCCHHHHHHHHcCCCCCcccccccccccceeeeC--CCCCCCCCcccccccCCCCCCCcccCCCC
Q psy394 1 MSINWLHHEFVHMMNGFKRSTRLLGTERVEEGVTYIA--PDNVKLPEEVDWRNKGAVTPIKDQGQCYK 66 (66)
Q Consensus 1 ~fsDlt~eEf~~~~~~~~~~~~~~~~~~~~~~~~~~~--~~~~~~P~~~DWR~~g~Vt~Vk~Qg~CGS 66 (66)
.|+|||.|||.+++++........ .......+.. ....++|++||||++|+|+||||||.|||
T Consensus 85 ~FaDlT~eEf~~~~l~~~~~~~~~---~~~~~~~~~~~~~~~~~lP~~~DWR~~g~VtpVkdQg~CGS 149 (348)
T PTZ00203 85 KFFDLSEAEFAARYLNGAAYFAAA---KQHAGQHYRKARADLSAVPDAVDWREKGAVTPVKNQGACGS 149 (348)
T ss_pred ccccCCHHHHHHHhcCCCcccccc---cccccccccccccccccCCCCCcCCcCCCCCCccccCCCcc
Confidence 499999999998776422111100 0000011111 11236899999999999999999999997
No 3
>PTZ00021 falcipain-2; Provisional
Probab=99.63 E-value=3.2e-16 Score=103.29 Aligned_cols=66 Identities=24% Similarity=0.283 Sum_probs=40.4
Q ss_pred CCcCCCHHHHHHHHcCCCCC-cccccccccc--c-c---eeeeCCCCCCCCCcccccccCCCCCCCcccCCCC
Q psy394 1 MSINWLHHEFVHMMNGFKRS-TRLLGTERVE--E-G---VTYIAPDNVKLPEEVDWRNKGAVTPIKDQGQCYK 66 (66)
Q Consensus 1 ~fsDlt~eEf~~~~~~~~~~-~~~~~~~~~~--~-~---~~~~~~~~~~~P~~~DWR~~g~Vt~Vk~Qg~CGS 66 (66)
.|+|||.|||.++|++.... .......... . . ..+.......+|.++|||++|+||||||||.|||
T Consensus 217 qFsDlT~EEF~~~~l~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~P~s~DWR~~g~VtpVKdQG~CGS 289 (489)
T PTZ00021 217 RFGDLSFEEFKKKYLTLKSFDFKSNGKKSPRVINYDDVIKKYKPKDATFDHAKYDWRLHNGVTPVKDQKNCGS 289 (489)
T ss_pred ccccCCHHHHHHHhccccccccccccccccccccccccccccccccccCCccccccccCCCCCCccccccccc
Confidence 39999999999988764321 1100000000 0 0 0011111112499999999999999999999997
No 4
>PTZ00200 cysteine proteinase; Provisional
Probab=99.59 E-value=1.6e-15 Score=99.27 Aligned_cols=24 Identities=50% Similarity=0.855 Sum_probs=23.1
Q ss_pred CCCcccccccCCCCCCCccc-CCCC
Q psy394 43 LPEEVDWRNKGAVTPIKDQG-QCYK 66 (66)
Q Consensus 43 ~P~~~DWR~~g~Vt~Vk~Qg-~CGS 66 (66)
+|++||||++|+|+|||||| .|||
T Consensus 234 ~P~~~DWR~~g~vtpVkdQG~~CGS 258 (448)
T PTZ00200 234 TGEGLDWRRADAVTKVKDQGLNCGS 258 (448)
T ss_pred CCCCccCCCCCCCCCcccCCCccch
Confidence 69999999999999999999 9997
No 5
>KOG1543|consensus
Probab=99.48 E-value=3.9e-14 Score=89.74 Aligned_cols=59 Identities=36% Similarity=0.499 Sum_probs=41.2
Q ss_pred CCcCCCHHHHHHHHcCCCCCcccccccccccceeeeCCCCCCCCCcccccccCCCCC-CCcccCCCC
Q psy394 1 MSINWLHHEFVHMMNGFKRSTRLLGTERVEEGVTYIAPDNVKLPEEVDWRNKGAVTP-IKDQGQCYK 66 (66)
Q Consensus 1 ~fsDlt~eEf~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~P~~~DWR~~g~Vt~-Vk~Qg~CGS 66 (66)
.|+|+|.+||.+.+.+....... . ...........+|++||||++|+|++ |||||+|||
T Consensus 74 ~~~d~~~ee~~~~~~~~~~~~~~-----~--~~~~~~~~~~~~p~s~DwR~~~~~~~~vkdQg~Cgs 133 (325)
T KOG1543|consen 74 QFADLTTEEFKRKKTGKKPPEIK-----R--DKFTEKLDGDDLPDSFDWRDKGAVTPPVKDQGSCGS 133 (325)
T ss_pred cccccchHHHHHhhccccCcccc-----c--cccccccchhhCCCCccccccCCcCCCcCCCCcCcc
Confidence 38999999999988765433210 0 01111122357999999999987665 999999997
No 6
>cd02621 Peptidase_C1A_CathepsinC Cathepsin C; also known as Dipeptidyl Peptidase I (DPPI), an atypical papain-like cysteine peptidase with chloride dependency and dipeptidyl aminopeptidase activity, resulting from its tetrameric structure which limits substrate access. Each subunit of the tetramer is composed of three peptides: the heavy and light chains, which together adopt the papain fold and form the catalytic domain; and the residual propeptide region, which forms a beta barrel and points towards the substrate's N-terminus. The subunit composition is the result of the unique characteristic of procathepsin C maturation involving the cleavage of the catalytic domain and the non-autocatalytic excision of an activation peptide within its propeptide region. By removing N-terminal dipeptide extensions, cathepsin C activates granule serine peptidases (granzymes) involved in cell-mediated apoptosis, inflammation and tissue remodelling. Loss-of-function mutations in cathepsin C are assoc
Probab=99.10 E-value=3.2e-11 Score=73.44 Aligned_cols=24 Identities=38% Similarity=0.918 Sum_probs=22.9
Q ss_pred CCCcccccccC----CCCCCCcccCCCC
Q psy394 43 LPEEVDWRNKG----AVTPIKDQGQCYK 66 (66)
Q Consensus 43 ~P~~~DWR~~g----~Vt~Vk~Qg~CGS 66 (66)
||++||||+.+ +|+||||||.|||
T Consensus 1 lP~~fDwr~~~~~~~~v~~v~dQg~CGs 28 (243)
T cd02621 1 LPKSFDWGDVNNGFNYVSPVRNQGGCGS 28 (243)
T ss_pred CCCcccccccCCCCcccccCCCCCcCcc
Confidence 69999999998 9999999999997
No 7
>smart00645 Pept_C1 Papain family cysteine protease.
Probab=99.07 E-value=5.4e-11 Score=69.58 Aligned_cols=24 Identities=75% Similarity=1.334 Sum_probs=22.9
Q ss_pred CCCcccccccCCCCCCCcccCCCC
Q psy394 43 LPEEVDWRNKGAVTPIKDQGQCYK 66 (66)
Q Consensus 43 ~P~~~DWR~~g~Vt~Vk~Qg~CGS 66 (66)
||.+||||++++++||||||.|||
T Consensus 1 lP~~~D~R~~~~~~~v~dQg~CGs 24 (174)
T smart00645 1 LPESFDWRKKGAVTPVKDQGQCGS 24 (174)
T ss_pred CCCcCcccccCCCCccccCcccch
Confidence 699999999999999999999996
No 8
>cd02698 Peptidase_C1A_CathepsinX Cathepsin X; the only papain-like lysosomal cysteine peptidase exhibiting carboxymonopeptidase activity. It can also act as a carboxydipeptidase, like cathepsin B, but has been shown to preferentially cleave substrates through a monopeptidyl carboxypeptidase pathway. The propeptide region of cathepsin X, the shortest among papain-like peptidases, is covalently attached to the active site cysteine in the inactive form of the enzyme. Little is known about the biological function of cathepsin X. Some studies point to a role in early tumorigenesis. A more recent study indicates that cathepsin X expression is restricted to immune cells suggesting a role in phagocytosis and the regulation of the immune response.
Probab=98.96 E-value=3.2e-10 Score=69.10 Aligned_cols=24 Identities=42% Similarity=0.810 Sum_probs=22.6
Q ss_pred CCCcccccccC---CCCCCCccc---CCCC
Q psy394 43 LPEEVDWRNKG---AVTPIKDQG---QCYK 66 (66)
Q Consensus 43 ~P~~~DWR~~g---~Vt~Vk~Qg---~CGS 66 (66)
||++||||+++ +|+|||||| .|||
T Consensus 1 lP~~~Dwr~~~~~~~v~~vk~Qg~~~~CGs 30 (239)
T cd02698 1 LPKSWDWRNVNGVNYVSPTRNQHIPQYCGS 30 (239)
T ss_pred CCCCcccccCCCCcccCccccCCCCCCCCc
Confidence 69999999998 999999998 8997
No 9
>PF00112 Peptidase_C1: Papain family cysteine protease This is family C1 in the peptidase classification. ; InterPro: IPR000668 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. 
These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad. This group of proteins belongs to the peptidase family C1, sub-family C1A (papain family, clan CA). It includes proteins classed as non-peptidase homologs. These have either been shown experimentally to lack peptidase activity or lack one or more of the active site residues. The papain family has a wide variety of activities, including broad-range (papain) and narrow-range endo-peptidases, aminopeptidases, dipeptidyl peptidases and enzymes with both exo- and endo-peptidase activity. Members of the papain family are widespread, found in baculovirus, eubacteria, yeast, and practically all protozoa, plants and mammals. The proteins are typically lysosomal or secreted, and proteolytic cleavage of the propeptide is required for enzyme activation, although bleomycin hydrolase is cytosolic in fungi and mammals. Papain-like cysteine proteinases are essentially synthesised as inactive proenzymes (zymogens) with N-terminal propeptide regions. The activation process of these enzymes includes the removal of propeptide regions. The propeptide regions serve a variety of functions in vivo and in vitro. The pro-region is required for the proper folding of the newly synthesised enzyme, the inactivation of the peptidase domain and stabilisation of the enzyme against denaturing at neutral to alkaline pH conditions. Amino acid residues within the pro-region mediate their membrane association, and play a role in the transport of the proenzyme to lysosomes. Among the most notable features of propeptides is their ability to inhibit the activity of their cognate enzymes and that certain propeptides exhibit high selectivity for inhibition of the peptidases from which they originate.
The catalytic residues of papain are Cys-25 and His-159, other important residues being Gln-19, which helps form the 'oxyanion hole', and Asn-175, which orientates the imidazole ring of His-159. ; GO: 0008234 cysteine-type peptidase activity, 0006508 proteolysis; PDB: 3MOR_B 3HHI_B 1S4V_A 3F75_A 1MEG_A 1PCI_C 1PPO_A 3HD3_B 1F29_A 1EWL_A ....
Probab=98.93 E-value=4e-10 Score=66.61 Aligned_cols=24 Identities=54% Similarity=1.233 Sum_probs=19.4
Q ss_pred CCCccccccc-CCCCCCCcccCCCC
Q psy394 43 LPEEVDWRNK-GAVTPIKDQGQCYK 66 (66)
Q Consensus 43 ~P~~~DWR~~-g~Vt~Vk~Qg~CGS 66 (66)
||++||||+. |.++||||||.|||
T Consensus 1 lP~~~D~r~~~~~~~~v~dQg~~gs 25 (219)
T PF00112_consen 1 LPKSFDWRDKGGRITPVRDQGSCGS 25 (219)
T ss_dssp STSSEEGGGTTTCSG---BTTSSBT
T ss_pred CCCCEecccCCCCcCccccCCcccc
Confidence 7999999998 48999999999996
No 10
>cd02248 Peptidase_C1A Peptidase C1A subfamily (MEROPS database nomenclature); composed of cysteine peptidases (CPs) similar to papain, including the mammalian CPs (cathepsins B, C, F, H, L, K, O, S, V, X and W). Papain is an endopeptidase with specific substrate preferences, primarily for bulky hydrophobic or aromatic residues at the S2 subsite, a hydrophobic pocket in papain that accommodates the P2 sidechain of the substrate (the second residue away from the scissile bond). Most members of the papain subfamily are endopeptidases. Some exceptions to this rule can be explained by specific details of the catalytic domains like the occluding loop in cathepsin B which confers an additional carboxydipeptidyl activity and the mini-chain of cathepsin H resulting in an N-terminal exopeptidase activity. Papain-like CPs have different functions in various organisms. Plant CPs are used to mobilize storage proteins in seeds. Parasitic CPs act extracellularly to help invade tissues and cells, to h
Probab=98.87 E-value=1e-09 Score=65.00 Aligned_cols=23 Identities=74% Similarity=1.347 Sum_probs=22.0
Q ss_pred CCcccccccCCCCCCCcccCCCC
Q psy394 44 PEEVDWRNKGAVTPIKDQGQCYK 66 (66)
Q Consensus 44 P~~~DWR~~g~Vt~Vk~Qg~CGS 66 (66)
|++||||+.+.++||+|||.|||
T Consensus 1 P~~~d~r~~~~~~~v~dQg~cgs 23 (210)
T cd02248 1 PESVDWREKGAVTPVKDQGSCGS 23 (210)
T ss_pred CCcccCCcCCCCCCCccCCCCcc
Confidence 78999999999999999999996
No 11
>cd02620 Peptidase_C1A_CathepsinB Cathepsin B group; composed of cathepsin B and similar proteins, including tubulointerstitial nephritis antigen (TIN-Ag). Cathepsin B is a lysosomal papain-like cysteine peptidase which is expressed in all tissues and functions primarily as an exopeptidase through its carboxydipeptidyl activity. Together with other cathepsins, it is involved in the degradation of proteins, proenzyme activation, Ag processing, metabolism and apoptosis. Cathepsin B has been implicated in a number of human diseases such as cancer, rheumatoid arthritis, osteoporosis and Alzheimer's disease. The unique carboxydipeptidyl activity of cathepsin B is attributed to the presence of an occluding loop in its active site which favors the binding of the C-termini of substrate proteins. Some members of this group do not possess the occluding loop. TIN-Ag is an extracellular matrix basement protein which was originally identified as a target Ag involved in anti-tubular basement membrane
Probab=98.83 E-value=1.3e-09 Score=66.30 Aligned_cols=23 Identities=43% Similarity=0.803 Sum_probs=20.2
Q ss_pred CCccccccc--CCCC--CCCcccCCCC
Q psy394 44 PEEVDWRNK--GAVT--PIKDQGQCYK 66 (66)
Q Consensus 44 P~~~DWR~~--g~Vt--~Vk~Qg~CGS 66 (66)
|++||||++ ++++ ||||||.|||
T Consensus 1 p~~~DwR~~~~~~~~v~~v~dQg~CGs 27 (236)
T cd02620 1 PESFDAREKWPNCISIGEIRDQGNCGS 27 (236)
T ss_pred CCcccchhhCCCCCCccccCCcccchh
Confidence 889999997 5655 9999999997
No 12
>PTZ00049 cathepsin C-like protein; Provisional
Probab=98.76 E-value=3.1e-09 Score=72.71 Aligned_cols=27 Identities=19% Similarity=0.342 Sum_probs=24.1
Q ss_pred CCCCCCccccccc----CCCCCCCcccCCCC
Q psy394 40 NVKLPEEVDWRNK----GAVTPIKDQGQCYK 66 (66)
Q Consensus 40 ~~~~P~~~DWR~~----g~Vt~Vk~Qg~CGS 66 (66)
..+||.+||||++ ++|+||||||.|||
T Consensus 378 ~~~LP~sfDWRd~~~~~~~vtpVkdQG~CGS 408 (693)
T PTZ00049 378 IDELPKNFTWGDPFNNNTREYDVTNQLLCGS 408 (693)
T ss_pred cccCCCCEecCcCCCCCCcccCCCCCccCcH
Confidence 3579999999985 67999999999997
No 13
>PTZ00364 dipeptidyl-peptidase I precursor; Provisional
Probab=98.76 E-value=4.1e-09 Score=70.79 Aligned_cols=27 Identities=15% Similarity=0.352 Sum_probs=24.6
Q ss_pred CCCCCCcccccccC---CCCCCCcccC---CCC
Q psy394 40 NVKLPEEVDWRNKG---AVTPIKDQGQ---CYK 66 (66)
Q Consensus 40 ~~~~P~~~DWR~~g---~Vt~Vk~Qg~---CGS 66 (66)
..++|++||||++| +|+||||||. |||
T Consensus 202 ~~~LP~sfDWR~~gg~~~VtpVrdQg~~~~CGS 234 (548)
T PTZ00364 202 GDPPPAAWSWGDVGGASFLPAAPPASPGRGCNS 234 (548)
T ss_pred ccCCCCccccCcCCCCccCCCCcCCCCCCCCcC
Confidence 46799999999998 6999999999 997
No 14
>COG4870 Cysteine protease [Posttranslational modification, protein turnover, chaperones]
Probab=97.99 E-value=2.4e-06 Score=55.15 Aligned_cols=25 Identities=44% Similarity=0.718 Sum_probs=23.5
Q ss_pred CCCCcccccccCCCCCCCcccCCCC
Q psy394 42 KLPEEVDWRNKGAVTPIKDQGQCYK 66 (66)
Q Consensus 42 ~~P~~~DWR~~g~Vt~Vk~Qg~CGS 66 (66)
.+|..+|||..|-|+|||+||.|||
T Consensus 98 s~~~~fd~r~~g~vs~v~dQg~~Gs 122 (372)
T COG4870 98 SLPSYFDRRDEGKVSPVKDQGSGGS 122 (372)
T ss_pred cchhheeeeccCCcccccccCcccc
Confidence 4899999999999999999999996
No 15
>KOG1544|consensus
Probab=95.77 E-value=0.0024 Score=41.53 Aligned_cols=27 Identities=30% Similarity=0.431 Sum_probs=24.6
Q ss_pred CCCCCCccccccc--CCCCCCCcccCCCC
Q psy394 40 NVKLPEEVDWRNK--GAVTPIKDQGQCYK 66 (66)
Q Consensus 40 ~~~~P~~~DWR~~--g~Vt~Vk~Qg~CGS 66 (66)
...||+.||-+.+ +.+.++-|||.|++
T Consensus 206 ~~~LPE~F~As~KWp~liH~plDQgnCa~ 234 (470)
T KOG1544|consen 206 GEVLPEAFEASEKWPNLIHEPLDQGNCAG 234 (470)
T ss_pred ccccchhhhhhhcCCccccCccccCCccc
Confidence 4679999999998 88999999999985
No 16
>PTZ00462 Serine-repeat antigen protein; Provisional
Probab=92.23 E-value=0.065 Score=39.12 Aligned_cols=18 Identities=28% Similarity=0.573 Sum_probs=13.2
Q ss_pred cccc-CC--CCCCCcccCCCC
Q psy394 49 WRNK-GA--VTPIKDQGQCYK 66 (66)
Q Consensus 49 WR~~-g~--Vt~Vk~Qg~CGS 66 (66)
|.++ .| ..||||||.|||
T Consensus 535 ~kD~~sC~s~i~VKDQG~CGS 555 (1004)
T PTZ00462 535 LKDENNCISKIQIEDQGNCAI 555 (1004)
T ss_pred cccCCCCCCCCCcccCCcchH
Confidence 4443 45 578999999996
No 17
>PF05391 Lsm_interact: Lsm interaction motif; InterPro: IPR008669 This short motif is found at the C terminus of Prp24 proteins and probably interacts with the Lsm proteins to promote U4/U6 formation.
Probab=79.64 E-value=2 Score=16.92 Aligned_cols=12 Identities=17% Similarity=0.172 Sum_probs=9.6
Q ss_pred CCHHHHHHHHcC
Q psy394 5 WLHHEFVHMMNG 16 (66)
Q Consensus 5 lt~eEf~~~~~~ 16 (66)
++.++|.+++++
T Consensus 10 ~SNddFrkmfl~ 21 (21)
T PF05391_consen 10 KSNDDFRKMFLK 21 (21)
T ss_pred cchHHHHHHHcC
Confidence 788999998753
No 18
>smart00002 PLP Myelin proteolipid protein (PLP or lipophilin).
Probab=70.33 E-value=1.4 Score=21.90 Aligned_cols=20 Identities=35% Similarity=0.521 Sum_probs=13.7
Q ss_pred cccccccCCCCCCCc-cc-CCCC
Q psy394 46 EVDWRNKGAVTPIKD-QG-QCYK 66 (66)
Q Consensus 46 ~~DWR~~g~Vt~Vk~-Qg-~CGS 66 (66)
=+|-|+.|+| |+.. .| .||+
T Consensus 23 C~D~RQyGil-pwna~pgK~Cg~ 44 (60)
T smart00002 23 CVDARQYGIL-PWNAFPGKVCGS 44 (60)
T ss_pred Eeechhccee-ecCCCCCchHhH
Confidence 3899999988 5555 44 4653
No 19
>PLN00165 hypothetical protein; Provisional
Probab=68.56 E-value=1 Score=24.00 Aligned_cols=13 Identities=62% Similarity=1.020 Sum_probs=11.0
Q ss_pred cCCCCCCCcccCC
Q psy394 52 KGAVTPIKDQGQC 64 (66)
Q Consensus 52 ~g~Vt~Vk~Qg~C 64 (66)
.|+|-..||||.|
T Consensus 14 vgaVEalkDQG~c 26 (88)
T PLN00165 14 VGAVEALKDQGFC 26 (88)
T ss_pred HHHHhhccccCee
Confidence 3778889999988
No 20
>KOG1297|consensus
Probab=56.24 E-value=4.9 Score=24.81 Aligned_cols=21 Identities=29% Similarity=0.436 Sum_probs=18.3
Q ss_pred cccccccCCCCCCCcccCCCC
Q psy394 46 EVDWRNKGAVTPIKDQGQCYK 66 (66)
Q Consensus 46 ~~DWR~~g~Vt~Vk~Qg~CGS 66 (66)
.+.||...-|-.=|-|-+||+
T Consensus 95 glRWRtEkEV~tGkgQf~CG~ 115 (249)
T KOG1297|consen 95 GLRWRTEKEVKTGKGQFSCGA 115 (249)
T ss_pred ceeeehhhhhccccccccccc
Confidence 488999988888899999996
No 21
>PF01320 Colicin_Pyocin: Colicin immunity protein / pyocin immunity protein; InterPro: IPR023802 Bacterial colicin and pyocin immunity proteins can bind specifically to the DNase-type colicins and pyocins and inhibit their bactericidal activity. The 1.8-angstrom crystal structure of the ImmE7 protein consists of four antiparallel alpha-helices. Sequence similarities between colicins E2, A and E1 are less striking. The colicin E2 (pyocin) immunity protein does not share similarity with either the colicin E3 or cloacin DF13 immunity proteins. Pyocin protects a cell that harbours the plasmid ColE2 encoding colicin E2 against colicin E2; it is thus essential both for autonomous replication and colicin E2 immunity. This entry represents the structural domain of colicin and pyocin immunity proteins.; GO: 0015643 toxin binding, 0030153 bacteriocin immunity; PDB: 1GXH_A 1GXG_A 1MZ8_C 2ERH_A 1ZNV_C 1AYI_A 1UNK_A 2JBG_A 7CEI_A 1CEI_A ....
Probab=56.10 E-value=9.1 Score=20.25 Aligned_cols=14 Identities=14% Similarity=0.387 Sum_probs=11.7
Q ss_pred CCcCCCHHHHHHHH
Q psy394 1 MSINWLHHEFVHMM 14 (66)
Q Consensus 1 ~fsDlt~eEf~~~~ 14 (66)
+++|+|.+||..++
T Consensus 6 ~i~dyTE~EFl~~v 19 (85)
T PF01320_consen 6 KISDYTESEFLEFV 19 (85)
T ss_dssp SGGGSBHHHHHHHH
T ss_pred HHHHhhHHHHHHHH
Confidence 47899999998865
No 22
>PF10655 DUF2482: Hypothetical protein of unknown function (DUF2482); InterPro: IPR018917 This entry is represented by Bacteriophage 80, Orf10. The characteristics of the protein distribution suggest prophage matches in addition to the phage matches. All the members of this very small, very short family are derived from bacteriophages, of the SA bacteriophages 11, Mu50B, system, and from the Staphylococcal_phi-Mu50B-like_prophages subsystem. All members are hypothetical proteins.
Probab=51.34 E-value=9 Score=20.73 Aligned_cols=14 Identities=7% Similarity=0.192 Sum_probs=11.8
Q ss_pred CcCCCHHHHHHHHc
Q psy394 2 SINWLHHEFVHMMN 15 (66)
Q Consensus 2 fsDlt~eEf~~~~~ 15 (66)
|-|||++|+...+.
T Consensus 5 yKdMTqeelr~lls 18 (100)
T PF10655_consen 5 YKDMTQEELRDLLS 18 (100)
T ss_pred hhhhhHHHHHHHHH
Confidence 77999999998653
No 23
>TIGR02792 PCA_ligA protocatechuate 4,5-dioxygenase, alpha subunit. Protocatechuate (PCA) 4,5-dioxygenase is the first enzyme in the PCA 4,5-cleavage pathway that is an alternative to PCA 3,4-cleavage and PCA 2,3 cleavage pathways. PCA is an intermediate in the breakdown of lignin (hence the gene symbol ligA) and other compounds. Members of this family are the alpha chain of PCA 4,5-dioxygenase, or the equivalent domain of a fusion protein.
Probab=50.51 E-value=12 Score=20.94 Aligned_cols=15 Identities=27% Similarity=0.384 Sum_probs=12.2
Q ss_pred CcCCCHHHHHHHHcC
Q psy394 2 SINWLHHEFVHMMNG 16 (66)
Q Consensus 2 fsDlt~eEf~~~~~~ 16 (66)
++.||.|||++++..
T Consensus 94 mtG~t~eef~~mm~~ 108 (117)
T TIGR02792 94 MTGMTEEEYRQMMIG 108 (117)
T ss_pred hcCCCHHHHHHHHHh
Confidence 567999999998753
No 24
>PRK12702 mannosyl-3-phosphoglycerate phosphatase; Reviewed
Probab=49.01 E-value=13 Score=24.18 Aligned_cols=17 Identities=18% Similarity=0.321 Sum_probs=14.0
Q ss_pred CcCCCHHHHHHHHcCCCC
Q psy394 2 SINWLHHEFVHMMNGFKR 19 (66)
Q Consensus 2 fsDlt~eEf~~~~~~~~~ 19 (66)
|+|||.+|+.+ ++|...
T Consensus 121 F~d~t~~ei~~-~TGL~~ 137 (302)
T PRK12702 121 FGDWTASELAA-ATGIPL 137 (302)
T ss_pred hhhCCHHHHHH-HhCcCH
Confidence 89999999998 567543
No 25
>PF09851 SHOCT: Short C-terminal domain; InterPro: IPR018649 This family of hypothetical prokaryotic proteins has no known function.
Probab=48.40 E-value=19 Score=15.12 Aligned_cols=11 Identities=9% Similarity=-0.001 Sum_probs=8.5
Q ss_pred cCCCHHHHHHH
Q psy394 3 INWLHHEFVHM 13 (66)
Q Consensus 3 sDlt~eEf~~~ 13 (66)
+.+|.+||.+.
T Consensus 15 G~IseeEy~~~ 25 (31)
T PF09851_consen 15 GEISEEEYEQK 25 (31)
T ss_pred CCCCHHHHHHH
Confidence 45899999774
No 26
>PF04369 Lactococcin: Lactococcin-like family; InterPro: IPR007464 Bacteriocins are produced by bacteria to inhibit the growth of similar or closely related bacterial strains. The class II bacteriocins are small heat-stable proteins for which disulphide bonds are the only modification to the peptide. Lactococcin A and B are class-IId bacteriocins (one-peptide non-pediocin-like bacteriocin).; GO: 0042742 defense response to bacterium, 0005576 extracellular region
Probab=45.61 E-value=14 Score=18.31 Aligned_cols=15 Identities=13% Similarity=0.049 Sum_probs=12.6
Q ss_pred CcCCCHHHHHHHHcC
Q psy394 2 SINWLHHEFVHMMNG 16 (66)
Q Consensus 2 fsDlt~eEf~~~~~~ 16 (66)
|.++++||...+-.|
T Consensus 7 f~~~sdeeL~~i~GG 21 (60)
T PF04369_consen 7 FNILSDEELSKINGG 21 (60)
T ss_pred ceecCHHHHhhccCC
Confidence 788999999997654
No 27
>PF08992 QH-AmDH_gamma: Quinohemoprotein amine dehydrogenase, gamma subunit; InterPro: IPR015084 Quinohemoprotein amine dehydrogenases (QHNDH, EC 1.4.99) are enzymes produced in the periplasmic space of certain Gram-negative bacteria, such as Paracoccus denitrificans and Pseudomonas putida, in response to primary amines, including n-butylamine and benzylamine. QHNDH catalyses the oxidative deamination of a wide range of aliphatic and aromatic amines through formation of a Schiff-base intermediate involving one of the quinone O atoms. Catalysis requires the presence of a novel redox cofactor, cysteine tryptophylquinone (CTQ). CTQ is derived from the post-translational modification of specific residues, which involves the oxidation of the indole ring of a tryptophan residue to form tryptophylquinone, followed by covalent cross-linking with a cysteine residue. There is one CTQ per subunit in QHNDH. In addition to CTQ, two haem c cofactors are present in QHNDH that mediate the transfer of the substrate-derived electrons from CTQ to an external electron acceptor, cytochrome c-550. QHNDH is a heterotrimer of alpha, beta and gamma subunits. The alpha and beta subunits contain signal peptides necessary for the translocation of QHNDH to the periplasm. The alpha subunit is composed of four domains - domain 1 forming a dihaem cytochrome, and domains 2-4 forming antiparallel beta-barrel structures; the beta subunit is a 7-bladed beta-propeller that provides part of the active site; and the small, catalytic gamma subunit contains the novel cross-linked CTQ cofactor, in addition to additional thioester cross-links between Cys and Asp/Glu residues that encage CTQ. The gamma subunit assumes a globular secondary structure with two short alpha-helices having many turns and bends.
This entry represents the main structural domain of the QHNDH gamma subunit.; GO: 0016638 oxidoreductase activity, acting on the CH-NH2 group of donors, 0055114 oxidation-reduction process; PDB: 1JJU_C 1PBY_C 1JMX_G 1JMZ_G.
Probab=43.48 E-value=10 Score=19.62 Aligned_cols=12 Identities=58% Similarity=1.102 Sum_probs=6.4
Q ss_pred cccccccCCCCC
Q psy394 46 EVDWRNKGAVTP 57 (66)
Q Consensus 46 ~~DWR~~g~Vt~ 57 (66)
.-|||+.+.|-|
T Consensus 64 ~~DWR~L~~vfP 75 (78)
T PF08992_consen 64 TRDWRNLGSVFP 75 (78)
T ss_dssp HHHGGG---SS-
T ss_pred HHHHHHhcccCC
Confidence 479999988766
No 28
>PRK13377 protocatechuate 4,5-dioxygenase subunit alpha; Provisional
Probab=42.94 E-value=21 Score=20.41 Aligned_cols=15 Identities=27% Similarity=0.379 Sum_probs=12.2
Q ss_pred CcCCCHHHHHHHHcC
Q psy394 2 SINWLHHEFVHMMNG 16 (66)
Q Consensus 2 fsDlt~eEf~~~~~~ 16 (66)
++.||.|||++++..
T Consensus 100 mtG~t~eef~~mm~~ 114 (129)
T PRK13377 100 MTGMTEEEYRQMMLG 114 (129)
T ss_pred hcCCCHHHHHHHHHh
Confidence 567999999998753
No 29
>PF13986 DUF4224: Domain of unknown function (DUF4224)
Probab=41.88 E-value=25 Score=16.31 Aligned_cols=15 Identities=27% Similarity=0.592 Sum_probs=11.8
Q ss_pred CCHHHHHHHHcCCCCC
Q psy394 5 WLHHEFVHMMNGFKRS 20 (66)
Q Consensus 5 lt~eEf~~~~~~~~~~ 20 (66)
||.+|... ++|++..
T Consensus 3 LT~~El~e-lTG~k~~ 17 (47)
T PF13986_consen 3 LTDEELQE-LTGYKRP 17 (47)
T ss_pred CCHHHHHH-HHCCCCH
Confidence 78999998 5787643
No 30
>PF09725 Fra10Ac1: Folate-sensitive fragile site protein Fra10Ac1; InterPro: IPR019129 This entry represents the full-length proteins in which, in higher eukaryotes, the nested domain EDSLL lies. Fra10Ac1 is a highly conserved nuclear protein of unknown function that is highly expressed in brain tissue [].
Probab=39.83 E-value=6.6 Score=22.05 Aligned_cols=21 Identities=29% Similarity=0.415 Sum_probs=18.4
Q ss_pred cccccccCCCCCCCcccCCCC
Q psy394 46 EVDWRNKGAVTPIKDQGQCYK 66 (66)
Q Consensus 46 ~~DWR~~g~Vt~Vk~Qg~CGS 66 (66)
.+.||...-|-.=|-|-.||+
T Consensus 55 glRWRte~EVi~GkGqf~Cgn 75 (118)
T PF09725_consen 55 GLRWRTEKEVISGKGQFICGN 75 (118)
T ss_pred eeeeeeceEEEecceEEeeeC
Confidence 589999988888899999986
No 31
>PF13900 GVQW: Putative binding domain
Probab=38.38 E-value=15 Score=17.43 Aligned_cols=13 Identities=31% Similarity=0.611 Sum_probs=10.3
Q ss_pred CCCCCCccccccc
Q psy394 40 NVKLPEEVDWRNK 52 (66)
Q Consensus 40 ~~~~P~~~DWR~~ 52 (66)
-..+|.++|+|-.
T Consensus 26 cLSlpssWDyr~~ 38 (48)
T PF13900_consen 26 CLSLPSSWDYRHA 38 (48)
T ss_pred ccccccccccccC
Confidence 3578999999964
No 32
>cd07924 PCA_45_Doxase_A The A subunit of Protocatechuate 4,5-dioxygenase (LigAB) is the smaller, non-catalytic subunit. The A subunit is the non-catalytic subunit of Protocatechuate (PCA) 4,5-dioxygenase (LigAB), which is composed of A and B subunits that form a tetramer. PCA 4,5-dioxygenase catalyzes the oxidization and subsequent ring-opening of PCA (or 3,4-dihydroxybenzoic acid), which is an intermediate in the breakdown of lignin and other compounds. PCA 4,5-dioxygenase is one of the aromatic ring opening dioxygenases which play key roles in the degradation of aromatic compounds. As a member of the Class III extradiol dioxygenase family, LigAB uses a non-heme Fe(II) to cleave aromatic rings between a hydroxylated carbon and an adjacent non-hydroxylated carbon.
Probab=35.90 E-value=29 Score=19.60 Aligned_cols=15 Identities=20% Similarity=0.197 Sum_probs=12.1
Q ss_pred CcCCCHHHHHHHHcC
Q psy394 2 SINWLHHEFVHMMNG 16 (66)
Q Consensus 2 fsDlt~eEf~~~~~~ 16 (66)
++.||.|||++++..
T Consensus 97 mtG~s~eef~~mm~~ 111 (121)
T cd07924 97 MTGMSMEEYRQMMVD 111 (121)
T ss_pred hcCCCHHHHHHHHHh
Confidence 567999999998753
No 33
>PF02209 VHP: Villin headpiece domain; InterPro: IPR003128 Villin is an F-actin bundling protein involved in the maintenance of the microvilli of the absorptive epithelia. The villin-type "headpiece" domain is a modular motif found at the extreme C terminus of larger "core" domains in over 25 cytoskeletal proteins in plants and animals, often in association with the Gelsolin repeat. Although the headpiece is classified as an F-actin-binding domain, it has been shown that not all headpiece domains are intrinsically F-actin-binding motifs; surface charge distribution may be an important element for F-actin recognition []. An autonomously folding, 35 residue, thermostable subdomain (HP36) of the full-length 76 amino acid residue villin headpiece is the smallest known example of a cooperatively folded domain of a naturally occurring protein. The structure of HP36, as determined by NMR spectroscopy, consists of three short helices surrounding a tightly packed hydrophobic core []. ; GO: 0003779 actin binding, 0007010 cytoskeleton organization; PDB: 1ZV6_A 1QZP_A 1UND_A 2PPZ_A 3TJW_B 1YU8_X 2JM0_A 1WY4_A 3MYC_A 1YU5_X ....
Probab=34.39 E-value=38 Score=14.92 Aligned_cols=10 Identities=10% Similarity=0.322 Sum_probs=7.3
Q ss_pred CCHHHHHHHH
Q psy394 5 WLHHEFVHMM 14 (66)
Q Consensus 5 lt~eEf~~~~ 14 (66)
|+.+||.+.+
T Consensus 2 Lsd~dF~~vF 11 (36)
T PF02209_consen 2 LSDEDFEKVF 11 (36)
T ss_dssp S-HHHHHHHH
T ss_pred cCHHHHHHHH
Confidence 6788888876
No 34
>PF07606 DUF1569: Protein of unknown function (DUF1569); InterPro: IPR011463 This entry represents a family of hypothetical proteins identified in Rhodopirellula baltica and other bacteria.
Probab=34.16 E-value=39 Score=19.36 Aligned_cols=14 Identities=7% Similarity=0.005 Sum_probs=11.8
Q ss_pred CCcCCCHHHHHHHH
Q psy394 1 MSINWLHHEFVHMM 14 (66)
Q Consensus 1 ~fsDlt~eEf~~~~ 14 (66)
.|+.||.+|+..++
T Consensus 125 ~FG~Lt~eew~~~~ 138 (152)
T PF07606_consen 125 FFGKLTKEEWGKLH 138 (152)
T ss_pred CCCCCCHHHHHHHH
Confidence 48999999998864
No 35
>PF00376 MerR: MerR family regulatory protein; InterPro: IPR000551 The many bacterial transcription regulation proteins which bind DNA through a 'helix-turn-helix' motif can be classified into subfamilies on the basis of sequence similarities. One of these is the MerR subfamily. MerR, which is found in many bacterial species mediates the mercuric-dependent induction of the mercury resistance operon. In the absence of mercury merR represses transcription by binding tightly, as a dimer, to the 'mer' operator region; when mercury is present the dimeric complex binds a single ion and becomes a potent transcriptional activator, while remaining bound to the mer site. Members of the family include the mercuric resistance operon regulatory protein merR; Bacillus subtilis bltR and bmrR; Bacillus glnR; Streptomyces coelicolor hspR; Bradyrhizobium japonicum nolA; Escherichia coli superoxide response regulator soxR; and Streptomyces lividans transcriptional activator tipA [, , , , , ]. Other members include hypothetical proteins from E. coli, B. subtilis and Haemophilus influenzae. Within this family, the HTH motif is situated towards the N terminus.; GO: 0003700 sequence-specific DNA binding transcription factor activity, 0006355 regulation of transcription, DNA-dependent; PDB: 3HH0_A 2DG6_A 1R8D_B 1JBG_A 2VZ4_A 2ZHH_A 2ZHG_A 1Q07_A 1Q06_A 1Q05_B ....
Probab=33.22 E-value=16 Score=16.05 Aligned_cols=13 Identities=15% Similarity=0.580 Sum_probs=5.9
Q ss_pred ccccCCC-CCCCcc
Q psy394 49 WRNKGAV-TPIKDQ 61 (66)
Q Consensus 49 WR~~g~V-t~Vk~Q 61 (66)
|-+.|.+ +|+++.
T Consensus 19 ye~~Gll~~~~r~~ 32 (38)
T PF00376_consen 19 YEREGLLPPPERTE 32 (38)
T ss_dssp HHHTTSS-SSEETT
T ss_pred HHHCCCCCCCccCC
Confidence 3444544 444444
No 36
>cd04761 HTH_MerR-SF Helix-Turn-Helix DNA binding domain of transcription regulators from the MerR superfamily. Helix-turn-helix (HTH) transcription regulator MerR superfamily, N-terminal domain. The MerR family transcription regulators have been shown to mediate responses to stress including exposure to heavy metals, drugs, or oxygen radicals in eubacterial and some archaeal species. They regulate transcription of multidrug/metal ion transporter genes and oxidative stress regulons by reconfiguring the spacer between the -35 and -10 promoter elements. A typical MerR regulator is comprised of two distinct domains that harbor the regulatory (effector-binding) site and the active (DNA-binding) site. Their N-terminal domains are homologous and contain a DNA-binding winged HTH motif, while the C-terminal domains are often dissimilar and bind specific coactivator molecules such as metal ions, drugs, and organic substrates.
Probab=32.56 E-value=22 Score=15.64 Aligned_cols=12 Identities=17% Similarity=0.581 Sum_probs=5.5
Q ss_pred ccccCCCCCCCc
Q psy394 49 WRNKGAVTPIKD 60 (66)
Q Consensus 49 WR~~g~Vt~Vk~ 60 (66)
|.++|.+.|.+.
T Consensus 20 ~~~~g~l~~~~~ 31 (49)
T cd04761 20 YERIGLLSPART 31 (49)
T ss_pred HHHCCCCCCCcC
Confidence 444444444443
No 37
>PF07110 EthD: EthD domain; InterPro: IPR009799 This family consists of several bacterial sequences which are related to the EthD protein of Rhodococcus ruber (Q93EX2 from SWISSPROT). R. ruber (formerly Gordonia terrae) IFP 2001 is one of a few bacterial strains able to degrade ethyl tert-butyl ether (ETBE), which is a major pollutant from gasoline. This strain was found to undergo a spontaneous 14.3-kbp chromosomal deletion, which results in the loss of the ability to degrade ETBE. Sequence analysis of the region corresponding to the deletion revealed the presence of a gene cluster, ethABCD, encoding a ferredoxin reductase (EthA), a cytochrome P-450 (EthB), a ferredoxin (EthC), and a 10kDa protein of unknown function (EthD), respectively. Upstream of ethABCD lies ethR, which codes for a putative positive transcriptional regulator of the AraC/XylS family. Transformation of the ETBE-negative mutant by a plasmid carrying the ethRABCD genes restored the ability to degrade ETBE. Complementation was abolished if the plasmid carried ethRABC only demonstrating that EthD is essential for the ETBE degradation system [].; PDB: 3BF4_B 2FTR_A.
Probab=31.89 E-value=36 Score=17.04 Aligned_cols=12 Identities=17% Similarity=0.141 Sum_probs=8.2
Q ss_pred CCCHHHHHHHHc
Q psy394 4 NWLHHEFVHMMN 15 (66)
Q Consensus 4 Dlt~eEf~~~~~ 15 (66)
+||.+||...+.
T Consensus 2 gls~eeF~~~~~ 13 (95)
T PF07110_consen 2 GLSPEEFHDYWR 13 (95)
T ss_dssp -S-HHHHHHHHH
T ss_pred CCCHHHHHHHHH
Confidence 589999988764
No 38
>PF12368 DUF3650: Protein of unknown function (DUF3650) ; InterPro: IPR022111 This domain family is found in bacteria, and is approximately 30 amino acids in length. The family is found in association with PF00581 from PFAM. There is a single completely conserved residue N that may be functionally important.
Probab=31.80 E-value=50 Score=13.83 Aligned_cols=11 Identities=9% Similarity=0.114 Sum_probs=8.7
Q ss_pred CCCHHHHHHHH
Q psy394 4 NWLHHEFVHMM 14 (66)
Q Consensus 4 Dlt~eEf~~~~ 14 (66)
.||.+|+.+.+
T Consensus 15 ~ls~ee~~~RL 25 (28)
T PF12368_consen 15 GLSEEEVAERL 25 (28)
T ss_pred CCCHHHHHHHH
Confidence 48899988765
No 39
>PF13647 Glyco_hydro_80: Glycosyl hydrolase family 80 of chitosanase A
Probab=31.78 E-value=21 Score=22.10 Aligned_cols=13 Identities=31% Similarity=0.664 Sum_probs=11.0
Q ss_pred CCCCCCcccCCCC
Q psy394 54 AVTPIKDQGQCYK 66 (66)
Q Consensus 54 ~Vt~Vk~Qg~CGS 66 (66)
+.+.||.-|+|||
T Consensus 218 afssvksagncgs 230 (308)
T PF13647_consen 218 AFSSVKSAGNCGS 230 (308)
T ss_pred hhhhccccccccC
Confidence 4678999999997
No 40
>KOG0816|consensus
Probab=31.57 E-value=24 Score=21.82 Aligned_cols=16 Identities=6% Similarity=0.169 Sum_probs=12.8
Q ss_pred CCcCCCHHHHHHHHcC
Q psy394 1 MSINWLHHEFVHMMNG 16 (66)
Q Consensus 1 ~fsDlt~eEf~~~~~~ 16 (66)
||+|++.+|....+..
T Consensus 103 lFTd~~keeV~e~f~s 118 (223)
T KOG0816|consen 103 LFTDMSKEEVIEWFRS 118 (223)
T ss_pred EecCCCHHHHHHHHHH
Confidence 6999999998776543
No 41
>PF09056 Phospholip_A2_3: Prokaryotic phospholipase A2; InterPro: IPR015141 This entry represents bacterial and fungal phospholipase A2 proteins, as well as various hypothetical and putative proteins. They enable the liberation of fatty acids and lysophospholipid by hydrolysing the 2-ester bond of 1,2-diacyl-3-sn-phosphoglycerides. The phospholipase domain adopts an alpha-helical secondary structure, consisting of five alpha-helices and two helical segments []. ; PDB: 1IT5_A 1KP4_A 1IT4_A 1LWB_A 1FAZ_A.
Probab=29.45 E-value=23 Score=19.60 Aligned_cols=15 Identities=27% Similarity=0.747 Sum_probs=5.9
Q ss_pred CCCcccccccCCCCC
Q psy394 43 LPEEVDWRNKGAVTP 57 (66)
Q Consensus 43 ~P~~~DWR~~g~Vt~ 57 (66)
-|..+||..-||-.|
T Consensus 22 ~~~~ldWs~DgCS~s 36 (111)
T PF09056_consen 22 WPPYLDWSSDGCSSS 36 (111)
T ss_dssp GGGG------TTTTS
T ss_pred CCCCCCcCCCCCCCC
Confidence 456699999999654
No 42
>PF10685 KGG: Stress-induced bacterial acidophilic repeat motif; InterPro: IPR019626 This repeat contains a highly conserved, characteristic sequence motif, KGG, that is recognised by plants and lower eukaryotes. Further downstream from this motif is a Walker A, nucleotide binding motif. YciG is expressed as part of a three-gene operon, yciGFE, and this operon is induced by stress and is regulated by RpoS, which controls the general stress-response in E. coli. YciG was shown to be important for stationary-phase resistance to thermal stress and in particular to acid stress [].
Probab=28.75 E-value=51 Score=13.05 Aligned_cols=12 Identities=0% Similarity=-0.236 Sum_probs=8.8
Q ss_pred CcCCCHHHHHHH
Q psy394 2 SINWLHHEFVHM 13 (66)
Q Consensus 2 fsDlt~eEf~~~ 13 (66)
|+.|..|+...+
T Consensus 2 Fa~~d~e~~~ei 13 (23)
T PF10685_consen 2 FASMDPEKAREI 13 (23)
T ss_pred ccccCHHHHHHH
Confidence 777888877664
No 43
>KOG3032|consensus
Probab=28.72 E-value=19 Score=22.77 Aligned_cols=11 Identities=45% Similarity=0.891 Sum_probs=8.4
Q ss_pred CcccccccCCC
Q psy394 45 EEVDWRNKGAV 55 (66)
Q Consensus 45 ~~~DWR~~g~V 55 (66)
.++|||.+|+.
T Consensus 253 ~a~DWRaKnl~ 263 (264)
T KOG3032|consen 253 SAVDWRAKNLF 263 (264)
T ss_pred hhhhhhhhhcc
Confidence 46899998763
No 44
>KOG2479|consensus
Probab=28.01 E-value=18 Score=25.01 Aligned_cols=21 Identities=52% Similarity=0.925 Sum_probs=14.4
Q ss_pred CCCCccccccc-----CCC--CCCCccc
Q psy394 42 KLPEEVDWRNK-----GAV--TPIKDQG 62 (66)
Q Consensus 42 ~~P~~~DWR~~-----g~V--t~Vk~Qg 62 (66)
.+-.+||||++ |+| |.+||-+
T Consensus 403 k~sngVdWRqKLdtQRGAVlAtElkNNs 430 (549)
T KOG2479|consen 403 KYSNGVDWRQKLDTQRGAVLATELKNNS 430 (549)
T ss_pred cccCcccHHHhhhhhcchhhhhhhhcch
Confidence 34558999996 775 4666643
No 45
>PRK13372 pcmA protocatechuate 4,5-dioxygenase; Provisional
Probab=27.47 E-value=44 Score=22.93 Aligned_cols=14 Identities=14% Similarity=0.109 Sum_probs=12.0
Q ss_pred CcCCCHHHHHHHHc
Q psy394 2 SINWLHHEFVHMMN 15 (66)
Q Consensus 2 fsDlt~eEf~~~~~ 15 (66)
++.||.|||+++|.
T Consensus 100 m~g~t~e~f~~~~~ 113 (444)
T PRK13372 100 MTGLSEAAYRDMMI 113 (444)
T ss_pred hcCCCHHHHHHHHH
Confidence 56799999999875
No 46
>PF04833 COBRA: COBRA-like protein; InterPro: IPR006918 In Arabidopsis thaliana (Mouse-ear cress) members of the family are all extracellular glycosyl-phosphatidyl inositol-anchored proteins (GPI-linked) []. The type example of the family is COBRA (Q94KT8 from SWISSPROT) and the family is generally annotated as COBRA-like (COBL). COBRA is involved in determining the orientation of cell expansion, probably by playing an important role in cellulose deposition. It may act by recruiting cellulose synthesizing complexes to discrete positions on the cell surface. Some members of this family are annotated as phytochelatin synthase, but these annotations are incorrect [].
Probab=26.95 E-value=30 Score=20.63 Aligned_cols=18 Identities=56% Similarity=0.909 Sum_probs=11.7
Q ss_pred ccccccCCCCCCCcccCCCC
Q psy394 47 VDWRNKGAVTPIKDQGQCYK 66 (66)
Q Consensus 47 ~DWR~~g~Vt~Vk~Qg~CGS 66 (66)
|=|--.|+-+. +||.||.
T Consensus 32 ~IwsM~GA~~t--dqgdCs~ 49 (169)
T PF04833_consen 32 FIWSMKGAQTT--DQGDCSK 49 (169)
T ss_pred EEEEeeCceec--cCCcccc
Confidence 33444577665 9999973
No 47
>KOG2476|consensus
Probab=26.89 E-value=24 Score=24.53 Aligned_cols=11 Identities=55% Similarity=1.356 Sum_probs=9.5
Q ss_pred CCCCCcccccc
Q psy394 41 VKLPEEVDWRN 51 (66)
Q Consensus 41 ~~~P~~~DWR~ 51 (66)
.++|+.+|||+
T Consensus 492 Ln~pdrvdWr~ 502 (528)
T KOG2476|consen 492 LNLPDRVDWRT 502 (528)
T ss_pred cCCcccccHHH
Confidence 57899999995
No 48
>PF11918 DUF3436: Domain of unknown function (DUF3436); InterPro: IPR024591 This uncharacterised N-terminal domain is associated with the interphotoreceptor retinol-binding protein family. It is about 50 amino acids in length and has two conserved sequence motifs: DPRL and SYEP.
Probab=26.65 E-value=77 Score=15.42 Aligned_cols=14 Identities=14% Similarity=0.214 Sum_probs=11.6
Q ss_pred CcCCCHHHHHHHHc
Q psy394 2 SINWLHHEFVHMMN 15 (66)
Q Consensus 2 fsDlt~eEf~~~~~ 15 (66)
+.+|+.||..+++.
T Consensus 35 ~~~Lt~EqLla~lq 48 (55)
T PF11918_consen 35 LPNLTPEQLLAMLQ 48 (55)
T ss_pred CCCcCHHHHHHHHH
Confidence 67899999988764
No 49
>COG2832 Uncharacterized protein conserved in bacteria [Function unknown]
Probab=25.68 E-value=23 Score=19.91 Aligned_cols=14 Identities=43% Similarity=0.850 Sum_probs=11.3
Q ss_pred CCcccccccCCCCC
Q psy394 44 PEEVDWRNKGAVTP 57 (66)
Q Consensus 44 P~~~DWR~~g~Vt~ 57 (66)
|.--|||+.|+++.
T Consensus 58 ~~v~~~~e~~ai~~ 71 (119)
T COG2832 58 PYVRDWREGGAIPR 71 (119)
T ss_pred HHHHHHHHcCCCCh
Confidence 55689999998875
No 50
>smart00153 VHP Villin headpiece domain.
Probab=25.67 E-value=69 Score=14.01 Aligned_cols=10 Identities=10% Similarity=0.315 Sum_probs=7.8
Q ss_pred CCHHHHHHHH
Q psy394 5 WLHHEFVHMM 14 (66)
Q Consensus 5 lt~eEf~~~~ 14 (66)
|+.+||...+
T Consensus 2 LsdeeF~~vf 11 (36)
T smart00153 2 LSDEDFEEVF 11 (36)
T ss_pred CCHHHHHHHH
Confidence 6788888866
No 51
>TIGR01847 bacteriocin_sig bacteriocin-type signal sequence. Bacteriocins are bacterial peptide products toxic to closely related bacteria. This model represents the N-terminal region up to the GG cleavage motif. Processing to remove this bacteriocin leader peptide occurs together with export by an ABC transporter. Note: because this model is so small (15 amino acids), it may have many spurious high-scoring matches to unrelated proteins, even with fairly stringent cutoff scores. The most likely true positives are small proteins of Gram-positive bacteria, matching regions that start within the first 15 amino acids, and encoded near bacteriocin transport family proteins (TIGR01000, TIGR01193).
Probab=25.42 E-value=64 Score=13.23 Aligned_cols=17 Identities=12% Similarity=0.251 Sum_probs=12.3
Q ss_pred CcCCCHHHHHHHHcCCC
Q psy394 2 SINWLHHEFVHMMNGFK 18 (66)
Q Consensus 2 fsDlt~eEf~~~~~~~~ 18 (66)
|.-|+.+|..++..|..
T Consensus 1 fk~Ls~kEL~~I~GG~~ 17 (26)
T TIGR01847 1 FKELSEKELAQIIGGXX 17 (26)
T ss_pred CccCCHHHHhhccCCcc
Confidence 55688889888776643
No 52
>smart00422 HTH_MERR helix_turn_helix, mercury resistance.
Probab=25.34 E-value=35 Score=16.21 Aligned_cols=10 Identities=20% Similarity=0.501 Sum_probs=5.1
Q ss_pred ccccCCCCCC
Q psy394 49 WRNKGAVTPI 58 (66)
Q Consensus 49 WR~~g~Vt~V 58 (66)
|.++|.+.|+
T Consensus 20 ~~~~gli~~~ 29 (70)
T smart00422 20 YERIGLLPPP 29 (70)
T ss_pred HHHCCCCCCC
Confidence 4445555554
No 53
>PF06574 FAD_syn: FAD synthetase; InterPro: IPR015864 Riboflavin is converted into catalytically active cofactors (FAD and FMN) by the actions of riboflavin kinase (2.7.1.26 from EC), which converts it into FMN, and FAD synthetase (2.7.7.2 from EC), which adenylates FMN to FAD. Eukaryotes usually have two separate enzymes, while most prokaryotes have a single bifunctional protein that can carry out both catalyses, although exceptions occur in both cases. While eukaryotic monofunctional riboflavin kinase is orthologous to the bifunctional prokaryotic enzyme [], the monofunctional FAD synthetase differs from its prokaryotic counterpart, and is instead related to the PAPS-reductase family []. The bacterial FAD synthetase that is part of the bifunctional enzyme has remote similarity to nucleotidyl transferases and, hence, it may be involved in the adenylylation reaction of FAD synthetases []. This entry represents prokaryotic-type FAD synthetase, which occurs primarily as part of a bifunctional enzyme.; GO: 0003919 FMN adenylyltransferase activity, 0009231 riboflavin biosynthetic process; PDB: 2X0K_B 3OP1_B 1T6Z_A 2I1L_A 1T6Y_B 1T6X_B 1S4M_A 1MRZ_A.
Probab=25.04 E-value=51 Score=19.00 Aligned_cols=15 Identities=13% Similarity=0.155 Sum_probs=11.6
Q ss_pred CcCCCHHHHHHHHcC
Q psy394 2 SINWLHHEFVHMMNG 16 (66)
Q Consensus 2 fsDlt~eEf~~~~~~ 16 (66)
|+.|+.++|...++.
T Consensus 88 ~~~ls~~~Fi~~iL~ 102 (157)
T PF06574_consen 88 FANLSPEDFIEKILK 102 (157)
T ss_dssp HCCS-HHHHHHHHCC
T ss_pred HHcCCHHHHHHHHHH
Confidence 678999999997654
No 54
>PF15649 Tox-REase-7: Restriction endonuclease fold toxin 7
Probab=24.96 E-value=29 Score=18.39 Aligned_cols=20 Identities=25% Similarity=0.720 Sum_probs=14.0
Q ss_pred CCCCCCcccccccCCCCCCCc
Q psy394 40 NVKLPEEVDWRNKGAVTPIKD 60 (66)
Q Consensus 40 ~~~~P~~~DWR~~g~Vt~Vk~ 60 (66)
...+|+.++ ...|-++.|||
T Consensus 30 ~~rIPD~~~-~~~~~l~EVKN 49 (87)
T PF15649_consen 30 GNRIPDGLD-KNNGQLVEVKN 49 (87)
T ss_pred CcCCCcccc-cCCCcEEEEec
Confidence 357899998 33567777776
No 55
>PF08127 Propeptide_C1: Peptidase family C1 propeptide; InterPro: IPR012599 This domain is found at the N-terminal of cathepsin B and cathepsin B-like peptidases that belong to MEROPS peptidase subfamily C1A. Cathepsin B are lysosomal cysteine proteinases belonging to the papain superfamily and are unique in their ability to act as both an endo- and an exopeptidases. They are synthesized as inactive zymogens. Activation of the peptidases occurs with the removal of the propeptide [, ]. ; GO: 0004197 cysteine-type endopeptidase activity, 0050790 regulation of catalytic activity; PDB: 1MIR_A 1PBH_A 2PBH_A 3PBH_A.
Probab=24.72 E-value=53 Score=14.78 Aligned_cols=15 Identities=13% Similarity=-0.117 Sum_probs=6.3
Q ss_pred CcCCCHHHHHHHHcCC
Q psy394 2 SINWLHHEFVHMMNGF 17 (66)
Q Consensus 2 fsDlt~eEf~~~~~~~ 17 (66)
|.+++.+.++. ++|.
T Consensus 22 F~~~~~~~ik~-LlGv 36 (41)
T PF08127_consen 22 FENTSIEYIKR-LLGV 36 (41)
T ss_dssp -SSB-HHHHHH-CS-B
T ss_pred CCCCCHHHHHH-HcCC
Confidence 45555555555 3454
No 56
>PRK15431 ferrous iron transport protein FeoC; Provisional
Probab=24.65 E-value=29 Score=18.06 Aligned_cols=17 Identities=18% Similarity=0.474 Sum_probs=11.7
Q ss_pred cccccCCCCCCC-cccCC
Q psy394 48 DWRNKGAVTPIK-DQGQC 64 (66)
Q Consensus 48 DWR~~g~Vt~Vk-~Qg~C 64 (66)
-|..+|.|..|- ++..|
T Consensus 39 ~l~~kGkverv~~~~~gC 56 (78)
T PRK15431 39 QLESMGKAVRIQEEPDGC 56 (78)
T ss_pred HHHHCCCeEeeccCCCCC
Confidence 477788887776 55455
No 57
>PF09012 FeoC: FeoC like transcriptional regulator; InterPro: IPR015102 This entry contains several transcriptional regulators, including FeoC, which contain a HTH motif. FeoC acts as a [Fe-S] dependent transcriptional repressor []. ; PDB: 1XN7_A 2K02_A.
Probab=23.26 E-value=19 Score=17.61 Aligned_cols=15 Identities=13% Similarity=0.133 Sum_probs=7.9
Q ss_pred ccccCCCCCCCcccC
Q psy394 49 WRNKGAVTPIKDQGQ 63 (66)
Q Consensus 49 WR~~g~Vt~Vk~Qg~ 63 (66)
|..+|.|-.+.....
T Consensus 38 l~~kG~I~~~~~~~~ 52 (69)
T PF09012_consen 38 LIRKGYIRKVDMSSC 52 (69)
T ss_dssp HHCCTSCEEEEEE--
T ss_pred HHHCCcEEEecCCCC
Confidence 566666665555443
No 58
>PF08838 DUF1811: Protein of unknown function (DUF1811); InterPro: IPR014938 This entry consists of uncharacterised bacterial proteins. Some of the proteins are annotated as being transcriptional regulators (see Q4MQL7 from SWISSPROT, Q65MA2 from SWISSPROT). The structure of one of the proteins has revealed a beta-barrel like structure with a helix-turn-helix like motif. ; PDB: 2YXY_A 1SF9_A.
Probab=23.18 E-value=38 Score=18.56 Aligned_cols=13 Identities=8% Similarity=0.008 Sum_probs=7.7
Q ss_pred CcCCCHHHHHHHH
Q psy394 2 SINWLHHEFVHMM 14 (66)
Q Consensus 2 fsDlt~eEf~~~~ 14 (66)
||+||.+|..+-+
T Consensus 4 ySeMs~~EL~~Ei 16 (102)
T PF08838_consen 4 YSEMSEEELRQEI 16 (102)
T ss_dssp HHC--HHHHHHHH
T ss_pred hhhcCHHHHHHHH
Confidence 6788888886543
No 59
>PF02845 CUE: CUE domain; InterPro: IPR003892 This domain may be involved in binding ubiquitin-conjugating enzymes (UBCs). CUE domains also occur in two proteins of the IL-1 signal transduction pathway, tollip and TAB2.; GO: 0005515 protein binding; PDB: 2EKF_A 1OTR_A 1P3Q_Q 1MN3_A 1WGL_A 2EJS_A 2DAE_A 2DHY_A 2DI0_A.
Probab=23.18 E-value=62 Score=14.17 Aligned_cols=14 Identities=7% Similarity=-0.040 Sum_probs=9.9
Q ss_pred CCcCCCHHHHHHHH
Q psy394 1 MSINWLHHEFVHMM 14 (66)
Q Consensus 1 ~fsDlt~eEf~~~~ 14 (66)
||-+++.+.+...+
T Consensus 11 mFP~~~~~~I~~~L 24 (42)
T PF02845_consen 11 MFPDLDREVIEAVL 24 (42)
T ss_dssp HSSSS-HHHHHHHH
T ss_pred HCCCCCHHHHHHHH
Confidence 58888888887765
No 60
>PF02276 CytoC_RC: Photosynthetic reaction centre cytochrome C subunit; InterPro: IPR003158 The photosynthetic apparatus in non-oxygenic bacteria consists of light-harvesting (LH) protein-pigment complexes LH1 and LH2, which use carotenoid and bacteriochlorophyll as primary donors []. LH1 acts as the energy collection hub, temporarily storing it before its transfer to the photosynthetic reaction centre (RC) []. Electrons are transferred from the primary donor via an intermediate acceptor (bacteriopheophytin) to the primary acceptor (quinone Qa), and finally to the secondary acceptor (quinone Qb), resulting in the formation of ubiquinol QbH2. RC uses the excitation energy to shuffle electrons across the membrane, transferring them via ubiquinol to the cytochrome bc1 complex in order to establish a proton gradient across the membrane, which is used by ATP synthetase to form ATP [, , ]. The core complex is anchored in the cell membrane, consisting of one unit of RC surrounded by LH1; in some species there may be additional subunits []. RC consists of three subunits: L (light), M (medium), and H (heavy). Subunits L and M provide the scaffolding for the chromophore, while subunit H contains a cytoplasmic domain []. In Rhodopseudomonas viridis, there is also a non-membranous tetrahaem cytochrome (4Hcyt) subunit on the periplasmic surface. In the purple bacterium Rhodocyclus gelatinosus (Rhodopseudomonas gelatinosa), a high potential Fe-S protein (HiPIP) acts as an electron donor to reaction centre-bound cyt bc1 under anaerobic conditions in the light, while cyt c acts as a soluble electron carrier under aerobic conditions in the dark in order to re-reduce the oxidized electron donor [].; GO: 0005506 iron ion binding, 0009055 electron carrier activity, 0020037 heme binding, 0019684 photosynthesis, light reaction, 0030077 plasma membrane light-harvesting complex; PDB: 7PRC_C 6PRC_C 2X5V_C 2X5U_C 2WJM_C 2JBL_C 3T6D_C 3D38_C 2WJN_C 1VRN_C ....
Probab=23.02 E-value=65 Score=21.16 Aligned_cols=13 Identities=23% Similarity=0.141 Sum_probs=11.0
Q ss_pred CcCCCHHHHHHHH
Q psy394 2 SINWLHHEFVHMM 14 (66)
Q Consensus 2 fsDlt~eEf~~~~ 14 (66)
|.||+.+||.+++
T Consensus 60 L~dls~~ef~rlM 72 (314)
T PF02276_consen 60 LGDLSVAEFDRLM 72 (314)
T ss_dssp STTSBHHHHHHHH
T ss_pred hcCCCHHHHHHHH
Confidence 6899999998865
No 61
>PF05443 ROS_MUCR: ROS/MUCR transcriptional regulator protein; InterPro: IPR008807 This family consists of several ROS/MUCR transcriptional regulator proteins. The ros chromosomal gene is present in octopine and nopaline strains of Agrobacterium tumefaciens as well as in Rhizobium meliloti (Sinorhizobium meliloti). This gene encodes a 15.5 kDa protein that specifically represses the virC and virD operons in the virulence region of the Ti plasmid [] and is necessary for succinoglycan production []. S. meliloti can produce two types of acidic exopolysaccharides, succinoglycan and galactoglucan, that are interchangeable for infection of Medicago sativa (Alfalfa) nodules. MucR from S. meliloti acts as a transcriptional repressor that blocks the expression of the exp genes responsible for galactoglucan production therefore allowing the exclusive production of succinoglycan [].; GO: 0003677 DNA binding, 0008270 zinc ion binding, 0006355 regulation of transcription, DNA-dependent; PDB: 2JSP_A.
Probab=22.88 E-value=57 Score=18.60 Aligned_cols=14 Identities=14% Similarity=0.221 Sum_probs=9.5
Q ss_pred CCCHHHHHHHHcCCC
Q psy394 4 NWLHHEFVHMMNGFK 18 (66)
Q Consensus 4 Dlt~eEf~~~~~~~~ 18 (66)
.||.+||.+++ |++
T Consensus 94 gltp~eYR~kw-Glp 107 (132)
T PF05443_consen 94 GLTPEEYRAKW-GLP 107 (132)
T ss_dssp -S-HHHHHHHT-T-G
T ss_pred CCCHHHHHHHh-CcC
Confidence 69999999987 654
No 62
>PF13024 DUF3884: Protein of unknown function (DUF3884)
Probab=22.69 E-value=69 Score=16.62 Aligned_cols=14 Identities=14% Similarity=0.083 Sum_probs=11.5
Q ss_pred cCCCHHHHHHHHcC
Q psy394 3 INWLHHEFVHMMNG 16 (66)
Q Consensus 3 sDlt~eEf~~~~~~ 16 (66)
|+++.+||+..++.
T Consensus 42 S~~~~eeFq~~Fl~ 55 (77)
T PF13024_consen 42 SDLSLEEFQKKFLN 55 (77)
T ss_pred ccccHHHHHHHHHH
Confidence 58899999998754
No 63
>PF12958 DUF3847: Protein of unknown function (DUF3847); InterPro: IPR024215 This entry represents a family of uncharacterised proteins that were found by clustering human gut metagenomic sequences [].
Probab=22.58 E-value=71 Score=16.91 Aligned_cols=11 Identities=18% Similarity=0.525 Sum_probs=9.5
Q ss_pred CCCHHHHHHHH
Q psy394 4 NWLHHEFVHMM 14 (66)
Q Consensus 4 Dlt~eEf~~~~ 14 (66)
+||.+||..++
T Consensus 62 ~lT~~E~~~ll 72 (86)
T PF12958_consen 62 DLTNDEFYELL 72 (86)
T ss_pred hcCHHHHHHHH
Confidence 78999999875
No 64
>PF07105 DUF1367: Protein of unknown function (DUF1367); InterPro: IPR009797 This entry is represented by Bacteriophage VT2phi_272, P37. The characteristics of the protein distribution suggest prophage matches in addition to the phage matches. This family consists of several highly conserved, hypothetical bacterial and phage proteins of around 200 residues in length. The function of this family is unknown.
Probab=21.57 E-value=78 Score=19.38 Aligned_cols=15 Identities=20% Similarity=0.220 Sum_probs=12.9
Q ss_pred CcCCCHHHHHHHHcC
Q psy394 2 SINWLHHEFVHMMNG 16 (66)
Q Consensus 2 fsDlt~eEf~~~~~~ 16 (66)
|+.|.++||.+.|.+
T Consensus 151 Fa~Mdq~eF~~lY~a 165 (196)
T PF07105_consen 151 FANMDQEEFEELYKA 165 (196)
T ss_pred ccccCHHHHHHHHHH
Confidence 889999999998754
No 65
>cd04774 HTH_YfmP Helix-Turn-Helix DNA binding domain of the YfmP transcription regulator. Helix-turn-helix (HTH) transcription regulator, YfmP, and related proteins; N-terminal domain. YfmP regulates the multidrug efflux protein, YfmO, and indirectly regulates the expression of the Bacillus subtilis copZA operon encoding a metallochaperone, CopZ, and a CPx-type ATPase efflux protein, CopA. These proteins belong to the MerR superfamily of transcription regulators that promote expression of several stress regulon genes by reconfiguring the spacer between the -35 and -10 promoter elements. Their conserved N-terminal domains contain predicted winged HTH motifs that mediate DNA binding, while the dissimilar C-terminal domains bind specific coactivator molecules.
Probab=21.22 E-value=48 Score=17.44 Aligned_cols=13 Identities=23% Similarity=0.600 Sum_probs=6.6
Q ss_pred ccccCCCCCCCcc
Q psy394 49 WRNKGAVTPIKDQ 61 (66)
Q Consensus 49 WR~~g~Vt~Vk~Q 61 (66)
|...|.+.|+++.
T Consensus 20 ye~~Gll~p~r~~ 32 (96)
T cd04774 20 YEEIGLVSPERSE 32 (96)
T ss_pred HHHCCCCCCCcCC
Confidence 4444555555544
No 66
>cd01277 HINT_subgroup HINT (histidine triad nucleotide-binding protein) subgroup: Members of this CD belong to the superfamily of histidine triad hydrolases that act on alpha-phosphate of ribonucleotides. This subgroup includes members from all three forms of cellular life. Although the biochemical function has not been characterised for many of the members of this subgroup, the proteins from Yeast have been shown to be involved in secretion, peroxisome formation and gene expression.
Probab=20.93 E-value=72 Score=16.31 Aligned_cols=12 Identities=8% Similarity=0.180 Sum_probs=9.8
Q ss_pred CcCCCHHHHHHH
Q psy394 2 SINWLHHEFVHM 13 (66)
Q Consensus 2 fsDlt~eEf~~~ 13 (66)
|.||+.+|+..+
T Consensus 48 ~~~l~~~e~~~l 59 (103)
T cd01277 48 LLDLDPEELAEL 59 (103)
T ss_pred hhhCCHHHHHHH
Confidence 689999998764
No 67
>PF03047 ComC: COMC family; InterPro: IPR004288 Competence is the ability of a cell to take up exogenous DNA from its environment, resulting in transformation. It is widespread among bacteria and is probably an important mechanism for the horizontal transfer of genes. Cells that take up DNA inevitably acquire the nucleotides the DNA consists of, and, because nucleotides are needed for DNA and RNA synthesis and are expensive to synthesise, these may make a significant contribution to the cell's energy budget []. The lateral gene transfer caused by competence also contributes to the genetic diversity that makes evolution possible. DNA usually becomes available by the death and lysis of other cells. Competent bacteria use components of extracellular filaments called type 4 pili to create pores in their membranes and pull DNA through the pores into the cytoplasm. This process, including the development of competence and the expression of the uptake machinery, is regulated in response to cell-cell signalling and/or nutritional conditions []. This family consists of streptococcal competence stimulating peptide precursors, which are generally up to 50 amino acid residues long. In all the members of this family, the leader sequence is cleaved after two conserved glycine residues; thus the leader sequence is of the double- glycine type []. Competence stimulating peptides (CSP) are small (less than 25 amino acid residues) cationic peptides. The N-terminal amino acid residue is negatively charged, either glutamate or aspartate. The C-terminal end is positively charged. The third residue is also positively charged: a highly conserved arginine []. Some COMC proteins and their precursors (not included in this family) do not fully follow the above description. Functionally, CSP act as pheromones, stimulating competence for genetic transformation in streptococci. 
In streptococci, the (CSP-mediated) competence response requires exponential cell growth at a critical density, a relatively simple requirement when compared to the stationary-phase requirement of Haemophilus, or the late-logarithmic-phase requirement of Bacillus. All bacteria induced to competence by a particular CSP are said to belong to the same pherotype, because each CSP is recognised by a specific receptor (the signalling domain of a histidine kinase ComD). Pherotypes are not necessarily species-specific. In addition, an organism may change pherotype. There are two possible mechanisms for pherotype switching: horizontal gene transfer, and accumulation of point mutations. The biological significance of pherotypes and pherotype switching is not definitively determined. Pherotype switching occurs frequently enough in naturally competent streptococci to suggest that it may be an important contributor to genetic exchange between different bacterial species. This entry also includes proteins that form bacteriocin-like propeptides with a glycine-glycine cleavage site. The bacteriocin is initially formed as a pre-propeptide and, upon cleavage at the glycine-glycine cleavage site, a leader peptide and the propeptide are formed. The propeptide then undergoes posttranslational modification before becoming functional.; GO: 0005186 pheromone activity; PDB: 2I2J_A 2I2H_A 2A1C_A.
Probab=20.84 E-value=33 Score=14.76 Aligned_cols=15 Identities=13% Similarity=0.162 Sum_probs=0.0
Q ss_pred CcCCCHHHHHHHHcC
Q psy394 2 SINWLHHEFVHMMNG 16 (66)
Q Consensus 2 fsDlt~eEf~~~~~~ 16 (66)
|.-||.+|..++-.|
T Consensus 11 F~~lt~~eL~~I~GG 25 (32)
T PF03047_consen 11 FEELTEEELQEIQGG 25 (32)
T ss_dssp ---------------
T ss_pred HhcCCHHHHhhccCC
Confidence 566677777665433
No 68
>PF12162 STAT1_TAZ2bind: STAT1 TAZ2 binding domain; InterPro: IPR022752 This entry represents the C-terminal domain of STAT1, which selectively binds the TAZ2 domain of CBP (CREB-binding protein). This group of eukaryotic proteins is approximately 20 amino acids in length, and is found in association with Pfam families PF02865, PF00017, PF01017 and PF02864. By binding to CBP, it becomes a transcriptional activator and can initiate transcription of certain genes.; GO: 0003700 sequence-specific DNA binding transcription factor activity; PDB: 2KA6_B.
Probab=20.75 E-value=82 Score=12.54 Aligned_cols=9 Identities=0% Similarity=0.161 Sum_probs=5.9
Q ss_pred CCHHHHHHH
Q psy394 5 WLHHEFVHM 13 (66)
Q Consensus 5 lt~eEf~~~ 13 (66)
|++++|.+.
T Consensus 10 MSPddy~~l 18 (23)
T PF12162_consen 10 MSPDDYDEL 18 (23)
T ss_dssp S-HHHHHHH
T ss_pred CCHHHHHHH
Confidence 778888763
No 69
>cd04766 HTH_HspR Helix-Turn-Helix DNA binding domain of the HspR transcription regulator. Helix-turn-helix (HTH) transcription regulator HspR, N-terminal domain. Heat shock protein regulators (HspR) have been shown to regulate expression of specific regulons in response to high temperature or high osmolarity in Streptomyces and Helicobacter, respectively. These proteins share the N-terminal DNA binding domain with other transcription regulators of the MerR superfamily that promote transcription by reconfiguring the spacer between the -35 and -10 promoter elements. A typical MerR regulator is comprised of distinct domains that harbor the regulatory (effector-binding) site and the active (DNA-binding) site. Their conserved N-terminal domains contain predicted winged HTH motifs that mediate DNA binding, while the dissimilar C-terminal domains bind specific coactivator molecules.
Probab=20.06 E-value=50 Score=16.99 Aligned_cols=13 Identities=15% Similarity=0.534 Sum_probs=6.4
Q ss_pred ccccCCCCCCCcc
Q psy394 49 WRNKGAVTPIKDQ 61 (66)
Q Consensus 49 WR~~g~Vt~Vk~Q 61 (66)
|-+.|.+.|.++.
T Consensus 21 ye~~Gli~p~r~~ 33 (91)
T cd04766 21 YERLGLLSPSRTD 33 (91)
T ss_pred HHHCCCcCCCcCC
Confidence 4444555555543
Done!