Query 033920
Match_columns 109
No_of_seqs 94 out of 96
Neff 3.5
Searched_HMMs 46136
Date Fri Mar 29 07:47:50 2013
Command hhsearch -i /work/01045/syshi/csienesis_hhblits_a3m/033920.a3m -d /work/01045/syshi/HHdatabase/Cdd.hhm -o /work/01045/syshi/hhsearch_cdd/033920hhsearch_cdd -cpu 12 -v 0
No Hit Prob E-value P-value Score SS Cols Query HMM Template HMM
1 PF09597 IGR: IGR protein moti 100.0 5E-33 1.1E-37 180.6 5.3 57 39-98 1-57 (57)
2 PF00536 SAM_1: SAM domain (St 98.4 5.6E-07 1.2E-11 56.1 4.4 58 37-96 6-64 (64)
3 cd00166 SAM Sterile alpha moti 98.2 2.8E-06 6.2E-11 51.3 4.4 58 37-96 5-63 (63)
4 smart00454 SAM Sterile alpha m 97.9 2.9E-05 6.3E-10 47.0 4.7 59 37-97 7-67 (68)
5 PF07647 SAM_2: SAM domain (St 97.9 1.7E-05 3.7E-10 49.5 3.6 59 36-96 6-66 (66)
6 KOG4384 Uncharacterized SAM do 96.5 0.0053 1.1E-07 52.4 5.4 59 37-97 216-276 (361)
7 KOG0196 Tyrosine kinase, EPH ( 84.9 2.3 5E-05 40.6 5.9 66 33-100 920-987 (996)
8 KOG4374 RNA-binding protein Bi 80.4 2.8 6.1E-05 33.9 4.1 59 37-97 152-211 (216)
9 PF01031 Dynamin_M: Dynamin ce 78.8 3 6.5E-05 32.9 3.8 76 22-97 41-123 (295)
10 PF07524 Bromo_TP: Bromodomain 78.4 2.4 5.3E-05 27.5 2.7 41 39-83 36-76 (77)
11 KOG3678 SARM protein (with ste 76.5 4.6 9.9E-05 37.3 4.7 60 34-95 465-526 (832)
12 PF15652 Tox-SHH: HNH/Endo VII 69.4 5.8 0.00013 28.7 3.0 38 56-96 62-99 (100)
13 PF00730 HhH-GPD: HhH-GPD supe 65.8 9.1 0.0002 25.4 3.2 34 63-96 29-66 (108)
14 COG3272 Uncharacterized conser 57.6 6.7 0.00014 30.6 1.6 21 80-100 101-121 (163)
15 PF01152 Bac_globin: Bacterial 56.9 18 0.0004 24.5 3.5 37 62-98 80-116 (120)
16 KOG4375 Scaffold protein Shank 56.2 16 0.00034 30.6 3.6 55 36-92 212-267 (272)
17 COG1603 RPP1 RNase P/RNase MRP 54.8 25 0.00054 28.5 4.5 76 6-106 147-226 (229)
18 TIGR02527 dot_icm_IcmQ Dot/Icm 53.0 9.9 0.00021 30.2 1.9 27 38-64 16-42 (182)
19 PF10454 DUF2458: Protein of u 50.8 32 0.00069 25.8 4.3 28 54-81 92-120 (150)
20 PRK10308 3-methyl-adenine DNA 50.6 21 0.00045 28.9 3.4 37 64-100 158-194 (283)
21 PF12836 HHH_3: Helix-hairpin- 50.2 13 0.00029 23.4 1.9 32 69-102 10-42 (65)
22 PF14520 HHH_5: Helix-hairpin- 49.5 59 0.0013 19.9 4.7 41 52-92 16-58 (60)
23 PTZ00096 40S ribosomal protein 48.9 9.6 0.00021 29.1 1.2 26 64-90 22-47 (143)
24 cd00454 Trunc_globin Truncated 48.8 27 0.00058 23.4 3.3 38 62-99 76-113 (116)
25 PF09475 Dot_icm_IcmQ: Dot/Icm 48.7 9.5 0.00021 30.2 1.2 26 38-63 16-41 (179)
26 PRK10361 DNA recombination pro 47.7 14 0.00031 32.7 2.3 49 35-83 381-432 (475)
27 smart00460 TGc Transglutaminas 44.7 19 0.00041 21.5 1.9 13 74-86 19-31 (68)
28 PRK00024 hypothetical protein; 44.6 44 0.00096 26.2 4.4 46 47-92 40-86 (224)
29 PF01841 Transglut_core: Trans 44.4 33 0.00071 22.2 3.1 37 33-84 37-74 (113)
30 PRK11639 zinc uptake transcrip 43.6 29 0.00063 25.8 3.1 23 72-94 14-37 (169)
31 cd00923 Cyt_c_Oxidase_Va Cytoc 43.4 33 0.00072 25.0 3.2 30 55-84 70-99 (103)
32 PF08955 BofC_C: BofC C-termin 42.7 22 0.00049 24.3 2.2 31 55-96 43-73 (75)
33 smart00478 ENDO3c endonuclease 42.4 50 0.0011 23.1 4.0 37 61-97 21-61 (149)
34 PF11328 DUF3130: Protein of u 41.7 19 0.00041 25.7 1.7 35 38-80 42-76 (90)
35 PRK13482 DNA integrity scannin 41.5 41 0.00089 28.9 4.0 56 41-97 287-344 (352)
36 COG1577 ERG12 Mevalonate kinas 40.7 52 0.0011 27.4 4.4 56 38-95 201-259 (307)
37 PF04362 Iron_traffic: Bacteri 40.2 33 0.00072 24.2 2.8 50 44-98 24-76 (88)
38 PF07487 SopE_GEF: SopE GEF do 39.5 19 0.00041 28.2 1.6 31 64-105 62-92 (165)
39 PF12826 HHH_2: Helix-hairpin- 39.2 31 0.00066 21.8 2.3 40 55-94 17-57 (64)
40 PF06320 GCN5L1: GCN5-like pro 37.9 34 0.00074 24.7 2.6 32 48-79 57-88 (121)
41 PF15144 DUF4576: Domain of un 37.9 34 0.00073 24.3 2.5 22 36-57 43-64 (88)
42 PF02284 COX5A: Cytochrome c o 37.8 36 0.00078 25.0 2.7 30 55-84 73-102 (108)
43 PRK13766 Hef nuclease; Provisi 35.8 78 0.0017 28.2 5.0 53 42-94 716-769 (773)
44 PRK05408 oxidative damage prot 35.6 61 0.0013 23.0 3.5 49 45-98 25-76 (90)
45 PF03457 HA: Helicase associat 34.6 62 0.0013 20.0 3.1 39 64-102 8-56 (68)
46 PTZ00418 Poly(A) polymerase; P 33.3 48 0.001 30.3 3.3 71 37-107 72-147 (593)
47 KOG1170 Diacylglycerol kinase 32.9 18 0.00039 35.1 0.6 60 32-95 996-1058(1099)
48 KOG2841 Structure-specific end 32.9 75 0.0016 26.4 4.1 53 39-92 193-247 (254)
49 PRK04038 rps19p 30S ribosomal 32.0 32 0.0007 25.9 1.7 24 64-88 14-37 (134)
50 smart00540 LEM in nuclear memb 31.1 64 0.0014 19.9 2.7 27 72-98 12-43 (44)
51 PF13518 HTH_28: Helix-turn-he 30.4 61 0.0013 18.4 2.4 23 72-97 16-38 (52)
52 PRK00558 uvrC excinuclease ABC 30.3 77 0.0017 28.5 4.1 55 39-93 541-596 (598)
53 COG1623 Predicted nucleic-acid 30.0 50 0.0011 28.6 2.7 45 41-85 293-339 (349)
54 COG0122 AlkA 3-methyladenine D 28.8 56 0.0012 26.6 2.7 59 37-99 118-182 (285)
55 PRK09462 fur ferric uptake reg 28.3 53 0.0011 23.5 2.3 23 72-94 5-28 (148)
56 COG5457 Uncharacterized conser 28.1 60 0.0013 21.5 2.3 28 64-91 32-59 (63)
57 PF10281 Ish1: Putative stress 27.5 61 0.0013 18.6 2.1 21 75-95 13-37 (38)
58 PF13812 PPR_3: Pentatricopept 26.5 53 0.0011 16.7 1.5 19 62-81 15-33 (34)
59 PF08349 DUF1722: Protein of u 26.3 50 0.0011 23.1 1.8 23 78-100 64-86 (117)
60 cd00056 ENDO3c endonuclease II 25.0 92 0.002 21.9 3.0 39 63-101 32-73 (158)
61 KOG0005 Ubiquitin-like protein 24.8 57 0.0012 22.2 1.8 14 77-90 33-46 (70)
62 PF14527 LAGLIDADG_WhiA: WhiA 24.6 25 0.00054 24.1 0.0 21 37-59 65-85 (93)
63 PF11899 DUF3419: Protein of u 24.0 1.2E+02 0.0025 25.9 3.9 49 38-88 159-214 (380)
64 KOG4374 RNA-binding protein Bi 23.5 59 0.0013 26.4 1.9 53 36-92 117-173 (216)
65 COG4352 RPL13 Ribosomal protei 23.0 58 0.0012 24.2 1.6 34 44-87 55-88 (113)
66 PF10305 Fmp27_SW: RNA pol II 22.5 55 0.0012 22.8 1.4 16 36-51 80-95 (103)
67 PF13871 Helicase_C_4: Helicas 22.4 1.7E+02 0.0037 24.1 4.5 44 33-80 227-271 (278)
68 COG3179 Predicted chitinase [G 22.2 1.2E+02 0.0026 24.6 3.4 42 52-93 8-51 (206)
69 TIGR00608 radc DNA repair prot 22.1 2E+02 0.0042 22.7 4.6 41 49-89 33-77 (218)
70 PF10330 Stb3: Putative Sin3 b 21.8 65 0.0014 23.1 1.7 16 78-93 38-54 (92)
71 TIGR03019 pepcterm_femAB FemAB 21.8 1.8E+02 0.0039 23.1 4.3 59 37-95 127-191 (330)
72 PF05577 Peptidase_S28: Serine 21.8 72 0.0016 26.3 2.2 49 32-80 147-202 (434)
73 PF00633 HHH: Helix-hairpin-he 21.7 1.1E+02 0.0025 17.0 2.4 27 64-90 2-29 (30)
74 PF12972 NAGLU_C: Alpha-N-acet 21.4 2.8E+02 0.0062 22.2 5.5 58 49-107 124-189 (267)
75 smart00611 SEC63 Domain of unk 21.2 89 0.0019 24.4 2.5 35 61-95 172-207 (312)
76 COG0735 Fur Fe2+/Zn2+ uptake r 21.0 92 0.002 22.7 2.4 24 72-95 9-33 (145)
77 COG1305 Transglutaminase-like 20.9 1.2E+02 0.0027 22.4 3.1 37 33-83 180-216 (319)
78 TIGR01025 rpsS_arch ribosomal 20.8 52 0.0011 24.8 1.1 26 64-90 12-37 (135)
79 PF14304 CSTF_C: Transcription 20.7 67 0.0015 20.2 1.4 34 64-99 11-44 (46)
80 PF15368 BioT2: Spermatogenesi 20.3 1.4E+02 0.003 23.6 3.3 35 36-74 126-164 (170)
No 1
>PF09597 IGR: IGR protein motif; InterPro: IPR019083 This entry is found in fungal and plant proteins and contains a conserved IGR motif. Its function is unknown.
Probab=99.98 E-value=5e-33 Score=180.63 Aligned_cols=57 Identities=46% Similarity=0.853 Sum_probs=56.2
Q ss_pred HHHHHhhhccchhHHHHhhhhhhhhHHHHhhhchHHHHhcCCCchhhhHHhhhhHhhhhc
Q 033920 39 IPEFLNGIGKGVETHSAKLESEIGDFQRLLVTRTLKLKKLGIPCKHRKLILKHTHKYRLG 98 (109)
Q Consensus 39 V~tFL~~IGRg~~eha~Kfes~~gdw~~Lf~~~S~~LKelGIp~r~RKyIL~~~ekyR~G 98 (109)
|+|||++|||||++|++|||+ +|++||+++|.+||++||||++|||||+|+||||+|
T Consensus 1 V~tFL~~IGR~~~~~~~kf~~---~w~~lf~~~s~~LK~~GIp~r~RryiL~~~ek~r~G 57 (57)
T PF09597_consen 1 VETFLKLIGRGCEEHAEKFES---DWEKLFTTSSKQLKELGIPVRQRRYILRWREKYRQG 57 (57)
T ss_pred CHHHHHHHcccHHHHHHHHHH---HHHHHHhcCHHHHHHCCCCHHHHHHHHHHHHHHhCc
Confidence 799999999999999999998 999999999999999999999999999999999998
No 2
>PF00536 SAM_1: SAM domain (Sterile alpha motif); InterPro: IPR021129 The sterile alpha motif (SAM) domain is a putative protein interaction module present in a wide variety of proteins [] involved in many biological processes. The SAM domain, which spans around 70 residues, is found in diverse eukaryotic organisms []. SAM domains have been shown to homo- and hetero-oligomerise, forming multiple self-association architectures and also binding to various non-SAM domain-containing proteins [], albeit with a low affinity constant []. SAM domains also appear to possess the ability to bind RNA []. Smaug, a protein that helps to establish a morphogen gradient in Drosophila embryos by repressing the translation of nanos (nos) mRNA, binds to the 3' untranslated region (UTR) of nos mRNA via two similar hairpin structures. The 3D crystal structure of the Smaug RNA-binding region shows a cluster of positively charged residues on the Smaug-SAM domain, which could be the RNA-binding surface. This electropositive potential is unique among all previously determined SAM-domain structures and is conserved among Smaug-SAM homologs. These results suggest that the SAM domain might have a primary role in RNA binding. Structural analyses show that the SAM domain is arranged in a small five-helix bundle with two large interfaces []. In the case of the SAM domain of EphB2, each of these interfaces is able to form dimers. The presence of these two distinct inter-monomer binding surfaces suggests that SAM could form extended polymeric structures []. This entry represents type 1 SAM domains. ; PDB: 2KIV_A 3HIL_B 3KKA_A 3K1R_B 3SEN_B 3SEI_B 1V85_A 2KE7_A 2EAM_A 1WWV_A ....
Probab=98.39 E-value=5.6e-07 Score=56.06 Aligned_cols=58 Identities=31% Similarity=0.461 Sum_probs=51.8
Q ss_pred CCHHHHHhhhccchhHHHHhhhhhhhhHHHHhhhchHHHHhcCC-CchhhhHHhhhhHhhh
Q 033920 37 VGIPEFLNGIGKGVETHSAKLESEIGDFQRLLVTRTLKLKKLGI-PCKHRKLILKHTHKYR 96 (109)
Q Consensus 37 ~dV~tFL~~IGRg~~eha~Kfes~~gdw~~Lf~~~S~~LKelGI-p~r~RKyIL~~~ekyR 96 (109)
-+|.+||+.|| +++|++.|+...=|.+.|+.++...|+++|| ++-+|+.|++-.+++|
T Consensus 6 ~~V~~WL~~~~--l~~y~~~F~~~~i~g~~L~~lt~~dL~~lgi~~~ghr~ki~~~i~~Lk 64 (64)
T PF00536_consen 6 EDVSEWLKSLG--LEQYAENFEKNYIDGEDLLSLTEEDLEELGITKLGHRKKILRAIQKLK 64 (64)
T ss_dssp HHHHHHHHHTT--GGGGHHHHHHTTSSHHHHTTSCHHHHHHTT-SSHHHHHHHHHHHHHHH
T ss_pred HHHHHHHHHCC--CHHHHHHHHcCCchHHHHHhcCHHHHHHcCCCCHHHHHHHHHHHHHhC
Confidence 47999999997 9999999977677899999999999999999 5599999999998876
No 3
>cd00166 SAM Sterile alpha motif.; Widespread domain in signalling and nuclear proteins. In EPH-related tyrosine kinases, appears to mediate cell-cell initiated signal transduction via the binding of SH2-containing proteins to a conserved tyrosine that is phosphorylated. In many cases mediates homodimerization.
Probab=98.20 E-value=2.8e-06 Score=51.25 Aligned_cols=58 Identities=31% Similarity=0.404 Sum_probs=50.4
Q ss_pred CCHHHHHhhhccchhHHHHhhhhhhhhHHHHhhhchHHHHhcCCC-chhhhHHhhhhHhhh
Q 033920 37 VGIPEFLNGIGKGVETHSAKLESEIGDFQRLLVTRTLKLKKLGIP-CKHRKLILKHTHKYR 96 (109)
Q Consensus 37 ~dV~tFL~~IGRg~~eha~Kfes~~gdw~~Lf~~~S~~LKelGIp-~r~RKyIL~~~ekyR 96 (109)
.+|.+||+.+| +++|++.|...==|.+.|..++...|+++||+ +-+|+.|++..++++
T Consensus 5 ~~V~~wL~~~~--~~~y~~~f~~~~i~g~~L~~l~~~dL~~lgi~~~g~r~~i~~~i~~l~ 63 (63)
T cd00166 5 EDVAEWLESLG--LGQYADNFRENGIDGDLLLLLTEEDLKELGITLPGHRKKILKAIQKLK 63 (63)
T ss_pred HHHHHHHHHcC--hHHHHHHHHHcCCCHHHHhHCCHHHHHHcCCCCHHHHHHHHHHHHHcC
Confidence 48999999997 89999999865338999999999999999995 499999999887653
No 4
>smart00454 SAM Sterile alpha motif. Widespread domain in signalling and nuclear proteins. In EPH-related tyrosine kinases, appears to mediate cell-cell initiated signal transduction via the binding of SH2-containing proteins to a conserved tyrosine that is phosphorylated. In many cases mediates homodimerisation.
Probab=97.89 E-value=2.9e-05 Score=46.98 Aligned_cols=59 Identities=32% Similarity=0.401 Sum_probs=50.7
Q ss_pred CCHHHHHhhhccchhHHHHhhhhhhhhHHHHhhhc-hHHHHhcCC-CchhhhHHhhhhHhhhh
Q 033920 37 VGIPEFLNGIGKGVETHSAKLESEIGDFQRLLVTR-TLKLKKLGI-PCKHRKLILKHTHKYRL 97 (109)
Q Consensus 37 ~dV~tFL~~IGRg~~eha~Kfes~~gdw~~Lf~~~-S~~LKelGI-p~r~RKyIL~~~ekyR~ 97 (109)
.+|..||..+| +.+|++.|...-=+-..|+..+ ...|+++|| ++-+|+.|++..+++|.
T Consensus 7 ~~v~~wL~~~g--~~~y~~~f~~~~i~g~~ll~~~~~~~l~~lgi~~~~~r~~ll~~i~~l~~ 67 (68)
T smart00454 7 ESVADWLESIG--LEQYADNFRKNGIDGALLLLLTSEEDLKELGITKLGHRKKILKAIQKLKD 67 (68)
T ss_pred HHHHHHHHHCC--hHHHHHHHHHCCCCHHHHHhcChHHHHHHcCCCcHHHHHHHHHHHHHHHh
Confidence 48999999997 9999999987533446788888 899999999 89999999999988874
No 5
>PF07647 SAM_2: SAM domain (Sterile alpha motif); InterPro: IPR011510 The sterile alpha motif (SAM) domain is a putative protein interaction module present in a wide variety of proteins [] involved in many biological processes. The SAM domain, which spans around 70 residues, is found in diverse eukaryotic organisms []. SAM domains have been shown to homo- and hetero-oligomerise, forming multiple self-association architectures and also binding to various non-SAM domain-containing proteins [], albeit with a low affinity constant []. SAM domains also appear to possess the ability to bind RNA []. Smaug, a protein that helps to establish a morphogen gradient in Drosophila embryos by repressing the translation of nanos (nos) mRNA, binds to the 3' untranslated region (UTR) of nos mRNA via two similar hairpin structures. The 3D crystal structure of the Smaug RNA-binding region shows a cluster of positively charged residues on the Smaug-SAM domain, which could be the RNA-binding surface. This electropositive potential is unique among all previously determined SAM-domain structures and is conserved among Smaug-SAM homologs. These results suggest that the SAM domain might have a primary role in RNA binding. Structural analyses show that the SAM domain is arranged in a small five-helix bundle with two large interfaces []. In the case of the SAM domain of EphB2, each of these interfaces is able to form dimers. The presence of these two distinct inter-monomer binding surfaces suggests that SAM could form extended polymeric structures []. This entry represents a second domain related to the SAM domain. ; GO: 0005515 protein binding; PDB: 1B0X_A 1X9X_B 1OW5_A 1V38_A 3BS7_A 3BS5_A 3TAD_A 3TAC_B 2K60_A 2DL0_A ....
Probab=97.88 E-value=1.7e-05 Score=49.48 Aligned_cols=59 Identities=24% Similarity=0.420 Sum_probs=50.5
Q ss_pred cCCHHHHHhhhccchhHHHHhhhhhh-hhHHHHhhhchHHHHhcCC-CchhhhHHhhhhHhhh
Q 033920 36 KVGIPEFLNGIGKGVETHSAKLESEI-GDFQRLLVTRTLKLKKLGI-PCKHRKLILKHTHKYR 96 (109)
Q Consensus 36 ~~dV~tFL~~IGRg~~eha~Kfes~~-gdw~~Lf~~~S~~LKelGI-p~r~RKyIL~~~ekyR 96 (109)
..+|.+||..+| +.+|++.|...= ...+.|..++...|+++|| +..+|+.||+-.++.|
T Consensus 6 ~~~v~~WL~~~g--l~~y~~~f~~~~i~g~~~L~~l~~~~L~~lGI~~~~~r~kll~~i~~Lk 66 (66)
T PF07647_consen 6 PEDVAEWLKSLG--LEQYADNFRENGIDGLEDLLQLTEEDLKELGITNLGHRRKLLSAIQELK 66 (66)
T ss_dssp HHHHHHHHHHTT--CGGGHHHHHHTTCSHHHHHTTSCHHHHHHTTTTHHHHHHHHHHHHHHHH
T ss_pred HHHHHHHHHHCC--cHHHHHHHHHcCCcHHHHHhhCCHHHHHHcCCCCHHHHHHHHHHHHHcC
Confidence 458999999996 899999999733 4447799999999999999 8899999999887654
No 6
>KOG4384 consensus Uncharacterized SAM domain protein [General function prediction only]
Probab=96.50 E-value=0.0053 Score=52.40 Aligned_cols=59 Identities=25% Similarity=0.320 Sum_probs=50.3
Q ss_pred CCHHHHHhhhccchhHHHHhhhh-hhhhHHHHhhhchHHHHhcCC-CchhhhHHhhhhHhhhh
Q 033920 37 VGIPEFLNGIGKGVETHSAKLES-EIGDFQRLLVTRTLKLKKLGI-PCKHRKLILKHTHKYRL 97 (109)
Q Consensus 37 ~dV~tFL~~IGRg~~eha~Kfes-~~gdw~~Lf~~~S~~LKelGI-p~r~RKyIL~~~ekyR~ 97 (109)
-.|+++|.+|| +++|.++|-. --.+++.+=..+-..|-++|| .+.+||.||.-+|.++.
T Consensus 216 ~~~~ewL~~i~--le~y~~~~L~nGYd~le~~k~i~e~dL~~lgI~nP~Hr~kLL~av~~~~e 276 (361)
T KOG4384|consen 216 KSLEEWLRRIG--LEEYIETLLENGYDTLEDLKDITEEDLEELGIDNPDHRKKLLSAVELLKE 276 (361)
T ss_pred hHHHHHHHHhh--HHHHHHHHHHcchHHHHHHHhccHHHHHHhCCCCHHHHHHHHHHHHHHHh
Confidence 47999999998 9999999854 224577777888899999999 99999999999988774
No 7
>KOG0196 consensus Tyrosine kinase, EPH (ephrin) receptor family [Signal transduction mechanisms]
Probab=84.92 E-value=2.3 Score=40.65 Aligned_cols=66 Identities=18% Similarity=0.267 Sum_probs=56.7
Q ss_pred ccccCCHHHHHhhhccchhHHHHhhhhhh-hhHHHHhhhchHHHHhcCC-CchhhhHHhhhhHhhhhccc
Q 033920 33 YIVKVGIPEFLNGIGKGVETHSAKLESEI-GDFQRLLVTRTLKLKKLGI-PCKHRKLILKHTHKYRLGLW 100 (109)
Q Consensus 33 ~~~~~dV~tFL~~IGRg~~eha~Kfes~~-gdw~~Lf~~~S~~LKelGI-p~r~RKyIL~~~ekyR~Gl~ 100 (109)
+..-..|.++|..|+ |..+.+.|-++= ...+.+..++.+.|+.+|| =+-+-|.||.-.|-.|.+.-
T Consensus 920 ~~~f~sv~~WL~aIk--m~rY~~~F~~ag~~s~~~V~q~s~eDl~~~Gitl~GhqkkIl~SIq~m~~q~~ 987 (996)
T KOG0196|consen 920 FTPFRSVGDWLEAIK--MGRYKEHFAAAGYTSFEDVAQMSAEDLLRLGITLAGHQKKILSSIQAMRAQMR 987 (996)
T ss_pred CcccCCHHHHHHHhh--hhHHHHHHHhcCcccHHHHHhhhHHHHHhhceeecchhHHHHHHHHHHHHHhc
Confidence 445689999999998 999999998754 7899999999999999999 67788889988888887763
No 8
>KOG4374 consensus RNA-binding protein Bicaudal-C [RNA processing and modification]
Probab=80.40 E-value=2.8 Score=33.86 Aligned_cols=59 Identities=27% Similarity=0.237 Sum_probs=49.5
Q ss_pred CCHHHHHhhhccchhHHHHhhhhhhhhHHHHhhhchHHHHhcCC-CchhhhHHhhhhHhhhh
Q 033920 37 VGIPEFLNGIGKGVETHSAKLESEIGDFQRLLVTRTLKLKKLGI-PCKHRKLILKHTHKYRL 97 (109)
Q Consensus 37 ~dV~tFL~~IGRg~~eha~Kfes~~gdw~~Lf~~~S~~LKelGI-p~r~RKyIL~~~ekyR~ 97 (109)
-+|--+|...| |.++-.-|+-.==||++|..++...||.+|| +.--||.|+..-++-|.
T Consensus 152 ~~vl~~L~~lg--lg~y~~~f~~~evd~~~l~~lte~dlk~~gi~~~GpRkKi~~A~~~~r~ 211 (216)
T KOG4374|consen 152 EGVLMELGILG--LGAYWKMFEAIEVDMDNLRLLTEEDLKDMGINSVGPRKKILCAIGKLRR 211 (216)
T ss_pred chHHHHHHHHh--HHHHHHHHHHHHHHHHHHHhcccchhhhhcccccCcchhhhhhhhcccc
Confidence 34556677776 8888888876447999999999999999999 99999999998887664
No 9
>PF01031 Dynamin_M: Dynamin central region; InterPro: IPR000375 Dynamin is a microtubule-associated force-producing protein of 100 kDa which is involved in the production of microtubule bundles. At the N terminus of dynamin is a GTPase domain (see IPR001401 from INTERPRO), and at the C terminus is a PH domain (see IPR001849 from INTERPRO). Between these two domains lies a central region of unknown function, which this entry represents.; GO: 0005525 GTP binding; PDB: 3ZVR_A 2AKA_B 2X2F_D 2X2E_D 3SNH_A 3ZYS_D 3ZYC_D 1JWY_B 1JX2_B 3SZR_A ....
Probab=78.80 E-value=3 Score=32.88 Aligned_cols=76 Identities=21% Similarity=0.291 Sum_probs=54.5
Q ss_pred cccccCCCC-CCccccCCHHHHHhhhccchhHHHHhh-hhhhhhHHHHhhhchHHHHhcCC-Cc----hhhhHHhhhhHh
Q 033920 22 SRFFTSKAS-NQYIVKVGIPEFLNGIGKGVETHSAKL-ESEIGDFQRLLVTRTLKLKKLGI-PC----KHRKLILKHTHK 94 (109)
Q Consensus 22 ~r~fs~~~~-~p~~~~~dV~tFL~~IGRg~~eha~Kf-es~~gdw~~Lf~~~S~~LKelGI-p~----r~RKyIL~~~ek 94 (109)
..||+..|. +......+++..-+.+.+-..+|..+- ++-....++.+.-...+|+.+|- ++ .++.||+....+
T Consensus 41 ~~fF~~~~~~~~~~~~~G~~~L~~~L~~~L~~~I~~~LP~l~~~I~~~l~~~~~eL~~lG~~~~~~~~~~~~~l~~~~~~ 120 (295)
T PF01031_consen 41 KEFFSNHPWYSSPADRCGTPALRKRLSELLVEHIRKSLPSLKSEIQKKLQEAEKELKRLGPPRPETPEEQRAYLLQIISK 120 (295)
T ss_dssp HHHHHHSTTTGGGGGGSSHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHTHHHCSSSCHHHHHHHHHHHHHH
T ss_pred HHHHhcccccCCcccccchHHHHHHHHHHHHHHHHHhCcHHHHHHHHHHHHHHHHHHHhCCCCCCCHHHHHHHHHHHHHH
Confidence 456666541 122235667666666666689998764 44338899999999999999988 33 688899999888
Q ss_pred hhh
Q 033920 95 YRL 97 (109)
Q Consensus 95 yR~ 97 (109)
|-+
T Consensus 121 f~~ 123 (295)
T PF01031_consen 121 FSR 123 (295)
T ss_dssp HHH
T ss_pred HHH
Confidence 864
No 10
>PF07524 Bromo_TP: Bromodomain associated; InterPro: IPR006565 This bromodomain is found in eukaryotic transcription factors and PHD domain containing proteins (IPR001965 from INTERPRO). The tandem PHD finger-bromodomain is found in many chromatin-associated proteins. It is involved in gene silencing by the human co-repressor KRAB-associated protein 1 (KAP1). The tandem PHD finger-bromodomain of KAP1 has a distinct structure that joins the two protein modules. The first helix, alpha(Z), of an atypical bromodomain forms the central hydrophobic core that anchors the other three helices of the bromodomain on one side and the zinc binding PHD finger on the other []. The Rap1 GTPase-activating protein, Sipa1, is modulated by the cellular bromodomain protein, Brd4. Brd4 belongs to the BET family and is a multifunctional protein involved in transcription, replication, the signal transduction pathway, and cell cycle progression. All of these functions are linked to its association with acetylated chromatin. It has tandem bromodomains []. The dysregulation of the Brd4-associated pathways may play an important role in breast cancer progression []. Bovine papillomavirus type 1 E2 also binds to chromosomes in a complex with Brd4. Interaction with Brd4 is additionally important for E2-mediated transcriptional regulation [, ].
Probab=78.35 E-value=2.4 Score=27.47 Aligned_cols=41 Identities=15% Similarity=0.340 Sum_probs=30.2
Q ss_pred HHHHHhhhccchhHHHHhhhhhhhhHHHHhhhchHHHHhcCCCch
Q 033920 39 IPEFLNGIGKGVETHSAKLESEIGDFQRLLVTRTLKLKKLGIPCK 83 (109)
Q Consensus 39 V~tFL~~IGRg~~eha~Kfes~~gdw~~Lf~~~S~~LKelGIp~r 83 (109)
+..|+..||+.+..+++..--...+..++. ..|+++||.+.
T Consensus 36 ~~~yl~~l~~~~~~~ae~~gRt~~~~~Dv~----~al~~~gi~v~ 76 (77)
T PF07524_consen 36 LQRYLQELGRTAKRYAEHAGRTEPNLQDVE----QALEEMGISVN 76 (77)
T ss_pred HHHHHHHHHHHHHHHHHHcCCCCCCHHHHH----HHHHHhCCCCC
Confidence 458999999999999986554334455553 67899999764
No 11
>KOG3678 consensus SARM protein (with sterile alpha and armadillo motifs) [Extracellular structures]
Probab=76.53 E-value=4.6 Score=37.30 Aligned_cols=60 Identities=23% Similarity=0.299 Sum_probs=48.3
Q ss_pred cccCCHHHHHhhhccchhHHHHhhhhhhhhHHHHhhhchHHHHh-cCC-CchhhhHHhhhhHhh
Q 033920 34 IVKVGIPEFLNGIGKGVETHSAKLESEIGDFQRLLVTRTLKLKK-LGI-PCKHRKLILKHTHKY 95 (109)
Q Consensus 34 ~~~~dV~tFL~~IGRg~~eha~Kfes~~gdw~~Lf~~~S~~LKe-lGI-p~r~RKyIL~~~eky 95 (109)
-.--||++++++|| -++|.+||....=|=+=|+.++-..||. .|+ .-=+||..||-.+..
T Consensus 465 Wt~AdVQ~WvkkIG--FeeY~EkFakQ~VDGDLLLqLTEndLk~DvGM~SGl~RKRFlRELqtL 526 (832)
T KOG3678|consen 465 WTCADVQYWVKKIG--FEEYVEKFAKQMVDGDLLLQLTENDLKHDVGMISGLHRKRFLRELQTL 526 (832)
T ss_pred cchHHHHHHHHHhC--HHHHHHHHHHHhccchHHHhhhhhhhhhhhhhhhhhhHHHHHHHHHHH
Confidence 33569999999998 9999999998775555688999999984 577 777899888765543
No 12
>PF15652 Tox-SHH: HNH/Endo VII superfamily toxin with a SHH signature
Probab=69.39 E-value=5.8 Score=28.71 Aligned_cols=38 Identities=24% Similarity=0.236 Sum_probs=31.7
Q ss_pred hhhhhhhhHHHHhhhchHHHHhcCCCchhhhHHhhhhHhhh
Q 033920 56 KLESEIGDFQRLLVTRTLKLKKLGIPCKHRKLILKHTHKYR 96 (109)
Q Consensus 56 Kfes~~gdw~~Lf~~~S~~LKelGIp~r~RKyIL~~~ekyR 96 (109)
|+++ +.++=|...+++|-+.|||..+|+--|+..=||+
T Consensus 62 kw~t---~~~~Ef~~~~~eM~dAGV~~~~~~~~l~~~Ykyf 99 (100)
T PF15652_consen 62 KWST---TLQEEFNNSYREMFDAGVSKECRKKALKAQYKYF 99 (100)
T ss_pred Cccc---hHHHHHHHHHHHHHHcCCCHHHHHHHHHHHHhhc
Confidence 3666 6778888889999999999999999998876664
No 13
>PF00730 HhH-GPD: HhH-GPD superfamily base excision DNA repair protein This entry corresponds to Endonuclease III This entry corresponds to Alkylbase DNA glycosidase; InterPro: IPR003265 Endonuclease III (4.2.99.18 from EC) is a DNA repair enzyme which removes a number of damaged pyrimidines from DNA via its glycosylase activity and also cleaves the phosphodiester backbone at apurinic / apyrimidinic sites via a beta-elimination mechanism [, ]. The structurally related DNA glycosylase MutY recognises and excises the mutational intermediate 8-oxoguanine-adenine mispair []. The 3-D structures of Escherichia coli endonuclease III [] and catalytic domain of MutY [] have been determined. The structures contain two all-alpha domains: a sequence-continuous, six-helix domain (residues 22-132) and a Greek-key, four-helix domain formed by one N-terminal and three C-terminal helices (residues 1-21 and 133-211) together with the [Fe4S4] cluster. The cluster is bound entirely within the C-terminal loop by four cysteine residues with a ligation pattern Cys-(Xaa)6-Cys-(Xaa)2-Cys-(Xaa)5-Cys which is distinct from all other known Fe4S4 proteins. This structural motif is referred to as a [Fe4S4] cluster loop (FCL) []. Two DNA-binding motifs have been proposed, one at either end of the interdomain groove: the helix-hairpin-helix (HhH) and FCL motifs (see IPR003651 from INTERPRO). The primary role of the iron-sulphur cluster appears to involve positioning conserved basic residues for interaction with the DNA phosphate backbone by forming the loop of the FCL motif [, ]. The HhH-GPD domain gets its name from its hallmark helix-hairpin-helix and Gly/Pro rich loop followed by a conserved aspartate. This domain is found in a diverse range of structurally related DNA repair proteins that include: endonuclease III, 4.2.99.18 from EC and DNA glycosylase MutY, an A/G-specific adenine glycosylase. Both of these enzymes have a C-terminal iron-sulphur cluster loop (FCL). 
The methyl-CpG binding protein (MBD4) also contains a related domain that is a thymine DNA glycosylase. The family also includes DNA-3-methyladenine glycosylase II 3.2.2.21 from EC, 8-oxoguanine DNA glycosylases and other members of the AlkA family.; GO: 0006284 base-excision repair; PDB: 3F0Z_A 3I0X_A 3F10_A 3I0W_A 3S6I_D 3N5N_Y 1PU7_A 1PU8_B 1PU6_B 1NGN_A ....
Probab=65.83 E-value=9.1 Score=25.36 Aligned_cols=34 Identities=18% Similarity=0.242 Sum_probs=31.5
Q ss_pred hHHHHhhhchHHHHhc----CCCchhhhHHhhhhHhhh
Q 033920 63 DFQRLLVTRTLKLKKL----GIPCKHRKLILKHTHKYR 96 (109)
Q Consensus 63 dw~~Lf~~~S~~LKel----GIp~r~RKyIL~~~ekyR 96 (109)
+++++..++..+|++. |.+.+--+||..-++.+.
T Consensus 29 t~~~l~~~~~~el~~~i~~~G~~~~ka~~i~~~a~~~~ 66 (108)
T PF00730_consen 29 TPEALAEASEEELRELIRPLGFSRRKAKYIIELARAIL 66 (108)
T ss_dssp SHHHHHCSHHHHHHHHHTTSTSHHHHHHHHHHHHHHHH
T ss_pred CHHHHHhCCHHHHHHHhhccCCCHHHHHHHHHHHHHhh
Confidence 4999999999999999 999888899999998887
No 14
>COG3272 Uncharacterized conserved protein [Function unknown]
Probab=57.64 E-value=6.7 Score=30.62 Aligned_cols=21 Identities=19% Similarity=0.306 Sum_probs=18.5
Q ss_pred CCchhhhHHhhhhHhhhhccc
Q 033920 80 IPCKHRKLILKHTHKYRLGLW 100 (109)
Q Consensus 80 Ip~r~RKyIL~~~ekyR~Gl~ 100 (109)
+...+|++++.|.|+||.|.-
T Consensus 101 L~s~er~~l~e~Ie~YR~G~~ 121 (163)
T COG3272 101 LNSEERQELAELIESYRRGEQ 121 (163)
T ss_pred hchHHHHHHHHHHHHHHcCCC
Confidence 467899999999999999973
No 15
>PF01152 Bac_globin: Bacterial-like globin; InterPro: IPR001486 Globins are haem-containing proteins involved in binding and/or transporting oxygen. They belong to a very large and well studied family that is widely distributed in many organisms []. Globins have evolved from a common ancestor and can be divided into three groups: single-domain globins, and two types of chimeric globins, flavohaemoglobins and globin-coupled sensors. Bacteria have all three types of globins, while archaea lack flavohaemoglobins, and eukaryotes lack globin-coupled sensors []. Several functionally different haemoglobins can coexist in the same species. The major types of globins include: Haemoglobin (Hb): tetramer of two alpha and two beta chains, although embryonic and foetal forms can substitute the alpha or beta chain for ones with higher oxygen affinity, such as gamma, delta, epsilon or zeta chains. Hb transports oxygen from lungs to other tissues in vertebrates []. Hb proteins are also present in unicellular organisms where they act as enzymes or sensors []. Myoglobin (Mb): monomeric protein responsible for oxygen storage in vertebrate muscle []. Neuroglobin: a myoglobin-like haemoprotein expressed in vertebrate brain and retina, where it is involved in neuroprotection from damage due to hypoxia or ischemia []. Neuroglobin belongs to a branch of the globin family that diverged early in evolution. Cytoglobin: an oxygen sensor expressed in multiple tissues. Related to neuroglobin []. Erythrocruorin: highly cooperative extracellular respiratory proteins found in annelids and arthropods that are assembled from as many as 180 subunits into hexagonal bilayers []. Leghaemoglobin (legHb or symbiotic Hb): occurs in the root nodules of leguminous plants, where it facilitates the diffusion of oxygen to symbiotic bacteroids in order to promote nitrogen fixation. Non-symbiotic haemoglobin (NsHb): occurs in non-leguminous plants, and can be over-expressed in stressed plants []. 
Flavohaemoglobins (FHb): chimeric, with an N-terminal globin domain and a C-terminal ferredoxin reductase-like NAD/FAD-binding domain. FHb provides protection against nitric oxide via its C-terminal domain, which transfers electrons to haem in the globin []. Globin-coupled sensors: chimeric, with an N-terminal myoglobin-like domain and a C-terminal domain that resembles the cytoplasmic signalling domain of bacterial chemoreceptors. They bind oxygen, and act to initiate an aerotactic response or regulate gene expression [, ]. Protoglobin: a single domain globin found in archaea that is related to the N-terminal domain of globin-coupled sensors []. Truncated 2/2 globin: lacks the first helix, giving it a 2-over-2 instead of the canonical 3-over-3 alpha-helical sandwich fold. Can be divided into three main groups (I, II and III) based on structural features []. This entry represents a group of haemoglobin-like proteins found in eubacteria, cyanobacteria, protozoa, algae and plants, but not in animals or yeast. These proteins have a truncated 2-over-2 rather than the canonical 3-over-3 alpha-helical sandwich fold []. This entry includes: HbN (or GlbN): a truncated haemoglobin-like protein that binds oxygen cooperatively with a very high affinity and a slow dissociation rate, which may exclude it from oxygen transport. It appears to be involved in bacterial nitric oxide detoxification and in nitrosative stress []. Cyanoglobin (or GlbN): a truncated haemoprotein found in cyanobacteria that has high oxygen affinity, and which appears to serve as part of a terminal oxidase, rather than as a respiratory pigment []. HbO (or GlbO): a truncated haemoglobin-like protein with a lower oxygen affinity than HbN. HbO associates with the bacterial cell membrane, where it significantly increases oxygen uptake over membranes lacking this protein. 
HbO appears to interact with a terminal oxidase, and could participate in an oxygen/electron-transfer process that facilitates oxygen transfer during aerobic metabolism []. Glb3: a nuclear-encoded truncated haemoglobin from plants that appears more closely related to HbO than HbN. Glb3 from Arabidopsis thaliana (Mouse-ear cress) exhibits an unusual concentration-independent binding of oxygen and carbon dioxide []. ; GO: 0019825 oxygen binding, 0015671 oxygen transport; PDB: 2BKM_B 1UVY_A 1DLW_A 2XYK_B 2IG3_A 2GKM_B 1S61_A 1S56_B 1RTE_B 2GLN_A ....
Probab=56.87 E-value=18 Score=24.46 Aligned_cols=37 Identities=24% Similarity=0.306 Sum_probs=31.9
Q ss_pred hhHHHHhhhchHHHHhcCCCchhhhHHhhhhHhhhhc
Q 033920 62 GDFQRLLVTRTLKLKKLGIPCKHRKLILKHTHKYRLG 98 (109)
Q Consensus 62 gdw~~Lf~~~S~~LKelGIp~r~RKyIL~~~ekyR~G 98 (109)
..++..+..=...|++.|+|...++.++...+.+|..
T Consensus 80 ~~f~~~~~~~~~al~~~~v~~~~~~~~~~~~~~~~~~ 116 (120)
T PF01152_consen 80 EHFDRWLELLKQALDELGVPEELIDELLARLESLRDD 116 (120)
T ss_dssp HHHHHHHHHHHHHHHHTTCTHHHHHHHHHHHHHHHHH
T ss_pred HHHHHHHHHHHHHHHHhCCCHHHHHHHHHHHHHHHHH
Confidence 4577777777889999999999999999999988864
No 16
>KOG4375 consensus Scaffold protein Shank and related SAM domain proteins [Signal transduction mechanisms]
Probab=56.21 E-value=16 Score=30.64 Aligned_cols=55 Identities=20% Similarity=0.291 Sum_probs=43.4
Q ss_pred cCCHHHHHhhhccchhHHHHhhhhhhhhHHHHhhhchHHHHhcCC-CchhhhHHhhhh
Q 033920 36 KVGIPEFLNGIGKGVETHSAKLESEIGDFQRLLVTRTLKLKKLGI-PCKHRKLILKHT 92 (109)
Q Consensus 36 ~~dV~tFL~~IGRg~~eha~Kfes~~gdw~~Lf~~~S~~LKelGI-p~r~RKyIL~~~ 92 (109)
+.||.+.|.-++ +.||-++|.+-==|=.-|=..+..++.++|| -+-+|.-|=|-.
T Consensus 212 k~DV~dWLssl~--L~E~~~aF~d~eIdG~hLp~l~k~df~~LGVTRVgHRmnIerAL 267 (272)
T KOG4375|consen 212 KIDVNDWLSSLH--LIEYDDAFHDIEIDGKHLPLLRKLDFRGLGVTRVGHRMNIERAL 267 (272)
T ss_pred cccHHHHHHhhh--hhhcchhhhhcccccchhhhcchhhhhcccchhhhhHHHHHHHH
Confidence 799999999997 9999999997212223455668889999999 899998875543
No 17
>COG1603 RPP1 RNase P/RNase MRP subunit p30 [Translation, ribosomal structure and biogenesis]
Probab=54.83 E-value=25 Score=28.54 Aligned_cols=76 Identities=16% Similarity=0.175 Sum_probs=49.5
Q ss_pred HHHhhchhhhhhhc--CccccccCCCCCCccc--cCCHHHHHhhhccchhHHHHhhhhhhhhHHHHhhhchHHHHhcCCC
Q 033920 6 IFNNAGANSMVAVS--GFSRFFTSKASNQYIV--KVGIPEFLNGIGKGVETHSAKLESEIGDFQRLLVTRTLKLKKLGIP 81 (109)
Q Consensus 6 ~~~~~~~~~~~~~~--~~~r~fs~~~~~p~~~--~~dV~tFL~~IGRg~~eha~Kfes~~gdw~~Lf~~~S~~LKelGIp 81 (109)
++++.|.++- ++. ....+++..+.+|++. .-||.+|++.+|=. +.+|.++.+
T Consensus 147 ~l~~lr~~lr-l~rk~~v~ivvtS~A~s~~elrsP~dv~sl~~~lG~e-~~ea~~~~~---------------------- 202 (229)
T COG1603 147 LLSFLRSLLR-LARKYDVPIVVTSDAESPLELRSPRDVISLAKVLGLE-DDEAKKSLS---------------------- 202 (229)
T ss_pred HHHHHHHHHH-HHHhcCCCEEEeCCCCChhhhcChhhHHHHHHHhCCC-HHHHHHHHH----------------------
Confidence 4444444442 333 5566777777777776 46999999999822 234444433
Q ss_pred chhhhHHhhhhHhhhhcccccCCCC
Q 033920 82 CKHRKLILKHTHKYRLGLWRPRAAP 106 (109)
Q Consensus 82 ~r~RKyIL~~~ekyR~Gl~~P~g~~ 106 (109)
...+.||+|..+.|.|...||...
T Consensus 203 -~~p~~iL~~~~~~~~~~i~~gv~~ 226 (229)
T COG1603 203 -EYPRLILRNRNRIRDGFIVPGVGS 226 (229)
T ss_pred -HhHHHHHHHhhhcCCceEEecccc
Confidence 234568888999999888887653
No 18
>TIGR02527 dot_icm_IcmQ Dot/Icm secretion system protein IcmQ. Members of this protein family are the IcmQ component of Dot/Icm secretion systems, as found in obligate intracellular pathogens Legionella pneumophila and Coxiella burnetii. While this system resembles type IV secretion systems and has been called a form of type IV, the literature now seems to favor calling this the Dot/Icm system. This protein was shown to be essential for translocation (PubMed:15661013).
Probab=53.02 E-value=9.9 Score=30.16 Aligned_cols=27 Identities=26% Similarity=0.277 Sum_probs=23.6
Q ss_pred CHHHHHhhhccchhHHHHhhhhhhhhH
Q 033920 38 GIPEFLNGIGKGVETHSAKLESEIGDF 64 (109)
Q Consensus 38 dV~tFL~~IGRg~~eha~Kfes~~gdw 64 (109)
+-..||+.||+++.+--+.|++.+|.=
T Consensus 16 eeSnFLRvIgKnL~eIRd~f~~~l~~~ 42 (182)
T TIGR02527 16 DESLFLRNIGKKLIAIKDLFEEAIAAA 42 (182)
T ss_pred hHHHHHHHHHHhHHHHHHHHHHHhccc
Confidence 678899999999999999999977543
No 19
>PF10454 DUF2458: Protein of unknown function (DUF2458); InterPro: IPR018858 This entry represents a family of uncharacterised proteins.
Probab=50.78 E-value=32 Score=25.82 Aligned_cols=28 Identities=21% Similarity=0.464 Sum_probs=22.7
Q ss_pred HHhhhhhh-hhHHHHhhhchHHHHhcCCC
Q 033920 54 SAKLESEI-GDFQRLLVTRTLKLKKLGIP 81 (109)
Q Consensus 54 a~Kfes~~-gdw~~Lf~~~S~~LKelGIp 81 (109)
.++|...| -.|.++-......|+++|||
T Consensus 92 L~~fD~kV~~a~~~m~~~~~~~L~~LgVP 120 (150)
T PF10454_consen 92 LDKFDEKVYKASKQMSKEQQAELKELGVP 120 (150)
T ss_pred HHHHHHHHHHHHHHHHHHHHHHHHhcCCC
Confidence 35555555 67899999999999999998
No 20
>PRK10308 3-methyl-adenine DNA glycosylase II; Provisional
Probab=50.55 E-value=21 Score=28.95 Aligned_cols=37 Identities=24% Similarity=0.328 Sum_probs=34.3
Q ss_pred HHHHhhhchHHHHhcCCCchhhhHHhhhhHhhhhccc
Q 033920 64 FQRLLVTRTLKLKKLGIPCKHRKLILKHTHKYRLGLW 100 (109)
Q Consensus 64 w~~Lf~~~S~~LKelGIp~r~RKyIL~~~ekyR~Gl~ 100 (109)
-+.|..++..+|++.|++-+--+||..-.+.+..|..
T Consensus 158 pe~La~~~~~eL~~~Gl~~~Ra~~L~~lA~~i~~g~l 194 (283)
T PRK10308 158 PERLAAADPQALKALGMPLKRAEALIHLANAALEGTL 194 (283)
T ss_pred HHHHHcCCHHHHHHCCCCHHHHHHHHHHHHHHHcCCC
Confidence 8999999999999999998888999999999998865
No 21
>PF12836 HHH_3: Helix-hairpin-helix motif; PDB: 2EDU_A 2OCE_A 3BZK_A 3BZC_A 2DUY_A.
Probab=50.22 E-value=13 Score=23.42 Aligned_cols=32 Identities=28% Similarity=0.397 Sum_probs=21.0
Q ss_pred hhchHHHHhc-CCCchhhhHHhhhhHhhhhccccc
Q 033920 69 VTRTLKLKKL-GIPCKHRKLILKHTHKYRLGLWRP 102 (109)
Q Consensus 69 ~~~S~~LKel-GIp~r~RKyIL~~~ekyR~Gl~~P 102 (109)
+++..+|..+ ||..++=+.|+.++++. |-|.=
T Consensus 10 ~as~~eL~~lpgi~~~~A~~Iv~~R~~~--G~f~s 42 (65)
T PF12836_consen 10 TASAEELQALPGIGPKQAKAIVEYREKN--GPFKS 42 (65)
T ss_dssp TS-HHHHHTSTT--HHHHHHHHHHHHHH---S-SS
T ss_pred cCCHHHHHHcCCCCHHHHHHHHHHHHhC--cCCCC
Confidence 5677788888 99888889999988776 55443
No 22
>PF14520 HHH_5: Helix-hairpin-helix domain; PDB: 3AUO_B 3AU6_A 3AU2_A 3B0X_A 3B0Y_A 1SZP_C 3LDA_A 1WCN_A 2JZB_B 2ZTC_A ....
Probab=49.49 E-value=59 Score=19.89 Aligned_cols=41 Identities=22% Similarity=0.286 Sum_probs=29.3
Q ss_pred HHHHhhhhh-hhhHHHHhhhchHHHHhc-CCCchhhhHHhhhh
Q 033920 52 THSAKLESE-IGDFQRLLVTRTLKLKKL-GIPCKHRKLILKHT 92 (109)
Q Consensus 52 eha~Kfes~-~gdw~~Lf~~~S~~LKel-GIp~r~RKyIL~~~ 92 (109)
.-+.++-+. +.++++|...+-.+|.+. ||..++-.-|..+.
T Consensus 16 ~~a~~L~~~G~~t~~~l~~a~~~~L~~i~Gig~~~a~~i~~~~ 58 (60)
T PF14520_consen 16 KRAEKLYEAGIKTLEDLANADPEELAEIPGIGEKTAEKIIEAA 58 (60)
T ss_dssp HHHHHHHHTTCSSHHHHHTSHHHHHHTSTTSSHHHHHHHHHHH
T ss_pred HHHHHHHhcCCCcHHHHHcCCHHHHhcCCCCCHHHHHHHHHHH
Confidence 344555555 667999999999999988 99666666555543
No 23
>PTZ00096 40S ribosomal protein S15; Provisional
Probab=48.86 E-value=9.6 Score=29.07 Aligned_cols=26 Identities=23% Similarity=0.405 Sum_probs=14.8
Q ss_pred HHHHhhhchHHHHhcCCCchhhhHHhh
Q 033920 64 FQRLLVTRTLKLKKLGIPCKHRKLILK 90 (109)
Q Consensus 64 w~~Lf~~~S~~LKelGIp~r~RKyIL~ 90 (109)
+|+|++++..+|-++ +|++|||.+.|
T Consensus 22 l~~L~~m~~~e~~~L-~~aR~RR~~~R 47 (143)
T PTZ00096 22 LEKLLALPEEELVEL-FRARQRRRINR 47 (143)
T ss_pred HHHHHcCCHHHHHHH-cCccccccccc
Confidence 555666655555443 46666666543
No 24
>cd00454 Trunc_globin Truncated hemoglobins (trHbs) are a family of oxygen-binding heme proteins found in cyanobacteria, eubacteria, unicellular eukaryotes, and plants. The truncated hemoglobins have a characteristic two-over-two alpha helical folding pattern that is distinct from the three-over-three pattern found in other globins. A subset of these have been demonstrated to form homodimers.
Probab=48.78 E-value=27 Score=23.36 Aligned_cols=38 Identities=18% Similarity=0.307 Sum_probs=32.1
Q ss_pred hhHHHHhhhchHHHHhcCCCchhhhHHhhhhHhhhhcc
Q 033920 62 GDFQRLLVTRTLKLKKLGIPCKHRKLILKHTHKYRLGL 99 (109)
Q Consensus 62 gdw~~Lf~~~S~~LKelGIp~r~RKyIL~~~ekyR~Gl 99 (109)
.+++.++..=...|++.|+|...+..++...+..|..+
T Consensus 76 ~~f~~~l~~l~~al~~~~~~~~~~~~~~~~~~~~~~~~ 113 (116)
T cd00454 76 EEFDAWLELLRDALDELGVPAELADALLARAERIADHM 113 (116)
T ss_pred HHHHHHHHHHHHHHHHhCCCHHHHHHHHHHHHHHHHHH
Confidence 45777777778899999999999999999999887643
No 25
>PF09475 Dot_icm_IcmQ: Dot/Icm secretion system protein (dot_icm_IcmQ); InterPro: IPR013365 Proteins in this entry are the IcmQ component of Dot/Icm secretion systems, as found in the obligate intracellular pathogens Legionella pneumophila and Coxiella burnetii. While this system resembles type IV secretion systems and has been called a form of type IV, the literature now seems to favor calling this the Dot/Icm system. This protein was shown to be essential for translocation.; PDB: 3FXE_A 3FXD_C.
Probab=48.68 E-value=9.5 Score=30.18 Aligned_cols=26 Identities=23% Similarity=0.388 Sum_probs=22.5
Q ss_pred CHHHHHhhhccchhHHHHhhhhhhhh
Q 033920 38 GIPEFLNGIGKGVETHSAKLESEIGD 63 (109)
Q Consensus 38 dV~tFL~~IGRg~~eha~Kfes~~gd 63 (109)
+-..||+.||+++.+--+.|.+.+|.
T Consensus 16 eeSnFLRvIgKnL~eIRd~f~~~l~~ 41 (179)
T PF09475_consen 16 EESNFLRVIGKNLREIRDNFANQLGL 41 (179)
T ss_dssp TSSHHHHHHHHHHHHHHHHHHHHHC-
T ss_pred hHHHHHHHHHHhHHHHHHHHHHHhcc
Confidence 66789999999999999999997754
No 26
>PRK10361 DNA recombination protein RmuC; Provisional
Probab=47.75 E-value=14 Score=32.70 Aligned_cols=49 Identities=10% Similarity=0.224 Sum_probs=38.2
Q ss_pred ccCCHHHHHhhhccchhHHHHhhhhhhhhHHH---HhhhchHHHHhcCCCch
Q 033920 35 VKVGIPEFLNGIGKGVETHSAKLESEIGDFQR---LLVTRTLKLKKLGIPCK 83 (109)
Q Consensus 35 ~~~dV~tFL~~IGRg~~eha~Kfes~~gdw~~---Lf~~~S~~LKelGIp~r 83 (109)
....+.+=+..||+.++.-.+.|.++++++.. =+-.+-.+||++|+.++
T Consensus 381 kl~~f~~~~~klG~~L~~a~~~y~~A~~~L~~Grgnli~~a~~~k~Lg~~~~ 432 (475)
T PRK10361 381 KMRLFVDDMSAIGQSLDKAQDNYRQAMKKLSSGRGNVLAQAEAFRGLGVEIK 432 (475)
T ss_pred HHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhcCCchHHHHHHHHHcCCCcC
Confidence 34456667788999999999999998888873 34557789999999663
No 27
>smart00460 TGc Transglutaminase/protease-like homologues. Transglutaminases are enzymes that establish covalent links between proteins. A subset of transglutaminase homologues appear to catalyse the reverse reaction, the hydrolysis of peptide bonds. Proteins with this domain are both extracellular and intracellular, and it is likely that the eukaryotic intracellular proteins are involved in signalling events.
Probab=44.66 E-value=19 Score=21.52 Aligned_cols=13 Identities=38% Similarity=0.672 Sum_probs=10.5
Q ss_pred HHHhcCCCchhhh
Q 033920 74 KLKKLGIPCKHRK 86 (109)
Q Consensus 74 ~LKelGIp~r~RK 86 (109)
.|+..|||++--.
T Consensus 19 llr~~GIpar~v~ 31 (68)
T smart00460 19 LLRSLGIPARVVS 31 (68)
T ss_pred HHHHCCCCeEEEe
Confidence 6889999998643
No 28
>PRK00024 hypothetical protein; Reviewed
Probab=44.61 E-value=44 Score=26.18 Aligned_cols=46 Identities=22% Similarity=0.225 Sum_probs=33.6
Q ss_pred ccchhHHHHhhhhhhhhHHHHhhhchHHHHh-cCCCchhhhHHhhhh
Q 033920 47 GKGVETHSAKLESEIGDFQRLLVTRTLKLKK-LGIPCKHRKLILKHT 92 (109)
Q Consensus 47 GRg~~eha~Kfes~~gdw~~Lf~~~S~~LKe-lGIp~r~RKyIL~~~ 92 (109)
++++.+-|.++-+.+|++.+|+.++-.+|++ .||.......|+.-.
T Consensus 40 ~~~~~~LA~~LL~~fgsL~~l~~as~~eL~~i~GIG~akA~~L~a~~ 86 (224)
T PRK00024 40 GKSVLDLARELLQRFGSLRGLLDASLEELQSIKGIGPAKAAQLKAAL 86 (224)
T ss_pred CCCHHHHHHHHHHHcCCHHHHHhCCHHHHhhccCccHHHHHHHHHHH
Confidence 3456688888888888999999999999988 499444333443333
No 29
>PF01841 Transglut_core: Transglutaminase-like superfamily; InterPro: IPR002931 This domain is found in many proteins known to have transglutaminase activity, i.e. which cross-link proteins through an acyl-transfer reaction between the gamma-carboxamide group of peptide-bound glutamine and the epsilon-amino group of peptide-bound lysine, resulting in an epsilon-(gamma-glutamyl)lysine isopeptide bond. Transglutaminases have been found in a diverse range of species, from bacteria through to mammals. The enzymes require calcium binding and their activity leads to post-translational modification of proteins through acyl-transfer reactions, involving peptidyl glutamine residues as acyl donors and a variety of primary amines as acyl acceptors, with the generation of proteinase-resistant isopeptide bonds. Sequence conservation in this superfamily primarily involves three motifs that centre around conserved cysteine, histidine, and aspartate residues that form the catalytic triad in the structurally characterised transglutaminase, the human blood clotting factor XIIIa'. On the basis of the experimentally demonstrated activity of the Methanobacterium phage psiM2 pseudomurein endoisopeptidase, it is proposed that many, if not all, microbial homologs of the transglutaminases are proteases and that the eukaryotic transglutaminases have evolved from an ancestral protease. A subunit of plasma Factor XIII revealed that each Factor XIIIA subunit is composed of four domains (termed N-terminal beta-sandwich, core domain (containing the catalytic and the regulatory sites), and C-terminal beta-barrels 1 and 2) and that two monomers assemble into the native dimer through the surfaces in domains 1 and 2, in opposite orientation. This organisation in four domains is highly conserved during evolution among transglutaminase isoforms.; PDB: 2F4M_A 2F4O_A 1GGY_B 1FIE_B 1GGU_B 1GGT_B 1F13_A 1QRK_B 1EVU_A 1EX0_B ....
Probab=44.37 E-value=33 Score=22.17 Aligned_cols=37 Identities=27% Similarity=0.348 Sum_probs=24.0
Q ss_pred ccccCCHHHHHh-hhccchhHHHHhhhhhhhhHHHHhhhchHHHHhcCCCchh
Q 033920 33 YIVKVGIPEFLN-GIGKGVETHSAKLESEIGDFQRLLVTRTLKLKKLGIPCKH 84 (109)
Q Consensus 33 ~~~~~dV~tFL~-~IGRg~~eha~Kfes~~gdw~~Lf~~~S~~LKelGIp~r~ 84 (109)
.....++.++|+ +.| .|.+++..|.. -|+.+||||+-
T Consensus 37 ~~~~~~~~~~l~~~~G-~C~~~a~l~~a--------------llr~~Gipar~ 74 (113)
T PF01841_consen 37 SPGPRDASEVLRSGRG-DCEDYASLFVA--------------LLRALGIPARV 74 (113)
T ss_dssp CCCCTTHHHHHHCEEE-SHHHHHHHHHH--------------HHHHHT--EEE
T ss_pred CCCCCCHHHHHHcCCC-ccHHHHHHHHH--------------HHhhCCCceEE
Confidence 333556788877 444 78877777665 58899999863
No 30
>PRK11639 zinc uptake transcriptional repressor; Provisional
Probab=43.56 E-value=29 Score=25.81 Aligned_cols=23 Identities=9% Similarity=-0.009 Sum_probs=20.0
Q ss_pred hHHHHhcCC-CchhhhHHhhhhHh
Q 033920 72 TLKLKKLGI-PCKHRKLILKHTHK 94 (109)
Q Consensus 72 S~~LKelGI-p~r~RKyIL~~~ek 94 (109)
...||+.|+ .++||..||.....
T Consensus 14 ~~~L~~~GlR~T~qR~~IL~~l~~ 37 (169)
T PRK11639 14 EKLCAQRNVRLTPQRLEVLRLMSL 37 (169)
T ss_pred HHHHHHcCCCCCHHHHHHHHHHHh
Confidence 455899999 99999999999865
No 31
>cd00923 Cyt_c_Oxidase_Va Cytochrome c oxidase subunit Va. Cytochrome c oxidase (CcO), the terminal oxidase in the respiratory chains of eukaryotes and most bacteria, is a multi-chain transmembrane protein located in the inner membrane of mitochondria and the cell membrane of prokaryotes. It catalyzes the reduction of O2 and simultaneously pumps protons across the membrane. The number of subunits varies from three to five in bacteria and up to 13 in mammalian mitochondria. Subunits I, II, and III of mammalian CcO are encoded within the mitochondrial genome and the remaining 10 subunits are encoded within the nuclear genome. Found only in eukaryotes, subunit Va is one of three mammalian subunits that lack a transmembrane region. Subunit Va is located on the matrix side of the membrane and binds thyroid hormone T2, releasing allosteric inhibition caused by the binding of ATP to subunit IV and allowing high turnover at elevated intramitochondrial ATP/ADP ratios.
Probab=43.42 E-value=33 Score=25.01 Aligned_cols=30 Identities=23% Similarity=0.245 Sum_probs=21.7
Q ss_pred HhhhhhhhhHHHHhhhchHHHHhcCCCchh
Q 033920 55 AKLESEIGDFQRLLVTRTLKLKKLGIPCKH 84 (109)
Q Consensus 55 ~Kfes~~gdw~~Lf~~~S~~LKelGIp~r~ 84 (109)
+|-++.-+-|.-+++-=...|+|+|||...
T Consensus 70 ~K~~~~~~~y~~~lqeikp~l~ELGI~t~E 99 (103)
T cd00923 70 DKCGAHKEIYPYILQEIKPTLKELGISTPE 99 (103)
T ss_pred HHccCchhhHHHHHHHHhHHHHHHCCCCHH
Confidence 344332234888888889999999998764
No 32
>PF08955 BofC_C: BofC C-terminal domain; InterPro: IPR015050 The C-terminal domain of the bacterial protein, bypass of forespore C (BofC), contains a three-stranded beta-sheet and three alpha-helices. The exact function is unknown.; PDB: 2BW2_A.
Probab=42.73 E-value=22 Score=24.31 Aligned_cols=31 Identities=29% Similarity=0.432 Sum_probs=17.6
Q ss_pred HhhhhhhhhHHHHhhhchHHHHhcCCCchhhhHHhhhhHhhh
Q 033920 55 AKLESEIGDFQRLLVTRTLKLKKLGIPCKHRKLILKHTHKYR 96 (109)
Q Consensus 55 ~Kfes~~gdw~~Lf~~~S~~LKelGIp~r~RKyIL~~~ekyR 96 (109)
+++|+ +++++| +.||+++.+.-.++-.+-|.
T Consensus 43 ~~Les--~~~~~L---------~~GIrV~~~~ey~~vLe~~~ 73 (75)
T PF08955_consen 43 EKLES--SDHDQL---------KRGIRVRSKEEYNSVLETFK 73 (75)
T ss_dssp TTS-H--HHHHHH---------HH--S---HHHHHHHHHHHH
T ss_pred HHcCH--hHHHHH---------hCCCeeCCHHHHHHHHHhhc
Confidence 34555 468888 88999999988777766653
No 33
>smart00478 ENDO3c endonuclease III. Includes endonuclease III (DNA-(apurinic or apyrimidinic site) lyase), alkylbase DNA glycosidases (AlkA family) and other DNA glycosidases.
Probab=42.45 E-value=50 Score=23.06 Aligned_cols=37 Identities=14% Similarity=0.146 Sum_probs=25.9
Q ss_pred hhhHHHHhhhchHHH----HhcCCCchhhhHHhhhhHhhhh
Q 033920 61 IGDFQRLLVTRTLKL----KKLGIPCKHRKLILKHTHKYRL 97 (109)
Q Consensus 61 ~gdw~~Lf~~~S~~L----KelGIp~r~RKyIL~~~ekyR~ 97 (109)
+++|+++..++..+| ++.|.+.+-=+||..-.+.+..
T Consensus 21 ~~~~~~l~~~~~~eL~~~l~~~g~~~~ka~~i~~~a~~~~~ 61 (149)
T smart00478 21 FPTPEDLAAADEEELEELIRPLGFYRRKAKYLIELARILVE 61 (149)
T ss_pred CCCHHHHHCCCHHHHHHHHHHcCChHHHHHHHHHHHHHHHH
Confidence 345888888888777 6677777767777776665544
No 34
>PF11328 DUF3130: Protein of unknown function (DUF3130); InterPro: IPR021477 This bacterial family of proteins has no known function.
Probab=41.70 E-value=19 Score=25.69 Aligned_cols=35 Identities=23% Similarity=0.376 Sum_probs=26.8
Q ss_pred CHHHHHhhhccchhHHHHhhhhhhhhHHHHhhhchHHHHhcCC
Q 033920 38 GIPEFLNGIGKGVETHSAKLESEIGDFQRLLVTRTLKLKKLGI 80 (109)
Q Consensus 38 dV~tFL~~IGRg~~eha~Kfes~~gdw~~Lf~~~S~~LKelGI 80 (109)
.+..|-+.| .+..++.|+ ++.+...+..+||++|+
T Consensus 42 sin~~r~Al----~dLv~~Ve~----fq~v~~~DA~RlkkmG~ 76 (90)
T PF11328_consen 42 SINQLRTAL----IDLVDVVEN----FQQVVKKDASRLKKMGK 76 (90)
T ss_pred hHHHHHHHH----HHHHHHHHH----HHHHHHHHHHHHHHHHH
Confidence 555665554 455556555 99999999999999998
No 35
>PRK13482 DNA integrity scanning protein DisA; Provisional
Probab=41.47 E-value=41 Score=28.87 Aligned_cols=56 Identities=21% Similarity=0.277 Sum_probs=41.2
Q ss_pred HHHhhhccchhHHHHhhhhhhhhHHHHhhhchHHHHhc-CC-CchhhhHHhhhhHhhhh
Q 033920 41 EFLNGIGKGVETHSAKLESEIGDFQRLLVTRTLKLKKL-GI-PCKHRKLILKHTHKYRL 97 (109)
Q Consensus 41 tFL~~IGRg~~eha~Kfes~~gdw~~Lf~~~S~~LKel-GI-p~r~RKyIL~~~ekyR~ 97 (109)
-.|..|.|--...++.+-+.||++++++.++-.+|++. || +.+.+. |-.-..++..
T Consensus 287 RiLs~IPrl~k~iAk~Ll~~FGSL~~Il~As~eeL~~VeGIGe~rA~~-I~e~l~Rl~e 344 (352)
T PRK13482 287 RLLSKIPRLPSAVIENLVEHFGSLQGLLAASIEDLDEVEGIGEVRARA-IREGLSRLAE 344 (352)
T ss_pred HHHhcCCCCCHHHHHHHHHHcCCHHHHHcCCHHHHhhCCCcCHHHHHH-HHHHHHHHHH
Confidence 45666666666777777788888999999999999986 89 555554 6665555544
No 36
>COG1577 ERG12 Mevalonate kinase [Lipid metabolism]
Probab=40.74 E-value=52 Score=27.36 Aligned_cols=56 Identities=25% Similarity=0.363 Sum_probs=41.7
Q ss_pred CHHHHHhhhccchhHHHHhhhhhhhh---HHHHhhhchHHHHhcCCCchhhhHHhhhhHhh
Q 033920 38 GIPEFLNGIGKGVETHSAKLESEIGD---FQRLLVTRTLKLKKLGIPCKHRKLILKHTHKY 95 (109)
Q Consensus 38 dV~tFL~~IGRg~~eha~Kfes~~gd---w~~Lf~~~S~~LKelGIp~r~RKyIL~~~eky 95 (109)
.+..|+..||.-+.+-..-+++ +| +.++++....-|+++||....=+.|+.-.+++
T Consensus 201 ~~~~~~~~ig~~~~~a~~al~~--~d~e~lgelm~~nq~LL~~LgVs~~~L~~lv~~a~~~ 259 (307)
T COG1577 201 VIDPILDAIGELVQEAEAALQT--GDFEELGELMNINQGLLKALGVSTPELDELVEAARSL 259 (307)
T ss_pred HHHHHHHHHHHHHHHHHHHHhc--ccHHHHHHHHHHHHHHHHhcCcCcHHHHHHHHHHHhc
Confidence 4677889999666665555554 55 78889999999999999777767776666544
No 37
>PF04362 Iron_traffic: Bacterial Fe(2+) trafficking; InterPro: IPR007457 The protein represented by this entry, YggX, serves to protect Fe-S clusters from oxidative damage. The effect is two-fold: proteins that rely on Fe-S clusters do not become inactivated, and the release of free iron and hydrogen peroxide (a DNA-damaging agent) is prevented. These observations are consistent with the hypothesis that YggX chelates free iron, and recent experiments show that YggX can indeed bind Fe(II) in vitro and in vivo. Furthermore, YggX has a positive effect on the action of at least one Fe(II)-responsive protein. The combined actions of YggX are reminiscent of those of iron trafficking proteins, and YggX is therefore proposed to play a role in Fe(II) trafficking. In Escherichia coli, YggX was shown to be under the transcriptional control of the redox-sensing SoxRS system.; GO: 0005506 iron ion binding; PDB: 1YHD_A 1T07_A 1XS8_A.
Probab=40.17 E-value=33 Score=24.21 Aligned_cols=50 Identities=18% Similarity=0.355 Sum_probs=39.1
Q ss_pred hhhccchhHHHHhhhhhhhhHHHHhhhchHHHHhcCC---CchhhhHHhhhhHhhhhc
Q 033920 44 NGIGKGVETHSAKLESEIGDFQRLLVTRTLKLKKLGI---PCKHRKLILKHTHKYRLG 98 (109)
Q Consensus 44 ~~IGRg~~eha~Kfes~~gdw~~Lf~~~S~~LKelGI---p~r~RKyIL~~~ekyR~G 98 (109)
..+|..+-++++|= -|+.=+.-.+.-.-|+++ ++++|++|..+.++|=-|
T Consensus 24 G~lG~~I~~~iSk~-----AW~~W~~~QTmLINE~rLn~~dp~~R~~L~~qM~~Flf~ 76 (88)
T PF04362_consen 24 GELGQRIYDNISKE-----AWQEWLEHQTMLINEYRLNMMDPEARKFLEEQMEKFLFG 76 (88)
T ss_dssp SHHHHHHHHHSBHH-----HHHHHHHHHHHHHHHHT--TTSHHHHHHHHHHHHHHTTT
T ss_pred CHHHHHHHHHHhHH-----HHHHHHHHHHHHHHhccCCCCCHHHHHHHHHHHHHHhcC
Confidence 45677777777772 288888888888888887 899999999999999654
No 38
>PF07487 SopE_GEF: SopE GEF domain; InterPro: IPR016019 The type III secretion system of Gram-negative bacteria is used to transport virulence factors from the pathogen directly into the host cell and is only triggered when the bacterium comes into close contact with the host. Effector proteins secreted by the type III system do not possess a secretion signal, and are considered unique because of this. Salmonella spp. secrete an effector protein called SopE that is responsible for stimulating the reorganisation of the host cell actin cytoskeleton, and ruffling of the cellular membrane. It acts as a guanyl-nucleotide-exchange factor on Rho-GTPase proteins such as Cdc42 and Rac. As it is imperative for the bacterium to revert the cell to its "normal" state as quickly as possible, another effector, a tyrosine phosphatase called SptP, reverses the actions brought about by SopE. Recently, it has been found that SopE and its protein homologue SopE2 can activate different sets of Rho-GTPases in the host cell. Far from being a redundant set of two similar type III effectors, they both act in unison to specifically activate different Rho-GTPase signalling cascades in the host cell during infection. This entry represents the guanine nucleotide exchange factor domain of SopE. This domain has an alpha-helical structure consisting of two three-helix bundles arranged in a lambda shape.; GO: 0005085 guanyl-nucleotide exchange factor activity, 0009405 pathogenesis, 0031532 actin cytoskeleton reorganization, 0032862 activation of Rho GTPase activity, 0005576 extracellular region; PDB: 1GZS_B 1R9K_A 1R6E_A 2JOL_A 2JOK_A.
Probab=39.48 E-value=19 Score=28.15 Aligned_cols=31 Identities=26% Similarity=0.462 Sum_probs=22.4
Q ss_pred HHHHhhhchHHHHhcCCCchhhhHHhhhhHhhhhcccccCCC
Q 033920 64 FQRLLVTRTLKLKKLGIPCKHRKLILKHTHKYRLGLWRPRAA 105 (109)
Q Consensus 64 w~~Lf~~~S~~LKelGIp~r~RKyIL~~~ekyR~Gl~~P~g~ 105 (109)
-+.++-...+..|+.|+|+.. ++|.|.|+|+
T Consensus 62 i~pFL~eiGeaak~aGLPge~-----------KNgVFtp~Ga 92 (165)
T PF07487_consen 62 IQPFLFEIGEAAKNAGLPGEN-----------KNGVFTPSGA 92 (165)
T ss_dssp SHHHHHHHHHHHHHTT-SEEE-----------ETTEEEETT-
T ss_pred ccHHHHHHHHHHHHCCCCccc-----------cCCeeccCCC
Confidence 555666667788899999975 5888988885
No 39
>PF12826 HHH_2: Helix-hairpin-helix motif; PDB: 1X2I_B 1DGS_A 1V9P_B.
Probab=39.16 E-value=31 Score=21.84 Aligned_cols=40 Identities=20% Similarity=0.280 Sum_probs=27.3
Q ss_pred HhhhhhhhhHHHHhhhchHHHHhc-CCCchhhhHHhhhhHh
Q 033920 55 AKLESEIGDFQRLLVTRTLKLKKL-GIPCKHRKLILKHTHK 94 (109)
Q Consensus 55 ~Kfes~~gdw~~Lf~~~S~~LKel-GIp~r~RKyIL~~~ek 94 (109)
..+-..+|++++|..++-++|.+. ||..+.-+-|..|-+.
T Consensus 17 k~L~~~f~sl~~l~~a~~e~L~~i~gIG~~~A~si~~ff~~ 57 (64)
T PF12826_consen 17 KLLAKHFGSLEALMNASVEELSAIPGIGPKIAQSIYEFFQD 57 (64)
T ss_dssp HHHHHCCSCHHHHCC--HHHHCTSTT--HHHHHHHHHHHH-
T ss_pred HHHHHHcCCHHHHHHcCHHHHhccCCcCHHHHHHHHHHHCC
Confidence 444455667999999999999887 9988888888877654
No 40
>PF06320 GCN5L1: GCN5-like protein 1 (GCN5L1); InterPro: IPR009395 This family consists of several eukaryotic GCN5-like protein 1 (GCN5L1) sequences. The function of this family is unknown.
Probab=37.92 E-value=34 Score=24.71 Aligned_cols=32 Identities=25% Similarity=0.416 Sum_probs=26.9
Q ss_pred cchhHHHHhhhhhhhhHHHHhhhchHHHHhcC
Q 033920 48 KGVETHSAKLESEIGDFQRLLVTRTLKLKKLG 79 (109)
Q Consensus 48 Rg~~eha~Kfes~~gdw~~Lf~~~S~~LKelG 79 (109)
|.+.+++.+|-.....|..+..--...|||+|
T Consensus 57 k~L~~~~~~l~kqt~qw~~~~~~~~~~LKEiG 88 (121)
T PF06320_consen 57 KQLQRNTAKLAKQTDQWLKLVDSFNDALKEIG 88 (121)
T ss_pred HHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhc
Confidence 55677888888888889999988889999988
No 41
>PF15144 DUF4576: Domain of unknown function (DUF4576)
Probab=37.86 E-value=34 Score=24.33 Aligned_cols=22 Identities=23% Similarity=0.520 Sum_probs=17.6
Q ss_pred cCCHHHHHhhhccchhHHHHhh
Q 033920 36 KVGIPEFLNGIGKGVETHSAKL 57 (109)
Q Consensus 36 ~~dV~tFL~~IGRg~~eha~Kf 57 (109)
.||.+.||+..|-...|.|..|
T Consensus 43 ~p~fPkFLn~LGteIiEnAVef 64 (88)
T PF15144_consen 43 EPDFPKFLNLLGTEIIENAVEF 64 (88)
T ss_pred CCchHHHHHHhhHHHHHHHHHH
Confidence 6899999999997766666554
No 42
>PF02284 COX5A: Cytochrome c oxidase subunit Va; InterPro: IPR003204 Cytochrome c oxidase (EC 1.9.3.1) is an oligomeric enzymatic complex which is a component of the respiratory chain complex and is involved in the transfer of electrons from cytochrome c to oxygen. In eukaryotes this enzyme complex is located in the mitochondrial inner membrane; in aerobic prokaryotes it is found in the plasma membrane. In eukaryotes, in addition to the three large subunits, I, II and III, that form the catalytic centre of the enzyme complex, there is a variable number of small polypeptide subunits. One of these subunits is known as Va.; GO: 0004129 cytochrome-c oxidase activity; PDB: 2DYR_R 3AG1_E 3ABL_E 1V54_R 2EIJ_R 1OCR_E 2DYS_E 2EIM_E 2OCC_E 3ASN_R ....
Probab=37.84 E-value=36 Score=25.01 Aligned_cols=30 Identities=23% Similarity=0.311 Sum_probs=19.1
Q ss_pred HhhhhhhhhHHHHhhhchHHHHhcCCCchh
Q 033920 55 AKLESEIGDFQRLLVTRTLKLKKLGIPCKH 84 (109)
Q Consensus 55 ~Kfes~~gdw~~Lf~~~S~~LKelGIp~r~ 84 (109)
+|-++.-+-|+-++.-=...|+|+|||..+
T Consensus 73 ~K~~~~~~~Y~~~lqElkPtl~ELGI~t~E 102 (108)
T PF02284_consen 73 DKCGNKKEIYPYILQELKPTLEELGIPTPE 102 (108)
T ss_dssp HHTTT-TTHHHHHHHHHHHHHHHHT---TT
T ss_pred HHccChHHHHHHHHHHHhhHHHHhCCCCHH
Confidence 444443335888888889999999998764
No 43
>PRK13766 Hef nuclease; Provisional
Probab=35.79 E-value=78 Score=28.16 Aligned_cols=53 Identities=15% Similarity=0.179 Sum_probs=39.3
Q ss_pred HHhhhccchhHHHHhhhhhhhhHHHHhhhchHHHHhc-CCCchhhhHHhhhhHh
Q 033920 42 FLNGIGKGVETHSAKLESEIGDFQRLLVTRTLKLKKL-GIPCKHRKLILKHTHK 94 (109)
Q Consensus 42 FL~~IGRg~~eha~Kfes~~gdw~~Lf~~~S~~LKel-GIp~r~RKyIL~~~ek 94 (109)
+|..|..--..-+.++-+.+|++++++.++..+|++. ||..+.-+-|..+.++
T Consensus 716 ~L~~ipgig~~~a~~Ll~~fgs~~~i~~as~~~L~~i~Gig~~~a~~i~~~~~~ 769 (773)
T PRK13766 716 IVESLPDVGPVLARNLLEHFGSVEAVMTASEEELMEVEGIGEKTAKRIREVVTS 769 (773)
T ss_pred HHhcCCCCCHHHHHHHHHHcCCHHHHHhCCHHHHHhCCCCCHHHHHHHHHHHhh
Confidence 5666643334445666666788999999999999998 9987777777776654
No 44
>PRK05408 oxidative damage protection protein; Provisional
Probab=35.62 E-value=61 Score=23.05 Aligned_cols=49 Identities=20% Similarity=0.357 Sum_probs=39.1
Q ss_pred hhccchhHHHHhhhhhhhhHHHHhhhchHHHHhcCC---CchhhhHHhhhhHhhhhc
Q 033920 45 GIGKGVETHSAKLESEIGDFQRLLVTRTLKLKKLGI---PCKHRKLILKHTHKYRLG 98 (109)
Q Consensus 45 ~IGRg~~eha~Kfes~~gdw~~Lf~~~S~~LKelGI---p~r~RKyIL~~~ekyR~G 98 (109)
.+|+.+-++++|= -|+.=+.-.+.-.-|.++ .+++|+||-.+.++|=-|
T Consensus 25 ~lGkrI~~~ISk~-----AW~~W~~~QTmLINE~rLn~~dp~ar~~L~~qMekF~F~ 76 (90)
T PRK05408 25 ELGKRIYENISKE-----AWQEWLKHQTMLINEKRLNMMDPEARKFLEEQMEKFLFG 76 (90)
T ss_pred HHHHHHHHHHHHH-----HHHHHHHhhHhhhhhccCCCCCHHHHHHHHHHHHHHhcC
Confidence 4677777777772 288888778888888877 899999999999999754
No 45
>PF03457 HA: Helicase associated domain; InterPro: IPR005114 This short domain is found in multiple copies in bacterial helicase proteins. The domain is predicted to contain 3 alpha helices. The function of this domain may be to bind nucleic acid.; PDB: 2KTA_A.
Probab=34.62 E-value=62 Score=20.04 Aligned_cols=39 Identities=18% Similarity=0.304 Sum_probs=22.3
Q ss_pred HHHHhhhchHHHHhcC---CCchh-------hhHHhhhhHhhhhccccc
Q 033920 64 FQRLLVTRTLKLKKLG---IPCKH-------RKLILKHTHKYRLGLWRP 102 (109)
Q Consensus 64 w~~Lf~~~S~~LKelG---Ip~r~-------RKyIL~~~ekyR~Gl~~P 102 (109)
|++-|..=..--.+-| ||... -++|-+++.+||+|...|
T Consensus 8 W~~~~~~l~~y~~~~G~~~vp~~~~~~~~~Lg~Wl~~qR~~~r~g~L~~ 56 (68)
T PF03457_consen 8 WEERYEALKAYKEEHGHLNVPRDYVTDGFPLGQWLNNQRRKYRKGKLTP 56 (68)
T ss_dssp HHHHHHHHHHHHHHHS--S-SS-----SSHHHHHHHHHHHHHHHT---H
T ss_pred HHHHHHHHHHHHHHHCCCCCCcccCcCCCcHHHHHHHHHHHHHcCCCCH
Confidence 5555544444445555 35443 788999999999987644
No 46
>PTZ00418 Poly(A) polymerase; Provisional
Probab=33.27 E-value=48 Score=30.33 Aligned_cols=71 Identities=20% Similarity=0.165 Sum_probs=45.2
Q ss_pred CCHHHHHhhhccc-hhHHHHhhhhhhhhHHHHh-hhchHHHHhcCCCchhhh---HHhhhhHhhhhcccccCCCCC
Q 033920 37 VGIPEFLNGIGKG-VETHSAKLESEIGDFQRLL-VTRTLKLKKLGIPCKHRK---LILKHTHKYRLGLWRPRAAPA 107 (109)
Q Consensus 37 ~dV~tFL~~IGRg-~~eha~Kfes~~gdw~~Lf-~~~S~~LKelGIp~r~RK---yIL~~~ekyR~Gl~~P~g~~~ 107 (109)
-.+.+||+..|-- .+|-..|=+..++.++++. .|-...-+++|++..+.+ -.|--.=-||.|.|+|+.|..
T Consensus 72 ~~L~~~L~~~~~fes~ee~~kR~~vL~~L~~iv~~wv~~vs~~k~~~~~~~~~~~g~I~tfGSYrLGV~~pgSDID 147 (593)
T PTZ00418 72 NELINLLKSYNLYETEEGKKKRERVLGSLNKLVREFVVEASIEQGINEEEASQISGKLFTFGSYRLGVVAPGSDID 147 (593)
T ss_pred HHHHHHHHHcCCCCCHHHHHHHHHHHHHHHHHHHHHHHHHHHhcCCChhHHhcCCeEEEEeccccccCCCCCCccc
Confidence 5678888876532 2222333333346677777 444455577888777655 334455679999999999974
No 47
>KOG1170 consensus Diacylglycerol kinase [Lipid transport and metabolism]
Probab=32.92 E-value=18 Score=35.05 Aligned_cols=60 Identities=27% Similarity=0.404 Sum_probs=48.2
Q ss_pred CccccCCHHHHHhhhccchhHHHHhhhhh-h-hhHHHHhhhchHHHHhcCC-CchhhhHHhhhhHhh
Q 033920 32 QYIVKVGIPEFLNGIGKGVETHSAKLESE-I-GDFQRLLVTRTLKLKKLGI-PCKHRKLILKHTHKY 95 (109)
Q Consensus 32 p~~~~~dV~tFL~~IGRg~~eha~Kfes~-~-gdw~~Lf~~~S~~LKelGI-p~r~RKyIL~~~eky 95 (109)
||-..-.|-+.|.-|| ++||.+.|+.. | | ..|+.+--..||++|| .+-+=|.||.-.-..
T Consensus 996 ~~w~seeV~awLe~~~--LsEy~d~f~kndirG--seLl~L~rrDLkdlgvtkVGhvkril~aIkdl 1058 (1099)
T KOG1170|consen 996 PYWTSEEVCAWLESIG--LSEYKDTFRKNDIRG--SELLHLERRDLKDLGVTKVGHVKRILSAIKDL 1058 (1099)
T ss_pred ccccHHHHHHHHhccc--cchhhhhhhccCccc--ceeeecCcccccccchhhhHHHHHHHHHHHHH
Confidence 4555667889999998 99999999852 2 4 5799999999999999 888888888654433
No 48
>KOG2841 consensus Structure-specific endonuclease ERCC1-XPF, ERCC1 component [Replication, recombination and repair]
Probab=32.88 E-value=75 Score=26.44 Aligned_cols=53 Identities=23% Similarity=0.344 Sum_probs=39.5
Q ss_pred HHHHHhhhccchhHHHHhhhhhhhhHHHHhhhchHHHHhc-CC-CchhhhHHhhhh
Q 033920 39 IPEFLNGIGKGVETHSAKLESEIGDFQRLLVTRTLKLKKL-GI-PCKHRKLILKHT 92 (109)
Q Consensus 39 V~tFL~~IGRg~~eha~Kfes~~gdw~~Lf~~~S~~LKel-GI-p~r~RKyIL~~~ 92 (109)
+..||+.|-.-...-|..+-..||+++++++++..+|-.. |+ |.+.++ |....
T Consensus 193 ~~~~Lt~i~~VnKtda~~LL~~FgsLq~~~~AS~~ele~~~G~G~~kak~-l~~~l 247 (254)
T KOG2841|consen 193 LLGFLTTIPGVNKTDAQLLLQKFGSLQQISNASEGELEQCPGLGPAKAKR-LHKFL 247 (254)
T ss_pred HHHHHHhCCCCCcccHHHHHHhcccHHHHHhcCHhHHHhCcCcCHHHHHH-HHHHH
Confidence 8899999954445566777777888999999999999876 88 555444 44443
No 49
>PRK04038 rps19p 30S ribosomal protein S19P; Provisional
Probab=31.96 E-value=32 Score=25.94 Aligned_cols=24 Identities=17% Similarity=0.426 Sum_probs=10.5
Q ss_pred HHHHhhhchHHHHhcCCCchhhhHH
Q 033920 64 FQRLLVTRTLKLKKLGIPCKHRKLI 88 (109)
Q Consensus 64 w~~Lf~~~S~~LKelGIp~r~RKyI 88 (109)
.|+|++++-.+|-++ +|++|||.|
T Consensus 14 l~~L~~m~~~~~~~l-~~ar~RRsl 37 (134)
T PRK04038 14 LEELQEMSLEEFAEL-LPARQRRSL 37 (134)
T ss_pred HHHHHcCCHHHHHHH-cchhhhhhh
Confidence 344444444444332 345555544
No 50
>smart00540 LEM in nuclear membrane-associated proteins. LEM, domain in nuclear membrane-associated proteins, including lamina-associated polypeptide 2 and emerin.
Probab=31.10 E-value=64 Score=19.87 Aligned_cols=27 Identities=41% Similarity=0.567 Sum_probs=21.3
Q ss_pred hHHHHhcCCC-----chhhhHHhhhhHhhhhc
Q 033920 72 TLKLKKLGIP-----CKHRKLILKHTHKYRLG 98 (109)
Q Consensus 72 S~~LKelGIp-----~r~RKyIL~~~ekyR~G 98 (109)
..+|++.|+| ..+|+...+.-++++.|
T Consensus 12 ~~~L~~~G~~~gPIt~sTR~vy~kkL~~~~~~ 43 (44)
T smart00540 12 RAELKQYGLPPGPITDTTRKLYEKKLRKLRRG 43 (44)
T ss_pred HHHHHHcCCCCCCcCcchHHHHHHHHHHHHcC
Confidence 3578889985 37899999988888765
No 51
>PF13518 HTH_28: Helix-turn-helix domain
Probab=30.37 E-value=61 Score=18.42 Aligned_cols=23 Identities=22% Similarity=0.449 Sum_probs=17.9
Q ss_pred hHHHHhcCCCchhhhHHhhhhHhhhh
Q 033920 72 TLKLKKLGIPCKHRKLILKHTHKYRL 97 (109)
Q Consensus 72 S~~LKelGIp~r~RKyIL~~~ekyR~ 97 (109)
+...++.|| .+.-|-+|..+|+.
T Consensus 16 ~~~a~~~gi---s~~tv~~w~~~y~~ 38 (52)
T PF13518_consen 16 REIAREFGI---SRSTVYRWIKRYRE 38 (52)
T ss_pred HHHHHHHCC---CHhHHHHHHHHHHh
Confidence 445678899 45678899999988
No 52
>PRK00558 uvrC excinuclease ABC subunit C; Validated
Probab=30.27 E-value=77 Score=28.46 Aligned_cols=55 Identities=20% Similarity=0.224 Sum_probs=39.7
Q ss_pred HHHHHhhhccchhHHHHhhhhhhhhHHHHhhhchHHHHhc-CCCchhhhHHhhhhH
Q 033920 39 IPEFLNGIGKGVETHSAKLESEIGDFQRLLVTRTLKLKKL-GIPCKHRKLILKHTH 93 (109)
Q Consensus 39 V~tFL~~IGRg~~eha~Kfes~~gdw~~Lf~~~S~~LKel-GIp~r~RKyIL~~~e 93 (109)
....|..|..-=..-+.++-+.+|+++++++++..+|.+. ||+.+.-..|..|.+
T Consensus 541 ~~s~L~~IpGIG~k~~k~Ll~~FgS~~~i~~As~eeL~~v~Gig~~~A~~I~~~l~ 596 (598)
T PRK00558 541 LTSALDDIPGIGPKRRKALLKHFGSLKAIKEASVEELAKVPGISKKLAEAIYEALH 596 (598)
T ss_pred hhhhHhhCCCcCHHHHHHHHHHcCCHHHHHhCCHHHHhhcCCcCHHHHHHHHHHhc
Confidence 3444444432223344566666778999999999999999 999998888887764
No 53
>COG1623 Predicted nucleic-acid-binding protein (contains the HHH domain) [General function prediction only]
Probab=29.99 E-value=50 Score=28.55 Aligned_cols=45 Identities=22% Similarity=0.301 Sum_probs=38.8
Q ss_pred HHHhhhccchhHHHHhhhhhhhhHHHHhhhchHHHHhc-CC-Cchhh
Q 033920 41 EFLNGIGKGVETHSAKLESEIGDFQRLLVTRTLKLKKL-GI-PCKHR 85 (109)
Q Consensus 41 tFL~~IGRg~~eha~Kfes~~gdw~~Lf~~~S~~LKel-GI-p~r~R 85 (109)
--|++|.|---.+++++-.++|+++++++++-+.|++. || ..+.|
T Consensus 293 R~l~kIpRlp~~iv~nlV~~F~~l~~il~As~edL~~VeGIGe~rAr 339 (349)
T COG1623 293 RLLNKIPRLPFAIVENLVRAFGTLDGILEASAEDLDAVEGIGEARAR 339 (349)
T ss_pred HHHhcCcCccHHHHHHHHHHHhhHHHHHHhcHhHHhhhcchhHHHHH
Confidence 47889999999999999999999999999999999987 78 44433
No 54
>COG0122 AlkA 3-methyladenine DNA glycosylase/8-oxoguanine DNA glycosylase [DNA replication, recombination, and repair]
Probab=28.82 E-value=56 Score=26.64 Aligned_cols=59 Identities=12% Similarity=0.130 Sum_probs=46.2
Q ss_pred CCHHHHHhhhccchhHHHHh------hhhhhhhHHHHhhhchHHHHhcCCCchhhhHHhhhhHhhhhcc
Q 033920 37 VGIPEFLNGIGKGVETHSAK------LESEIGDFQRLLVTRTLKLKKLGIPCKHRKLILKHTHKYRLGL 99 (109)
Q Consensus 37 ~dV~tFL~~IGRg~~eha~K------fes~~gdw~~Lf~~~S~~LKelGIp~r~RKyIL~~~ekyR~Gl 99 (109)
+.+..=-+.++|-++..-.. |++ =+.|...+...|++.|.+-+-=+||.+.++.+.+|.
T Consensus 118 vS~~~A~~i~~rl~~~~g~~~~~~~~fpt----pe~l~~~~~~~l~~~g~s~~Ka~yi~~~A~~~~~g~ 182 (285)
T COG0122 118 VSVAAAAKIWARLVSLYGNALEIYHSFPT----PEQLAAADEEALRRCGLSGRKAEYIISLARAAAEGE 182 (285)
T ss_pred hhHHHHHHHHHHHHHHhCCccccccCCCC----HHHHHhcCHHHHHHhCCcHHHHHHHHHHHHHHHcCC
Confidence 44444445555555555554 666 899999999999999999999999999999999994
No 55
>PRK09462 fur ferric uptake regulator; Provisional
Probab=28.28 E-value=53 Score=23.47 Aligned_cols=23 Identities=30% Similarity=0.302 Sum_probs=19.5
Q ss_pred hHHHHhcCC-CchhhhHHhhhhHh
Q 033920 72 TLKLKKLGI-PCKHRKLILKHTHK 94 (109)
Q Consensus 72 S~~LKelGI-p~r~RKyIL~~~ek 94 (109)
...||+.|+ .++||..||.....
T Consensus 5 ~~~l~~~glr~T~qR~~Il~~l~~ 28 (148)
T PRK09462 5 NTALKKAGLKVTLPRLKILEVLQE 28 (148)
T ss_pred HHHHHHcCCCCCHHHHHHHHHHHh
Confidence 356899999 99999999998754
No 56
>COG5457 Uncharacterized conserved small protein [Function unknown]
Probab=28.05 E-value=60 Score=21.52 Aligned_cols=28 Identities=18% Similarity=0.113 Sum_probs=22.6
Q ss_pred HHHHhhhchHHHHhcCCCchhhhHHhhh
Q 033920 64 FQRLLVTRTLKLKKLGIPCKHRKLILKH 91 (109)
Q Consensus 64 w~~Lf~~~S~~LKelGIp~r~RKyIL~~ 91 (109)
=..|..++..+|++.||.-.+..+....
T Consensus 32 r~eL~~lsd~~L~DiGisR~d~~~e~~k 59 (63)
T COG5457 32 RRELLRLSDHLLSDIGISRADIEAEAAK 59 (63)
T ss_pred HHHHHHHhHHHHHHcCCCHHHHHHHHHH
Confidence 4568888999999999988887776654
No 57
>PF10281 Ish1: Putative stress-responsive nuclear envelope protein; InterPro: IPR018803 This group of proteins, found primarily in fungi, consists of putative stress-responsive nuclear envelope protein Ish1 and homologues [].
Probab=27.51 E-value=61 Score=18.56 Aligned_cols=21 Identities=43% Similarity=0.575 Sum_probs=14.7
Q ss_pred HHhcCCCch----hhhHHhhhhHhh
Q 033920 75 LKKLGIPCK----HRKLILKHTHKY 95 (109)
Q Consensus 75 LKelGIp~r----~RKyIL~~~eky 95 (109)
|++.||++. .|..+|..+.++
T Consensus 13 L~~~gi~~~~~~~~rd~Ll~~~k~~ 37 (38)
T PF10281_consen 13 LKSHGIPVPKSAKTRDELLKLAKKN 37 (38)
T ss_pred HHHcCCCCCCCCCCHHHHHHHHHHh
Confidence 567899443 688888877553
No 58
>PF13812 PPR_3: Pentatricopeptide repeat domain
Probab=26.55 E-value=53 Score=16.72 Aligned_cols=19 Identities=26% Similarity=0.291 Sum_probs=11.8
Q ss_pred hhHHHHhhhchHHHHhcCCC
Q 033920 62 GDFQRLLVTRTLKLKKLGIP 81 (109)
Q Consensus 62 gdw~~Lf~~~S~~LKelGIp 81 (109)
|+|+..+.+=.. |++.||.
T Consensus 15 g~~~~a~~~~~~-M~~~gv~ 33 (34)
T PF13812_consen 15 GDPDAALQLFDE-MKEQGVK 33 (34)
T ss_pred CCHHHHHHHHHH-HHHhCCC
Confidence 556666555443 6678884
No 59
>PF08349 DUF1722: Protein of unknown function (DUF1722); InterPro: IPR013560 This domain of unknown function is found in bacteria and archaea and is homologous to the hypothetical protein ybgA from Escherichia coli.
Probab=26.28 E-value=50 Score=23.09 Aligned_cols=23 Identities=17% Similarity=0.296 Sum_probs=19.8
Q ss_pred cCCCchhhhHHhhhhHhhhhccc
Q 033920 78 LGIPCKHRKLILKHTHKYRLGLW 100 (109)
Q Consensus 78 lGIp~r~RKyIL~~~ekyR~Gl~ 100 (109)
.-++..+|++++...++||+|..
T Consensus 64 ~~ls~~EK~~~~~~i~~yr~g~i 86 (117)
T PF08349_consen 64 KKLSSEEKQHFLDLIEDYREGKI 86 (117)
T ss_pred HhCCHHHHHHHHHHHHHHHcCCc
Confidence 34688899999999999999973
No 60
>cd00056 ENDO3c endonuclease III; includes endonuclease III (DNA-(apurinic or apyrimidinic site) lyase), alkylbase DNA glycosidases (AlkA-family) and other DNA glycosidases
Probab=25.02 E-value=92 Score=21.87 Aligned_cols=39 Identities=13% Similarity=0.060 Sum_probs=31.2
Q ss_pred hHHHHhhhchHHHHhcCCC---chhhhHHhhhhHhhhhcccc
Q 033920 63 DFQRLLVTRTLKLKKLGIP---CKHRKLILKHTHKYRLGLWR 101 (109)
Q Consensus 63 dw~~Lf~~~S~~LKelGIp---~r~RKyIL~~~ekyR~Gl~~ 101 (109)
+++.|..++..+|++.|.+ .+--+||..-.+.+..+..+
T Consensus 32 t~~~l~~~~~~~l~~~~~~~G~~~kA~~i~~~a~~~~~~~~~ 73 (158)
T cd00056 32 TPEALAAADEEELRELIRSLGYRRKAKYLKELARAIVEGFGG 73 (158)
T ss_pred CHHHHHCCCHHHHHHHHHhcChHHHHHHHHHHHHHHHHHcCC
Confidence 4999999999999998887 46667888888887776543
No 61
>KOG0005 consensus Ubiquitin-like protein [Cell cycle control, cell division, chromosome partitioning; Posttranslational modification, protein turnover, chaperones]
Probab=24.84 E-value=57 Score=22.23 Aligned_cols=14 Identities=43% Similarity=0.714 Sum_probs=11.4
Q ss_pred hcCCCchhhhHHhh
Q 033920 77 KLGIPCKHRKLILK 90 (109)
Q Consensus 77 elGIp~r~RKyIL~ 90 (109)
+-|||++|.|.|..
T Consensus 33 keGIPp~qqrli~~ 46 (70)
T KOG0005|consen 33 KEGIPPQQQRLIYA 46 (70)
T ss_pred hcCCCchhhhhhhc
Confidence 66999999888864
No 62
>PF14527 LAGLIDADG_WhiA: WhiA LAGLIDADG-like domain; PDB: 3HYI_A 3HYJ_D.
Probab=24.63 E-value=25 Score=24.06 Aligned_cols=21 Identities=29% Similarity=0.447 Sum_probs=15.9
Q ss_pred CCHHHHHhhhccchhHHHHhhhh
Q 033920 37 VGIPEFLNGIGKGVETHSAKLES 59 (109)
Q Consensus 37 ~dV~tFL~~IGRg~~eha~Kfes 59 (109)
-++.+||+.|| ..+.+-+||+
T Consensus 65 e~I~dfL~~iG--A~~s~~~~E~ 85 (93)
T PF14527_consen 65 EQISDFLKLIG--AHKSVLEFEN 85 (93)
T ss_dssp HHHHHHHHHTT----CHCCHHHH
T ss_pred HHHHHHHHHcC--hHHHHHHHHH
Confidence 68999999998 7777777776
No 63
>PF11899 DUF3419: Protein of unknown function (DUF3419); InterPro: IPR021829 This family of proteins are functionally uncharacterised. This protein is found in bacteria and eukaryotes. Proteins in this family are typically between 398 to 802 amino acids in length.
Probab=23.96 E-value=1.2e+02 Score=25.93 Aligned_cols=49 Identities=20% Similarity=0.350 Sum_probs=34.7
Q ss_pred CHHHHHhhhccchhHHHHhhhhhhhh--HHHHhhhc-----hHHHHhcCCCchhhhHH
Q 033920 38 GIPEFLNGIGKGVETHSAKLESEIGD--FQRLLVTR-----TLKLKKLGIPCKHRKLI 88 (109)
Q Consensus 38 dV~tFL~~IGRg~~eha~Kfes~~gd--w~~Lf~~~-----S~~LKelGIp~r~RKyI 88 (109)
+|+.+++. +..+|..+-|++.+.. |..++.+- .-.|+-+|||+.|+++|
T Consensus 159 ~v~~l~~a--~tleeQr~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~lGvp~~q~~~l 214 (380)
T PF11899_consen 159 DVKRLLEA--RTLEEQRRIWEKKIRPLFWRRLVRWLFVGNRRFLWFALGVPPAQYKML 214 (380)
T ss_pred HHHHHHcC--CCHHHHHHHHHHhhhHHHHHHHHHHHHhcCHHHHHHhcCCCHHHHHHH
Confidence 66666664 4677777777776533 55566443 34789999999999998
No 64
>KOG4374 consensus RNA-binding protein Bicaudal-C [RNA processing and modification]
Probab=23.51 E-value=59 Score=26.40 Aligned_cols=53 Identities=17% Similarity=0.100 Sum_probs=37.0
Q ss_pred cCCHHHHHhhhccchhHHHHhhhhhhhhHHHH---hhhchHHHHhcCC-CchhhhHHhhhh
Q 033920 36 KVGIPEFLNGIGKGVETHSAKLESEIGDFQRL---LVTRTLKLKKLGI-PCKHRKLILKHT 92 (109)
Q Consensus 36 ~~dV~tFL~~IGRg~~eha~Kfes~~gdw~~L---f~~~S~~LKelGI-p~r~RKyIL~~~ 92 (109)
.||.+++++.- .+.+.+...|.. +|+.| .+.++.-++++|| -+-.|+-+|.--
T Consensus 117 ~pd~~~~~~~~-~~l~s~~~~~~~---~~~~l~~~~t~~~~vl~~L~~lglg~y~~~f~~~ 173 (216)
T KOG4374|consen 117 RPDIQSLLTSR-LGLESYIKEFNL---QEIDLQTFGTLTEGVLMELGILGLGAYWKMFEAI 173 (216)
T ss_pred CCchhhHHHHh-hcccccchhhhc---chHhhhhcccccchHHHHHHHHhHHHHHHHHHHH
Confidence 36889999982 335555555554 67774 5678889999999 777777666543
No 65
>COG4352 RPL13 Ribosomal protein L13E [Translation, ribosomal structure and biogenesis]
Probab=22.97 E-value=58 Score=24.17 Aligned_cols=34 Identities=26% Similarity=0.501 Sum_probs=24.8
Q ss_pred hhhccchhHHHHhhhhhhhhHHHHhhhchHHHHhcCCCchhhhH
Q 033920 44 NGIGKGVETHSAKLESEIGDFQRLLVTRTLKLKKLGIPCKHRKL 87 (109)
Q Consensus 44 ~~IGRg~~eha~Kfes~~gdw~~Lf~~~S~~LKelGIp~r~RKy 87 (109)
-++|||-+ +|.+++ ..++-.+-+.+||++.+||.
T Consensus 55 ~R~GRGFs---------lgEl~a-AGL~~~~AR~LGI~VD~RRr 88 (113)
T COG4352 55 VRAGRGFS---------LGELKA-AGLSARKARTLGIAVDHRRR 88 (113)
T ss_pred eeccCCcc---------HHHHHH-cCcCHHHHHhhCcceehhhc
Confidence 57899976 122333 36778888999999999974
No 66
>PF10305 Fmp27_SW: RNA pol II promoter Fmp27 protein domain; InterPro: IPR019415 The function of the FMP27 protein is not known. FMP27 is the product of a nuclear encoded gene but it is detected in highly purified mitochondria in high-throughput studies []. This entry represents a conserved region within FMP27 that contains characteristic SW and GKG sequence motifs.
Probab=22.51 E-value=55 Score=22.80 Aligned_cols=16 Identities=44% Similarity=0.893 Sum_probs=13.8
Q ss_pred cCCHHHHHhhhccchh
Q 033920 36 KVGIPEFLNGIGKGVE 51 (109)
Q Consensus 36 ~~dV~tFL~~IGRg~~ 51 (109)
.-++++||-.+|+|+-
T Consensus 80 l~~~pdFLh~~GkG~P 95 (103)
T PF10305_consen 80 LDDLPDFLHDVGKGVP 95 (103)
T ss_pred chhhHHHHHHhCCCCC
Confidence 4689999999999974
No 67
>PF13871 Helicase_C_4: Helicase_C-like
Probab=22.44 E-value=1.7e+02 Score=24.11 Aligned_cols=44 Identities=25% Similarity=0.236 Sum_probs=31.0
Q ss_pred ccccCCHHHHHhhhcc-chhHHHHhhhhhhhhHHHHhhhchHHHHhcCC
Q 033920 33 YIVKVGIPEFLNGIGK-GVETHSAKLESEIGDFQRLLVTRTLKLKKLGI 80 (109)
Q Consensus 33 ~~~~~dV~tFL~~IGR-g~~eha~Kfes~~gdw~~Lf~~~S~~LKelGI 80 (109)
....++|.+||++|-- -++....-|+- +...++..-+++|..|.
T Consensus 227 ~~k~~~V~kFLNRLLGL~V~~Qn~LF~y----F~~~l~~~I~~AK~~G~ 271 (278)
T PF13871_consen 227 LDKDPSVPKFLNRLLGLPVEMQNALFKY----FTDTLDAIIEQAKAEGR 271 (278)
T ss_pred ccccccHHHHHHHhhCCCHHHHHHHHHH----HHHHHHHHHHHHHHcCC
Confidence 3456799999999842 23344445554 88888888888888875
No 68
>COG3179 Predicted chitinase [General function prediction only]
Probab=22.18 E-value=1.2e+02 Score=24.60 Aligned_cols=42 Identities=14% Similarity=0.241 Sum_probs=28.5
Q ss_pred HHHHhhhhhhhhHHHHhhhchHHHHhcCCCc--hhhhHHhhhhH
Q 033920 52 THSAKLESEIGDFQRLLVTRTLKLKKLGIPC--KHRKLILKHTH 93 (109)
Q Consensus 52 eha~Kfes~~gdw~~Lf~~~S~~LKelGIp~--r~RKyIL~~~e 93 (109)
.|...|+.++.+.-..+.+=+..|++.||.. ++--+|-.+.|
T Consensus 8 ~~~ki~p~a~k~~~~v~~al~~~l~~~gi~~p~r~AmFlAQ~~H 51 (206)
T COG3179 8 DLRKIFPKARKEFVDVIVALQPALDEAGITTPLRQAMFLAQVMH 51 (206)
T ss_pred HHHHhcchhhhhhHHHHHHHHHHHHHhcCCCHHHHHHHHHHHhh
Confidence 4455566666667777888899999999944 44445555554
No 69
>TIGR00608 radc DNA repair protein radc. This family is based on the phylogenomic analysis of JA Eisen (1999, Ph.D. Thesis, Stanford University).
Probab=22.10 E-value=2e+02 Score=22.68 Aligned_cols=41 Identities=20% Similarity=0.144 Sum_probs=30.9
Q ss_pred chhHHHHhhhhhh---hhHHHHhhhchHHHHh-cCCCchhhhHHh
Q 033920 49 GVETHSAKLESEI---GDFQRLLVTRTLKLKK-LGIPCKHRKLIL 89 (109)
Q Consensus 49 g~~eha~Kfes~~---gdw~~Lf~~~S~~LKe-lGIp~r~RKyIL 89 (109)
++.+-|..+-+.+ |++..|+.++-.+|++ .||.....-.|+
T Consensus 33 ~~~~lA~~ll~~f~~~g~l~~l~~a~~~eL~~i~GiG~aka~~l~ 77 (218)
T TIGR00608 33 DVLSLSKRLLDVFGRQDSLGHLLSAPPEELSSVPGIGEAKAIQLK 77 (218)
T ss_pred CHHHHHHHHHHHhcccCCHHHHHhCCHHHHHhCcCCcHHHHHHHH
Confidence 5668887777777 8899999999999988 599443333333
No 70
>PF10330 Stb3: Putative Sin3 binding protein; InterPro: IPR018818 This entry represents Sin3 binding proteins conserved in fungi. Sin3p does not bind DNA directly even though the yeast SIN3 gene functions as a transcriptional repressor. Sin3p is part of a large multiprotein complex []. Stb3 appears to bind directly to ribosomal RNA Processing Elements (RRPE) although there are no obvious domains which would accord with this, implying that Stb3 may be a novel RNA-binding protein [].
Probab=21.80 E-value=65 Score=23.09 Aligned_cols=16 Identities=31% Similarity=0.605 Sum_probs=13.8
Q ss_pred cCC-CchhhhHHhhhhH
Q 033920 78 LGI-PCKHRKLILKHTH 93 (109)
Q Consensus 78 lGI-p~r~RKyIL~~~e 93 (109)
.+| |.||||.|..-.|
T Consensus 38 ~~ls~sKqRRLi~~ALE 54 (92)
T PF10330_consen 38 SDLSPSKQRRLIMAALE 54 (92)
T ss_pred ccCCHHHHHHHHHHHHh
Confidence 356 8899999999988
No 71
>TIGR03019 pepcterm_femAB FemAB-related protein, PEP-CTERM system-associated. Members of this protein family are found always as part of extended exopolysaccharide biosynthesis loci in bacteria. In nearly every case, these loci contain determinants for the processing of the PEP-CTERM proposed C-terminal protein sorting signal. This family shows remote, local sequence similarity to the FemAB protein family (see pfam02388), whose members
Probab=21.79 E-value=1.8e+02 Score=23.06 Aligned_cols=59 Identities=15% Similarity=0.142 Sum_probs=45.4
Q ss_pred CCHHHHHhhhccchhHHHHhhhhh------hhhHHHHhhhchHHHHhcCCCchhhhHHhhhhHhh
Q 033920 37 VGIPEFLNGIGKGVETHSAKLESE------IGDFQRLLVTRTLKLKKLGIPCKHRKLILKHTHKY 95 (109)
Q Consensus 37 ~dV~tFL~~IGRg~~eha~Kfes~------~gdw~~Lf~~~S~~LKelGIp~r~RKyIL~~~eky 95 (109)
.|.++++..+.+.+-...-|.+.. .++++.++.+-...|+.+|.|+..+.|.-+-.+.|
T Consensus 127 ~~~e~~~~~~~~k~R~~IRka~k~Gv~v~~~~~l~~F~~l~~~t~~r~g~p~~~~~~f~~l~~~~ 191 (330)
T TIGR03019 127 ADPEANWLAIPRKQRAMVRKGIKAGLTVTVDGDLDRFYDVYAENMRDLGTPVFSRRYFRLLKDVF 191 (330)
T ss_pred CCHHHHHHhcCHHHHHHHHHHHHCCeEEEECCcHHHHHHHHHHHHhcCCCCCCCHHHHHHHHHhc
Confidence 588999988877665555554421 25799999999999999999999899887766655
No 72
>PF05577 Peptidase_S28: Serine carboxypeptidase S28; InterPro: IPR008758 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This group of serine peptidases belongs to MEROPS peptidase family S28 (clan SC). The predicted active site residues for members of this family and family S10 occur in the same order in the sequence: S, D, H. These serine proteases include several eukaryotic enzymes such as lysosomal Pro-X carboxypeptidase, dipeptidyl-peptidase II, and thymus-specific serine peptidase [, , , ].; GO: 0008236 serine-type peptidase activity, 0006508 proteolysis; PDB: 3N2Z_B 3JYH_A 3N0T_C.
Probab=21.76 E-value=72 Score=26.26 Aligned_cols=49 Identities=18% Similarity=0.295 Sum_probs=33.0
Q ss_pred CccccCCHHHHHhhhccchhH----HHHhhhhhhhhHHHHhhhch--HHH-HhcCC
Q 033920 32 QYIVKVGIPEFLNGIGKGVET----HSAKLESEIGDFQRLLVTRT--LKL-KKLGI 80 (109)
Q Consensus 32 p~~~~~dV~tFL~~IGRg~~e----ha~Kfes~~gdw~~Lf~~~S--~~L-KelGI 80 (109)
|.+.+.|-.+|...|.+.+.. -++.+..++...++++...+ .+| +++++
T Consensus 147 pv~a~~df~~y~~~v~~~~~~~~~~C~~~i~~a~~~i~~~~~~~~~~~~l~~~f~~ 202 (434)
T PF05577_consen 147 PVQAKVDFWEYFEVVTESLRKYGPNCYDAIRAAFDQIDKLLKTGNGRQQLKKKFKL 202 (434)
T ss_dssp -CCHCCTTTHHHHHHHHHHHCCSCCHHHHHHHHHHHHHHHCCTCHHHHHHHHHCTB
T ss_pred eeeeecccHHHHHHHHHHHHhhccHHHHHHHHHHHHHHHHhhcccHHHHHHHHhhh
Confidence 888888988888876543221 33666667778999987776 566 44565
No 73
>PF00633 HHH: Helix-hairpin-helix motif; InterPro: IPR000445 The HhH motif is an around 20 amino acids domain present in prokaryotic and eukaryotic non-sequence-specific DNA binding proteins [, , ]. The HhH motif is similar to, but distinct from, the HtH motif. Both of these motifs have two helices connected by a short turn. In the HtH motif the second helix binds to DNA with the helix in the major groove. This allows the contact between specific base and residues throughout the protein. In the HhH motif the second helix does not protrude from the surface of the protein and therefore cannot lie in the major groove of the DNA. Crystallographic studies suggest that the interaction of the HhH domain with DNA is mediated by amino acids located in the strongly conserved loop (L-P-G-V) and at the N-terminal end of the second helix []. This interaction could involve the formation of hydrogen bonds between protein backbone nitrogens and DNA phosphate groups []. The structural difference between the HtH and HhH domains is reflected at the functional level: whereas the HtH domain, found primarily in gene regulatory proteins, binds DNA in a sequence specific manner, the HhH domain is rather found in proteins involved in enzymatic activities and binds DNA with no sequence specificity []. The HhH domain of DisA, a bacterial checkpoint control protein, is a DNA-binding domain [].; GO: 0003677 DNA binding; PDB: 3C1Z_A 3C23_A 3C1Y_A 3C21_A 1Z00_A 2A1J_B 1KEA_A 1VRL_A 1RRQ_A 3G0Q_A ....
Probab=21.67 E-value=1.1e+02 Score=17.03 Aligned_cols=27 Identities=33% Similarity=0.381 Sum_probs=17.0
Q ss_pred HHHHhhhchHHHHhc-CCCchhhhHHhh
Q 033920 64 FQRLLVTRTLKLKKL-GIPCKHRKLILK 90 (109)
Q Consensus 64 w~~Lf~~~S~~LKel-GIp~r~RKyIL~ 90 (109)
.+.++..+-++|.++ ||-.+.=..||.
T Consensus 2 ~~g~~pas~eeL~~lpGIG~~tA~~I~~ 29 (30)
T PF00633_consen 2 LDGLIPASIEELMKLPGIGPKTANAILS 29 (30)
T ss_dssp HHHHHTSSHHHHHTSTT-SHHHHHHHHH
T ss_pred CCCcCCCCHHHHHhCCCcCHHHHHHHHh
Confidence 445666666777766 887776666664
No 74
>PF12972 NAGLU_C: Alpha-N-acetylglucosaminidase (NAGLU) C-terminal domain; InterPro: IPR024732 Alpha-N-acetylglucosaminidase is a lysosomal enzyme required for the stepwise degradation of heparan sulphate []. Mutations on the alpha-N-acetylglucosaminidase (NAGLU) gene can lead to Mucopolysaccharidosis type IIIB (MPS IIIB, or Sanfilippo syndrome type B) characterised by neurological dysfunction but relatively mild somatic manifestations []. The structure shows that the enzyme is composed of three domains. This C-terminal domain has an all alpha helical fold [].; PDB: 2VC9_A 2VCC_A 2VCB_A 2VCA_A 4A4A_A.
Probab=21.41 E-value=2.8e+02 Score=22.21 Aligned_cols=58 Identities=21% Similarity=0.353 Sum_probs=37.2
Q ss_pred chhHHHHhhhhhhhhHHHHhhhchH--------HHHhcCCCchhhhHHhhhhHhhhhcccccCCCCC
Q 033920 49 GVETHSAKLESEIGDFQRLLVTRTL--------KLKKLGIPCKHRKLILKHTHKYRLGLWRPRAAPA 107 (109)
Q Consensus 49 g~~eha~Kfes~~gdw~~Lf~~~S~--------~LKelGIp~r~RKyIL~~~ekyR~Gl~~P~g~~~ 107 (109)
....+..+|.+-+.|.+.|+.+... .-|..|-...++++.-. --+=--=+|+|.|.+.
T Consensus 124 ~~~~~~~~~l~ll~dlD~lL~t~~~f~Lg~Wi~~Ar~~g~~~~e~~~yE~-NAR~qIT~Wg~~g~l~ 189 (267)
T PF12972_consen 124 AFKALSARFLELLDDLDRLLATNPEFLLGKWIEDARAWGTTPEEKDLYEY-NARNQITLWGPSGELH 189 (267)
T ss_dssp HHHHHHHHHHHHHHHHHHHHTT-GGGBHHHHHHHHHHSSTT--HHHHHHH-HHHHHTTTSHTTTS-T
T ss_pred HHHHHHHHHHHHHHHHHHHHCcCCCCCHHHHHHHHHHHCCCHHHHHHHHH-HHHHHhhccCCCCCcc
Confidence 4556677888877889999988876 45888987777655432 2234445678887653
No 75
>smart00611 SEC63 Domain of unknown function in Sec63p, Brr2p and other proteins.
Probab=21.15 E-value=89 Score=24.36 Aligned_cols=35 Identities=14% Similarity=0.150 Sum_probs=27.9
Q ss_pred hhhHHHHhhhchHHHHhc-CCCchhhhHHhhhhHhh
Q 033920 61 IGDFQRLLVTRTLKLKKL-GIPCKHRKLILKHTHKY 95 (109)
Q Consensus 61 ~gdw~~Lf~~~S~~LKel-GIp~r~RKyIL~~~eky 95 (109)
+.++++|.+++..+++++ |.+.++-+-|++..++|
T Consensus 172 i~s~~~l~~~~~~~~~~ll~~~~~~~~~i~~~~~~~ 207 (312)
T smart00611 172 VLSLEDLLELEDEERGELLGLLDAEGERVYKVLSRL 207 (312)
T ss_pred CCCHHHHHhcCHHHHHHHHcCCHHHHHHHHHHHHhC
Confidence 456999999999888887 88777777788777654
No 76
>COG0735 Fur Fe2+/Zn2+ uptake regulation proteins [Inorganic ion transport and metabolism]
Probab=21.00 E-value=92 Score=22.65 Aligned_cols=24 Identities=21% Similarity=0.242 Sum_probs=20.2
Q ss_pred hHHHHhcCC-CchhhhHHhhhhHhh
Q 033920 72 TLKLKKLGI-PCKHRKLILKHTHKY 95 (109)
Q Consensus 72 S~~LKelGI-p~r~RKyIL~~~eky 95 (109)
...||+.|+ .++||.-||+...+-
T Consensus 9 ~~~lk~~glr~T~qR~~vl~~L~~~ 33 (145)
T COG0735 9 IERLKEAGLRLTPQRLAVLELLLEA 33 (145)
T ss_pred HHHHHHcCCCcCHHHHHHHHHHHhc
Confidence 467999999 999999999887543
No 77
>COG1305 Transglutaminase-like enzymes, putative cysteine proteases [Amino acid transport and metabolism]
Probab=20.91 E-value=1.2e+02 Score=22.44 Aligned_cols=37 Identities=19% Similarity=0.112 Sum_probs=26.3
Q ss_pred ccccCCHHHHHhhhccchhHHHHhhhhhhhhHHHHhhhchHHHHhcCCCch
Q 033920 33 YIVKVGIPEFLNGIGKGVETHSAKLESEIGDFQRLLVTRTLKLKKLGIPCK 83 (109)
Q Consensus 33 ~~~~~dV~tFL~~IGRg~~eha~Kfes~~gdw~~Lf~~~S~~LKelGIp~r 83 (109)
+....+.+.+|..-...|..|+.-|-. -|+-+|||+|
T Consensus 180 ~~~~~~~~~~l~~~~G~C~d~a~l~va--------------l~Ra~GIpAR 216 (319)
T COG1305 180 TPVTGSASDALRLGRGVCRDFAHLLVA--------------LLRAAGIPAR 216 (319)
T ss_pred CCCCCCHHHHHHhCCcccccHHHHHHH--------------HHHHcCCcce
Confidence 445567677776665678777766554 5788999998
No 78
>TIGR01025 rpsS_arch ribosomal protein S19(archaeal)/S15(eukaryotic). This model represents eukaryotic ribosomal protein S15 and its archaeal equivalent. It excludes bacterial and organellar ribosomal protein S19. The nomenclature for the archaeal members is unresolved and given variously as S19 (after the more distant bacterial homologs) or S15.
Probab=20.83 E-value=52 Score=24.84 Aligned_cols=26 Identities=23% Similarity=0.438 Sum_probs=14.7
Q ss_pred HHHHhhhchHHHHhcCCCchhhhHHhh
Q 033920 64 FQRLLVTRTLKLKKLGIPCKHRKLILK 90 (109)
Q Consensus 64 w~~Lf~~~S~~LKelGIp~r~RKyIL~ 90 (109)
+++|..++-.++-++ +|++|||.|-+
T Consensus 12 l~~L~~m~~~e~~~l-~~ar~RRs~~R 37 (135)
T TIGR01025 12 LEELQDMSLEELAKL-LPARQRRRLKR 37 (135)
T ss_pred HHHHHcCCHHHHHHH-cCcccCccccc
Confidence 455555555555443 46777766643
No 79
>PF14304 CSTF_C: Transcription termination and cleavage factor C-terminal; PDB: 2J8P_A.
Probab=20.65 E-value=67 Score=20.22 Aligned_cols=34 Identities=18% Similarity=0.313 Sum_probs=22.4
Q ss_pred HHHHhhhchHHHHhcCCCchhhhHHhhhhHhhhhcc
Q 033920 64 FQRLLVTRTLKLKKLGIPCKHRKLILKHTHKYRLGL 99 (109)
Q Consensus 64 w~~Lf~~~S~~LKelGIp~r~RKyIL~~~ekyR~Gl 99 (109)
...+++++..+.- -.|+.+|.-|+.-++++++|.
T Consensus 11 l~QVL~Lt~eQI~--~LPp~qR~~I~~Lr~ql~~~~ 44 (46)
T PF14304_consen 11 LMQVLQLTPEQIN--ALPPDQRQQILQLRQQLMRGE 44 (46)
T ss_dssp HHHHHTS-HHHHH--TS-HHHHTHHHHHHHHHH---
T ss_pred HHHHHcCCHHHHH--hCCHHHHHHHHHHHHHHHhcC
Confidence 4456666666654 349999999999999999984
No 80
>PF15368 BioT2: Spermatogenesis family BioT2
Probab=20.34 E-value=1.4e+02 Score=23.60 Aligned_cols=35 Identities=17% Similarity=0.169 Sum_probs=26.7
Q ss_pred cCCHHHHHhhhccchhHHHHhhhhhh----hhHHHHhhhchHH
Q 033920 36 KVGIPEFLNGIGKGVETHSAKLESEI----GDFQRLLVTRTLK 74 (109)
Q Consensus 36 ~~dV~tFL~~IGRg~~eha~Kfes~~----gdw~~Lf~~~S~~ 74 (109)
--|+..||.. |+++|.-+|.++ +-++.||.|--.|
T Consensus 126 gdD~~SFL~~----CS~faaQLEeAvKEE~niLeSLfKWFQ~Q 164 (170)
T PF15368_consen 126 GDDMNSFLLC----CSQFAAQLEEAVKEERNILESLFKWFQQQ 164 (170)
T ss_pred cccHHHHHHH----HHHHHHHHHHHHHHHHHHHHHHHHHHHHH
Confidence 4688888864 899999999887 6677777775444
Done!